This article describes the Cleanroom process, a software engineering approach that places software development under statistical quality control. The process emphasizes human mathematical verification and statistical testing to improve software quality and productivity. The authors provide evidence of the effectiveness of the method through case studies and experiments.
University of Tennessee, Knoxville
The Harlan D. Mills Collection
Science Alliance

Harlan D. Mills
M. Dyer
R. C. Linger

Follow this and additional works at: https://trace.tennessee.edu/utk_harlan
Part of the Software Engineering Commons

Recommended Citation
Mills, Harlan D.; Dyer, M.; and Linger, R. C., "Cleanroom Software Engineering" (1987). The Harlan D. Mills Collection.
https://trace.tennessee.edu/utk_harlan/
This Article is brought to you for free and open access by the Science Alliance at TRACE: Tennessee Research and Creative Exchange. It has been accepted for inclusion in The Harlan D. Mills Collection by an authorized administrator of TRACE: Tennessee Research and Creative Exchange. For more information, please contact trace@utk.edu.
Harlan D. Mills, Information Systems Institute; Michael Dyer and Richard C. Linger, IBM Federal Systems Division
Software quality can be engineered under statistical quality control and delivered with better quality. The Cleanroom process gives management an engineering approach to release reliable products.
Recent experience demonstrates that software can be engineered under statistical quality control and that certified reliability statistics can be provided with delivered software. This experience reveals a surprising synergy between mathematical verification and statistical testing of software, as well as a major difference between mathematical fallibility and debugging fallibility in people.

With the Cleanroom process, you can engineer software under statistical quality control. As with cleanroom hardware development, the process's first priority is defect prevention rather than defect removal (of course, any defects not prevented should be removed). This first priority is achieved by using human mathematical verification in place of program debugging to prepare software for system test.
The process's second priority is to certify the reliability of the delivered software, measured in mean time to failure in appropriate units of time (real or processor time) of the delivered product. The certification takes into account the growth of reliability achieved during system testing before delivery.

To gain the benefits of quality control during development, Cleanroom software engineering requires a development cycle of concurrent fabrication and certification of product increments that accumulate into the system to be delivered. This lets the fabrication process be altered on the basis of early certification results to achieve the quality desired.
Cleanroom experience

Typical of our experience with the Cleanroom process were three projects: an IBM language product (40,000 lines of code), a helicopter flight program (35,000 lines), and a NASA contract space-transportation planning system (45,000 lines). A major finding in these cases was that human verification, although fallible, can replace debugging in software development; even informal human verification can produce software of high enough quality to go to system test without debugging.
Statistical quality control requires, first of all, that statistical quality can be measured. For even the simplest of products, there is no absolute best statistical measure of quality. For example, a statistical average can be computed many ways: an arithmetic average, a weighted average, a geometric average, and a reciprocal average can each be better than the others in various circumstances. It finally comes down to a judgment of business and management in every case. In most cases, the judgment is practically automatic from experience and precedent, but it is a judgment. In the case of software, that judgment has no precedent, because engineering software under statistical quality control is just at its inception.

A new basis for the certification of software quality, given in Currit, Dyer, and Mills, begins with a software specification and a probability distribution on scenarios of the software's use; it then defines a testing procedure and a prescribed computation from test data results to provide a certified statistical quality of delivered software. This new basis represents scientific and engineering judgment of a fair and reasonable way to measure the statistical quality of software. As for simpler products, there is no absolute best and no logical argument for it beyond business and management judgment. But it can provide a basis for software statistical quality as a contractual item where no such reasonable item existed before.

The certification of software quality is given in terms of its measured reliability in executing usage scenarios in statistical testing. Certification is an ordinary process in business, even in the certification of the net worth of a bank. As in software certification, there is a fact-finding process, followed by a prescribed computation. In the case of a bank, the fact-finding produces assets and liabilities, and the computation subtracts the sum of the liabilities from the sum of the assets. For the bank, there are other measures of importance besides net worth, such as goodwill, growth, and security of assets, just as there are other measures for software besides reliability, such as function and performance. So a certification of software quality is a business measure, agreed on between the parties producing and receiving software.

Once a basis for measuring the statistical quality of delivered software is available, creating a management process for statistical quality control is relatively straightforward. In principle, the goal is to find ways to measure statistical quality during software development and to modify the development process, where necessary, to achieve a desired level of statistical quality. The Cleanroom process has been defined for just this purpose: fabrication and certification proceed in increments that permit realistic measurements, with provision for improving the measured quality as development proceeds.
Mathematical verification

Without real experience in its use, mathematical verification is no more than a buzzword. When Dijkstra introduced the idea of structured programming at an early NATO software engineering conference, his principal motivation was to reduce the length of mathematical verifications of programs. Many practitioners set aside the hard part about mathematical verification in favor of the easy part about no gotos. But by cutting out the hard part, they cut out much of the real benefit of structured programming: programmers could write programs having no gotos without acquiring the fundamental discipline of mathematical verification, without even discovering that such a discipline exists.

In contrast, learning the rigor of mathematical verification leads to behavioral modification in both individuals and teams of programmers, whether programs are verified formally or informally. Mathematical verification requires precise specifications and formal arguments about correctness with respect to those specifications. Two main behavioral effects are readily observed. First, communication among programmers (and managers) becomes much more precise, especially about specifications and designs. Second, a premium is placed on the simplest programs possible to achieve specified function and performance: if a program looks hard to verify, it is the program that should be simplified, not the verification. The result is high productivity in producing software fit for statistical testing.

The verification methods used in the Cleanroom process are those taught at the IBM Software Engineering Institute. We find that human verification is surprisingly synergistic with statistical testing, because mathematical fallibility is very different from debugging fallibility: errors of mathematical fallibility are easier to find and fix in statistical testing than are errors of debugging fallibility. Perhaps one day automatic verification will be practical, but there is no need to wait for the engineering value of human verification until that day.

Experimental data from projects where both Cleanroom verification and more traditional debugging methods were used offers evidence that the Cleanroom-verified software exhibited higher quality.
Fewer errors were injected into the verified software, and these errors were less severe; the verified software also showed better field quality, all of which was due to the quality of its design.
Findings from an early Cleanroom project (where verified software accounted for part of the implementation) indicate that the verified software accounted for only one fourth the error count. Moreover, the verified software was responsible for less than 10 percent of the severe failures. These findings substantiate that verified software contains fewer defects and that those defects that are present are simpler and have less effect on product execution.
The method of human mathematical verification used in Cleanroom development, called functional verification, is quite different from the method of axiomatic verification taught in most universities. It is based on functional semantics and on the reduction of software verification to ordinary mathematical reasoning about sets and functions as directly as possible.
The motivation for functional verification, and for the earliest possible reduction of software verification to ordinary reasoning about sets and functions, is the problem of scaling up. A set or function can be described in three lines of mathematical notation where many more lines of English text would otherwise be needed. There is more human effort in understanding the three lines of mathematical notation, but the reward is that designers and verifiers can reason directly with sets and functions rather than with arrays and pointers. Such reasoning serves for entire systems in top-level design as well as for detailed design at the program level. The evidence that such reasoning is effective is in the small amount of backtracking required during development.
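As a hypothetical illustration of this style of reasoning (ours, in modern notation, not the article's), the behavior of a small program can be stated as an ordinary function on sets, and the verification argument reduces to checking that the code computes exactly that function. The names below are invented.

```python
# Sketch of functional-verification style reasoning (our example).
# Intended function, stated as ordinary set mathematics:
#   f(inventory, shipped) = inventory - shipped   (set difference)
# The verification argument: every item kept is in inventory and not
# in shipped, and every such item is kept, so the loop computes f.

def remaining_stock(inventory: set, shipped: set) -> set:
    """Compute inventory - shipped, the intended function."""
    remaining = set()
    for item in inventory:       # iteration over the input set
        if item not in shipped:  # selection on set membership
            remaining.add(item)  # ordinary set insertion
    return remaining

# Spot check against the specification stated as set mathematics:
assert remaining_stock({"a", "b", "c"}, {"b"}) == {"a", "b", "c"} - {"b"}
```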
Cleanroom software engineering

While it may sound revolutionary at first glance, the Cleanroom software engineering process is an evolutionary step in software development. It is evolutionary in eliminating debugging because, over the past 20 years, more and more program design has been developed in design languages that must be verified rather than executed. So the relative effort for advanced teams in debugging, compared to verifying, is now quite small, even in non-Cleanroom development. It is evolutionary in statistical testing because, with higher quality programs at the outset, representative-user testing is correspondingly a greater and greater fraction of the total testing effort. And, as noted above, there is a synergism between human verification and statistical testing: people are fallible with human verification, but the errors they leave behind are easier to find and fix than those left behind from debugging.

Results from an early Cleanroom project indicate that corrections to the verified software were accomplished in about one fifth the average time of corrections to debugged software, and reported defects were typically isolated and repaired on the first attempt.
Combining human verification with statistical testing makes it possible to define a new software-development process of incremental releases to achieve a structured progression toward the delivered system, in which mathematical arguments ensure the accuracy of the design and statistical inference measures its quality. The strategy is to create specifications and verify the design to those specifications, as well as to certify the reliability of the accumulating increments.

The Cleanroom design methods use a small set of structuring primitives for software logic (sequence, selection, and iteration) and a companion set of data-structuring primitives (sets, stacks, and queues), which together support designs with strongly typed data operations and a straightforward translation to standard programming forms.
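A hypothetical fragment in this style (our example, not the article's notation) composes only sequence, selection, and iteration over typed queue and set data, which keeps every part of the design open to functional verification.

```python
from collections import deque

# Design fragment restricted to the primitives named above: sequence,
# selection, and iteration for logic; a queue and a set for data.

def dispatch_jobs(pending, completed):
    """Drain the pending queue of job names, recording each distinct
    job exactly once in the completed set."""
    while pending:                # iteration
        job = pending.popleft()   # sequence of queue operations
        if job not in completed:  # selection
            completed.add(job)
    return completed

print(dispatch_jobs(deque(["a", "b", "a"]), set()))  # {'a', 'b'}
```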
In the Cleanroom model, structural testing that requires knowledge of the design is replaced by formal verification, but functional testing is retained. In fact, this testing can be performed with the two goals of demonstrating that the product requirements are correctly implemented in the software and of providing a basis for product-reliability prediction. The latter is a unique Cleanroom capability that results from its statistical testing method, which supports statistical inference from the test environment to operating environments.

The Cleanroom life cycle of incremental product releases supports software testing throughout product development rather than only when it is completed. This allows the continuous assessment of product quality from an execution perspective and permits any necessary adjustments in the process to improve observed product quality.
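For illustration, statistical testing draws test cases from the specified probability distribution over usage scenarios, so failure data observed in test supports inference about operational reliability. This is our sketch; the scenario names and probabilities are invented.

```python
import random

# Hypothetical usage profile: each scenario of use with its probability
# in the operating environment (probabilities sum to 1).
usage_profile = {
    "query_account":     0.60,
    "update_account":    0.25,
    "generate_report":   0.10,
    "administer_system": 0.05,
}

def sample_scenarios(profile, n, seed=1):
    """Draw n statistical test cases according to the usage distribution."""
    rng = random.Random(seed)
    names = list(profile)
    weights = [profile[name] for name in names]
    return rng.choices(names, weights=weights, k=n)

# Because the test cases mirror expected field usage, interfailure
# times observed here estimate operational mean time to failure.
print(sample_scenarios(usage_profile, 1000)[:10])
```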
As each release becomes available, statistical testing begins, and each correction made during testing creates a new software product very much like its predecessor but with a different reliability (intended to be better, but possibly worse). However, each of these corrected software products, by itself, will be subject to a strictly limited amount of testing before it is superseded by its successor, and statistical estimates of its reliability will be correspondingly limited in confidence.

Therefore, to aggregate the testing experience for an increment release, we define a model of reliability change with parameters M and R (as discussed in Currit, Dyer, and Mills) that predicts the mean time to failure after c software changes, of the form

    MTTF = M * R^c

where M is the initial mean time to failure of the release and R is the observed effectiveness ratio for improving mean time to failure with software changes.
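A small numeric sketch of this model follows; the formula is the one stated above, but the parameter values are invented for illustration.

```python
def predicted_mttf(m_initial, r, changes):
    """MTTF = M * R**c: mean time to failure predicted after c changes."""
    return m_initial * r ** changes

# Assumed values: M = 2.0 hours initial MTTF, R = 1.2 (each correction
# improves MTTF by 20 percent on average).
for c in range(6):
    print(f"after {c} changes: predicted MTTF = {predicted_mttf(2.0, 1.2, c):.2f} hours")
```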
Although various technical rationales support this model, its adoption should be a contractual basis for the eventual certification of the finally released software by the developer to the user. Moreover, because there is no way to know that the model parameters M and R are absolutely correct, we define statistical estimators for them in terms of the test data. The choice of these estimators is based on statistical analysis, but the choice should also be a contractual basis for certification. The net result of these two contractual bases, a reliability change model and statistical estimators for its parameters,
gives producer and receiver (seller and purchaser) a common, objective way to certify the reliability of the delivered software. The certification is a scientific, statistical inference obtained by a prescribed computation on test data warranted to be correct by the developer.

In principle, the estimators for software reliability are no more than a sophisticated way to average the times between failure, taking into account the change activity called for during statistical testing. As test data materializes, the reliability can be estimated, even change by change. And with successful corrections, the reliability estimates will improve with further testing, providing objective, quantitative evidence of the achievement of reliability goals.
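One simple way to estimate M and R from test data, a sketch under our own assumptions rather than the article's prescribed estimators: take logarithms of the model, log t = log M + c log R, and fit a least-squares line to the observed times between failures.

```python
import math

def estimate_m_r(interfail_times):
    """Fit log t_c = log M + c log R by least squares to the observed
    times between failures t_0, t_1, ..., one per software change."""
    n = len(interfail_times)
    xs = range(n)
    ys = [math.log(t) for t in interfail_times]
    x_bar = (n - 1) / 2
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return math.exp(y_bar - slope * x_bar), math.exp(slope)  # (M, R)

# Invented interfailure times (hours) that lengthen as corrections succeed:
m_hat, r_hat = estimate_m_r([1.9, 2.5, 2.8, 3.6, 4.1])
print(f"estimated M = {m_hat:.2f} hours, estimated R = {r_hat:.2f}")
```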
The certification process also provides a basis for management control of the software development to meet reliability goals. For example, process analysis may reveal both unexpected sources of errors (such as poor understanding of the underlying hardware) and appropriate corrections in the process itself for later increments. Intermediate rehearsals of the final certification provide a basis for management feedback to meet final goals.

The treatment of separate increment releases should also be part of the contractual basis between the developer and user. Perhaps the simplest treatment is to treat separate increments independently. However, more statistical confidence in the final certification will result from
aggregating testing experience across increments. A simple aggregation could complement separately treated increments with management judgment. A more sophisticated treatment of separate releases would be to model the failure contribution of each newly released part of the software and to develop stratified estimators release by release. Earlier releases can be expected to mature while later releases come under test. This maturation rate in reliability improvement can be used to estimate the amount of test time required to reach prescribed reliability levels.

Mean time to failure and the rate of change in mean time to failure can be useful decision tools for project management. For software under test, which has both an estimated mean time to failure and a known rate of change in mean time to failure, decisions on releasability can be based on an evaluation of life-cycle costs rather than on just marketing effect. When the software is delivered, the average cost for each failure must include both the direct costs of repair and the indirect costs to the users (which may be much larger). These postdelivery costs can be estimated from the number of expected failures and compared with the costs for additional predelivery testing. Judgments could then be made about the profitability of continuing tests to minimize lifetime costs, as the arithmetic sketch below illustrates.
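This release decision reduces to expected-value arithmetic. A sketch with invented figures: compare the cost of additional predelivery testing against the expected cost of the field failures that testing would prevent.

```python
def keep_testing(expected_failures_prevented, cost_per_field_failure,
                 cost_of_added_testing):
    """Continue predelivery testing while the expected field-failure
    cost avoided exceeds the cost of the additional testing."""
    avoided = expected_failures_prevented * cost_per_field_failure
    return avoided > cost_of_added_testing

# Invented figures: three field failures avoided at $40,000 each
# (repair plus user impact) versus $75,000 of additional testing.
print(keep_testing(3.0, 40_000.0, 75_000.0))  # True: more testing pays off
```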
Harlan D. Mills is director of the Information Systems Institute in Vero Beach, Fla., and a professor of computer science at the University of Maryland at College Park. He has been an IBM fellow, a member of the IBM Corporate Technical Committee, and director of software engineering and technology in the IBM Federal Systems Division. Mills received his PhD in mathematics from Iowa State University and served on the faculties of Iowa State, Princeton, New York, and Johns Hopkins universities. He has served as a regent of the DPMA Education Foundation and as a governor of the Computer Society of the IEEE.
Michael Dyer is a senior programming manager at IBM Federal Systems Division. He has held various management and staff positions and is currently responsible for Cleanroom software development and methodology. Dyer received a BS in mathematics from Fordham University. He is a member of the Computer Society of the IEEE.
Richard C. Linger is a senior programming manager of software engineering studies in the IBM Federal Systems Division. Linger received a BS in electrical engineering from Duke University. He is a member of the ACM and the Computer Society of the IEEE.
Mills can be reached at the Information Systems Institute, 2770 Indian River Blvd., Vero Beach, FL 32960. Dyer and Linger can be contacted at IBM Federal Systems Division, 6600 Rockledge Dr., Bethesda, MD 20817.