Engineering Software under Statistical Quality Control: The Cleanroom Process

The Cleanroom process is a software engineering approach that places software development under statistical quality control. The process emphasizes human mathematical verification and statistical testing to improve software quality and productivity. The authors provide evidence of the effectiveness of this method through case studies and experiments.

University of Tennessee, Knoxville
TRACE: Tennessee Research and Creative Exchange

The Harlan D. Mills Collection, Science Alliance

9-1987

Cleanroom Software Engineering

Harlan D. Mills
M. Dyer
R. C. Linger

Follow this and additional works at: https://trace.tennessee.edu/utk_harlan

Part of the Software Engineering Commons

Recommended Citation
Mills, Harlan D.; Dyer, M.; and Linger, R. C., "Cleanroom Software Engineering" (1987). The Harlan D. Mills Collection.
https://trace.tennessee.edu/utk_harlan/18

This Article is brought to you for free and open access by the Science Alliance at TRACE: Tennessee Research and Creative Exchange. It has been accepted for inclusion in The Harlan D. Mills Collection by an authorized administrator of TRACE: Tennessee Research and Creative Exchange. For more information, please contact trace@utk.edu.



QUALITY ASSURANCE

Cleanroom Software Engineering

Harlan D. Mills, Information Systems Institute
Michael Dyer and Richard C. Linger, IBM Federal Systems Division

Software quality can be engineered under statistical quality control and delivered with better quality. The Cleanroom process gives management an engineering approach to release reliable products.

Recent experience demonstrates that software can be engineered under statistical quality control and that certified reliability statistics can be provided with delivered software.

IBM's Cleanroom process¹ has uncovered a surprising synergy between mathematical verification and statistical testing of software, as well as a major difference between mathematical fallibility and debugging fallibility in people. With the Cleanroom process, you can engineer software under statistical quality control. As with cleanroom hardware development, the process's first priority is defect prevention rather than defect removal (of course, any defects not prevented should be removed). This first priority is achieved by using human mathematical verification in place of program debugging to prepare software for system test.

The second priority is statistical certification of the software's quality through representative-user testing at the system level. The measure of quality is the mean time to failure, in appropriate units of time (real or processor time), of the delivered product. The certification takes into account the growth of reliability achieved during system testing before delivery.

To gain the benefits of quality control during development, Cleanroom software engineering requires a development cycle of concurrent fabrication and certification of product increments that accumulate into the system to be delivered. This lets the fabrication process be altered on the basis of early certification results to achieve the quality desired.
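The representative-user testing described above draws test scenarios from a probability distribution over the software's expected usage. A minimal sketch of such statistical test selection follows; the scenario names and probabilities are invented for illustration and do not come from the article:

```python
import random

# Hypothetical usage distribution: each scenario's probability of
# occurring in the field (illustrative numbers only).
USAGE = {"query": 0.60, "update": 0.30, "report": 0.10}

def draw_scenarios(n, seed=42):
    """Draw n test scenarios in proportion to their field usage."""
    rng = random.Random(seed)
    names = list(USAGE)
    weights = [USAGE[name] for name in names]
    return rng.choices(names, weights=weights, k=n)

sample = draw_scenarios(10_000)
```

Because the test sample mirrors expected field usage, failure statistics gathered from it support statistical inference about reliability in operation, which is what the certification depends on.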

Cleanroom experience

Typical of our experience with the Cleanroom process were three projects: an IBM language product (40,000 lines of code), an Air Force contract helicopter flight program (35,000 lines), and a NASA contract space-transportation planning system (45,000 lines). A major finding in these cases was that human verification, even though fallible, could replace debugging in software development; even informal human verification can produce

September 1987 0740-7459/87/0900/0019/$01.00 © 1987 IEEE

statistical quality can be measured. For even the simplest of products, there is no absolute best statistical measure of quality. For example, a statistical average can be computed many ways: an arithmetic average, a weighted average, a geometric average, and a reciprocal average can each be better than the others in various circumstances. It finally comes down to a judgment of business and management in every case. In most cases, the judgment is practically automatic from experience and precedent, but it is a judgment. In the case of software, that judgment has no precedent because the concept of producing software under statistical quality control is just at its inception.

A new basis for the certification of software quality, given in Currit, Dyer, and Mills,¹ is based on a new software-engineering process.⁴ This basis requires a software specification and a probability distribution on scenarios of the software's use; it then defines a testing procedure and a prescribed computation from test data results to provide a certified statistical quality of delivered software.

This new basis represents scientific and engineering judgment of a fair and reasonable way to measure statistical quality of software. As for simpler products, there is no absolute best and no logical argument for it beyond business and management judgment. But it can provide a basis for software statistical quality as a contractual item where no such reasonable item existed before.

The certification of software quality is given in terms of its measured reliability over a probability distribution of usage scenarios in statistical testing. Certification is an ordinary process in business, even in the certification of the net worth of a bank. As in software certification, there is a fact-finding process, followed by a prescribed computation.

In the case of a bank, the fact-finding produces assets and liabilities, and the computation subtracts the sum of the liabilities from the sum of the assets. For the bank, there are other measures of importance besides net worth, such as goodwill, growth, and security of assets, just as there are other measures for software than reliability, such as maintainability

and performance. So a certification of software quality is a business measure, part of the overall consideration in producing and receiving software.

Once a basis for measuring statistical quality of delivered software is available, creating a management process for statistical quality control is relatively straightforward. In principle, the goal is to find ways to repeatedly rehearse the final measurement during software development and to modify the development process, where necessary, to achieve a desired level of statistical quality. The Cleanroom process has been designed to carry out this principle. It calls for the software to be developed in increments that permit realistic measurements of statistical quality during development, with provision for improving the measured quality by additional testing, by process changes (such as increased inspections and configuration control), or by other methods.

Statistical quality measurements ultimately come down to management and business judgments.

Mathematical verification

Software engineering without mathematical verification is no more than a buzzword. When Dijkstra introduced the idea of structured programming at an early software-engineering conference,⁵ his principal motivation was to reduce the length of mathematical verifications of programs by using a few basic control structures and eliminating gotos.

Many popularizers of structured programming have cut out the rigorous part about mathematical verification in favor of the easy part about no gotos. But by cutting out the rigorous part, they have also cut out much of the real benefit of structured programming. As a result, a lot of people have become three-day wonders in having no gotos without acquiring the fundamental discipline of mathematical verification in engineering software, or even discovering that such a discipline exists.

In contrast, learning the rigor of mathematical verification leads to behavioral modification in both individuals and teams of programmers, whether programs are verified formally or not. Mathematical verification requires precise specifications and formal arguments about the correctness with respect to those specifications.

Two main behavioral effects are readily observable. First, communication among programmers (and managers) becomes much more precise, especially about program specifications. Second, a premium is placed on the simplest programs possible to achieve specified function and performance. If a program looks hard to verify, it is the program that should be revised, not the verification. The result is high productivity in producing software that requires little or no debugging.

Cleanroom software engineering uses mathematical verification to replace program debugging before release to statistical testing. This mathematical verification is done by people, based on standard software-engineering practices such as those taught at the IBM Software Engineering Institute. We find that human verification is surprisingly synergistic with statistical testing: mathematical fallibility is very different from debugging fallibility, and errors of mathematical fallibility are much easier to discover in statistical testing than are errors of debugging fallibility. Perhaps one day automatic verification

of software will be practical. But there is no need to wait until that day for the engineering value and discipline of mathematical verification.

Experimental data from projects where both Cleanroom verification and more traditional debugging techniques were used offers evidence that the Cleanroom-verified software exhibited higher quality. For the verified software, fewer errors were injected, and these errors were less severe and required less time to find and fix. The verified product also experienced


better field quality, all of which was due to the added care and attention paid during design.

Findings from an early Cleanroom project (where verified software accounted for approximately half the product's function) indicate that verified software accounted for only one fourth the error count. Moreover, the verified software was responsible for less than 10 percent of the severe failures. These findings substantiate that verified software contains fewer defects and that those defects that are present are simpler and have less effect on product execution.

The method of human mathematical verification used in Cleanroom development, called functional verification, is quite different from the method of axiomatic verification usually taught in universities. It is based on functional semantics and on the reduction of software verification to ordinary mathematical reasoning about sets and functions as directly as possible.

The motivation for functional verification, and for the earliest possible reduction of verification reasoning to sets and functions, is the problem of scaling up. A set or function can be described in three lines of ordinary mathematical notation or in 300 lines of English text. There is more human fallibility in 300 lines of English than in three lines of mathematical notation, but the verification paradigm is the same.

By introducing verification in terms of sets and functions, you establish a basis for reasoning that scales up. Large programs have many variables, but only one function. Mills and Linger⁶ gave an additional basis for verifying large programs by designing with sets, stacks, and queues rather than with arrays and pointers. While initially harder to teach than axiomatic verification, functional verification scales up to reasoning for million-line systems in top-level design as well as for hundred-line programs at the bottom level. The evidence that such reasoning is effective is in the small amount of backtracking required in very large systems designed top-down with functional verification.
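Functional verification proceeds by mathematical argument, not by execution, but its central object can be illustrated concretely: the specification is a function (a set of ordered pairs), and the program either computes that function or it does not. The sketch below is a hypothetical illustration (the spec and program are invented) that checks agreement by exhaustive execution over a small domain, whereas Cleanroom would establish it by proof over all inputs:

```python
# Hypothetical specification as a function: for each input n, the intended
# output is the sum 0 + 1 + ... + n (represented as a set of ordered pairs).
SPEC = {n: n * (n + 1) // 2 for n in range(100)}

def program(n):
    """Candidate program whose computed function we compare against SPEC."""
    total = 0
    for i in range(n + 1):
        total += i
    return total

# The program is correct over this domain iff its function equals SPEC.
correct = all(program(n) == expected for n, expected in SPEC.items())
```

However many variables a program has, it still denotes one function; that is the reduction to sets and functions that lets the reasoning scale up.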

Cleanroom software engineering

While it may sound revolutionary at first glance, the Cleanroom software engineering process is an evolutionary step in software development. It is evolutionary in eliminating debugging because, over the past 20 years, more and more program design has been developed in design languages that must be verified rather than executed. So the relative effort for advanced teams in debugging, compared to verifying, is now quite small, even in non-Cleanroom development.

It is evolutionary in statistical testing because, with higher quality programs at the outset, representative-user testing is correspondingly a greater and greater fraction of the total testing effort. And, as already noted, we have found a surprising synergism between human verification and statistical testing: People are fallible with human verification, but the errors they leave behind for system testing are much easier to find and fix than those left behind from debugging.

In verified software, developers essentially never resorted to debugging.

Results from an early Cleanroom project, where verification and debugging were used to develop different parts of the software, indicate that corrections to the verified software were accomplished in about one fifth the average time of corrections to the debugged software. In the verified software case, the developers essentially never resorted to debugging (less than 0. percent of the cases) to isolate and repair reported defects.

The feasibility of combining human verification with statistical testing makes it possible to define a new software-engineering process under statistical quality control.¹ For that purpose, we define a new development life cycle of successive incremental releases to achieve a structured specification of function and statistical usage. A structured specification is a formal specification (a relation or set of ordered pairs) for a decomposition into a nested set of subspecifications for successive product releases. A structured specification defines not only the final software but also a release plan for its incremental implementation and statistical testing.
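The nesting of a structured specification can be pictured directly: each release's subspecification is, as a relation, a subset of its successor's, so the release plan falls out of the decomposition. A toy sketch with invented scenario names (none of these appear in the article):

```python
# Hypothetical structured specification: each release's specification is a
# relation (set of input/output pairs) nested inside its successor's.
RELEASE_1 = {("login", "session opened"), ("logout", "session closed")}
RELEASE_2 = RELEASE_1 | {("query", "result set")}
FINAL     = RELEASE_2 | {("report", "formatted document")}

RELEASE_PLAN = [RELEASE_1, RELEASE_2, FINAL]

# Every increment must contain its predecessor's behavior.
nested = all(a <= b for a, b in zip(RELEASE_PLAN, RELEASE_PLAN[1:]))
```

Each increment can therefore be fabricated, certified, and shipped on its own while remaining a correct fragment of the final specification.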

A stepwise refinement or decomposition of requirements creates successive levels of software design. At each level of decomposition, mathematics-based correctness arguments ensure the accuracy of the evolving design and the continued integrity of the product requirements. The work strategy is to create specifications and the design to those specifications, as well as to check the correctness of that design before proceeding to the next decomposition.

The Cleanroom design methods use a limited set of design primitives to capture software logic (sequence, selection, and iteration). They use module and procedure primitives to package software designs into products. Decomposition of software data requirements is handled by a companion set of data-structuring primitives (sets, stacks, and queues) that ensure product designs with strongly typed data operations. Specially defined design languages document designs and provide a straightforward translation to standard programming forms.

In the Cleanroom model, structural testing that requires knowledge of the design is replaced by formal verification, but functional testing is retained. In fact, this testing can be performed with the two goals of demonstrating that the product requirements are correctly implemented in the software and of providing a basis for product-reliability prediction. The latter is a unique Cleanroom capability that results from its statistical testing method, which supports statistical inference from the test to operating environments.

The Cleanroom life cycle of incremental product releases supports software testing throughout the product development rather than only when it is completed. This allows the continuous assessment of product quality from an execution perspective and permits any necessary adjustments in the process to improve observed product quality. As each release becomes available,


software product very much like its predecessor but with a different reliability (intended to be better, but possibly worse). However, each of these corrected software products, by itself, will be subject to a strictly limited amount of testing before it is superseded by its successor, and statistical estimates of reliability will be correspondingly limited in confidence.

Therefore, to aggregate the testing experience for an increment release, we define a model of reliability change with parameters M and R (as discussed in Currit, Dyer, and Mills¹) for the mean time to failure after c software changes, of the form

MTTF = M * R^c

where M is the initial mean time to failure of the release and where R is the observed effectiveness ratio for improving mean time to failure with software changes.
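The model is directly computable. A minimal sketch follows; the parameter values are invented for illustration and are not taken from the projects discussed:

```python
def mttf_after_changes(m_initial, r_ratio, changes):
    """MTTF = M * R**c: mean time to failure after `changes` corrections,
    where m_initial is the release's initial MTTF and r_ratio is the
    observed effectiveness ratio per change (R > 1 means each change
    improves reliability)."""
    return m_initial * r_ratio ** changes

# Illustrative values: 20 hours initial MTTF, 20 percent improvement
# per change, projected over 10 corrections.
projected = mttf_after_changes(20.0, 1.2, 10)
```

Note that R < 1 models the case the article flags as possible but unintended: corrections that make reliability worse, so the projected MTTF shrinks with each change.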

Although various technical rationales are given for this model by Currit, Dyer, and Mills,¹ it should be considered a contractual basis for the eventual certification of the finally released software by the developer to the user. Moreover, because there is no way to know that the model parameters M and R are absolutely correct, we define statistical estimators for them in terms of the test data. The choice of these estimators is based on statistical analysis, but the choice should also be a contractual basis for certification.

The net result of these two contractual bases, a reliability change model and statistical estimators for its parameters,

References

1. A. Currit, M. Dyer, and H.D. Mills, "Certifying the Reliability of Software," IEEE Trans. Software Eng., Jan. 1986, pp. 3-11.
2. Cobol Structuring Facility Users Guide, IBM Corp., Armonk, N.Y., 1986.
3. R.W. Selby, V.R. Basili, and F.T. Baker, "Cleanroom Software Development: An Empirical Evaluation," Tech. Report TR-1415, Computer Science Dept., Univ. of Maryland, College Park, Md., Feb.

gives producer and receiver (seller and purchaser) a common, objective way to certify the reliability of the delivered software. The certification is a scientific, statistical inference obtained by a prescribed computation on test data warranted to be correct by the developer.

In principle, the estimators for software reliability are no more than a sophisticated way to average the times between failure, taking into account the change activity called for during statistical testing. As test data materializes, the reliability can be estimated, even change by change. And with successful corrections, the reliability estimates will improve with further testing, providing objective, quantitative evidence of the achievement of reliability goals.
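The prescribed estimators are defined in Currit, Dyer, and Mills¹; as a stand-in illustration of the idea of a "sophisticated average," one simple way to recover M and R from observed interfailure times is a least-squares fit of log t_c against the change index c, since the model gives log MTTF = log M + c log R. This is an illustrative sketch, not the paper's computation:

```python
import math

def estimate_m_r(interfailure_times):
    """Fit log t_c = log M + c * log R by least squares over the change
    index c, and return the estimates (M, R)."""
    n = len(interfailure_times)
    xs = range(n)                                # change index c = 0, 1, ...
    ys = [math.log(t) for t in interfailure_times]
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) \
        / sum((x - x_bar) ** 2 for x in xs)
    intercept = y_bar - slope * x_bar
    return math.exp(intercept), math.exp(slope)  # (M, R)
```

Fed change-by-change test data, such an estimator can be recomputed after every correction, which is what makes the intermediate "rehearsals" of the final certification possible.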

This objective evidence is itself a basis for management control of the software development to meet reliability goals. For example, process analysis may reveal both unexpected sources of errors (such as poor understanding of the underlying hardware) and appropriate corrections in the process itself for later increments. Intermediate rehearsals of the final certification provide a basis for management feedback to meet final goals.

The treatment of separate increment releases should also be part of the contractual basis between the developer and user. Perhaps the simplest treatment is to treat separate increments independently. However, more statistical confidence in the final certification will result from

4. R.C. Linger, H.D. Mills, and B.I. Witt, Structured Programming: Theory and Practice, Addison-Wesley, Reading, Mass., 1979.
5. E.W. Dijkstra, "Structured Programming," in Software Engineering Techniques, J.N. Burton and B. Randell, eds., NATO Science Committee, New York, 1969, pp. 88-93.
6. H.D. Mills and R.C. Linger, "Data-Structured Programming," IEEE Trans. Software Eng., Feb. 1986, pp. 192-197.

aggregate testing experience across increments. A simple aggregation could complement separately treated increments with management judgment.

A more sophisticated treatment of separate releases would be to model the failure contribution of each newly released part of the software and to develop stratified estimators release by release. Earlier releases can be expected to mature while later releases come under test. This maturation rate in reliability improvement can be used to estimate the amount of test time required to reach prescribed reliability levels.

Mean time to failure and the rate of change in mean time to failure can be useful decision tools for project management. For software under test, which has both an estimated mean time to failure and a known rate of change in mean time to failure, decisions on releasability can be based on an evaluation of life-cycle costs rather than on just marketing effect. When the software is delivered, the average cost for each failure must include both the direct costs of repair and the indirect costs to the users (which may be much larger). These postdelivery costs can be estimated from the number of expected failures and compared with the costs for additional predelivery testing. Judgments could then be made about the profitability of continuing tests to minimize lifetime costs.
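That releasability judgment can be framed as a simple expected-cost comparison. A hedged sketch; the function name and the cost figures are invented, and in practice the expected-failure counts would come from the reliability estimates described above:

```python
def keep_testing(expected_failures_avoided, cost_per_failure, extra_test_cost):
    """Continue predelivery testing only while the postdelivery failure
    cost it would avoid exceeds the cost of the additional testing."""
    return expected_failures_avoided * cost_per_failure > extra_test_cost

# Illustrative: 4 expected field failures avoided at $15,000 each,
# against $40,000 of additional predelivery testing.
decision = keep_testing(4, 15_000, 40_000)
```

Because indirect user costs can dwarf repair costs, the same arithmetic with a larger cost_per_failure tilts the decision toward continued testing.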

7. A.J. Jordano, "DSM Software Architecture and Development," IBM Technical Directions, No. 3, 1984, pp. 17-28.
8. C.K. Cho, Quality Programming: Developing and Testing Software with Statistical Quality Control, John Wiley & Sons, New York, 1987.
9. H.D. Mills, "Structured Programming: Retrospect and Prospect," IEEE Software, Nov. 1986, pp. 58-66.
10. E.N. Adams, "Optimizing Preventive Service of Software Products," IBM J. Research and Development, Jan. 1984.


Harlan D. Mills is director of the Information Systems Institute in Vero Beach, Fla., and a professor of computer science at the University of Maryland at College Park. He has been an IBM fellow, a member of the IBM Corporate Technical Committee, and director of software engineering and technology in the IBM Federal Systems Division. Mills received his PhD in mathematics from Iowa State University and served on the faculties of Iowa State, Princeton, New York, and Johns Hopkins universities. He has served as a regent of the DPMA Education Foundation and as a governor of the Computer Society of the IEEE.


Michael Dyer is a senior programming manager at IBM Federal Systems Division. He has held various management and staff positions and is currently responsible for Cleanroom software development and methodology. Dyer received a BS in mathematics from Fordham University. He is a member of the Computer Society of the IEEE.

Richard C. Linger is a senior programming manager of software engineering studies in the IBM Federal Systems Division. Linger received a BS in electrical engineering from Duke University. He is a member of the ACM and Computer Society of the IEEE.

Mills can be reached at Information Science Institute, 2770 Indian River Blvd., Vero Beach, FL 32960. Dyer and Linger can be contacted at IBM Federal Systems Division, 6600 Rockledge Dr., Bethesda, MD 20817.
