Software Reliability Engineering By Jackie Wadzinski.

Software Reliability Engineering

By Jackie Wadzinski

The Patriot Missile

Used to destroy incoming Iraqi Scud Missiles

Hailed for effectiveness

Operated for 100 consecutive hours

28 American soldiers killed

Cause: Software Failure

The Patriot Missile: A Learning Experience

The software can be redesigned

A new Patriot Missile can be built

The fate of the 28 soldiers remains the same

THE MORAL: Software Engineers need to find a way to engineer reliability into software.

Objectives

Definition of Software Reliability

Importance of Reliability Engineering

Why Reliability Engineering is Difficult

Reliability Engineering Processes

Weibull

Musa

Monte Carlo

Conclusion

What is Software Reliability?

IEEE Definition:

“The ability of a system or component to perform its required functions under stated conditions for a specified period of time.”

Definition allows for “Just Right” level of reliability for software

Software Reliability and Hardware Reliability have the same definition

Why is Software Reliability Important?

Manager View

Reliable software means satisfied customers

Reliable software means repeat customers

Reliable software is ethical

Legal liability

Customer View

Reliable software saves time

Reliable software increases efficiency

Why Software Reliability is Difficult to Calculate

Without considering program evolution, the failure rate is statistically nonexistent

Failures can arise from design defects, and design defects have many possible causes

Errors can occur without warning

Software quality cannot be improved by replicating identical software components

Periodic restarts can sometimes help fix problems

Errors are caused by incorrect logic, incorrect statements, or incorrect input data

Software may require infinite testing

Software reliability models do not always fit the data points well

Overview

There are many models to choose from when calculating software reliability

Focus on three

Weibull Failure Time Model

Musa’s Basic Execution Time Model

Monte Carlo Simulation

Each model has its own strengths and limitations

Weibull Failure Time

About Weibull Failure Model

Used to model failure processes of hardware

One of the first models to be applied to software reliability modeling

Flexible – accommodates increasing, decreasing or constant failure rates
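
The flexibility comes from the shape parameter of the Weibull distribution. A minimal sketch, assuming the standard two-parameter form with shape `beta` and scale `eta` (names chosen here for illustration, not taken from the slides): the hazard rate is h(t) = (beta / eta) * (t / eta)^(beta - 1), which decreases over time when beta < 1, stays constant when beta = 1, and increases when beta > 1.

```python
def weibull_hazard(t, beta, eta):
    """Hazard (instantaneous failure) rate of a two-parameter Weibull distribution."""
    if t <= 0 or beta <= 0 or eta <= 0:
        raise ValueError("t, beta and eta must be positive")
    return (beta / eta) * (t / eta) ** (beta - 1)


if __name__ == "__main__":
    times = [1.0, 2.0, 4.0]
    for beta in (0.5, 1.0, 2.0):  # illustrative shape values only
        rates = [round(weibull_hazard(t, beta, eta=2.0), 4) for t in times]
        print(f"beta={beta}: {rates}")
```

A decreasing rate (beta < 1) is the case usually hoped for during testing, since it corresponds to failures becoming rarer as faults are removed.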

Weibull Failure Model

Weibull Failure Model Assumptions:

There are a fixed number of faults in the software being tested

Faults are detected and counted over the time intervals (0, t1), (t1, t2), ...

Limitations:

Flexibility allows for greater chance of making the wrong assumption
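
The two assumptions can be combined into a prediction of how many faults each test interval should reveal. The sketch below assumes that each fault's detection time follows a Weibull distribution with CDF F(t) = 1 - exp(-(t / eta)^beta), so the expected count in (t[i-1], t[i]) is N * (F(t[i]) - F(t[i-1])). The function names and the parameter values are illustrative assumptions, not figures from the presentation.

```python
import math


def weibull_cdf(t, beta, eta):
    """Probability that a given fault has been detected by time t."""
    return 1.0 - math.exp(-((t / eta) ** beta)) if t > 0 else 0.0


def expected_faults_per_interval(total_faults, boundaries, beta, eta):
    """Expected number of faults detected in each interval (t[i-1], t[i])."""
    cdf = [weibull_cdf(t, beta, eta) for t in boundaries]
    return [total_faults * (b - a) for a, b in zip(cdf, cdf[1:])]


if __name__ == "__main__":
    # Hypothetical values: 100 faults in total, four equal test intervals.
    counts = expected_faults_per_interval(100, [0, 1, 2, 3, 4], beta=1.5, eta=2.0)
    print([round(c, 1) for c in counts])
```

Fitting the model means choosing `beta` and `eta` so that these expected counts track the observed counts, which is where a wrong assumption about the shape can mislead.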

Weibull Failure Model Example

Notice how the model follows the actual data

Musa

About Musa’s Basic Execution Time Model

Developed by John Musa of AT&T Bell Laboratories

One of the first models to use actual execution time of software components versus calendar time

Time between failures is expressed in terms of CPU time

Musa’s Basic Execution Time Model

Uses a Poisson Distribution

Model Assumptions:

The execution times between failures are exponentially distributed

The hazard rate for a single fault is constant

Limitations:

Assumes new faults are not introduced after correction

Assumes number of faults decreases over time
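
The commonly cited closed form of the basic execution time model can be sketched as follows. The symbols are assumptions introduced here: `lambda0` is the initial failure intensity, `nu0` is the total number of failures expected over unlimited execution time, and `tau` is the CPU time used so far. Under those definitions the failure intensity decays as lambda0 * exp(-lambda0 * tau / nu0), and the expected cumulative failures are nu0 * (1 - exp(-lambda0 * tau / nu0)).

```python
import math


def failure_intensity(tau, lambda0, nu0):
    """Failure intensity (failures per CPU hour) after tau hours of execution."""
    return lambda0 * math.exp(-lambda0 * tau / nu0)


def expected_failures(tau, lambda0, nu0):
    """Expected cumulative failures experienced after tau hours of execution."""
    return nu0 * (1.0 - math.exp(-lambda0 * tau / nu0))


if __name__ == "__main__":
    # Hypothetical parameters: 10 failures per CPU hour at the start,
    # 100 failures expected in total.
    for tau in (0, 5, 10, 20):
        print(tau,
              round(failure_intensity(tau, 10.0, 100.0), 3),
              round(expected_failures(tau, 10.0, 100.0), 1))
```

The intensity only ever falls in this sketch, which reflects the two limitations listed above: corrections are assumed perfect and the fault count is assumed to shrink.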

Musa’s Basic Execution Time Model Example

Notice how the model follows the actual data

Monte Carlo Simulation

About Monte Carlo Simulation

Developed in the 1940s as part of the atomic bomb program

Named after Monte Carlo, Monaco, because the city’s casinos featured games of chance like dice and roulette

Today Monte Carlo Simulations are used in many applications including physics, finance, and system reliability

Monte Carlo Simulation

Used for very complex problems that are difficult to solve analytically or for which no exact solution exists

Uses statistics to mathematically model real life processes and then estimates the probability of possible outcomes

Involves fitting a curve to a process and then using the fitted curve to model a process over time

Dice Example
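
A minimal sketch of a dice-based estimate, offered as an illustration rather than the presentation's own example: roll two fair dice many times and count how often they sum to a target value. The function name and the choice of target are assumptions made here.

```python
import random


def estimate_sum_probability(target, trials, seed=None):
    """Estimate P(two fair dice sum to target) by repeated random rolls."""
    rng = random.Random(seed)
    hits = sum(
        1 for _ in range(trials)
        if rng.randint(1, 6) + rng.randint(1, 6) == target
    )
    return hits / trials


if __name__ == "__main__":
    estimate = estimate_sum_probability(target=7, trials=100_000, seed=1)
    print(f"estimated {estimate:.4f}, exact {6 / 36:.4f}")
```

Here the exact answer (6/36) is known, so the simulation can be checked against it; the method earns its keep on problems where no such answer is available.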

Monte Carlo Simulation Process

Determine a probability function

Weibull Distribution – Best for failure process

Lognormal Distribution – Best for repair process

Determine the random number generator, the source for selecting random numbers that are distributed uniformly on the proper unit interval

Determine a sampling rule for selecting samples for the model given a unit interval of random numbers

Record a count of successes and failures
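
The four steps can be sketched in a few lines, under stated assumptions: the probability function is a two-parameter Weibull (shape `beta`, scale `eta`), the random number generator is Python's `random.Random`, the sampling rule is the inverse-transform formula t = eta * (-ln(1 - u))^(1 / beta), and a trial counts as a success when the sampled failure time exceeds a hypothetical `mission_time`. None of these names or values come from the slides.

```python
import math
import random


def sample_weibull(u, beta, eta):
    """Sampling rule: map a uniform number u in [0, 1) to a Weibull failure time."""
    return eta * (-math.log(1.0 - u)) ** (1.0 / beta)


def simulate_reliability(mission_time, beta, eta, trials, seed=None):
    """Return (success fraction, successes, failures) over the given trials."""
    rng = random.Random(seed)  # uniform random number generator
    successes = failures = 0
    for _ in range(trials):
        if sample_weibull(rng.random(), beta, eta) > mission_time:
            successes += 1   # no failure before the mission ended
        else:
            failures += 1
    return successes / trials, successes, failures


if __name__ == "__main__":
    estimate, ok, bad = simulate_reliability(1.0, beta=1.5, eta=2.0,
                                             trials=100_000, seed=42)
    exact = math.exp(-((1.0 / 2.0) ** 1.5))
    print(f"simulated {estimate:.4f} ({ok} ok, {bad} failed), exact {exact:.4f}")
```

For a single Weibull component the exact reliability is known, so the simulation is only a check; the same loop applies unchanged when the sampled quantity has no closed form.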

Monte Carlo Example

Select a random location within the rectangle

If the selected location is blue, record a hit

Repeat 10,000 times

Blue Area = (Hits / 10,000) * Area of Rectangle

Note: The standard error in the result is inversely proportional to the square root of the sample size
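
The area estimate above can be sketched as follows. The slide's picture is not available, so the blue region is assumed here to be a quarter disk of radius 1 inside a 2 by 1 rectangle; `is_blue` and the dimensions are hypothetical stand-ins for whatever shape the slide showed.

```python
import math
import random


def is_blue(x, y):
    """Hypothetical blue region: quarter disk of radius 1 centred at the origin."""
    return x * x + y * y <= 1.0


def estimate_blue_area(width, height, trials=10_000, seed=None):
    """Blue Area = (Hits / trials) * Area of Rectangle."""
    rng = random.Random(seed)
    hits = sum(
        1 for _ in range(trials)
        if is_blue(rng.uniform(0.0, width), rng.uniform(0.0, height))
    )
    return hits / trials * (width * height)


if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        area = estimate_blue_area(2.0, 1.0, trials=n, seed=7)
        print(f"{n:>9} samples: area = {area:.4f} (exact {math.pi / 4:.4f})")
```

Because the standard error shrinks with the square root of the sample size, each hundredfold increase in samples buys roughly one more decimal digit of accuracy.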

Monte Carlo Software Example

Arbitrary 3 component subsystem

The failure probability of each component is given in the diagram

If the first component fails, then the second is checked

If the second component fails, then the third component is checked

If the third component fails, then the entire subsystem fails
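
The checking sequence above can be simulated directly. In this sketch the three failure probabilities are hypothetical placeholders, since the diagram's values are not available, and the components are assumed to fail independently, in which case the analytical subsystem failure probability is simply the product of the three.

```python
import random


def subsystem_fails(failure_probs, rng):
    """Check components in order; the subsystem fails only if every one fails."""
    for p in failure_probs:
        if rng.random() >= p:  # this component works, so the subsystem survives
            return False
    return True


def simulate_failure_probability(failure_probs, trials, seed=None):
    """Fraction of simulated trials in which the whole subsystem fails."""
    rng = random.Random(seed)
    failures = sum(subsystem_fails(failure_probs, rng) for _ in range(trials))
    return failures / trials


if __name__ == "__main__":
    probs = (0.1, 0.2, 0.3)  # hypothetical values, not those from the slide
    exact = 1.0
    for p in probs:
        exact *= p
    print("simulated: ", simulate_failure_probability(probs, 1_000_000, seed=3))
    print("analytical:", exact)
```

Comparing the simulated fraction with the analytical product is a quick sanity check before applying the same approach to systems too complex to solve by hand.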

The actual failure probability of the subsystem is:

The results of the actual simulation are:

Conclusion

Engineering reliable software is important to both the engineer and the end user

Engineering reliable software is not an easy task to accomplish

There are methods available for measuring reliability

Each method has its strengths and weaknesses

At this time, no one method is superior

Questions
