Software Reliability and Quality Management
Module 13
Software Reliability and Quality Management
Version 2 CSE IIT Kharagpur
Lesson 32
Software Reliability Issues
Specific Instructional Objectives
At the end of this lesson the student would be able to:
- Differentiate between a repeatable software development organization and a non-repeatable software development organization.
- Explain the relationship between the number of latent errors in a software system and its reliability.
- Identify the main reasons why software reliability is difficult to measure.
- Explain how the characteristics of hardware reliability and software reliability differ.
- Identify the reliability metrics which can be used to quantify the reliability of software products.
- Identify the different types of failures of software products.
- Explain the reliability growth models of a software product.
Repeatable vs non-repeatable software development organization
A repeatable software development organization is one in which the software development process is person-independent. In a non-repeatable software development organization, a software development project becomes successful primarily due to the initiative, effort, brilliance, or enthusiasm displayed by certain individuals. Thus, in a non-repeatable software development organization, the chance of successfully completing a software project depends to a great extent on the team members.
Software reliability
Reliability of a software product essentially denotes its trustworthiness or dependability. Alternatively, reliability of a software product can be defined as the probability of the product working correctly over a given period of time. It is obvious that a software product having a large number of defects is unreliable. It is also clear that the reliability of a system improves if the number of defects in it is reduced. However, there is no simple relationship between the observed system reliability and the number of latent defects in the system. For example, removing errors from parts of a software product which are rarely executed makes little difference to the perceived reliability of the product.

It has been experimentally observed, by analyzing the behavior of a large number of programs, that 90% of the execution time of a typical program is spent in executing only 10% of the instructions in the program. These most-used 10% of the instructions are often called the core of the program. The remaining 90% of the program statements are called non-core and are executed for only 10% of the total execution time. It therefore may not be very surprising to note that removing 60% of the product defects from the least used parts of a system would typically lead to only a 3% improvement in product reliability. It is clear that the amount by which the overall reliability of a program improves due to the correction of a single error depends on how frequently the corresponding instruction is executed.

Thus, the reliability of a product depends not only on the number of latent errors but also on the exact location of the errors. Apart from this, reliability also depends upon how the product is used, i.e. on its execution profile. If the input data to the system is selected such that only the correctly implemented functions are executed, none of the errors will be exposed and the perceived reliability of the product will be high. On the other hand, if the input data is selected such that only those functions which contain errors are invoked, the perceived reliability of the system will be very low.
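The dependence on the execution profile can be made concrete with a small sketch. The function names, failure probabilities, and usage fractions below are hypothetical and chosen only for illustration; the sketch assumes each request invokes exactly one function and that functions fail independently of one another.

```python
def perceived_reliability(failure_prob, usage_profile):
    """Probability that one randomly chosen request succeeds.

    failure_prob:  function name -> probability that a call to it fails
    usage_profile: function name -> fraction of requests that invoke it
    """
    total = sum(usage_profile.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("usage fractions must sum to 1")
    return sum(share * (1.0 - failure_prob[name])
               for name, share in usage_profile.items())


# Hypothetical product: two correct functions and one defective one.
failure_prob = {"search": 0.0, "checkout": 0.0, "export": 0.5}

# Two hypothetical execution profiles for the same product.
typical_user = {"search": 0.70, "checkout": 0.29, "export": 0.01}
export_heavy_user = {"search": 0.10, "checkout": 0.10, "export": 0.80}

r_typical = perceived_reliability(failure_prob, typical_user)
r_export = perceived_reliability(failure_prob, export_heavy_user)
print(f"typical user:      {r_typical:.3f}")
print(f"export-heavy user: {r_export:.3f}")
```

The product and its single defect are identical in both runs; only the mix of requests changes, and with it the reliability each user perceives.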
Reasons for software reliability being difficult to measure
The reasons why software reliability is difficult to measure can be summarized as follows:
- The reliability improvement due to fixing a single bug depends on where the bug is located in the code.
- The perceived reliability of a software product is highly observer-dependent.
- The reliability of a product keeps changing as errors are detected and fixed.
Hardware reliability vs software reliability
The reliability behavior of hardware and software is very different. For example, hardware failures are inherently different from software failures. Most hardware failures are due to component wear and tear. A logic gate may be stuck at 1 or 0, or a resistor might short-circuit. To fix hardware faults, one has to either replace or repair the failed part. On the other hand, a software product would continue to fail until the error is tracked down and either the design or the code is changed. For this reason, when hardware is repaired its reliability is maintained at the level that existed before the failure occurred, whereas when a software failure is repaired the reliability may either increase or decrease (reliability may decrease if a bug fix introduces new errors). To put this fact in a different perspective, hardware reliability study is concerned with stability (for example, inter-failure times remain constant). On the other hand, software reliability study aims at reliability growth (i.e. inter-failure times increase).
The change of failure rate over the product lifetime for a typical hardware product and a typical software product is sketched in Fig. 13.1. For hardware products, it can be observed that the failure rate is high initially but decreases as the faulty components are identified and removed. The system then enters its useful life. After some time (called the product lifetime) the components wear out, and the failure rate increases. This gives the plot of hardware reliability over time its characteristic "bathtub" shape. On the other hand, for software the failure rate is at its highest during integration and test. As the system is tested, more and more errors are identified and removed, resulting in a reduced failure rate. This error removal continues at a slower pace during the useful life of the product. As the software becomes obsolete, no error correction occurs and the failure rate remains unchanged.
Fig. 13.1 Change in failure rate of a product: (a) hardware product, (b) software product
Reliability metrics
The reliability requirements for different categories of software products may be different. For this reason, the level of reliability required for a software product should be specified in the SRS (software requirements specification) document. To do this, some metrics are needed to express the reliability of a software product quantitatively. A good reliability measure should be observer-independent, so that different people can agree on the degree of reliability a system has. For example, there are precise techniques for measuring performance, which yield the same performance value irrespective of who carries out the measurement. In practice, however, it is very difficult to formulate a precise reliability measurement technique. The next best case is to have measures that correlate with reliability. There are six reliability metrics which can be used to quantify the reliability of software products.

Rate of Occurrence of Failure (ROCOF): ROCOF measures the frequency of occurrence of unexpected behavior (i.e. failures). The ROCOF measure of a software product can be obtained by observing the behavior of the product in operation over a specified time interval and recording the total number of failures occurring during the interval.

Mean Time To Failure (MTTF): MTTF is the average time between two successive failures, observed over a large number of failures. To measure MTTF, we can record the failure data for n failures. Let the failures occur at the time instants t_1, t_2, ..., t_n. Then MTTF can be calculated as

$$\mathrm{MTTF} = \sum_{i=1}^{n-1} \frac{t_{i+1} - t_i}{n - 1}$$

It is important to note that only run time is considered in the time measurements, i.e. the time for which the system is down to fix the error, the boot time, etc. are not taken into account, and the clock is stopped at these times.

Mean Time To Repair (MTTR): Once a failure occurs, some time is required to fix the error. MTTR measures the average time it takes to track the errors causing the failure and to fix them.

Mean Time Between Failures (MTBF): MTTF and MTTR can be combined to get the MTBF metric: MTBF = MTTF + MTTR. Thus, an MTBF of 300 hours indicates that once a failure occurs, the next failure is expected after 300 hours. In this case, time measurements are real time and not the execution time as in MTTF.

Probability of Failure on Demand (POFOD): Unlike the other metrics discussed, this metric does not explicitly involve time measurements. POFOD measures the likelihood of the system failing when a service request is made. For example, a POFOD of 0.001 would mean that 1 out of every 1000 service requests would result in a failure.

Availability: Availability of a system is a measure of how likely the system is to be available for use over a given period of time. This metric not only considers the number of failures occurring during a time interval, but also takes into account the repair time (down time) of the system when a failure occurs. This metric is important for systems such as telecommunication systems and operating systems, which are supposed to be never down, and where repair and restart times are significant and loss of service during that time is important.
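The metrics above can be computed from a simple failure log, as in the following minimal sketch. The failure instants, repair durations, and request counts are invented for illustration. The text gives no formula for availability, so the sketch assumes the commonly used steady-state ratio MTTF / (MTTF + MTTR); treat that line as an assumption rather than part of the lesson.

```python
def rocof(num_failures, interval):
    """Failures observed per unit of operating time."""
    return num_failures / interval


def mttf(failure_times):
    """Average gap between successive failure instants t1..tn (run time only)."""
    n = len(failure_times)
    if n < 2:
        raise ValueError("need at least two failure instants")
    gaps = [later - earlier
            for earlier, later in zip(failure_times, failure_times[1:])]
    return sum(gaps) / (n - 1)


def mttr(repair_durations):
    """Average time spent locating and fixing the error behind a failure."""
    return sum(repair_durations) / len(repair_durations)


def mtbf(mttf_value, mttr_value):
    """MTBF = MTTF + MTTR, measured in real (wall-clock) time."""
    return mttf_value + mttr_value


def pofod(failed_requests, total_requests):
    """Fraction of service requests that end in failure."""
    return failed_requests / total_requests


def availability(mttf_value, mttr_value):
    """ASSUMPTION: steady-state ratio MTTF / (MTTF + MTTR), not from the text."""
    return mttf_value / (mttf_value + mttr_value)


# Hypothetical log: failure instants in execution hours, repairs in hours.
failure_times = [10, 40, 90, 130, 210]
repair_durations = [2, 1, 3, 2]

mttf_hours = mttf(failure_times)          # (30 + 50 + 40 + 80) / 4
mttr_hours = mttr(repair_durations)
print("ROCOF       :", rocof(len(failure_times), 250))
print("MTTF (h)    :", mttf_hours)
print("MTTR (h)    :", mttr_hours)
print("MTBF (h)    :", mtbf(mttf_hours, mttr_hours))
print("POFOD       :", pofod(1, 1000))
print("Availability:", round(availability(mttf_hours, mttr_hours), 4))
```

Note that `mttf` works on execution-time instants, so down time and boot time must already be excluded from `failure_times`, matching the clock-stopping rule described for MTTF.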
Classification of software failures
A possible classification of failures of software products into five different types is as follows:
- Transient: Transient failures occur only for certain input values while invoking a function of the system.
- Permanent: Permanent failures occur for all input values while invoking a function of the system.
- Recoverable: When recoverable failures occur, the system recovers with or without operator intervention.
- Unrecoverable: In unrecoverable failures, the system may need to be restarted.
- Cosmetic: These failures cause only minor irritations and do not lead to incorrect results. An example of a cosmetic failure is the case where the mouse button has to be clicked twice instead of once to invoke a given function through the graphical user interface.
Reliability growth models
A reliability growth model is a mathematical model of how software reliability improves as errors are detected and repaired. A reliability growth model can be used to predict when (or if at all) a particular level of reliability is likely to be attained. Thus, reliability growth modeling can be used to determine when to stop testing to attain a given reliability level. Although several different reliability growth models have been proposed, in this text we will discuss only two very simple reliability growth models.

Jelinski and Moranda Model
The simplest reliability growth model is a step function model, where it is assumed that the reliability increases by a constant increment each time an error is detected and repaired. Such a model is shown in Fig. 13.2. However, this simple model of reliability, which implicitly assumes that all errors contribute equally to reliability growth, is highly unrealistic, since it is already known that corrections of different types of errors contribute differently to reliability growth.
Fig. 13.2 Step function model of reliability growth
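A minimal sketch of the step function idea follows. It assumes the usual textbook formulation of the Jelinski and Moranda model, in which the failure rate is proportional to the number of faults still in the program, so each repair lowers ROCOF by the same amount. The parameter names and numbers are illustrative assumptions, not values from this lesson.

```python
def jm_failure_rates(initial_faults, per_fault_rate):
    """ROCOF after 0, 1, ..., initial_faults repairs.

    Assumes every remaining fault contributes the same per_fault_rate,
    so each repair lowers ROCOF by one constant step.
    """
    return [per_fault_rate * (initial_faults - fixed)
            for fixed in range(initial_faults + 1)]


def repairs_needed(initial_faults, per_fault_rate, target_rate):
    """Smallest number of repairs that brings ROCOF down to target_rate."""
    rates = jm_failure_rates(initial_faults, per_fault_rate)
    for fixed, rate in enumerate(rates):
        if rate <= target_rate:
            return fixed
    return None  # only reachable when target_rate is negative


# Hypothetical product: 10 latent faults, each adding 0.02 failures/hour.
rates = jm_failure_rates(10, 0.02)
print([round(r, 2) for r in rates])
print("repairs needed for ROCOF <= 0.05:", repairs_needed(10, 0.02, 0.05))
```

This is how such a model supports a stop-testing decision: given a target ROCOF, it predicts how many more repairs are required. The equal-step assumption is exactly what the text criticizes as unrealistic.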
Littlewood and Verrall's Model
This model allows for negative reliability growth, to reflect the fact that when a repair is carried out it may introduce additional errors. It also models the fact that, as errors are repaired, the average improvement in reliability per repair decreases (Fig. 13.3). It treats an error's contribution to reliability improvement as an independent random variable having a Gamma distribution. This distribution models the fact that error corrections with large contributions to reliability growth are removed first. This represents diminishing returns as testing continues.
Fig. 13.3 Random step function model of reliability growth (ROCOF plotted against time; steps show different reliability improvements per repair, and a fault repair that adds a new fault decreases reliability, i.e. increases ROCOF)
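The random step behavior can be illustrated with a short simulation. This is only a sketch of the qualitative idea (Gamma-distributed step sizes, occasional regressions, shrinking average improvement); it is not the actual Littlewood and Verrall formulation, and the parameters `regress_prob`, `shape`, and the 1/k shrinking rule are assumptions made for illustration.

```python
import random


def random_step_rocof(initial_rocof, repairs, regress_prob=0.1,
                      shape=2.0, seed=0):
    """Simulate ROCOF after each repair with random, shrinking steps.

    Each repair changes ROCOF by a Gamma-distributed amount. With
    probability regress_prob the repair introduces a new fault, so ROCOF
    goes up instead of down (negative reliability growth).
    """
    rng = random.Random(seed)          # seeded for repeatable runs
    rocof = initial_rocof
    history = [rocof]
    for k in range(1, repairs + 1):
        # ASSUMPTION: mean step shrinks as 1/k to mimic diminishing returns.
        mean_step = initial_rocof / (2.0 * k)
        step = rng.gammavariate(shape, mean_step / shape)
        if rng.random() < regress_prob:
            rocof += step              # repair added a fault
        else:
            rocof = max(0.0, rocof - step)
        history.append(rocof)
    return history


history = random_step_rocof(initial_rocof=1.0, repairs=15)
for k, value in enumerate(history):
    print(f"after {k:2d} repairs: ROCOF = {value:.3f}")
```

Plotting `history` against the repair index gives a staircase with unequal steps and occasional upward jumps, which is the shape Fig. 13.3 describes.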












