SENG 521 Software Reliability & Software...

Post on 30-Aug-2021

3 views 0 download

Transcript of SENG 521 Software Reliability & Software...

SENG 521SENG 521Software Reliability & Software Reliability & Software Reliability & Software Reliability & Software QualitySoftware Qualityyy

Chapter 9: Strategies Chapter 9: Strategies to Meet to Meet Reliability Reliability ObjectiveObjective

D t t f El t i l & C t E i i U i it f C lDepartment of Electrical & Computer Engineering, University of Calgary

B.H. Far (far@ucalgary.ca)http://www.enel.ucalgary.ca/People/far/Lectures/SENG521

far@ucalgary.ca 1

ContentsContents

Fault tolerance techniques Definition, goal and problems, g p History Fault tolerance process Fault tolerance process Recovery techniques Design techniques

Techniques to improve product and processTechniques to improve product and process ISO 9000-3

far@ucalgary.ca 2

Review: TerminologyReview: TerminologyFailures

TreatsFailuresFaultsErrors

Attributes

AvailabilityReliabilitySafetyC fid ti litDependability Attributes ConfidentialityIntegrityMaintainability

Dependability

MeansFault preventionFault toleranceFault removal

Models

Fault removalFault forecasting

Reliability Block DiagramFault Tree model

far@ucalgary.ca 3

Fault Tree modelReliability Graph

Section Section 11

Fault Tolerant Software SystemsFault Tolerant Software Systems

far@ucalgary.ca 4

Fault Tolerance: RAIDFault Tolerance: RAID

far@ucalgary.ca 5

Fault Tolerance: RAIDFault Tolerance: RAID

Fault Tolerance TerminologyFault Tolerance Terminology

Recovery

Backward

Fault

y

ForwardHardware redundancy

Fault

ToleranceSoftware redundancy

Data redundancyArchitectural

Redundancy Temporal redundancy

S i lFunctional

Serial

Parallel

Sequential

far@ucalgary.ca 6

Sequential

Definition & Goal /1Definition & Goal /1

A fault-tolerant computing system must be capable of providing specified services in the p p g ppresence of a bounded number of failures.

Use of techniques to enable continued Use of techniques to enable continued delivery of service during system operation

Based on the principle of Act during operation while Act during operation while Defined during specification and design

far@ucalgary.ca 7

Definition & Goal /2Definition & Goal /2Th f il ld b f lt t The failures could occur because faults are present in either the components of the system or in the system’s designsystem s design.

Building large computing systems is a complex task; fault tolerance requirements could make thetask; fault-tolerance requirements could make the task even more difficult unless appropriate system structuring concepts are utilized.structuring concepts are utilized.

Reliability growth (modeling, computation and interpretation) of a system featuring fault toleranceinterpretation) of a system featuring fault tolerance is different from a system without such feature.

far@ucalgary.ca 8

Problems …Problems …Th t diti l h t f lt t l The traditional approaches to fault tolerance in hardware systems have been based on coping with the effects of well-understood p gfailure modes of physical components.

Conventional hardware fault tolerance th d ( d d ) l

2+2=5

methods (e.g., redundancy) are rarely powerful enough to cope with design deficiencies. E.g., designing a square wheel!

2+2=5

Redundancyg , g g q Consequently, most hardware fault tolerance

techniques cannot be applied directly in ft h l t ll f lt d i

Redundancy of wrongly designed component doesn’tsoftware, where almost all faults are design

faults.doesn t help!

far@ucalgary.ca 9

Example: Consistency CheckExample: Consistency Check66

1

yx ii

i yx

19222630

21182420

102111101021010,3,10,10,1223,10 x

19222630 10,2111,10,10,2,10 y

The correct answer should be 8779. But ordinary implementation of this will return zero y p

due to rounding and large differences in the order of magnitude of the summands.

far@ucalgary.ca 10

History …History …D f i i Defensive programming: Implementing relatively ad-hoc methods used to

i i i h d hi h ld i f hminimize the damage which could arise from the damage of presence of residual bugs.

Dual software technique: Implementing two distinct versions of the same

software and executing them. Any discrepancy in the outputs of the two versions may trigger an lalarm.

Etc.

far@ucalgary.ca 11

Fault Tolerance Fault Tolerance ProcessProcess1 i1. Detection

Identify faults and their causes (errors)

2. Assessment Assess the extent to which the system state has been

damaged or corrupted

3. Recoveryy Remain operational or regain operational status

4. Fault treatment and continued service4. Fault treatment and continued service Locate and repair the fault to prevent another occurrence

far@ucalgary.ca 12

Fault Tolerance Phases /1Fault Tolerance Phases /11 i1 i Phase 1: Fault detectionPhase 1: Fault detection

For a fault to be tolerated, it must first be detected. Thus h i i f f l l h i ithe starting point for fault-tolerance techniques is

observing failures. 1 C t1. Concurrent

Look for errors during service delivery e g self-testing techniques: duplicate codes module pairs e.g., self testing techniques: duplicate codes, module pairs

2. Preemptive Look for errors when service is suspendedp e.g., spare-checking, audit program

far@ucalgary.ca 13

Fault Tolerance Phases /2Fault Tolerance Phases /2

Phase 2: Damage assessmentPhase 2: Damage assessment It is necessary to assess the extent to which the y

system state has been damaged or corrupted. If the delay involved between the manifestation If the delay involved between the manifestation

of a fault (failure) and the detection of its cause (error) is large then it is likely that the damage to(error) is large then it is likely that the damage to the system state will be more severe than if the latency interval were shorter.latency interval were shorter.

far@ucalgary.ca 14

Fault Tolerance Phases /3Fault Tolerance Phases /333 Phase 3: RecoveryPhase 3: Recovery

Error recovery techniques must be utilized in order to b i l fobtain a normal, error-free system state.

There are two different kinds of recovery technique. 1. Backward recovery technique consists of

discarding the current (corrupted) state in favor of an earlier state Therefore mechanisms are needed toearlier state. Therefore, mechanisms are needed to record and store system states. e.g., roll-back.

2. Forward recovery technique involves making use2. Forward recovery technique involves making use of the current (corrupted) state to construct an error-free state.

far@ucalgary.ca 15

Fault Tolerance Phases /4Fault Tolerance Phases /44 & i i4 & i i Phase 4: Fault treatment & continued servicePhase 4: Fault treatment & continued service

Once recovery has been undertaken, it is essential to h h l i f h illensure that the normal operation of the system will

continue without the fault immediately manifesting itself once moreonce more.

The first aspect of fault treatment is to attempt to locate the fault.the fault.

Following this, steps can be taken either to repair the fault or to reconfigure the rest of the system to avoid the g yfault.

far@ucalgary.ca 16

ExampleExampleF lt t l i l b t ki Fault tolerance: simple bug tracking1. Detection: acceptance test (a Boolean expression) is used. 2 D t l th i ti i2. Damage assessment: only the program in execution is

assumed to be affected. 3. Recovery: (backward in this case) consists of recovering3. Recovery: (backward in this case) consists of recovering

the state of the executing program to that at the beginning of the recovery block.

4. Fault treatment: the program in execution (primary or alternative) is assumed to be faulty, so its faults are avoided by executing the next alternative (if any)avoided by executing the next alternative (if any).

far@ucalgary.ca 17

Section Section 22

Fault Tolerance: TechniquesFault Tolerance: Techniques

far@ucalgary.ca 18

DefinitionsDefinitions Fault

Tolerance

Recovery

Redundancy

RecoveryRecovery Actions to restore the system state to a “correct y

state” Recovery usually requires consistency checking Recovery usually requires consistency checking Usually the first choice in software

R d dR d d RedundancyRedundancy Designing the system with multiple components g g y p p

with the same functionality Usually the second choice in software

far@ucalgary.ca 19

Usually the second choice in software

Consistency CheckConsistency Check

A program-specific error detection mechanism to check on the results of program execution.

Usually evaluates to either “true” or “false” Usually evaluates to either true or false .

t t t b P l b P l f ilensure<acceptance test>by P0 else-by P1 else fail

far@ucalgary.ca 20

Example: Consistency CheckExample: Consistency CheckCh k f li k Checksums for program parts or split packages

Internal check points:ABS[(SQRT(x)*SQRT(x)) – x] < E

Exception signal when dividing by zerop g g y Integer overflow signal

Interrupt signal for program loop Interrupt signal for program loop Float point numerical failure check

far@ucalgary.ca 21

Example: Consistency CheckExample: Consistency Check66

1

yx ii

i yx

19222630

21182420

102111101021010,3,10,10,1223,10 x

19222630 10,2111,10,10,2,10 y

The correct answer should be 8779. But ordinary implementation of this will return zero y p

due to rounding and large differences in the order of magnitude of the summands.

far@ucalgary.ca 22

Backward RecoveryBackward Recovery

Roll back the system yto a previouslypreviously saved correct statecorrect state

Consistency check fails

far@ucalgary.ca 23

Laura L. Pullum: Software Fault Tolerance Techniques and Implementation, Artech House, 2001

Domino Effect Domino Effect Wh b k d i t l ibl ? Why backward recovery is not always possible?

Domino Effect:Domino Effect: successive rollback of communicating processes when a failure is detected in any one of theprocesses when a failure is detected in any one of the processes.

far@ucalgary.ca 24

Laura L. Pullum: Software Fault Tolerance Techniques and Implementation, Artech House, 2001

Forward RecoveryForward Recovery

Use redundancy to recover from a failure

far@ucalgary.ca 25

Laura L. Pullum: Software Fault Tolerance Techniques and Implementation, Artech House, 2001

Forward Recovery: Pros & ConsForward Recovery: Pros & Cons Advantages:Advantages:

Forward recovery is fairly efficient in terms of the overhead (time and memory) it requires. This can be crucial in real-time applications where the time overhead of backward recovery can exceed stringent time constraints.

If the fault is an anticipated one, such as the potential loss of data, then redundancy and forward recovery can be a useful and timely approach.

Faults involving missed deadlines may be better recovered from using forward recovery than by introducing additional delay in roll back and recovering.

Disadvantages: Application-specific, that is, it must be tailored to each situation or program.pp p p g Can only remove predictable errors from the system state. Requires knowledge of the error. Cannot aid in recovery if the state is damaged beyond recoverability.y g y y Depends on the ability to accurately detect the occurrence of a fault (thus initiating the

recovery actions.

SENG635 (Winter 2007) far@ucalgary.ca 26

Laura L. Pullum: Software Fault Tolerance Techniques and Implementation, Artech House, 2001

RedundancyRedundancyR d d d i i h i h l i l Redundancy: designing the system with multiple components with the same functionality

Redundancy techniques: Implementing two (or more) distinct

versions of the same software and executingversions of the same software and executing them for the same set of inputs. Any discrepancy in the outputs of the two

i t i lversions may trigger an alarm.

Redundancy techniques’ efficiency depends on coincident and correlateddepends on coincident and correlatedfaults.

far@ucalgary.ca 27

Types of RedundancyTypes of RedundancyH d d dH d d d Hardware redundancyHardware redundancy Replicated and supplementary hardware added to the system to

support fault tolerance. Software redundancySoftware redundancy

Also called program, modular, or functional redundancy, includes programs modules functions used to support fault toleranceprograms, modules, functions used to support fault tolerance.

Data redundancyData redundancy Using additional forms of data to assist in fault tolerance.g

Temporal redundancyTemporal redundancy Using additional time to perform tasks related to fault tolerance, i.e.

repeating an execution using the same software and hardwarerepeating an execution using the same software and hardware resources involved in the initial, failed execution.

far@ucalgary.ca 28

1. Coincident Faults1. Coincident Faults

Coincident Faults:Coincident Faults: when two or more functionally equivalent software components y q pfail on the same input.

When two or more software versions give the When two or more software versions give the same incorrect response, an identical-and-

(IAW) i b i dwrong (IAW) answer is obtained.

far@ucalgary.ca 29

2. Correlated Faults2. Correlated Faults

Correlated Faults:Correlated Faults: Two faults are correlated when the measured probability p yof the coincidence failures is significantly higher than what would be expected fromhigher than what would be expected from the individual failure.

There will be no failure independence.

failsipfailsjfailsipif __|_

p

far@ucalgary.ca 30

Failure ScenarioFailure Scenarioh if h f Input space for

What if the software components produce d bl i l

P1P2

Input space for each procedure

doublet or triplet identical-and-wrong(IAW) ?(IAW) responses?

P3P3

Doublet & tripletDoublet & tripletIAW faultsAdjudication AlgorithmAdjudication Algorithm

far@ucalgary.ca 31

Adjudication by VotingAdjudication by VotingA lt f t A voter compares results from two or more functionally equivalent software components

d d id hi h f h id dand decides which of the answers provided by those components is correct.

Various versions of voting algorithm: Various versions of voting algorithm: Majority voting

Consensus voting Consensus voting 2-of-N voting

far@ucalgary.ca 32

Majority VotingMajority VotingS l id i l d i Several identical components are structured in parallel and all are active. If the component outputs

id i l h i iare not identical, the minority components are ignored (i.e., disabled or switched off).

Majority voting: N: number of systems

m≥[(N+1)/2] , N>1 m: agreement number[( ) ] System reliability (Rsystem) for majority voting

(assuming components with identical reliability Rc)(assuming components with identical reliability Rc) 1

2Nwhere m

1 1 msystem cR R

far@ucalgary.ca 33

2

Consensus VotingConsensus VotingIf j i i hi d l hi If majority agreement is achieved, select this answer

If unique maximum agreement is achieved but ( )/ l h i i ( i h ilim< [(N+1)/2], select the unique maximum (m is the ceiling

value)If ti i th i t b i hi d If tie in the maximum agreement number is achieved,select randomly

System reliability (R t ) for consensus voting (assuming System reliability (Rsystem) for consensus voting (assuming components with identical reliability Rc)

m is the number of unique maximum components 1 1 m

system cR R

far@ucalgary.ca 34

22--ofof--N VotingN Voting

Agreement number m can be set to 2 if the output space is large and statistical p p gindependence of variant failures can be assumed.assumed.

System reliability (Rsystem) for 2-of-N voting ( i i h id i l(assuming components with identical reliability Rc)

21 1system cR R

far@ucalgary.ca 35

Design TechniquesDesign Techniques

1) Robust software systems2) Recovery blocks2) Recovery blocks3) N-version programming4) Consensus recovery block5) Acceptance voting5) Acceptance voting6) N-self-checking programming

far@ucalgary.ca 36

1) Robust Software Systems /11) Robust Software Systems /1S f SS f S (A d d 1981 Robust Software SystemsRobust Software Systems (Anderson and Lee 1981,

etc.): Construction of a robust module requires:

Exception handlers for coping with exceptions propagated from lower levels; and

Boolean expressions for detecting exceptions arising in the module itself, and their exception handlers.

It is often possible (and desirable for the sake of simplicity) to map several exceptions onto a single handler.

far@ucalgary.ca 37

1) Robust Software Systems /21) Robust Software Systems /2F i d l f ll l th th t For a given module, carefully analyze the cases that could prevent the module from providing the desired normal servicesdesired normal services.

Make use of exception handlers either to mask the effects of such undesired but expected exceptionseffects of such undesired, but expected, exceptions or to signal an appropriate exception to the caller of the module.the module.

Make use of default exception handlers or recovery blocks to obtain a measure of tolerance againstblocks to obtain a measure of tolerance against design faults.

far@ucalgary.ca 38

2) Recovery Blocks (RB)2) Recovery Blocks (RB)U i l i l i Using multiple versions of software module and acceptance test.p

The output of the 1st

module is tested for bili d if f ilacceptability and if fails,

the 2nd module is executed after backward e ecu ed e b c w dstate recovery.

The system fails only if ll d l f il h iall modules fail on their

acceptance tests.

far@ucalgary.ca 39

Figure from Reliability Engineering Handbook

3) N3) N--Version Programming (NVP)Version Programming (NVP)P ll l ti f N Parallel execution of N independently developed functionally equivalent y qmodules.

Adjudication is via voting. The voter accepts all N

outputs and selects the correct one among themcorrect one among them, i.e., the one that meets the specification.

Advantage of NVP: no service interrupt.

far@ucalgary.ca 40

Figure from Reliability Engineering Handbook

4) Consensus Recovery Block4) Consensus Recovery BlockC bi ti f N Combination of N-version programming (NVP) and recovery NVP

input

Correct

success

(NVP) and recovery blocks (RB).

IF NVP fails the

NVP

failure

Correctoutput

IF NVP fails, the system reverts to RB using the same blocks.

RB Correctoutput

using the same blocks. Advantage: highest

possible systemSystemfailurepossible system

reliability.

far@ucalgary.ca 41

5) Acceptance Voting5) Acceptance VotingLik N i Like N-version programming (NVP) all versions areall versions are executed in parallel.

The output of each The output of each module goes to an acceptance test.acceptance test.

If acceptance test is successful, the outputsuccessful, the output goes to a voter.

far@ucalgary.ca 42

Figure from Reliability Engineering Handbook

6) N6) N--SelfSelf--Check ProgrammingCheck ProgrammingI N S lf Ch k In N-Self-Check Programming (NSCP), N modules areN modules are executed in pairs.

The pairs’ outputs canThe pairs outputs can be compared or accessed for correctness.

far@ucalgary.ca 43

Figure from Reliability Engineering Handbook

DiscussionDiscussionTh bilit f t l ti d i f lt t l l th The capability of tolerating design faults rests largely on the ‘coverage’ of run-time checks (i.e. acceptance tests) for detecting errors. g

Often, it is not possible to check completely within a procedure that the results produced have been according to th ifi ti ( f “ t” l ith th t t itthe specification (e.g., for a “sort” algorithm that sorts its input, the check that the output has been sorted correctly would be as complex as the “sort” algorithm itself). p g )

Hence run-time checks are often limited to checking certain critical aspects of the specification.

This means that the possibility of undetected failures cannot be ruled out entirely.

far@ucalgary.ca 44

Fault Tolerance:Fault Tolerance:Adjudication by votingAdjudication by voting

far@ucalgary.ca 45

Section Section 33

Software Product & Process Software Product & Process Improvement using ISO 9000Improvement using ISO 9000--33

far@ucalgary.ca 46

How to Develop Better Software?How to Develop Better Software?

fi d l Define development process

Establish standards (requirements, design, coding, testing)

Follow best practicesp Review & audit Follow ISO 9000 3 Follow ISO 9000-3

guidelines

far@ucalgary.ca 47

Capability Maturity Model (CMM)Capability Maturity Model (CMM)

The Software Engineering Institute (SEI) Capability Maturity Model (CMM)p y y ( )

far@ucalgary.ca 48

Seven Development TipsSeven Development Tipsh h k d i1. Keep the human network up and running

2. Constantly look for and plug time/effort leaks3. Establish functional contracts (who checks what)4 Test early but not too early4. Test early, but not too early5. Support manual testing with automated tools

U d d h k /6. Use automated code checkers/generators7. Write stub code where possible

far@ucalgary.ca 49

From: Keeping Software Development Projects Healthy (IEEE IT Pro Jul/Aug 2002)

Review: Fault PreventionReview: Fault PreventionT id f l b i To avoid fault occurrences by construction.

Activities: Requirement review Design review

l d Clear code Establishing standards (ISO 9000-3, etc.)

U i CASE t l ith b ilt i h k h i Using CASE tools with built-in check mechanisms

All these activities are included in ISO 9000All these activities are included in ISO 9000--33

far@ucalgary.ca 50

ISO 9000 Family /1ISO 9000 Family /1ISO 9000 Q lit t d lit ISO 9000: Quality management and quality assurance standards: guidelines for selection and useISO 9000 1 R i i f ISO 9000 (1994) ISO 9000-1: Revision of ISO 9000 (1994)

ISO 9001: Quality systems: models for quality i d i /d l t d tiassurance in design/development, production,

installation and servicing (1987, 1991)ISO 9002 Q lit t d l f lit ISO 9002: Quality systems: models for quality assurance in production, installation and servicing (1994)(1994)

far@ucalgary.ca 51

ISO 9000 Family /2ISO 9000 Family /2ISO 9003 Q lit t d l f lit ISO 9003: Quality systems: models for quality assurance in final inspection and test (1994)ISO 9004 Q lit t d lit t ISO 9004: Quality management and quality system elements —guideline— (1987)ISO 9004 2 Q lit t d lit ISO 9004-2: Quality management and quality system elements —Part 2: guideline for services—(1991)(1991)

ISO 9000-3: Guidelines for application of ISO 9001 to the development supply and9001 to the development, supply and maintenance of software (1991)

far@ucalgary.ca 52

My First ImpressionMy First Impression

SENG521 (Winter 2008) far@ucalgary.ca 53

OverviewOverview

far@ucalgary.ca 54CSA - ISO 9000 - 2000 Essentials A practical handbook for implementing the ISO 9000 Standards 3Ed (2001)

ISO9000ISO9000--3: Organization 3: Organization of of GuidelinesGuidelines

1) Scope2) Normative References C t2) Normative References3) Definitions

Core parts

4) Quality System: Framework5) Quality System: Life-Cycle Activities5) Quality System: Life Cycle Activities6) Quality System: Supporting Activities

far@ucalgary.ca 55

1. Scope /11. Scope /1hi f SO 9000 id li This part of ISO 9000 sets out guidelines to

facilitate the application of ISO 9001 to i i d l i l i dorganizations developing, supplying and

maintaining software. It is intended to provide guidance where a

contract between two parties requires the demonstration of a supplier’s capability to develop, supply andcapability to develop, supply and maintain software products.

far@ucalgary.ca 56

1. Scope /21. Scope /2h id li li bl i l The guidelines are applicable in contractual

situations for software products when: The contract specifically requires design effort and the

product requirements are stated principally in performance terms or they need to be established; i eperformance terms, or they need to be established; i.e., identifying product requirements in a quantifiable and testable manner.testable manner.

Confidence in the product can be attained by the adequate demonstration of a certain supplier’s capabilities in pp pdevelopment, supply and maintenance.

far@ucalgary.ca 57

2. Normative References2. Normative Referenceshi i id ifi h i i h This section identifies the provisions that are

referenced in the text and are part of the ISO 9000. The references are as follows:

ISO 2382-1:Data processing Vocabulary Part 01 Fundamental terms (1984)

ISO 8402: Quality Vocabulary (1986) ISO 9001:Quality systems-Model for quality assurance in

design/development, production, installation and i i (1987)servicing (1987)

far@ucalgary.ca 58

3. Definitions /13. Definitions /1S ft I t ll t l ti i i th Software: Intellectual creation comprising the programs, procedures, rules and any associated documentation pertaining to the operation of a datadocumentation pertaining to the operation of a data processing system.

Software product: Complete set of computer Software product: Complete set of computer programs, procedures and associated documentation and data designated for delivery to a user.and data designated for delivery to a user.

Software item: Any identifiable part of a software product at an intermediate step or at the final step ofproduct at an intermediate step or at the final step of development.

far@ucalgary.ca 59

3. Definitions /23. Definitions /2D l t All ti iti t b i d t t Development: All activities to be carried out to create a software product. Ph D fi d t f k Phase: Defined segment of work.

Verification (for software) : The process of l ti th d t f i h tevaluating the products of a given phase to ensure

correctness and consistency with respect to the products and standards provided as input to thatproducts and standards provided as input to that phase.

Validation (for software): The process ofBuilding the “right” thing

Validation (for software): The process of evaluating software to ensure compliance with specified requirements. Building the thing “right”

far@ucalgary.ca 60

specified requirements. Building the thing right

ISO 9000ISO 9000--33

4. Quality System: 4. Quality System: FrameworkFramework

far@ucalgary.ca 61

4.1 Management Responsibility4.1 Management Responsibility

B th i li ( d l ) d Both senior supplier (= developer) and purchaser (= customer) management are

f h i ibili iaware of their responsibilities. The purchaser management should p g

Ensure the supplier has all the purchaser-specific information needed to meet contractual obligations, and

Identify one purchaser representative responsible y p p pfor supplier interface.

far@ucalgary.ca 62

4.1 Management Responsibility4.1 Management ResponsibilityTh li ( d l ) t t The supplier (= developer) management must Create an engineering environment with clearly identified

roles and responsibilities for the engineers who work inroles and responsibilities for the engineers who work in the environment

Identify and provide the resources needed to verify the y p yengineering work being performed is accurate and complete E th t d fi d ti d d b i Ensure that defined practices and procedures are being followed

Take part in the review of the engineering and Take part in the review of the engineering and engineering practices and procedures to ensure their suitability and effectiveness.

far@ucalgary.ca 63

4.2 Quality System4.2 Quality SystemM t t id tif it i ti ’ l d Management must identify its organization’s goals and ensure the existence of an engineering environment where those goals can be reached in the most efficient manner. g

Engineering environment should have: Defined processes and procedures; EEqq

Development and maintenance plans based on the defined process and procedures; R i dit d t t t d t i th lit f th

Eleme

Eleme

qualityquality

Reviews, audits, and tests to determine the quality of the product(s) being created and the process used to create those products;

ents ofents ofy systy syst p ;

Corrective actions based on the information gained from reviews, audits, and tests.

f af atemtem

far@ucalgary.ca 64

4.3 4.3 Internal System AuditsInternal System Auditsh li ’ i i h ld h i l The supplier’s organization should have an internal

audit process to ensure that the engineering process d b h i i f i hused by the company is in fact meeting the

company’s product and process quality goals. Documented findings from the audits are to be

reviewed and acted upon by those responsible for the areas or processes audited.

far@ucalgary.ca 65

4.4 4.4 Corrective Action Corrective Action

A “closed loop” management process should be in place to ensure that: p Causes of quality problems are determined Actions are taken to control the problems with Actions are taken to control the problems with

the productAdd th h i ti d d Address the changes in practices and procedures required to avoid recurrence of the problem.

far@ucalgary.ca 66

ISO 9000ISO 9000--33

5. Quality System: 5. Quality System: LifeLife--Cycle ActivitiesCycle Activities

far@ucalgary.ca 67

5. 1 General5. 1 GeneralAll d l j ( d i All development projects (and maintenance projects) should follow an organized life cycle (or

)process). Suppliers are free to use any life cycle (process)

they deem best suited for the type of product being developed, or maintained, as long as consideration is given to the various activities referred to in Sections 5.2 through 5.10 as life-cycle activities.

far@ucalgary.ca 68

5.2 Contract Review5.2 Contract ReviewTh f ll i d id tifi d The following needs are identified: The need for the purchaser and supplier to come to an

agreementagreement The need to identify methods for resolving contract issues

that may arise y Both purchaser and supplier management need to

understand The scope of the contract Its organization’s responsibilities Risks to organizations (e.g., schedule, budget, legal) Ownership of the product and by-products

far@ucalgary.ca 69

5.3 Purchaser’s Requirements5.3 Purchaser’s Requirements

hi i f h d id if This section focuses on the need to identify The product’s functional and technical requirements

(e.g., performance, safety, reliability, etc.) The product’s external interface

The purchaser and supplier have a responsibility to work together toresponsibility to work together to ensure these requirements are

l t d tifi bl b fcomplete and quantifiable before product development begins.

far@ucalgary.ca 70

5.4 Development Planning5.4 Development PlanningOnce purchaser’s requirements have been identified Once purchaser’s requirements have been identified there needs to be a development plan to deliver a product that meets those requirementsproduct that meets those requirements.

The development plan identifies the resources and schedule required to deliver a productschedule required to deliver a product.

The resources and schedule are based upon a combination of:combination of: Purchaser’s requirements Engineering practices, and procedures used by the Engineering practices, and procedures used by the

supplier to meet those requirements Purchaser’s need date for the product.

far@ucalgary.ca 71

5.4 Development Planning5.4 Development Planning

Development plan should show The phases of developmentp p Inputs and outputs to each phase Schedule and resources for each phase Schedule and resources for each phase Progress status and control Tools and methods to be used, and Verification procedures for each phase (reviews, p p ( ,

audits, and testing)

far@ucalgary.ca 72

5.4 Development Planning5.4 Development Planning

Progress Control Briefly discusses the need to ensure the proper y p p

researching of a project throughout the entire life cycle. y

Input to Development PhasesP i t t th t h d l t h h ld Points out that each development phase should have defined inputs and that each product

i t h ld b t t d i hrequirement should be stated in such a manner as to be quantifiably tested.

far@ucalgary.ca 73

5.4 Development Planning5.4 Development PlanningO f Output from Development Phases The outputs of one step in the software development

h i i hprocess are the inputs into the next step. These outputs should be validated to ensure that they

t th i t f th t tmeet the requirements of the next step.

Verification of Each Phase The output of each phase should be tested in some

manner to ensure its fitness for use in the next phase.

far@ucalgary.ca 74

5.5 Quality Planning /15.5 Quality Planning /1l b d fi d h i i i l d Plans be defined to ensure that activities related to

the quality of a development effort’s products or by-d k lproducts take place.

The plans for these activities can be a separate plan (software quality assurance plan) or incorporated in other plans like the development plan, test plan, and configuration management plan.

far@ucalgary.ca 75

5.5 Quality Planning 5.5 Quality Planning /2/2T i l li t f lit l i ti iti Typical list of quality planning activities: Defining inputs and outputs for each

d l hdevelopment phase. Identifying the types of test to be carried out. Identifying the resources, schedules, and roles

and responsibilities for carrying out the tests. Configuration management. Defect control and corrective action.

far@ucalgary.ca 76

5.6 Design & Implementation5.6 Design & Implementation

i d i l i h h Design and implementation are the processes that turn requirements into a product.

Design is the technical kernel and to a great degree dictates the quality of the product.

The design effort and the product itself would benefit from considering the following: g g Design methodologies Design rules and guidelinesDesign rules and guidelines Internal design (not seen by the user), comparison to

previous designs

far@ucalgary.ca 77

p g

5.6 Design & Implementation5.6 Design & Implementation

i Implementation The supplier establish and use guidelines for subjects

h i i di dsuch as naming conventions, coding, and comments. Reviews

The supplier should review the products of analysis, design, implementation, and testing in order to ensure th t th fi l d t t th h ' i tthat the final product meets the purchaser's requirements.

Reviews and inspections are also meant to ensure that the methodologies and rules that were meant to be usedmethodologies and rules that were meant to be used during design were actually used.

far@ucalgary.ca 78

5.7 Testing & Validation5.7 Testing & ValidationA l il l i b d A multilevel testing process may be used to test a product and that a plan should be in place to

lli h isupport controlling the testing process. Test results should be recorded and used in order

to: Identify problems with the product being tested Identify areas where tests need to be rerun Determine the adequacy of the test process

far@ucalgary.ca 79

5.7 Testing & Validation5.7 Testing & Validationi Test Planning

There need to be plans in place to support thisprocess. The plans should address Types of testing Test cases Test environment Resources and schedule required to create the tests Resources and schedule required to execute the testsq Test completion criteria

far@ucalgary.ca 80

5.7 Testing & Validation5.7 Testing & Validation

Validation Validation is the testing that is performed by the g p y

supplier on a version of the product that is intended to be delivered to the purchaser. p

Software testing can occur before validation, but that type of testing is verification of a product’sthat type of testing is verification of a product s components as opposed to validation of an entire product.product.

far@ucalgary.ca 81

5.7 Testing & Validation5.7 Testing & Validation

Field Testing Field testing takes place at a site other than the g p

supplier’s that is as close to an operational environment as possible. p

The field tests need to be planned and that the supplier and purchaser may need to coordinatesupplier and purchaser may need to coordinate their efforts in the support of this type of testing.

far@ucalgary.ca 82

5.8 Acceptance (Testing) 5.8 Acceptance (Testing) A t t ti i f d b th h Acceptance testing is performed by the purchaser.

This should be a formal process planned well in d f th t l t tiadvance of the actual testing.

An acceptance test plan should identify the schedule, l d ibiliti it iresources, roles and responsibilities, success criteria,

and problem handling procedures. Th li d h h h d The supplier and purchaser have a shared responsibility for testing that goes on in this phase and must work closely togetherand must work closely together.

far@ucalgary.ca 83

5.9 Delivery & Installation5.9 Delivery & InstallationR li ti d d li th t Replication and delivery are processes that are performed after a product has been developed or enhancedenhanced.

Installation, may require coordination between the purchaser and the supplierpurchaser and the supplier.

The level of coordination depends upon the complexity of the product and the number ofcomplexity of the product and the number of purchaser sites that use the product.

Installation planning should address schedules Installation planning should address schedules, available personnel, site access, availability, access to systems and equipment, and testing.

far@ucalgary.ca 84

to systems and equipment, and testing.

5.10 Maintenance 5.10 Maintenance d i h h h Product maintenance, or enhancement, has the same

components as product development. Analysis, design, implementation, and testing of

changes to the product must all be planned, scheduled, and performed.

The purchaser and the supplier must agree as to the p pp gtiming and content of products releases so that both the supplier and purchaser can support the rate of pp p ppproduct release.

far@ucalgary.ca 85

ISO 9000ISO 9000--33

6. Quality System: 6. Quality System: Supporting ActivitiesSupporting Activities

far@ucalgary.ca 86

Supporting ActivitiesSupporting ActivitiesC fi i Configuration management

Document control Quality records Measurement Measurement Rules, practices, and conventions

T l f ili i d h i Tools, facilities, and techniques Purchasing Included software Training Training

SENG521 (Winter 2008) far@ucalgary.ca 87

6.1 Configuration Management6.1 Configuration ManagementC fi ti t i th b hi h Configuration management is the process by which a product’s baselines (e.g., requirements, source code, test cases, test results, user documentation, etc.) are identified , , , )and changes to those baselines are controlled.

The engineering organization has to identify, define, and l fplan for: Identification of product baselines Version control of the product baselinesp Roles and responsibilities of the engineering organizations in the

change process Change control procedures Change control procedures Status of the change control processes and baseline products

far@ucalgary.ca 88

6.2 Document Control6.2 Document ControlE i i i ifi ti d i Th Engineering is a specification-driven process. There are a number of different engineering documents used in the development and maintenance of aused in the development and maintenance of a software product.

The engineering organization has to identify and The engineering organization has to identify and control the use of these documents.

Control is especially important for the initial Control is especially important for the initial approval and dissemination of such documents and the authorization and reissue of updated versions.the authorization and reissue of updated versions.

far@ucalgary.ca 89

6.3 Quality Records 6.3 Quality Records E i i i ti h ld i t i d Engineering organizations should maintain records that document the quality of their processes and productsproducts.

Records of reviews, audits, and test results should be maintained to allow for their use in process andbe maintained to allow for their use in process and organizational improvement.

The procedures and processes should be identified The procedures and processes should be identified and implemented to control the accumulation, storage, and retrieval of such documents.storage, and retrieval of such documents.

far@ucalgary.ca 90

6.4 Measurement /16.4 Measurement /1E i i t h ld b l d l Engineering management should be a closed-loop process where measurements are taken to determine the quality of the products and processes used tothe quality of the products and processes used to create or manage those products.

The product measurement is based on The product measurement is based on purchaser/customer feedback and internal audits performed by the supplier organization.performed by the supplier organization.

These measurements are needed for product and process improvement.process improvement.

far@ucalgary.ca 91

6.4 Measurement /26.4 Measurement /2Th t i d t d t i The process measurement is used to determine whether schedule milestones are being met and whether the by products of the process are meetingwhether the by-products of the process are meeting their quality goals.

For measurements to be useful an organization For measurements to be useful an organization needs to identify: The current level of performance The current level of performance Improvement goals Measurement data to be collected Actions to be taken based on measuring data against

goals.

far@ucalgary.ca 92

6.5 Rules, Practices & Conventions6.5 Rules, Practices & Conventions

Supplier’s organization should Document the engineering practices and Document the engineering practices and

procedures that are used to develop and i t i ft d tmaintain software products

Revise the practices and procedures as p pnecessary.

far@ucalgary.ca 93

6.6 Tools & Techniques6.6 Tools & Techniques

Supplier’s organization should Identify and use “tools, facilities, and y , ,

techniques” in the development and management of software productsp

Replace or improve the tools, facilities, and techniques as necessarytechniques as necessary

far@ucalgary.ca 94

6.7 Purchasing 6.7 Purchasing h h i hi d d h li When purchasing third-party products, the supplier

should apply many of the same management and i i h i h h iengineering techniques as the purchaser uses in

relation to the supplier. For example, the supplier (now acting as a

purchaser) should ensure the subcontractor’s ability to perform the contracted work and validate the subcontractor’s products.

far@ucalgary.ca 95

6.8 Included Software6.8 Included SoftwareA li h hi d d h d A supplier may have third-party products that need to be integrated with its own products.

The supplier should identify procedures to ensure these products meet stated quality goals and that procedures and plans are in place for the storage, protection, and maintenance of the third-party products.

far@ucalgary.ca 96

6.9 Training6.9 Trainingi d b i d i d h i Engineers need to be trained in order to meet their

responsibilities. An engineering organization must identify the

techniques, tools, procedures, and methodologies that are used in the development and maintenance of a product.

Once these have been identified, then a program can be instituted to ensure that engineers are proficient g pin their use.

far@ucalgary.ca 97

ConclusionsConclusions

ISO 9000-3 does not guarantee quality software g q yproducts.

But those who follow But those who follow guidelines and procedures

f ISO 9000 3of ISO 9000-3 are more likely to develop more reliable software products in a more consistent way.

far@ucalgary.ca 98

a o e co s ste t way.

Establishing standardsg(ISO 9000-3)

far@ucalgary.ca 99