Reliability


Page 1: Reliability

Federal University of Technology Owerri

Reliability

By Prof. Chukwudebe G.N. and Diala U.H.

2nd Semester, April 2013. ECE 510 Reliability and Quality Assurance in Electronics

Page 2: Reliability

RISK

Major accidents in recent years have taken a sad toll of lives: Bhopal, Chernobyl, Piper Alpha, Challenger.

So have natural disasters: Bam, the December 26 Tsunami.

The immediate reaction is always "It must never happen again".

We need to eliminate hazards as far as possible and reduce the risks so that the remaining hazards are only a small addition to the inherent risks of everyday life.

RELIABILITY & RISK ANALYSIS TECHNIQUES are the methods used to assess the safety of modern complex systems.

Page 3: Reliability

RELIABILITY

Definition: Reliability is the ability of a product to perform as intended (that is, without failure and within specified performance limits) for a specified mission time, when used in the manner and for the purposes intended, under specified application and operational conditions.

Page 4: Reliability

RELIABILITY

Alternative Definition: Reliability is the probability that a device or system properly performs its intended function over time when operated within the environment for which it is designed.

Page 5: Reliability

RELIABILITY

The definition stresses 4 elements:

Probability: quantitative
Adequate performance: must be defined
Time: the period over which we can expect a certain degree of performance
Operating conditions: temperature, humidity, shock, vibration

Page 6: Reliability

RELIABILITY

Characteristics of a Product:

Estimated in Design
Controlled in Manufacturing
Measured during Testing
Sustained in the Field

Page 7: Reliability

RELIABILITY

Importance of Reliability

In this modern day of science and technology, where complex devices are used for commercial, military, scientific, consumer and pleasure purposes, a high degree of reliability is an absolute necessity.

There is too much at stake in terms of cost and human life to take any significant risks with devices that might not function properly when needed most.

Page 8: Reliability

RELIABILITY

First we will deal with the foundation of reliability: probability and statistics.

Then we will cover in-depth reliability engineering considerations.

Page 9: Reliability

Objective

To give an overview of:

The reliability issues
Techniques
Tasks
Limitations

associated with:

The design
Manufacture
Operation

Page 10: Reliability

Probability & Statistics

Pragmatic approach

Discussion will include:

Shape of failure distributions
Estimating parameters

Page 11: Reliability

RELIABILITY

Focus on preventing failures through robust design and manufacturing practices, based on:

Life cycle loads and stresses
Product architecture
Potential defects and failure mechanisms

Page 12: Reliability

RELIABILITY

There are 2 strands in reliability:

FAULT AVOIDANCE
Conservative design
High quality components

FAULT TOLERANCE
Assumes that, despite all efforts, components will fail
Uses redundancy
Pays a price in efficiency

Page 13: Reliability

RELIABILITY

The Characteristics of a Product are:

Estimated in Design
Controlled in Manufacturing
Measured during Testing
Sustained in the Field

Page 14: Reliability

RELIABILITY

[Figure: Performance characteristics of an engineering product. Random input variables lead to output performance characteristics, which may be continuous {x}, discrete {y} or binary {z}.]

Page 15: Reliability

Quality Production: The Quality of a Product

Performance characteristics may be:

Continuous, e.g. fuel consumption. These are objective: they can be accurately established by independent measurements and do not depend on the opinion of an individual.

Discrete, e.g. visual appeal, body style. These are subjective, based on some scale such as (5) Excellent, (4) Good, ...

Binary, based on some feature that the product does or does not possess, e.g. presence or absence of a sun roof.

Page 16: Reliability

PERFORMANCE CHARACTERISTICS FOR A FAMILY CAR

Continuous {x}                | Discrete {y}                    | Binary {z}
Urban fuel consumption        | Visual appeal of body and style | Leaded/unleaded petrol?
Time from 0 to 60 m.p.h.      | Visual appeal of interior       | Starts first time?
Braking distance at 60 m.p.h. | Comfort of ride                 | Central locking?
Engine noise level            | Range of exterior colours       | Quad stereo?
% CO2 in exhaust              | Range of interior colours       | Tinted glass?
Maximum speed                 |                                 | Power assisted steering?
                              |                                 | Sun-roof?

Page 17: Reliability

Specification

The manufacturer of an engineering product will need to produce a specification that defines the product for a potential customer. It consists of a set of target values $x_{1T}, x_{2T}, \ldots$, i.e. a target vector $\{x_T\}$.

Urban fuel consumption: 40 miles per gallon
Maximum speed: 100 miles per hour
Time from 0 to 60 m.p.h.: 13 seconds
Braking distance at 60 m.p.h.: 180 feet
Engine noise level: 70 dB
% CO2 in exhaust: 1%

Random effects: each target value needs tolerance limits, e.g. $x_{1T} \pm \Delta_1$, giving a tolerance vector $\{\Delta\}$.

Page 18: Reliability

RELIABILITY

Target performance: $\{x_T\}$
Tolerance: $\{\Delta\}$

Reject if the actual performance $\{x\}$ lies outside $\{x_T \pm \Delta\}$.

Both manufacturer and customer need to know how the actual random variations in a given performance characteristic $\{x\}$ for a given product, across different individual units and under different environmental and operating conditions, compare with the target and tolerance values $\{x_T\}$ and $\{\Delta\}$.

Page 19: Reliability

RELIABILITY

A statistical analysis of the variations is required. This involves calculating:

Mean
Standard deviation
Probability
Probability density function

To do this we require N sample values of the performance characteristic $\{x\}$, specified by $x_i$ where $i = 1, 2, \ldots, N$.

Page 20: Reliability

RELIABILITY

Mean: $\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i$

Standard deviation: $\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})^2}$

Root mean square: $x_{RMS} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} x_i^2}$
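The three sample statistics can be checked with a few lines of Python. This is a minimal sketch: the function name `summarize` and the sample values are hypothetical, and the standard deviation uses the 1/N form written on the slide.

```python
import math

def summarize(samples):
    """Return (mean, standard deviation with 1/N, root mean square)."""
    n = len(samples)
    mean = sum(samples) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in samples) / n)
    rms = math.sqrt(sum(x * x for x in samples) / n)
    return mean, std, rms

# Hypothetical urban fuel consumption measurements (miles per gallon)
mpg = [38.0, 40.0, 42.0, 40.0]
mean, std, rms = summarize(mpg)
print(mean, std, rms)
```

With these definitions the three quantities are linked by RMS squared = mean squared + standard deviation squared, which is a quick sanity check on any implementation.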

Page 21: Reliability

RELIABILITY

[Figure: System failure rate, the bathtub curve. Failure rate plotted against time, showing the infant mortality period, the operating period and the wear-out period.]

Page 22: Reliability

THE BATHTUB CURVE

The infant mortality period or debugging stage:

Failures typically caused by manufacturing flaws
Damage received in transit
Damage received in handling

The operating period:

Smaller failure rate
Failure rate tends to remain constant
Failure typically due only to chance
Failure generally results from severe, unpredictable and usually unavoidable stresses that arise from environmental factors such as vibrations, temperature, shock and pressure.

Page 23: Reliability

THE BATHTUB CURVE

The wear-out period:

Failure rate increases rapidly
Failure is a result of gradual degradation of some property of the system essential to proper operation
The degradation may occur from causes such as fatigue, creep, corrosion and abrasion

We are most interested in the period between infant mortality and wear-out. In this period we have:

Constant failure rate
Exponential failure time density function

Page 24: Reliability

RELIABILITY

Failure Rate

Assume at $t = 0$ we have $N_0$ articles.
At time $t$ we observe that $N_S(t)$ have survived.
The number failed is $N_F(t)$.

So $N_0 = N_S(t) + N_F(t)$

and $R(t) = N_S(t) / N_0$

$R(t)$ is the reliability as a function of time.

Page 25: Reliability

RELIABILITY

[Figure: Graph of $N_S$ vs $t$, falling from $N_0$ at $t = 0$, with the interval from $t$ to $t + \delta t$ marked.]

The failure rate is the limit as $\delta t \to 0$ of $-\frac{1}{N_S} \frac{\delta N_S}{\delta t}$, i.e. the magnitude of the gradient at $t$ divided by $N_S$.

Page 26: Reliability

RELIABILITY

The reliability R of a product can be defined as the probability that the product continues to meet some specification.

The unreliability F of a product can be defined as the probability that the product fails to meet the specification.

Both reliability and unreliability vary with time:

$R(t)$ decreases with time
$F(t)$ increases with time
$R(t) + F(t) = 1$
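As an illustration of R(t) = N_S(t) / N_0 and R(t) + F(t) = 1, the sketch below estimates reliability from a batch of recorded failure times. The function name and the failure times are hypothetical, and it assumes every item in the batch was run to failure.

```python
def empirical_reliability(failure_times, t):
    """R(t) = N_S(t) / N_0: fraction of the batch still working at time t."""
    n0 = len(failure_times)
    survivors = sum(1 for ft in failure_times if ft > t)
    return survivors / n0

# Hypothetical failure times in hours for N_0 = 5 items
failure_hours = [120, 340, 560, 800, 1500]

for t in (0, 400, 1000, 2000):
    r = empirical_reliability(failure_hours, t)
    print(t, r, 1 - r)  # time, R(t), F(t)
```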

Page 27: Reliability

PRACTICAL RELIABILITY DEFINITIONS

Non-repairable items

Suppose that N individual items of a given non-repairable product are placed in service and the times at which failures occur are recorded during a test interval T.

Further assume that all N items fail during T and that the ith failure occurs at time $T_i$, i.e. $T_i$ is the survival time or up time for the ith failure.

The total up time for N failures is therefore $\sum_{i=1}^{N} T_i$, and the mean time to failure is given on the next slide.

Page 28: Reliability

PRACTICAL RELIABILITY DEFINITIONS

Mean Time To Fail = Total up time / Number of failures

i.e. $MTTF = \frac{1}{N} \sum_{i=1}^{N} T_i$

Mean Failure Rate = Number of failures / Total up time

i.e. $\bar{\lambda} = \frac{N}{\sum_{i=1}^{N} T_i}$

The mean failure rate is the reciprocal of MTTF.
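A short numerical sketch of these two definitions follows. The function names and the up times are hypothetical; the code assumes, as the slides do, that all N items have failed by the end of the test.

```python
def mttf(up_times):
    """Mean time to failure: total up time divided by number of failures."""
    return sum(up_times) / len(up_times)

def mean_failure_rate(up_times):
    """Mean failure rate: number of failures divided by total up time."""
    return len(up_times) / sum(up_times)

# Hypothetical up times T_i in hours for N = 4 non-repairable items
up_times = [100.0, 200.0, 300.0, 400.0]

print(mttf(up_times))               # hours
print(mean_failure_rate(up_times))  # failures per hour
```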

Page 29: Reliability

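Mean Time To Failure & Mean Time Between Failures

$MTTF = \frac{1}{N} \sum_{i=1}^{N} t_{fi}$, where $t_{fi}$ is the time to failure (TTF)

$MTTF = \frac{1}{N} \int_0^\infty t \, [N f(t) \, dt] = \int_0^\infty t f(t) \, dt$

or, since the total life for N devices is $N \cdot MTTF$ and between $t$ and $t + \delta t$ the number live is $N R(t)$,

$MTTF = \int_0^\infty R(t) \, dt$

REPAIRABLE SYSTEMS

[Figure: Timeline of a repairable system alternating between LIVE and Under Repair, marking the time to failure (TTF), the repair time and the time between failures (TBF).]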

Page 30: Reliability

PRACTICAL RELIABILITY DEFINITIONS

There are N survivors at time $t = 0$ and $N - i$ at $t = T_i$, decreasing to zero at time $t = T$. The figure shows that the probability of survival, i.e. the reliability $R_i = (N - i) / N$, decreases from $R_i = 1$ at $t = 0$ to $R_i = 0$ at $t = T$.

MTTF = total area under the graph.

[Figure: Step plot of R against t, starting at 1 and dropping by 1/N at each failure time $T_1, T_2, \ldots, T_N$.]

Page 31: Reliability

Quantification of Reliability

Reliability ⇛ The probability that a system/component works
Availability ⇛ The probability that a system/component works on demand
Availability at time t ⇛ The probability that a system/component works on demand at time t
Availability ⇛ The fraction of the total time that a system/component can perform its required function

Page 32: Reliability

Quantification of Reliability

Unavailability = 1 - Availability
Unreliability = 1 - Reliability

For the failure process let

$F(t) = P[\text{a given component fails in } [0, t)]$

The corresponding probability density function $f(t)$ is therefore

$f(t) = \frac{dF(t)}{dt}$

So $f(t) \, dt = P[\text{a component fails in time period } [t, t + dt)]$

So $F(t) = \int_0^t f(t') \, dt'$

Page 33: Reliability

Quantification of Reliability

Transition to the failed state can be characterised by the conditional failure rate $h(t)$. This function is sometimes referred to as the hazard rate or hazard function.

This parameter is a measure of the rate at which failures occur, taking into account the size of the population with the potential to fail, i.e. those that are still functioning at time t.

So $h(t) \, dt = P[\text{a component fails in time period } [t, t + dt) \mid \text{it has not failed in } [0, t)]$

Page 34: Reliability

Quantification of Reliability

For conditional probabilities we can write:

$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$

Since $h(t) \, dt$ is a conditional probability we can define events A and B as follows by comparing this with the equation above:

A: component fails between $t$ and $t + dt$
B: component has not failed in $[0, t)$

With events defined like this, $P(A \cap B) = P(A)$, since if the component fails between $t$ and $t + dt$ it is implicit that it cannot have failed before time t.

Page 35: Reliability

Quantification of Reliability

$h(t) \, dt = \frac{P[\text{component fails between } t \text{ and } t + dt]}{P[\text{component does not fail in } [0, t)]} = \frac{f(t) \, dt}{1 - F(t)}$

Integrating gives

$\int_0^t h(t') \, dt' = \int_0^t \frac{f(t')}{1 - F(t')} \, dt'$

$\int_0^t h(t') \, dt' = -\ln[1 - F(t)]$

$F(t) = 1 - \exp\left[-\int_0^t h(t') \, dt'\right]$

Page 36: Reliability

Quantification of Reliability

$F(t) = 1 - \exp\left[-\int_0^t h(t') \, dt'\right]$

If $h(t)$, the failure rate or hazard rate for a general system or component, is plotted against time we get the bathtub curve.

In the useful life period $h(t) = \lambda$ is constant.

So after integration $F(t) = 1 - e^{-\lambda t}$

And the reliability, the probability that the component works continuously over $(0, t]$, is the exponential function

$R(t) = e^{-\lambda t}$
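The relationship between the hazard rate and F(t) can be checked numerically. The sketch below integrates an arbitrary hazard function with the trapezoidal rule and compares the constant-hazard case with the closed-form exponential result. The function name, the step count and the value of the failure rate are assumptions chosen for illustration.

```python
import math

def unreliability(hazard, t, steps=10_000):
    """F(t) = 1 - exp(-integral of h from 0 to t), using the trapezoidal rule."""
    dt = t / steps
    area = 0.0
    for k in range(steps):
        area += 0.5 * (hazard(k * dt) + hazard((k + 1) * dt)) * dt
    return 1.0 - math.exp(-area)

lam = 1e-3   # hypothetical constant failure rate, per hour
t = 500.0    # hours

f_numeric = unreliability(lambda _t: lam, t)
f_exact = 1.0 - math.exp(-lam * t)   # F(t) = 1 - e^(-lambda t)
print(f_numeric, f_exact, 1.0 - f_exact)  # F numeric, F exact, R(t)
```

The same function accepts a time-varying hazard, so a bathtub-shaped h(t) could be passed in place of the constant one.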

Page 37: Reliability

System Mean Time to Failure

When system failure can be tolerated and repair can be instigated, an important measure of the system performance is the system availability:

$A = \frac{MTBF}{MTBF + MTTR}$
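A one-line calculation makes the availability ratio concrete. The function name and the MTBF and MTTR figures below are hypothetical.

```python
def availability(mtbf, mttr):
    """Availability: MTBF divided by (MTBF + MTTR), both in the same time units."""
    return mtbf / (mtbf + mttr)

# Hypothetical system: fails on average every 990 h and takes 10 h to repair
a = availability(990.0, 10.0)
print(a, 1.0 - a)  # availability, unavailability
```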

Page 38: Reliability

QUANTIFIED RISK ASSESSMENT

SYSTEM LIFE CYCLE

[Figure: Reliability tasks across the system life cycle. Phases shown: system definition phase, concept design phase, detail design phase, manufacturing phase, operating phase. Tasks shown, in order:]

Establish reliability requirements
Set provisional reliability/availability targets
Prepare reliability specification
Perform global safety/availability assessment
Identify critical areas and components
Confirm/review targets
FMEA/FTA of critical systems and components
Review reliability database
Carry out detailed system reliability assessment
Prepare safety case
Prepare and implement reliability specifications
Review reliability demonstrations
Audit reliability performance
Collect and analyse reliability, test and maintenance data
Assess reliability impact of modifications

Page 39: Reliability

RELIABILITY

CRITICAL FAILURES
Where failure causes total loss of function

MAJOR FAILURES
Where failure causes major loss of function but the product can still be used to some extent

MINOR FAILURES
Where failure leaves the product still able to be used to perform the major function but with the loss of some convenience function

Page 40: Reliability

A reliability network is a representation of the reliability dependencies between components of a system

Dependencies are used in such a way as to represent the means by which the system will function

Such a network can be used to assess the probability of failure of a system


Page 41: Reliability

TOPOLOGICAL RELIABILITY

The functional behaviour of most systems can be characterised by a network diagram:

Nodes denote the subsystems
Branches of the network represent the functional relationship between these subsystems

Example

A high voltage supply system consisting of two transmitters A and B and a power supply. For the system to work, the power supply and at least one of the transmitters must operate. A path must exist between C and D for the system to work.

[Figure: Network between terminals C and D containing the power supply, Transmitter A and Transmitter B.]

Page 42: Reliability

The question of what constitutes proper operation or proper function for a particular type of equipment is usually specific to the equipment

Rather than attempt to suggest a general definition for proper function we assume that the appropriate definition for a device of interest has been specified


Page 43: Reliability

We can represent the functional status of the device as

$\Phi = \begin{cases} 1 & \text{if the device functions properly} \\ 0 & \text{if the device has failed} \end{cases}$

Note that this representation is intentionally binary. We assume that the status of the equipment of interest is either satisfactory or failed.

There are many types of equipment where one or more de-rated states are possible, and methods have been developed to cope.

Page 44: Reliability

We presume that most equipment is composed of components and that the status of the device is determined by the status of the components.

Let n be the number of components that make up the device and define the component status variables $x_i$ as

$x_i = \begin{cases} 1 & \text{if component } i \text{ functions properly} \\ 0 & \text{if component } i \text{ has failed} \end{cases}$

Page 45: Reliability

The set of n components that make up the device is represented by the component status vector:

$x = \{x_1, x_2, \ldots, x_n\}$

The dependence of the device status on the component status is represented by the function

$\Phi = \Phi(x)$

referred to as a "system structure function", a "system status function" or simply a "structure".

Page 46: Reliability

There are 4 generic types of structural relationships between a device and its components.

1. Series

2. Parallel

3. k out of n

4. All others


Page 47: Reliability

SERIES SYSTEMS

Definition: A series system is one in which all components must function properly in order for the system to function properly.

Conceptual analogue: series electrical circuit

[Figure: Reliability block diagram of a series system, with blocks 1, 2 and 3 in a single chain.]

Page 48: Reliability

TOPOLOGICAL RELIABILITY

Example 2

In an aircraft electronics system consisting of a sensor subsystem, guidance subsystem, computer subsystem and fire control subsystem, the system can only operate successfully if these four subsystems operate.

[Figure: Sensor, Guidance, Computer and Fire Control blocks in series.]

NOTE: The figure only depicts the functional relationship required for system operation and does not necessarily mean that these subsystems are electrically wired together in series.

Examples where components are not physically connected: the set of legs on a 3-legged stool; the set of tyres on a car.

Page 49: Reliability

SERIES SYSTEMS

For the series structure the requirement that all components must function implies that an algebraic form for the structure function is:

$\Phi(x) = \prod_{i=1}^{n} x_i$

Only the functioning of all components results in system function.

Examples (3 components in series):

$x_1 = x_2 = 1, x_3 = 0$ results in $\Phi(x) = 0$
$x_1 = x_2 = x_3 = 0$ results in $\Phi(x) = 0$
$x_1 = x_2 = x_3 = 1$ results in $\Phi(x) = 1$
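The series structure function translates directly into code. In this sketch the function name `phi_series` is hypothetical and component states are given as a tuple of 0s and 1s.

```python
from math import prod

def phi_series(x):
    """Series structure function: product of the component status variables."""
    return prod(x)

# The three cases from the slide, for a 3-component series system
print(phi_series((1, 1, 0)))  # one failed component: system failed
print(phi_series((0, 0, 0)))  # all failed: system failed
print(phi_series((1, 1, 1)))  # all working: system works
```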

Page 50: Reliability

PARALLEL SYSTEMS

Definition: A parallel system is one in which at least one component must function properly in order for the system to function properly.

Conceptual analogue: parallel electrical circuit

[Figure: Reliability block diagram of a parallel system, with blocks 1, 2 and 3 side by side.]

Page 51: Reliability

PARALLEL SYSTEMS

Similar to the series case, the structure function for the parallel system may be expressed as:

$\Phi(x) = 1 - \prod_{i=1}^{n} (1 - x_i)$

Examples (3 components in parallel):

$x_1 = x_2 = 1, x_3 = 0$ results in $\Phi(x) = 1$
$x_1 = 1, x_2 = x_3 = 0$ results in $\Phi(x) = 1$
$x_1 = x_2 = x_3 = 0$ results in $\Phi(x) = 0$
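The parallel structure function can be sketched the same way. The name `phi_parallel` is hypothetical; the system is treated as failed only when every (1 - x_i) factor equals 1.

```python
from math import prod

def phi_parallel(x):
    """Parallel structure function: 1 minus the product of (1 - x_i)."""
    return 1 - prod(1 - xi for xi in x)

# The three cases from the slide, for a 3-component parallel system
print(phi_parallel((1, 1, 0)))  # two working: system works
print(phi_parallel((1, 0, 0)))  # one working: system works
print(phi_parallel((0, 0, 0)))  # none working: system failed
```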

Page 52: Reliability

PARALLEL SYSTEMS

Parallel systems are often referred to as redundancy.

Often, but not always, the parallel components are identical.

There are actually several ways in which the redundancy may be implemented. This diversity can lead to different reliability under different environmental conditions.

A distinction is made between redundancy obtained using a parallel structure in which all components function simultaneously (ACTIVE REDUNDANCY) and that obtained using parallel components of which one functions and the other or others wait as standby units (STANDBY REDUNDANCY).

Page 53: Reliability

k-out-of-n SYSTEMS

Definition: A k-out-of-n system is one in which at least k of the n components that comprise the system must function properly in order for the system to function properly.

$\Phi(x) = \begin{cases} 1 & \text{if } \sum_{i=1}^{n} x_i \ge k \\ 0 & \text{otherwise} \end{cases}$

[Figure: Blocks 1, 2, 3 and 4 feeding a k-out-of-n element.]

Example cases for a 3-out-of-4 system:

$x_1 = x_2 = x_3 = 1, x_4 = 0$ results in $\Phi(x) = 1$
$x_1 = x_2 = 1, x_3 = x_4 = 0$ results in $\Phi(x) = 0$
$x_1 = x_2 = x_3 = 0, x_4 = 1$ results in $\Phi(x) = 0$
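A minimal sketch of the k-out-of-n structure function follows; the function name is hypothetical. It also shows that a series system is the special case k = n and a parallel system the special case k = 1.

```python
def phi_k_out_of_n(x, k):
    """k-out-of-n structure function: 1 if at least k components work."""
    return 1 if sum(x) >= k else 0

# The three cases from the slide, for a 3-out-of-4 system
print(phi_k_out_of_n((1, 1, 1, 0), 3))
print(phi_k_out_of_n((1, 1, 0, 0), 3))
print(phi_k_out_of_n((0, 0, 0, 1), 3))

# Special cases: series is n-out-of-n, parallel is 1-out-of-n
print(phi_k_out_of_n((1, 1, 0), 3), phi_k_out_of_n((1, 1, 0), 1))
```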

Page 54: Reliability

TOPOLOGICAL RELIABILITY

Example 3

In a computer system with a computer, a controller and three memory units, suppose that the system can only satisfy its operational requirements if at least two of the three memory units are operable and both the computer and the controller are operable.

The 4 branches represent the 4 possible ways we can obtain system operation.

[Figure: Computer and Controller followed by four parallel branches: Unit 1 and Unit 2; Unit 1 and Unit 3; Unit 2 and Unit 3; Unit 1, Unit 2 and Unit 3.]
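One way to put a number on Example 3 is to enumerate every combination of component states and add up the probabilities of those in which the system works. This is only a sketch: the component reliabilities are hypothetical and the components are assumed to fail independently, which the slide does not state.

```python
from itertools import product

def phi(computer, controller, m1, m2, m3):
    """System works if computer, controller and at least 2 of 3 memory units work."""
    return int(computer and controller and (m1 + m2 + m3 >= 2))

def system_reliability(p):
    """Sum P(state) over all component state vectors in which phi = 1."""
    total = 0.0
    for state in product((0, 1), repeat=len(p)):
        if phi(*state):
            prob = 1.0
            for xi, pi in zip(state, p):
                prob *= pi if xi else 1.0 - pi
            total += prob
    return total

# Hypothetical reliabilities: computer, controller, memory units 1 to 3
p = [0.99, 0.98, 0.9, 0.9, 0.9]
print(system_reliability(p))
```

Enumeration grows as 2^n, so it suits small networks like this one; the closed-form series, parallel and voting results are preferable for larger systems.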

Page 55: Reliability

[Figure: Equivalent computer network.]

Page 56: Reliability

TOPOLOGICAL RELIABILITY

Communication System diagram

At least 1 of the two Antenna Receiver Converters must work
At least 1 of the two Teleprinters must work
The Pulse Shaping Unit must work

[Figure: Two Antenna Receiver Converters in parallel, the Pulse Shaping Unit, and two Teleprinters in parallel.]

Page 57: Reliability

TOPOLOGICAL RELIABILITY

In general suppose the topological or network representation of a system consists of n nodes and define

$R(N_1, \ldots, N_m)$ = probability that nodes number $N_1, \ldots, N_m$ are operating and the other $n - m$ nodes are not operating.

Then the probability that exactly m nodes are simultaneously operating is given by

$R_m = \sum_{N_1, \ldots, N_m} R(N_1, \ldots, N_m)$

where the sum is taken over all positive integers $N_1, \ldots, N_m$ such that $n \ge N_1 > N_2 > \ldots > N_m \ge 1$.

Thus the probability that at least k nodes are operating is given by

$\sum_{m=k}^{n} R_m$

Page 58: Reliability

Reliability of Systems 1: Series systems

[Figure: Elements $R_1, R_2, R_3, R_4, \ldots, R_i, \ldots, R_m$ connected in series.]

Consider a system of m elements in series with individual reliabilities $R_1, R_2, \ldots, R_i, \ldots, R_m$ respectively.

The system will only survive if every element survives; if one element fails the system fails.

Assume the reliability of each element is independent of the reliability of the other elements.

The probability that the system survives is the probability that element 1 survives and element 2 survives and element 3 survives, etc.

The system reliability is the product of the element reliabilities:

$R_{syst} = R_1 R_2 R_3 \ldots R_i \ldots R_m$

Page 59: Reliability

Reliability of Systems 1: Series systems

If we assume a constant failure rate for the elements then, since

$R_i = e^{-\lambda_i t}$

$R_{syst} = e^{-\lambda_1 t} e^{-\lambda_2 t} \ldots e^{-\lambda_i t} \ldots e^{-\lambda_m t}$

So if $\lambda_{syst}$ is the overall system failure rate

$R_{syst} = e^{-\lambda_{syst} t} = e^{-(\lambda_1 + \lambda_2 + \ldots + \lambda_i + \ldots + \lambda_m) t}$

$\lambda_{syst} = \lambda_1 + \lambda_2 + \ldots + \lambda_i + \ldots + \lambda_m$

The failure rate of a series network is the sum of the individual element failure rates, so it is important to keep the number of elements to a minimum so that the reliability will be a maximum.
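The additivity of failure rates in a series system can be confirmed numerically. In this sketch the function name, the element failure rates and the mission time are hypothetical.

```python
import math

def series_reliability(rates, t):
    """R_syst = exp(-lambda_syst * t), with lambda_syst the sum of element rates."""
    lam_syst = sum(rates)
    return math.exp(-lam_syst * t)

# Hypothetical constant failure rates (per hour) for three series elements
rates = [1e-5, 2e-5, 5e-6]
t = 1000.0  # mission time in hours

r_from_sum = series_reliability(rates, t)
r_from_product = math.prod(math.exp(-lam * t) for lam in rates)
print(r_from_sum, r_from_product)  # the two routes agree
```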

Page 60: Reliability

Reliability of Systems 1: Series systems

Unreliability of Series System with Small Failure Rates

Protective systems have element and system UNRELIABILITIES F that are very small. The corresponding system reliabilities are therefore very close to 1; for example 0.9999 may be typical. Then the calculation of $R_{syst} = R_1 R_2 R_3 \ldots R_i \ldots R_m$ may be arithmetically unwieldy and the alternative equation involving unreliabilities may be more useful, since

$R_{syst} = 1 - F_{syst}$ and $R_i = 1 - F_i$

We have

$1 - F_{syst} = (1 - F_1)(1 - F_2) \ldots (1 - F_i) \ldots (1 - F_m)$
$= 1 - (F_1 + F_2 + \ldots + F_i + \ldots + F_m)$ + terms involving products of F's

If the individual $F_i$ are small, i.e. $F_i \ll 1$, the terms involving products of F's can be neglected, giving the approximate equation

$F_{syst} \approx F_1 + F_2 + \ldots + F_i + \ldots + F_m$
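To see how good the small-F approximation is, the sketch below compares the exact series unreliability with the simple sum. The function names and element unreliabilities are hypothetical.

```python
from math import prod

def series_unreliability_exact(F):
    """F_syst = 1 - product of (1 - F_i)."""
    return 1 - prod(1 - f for f in F)

def series_unreliability_approx(F):
    """Approximation for small F_i: F_syst is roughly the sum of the F_i."""
    return sum(F)

# Hypothetical element unreliabilities for a protective system
F = [1e-4, 2e-4, 5e-5]
exact = series_unreliability_exact(F)
approx = series_unreliability_approx(F)
print(exact, approx, approx - exact)
```

The difference is of the order of the neglected product terms, and the approximation errs on the pessimistic side because it slightly overstates the unreliability.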

Page 61: Reliability

Reliability of Systems: Parallel Systems

[Figure: Elements with unreliabilities $F_1, F_2, \ldots, F_j, \ldots, F_n$ connected in parallel.]

Consider an overall system consisting of n individual elements or systems in parallel with individual unreliabilities $F_1, F_2, \ldots, F_j, \ldots, F_n$.

Only one individual element is necessary to meet the functional requirements of the overall system. The remaining elements increase the reliability of the system. THIS IS CALLED REDUNDANCY.

Failure occurs only if ALL the elements fail, so the unreliability of a parallel system is

$F_{syst} = F_1 F_2 \ldots F_j \ldots F_n$
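A short sketch of the parallel result shows how quickly redundancy reduces unreliability. The function name and the element unreliability of 0.1 are hypothetical, and the elements are assumed to fail independently.

```python
from math import prod

def parallel_unreliability(F):
    """F_syst = F_1 * F_2 * ... * F_n: the system fails only if every element fails."""
    return prod(F)

# Hypothetical identical elements, each with unreliability 0.1
for n in (1, 2, 3):
    f_syst = parallel_unreliability([0.1] * n)
    print(n, f_syst, 1 - f_syst)  # number of elements, F_syst, R_syst
```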

Page 62: Reliability

Reliability of Systems: Voting Systems

Majority voting systems are used to protect hazardous plant and processes and have applications in the chemical, nuclear and aerospace industries. The diagram shows a typical system with 2 out of 4 voting with initiators A, B, C, D.

[Figure: Process parameter inputs and trip settings feed initiators A, B, C and D, whose outputs go to a 2oo4 voting element that drives the shut down system.]

Page 63: Reliability

Reliability of Systems: Voting Systems

Suppose R and F are the reliability and unreliability of the individual initiators.

The overall initiation system fails to protect the plant if either all 4 initiators fail or any 3 initiators fail.

If 2 or fewer initiators fail there are still sufficient left to trip the plant.

Since the F's are normally small, the rare events approximation is valid and the overall system unreliability is the sum of the following probabilities.

Page 64: Reliability

Reliability of Systems: Voting Systems

$F_{INIT}$ = Probability that A and B and C and D fail
+ Probability that A and B and C fail (and D survives)
+ Probability that B and C and D fail (and A survives)
+ Probability that A and C and D fail (and B survives)
+ Probability that A and B and D fail (and C survives)

Each of the terms is the product of individual unreliabilities and reliabilities.

Page 65: Reliability

Reliability of Systems: Voting Systems

$F_{INIT} = F^4 + F^3 R + F^3 R + F^3 R + F^3 R = F^4 + 4 F^3 R = F^3 (F + 4R)$

This result can be obtained from the binomial expansion of $(F + R)^4$:

$(F + R)^4 = F^4 + 4 F^3 R + 6 F^2 R^2 + 4 F R^3 + R^4$

The first term $F^4$ represents the probability of all 4 initiators failing and the second the total probability of 3 failing. If $R \approx 1$ and $F \ll 1$ then $F_{INIT} \approx 4 F^3$.

The unreliability of the complete protective system is

$F_{SYST} \approx F_{INIT} + F_{VOTING} + F_{SHUT-DOWN}$
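The size of the error in the 4F^3 approximation can be seen with a quick calculation. The function name and the single-initiator unreliability below are hypothetical.

```python
def f_init_2oo4(F):
    """Initiation unreliability for 2oo4 voting: F^4 + 4 F^3 R, with R = 1 - F."""
    R = 1.0 - F
    return F ** 4 + 4 * F ** 3 * R

F = 0.01  # hypothetical single-initiator unreliability
exact = f_init_2oo4(F)
approx = 4 * F ** 3
print(exact, approx)
```

For F of this size the approximation differs from the full expression by less than one percent and is slightly pessimistic.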

Page 66: Reliability

Reliability of Systems: Majority Voting Systems

In a majority voting system there are n trip channels and the plant is tripped if m channels ($n \ge m$) indicate that the plant should be tripped.

Such a system is referred to as 'm out of n' or m oo n.

The binomial distribution can be used to calculate overall failure probabilities.

Consider the term containing $F^j$ in the binomial expansion of $(F + R)^n$, where F and R are the single channel unreliability and reliability and n is the total number of channels.

This is ${}^nC_j F^j R^{n-j}$ and is the probability that j channels fail, i.e. $(n - j)$ channels survive.

Page 67: Reliability

Reliability of Systems: Fail-Safe & Fail-Danger

Fail-Danger failure
Any system or component failure that prevents, or tends to prevent, the plant being tripped when a potentially hazardous fault condition occurs.
Example: A pressure switch failed to open when the pressure exceeded the trip pressure.

Fail-Safe failure
Any system or component failure that produces a plant trip when a plant trip is not required.
Example: A pressure switch opened when the pressure was below the trip pressure.

Page 68: Reliability

Reliability of Systems: Fail-Safe & Fail-Danger

A fail-danger failure is a very serious occurrence.

Fail-safe failures are less serious but cause loss of production and of confidence in the trip system.

Fail-danger and fail-safe failures will generally have different failure rates and so different failure probabilities.

Detailed information on the failure rates associated with all possible modes of failure of trip equipment is not always available.

We may have to assume both rates are equal to the average failure rate.

Page 69: Reliability

69

Reliability of Systems Fail-Safe & Fail-Danger

Suppose we wish to calculate the overall fail-danger and fail-safe probabilities for a system with 'two out of three' voting, i.e. 2 oo 3, where m = 2 and n = 3. These probabilities can be calculated from the binomial expansion of $(R + F)^3$, where R and F are the single channel reliability and unreliability respectively.

So we have $(F + R)^3 = F^3 + 3F^2R + 3FR^2 + R^3$

where $F^3$ represents the probability that all 3 channels fail, $3F^2R$ the probability that 2 channels fail, $3FR^2$ the probability that 1 channel fails and $R^3$ the probability that no channel fails

Page 70: Reliability

Reliability of Systems Fail-Safe & Fail-Danger

Looking at fail-danger first

if either 2 or 3 channels fail dangerously then

there are correspondingly only 1 or zero channels

left working.

This is insufficient to trip the plant with 2 oo 3

voting and an overall fail danger situation has

occurred.

If $F_D$ is the single channel fail-danger probability then the overall fail danger probability is

$P_D = F_D^3 + 3RF_D^2 = F_D^2(3R + F_D)$

In a protective system $R \approx 1$ and $F_D \ll 1$, giving

$P_D \approx 3F_D^2$
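To see how close the approximation is, the exact 2 oo 3 fail-danger expression and its simplified form can be compared numerically. This is a small sketch with an illustrative function name; it assumes R = 1 - F_D for each channel.

```python
def fail_danger_2oo3(fd):
    """Return (exact, approximate) overall fail-danger probability for 2 oo 3 voting.

    exact  = FD^3 + 3*R*FD^2   (two or three channels failed dangerously)
    approx = 3*FD^2            (valid when R is close to 1 and FD << 1)
    """
    r = 1.0 - fd
    exact = fd**3 + 3 * r * fd**2
    approx = 3 * fd**2
    return exact, approx
```

For a single channel value of 0.01 the two results differ by less than one percent, which is why the simplified form is normally used for protective systems.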

Page 71: Reliability

Reliability of Systems Fail-Safe & Fail-Danger

Looking at fail-safe

A fail-safe failure of no channels or only one

channel will not cause a plant trip with 2 oo 3 voting.

A fail-safe failure of two channels will cause an

unnecessary plant trip.

The failure of a third channel is irrelevant because

the plant is tripped by only 2 channels.

The overall fail-safe probability is therefore

$P_S = 3RF_S^2 \approx 3F_S^2$

where $F_S$ is the single channel fail-safe probability

Page 72: Reliability

Reliability of Systems Fail-Safe & Fail-Danger

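Overall fail-danger probability

$P_D = {}^nC_r F_D^r$

where r = n - m + 1 and ${}^nC_r = \dfrac{n!}{r!(n-r)!}$

$F_D^{MAX} = 1 - e^{-\lambda_D T} \approx \lambda_D T$ (if $\lambda_D T \ll 1$)

Overall fail-safe probability

$P_S = {}^nC_m F_S^m$

where ${}^nC_m = \dfrac{n!}{m!(n-m)!}$

$F_S^{MAX} = 1 - e^{-\lambda_S T} \approx \lambda_S T$ (if $\lambda_S T \ll 1$)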
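The general m oo n expressions on this slide translate into a few lines of code. The sketch below assumes identical channels, a constant failure rate per year and a test interval T in years; all function names are illustrative.

```python
from math import comb, exp

def max_failure_probability(rate_per_year, test_interval_years):
    """Single channel F_MAX = 1 - exp(-lambda * T), reached just before the next test."""
    return 1.0 - exp(-rate_per_year * test_interval_years)

def overall_fail_danger(n, m, fd):
    """P_D = nCr * FD^r with r = n - m + 1 (too few healthy channels left to trip)."""
    r = n - m + 1
    return comb(n, r) * fd**r

def overall_fail_safe(n, m, fs):
    """P_S = nCm * FS^m (m spurious channel trips are enough to trip the plant)."""
    return comb(n, m) * fs**m
```

For 2 oo 3 voting both functions reduce to 3F^2, while for two channels in parallel (1 oo 2) they give F_D^2 and 2F_S respectively.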

Page 73: Reliability

Reliability of Systems Fail-Safe & Fail-Danger

FRACTIONAL DEAD TIME FDT

Is related to fail-danger probability

Is the mean proportion of the testing interval T that

the trip system is incapable of protecting the plant

$FDT = \dfrac{1}{T}\displaystyle\int_0^T F_D(t)\,dt$

FDT is a similar concept to unavailability

$FDT = \dfrac{1}{r+1}\,{}^nC_r F_D^r$
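One possible route from the integral definition to the closed form is sketched below. It assumes that each channel's fail-danger probability grows linearly over the test interval, $F_D(t) \approx \lambda_D t$ (reasonable when $\lambda_D T \ll 1$), that the overall fail-danger probability of the voted system at time t is ${}^nC_r(\lambda_D t)^r$, and that $F_D$ in the closed form is the single channel value at the end of the interval. These are assumptions made for the illustration, not statements from the slides.

```latex
\begin{align*}
FDT &= \frac{1}{T}\int_0^T {}^{n}C_r\,(\lambda_D t)^r\,dt \\
    &= \frac{{}^{n}C_r\,\lambda_D^{\,r}}{T}\cdot\frac{T^{r+1}}{r+1} \\
    &= \frac{1}{r+1}\,{}^{n}C_r\,(\lambda_D T)^r \\
    &\approx \frac{1}{r+1}\,{}^{n}C_r\,F_D^{\,r}
\end{align*}
```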

Majority voting can be implemented with

combinatorial logic.
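As an illustration of the combinatorial logic, a 2 oo 3 majority voter is the OR of the three pairwise ANDs of the channel outputs. The sketch below models it in software with an illustrative function name; a real trip system would implement it in hardware.

```python
def majority_vote_2oo3(a, b, c):
    """Return True when at least two of the three channel signals are True."""
    return (a and b) or (b and c) or (a and c)
```

A single disagreeing channel, whether it fails safe or fails dangerously, does not change the voter output.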

Page 74: Reliability

FAULT TREE , EVENT TREE and

FMECA ANALYSIS

To check for fault propagation one technique is Failure Modes, Effects and Criticality Analysis (FMECA).

A full FMECA is hard and expensive. Take every component, wire and connector and think of every possible fault. Consider the effects of all of these: are they single point failures? Can they propagate? Document the results.

An FMECA is a development of an FMEA (Failure Modes and Effects Analysis).

FMEA and FMECA are bottom-up analyses. The alternative approach is a fault tree analysis.

Page 75: Reliability

FMECA ANALYSIS

Page 76: Reliability

FMECA ANALYSIS

Page 77: Reliability

Page 78: Reliability

FAULT TREE , EVENT TREE and

FMECA ANALYSIS

Event trees are

Encountered frequently in the analysis of events including human activities that can lead to disasters or undesirable events

Sometimes called Cause-Consequence analysis

Used more frequently in Safety studies

Page 79: Reliability

EVENT TREE

Consider the following example of a fire alarm system.

Ideally if there is a fire then

The alarm goes off.

A sprinkler system extinguishes the fire.

In each case there is a human standby:

if either the alarm or the sprinkler system fails, a human operator can operate either or both.

This can be represented by the event tree in the figure

Page 80: Reliability

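EVENT TREE

[Figure: event tree for the fire alarm system. Column headings: Fire starts; Alarm functions; Operator notices malfunction (shown twice); Sprinkler system functions. Each branch is marked YES or NO with its probability.]

Branch probabilities shown in the figure: $5 \times 10^{-4}$, 0.9995, 0.999, $10^{-3}$, 0.99, $10^{-2}$, 0.998, $2 \times 10^{-3}$, 0.9, 0.1

Outcomes:

NO FIRE 0.9995

Fire Suppressed $4.98 \times 10^{-4}$

Fire Suppressed $8.99 \times 10^{-7}$

Fire Suppressed $4.9 \times 10^{-7}$

Fire Suppressed $9.9 \times 10^{-10}$

Fire Spreads $9.99 \times 10^{-8}$

Fire Spreads $5 \times 10^{-9}$

Fire Spreads $9.9 \times 10^{-13}$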

Page 81: Reliability

EVENT TREE

Notice that, of all the possible outcomes, only three are that the fire spreads.

The possible sequences of events that can lead to this undesirable event can now be identified from these outcomes:

The alarm fails to function and the operator fails to notice and take action in time

The alarm functions but the sprinkler fails to function and the operator fails to notice and take action in time

The alarm fails to function, and the operator notices, but the sprinkler fails to function, and the operator fails to notice

If sufficient data exists to estimate probabilities, the likelihood of the various outcomes can be obtained
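The outcome probabilities of the fire alarm event tree are the products of the branch probabilities along each path. The sketch below uses the branch values that appear in the figure; their assignment to individual branches is an assumption, chosen because it reproduces the outcome values listed in the figure. All names are illustrative.

```python
# Branch probabilities from the figure. Which branch each value belongs to is
# an assumption that reproduces the listed outcome probabilities.
P_FIRE = 5e-4
P_ALARM = 0.999
P_SPRINKLER = 0.998
P_NOTICE_ALARM_FAILURE = 0.99
P_NOTICE_SPRINKLER_FAILURE_AFTER_ALARM = 0.9
P_NOTICE_SPRINKLER_FAILURE_NO_ALARM = 0.999

def event_tree_paths():
    """Return (outcome, probability) for every path through the tree."""
    alarm_ok = P_FIRE * P_ALARM
    alarm_failed_noticed = P_FIRE * (1 - P_ALARM) * P_NOTICE_ALARM_FAILURE
    sprinkler_fails = 1 - P_SPRINKLER
    return [
        ("no fire", 1 - P_FIRE),
        ("suppressed", alarm_ok * P_SPRINKLER),
        ("suppressed", alarm_ok * sprinkler_fails
         * P_NOTICE_SPRINKLER_FAILURE_AFTER_ALARM),
        ("spreads", alarm_ok * sprinkler_fails
         * (1 - P_NOTICE_SPRINKLER_FAILURE_AFTER_ALARM)),
        ("spreads", P_FIRE * (1 - P_ALARM) * (1 - P_NOTICE_ALARM_FAILURE)),
        ("suppressed", alarm_failed_noticed * P_SPRINKLER),
        ("suppressed", alarm_failed_noticed * sprinkler_fails
         * P_NOTICE_SPRINKLER_FAILURE_NO_ALARM),
        ("spreads", alarm_failed_noticed * sprinkler_fails
         * (1 - P_NOTICE_SPRINKLER_FAILURE_NO_ALARM)),
    ]

def outcome_probability(outcome):
    """Total probability of one outcome, summed over all paths leading to it."""
    return sum(p for name, p in event_tree_paths() if name == outcome)
```

Since every path ends in exactly one outcome, the path probabilities sum to 1, which is a useful check on any event tree.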

Page 82: Reliability

FAULT TREE

An Example of a Deductive approach.

“What can cause this?”

Used to identify the causal relationships leading to a

specific system failure mode.

The system failure mode is the TOP event and the

FAULT TREE is developed in branches below this

event showing its causes.

Page 83: Reliability

[Figure: fault tree with top event T, intermediate events T1 to T6 and basic events a, b, c, d, e, f.]

Fault Tree From Logic Expression T = (abc + f)[(a + d)f](a + be)

Page 84: Reliability

Simplifying the expression:

T = (abc + f)[(a + d)f](a +be)

= (abc +f)(af + df)(a + be)

= abcf + af + abcdf + adf + abcef + abef + abcdef + bdef

Using XX = X

= abcf(1+d+e+de) + af (1+d+be) + bdef

= abcf + af + bdef Using (1 + X) = 1

= af (bc + 1) + bdef

= af + bdef

= f(a +bde)
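The simplification can be checked by brute force, since six basic events give only 64 input combinations. The sketch below, with illustrative function names, compares the original and the reduced expression over the full truth table.

```python
from itertools import product

def original(a, b, c, d, e, f):
    """T = (abc + f)[(a + d)f](a + be)."""
    return ((a and b and c) or f) and ((a or d) and f) and (a or (b and e))

def simplified(a, b, c, d, e, f):
    """T = f(a + bde)."""
    return f and (a or (b and d and e))

def expressions_equivalent():
    """True when both expressions agree for every combination of basic events."""
    return all(
        bool(original(*values)) == bool(simplified(*values))
        for values in product([False, True], repeat=6)
    )
```

The reduced form also shows that event c does not influence the top event at all.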

Fault Tree from Logic Expression

Page 85: Reliability

TESTING ACTIVITIES

Design testing refers to

laboratory tests

on computer and / or

prototype models

to prove that the design is capable of

meeting the quality specification

QUALITY DESIGN AND QUALIFICATION

Page 86: Reliability

QUALITY DESIGN AND QUALIFICATION

Qualification testing refers to

field testing of

pre-production models and

production models involving

all performance characteristics

over the full range of relevant environmental variables

to further verify that the specification can be met.

Page 87: Reliability

DESIGN FOR RELIABILITY

Objective

To design a given product or system which

meets the target failure rate $\lambda_T$

under the environmental conditions specified.

It is assumed that

all components and elements are operating in the useful life region, where failure rate is constant with time.

Page 88: Reliability

DESIGN FOR RELIABILITY

General principles to be observed.

a) Element / component selection

Only elements / components with well established failure rate data / models should be used

Some technologies are inherently more reliable than others, e.g.

Solid state switching devices are more reliable than electromechanical reed relays

Inductive displacement transducers are more reliable than the resistive potentiometer type.

Page 89: Reliability

DESIGN FOR RELIABILITY

b) De-rating

Stress (x) was defined as a variable which, when applied to an element or component, tends to increase failure rate, e.g.

mechanical stress

voltage

Strength (y) was defined as any property of the element or component which resists the applied stress, e.g.

elastic limit

rated voltage

Page 90: Reliability

DESIGN FOR RELIABILITY

To reduce failure rate

strength should exceed stress by an adequate

Safety Margin $SM = \dfrac{\bar{y} - \bar{x}}{\sqrt{\sigma_x^2 + \sigma_y^2}}$

In a mechanical element SM > 5.0

Stress Ratio $SR = \dfrac{\bar{x}}{\bar{y}}$

In an electronic circuit the voltage Stress Ratio SR should be kept below 0.7
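The two de-rating measures can be evaluated as follows. This sketch assumes the slide's formulas are the usual ones, safety margin SM = (mean strength - mean stress) / sqrt(sigma_x^2 + sigma_y^2) and stress ratio SR = stress / strength, since some symbols were lost from the slide; the function names and example numbers are illustrative only.

```python
from math import sqrt

def safety_margin(mean_strength, mean_stress, sd_strength, sd_stress):
    """SM = (y_mean - x_mean) / sqrt(sigma_x^2 + sigma_y^2) (assumed form)."""
    return (mean_strength - mean_stress) / sqrt(sd_stress**2 + sd_strength**2)

def stress_ratio(applied_stress, rated_strength):
    """SR = x / y, e.g. applied voltage divided by rated voltage (assumed form)."""
    return applied_stress / rated_strength
```

For example, a mean strength of 100 units against a mean stress of 50 units, with standard deviations of 6 and 8, gives SM = 5.0, and a 3.3 V signal on a part rated at 5 V gives SR = 0.66, just inside the 0.7 guideline.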

Page 91: Reliability

DESIGN FOR RELIABILITY

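c) Environment

Component / element failure rate is critically dependent on environment.

Environment correction factor $\pi_E$

$\lambda_{OBS} = \pi_E \lambda_B$

where $\lambda_B$ is the base failure rate and $\lambda_{OBS}$ is the observed failure rate in the given environment.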

Page 92: Reliability

DESIGN FOR RELIABILITY

d) Minimum complexity

For a series system the system failure rate is the sum of the individual component / element failure rates

Thus the number of components / elements in the system should be the minimum required for the system to perform its function

In electronic systems reliability can be improved by using integrated circuits which replace hundreds or thousands of basic devices.

The failure rate of the integrated circuit is generally less than the sum of the failure rates of the devices it replaces.
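The series rule, together with the environment correction introduced earlier, can be sketched as below. It assumes constant failure rates (the useful life region) and independent components; the function names and the example rates are illustrative, not course data.

```python
from math import exp

def observed_failure_rate(base_rate, environment_factor):
    """lambda_OBS = pi_E * lambda_B for one component."""
    return environment_factor * base_rate

def series_failure_rate(component_rates):
    """Failure rate of a series system: the sum of the component failure rates."""
    return sum(component_rates)

def series_reliability(component_rates, mission_time):
    """R(t) = exp(-lambda_system * t) for constant failure rates."""
    return exp(-series_failure_rate(component_rates) * mission_time)
```

Every extra component adds its rate to the sum, so removing a component always lowers the series failure rate, which is the argument for minimum complexity.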

Page 93: Reliability

DESIGN FOR RELIABILITY

e) Redundancy

The use of several identical elements / systems connected in parallel increases the reliability of the overall system.

Redundancy should be considered in situations where either the complete system or certain elements of the system have too high a failure rate.
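The gain from redundancy can be illustrated with the standard parallel formula R = 1 - (1 - R_i)^n. The sketch assumes active redundancy with independent failures and no common mode failure; the function name is illustrative.

```python
def parallel_reliability(single_reliability, n_elements):
    """Reliability of n identical elements in parallel: 1 - (1 - R)^n."""
    return 1.0 - (1.0 - single_reliability) ** n_elements
```

With a single element reliability of 0.9, two elements in parallel give 0.99 and three give 0.999, so each added element has a smaller absolute benefit than the one before.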

Page 94: Reliability

DESIGN FOR RELIABILITY

f) Diversity

The problem of common mode failure was discussed

Here a fault can occur that causes more than one element in a system to fail simultaneously e.g.

Electronic system where several of the circuits share a common power supply

Page 95: Reliability

DESIGN FOR RELIABILITY

If the probability of common mode failure limits the reliability of the overall system

equipment diversity should be considered

Here a common function is carried out by two systems in parallel

but

Each system is made up of

different elements with

different operating principles

e.g. A temperature measurement device made up of two subsystems in parallel

one electronic

one pneumatic

Page 96: Reliability

DESIGN FOR RELIABILITY

g) Calculation of system reliability

Once the components / elements have been chosen and their configuration in the system/product decided

then

The overall system / product reliability can be calculated.

The Reliability / Failure rate calculated for the overall product should be then compared with the target value

If the target value is not met then the design should be adjusted until the target figure is reached.

Page 97: Reliability

The system designer may consider component

redundancy

ADVANTAGES of redundancy

The quickest solution if time is of prime importance

The easiest solution, if the component is already

designed

The cheapest solution, if the component is

economical in comparison with the cost of redesign

The only solution, if the reliability requirement is

beyond the state of the art

High Reliability Design

Page 98: Reliability

DISADVANTAGES of redundancy

Too expensive, if the components are costly

Exceed the limitations on size and weight,

particularly in satellites

Exceed the power limitations, particularly in active

redundancy

Attenuate the input signal, requiring additional

amplifiers which increase complexity

Require sensing and switching circuitry so complex

as to offset the advantage of redundancy

High Reliability Design

Page 99: Reliability

Exercises

1) Discuss, giving examples, the methods, including procurement and testing procedures, used by manufacturers to ensure the reliability of a product.

2) Discuss the differences in reliability required in systems such as consumer products, trains, aeroplanes and satellites. What value would you assign to the overall failure rate of each of these systems?

3) What do you understand by the reliability of a system? Discuss some practical ways to assign a quantitative value to the reliability of a system.

4) Draw a fault tree for the lighting system in a car and hence derive a logic equation for the failure of the headlights.

5) Discuss the concepts of fail-safe and fail-danger. Why is the single unit probability for fail-safe and fail-danger often assumed to be the same? Explain why, in spite of this assumption, the overall fail-danger probability of a complex system will be different from the overall fail-safe probability.

Page 100: Reliability

Problems

1) The figure shows a protective system based on temperature measurement. The system is to have a maximum fail-danger probability not exceeding $8 \times 10^{-3}$ and a maximum fail-safe probability not exceeding $5 \times 10^{-2}$. The system is tested and proved to be working correctly at three-week intervals. Annual fail-safe and fail-danger failure rates for each component are:

Thermocouple: $\lambda_S = \lambda_D = 0.5$

Thermocouple input trip amplifier/comparator: $\lambda_S = \lambda_D = 0.1$

m out of n voting element: $\lambda_S = \lambda_D = 0.05$

Logic operated switch: $\lambda_S = \lambda_D = 0.1$

Solenoid valve: $\lambda_S = \lambda_D = 0.1$

Trip valve: $\lambda_S = \lambda_D = 0.1$

Calculate the maximum fail-safe and fail-danger probability $P_S$ and $P_D$ for:

a) The high integrity voting equipment, HIVE.

b) The high integrity trip initiator, HITI.

c) The high integrity shutdown system, HISS.

And hence

d) The total system fail-danger probability

e) The total system fail-safe probability

State whether the system meets the design criteria

Page 101: Reliability

Problems

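[Figure: block diagram of the protective system. HITI: three channels, each a thermocouple (0.5) followed by a trip amplifier/comparator (0.1). HIVE: 2 oo 3 voting element (0.05). HISS: two channels, each a logic switch (0.1), solenoid valve (0.1) and trip valve (0.1).]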

Page 102: Reliability

Problems

Solution

i HIVE

Maximum $F_S = F_D = 1 - e^{-0.05 \times 3/52} = 2.8 \times 10^{-3}$

ii HITI single channel

Maximum $F_S = F_D = 1 - e^{-0.6 \times 3/52} = 3.4 \times 10^{-2}$

2 oo 3 voting HITI:

$P_D = 3F_D^2 = 3 \times (3.4 \times 10^{-2})^2 \approx 3.4 \times 10^{-3}$

iii HISS single channel

Maximum $F_S = F_D = 1 - e^{-0.3 \times 3/52} = 1.7 \times 10^{-2}$

Two channels in parallel

$P_D = F_D^2 = (1.7 \times 10^{-2})^2 \approx 0.3 \times 10^{-3}$

$P_S = 2F_S \approx 2 \times 1.7 \times 10^{-2} = 3.44 \times 10^{-2}$

iv Total system fail-danger probability $= (P_D)_{HITI} + (P_D)_{HIVE} + (P_D)_{HISS}$

$= 3.4 \times 10^{-3} + 2.8 \times 10^{-3} + 0.3 \times 10^{-3} = 6.5 \times 10^{-3}$

v Total system fail-safe probability $= (P_S)_{HITI} + (P_S)_{HIVE} + (P_S)_{HISS}$

$= 3.4 \times 10^{-3} + 2.8 \times 10^{-3} + 34.4 \times 10^{-3} = 40.6 \times 10^{-3} = 4.1 \times 10^{-2}$

The system meets the design specification
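The arithmetic of this solution can be reproduced with a short script. It is a sketch that follows the steps of the solution (series rates added within a channel, 2 oo 3 voting for HITI, two parallel channels for HISS); the variable names are illustrative. Because the script does not round intermediate values, its totals can differ from the slide's figures in the last digit.

```python
from math import exp

TEST_INTERVAL_YEARS = 3 / 52  # three-week proof test interval

def f_max(rate_per_year):
    """Maximum single channel failure probability, 1 - exp(-lambda * T)."""
    return 1.0 - exp(-rate_per_year * TEST_INTERVAL_YEARS)

# HIVE: single voting element
hive = f_max(0.05)

# HITI: each channel is thermocouple + trip amplifier in series, voted 2 oo 3
hiti_channel = f_max(0.5 + 0.1)
hiti_pd = 3 * hiti_channel**2
hiti_ps = 3 * hiti_channel**2

# HISS: each channel is logic switch + solenoid valve + trip valve in series,
# with two channels in parallel (1 oo 2)
hiss_channel = f_max(0.1 + 0.1 + 0.1)
hiss_pd = hiss_channel**2
hiss_ps = 2 * hiss_channel

total_pd = hiti_pd + hive + hiss_pd
total_ps = hiti_ps + hive + hiss_ps

meets_criteria = total_pd <= 8e-3 and total_ps <= 5e-2
```

The fail-safe total is dominated by the HISS term, because either of its two parallel channels can trip the plant on its own.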

Page 103: Reliability

Problems

2) A taxi owner has 20 cars. Records show each car on average breaks down once every 2 years and that this is

reasonably constant. How many breakdown calls will he have per year?

What is the probability of 1 breakdown in a 3 month period?

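Solution.

'Statistical assumptions hold'

$\lambda_c = 1/2$ per year (for 1 car)

$\lambda_F = N\lambda_c = 20 \times 1/2 = 10$ breakdowns per year for the fleet

3 months = 1/4 year

$F(t) = 1 - e^{-\lambda_F t}$

$F(1/4) = 1 - e^{-10 \times 1/4} = 0.92$

OR

92% chance of at least one failure in 3 months.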
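The taxi calculation can be checked numerically. The sketch below assumes a constant fleet failure rate, as the solution does; the Poisson expression for exactly k breakdowns is an added illustration, in case the question is read as exactly one breakdown rather than at least one. Function names are illustrative.

```python
from math import exp, factorial

def fleet_failure_rate(n_cars, rate_per_car):
    """Expected breakdowns per year for the whole fleet."""
    return n_cars * rate_per_car

def prob_at_least_one(rate, t):
    """F(t) = 1 - exp(-rate * t): probability of one or more breakdowns in time t."""
    return 1.0 - exp(-rate * t)

def prob_exactly(k, rate, t):
    """Poisson probability of exactly k breakdowns in time t (added assumption)."""
    mean = rate * t
    return mean**k * exp(-mean) / factorial(k)
```

With 20 cars at 0.5 breakdowns per car per year the fleet rate is 10 per year, and over a quarter of a year the probability of at least one breakdown is about 0.92, while the Poisson probability of exactly one is about 0.21.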

Page 104: Reliability

3) A basic guidance and navigation system for a proposed space probe consists of an Inertial Set, a Canopus Sensor, a Sun Sensor, and a Computer. The reliability for each device is $R_{\text{inertial set}} = 0.95$; $R_{\text{sun sensor}} = 0.90$; $R_{\text{canopus sensor}} = 0.85$; $R_{\text{computer}} = 0.90$. For the system to operate all four subsystems must be operating. Due to design constraints the space probe can only contain one Inertial Set and one Computer. To increase the reliability of the system three Canopus Sensors and two Sun Sensors are used in hot redundancy.

(i). Draw a reliability block diagram for the system without the redundancy

(ii). Draw a reliability block diagram for the system including the redundancy

Calculate the reliability of the system in each case.
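One way to set up the calculation is sketched below. It assumes independent failures, a series arrangement of the four subsystems, and hot (active) redundancy modelled as identical units in parallel; the function names are illustrative and this is not a solution taken from the slides.

```python
from math import prod

R_INERTIAL = 0.95
R_SUN = 0.90
R_CANOPUS = 0.85
R_COMPUTER = 0.90

def parallel(r_single, n_units):
    """Reliability of n identical units in active parallel redundancy."""
    return 1.0 - (1.0 - r_single) ** n_units

def system_reliability(n_canopus=1, n_sun=1):
    """Series system of inertial set, Canopus sensors, Sun sensors and computer."""
    return prod([
        R_INERTIAL,
        parallel(R_CANOPUS, n_canopus),
        parallel(R_SUN, n_sun),
        R_COMPUTER,
    ])
```

Calling `system_reliability()` gives the case without redundancy and `system_reliability(3, 2)` the case with three Canopus Sensors and two Sun Sensors; the single Inertial Set and Computer then limit what redundancy elsewhere can achieve.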

Problems

Page 105: Reliability

(4) In a distillation column a hazardous situation is created if the flow rate of steam to the reboiler goes high; this

causes a high flow rate of vapour up the column producing a high pressure which could cause the vessel to rupture.

The temperature control loop consists of a platinum resistance thermometer (PRT), a transmitter ( which converts

resistance change to a 4-20mA current signal), a controller, a current-to-pneumatic converter and a control valve.

The plant is protected by a pressure trip system consisting of a pressure switch and three-way solenoid valve

located in the air line between the converter and the control valve.

Failure mode and effect analysis of the system shows that:

1. A fail-danger situation F in which the Steam control valve moves fully open, occurs if either Pressure in valve

bonnet increases (F1) or Control valve fails open (F2).

2. F1 occurs if Pressure signal to control valve increases (F3) and Solenoid does not vent air (F4).

3. F3 occurs if PRT short circuit (F5) or Transmitter O/P fails low (F6) or Controller O/P fails high (F7) or I/P

Converter O/P fails high (F8).

4. F4 occurs if Pressure switch fails to open (F9) or Solenoid fails to vent (F10).

(i). Draw a fault tree diagram for the fail-danger failure

(ii). Write down the logic expression for the fail-danger failure F.
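The four statements of the failure mode and effect analysis can be written directly as Boolean logic, with OR for "or" and AND for "and". The sketch below is one way to encode them for checking a hand-drawn fault tree; the function name and argument order are illustrative.

```python
def fail_danger(f2, f5, f6, f7, f8, f9, f10):
    """Top event F built from the basic events of statements 1 to 4."""
    f3 = f5 or f6 or f7 or f8   # statement 3: pressure signal to valve increases
    f4 = f9 or f10              # statement 4: solenoid does not vent air
    f1 = f3 and f4              # statement 2: pressure in valve bonnet increases
    return f1 or f2             # statement 1: steam control valve moves fully open
```

The structure shows that a control loop fault alone (F3) is not dangerous unless the trip system has also failed (F4), whereas the control valve failing open (F2) is a single point failure.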

Problems
