Idaho RISE System Reliability and Designing to Reduce Failure ENGR204 19 Sept 2005.

Reliability Analysis

Let R = probability system (or instrument) will operate without failure for time t (Success Probability)

R = e^(-λt)

Note: λ = failure rate (failures/second), units of sec^-1

λ = τ^(-1), where τ = average seconds per failure

Failure Probability = 1 - R
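As a minimal sketch of the formulas above, the following Python computes the success and failure probabilities for a constant failure rate. The function names and the example numbers (10,000 seconds per failure on average, a 1,000 second mission) are illustrative assumptions, not values from the lecture.

```python
import math

def reliability(failure_rate, t):
    """Success probability R = exp(-failure_rate * t) for a constant failure rate."""
    return math.exp(-failure_rate * t)

def failure_probability(failure_rate, t):
    """Failure probability 1 - R over the same time t."""
    return 1.0 - reliability(failure_rate, t)

# Hypothetical numbers, chosen only for illustration.
mean_seconds_per_failure = 10_000.0
failure_rate = 1.0 / mean_seconds_per_failure  # failures per second
mission_time = 1_000.0                         # seconds

R = reliability(failure_rate, mission_time)
F = failure_probability(failure_rate, mission_time)
print(f"R = {R:.4f}, failure probability = {F:.4f}")
```

Note that R depends only on the product of failure rate and time, so doubling the mission time has the same effect as doubling the failure rate.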

If a system comprises n nonredundant systems, all equally essential for mission success, then the total system reliability is

Rs = R1 * R2 * R3 * ... * Rn

= e^(-λ1 t) * e^(-λ2 t) * e^(-λ3 t) * ... * e^(-λn t)

where λi is the failure rate of the ith system
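The series formula can be sketched in Python as follows. Because the product of exponentials equals the exponential of the summed exponents, the series system behaves like a single system whose failure rate is the sum of the individual rates. The function name and the three failure rates are hypothetical values chosen for illustration.

```python
import math

def series_reliability(failure_rates, t):
    """Reliability of nonredundant systems in series, each with R_i = exp(-rate_i * t)."""
    product = 1.0
    for rate in failure_rates:
        product *= math.exp(-rate * t)
    return product

# Hypothetical failure rates (failures/second) and mission time (seconds).
rates = [1e-5, 2e-5, 5e-5]
t = 3600.0

Rs = series_reliability(rates, t)

# Equivalent form: one exponential with the summed failure rate.
Rs_from_sum = math.exp(-sum(rates) * t)

print(f"Rs = {Rs:.4f}, from summed rate = {Rs_from_sum:.4f}")
```

The series reliability is always lower than that of the least reliable member, which is one argument for keeping the number of essential systems small.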

If a system comprises n redundant systems in parallel, each of which can satisfy the mission requirements individually, then the system parallel (redundant) reliability is

Rp = 1 - (1 - R1) * (1 - R2) * (1 - R3) * ... * (1 - Rn)

= 1 - F1 * F2 * F3 * ... * Fn

where Fi = (1 - Ri) is the failure probability of the ith system
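A short Python sketch of the parallel formula is given below. It assumes, as the formula itself does, that the redundant systems fail independently. The function name and the reliability value of 0.9 per system are illustrative assumptions.

```python
def parallel_reliability(reliabilities):
    """Reliability of redundant systems in parallel: 1 minus the product of failure probabilities."""
    failure_product = 1.0
    for r in reliabilities:
        failure_product *= (1.0 - r)
    return 1.0 - failure_product

# Hypothetical case: three redundant systems, each with reliability 0.9.
Rp = parallel_reliability([0.9, 0.9, 0.9])
print(f"Rp = {Rp:.4f}")
```

With these numbers the system fails only if all three units fail, so the failure probability drops from 0.1 for a single unit to 0.001 for the set.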

Series Reliability

[Block diagram: A, B, and C connected in series]

Rtot = RA * RB * RC

Full Redundancy

[Block diagram: A, B, and C connected in parallel]

Rtot = 1 - (1 - RA) * (1 - RB) * (1 - RC)

Partial Redundancy (A & B are redundant, C is essential)

[Block diagram: A and B in parallel, with the pair in series with C]

Rtot = RC * [1 - (1 - RA) * (1 - RB)]

Non-Identical Full Redundancy (A & B are essential, C is redundant)

[Block diagram: A and B in series, with the pair in parallel with C]

Rtot = 1 - (1 - RA * RB) * (1 - RC)
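The four block-diagram cases (series, full redundancy, partial redundancy, and non-identical full redundancy) can all be built from two small helpers, one for series and one for parallel combination. The sketch below is one way to do that in Python. The helper names and the values of RA, RB, and RC are hypothetical, and independent failures are assumed throughout.

```python
def series(*reliabilities):
    """All blocks are essential: multiply the reliabilities."""
    product = 1.0
    for r in reliabilities:
        product *= r
    return product

def parallel(*reliabilities):
    """Any one block suffices: 1 minus the product of failure probabilities."""
    failure_product = 1.0
    for r in reliabilities:
        failure_product *= (1.0 - r)
    return 1.0 - failure_product

# Hypothetical block reliabilities.
RA, RB, RC = 0.90, 0.90, 0.95

series_total = series(RA, RB, RC)                 # A, B, C all essential
full_redundancy = parallel(RA, RB, RC)            # A, B, C all redundant
partial_redundancy = series(RC, parallel(RA, RB)) # A, B redundant; C essential
non_identical = parallel(series(RA, RB), RC)      # A, B essential; C redundant

for name, value in [
    ("Series", series_total),
    ("Full redundancy", full_redundancy),
    ("Partial redundancy", partial_redundancy),
    ("Non-identical full redundancy", non_identical),
]:
    print(f"{name}: {value:.4f}")
```

Nesting the helpers mirrors the way the diagram is drawn: each parallel or series group collapses to a single equivalent block, which is then combined with its neighbors.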

Designing for Reliability

1. Keep It Simple!

2. Design Margin - Assure adequate strength of all mechanical and electrical parts, including allowance for unusual loads due to environmental extremes. This includes environmental shielding.

3. Redundancy - Provide alternative means of accomplishing required functions where design for excess strength is not suitable / reasonable. This includes most electronics.

Notes on Redundancy

Same Design Redundancy: two or more identical components or systems

• Switching allows only one system to be active

• Outputs can be combined so switching is not necessary (e.g. power distribution systems)

• Voting for combining outputs of redundant units. Requires three or more units (e.g. accelerometer activation of critical sequence)

• Offers high protection against random failures

• Not effective against design deficiencies
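To illustrate the voting bullet above, here is a minimal Python sketch of a majority voter over redundant trigger signals, together with the reliability of a two-out-of-three arrangement. The function names and the example values are assumptions made for illustration. The reliability expression additionally assumes three identical units that fail independently and a voter that never fails.

```python
def majority_vote(signals):
    """Return True when more than half of the redundant units report True."""
    votes = sum(1 for s in signals if s)
    return votes > len(signals) / 2

def two_of_three_reliability(r):
    """Probability that at least two of three identical, independent units work.

    Sum of 'all three work' (r^3) and 'exactly two work' (3 * r^2 * (1 - r)),
    which simplifies to 3r^2 - 2r^3. Assumes a perfect voter.
    """
    return 3 * r**2 - 2 * r**3

# Hypothetical case: three accelerometers, one of which fails to trigger.
readings = [True, True, False]
print("Sequence activated:", majority_vote(readings))

# Hypothetical unit reliability of 0.9.
print(f"2-of-3 reliability: {two_of_three_reliability(0.9):.3f}")
```

A single faulty unit is outvoted in either direction: it can neither block a valid activation nor cause a false one on its own.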

Notes on Redundancy, cont.

Diverse Design Redundancy: utilize two or more systems of different design

• High protection against failures due to design deficiencies

• Can offer lower cost if backup is “lifeboat” with lesser accuracy and functionality, but still adequate for minimum mission needs

Notes on Redundancy, cont.

Functional (Analytic) Redundancy: addressing requirements by different techniques. For example, determination of spacecraft attitude by gyroscope or by star tracker.

• Avoids cost and weight penalties of physical redundancy

• Provides protection against design faults

• Disadvantage: backup usually provides reduced performance.

Temporal Redundancy: Repetition of unsuccessful operation (i.e., retry after failure)
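Temporal redundancy can be sketched as a simple retry loop. The Python below is a hypothetical illustration: the `retry` helper, its `max_attempts` parameter, and the simulated operation that fails twice before succeeding are all assumptions, not part of the lecture.

```python
def retry(operation, max_attempts=3):
    """Call operation() until it returns True or the attempts run out.

    Returns the number of attempts used on success, or None if every
    attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        if operation():
            return attempt
    return None

# Simulated operation (hypothetical): fails twice, then succeeds.
outcomes = iter([False, False, True])
attempts_used = retry(lambda: next(outcomes), max_attempts=5)
print("Succeeded after attempts:", attempts_used)

# An operation that never succeeds exhausts its attempts.
print("Always failing:", retry(lambda: False, max_attempts=2))
```

Retrying helps only when the failure is transient. A persistent fault fails on every attempt, so a limit on the number of attempts is needed.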

Apollo Design Principles

The primary consideration governing the design of the Apollo system was that, if it could be made so, no single failure should cause the loss of any crewmember, prevent the successful continuation of the mission, or, in the event of a second failure in the same area, prevent a successful abort of the mission.

To implement this policy, the following specific principles were established:

1. Use established technology

2. Stress hardware reliability

3. Comply with safety standards

4. Minimize in-flight maintenance and testing for failure isolation

5. Simplify operations

6. Minimize interfaces

7. Make maximum use of experience gained from previous manned-space missions.

Reference: NASA SP-287

Qualification and Acceptance Testing

Assume

- Engineering data is complete and exact

- Engineering data completely controls manufacture

- All items manufactured to same engineering data are identical.

Therefore

- The results of Qualification Tests for one component are considered valid for all components.

- If a representative component passes a sequence of qualification tests, all other components built to the same engineering specifications should also pass.

Design is said to be “Qualified”

Acceptance Testing is less severe, and is for the purpose of certifying workmanship

Failure Mode Definitions

Catastrophic failure – complete loss of mission, including flight hardware. (Examples: Loss of GPS; Parachute failure)

Major failure – significant loss of mission primary goals; significant degradation expected. (Example: Power supply failure)

Minor failure – minor loss of data or ability to achieve mission goals; system failure that is overcome by other flight systems. (Example: loss of primary temp sensor, but temp data still retrieved from backup sensor; Loss of single GPS)

Negligible – negligible impact on achieving mission goals.

Team Assignment

Consider Catastrophic and Single Point failure possibilities.

1. Initiate a list of potential Catastrophic, Major, and Minor Failures.

2. How can Catastrophic and Major failure possibilities be prevented? Consider simplifying design, redundancy, and design margins.

3. Which failures are Single Point (i.e., if a failure occurs there is no viable means of recovery)?

Example of Catastrophic Single Point Failure: heat shield on atmospheric entry probe
