Software Safety Engineering (S2E) Program Status Dan Fitch March 7, 2001.
Software Safety Engineering (S2E) Program Status
Dan Fitch
March 7, 2001
Software Safety Program - Overview
General Safety Concepts - WHY
Software Safety and CLCS - HOW
• Known Hazards
• Designing for Safety
• Safety & Reliability Thread
Current Status
Software Safety – What is it?
Diagram labels: repeated "Limit" markers; Anticipate, Detect, Control, Mitigate; Rate/Slope and Absolute Value; Prevent, Limit Damage, Return to Safe State
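The two kinds of limit named on this slide (absolute value and rate/slope) can be sketched roughly as follows. This is only an illustration under stated assumptions: the class `LimitCheck`, its field names, and the returned labels are hypothetical and are not taken from CLCS.

```python
from dataclasses import dataclass


@dataclass
class LimitCheck:
    abs_limit: float   # hypothetical absolute-value limit (same units as the reading)
    rate_limit: float  # hypothetical rate/slope limit (units per second)

    def evaluate(self, prev: float, curr: float, dt: float) -> str:
        """Classify one new sample against both limits."""
        if dt <= 0:
            raise ValueError("dt must be positive")
        if curr >= self.abs_limit:
            return "detected"      # absolute limit already violated
        if (curr - prev) / dt >= self.rate_limit:
            return "anticipated"   # still in range, but rising too fast
        return "ok"
```

In this sketch a rate violation is reported before the absolute limit is reached, which is one way to read "Anticipate" versus "Detect"; a real system would follow either result with its own control or mitigation action.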
Software Safety – What is it?
Definitions
• Functionally-critical: mission completion
• Safety-critical: Humans = Life & Limb; Hardware = $10^6
Some set theory
• Input versus output
Some Theory…
Diagram labels: Set of Inputs; Set of Outputs; Unknowns; Known Safe; Known Unsafe; Assumed Safe
Sources: Normal Operation, Hardware Failures, Human Intervention, Models/Simulators
Software Safety – Why do it?
Direction:
• DoD: Mil-Std-882D, DoD-Std-2167
• NASA: NSTS-07700, NSS-8719.13, NASA-GB-1740.13, NSS-22206, NSS-22254, Direction from Dan Goldin
• CLCS: 84K-00055, KDP-P-2901
Software Safety – Why do it?
Objective: Identify & Mitigate Risk
Known Fault Scenarios – by requirements, analyses & test
Possible Unknowns – by design approach & further test
“Knowns”
Hardware fault-driven scenarios
Legacy of hardware failure data available from the 1970s
Hardware-driven hazards
• May be analyzed – the SSA
• May be tested – specific fault injection
Identifies Risk & Yields Design Changes – Issues/ESRs
The Safety Case – Summary of Risk Findings
“Unknowns”
“Stuff” Happens
Software doesn’t fail – It just doesn’t do what we thought it would
Hardware and some functions (e.g., seeds & races) cause most random errors
Specification & Coding errors = Prime Cause
• 90% of errors are in the specifications
• C++ and Java are inherently powerful, but dangerous
Ferengi Software Safety Rule #76
If it "touches*" hardware that can impact the safety of people or equipment, an SSA is absolutely necessary.
*(i.e., controls, monitors, or mitigates the risk of using)
SSA - What and When
Assessment of risk factors due to software
• Hardware Hazards
• SFMEA and SFTA
• KDP-P-2901
Schedule:
• 30 days before the first interaction with Flight Hardware
• In time for 5A/B Testing
• Presented at TRR/ORR
System Safety Analysis
Diagram, built up step by step across six consecutive slides. Labels in the final state:
• Development phases: Conceptual Design, Detail Design, Code Development
• Labels along the timeline: IPT/DP-1, SRS/DP-2, DDS/ODS/DP-3
• Testing: System Test, Val/Ver Test, 5A/B (With Hdwr); 3A/B, 4A/B
• Readiness Reviews: TRR/ORR
• KDP-P-2901 SSA Process: PHA, FTA/FMEA, Risk Assessment, SSA Report
• Other labels: S-C Matrix, Hazards, Issues, CHAWS*, Risk, CM-Driven Changes
*CHAWS = CLCS Hazard Analysis Worksheet
Software Fault Tree Analysis
• Works backward from the fault to its root causes
• Uses design details of the entire system
• Leads to better understanding of causes and their prevention
• Unknown fault events not considered
Fault Tree Analysis
Diagram labels:
• Top Event: Fill Valve not closed
• Human did not notice pressure
• S/W did not react to over pressure
• S/W did not anticipate rapid pressure rise
• Other Root Cause
Legend: Basic Fault Events; Intermediate Events; Causal Relationship (AND)
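A fault tree like this one can be evaluated mechanically once its gates are known. The sketch below is a minimal illustration, not the CLCS analysis: the `Event` class is hypothetical, and because the gate layout is not recoverable from the slide text, the placement of the AND gate at the top and an OR gate beneath "S/W did not react to over pressure" is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Event:
    name: str
    gate: str = "BASIC"  # "AND", "OR", or "BASIC"
    children: List["Event"] = field(default_factory=list)

    def occurs(self, basic_states: Dict[str, bool]) -> bool:
        """Return True if this event occurs given the states of the basic events."""
        if self.gate == "BASIC":
            return basic_states.get(self.name, False)
        results = [child.occurs(basic_states) for child in self.children]
        return all(results) if self.gate == "AND" else any(results)


def build_fill_valve_tree() -> Event:
    # Assumed structure: top event needs BOTH the human miss and the S/W miss.
    sw_no_react = Event("S/W did not react to over pressure", "OR", [
        Event("S/W did not anticipate rapid pressure rise"),
        Event("Other root cause"),
    ])
    return Event("Fill valve not closed", "AND", [
        Event("Human did not notice pressure"),
        sw_no_react,
    ])
```

Under these assumptions, `build_fill_valve_tree().occurs({...})` is true only when the operator misses the pressure and at least one software cause is present, which shows how working backward from the top event exposes the combinations that matter.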
Analysis & CLCS Architecture
Diagram labels:
• Layers: Hardware Safing, System S/W, Sys Srvcs, Apps Srvcs, Applications
• Hazardous Event; Detection & Anticipation; Control & Mitigation; Remaining Risk
The Software FMEA
Predicted hardware failures followed to their conclusion through the software
• What can go wrong?
• What happens when it does?
Must know system failures up front
Won’t prevent the unexpected
CLCS
Spiral Development Cultural Changes
Failure of software Test
SSA – Traditional Approach
Failure Modes & Effects Analysis
Fault Tree Analysis
Traditional Development
• All or most code available
• A lot known about the system
• Too late…
SSA - An Iterative Process
Safety Criticality Assessment
Engineering Design Changes
Failure Modes & Effects Analysis
Fault Tree Analysis
Spiral Development
SSA - Where
S&MA will perform a Software Safety Analysis (SSA) for each Delivery and every location; i.e., as we step up to each new drop.
After the initial SSA, an update of the analysis and a new SSA report will be done for each modification to the safety-critical software.
SSA - Planning
On a PERT chart, the SSA preparation activity will begin during the preparation of the design specifications and have a finish-to-finish relationship with the validation/verification (4A/B) testing.
Timeline: Design Begin … Val/Ver Test
Activities shown: PHA, FTA, FMEA, Risk Assessment, SSA Report
Ferengi Software Safety Rule #304
The SSA isn’t enough.
CLCS
Spiral Development Cultural Changes
Failure of software Test
Paradigms
Software Failures:
“Software does not fail - it just does not perform as intended”
Dr Nancy Leveson, MIT
Paradigms
Design and test for functionality:
Also specify what the system should not do.
Then test it.
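A "should not" requirement can be written down and tested alongside the functional one. The sketch below assumes a hypothetical fill-valve command function and a hypothetical pressure limit; neither comes from the presentation.

```python
def command_fill_valve(request_open: bool, pressure: float, max_pressure: float) -> bool:
    """Return True if the valve is commanded open."""
    # Functional requirement: open the valve when requested.
    # Negative requirement: never open it at or above max_pressure.
    return request_open and pressure < max_pressure


def violations(max_pressure: float, pressures) -> list:
    """Return every pressure at which the valve would open despite the must-not rule."""
    return [p for p in pressures
            if p >= max_pressure and command_fill_valve(True, p, max_pressure)]
```

A functional test only checks that the valve opens on request at a normal pressure; the negative test sweeps pressures at and beyond the limit and expects `violations(...)` to come back empty.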
Some Theory… 2nd Look
Diagram labels: Set of Inputs; Set of Outputs; Unknowns; Known Safe; Known Unsafe; Assumed Safe; Fault Injection (added known)
Sources: Normal Operation, Hardware Failures, Human Intervention, Models/Simulators
Design for Safety
“Program and Project Responsibilities”
Dan Goldin message:
• Safety is more than FMEA and FTA
• Safety must be designed in at the earliest
Existing Specifications
• Must include safety
• Methods & techniques for mitigation of hazards
• Requirements – Traceable and Testable
Initiatives
Dan Goldin: “Design for Safety”
• Smart Practices applied early to designs
• Early engineering changes are cheaper
• Provide draft guidance for design of safety-critical software
Process changes
• Design Guidelines – NASA-GB-1740.13
• Peer reviews – enhanced checklist
• Test development – Fault Injection for Robustness
• Works to prevent unforeseen fault scenarios
Objectives
Known fault scenarios
• Analysis
• Redesign
• Test – functionality and robustness
Unknowns
• Design them out of the system
• Test – fault injection
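Fault injection for robustness can be sketched as a small campaign that feeds deliberately bad readings into a monitor and records any that fail to drive it to a safe state. Everything named here (`next_state`, `inject_faults`, `run_campaign`, the state labels, and the fault set) is hypothetical and chosen only for illustration.

```python
import math

SAFE_STATE = "SAFE"
RUN_STATE = "RUN"


def next_state(reading: float, low: float, high: float) -> str:
    """Hypothetical monitor: stay in RUN only for a finite reading inside [low, high]."""
    if not math.isfinite(reading):
        return SAFE_STATE
    if reading < low or reading > high:
        return SAFE_STATE
    return RUN_STATE


def inject_faults(nominal: float):
    """Yield (label, faulty_reading) pairs; a small, illustrative fault set."""
    yield "dropout", float("nan")
    yield "spike", float("inf")
    yield "negative", -abs(nominal) - 1.0
    yield "overrange", abs(nominal) * 1e6 + 1.0


def run_campaign(nominal: float, low: float, high: float) -> list:
    """Return labels of injected faults that did NOT lead to the safe state."""
    return [label for label, value in inject_faults(nominal)
            if next_state(value, low, high) != SAFE_STATE]
```

An empty result from `run_campaign` means every injected fault was handled; any label returned points at an input the design silently assumed to be safe.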
S/W Safety – Where we are.
Safety-Critical software identified & in engineering review
Software Safety Integration Team formed
Software FTA/FMEA in work
• Will be recurring due to spiral development
Design for Safety concepts being integrated
Safety & Reliability Thread introduced
Post-SSA Analysis Tools being procured
S/W Safety – What’s Next?
Today
• “Design for Safety” and “Known Fault Analyses”
Tomorrow
• Recursive and bi-directional analyses
• Reliability predictions, Markov, Numerical Integration, Weibull analysis techniques
• Probabilistic fault injection techniques
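As a small illustration of the Weibull technique named above, the standard two-parameter Weibull reliability function is R(t) = exp(-(t/scale)^shape). The function name and parameter names below are illustrative, and nothing here reflects actual CLCS data or the tools being procured.

```python
import math


def weibull_reliability(t: float, shape: float, scale: float) -> float:
    """Probability of surviving to time t under a two-parameter Weibull model."""
    if t < 0 or shape <= 0 or scale <= 0:
        raise ValueError("t must be >= 0; shape and scale must be > 0")
    return math.exp(-((t / scale) ** shape))
```

With `shape` equal to 1 the model reduces to the constant-failure-rate exponential case; values above 1 model wear-out and values below 1 model early-life failures.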
Summary
Life on the Leading Edge
Probably the “Largest real-time safety-critical control system on the planet”
Safety is our #1 core value
We are on front and center stage – The NASA team is watching