Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure...
-
Upload
clemence-reed -
Category
Documents
-
view
213 -
download
0
Transcript of Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure...
![Page 1: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/1.jpg)
Engineering a Safer World
![Page 2: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/2.jpg)
Traditional Approach to Safety
• Traditionally view safety as a failure problem
– Chain of random, directly related failure events leads to loss
– Establish barriers between events or try to prevent individual component failures
e.g., redundancy, overdesign, safety margins, punishment and training for operators, defense in depth
• Analysis techniques
– Focus on probability of component failures and combinations of component failures
– Where do they get the probabilities?
• Historical failure rates
• Make up numbers for human error and software error or ignore these in the analysis
![Page 3: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/3.jpg)
Chain-of-events example
![Page 4: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/4.jpg)
Limitations of Traditional Approach
• Systems are becoming more complex
– Accidents often result from interactions among components, not just component failures
– Too complex to anticipate all potential interactions • By designers
• By operators
– Indirect and non-linear interactions
– Can no longer exhaustively test and get out design errors
• Omits or oversimplifies important factors
– Human error
– New technology, particularly software
– Culture and management
– Evolution and adaptation
![Page 5: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/5.jpg)
Accident with No Component Failures
![Page 6: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/6.jpg)
Types of Accidents
• Component Failure Accidents
– Single or multiple component failures
– Usually assume random failure
• Component Interaction Accidents
– Arise in interactions among components
– Complexity getting to point where cannot anticipate or guard against all potential interactions
– Exacerbated by introduction of computers and software
![Page 7: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/7.jpg)
Interactive Complexity
• Arises in interactions among system components
– Software allows us to build highly coupled and interactively complex systems
– Coupling causes interdependence
– Increases number of interfaces and potential interactions
• Too complex to anticipate all potential interactions
– By designers
– By operators
• May lead to accidents even when no individual component failures
![Page 8: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/8.jpg)
Non-Linear Complexity
• Cause and effect not related in an obvious way
• Systemic factors in accidents, e.g., safety culture, work environment, production pressures, etc.
– Our safety engineering techniques assume linearity
– Systemic factors affect events in non-linear and indirect ways
![Page 9: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/9.jpg)
Dynamic Complexity
• Related to changes over time
– Systems are not static, but we often assume they are
– Systems migrate toward states of high risk under competitive and financial pressures [Rasmussen]
– Need to control and identify unsafe changes
![Page 10: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/10.jpg)
Software-Related Accidents
• Are usually caused by flawed requirements
– Incomplete or wrong assumptions about operation of controlled system or required operation of computer
– Unhandled controlled-system states and environmental conditions
• Merely trying to get the software “correct” or to make it reliable will not make it safer under these conditions.
![Page 11: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/11.jpg)
Software-Related Accidents (2)
• Software may be highly reliable and “correct” and still be unsafe:
– Correctly implements requirements but specified behavior unsafe from a system perspective.
– Requirements do not specify some particular behavior required for system safety (incomplete)
– Software has unintended (and unsafe) behavior beyond what is specified in requirements.
![Page 12: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/12.jpg)
Safety vs. Correctness
• Safety involves more than simply getting the software “correct”:
Example: altitude switch
1. Signal safety-increasing
Require any of three altimeters report below threshold
2. Signal safety-decreasing
Require all three altimeters to report below threshold
![Page 13: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/13.jpg)
Confusing Safety and Reliability
![Page 14: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/14.jpg)
Systems Thinking
Event-basedThinking
![Page 15: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/15.jpg)
STAMP:System-Theoretic Accident
Model and Processes
Based on Systems Theory (vs. Reliability Theory)
![Page 16: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/16.jpg)
Applying Systems Thinking to Safety
• Accidents involve a complex, dynamic “process”
– Not simply chains of failure events
– Arise in interactions among humans, machines and the environment
• Treat safety as a dynamic control problem
– Safety requires enforcing a set of constraints on system behavior
– Accidents occur when interactions among system components violate those constraints
– Safety becomes a control problem rather than just a reliability problem
![Page 17: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/17.jpg)
Safety as a Dynamic Control Problem
• Examples
– O-ring did not control propellant gas release by sealing gap in field joint of Challenger Space Shuttle
– Software did not adequately control descent speed of Mars Polar Lander
– At Texas City, did not control the level of liquids in the ISOM tower;
– In DWH, did not control the pressure in the well;
– Financial system did not adequately control the use of financial instruments
![Page 18: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/18.jpg)
Safety as a Dynamic Control ProblemSafety as a Dynamic Control Problem (2)
• Most major accidents arise from a slow migration of the entire system toward a state of high-risk
• Need to control and detect this migration
• A change in emphasis:
“prevent failures” ↓
“enforce safety constraints on system behavior”
![Page 19: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/19.jpg)
ExampleSafetyControlStructure
![Page 20: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/20.jpg)
Qi Hommes, 2012
![Page 21: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/21.jpg)
Safety as a Control Problem (3)
• Goal: Design an effective control structure that eliminates or reduces adverse events.
– Need clear definition of expectations, responsibilities, authority, and accountability at all levels of safety control structure
– Entire control structure must together enforce the system safety property (constraints)
• Physical design (inherent safety)
• Operations
• Management
• Social interactions and culture
![Page 22: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/22.jpg)
Controlled Process
ProcessModel
ControlActions Feedback
Role of Process Models in Control
• Controllers use a process model to determine control actions
• Accidents often occur when the process model is incorrect
• Four types of hazardous control actions:• Control commands required for safety
are not given• Unsafe ones are given• Potentially safe commands given too
early, too late• Control stops too soon or applied too
long
Controller
22(Leveson, 2003); (Leveson, 2011)
ControlAlgorithm
![Page 23: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/23.jpg)
STAMP: Theoretical Causality Model
Accident/Event AnalysisCAST
Hazard AnalysisSTPA
System Engineering(e.g., Specification,
Safety-Guided Design, Design Principles)
Specification ToolsSpecTRM
Risk Management
Operations
Management Principles/Organizational Design
Identifying LeadingIndicators
Organizational/CulturalRisk Analysis
Tools
Processes
Regulation
![Page 24: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/24.jpg)
Learning from Events
• CAST: Causal Analysis based on System Theory
• Goal: more complete causal analysis of accidents, incidents, and adverse events
![Page 25: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/25.jpg)
Facts about Accidents
• Almost never have single causes
– “Root cause seduction”
– Accidents are complex processes
• Usually involve flaws in
– Engineered equipment
– Operator behavior
– Management decision making
– Safety culture
– Regulatory oversight
![Page 26: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/26.jpg)
Root Cause Seduction
• Assuming there is a root cause gives us an illusion of control.
– Usually focus on operator error or technical failures
– Ignore systemic and management factors
– Leads to a sophisticated “whack a mole” game
• Fix symptoms but not process that led to those symptoms
• In continual fire-fighting mode
• Having the same accident over and over
![Page 27: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/27.jpg)
Three Levels of Analysis
• What (events)
– e.g., explosion
• Who and how (conditions)
– e.g., bad valve design, operator did not notice something
• Why (systemic factors)
– e.g., production pressures, cost concerns, flaws in design process, flaws in reporting process, etc.
– Why was safety control structure ineffective in preventing the loss?
![Page 28: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/28.jpg)
Hazard Analysis
• “Investigating an accident before it occurs”
• Identify potential causal scenarios and try to eliminate them
• Must be based on some model of how and why accidents occur
• STPA (System-Theoretic Process Analysis)
– Based on STAMP
– Assumes accidents are more complex processes than just chains of component failure events
![Page 29: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/29.jpg)
Goals for an Accident Analysis Technique
• Minimize hindsight bias
• Provide a framework or process to assist in understanding entire accident process and identifying systemic factors
• Get away from blame (“who”) and shift focus to “why” and how to prevent in the future
• Goal is to determine
– Why people behaved the way they did
– Weaknesses in the safety control structure that allowed the loss to occur
![Page 30: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/30.jpg)
STPA Step 2
30
Inadequate Control Algorithm
(Flaws in creation, process changes,
incorrect modification or adaptation)
ControllerProcess Model
(inconsistent, incomplete, or
incorrect)
Control input or external information wrong or missing
ActuatorInadequate operation
Inappropriate, ineffective, or
missing control action
SensorInadequate operation
Inadequate or missing feedback
Feedback Delays
Component failures
Changes over time
Controlled Process
Unidentified or out-of-range disturbance
Controller
Process input missing or wrongProcess output contributes to system hazard
Incorrect or no information provided
Measurement inaccuracies
Feedback delays
Delayed operation
Conflicting control actions
Missing or wrong communication with another controller
Controller
![Page 31: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/31.jpg)
STPA (System-Theoretic Process Analysis)
• A top-down, system engineering technique
• Identifies safety constraints (system and component safety requirements)
• Identifies scenarios leading to violation of safety constraints; use results to design or redesign system to be safer
• Can be used on technical design and organizational design
• Supports a safety-driven design process where
– Hazard analysis influences and shapes early design decisions
– Hazard analysis iterated and refined as design evolves
![Page 32: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/32.jpg)
Steps in STPA
• Establish fundamentals
– Define “accident” for your system
– Define hazards
– Rewrite hazards as constraints on system design
– Draw preliminary (high-level) safety control structure
• Identify potentially unsafe control actions (high-level safety requirements and constraints)
• Determine how each potentially hazardous control action could occur
![Page 33: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/33.jpg)
Steps in STPA
• Establish foundation for analysis
– Define “accident” for your system
– Define hazards
– Rewrite hazards as constraints on system design
– Draw preliminary (high-level) safety control structure
• Step 1: Identify potentially unsafe control actions (high-level safety requirements and constraints) [Step 1 STPA]
• Step 2: Determine how each potentially hazardous control action could occur [Step 2 STPA]
x
![Page 34: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/34.jpg)
Identifying Accidents and Hazards
• Accident (Loss): An undesired or unplanned event that results in a loss, including a loss of human life or human injury, property damage, environmental pollution, mission loss, financial loss, etc.
• Hazard: A system state or set of conditions that together with a worst-case set of environmental conditions, will lead to an accident (loss)
![Page 35: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/35.jpg)
• Accident (Loss): An undesired or unplanned event that results in a loss, including a loss of human life or human injury, property damage, environmental pollution, mission loss, financial loss, etc.
• Hazard: A system state or set of conditions that together with a worst-case set of environmental conditions, will lead to an accident (loss)
System Accident Hazard
ACC Two vehicles collide Inadequate distance between vehicle and one in front or in back
Chemical Plant
People die or are injured due to exposure to chemicals
Chemicals in air after release from plant
Train door controller
Passenger falls out of train Door is open when train starts
Door is open while train moving
![Page 36: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/36.jpg)
Identify High-Level Safety Constraints (Requirements)
Hazard Safety Constraint (Requirement)
Inadequate distance between vehicle and one in front or in back
Vehicles must never violate minimum separation requirements
Chemicals in air after release from plant
Chemicals must never be released inadvertently from plant
Door is open when train starts Train must never start while door is open
Door is open while train moving Train must never open while train is moving
![Page 37: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/37.jpg)
Safety Constraints vs. Safety Requirements
• Design constraints:
– ACC must not violate separation requirements with object ahead
– ACC must not brake too abruptly
• Design requirements
– ACC shall maintain a TBD amount of distance between the vehicle and the object in front when engaged
– ACC shall limit vehicle deceleration to no more than TBC m/s2
![Page 38: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/38.jpg)
In-Class Example
In-Trail Procedure (ITP)
A new passing procedure for oceanic flights
![Page 39: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/39.jpg)
In-Trail Procedure (ITP)In-Trail Procedure (ITP)
• Enables aircraft to achieve FL changes on a more frequent basis.
• Designed for oceanic and remote airspaces not covered by radar.
• Permits climb and descent using new reduced longitudinal separation standards.
• Potential Benefits– Reduced fuel burn and CO2 emissions via more opportunities to reach the optimum
FL or FL with more favorable winds.– Increased safety via more opportunities to leave turbulent FL.
• But standard separation requirements not met during maneuver
![Page 40: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/40.jpg)
ITP Procedure – Step by StepITP Procedure – Step by Step
1. Check that ITP criteria are met.
2. If ITP is possible, request ATC clearance
3. Check that there are no blocking aircraft other than Reference Aircraft in the ITP request.
4. Check that ITP request is applicable (i.e. standard request not sufficient) and compliant with ITP phraseology.
5. Check that ITP criteria are met.
6. If all checks are positive, issue ITP clearance.
Flight Crew Air Traffic Controller
8. When ITP clearance is received, check that ITP criteria are still met.
9. If ITP criteria are still met, accept ITP clearance via CPDLC.
10. Execute ITP clearance without delay.
11. Report when established at the cleared FL.
Involves multiple aircraft, crew, communications
(ADS-B, GPS) , ATC
![Page 41: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/41.jpg)
Accident and Hazard Definition for ITP
Accident: Two aircraft collide
Hazard?:
![Page 42: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/42.jpg)
Batch Reactor In-Class Exercise
• What is the accident?
• What is the system-level hazard (associated with that accident)?
• What is the high-level system safety requirement (safety constraint)?
![Page 43: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/43.jpg)
Steps in STPA
• Establish foundation for analysis
– Define “accident” for your system
– Define hazards
– Rewrite hazards as constraints on system design
– Draw preliminary (high-level) functional control structure
• Step 1: Identify potentially unsafe control actions (high-level safety requirements and constraints)
• Step 2: Determine how each potentially hazardous control action could occur
![Page 44: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/44.jpg)
Draw the Functional Control Structure
• Identify major components and controllers
(HINT: Start at very high level)
• Label control and feedback arrows
• Create the preliminary process models
![Page 45: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/45.jpg)
ITP High-Level Control Structure
• What are the major components and controllers of the system?
![Page 46: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/46.jpg)
ITP High-Level Control Structure (2)
• What commands are sent and feedback provided?
ATC
Clearance to pass (to execute ITP)
Requests
Acknowledgements
Pilot
Aircraft
Execute ITPmaneuver A/C status, position, etc.
![Page 47: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/47.jpg)
High-Level Control High-Level Control Structure for ITPStructure for ITP
![Page 48: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/48.jpg)
Pilot Responsibilities and Process Model
• Responsibilities:– Assess whether ITP appropriate
– Check if ITP criteria are met
– Request ITP
– Receive ITP approval
– Recheck criteria
– Execute flight level change
– Confirm new flight level to ATC
• Process Model– Own ship climb/descend capability
– ADS-B data for nearby aircraft (velocity, position, orientation)
– ITP criteria (speed, distance, relative attitude, similar track, data quality)
– State of ITP request/approval
– etc.
![Page 49: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/49.jpg)
For the Batch Reactor
Draw the functional control structure
![Page 50: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/50.jpg)
What are the Responsibilities of the Software Controller?
• Not the requirements, simply the basic functions to be implemented
![Page 51: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/51.jpg)
Steps in STPA
• Establish foundation for analysis
– Define “accident” for your system
– Define hazards
– Rewrite hazards as constraints on system design
– Draw preliminary (high-level) safety control structure
• Step 1: Identify potentially unsafe control actions (high-level safety requirements and constraints)
• Step 2: Determine how each potentially hazardous control action could occur
x
![Page 52: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/52.jpg)
Four Ways Unsafe Control Can Occur
• A control action required for safety is not provided or is not followed
• An unsafe control action is provided that leads to a hazard
• A potentially safe control action provided too late, too early, or out of sequence
• A safe control action is stopped too soon or applied too long (for a continuous or non-discrete control action)
![Page 53: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/53.jpg)
Four Ways Unsafe Control Can Occur
• A control action required for safety is not provided or is not followed
• An unsafe control action is provided that leads to a hazard
• A potentially safe control action provided too late, too early, or out of sequence
• A safe control action is stopped too soon or applied too long (for a continuous or non-discrete control action)
Control Action
Not providing causes hazard
Providing causes hazard
Too early/too late, wrong order
Stopped too soon/ applied too long
![Page 54: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/54.jpg)
Step 1 for Pilot
Control Action
Not providing causes hazard
Providing causes hazard
Too early/too late, wrong order
Stopped too soon/ applied too long
Execute
ITP
Maneuver
Pilot
Aircraft
Execute ITPmaneuver A/C status, position, etc.
![Page 55: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/55.jpg)
Potentially Hazardous Control Actions by the Flight Crew
Control Action
Not Providing Causes Hazard
Providing Causes Hazard
Wrong Timing/OrderCauses Hazard
Stopped Too Soon/Applied Too Long
Execute ITP
ITP executed when not approved
ITP executed when ITP criteria are not satisfied
ITP executed with incorrect climb rate, final altitude, etc
ITP executed too soon before approval
ITP executed too late after reassessment
ITP aircraft levels off above requested FL
ITP aircraft levels off below requested FL
Abnormal Termination of ITP
FC continues with maneuver in dangerous situation
FC aborts unnecessarily
FC does not follow regional contingency procedures while aborting
![Page 56: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/56.jpg)
High Level Constraints on Flight Crew
• The flight crew must not execute the ITP when it has not been approved by ATC.
• The flight crew must not execute an ITP when the ITP criteria are not satisfied.
• The flight crew must execute the ITP with correct climb rate, flight levels, Mach number, and other associated performance criteria.
• The flight crew must not continue the ITP maneuver when it would be dangerous to do so.
• The flight crew must not abort the ITP unnecessarily. (Rationale: An abort may violate separation minimums)
• When performing an abort, the flight crew must follow regional contingency procedures.
• The flight crew must not execute the ITP before approval by ATC.
• The flight crew must execute the ITP immediately when approved unless it would be dangerous to do so.
• The crew shall be given positive notification of arrival at the requested FL
![Page 57: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/57.jpg)
Potentially Hazardous Control Actions for ATC
Control Action
Not Providing Causes Hazard
Providing Causes Hazard
Wrong Timing/Order Causes Hazard
Stopped Too Soon or Applied Too Long Causes Hazard
Approve ITP request
Approval given when criteria are not met
Approval given to incorrect aircraft
Approval given too early
Approval given too late
Deny ITP request
Abnormal Termination Instruction
Aircraft should abort but instruction not given
Abort instruction given when abort is not necessary
Abort instruction given too late
![Page 58: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/58.jpg)
High-Level Constraints on ATC
• Approval of an ITP request must be given only when the ITP criteria are met.
• Approval must be given to the requesting aircraft only.
• Approval must not be given too early or too late [needs to be clarified as to the actual time limits]
• An abnormal termination instruction must be given when continuing the ITP would be unsafe.
• An abnormal termination instruction must not be given when it is not required to maintain safety and would result in a loss of separation.
• An abnormal termination instruction must be given immediately if an abort is required.
![Page 59: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/59.jpg)
Hazard: Catalyst in reactor without reflux condenser operating (water flowing through it)
Control Action
Not providing causes hazard
Providing causes hazard
Too early/too late, wrong order
Stopped too soon/ applied too long
Create this table for the computer controlling the valves loop in your control diagram
For the Batch Reactor:
![Page 60: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/60.jpg)
What are the safety requirements (constraints) on the software controller given this table?
![Page 61: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/61.jpg)
Steps in STPA
• Establish foundation for analysis
– Define “accident” for your system
– Define hazards
– Rewrite hazards as constraints on system design
– Draw preliminary (high-level) safety control structure
• Step 1: Identify potentially unsafe control actions (high-level safety requirements and constraints)
• Step 2: Determine how each potentially hazardous control action could occur
x
![Page 62: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/62.jpg)
STPA Step 2
62
Inadequate Control Algorithm
(Flaws in creation, process changes,
incorrect modification or adaptation)
Controller
Process Model(inconsistent,
incomplete, or incorrect)
Control input or external information wrong or missing
ActuatorInadequate operation
Inappropriate, ineffective, or
missing control action
SensorInadequate operation
Inadequate or missing feedback
Feedback Delays
Component failures
Changes over time
Controlled Process
Unidentified or out-of-range disturbance
Controller
Process input missing or wrongProcess output contributes to system hazard
Incorrect or no information provided
Measurement inaccuracies
Feedback delays
Delayed operation
Conflicting control actions
Missing or wrong communication with another controller Controller
![Page 63: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/63.jpg)
ExampleSTPA Results
![Page 64: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/64.jpg)
Example Causal Analysis
• Unsafe control action: Pilot executes maneuver when criteria not met
• Possible Causes?
![Page 65: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/65.jpg)
Exercise Continued (Batch Reactor)
• STEP 2: Identify some causes of the hazardous control action: Open catalyst valve when water valve not open
– HINT: Consider how controller’s process model could identify that water valve is open when it is not.
• What are some causes for a required control action (e.g., open water valve) being given by the software but not executed.
• What design features (controls) might you use to protect the system from the scenarios you found?
![Page 66: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/66.jpg)
Results
• What did you find?
• What “controls” did you add?
• What about the actual scenario that occurred in the accident?
![Page 67: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/67.jpg)
Four Ways Unsafe Control Can Occur
• A control action required for safety is not provided or is not followed
• An unsafe control action is provided that leads to a hazard
• A potentially safe control action provided too late, too early, or out of sequence
• A safe control action is stopped too soon or applied too long (for a continuous or non-discrete control action)
![Page 68: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/68.jpg)
Inadequate Control Algorithm
(Flaws in creation, process changes,
incorrect modification or
adaptation)
ControllerProcess Model
(inconsistent, incomplete, or incorrect)
Control input or external information wrong or missing
ActuatorInadequate operation
SensorInadequate operation
Inadequate or missing feedback
Feedback Delays
Component failures
Changes over time
Controlled Process
Unidentified or out-of-range disturbance
Controller
Process input missing or wrongProcess output contributes to system hazard
Incorrect or no information provided
Measurement inaccuracies
Feedback delays
Delayed operation
Conflicting control actions
Missing or wrong communication with another controller Controller
Safe controlaction providedbut not followed
![Page 69: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/69.jpg)
Exercise Continued (Batch Reactor)
• STEP 2: Identify some causes of the hazardous control action: Open catalyst valve when water valve not open
– HINT: Consider how controller’s process model could identify that water valve is open when it is not.
• What are some causes for a required control action (e.g., open water valve) being given by the software but not executed.
• What design features (controls) might you use to protect the system from the scenarios you found?
![Page 70: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/70.jpg)
Results
• What did you find?
• What “controls” did you add?
• What about the actual scenario that occurred in the accident?
![Page 71: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/71.jpg)
Next Steps
• Use causal analysis to identify detailed safety design requirements and design options
• If desired, iterate top-down
– Refine into more detailed control structures
– Refine safety constraints (requirements) into more detailed requirements for each component
• Use STPA results to assure safety when use in different environment(s) than assumed for original certification
• Use STPA results to evaluate other types of changes to system and environment
![Page 72: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/72.jpg)
Example Controls for Causal Scenarios
• Scenario 1 - Operator was expecting patient to have been positioned, but table positioning was delayed compared to plan (e.g. because of delays in patient preparation or patient transfer to treatment area; because of unexpected delays in beam availability or technical issues being processed by other personnel without proper communication with the operator).
• Controls:
– Provide operator with direct visual feedback to the gantry coupling point, and require check that patient has been positioned before starting treatment (M1).
– Provide a physical interlock that prevents beam-on unless table positioned according to plan
![Page 73: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/73.jpg)
Example Controls for Causal Scenarios (2)
• Scenario 2 - Operator is asked to turn the beam on outside of a treatment sequence (e.g. because the design team wants to troubleshoot a problem) but inadvertently starts treatment and does not realize that the facility proceeds with reading the treatment plan.
• Controls:
– Reduce the likelihood that non-treatment activities have access to treatment-related input by creating a non-treatment mode to be used for QA and experiments, during which facility does not read treatment plans that may have been previously been loaded (M2);
– Make procedures (including button design if pushing a button is what starts treatment) to start treatment sufficiently different from non-treatment beam on procedures that the confusion is unlikely.
![Page 74: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/74.jpg)
Additional Things Not Covered Today
• Rigorous method for performing Step 1
– Much of it can be automated or assistance provided
– Generate executable and analyzable model-based safety requirements
• Multiple controller analysis
![Page 75: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/75.jpg)
Organizational Aspects of Risk
• Examples so far focus on physical level
• Also requirements and control responsibilities at management level to satisfy system safety requirements
• Can identify unsafe control actions and causal scenarios at higher levels of the control structure (perform a risk analysis) and build in controls to prevent them
• Behavior and control structures change over time
– Prevent migration to higher levels of risk
– Detect when occurs
![Page 76: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/76.jpg)
Organizational Aspects of Risk (2)
• Can look at non-safety risks, including project risks, budget risks, schedule risks and tradeoffs
• Goal may be to evaluate an existing control structure or to create a new one
• Creating leading indicators
• Current or past examples:
– NASA safety management after Columbia
– Radiation therapy at UCSD and UCLA hospitals (and maybe Boston Mass General)
– CO2 capture, transport, and storage (Samadi, Ecole des Mines)
– Product Development Process (Goerges, Cummins Engine)
![Page 77: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/77.jpg)
Other Topics Covered by STAMP
• Operations
• Managing safety-critical projects
• Integrating safety into system engineering
– Designing safety into systems from the beginning
– Specification to support maintenance and evolution
![Page 78: Engineering a Safer World. Traditional Approach to Safety Traditionally view safety as a failure problem –Chain of random, directly related failure events.](https://reader038.fdocuments.us/reader038/viewer/2022110400/56649ddb5503460f94ad2fcf/html5/thumbnails/78.jpg)
Current Projects
• Human factors engineering– Design to reduce human error
– Integrating sophisticated human factors into hazard analysis
• Leading Indicators
• Cyber Warfare and Nuclear Security
• Organizational, Managerial, and Project Risk Analysis
• More applications: high-speed rail, autos, medicine, aircraft, NextGen (TBO)
• Application to Financial Systems
• Other Emergent System Properties
• Tools and formal assistance with analysis