Design Considerations in Safety Critical Systems
Presented by Remus Tumac
What is a safety-critical system?
A safety-critical system is a system whose failure or malfunctioning may result in the loss of human life or serious injury to people.
Safety-critical systems are all around us
● Infrastructure
○ Emergency services dispatch
○ Water and wastewater systems
○ Transport (railway, automotive, aviation, spaceflight)
● Medicine
○ Mechanical ventilation systems
○ Robotic surgery machines
● Nuclear Engineering
● Recreation
○ Amusement Parks
Safety is not Reliability
Reliability (systems without safety impact)
Safety (systems with a fail-safe state)
Systems without a fail-safe state
Most safety-critical systems are composed of:
1. Sensors gathering data
2. Software to process the data
Garbage in, garbage out
No matter how well it is implemented, a system cannot produce valid output from invalid input.
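One common defence is to check sensor data before the software processes it. The sketch below is a minimal illustration, assuming a temperature sensor: the function names `validate_reading` and `average_valid` and the range limits are hypothetical and are not taken from the slides.

```python
import math

# Hypothetical plausibility limits for a temperature sensor, in degrees Celsius.
MIN_TEMP_C = -40.0
MAX_TEMP_C = 125.0


def validate_reading(value):
    """Return the reading as a float if it is plausible, otherwise raise ValueError."""
    reading = float(value)
    if math.isnan(reading) or math.isinf(reading):
        raise ValueError("reading is not a finite number")
    if not MIN_TEMP_C <= reading <= MAX_TEMP_C:
        raise ValueError(f"reading {reading} is outside the plausible range")
    return reading


def average_valid(readings):
    """Average only the readings that pass validation; fail loudly if none do."""
    valid = []
    for raw in readings:
        try:
            valid.append(validate_reading(raw))
        except ValueError:
            continue  # drop the invalid sample instead of processing it
    if not valid:
        raise RuntimeError("no valid sensor data available")
    return sum(valid) / len(valid)
```

Raising an error when no valid sample remains lets the caller move to a safe state rather than act on garbage.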
Memory protection
● Threads sharing the same memory space could potentially corrupt each other’s code, data, or stack segment.
● A misbehaving thread could bring down an entire system.
● For safety-critical systems, process-based real-time operating systems are preferred, because each process runs in its own protected address space.
Kernel protection
● A bad system call should not be able to take down the kernel.
● The kernel should use opaque handles for kernel objects rather than exposing the objects themselves.
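The idea of opaque handles can be sketched as a toy model. The class `Kernel`, its methods, and the error code `-1` are hypothetical names chosen for illustration, not the API of any real operating system.

```python
import itertools


class Kernel:
    """Toy kernel that hands out opaque integer handles instead of object references."""

    def __init__(self):
        self._objects = {}                      # handle -> kernel object, private to the kernel
        self._next_handle = itertools.count(1)

    def create_semaphore(self, count):
        handle = next(self._next_handle)
        self._objects[handle] = {"type": "semaphore", "count": count}
        return handle                           # the caller only ever sees this integer

    def semaphore_post(self, handle):
        """Validate the handle first, so a bad call returns an error instead of corrupting state."""
        obj = self._objects.get(handle) if isinstance(handle, int) else None
        if obj is None or obj["type"] != "semaphore":
            return -1                           # hypothetical error code for an invalid handle
        obj["count"] += 1
        return 0
```

Because callers never hold a reference to the kernel object, a bogus argument such as `kernel.semaphore_post(9999)` is rejected with an error code and leaves kernel state untouched.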
Fault tolerance and high availability
● When a thread faults, the supervising thread should be notified.
● The supervising thread can be hooked into a “watchdog” setup, whereby thread deadlocks and starvation can be detected.
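A watchdog setup of this kind could look roughly like the sketch below. It assumes each monitored thread reports a periodic heartbeat to the supervisor; the class `Watchdog` and its methods are hypothetical. Time is passed in explicitly to keep the example deterministic, whereas a real system would read a monotonic clock.

```python
class Watchdog:
    """Supervisor-side record of the last heartbeat ("kick") from each monitored thread."""

    def __init__(self, timeout):
        self.timeout = timeout      # maximum allowed gap between heartbeats, in seconds
        self._last_kick = {}        # thread name -> time of last heartbeat

    def kick(self, name, now):
        """Called by a monitored thread to signal that it is still making progress."""
        self._last_kick[name] = now

    def expired(self, now):
        """Return the names of threads whose last heartbeat is older than the timeout."""
        return sorted(
            name for name, last in self._last_kick.items()
            if now - last > self.timeout
        )
```

A deadlocked or starved thread stops kicking the watchdog, so its name eventually shows up in `expired()` and the supervising thread can react.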
Guaranteed space availability
● The system designer statically defines how much physical memory each process gets.
● When a thread wants to spawn another thread, it must give part of its memory quota to the newly created thread.
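The quota rule can be sketched as follows. The class `MemoryQuota` is a hypothetical bookkeeping model, not a real allocator: a child quota is carved out of the parent, so the total never grows beyond what the designer assigned.

```python
class MemoryQuota:
    """Statically assigned memory budget that can only be split, never grown."""

    def __init__(self, size_bytes):
        self.size_bytes = size_bytes

    def spawn(self, child_bytes):
        """Create a child quota by donating part of this quota to it."""
        if child_bytes <= 0 or child_bytes > self.size_bytes:
            raise MemoryError("parent quota cannot cover the requested child quota")
        self.size_bytes -= child_bytes
        return MemoryQuota(child_bytes)
```

For example, a parent created with `MemoryQuota(1024)` that calls `spawn(256)` is left with 768 bytes, and the sum of both quotas is still 1024.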
Guaranteed time availability
● Threads with the same priority level usually share the processor via time slicing.
● Issue: there is no guarantee that critical threads will get the processor time they need.
● Solution: when a thread creates a new thread, the creating thread must give up part of its processor time to the newly created thread.
Schedulability
● The majority of safety-critical operating systems use priority-based, preemptive schedulers.
● Meeting hard deadlines is very important in safety-critical systems. Missing a deadline can cause a critical fault.
● Designers must understand how long it takes to execute a thread’s code, including any overhead (context switches, kernel system calls, interrupts).
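Once execution times are known, one standard way to check deadlines under a priority-based, preemptive scheduler is fixed-priority response-time analysis. This technique is not named in the slides. The sketch below assumes periodic, independent tasks with deadlines no longer than their periods, and worst-case execution times that already include the overhead; the function name and the task tuples are hypothetical.

```python
import math


def response_time(task_index, tasks):
    """Worst-case response time of tasks[task_index], or None if it misses its deadline.

    tasks is a list of (wcet, period, deadline) tuples sorted from highest to
    lowest priority; wcet is assumed to already include context-switch,
    system-call and interrupt overhead.
    """
    wcet, _period, deadline = tasks[task_index]
    higher = tasks[:task_index]
    r = wcet
    while True:
        # Time stolen by every release of each higher priority task within r.
        interference = sum(math.ceil(r / t_j) * c_j for c_j, t_j, _d in higher)
        r_next = wcet + interference
        if r_next > deadline:
            return None          # deadline miss: the task set is not schedulable
        if r_next == r:
            return r             # fixed point reached
        r = r_next
```

With the hypothetical task set `[(1, 4, 4), (2, 6, 6), (3, 12, 12)]`, the lowest priority task has a worst-case response time of 10 time units, which is within its deadline of 12.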
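Interrupt latency
● Interrupts are usually disabled while the kernel is manipulating internal data structures during system calls, which delays the response to any interrupt that arrives in the meantime.
● Better solution: postpone handling of the interrupts until the system call is completed, instead of disabling them.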
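The postponing approach can be modelled as a toy kernel that records an interrupt arriving during a system call and handles it as soon as the call completes. The class `DeferringKernel` and its methods are hypothetical and only illustrate the ordering of events.

```python
from collections import deque


class DeferringKernel:
    """Toy model: interrupts raised during a system call are queued, not lost."""

    def __init__(self):
        self._in_syscall = False
        self._pending = deque()     # interrupts waiting for the system call to finish
        self.log = []               # order in which events were processed

    def raise_interrupt(self, name):
        if self._in_syscall:
            self._pending.append(name)      # record only; handle after the system call
        else:
            self._handle(name)

    def _handle(self, name):
        self.log.append(f"handled {name}")

    def system_call(self, body):
        self._in_syscall = True
        try:
            body(self)                      # kernel data structures are manipulated here
        finally:
            self._in_syscall = False
            while self._pending:            # drain postponed interrupts in arrival order
                self._handle(self._pending.popleft())
```

The interrupt is acknowledged immediately and serviced right after the system call returns, so kernel data structures are never touched by a handler halfway through an update.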
Priority inversion
● Occurs when a high priority thread is waiting on a mutex held by a low priority thread, but the low priority thread cannot run because a medium priority thread has the processor.
Priority inheritance:
● The kernel temporarily elevates the low priority thread to the priority of the high priority thread until it releases the mutex.
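The effect of priority inheritance can be seen in a small tick-based simulation. The workload is hypothetical: thread L locks the mutex at tick 0 and needs 3 ticks inside the critical section, while H (which needs the mutex for 1 tick) and M (which needs 3 ticks of CPU and no mutex) both arrive at tick 1. As a simplification, a thread that uses the mutex holds it for its whole workload.

```python
def simulate(inheritance):
    """Return the tick at which the high priority thread H finishes."""
    threads = {
        "L": {"prio": 1, "arrival": 0, "work": 3, "uses_mutex": True},
        "M": {"prio": 2, "arrival": 1, "work": 3, "uses_mutex": False},
        "H": {"prio": 3, "arrival": 1, "work": 1, "uses_mutex": True},
    }
    owner = None                              # current holder of the mutex
    tick = 0
    while threads["H"]["work"] > 0:
        ready = [n for n, t in threads.items()
                 if t["arrival"] <= tick and t["work"] > 0]
        eff = {n: threads[n]["prio"] for n in ready}   # effective priorities
        # A thread is blocked if it needs the mutex and someone else holds it.
        blocked = {n for n in ready
                   if threads[n]["uses_mutex"] and owner not in (None, n)}
        if inheritance and owner is not None and blocked:
            # The owner inherits the highest priority among the threads it blocks.
            eff[owner] = max(eff[owner], max(eff[n] for n in blocked))
        runnable = [n for n in ready if n not in blocked]
        current = max(runnable, key=lambda n: eff[n])
        if threads[current]["uses_mutex"]:
            owner = current                   # lock is held for the whole workload
        threads[current]["work"] -= 1
        if threads[current]["work"] == 0 and owner == current:
            owner = None                      # release the mutex on completion
        tick += 1
    return tick
```

Without inheritance, M preempts L and H only finishes at tick 7 (`simulate(False)`); with inheritance, L runs at H's priority, releases the mutex sooner, and H finishes at tick 4 (`simulate(True)`).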
Priority ceiling:
● Each mutex has a priority associated with it.
● When a thread acquires a mutex, the thread is elevated to the priority of the mutex.
● When the mutex is released, the thread goes back to its original priority.
● This solution prevents chain blocking.
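The acquire and release rules above can be sketched as a small model. The classes `Thread` and `CeilingMutex` are hypothetical. The sketch assumes a thread holds at most one mutex at a time, and that the mutex priority is at least as high as the priority of every thread that uses it, which is the usual way the ceiling is chosen but is not stated in the slides.

```python
class Thread:
    def __init__(self, name, priority):
        self.name = name
        self.base_priority = priority   # priority assigned by the designer
        self.priority = priority        # priority the scheduler currently uses


class CeilingMutex:
    """Mutex with a fixed priority; the holder runs at that priority."""

    def __init__(self, ceiling):
        self.ceiling = ceiling
        self.owner = None

    def acquire(self, thread):
        if self.owner is not None:
            raise RuntimeError("mutex already held")
        if thread.base_priority > self.ceiling:
            # Assumption: the ceiling must cover every thread that uses the mutex.
            raise ValueError("ceiling is lower than the priority of a user thread")
        self.owner = thread
        thread.priority = self.ceiling          # elevate to the mutex priority

    def release(self, thread):
        if self.owner is not thread:
            raise RuntimeError("only the owner may release the mutex")
        self.owner = None
        thread.priority = thread.base_priority  # back to the original priority
```

While a low priority thread holds the mutex it runs at the mutex priority, so no thread with a priority below the ceiling can preempt it and prolong the critical section.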
Sources
[1] B. P. Douglass, “Safety Critical Systems Design: Patterns and Practices for Designing Mission and Safety-Critical Systems,” Object Management Group. [Online]. Available: http://www.omg.org/news/meetings/workshops/RT_2002_Workshop_Presentations/01-3_Douglass_Safety_Critical_Systems_Design.pdf. [Accessed: 21-Apr-2018].
[2] C. Walls, “Safety critical systems - the basics,” Embedded, 25-May-2016. [Online]. Available: https://www.embedded.com/design/safety-and-security/4442103/Safety-critical-systems---the-basics. [Accessed: 21-Apr-2018].
[3] D. Kleidermacher and M. Griglock, “Safety-Critical Operating Systems,” Embedded, 31-Aug-2001. [Online]. Available: https://www.embedded.com/design/prototyping-and-development/4023830/Safety-Critical-Operating-Systems. [Accessed: 21-Apr-2018].