Scheduling and Voltage Scaling for Energy/Reliability ...paupo/publications/Pop2007ab... ·...
Transcript of Scheduling and Voltage Scaling for Energy/Reliability ...paupo/publications/Pop2007ab... ·...
Scheduling and Voltage Scaling for Energy/Reliability Trade-offs in Fault-Tolerant Time-Triggered
Embedded Systems
Kåre Harbo Poulsen,
Paul Pop, Viacheslav Izosimov
August 23, 2007
Embedded Systems
Design of Embedded Systems
Faults
Permanent faults are decreasing Transient faults are increasing
From: Cristian Continescu, Trends and challenges in VLSI circuit reliability, 2003
Fault-Tolerance
Tolerate faults gracefully Expressions for reliability for fault-
tolerance
Reexecution
PE2
PE1 P1 P1
Passive Replication
PE2
PE1 P1
P1
Replication
PE2
PE1 P1
P1
Embedded Systems Model
Input Application Architecture Reliability goal: 0.999 999 999
PE2
PE1 P1 P2 P4
P3
P5
Bus 1 2
DeadlineR0=0.999 981
P1
P2 P3
P4
P5
PE1 PE2
P1 4 4
P2 4 4
P3 3 3
P4 4 4
P5 4 4
PE1 PE 2
m1
m2
k=1
Fault-Tolerant Scheduling
Input Application Architecture Reliability goal: 0.999 999 999
Fault-tolerance for k=1 faults
PE2
PE1 P1 P2 P4
P3
P5
Bus 1 2
Deadline
P1 P2 P4
P3
P5
R0=0.999 999 999 927
P1
P2 P3
P4
P5
PE1 PE2
P1 4 4
P2 4 4
P3 3 3
P4 4 4
P5 4 4
PE1 PE 2
m1
m2
k=1
Fault-Tolerant Scheduling
Fault tolerant scheduler Full transparency
Good debugability Little memory
PE2
PE1 P1 P2 P4
P3
P5
Bus 1 2
Deadline
P1 P2 P4
P3
P5
Only 1 faultFully Transparent Scheduling
P1
P2 P3
P4
P5
PE1 PE2
P1 4 4
P2 4 4
P3 3 3
P4 4 4
P5 4 4
PE1 PE 2
m1
m2
k=1
Fault-Tolerant Scheduling
Can be done faster Sacrifice local transparency
PE2
PE1 P1 P2 P4
P3
P5
Bus 1 2
Deadline
P1 P2 P4
P3
P5
Only 1 faultFully Transparent Scheduling
Only 1 fault
P1
P2 P3
P4
P5
PE1 PE2
P1 4 4
P2 4 4
P3 3 3
P4 4 4
P5 4 4
PE1 PE 2
m1
m2
k=1
Fault-Tolerant Scheduling
Can be done faster Sacrifice local transparency More complex online scheduler
PE2
PE1 P1 P2 P4
P3
P5
Bus 1 2
Deadline
P1/2
P3
P4/5
Only 1 fault
Slack Sharing Scheduling
P1
P2 P3
P4
P5
PE1 PE2
P1 4 4
P2 4 4
P3 3 3
P4 4 4
P5 4 4
PE1 PE 2
m1
m2
k=1
Fault-Tolerant Scheduling
Even faster Sacrifice all transparency Schedule for each fault scenario
PE2
PE1 P1 P2 P4
P3
P5
Bus 1 2
Deadline
P1/2
P3
P4/5
Slack Sharing Scheduling
P1
P2 P3
P4
P5
PE1 PE2
P1 4 4
P2 4 4
P3 3 3
P4 4 4
P5 4 4
PE1 PE 2
m1
m2
k=1
Fault-Tolerant Scheduling
Even faster Sacrifice all transparency Schedule for each fault scenario At most k re-executions
PE2
PE1 P1 P1 P4
P3
P5
Bus 1 2
Deadline
P2
P1
P2 P3
P4
P5
PE1 PE2
P1 4 4
P2 4 4
P3 3 3
P4 4 4
P5 4 4
PE1 PE 2
m1
m2
k=1
Fault-Tolerant Scheduling
Even faster Sacrifice all transparency Schedule for each fault scenario At most k re-executions
PE2
PE1 P4P1 P4
P3
P5
Bus 1 2
Deadline
P2
P1
P2 P3
P4
P5
PE1 PE2
P1 4 4
P2 4 4
P3 3 3
P4 4 4
P5 4 4
PE1 PE 2
m1
m2
k=1
Fault-Tolerant Scheduling
Even faster Sacrifice all transparency Schedule for each fault scenario At most k re-executions All faults information is shared
PE2
PE1 P1 P4
P3
P5
Bus 1 2
Deadline
P2
Conditional Scheduling
P1
P2 P3
P4
P5
PE1 PE2
P1 4 4
P2 4 4
P3 3 3
P4 4 4
P5 4 4
PE1 PE 2
m1
m2
k=1
Fault-Tolerant Schedulings
PE2
PE1 P1 P4
P3
P5
Bus 1 2
P2
Conditional Scheduling
PE2
PE1 P1 P2 P4
P3
P5
Bus 1 2
P1/2
P3
P4/5
Slack Sharing Scheduling
PE2
PE1 P1 P2 P4
P3
P5
Bus 1 2
P1 P2 P4
P3
P5
Fully Transparent Scheduling Deadline
P1
P2 P3
P4
P5
PE1 PE2
P1 4 4
P2 4 4
P3 3 3
P4 4 4
P5 4 4
PE1 PE 2
m1
m2
k=1
Energy Management
Goal: minimise energy consumption Dynamic voltage scaling
PE2
PE1 P1
100% Vss
100% E0100% E0
PE2
PE1 P1
66% Vss
44% E0
PE2
PE1 P1
33% Vss
11% E0
Energy Management
PE2
PE1 P1 P4
P3
P5
Bus 1 2
P2
Conditional Scheduling
PE2
PE1 P1 P2 P4
P3
P5
Bus 1 2
P1/2
P3
P4/5
Slack Sharing Scheduling
PE2
PE1 P1 P2 P4
P3
P5
Bus 1 2
P1 P2 P4
P3
P5
Fully Transparent Scheduling Deadline
P1
P2 P3
P4
P5
PE1 PE2
P1 4 4
P2 4 4
P3 3 3
P4 4 4
P5 4 4
PE1 PE 2
m1
m2
k=1
Energy Management
PE2
PE1 P1 P4
P3
P5
Bus 1 2
P2
Conditional Scheduling
PE2
PE1 P1 P2 P4
P3
P5
Bus 1 2
P1/2
P3
P4/5
Slack Sharing Scheduling
PE2
PE1 P1 P2 P4
P3
P5
Bus 1 2
P1 P2 P4
P3
P5
Fully Transparent Scheduling Deadline
100% E0
P1
P2 P3
P4
P5
PE1 PE2
P1 4 4
P2 4 4
P3 3 3
P4 4 4
P5 4 4
PE1 PE 2
m1
m2
k=1
63% E0
Energy Management
PE2
PE1 P1 P4
P3
P5
Bus 1 2
P2
Conditional Scheduling
PE2
PE1 P1 P2P4
P3
P5
Bus 1 2
P3
P4/5
Slack Sharing Scheduling
PE2
PE1 P1 P2 P4
P3
P5
Bus 1 2
P1 P2 P4
P3
P5
Fully Transparent Scheduling Deadline
100% E0
P1
P2 P3
P4
P5
PE1 PE2
P1 4 4
P2 4 4
P3 3 3
P4 4 4
P5 4 4
PE1 PE 2
m1
m2
k=1
63% E0
Energy Management
PE2
PE1 P1
P3
P5
Bus 1 2
P2 P4
Conditional Scheduling
PE2
PE1 P1 P2P4
P3
P5
Bus 1 2
P3
P4/5
Slack Sharing Scheduling
PE2
PE1 P1 P2 P4
P3
P5
Bus 1 2
P1 P2 P4
P3
P5
Fully Transparent Scheduling Deadline
100% E0
38% E0
P1
P2 P3
P4
P5
PE1 PE2
P1 4 4
P2 4 4
P3 3 3
P4 4 4
P5 4 4
PE1 PE 2
m1
m2
k=1
63% E0
Energy Management
PE2
PE1 P1
P3
P5
Bus 1 2
P2 P4
Conditional Scheduling
PE2
PE1 P1 P2P4
P3
P5
Bus 1 2
P3
P4/5
Slack Sharing Scheduling
PE2
PE1 P1 P2 P4
P3
P5
Bus 1 2
P1 P2 P4
P3
P5
Fully Transparent Scheduling Deadline
100% E0
38% E0
P1
P2 P3
P4
P5
PE1 PE2
P1 4 4
P2 4 4
P3 3 3
P4 4 4
P5 4 4
PE1 PE 2
m1
m2
k=1
Reliability and Energy
Lower voltage Critical energy is lowered Probability of faults
increases Circuit operates slower
Lower frequency Longer execution time Probability of faults
increases
Reliability and Energy
Exponential model
Dakai Zhu, ReliabilityAware Dynamic Energy Management in Dependable Embedded RealTime Systems, 2006
λ f =λ010d 1− f 1− f
min
Fmin 10 20 30 40 50 60 70 80 90 Fmax
0102030405060708090
100
Failure rate vs. Voltage
Relative voltage
Incr
ease
in fa
ult r
ate
63% E0
Energy Management
PE2
PE1 P1
P3
P5
Bus 1 2
P2 P4
Conditional Scheduling
PE2
PE1 P1 P2P4
P3
P5
Bus 1 2
P3
P4/5
Slack Sharing Scheduling
PE2
PE1 P1 P2 P4
P3
P5
Bus 1 2
P1 P2 P4
P3
P5
Fully Transparent Scheduling Deadline
100% E0
38% E0
R=0.999 999 999 93
R=0.999 999 999 25
R=0.999 999 958 208
P1
P2 P3
P4
P5
PE1 PE2
P1 4 4
P2 4 4
P3 3 3
P4 4 4
P5 4 4
PE1 PE 2
m1
m2
k=1
Energy/Reliability Trade-off
PE2
PE1
Bus
P5
P1
P2
P3
P6
P4
1 2
Deadline
100% E0
R=0.999 999 987
Reliability goal: 0.999 999 9
k = 1
P1
P2 P3
P4
P5m1
m2
A: G1
N1N2
Voltage levels100%100%
66%66%
33%33%
P6
G2
P1P2P3P4P5
N1 N21070X40X
XX
X40
40
P6 X 50
PE1 PE2
Energy/Reliability Trade-off
PE2
PE1
Bus
P5
P2P
1
P3
P6
1 2
P4
Deadline
68% E0
R=0.999 999 878
Reliability goal: 0.999 999 9 Set reliability as hard constraint
k = 1
P1
P2 P3
P4
P5m1
m2
A: G1
N1N2
Voltage levels100%100%
66%66%
33%33%
P6
G2
P1P2P3P4P5
N1 N21070X40X
XX
X40
40
P6 X 50
PE1 PE2
Energy/Reliability Trade-off
PE2
PE1
Bus
P1
P2
P6
1
P3
P4
2
P5
Deadline
73% E0
R=0.999 999 920
Reliability goal: 0.999 999 9 Set reliability as hard constraint Trade-off 5% energy Meets reliability goal
k = 1
P1
P2 P3
P4
P5m1
m2
A: G1
N1N2
Voltage levels100%100%
66%66%
33%33%
P6
G2
P1P2P3P4P5
N1 N21070X40X
XX
X40
40
P6 X 50
PE1 PE2
Problem Formulation
Input Application Architecture Reliability goal
Decide Fault-Tolerant Scheduling Mapping Fault-Tolerance Policy
While optimising for Energy Under hard reliability goal
Implementation
Problem is NP-Complete Normally solved using “best effort” heuristics
Use constraint logic programming Good performance with NP-completeness Optimal solutions are feasible Flexible model ECLiPSe-CLP
Comparison of Schedulers
Comparison of Schedulers
Reliability and Energy Trade-offs
Conclusions
Design tool for doing Fault tolerant scheduling Mapping Policy assignment
Optimising for Minimal energy Hard constraints for timing and reliability
Message: Reliability can be met at little energy cost
Embedded Systems