Soner Yaldiz , Alper Demir, Serdar Tasiran
Koç University, Istanbul, Turkey
Paolo Ienne, Yusuf Leblebici
Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland
Characterizing and Exploiting Task-Load Variability and Correlation for Energy Management in Multi-Core Systems
ESTIMedia 2005
Multi-Core Soft Real-Time Systems
• Chip-level multiprocessing for massive performance
  – Energy management problem
• Real-time multimedia applications
  – Audio, video processing
• Soft real-time systems
  – Tolerance to deadline misses
[Figure: processors; task graph (start, T1, T2, T3, T4, end) executed between time t and t + TDEADLINE; MPEG2 video frames.]
Variability and Correlation
• Capture by stochastic models
• Exploit for energy management
  – Dynamic Voltage Scaling (DVS)
[Figure: voltage vs. time with levels V1 and V2 before the deadline; probability vs. workload for Task i (variability); Task 1 workload vs. Task 2 workload (positive correlation).]
• This work: First approach to consider variability and correlations for multiprocessor energy management
Motivating Example
• Application composed of two tasks on a single processor
  – start -> T1 -> T2 -> end
  – TDEADLINE = 2 sec
• Task loads low (2) or high (10) with equal probability
• Processor operating modes
  – Slow mode -> 6 instructions per second
  – Fast mode -> 10 instructions per second
[Figure: load distribution of T1 and T2: 2 or 10 instructions, 50% probability each.]
Task Load Combinations
Probabilities for task load combinations:

Independent
          T2 = 2   T2 = 10
T1 = 2    25%      25%
T1 = 10   25%      25%

Positively Correlated
          T2 = 2   T2 = 10
T1 = 2    50%      0
T1 = 10   0        50%

Negatively Correlated
          T2 = 2   T2 = 10
T1 = 2    0        50%
T1 = 10   50%      0
Motivating Example
• Application
  – 2 tasks
• Processor modes
  – Slow: 6 inst/sec
  – Fast: 10 inst/sec
• Deadline
  – 2 sec

Completion probability by mode:
                                       Independent   Positively Correlated   Negatively Correlated
Slow mode (12 instructions in 2 sec)   0.75          0.50                    1.0 (the (10, 10) case never happens)
Fast mode (20 instructions in 2 sec)   1.0           1.0                     1.0

• Target 75%; assumption: independent; reality: positive correlation
  – Slow mode -> 12 instructions in 2 sec
  – Misses desired performance (0.50 instead of 0.75)
• Target 100%; assumption: independent; reality: negative correlation
  – Fast mode -> 20 instructions in 2 sec
  – Suboptimal energy (slow mode would already reach 1.0)
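The completion probabilities of the motivating example can be checked with a few lines of Python. This is only a minimal sketch: the joint probabilities are copied from the Task Load Combinations slide, while the function name completion_probability and the constant names are illustrative choices, not taken from the slides.

```python
# Joint probabilities of (T1 load, T2 load) from the Task Load Combinations slide.
INDEPENDENT = {(2, 2): 0.25, (2, 10): 0.25, (10, 2): 0.25, (10, 10): 0.25}
POSITIVE = {(2, 2): 0.50, (2, 10): 0.0, (10, 2): 0.0, (10, 10): 0.50}
NEGATIVE = {(2, 2): 0.0, (2, 10): 0.50, (10, 2): 0.50, (10, 10): 0.0}

DEADLINE_SEC = 2.0
SLOW_IPS = 6.0   # slow mode, instructions per second
FAST_IPS = 10.0  # fast mode, instructions per second


def completion_probability(dist, speed_ips, deadline_sec=DEADLINE_SEC):
    """Probability that both tasks, run back to back at one speed, meet the deadline."""
    capacity = speed_ips * deadline_sec  # instructions that fit before the deadline
    return sum(p for (load1, load2), p in dist.items() if load1 + load2 <= capacity)


if __name__ == "__main__":
    for name, dist in [("independent", INDEPENDENT),
                       ("positive", POSITIVE),
                       ("negative", NEGATIVE)]:
        print(name,
              completion_probability(dist, SLOW_IPS),
              completion_probability(dist, FAST_IPS))
```

Only the (10, 10) combination exceeds the 12-instruction capacity of slow mode, so the result depends entirely on how much probability the joint distribution puts on that one cell.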
OUTLINE
• Stochastic Modeling
• Energy Management Scheme
  – OFFLINE Optimization
  – ONLINE Adjustments
• Experimental Results
• Conclusions
Stochastic Modeling Flow
• Computational Demand (CD) of a task
  – Number of CPU cycles for execution
• Demands are represented by dist
  – Quantized for manageability
• dist is obtained from a set of traces
• The demands of the tasks together constitute an 'observation'
  – (T1, T2) = (5, 5) observed 3 times out of 8
  – dist(5, 5) = 3/8

[Figure: start -> T1 -> T2 -> end]

OBSERVATIONS
#   Task1   Task2
1   2       10
2   5       5
3   2       5
4   10      2
5   5       5
6   2       10
7   2       5
8   5       5

dist
          T2 = 2   T2 = 5   T2 = 10
T1 = 2    0        2/8      2/8
T1 = 5    0        3/8      0
T1 = 10   1/8      0        0
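The step from the observation table to dist is a simple frequency count. The sketch below reproduces it for the eight observations on this slide; the function name joint_distribution is illustrative, and quantization of raw cycle counts is assumed to have happened already.

```python
from collections import Counter
from fractions import Fraction

# (Task1, Task2) demands for the eight observations listed on the slide.
OBSERVATIONS = [(2, 10), (5, 5), (2, 5), (10, 2), (5, 5), (2, 10), (2, 5), (5, 5)]


def joint_distribution(observations):
    """Empirical joint distribution: relative frequency of each demand combination."""
    counts = Counter(observations)
    total = len(observations)
    return {combo: Fraction(n, total) for combo, n in counts.items()}


dist = joint_distribution(OBSERVATIONS)

if __name__ == "__main__":
    for combo, prob in sorted(dist.items()):
        print(combo, prob)
```

Combinations that never occur in the traces are simply absent from the dictionary, which corresponds to the zero cells of the dist table.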
Case Study: MPEG2
• MPEG2 video decoding
  – Widely used and computationally intensive
• Slice-based task decomposition (Olukotun et al., 1998)
  – VLD (variable-length decoding)
  – MC (motion compensation)
• Experimental data
  – 10 movie segments
  – 19 slices, 38 tasks
  – 24 frames per second
  – ~14000 frames per movie
[Figure: task graph with tasks VLD0, MC0, VLD1, MC1, VLD2, MC2, ... for slice0, slice1, slice2, ...; legend: task assignment, processor precedence, data precedence.]
Variability of MPEG2 Task Loads
1. Similarity: training set predicts workload for others
2. Long tails: worst case causes overdesign
[Figure: task load distributions; panels: one movie, aggregate.]
Correlation among MPEG2 Task Loads
• High correlation
[Figure: correlation among task loads of slices (labels: Slice 0, Slice 5, Slice 9, Slice 14, Slice 18, ...); panels: one movie, aggregate statistics.]
Critical Path
• Summation of worst-case task loads: 64 million cycles
• Observed worst-case total load: 28 million cycles
• Ignoring correlations leads to far-from-optimal results
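The gap between the two numbers arises because the per-task worst cases rarely occur in the same frame. The sketch below shows the same effect on the eight-observation toy trace from the Stochastic Modeling Flow slide. These are not the MPEG2 measurements, and the function names are illustrative.

```python
# Toy (Task1, Task2) trace from the Stochastic Modeling Flow slide, NOT the MPEG2 data.
OBSERVATIONS = [(2, 10), (5, 5), (2, 5), (10, 2), (5, 5), (2, 10), (2, 5), (5, 5)]


def sum_of_worst_cases(observations):
    """Add up each task's individual worst case (ignores correlation)."""
    return sum(max(task_loads) for task_loads in zip(*observations))


def worst_observed_total(observations):
    """Largest total load that actually occurred within a single observation."""
    return max(sum(loads) for loads in observations)


if __name__ == "__main__":
    print(sum_of_worst_cases(OBSERVATIONS))    # per-task maxima added together
    print(worst_observed_total(OBSERVATIONS))  # worst combination actually seen
```

On the toy trace the first figure is 20 and the second is 12, because the two tasks never hit their maximum of 10 in the same observation.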
OFFLINE: Optimization Formulation
• Nonlinear constrained optimization problem with 38 variables
  – One voltage per task
• Stochastic programming formulation
  – Based on stochastic application model
• Optimized voltages stored for run-time look-up
• Each task i has fixed voltage Vi for all periods
• GOAL: Determine optimal Vi's
  – minimize: average energy consumption
  – subject to: completion probability
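To make the formulation concrete, here is a heavily simplified sketch on the two-task motivating example. It is not the nonlinear solver used in the work: it enumerates the two discrete modes per task by brute force instead of optimizing continuous voltages. The energy model (energy per instruction proportional to the square of the speed) and all function names are assumptions introduced only for illustration.

```python
from itertools import product

MODES = {"slow": 6.0, "fast": 10.0}  # instructions per second, from the motivating example
DEADLINE_SEC = 2.0
EPS = 1e-9  # tolerance for floating-point comparisons

INDEPENDENT = {(2, 2): 0.25, (2, 10): 0.25, (10, 2): 0.25, (10, 10): 0.25}
NEGATIVE = {(2, 10): 0.50, (10, 2): 0.50}


def energy_per_instruction(speed):
    # ASSUMPTION: energy per instruction scales with speed squared (not from the slides).
    return speed ** 2


def evaluate(assignment, dist):
    """Return (completion probability, average energy) for one fixed mode per task."""
    speeds = [MODES[mode] for mode in assignment]
    prob, energy = 0.0, 0.0
    for loads, p in dist.items():
        finish = sum(load / s for load, s in zip(loads, speeds))
        if finish <= DEADLINE_SEC + EPS:
            prob += p
        energy += p * sum(load * energy_per_instruction(s)
                          for load, s in zip(loads, speeds))
    return prob, energy


def optimize(dist, p_con):
    """Cheapest mode assignment whose completion probability is at least p_con."""
    best = None
    for assignment in product(MODES, repeat=2):
        prob, energy = evaluate(assignment, dist)
        if prob + EPS >= p_con and (best is None or energy < best[2]):
            best = (assignment, prob, energy)
    return best


if __name__ == "__main__":
    print(optimize(INDEPENDENT, 1.0))
    print(optimize(NEGATIVE, 1.0))
```

With a 100% target, the independent model forces both tasks into fast mode, while the negatively correlated model allows slow mode for both. This is the "suboptimal energy" case of the motivating example.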
ONLINE Adjustments
• When low load is detected, lower the task voltage
  – Preserving probabilistic performance
• Very small run-time expense
  – Few comparisons and arithmetic operations
[Figure: load lower than expected -> slow down further.]
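The slides do not spell out the adjustment rule, so the following is only a hypothetical sketch of one rule in this spirit. After the earlier tasks finish, it picks the lowest available speed for the next task such that every load the offline plan would still have completed in time is also completed in time. The function name, its parameters, and the discrete speed set are all assumptions made for illustration.

```python
EPS = 1e-9  # tolerance for floating-point comparisons


def adjust_next_speed(elapsed, planned_speed, possible_loads, speeds, deadline):
    """Hypothetical online rule: slow the next task down without losing any
    outcome that the offline plan (planned_speed) would have completed in time."""
    remaining = deadline - elapsed
    # Loads of the next task that the offline plan still finishes before the deadline.
    covered = [load for load in possible_loads
               if load / planned_speed <= remaining + EPS]
    if not covered:
        return planned_speed  # nothing can be preserved by slowing down; keep the plan
    needed = max(covered) / remaining  # slowest speed that still finishes all of them
    candidates = [s for s in speeds if s + EPS >= needed and s <= planned_speed]
    return min(candidates) if candidates else planned_speed


if __name__ == "__main__":
    # Motivating example with the offline plan "fast for both tasks" (10 inst/sec).
    # T1 turned out light (2 instructions, 0.2 sec elapsed): T2 may drop to slow mode.
    print(adjust_next_speed(0.2, 10.0, [2, 10], [6.0, 10.0], 2.0))
    # T1 turned out heavy (10 instructions, 1.0 sec elapsed): T2 must stay in fast mode.
    print(adjust_next_speed(1.0, 10.0, [2, 10], [6.0, 10.0], 2.0))
```

The rule needs only a handful of comparisons and one division per task, which matches the low run-time expense described on the slide.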
Experimental Setup
• Compared with approaches for multiprocessor systems:
  – I (Zhang et al., DAC 2002)
    • Ignores variability, correlations
    • 100% completion
    • Worst-case task load
  – II (Hua et al., EMSOFT 2003)
    • Ignores correlations
    • Completion probability
    • Marginal load distribution
• Training set: 8 movie segments out of 10
• Test set: the 2 movies not included in the training set
• Three completion probabilities PCON
  – 0.90, 0.95, 0.99
• Two deadlines
  – Normal, Tight
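What is lost by working with marginal load distributions only can be seen on the positively correlated case of the motivating example. The sketch below does not reproduce approach II itself. It only compares the completion probability predicted from the product of the marginals with the one given by the joint distribution, and all function names are illustrative.

```python
from collections import defaultdict

# Positively correlated case of the motivating example (zero-probability cells omitted).
POSITIVE = {(2, 2): 0.5, (10, 10): 0.5}
SLOW_CAPACITY = 12  # instructions that fit in 2 sec at 6 inst/sec


def marginals(dist):
    """Per-task load distributions obtained by summing out the other task."""
    m1, m2 = defaultdict(float), defaultdict(float)
    for (load1, load2), p in dist.items():
        m1[load1] += p
        m2[load2] += p
    return dict(m1), dict(m2)


def product_of_marginals(dist):
    """Joint distribution one would assume if the task loads were independent."""
    m1, m2 = marginals(dist)
    return {(l1, l2): p1 * p2 for l1, p1 in m1.items() for l2, p2 in m2.items()}


def completion_probability(dist, capacity):
    return sum(p for (l1, l2), p in dist.items() if l1 + l2 <= capacity)


predicted = completion_probability(product_of_marginals(POSITIVE), SLOW_CAPACITY)
actual = completion_probability(POSITIVE, SLOW_CAPACITY)

if __name__ == "__main__":
    print(predicted, actual)
```

The marginals are identical to those of the independent case, so the marginal-based prediction is 0.75 even though the joint distribution gives only 0.50.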
Experiment I: Normal Deadline
1. Significant energy savings
2. Desired completion probability achieved

         PCON=0.90               PCON=0.95               PCON=0.99
         I    II   OFLN  ONLN    I    II   OFLN  ONLN    I    II   OFLN  ONLN
Avg E    860  154  100   98      833  147  100   97      764  129  100   91
Avg Pr   0.9026                  0.9511                  0.9899

[Figure: per-movie results; x-axis: Movie #.]
Experiment II: Tight Deadline

         PCON=0.90     PCON=0.95     PCON=0.99
         OFLN  ONLN    OFLN  ONLN    OFLN  ONLN
Avg E    100   95      100   91      100   70
Avg Pr   0.9030        0.9515        0.9898

• II (Hua2003) fails with tight deadline
  – Ignores correlations
• ONLN improves more
• Accurate stochastic model
Experiment III: Comparison with GOD

Single Movie
              OFFLINE   ONLINE   GOD
PCON = 0.99   100       66       52
PCON = 0.95   100       86       72
PCON = 0.90   100       92       76

• GOD
  – Ideal, unrealizable, non-causal
  – For every individual frame
    • Knows load of each task
    • Computes optimal voltages
• There is still room for future work
  – "application state" structure
Conclusions
• Demonstrated significant variability and correlations among workloads of MPEG2 tasks
• Our stochastic models capture essential characteristics of applications
  – Accurately predict performance
• Novel energy management scheme based on stochastic models
  – Significant energy savings
Questions?