Application Transformations for Energy and Performance-Aware Device Management
Taliver Heath, Eduardo Pinheiro, Jerry Hom, Ulrich Kremer, and Ricardo Bianchini
Rutgers University
www.darklab.rutgers.edu
The Problem
Conserve energy in devices
  Must take advantage of lower power states
  State transitions have overhead
  Cost in both energy and performance
Challenge: non-interactive applications and fast processors
  Short device idle times
  Devices cannot use lower power states
Our Solution
[Figure: CPU and disk timelines for two cases. An Unmodified Application (UM): disk periods labeled idle, active, idle, active, idle. Transformed Application: disk periods labeled idle, active, active.]
Our Goals
Conserve energy by exploiting transformations that increase idle time
Evaluate ideas using:
  Hand-modified programs
  Automated compiler transformations
Specific policies: Energy Oblivious, Fixed Threshold, Direct Deactivation, Pre-Activation, and Combined
Application Transformations
Increase idle times with help of compiler or programmer
  Identify loops where accesses occur
  Perform loop transformations
  Estimate device idle times
  Insert system calls
Idle time limited by memory or real-time constraints
Example: Original Application
i = 1;
while i <= N {
    read chunk[i] of file;
    compute on chunk[i];
    i = i+1;
}
Example: Transformed Application
available = how_much_memory();
numchunks = available/sizeof(chunks);
compute_time = appfunc(numchunks);
i = 1;
while i <= N {
    read chunk[i…i+numchunks-1] of file;
    next_R(compute_time);
    compute on chunk[i…i+numchunks-1];
    i = i+numchunks;
}
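A minimal Python sketch of the same clustering idea, assuming an in-memory stream stands in for the file. The names `process_batched`, `available_bytes`, and the optional `next_R` callback are illustrative stand-ins, not the interface from the talk.

```python
import io

def process_batched(stream, chunk_size, available_bytes, compute, next_R=None):
    """Read as many chunks as fit in memory per device access, then compute."""
    chunks_per_batch = max(1, available_bytes // chunk_size)
    results = []
    accesses = 0
    while True:
        # One clustered device access instead of chunks_per_batch small ones.
        data = stream.read(chunk_size * chunks_per_batch)
        if not data:
            break
        accesses += 1
        if next_R is not None:
            # Hypothetical hook: tell the OS how long the device will be idle.
            next_R(len(data))
        for start in range(0, len(data), chunk_size):
            results.append(compute(data[start:start + chunk_size]))
    return results, accesses

data = bytes(range(100))
one_by_one = process_batched(io.BytesIO(data), 10, 10, sum)  # one chunk per access
batched = process_batched(io.BytesIO(data), 10, 45, sum)     # four chunks per access
```

Both calls compute the same results; the batched call touches the device fewer times, so each idle period between accesses is longer.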
Compiler Framework
Annotations to file descriptors
Replace disk calls using interprocedural analysis
Profiling
Buffered I/O
Notify OS of idle times
Based on SUIF infrastructure
Device Management
Energy-Oblivious (EO)
Fixed-Threshold (FT)
Direct-Deactivation (DD)
Pre-Activation (PA)
Combined (CO): DD+PA
Final state based on model [Heath02]
Sample Disk Power Graphs (mp3 player)
[Figure: three panels labeled FT, UM, and CO, each plotting Power (W) on a vertical axis from 0 to 3 against a horizontal axis from 0 to 30.]
Experimental Setup
Fujitsu Disk
  6-GB, 4200-rpm laptop disk
  3 states
    Idle: 0.9 W
    Standby: 0.22 W
    Sleep: 0.09 W
Buffer memory available: 19 MB
Time allowed for reading: 0.3 seconds
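As a rough worked example of why the lower power states matter, the sketch below multiplies the three state powers listed above by an idle period. The 10-second period is an arbitrary assumption, and transition energy and time are ignored because the slide does not list them.

```python
# Power per state in watts, as listed for the Fujitsu disk.
STATE_POWER_W = {"idle": 0.9, "standby": 0.22, "sleep": 0.09}

def residency_energy(state, seconds):
    """Energy in joules for staying in one state, ignoring transitions."""
    return STATE_POWER_W[state] * seconds

IDLE_PERIOD_S = 10.0  # assumed idle period, for illustration only
energies = {s: residency_energy(s, IDLE_PERIOD_S) for s in STATE_POWER_W}
for state, joules in energies.items():
    print(f"{state}: {joules:.2f} J")
```

Under these assumptions sleep uses one tenth of the energy of idle for the same period, which is the gap that longer idle times let the policies exploit.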
Experiment
6 applications
  MP3 player, MPEG player
  Gzip, sftp, MPEG encoder, image smoother
Variables investigated
  Disk management policies
  Compiler vs. hand-optimized
  OS prefetching on/off
Non-streaming: SFTP
Streaming: MP3 player
Average Hand-Modified Results
Policy  Energy  Performance
EO      40%     0%
FT      60%     5%
DD      73%     7%
PA      60%     1%
CO      70%     4%
Average Compiler Results
Policy  Energy  Performance
EO      46%     1%
FT      68%     4%
DD      79%     7%
CO      75%     3%
Related Work (partial list)
Application-controlled power states
  Concept, but no implementation [Ellis99, Lu99]
  Compiler infrastructure [Delaluz01]
  Direct deactivation and preactivation [Hom01, Heath02]
Conserving disk energy [Douglis94]
Modifying disk access API [Weissel02]
Conclusions
Application transformations
  55-89% savings in energy
  Minimal effect on performance
Idle time predictions are difficult
Prefetching has little impact
Compiler transformations work well
  As good as hand modifications
Generic framework: other disks and devices
For more information
www.darklab.rutgers.edu
Technique
Create model of disk energy
Transform applications
Realize model on real disk
Predict disk energy usage
Measure disk on 4 applications
Future Work
More disks
Other devices
Multiple active processes
Asynchronous I/O
Summary
Historical Use of States
Change to lower state during period of idleness
  Fixed-threshold
  Adaptive/Heuristic
  OS hints
Based on general knowledge of system
Runlength vs. Energy
Projected Application Gain
Projected Application Gain
Overhead for DD
Combined (CO)
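$E_{co} = E^{deact}_{1,f} + (R - T^{act}_{f}) P_{f} + E^{act}_{f}$
$T_{co} = R$
Final state $f$: $E^{deact}_{1,f} + (R - T^{act}_{f}) P_{f} + E^{act}_{f} \le E^{deact}_{1,f+1} + (R - T^{act}_{f+1}) P_{f+1} + E^{act}_{f+1}$
[Figure: CPU and disk timelines with idle and active periods.]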
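The final-state choice for CO can be sketched in Python, assuming the CO energy is the deactivation energy, plus low-power residency for the runlength minus the reactivation time, plus the reactivation energy. The state powers come from the experimental setup; every transition energy and time below is a made-up placeholder, and `State`, `co_energy`, and `best_co_state` are illustrative names.

```python
from dataclasses import dataclass

@dataclass
class State:
    power: float         # P_s in watts
    deact_energy: float  # energy in joules to drop from state 1 to this state
    act_energy: float    # energy in joules to reactivate from this state
    act_time: float      # time in seconds to reactivate from this state

# Index 0 is state 1 (idle), then standby, then sleep.
# Only the power values are from the slides; the rest are placeholders.
STATES = [
    State(0.9, 0.0, 0.0, 0.0),
    State(0.22, 1.0, 3.0, 1.5),
    State(0.09, 1.5, 5.0, 3.0),
]

def co_energy(R, s):
    """Energy over a runlength R when reactivation overlaps the end of R."""
    return s.deact_energy + (R - s.act_time) * s.power + s.act_energy

def best_co_state(R, states):
    """Index of the lowest-energy state whose reactivation fits within R."""
    feasible = [i for i, s in enumerate(states) if s.act_time <= R]
    return min(feasible, key=lambda i: co_energy(R, states[i]))

choices = {R: best_co_state(R, STATES) for R in (2.0, 10.0, 60.0)}
```

With these placeholder costs, short runlengths stay in idle, medium ones pick standby, and long ones pick sleep, which is the behavior a model-based final-state choice is meant to capture.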
Parameter Description
Parameter             Explanation
$E_{pol}$             Energy consumed by policy pol
$T_{pol}$             CPU time consumed by policy pol
$R$                   Run-length
$P_{s}$               Average power consumed in s
$T_{s}$               Inactivity threshold for s
$E^{act}_{s}$         Average reactivation energy
$E^{deact}_{s,s'}$    Average deactivation energy
$T^{act}_{s}$         Average reactivation time
Reality Departs from Model
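Hidden states in several transitions
  Transition from active to idle
  Behavior on activation
For CO: [adjustment term adj(R) involving the constants 0.4 and 1.75 and $P_{f}$; the formula is not recoverable from this transcript]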
Experiments
Modified App Runlengths
Application      s1   s2    s3
MP3 player       0    0     1
MPEG player      0    0     1
Image smoother   0    0     1
Gzip             0    0.36  0.64
Sftp             0    0     1
MPEG encoder     0    0.5   0.5
Energy, mpg123
Energy, sftp
Performance, mpg123
Performance, sftp
Experimental Results
MP3 player
Summary
Direct-deactivation and preactivation (CO)
  Can save up to 89% of disk energy
  No performance penalty, except for MPEG player (<10%)
Just increasing runlengths, we can save up to 50% energy
Error in model can be significant: up to 50% for the entire application
Energy Oblivious (EO)
$E_{eo} = R P_{1}$
$T_{eo} = R$
[Figure: CPU and disk timelines with idle and active periods.]
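Direct Deactivation (DD)
$E_{dd} = E^{deact}_{1,f'} + R P_{f'} + E^{act}_{f'}$
$T_{dd} = R + T^{act}_{f'}$
Final state $f'$: $E^{deact}_{1,f'} + R P_{f'} + E^{act}_{f'} \le E^{deact}_{1,f'+1} + R P_{f'+1} + E^{act}_{f'+1}$
[Figure: CPU and disk timelines with idle and active periods.]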
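A small sketch of the trade-off under direct deactivation, assuming the DD energy is deactivation energy plus low-power residency for the whole runlength plus reactivation energy, and that reactivation time is added to the runlength. The standby power is from the experimental setup; the transition energy and time values are placeholders, and the function names are illustrative.

```python
def dd_energy(R, power, deact_energy, act_energy):
    """Energy in joules when the device is deactivated for the whole runlength."""
    return deact_energy + R * power + act_energy

def dd_time(R, act_time):
    """Elapsed time in seconds: the application waits for reactivation."""
    return R + act_time

def dd_slowdown(R, act_time):
    """Fractional slowdown relative to an energy-oblivious run of length R."""
    return dd_time(R, act_time) / R - 1.0

R = 30.0                                  # assumed runlength in seconds
eo = R * 0.9                              # staying in idle at 0.9 W
dd = dd_energy(R, 0.22, 1.0, 3.0)         # standby, placeholder transition energies
slowdown = dd_slowdown(R, 1.5)            # placeholder reactivation time
```

Under these assumptions DD uses well under half the energy of staying idle but pays the reactivation delay on every access, which is the cost that pre-activation is meant to hide.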
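Pre-Activation (PA)
$E_{pa} = \sum_{s=1}^{f''-1} (P_{s} T_{s} + E^{deact}_{s,s+1}) + (R - \sum_{s=1}^{f''-1} T_{s} - T^{act}_{f''}) P_{f''} + E^{act}_{f''}$
$T_{pa} = R$
Final state $f''$: $\sum_{s=1}^{f''-1} T_{s} + T^{act}_{f''} \le R$
[Figure: CPU and disk timelines with idle and active periods.]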
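Fixed-Threshold (FT)
$E_{ft} = \sum_{s=1}^{f^*-1} (P_{s} T_{s} + E^{deact}_{s,s+1}) + (R - \sum_{s=1}^{f^*-1} T_{s}) P_{f^*} + E^{act}_{f^*}$
$T_{ft} = R + T^{act}_{f^*}$
Final state $f^*$: $\sum_{s=1}^{f^*-1} T_{s} \le R$
[Figure: CPU and disk timelines with idle and active periods.]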
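The fixed-threshold policy can be sketched as a walk down the power states: the device stays in each state for its inactivity threshold, pays a deactivation cost, and moves to the next state until the runlength is used up. The powers are from the experimental setup; the thresholds and transition energies below are placeholders, and `ft_energy` is an illustrative name.

```python
def ft_energy(R, powers, thresholds, deact_energies, act_energies):
    """Return (energy in joules, index of final state) for a runlength R.

    powers[i] and act_energies[i] describe state i; thresholds[i] and
    deact_energies[i] describe the step from state i to state i + 1.
    """
    energy, elapsed, state = 0.0, 0.0, 0
    # Step down while the next threshold expires before the next access.
    while state < len(thresholds) and elapsed + thresholds[state] <= R:
        energy += powers[state] * thresholds[state] + deact_energies[state]
        elapsed += thresholds[state]
        state += 1
    # Remaining time is spent in the final state, then the device reactivates.
    energy += (R - elapsed) * powers[state] + act_energies[state]
    return energy, state

POWERS = [0.9, 0.22, 0.09]     # idle, standby, sleep (from the slides)
THRESHOLDS = [5.0, 10.0]       # placeholder inactivity thresholds in seconds
DEACT = [1.0, 0.5]             # placeholder deactivation energies in joules
ACT = [0.0, 3.0, 5.0]          # placeholder reactivation energies in joules

short_run = ft_energy(3.0, POWERS, THRESHOLDS, DEACT, ACT)
medium_run = ft_energy(10.0, POWERS, THRESHOLDS, DEACT, ACT)
long_run = ft_energy(60.0, POWERS, THRESHOLDS, DEACT, ACT)
```

Unlike direct deactivation, this policy spends each threshold period in a higher-power state before stepping down, so it wastes energy waiting whenever the runlength could have been known in advance.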
Terminology
Blocking device accesses (reads)
Single ready-to-run application
Runlength (R): time between device accesses by the processor
[Figure: timeline alternating CPU Time and Device Time, with each runlength R marked.]
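As a small illustration of the runlength definition, the sketch below derives runlengths from a list of device access timestamps. It assumes the timestamps are sorted and ignores the device service time itself; `runlengths` and the sample trace are illustrative.

```python
def runlengths(access_times):
    """Gaps in seconds between consecutive device accesses."""
    return [later - earlier
            for earlier, later in zip(access_times, access_times[1:])]

trace = [0.0, 2.0, 5.0, 5.5]   # made-up access timestamps in seconds
gaps = runlengths(trace)
```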