Optimal Dynamic Frequency Scaling for Energy-Performance of Parallel MPI Programs
Jean-Claude Charr, Raphaël Couturier, Ahmed Fanfakh and Arnaud Giersch
FEMTO-ST - DISC Department - AND Team
June 3rd, 2014
Outline
1. Definitions and objectives
2. Energy and performance models
3. Performance and energy reduction trade-off
4. Experimental results and comparison
5. Conclusions and future works
Definitions
• Modern processors provide the Dynamic Voltage and Frequency Scaling (DVFS) technique.
• DVFS is used to reduce the frequency and thus the energy consumed by a CPU while computing.
• But scaling the frequency to a lower level reduces the performance (increases the execution time) of a parallel program.
• The energy consumption of an individual processor depends on two power metrics: the static power Pstatic and the dynamic power Pdyn.
Definitions
• $P_{dyn} = \alpha \cdot C_L \cdot V^2 \cdot F$.
• $P_{static} = V \cdot N_{trans} \cdot K_{design} \cdot I_{leak}$.
• Energy consumption of an individual processor in a synchronous parallel program: $E_{ind} = P_{dyn} \cdot T_{Comp} + P_{static} \cdot (T_{Comp} + T_{Comm})$.
• The frequency scaling factor is the ratio between the maximum and the new frequency: $S = \frac{F_{max}}{F_{new}}$.
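The definitions above can be checked numerically. The following Python sketch evaluates E_ind at two scaling factors. The function name and the sample times are hypothetical, and the scaling behaviour is an assumption not stated on this slide: voltage is taken to scale linearly with frequency, so dynamic power drops as S^-3 while computation time grows as S, and communication time and static power are left unchanged.

```python
def individual_energy(p_dyn, p_static, t_comp, t_comm, s=1.0):
    """Energy (J) of one processor at scaling factor s = Fmax / Fnew.

    Assumptions (illustrative, not from the slide): dynamic power scales
    as s**-3, computation time scales as s, communication time and
    static power do not depend on s.
    """
    if s < 1.0:
        raise ValueError("scaling factor must be >= 1")
    p_dyn_scaled = p_dyn * s ** -3
    t_comp_scaled = t_comp * s
    return p_dyn_scaled * t_comp_scaled + p_static * (t_comp_scaled + t_comm)


# Hypothetical workload: 10 s of computation, 2 s of communication.
e_full_speed = individual_energy(20.0, 4.0, t_comp=10.0, t_comm=2.0)
e_half_speed = individual_energy(20.0, 4.0, t_comp=10.0, t_comm=2.0, s=2.0)
```

With these sample numbers, halving the frequency lowers the dynamic part of the energy but raises the static part, which is the trade-off the rest of the talk explores.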
Objectives
• Study the effect of the scaling factor S on the energy consumption of parallel iterative applications such as the NAS benchmarks.
• Study the effect of the scaling factor S on the performance of these benchmarks.
• Discover the energy-performance trade-off relation when changing the frequency.
• Propose an algorithm for selecting the scaling factor S that produces the optimal trade-off between energy and performance.
• Improve on Rauber and Rünger’s method¹, on which our method is based.
¹ Thomas Rauber and Gudula Rünger. Analytical modeling and simulation of the energy consumption of independent tasks. In Proceedings of the Winter Simulation Conference, 2012.
Energy model for homogeneous platform
Under frequency scaling, the dynamic energy falls as a power of the scaling factor S ($S^{-2}$ in the model below), while the static energy grows linearly with S.
Rauber and Rünger’s energy model
$E = P_{dyn} \cdot S_1^{-2} \cdot \left( T_1 + \sum_{i=2}^{N} \frac{T_i^3}{T_1^2} \right) + P_{static} \cdot S_1 \cdot T_1 \cdot N$
where $S_1$ is the maximum scaling factor, $T_1$ is the time of the slowest task, $T_i$ are the times of the other tasks and $N$ is the number of nodes.
Rauber and Rünger’s optimal scaling factor
$S_{opt} = \sqrt[3]{\frac{2}{N} \cdot \frac{P_{dyn}}{P_{static}} \cdot \left( 1 + \sum_{i=2}^{N} \frac{T_i^3}{T_1^3} \right)}$
They reduce the performance degradation by assigning the highest frequency to the slowest task.
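As a rough check of the two formulas on this slide, the Python sketch below evaluates Rauber and Rünger's energy model and their closed-form optimal scaling factor. The function names and the sample task times are hypothetical. The power values are the 20 W and 4 W used in the experiments later in the talk.

```python
def _dynamic_term(times):
    """T1 + sum_{i>=2} Ti^3 / T1^2, where T1 is the slowest task time."""
    ordered = sorted(times, reverse=True)
    t1, others = ordered[0], ordered[1:]
    return t1 + sum(t ** 3 for t in others) / t1 ** 2


def rauber_energy(s1, times, p_dyn, p_static):
    """Energy predicted by the model for scaling factor s1."""
    t1 = max(times)
    n = len(times)
    return p_dyn * s1 ** -2 * _dynamic_term(times) + p_static * s1 * t1 * n


def rauber_optimal_factor(times, p_dyn, p_static):
    """Closed-form scaling factor that minimizes rauber_energy."""
    t1 = max(times)
    n = len(times)
    ratio_sum = _dynamic_term(times) / t1  # equals 1 + sum Ti^3 / T1^3
    return ((2.0 / n) * (p_dyn / p_static) * ratio_sum) ** (1.0 / 3.0)


# Hypothetical balanced workload: four tasks of 10 s each.
sample_times = [10.0, 10.0, 10.0, 10.0]
s_opt = rauber_optimal_factor(sample_times, p_dyn=20.0, p_static=4.0)
e_opt = rauber_energy(s_opt, sample_times, p_dyn=20.0, p_static=4.0)
```

Setting the derivative of E with respect to S_1 to zero gives the same cube root, so the energy at `s_opt` should be lower than at nearby factors. Note that this factor minimizes energy only and ignores the resulting slowdown.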
Slack times of the sync. parallel program
[Figure: timelines of tasks 1 to N between two barriers, showing the idle (slack) time of each task. (a) Sync. imbalanced communications. (b) Sync. imbalanced computations.]
$ProgramTime = \max_{i=1,2,\dots,N} T_i$
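A minimal Python illustration of this definition follows. It treats the slack of a task as the time it waits at the barrier for the slowest task. The function name and the sample times are hypothetical.

```python
def program_time_and_slack(task_times):
    """Return the program time (slowest task) and each task's idle time."""
    program_time = max(task_times)
    slack = [program_time - t for t in task_times]
    return program_time, slack


# Hypothetical task times in seconds.
prog_time, slack_times = program_time_and_slack([4.0, 2.5, 3.0])
```

The slowest task has no slack, so lowering the frequency of the other tasks until their slack disappears costs no additional program time.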
Performance evaluation of MPI programs
Execution time prediction model
$T_{new} = T_{MaxCompOld} \cdot S + T_{MaxCommOld}$
[Figure: normalized predicted time and normalized real time versus frequency scaling factor (1 to 3), for CG Class B and LU Class B.]
The maximum normalized error is 0.0073 for CG (the smallest) and 0.031 for LU (the largest).
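The prediction model on this slide only stretches the computation part of the execution time. A small Python sketch, with hypothetical function names and sample times, makes that explicit:

```python
def predict_time(t_max_comp_old, t_max_comm_old, s):
    """Predicted execution time at scaling factor s.

    Only the computation time is multiplied by s. The communication
    time is assumed not to depend on the CPU frequency.
    """
    return t_max_comp_old * s + t_max_comm_old


def normalized_time(t_max_comp_old, t_max_comm_old, s):
    """Predicted time divided by the time at maximum frequency (s = 1)."""
    t_old = t_max_comp_old + t_max_comm_old
    return predict_time(t_max_comp_old, t_max_comm_old, s) / t_old


# Hypothetical run: 8 s of computation and 2 s of communication.
t_predicted = predict_time(8.0, 2.0, s=2.0)
t_normalized = normalized_time(8.0, 2.0, s=2.0)
```

In this example, doubling the scaling factor increases the predicted time by a factor of 1.8 rather than 2, because the communication share is not slowed down.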
Performance and energy reduction trade-off
[Figure: (c) Real relation: normalized execution time and normalized energy versus frequency scaling factor (1 to 3). (d) Converted relation: normalized performance and normalized energy versus frequency scaling factor, with the optimal scaling factor marked.]
$Performance = \frac{1}{\text{execution time}}$
Our objective function
$MaxDist = \max_{j=1,2,\dots,F} \left( \overbrace{P_{Norm}(S_j)}^{\text{Maximize}} - \overbrace{E_{Norm}(S_j)}^{\text{Minimize}} \right)$
Scaling factor selection algorithm
Enumerate the available scaling factors and find $S_{optimal}$ for which $P_{Norm} - E_{Norm}$ is maximal.
Where:
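$E_{Norm} = \frac{E_{Reduced}}{E_{Original}} = \frac{P_{dyn} \cdot S_1^{-2} \cdot \left( T_1 + \sum_{i=2}^{N} \frac{T_i^3}{T_1^2} \right) + P_{static} \cdot T_1 \cdot S_1 \cdot N}{P_{dyn} \cdot \left( T_1 + \sum_{i=2}^{N} \frac{T_i^3}{T_1^2} \right) + P_{static} \cdot T_1 \cdot N}$
$P_{Norm} = \frac{T_{old}}{T_{new}} = \frac{T_{MaxCompOld} + T_{MaxCommOld}}{T_{MaxCompOld} \cdot S + T_{MaxCommOld}}$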
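The enumeration described on this slide can be sketched in a few lines of Python. This is an illustrative reading of the two formulas, not the authors' implementation: the function names, the candidate factors and the sample times are all hypothetical.

```python
def energy_norm(s, times, p_dyn, p_static):
    """E_Reduced / E_Original for scaling factor s."""
    t1 = max(times)
    n = len(times)
    others = sorted(times, reverse=True)[1:]
    dyn_term = t1 + sum(t ** 3 for t in others) / t1 ** 2
    reduced = p_dyn * s ** -2 * dyn_term + p_static * t1 * s * n
    original = p_dyn * dyn_term + p_static * t1 * n
    return reduced / original


def perf_norm(s, t_max_comp_old, t_max_comm_old):
    """T_old / T_new for scaling factor s."""
    t_old = t_max_comp_old + t_max_comm_old
    return t_old / (t_max_comp_old * s + t_max_comm_old)


def distance(s, times, t_comp, t_comm, p_dyn, p_static):
    """Objective to maximize: normalized performance minus normalized energy."""
    return perf_norm(s, t_comp, t_comm) - energy_norm(s, times, p_dyn, p_static)


def select_scaling_factor(factors, times, t_comp, t_comm, p_dyn, p_static):
    """Return the candidate factor with the largest distance."""
    return max(
        factors,
        key=lambda s: distance(s, times, t_comp, t_comm, p_dyn, p_static),
    )


# Hypothetical inputs: four balanced tasks of 10 s (8 s computation, 2 s
# communication) and candidate factors from 1.0 to 3.0 in steps of 0.1.
candidates = [1.0 + 0.1 * k for k in range(21)]
sample_times = [10.0] * 4
best = select_scaling_factor(candidates, sample_times, 8.0, 2.0, 20.0, 4.0)
```

At S = 1 both normalized quantities equal 1, so the distance is zero. A factor is only worth selecting if the energy curve drops faster than the performance curve, which is the case for the moderate factors in this example.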
Scaling factor selection algorithm
Algorithm characteristics
• It works online.
• It predicts both the energy consumption and performance.
• It simultaneously reduces the energy consumption and maintains the performance of iterative algorithms.
• It takes into account the communication time.
• It is well adapted to imbalanced tasks: $F_i = \frac{F_{max} \cdot T_i}{S_{optimal} \cdot T_{max}}$.
• It has a very small overhead. It takes 6.65 µs for 32 nodes.
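The per-task frequency formula in the list above can be illustrated with a short Python sketch. The function name and sample values are hypothetical, and the sketch returns the ideal frequency without rounding it to one of the frequencies the hardware actually offers.

```python
def task_frequencies(f_max, s_optimal, task_times):
    """F_i = F_max * T_i / (S_optimal * T_max) for every task.

    The slowest task runs at F_max / S_optimal. Faster tasks get
    proportionally lower frequencies so that they use up their slack.
    """
    t_max = max(task_times)
    return [f_max * t / (s_optimal * t_max) for t in task_times]


# Hypothetical case: F_max = 2.5 GHz, S_optimal = 1.25, tasks of 10 s and 5 s.
freqs_ghz = task_frequencies(2.5, 1.25, [10.0, 5.0])
```

Here the task that takes half as long as the slowest one is assigned half its frequency, so both reach the barrier at about the same time.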
Experimental results
• Our experiments are executed on the SimGrid/SMPI v3.10 simulator.
• Our algorithm is applied to the NAS parallel benchmarks.
• Each node in the cluster has 18 frequency values, from 2.5 GHz down to 800 MHz.
• We run classes A, B and C on 4, 8 or 9, and 16 nodes respectively.
• The dynamic power at the highest frequency is 20 W and the static power is 4 W.
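The frequency range above translates into the set of scaling factors that the selection algorithm enumerates. The Python sketch below assumes a uniform 100 MHz step between frequencies. That step is not stated in the slides; it is chosen because it yields exactly 18 values between 2.5 GHz and 800 MHz.

```python
# Assumed uniform 100 MHz step (hypothetical): 2500, 2400, ..., 800 MHz.
frequencies_mhz = list(range(2500, 700, -100))

# Scaling factor S = Fmax / Fnew for each available frequency.
f_max_mhz = frequencies_mhz[0]
scaling_factors = [f_max_mhz / f for f in frequencies_mhz]
```

Under this assumption the factors run from 1 at the maximum frequency to 3.125 at 800 MHz, which roughly matches the 1 to 3 range on the horizontal axis of the plots.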
Experimental results
[Figure: normalized performance and normalized energy versus frequency scaling factor (1 to 3) for three Class C benchmarks. Optimal scaling factor: 1.04 for EP, 1.56 for CG, 1.315 for BT.]

[Figure: bar chart of energy saving and performance degradation for the NAS benchmarks, Class C. Values read from the bar labels:]

Benchmark   Energy saving (%)   Performance degradation (%)
CG          39.23               14.88
MG          34.97               21.69
EP          22.14               20.73
LU          35.83               22.49
BT          29.6                21.28
SP          33.48               21.35
FT          34.72               19
Results comparison
[Figure: two bar charts of energy saving (%) and performance degradation (%) for the NAS benchmarks Class C (CG, MG, EP, LU, BT, SP, FT): our method, and Rauber and Rünger’s method.]
Results comparison
[Figure: energy to performance distance of the Rauber E-P method and of EPSA for the NAS benchmarks Class C (EP, CG, MG, BT, LU, SP, FT). Caption: Comparing our method with Rauber and Rünger’s method for NAS benchmarks Class C.]
Conclusions
• We have presented a new online scaling factor selection method that simultaneously optimizes energy and performance.
• It predicts the energy consumption and the performance of parallel applications.
• Our algorithm saves more energy when the communication and other slack times are large.
• It gives the best trade-off between energy reduction and performance.
• Our method outperforms Rauber and Rünger’s method in terms of energy-performance ratio.
Future works
• We will apply the proposed algorithm to a heterogeneous platform.
• The nodes of a heterogeneous platform differ in:
  - Dynamic and static power.
  - Individual energy consumption.
  - The available frequencies.
  - Performance capabilities.
• We will apply the proposed algorithm to a real cluster.
• We will apply the proposed algorithm to real applications.