Dec 3, 2008    Sheth: MS Thesis
A Hardware-Software Processor Architecture Using Pipeline Stalls for Leakage Power Management
Khushboo Sheth
Master’s Thesis Defense
December 3, 2008
Thesis Committee:
Dr. Vishwani Agrawal, Advisor
Dr. Victor Nelson
Dr. Adit Singh
Outline
• Motivation
• Background
• NOP-cycle method for energy saving
• Comparison of Reference method with NOP-cycle method
• Architecture Modification
• Power Management Techniques
• Sleep mode operation
• Drowsy mode operation
• Conclusion
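Power components in CMOS circuit
[Figure: CMOS gate model between VDD and Ground, with on-resistance Ron, a large off-resistance, load capacitance CL, input vi(t), and output vo(t)]
• Dynamic power
• Short circuit power
• Leakage power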
Motivation
Technology scaling:
• Per-transistor dynamic power decreases
• Per-transistor leakage power increases
• Number of transistors increases
Contribution of leakage increases
Reduction in threshold voltage
[Figure labels: Leakage, Gate size, Power Density]
Processor Power Trend
Processor power increases every generation
Objective of This Work
• Explore power management for a processor at the architecture level.
• Reduce power and minimize leakage energy.
• Propose and evaluate a new hardware-software technique for power management.
Background
A simple technique to reduce power is to slow down the clock:
• Dynamic power is reduced in proportion to the clock rate.
• Leakage power remains unchanged.
A computing task takes longer in the power saving mode:
• Consumes the same dynamic energy
• Consumes more leakage energy
We use this as a reference method.
Clock-Slowdown (Reference) Method
Normal operation:
• Rated clock frequency, f
• Dynamic power, Pd
• Static power, Ps
• Total power, Pd + Ps
• Energy consumed by an N-cycle task = (Pd + Ps) N/f
Power saving mode:
• Clock frequency, f/n
• Dynamic power, Pd/n
• Static power, Ps
• Total power, P(n) = Pd/n + Ps
• Energy consumed by an N-cycle task, E(N,n) = (Pd + nPs) N/f
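The two operating modes of the reference method can be checked numerically. This is a minimal Python sketch with illustrative values for Pd, Ps, f, and N; the function name `clock_slowdown` and all numbers are assumptions, not taken from the thesis.

```python
def clock_slowdown(pd, ps, f, n_cycles, n=1):
    """Return (total power, task energy) for a clock running at f/n.

    pd, ps: dynamic and static power at the rated frequency f.
    n_cycles: task length N in clock cycles. n = 1 is normal operation.
    """
    power = pd / n + ps              # P(n) = Pd/n + Ps
    run_time = n_cycles * n / f      # N cycles at frequency f/n
    energy = power * run_time        # E(N,n) = (Pd + n*Ps) * N / f
    return power, energy


# Illustrative (assumed) numbers: Pd = 2, Ps = 1, f = 100, N = 1000.
normal = clock_slowdown(2.0, 1.0, 100.0, 1000)
slowed = clock_slowdown(2.0, 1.0, 100.0, 1000, n=2)
print(normal, slowed)
```

With these numbers, halving the clock lowers power from 3 to 2 but raises the task energy from 30 to 40, because leakage is paid for twice as long.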
Power Saving Ratio
P-ratio = P(1)/P(n)
        = n(Pd + Ps)/(Pd + nPs)
        = n(k+1)/(k+n), where k = Pd/Ps
Low leakage technology, k >> 1:
  P-ratio ≈ n
High leakage technology, k ≤ 2:
  P-ratio = 3n/(n+2) for k = 2
          = 2n/(n+1) for k = 1
          = 3n/(2n+1) for k = 0.5
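The closed form n(k+1)/(k+n) can be tabulated for the leakage cases listed above. This is a small sketch; the function name `p_ratio_clock` is hypothetical, and k = 1000 is an assumed stand-in for "k >> 1".

```python
def p_ratio_clock(k, n):
    """Power saving ratio P(1)/P(n) = n(k+1)/(k+n), with k = Pd/Ps."""
    return n * (k + 1) / (k + n)


# One row per technology; columns are slowdown factors n = 1..5.
for k in (1000.0, 2.0, 1.0, 0.5):
    print(k, [round(p_ratio_clock(k, n), 2) for n in range(1, 6)])
```

The rows show the saving approaching n only when leakage is negligible; for small k the ratio flattens quickly as n grows.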
Power Saving Ratio, P-ratio
[Plot: P-ratio (1 to 5) versus clock slowdown factor, n (1 to 5); curves for low leakage (k >> 1), k = 2, k = 1, and k = 0.5]
Energy Saving Ratio
E-ratio = E(N,1)/E(N,n)
        = (Pd + Ps)/(Pd + nPs) = P-ratio/n
        = (k+1)/(k+n), where k = Pd/Ps
Low leakage technology, k >> 1:
  E-ratio ≈ 1
High leakage technology, k ≤ 2:
  E-ratio = 3/(n+2) for k = 2
          = 2/(n+1) for k = 1
          = 3/(2n+1) for k = 0.5
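Because the E-ratio is at most 1 for clock slowdown, its reciprocal is the factor by which task energy grows. The sketch below evaluates that penalty; the function names are hypothetical and the sample values of k and n are chosen only for illustration.

```python
def e_ratio_clock(k, n):
    """Energy saving ratio E(N,1)/E(N,n) = (k+1)/(k+n), with k = Pd/Ps."""
    return (k + 1) / (k + n)


def energy_increase(k, n):
    """Factor by which task energy grows under clock slowdown (1/E-ratio)."""
    return 1.0 / e_ratio_clock(k, n)


for k in (2.0, 1.0, 0.5):
    print(k, [round(energy_increase(k, n), 2) for n in range(1, 6)])
```

For k = 0.5 and n = 5 the task consumes roughly 3.67 times its normal energy, which is why the reference method is unattractive in high-leakage technologies.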
Energy Saving Ratio, E-ratio
[Plot: 1/E-ratio (0 to 4) versus clock slowdown factor, n (1 to 5); curves for low leakage (k >> 1, no energy increase), k = 2, k = 1, and k = 0.5; axis annotation "Energy increase →"]
Instruction Slowdown: New Energy Saving Method
• Maintain rated clock frequency (f).
• Instruction slowdown factor, m, where m ≥ 0; power management hardware inserts m NOPs per instruction.
• Provide hardware sleep modes to reduce NOP power:
  Power control signals generated by control logic
  • ALU powered down
  • Register file clocks gated
  • Memory sleep mode
  • Pipeline register clocks gated
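The effect of the slowdown factor m on the instruction stream can be modeled in software. This is only an illustrative sketch: in the proposed design the NOPs are inserted by power management hardware, and the function name `insert_nops` and the instruction mnemonics are assumptions.

```python
def insert_nops(instructions, m):
    """Return the issue sequence with m NOPs following every instruction."""
    if m < 0:
        raise ValueError("slowdown factor m must be >= 0")
    issued = []
    for instr in instructions:
        issued.append(instr)        # one active instruction cycle
        issued.extend(["nop"] * m)  # m cycles in which idle units can sleep
    return issued


program = ["lw", "add", "sw"]
print(insert_nops(program, 2))
```

A program of N instructions occupies N(m + 1) cycles at the rated clock, so m = 0 reproduces normal operation.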
Power Consumed With NOPs
1 second (f cycles):
• f/(m+1) instruction cycles, Energy = P/(m+1)
• mf/(m+1) NOP cycles, Energy = mβP/(m+1)
P = power consumed by instruction cycles
P/f = energy consumed per instruction cycle
βP/f = energy consumed per NOP cycle
β = reduction factor (0 ≤ β ≤ 1) due to power down/sleep modes
Power = P(1 + mβ)/(m + 1)
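The average-power expression can be cross-checked by adding up per-cycle energies over one second. The sketch below does this by brute force; the function name `average_power` and the small cycle count f = 1200 (chosen so that it divides evenly by m + 1 for small m) are assumptions for illustration.

```python
def average_power(p, m, beta, f=1200):
    """Average power over one second of f cycles.

    Each instruction cycle costs p/f; each of the m NOP cycles that follow
    it costs beta*p/f.
    """
    energy = 0.0
    for cycle in range(f):
        is_instruction = cycle % (m + 1) == 0
        energy += (p if is_instruction else beta * p) / f
    return energy  # energy spent in one second equals average power


p, m, beta = 3.0, 2, 0.5
print(average_power(p, m, beta), p * (1 + m * beta) / (m + 1))
```

Both printed values agree up to floating-point rounding, confirming Power = P(1 + mβ)/(m + 1) for this case.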
NOP-Cycles Method
Normal operation:
• Rated clock frequency, f, m = 0
• Dynamic power, Pd
• Static power, Ps
• Total power, Pd + Ps
• Energy consumed by an N-cycle task = (Pd + Ps) N/f
Power saving mode:
• Clock frequency, f
• Dynamic power, Pd (1 + mβ)/(m + 1)
• Static power, Ps (1 + mβ)/(m + 1)
• Total power, P(m) = (Pd + Ps)(1 + mβ)/(m + 1)
• Energy consumed by an N-cycle task, E(N,m) = (Pd + Ps) [(1 + mβ)/(m + 1)] N(m + 1)/f = (Pd + Ps)(1 + mβ) N/f
Power and Energy Saving Ratio
P-ratio = P(0)/P(m)
        = (m + 1)/(1 + mβ)
E-ratio = E(N,0)/E(N,m)
        = 1/(1 + mβ)
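Both ratios for the NOP-cycle method depend only on m and β, so they are easy to tabulate. This is a minimal sketch with hypothetical function names; the β values are sample points chosen for illustration.

```python
def p_ratio_nop(m, beta):
    """Power saving ratio P(0)/P(m) = (m + 1)/(1 + m*beta)."""
    return (m + 1) / (1 + m * beta)


def e_ratio_nop(m, beta):
    """Energy saving ratio E(N,0)/E(N,m) = 1/(1 + m*beta)."""
    return 1 / (1 + m * beta)


for beta in (0.0, 0.1, 0.5, 1.0):
    row = [(round(p_ratio_nop(m, beta), 2), round(e_ratio_nop(m, beta), 2))
           for m in range(5)]
    print(beta, row)
```

In the ideal case β = 0 the power saving equals m + 1 with no energy penalty, while β = 1 (no sleep mode) gives no power saving and m + 1 times the energy.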
Power Saving Ratio, P-ratio
[Plot: P-ratio (1 to 5) versus instruction slowdown factor, m (0 to 4); curves for β = 0 (ideal case), β = 0.1, β = 0.33, β = 0.5, and β = 1; axis annotation "Decreasing power →"]
Energy Saving Ratio, E-ratio
[Plot: 1/E-ratio (1 to 5) versus instruction slowdown factor, m (0 to 4); curves for β = 0, β = 0.1, β = 0.33, β = 0.5, and β = 1; axis annotation "Increasing energy →"]
Comparing Two Cases
Energy(Clock slowdown)/Energy(Instruction slowdown) = (k + m + 1) / [(k + 1)(1 + mβ)]
where n = m + 1 and k = Pd/Ps
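Requiring this ratio to exceed 1 gives the condition under which instruction slowdown uses less energy than clock slowdown. The derivation below is a sketch that follows from the formula above; the resulting threshold on β is derived here and is not stated explicitly on the slide.

```latex
\frac{E_{\text{clock}}}{E_{\text{instr}}}
  = \frac{k + m + 1}{(k+1)(1 + m\beta)} > 1
\;\Longleftrightarrow\;
k + m + 1 > (k+1) + (k+1)\,m\beta
\;\Longleftrightarrow\;
m > (k+1)\,m\beta
\;\Longleftrightarrow\;
\beta < \frac{1}{k+1} \qquad (m > 0)
```

For k = 1 the break-even point is β = 0.5, and for k = 2 it is β ≈ 0.33; below these values the NOP cycles are cheap enough for instruction slowdown to win.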
Clock Slowdown Vs. Instruction Slowdown, β = 1 (No Sleep Mode)
[Plot: Energy ratio (0 to 4) versus slowdown factor, m or n-1 (0 to 4); curves for k = 0.5, k = 1, k = 2, and k >> 1; axis annotation "Advantage →"]
Clock Slowdown Vs. Instruction Slowdown, β = 0.5 (Sleep Mode)
[Plot: Energy ratio (0 to 4) versus slowdown factor, m or n-1 (0 to 4); curves for k = 0.5, k = 1, k = 2, and k >> 1; axis annotation "Advantage →"]
Clock Slowdown Vs. Instruction Slowdown, β = 0.1 (Sleep Mode)
[Plot: Energy ratio (0 to 4) versus slowdown factor, m or n-1 (0 to 4); curves for k = 0.5, k = 1, k = 2, and k >> 1; axis annotation "Advantage →"]
32 Bit MIPS pipeline processor
Modified Architecture
Slow down signal
ALU, Data memory and Register File put to sleep mode
Power Management Techniques
Clock Gating:
• Clock signal halted in idle devices
• Switching activity reduced
• Leakage power unaffected
• A glitch can cause the clock to turn off or on falsely for a short time
Enabled Flip-Flops:
• Registers replaced by equivalents with an enable signal
• When disabled, outputs do not change
• Reduces switching activity, but the clock is still active and consumes a lot of power
• Less effective
Sleep Mode Operation
Activity of the entire system is monitored rather than that of the individual modules.
If the system has been idle for some predetermined time-out duration, then the entire system is shut down and enters what is known as sleep mode.
System inputs are monitored for activity, which will then trigger the system to wake up and resume processing.
Overhead in time and power associated with entering and leaving sleep mode.
Trade-offs to be made in setting the length of the desired time-out period.
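The time-out trade-off can be illustrated with a simple energy model for one idle interval. This is a hypothetical sketch, not the thesis's model: the function name `idle_energy`, the power levels, and the wake-up energy are all assumed values.

```python
def idle_energy(idle_len, timeout, p_idle, p_sleep, e_wake):
    """Energy spent over one idle interval under a time-out sleep policy.

    idle_len, timeout: durations in the same time unit.
    p_idle, p_sleep: power while idle-but-awake and while asleep.
    e_wake: fixed energy overhead for entering and leaving sleep mode.
    """
    if idle_len <= timeout:
        return p_idle * idle_len                 # time-out never expires
    awake = p_idle * timeout                     # waiting for the time-out
    asleep = p_sleep * (idle_len - timeout)      # remainder spent asleep
    return awake + asleep + e_wake               # overhead paid once


# Assumed numbers: a long idle period benefits from a short time-out.
for timeout in (2, 10, 50):
    print(timeout, idle_energy(40, timeout, 2.0, 0.5, 3.0))
```

A short time-out saves the most energy on long idle periods, but it also triggers the wake-up overhead more often when idle periods turn out to be brief.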
Implementing Sleep Mode
Power-gating technique
Suitably sized header or footer transistor for a circuit block
Sleep signal applied to the gate of the header or footer transistor to turn-off the supply voltage of the circuit block
When circuit block is being requested for use, the sleep signal is de-asserted to restore the voltage at the virtual Vdd.
Drowsy mode for memories
To retain the information stored in the memory cells when they are switched to low-power mode, drowsy mode provides a better solution
High-threshold (high-Vt) transistor used to separate virtual Vdd from Vdd supply line
Supplies a very low voltage to the cell when it is switched into low-power mode
High-Vt device drastically reduces the leakage of the circuit because of the exponential dependence of leakage on Vt
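The exponential dependence of leakage on Vt can be made concrete with the standard subthreshold-current model, in which current scales as exp(-Vt / (n·kT/q)). The sketch below is an illustration under stated assumptions: the function name `leakage_ratio`, the slope factor n = 1.5, and the temperature of 300 K are assumed, not taken from the thesis.

```python
import math

BOLTZMANN = 1.380649e-23          # J/K
ELECTRON_CHARGE = 1.602176634e-19  # C


def leakage_ratio(delta_vt, slope_factor=1.5, temp_k=300.0):
    """Relative subthreshold leakage after raising Vt by delta_vt volts.

    Uses I_sub proportional to exp(-Vt / (n * kT/q)); slope_factor is the
    assumed subthreshold slope factor n.
    """
    thermal_voltage = BOLTZMANN * temp_k / ELECTRON_CHARGE  # about 25.9 mV
    return math.exp(-delta_vt / (slope_factor * thermal_voltage))


for dv in (0.0, 0.1, 0.2):
    print(dv, leakage_ratio(dv))
```

Under these assumptions, raising Vt by 100 mV cuts subthreshold leakage by more than a factor of ten.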
Conclusion
For higher-leakage technologies, the hardware-software technique inserts pipeline stalls in the processor while maintaining the processor's clock rate. The hardware units are designed to save leakage power while processing a NOP instruction by putting the idle blocks into sleep mode.
This technique is more effective when a NOP cycle consumes less than 50% of the power of a regular instruction cycle.
Future work includes considering the power of the active cycles and applying voltage reduction when reducing the clock frequency, if the performance penalty can be tolerated.
Thank You !!