Digital System Clocking: High-Performance and Low-Power Aspects Vojin G. Oklobdzija, Vladimir M....

27
Digital System Clocking: Digital System Clocking: High-Performance and Low-Power Aspects High-Performance and Low-Power Aspects Vojin G. Oklobdzija, Vladimir M. Stojanovic, Dejan M. Markovic, Nikola M. Nedovic Wiley-Interscience and IEEE Press, January 2003 Chapter 3: Timing and Energy Parameters

Transcript of Digital System Clocking: High-Performance and Low-Power Aspects Vojin G. Oklobdzija, Vladimir M....

Page 1: Digital System Clocking: High-Performance and Low-Power Aspects Vojin G. Oklobdzija, Vladimir M. Stojanovic, Dejan M. Markovic, Nikola M. Nedovic Wiley-Interscience.

Digital System Clocking:Digital System Clocking:Digital System Clocking:Digital System Clocking:High-Performance and Low-Power AspectsHigh-Performance and Low-Power Aspects

Vojin G. Oklobdzija, Vladimir M. Stojanovic, Dejan M. Markovic, Nikola M. Nedovic

Wiley-Interscience and IEEE Press, January 2003

Chapter 3: Timing and Energy Parameters

Page 2: Digital System Clocking: High-Performance and Low-Power Aspects Vojin G. Oklobdzija, Vladimir M. Stojanovic, Dejan M. Markovic, Nikola M. Nedovic Wiley-Interscience.

Nov. 14, 2003

Digital System Clocking: Oklobdzija, Stojanovic, Markovic, Nedovic 2

Basic timing diagram in flip-flops

Clk

D

Q

tCQ,LH tCQ,HL U H t < U Setup time violation

Definitions:

• Clock-to-Q Delay: tCQ low-to high= tCQ,LH high-to-low= tCQ,HL

• Setup time U

• Hold time H

Page 3: Digital System Clocking: High-Performance and Low-Power Aspects Vojin G. Oklobdzija, Vladimir M. Stojanovic, Dejan M. Markovic, Nikola M. Nedovic Wiley-Interscience.

Nov. 14, 2003

Digital System Clocking: Oklobdzija, Stojanovic, Markovic, Nedovic 3

Setup and hold time behavior as a function of clock-to-output delay

0

50

100

150

200

250

300

350

-200 -150 -100 -50 0 50 100 150 200

Data-Clk [ps]

Clk

-Ou

tpu

t[p

s]

Setup Hold

Neither Setup nor Hold time are fixed constant parameters.They are function of Data-to-Clock time “distance”.

Page 4: Digital System Clocking: High-Performance and Low-Power Aspects Vojin G. Oklobdzija, Vladimir M. Stojanovic, Dejan M. Markovic, Nikola M. Nedovic Wiley-Interscience.

Nov. 14, 2003

Digital System Clocking: Oklobdzija, Stojanovic, Markovic, Nedovic 4

Setup time behavior as a function of data-to-output delay

Data to Clock Delay

Dat

a to

Ou

tpu

t D

elay

DDQm

Constant Clk-Q Region

Failure Region

Variable Clk-Q Region

Data arrives early Data arrives late

D-Q

Clk-Q

Uopt

45o

When D-to-Q delay is observed, we can see that data can come closerto the triggering event than we thought.Data-to-Q is the RELEVANT parameter – NOT Clock-to-Q as many think !

Page 5: Digital System Clocking: High-Performance and Low-Power Aspects Vojin G. Oklobdzija, Vladimir M. Stojanovic, Dejan M. Markovic, Nikola M. Nedovic Wiley-Interscience.

Nov. 14, 2003

Digital System Clocking: Oklobdzija, Stojanovic, Markovic, Nedovic 5

D-Q and Clk-Q delay as a function of D-Clk offset

Determining the optimal setup Uopt and hold time Hopt.

Page 6: Digital System Clocking: High-Performance and Low-Power Aspects Vojin G. Oklobdzija, Vladimir M. Stojanovic, Dejan M. Markovic, Nikola M. Nedovic Wiley-Interscience.

Nov. 14, 2003

Digital System Clocking: Oklobdzija, Stojanovic, Markovic, Nedovic 6

Setup time, hold time, sampling window and clock width in a flip-flop

D Q

Clk

U H

Q

D

tCW

tCQ

FF

Clk

SamplingWindow

Page 7: Digital System Clocking: High-Performance and Low-Power Aspects Vojin G. Oklobdzija, Vladimir M. Stojanovic, Dejan M. Markovic, Nikola M. Nedovic Wiley-Interscience.

Nov. 14, 2003

Digital System Clocking: Oklobdzija, Stojanovic, Markovic, Nedovic 7

Latch: setup and hold time

D Q

ClkU H

Q1

D1

D2

Q2

(a)

(b)

tCW

tCQ

tDQ

L

Clk

(a) early data D1 arrival; (b) late data D2 arrival

Unlike Flip-Flop, Latch has two situations:

(a)Data is ready – waiting for the clock

(b)Latch is open – data arrives during the active clock signal

Page 8: Digital System Clocking: High-Performance and Low-Power Aspects Vojin G. Oklobdzija, Vladimir M. Stojanovic, Dejan M. Markovic, Nikola M. Nedovic Wiley-Interscience.

Nov. 14, 2003

Digital System Clocking: Oklobdzija, Stojanovic, Markovic, Nedovic 8

Illustration of a data path

Logic

ClktClk-Q

tLogic

U

D Q

Q

D Q

Q

tskew

H

A B

Page 9: Digital System Clocking: High-Performance and Low-Power Aspects Vojin G. Oklobdzija, Vladimir M. Stojanovic, Dejan M. Markovic, Nikola M. Nedovic Wiley-Interscience.

Nov. 14, 2003

Digital System Clocking: Oklobdzija, Stojanovic, Markovic, Nedovic 9

Late Data Arrival and Time Borrowing in a pipelined design

Source Destination

TCR1

Cycle-1

TCR2

TCR1> TCR2dw1>dw3>dw2

LogicD Q

Q

D Q

Q

Clk

Cycle-2dw1 dw2 dw3

Clk

Data-to-Output (Q) time window moves around the time axis; (becoming larger or smaller)

Page 10: Digital System Clocking: High-Performance and Low-Power Aspects Vojin G. Oklobdzija, Vladimir M. Stojanovic, Dejan M. Markovic, Nikola M. Nedovic Wiley-Interscience.

Nov. 14, 2003

Digital System Clocking: Oklobdzija, Stojanovic, Markovic, Nedovic 10

Early Data Arrival and Internal Race Imunity

• The maximum clock skew that system can tolerate is determined by the clock storage elements:– If Clk-Q delay of CSE is shorter than H, race

can occur if there is no logic inbetween.– If Clk-Q is greater than H + possible skew –

no problem:

tClk-Q> H+ tskew

• Internal race immunity:

R = tClk-Q-H

Page 11: Digital System Clocking: High-Performance and Low-Power Aspects Vojin G. Oklobdzija, Vladimir M. Stojanovic, Dejan M. Markovic, Nikola M. Nedovic Wiley-Interscience.

Nov. 14, 2003

Digital System Clocking: Oklobdzija, Stojanovic, Markovic, Nedovic 11

Impact of supply voltage on the sampling window

-500-300

-100100

300

1

1.3

1.6

1.9

2.2

2.5

0.5

1.0

1.5

2.0t C

lk-Q

[ns]

Sampling window determines the minimum required duration of data signal

Page 12: Digital System Clocking: High-Performance and Low-Power Aspects Vojin G. Oklobdzija, Vladimir M. Stojanovic, Dejan M. Markovic, Nikola M. Nedovic Wiley-Interscience.

Nov. 14, 2003

Digital System Clocking: Oklobdzija, Stojanovic, Markovic, Nedovic 12

Energy Parameters

Page 13: Digital System Clocking: High-Performance and Low-Power Aspects Vojin G. Oklobdzija, Vladimir M. Stojanovic, Dejan M. Markovic, Nikola M. Nedovic Wiley-Interscience.

Nov. 14, 2003

Digital System Clocking: Oklobdzija, Stojanovic, Markovic, Nedovic 13

Components of Energy Consumption

• Switching Energy• Short-Circuit Energy• Leakage Energy• Static Energy

Page 14: Digital System Clocking: High-Performance and Low-Power Aspects Vojin G. Oklobdzija, Vladimir M. Stojanovic, Dejan M. Markovic, Nikola M. Nedovic Wiley-Interscience.

Nov. 14, 2003

Digital System Clocking: Oklobdzija, Stojanovic, Markovic, Nedovic 14

Components of Energy Consumption

Tt

t

VDD diVEDD

)(

staticleakagecircuitshortswitching EEEEE

N

iDDswingiswitching ViVCiE

110 )()(

Switching Energy:

Energy consumed by CSE during one clock period T, where t is chosen to include all relevant transitions: arrival of new data, clock pulse, and output transition.

This energy has four components:

• N is the number of nodes • Ci is the capacitance of the node I 0-1(i) is the probability that a transition occurs at the

node i • Vswing(i) is voltage swing of the node I

Page 15: Digital System Clocking: High-Performance and Low-Power Aspects Vojin G. Oklobdzija, Vladimir M. Stojanovic, Dejan M. Markovic, Nikola M. Nedovic Wiley-Interscience.

Nov. 14, 2003

Digital System Clocking: Oklobdzija, Stojanovic, Markovic, Nedovic 15

Short-circuit current in an inverter

Vin

CL

Vout

ipull-up

iL

ishort-circ

Vin

CL

Vout

ipull-down

iL

ishort-circ

(a) (b)

(a) pull-up; (b) pull-down operation

Page 16: Digital System Clocking: High-Performance and Low-Power Aspects Vojin G. Oklobdzija, Vladimir M. Stojanovic, Dejan M. Markovic, Nikola M. Nedovic Wiley-Interscience.

Nov. 14, 2003

Digital System Clocking: Oklobdzija, Stojanovic, Markovic, Nedovic 16

Projected leakage currents

1

10

100

1,000

10,000

30 40 50 60 70 80 90 100 110

Temp (C)

Ioff

(n

a/

)

0.10m

0.13m

0.18m

0.25m

Assuming doubling of transistors / generation the leakage current will increase about 7.5 times corresponding to a 5 times increase in total leakage power

Leakage power will soon become a significant portion of the total power consumption in modern microprocessors

Page 17: Digital System Clocking: High-Performance and Low-Power Aspects Vojin G. Oklobdzija, Vladimir M. Stojanovic, Dejan M. Markovic, Nikola M. Nedovic Wiley-Interscience.

Nov. 14, 2003

Digital System Clocking: Oklobdzija, Stojanovic, Markovic, Nedovic 17

Where does the Energy go in CSE ?

1. Internal clocked nodes in storage elements

2. Internal non-clocked nodes in storage elements

3. Data and clock input load4. Output load.

Page 18: Digital System Clocking: High-Performance and Low-Power Aspects Vojin G. Oklobdzija, Vladimir M. Stojanovic, Dejan M. Markovic, Nikola M. Nedovic Wiley-Interscience.

Nov. 14, 2003

Digital System Clocking: Oklobdzija, Stojanovic, Markovic, Nedovic 18

Energy Breakdown

1. Internal clocking energy2. Data and Clock Input Energy3. Energy in Internal Non-clocked

Nodes4. Energy in Output Load5. Energy per Transition6. Glitching Energy

Page 19: Digital System Clocking: High-Performance and Low-Power Aspects Vojin G. Oklobdzija, Vladimir M. Stojanovic, Dejan M. Markovic, Nikola M. Nedovic Wiley-Interscience.

Nov. 14, 2003

Digital System Clocking: Oklobdzija, Stojanovic, Markovic, Nedovic 19

Energy Breakdown in Clocked-Storage Elements during one of the possible input

data transitions

E0-0 E0-1 E1-0 E1-1

EClk Y/N Y Y Y/N

Eint Y/N Y Y Y/N

Eext N Y/N Y/N NTwo cases:• Storage elements without pre-charge

nodes• Storage elements with pre-charge

nodes

Page 20: Digital System Clocking: High-Performance and Low-Power Aspects Vojin G. Oklobdzija, Vladimir M. Stojanovic, Dejan M. Markovic, Nikola M. Nedovic Wiley-Interscience.

Nov. 14, 2003

Digital System Clocking: Oklobdzija, Stojanovic, Markovic, Nedovic 20

CSE characterization and Test setup

D Q

Q

VDD-D VDD-CSE

VDD-Clk

CL

Dt

Clkt

D

Clk

0 1 1 0int 2 Clk Load

E EE E E

• Clock Energy

• Internal Energy:

• Energy in Output Load

Page 21: Digital System Clocking: High-Performance and Low-Power Aspects Vojin G. Oklobdzija, Vladimir M. Stojanovic, Dejan M. Markovic, Nikola M. Nedovic Wiley-Interscience.

Nov. 14, 2003

Digital System Clocking: Oklobdzija, Stojanovic, Markovic, Nedovic 21

Energy per transition• The energy-per-transition is the total energy consumed in a CSE

during one clock cycle for a specified input data transition: 0-0, 0-1, 1-0, or 1-1

• This metric is crucial in that it yields significant insight about circuit energy

• By inspection of the node activity in a CSE for different input data transitions, the energy-per-transition can be utilized to obtain the energy breakdown between clocked nodes, internal nodes, and the external output load.

• This forms a good basis for the study of alternative circuit techniques that deal with internal clock gating.

• The energy breakdown information also offers valuable information about the tradeoffs associated with reduced clocking energy and the energy penalty incurred by the clock-gating logic, thus providing a better understanding of the optimization goals for the overall design

Page 22: Digital System Clocking: High-Performance and Low-Power Aspects Vojin G. Oklobdzija, Vladimir M. Stojanovic, Dejan M. Markovic, Nikola M. Nedovic Wiley-Interscience.

Nov. 14, 2003

Digital System Clocking: Oklobdzija, Stojanovic, Markovic, Nedovic 22

Energy per transition

0 1

p E

p E

p E p E

0 0 0 0 0 1 0 1 1 0 1 0 1 1 1 1averageE p E p E p E p E

Page 23: Digital System Clocking: High-Performance and Low-Power Aspects Vojin G. Oklobdzija, Vladimir M. Stojanovic, Dejan M. Markovic, Nikola M. Nedovic Wiley-Interscience.

Nov. 14, 2003

Digital System Clocking: Oklobdzija, Stojanovic, Markovic, Nedovic 23

Glitching Energy in CSEs

Clk

D

1 2 3 4init. int.nodes

(D = 0)

init. int.nodes

(D = 1)pos. glitch(Clk = 0)

pos. glitch(Clk = 1)

neg. glitch(Clk = 0)

neg. glitch(Clk = 1)

• Glitches are generated by the unintended transitions propagating from the fan-in gates, termed propagating glitches.

• Glitches produced by non-glitch transitions at the inputs, called generated glitches

4

1igiglitchavg i

EE

Page 24: Digital System Clocking: High-Performance and Low-Power Aspects Vojin G. Oklobdzija, Vladimir M. Stojanovic, Dejan M. Markovic, Nikola M. Nedovic Wiley-Interscience.

Nov. 14, 2003

Digital System Clocking: Oklobdzija, Stojanovic, Markovic, Nedovic 24

Interface with Clock Network and Combinational Logic

• We assumed that the data and clock inputs were supplied by drivers with sufficient drive strength.

• The input clock and data capacitances are important interface parameters for the clock network and logic design.

• The clock network designer and logic designer need to be aware of these capacitances in order to design circuits that drive storage elements.

Page 25: Digital System Clocking: High-Performance and Low-Power Aspects Vojin G. Oklobdzija, Vladimir M. Stojanovic, Dejan M. Markovic, Nikola M. Nedovic Wiley-Interscience.

Nov. 14, 2003

Digital System Clocking: Oklobdzija, Stojanovic, Markovic, Nedovic 25

Interface with Clock Network and Combinational Logic

Interface with Combinational Logic:

The relevant parameters to the combinational logic designer are:

• CSE input data slope• Input data capacitance

The data slope affects performance and energy consumption of both driving logic and storage elements.

Clock and data slopes are generally not equal.

Page 26: Digital System Clocking: High-Performance and Low-Power Aspects Vojin G. Oklobdzija, Vladimir M. Stojanovic, Dejan M. Markovic, Nikola M. Nedovic Wiley-Interscience.

Nov. 14, 2003

Digital System Clocking: Oklobdzija, Stojanovic, Markovic, Nedovic 26

Interface with Clock Network and Combinational Logic

Interface with Clock Network:• CSEs are affected by clock skew and clock slope. • The total load of the clock distribution network is

defined by the input capacitance of the clock node and number of CSEs on a chip.

• Increase in clock slope results in degradation of the CSE performance - the clock network designer has to know what slopes CSE can tolerate.

• This is especially important if Flip-Flops are used. • The clock slope also affects energy consumption of the

clock distribution network. – If larger clock drivers with smaller fanout are used, the clock

edges are sharper and the storage element performance better, at the expense of an increase in energy consumption of the clock network.

– Optimal tradeoff is achieved with minimal energy consumption that delivers the desired storage element performance.

Page 27: Digital System Clocking: High-Performance and Low-Power Aspects Vojin G. Oklobdzija, Vladimir M. Stojanovic, Dejan M. Markovic, Nikola M. Nedovic Wiley-Interscience.

Nov. 14, 2003

Digital System Clocking: Oklobdzija, Stojanovic, Markovic, Nedovic 27

Interface with Clock Network and Combinational Logic

• To evaluate the total clocking energy per clock cycle in the entire clock subsystem, one needs to add the energy consumed in the clock distribution network.

• The energy consumed in the clock distribution network depends on the total switched capacitance which is determined by the total number of clocked storage elements on a chip and the input capacitance of their clock inputs, the total wiring capacitance, and the total switched capacitance of clock drivers as given by:

,distrib net FF in Clk FF wire sw buffC N C C C

The last two terms depend on buffer insertion/placement strategy and should be minimized.