main.pdf - ACFrOgAY0p6FJis LQzIDDW2AS7dZHhoF ... · (PrimeTime ® PX J-2014.12-SP2) Average Power...


Page 3:

Figure (plot; caption not recovered): sampling distribution of the sample mean. Axis labels: frequency, sample mean. The mathematical annotations were lost in extraction.

Page 4:

Figure (diagram; caption not recovered). Labels: Full Program Execution (hundreds of billions of cycles); Random Sampling; Replayed RTL Snapshots (L = a few thousand cycles); snapshots S1, S2, S3, S4, S5, S6, ..., S29, S30.

Legend (S# marks a numbered snapshot): Cycle selected to create a replayable RTL snapshot. A replayable RTL snapshot containing all register state and I/O traces over the replay length. A replayed RTL snapshot on slow power simulator. Full RTL simulation running on fast simulator.

Page 5:

Figure (diagram; caption not recovered). Labels: RAM; Address Generation; Mux. Transformation steps shown: FAME1 Transform; Add Register Scan Chains; Add RAM Scan Chains.

Legend: Token; Communication Channel; Module Port; Module containing comb. logic; Register; RAM (SRAM/BRAM).

Figure (toolchain diagram; caption not recovered). Labels: Chisel RTL; Chisel Frontend; Scan Chain Insertion; Simulation Metadata Dump; Channel Wrapping; FAME1 Transform; Platform Mapping; Chisel Backend; Strober FPGA Simulator.

Page 6:

Figure (tool flow diagram; caption not recovered). Labels: Chisel RTL; Chisel Verilog Backend; Verilog RTL; FPGA Simulation; Samples; Matching Points; Post-layout Design; Signal Activities; Average Power.

Tools named in the diagram:

Logic Synthesis Tool (Design Compiler ® J-2014.09-SP4)

Placement and Route Tool (IC Compiler ™ J-2014.09-SP4)

Verification Tool (Formality ® J-2014.09-SP4)

Gate-level Simulation (VCS ® H-2013.06)

Power Analysis Tool (PrimeTime ® PX J-2014.12-SP2)


Page 8:

Figure (processor block diagram; caption not recovered). Block labels: RegFile; ICache; Uncore; LSU; Rename Table; FPU; ROB; Free List; Issue Window; Branch Predictor; ALUs; Fetch Buffer; DCache; DCache Control; IDIV; IMUL; Busy Table; Bypasses.


Page 9:

Figure (bar chart; caption not recovered). Vertical axis: Error, 0.00% to 3.00%. Horizontal axis: benchmarks vvadd, towers, dhrystone, qsort, spmv, dgemm, each with runs 1 to 5. Series: Theoretical Error Bound (99% Confidence); Actual Error.


Page 10:

Figure (stacked bar chart; caption not recovered). Vertical axis: mW, 0 to 500. Horizontal axis: rocket, boom 1-w, boom 2-w for each of Coremark, Linux Boot, gcc. Stack components: DRAM; Misc; Uncore; L1 D-cache control; L1 D-cache meta+data; L1 I-cache; ROB; FPU; LSU; Integer Unit; Issue Logic; Register File; Rename + Decode Logic; Fetch Unit.

Figure (chart; caption not recovered). Two vertical axes, labeled EPI (nJ/Inst) and CPI, with scales 0 to 1.8 and 0 to 4. Horizontal axis: rocket, boom 1-w, boom 2-w for each of Coremark, Linux Boot, gcc. Series: CPI; EPI (nJ).
