ARCHITECTURAL TECHNIQUES TO ENHANCE DRAM SCALING Thesis Defense Yoongu Kim

Transcript of ARCHITECTURAL TECHNIQUES TO ENHANCE DRAM...

Page 1: ARCHITECTURAL TECHNIQUES TO ENHANCE DRAM …research.ece.cmu.edu/safari/thesis/ykim_defense_slides.pdfCO-DESIGN: CPU & DRAM CPU Memory Management DRAM Controller DRAM Arch & Interface

ARCHITECTURAL TECHNIQUES

TO ENHANCE DRAM SCALING

Thesis Defense

Yoongu Kim

Page 2:

2

CPU+CACHE

MAIN MEMORY

STORAGE

Page 3:

3

Complex Problems

Large Datasets

High Throughput

Page 4:

4

DRAM Module

DRAM Chip

DRAM Cell (Capacitor): ‘0’ or ‘1’

Page 5:

5

1971: FIRST DRAM CHIP, 10^3 Cells

2015: 10^9 Cells

Page 6:

6

1971: 10^3 Cells

2015: 10^9 Cells

FUTURE?

Page 7:

DRAM SCALING

7

TECHNOLOGICAL FEASIBILITY
Can we make smaller cells?

ECONOMIC VIABILITY
Should we make smaller cells?

Page 8:

8

SMALLER CELLS

LOWER COST/BIT

1. RELIABILITY TAX

2. PERFORMANCE TAX

Page 9:

1. RELIABILITY

9

COUPLING BETWEEN

NEARBY CELLS

ROW HAMMER (ISCA 2014)

Your DRAM chips are probably broken.

Page 10:

2. PERFORMANCE

10

ABNORMALLY SLOW

OUTLIER CELLS

BANK CONFLICTS (ISCA 2012)

Our solution may be adopted by industry.

Page 11:

11

CO-DESIGN: CPU & DRAM

CPU

Memory Management

DRAM Controller

DRAM Arch & Interface

Circuits & Devices

Page 12:

THESIS STATEMENT

The degradation in DRAM reliability & performance can be effectively mitigated by making low-overhead, non-intrusive modifications to the DRAM chips and the DRAM controller.

12

Page 13:

13

DRAM SCALING

ARCHITECTURAL SUPPORT

MAIN MEMORY: LARGER, FASTER, RELIABLE, EFFICIENT

Page 14:

THREE CONTRIBUTIONS

1. We show that DRAM scaling is negatively affecting reliability.

We expose a new type of DRAM failure, and propose a cost-effective way to address it.

14

Page 15:

THREE CONTRIBUTIONS

2. We propose a high-performance architecture for DRAM that mitigates its growing latency.

We identify bottlenecks in DRAM’s internal design and alleviate them in a cost-effective manner.

15

Page 16:

THREE CONTRIBUTIONS

3. We develop a new simulator for facilitating rapid design space exploration of DRAM.

The simulator is the fastest, while also being easy to modify due to its modular design.

16

Page 17:

OUTLINE

17

1. RELIABILITY: ROW HAMMER

2. PERF: BANK CONFLICT

3. SIMULATOR: RAMULATOR

4. CONCLUSION

Page 18:

18

1. ROW HAMMER

FLIPPING BITS IN MEMORY WITHOUT ACCESSING THEM

ISCA 2014

Page 19:

19

[Figure: rows of a DRAM chip. Labels: WORDLINE, LOW VOLTAGE, HIGH VOLTAGE, AGGRESSOR ROW, VICTIM ROW (two).]

READ DATA FROM HERE, GET ERRORS OVER THERE

Page 20:

20

GOOGLE’S EXPLOIT

http://googleprojectzero.blogspot.com

“We learned about

rowhammer from

Yoongu Kim et al.”

Page 21:

21

GOOGLE’S EXPLOIT

PROPOSED SOLUTIONS

OUR PROOF-OF-CONCEPT

EMPIRICAL ANALYSIS

Page 22:

REAL SYSTEM

22

x86, DRAM

MANY READS TO SAME ADDRESS ≠ OPEN/CLOSE SAME ROW

1. CACHE HITS 2. ROW HITS

Page 23:

23

LOOP:
  mov (X), %reg
  mov (Y), %reg
  clflush (X)
  clflush (Y)
  jmp LOOP

[Figure: x86 CPU running the loop against DRAM rows X and Y. Rows shown as all 1s appear afterwards with flipped bits, e.g. 11011110010 and 10111010111.]

http://www.github.com/CMU-SAFARI/rowhammer

MANY ERRORS!
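
The loop flushes X and Y and alternates between them because of the two obstacles named on the REAL SYSTEM slide: cache hits and row hits. The Python sketch below is a toy model only. The function name count_activations, the single-bank assumption, and the cache that never evicts on its own are illustrative assumptions, not taken from the slides or from the linked repository. It counts how many times a DRAM row would be opened for a given stream of reads.

```python
def count_activations(addresses, flush_after_read, row_of):
    """Count DRAM row activations caused by a stream of reads.

    Toy model (assumed, not real hardware): a cache that keeps every line
    until it is flushed, and one bank whose row-buffer holds the most
    recently opened row.
    """
    cached = set()     # addresses currently held in the CPU cache
    open_row = None    # row currently held in the bank's row-buffer
    activations = 0
    for addr in addresses:
        if addr in cached:
            continue                   # cache hit: DRAM is never reached
        if row_of[addr] != open_row:
            open_row = row_of[addr]    # row miss: old row closed, new row opened
            activations += 1
        # otherwise a row hit: served from the row-buffer, no activation
        if not flush_after_read:
            cached.add(addr)           # without clflush the line stays cached
    return activations


# Hypothetical mapping: X and Y live in different rows of the same bank.
rows = {"X": 1, "Y": 2}
print(count_activations(["X"] * 1000, False, rows))      # 1 (cache hits)
print(count_activations(["X"] * 1000, True, rows))       # 1 (row hits)
print(count_activations(["X", "Y"] * 500, True, rows))   # 1000
```

Only the last pattern, which both flushes and alternates rows, opens a row on every read in this model.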

Page 24:

24

WHY DO THE

ERRORS OCCUR?

Page 25:

25

DRAM CELLS ARE LEAKY

[Figure: CHARGE (between ‘0’ and ‘1’) vs. TIME for a NORMAL CELL, with REFRESH marked at 0ms and 64ms.]

Page 26:

26

DRAM CELLS ARE LEAKY

[Figure: CHARGE (between ‘0’ and ‘1’) vs. TIME for a VICTIM CELL next to an AGGRESSOR, with REFRESH marked at 0ms and 64ms.]

Page 27:

27

ROOT CAUSE?

COUPLING
• Electromagnetic
• Tunneling

ACCELERATES CHARGE LOSS

Page 28:

AS DRAM SCALES …

• CELLS BECOME SMALLER
Less tolerance to coupling effects

• CELLS BECOME PLACED CLOSER
Stronger coupling effects

COUPLING ERRORS MORE LIKELY

28

Page 29:

29

1. ERRORS ARE RECENT
Not found in pre-2010 chips

2. ERRORS ARE WIDESPREAD
>80% of chips have errors
Up to one error per ~1K cells

Page 30:

30

[Figure: test setup. Labels: PC, PCIe, FPGA (Test Engine, DRAM Ctrl), DRAM.]

Page 31:

[Photo of the test setup. Labels: Temperature Controller, PC, Heater, FPGAs.]

Page 32:

MOST MODULES AT RISK

32

A VENDOR: 86% (37/43)

B VENDOR: 83% (45/54)

C VENDOR: 88% (28/32)

Page 33:

33

[Figure: ERRORS PER 10^9 CELLS vs. MANUFACTURE DATE, for MODULES from vendors A, B, and C.]

Page 34:

DISTURBING FACTS

• AFFECTS ALL VENDORS
Not an isolated incident
Deeper issue in DRAM scaling

• UNADDRESSED FOR YEARS
Could impact systems in the field

34

Page 35:

35

HOW TO PREVENT COUPLING ERRORS?

Previous Approaches

1. Make Better Chips: Expensive

2. Rigorous Testing: Takes Too Long

Page 36:

36

[Figure: a row between its two neighboring rows, shown twice.]

FASTER ACCESS to the row: MORE ERRORS in both neighboring rows

FREQUENT REFRESH: FEWER ERRORS in both neighboring rows

Page 37:

37

[Figure: TOTAL ERRORS vs. ACCESS INTERVAL (ns), Faster ⟶ Slower, from 55ns to 500ns, for ONE MODULE each from vendors A, B, and C.]

64ms / 500ns = 128K

Page 38:

38

[Figure: TOTAL ERRORS vs. REFRESH INTERVAL (ms), Often ⟵ Seldom, from 11ms to 64ms, for ONE MODULE each from vendors A, B, and C.]

11ms / 55ns = 200K
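
The two ratios on the access-interval and refresh-interval slides can be checked with a few lines of Python. The function name accesses_per_window is hypothetical, and reading each ratio as "how many row accesses fit in one refresh window" is an interpretation of the slides rather than something they state outright.

```python
NS_PER_MS = 1_000_000  # nanoseconds in one millisecond


def accesses_per_window(window_ms, interval_ns):
    """Number of accesses that fit in one window at a fixed access interval."""
    return (window_ms * NS_PER_MS) // interval_ns


slow_access = accesses_per_window(64, 500)   # 64ms window, one access per 500ns
fast_refresh = accesses_per_window(11, 55)   # 11ms window, one access per 55ns
print(slow_access, fast_refresh)             # 128000 200000
```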

Page 39:

TWO NAIVE SOLUTIONS

39

1. LIMIT ACCESSES TO ROW
Access Interval > 500ns

2. REFRESH ALL ROWS OFTEN
Refresh Interval < 11ms

LARGE OVERHEAD: PERF, ENERGY, COMPLEXITY

Page 40:

OUR SOLUTION: PARR
Probabilistic Adjacent Row Refresh

40

After closing any row ...

• 99.9%: Do nothing

• 0.1%: Refresh (= Open) adjacent rows
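
The PARR decision can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the thesis implementation: the function name parr_on_row_close is hypothetical, and physical adjacency is assumed to be row index ± 1, which a real chip with remapped rows may not satisfy.

```python
import random


def parr_on_row_close(row, num_rows, p=0.001, rng=random):
    """Return the rows to refresh right after `row` has been closed.

    With probability p both neighbors are refreshed (opened and closed);
    otherwise nothing is done. Adjacency as index +/- 1 is an assumption.
    """
    if rng.random() >= p:
        return []                       # the common case: do nothing
    neighbors = (row - 1, row + 1)
    return [r for r in neighbors if 0 <= r < num_rows]


# Hammering row 5 closes it many times; each close is an independent trial.
closes = 128_000
refreshes = sum(1 for _ in range(closes) if parr_on_row_close(5, 16))
print(refreshes)  # varies per run; about 128 on average with p = 0.001
```

The controller keeps no per-row counters in this scheme, which is consistent with the zero storage overhead quoted later in the deck.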

Page 41:

PARR: CHANCE OF ERROR

• NO REFRESHES IN N TRIALS
Probability: 0.999^N

• N = 128K FOR ERROR (64ms)
Probability: 0.999^128K = 10^-56

STRONG RELIABILITY GUARANTEE

41
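
The probability on this slide can be reproduced numerically. The sketch below works in log10 to avoid floating-point underflow. The function name is hypothetical, and N is taken as 128,000, matching the 64ms / 500ns ratio shown earlier.

```python
import math


def log10_prob_no_refresh(n_closes, p_refresh=0.001):
    """log10 of the chance that n_closes consecutive row closes all skip the refresh."""
    return n_closes * math.log10(1.0 - p_refresh)


exponent = log10_prob_no_refresh(128_000)
print(f"0.999^128K is about 10^{exponent:.1f}")
```

The exponent comes out between -56 and -55, in line with the 10^-56 figure on the slide.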

Page 42:

42

STRONG RELIABILITY: 9.4 × 10^-14 Errors/Year

LOW PERF OVERHEAD: 0.20% Slowdown

NO STORAGE OVERHEAD: 0 Bytes

Page 43:

RELATED WORK

• Security Exploit (Seaborn@Google 2015)

• Industry Analysis (Kang@SK Hynix 2014)
“... will be [more] severe as technology shrinks down.”

• Targeted Row Refresh (JEDEC 2014)

• DRAM Testing (e.g., Van de Goor+ 1999)

• Disturbance in Flash & Hard Disk

43

Page 44:

44

RECAP: RELIABILITY

[Figure: the co-design stack (CPU, Memory Management, DRAM Controller, DRAM Arch & Interface, Circuits & Devices), annotated with PARR at the DRAM Controller and with ROW HAMMER.]

Page 45:

45

2. BANK CONFLICTS

A CASE FOR SUBARRAY PARALLELISM

ISCA 2012

Page 46:

46

[Figure: distribution of DRAM CELLS from FASTEST to SLOWEST, with a CUTOFF marked.]

5× “WRITE PENALTY”

Source: Samsung & Intel. The Memory Forum 2014

Page 47:

47

FIGHT LATENCY

WITH MORE

PARALLELISM

Page 48:

48

[Figure: CPU sending two writes (WR) to a CHIP, one to each of two BANKs.]

DIFFERENT BANKS: SMALL LATENCY

Page 49:

49

[Figure: CPU sending two writes (WR) to the same BANK of a CHIP.]

BANK CONFLICT: LARGE LATENCY

Page 50:

50

BANK CONFLICT LATENCY

[Figure: timeline (TIME) of two writes (WR) to the same bank.]

• SERIALIZATION

• WRITE PENALTY (GETTING WORSE)

• “ROW-BUFFER” THRASHING

Page 51:

51

[Figure: a BANK with rows, bitlines, and a row-buffer. MEMORY REQUESTs arrive at the bank and DATA is returned.]

BITLINES: CONNECT ROWS TO ROW-BUFFER

ROW-BUFFER: CACHES A COPY OF OPENED ROW

Page 52:

52

HOW TO PARALLELIZE BANK CONFLICTS?

Previous Approaches

1. Have More Banks: Expensive

2. Bank Interleaving: Non-Solution

Page 53:

53

BANK (LOGICAL VIEW)

• MANY ROWS (~100K)

• JUST ONE BUFFER

CANNOT DRIVE LONG BITLINES

Page 54:

54

BANK (PHYSICAL VIEW)

[Figure: a bank divided into SUBARRAYs, each with FEWER ROWS (~1K) and its own DECODER and BUFFER. Labels: LOCAL (SUBARRAY), GLOBAL (BANK), DECODER, BUFFER.]

Page 55:

55

SHARING = ROOT OF EVIL

[Figure: two SUBARRAYs, each with rows, a decoder, and a buffer, sharing one global decoder and one global buffer.]

SHARED:
1. GLOBAL DEC.
2. GLOBAL BUF.

NO SUBARRAY PARALLELISM

Page 56:

56

PROBLEM #1: DECODER

[Figure: two subarrays (ROW1 and ROW2; ROW3 and ROW4), each with a decoder and a buffer, behind one global DECODER that receives ADDR1 and ADDR3.]

ADDR1, ADDR3: BOTH?

ROW1 OPENED, then ROW1 CLOSED

ROW3 OPENED

Page 57:

57

OUR SOLUTION

[Figure: the same two subarrays (ROW1 and ROW2; ROW3 and ROW4) and global DECODER, with a REG. added.]

REG. STORES ADDRESS FOR EACH SUBARRAY

ADDR1, ADDR3: BOTH

ROW1 STILL OPEN

ROW3 ALSO OPEN

Page 58:

58

PROBLEM #2: BUFFER

[Figure: two subarrays (ROW1 and ROW2; ROW3 and ROW4), each with a decoder and a BUFFER, plus the global DECODER and a third, shared BUFFER. Label at the shared buffer: BOTH.]

Page 59:

59

OUR SOLUTION

[Figure: the same two subarrays (ROW1 and ROW2; ROW3 and ROW4), buffers, and global DECODER, with a REG. added. Label: BOTH.]

REG. SELECTS SUBARRAY FOR ACCESS
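
The two registers described on the solution slides (one that stores an address for each subarray, one that selects the subarray for access) can be modeled as a small data structure. The Python class below is an illustrative sketch only: the class name Bank and its methods are hypothetical, and timing, precharge, and the command protocol are all left out.

```python
class Bank:
    """Toy bank with a row-address latch per subarray and a select register."""

    def __init__(self, num_subarrays):
        # One latch per subarray: the row currently open there, or None.
        self.open_row = [None] * num_subarrays
        # Register naming the subarray whose local buffer reaches the global buffer.
        self.selected = None

    def activate(self, subarray, row):
        # Opening a row in one subarray leaves other subarrays' rows open.
        self.open_row[subarray] = row

    def select(self, subarray):
        if self.open_row[subarray] is None:
            raise ValueError("selected subarray has no open row")
        self.selected = subarray

    def access(self):
        # Only the selected subarray's open row is visible to the access.
        if self.selected is None:
            raise ValueError("no subarray selected")
        return self.selected, self.open_row[self.selected]


bank = Bank(num_subarrays=2)
bank.activate(0, 1)      # ROW1 in the first subarray
bank.activate(1, 3)      # ROW3 in the second subarray; ROW1 stays open
bank.select(1)
print(bank.access())     # (1, 3)
bank.select(0)
print(bank.access())     # (0, 1), without re-opening ROW1
```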

Page 60:

60

[Figure: two timelines (TIME) of three writes (WR). SERIAL SUBARRAYS: the writes follow one another. PARALLEL SUBARRAYS: the writes overlap.]
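
The difference between the two timelines can be illustrated with a toy latency model in Python. Everything numeric here is hypothetical: the 40ns activation and 5ns transfer times are placeholders rather than DDR3 timings, and the model ignores precharge and write recovery. It assumes only that row activations in different subarrays may overlap while data transfers still share one global buffer.

```python
def finish_time(subarrays, activate_ns, transfer_ns, parallel):
    """Time at which the last request completes.

    subarrays: one subarray id per request; every request opens a new row.
    parallel:  if True, activations in different subarrays may overlap.
    """
    subarray_free = {}   # per-subarray time at which it can start activating
    bank_free = 0        # time at which the bank as a whole is idle
    bus_free = 0         # time at which the shared global buffer is idle
    for sa in subarrays:
        start = subarray_free.get(sa, 0) if parallel else bank_free
        ready = start + activate_ns
        done = max(ready, bus_free) + transfer_ns   # transfers never overlap
        subarray_free[sa] = done
        bank_free = done
        bus_free = done
    return bus_free


requests = [0, 1, 2]   # three writes to three different subarrays
print(finish_time(requests, 40, 5, parallel=False))  # 135
print(finish_time(requests, 40, 5, parallel=True))   # 55
```

When all requests fall in the same subarray, the parallel case degenerates to the serial one in this model.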

Page 61:

61

[Figure: SPEEDUP (0% to 20%) vs. SUBARRAY PARALLELISM (1x to 128x), with NON-PARALLEL and PARALLEL labeled. +17% is marked.]

• Simulation: Out-of-Order CPU + DDR3-1066

• Benchmarks: SPEC/TPC/STREAM/RANDOM

Page 62:

62

            | SUBARRAYS | vs. | BANKS
PARALLELISM | 8x        |     | 8x
SPEEDUP     | +17%      |     | +20%
CHIP-SIZE   | +0.2%     |     | +36%

Estimated using DRAM area model from Rambus

Page 63:

RELATED WORK

• Module Partitioning (e.g., Zheng+ 2008)
Divide module into small, independent subsets
Narrower data-bus, so higher unloaded latency

• Hierarchical Bank (Yamauchi+ 1997)
Parallelizes accesses to different subarrays
Does not utilize multiple local row-buffers

• Cached DRAM (e.g., Hidaka+ 1990)

• Low-Latency DRAM (e.g., Sato+ 1998)

63

Page 64:

64

RECAP: PERFORMANCE

[Figure: the co-design stack (CPU, Memory Management, DRAM Controller, DRAM Arch & Interface, Circuits & Devices), annotated with SUBARRAY AWARE, PARALLEL SUBARRAYS, and SLOW OUTLIERS.]

Page 65:

65

3. RAMULATOR

RAMULATOR: A FAST AND EXTENSIBLE DRAM SIMULATOR

IEEE CAL 2015

Page 66:

66

[Figure: several IDEAs feeding into a DRAM SIMULATOR.]

NEW IDEAS FOR BETTER DRAM ... MUST BE VETTED BY SIMULATION

Page 67:

PREVIOUS SIMULATORS

• PRO: HIGH FIDELITY
– Cycle-accurate DRAM models

• CON: LOW FLEXIBILITY
– Hardcoded for DDR3/DDR4 DRAM
– Difficult to extend to others
E.g., LPDDRx, WIOx, GDDRx, RLDRAMx, HBM, HMC

67

Page 68:

OUR RAMULATOR

• BUILT FOR EXTENSIBILITY
– Easy to incorporate new ideas

• HIGH SIMULATION SPEED
– 2.5x faster than the next fastest

• PORTABLE: SIMPLE C++ API

68

Page 69:

RAMULATOR’S APPROACH

69

[Figure: DRAM COMMAND & ADDRESS (INPUTS) driving a DRAM SYSTEM (STATE MACHINES): CHANNELs containing RANKs containing BANKs.]

Page 70:

70

[Figure: BANK, RANK, and CHANNEL state machines with states labeled X, Y, Z, P, Q, A, B, C, D, shown next to a blank TEMPLATE whose states and edges are marked “?”.]

STATE MACHINES ARE DEFINED BY:
• STATES
• EDGES

RAMULATOR: RECONFIGURABLE STATE MACHINE

Page 71:

71

TREE OF STATE MACHINES

[Figure: a tree of TEMPLATE nodes receiving COMMAND & ADDRESS.]

DDR4, GDDR5, HBM: each standard supplies
• Hierarchy
• States
• Edges
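
The idea of one generic template configured by a per-standard hierarchy, states, and edges can be sketched in Python. This is not Ramulator's actual C++ API: the class Node, the dictionary layout, and TOY_SPEC (with its ACT and PRE commands and bank-level states) are hypothetical stand-ins chosen for illustration.

```python
class Node:
    """One level of the DRAM hierarchy; all behavior comes from `spec`."""

    def __init__(self, spec, depth=0):
        self.spec = spec
        self.level = spec["hierarchy"][depth]
        self.state = spec["initial_state"].get(self.level)
        is_leaf = depth + 1 == len(spec["hierarchy"])
        count = 0 if is_leaf else spec["children"][self.level]
        self.children = [Node(spec, depth + 1) for _ in range(count)]

    def issue(self, command, address):
        """Apply `command` along `address` (one child index per lower level)."""
        edges = self.spec["edges"].get(self.level, {})
        key = (self.state, command)
        if key in edges:
            self.state = edges[key]          # follow the matching edge
        if address and self.children:
            self.children[address[0]].issue(command, address[1:])


# A made-up standard: the same Node class could be given a different spec.
TOY_SPEC = {
    "hierarchy": ["channel", "rank", "bank"],
    "children": {"channel": 1, "rank": 2},
    "initial_state": {"bank": "closed"},
    "edges": {"bank": {("closed", "ACT"): "opened",
                       ("opened", "PRE"): "closed"}},
}

root = Node(TOY_SPEC)
root.issue("ACT", [0, 1])                     # rank 0, bank 1
print(root.children[0].children[1].state)     # opened
print(root.children[0].children[0].state)     # closed
```

Supporting another standard in this sketch means writing another spec dictionary, while the Node class stays unchanged.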

Page 72:

72

[Figure: Ramulator's tree of templates (TMP) configured as DDR4, GDDR5, HBM, LPDDR4, or WIO2, together with a DRAM CONTROLLER. Inputs shown: DRAM TRACE, CPU FRONTEND, FULL SYSTEM.]

RAMULATOR
• PERFORMANCE
• POWER (for some)

http://www.github.com/CMU-SAFARI/ramulator

Page 73:

73

Simulator | Supported DRAM Specifications | Simulation Speed (DDR3)
Ramulator | DDR3/4, LPDDR3/4, GDDR5, HBM, WIO1/2, Subarray Parallelism, etc. | 2.70x
DRAMSim2 (Rosenfeld et al.) | DDR2/3 | 1x
USIMM (Chatterjee et al.) | DDR3 | 1.08x
DrSim (Jeong et al.) | DDR2/3, LPDDR2 | 0.11x
NVMain (Poremba and Xie) | DDR3, LPDDR3/4 | 0.30x

Page 74:

74

4. CONCLUSION

Page 75:

SUMMARY

75

Traditional DRAM Scaling at Risk

Architectural Techniques for

Coping with Degrading DRAM

Regain Reliability & Performance

Page 76:

CONTRIBUTIONS

1. We show that DRAM scaling is negatively affecting reliability.

2. We propose a high-performance, cost-effective DRAM architecture.

3. We develop a new simulator for facilitating DRAM research.

76

Page 77:

77

RESEARCH

RELIABILITY

• Row Hammer: Kim et al., ISCA ’14

• Retention Failures: Liu et al., ISCA ’13; Khan et al., SIGMETRICS ’14

• Speed vs. Reliability: Lee et al., HPCA ’15

EFFICIENCY

• Bank Conflicts: Kim et al., ISCA ’12

• In-DRAM Page Copy: Seshadri et al., MICRO ’13

• Tiered Latency: Lee et al., HPCA ’13

• Fine-Grained Refreshes: Chang et al., HPCA ’14

Page 78:

78

RESEARCH

COMPRESSION

• Simple & Effective: Pekhimenko et al., MICRO ’13

SCHEDULING

• High Throughput: Kim et al., HPCA ’10

• Quality of Service: Kim et al., MICRO ’10; Kim et al., IEEE Micro Top Picks ’11; Subramanian et al., HPCA ’13

SIMULATION

• Fast & Extensible: Kim et al., IEEE CAL ’15

Page 79:

OUTDATED ASSUMPTIONS

• DRAM SCALING WILL TAKE CARE OF MAIN MEMORY

• LATEST AND CHEAPEST DRAM IS THE GREATEST

79

Page 80:

NEW TECHNOLOGIES

80

NONVOLATILE MEMORY (NVM)

3-DIMENSIONAL DIE STACKING

Image: Loke et al., Science 2012

Image: Micron Technology, 2015

Page 81:

TREND: HETEROGENEITY

81

CACHE: SRAM

MAIN MEMORY: DRAM

STORAGE: FLASH/HDD

Also shown: eDRAM, 3D DRAM, NVM

Page 82:

TREND: SPECIALIZATION

82

MOBILE

GRAPHICS

NETWORKS

PC/SERVER

BANDWIDTH

POWER

LATENCY

COST

Page 83:

83

HOW TO

REFORMULATE THE

MEMORY HIERARCHY?

Page 84:

• NEW SOFTWARE ABSTRACTIONS
Primitives for managing heterogeneity

• NEW HARDWARE TRADE-OFFS
Defining appropriate roles for each tier

• RICH MEMORY CAPABILITIES
Memory as more than a “bag of bits”

84

Page 85:

ARCHITECTURAL TECHNIQUES

TO ENHANCE DRAM SCALING

Thesis Defense

Yoongu Kim