DRAM: Architectures, Interfaces, and Systems
A Tutorial

Bruce Jacob and David Wang
Electrical & Computer Engineering Dept.
University of Maryland at College Park
http://www.ece.umd.edu/~blj/DRAM/

DRAM TUTORIAL, ISCA 2002 (blj/talks/DRAM-Tutorial-isca2002-2.pdf)

Transcript

Page 1:

DRAM: why bother? (i mean, besides the “memory wall” thing? ... is it just a performance issue?)

think about embedded systems: think cellphones, think printers, think switches ... nearly every embedded product that used to be expensive is now cheap. why? for one thing, rapid turnover from high performance to obsolescence guarantees generous supply of CHEAP, HIGH-PERFORMANCE embedded processors to suit nearly any design need.

what does the “memory wall” mean in this context? perhaps it will take longer for a high-performance design to become obsolete?

Outline

• Basics
• DRAM Evolution: Structural Path
• Advanced Basics
• DRAM Evolution: Interface Path
• Future Interface Trends & Research Areas
• Performance Modeling: Architectures, Systems, Embedded

Break at 10 a.m. (stop us or starve)


Page 2:

first off -- what is DRAM? an array of storage elements (capacitor-transistor pairs)

“DRAM” is an acronym (explain)

why “dynamic”?
- capacitors are not perfect ... need recharging
- very dense parts; very small; capacitors have very little charge ... thus, the bit lines are charged up to 1/2 voltage level and the sense amps detect the minute change on the lines, then recover the full signal

Basics: DRAM ORGANIZATION

[Figure: DRAM organization. Labels: Memory Array, Row Decoder, Word Lines, Bit Lines, Sense Amps, Column Decoder, Data In/Out Buffers. Inset of one DRAM cell: storage element (capacitor), switching element, Bit Line, Word Line.]
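The sensing step in the notes above comes down to charge sharing between a small cell capacitor and a much larger bit line. A minimal sketch of that arithmetic follows. The helper name `bitline_swing` and the values used (2.5 V supply, 30 fF cell, 300 fF bit line) are illustrative assumptions, not figures from the slides.

```python
def bitline_swing(v_dd, c_cell, c_bitline, stored_one):
    """Voltage change on a bit line precharged to v_dd/2 after charge sharing."""
    v_pre = v_dd / 2.0
    v_cell = v_dd if stored_one else 0.0
    # Charge is conserved, so the final voltage is the
    # capacitance-weighted average of the two starting voltages.
    v_final = (c_cell * v_cell + c_bitline * v_pre) / (c_cell + c_bitline)
    return v_final - v_pre


# Assumed, illustrative values: 2.5 V supply, 30 fF cell, 300 fF bit line.
swing_one = bitline_swing(2.5, 30e-15, 300e-15, stored_one=True)
swing_zero = bitline_swing(2.5, 30e-15, 300e-15, stored_one=False)
print(f"stored 1: {swing_one * 1000:+.1f} mV")
print(f"stored 0: {swing_zero * 1000:+.1f} mV")
```

Under these assumed values the swing is on the order of a tenth of a volt around the 1/2 voltage level, which is why the sense amps are needed to recover the full signal.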

so how do you interact with this thing? let’s look at a traditional organization first .. CPU connects to a memory controller that connects to the DRAM itself.

let’s look at a read operation

Basics: BUS TRANSMISSION

[Figure: CPU, BUS, MEMORY CONTROLLER, and DRAM (Memory Array, Row Decoder, Word Lines, Bit Lines, Sense Amps, Column Decoder, Data In/Out Buffers).]

Page 3:

at this point, all bit lines are at the 1/2 voltage level.

the read discharges the capacitors onto the bit lines ... this pulls the lines just a little bit high or a little bit low; the sense amps detect the change and recover the full signal

the read is destructive -- the capacitors have been discharged ... however, when the sense amps pull the lines to the full logic-level (either high or low), the transistors are kept open and so allow their attached capacitors to become recharged (if they hold a ‘1’ value)

Basics: [PRECHARGE and] ROW ACCESS

AKA: OPEN a DRAM Page/Row
or RAS (Row Address Strobe)
or ACT (Activate a DRAM Page/Row)

[Figure: CPU, BUS, MEMORY CONTROLLER, and DRAM (Memory Array, Row Decoder, Word Lines, Bit Lines, Sense Amps, Column Decoder, Data In/Out Buffers).]

once the data is valid on ALL of the bit lines, you can select a subset of the bits and send them to the output buffers ... CAS picks one of the bits

big point: cannot do another RAS or precharge of the lines until finished reading the column data ... can’t change the values on the bit lines or the output of the sense amps until it has been read by the memory controller

Basics: COLUMN ACCESS

READ Command
or CAS: Column Address Strobe

[Figure: CPU, BUS, MEMORY CONTROLLER, and DRAM (Memory Array, Row Decoder, Word Lines, Bit Lines, Sense Amps, Column Decoder, Data In/Out Buffers).]
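The row access and column access steps described above can be sketched as a toy model of a single bank. Everything here (the class name `DramBank`, its methods, the array size) is hypothetical and only illustrates the ordering rules from the notes: the read into the sense amps is destructive and then restored, columns are read out of the sensed row, and a precharge is needed before another row is opened.

```python
class DramBank:
    """Toy model of one DRAM bank: a cell array plus a row of sense amps."""

    def __init__(self, rows, cols):
        self.cells = [[0] * cols for _ in range(rows)]
        self.sense_amps = None  # data of the open row, None when precharged
        self.open_row = None

    def precharge(self):
        # Bit lines go back to the 1/2 voltage level; no row is open.
        self.sense_amps = None
        self.open_row = None

    def activate(self, row):
        # RAS / ACT: the read is destructive, so the cells are drained ...
        if self.open_row is not None:
            raise RuntimeError("precharge before opening another row")
        data = self.cells[row][:]
        self.cells[row] = [0] * len(data)
        # ... then the sense amps drive full logic levels and restore them.
        self.sense_amps = data
        self.cells[row] = data[:]
        self.open_row = row

    def read(self, col):
        # CAS / READ: pick one column out of the sensed row.
        if self.open_row is None:
            raise RuntimeError("no open row: activate first")
        return self.sense_amps[col]


bank = DramBank(rows=4, cols=8)
bank.cells[2][5] = 1
bank.activate(2)
first = bank.read(5)    # column access from the open row
second = bank.read(0)   # another column of the same row, no new activate
bank.precharge()
bank.activate(1)        # a different row, allowed only after precharge
```

The model ignores timing entirely; it only captures which operations must precede which.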

Page 4:

then the data is valid on the data bus ... depending on what you are using for in/out buffers, you might be able to overlap a little or a lot of the data transfer with the next CAS to the same page (this is PAGE MODE)

Basics: DATA TRANSFER

note: page mode enables overlap with CAS

Data Out ... with optional additional CAS: Column Address Strobe

[Figure: CPU, BUS, MEMORY CONTROLLER, and DRAM (Memory Array, Row Decoder, Word Lines, Bit Lines, Sense Amps, Column Decoder, Data In/Out Buffers).]

Basics: BUS TRANSMISSION

[Figure: CPU, BUS, MEMORY CONTROLLER, and DRAM (Memory Array, Row Decoder, Word Lines, Bit Lines, Sense Amps, Column Decoder, Data In/Out Buffers).]

Page 5:

DRAM “latency” isn’t deterministic because of CAS or RAS+CAS, and there may be significant queuing delays within the CPU and the memory controller.

Each transaction has some overhead. Some types of overhead cannot be pipelined. This means that in general, longer bursts are more efficient.

Basics

[Figure: CPU, Mem Controller, and DRAM, with the path segments labeled A, B, C, D, E1, E2/E3, F.]

A: Transaction request may be delayed in Queue
B: Transaction request sent to Memory Controller
C: Transaction converted to Command Sequences (may be queued)
D: Command/s Sent to DRAM
E1: Requires only a CAS or
E2: Requires RAS + CAS or
E3: Requires PRE + RAS + CAS
F: Transaction sent back to CPU

“DRAM Latency” = A + B + C + D + E + F
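The latency sum above, with component E depending on the state of the target bank, can be written out as a short sketch. The function names, the state labels ("hit", "closed", "conflict"), and all timing values are assumptions for illustration; the slides give only the symbolic breakdown.

```python
def access_component_ns(row_state, t_pre, t_rcd, t_cas):
    """Component E: depends on what the target bank currently holds."""
    if row_state == "hit":       # E1: row already open, CAS only
        return t_cas
    if row_state == "closed":    # E2: bank precharged, RAS + CAS
        return t_rcd + t_cas
    if row_state == "conflict":  # E3: another row open, PRE + RAS + CAS
        return t_pre + t_rcd + t_cas
    raise ValueError(f"unknown row state: {row_state}")


def dram_latency_ns(a, b, c, d, e, f):
    """'DRAM Latency' = A + B + C + D + E + F."""
    return a + b + c + d + e + f


# Assumed, illustrative timings (ns); none of these come from the slides.
T_PRE, T_RCD, T_CAS = 15, 15, 15
best = dram_latency_ns(10, 5, 5, 5,
                       access_component_ns("hit", T_PRE, T_RCD, T_CAS), 10)
worst = dram_latency_ns(10, 5, 5, 5,
                        access_component_ns("conflict", T_PRE, T_RCD, T_CAS), 10)
print(best, worst)
```

Even with every other component held fixed, the total differs between the two cases, which is one concrete sense in which the latency is not deterministic.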

Basics: PHYSICAL ORGANIZATION

This is per bank … Typical DRAMs have 2+ banks

[Figure: three devices side by side, x2 DRAM, x4 DRAM, and x8 DRAM. Each shows Row Decoder, Memory Array, Bit Lines, Sense Amps, Column Decoder, and Data Buffers.]

Page 6:

let’s look at the interface another way .. the way the data sheets portray it.

[explain]

main point: the RAS\ and CAS\ signals directly control the latches that hold the row and column addresses ...

Basics: Read Timing for Conventional DRAM

[Figure: timing diagram. Signals: RAS, CAS, Address (Row Address, Column Address), DQ (Valid Dataout). Phases: Row Access, Column Access, Data Transfer.]

since DRAM’s inception, there have been a stream of changes to the design, from FPM to EDO to Burst EDO to SDRAM. the changes are largely structural modifications -- minor -- that target THROUGHPUT.

[discuss FPM up to SDRAM]

Everything up to and including SDRAM has been relatively inexpensive, especially when considering the pay-off (FPM was essentially free, EDO cost a latch, PBEDO cost a counter, SDRAM cost a slight re-design). however, we’ve run out of “free” ideas, and now all changes are considered expensive ... thus there is no consensus on new directions and a myriad of choices has appeared

[ do LATENCY mods starting with ESDRAM ... and then the INTERFACE mods ]

DRAM Evolutionary Tree

[Figure: evolutionary tree. Group labels: (Mostly) Structural Modifications Targeting Throughput; Interface Modifications Targeting Throughput; Structural Modifications Targeting Latency. Node labels: Conventional DRAM, FPM, EDO, P/BEDO, SDRAM, ESDRAM, Rambus, DDR/2, Future Trends, MOSYS, FCRAM, VCDRAM, $.]

Page 7:

DRAM Evolution: Read Timing for Conventional DRAM

[Figure: timing diagram. Signals: RAS, CAS, Address (Row Address, Column Address), DQ (Valid Dataout). Legend: Row Access, Column Access, Transfer Overlap, Data Transfer.]

FPM allows you to keep the sense amps active for multiple CAS commands ... much better throughput

problem: cannot latch a new value in the column address buffer until the read-out of the data is complete

DRAM Evolution: Read Timing for Fast Page Mode

[Figure: timing diagram. Signals: RAS, CAS, Address (Row Address followed by three Column Addresses), DQ (three Valid Dataout). Legend: Row Access, Column Access, Transfer Overlap, Data Transfer.]
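The throughput benefit of keeping the sense amps active can be put in rough numbers. This is a sketch under assumed timings: the parameter names follow common data-sheet usage (tRC for a full row cycle, tRAC for RAS to valid data, tPC for a page-mode CAS cycle), and the values are illustrative, not taken from the slides.

```python
def conventional_read_ns(n_columns, t_rc):
    # Every access pays a full row cycle of its own.
    return n_columns * t_rc


def page_mode_read_ns(n_columns, t_rac, t_pc):
    # One row access, then one page-mode CAS cycle per additional column.
    return t_rac + (n_columns - 1) * t_pc


# Assumed, illustrative timings in ns.
T_RC, T_RAC, T_PC = 110, 60, 35
slow = conventional_read_ns(4, T_RC)
fast = page_mode_read_ns(4, T_RAC, T_PC)
print(slow, fast)
```

The first word arrives no sooner in page mode; the gain comes entirely from the accesses that follow it within the same row.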

Page 8:

solution to that problem -- instead of simple tri-state buffers, use a latch as well.

by putting a latch after the column mux, the next column address command can begin sooner

DRAM Evolution: Read Timing for Extended Data Out

[Figure: timing diagram. Signals: RAS, CAS, Address (Row Address followed by three Column Addresses), DQ (three Valid Dataout). Legend: Row Access, Column Access, Transfer Overlap, Data Transfer.]

by driving the col-addr latch from an internal counter rather than an external signal, the minimum cycle time for driving the output bus was reduced by roughly 30%

DRAM Evolution: Read Timing for Burst EDO

[Figure: timing diagram. Signals: RAS, CAS, Address (Row Address, Column Address), DQ (four Valid Data). Legend: Row Access, Column Access, Transfer Overlap, Data Transfer.]

Page 9:

“pipeline” refers to the setting up of the read pipeline ... first CAS\ toggle latches the column address, all following CAS\ toggles drive data out onto the bus. therefore data stops coming when the memory controller stops toggling CAS\

DRAM Evolution: Read Timing for Pipeline Burst EDO

[Figure: timing diagram. Signals: RAS, CAS, Address (Row Address, Column Address), DQ (four Valid Data). Legend: Row Access, Column Access, Transfer Overlap, Data Transfer.]
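The CAS\ behaviour described in the notes above (first toggle latches the column address, each later toggle drives one word out, with an internal counter supplying the next column) can be mimicked in a few lines. The function name `burst_edo_read` is hypothetical, and the wrap-around at the end of the row is an assumption of this sketch rather than something the slides specify.

```python
def burst_edo_read(row_data, start_col, cas_toggles):
    """Return the words driven onto the bus for a number of CAS toggles."""
    driven = []
    col = None
    for toggle in range(cas_toggles):
        if toggle == 0:
            col = start_col  # first toggle only latches the column address
        else:
            # Later toggles drive data; the internal counter advances.
            driven.append(row_data[col % len(row_data)])  # wrap is assumed
            col += 1
    return driven


open_row = list(range(100, 108))  # made-up contents of the open row
words = burst_edo_read(open_row, start_col=2, cas_toggles=5)
nothing = burst_edo_read(open_row, start_col=2, cas_toggles=1)
```

Data stops as soon as the toggles stop: five toggles yield four words, and a single toggle yields none.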

main benefit: frees up the CPU or memory controller from having to control the DRAM’s internal latches directly ... the controller/CPU can go off and do other things during the idle cycles instead of “wait” ... even though the time-to-first-word latency actually gets worse, the scheme increases system throughput

DRAM Evolution: Read Timing for Synchronous DRAM

(RAS + CAS + OE ... == Command Bus)

[Figure: timing diagram. Signals: Clock, Command (ACT, READ), Address (Row Addr, Col Addr), DQ (four Valid Data). Legend: RAS, CAS, Row Access, Column Access, Transfer Overlap, Data Transfer.]

Page 10:

output latch on EDO allowed you to start CAS sooner for next access (to same row)

latch whole row in ESDRAM -- allows you to start precharge & RAS sooner for the next page access -- HIDE THE PRECHARGE OVERHEAD.

DRAM Evolution: Inter-Row Read Timing for ESDRAM

[Figure: two timing diagrams, each with Clock, Command, Address (Row Addr, Col Addr, Bank), and DQ (Valid Data) for two back-to-back bursts.
Regular CAS-2 SDRAM, R/R to same bank: commands ACT, READ, PRE, ACT, READ.
ESDRAM, R/R to same bank: commands ACT, READ, PRE, ACT, READ.]

neat feature of this type of buffering: write-around

DRAM Evolution: Write-Around in ESDRAM

(can second READ be this aggressive?)

[Figure: two timing diagrams, each with Clock, Command, Address (Row Addr, Col Addr, Bank), and DQ (Valid Data).
Regular CAS-2 SDRAM, R/W/R to same bank, rows 0/1/0: commands include ACT, READ, PRE, ACT, WRITE, PRE, ACT, READ.
ESDRAM, R/W/R to same bank, rows 0/1/0: commands include ACT, READ, PRE, ACT, WRITE, READ.]

Page 11:

main thing ... it is like having a bunch of open row buffers (a la rambus), but the problem is that you must deal with the cache directly (move into and out of it), not the DRAM banks ... adds an extra couple of cycles of latency ... however, you get good bandwidth if the data you want is in cache, and you can “prefetch” into cache ahead of when you want it ... originally targeted at reducing latency, now that SDRAM is CAS-2 and RCD-2, this makes sense only in a throughput way

DRAM Evolution: Internal Structure of Virtual Channel

Segment cache is software-managed, reduces energy

[Figure: Virtual Channel internals. Labels: Bank A, Bank B, Row Decoder, Sense Amps, four 2Kb Segments, Sel/Dec (segments), 16 Channels ($), Input/Output Buffer, DQs, 2Kbit, # DQs. Operations: Activate, Prefetch, Restore, Read, Write.]

FCRAM opts to break up the data array .. only activate a portion of the word line

8K rows requires 13 bits to select ... FCRAM uses 15 (assuming the array is 8k x 1k ... the data sheet does not specify)

DRAM Evolution: Internal Structure of Fast Cycle RAM

Reduces access time and energy/access

[Figure: two arrays side by side, each with Row Decoder and Sense Amps.
SDRAM: 8M Array (8Kr x 1Kb), 13 bits, tRCD = 15 ns (two clocks).
FCRAM: 8M Array (?), 15 bits, tRCD = 5 ns (one clock).]
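The bit counts in the notes above follow from simple powers of two. The sketch below reproduces them under the same assumption the notes make (an 8K x 1Kb array); the helper `address_bits` and the derived "bits per activation" figure are illustrative and not from the data sheet.

```python
ROWS = 8 * 1024   # 8K rows, as in the notes
ROW_BITS = 1024   # 1Kb per row, the assumption stated in the notes


def address_bits(n_units):
    """Bits needed to select one of n_units (n_units a power of two)."""
    return (n_units - 1).bit_length()


sdram_bits = address_bits(ROWS)            # selecting a whole row
fcram_units = 2 ** 15                      # 15 select bits, per the notes
array_bits = ROWS * ROW_BITS               # the 8M array
bits_per_activation = array_bits // fcram_units
fraction_of_row = bits_per_activation / ROW_BITS
print(sdram_bits, bits_per_activation, fraction_of_row)
```

Under that assumption, two extra select bits mean each activation touches a quarter of a word line, which is consistent with the slide's claim of lower access time and energy per access.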

Page 12:

MoSys takes this one step further ... DRAM with an SRAM interface & speed but DRAM energy

[physical partitioning: 72 banks]

auto refresh -- how to do this transparently? the logic moves through the arrays, refreshing them when not active.

but what if one bank gets repeated access for a long duration? all other banks will be refreshed, but that one will not.

solution: they have a bank-sized CACHE of lines ... in theory, should never have a problem (magic)

DRAM Evolution: Internal Structure of MoSys 1T-SRAM

[Figure: many banks in a row. Labels: addr, Bank Select, DQs, $, Auto Refresh.]

here’s an idea of how the designs compare ...

bus speed == CAS-to-CAS
RAS-CAS == time to read data from capacitors into sense amps
RAS-DQ == RAS to valid data

DRAM Evolution: Comparison of Low-Latency DRAM Cores

DRAM Type     Data Bus Speed   Bus Width    Peak BW      RAS-CAS   RAS-DQ
              (MHz)            (per chip)   (per chip)   (tRCD)    (tRAC)
PC133 SDRAM   133              16           266 MB/s     15 ns     30 ns
VCDRAM        133              16           266 MB/s     30 ns     45 ns
FCRAM         200 * 2          16           800 MB/s     5 ns      22 ns
1T-SRAM       200              32           800 MB/s     -         10 ns
DDR 266       133 * 2          16           532 MB/s     20 ns     45 ns
DRDRAM        400 * 2          16           1.6 GB/s     22.5 ns   60 ns
RLDRAM        300 * 2          32           2.4 GB/s     ???       25 ns

("* 2" marks two data transfers per clock; bus width is in bits.)
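The Peak BW column in the comparison above is the product of clock rate, transfers per clock, and bus width. A short check of that arithmetic follows. The function name is hypothetical, the input numbers are the ones in the table, and reading the bare speed figures as MHz is an interpretation rather than something the slide states.

```python
def peak_bandwidth_mb_s(clock_mhz, transfers_per_clock, width_bits):
    """Peak bandwidth per chip in MB/s (1 MB taken as 10**6 bytes)."""
    return clock_mhz * transfers_per_clock * width_bits / 8


# (name, clock in MHz, transfers per clock, bus width in bits)
parts = [
    ("PC133 SDRAM", 133, 1, 16),
    ("VCDRAM", 133, 1, 16),
    ("FCRAM", 200, 2, 16),
    ("1T-SRAM", 200, 1, 32),
    ("DDR 266", 133, 2, 16),
    ("DRDRAM", 400, 2, 16),
    ("RLDRAM", 300, 2, 32),
]
peak = {name: peak_bandwidth_mb_s(mhz, rate, width)
        for name, mhz, rate, width in parts}
for name, mb_s in peak.items():
    print(f"{name:12s} {mb_s:7.0f} MB/s")
```

These are peak figures only; the tRCD and tRAC columns describe latency, which this calculation does not capture.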

Page 13:

Outline

• Basics
• DRAM Evolution: Structural Path
• Advanced Basics
    • Memory System Details (Lots)
• DRAM Evolution: Interface Path
• Future Interface Trends & Research Areas
• Performance Modeling: Architectures, Systems, Embedded

Some technologies have legs, some do not have legs, and some have gone belly up. We’ll start by examining the fundamental technologies (I/O, packaging, etc.) then explore some of these technologies in depth a bit later.

What Does This All Mean?

[Figure: cloud of DRAM names: FPM, EDO, BEDO, SDRAM, ESDRAM, DDR SDRAM, DDR II, xDDR II, SLDRAM, D-RDRAM, FCRAM, RLDRAM, netDRAM.]

Page 14:

What is a “good” system? It’s all about the cost of a system. This is a multi-dimensional tradeoff problem. Especially tough when the relative cost factors of pins, die area, and the demands of bandwidth and latency keep on changing. Good decisions for one generation may not be good for future generations. This is why we don’t keep a DRAM protocol for a long time. FPM lasted a while, but we’ve quickly progressed through EDO, SDRAM, DDR/RDRAM, and now DDR II and whatever else is on the horizon.

Cost - Benefit Criterion

[Figure: DRAM System Design at the center, surrounded by: Bandwidth, Latency, Logic Overhead, Power Consumption, Package Cost, Interconnect Cost, Test and Implementation.]

Now we’ll really get our hands dirty, and try to become DRAM designers. That is, we want to understand the tradeoffs, and design our own memory system with DRAM cells. By doing this, we can gain some insight into the basis of claims made by proponents of various DRAM memory systems.

A Memory System is a system that has many parts. It’s a set of technologies and design decisions. All of the parts are inter-related, but for the sake of discussion, we’ll split the components into the ovals seen here, and try to examine each part of a DRAM system separately.

Memory System Design

[Figure: DRAM Memory System at the center, surrounded by ovals: Topology, I/O Technology, Access Protocol, DRAM Chip Architecture, Clock Network, Row Buffer Management, Address Mapping, Pin Count, Chip Packaging.]

Page 15:

Professor Jacob has shown you some nice timing diagrams. I too will show you some nice timing diagrams, but the timing diagrams are a simplification that hides the details of implementation. Why don’t they just run the system at XXX MHz like the other guy? Then the latency would be much better, and the bandwidth would be extreme. Perhaps they can’t, and we’ll explain why. To understand the reason why some systems can operate at XXX MHz while others cannot, we must go digging past the nice timing diagrams and the architectural block diagrams and see what turns up underneath. So underneath the timing diagram, we find this....

DRAM Interfaces: The Digital Fantasy

[Figure: idealized timing diagram. Row: r0, r1. Col: a0. Data: d0 d0 d0 d0, d1 d1 d1 d1. Annotations: RAS latency, CAS latency, Pipelined Access.]

Pretend that the world looks like this

But...

We don’t get nice square or even nicely shaped waveforms. Jitter, skew, etc. Vddq and Vssq are the voltage supplies to the I/O pads on the DRAM chips. The signal bounces around, and is very non-ideal. So what are the problems, and what are the solution(s) that are used to solve these problems?

(note the 158 ps skew on parallel data channels. If your cycle time is 10 ns, or 10,000 ps, a skew of 158 ps is no big deal, but if your cycle time is 1 ns, or 1000 ps, then a skew of 158 ps is a big deal)

Already, we see hints of some problems as we try to push systems to higher and higher clock frequencies.

The Real World

[Figure: measured waveforms*. FCRAM side: VDDQ (Pad), VSSQ (Pad), DQS (Pin), DQ0-15 (Pin). Controller side: DQS (Pin), DQ0-15 (Pin). Annotations: skew = 158 psec, skew = 102 psec.]

*Toshiba Presentation, Denali MemCon 2002
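The point about skew in the note above is a ratio: the same 158 ps matters more as the cycle time shrinks. A minimal sketch of that arithmetic, using the numbers from the note and a hypothetical helper name:

```python
def skew_fraction(skew_ps, cycle_ps):
    """Skew expressed as a fraction of one clock cycle."""
    return skew_ps / cycle_ps


SKEW_PS = 158
slow_cycle = skew_fraction(SKEW_PS, 10_000)  # 10 ns cycle
fast_cycle = skew_fraction(SKEW_PS, 1_000)   # 1 ns cycle
print(f"10 ns cycle: {slow_cycle:.1%} of the cycle")
print(f" 1 ns cycle: {fast_cycle:.1%} of the cycle")
```

At a 10 ns cycle the skew is under 2% of the cycle; at 1 ns it is close to 16%, which has to come out of the window in which data can be sampled reliably.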

Page 16: DRAM: Architectures, Interfaces, and Systems A Tutorialblj/talks/DRAM-Tutorial-isca2002-2.pdf · DRAM TUTORIAL ISCA 2002 Bruce Jacob David Wang University of Maryland at this point,

DR

AM

TU

TOR

IAL

ISC

A 2

002

Bru

ce J

acob

Dav

id W

ang

Uni

vers

ity o

fM

aryl

and

First, we have to introduce the concept that signal propagation takes finite time. It is limited by the speed of light; or rather, on ideal transmission lines we should get a speed of approximately 2/3 the speed of light. That gets us 20 cm/ns. All signals, including system-wide clock signals, have to be sent across a system board, so if you send a clock signal from point A to point B on an ideal signal line, point B won't be able to tell that the clock has risen until, at the earliest, (1/20 ns/cm) * distance later. Then again, PC boards are not exactly ideal transmission lines (ringing, drive strength, etc.). The concept of “synchronous” breaks down when different parts of the system observe different clocks. Kind of like relativity.

Signal Propagation

[Figure: an Ideal Transmission Line from A to B carries signals at ~0.66c = 20 cm/ns.]

PC Board + Module Connectors + Varying Electrical Loads = Rather non-Ideal Transmission Line
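A quick sketch of the arithmetic in the notes, assuming the 20 cm/ns figure for an ideal line. The function names are made up for illustration, and a real board will be slower and messier than this.

```python
# Assumption taken from the notes: about 2/3 the speed of light, i.e. 20 cm/ns.
PROPAGATION_SPEED_CM_PER_NS = 20.0


def propagation_delay_ns(distance_cm: float) -> float:
    """Earliest time (ns) at which point B can see an edge launched at point A."""
    if distance_cm < 0:
        raise ValueError("distance must be non-negative")
    return distance_cm / PROPAGATION_SPEED_CM_PER_NS


def arrival_skew_ns(distance_a_cm: float, distance_b_cm: float) -> float:
    """Difference in arrival time at two receivers driven from the same source."""
    return abs(propagation_delay_ns(distance_a_cm) - propagation_delay_ns(distance_b_cm))
```

For example, a hypothetical 10 cm trace gives `propagation_delay_ns(10.0)` = 0.5 ns, which is already a large fraction of a cycle once clocks reach hundreds of MHz.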


When we build a “synchronous system” on a PCB, how do we distribute the clock signal? Do we want a sliding time domain? Is an H-tree doable for N modules in parallel? Skew compensation?

Clocking Issues

What Kind of Clocking System?

[Figure 1: Sliding Time. Figure 2: H Tree? Each panel shows a Clk SRC driving modules 0th through Nth.]

Page 17


We would want the chips to be on a “global clock”, with everyone perfectly synchronous. But since clock signals are delivered through wires, different chips in the system will see the rising edge of a clock a little bit earlier or later than other chips. While an H-tree may work for a low-frequency system, we really need one clock for sending (writing) signals from the controller to the chips, and another one for sending signals from the chips to the controller (reading).

Clocking Issues

We need different “clocks” for R/W

[Figure 1: Write Data. Figure 2: Read Data. Each panel shows a Clk SRC, modules 0th through Nth, and the Signal Direction.]


We purposefully routed path #2 to be a bit longer than path #1 to illustrate the point about signal path length differentials. As illustrated, signals will reach load B at a later time than load A simply because it is farther away from the controller than load A. It is also difficult to do path length and impedance matching on a system board. Sometimes heroic efforts must be made to get us a nice “parallel” bus.

Path Length Differential

[Figure: a Controller drives Bus Signal 1 and Bus Signal 2 over Path #1, Path #2, and Path #3, through Intermodule Connectors, to loads A and B.]

High Frequency AND Wide Parallel Busses are Difficult to Implement

Page 18


It's hard to bring the wide parallel bus from point A to point B, but it's easier to bring smaller groups of signals from A to B. To ensure proper timing, we also send along a source-synchronous clock signal that is path-length matched with the signal group it covers. In this figure, signal groups 1, 2, and 3 may have some timing skew with respect to each other, but within each group the signals will have minimal skew. (A smaller channel can be clocked higher.)

Subdividing Wide Busses

[Figure: signal groups 1, 2, and 3 routed from A to B around an Obstruction.]

Narrow Channels, Source Synchronous Local Clock Signals


An analogy: it's a lot harder to schedule 8 people for one meeting, but a lot easier to schedule 2 meetings with 4 people each. The results of the two meetings can be correlated later.

Why Subdivision Helps

[Figure: Worst Case skew of Chan 1 and Worst Case skew of Chan 2, for Sub Channel 1 and Sub Channel 2 taken separately, compared with Worst Case skew of {Chan 1 + Chan 2}.]

Worst Case Skew must be Considered in System Timing
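To put numbers on the meeting analogy, here is a small sketch that compares the skew inside each sub-channel with the skew of the two taken together. The arrival times are invented for illustration (in picoseconds, to keep the arithmetic exact); only the max-minus-min definition of skew is the point.

```python
def skew_ps(arrival_times_ps):
    """Skew of a group of signals: latest arrival minus earliest arrival."""
    return max(arrival_times_ps) - min(arrival_times_ps)


# Hypothetical arrival times for two 4-bit sub-channels of one wide bus.
sub_channel_1 = [1000, 1020, 1050, 1030]
sub_channel_2 = [1200, 1240, 1210, 1220]

skew_1 = skew_ps(sub_channel_1)                        # within sub-channel 1
skew_2 = skew_ps(sub_channel_2)                        # within sub-channel 2
skew_combined = skew_ps(sub_channel_1 + sub_channel_2)  # treated as one wide bus
```

With these made-up numbers each sub-channel sees tens of picoseconds of skew, while the combined bus sees 240 ps, because the combined figure also has to absorb the offset between the two groups. A per-group clock only has to cover the smaller number.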

Page 19


A “System” is a hard thing to design, especially one that allows end users to perform configurations that will impact timing. To guarantee functional correctness of the system, all corner cases of variance in loading and timing must be accounted for.

Timing Variations

[Figure: a Controller driving 4 Loads and a Controller driving 1 Load; waveforms for Clock, Cmd to 1 Load, and Cmd to 4 Loads.]

How many DIMMs in System?
How many devices on each DIMM?
Infinite variations on timing!
Who built the memory module?


To ensure that a lightly loaded system and a fully loaded system do not differ significantly in timing, we either send duplicate signals to different memory modules, or we use the same signal line but drive the I/O pads with variable strength, depending on whether the system has 1, 2, 3, or 4 loads.

Loading Balance

[Figure: controllers illustrating the two options: Duplicate Signal Lines and Variable Signal Drive Strength.]

Page 20


Self-explanatory. Topology determines loading and signal propagation lengths.

Topology

[Figure: a Controller, sixteen boxes labeled DRAM Chip, and a “?”.]

DRAM System Topology Determines Electrical Loading Conditions and Signal Propagation Lengths


Very simple topology. The clock signal that turns around is very nice. Solves problem of needing multiple clocks.

SDRAM Topology Example

[Figure: a Single Channel SDRAM Controller connected to eight x16 DRAM Chips by a Command & Address bus (16 bits) and a Data bus (64 bits). Annotation: Loading Imbalance.]

Page 21


All signals in this topology (Addr/Cmd/Data/Clock) are sent from point to point on channels that are path-length matched by definition.

RDRAM Topology Example

[Figure: an RDRAM Controller and its chips; label: clock turns around.]

Packets traveling down Parallel Paths. Skew is minimal by design.


RSL vs SSTL2, etc. (like ECL vs TTL of another era). What is “Logic Low”, what is “Logic High”? Different electrical signaling protocols differ on voltage swing, high/low level, etc. Δt is on the order of ns; we want it to be on the order of ps.

I/O Technology

[Figure: a signal transition between Logic Low and Logic High, plotted against Time, with Δv and Δt marked.]

Slew Rate = Δv / Δt
Smaller Δv = Smaller Δt at same slew rate
Increase Rate of bits/s/pin
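The slew-rate relation can be turned around to see why a smaller swing helps. This is only a sketch: the function name and the example slew rate are assumptions for illustration, not values from the slides.

```python
def transition_time_ns(voltage_swing_v: float, slew_rate_v_per_ns: float) -> float:
    """Time to cross a voltage swing at a fixed slew rate: delta_t = delta_v / slew rate."""
    if slew_rate_v_per_ns <= 0:
        raise ValueError("slew rate must be positive")
    return voltage_swing_v / slew_rate_v_per_ns


# Hypothetical driver that manages 1 V/ns, compared at two swings.
wide_swing_ns = transition_time_ns(3.3, 1.0)    # full 3.3 V swing
narrow_swing_ns = transition_time_ns(0.8, 1.0)  # 0.8 V swing
```

At the same slew rate the narrower swing finishes its transition roughly four times sooner, which is the headroom a low-swing signaling scheme uses to raise bits/s/pin.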

Page 22


Used in clocking systems, e.g. RDRAM (all clock signals are pairs, clk and clk#). Highest noise tolerance; does not need as many ground signals, whereas single-ended signals need many ground connections. Also, differential pair signals may be clocked even higher, so the pin-bandwidth disadvantage is not nearly 2:1 as implied by the diagram.

I/O - Differential Pair

[Figure: Differential Pair Transmission Line vs. Single Ended Transmission Line.]

Increase Rate of bits/s/pin? Cost Per Pin? Pin Count?


One of several ways on the table to further increase the bit-rate of the interconnects.

I/O - Multi Level Logic

[Figure: voltage vs. time. Reference voltages Vref_2, Vref_0, and Vref_1 divide the voltage axis into four ranges labeled logic 01, logic 00, logic 10, and logic 11.]

Increase Rate of bits/s/pin
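A minimal sketch of how a receiver could turn one sampled voltage into two bits using three reference levels. The reference voltages and the assignment of codes to ranges below are assumptions for illustration; the slide does not give actual levels, and a real interface may order its codes differently.

```python
from bisect import bisect_right


def decode_symbol(voltage_v, vrefs_v, codes):
    """Map a sampled voltage to a multi-bit code.

    vrefs_v: reference voltages separating the ranges.
    codes:   one code per range, listed from the lowest range to the highest.
    """
    if len(codes) != len(vrefs_v) + 1:
        raise ValueError("need exactly one more code than reference voltages")
    return codes[bisect_right(sorted(vrefs_v), voltage_v)]


# Hypothetical levels and code assignment (not taken from the slide).
VREFS_V = [1.0, 1.4, 1.8]
CODES = ["00", "01", "10", "11"]
```

Four ranges carry two bits per transfer on a single pin, so the bit rate doubles without raising the clock, at the price of smaller voltage margins between adjacent levels.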

Page 23


Different packaging types impact cost and speed. Slow parts can use the cheapest packaging available. Faster parts may have to use more expensive packaging. This has long been accepted in the higher-margin processor world, but for DRAM, each cent has to be hard fought for. To some extent, the demand for higher performance is pushing memory makers to use more expensive packaging to accommodate higher frequency parts. When RAMBUS first spec'ed FBGA, module makers complained, since they had to purchase expensive equipment to validate that chips were properly soldered to the module board, whereas something like TSOP can be checked with visual inspection.

Packaging

DIP (“good old days”)
SOJ: Small Outline J-lead
TSOP: Thin Small Outline Package
LQFP: Low Profile Quad Flat Package
FBGA: Fine Ball Grid Array

Memory Roadmap for Hynix NetDDR II

Features              Target Specification
Package               FBGA, LQFP
Speed                 800 Mbps, 550 Mbps
Vdd/Vddq              2.5V/2.5V (1.8V)
Interface             SSTL_2
Row Cycle Time tRC    35ns


If I have a 16-bit-wide command to send from A to B, I need 16 pins; if I have fewer than 16, I need multiple cycles. How many bits do I need to send from point A to point B? How many pins do I get? Cycles = Bits/Pins.

Access Protocol

[Figure: Single Cycle Command (Cmd: r0; Data: d0 d0 d0 d0) vs. Multiple Cycle Command (Cmd: r0 r0 r0 r0; Data: d0 d0 d0 d0).]
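The Cycles = Bits/Pins rule, written out with the rounding made explicit. The function name is made up for illustration; rounding up is an assumption that a partially filled cycle still costs a whole cycle.

```python
def command_cycles(command_bits: int, pins: int) -> int:
    """Cycles needed to move a command of command_bits over pins wires."""
    if command_bits < 0 or pins <= 0:
        raise ValueError("need a non-negative bit count and at least one pin")
    return -(-command_bits // pins)  # integer ceiling of bits / pins
```

A 16-bit command takes 1 cycle on 16 pins, 2 cycles on 8 pins, and 4 cycles on 4 pins: fewer pins, longer command.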

Page 24


There is inherent latency between the issuance of a read command and the response of the chip with data. To increase efficiency, a pipelined structure is necessary to obtain full utilization of the command, address, and data busses. Unlike an “ordinary” pipeline on a processor, a memory pipeline has data flowing in both directions. Architecture-wise, we should be concerned with full utilization everywhere, so that we use the fewest pins for the greatest benefit; but in actual use, we are usually concerned with full utilization of the data bus.

Access Protocol (r/r)

[Figure: Pipelined Access. A Control block and four DRAM chips. Command rows: Row (a0) and Col (r0, r1). Data row: d0 d0 d0 d0, d1 d1 d1 d1. RAS latency and CAS latency are marked.]

Consecutive Cache Line Read Requests to Same DRAM Row
a = Active (open page)
r = Read (Column Read)
d = Data (Data chunk)


The DRAM chips determine the latency of data after a read command is received, but the controller determines the timing relationship between the write command and the data being written to the DRAM chips. (If the DRAM device cannot handle pipelined R/W, then...)

Case 1: Controller sends write data at the same time as the write command, to different devices (pipelined).
Case 2: Controller sends write data at the same time as the write command, to the same device (not pipelined).

Access Protocol (r/w)

[Figure, Case 1: Read Following a Write Command to Different DRAM Devices. Col: w0, r1. Data: d0 d0 d0 d0, d1 d1 d1 d1.]
[Figure, Case 2: Read Following a Write Command to Same DRAM Device. Col: w0, r1. Data: d0 d0 d0 d0, d1 d1 d1 d1. Inside the DRAM: Sense Amps, Column Decoder, Data In/Out Buffers. One Datapath - Two Commands.]
[Figure, Soln: Delay Data of Write Command to match Read Latency. Col: w0, r1. Data: d0 d0 d0 d0, d1 d1 d1 d1.]

Page 25


To increase “efficiency”, pipelining is required. How many commands must one device support concurrently? 2? 3? 4? (Depends on what?) Imagine we must increase the data rate (higher pin frequency), but allow the DRAM core to operate slightly slower (2X pin frequency, same core latency). This issue ties the access protocol to internal DRAM architecture issues.

Access Protocol (pipelines)

[Figure: Three Back-to-Back Pipelined Read Commands. Col: r0, r1, r2. Data: d0 d0 d0 d0, d1 d1 d1 d1, d2 d2 d2 d2. CAS latency is marked.]
[Figure: “Same” Latency, 2X pin frequency, Deeper Pipeline. Col: r0 r1 r2. Data: d0 d0 d0 d0 d1 d1 d1 d1 d2 d2 d2 d2. CAS latency is marked.]

When pin frequency increases, chips must either reduce “real latency”, or support longer bursts, or pipeline more commands.
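One rough way to answer “how many commands?” is a toy model, which is an assumption here and not something from the slides: a read is issued once per burst so the data bus never idles, and each read stays in flight from issue until its last data beat. All numbers below are hypothetical.

```python
import math


def latency_in_cycles(core_latency_ns: float, cycle_time_ns: float) -> int:
    """Core latency expressed in whole bus clock cycles."""
    return math.ceil(core_latency_ns / cycle_time_ns)


def peak_reads_in_flight(latency_cycles: int, burst_cycles: int) -> int:
    """Peak number of reads issued but not yet fully returned,
    when one read is issued every burst_cycles."""
    return -(-(latency_cycles + burst_cycles) // burst_cycles)  # integer ceiling


# Hypothetical part: 30 ns core latency, bursts that occupy 4 bus cycles.
base = peak_reads_in_flight(latency_in_cycles(30, 10), 4)          # 10 ns cycle
faster_pins = peak_reads_in_flight(latency_in_cycles(30, 5), 4)    # 2X pin frequency
longer_bursts = peak_reads_in_flight(latency_in_cycles(30, 5), 8)  # 2X pins, 8-cycle bursts
```

In this model, doubling the pin frequency at the same core latency raises the count from 2 to 3 reads in flight, while doubling the burst length as well brings it back to 2. Those are the two escape routes the slide names besides reducing the real latency.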


440BX used 132 pins to control a single SDRAM channel, not counting Pwr & GND. Now the 845 chipset only uses 102. Also slower versions: 66/100. Also page burst, an entire page. Burst length is programmed to match cache line size (e.g. 32 bytes = 256 bits = 4 cycles of 64 bits). Latency as seen by the controller is really CAS + 1 cycles.
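The burst-length arithmetic in the notes, as a small sketch. The function name is made up, and the refusal to handle lines that are not a whole number of bus transfers is a simplifying assumption.

```python
def burst_length(cache_line_bytes: int, data_bus_bits: int) -> int:
    """Number of data-bus transfers needed to move one cache line."""
    line_bits = cache_line_bytes * 8
    if data_bus_bits <= 0 or line_bits % data_bus_bits != 0:
        raise ValueError("cache line must be a whole number of bus transfers")
    return line_bits // data_bus_bits
```

`burst_length(32, 64)` gives 4, matching the 32 bytes = 256 bits = 4 cycles of 64 bits in the notes; a 64-byte line on the same bus would need a burst of 8.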

Outline

• Basics
• DRAM Evolution: Structural Path
• Advanced Basics
• DRAM Evolution: Interface Path
    • SDRAM, DDR SDRAM, RDRAM Memory System Comparisons
    • Processor-Memory System Trends
    • RLDRAM, FCRAM, DDR II Memory Systems Summary
• Future Interface Trends & Research Areas
• Performance Modeling: Architectures, Systems, Embedded

Page 26


SDRAM System In Detail

[Figure: a Single Channel SDRAM Controller connected to Dimm1, Dimm2, Dimm3, and Dimm4 by the Data Bus, Addr & Cmd, and Chip (DIMM) Select lines. “Mesh Topology”.]


SDRAM: inexpensive packaging, lowest cost (LVTTL signaling), standard 3.3V supply voltage. DRAM core and I/O share the same power supply. “Power cost” is between 560mW and 1W per chip for the duration of the cache line burst. (Note: it costs power just to keep lines stored in the sense amps/row buffers, something for row buffer management policies to think about.) About 2/7 of the pins are used for Addr, 2/7 for Data, 2/7 for Pwr/Gnd, and 1/7 for cmd signals. Row commands and column commands are sent on the same bus, so they have to be demultiplexed inside the DRAM and decoded.

SDRAM Chip

133 MHz (7.5ns cycle time)
256 MBit, 54 pin
14 Pwr/Gnd, 16 Data, 15 Addr, 7 Cmd, 1 Clk, 1 NC
Multiplexed Command/Address Bus
Programmable Burst Length, 1, 2, 4 or 8
Supply Voltage of 3.3V
LVTTL Signaling (0.8V to 2.0V)
Quad Banks Internally
Low Latency, CAS = 2, 3
TSOP
(0 to 3.3V rail to rail.)

Condition              Specification         Cur.     Pwr
Operating (Active)     Burst = Continuous    300mA    1W
Operating (Active)     Burst = 2             170mA    560mW
Standby (Active)       All banks active      60mA     200mW
Standby (powerdown)    All banks inactive    2mA      6.6mW
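The power column follows from the current column and the 3.3V supply via P = V * I. A short sketch that recomputes it; the dictionary keys are just labels chosen here for the four table rows.

```python
SUPPLY_VOLTAGE_V = 3.3  # SDRAM supply voltage from the slide

# Current draw (mA) for each condition in the table.
CURRENT_MA = {
    "operating, continuous burst": 300,
    "operating, burst of 2": 170,
    "standby, all banks active": 60,
    "standby, power-down": 2,
}


def power_mw(current_ma: float, supply_v: float = SUPPLY_VOLTAGE_V) -> float:
    """Power in milliwatts: P = V * I."""
    return current_ma * supply_v


POWER_MW = {condition: power_mw(i) for condition, i in CURRENT_MA.items()}
```

The results come out at roughly 990, 561, 198, and 6.6 mW, which the table rounds to 1W, 560mW, 200mW, and 6.6mW.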

Page 27


We've spent some time discussing pipelined back-to-back read commands sent to the same chip; now let's try to pipeline commands to different chips. In order for the memory controller to be able to latch in data on the data bus on consecutive cycles, chip #0 has to hold the data value past the rising edge of the clock to satisfy the hold time requirement; then chip #0 has to stop and allow the bus to go “quiet”; then chip #1 can start to drive the data bus at least some “setup time” ahead of the rising edge of the next clock. Clock cycles have to be long enough to tolerate all of these timing requirements.

SDRAM Access Protocol (r/r)

[Figure: Back-to-back Memory Read Accesses to Different Chips in SDRAM. Clock cycles 0 through 13, CASL 3. Rows: read command and address assertion; data bus utilization. The Memory Controller, SDRAM chip #0, and SDRAM chip #1 share the Data bus; steps 1 to 4 mark the data return from chip #0 and the data return from chip #1. Against the Clock signal, the detail shows chip #0 hold time, bus idle time, and chip #1 setup time.]

Clock Cycles are still long enough to allow for pipelined back-to-back Reads


I show different paths, but these signals share the bi-directional data bus. For a read to follow a write to a different chip, the worst-case skew is when we write to the (N-1)th chip, then expect to pipeline a read command in the next cycle right behind it. The worst-case signal path skew is the sum of the distances. Isn't N to N even worse? No, SDRAM does not support a pipelined read behind a write on the same chip. Also, it's not as bad as I project here, since read cycles are center aligned and writes are edge aligned, so in essence we get 1 1/2 cycles to pipeline this case, instead of just 1 cycle. Still, this problem limits the frequency scalability of SDRAM, and idle cycles may be inserted to meet timing.

Looks Just like SDRAM!

SDRAM Access Protocol (w/r)

[Figure 1: Consecutive Reads. Chips 0th through Nth. Worst case = Dist(N) - Dist(0)]
[Figure 1: Read After Write. Chips 0th through Nth; labels: Write, Read, (N-1)th. Worst case = Dist(N) + Dist(N-1)]

Bus Turn Around
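The two worst-case expressions from the figures can be evaluated directly once distances are known. This sketch assumes the 20 cm/ns ideal-line speed from the Signal Propagation slide and a made-up list of controller-to-chip distances, ordered from the 0th chip to the Nth.

```python
SPEED_CM_PER_NS = 20.0  # ideal transmission line figure used earlier in the tutorial


def consecutive_read_skew_ns(dist_cm):
    """Back-to-back reads: worst case = Dist(N) - Dist(0)."""
    return (dist_cm[-1] - dist_cm[0]) / SPEED_CM_PER_NS


def read_after_write_skew_ns(dist_cm):
    """Read right behind a write to the neighbouring chip:
    worst case = Dist(N) + Dist(N-1)."""
    return (dist_cm[-1] + dist_cm[-2]) / SPEED_CM_PER_NS


# Hypothetical distances (cm) from the controller to chips 0..N, nearest first.
distances_cm = [4.0, 6.0, 8.0, 10.0]
```

With these invented distances the read-after-write case costs about 0.9 ns against about 0.3 ns for consecutive reads, because the write has to travel out and the read data has to travel back over nearly the full length of the bus.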

Page 28


Timing bubbles. More dead cycles.

SDRAM Access Protocol (w/r)

[Figure: Read Following a Write Command to Same SDRAM Device. The Memory Controller, SDRAM chip #0, and SDRAM chip #1 share the Data bus, with steps 1 to 4 marked. Col: w0, r1. Data: d0 d0 d0 d0, d1 d1 d1 d1.]


Since the data bus has a much lighter load, if we can use better signaling technology, perhaps we can run just the data bus at a higher frequency. At the higher frequency, the skews we talked about would be terrible with a 64-bit-wide data bus. So we use source-synchronous strobe signals (called DQS), each routed in parallel with an 8-bit-wide sub-channel. DDR is newer, so let's use a lower core voltage; that saves on power too!

DDR SDRAM System

[Figure: a Single Channel DDR SDRAM Controller connected to Dimm1, Dimm2, and Dimm3 by the Data Bus, Addr & Cmd, Chip (DIMM) Select, and DQS (Data Strobe) lines.]

Same Topology as SDRAM

Page 29


Slightly larger package, same width for Addr and Data; the new pins are 2 DQS, Vref, and now differential clocks. Lower supply voltage. Low voltage swing, now with reference to Vref instead of (0.8 to 2.0V). No power discussion here because the data sheet (Micron) is incomplete. Read data returned from the DRAM chips is now read in with respect to the timing of the DQS signals, which the DRAM chips send in parallel with the data itself. The use of DQS introduces “bubbles” between bursts from different chips, and reduces bandwidth efficiency.

DDR SDRAM Chip

133 MHz (7.5ns cycle time)
256 MBit, 66 pin
16 Pwr/Gnd*, 16 Data, 15 Addr, 7 Cmd, 2 Clk*, 7 NC*, 2 DQS*, 1 Vref*
Multiplexed Command/Address Bus
Programmable Burst Lengths, 2, 4 or 8*
Supply Voltage of 2.5V*
SSTL-2 Signaling (Vref +/- 0.15V)
Quad Banks Internally
Low Latency, CAS = 2, 2.5, 3*
TSOP
(0 to 2.5V rail to rail)

[Figure: read timing over clock cycles 0 through 5 with CASL = 2. Rows: Clk and Clk#, Cmd (Read), DQS, Data. The DQS Pre-amble and DQS Post-amble are marked.]


Here we see that two consecutive column read commands to different chips on the DDR memory channel cannot be placed back to back on the data bus, due to the DQS signal hand-off issue. They may be pipelined with one idle cycle between bursts. This situation is true for all consecutive accesses to different chips: r/r, r/w, w/r (except w/w, when the controller keeps control of the DQS signal and just changes target chips). Because of this overhead, short bursts are inefficient on DDR; longer bursts are more efficient. (32 byte cache line = 4 burst, 64 byte line = 8 burst)

DDR SDRAM Protocol (r/r)

[Figure: Back-to-back Memory Read Accesses to Different Chips in DDR SDRAM. The Memory Controller, SDRAM chip #0, and SDRAM chip #1 share the Data bus (d0, d1). Timing over clock cycles 0 through 8 with CASL = 2. Rows: Clk and Clk#; Cmd: r0, r1; DQS, with DQS pre-amble and DQS post-amble marked; Data: d0 d0 d0 d0, d1 d1 d1 d1.]
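To see why short bursts hurt, here is a sketch of data bus efficiency when every burst to a different chip is followed by one idle clock cycle. It assumes the idle gap is exactly one full clock cycle and that DDR moves two data beats per clock; the function name is made up.

```python
def ddr_burst_efficiency(burst_length: int, idle_cycles: int = 1) -> float:
    """Fraction of clock cycles that carry data when each burst is
    followed by idle_cycles of hand-off time."""
    if burst_length <= 0 or idle_cycles < 0:
        raise ValueError("burst length must be positive, idle cycles non-negative")
    data_cycles = burst_length / 2  # DDR: two data beats per clock cycle
    return data_cycles / (data_cycles + idle_cycles)
```

Under these assumptions a burst of 4 keeps the bus busy 2 cycles out of every 3, while a burst of 8 keeps it busy 4 out of every 5, so the longer burst amortizes the same hand-off bubble over more data.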

Page 30


Very different from SDRAM. Everything is sent around in 8 (half cycle) packets. Most systems now run at 400 MHz, but since everything is DDR, it's called “800 MHz”. The only difference is that packets can only be initiated at the rising edge of the clock; other than that, there's no difference between 400 DDR and 800. Very clean topology, very clever clocking scheme. No clock handoff issue, high efficiency. Write delay improves matching with read latency (not perfect, as shown). Since the data bus is 16 bits wide, each read command gets 16*8 = 128 bits back. Each cache line fetch = multiple packets. Up to 32 devices.

RDRAM System

[Figure: an RDRAM Controller and its channel.]

Packet Protocol: Everything in 8 (half) cycle packets

[Figure: Two Write Commands Followed by a Read Command. Rows: Bus Clock; Column Cmd: w0, w1, r2; Data Bus: data packets d0, d1, and d2. Intervals marked: tCAC (CAS access delay), tCWD (Write delay), and tCAC - tCWD.]
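The packet arithmetic from the notes as a sketch: 16 data pins times 8 half-cycle transfers per packet, on a 400 MHz clock. The constant and function names are made up for illustration.

```python
DATA_BUS_BITS = 16        # RDRAM data bus width from the notes
TRANSFERS_PER_PACKET = 8  # every packet is 8 half cycles long
CLOCK_MHZ = 400           # bus clock; data moves on both edges


def packet_bits() -> int:
    """Data bits delivered by one data packet."""
    return DATA_BUS_BITS * TRANSFERS_PER_PACKET


def packets_per_cache_line(cache_line_bytes: int) -> int:
    """Data packets needed to fetch one cache line (rounded up)."""
    return -(-(cache_line_bytes * 8) // packet_bits())


def packet_time_ns() -> float:
    """Time one packet occupies the channel: 8 half cycles of the bus clock."""
    half_cycle_ns = 1000.0 / CLOCK_MHZ / 2
    return TRANSFERS_PER_PACKET * half_cycle_ns
```

One packet carries 128 bits, so a 32-byte line needs 2 packets and a 64-byte line needs 4. Each packet occupies the channel for 10 ns, which has to be counted for both the command packet and the data packet when adding up latency.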


RDRAM packets do not re-order the data inside the packet. To compute RDRAM latency, we must add in the command packet transmission time as well as the data packet transmission time. RDRAM relies on its multitude of banks to try to make sure that a high percentage of requests hit open pages and only incur the cost of a CAS, instead of a RAS + CAS.

Direct RDRAM Chip

400 MHz (2.5ns cycle time)
256 MBit, 86 pin
49 Pwr/Gnd*, 16 Data, 8 Addr/Cmd, 4 Clk*, 6 CTL*, 2 NC, 1 Vref*
Separate Row-Col Command Busses
Burst Length = 8*
Supply Voltage of 2.5V*
RSL Signaling (Vref +/- 0.2V)
4/16/32 Banks Internally*
Low Latency, CAS = 4 to 6 full cycles*
FBGA
(800 mV rail to rail)

[Figure: two packet timelines. First: Active, read, read, precharge, data, data. Second: Active, read, read, data, data.]

All packets are 8 (half) cycles in length; the protocol allows near 100% bandwidth utilization on all channels (Addr/Cmd/Data).

Page 31


RDRAM provides high bandwidth, but what are the costs? RAMBUS pushed in many different areas simultaneously. The drawback was that, with a new set of infrastructure, the costs for first-generation products were exorbitant.

RDRAM Drawbacks

Significant Cost Delta for First Generation

RSL: Separate Power Plane
Control Logic - Row buffers
30% die cost for logic @ 64 Mbit node
High Frequency I/O
Test and Package Cost
Active Decode Logic + Open Row Buffer. (High power for “quiet” state)
Single Chip Provides All Data Bits for Each Packet (Power)


Low

pin

cou

nt, h

ighe

r la

tenc

y.In

gen

eral

term

s, th

e sy

stem

co

mpa

rison

sim

ply

poin

ts o

ut

the

vario

us p

arts

that

RD

RA

M

exce

lls in

, i.e

. hig

h ba

ndw

idth

an

d lo

w p

in c

ount

., bu

t the

y al

so

have

long

er la

tenc

y, s

ince

it

take

s 10

ns

just

to m

ove

the

com

man

d fr

om th

e co

ntro

ller

onto

the

DR

AM

chi

p, a

nd

anot

her

10ns

just

to g

et th

e da

ta

from

the

DR

AM

chi

ps b

ack

onto

th

e co

ntro

ller

inte

rfac

e

Sys

tem

Co

mp

aris

on

SD

RA

MD

DR

RD

RA

M

Fre

qu

ency

(M

Hz)

133

133*

240

0*2

Pin

Co

un

t (D

ata

Bu

s)64

6416

Pin

Co

un

t (C

on

tro

ller)

102

101

33

Th

eore

tica

l Ban

dw

idth

(M

B/s

)10

6421

2816

00

Th

eore

tica

l Effi

cien

cy

(dat

a b

its/

cycl

e/p

in)

0.63

0.63

0.48

Su

stai

ned

BW

(M

B/s

)*65

598

610

72

Su

stai

ned

Effi

cien

cy*

(dat

a b

its/

cycl

e/p

in)

0.39

0.29

0.32

RA

S +

CA

S (

t RA

C)

(ns)

45 ~

50

45 ~

50

57 ~

67

CA

S L

aten

cy (

ns)

**22

~ 3

022

~ 3

040

~ 5

0

133

MH

z P

6 C

hip

set

+ S

DR

AM

CA

S L

aten

cy ~

80

ns

*Str

eam

Ad

d**

Lo

ad t

o u

se la

ten

cy
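The bandwidth and efficiency rows of the comparison can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the helper names are made up, and the reading of "sustained efficiency" as theoretical efficiency scaled by sustained/peak bandwidth is an assumption that happens to match the table's figures.

```python
# Figures come from the System Comparison table; helper names are illustrative.

def peak_bandwidth_mb_s(clock_mhz, transfers_per_clock, data_pins):
    """Peak MB/s = transfers per microsecond times bytes moved per transfer."""
    return clock_mhz * transfers_per_clock * data_pins / 8


def bits_per_cycle_per_pin(data_pins, controller_pins, achieved_fraction=1.0):
    """Data bits delivered per cycle, divided over all controller pins.

    achieved_fraction scales the result by sustained/peak bandwidth
    (an assumed reading of the table's 'sustained efficiency' row).
    """
    return data_pins / controller_pins * achieved_fraction


# name: (clock MHz, transfers per clock, data pins, controller pins, sustained MB/s)
SYSTEMS = {
    "SDRAM": (133, 1, 64, 102, 655),
    "DDR": (133, 2, 64, 101, 986),
    "RDRAM": (400, 2, 16, 33, 1072),
}

for name, (mhz, per_clock, data, ctrl, sustained) in SYSTEMS.items():
    peak = peak_bandwidth_mb_s(mhz, per_clock, data)
    theoretical = bits_per_cycle_per_pin(data, ctrl)
    achieved = bits_per_cycle_per_pin(data, ctrl, sustained / peak)
    print(f"{name}: peak {peak:.0f} MB/s, "
          f"theoretical {theoretical:.2f}, sustained {achieved:.2f}")
```

Seen this way, RDRAM's narrow data bus costs it theoretical efficiency per pin, but it keeps a larger fraction of its peak under the StreamAdd workload.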


RDRAM moves complexity from the interface into the DRAM chips. Is this a good trade-off? What does the future look like?

Differences of Philosophy
Complexity Moved to DRAM
SDRAM - Variants: Controller, DRAM Chips, Complex Interconnect, Inexpensive Interface
RDRAM - Variants: Controller, DRAM Chips, Simplified Interconnect, Expensive Interface
[Diagram labels: Simple Logic, Complex Logic]

To begin with, we look into a crystal ball for trends that will cause changes or limit scalability in the areas we are interested in. ITRS = International Technology Roadmap for Semiconductors. Transistor frequencies are supposed to nearly double every generation, and the transistor budget (as indicated by million logic transistors per cm^2) is projected to double. Interconnects between chips are a different story. Measured in cents/pin, pin cost decreases only slowly, and the pin budget grows slowly each generation. Punchline: in the future, free transistors and costly interconnects.

Technology Roadmap (ITRS)

                                         2004        2007        2010        2013        2016
Semi Generation (nm)                     90          65          45          32          22
CPU MHz                                  3990        6740        12000       19000       29000
MLogicTransistors/cm^2                   77.2        154.3       309         617         1235
High Perf chip pin count                 2263        3012        4009        5335        7100
High Performance chip cost (cents/pin)   1.88        1.61        1.68        1.44        1.22
Memory pin cost (cents/pin)              0.34-1.39   0.27-0.84   0.22-0.34   0.19-0.39   0.19-0.33
Memory pin count                         48-160      48-160      62-208      81-270      105-351

Trend: Free Transistors & Costly Interconnects
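The "free transistors, costly interconnects" punchline can be checked directly against the roadmap rows. The following sketch, with an illustrative helper name, computes the growth factor from one generation to the next for three of the rows above.

```python
# Rows copied from the roadmap table (generations 2004 through 2016).
LOGIC_MTRANSISTORS_PER_CM2 = [77.2, 154.3, 309, 617, 1235]
HIGH_PERF_PIN_COUNT = [2263, 3012, 4009, 5335, 7100]
HIGH_PERF_CENTS_PER_PIN = [1.88, 1.61, 1.68, 1.44, 1.22]


def growth_per_generation(series):
    """Ratio of each entry to the one before it, rounded to two decimals."""
    return [round(later / earlier, 2) for earlier, later in zip(series, series[1:])]


print("transistor density:", growth_per_generation(LOGIC_MTRANSISTORS_PER_CM2))
print("pin count:         ", growth_per_generation(HIGH_PERF_PIN_COUNT))
print("cost per pin:      ", growth_per_generation(HIGH_PERF_CENTS_PER_PIN))
```

Transistor density doubles every generation while the pin budget grows by only about a third, and the cost per pin falls by roughly 15% at best.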


So we have some choices to make. Integrating the memory controller moves it on die, where the frequency will be much higher. The command-data path will only cross chip boundaries twice instead of 4 times. But interfacing with memory chips directly means that you are limited by the lowest common denominator. To get the highest bandwidth (for a given number of pins) AND the lowest latency, we’ll need custom RAM, which might as well be SRAM, but it will be prohibitively expensive.

Choices for Future
[Diagram: CPU and DRAM blocks for each option, plus a Memory Controller block]
Direct Connect, Custom DRAM: Highest Bandwidth + Low Latency
Direct Connect, Commodity DRAM: Low Bandwidth + Low Latency
Direct Connect, semi-comm. DRAM: High Bandwidth + Low/Moderate Latency
Indirect Connection, Inexpensive DRAM: Highest Bandwidth, Highest Latency

Two RDRAM controllers means 2 independent channels. Only 1 packet has to be generated for each 64-byte cache line transaction request. (The extra channel stores cache coherence data, i.e. “I belong to CPU#2, exclusively.”) Very aggressive use of available pages on RDRAM memory.

EV7 + RDRAM (Compaq/HP)
• RDRAM Memory (2 Controllers)
• Direct Connection to processor
• 75 ns Load to use latency
• 12.8 GB/s Peak bandwidth
• 6 GB/s read or write bandwidth
• 2048 open pages (2 * 32 * 32)
[Diagram: two MC blocks, each labeled 16 16 16 16 and 64]
Each column read fetches 128 * 4 = 512 b (data)


The EV7 cache line is 64 bytes, so each 4-channel ganged RDRAM can fetch 64 bytes with 1 single packet. Each DDR SDRAM channel can fetch 64 bytes by itself, so we need 6 controllers. If we gang two DDR SDRAMs together into one channel, we have to reduce the burst length from 8 to 4. Shorter bursts are less efficient. Sustainable bandwidth drops.

What if EV7 Used DDR?
• Peak Bandwidth 12.8 GB/s
• 6 Channels of 133*2 MHz DDR SDRAM ==
• 6 Controllers of 6 64 bit wide channels, or
• 3 Controllers of 3 128 bit wide channels

System             EV7 + RDRAM        EV7 + 6 controller DDR SDRAM   EV7 + 3 controller DDR SDRAM
Latency            75 ns              ~ 50 ns*                       ~ 50 ns*
Pin count          ~265** + Pwr/Gnd   ~600** + Pwr/Gnd               ~600** + Pwr/Gnd
Controller Count   2                  6***                           3***
Open pages         2048               144                            72

* page hit CAS + memory controller latency.
** including all signals, address, command, data, clock, not including ECC or parity
*** 3 controller design is less bandwidth efficient.
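A short calculation shows why both DDR options reach the same peak bandwidth yet differ in burst length. This is a sketch with illustrative function names; it uses the nominal 133 MHz clock, so the result comes out at 12.768 GB/s rather than the rounded 12.8 GB/s on the slide.

```python
CACHE_LINE_BYTES = 64  # EV7 cache line size, from the notes


def peak_gb_s(channels, width_bits, clock_mhz=133, transfers_per_clock=2):
    """Aggregate peak bandwidth of several identical DDR channels, in GB/s."""
    return channels * width_bits / 8 * clock_mhz * transfers_per_clock / 1000


def burst_length_for_line(width_bits, line_bytes=CACHE_LINE_BYTES):
    """Number of transfers one channel needs to deliver a full cache line."""
    return line_bytes * 8 // width_bits


for channels, width in ((6, 64), (3, 128)):
    print(f"{channels} x {width}-bit: {peak_gb_s(channels, width):.3f} GB/s peak, "
          f"burst length {burst_length_for_line(width)} per cache line")
```

Ganging two 64-bit channels into a 128-bit channel keeps the pin count and peak bandwidth, but each cache line then needs a burst of only 4, which is the efficiency loss the notes describe.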

DDR SDRAM was an advancement from SDRAM, with lowered Vdd, a new electrical signal interface (SSTL), and a new protocol, but fundamentally the same tRC ~= 60 ns. RDRAM has tRC of ~70 ns. All are comparable in row recovery time. So what’s next? What’s on the horizon? DDR II/FCRAM/RLDRAM/RDRAM-nextGen/Kentron? What are they, and what do they bring to the table?

What’s Next?
• DDR II
• FCRAM
• RLDRAM
• RDRAM (Yellowstone etc)
• Kentron QBM


DDR II is a follow-on to DDR. The DDR II command set is a superset of the DDR SDRAM commands. Lower I/O voltage means lower power for I/O and possibly faster signal switching due to the lower rail-to-rail voltage. The DRAM core now operates at 1:4 of the data bus frequency. A valid command may be latched on any given rising edge of the clock, but may be delayed a cycle, since the command bus is now running at 1:2 frequency to the core. In a memory system it can run at 400 MHz per pin, while it can be cranked up to 800 MHz per pin in an embedded system without connectors. DDR II eliminates the transfer-until-interrupted commands, and limits the burst length to 4 only. (Simple to test.)

DDR II - DDR Next Gen
Lower I/O Voltage (1.8V)
DRAM core operates at 1:4 freq of data bus freq (SDRAM 1:1, DDR 1:2)
Backward Compat. to DDR (Common modules possible)
400 Mbps - multidrop
800 Mbps - point to point
FPBGA package
No more Page-Transfer-Until-Interrupted Commands (removes speed path)
Burst Length == 4 Only!
4 Banks internally (same as SDRAM and DDR)
Write Latency = CAS -1 (increased Bus Utilization)
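The 1:4 core-to-data-bus ratio is easiest to see with numbers. The sketch below assumes only the ratios quoted on the slide (SDRAM 1:1, DDR 1:2, DDR II 1:4); the function name and the dictionary are illustrative.

```python
# Data-bus transfers per DRAM core cycle, as quoted on the slide.
DATA_TO_CORE_RATIO = {"SDRAM": 1, "DDR": 2, "DDR II": 4}


def core_frequency_mhz(dram_type, data_rate_mbps_per_pin):
    """Core frequency implied by a per-pin data rate and the core ratio."""
    return data_rate_mbps_per_pin / DATA_TO_CORE_RATIO[dram_type]


print(core_frequency_mhz("SDRAM", 133))   # PC133: core runs at the bus rate
print(core_frequency_mhz("DDR", 266))     # DDR 266: core at half the data rate
print(core_frequency_mhz("DDR II", 400))  # multidrop DDR II
print(core_frequency_mhz("DDR II", 800))  # point-to-point DDR II
```

The DRAM core speed barely changes across generations; the higher per-pin data rate comes from fetching more bits per core cycle, which also fits the fixed burst length of 4.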

Instead of a controller that keeps track of cycles, we can now have a “dumber” controller. Control is now simple, kind of like SRAM: part I of the address one cycle, part II the next cycle.

DDR II - Continued
Posted Commands
SDRAM & DDR: [Timing diagram: Active (RAS), tRCD, Read (CAS), data]
SDRAM & DDR SDRAM rely on the memory controller to know tRCD and issue CAS after tRCD for lowest latency.
DDR II: Posted CAS: [Timing diagram: Active (RAS), Read (CAS), tRCD, data]
An internal counter delays the CAS command; the DRAM chip issues the “real” command after tRCD for lowest latency.


FCRAM is a trademark of Fujitsu. Toshiba manufactures under this trademark; Samsung sells Network DRAM. Same thing. Extra die area is devoted to circuits that lower the row cycle time down to half that of DDR, and the random access (RAC) latency down to 22 to 26 ns. Writes are delay-matched with CASL, for better bus utilization.

FCRAM
Fast Cycle RAM (aka Network-DRAM)

Features               DDR SDRAM       FCRAM/Network-DRAM
Vdd, Vddq              2.5 +/- 0.2V    2.5 +/- 0.15
Electrical Interface   SSTL-2          SSTL-2
Clock Frequency        100~167 MHz     154~200 MHz
tRAC                   ~40 ns          22~26 ns
tRC                    ~60 ns          25~30 ns
# Banks                4               4
Burst Length           2, 4, 8         2, 4
Write Latency          1 Clock         CASL -1

FCRAM/Network-DRAM looks like DDR+

With faster DRAM turnaround time on the tRC, a random access that hits the same page over and over again will have the highest bus utilization (with random R/W accesses). This is also why peak BW != sustained BW. Deviations from peak bandwidth can be due to architecture-related issues such as tRC (we cannot cycle the DRAM arrays fast enough to grab data out of the same DRAM array and re-use the sense amps).

FCRAM Continued
Faster tRC allows Samsung to claim higher bus efficiency*
* Samsung Electronics, Denali MemCon 2002
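To make the tRC argument concrete, here is a deliberately pessimistic sketch: every access goes to a new row of the same bank, so only one burst can be delivered per tRC. The 16-bit width, the burst length of 4 and the worst-case access pattern are assumptions for illustration; the clock rates and tRC values are the approximate figures quoted for DDR and FCRAM.

```python
def peak_bandwidth_mb_s(clock_mhz, transfers_per_clock, bus_width_bits):
    """Peak per-chip bandwidth in MB/s."""
    return clock_mhz * transfers_per_clock * bus_width_bits / 8


def same_bank_bandwidth_mb_s(bus_width_bits, burst_length, t_rc_ns):
    """Bandwidth when each burst needs its own row cycle in one bank.

    One burst of burst_length transfers is delivered every t_rc_ns;
    1 byte/ns equals 1000 MB/s.
    """
    bytes_per_access = bus_width_bits * burst_length / 8
    return bytes_per_access / t_rc_ns * 1000


# (name, clock MHz, tRC ns); both parts modeled as 16 bits wide, burst of 4
for name, clock_mhz, t_rc_ns in (("DDR 266", 133, 60), ("FCRAM", 200, 25)):
    peak = peak_bandwidth_mb_s(clock_mhz, 2, 16)
    worst = same_bank_bandwidth_mb_s(16, 4, t_rc_ns)
    print(f"{name}: peak {peak:.0f} MB/s, same-bank {worst:.0f} MB/s "
          f"({worst / peak:.0%} of peak)")
```

Under this model the shorter row cycle keeps a larger share of peak bandwidth usable, which is the kind of bus-efficiency claim the slide refers to.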


Another variant, but RLDRAM is targeted toward embedded systems. There are no connector specifications, so it can target a higher frequency off the bat.

RLDRAM

DRAM Type     Frequency (MHz)   Bus Width (per chip)   Peak Bandwidth (per chip)   Random Access Time (tRAC)   Row Cycle Time (tRC)
PC133 SDRAM   133               16                     200 MB/s                    45 ns                       60 ns
DDR 266       133 * 2           16                     532 MB/s                    45 ns                       60 ns
PC800 RDRAM   400 * 2           16                     1.6 GB/s                    60 ns                       70 ns
FCRAM         200 * 2           16                     0.8 GB/s                    25 ns                       25 ns
RLDRAM        300 * 2           32                     2.4 GB/s                    25 ns                       25 ns

Comparable to FCRAM in latency
Higher Frequency (No Connectors)
non-Multiplexed Address (SRAM like)

Infineon proposes that RLDRAM could be integrated onto the motherboard as an L3 cache: 64 MB of L3. This shaves 25 ns off of tRAC compared with going to SDRAM or DDR SDRAM. Not to be used as main memory, due to capacity constraints.

RLDRAM Continued
High-end PC and Server
[Diagram: Processor; Memory Controller (Northbridge); two 256Mb RLDRAM devices, each x32, 2.4 GB/s; "???"]
RLDRAM is a great replacement to SRAM in L3 cache applications because of its high density, low power and low cost*
* Infineon Presentation, Denali MemCon 2002


Unlike other DRAMs, Yellowstone is only a voltage and I/O specification, no DRAM AFAIK. RAMBUS has learned their lesson: they used expensive packaging, 8-layer motherboards, and added cost everywhere. Now the new pitch is “higher performance with same infrastructure”.

RAMBUS Yellowstone
• Bi-Directional Differential Signals
• Ultra low 200 mV p-p signal swings
• 8 data bits transferred per clock
• 400 MHz system clock
• 3.2 GHz effective data frequency
• Cheap 4 layer PCB
• Commodity packaging
[Diagram: System Clock and Data waveforms, 1.2 V and 1.0 V levels]
Octal Data Rate (ODR) Signaling

Quad Band Memory uses FET switches to control which DIMM sends output. Two DDR memory chips are interleaved to get Quad memory. Advantages: it uses standard DDR chips, and the extra cost is low, only the wrapper electronics. A modification to the memory controller is required, but it is minimal: the controller has to understand that data is being burst back at 4X clock frequency. It does not improve efficiency, but it is cheap bandwidth. It supports more loads than “ordinary DDR”, so more capacity.

Kentron QBM(TM)
[Diagram: Controller connected through SWITCH elements to pairs of DDR A and DDR B devices; some switches marked open]
“Wrapper Electronics around DDR memory”
Generates 4 data bits per cycle instead of 2.

Quad Band Memory
[Timing diagram: DDR A, DDR B, Output, Clock; data items d1 and d0]
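The interleaving idea can be shown with a toy model: two DDR devices each supply 2 transfers per clock, and the switches alternate between them so the shared bus carries 4. Which device drives the bus first, and the names used here, are assumptions made for the illustration.

```python
from itertools import chain


def qbm_interleave(ddr_a, ddr_b):
    """Merge two equal-length DDR transfer streams, alternating A, B, A, B..."""
    if len(ddr_a) != len(ddr_b):
        raise ValueError("both DDR devices must supply the same number of transfers")
    return list(chain.from_iterable(zip(ddr_a, ddr_b)))


CLOCKS = 2
# Each DDR device supplies 2 transfers per clock on its own.
ddr_a = ["a0", "a1", "a2", "a3"]
ddr_b = ["b0", "b1", "b2", "b3"]

output = qbm_interleave(ddr_a, ddr_b)
print(output)
print(len(output) // CLOCKS, "transfers per clock on the shared bus")
```

Each device still works at ordinary DDR speed; only the switches and the controller see the quadrupled rate, which is why the notes call it cheap bandwidth rather than better efficiency.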


Instead of thinking about things from a strict latency-bandwidth perspective, it might be more helpful to think in terms of a latency vs. pin-transition efficiency perspective. The idea is that

A Different Perspective
Everything is bandwidth
Latency and Bandwidth
Pin-bandwidth and Pin-transition * Efficiency (bits/cycle/sec)
[Diagram labels: Clock, Col. Cmd/Addr Bandwidth, Write Data Bandwidth, Read Data Bandwidth, Row Cmd/Addr Bandwidth]

A DRAM system is basically a networking system with a smart master controller and a large number of “dumb” slave devices. If we are concerned about “efficiency” on a bit/pin/sec level, it might behoove us to draw inspiration from network interfaces, and design something like this: unidirectional command and write packets from the controller to the DRAM chips, and a unidirectional bus from the DRAM chips back to the controller. Then it looks like a network system with a slot ring interface, with no need to deal with bus turnaround issues.

Research Areas: Topology
Unidirectional Topology:
• Write Packets sent on Command Bus
• Pins used for Command/Address/Data
• Further Increase of Logic on DRAM chips


Certain things simply do not make sense to do, such as various STREAM components: moving multimegabyte arrays from DRAM to the CPU just to perform a simple “add” function, then moving those multimegabyte arrays right back. In such extremely bandwidth-constrained applications, it would be beneficial to have some logic or hardware on the DRAM chips that can perform simple computation. This is tricky, since we do not want to add so much logic as to make the DRAM chips prohibitively expensive to manufacture. (Logic overhead decreases with each generation, so adding logic is not an impossible dream.) Also, we do not want to add logic into the critical path of a DRAM access. That would slow down a general access in terms of the “real latency” in ns.

Memory Commands?
Instead of A[ ] = 0; do “write 0”
[Diagram: Act followed by Write of 000000 data, versus Act followed by a single “Write 0” command]
Why do STREAM add in CPU? A[ ] = B[ ] + C[ ]
Active Pages * (Chong et. al. ISCA ‘98)
Why do A[ ] = B[ ] in CPU? Move data inside of DRAM or between DRAMs.
[Diagram label: Memory Controller]

For a given physical address, there are a number of ways to map the bits of the physical address to generate the “memory address” in terms of device ID, row/col addr, and bank id. The mapping policy can impact performance, since badly mapped systems can cause bank conflicts in consecutive accesses. Now, mapping policies must also take temperature control into account, as consecutive accesses that hit the same DRAM chip can potentially create undesirable hot spots. One reason for the additional cost of RDRAM initially was the use of heat spreaders on the memory modules to prevent the hot spots from building up.

Address Mapping
Access Distribution for Temp Control
Avoid Bank Conflicts
Access Reordering for performance
[Diagram: Physical Address split into Row Addr, Bank Id, Col Addr, Device Id]
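As a rough illustration of address mapping, the sketch below slices a physical address into device, row, bank and column fields. The field order and widths in `LAYOUT` are hypothetical, chosen only for the example; they are not the mapping used in the tutorial's slides, and the decoded ids will not match the ones shown there.

```python
from collections import namedtuple

MemAddr = namedtuple("MemAddr", "device row bank col")

# Hypothetical layout, lowest-order field first: (field name, width in bits).
LAYOUT = [("col", 10), ("bank", 2), ("device", 2), ("row", 12)]


def decode(phys_addr, layout=LAYOUT):
    """Peel fields off the low end of the address according to the layout."""
    fields = {}
    for name, width in layout:
        fields[name] = phys_addr & ((1 << width) - 1)
        phys_addr >>= width
    return MemAddr(**fields)


for addr in (0x05AE5700, 0x05AE5780):
    m = decode(addr)
    print(f"{addr:08X} -> device {m.device}, row {m.row:03X}, "
          f"bank {m.bank}, col {m.col:03X}")
```

Reordering the entries in `LAYOUT` changes which address bits select the bank and device, and therefore whether consecutive accesses spread across banks and chips or pile onto one.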


Each memory system consists of one or more memory chips, and most of the time, accesses to these chips can be pipelined. Each chip also has a multitude of banks, and most of the time, accesses to these banks can also be pipelined. (The key to efficiency is to pipeline commands.)

Example: Bank Conflicts
[Diagram: four banks, each with Row Decoder, Memory Array (Bit Lines), Sense Amps, Column Decoder]
Multiple Banks to Reduce Access Conflicts
Read 05AE5700: Device id 3, Row id 266, Bank id 0
Read 023BB880: Device id 3, Row id 1BA, Bank id 0
Read 00CBA2C0: Device id 3, Row id 052, Bank id 1
Read 05AE5780: Device id 3, Row id 266, Bank id 0
More Banks per Chip == Performance == Logic Overhead
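The four reads listed above can be classified mechanically. This sketch assumes an open-page policy (a row stays open until another row in the same bank is needed) and takes the device, row and bank ids exactly as listed; the function name and labels are illustrative.

```python
def classify(requests):
    """Label each (device, row, bank) access given the rows left open before it."""
    open_rows = {}  # (device, bank) -> row currently held in the sense amps
    labels = []
    for device, row, bank in requests:
        key = (device, bank)
        if key not in open_rows:
            labels.append("activate")       # bank idle: just open the row
        elif open_rows[key] == row:
            labels.append("row hit")        # row already open
        else:
            labels.append("bank conflict")  # precharge, then activate new row
        open_rows[key] = row
    return labels


REQUESTS = [
    (3, 0x266, 0),  # Read 05AE5700
    (3, 0x1BA, 0),  # Read 023BB880
    (3, 0x052, 1),  # Read 00CBA2C0
    (3, 0x266, 0),  # Read 05AE5780
]

print(classify(REQUESTS))
```

In this order the second and fourth reads both conflict in bank 0, even though the first and last reads target the same row.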

Each Load command is translated to a row command and a column command. If two commands are mapped to the same bank, one must be completed before the other can start. Or, if we can re-order the sequences, then the entire sequence can be completed faster. By allowing Read 3 to bypass Read 2, we do not need to generate another row activation command. Read 4 may also bypass Read 2, since it operates on a different device/bank entirely. DRAM now can do auto precharge, but I put in the precharge explicitly to show that two rows cannot be active within tRC (DRAM architecture) constraints.

Example: Access Reordering
Read 05AE5700: Device id 3, Row id 266, Bank id 0
Read 023BB880: Device id 3, Row id 1BA, Bank id 0
Read 00CBA2C0: Device id 1, Row id 052, Bank id 1
Read 05AE5780: Device id 3, Row id 266, Bank id 0
Act = Activate Page (Data moved from DRAM cells to row buffer)
Read = Read Data (Data moved from row buffer to memory controller)
Prec = Precharge (close page/evict data in row buffer/sense amp)
[Timing diagram, Strict Ordering: requests 1 to 4 issued in order, each as Act, Read, Data, Prec]
[Timing diagram, Memory Access Re-ordered: the same requests with row hits grouped, subject to tRC]
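A simple way to see the benefit of reordering is to count row activations before and after. The scheduler below is a minimal sketch, not the controller policy from the tutorial: it pulls any pending request to an already-scheduled device/row/bank directly behind the first such request, and it assumes all requests are independent reads so that reordering is safe.

```python
def reorder_row_hits(requests):
    """Group requests that target the same (device, row, bank) back to back."""
    scheduled, pending = [], list(requests)
    while pending:
        head = pending.pop(0)
        scheduled.append(head)
        # Requests to the same device/row/bank ride along as row hits.
        scheduled.extend(r for r in pending if r[1:] == head[1:])
        pending = [r for r in pending if r[1:] != head[1:]]
    return scheduled


def count_activations(requests):
    """Row activations needed under an open-page policy."""
    open_rows, count = {}, 0
    for _addr, device, row, bank in requests:
        if open_rows.get((device, bank)) != row:
            count += 1
            open_rows[(device, bank)] = row
    return count


# (address, device, row, bank) as listed on the slide
REQUESTS = [
    ("05AE5700", 3, 0x266, 0),
    ("023BB880", 3, 0x1BA, 0),
    ("00CBA2C0", 1, 0x052, 1),
    ("05AE5780", 3, 0x266, 0),
]

reordered = reorder_row_hits(REQUESTS)
print([r[0] for r in reordered])
print(count_activations(REQUESTS), "activations in strict order")
print(count_activations(reordered), "activations after reordering")
```

Letting the second access to row 266 bypass the conflicting request saves one activate/precharge pair in this example.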


Now, talk about performance issues.

Outline
• Basics
• DRAM Evolution: Structural Path
• Advanced Basics
• DRAM Evolution: Interface Path
• Future Interface Trends & Research Areas
• Performance Modeling: Architectures, Systems, Embedded

Simulator Overview
CPU: SimpleScalar v3.0a
• 8-way out-of-order
• L1 cache: split 64K/64K, lockup free x32
• L2 cache: unified 1MB, lockup free x1
• L2 blocksize: 128 bytes
Main Memory: 8 64Mb DRAMs
• 100MHz/128-bit memory bus
• Optimistic open-page policy
Benchmarks: SPEC ’95


DRAM Configurations
Note: TRANSFER WIDTH of Direct Rambus Channel
• equals that of ganged FPM, EDO, etc.
• is 2x that of Rambus & SLDRAM
FPM, EDO, SDRAM, ESDRAM, DDR: [Diagram: CPU and caches, 128-bit 100MHz bus, Memory Controller, DIMM of eight x16 DRAMs]
Rambus, Direct Rambus, SLDRAM: [Diagram: CPU and caches, 128-bit 100MHz bus, Memory Controller, Fast, Narrow Channel to eight DRAMs]

DRAM Configurations
Strawman: Rambus, etc.: [Diagram: CPU and caches, 128-bit 100MHz bus, Memory Controller, Multiple Parallel Channels of DRAMs]
Rambus & SLDRAM dual-channel: [Diagram: CPU and caches, 128-bit 100MHz bus, Memory Controller, Fast, Narrow Channel x2, each with eight DRAMs]


First … Refresh Matters
Assumes refresh of each bank every 64 ms
[Chart: compress. Time per Access (ns), 0 to 1200, by DRAM Configurations: FPM1, FPM2, FPM3, EDO1, EDO2, SDRAM1, ESDRAM, SLDRAM, RDRAM, DRDRAM. Components: Bus Transmission Time, Row Access Time, Column Access Time, Data Transfer Time Overlap, Data Transfer Time, Refresh Time, Bus Wait Time]

Overhead: Memory vs. CPU
Variable: speed of processor & caches
[Chart: Total Execution Time in CPI — SDRAM. Clocks Per Instruction (CPI), 0 to 3, by BENCHMARK: Compress, Gcc, Go, Ijpeg, Li, M88ksim, Perl, Vortex; bars for Yesterday’s CPU, Today’s CPU, Tomorrow’s CPU. Components: Processor Execution (includes caches), Overlap between Execution & Memory, Stalls due to Memory Access Time]


Definitions (var. on Burger, et al)
• tPROC — processor with perfect memory
• tREAL — realistic configuration
• tBW — CPU with wide memory paths
• tDRAM — time seen by DRAM system

Stalls Due to BANDWIDTH = tREAL - tBW
Stalls Due to LATENCY = tBW - tPROC
CPU+L1+L2 Execution = tREAL - tDRAM
CPU-Memory OVERLAP = tPROC - (tREAL - tDRAM)
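The four definitions combine into a breakdown of execution time that can be written as a small function. The input values in the example are invented purely to exercise the formulas (arbitrary time units); they are not measurements from the study.

```python
def decompose(t_proc, t_real, t_bw, t_dram):
    """Split execution time into the components defined on the slide."""
    return {
        "stalls_bandwidth": t_real - t_bw,
        "stalls_latency": t_bw - t_proc,
        "cpu_execution": t_real - t_dram,
        "overlap": t_proc - (t_real - t_dram),
    }


# Hypothetical timings, for illustration only.
parts = decompose(t_proc=100, t_real=200, t_bw=160, t_dram=130)
for name, value in parts.items():
    print(f"{name}: {value}")
```

Two consistency checks fall out of the definitions: CPU execution plus overlap equals tPROC, and tPROC plus both stall components equals tREAL.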

Memory & CPU — PERL
Bandwidth-Enhancing Techniques I:
[Chart: Cycles Per Instruction (CPI), 0 to 5, by DRAM Configuration: FPM, EDO, DRDRAM, ESDRAM, DDR, SLDRAM, RDRAM, SDRAM; bars for Yesterday’s CPU, Today’s CPU, Tomorrow’s CPU. Components: Processor Execution, Overlap between Execution & Memory, Stalls due to Memory Latency, Stalls due to Memory Bandwidth. Annotation: Newer DRAMs]


Memory & CPU — PERL
Bandwidth-Enhancing Techniques II:
[Chart: Execution Time in CPI — PERL. Cycles Per Instruction (CPI), 0 to 5, by DRAM Configuration (10GHz CPUs): FPM/interleaved, EDO/interleaved, SDRAM & DDR, SLDRAM x1/x2, RDRAM x1/x2. Components: Processor Execution, Overlap between Execution & Memory, Stalls due to Memory Latency, Stalls due to Memory Bandwidth]

Average Latency of DRAMs

note: SLDRAM & RDRAM 2x data transfers

[Figure: stacked bars of Avg Time per Access (ns, 0 to 500) per DRAM Configuration (FPM, EDO, SDRAM, ESDRAM, DDR, SLDRAM, RDRAM, DRDRAM). Bar components: Bus Transmission Time; Row Access Time; Column Access Time; Data Transfer Time Overlap; Data Transfer Time; Refresh Time; Bus Wait Time]


DDR2 Study Results

[Figure: Architectural Comparison. Normalized Execution Time (DDR2), 0.5 to 3, per Benchmark (cc1, compress, go, ijpeg, li, linear_walk, mpeg2dec, mpeg2enc, pegwit, perl, random_wa..., stream, stream_no_un...) for pc100, ddr133, drd, ddr2, ddr2ems, and ddr2vc]

DDR2 Study Results

[Figure: Perl Runtime. Execution Time (Sec.), 0 to 0.9, vs. Processor Frequency (1 Ghz, 5 Ghz, 10 Ghz) for pc100, ddr133, drd, ddr2, ddr2ems, and ddr2vc]

Row-Buffer Hit Rates

[Figure: six panels of Hit rate in row buffers (0 to 100). SPEC INT 95 Benchmarks (Compress, Gcc, Go, Ijpeg, Li, M88ksim, Perl, Vortex) and ETCH Traces (Acroread, Compress, Gcc, Go, Netscape, Perl, Photoshop, Powerpoint, Winword), each shown with a 1MB L2 Cache, a 4MB L2 Cache, and No L2 Cache. Legend: FPM DRAM, EDO DRAM, SDRAM, ESDRAM, DDR SDRAM, SLDRAM, RDRAM, DRDRAM]

Row-Buffer Hit Rates

[Figure: Hits vs. Depth in Victim-Row FIFO Buffer (depth 0 to 10), one panel each for Compress, Go, Ijpeg, Li, Perl, and Vortex]

[Figure: Inter-arrival time (CPU Clocks) for Compress and Vortex; axis ticks 2000 to 10000 and 1 to 100000 (log scale)]

Row Buffers as L2 Cache

[Figure: stacked bars of Clocks Per Instruction (CPI, 0 to 20) for Compress, Gcc, Go, Ijpeg, Li, M88ksim, Perl, and Vortex. Bar components: Processor Execution; Overlap between Execution & Memory; Stalls due to Memory Latency; Stalls due to Memory Bandwidth]

Each memory transaction breaks down into a two-part access: a row access and a column access. In essence, the row buffer/sense amp is acting as a cache: a page is brought in from the memory array and stored in the buffer, then the second step is to move that data from the row buffers back into the memory controller. from a certain perspective, it makes sense to speculatively move pages from memory arrays into the row buffers to maximize the page-hit rate of column accesses and reduce latency.

The cost of a speculative row activation command is the ~20 bits of bandwidth sent on the command channel from controller to DRAM. instead of prefetching into DRAM, we’re just prefetching inside of DRAM.

Row buffer hit rates are 40~90%, depending on application. they *could* be near 100% if the memory system gets speculative row buffer management commands. (This only makes sense if the memory controller is integrated)

Row Buffer Management

RAS is like Cache Access

Why not Speculate?

[Figure: two copies of the DRAM array diagram (Memory Array, Bit Lines, Row Decoder, Sense Amps, Column Decoder, Data In/Out Buffers), one labeled RAS: ROW ACCESS and one labeled CAS: COLUMN ACCESS]
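The idea of the row buffer acting as a cache can be made concrete with a small model: each bank remembers its open row, and an access hits if it falls in that row. This is a sketch under stated assumptions, not the simulator behind the hit rates quoted above; the address mapping (rows interleaved across banks), the 2048-byte row size, and the function name are all hypothetical.

```python
def row_buffer_hit_rate(addresses, num_banks=4, row_bytes=2048):
    """Fraction of accesses that find their row already open (open-page model)."""
    open_row = {}  # bank -> row currently held in that bank's sense amps
    hits = 0
    for addr in addresses:
        row_id = addr // row_bytes
        bank = row_id % num_banks   # assumed mapping: consecutive rows rotate over banks
        row = row_id // num_banks
        if open_row.get(bank) == row:
            hits += 1               # column access only
        else:
            open_row[bank] = row    # row access: activate the new row
    return hits / len(addresses) if addresses else 0.0


# Sequential 64-byte accesses over 8 KB: one miss per 2 KB row.
sequential = row_buffer_hit_rate(list(range(0, 8192, 64)))

# Ping-ponging between two rows of the same bank never hits.
conflict = row_buffer_hit_rate([0, 8192] * 10)
```

The two traces bracket the behaviour: streaming access hits almost every time, while alternating rows in one bank defeats the row buffer entirely, which is the case speculative row management would try to avoid.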

Cost-Performance

FPM, EDO, SDRAM, ESDRAM:
• Lower Latency => Wide/Fast Bus
• Increase Capacity => Decrease Latency
• Low System Cost

Rambus, Direct Rambus, SLDRAM:
• Lower Latency => Multiple Channels
• Increase Capacity => Increase Capacity
• High System Cost

However, 1 DRDRAM = Multiple SDRAM

Conclusions

100MHz/128-bit Bus is Current Bottleneck
• Solution: Fast Bus/es & MC on CPU (e.g. Alpha 21364, Emotion Engine, …)

Current DRAMs Solving Bandwidth Problem (but not Latency Problem)
• Solution: New cores with on-chip SRAM (e.g. ESDRAM, VCDRAM, …)
• Solution: New cores with smaller banks (e.g. MoSys “SRAM”, FCRAM, …)

Direct Rambus seems to scale best for future high-speed CPUs

now -- let’s talk about DRAM performance at the SYSTEM level.

previous studies show the MEMORY BUS is a significant bottleneck in today’s high-performance systems

- Schumann reports that in Alpha workstations, 30-60% of PRIMARY MEMORY LATENCY is due to SYSTEM OVERHEAD other than DRAM latency
- Harvard study cites BUS TURNAROUND as responsible for factor-of-two difference between PREDICTED and MEASURED performance in P6 systems
- our previous work shows today’s busses (1999’s busses) are bottlenecks for tomorrow’s DRAMs

so -- look at bus, model system overhead

Outline

• Basics
• DRAM Evolution: Structural Path
• Advanced Basics
• DRAM Evolution: Interface Path
• Future Interface Trends & Research Areas
• Performance Modeling: Architectures, Systems, Embedded

at the SYSTEM LEVEL -- i.e. outside the CPU -- we find a large number of parameters

this study only VARIES a handful, but it still yields a fairly large space

the parameters we VARY are in blue & green

by “partially” independent, i mean that we looked at a small number of possibilities:

- turnaround is 0/1 cycle on 800MHz bus
- request ordering is INTERLEAVED or NOT

Motivation

Even when we restrict our focus …

SYSTEM-LEVEL PARAMETERS

• Number of channels
• Width of channels
• Channel latency
• Channel bandwidth
• Banks per channel
• Turnaround time
• Request-queue size
• Request reordering
• Row-access
• Column-access
• DRAM precharge
• CAS-to-CAS latency
• DRAM buffering
• L2 cache block size
• Number of MSHRs
• Bus protocol

[Legend, color-coded on the slide: Fully | partially | not independent (this study)]

and yet, even in this restricted design space, we find EXTREMELY COMPLEX results

SYSTEM is very SENSITIVE to CHANGES in these parameters

[discuss graph]

if you hold all else constant and vary one parameter, you can see extremely large changes in end performance ... up to 40% difference by changing ONE PARAMETER by a FACTOR OF TWO (e.g. doubling the number of banks, doubling the size of the burst, doubling the number of channels, etc.)

Motivation

... the design space is highly non-linear

[Figure: GCC. Cycles per Instruction (CPI, 0 to 3) vs. System Bandwidth (GB/s = Channels * Width * 800MHz), with bars for 32-Byte, 64-Byte, and 128-Byte Burst. Organizations by bandwidth:
0.8 GB/s: 1 chan x 1 byte
1.6 GB/s: 1 chan x 2 bytes, 2 chan x 1 byte
3.2 GB/s: 1 chan x 4 bytes, 2 chan x 2 bytes, 4 chan x 1 byte
6.4 GB/s: 1 chan x 8 bytes, 2 chan x 4 bytes, 4 chan x 2 bytes
12.8 GB/s: 2 chan x 8 bytes, 4 chan x 4 bytes
25.6 GB/s: 4 chan x 8 bytes]
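The x-axis grouping follows from the formula on the slide, GB/s = Channels * Width * 800MHz. A minimal sketch that reproduces the six bandwidth points and the organizations that share each one is shown below; the function and variable names are illustrative only.

```python
from itertools import product

BUS_MHZ = 800  # bus speed used throughout the study


def system_bandwidth_gbs(channels, width_bytes, bus_mhz=BUS_MHZ):
    """System bandwidth in GB/s = channels * channel width (bytes) * bus speed."""
    return channels * width_bytes * bus_mhz / 1000


# Group every (channels, width) organization by the bandwidth it provides.
by_bandwidth = {}
for channels, width in product((1, 2, 4), (1, 2, 4, 8)):
    bw = system_bandwidth_gbs(channels, width)
    by_bandwidth.setdefault(bw, []).append((channels, width))

bandwidth_points = sorted(by_bandwidth)
```

Several organizations land on the same bandwidth point (for example 1 x 4, 2 x 2, and 4 x 1 bytes all give 3.2 GB/s), which is why the figure can compare them at equal raw bandwidth.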

so -- we have the worst possible scenario: a design space that is very sensitive to changes in parameters and execution times that can vary by a FACTOR OF THREE from worst-case to best

clearly, we would be well-served to understand this design space

Motivation

... and the cost of poor judgment is high.

[Figure: Cycles per Instruction (CPI, 0 to 10) for SPEC 2000 Benchmarks (bzip, gcc, mcf, parser, perl, vpr, average), comparing Best Organization, Average Organization, and Worst Organization]

so by now we’re very familiar with this picture ...

we cannot use it in this study, because this represents the interface between the DRAM and the MEMORY CONTROLLER.

typically, the CPU’s interface is much simpler: the CPU sends all of the address bits at once with CONTROL INFO (r/w), and the memory controller handles the bit addressing and the RAS/CAS timing

System-Level Model

SDRAM Timing

[Figure: SDRAM timing diagram with Clock, Command (ACT, READ), Address (Row Addr, Col Addr), and DQ (four Valid Data beats); phases labeled Row Access, Column Access, Transfer Overlap, and Data Transfer]

this gives the picture of what is happening at the SYSTEM LEVEL

the CPU to MEM CONTROLLER activity is shown as “ABUS Active”

the MEM-CONTROLLER to DRAM activity is shown as DRAM BANK ACTIVE

and the data read-out is shown as “DBUS Active”

System-Level Model

Timing diagrams are at the DRAM level … not the system level

[Figure: the SDRAM timing diagram (Clock, Command, Address, DQ; ACT, READ; Row Access, Column Access, Transfer Overlap, Data Transfer) overlaid with three system-level intervals: ABUS Active, DRAM Bank Active, and DBUS Active]

if the DRAM’s pins do not connect directly to the CPU (e.g. in a hierarchical bus organization, or if the data is funnelled through the memory controller like the northbridge chipset), then there is yet another DBUS ACTIVE timing slot that follows below and to the right ... this can continue to extend to any number of hierarchical levels, as seen in huge server systems with hundreds of GB of DRAM

System-Level Model

Timing diagrams are at the DRAM level … not the system level

[Figure: the SDRAM timing diagram (Clock, Command, Address, DQ; ACT, READ; Row Access, Column Access, Transfer Overlap, Data Transfer) overlaid with four system-level intervals: ABUS Active, DRAM Bank Active, DBUS1 Active, and DBUS2 Active]

so let’s formalize this system-level interface. here’s the request timing in a slightly different way, as well as an example system model taken from schumann’s paper describing the 21174 memory controller

the DRAM’s data pins are connected directly to the CPU (simplest possible model), the memory controller handles the RAS/CAS timing, and the CPU and memory controller only talk in terms of addresses and control information

Request Timing

[Figure: example system model with CPU, Cache, MC, and four DRAMs. Labels: Backside bus, Frontside bus, Data bus (800 MHz), Row/Column Addresses & Control, Address, Control]

[Figure: READ REQUEST TIMING (800 MHz), drawn from t0 across ADDRESS BUS, DRAM BANK, and DATA BUS rows, showing <ROW>, <COL>, <PRE> and the data beats <DB0><DB1><DB2><DB3>]

such a model gives us these types of request shapes for reads and writes

this shows a few example bus/burst configurations, in particular:

4-byte bus with burst sizes of 32, 64, and 128 bytes per burst

Read/Write Request Shapes

[Figure: READ REQUESTS and WRITE REQUESTS, three shapes each, drawn from t0 across ADDRESS BUS, DRAM BANK, and DATA BUS rows. Read timing labels: 10ns, 90ns, 70ns; 20ns, 90ns, 70ns; 40ns, 100ns, 70ns. Write timing labels: 10ns, 10ns, 90ns, 40ns; 20ns, 10ns, 90ns, 40ns; 40ns, 10ns, 90ns, 40ns. Each address-bus segment is labeled 10ns]
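The data-bus occupancy of each burst can be worked out from the bus parameters given in the notes. The sketch below assumes one transfer of the full bus width per 800 MHz bus cycle, in line with the deck's formula GB/s = Channels * Width * 800MHz; the function name is hypothetical.

```python
def burst_transfer_ns(burst_bytes, bus_bytes=4, bus_mhz=800):
    """Time the data bus is occupied by one burst, in nanoseconds.

    Assumes one bus-wide transfer per bus cycle (cycle time = 1000 / bus_mhz ns).
    """
    transfers = burst_bytes // bus_bytes
    return transfers * 1000 / bus_mhz


# 4-byte bus at 800 MHz with the three burst sizes from the notes.
occupancy = {burst: burst_transfer_ns(burst) for burst in (32, 64, 128)}
```

Under that assumption the three bursts occupy the data bus for 10, 20, and 40 ns, which appears consistent with the 10ns, 20ns, and 40ns labels on the request shapes.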

the bus is PIPELINED and supports SPLIT TRANSACTIONS where one request can be NESTLED inside of another if the timing is right

[explain examples]

what we’re trying to do is to fit these 2D puzzle pieces together in TIME

Pipelined/Split Transactions

[Figure: panels (a), (b), and (c), each pairing two request shapes (Read/Write, Read/Read, or Read/Write) annotated with their 10ns, 20ns, 40ns, 70ns, 90ns, and 100ns segment lengths. Captions:
• Nestling of writes inside reads is legal if R/W to different banks
• Legal if R/R to different banks
• Back-to-back R/W pair that cannot be nestled
• Legal if turnaround <= 10ns
• Legal if no turnaround]

as for physical connections, here are the ways we modeled independent DRAM channels and independent BANKS per CHANNEL

the figure shows a few of the parameters that we study. in addition, we look at:

- turnaround time (0, 1 cycle)
- queue size (0, 1, 2, 4, 8, 16, 32, infinite requests per channel)

Channels & Banks

• 1, 2, 4 800 MHz Channels
• 8, 16, 32, 64 Data Bits per Channel
• 1, 2, 4, 8 Banks per Channel (Indep.)
• 32, 64, 128 Bytes per Burst

[Figure: controller (C) and DRAM (D) block diagrams for three organizations: One independent channel, Two independent channels, and Four independent channels, each with Banking degrees of 1, 2, 4, ...]
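To get a feel for the size of this design space, the parameter values listed above can be enumerated as a cross product. This is only an illustrative sketch: the tutorial does not say that every combination was simulated, and the `Config` type and its field names are invented here.

```python
from itertools import product
from math import inf
from typing import NamedTuple


class Config(NamedTuple):
    channels: int         # independent 800 MHz channels
    data_bits: int        # data bits per channel
    banks: int            # independent banks per channel
    burst_bytes: int      # bytes per burst
    turnaround: int       # bus turnaround, in cycles
    queue_size: float     # requests per channel (inf = unbounded)

    def bandwidth_gbs(self, bus_mhz=800):
        """GB/s = channels * width in bytes * bus speed."""
        return self.channels * (self.data_bits // 8) * bus_mhz / 1000


configs = [
    Config(*values)
    for values in product(
        (1, 2, 4),                       # channels
        (8, 16, 32, 64),                 # data bits per channel
        (1, 2, 4, 8),                    # banks per channel
        (32, 64, 128),                   # bytes per burst
        (0, 1),                          # turnaround cycles
        (0, 1, 2, 4, 8, 16, 32, inf),    # queue size
    )
]
```

Even with only six parameters and a handful of values each, the full cross product contains a few thousand organizations, which gives a sense of why the study restricts itself to a subset.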

how do you chunk up a cache block? (L2 caches use 128-byte blocks)

[read the bullets]

LONGER BURSTS amortize the cost of activating and precharging the row over more data transferred

SHORTER BURSTS allow the critical word of a FOLLOWING REQUEST to be serviced sooner

so -- this is not novel, but it is fairly aggressive

NOTE: we use close-page, autoprecharge policy with ESDRAM-style buffering of ROW in SRAM. result: get the best possible precharge overlap AND multiple burst requests to the same row will not re-invoke a RAS cycle unless an intervening READ request goes to a different ROW in same BANK

Burst Scheduling (Back-to-Back Read Requests)

• Critical-burst-first
• Non-critical bursts are promoted
• Writes have lowest priority (tend to back up in request queue …)
• Tension between large & small bursts: amortization vs. faster time to data

[Figure: back-to-back read request timelines for 128-Byte Bursts, 64-Byte Bursts, and 32-Byte Bursts]
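One way to picture critical-burst-first is to split a 128-byte block into bursts and issue the burst containing the missed address first. The sketch below is an assumption-laden illustration, not the tutorial's scheduler: the wrap-around order of the remaining bursts, the three priority classes, and all names are hypothetical, and the promotion of non-critical bursts is not modeled.

```python
def bursts_critical_first(miss_addr, block_bytes=128, burst_bytes=32):
    """Burst start addresses for one cache block, critical burst first.

    The remaining bursts follow in wrap-around order (an assumption).
    """
    block_base = miss_addr - miss_addr % block_bytes
    count = block_bytes // burst_bytes
    first = (miss_addr - block_base) // burst_bytes
    return [block_base + ((first + i) % count) * burst_bytes for i in range(count)]


# Lower number = served earlier; ties are broken by arrival order.
PRIORITY = {"critical_read": 0, "noncritical_read": 1, "write": 2}


def schedule(requests):
    """Order (arrival, kind, addr) tuples by priority class, then arrival."""
    return sorted(requests, key=lambda req: (PRIORITY[req[1]], req[0]))


order_32 = bursts_critical_first(0x1048)                  # four 32-byte bursts
order_64 = bursts_critical_first(0x1048, burst_bytes=64)  # two 64-byte bursts
queue = schedule([
    (0, "write", 0x2000),
    (1, "noncritical_read", 0x1060),
    (2, "critical_read", 0x3000),
])
```

With 32-byte bursts the block yields four requests, so a following request's critical burst can slot in after the first of them; with 128-byte bursts there is a single long transfer, which is the amortization versus time-to-data tension named on the slide.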

we run a series of different simulations to get break-downs:

- CPU Activity
- memory activity overlapped with CPU
- non-overlapped - SYSTEM
- non-overlapped - due to DRAM

so the top two are MEMORY STALL CYCLES, bottom two are PERFECT MEMORY

Note: MEMORY LATENCY is not further divided into latency/bandwidth/etc.

New Bar-Chart Definition

• tPROC: CPU with 1-cycle L2 miss
• tREAL: realistic CPU/DRAM config
• tSYS: CPU with 1-cycle DRAM latency
• tDRAM: time seen by DRAM system

[Figure: execution-time bar marked with tDRAM, tPROC, tSYS, and tREAL, divided into four components]

• Stalls Due to DRAM Latency = tREAL - tSYS
• Stalls Due to Queue, Bus, ... = tSYS - tPROC
• CPU+L1+L2 Execution = tREAL - tDRAM
• CPU-Memory OVERLAP = tPROC - (tREAL - tDRAM)

so -- we’re modeling a memory system that is fairly aggressive in terms of

- scheduling policies
- support for concurrency

and we’re trying to find which of the following is to blame for the most overhead:

- concurrency
- latency
- system (queueing, precharge, chunks, etc)

System Overhead

[Figure: Cycles per Instruction (CPI, 0 to 3) vs. System Bandwidth (GB/s = Channels * Width * Speed) at 1.6, 3.2, and 6.4 GB/s, with 1, 2, 4, and 8 banks/channel at each point. Legend: 0-Cycle Bus Turnaround; Regular Bus Organization; Perfect Memory. Benchmark = BZIP (SPEC 2000), 32-byte burst, 16-bit bus]

the figure shows:

- system overhead is significant (usually 20-40% of the total memory overhead)
- the most significant overhead tends to be the DRAM latency
- turnaround is relatively insignificant (however, remember that this is an 800MHz bus system ...)

System Overhead

System overhead 10–100% over perfect memory

[Figure: Cycles per Instruction (CPI, 0 to 3) vs. System Bandwidth (GB/s = Channels * Width * Speed) at 1.6, 3.2, and 6.4 GB/s, with 1, 2, 4, and 8 banks/channel at each point. Legend: 0-Cycle Bus Turnaround; Regular Bus Organization. Benchmark = BZIP (SPEC 2000), 32-byte burst, 16-bit bus]

the figure also shows that increasing BANKS PER CHANNEL gives you almost as much benefit as increasing CHANNEL BANDWIDTH, which is much more costly to implement

=> clearly, there are some concurrency effects going on, and we’d like to quantify and better understand them

Concurrency Effects

Banks/channel as significant as channel BW

[Figure: Cycles per Instruction (CPI, 0 to 3) vs. System Bandwidth (GB/s = Channels * Width * Speed) at 1.6, 3.2, and 6.4 GB/s, with bars for 1 Bank, 2 Banks, 4 Banks, and 8 Banks per Channel; arrows mark the effect of BANKS per CHANNEL and of CHANNEL BANDWIDTH. Benchmark = BZIP (SPEC 2000), 32-byte burst, 16-bit bus]

take a look at a larger portion of the design space

x-axis: SYSTEM BANDWIDTH, which == channels X channel width X 800MHz

y-axis: execution time of application on the given configuration

different colored bars represent different burst widths

MEMORY OVERHEAD is substantial

obvious trend: more bandwidth is better

another obvious trend: more bandwidth is NOT NECESSARILY better ...

Bandwidth vs. Burst Width

[Figure: Cycles per Instruction (CPI, 0 to 3) vs. System Bandwidth (0.8 to 25.6 GB/s = Channels * Width * 800MHz) for 32-Byte, 64-Byte, and 128-Byte Burst, across channel organizations from 1 chan x 1 byte to 4 chan x 8 bytes. Benchmark = GCC (SPEC 2000), 2 banks/channel]

so -- there are some obvious trade-offs related to BURST SIZE, which can affect the TOTAL EXECUTION TIME by 30% or more, keeping all else constant.

Bandwidth vs. Burst Width

[Figure: Cycles per Instruction (CPI, 0 to 3) vs. System Bandwidth (0.8 to 25.6 GB/s = Channels * Width * 800MHz) for 32-Byte, 64-Byte, and 128-Byte Burst, across channel organizations from 1 chan x 1 byte to 4 chan x 8 bytes. Benchmark = GCC (SPEC 2000), 2 banks/channel]

if we look more closely at individual system organizations, there are some clear RULES of THUMB that appear ...

Bandwidth vs. Burst Width

[Figure: Cycles per Instruction (CPI, 0 to 3) vs. System Bandwidth (0.8 to 25.6 GB/s = Channels * Width * 800MHz) for 32-Byte, 64-Byte, and 128-Byte Burst, across channel organizations from 1 chan x 1 byte to 4 chan x 8 bytes. Benchmark = GCC (SPEC 2000), 2 banks/channel]

for one, LARGE BURSTS are optimal for WIDER CHANNELS

Bandwidth vs. Burst Width

Wide channels (32/64-bit) want large bursts

[Figure: Cycles per Instruction (CPI, 0 to 3) vs. System Bandwidth (0.8 to 25.6 GB/s = Channels * Width * 800MHz) for 32-Byte, 64-Byte, and 128-Byte Burst, across channel organizations from 1 chan x 1 byte to 4 chan x 8 bytes. Benchmark = GCC (SPEC 2000), 2 banks/channel]

for another, SMALL BURSTS are optimal for NARROW CHANNELS

Bandwidth vs. Burst Width

[Figure: CPI (axis 0 to 3) against System Bandwidth (GB/s = Channels * Width * 800MHz; 0.8 to 25.6 GB/s) for 32-, 64-, and 128-byte bursts. Benchmark = GCC (SPEC 2000), 2 banks/channel. Configurations range from 1 chan x 1 byte to 4 chan x 8 bytes.]

Narrow channels (8-bit) want small bursts

and MEDIUM BURSTS are optimal for MEDIUM CHANNELS

so -- if CONCURRENCY were all-important, we would expect small bursts to be best, because they would allow a LOWER AVERAGE TIME-TO-CRITICAL-WORD for a larger number of simultaneous requests.

what we actually see is that the optimal burst width scales with the bus width, suggesting an optimal number of DATA TRANSFERS per BANK ACTIVATION/PRECHARGE cycle

i'll illustrate that ...

Bandwidth vs. Burst Width

[Figure: CPI (axis 0 to 3) against System Bandwidth (GB/s = Channels * Width * 800MHz; 0.8 to 25.6 GB/s) for 32-, 64-, and 128-byte bursts. Benchmark = GCC (SPEC 2000), 2 banks/channel. Configurations range from 1 chan x 1 byte to 4 chan x 8 bytes.]

Medium channels (16-bit) want medium bursts

this figure shows the entire range of burst widths that we modeled. note that some of the diagrams represent several different combinations ... for example, the one THIRD DOWN FROM TOP is
2-byte channel + 32-byte burst
4-byte channel + 64-byte burst
8-byte channel + 128-byte burst

Burst Width Scales with Bus

[Figure: Range of Burst-Widths Modeled. Six timing diagrams, each showing ADDRESS BUS (10ns), DRAM BANK (70ns), and DATA BUS occupancy, top to bottom:
  data bus 5ns, total 90ns: 64-bit channel x 32-byte burst
  data bus 10ns, total 90ns: 32-bit channel x 32-byte burst; 64-bit channel x 64-byte burst
  data bus 20ns, total 90ns: 16-bit channel x 32-byte burst; 32-bit channel x 64-byte burst; 64-bit channel x 128-byte burst
  data bus 40ns, total 100ns: 8-bit channel x 32-byte burst; 16-bit channel x 64-byte burst; 32-bit channel x 128-byte burst
  data bus 80ns, total 140ns: 8-bit channel x 64-byte burst; 16-bit channel x 128-byte burst
  data bus 160ns, total 220ns: 8-bit channel x 128-byte burst]
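The data-bus times in the timing diagrams (5ns up to 160ns) follow from the burst size, the channel width, and the 800MHz transfer rate used elsewhere in the study. The Python sketch below reproduces that arithmetic; the names are hypothetical, and applying the 800MHz rate to these diagrams is an assumption, although it matches the labeled values.

```python
DATA_RATE_MHZ = 800  # assumed per-channel transfer rate (matches the labeled bus times)


def transfers_per_burst(burst_bytes, width_bytes):
    """Number of data transfers needed to move one burst across the channel."""
    return burst_bytes // width_bytes


def data_bus_ns(burst_bytes, width_bytes, rate_mhz=DATA_RATE_MHZ):
    """Time the data bus is occupied by one burst, in nanoseconds."""
    return transfers_per_burst(burst_bytes, width_bytes) * 1000 / rate_mhz


# Group every modeled channel-width / burst-size pair by its data-bus time.
groups = {}
for width_bits in (64, 32, 16, 8):
    for burst_bytes in (32, 64, 128):
        ns = data_bus_ns(burst_bytes, width_bits // 8)
        groups.setdefault(ns, []).append(f"{width_bits}-bit x {burst_bytes}-byte")

for ns in sorted(groups):
    print(f"{ns:g} ns: {', '.join(groups[ns])}")
```

Combinations with the same number of transfers per burst share one diagram, which is why twelve combinations collapse into six rows.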

the optimal configurations are in the middle, suggesting an optimal number of DATA TRANSFERS per BANK ACTIVATION/PRECHARGE cycle

[BOTTOM] -- too many transfers per burst crowds out other requests

[TOP] -- too few transfers per request lets the bank overhead (activation/precharge cycle) dominate

however, though this tells us how to best organize a channel with a given bandwidth, these rules of thumb do not say anything about how the different configurations (wide/narrow/medium channels) compare to each other ...

so let's focus on multiple configurations for ONE BANDWIDTH CLASS ...

Burst Width Scales with Bus

[Figure: Range of Burst-Widths Modeled. The six timing diagrams (ADDRESS BUS 10ns, DRAM BANK 70ns, DATA BUS 5ns to 160ns), from 64-bit channel x 32-byte burst at the top to 8-bit channel x 128-byte burst at the bottom, annotated: OPTIMAL BURST WIDTHS.]

this is like saying i have 4 800MHz 8-bit Rambus channels ... what should i do?
- gang them together?
- keep them independent?
- something in between?

like before, we see that more banks is better, but not always by much

Focus on 3.2 GB/s: MCF

[Figure: 3.2 GB/s System Bandwidth (channels x width x speed). Cycles per Instruction (CPI), axis 0 to 7, for 32-, 64-, and 128-byte bursts, each with 1 chan x 4 bytes, 2 chan x 2 bytes, and 4 chan x 1 byte. Series: 1, 2, 4, and 8 banks per channel.]
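The question in the notes (gang four 800MHz 8-bit channels together, keep them independent, or something in between) can be enumerated in a few lines. This is a minimal Python sketch with hypothetical names, assuming ganging happens only in powers of two and that the base channel count is itself a power of two.

```python
RATE_MHZ = 800  # per-channel transfer rate from the notes' example


def ganging_options(base_channels=4, base_width_bytes=1, rate_mhz=RATE_MHZ):
    """List (channels, width_bytes, peak_GB_per_s) for each power-of-two ganging factor."""
    options = []
    gang = 1
    while gang <= base_channels:
        channels = base_channels // gang   # independent channels left after ganging
        width = base_width_bytes * gang    # each ganged channel is `gang` times wider
        options.append((channels, width, channels * width * rate_mhz / 1000))
        gang *= 2
    return options


for channels, width, gbs in ganging_options():
    print(f"{channels} chan x {width} byte(s): {gbs} GB/s, "
          f"{64 // width} transfers per 64-byte burst")
```

Every option delivers the same peak bandwidth; what changes is the number of independent channels and the number of transfers each burst needs, which are the two effects the figure separates.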

with large bursts, there is less interleaving of requests, so extra banks are not needed

Focus on 3.2 GB/s: MCF

[Figure: 3.2 GB/s System Bandwidth (channels x width x speed). Cycles per Instruction (CPI), axis 0 to 7, for 32-, 64-, and 128-byte bursts, each with 1 chan x 4 bytes, 2 chan x 2 bytes, and 4 chan x 1 byte. Series: 1, 2, 4, and 8 banks per channel.]

#Banks not particularly important given large burst sizes ...

with multiple independent channels, you have a degree of concurrency that, to some extent, OBVIATES THE NEED for BANKING

Focus on 3.2 GB/s: MCF

[Figure: 3.2 GB/s System Bandwidth (channels x width x speed). Cycles per Instruction (CPI), axis 0 to 7, for 32-, 64-, and 128-byte bursts, each with 1 chan x 4 bytes, 2 chan x 2 bytes, and 4 chan x 1 byte. Series: 1, 2, 4, and 8 banks per channel.]

... even less so with multi-channel systems

however, trying to reduce execution time via multiple channels is a bit risky:

it is very SENSITIVE to BURST SIZE, because multiple channels give the longest latency possible to EVERY REQUEST in the system, and having LONG BURSTS exacerbates that problem -- by increasing the length of time that a request can be delayed by waiting for another ahead of it in the queue

Focus on 3.2 GB/s: MCF

[Figure: 3.2 GB/s System Bandwidth (channels x width x speed). Cycles per Instruction (CPI), axis 0 to 7, for 32-, 64-, and 128-byte bursts, each with 1 chan x 4 bytes, 2 chan x 2 bytes, and 4 chan x 1 byte. Series: 1, 2, 4, and 8 banks per channel.]

Multi-channel systems sometimes (but not always) a good idea

another way to look at it: how does the choice of burst size affect the NUMBER or WIDTH of channels?

=> WIDE CHANNELS: an improvement is seen by increasing the burst size

=> NARROW CHANNELS: either no improvement is seen, or a slight degradation is seen by increasing burst size

Focus on 3.2 GB/s: MCF

[Figure: 3.2 GB/s System Bandwidth (channels x width x speed). Cycles per Instruction (CPI), axis 0 to 7, for 32-, 64-, and 128-byte bursts, each with 1 chan x 4 bytes, 2 chan x 2 bytes, and 4 chan x 1 byte. Series: 1, 2, 4, and 8 banks per channel. Highlighted groups: 4x 1-byte channels; 2x 2-byte channels; 1x 4-byte channels.]

we see the same trends in all the benchmarks surveyed

the only thing that changes is some of the relations between parameters ...

Focus on 3.2 GB/s: BZIP

[Figure: 3.2 GB/s System Bandwidth (channels x width x speed). Cycles per Instruction (CPI), axis 0 to 2, for 32-, 64-, and 128-byte bursts, each with 1 chan x 4 bytes, 2 chan x 2 bytes, and 4 chan x 1 byte. Series: 1, 2, 4, and 8 banks per channel.]

for example, in BZIP, the best configurations are at smaller burst sizes than MCF

however, though THE OPTIMAL CONFIG changes from benchmark to benchmark, there are always several configs that are within 5-10% of the optimal config -- IN ALL BENCHMARKS

Focus on 3.2 GB/s: BZIP

[Figure: 3.2 GB/s System Bandwidth (channels x width x speed). Cycles per Instruction (CPI), axis 0 to 2, for 32-, 64-, and 128-byte bursts, each with 1 chan x 4 bytes, 2 chan x 2 bytes, and 4 chan x 1 byte. Series: 1, 2, 4, and 8 banks per channel.]

BEST CONFIGS are at SMALLER BURST SIZES

Queue Size & Reordering

[Figure: BZIP: 1.6 GB/s (1 channel). Cycles per Instruction (CPI), axis 0 to 3, for 32-, 64-, and 128-byte bursts, each with 1, 2, 4, and 8 banks/channel. Series: Infinite Queue, 32-Entry Queue, 1-Entry Queue, No Queue.]

we have a complex design space where neighboring designs differ significantly

if you are careful, you can beat the performance of the average organization by 30-40%

supporting memory concurrency improves system performance, as long as it is not done at the expense of memory latency:
- using MULTIPLE CHANNELS is good, but not the best solution
- multiple banks/channel is always a good idea
- trying to interleave small bursts is intuitively appealing, but it doesn't work
- MSHRs: always a good idea

In general, bursts should be large enough to amortize the precharge cost.
Direct Rambus = 16 bytes
DDR2 = 16/32 bytes
THIS IS NOT ENOUGH

Conclusions

DESIGN SPACE is NON-LINEAR, COST of MISJUDGING is HIGH

CAREFUL TUNING YIELDS 30-40% GAIN

MORE CONCURRENCY == BETTER (but not at expense of LATENCY)
• Via Channels → NOT w/ LARGE BURSTS
• Via Banks → ALWAYS SAFE
• Via Bursts → DOESN'T PAY OFF
• Via MSHRs → NECESSARY

BURSTS AMORTIZE COST OF PRECHARGE
• Typical Systems: 32 bytes (even DDR2) → THIS IS NOT ENOUGH
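The claim that 16- and 32-byte bursts are too small to amortize the bank overhead can be put in rough numbers. The Python sketch below compares the time a burst occupies the data bus with a fixed bank time. It assumes a 16-bit, 800MHz channel and borrows the 70ns DRAM BANK value from the earlier timing diagrams as the per-access overhead; both are assumptions of this sketch, and the names are hypothetical.

```python
RATE_MHZ = 800  # assumed per-channel transfer rate
BANK_NS = 70    # assumed bank overhead per access, borrowed from the timing diagrams


def data_bus_ns(burst_bytes, width_bytes, rate_mhz=RATE_MHZ):
    """Time one burst occupies the data bus, in nanoseconds."""
    return (burst_bytes // width_bytes) * 1000 / rate_mhz


def data_to_bank_ratio(burst_bytes, width_bytes, bank_ns=BANK_NS):
    """Data-transfer time as a fraction of the bank overhead it has to amortize."""
    return data_bus_ns(burst_bytes, width_bytes) / bank_ns


WIDTH_BYTES = 2  # 16-bit channel, chosen only for illustration
for burst_bytes in (16, 32, 64, 128):
    ratio = data_to_bank_ratio(burst_bytes, WIDTH_BYTES)
    print(f"{burst_bytes:>3}-byte burst: {data_bus_ns(burst_bytes, WIDTH_BYTES):g} ns of data "
          f"per {BANK_NS} ns of bank time (ratio {ratio:.2f})")
```

Under these assumptions a 16-byte burst moves data for 10ns out of a 70ns bank access, and the transfer time only exceeds the bank time at 128 bytes.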

Outline

• Basics
• DRAM Evolution: Structural Path
• Advanced Basics
• DRAM Evolution: Interface Path
• Future Interface Trends & Research Areas
• Performance Modeling: Architectures, Systems, Embedded

Embedded DRAM Primer

[Figure: two diagrams, each showing a CPU Core and a DRAM Array, labeled Embedded and Not Embedded.]

Whither Embedded DRAM?

Microprocessor Report, August 1996: "[Five] Architects Look to Processors of Future"

• Two predict imminent merger of CPU and DRAM
• Another states we cannot keep cramming more data over the pins at faster rates (implication: embedded DRAM)
• A fourth wants gigantic on-chip L3 cache (perhaps DRAM L3 implementation?)

SO WHAT HAPPENED?

Embedded DRAM for DSPs

MOTIVATION

DSP Compilers => Transparent Cache Model

TAGLESS SRAM
NON-TRANSPARENT addressing, EXPLICITLY MANAGED contents
• Address space includes both "cache" and primary memory (and memory-mapped I/O)
• Move from memory space to "cache" space creates a new, equivalent data object, not a mere copy of the original.
• SOFTWARE manages this movement of data

TRADITIONAL CACHE (hardware-managed)
TRANSPARENT addressing, TRANSPARENTLY MANAGED contents
• Address space includes only primary memory (and memory-mapped I/O)
• The cache "covers" the entire address space: any datum in the space may be cached
• Copying from memory to cache creates a subordinate copy of the datum that is kept consistent with the datum still in memory. Hardware ensures consistency.
• HARDWARE manages this movement of data
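The distinction between the two models can be illustrated with a toy Python model. This is only a sketch: the class names are hypothetical, the cache holds single words, and write-through is used as one simple stand-in for "hardware ensures consistency"; none of these details come from the slides.

```python
class TaglessSram:
    """Software-managed store with its own addresses; data moves only when software says so."""

    def __init__(self, size):
        self.cells = [0] * size

    def copy_in(self, memory, src, dst, count):
        # Creates a new, independent data object in SRAM space.
        self.cells[dst:dst + count] = memory[src:src + count]

    def copy_out(self, memory, src, dst, count):
        # Software must write results back explicitly.
        memory[dst:dst + count] = self.cells[src:src + count]


class TransparentCache:
    """Hardware-managed cache: accessed by memory address, kept consistent with memory."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}  # address -> cached value

    def read(self, addr):
        if addr not in self.lines:   # miss: fetch a subordinate copy
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value
        self.memory[addr] = value    # write-through (an assumption of this sketch)


memory = list(range(8))

sram = TaglessSram(4)
sram.copy_in(memory, src=0, dst=0, count=4)
sram.cells[0] = 99
sram_view = (sram.cells[0], memory[0])   # the SRAM object diverges from memory

cache = TransparentCache(memory)
cache.write(1, 42)
cache_view = (cache.read(1), memory[1])  # the cached copy and memory agree

print(sram_view, cache_view)
```

In the tagless model the update stays in SRAM space until software copies it out, while in the cache model the program sees only memory addresses and the copy never diverges.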

DSP Buffer Organization Used for Study

Bandwidth vs. Die-Area Trade-Off for DSP Performance

[Figure: block diagrams of the DSP organizations studied. Labels: DSP, LdSt0, LdSt1, S0, S1; DSP, LdSt0, LdSt1, buffer-0, buffer-1, victim-0, victim-1; Fully Assoc 4-Block Cache.]

E-DRAM Performance

[Figure: CPI (axis 0 to 10) vs. Bandwidth (0.4, 0.8, 1.6, 3.2, 6.4, 12.8, 25.6 GB/s). Embedded Networking Benchmark - Patricia. 200MHz C6000 DSP : 50, 100, 200 MHz Memory. One series per Cache Line Size: 32, 64, 128, 256, 512, and 1024 bytes.]

E-DRAM Performance

[Figure: CPI (axis 0 to 10) vs. Bandwidth (0.4, 0.8, 1.6, 3.2, 6.4, 12.8, 25.6 GB/s; increasing bus width). Embedded Networking Benchmark - Patricia. 200MHz C6000 DSP : 50MHz Memory. One series per Cache Line Size: 32, 64, 128, 256, 512, and 1024 bytes.]

E-DRAM Performance

[Figure: CPI (axis 0 to 10) vs. Bandwidth (0.4, 0.8, 1.6, 3.2, 6.4, 12.8, 25.6 GB/s; increasing bus width). Embedded Networking Benchmark - Patricia. 200MHz C6000 DSP : 100MHz Memory. One series per Cache Line Size: 32, 64, 128, 256, 512, and 1024 bytes.]

E-DRAM Performance

[Figure: CPI (axis 0 to 10) vs. Bandwidth (0.4, 0.8, 1.6, 3.2, 6.4, 12.8, 25.6 GB/s; increasing bus width). Embedded Networking Benchmark - Patricia. 200MHz C6000 DSP : 200MHz Memory. One series per Cache Line Size: 32, 64, 128, 256, 512, and 1024 bytes.]

Performance-Data Sources

"A Performance Study of Contemporary DRAM Architectures," Proc. ISCA '99. V. Cuppu, B. Jacob, B. Davis, and T. Mudge.

"Organizational Design Trade-Offs at the DRAM, Memory Bus, and Memory Controller Level: Initial Results," University of Maryland Technical Report UMD-SCA-TR-1999-2. V. Cuppu and B. Jacob.

"DDR2 and Low Latency Variants," Memory Wall Workshop 2000, in conjunction w/ ISCA '00. B. Davis, T. Mudge, V. Cuppu, and B. Jacob.

"Concurrency, Latency, or System Overhead: Which Has the Largest Impact on DRAM-System Performance?" Proc. ISCA '01. V. Cuppu and B. Jacob.

"Transparent Data-Memory Organizations for Digital Signal Processors," Proc. CASES '01. S. Srinivasan, V. Cuppu, and B. Jacob.

"High Performance DRAMs in Workstation Environments," IEEE Transactions on Computers, November 2001. V. Cuppu, B. Jacob, B. Davis, and T. Mudge.

Recent experiments by Sadagopan Srinivasan, Ph.D. student at University of Maryland.

Acknowledgments

The preceding work was supported in part by the following sources:

• NSF CAREER Award CCR-9983618
• NSF grant EIA-9806645
• NSF grant EIA-0000439
• DOD award AFOSR-F496200110374
• … and by Compaq and IBM.

CONTACT INFO

Bruce Jacob
Electrical & Computer Engineering
University of Maryland, College Park
http://www.ece.umd.edu/~blj/
[email protected]

Dave Wang
Electrical & Computer Engineering
University of Maryland, College Park
http://www.wam.umd.edu/~davewang/
[email protected]

UNIVERSITY OF MARYLAND