DRAM Lecture2

32
DRAM Memory System: Lecture 2 Spring 2003 Bruce Jacob David Wang University of Maryland DRAM Circuit and Architecture Basics Overview Terminology Access Protocol Architecture Storage element Switching element Bit Line Word Line (capacitor)

description

DRAM Architecture

Transcript of DRAM Lecture2

Page 1: DRAM Lecture2

DSys

t

RAM Memorytem: Lecture 2

Spring 2003

Bruce JacobDavid Wang

University ofMaryland

DRAM Circuit and Architecture Basics

• Overview

• Terminology

• Access Protocol

• ArchitectureStorage elemen

Switching element

Bit Line

Word Line(capacitor)

Page 2: DRAM Lecture2

DSys

... Bit Lines...

MemoryArray

DRAM

Sense Amps

olumn Decoder

RAM Memorytem: Lecture 2

Spring 2003

Bruce JacobDavid Wang

University ofMaryland

DRAM Circuit Basics

DRAM Cell

Row

Dec

od

er

. .. W

ord

Lin

es ..

.

Storage element

Switching element

Bit Line

Word Line Data In/OutBuffers

C(capacitor)

Page 3: DRAM Lecture2

DSys

Row, B

node node

RAM Memorytem: Lecture 2

Spring 2003

Bruce JacobDavid Wang

University ofMaryland

itlines and Wordlines DRAM Circuit Basics

“Row” Defined

Bit Lines

Word Line

“Row” of DRAM

Row Size: 8 Kb @ 256 Mb SDRAM4 Kb @ 256 Mb RDRAM

Page 4: DRAM Lecture2

DSys

SenseandAmplify

4

5

6

RAM Memorytem: Lecture 2

Spring 2003

Bruce JacobDavid Wang

University ofMaryland

DRAM Circuit Basics

Sense Amplifier I

6 Rows shown

1

2

3

Page 5: DRAM Lecture2

DSys

Vcc/2

SenseandAmplify

4

5

6

RAM Memorytem: Lecture 2

Spring 2003

Bruce JacobDavid Wang

University ofMaryland

DRAM Circuit Basics

Sense Amplifier II : Precharged

Vcc (logic 1) Gnd (logic 0)

precharged to Vcc/2

1

2

3

Page 6: DRAM Lecture2

DSys

ead

SenseandAmplify

4

5

6

Vcc/2

WordlineDriven

RAM Memorytem: Lecture 2

Spring 2003

Bruce JacobDavid Wang

University ofMaryland

DRAM Circuit Basics

Sense Amplifier III : Destructive R

1

2

3

Vcc (logic 1) Gnd (logic 0)

Page 7: DRAM Lecture2

DSys

l

... Bit Lines...

MemoryArray

DRAM

Sense Amps

olumn Decoder

RAM Memorytem: Lecture 2

Spring 2003

Bruce JacobDavid Wang

University ofMaryland

DRAM Access Protoco

ROW ACCESS

AKA: OPEN a DRAM Page/Row

RAS (Row Address Strobe)

or

orACT (Activate a DRAM Page/Row)

BUSMEMORY

CONTROLLERCPU

Row

Dec

od

er

. .. W

ord

Lin

es ..

.

Data In/OutBuffers

C

Page 8: DRAM Lecture2

DSys

once tthe bitsubsetto the picks o

big poiRAS ountil findata ..on thethe senread b

DRAM on chip

idth (4, 8,16, 32)th (128 bit fixed)

(1, 2, 4, 8)

. DDR SDRAM

RAM Memorytem: Lecture 2

Spring 2003

Bruce JacobDavid Wang

University ofMaryland

he data is valid on ALL of lines, you can select a of the bits and send them output buffers ... CAS ne of the bits

nt: cannot do another r precharge of the lines ished reading the column

. can’t change the values bit lines or the output of se amps until it has been

y the memory controller

DRAM Circuit Basics

“Column” Defined

“One Row” of DRAM

Column: Smallest addressable quantity of

SDRAM*: column size == chip data bus wRDRAM: column size != chip data bus wid

4 bit wide columns

SDRAM*: get “n” columns per access. n =RDRAM: get 1 column per access.

#2 #3 #4 #5#0 #1

* SDRAM means SDRAM and variants. i.e

Page 9: DRAM Lecture2

DSys

l

... Bit Lines...

MemoryArray

DRAM

Sense Amps

olumn Decoder

RAM Memorytem: Lecture 2

Spring 2003

Bruce JacobDavid Wang

University ofMaryland

DRAM Access Protoco

COLUMN ACCESS I

READ Commandor

CAS: Column Address Strobe

BUSMEMORY

CONTROLLERCPU

Row

Dec

od

er

. .. W

ord

Lin

es ..

.

Data In/OutBuffers

C

Page 10: DRAM Lecture2

DSys

then thbus ... are usmight or a lothe ne(this is

l

with CAS

... Bit Lines...

MemoryArray

DRAM

Sense Amps

olumn Decoder

RAM Memorytem: Lecture 2

Spring 2003

Bruce JacobDavid Wang

University ofMaryland

e data is valid on the data depending on what you ing for in/out buffers, you be able to overlap a litttle t of the data transfer with xt CAS to the same page PAGE MODE)

DRAM Access Protoco

Column Access II

note: page mode enables overlap

BUSMEMORY

CONTROLLERCPU

Row

Dec

od

er

. .. W

ord

Lin

es ..

.

Data In/OutBuffers

C

... with optional additionalCAS: Column Address Strobe

Data Out

Page 11: DRAM Lecture2

DSys

NOTE

AM cell to

... Bit Lines...

MemoryArray

DRAM

Sense Amps

olumn Decoder

RAM Memorytem: Lecture 2

Spring 2003

Bruce JacobDavid Wang

University ofMaryland

DRAM “Speed” Part I

How fast can I move data from DRsense amp?

RCD (Row Command Delay)

BUSMEMORY

CONTROLLERCPU

Ro

w D

eco

der

. .. W

ord

Lin

es ..

.

Data In/OutBuffers

C

tRCD

Page 12: DRAM Lecture2

DSys

se amps

... Bit Lines...

MemoryArray

DRAM

Sense Amps

olumn Decoder

RAM Memorytem: Lecture 2

Spring 2003

Bruce JacobDavid Wang

University ofMaryland

DRAM “Speed” Part II

How fast can I get data out of senback into memory controller?

BUSMEMORY

CONTROLLERCPU

Ro

w D

eco

der

. .. W

ord

Lin

es ..

.

Data In/OutBuffers

C

CAS: Column Address Strobe

tCAS aka

tCL

tCASL aka

CASL: Column Address Strobe Latency

CL: Column Address Strobe Latency

Page 13: DRAM Lecture2

DSys

I

AM cell into

... Bit Lines...

MemoryArray

DRAM

Sense Amps

olumn Decoder

RAM Memorytem: Lecture 2

Spring 2003

Bruce JacobDavid Wang

University ofMaryland

DRAM “Speed” Part II

How fast can I move data from DRmemory controller?

RAC (Random Access Delay)

BUSMEMORY

CONTROLLERCPU

Ro

w D

eco

der

. .. W

ord

Lin

es ..

.

Data In/OutBuffers

C

tRAC = tRCD + tCAS

Page 14: DRAM Lecture2

DSys

rray so I can

... Bit Lines...

MemoryArray

DRAM

Sense Amps

olumn Decoder

RAM Memorytem: Lecture 2

Spring 2003

Bruce JacobDavid Wang

University ofMaryland

DRAM “Speed” Part IV

How fast can I precharge DRAM aengage another RAS?

RP (Row Precharge Delay)

BUSMEMORY

CONTROLLERCPU

Ro

w D

eco

der

. .. W

ord

Lin

es ..

.

Data In/OutBuffers

C

tRP

Page 15: DRAM Lecture2

DSys

t rows?

... Bit Lines...

MemoryArray

DRAM

Sense Amps

olumn Decoder

RAM Memorytem: Lecture 2

Spring 2003

Bruce JacobDavid Wang

University ofMaryland

DRAM “Speed” Part V

How fast can I read from differen

RC (Row Cycle Time)

BUSMEMORY

CONTROLLERCPU

Row

Dec

od

er

. .. W

ord

Lin

es ..

.

Data In/OutBuffers

C

tRC = tRAS + tRP

Page 16: DRAM Lecture2

DSys

ary I

Cycle Time

Precharge Delay

dom Access Delay

umn Address Strobe

Address Strobe

tems designersctuers

hitect:d code

t traversal

Command Delay

RAM Memorytem: Lecture 2

Spring 2003

Bruce JacobDavid Wang

University ofMaryland

DRAM “Speed” Summ

What do I care about?

RC :Row

tRC = tRAS + tRP

RP :Row

tRP

RAC :Ran

tRAC = tRCD + tCAS

CAS: Col

tCAS

RAS: Row

tRCD Seen in ads.Easy to explain

Embedded sysDRAM manufa

Computer ArcLatency bouni.e. linked lis

Easy to sell

RCD: Row

Page 17: DRAM Lecture2

DSys

ary IIom ss (tRAC)

Row Cycle Time (tRC)

60 ns

60 ns

70 ns

25 ns

25 ns

to beable

pensive

n standard

RAM Memorytem: Lecture 2

Spring 2003

Bruce JacobDavid Wang

University ofMaryland

DRAM “Speed” Summ

DRAM Type FrequencyData Bus Width (per chip)

Peak Data Bandwidth (per Chip)

RandAcceTime

PC133 SDRAM

133 16 200 MB/s 45 ns

DDR 266 133 * 2 16 532 MB/s 45 ns

PC800RDRAM

400 * 2 16 1.6 GB/s 60 ns

FCRAM 200 * 2 16 0.8 GB/s 25 ns

RLDRAM 300 * 2 32 2.4 GB/s 25 ns

DRAM is “slow”But doesn’t havetRC < 10ns achiev

Higher die cost

Not commodity Ex

Not adopted i

Page 18: DRAM Lecture2

DSys

DRAMdetermRAS+Csignificthe CPcontroEach toverheoverheThis mlonger

DRAM

E2/E3

1

ueuetrolleruencesqueued)

RAM Memorytem: Lecture 2

Spring 2003

Bruce JacobDavid Wang

University ofMaryland

“latency” isn’t inistic because of CAS or AS, and there may be

ant queuing delays within U and the memory

llerransaction has some ad. Some types of ad cannot be pipelined. eans that in general, bursts are more efficient.

“DRAM latency”

B

CD

E

F

A

CPU MemController

A: Transaction request may be delayed in QB: Transaction request sent to Memory ConC: Transaction converted to Command Seq

(may be D: Command/s Sent to DRAME1: Requires only a CAS orE2: Requires RAS + CAS or

F: Transaction sent back to CPU

“DRAM Latency” = A + B + C + D + E + F

E3: Requires PRE + RAS + CAS

Page 19: DRAM Lecture2

DSys

NOTE

sics

x8 DRAM

... Bit Lines...

Sense Amps

. . .

.

rs

DRAM

Ro

w D

eco

der

MemoryArray

Column Decoder

RAM Memorytem: Lecture 2

Spring 2003

Bruce JacobDavid Wang

University ofMaryland

DRAM Architecture Ba

PHYSICAL ORGANIZATION

This is per bank …Typical DRAMs have 2+ banks

x2 DRAM x4 DRAM

DataBuffe

x8

... Bit Lines...

Sense Amps. .

. .

DataBuffers

x2 DRAM

Ro

w D

eco

der

MemoryArray

Column Decoder

... Bit Lines...

Sense Amps

. . .

.

DataBuffers

x4 DRAM

Ro

w D

eco

der

MemoryArray

Column Decoder

Page 20: DRAM Lecture2

DSys

let’s loway .. portray

[explai

main psignalslatchescolumn

sics

AM

dut

Data Transfer

Column Access

Row Access

RAM Memorytem: Lecture 2

Spring 2003

Bruce JacobDavid Wang

University ofMaryland

ok at the interface another the say the data sheets it.

n]

oint: the RAS\ and CAS\ directly control the that hold the row and addresses ...

DRAM Architecture Ba

Read Timing for Conventional DR

RowAddress

ColumnAddress

ValidDataout

RAS

CAS

Address

DQ

RowAddress

ColumnAddress

ValiDatao

Page 21: DRAM Lecture2

DSys

since Dhave bto the to Burschangmodifictarget

[discus

EverytSDRAinexpeconsidwas eslatch, PSDRAhowevideas, considthere idirectiohas ap

[ do LAwith EINTER

ee

Structural

M

, DDR/2 Future Trends

. . .

. . .

. .

. .

S

M

AM

$

ModificationsTargetingLatency

RAM Memorytem: Lecture 2

Spring 2003

Bruce JacobDavid Wang

University ofMaryland

RAM’s inception, there een a stream of changes design, from FPM to EDO t EDO to SDRAM. the

es are largely structural ations -- nimor -- that

THROUGHPUT.

s FPM up to SDRAM

hing up to and including M has been relatively nsive, especially when ering the pay-off (FPM sentially free, EDO cost a BEDO cost a counter,

M cost a slight re-design). er, we’re run out of “free” and now all changes are ered expensive ... thus s no consensus on new ns and myriad of choices peared

TENCY mods starting SDRAM ... and then the FACE mods ]

DRAM Evolutionary Tr

(Mostly) Structural Modifications

Interface Modifications

Conventional

FPM EDO ESDRA

Rambus

. . . .

MOSY

FCRA

VCDRTargeting Throughput

Targeting Throughput

DRAM

SDRAMP/BEDO

Page 22: DRAM Lecture2

DSys

NOTE

AM

idout

Data Transfer

Column Access

Transfer Overlap

Row Access

RAM Memorytem: Lecture 2

Spring 2003

Bruce JacobDavid Wang

University ofMaryland

DRAM Evolution

Read Timing for Conventional DR

RowAddress

ColumnAddress

ValidDataout

RAS

CAS

Address

DQ

RowAddress

ColumnAddress

ValData

Page 23: DRAM Lecture2

DSys

FPM aesenseCAS c

much

problevalue ibuffer data is

ValidDataout

Data Transfer

Column Access

Transfer Overlap

Row Access

RAM Memorytem: Lecture 2

Spring 2003

Bruce JacobDavid Wang

University ofMaryland

allows you to keep th amps actuve for multiple

ommands ...

better throughput

m: cannot latch a new n the column address until the read-out of the complete

DRAM Evolution

Read Timing for Fast Page Mode

RowAddress

ColumnAddress

ValidDataout

ColumnAddress

ColumnAddress

ValidDataout

RAS

CAS

Address

DQ

Page 24: DRAM Lecture2

DSys

solutioinsteadbuffers

by putcolumnaddressoone

ut

Data Transfer

Column Access

Transfer Overlap

Row Access

RAM Memorytem: Lecture 2

Spring 2003

Bruce JacobDavid Wang

University ofMaryland

n to that problem -- of simple tri-state , use a latch as well.

ting a latch after the mux, the next column s command can begin

r

DRAM Evolution

Read Timing for Extended Data O

RowAddress

ColumnAddress

ValidDataout

RAS

CAS

Address

DQ

ColumnAddress

ColumnAddress

ValidDataout

ValidDataout

Page 25: DRAM Lecture2

DSys

by drivan intean extecycle tbus wa30% Data Transfer

Column Access

Transfer Overlap

Row Access

RAM Memorytem: Lecture 2

Spring 2003

Bruce JacobDavid Wang

University ofMaryland

ing the col-addr latch from rnal counter rather than rnal signal, the minimum

ime for driving the output s reduced by roughly

DRAM Evolution

Read Timing for Burst EDO

RowAddress

ColumnAddress

RAS

CAS

Address

DQ ValidData

ValidData

ValidData

ValidData

Page 26: DRAM Lecture2

DSys

“pipelinof the toggleaddrestogglebus. thwhen tstops t

DO

Data Transfer

Column Access

Transfer Overlap

Row Access

RAM Memorytem: Lecture 2

Spring 2003

Bruce JacobDavid Wang

University ofMaryland

e” refers to the setting up read pipeline ... first CAS\ latches the column s, all following CAS\

s drive data out onto the erefore data stops coming he memory controller oggling CAS\

DRAM Evolution

Read Timing for Pipeline Burst E

RowAddress

ColumnAddress

RAS

CAS

Address

DQ ValidData

ValidData

ValidData

ValidData

Page 27: DRAM Lecture2

DSys

main bor memhavinginternacontroother tcyclesthoughlatencyschemthroug

AM

d Bus)

Data Transfer

Column Access

Transfer Overlap

Row Access

RAM Memorytem: Lecture 2

Spring 2003

Bruce JacobDavid Wang

University ofMaryland

enefit: frees up the CPU ory controller from

to control the DRAM’s l latches directly ... the

ller/CPU can go off and do hings during the idle instead of “wait” ... even the time-to-first-word actually gets worse, the e increases system hput

DRAM Evolution

Read Timing for Synchronous DR

(RAS + CAS + OE ... == Comman

Command

Address

DQ

Clock

RowAddr

ColAddr

ValidData

ValidData

ValidData

ValidData

ACT READ

RAS

CAS

Page 28: DRAM Lecture2

DSys

outputto staraccess

latch wallowsRAS saccessPREC

AM

ValidData

ValidData

ValidData

ValidData

idta

ValidData

RAM Memorytem: Lecture 2

Spring 2003

Bruce JacobDavid Wang

University ofMaryland

latch on EDO allowed you t CAS sooner for next s (to same row)

hole row in ESDRAM -- you to start precharge & ooner for thee next page -- HIDE THE

HARGE OVERHEAD.

DRAM Evolution

Inter-Row Read Timing for ESDR

Command

Address

DQ

Clock

RowAddr

ColAddr

ValidData

ValidData

ValidData

ValidData

ACT READ

RowAddr

ColAddr

ACT READPRE

“Regular” CAS-2 SDRAM, R/R to same bank

Command

Address

DQ

Clock

RowAddr

ColAddr

ValidData

ValidData

ValidData

ValidData

ACT READ

RowAddr

ColAddr

ValidData

ValidData

ValDa

ACT READ

ESDRAM, R/R to same bank

PRE

Bank

Bank

Page 29: DRAM Lecture2

DSys

neat febufferi

sive?)

1/0

ColAddr

ValidData

ValidData

ValidData

READ

RAM Memorytem: Lecture 2

Spring 2003

Bruce JacobDavid Wang

University ofMaryland

ature of this type of ng: write-around

DRAM Evolution

Write-Around in ESDRAM

(can second READ be this aggres

Command

Address

DQ

Clock

RowAddr

ColAddr

ValidData

ValidData

ValidData

ValidData

ACT READ

RowAddr

ColAddr

ValidData

ValidData

ValidData

ValidData

ACT WRITEPRE

“Regular” CAS-2 SDRAM, R/W/R to same bank, rows 0/

Command

Address

DQ

Clock

RowAddr

ColAddr

ValidData

ValidData

ValidData

ValidData

ACT READ

RowAddr

ColAddr

ValidData

ValidData

ValidData

ValidData

ACT WRITE

ESDRAM, R/W/R to same bank, rows 0/1/0

PRE

Bank

Bank

RowAddr

ACTPRE

Bank

ColAddr

ValidData

ValidData

ValidData

ValidData

READ

Page 30: DRAM Lecture2

DSys

main tbunch rambuyou mudirectlynot theextra c... howbandwcache,into cawant itreduciSDRAthis mathroug

nel

duces energy

$

t/Outputuffer

DQs

RAM Memorytem: Lecture 2

Spring 2003

Bruce JacobDavid Wang

University ofMaryland

hing ... it is like having a of open row buffers (a la s), but the problem is that st deal with the cache (move into and out of it), DRAM banks ... adds an ouple of cycles of latency ever, you get good idth if the data you want is and you can “prefetch” che ahead of when you ... originally targetted at ng latency, now that M is CAS-2 and RCD-2, ke sense only in a

hput way

DRAM Evolution

Internal Structure of Virtual Chan

Segment cache is software-managed, re

Row Decoder

2Kb Segment

2Kb Segment

2Kb Segment

2Kb Segment

Bank A

Bank B16 “Channels”

InpuB

Sel/Dec

(segments)

SenseAmps

2Kbit # DQs

Activate PrefetchRestore

ReadWrite

Page 31: DRAM Lecture2

DSys

FCRAdata aportion

8K rowselect (assumthe da

AM

/access

tRCD = 5ns

8M Array

Sense Amps

FCRAM

(one clock)

(?)

RAM Memorytem: Lecture 2

Spring 2003

Bruce JacobDavid Wang

University ofMaryland

M opts to break up the rray .. only activate a of the word line

s requires 13 bits tto ... FCRAM uses 15 ing the array is 8k x 1k ...

ta sheet does not specify)

DRAM Evolution

Internal Structure of Fast Cycle R

Reduces access time and energy

tRCD = 15ns

8M Array13 bits

Sense Amps

15 bits

SDRAM

(two clocks)

Ro

w D

eco

der

Ro

w D

eco

der

(8Kr x 1Kb)

Page 32: DRAM Lecture2

DSys

MoSysfurtherinterfaenergy

[physic

auto retransptthrougthem w

but whrepeatduratiorefresh

solutioCACHshould(magic

AM

. . .

. . .

. .

. . . . . .

RAM Memorytem: Lecture 2

Spring 2003

Bruce JacobDavid Wang

University ofMaryland

takes this one step ... DRAM with an SRAM ce & speed but DRAM

al partitioning: 72 banks]

fresh -- how to do this arently? the logic moves h the arrays, refreshing hen not active.

at is one bank gets ed access for a long n? all other banks will be ed, but that one will not.

n: they have a bank-sized E of lines ... in theory, never have a problem )

DRAM Evolution

Internal Structure of MoSys 1T-SR

addr

BankSelect

DQs

$

AutoRefresh