Worst Case Analysis of DRAM Latency in Multi-Requestor Systems

Zheng Pei Wu, Yogen Krish, Rodolfo Pellizzoni

Multi-Requestor Systems

[Diagram: three CPUs, a DMA engine, and I/O devices share the DRAM through an interconnect]

INTERFERENCE!!!

Hard Real Time Systems Must be Predictable!!!

• Schedulability Analysis: needs WCET as input

• WCET depends on hardware platform

• WCET: needs Latency to access shared resource (e.g. cache, DRAM)

• Existing approaches can bound the interference but they assume the latency for DRAM access is constant

Problem: DRAM latency is variable and changes depending on its state

Contribution

Timing analysis that bounds the worst case latency of a DRAM access for the requestor under analysis

Interfering requestors: we do not know what they are doing, so we assume they cause the worst case interference

Outline

1. Background & Related Work

2. Memory Controller Model

3. Worst Case Latency Analysis

4. Results & Conclusion

Background

Can only Read/Write to Row Buffer

Storage Array contains Data

A READ request targets data in one row, but the row buffer contains data from a different row.

To serve such a READ, three commands are needed: Pre-charge (P), Activate (A), and Read (R).

Front End generates the needed commands

Back End issues commands on command bus

• Pre-charge (PRE): stores the data back into the array
• Activate (ACT): loads the data from the array into the row buffer

The Pre-charge command (P) is issued on the command bus; a timing constraint separates it from the following ACT command (A).

Command timeline for the READ above: P, A, R, then Data.

When a READ targets data already in the row buffer, only the Read command is needed, and it can be issued immediately: R, then Data.

• Close request (P, A, R, Data): the target row is not in the row buffer
• Open request (R, Data): the target row is already in the row buffer
• The latency of a close request is much longer than the latency of an open request
• The latency of a memory access is variable!
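
The open/close distinction can be sketched in a few lines of Python. This is only an illustration, not the controller from the talk: `Bank` and `commands_for` are hypothetical names, the empty-buffer case is an added assumption, and the model tracks only which row sits in the row buffer while ignoring every timing constraint.

```python
from typing import List, Optional


class Bank:
    """Toy model of one DRAM bank: remembers which row is in the row buffer."""

    def __init__(self) -> None:
        self.open_row: Optional[int] = None  # None means no row is loaded


def commands_for(bank: Bank, row: int, op: str = "R") -> List[str]:
    """Return the command sequence for one request and update the bank state.

    op is "R" (read) or "W" (write). Hypothetical helper for illustration.
    """
    if bank.open_row == row:
        cmds = [op]            # open request: row hit, only R/W is needed
    elif bank.open_row is None:
        cmds = ["A", op]       # assumed case: empty buffer, activate then R/W
    else:
        cmds = ["P", "A", op]  # close request: pre-charge, activate, R/W
    bank.open_row = row        # the accessed row stays in the row buffer
    return cmds


bank = Bank()
bank.open_row = 7                  # the buffer holds a different row
print(commands_for(bank, row=3))   # ['P', 'A', 'R']
print(commands_for(bank, row=3))   # ['R']
```

The second call needs only `R` because the first call left row 3 in the row buffer, which is exactly the state dependence that makes the latency variable.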

Predictable Memory Controllers

• Close Row Policy:
  – After each access, the row buffer is automatically pre-charged (implicit pre-charge)
  – Memory latency is the same for all requests
  – Cannot take advantage of locality (row hits)
  – Latency is much longer than that of an open request

Timeline when the next request targets the same bank: A, R, Data, implicit P, A, R, Data

• Interleaving Banks
  – Each request accesses data in multiple banks (Bank 1 to Bank 4), with A, R, and Data in each bank
  – The data transfers of the banks can be pipelined


Problem: requestors can close each other's row buffers, since they can access all banks.

Thus the close row policy is used to make latency predictable, and the problem of its long latency still exists!


This is good for systems with a small DRAM data bus width (e.g., 16 bits). Larger data buses can transfer the same amount of data without interleaving so many banks.

Interleaving two banks for a wider data bus (e.g., 32 bits): time is wasted!

Interleaving Problems:
1. Requestors can close each other's rows (interference)
2. Must be used with the close row policy to make latency predictable
3. For a wider data bus, the effectiveness of interleaving is diminished


• Private Banks
  – Each requestor (Core 1, Core 2, DMA) is assigned its own banks among Bank 1 to Bank 4
• Banks can be partitioned among either requestors or tasks
• This can be done:
  – In hardware, if the memory controller supports it
  – By the compiler
  – In the OS, using virtual memory
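
As a rough illustration of bank partitioning, the sketch below checks whether an address falls in a bank owned by a given requestor. Everything specific here is an assumption made for the example: the address layout (10 column bits, 2 bank bits), the bank-to-requestor assignment, and the names `bank_of` and `may_access`. A real system would take these from the memory controller's address mapping.

```python
# Hypothetical address layout: | row | bank (2 bits) | column (10 bits) |
COLUMN_BITS = 10
BANK_BITS = 2

# Hypothetical assignment of four banks (indexed 0..3) to the requestors.
BANK_OWNER = {0: "Core 1", 1: "Core 1", 2: "Core 2", 3: "DMA"}


def bank_of(address: int) -> int:
    """Extract the bank index from a physical address."""
    return (address >> COLUMN_BITS) & ((1 << BANK_BITS) - 1)


def may_access(requestor: str, address: int) -> bool:
    """True if the address falls in a bank that belongs to the requestor."""
    return BANK_OWNER[bank_of(address)] == requestor


addr = (5 << 12) | (2 << 10) | 0x3F   # row 5, bank 2, column 0x3F
print(bank_of(addr))                  # 2
print(may_access("Core 2", addr))     # True
print(may_access("DMA", addr))        # False
```

An OS-level allocator could use a check of this kind to hand each requestor only pages that map to its own banks.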

Related Work

• AMC [1] and Predator [2]:
  – Close Row Policy
  – Interleaved Bank
• Conservative Open-Page [3]:
  – Interleaved Bank
  – Leave row open for a small window of time
• PRET DRAM Controller [4]:
  – Close Row Policy
  – Private Bank

Our Approach

• Private Bank
  – Eliminates row buffer interference from other requestors
• Open Row Policy
  – Reduces latency and takes advantage of the row hit ratio (locality)

Challenges:
1. Analysis is more complex
2. More than 20 timing constraints
3. Latency depends on the dynamic state of the DRAM

Memory Controller Model

[Diagram: Core 1, Core 2, and DMA feed the Front End (Command Generator); the Back End contains the per-requestor buffers and the global FIFO queue, and drives the command bus and the data bus]

We ignore the CONSTANT front end delay and focus on the back end latency.

Each requestor has a private buffer for memory commands

Global FIFO is used for arbitration

The command at the head of each private buffer is inserted into the FIFO

The controller scans the global FIFO from front to end for a command that can be issued

Once a command is issued, the next command must wait until timing constraints are satisfied before it can be inserted into the FIFO

Intuitively, the arbitration is fair and is similar to a round robin policy
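
The arbitration steps described for the back end can be sketched as follows. This is a toy model under stated assumptions, not the authors' implementation: `BackEnd`, `fill_fifo`, and `issue` are hypothetical names, timing constraints are abstracted into the caller-supplied predicates `may_insert` and `can_issue`, and the sketch assumes each requestor has at most one command in the global FIFO at a time.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Optional, Tuple

Command = Tuple[str, str]  # (requestor, command type such as "P", "A", "R", "W")


class BackEnd:
    """Toy model of the back end: private buffers plus one global FIFO."""

    def __init__(self, requestors: List[str]) -> None:
        self.buffers: Dict[str, Deque[str]] = {r: deque() for r in requestors}
        self.fifo: List[Command] = []

    def enqueue(self, requestor: str, commands: List[str]) -> None:
        """The front end hands generated commands to the private buffer."""
        self.buffers[requestor].extend(commands)

    def fill_fifo(self, may_insert: Callable[[str, str], bool]) -> None:
        """Move the head of each private buffer into the global FIFO.

        may_insert stands in for the timing constraints on insertion.
        """
        waiting = {r for r, _ in self.fifo}  # requestors already in the FIFO
        for r, buf in self.buffers.items():
            if buf and r not in waiting and may_insert(r, buf[0]):
                self.fifo.append((r, buf.popleft()))

    def issue(self, can_issue: Callable[[str, str], bool]) -> Optional[Command]:
        """Scan the FIFO from the front; issue the first command that can go."""
        for i, (r, cmd) in enumerate(self.fifo):
            if can_issue(r, cmd):
                return self.fifo.pop(i)
        return None


be = BackEnd(["Core 1", "Core 2", "DMA"])
be.enqueue("Core 1", ["A", "W"])
be.enqueue("Core 2", ["R"])
be.enqueue("DMA", ["P", "A", "W"])
be.fill_fifo(lambda r, c: True)
print(be.fifo)   # [('Core 1', 'A'), ('Core 2', 'R'), ('DMA', 'P')]
# Pretend that ACT commands are currently blocked by a timing constraint.
print(be.issue(lambda r, c: c != "A"))   # ('Core 2', 'R')
```

Because a blocked command keeps its place at the front while later commands may pass it, each requestor waits for at most one command from every other requestor, which is the intuition behind the round-robin comparison.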

Worst Case Analysis

Part 1 – Main Contribution: Worst Case Single Request Latency Analysis (works for any type of core)
• Inputs: total # of requestors, memory device parameters
• Output: latency for the different types of request (Open Read, Close Read, Open Write, Close Write)

Part 2 – Only provided for in-order cores: Cumulative Worst Case Execution Time
• Inputs: latency for the different types of request; # of open reads, # of close reads, # of open writes, # of close writes of the task under analysis
• Output: WCET

Assumption: We do not know about the activity of the other interfering requestors, so we assume those requestors produce the worst case pattern to cause maximum interference.

Single Request Latency

The latency of a single request is decomposed into two parts:

1. Arrival to Read/Write: from request arrival until the Read/Write command is inserted into the global FIFO
  – This part may include Pre-charge and ACT commands
  – Latency depends on the previous request (i.e., the state of the DRAM)
  – For details on this part, refer to the paper
2. Read/Write to Data: from the insertion of the Read/Write command into the FIFO until the data has finished transmitting
  – Latency does not depend on the state of the DRAM
  – We will focus on this part

Both parts depend on the # of interfering requestors as well as the DRAM timing constraints.

Read/Write to Data Latency

• Read to Read has no timing constraints, only contention on the data bus
• The same holds for Write to Write

• Write to Read is subject to a timing constraint
• Read to Write is subject to a timing constraint
• Therefore, an alternation of read and write commands produces longer latency
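
A toy calculation makes the effect of alternation concrete. The cycle counts below are invented for illustration (they are not taken from the talk or from any DRAM datasheet), and the model is deliberately crude: every transfer occupies the data bus for a fixed time, and a fixed extra gap is added whenever the command type switches.

```python
# Hypothetical values in cycles, chosen only for illustration.
T_BUS = 4             # cycles one data transfer occupies the data bus
T_WRITE_TO_READ = 10  # assumed extra gap when a read follows a write
T_READ_TO_WRITE = 6   # assumed extra gap when a write follows a read


def finish_time(commands):
    """Return the cycle at which the last data transfer ends."""
    t = 0
    prev = None
    for cmd in commands:
        if prev == "W" and cmd == "R":
            t += T_WRITE_TO_READ   # write-to-read switching penalty
        elif prev == "R" and cmd == "W":
            t += T_READ_TO_WRITE   # read-to-write switching penalty
        t += T_BUS                 # the data transfer itself
        prev = cmd
    return t


print(finish_time("RRRR"))  # 16: only contention on the data bus
print(finish_time("WRWR"))  # 42: every command pays a switching penalty
```

With the same number of commands, the alternating sequence takes far longer, which is why the worst case analysis assumes the interfering requestors alternate reads and writes.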

• Interference on Write command

[Diagram: R and W commands queued in the global FIFO, starting from its front, and the resulting data transfers]

All other requestors insert R/W commands to create maximum interference

A write command could have finished immediately before t0

That write therefore further delays the first Read command

Cumulative Latency

[Diagram: timeline t of the task under analysis as a sequence of Open Read, Close Read, Open Write, and Close Write requests]

• If the worst case request order is known, we can sum the latency of each request
• The worst case request order depends on input values, code path, cache state, etc.

• Static analysis tools can be used to obtain a safe bound on the # of each type of request

• Which pattern leads to the worst case latency?
• This problem can be solved in constant time; see the paper for details
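
For the simple case where the request order is known, the summation is straightforward. The sketch below is an illustration only: the latency values are hypothetical placeholders, `cumulative_latency` is a made-up name, and it treats the latency as depending only on the request type. Finding the worst case pattern when only the counts are known is the harder problem that the paper addresses, and this sketch does not attempt it.

```python
from typing import Dict, Iterable

# Hypothetical per-request worst case latencies in ns (not from the talk).
LATENCY_NS: Dict[str, float] = {
    "open_read": 40.0,
    "close_read": 90.0,
    "open_write": 45.0,
    "close_write": 95.0,
}


def cumulative_latency(order: Iterable[str],
                       latency: Dict[str, float] = LATENCY_NS) -> float:
    """Sum the latency of each request in a known request order."""
    return sum(latency[kind] for kind in order)


order = ["close_read", "open_read", "open_read", "close_write", "open_write"]
print(cumulative_latency(order))  # 310.0
```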

Results

• Comparison against the Analyzable Memory Controller (AMC) [1]
  – AMC uses fair arbitration (Round Robin), which is similar to our approach
• Synthetic Benchmarks
  – Used to show how the worst case latency varies as parameters are changed
• CHStone Benchmarks
  – Memory traces are obtained from the gem5 simulator
  – Memory traces are used as input to the worst case analysis

[Figures: results for the synthetic benchmarks]

• As memory devices become faster, the difference between open and close accesses is getting larger, and therefore the close row policy is becoming too pessimistic

Values in ns; 50% row hit ratio, 4 requestors, 20% writes:

Devices         800D    1066F   1333H   1600K   1866L   2133N   % better
AMC (64 bits)   185     185.27  180.9   178     169.84  163     11.89%
Our (64 bits)   125.2   112.47  104.85  102.18  96.97   92.85   25.84%
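
The "% better" column is not defined on the slide, but its values are consistent with the relative reduction from the 800D column to the 2133N column within each row. The short check below reproduces both figures from the table and, as an extra step that is not on the slide, also computes the gap between the two controllers per device. The function names are illustrative.

```python
devices = ["800D", "1066F", "1333H", "1600K", "1866L", "2133N"]
amc = [185, 185.27, 180.9, 178, 169.84, 163]            # AMC (64 bits), ns
ours = [125.2, 112.47, 104.85, 102.18, 96.97, 92.85]    # Our (64 bits), ns


def percent_better(values):
    """Relative reduction from the first (slowest) to the last (fastest) device."""
    return 100 * (values[0] - values[-1]) / values[0]


def gap_percent(a, o):
    """How much lower the second value is than the first, in percent."""
    return 100 * (a - o) / a


print(round(percent_better(amc), 2))    # 11.89
print(round(percent_better(ours), 2))   # 25.84

# Gap between the two controllers for each device speed.
for name, a, o in zip(devices, amc, ours):
    print(name, round(gap_percent(a, o), 1))
```

The per-device gap grows from the slowest to the fastest device, which matches the claim that the close row policy becomes more pessimistic as devices get faster.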

[Figure: results for the CHStone benchmarks with a 64-bit bus]

Conclusion

• A novel worst case analysis that takes dynamic state into account
• The open row policy can reduce memory latency as devices are becoming faster
• A private bank scheme is used to eliminate row buffer interference from other requestors

Future Work

• Discussion of shared data
• Bus utilization is still poor due to read/write switching
• Read/Write optimization to reduce the latency bound
• Handle multiple ranks
• Implementation in hardware

References

[1] M. Paolieri, E. Quiñones, F. Cazorla, and M. Valero, “An Analyzable Memory Controller for Hard Real-Time CMPs,” Embedded Systems Letters, IEEE, vol. 1, no. 4, pp. 86–90, 2009.
[2] B. Akesson, K. Goossens, and M. Ringhofer, “Predator: a predictable SDRAM memory controller,” in CODES+ISSS, 2007, pp. 251–256.
[3] S. Goossens, B. Akesson, and K. Goossens, “Conservative Open-page Policy for Mixed Time-Criticality Memory Controllers,” in DATE, 2013.
[4] J. Reineke, I. Liu, H. D. Patel, S. Kim, and E. A. Lee, “PRET DRAM controller: Bank privatization for predictability and temporal isolation,” in CODES+ISSS, 2011, pp. 99–108.