Lect14.LecMar09 2006.CPU Cache Main Memory Performance


Anshul Kumar, CSE IITD

CSL718 : Main Memory

CPU-Cache-Main Memory Performance

9th Mar, 2006



A Simple Model

$t_{av} = t_c + p_m \cdot t_{c.miss}$

where

$t_{av}$ = average memory access time as seen by the CPU

$t_c$ = cache access time

$p_m$ = miss probability (consider only read misses if write penalties are hidden by buffers)

$t_{c.miss}$ = cache miss penalty

[Figure: block diagram of CPU, cache, and memory]
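As a quick numeric illustration of the simple model, the sketch below evaluates t_av = t_c + p_m · t_c.miss. The function name and the sample values (1-cycle cache, 5% miss probability, 20-cycle miss penalty) are hypothetical and not taken from the lecture.

```python
def average_access_time(t_c, p_m, t_c_miss):
    """Simple model: t_av = t_c + p_m * t_c.miss (all times in the same unit)."""
    if not 0.0 <= p_m <= 1.0:
        raise ValueError("p_m must be a probability")
    return t_c + p_m * t_c_miss


# Hypothetical numbers: 1-cycle cache, 5% read-miss probability, 20-cycle penalty.
t_av = average_access_time(t_c=1.0, p_m=0.05, t_c_miss=20.0)
print(f"t_av = {t_av:.3f} cycles")
```

Even a small miss probability doubles the average access time here, which is why the miss penalty gets so much attention in the following slides.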


Cache miss penalty

Depends on

Various cache policies: read policy, load policy, write policy, write buffers, etc.

Main memory organization: interleaving, page mode


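Read Policies

Effective access time $T_{eff}$, taking a cache access as 1 cycle and a memory access as $T$ cycles:

Sequential Simple: $T_{eff} = (1 - p_m) \cdot 1 + p_m \cdot (T + 2)$

Concurrent Simple: $T_{eff} = (1 - p_m) \cdot 1 + p_m \cdot (T + 1)$

Sequential Forward: $T_{eff} = (1 - p_m) \cdot 1 + p_m \cdot (T + 1)$

Concurrent Forward: $T_{eff} = (1 - p_m) \cdot 1 + p_m \cdot T$

[Figure: cache and memory timing diagrams for the four read policies, with cache accesses of 1 cycle and memory accesses of T cycles]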
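The four read-policy formulas differ only in the cost of the miss path, so they can be compared side by side. The sketch below tabulates T_eff for each policy; the dictionary layout and the sample inputs (p_m = 0.1, T = 10) are illustrative assumptions.

```python
# Miss-path cost in cache cycles for each read policy, as given on the slide.
MISS_CYCLES = {
    "sequential simple": lambda T: T + 2,
    "concurrent simple": lambda T: T + 1,
    "sequential forward": lambda T: T + 1,
    "concurrent forward": lambda T: T,
}


def effective_access_time(policy, p_m, T):
    """T_eff = (1 - p_m) * 1 + p_m * (miss-path cycles for the policy)."""
    return (1.0 - p_m) * 1.0 + p_m * MISS_CYCLES[policy](T)


# Hypothetical inputs: 10% miss probability, memory access of 10 cache cycles.
t_eff = {name: effective_access_time(name, 0.1, 10) for name in MISS_CYCLES}
for name, value in t_eff.items():
    print(f"{name:20s} {value:.2f}")
```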


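Load policies

Example: a block of 4 addressable units (AU 0 to 3), with a cache miss on AU 1.

Block Load

Load Forward

Fetch Bypass (wrap around load)

[Figure: the 4-AU block (0 1 2 3) showing which AUs are loaded under each policy]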
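The figure for this slide did not survive extraction, so the sketch below is an assumption based on the usual meaning of these terms: block load fetches the whole block from its start, load forward fetches from the missed unit to the end of the block, and fetch bypass (wrap around load) starts at the missed unit and wraps around to fetch the whole block. The function name and policy strings are hypothetical.

```python
def transfer_order(policy, block_size, miss_index):
    """Order in which the AUs of a block are fetched (assumed semantics)."""
    if policy == "block load":
        # Whole block, starting from AU 0.
        return list(range(block_size))
    if policy == "load forward":
        # Only from the missed AU to the end of the block.
        return list(range(miss_index, block_size))
    if policy == "fetch bypass":
        # Missed AU first, then wrap around until the block is complete.
        return [(miss_index + k) % block_size for k in range(block_size)]
    raise ValueError(f"unknown policy: {policy}")


# The slide's example: a 4-AU block with a miss on AU 1.
for policy in ("block load", "load forward", "fetch bypass"):
    print(policy, transfer_order(policy, block_size=4, miss_index=1))
```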


Analyzing Write Policies: CPU time

Policy                     Read hit   Read miss   Write hit   Write miss
Hit: WB, Miss: WB          1          Tb + i      1           1
Hit: WB, Miss: WTWA        1          Tb + i      1           1
Hit: WB, Miss: WTNWA       1          Tb + i      1           1
Hit: WT, Miss: WB          1          Tb + i      1           1
Hit: WT, Miss: WTWA        1          Tb + i      1           1
Hit: WT, Miss: WTNWA       1          Tb + i      1           1

i depends on read policy


Analyzing Write Policies: Bus time

Policy                     Read hit   Read miss   Write hit   Write miss
Hit: WB, Miss: WB          0          Tb(2-Pc)    0           Tb(2-Pc)
Hit: WB, Miss: WTWA        0          Tb(2-Pc)    0           Tb(2-Pc)+Tw
Hit: WB, Miss: WTNWA       0          Tb(2-Pc)    0           Tw
Hit: WT, Miss: WB          0          Tb(2-Pc)    Tw          Tb(2-Pc)
Hit: WT, Miss: WTWA        0          Tb          Tw          Tb+Tw
Hit: WT, Miss: WTNWA       0          Tb          Tw          Tw
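To compare policies by average bus occupancy, the bus-time rows can be weighted by how often each event occurs. The sketch below copies the six rows from the table and computes an expected bus time per reference. The slides do not define Tb, Tw, or Pc here; the code treats them as plain numeric inputs, and the weighting (a single miss probability p_m shared by reads and writes, and a read fraction) is an assumption made for illustration.

```python
def bus_time_rows(Tb, Tw, Pc):
    """Bus time per (read hit, read miss, write hit, write miss), from the table."""
    wb = Tb * (2 - Pc)
    return {
        ("WB", "WB"): (0, wb, 0, wb),
        ("WB", "WTWA"): (0, wb, 0, wb + Tw),
        ("WB", "WTNWA"): (0, wb, 0, Tw),
        ("WT", "WB"): (0, wb, Tw, wb),
        ("WT", "WTWA"): (0, Tb, Tw, Tb + Tw),
        ("WT", "WTNWA"): (0, Tb, Tw, Tw),
    }


def expected_bus_time(row, read_fraction, p_m):
    """Weight a table row by event frequency (assumes the same p_m for reads and writes)."""
    read_hit, read_miss, write_hit, write_miss = row
    read = (1 - p_m) * read_hit + p_m * read_miss
    write = (1 - p_m) * write_hit + p_m * write_miss
    return read_fraction * read + (1 - read_fraction) * write


# Hypothetical inputs: Tb = 8, Tw = 1, Pc = 0.5, 75% reads, 10% misses.
rows = bus_time_rows(Tb=8, Tw=1, Pc=0.5)
average = {key: expected_bus_time(row, 0.75, 0.1) for key, row in rows.items()}
for (hit_policy, miss_policy), value in average.items():
    print(f"Hit: {hit_policy:2s} Miss: {miss_policy:5s} {value:.3f}")
```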


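Interleaving with Fast Page Mode

[Equation not recoverable from the transcript: line access time $T_{line\,access}$ expressed in terms of $T_a$, $T_c$, $T_{bus}$, $L$, and $m$]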


A Refined Model

$t_{av} = t_c + p_m \cdot (t_{c.miss} + t_{interference} + t_{w-interference} + t_{IO-interference})$

where

$t_{interference}$ = interference among line transfers

$t_{w-interference}$ = interference between word writes and line transfers

$t_{IO-interference}$ = interference between I/O and line transfers


Interference among line transfers

What happens when another miss occurs in the interval $t_{busy} = t_{m.miss} - t_{c.miss}$?

$t_{interference}$ = additional delay due to this

= expected number of misses during $t_{busy}$ × delay per miss

= $(\lambda \cdot t_{busy} \cdot p_m) \cdot (t_{busy} / 2)$

where $\lambda$ = memory request rate of the processor

[Figure: timeline of t_c, t_c.miss (CPU blocked), and t_m.miss (memory busy), with the CPU executing again after t_c.miss while memory is still busy]
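A small worked example may help connect the interference term to the refined model. The sketch below computes t_interference from the request rate, miss probability, and the two miss times, then plugs it into t_av. Function names and sample values are hypothetical; the write and I/O interference terms are simply passed in and default to zero.

```python
def line_transfer_interference(lam, p_m, t_m_miss, t_c_miss):
    """t_interference = (lam * t_busy * p_m) * (t_busy / 2), with t_busy = t_m.miss - t_c.miss."""
    t_busy = t_m_miss - t_c_miss
    expected_misses = lam * t_busy * p_m  # misses arriving while memory is still busy
    return expected_misses * (t_busy / 2.0)  # each waits half the busy interval on average


def refined_access_time(t_c, p_m, t_c_miss, t_interference, t_w=0.0, t_io=0.0):
    """t_av = t_c + p_m * (t_c.miss + t_interference + t_w-interference + t_IO-interference)."""
    return t_c + p_m * (t_c_miss + t_interference + t_w + t_io)


# Hypothetical values: 0.5 requests/cycle, 5% misses, t_m.miss = 30, t_c.miss = 20 cycles.
t_int = line_transfer_interference(lam=0.5, p_m=0.05, t_m_miss=30.0, t_c_miss=20.0)
t_av = refined_access_time(t_c=1.0, p_m=0.05, t_c_miss=20.0, t_interference=t_int)
print(f"t_interference = {t_int:.3f}, t_av = {t_av:.4f}")
```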


Interference: I/Os and writes

delay = probability that memory is busy when a request arrives × average waiting period

What happens when memory is found to be busy serving one request and some other requests are waiting?

[Figure: timeline of request arrivals against memory busy periods, with requests marked served, waiting, served]


I/O Interference

$t_{IO-interference}$ = delay due to I/O contention

= probability that memory is occupied with I/O × average time taken to complete the ongoing I/O

= $P(\text{memory occupied with I/O}) \cdot (t_{service} + t_{IO-wait}) / 2$

$t_{service}$ = time to service (block read/write time)

$t_{IO-wait}$ = waiting time

= 0, if the CPU has a higher priority

nonzero otherwise (estimate using a queuing model)


Write Interference Delay

$t_{w-interference}$ = probability that a write through is occupying the memory when a read miss occurs × average time taken to complete the ongoing write


Memory performance using queuing model

[Figure: arrival of requests (from processor/cache), requests queued for service, servicing of requests (by memory)]

Statistical behaviour of arrivals?

Statistical behaviour of service?

Model nomenclature: arrival / service / number of servers

M / G / 1      G: General
M / M / 1      M: Poisson / Exponential
M / D / 1      D: Constant
M_B / D / 1    M_B: Binomial


Modeling memory requests

$T$: interval (memory cycle time)

$\tau$: processor cycle

prob. of a request in one cycle = $p$

prob. of no request in one cycle = $1 - p$

prob. of no request in $T/\tau$ cycles = $(1 - p)^{T/\tau}$

prob. of at least one request in $T/\tau$ cycles = $1 - (1 - p)^{T/\tau}$

prob. of $k$ requests in $n$ $(= T/\tau)$ cycles = $\binom{n}{k} p^k (1 - p)^{n-k}$ (Binomial distribution)

expected no. of requests in $n$ cycles = $n p$
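The binomial request model above is easy to check numerically. The sketch below computes the probability of k requests in n processor cycles and confirms that the probabilities sum to 1 and that the mean is n·p. The sample values (n = 8 cycles per memory cycle, p = 0.25) are hypothetical.

```python
from math import comb


def prob_k_requests(n, k, p):
    """Binomial probability of exactly k requests in n cycles, each with request prob. p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Hypothetical values: memory cycle spans n = 8 processor cycles, p = 0.25 per cycle.
n, p = 8, 0.25
pmf = [prob_k_requests(n, k, p) for k in range(n + 1)]
p_at_least_one = 1 - (1 - p) ** n
expected = sum(k * pk for k, pk in enumerate(pmf))

print(f"P(no request)      = {pmf[0]:.4f}")
print(f"P(at least one)    = {p_at_least_one:.4f}")
print(f"expected requests  = {expected:.4f} (n*p = {n * p})")
```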


Poisson Approximation

If processor cycles are small (i.e., processor cycle time $\tau \to 0$, $p \to 0$, $n \to \infty$, $n p \to \lambda T$), the Binomial distribution tends to a Poisson distribution with request rate $\lambda$.

prob. of $k$ requests in interval $T$ = $\frac{(\lambda T)^k e^{-\lambda T}}{k!}$

expected no. of requests in interval $T$ = $\lambda T$

The interval between two consecutive requests has an exponential distribution: prob(inter-arrival interval $\le t$) = $1 - e^{-\lambda t}$
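The limiting behaviour can be observed directly by holding n·p fixed and letting n grow. The sketch below compares binomial and Poisson probabilities for a fixed mean and reports the largest pointwise gap; the chosen mean (2.0) and the values of n are arbitrary illustration values.

```python
from math import comb, exp, factorial


def binomial_pmf(n, k, p):
    """Probability of k requests in n cycles with per-cycle request probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, mean):
    """Probability of k requests when the expected count in the interval is `mean`."""
    return mean**k * exp(-mean) / factorial(k)


lam_T = 2.0  # expected requests per interval, held fixed while n grows
gaps = {}
for n in (10, 100, 10000):
    p = lam_T / n
    gaps[n] = max(
        abs(binomial_pmf(n, k, p) - poisson_pmf(k, lam_T)) for k in range(11)
    )
    print(f"n = {n:6d}  max |binomial - Poisson| = {gaps[n]:.6f}")
```

The gap shrinks as the processor cycle becomes small relative to the interval, which is the regime in which the Poisson model is used in the rest of the lecture.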




Modeling Service

Each request is served in constant time

e.g. cache write through requests,

cache block transfer requests

or Service time has an exponential distribution

e.g. I/O requests with varying block sizes where

small blocks are more common than large blocks


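M / G / 1 Model

Average waiting time = $T_w = \frac{1}{\mu} \cdot \frac{\rho (1 + c^2)}{2 (1 - \rho)}$

Average queue length = $Q = \frac{\rho^2 (1 + c^2)}{2 (1 - \rho)}$

where

$\rho$ = occupancy of server = $\lambda / \mu$

$\mu$ = average service rate

$c = \sigma \mu$ (coefficient of variation of service time)

$\sigma^2$ = variance of service time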


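Special cases: M/M/1, M/D/1

Average waiting time $T_w$ and average queue length $Q$, with occupancy $\rho = \lambda / \mu$:

M/M/1 ($c = 1$): $T_w = \frac{1}{\mu} \cdot \frac{\rho}{1 - \rho}$, $Q = \frac{\rho^2}{1 - \rho}$

M/D/1 ($c = 0$): $T_w = \frac{1}{\mu} \cdot \frac{\rho}{2 (1 - \rho)}$, $Q = \frac{\rho^2}{2 (1 - \rho)}$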
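The M/G/1 waiting-time and queue-length formulas, and their M/M/1 and M/D/1 special cases, can be evaluated with one function that takes the coefficient of variation c as a parameter. This is a minimal sketch; the function name and the sample rates (λ = 0.5, μ = 1.0) are hypothetical.

```python
def mg1_wait_and_queue(lam, mu, c):
    """Return (T_w, Q) for an M/G/1 queue with service-time coefficient of variation c."""
    rho = lam / mu
    if not 0.0 <= rho < 1.0:
        raise ValueError("occupancy rho = lam/mu must be in [0, 1)")
    factor = (1 + c**2) / (2 * (1 - rho))
    t_w = (1.0 / mu) * rho * factor  # average waiting time
    q = rho**2 * factor  # average number of waiting requests
    return t_w, q


# Hypothetical rates: 0.5 requests per unit time, service rate 1.0.
tw_mm1, q_mm1 = mg1_wait_and_queue(lam=0.5, mu=1.0, c=1.0)  # exponential service
tw_md1, q_md1 = mg1_wait_and_queue(lam=0.5, mu=1.0, c=0.0)  # constant service
print(f"M/M/1: T_w = {tw_mm1:.3f}, Q = {q_mm1:.3f}")
print(f"M/D/1: T_w = {tw_md1:.3f}, Q = {q_md1:.3f}")
```

With the same occupancy, constant service time halves both the waiting time and the queue length relative to exponential service.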


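M/D/1 with low server occupancy

When $\rho$ is small,

$T_w = \frac{1}{\mu} \cdot \frac{\rho}{2 (1 - \rho)} \approx \frac{1}{\mu} \cdot \frac{\rho}{2} = \frac{\lambda}{2 \mu^2}$

Compare this with the line-transfer interference delay $\frac{1}{2} \lambda \, p_m \, t_{busy}^2$: the two agree when the arrival rate is taken as $\lambda p_m$ and the service time $1/\mu$ as $t_{busy}$.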


Designing buffer to hold the queue

How to design a buffer so that buffer overflow, or stalling due to a full buffer, stays within a certain limit?

For the M/M/1 model,

prob(queue size > buffer size BF) = $\rho^{BF+1}$

where the queue size counts requests waiting and in service. Choose BF so that this probability is below a desired value.
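The buffer-sizing rule amounts to a short search. The sketch below assumes the M/M/1 tail result P(N > BF) = ρ^(BF+1), with N counting requests in queue and in service, and returns the smallest BF that keeps the overflow probability below a target. Function names and sample values are hypothetical.

```python
def overflow_probability(rho, bf):
    """M/M/1 tail: probability that more than bf requests are in the system."""
    return rho ** (bf + 1)


def min_buffer_size(rho, target):
    """Smallest BF with overflow probability strictly below `target`."""
    if not 0.0 < rho < 1.0:
        raise ValueError("rho must be in (0, 1)")
    if target <= 0.0:
        raise ValueError("target must be positive")
    bf = 0
    while overflow_probability(rho, bf) >= target:
        bf += 1
    return bf


# Hypothetical design point: 50% occupancy, overflow probability below 1%.
bf = min_buffer_size(rho=0.5, target=0.01)
print(bf, overflow_probability(0.5, bf))
```

Because the tail decays geometrically, the required buffer grows only logarithmically as the target probability shrinks, but it grows quickly as ρ approaches 1.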


Open and Closed Queues

[Figure: arrival of requests, requests queued for service, servicing of requests (by memory)]

Open queue: the processor is not blocked by queuing delays and the request rate remains unaffected.

Closed queue: the processor is blocked due to queuing delays and the request rate drops.


Open and Closed Queues

                   Requests queued for service   Servicing of requests (by memory)
Time               $T_w$                         $1/\mu$
Number (open)      $Q = \lambda T_w$             $\rho = \lambda / \mu$
Number (closed)    $Q_a$                         $\rho_a$

occupancy (open queue) = occupancy (closed queue) + waiting (closed queue):

$\rho = \rho_a + Q_a$


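M/D/1 Closed Queue

Reduced request rate = $\lambda_a$

Reduced occupancy = $\rho_a = \lambda_a / \mu$

Requests being served = $\rho_a$

Requests waiting = $Q_a = \frac{\rho_a^2}{2 (1 - \rho_a)}$

Substituting into $\rho = \rho_a + Q_a$:

$\rho - \rho_a = \frac{\rho_a^2}{2 (1 - \rho_a)}$

$\rho_a^2 - 2 (1 + \rho) \rho_a + 2 \rho = 0$

$\rho_a = (1 + \rho) - \sqrt{(1 + \rho)^2 - 2 \rho} = (1 + \rho) - \sqrt{1 + \rho^2}$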
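The closed-queue occupancy follows from solving the quadratic obtained by combining ρ = ρ_a + Q_a with the M/D/1 queue length Q_a = ρ_a² / (2(1 − ρ_a)). The sketch below evaluates the root that stays below 1 and checks that it satisfies the balance; the function name and the sample offered occupancy are hypothetical.

```python
from math import sqrt


def closed_queue_occupancy(rho):
    """Achieved occupancy rho_a for an M/D/1 closed queue with offered occupancy rho."""
    return (1 + rho) - sqrt(1 + rho**2)


# Hypothetical offered occupancy of 0.5.
rho = 0.5
rho_a = closed_queue_occupancy(rho)
q_a = rho_a**2 / (2 * (1 - rho_a))  # requests waiting in the closed queue

print(f"rho_a = {rho_a:.6f}, Q_a = {q_a:.6f}, rho_a + Q_a = {rho_a + q_a:.6f}")
```

The achieved occupancy is always below the offered one, reflecting the drop in request rate when the processor is blocked by queuing delays.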


Deriving queue length, wait time

Let $t_i$ = time during which request $i$ is being served

$r_i$ = no. of arrivals during $t_i$

$n_i$ = queue length at the end of $t_i$, including the item in service

Assume occupancy of server $\rho = \lambda / \mu < 1$, so the process reaches a steady state.

Expected values:

$E(t_i) = E(t) = T = 1/\mu$

$E(r_i) = E(r) = \lambda E(t) = \lambda / \mu = \rho$

$E(n_i) = E(n) = N$


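Relating $n_{i+1}$ to $n_i$

$n_{i+1} = n_i + \text{arrivals} - \text{departures}$

Two cases need to be considered:

i) $n_i \ne 0$

ii) $n_i = 0$

[Figure: queue of requests C_{i+3}, C_{i+2}, C_{i+1} behind C_i, with queue length n_i]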


When $n_i \ne 0$

$C_{i+1}$ arrived before $C_i$ left

$n_{i+1} = n_i + r_{i+1} - 1$

[Figure: timeline with C_i served during t_i and C_{i+1} served during t_{i+1}; C_i leaves, then C_{i+1} leaves]


When $n_i = 0$

$C_{i+1}$ arrived after $C_i$ left

$n_{i+1} = n_i + 1 + r_{i+1} - 1 = n_i + r_{i+1}$

[Figure: timeline with C_i served during t_i and C_{i+1} served during t_{i+1}; C_{i+1} arrives after C_i leaves]


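Combining the two cases

$n_{i+1} = n_i + r_{i+1} - 1 + \delta_i$

where $\delta_i = 0$ when $n_i \ne 0$, and $\delta_i = 1$ when $n_i = 0$

Note that $n_i \delta_i = 0$ and $\delta_i^2 = \delta_i$.

$E(n_{i+1}) = E(n_i) + E(r_{i+1}) - 1 + E(\delta_i)$

In steady state, $E(n) = E(n) + E(r) - 1 + E(\delta)$

that is, $E(\delta) = 1 - E(r) = 1 - \rho$, and prob($n \ne 0$) $= \rho$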


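Combining the two cases

$n_{i+1} = n_i + r_{i+1} - 1 + \delta_i$

Squaring both sides:

$n_{i+1}^2 = n_i^2 + (r_{i+1} - 1)^2 + \delta_i^2 + 2 n_i (r_{i+1} - 1) + 2 (r_{i+1} - 1) \delta_i + 2 n_i \delta_i$

Using $n_i \delta_i = 0$ and $\delta_i^2 = \delta_i$:

$n_{i+1}^2 = n_i^2 + (r_{i+1} - 1)^2 + \delta_i + 2 n_i (r_{i+1} - 1) + 2 (r_{i+1} - 1) \delta_i$

Taking expectations:

$E(n_{i+1}^2) = E(n_i^2) + E[(r_{i+1} - 1)^2] + E(\delta_i) + 2 E[n_i (r_{i+1} - 1)] + 2 E[(r_{i+1} - 1) \delta_i]$

In steady state, with $E(r) = \rho$ and $E(\delta) = 1 - \rho$:

$0 = E[(r - 1)^2] + E(\delta) + 2 E[n (r - 1)] + 2 E[(r - 1) \delta]$

$0 = E(r^2) - 2\rho + 1 + (1 - \rho) + 2 E(n) (\rho - 1) + 2 (\rho - 1)(1 - \rho)$

$2 E(n) (1 - \rho) = E(r^2) - 2\rho^2 + \rho$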


continued

$2 E(n) (1 - \rho) = E(r^2) - 2\rho^2 + \rho$

$N = E(n) = \frac{E(r^2) - 2\rho^2 + \rho}{2 (1 - \rho)} = \rho + \frac{E(r^2) - \rho}{2 (1 - \rho)}$

This is valid for G/G/1


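Consider Poisson arrival

$P(r_i) = \frac{(\lambda t_i)^{r_i} e^{-\lambda t_i}}{r_i!}$

mean $E(r_i) = \lambda t_i$

variance $\sigma_{r_i}^2 = \lambda t_i$

$\sigma_{r_i}^2 = E(r_i^2) - |E(r_i)|^2$

$E(r_i^2) = \sigma_{r_i}^2 + |E(r_i)|^2 = \lambda t_i + \lambda^2 t_i^2$

Take expectation over $i$:

$E(r^2) = \lambda E(t) + \lambda^2 E(t^2)$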


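continued

mean $E(t) = 1/\mu$, variance $\sigma_t^2$

$E(t^2) = \sigma_t^2 + [E(t)]^2 = \sigma_t^2 + 1/\mu^2$

Recall $E(r^2) = \lambda E(t) + \lambda^2 E(t^2)$

Therefore, $E(r^2) = \lambda/\mu + \lambda^2 (\sigma_t^2 + 1/\mu^2) = \rho + \lambda^2 \sigma_t^2 + \rho^2$

$N = E(n) = \rho + \frac{E(r^2) - \rho}{2 (1 - \rho)} = \rho + \frac{\lambda^2 \sigma_t^2 + \rho^2}{2 (1 - \rho)} = \rho + \frac{\rho^2 (1 + c^2)}{2 (1 - \rho)}$

where $c^2 = \mu^2 \sigma_t^2$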
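The derivation can be sanity-checked by simulating the recurrence n_{i+1} = n_i + r_{i+1} − 1 + δ_i directly. The sketch below is a Monte Carlo check under stated assumptions: Poisson arrivals and a constant service time (the M/D/1 case, so σ_t = 0 and c = 0). It compares the simulated mean of n with ρ + ρ²(1 + c²) / (2(1 − ρ)) and the fraction of empty states with 1 − ρ. Function names, the seed, and the rates are hypothetical.

```python
import random


def arrivals_during(lam, t, rng):
    """Count Poisson arrivals of rate lam in an interval of length t."""
    count, elapsed = 0, rng.expovariate(lam)
    while elapsed <= t:
        count += 1
        elapsed += rng.expovariate(lam)
    return count


def simulate_md1(lam, service_time, steps, seed=1):
    """Iterate n_{i+1} = n_i + r_{i+1} - 1 + delta_i with constant service time."""
    rng = random.Random(seed)
    n, total, empty = 0, 0, 0
    for _ in range(steps):
        delta = 1 if n == 0 else 0  # delta_i = 1 only when the system was left empty
        n = n + arrivals_during(lam, service_time, rng) - 1 + delta
        total += n
        empty += n == 0
    return total / steps, empty / steps


lam, mu = 0.5, 1.0
rho = lam / mu
c2 = 0.0  # constant service time, so the variance of t is zero
n_formula = rho + rho**2 * (1 + c2) / (2 * (1 - rho))
n_sim, p_empty = simulate_md1(lam, 1.0 / mu, steps=200_000)
print(f"N (formula) = {n_formula:.4f}, N (simulated) = {n_sim:.4f}")
print(f"P(n = 0) simulated = {p_empty:.4f}, 1 - rho = {1 - rho:.4f}")
```

The simulated values fluctuate with the seed and run length, so they should be read as agreeing with the formula only up to sampling error.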


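Direct Derivation for M/M/1

$P(n; t)$ = prob. that there are $n$ requests in the system at time $t$ (in queue + in service)

$P(n; t + \Delta t) = P(n; t)(1 - \lambda \Delta t - \mu \Delta t) + P(n-1; t) \lambda \Delta t + P(n+1; t) \mu \Delta t$

$P(0; t + \Delta t) = P(0; t)(1 - \lambda \Delta t) + P(1; t) \mu \Delta t$

The probability of more than one event in $\Delta t$ is neglected ($\Delta t^2$ term).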


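$dP(n; t)/dt = P(n; t)(-\lambda - \mu) + P(n-1; t) \lambda + P(n+1; t) \mu$

$dP(0; t)/dt = P(0; t)(-\lambda) + P(1; t) \mu$

In steady state, we can drop $;t$ and the derivatives tend to 0:

$0 = P(n)(-\lambda - \mu) + P(n-1) \lambda + P(n+1) \mu$

$0 = P(0)(-\lambda) + P(1) \mu$

$\lambda P(n) - \mu P(n+1) = \lambda P(n-1) - \mu P(n)$

$\lambda P(0) - \mu P(1) = 0$

$\lambda P(n-1) - \mu P(n) = 0 \Rightarrow P(n) = \rho P(n-1)$, with $\rho = \lambda / \mu$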


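$P(n) = \rho P(n-1)$

$P(n) = \rho^n P(0)$

$\sum_{i=0}^{\infty} P(i) = P(0) \sum_{i=0}^{\infty} \rho^i = \frac{P(0)}{1 - \rho} = 1$

$P(0) = 1 - \rho$ and $P(n) = (1 - \rho) \rho^n$

$E(n) = \sum_{i=0}^{\infty} i P(i) = (1 - \rho) \sum_{i=0}^{\infty} i \rho^i = (1 - \rho) \frac{\rho}{(1 - \rho)^2} = \frac{\rho}{1 - \rho}$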
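The M/M/1 steady-state result can be reproduced numerically from the recursion P(n) = ρ P(n−1) alone. The sketch below builds the distribution on a truncated state space (an approximation; the tail beyond n_max is assumed negligible), normalizes it, and compares P(0) and E(n) with the closed forms 1 − ρ and ρ / (1 − ρ). The function name and the chosen ρ are hypothetical.

```python
def mm1_distribution(rho, n_max):
    """Build P(0..n_max) from P(n) = rho * P(n-1), then normalize (truncated chain)."""
    probs = [1.0]  # unnormalized P(0)
    for _ in range(n_max):
        probs.append(rho * probs[-1])
    total = sum(probs)
    return [p / total for p in probs]


rho = 0.6
probs = mm1_distribution(rho, n_max=200)
mean_n = sum(i * p for i, p in enumerate(probs))

print(f"P(0) = {probs[0]:.6f}  (1 - rho = {1 - rho:.6f})")
print(f"E(n) = {mean_n:.6f}  (rho / (1 - rho) = {rho / (1 - rho):.6f})")
```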


Prob(items in queue + server $\le k$) $= \sum_{i=0}^{k} P(i) = (1 - \rho) \sum_{i=0}^{k} \rho^i = \frac{(1 - \rho)(1 - \rho^{k+1})}{1 - \rho} = 1 - \rho^{k+1}$

Prob(items in queue + server $> k$) $= \rho^{k+1}$
