Lect14.LecMar09 2006.CPU Cahce Main Memory Performance
8/4/2019 Lect14.LecMar09 2006.CPU Cahce Main Memory Performance
Anshul Kumar, CSE IITD
CSL718 : Main Memory
CPU-Cache-Main Memory Performance
9th Mar, 2006
-
A Simple Model
$t_{av} = t_c + p_m \cdot t_{c.miss}$
where
$t_{av}$ = average memory access time as seen by CPU
$t_c$ = cache access time
$p_m$ = miss probability (consider only read misses, if write penalties are hidden by buffers)
$t_{c.miss}$ = cache miss penalty
[Figure: CPU, cache, and memory blocks]
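The simple model can be evaluated directly. The following is a minimal sketch; the function name and the sample numbers are illustrative assumptions, not values from the lecture.

```python
def average_access_time(t_c, p_m, t_c_miss):
    """Simple model: t_av = t_c + p_m * t_c_miss.

    t_c      -- cache access time
    p_m      -- miss probability (0..1)
    t_c_miss -- cache miss penalty, in the same time unit as t_c
    """
    if not 0.0 <= p_m <= 1.0:
        raise ValueError("p_m must be a probability")
    return t_c + p_m * t_c_miss


if __name__ == "__main__":
    # Hypothetical numbers: 1-cycle cache, 5% misses, 20-cycle penalty.
    print(average_access_time(1.0, 0.05, 20.0))
```

With these assumed inputs the miss term contributes as much as the hit time itself, which is why the later slides look at the miss penalty in detail.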
-
Cache miss penalty
Depends on
  Various cache policies
    Read policy
    Load policy
    Write policy
    Write buffers etc.
  Main memory organization
    Interleaving
    Page mode
-
Read Policies
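Sequential Simple: $T_{eff} = (1 - p_m) \cdot 1 + p_m \cdot (T + 2)$
Concurrent Simple: $T_{eff} = (1 - p_m) \cdot 1 + p_m \cdot (T + 1)$
Sequential Forward: $T_{eff} = (1 - p_m) \cdot 1 + p_m \cdot (T + 1)$
Concurrent Forward: $T_{eff} = (1 - p_m) \cdot 1 + p_m \cdot T$
[Figure: cache and memory timing diagrams for the four read policies, with unit-time cache accesses and memory access time T]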
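The four read-policy formulas differ only in the number of cycles charged to a miss. The sketch below tabulates them, assuming a hit costs one cycle as on the slide; the dictionary and function names are hypothetical.

```python
# Cycles charged to a miss under each read policy, as a function of the
# memory access time T (in cache cycles). A hit is assumed to cost 1 cycle.
MISS_CYCLES = {
    "sequential_simple": lambda T: T + 2,
    "concurrent_simple": lambda T: T + 1,
    "sequential_forward": lambda T: T + 1,
    "concurrent_forward": lambda T: T,
}


def effective_access_time(policy, p_m, T):
    """T_eff = (1 - p_m) * 1 + p_m * (miss cycles for the chosen policy)."""
    return (1.0 - p_m) * 1.0 + p_m * MISS_CYCLES[policy](T)


if __name__ == "__main__":
    for name in MISS_CYCLES:
        # Hypothetical operating point: 10% misses, T = 10 cycles.
        print(name, effective_access_time(name, 0.1, 10))
```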
-
Load policies
Example: 4 AU block, cache miss on AU 1
Block Load
Load Forward
Fetch Bypass (wrap around load)
[Figure: AUs 0 1 2 3 of the block, showing what is loaded under each policy]
-
Analyzing Write Policies:CPU time
Policy                  Read hit   Read miss   Write hit   Write miss
Hit:WB, Miss: WB        1          Tb + i      1           1
Hit:WB, Miss: WTWA      1          Tb + i      1           1
Hit:WB, Miss: WTNWA     1          Tb + i      1           1
Hit:WT, Miss: WB        1          Tb + i      1           1
Hit:WT, Miss: WTWA      1          Tb + i      1           1
Hit:WT, Miss: WTNWA     1          Tb + i      1           1
i depends on read policy
-
Analyzing Write Policies:Bus time
Policy                  Read hit   Read miss    Write hit   Write miss
Hit:WB, Miss: WB        0          Tb (2-Pc)    0           Tb (2-Pc)
Hit:WB, Miss: WTWA      0          Tb (2-Pc)    0           Tb (2-Pc) + Tw
Hit:WB, Miss: WTNWA     0          Tb (2-Pc)    0           Tw
Hit:WT, Miss: WB        0          Tb (2-Pc)    Tw          Tb (2-Pc)
Hit:WT, Miss: WTWA      0          Tb           Tw          Tb + Tw
Hit:WT, Miss: WTNWA     0          Tb           Tw          Tw
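One way to use the bus-time table is to weight each column by how often that event occurs. The sketch below does this under assumptions that are not in the lecture: the write fraction and the separate read and write miss probabilities are hypothetical parameters, and Pc is simply passed through as the table uses it.

```python
def bus_time_table(Tb, Tw, Pc):
    """Bus time per event, keyed by (hit policy, miss policy).

    Each value is (read hit, read miss, write hit, write miss),
    copied from the table on the slide.
    """
    line = Tb * (2 - Pc)
    return {
        ("WB", "WB"): (0, line, 0, line),
        ("WB", "WTWA"): (0, line, 0, line + Tw),
        ("WB", "WTNWA"): (0, line, 0, Tw),
        ("WT", "WB"): (0, line, Tw, line),
        ("WT", "WTWA"): (0, Tb, Tw, Tb + Tw),
        ("WT", "WTNWA"): (0, Tb, Tw, Tw),
    }


def expected_bus_time(policy, Tb, Tw, Pc, f_write, p_read_miss, p_write_miss):
    """Average bus time per memory reference (assumed weighting scheme)."""
    read_hit, read_miss, write_hit, write_miss = bus_time_table(Tb, Tw, Pc)[policy]
    reads = (1 - p_read_miss) * read_hit + p_read_miss * read_miss
    writes = (1 - p_write_miss) * write_hit + p_write_miss * write_miss
    return (1 - f_write) * reads + f_write * writes


if __name__ == "__main__":
    # Hypothetical workload: 25% writes, 10% read misses, 20% write misses.
    print(expected_bus_time(("WT", "WTNWA"), 8, 2, 0.5, 0.25, 0.1, 0.2))
```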
-
Interleaving with Fast Page Mode
[Equation garbled in extraction: line access time $T_{line\,access}$ expressed in terms of $L$, $m$, $T_a$, $T_c$, and $T_{bus}$]
-
A Refined Model
$t_{av} = t_c + p_m \cdot (t_{c.miss} + t_{interference} + t_{w\text{-}interference} + t_{IO\text{-}interference})$
where
$t_{interference}$ = interference among line transfers
$t_{w\text{-}interference}$ = interference between word writes and line transfers
$t_{IO\text{-}interference}$ = interference between I/O and line transfers
-
Interference among line transfers
What happens when another miss occurs in the interval $t_{busy} = t_{m.miss} - t_{c.miss}$?
$t_{interference}$ = additional delay due to this
= expected number of misses during $t_{busy}$ * delay per miss
= $(\lambda \cdot t_{busy} \cdot p_m) \cdot (t_{busy} / 2)$
where $\lambda$ = memory request rate of processor
[Figure: timeline showing $t_c$, $t_{c.miss}$, and $t_{m.miss}$; CPU blocked, then CPU executing, while memory is still busy]
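The interference estimate is a product of two terms: the expected number of misses arriving while memory is still busy, and an average delay of half the busy interval. A minimal sketch follows, with a hypothetical function name and sample numbers chosen only for illustration.

```python
def line_transfer_interference(lam, p_m, t_busy):
    """t_interference = (lam * t_busy * p_m) * (t_busy / 2).

    lam    -- memory request rate of the processor (requests per time unit)
    p_m    -- miss probability
    t_busy -- t_m.miss - t_c.miss, the time memory stays busy after the
              CPU has resumed execution
    """
    expected_misses = lam * t_busy * p_m  # misses arriving during t_busy
    delay_per_miss = t_busy / 2.0         # on average, half the interval remains
    return expected_misses * delay_per_miss


if __name__ == "__main__":
    # Hypothetical values: 0.5 requests/cycle, 10% misses, 10 busy cycles.
    print(line_transfer_interference(0.5, 0.1, 10.0))
```

The result grows with the square of $t_{busy}$, so shortening the busy interval pays off twice.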
-
Interference I/Os and writes
delay = prob that memory is busy when request arrives * average waiting period
What happens when memory is found to be busy serving one request and some other requests are waiting?
[Figure: timeline of request arrivals against memory busy periods, with requests marked served or waiting]
-
I/O Interference
$t_{IO\text{-}interference}$ = delay due to I/O contention
= probability that memory is occupied with I/O * average time taken to complete ongoing I/O
= (probability that memory is occupied with I/O) * $(t_{service} + t_{IO\text{-}wait}) / 2$
$t_{service}$ = time to service (block read/write time)
$t_{IO\text{-}wait}$ = waiting time
= 0, if CPU has a higher priority
$\neq 0$, otherwise (estimate using queuing model)
-
Write Interference Delay
$t_{w\text{-}interference}$ = probability that a write through is occupying the memory when a read miss occurs * average time taken to complete ongoing write
-
Memory performance using queuing model
[Figure: arrival of requests (from processor/cache), requests queued for service, servicing of requests (by memory)]
Statistical behaviour of arrivals?
Statistical behaviour of service?
Model nomenclature: arrival / service / number of servers
M / G / 1     G : General
M / M / 1     M : Poisson/Exponential
M / D / 1     D : Constant
MB / D / 1    MB : Binomial
-
Modeling memory requests
prob of a request in one cycle = $p$
prob of no request in one cycle = $1 - p$
prob of no request in $T/\tau$ cycles = $(1 - p)^{T/\tau}$
prob of at least one request in $T/\tau$ cycles = $1 - (1 - p)^{T/\tau}$
prob of $k$ requests in $n$ ($= T/\tau$) cycles = $\binom{n}{k} p^k (1 - p)^{n-k}$ (Binomial distribution)
expected no. of requests in $n$ cycles = $n p$
$T$ : interval (memory cycle time)
$\tau$ : processor cycle
-
Poisson Approximation
If processor cycles are small
(i.e., $\tau \to 0$, $p \to 0$, $n \to \infty$, $n p \to \lambda T$),
Binomial distribution $\to$ Poisson distribution, request rate = $\lambda$
prob of $k$ requests in interval $T$ = $\frac{(\lambda T)^k}{k!} e^{-\lambda T}$
expected no. of requests in interval $T$ = $\lambda T$
Interval between two consecutive requests has an exponential distribution, prob (inter-arrival interval $\le t$) = $1 - e^{-\lambda t}$
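The quality of the Poisson approximation can be checked numerically by comparing it with the binomial probability for many short cycles and a small per-cycle request probability. This is a small sketch; the values of n and p are arbitrary assumptions.

```python
import math


def binomial_pmf(k, n, p):
    """Probability of exactly k requests in n cycles, request prob p per cycle."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, mean):
    """Probability of exactly k requests when the expected count is `mean`."""
    return mean**k * math.exp(-mean) / math.factorial(k)


if __name__ == "__main__":
    n, p = 1000, 0.002  # hypothetical: n*p = 2 expected requests in the interval
    for k in range(6):
        print(k, binomial_pmf(k, n, p), poisson_pmf(k, n * p))
```

The two columns printed by the loop agree more closely as n grows with n*p held fixed.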
-
Modeling Service
Each request is served in constant time
e.g. cache write through requests,
cache block transfer requests
or Service time has an exponential distribution
e.g. I/O requests with varying block sizes where
small blocks are more common than large blocks
-
M / G / 1 Model
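Average waiting time = $T_w = \frac{1}{\lambda} \cdot \frac{\rho^2 (1 + c^2)}{2 (1 - \rho)}$
Average queue length = $Q = \frac{\rho^2 (1 + c^2)}{2 (1 - \rho)}$
where
$\rho$ = occupancy of server = $\lambda / \mu$
$\mu$ = average service rate
$c = \mu \sigma_t$
$\sigma_t^2$ = variance of service time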
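The M/G/1 results are easy to evaluate once the arrival rate, service rate, and service-time variability are known. The sketch below assumes the standard M/G/1 forms $Q = \rho^2 (1 + c^2) / (2 (1 - \rho))$ and $T_w = Q / \lambda$, with $c^2 = \mu^2 \sigma_t^2$; the function name is hypothetical.

```python
def mg1_wait_and_queue(lam, mu, c2):
    """Return (T_w, Q) for an M/G/1 queue.

    lam -- arrival (request) rate, must be > 0
    mu  -- average service rate
    c2  -- squared coefficient of variation of service time
           (1 for exponential service, 0 for constant service)
    """
    rho = lam / mu  # server occupancy
    if lam <= 0 or not rho < 1:
        raise ValueError("need lam > 0 and occupancy rho < 1")
    q = rho**2 * (1 + c2) / (2 * (1 - rho))  # average number waiting
    t_w = q / lam                            # average waiting time
    return t_w, q


if __name__ == "__main__":
    print("M/M/1:", mg1_wait_and_queue(0.5, 1.0, 1.0))
    print("M/D/1:", mg1_wait_and_queue(0.5, 1.0, 0.0))
```

At the same occupancy, constant service time halves both the wait and the queue length relative to exponential service.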
-
Special cases: M/M/1, M/D/1
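M/M/1: $c = 1$
Average waiting time = $T_w = \frac{1}{\lambda} \cdot \frac{\rho^2}{1 - \rho}$
Average queue length = $Q = \frac{\rho^2}{1 - \rho}$
M/D/1: $c = 0$
Average waiting time = $T_w = \frac{1}{\lambda} \cdot \frac{\rho^2}{2 (1 - \rho)}$
Average queue length = $Q = \frac{\rho^2}{2 (1 - \rho)}$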
-
M/D/1 with low server occupancy
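Average waiting time = $T_w = \frac{1}{\lambda} \cdot \frac{\rho^2}{2 (1 - \rho)}$
Average queue length = $Q = \frac{\rho^2}{2 (1 - \rho)}$
when $\rho$ is small, $T_w \approx \frac{1}{\lambda} \cdot \frac{\rho^2}{2} = \frac{1}{2} \lambda \left( \frac{1}{\mu} \right)^2$
Compare this with $\frac{1}{2} \lambda \, p_m \, t_{busy}^2$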
-
Designing buffer to hold the queue
How to design a buffer so that buffer overflow or stalling due to buffer full is within a certain limit?
For M/M/1 model,
prob(queue size $\ge$ buffer size BF) = $\rho^{BF+1}$
Choose BF so that this probability is below a desired value.
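Choosing BF amounts to finding the smallest buffer for which $\rho^{BF+1}$ falls to the target or below. The sketch below assumes the M/M/1 expression from this slide; the function name and the example target are hypothetical.

```python
def min_buffer_size(rho, max_prob):
    """Smallest BF such that rho**(BF + 1) <= max_prob (M/M/1 assumption).

    rho      -- server occupancy, 0 < rho < 1
    max_prob -- acceptable probability of the buffer being full, 0 < max_prob < 1
    """
    if not 0.0 < rho < 1.0 or not 0.0 < max_prob < 1.0:
        raise ValueError("rho and max_prob must lie strictly between 0 and 1")
    bf = 0
    while rho ** (bf + 1) > max_prob:
        bf += 1
    return bf


if __name__ == "__main__":
    # Hypothetical target: at most a 1% chance at 50% occupancy.
    print(min_buffer_size(0.5, 0.01))
```

Because the probability decays geometrically, the required buffer grows only logarithmically as the target probability shrinks, but it grows quickly as occupancy approaches 1.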
-
Open and Closed Queues
[Figure: arrival of requests (from processor/cache), requests queued for service, servicing of requests (by memory)]
Open queue: processor is not blocked by queuing delays and request rate remains unaffected
Closed queue: processor is blocked due to queuing delays and request rate drops
-
Open and Closed Queues
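[Figure: arrival of requests (from processor/cache), requests queued for service, servicing of requests (by memory)]
                   Requests queued       In service
Time               $T_w$                 $1/\mu$
Number (open)      $Q = \lambda T_w$     $\rho = \lambda / \mu$
Number (closed)    $Q_a$                 $\rho_a$
occupancy (open queue) = $\rho$
= occupancy (closed queue) + waiting (closed queue) = $\rho_a + Q_a$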
-
M/D/1 Closed Queue
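Reduced request rate = $\lambda_a$
Reduced occupancy = $\rho_a = \lambda_a / \mu$
Requests being served = $\rho_a$
Requests waiting = $\frac{\rho_a^2}{2 (1 - \rho_a)}$
$\rho = \rho_a + \frac{\rho_a^2}{2 (1 - \rho_a)}$
$\rho_a^2 - 2 (1 + \rho) \rho_a + 2 \rho = 0$
$\rho_a = (1 + \rho) - \sqrt{(1 + \rho)^2 - 2 \rho} = (1 + \rho) - \sqrt{1 + \rho^2}$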
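The closed-queue occupancy can be computed and checked against the balance it is meant to satisfy. This sketch assumes the relation (open-queue occupancy) = (closed-queue occupancy) + (requests waiting), with the M/D/1 waiting term $\rho_a^2 / (2 (1 - \rho_a))$; solving that quadratic for the root below 1 gives the closed form used here. Function names are hypothetical.

```python
import math


def closed_md1_occupancy(rho):
    """Achieved occupancy rho_a for offered occupancy rho (M/D/1 closed queue).

    Root of rho_a**2 - 2*(1 + rho)*rho_a + 2*rho = 0 that lies below 1.
    """
    if rho < 0:
        raise ValueError("rho must be non-negative")
    return (1 + rho) - math.sqrt(1 + rho * rho)


def requests_waiting(rho_a):
    """Average number of waiting requests at achieved occupancy rho_a."""
    return rho_a**2 / (2 * (1 - rho_a))


if __name__ == "__main__":
    rho = 0.5  # hypothetical offered occupancy
    rho_a = closed_md1_occupancy(rho)
    print(rho_a, requests_waiting(rho_a), rho_a + requests_waiting(rho_a))
```

The last printed value should reproduce the offered occupancy, which confirms that the chosen root satisfies the balance.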
-
Deriving queue length, wait time
Let $t_i$ = time when request $i$ is being served
$r_i$ = no. of arrivals during $t_i$
$n_i$ = queue length at the end of $t_i$, including item in service
Assume occupancy of server = $\rho = \lambda / \mu < 1$
process reaches a steady state
Expected value $E(t_i) = E(t) = T = 1/\mu$
$E(r_i) = E(r) = \lambda E(t) = \lambda / \mu = \rho$
$E(n_i) = E(n) = N$
-
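Relating $n_{i+1}$ to $n_i$
$n_{i+1} = n_i$ + arrivals - departures
two cases need to be considered:
i) $n_i \neq 0$
ii) $n_i = 0$
[Figure: queue holding requests $C_i$, $C_{i+1}$, $C_{i+2}$, $C_{i+3}$, with queue length $n_i$]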
-
When $n_i \neq 0$
$C_{i+1}$ arrived before $C_i$ left
$n_{i+1} = n_i + r_{i+1} - 1$
[Figure: timeline with $C_i$ served during $t_i$ and $C_{i+1}$ served during $t_{i+1}$; $C_i$ leaves, then $C_{i+1}$ leaves]
-
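When $n_i = 0$
$C_{i+1}$ arrived after $C_i$ left
$n_{i+1} = n_i + 1 + r_{i+1} - 1 = n_i + r_{i+1}$
[Figure: timeline with $C_i$ served during $t_i$, $C_i$ leaves, $C_{i+1}$ arrives later and is served during $t_{i+1}$, then $C_{i+1}$ leaves]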
-
Combining the two cases
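$n_{i+1} = n_i + r_{i+1} - 1 + \delta_i$
where $\delta_i = 0$ when $n_i \neq 0$, and
$\delta_i = 1$ when $n_i = 0$
note that $n_i \delta_i = 0$ and $\delta_i^2 = \delta_i$
$E(n_{i+1}) = E(n_i) + E(r_{i+1}) - 1 + E(\delta_i)$
in steady state, $E(n) = E(n) + E(r) - 1 + E(\delta)$
that is, $E(\delta) = 1 - E(r) = 1 - \rho$
prob($n \neq 0$) = $\rho$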
-
Combining the two cases
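$n_{i+1} = n_i + r_{i+1} - 1 + \delta_i$
$n_{i+1}^2 = n_i^2 + (r_{i+1} - 1)^2 + \delta_i^2 + 2 n_i (r_{i+1} - 1) + 2 (r_{i+1} - 1) \delta_i + 2 n_i \delta_i$
$n_{i+1}^2 = n_i^2 + (r_{i+1} - 1)^2 + \delta_i + 2 n_i (r_{i+1} - 1) + 2 (r_{i+1} - 1) \delta_i$
$E(n_{i+1}^2) = E(n_i^2) + E[(r_{i+1} - 1)^2] + E(\delta_i) + 2 E[n_i (r_{i+1} - 1)] + 2 E[(r_{i+1} - 1) \delta_i]$
$0 = E[(r - 1)^2] + E(\delta) + 2 E[n (r - 1)] + 2 E[(r - 1) \delta]$
$0 = E(r^2) - 2\rho + 1 + (1 - \rho) + 2 E(n) (\rho - 1) + 2 (\rho - 1)(1 - \rho)$
$2 E(n) (1 - \rho) = E(r^2) - 2\rho^2 + \rho$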
-
continued
$2 E(n) (1 - \rho) = E(r^2) - 2\rho^2 + \rho$
This is valid for G/G/1
$N = E(n) = \frac{E(r^2) - 2\rho^2 + \rho}{2 (1 - \rho)} = \rho + \frac{E(r^2) - \rho}{2 (1 - \rho)}$
Consider Poisson arrival
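$P(r_i) = \frac{(\lambda t_i)^{r_i}}{r_i!} e^{-\lambda t_i}$
mean $E(r_i) = \lambda t_i$
variance $\sigma_{r_i}^2 = \lambda t_i$
$\sigma_{r_i}^2 = E(r_i^2) - |E(r_i)|^2$
$E(r_i^2) = \sigma_{r_i}^2 + |E(r_i)|^2$
Take expectation over $i$
$E(r^2) = \lambda E(t) + \lambda^2 E(t^2)$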
continued
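mean $E(t) = 1/\mu$, variance $\sigma_t^2$
$E(t^2) = \sigma_t^2 + [E(t)]^2 = \sigma_t^2 + 1/\mu^2$
Recall $E(r^2) = \lambda E(t) + \lambda^2 E(t^2)$
Therefore, $E(r^2) = \lambda/\mu + \lambda^2 (\sigma_t^2 + 1/\mu^2) = \rho + \lambda^2 \sigma_t^2 + \rho^2$
$N = E(n) = \frac{E(r^2) - 2\rho^2 + \rho}{2 (1 - \rho)} = \rho + \frac{\rho^2 + \lambda^2 \sigma_t^2}{2 (1 - \rho)} = \rho + \frac{\rho^2 (1 + c^2)}{2 (1 - \rho)}$
where $c^2 = \mu^2 \sigma_t^2$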
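The derivation can be sanity-checked by simulating the recurrence $n_{i+1} = n_i + r_{i+1} - 1 + \delta_i$ directly. The sketch below does this for constant service time (M/D/1, $c^2 = 0$), so the number of arrivals during one service is Poisson with mean $\rho$. The sampler, function names, step count, and seed are all assumptions made for illustration.

```python
import math
import random


def poisson_sample(mean, rng):
    """Draw a Poisson variate by multiplying uniforms (suitable for small means)."""
    limit = math.exp(-mean)
    k = 0
    prod = rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_md1(rho, steps=200_000, seed=1):
    """Average of n_i over many departures for constant service time."""
    rng = random.Random(seed)
    n = 0
    total = 0
    for _ in range(steps):
        delta = 1 if n == 0 else 0     # delta_i = 1 only when the system is empty
        r = poisson_sample(rho, rng)   # arrivals during one service; mean = lam/mu = rho
        n = n + r - 1 + delta
        total += n
    return total / steps


def predicted_n(rho, c2=0.0):
    """N = rho + rho^2 (1 + c^2) / (2 (1 - rho))."""
    return rho + rho**2 * (1 + c2) / (2 * (1 - rho))


if __name__ == "__main__":
    print(simulate_md1(0.5), predicted_n(0.5))
```

The simulated average should land close to the closed-form value, with the gap shrinking as the number of steps increases.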
-
Direct Derivation for M/M/1
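$P(n; t)$ = prob that there are $n$ requests in the system at time $t$ (in queue + in service)
$P(n; t + \Delta t) = P(n; t)(1 - \lambda \Delta t - \mu \Delta t) + P(n-1; t) \lambda \Delta t + P(n+1; t) \mu \Delta t$
$P(0; t + \Delta t) = P(0; t)(1 - \lambda \Delta t) + P(1; t) \mu \Delta t$
Prob of more than one event in $\Delta t$ is neglected ($\Delta t^2$ term)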
-
Direct Derivation for M/M/1
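$dP(n; t)/dt = P(n; t)(-\lambda - \mu) + P(n-1; t) \lambda + P(n+1; t) \mu$
$dP(0; t)/dt = P(0; t)(-\lambda) + P(1; t) \mu$
In steady state, we can drop "; t" and derivatives tend to 0
$0 = P(n)(-\lambda - \mu) + P(n-1) \lambda + P(n+1) \mu$
$0 = P(0)(-\lambda) + P(1) \mu$
$\lambda P(n) - \mu P(n+1) = \lambda P(n-1) - \mu P(n)$
$\lambda P(0) - \mu P(1) = 0$
$\lambda P(n-1) - \mu P(n) = 0 \Rightarrow P(n) = \rho P(n-1)$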
-
Direct Derivation for M/M/1
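$P(n) = \rho P(n-1)$
$P(n) = \rho^n P(0)$
$\sum_{i=0}^{\infty} P(i) = 1 \Rightarrow P(0) \sum_{i=0}^{\infty} \rho^i = 1 \Rightarrow \frac{P(0)}{1 - \rho} = 1$
$P(0) = 1 - \rho$ and $P(n) = (1 - \rho) \rho^n$
$E(n) = \sum_{i=0}^{\infty} i P(i) = (1 - \rho) \sum_{i=0}^{\infty} i \rho^i = (1 - \rho) \frac{\rho}{(1 - \rho)^2} = \frac{\rho}{1 - \rho}$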
-
Direct Derivation for M/M/1
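Prob(items in queue + server $< k$) = $\sum_{i=0}^{k-1} P(i) = \sum_{i=0}^{k-1} (1 - \rho) \rho^i = \frac{(1 - \rho)(1 - \rho^k)}{1 - \rho} = 1 - \rho^k$
Prob(items in queue + server $\ge k$) = $\rho^k$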
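The M/M/1 results can be cross-checked by summing the geometric distribution numerically. The sketch below assumes the steady-state distribution $P(n) = (1 - \rho) \rho^n$ derived for M/M/1; the function names and the truncation length are hypothetical.

```python
def mm1_prob(n, rho):
    """Steady-state probability of n requests in the system (queue + server)."""
    return (1 - rho) * rho**n


def mm1_mean(rho):
    """Expected number in the system, rho / (1 - rho)."""
    return rho / (1 - rho)


def mm1_tail(k, rho):
    """Probability of k or more requests in the system, rho**k."""
    return rho**k


if __name__ == "__main__":
    rho = 0.5    # hypothetical occupancy
    terms = 200  # truncation point for the infinite sums
    print(sum(mm1_prob(i, rho) for i in range(terms)))      # close to 1
    print(sum(i * mm1_prob(i, rho) for i in range(terms)),  # close to mm1_mean
          mm1_mean(rho))
    print(sum(mm1_prob(i, rho) for i in range(3, terms)),   # close to mm1_tail(3)
          mm1_tail(3, rho))
```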