Post on 05-Aug-2020
EECS 366: Computer Architecture
Instructor: Shantanu Dutt
Department of EECS
University of Illinois at Chicago
Lecture Notes # 16
Memory Organization
© Shantanu Dutt, UIC
Memory Hierarchy Design
• Many programs need large amounts of memory, as the size of the problems they solve increases. To solve the problem quickly, fast access is needed to all this data.
• One solution is, of course, to build very large fast memory units capable of storing 1000s of MBytes. As we saw, fast memory (static memory, for example) consumes too much VLSI area and power, so a large memory of this kind is impractical to realize.
• Furthermore, even if it becomes feasible to build large amounts of fast memory, it is well known that access to this memory gets slower as it gets larger.
• Fortunately, there is a way out! Because of the locality property of most programs, it is not necessary to have large amounts of fast memory for quick access to large amounts of data:
  (1) Temporal Locality: An item just referenced will be referenced again soon.
  (2) Spatial Locality: When an item is referenced, nearby items in memory will also be referenced soon.
Memory Hierarchy Design (contd.)
• What these locality properties mean is that programs use a physically contiguous block of data for some period of time before moving on to another block of data.
• Thus we can build very fast memory that is just large enough to store the small block of data that the program is currently working on. This is the 1st level of the memory hierarchy, and is the register file in the CPU.
• The next block of data that the program will move to has to be retrieved from the next level of the memory hierarchy, which has the 2nd fastest and 2nd smallest memory unit. This is the cache.
• Note that just as there is locality for individual data items (words), there is also locality between small blocks and between groups of these small blocks (larger blocks), and so on.
• Thus more levels are required that hold larger and larger blocks until the last level holds the entire data: the 3rd level is main memory and the 4th level is secondary/disk storage.
• Block size gets larger as one goes down the hierarchy, mainly because access time to the lower level increases, and thus we need to spread this access time over more words.
Memory Hierarchy Design (contd.)
• In principle, there can be any number of levels in the memory hierarchy, as shown below.
[Figure: The Memory Hierarchy. Levels toward the top are faster and more expensive; levels toward the bottom are slower and less expensive.]
Memory Hierarchy Design (contd.)
• The data in an upper level is generally a subset of the data contained in the next lower level, and also belongs to the overall memory address space.
• An exception is the register level, not all of whose data may be contained in the cache at all times. Also, the register file is not part of the memory address space: registers are addressed by a different address that pertains to the register file only, and data transfers between the register file and the lower levels are handled explicitly by the program using LOADs and STOREs.
• The rest of the levels share a common memory address space, and data transfers between them are "automatic" and transparent to the program: they are handled either by hardware (cache/main-memory hierarchy) or by the operating system (main-memory/secondary-storage hierarchy).
Memory Hierarchy Design (contd.)
General Definitions and Principles of Memory Hierarchy
Consider any 2 adjacent levels $i$ and $i+1$ in the memory hierarchy:
• Block: Minimum amount of data (in # of words) that can be transferred between the 2 levels.
[Figure: Processor, Level 1, Level 2 and Level 3, showing the blocks of level 1 and the blocks of level 2.]
• Hit rate: Fraction of memory accesses to the upper level (of the 2-level sub-hierarchy) that are found in that level; denoted by $h_i$.
• Miss rate: Fraction of accesses that are not found in the upper level ($= 1 - h_i$); denoted by $m_i$.
• Hit time: Time taken to access a block in the upper level; denoted by $t_{h,i}$.
General Definitions and Principles of Memory Hierarchy (contd.)
Consider any 2 adjacent levels in the memory hierarchy:
• Miss penalty: Time to replace a block in the upper level by a needed block that is not in that level. Since there can be hits or misses at the lower levels while obtaining the required block, the miss penalty for the upper-most level (level 1) of an $n$-level hierarchy is given by:
  \[ \text{miss\_penalty}_1 = \sum_{i=1}^{n-1} \Big( \prod_{j=2}^{i} m_j \Big)\, t_{i,i+1} = t_{1,2} + m_2\, t_{2,3} + m_2 m_3\, t_{3,4} + \cdots \]
  where $m_i$ is the miss rate in level $i$, and $t_{i,i+1}$ is the block replacement time from level $i+1$ to level $i$.
• The average memory access time $t_{av}$ for the CPU is given by
  \[ t_{av} = t_{h,1} + m_1 \cdot \text{miss\_penalty}_1 = t_{h,cache} + m_{cache} \cdot \text{miss\_penalty}_{cache} \]
• The block replacement time $t_{i,i+1}$ = access time $t_{i,i+1,acc}$ (time to access the 1st word of the block in the lower level $i+1$) + transfer time $(b_i - 1)\, t_{i,i+1,tran}$ (time to access the remaining words), where $b_i$ is the block size in the upper level $i$ and $t_{i,i+1,tran}$ is the per-word transfer time from level $i+1$ to level $i$.
• For e.g., there is an initial time $t_{MM,srch}$ required to search for the block/page location in main memory (MM), and further, due to refreshing, we saw that the average time to access MM is $t_{MM,av}$ [its defining expression is illegible in the source]. Then the initial access time to MM is:
  \[ t_{MM,acc} = t_{MM,srch} + t_{MM,av} \]
• However, the entire row is stored in the row register after spending $t_{MM,av}$ time to access the word, and the required block is part of this row. Thus the rest of the words in the block can be sent in approximately the time needed to read one word out of the row register, and $t_{MM,tran}$ is this per-word time.
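• Example: There are 3 levels in the memory hierarchy: cache, MM, secondary storage. The following are values of the above parameters: $t_{h,1} = 2$ cc's, $h_1 = 0.95$ (so $m_1 = 0.05$), cache block size = 4 words, $t_{MM,acc} = 9$ cc's, $t_{MM,tran} = 2$ cc's, MM miss rate $m_2 = 10^{-5}$, secondary-storage access time $t_{2,3,acc} = 40{,}000$ cc's, secondary-storage transfer time $t_{2,3,tran} = 20$ cc's, MM page size = 2K = $2^{11}$ words.
  Then, the average time taken by the CPU to access a word is:
  \[ t_{av} = t_{h,1} + m_1 \left[ t_{1,2} + m_2\, t_{2,3} \right] = 2 + 0.05 \left[ (9 + 2 \times 3) + 10^{-5} (40{,}000 + 2047 \times 20) \right] \approx 2 + 0.75 + 0.05 \times 0.8 \approx 2.79 \text{ cc's} \]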
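The arithmetic of this example can be checked with a short script. This is only a sketch: the function names are made up for illustration, and the parameter values (2 cc's hit time, 0.05 cache miss rate, 9 and 2 cc's for MM access and transfer, 10^-5 MM miss rate, 40,000 and 20 cc's for secondary storage, 4-word blocks and 2048-word pages) are the values as read from the example above.

```python
def block_replacement_time(t_acc, t_tran, block_words):
    """First word costs t_acc; each of the remaining words costs t_tran."""
    return t_acc + (block_words - 1) * t_tran


def miss_penalty(replacement_times, lower_miss_rates):
    """Miss penalty seen by level 1.

    replacement_times[k] is the block replacement time between levels k+1
    and k+2; lower_miss_rates[k] is the miss rate of level k+2.
    """
    penalty, reach_probability = 0.0, 1.0
    for k, t_rep in enumerate(replacement_times):
        penalty += reach_probability * t_rep
        if k < len(lower_miss_rates):
            reach_probability *= lower_miss_rates[k]
    return penalty


def average_access_time(hit_time, miss_rate, penalty):
    return hit_time + miss_rate * penalty


# Values as read from the example (assumptions, not measurements).
t_cache_mm = block_replacement_time(t_acc=9, t_tran=2, block_words=4)
t_mm_disk = block_replacement_time(t_acc=40_000, t_tran=20, block_words=2048)
penalty_1 = miss_penalty([t_cache_mm, t_mm_disk], [1e-5])
t_av = average_access_time(hit_time=2, miss_rate=0.05, penalty=penalty_1)
print(t_cache_mm, t_mm_disk, round(t_av, 2))
```

The secondary-storage term contributes only about 0.04 cc's here because a page fault is so rare, even though a single page replacement costs roughly 81,000 cc's.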
General Definitions and Principles of Memory Hierarchy (contd.)
Consider any 2 adjacent levels in the memory hierarchy:
• Addressing:
[Figure: address formats.
  Generic: | Block frame address (or Block # or Page #) | Block offset addr. (or Word #) |
  Cache/main-memory hierarchy (virtual addr.): | Block # (28 bits), bits 31-4 | Word #, bits 3-0 |. Block size is 16 words.
  Main-memory/secondary-storage hierarchy (virtual addr.): | Page # (22 bits), bits 31-10 | Word #, bits 9-0 |. Page size is 1K words.
  After translation, physical addr.: | Page # (14 bits), bits 23-10 | Word #, bits 9-0 |.
  Corresponding physical addr. of the cache: | Block # (20 bits), bits 23-4 | Word #, bits 3-0 |, with the block offset lying within a page.]
General Definitions and Principles of Memory Hierarchy (contd.)
Effect of Block Size:
• The larger the block size, the better the anticipation of nearby items to be referenced soon (spatial locality).
• However, beyond a certain block size, the concept of spatial locality is stretched. Note that while a program may access almost all items in a small or medium-size block, it later accesses a random next block, not necessarily the one following the current one: spatial locality is punctuated by random accesses (for ex., due to branches).
• Thus a large block will contain many useless data items that the program might not access in the near future. Since the space in the upper level is limited, the larger the block size, the smaller the # of blocks. Hence the miss rate increases when the next random block is accessed by the program.
Effect of Block Size (contd.)
[Figure, three panels:
  (a) Program structure: four consecutive regions A, B, C, D of 16 words each, with transition probabilities 0.95, 0.05, 1, 0.9 and 0.1 marked between them.
  (b) Miss pattern with block size = 32 words: initial A access, miss, A & B loaded; work on A; next access is C, miss, C & D loaded; work on C; next access is A, miss, A & B loaded. 2 misses per iteration.
  (c) Miss pattern with block size = 16 words: initial A access, miss, A loaded (other block empty); work on A; next access is C, miss, C loaded; work on C; next access is A, hit; work on A; next access is C, hit. 0 misses per iteration.]
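The two miss patterns in the figure can be reproduced with a small simulation. This is a sketch under stated assumptions: the upper level holds 32 words in total (one 32-word block or two 16-word blocks), it is fully associative with LRU replacement, the regions A to D occupy words 0-63 in order, and the program simply alternates between A and C. The function name `count_misses` is illustrative.

```python
from collections import OrderedDict


def count_misses(word_addresses, capacity_words, block_words):
    """Count misses in a small fully associative, LRU-managed upper level."""
    num_blocks = capacity_words // block_words
    resident = OrderedDict()              # block number -> None, LRU first
    misses = 0
    for addr in word_addresses:
        block = addr // block_words
        if block in resident:
            resident.move_to_end(block)   # mark as most recently used
        else:
            misses += 1
            if len(resident) >= num_blocks:
                resident.popitem(last=False)   # evict the LRU block
            resident[block] = None
    return misses


# Assumed layout: A = words 0-15, B = 16-31, C = 32-47, D = 48-63.
A = list(range(0, 16))
C = list(range(32, 48))
trace = (A + C) * 10      # ten iterations of "work on A, then work on C"

misses_32 = count_misses(trace, capacity_words=32, block_words=32)
misses_16 = count_misses(trace, capacity_words=32, block_words=16)
print(misses_32, misses_16)
```

With 32-word blocks every switch between A and C evicts the only resident block, giving 2 misses per iteration; with 16-word blocks only the two initial misses occur.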
Effect of Block Size (contd.)
\[ t_{av} = t_{h,1} + m_1 \times \text{miss\_penalty}_1 \]
where $t_{av}$ is the average memory access time.
[Figure: three plots against block size. (1) Miss penalty vs. block size. (2) Miss rate vs. block size, with the "pollution point" marked. (3) Average access time vs. block size, with t_av = hit_time + (miss_rate)(miss_penalty); the increase happens earlier than in the "miss rate" plot.]
General Definitions and Principles of Memory Hierarchy (contd.)
• What the CPU does on a miss in the upper level:
  (1) If the miss penalty is a few 10s of clock cycles (cc's), then the CPU waits (ex., cache miss).
  (2) If the miss penalty is 100s to 1000s of cc's (as in a main-memory miss or page fault), the CPU is interrupted on a miss, and another process starts executing. When the requested block is brought in, this is noted in the previous process's status, so that it can resume execution at a later stage (when the current process is done or it also has a miss).
• Block transfer mechanism:
  (1) Done in hardware for a penalty of a few 10s of cc's (cache).
  (2) Done in software (the O.S. could do this) for a main-memory miss: the O.S. sets up the appropriate disk interface for a DMA and leaves the CPU; the CPU executes another process, while the transfer from disk to main memory takes place simultaneously.
Some Basic Issues in Memory Hierarchies
Again we consider 2 adjacent levels of the hierarchy:
1. Block Placement: Where can a block be placed in the upper level?
2. Block Identification: How is a block found in the upper level?
3. Block Replacement: Which block to replace during a miss?
4. Write Strategy: What happens on a write to the upper level, and how is this percolated to the lower level?
Some Basic Issues in Memory Hierarchies (contd.)
(1) Block Placement:
• Fully Associative (FA): Can place anywhere; have to look everywhere.
• Set Associative (SA): The upper level is divided into $s$ sets $0, \ldots, s-1$, each containing $n$ blocks ($n$-way set associative). A block with block # $j$ is placed only in set $j \bmod s$; it can be placed anywhere in this set.
• Direct Mapped (DM): The upper level is divided into $N$ blocks $0, \ldots, N-1$, and a block with block # $j$ is placed only in block $j \bmod N$; $N$ is generally a power of 2, say, $2^k$. Only 1 block position needs to be looked at for the required block.
[Figure: placement of block 14 from a lower level (blocks numbered 0 to 31) into an 8-block upper level (blocks 0-7).
  Fully associative (FA): block 14 can go anywhere.
  Direct mapped (DM): block 14 can only go into block 14 mod 8 = 6.
  2-way set associative (SA), sets 0-3: block 14 can go anywhere in set 14 mod 4 = 2.]
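The three placement rules are the same computation with different set sizes, which a few lines of code make concrete. A minimal sketch, assuming that each set occupies consecutive block positions of the upper level (set 0 holds positions 0 and 1 in the 2-way case, and so on); the function name `candidate_frames` is illustrative.

```python
def candidate_frames(block_number, num_frames, ways):
    """Return the upper-level positions where a block may be placed.

    ways == 1 gives direct mapped; ways == num_frames gives fully associative.
    """
    if num_frames % ways != 0:
        raise ValueError("num_frames must be a multiple of ways")
    num_sets = num_frames // ways
    set_index = block_number % num_sets
    first = set_index * ways
    return list(range(first, first + ways))


print(candidate_frames(14, 8, 1))   # direct mapped
print(candidate_frames(14, 8, 2))   # 2-way set associative
print(candidate_frames(14, 8, 8))   # fully associative
```

For block 14 and an 8-block upper level this yields position 6 for DM, the two positions of set 2 for 2-way SA, and all eight positions for FA.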
Some Basic Issues in Memory Hierarchies
(1) Block Placement (contd.):
• FA and DM are special cases of set-associative. In FA, there is only one set containing all $N$ blocks. In DM, there are $N$ sets, each containing exactly 1 block.
• FA has the most flexibility in placing a block, while DM has the least.
Some Basic Issues in Memory Hierarchies (contd.)
(2) Block Identification:
• Associative or content-addressable memory (CAM): stores the block # or tags of resident blocks for each set. The index, which is the $\log_2 s = l$ rightmost bits of the block #, determines which set of the CAM to search for the rest of the block # (the tag). This is generally used in the cache/main-memory hierarchy.
[Figure (a): Block identification in different cache types, for block 14 and an 8-block cache holding a tag and data per block.
  Direct mapped (DM): search only in tag position 14 mod 8 = 6.
  2-way set associative (SA), sets 0-3: search everywhere within set 14 mod 4 = 2.
  Fully associative (FA): search everywhere.
  Search is performed in parallel in FA and SA caches for speed.]
[Figure (b): Different portions of an address: | Block # = Tag, Index | Block offset / Word # |. The index (block # mod s) is used to select the set (in DM and SA), the tag is used to check all blocks in the "indexed" set, and the word # is used to select the word in the block.]
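Splitting an address into tag, index and word # is a matter of shifts and masks. The sketch below assumes power-of-2 block and set counts and word addressing; `split_address` and its parameter names are illustrative, not taken from the notes.

```python
def split_address(address, word_bits, index_bits):
    """Split a word address into (tag, index, word #)."""
    word = address & ((1 << word_bits) - 1)
    block_number = address >> word_bits
    index = block_number & ((1 << index_bits) - 1)   # block # mod number of sets
    tag = block_number >> index_bits                 # rest of the block #
    return tag, index, word


# Word 3 of block 14, with 16-word blocks.
addr = 14 * 16 + 3
print(split_address(addr, word_bits=4, index_bits=2))  # 4 sets: 2-way SA, 8 blocks
print(split_address(addr, word_bits=4, index_bits=3))  # 8 sets: DM, 8 blocks
print(split_address(addr, word_bits=4, index_bits=0))  # 1 set: FA
```

With no index bits (FA) the whole block # serves as the tag, which is why every entry has to be searched.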
Some Basic Issues in Memory Hierarchies (contd.)
(2) Block Identification: CAMs
Structure of a CAM:
[Figure: a fully-associative cache built from a tag store and a data store with 2^r entries. Each tag-store entry holds an m-bit tag and a valid bit (1/0) and feeds an equality comparator that compares the stored tag with the tag of the address; if no comparator reports a match, a miss is signalled. The data store holds 16 words/block, and the word # selects the desired word of the matching block.
  Note: Search logic replaces a regular decoder.
  Note: The valid bit is present in the tag store and is ANDed with the O/P of the corresponding equality comparator.
  Equality comparator (inputs x & a): compares bits x0-x7 with a0-a7; output 1: equal, 0: not equal.]
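The behaviour of the search logic can be modelled in software as follows. This is a sketch only: the comparator is modelled as a per-bit XNOR followed by an AND over all bits, which is one common way to build an equality comparator and is an assumption here, and the loop stands in for comparisons that the hardware performs in parallel. The names `bits_equal` and `cam_search` are illustrative.

```python
def bits_equal(x, a, width=8):
    """Equality comparator: 1 if the width-bit inputs match, else 0."""
    mask = (1 << width) - 1
    xnor = ~(x ^ a) & mask          # bit i is 1 where x and a agree
    return 1 if xnor == mask else 0  # AND over all bit positions


def cam_search(tag_store, wanted_tag, width=8):
    """Return the position of the matching valid entry, or None on a miss.

    tag_store is a list of (valid_bit, tag) pairs.
    """
    for position, (valid, tag) in enumerate(tag_store):
        if valid & bits_equal(tag, wanted_tag, width):   # valid AND equal
            return position
    return None


tags = [(1, 0x2A), (0, 0x0E), (1, 0x0E), (1, 0x7F)]
print(cam_search(tags, 0x0E))   # entry 1 holds the tag but is invalid
print(cam_search(tags, 0x55))   # no valid entry holds this tag
```

The invalid entry at position 1 is skipped even though its tag matches, which is the role of ANDing the valid bit with the comparator output.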
Some Basic Issues in Memory Hierarchies (contd.)
CAMs:
• Hardware Complexity: Of parallel search logic = $O(2m \cdot 2^r)$ for a FA cache, where $2^r$ is the size of the cache in blocks, and $m$ is the # of bits in the block #. This can be prohibitive for large $m$ and $r$.
• For an SA cache, we have one such CAM of size $2^{r-l} \times (m-l)$ for each of the $s = 2^l$ sets. So the total CAM size is $2^r \times (m-l)$. However, there is only one parallel search logic of size $2^{r-l} \times (m-l)$, which is used to search only the indexed set.
[Figure: organization of an SA cache.
  Address fields: Block # (20 bits, bits 23-4) = Tag (15 bits, bits 23-9) + Index (5 bits, bits 8-4); Word # (bits 3-0).
  Parameters: m = 20, l = 5, r = 10. Cache size = 2**r = 1024 blocks; # of sets = 32, set size = 32 blocks.
  The index drives an l-to-2**l = 5-to-32 decoder that selects set # 0 to 2**l - 1 = 31 in both the tag store and the data store. Each set holds 2**(r-l) = 32 tags of m - l = 15 bits, which the search logic compares with the tag of the address; a 2**(r-l)-to-1 = 32-to-1 mux then selects the 512-bit data block.]
• There is only one equality comparator in a DM cache; thus the complexity is $O(2(m-r))$.
• Time complexity of search: $O(\log m)$ for FA, $O(\log(m-l))$ for SA, and $O(\log(m-r))$ for DM.
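To get a feel for these orders of magnitude, the expressions can be evaluated for the parameters of the figure (m = 20, r = 10, l = 5). The sketch below drops constant factors, counts search logic in compared tag bits, and treats FA as l = 0 and DM as l = r; the function name `cam_costs` and the dictionary keys are illustrative.

```python
import math


def cam_costs(m, r, l):
    """Rough costs for a cache of 2**r blocks with m-bit block #s.

    l is the number of index bits (2**l sets): l == 0 is fully associative,
    l == r is direct mapped. Constant factors are ignored.
    """
    tag_bits = m - l
    blocks_per_set = 2 ** (r - l)
    return {
        "tag_store_bits": (2 ** r) * tag_bits,
        "search_logic_bits": blocks_per_set * tag_bits,  # one set searched at a time
        "search_depth": math.log2(tag_bits),             # levels in the AND tree
    }


fa = cam_costs(m=20, r=10, l=0)
sa = cam_costs(m=20, r=10, l=5)
dm = cam_costs(m=20, r=10, l=10)
for name, costs in (("FA", fa), ("SA", sa), ("DM", dm)):
    print(name, costs["tag_store_bits"], costs["search_logic_bits"],
          round(costs["search_depth"], 2))
```

For these values the search logic shrinks from 20480 compared bits (FA) to 480 (SA) to 10 (DM), while the search depth changes only slightly.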
Some Basic Issues in Memory Hierarchies
(2) Block Identification (contd.):
• Lookup table: Stores the tags also by sets, as in the CAM. However, this is a regular kind of memory, and is of the same technology as the upper level. Thus 2 memory accesses to the upper level are required to get a word from there. This is generally used in the main-memory/secondary-storage hierarchy.
• The table size is proportional to the total size in blocks of the lower level. This is different from the case in which a CAM is used as the "lookup table", where the size is proportional to the size in blocks of the upper level.
[Figure: Lookup table, indexed by the block # of the address. Each entry holds a present bit, a dirty bit, and the block's location in the current level.]
Some Basic Issues in Memory Hierarchies (contd.)
(3) Block Replacement Policy: Which block in the set to replace? There is no choice in a DM cache, so the question applies to FA and SA caches. The following policies can be used for each set; the usage-based policies rely on temporal locality to predict which block will be accessed furthest in the future.
• Least Frequently Used (LFU): Note the # of times each block has been used over some window of time and replace the one used the least # of times. Most expensive to implement.
• Least Recently Used (LRU): Keep the blocks in each set ordered by the time of their most recent use. Whenever a block is accessed in the set, move it to the top of the list. Replace the block at the bottom. 2nd most expensive, but best performance.
[Figure: Implementation of the LRU scheme. Tag and data entries are kept in order from LRU to MRU; on an access, the accessed entry moves to the MRU end and the entries after it are shifted left by 1 block. LRU is performed over the entire cache for FA or within the accessed set for SA.]
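The shift-and-move behaviour of the LRU scheme can be sketched with a plain list, where index 0 plays the role of the LRU end and the last index the MRU end. The class name `LRUSet` is illustrative, and one instance models either one set of an SA cache or a whole FA cache.

```python
class LRUSet:
    """One cache set (or a whole FA cache) with LRU replacement."""

    def __init__(self, ways):
        self.ways = ways
        self.blocks = []                   # index 0 = LRU end, last = MRU end

    def access(self, tag):
        """Access a block; return (hit, evicted_tag_or_None)."""
        if tag in self.blocks:
            self.blocks.remove(tag)        # later entries shift left by one
            self.blocks.append(tag)        # accessed block moves to the MRU end
            return True, None
        evicted = None
        if len(self.blocks) == self.ways:
            evicted = self.blocks.pop(0)   # replace the LRU block
        self.blocks.append(tag)
        return False, evicted


lru_set = LRUSet(ways=2)
results = [lru_set.access(t) for t in [2, 6, 2, 6, 4, 4, 2]]
print(results)
print(lru_set.blocks)
```

In this trace block 4 evicts block 2 (the LRU entry at that point), and the later access to block 2 then evicts block 6.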
Some Basic Issues in Memory Hierarchies (contd.)
(3) Block Replacement Policy (contd.):
• Not Recently Used (NRU): Just point to the block used most recently. Replace any of the other blocks. 3rd most expensive in hardware and time, and worst performance.
• Random: Randomly choose any block to replace. Least expensive to implement (especially in time; this has to be done only when there is a miss), and 3rd best performance (after LRU).
Some Basic Issues in Memory Hierarchies (contd.)
(4) Write Strategy: What happens on a write?
• On a write hit:
  1. Write Back: Write to the lower level when the block is replaced and if its "dirty" bit is set. This bit is set whenever we write to a block in the upper level. This is generally used when the access time to the lower level is high.
  2. Write Through: Write to both levels simultaneously, thus keeping them always consistent.
• On a write miss:
  1. Write Allocate: Load the block being written to into the upper level. Again, this is generally done when the access time to the lower level is high.
  2. No Write Allocate: The block is not loaded into the upper level. The rationale is that reads and writes do not have the same sphere of spatial locality, and, as explained later, the CPU generally does not have to wait for writes (i.e., STOREs).
• The combinations generally used on write hit/miss are 1/1 and 2/2. The latter is used mainly for the cache/main-memory hierarchy and the 1/1 combination for the main-memory/secondary-storage hierarchy (because of the larger access time).
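The difference between the two combinations shows up in how many writes reach the lower level. The following sketch counts them for a small fully associative, LRU-managed upper level; the function name, the trace format of (op, block) pairs and the choice of LRU are assumptions made for illustration. Note that a write-back transfers a whole block while a write-through transfers a single word, so the counts are not in the same units.

```python
from collections import OrderedDict


def lower_level_writes(ops, num_blocks, write_back):
    """Count writes sent to the lower level for a trace of (op, block) pairs.

    write_back=True  -> write back + write allocate (combination 1/1)
    write_back=False -> write through + no write allocate (combination 2/2)
    """
    cache = OrderedDict()                # block -> dirty flag, LRU first
    writes = 0

    def load(block, dirty):
        nonlocal writes
        if len(cache) >= num_blocks:
            _, was_dirty = cache.popitem(last=False)
            if was_dirty:
                writes += 1              # dirty victim is written back
        cache[block] = dirty

    for op, block in ops:
        hit = block in cache
        if hit:
            cache.move_to_end(block)
        if op == "R":
            if not hit:
                load(block, False)
        elif write_back:
            if hit:
                cache[block] = True      # only set the dirty bit
            else:
                load(block, True)        # write allocate, then mark dirty
        else:
            writes += 1                  # write through, on a hit or a miss
    return writes


trace = [("W", 1)] * 5 + [("R", 2), ("R", 3), ("W", 1)]
wb_writes = lower_level_writes(trace, num_blocks=2, write_back=True)
wt_writes = lower_level_writes(trace, num_blocks=2, write_back=False)
print(wb_writes, wt_writes)
```

Repeated writes to the same block are absorbed by the dirty bit under write back (one block write when the block is evicted), whereas write through sends all six writes down.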
More About Caches
• Made from SRAMs
• Source of cache misses:
  (1) Compulsory: The 1st access to a block will result in a miss ("cold start miss").
  (2) Capacity: Caches cannot contain all blocks needed during a program's execution.
  (3) Conflict or Collision: Occurs when (a) too many referenced blocks map to the same set, and/or (b) the set size is very small (for e.g., in DM caches).
Source of cache misses (contd.)
[Figure: miss classification for a 2-way SA cache (size = 4 blocks; Set 0 and Set 1) using LRU within sets.]
Block # accessed:   2, 2, 3, 3, 5, ..., 6, 2, 6, 4, 4, 2, 7, ..., 3
Access class:       Cm, h, Cm, h, Cm, ..h's.., Cm, h, h, Cm, h, Cn, Cm, ..h's.., Cp
Block # replaced (using LRU in sets):   -, -, -, -, -, ..., -, ..., 2, -, 6, 3, ..., 5
Global LRU block?:  N, Y, Y, N (one flag per replaced block; column alignment lost in the source)
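The access classes in this trace can be reproduced with a short simulation. It is a sketch under two assumptions: the labels Cm, Cn and Cp are read as compulsory, conflict and capacity misses, and a non-compulsory miss is called a conflict miss if a fully associative LRU cache of the same total size would have hit, and a capacity miss otherwise (a common convention, not necessarily the exact rule used in the notes). The runs of hits elided as "..h's.." are left out of the trace.

```python
from collections import OrderedDict


def lru_access(lru, key, capacity):
    """Access key in an LRU-ordered dict; return True on a hit."""
    if key in lru:
        lru.move_to_end(key)
        return True
    if len(lru) >= capacity:
        lru.popitem(last=False)          # evict the least recently used key
    lru[key] = None
    return False


def classify(trace, num_sets, ways):
    sets = [OrderedDict() for _ in range(num_sets)]
    fully_assoc = OrderedDict()          # same capacity, no set restriction
    seen = set()
    labels = []
    for block in trace:
        sa_hit = lru_access(sets[block % num_sets], block, ways)
        fa_hit = lru_access(fully_assoc, block, num_sets * ways)
        if sa_hit:
            labels.append("h")
        elif block not in seen:
            labels.append("Cm")          # compulsory: first access ever
        elif fa_hit:
            labels.append("Cn")          # conflict: only the set mapping missed
        else:
            labels.append("Cp")          # capacity: FA cache misses too
        seen.add(block)
    return labels


trace = [2, 2, 3, 3, 5, 6, 2, 6, 4, 4, 2, 7, 3]
labels = classify(trace, num_sets=2, ways=2)
print(labels)
```

Block 2 is a conflict miss because it was pushed out of set 0 while older blocks still sat in set 1; block 3 is a capacity miss because four other blocks were touched since its last use.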
Source of cache misses (contd.)
More About Caches (contd.)
• Effect of block size:
\[ t_{av} = t_{h,1} + m_1 \times \text{miss\_penalty}_1 \]
More About Caches (contd.)
• Separate data and instruction caches. These can have different block sizes, capacities and associativities to optimize performance.