L18-CacheMem
Transcript of L18-CacheMem
8/6/2019 L18-CacheMem
http://slidepdf.com/reader/full/l18-cachemem 1/19
Cache
• Small amount of fast memory
• Sits between normal main memory and the CPU
• May be located on the CPU chip or module
• It is relatively expensive
Why is Cache Needed?
• Cache is a special, very high-speed memory used to increase processing speed by making current programs and data available to the CPU at a rapid rate.
• CPU logic is usually faster than main-memory access time.
• To compensate for the mismatch in operating speeds, an extremely fast and small cache is placed between the CPU and main memory.
• The access time of the cache is close to the processor logic clock cycle time.
• The cache is used to store segments of programs currently being executed by the CPU, and temporary data frequently needed for calculations.
Cache/Main Memory Structure
• Main memory consists of up to 2^n addressable words; each word has a unique n-bit address.
• For mapping purposes, this memory is considered to consist of a number of fixed-length blocks of K words each.
• Total number of blocks: M = 2^n / K
• The cache is divided into C slots; each slot can contain K words.
• The number of slots in the cache is considerably smaller than the number of blocks in main memory (C << M).
• An individual slot cannot be dedicated to a particular block, so each slot includes a tag that identifies which block is currently being stored. The tag is a portion of the main-memory address.
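As a quick worked example of the block arithmetic above (the sizes here are hypothetical, chosen only for illustration):

```python
# Hypothetical sizes: a 16-bit address space (n = 16), blocks of K = 4 words.
n = 16                   # address bits
K = 4                    # words per block
total_words = 2 ** n     # 2^n addressable words = 65536
M = total_words // K     # number of main-memory blocks = 16384
C = 128                  # assumed number of cache slots, C << M
print(total_words, M, C)
```

With these numbers the cache holds only 128 of the 16384 blocks at any time, which is why each slot needs a tag.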
Cache Operation - Overview
• The CPU requests the contents of a memory location.
• The cache is checked for this data.
• If present, the data is fetched from the cache (fast).
• If not present, the required block is read from main memory into the cache, then delivered from the cache to the CPU.
• The cache includes tags to identify which block of main memory is in each cache slot.
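The read sequence above can be sketched in Python. The dict-based cache and the names `read`, `main_memory`, and `block_size` are illustrative assumptions, not part of the slides:

```python
# Minimal sketch of a cache read: on a miss, the whole block containing
# the requested word is copied from main memory into the cache.
def read(address, cache, main_memory, block_size=4):
    block_no = address // block_size      # which block holds this word
    offset = address % block_size         # position of the word in the block
    if block_no in cache:                 # hit: serve directly from cache
        return cache[block_no][offset]
    start = block_no * block_size         # miss: fetch the entire block
    cache[block_no] = main_memory[start:start + block_size]
    return cache[block_no][offset]        # then deliver from the cache
```

A follow-up access to any word of the same block is then a hit, which is exactly why blocks (rather than single words) are transferred.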
Cache Read Operation - Flowchart
Hit Ratio:
• The performance of cache memory is measured in terms of the hit ratio.
• Hit ratio = number of hits / total CPU references to memory (hits + misses)
• When the CPU refers to memory and finds the word in the cache, it is said to produce a hit.
• If the word is not found in the cache, it is in main memory, and the reference counts as a miss.
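The hit-ratio formula above is direct to compute; this small helper (an illustrative sketch, not from the slides) applies it:

```python
def hit_ratio(hits, misses):
    """Hit ratio = hits / (hits + misses), i.e. hits over total references."""
    return hits / (hits + misses)

# e.g. 9 hits out of 10 total memory references:
print(hit_ratio(9, 1))   # 0.9
```

A well-tuned cache typically keeps this ratio close to 1, which is what makes the average access time approach the cache's own access time.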
Elements of Cache Design
1. Size
2. Mapping Function
3. Replacement Algorithm
4. Write Policy
5. Block Size
6. Number of Caches
1. Size:
• The cache should be small enough that the overall average cost per bit is close to that of main memory alone, and large enough that the overall average access time is close to that of the cache alone.
• The larger the cache, the larger the number of gates involved in addressing it; as a result, performance decreases, i.e. the cache becomes slower.
• Cache size also depends on the available chip and board area.
2. Mapping Function:
• The basic characteristic of a cache is its fast access time, so time should not be wasted searching for words in the cache.
• Because there are fewer cache lines than main-memory blocks, an algorithm is needed for mapping main-memory blocks into cache lines, i.e. a means of determining which main-memory block occupies a given cache line.
• The transformation of data from main memory to cache is referred to as the mapping process.
• The mapping function determines how the cache is organized.
• There are 3 types of mapping:
a) Associative mapping
b) Direct mapping
c) Set-associative mapping
[Figure: CPU connected to a cache memory (512 x 12) and a main memory (32K x 12)]
a) Associative Mapping:
- Any block location in the cache can store any block of memory -> most flexible
- The mapping table is implemented in an associative memory -> fast, but very expensive
- The mapping table stores both the address and the content of the memory word
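A fully associative cache can be sketched in Python with a dict keyed by the full block address, which plays the role of the tag. The class name, slot count, and the arbitrary eviction choice are illustrative assumptions:

```python
# Sketch of associative mapping: any block may occupy any slot, so the
# lookup compares the full block address (tag) against every entry.
class AssociativeCache:
    def __init__(self, num_slots):
        self.num_slots = num_slots
        self.slots = {}                       # block address (tag) -> data

    def lookup(self, block_addr):
        return self.slots.get(block_addr)     # None means a miss

    def fill(self, block_addr, data):
        if len(self.slots) >= self.num_slots and block_addr not in self.slots:
            # Cache full: evict some slot (arbitrary choice, for illustration;
            # a real cache would use a replacement policy such as LRU).
            self.slots.pop(next(iter(self.slots)))
        self.slots[block_addr] = data
```

In hardware this parallel compare against every tag is what the expensive content-addressable (associative) memory provides.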
b) Direct Mapping:
Addressing relationships between main and cache memories:
- Each memory block has only one place to load in the cache
- The mapping table is made of RAM instead of CAM
- An n-bit memory address consists of 2 parts: k bits of index field and n-k bits of tag field
- The full n-bit address is used to access main memory, and the k-bit index is used to access the cache
Operation
- The CPU generates a memory request (TAG; INDEX)
- The cache is accessed using INDEX, yielding (tag; data)
- TAG is compared with the stored tag
- If they match -> hit: Cache[INDEX](data) is provided to the CPU
- If they do not match -> miss: the required data is copied from main memory to the cache, then CPU <- Cache[INDEX](data)
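The direct-mapped operation above can be sketched as follows; the 9-bit index (512 lines, matching the 512-word cache in the figure) is an illustrative choice, and `access` is a hypothetical name:

```python
# Direct-mapped lookup: block B of main memory can live only in cache
# line B mod 2^k, so one tag compare decides hit or miss.
K_BITS = 9                        # k index bits -> 512 cache lines
NUM_LINES = 1 << K_BITS

cache = [None] * NUM_LINES        # each line holds (tag, data) or None

def access(address, memory):
    index = address & (NUM_LINES - 1)        # low k bits select the line
    tag = address >> K_BITS                  # remaining n-k bits are the TAG
    line = cache[index]
    if line is not None and line[0] == tag:  # stored tag matches -> hit
        return line[1], True
    cache[index] = (tag, memory[address])    # miss: load from main memory
    return memory[address], False
```

Note that two addresses differing only in their tag bits compete for the same line, which is the flexibility direct mapping gives up relative to associative mapping.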
[Figure: Direct-mapped cache with a block size of 8 words]
c) Two-Way Set-Associative Mapping:
- Each memory block has a set of locations in the cache where it can load
Operation
- The CPU generates a memory address (TAG; INDEX)
- The cache is accessed with INDEX; cache word = (tag 0, data 0); (tag 1, data 1)
- TAG is compared with tag 0 and then tag 1
- If tag i = TAG -> hit: CPU <- data i
- If tag i != TAG -> miss: replace either (tag 0, data 0) or (tag 1, data 1)
  Assume (tag 0, data 0) is selected for replacement
  (Why (tag 0, data 0) instead of (tag 1, data 1)?)
  M[tag 0, INDEX] <- Cache[INDEX](data 0)
  Cache[INDEX](tag 0, data 0) <- (TAG, M[TAG, INDEX])
  CPU <- Cache[INDEX](data 0)
- SET ASSOCIATIVE MAPPING -
3. Block Replacement Policy
- Many different block replacement policies are available
- LRU (Least Recently Used) is the easiest to implement
- Cache word = (tag 0, data 0, U0); (tag 1, data 1, U1), where Ui = 0 or 1 (binary)
- Implementation of LRU in set-associative mapping with set size = 2

Modifications:
- Initially, U0 = U1 = 1
- On a hit to (tag 0, data 0, U0): U1 <- 1 (way 1 becomes least recently used)
  (On a hit to (tag 1, data 1, U1): U0 <- 1)
- On a miss, find the least recently used way (Ui = 1):
  - If U0 = 1 and U1 = 0, replace (tag 0, data 0):
    M[tag 0, INDEX] <- Cache[INDEX](data 0)
    Cache[INDEX](tag 0, data 0, U0) <- (TAG, M[TAG, INDEX], 0); U1 <- 1
  - If U0 = 0 and U1 = 1, replace (tag 1, data 1): similar to above; U0 <- 1
  - U0 = U1 = 0: this condition does not exist
  - If U0 = U1 = 1, both are candidates; make an arbitrary selection
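The U-bit scheme above can be sketched for a single two-way set. The class name, the `fetch` callback standing in for a main-memory read, and the write-back of the victim being elided are all illustrative assumptions:

```python
# Sketch of 2-way LRU with one U bit per way: Ui = 1 marks way i as the
# least-recently-used replacement candidate.
class TwoWaySet:
    def __init__(self):
        self.ways = [None, None]   # each way holds (tag, data) or None
        self.u = [1, 1]            # initially both ways are candidates

    def access(self, tag, fetch):
        for i in (0, 1):
            if self.ways[i] and self.ways[i][0] == tag:   # hit on way i
                self.u[i] = 0
                self.u[1 - i] = 1  # the other way is now least recently used
                return self.ways[i][1], True
        # Miss: replace a way whose U bit is 1 (arbitrary pick if both are 1).
        victim = 0 if self.u[0] == 1 else 1
        self.ways[victim] = (tag, fetch(tag))
        self.u[victim], self.u[1 - victim] = 0, 1
        return self.ways[victim][1], False
```

With only two ways, a single bit per way suffices to order them by recency; larger set sizes need more state per way.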
4. Cache Write

Write-Through
- When writing into memory:
  If hit, both the cache and memory are written in parallel
  If miss, only memory is written
- For a read miss, the missing block may be loaded into a cache block
- Memory is always updated
  -> Important when the CPU and DMA I/O are both executing
- Slow, due to the memory access time

Write-Back (Copy-Back)
- When writing into memory:
  If hit, only the cache is written
  If miss, the missing block is brought into the cache and written there
- For a read miss, the candidate block must be written back to memory
- Memory is not up to date, i.e. the same item in the cache and memory may have different values
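The two write policies contrast cleanly in a sketch, assuming single-word blocks and dict-based storage; the function names and the `dirty` set (standing in for per-line dirty bits) are illustrative:

```python
# Write-through: every write updates memory; the cache is updated only on a hit.
def write_through(cache, memory, addr, value):
    if addr in cache:
        cache[addr] = value        # hit: update the cache copy ...
    memory[addr] = value           # ... and always update memory

# Write-back: writes go only to the cache; the line is marked dirty so it
# can be written back to memory later (e.g. when it is replaced).
def write_back(cache, dirty, memory, addr, value):
    if addr not in cache:          # write miss: bring the block into the cache
        cache[addr] = memory[addr]
    cache[addr] = value            # write only the cache copy
    dirty.add(addr)                # memory is stale until write-back
```

Write-through keeps memory consistent at the cost of a memory access per write; write-back is faster but requires tracking dirty lines, which matters when DMA devices read memory directly.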