L18-CacheMem


Transcript of L18-CacheMem

Page 1: L18-CacheMem

8/6/2019 L18-CacheMem

http://slidepdf.com/reader/full/l18-cachemem 1/19

Cache

• Small amount of fast memory

• Sits between normal main memory and CPU

• May be located on CPU chip or module

• It is relatively expensive

Page 2: L18-CacheMem


Why is Cache Needed?

• Cache is a special, very high-speed memory used to increase the speed of processing by making current programs and data available to the CPU at a rapid rate.

• CPU logic is usually faster than main memory access time.

• To compensate for this mismatch in operating speeds, an extremely fast and small cache is placed between the CPU and main memory.

• The access time of the cache is close to the processor logic clock cycle time.

• The cache is used to store segments of programs currently being executed in the CPU, and temporary data frequently needed in calculations.

Page 3: L18-CacheMem


Cache/Main Memory Structure

Page 4: L18-CacheMem


• Main memory consists of up to 2^n addressable words; each word has a unique n-bit address.

• For mapping purposes, this memory is considered to consist of a number of fixed-length blocks of K words each.

• Total number of blocks: M = 2^n / K

• The cache is divided into C slots; each slot can contain K words.

• The number of slots in the cache is considerably smaller than the number of blocks in main memory (C << M).

• An individual slot cannot be dedicated to a particular block, so each slot includes a tag that identifies which block is currently being stored. The tag is a portion of the main memory address.
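The block arithmetic above can be checked with a quick sketch. The word count and block size here are illustrative (borrowed from the 32K-word example that appears later in these slides), not fixed by the definitions:

```python
n = 15            # address bits, so main memory holds 2**15 = 32768 words
K = 8             # words per block
M = 2**n // K     # total number of main-memory blocks: M = 2^n / K
C = 64            # cache slots, each holding K words (C << M)

print(M)          # 4096 blocks
print(C < M)      # True: far fewer slots than blocks
```

Because C << M, a slot cannot be permanently tied to one block, which is exactly why each slot carries a tag.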

Page 5: L18-CacheMem


Cache Operation - Overview

• The CPU requests the contents of a memory location.

• The cache is checked for this data.

• If present, the data is fetched from the cache (fast).

• If not present, the required block is read from main memory into the cache.

• The data is then delivered from the cache to the CPU.

• The cache includes tags to identify which block of main memory is in each cache slot.
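The steps above can be sketched in a few lines. This is a minimal model, not hardware: a Python dict keyed by block number stands in for the tag-matched cache slots, and the block size is an illustrative assumption:

```python
BLOCK_SIZE = 8   # words per block (illustrative)

def cache_read(address, cache, memory):
    """Sketch of the read flow: check the cache, fetch the block on a miss."""
    block_no, offset = divmod(address, BLOCK_SIZE)
    if block_no in cache:                        # hit: deliver from cache
        return cache[block_no][offset]
    base = block_no * BLOCK_SIZE                 # miss: read required block
    cache[block_no] = memory[base:base + BLOCK_SIZE]
    return cache[block_no][offset]               # then deliver to the CPU

memory = list(range(64))                         # toy main memory
cache = {}
print(cache_read(10, cache, memory))   # 10 (miss: block 1 is loaded)
print(cache_read(11, cache, memory))   # 11 (hit: block 1 already cached)
```

Note that the miss path loads the whole block, so a subsequent access to a neighboring word becomes a hit.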

Page 6: L18-CacheMem


Cache Read Operation - Flowchart

Page 7: L18-CacheMem


Hit Ratio:

• The performance of cache memory is measured in terms of the hit ratio.

• Hit Ratio = number of hits / total CPU references to memory (hits + misses)

• When the CPU refers to memory and finds the word in the cache, it is said to produce a hit.

• If the word is not found in the cache, it must be fetched from main memory, and the reference counts as a miss.
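As a worked example of the formula (the counts are made up for illustration):

```python
# Hit ratio = hits / (hits + misses)
hits, misses = 950, 50
hit_ratio = hits / (hits + misses)
print(hit_ratio)   # 0.95, i.e. 95% of references are served by the cache
```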

Page 8: L18-CacheMem


Elements of Cache Design

1. Size
2. Mapping Function
3. Replacement Algorithm
4. Write Policy
5. Block Size
6. Number of Caches

Page 9: L18-CacheMem


1. Size:

• The cache should be small enough that the overall average cost per bit is close to that of main memory alone, and large enough that the overall average access time is close to that of the cache alone.

• The larger the cache, the larger the number of gates involved in addressing it; as a result, performance decreases, i.e. the cache becomes slower.

• Cache size also depends on the available chip and board area.

Page 10: L18-CacheMem


2. Mapping Function:

• The basic characteristic of a cache is its fast access time, so time should not be wasted searching for words in the cache.

• There are fewer cache lines than main memory blocks, so an algorithm is needed for mapping main memory blocks into cache lines, i.e. a means to determine which main memory block occupies a given cache line.

• The transformation of data from main memory to cache is referred to as a mapping process.

• The mapping function determines how the cache is organized.

• There are 3 types of mapping:

a) Associative mapping
b) Direct mapping
c) Set-associative mapping

Page 11: L18-CacheMem


[Figure: CPU connected to a 512 x 12 cache memory and a 32K x 12 main memory]

Page 12: L18-CacheMem


a) Associative Mapping:

- Any block location in the cache can store any block in memory -> most flexible
- The mapping table is implemented in an associative memory -> fast, but very expensive
- The mapping table stores both the address and the content of the memory word
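A minimal sketch of associative lookup, assuming word granularity: the table pairs each stored word's full main-memory address with its content, the way the associative mapping table does, and a dict lookup stands in for the parallel hardware search:

```python
# Fully associative cache sketch: the "mapping table" is keyed by the
# complete memory address, so any entry may hold any word.
table = {}

def assoc_read(address, memory):
    if address in table:                 # associative match on the address
        return table[address], True      # hit
    table[address] = memory[address]     # miss: load with no placement limit
    return table[address], False

memory = {5: 42, 9: 7}                   # toy main memory
print(assoc_read(5, memory))   # (42, False): first access misses
print(assoc_read(5, memory))   # (42, True): second access hits
```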

Page 13: L18-CacheMem


b) Direct Mapping:

Addressing relationship between main and cache memories:

- Each memory block has only one place to load in the cache
- The mapping table is made of RAM instead of CAM
- An n-bit memory address consists of 2 parts: k bits of index field and n-k bits of tag field
- n-bit addresses are used to access main memory, and the k-bit index is used to access the cache

Page 14: L18-CacheMem


Operation:

- The CPU generates a memory request with (TAG; INDEX)
- Access the cache using INDEX, yielding the stored (tag; data)
- Compare TAG and tag
- If they match -> Hit: provide Cache[INDEX](data) to the CPU
- If they do not match -> Miss: copy the required data from main memory to the cache, then CPU <- Cache[INDEX](data)
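The operation above can be sketched as follows. The split into 9 index bits matches the 512-word cache of the earlier 32K x 12 example; the dict of (tag, data) pairs stands in for the RAM mapping table:

```python
INDEX_BITS = 9    # 512-slot cache, as in the 32K-word example

def direct_read(address, cache, memory):
    """Direct-mapped lookup: the low k bits select the one possible slot,
    and the remaining n-k bits are compared against the stored tag."""
    index = address & ((1 << INDEX_BITS) - 1)   # k-bit INDEX
    tag = address >> INDEX_BITS                 # (n-k)-bit TAG
    entry = cache.get(index)
    if entry is not None and entry[0] == tag:   # compare TAG and tag
        return entry[1], True                   # hit
    cache[index] = (tag, memory[address])       # miss: copy from memory
    return memory[address], False

memory = {1536: 99}        # address 1536 -> index 0, tag 3
cache = {}
print(direct_read(1536, cache, memory))   # (99, False): miss, slot 0 loaded
print(direct_read(1536, cache, memory))   # (99, True): hit
```

Two addresses with the same index but different tags contend for the same slot, which is the known weakness of direct mapping.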

Page 15: L18-CacheMem


Direct mapping cache with block size of 8 words

Page 16: L18-CacheMem


c) Set-Associative Mapping:

Two-way set-associative mapping:

- Each memory block has a set of locations in the cache where it can load

Page 17: L18-CacheMem


Operation:

- The CPU generates a memory address (TAG; INDEX)
- Access the cache with INDEX; the cache word = (tag 0, data 0); (tag 1, data 1)
- Compare TAG with tag 0 and then tag 1
- If tag i = TAG -> Hit: CPU <- data i
- If tag i != TAG -> Miss: replace either (tag 0, data 0) or (tag 1, data 1).
  Assume (tag 0, data 0) is selected for replacement
  (why (tag 0, data 0) instead of (tag 1, data 1)?):
  M[tag 0, INDEX] <- Cache[INDEX](data 0)
  Cache[INDEX](tag 0, data 0) <- (TAG, M[TAG, INDEX])
  CPU <- Cache[INDEX](data 0)

- SET ASSOCIATIVE MAPPING -
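A sketch of the two-way lookup above. Which way to replace on a miss is the question the slide poses; the replacement policy answers it, and this sketch simply replaces the oldest way as a placeholder:

```python
def two_way_read(address, cache, memory, index_bits=9):
    """Two-way set-associative lookup: each set holds up to two
    (tag, data) pairs, and TAG is compared against both stored tags."""
    index = address & ((1 << index_bits) - 1)
    tag = address >> index_bits
    ways = cache.setdefault(index, [])
    for t, d in ways:                 # compare TAG with tag 0, then tag 1
        if t == tag:
            return d, True            # hit: CPU <- data i
    if len(ways) == 2:
        ways.pop(0)                   # miss with a full set: replace a way
    ways.append((tag, memory[address]))
    return memory[address], False
```

With two ways per set, two blocks that share an index can both stay resident, which direct mapping cannot do.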

Page 18: L18-CacheMem


3. BLOCK REPLACEMENT POLICY

- Many different block replacement policies are available.
- LRU (Least Recently Used) is the easiest to implement.
- Cache word = (tag 0, data 0, U0); (tag 1, data 1, U1), where Ui = 0 or 1 (binary)

Implementation of LRU in set-associative mapping with set size = 2:

- Initially, U0 = U1 = 1
- On a hit to (tag 0, data 0, U0): U1 <- 1 (least recently used)
  (On a hit to (tag 1, data 1, U1): U0 <- 1 (least recently used))
- On a miss, find the least recently used way (Ui = 1):
  - If U0 = 1 and U1 = 0, replace (tag 0, data 0):
    M[tag 0, INDEX] <- Cache[INDEX](data 0)
    Cache[INDEX](tag 0, data 0, U0) <- (TAG, M[TAG, INDEX], 0); U1 <- 1
  - If U0 = 0 and U1 = 1, replace (tag 1, data 1): similar to above; U0 <- 1
  - U0 = U1 = 0: this condition does not exist
  - If U0 = U1 = 1, both are candidates; make an arbitrary selection
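The U-bit bookkeeping can be sketched as below. One detail is an assumption: the slide states only that a hit sets the *other* way's U bit; clearing the hit way's own bit to 0, as the miss rules imply, is filled in here:

```python
def lru_hit(u_bits, hit_way):
    """On a hit to way i, mark the other way as least recently used
    (its U bit <- 1). Clearing the hit way's own U bit is assumed,
    as implied by the replacement rules on a miss."""
    u_bits[1 - hit_way] = 1
    u_bits[hit_way] = 0

def lru_victim(u_bits):
    """On a miss, replace the way whose U bit is 1; if both are 1
    (the initial state), the choice is arbitrary -- way 0 here."""
    if u_bits == [0, 1]:
        return 1
    return 0            # covers [1, 0] and the arbitrary [1, 1] case

u = [1, 1]              # initially U0 = U1 = 1
lru_hit(u, 1)           # hit to way 1 -> U0 <- 1, so way 0 is now LRU
print(lru_victim(u))    # 0
```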

Page 19: L18-CacheMem


4. CACHE WRITE

Write-Through

- When writing into memory:
  - If hit, both the cache and memory are written in parallel
  - If miss, only memory is written
- For a read miss, the missing block may be loaded onto a cache block
- Memory is always updated
  -> important when the CPU and DMA I/O are both executing
- Slow, due to the memory access time

Write-Back (Copy-Back)

- When writing into memory:
  - If hit, only the cache is written
  - If miss, the missing block is brought into the cache and then written
- For a read miss, the candidate block must be written back to memory
- Memory is not up to date, i.e. the same item in the cache and in memory may have different values
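The two policies can be contrasted with a word-granularity sketch (block handling and read misses are omitted to keep the contrast visible):

```python
class WriteThrough:
    """Write-through sketch: on a write hit the cache and memory are
    updated together; on a write miss only memory is written."""
    def __init__(self, memory):
        self.memory, self.lines = memory, {}
    def write(self, addr, value):
        if addr in self.lines:
            self.lines[addr] = value     # hit: update the cached copy...
        self.memory[addr] = value        # ...and memory is always updated

class WriteBack:
    """Write-back (copy-back) sketch: writes go only to the cache;
    a dirty line reaches memory when it is written back on eviction."""
    def __init__(self, memory):
        self.memory, self.lines, self.dirty = memory, {}, set()
    def write(self, addr, value):
        self.lines[addr] = value         # hit or miss: write into the cache
        self.dirty.add(addr)             # memory is now out of date
    def write_back(self, addr):
        if addr in self.dirty:           # copy the dirty line to memory
            self.memory[addr] = self.lines.pop(addr)
            self.dirty.discard(addr)
```

After `WriteBack.write`, memory still holds the stale value until `write_back` runs, which is exactly the inconsistency the last bullet describes and why DMA I/O favors write-through.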