10/16: Lecture Topics

10/16: Lecture Topics

• Memory problem• Memory Solution: Caches• Locality• Types of caches

– Fully associative– Direct mapped– n-way associative

Memory Problem

• If all memory accesses (lw/sw) accessed main memory, programs would run 20 times slower

• And it’s getting worse– processors speed up by 50% annually– memory accesses speed up by 9%

annually– it’s becoming harder and harder to

keep these processors fed

Solution: Memory Hierarchy

• Keep all data in the big, slow, cheap storage

• Keep copies of the “important” data in the small, fast, expensive storage

fast, small, expensivestorage

slow, large, cheap storage

Memory Hierarchy

Memory Level

Fabrication Tech

Access Time (ns)

Size (bytes)

$/MB

Registers

Registers <0.5 256 1000

L1 Cache

SRAM 2 8K 100

L2 Cache

SRAM 10 1M 100

Memory DRAM 50 128M 0.75

Disk Magnetic Disk

10M 32G 0.0035

What is a Cache?

• A cache allows for fast accesses to a subset of a larger data store

• Your web browser’s cache gives you fast access to pages you visited recently– faster because it’s stored locally– subset because the web won’t fit on your disk

• The memory cache gives the processor fast access to memory that it used recently– faster because it’s located on the CPU

Locality

• Temporal locality: the principle that data being accessed now will probably be accessed again soon– Useful data tends to continue to be

useful

• Spatial locality: the principle that data near the data being accessed now will probably be needed soon– If data item n is useful now, then it’s

likely that data item n+1 will be useful soon

Memory Access Patterns

• Memory accesses don’t look like this– random accesses

• Memory accesses do look like this– hot variables– step through

arrays

Cache Terminology

• Hit—the data item is in the cache• Miss—the data item is not in the cache• Hit rate—the percentage of time the

data item is in the cache (hit ratio)• Miss rate—the percentage of time the

data item is not in the cache (hit ratio)• Hit time—the time required to access

data in the cache• Miss time—the time required to access

data not in the cache (miss penalty)• Access time—the time required to

access a level of the memory hierarchy

Effective Access Time

teffective = htcache + (1-h)tmemoryeffective

access time

memory access timecache

miss rate

cache access timecache

hit rate

aka, Average Memory Access Time (AMAT)

Access Time Example

• Suppose tmemory for main memory is 50ns

• Suppose tcache for the processor cache is 2ns

• We want an effective access time of 3ns

• What hit rate h do we need?

Cache Contents

• When do we put something in the cache?– when it is used for the first time

• When do we take something out of the cache?– when it hasn’t been used for a long

time– subset because all of memory won’t

fit on the CPU

Concepts in Caching

• Assume a two level hierarchy:• Level 1: a cache that can hold 8 words• Level 2: a memory that can hold 32 words

cache

memory

0000000

0000100

0001000

1110100

1111000

1111100

0001100

0010000

0010100

Fully Associative Cache

Address Valid Value

0010100 Y 0x00000001

0000100 N 0x09D91D11

0100100 Y 0x00000410

0101100 Y 0x00012D10

0001100 N 0x00000005

1101100 Y 0xDEADBEEF

0100000 Y 0x000123A8

1111100 N 0x00000200

• In a fully associative cache,– any memory word

can be placed in any cache line

– each cache line stores which address it contains

– accesses are slow (but not as slow as you would think)

Direct Mapped Caches

• Fully associative caches are too slow• With direct mapped caches the

address of the item determines where in the cache to store it– In this case, the lower five bits of the

address dictate the cache entry

Direct Mapped CacheAddress Valid Value

1100000 Y 0x00000001

1000100 N 0x09D91D11

0101000 Y 0x00000410

0001100 Y 0x00012D10

1010000 N 0x00000005

1110100 Y 0xDEADBEEF

0011000 Y 0x000123A8

1011100 N 0x00000200

• The lower 5 bits of the address determine where it is located in the table

Index

00000

00100

01000

01100

10000

10100

11000

11100

Address Tags

• A tag is a label for a cache entry indicating where it came from– The upper bits of the data item’s

address7 bit Address

1011101

Tag (2) Index (3) Byte Offset (2)

10 111 01

Cache with Address Tag

Tag Valid Value11 Y 0x00000001

10 N 0x09D91D11

01 Y 0x00000410

00 Y 0x00012D10

10 N 0x00000005

11 Y 0xDEADBEEF

00 Y 0x000123A8

10 N 0x00000200

Index

000

001

010

011

100

101

110

111

Reference Stream Example

11010, 10111, 00001, 11010, 11011, 11111, 01101, 11010

datatagvbindex

000

001

010

011

100

101110

111

Sample L1 Cache

• Suppose a L1 cache has the following characteristics– one word stored per entry– 1024 entries– direct mapped

• How many total bytes are stored in the cache?

• How many bits are used for the index and tag fields of the address?

• How many total bits are stored in the cache?

N-way Set Associative Caches

• Direct mapped caches cannot store two addresses with the same index

• If two addresses collide, then you have to kick one of them out

• 2-way associative caches can store two different addresses with the same index

• Reduces misses due to conflicts• Also, 3-way, 4-way and 8-way set

associative• Larger sets imply slower accesses

2-way Associative Cache


10 N 0x09D91D11

01 Y 0x00000410

00 Y 0x00012D10

10 N 0x00000005

11 Y 0xDEADBEEF

00 Y 0x000123A8

10 N 0x00000200

Index000

001

010

011

100

101

110

111


10 N 0x0000003B

11 Y 0x000000CF

10 N 0x000000A2

11 N 0x00000333

10 Y 0x00003333

01 Y 0x0000C002

10 N 0x00000005

Associativity Spectrum

Direct MappedFast to access

Conflict Misses

Fully AssociativeSlow to access

No Conflict Misses

N-way AssociativeSlower to access

Fewer Conflict Misses

10/16: Lecture Topics

Documents

Transcript of 10/16: Lecture Topics