Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf–...

131
Yonsei Yonsei University University Chapter 4 Internal Memory

Transcript of Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf–...

Page 1: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity

Chapter 4

Internal Memory

Page 2: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-2

ContentsContents

• Computer Memory System Overview• Semiconductor Main Memory• Cache Memory• Pentium and Power PC• Advanced DRAM Organization

Page 3: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-3

Key CharacteristicsKey Characteristics Computer memory Computer memory system overviewsystem overview

PerformanceAccess timeCycle timeTransfer rate

Physical TypeSemiconductorMagneticOpticalMagneto-Optical

Physical CharacteristicsVolatile/nonvolatileErasable/nonerasable

Organization

LocationProcessorInternal (main)External (secondary)

CapacityWord sizeNumber of words

Unit of TransferWordBlock

Access MethodSequentialDirectAssociative

Page 4: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-4

Characteristics of Memory SystemCharacteristics of Memory System• Location

– Processor– Internal(main)– External(secondary)

Computer memory Computer memory system overviewsystem overview

Page 5: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-5

CapacityCapacity• Internal memory capacity

– Expressed in terms of bytes or words

• External memory capacity– Expressed in terms of bytes

Computer memory Computer memory system overviewsystem overview

Page 6: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-6

Unit of TransferUnit of Transfer• Internal memory

– The unit of transfer is equal to number of data into and out of the memory module

– Often equal to the word length – Word

• Natural unit of organization of memory• The size of the word is equal to number of bits used

to represent a number and to instruction length– Addressable unit

• In many systems, the addressable unit is word• Some systems allow addressing at the byte level

– Unit of transfer• The number of bits read out of or written into

memory at a time

Computer memory Computer memory system overviewsystem overview

Page 7: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-7

Methods of AccessingMethods of Accessing• Sequential access

– Start at the beginning and read through in order– Access time depends on location of data and

a previous location– e.g. tape

• Direct access– Individual blocks have unique address– Access is by jumping to vicinity plus sequential

search– Access time depends on location and previous

location– e.g. disk

Computer memory Computer memory system overviewsystem overview

Page 8: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-8

Methods of AccessingMethods of Accessing

• Random access– Individual addresses identify locations exactly– Access time is independent of location or previous

access– e.g. RAM

• Associative access– Data is located by a comparison with contents of a

portion of the store– Access time is independent of location or previous

access– e.g. cache

Computer memory Computer memory system overviewsystem overview

Page 9: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-9

PerformancePerformance• Access time

– Time it takes to perform read or write operation (for random-access memory)

– Time it takes to position the read-write mechanism at desired location (for non-random-access memory)

• Memory cycle time – Applied to random-access memory– Consists of the access time plus any additional time

required before next access

Computer memory Computer memory system overviewsystem overview

Page 10: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-10

PerformancePerformance• Transfer rate

– Rate at which data can be transferred into or out of a memory unit (for random-access memory)

– For non-random-access memory

TN = Average time to read or write N bitsTA = Average access timeN = Number of bitsR = Transfer rate, in bits per second(bps)

RN

TT AN??

Computer memory Computer memory system overviewsystem overview

Page 11: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-11

Physical TypesPhysical Types

• Semiconductor– RAM

• Magnetic– Disk & Tape

• Optical– CD & DVD

• Others– Bubble– Hologram

Computer memory Computer memory system overviewsystem overview

Page 12: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-12

Physical CharacteristicsPhysical Characteristics

• Decay

• Volatility

• Erasable

• Power consumption

Computer memory Computer memory system overviewsystem overview

Page 13: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-13

OrganizationOrganization

• Physical arrangement of bits to form words

• Not always obvious

• e.g. interleaved

Computer memory Computer memory system overviewsystem overview

Page 14: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-14

The Bottom LineThe Bottom Line

• How much?– Capacity

• How fast?– Time is money

• How expensive?

Computer memory Computer memory system overviewsystem overview

Page 15: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-15

The The Memory HierarchyMemory Hierarchy• Relationships

– The faster access time, the greater cost per bit– The greater capacity, the smaller cost per bit– The greater capacity, the slower access time

• As one goes down the hierarchy, the following occur– Decreasing cost per bit– Increasing capacity– Increasing access time– Decreasing frequency of access of the memory by the

processor

Computer memory Computer memory system overviewsystem overview

Page 16: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-16

The Memory hierarchyThe Memory hierarchy Computer memory Computer memory system overviewsystem overview

Page 17: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-17

The The Memory HierarchyMemory Hierarchy

• Registers– In CPU

• Internal or Main memory– May include one or more levels of cache– RAM

• External memory– Backing store

Computer memory Computer memory system overviewsystem overview

Page 18: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-18

Performance of A TwoPerformance of A Two--Level MemoryLevel Memory Computer memory Computer memory system overviewsystem overview

Page 19: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-19

Hierarchy ListHierarchy List

• Registers• L1 Cache• L2 Cache• Main memory• Disk cache• Disk• Optical• Tape

Computer memory Computer memory system overviewsystem overview

Page 20: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-20

Memory HierarchyMemory Hierarchy• Locality of reference

– During the course of the execution of a program, memory references tend to cluster

Computer memory Computer memory system overviewsystem overview

Page 21: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-21

Memory HierarchyMemory Hierarchy

• Additional levels can be effectively added to the hierarchy in software

• Portion of main memory can be used as a buffer to hold data that is to be read out to disk

• Such a technique, sometimes referred to as a disk cache

Computer memory Computer memory system overviewsystem overview

Page 22: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-22

So You Want Fast?So You Want Fast?

• It is possible to build a computer which uses only static RAM (see later)

• This would be very fast

• This would need no cache– How can you cache cache?

• This would cost a very large amount

Computer memory Computer memory system overviewsystem overview

Page 23: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-23

Semiconductor Memory TypesSemiconductor Memory Types Semiconductor Semiconductor main memorymain memory

Electrically, byte level

Electrically Erasable

PROM(EEPROM)

Electrically, block level

Flash memory

UV light, chip level

Read-mostly memory

Erasable PROM(EPROM)

Electrically

Programmable ROM(PROM)

Nonvolatile

MaskNot possible Read-only memory

Read-only memory(ROM)

VolatileElectricallyElectrically, byte level

Read-write memoryRandom-access memory(RAM)

VolatilityWrite MechanismErasureCategoryMemory Type

Page 24: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-24

RAMRAM

• Misnamed as all semiconductor memory is random access

• Read/Write• Volatile• Temporary storage• Static or dynamic

– DRAM requires periodic charge refreshing to maintain data storage

Semiconductor Semiconductor main memorymain memory

Page 25: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-25

DRAMDRAM

• Bits stored as charge in capacitors• Charges leak• Need refreshing even when it is powered• Simpler construction• Smaller per bit• Less expensive• Need refresh circuits• Slower• Main memory

Semiconductor Semiconductor main memorymain memory

Page 26: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-26

SRAMSRAM

• Bits stored as on/off switches• No charges to leak• No refreshing needed when it is powered• More complex construction• Larger per bit• More expensive• Does not need refresh circuits• Faster• Cache

Semiconductor Semiconductor main memorymain memory

Page 27: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-27

Read Only Memory (ROMRead Only Memory (ROM))• Permanent storage• Applications

– Microprogramming – Library subroutines– Systems programs (BIOS)– Function tables

• Problems– The data insertion step includes a large fixed cost – No room for error

• Written during manufacture– Very expensive for small runs

Semiconductor Semiconductor main memorymain memory

Page 28: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-28

Programmable ROM (PROM)Programmable ROM (PROM)• Nonvolatile & Written only once

– Writing is performed electrically and may be performed by a supplier or customer at a time later than the original chip fabrication

– Needs special equipment to program

Semiconductor Semiconductor main memorymain memory

Page 29: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-29

Read Read ““MostlyMostly”” Memory (RMM)Memory (RMM)

• Read operations are far more frequent than write operations

• But for which nonvolatile storage is required

• EPROM• EEPROM• Flash

Semiconductor Semiconductor main memorymain memory

Page 30: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-30

EPROMEPROM

• Erasable Programmable • Storage cell must be erased by UV before

written operation• Read and written electrically • More expensive than PROM• Advantage of multiple update capability

Semiconductor Semiconductor main memorymain memory

Page 31: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-31

Electrically Erasable PROM(EEPROM)Electrically Erasable PROM(EEPROM)

• A read-mostly memory that can be written into at any time without erasing prior contents

• Takes much longer to write than read• Advantage

– Nonvolatility with the flexibility of being updatablein place using ordinary bus control, address, data lines

Semiconductor Semiconductor main memorymain memory

Page 32: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-32

Flash MemoryFlash Memory

• Erase whole memory electrically• Entire flash memory can be erased in one

or few seconds (faster than EPROM)• Possible to erase just blocks• Impossible to erase byte-level• Use only one transistor per bit • Achieves high density

Semiconductor Semiconductor main memorymain memory

Page 33: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-33

Memory Cell OperationMemory Cell Operation Semiconductor Semiconductor main memorymain memory

Page 34: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-34

Chip LogicChip Logic

• A 16Mbit chip can be organized as 1M of 16 bit words

• A bit per chip system has 16 lots of 1Mbit chip with bit 1 of each word in chip 1 and so on

• A 16Mbit chip can be organized as a 2048 x 2048 x 4bit array– Reduces the number of address pins

• Multiplex row address and column address• 11 pins to address (211=2048)• Adding one more pin doubles range of values • So the memory size grows by a factor of 4

Semiconductor Semiconductor main memorymain memory

Page 35: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-35

RefreshingRefreshing

• Refresh circuit included on chip• Disable chip• Count through rows• Read & Write back• Takes time• Slows down apparent performance

Semiconductor Semiconductor main memorymain memory

Page 36: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-36

Typical 16 Mb DRAM (4M x 4)Typical 16 Mb DRAM (4M x 4) Semiconductor Semiconductor main memorymain memory

Page 37: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-37

Chip PackagingChip Packaging

• Pins support following signal lines– Address of words being accessed

• For 1M words, a total of 20(220=1M) pins needed(A0-A19)

– Data to be read out, consisting of 8 lines(D0-D7) – Power supply to the chip(Vcc)– Ground pin(Vss)– Chip enable (CE)pin– Program voltage(Vpp) that is supplied during

programming(writing operation)

Semiconductor Semiconductor main memorymain memory

Page 38: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-38

Typical Memory PackageTypical Memory Package Semiconductor Semiconductor main memorymain memory

Page 39: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-39

Module OrganizationModule Organization• How a memory

module consisting of 256K 8-bit words could be organized– For 256K word,

18-bit address is needed and is supplied to the module from some external sources

Semiconductor Semiconductor main memorymain memory

Page 40: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-40

Module OrganizationModule Organization

• Possible organization of a memory consisting of 1M word by 8 bits per word– Need four columns of chips, each column

containing 256K words arranged

Semiconductor Semiconductor main memorymain memory

Page 41: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-41

11--MbyteMbyte Memory OrganisationMemory Organisation Semiconductor Semiconductor main memorymain memory

Page 42: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-42

Error CorrectionError Correction• Hard Failure

– Permanent physical defect– Memory cell or cells affected cannot reliably store data

and become stuck at 0 or 1or switch erratically between 0 and 1

– Caused by harsh environmental abuse, manufacturing defects and wear

• Soft Error– Random, non-destructive – No permanent damage to memory– Caused by power supply problems or alpha particles

• Detected using Hamming error correcting code

Semiconductor Semiconductor main memorymain memory

Page 43: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-43

Error Error -- Correcting Code Correcting Code Semiconductor Semiconductor main memorymain memory

• When data are to be read into memory, a function f is performed to produce a code

• Both the data and the code are stored• If M-bit data word and K-bit code, the stored word is

M+K bits

Page 44: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-44

Error CorrectionError Correction• If M-bit word of data is stored, and code is K

bit, then actual size of stored word is M+K bits• New set of K code is generated from M data

bits compared with fetched code bits• Comparison one of three results

– No errors detected • The fetched data bits sent out

– An error is detected and it is possible to correct error• The data bit plus error-correction bits fed into

corrector which produces a corrected set of M bits to be sent out

– An error is detected, but it is impossible to correct it• This condition reported

Semiconductor Semiconductor main memorymain memory

Page 45: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-45

Hamming ErrorHamming Error--Correcting CodeCorrecting Code Semiconductor Semiconductor main memorymain memory

Page 46: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-46

Hamming CodeHamming Code• Parity bits• By checking the parity bits, discrepancies found in

circle A and circle C, but not in circle B• Only one of the seven compartments is in A and C but

not B• The error can be corrected by changing that bit• Syndrome word : result of comparison of bit-by-bit• Each bit of syndrome is 0 or 1 according to if there is

or is not a match in position for two input• The syndrome word is K bits wide and has a range

between 0 and 2K-1

2K – 1 ≥M + k

Semiconductor Semiconductor main memorymain memory

Page 47: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-47

Increase in Word LengthIncrease in Word Length Semiconductor Semiconductor main memorymain memory

3.91103.5292567.0396.258128

12.5810.94864

21.875718.75632

37.5631.25516

62.555048

%IncreaseCheck Bits%IncreaseCheck BitsData Bits

Single-Error Correction/Double-

Error Detection

Single-Error-Correction

Page 48: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-48

Error CorrectionError Correction• To generate 4-bit syndrome

– If the syndrome contains all 0s, no error is detected– If the syndrome contains one and only one bit set to 1,

an error occurs in one of 4 check bits• No correction needed

– If the syndrome contains more than one bit set to 1, numerical value of syndrome indicates the position of data bit in error

• This data bit is inverted for correction

Semiconductor Semiconductor main memorymain memory

Page 49: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-49

Layout of Data Bits & Check BitsLayout of Data Bits & Check Bits Semiconductor Semiconductor main memorymain memory

Page 50: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-50

Layout of Data Bits & Check BitsLayout of Data Bits & Check Bits

• To achieve these characteristics, data and check bits are arranged into a 12-bit word as depicted

• Those bit positions whose position numbers are powers of 2 are designated as check bits

Semiconductor Semiconductor main memorymain memory

Page 51: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-51

Error CorrectionError Correction

• Each check bit operates on every data bit position whose position number contains 1 in corresponding column position

8765884324

764312754211

MMMMCMMMMC

MMMMMCMMMMMC

????????

??????????

Semiconductor Semiconductor main memorymain memory

Page 52: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-52

Check Bit DegenerationCheck Bit Degeneration

• Data and check bits are positioned properly in the 12-bit word.

• By laying out position number of each data bit in columns, the 1s in each row indicates data bits checked by the check bit for that row

Semiconductor Semiconductor main memorymain memory

Page 53: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-53

Check Bit DegenerationCheck Bit Degeneration Semiconductor Semiconductor main memorymain memory

Page 54: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-54

Error CorrectionError Correction

• Single-error-correcting(SEC)• Single-error-correcting, double-error

-detecting(SEC-DED)

Semiconductor Semiconductor main memorymain memory

Page 55: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-55

Hamming SECHamming SEC--DEC CodeDEC Code Semiconductor Semiconductor main memorymain memory

Page 56: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-56

Error CorrectionError Correction

• IBM 30xx implementations use 8-bit SEC-DED code for each 64 bits of data in main memory

• Size of main memory is 12% larger than apparent to the user

• VAX computer use 7-bit SEC-DED for each 32 bits of memory, for 22% overhead

• A number of contemporary DRAMs use 9 check bits for each 128 bits of data, for 7% overhead

Semiconductor Semiconductor main memorymain memory

Page 57: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-57

Cache & Main MemoryCache & Main Memory

• Small amount of fast memory• Sits between normal main memory and

CPU• May be located on CPU chip or module

Cache Cache memorymemory

Page 58: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-58

PrinciplePrinciple• CPU requests contents of memory location• Check cache for this data• If present, get the word from cache (fast)• If not present, main memory is read into

cache and the word is delivered to the processor

• The word is delivered from cache to CPU because of phenomenon of locality of reference

• Cache includes tags to identify which block of main memory is in each cache slot

Cache Cache memorymemory

Page 59: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-59

PrinciplePrinciple

• For mapping purposes, the memory is considered to consist of number of fixed-length block of K words each– Number of block : M = 2n/K

• Cache consists of C lines of K words each and the number of lines is considerably less than the number of main memory blocks

• Each line includes a tag that identifies which particular block is currently being stored

Cache Cache memorymemory

Page 60: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-60

Cache/MainCache/Main--Memory StructureMemory Structure Cache Cache memorymemory

Page 61: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-61

PrinciplePrinciple• Cache hit occurs

– Data and address buffers are disabled and the communication is only between the processor and cache, with no system bus traffic

• Cache miss occurs– Desired address is loaded onto the system bus and the

data are returned through data buffer to both cache and main memory

Cache Cache memorymemory

Page 62: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-62

Cache Read OperationCache Read Operation Cache Cache memorymemory

Page 63: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-63

Elements of Cache DesignElements of Cache Design

Write PolicyWrite throughWrite backWrite once

Line SizeNumber of Caches

Single or two level Unified or split

Cache sizeMapping Function

DirectAssociativeSet Associative

Replacement AlgorithmLeast recently used(LRU)First in first out(FIFO)Least frequently used(LFU)Random

Cache Cache memorymemory

Page 64: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-64

Cache SizeCache Size• Small enough that overall average cost per bit is

close to that of main memory• Large enough that overall average access time

is close to that cache alone• The larger cache, the larger number of gates

involved in addressing the cache• The larger cache tend to be slower than small

caches• Cache size is limited by available chip and board

area• Impossible to arrive “optimum” size because

cache’s performance is very sensitive to the nature of workload

Cache Cache memorymemory

Page 65: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-65

Factors For Cache SizeFactors For Cache Size

• Cost– More cache is expensive

• Speed– More cache is faster (up to a point)– Checking cache for data takes time

Cache Cache memorymemory

Page 66: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-66

Typical Cache OrganizationTypical Cache Organization Cache Cache memorymemory

Page 67: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-67

Mapping FunctionMapping Function• Needed for determining which main memory

block currently occupies a cache line• Direct mapping

– Cache of 64kByte– Cache block of 4 bytes

• i.e. cache is 16k (214) lines of 4 bytes– 16MBytes main memory– 24 bit address

• (224=16M)

Cache Cache memorymemory

Page 68: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-68

Direct MappingDirect Mapping• Maps each block of main memory into only

one possible cache line

i = j modulo m

wherei = cache line numberj = main memory block number

m= number of lines in the cache

Cache Cache memorymemory

Page 69: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-69

Direct Mapping Cache OrganizationDirect Mapping Cache Organization Cache Cache memorymemory

Page 70: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-70

Direct Mapping Cache Line TableDirect Mapping Cache Line Table

– Least significant w bits identify a unique word or byte– Most significant s bits specify one memory block– The MSBs are split into a cache line field r and a tag of

s-r (most significant)– This latter field identifies one of the m=2r lines of cache

m –1, 2m –1, 3m –1,… , 2s-1 m -1

.

.

.

.

.

.

1, m +1, 2m +1, … , 2s- m + 110, m, 2m, … , 2s- m0

Main memory block assignedCache line

Cache Cache memorymemory

Page 71: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-71

Direct Mapping ExampleDirect Mapping Example Cache Cache memorymemory

Page 72: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-72

Direct Mapping Address StructureDirect Mapping Address StructureTag s-r Line or Slot r Word w

8 14 2

• 24 bit address• 2 bit word identifier (4 byte block)• 22 bit block identifier

– 8 bit tag (=22-14)– 14 bit slot or line

• No two blocks in the same line have the same Tag field

• Check contents of cache by finding line and checking Tag

Cache Cache memorymemory

Page 73: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-73

Direct MappingDirect Mapping

• Advantage– Direct mapping technique is simple and inexpensive to

implement

• Disadvantage– A fixed cache location for any given block– The hit ratio will be low

Cache Cache memorymemory

Page 74: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-74

Associative MappingAssociative Mapping• A main memory block can load into any line of

cache• The tag field uniquely identifies a block of

main memory• To determine whether a block is in the cache,

the cache control logic must simultaneously examine every line’s tag for a match

• Cache searching gets expensive

Cache Cache memorymemory

Page 75: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-75

Fully Associative Cache OrganizationFully Associative Cache Organization Cache Cache memorymemory

Page 76: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-76

Associative Mapping ExampleAssociative Mapping Example Cache Cache memorymemory

Page 77: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-77

Tag 22 bit Word2 bit

Associative Mapping Address StructureAssociative Mapping Address Structure

• 22 bit tag stored with each 32 bit block of data

• Compare tag field with tag entry in cache to check for hit

• Least significant 2 bits of address identify which 16 bit word is required from 32 bit data block

• e.g.– Address Tag Data Cache line– FFFFFC FFFFFC 24682468 3FFF

Cache Cache memorymemory

Page 78: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-78

Associative MappingAssociative Mapping

• Replacement algorithms is designed to maximize the hit ratio

• Advantage– Flexible replacement of blocks when a new block is

read into the cache

• Disadvantage– Complex circuitry is required to examine the tags of all

cache lines in parallel

Cache Cache memorymemory

Page 79: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-79

Set Associative MappingSet Associative Mapping

• Compromise that reduces disadvantage of direct & associative approaches

m = v × ki = j modulo v

Wherei = cache set numberj = main memory block number

m= number of lines in the cachek-way set associative mapping

– V=m, k=1, the set associative technique reduces to direct mapping

– V=1, k=m, reduces to associative mapping

Cache Cache memorymemory

Page 80: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-80

Set Associative MappingSet Associative Mapping• Cache is divided into a number of sets• Each set contains a number of lines• A given block maps to any line in a given set

– e.g. Block B can be in any line of set i

• e.g. 2 lines per set– 2 way associative mapping– A given block can be in one of 2 lines in only one set

Cache Cache memorymemory

Page 81: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-81

KK-- Way Set Associative Cache OrganizationWay Set Associative Cache Organization Cache Cache memorymemory

Page 82: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-82

Set Associative Mapping Address StructureSet Associative Mapping Address Structure

• Use set field to determine cache set to look in• Compare the tag field to see if we have a hit• e.g

– Address Tag Data Set number– 1FF 7FFC 1FF 12345678 1FFF– 001 7FFC 001 11223344 1FFF

Tag 9 bit Set 13 bitWord2 bit

Cache Cache memorymemory

Page 83: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-83

Set Associative Mapping ExampleSet Associative Mapping Example

• 13 bit set number• Block number in main memory is modulo

213

• 000000, 008000,… , FF8000map to same set

Cache Cache memorymemory

Page 84: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-84

Two Way Set Associative Mapping ExampleTwo Way Set Associative Mapping Example Cache Cache memorymemory

Page 85: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-85

Replacement Algorithms Replacement Algorithms • A new block is brought into cache, one of the

existing blocks must be replaced• Direct mapping

– Because there is only one possible line for any particular block, the block is must be replaced

• Associative & set associative techniques– Need replacement algorithm

• Least recently used(LRU)– Most effective– Replace that block in the set that has been in the cache

longest with no reference – When a line is referenced, its USE bit is set to 1 and the

USE bit of the other line is set to 0

Cache Cache memorymemory

Page 86: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-86

Replacement AlgorithmsReplacement Algorithms

• First in first out (FIFO)– Replace block that has been in cache longest

• Least frequently used(LFU)– Replace block which has experienced the fewest

references

• Random– Not based on usage

Cache Cache memorymemory

Page 87: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-87

Write PolicyWrite Policy• Must not overwrite a cache block unless main

memory is up to date• If it has not updated, old block in the cache

may be overwritten• When multiple processors are attached to the

same bus and each processor has its own local cache– A word altered in one cache, invalidates a word in other

cache

• Multiple CPUs may have individual caches• I/O may address main memory directly

Cache Cache memorymemory

Page 88: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-88

Write ThroughWrite Through

• All writes go to main memory as well as cache

• Multiple CPUs can monitor main memory traffic to keep local (to CPU) cache up to date

• Disadvantage– Generates substantial memory traffic and may create

bottleneck

Cache Cache memorymemory

Page 89: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-89

Write BackWrite Back• Minimizes memory writes• Updates are made only in the cache • An UPDATE bit for cache slot is set when an

update occurs• If a block is to be replaced, it is written back

to main memory if and only if the UPDATE bit is set

• Other caches get out of sync• I/O must access main memory through cache

– Complex circuitry and a potential bottleneck• 15% of memory references are writes

Cache Cache memorymemory

Page 90: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-90

Cache CoherencyCache Coherency• If data in one cache is altered, this invalidates

the corresponding word in main memory and the same word in other caches

• Even if a write-through policy is used, the other caches may contain invalid data

• A system that prevent this problem said to maintain cache coherency– Bus watching with write through– Hardware transparency– Noncachable memory

Cache Cache memorymemory

Page 91: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-91

Bus Watching With Write ThroughBus Watching With Write Through

• Each cache controller monitors address lines to detect write operations to memory by other bus masters

• If another master writes to a location in shared memory that also resides in cache memory, cache controller invalidates that cache entry

• This strategy depends on the use of a write-through policy by all cache controller

Cache Cache memorymemory

Page 92: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-92

Hardware TransparencyHardware Transparency

• Additional hardware used to ensure that all updates to main memory via cache reflected in all cache

• If one processor modifies a word in its cache, this update is written to main memory

• Any matching words in other caches are similarly updated

Cache Cache memorymemory

Page 93: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-93

NoncachableNoncachable MemoryMemory• Only a portion of main memory is shared by

more than one processor, and this is designated as noncachable

• All accesses to shared memory are cache misses because the shared memory is never copied into the cache

• The noncachable memory can be identified using chip-select logic of high-address bits

Cache Cache memorymemory

Page 94: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-94

Line sizeLine size• When a block of data is retrieved and placed

in the cache, the desired word and some number of adjacent words are retrieved

• Two specific effects come into play– Larger blocks reduce the number of block that fit into a

cache. Because each block fetch overwrites older cache contents, a small number of blocks result in data being overwritten shortly after it is fetched

– As a block becomes larger, each additional word is farther from the requested word, and therefore less likely to be needed in near future

Cache Cache memorymemory

Page 95: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-95

Line sizeLine size

• The relationship between block size and hit ratio is complex, depending on the locality characteristics of a particular program

• No definitive optimum value has been found• A size of from 2 to 8 words seems reasonably

close to optimum

Cache Cache memorymemory

Page 96: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-96

Number of CachesNumber of Caches

• On chip cache– Possible to have cache on the same chip as the

processor– Reduces processor’s external bus activity and speed

up execution times and increases overall system performance

• Most contemporary designs include both on-chip and external caches

• Two-level cache– Internal cache designed level 1(L1) and external

cache designed level 2(L2)

Cache Cache memorymemory

Page 97: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-97

Number of CachesNumber of Caches• Reason for including L2 cache

– If there is no L2 cache and the processor makes access request for memory location not in L1 cache, the processor must access DRAM or ROM memory across the bus

• Poor performance because of slow bus speed and slow memory access time

– If L2 SRAM cache is used, missing information can be quickly retrieved

– Effect of using L2 depends on hit rates in both L1 and L2 caches

Cache Cache memorymemory

Page 98: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-98

Number of CachesNumber of Caches

• To split the cache into two: one dedicated to instruction and one dedicated to data

• Potential advantage of a unified cache– For a given cache size, a unified cache has higher hit

rate than split caches because it balances the load between instruction and data fetches automatically

– Only one cache needs to be designed and implemented

Cache Cache memorymemory

Page 99: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-99

Number of CachesNumber of Caches

• The trend toward split caches– Superscalar machines (Pentium II, PowerPC) which

emphasize parallel instruction execution and prefetching of predicted future instructions

– The key advantage of the split cache design is that it eliminates condition for cache between the instruction processor and the execution unit

– Important in any design that relies on the pipelining of instructions

Cache Cache memorymemory

Page 100: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-100

Pentium II Block DiagramPentium II Block Diagram Pentium Pentium II II and PowerPC and PowerPC

Page 101: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-101

Structure Of Pentium II Data Cache Structure Of Pentium II Data Cache • LRU replacement algorithm• Write-back policy

Pentium Pentium II II and PowerPC and PowerPC

Page 102: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-102

Data Cache ConsistencyData Cache Consistency

• To provide cache consistency, data cache supports a protocol MESI (modified/exclusive/shared/invalid)

• Data cache includes two status bit per tag, so that each line can be in one of four states– Modified– Exclusive– Shared– Invalid

Pentium Pentium II II and PowerPC and PowerPC

Page 103: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-103

MESI Cache Line StatesMESI Cache Line States

Goes directly to

use

Goes to bus and updates cache

Does not go to bus

Does not go to busA write to this line…

MaybeMaybeNoNoCopies exist in other caches?

-ValidValidOut of dateThe memory copy is…

NoYesYesYesThis cache line valid?

IInvalid

S Shared

EExclusive

MModified

Pentium Pentium II II and PowerPC and PowerPC

Page 104: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-104

Cache ControlCache Control

• Internal cache controlled by two bits– CD (cache disable) – NW(not write – through)

• Two Pentium II instruction that used to control cache– INVD invalidates(flushes) internal cache memory

and signals external cache to invalidate– WBINVD writes back and invalidates internal cache,

then writes back and invalidates external cache

Pentium Pentium II II and PowerPC and PowerPC

Page 105: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-105

Pentium II Cache Operation ModesPentium II Cache Operation Modes

DisabledEnabled

Enabled

Write Throughs

DisabledDisabled

Enabled

Cache Fills

Operating Mode

11

0

CD

Control Bits

Disabled1Enabled0

Enabled0

InvalidatesNW

Pentium Pentium II II and PowerPC and PowerPC

Page 106: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-106

PowerPC Internal CachePowerPC Internal Cache Pentium Pentium II II

and PowerPC and PowerPC

8-way set associative642 32-kbytePowePC 620

4-way set associative322 16-kbytePowePC 604

2-way set associative322 8-kbytePowePC 603

8-way set associative321 32-kbytePowePC 601

OrganizationBytes/LineSizeModel

Page 107: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-107

PowerPC G3 Block DiagramPowerPC G3 Block Diagram Pentium Pentium II II and PowerPC and PowerPC

Page 108: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-108

PowerPC Cache OrganizationPowerPC Cache Organization

• The L1 caches are eight-way set associative and use a version of the MESI cache coherency protocol

• The L2 cache is a two-way set associative cache with 256K, 512K, of 1 Mbyte of memory

Pentium Pentium II II and PowerPC and PowerPC

Page 109: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-109

Enhanced DRAM(EDRAM)Enhanced DRAM(EDRAM)• Integrates a small SRAM cache onto a generic

DRAM chip• Refresh operations can be conducted in

parallel with cache read operations, minimizing the time that the chip is unavailable due to refresh

• The read path from the row cache to the output port is independent of the write path from the I/O module to the sense amplifiers

• This enables a subsequent read access to the cache to be satisfied in parallel with the completion of the write operation

Advanced Advanced DRAM organizationDRAM organization

Page 110: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-110

EDRAMEDRAM Advanced Advanced DRAM organizationDRAM organization

Page 111: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-111

Cache DRAM(CDRAM)Cache DRAM(CDRAM)• Includes a larger SRAM cache than EDRAM• SRAM on the CDRAM can be used in two

ways– Used as a true cache, consisting of a number of

64-bit lines• The cache mode is effective for ordinary random

access to memory– Used as a buffer to support the serial access of a

block of data

Advanced Advanced DRAM organizationDRAM organization

Page 112: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-112

Synchronous DRAM(SDRAM)Synchronous DRAM(SDRAM)• Exchanges data with the processor

synchronized to an external clock signal and running at the full speed of the processor/memory bus without imposing wait states

• With synchronous access, the DRAM moves data in & out under control of the system clock

• Employs burst mode– To eliminate the address setup time and row and column

line precharge time after first access– In burst mode, a series of data bits can be clocked out

rapidly after the first bit has been accessed – Useful when all bits are to be accessed in sequence and

in the same row of the array as the initial access

Advanced Advanced DRAM organizationDRAM organization

Page 113: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-113

SDRAMSDRAM

• Dual-bank architecture that improves opportunities for on-chip parallelism

• The mode register and associated control logic is another key feature differentiating SDRAMs from conventional DRAMs

• It provides a mechanism to customize the SDRAM to suit specific system needs

• The mode register specifies the burst length• SDRAM performs best when it is transferring

large blocks of data serially

Advanced Advanced DRAM organizationDRAM organization

Page 114: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-114

SDRAMSDRAM Advanced Advanced DRAM organizationDRAM organization

Page 115: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-115

RambusRambus DRAM(RDRAM)DRAM(RDRAM)• A more revolutionary approach to the

memory-bandwidth problem• The chip exchanges data with processor over

28 wires no more than 12cm long• The bus can address up to 320 RDRAM chip

and is rated at 500 Mbps• The special RDRAM bus delivers address and

control information using an asynchronous block-oriented protocol

• An RDRAM gets a memory request over the high-speed bus

Advanced Advanced DRAM organizationDRAM organization

Page 116: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-116

RamLinkRamLink• A memory interface with point-to-point

connections arranged in a ring– Traffic on the rings is managed by a memory controller

that sends messages to the DRAM chips

• Data is exchanged in the from of packets• Request packets initiate memory transaction• Strengths

– Provides a scalable architecture that supports a small or large number of DRAMs

– Does not dictate internal DRAM structure

Advanced Advanced DRAM organizationDRAM organization

Page 117: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-117

RamLinkRamLink Advanced Advanced DRAM organizationDRAM organization

• RamLink Architecture

Page 118: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-118

RamLinkRamLink

• Packet format

Advanced Advanced DRAM organizationDRAM organization

Page 119: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-119

Characteristics of TwoCharacteristics of Two--Level MemoriesLevel Memories Appendix 4AAppendix 4A

64 to 4096 bytes64 to 4096 bytes 4 to 128 bytesTypical block size

Indirect accessIndirect accessDirect accessAccess of processor to second level

System softwareCombination of hardware and system software

Implemented by special hardware

Memory management system

1000/11000/15/1Typical access time ratios

Disk CacheVirtual Memory (Paging)

Main Memory Cache

Page 120: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-120

Relative Dynamic Frequency Relative Dynamic Frequency Appendix 4AAppendix 4A

6167-Other-3-92GoTo

3643291120IF

12121531Call

43534Loop4238456774Assign

SystemSystemSystemStudentScientificWorkload

SALCPascal FORTRANPascalLanguage

[TANE78][PATT82][KNUT71][HUCK83]Study

Page 121: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-121

LocalityLocality• Each call is represented by the line moving

down and to the right• Each return is represented by the line

moving up and to the right • A window with depth equal to 5 is defined

– Only a sequence of calls and returns with a net movement of 6 in either direction causes the window to move

Appendix 4AAppendix 4A

Page 122: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-122

The Call/Return Behavior of Programs The Call/Return Behavior of Programs Appendix 4AAppendix 4A

Page 123: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-123

LocalityLocality

• Spatial locality– Refers to the tendency of execution to involve a

number of memory locations that are clustered

• Temporal locality – Refers to the tendency for a processor to access

memory locations that have been used recently

Appendix 4AAppendix 4A

Page 124: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-124

Locality of Reference For Web PagesLocality of Reference For Web Pages Appendix 4AAppendix 4A

Page 125: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-125

Operation of TwoOperation of Two--Level MemoryLevel Memory

Ts = H ×T1 + (1 - H) × (T1 + T2 )= T1 + (1 – H) × T2

whereTs = average(system) access timeT1 = access time of M1 (e.g., cache, disk cache)T2 = access time of M2 (e.g., main memory, disk)H = hit ratio (fraction of time reference is found in M1)

Appendix 4AAppendix 4A

Page 126: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-126

PerformancePerformance

WhereCs = average cost per bit for the combined two-level

memoryC1 = average cost per bit of upper-level memory M1C2 = average cost per bit of lower-level memory M2S1 = size of M1S2 = size of M2We would like Cs ˜ C2

(Given C1 >> C2 , this requires S1 << S2 )

Appendix 4AAppendix 4A

SSSCSCC s

21

2211

?

??

Page 127: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-127

Memory cost Vs. Memory SizeMemory cost Vs. Memory Size Appendix 4AAppendix 4A

Page 128: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-128

PerformancePerformance• Consider the quantity T2 / T1 , which referred

to as the access efficiency

Appendix 4AAppendix 4A

TTT

THs

1

2

1

)1(1

1

???

Page 129: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-129

Access Efficiency Vs. Hit Ratio(TAccess Efficiency Vs. Hit Ratio(T22 / T/ T11 ) ) Appendix 4AAppendix 4A

Page 130: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-130

Hit Ratio Vs. Memory SizeHit Ratio Vs. Memory Size Appendix 4AAppendix 4A

Page 131: Chapter 4 Internal Memorysoc.yonsei.ac.kr/class/material/computersystems/2003/chapter4.pdf– Expressed in terms of bytes Computer memory system overview. 4-6 Yonsei University ...

YonseiYonsei UniversityUniversity4-131

PerformancePerformance• If there is strong locality, it is possible to

achieve high values of hit ratio even with relatively small upper-level memory size– Small cache sizes will yield a hit ratio above 0.75

regardless of the size of main memory– A cache in the range of 1K to 128K words is generally

adequate, whereas main memory is now typically in the multiple-megabyte range

• If we need only a relatively small upper-level memory to achieve good performance, the average cost per bit of the two levels of memory will approach that of the cheaper memory

Appendix 4AAppendix 4A