Chapter 8 CPU and Memory: Design, Implementation, and Enhancement The Architecture of Computer...
Chapter 8: CPU and Memory: Design, Implementation, and Enhancement
The Architecture of Computer Hardware and Systems Software: An Information Technology Approach
3rd Edition, Irv Englander
John Wiley and Sons 2003
CPU Architecture Overview
- CISC: Complex Instruction Set Computer
- RISC: Reduced Instruction Set Computer
- CISC vs. RISC comparisons
- VLIW: Very Long Instruction Word
- EPIC: Explicitly Parallel Instruction Computer
CISC Architecture
- Examples
  - Intel x86, IBM Z-Series mainframes, older CPU architectures
- Characteristics
  - Few general-purpose registers
  - Many addressing modes
  - Large number of specialized, complex instructions
  - Instructions are of varying sizes
Limitations of CISC Architecture
- Complex instructions are infrequently used by programmers and compilers
- Memory references (loads and stores) are slow and account for a significant fraction of all instructions
- Procedure and function calls are a major bottleneck
  - Passing arguments
  - Storing and retrieving values in registers
RISC Features
- Examples
  - PowerPC, Sun SPARC
- Limited and simple instruction set
- Fixed-length, fixed-format instruction words
  - Enable pipelining, parallel fetches and executions
- Limited addressing modes
  - Reduce complicated hardware
- Register-oriented instruction set
  - Reduce memory accesses
- Large bank of registers
  - Reduce memory accesses
- Efficient procedure calls
CISC vs. RISC Processing
Circular Register Buffer
Circular Register Buffer: After Procedure Call
CISC vs. RISC Performance Comparison
- RISC: simpler instructions, so more instructions are needed, which leads to more memory accesses
- RISC: more bus traffic and increased cache memory misses
- More registers would improve CISC performance, but no space is available for them
- Modern CISC and RISC architectures are becoming similar
VLIW Architecture
- Transmeta Crusoe CPU
- 128-bit instruction bundle = molecule
  - Four 32-bit atoms (atom = instruction)
  - Parallel processing of 4 instructions
- 64 general-purpose registers
- Code morphing layer
  - Translates instructions written for other CPUs into molecules
  - Instructions are not written directly for the Crusoe CPU
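The molecule layout described on this slide (four 32-bit atoms in one 128-bit bundle) can be illustrated with a few lines of bit packing. This is only a sketch: the slot order (atom 0 in the low bits) and the names `pack_molecule` and `unpack_molecule` are assumptions made for illustration, not the actual Crusoe encoding.

```python
ATOM_BITS = 32           # width of one atom (instruction), from the slide
ATOMS_PER_MOLECULE = 4   # atoms per 128-bit molecule, from the slide


def pack_molecule(atoms):
    """Pack four 32-bit atoms into one 128-bit integer.

    Assumption: atom 0 occupies the lowest 32 bits. The real
    Crusoe layout is not given in the slides.
    """
    if len(atoms) != ATOMS_PER_MOLECULE:
        raise ValueError("a molecule holds exactly four atoms")
    molecule = 0
    for slot, atom in enumerate(atoms):
        if not 0 <= atom < (1 << ATOM_BITS):
            raise ValueError("each atom must fit in 32 bits")
        molecule |= atom << (slot * ATOM_BITS)
    return molecule


def unpack_molecule(molecule):
    """Split a 128-bit molecule back into its four atoms."""
    mask = (1 << ATOM_BITS) - 1
    return [(molecule >> (slot * ATOM_BITS)) & mask
            for slot in range(ATOMS_PER_MOLECULE)]


atoms = [0x11111111, 0x22222222, 0x33333333, 0x44444444]
molecule = pack_molecule(atoms)
print(f"{molecule:032x}")
print(unpack_molecule(molecule) == atoms)
```

Because all four atoms travel together, the hardware can issue them in the same cycle; deciding which atoms may share a molecule is the job of the code morphing layer.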
EPIC Architecture
- Intel Itanium CPU
- 128-bit instruction bundle
  - Three 41-bit instructions
  - 5 bits to identify type of instructions in bundle
- 128 64-bit general-purpose registers
- 128 82-bit floating point registers
- Intel x86 instruction set included
- Programmers and compilers follow guidelines to ensure parallel execution of instructions
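The bundle arithmetic on this slide works out exactly: 3 × 41 + 5 = 128 bits. The sketch below packs and unpacks such a bundle. Placing the 5-bit type field in the low bits is an assumption for illustration, and `pack_bundle` and `unpack_bundle` are hypothetical helper names.

```python
SLOT_BITS = 41      # one instruction slot, from the slide
TEMPLATE_BITS = 5   # bits identifying the instruction types, from the slide
SLOTS = 3           # instructions per bundle, from the slide

assert SLOTS * SLOT_BITS + TEMPLATE_BITS == 128


def pack_bundle(template, instructions):
    """Pack a 5-bit template and three 41-bit instructions into 128 bits.

    Assumption: the template sits in the lowest 5 bits, followed by
    slot 0, slot 1, and slot 2.
    """
    if not 0 <= template < (1 << TEMPLATE_BITS):
        raise ValueError("template must fit in 5 bits")
    if len(instructions) != SLOTS:
        raise ValueError("a bundle holds exactly three instructions")
    bundle = template
    for i, instr in enumerate(instructions):
        if not 0 <= instr < (1 << SLOT_BITS):
            raise ValueError("each instruction must fit in 41 bits")
        bundle |= instr << (TEMPLATE_BITS + i * SLOT_BITS)
    return bundle


def unpack_bundle(bundle):
    """Return (template, [slot0, slot1, slot2]) from a 128-bit bundle."""
    template = bundle & ((1 << TEMPLATE_BITS) - 1)
    mask = (1 << SLOT_BITS) - 1
    slots = [(bundle >> (TEMPLATE_BITS + i * SLOT_BITS)) & mask
             for i in range(SLOTS)]
    return template, slots


bundle = pack_bundle(0b10000, [1, 2, 3])
print(unpack_bundle(bundle))
```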
Paging
- Managed by the operating system
- Built into the hardware
- Independent of application
Logical vs. Physical Addresses
- Logical addresses are relative locations of data, instructions, and branch targets, and are separate from physical addresses
- Logical addresses are mapped to physical addresses
- Physical addresses do not need to be consecutive
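The mapping from logical to physical addresses can be sketched as a page-table lookup. The 4 KB page size, the table contents, and the function name `translate` are assumptions chosen for illustration; the slides do not specify them.

```python
PAGE_SIZE = 4096  # assumed page size in bytes

# Hypothetical page table: logical page number -> physical frame number.
# The frames are deliberately not consecutive.
page_table = {0: 5, 1: 2, 2: 9}


def translate(logical_address, table, page_size=PAGE_SIZE):
    """Map a logical address to a physical address through a page table."""
    page, offset = divmod(logical_address, page_size)
    if page not in table:
        # In a real system the operating system would handle this fault.
        raise LookupError(f"page fault: page {page} is not mapped")
    return table[page] * page_size + offset


for logical in (0, 4106, 8192):
    print(logical, "->", translate(logical, page_table))
```

The offset within the page is unchanged by translation; only the page number is replaced by a frame number, which is why consecutive logical pages can live in scattered physical frames.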
Logical vs. Physical Address
Page Address Layout
Page Translation Process
Memory Enhancements
- Memory is slow compared to CPU processing speeds!
  - 2 GHz CPU = 1 cycle in ½ of a billionth of a second
  - 70 ns DRAM = 1 access in 70 billionths of a second
- Methods to improve memory accesses
  - Wide path memory access
    - Retrieve multiple bytes instead of 1 byte at a time
  - Memory interleaving
    - Partition memory into subsections, each with its own address register and data register
  - Cache memory
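The size of the speed gap follows from the two figures on this slide, a 2 GHz clock and 70 ns DRAM (70 ns is 70 billionths of a second). A minimal check, with hypothetical function names:

```python
def cycle_time_ns(clock_hz):
    """Length of one clock cycle in nanoseconds."""
    return 1e9 / clock_hz


def cycles_per_access(clock_hz, access_time_ns):
    """Number of CPU cycles that elapse during one memory access."""
    return access_time_ns / cycle_time_ns(clock_hz)


print(cycle_time_ns(2e9))          # 0.5 ns per cycle
print(cycles_per_access(2e9, 70))  # 140.0 cycles per DRAM access
```

A single uncached DRAM access therefore costs on the order of 140 clock cycles, which is the motivation for the methods listed on this slide.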
Memory Interleaving
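One common way to partition memory for interleaving is to let the low-order part of the address select the bank, so that consecutive addresses fall in different banks and can be accessed in overlapping fashion. The sketch below assumes this low-order scheme and four banks; the slides do not say which scheme the figure uses, and `locate` is a hypothetical name.

```python
def locate(address, n_banks=4):
    """Return (bank, offset within bank) under low-order interleaving.

    Assumption: bank = address mod n_banks. Other schemes exist.
    """
    return address % n_banks, address // n_banks


for address in range(8):
    bank, offset = locate(address)
    print(f"address {address}: bank {bank}, offset {offset}")
```

Each bank has its own address and data registers, so a sequential run of addresses keeps all four banks busy instead of waiting on one.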
Why Cache?
- Even the fastest hard disk has an access time of about 10 milliseconds
- A 2 GHz CPU waiting 10 milliseconds wastes 20 million clock cycles!
Cache Memory
- Blocks: 8 or 16 bytes
- Tags: location in main memory
- Cache controller
  - Hardware that checks tags
- Cache line
  - Unit of transfer between storage and cache memory
- Hit ratio: ratio of hits out of total requests
- Synchronizing cache and memory
  - Write through
  - Write back
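The roles of blocks, tags, and the hit ratio can be seen in a small simulation. This sketch assumes a direct-mapped cache with four lines of 16 bytes each and models reads only; the slides do not specify the mapping scheme, and the class name is hypothetical.

```python
class DirectMappedCache:
    """Toy read-only cache: one tag per line, no data stored."""

    def __init__(self, n_lines=4, block_size=16):
        self.n_lines = n_lines
        self.block_size = block_size
        self.tags = [None] * n_lines  # tag held by each cache line
        self.hits = 0
        self.requests = 0

    def access(self, address):
        """Look up an address; return True on a hit, False on a miss."""
        block = address // self.block_size  # which memory block
        index = block % self.n_lines        # which cache line it maps to
        tag = block // self.n_lines         # identifies the block in that line
        self.requests += 1
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.tags[index] = tag              # miss: load the block
        return False

    def hit_ratio(self):
        """Hits divided by total requests."""
        return self.hits / self.requests if self.requests else 0.0


cache = DirectMappedCache()
for address in range(64):   # sequential byte reads
    cache.access(address)
print(cache.hit_ratio())    # 0.9375: one miss per 16-byte block
```

Sequential access misses once per block and then hits on the remaining 15 bytes, which is the effect of transferring a whole cache line at a time.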
Step-by-Step Use of Cache
Step-by-Step Use of Cache (continued)
Performance Advantages
- Hit ratios of 90% are common
- 50%+ improved execution speed
- Locality of reference is why caching works
  - Most memory references are confined to a small region of memory at any given time
  - Well-written program in small loop, procedure, or function
  - Data likely in array
  - Variables stored together
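The effect of the hit ratio on speed can be estimated with a weighted average of cache and memory access times. This is a simplified model that charges a miss only the main-memory time; the 5 ns cache time is an assumed figure, while 70 ns matches the DRAM figure used earlier in the chapter.

```python
def effective_access_time(hit_ratio, cache_ns, memory_ns):
    """Average access time for a given hit ratio (simplified model)."""
    return hit_ratio * cache_ns + (1 - hit_ratio) * memory_ns


CACHE_NS = 5     # assumed cache access time
MEMORY_NS = 70   # DRAM access time

for ratio in (0.0, 0.5, 0.9, 0.99):
    t = effective_access_time(ratio, CACHE_NS, MEMORY_NS)
    print(f"hit ratio {ratio:.2f}: about {t:.1f} ns per access")
```

Under these assumptions a 90% hit ratio brings the average access from 70 ns down to roughly 11.5 ns, so most of the benefit depends on programs exhibiting locality of reference.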
Two-level Caches
Why do the sizes of the caches have to be different?
Cache vs. Virtual Memory
- Cache speeds up memory access
- Virtual memory increases amount of perceived storage
  - Independence from the configuration and capacity of the memory system
  - Low cost per bit
Modern CPU Processing Methods
- Timing issues
- Separate fetch/execute units
- Pipelining
- Scalar processing
- Superscalar processing
Timing Issues
- Computer clock used for timing purposes
  - MHz: million steps per second
  - GHz: billion steps per second
- Instructions can (and often do) take more than one step
- Data word width can require multiple steps
Separate Fetch-Execute Units
- Fetch unit
  - Instruction fetch unit
  - Instruction decode unit
    - Determine opcode
    - Identify type of instruction and operands
  - Several instructions are fetched in parallel and held in a buffer until decoded and executed
  - IP: Instruction Pointer register
- Execute unit
  - Receives instructions from the decode unit
  - Appropriate execution unit services the instruction
Alternative CPU Organization
Instruction Pipelining
- Assembly-line technique to allow overlapping between fetch-execute cycles of sequences of instructions
- Only one instruction is being executed to completion at a time
- Scalar processing
  - Average instruction execution rate is approximately equal to the clock speed of the CPU
- Problems from stalling
  - Instructions have different numbers of steps
- Problems from branching
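The claim that a scalar pipeline completes roughly one instruction per clock can be checked with a cycle count. The sketch assumes an idealized pipeline in which every instruction passes through the same k one-cycle stages; stalls and branch penalties are lumped into a single `stall_cycles` parameter. The function names are hypothetical.

```python
def unpipelined_cycles(n_instructions, n_stages):
    """Each instruction runs all stages before the next one starts."""
    return n_instructions * n_stages


def pipelined_cycles(n_instructions, n_stages, stall_cycles=0):
    """Ideal pipeline: fill once, then finish one instruction per cycle."""
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1) + stall_cycles


n, k = 100, 5
print(unpipelined_cycles(n, k))                 # 500
print(pipelined_cycles(n, k))                   # 104
print(pipelined_cycles(n, k, stall_cycles=20))  # 124
```

With 100 instructions and 5 stages the ideal pipeline needs 104 cycles, close to one instruction per cycle; every stall or mispredicted branch adds to that total.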
Branch Problem Solutions
- Separate pipelines for both possibilities
- Probabilistic approach
- Requiring the following instruction to not be dependent on the branch
- Instruction reordering (superscalar processing)
Pipelining Example
Superscalar Processing
- Process more than one instruction per clock cycle
- Separate fetch and execute cycles as much as possible
- Buffers for fetch and decode phases
- Parallel execution units
Superscalar CPU Block Diagram
Scalar vs. Superscalar Processing
Superscalar Issues
- Out-of-order processing: dependencies (hazards)
- Data dependencies
- Branch (flow) dependencies and speculative execution
  - Parallel speculative execution or branch prediction
  - Branch History Table
- Register access conflicts
  - Logical registers
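A branch history table can be sketched as an array of small counters indexed by branch address. The version below uses 2-bit saturating counters, a common textbook scheme; the slides name the table but not its internals, so the counter width, table size, and all identifiers here are assumptions.

```python
class BranchHistoryTable:
    """Toy predictor: one 2-bit saturating counter per table entry."""

    def __init__(self, n_entries=16):
        self.n_entries = n_entries
        # Counter values 0-1 predict "not taken", 2-3 predict "taken".
        self.counters = [1] * n_entries

    def _index(self, branch_address):
        return branch_address % self.n_entries

    def predict(self, branch_address):
        """Return True if the branch is predicted taken."""
        return self.counters[self._index(branch_address)] >= 2

    def update(self, branch_address, taken):
        """Move the counter toward the actual outcome, saturating at 0 and 3."""
        i = self._index(branch_address)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def simulate(outcomes, branch_address=0x40):
    """Count correct predictions for one branch over a list of outcomes."""
    table = BranchHistoryTable()
    correct = 0
    for taken in outcomes:
        if table.predict(branch_address) == taken:
            correct += 1
        table.update(branch_address, taken)
    return correct


# A loop branch: taken nine times, then falls through once.
loop_outcomes = [True] * 9 + [False]
print(simulate(loop_outcomes), "of", len(loop_outcomes), "predicted correctly")
```

For the loop branch above the predictor is wrong only on the first iteration and on the exit, giving 8 correct predictions out of 10.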
Hardware Implementation
- Hardware: operations are implemented by logic gates
- Advantages
  - Speed
- RISC designs are simple and typically implemented in hardware
Microprogrammed Implementation
- Microcode consists of tiny programs stored in ROM that carry out CPU instructions
- Advantages
  - More flexible
  - Easier to implement complex instructions
  - Can emulate other CPUs
- Disadvantage
  - Requires more clock cycles