EC 413 Computer Organization · 2019-12-03
Department of Electrical & Computer Engineering
EC 413 Computer Organization
Prof. Michel A. Kinsy
Summary
Computing Devices Then…
Computing Devices Now
The Von Neumann Architecture
§ Stored Program Computer
§ Program = A sequence of instructions
[Figure: a processor (GPRs, ALU, ALU control) exchanging data with a memory that holds the values 2, 114, 17, and 100 at addresses 0 to 3; output, input, and control paths connect the processor to the real/physical world. Instruction semantics shown: rd ← (rs) func (rt). Instruction format shown: funct7 (7 bits) | rs2 (5) | rs1 (5) | funct3 (3) | rd (5) | opcode (7).]
The Von Neumann Architecture
§ The modern computer system has three major functional hardware units: CPU, Main Memory, and Input/Output devices
[Figure: a processor datapath (PC, register file, ALU, sign extend, control unit), a memory, and I/O devices (Device #1 to Device #n) connected by the control bus, address bus, and data bus; the I/O devices face the external world.]
The Von Neumann Architecture
§ In the most basic sense, a computer is a device consisting of three units performing three distinct functions
• A processor to interpret and execute programs
• A memory to store both data and programs
• A mechanism for transferring data to and from the outside world
The Von Neumann Architecture
• Memory
  • Stores both program and data
• Control unit
  • Directs the operations of the other units in the processor by providing timing and control signals
• ALU / Execution Unit(s)
  • Performs arithmetic and logical operations such as addition, subtraction, multiplication, and division
The Von Neumann Architecture
• Input
  • An input device gets data from users
  • Examples are keyboards, mice, microphones, and secondary storage devices
• Output
  • An output device sends data to users
  • Typical output devices are monitors, printers, and displays
Taxonomy of Processors
Processor Organizations
• Single instruction, single data stream (SISD)
  • Uniprocessor
• Single instruction, multiple data stream (SIMD)
  • Vector Processor
  • Array Processor
• Multiple instruction, single data stream (MISD)
• Multiple instruction, multiple data stream (MIMD)
  • Shared Memory (Tightly Coupled)
    • Symmetric Multiprocessor (SMP)
    • Nonuniform Memory Access (NUMA)
  • Distributed Memory (Loosely Coupled)
    • Cluster
[Figure: the Taxonomy of Processors diagram repeated with one callout added: "Covered in detail in this class". The extracted text does not preserve which organization the callout marks.]
Amdahl's Law Revisited
§ This law answers the critical question:
§ How much of a speedup can one get for a given parallelized task?
§ If s is the fraction of a calculation that is sequential, and (1 - s) is the fraction that can be parallelized, then the maximum speed-up that can be achieved by using n processors is
$$\text{Speed-up} = \frac{1}{s + \frac{1 - s}{n}}$$
Amdahl's Law Revisited
§ If 80% of a calculation can be parallelized, i.e., 20% is sequential, then what is the maximum speed-up that can be achieved on 8 processors?
§ What if we double the number of processors (n = 16)?
§ What if we double the number of processors again (n = 32)?
§ What if the number of processors is 1000?
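The questions above can be checked numerically. The following is a minimal sketch that evaluates the speed-up formula for s = 0.2; the function name `amdahl_speedup` is an illustrative choice, not something from the slides.

```python
def amdahl_speedup(s: float, n: int) -> float:
    """Maximum speed-up with sequential fraction s on n processors."""
    return 1.0 / (s + (1.0 - s) / n)

# 80% parallelizable means the sequential fraction is s = 0.2
speedups = {n: amdahl_speedup(0.2, n) for n in (8, 16, 32, 1000)}
for n, value in speedups.items():
    print(f"n = {n:4d}: speed-up = {value:.2f}")
```

The values come out to roughly 3.33, 4.00, 4.44, and 4.98. However many processors are added, the speed-up stays below 1/s = 5.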
Amdahl's Law Revisited
§ If 50% of a calculation can be parallelized, i.e., 50% is sequential, then what is the maximum speed-up that can be achieved on 8 processors?
§ What if we double the number of processors (n = 16)?
§ What if we double the number of processors again (n = 32)?
§ What if the number of processors is 1000?
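As a worked instance of the speed-up formula for s = 0.5, the four cases can be written out as below. The rounding to two or three decimals is ours.

```latex
\begin{align*}
n = 8:    &\quad \frac{1}{0.5 + 0.5/8}    = \frac{1}{0.5625}   \approx 1.78 \\
n = 16:   &\quad \frac{1}{0.5 + 0.5/16}   = \frac{1}{0.53125}  \approx 1.88 \\
n = 32:   &\quad \frac{1}{0.5 + 0.5/32}   = \frac{1}{0.515625} \approx 1.94 \\
n = 1000: &\quad \frac{1}{0.5 + 0.5/1000} = \frac{1}{0.5005}   \approx 1.998 \\
n \to \infty: &\quad \frac{1}{s} = 2
\end{align*}
```

With half of the work sequential, no number of processors can do better than a factor of 2.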
Computing Layered View
Operating System
Programming Language
Applications & Algorithms
Firmware
Datapath & Control
Digital Design
Circuit Design
Layout
I/O system, Processor, Memory organization, ISA
Compiler
Bridging/Compiling Process
§ High-Level Language
C/C++/Java program → compiler → assembly code → assembler → object code (+ library routines) → linker → executable → loader → memory
Stage labels in the figure: Human Readable (source and assembly code), Machine Code (object code onward)
The Big Picture
C program → compiler → assembly code → assembler → object code (+ library routines) → linker → executable → loader → memory
[Figure: the program loaded in memory feeds the processor and memory diagram from the Von Neumann Architecture slides.]
Program memory management
Layout, from higher addresses to lower addresses:
• Stack Segment [Stack frames consisting of parameters, return addresses, and local variables] (Variable Size); the bottom of the stack sits at the highest address and the top of the stack borders the free space
• Free space
• Heap Segment [Dynamic variables managed by malloc(), free(), etc.] (Variable Size)
• BSS Segment [Uninitialized global and static variables] (Fixed Size)
• Data Segment [Initialized global and static variables] (Fixed Size)
• Text Segment [Program code] (Fixed Size)
Stack Structure
Stack contents, from higher addresses to lower addresses:
• Bottom of stack
• Value
• Value
• …
• Other value
• Parameter p3 (start of the stack frame of the function)
• Parameter p2
• Parameter p1
• Return address
• Variable X
• Variable Y
• Variable Z (end of the stack frame of the function)
Associated C function code
int function(int p1, int p2, int p3) {
    int X, Y, Z;
    …
}
Stack Structure
§ Procedure frame or activation record
• Before call: the stack runs from the bottom of the stack to "Other value", and sp points at "Other value"
• During call: a frame holding the return address, argument registers, saved registers, and local variables is added past "Other value", and sp points at the end of that frame
• After call: the frame is released and sp points at "Other value" again
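The before/during/after behavior of sp can be mimicked with a small model. This is a sketch under stated assumptions: the `Stack` class, the base address 0x1000, and the 8-byte slot size are all illustrative choices, not details from the slides.

```python
class Stack:
    """Toy downward-growing stack; addresses and sizes are illustrative."""

    def __init__(self, base: int = 0x1000) -> None:
        self.sp = base          # stack pointer before any frame is pushed
        self.memory = {}        # address -> stored value

    def push_frame(self, words: list) -> None:
        """Allocate one 8-byte slot per word and store the frame contents."""
        self.sp -= 8 * len(words)
        for i, word in enumerate(words):
            self.memory[self.sp + 8 * i] = word

    def pop_frame(self, n_words: int) -> None:
        """Release the frame by moving sp back up."""
        for i in range(n_words):
            del self.memory[self.sp + 8 * i]
        self.sp += 8 * n_words

stack = Stack()
sp_before = stack.sp
# Lowest address first; the return address ends up at the highest address.
frame = ["local variables", "saved registers", "arg registers", "return address"]
stack.push_frame(frame)
sp_during = stack.sp
stack.pop_frame(len(frame))
sp_after = stack.sp
print(hex(sp_before), hex(sp_during), hex(sp_after))
```

During the call sp sits 32 bytes below its starting value. After the call it returns to exactly where it was.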
Heap Structure
§ The heap is allocated on demand or by request using C memory management functions such as malloc(), calloc(), realloc(), etc.
§ It allows data (especially arrays) to take on variable sizes
§ It allows locally created variables to live past the end of the routine
§ This is what permits many of the structures used in Data Structures and Algorithms
Application Compiling Process
§ A compiler is a software program that translates code in a human-oriented high-level programming language into computer-oriented machine language
[Figure: Input: Source Program (C, C++, etc.) → Compiler → Output: Target Program (RISC-V, MIPS, x86, etc.); the compiler also emits error messages.]
Application Compiling Process
§ Assembly language program (for RISC-V)
swap:
    addi sp,sp,-48
    ...
    mv a5,a1
    ...
    ld s0,40(sp)
    addi sp,sp,48
    jr ra
§ Machine (object, binary) code (for RISC-V), produced one-to-one by the assembler
111111010000 00010 000 00010 0010011
000000110000 00010 000 01000 0010011
...
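The one-to-one mapping can be reproduced for the two machine words shown. This is a minimal sketch of the RISC-V I-type encoding for `addi`; the helper names `encode_addi` and `fields` are ours. The second word decodes as `addi s0,sp,48`, and we assume it is one of the instructions elided by "..." in the listing.

```python
def encode_addi(rd: int, rs1: int, imm: int) -> int:
    """Encode RISC-V `addi rd, rs1, imm` (I-type, opcode 0010011, funct3 000)."""
    if not -2048 <= imm <= 2047:
        raise ValueError("immediate must fit in 12 signed bits")
    imm12 = imm & 0xFFF                     # two's-complement, 12 bits
    return (imm12 << 20) | (rs1 << 15) | (0b000 << 12) | (rd << 7) | 0b0010011

def fields(word: int) -> str:
    """Format a 32-bit word as imm | rs1 | funct3 | rd | opcode bit groups."""
    bits = format(word, "032b")
    return " ".join([bits[0:12], bits[12:17], bits[17:20], bits[20:25], bits[25:32]])

SP, S0 = 2, 8                               # x2 is sp, x8 is s0 in the RISC-V ABI
first = fields(encode_addi(SP, SP, -48))    # addi sp,sp,-48
second = fields(encode_addi(S0, SP, 48))    # addi s0,sp,48 (assumed elided line)
print(first)
print(second)
```

Both printed bit strings match the machine code on the slide, field for field.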
Application Compiling Process
§ Detailed compilation process
High-level language → Scanner (lexical analysis) → Parser (syntax analysis) → Semantic Analysis (IC generator) → Code Optimizer → Code Generator → Target language
• The phases share a Symbols & Attributes Table
• The earlier phases are language-focused transformations; the later phases are architecture-focused transformations
§ More on this later when you take a course on compilers
Instruction Set Architecture (ISA)
§ Instructions are the language the computer understands
§ The instruction set is the vocabulary of that language
§ It serves as the hardware/software interface
  § Defines data types
    § byte, int, float, double, string, vector…
  § Defines the set of programmer-visible state
    § Known as the programmer's model of the machine
  § Defines instruction semantics (operations, sequencing)
    § operand location: register, immediate, indirect, . . .
    § add, sub, mul, move, compare, …
Instruction Set Architecture (ISA)
§ Many possible implementations of the same ISA
  § 360 implementations: model 30 (c. 1964), z900 (c. 2001)
  § x86 implementations: 8086 (c. 1978), 80186, 286, 386, 486, Pentium, Pentium Pro, Pentium-4, Core i7, AMD Athlon, AMD Opteron, Transmeta Crusoe, SoftPC
  § MIPS implementations: R2000, R4000, R10000, ...
  § JVM: HotSpot, PicoJava, ARM Jazelle, ...
  § RISC-V: RV32I, RV32E, RV64I, RV128I, …
    § Open-Source
Central Processing Unit (CPU)
§ Central Processing Unit (CPU) Organization
§ CPU Execution Process
  1. Fetch Instruction
  2. Decode Instruction
  3. Execute Operation
  4. Memory Operation
  5. Register Writeback Operation
[Figure: Fetch Instruction → Decode (increment PC, read registers) → ALU operation or branch address → Data memory operation → Write back]
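The five steps can be walked through with a toy interpreter. This is a sketch, not the datapath from the slides: instructions are plain tuples rather than 32-bit encodings, only `add`, `addi`, `lw`, and `sw` are modeled, and the function name `run` is ours.

```python
def run(program, data_memory, steps):
    """Run a toy CPU for `steps` instructions; returns the register file."""
    regs = [0] * 32
    pc = 0
    for _ in range(steps):
        inst = program[pc // 4]                 # 1. fetch
        op, a, b, c = inst                      # 2. decode
        pc += 4                                 #    (increment PC)
        result, address = None, None
        if op == "add":                         # 3. execute
            result = regs[b] + regs[c]
        elif op == "addi":
            result = regs[b] + c
        elif op in ("lw", "sw"):
            address = regs[b] + c
        if op == "lw":                          # 4. memory
            result = data_memory[address]
        elif op == "sw":
            data_memory[address] = regs[a]
        if result is not None and a != 0:       # 5. write back (x0 stays zero)
            regs[a] = result
    return regs

program = [
    ("addi", 1, 0, 5),    # x1 = 5
    ("addi", 2, 0, 7),    # x2 = 7
    ("add", 3, 1, 2),     # x3 = x1 + x2
    ("sw", 3, 0, 16),     # memory[16] = x3
    ("lw", 4, 0, 16),     # x4 = memory[16]
]
memory = {}
regs = run(program, memory, steps=len(program))
print(regs[3], regs[4], memory)
```

Not every instruction uses every step: `add` skips the memory step, and `sw` skips write back.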
Single Cycle RISC-V CPU
[Figure: single-cycle datapath. The PC addresses the instruction memory and an adder computes PC + 4. Instr[19-15] and Instr[24-20] select the register file read ports and Instr[11-7] the write port. The sign-extend unit produces 32-bit immediates from 12- or 20-bit fields. The ALU (with zero and overflow outputs) is steered by ALU control, which takes ALUOp and Instr[30, 14-12], and its result addresses the data memory. The control unit drives RegWrite, ALUSrc, MemRead, MemWrite, MemtoReg, Branch, Jump, and PCSrc; a shift-left-1 unit and a second adder form the branch target.]
Multi-Stage RISC-V CPU
[Figure: multi-stage RISC-V datapath with the PC, memory, register file, sign extend (12- or 20-bit fields to 32 bits), ALU with ALU control, a branch adder with shift left 1, a PC + 4 adder, and a control unit driving RegWrite, ALUSrc, MemRead, MemWrite, MemtoReg, Branch, and PCSrc.]
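Fully Bypassed Datapath
[Figure: pipelined datapath with decode (D), execute (E), memory (M), and write-back (W) stages. Each stage carries its own instruction register (IR). Multiplexers controlled by ASrc and BSrc feed the ALU inputs A and B. The instruction memory, GPRs, immediate extender, ALU, and data memory are shown together with the stall condition logic, nop insertion, and the PC path used by JAL.]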
Performance Measurement
§ Processor performance:
  § Execution time
  § Area
  § Logic complexity
  § Power
§ In this class we will focus on execution time
$$\frac{\text{Time}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Time}}{\text{Cycle}}$$
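The execution time equation can be evaluated directly. The following is a small sketch; the workload numbers (one billion instructions, CPI of 1.5, 2 GHz clock) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def execution_time(instructions: int, cpi: float, clock_hz: float) -> float:
    """Time/Program = Instructions/Program * Cycles/Instruction * Time/Cycle."""
    cycle_time = 1.0 / clock_hz             # seconds per cycle
    return instructions * cpi * cycle_time

# Hypothetical workload: 1 billion instructions, CPI of 1.5, 2 GHz clock
t = execution_time(1_000_000_000, 1.5, 2e9)
print(f"{t:.2f} s")
```

Halving any one of the three factors halves the execution time, which is why each factor is a separate target for optimization.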
Amdahl's Law
§ By Gene Amdahl
§ This law answers the critical question:
  § How much of a speedup can one get for a given architectural improvement/enhancement?
  § The performance enhancement possible due to a given design improvement is limited by the amount that the improved feature is used
§ Performance improvement or speedup due to enhancement E
$$\text{Speedup}(E) = \frac{\text{Execution Time without } E}{\text{Execution Time with } E} = \frac{\text{Performance with } E}{\text{Performance without } E}$$
Processor-Memory Gap
§ Performance gap: CPU (55% each year) vs. DRAM (7% each year)
§ Processor operations take on the order of 1 ns
§ Memory access requires 10s or even 100s of ns
§ Each instruction executed involves at least one memory access
[Figure: relative performance (log scale, 1 to 10^6) versus calendar year (1980 to 2010) for processor and memory.]
Memory Technology
§ The single-transistor DRAM cell is considerably simpler than the SRAM cell
§ This leads to dense, high-capacity DRAM memory chips
[Figure: (a) DRAM cell: word line, pass transistor, capacitor, bit line. (b) Typical SRAM cell: word line, bit line, complementary bit line, Vcc.]
A Typical Memory Hierarchy
[Figure: Register File (with bypass network) → Instruction Cache and Data Cache → L2 Cache → L3 Cache → Main Memory → Disk. Toward the register file, speed increases and capacity decreases; toward the disk, capacity increases and speed decreases. A bracket marks the levels that are inside the processor.]
Memory Organization
§ A memory cannot be both large and fast
§ Increasing sizes of cache at each level
§ A hit at a level occurs if that level of the memory contains the data needed by the CPU
§ A miss occurs if the level does not contain the requested data
CPU → L1 → L2 → DRAM
A Typical Memory Hierarchy
• CPU with RF: multi-ported register file (part of CPU)
• L1 Instruction Cache and L1 Data Cache: split instruction & data primary caches (on-chip SRAM)
• Unified L2 Cache: large unified secondary cache (on-chip SRAM)
• Memory banks: multiple interleaved memory banks (off-chip DRAM)
Multilevel Caches
§ Cache is transparent to user (happens automatically)
[Figure: CPU with Reg File ↔ Cache Memory ↔ Main Memory, with transfers labeled Word and Line.]
§ Data is in the cache a fraction h of the time
§ Go to main memory 1 – h of the time
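The hit fraction h translates into an average access time. The sketch below uses a common model that is not stated on the slide: a hit costs the cache access time, and a miss costs the cache lookup plus the main memory access. The timings (1 ns and 100 ns) are hypothetical.

```python
def effective_access_time(h: float, t_cache: float, t_main: float) -> float:
    """Average access time when a miss costs the cache lookup plus main memory."""
    return h * t_cache + (1.0 - h) * (t_cache + t_main)

# Hypothetical timings: 1 ns cache, 100 ns main memory
times = {h: effective_access_time(h, 1.0, 100.0) for h in (0.90, 0.99)}
for h, t in times.items():
    print(f"h = {h:.2f}: {t:.1f} ns")
```

Under these assumptions, raising h from 0.90 to 0.99 cuts the average from about 11 ns to about 2 ns. Small changes in hit rate therefore matter a great deal.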
Caches
§ Local miss rate = misses in cache / accesses to cache
§ Global miss rate = misses in cache / CPU memory accesses
§ Misses per instruction = misses in cache / number of instructions
CPU → L1 → L2 → DRAM
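The three definitions can be applied to an L1/L2 pair as follows. This is a sketch with hypothetical counts, and it assumes that L2 is accessed only on L1 misses.

```python
def miss_rates(cpu_accesses, l1_misses, l2_misses, instructions):
    """Local/global miss rates and misses per instruction for an L1/L2 pair."""
    return {
        "l1_local": l1_misses / cpu_accesses,      # L1 sees every CPU access
        "l1_global": l1_misses / cpu_accesses,
        "l2_local": l2_misses / l1_misses,         # L2 is accessed only on L1 misses
        "l2_global": l2_misses / cpu_accesses,
        "l1_mpi": l1_misses / instructions,
        "l2_mpi": l2_misses / instructions,
    }

# Hypothetical counts: 1000 CPU accesses, 40 L1 misses, 10 L2 misses, 800 instructions
r = miss_rates(1000, 40, 10, 800)
for name, value in r.items():
    print(f"{name}: {value}")
```

For L1 the local and global rates coincide. For L2 the local rate (0.25 here) looks much worse than the global rate (0.01) because L2 only sees the accesses that L1 could not serve.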
Address Bit-Field Partitioning
§ The address (e.g., 32-bit) issued by the CPU is generally divided into 3 fields
  § Tag
    § Serves as the unique identifier for a group of data
    § Different regions of memory may be mapped to the same cache location/block
    § The tag is used to differentiate between them
  § Index
    § It is used to index into the cache structure
  § Block Offset
    § The least significant bits are used to determine the exact data word
    § If the block size is B then b = log2(B) bits will be needed in the address to specify the data word
Address: | Tag (t bits) | Index (k bits) | Block Offset (b bits) |
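The field split amounts to a few shifts and masks. This is a minimal sketch; the cache geometry (256 lines, 16-byte blocks) and the function name `split_address` are illustrative assumptions.

```python
def split_address(addr: int, k: int, b: int, width: int = 32):
    """Split an address into (tag, index, block offset) with k index and b offset bits."""
    t = width - k - b                       # remaining high bits form the tag
    offset = addr & ((1 << b) - 1)
    index = (addr >> b) & ((1 << k) - 1)
    tag = (addr >> (b + k)) & ((1 << t) - 1)
    return tag, index, offset

# Hypothetical cache: 2**8 = 256 lines, 16-byte blocks (b = log2(16) = 4)
tag, index, offset = split_address(0x12345678, k=8, b=4)
print(hex(tag), hex(index), hex(offset))
```

For this address the tag is 0x12345, the index is 0x67, and the block offset is 0x8.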
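Direct-Mapped Cache
[Figure: the address is split into Tag (t bits), Index (k bits), and Block Offset (b bits). The index selects one of 2^k lines, each holding a valid bit (V), a tag, and a data block. The stored tag is compared (=) with the address tag to produce HIT, and the block offset selects the data word or byte.]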
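The lookup in the figure can be modeled in a few lines. This sketch tracks only valid bits and tags, with no data blocks; the class name, the tiny geometry (4 lines of 16 bytes), and the address trace are illustrative assumptions.

```python
class DirectMappedCache:
    """Direct-mapped cache model that tracks hits and misses only (no data)."""

    def __init__(self, k: int, b: int) -> None:
        self.k, self.b = k, b
        self.valid = [False] * (1 << k)    # V bit per line
        self.tags = [0] * (1 << k)         # stored tag per line

    def access(self, addr: int) -> bool:
        """Return True on a hit; on a miss, fill the line and return False."""
        index = (addr >> self.b) & ((1 << self.k) - 1)
        tag = addr >> (self.b + self.k)
        if self.valid[index] and self.tags[index] == tag:
            return True
        self.valid[index] = True
        self.tags[index] = tag
        return False

cache = DirectMappedCache(k=2, b=4)        # 4 lines of 16 bytes (illustrative)
trace = [0x00, 0x04, 0x40, 0x00]
results = [cache.access(a) for a in trace]
print(results)
```

Addresses 0x00 and 0x40 share index 0 but differ in tag. The access to 0x40 evicts the first block, so the final access to 0x00 misses again even though three lines are still empty.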
Caching principles
§ Cache size (in bytes or words)
  § Total cache capacity
  § A larger cache can hold more of the program's useful data but is more costly and likely to be slower
§ Block or cache-line size
  § Unit of data transfer between cache and main memory
  § With a larger cache line, more data is brought into the cache with each miss. This can improve the hit rate but may also bring low-utility data into the cache
Caching principles
§ Placement policy
  § Determining where an incoming cache line is stored
  § More flexible policies imply higher hardware cost and may or may not have performance benefits (due to more complex data location)
§ Replacement policy
  § Determining which of several existing cache blocks (into which a new cache line can be mapped) should be overwritten
  § Typical policies: choosing a random or the least recently used block
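Least-recently-used replacement within one set can be sketched with an ordered dictionary. The class name `LRUSet`, the two-way set, and the tag trace are illustrative assumptions; real hardware typically approximates LRU rather than tracking exact order.

```python
from collections import OrderedDict

class LRUSet:
    """One cache set with `ways` blocks and least-recently-used replacement."""

    def __init__(self, ways: int) -> None:
        self.ways = ways
        self.blocks = OrderedDict()        # tag -> None, oldest first

    def access(self, tag: int) -> bool:
        """Return True on a hit; on a miss, evict the LRU block if the set is full."""
        if tag in self.blocks:
            self.blocks.move_to_end(tag)   # mark as most recently used
            return True
        if len(self.blocks) == self.ways:
            self.blocks.popitem(last=False)  # drop the least recently used tag
        self.blocks[tag] = None
        return False

s = LRUSet(ways=2)
hits = [s.access(t) for t in (1, 2, 1, 3, 2, 1)]
print(hits, list(s.blocks))
```

In this trace only the third access hits. Cycling through three tags in a two-way set keeps evicting the block that is needed next, a known weak spot of LRU.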
Caching Principles
§ Compulsory misses
  § With on-demand fetching, the first access to any item is a miss
§ Capacity misses
  § We have to evict some items to make room for others
  § This leads to misses that are not incurred with an infinitely large cache
§ Conflict misses
  § The placement scheme may force us to displace useful items to bring in other items
  § This may lead to misses in the future
Caching principles
§ Line width (2^W)
  § Too small a value for W causes a lot of main memory accesses
  § Too large a value increases the miss penalty and may tie up cache space with low-utility items that are replaced before being used
§ Set size or associativity (2^S)
  § Direct mapping (S = 0) is simple and fast
  § Greater associativity leads to more complexity, and thus slower access, but tends to reduce conflict misses
Caching Principles
§ Cache contains copies of some of Main Memory
  § Those storage locations recently used
§ When Main Memory address A is referenced in CPU
§ Cache checked for a copy of contents of A
  § If found, cache hit
    § Copy used
    § No need to access Main Memory
  § If not found, cache miss
    § Main Memory accessed to get contents of A
    § Copy of contents also loaded into cache
Cache Performance Metrics
§ Cache miss rate
  § Number of cache misses divided by number of accesses
§ Cache hit time
  § Time between sending address and data returning from cache
§ Cache miss latency
  § Time between sending address and data returning from next-level cache/memory
§ Cache miss penalty
  § Extra processor stall caused by next-level cache/memory access
I/O Interface
§ Basic I/O hardware
  § Ports, buses, devices and controllers
§ I/O Software
  § Interrupt Handlers
  § Device Driver
  § Device-Independent Software
  § User-Space I/O Software
§ Three ways to perform I/O operations
  § Polling
  § Interrupt
  § Direct Memory Access (DMA)
I/O Services
§ In a general-purpose computer, the CPU needs to interact with I/O devices connected to the computer (e.g., keyboard, display, disk drives, network, etc.)
§ I/O devices are connected to the computer through controllers
[Figure: CPU, memory, and several controllers attached to the system bus.]
Processor I/O
§ Two approaches
  § Memory-mapped I/O
    § Devices mapped to reserved memory locations, like RAM
    § Uses load/store instructions just like accesses to memory
  § Direct-mapped I/O
    § Special bus line
    § Special instructions
[Figure: two bus diagrams. Memory-mapped I/O: the CPU, memory, and I/O device share the address, data, and read/write lines. Direct-mapped I/O: the same lines plus a Memory/I/O select line and an I/O port.]
Memory-Mapped I/O
§ I/O devices are assigned memory locations
§ Interactions with the I/O device are done through memory load and store operations
[Figure: CPU, memory, and several controllers attached to the system bus.]
Memory-Mapped I/O
[Figure: a virtual address space and a physical address space, with address labels 0x0000 0000, 0x00FF FFFF, and 0xFFFF FFFF; I/O controllers for the display, disk, keyboard, and network occupy part of the address space.]
I/O Memory Mappings
[Figure: three address-space layouts spanning 0 to 0xFFFFFFFF: separate I/O and memory space (memory plus distinct I/O ports), memory-mapped I/O, and a hybrid of memory-mapped and separate I/O.]
Memory-Mapped I/O
• Instead of using special methods to access values to be read or written, the CPU just gets them from memory or puts them into memory to be accessed by the device
• The device is connected directly to certain main memory locations
• Two types of information to/from the device
  § Status
  § Value read/write
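The status/value pair can be illustrated with a toy bus model in which two addresses belong to a device instead of RAM. Everything named here is hypothetical: the register addresses, the `READY` bit, and the `Bus` class exist only for this sketch.

```python
STATUS_ADDR = 0xFFFF0000   # hypothetical address of the device status register
DATA_ADDR = 0xFFFF0004     # hypothetical address of the device data register
READY = 1

class Bus:
    """Routes loads and stores either to RAM or to a toy input device."""

    def __init__(self, device_input: bytes) -> None:
        self.ram = {}
        self.pending = list(device_input)

    def load(self, addr: int) -> int:
        if addr == STATUS_ADDR:
            return READY if self.pending else 0
        if addr == DATA_ADDR:
            return self.pending.pop(0)     # reading the value consumes it
        return self.ram.get(addr, 0)

    def store(self, addr: int, value: int) -> None:
        self.ram[addr] = value

bus = Bus(b"hi")
dest = 0x100
while bus.load(STATUS_ADDR) == READY:      # poll the status register
    bus.store(dest, bus.load(DATA_ADDR))   # copy the value into ordinary memory
    dest += 1
received = bytes(bus.ram[a] for a in range(0x100, dest))
print(received)
```

The program uses only ordinary loads and stores. Which addresses reach the device is decided by the bus, not by the instruction.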
I/O Services
§ Each device controller is in charge of a particular device type
§ Each device controller has some local buffer
§ CPU moves data from/to main memory to/from device local buffers
[Figure: CPU, memory, and several controllers attached to the system bus.]
I/O Services
§ Device controller informs the CPU that it has an operation to execute or that it has finished its operation by causing an interrupt
§ Interrupts provide a mechanism for devices to gain the CPU's attention
[Figure: CPU, memory, and several controllers attached to the system bus.]
Interrupts Execution Flow
§ I/O devices have unique or shared Interrupt Request Lines (IRQs)
[Figure: CPU, memory, and controllers on the system bus, with Interrupt Request Lines (IRQs) running from the controllers to the CPU.]
Interrupts Execution Flow
§ Interrupt Request Lines are sent to the Programmable Interrupt Controller (PIC) hardware unit to generate the corresponding interrupt vectors
[Figure: Interrupt Request Lines (IRQs) 0 to N feeding the Programmable Interrupt Controller (PIC).]
Interrupts Execution Flow
§ Interrupt vectors are sent to the Interrupt Descriptor Table (IDT) to locate the corresponding Interrupt Handler subroutine information
[Figure: IRQs 0 to N → Programmable Interrupt Controller (PIC) → Interrupt Descriptor Table (IDT) with entries 0 to 255, each pointing to a handler.]
Interrupts Execution Flow
§ Interrupt Handler subroutine information consists of a new stack pointer, program counter, and system state
§ The new program counter is loaded into the PC register for execution
[Figure: IRQs 0 to N → Programmable Interrupt Controller (PIC) → Interrupt Descriptor Table (IDT, entries 0 to 255) → Interrupt Handler, whose program counter is loaded into the PC.]
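The path from an IRQ line to the handler's program counter can be traced with a small model. This is a sketch with hypothetical numbers: the IRQ-to-vector mapping, the handler addresses, and the `saved_pc` field are our assumptions, not values from the slides.

```python
from dataclasses import dataclass

@dataclass
class HandlerInfo:
    stack_pointer: int
    program_counter: int
    system_state: str

# Hypothetical PIC mapping from IRQ line to interrupt vector
IRQ_TO_VECTOR = {0: 32, 1: 33}

# Hypothetical IDT with 256 entries; only two are populated here
idt = [None] * 256
idt[32] = HandlerInfo(stack_pointer=0x8000, program_counter=0x4000, system_state="kernel")
idt[33] = HandlerInfo(stack_pointer=0x8000, program_counter=0x4100, system_state="kernel")

def dispatch(irq: int, cpu: dict) -> None:
    """IRQ line -> PIC vector -> IDT entry -> load handler state into the CPU."""
    vector = IRQ_TO_VECTOR[irq]
    entry = idt[vector]
    cpu["saved_pc"] = cpu["pc"]            # assumption: remember where to resume
    cpu["sp"] = entry.stack_pointer
    cpu["pc"] = entry.program_counter
    cpu["state"] = entry.system_state

cpu = {"pc": 0x1000, "sp": 0x7000, "state": "user"}
dispatch(1, cpu)
print(hex(cpu["pc"]), hex(cpu["saved_pc"]))
```

After the dispatch the PC holds the handler address from the IDT entry, so the next fetch comes from the handler rather than from the interrupted program.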
Virtual Memory
§ Virtual memory
  § Technique that allows execution of a program that may not completely reside in memory (RAM)
  § Allows the computer to "fake" a program into believing that its memory space is larger than physical RAM
Virtual Memory
§ Two memory "spaces"
  § Virtual memory space: what the program "sees"
  § Physical memory space: what the program runs in (size of RAM)
§ On program startup
  § OS copies program into RAM
  § If there is not enough RAM, OS stops copying program and starts it running with only a portion of the program loaded in RAM
Virtual Memory
Virtual Memory contents:
0x00  add r1,r2,r3
0x04  sub r2,r3,r4
0x08  lw r2, 0x04
0x0C  mult r3,r4,r5
0x10  bne 0x00
0x14  add r10,r1,r2
0x18  sub r3,r4,r1
0x1C  sw r5,0x0c
Physical Memory contents:
0x00  add r1,r2,r3
0x04  sub r2,r3,r4
0x08  lw r2, 0x04
0x0C  mult r3,r4,r5
Virtual Memory
[Figure: the same eight-word virtual memory and four-word physical memory, now linked by a translation table.]
Translation Table (virtual address → physical address):
0x00 → 0x00
0x04 → 0x04
0x08 → 0x08
0x0c → 0x0c
0x10 → Disk
0x14 → Disk
0x18 → Disk
0x1c → Disk
Page Fault
[Figure: after a page fault on virtual address 0x10, bne 0x00 replaces add r1,r2,r3 at physical address 0x00; the other physical locations still hold sub r2,r3,r4, lw r2, 0x04, and mult r3,r4,r5.]
Translation Table (virtual address → physical address):
0x00 → Disk
0x04 → 0x04
0x08 → 0x08
0x0c → 0x0c
0x10 → 0x00
0x14 → Disk
0x18 → Disk
0x1c → Disk
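The change in the translation table can be reproduced with a small model. This is a sketch: the function name `translate` is ours, and the victim is passed in explicitly because the slides do not say how the entry to evict is chosen.

```python
DISK = "Disk"

# Translation table before the fault: virtual address -> physical address or Disk
table = {0x00: 0x00, 0x04: 0x04, 0x08: 0x08, 0x0C: 0x0C,
         0x10: DISK, 0x14: DISK, 0x18: DISK, 0x1C: DISK}

def translate(table: dict, vaddr: int, victim: int) -> int:
    """Return the physical address, servicing a page fault if needed.

    `victim` is the resident virtual address to evict on a fault; choosing it
    is the replacement policy's job and is passed in here for simplicity.
    """
    if table[vaddr] == DISK:               # page fault
        frame = table[victim]              # reuse the victim's physical location
        table[victim] = DISK               # victim goes back to disk
        table[vaddr] = frame               # faulting entry now lives in RAM
    return table[vaddr]

paddr = translate(table, 0x10, victim=0x00)
print(hex(paddr), table)
```

Accessing virtual address 0x10 with 0x00 as the victim yields the table on the Page Fault slide: 0x10 maps to physical 0x00 and 0x00 maps to Disk.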
Program View
[Figure: the CPU issues a virtual address, the translation table maps it to a physical memory location (or to Disk), and instructions (or data) are returned to the CPU. The translation table maps 0x00 to 0x0c onto the same physical addresses and 0x10 to 0x1c to Disk; physical memory holds add r1,r2,r3, sub r2,r3,r4, lw r2, 0x04, and mult r3,r4,r5.]
BU EC 413 Fall of 2019