Pipelining III
description
Transcript of Pipelining III
![Page 1: Pipelining III](https://reader035.fdocuments.us/reader035/viewer/2022062521/568167cf550346895ddd2013/html5/thumbnails/1.jpg)
Pipelining III
Andreas KlappeneckerCPSC321 Computer
Architecture
![Page 2: Pipelining III](https://reader035.fdocuments.us/reader035/viewer/2022062521/568167cf550346895ddd2013/html5/thumbnails/2.jpg)
Administrative Issues Talk by Laszlo Kish
Quantum Computing Seminar, Thursday 10:00am-11:00am, HRBB 302
Projects: Get started!!!
![Page 3: Pipelining III](https://reader035.fdocuments.us/reader035/viewer/2022062521/568167cf550346895ddd2013/html5/thumbnails/3.jpg)
Pipelined Datapath
Pipeline separation registers, width varies
![Page 4: Pipelining III](https://reader035.fdocuments.us/reader035/viewer/2022062521/568167cf550346895ddd2013/html5/thumbnails/4.jpg)
Control Lines Instruction fetch:
control signal to read instruction memory and to write PC are always asserted - nothing special here
Instruction decode/register file read: same thing happens every clock cycle, so no
optional control lines to set Execution/address calculation
RegDst selects the result register, ALUOp selects the ALU operation ALUSrc selects Read data 2 or sign-extd.
immediate
![Page 5: Pipelining III](https://reader035.fdocuments.us/reader035/viewer/2022062521/568167cf550346895ddd2013/html5/thumbnails/5.jpg)
Pipelined Datapath w/ Controls Signals
![Page 6: Pipelining III](https://reader035.fdocuments.us/reader035/viewer/2022062521/568167cf550346895ddd2013/html5/thumbnails/6.jpg)
Control Lines Memory access
Branch set by branch equal MemRead set by load instructions MemWrite set by store instructions
Write back MemtoReg send ALU result or memory
value RegWrite selects register
![Page 7: Pipelining III](https://reader035.fdocuments.us/reader035/viewer/2022062521/568167cf550346895ddd2013/html5/thumbnails/7.jpg)
Pipelined Datapath w/ Controls Signals
![Page 8: Pipelining III](https://reader035.fdocuments.us/reader035/viewer/2022062521/568167cf550346895ddd2013/html5/thumbnails/8.jpg)
Pass control signals along just like the data
Pipeline Control
Execution/Address Calculation stage control lines
Memory access stage control lines
Write-back stage control
lines
InstructionReg Dst
ALU Op1
ALU Op0
ALU Src Branch
Mem Read
Mem Write
Reg write
Mem to Reg
R-format 1 1 0 0 0 0 0 1 0lw 0 0 0 1 0 1 0 1 1sw X 0 0 1 0 0 1 0 Xbeq X 0 1 0 1 0 0 0 X
Control
EX
M
WB
M
WB
WB
IF/ID ID/EX EX/MEM MEM/WB
Instruction
![Page 9: Pipelining III](https://reader035.fdocuments.us/reader035/viewer/2022062521/568167cf550346895ddd2013/html5/thumbnails/9.jpg)
Datapath with Control
PC
Instructionmemory
Inst
ruct
ion
Add
Instruction[20– 16]
Mem
toR
eg
ALUOp
Branch
RegDst
ALUSrc
4
16 32Instruction[15– 0]
0
0
Mux
0
1
Add Addresult
RegistersWriteregister
Writedata
Readdata 1
Readdata 2
Readregister 1
Readregister 2
Signextend
Mux
1
ALUresult
Zero
Writedata
Readdata
Mux
1
ALUcontrol
Shiftleft 2
Reg
Writ
e
MemRead
Control
ALU
Instruction[15– 11]
6
EX
M
WB
M
WB
WBIF/ID
PCSrc
ID/EX
EX/MEM
MEM/WB
Mux
0
1
Mem
Writ
e
AddressData
memory
Address
![Page 10: Pipelining III](https://reader035.fdocuments.us/reader035/viewer/2022062521/568167cf550346895ddd2013/html5/thumbnails/10.jpg)
Assume that the compiler has to guarantee that no hazards occur
Where do we insert the “nops” ?
sub $2, $1, $3and $12, $2, $5or $13, $6, $2add $14, $2, $2sw $15, 100($2)
Data Hazards
![Page 11: Pipelining III](https://reader035.fdocuments.us/reader035/viewer/2022062521/568167cf550346895ddd2013/html5/thumbnails/11.jpg)
Data hazard: a dependency that “goes backward in time”
Dependencies
IM Reg
IM Reg
CC 1 CC 2 CC 3 CC 4 CC 5 CC 6
Time (in clock cycles)
sub $2, $1, $3
Programexecutionorder(in instructions)
and $12, $2, $5
IM Reg DM Reg
IM DM Reg
IM DM Reg
CC 7 CC 8 CC 9
10 10 10 10 10/– 20 – 20 – 20 – 20 – 20
or $13, $6, $2
add $14, $2, $2
sw $15, 100($2)
Value of register $2:
DM Reg
Reg
Reg
Reg
DM
![Page 12: Pipelining III](https://reader035.fdocuments.us/reader035/viewer/2022062521/568167cf550346895ddd2013/html5/thumbnails/12.jpg)
Solutionsub $2, $1, $3nopnopand $12, $2, $5or $13, $6, $2add $14, $2, $2sw $15, 100($2)
Problem: this slows us down!
Resolution of Data Hazards
![Page 13: Pipelining III](https://reader035.fdocuments.us/reader035/viewer/2022062521/568167cf550346895ddd2013/html5/thumbnails/13.jpg)
ForwardingDo not wait until result have been written Use temporary results! Use register file forwarding to handle read/write to same register ALU forwarding
![Page 14: Pipelining III](https://reader035.fdocuments.us/reader035/viewer/2022062521/568167cf550346895ddd2013/html5/thumbnails/14.jpg)
Forwarding
IM Reg
IM Reg
CC 1 CC 2 CC 3 CC 4 CC 5 CC 6
Time (in clock cycles)
sub $2, $1, $3
Programexecution order(in instructions)
and $12, $2, $5
IM Reg DM Reg
IM DM Reg
IM DM Reg
CC 7 CC 8 CC 9
10 10 10 10 10/– 20 – 20 – 20 – 20 – 20
or $13, $6, $2
add $14, $2, $2
sw $15, 100($2)
Value of register $2 :
DM Reg
Reg
Reg
Reg
X X X – 20 X X X X XValue of EX/MEM :X X X X – 20 X X X XValue of MEM/WB :
DM
![Page 15: Pipelining III](https://reader035.fdocuments.us/reader035/viewer/2022062521/568167cf550346895ddd2013/html5/thumbnails/15.jpg)
Forwarding
PC Instructionmemory
Registers
Mux
Mux
Control
ALU
EX
M
WB
M
WB
WB
ID/EX
EX/MEM
MEM/WB
Datamemory
Mux
Forwardingunit
IF/ID
Inst
ruct
ion
Mux
RdEX/MEM.RegisterRd
MEM/WB.RegisterRd
Rt
Rt
Rs
IF/ID.RegisterRd
IF/ID.RegisterRt
IF/ID.RegisterRt
IF/ID.RegisterRs
![Page 16: Pipelining III](https://reader035.fdocuments.us/reader035/viewer/2022062521/568167cf550346895ddd2013/html5/thumbnails/16.jpg)
Load word can still cause a hazard: an instruction trying to read a register following a load instruction writing to the same register.
Need a hazard detection unit to “stall” pipeline
Obstructions to Forwarding
Reg
IM
Reg
Reg
IM
CC 1 CC 2 CC 3 CC 4 CC 5 CC 6
Time (in clock cycles)
lw $2, 20($1)
Programexecutionorder(in instructions)
and $4, $2, $5
IM Reg DM Reg
IM DM Reg
IM DM Reg
CC 7 CC 8 CC 9
or $8, $2, $6
add $9, $4, $2
slt $1, $6, $7
DM Reg
Reg
Reg
DM
![Page 17: Pipelining III](https://reader035.fdocuments.us/reader035/viewer/2022062521/568167cf550346895ddd2013/html5/thumbnails/17.jpg)
Stalling We can stall the pipeline by keeping
an instruction in the same stage
lw $2, 20($1)
Programexecutionorder(in instructions)
and $4, $2, $5
or $8, $2, $6
add $9, $4, $2
slt $1, $6, $7
Reg
IM
Reg
Reg
IM DM
CC 1 CC 2 CC 3 CC 4 CC 5 CC 6Time (in clock cycles)
IM Reg DM RegIM
IM DM Reg
IM DM Reg
CC 7 CC 8 CC 9 CC 10
DM Reg
RegReg
Reg
bubble
![Page 18: Pipelining III](https://reader035.fdocuments.us/reader035/viewer/2022062521/568167cf550346895ddd2013/html5/thumbnails/18.jpg)
Hazard Detection Unit Stall by letting an instruction that
won’t write anything go forward
PC Instructionmemory
Registers
Mux
Mux
Mux
Control
ALU
EX
M
WB
M
WB
WB
ID/EX
EX/MEM
MEM/WB
Datamemory
Mux
Hazarddetection
unit
Forwardingunit
0
Mux
IF/ID
Inst
ruct
ion
ID/EX.MemRead
IF/ID
Writ
e
PC
Writ
e
ID/EX.RegisterRt
IF/ID.RegisterRd
IF/ID.RegisterRtIF/ID.RegisterRtIF/ID.RegisterRs
RtRs
Rd
Rt EX/MEM.RegisterRd
MEM/WB.RegisterRd
![Page 19: Pipelining III](https://reader035.fdocuments.us/reader035/viewer/2022062521/568167cf550346895ddd2013/html5/thumbnails/19.jpg)
When we decide to branch, other instructions are in the pipeline!
We are predicting “branch not taken” need to add hardware for flushing instructions if we are wrong
Branch Hazards
Reg
Reg
CC 1
Time (in clock cycles)
40 beq $1, $3, 7
Programexecutionorder(in instructions)
IM Reg
IM DM
IM DM
IM DM
DM
DM Reg
Reg Reg
Reg
Reg
RegIM
44 and $12, $2, $5
48 or $13, $6, $2
52 add $14, $2, $2
72 lw $4, 50($7)
CC 2 CC 3 CC 4 CC 5 CC 6 CC 7 CC 8 CC 9
Reg
![Page 20: Pipelining III](https://reader035.fdocuments.us/reader035/viewer/2022062521/568167cf550346895ddd2013/html5/thumbnails/20.jpg)
Flushing Move branch decision from 4th
pipeline stage to the second only one instruction following the
branch will be in the pipeline IF.Flush turns fetched instruction
into a nop by zeroing the IF/ID pipeline register
![Page 21: Pipelining III](https://reader035.fdocuments.us/reader035/viewer/2022062521/568167cf550346895ddd2013/html5/thumbnails/21.jpg)
Flushing Instructions
PC Instructionmemory
4
Registers
Mux
Mux
Mux
ALU
EX
M
WB
M
WB
WB
ID/EX
0
EX/MEM
MEM/WB
Datamemory
Mux
Hazarddetection
unit
Forwardingunit
IF.Flush
IF/ID
Signextend
Control
Mux
=
Shiftleft 2
Mux
![Page 22: Pipelining III](https://reader035.fdocuments.us/reader035/viewer/2022062521/568167cf550346895ddd2013/html5/thumbnails/22.jpg)
Improving Performance Try and avoid stalls by reordering instructions Add a “branch delay slot”
the next instruction after a branch is always executed
rely on compiler to “fill” the slot with something useful
Superscalar: start more than one instruction in the same cycle
![Page 23: Pipelining III](https://reader035.fdocuments.us/reader035/viewer/2022062521/568167cf550346895ddd2013/html5/thumbnails/23.jpg)
Dynamic Scheduling The hardware performs the “scheduling”
hardware tries to find instructions to execute out of order execution is possible speculative execution and dynamic branch prediction
All modern processors are very complicated DEC Alpha 21264: 9 stage pipeline, 6 instruction issue PowerPC and Pentium: branch history table Compiler technology important
This class has given you the background you need to learn more - read Chapter 6!
More material will be posted!