Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.1. Basic idea of...
![Page 1: Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.1. Basic idea of instruction pipelining.](https://reader036.fdocuments.us/reader036/viewer/2022070407/56649e4d5503460f94b43753/html5/thumbnails/1.jpg)
Please see “portrait orientation” PowerPoint file for Chapter 8
Figure 8.1. Basic idea of instruction pipelining.
![Page 2: Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.1. Basic idea of instruction pipelining.](https://reader036.fdocuments.us/reader036/viewer/2022070407/56649e4d5503460f94b43753/html5/thumbnails/2.jpg)
Figure 8.2. A 4-stage pipeline.
![Page 3: Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.1. Basic idea of instruction pipelining.](https://reader036.fdocuments.us/reader036/viewer/2022070407/56649e4d5503460f94b43753/html5/thumbnails/3.jpg)
Figure 8.3. Effect of an execution operation taking more than one clock cycle.
[Timing diagram: instructions I1 to I5 on the vertical axis against clock cycles 1 to 9 (time) on the horizontal axis, showing the F, D, E and W steps of each instruction.]
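The timing that Figure 8.3 describes can be approximated with a small scheduling sketch. The model is an assumption, not taken from the slides: instructions issue in order, each stage holds one instruction, every stage takes one cycle except Execute, and an instruction stays in its stage until the next stage is free. The function name `pipeline_schedule` and the latency list are hypothetical.

```python
def pipeline_schedule(execute_cycles, stages=("F", "D", "E", "W")):
    """Return starts[i][s]: the clock cycle in which instruction i enters stage s.

    Assumed model: in-order issue, one instruction per stage, every stage takes
    one cycle except Execute, and an instruction blocks the ones behind it
    until the next stage is free.
    """
    e = stages.index("E")
    last = len(stages) - 1
    # latency[i][s] = cycles instruction i needs in stage s
    latency = [[c if s == e else 1 for s in range(len(stages))]
               for c in execute_cycles]
    starts = []
    for i in range(len(execute_cycles)):
        row = []
        for s in range(len(stages)):
            # First cycle after this instruction finishes its previous stage.
            ready = 1 if s == 0 else row[s - 1] + latency[i][s - 1]
            if i == 0:
                free = 1  # nothing ahead of the first instruction
            elif s < last:
                # The stage frees up when the previous instruction moves on.
                free = starts[i - 1][s + 1]
            else:
                # The last stage frees up when the previous instruction finishes.
                free = starts[i - 1][s] + latency[i - 1][s]
            row.append(max(ready, free))
        starts.append(row)
    return starts


# Assumed example: the second instruction needs three Execute cycles.
for number, row in enumerate(pipeline_schedule([1, 3, 1, 1, 1]), start=1):
    print(f"I{number}:", dict(zip("FDEW", row)))
```

Under these assumed latencies the second instruction occupies Execute in cycles 4 to 6, so the third cannot enter Execute until cycle 7 and every later instruction is pushed back with it.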
![Page 4: Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.1. Basic idea of instruction pipelining.](https://reader036.fdocuments.us/reader036/viewer/2022070407/56649e4d5503460f94b43753/html5/thumbnails/4.jpg)
Figure 8.4. Pipeline stall caused by a cache miss in F2.
![Page 5: Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.1. Basic idea of instruction pipelining.](https://reader036.fdocuments.us/reader036/viewer/2022070407/56649e4d5503460f94b43753/html5/thumbnails/5.jpg)
Figure 8.5. Effect of a Load instruction on pipeline timing.
[Timing diagram: instructions I1 to I5, with I2 a Load, against clock cycles 1 to 7 (time); each instruction passes through F, D, E and W steps, and the Load has an extra step, M2.]
![Page 6: Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.1. Basic idea of instruction pipelining.](https://reader036.fdocuments.us/reader036/viewer/2022070407/56649e4d5503460f94b43753/html5/thumbnails/6.jpg)
Figure 8.6. Pipeline stalled by data dependency between D2 and W1.
[Timing diagram: instructions I1 (Mul), I2 (Add), I3 and I4 against clock cycles 1 to 9 (time); the decode step of I2 appears as D2 and D2A.]
![Page 7: Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.1. Basic idea of instruction pipelining.](https://reader036.fdocuments.us/reader036/viewer/2022070407/56649e4d5503460f94b43753/html5/thumbnails/7.jpg)
Figure 8.7. Operand forwarding in a pipelined processor.
![Page 8: Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.1. Basic idea of instruction pipelining.](https://reader036.fdocuments.us/reader036/viewer/2022070407/56649e4d5503460f94b43753/html5/thumbnails/8.jpg)
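Figure 8.8. An idle cycle caused by a branch instruction.
[Timing diagram: instructions I1, I2 (Branch), I3, Ik and Ik+1 against clock cycles 1 to 6 (time), each with a fetch (F) and an execute (E) step; F3 is followed by an X, and one cycle is marked "Execution unit idle".]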
![Page 9: Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.1. Basic idea of instruction pipelining.](https://reader036.fdocuments.us/reader036/viewer/2022070407/56649e4d5503460f94b43753/html5/thumbnails/9.jpg)
Figure 8.9. Branch timing.
![Page 10: Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.1. Basic idea of instruction pipelining.](https://reader036.fdocuments.us/reader036/viewer/2022070407/56649e4d5503460f94b43753/html5/thumbnails/10.jpg)
Figure 8.10. Use of an instruction queue in the hardware organization of Figure 8.2b.
[Block diagram labels: F: Fetch instruction (instruction fetch unit); Instruction queue; D: Dispatch/Decode unit; E: Execute instruction; W: Write results.]
![Page 11: Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.1. Basic idea of instruction pipelining.](https://reader036.fdocuments.us/reader036/viewer/2022070407/56649e4d5503460f94b43753/html5/thumbnails/11.jpg)
Figure 8.11. Branch timing in the presence of an instruction queue. Branch target address is computed in the D stage.
[Timing diagram: instructions I1 to I4, I5 (Branch), I6, Ik and Ik+1 against clock cycles 1 to 10 (time), with F, D, E and W steps; E1 occupies three cycles, one step is marked X, and a bottom row gives the queue length in each cycle (values between 1 and 3).]
![Page 12: Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.1. Basic idea of instruction pipelining.](https://reader036.fdocuments.us/reader036/viewer/2022070407/56649e4d5503460f94b43753/html5/thumbnails/12.jpg)
(a) Original program loop
LOOP   Shift_left   R1
       Decrement    R2
       Branch=0     LOOP
NEXT   Add          R1,R3

(b) Reordered instructions
LOOP   Decrement    R2
       Branch=0     LOOP
       Shift_left   R1
NEXT   Add          R1,R3

Figure 8.12. Reordering of instructions for a delayed branch.
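The reordering in Figure 8.12 can be checked with a short sketch. It rests on two assumptions that the transcript does not confirm: the branch returns to LOOP while the counter in R2 is nonzero, and the instruction in the delay slot always executes, whether or not the branch is taken. The function names are hypothetical.

```python
def original_loop(r1, r2):
    """Loop (a), run on a machine with no delay slot."""
    while True:
        r1 <<= 1            # Shift_left R1
        r2 -= 1             # Decrement R2
        if r2 == 0:         # assumed: branch back to LOOP while R2 != 0
            break
    return r1


def delayed_branch_loop(r1, r2):
    """Loop (b), run on a machine with one branch delay slot."""
    while True:
        r2 -= 1             # Decrement R2
        taken = r2 != 0     # branch decision is made here
        r1 <<= 1            # delay slot: Shift_left R1 always executes
        if not taken:
            break
    return r1


# Both orderings shift R1 left once per pass through the loop.
for count in range(1, 6):
    assert original_loop(3, count) == delayed_branch_loop(3, count)
```

Because the shift does not affect the value the branch tests, moving it into the delay slot leaves the result unchanged while giving the pipeline useful work during the cycle after the branch.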
![Page 13: Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.1. Basic idea of instruction pipelining.](https://reader036.fdocuments.us/reader036/viewer/2022070407/56649e4d5503460f94b43753/html5/thumbnails/13.jpg)
Figure 8.13. Execution timing showing the delay slot being filled during the last two passes through the loop in Figure 8.12.
![Page 14: Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.1. Basic idea of instruction pipelining.](https://reader036.fdocuments.us/reader036/viewer/2022070407/56649e4d5503460f94b43753/html5/thumbnails/14.jpg)
Figure 8.14. Timing when a branch decision has been incorrectly predicted as not taken.
![Page 15: Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.1. Basic idea of instruction pipelining.](https://reader036.fdocuments.us/reader036/viewer/2022070407/56649e4d5503460f94b43753/html5/thumbnails/15.jpg)
Figure 8.15. State-machine representation of branch prediction algorithms.
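The state machines of Figure 8.15 are not reproduced in this transcript. As an illustration of the general idea, here is a sketch of a commonly used four-state scheme, the 2-bit saturating counter. Its transitions may differ from those drawn in the figure, and the class name, state names and helper function are illustrative assumptions.

```python
class TwoBitPredictor:
    """2-bit saturating counter: states 0..3, predict taken when state >= 2."""

    # Illustrative state names: strongly/likely not taken, likely/strongly taken.
    STATES = ("SNT", "LNT", "LT", "ST")

    def __init__(self, state=2):
        self.state = state

    def predict(self):
        """Return True if the branch is predicted taken."""
        return self.state >= 2

    def update(self, taken):
        """Move one step toward the observed outcome, saturating at 0 and 3."""
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def count_mispredictions(predictor, outcomes):
    """Run the predictor over a sequence of actual outcomes (True = taken)."""
    misses = 0
    for taken in outcomes:
        if predictor.predict() != taken:
            misses += 1
        predictor.update(taken)
    return misses


# A loop branch taken three times, then not taken, executed twice.
outcomes = ([True] * 3 + [False]) * 2
print(count_mispredictions(TwoBitPredictor(), outcomes))
```

With two bits of history, a single not-taken outcome at the end of a loop does not flip the prediction, so only the loop exit itself is mispredicted on each run through the loop.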
![Page 16: Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.1. Basic idea of instruction pipelining.](https://reader036.fdocuments.us/reader036/viewer/2022070407/56649e4d5503460f94b43753/html5/thumbnails/16.jpg)
Figure 8.16. Equivalent operations using complex and simple addressing modes.
![Page 17: Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.1. Basic idea of instruction pipelining.](https://reader036.fdocuments.us/reader036/viewer/2022070407/56649e4d5503460f94b43753/html5/thumbnails/17.jpg)
(a) A program fragment
Add        R1,R2
Compare    R3,R4
Branch=0   . . .

(b) Instructions reordered
Compare    R3,R4
Add        R1,R2
Branch=0   . . .

Figure 8.17. Instruction reordering.
![Page 18: Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.1. Basic idea of instruction pipelining.](https://reader036.fdocuments.us/reader036/viewer/2022070407/56649e4d5503460f94b43753/html5/thumbnails/18.jpg)
Figure 8.18. Datapath modified for pipelined execution, with interstage buffers at the input and output of the ALU.
![Page 19: Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.1. Basic idea of instruction pipelining.](https://reader036.fdocuments.us/reader036/viewer/2022070407/56649e4d5503460f94b43753/html5/thumbnails/19.jpg)
Figure 8.19. A processor with two execution units.
[Block diagram labels: F: Instruction fetch unit; Instruction queue; Dispatch unit; Floating-point unit; Integer unit; W: Write results.]
![Page 20: Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.1. Basic idea of instruction pipelining.](https://reader036.fdocuments.us/reader036/viewer/2022070407/56649e4d5503460f94b43753/html5/thumbnails/20.jpg)
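Figure 8.20. An example of instruction execution flow in the processor of Figure 8.19, assuming no hazards are encountered.
[Timing diagram: instructions I1 (Fadd), I2 (Add), I3 (Fsub) and I4 (Sub) against clock cycles 1 to 7 (time), with F, D, E and W steps; the floating-point instructions I1 and I3 each have three execute steps, and the integer instructions I2 and I4 have one.]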
![Page 21: Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.1. Basic idea of instruction pipelining.](https://reader036.fdocuments.us/reader036/viewer/2022070407/56649e4d5503460f94b43753/html5/thumbnails/21.jpg)
Figure 8.21. Instruction completion in program order.
![Page 22: Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.1. Basic idea of instruction pipelining.](https://reader036.fdocuments.us/reader036/viewer/2022070407/56649e4d5503460f94b43753/html5/thumbnails/22.jpg)
(a) Desired program loop
            LDX     R3, 0, R6         Load number of items in the list.
            OR      R0, R0, R4        R4 to be used as offset in the list.
            OR      R0, R0, R7        Clear R7 to be used as accumulator.
LOOPSTART   LDX     R3, R4, R5        Load list item into R5.
            ADD     R5, R7, R7        Add number to accumulator.
            ADD     R4, 8, R4         Point to the next entry.
            SUBcc   R6, 1, R6         Decrement R6 and set condition flags.
            BG      xcc, LOOPSTART    Loop if more items in the list.
NEXT        ...

(b) Instructions reorganized to use the delay slot
            LDX     R3, 0, R6
            OR      R0, R0, R4
            OR      R0, R0, R7
LOOPSTART   LDX     R3, R4, R5
            ADD     R4, 8, R4
            SUBcc   R6, 1, R6
            BG,pt   xcc, LOOPSTART    Predicted taken, Annul bit = 0
            ADD     R5, R7, R7
NEXT        ...

Figure 8.22. An addition loop showing the use of the branch delay slot and branch prediction.
![Page 23: Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.1. Basic idea of instruction pipelining.](https://reader036.fdocuments.us/reader036/viewer/2022070407/56649e4d5503460f94b43753/html5/thumbnails/23.jpg)
Figure 8.23. Main building blocks of the UltraSPARC II processor.
![Page 24: Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.1. Basic idea of instruction pipelining.](https://reader036.fdocuments.us/reader036/viewer/2022070407/56649e4d5503460f94b43753/html5/thumbnails/24.jpg)
Figure 8.24. Pipeline organization of the UltraSPARC II processor.
[Pipeline diagram: front-end stages F, D and G with an instruction buffer; two integer pipelines with stages E, C, N1, N2, N3, W; two floating-point pipelines with stages R, X1, X2, X3, N3, W. Stage names shown in the figure: Fetch, Decode, Group, Execute, Cache, Delay, Check, Write, Register.]
![Page 25: Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.1. Basic idea of instruction pipelining.](https://reader036.fdocuments.us/reader036/viewer/2022070407/56649e4d5503460f94b43753/html5/thumbnails/25.jpg)
Figure 8.25. Example of instruction grouping.
![Page 26: Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.1. Basic idea of instruction pipelining.](https://reader036.fdocuments.us/reader036/viewer/2022070407/56649e4d5503460f94b43753/html5/thumbnails/26.jpg)
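(a) Instructions with common destination
ADD     R3, R5, R6    G E C N1 N2 N3 W
LDSW    R4, R7, R6    G E C N1 N2 N3 W

(b) Delay caused by MOVR instruction
MOVRZ   R1, R6, R7    G E C N1 N2 N3 W
OR      R7, R8, R9    G E C N1 N2 N3 W

Figure 8.26. Dispatch delays due to hazards.
[The horizontal offsets between the stage sequences, which show the dispatch delay, are not preserved in this transcript.]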
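The register overlaps behind the two cases in Figure 8.26 can be found mechanically. The sketch below is a hypothetical checker, not the processor's actual dispatch logic, which this transcript does not describe. It assumes the SPARC operand order used in the figure, with source operands first and the destination register last.

```python
def parse(instruction):
    """Split 'OP a, b, dst' into (op, sources, destination)."""
    op, operands = instruction.split(None, 1)
    regs = [r.strip() for r in operands.split(",")]
    return op, regs[:-1], regs[-1]


def hazards(first, second):
    """List the register hazards that `second` has with respect to `first`."""
    _, src1, dst1 = parse(first)
    _, src2, dst2 = parse(second)
    found = []
    if dst1 in src2:
        found.append("RAW")   # second reads what first writes
    if dst1 == dst2:
        found.append("WAW")   # both write the same register
    if dst2 in src1:
        found.append("WAR")   # second overwrites a source of first
    return found


print(hazards("ADD R3, R5, R6", "LDSW R4, R7, R6"))
print(hazards("MOVRZ R1, R6, R7", "OR R7, R8, R9"))
```

In pair (a) both instructions write R6, and in pair (b) the OR reads R7, which the MOVRZ writes. These are the overlaps a dispatch unit would need to detect before grouping either pair.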
![Page 27: Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.1. Basic idea of instruction pipelining.](https://reader036.fdocuments.us/reader036/viewer/2022070407/56649e4d5503460f94b43753/html5/thumbnails/27.jpg)
Figure 8.27. Integer execution unit.
[Block diagram labels: Integer register file; Annex; IEU0; IEU1; ALU; Interstage buffers.]
![Page 28: Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.1. Basic idea of instruction pipelining.](https://reader036.fdocuments.us/reader036/viewer/2022070407/56649e4d5503460f94b43753/html5/thumbnails/28.jpg)
I1 (Icc)     G E C
I2 (BRcc)    G E C
I3           G E C
I4           G E C
I5           G E
I6           G E
I7           G E
I8           G E
I9           G
I10          G
I11          G
I12          G
Abort

Figure 8.28. Worst-case timing for an incorrectly predicted branch.
![Page 29: Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.1. Basic idea of instruction pipelining.](https://reader036.fdocuments.us/reader036/viewer/2022070407/56649e4d5503460f94b43753/html5/thumbnails/29.jpg)
Figure 8.29. Load and store unit.
[Block diagram labels: Integer register file/annex; pipeline stages G, E, C, N1; dTLB; D-Cache tags; D-Cache data; Compare; Load/store queue; Miss; To E-Cache.]
![Page 30: Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.1. Basic idea of instruction pipelining.](https://reader036.fdocuments.us/reader036/viewer/2022070407/56649e4d5503460f94b43753/html5/thumbnails/30.jpg)
Figure 8.30. Execution flow.
![Page 31: Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.1. Basic idea of instruction pipelining.](https://reader036.fdocuments.us/reader036/viewer/2022070407/56649e4d5503460f94b43753/html5/thumbnails/31.jpg)
Table 8.1. Examples of SPARC instructions.