Midterm Review 2 Dr. Zhao Zhang Iowa State University CprE 381 Computer Organization and Assembly Level Programming, Fall 2013


Announcement
• No quiz today
• No homework this Friday
• Exam on Monday, 9:00-9:50
• HW9 deadline extended to next Friday
• HW8 solutions will be posted today


Exam 2 Coverage
• Coverage: Ch. 4, The Processor
  • Datapath and control
  • Simple MIPS pipeline
  • Data hazards and forwarding
  • Load-use hazard and pipeline stall
  • Control hazards
• Arithmetic will NOT be covered
  • It will be covered in the final exam
  • The final exam is comprehensive


Question Styles and Coverage
• Short answer
• True/False or multiple choice
• Design and analysis
  • Signal values in the datapath and control
  • Identify the critical path
  • Support a new MIPS instruction
• Performance analysis and optimization
  • Identify pipeline bubbles in program execution
  • Reorder instructions to improve performance
• And others


Nine-Instruction MIPS
• These nine instructions are enough to illustrate most aspects of CPU design, particularly datapath and control design
• Some questions will use this subset as the baseline design
  • Memory reference: LW and SW
  • Arithmetic/logic: ADD, SUB, AND, OR, SLT
  • Branch: BEQ, J


Datapath With Jumps Added


The Control
Control signals for the nine-instruction implementation

Inst  RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp0  Jump
R-    1       0       0         1         0        0         0       1       0       0
lw    0       1       1         1         1        0         0       0       0       0
sw    X       1       X         0         0        1         0       0       0       0
beq   X       0       X         0         0        0         1       0       1       0
j     X       X       X         0         0        0         0       X       X       1

Note: “R-” means R-format
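The control table can also be written down as a lookup structure, which is a convenient way to check signal values when answering "what is signal X for instruction Y" questions. The sketch below simply transcribes the rows above; the names `SIGNALS`, `CONTROL_TABLE`, and `control_signals` are illustrative and not part of the course material.

```python
# Main control truth table for the nine-instruction subset, transcribed
# from the slide. "X" marks a don't-care value.
# All names here are hypothetical helpers for illustration.
SIGNALS = ("RegDst", "ALUSrc", "MemtoReg", "RegWrite", "MemRead",
           "MemWrite", "Branch", "ALUOp1", "ALUOp0", "Jump")

CONTROL_TABLE = {
    "R-format": "1001000100",
    "lw":       "0111100000",
    "sw":       "X1X0010000",
    "beq":      "X0X0001010",
    "j":        "XXX0000XX1",
}


def control_signals(inst_class):
    """Return a dict mapping each control signal name to '0', '1' or 'X'."""
    return dict(zip(SIGNALS, CONTROL_TABLE[inst_class]))


lw = control_signals("lw")
print(lw["MemRead"], lw["RegWrite"], lw["MemtoReg"])  # prints: 1 1 1
```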


ALU Control
• Truth table for ALU Control
• Extend it as a secondary control unit in projects B & C, with more control signal outputs

opcode  ALUOp  Operation         funct   ALU function      ALU control
lw      00     load word         XXXXXX  add               0010
sw      00     store word        XXXXXX  add               0010
beq     01     branch equal      XXXXXX  subtract          0110
R-type  10     add               100000  add               0010
R-type  10     subtract          100010  subtract          0110
R-type  10     AND               100100  AND               0000
R-type  10     OR                100101  OR                0001
R-type  10     set-on-less-than  101010  set-on-less-than  0111
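As a rough illustration, the ALU control truth table maps directly onto a small function of ALUOp and funct. This is only a behavioral sketch in Python, not the VHDL used in the projects; `alu_control` and `FUNCT_TO_ALU` are hypothetical names.

```python
# funct field -> 4-bit ALU control, for R-type instructions (ALUOp = 10).
# Values are transcribed from the truth table; names are illustrative only.
FUNCT_TO_ALU = {
    0b100000: 0b0010,  # add
    0b100010: 0b0110,  # subtract
    0b100100: 0b0000,  # AND
    0b100101: 0b0001,  # OR
    0b101010: 0b0111,  # set-on-less-than
}


def alu_control(alu_op, funct=0):
    """Return the 4-bit ALU control value for a 2-bit ALUOp and 6-bit funct."""
    if alu_op == 0b00:      # lw / sw: add to form the address
        return 0b0010
    if alu_op == 0b01:      # beq: subtract to compare
        return 0b0110
    if alu_op == 0b10:      # R-type: decided by the funct field
        return FUNCT_TO_ALU[funct]
    raise ValueError("ALUOp value not used in the table")


print(format(alu_control(0b10, 0b101010), "04b"))  # prints: 0111
```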


Extend the Single-Cycle Processor
For each instruction, do we need
1. Any new or revised datapath element(s)?
2. Any new control signal(s)?

Then revise, if necessary,
1. Datapath: Add new elements or revise existing ones, add new connections
2. Control Unit: Add/extend control signals, extend the truth table
3. ALU Control: Extend the truth table


Support JAL
jal target

Instruction format:
  bits 31:26   bits 25:0
  000011       address

Semantics (& denotes bit concatenation):
  PC        = JumpAddr
  R[31]     = PC_plus_4
where
  PC_plus_4 = PC + 4
  JumpAddr  = PC_plus_4[31:28] & Inst[25:0] & “00”
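The jal semantics above can be checked with a few lines of integer arithmetic. The sketch below assumes 32-bit values held in Python integers; `jal_effects` is a hypothetical helper name, and the example PC and instruction word are made up for illustration.

```python
def jal_effects(pc, inst):
    """Return (new_pc, value_written_to_r31) for a jal instruction word."""
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    # JumpAddr = PC_plus_4[31:28] & Inst[25:0] & "00"
    jump_addr = (pc_plus_4 & 0xF0000000) | ((inst & 0x03FFFFFF) << 2)
    return jump_addr, pc_plus_4


# Hypothetical example: opcode 000011 with a 26-bit address field of 0x100008.
new_pc, ra = jal_effects(0x00400000, 0x0C100008)
print(hex(new_pc), hex(ra))  # prints: 0x400020 0x400004
```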


Support JAL

What changes must be made to the datapath?


Support JAL
• Analyze the instruction execution
  • Writes register $ra ($31)
  • Updates PC with the jump target (this part is already done for supporting J)
• Analyze the datapath
  • Needs another input, fixed at 31, to the “Write register” port of the register file
  • Needs another input, PC+4, to the “Write data” port of the register file
• Revise the control
  • Add a “link” signal
  • The (main) control unit can generate it by reading the opcode


SCPv1 + JAL

Revise the two muxes:
• Add another input
• Extend the select signals
Alternatively, use an extra mux


Control Signals
Control signals for the nine-instruction implementation

Inst  RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp0  Jump  Link
R-    1       0       0         1         0        0         0       1       0       0
lw    0       1       1         1         1        0         0       0       0       0
sw    X       1       X         0         0        1         0       0       0       0
beq   X       0       X         0         0        0         1       0       1       0
j     X       X       X         0         0        0         0       X       X       1
jal

• Add a new row for jal
• Extend RegDst
• Add a control line link


Control Signals
Control signals for the nine-instruction implementation

Inst  RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp0  Jump  Link
R-    1       0       0         1         0        0         0       1       0       0     0
lw    0       1       1         1         1        0         0       0       0       0     0
sw    X       1       X         0         0        1         0       0       0       0     0
beq   X       0       X         0         0        0         1       0       1       0     0
j     X       X       X         0         0        0         0       X       X       1     0
jal   0       X       0         1         0        0         X       X       X       1     1

• Extend control input to RegDst Mux: RegDst & Link
• Extend control input to MemtoReg Mux: MemtoReg & Link
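One way to read the two extended muxes is as 3-input selectors driven by a 2-bit select formed by concatenating the original control bit with Link. The sketch below assumes that encoding (select "01" picks the new jal input); the actual input ordering in the course datapath may differ, and the function names are hypothetical.

```python
def write_register_select(reg_dst, link, rt, rd):
    """Model the extended RegDst mux with select = RegDst & Link (assumed)."""
    select = (reg_dst << 1) | link
    if select == 0b00:
        return rt          # lw: destination is rt
    if select == 0b10:
        return rd          # R-format: destination is rd
    return 31              # jal (select 01): destination fixed at $ra


def write_data_select(mem_to_reg, link, alu_result, mem_data, pc_plus_4):
    """Model the extended MemtoReg mux with select = MemtoReg & Link (assumed)."""
    select = (mem_to_reg << 1) | link
    if select == 0b00:
        return alu_result  # R-format: write the ALU result
    if select == 0b10:
        return mem_data    # lw: write the loaded word
    return pc_plus_4       # jal (select 01): write the return address


# jal row of the table: RegDst = 0, MemtoReg = 0, Link = 1
print(write_register_select(0, 1, rt=8, rd=9))  # prints: 31
```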


Simple Pipeline
Add pipeline registers to hold information produced in each cycle


Pipelined Control


Hazards
• Situations that prevent starting the next instruction safely in the next cycle
  • The simple pipeline won’t work correctly
• Structure hazard
  • A required resource is busy
• Data hazard
  • Need to wait for a previous instruction to complete its data read/write
• Control hazard
  • Deciding on control action depends on a previous instruction


Data Hazards

Program with data dependence
  sub $2, $1, $3
  and $12, $2, $5
  or  $13, $6, $2
  add $14, $2, $2
  sw  $15, 100($2)

Program with control dependence
  beq  $1, $3, +4
  addi $2, $2, 1
  addi $4, $4, 1


Data Forwarding

  sub $2, $1, $3
  and $12, $2, $5     # MEM=>EX forwarding
  or  $13, $6, $2     # WB=>EX forwarding
  add $14, $2, $2
  sw  $15, 100($2)

Pipeline contents in three consecutive cycles:
  IF    ID    EX    MEM   WB
  or    and   sub   …     …
  add   or    and   sub   …      AND gets the forwarded new $2 value (sub is in MEM)
  sw    add   or    and   sub    OR gets the forwarded new $2 value (sub is in WB)


Data Forwarding Paths


Detecting the Need to Forward
• Input
  • rs and rt from EX
  • rd and RegWrite from MEM
  • rd and RegWrite from WB
• Output
  • FwdA, FwdB
• Caveats
  • Check RegWrite
  • Check if rd = 0 (never forward a write to register $zero)
  • Forwarding from MEM wins over WB
• Review slides and textbook for details
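The inputs, outputs, and caveats listed above can be sketched as a small behavioral model. The 2-bit encodings used here (10 = forward from MEM, 01 = forward from WB, 00 = use the register file value) follow the textbook's convention and are an assumption about how FwdA and FwdB are encoded; `forward_select` and `forwarding_unit` are hypothetical names.

```python
def forward_select(src, mem_rd, mem_reg_write, wb_rd, wb_reg_write):
    """Pick the forwarding source for one ALU operand register number."""
    # MEM is checked first, so it wins over WB when both match.
    if mem_reg_write and mem_rd != 0 and mem_rd == src:
        return 0b10
    if wb_reg_write and wb_rd != 0 and wb_rd == src:
        return 0b01
    return 0b00


def forwarding_unit(ex_rs, ex_rt, mem_rd, mem_reg_write, wb_rd, wb_reg_write):
    """Return (FwdA, FwdB) for the instruction currently in EX."""
    fwd_a = forward_select(ex_rs, mem_rd, mem_reg_write, wb_rd, wb_reg_write)
    fwd_b = forward_select(ex_rt, mem_rd, mem_reg_write, wb_rd, wb_reg_write)
    return fwd_a, fwd_b


# "and $12,$2,$5" in EX while "sub $2,$1,$3" is in MEM:
print(forwarding_unit(2, 5, 2, True, 0, False))   # prints: (2, 0)
# "or $13,$6,$2" in EX, "and" (rd=12) in MEM, "sub" (rd=2) in WB:
print(forwarding_unit(6, 2, 12, True, 2, True))   # prints: (0, 1)
```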


Load-Use Data Hazard
  lw  $s0, 20($t1)
  sub $t2, $s0, $t3

• Can’t always avoid stalls by forwarding
• Must stall the pipeline by one cycle


Datapath with Hazard Detection


Hazard Detection Unit
• Input
  • rs and rt from ID
  • rt and MemRead from EX
• Output
  • PCWrite, IF/IDWrite (0 for holding instructions)
  • Select signal to a MUX to insert a bubble in EX
• Read slides/textbook for details
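A behavioral sketch of the load-use check, using the inputs and outputs listed above. The condition follows the textbook's form (a load in EX whose rt matches rs or rt of the instruction in ID); the output key `InsertBubble` is a hypothetical name for the MUX select signal.

```python
def hazard_detection(id_rs, id_rt, ex_rt, ex_mem_read):
    """Return the stall controls for the load-use hazard."""
    stall = bool(ex_mem_read) and ex_rt in (id_rs, id_rt)
    return {
        "PCWrite": 0 if stall else 1,      # 0 holds the PC
        "IF/IDWrite": 0 if stall else 1,   # 0 holds the IF/ID register
        "InsertBubble": 1 if stall else 0, # 1 zeroes the control signals sent to EX
    }


# lw $s0, 20($t1) is in EX (rt = $s0 = 16, MemRead = 1);
# sub $t2, $s0, $t3 is in ID (rs = $s0 = 16, rt = $t3 = 11).
print(hazard_detection(id_rs=16, id_rt=11, ex_rt=16, ex_mem_read=1)["PCWrite"])  # prints: 0
```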


Pipeline Stall
• The nop has all control signals set to zero
  • It does nothing at EX, MEM and WB
• Prevent update of PC and IF/ID register
  • Using instruction is decoded again (OK)
  • Following instruction is fetched again (OK)
  • 1-cycle stall allows MEM to read data for lw
  • Can subsequently forward from WB to EX


Code Scheduling to Avoid Stalls
• Reorder code to avoid use of a load result in the next instruction
• C code for A = B + E; C = B + F;

Original order (13 cycles):
  lw  $t1, 0($t0)
  lw  $t2, 4($t0)
  (stall)
  add $t3, $t1, $t2
  sw  $t3, 12($t0)
  lw  $t4, 8($t0)
  (stall)
  add $t5, $t1, $t4
  sw  $t5, 16($t0)

Reordered (11 cycles):
  lw  $t1, 0($t0)
  lw  $t2, 4($t0)
  lw  $t4, 8($t0)
  add $t3, $t1, $t2
  sw  $t3, 12($t0)
  add $t5, $t1, $t4
  sw  $t5, 16($t0)
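The two cycle counts can be reproduced with a simple model: on a 5-stage pipeline with full forwarding, a program of n instructions takes n + 4 cycles plus one extra cycle for every load whose result is used by the very next instruction. The sketch below encodes only that model; the tuple representation and the name `count_cycles` are illustrative.

```python
def count_cycles(program):
    """Count cycles on a 5-stage pipeline with forwarding and load-use stalls.

    Each instruction is (op, dest, sources); dest is None for sw.
    """
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        prev_op, prev_dest, _ = prev
        if prev_op == "lw" and prev_dest in cur[2]:
            stalls += 1                      # load-use hazard: one bubble
    return len(program) + 4 + stalls         # 4 cycles to drain the pipeline


ORIGINAL = [
    ("lw",  "$t1", ("$t0",)),
    ("lw",  "$t2", ("$t0",)),
    ("add", "$t3", ("$t1", "$t2")),
    ("sw",  None,  ("$t3", "$t0")),
    ("lw",  "$t4", ("$t0",)),
    ("add", "$t5", ("$t1", "$t4")),
    ("sw",  None,  ("$t5", "$t0")),
]
# Reordered version: move the third load up, ahead of the first add.
REORDERED = [ORIGINAL[i] for i in (0, 1, 4, 2, 3, 5, 6)]

print(count_cycles(ORIGINAL))   # prints: 13
print(count_cycles(REORDERED))  # prints: 11
```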


Control Hazards
• Branch determines flow of control
  • Two branch outcomes: Taken or Not-Taken
• The CPU doesn’t recognize a branch until it reaches the end of the ID stage
• Every cycle, the CPU has to fetch one instruction


Control Hazards
• The MIPS pipeline in the textbook always predicts “not-taken”
  • Pipeline flush on every taken branch
  • OK to flush because mis-fetched instructions don’t write to register/memory
  • But this incurs pipeline bubbles (performance penalty)
• The revised MIPS pipeline moves branch comparison to the ID stage
  • Doable for BEQ and BNE
  • Reduces pipeline bubbles from 3 to 1 per taken branch
  • Complicates data forwarding and hazard detection
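To get a feel for what "3 bubbles versus 1 bubble per taken branch" means for performance, the sketch below computes an average CPI under a deliberately simple model: base CPI of 1, no data-hazard stalls, and each taken branch adding a fixed number of bubbles. The 10% taken-branch fraction is a made-up figure for illustration, not a number from the course.

```python
def cpi_with_branch_bubbles(taken_branch_fraction, bubbles_per_taken_branch):
    """Average CPI when only taken branches add bubbles to a base CPI of 1."""
    return 1.0 + taken_branch_fraction * bubbles_per_taken_branch


taken = 0.10  # hypothetical: 10% of executed instructions are taken branches
cpi_textbook = cpi_with_branch_bubbles(taken, 3)  # branch resolved late: 3 bubbles
cpi_revised = cpi_with_branch_bubbles(taken, 1)   # branch resolved in ID: 1 bubble
print(round(cpi_textbook, 2), round(cpi_revised, 2))
```

Under these assumptions the revised pipeline lowers CPI from about 1.3 to about 1.1; the real benefit depends on the program's actual branch behavior.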


Revised MIPS Pipeline


Revised MIPS Pipeline

Note: Branch does nothing in EX, MEM and WB


Performance Penalty
Any pipeline bubbles?

Example 1:
        add  $4, $5, $6
        lw   $1, addr
        beq  $1, $4, target

Example 2:
  loop: add  $4, $5, $6
        addi $1, $1, -1
        beq  $1, $zero, loop
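One way to reason about the data-hazard part of this question, following the textbook's treatment of a branch resolved in ID: the branch must stall 2 cycles if a source register is produced by the immediately preceding load, 1 cycle if it is produced by the immediately preceding ALU instruction or by a load two instructions earlier, and 0 cycles otherwise. The sketch below encodes just that rule; `branch_data_stalls` is a hypothetical helper, and any bubble from a taken branch comes on top of its result.

```python
def branch_data_stalls(producers, branch_sources):
    """Stall cycles a branch resolved in ID needs for its source operands.

    producers: (op, dest) pairs for the instructions before the branch,
    oldest first. Only the textbook's three cases are modeled.
    """
    stalls = 0
    for distance, (op, dest) in enumerate(reversed(producers), start=1):
        if dest not in branch_sources:
            continue
        ready_after = 3 if op == "lw" else 2   # loads deliver one stage later
        stalls = max(stalls, ready_after - distance, 0)
    return stalls


# add $4,$5,$6 ; lw $1,addr ; beq $1,$4,target
print(branch_data_stalls([("add", "$4"), ("lw", "$1")], ("$1", "$4")))      # prints: 2
# add $4,$5,$6 ; addi $1,$1,-1 ; beq $1,$zero,loop
print(branch_data_stalls([("add", "$4"), ("addi", "$1")], ("$1", "$zero")))  # prints: 1
```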


Delayed Branch
• Delayed branch may remove the one-cycle stall
  • The instruction right after the beq is executed whether or not the branch is taken (the sub instruction in the example)
  • Put another way, the execution of beq is delayed by one cycle

  Before                    After
  sub $10, $4, $8           beq $1, $3, 7
  beq $1, $3, 7       =>    sub $10, $4, $8
  and $12, $2, $5           and $12, $2, $5

• Must find an independent instruction; otherwise
  • May have to fill in a nop instruction, or
  • Need two variants of beq, delayed and not delayed


Other Topics
• Exception handling
• Multi-issue pipeline

Those topics will be covered in the final exam; Exam 2 will NOT cover them
