L21-Forwarding - AL2 · Friday: Optional Exam 5 review session Current Exam 5 average = 100%! (N =...

29
Forwarding Today: Forwarding Wednesday: Optional Exam 4 review session Friday: Optional Exam 5 review session Current Exam 5 average = 100%! (N = 1)

Transcript of L21-Forwarding - AL2 · Friday: Optional Exam 5 review session Current Exam 5 average = 100%! (N =...

Page 1: L21-Forwarding - AL2 · Friday: Optional Exam 5 review session Current Exam 5 average = 100%! (N = 1) 233 in one slide! The class consists roughly of 4 quarters: (Bolded words are

ForwardingToday: Forwarding

Wednesday: Optional Exam 4 review sessionFriday: Optional Exam 5 review sessionCurrent Exam 5 average = 100%! (N = 1)

Page 2: L21-Forwarding - AL2 · Friday: Optional Exam 5 review session Current Exam 5 average = 100%! (N = 1) 233 in one slide! The class consists roughly of 4 quarters: (Bolded words are

233 in one slide! The class consists roughly of 4 quarters: (Bolded words are the big ideas of the course, pay attention when you hear these words)1. You will build a simple computer processor

Build and create statemachines with data, control, and indirection2. You will learn how high‐level language code executes on a processor

Time limitations create dependencies in the state of the processor3. You will learn why computers perform the way they do

Physical limitations require locality and indirection in how we access state4. You will learn about hardware mechanisms for parallelism

Locality, dependencies, and indirection on performance enhancing drugs

We will have a SPIMbot contest!

2

Page 3: L21-Forwarding - AL2 · Friday: Optional Exam 5 review session Current Exam 5 average = 100%! (N = 1) 233 in one slide! The class consists roughly of 4 quarters: (Bolded words are

Today’s LectureWhy dependencies and delayed feedback necessitate forwardingHow to implement forwarding

Page 4: L21-Forwarding - AL2 · Friday: Optional Exam 5 review session Current Exam 5 average = 100%! (N = 1) 233 in one slide! The class consists roughly of 4 quarters: (Bolded words are

What is the difference between a data dependency and a data hazard?A data dependency is when an instruction writes a value to a destination register that is used as a source register by later instruction(s). A data hazard is when the register is being read by the later incruction(s) before it it written to. This leads to an error in the program

Data hazard is when a data dependency will result in an error. Not all data dependencies are data hazards.

Page 5: L21-Forwarding - AL2 · Friday: Optional Exam 5 review session Current Exam 5 average = 100%! (N = 1) 233 in one slide! The class consists roughly of 4 quarters: (Bolded words are

What is the difference between a data dependency and a data hazard?A data dependency occurs when the value of a register needs to be computed before the instruction can operate. This becomes a data hazard when the result of the computation for the value of the needed register isn't actually computed by the time the next register reads it, creating an outdated data source that can propagate through the pipeline.

Page 6: L21-Forwarding - AL2 · Friday: Optional Exam 5 review session Current Exam 5 average = 100%! (N = 1) 233 in one slide! The class consists roughly of 4 quarters: (Bolded words are

COVID-19 preferencesFor the two weeks after spring break, which attendance policy would you prefer? (this is solely an advisory vote)A) Optional attendance, no extra credit for clickersB) Optional attendance, extra credit for clickersC) Required attendance, extra credit for clickers

Page 7: L21-Forwarding - AL2 · Friday: Optional Exam 5 review session Current Exam 5 average = 100%! (N = 1) 233 in one slide! The class consists roughly of 4 quarters: (Bolded words are

lw $8, 4($29)sub $2, $4, $5and $9, $10, $11or $16, $17, $18add $13, $14, $0

The instructions in this example are independent.  Each instruction reads and writes completely different registers.  Our datapath handles this sequence easily, as we saw last time.

Our examples were too simple because they lacked dependencies between instructions

Page 8: L21-Forwarding - AL2 · Friday: Optional Exam 5 review session Current Exam 5 average = 100%! (N = 1) 233 in one slide! The class consists roughly of 4 quarters: (Bolded words are

An example with dependenciessub $2, $1, $3

and $12, $2, $5

or $13, $6, $2

add $14, $2, $2

sw $15, 100($2)

Page 9: L21-Forwarding - AL2 · Friday: Optional Exam 5 review session Current Exam 5 average = 100%! (N = 1) 233 in one slide! The class consists roughly of 4 quarters: (Bolded words are

Clock cycle1 2 3 4 5 6 7 8 9

sub $2, $1, $3 IF ID EX MEM WB

and $12, $2, $5 IF ID EX MEM WB

or $13, $6, $2 IF ID EX MEM WB

add $14, $2, $2 IF ID EX MEM WB

sw $15, 100($2) IF ID EX MEM WB

Which instructions might not execute correctly on the pipelined processor?a) Just Ib) I and IIc) I, II, and IIId) I, II, III, and IV

I

II

III

IV

Page 10: L21-Forwarding - AL2 · Friday: Optional Exam 5 review session Current Exam 5 average = 100%! (N = 1) 233 in one slide! The class consists roughly of 4 quarters: (Bolded words are

Clock cycle1 2 3 4 5 6 7 8 9

sub $2, $1, $3 IF ID EX MEM WB

and$12, $2, $5 IF ID EX MEM WB

or $13, $6, $2 IF ID EX MEM WB

add$14, $2, $2 IF ID EX MEM WB

sw $15, 100($2) IF ID EX MEM WB

sub does not write to the reg file until cycle 5, this creates two data hazards

Read the old value of $2 from the reg file, not the result of sub

Page 11: L21-Forwarding - AL2 · Friday: Optional Exam 5 review session Current Exam 5 average = 100%! (N = 1) 233 in one slide! The class consists roughly of 4 quarters: (Bolded words are

sub finishes writing to the reg file after half a clock cycle, reg file reads take half a cycle

Clock cycle1 2 3 4 5 6 7 8 9

sub $2, $1, $3 IF ID EX MEM WB

and$12, $2, $5 IF ID EX MEM WB

or $13, $6, $2 IF ID EX MEM WB

add$14, $2, $2 IF ID EX MEM WB

sw $15, 100($2) IF ID EX MEM WB

Page 12: L21-Forwarding - AL2 · Friday: Optional Exam 5 review session Current Exam 5 average = 100%! (N = 1) 233 in one slide! The class consists roughly of 4 quarters: (Bolded words are

Clock cycle1 2 3 4 5 6 7 8 9

sub $2, $1, $3 IF ID EX MEM WB

and$12, $2, $5 IF ID EX MEM WB

or $13, $6, $2 IF ID EX MEM WB

add$14, $2, $2 IF ID EX MEM WB

sw $15, 100($2) IF ID EX MEM WB

Use arrows to show dependencies: Arrows that point backwards reveal data hazards

tail shows when register is written

head shows when register is read

Page 13: L21-Forwarding - AL2 · Friday: Optional Exam 5 review session Current Exam 5 average = 100%! (N = 1) 233 in one slide! The class consists roughly of 4 quarters: (Bolded words are

Use pipeline registers to access correct values before values are written back to the reg file“Forward” data from pipeline registers to later instructions

Instructionmemory

Datamemory

1

0

PC

ALURegisters

Rd

Rt0

1

IF/ID ID/EX EX/MEM MEM/WBALU outputavailable here

Page 14: L21-Forwarding - AL2 · Friday: Optional Exam 5 review session Current Exam 5 average = 100%! (N = 1) 233 in one slide! The class consists roughly of 4 quarters: (Bolded words are

In clock cycle 4, AND gets R[1]‐R[3] from EX/MEM In cycle 5, OR gets R[1]‐R[3] from MEM/WB

Forward values from pipeline registers so later instructions can use the correct value

sub $2, $1, $3

and $12, $2, $5

Or $13, $6, $2

Page 15: L21-Forwarding - AL2 · Friday: Optional Exam 5 review session Current Exam 5 average = 100%! (N = 1) 233 in one slide! The class consists roughly of 4 quarters: (Bolded words are

Add forwarding muxes in front of the ALU

ForwardAInstructionmemory

Datamemory

1

0

PC

ALURegisters

Rd

Rt0

1

IF/ID ID/EX EX/MEM MEM/WB

012

012

ForwardB

Page 16: L21-Forwarding - AL2 · Friday: Optional Exam 5 review session Current Exam 5 average = 100%! (N = 1) 233 in one slide! The class consists roughly of 4 quarters: (Bolded words are

ForwardAInstructionmemory

Datamemory

1

0

PC

ALURegisters

Rd

Rt0

1

IF/ID ID/EX EX/MEM MEM/WB

012

012

ForwardBForwardA, ForwardB

A) 0,0B) 0,1C) 0,2D) 2,0E) 2,1

or $10, $3, $6add $2, $10, $1

sub $1, $2, $3

Given the following instructions are in the following stages, what value should ForwardA and ForwardB have?

Page 17: L21-Forwarding - AL2 · Friday: Optional Exam 5 review session Current Exam 5 average = 100%! (N = 1) 233 in one slide! The class consists roughly of 4 quarters: (Bolded words are

ForwardAInstructionmemory

Datamemory

1

0

PC

ALURegisters

Rd

Rt0

1

IF/ID ID/EX EX/MEM MEM/WB

012

012

ForwardBForwardA, ForwardB

A) 0,0B) 0,1C) 0,2D) 2,0E) 2,1

sw $5, 4($6)add $2, $5, $4

xor $4, $3, $2

Given the following instructions are in the following stages, what value should ForwardA and ForwardB have?

Page 18: L21-Forwarding - AL2 · Friday: Optional Exam 5 review session Current Exam 5 average = 100%! (N = 1) 233 in one slide! The class consists roughly of 4 quarters: (Bolded words are

ForwardAInstructionmemory

Datamemory

1

0

PC

ALURegisters

Rd

Rt0

1

IF/ID ID/EX EX/MEM MEM/WB

012

012

ForwardBForwardA, ForwardB

A) 0,0B) 0,1C) 0,2D) 2,0E) 2,1

add $3, $3, $5add $3, $3, $6

add $3, $3, $4

Given the following instructions are in the following stages, what value should ForwardA and ForwardB have?

Page 19: L21-Forwarding - AL2 · Friday: Optional Exam 5 review session Current Exam 5 average = 100%! (N = 1) 233 in one slide! The class consists roughly of 4 quarters: (Bolded words are

ForwardAInstructionmemory

Datamemory

1

0

PC

ALURegisters

Rd

Rt0

1

IF/ID ID/EX EX/MEM MEM/WB

012

012

ForwardBForwardA, ForwardB

A) 0,0B) 0,1C) 0,2D) 2,0E) 2,1

and $5, $7, $1sw $5, 8($9)

add $8, $9, $4

Given the following instructions are in the following stages, what value should ForwardA and ForwardB have?

Page 20: L21-Forwarding - AL2 · Friday: Optional Exam 5 review session Current Exam 5 average = 100%! (N = 1) 233 in one slide! The class consists roughly of 4 quarters: (Bolded words are

ForwardAInstructionmemory

Datamemory

1

0

PC

ALURegisters

Rd

Rt0

1

IF/ID ID/EX EX/MEM MEM/WB

012

012

ForwardBForwardA, ForwardB

A) 0,0B) 0,1C) 0,2D) 2,0E) 2,1

and $5, $7, $1sw $9, 8($5)

add $8, $9, $4

Given the following instructions are in the following stages, what value should ForwardA and ForwardB have?

Page 21: L21-Forwarding - AL2 · Friday: Optional Exam 5 review session Current Exam 5 average = 100%! (N = 1) 233 in one slide! The class consists roughly of 4 quarters: (Bolded words are

or $10, $3, $6add $2, $10, $1 sub $1, $2, $3

Use “.” notation to indicate contents of pipeline registers

ID/EX.RegisterRs ID/EX.RegisterRt EX/MEM.RegisterRd

WB

EX/MEM.RegWrite

MEM/WB.RegisterRd

Page 22: L21-Forwarding - AL2 · Friday: Optional Exam 5 review session Current Exam 5 average = 100%! (N = 1) 233 in one slide! The class consists roughly of 4 quarters: (Bolded words are

Forwarding unit controls the multiplexers

Page 23: L21-Forwarding - AL2 · Friday: Optional Exam 5 review session Current Exam 5 average = 100%! (N = 1) 233 in one slide! The class consists roughly of 4 quarters: (Bolded words are

The forwarding unit several control signals as inputs.

ID/EX.RegisterRs EX/MEM.RegisterRd MEM/WB.RegisterRdID/EX.RegisterRt EX/MEM.RegWrite MEM/WB.RegWrite

Page 24: L21-Forwarding - AL2 · Friday: Optional Exam 5 review session Current Exam 5 average = 100%! (N = 1) 233 in one slide! The class consists roughly of 4 quarters: (Bolded words are

ALU source A comes from the pipeline register when necessary.

if (EX/MEM.RegWrite == 1 and EX/MEM.RegisterRd == ID/EX.RegisterRs)then ForwardA = 2

ALU source B is similar.

if (EX/MEM.RegWrite == 1 and EX/MEM.RegisterRd == ID/EX.RegisterRt)then ForwardB = 2 

Page 25: L21-Forwarding - AL2 · Friday: Optional Exam 5 review session Current Exam 5 average = 100%! (N = 1) 233 in one slide! The class consists roughly of 4 quarters: (Bolded words are

MEM/WB hazard equations Equation for MEM/WB hazards for ALU source A

if (MEM/WB.RegWrite == 1and MEM/WB.RegisterRd == ID/EX.RegisterRsand (EX/MEM.RegisterRd ≠ ID/EX.RegisterRs or EX/MEM.RegWrite = 0)

then ForwardA = 1

Equation for MEM/WB hazards for ALU source B

if (MEM/WB.RegWrite == 1and MEM/WB.RegisterRd == ID/EX.RegisterRtand (EX/MEM.RegisterRd ≠ ID/EX.RegisterRt or EX/MEM.RegWrite = 0)

then ForwardB = 1

Page 26: L21-Forwarding - AL2 · Friday: Optional Exam 5 review session Current Exam 5 average = 100%! (N = 1) 233 in one slide! The class consists roughly of 4 quarters: (Bolded words are

What about stores after a load?

In what cycle is: The load value available? The store value needed?

What do we have to add to the datapath?

DMReg RegIM

DMReg RegIM

lw $1, 0($2)

sw $1, 0($4)

1 2 3 4 5 6

a) b) c) d) e)

Page 27: L21-Forwarding - AL2 · Friday: Optional Exam 5 review session Current Exam 5 average = 100%! (N = 1) 233 in one slide! The class consists roughly of 4 quarters: (Bolded words are

Load/Store Bypassing: Extend the Datapath

0

1

Addr

Instructionmemory

Instr

Address

Writedata

Datamemory

Readdata 1

0

PC

Extend

ALUSrc Result

ZeroALU

Instr [15 - 0] RegDst

Readregister 1

Readregister 2

Writeregister

Writedata

Readdata 2

Readdata 1

Registers

Rd

Rt0

1

IF/ID ID/EX EX/MEM MEM/WB

Rs

012

012

ForwardingUnit

EX/MEM.RegisterRd

MEM/WB.RegisterRd

Sequence :lw $1, 0($2)sw $1, 0($4)

ForwardC

0

1

Page 28: L21-Forwarding - AL2 · Friday: Optional Exam 5 review session Current Exam 5 average = 100%! (N = 1) 233 in one slide! The class consists roughly of 4 quarters: (Bolded words are

Miscellaneous comments Each MIPS instruction writes to at most one register. This makes the forwarding hardware easier to design, since there is only one destination register that ever needs to be forwarded.

Forwarding is especially important with deep pipelines like the ones in all current PC processors. Section 6.4 of the textbook has some additional material not shown here. Their hazard detection equations also ensure that the source register is not $0, which can never be modified. There is a more complex example of forwarding, with several cases covered. Take a look at it!

Page 29: L21-Forwarding - AL2 · Friday: Optional Exam 5 review session Current Exam 5 average = 100%! (N = 1) 233 in one slide! The class consists roughly of 4 quarters: (Bolded words are

Summary In real code, most instructions are dependent upon other ones. This can lead to data hazards in our original pipelined datapath. Instructions can’t write back to the register file soon enough for the next two instructions to read.

Forwarding eliminates data hazards involving arithmetic instructions. The forwarding unit detects hazards by comparing the destination registers of previous instructions to the source registers of the current instruction. Hazards are avoided by grabbing results from the pipeline registers beforethey are written back to the register file.

Next time we’ll finish up pipelining. Forwarding can’t save us in some cases involving lw. We still haven’t talked about branches for the pipelined datapath.