Lecture 9. MIPS Processor Design – Pipelined Processor Design #2

Lecture 9. MIPS Processor Design – Pipelined Processor Design #2 Prof. Taeweon Suh Computer Science Education Korea University 2010 R&E Computer System Education & Research


Page 1: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Lecture 9. MIPS Processor Design – Pipelined Processor Design #2

Prof. Taeweon Suh

Computer Science Education

Korea University

2010 R&E Computer System Education & Research

Page 2: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Korea Univ

Pipelined Datapath

[Figure: pipelined datapath. PC, instruction memory, register file, sign extend, ALU, and data memory, separated by the IF/ID, ID/EX, EX/MEM, and MEM/WB pipeline registers]

Page 3: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Example for lw instruction: Instruction Fetch (IF)

[Figure: pipelined datapath, with the stage in use labeled "Instruction fetch"]

Page 4: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Example for lw instruction: Instruction Decode (ID)

[Figure: pipelined datapath, with the stage in use labeled "Instruction decode"]

Page 5: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Example for lw instruction: Execution (EX)

[Figure: pipelined datapath, with the stage in use labeled "Execution"]

Page 6: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Example for lw instruction: Memory (MEM)

[Figure: pipelined datapath, with the stage in use labeled "Memory"]

Page 7: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Example for lw instruction: Writeback (WB)

[Figure: pipelined datapath, with the stage in use labeled "Writeback"]

Page 8: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Example for sw instruction: Memory (MEM)

[Figure: pipelined datapath, with the stage in use labeled "Memory"]

Page 9: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Example for sw instruction: Writeback (WB): do nothing

[Figure: pipelined datapath, with the stage in use labeled "Writeback"]

Page 10: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Corrected Datapath (for lw)

[Figure: corrected pipelined datapath for lw, with the IF/ID, ID/EX, EX/MEM, and MEM/WB pipeline registers]

Page 11: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Pipelining Example

[Figure: pipelined datapath with one instruction in each stage]

IF: add $14, $5, $6
ID: lw $13, 24($1)
EX: add $12, $3, $4
MEM: sub $11, $2, $3
WB: lw $10, 20($1)
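The snapshot above can be reproduced with a small sketch. It assumes an ideal pipeline (one instruction enters IF per cycle, no stalls); `stage_occupancy` and the cycle numbering are illustrative names, not part of the lecture.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def stage_occupancy(program, cycle):
    """Return {stage: instruction} for a 1-based clock cycle.

    Assumes an ideal pipeline: instruction i enters IF in cycle i + 1
    and advances one stage per cycle with no stalls.
    """
    snapshot = {}
    for depth, stage in enumerate(STAGES):
        index = cycle - 1 - depth  # instruction that entered IF `depth` cycles ago
        snapshot[stage] = program[index] if 0 <= index < len(program) else None
    return snapshot

# Program order: the oldest instruction comes first.
program = [
    "lw $10, 20($1)",
    "sub $11, $2, $3",
    "add $12, $3, $4",
    "lw $13, 24($1)",
    "add $14, $5, $6",
]

for stage, instr in stage_occupancy(program, 5).items():
    print(f"{stage}: {instr}")
```

In cycle 5 all five stages are busy, which is the situation drawn on this slide.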

Page 12: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Pipeline Control

[Figure: pipelined datapath annotated with the control signals PCSrc, RegWrite, RegDst, ALUOp, ALUSrc, Branch, MemRead, MemWrite, and MemtoReg. Instruction[15–0] feeds the sign extender (16 to 32 bits); Instruction[20–16] and Instruction[15–11] are carried along as candidate write registers]

Note that in this implementation, the branch instruction decides whether to branch in the MEM stage.

Page 13: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Pipeline Control

• We have 5 stages: IF, ID, EX, MEM, WB
• What needs to be controlled in each stage?
    Instruction fetch and PC increment
    Instruction decode / operand fetch
    Execution stage
        • RegDst
        • ALUOp[1:0]
        • ALUSrc
    Memory stage
        • Branch
        • MemRead
        • MemWrite
    Writeback
        • MemtoReg
        • RegWrite (note that this signal drives the register file, which sits in the ID stage)

Page 14: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Pipeline Control

• Extend pipeline registers to include control information (created in ID)
• Pass control signals along just like the data

Execution/Address Calculation stage control lines: RegDst, ALUOp1, ALUOp0, ALUSrc
Memory access stage control lines: Branch, MemRead, MemWrite
Write-back stage control lines: RegWrite, MemtoReg

Instruction  RegDst  ALUOp1  ALUOp0  ALUSrc  Branch  MemRead  MemWrite  RegWrite  MemtoReg
R-format     1       1       0       0       0       0        0         1         0
lw           0       0       0       1       0       1        0         1         1
sw           X       0       0       1       0       0        1         0         X
beq          X       0       1       0       1       0        0         0         X

[Figure: the Control unit decodes the instruction from IF/ID and produces the EX, M, and WB fields. ID/EX carries EX, M, and WB; EX/MEM carries M and WB; MEM/WB carries WB]

Page 15: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Datapath with Control

[Figure: pipelined datapath with the Control unit in ID and the EX, M, and WB control fields carried through ID/EX, EX/MEM, and MEM/WB]

Page 16: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Datapath with Control

[Figure: pipelined datapath with control]

IF: lw $10, 9($1)

Page 17: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Datapath with Control

[Figure: pipelined datapath with control]

IF: sub $11, $2, $3
ID: lw $10, 9($1) (control outputs for "lw": WB = 11, M = 010, EX = 0001)

Page 18: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Datapath with Control

[Figure: pipelined datapath with control]

IF: and $12, $4, $5
ID: sub $11, $2, $3 (control outputs for "sub": WB = 10, M = 000, EX = 1100)
EX: lw $10, 9($1) (ID/EX holds WB = 11, M = 010)

Page 19: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Datapath with Control

[Figure: pipelined datapath with control]

IF: or $13, $6, $7
ID: and $12, $4, $5 (control outputs for "and": WB = 10, M = 000, EX = 1100)
EX: sub $11, $2, $3
MEM: lw $10, 9($1)

Page 20: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Datapath with Control

[Figure: pipelined datapath with control]

IF: add $14, $8, $9
ID: or $13, $6, $7 (control outputs for "or": WB = 10, M = 000, EX = 1100)
EX: and $12, $4, $5
MEM: sub $11, ..
WB: lw $10, 9($1)

Page 21: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Datapath with Control

[Figure: pipelined datapath with control]

IF: xxxx
ID: add $14, $8, $9 (control outputs for "add": WB = 10, M = 000, EX = 1100)
EX: or $13, $6, $7
MEM: and $12…
WB: sub $11, ..

Page 22: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Datapath with Control

[Figure: pipelined datapath with control]

IF: xxxx
ID: xxxx
EX: add $14, $8, $9
MEM: or $13, ..
WB: and $12…

Page 23: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Datapath with Control

[Figure: pipelined datapath with control]

IF: xxxx
ID: xxxx
EX: xxxx
MEM: add $14, ..
WB: or $13…

Page 24: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Datapath with Control

[Figure: pipelined datapath with control]

IF: xxxx
ID: xxxx
EX: xxxx
MEM: xxxx
WB: add $14..

Page 25: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Dependencies

• Dependencies
    Problem with starting (or executing) the next instruction before the first is finished
    Dependencies incur data and control hazards

Program execution order (in instructions):
    sub $2, $1, $3
    and $12, $2, $5
    or $13, $6, $2
    add $14, $2, $2
    sw $15, 100($2)

Time (in clock cycles):  CC 1  CC 2  CC 3  CC 4  CC 5    CC 6  CC 7  CC 8  CC 9
Value of register $2:    10    10    10    10    10/-20  -20   -20   -20   -20

[Figure: pipeline diagram (IM, Reg, ALU, DM, Reg) for the five instructions over CC 1 to CC 9]
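The dependencies on $2 can be listed mechanically. The sketch below is only an illustration: each instruction is hand-encoded as (text, destination, sources), and `raw_dependencies` is a hypothetical helper. It finds read-after-write pairs; whether a pair becomes a hazard depends on how far apart the two instructions are in the pipeline.

```python
# (text, destination register or None, source registers), encoded by hand.
program = [
    ("sub $2, $1, $3",  "$2",  ["$1", "$3"]),
    ("and $12, $2, $5", "$12", ["$2", "$5"]),
    ("or $13, $6, $2",  "$13", ["$6", "$2"]),
    ("add $14, $2, $2", "$14", ["$2", "$2"]),
    ("sw $15, 100($2)", None,  ["$15", "$2"]),
]

def raw_dependencies(program):
    """Return (producer_index, consumer_index, register) triples."""
    deps = []
    for j, (_, _, sources) in enumerate(program):
        for reg in sorted(set(sources)):
            # Walk backwards to the most recent earlier writer of reg.
            for i in range(j - 1, -1, -1):
                if program[i][1] == reg:
                    deps.append((i, j, reg))
                    break
    return deps

for producer, consumer, reg in raw_dependencies(program):
    print(f"{program[consumer][0]!r} reads {reg} written by {program[producer][0]!r}")
```

All four later instructions depend on the `sub`, matching the arrows in the pipeline diagram.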

Page 26: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Data Hazard - Software Solution

• Data hazards
    Dependencies that "go backward in time"
• Have compiler guarantee no hazards?
    Insert nop (no operation) instructions ("0x00000000" is nop in MIPS)
    Code scheduling
• Where do we insert the "nops"?
    sub $2, $1, $3
    and $12, $2, $5
    or $13, $6, $2
    add $14, $2, $2
    sw $15, 100($2)
• Problem? This really slows us down!
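One way to answer the "where" question is a small padding pass. This is a sketch under a stated assumption: `min_distance` is the number of instruction slots that must separate a producer from its consumer when there is no forwarding. A value of 3 corresponds to a register file that writes in the first half of the cycle and reads in the second half; a value of 4 corresponds to a design that uses only the rising clock edge. The function name and the tuple encoding are hypothetical.

```python
NOP = ("nop", None, [])

program = [
    ("sub $2, $1, $3",  "$2",  ["$1", "$3"]),
    ("and $12, $2, $5", "$12", ["$2", "$5"]),
    ("or $13, $6, $2",  "$13", ["$6", "$2"]),
    ("add $14, $2, $2", "$14", ["$2", "$2"]),
    ("sw $15, 100($2)", None,  ["$15", "$2"]),
]

def insert_nops(program, min_distance):
    """Pad with nops so each consumer sits at least `min_distance`
    slots after the instruction that writes one of its sources."""
    out = []
    for instr in program:
        sources = instr[2]
        needed = 0
        for back in range(1, min_distance):  # back = 1 is the previous slot
            if back > len(out):
                break
            dest = out[-back][1]
            if dest is not None and dest in sources:
                needed = max(needed, min_distance - back)
        out.extend([NOP] * needed)
        out.append(instr)
    return out

for text, _, _ in insert_nops(program, 3):
    print(text)
```

With `min_distance=3` two nops land between `sub` and `and`; with `min_distance=4` three are needed. Either way the padding adds cycles that do no useful work.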

Page 27: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Data Hazard - Pipeline Stalls?

Program execution order (in instructions):
    sub $2, $1, $3
    and $12, $2, $5
    or $13, $6, $2
    add $14, $2, $2
    sw $15, 100($2)

[Figure: pipeline diagram in which three stall cycles (bubbles) are inserted after sub before the dependent instructions proceed]

Page 28: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Data Hazard - Forwarding

• Use temporary results, don't wait for them to be written
    Register file forwarding to handle read/write to same register
    ALU forwarding

Program execution order (in instructions):
    sub $2, $1, $3
    and $12, $2, $5
    or $13, $6, $2
    add $14, $2, $2
    sw $15, 100($2)

Time (in clock cycles):  CC 1  CC 2  CC 3  CC 4  CC 5    CC 6  CC 7  CC 8  CC 9
Value of register $2:    10    10    10    10    10/-20  -20   -20   -20   -20
Value of EX/MEM:         X     X     X     -20   X       X     X     X     X
Value of MEM/WB:         X     X     X     X     -20     X     X     X     X

[Figure: pipeline diagram (IM, Reg, ALU, DM, Reg) for the five instructions over CC 1 to CC 9, with forwarding paths]

Ok.. Then, do we have to do this forwarding?
1. If you are asked to design CPU using only rising-edge of the clock, then?
    • Let's stick to this for our project
2. If the register file write occurs in the first half of the clock, and read occurs in the 2nd half of the clock, then?
    • Our textbook follows this

Page 29: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Forwarding (simplified)

[Figure: simplified datapath with the register file, ID/EX, ALU, EX/MEM, data memory, MEM/WB, and the write-back MUX]

Page 30: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Forwarding (from EX/MEM)

[Figure: simplified datapath with MUXes on the ALU inputs and a forwarding path from EX/MEM]

Page 31: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Forwarding (from MEM/WB)

[Figure: simplified datapath with MUXes on the ALU inputs and a forwarding path from MEM/WB]

Page 32: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Forwarding (operand selection)

[Figure: simplified datapath in which a Forwarding Unit controls the MUXes on the ALU inputs]

Page 33: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Forwarding (operand propagation)

[Figure: simplified datapath in which Rs and Rt from ID/EX, EX/MEM Rd, and MEM/WB Rd are routed to the Forwarding Unit; a MUX selects Rt or Rd as the destination register]

Page 34: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Forwarding

[Figure: full pipelined datapath with a forwarding unit. IF/ID.RegisterRs, IF/ID.RegisterRt, and IF/ID.RegisterRd are carried into ID/EX; the forwarding unit takes Rs and Rt together with EX/MEM.RegisterRd and MEM/WB.RegisterRd and controls the ALU input muxes]
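The selection made by the forwarding unit can be sketched per ALU operand. The register-number comparisons come from the figure labels; the RegWrite checks and the exclusion of register 0 are assumptions taken from the usual textbook formulation, and `forward_select` is a hypothetical name.

```python
def forward_select(id_ex_reg, ex_mem_reg_write, ex_mem_rd,
                   mem_wb_reg_write, mem_wb_rd):
    """Pick the source of one ALU operand (Rs or Rt of the instruction in EX).

    Returns "EX/MEM", "MEM/WB", or "REG" (the value read from the register file).
    Register 0 is never forwarded (assumption: $0 is hard-wired to zero).
    """
    if ex_mem_reg_write and ex_mem_rd != 0 and ex_mem_rd == id_ex_reg:
        return "EX/MEM"  # the most recent result takes priority
    if mem_wb_reg_write and mem_wb_rd != 0 and mem_wb_rd == id_ex_reg:
        return "MEM/WB"
    return "REG"

# sub $2, $1, $3 is in MEM while and $12, $2, $5 is in EX:
print(forward_select(2, True, 2, False, 0))   # Rs = $2
print(forward_select(5, True, 2, False, 0))   # Rt = $5
```

Checking EX/MEM before MEM/WB matters when both hold a write to the same register: the younger value is the correct one.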

Page 35: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Can't always forward

• lw (load word) can still cause a hazard
    An instruction tries to read a register following a load instruction that writes to the same register
• Thus, we need a hazard detection unit to "stall" the pipeline after the load instruction

Program execution order (in instructions):
    lw $2, 20($1)
    and $4, $2, $5
    or $8, $2, $6
    add $9, $4, $2
    slt $1, $6, $7

[Figure: pipeline diagram (IM, Reg, ALU, DM, Reg) for the five instructions over CC 1 to CC 9]

Page 36: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Stalling

• We can stall the pipeline by keeping an instruction in the same stage

Program execution order (in instructions):
    lw $2, 20($1)
    and $4, $2, $5
    or $8, $2, $6
    add $9, $4, $2
    slt $1, $6, $7

[Figure: pipeline diagram over CC 1 to CC 10 with one bubble inserted after lw; the ID and IF stages of the following instructions are held for one cycle]

Page 37: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Hazard Detection Unit

• Stall by letting an instruction that won't write anything go forward
• Stall the pipeline if the instruction in ID/EX is a load and (ID/EX.rt = IF/ID.rs or ID/EX.rt = IF/ID.rt)

[Figure: pipelined datapath with a hazard detection unit and a forwarding unit. The hazard detection unit takes ID/EX.MemRead, ID/EX.RegisterRt, IF/ID.RegisterRs, and IF/ID.RegisterRt, and drives PCWrite, IF/IDWrite, and a mux that can replace the control signals with 0]
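The stall condition on this slide can be written as a small sketch. The signal names follow the figure labels; the function name and the dictionary of outputs are illustrative assumptions.

```python
def hazard_detect(id_ex_mem_read, id_ex_rt, if_id_rs, if_id_rt):
    """Load-use hazard check between the load in EX and the instruction in ID."""
    stall = bool(id_ex_mem_read and id_ex_rt in (if_id_rs, if_id_rt))
    return {
        "PCWrite": not stall,      # hold the PC so the same instruction is refetched
        "IF/IDWrite": not stall,   # hold IF/ID so the instruction stays in ID
        "zero_control": stall,     # send all-zero control into ID/EX (a bubble)
    }

# lw $2, 20($1) is in EX; and $4, $2, $5 is in ID (rs = 2, rt = 5).
print(hazard_detect(True, 2, 2, 5))
```

When the condition holds, the instruction entering EX has all control bits zero, so it writes nothing: this is the "instruction that won't write anything" in the first bullet.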

Page 38: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Control Hazards - Branch

• When we decide to branch, other instructions are in the pipeline!
• Assume: branch is not taken
    When this assumption fails, flush 3 instructions
• We are predicting "branch not taken"
    Need to add hardware for flushing instructions if we are wrong

Program execution order (in instructions):
    40 beq $1, $3, 7
    44 and $12, $2, $5
    48 or $13, $6, $2
    52 add $14, $2, $2
    72 lw $4, 50($7)

[Figure: pipeline diagram (IM, Reg, ALU, DM, Reg) for the five instructions over CC 1 to CC 9]

Page 39: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Alleviate Branch Hazards

• Move branch compare to ID stage of the pipeline
• Add adder to calculate branch target in ID stage
• Add IF.flush signal that zeros (squashes) the instruction in the IF/ID pipeline register
• Reduce penalty to 1 cycle

    beq $1,$2,L1        IF ID EX MEM WB
    add $1,$2,$3        IF ID EX MEM WB (becomes a bubble)
    L1: sub $1,$2,$3    IF ID EX MEM WB

Figure annotations: "Taken target address is known here" and "Actual condition is generated here"

Page 40: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Flushing Instructions

[Figure: pipelined datapath with branch resolution in ID. An equality comparator (=) on the register outputs, sign extend and shift left 2 for the branch target, and the IF.Flush signal into IF/ID, alongside the hazard detection and forwarding units]

Page 41: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Flushing Instructions (cycle N)

[Figure: pipelined datapath with branch resolution in ID and the IF.Flush signal]

IF: and $12, $2, $5
ID: beq $1, $3, L2

Program:
    beq $1, $3, L2
    and $12, $2, $5
    or $13, $12, $1
    …
    L2: lw $4, 40($7)

Page 42: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Flushing Instructions (cycle N)

[Figure: same datapath, with the branch target L2 annotated]

IF: and $12, $2, $5
ID: beq $1, $3, L2

Program:
    beq $1, $3, L2
    and $12, $2, $5
    or $13, $12, $1
    …
    L2: lw $4, 40($7)

Page 43: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Flushing Instructions (cycle N+1)

[Figure: pipelined datapath with branch resolution in ID and the IF.Flush signal]

IF: lw $4, 40($7)
ID: nop
EX: beq $1, $3, L2

Program:
    beq $1, $3, L2
    and $12, $2, $5
    or $13, $12, $1
    …
    L2: lw $4, 40($7)

Page 44: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Improving Performance

• Try to avoid stalls! E.g., reorder these instructions:
    lw $t0, 0($t1)
    lw $t2, 4($t1)
    sw $t2, 0($t1)
    sw $t0, 4($t1)
• Add a "branch delay slot"
    The next instruction after a branch is always executed
    Rely on compiler to "fill" the slot with something useful
• Superscalar
    Start more than one instruction in the same cycle
    Almost all processors are now pipelined and superscalar
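A quick sketch of why reordering helps. It assumes forwarding is in place and that any instruction reading a register loaded by the instruction immediately before it costs one stall cycle, as in the hazard detection condition given earlier. The encoding (op, destination, sources) and `load_use_stalls` are hypothetical.

```python
def load_use_stalls(program):
    """Count one-cycle load-use stalls between adjacent instructions."""
    stalls = 0
    for prev, curr in zip(program, program[1:]):
        op, dest, _ = prev
        if op == "lw" and dest in curr[2]:
            stalls += 1
    return stalls

original = [
    ("lw", "$t0", ["$t1"]),          # lw $t0, 0($t1)
    ("lw", "$t2", ["$t1"]),          # lw $t2, 4($t1)
    ("sw", None,  ["$t2", "$t1"]),   # sw $t2, 0($t1)
    ("sw", None,  ["$t0", "$t1"]),   # sw $t0, 4($t1)
]
# Swap the two stores; they write different addresses, so the result is the same.
reordered = [original[0], original[1], original[3], original[2]]

print(load_use_stalls(original), load_use_stalls(reordered))
```

Moving `sw $t0, 4($t1)` ahead of `sw $t2, 0($t1)` gives the second load one extra cycle to complete before its value is needed.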

Page 45: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Dynamic Scheduling

• The hardware performs the "scheduling"
    Hardware tries to find instructions to execute
    Out of order (OOO) execution is possible
    Speculative execution and dynamic branch prediction
• All modern processors are very complicated
    DEC Alpha 21264: 9 stage pipeline, 6 instruction issue
    PowerPC and Pentium: branch history table
    Compiler technology is important
• This class has given you the background you need to learn more

Page 46: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Exceptions & Interrupts

• CPU has to prepare for all possible situations it could face
    "Unexpected" events require change in flow of control
• Exceptions arise within the CPU
    Undefined opcode
    Arithmetic overflow in MIPS
        • Some other architectures (such as x86 and ARM) do not generate an exception on arithmetic overflow. Instead, they set bits of the flag register inside the CPU
• Interrupts are from external I/O devices
        • Keyboard, mouse, network card, etc.
• Many architectures and authors do not distinguish between interrupts and exceptions
    Often use the term "interrupt" to refer to both types of events

Page 47: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Pipelined Performance Example

• Ideally CPI = 1
• But, need to handle stalling (caused by loads and branches)
• SPECINT2000 benchmark:
    25% loads
    10% stores
    11% branches
    2% jumps
    52% R-type
• Suppose
    40% of loads are used by next instruction
    25% of branches are mispredicted
• What is the average CPI?

Page 48: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Pipelined Performance Example

• SPECINT2000 benchmark:
    25% loads
    10% stores
    11% branches
    2% jumps
    52% R-type
• If there is no stall in the pipelined MIPS, how would you calculate CPI?
    Average CPI = (0.25)(1 CPI) + (0.10)(1 CPI) + (0.11)(1 CPI) + (0.02)(1 CPI) + (0.52)(1 CPI) = 1
• Suppose
    40% of loads are used by next instruction
    25% of branches are mispredicted
    All jumps flush next instruction
• What is the average CPI?
    Load/Branch CPI = 1 when no stalling, 2 when stalling. Thus
    CPI_lw = 1(0.6) + 2(0.4) = 1.4
    CPI_beq = 1(0.75) + 2(0.25) = 1.25
    CPI_jump = 2(1) = 2
• Average CPI = (0.25)(1.4) + (0.1)(1) + (0.11)(1.25) + (0.02)(2) + (0.52)(1) = 1.1475 ≈ 1.15
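The same arithmetic as a short sketch, using only the numbers on this slide. The dictionary names are illustrative.

```python
# Instruction mix from the SPECINT2000 figures on the slide.
mix = {"load": 0.25, "store": 0.10, "branch": 0.11, "jump": 0.02, "r_type": 0.52}

# Per-class CPI: 1 cycle normally, 2 cycles when a stall or flush occurs.
cpi = {
    "load": 1 * 0.6 + 2 * 0.4,       # 40% of loads are used by the next instruction
    "store": 1.0,
    "branch": 1 * 0.75 + 2 * 0.25,   # 25% of branches are mispredicted
    "jump": 2.0,                     # every jump flushes the next instruction
    "r_type": 1.0,
}

average_cpi = sum(mix[k] * cpi[k] for k in mix)
print(round(average_cpi, 2))
```

The weighted sum comes to 1.1475, which the slide rounds to 1.15.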

Page 49: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Pipelined Performance

• Critical path of the pipelined MIPS processor:

\[
T_c = \max \left\{
\begin{array}{ll}
t_{pcq} + t_{mem} + t_{setup}, & \text{IF stage} \\
2\,(t_{RFread} + t_{mux} + t_{eq} + t_{AND} + t_{mux} + t_{setup}), & \text{ID stage} \\
t_{pcq} + t_{mux} + t_{mux} + t_{ALU} + t_{setup}, & \text{EX stage} \\
t_{pcq} + t_{memwrite} + t_{setup}, & \text{MEM stage} \\
2\,(t_{pcq} + t_{mux} + t_{RFwrite}) & \text{WB stage}
\end{array}
\right\}
\]

Where does this "2" come from?
1. If you are asked to design CPU using only rising-edge of the clock, then?
    • Let's stick to this for our project
2. If the register file write occurs in the first half of the clock, and read occurs in the 2nd half of the clock, then?
    • Our textbook follows this

Page 50: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Pipelined Performance Example

\[
T_c = 2\,(t_{RFread} + t_{mux} + t_{eq} + t_{AND} + t_{mux} + t_{setup}) = 2\,[150 + 25 + 40 + 15 + 25 + 20]\ \text{ps} = 550\ \text{ps}
\]

Element               Parameter   Delay (ps)
Register clock-to-Q   tpcq_PC     30
Register setup        tsetup      20
Multiplexer           tmux        25
ALU                   tALU        200
Memory read           tmem        250
Register file read    tRFread     150
Register file setup   tRFsetup    20
Equality comparator   teq         40
AND gate              tAND        15
Memory write          tmemwrite   220
Register file write   tRFwrite    100
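To see why the ID stage sets the clock period, the per-stage delays can be evaluated from the table. This sketch uses the stage expressions of the critical-path formula and the table's delays; the dictionary keys are shorthand for the table's parameter names.

```python
# Element delays in picoseconds, from the table.
t = {
    "pcq": 30, "setup": 20, "mux": 25, "ALU": 200, "mem": 250,
    "RFread": 150, "eq": 40, "AND": 15, "memwrite": 220, "RFwrite": 100,
}

# Per-stage delay; the factor 2 applies to the half-cycle register file stages.
stage_delay = {
    "IF":  t["pcq"] + t["mem"] + t["setup"],
    "ID":  2 * (t["RFread"] + t["mux"] + t["eq"] + t["AND"] + t["mux"] + t["setup"]),
    "EX":  t["pcq"] + t["mux"] + t["mux"] + t["ALU"] + t["setup"],
    "MEM": t["pcq"] + t["memwrite"] + t["setup"],
    "WB":  2 * (t["pcq"] + t["mux"] + t["RFwrite"]),
}

tc = max(stage_delay.values())
slowest = max(stage_delay, key=stage_delay.get)
print(stage_delay, tc, slowest)
```

The ID stage is the slowest at 550 ps, so it determines Tc; every other stage finishes in 310 ps or less.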

Page 51: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Pipelined Performance Example

• For a program with 100 billion instructions executing on a pipelined MIPS processor,
    CPI = 1.15
    Tc = 550 ps

Execution Time = (#instructions)(cycles/instruction)(seconds/cycle)
               = (100 × 10^9)(1.15)(550 × 10^-12 s)
               = 63 seconds

Processor      Execution Time (seconds)   Speedup (single-cycle is baseline)
Single-cycle   95                         1
Multicycle     133                        0.71
Pipelined      63                         1.51
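The execution time and speedup figures can be checked with a few lines. The single-cycle and multicycle times are taken from the table as given; the slide rounds the pipelined time to 63 seconds before computing the speedup, and this sketch does the same.

```python
instructions = 100e9      # 100 billion instructions
cpi = 1.15                # average CPI from the earlier example
tc = 550e-12              # clock period in seconds (550 ps)

pipelined_exact = instructions * cpi * tc   # about 63.25 seconds
pipelined_seconds = round(pipelined_exact)  # the slide reports 63

# Execution times in seconds; the first two are taken from the table.
times = {"Single-cycle": 95, "Multicycle": 133, "Pipelined": pipelined_seconds}
speedups = {name: round(times["Single-cycle"] / secs, 2) for name, secs in times.items()}

print(pipelined_seconds, speedups)
```

Using the unrounded 63.25 seconds instead would give a speedup of about 1.50 rather than 1.51.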

Page 52: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Backup Slides

Page 53: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Exception Handling in MIPS and Handler Actions

• Exception handling in MIPS hardware (CPU)
    CPU saves PC of offending (or interrupted) instruction to the "Exception Program Counter (EPC)" register
    CPU saves indication of the problem to the "Cause" register
    Jump to handler at 0x8000 0180
• Exception handler in software
    Read cause, and transfer to relevant handler
    If restartable,
        • Take corrective action
        • Use EPC to return to program
    Otherwise
        • Terminate program
        • Report error using EPC, cause, …

Page 54: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Exceptions in a Pipeline

• Another form of control hazard
• Consider overflow on add in EX stage
    add $1, $2, $1
    Prevent $1 from being clobbered
    Complete previous instructions
    Flush add and subsequent instructions
    Set Cause and EPC register values
    Transfer control to handler
• Similar to mispredicted branch
    Use much of the same hardware

Page 55: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Pipeline with Exceptions

Page 56: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Exception Example

• Exception on add in
    40 sub $11, $2, $4
    44 and $12, $2, $5
    48 or $13, $2, $6
    4C add $1, $2, $1
    50 slt $15, $6, $7
    54 lw $16, 50($7)
    …
• Handler
    80000180 sw $25, 1000($0)
    80000184 sw $26, 1004($0)
    …

Page 57: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Exception Example

Page 58: Lecture 9. MIPS Processor Design –   Pipelined Processor Design #2

Exception Example