DYNAMIC SCHEDULING - cs.utah.edubojnordi/classes/6810/s18/slides/11-ooo.pdf · WB DIV FP/integer...

12
DYNAMIC SCHEDULING CS/ECE 6810: Computer Architecture Mahdi Nazm Bojnordi Assistant Professor School of Computing University of Utah

Transcript of DYNAMIC SCHEDULING - cs.utah.edubojnordi/classes/6810/s18/slides/11-ooo.pdf · WB DIV FP/integer...

Page 1: DYNAMIC SCHEDULING - cs.utah.edubojnordi/classes/6810/s18/slides/11-ooo.pdf · WB DIV FP/integer divider A1 A2 A3 A4 ... Big Picture ¨Goal:exploiting ... ¨Key idea:creating an instruction

DYNAMIC SCHEDULING

CS/ECE 6810: Computer Architecture

Mahdi Nazm Bojnordi

Assistant Professor

School of Computing

University of Utah

Page 2: DYNAMIC SCHEDULING - cs.utah.edubojnordi/classes/6810/s18/slides/11-ooo.pdf · WB DIV FP/integer divider A1 A2 A3 A4 ... Big Picture ¨Goal:exploiting ... ¨Key idea:creating an instruction

Overview

¨ Announcement¤ Homework 3 will be uploaded tonight

¨ This lecture¤ Dynamic scheduling

n Forming data flow graph on the fly

¤ Register renamingn Removing false data dependencen Architectural vs. physical registers

Page 3: DYNAMIC SCHEDULING - cs.utah.edubojnordi/classes/6810/s18/slides/11-ooo.pdf · WB DIV FP/integer divider A1 A2 A3 A4 ... Big Picture ¨Goal:exploiting ... ¨Key idea:creating an instruction

Big Picture

¨ Goal: exploiting more ILP by avoiding stall cycles¤ Branch prediction can avoid the stall cycles in the

frontend

WB

DIV

FP/integer divider

A1 A2 A3 A4

FP adder

M1 M2 M3 M4 M5 M6 M7

FP/integer multiply

MemEx

Integer unit

IF ID

Reorder Buffer (ROB)

Branchpred

Page 4: DYNAMIC SCHEDULING - cs.utah.edubojnordi/classes/6810/s18/slides/11-ooo.pdf · WB DIV FP/integer divider A1 A2 A3 A4 ... Big Picture ¨Goal:exploiting ... ¨Key idea:creating an instruction

Big Picture

¨ Goal: exploiting more ILP by avoiding stall cycles¤ Branch prediction can avoid the stall cycles in the

frontendn More instructions are sent to the pipeline

WB

DIV

FP/integer divider

A1 A2 A3 A4

FP adder

M1 M2 M3 M4 M5 M6 M7

FP/integer multiply

MemEx

Integer unit

IF ID

Reorder Buffer (ROB)

Queue

Branchpred

Page 5: DYNAMIC SCHEDULING - cs.utah.edubojnordi/classes/6810/s18/slides/11-ooo.pdf · WB DIV FP/integer divider A1 A2 A3 A4 ... Big Picture ¨Goal:exploiting ... ¨Key idea:creating an instruction

Big Picture

¨ Goal: exploiting more ILP by avoiding stall cycles¤ Branch prediction can avoid the stall cycles in the

frontendn More instructions are sent to the pipeline

¤ Instruction scheduling can remove unnecessary stall cycles in the execution/memory stagen Static scheduling

n Complex software (compiler)n Unable to resolve all data hazards (no access to runtime details)

n Dynamic schedulingn Completely done in hardware

Page 6: DYNAMIC SCHEDULING - cs.utah.edubojnordi/classes/6810/s18/slides/11-ooo.pdf · WB DIV FP/integer divider A1 A2 A3 A4 ... Big Picture ¨Goal:exploiting ... ¨Key idea:creating an instruction

Dynamic Scheduling

¨ Key idea: creating an instruction schedule based on runtime information¤ Hardware managed instruction reordering

WB

DIV

FP/integer divider

A1 A2 A3 A4

FP adder

M1 M2 M3 M4 M5 M6 M7

FP/integer multiply

MemEx

Integer unit

IF ID

Reorder Buffer (ROB)

Queue

DIV F1, F2, F3

ADD F4, F1, F5

SUB F6, F5, F7

Assembly code:Long latency operation

Dependent instruction

Independent instruction

Out-of-order execution?

Page 7: DYNAMIC SCHEDULING - cs.utah.edubojnordi/classes/6810/s18/slides/11-ooo.pdf · WB DIV FP/integer divider A1 A2 A3 A4 ... Big Picture ¨Goal:exploiting ... ¨Key idea:creating an instruction

Dynamic Scheduling

¨ Key idea: creating an instruction schedule based on runtime information¤ Hardware managed instruction reordering¤ Instructions are executed in data flow order

ADDI R1, R0, #1ADDI R2, R0, #4ADD R3, R3, R2ADD R2, R2, #-1BNEQ R2, R1, nextADD R4, R4, R3BNEQ R2, R0, loop

ADDI R1, R0, #1ADDI R2, R0, #4ADD R3, R3, R2ADD R2, R2, #-1BNEQ R2, R1, nextBNEQ R2, R0, loopADD R3, R3, R2ADD R2, R2, #-1BNEQ R2, R1, nextBNEQ R2, R0, loopADD R3, R3, R2ADD R2, R2, #-1BNEQ R2, R1, nextADD R4, R4, R3BNEQ R2, R0, loopADD R3, R3, R2ADD R2, R2, #-1BNEQ R2, R1, nextBNEQ R2, R0, loop

next:

loop:

ADDI R1, R0, #1ADDI R2, R0, #4ADD R3, R3, R2ADD R2, R2, #-1BNEQ R2, R1, nextADD R4, R4, R3BNEQ R2, R0, loop

Program code

How to form data flow graph on the fly?

ADDI R1, R0, #1 ADDI R2, R0, #4

ADD R3, R3, R2ADD R2, R2, #-1

ADD R3, R3, R2ADD R2, R2, #-1

ADD R3, R3, R2ADD R2, R2, #-1

ADD R4, R4, R3ADD R3, R3, R2ADD R2, R2, #-1

Data flow

Page 8: DYNAMIC SCHEDULING - cs.utah.edubojnordi/classes/6810/s18/slides/11-ooo.pdf · WB DIV FP/integer divider A1 A2 A3 A4 ... Big Picture ¨Goal:exploiting ... ¨Key idea:creating an instruction

Register Renaming

¨ Eliminating WAR and WAW hazards¤ Change the mapping between architectural registers

and physical storage locations

WB

DIV

FP/integer divider

A1 A2 A3 A4

FP adder

M1 M2 M3 M4 M5 M6 M7

FP/integer multiply

MemEx

Integer unit

IF ID

Reorder Buffer (ROB)

Queue

DIV F1, F2, F3

ADD F4, F1, F5

SUB F5, F6, F7

ADD F4, F5, F8

DIV F1, F2, F3

ADD F4, F1, F5

SUB Q1, F6, F7

ADD Q2, Q1, F8

WAR

WAW

RAW WAR and WAW hazards can be removed using more registers

Page 9: DYNAMIC SCHEDULING - cs.utah.edubojnordi/classes/6810/s18/slides/11-ooo.pdf · WB DIV FP/integer divider A1 A2 A3 A4 ... Big Picture ¨Goal:exploiting ... ¨Key idea:creating an instruction

Register Renaming

¨ Eliminating WAR and WAW hazardsn 1. allocate a free physical location for the new registern 2. find the most recently allocated location for the register

DIV F1, F2, F3ADD F4, F1, F5SUB F5, F6, F7ADD F4, F5, F8

Architectural Registers

F1F2F3F4F5

Physical Locations

P10P11P12P13P14

F6F7F8

P15P16P17

DIV P12, P11, P10

P18P19

Page 10: DYNAMIC SCHEDULING - cs.utah.edubojnordi/classes/6810/s18/slides/11-ooo.pdf · WB DIV FP/integer divider A1 A2 A3 A4 ... Big Picture ¨Goal:exploiting ... ¨Key idea:creating an instruction

Register Renaming

¨ Eliminating WAR and WAW hazardsn 1. allocate a free physical location for the new registern 2. find the most recently allocated location for the register

DIV F1, F2, F3ADD F4, F1, F5SUB F5, F6, F7ADD F4, F5, F8

Architectural Registers

F1F2F3F4F5

Physical Locations

P10P11P12P13P14

F6F7F8

P15P16P17

DIV P12, P11, P10 ADD P14, P12, P15

P18P19

Page 11: DYNAMIC SCHEDULING - cs.utah.edubojnordi/classes/6810/s18/slides/11-ooo.pdf · WB DIV FP/integer divider A1 A2 A3 A4 ... Big Picture ¨Goal:exploiting ... ¨Key idea:creating an instruction

Register Renaming

¨ Eliminating WAR and WAW hazardsn 1. allocate a free physical location for the new registern 2. find the most recently allocated location for the register

DIV F1, F2, F3ADD F4, F1, F5SUB F5, F6, F7ADD F4, F5, F8

Architectural Registers

F1F2F3F4F5

Physical Locations

P10P11P12P13P14

F6F7F8

P15P16P17

DIV P12, P11, P10 ADD P14, P12, P15SUB P19, P17, P13 P18

P19

Page 12: DYNAMIC SCHEDULING - cs.utah.edubojnordi/classes/6810/s18/slides/11-ooo.pdf · WB DIV FP/integer divider A1 A2 A3 A4 ... Big Picture ¨Goal:exploiting ... ¨Key idea:creating an instruction

Register Renaming

¨ Eliminating WAR and WAW hazardsn 1. allocate a free physical location for the new registern 2. find the most recently allocated location for the register

DIV F1, F2, F3ADD F4, F1, F5SUB F5, F6, F7ADD F4, F5, F8

Architectural Registers

F1F2F3F4F5

Physical Locations

P10P11P12P13P14

F6F7F8

P15P16P17

DIV P12, P11, P10 ADD P14, P12, P15SUB P19, P17, P13ADD P18, P19, P16

P18P19