Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another...

111
Pipelining

Transcript of Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another...

Page 1: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Pipelining

Page 2: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Performance Measurements

• Cycle Time: Time in between clock ticks

• Latency: Time to finish a complete job, start to finish

• Throughput: Average jobs completed per unit time

• CyclesPerJob: Number of cycles between finishing jobs.

Page 3: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Goals

• Faster clock rate• Use machine more efficiently• No longer execute only one instruction

at a time

Page 4: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Laundry

• Laundry-o-matic washes, dries & folds

• Wash: 30 min• Dry: 40 min• Fold: 20 min• It switches them internally with no

delay• How long to complete 1 load?

______

Page 5: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Laundry

• Laundry-o-matic washes, dries & folds

• Wash: 30 min• Dry: 40 min• Fold: 20 min• It switches them internally with no

delay• How long to complete 1 load? 90

min

Page 6: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Laundry-o-Matic -SingleCycle

Minutes

Load

1

2

3

0 30 60 90 120 150 180 210 240 270

Page 7: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Laundry-o-Matic -SingleCycle

Minutes

Load

1

2

3

0 30 60 90 120 150 180 210 240 270

W FD

Page 8: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Laundry-o-Matic -SingleCycle

Minutes

Load

1

2

3

0 30 60 90 120 150 180 210 240 270

W FW D F

D

Page 9: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Laundry-o-Matic -SingleCycle

Minutes

Load

1

2

3

0 30 60 90 120 150 180 210 240 270

W FW D F

W FD

D

Page 10: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Laundry-o-Matic

• Cycle Time: Clothing is switched every ____ minutes

• Latency: A single load takes a total of ______ minutes

• Throughput: A load completes each ______ minutes

• CyclesPerLoad: Every ____ cycles, a load completes

Page 11: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Laundry-o-Matic

• Cycle Time: Clothing is switched every 90 minutes

• Latency: A single load takes a total of ______ minutes

• Throughput: A load completes each ______ minutes

• CyclesPerLoad: Every ____ cycles, a load completes

Page 12: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Laundry-o-Matic

• Cycle Time: Clothing is switched every 90 minutes

• Latency: A single load takes a total of 90 minutes

• Throughput: A load completes each ______ minutes

• CyclesPerLoad: Every ____ cycles, a load completes

Page 13: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Laundry-o-Matic

• Cycle Time: Clothing is switched every 90 minutes

• Latency: A single load takes a total of 90 minutes

• Throughput: A load completes each 90 minutes

• CyclesPerLoad: Every ____ cycles, a load completes

Page 14: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Laundry-o-Matic

• Cycle Time: Clothing is switched every 90 minutes

• Latency: A single load takes a total of 90 minutes

• Throughput: A load completes each 90 minutes

• CyclesPerLoad: Every 1 cycles, a load completes

Page 15: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Pipelined Laundry

• Split the laundry-o-matic into a washer, dryer, and folder (what a concept)

• Moving the laundry from one unit to another takes 6 minutes

Page 16: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Pipelined Laundry

Minutes

Load

1

2

3

0 30 60 90 120 150 180 210 240 270

W FD

We have to include time to switch stages

Page 17: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Pipelined Laundry

Minutes

Load

1

2

3

0 30 60 90 120 150 180 210 240 270

W FDW FD

Page 18: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Pipelined Laundry

Minutes

Load

1

2

3

0 30 60 90 120 150 180 210 240 270

W FDW FD

Two loads can not be in Dryer at the same

time.

Page 19: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Pipelined Laundry

Minutes

Load

1

2

3

0 30 60 90 120 150 180 210 240 270

W FW

Switch all loads at the same time

DD F

Page 20: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Pipelined Laundry

Minutes

Load

1

2

3

0 30 60 90 120 150 180 210 240 270

W FWD

D FW D F

Page 21: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Pipelined Laundry

• Cycle Time: Clothing is switched every ____ minutes

• Latency: A single load takes a total of ______ minutes

• Throughput: A load completes each ______ minutes

• CyclesPerLoad: Every ____ cycles, a load completes

Page 22: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Pipelined Laundry

• Cycle Time: Clothing is switched every 46 minutes

• Latency: A single load takes a total of ______ minutes

• Throughput: A load completes each ______ minutes

• CyclesPerLoad: Every ____ cycles, a load completes

Page 23: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Pipelined Laundry

• Cycle Time: Clothing is switched every 46 minutes

• Latency: A single load takes a total of 138 minutes

• Throughput: A load completes each ______ minutes

• CyclesPerLoad: Every ____ cycles, a load completes

Page 24: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Pipelined Laundry

• Cycle Time: Clothing is switched every 46 minutes

• Latency: A single load takes a total of 138 minutes

• Throughput: A load completes each 46 minutes

• CyclesPerLoad: Every ____ cycles, a load completes

Page 25: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Pipelined Laundry

• Cycle Time: Clothing is switched every 46 minutes

• Latency: A single load takes a total of 138 minutes

• Throughput: A load completes each 46 minutes

• CyclesPerLoad: Every 1 cycles, a load completes

Page 26: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Single-Cycle vs Pipelined

• _________ has the higher cycle time• _________ has the higher clock rate• _________ has the higher single-load

latency• _________ has the higher throughput• _________ has the higher CPL (Cycles

per Load)• More stages makes a _______ clock

rate

Page 27: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Single-Cycle vs Pipelined

• Single has the higher cycle time• _________ has the higher clock rate• _________ has the higher single-load

latency• _________ has the higher throughput• _________ has the higher CPL (Cycles

per Load)• More stages makes a _______ clock

rate

Page 28: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Single-Cycle vs Pipelined

• Single has the higher cycle time• Pipelined has the higher clock rate• _________ has the higher single-load

latency• _________ has the higher throughput• _________ has the higher CPL (Cycles

per Load)• More stages makes a _______ clock

rate

Page 29: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Single-Cycle vs Pipelined

• Single has the higher cycle time• Pipelined has the higher clock rate• Pipelined has the higher single-load

latency• _________ has the higher throughput• _________ has the higher CPL (Cycles

per Load)• More stages makes a _______ clock

rate

Page 30: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Single-Cycle vs Pipelined

• Single has the higher cycle time• Pipelined has the higher clock rate• Pipelined has the higher single-load

latency• Pipelined has the higher throughput• _________ has the higher CPL (Cycles

per Load)• More stages makes a _______ clock

rate

Page 31: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Single-Cycle vs Pipelined

• Single has the higher cycle time• Pipelined has the higher clock rate• Pipelined has the higher single-load

latency• Pipelined has the higher throughput• Neither has the higher CPL (Cycles per

Load)• More stages makes a _______ clock

rate

Page 32: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Single-Cycle vs Pipelined

• Single has the higher cycle time• Pipelined has the higher clock rate• Pipelined has the higher single-load

latency• Pipelined has the higher throughput• Neither has the higher CPL (Cycles per

Load)• More stages makes a Higher clock rate

Page 33: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Obstacles to speedup in PipeliningW FD

• 1. • 2.

• Ideal cycle time w/out above limitations with n stage pipeline:

Page 34: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Obstacles to speedup in Pipelining

• 1. Uneven Stages• 2.

• Ideal cycle time w/out above limitations with n stage pipeline:

W FD

Page 35: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Obstacles to speedup in Pipelining

• 1. Uneven Stages• 2. Pipeline Register Delay

• Ideal cycle time w/out above limitations with n stage pipeline:

W FD

Page 36: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Obstacles to speedup in Pipelining

• 1. Uneven Stages• 2. Pipeline Register Delay

• Ideal cycle time w/out above limitations with n stage pipeline:w OldCycleTime / n

W FD

Page 37: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Example

• Washing = 45• Drying = 120• Folding = 15• Switching = 5

• What is the latency for one load of laundry?

• What is the latency for three loads?

Page 38: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Example

• Washing = 45• Drying = 120• Folding = 15• Switching = 5

• What is the latency for one load of laundry? 375

• What is the latency for three loads? 625

Page 39: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Creating Stages

• Fetch – get instruction• Decode – read registers• Execute – use ALU• Memory – access memory• WriteBack – write registers

Fetch Decode Execute Memory WriteBack

IF WBMEMID

Page 40: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Pipelined Machine

Read Addr Out Data

InstructionMemory

PC

4

src1 src1data

src2 src2dataRegister Filedestreg

destdata

op/funrsrtrdimm

Addr Out Data

Data Memory

In Data

32Sign Ext

16

<<2

<<2

Pipeline Register

Fetch

(Writeback)

ExecuteDecode Memory

Page 41: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

IF WBMEMID

Time->

IF

1 2 3 4 5 6 7 8

add $s0, $0, $0

lw $s1, 0($t0)

sw $s2, 0($t1)

or $s3, $s4, $t3

add $s0, $0, $0

Page 42: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

IF WBMEMID

Time->

IF ID

IF

1 2 3 4 5 6 7 8

add $s0, $0, $0

lw $s1, 0($t0)

sw $s2, 0($t1)

or $s3, $s4, $t3

add $s0, $0, $0lw $s1, 0($t0)

Page 43: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

IF WBMEMID

Time->

IF ID

IF ID

IF

1 2 3 4 5 6 7 8

add $s0, $0, $0

lw $s1, 0($t0)

sw $s2, 0($t1)

or $s3, $s4, $t3

add $s0, $0, $0lw $s1, 0($t0)sw $s2, 0($t1)

Page 44: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

IF WBMEMID

Time->

add $s0, $0, $0

lw $s1, 0($t0)

sw $s2, 0($t1)

or $s3, $s4, $t3

IF

add $s0, $0, $0

ID

IF

lw $s1, 0($t0)

ID

IF

sw $s2, 0($t1)

MEM

ID

IF

or $s3, $s4, $t3

1 2 3 4 5 6 7 8

Page 45: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

IF WBMEMID

add $s0, $0, $0lw $s1, 0($t0)sw $s2, 0($t1)or $s3, $s4, $t3

Time->

add $s0, $0, $0

lw $s1, 0($t0)

sw $s2, 0($t1)

or $s3, $s4, $t3

IF ID

IF ID

IF

MEM

ID

IF

1 2 3 4 5 6 7 8

MEM

ID

WB

Page 46: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

IF WBMEMID

lw $s1, 0($t0)sw $s2, 0($t1)or $s3, $s4, $t3

Time->

add $s0, $0, $0

lw $s1, 0($t0)

sw $s2, 0($t1)

or $s3, $s4, $t3

IF ID

IF ID

IF

MEM

ID

IF

1 2 3 4 5 6 7 8

MEM

ID

WBMEM

WB

Page 47: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

IF WBMEMID

sw $s2, 0($t1)or $s3, $s4, $t3

Time->

add $s0, $0, $0

lw $s1, 0($t0)

sw $s2, 0($t1)

or $s3, $s4, $t3

IF ID

IF ID

IF

MEM

ID

IF

1 2 3 4 5 6 7 8

MEMID

WB

MEM

WB

MEM

WB

Page 48: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

IF WBMEMID

or $s3, $s4, $t3

Time->

add $s0, $0, $0

lw $s1, 0($t0)

sw $s2, 0($t1)

or $s3, $s4, $t3

IF ID

IF ID

IF

MEM

ID

IF

1 2 3 4 5 6 7 8

ID WB

MEM

WB

MEM

WB

MEM

WB

Page 49: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

IF WBMEMID

Time->

add $s0, $0, $0

lw $s1, 0($t0)

sw $s2, 0($t1)

or $s3, $s4, $t3

IF ID

IF ID

IF

MEM

ID

IF

1 2 3 4 5 6 7 8

ID WB

MEM

WB

MEM

WB

MEM

WB

The machine in cycle 4

Page 50: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

IF WBMEMID

Time->

add $s0, $0, $0

lw $s1, 0($t0)

sw $s2, 0($t1)

or $s3, $s4, $t3

IF ID

IF ID

IF

MEM

ID

IF

1 2 3 4 5 6 7 8

ID WB

MEM

WB

MEM

WB

MEM

WB

The machine in cycle 5

Page 51: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

In what cycle was $s1 written?

In what cycle was $s4 read?

In what cycle was the Add executed?

Time->

add $s0, $0, $0

lw $s1, 0($t0)

sw $s2, 0($t1)

or $s3, $s4, $t3

IF ID

IF ID

IF

MEM

ID

IF

1 2 3 4 5 6 7 8

ID WB

MEM

WB

MEM

WB

MEM

WB

Page 52: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

In what cycle was $s1 written? 6

In what cycle was $s4 read?

In what cycle was the Add executed?

Time->

add $s0, $0, $0

lw $s1, 0($t0)

sw $s2, 0($t1)

or $s3, $s4, $t3

IF ID

IF ID

IF

MEM

ID

IF

1 2 3 4 5 6 7 8

ID WB

MEM

WB

MEM

WB

MEM

WB

Page 53: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

In what cycle was $s1 written? 6

In what cycle was $s4 read? 5

In what cycle was the Add executed?

Time->

add $s0, $0, $0

lw $s1, 0($t0)

sw $s2, 0($t1)

or $s3, $s4, $t3

IF ID

IF ID

IF

MEM

ID

IF

1 2 3 4 5 6 7 8

ID WB

MEM

WB

MEM

WB

MEM

WB

Page 54: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

In what cycle was $s1 written? 6

In what cycle was $s4 read? 5

In what cycle was the Add executed? 3

Time->

add $s0, $0, $0

lw $s1, 0($t0)

sw $s2, 0($t1)

or $s3, $s4, $t3

IF ID

IF ID

IF

MEM

ID

IF

1 2 3 4 5 6 7 8

ID WB

MEM

WB

MEM

WB

MEM

WB

Page 55: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Performance Analysis

• Measurements related to our machine• Job = single instruction• Latency: Time to finish a complete

_______________, start to finish. • Throughput: Average ______________

completed per unit time.

• Which is more important for reducing program execution time?

Page 56: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Performance Analysis

• Measurements related to our machine• Job = single instruction• Latency: Time to finish a complete

instruction start to finish. • Throughput: Average ______________

completed per unit time.

• Which is more important for reducing program execution time?

Page 57: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Performance Analysis

• Measurements related to our machine• Job = single instruction• Latency: Time to finish a complete

instruction start to finish. • Throughput: Average number of

instructions completed per unit time.

• Which is more important for reducing program execution time?

Page 58: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Pipelined Machine

Read Addr Out Data

InstructionMemory

PC

4

src1 src1data

src2 src2dataRegister Filedestreg

destdata

op/funrsrtrdimm

Addr Out Data

Data Memory

In Data

32Sign Ext

16

<<2

<<2

Pipeline Register

Fetch

(Writeback)

ExecuteDecode Memory

Page 59: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Pipeline Registers

w IF/ID§ 32b instruction§ 32b nPC

w ID/EX§ 32b register§ 32b register§ 32b immediate field§ 32b nPC

w EX/MEM§ Zero§ 32b ALU result§ 32b nPC§ 32b register value

w MEM/WB§ 32b ALU result§ 32b memory value

• Named for two stages they separate• Store all data corresponding to lines that go

through them

Page 60: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Register File

• Only takes half of a cycle to read or write to register file

• Convention:w Read 2nd half of cyclew Write 1st half of cycle

Page 61: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Machine ComparisonFetch Decode Execute Memory WriteBack2ns 1ns 2ns 2ns 1ns

0.1 ns pipeline register delaySingle-Cycle Implementation

Clock cycle time: _____ nsLatency of a single instruction: _____ nsThroughput for machine: _____ inst/ns

Pipelined ImplementationClock cycle time: _____ nsLatency of a single instruction: _____ nsThroughput for machine: _____ inst/ns

Page 62: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Machine ComparisonFetchDecode Execute Memory WriteBack2ns 1ns 2ns 2ns 1ns0.1 ns pipeline register delay

Single-Cycle ImplementationClock cycle time: 8 nsLatency of a single instruction: _____ nsThroughput for machine: _____ inst/ns

Pipelined ImplementationClock cycle time: _____ nsLatency of a single instruction: _____ nsThroughput for machine: _____ inst/ns

Page 63: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Machine ComparisonFetchDecode Execute Memory WriteBack2ns 1ns 2ns 2ns 1ns0.1 ns pipeline register delay

Single-Cycle ImplementationClock cycle time: 8 nsLatency of a single instruction: 8 nsThroughput for machine: _____ inst/ns

Pipelined ImplementationClock cycle time: _____ nsLatency of a single instruction: _____ nsThroughput for machine: _____ inst/ns

Page 64: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Machine ComparisonFetchDecode Execute Memory WriteBack2ns 1ns 2ns 2ns 1ns0.1 ns pipeline register delay

Single-Cycle ImplementationClock cycle time: 8 nsLatency of a single instruction: 8 nsThroughput for machine: 1/8 inst/ns

Pipelined ImplementationClock cycle time: _____ nsLatency of a single instruction: _____ nsThroughput for machine: _____ inst/ns

Page 65: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Machine ComparisonFetchDecode Execute Memory WriteBack2ns 1ns 2ns 2ns 1ns0.1 ns pipeline register delay

Single-Cycle ImplementationClock cycle time: 8 nsLatency of a single instruction: 8 nsThroughput for machine: 1/8 inst/ns

Pipelined ImplementationClock cycle time: 2.1 nsLatency of a single instruction: _____ nsThroughput for machine: _____ inst/ns

Page 66: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Machine ComparisonFetchDecode Execute Memory WriteBack2ns 1ns 2ns 2ns 1ns0.1 ns pipeline register delay

Single-Cycle ImplementationClock cycle time: 8 nsLatency of a single instruction: 8 nsThroughput for machine: 1/8 inst/ns

Pipelined ImplementationClock cycle time: 2.1 nsLatency of a single instruction: 2.1*5=10.5 nsThroughput for machine: _____ inst/ns

Page 67: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Machine ComparisonFetchDecode Execute Memory WriteBack2ns 1ns 2ns 2ns 1ns0.1 ns pipeline register delay

Single-Cycle ImplementationClock cycle time: 8 nsLatency of a single instruction: 8 nsThroughput for machine: 1/8 inst/ns

Pipelined ImplementationClock cycle time: 2.1 nsLatency of a single instruction: 2.1*5=10.5 nsThroughput for machine: 1 / 2.1 inst/ns

Page 68: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Example 2 – How do we speed up pipelined machine?

Fetch Decode Execute Memory Writeback6ns 4ns 8ns 10ns 4ns

0.1 ns pipelined register delay

Single cycle: 1 / nsPipelined: 1 / ns

Page 69: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Example 2 – How do we speed up pipelined machine?

Fetch Decode Execute Memory Writeback6ns 4ns 8ns 10ns 4ns

0.1 ns pipelined register delay

Single cycle: 1 / 32 inst / nsPipelined: 1 / 10.1 inst / ns

Page 70: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Example 2 – Split more stages

Fetch Decode Execute Memory Writeback6ns 4ns 8ns 10ns 4ns

0.1 ns pipelined register delay

Which stage(s) should we split?_________ and _________

Page 71: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Example 2 – Split more stages

Fetch Decode Execute Memory Writeback6ns 4ns 8ns 10ns 4ns

0.1 ns pipelined register delay

Which stage(s) should we split?Memory and _________

Page 72: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Example 2 – Split more stages

Fetch Decode Execute Memory Writeback6ns 4ns 8ns 10ns 4ns

0.1 ns pipelined register delay

Which stage(s) should we split?Memory and Execute

Page 73: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Example 2 – After Split

F D X1 X2 M1 M2 WB___ns ___ns ___ns ___ns ___ns ___ns ___ns0.1 ns pipelined register delay

Single cycle: 1 / nsPipelined: 1 / ns

Page 74: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Example 2 – After Split

F D X1 X2 M1 M2 WB6 ns 4 ns ___ns ___ns ___ns ___ns 4 ns0.1 ns pipelined register delay

Single cycle: 1 / nsPipelined: 1 / ns

Page 75: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Example 2 – After Split

F D X1 X2 M1 M2 WB6 ns 4 ns 4 ns 4 ns ___ns ___ns 4 ns0.1 ns pipelined register delay

Single cycle: 1 / nsPipelined: 1 / ns

Page 76: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Example 2 – After Split

F D X1 X2 M1 M2 WB6 ns 4 ns 4 ns 4 ns 5 ns 5 ns 4 ns0.1 ns pipelined register delay

Single cycle: 1 / nsPipelined: 1 / ns

Page 77: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Example 2 – After Split

F D X1 X2 M1 M2 WB6 ns 4 ns 4 ns 4 ns 5 ns 5 ns 4 ns0.1 ns pipelined register delay

Single cycle: 1 / 32 nsPipelined: 1 / ns

Page 78: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Example 2 – After Split

F D X1 X2 M1 M2 WB6 ns 4 ns 4 ns 4 ns 5 ns 5 ns 4 ns0.1 ns pipelined register delay

Single cycle: 1 / 32 nsPipelined: 1 / 6.1 ns

Page 79: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Easy Right? Not so fast.

In what cycle does the add write $s0?In what cycle does the or read $s0?

Time->

add $s0, $0, $0

or $s3, $s0, $t3

IF ID

IF ID

MEM

1 2 3 4 5 6 7 8

MEM

WB

WB

sw $s2, 0($t1)

and $s6, $s4, $t3

IF ID

IF ID WB

MEM

MEM

WB

Incorrect Execution

Page 80: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Time->

add $s0, $0, $0

or $s3, $s0, $t3

IF ID

IF ID

MEM

1 2 3 4 5 6 7 8

MEM

WB

WB

Easy Right? Not so fast.

In what cycle does the add write $s0? 1st half of cycle 5In what cycle does the or read $s0?

sw $s2, 0($t1)

and $s6, $s4, $t3

IF ID

IF ID WB

MEM

MEM

WB

Page 81: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Time->

add $s0, $0, $0

or $s3, $s0, $t3

IF ID

IF ID

MEM

1 2 3 4 5 6 7 8

MEM

WB

WB

Easy Right? Not so fast.

In what cycle does the add write $s0 1st half of cycle 5In what cycle does the or read $s0? 2nd half of cycle 3

sw $s2, 0($t1)

and $s6, $s4, $t3

IF ID

IF ID WB

MEM

MEM

WB

WB

Page 82: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Time->

add $s0, $0, $0

or $s3, $s0, $t3

IF ID

IF ID

MEM

1 2 3 4 5 6 7 8

MEM

WB

WB

Easy Right? Not so fast.

In what cycle does the add write $s0? 1st half of cycle 5In what cycle does the or read $s0? 2nd half of cycle 3

sw $s2, 0($t1)

and $s6, $s4, $t3

IF ID

IF ID WB

MEM

MEM

WB

Ahhhh! Values can not pass backwards

in time

WB

Page 83: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Time->

add $s0, $0, $0

or $s3, $s0, $t3

IF ID

IF ID

MEM

1 2 3 4 5 6 7 8

MEM

WB

WB

Easy Right? Not so fast.In what cycle does the add write $s0? 1st half of cycle 5In what cycle does the or read $s0? 2nd half of cycle 5

Stall - wasted cycles

sw $s2, 0($t1)

and $s6, $s4, $t3

IF ID

IF ID WB

MEM

MEM

WB

IF IF

Correct, Slow Execution

Page 84: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Time->

add $s0, $0, $0

or $s3, $s0, $t3

IF ID

IF ID

MEM

1 2 3 4 5 6 7 8

MEM

WB

WB

Easy Right? Not so fast.In what cycle does the add write $s0? 1st half of cycle 5In what cycle does the or read $s0? 2nd half of cycle 5

Stall - wasted cycles

sw $s2, 0($t1)

and $s6, $s4, $t3

IF ID

IF ID WB

MEM

MEM

WB

IF IF

Correct, Slow ExecutionOnly Register File rd/wr in half a cycle. All other stages take a full cycle – this is

because of shared hardware

Page 85: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Barriers to pipelined performance

• Uneven stages• Pipeline register delays

Page 86: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Barriers to pipelined performance

• Uneven stages• Pipeline register delays• Data Hazards

Page 87: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Barriers to pipeline performance

• Uneven stages• Pipeline register delays• Data Hazards

w An instruction depends on the result of a previous instruction still in the pipeline

Page 88: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Solutions?

• What can we try to reduce data hazards or their effect?

Page 89: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Time->

add $s0, $0, $0

or $s3, $s0, $t3

IF ID

IF ID

MEM

1 2 3 4 5 6 7 8

MEM

WB

WB

Easy Right? Not so fast.In what cycle does the add write $s0? 1st half of cycle 5In what cycle does the or read $s0? 2nd half of cycle 5

Stall - wasted cycles

sw $s2, 0($t1)

and $s6, $s4, $t3

IF ID

IF ID WB

MEM

MEM

WB

IF IF

Default (do nothing): Stall

Page 90: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Time->

or $s3, $s0, $t3

IF ID

IF

MEM

1 2 3 4 5 6 7 8

MEM WB

In what cycle is $s0 calculated in the machine? In what cycle is $s0 used in the machine?

sw $s2, 0($t1)

and $s6, $s4, $t3

IF ID

IF ID WB

MEM

MEM

WB

Solution 1: Data Forwarding

lw $s0, 0($t4) WB

ID

Page 91: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Time->

or $s3, $s0, $t3

IF ID

IF

MEM

1 2 3 4 5 6 7 8

MEM WB

In what cycle is $s0 calculated in the machine? End of cycle 4 In what cycle is $s0 used?

sw $s2, 0($t1)

and $s6, $s4, $t3

IF ID

IF ID WB

MEM

MEM

WB

Solution 1: Data Forwarding

lw $s0, 0($t4) WB

ID

Page 92: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Time->

or $s3, $s0, $t3

IF ID

IF

MEM

1 2 3 4 5 6 7 8

MEM WB

In what cycle is $s0 calculated in the machine? End of cycle 4In what cycle is $s0 used? beginning of cycle 4

sw $s2, 0($t1)

and $s6, $s4, $t3

IF ID

IF ID WB

MEM

MEM

WB

Solution 1: Data Forwarding

lw $s0, 0($t4) WB

ID

Page 93: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Time->

or $s3, $s0, $t3

IF ID

IF

MEM

1 2 3 4 5 6 7 8

MEM WB

In what cycle is $s0 calculated in the machine? end of cycle 4 In what cycle is $s0 used? beginning of cycle 5

sw $s2, 0($t1)

and $s6, $s4, $t3

IF ID

IF ID WB

MEM

MEM

WB

Solution 1: Data Forwarding

lw $s0, 0($t4)

ID

IF

WB

ID

Page 94: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Data-ForwardingWhere are those wires?

Read Addr Out Data

InstructionMemory

PC

4

src1 src1data

src2 src2dataRegister Filedestreg

destdata

op/funrsrtrdimm

Addr Out Data

Data Memory

In Data

32Sign Ext

16

<<2

<<2

Pipeline Register

Fetch

(Writeback)

ExecuteDecode Memory

Page 95: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Data-ForwardingWhere are those wires?

Read Addr Out Data

InstructionMemory

PC

4

src1 src1data

src2 src2dataRegister Filedestreg

destdata

op/funrsrtrdimm

Addr Out Data

Data Memory

In Data

32Sign Ext

16

<<2

<<2

Pipeline Register

Fetch

(Writeback)

ExecuteDecode Memory

Page 96: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Time->

add $s2, $s2, $t0

F D

F

M

1 2 3 4 5 6 7 8 9 10 11 12

Draw the timing diagram with data forwardingDraw arrows to indicate data passing through forwarding

F

lw $t0, 0($s0) W

D

Data ForwardingExample 2

sw $s2, 0($s0)

addi $t0, $t0, 1

Page 97: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Time->

or $s3, $s0, $t3

IF ID

IF

MEM

1 2 3 4 5 6 7 8

MEM WB

IF IF

Stall - wasted cycles

sw $s2, 0($t1)

and $s6, $s4, $t3

IF ID

IF ID WB

MEM

MEM

WB

Solution 2: Instruction Reordering(Before reordering)

lw $s0, 0($t4) WB

ID

Page 98: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Time->

or $s3, $s0, $t3

IF ID

IF ID

MEM

1 2 3 4 5 6 7 8

MEM

WB

WB

sw $s2, 0($t1)

and $s6, $s4, $t3 IF ID

IF ID WB

MEM

MEM

WB

Solution 2: Instruction Reordering(After Reordering)

lw $s0, 0($t4)

ID

WB

Page 99: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Who reorders instructions?

• Static schedulingw Compilerw Simpler, but does not know when caches

miss or loads/stores are to the same locations

• Dynamic schedulingw Hardwarew More complicated, but has all knowledge

Page 100: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Time->

or $s3, $s0, $t3

IF ID

IF ID

MEM

1 2 3 4 5 6 7 8

MEM

WB

WB

sw $s3, 0($t1)

and $s0, $s4, $t3

IF ID

IF ID WB

MEM

MEM

WB

Solution 2: Instruction Reordering

lw $s0, 0($t4)ID

WB

Page 101: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Time->

or $s3, $s0, $t3

IF ID

IF

MEM

1 2 3 4 5 6 7 8

MEM

WB

WBsw $s3, 0($t1)

and $s0, $s4, $t3 IF ID

IF ID WB

MEM

MEM

WB

Solution 2: Instruction Reordering

lw $s0, 0($t4)ID

WB

Is this the same execution?!?

Page 102: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Time->

or $s3, $s0, $t3

IF ID

IF

MEM

1 2 3 4 5 6 7 8

MEM

WB

WBsw $s3, 0($t1)

and $s0, $s4, $t3 IF ID

IF ID WB

MEM

MEM

WB

Solution 2: Instruction Reordering

lw $s0, 0($t4)ID

WB

Is this the same execution?!?

Page 103: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

How Are Data Hazards Detected

A.K.A. The Official Lawson Johnson Aside

Page 104: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

First: Pipelined Control

Page 105: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Recall: Pipeline Registers

w IF/ID§ 32b instruction§ 32b nPC

w ID/EX§ 32b register§ 32b register§ 32b immediate field§ 32b nPC

w EX/MEM§ Zero§ 32b ALU result§ 32b nPC§ 32b register value

w MEM/WB§ 32b ALU result§ 32b memory value

• Named for two stages they separate• Store all data corresponding to lines that go

through them

Page 106: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Pipelined Control

• PC is written on each clock cycle• No need for signals to pipeline

registers, since they are written on each clock cycle

• Observation: Each control line is associated with a component active in only one stage of the pipelinew So we can divide control lines into five

groups, according to pipeline stage

Page 107: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

Pipelined Control

Page 108: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

The Hazard Detection Pseudocode

• “Code” operates during ID stage• Looks something like this

Page 109: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

The Hazard Detection Pseudocode

• In English: if the instruction in the execute phase is a load (the only instruction that reads memory), and if the register being written by the load is either of the source registers for the following instruction, stall the pipeline

Page 110: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

How is the Pipeline Stalled?

• If ID stage is stalled, IF stage must also be stalledw Otherwise we lose the instruction that was

to be fetched nextw In concrete terms: PC and IF/ID pipeline

registers are prevented from changing§ (Basically, the same instruction is repeatedly

fetched and loaded into the IF/ID pipeline register.)

Page 111: Pipelining - University of Richmond · concept) •Moving the laundry from one unit to another takes 6 minutes. Pipelined Laundry Minutes Load 1 2 3 0 30 60 90 120 150 180 210 240

How is the Pipeline Stalled?

• And “back half” (EX/M/WB) phases must be doing “nothingw Like restarting the wash but letting dryer

continue to tumble emptyw Accomplished with NOP instruction(s)w Turns out: deasserting all control signals

into EX/M/WB states creates a NOP instruction in those stages

w Note these are percolated forward (doing nothing), which is what we want