Systems Software & Pipelining

34
Pipelining Jin-Soo Kim ([email protected]) Systems Software & Architecture Lab. Seoul National University Fall 2020 Chap. 4.5

Transcript of Systems Software & Pipelining

Page 1: Systems Software & Pipelining

Pipelining

Jin-Soo Kim([email protected])

Systems Software &Architecture Lab.

Seoul National University

Fall 2020

Chap. 4.5

Page 2: Systems Software & Pipelining

4190.308: Computer Architecture | Fall 2020 | Jin-Soo Kim ([email protected]) 2

Page 3: Systems Software & Pipelining

4190.308: Computer Architecture | Fall 2020 | Jin-Soo Kim ([email protected]) 3

Page 4: Systems Software & Pipelining

4190.308: Computer Architecture | Fall 2020 | Jin-Soo Kim ([email protected]) 4

Page 5: Systems Software & Pipelining

4190.308: Computer Architecture | Fall 2020 | Jin-Soo Kim ([email protected]) 5

▪ Sequential processing: Wash-Dry-Fold-Store

Page 6: Systems Software & Pipelining

4190.308: Computer Architecture | Fall 2020 | Jin-Soo Kim ([email protected]) 6

▪ Overlapping execution

▪ Parallelism improves performance

▪ Four loads:• Speedup = 8 / 3.5 = 2.3

▪ Non-stop:• Speedup = 2n / (0.5n + 1.5) ≃ 4

= number of stages

Page 7: Systems Software & Pipelining

4190.308: Computer Architecture | Fall 2020 | Jin-Soo Kim ([email protected]) 7

▪ Five stages, one step per stage

▪ IF: Instruction fetch from memory

▪ ID: Instruction decode & register read

▪ EX: Execute operation or calculate address

▪ MEM: Access memory operand

▪ WB: Write result back to register

Page 8: Systems Software & Pipelining

4190.308: Computer Architecture | Fall 2020 | Jin-Soo Kim ([email protected]) 8

▪ Sequential execution

▪ Pipelined execution

IF ID EX MEM WB IF ID EX MEM WB IF ID EX MEM WB

IF ID EX MEM WB

IF ID EX MEM WB

IF ID EX MEM WB

add x10, x11, x12

sub x13, x14, x15

and x5, x6, x7

Page 9: Systems Software & Pipelining

4190.308: Computer Architecture | Fall 2020 | Jin-Soo Kim ([email protected]) 9

▪ Assume time for stages is

• 100ps for register read or write

• 200ps for other stages

▪ Compare piplined datapath with single-cycle datapath

Inst.Inst.fetch

Registerread

ALU op.Memory

accessRegister

writeTotaltime

ld 200ps 100ps 200ps 200ps 100ps 800ps

sd 200ps 100ps 200ps 200ps 700ps

R-type 200ps 100ps 200ps 100ps 600ps

beq 200ps 100ps 200ps 500ps

Page 10: Systems Software & Pipelining

4190.308: Computer Architecture | Fall 2020 | Jin-Soo Kim ([email protected]) 10

Single-cycle(Tc = 800ps)

Pipelined(Tc = 200ps)

Page 11: Systems Software & Pipelining

4190.308: Computer Architecture | Fall 2020 | Jin-Soo Kim ([email protected]) 11

▪ If all stages are balanced

• i.e., all take the same time

▪ If not balanced, speedup is less

▪ Speedup due to increased throughput

▪ Latency (time for each instruction) does not decrease

𝑻𝒊𝒎𝒆 𝒃𝒆𝒕𝒘𝒆𝒆𝒏 𝒊𝒏𝒔𝒕𝒓𝒖𝒄𝒕𝒊𝒐𝒏𝒔𝒑𝒊𝒑𝒆𝒍𝒊𝒏𝒆𝒅 =𝑻𝒊𝒎𝒆 𝒃𝒆𝒕𝒘𝒆𝒆𝒏 𝒊𝒏𝒔𝒕𝒓𝒖𝒄𝒕𝒊𝒐𝒏𝒔𝒏𝒐𝒏𝒑𝒊𝒑𝒆𝒍𝒊𝒏𝒆𝒅

𝑵𝒖𝒎𝒃𝒆𝒓 𝒐𝒇 𝒔𝒕𝒂𝒈𝒆𝒔

Page 12: Systems Software & Pipelining

4190.308: Computer Architecture | Fall 2020 | Jin-Soo Kim ([email protected]) 12

▪ RISC-V ISA designed for pipelining

▪ All instructions are 32-bits

• Easier to fetch and decode in one cycle

• (cf.) x86: 1- to 17-byte instructions

▪ Few and regular instruction formats

• Source and destination register fields located in the same place

• Can decode and read registers in one step

▪ Load/store addressing

• Can calculate address in 3rd EX stage, access memory in 4th MEM stage

Page 13: Systems Software & Pipelining

Pipelined Datapath and Control

Chap. 4.6

Page 14: Systems Software & Pipelining

4190.308: Computer Architecture | Fall 2020 | Jin-Soo Kim ([email protected]) 14

WB

MEM

Right-to-left flow leads to hazards

Page 15: Systems Software & Pipelining

4190.308: Computer Architecture | Fall 2020 | Jin-Soo Kim ([email protected]) 15

▪ Need registers between stages

• To hold information produced in previous cycle

Page 16: Systems Software & Pipelining

4190.308: Computer Architecture | Fall 2020 | Jin-Soo Kim ([email protected]) 16

▪ Cycle-by-cycle flow of instructions through the pipelined datapath

▪ “Single-clock-cycle” pipeline diagram

• Shows pipeline usage in a single cycle

• Highlight resource used

• (cf.) “multi-clock-cycle” diagram: graph of operation over time

▪ We’ll look at “single-clock-cycle” diagrams for load & store

Page 17: Systems Software & Pipelining

4190.308: Computer Architecture | Fall 2020 | Jin-Soo Kim ([email protected]) 17

Page 18: Systems Software & Pipelining

4190.308: Computer Architecture | Fall 2020 | Jin-Soo Kim ([email protected]) 18

Page 19: Systems Software & Pipelining

4190.308: Computer Architecture | Fall 2020 | Jin-Soo Kim ([email protected]) 19

Page 20: Systems Software & Pipelining

4190.308: Computer Architecture | Fall 2020 | Jin-Soo Kim ([email protected]) 20

Page 21: Systems Software & Pipelining

4190.308: Computer Architecture | Fall 2020 | Jin-Soo Kim ([email protected]) 21

Wrongregisternumber

Page 22: Systems Software & Pipelining

4190.308: Computer Architecture | Fall 2020 | Jin-Soo Kim ([email protected]) 22

Page 23: Systems Software & Pipelining

4190.308: Computer Architecture | Fall 2020 | Jin-Soo Kim ([email protected]) 23

Page 24: Systems Software & Pipelining

4190.308: Computer Architecture | Fall 2020 | Jin-Soo Kim ([email protected]) 24

Page 25: Systems Software & Pipelining

4190.308: Computer Architecture | Fall 2020 | Jin-Soo Kim ([email protected]) 25

Page 26: Systems Software & Pipelining

4190.308: Computer Architecture | Fall 2020 | Jin-Soo Kim ([email protected]) 26

Page 27: Systems Software & Pipelining

4190.308: Computer Architecture | Fall 2020 | Jin-Soo Kim ([email protected]) 27

Page 28: Systems Software & Pipelining

4190.308: Computer Architecture | Fall 2020 | Jin-Soo Kim ([email protected]) 28

▪ Form showing resource usage

Page 29: Systems Software & Pipelining

4190.308: Computer Architecture | Fall 2020 | Jin-Soo Kim ([email protected]) 29

▪ Traditional form

Page 30: Systems Software & Pipelining

4190.308: Computer Architecture | Fall 2020 | Jin-Soo Kim ([email protected]) 30

▪ State of pipeline in a given cycle

Page 31: Systems Software & Pipelining

4190.308: Computer Architecture | Fall 2020 | Jin-Soo Kim ([email protected]) 31

Page 32: Systems Software & Pipelining

4190.308: Computer Architecture | Fall 2020 | Jin-Soo Kim ([email protected]) 32

Page 33: Systems Software & Pipelining

4190.308: Computer Architecture | Fall 2020 | Jin-Soo Kim ([email protected]) 33

▪ Control signals derived from instruction

• As in single-cycle implementation

Page 34: Systems Software & Pipelining

4190.308: Computer Architecture | Fall 2020 | Jin-Soo Kim ([email protected]) 34