The Processor (2)

EEE3050: Theory on Computer Architectures, Spring 2017
Jinkyu Jeong (jinkyu@skku.edu)
Computer Systems Laboratory, Sungkyunkwan University
http://csl.skku.edu



Announcement
• Midterm exam
  – Plan A
    • When: ?? (18:00~) on 4/24 (Mon) or 4/25 (Tue)
    • Where: #400102 (Semiconductor Bldg.)
  – Plan B
    • When: 13:30 ~ 14:45 on 4/24 (Mon)
    • Where: #26310
  – Stay tuned on our class web page
  – http://csl.skku.edu/EEE3050S17/News
• No class on 4/19 (Wed)
• Make-up class on 4/26 (Wed)


Outline
Textbook: P&H 4.5-4.6

• An Overview of Pipelining

• Pipelined Datapath and Control


Pipelining Analogy
• Pipelined laundry: overlapping execution
  – Parallelism improves performance
• Four loads:
  – $\text{Speedup} = 8 / 3.5 \approx 2.3$
• Non-stop:
  – $\text{Speedup} = 2n / (0.5n + 1.5) \approx 4$ = number of stages
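The two speedup figures can be checked with a short sketch. It assumes what the formulas imply: four laundry stages of half an hour each, so one load takes 2 hours on its own. The stage names and function names are illustrative only.

```python
STAGES = 4          # assumed: wash, dry, fold, store
STAGE_HOURS = 0.5   # assumed: every stage takes half an hour

def sequential_hours(loads):
    """Each load runs start to finish before the next begins: 2n hours."""
    return loads * STAGES * STAGE_HOURS

def pipelined_hours(loads):
    """The first load needs all stages; each later load finishes one stage
    after the previous one: 0.5n + 1.5 hours."""
    return (STAGES + loads - 1) * STAGE_HOURS

def speedup(loads):
    return sequential_hours(loads) / pipelined_hours(loads)

print(round(speedup(4), 1))          # four loads: 8 / 3.5
print(round(speedup(1_000_000), 1))  # many loads: approaches the stage count
```

As the number of loads grows, the constant 1.5 hours of fill time stops mattering and the ratio tends toward 2n / 0.5n = 4.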


MIPS Pipeline
• Five stages, one step per stage
  1. IF: Instruction fetch from memory
  2. ID: Instruction decode & register read
  3. EX: Execute operation or calculate address
  4. MEM: Access memory operand
  5. WB: Write result back to register

[Figure: successive instructions overlap, each one stage behind the previous]
  IF ID EX MEM WB
     IF ID EX MEM WB
        IF ID EX MEM WB ...


Pipeline Performance
• Assume time for stages is
  – 100ps for register read or write
  – 200ps for other stages
• Compare pipelined datapath with single-cycle datapath

Instr      Instr fetch   Register read   ALU op   Memory access   Register write   Total time
lw         200ps         100ps           200ps    200ps           100ps            800ps
sw         200ps         100ps           200ps    200ps                            700ps
R-format   200ps         100ps           200ps                    100ps            600ps
beq        200ps         100ps           200ps                                     500ps


Pipeline Performance
[Figure: timing of the same instruction sequence on both datapaths]
• Single-cycle (Tc = 800ps)
• Pipelined (Tc = 200ps)
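The two clock periods follow directly from the stage times in the table: the single-cycle clock must fit the slowest whole instruction, while the pipelined clock only has to fit the slowest stage. A minimal sketch of that reasoning is below. The dictionary layout and function names are illustrative, and a stage an instruction does not use is recorded as 0.

```python
# Stage latencies in picoseconds, in the order IF, register read, ALU, MEM, register write.
STAGE_PS = {
    "lw":       [200, 100, 200, 200, 100],
    "sw":       [200, 100, 200, 200, 0],
    "R-format": [200, 100, 200, 0, 100],
    "beq":      [200, 100, 200, 0, 0],
}
N_STAGES = 5

def single_cycle_period():
    """The clock must be long enough for the slowest complete instruction."""
    return max(sum(stages) for stages in STAGE_PS.values())

def pipelined_period():
    """The clock must be long enough for the slowest single stage."""
    return max(max(stages) for stages in STAGE_PS.values())

def total_time(n_instructions, pipelined):
    """Time to finish n instructions, ignoring hazards."""
    if pipelined:
        return (N_STAGES + n_instructions - 1) * pipelined_period()
    return n_instructions * single_cycle_period()

print(single_cycle_period(), pipelined_period())  # 800 200
print(total_time(3, False), total_time(3, True))  # 2400 1400
```

For three instructions the pipeline is only about 1.7 times faster, because filling the pipeline takes four extra cycles; the advantage grows with longer instruction sequences.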


Pipeline Speedup
• If all stages are balanced
  – i.e., all take the same time
  – $\text{Time between instructions}_{\text{pipelined}} = \dfrac{\text{Time between instructions}_{\text{nonpipelined}}}{\text{Number of stages}}$
• If not balanced, speedup is less
• Speedup due to increased throughput
  – Latency (time for each instruction) does not decrease
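As a worked check, here are the numbers from this lecture's performance example (800ps single-cycle period, five stages, slowest stage 200ps). The latency line assumes every instruction passes through all five stages at the pipelined clock rate.

```latex
\begin{align*}
\text{Balanced (ideal):}\quad & \frac{800\,\text{ps}}{5} = 160\,\text{ps between instructions} \\
\text{Unbalanced (actual):}\quad & 200\,\text{ps between instructions} \\
\text{Throughput speedup:}\quad & \frac{800\,\text{ps}}{200\,\text{ps}} = 4 < 5 \\
\text{Latency of one instruction:}\quad & 5 \times 200\,\text{ps} = 1000\,\text{ps} \geq 800\,\text{ps}
\end{align*}
```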


Pipelining and ISA Design
• MIPS ISA designed for pipelining
  – All instructions are 32 bits
    • Easier to fetch and decode in one cycle
    • cf. x86: 1- to 17-byte instructions
  – Few and regular instruction formats
    • Can decode and read registers in one step
  – Load/store addressing
    • Can calculate address in 3rd stage, access memory in 4th stage
  – Alignment of memory operands
    • Memory access takes only one cycle
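To illustrate why regular formats let decode and register read share one step, the sketch below pulls the fields out of a 32-bit word by fixed bit positions. It assumes the standard MIPS32 field layout (opcode in bits 31:26, rs in 25:21, rt in 20:16); the function name and the example instruction are illustrative.

```python
def decode_fields(word):
    """Split a 32-bit MIPS instruction word into its fixed-position fields."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs":     (word >> 21) & 0x1F,   # same position in every format
        "rt":     (word >> 16) & 0x1F,   # same position in every format
        "rd":     (word >> 11) & 0x1F,   # meaningful for R-format only
        "imm":    word & 0xFFFF,         # meaningful for I-format only
    }

# lw $t0, 32($s3): opcode 35, rs = 19 ($s3), rt = 8 ($t0), imm = 32
fields = decode_fields(0x8E680020)
print(fields["opcode"], fields["rs"], fields["rt"], fields["imm"])  # 35 19 8 32
```

Because rs and rt sit at the same bit positions regardless of the opcode, the register file can be read while the opcode is still being decoded.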


Pipeline Summary
• Pipelining improves performance by increasing instruction throughput
  – Executes multiple instructions in parallel
  – Each instruction has the same latency
• Instruction set design affects complexity of pipeline implementation
• Subject to hazards
  – Structure, data, control


Pipelined Datapath and Control


MIPS Pipelined Datapath
[Figure: MIPS pipelined datapath, with the right-to-left paths labeled WB and MEM]
• Right-to-left flow leads to hazards


Pipeline registers
• Need registers between stages
  – To hold information produced in previous cycle


IF for Load


ID for Load


EX for Load


MEM for Load


WB for Load

Wrong register number


Corrected Datapath for Load
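The correction can be illustrated with a toy model of the four pipeline registers. By the time a load reaches WB, IF/ID already holds a later instruction, so the destination register number has to travel with the load through ID/EX, EX/MEM, and MEM/WB. The instruction stream and register numbers below are hypothetical, and the model tracks only the instruction name and its destination register.

```python
# Hypothetical instruction stream: (name, destination register number)
program = [("lw", 8), ("sub", 11), ("add", 12), ("lw", 13), ("add", 14)]

PIPE_REGS = ["IF/ID", "ID/EX", "EX/MEM", "MEM/WB"]

def run(program, clock_edges):
    """Advance the pipeline registers by the given number of clock edges."""
    regs = {name: None for name in PIPE_REGS}
    stream = iter(program)
    for _ in range(clock_edges):
        # Update from the back so each register takes the value of the one before it.
        regs["MEM/WB"] = regs["EX/MEM"]
        regs["EX/MEM"] = regs["ID/EX"]
        regs["ID/EX"] = regs["IF/ID"]
        regs["IF/ID"] = next(stream, None)
    return regs

regs = run(program, 4)           # the first lw is now in MEM/WB, i.e. in its WB stage
in_wb = regs["MEM/WB"][0]
wrong_dest = regs["IF/ID"][1]    # register number taken from IF/ID (a later instruction)
right_dest = regs["MEM/WB"][1]   # register number carried along with the load
print(in_wb, wrong_dest, right_dest)  # lw 13 8
```

Taking the write register number from IF/ID would store the loaded value in register 13, the destination of the fourth instruction, instead of register 8.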


IF for Store


ID for Store


EX for Store


MEM for Store


WB for Store


Pipeline Operation
• Cycle-by-cycle flow of instructions through the pipelined datapath
  – “Single-clock-cycle” pipeline diagram
    • Shows pipeline usage in a single cycle
    • Highlight resources used
  – cf. “multi-clock-cycle” diagram
    • Graph of operation over time
• We will now look at multi-clock-cycle diagram
  – Also called “waterfall” diagram


Multi-Cycle Pipeline Diagram
• Form showing resource usage


Multi-Cycle Pipeline Diagram
• Traditional form
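A traditional multi-cycle diagram can be generated mechanically: instruction i (counting from 0) occupies stage s in clock cycle i + s + 1. The sketch below prints one row per instruction and one column per cycle. It ignores hazards, and the instruction sequence is hypothetical.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def waterfall(instructions):
    """Return text rows: one row per instruction, one column per clock cycle."""
    n_cycles = len(instructions) + len(STAGES) - 1
    width = max(len(text) for text in instructions)
    rows = []
    for i, text in enumerate(instructions):
        cells = [""] * n_cycles
        for s, stage in enumerate(STAGES):
            cells[i + s] = stage   # instruction i is in stage s during cycle i + s + 1
        timeline = " ".join(cell.ljust(3) for cell in cells).rstrip()
        rows.append(text.ljust(width) + " | " + timeline)
    return rows

# Hypothetical instruction sequence, chosen only for illustration.
for row in waterfall(["lw  $10, 20($1)", "sub $11, $2, $3", "add $12, $3, $4"]):
    print(row)
```

Each row is shifted one column to the right of the row above it, which gives the diagram its staircase shape.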


Single-Cycle Pipeline Diagram
• State of pipeline in a given cycle
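A single-cycle diagram is a vertical slice through the multi-cycle one. As a rough sketch, the function below reports which instruction occupies each stage in a chosen cycle, assuming no stalls; the instruction names are hypothetical.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def snapshot(instructions, cycle):
    """Map each stage to the instruction occupying it in the given cycle (1-based)."""
    state = {}
    for s, stage in enumerate(STAGES):
        i = cycle - 1 - s   # instruction i is in stage s during cycle i + s + 1
        state[stage] = instructions[i] if 0 <= i < len(instructions) else None
    return state

program = ["lw", "sub", "add", "lw", "add"]   # hypothetical sequence
print(snapshot(program, 5))
# {'IF': 'add', 'ID': 'lw', 'EX': 'add', 'MEM': 'sub', 'WB': 'lw'}
```

In cycle 5 the pipeline is full: the first instruction is writing back while the fifth is being fetched.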


Pipelined Control (Simplified)


Pipelined Control
• Control signals derived from instruction
  – As in single-cycle implementation
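One way to picture this is to group the control signals by the stage that consumes them and let each pipeline register carry only the groups still needed. The sketch below assumes the signal names and values of the P&H textbook's control table (the slide itself shows only the figure); "X" marks a don't-care, and the function name is illustrative.

```python
# Control values per instruction class, grouped by the stage that uses them.
# Assumed from the P&H control table; "X" means don't-care.
CONTROL = {
    "R-format": {"EX":  {"RegDst": 1, "ALUOp": "10", "ALUSrc": 0},
                 "MEM": {"Branch": 0, "MemRead": 0, "MemWrite": 0},
                 "WB":  {"RegWrite": 1, "MemtoReg": 0}},
    "lw":       {"EX":  {"RegDst": 0, "ALUOp": "00", "ALUSrc": 1},
                 "MEM": {"Branch": 0, "MemRead": 1, "MemWrite": 0},
                 "WB":  {"RegWrite": 1, "MemtoReg": 1}},
    "sw":       {"EX":  {"RegDst": "X", "ALUOp": "00", "ALUSrc": 1},
                 "MEM": {"Branch": 0, "MemRead": 0, "MemWrite": 1},
                 "WB":  {"RegWrite": 0, "MemtoReg": "X"}},
    "beq":      {"EX":  {"RegDst": "X", "ALUOp": "01", "ALUSrc": 0},
                 "MEM": {"Branch": 1, "MemRead": 0, "MemWrite": 0},
                 "WB":  {"RegWrite": 0, "MemtoReg": "X"}},
}

# Signal groups still carried by each pipeline register.
CARRIED = {"ID/EX": ["EX", "MEM", "WB"], "EX/MEM": ["MEM", "WB"], "MEM/WB": ["WB"]}

def control_in_pipeline_register(instr, pipe_reg):
    """Return the control groups an instruction still carries in a pipeline register."""
    return {stage: CONTROL[instr][stage] for stage in CARRIED[pipe_reg]}

print(control_in_pipeline_register("lw", "MEM/WB"))
# {'WB': {'RegWrite': 1, 'MemtoReg': 1}}
```

The signals are generated once, during ID, exactly as in the single-cycle design; the pipeline registers simply delay each group until its stage.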


Pipelined Control