Coa Lecture 13 Notes on Risc Pipe Lining

Lecture 13 Notes on RISC-Pipelining. Computer Organization and Architecture


Page 1: Coa Lecture 13 Notes on Risc Pipe Lining

Lecture 13 Notes on RISC-Pipelining.

Computer Organization and Architecture

Page 2: Coa Lecture 13 Notes on Risc Pipe Lining

RISC (Recap)
Reduced Instruction Set Computer
Key features
• Large number of general-purpose registers, or use of compiler technology to optimize register use
• Limited and simple instruction set
• Emphasis on optimising the instruction pipeline

2 Lecture-13-Notes-ON-RISC-PIPELINING

Page 3: Coa Lecture 13 Notes on Risc Pipe Lining

RISC Characteristics (Recap)
• One instruction per cycle
• Register-to-register operations
• Few, simple addressing modes
• Few, simple instruction formats
• Hardwired design (no microcode)
• Fixed instruction format
• More compile time/effort


Page 4: Coa Lecture 13 Notes on Risc Pipe Lining

RISC Characteristics
One instruction per cycle
• One machine instruction per machine cycle
• A machine cycle: the time it takes to fetch two operands from registers, perform an ALU operation, and store the result in a register
Register-to-register operations
• Most operations should be register-to-register, with only simple LOAD and STORE operations accessing memory
• This design feature simplifies the instruction set and the control unit


Page 5: Coa Lecture 13 Notes on Risc Pipe Lining

RISC Characteristics
Few, simple addressing modes
• Almost all instructions use register addressing
• Several additional modes
  • Displacement
  • PC-relative
Few, simple instruction formats
• Only one or a few formats are used
• Instruction length is fixed and aligned on word boundaries


Page 6: Coa Lecture 13 Notes on Risc Pipe Lining

RISC Pipelining (Recap)
• Most instructions are register-to-register
• Two phases of execution
  • I: Instruction fetch
  • E: Execute (ALU operation with register input and output)
• For load and store, three phases
  • I: Instruction fetch
  • E: Execute (calculate memory address)
  • D: Memory (register-to-memory or memory-to-register operation)
• If an instruction needs an operand that is altered by the preceding instruction, a delay is required
  • This delay can be accomplished by a NOOP
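The NOOP rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Instr` record, the register names, and the choice of exactly one NOOP per dependency are hypothetical and not taken from the lecture.

```python
from typing import NamedTuple, Optional


class Instr(NamedTuple):
    op: str                 # mnemonic (hypothetical encoding)
    dest: Optional[str]     # register written, or None
    srcs: tuple             # registers read


NOOP = Instr("NOOP", None, ())


def insert_noops(program):
    """Insert one NOOP wherever an instruction reads the register
    written by the instruction immediately before it."""
    out = []
    for instr in program:
        prev = out[-1] if out else None
        if prev is not None and prev.dest is not None and prev.dest in instr.srcs:
            out.append(NOOP)    # assumed: a one-instruction delay is enough
        out.append(instr)
    return out


program = [
    Instr("LOAD", "r1", ("rA",)),
    Instr("ADD", "r3", ("r1", "r2")),   # needs r1 from the LOAD
    Instr("SUB", "r4", ("r5", "r6")),   # independent
]
print([i.op for i in insert_noops(program)])
```

Here only the ADD depends on its predecessor, so a single NOOP is placed between LOAD and ADD.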


Page 7: Coa Lecture 13 Notes on Risc Pipe Lining

Sequential Operation Vs Two Way Pipelines


• Sequential operation is obviously inefficient.
• Two-way pipelined
  • I and E stages of two different instructions can be performed simultaneously
  • Yields up to twice the execution rate of sequential
• Problems
  • Causes wait state with accesses to memory
  • Branch disrupts flow (a NOOP instruction can be inserted by the assembler or compiler)

Page 8: Coa Lecture 13 Notes on Risc Pipe Lining

Three way Pipelined Vs Four Way Pipelined


• Three-way pipelined: permitting two memory accesses at one time allows for fully pipelined operation (dual-port RAM).
• Four-way pipelined: since E is usually longer, break E into two parts
  • E1: Register file read
  • E2: ALU operation and register write
• Because of the RISC design, this is not as difficult to do.
• Up to four instructions can be under way at one time (potential speedup of 4)
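The "up to twice" and "potential speedup of 4" figures can be checked with the usual ideal pipeline arithmetic. The sketch below assumes equal-length stages and no stalls from memory accesses or branches, which the notes above say do occur in practice; the function names are illustrative only.

```python
def sequential_cycles(n, k):
    """n instructions, each taking k stage-times, run one after another."""
    return n * k


def pipelined_cycles(n, k):
    """Ideal k-stage pipeline: k cycles to fill, then one result per cycle."""
    return k + (n - 1)


def speedup(n, k):
    return sequential_cycles(n, k) / pipelined_cycles(n, k)


for k in (2, 3, 4):
    print(f"{k}-way pipeline, 1000 instructions: speedup = {speedup(1000, k):.3f}")
```

For long instruction streams the ratio approaches k, which is why the four-way pipeline has a potential speedup of 4 but never quite reaches it.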

Page 9: Coa Lecture 13 Notes on Risc Pipe Lining

Optimization of Pipelining
• Data and branch dependencies reduce the overall execution rate
• Delayed branch
  • Does not take effect until after execution of the following instruction
  • This following instruction is the delay slot


Page 10: Coa Lecture 13 Notes on Risc Pipe Lining

Delayed Branches
• Traditional pipelining disposes of the instruction loaded in the pipe after a branch.
• Delayed branching executes the instruction loaded in the pipe after a branch.
• A NOOP can be used if no instruction can be found to execute after the JUMP.
• This means no special circuitry is needed to clear the pipe.
• It is left up to the compiler to rearrange instructions or add NOOPs.
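The difference between the two behaviours can be seen in a toy interpreter. This is a minimal sketch, not a model of any real processor: the program encoding (a label plus an optional jump target), the `run` function, and the `max_steps` guard are all assumptions made for illustration, and the delay slot is assumed never to hold another branch.

```python
def run(program, delayed, max_steps=50):
    """Return the labels of the instructions that actually execute.

    program: list of (label, target); target is None for a non-branch,
    or the index to jump to for a JUMP.
    """
    trace = []
    pc = 0
    steps = 0
    while pc < len(program) and steps < max_steps:
        label, target = program[pc]
        trace.append(label)
        steps += 1
        if target is None:
            pc += 1
        else:
            # Delayed branch: the instruction already in the pipe runs
            # before the jump takes effect. Traditional: it is discarded.
            if delayed and pc + 1 < len(program):
                trace.append(program[pc + 1][0])
            pc = target
    return trace


program = [
    ("LOAD", None),
    ("JUMP", 4),
    ("ADD", None),     # sits in the delay slot
    ("SUB", None),     # skipped in both cases
    ("STORE", None),
]
print(run(program, delayed=False))
print(run(program, delayed=True))
```

With `delayed=True` the ADD after the JUMP executes; if no useful instruction were available, the compiler would place a NOOP in that position instead.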


Page 11: Coa Lecture 13 Notes on Risc Pipe Lining

Delayed Branches
• The interchange of instructions will work successfully for unconditional branches, calls, and returns
• It cannot be blindly applied for conditional branches
  • If the condition that is tested for the branch can be altered by the immediately preceding instruction, the compiler must refrain from doing the interchange and instead insert a NOOP.
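The interchange rule above could be sketched as a small compiler pass. Everything named here is hypothetical: the `Instr` record, the `JUMP` and `BRZ` mnemonics, and the simplification that no instruction is a branch target. The pass moves the instruction before a branch into the delay slot unless that instruction writes a register the branch tests, in which case it inserts a NOOP.

```python
from typing import NamedTuple, Optional


class Instr(NamedTuple):
    op: str
    dest: Optional[str] = None   # register written, if any
    srcs: tuple = ()             # registers read (for BRZ: the tested register)


BRANCH_OPS = {"JUMP", "BRZ"}     # hypothetical: unconditional and conditional
NOOP = Instr("NOOP")


def fill_delay_slots(program):
    out = []
    last_slot = -1               # index in `out` of the most recent delay slot
    for instr in program:
        if instr.op not in BRANCH_OPS:
            out.append(instr)
            continue
        prev_index = len(out) - 1
        can_swap = (
            prev_index >= 0
            and prev_index != last_slot                   # not already a delay slot
            and out[prev_index].dest not in instr.srcs    # does not alter the condition
        )
        if can_swap:
            moved = out.pop()
            out.extend([instr, moved])    # interchange: branch first, then old predecessor
        else:
            out.extend([instr, NOOP])
        last_slot = len(out) - 1
    return out


program = [
    Instr("ADD", "r1", ("r2", "r3")),
    Instr("JUMP"),                        # unconditional: interchange is safe
    Instr("SUB", "r4", ("r4", "r5")),
    Instr("BRZ", None, ("r4",)),          # tests r4, which SUB has just written
    Instr("LOAD", "r6", ("r7",)),
]
print([i.op for i in fill_delay_slots(program)])
```

In this example the ADD moves behind the JUMP, while the SUB stays ahead of the BRZ and a NOOP fills that delay slot.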

• Delayed load can be used on LOAD instructions
  • On the LOAD instruction, the register that is to be the target of the load is locked by the processor
  • The processor continues execution of the instruction stream until it reaches an instruction requiring that register
  • At that point, it idles until the load is complete.
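The register-locking behaviour can be illustrated with a cycle counter. This is a sketch under assumptions not stated in the notes: one instruction issues per cycle, a load keeps its target register locked for `load_latency` extra cycles (2 is an arbitrary choice), and the function name and tuple encoding are made up for the example.

```python
def run_with_delayed_load(program, load_latency=2):
    """program: list of (op, dest, srcs). Returns (total_cycles, stall_cycles)."""
    ready_at = {}     # register -> first cycle at which a loaded value is usable
    cycle = 0
    stalls = 0
    for op, dest, srcs in program:
        needed = max((ready_at.get(r, 0) for r in srcs), default=0)
        if needed > cycle:
            stalls += needed - cycle      # idle until the locked register is free
            cycle = needed
        if op == "LOAD":
            ready_at[dest] = cycle + 1 + load_latency   # lock the target register
        cycle += 1
    return cycle, stalls


load = ("LOAD", "r1", ("rA",))
add = ("ADD", "r3", ("r1", "r2"))     # requires the loaded register
sub = ("SUB", "r4", ("r5", "r6"))     # independent of the load
mul = ("MUL", "r7", ("r8", "r9"))     # independent of the load

print(run_with_delayed_load([load, add, sub, mul]))
print(run_with_delayed_load([load, sub, mul, add]))
```

Placing the two independent instructions between the LOAD and the ADD hides the load delay, so the second ordering finishes without idling.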

The scheduling of instructions for the pipeline and the dynamic allocation of registers should be considered together to achieve the greatest efficiency


Page 12: Coa Lecture 13 Notes on Risc Pipe Lining

Delayed Branches


Page 13: Coa Lecture 13 Notes on Risc Pipe Lining

Instruction Pipeline
Two classes of processors have evolved to offer execution of multiple instructions per clock cycle:
• Superscalar architecture
  • Replicates each of the pipeline stages so that two or more instructions at the same stage of the pipeline can be processed simultaneously.
• Superpipelined architecture
  • Makes use of more fine-grained pipeline stages
  • With more stages, more instructions can be in the pipeline at the same time, increasing parallelism


Page 14: Coa Lecture 13 Notes on Risc Pipe Lining

Instruction Pipeline
Both approaches have limitations
• With superscalar architecture
  • Dependencies between instructions in different pipelines can slow down the system.
  • Overhead logic is required to coordinate these dependencies
• With superpipelining
  • Overhead associated with transferring instructions from one stage to the next
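As a rough comparison of the two approaches, the sketch below computes ideal execution times in units of the base machine's clock cycle. The formulas are a textbook-style idealisation and an assumption of this example, not something given in the lecture: they ignore exactly the dependencies and stage-transfer overhead listed above, and the function names and `degree` parameter are illustrative.

```python
import math


def base_time(n, k):
    """Ordinary k-stage pipeline, one instruction issued per cycle."""
    return k + n - 1


def superscalar_time(n, k, degree):
    """`degree` replicated pipelines: up to `degree` instructions issue per cycle."""
    return k + math.ceil(n / degree) - 1


def superpipelined_time(n, k, degree):
    """Each stage split into `degree` sub-stages; one issue every 1/degree cycle."""
    return k + (n - 1) / degree


n, k = 6, 4
print("base:          ", base_time(n, k))
print("superscalar:   ", superscalar_time(n, k, 2))
print("superpipelined:", superpipelined_time(n, k, 2))
```

Under these assumptions both degree-2 designs retire instructions at the same steady-state rate; the limitations above decide how far a real design falls short of these figures.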
