Chapter One Introduction to Pipelined Processors

Post on 30-Dec-2015


Non-linear pipeline

• In a floating-point adder, stages (2) and (4) each need a shift register.
• We can use the same shift register for both, and then there will be only 3 stages.
• We then need a feedback connection from the third stage to the second stage.
• Further, the same pipeline can be used to perform fixed-point addition.
• A pipeline with feed-forward and/or feedback connections is called non-linear.

Example: 3-stage non-linear pipeline

• It has 3 stages, Sa, Sb and Sc, and latches.
• Multiplexers (crossed circles) can take more than one input and pass one of the inputs to the output.
• The outputs of the stages are tapped and used for feedback and feed-forward.

[Figure: 3-stage non-linear pipeline with stages Sa, Sb and Sc, one input, and two outputs (Output A and Output B)]

• The above pipeline can perform a variety of functions.

• Each functional evaluation can be represented by a particular sequence of usage of stages.

• Some examples are:
  1. Sa, Sb, Sc
  2. Sa, Sb, Sc, Sb, Sc, Sa
  3. Sa, Sc, Sb, Sa, Sb, Sc

Reservation Table

• Each functional evaluation can be represented using a diagram called a Reservation Table (RT).
• It is the space-time diagram of a pipeline corresponding to one functional evaluation.
• X axis – time units
• Y axis – stages

• For the sequence Sa, Sb, Sc, Sb, Sc, Sa, called function A, we have

    0  1  2  3  4  5
Sa  A              A
Sb     A     A
Sc        A     A

• For the sequence Sa, Sc, Sb, Sa, Sb, Sc, called function B, we have

    0  1  2  3  4  5
Sa  B        B
Sb        B     B
Sc     B           B

Function A

3-stage pipeline: Sa, Sb, Sc, Sb, Sc, Sa

• The reservation table (X axis: time, Y axis: stage) is filled in one time unit at a time: Sa at time 0, Sb at time 1, Sc at time 2, Sb at time 3, Sc at time 4, and Sa at time 5.

Function B

3-stage pipeline: Sa, Sc, Sb, Sa, Sb, Sc

• The reservation table (X axis: time, Y axis: stage) is filled in one time unit at a time: Sa at time 0, Sc at time 1, Sb at time 2, Sa at time 3, Sb at time 4, and Sc at time 5.
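The step-by-step filling of a reservation table can be sketched in Python. This is a minimal illustration, not part of the original slides: the names `reservation_table` and `render` are hypothetical, and the sketch assumes that exactly one stage is used in each time unit, as in functions A and B.

```python
def reservation_table(sequence):
    """Map each stage to the list of time units in which it is used.

    `sequence` names the stage used at time 0, 1, 2, ... (assumption:
    one stage per time unit, as in functions A and B).
    """
    table = {}
    for time, stage in enumerate(sequence):
        table.setdefault(stage, []).append(time)
    return table


def render(table, label, compute_time):
    """Return the table as text: one header line, then one row per stage."""
    lines = ["    " + "  ".join(str(t) for t in range(compute_time))]
    for stage in sorted(table):
        cells = [label if t in table[stage] else " " for t in range(compute_time)]
        lines.append((f"{stage:<4}" + "  ".join(cells)).rstrip())
    return "\n".join(lines)


FUNCTION_A = ["Sa", "Sb", "Sc", "Sb", "Sc", "Sa"]
FUNCTION_B = ["Sa", "Sc", "Sb", "Sa", "Sb", "Sc"]

table_a = reservation_table(FUNCTION_A)
table_b = reservation_table(FUNCTION_B)

print(render(table_a, "A", len(FUNCTION_A)))
print(render(table_b, "B", len(FUNCTION_B)))
```

Each pass through the loop corresponds to one build step of the slides: one more mark is placed in the row of the stage used at that time.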

Reservation Table

• After starting a function, the stages need to be reserved in the corresponding time units.
• Each function supported by a multifunction pipeline is represented by a different RT.
• The time taken for a function evaluation, in units of the clock period, is the compute time. (For A and B, it is 6.)
• Markings in the same row => the stage is used more than once.
• Markings in the same column => more than one stage is used at the same time.
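These properties can be read off a table mechanically. The sketch below is illustrative only: it stores a reservation table as a dictionary from stage name to the set of marked time units (a representation assumed here, not taken from the slides), and `TABLE_PARALLEL` is a made-up table that exists only to show a column with two marks.

```python
from collections import Counter

# Function A from the slides: Sa, Sb, Sc, Sb, Sc, Sa
TABLE_A = {"Sa": {0, 5}, "Sb": {1, 3}, "Sc": {2, 4}}

# Hypothetical table (not from the slides): Sa and Sb are both busy at time 0.
TABLE_PARALLEL = {"Sa": {0}, "Sb": {0, 1}}


def compute_time(table):
    """Number of clock periods from the first to the last marked column."""
    return 1 + max(t for times in table.values() for t in times)


def stages_used_more_than_once(table):
    """Stages whose row holds more than one mark."""
    return sorted(stage for stage, times in table.items() if len(times) > 1)


def times_with_several_stages(table):
    """Time units whose column holds more than one mark."""
    marks = Counter(t for times in table.values() for t in times)
    return sorted(t for t, count in marks.items() if count > 1)


print(compute_time(TABLE_A))
print(stages_used_more_than_once(TABLE_A))
print(times_with_several_stages(TABLE_PARALLEL))
```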

Multifunction pipelines

• The hardware of a multifunction pipeline should be reconfigurable.
• A multifunction pipeline can be static or dynamic.
• Static:
  – It is initially configured for one functional evaluation.
  – For another function, the pipeline needs to be drained and reconfigured.
  – It cannot hold two inputs of different functions at the same time.
• Dynamic:
  – It can perform different functional evaluations at the same time.
  – It is difficult to control, as we need to be sure that there is no conflict in the usage of stages.
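One way to picture the conflict check is to overlay two reservation tables with a start delay between them. The sketch below assumes the same dictionary representation of a table (stage name to set of time units) and a hypothetical helper named `collides`; it only illustrates the idea and is not a description of any real pipeline controller.

```python
# Reservation tables of functions A and B from the slides.
TABLE_A = {"Sa": {0, 5}, "Sb": {1, 3}, "Sc": {2, 4}}
TABLE_B = {"Sa": {0, 3}, "Sb": {2, 4}, "Sc": {1, 5}}


def collides(first, second, delay):
    """Return True if `second`, started `delay` time units after `first`,
    needs some stage in a time unit that `first` has already reserved."""
    for stage, times in second.items():
        busy = first.get(stage, set())
        if any(t + delay in busy for t in times):
            return True
    return False


# Delays (in clock periods) at which a second evaluation can start safely.
safe_a_after_a = [d for d in range(1, 6) if not collides(TABLE_A, TABLE_A, d)]
safe_b_after_a = [d for d in range(1, 6) if not collides(TABLE_A, TABLE_B, d)]
print(safe_a_after_a)
print(safe_b_after_a)
```

For example, starting a second A two clock periods after the first makes both evaluations need Sb in the same time unit, so that delay is rejected.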

Principle of Designing Pipeline Processors

(Design Problems of Pipeline Processors)

Instruction Prefetch and Branch Handling

• The instructions in computer programs can be classified into 4 types:
  – Arithmetic/Load Operations (60%)
  – Store Type Instructions (15%)
  – Branch Type Instructions (5%)
  – Conditional Branch Type (Yes – 12% and No – 8%)
• Arithmetic/Load Operations (60%):
  – These operations require one or two operand fetches.
  – The execution of different operations requires a different number of pipeline cycles.
• Store Type Instructions (15%):
  – These require a memory access to store the data.
• Branch Type Instructions (5%):
  – These correspond to an unconditional jump.
• Conditional Branch Type (Yes – 12% and No – 8%):
  – The Yes path requires the calculation of the new address.
  – The No path proceeds to the next sequential instruction.
• Arithmetic/load and store instructions do not alter the execution order of the program.
• Branch instructions and interrupts have damaging effects on the performance of pipeline computers.

Handling Example – Interrupt System of Cray-1

• The interrupt system is built around an exchange package.
• When an interrupt occurs, the Cray-1 saves 8 scalar registers, 8 address registers, the program counter and monitor flags.
• These are packed into 16 words and swapped with a block whose address is specified by a hardware exchange address register.

Instruction Prefetch and Branch Handling

• In general, the higher the percentage of branch type instructions in a program, the slower a program will run on a pipeline processor.

Effect of Branching on Pipeline Performance

• Consider a linear pipeline of 5 stages

[Figure: linear 5-stage pipeline: Fetch Instruction → Decode → Fetch Operands → Execute → Store Results]

[Figure: overlapped execution of instructions I1 to I8 without branching]

[Figure: overlapped execution of instructions I1 to I8 when I5 is a branch instruction]
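The effect shown in the two diagrams can be approximated with a small timing sketch. It rests on assumptions that are not stated in the slides: one instruction enters the pipeline per clock period, and after a taken branch the next instruction is fetched only once the branch has left the last stage. The function name `completion_times` is hypothetical.

```python
def completion_times(num_instructions, num_stages, taken_branches=()):
    """Return the clock period in which each instruction leaves the last stage.

    Clock periods are numbered from 1. `taken_branches` holds the 1-based
    positions of the instructions assumed to be successful branches.
    """
    finish = []
    start = 1
    for k in range(1, num_instructions + 1):
        end = start + num_stages - 1
        finish.append(end)
        # Normally the next instruction enters one clock later; after a
        # taken branch it waits until the branch has left the pipeline.
        start = end + 1 if k in taken_branches else start + 1
    return finish


without_branch = completion_times(8, 5)
with_branch = completion_times(8, 5, taken_branches={5})
print(without_branch[-1], with_branch[-1])
```

Under these assumptions the branch at I5 delays every later instruction by n - 1 = 4 clock periods.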

Estimation of the effect of branching on an n-segment instruction pipeline

• Consider an instruction cycle with n pipeline clock periods.
• Let
  – p be the probability of a conditional branch (20%)
  – q be the probability that a branch is successful (60% of the 20%, since 12/20 = 0.6)
• Suppose there are m instructions.
• Then the number of instructions that are successful branches = m × p × q (m × 0.2 × 0.6).
• A delay of (n-1)/n instruction cycles is required for each successful branch to flush the pipeline.

• Thus, the total number of instruction cycles required for m instructions =

$$\frac{1}{n}(n + m - 1) + \frac{m p q (n - 1)}{n}$$
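As a numeric check, the total can be evaluated for sample values. The sketch below uses p = 0.2 and q = 0.6 from the slides; m = 100 and n = 5 are assumed values chosen only for illustration, and the function name is hypothetical.

```python
def total_instruction_cycles(m, n, p, q):
    """Instruction cycles needed for m instructions on an n-segment pipeline.

    (n + m - 1) / n covers filling the pipeline and streaming m instructions;
    each of the m * p * q successful branches adds a delay of (n - 1) / n.
    """
    fill_and_stream = (n + m - 1) / n
    branch_delay = m * p * q * (n - 1) / n
    return fill_and_stream + branch_delay


# Assumed sample values: m = 100 instructions, n = 5 segments.
print(total_instruction_cycles(m=100, n=5, p=0.2, q=0.6))
print(total_instruction_cycles(m=100, n=5, p=0.0, q=0.6))
```

With these values the branch term adds about 9.6 instruction cycles to the 20.8 needed without branching.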

• As m becomes large, the average number of instructions per instruction cycle is given as

$$\lim_{m \to \infty} \frac{m}{\frac{1}{n}(n + m - 1) + \frac{m p q (n - 1)}{n}} = \frac{n}{1 + p q (n - 1)}$$
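The limit can be worked out in two steps: multiply numerator and denominator by n, then divide both by m. The derivation below uses only the symbols m, n, p and q defined above.

```latex
\begin{align*}
\lim_{m \to \infty} \frac{m}{\frac{1}{n}(n + m - 1) + \frac{m p q (n - 1)}{n}}
  &= \lim_{m \to \infty} \frac{n m}{(n + m - 1) + m p q (n - 1)} \\
  &= \lim_{m \to \infty} \frac{n}{\frac{n - 1}{m} + 1 + p q (n - 1)} \\
  &= \frac{n}{1 + p q (n - 1)}
\end{align*}
```

The term (n - 1)/m vanishes as m grows, which leaves the closed form.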

• When p = 0, the above measure reduces to n, which is ideal.
• In reality, it is always less than n.
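A short sketch makes the gap from the ideal concrete. It evaluates n / (1 + pq(n - 1)) with p = 0.2 and q = 0.6 from the slides; n = 5 is an assumption borrowed from the 5-stage example, and the function name is hypothetical.

```python
def instructions_per_cycle(n, p, q):
    """Average number of instructions per instruction cycle for large m."""
    return n / (1 + p * q * (n - 1))


ideal = instructions_per_cycle(n=5, p=0.0, q=0.6)      # no conditional branches
with_branches = instructions_per_cycle(n=5, p=0.2, q=0.6)
print(ideal, round(with_branches, 2))
```

With these values the measure drops from 5 to roughly 3.4 instructions per instruction cycle.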

Solution = ?

Multiple Prefetch Buffers

• Three types of buffers can be used to match the instruction fetch rate to the pipeline consumption rate:
  1. Sequential Buffers: for in-sequence pipelining
  2. Target Buffers: hold instructions from a branch target (for out-of-sequence pipelining)
  3. Loop Buffers: hold sequential instructions within a loop
• A conditional branch causes both the sequential and target buffers to fill; based on the condition, one is selected and the other is discarded.
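The selection between the sequential and target buffers can be sketched as follows. This is a toy model under stated assumptions, not a description of real hardware: the buffers are modelled as Python deques, the instruction labels are made up, and `resolve_conditional_branch` is a hypothetical name.

```python
from collections import deque


def resolve_conditional_branch(sequential, target, branch_taken):
    """Keep the buffer that matches the branch outcome and empty the other."""
    if branch_taken:
        chosen, discarded = target, sequential
    else:
        chosen, discarded = sequential, target
    discarded.clear()
    return chosen


# Both buffers are filled while the branch condition is still unknown.
sequential_buffer = deque(["I6", "I7", "I8"])  # next in-sequence instructions
target_buffer = deque(["T1", "T2", "T3"])      # instructions at the branch target

feed = resolve_conditional_branch(sequential_buffer, target_buffer, branch_taken=True)
print(list(feed), list(sequential_buffer))
```

Because both paths were prefetched, the pipeline can continue from whichever buffer is selected without waiting for a new fetch.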