Pipelining combinational circuits

Arvind
Computer Science & Artificial Intelligence Lab.
Massachusetts Institute of Technology

February 20, 2013 http://csg.csail.mit.edu/6.375 L05-1

Combinational IFFT

- Lot of area and long combinational delay
- A folded or multi-cycle version can save area and reduce the combinational delay, but throughput per clock cycle gets worse
- Pipelining: a method to increase the circuit throughput by evaluating multiple IFFTs

[Figure: combinational IFFT. Inputs in0 ... in63 pass through three stages of Bfly4 blocks (x16), each stage followed by a Permute, to outputs out0 ... out63. The stages are labelled IFFTi+1, IFFTi, IFFTi-1: 3 different datasets in the pipeline.]

Inelastic vs Elastic pipeline

[Figure: inelastic pipeline: inQ -> f0 -> sReg1 -> f1 -> sReg2 -> f2 -> outQ]

Inelastic: all pipeline stages move synchronously

[Figure: elastic pipeline: inQ -> f1 -> fifo1 -> f2 -> fifo2 -> f3 -> outQ]

Elastic: a pipeline stage can process data if its input FIFO is not empty and its output FIFO is not full

Most complex processor pipelines are a combination of the two styles

Inelastic vs Elastic Pipelines

Inelastic pipeline:
- typically only one rule; the designer controls precisely which activities go on in parallel
- downside: the rule can get too complicated -- easy to make mistakes; difficult to make changes

Elastic pipeline:
- several smaller rules, each easy to write, easier to make changes
- downside: sometimes rules do not fire concurrently when they should

Inelastic pipeline

[Figure: inQ -> f0 -> sReg1 -> f1 -> sReg2 -> f2 -> outQ]

rule sync_pipeline (True);
   inQ.deq();
   sReg1 <= f0(inQ.first());
   sReg2 <= f1(sReg1);
   outQ.enq(f2(sReg2));
endrule

This is real IFFT code; just replace f0, f1 and f2 with stage_f code

This rule can fire only if
- inQ has an element
- outQ has space

Atomicity: Either all or none of the state elements inQ, outQ, sReg1 and sReg2 will be updated

Inelastic pipeline
Making implicit guard conditions explicit

[Figure: inQ -> f0 -> sReg1 -> f1 -> sReg2 -> f2 -> outQ]

rule sync_pipeline (inQ.notEmpty() && outQ.notFull());
   inQ.deq();
   sReg1 <= f0(inQ.first());
   sReg2 <= f1(sReg1);
   outQ.enq(f2(sReg2));
endrule

Suppose sReg1 and sReg2 have data, outQ is not full but inQ is empty. What behavior do you expect?

Leave green and red data in the pipeline?

Pipeline bubbles

[Figure: inQ -> f0 -> sReg1 -> f1 -> sReg2 -> f2 -> outQ]

rule sync_pipeline (True);
   inQ.deq();
   sReg1 <= f0(inQ.first());
   sReg2 <= f1(sReg1);
   outQ.enq(f2(sReg2));
endrule

Red and Green tokens must move even if there is nothing in inQ!

Also if there is no token in sReg2 then nothing should be enqueued in the outQ

Modify the rule to deal with these conditions

Valid bits or the Maybe type

Explicit encoding of Valid/Invalid data

typedef union tagged {void Valid; void Invalid;} Validbit deriving (Eq, Bits);

rule sync_pipeline (True);
   if (inQ.notEmpty()) begin
      sReg1 <= f0(inQ.first());
      inQ.deq();
      sReg1f <= Valid;
   end
   else sReg1f <= Invalid;
   sReg2 <= f1(sReg1);
   sReg2f <= sReg1f;
   if (sReg2f == Valid) outQ.enq(f2(sReg2));
endrule

[Figure: inQ -> f0 -> sReg1 -> f1 -> sReg2 -> f2 -> outQ, with a Valid/Invalid flag alongside sReg1 and sReg2]

When is this rule enabled?

NE = not empty, E = empty, V = Valid, I = Invalid, NF = not full, F = full

inQ  sReg1f  sReg2f  outQ  enabled?
NE   V       V       NF    Yes
NE   V       V       F     No
NE   V       I       NF    Yes
NE   V       I       F     Yes
NE   I       V       NF    Yes
NE   I       V       F     No
NE   I       I       NF    Yes
NE   I       I       F     Yes
E    V       V       NF    Yes
E    V       V       F     No
E    V       I       NF    Yes
E    V       I       F     Yes
E    I       V       NF    Yes
E    I       V       F     No
E    I       I       NF    Yes1
E    I       I       F     Yes

Yes1 = yes but no change

rule sync_pipeline (True);
   if (inQ.notEmpty()) begin
      sReg1 <= f0(inQ.first());
      inQ.deq();
      sReg1f <= Valid;
   end
   else sReg1f <= Invalid;
   sReg2 <= f1(sReg1);
   sReg2f <= sReg1f;
   if (sReg2f == Valid) outQ.enq(f2(sReg2));
endrule

[Figure: inQ -> f0 -> sReg1 -> f1 -> sReg2 -> f2 -> outQ]

The Maybe type

A useful type to capture valid/invalid data

typedef union tagged {
   void Invalid;
   data_T Valid;
} Maybe#(type data_T);

[Figure: a register holding a valid/invalid tag alongside the data]

Registers contain Maybe type values

Some useful functions on Maybe type:
- isValid(x) returns true if x is Valid
- fromMaybe(d,x) returns the data value in x if x is Valid, the default value d if x is Invalid
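A minimal sketch of the two library functions in use. The function names valueOrZero and incrementIfValid and the 8-bit data width are assumptions made for illustration; isValid, fromMaybe and the tagged Valid / tagged Invalid constructors are the standard ones.

```bsv
// Hypothetical helper: unwrap a Maybe value, falling back to 0 when Invalid.
function Bit#(8) valueOrZero(Maybe#(Bit#(8)) m);
   return fromMaybe(0, m);
endfunction

// Hypothetical helper: add 1 to a Valid value and pass Invalid through unchanged.
function Maybe#(Bit#(8)) incrementIfValid(Maybe#(Bit#(8)) m);
   return isValid(m) ? tagged Valid (fromMaybe(0, m) + 1) : tagged Invalid;
endfunction
```

The default passed to fromMaybe in incrementIfValid is never observed, because that branch is taken only when m is Valid.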


Using the Maybe type

typedef union tagged {
   void Invalid;
   data_T Valid;
} Maybe#(type data_T);

[Figure: a register holding a valid/invalid tag alongside the data]

Registers contain Maybe type values

rule sync_pipeline if (True);
   if (inQ.notEmpty()) begin
      sReg1 <= Valid f0(inQ.first());
      inQ.deq();
   end
   else sReg1 <= Invalid;
   sReg2 <= isValid(sReg1) ? Valid f1(fromMaybe(d, sReg1)) : Invalid;
   if (isValid(sReg2)) outQ.enq(f2(fromMaybe(d, sReg2)));
endrule

Pattern-matching: An alternative syntax to extract data structure components

typedef union tagged {
   void Invalid;
   data_T Valid;
} Maybe#(type data_T);

case (m) matches
   tagged Invalid : return 0;
   tagged Valid .x : return x;
endcase

x will get bound to the appropriate part of m

if (m matches tagged Valid .x &&& (x > 10))

The &&& is a conjunction, and allows pattern-variables to come into scope from left to right

The Maybe type data in the pipeline

typedef union tagged {
   void Invalid;
   data_T Valid;
} Maybe#(type data_T);

[Figure: a register holding a valid/invalid tag alongside the data]

Registers contain Maybe type values

rule sync_pipeline if (True);
   if (inQ.notEmpty()) begin
      sReg1 <= Valid (f0(inQ.first()));
      inQ.deq();
   end
   else sReg1 <= Invalid;
   case (sReg1) matches
      tagged Valid .sx1: sReg2 <= Valid f1(sx1);
      tagged Invalid: sReg2 <= Invalid;
   endcase
   case (sReg2) matches
      tagged Valid .sx2: outQ.enq(f2(sx2));
   endcase
endrule

sx1 will get bound to the appropriate part of sReg1
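For context, the rule can be placed in a complete module along the following lines. This is a sketch under stated assumptions, not the course's IFFT code: the 32-bit Data type, the placeholder bodies of f0, f1 and f2, and the SyncPipeline interface with put/get are all invented for illustration. It also assumes compilation with the bsc flag -aggressive-conditions, so that the implicit guards of inQ.first and inQ.deq are lifted conditionally; with conservative lifting the rule would not fire when inQ is empty, and tokens would stay in the registers.

```bsv
import FIFOF::*;

typedef Bit#(32) Data;                 // assumption: 32-bit tokens

// Placeholder stage functions (assumptions); substitute the real stage logic.
function Data f0(Data x) = x + 1;
function Data f1(Data x) = x + 2;
function Data f2(Data x) = x + 3;

// Hypothetical interface so the pipeline can be fed and drained from outside.
interface SyncPipeline;
   method Action put(Data x);
   method ActionValue#(Data) get();
endinterface

module mkSyncPipeline (SyncPipeline);
   FIFOF#(Data)       inQ   <- mkFIFOF;
   FIFOF#(Data)       outQ  <- mkFIFOF;
   Reg#(Maybe#(Data)) sReg1 <- mkReg(tagged Invalid);
   Reg#(Maybe#(Data)) sReg2 <- mkReg(tagged Invalid);

   rule sync_pipeline;
      // Stage 0: take a token if one is available, otherwise insert a bubble.
      if (inQ.notEmpty()) begin
         sReg1 <= tagged Valid (f0(inQ.first()));
         inQ.deq();
      end
      else sReg1 <= tagged Invalid;

      // Stage 1: valid tokens advance, bubbles propagate.
      case (sReg1) matches
         tagged Valid .sx1: sReg2 <= tagged Valid (f1(sx1));
         tagged Invalid:    sReg2 <= tagged Invalid;
      endcase

      // Stage 2: only valid tokens are enqueued.
      if (sReg2 matches tagged Valid .sx2) outQ.enq(f2(sx2));
   endrule

   method Action put(Data x);
      inQ.enq(x);
   endmethod

   method ActionValue#(Data) get();
      outQ.deq();
      return outQ.first();
   endmethod
endmodule
```

Both registers start as Invalid, so nothing is enqueued into outQ until a real token has travelled through both stages.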


Generalization: n-stage pipeline

rule sync_pipeline (True);
   if (inQ.notEmpty()) begin
      sReg[0] <= Valid f(0, inQ.first());
      inQ.deq();
   end
   else sReg[0] <= Invalid;
   for (Integer i = 1; i < n-1; i = i+1) begin
      case (sReg[i-1]) matches
         tagged Valid .sx: sReg[i] <= Valid f(i, sx);
         tagged Invalid: sReg[i] <= Invalid;
      endcase
   end
   case (sReg[n-2]) matches
      tagged Valid .sx: outQ.enq(f(n-1, sx));
   endcase
endrule

[Figure: inQ -> f(0) -> sReg[0] -> f(1) -> sReg[1] -> f(2) -> ... -> sReg[n-2] -> f(n-1) -> outQ]
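One way the n-stage rule might be written as a full module uses a Vector of registers. This is a sketch, numbering the stage functions as in the figure (f(0) feeds sReg[0], f(i) feeds sReg[i], f(n-1) feeds outQ). The choice n = 4, the 32-bit Data type, the body of f and the PipeN interface are assumptions for illustration, and the bsc flag -aggressive-conditions is assumed so that the pipeline keeps moving when inQ is empty.

```bsv
import FIFOF::*;
import Vector::*;

typedef Bit#(32) Data;       // assumption: 32-bit tokens
typedef 4 NumStages;         // assumption: n = 4 (n must be at least 2)

// Placeholder stage function (assumption): stage i adds i+1.
function Data f(Integer stage, Data x) = x + fromInteger(stage + 1);

// Hypothetical interface for feeding and draining the pipeline.
interface PipeN;
   method Action put(Data x);
   method ActionValue#(Data) get();
endinterface

module mkSyncPipelineN (PipeN);
   Integer n = valueOf(NumStages);

   FIFOF#(Data) inQ  <- mkFIFOF;
   FIFOF#(Data) outQ <- mkFIFOF;
   // n stages need n-1 pipeline registers, all initially holding bubbles.
   Vector#(TSub#(NumStages, 1), Reg#(Maybe#(Data))) sReg
      <- replicateM(mkReg(tagged Invalid));

   rule sync_pipeline;
      if (inQ.notEmpty()) begin
         sReg[0] <= tagged Valid (f(0, inQ.first()));
         inQ.deq();
      end
      else sReg[0] <= tagged Invalid;

      // The loop is unrolled at compile time; i is a static index.
      for (Integer i = 1; i < n - 1; i = i + 1) begin
         Maybe#(Data) prev = sReg[i-1];
         case (prev) matches
            tagged Valid .sx: sReg[i] <= tagged Valid (f(i, sx));
            tagged Invalid:   sReg[i] <= tagged Invalid;
         endcase
      end

      Maybe#(Data) last = sReg[n-2];
      if (last matches tagged Valid .sx) outQ.enq(f(n - 1, sx));
   endrule

   method Action put(Data x);
      inQ.enq(x);
   endmethod

   method ActionValue#(Data) get();
      outQ.deq();
      return outQ.first();
   endmethod
endmodule
```

Because the for loop is elaborated statically, the generated hardware is the same as writing out each stage by hand.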


Elastic pipeline

Use FIFOs instead of pipeline registers

[Figure: inQ -> f1 -> fifo1 -> f2 -> fifo2 -> f3 -> outQ]

rule stage1 if (True);
   fifo1.enq(f1(inQ.first()));
   inQ.deq();
endrule

rule stage2 if (True);
   fifo2.enq(f2(fifo1.first()));
   fifo1.deq();
endrule

rule stage3 if (True);
   outQ.enq(f3(fifo2.first()));
   fifo2.deq();
endrule

- What is the firing condition for each rule?
- Can tokens be left inside the pipeline?
- No need for Maybe types
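The three rules can be wrapped in a module roughly as follows. This is a sketch: the 32-bit Data type, the placeholder bodies of f1, f2 and f3, and the ElasticPipeline interface are assumptions for illustration. Each rule's firing condition comes entirely from the implicit guards of the FIFO methods it calls (first and deq need a non-empty FIFO, enq needs a non-full one).

```bsv
import FIFO::*;

typedef Bit#(32) Data;                 // assumption: 32-bit tokens

// Placeholder stage functions (assumptions).
function Data f1(Data x) = x + 1;
function Data f2(Data x) = x + 2;
function Data f3(Data x) = x + 3;

// Hypothetical interface for feeding and draining the pipeline.
interface ElasticPipeline;
   method Action put(Data x);
   method ActionValue#(Data) get();
endinterface

module mkElasticPipeline (ElasticPipeline);
   FIFO#(Data) inQ   <- mkFIFO;
   FIFO#(Data) fifo1 <- mkFIFO;
   FIFO#(Data) fifo2 <- mkFIFO;
   FIFO#(Data) outQ  <- mkFIFO;

   // Fires when inQ is not empty and fifo1 is not full.
   rule stage1;
      fifo1.enq(f1(inQ.first()));
      inQ.deq();
   endrule

   // Fires when fifo1 is not empty and fifo2 is not full.
   rule stage2;
      fifo2.enq(f2(fifo1.first()));
      fifo1.deq();
   endrule

   // Fires when fifo2 is not empty and outQ is not full.
   rule stage3;
      outQ.enq(f3(fifo2.first()));
      fifo2.deq();
   endrule

   method Action put(Data x);
      inQ.enq(x);
   endmethod

   method ActionValue#(Data) get();
      outQ.deq();
      return outQ.first();
   endmethod
endmodule
```

A token in fifo1 or fifo2 moves forward whenever the next FIFO has space, regardless of whether inQ has new data, which is why no valid bits are needed.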


Firing conditions for each rule

[Figure: inQ -> f1 -> fifo1 -> f2 -> fifo2 -> f3 -> outQ]

NE = not empty, NF = not full, F = full

inQ  fifo1   fifo2   outQ   rule1  rule2  rule3
NE   NE,NF   NE,NF   NF     Yes    Yes    Yes
NE   NE,NF   NE,NF   F      Yes    Yes    No
NE   NE,NF   NE,F    NF     Yes    No     Yes
NE   NE,NF   NE,F    F      Yes    No     No
....

This is the first example we have seen where multiple rules may be ready to execute concurrently

Can we execute multiple rules together?

Informal analysis

[Figure: inQ -> f1 -> fifo1 -> f2 -> fifo2 -> f3 -> outQ]

inQ  fifo1   fifo2   outQ   rule1  rule2  rule3
NE   NE,NF   NE,NF   NF     Yes    Yes    Yes
NE   NE,NF   NE,NF   F      Yes    Yes    No
NE   NE,NF   NE,F    NF     Yes    No     Yes
NE   NE,NF   NE,F    F      Yes    No     No
....

FIFOs must permit concurrent enq and deq for all three rules to fire concurrently

Concurrency when the FIFOs do not permit concurrent enq and deq

[Figure: inQ -> f1 -> fifo1 -> f2 -> fifo2 -> f3 -> outQ. Conditions marked on the queues: inQ not empty; fifo1 not empty & not full; fifo2 not empty & not full; outQ not full.]

At best alternate stages in the pipeline will be able to fire concurrently

Pipelined designs expressed using Multiple rules

If rules for different pipeline stages never fire in the same cycle then the design can hardly be called a pipelined design

If all the enabled rules fire in parallel every cycle then, in general, wrong results can be produced

We need a clean model for concurrent firing of rules


BSV Rule Execution

- A BSV program consists of state elements and rules, aka Guarded Atomic Actions (GAA), that operate on the state elements
- Application of a rule modifies some state elements of the system in a deterministic manner

[Figure: current state -> next state computation (f_x) -> next state values; the guard is ANDed with the register enables (reg en's) that load nextState]

BSV Execution Model

Repeatedly:
- Select a rule to execute
- Compute the state updates
- Make the state updates

Highly non-deterministic

User annotations can help in rule selection

One-rule-at-a-time semantics

The legal behavior of a BSV program can always be explained by observing the state updates obtained by applying only one rule at a time

Implementation concern: Schedule multiple rules concurrently without violating one-rule-at-a-time semantics

Guard lifting

- For concurrent scheduling of rules, we only need to consider those rules that can be concurrently enabled, i.e., whose guards are true
- In order to understand when a rule can be enabled, we need to understand precisely how implicit guards are lifted to form the rule guard

Making guards explicit

rule foo if (True);
   if (p) fifo.enq(8);
   r <= 7;
endrule

With the implicit guard of fifo.enq made explicit:

rule foo if ((p && fifo.notFull) || !p);
   if (p) fifo.enq(8);
   r <= 7;
endrule

Effectively, all implicit conditions (guards) are lifted and conjoined to the rule guard
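A sketch of rule foo with its guard written out in a complete module. To make the explicit guard the only guard, this version swaps in an unguarded FIFO (mkUGFIFOF from the FIFOF package), whose enq carries no implicit condition. The module name mkGuardExample, the 8-bit widths and modelling p as a register are assumptions for illustration.

```bsv
import FIFOF::*;

// Hypothetical module; p is a register here only so the example is self-contained.
module mkGuardExample (Empty);
   // Unguarded FIFO: enq has no implicit guard, so the rule must protect it.
   FIFOF#(Bit#(8)) fifo <- mkUGFIFOF;
   Reg#(Bit#(8))   r    <- mkReg(0);
   Reg#(Bool)      p    <- mkReg(True);

   // Same condition as (p && fifo.notFull) || !p, simplified.
   rule foo (!p || fifo.notFull);
      if (p) fifo.enq(8);
      r <= 7;
   endrule
endmodule
```

With a guarded FIFO such as mkFIFOF the compiler adds the implicit condition of enq by itself, so writing the guard by hand is mainly useful for understanding when the rule can fire.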


Implicit guards (conditions)

rule <name> if (<guard>); <action>; endrule

<action> ::= r <= <exp>
           | if (<exp>) <action>
           | <action> ; <action>
           | m.g(<exp>)
           | t = <exp>

make implicit guards explicit: m.g(<exp>) becomes m.gB(<exp>) when m.gG

<action> ::= r <= <exp>
           | if (<exp>) <action>
           | <action> when (<exp>)
           | <action> ; <action>
           | m.gB(<exp>)
           | t = <exp>

Guards vs If’s

A guard on one action of a parallel group of actions affects every action within the group

   (a1 when p1); a2
   ==> (a1; a2) when p1

A condition of a Conditional action only affects the actions within the scope of the conditional action

   (if (p1) a1); a2

   p1 has no effect on a2 ...

Mixing ifs and whens

   (if (p) (a1 when q)) ; a2
   ==> ((if (p) a1); a2) when ((p && q) | !p)
   ==> ((if (p) a1); a2) when (q | !p)

Guard Lifting rules

All the guards can be “lifted” to the top of a rule

   (a1 when p) ; a2       ==>  (a1 ; a2) when p
   a1 ; (a2 when p)       ==>  (a1 ; a2) when p
   if (p when q) a        ==>  (if (p) a) when q
   if (p) (a when q)      ==>  (if (p) a) when (q | !p)
   (a when p1) when p2    ==>  a when (p1 & p2)
   x <= (e when p)        ==>  (x <= e) when p
   similarly for expressions ...
   Rule r (a when p)      ==>  Rule r (if (p) a)

Can you prove that using these rules all the guards will be lifted to the top of an action?

BSV provides a primitive (impCondOf) to make guards explicit and lift them to the top

From now on in concurrency discussions we will assume that all guards have been lifted to the top in every rule
