Dynamic Predication


Page 1: Dynamic Predication

10/25/2006

Dynamic Predication

ACAL Group Seminar

Alok Garg

Page 2: Dynamic Predication

What is Predicated Execution?

Conditional instruction
  Executed: if the condition is true
  NOP: if the condition is false

Eliminates simple branches
  if (A == 0) { S = T }

Converts control dependencies into data dependencies

  Branch version:
    BNEZ R1, L
    ADDU R2, R3, R0
  L:

  Predicated version:
    CMOVZ R2, R3, R1
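
The conversion from a control dependence to a data dependence can be sketched in Python. This is only an illustration of the semantics, not a model of any pipeline; the function names `branch_version`, `cmovz`, and `predicated_version` are hypothetical, and the register mapping (R1 = A, R2 = S, R3 = T) is assumed from the example.

```python
def branch_version(a, s, t):
    """Control dependence: the assignment runs only if the branch falls through."""
    if a == 0:      # BNEZ R1, L skips the next instruction when A != 0
        s = t       # ADDU R2, R3, R0
    return s


def cmovz(dest, src, cond):
    """Conditional move: return src when cond is zero, otherwise keep dest."""
    return src if cond == 0 else dest


def predicated_version(a, s, t):
    """Data dependence: the move always issues; its result depends on A."""
    return cmovz(s, t, a)   # CMOVZ R2, R3, R1


# Both versions agree for every value of the condition.
for a in (0, 1, 7):
    assert branch_version(a, s=5, t=9) == predicated_version(a, s=5, t=9)
```

In the predicated version there is no branch left to mispredict; the cost is that the move always occupies an issue slot and must wait for R1.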

Page 3: Dynamic Predication

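Simple Example

[Figure: control-flow graph in which block A branches to B or C (edges labeled T and NT); both paths merge at D, followed by E]

Normal Execution
  A [B D E] C D E
  The bracketed blocks are flushed from the pipeline due to a misprediction.

Predicated Execution
  A [C[!p] B[p]] D E
  The bracketed blocks are conditional instructions.

Limitations of software predication:

1. If the branch is NT 98% of the time, predication adds overhead to a branch that is already easy to predict
2. Delayed execution of blocks B or C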
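
The first limitation can be made concrete with a toy cost comparison. This is a rough sketch under stated assumptions: the cycle counts, the flush penalty, and the function names are all hypothetical, and the model ignores overlap between instructions.

```python
def branch_cost(block_cycles, mispredict_rate, flush_penalty):
    """Expected cycles when one path is fetched and mispredictions flush."""
    return block_cycles + mispredict_rate * flush_penalty


def predication_cost(then_cycles, else_cycles):
    """Cycles when both blocks are always fetched and executed."""
    return then_cycles + else_cycles


# Hypothetical numbers: 5-cycle blocks, 20-cycle flush penalty.
easy = branch_cost(5, mispredict_rate=0.02, flush_penalty=20)   # biased branch
hard = branch_cost(5, mispredict_rate=0.40, flush_penalty=20)   # hard branch
both = predication_cost(5, 5)

assert easy < both   # predicating an easy branch costs more than it saves
assert hard > both   # predicating a hard branch pays off in this toy model
```

Under these assumed numbers, predication helps only when the branch mispredicts often, which is why the choice is worth making per branch rather than once at compile time.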

Page 4: Dynamic Predication

Limitations of Predication

ISA support
  Predicate registers
  Predicated instructions

Performance overhead
  Instruction fetch from both paths
  Cannot execute predicated instructions until the predicate value is resolved
  Ideal predication speedup: 16.4%

Only a small subset of the control-flow graph is covered
  The compiler cannot if-convert complex control flow
  Ideal predication for all conditional branches: 37.4%

Page 5: Dynamic Predication

Motivation

Some branches are still very hard to predict with conventional branch predictors

Mispredictions lead to costly pipeline flushes
  Performance
  Energy

Predication is used to avoid pipeline flushes for those hard-to-predict branches

Page 6: Dynamic Predication

Papers Covered

Dynamic Hammock Predication for Non-predicated Instruction Set Architectures. Artur Klauser, Todd Austin, Dirk Grunwald, and Brad Calder. PACT 1998

Wish Branches: Combining Conditional Branching with Predication for Adaptive Predicated Execution. Hyesoon Kim, Onur Mutlu, Jared Stark, and Yale N. Patt. MICRO 2005, IEEE Micro Top Picks 2006

Diverge-Merge Processor (DMP): Dynamic Predicated Execution of Complex Control-Flow Graphs Based on Frequently Executed Paths. Hyesoon Kim, Jose A. Joao, Onur Mutlu, and Yale N. Patt. MICRO 2006

Page 7: Dynamic Predication

Types of Control-Flow Graphs

[Figure: three control-flow graphs: simple hammock (blocks A to F), nested hammock (blocks A to I), and frequently-hammock (blocks A to H)]

Page 8: Dynamic Predication

Types of Control-Flow Graphs

[Figure: two control-flow graphs: loop (labels L, A, B, C) and non-merging control flow (blocks A to G)]

Page 9: Dynamic Predication

Distribution of Mispredicted Branches

Simple + nested hammocks: 16% of all mispredictions
All types except non-merging: 66% of all mispredictions

Page 10: Dynamic Predication

Dynamic Hammock Predication

Targets the first limitation of software predication
  Removes the need for ISA support

Dynamic predication for simple hammocks
  11% of all mispredictions
  Compiler support to mark simple hammock boundaries

Predication decision
  Dynamic decision
  Static, profile-based

Page 11: Dynamic Predication

Support for Dynamic Predication

Fork context
  a) R1 := …
  b) R2 := …
  c) R3 := …
  d) R4 := …
  e) B - cc (i)

Then context (cc is false)
  f) R1 := R1 + R2
  g) R3 := R1 x 2
  h) BR (k)

Else context (cc is true)
  i) R2 := R1 - R2
  j) R3 := R2 x 2

Join context
  k) RA := R1
  l) RB := R2
  m) RC := R3
  n) RD := R4

Page 12: Dynamic Predication

Support for Dynamic Predication

After renaming, each destination register is tagged with the instruction that writes it. (The then, else, and join contexts before renaming are the same as on the previous slide.)

Fork context
  a) R1.a := …
  b) R2.b := …
  c) R3.c := …
  d) R4.d := …
  e) PL.e

Then context (predicate value = 0)
  f) R1.f := R1.a + R2.b
  g) R3.g := R1.f x 2
  h) Removed

Else context (predicate value = 1)
  i) R2.i := R1.a - R2.b
  j) R3.j := R2.i x 2

Join context
  k) R1.k := PL.e ? R1.a : R1.f
  l) R2.l := PL.e ? R2.i : R2.b
  m) R3.m := PL.e ? R3.j : R3.g
  n) RA.n := R1.k
  o) RB.o := R2.l
  p) RC.p := R3.m
  q) RD.q := R4.d

Rename table before the join
        fork  then  else
  R1    a     f
  R2    b           i
  R3    c     g     j
  R4    d

Rename table after the join
        fork  then  else
  R1    k
  R2    l
  R3    m
  R4    d
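
The renamed example can be traced with a small Python sketch. It assumes that the join-context entries k, l, and m are selects keyed on the predicate PL.e (the else-context value when the predicate is 1, the then-context value otherwise); the function names and the dictionary representation of the register file are hypothetical.

```python
def run_hammock(r1, r2, r3, r4, cc):
    """Execute both contexts with renamed registers, then merge with selects."""
    # Fork context: values live at the branch (tags a-d), predicate in PL.e.
    regs = {"R1.a": r1, "R2.b": r2, "R3.c": r3, "R4.d": r4, "PL.e": cc}

    # Then context (correct when cc is false).
    regs["R1.f"] = regs["R1.a"] + regs["R2.b"]
    regs["R3.g"] = regs["R1.f"] * 2

    # Else context (correct when cc is true).
    regs["R2.i"] = regs["R1.a"] - regs["R2.b"]
    regs["R3.j"] = regs["R2.i"] * 2

    # Join context: one select per register written on either path.
    p = regs["PL.e"]
    regs["R1.k"] = regs["R1.a"] if p else regs["R1.f"]
    regs["R2.l"] = regs["R2.i"] if p else regs["R2.b"]
    regs["R3.m"] = regs["R3.j"] if p else regs["R3.g"]
    return regs["R1.k"], regs["R2.l"], regs["R3.m"], regs["R4.d"]


def run_branchy(r1, r2, r3, r4, cc):
    """Reference: ordinary execution of only the correct path."""
    if not cc:
        r1 = r1 + r2
        r3 = r1 * 2
    else:
        r2 = r1 - r2
        r3 = r2 * 2
    return r1, r2, r3, r4


# The predicated execution matches the branchy one for both outcomes.
for cc in (False, True):
    assert run_hammock(3, 4, 0, 9, cc) == run_branchy(3, 4, 0, 9, cc)
```

Because each path writes to its own renamed registers, neither context can clobber a value the other one needs, and R4 needs no select since neither path writes it.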

Rename Table

Page 13: Dynamic Predication

Wish Branches

Targets the second and third limitations of software predication
  Dynamic decision based on a confidence estimator
  Improved coverage by predicating loops

Uses compiler-generated predicated blocks
  Adds “wish” code for the dynamic decision

Defines how to include simple loops in predication
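
The dynamic decision can be sketched as follows. This is a minimal illustration, assuming a confidence estimator that yields a numeric confidence compared against a threshold; the function name, the threshold, and the return convention are hypothetical and not taken from the paper.

```python
def run_wish_jump(cond, confidence, threshold, predicted_taken):
    """Return (mode, flushed) for one dynamic instance of a wish jump.

    cond            -- actual branch outcome
    predicted_taken -- what the branch predictor guessed
    """
    if confidence >= threshold:
        # High confidence: behave like a normal conditional branch,
        # so a wrong prediction costs a pipeline flush.
        return "branch", predicted_taken != cond
    # Low confidence: fall through the predicated code; both paths are
    # fetched, so a wrong guess does not need a flush.
    return "predicated", False


# A mispredicted branch flushes only when it was trusted.
assert run_wish_jump(True, 12, 8, predicted_taken=False) == ("branch", True)
assert run_wish_jump(True, 3, 8, predicted_taken=False) == ("predicated", False)
```

The same compiled code serves both modes, which is what lets the hardware choose per dynamic instance instead of the compiler choosing once.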

Page 14: Dynamic Predication

Wish Jumps and Wish Joins

[Figure: the same code in three forms: branch code, predicated code, and wish jump/join code]

Page 15: Dynamic Predication

Wish Loops

[Figure: normal loop code and wish loop code]

Branch type      Normal exit        Early exit   Late exit             No exit
                 (X1 X2 X3 Y)       (X1 X2 Y)    (X1 X2 X3 X4 X5 Y)    (X1 X2 X3 X4 … XN)
High confidence  No overhead        Flush (Y)    Flush (X4 X5 Y)       Flush (X4 X5 … XN)
Low confidence   Predication delay  Flush (Y)    Predicate (X4 X5)     Flush (X4 X5 … XN)
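
The wish loop cases above can be expressed as a small classifier. This is a sketch that assumes the exit type is determined by comparing the number of iterations fetched with the number actually executed; the function name and the returned strings are hypothetical.

```python
def wish_loop_outcome(actual_iters, fetched_iters, high_confidence):
    """Classify one wish-loop instance following the table above.

    actual_iters  -- iterations the loop really executes
    fetched_iters -- iterations fetched before the predicted exit,
                     or None if the front end never predicted an exit
    """
    if fetched_iters is None:
        return "flush"                      # no exit
    if fetched_iters < actual_iters:
        return "flush"                      # early exit
    if fetched_iters == actual_iters:       # normal exit
        return "no overhead" if high_confidence else "predication delay"
    # Late exit: extra iterations were fetched.
    return "flush" if high_confidence else "predicate extra iterations"


# Late exit with low confidence is the only misprediction that avoids a flush.
assert wish_loop_outcome(3, 5, high_confidence=False) == "predicate extra iterations"
assert wish_loop_outcome(3, 5, high_confidence=True) == "flush"
```

In the late-exit, low-confidence case the extra iterations (X4 X5 in the table) are predicated rather than flushed.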

Page 16: Dynamic Predication


Dynamic Number of Wish Branches

Performance improvement: 10.7% over predicated code

Page 17: Dynamic Predication


Dynamic Number of Wish Loops

Performance improvement: 13.3% over predicated code

Page 18: Dynamic Predication

Diverge-Merge Processor (DMP)

Targets all three limitations of software predication
  Dynamic predication: little compiler support
  Dynamic decision based on confidence estimation
  Predication only on frequently executed control-flow paths

Software support
  The compiler marks all diverge and merge points

Hardware support, similar to dynamic hammock predication
  Enters predication mode at a diverge point
  Predicates only frequently executed paths
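
The diverge and merge idea can be sketched in Python. This is a simplified, static illustration under several assumptions: the control-flow graph, the block labels, the `next_frequent` map of frequently executed successors, and the fallback to ordinary prediction when the merge point is not reached are all hypothetical, and real hardware makes these decisions dynamically during fetch.

```python
def frequent_path(start, merge_point, next_frequent, limit=16):
    """Follow the frequently executed successor of each block until the
    merge point is reached; return the blocks to predicate, or None if
    the merge point is not reached within `limit` blocks."""
    path, block = [], start
    while block != merge_point:
        if block is None or len(path) >= limit:
            return None
        path.append(block)
        block = next_frequent.get(block)
    return path


def dmp_fetch(diverge_succs, merge_point, next_frequent, low_confidence):
    """Decide what to fetch at a compiler-marked diverge branch."""
    if not low_confidence:
        return "predict", []          # trust the branch predictor
    paths = [frequent_path(s, merge_point, next_frequent) for s in diverge_succs]
    if any(p is None for p in paths):
        return "predict", []          # assumed fallback, see lead-in
    return "predicate", paths         # fetch both paths up to the merge point


# Hypothetical graph: the branch goes to B or C; B -> D -> H and C -> H.
cfg = {"B": "D", "D": "H", "C": "H"}
assert dmp_fetch(["B", "C"], "H", cfg, low_confidence=True) == (
    "predicate", [["B", "D"], ["C"]])
```

Only blocks on the frequently executed paths are predicated, so rarely executed blocks between the diverge and merge points add no fetch overhead.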

Page 19: Dynamic Predication

Frequently Executed Control-Flow Paths

1. Dynamically predicate blocks B, C, E
   Reduces predication overhead

2. Improve predication coverage by including complex control-flow graphs

Page 20: Dynamic Predication

Comparison of Various Predication Schemes

DHP: dynamic hammock predication; SP: software predication; WB: wish branches; DMP: diverge-merge processor.

Model      Simple H         Nested H          Freq H           Loop             Non-Merging  Coverage
DHP        B,C,D,E,F        -                 -                -                -            11%
SP         B,C,D,E,F        B,C,D,E,F,G,H,I   -                -                -            16%
WB         B,C,D,E,F        B,C,D,E,F,G,H,I   -                A,A,B,C          -            26%
DMP        B,C,D,E,F        B,C,D,G,H,I       B,C,D,E,H        A,A,B,C          -            66%
Dual Path  Path1: B,D,E,F   Path1: B,D,H,I    Path1: B,D,E,H   Path1: A,A,B,C   Path1: B…
           Path2: C,D,E,F   Path2: C,G,H,I    Path2: C,E,H     Path2: B,C       Path2: C…

[Figure: the five control-flow graph types (simple hammock, nested hammock, frequently-hammock, loop, non-merging control flow)]

Page 21: Dynamic Predication

Performance

19.3% average performance improvement
38% reduction in pipeline flushes
9% less energy consumed

Page 22: Dynamic Predication

Conclusion

Most hard-to-predict branches (66%) have a convergence point

Dynamic predication is more effective than software predication in terms of:
  Number of mispredicted branches covered
  Accuracy of coverage

Effectively reduces the large number of pipeline flushes