Optimizing Stream Programs Using Linear State Space...
![Page 1: Optimizing Stream Programs Using Linear State Space Analysisgroups.csail.mit.edu/commit/papers/05/streamit-cases05-slides.pdf · x’ = Ax + Bu D = y = Cx + Du Linear dataflow analysis](https://reader033.fdocuments.us/reader033/viewer/2022052000/6012241b25ca2943af6a8879/html5/thumbnails/1.jpg)
Optimizing Stream Programs Using Linear State Space Analysis
Sitij Agrawal (1,2), William Thies (1), and Saman Amarasinghe (1)
(1) Massachusetts Institute of Technology, (2) Sandbridge Technologies
CASES 2005
http://cag.lcs.mit.edu/streamit
![Page 2: Optimizing Stream Programs Using Linear State Space Analysisgroups.csail.mit.edu/commit/papers/05/streamit-cases05-slides.pdf · x’ = Ax + Bu D = y = Cx + Du Linear dataflow analysis](https://reader033.fdocuments.us/reader033/viewer/2022052000/6012241b25ca2943af6a8879/html5/thumbnails/2.jpg)
Streaming Application Domain
• Based on a stream of data
  – Graphics, multimedia, software radio
  – Radar tracking, microphone arrays, HDTV editing, cell phone base stations
• Properties of stream programs
  – Regular and repeating computation
  – Parallel, independent actors with explicit communication
  – Data items have short lifetimes
[Figure: stream graph. AtoD → Decode → duplicate splitter → three parallel branches (LPF1 → HPF1, LPF2 → HPF2, LPF3 → HPF3) → roundrobin joiner → Encode → Transmit]
![Page 3: Optimizing Stream Programs Using Linear State Space Analysisgroups.csail.mit.edu/commit/papers/05/streamit-cases05-slides.pdf · x’ = Ax + Bu D = y = Cx + Du Linear dataflow analysis](https://reader033.fdocuments.us/reader033/viewer/2022052000/6012241b25ca2943af6a8879/html5/thumbnails/3.jpg)
Conventional DSP Design Flow
[Figure: design flow. Spec. (data-flow diagram) → Design the Datapaths (no control flow) → DSP Optimizations → Coefficient Tables → Rewrite the program → Architecture-Specific Optimizations (performance, power, code size) → C/Assembly Code]
• Signal Processing Expert in Matlab
• Software Engineer in C and Assembly
![Page 4: Optimizing Stream Programs Using Linear State Space Analysisgroups.csail.mit.edu/commit/papers/05/streamit-cases05-slides.pdf · x’ = Ax + Bu D = y = Cx + Du Linear dataflow analysis](https://reader033.fdocuments.us/reader033/viewer/2022052000/6012241b25ca2943af6a8879/html5/thumbnails/4.jpg)
Ideal DSP Design Flow
[Figure: design flow. Application-Level Design → High-Level Program (dataflow + control) → DSP Optimizations → Architecture-Specific Optimizations → C/Assembly Code]
• Application Programmer
• Compiler
Challenge: maintaining performance
![Page 5: Optimizing Stream Programs Using Linear State Space Analysisgroups.csail.mit.edu/commit/papers/05/streamit-cases05-slides.pdf · x’ = Ax + Bu D = y = Cx + Du Linear dataflow analysis](https://reader033.fdocuments.us/reader033/viewer/2022052000/6012241b25ca2943af6a8879/html5/thumbnails/5.jpg)
The StreamIt Language
• Goals:
  – Provide a high-level stream programming model
  – Invent new compiler technology for streams
• Contributions:
  – Language design [CC ’02, PPoPP ’05]
  – Compiling to tiled architectures [ASPLOS ’02, ISCA ’04, Graphics Hardware ’05]
  – Cache-aware scheduling [LCTES ’03, LCTES ’05]
  – Domain-specific optimizations [PLDI ’03, CASES ’05]
![Page 6: Optimizing Stream Programs Using Linear State Space Analysisgroups.csail.mit.edu/commit/papers/05/streamit-cases05-slides.pdf · x’ = Ax + Bu D = y = Cx + Du Linear dataflow analysis](https://reader033.fdocuments.us/reader033/viewer/2022052000/6012241b25ca2943af6a8879/html5/thumbnails/6.jpg)
Programming in StreamIt

    void->void pipeline FMRadio(int N, float lo, float hi) {
        add AtoD();
        add FMDemod();
        add splitjoin {
            split duplicate;
            for (int i=0; i<N; i++) {
                add pipeline {
                    add LowPassFilter(lo + i*(hi - lo)/N);
                    add HighPassFilter(lo + i*(hi - lo)/N);
                }
            }
            join roundrobin();
        }
        add Adder();
        add Speaker();
    }

[Figure: stream graph. AtoD → FMDemod → Duplicate → three parallel branches (LPF1 → HPF1, LPF2 → HPF2, LPF3 → HPF3) → RoundRobin → Adder → Speaker]
![Page 7: Optimizing Stream Programs Using Linear State Space Analysisgroups.csail.mit.edu/commit/papers/05/streamit-cases05-slides.pdf · x’ = Ax + Bu D = y = Cx + Du Linear dataflow analysis](https://reader033.fdocuments.us/reader033/viewer/2022052000/6012241b25ca2943af6a8879/html5/thumbnails/7.jpg)
Example StreamIt Filter

    float->float filter LowPassButterWorth (float sampleRate, float cutoff) {
        float coeff;
        float x;

        init {
            coeff = calcCoeff(sampleRate, cutoff);
        }

        work peek 2 push 1 pop 1 {
            x = peek(0) + peek(1) + coeff * x;
            push(x);
            pop();
        }
    }
![Page 8: Optimizing Stream Programs Using Linear State Space Analysisgroups.csail.mit.edu/commit/papers/05/streamit-cases05-slides.pdf · x’ = Ax + Bu D = y = Cx + Du Linear dataflow analysis](https://reader033.fdocuments.us/reader033/viewer/2022052000/6012241b25ca2943af6a8879/html5/thumbnails/8.jpg)
Focus: Linear State Space Filters
• Properties:
  1. Outputs are a linear function of inputs and states
  2. New states are a linear function of inputs and states
• Most common target of DSP optimizations
  – FIR / IIR filters
  – Linear difference equations
  – Upsamplers / downsamplers
  – DCTs
![Page 16: Optimizing Stream Programs Using Linear State Space Analysisgroups.csail.mit.edu/commit/papers/05/streamit-cases05-slides.pdf · x’ = Ax + Bu D = y = Cx + Du Linear dataflow analysis](https://reader033.fdocuments.us/reader033/viewer/2022052000/6012241b25ca2943af6a8879/html5/thumbnails/16.jpg)
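Representing State Space Filters
• A state space filter is a tuple ⟨A, B, C, D⟩ relating inputs u, states x, and outputs y:

$$x' = Ax + Bu$$
$$y = Cx + Du$$

    float->float filter IIR {
        float x1, x2;
        work push 1 pop 1 {
            float u = pop();
            push(2*(x1+x2+u));
            x1 = 0.9*x1 + 0.3*u;
            x2 = 0.9*x2 + 0.2*u;
        }
    }

• Linear dataflow analysis extracts the representation from the code:

$$A = \begin{bmatrix} 0.9 & 0 \\ 0 & 0.9 \end{bmatrix}, \quad B = \begin{bmatrix} 0.3 \\ 0.2 \end{bmatrix}, \quad C = \begin{bmatrix} 2 & 2 \end{bmatrix}, \quad D = \begin{bmatrix} 2 \end{bmatrix}$$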
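The correspondence between the IIR work function and its ⟨A, B, C, D⟩ tuple can be checked numerically. The following is a minimal Python sketch, not StreamIt compiler code: the function names `iir_direct` and `iir_state_space` are hypothetical, and it assumes both states start at zero, which the slide does not state.

```python
# Matrices read off the IIR filter on the slide.
A = [[0.9, 0.0], [0.0, 0.9]]
B = [0.3, 0.2]
C = [2.0, 2.0]
D = 2.0

def iir_direct(inputs):
    """Literal translation of the StreamIt work function (assumes zero initial state)."""
    x1 = x2 = 0.0
    out = []
    for u in inputs:
        out.append(2 * (x1 + x2 + u))
        x1 = 0.9 * x1 + 0.3 * u
        x2 = 0.9 * x2 + 0.2 * u
    return out

def iir_state_space(inputs):
    """Same filter driven by the matrices: y = Cx + Du, then x' = Ax + Bu."""
    x = [0.0, 0.0]
    out = []
    for u in inputs:
        out.append(sum(c * xi for c, xi in zip(C, x)) + D * u)
        x = [sum(a * xi for a, xi in zip(row, x)) + b * u
             for row, b in zip(A, B)]
    return out

samples = [1.0, 0.0, -2.0, 0.5]  # arbitrary test input
direct = iir_direct(samples)
via_matrices = iir_state_space(samples)
```

Both lists should agree up to floating-point rounding, which is what makes the matrix form a usable stand-in for the code.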
![Page 17: Optimizing Stream Programs Using Linear State Space Analysisgroups.csail.mit.edu/commit/papers/05/streamit-cases05-slides.pdf · x’ = Ax + Bu D = y = Cx + Du Linear dataflow analysis](https://reader033.fdocuments.us/reader033/viewer/2022052000/6012241b25ca2943af6a8879/html5/thumbnails/17.jpg)
State Space Optimizations
1. State removal
2. Reducing the number of parameters
3. Combining adjacent filters
![Page 18: Optimizing Stream Programs Using Linear State Space Analysisgroups.csail.mit.edu/commit/papers/05/streamit-cases05-slides.pdf · x’ = Ax + Bu D = y = Cx + Du Linear dataflow analysis](https://reader033.fdocuments.us/reader033/viewer/2022052000/6012241b25ca2943af6a8879/html5/thumbnails/18.jpg)
Change-of-Basis Transformation
Start from the original filter:

$$x' = Ax + Bu, \qquad y = Cx + Du$$

Let T be an invertible matrix and multiply the state update by T:

$$Tx' = TAx + TBu, \qquad y = Cx + Du$$

Insert the identity $T^{-1}T$ in front of x:

$$Tx' = TA(T^{-1}T)x + TBu, \qquad y = C(T^{-1}T)x + Du$$

Regroup:

$$Tx' = TAT^{-1}(Tx) + TBu, \qquad y = CT^{-1}(Tx) + Du$$

Substitute z = Tx:

$$z' = TAT^{-1}z + TBu, \qquad y = CT^{-1}z + Du$$

The result has the same form as the original filter:

$$z' = A'z + B'u, \qquad y = C'z + D'u$$
$$A' = TAT^{-1}, \quad B' = TB, \quad C' = CT^{-1}, \quad D' = D$$

Can map original states x to transformed states z = Tx without changing I/O behavior
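The claim that a change of basis leaves I/O behavior untouched can be spot-checked with a small simulation. This is a hedged sketch in plain Python: the helpers `matmul`, `inverse_2x2`, `change_basis`, and `run` are hypothetical names, the matrix T is an arbitrary invertible choice, and zero initial state is an assumption.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def inverse_2x2(T):
    """Closed-form inverse of a 2x2 matrix (assumes a nonzero determinant)."""
    (a, b), (c, d) = T
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def change_basis(A, B, C, D, T):
    """Return A' = T A T^-1, B' = T B, C' = C T^-1, D' = D."""
    T_inv = inverse_2x2(T)
    return matmul(matmul(T, A), T_inv), matmul(T, B), matmul(C, T_inv), D

def run(A, B, C, D, inputs):
    """Simulate y = Cx + Du, x' = Ax + Bu from a zero initial state."""
    x = [[0.0] for _ in range(len(A))]  # column vector
    out = []
    for u in inputs:
        out.append(matmul(C, x)[0][0] + D * u)
        Ax = matmul(A, x)
        x = [[Ax[i][0] + B[i][0] * u] for i in range(len(A))]
    return out

# The two-state IIR filter used elsewhere in the slides.
A = [[0.9, 0.0], [0.0, 0.9]]
B = [[0.3], [0.2]]
C = [[2.0, 2.0]]
D = 2.0
T = [[2.0, 1.0], [1.0, 1.0]]  # arbitrary invertible matrix (determinant 1)

samples = [1.0, -1.0, 0.5, 3.0, 0.0]
original = run(A, B, C, D, samples)
transformed = run(*change_basis(A, B, C, D, T), samples)
```

The two output sequences should match up to rounding even though the internal state vectors differ at every step.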
![Page 26: Optimizing Stream Programs Using Linear State Space Analysisgroups.csail.mit.edu/commit/papers/05/streamit-cases05-slides.pdf · x’ = Ax + Bu D = y = Cx + Du Linear dataflow analysis](https://reader033.fdocuments.us/reader033/viewer/2022052000/6012241b25ca2943af6a8879/html5/thumbnails/26.jpg)
1) State Removal
• Can remove states which are:
  a. Unreachable – do not depend on input
  b. Unobservable – do not affect output
• To expose unreachable states, reduce [A | B] to a kind of row-echelon form
  – For unobservable states, reduce [Aᵀ | Cᵀ]
• Automatically finds minimal number of states
![Page 27: Optimizing Stream Programs Using Linear State Space Analysisgroups.csail.mit.edu/commit/papers/05/streamit-cases05-slides.pdf · x’ = Ax + Bu D = y = Cx + Du Linear dataflow analysis](https://reader033.fdocuments.us/reader033/viewer/2022052000/6012241b25ca2943af6a8879/html5/thumbnails/27.jpg)
State Removal Example
Original filter:

    float->float filter IIR {
        float x1, x2;
        work push 1 pop 1 {
            float u = pop();
            push(2*(x1+x2+u));
            x1 = 0.9*x1 + 0.3*u;
            x2 = 0.9*x2 + 0.2*u;
        }
    }

$$x' = \begin{bmatrix} 0.9 & 0 \\ 0 & 0.9 \end{bmatrix} x + \begin{bmatrix} 0.3 \\ 0.2 \end{bmatrix} u, \qquad y = \begin{bmatrix} 2 & 2 \end{bmatrix} x + 2u$$

Apply the change of basis

$$T = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}$$

to obtain

$$x' = \begin{bmatrix} 0.9 & 0 \\ 0 & 0.9 \end{bmatrix} x + \begin{bmatrix} 0.3 \\ 0.5 \end{bmatrix} u, \qquad y = \begin{bmatrix} 0 & 2 \end{bmatrix} x + 2u$$

x1 is unobservable, so it can be removed:

$$x' = 0.9x + 0.5u$$
$$y = 2x + 2u$$

    float->float filter IIR {
        float x;
        work push 1 pop 1 {
            float u = pop();
            push(2*(x+u));
            x = 0.9*x + 0.5*u;
        }
    }
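The state removal example can be replayed numerically to confirm that the one-state filter reproduces the two-state filter's output. A minimal Python sketch follows; the function names are hypothetical, zero initial state is assumed, and the inverse of T is written out by hand rather than computed.

```python
# C from the original filter and the hand-computed inverse of T = [[1, 0], [1, 1]].
C = [2.0, 2.0]
T_inv = [[1.0, 0.0], [-1.0, 1.0]]

# C' = C T^-1. A zero entry means the matching transformed state never
# reaches the output, i.e. it is unobservable.
C_new = [C[0] * T_inv[0][j] + C[1] * T_inv[1][j] for j in range(2)]

def iir_two_state(inputs):
    """Original filter with states x1 and x2 (assumes zero initial state)."""
    x1 = x2 = 0.0
    out = []
    for u in inputs:
        out.append(2 * (x1 + x2 + u))
        x1 = 0.9 * x1 + 0.3 * u
        x2 = 0.9 * x2 + 0.2 * u
    return out

def iir_one_state(inputs):
    """Reduced filter: the single state plays the role of x1 + x2."""
    x = 0.0
    out = []
    for u in inputs:
        out.append(2 * (x + u))
        x = 0.9 * x + 0.5 * u
    return out

samples = [1.0, 2.0, -1.0, 0.0, 4.0]
full = iir_two_state(samples)
reduced = iir_one_state(samples)
```

The first entry of `C_new` comes out as zero, matching the slide's observation that the first transformed state is unobservable.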
![Page 30: Optimizing Stream Programs Using Linear State Space Analysisgroups.csail.mit.edu/commit/papers/05/streamit-cases05-slides.pdf · x’ = Ax + Bu D = y = Cx + Du Linear dataflow analysis](https://reader033.fdocuments.us/reader033/viewer/2022052000/6012241b25ca2943af6a8879/html5/thumbnails/30.jpg)
State Removal Example
• Original filter (states x1, x2): 9 FLOPs and 12 loads/stores per output
• Reduced filter (state x): 5 FLOPs and 8 loads/stores per output
![Page 31: Optimizing Stream Programs Using Linear State Space Analysisgroups.csail.mit.edu/commit/papers/05/streamit-cases05-slides.pdf · x’ = Ax + Bu D = y = Cx + Du Linear dataflow analysis](https://reader033.fdocuments.us/reader033/viewer/2022052000/6012241b25ca2943af6a8879/html5/thumbnails/31.jpg)
2) Parameter Reduction
• Goal: convert matrix entries (parameters) to 0 or 1
• Allows static evaluation:
  – 1*x → x: eliminate 1 multiply
  – 0*x + y → y: eliminate 1 multiply, 1 add
• Algorithm (Ackerman & Bucy, 1971)
  – Also reduces matrices [A | B] and [Aᵀ | Cᵀ]
  – Attains a canonical form with few parameters
![Page 32: Optimizing Stream Programs Using Linear State Space Analysisgroups.csail.mit.edu/commit/papers/05/streamit-cases05-slides.pdf · x’ = Ax + Bu D = y = Cx + Du Linear dataflow analysis](https://reader033.fdocuments.us/reader033/viewer/2022052000/6012241b25ca2943af6a8879/html5/thumbnails/32.jpg)
Parameter Reduction Example
Before (6 FLOPs per output):

$$x' = 0.9x + 0.5u, \qquad y = 2x + 2u$$

Apply T = 2. After (4 FLOPs per output):

$$x' = 0.9x + 1u, \qquad y = 1x + 2u$$
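For a single-state filter the change of basis is just a scalar rescaling of the state, so the parameter reduction example can be reproduced in a few lines. This is an illustrative Python sketch with hypothetical helper names (`scale_state`, `run_scalar`) and an assumed zero initial state.

```python
def scale_state(a, b, c, d, t):
    """Scalar change of basis z = t*x: a' = t*a/t = a, b' = t*b, c' = c/t, d' = d."""
    return a, t * b, c / t, d

def run_scalar(a, b, c, d, inputs):
    """Simulate y = c*x + d*u, x' = a*x + b*u from a zero initial state."""
    x = 0.0
    out = []
    for u in inputs:
        out.append(c * x + d * u)
        x = a * x + b * u
    return out

before = (0.9, 0.5, 2.0, 2.0)          # x' = 0.9x + 0.5u, y = 2x + 2u
after = scale_state(*before, t=2.0)    # coefficients of u and x become 1

samples = [1.0, 0.0, 0.0, -3.0, 2.5]
out_before = run_scalar(*before, samples)
out_after = run_scalar(*after, samples)
```

With the coefficients equal to 1, a code generator could drop the two multiplications, which is where the FLOP savings on the slide come from.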
![Page 33: Optimizing Stream Programs Using Linear State Space Analysisgroups.csail.mit.edu/commit/papers/05/streamit-cases05-slides.pdf · x’ = Ax + Bu D = y = Cx + Du Linear dataflow analysis](https://reader033.fdocuments.us/reader033/viewer/2022052000/6012241b25ca2943af6a8879/html5/thumbnails/33.jpg)
3) Combining Adjacent Filters
Filter 1 maps input u to y, and Filter 2 maps y to z:

$$y = D_1 u$$
$$z = D_2 y$$

The combined filter maps u directly to z:

$$z = D_2 D_1 u = Eu, \qquad E = D_2 D_1$$
![Page 34: Optimizing Stream Programs Using Linear State Space Analysisgroups.csail.mit.edu/commit/papers/05/streamit-cases05-slides.pdf · x’ = Ax + Bu D = y = Cx + Du Linear dataflow analysis](https://reader033.fdocuments.us/reader033/viewer/2022052000/6012241b25ca2943af6a8879/html5/thumbnails/34.jpg)
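3) Combining Adjacent Filters
When Filter 1 and Filter 2 have state, the combined filter (input u, output z) is:

$$x' = \begin{bmatrix} A_1 & 0 \\ B_2 C_1 & A_2 \end{bmatrix} x + \begin{bmatrix} B_1 \\ B_2 D_1 \end{bmatrix} u$$
$$z = \begin{bmatrix} D_2 C_1 & C_2 \end{bmatrix} x + D_2 D_1 u$$

Also in paper:
- combination of parallel streams
- combination of feedback loops
- expansion of mismatching filters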
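The combined pipeline matrices follow from direct substitution. As a sketch, write $x_1$ and $x_2$ for the state vectors of Filter 1 and Filter 2 (these names are introduced here for illustration) and assume Filter 1 produces exactly as many items per firing as Filter 2 consumes:

```latex
\begin{aligned}
\text{Filter 1:}\quad & x_1' = A_1 x_1 + B_1 u, & y &= C_1 x_1 + D_1 u \\
\text{Filter 2:}\quad & x_2' = A_2 x_2 + B_2 y, & z &= C_2 x_2 + D_2 y \\[4pt]
\text{Substitute } y:\quad
  & x_2' = B_2 C_1 x_1 + A_2 x_2 + B_2 D_1 u, & z &= D_2 C_1 x_1 + C_2 x_2 + D_2 D_1 u \\[4pt]
\text{Stack } x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}:\quad
  & x' = \begin{bmatrix} A_1 & 0 \\ B_2 C_1 & A_2 \end{bmatrix} x
       + \begin{bmatrix} B_1 \\ B_2 D_1 \end{bmatrix} u,
  & z &= \begin{bmatrix} D_2 C_1 & C_2 \end{bmatrix} x + D_2 D_1 u
\end{aligned}
```

When neither filter has state, only the $D_2 D_1$ term survives, which recovers the stateless case $z = D_2 D_1 u$.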
![Page 35: Optimizing Stream Programs Using Linear State Space Analysisgroups.csail.mit.edu/commit/papers/05/streamit-cases05-slides.pdf · x’ = Ax + Bu D = y = Cx + Du Linear dataflow analysis](https://reader033.fdocuments.us/reader033/viewer/2022052000/6012241b25ca2943af6a8879/html5/thumbnails/35.jpg)
Combination Example
IIR Filter:

$$x' = 0.9x + u, \qquad y = x + 2u$$

Decimator:

$$y = \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} u_1 \\ u_2 \end{bmatrix}$$

Combined IIR / Decimator:

$$x' = 0.81x + \begin{bmatrix} 0.9 & 1 \end{bmatrix} \begin{bmatrix} u_1 \\ u_2 \end{bmatrix}, \qquad y = x + \begin{bmatrix} 2 & 0 \end{bmatrix} \begin{bmatrix} u_1 \\ u_2 \end{bmatrix}$$

• Separate filters: 8 FLOPs per output
• Combined filter: 6 FLOPs per output

As decimation factor goes to ∞, eliminate up to 75% of FLOPs.
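The combined IIR / Decimator equations can be compared against running the two filters back to back. The Python sketch below uses hypothetical function names and assumes a zero initial state and an even number of input samples; the FLOP counts in the comments are a hand tally that happens to agree with the slide, not a measurement.

```python
def iir_then_decimate(inputs):
    """IIR filter fired twice per output, with the second result discarded."""
    x = 0.0
    out = []
    for i in range(0, len(inputs) - 1, 2):
        u1, u2 = inputs[i], inputs[i + 1]
        y1 = x + 2 * u1          # 2 FLOPs
        x = 0.9 * x + u1         # 2 FLOPs
        _y2 = x + 2 * u2         # 2 FLOPs, result thrown away by the decimator
        x = 0.9 * x + u2         # 2 FLOPs
        out.append(y1)
    return out

def combined(inputs):
    """Single filter using x' = 0.81x + 0.9*u1 + u2 and y = x + 2*u1."""
    x = 0.0
    out = []
    for i in range(0, len(inputs) - 1, 2):
        u1, u2 = inputs[i], inputs[i + 1]
        out.append(x + 2 * u1)           # 2 FLOPs
        x = 0.81 * x + 0.9 * u1 + u2     # 4 FLOPs
    return out

samples = [1.0, 0.5, -2.0, 3.0, 0.0, 1.5]
separate_out = iir_then_decimate(samples)
combined_out = combined(samples)
```

The saving comes from never computing the discarded output and from folding the two state updates into one.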
![Page 37: Optimizing Stream Programs Using Linear State Space Analysisgroups.csail.mit.edu/commit/papers/05/streamit-cases05-slides.pdf · x’ = Ax + Bu D = y = Cx + Du Linear dataflow analysis](https://reader033.fdocuments.us/reader033/viewer/2022052000/6012241b25ca2943af6a8879/html5/thumbnails/37.jpg)
Combination Hazards
• Combination sometimes increases FLOPs
• Example: FFT
  – Combination results in DFT
  – Converts O(n log n) algorithm to O(n²)
• Solution: only apply where beneficial
  – Operations known at compile time
  – Using selection algorithm, FLOPs never increase
• See PLDI ’03 paper for details
![Page 38: Optimizing Stream Programs Using Linear State Space Analysisgroups.csail.mit.edu/commit/papers/05/streamit-cases05-slides.pdf · x’ = Ax + Bu D = y = Cx + Du Linear dataflow analysis](https://reader033.fdocuments.us/reader033/viewer/2022052000/6012241b25ca2943af6a8879/html5/thumbnails/38.jpg)
Results
• Subsumes combination of linear components
  – Evaluated previously [PLDI ’03]
• Applications: FIR, RateConvert, TargetDetect, Radar, FMRadio, FilterBank, Vocoder, Oversampler, DtoA
  – Removed 44% of FLOPs
  – Speedup of 120% on Pentium 4
• Results using state space analysis, speedup on Pentium 3:
  – IIR + 1:2 Decimator: 49%
  – IIR + 1:16 Decimator: 87%
![Page 39: Optimizing Stream Programs Using Linear State Space Analysisgroups.csail.mit.edu/commit/papers/05/streamit-cases05-slides.pdf · x’ = Ax + Bu D = y = Cx + Du Linear dataflow analysis](https://reader033.fdocuments.us/reader033/viewer/2022052000/6012241b25ca2943af6a8879/html5/thumbnails/39.jpg)
Ongoing Work
• Experimental evaluation
  – Evaluate real applications on embedded machines
  – In progress: MPEG2, JPEG, radar tracker
• Numerical precision constraints
  – Precision often influences choice of coefficients
  – Transformations should respect constraints
![Page 40: Optimizing Stream Programs Using Linear State Space Analysisgroups.csail.mit.edu/commit/papers/05/streamit-cases05-slides.pdf · x’ = Ax + Bu D = y = Cx + Du Linear dataflow analysis](https://reader033.fdocuments.us/reader033/viewer/2022052000/6012241b25ca2943af6a8879/html5/thumbnails/40.jpg)
Related Work
• Linear stream optimizations [Lamb et al. ’03]
  – Deals with stateless filters
• Automatic optimization of linear libraries
  – SPIRAL, FFTW, ATLAS, Sparsity
• Stream languages
  – Lustre, Esterel, Signal, Lucid, Lucid Synchrone, Brook, Spidle, Cg, Occam, Sisal, Parallel Haskell
• Common sub-expression elimination
![Page 41: Optimizing Stream Programs Using Linear State Space Analysisgroups.csail.mit.edu/commit/papers/05/streamit-cases05-slides.pdf · x’ = Ax + Bu D = y = Cx + Du Linear dataflow analysis](https://reader033.fdocuments.us/reader033/viewer/2022052000/6012241b25ca2943af6a8879/html5/thumbnails/41.jpg)
Conclusions
• Linear state space analysis: an elegant compiler IR for DSP programs
• Optimizations using state space representation:
  1. State removal
  2. Parameter reduction
  3. Combining adjacent filters
• Step towards adding efficient abstraction layers that remove the DSP expert from the design flow
http://cag.lcs.mit.edu/streamit