pipelining, parallel, retiming
4/23/2013
VLSI Programming: Lecture 2
Course 2IN35
Course: Kees van Berkel, Rudolf Mak
Lab: Kees van Berkel, Rudolf Mak, Alok Lele, Hrishikesh Salunkhe
www: http://www.win.tue.nl/~cberkel/2IN35/
Lecture 2: pipelining, retiming, J-slow, parallel
VLSI Programming: time table 2013
| date | in | hours 5-8 | out |
|---|---|---|---|
| April 23 | | introduction, DSP representations, bounds; pipelining, retiming, transposition, J-slow, unfolding | T1 + T2 |
| May 7 | T1 + T2 | unfolding (cntd), look-ahead, strength reduction; (have FPGA tools installed) FPGA + Verilog intros | T3 + T4; L1 |
| May 14 | T3 + T4 | systolic computation; FPGA lab/L1: audio filter simulation | T5 |
| May 21 | T5 | folding; FPGA lab/L2: audio filter on XUP board | L2 |
| May 28 | | DSP processors; FPGA lab/L3: sequential FIR, strength-reduced FIR | L3 |
| May 30 | | FPGA lab/L3: sequential FIR, strength-reduced FIR (cntd) | |
| June 4 | L3 | FPGA lab/L4: audio sample rate convertor; deadline report L3 | L4 |
| June 6 | | FPGA lab/L4: audio sample rate convertor (cntd) | |
| June 11 | L4 | FPGA lab/L5: audio sample rate convertor "1024x"; deadline report L4 | L5 |
| June 13 | | FPGA lab/L5: audio sample rate convertor "1024x" (cntd) | |
| June 18 | L5 | deadline report L5 | |
FPGA IC on a Xilinx XUP Board (Atlys)
[Figure label: Xilinx Spartan 6 FPGA]
Atlys board, based on Xilinx Spartan 6
[Figure label: Xilinx Spartan 6 FPGA]
Preparation for Lab work
• Prepare your notebook for lab work
• See preparation link on 2IN35 web-site
• Install the required tools and test them; 2 weeks from now (May 7), Hrishikesh and Alok will be around for Q&A
• First Lab exercises: Tue May 7
• Find a partner (team size is max 2)
Note on course literature
The VLSI Programming lectures are loosely based on:
• Keshab K. Parhi. VLSI Digital Signal Processing Systems: Design and Implementation. Wiley-Interscience, 1999.
• This book is recommended, but not mandatory
Accompanying slides can be found on:
• http://www.ece.umn.edu/users/parhi/slides.html
• http://www.win.tue.nl/~cberkel/2IN35/
Mandatory reading:
• Edward A. Lee and David G. Messerschmitt. Synchronous Data Flow. Proc. of the IEEE, Vol. 75, No. 9, Sept 1987, pp. 1235-1245.
• Keshab K. Parhi. High-Level Algorithm and Architecture Transformations for DSP Synthesis. Journal of VLSI Signal Processing, 9, 121-143 (1995), Kluwer Academic Publishers.
Outline Lecture 2
Transformations of DFGs and SFGs:
• (commuting of an SFG): lecture 1
• pipelining of a DFG: Parhi3.pdf
• transposition of an SFG: Parhi3.pdf
• retiming of a DFG: Parhi4.pdf
• K-slow transformation of a DFG: Parhi4.pdf
• unfolding of a DFG: Parhi3.pdf, Parhi5.pdf
• assignments
• car assembly line; Henry Ford [1908]
• 1914: Ts = 3 min; latency = 93 min
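The two numbers quoted for the 1914 line show how sample period and latency differ. A minimal sketch, assuming Ts is the interval between finished cars and that the line is kept full; the function and constant names are illustrative and do not come from the slides:

```python
# Figures quoted on the slide for the 1914 assembly line.
TS_MIN = 3        # sample period: one finished car every 3 minutes
LATENCY_MIN = 93  # time a single car spends on the line


def cars_in_flight(latency_min: int, ts_min: int) -> int:
    """Cars on the line at once when a new car starts every ts_min minutes."""
    return latency_min // ts_min


def throughput_per_hour(ts_min: int) -> float:
    """Finished cars per hour; this depends on Ts only, not on the latency."""
    return 60 / ts_min


print(cars_in_flight(LATENCY_MIN, TS_MIN))  # 31
print(throughput_per_hour(TS_MIN))          # 20.0
```

Pipelining a DFG has the same effect: the sample period shrinks while the latency of a single sample does not.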
Retiming and pipelining
• Review slides Parhi3.pdf
• Parhi follows a graph-theoretic approach to compute optimal pipelining/retiming
• For our purposes “moving delays around” is sufficient:
• Node retiming (Parhi4.pdf, slide 2)
• Introduction of a delay at all inputs (or all outputs)
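"Moving delays around" can be checked mechanically with the standard retiming relation w_r(e) = w(e) + r(V) - r(U) for an edge U -> V. The sketch below is only an illustration: the three-node loop and the names `retime` and `is_legal` are hypothetical, not taken from Parhi's slides.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str, int]  # (source node, destination node, number of delays)


def retime(edges: List[Edge], r: Dict[str, int]) -> List[Edge]:
    """Apply w_r(e) = w(e) + r(V) - r(U) to every edge U -> V."""
    return [(u, v, w + r.get(v, 0) - r.get(u, 0)) for u, v, w in edges]


def is_legal(edges: List[Edge]) -> bool:
    """A retiming is legal only if no edge ends up with a negative delay count."""
    return all(w >= 0 for _, _, w in edges)


# Hypothetical loop A -> B -> C -> A with 2 delays on the edge C -> A.
loop = [("A", "B", 0), ("B", "C", 0), ("C", "A", 2)]

# r(A) = -1 moves one delay from A's input edge to A's output edge.
moved = retime(loop, {"A": -1})
print(moved)                               # [('A', 'B', 1), ('B', 'C', 0), ('C', 'A', 1)]
print(is_legal(moved))                     # True
print(is_legal(retime(loop, {"B": -1})))   # False: B has no delay on its input edge
```

The number of delays around the loop stays at 2, which is why retiming cannot change the loop bound.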
Parhi ’95, Fig 3a
[Figure: DFG; node labels 2, 2, 2, 1, 1, 1, 1]
Critical path is 10 time units long
(transposed version: 8 time units)
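The critical path is the longest chain of nodes connected by edges without delays. A minimal sketch of that computation follows; the graph is a hypothetical 3-tap filter fragment with multiplier time 2 and adder time 1, not a reconstruction of Fig 3a, and the zero-delay part of the graph is assumed to be acyclic.

```python
from functools import lru_cache
from typing import Dict, List, Tuple

Edge = Tuple[str, str, int]  # (source node, destination node, number of delays)


def critical_path(times: Dict[str, int], edges: List[Edge]) -> int:
    """Longest total node time along any path made only of zero-delay edges."""
    succ: Dict[str, List[str]] = {n: [] for n in times}
    for u, v, delays in edges:
        if delays == 0:  # an edge carrying a delay ends the combinational path
            succ[u].append(v)

    @lru_cache(maxsize=None)
    def longest_from(node: str) -> int:
        # Assumes the zero-delay subgraph has no cycles.
        return times[node] + max((longest_from(s) for s in succ[node]), default=0)

    return max(longest_from(n) for n in times)


# Hypothetical fragment: three multipliers (time 2) feeding a chain of two adders (time 1).
times = {"m0": 2, "m1": 2, "m2": 2, "a1": 1, "a2": 1}
fir_dfg = [("m0", "a1", 0), ("m1", "a1", 0), ("a1", "a2", 0), ("m2", "a2", 0)]

# Pipelining: one delay on every edge of the cutset in front of adder a2.
pipelined = [("m0", "a1", 0), ("m1", "a1", 0), ("a1", "a2", 1), ("m2", "a2", 1)]

print(critical_path(times, fir_dfg))    # 4
print(critical_path(times, pipelined))  # 3
```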
Parhi ’95, Fig 3a / retiming step 1
Critical path is 10 time units long
Parhi ’95, Fig 3a / retiming step 2
Critical path is 10 time units long
Parhi ’95, Fig 3a / retiming step 3
Critical path is 7 time units long
Parhi ’95, Fig 3a / retiming step 4
Critical path is 7 time units long
Parhi ’95, Fig 3a / retiming step 5
Critical path is 4 time units long
Parhi ’95, Fig 3a / retiming step 6
Critical path is 4 time units long
Parhi ’95, Fig 3a / retiming step 7
Critical path is 3 time units long
Parhi ’95, Fig 3a / retiming step 8
Critical path is 3 time units long
Parhi ’95, Fig 3a / retiming step 9
Critical path is 2 time units long
Parhi ’95, Fig 3a after retiming = Fig 3b
Critical path is 2 time units long
Unfolding, L=2
• Parhi’s paper, Fig 1/2, paper p123/124
• y(n) = a x(n) + b x(n-1) + c x(n-2)
• y(2k) = a x(2k) + b x(2k-1) + c x(2k-2)
• y(2k+1) = a x(2k+1) + b x(2k) + c x(2k-1)
• Rewrite all indices in equations to the form (L(k - i) + j), with 0 ≤ j < L
• y(2k) = a x(2k) + b x(2(k-1)+1) + c x(2(k-1))
• y(2k+1) = a x(2k+1) + b x(2k) + c x(2(k-1)+1)
• = Fig 2
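The index rewriting step is a floored division. A small sketch, where `rewrite_index` is a hypothetical helper name: for output phase j and a tap delayed by d samples, it returns the pair (i, j') such that L*k + j - d = L*(k - i) + j'.

```python
from typing import Tuple


def rewrite_index(L: int, j: int, d: int) -> Tuple[int, int]:
    """Return (i, j2) with L*k + j - d == L*(k - i) + j2 and 0 <= j2 < L."""
    q, j2 = divmod(j - d, L)  # Python's divmod floors, so 0 <= j2 < L
    return -q, j2


# L = 2: y(2k) uses x(2k-1) and x(2k-2); y(2k+1) uses x(2k) and x(2k-1).
print(rewrite_index(2, 0, 1))  # (1, 1): x(2k-1) = x(2(k-1)+1)
print(rewrite_index(2, 0, 2))  # (1, 0): x(2k-2) = x(2(k-1))
print(rewrite_index(2, 1, 1))  # (0, 0): x(2k)   stays x(2k)
print(rewrite_index(2, 1, 2))  # (1, 1): x(2k-1) = x(2(k-1)+1)
```

Here i is the number of iteration delays on the corresponding edge of the unfolded DFG, and j' selects which copy of the source node the edge starts from.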
Unfolding, L=3
• Same FIR
• y(3k) = a x(3k) + b x(3k-1) + c x(3k-2)
• y(3k+1) = a x(3k+1) + b x(3k) + c x(3k-1)
• y(3k+2) = a x(3k+2) + b x(3k+1) + c x(3k)
• Rewrite all indices in equations to the form (L(k - i) + j), with 0 ≤ j < L
• y(3k) = a x(3k) + b x(3(k-1)+2) + c x(3(k-1)+1)
• y(3k+1) = a x(3k+1) + b x(3k) + c x(3(k-1)+2)
• y(3k+2) = a x(3k+2) + b x(3k+1) + c x(3k)
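One way to convince yourself that the 3-unfolded equations compute the same filter is to run both forms on the same input. The sketch below assumes x(n) = 0 for n < 0 and an input length that is a multiple of 3; the function names are illustrative.

```python
from typing import List


def sample(x: List[int], n: int) -> int:
    """x(n), taken as 0 before the start of the signal."""
    return x[n] if n >= 0 else 0


def fir(x: List[int], a: int, b: int, c: int) -> List[int]:
    """Reference form: y(n) = a x(n) + b x(n-1) + c x(n-2)."""
    return [a * sample(x, n) + b * sample(x, n - 1) + c * sample(x, n - 2)
            for n in range(len(x))]


def fir_unfolded3(x: List[int], a: int, b: int, c: int) -> List[int]:
    """Three outputs per iteration k, following the rewritten equations for L = 3."""
    assert len(x) % 3 == 0, "sketch assumes the input length is a multiple of 3"
    y: List[int] = []
    for k in range(len(x) // 3):
        y.append(a * sample(x, 3*k) + b * sample(x, 3*(k-1) + 2) + c * sample(x, 3*(k-1) + 1))
        y.append(a * sample(x, 3*k + 1) + b * sample(x, 3*k) + c * sample(x, 3*(k-1) + 2))
        y.append(a * sample(x, 3*k + 2) + b * sample(x, 3*k + 1) + c * sample(x, 3*k))
    return y


x = [1, 2, 3, 4, 5, 6]
print(fir(x, 2, 3, 5) == fir_unfolded3(x, 2, 3, 5))  # True
```

The three statements in the loop body do not depend on each other, so in hardware they can be evaluated in parallel during one iteration.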
Parhi 5, slide 2
• Original program: y(n) = a x(n) + b y(n-2)
• 2-unfolded version:
• y(2k) = a x(2k) + b y(2k-2)
• y(2k+1) = a x(2k+1) + b y(2k-1)
• Rewrite all indices in equations to the form (L(k - i) + j), with 0 ≤ j < L
• 2-unfolded version:
• y(2k) = a x(2k) + b y(2(k-1))
• y(2k+1) = a x(2k+1) + b y(2(k-1)+1)
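In the rewritten form, each output phase depends only on its own value from the previous iteration, so the 2-unfolded program consists of two independent first-order loops. A minimal sketch, assuming y(n) = 0 for n < 0 and an even input length; the function names are illustrative.

```python
from typing import List


def iir(x: List[int], a: int, b: int) -> List[int]:
    """Reference form: y(n) = a x(n) + b y(n-2)."""
    y: List[int] = []
    for n in range(len(x)):
        prev = y[n - 2] if n >= 2 else 0
        y.append(a * x[n] + b * prev)
    return y


def iir_unfolded2(x: List[int], a: int, b: int) -> List[int]:
    """Two outputs per iteration k; the even and odd loops never exchange data."""
    assert len(x) % 2 == 0, "sketch assumes an even input length"
    y_even_prev = 0  # holds y(2(k-1))
    y_odd_prev = 0   # holds y(2(k-1)+1)
    y: List[int] = []
    for k in range(len(x) // 2):
        y_even = a * x[2*k] + b * y_even_prev
        y_odd = a * x[2*k + 1] + b * y_odd_prev
        y += [y_even, y_odd]
        y_even_prev, y_odd_prev = y_even, y_odd
    return y


x = [1, 2, 3, 4, 5, 6]
print(iir(x, 2, 3) == iir_unfolded2(x, 2, 3))  # True
```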
Parhi 5, slide 3 (Fig 5.3, pp 123)
• Original program: v(n) = u(n-37)
• 4-unfolded version:
• v(4k) = u(4k-37)
• v(4k+1) = u(4k-36)
• v(4k+2) = u(4k-35)
• v(4k+3) = u(4k-34)
• 4-unfolded, with rewritten indices:
• v(4k) = u(4(k-10)+3)
• v(4k+1) = u(4(k-9))
• v(4k+2) = u(4(k-9)+1)
• v(4k+3) = u(4(k-9)+2)
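The same result follows from the edge rule of the unfolding algorithm: an edge U -> V with w delays becomes J edges U_i -> V_((i+w) mod J), each carrying floor((i+w)/J) delays. A short sketch, with `unfold_edge` as a hypothetical helper name:

```python
from typing import List, Tuple


def unfold_edge(w: int, J: int) -> List[Tuple[int, int, int]]:
    """J-unfold one edge U -> V that carries w delays.

    Each triple (i, dest, delays) means U_i -> V_dest with `delays` delays.
    """
    return [(i, (i + w) % J, (i + w) // J) for i in range(J)]


for i, dest, delays in unfold_edge(37, 4):
    print(f"u{i} -> v{dest}: {delays} delays")
# u0 -> v1: 9 delays
# u1 -> v2: 9 delays
# u2 -> v3: 9 delays
# u3 -> v0: 10 delays
```

The delays on the four unfolded edges add up to 37 again, so unfolding redistributes the original delays without adding any.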
Parhi5, slide 4 (Fig 5.4, pp 123)
• v(n) = u(n-1) + t(n-6) + v(n-12)
• v(3k) = u(3k-1) + t(3k-6) + v(3k-12)
• v(3k+1) = u(3k) + t(3k-5) + v(3k-11)
• v(3k+2) = u(3k+1) + t(3k-4) + v(3k-10)
• v(3k) = u(3(k-1)+2) + t(3(k-2)) + v(3(k-4))
• v(3k+1) = u(3k) + t(3(k-2)+1) + v(3(k-4)+1)
• v(3k+2) = u(3k+1) + t(3(k-2)+2) + v(3(k-4)+2)
• = Fig 5.4b
Parhi5, slide 6 (Fig 5.6, pp 129)
• u(n) = p(n) + (s*u(n-3) + t*u(n-2))
• u(2k) = p(2k) + (s*u(2k-3) + t*u(2k-2))
• u(2k+1) = p(2k+1) + (s*u(2k-2) + t*u(2k-1))
• u(2k) = p(2k) + (s*u(2(k-2)+1) + t*u(2(k-1)))
• u(2k+1) = p(2k+1) + (s*u(2(k-1)) + t*u(2(k-1)+1))
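Here the two phases are coupled: u(2k) reads the odd phase from two iterations back and the even phase from one iteration back. The sketch below runs the rewritten equations against the original recurrence, assuming u(n) = 0 for n < 0 and an even input length; the dictionaries `even` and `odd` are illustrative stand-ins for the delay registers.

```python
from typing import Dict, List


def recur(p: List[int], s: int, t: int) -> List[int]:
    """Reference form: u(n) = p(n) + (s*u(n-3) + t*u(n-2))."""
    u: List[int] = []
    for n in range(len(p)):
        u3 = u[n - 3] if n >= 3 else 0
        u2 = u[n - 2] if n >= 2 else 0
        u.append(p[n] + (s * u3 + t * u2))
    return u


def recur_unfolded2(p: List[int], s: int, t: int) -> List[int]:
    """Two outputs per iteration k, following the rewritten equations for L = 2."""
    assert len(p) % 2 == 0, "sketch assumes an even input length"
    even: Dict[int, int] = {}  # even[k] = u(2k)
    odd: Dict[int, int] = {}   # odd[k]  = u(2k+1)
    out: List[int] = []
    for k in range(len(p) // 2):
        e = p[2*k] + (s * odd.get(k - 2, 0) + t * even.get(k - 1, 0))
        o = p[2*k + 1] + (s * even.get(k - 1, 0) + t * odd.get(k - 1, 0))
        even[k], odd[k] = e, o
        out += [e, o]
    return out


p = [1, 2, 3, 4, 5, 6, 7, 8]
print(recur(p, 2, 3) == recur_unfolded2(p, 2, 3))  # True
```

Both outputs of iteration k use only values from earlier iterations, so they can be computed side by side within one iteration.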
FIR assignment
• Consider FIR: y(n) = a*x(n) + b*x(n-1) + c*x(n-3)
• Assume add and multiply times of 2 ns and 5 ns, respectively.
1. Draw DFG of FIR, calculate throughput.
2. Pipeline and retime FIR for maximal throughput.
3. Unfold FIR J=2; draw the unfolded DFG. Throughput?
4. Pipeline and retime unfolded FIR; draw DFG. Throughput?
5. Same for J=3 (draw DFG), and J=16 (no need to draw DFGs). Throughput?
• Return deadline: Tuesday May 7, 13:45
IIR assignment
• Consider IIR: y(n) = x(n) + a*y(n-2)
• Assume add and multiply times of 2 ns and 5 ns, respectively.
1. Draw DFG of IIR, calculate throughput.
2. Pipeline and retime IIR for maximal throughput.
3. Unfold IIR J=2; draw the unfolded DFG. Throughput?
4. Pipeline and retime unfolded IIR; draw DFG. Throughput?
5. Same for J=3 (draw DFG), and J=16 (no need to draw DFGs). Throughput?
• Return deadline: Tuesday May 7, 13:45
VLSI Programming: Feb 28
• Parhi,
• More unfolding, parallelism
• Strength reduction
THANK YOU