Asynchronous Circuits
![Page 1: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/1.jpg)
Asynchronous Circuits
Jordi Cortadella, Universitat Politècnica de Catalunya, Barcelona
Collège de France, May 14th, 2013
![Page 2: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/2.jpg)
Goals
• Convince ourselves that:
– designing an asynchronous circuit is easy
– synchronous and asynchronous circuits are similar
– asynchronous circuits bring new advantages
• Not to discourage designers with exotic and sophisticated asynchronous schemes
Collège de France 2013 Asynchronous circuits 2
![Page 3: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/3.jpg)
Clocking
Nvidia Kepler™ GK110
28 nm, 7.1B transistors, 550 mm², 2688 CUDA cores, base clock: 836 MHz, memory clock: 6 GHz
• How to distribute the clock?
• How to determine the clock frequency?
• How to implement robust communications?
• How to reduce and manage energy?
![Page 4: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/4.jpg)
![Page 5: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/5.jpg)
Synchronous circuits
![Page 6: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/6.jpg)
Synchronous circuit
[Figure labels: PLL, flip-flops, combinational logic, flip-flops]
![Page 7: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/7.jpg)
Synchronous circuit
[Figure labels: PLL, CL, paths 1 and 2]
Two competing paths:
• Launching path
• Capturing path
Launching path < Capturing path + Period
CLKtree + CL < CLKtree + Period
CL < Period (no clock skew)
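The timing condition on this slide can be checked numerically. The following is a minimal sketch, assuming hypothetical delay values in nanoseconds; the function and parameter names are illustrative and do not come from the slides.

```python
def meets_setup(launch_clk_tree, comb_logic, capture_clk_tree, period):
    """Return True when: launching path < capturing path + period."""
    launching_path = launch_clk_tree + comb_logic  # CLKtree + CL
    capturing_path = capture_clk_tree              # CLKtree
    return launching_path < capturing_path + period


# Hypothetical delays in ns. With equal clock-tree delays (no skew)
# the check reduces to CL < Period.
ok = meets_setup(launch_clk_tree=0.3, comb_logic=0.9,
                 capture_clk_tree=0.3, period=1.2)
too_slow = meets_setup(launch_clk_tree=0.3, comb_logic=1.3,
                       capture_clk_tree=0.3, period=1.2)
```

When the two clock-tree delays differ, the difference (the skew) either eats into or adds to the time available for the combinational logic.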
![Page 8: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/8.jpg)
Source-synchronous
[Figure labels: CLKgen, matched delay (×3), launching path, capturing path]
• No global clock required
• More tolerance to PVT variations
• Period > longest combinational path
• Good for acyclic pipelines
![Page 9: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/9.jpg)
Source-synchronous with forks and joins
[Figure labels: CLKgen, ?]
How to synchronize incoming events?
![Page 10: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/10.jpg)
C element (Muller 1959)
[Figure labels: inputs A, B; output C; waveforms of A, B and C]
A B C
0 0 0
0 1 C
1 0 C
1 1 1
(A "C" in the output column means that the output keeps its previous value.)
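The truth table can be modeled as a small state-holding element. This is a behavioral sketch only, assuming an initial output of 0; the class name `CElement` is illustrative.

```python
class CElement:
    """Behavioral model of a 2-input Muller C element."""

    def __init__(self, initial=0):
        self.c = initial  # assumed reset value of the output

    def update(self, a, b):
        # The output follows the inputs when they agree; otherwise it holds.
        if a == b:
            self.c = a
        return self.c


gate = CElement()
outputs = [gate.update(a, b)
           for a, b in [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]]
```

The output rises only after both inputs have risen and falls only after both have fallen, which is what makes the element useful for synchronizing events.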
![Page 11: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/11.jpg)
C element (Muller 1959)
[Figure labels: inputs A, B; output C; MAJ (majority gate)]
(many implementations exist)
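One common implementation feeds the output of a majority gate back as its third input. The sketch below assumes that construction (the transcript only preserves the MAJ label) and checks it exhaustively against the C element truth table.

```python
from itertools import product


def majority(x, y, z):
    """3-input majority gate."""
    return (x & y) | (y & z) | (x & z)


def c_element_table(a, b, c_prev):
    """C element truth table: follow the inputs when they agree, else hold."""
    return a if a == b else c_prev


# Assumption: the third majority input is the fed-back output C.
matches = all(
    majority(a, b, c_prev) == c_element_table(a, b, c_prev)
    for a, b, c_prev in product((0, 1), repeat=3)
)
```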
![Page 12: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/12.jpg)
Multi-input C element
[Figure: six 2-input C elements combine the inputs a1 … a7 into a single output c]
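A multi-input C element can be composed from 2-input ones: n inputs need n - 1 elements, which matches the six elements and seven inputs on the slide. The sketch below uses a chain rather than a tree for brevity, and assumes that inputs keep their value until the output has switched (the usual handshake discipline); the class name `CChain` is illustrative.

```python
def c2(a, b, prev):
    """2-input C element: follow the inputs when they agree, else hold."""
    return a if a == b else prev


class CChain:
    """n-input C element built from n - 1 two-input C elements in a chain."""

    def __init__(self, n_inputs):
        self.state = [0] * (n_inputs - 1)  # one stored output per element

    def update(self, inputs):
        acc = inputs[0]
        for i, x in enumerate(inputs[1:]):
            self.state[i] = c2(acc, x, self.state[i])
            acc = self.state[i]
        return acc


chain = CChain(7)
trace = [
    chain.update([1, 0, 0, 0, 0, 0, 0]),  # one input up: output stays 0
    chain.update([1] * 7),                # all inputs up: output rises
    chain.update([1, 1, 1, 0, 1, 1, 1]),  # one input down: output holds 1
    chain.update([0] * 7),                # all inputs down: output falls
]
```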
![Page 13: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/13.jpg)
Completion detection
![Page 14: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/14.jpg)
Completion detection
[Figure labels: CLKgen, fixed delay]
The fixed delay must be longer than the worst-case logic delay (plus variability)
Q: could we detect when a computation has completed ASAP?
![Page 15: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/15.jpg)
Delay-insensitive codes: Dual Rail
• Dual rail: every bit encoded with two signals
A.t A.f A
0 0 Spacer
0 1 0
1 0 1
1 1 Not used
[Waveform of A.t and A.f for the sequence A = 1, SP, 0, SP, 1, SP, 1, SP]
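The encoding table translates directly into a pair of helper functions. This is a minimal sketch; the names `encode`, `decode` and `SPACER` are illustrative, and a spacer is decoded as `None`.

```python
SPACER = (0, 0)  # (A.t, A.f) = (0, 0): no data


def encode(bit):
    """Return the rail pair (A.t, A.f) for a bit value."""
    return (1, 0) if bit else (0, 1)


def decode(rails):
    """Return 0 or 1 for a valid code word, None for the spacer."""
    if rails == SPACER:
        return None
    if rails == (1, 1):
        raise ValueError("code word (1, 1) is not used")
    return rails[0]  # A.t carries the value once the word is valid


# The sequence from the waveform: 1, SP, 0, SP, 1, SP, 1, SP
stream = [encode(1), SPACER, encode(0), SPACER,
          encode(1), SPACER, encode(1), SPACER]
decoded = [decode(r) for r in stream]
```

Because every data word is separated by a spacer, a receiver can tell from the rails alone when a new value has arrived.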
![Page 16: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/16.jpg)
Dual-Rail AND gate
A B C
SP SP SP
0 - 0
- 0 0
SP 1 SP
1 SP SP
1 1 1
[Figure labels: inputs A.t, A.f, B.t, B.f; outputs C.t, C.f]
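One gate-level realization consistent with this table uses an AND gate for the true rail and an OR gate for the false rail. The schematic on the slide is not recoverable from the transcript, so the sketch below is an assumption that is only checked against the table rows.

```python
SP, ZERO, ONE = (0, 0), (0, 1), (1, 0)  # rail pairs are (t, f)


def dual_rail_and(a, b):
    """Dual-rail AND: C.t = A.t AND B.t, C.f = A.f OR B.f (assumed mapping)."""
    a_t, a_f = a
    b_t, b_f = b
    return (a_t & b_t, a_f | b_f)


# Rows of the table: (A, B) -> C
table = {
    (SP, SP): SP,
    (ZERO, SP): ZERO,   # "0 -": a 0 on A decides the output early
    (SP, ZERO): ZERO,   # "- 0": a 0 on B decides the output early
    (SP, ONE): SP,
    (ONE, SP): SP,
    (ONE, ONE): ONE,
}
consistent = all(dual_rail_and(a, b) == c for (a, b), c in table.items())
```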
![Page 17: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/17.jpg)
Dual-Rail Inverter
A Z
SP SP
0 1
1 0
[Figure labels: A.t, A.f, Z.t, Z.f]
![Page 18: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/18.jpg)
Dual-Rail AND/OR gate
[Figure: a dual-rail AND gate (A.t, A.f, B.t, B.f; C.t, C.f) and the same gate with the true and false rails swapped (A.f, A.t, B.f, B.t; C.f, C.t)]
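Swapping the true and false rails of a dual-rail signal inverts it, so an OR gate can be obtained from an AND gate by swapping the rails on every input and on the output (De Morgan). The sketch below assumes the AND/OR rail mapping used for the AND gate (AND on the true rails, OR on the false rails); the function names are illustrative.

```python
SP, ZERO, ONE = (0, 0), (0, 1), (1, 0)  # rail pairs are (t, f)


def dual_rail_inv(a):
    """Dual-rail inverter: swap the two rails (no gates needed)."""
    t, f = a
    return (f, t)


def dual_rail_and(a, b):
    """Assumed mapping: C.t = A.t AND B.t, C.f = A.f OR B.f."""
    return (a[0] & b[0], a[1] | b[1])


def dual_rail_or(a, b):
    """OR built from AND with all rails swapped."""
    return dual_rail_inv(dual_rail_and(dual_rail_inv(a), dual_rail_inv(b)))


results = {
    "0 or 0": dual_rail_or(ZERO, ZERO),
    "0 or 1": dual_rail_or(ZERO, ONE),
    "1 or SP": dual_rail_or(ONE, SP),   # a 1 decides the output early
    "SP or 0": dual_rail_or(SP, ZERO),  # still waiting for the other input
}
```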
![Page 19: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/19.jpg)
Dual rail: completion detection
[Figure: a dual-rail logic block whose signals all start at the spacer 00 and then take valid code words 01 or 10]
![Page 20: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/20.jpg)
Dual rail: completion detection
[Figure: the outputs of the dual-rail logic feed a completion detection tree; a C element produces the signal done]
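Completion detection can be sketched as an OR gate per rail pair (is this bit valid?) followed by a multi-input C element over all the validity signals. This is a behavioral model under that assumption; the function names are illustrative.

```python
def c_element(inputs, prev):
    """Multi-input C element: switch only when all inputs agree."""
    if all(x == 1 for x in inputs):
        return 1
    if all(x == 0 for x in inputs):
        return 0
    return prev


def completion(word, prev_done):
    """word is a list of (t, f) rail pairs; returns the new value of done."""
    valid = [t | f for t, f in word]  # 1 when the pair holds a code word
    return c_element(valid, prev_done)


done = 0
trace = []
for word in [
    [(0, 0), (0, 0), (0, 0)],  # all spacers
    [(0, 1), (0, 0), (1, 0)],  # some bits still computing
    [(0, 1), (1, 0), (1, 0)],  # every bit valid: computation completed
    [(0, 0), (1, 0), (0, 0)],  # returning to spacer, not finished yet
    [(0, 0), (0, 0), (0, 0)],  # all spacers again
]:
    done = completion(word, done)
    trace.append(done)
```

The signal done rises as soon as the last bit becomes valid, so no worst-case fixed delay is needed.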
![Page 21: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/21.jpg)
Dual rail: completion detection
[Figure labels: AND, OR, INV, AND, CLKgen]
![Page 22: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/22.jpg)
Dual rail: completion detection
[Figure labels: AND, OR, INV, AND, C, C]
![Page 23: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/23.jpg)
Single rail data vs. dual rail
Some back-of-the-envelope estimations:
               Single rail   Dual rail
Area           1             2
Delay          1             << 1
Static power   1             2
Dynamic power  < 0.2         2
Dual rail:
• Good for speed
• Large area
• High power consumption
![Page 24: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/24.jpg)
Handshaking
![Page 25: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/25.jpg)
Handshaking
[Figure labels: CLKgen, unknown delay]
Assume that the source module can provide data at any rate:
• When should the CLK generator send an event if the internal delays of the circuit are unknown?
Solution: handshaking
![Page 26: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/26.jpg)
Handshaking
[Figure labels: "I have data", "I want data", Data, Request, Acknowledge]
![Page 27: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/27.jpg)
Asynchronous elastic pipeline
[Figure: a chain of four C elements between ReqIn/AckIn and ReqOut/AckOut]
• David Muller’s pipeline (late 50’s)
• Sutherland’s Micropipelines (Turing Award lecture, 1989)
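The behavior of such a pipeline can be explored with a small simulation. The sketch below assumes the usual textbook stage equation, c[i] = C(c[i-1], NOT c[i+1]), since the transcript only preserves the C elements and the Req/Ack labels. Here `req_in` is the request from the left environment and `ack_out` the acknowledge on the output channel; both names are illustrative.

```python
def c_element(a, b, prev):
    """2-input C element: follow the inputs when they agree, else hold."""
    return a if a == b else prev


def settle(stages, req_in, ack_out):
    """Update the stage outputs in place until nothing changes."""
    changed = True
    while changed:
        changed = False
        for i in range(len(stages)):
            left = req_in if i == 0 else stages[i - 1]
            right = ack_out if i == len(stages) - 1 else stages[i + 1]
            new = c_element(left, 1 - right, stages[i])
            if new != stages[i]:
                stages[i] = new
                changed = True
    return stages


# The right-hand environment never acknowledges (ack_out stays 0),
# while the left-hand environment keeps toggling its request.
stages = [0, 0, 0, 0]
history = []
for req_in in (1, 0, 1, 0, 1):
    history.append(list(settle(stages, req_in, ack_out=0)))
```

With a stalled consumer the events pile up as an alternating pattern and the first stage stops following the request: the pipeline is elastic and applies back-pressure without any global clock.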
![Page 28: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/28.jpg)
Multiple inputs and outputs
![Page 29: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/29.jpg)
Multiple inputs and outputs
[Figure label: delay]
![Page 30: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/30.jpg)
Channel-based communication
• A channel contains data and handshake wires
[Figure labels: Data, Req, Ack]
![Page 31: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/31.jpg)
Two-phase protocol
• Every edge is active
• It may require double-edge triggered flip-flops or pulse generators
[Waveform: Req, Ack and Data (Data 1, Data 2, Data 3), with each data transfer marked]
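In the two-phase protocol each transfer costs one transition on Req and one on Ack. The following is an illustrative event trace generator, not a timing-accurate model; the function name and the trace format are assumptions.

```python
def two_phase_transfers(items):
    """Return (signal, new_level, data) events for a two-phase handshake."""
    req = ack = 0
    trace = []
    for item in items:
        req ^= 1                           # sender: data is valid, toggle Req
        trace.append(("req", req, item))
        ack ^= 1                           # receiver: data stored, toggle Ack
        trace.append(("ack", ack, item))
    return trace


trace = two_phase_transfers(["Data 1", "Data 2", "Data 3"])
```

Rising and falling edges carry the same meaning, which is why the storage elements must react to both edges or be driven through pulse generators.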
![Page 32: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/32.jpg)
Four-phase protocol
• Valid data on the active edge of Req
• Req/Ack must return to zero before the next transfer
• Different variations of the 4-phase protocol exist
[Waveform: Req, Ack and Data (Data 1, Data 2, Data 3), with each data transfer marked]
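In the four-phase protocol each transfer takes four transitions, because Req and Ack return to zero before the next transfer. The sketch below shows one variant only (data valid on the rising edge of Req), as an illustrative event trace; the function name and the trace format are assumptions.

```python
def four_phase_transfers(items):
    """Return (signal, new_level, data) events for a four-phase handshake."""
    trace = []
    for item in items:
        trace.append(("req", 1, item))  # Req+: data is valid
        trace.append(("ack", 1, item))  # Ack+: data has been stored
        trace.append(("req", 0, None))  # Req-: return to zero
        trace.append(("ack", 0, None))  # Ack-: ready for the next transfer
    return trace


trace = four_phase_transfers(["Data 1", "Data 2", "Data 3"])
```

The return-to-zero phase costs extra transitions per transfer, but only one edge of Req is active, so ordinary level-sensitive latches can be used.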
![Page 33: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/33.jpg)
How to memorize?
[Figure labels: combinational logic, latches (L), delay, C elements (C), ?]
2-phase or 4-phase?
![Page 34: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/34.jpg)
How to memorize?
[Figure labels: combinational logic, latches (L), delay, C elements (C), pulse generator]
2-phase
![Page 35: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/35.jpg)
How to memorize?
[Figure labels: combinational logic, latches (L), delay, C elements (C)]
4-phase
![Page 36: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/36.jpg)
Performance analysis
![Page 37: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/37.jpg)
Ring oscillators
[Figure labels: C elements; nodes numbered 1 to 7]
• Every ring requires an odd number of inverters
• The cycle period is determined by the slowest ring
• The cycle period is adapted to the operating conditions (temperature, voltage)
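The second and third bullets can be illustrated with a toy calculation. This sketch makes strong simplifying assumptions that are not in the slides: each ring is described by a list of hypothetical gate delays in picoseconds, its latency is their sum, and the operating conditions are lumped into a single `derating` factor.

```python
def cycle_period(rings, derating=1.0):
    """Period set by the slowest ring, scaled by the operating conditions."""
    slowest = max(sum(delays) for delays in rings.values())
    return derating * slowest


# Hypothetical gate delays in ps for two rings of the control circuit.
rings = {
    "ring_a": [100, 120, 110],
    "ring_b": [100, 200, 150, 120, 110],
}
nominal = cycle_period(rings)                  # set by ring_b
slow_corner = cycle_period(rings, derating=1.25)
```

Because the gates of the rings slow down together with the gates of the datapath, the period stretches automatically at low voltage or high temperature.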
![Page 38: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/38.jpg)
Why asynchronous?
![Page 39: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/39.jpg)
Modularity
• Time-independent functional composability
– Performance may be affected (but not functionality)
[Figure labels: A, B, B’, Data, Req, Ack]
![Page 40: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/40.jpg)
Tracking variability
[Figure label: matched delay]
![Page 41: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/41.jpg)
Tracking variability
[Plot: delay at the best, typ and worst corners for the multi-corner matched delay and the critical paths]
Good correlation for:
• Process variability (systematic)
• Global voltage fluctuations
• Temperature
• Aging (partially)
![Page 42: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/42.jpg)
Margins
[Figure: breakdown of the cycle period]
Rigid Clocks: gate and wire delays (typ), P, V, T, Aging, PLL Jitter, Skew
Elastic Clocks: gate and wire delays (typ), P, V, T, Aging, Skew
Margin reduction: Speed-up / Power savings
![Page 43: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/43.jpg)
Clock elasticity
[Figure: with a rigid clock, each cycle period contains computation time plus wasted time; with an elastic clock, the cycle period follows the computation time]
![Page 44: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/44.jpg)
Voltage scaling and power savings
[Plot annotations: -24%, -14%]
3 ARM926 cores on the same die
![Page 45: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/45.jpg)
Design Automation
![Page 46: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/46.jpg)
Design automation paradigms
• Synthesis of asynchronous controllers
– Logic synthesis from Petri nets or asynchronous FSMs
• Syntax-directed translation
– Correct-by-construction composition of handshake components
• De-synchronization
– Automatic transformation from synchronous to asynchronous
![Page 47: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/47.jpg)
Synthesis of asynchronous controllers
[Figure: a controller with signals DSr, LDS, LDTACK, D, DTACK and its specification as a graph over the transitions DSr+, LDS+, LDTACK+, D+, DTACK+, DSr-, D-, DTACK-, LDS-, LDTACK-]
![Page 48: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/48.jpg)
Synthesis of asynchronous controllers
[Figure: the same graph of transitions (DSr+, LDS+, LDTACK+, D+, DTACK+, DSr-, D-, DTACK-, LDS-, LDTACK-) next to a circuit with signals DSr, LDS, LDTACK, D, DTACK]
Example: Petrify
![Page 49: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/49.jpg)
Syntax-directed translation
P = (A || B) ; C
![Page 50: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/50.jpg)
Syntax-directed translation
P = (A || B) ; C
[Figure labels: seq, par, A, B, C, A || B]
![Page 51: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/51.jpg)
Syntax-directed translation
P = (A || B) ; C
[Figure labels: seq, par, A, B, C]
![Page 52: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/52.jpg)
Syntax-directed translation
P = (A ; B)
[Figure labels: seq, A, B]
![Page 53: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/53.jpg)
Syntax-directed translation
c := a + b
[Figure labels: +, a, b, c]
![Page 54: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/54.jpg)
Syntax-directed translation
[Figure: handshake circuit for the program below; labels include SEQ, do, MUX, DMX, x, y, <>, <, - and out]
int = type [0..255]
& gcd: main proc (in? chan <<int,int>> & out! chan int)
begin x, y: var int
| forever do in?<<x,y>>
  ; do x <> y then if x < y then y:=y-x else x:=x-y fi od
  ; out!x od
end
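For readers unfamiliar with the notation, the inner loop of the program is the subtractive form of Euclid's algorithm. The following Python sketch mirrors only that loop, not the channels or the handshake circuit. The guard for non-positive inputs is an addition of this sketch: the loop as written would not terminate if exactly one input were 0.

```python
def gcd_subtractive(x, y):
    """Greatest common divisor by repeated subtraction."""
    if x <= 0 or y <= 0:
        raise ValueError("both inputs must be positive")
    while x != y:      # do x <> y then ...
        if x < y:
            y = y - x  # y := y - x
        else:
            x = x - y  # x := x - y
    return x           # out!x


example = gcd_subtractive(12, 18)
```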
Sources:
J. Kessels and A. Peeters. DESCALE: A Design Experiment for a SmartCard Application Consuming Low Energy, in Principles of Asynchronous Circuit Design: A Systems Perspective, J. Sparsø and S. Furber (Eds.), Kluwer Academic Publishers, 2001.
P. A. Beerel, R. O. Ozdag and M. Ferretti. A Designer’s Guide to Asynchronous VLSI, Cambridge University Press, 2010.
![Page 55: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/55.jpg)
De-synchronization
• Strategy: substitute the clock tree by local clocks and handshakes
• Combinational logic and latches are not modified
• More tolerance to variability
– Similar area, less power and/or more speed
• Cortadella, Kondratyev, Lavagno and Sotiriou. Desynchronization: Synthesis of asynchronous circuits from synchronous specifications. IEEE TCAD, Oct 2006.
![Page 56: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/56.jpg)
Synchronous operation
[Figure label: CLKgen]
Transforming a synchronous circuit into an asynchronous one (automatically)
![Page 57: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/57.jpg)
De-synchronization
Transforming a synchronous circuit into an asynchronous one (automatically)
![Page 58: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/58.jpg)
Conclusions
• Asynchrony offers flexibility in time
– Modularity
– Dynamic adaptability
– Tolerance to variability
• Better optimization of power/performance
• Why isn’t it an important trend in circuit design?
– Lack of commercial EDA support (timing sign-off)
– Designers do not feel comfortable with “unpredictable” timing
– Other aspects: testing, verification, …
• De-synchronization might be a viable solution
![Page 59: Asynchronous Circuits](https://reader038.fdocuments.us/reader038/viewer/2022103101/568143a2550346895db023ec/html5/thumbnails/59.jpg)