Asynchronous Circuits Kent Orthner Wed. March 2nd, 2005 Presentation for: High speed and Low Power VLSI, Dr. Maitham Shams


Asynchronous Circuits

Kent Orthner
Wed. March 2nd, 2005

Presentation for: High speed and Low Power VLSI, Dr. Maitham Shams


For High speed and Low Power VLSI, Carleton University. Kent Orthner, March 2nd 2005

Agenda
- What are Asynchronous Circuits?
- Advantages & Disadvantages
- Example Asynchronous Circuit
- GasP
- FPGAs
- Design Project


What are Asynchronous Circuits?
Synchronous Circuits
- Everything is synchronized to a global clock.
- Clock edges determine the time instants at which data is sampled.
  - Register inputs are sampled at the rising clock edge.
  - Data wires may glitch between clock edges.
- "Worst case" operation:
  - The clock frequency is limited by the speed of the slowest stage.
  - The clock frequency must be slow enough that the circuit works with worst-case PVT and worst-case data.

[Figure: pipeline driven by a common Clock; labels 9 ns, 4 ns, 6 ns and 10 ns, 10 ns, 10 ns]


What are Asynchronous Circuits?
Asynchronous Circuits
- Eliminate the global clock signal.
- States are defined in terms of input values and internal actions.
- Synchronize data transfer by other means: handshaking, flow control.
- "Average-case" performance: each block goes as fast as it goes.

[Figure: the same pipeline with Req / Ack handshaking between stages; labels 9 ns, 4 ns, 6 ns and 10 ns, 5 ns, 7 ns]
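The worst-case versus average-case contrast can be made concrete with a small calculation. The sketch below is only an illustration: it assumes the figure labels mean logic delays of 9, 4 and 6 ns, and that the remaining 1 ns per stage is clock margin (synchronous) or handshake overhead (asynchronous). That reading, and the function names, are assumptions rather than something the slides state.

```python
# Assumed reading of the figure labels (not stated on the slides):
LOGIC_DELAYS_NS = [9, 4, 6]   # per-stage logic delay
OVERHEAD_NS = 1               # clock margin or handshake overhead per stage


def sync_latency_ns(logic_ns, margin_ns):
    """Every stage takes one clock period, set by the slowest stage."""
    period = max(logic_ns) + margin_ns
    return period * len(logic_ns)


def async_latency_ns(logic_ns, handshake_ns):
    """Each stage takes only as long as its own logic plus the handshake."""
    return sum(delay + handshake_ns for delay in logic_ns)


print(sync_latency_ns(LOGIC_DELAYS_NS, OVERHEAD_NS))   # 30
print(async_latency_ns(LOGIC_DELAYS_NS, OVERHEAD_NS))  # 22
```

This compares the latency of a single data item through an otherwise empty pipeline. Sustained throughput of the asynchronous pipeline is still bounded by its slowest stage.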


Micropipelines
- Each data channel is associated with two abstract control signals:
  - Rdy: indicates when the upstream stage has data.
  - Ack: indicates when the downstream stage is finished with the previous data.
- Data moves through a stage when the upstream stage has data available and the downstream stage is ready for new data.
- If no logic processing is being performed, the circuit acts as an elastic FIFO.

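[Figure: micropipeline control built from C-elements (C), with request signals Rin, R1, R2, R3, Rout, acknowledge signals Ain, A1, A2, A3, Aout, and a data path from Din to Dout]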
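The elastic-FIFO behaviour of the control chain can be sketched in a few lines of Python. This is a simplified model under stated assumptions: two-phase (transition) signalling, latch and wire delays ignored, and each C-element output used both as the request to the next stage and as the acknowledge to the previous one. The names `c_element` and `settle` are illustrative.

```python
def c_element(a, b, prev):
    """Muller C-element: output follows the inputs when they agree, else holds."""
    return a if a == b else prev


def settle(r_in, a_out, c):
    """Re-evaluate all stages until no C-element output changes.

    c[i] is the output of stage i's C-element. Its inputs are the request
    from the left and the inverted acknowledge from the right.
    """
    c = list(c)
    changed = True
    while changed:
        changed = False
        for i in range(len(c)):
            req = r_in if i == 0 else c[i - 1]
            ack = a_out if i == len(c) - 1 else c[i + 1]
            new = c_element(req, 1 - ack, c[i])
            if new != c[i]:
                c[i] = new
                changed = True
    return c


state = [0, 0, 0]                 # three empty stages
state = settle(1, 0, state)       # first item: travels to the last stage
state = settle(0, 0, state)       # second item: stops behind the first
state = settle(1, 0, state)       # third item: FIFO is now full
print(state)                      # [1, 0, 1]
print(settle(0, 0, state))        # [1, 0, 1]: a fourth request is not accepted
print(settle(0, 1, state))        # [0, 1, 0]: consumer acks, everything shifts
```

Adjacent stages holding different values mark stored items, so the alternating pattern `[1, 0, 1]` is a full three-stage FIFO.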


Advantages
- Performance
  - Average-case instead of worst-case
- Low Power
  - Clock accounts for 30 to 50% of chip dynamic power
  - Automatic clock gating in asynchronous circuits
- Escape from Metastability
  - No concern about clock crossing: circuits are metastable-safe by design
- Easier Circuit Synthesis
  - No clock distribution, no clock skews, no clock buffering tree analysis
  - No timing-driven placement necessary
- Technology Scaling Potential
  - No circuit retiming/re-pipelining necessary
  - Technology-independent, in some ways
  - Automatic adaptation to physical properties, PVT
- Lower EMI
  - Activity in synchronous circuits produces predictable EMI patterns
- Ease of composition
  - Easier to interface heterogeneous IP cores
  - No timing assumptions necessary


Disadvantages
- Vulnerable to circuit hazards & glitches
- Circuits are larger
  - More area for control & handshaking logic, encoding scheme, hazard avoidance
- More difficult & less mature than synchronous designs
  - Benefits not explored on large-scale VLSI
- Synchronous designs
  - are well understood: it's easier to think sequentially than concurrently
  - provide a simple way to deal with noise and hazards
  - are tolerant to glitches
- CAD Tools
  - Synchronous tools are quite mature
  - No such established asynchronous tools


Example Asynchronous Circuit
TOKYO, Japan, February 9, 2005: Epson Develops the World's First Flexible 8-Bit Asynchronous Microprocessor

Seiko Epson Corp. ("Epson") has announced that it has developed the world's first flexible 8-bit asynchronous microprocessor using low-temperature polysilicon thin-film transistors (LTPS-TFTs) on a plastic substrate.

With energy consumption reduced by 70% compared to the synchronous microprocessors now in everyday use, Epson is now researching potential applications for its invention.

Using asynchronous circuit design technology, Epson has been able to:
1. Make a stable 8-bit microprocessor composed of 32,000 LTPS-TFTs,
2. Achieve energy consumption 70% lower than the synchronous design,
3. Reduce electromagnetic radiation by 20 dB.


GasP
- A family of asynchronous circuits that provide controls for:
  - simple pipelines
  - branching and joining
  - scatter & gather
  - join on demand with arbitration
- Throughput in excess of 1.5 G data items / second in 0.35 um
- A single wire is used to carry both Ack & Req messages, indicating whether each stage is empty or full.
- Relies on careful choice of transistor widths to equalize delay in logic gates.


GasP Circuit

1. If the upstream state conductor is full (low), and the downstream state conductor is empty (high), b and x both conduct, driving the voltage at (1) low.

2. This causes transistor p to turn on, making the data latch momentarily transparent.


GasP Circuit (continued)

3. The low voltage at (2) causes transistor d to turn on, driving the downstream state conductor to low (full).

4. This also causes transistor y to turn on, driving the upstream state conductor to high (empty).

5. Transistor t turns on, resetting the top of the NAND gate to a high value, causing pass transistor p to turn off.
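The net effect of steps 1 to 5 on the state conductors can be sketched as a behavioural model. This is not a transistor-level description: it keeps only the slide's convention (low = full, high = empty) and the firing condition, and the function name `step` is illustrative.

```python
LOW, HIGH = 0, 1   # slide convention: LOW = full, HIGH = empty


def step(wires):
    """Fire every stage whose upstream wire is full and downstream wire is empty.

    wires[i] and wires[i + 1] are the state conductors on either side of
    stage i. Returns the new wire levels and the stages that fired
    (those are the stages whose data latch was pulsed transparent).
    """
    new = list(wires)
    fired = []
    for i in range(len(wires) - 1):
        if wires[i] == LOW and wires[i + 1] == HIGH:
            new[i] = HIGH        # transistor y: upstream becomes empty
            new[i + 1] = LOW     # transistor d: downstream becomes full
            fired.append(i)
    return new, fired


wires = [LOW, HIGH, HIGH]        # one data item waiting at the input
wires, fired = step(wires)
print(wires, fired)              # [1, 0, 1] [0]
wires, fired = step(wires)
print(wires, fired)              # [1, 1, 0] [1]
```

Two neighbouring stages can never fire in the same step, because they would need the shared state conductor to be both full and empty.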


GasP Circuit (continued)
- The propagation of data in the forward direction through the circuit is four gate delays per stage: a, b, c, d
  - The transistors for logic functions must be sized such that the logic functions take no more than four gate delays.
- The propagation of holes in the reverse direction is two gate delays per stage: x, y
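As a rough worked example, the gate-delay counts can be tied back to the quoted throughput of 1.5 G data items per second. The sketch assumes that one stage cycle is the forward delay plus the reverse delay (six gate delays); the slides give the two figures separately, so the sum is an assumption here, and the resulting gate delay is only an estimate.

```python
FORWARD_GATE_DELAYS = 4   # from the slide: a, b, c, d
REVERSE_GATE_DELAYS = 2   # from the slide: x, y


def forward_latency(stages):
    """Gate delays for a data item to cross the given number of empty stages."""
    return stages * FORWARD_GATE_DELAYS


def implied_gate_delay_ps(items_per_second):
    """Estimated gate delay, ASSUMING cycle time = forward + reverse delays."""
    cycle_ps = 1e12 / items_per_second
    return cycle_ps / (FORWARD_GATE_DELAYS + REVERSE_GATE_DELAYS)


print(forward_latency(4))                     # 16 gate delays
print(round(implied_gate_delay_ps(1.5e9)))    # 111 ps per gate delay
```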


FPGAs
- Commonly built of 4-input look-up tables (LUTs)
  - Effectively a small RAM block with 1 data bit and 16 memory locations.
  - Any logic function with up to 4 inputs can be made from a 4-input LUT.
  - Combinations of LUTs are used to create larger logic functions.
  - RAM is programmed at configuration time, or during operation.
- A register for each logic element
- Connected with a 'sea of programmable interconnect'
  - SRAM is used to configure it at start-up time
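A 4-input LUT viewed as a 16 x 1 RAM can be modelled directly. The sketch below is a software illustration only; `make_lut4` and `read_lut4` are hypothetical names, and the address bit order (A as the most significant bit) is an arbitrary choice.

```python
def make_lut4(func):
    """'Configure' the LUT: fill 16 one-bit locations from a Boolean function."""
    return [
        func((addr >> 3) & 1, (addr >> 2) & 1, (addr >> 1) & 1, addr & 1)
        for addr in range(16)
    ]


def read_lut4(table, a, b, c, d):
    """Evaluate the function: the four inputs form the RAM address."""
    return table[(a << 3) | (b << 2) | (c << 1) | d]


xor2 = make_lut4(lambda a, b, c, d: a ^ b)          # uses only two inputs
and4 = make_lut4(lambda a, b, c, d: a & b & c & d)  # uses all four inputs

print(and4)                        # fifteen 0s followed by a single 1
print(read_lut4(xor2, 1, 0, 1, 1)) # 1
```

Both functions occupy the same 16 locations and are read the same way, whether they use two inputs or four.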


FPGAs
- Almost exclusively synchronous
  - Frequency is limited by the worst-case path from a register, through one or more lookup tables, through the routing matrix, and into the next register.
- The delay through a LUT is constant (and worst case!)
  - A 2-input XOR function takes as much time as a complex 4-input function.
- The path from a register to the next register is very granular
  - If the logic function has 5 inputs, then the propagation delay is almost doubled over the 4-input case.
- High power
  - Clock distribution network goes everywhere.
  - Power is consumed to drive logic elements that aren't used for a given design
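The 5-input case can be illustrated with one possible mapping onto 4-input LUTs: two LUTs compute the function for E = 0 and E = 1, and a third LUT acts as a 2:1 multiplexer on E. The signal now passes through two LUT levels instead of one, which is where the near doubling of delay comes from. The helper names are hypothetical, and a real mapping tool may choose a different decomposition.

```python
def make_lut4(func):
    """Fill a 16-entry table from a 4-input Boolean function."""
    return [
        func((addr >> 3) & 1, (addr >> 2) & 1, (addr >> 1) & 1, addr & 1)
        for addr in range(16)
    ]


def read_lut4(table, a, b, c, d):
    return table[(a << 3) | (b << 2) | (c << 1) | d]


def map_5_input(func5):
    """Split a 5-input function on its last input (Shannon expansion)."""
    lut_e0 = make_lut4(lambda a, b, c, d: func5(a, b, c, d, 0))
    lut_e1 = make_lut4(lambda a, b, c, d: func5(a, b, c, d, 1))
    mux = make_lut4(lambda x0, x1, e, _unused: x1 if e else x0)

    def evaluate(a, b, c, d, e):
        first_level_0 = read_lut4(lut_e0, a, b, c, d)          # LUT level 1
        first_level_1 = read_lut4(lut_e1, a, b, c, d)          # LUT level 1
        return read_lut4(mux, first_level_0, first_level_1, e, 0)  # LUT level 2

    return evaluate


parity5 = map_5_input(lambda a, b, c, d, e: a ^ b ^ c ^ d ^ e)
print(parity5(1, 0, 1, 1, 0))   # 1
```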


Design Project
- 16:1 pipelined multiplexer in four stages, using a GasP pipeline.
  - Essentially a 4-input LUT
- Compare with an equivalent synchronous design with the same gate sizes
  - Performance, Power & Energy per cycle, Circuit Size
- SPICE simulations, with 0.13 um technology, using TSMC models from MOSIS

[Figure: 16:1 multiplexer tree configured for the example Out = ABCD: In0 to In14 = 0, In15 = 1. Select inputs Sel [ABCD] with D on Sel0, C on Sel1, B on Sel2, A on Sel3; Delay elements between stages; output Out]
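The four-stage multiplexer can be modelled functionally as a tree of 2:1 selections, with D applied first (Sel0) and A last (Sel3), as in the figure. The sketch leaves out the GasP control and the delay elements and only shows the data selection; `mux16` is an illustrative name.

```python
def mux16(inputs, a, b, c, d):
    """Select one of 16 inputs with four 2:1 mux stages (D first, A last)."""
    level = list(inputs)
    for sel in (d, c, b, a):      # Sel0, Sel1, Sel2, Sel3
        # Each stage halves the number of candidates.
        level = [level[2 * i + sel] for i in range(len(level) // 2)]
    return level[0]


# Slide example: Out = ABCD, so only In15 is programmed to 1.
and4_config = [0] * 15 + [1]

print(mux16(and4_config, 1, 1, 1, 1))   # 1
print(mux16(and4_config, 1, 1, 1, 0))   # 0
```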


Design Project
Motivation
- The pipeline is shortened when some inputs are not used, leading to reduced propagation delay.
- If GasP latches are at each stage within the LUT, the flip-flop after each LUT is not required.
  - The effective operating frequency is now due to the propagation between GasP stages, not LUTs.
- Performance can be further increased by incorporating GasP FIFO stages into the routing network.

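[Figure: the same multiplexer tree configured for the example Z = AB: In0 to In11 = 0, In12 to In15 = 1. Select inputs Sel [ABCD] with D on Sel0, C on Sel1, B on Sel2, A on Sel3; Delay elements between stages; intermediate values 0, 0, 0, 1 feeding the last two stages; output Out]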
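The idea of shortening the pipeline can be sketched by checking which leading stages have no effect on the output. The slides do not say how the bypass is implemented, so this is only an assumed software view: a stage is skippable when both choices of its select input give the same candidates, starting from Sel0. The function name is illustrative.

```python
def trim_unused_stages(table):
    """Drop leading mux stages whose select input cannot change the result.

    Returns the reduced table and the number of stages skipped.
    """
    level = list(table)
    skipped = 0
    # A stage is redundant when even- and odd-indexed candidates are identical.
    while len(level) > 1 and level[0::2] == level[1::2]:
        level = level[0::2]
        skipped += 1
    return level, skipped


z_ab = [0] * 12 + [1] * 4          # slide example Z = AB (C and D unused)
reduced, skipped = trim_unused_stages(z_ab)
print(reduced, skipped)            # [0, 0, 0, 1] 2
```

For Z = AB the D and C stages drop out, leaving the values 0, 0, 0, 1 for the remaining two stages, in line with the figure. For Out = ABCD nothing can be skipped.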


Tentative Schedule

Milestone                                                 Date
Background Research                                       February
Design & Implementation of GasP & Synchronous Circuits    Early / Mid March
Testing & Result Collection                               Late March
Class Presentation                                        Early April
Prepare Report                                            April





Classification: Timing
- Delay-Insensitive (DI)
  - Designed to operate correctly regardless of the delays on gates & wires
  - "Unbounded" gate & wire delay model assumed.
  - The class of simple DI operations built out of basic gates is almost empty
  - Practical DI circuits can be built with complex components that use timing assumptions within the component. Example: C-Element
- Quasi-Delay Insensitive (QDI)
  - Same as DI, but with the isochronic fork delay assumption
  - An isochronic fork is a forked wire where all branches have the same or a bounded delay
  - Weakest compromise to true DI circuits needed to build practical circuits.
- Speed-Independent (SI)
  - Unbounded delays for gates and "negligible" (optimistic) delays for wires.
- Self-timed
  - The circuit contains a number of elements, where each element may be SI internally.
  - Communication between regions is assumed to be Delay Insensitive.


Classification: Signaling
- Control Signaling
  - Request/Acknowledge (Self-Timed) is popular
  - Four phase / Return to Zero / Level signalling
    - Req / Ack / Req \ Ack \ : 1 cycle. (/ = rising edge, \ = falling edge)
  - Two phase / Non-RTZ / Transition signalling
    - Req / Ack / : 1 cycle. Req \ Ack \ : 1 cycle.
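The difference between the two control protocols can be shown by listing the wire events for a number of transfers. The sketch below uses the slide's notation (/ for a rising edge, \ for a falling edge); the function names are illustrative.

```python
def four_phase(transfers):
    """Return-to-zero: every transfer raises and then lowers both wires."""
    events = []
    for _ in range(transfers):
        events += ["Req/", "Ack/", "Req\\", "Ack\\"]
    return events


def two_phase(transfers):
    """Transition signalling: every transfer is one edge on each wire."""
    events = []
    level = 0
    for _ in range(transfers):
        level ^= 1                       # wires alternate between high and low
        edge = "/" if level else "\\"
        events += ["Req" + edge, "Ack" + edge]
    return events


print(len(four_phase(2)))   # 8 transitions for two transfers
print(len(two_phase(2)))    # 4 transitions for two transfers
```

The same four-event sequence carries one transfer in four-phase signalling and two transfers in two-phase signalling.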

- Data Signaling
  - Bundled Data
    - Normal wires, one wire per bit.
    - Use control signals to indicate when data is valid.
  - Dual-rail data
    - 2 wires per bit, encoding implies data validity
    - 00 = no data, 01 = 0, 10 = 1, 11 = invalid
    - Simple acknowledge control wire
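The dual-rail code listed above can be written out as a small encoder and decoder. This is a sketch with illustrative function names; each bit is a pair of wire values, and a word counts as valid once every pair holds either 01 or 10.

```python
NO_DATA = (0, 0)


def encode_bit(bit):
    """01 encodes logic 0, 10 encodes logic 1."""
    return (1, 0) if bit else (0, 1)


def decode_pair(pair):
    """Return 0 or 1 for valid data, None for 'no data'; 11 is an error."""
    if pair == (0, 1):
        return 0
    if pair == (1, 0):
        return 1
    if pair == NO_DATA:
        return None
    raise ValueError("11 is not a legal dual-rail code")


def word_is_valid(pairs):
    """Completion detection: every bit has left the 'no data' state."""
    return all(pair in ((0, 1), (1, 0)) for pair in pairs)


word = [encode_bit(b) for b in (1, 0, 1)]
print(word)                                         # [(1, 0), (0, 1), (1, 0)]
print(word_is_valid(word))                          # True
print(word_is_valid([(1, 0), NO_DATA, (1, 0)]))     # False: one bit not yet arrived
```

Because validity is carried by the data wires themselves, the receiver needs no separate request wire, only the acknowledge.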