Post on 26-Dec-2015

CMOS Design Methodologies

The Design Problem

Source: sematech97

A growing gap between design complexity and design productivity

Design Methodology

• Design process traverses iteratively between three abstractions: behavior, structure, and geometry
• More and more automation for each of these steps

Design Analysis and Verification

• Accounts for largest fraction of design time
• More efficient when done at higher levels of abstraction - selection of correct analysis level can save multiple orders of magnitude in verification time
• Two major approaches:
– Simulation
– Verification

Digital Data treated as Analog Signal

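[Figure: inverter transient response, Vin and Vout (V, -1.0 to 5.0) versus t (0 to 2 nsec), with the propagation delay tpHL marked; transistor-level inverter schematic with In, Out, VDD and the G, D, S, B terminals of the n and p devices]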

Circuit Simulation

Both time and data treated as analog quantities
Also complicated by presence of non-linear elements (relaxed in timing simulation)

Circuit versus Switch-Level Simulation

[Figure: waveforms of CIN, OUT[2] and OUT[3] (-1.0 to 5.0 V) versus time (0 to 20 nsec), comparing circuit-level and switch-level simulation]

Design analysis and simulation

• Spice - exact but time consuming

• discrete time steps
• circuit models
• timing simulation with partitioning and relaxation method
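As a rough illustration of what "discrete time steps" means, the sketch below integrates a single linear RC node with the backward Euler rule and compares the result with the closed-form answer. It is not Spice and handles no non-linear elements; the function names and the component values (10 kΩ, 100 fF, 5 V) are assumptions chosen for the example.

```python
import math


def simulate_rc_discharge(r_ohm, c_farad, v0, t_end, n_steps):
    """Integrate C*dV/dt = -V/R with backward Euler.

    Returns a list of (time, voltage) samples, starting at t = 0.
    """
    dt = t_end / n_steps
    v = v0
    trace = [(0.0, v)]
    for k in range(1, n_steps + 1):
        # Backward Euler: (v_new - v) / dt = -v_new / (R*C)
        v = v / (1.0 + dt / (r_ohm * c_farad))
        trace.append((k * dt, v))
    return trace


def exact_rc_discharge(r_ohm, c_farad, v0, t):
    """Closed-form solution v0 * exp(-t / RC), used as the reference."""
    return v0 * math.exp(-t / (r_ohm * c_farad))


if __name__ == "__main__":
    R, C = 10e3, 100e-15  # time constant RC = 1 ns
    for n in (10, 100, 1000):
        t, v = simulate_rc_discharge(R, C, 5.0, 2e-9, n)[-1]
        print(n, v, exact_rc_discharge(R, C, 5.0, t))
```

Smaller steps bring the numerical result closer to the exact one at the cost of more work per simulated nanosecond, which is the accuracy versus run-time trade-off named above.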

Gate level simulation

• faster than switch level

• functional simulation

• VHDL description used

Structural Description of Accumulator

entity accumulator is
  port ( -- definition of input and output terminals
    DI  : in bit_vector(15 downto 0); -- a 16-bit-wide vector
    DO  : inout bit_vector(15 downto 0);
    CLK : in bit
  );
end accumulator;

architecture structure of accumulator is
  component reg -- definition of register ports
    port (
      DI  : in bit_vector(15 downto 0);
      DO  : out bit_vector(15 downto 0);
      CLK : in bit
    );
  end component;
  component add -- definition of adder ports
    port (
      IN0  : in bit_vector(15 downto 0);
      IN1  : in bit_vector(15 downto 0);
      OUT0 : out bit_vector(15 downto 0)
    );
  end component;

  -- definition of accumulator structure
  signal X : bit_vector(15 downto 0);
begin
  add1 : add
    port map (DI, DO, X); -- defines port connectivity
  reg1 : reg
    port map (X, DO, CLK);
end structure;

Design defined as composition of register and full-adder cells (“netlist”)

Data represented as {0,1,Z}

Time discretized and progresses with unit steps

Description language: VHDL
Other options: schematics, Verilog

Behavioral Description of Accumulator

entity accumulator is
  port (
    DI  : in integer;
    DO  : inout integer := 0;
    CLK : in bit
  );
end accumulator;

architecture behavior of accumulator is
begin
  process (CLK)
    variable X : integer := 0; -- intermediate variable
  begin
    if CLK = '1' then
      X := DO + DI; -- X is a variable, so it is assigned with :=
      DO <= X;
    end if;
  end process;
end behavior;

Design described as set of input-output relations, regardless of chosen implementation

Data described at higher abstraction level (“integer”)

Behavioral simulation of accumulator

Integer data

Discrete time

(Synopsys Waves display tool)

Design verification

• checking number of inversions between two C2MOS gates

• checking pull-up and pull-down ratio in pseudo-NMOS gates

• checking minimum driver size to maintain rise and fall times

• checking charge sharing to satisfy noise margins

Electrical verification
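One of these checks, charge sharing, reduces to charge conservation and can be sketched in a few lines. The example assumes a precharged output capacitance that gets connected to a single discharged internal node; the function names, the capacitance values and the allowed voltage drop are illustrative assumptions, not values from the slides.

```python
def charge_sharing_voltage(vdd, c_load, c_internal, v_internal_initial=0.0):
    """Final voltage after two capacitors are connected (charge conservation)."""
    total_charge = c_load * vdd + c_internal * v_internal_initial
    return total_charge / (c_load + c_internal)


def passes_noise_margin(vdd, c_load, c_internal, max_drop):
    """True if the drop on the precharged node stays within max_drop volts."""
    drop = vdd - charge_sharing_voltage(vdd, c_load, c_internal)
    return drop <= max_drop


if __name__ == "__main__":
    # 50 fF output node precharged to 5 V shares charge with a 10 fF node at 0 V
    print(charge_sharing_voltage(5.0, 50e-15, 10e-15))
    print(passes_noise_margin(5.0, 50e-15, 10e-15, max_drop=1.0))
```

An electrical rule checker would apply a test of this kind to every dynamic node instead of simulating the whole circuit.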

Design verification

• Spice simulation time is too long

• RC delay estimated using Penfield-Rubinstein-Horowitz method

• identification of critical path (avoid false paths)

Timing verification
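RC delay estimation on a tree of wires can be sketched with the Elmore time constant, the first-order quantity on which RC-tree delay estimates of this kind are built. The code below computes only that time constant for every node; it does not compute the upper and lower bounds of the full Penfield-Rubinstein-Horowitz method. The data layout (`parent`, `resistance`, `capacitance` lists) is an assumption made for the example.

```python
def elmore_delays(parent, resistance, capacitance):
    """Elmore time constant at every node of an RC tree.

    parent[i]      index of the node feeding node i, or None if node i is
                   driven directly by the source
    resistance[i]  resistance of the branch entering node i
    capacitance[i] capacitance from node i to ground
    """
    n = len(parent)
    children = [[] for _ in range(n)]
    roots = []
    for i, p in enumerate(parent):
        if p is None:
            roots.append(i)
        else:
            children[p].append(i)

    # Depth-first order: every node appears after its parent.
    order = []
    stack = list(roots)
    while stack:
        node = stack.pop()
        order.append(node)
        stack.extend(children[node])

    # Capacitance at or below each node, accumulated from the leaves upward.
    downstream = list(capacitance)
    for node in reversed(order):
        if parent[node] is not None:
            downstream[parent[node]] += downstream[node]

    # Each branch resistance sees all the capacitance downstream of it.
    delay = [0.0] * n
    for node in order:
        upstream = 0.0 if parent[node] is None else delay[parent[node]]
        delay[node] = upstream + resistance[node] * downstream[node]
    return delay


if __name__ == "__main__":
    # Three-segment RC ladder with unit resistances and capacitances
    print(elmore_delays([None, 0, 1], [1, 1, 1], [1, 1, 1]))
```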

Timing Verification

(Synopsys-Epic Pathmill)

Critical path

Enumerates and rank-orders critical timing paths

No simulation needed!
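Why no simulation is needed can be seen from a small sketch: with a delay attached to every gate, the slowest path follows from a longest-path traversal of the netlist graph, without applying any input vectors. The code below is a simplified illustration, not how Pathmill works; it returns only the single worst path, assumes an acyclic netlist, and does not detect false paths. Gate names and delays are made up.

```python
from collections import defaultdict


def critical_path(edges, delay):
    """Longest-delay path through an acyclic gate network.

    edges: list of (driver, receiver) gate pairs
    delay: dict mapping every gate name to its delay
    Returns (arrival_time, [gates on the path, in order]).
    """
    succs = defaultdict(list)
    indeg = {g: 0 for g in delay}
    for src, dst in edges:
        succs[src].append(dst)
        indeg[dst] += 1

    ready = [g for g in delay if indeg[g] == 0]
    arrival = {g: delay[g] for g in ready}
    best_pred = {g: None for g in delay}

    # Topological traversal: a gate is processed once all its drivers are done.
    while ready:
        g = ready.pop()
        for nxt in succs[g]:
            candidate = arrival[g] + delay[nxt]
            if candidate > arrival.get(nxt, float("-inf")):
                arrival[nxt] = candidate
                best_pred[nxt] = g
            indeg[nxt] -= 1
            if indeg[nxt] == 0:
                ready.append(nxt)

    end = max(arrival, key=arrival.get)
    path = []
    node = end
    while node is not None:
        path.append(node)
        node = best_pred[node]
    return arrival[end], path[::-1]


if __name__ == "__main__":
    gate_delay = {"a": 1, "b": 2, "c": 3, "d": 1}
    wires = [("a", "c"), ("b", "c"), ("c", "d"), ("a", "d")]
    print(critical_path(wires, gate_delay))
```

A production timing verifier would keep the top few arrival times per gate so that paths can be rank-ordered, and would prune paths that can never be sensitized.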

Design verification

• components described behaviorally

• circuit model obtained from component models

• resulting circuit behavior compared with design specifications

• no generally acceptable verifier exists

Formal verification

Implementation approaches

Custom circuit design

• labor intensive

• long time-to-market

• cost amortized over a large volume

• reuse as a library cell

• was popular in early designs

• layout editor, DRC, circuit extraction

Layout editor

• transistor symbols

• relative positioning

• compaction

• stick diagram description

• design rules automatically satisfied

• automatic pitch matching

1. Polygon based (Magic)
2. Symbolic layout

Custom Design – Layout Editor

Magic Layout Editor (UC Berkeley)

Symbolic Layout

[Figure: stick diagram of inverter with In, Out, VDD and GND; devices annotated 1 and 3]

• Dimensionless layout entities
• Only topology is important
• Final layout generated by “compaction” program

Design rule checking

• on-line DRC - rules checked and errors flagged during layout

• batch DRC - post-design verification
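A batch-style check can be sketched for two of the simplest rules, minimum width and minimum spacing, on axis-aligned rectangles of one layer. Real rule decks contain many more rule types; the rectangle format, the rule values and the function name below are assumptions for illustration only.

```python
import math
from itertools import combinations


def check_design_rules(rects, min_width, min_spacing):
    """Flag width and spacing violations among same-layer rectangles.

    rects: list of (x1, y1, x2, y2) with x1 < x2 and y1 < y2
    Returns a list of ("width", i) and ("spacing", i, j) tuples.
    """
    errors = []
    for i, (x1, y1, x2, y2) in enumerate(rects):
        if min(x2 - x1, y2 - y1) < min_width:
            errors.append(("width", i))

    for (i, a), (j, b) in combinations(enumerate(rects), 2):
        # Gap along each axis; 0 means the projections touch or overlap.
        dx = max(b[0] - a[2], a[0] - b[2], 0)
        dy = max(b[1] - a[3], a[1] - b[3], 0)
        if dx == 0 and dy == 0:
            continue  # touching or overlapping shapes count as connected
        if math.hypot(dx, dy) < min_spacing:
            errors.append(("spacing", i, j))
    return errors


if __name__ == "__main__":
    layer = [(0, 0, 3, 10), (4, 0, 7, 10), (10, 0, 11, 10)]
    print(check_design_rules(layer, min_width=2, min_spacing=2))
```

An on-line checker would run the same tests incrementally, on only the shapes touched by the latest edit.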

Circuit extraction

Circuit schematic derived from layout

transistors are built with proper geometry

parasitic capacitances and resistances evaluated

extraction of inductance requires 3D analysis

Cell-based design

• reduced cost

• reduced time

• reduced integration density

• reduced performance

Cell-based design

• standard cell

• compiled cells

• module generators

• macrocell place and route

Standard cell

• library contains basic logic cells
- inverter, AND/NAND, OR/NOR, XOR/XNOR, flip-flop
- AOI, MUX, adder, comparator, counter, decoder, encoder
• fan-in and fan-out specified
• schematic uses cells from library
• layout automatically generated

Standard cell

• cells have equal heights

• cell rows separated by routing channels

Standard cell design

Standard cell layout and description

Standard cell

• large design cost amortized over a large number of designs

• large number of different cells with different fan-ins
• large fan-out for cells to be used in different designs
• synthesis tools made standard cell design popular
• standard cell design outperforms PLA in area and speed
• standard cell design benefits from multilevel logic synthesis

Compiled cell

• cell layout generated on the fly

• transistor or gate level netlist used with transistor size specified

• layout densities approach that of human designers

Compiled cell

[Figure: circuit schematics with transistor sizing, and the generated layout with automatic pitch matching]

Module generators

• logic level cells not efficient for subcircuit design - shifters, adders, multipliers, data paths, PLAs, counters, memories

• macrocell generators - use design parameters like number of bits

• data path compilers
- use bit slice modules and repeat them N times
- generate interconnections between modules

Datapath compilers

Feedthroughs used to improve routing

Datapath compiler results

Macrocell place and route

• channel routing - metal 2 horizontal segments

- metal 1 vertical segments

• over the block routing (3-6 metal layers used)

Array-based design implementation

To avoid the slow fabrication process, which takes 3-4 weeks:

• mask programmable arrays

• fuse based FPGAs

• nonvolatile FPGAs

• RAM based FPGAs

Mask programmable arrays

• gate-array - similar to standard cell

• sea-of-gate - routed over the cells (high density) - wires added to make logic gates

• challenge in design is to utilize the maximum cell capacity

• utilization < 75% for random logic design

Macrocell Design Methodology

[Figure: floorplan showing macrocells, interconnect bus and routing channel]

Floorplan: defines overall topology of design, relative placement of modules, and global routes of busses, supplies, and clocks

Macrocell-Based Design Example

Video-encoder chip [Brodersen92]

[Figure: chip floorplan with two SRAM blocks, routing channel, data paths and standard cells]

Gate Array — Sea-of-gates

[Figure: gate array with rows of uncommitted cells separated by routing channels; an uncommitted cell (VDD, GND, polysilicon, metal, possible contact) and a committed cell wired as a 4-input NOR with inputs In1 to In4 and output Out]

Sea-of-gate Primitive Cells

[Figure: primitive cells with NMOS and PMOS rows, using oxide isolation and using gate isolation]

Sea-of-gates

[Figure: LSI Logic LEA300K (0.6 µm CMOS) with random logic and memory subsystem]

Prewired Arrays

Categories of prewired arrays (or field-programmable devices):

• Fuse-based (program-once)
• Non-volatile EPROM based
• RAM based

Programmable Logic Devices

PLA PROM PAL

Fuse-based FPGA’s

Actel sea-of-gate and standard cell approach

Fuse-based FPGA’s

Example: XOR gate obtained by setting A=1, B=0, C=0, D=1, SA=SB=In1, S0=S1=In2

Fuse-based FPGA’s

Anti-fuse provides short (low resistance) when blown

Nonvolatile FPGA’s

• programming similar to PROM
• erasable programmable logic devices - EPLD
• electrically erasable - EEPLD
• design partitioned into macrocells
• flip-flops used to make sequential circuits
• software used to program interconnections to optimize use of hardware
• input specified from schematics, truth tables, state graphs, VHDL code

EPLD Block Diagram

[Figure: EPLD block diagram showing macrocells and primary inputs. Courtesy Altera Corp.]

RAM based (volatile) FPGA’s

• programming is fast and can be repeated many times

• no high voltage needed

• integration density is high

• information lost when the power goes off

XILINX FPGA’s

• configurable logic blocks (CLBs) used

• five input two output combinational blocks

• two D flip flops are edge or level triggered

• functionality and multiplexers controlled by RAM

• RAM can be used as look-up table or a register file
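How a RAM acts as a look-up table can be shown with a short sketch: the logic inputs form the RAM address, and the stored bit at that address is the function output. The code is a generic model, not the actual Xilinx CLB; `make_lut` and `lut_read` are hypothetical helper names.

```python
def make_lut(function, n_inputs):
    """Fill a 2**n_inputs-entry table with the outputs of a Boolean function.

    Address bit k of the table corresponds to input number k.
    """
    table = []
    for address in range(2 ** n_inputs):
        bits = [(address >> k) & 1 for k in range(n_inputs)]
        table.append(function(*bits) & 1)
    return table


def lut_read(table, *inputs):
    """Evaluate the programmed function by reading one table entry."""
    address = sum(bit << k for k, bit in enumerate(inputs))
    return table[address]


if __name__ == "__main__":
    majority = make_lut(lambda a, b, c: (a & b) | (a & c) | (b & c), 3)
    parity = make_lut(lambda a, b, c: a ^ b ^ c, 3)
    print(majority, lut_read(majority, 1, 1, 0))
    print(parity, lut_read(parity, 1, 1, 0))
```

Changing the function means rewriting the table contents only; the read hardware stays the same. A five-input block needs 32 stored bits per output.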

XILINX FPGA’s

• each cell connected to 4 neighbors

• routing channels provide local or global connections

• switching matrices (RAM controlled) are used for switching between channels

XILINX FPGA’s (XC4025)

• 32 × 32 CLBs

• 25000 gates

• 422 kbits of RAM

• operates at 250 MHz

• 32-bit adder uses 62 CLBs

Design synthesis

Circuit synthesis

• derivation of the transistor schematics from logic functions
- complementary CMOS
- pass transistor
- dynamic
- DCVSL (differential cascode voltage switch logic)

• transistor sizing
- performance modeling using RC equivalent circuits
- layout generation

• synthesis not popular due to designers' reluctance

Logic synthesis

• state transition diagrams, FSM, schematics, Boolean equations, truth tables, and HDL used

• synthesis
- combinational or sequential
- multilevel, PLA, or FPGA

• logic optimization for
- area, speed, power
- technology mapping

Logic optimization

• Espresso - two level minimization tool (UCB)

• state minimization and state encoding

• MIS - multilevel logic synthesis (UCB)

Example: S = (A ⊕ B) ⊕ Ci

Co = AB + ACi + BCi
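The two adder equations can be checked exhaustively against integer addition. The sketch below also evaluates one factored form of the carry, Co = AB + Ci(A + B), as an example of the kind of rewriting multilevel synthesis performs; that particular factoring is an illustration chosen here, not the output of MIS.

```python
from itertools import product


def full_adder(a, b, ci):
    """Two-level sum and carry: S = (A xor B) xor Ci, Co = AB + ACi + BCi."""
    s = (a ^ b) ^ ci
    co = (a & b) | (a & ci) | (b & ci)
    return s, co


def carry_multilevel(a, b, ci):
    """Factored carry Co = AB + Ci(A + B): fewer literals, one more level."""
    return (a & b) | (ci & (a | b))


if __name__ == "__main__":
    for a, b, ci in product((0, 1), repeat=3):
        s, co = full_adder(a, b, ci)
        print(a, b, ci, "->", s, co, carry_multilevel(a, b, ci))
```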

Logic optimization

Multilevel implementation of adder generated by MIS II; cell library from University of Mississippi

Architecture synthesis

• behavioral or high level synthesis

• optimizing translation e.g. pipelining

• Cathedral and HYPER tools

• HYPER tutorial and synthesis example:

http://infopad.eecs.berkeley.edu/~hyper

Architecture synthesis example

Vertical and Orthogonal CMOS: COSMOS

– Stack two MOSFETs under a common gate
– Improve only hole mobility by using strained SiGe channel
  • pMOS transconductance equal to nMOS
– Reduce parasitics due to wiring and isolating the sub-nets

[Figure: conventional CMOS compared with COSMOS (Complementary Orthogonal Stacked MOS)]

Savas Kaya

Technology Base

• Strained Si/SiGe layers
– Built-in strain traps more carriers and increases mobility
  • Equal and high electron and hole mobilities (Jung et al., p.460, EDL’03)

• SOI (silicon-on-insulator) substrates
– Active areas on buried oxide (BOX) layer
– Reduces unwanted DC leakage and AC parasitics

Mizuno et al., p.988, TED’03

Cheng et al., p.L48, SST’04

COSMOS Structure

• Single common gate: mid-gap metal or poly-SiGe
• Ultra-thin channels: 2-6nm to control threshold/leakage

– Strained Si1-xGex for holes (x0.3)

– Strained or relaxed Si for electrons

• Substrate: SOI

COSMOS Structure - 3D View I

• Single gate stack: mid-gap metal or poly-SiGe
– Must be engineered for a symmetric threshold

[Figure: 3D view of the COSMOS structure, dimensions in units of µm]

COSMOS Structure - 3D View II

• Conventional self-aligned contacts
– Doped S/D contacts: p- (blue) or n- (red) type

• Inter-dependence between gate dimensions:

[Figure: 3D view with the gate dimensions W and L marked for the nMOS and for the pMOS]

COSMOS Gate Control

• A single gate to control both channels

– High-mobility strained Si1-xGex (x0.3) buried hole channel

• High Ge% eliminates parallel conduction and improves mobility

• Lowers the threshold voltage VT

– Electrons are in a surface channel

– Requires fine tuning for symmetric operation

[Figure: electron and hole densities versus Vgate [V], from -1.2 to 1.2 V; annotation: nVT = 10^11 cm^-2]

3D Characteristics: 40nm Device

• Symmetric operation
– No QM corrections

• Lower VT
– Features in sub-threshold operation
  • Related to p-i-n parasitic diode included in 3D

COSMOS Inverter

[Figure: top view and peel-off top views of the COSMOS inverter]

• No additional processing
– Just isolate COSMOS layers and establish proper contacts
– Significantly shorter output metallization

3D TCAD Verification

• Inverter operation verified in 3D

40nm COSMOS NOT gate driving a CL=1fF load

Applications

• Low power static CMOS:
– Should outperform conventional devices in terms of speed
  • Multiple input circuit example: NOR gate

• Area tight designs:
– FPGA, sensing/testing, power etc. ?