CMOS Design Methodologies
The Design Problem
Source: sematech97
A growing gap between design complexity and design productivity
Design Methodology
• Design process traverses iteratively between three abstractions: behavior, structure, and geometry
• More and more automation for each of these steps
Design Analysis and Verification
• Accounts for largest fraction of design time
• More efficient when done at higher levels of abstraction - selection of correct analysis level can save multiple orders of magnitude in verification time
• Two major approaches:
– Simulation
– Verification
Digital Data treated as Analog Signal
[Figure: inverter transient response, Vout (V) versus t (nsec), with Vin, Vout, and tpHL marked; transistor terminals Gn,p, Dn,p, Sn, Sp, Bn, Bp shown between In, Out, and VDD]
Circuit Simulation
Both time and data treated as analog quantities
Also complicated by presence of non-linear elements (relaxed in timing simulation)
Circuit versus Switch-Level Simulation
[Figure: circuit and switch-level simulation waveforms of CIN, OUT[2], and OUT[3] versus time (nsec)]
Design analysis and simulation
• Spice - exact but time consuming
• discrete time steps
• circuit models
• timing simulation with partitioning and relaxation method
Gate level simulation
• faster than switch level
• functional simulation
• VHDL description used
Structural Description of Accumulator
entity accumulator is
  port ( -- definition of input and output terminals
    DI  : in    bit_vector(15 downto 0); -- a 16-bit-wide vector
    DO  : inout bit_vector(15 downto 0);
    CLK : in    bit
  );
end accumulator;

architecture structure of accumulator is
  component reg -- definition of register ports
    port (
      DI  : in  bit_vector(15 downto 0);
      DO  : out bit_vector(15 downto 0);
      CLK : in  bit
    );
  end component;
  component add -- definition of adder ports
    port (
      IN0  : in  bit_vector(15 downto 0);
      IN1  : in  bit_vector(15 downto 0);
      OUT0 : out bit_vector(15 downto 0)
    );
  end component;
  -- definition of accumulator structure
  signal X : bit_vector(15 downto 0);
begin
  add1 : add
    port map (DI, DO, X); -- defines port connectivity
  reg1 : reg
    port map (X, DO, CLK);
end structure;
Design defined as composition of register and full-adder cells (“netlist”)
Data represented as {0,1,Z}
Time discretized and progresses with unit steps
Description language: VHDL
Other options: schematics, Verilog
Behavioral Description of Accumulator
entity accumulator is
  port (
    DI  : in    integer;
    DO  : inout integer := 0;
    CLK : in    bit
  );
end accumulator;

architecture behavior of accumulator is
begin
  process(CLK)
    variable X : integer := 0; -- intermediate variable
  begin
    if CLK = '1' then
      X := DO + DI; -- variable assignment uses :=
      DO <= X;
    end if;
  end process;
end behavior;
Design described as set of input-output relations, regardless of chosen implementation
Data described at higher abstraction level (“integer”)
Behavioral simulation of accumulator
Integer data
Discrete time
(Synopsys Waves display tool)
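The behavioral model can be mimicked in a few lines of Python: integer data, discrete time steps, and an update whenever the clock changes to '1'. This is only a sketch; the function name `simulate_accumulator` and the stimulus values are illustrative assumptions and do not come from the slides or from any simulator.

```python
def simulate_accumulator(clk_trace, di_trace):
    """Return the value of DO after each discrete time step."""
    do = 0          # DO starts at 0, as in the entity declaration
    prev_clk = 0    # a VHDL 'bit' signal defaults to '0'
    history = []
    for clk, di in zip(clk_trace, di_trace):
        # The process wakes on any CLK event and acts only when CLK is '1',
        # which amounts to an update on the rising edge.
        if clk != prev_clk and clk == 1:
            x = do + di   # intermediate variable X
            do = x
        prev_clk = clk
        history.append(do)
    return history


# Hypothetical stimulus: three clock cycles with inputs 3, 5, and 2.
print(simulate_accumulator([0, 1, 0, 1, 0, 1], [3, 3, 5, 5, 2, 2]))
# prints [0, 3, 3, 8, 8, 10]
```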
Design verification
• checking number of inversions between two C²MOS gates
• checking pull-up to pull-down ratio in pseudo-NMOS gates
• checking minimum driver size to maintain rise and fall times
• checking charge sharing to satisfy noise-margins
Electrical verification
Design verification
• SPICE simulation takes too long
• RC delay estimated using Penfield-Rubinstein-Horowitz method
• identification of critical path (avoid false paths)
Timing verification
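RC-tree delay estimates of this kind are built on the Elmore time constant: for each capacitor, multiply its value by the resistance shared between its path and the path to the node of interest, then sum. The sketch below computes only that first-order quantity, not the full Penfield-Rubinstein-Horowitz bounds. The function name, the tree encoding, and the component values are assumptions made for illustration.

```python
def elmore_delay(parent, res, cap, node):
    """Elmore time constant at `node` of an RC tree rooted at the driver.

    parent[n] -- upstream node of n (the root has parent None)
    res[n]    -- resistance of the branch between parent[n] and n, in ohms
    cap[n]    -- capacitance from node n to ground, in farads
    """
    def branches_to_root(n):
        branches = set()
        while parent[n] is not None:
            branches.add(n)   # a branch is named after its downstream node
            n = parent[n]
        return branches

    target = branches_to_root(node)
    delay = 0.0
    for k in cap:
        # Resistance common to the root->node and root->k paths.
        shared = sum(res[b] for b in target & branches_to_root(k))
        delay += shared * cap[k]
    return delay


# Hypothetical tree: node 1 hangs off the driver (node 0); nodes 2 and 3 branch from node 1.
parent = {0: None, 1: 0, 2: 1, 3: 1}
res = {1: 1000.0, 2: 2000.0, 3: 3000.0}
cap = {0: 0.0, 1: 1e-15, 2: 1e-15, 3: 1e-15}
print(elmore_delay(parent, res, cap, 2))   # time constant in seconds
```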
Timing Verification
(Synopsys-Epic Pathmill)
Critical path
Enumerates and rank-orders critical timing paths
No simulation needed!
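The reason no simulation is needed is that the worst-case arrival time can be found by a longest-path search over the gate network. The sketch below finds only the single worst path and does not detect false paths. It makes no claim about how Pathmill works internally, and the netlist and delay values are hypothetical.

```python
from graphlib import TopologicalSorter


def critical_path(fanin, delay):
    """Longest-delay path through a combinational network.

    fanin[g] -- list of nodes driving g (empty for primary inputs)
    delay[g] -- propagation delay of g (0 for primary inputs)
    Returns (arrival_time, path) for the latest-arriving node.
    """
    arrival, best_pred = {}, {}
    # static_order() yields every node after all of its drivers.
    for g in TopologicalSorter(fanin).static_order():
        preds = fanin.get(g, [])
        latest = max(preds, key=lambda p: arrival[p], default=None)
        arrival[g] = delay[g] + (arrival[latest] if latest is not None else 0)
        best_pred[g] = latest
    end = max(arrival, key=arrival.get)
    path = [end]
    while best_pred[path[-1]] is not None:
        path.append(best_pred[path[-1]])
    return arrival[end], path[::-1]


# Hypothetical network: inputs a and b, gates g1 to g3 with unit-less delays.
fanin = {"a": [], "b": [], "g1": ["a", "b"], "g2": ["g1", "b"], "g3": ["a"]}
delay = {"a": 0, "b": 0, "g1": 2, "g2": 3, "g3": 4}
print(critical_path(fanin, delay))   # prints (5, ['a', 'g1', 'g2'])
```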
Design verification
• components described behaviorally
• circuit model obtained from component models
• resulting circuit behavior compared with design specifications
• no generally acceptable verifier exists
Formal verification
Implementation approaches
Custom circuit design
• labor intensive
• long time-to-market
• cost amortized over a large volume
• reuse as a library cell
• was popular in early designs
• layout editor, DRC, circuit extraction
Layout editor
• transistor symbols
• relative positioning
• compaction
• stick diagram description
• design rules automatically satisfied
• automatic pitch matching
1. Polygon based (Magic)
2. Symbolic layout
Custom Design – Layout Editor
Magic Layout Editor(UC Berkeley)
Symbolic Layout
[Figure: stick diagram of inverter, with In, Out, VDD, and GND]
• Dimensionless layout entities
• Only topology is important
• Final layout generated by “compaction” program
Design rule checking
• on-line DRC - rules checked and errors flagged during layout
• batch DRC - post design verification
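A batch check can be pictured as a pass over all shapes on a layer, flagging each rule violation. The sketch below checks only minimum width and minimum spacing on axis-aligned rectangles. The function name, the Euclidean spacing measure, and the rule values are assumptions for illustration and do not come from any real process or tool.

```python
from itertools import combinations


def check_rules(rects, min_width, min_spacing):
    """rects: list of (x1, y1, x2, y2) on one layer, in the same units as the rules."""
    errors = []
    for i, (x1, y1, x2, y2) in enumerate(rects):
        if min(x2 - x1, y2 - y1) < min_width:
            errors.append(("width", i))
    for (i, a), (j, b) in combinations(enumerate(rects), 2):
        # Gap along each axis; 0 means the shapes overlap or abut on that axis.
        dx = max(b[0] - a[2], a[0] - b[2], 0)
        dy = max(b[1] - a[3], a[1] - b[3], 0)
        if dx == 0 and dy == 0:
            continue  # touching or overlapping shapes are treated as connected
        if (dx * dx + dy * dy) ** 0.5 < min_spacing:
            errors.append(("spacing", i, j))
    return errors


# Hypothetical layout: two wires placed too close, plus one wire that is too narrow.
layout = [(0, 0, 3, 10), (4, 0, 7, 10), (20, 0, 21, 10)]
print(check_rules(layout, min_width=2, min_spacing=3))
# prints [('width', 2), ('spacing', 0, 1)]
```

An on-line checker would run the same tests incrementally, on each shape as it is drawn.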
Circuit extraction
Circuit schematic derived from layout
transistors are built with proper geometry
parasitic capacitances and resistances evaluated
extraction of inductance requires 3D analysis
Cell-based design
• reduced cost
• reduced time
• reduced integration density
• reduced performance
Cell-based design
• standard cell
• compiled cells
• module generators
• macrocell place and route
Standard cell
• library contains basic logic cells
- inverter, AND/NAND, OR/NOR, XOR/XNOR, flip-flop
- AOI, MUX, adder, comparator, counter, decoder, encoder
• fan-in and fan-out specified
• schematic uses cells from library
• layout automatically generated
Standard cell
• cells have equal heights
• cell rows separated by routing channels
Standard cell design
Standard cell layout and description
Standard cell
• large design cost amortized over a large number of designs
• large number of different cells with different fan-ins
• large fan-out for cells to be used in different designs
• synthesis tools made standard cell design popular
• standard cell design outperforms PLA in area and speed
• standard cell design benefits from multilevel logic synthesis
Compiled cell
• cell layout generated on the fly
• transistor or gate level netlist used with transistor size specified
• layout densities approach that of human designers
[Figure: compiled cell, from circuit schematics with transistor sizing to generated layout with automatic pitch matching]
Module generators
• logic level cells not efficient for subcircuit design
- shifters, adders, multipliers, data paths, PLAs, counters, memories
• Macrocell generators
- use design parameters like number of bits
• data path compilers
- use bit slice modules and repeat them N times
- generate interconnections between modules
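The generator idea, one bit-slice module repeated N times with interconnections derived automatically, can be sketched as a function that emits a netlist from a single parameter. The ripple-carry adder, the cell name `full_adder`, and the pin and net names are illustrative assumptions; no real module generator is being described.

```python
def generate_adder(n_bits, slice_cell="full_adder"):
    """Return a list of (instance, cell, connections) for an n-bit ripple-carry adder."""
    netlist = []
    for i in range(n_bits):
        # The carry net links slice i to slice i + 1.
        carry_in = "CIN" if i == 0 else f"c{i}"
        carry_out = "COUT" if i == n_bits - 1 else f"c{i + 1}"
        netlist.append((
            f"slice{i}",
            slice_cell,
            {"A": f"A[{i}]", "B": f"B[{i}]", "CI": carry_in,
             "S": f"S[{i}]", "CO": carry_out},
        ))
    return netlist


# The number of bits is the only design parameter.
for instance in generate_adder(4):
    print(instance)
```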
Datapath compilers
Feedthroughs used to improve routing
Datapath compilers
Datapath compiler results
Macrocell place and route
• channel routing
- metal 2 horizontal segments
- metal 1 vertical segments
• over the block routing (3-6 metal layers used)
Macrocell place and route
Array-based design implementation
To avoid the slow fabrication process, which takes 3-4 weeks:
• mask programmable arrays
• fuse based FPGAs
• nonvolatile FPGAs
• RAM based FPGAs
Mask programmable arrays
• gate-array - similar to standard cell
• sea-of-gate
- routed over the cells (high density)
- wires added to make logic gates
• challenge in design is to utilize the maximum cell capacity
• utilization < 75% for random logic design
Mask programmable arrays
Macrocell Design Methodology
Macrocell
Interconnect Bus
Routing Channel
Floorplan: Defines overall topology of design, relative placement of modules, and global routes of busses, supplies, and clocks
Macrocell-Based Design Example
Video-encoder chip [Brodersen92]
[Figure: chip floorplan with two SRAM blocks, routing channel, data paths, and standard cells]
Gate Array — Sea-of-gates
[Figure: gate array with rows of uncommitted cells and routing channels; an uncommitted cell (VDD, GND, polysilicon, metal, possible contacts) and a committed cell wired as a 4-input NOR with inputs In1 to In4 and output Out]
Sea-of-gate Primitive Cells
[Figure: primitive cells built from NMOS and PMOS rows, using oxide isolation and using gate isolation]
Sea-of-gates
[Figure: sea-of-gates chip with random logic and memory subsystem; LSI Logic LEA300K (0.6 µm CMOS)]
Prewired Arrays
Categories of prewired arrays (or field-programmable devices):
• Fuse-based (program-once)
• Non-volatile EPROM based
• RAM based
Programmable Logic Devices
PLA PROM PAL
Fuse-based FPGA’s
Actel sea-of-gate and standard cell approach
Fuse-based FPGA’s
Example: XOR gate obtained by setting A=1, B=0, C=0, D=1, SA=SB=In1, S0=S1=In2
Fuse-based FPGA’s
Anti-fuse provides short (low resistance) when blown out
Nonvolatile FPGA’s
• programming similar to PROM
• erasable programmable logic devices - EPLD
• electrically erasable - EEPLD
• design partitioned into macrocells
• flip-flops used to make sequential circuits
• software used to program interconnections to optimize use of hardware
• input specified from schematics, truth tables, state graphs, VHDL code
EPLD Block Diagram
Macrocell
Courtesy Altera Corp.
Primary inputs
RAM based (volatile) FPGA’s
• programming is fast and can be repeated many times
• no high voltage needed
• integration density is high
• information lost when the power goes off
XILINX FPGA’s
• configurable logic blocks CLBs used
• five input two output combinational blocks
• two D flip flops are edge or level triggered
• functionality and multiplexers controlled by RAM
• RAM can be used as look-up table or a register file
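Using RAM as a look-up table means the inputs form an address and the stored bit at that address is the output, so any function of the inputs can be programmed by rewriting the RAM. The sketch below models one output of such a block. The class name and the parity function are illustrative assumptions, not a description of the actual CLB circuitry.

```python
class LookupTable:
    """k-input combinational block whose function is held in 2**k RAM bits."""

    def __init__(self, n_inputs, function):
        self.n_inputs = n_inputs
        # "Programming" fills the RAM with the truth table of `function`.
        self.ram = [
            int(bool(function(*self._bits(addr))))
            for addr in range(2 ** n_inputs)
        ]

    def _bits(self, addr):
        # Input i corresponds to bit i of the RAM address.
        return [(addr >> i) & 1 for i in range(self.n_inputs)]

    def evaluate(self, *inputs):
        addr = sum(bit << i for i, bit in enumerate(inputs))
        return self.ram[addr]


# Hypothetical configuration: a five-input odd-parity function needs 32 RAM bits.
parity = LookupTable(5, lambda a, b, c, d, e: a ^ b ^ c ^ d ^ e)
print(len(parity.ram), parity.evaluate(1, 0, 1, 1, 0))   # prints 32 1
```

Reprogramming the block amounts to building a new table, which is why configuration is fast and repeatable.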
XILINX FPGA’s
XILINX FPGA’s
• each cell connected to 4 neighbors
• routing channels provide local or global connections
• switching matrices(RAM controlled) are used for switching between channels
XILINX FPGA’s
XILINX FPGA’s (XC4025)
• 32 × 32 CLBs
• 25000 gates
• 422 kbits of RAM
• operates at 250 MHz
• 32-bit adder uses 62 CLBs
XILINX FPGA’s (XC4025)
Design synthesis
Circuit synthesis
• derivation of transistor schematics from logic functions
- complementary CMOS
- pass transistor
- dynamic
- DCVSL (differential cascode voltage switch logic)
• transistor sizing
- performance modeling using RC equivalent circuits
- layout generation
• synthesis not popular due to designers' reluctance
Logic synthesis
• state transition diagrams, FSM, schematics, Boolean equations, truth tables, and HDL used
• synthesis
- combinational or sequential
- multilevel, PLA, or FPGA
• logic optimization for
- area, speed, power
- technology mapping
Logic optimization
• Espresso - two level minimization tool (UCB)
• state minimization and state encoding
• MIS - multilevel logic synthesis (UCB)
Example: S = (A ⊕ B) ⊕ Ci
Co = AB + ACi + BCi
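For the full adder (sum S = A ⊕ B ⊕ Ci, carry Co = AB + ACi + BCi), a multilevel form can reuse the intermediate node A ⊕ B in both outputs instead of building Co as a flat sum of products. The sketch below checks by exhaustive enumeration that one such factoring is equivalent to the two-level form. The factoring is a common textbook one and is not claimed to be what MIS generates.

```python
from itertools import product


def two_level(a, b, ci):
    """Sum as a three-input XOR, carry as a flat sum of products."""
    s = a ^ b ^ ci
    co = (a & b) | (a & ci) | (b & ci)
    return s, co


def multilevel(a, b, ci):
    """Same functions with the node p = A xor B shared by both outputs."""
    p = a ^ b
    s = p ^ ci
    co = (a & b) | (p & ci)
    return s, co


# Exhaustive check over all 8 input combinations.
for a, b, ci in product((0, 1), repeat=3):
    s, co = multilevel(a, b, ci)
    assert (s, co) == two_level(a, b, ci)
    assert a + b + ci == s + 2 * co   # the outputs encode the arithmetic sum
print("multilevel form matches the two-level form")
```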
Logic optimization
Multilevel implementation of adder generated by MIS II (cell library from University of Mississippi)
Architecture synthesis
• behavioral or high level synthesis
• optimizing translation e.g. pipelining
• Cathedral and HYPER tools
• HYPER tutorial and synthesis example:
http://infopad.eecs.berkeley.edu/~hyper
Architecture synthesis example
Architecture synthesis
Vertical and Orthogonal CMOS COSMOS
– Stack two MOSFETs under a common gate
– Improve only hole mobility by using strained SiGe channel
• pMOS transconductance equal to nMOS
– Reduce parasitics due to wiring and isolating the sub-nets
Conventional CMOS
COSMOS: Complementary Orthogonal Stacked MOS
Savas Kaya
Technology Base
• Strained Si/SiGe layers
– Built-in strain traps more carriers and increases mobility
• Equal and high electron and hole mobilities (Jung et al., p.460, EDL’03)
• SOI (silicon-on-insulator) substrates
– Active areas on buried oxide (BOX) layer
– Reduces unwanted DC leakage and AC parasitics
Mizuno et al., p.988, TED’03
Cheng et al., p.L48, SST’04
COSMOS Structure
• Single common gate: mid-gap metal or poly-SiGe
• Ultra-thin channels: 2-6nm to control threshold/leakage
– Strained Si1-xGex for holes (x0.3)
– Strained or relaxed Si for electrons
• Substrate: SOI
COSMOS Structure - 3D View I
• Single gate stack: mid-gap metal or poly-SiGe
– Must be engineered for a symmetric threshold
In units of µm
COSMOS Structure - 3D View II
• Conventional self-aligned contacts
– Doped S/D contacts: p- (blue) or n- (red) type
• Inter-dependence between gate dimensions:
[Figure: gate dimensions W and L labeled for the nMOS and pMOS devices]
COSMOS Gate Control
• A single gate to control both channels
– High-mobility strained Si1-xGex (x0.3) buried hole channel
• High Ge% eliminates parallel conduction and improves mobility
• Lowers the threshold voltage VT
– Electrons are in a surface channel
– Requires fine tuning for symmetric operation
[Figure: electron and hole densities versus Vgate [V], from -1.2 V to 1.2 V; threshold marked at nVT = 10^11 cm^-2]
3D Characteristics: 40nm Device
• Symmetric operation
– No QM corrections
• Lower VT
– Features in sub-threshold operation
• Related to p-i-n parasitic diode included in 3D
COSMOS Inverter
Top view Peel-off top views
• No additional processing
– Just isolate COSMOS layers and establish proper contacts
– Significantly shorter output metallization
3D TCAD Verification
• Inverter operation verified in 3D
40nm COSMOS NOT gate driving CL=1fF load
Applications
• Low power static CMOS:
– Should outperform conventional devices in terms of speed
• Multiple input circuit example: NOR gate
• Area tight designs:
– FPGA, Sensing/testing, power etc. ?