EE415 VLSI Design
Introduction
[Adapted from presentation by Kia Bazagran, University of Minnesota]
VLSI Design II
Section Outline
Administrative Issues
Semiconductor industry trends
Chip implementation methodologies
Design methodologies
What is This Course All About?
Prerequisite
» Basic CMOS design
» Static/dynamic circuit design
» CAD (Computer Aided Design) tools
What is different from “VLSI Design I”?
» Higher level of design (closer to architecture)
» Emphasis on performance, processor cores, fault tolerance
What is covered?
» Sequential logic
» Arithmetic circuits and subsystem design
» Parasitics, timing, synchronization, pipelining
» Memories
» Test and testability
» New issues and design techniques
Course Outline
Sequential Logic
» Static sequential circuits
» Dynamic sequential circuits
» Non-bistable sequential circuits
CMOS Designs
» Arithmetic & logic unit (ALU)
– Bitwise operations
– Datapath layout
» Adders
– Basic adders: carry propagation, Carry Look-ahead, Manchester Carry Chain
– More complex adders: Carry Save Adder, Brent-Kung
– Fast adders: Carry-Select adder, Wallace tree
Course Outline (cont)
CMOS Designs
» Multipliers
– Shift/Add multiplication
– Booth encoding
– Multiplication by constants
– Floating point multiplication
Interconnect and parasitics
» Parasitic capacitances
» Parasitic resistances
» Parasitic inductances
» Packaging
Course Outline (cont)
Timing
» Clock generation
» Clock skew
» Self-timed design
» Synchronization
» Pipelining
» Asynchronous design
System Architecture and Power
» Low Power Design in CMOS
Course Outline (cont)
CMOS Designs (cont)
» Shift/Rotate operations
» Memories
– Memory cells: static and dynamic
– Memory arrays: address decoders, sensors and amplifiers
Test and testability
» Fault models
» Design techniques: scan design, built-in self-test
New design techniques/platforms
» CORDIC algorithms
» Bit-serial computations
» [Recent circuit examples]
IC Products
Processors
» CPU, DSP, Controllers
Memory chips
» RAM, ROM, EEPROM
Analog
» Mobile communication, audio/video processing
Programmable
» PLA, FPGA
Embedded systems
» Used in cars, factories
» Network cards
System-on-chip (SoC)
Images: amazon.com
Semiconductor Industry Growth Rates
Source: http://www.icinsight.com/ (McClean Report)
More Demand for EDA
Source: http://www.edat.com/edac
CAE = Computer Aided Engineering
Growth in System Size
Source: http://www.edat.com/edac
CAGR = Compound Annual Growth Rate
Example: Intel Processor Sizes
Source: http://www.intel.com/
Intel386™ DX Processor
Intel486™ DX Processor
Pentium® Processor
Pentium® Pro & Pentium® II Processors
Silicon process technology (µm): 1.5, 1.0, 0.8, 0.6, 0.35, 0.25
Implementation Methodologies
[© Prentice Hall]
Digital Circuit Implementation Approaches
» Custom
» Semi-custom
– Cell-based: Standard Cells, Compiled Cells, Macro Cells
– Array-based: Pre-diffused (Gate Arrays), Pre-wired (FPGA)
Custom Design
Using custom design we can get exactly what we want.
However:
» Complex to design
» Takes weeks to fabricate
» High design costs
» High overhead (non-recurring engineering, NRE) costs
» How do we automate the mapping?
[© Hauck]
Standard Cells
Use regular layout
Can automate the mapping process, but
» Takes weeks to fabricate
» No economies of scale
[Figure: rows of standard cells (CELL1 to CELL16) between PWR and GND rails, with routing channels between the cell rows]
[© Hauck]
Combined Standard Cell and Full Custom
Use full custom for regular structures & critical paths
Standard cells handle complex logic & non-critical logic
[© Hauck]
Macrocell Design Methodology
[Figure labels: Macrocell, Interconnect Bus, Routing Channel]
Floorplan: defines the overall topology of the design, relative placement of modules, and global routes of busses, supplies, and clocks
Macrocell-Based Design Example
Video-encoder chip [Brodersen92]
[Figure labels: SRAM, SRAM, Routing Channel, Data paths, Standard cells]
Mask-Programmable Gate Array (MPGA)
Prefabricate all but the metal layers
[© Hauck]
Discrete Components
Prefabricate lots of small, simple parts. Wire them together.
[Figure: discrete D flip-flops (D, Q) wired together]
[© Hauck]
Gate Array: Sea-of-gates
[Figure labels: rows of uncommitted cells, routing channel, VDD, GND, polysilicon, metal, possible contact, In1 In2 In3 In4, Out]
Uncommitted cell
Committed cell (4-input NOR)
Sea-of-gate Primitive Cells
[Figure labels: NMOS, PMOS, oxide-isolation]
Using oxide-isolation / Using gate-isolation
Prefabricate all but the metal layers and the contacts
Programmable Logic Devices
Categories of prewired arrays (or field-programmable devices):
» Fuse-based (program-once)
» Non-volatile EPROM based
» RAM based
Recently:
» VPGA (Via-Programmable Gate Array)
» Structured ASIC
[© Prentice-Hall]
Programming Technologies
Mask-programmed, Antifuse, EPROM, EEPROM, SRAM
[Figure labels: n+ drain, n+ source, P-type silicon, access gate, floating gate (EPROM/EEPROM cell); polysilicon, field oxide, N+ diffusion, ONO dielectric (antifuse); Q, ~Q, Write (SRAM cell)]
[© Hauck]
RAMs, ROMs
Given a RAM/ROM with 8k bits of storage in a 1k × 8-bit organization:
» 10 address lines
» Can implement 8 arbitrary 10-input functions (but inefficiently)
[Figure: ROM addressed by inputs I1 I2 I3 (rows 000 to 111) with outputs A B C D E F G H]
[© Hauck]
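As a rough illustration of the ROM-as-logic idea, a 1k × 8 memory can be modeled as a table of 1024 bytes: the 10 address lines carry the function inputs, and each of the 8 data bits is one output function. This is a minimal sketch; the names build_rom and read_rom and the three example functions are hypothetical and not taken from the slides.

```python
ADDRESS_BITS = 10   # 1k words -> 10 address lines
DATA_BITS = 8       # 8 data bits -> up to 8 independent output functions


def build_rom(functions):
    """Precompute a 1k x 8 ROM image from up to 8 Boolean functions.

    Each function takes a tuple of 10 input bits and returns 0 or 1.
    Data bit k of every word stores the value of functions[k].
    """
    assert len(functions) <= DATA_BITS
    rom = []
    for address in range(1 << ADDRESS_BITS):
        bits = tuple((address >> i) & 1 for i in range(ADDRESS_BITS))
        word = 0
        for k, f in enumerate(functions):
            word |= (f(bits) & 1) << k
        rom.append(word)
    return rom


def read_rom(rom, address):
    """Evaluating all functions at once is a single memory read."""
    return rom[address]


# Hypothetical example functions of the 10 inputs.
def parity(bits):
    return sum(bits) % 2


def majority(bits):
    return int(sum(bits) > ADDRESS_BITS // 2)


def all_ones(bits):
    return int(all(bits))


rom = build_rom([parity, majority, all_ones])
```

The table always holds all 1024 entries per function, however simple the function is, which is one way to read the "inefficiently" remark.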
Field Programmable Gate Arrays (FPGAs)
Logic cells embedded in a general routing structure
Logic cells usually contain:
» 5-input function calculator
» Flip-flops
All features electronically (re)programmable
[Figure: array of RAM-based logic cells]
[© Hauck]
Multi-Mode Reconfigurable Systems
Tektronix PhaserCard printer controllers: different configurations for different printers
Andromeda Systems disk controller: field upgrades performed by modem
Radius pivoting monitor: different configurations for landscape & portrait
Honeywell tape drive: different configurations for read & write operations
[Figure: FPGA loaded from a ROM holding Config1, Config2, Config3, Config4]
[© Hauck]
Microprocessors & Microcontrollers
Microcontrollers are simple 1-chip computers optimized for embedded control
Cheap, can handle complex control flow (relatively slowly)
[Figure: microcontroller containing CPU, RAM, ROM, and I/O, connected to a sensor and an actuator]
[© Hauck]
Digital Signal Processors (DSPs)
Fast multiply-accumulate for signal filtering, etc.
[Figure labels: data RAM, registers, ALU, MUXes, multiplier, accumulator, shifters, PC, program controller, I/O controller, program ROM, data bus, program bus, address lines]
[© Hauck]
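To make the multiply-accumulate remark concrete, the sketch below computes a direct-form FIR filter with an explicit MAC loop, the kind of inner loop a DSP datapath is built to run quickly. The function name fir_filter and the sample values are illustrative assumptions, not part of the original slides.

```python
def fir_filter(samples, coefficients):
    """Direct-form FIR filter: y[n] = sum over k of c[k] * x[n - k].

    Samples before the start of the input are treated as zero.
    """
    outputs = []
    for n in range(len(samples)):
        accumulator = 0
        for k, c in enumerate(coefficients):
            if n - k >= 0:
                accumulator += c * samples[n - k]   # one multiply-accumulate
        outputs.append(accumulator)
    return outputs


# Two-tap moving sum of a short ramp.
filtered = fir_filter([1, 2, 3, 4], [1, 1])
```

Each output sample costs one multiply-accumulate per coefficient, so a datapath with a dedicated multiplier feeding an accumulator handles one tap per cycle.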
Implementation Alternatives
Discrete Components
Programmable Logic Devices
Gate Arrays
Field-Programmable Gate Arrays (FPGAs)
Full Custom
Standard Cells
[Figure: one thumbnail illustration per alternative]
Circuit synthesis
Derivation of the transistor schematics from logic functions
» complementary CMOS
» pass transistor
» dynamic
» DCVSL (differential cascode voltage switch logic)
Transistor sizing
» performance modeling using RC equivalent circuits
» layout generation
Synthesis not popular due to designers' reluctance
Logic synthesis
State transition diagrams, FSM, schematics, Boolean equations, truth tables, and HDL used
Synthesis
» combinational or sequential
» multi-level, PLA, or FPGA
Logic optimization for
» area, speed, power
» technology mapping
Logic optimization
Espresso: two-level minimization tool (UCB)
State minimization and state encoding
MIS: multilevel logic synthesis (UCB)
Example: S = (A ⊕ B) ⊕ Ci
Co = AB + ACi + BCi
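The example equations can be checked exhaustively. The sketch below assumes the sum output is the three-input XOR, S = A xor B xor Ci, with the carry-out being the majority function given above, and confirms that together they form a full adder. The function name full_adder is an illustrative choice.

```python
from itertools import product


def full_adder(a, b, ci):
    """One-bit full adder written directly from the two equations."""
    s = (a ^ b) ^ ci                       # S  = (A xor B) xor Ci
    co = (a & b) | (a & ci) | (b & ci)     # Co = AB + ACi + BCi
    return s, co


# Exhaustive check over all 8 input combinations:
# the pair (Co, S) must encode the arithmetic sum A + B + Ci.
for a, b, ci in product((0, 1), repeat=3):
    s, co = full_adder(a, b, ci)
    assert 2 * co + s == a + b + ci
```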
Architecture synthesis
» Behavioral or high-level synthesis
» Optimizing translation, e.g. pipelining
» Cathedral and HYPER tools
» HYPER tutorial and synthesis example: http://infopad.eecs.berkeley.edu/~hyper
IC Design Steps (cont.)
Specifications
High-level Description (behavioral VHDL, C)
Structural Description (structural VHDL)
Figs. [©Sherwani]
IC Design Steps (cont.)
Design representations: Specifications, High-level Description, Structural Description, Logic Description, Gate-level Design, Placed & Routed Design
Steps: Synthesis, Technology Mapping, Physical Design, Fabrication, Packaging
IC Design Steps (cont.)
Design representations: Specifications, High-level Description, Structural Description, Logic Description, Gate-level Design, Placed & Routed Design
Steps: Synthesis, Technology Mapping, Physical Design, Fabrication, Packaging
Equations shown in the figure:
X = (AB*CD) + (A+D) + (A(B+C))
Y = (A(B+C) + AC + D + A(BC+D))
Figs. [©Sherwani]
The Big Picture: IC Design Methods
Design methods: Full Custom, ASIC (Standard Cell Design), Standard Cell Library Design, RTL-Level Design
Compared by: Cost / Development Time, Quality, % Companies involved
Optimization: Levels of Abstraction
Algorithmic
» Encoding data, computation scheduling, balancing delays of components, etc.
Gate-level
» Reduce fan-out, capacitance
» Gate duplication, buffer insertion
Layout
» Move transistors driven by late inputs closer to the output
[Figure axes: Effectiveness, Level of detail]
Full Custom Design
Structural/RTL Description
[Figure blocks: Mem, Ctrl, Comp. Unit, Reg File, ...]
Component Design
Layouts [© Prentice Hall]
Place & Route
Floorplan [©Sherwani]
[Figure blocks: A/D, PLA, I/O, comp, RAM]
ASIC Design
HDL Programming
P_Inp: process (Reset, Clock)
begin
  if (Reset = '1') then
    sum <= ( others => '0' );
    input_nums_read <= '0';
    sum_ready <= '0';
add82 : kadd8 port map ( a => add_i1, b => add_i2, ci => carry, s => sum_o);
Mult_i1 <= sum_o(7 downto 0);
Structural/RTL Description
[Figure blocks: Mem, Ctrl, Comp. Unit, Reg File]
Cell library
[Figure: cells A, B, C, D from the library placed in rows]
Floorplan [©Sherwani]
More Issues to Consider
Area/speed trade-off
[Figure: propagation delay tp and area (mm2) versus number of bits N (0 to 20) for look-ahead, select, bypass, manchester, mirror, and static adders]
[© Prentice Hall]
More Issues to Consider (cont.)
Aspect ratio, area budgets, datapath layout
Power and clock grid
Approach I: signal and power lines parallel
Approach II: signal and power lines perpendicular
[Figure labels: Well, Control wires (M1), Wires (M1), Signal wires (M2), GND, VDD]
Figures: [© Prentice Hall]
Datapath Layout Example: Adder
[WE92] p.521
Standard cell layout Bit-slice cell layout
Architecture of a CPU
[Figure blocks: Mem, Control, Data path, Register File; signals: flags (overflow, zero, etc.), read/write, function]
Arithmetic and Logic Unit (ALU)
Functions
» Arithmetic (add, sub, inc, dec)
» Logic (and, or, not, xor)
» Comparison (<, >, <=, >=, !=)
Control signals
» Function selection
» Operation mode (signed, unsigned)
Output
» Operation result (data)
» Flags (overflow, zero, negative)
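The ALU interface listed above (function select, signed/unsigned mode, result plus flags) can be sketched behaviorally as follows. This is an illustrative model, not the course's design: the function name alu, the 8-bit default width, and the string opcodes are assumptions, and only a subset of the listed functions is modeled.

```python
def alu(op, a, b=0, width=8, signed=False):
    """Behavioral ALU model returning (result, flags) for a given operation."""
    mask = (1 << width) - 1
    a, b = a & mask, b & mask

    def msb(x):
        return (x >> (width - 1)) & 1

    if op in ("add", "sub"):
        # Subtraction is addition of the two's complement: a + ~b + 1.
        operand = b if op == "add" else (~b & mask)
        raw = a + operand + (0 if op == "add" else 1)
        result = raw & mask
        carry_out = raw >> width
        if signed:
            # Signed overflow: both addends share a sign the result lacks.
            overflow = msb(a) == msb(operand) and msb(result) != msb(a)
        else:
            # Unsigned: add overflows on carry, sub overflows on borrow.
            overflow = bool(carry_out) if op == "add" else not carry_out
    else:
        logic = {"and": a & b, "or": a | b, "xor": a ^ b, "not": ~a & mask}
        result = logic[op]
        overflow = False

    flags = {
        "overflow": bool(overflow),
        "zero": result == 0,
        "negative": bool(msb(result)),
    }
    return result, flags
```

Comparisons are not modeled separately here; in hardware they are commonly derived from the flags of a subtraction (for example, equality from the zero flag).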
Simple ALU Example
Tile identical processing elements [© Prentice Hall]
[Figure: bit slices Bit 3 to Bit 0, each containing Register, Adder, Shifter, Multiplexer; Data in, Data out, Control]
FPGA Architecture - Layout
Island FPGAs
» Array of functional units
» Horizontal and vertical routing channels connecting the functional units
» Versatile switch boxes
» Example: Xilinx, Altera
Row-based FPGAs
» Like standard cell design
» Rows of logic blocks
» Routing channels (fixed width) between rows of logic
» Example: Actel FPGAs
FPGA Architecture: Functional Units
Functional units
» RAM blocks (Xilinx): implement function truth table
» Multiplexers (Actel): build Boolean functions using muxes
» Logic gates, flip-flops: such as carry chains, used for high-performance computations
[Figure: RAM block with address lines (input) and output]
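The two main styles of functional unit can be compared with a small model: a RAM-style unit uses the inputs as an address into a stored truth table, while a mux-style unit builds the same function from 2-to-1 multiplexers by Shannon expansion. The sketch below is illustrative only; the names lut, mux2, and mux_tree are assumptions and do not describe any vendor's actual cell.

```python
from itertools import product


def lut(truth_table, inputs):
    """RAM-style unit: the input bits form the address of the stored output."""
    address = 0
    for i, bit in enumerate(inputs):
        address |= (bit & 1) << i
    return truth_table[address]


def mux2(select, d0, d1):
    """2-to-1 multiplexer."""
    return d1 if select else d0


def mux_tree(truth_table, inputs):
    """Mux-style unit: Shannon expansion on the last input, applied recursively.

    The lower half of the table is the cofactor for input = 0,
    the upper half is the cofactor for input = 1.
    """
    if not inputs:
        return truth_table[0]
    half = len(truth_table) // 2
    *rest, top = inputs
    return mux2(top,
                mux_tree(truth_table[:half], rest),
                mux_tree(truth_table[half:], rest))


# Three-input XOR stored as an 8-entry truth table.
xor3_table = [bin(i).count("1") % 2 for i in range(8)]
for a, b, c in product((0, 1), repeat=3):
    assert lut(xor3_table, (a, b, c)) == mux_tree(xor3_table, (a, b, c)) == a ^ b ^ c
```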
Programmable Switch Elements
Used in connecting:
» The I/O of functional units to the wires
» A horizontal wire to a vertical wire
» Two wire segments to form a longer wire segment
Programmable Switch Elements: Implementation
SRAM connected to the gate of a transistor (Xilinx)
[Figure: symbol, implementation]
Fuse / Anti Fuse (Actel)
[Figure: symbol, implementation]
Note: switches degrade the signals and slow them down
Routing Channels
Note: fixed channel widths (tracks)
Channel -> track -> segment
Segment length?
» Long: carry the signal longer, fewer “concatenation” switches, but might waste track
» Short: local connections, slow for longer connections
[Figure labels: channel, track, segment]
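The segment-length trade-off can be put into numbers with a deliberately simplified model: assume every segment on a track has the same length and a connection starts at a segment boundary. Under those assumptions (which are ours, not the slides'), the sketch counts the concatenation switches a connection passes through and the track length left unused. The function names are hypothetical.

```python
import math


def concatenation_switches(distance, segment_length):
    """Switches needed to join enough equal-length segments to span `distance`."""
    segments = math.ceil(distance / segment_length)
    return segments - 1


def wasted_track(distance, segment_length):
    """Track length occupied beyond what the connection actually needs."""
    segments = math.ceil(distance / segment_length)
    return segments * segment_length - distance


# A connection spanning 5 units of track:
short_switches = concatenation_switches(5, 1)   # many switches in series, so slower
long_waste = wasted_track(5, 4)                 # few switches, but unused track
```

Short segments put more switches in series on a long connection, and each switch adds delay; long segments reduce the switch count but tie up track that the connection does not use.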
Switch Boxes
Ideally, provide switches for all possible connections
Trade-off:
» Too many switches:
– Large area
– Complex to program
» Too few switches:
– Cannot route signals
One possible solution: Xilinx 4000
Programming
How to access all programmable elements?
» Pin limitation
– Chain all config bits in a shift register or use pipelining
– Partition the elements into subsets, treat each as a memory block
» Feasibility of access (Actel example)
– Consider the problem when designing the FPGA architecture
– Carefully schedule the programming
Are there “invalid” configurations?
– Yes! If two functional units drive the same line
– Avoid at architectural design or when programming
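One way to picture the shift-register approach to the pin limitation: all configuration bits sit on a single serial chain, so one data pin (plus a clock) is enough to load the whole device. The sketch below also includes a trivial check for the invalid case where two drivers are enabled on the same line. The class and function names are hypothetical, and the model ignores clocking and timing.

```python
class ConfigChain:
    """Configuration bits connected as one long shift register."""

    def __init__(self, length):
        self.bits = [0] * length

    def shift_in(self, bit):
        # The new bit enters at position 0; every stored bit moves one place on
        # and the bit at the end of the chain is dropped.
        self.bits = [bit & 1] + self.bits[:-1]

    def load(self, bitstream):
        """Serially load a bitstream through the single data input."""
        for bit in bitstream:
            self.shift_in(bit)


def drivers_conflict(enable_bits):
    """True if more than one driver is enabled onto the same line."""
    return sum(enable_bits) > 1


chain = ConfigChain(4)
chain.load([1, 0, 1, 1])
# The first bit shifted in ends up at the far end of the chain.
assert chain.bits == [1, 1, 0, 1]
```

A programming tool can run a check like drivers_conflict on the bitstream before loading it, which is one reading of "avoid when programming".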
Programming (cont.)
Too much detail! (tens of bits for each cell/switch block)
» Automated placement, routing and programming
» Design a simple structure so that tools can handle it
Partially reconfigurable?
» Extra control circuitry, more flexibility
» Runtime reconfigurable? (avoid conflicts with running components)
Pros and Cons
General architecture
» Slower than ASIC
» Less logic capacity (solution: reuse silicon area through reconfiguration)
» Flexible
Customization helps
» Instantiate many small processing elements: parallel processing
» Some operations faster (e.g., constant multiplication, bit-wise operations)
» More operations in parallel: reduce clock speed, reduce power consumption
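Constant multiplication is a convenient example of why customization helps: when one operand is fixed at configuration time, the multiplier reduces to a few shifts (plain wiring in hardware) and adders. The sketch below shows the idea for non-negative constants; the function names are illustrative assumptions.

```python
def multiply_by_constant(x, constant):
    """Multiply using only shifts and adds: one shifted term per 1-bit of the constant."""
    result = 0
    shift = 0
    while constant:
        if constant & 1:
            result += x << shift
        constant >>= 1
        shift += 1
    return result


def adders_needed(constant):
    """Adders required to sum the shifted terms (shifts themselves cost no logic)."""
    return max(bin(constant).count("1") - 1, 0)


# 10 = 0b1010, so 10 * x = (x << 3) + (x << 1): a single adder.
product_value = multiply_by_constant(7, 10)
```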