Closing the Gap Between ASIC and Custom: Chapters 12, 13, 14


Chapter 12: Semi-Custom Methods in a High-Performance Microprocessor Design

Custom Processor Design

IBM eServer zSeries (S/390 mainframe)
Physical design makes extensive use of hierarchy
  Each functional unit is partitioned as a macro
  Each macro unit is fully floorplanned
  Global wiring is done hierarchically
  Macros are characterized for timing, noise…
  Timing rules are generated using static transistor-level simulation
Circuit and physical design start as soon as sufficient logic is defined

Custom Processor Design

As the design matures, emphasis shifts from functional verification to logic modification and repartitioning to achieve timing closure
Efficiency, turn-around time and flexibility are as important as cycle time
Three types of macros
  Arrays
  Synthesized random logic macros (RLMs)
  Full custom dataflow
Done predominantly in static logic, with dynamic circuitry reserved for extremely critical functions

Custom Processor Design

Custom design is very effective when elements are identical across the bit range of the data stack
Complex numerical functions are usually far less regular across the stack
  Require more effort to produce in full custom
  Often timing critical
  Circuit architecture might evolve
Good candidates for semi-custom design

Semi-Custom design

The basic building block is a set of parameterized gates
  They form a basis set capable of covering most of the design space
  No directly associated layout

Overall Design flow

Circuit Tuning

Tools can be divided into dynamic and static tuning
  Dynamic tuning involves simulation with explicit waveforms and measures
  Static tuning formulates the optimization through static timing, optimizing slack in the presence of timing assertions
Large, non-bitslice circuits are impractical for dynamic tuning but well suited to static tuning
The tool used here is EinsTuner
  Built on top of a static transistor-level timing tool (EinsTLT), which combines a fast event-driven simulator (SPECS) with a timing tool (EinsTimer)
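The idea of static tuning (sizing devices until the slack against a timing assertion is non-negative) can be sketched on a toy inverter chain. This is only an illustration under stated assumptions: the delay model (parasitic plus driven capacitance over own size), the greedy upsizing loop, and the names path_delay and static_tune are all hypothetical, and the loop stands in for the formal optimizer a real tuner would use.

```python
def path_delay(sizes, load, parasitic=1.0):
    """Delay of an inverter chain in arbitrary units.

    Assumed model: each stage costs parasitic + (capacitance it drives / its own size).
    Stage i drives the next stage's size; the last stage drives `load`.
    """
    driven = list(sizes[1:]) + [load]
    return sum(parasitic + c / w for w, c in zip(sizes, driven))


def static_tune(sizes, load, required, step=1.25, max_iters=1000):
    """Greedily upsize one stage at a time until slack >= 0 or no move helps.

    sizes[0] is held fixed, playing the role of an input-capacitance assertion.
    Returns the tuned sizes and the final slack (required - delay).
    """
    sizes = list(sizes)
    for _ in range(max_iters):
        delay = path_delay(sizes, load)
        if required - delay >= 0:
            break  # timing assertion met
        best_index, best_delay = None, delay
        for i in range(1, len(sizes)):
            trial = sizes[:]
            trial[i] *= step
            trial_delay = path_delay(trial, load)
            if trial_delay < best_delay:
                best_index, best_delay = i, trial_delay
        if best_index is None:
            break  # no single upsizing move improves the slack
        sizes[best_index] *= step
    return sizes, required - path_delay(sizes, load)


if __name__ == "__main__":
    tuned, slack = static_tune([1.0, 1.0, 1.0], load=64.0, required=20.0)
    print("tuned sizes:", tuned, "slack:", slack)
```

No waveforms are simulated here; the loop only looks at path delay and slack, which is the property that makes the static formulation practical for large, irregular circuits.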

Cell Generation

Creates layouts corresponding to the parameterized gates
The authors use their own tool, C-cell, a script-based system designed to produce optimal layout
The tool supports semi-custom design
  Generates a set of layouts from cell specs
  Parses a schematic
  Converts between parameterized and standard (RLM library) cells
  Has an integrated floorplanning aid
  Layout post-processing (flattening, shape trimming)

Design example

24-bit adder
Eight 24-bit adder modules
Architecture
  CLA Ling adder
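For readers unfamiliar with carry lookahead, the generate/propagate idea behind a 24-bit CLA can be sketched in software. This is a plain parallel-prefix lookahead, not the Ling variant and not the circuit from the chapter; the function name cla_add and the zero carry-in are assumptions made for illustration.

```python
def cla_add(a, b, width=24):
    """Add two width-bit unsigned integers with carry lookahead (carry-in = 0).

    Bit i of `generate` is a_i AND b_i, bit i of `propagate` is a_i XOR b_i.
    A parallel-prefix pass widens each bit's group until it reaches bit 0,
    so all carries are known after about log2(width) combining steps.
    Returns (sum modulo 2**width, carry_out).
    """
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    generate = a & b
    propagate = a ^ b

    group_g, group_p = generate, propagate
    shift = 1
    while shift < width:
        # Combine each group with the group `shift` bits below it.
        group_g = group_g | (group_p & (group_g << shift))
        group_p = group_p & (group_p << shift)
        shift *= 2

    carries = (group_g << 1) & mask      # carry into bit i comes from bits 0..i-1
    total = propagate ^ carries
    carry_out = (group_g >> (width - 1)) & 1
    return total, carry_out


if __name__ == "__main__":
    print(cla_add(0x7FFFFF, 0x000001))   # sum wraps within 24 bits, carry_out reported separately
```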

Design example

Conclusion

Faster method than full custom
Competitive performance compared to full custom
Sometimes better performance, if the architecture selection for full custom is done non-optimally
Easy adaptation to global timing convergence is an advantage of semi-custom design

Chapter 13: Controlling Uncertainty in High Frequency Designs

Terminology

Actual frequency
Market frequency
Design frequency
Predicted frequency

Uncertainty Defined

Process uncertainty
  For example, in-die variation
Tool uncertainty
  Inaccuracy in the simulation and extraction tools
  For example, if inductance is not extracted, the predicted frequency of the design will be optimistic relative to the actual frequency
Design uncertainty
  Unpredictable variations in the design process between design iterations
  Variations in the execution of design methods across the chip

Uncertainty defined

Uncertainty in the manufacturing, tool, and design processes causes a gap between the predicted and actual frequencies, thus reducing the cycle time available for logic functionality
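The arithmetic behind this statement can be made concrete with a small sketch. The function names and the numbers in the usage lines are hypothetical; the only assumptions are that a guard band is subtracted from the clock period and that an unmodeled delay error scales every path by the same fraction.

```python
def usable_logic_time_ps(design_freq_mhz, guard_band_ps):
    """Cycle time left for logic once a guard band for uncertainty is reserved."""
    period_ps = 1e6 / design_freq_mhz      # 1 MHz corresponds to a 1e6 ps period
    return period_ps - guard_band_ps


def actual_freq_mhz(predicted_freq_mhz, delay_error_fraction):
    """Frequency reached when real path delay exceeds the predicted delay.

    delay_error_fraction = 0.25 means paths are 25% slower than the tools predicted.
    """
    return predicted_freq_mhz / (1.0 + delay_error_fraction)


if __name__ == "__main__":
    print(usable_logic_time_ps(1000.0, 100.0))   # period minus guard band, in ps
    print(actual_freq_mhz(1000.0, 0.25))         # optimistic prediction vs. silicon
```

Either way the uncertainty is paid for: as cycle time given up in advance (the guard band) or as frequency lost after manufacturing.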

Clock tree example of design uncertainty

Uncertainty and Frequency

When the process contains uncertainty
  Time and energy are wasted on non-critical parts of the design
  The result is reduced frequency or a delay in time-to-market

Example of tool uncertainty

Focused methodology development

Reducing the uncertainty minimizes the number of paths in the WNS (worst negative slack) bucket, and thus the effort required to address them
It has been shown that the closer the correlation between the predicted and actual frequencies for the paths in the WNS bucket, the higher the actual frequency will be

Methods for removing paths from the uncertainty window

Traditionally, CAD algorithms optimize for WNS
  This results in a large number of paths in the WNS bucket
Better to use a total negative slack (TNS) algorithm
  TNS is defined as the sum of negative slacks
  It tries to improve every negative path until it reaches zero slack
The algorithm's goal is to remove all paths from the negative region; the design goal is to remove all paths from the WNS bucket
The design frequency should be set so that all paths in the WNS bucket are in the negative region
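The two cost measures are easy to compare on a handful of path slacks. The sketch below is illustrative only: the path names and slack values are made up, and the bucket is assumed to mean "all paths within a chosen window of the worst slack", which is one plausible reading of the term.

```python
def worst_negative_slack(slacks):
    """WNS: the single most negative slack (a WNS-driven tool optimizes only this)."""
    return min(slacks.values())


def total_negative_slack(slacks):
    """TNS: the sum of all negative slacks, so every failing path contributes."""
    return sum(s for s in slacks.values() if s < 0)


def wns_bucket(slacks, window):
    """Paths whose slack lies within `window` of the worst slack (assumed definition)."""
    limit = worst_negative_slack(slacks) + window
    return sorted(name for name, s in slacks.items() if s <= limit)


if __name__ == "__main__":
    # Hypothetical slacks in picoseconds.
    slacks = {"p0": -50, "p1": -45, "p2": -10, "p3": 5, "p4": 20}
    print("WNS:", worst_negative_slack(slacks))
    print("TNS:", total_negative_slack(slacks))
    print("bucket:", wns_bucket(slacks, window=10))
```

Fixing only p0 improves WNS by 5 ps and leaves p1 as the new worst path, while a TNS cost keeps rewarding work on p1 and p2 as well, which is why it pulls more paths out of the negative region.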

Uncertainty Lifetime

Conclusion

A gap between actual and predicted frequency implies uncertainty in the manufacturing, tool, and design processes
  Design teams will work on the wrong paths
If the uncertainty in the analysis of the design is minimized, resources can be managed better and the gain from costly local optimizations is higher

Conclusion

To control design and tool uncertainty, take the following steps
  List all sources of uncertainty
  Develop an uncertainty plan
  Reduce the guard band as much as possible
  Use a TNS-based cost model
  Tune the design frequency
  Toward the end of the design, treat all paths within sigma of the design as equal, and reduce uncertainty by reducing automation
  Push CAD vendors toward algorithm development
Finally, remember that any gap between predicted and actual frequency lowers the actual frequency

Chapter 14: Increasing Circuit Performance through Statistical Design Techniques

Process variability

As CMOS technology keeps scaling, the magnitude of process variability will increase
Systematic variability is due to the interaction between the manufacturing process and the properties of the design
  Optical proximity effects cause polysilicon features to vary depending on the surrounding local layout
  Inter-layer dielectric thickness varies because CMP depends on the local wire density
The ability to improve manufacturing tolerances is limited
  Mask fabrication
  Overlay control

Process variability

Intra-chip variation should be taken into account
  A recent study shows that for 0.13 um CMOS, intra-chip variation accounts for 35% of the variation in MOS channel length
  For 0.07 um CMOS the intra-chip share would be 60%
Intra-chip variation is caused by the emergence of a number of variation-generating mechanisms located at the interface between design and process

Process variability

Identifying sources of variation

Needed to decide which of the multiple sources and patterns of variation deserve the most attention
  For example, by impact on path delay
The exact variability contribution of a process parameter is defined by
  Sensitivity of the circuit performance to the parameter
  Magnitude of the parameter's variation
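A minimal way to rank sources along these two factors is to multiply each parameter's sensitivity by the magnitude of its variation. The sketch below assumes a first-order (linear) delay model and independent parameters; the parameter names and numbers are hypothetical and only show the bookkeeping.

```python
import math


def rank_contributions(params):
    """Rank parameters by |sensitivity| * sigma, largest contribution first.

    params maps a parameter name to (sensitivity, sigma), where sensitivity is
    the change in path delay per unit of the parameter and sigma is the
    magnitude (standard deviation) of that parameter's variation.
    """
    contributions = {
        name: abs(sensitivity) * sigma
        for name, (sensitivity, sigma) in params.items()
    }
    return sorted(contributions.items(), key=lambda item: item[1], reverse=True)


def total_sigma(params):
    """Combined delay sigma, assuming the parameters vary independently."""
    return math.sqrt(sum((abs(s) * sigma) ** 2 for s, sigma in params.values()))


if __name__ == "__main__":
    # Hypothetical values: (ps of delay per unit, units of variation).
    params = {
        "channel_length": (2.0, 4.0),
        "oxide_thickness": (12.0, 0.5),
        "threshold_voltage": (0.5, 2.0),
    }
    for name, contribution in rank_contributions(params):
        print(name, contribution)
    print("total sigma:", total_sigma(params))
```

A parameter with a large sensitivity but a tiny spread can matter less than one with a modest sensitivity and a wide spread, which is why both factors are needed.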

Identifying sources of variation

Increasing performance through probabilistic timing modeling

How does intra-chip variation differ from inter-chip variation?
Usually in high-performance chips, delay is optimized by moving delay off the critical path onto paths with slack
  This results in a chip with a large number of paths near the maximum delay
Inter-chip variation affects every path similarly
The effect of intra-chip variation depends on the surroundings and the die position
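The difference between the two kinds of variation shows up clearly in a small Monte Carlo experiment. The sketch assumes Gaussian variation, a chip whose paths all share the same nominal delay, and intra-chip shifts that are independent per path (real intra-chip variation is spatially correlated); the function name and all numbers are hypothetical.

```python
import random
import statistics


def mean_worst_delay(n_paths, nominal_ps, sigma_inter_ps, sigma_intra_ps,
                     trials=1000, seed=0):
    """Average worst-path delay over many simulated dies.

    Each die gets one shared (inter-chip) shift applied to every path, plus an
    independent (intra-chip) shift per path. The chip is as slow as its slowest path.
    """
    shared_rng = random.Random(seed)       # die-to-die shifts
    local_rng = random.Random(seed + 1)    # per-path shifts within a die
    worst_delays = []
    for _ in range(trials):
        shared = shared_rng.gauss(0.0, sigma_inter_ps)
        worst_delays.append(max(
            nominal_ps + shared + local_rng.gauss(0.0, sigma_intra_ps)
            for _ in range(n_paths)
        ))
    return statistics.mean(worst_delays)


if __name__ == "__main__":
    for n in (1, 200):
        inter_only = mean_worst_delay(n, 1000.0, 20.0, 0.0)
        intra_only = mean_worst_delay(n, 1000.0, 0.0, 20.0)
        print(n, "paths: inter-only", inter_only, "intra-only", intra_only)
```

With inter-chip variation alone, the average worst delay does not depend on how many near-critical paths there are. With intra-chip variation, the more paths sit near the maximum delay, the more likely one of them lands in the slow tail, so the expected chip delay rises with the number of such paths.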

Increasing performance through probabilistic timing modeling

The conservatism of traditional timing tools is more disadvantageous for ASICs
  No testing at full speed, as is done for custom circuits
By implementing a probabilistic timing analysis methodology, the conservatism built into standard ASIC design can be reduced
  By accepting a lower yield, performance can be improved
  A yield of 98% (instead of 99.99%) reduces conservatism by 17%
It has been noted that an ASIC chip produced in a foundry can run up to 40% faster than predicted by standard timing analysis
Vendors would trade yield for performance if the revenue from faster chips justifies the additional expense in lost yield and testing overhead
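The yield-for-speed trade can be illustrated with a normal-distribution sketch. It assumes chip delay is Gaussian with a known mean and sigma, which is a simplification, and it is not meant to reproduce the 17% figure quoted above; the function names and the example mean and sigma are hypothetical.

```python
from statistics import NormalDist


def sigma_margin(yield_fraction):
    """Number of standard deviations of margin needed to cover the given yield."""
    return NormalDist().inv_cdf(yield_fraction)


def signoff_delay_ps(mean_ps, sigma_ps, yield_fraction):
    """Delay that a fraction `yield_fraction` of chips will meet or beat."""
    return mean_ps + sigma_margin(yield_fraction) * sigma_ps


if __name__ == "__main__":
    for target in (0.9999, 0.98):
        print(target, round(sigma_margin(target), 2),
              round(signoff_delay_ps(1000.0, 50.0, target), 1))
```

Under this assumption a 99.99% target needs about 3.7 sigma of margin while 98% needs about 2.1 sigma, so relaxing the yield target directly shortens the delay that has to be guaranteed.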

Increasing performance through design for manufacturability techniques

As mentioned, intra-chip variation is affected by layout
Most of the techniques presented are already in use in full custom design
Optical proximity correction (OPC)
  Covers a wide range of reticle enhancement techniques
  Geometrical structures are added to the mask
  Critical dimension (L) and resolution variable
  Corner rounding and line pull-back
Phase shifting mask (PSM)

Increasing performance through design for manufacturability techniques

Currently (2002), a significant effort is under way to provide cell libraries that are OPC- and PSM-compliant
  Allows ASIC designers to benefit from those techniques
Inserting dummy features into regions of lesser density increases uniformity
  Improves the process uniformity of CMP
  The downside is increased coupling capacitance, with the associated delay and signal integrity dangers
  It might be better to use a better model instead of adding metal fill, or to use both
Systematic, spatially correlated variation caused by lens aberrations would need a mask-level spatial correlation algorithm performed in conjunction with OPC
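The dummy-fill idea can be sketched as a density check over layout windows. This is a simplified illustration, assuming the layout has already been reduced to a grid of metal densities per window and that fill simply raises sparse windows to a minimum density; the function names and the grid values are hypothetical, and the coupling-capacitance cost mentioned above is not modeled.

```python
def dummy_fill_plan(density, min_density):
    """Fill fraction to add per window so every window reaches min_density."""
    return [[max(0.0, min_density - d) for d in row] for row in density]


def apply_fill(density, fill):
    """Densities after the planned dummy fill has been inserted."""
    return [[d + f for d, f in zip(d_row, f_row)]
            for d_row, f_row in zip(density, fill)]


def density_range(density):
    """Spread between the densest and sparsest window (smaller is more uniform)."""
    flat = [d for row in density for d in row]
    return max(flat) - min(flat)


if __name__ == "__main__":
    # Hypothetical metal density per layout window (fraction of area covered).
    density = [[0.125, 0.5],
               [0.25, 0.75]]
    fill = dummy_fill_plan(density, min_density=0.25)
    filled = apply_fill(density, fill)
    print("fill:", fill)
    print("range before:", density_range(density), "after:", density_range(filled))
```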

Increasing performance through design for manufacturability techniques

Is the parameter variation systematic or random?
  Systematic variation can be modeled deterministically
  Random variation (or variation too complex to model deterministically) is best described by statistical means

Conclusion

Intra-chip variation of process parameters is increasing
  Makes the timing estimates provided by standard design methodology overly conservative
  Downgrades the speed
New methods are needed for timing analysis
ASICs suffer more from these effects
  No full-speed tests and no trading of yield for speed