TRACK D: A breakthrough in logic design drastically improving performances from 65/55nm and below/...
![Page 1: TRACK D: A breakthrough in logic design drastically improving performances from 65/55nm and below/ Ilan sever](https://reader036.fdocuments.us/reader036/viewer/2022081400/5491bc02ac79592f288b45ef/html5/thumbnails/1.jpg)
May 1, 2013 1
A breakthrough in logic design drastically improving performance from 65/55 nm and below
Ilan Sever
Library Group CTO and Israeli Subsidiary Manager
DOLPHIN INTEGRATION
![Page 2: TRACK D: A breakthrough in logic design drastically improving performances from 65/55nm and below/ Ilan sever](https://reader036.fdocuments.us/reader036/viewer/2022081400/5491bc02ac79592f288b45ef/html5/thumbnails/2.jpg)
Corporate ID
• Incorporated as French SA in 1985
• Listed on Alternext of NYSE in 2007
• as the provider of design products for mixed-signal SoCs
• now active from 180 nm to 28 nm, with 135 Design Engineers
– plus Field Application Engineers and SoC Integration Engineers expert at hardware modeling, to provide SoCs with the best subsystems:
• High Resolution Audio (Converters and Audio Signal Processing)
• High Resolution Measurement (Converters for Power Metering, MEMS, etc.)
• Low-power Storage (Register Banks and Memories)
• Low-power Microcontrol Logic (80x51 Legacy, eFlash Caches, Coprocessors...)
– and innovative libraries of Standard Cells and Memory Registers
– with Power Regulation, Reference, Clock & Detector Networking
– where the major differentiator is the Flexibility of IP configurations (FLIP)
![Page 3: TRACK D: A breakthrough in logic design drastically improving performances from 65/55nm and below/ Ilan sever](https://reader036.fdocuments.us/reader036/viewer/2022081400/5491bc02ac79592f288b45ef/html5/thumbnails/3.jpg)
Dolphin in Israel
• Incorporated in October 2009 as Dolphin Integration Ltd
• With the charter to develop innovative small-capacity memory architectures
• 7 employees, all engineers
• Developed three product families in technologies ranging from 0.13 µm down to 55 nm:
– Innovative high-density 1PRFile “AURA”: up to 25% smaller than the competitor's solution with half the dynamic power. Licensed by TSMC and by leading IDMs and fabless companies
– Innovative high-density DPRFILE “ERIS”: up to 35% smaller than the competitor's solution while providing two full Read+Write ports (as opposed to 1R1W 2-port registers)
– Patent-pending “CARME” multi-port register allowing seamless replacement of flip-flops and extremely high-speed asynchronous access for acceleration of digital blocks
![Page 4: TRACK D: A breakthrough in logic design drastically improving performances from 65/55nm and below/ Ilan sever](https://reader036.fdocuments.us/reader036/viewer/2022081400/5491bc02ac79592f288b45ef/html5/thumbnails/4.jpg)
Market trends: boom of average SoC clock speed
• Consumer electronics and mobile devices drive the need for higher SoC performance
– High performance required for the embedded processor
– High density and low power required for the rest of the SoC
• Targeted applications
– Smartphone
– Multimedia
– Gaming
– Computing
– …
Source: Kurzweil
![Page 5: TRACK D: A breakthrough in logic design drastically improving performances from 65/55nm and below/ Ilan sever](https://reader036.fdocuments.us/reader036/viewer/2022081400/5491bc02ac79592f288b45ef/html5/thumbnails/5.jpg)
Design techniques for improving performance of critical paths on logic blocks
• Logic designers can leverage 4 solutions to improve the performance of logic blocks while maintaining the best density/power trade-off

| Design technique | Drawbacks | Impact |
|---|---|---|
| Multi-process support: use LP process for power-critical circuits and G process for speed-critical circuits | High leakage of the G process | Leakage loss |
| Multi-Vt support in standard cells: use LVt cells with improved performance in critical paths | An additional LVt layer is needed; high leakage of LVt cells | Leakage loss |
| Multi-track support in standard cells: use 7/8-track libraries for density-optimized blocks and 10/12/14-track libraries for speed-critical blocks | Most libraries are not path-mixable, so optimization is limited to the whole logic block level: cells are oversized for all non-critical paths of the block | Area loss |
| CARME bit-cell based register packs: use CARME for speed-critical registers | - | - |
![Page 6: TRACK D: A breakthrough in logic design drastically improving performances from 65/55nm and below/ Ilan sever](https://reader036.fdocuments.us/reader036/viewer/2022081400/5491bc02ac79592f288b45ef/html5/thumbnails/6.jpg)
Barriers and challenges in optimizing the register files within a logic design
• Flexibility of configurations: #words, #bits (unlike custom solutions)
• Reset operation (does not exist in SRAM-based macros)
• Scan/DFT (SRAM-based macros do not support scan and require BIST)
• Write & Read access protocol and speed
• Multi ports
• Usage within a standard logic flow
• Automatic P&R inside an area of standard logic rows
• Dynamic consumption and IR-drop during read/write activity
• Support for power-down & retention modes
• Area: always a key factor
![Page 7: TRACK D: A breakthrough in logic design drastically improving performances from 65/55nm and below/ Ilan sever](https://reader036.fdocuments.us/reader036/viewer/2022081400/5491bc02ac79592f288b45ef/html5/thumbnails/7.jpg)
Cell-compatible register packs CARME: key features

| Property/Challenge | Synthesizable FF-Based register | Synthesizable Latch-Based register | SRAM-Based register | CARME Register Pack |
|---|---|---|---|---|
| Reset | Yes | Yes | No | Yes |
| Scan / DFT | Yes | No (Need BIST) | No (Need BIST) | Yes |
| Write access | Synchronous | Synchronous | Synchronous | Synchronous, optional asynchronous write-through |
| Read access | Asynchronous | Asynchronous | Synchronous | Asynchronous |
| Multi Port | Yes | Yes | No | Yes |
| Placement and Routability | Standard P&R | Standard P&R | Hard macro placement outside logic rows | Hard macro compatible with placement in logic rows |
![Page 8: TRACK D: A breakthrough in logic design drastically improving performances from 65/55nm and below/ Ilan sever](https://reader036.fdocuments.us/reader036/viewer/2022081400/5491bc02ac79592f288b45ef/html5/thumbnails/8.jpg)
Cell-compatible register packs CARME: key features
• Brand-new kind of bit-cell based generator which can be used as an alternative to standard-cell based implementation for storage elements such as registers
– CARME instances act exactly as synthesized registers, thus ensuring a seamless replacement
• CARME is the ideal solution for those who want to improve speed and, even further, logic density and dynamic power
• Traditional registers, once placed, are unstructured and widespread unless hierarchically placed and routed. In contrast, CARME registers are structured as “packs” to facilitate RTL engineering but still enjoy the flexibility of a generator
![Page 9: TRACK D: A breakthrough in logic design drastically improving performances from 65/55nm and below/ Ilan sever](https://reader036.fdocuments.us/reader036/viewer/2022081400/5491bc02ac79592f288b45ef/html5/thumbnails/9.jpg)
Cell-compatible register packs CARME: performance @ 65 nm LP
• Benchmark results after synthesis (with scan insertion) on Motu Uta V5
Process: TSMC 65 nm LP
Standard cell library performances are for SVt
PVT used for timings: SS; 1.08 V; 125°C
Accuracy of results for CARME: speed +/-10%, area +/-5%
18% gain in density
22% gain in speed
![Page 10: TRACK D: A breakthrough in logic design drastically improving performances from 65/55nm and below/ Ilan sever](https://reader036.fdocuments.us/reader036/viewer/2022081400/5491bc02ac79592f288b45ef/html5/thumbnails/10.jpg)
CARME vs. Alternatives
• Benchmark: implementation of a 16x16 Look-Up Table (TSMC 65 nm LP)

| Property/Challenge | Synthesizable FF-Based | Synthesizable Latch-Based | SRAM-Based | CARME Register Pack |
|---|---|---|---|---|
| Area (65LP) | 3973 µm² | 1965 µm² | 1530 µm² | 2704 µm² |
| Speed (access time, typical) | 0.39 ns | 0.6-0.8 ns | 0.8-1.0 ns | 0.22 ns |
| Power @ 1 GHz | 2.03 mW | 0.80 mW | 1.43 mW | 0.89 mW |
| Reset | Yes | Yes | No | Yes |
| Write access | Synchronous | Synchronous | Synchronous | Synchronous |
| Read access | Asynch. | Synchronous | Synchronous | Asynch. |
| Multi Port | Yes | Yes | No | Yes |
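
The benchmark figures can be turned into relative changes with a few lines of Python. This is only a sketch for reading the table: the dictionary keys and the function name are made up for illustration, and the two access-time ranges are represented by their midpoints, which is an assumption rather than something the slide states.

```python
# Figures copied from the 16x16 look-up-table benchmark table (TSMC 65 nm LP).
# Access-time ranges (0.6-0.8 ns, 0.8-1.0 ns) are replaced by their midpoints,
# which is an assumption made only for this illustration.
BENCHMARK = {
    "FF-based":    {"area_um2": 3973, "access_ns": 0.39, "power_mw": 2.03},
    "Latch-based": {"area_um2": 1965, "access_ns": 0.70, "power_mw": 0.80},
    "SRAM-based":  {"area_um2": 1530, "access_ns": 0.90, "power_mw": 1.43},
    "CARME":       {"area_um2": 2704, "access_ns": 0.22, "power_mw": 0.89},
}


def relative_change(metric, candidate="CARME", baseline="FF-based"):
    """Return (candidate - baseline) / baseline for one metric, as a fraction."""
    base = BENCHMARK[baseline][metric]
    cand = BENCHMARK[candidate][metric]
    return (cand - base) / base


if __name__ == "__main__":
    for baseline in ("FF-based", "Latch-based", "SRAM-based"):
        changes = ", ".join(
            f"{metric}: {relative_change(metric, baseline=baseline):+.0%}"
            for metric in ("area_um2", "access_ns", "power_mw")
        )
        print(f"CARME vs {baseline}: {changes}")
```

Against the flip-flop baseline, the table's numbers work out to roughly 32% less area, 44% shorter access time and 56% less power at 1 GHz. Against the latch-based and SRAM-based options, CARME is larger in area but has the shortest access time of the four.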
![Page 11: TRACK D: A breakthrough in logic design drastically improving performances from 65/55nm and below/ Ilan sever](https://reader036.fdocuments.us/reader036/viewer/2022081400/5491bc02ac79592f288b45ef/html5/thumbnails/11.jpg)
Fast Asynchronous Read
• READ is asynchronous
• Can support up to 4 independent read ports
[Timing diagram: read_addr and data_out waveforms, annotated with the read_addr => data_out delay]
![Page 12: TRACK D: A breakthrough in logic design drastically improving performances from 65/55nm and below/ Ilan sever](https://reader036.fdocuments.us/reader036/viewer/2022081400/5491bc02ac79592f288b45ef/html5/thumbnails/12.jpg)
CARME compiler highlights
• Architecture
– Based on patentable bit-cell
– Optimized for easy & risk-free integration within standard-cell rows
• Flexibility
– 2 to 128 words
– 4 to 144 bits wide
– Up to 4 independent read ports
• Features & Benefits
– Very fast asynchronous read operation
– Synchronous write with optional fast write-through
– 1 write port, multiple read ports
– Reset function
– Retention mode
– Byte/bit-write control
[Figure: CARME register pack 16x16, TSMC 65LP, access time 220 ps]
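
To make the port behaviour concrete, here is a minimal behavioural sketch in Python of a register pack with one synchronous write port and several asynchronous read ports. It is not Dolphin's model or the LogiWare Verilog: the class name, the method names and the exact write-through semantics (a read sees the pending write data before the clock edge) are assumptions chosen for illustration. Only the configuration ranges come from the slide.

```python
class RegisterPackModel:
    """Illustrative model: 1 synchronous write port, N asynchronous read ports."""

    def __init__(self, words=16, bits=16, read_ports=2):
        # Ranges taken from the slide: 2-128 words, 4-144 bits, up to 4 read ports.
        if not 2 <= words <= 128:
            raise ValueError("words must be in 2..128")
        if not 4 <= bits <= 144:
            raise ValueError("bits must be in 4..144")
        if not 1 <= read_ports <= 4:
            raise ValueError("read_ports must be in 1..4")
        self.words, self.bits, self.read_ports = words, bits, read_ports
        self.mask = (1 << bits) - 1
        self.mem = [0] * words
        self._pending = None  # write presented but not yet clocked in

    def reset(self):
        """Clear every word and drop any pending write."""
        self.mem = [0] * self.words
        self._pending = None

    def request_write(self, addr, data, bit_enable=None):
        """Present a write; it only takes effect at the next clock_edge()."""
        enable = self.mask if bit_enable is None else bit_enable & self.mask
        self._pending = (addr, data & self.mask, enable)

    def clock_edge(self):
        """Commit the pending write, honouring the per-bit write enable."""
        if self._pending is not None:
            addr, data, enable = self._pending
            self.mem[addr] = (self.mem[addr] & ~enable) | (data & enable)
            self._pending = None

    def read(self, port, addr, write_through=False):
        """Asynchronous read: returns the word at addr without waiting for a clock.

        With write_through=True (assumed semantics), a pending write to the
        same address is visible immediately.
        """
        if not 0 <= port < self.read_ports:
            raise ValueError("no such read port")
        value = self.mem[addr]
        if write_through and self._pending and self._pending[0] == addr:
            _, data, enable = self._pending
            value = (value & ~enable) | (data & enable)
        return value
```

In this sketch, a write requested to address 3 is invisible to a plain read until `clock_edge()` is called, while a read with `write_through=True` returns the new data immediately. Passing `bit_enable=0x00FF` updates only the low byte, which mirrors the byte/bit-write control listed above.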
![Page 13: TRACK D: A breakthrough in logic design drastically improving performances from 65/55nm and below/ Ilan sever](https://reader036.fdocuments.us/reader036/viewer/2022081400/5491bc02ac79592f288b45ef/html5/thumbnails/13.jpg)
CARME compiler highlights
• Proprietary bitcell features:
• Scannable
• Resettable
• High-speed write
• Support for multiple high-speed read ports
• Area efficient: ½ of a normal D-FF
• Low power: ½ of a normal D-FF
• Retention-ready: replaces retention FF
• Non-pushed rules: easily retargetable
![Page 14: TRACK D: A breakthrough in logic design drastically improving performances from 65/55nm and below/ Ilan sever](https://reader036.fdocuments.us/reader036/viewer/2022081400/5491bc02ac79592f288b45ef/html5/thumbnails/14.jpg)
Basic Architecture
• Up to 16 bitcells in a pack
• All outputs are routed to the distribution plane
[Block diagram labels: Address Bus, Data Bus, Output Mux]
![Page 15: TRACK D: A breakthrough in logic design drastically improving performances from 65/55nm and below/ Ilan sever](https://reader036.fdocuments.us/reader036/viewer/2022081400/5491bc02ac79592f288b45ef/html5/thumbnails/15.jpg)
Multiple read ports
• Add read ports in a modular way without complexity or performance degradation
[Block diagram labels: Addr Bus A, OutMux Port A, DataOut Port A; Addr Bus B, OutMux Port B, DataOut Port B]
![Page 16: TRACK D: A breakthrough in logic design drastically improving performances from 65/55nm and below/ Ilan sever](https://reader036.fdocuments.us/reader036/viewer/2022081400/5491bc02ac79592f288b45ef/html5/thumbnails/16.jpg)
CARME compiler highlights
• Scannable latch array
[Schematic labels: ScanCK, ScanCK_N]
![Page 17: TRACK D: A breakthrough in logic design drastically improving performances from 65/55nm and below/ Ilan sever](https://reader036.fdocuments.us/reader036/viewer/2022081400/5491bc02ac79592f288b45ef/html5/thumbnails/17.jpg)
CARME compiler highlights
• Flexible number of read ports
[Figure panels: 1 Port, 2 Ports, 4 Ports]
![Page 18: TRACK D: A breakthrough in logic design drastically improving performances from 65/55nm and below/ Ilan sever](https://reader036.fdocuments.us/reader036/viewer/2022081400/5491bc02ac79592f288b45ef/html5/thumbnails/18.jpg)
CARME compiler highlights
• Fits inside logic rows: zero overhead for spacers, power rings, wrappers
• Custom layout fits number of horizontal & vertical tracks
• IR-drop-aware placement: shared among rows
• Just like a big standard cell!
![Page 19: TRACK D: A breakthrough in logic design drastically improving performances from 65/55nm and below/ Ilan sever](https://reader036.fdocuments.us/reader036/viewer/2022081400/5491bc02ac79592f288b45ef/html5/thumbnails/19.jpg)
CARME compiler highlights
• Routing-aware structure
• Feed-through over the cell
![Page 20: TRACK D: A breakthrough in logic design drastically improving performances from 65/55nm and below/ Ilan sever](https://reader036.fdocuments.us/reader036/viewer/2022081400/5491bc02ac79592f288b45ef/html5/thumbnails/20.jpg)
CARME performance @ 65 nm LP
• Register performance

| Block name | Register size | Configuration | Speed, write operation (ps) | Speed, read operation (ps) | Dynamic power (µA/MHz) |
|---|---|---|---|---|---|
| MCU OR1200 | 32x32 | 2R1W | 497 | 611 | 1.3 |
| ALU CHRONOS | 16x32 | 3R1W | 406 | 490 | 0.52 |
| USB | 32x8 | 1R1W | 358 | 442 | 0.28 |
| UART | 16x8 | 1R1W | 332 | 420 | 0.19 |
| SPI | 4x8 | 1R1W | 315 | 240 | 0.15 |

PVT used for timings: SS; 1.08 V; 125°C
PVT used for dynamic power consumption: TT; 1.2 V; 25°C
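
The dynamic figures are given as current per megahertz, so an absolute power estimate needs a clock frequency and the supply voltage. The sketch below shows the arithmetic under stated assumptions: the 200 MHz clock is hypothetical, the 1.2 V supply is the TT condition quoted for the power figures, and current is assumed to scale linearly with frequency at the activity level used for the table.

```python
# Dynamic current figures (uA/MHz) copied from the register performance table.
DYNAMIC_UA_PER_MHZ = {
    "MCU OR1200 32x32 2R1W": 1.3,
    "ALU CHRONOS 16x32 3R1W": 0.52,
    "USB 32x8 1R1W": 0.28,
    "UART 16x8 1R1W": 0.19,
    "SPI 4x8 1R1W": 0.15,
}

SUPPLY_V = 1.2  # TT corner supply voltage quoted for the power figures


def dynamic_power_uw(ua_per_mhz, clock_mhz, supply_v=SUPPLY_V):
    """P = I * V with I = (uA/MHz) * f(MHz); the result is in microwatts."""
    return ua_per_mhz * clock_mhz * supply_v


if __name__ == "__main__":
    clock_mhz = 200.0  # hypothetical operating frequency, not from the slides
    for name, figure in DYNAMIC_UA_PER_MHZ.items():
        power = dynamic_power_uw(figure, clock_mhz)
        print(f"{name}: {power:.1f} uW at {clock_mhz:.0f} MHz")
```

For example, under these assumptions the 32x32 2R1W register would draw about 1.3 × 200 × 1.2 = 312 µW at 200 MHz.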
![Page 21: TRACK D: A breakthrough in logic design drastically improving performances from 65/55nm and below/ Ilan sever](https://reader036.fdocuments.us/reader036/viewer/2022081400/5491bc02ac79592f288b45ef/html5/thumbnails/21.jpg)
CARME performance @ 65 nm LP
Post-P&R results on Motu-Uta using a high-density 7-track Spinner (pulsed-latch) library:
• Without CARME: 114000 µm² at 195 MHz
• Using CARME: 97600 µm² at 196 MHz (-15% area, same speed)
• Using CARME: 103000 µm² at 233 MHz (-10% area, +20% speed)
![Page 22: TRACK D: A breakthrough in logic design drastically improving performances from 65/55nm and below/ Ilan sever](https://reader036.fdocuments.us/reader036/viewer/2022081400/5491bc02ac79592f288b45ef/html5/thumbnails/22.jpg)
Cell-compatible register packs CARME: integration flow
• uHD-BTF Standard Cells: patent-pending reduced cell stem library based on pulsed latch for ultra high density
• CARME Compiler: patent-pending bit-cell based generator of register packs
• LogiWare models: library of Verilog models of registers
• Scripts: automatic detection of registers in an RTL design and their swift replacement by a model enabling both synthesis and instantiation of registers
![Page 23: TRACK D: A breakthrough in logic design drastically improving performances from 65/55nm and below/ Ilan sever](https://reader036.fdocuments.us/reader036/viewer/2022081400/5491bc02ac79592f288b45ef/html5/thumbnails/23.jpg)
Cell-compatible register packs CARME: integration flow
[Flow diagram, steps numbered 1 to 8. Legend: standard implementation flow; CARME implementation flow (automated steps); Dolphin's silicon IPs offering. Labels: User's original RTL, DETECTION script, Memory instances list, SELECTION script, Updated memory instances list, LOGIWARE library, Memory compilers, Hard macros instantiated, RTL, Synthesis, Netlist with hard macros]
Selection script allows replacement per criteria defined by the user:
• Above a certain number of bits (e.g. > 500 bits)
• Above a defined speed/area/leakage gain
• Always replace inside a specified block
• Do not touch a specified block
• Etc.
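
The selection criteria can be pictured as a simple filter over the list of detected registers. The following Python sketch is purely hypothetical: the slides do not show the script's language, data format or rule precedence, so every class, field and function name here is invented, and the choice to let "do not touch" override the other rules is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class RegisterCandidate:
    """One register detected in the RTL (all field names are hypothetical)."""
    name: str
    block: str
    words: int
    bits: int
    estimated_area_gain: float  # fraction, e.g. 0.15 for 15%

    @property
    def total_bits(self):
        return self.words * self.bits


@dataclass
class SelectionCriteria:
    """User-defined thresholds mirroring the criteria listed on the slide."""
    min_total_bits: int = 500
    min_area_gain: float = 0.0
    always_replace_blocks: set = field(default_factory=set)
    do_not_touch_blocks: set = field(default_factory=set)


def should_replace(reg, criteria):
    """Apply the criteria; 'do not touch' is assumed to win over other rules."""
    if reg.block in criteria.do_not_touch_blocks:
        return False
    if reg.block in criteria.always_replace_blocks:
        return True
    return (reg.total_bits > criteria.min_total_bits
            and reg.estimated_area_gain >= criteria.min_area_gain)


def select(candidates, criteria):
    """Return the names of the registers chosen for replacement."""
    return [reg.name for reg in candidates if should_replace(reg, criteria)]
```

With a 500-bit threshold, a 32x32 register (1024 bits) with sufficient gain would be selected, a 16x8 register (128 bits) would be left as synthesized logic, and block-level overrides would apply regardless of size.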
![Page 24: TRACK D: A breakthrough in logic design drastically improving performances from 65/55nm and below/ Ilan sever](https://reader036.fdocuments.us/reader036/viewer/2022081400/5491bc02ac79592f288b45ef/html5/thumbnails/24.jpg)
Summary
• CARME is an innovative patent-pending breakthrough in logic design, combining the flexibility and testability of synthesizable registers with the high density of memory generators and the high speed and low power of custom data-paths.
• Dolphin Integration is continuously challenging the traditional library market with the introduction of patented ground-breaking innovations, allowing SoC architects and backend engineers to maximize their silicon performance/cost.