TRACK D: A breakthrough in logic design drastically improving performances from 65/55nm and below/...


Page 1: TRACK D: A breakthrough in logic design drastically improving performances from 65/55nm and below / Ilan Sever

May 1, 2013

A breakthrough in logic design drastically improving performances from 65/55nm and below

Ilan Sever

Library group CTO and Israeli Subsidiary Manager

DOLPHIN INTEGRATION

Page 2: Corporate ID

• Incorporated as a French SA in 1985
• Listed on Alternext of NYSE in 2007
• As the provider of design products for mixed-signal SoCs
• Now active from 180 nm to 28 nm
  – with 135 Design Engineers
  – plus Field Application Engineers and SoC Integration Engineers expert at Hardware Modeling to provide SoCs with the best subsystems:
    • High Resolution Audio (Converters and Audio Signal Processing)
    • High Resolution Measurement (Converters for Power Metering, MEMS, etc.)
    • Low-power Storage (Register Banks and Memories)
    • Low-power Microcontrol Logic (80x51 Legacy, eFlash Caches, Coprocessors...)
  – and innovative libraries of Standard Cells and Memory Registers
  – with Power Regulation, Reference, Clock & Detector Networking
  – where the major differentiator is the Flexibility of IP configurations (FLIP)

Page 3: Dolphin in Israel

• Incorporated in October 2009 as Dolphin Integration Ltd
• With the charter to develop innovative small-capacity memory architectures
• 7 employees, all engineers
• Developed three product families in technologies ranging from 0.13 um down to 55 nm:
  – Innovative high-density 1PRFile "AURA", up to 25% smaller than the competitor's solution with half the dynamic power. Licensed by TSMC, by leading IDMs and fabless companies
  – Innovative high-density DPRFILE "ERIS", up to 35% smaller than the competitor's solution while providing two full Read+Write ports (as opposed to 1R1W 2-port registers)
  – Patent-pending "CARME" multi-port register allowing seamless replacement of flip-flops and extremely high-speed asynchronous access for acceleration of digital blocks

Page 4: Market trends: boom of average SoC clock speed

• Consumer electronics and mobile devices drive the need for higher SoC performance
  – High performance required for the embedded processor
  – High density and low power required for the rest of the SoC
• Targeted applications
  – Smartphone
  – Multimedia
  – Gaming
  – Computing
  – …

Source: Kurzweil

Page 5: Design techniques for improving performance of critical paths on logic blocks

• Logic designers can leverage 4 solutions to improve the performance of logic blocks while maintaining the best density/power trade-off

Design technique | How it is used | Drawbacks | Impacts
Multi process support | Use LP process for power critical circuits; use G process for speed critical circuits | The high leakage of G process | Leakage loss
Multi Vt support in standard cells | Use LVt cells with improved performance in critical paths | An additional LVT layer is needed. The high leakage of LVT cells | Leakage loss
Multi tracks support in standard cells | Use 7/8 Track libraries for density optimized blocks and 10/12/14 Track libraries for speed critical blocks | Most libraries are not path mixable, so optimization is limited to the whole logic block level: cells are oversized for all non-critical paths of the block | Area loss
CARME bit-cell based register packs | Use CARME for speed critical registers | - | -

Page 6: Barriers and challenges in optimizing the register files within a logic design

• Flexibility of configurations: #words, #bits (unlike custom solutions)
• Reset operation (does not exist in SRAM-based macros)
• Scan/DFT (SRAM-based macros do not support scan and require BIST)
• Write & Read access protocol and speed
• Multi ports
• Usage within a standard logic flow
• Automatic P&R inside the area of standard logic rows
• Dynamic consumption and IR-drop during read/write activity
• Support for power-down & retention modes
• Area: always a key factor

Page 7: Cell Compatible register packs CARME key features

Property/Challenge | Synthesizable FF-Based register | Synthesizable Latch-Based register | SRAM-Based register | CARME Register Pack
Reset | Yes | Yes | No | Yes
Scan / DFT | Yes | No (Need BIST) | No (Need BIST) | Yes
Write access | Synchronous | Synchronous | Synchronous | Synchronous, optional asynchronous write-through
Read access | Asynchronous | Asynchronous | Synchronous | Asynchronous
Multi Port | Yes | Yes | No | Yes
Placement and Routability | Standard P&R | Standard P&R | Hard macro placement outside logic rows | Hard macro compatible with placement in logic rows

Page 8: Cell Compatible register packs CARME key features

• Brand new kind of bit-cell based generator which can be used as an alternative to standard cell based implementation for storage elements such as registers
  – CARME instances act exactly as synthesized registers, thus ensuring a seamless replacement
• CARME is the ideal solution for those who want to improve speed and, beyond that, logic density and dynamic power
• Traditional registers, once placed, are unstructured and widespread unless hierarchically placed and routed. In contrast, CARME registers are structured as "packs" to facilitate RTL engineering but still enjoy the flexibility of a generator

Page 9: Cell Compatible register packs CARME performances @ 65 nm LP

• Benchmark results after synthesis (with scan insertion) on Motu Uta V5
  – Process: TSMC 65 nm LP
  – Standard cell library performances are for SVt
  – PVT used for timings: SS; 1.08 V; 125°C
  – Accuracy of results for CARME: speed +/-10%, area +/-5%
• 18% gain in density
• 22% gain in speed

Page 10: CARME vs. Alternatives

• Benchmark: implementation of a 16x16 Look-Up-Table (TSMC 65nm LP)

Property/Challenge | Synthesizable FF-Based | Synthesizable Latch-Based | SRAM-Based | CARME Register Pack
Area (65LP) | 3973 um² | 1965 um² | 1530 um² | 2704 um²
Speed (access time, typical) | 0.39 ns | 0.6-0.8 ns | 0.8-1.0 ns | 0.22 ns
Power @1GHz | 2.03 mW | 0.80 mW | 1.43 mW | 0.89 mW
Reset | Yes | Yes | No | Yes
Write access | Synchronous | Synchronous | Synchronous | Synchronous
Read access | Asynch. | Synchronous | Synchronous | Asynch.
Multi Port | Yes | Yes | No | Yes
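The benchmark figures can be normalized against the flip-flop baseline to make the trade-offs easier to compare. The following is a minimal sketch in Python. The dictionary layout and the function name relative_to are illustrative assumptions, and the values are copied in the column order in which they appear in the benchmark table; the access times of the latch-based and SRAM-based options are given as ranges in the table and are left out here.

```python
# Values copied from the 16x16 look-up-table benchmark (TSMC 65 nm LP).
# Assumption: columns are read in the order FF, latch, SRAM, CARME.
BENCHMARK = {
    "FF-based":    {"area_um2": 3973.0, "power_mw_1ghz": 2.03},
    "Latch-based": {"area_um2": 1965.0, "power_mw_1ghz": 0.80},
    "SRAM-based":  {"area_um2": 1530.0, "power_mw_1ghz": 1.43},
    "CARME":       {"area_um2": 2704.0, "power_mw_1ghz": 0.89},
}

# Single-valued access times from the same table, in ns.
ACCESS_NS = {"FF-based": 0.39, "CARME": 0.22}


def relative_to(baseline: str, metric: str) -> dict[str, float]:
    """Return each implementation's metric as a fraction of the baseline's."""
    ref = BENCHMARK[baseline][metric]
    return {name: row[metric] / ref for name, row in BENCHMARK.items()}


if __name__ == "__main__":
    for metric in ("area_um2", "power_mw_1ghz"):
        for name, ratio in relative_to("FF-based", metric).items():
            print(f"{metric:14s} {name:12s} {ratio:5.2f} x FF-based")
    speed_ratio = ACCESS_NS["CARME"] / ACCESS_NS["FF-based"]
    print(f"access time    CARME        {speed_ratio:5.2f} x FF-based")
```

A ratio below 1.00 means the option is smaller, or lower in power, than the flip-flop implementation.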

Page 11: Fast Asynchronous Read

• READ is asynchronous
• Can support up to 4 independent read ports

[Timing diagram: data_out follows read_addr after the delay read_addr => data_out]

Page 12: CARME compiler highlights

• Architecture
  – Based on patentable bit-cell
  – Optimized for easy & risk-free integration within standard-cell rows
• Flexibility
  – 2 to 128 words
  – 4 to 144 bits wide
  – Up to 4 independent read ports
• Features & Benefits
  – Very fast asynchronous read operation
  – Synchronous write with optional fast write-through
  – 1 write port, multiple read ports
  – Reset function
  – Retention mode
  – Byte/Bit-Write control

[Figure: CARME register pack 16x16, TSMC 65LP, access time 220 ps]
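The access protocol listed in the highlights (one synchronous write port, up to 4 asynchronous read ports, reset, optional write-through) can be illustrated with a small behavioral model. This is a toy sketch in Python, not Dolphin's model: the class name RegisterPack, its method names, the reset value of 0, and the reading of "write-through" as "data presented on the write port is visible on a read port addressing the same word before the clock edge" are all assumptions made for illustration. Only the configuration ranges come from the slide.

```python
class RegisterPack:
    """Toy behavioral model: 1 synchronous write port, asynchronous reads."""

    def __init__(self, words: int, bits: int, read_ports: int = 1,
                 write_through: bool = False):
        # Configuration ranges quoted on the compiler highlights slide.
        if not 2 <= words <= 128:
            raise ValueError("words must be in 2..128")
        if not 4 <= bits <= 144:
            raise ValueError("bits must be in 4..144")
        if not 1 <= read_ports <= 4:
            raise ValueError("read_ports must be in 1..4")
        self.words = words
        self.bits = bits
        self.read_ports = read_ports
        self.write_through = write_through
        self._mask = (1 << bits) - 1
        self._mem = [0] * words
        self._pending = None  # (addr, data) waiting for the clock edge

    def reset(self) -> None:
        """Clear all words (reset value 0 is an assumption)."""
        self._mem = [0] * self.words
        self._pending = None

    def present_write(self, addr: int, data: int) -> None:
        """Drive the write port; the word is stored at the next clock edge."""
        if not 0 <= addr < self.words:
            raise IndexError("write address out of range")
        self._pending = (addr, data & self._mask)

    def clock_edge(self) -> None:
        """Synchronous write: commit the pending word, if any."""
        if self._pending is not None:
            addr, data = self._pending
            self._mem[addr] = data
            self._pending = None

    def read(self, port: int, addr: int) -> int:
        """Asynchronous read: depends only on the address and current state."""
        if not 0 <= port < self.read_ports:
            raise IndexError("no such read port")
        if not 0 <= addr < self.words:
            raise IndexError("read address out of range")
        if (self.write_through and self._pending is not None
                and self._pending[0] == addr):
            return self._pending[1]  # assumed write-through behavior
        return self._mem[addr]
```

For example, after present_write(3, 0xBEEF) a plain pack still returns the old value from read(0, 3) until clock_edge() is called, while a pack built with write_through=True returns 0xBEEF immediately.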

Page 13: CARME compiler highlights

• Proprietary bitcell features:
  • Scannable
  • Resettable
  • High-speed write
  • Support for multiple high-speed read ports
  • Area efficient: ½ of normal D-FF
  • Low power: ½ of normal D-FF
  • Retention-ready: replaces retention FF
  • Non-pushed rules: easily retargetable

Page 14: Basic Architecture

• Up to 16 bitcells in a pack
• All outputs are routed to the Distribution Plane

[Diagram labels: Output Mux, Address Bus, Data Bus]

Page 15: Multiple read ports

• Add read ports in a modular way without complexity or performance degradation

[Diagram labels: Addr Bus A, OutMux Port A, DataOut Port A; Addr Bus B, OutMux Port B, DataOut Port B]

Page 16: CARME compiler highlights

• Scannable Latch Array

[Diagram labels: ScanCK, ScanCK_N]

Page 17: CARME compiler highlights

• Flexible number of read ports

[Figure panels: 1 Port, 2 Ports, 4 Ports]

Page 18: CARME compiler highlights

• Fits inside logic rows: zero overhead for spacers, power rings, wrappers
• Custom layout fits number of horizontal & vertical tracks
• IR-drop-aware placement: shared among rows
• Just like a big standard cell!

Page 19: CARME compiler highlights

• Routing-aware structure
• Feed-through over the cell

Page 20: CARME performances @ 65 nm LP

• Register performances

Block Name | Register size | Configuration | Speed write operation (ps) | Speed read operation (ps) | Dynamic power (uA/MHz)
MCU OR1200 | 32x32 | 2R1W | 497 | 611 | 1.3
ALU CHRONOS | 16x32 | 3R1W | 406 | 490 | 0.52
USB | 32x8 | 1R1W | 358 | 442 | 0.28
UART | 16x8 | 1R1W | 332 | 420 | 0.19
SPI | 4x8 | 1R1W | 315 | 240 | 0.15

PVT used for timings: SS; 1.08 V; 125°C
PVT used for dynamic power consumption: TT; 1.2 V; 25°C
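The dynamic figures above are quoted in uA/MHz, so turning them into a power estimate needs a clock frequency and a supply voltage. The sketch below assumes a simple linear model, power = (uA/MHz) x frequency x supply voltage, with the 1.2 V supply of the TT corner quoted for the power numbers. The function name dynamic_power_uw and the 200 MHz example frequency are hypothetical and do not come from the slides.

```python
# Dynamic current per block, in uA/MHz, from the register performances table.
DYNAMIC_UA_PER_MHZ = {
    "MCU OR1200": 1.3,
    "ALU CHRONOS": 0.52,
    "USB": 0.28,
    "UART": 0.19,
    "SPI": 0.15,
}

SUPPLY_V = 1.2  # supply of the TT; 1.2 V; 25°C corner used for power


def dynamic_power_uw(block: str, freq_mhz: float,
                     supply_v: float = SUPPLY_V) -> float:
    """Estimate dynamic power in uW, assuming current scales linearly with f."""
    current_ua = DYNAMIC_UA_PER_MHZ[block] * freq_mhz  # uA/MHz * MHz = uA
    return current_ua * supply_v                       # uA * V = uW


if __name__ == "__main__":
    example_freq_mhz = 200.0  # hypothetical operating frequency
    for block in DYNAMIC_UA_PER_MHZ:
        power = dynamic_power_uw(block, example_freq_mhz)
        print(f"{block:12s} {power:7.1f} uW at {example_freq_mhz:.0f} MHz")
```

The model ignores leakage and assumes the quoted activity holds at every frequency, so it is only a first-order estimate.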

Page 21: CARME performances @ 65 nm LP

Post P&R results on Motu-Uta using a High-Density 7-Track Spinner (Pulsed-latch) library:

• W/O CARME: 114000 um² at 195 MHz
• Using CARME: 97600 um² at 196 MHz (-15% area, same speed)
• Using CARME: 103000 um² at 233 MHz (-10% area, +20% speed)
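The rounded percentages quoted in the post-P&R results can be checked against the raw area and frequency figures. A minimal sketch follows; the run labels "area-optimized" and "speed-optimized" are names introduced here for illustration and do not appear on the slide.

```python
def percent_change(new: float, old: float) -> float:
    """Signed change of `new` relative to `old`, in percent."""
    return (new - old) / old * 100.0


# (area in um², clock in MHz) from the post-P&R results on Motu-Uta.
BASELINE = (114000.0, 195.0)            # without CARME
RUNS = {
    "area-optimized": (97600.0, 196.0),   # hypothetical label
    "speed-optimized": (103000.0, 233.0),  # hypothetical label
}

if __name__ == "__main__":
    for label, (area, mhz) in RUNS.items():
        d_area = percent_change(area, BASELINE[0])
        d_speed = percent_change(mhz, BASELINE[1])
        print(f"{label:16s} area {d_area:+.1f}%  speed {d_speed:+.1f}%")
```

The exact values come out at about -14.4% area for the first run and about -9.6% area with +19.5% speed for the second, which the slide rounds to -15%, -10% and +20%.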

Page 22: Cell Compatible register packs CARME integration flow

• uHD-BTF Standard Cells: patent-pending reduced cell stem library based on pulsed latch for ultra high density
• CARME Compiler: patent-pending bit-cell based generator of register packs
• LogiWare models: library of Verilog models of registers
• Scripts: automatic detection of registers in an RTL design and their swift replacement by a model enabling both synthesis and instantiation of registers
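How the detection scripts work is not described on the slide. As a rough illustration of the idea of detecting registers in an RTL design, the sketch below scans Verilog source text for memory-style declarations with literal bounds, such as "reg [15:0] lut [0:15];", and reports their size. The regular expression, the function name detect_register_arrays and the output format are assumptions made for this example. A real flow would need a proper Verilog parser to handle parameters, generate blocks and registers inferred from always blocks.

```python
import re

# Matches declarations of the form:  reg [MSB:LSB] name [FIRST:LAST];
# Only literal integer bounds are handled in this sketch.
MEM_DECL = re.compile(
    r"\breg\s*\[\s*(\d+)\s*:\s*(\d+)\s*\]\s*(\w+)"
    r"\s*\[\s*(\d+)\s*:\s*(\d+)\s*\]\s*;"
)


def detect_register_arrays(rtl: str) -> list[dict]:
    """Return name, word count and bit width of each register array found."""
    found = []
    for match in MEM_DECL.finditer(rtl):
        msb, lsb, name, first, last = match.groups()
        bits = abs(int(msb) - int(lsb)) + 1
        words = abs(int(last) - int(first)) + 1
        found.append({
            "name": name,
            "words": words,
            "bits": bits,
            "total_bits": words * bits,
        })
    return found


if __name__ == "__main__":
    example_rtl = """
    reg [15:0] lut [0:15];
    reg [7:0]  fifo [0:3];
    reg [3:0]  state;
    """
    for item in detect_register_arrays(example_rtl):
        print(item)
```

In the example, lut and fifo are reported as arrays while the scalar register state is ignored, because it has no second dimension.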

Page 23: Cell Compatible register packs CARME integration flow

[Flow diagram with steps numbered 1 to 8. Elements: User's original RTL, DETECTION script, Memory instances list, SELECTION script, Updated Memory instances list, Memory compilers, LOGIWARE Library, Hard macros instantiated, RTL, Synthesis, Netlist with hard macros, … Legend: Standard implementation flow; CARME implementation flow: automated steps; Dolphin's silicon IPs offering]

Selection script allows replacement per criteria defined by the user:
• Above a certain # of bits (e.g. >500 bits)
• Above a defined speed/area/leakage gain
• Always replace inside a specified block
• Do not touch a specified block
• Etc.
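The user-defined selection criteria can be pictured as a simple filter over the detected registers. The sketch below is a hypothetical illustration, not the actual SELECTION script: the Candidate and SelectionRules types, their field names, the precedence order (a do-not-touch block wins over everything, then always-replace blocks, then the thresholds) and the choice to require both thresholds together are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    """A detected register (all fields are illustrative)."""
    name: str
    block: str             # design block the register belongs to
    total_bits: int
    estimated_gain: float  # e.g. fractional speed/area/leakage gain


@dataclass
class SelectionRules:
    min_bits: int = 500    # "above a certain # of bits"
    min_gain: float = 0.0  # "above a defined speed/area/leakage gain"
    always_replace_blocks: set = field(default_factory=set)
    do_not_touch_blocks: set = field(default_factory=set)


def should_replace(c: Candidate, rules: SelectionRules) -> bool:
    """Decide whether a register is replaced by a register pack."""
    if c.block in rules.do_not_touch_blocks:
        return False                      # assumed to take precedence
    if c.block in rules.always_replace_blocks:
        return True
    return c.total_bits > rules.min_bits and c.estimated_gain >= rules.min_gain


if __name__ == "__main__":
    rules = SelectionRules(min_bits=500, min_gain=0.10,
                           always_replace_blocks={"alu"},
                           do_not_touch_blocks={"usb"})
    candidates = [
        Candidate("regfile", "mcu", 1024, 0.20),
        Candidate("flags", "mcu", 32, 0.50),
        Candidate("acc", "alu", 32, 0.00),
        Candidate("ep_buf", "usb", 2048, 0.90),
    ]
    for c in candidates:
        print(f"{c.name:8s} -> {'replace' if should_replace(c, rules) else 'keep'}")
```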

Page 24: Summary

• CARME is an innovative, patent-pending breakthrough in logic design. It combines the flexibility and testability of synthesizable registers with the high density of memory generators and the high speed and low power of custom data-paths.
• Dolphin Integration is continuously challenging the traditional library market by introducing patented, ground-breaking innovations that allow SoC architects and back-end engineers to maximize their silicon performance/cost.

Page 25: Thank you

THANK YOU!

Ilan Sever
[email protected]

Sales: [email protected]

www.dolphin-integration.com