of C Based VLSI Design Early time to market Code size reduction (7x-10x) Accelerated Simulation...
C-Based ASIC Design
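[Figure: the example scheduled over states S0, S1, S2 (3 cycles), with inputs A, B, C, D, one adder (+), one multiplier (*), and outputs E, F]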
char A, B, C, D;
char E, F;
main() {
    char X;
    X = A + B;
    E = X * D;
    F = (B + C) * X;
}
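[Figure: dataflow graph of the example: inputs A, B, C, D; two additions produce X and Y; two multiplications produce E and F]
Behavior in C: 8 lines
RTL: 100 lines
Resource constraints
1. + : 1, * : 1: 3 cycles, Delay: 1T
2. + : 2, * : 2: 1 cycle, Delay: 2T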
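The effect of the two resource constraints on the example can be reproduced with a small resource-constrained list scheduler. This is a minimal sketch for illustration, not CyberWorkBench's actual algorithm. The `Op` struct, the `schedule` function, and the `max_chain` parameter are hypothetical names; `max_chain` is an assumed stand-in for the delay budget (1T or 2T), expressed as how many dependent operations may be chained within one clock cycle.

```c
#include <stdio.h>

enum { ADD = 0, MUL = 1 };
#define NUM_OPS 4

typedef struct {
    int kind;     /* ADD or MUL */
    int deps[2];  /* index of the producing op, or -1 for a primary input */
} Op;

/* Dataflow graph of the example, listed in topological order. */
static const Op ops[NUM_OPS] = {
    { ADD, { -1, -1 } },  /* X = A + B */
    { ADD, { -1, -1 } },  /* Y = B + C */
    { MUL, {  0, -1 } },  /* E = X * D */
    { MUL, {  1,  0 } },  /* F = Y * X */
};

/* Returns the number of cycles needed, or -1 for invalid constraints.
 * adders/muls: available functional units per cycle (assumed model).
 * max_chain:   max dependent ops chained in one cycle (assumed model). */
int schedule(int adders, int muls, int max_chain)
{
    int cycle_of[NUM_OPS], depth_of[NUM_OPS];
    int done = 0, cycle = 0;

    if (adders < 1 || muls < 1 || max_chain < 1)
        return -1;
    for (int i = 0; i < NUM_OPS; i++) {
        cycle_of[i] = -1;  /* not scheduled yet */
        depth_of[i] = 0;
    }

    while (done < NUM_OPS) {
        int free_units[2] = { adders, muls };
        for (int i = 0; i < NUM_OPS; i++) {
            int depth = 1, ready = 1;
            if (cycle_of[i] >= 0)
                continue;  /* already scheduled */
            for (int k = 0; k < 2; k++) {
                int d = ops[i].deps[k];
                if (d < 0)
                    continue;  /* primary input, always available */
                if (cycle_of[d] < 0) {
                    ready = 0;  /* producer not scheduled yet */
                    break;
                }
                /* Producer in the same cycle: this op is chained after it. */
                if (cycle_of[d] == cycle && depth_of[d] + 1 > depth)
                    depth = depth_of[d] + 1;
            }
            if (ready && depth <= max_chain && free_units[ops[i].kind] > 0) {
                free_units[ops[i].kind]--;
                cycle_of[i] = cycle;
                depth_of[i] = depth;
                done++;
            }
        }
        cycle++;
    }
    return cycle;
}
```

Under these assumptions, `schedule(1, 1, 1)` returns 3 and `schedule(2, 2, 2)` returns 1, which matches the 3-cycle and 1-cycle results given for the two constraints.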
VLSI Design Flow
Behavioral Model -> Behavioral Synthesis -> RTL -> Logic Synthesis -> Gate Level Netlist -> Physical Synthesis -> GDSII
Software Algorithm Designer to Hardware Designer: manual translation
Complicated algorithms are tough for an RTL designer to map to H/W.
C-synthesis can synthesize the whole program into an ASIC/FPGA (not only some parts, e.g. loops).
C-based HLS has been used for various types of “ASICs”
Satellite
High-end servers
Automobiles
Mobile phones
Digital Appliances
Cameras
Office Automation
Robots
CyberWorkBench has been used in REAL product development for many years!
C-based HLS CyberWorkBench Customers*
NEC 93~ (Transmission, Satellite, Computer, Mobile, Encryption, etc.)
1999 NIC for PC cluster (Xeon)
Renesas Electronics 00~ (MPEG, STB, configurable Processor, etc.) Sold chips worth over $1B
Panasonic 01~ (Audio Video Processing)
Toyota 02~04 (Engine control research)
Dxxxx: automobile controller
JAXA 05~ (Satellite) Hayabusa, spacewire, ...
Toshiba 06~ (High Definition Video, Audio, etc.)
Hitachi 07~ (HDTV algorithm development on FPGA)
Advantest 07~ (LSI Tester, 200M-gate ASIC designed only in C, no RTL)
JVC ~08 (Video Camera, controller)
Fujitsu 09~ (Transmission, Network)
Caxxx: Digital Camera, MFP
Fuxxxxxx: Digital Camera, using DRP (Dynamic Reconfigurable Chip)
Suxxxx: Audio Digital
Huxxxx: Router, L2 Cache
...
Advantages of C Based VLSI Design
Early time to market
Code size reduction (7x-10x)
Accelerated simulation speed
Generates circuit sizes equivalent to hand-coded RTL
[Figure: project schedules on a 3 / 6 / 9 / 12 time axis for RTL-based and C-based design. Phases shown: Spec, Spec&Sim, RTL design, Sim., Syn, Veri., Layout, SW design, HW eval., SW eval., Production]
RTL (200KG): 14M, 100MM
C-based (600KG): 8.5M, 40MM
Shorter Time-to-Market
Smaller NRE cost
[Figure: bugs by category (basic bug, sync. bug, system bug, other bug (IP etc.)) for RTL-based Design vs. C-based Design]
Fewer Bugs
[Figure: block diagrams with blocks labeled H/W, MPU, Viterbi Decoder, S-DSP, SRAM 2KW x N pairs, NEC DSP]
Area/Power Reduction
Saving several $100K per SoC (CWB costs much less)
Case study W-CDMA ASSP
W-CDMA chip design example
SoC Design with CWB
3G Mobile Phone Application Chip
(MP211: Multicore-ISSCC05)
Data-Dominant
Control-Dominant
All-in-C
1. Synthesize ANY application (data or control dominant, controllers)
2. Design AND verify at the C-level
Full chip simulation and synthesis
Import legacy RTL blocks
CyberWorkBench Overview
[Figure: tool-flow diagram. Blocks shown:]
Input: behavioral description in SystemC/SpecC, ANSI-C, or Verilog/VHDL; Legacy RTL / IP; Software
Behavioral Synthesizer, Low Power Synthesis, Synthesis Control, Design Space Explorer
Formal Verifier: C-RTL Equivalence Prover, Property Checker
Sim. Model Generator, Testbench Generator
Bit-accurate Behavioral Simulator, Cycle-accurate HW/SW Co-simulator, C Source code debugger
RT FloorPlanner, RT Power Estimator, GUI for QoR Analysis
CPU Bus I/F generator, Behavioral IP library, Library Characterizer
Output: C, SystemC, Verilog/VHDL; logic synthesis & back-end implementation (ASIC, STP, FPGA); FPGA fast prototype
CWB Integrated Design Environment
Integrated GUI
Synthesis Controllability
Area
Cycle
Useful feedback to the designers
Rich synthesis function
Parallelization
Pipelining
Behavioral IP reuse
Controller synthesis
Script generation
Various circuit support
Synchronous/asynchronous/pipelined memory
Synchronous/asynchronous reset
Multiple clock, gated clock
RTL style selection
QoR report
Automatic exploration
C source editor
RTL viewer
Datapath schematic viewer
Resource info.
Cross probing
Area, delay, routability, false path
Dataflow diagram
State transition
Tradeoff chart
To get better quality of circuit
ANSI-C extensions (timing)
Pragmas
Synthesis options
Synthesize ANY Application with CWB
Data Intensive (traditionally fit for hardware)
Arithmetic operations, simple algorithm
- FIR, FFT, ...
- Secret key encryption
- Simple ECC, EDC
- Graphic decoding

Control Intensive (traditionally NOT fit for hardware)
Arithmetic operations in complex control
- Video, voice recognition
- Data compression
- Complex CODEC
- DRM
- Turbo ECC
- Public key encryption

Controller (traditionally NOT fit for hardware)
Sequencers
- USB I/F, ATA, UART, ...
- PCI bus I/F, AMBA bus
- DMA, TIMER, ...
- SDRAM I/F, NAND flash, ...

2 synthesis ENGINES
Different scheduling modes (manual, automatic, pipelined, mixed mode)
Multiple synthesis directives (local and global)
C syntax extension (implementation description)
C Source Code Debugging
Software-like debugging environment (look at the C code while debugging the RTL)
Allows breakpoints and stepping, and highlights the lines being executed in parallel in the untimed C code
Faster simulation than synthesizable RTL (~10x)
[Figure: flow from C source code through parse, synthesis, and Verilog generation to synthesizable RTL and a cycle-accurate Verilog or C/SystemC model, run with a testbench in a Verilog simulator to produce a VCD]
© NEC Corporation 2009
[Figure: block diagram with CPU, Memory Controller, DMA Engine, two FIFOs, External Memory, and the STP Engine with its HW Config Memory holding Config #1 ... Config #M; further configurations (Config #N, ...) reside externally]
1. A maximum of 64 (pseudo) hardware configurations are stored inside the STP engine. Zero-cycle hardware reconfiguration among these is possible.
• Configurations are pre-loaded at system boot-up time
• ⇒ Reconfiguration time: less than 1 ns
2. Externally stored configurations may also be loaded during runtime, swapping hardware in and out entirely.
• Virtualizing hardware, in some sense
• ⇒ Reconfiguration time: on the order of 100 µs
Dynamic Reconfigurable Chip and CWB
Run-time task switching by Dynamic Loading
[Figure: copy, print, and scan configurations loaded from external memory]
Config size: up to 300KB
0.1-0.5ms for task switching scene resolution
Ex. Multi Functional Printer
Why CyberWorkBench
Most mature C-Based synthesis tool
Ability to synthesize ANY Digital Application
Complete C-Based Verification
SoC-level Design Capabilities
Best in class High Level Synthesis Design Environment
Proposal
Web Presentation and Demo
Free Evaluation License
On-site Training (Synthesis, Verification), one or two days