of C Based VLSI Design Early time to market Code size reduction (7x-10x) Accelerated Simulation...

www.cyberworkbench.com


C-Based ASIC Design

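Behavior in C (8 lines):

    char A, B, C, D;
    char E, F;
    main() {
        char X;
        X = A + B;
        E = X * D;
        F = (B + C) * X;
    }

[Figure: dataflow graphs of the behavior (inputs A, B, C, D; intermediates X, Y; outputs E, F; operators + and *) scheduled into states S0, S1, S2. Behavioral synthesis turns the 8 lines of C into about 100 lines of RTL under resource constraints.]

Resource constraint 1: + : 1, * : 1 gives 3 cycles, Delay: 1T
Resource constraint 2: + : 2, * : 2 gives 1 cycle, Delay: 2T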
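The cycle counts quoted for the two resource constraints can be reproduced with a small greedy list scheduler. The sketch below is only an illustration of the idea, not CyberWorkBench's scheduling algorithm: the names `Op`, `ops`, and `schedule` are hypothetical, each operator is assumed to take a delay of 1T, and `max_chain` is an assumed parameter for how many operators may be chained within one cycle.

```c
#include <assert.h>

enum { ADD, MUL, NUM_KINDS };

typedef struct {
    int kind;     /* ADD or MUL */
    int pred[2];  /* indices of producer operations, -1 for a primary input */
} Op;

/* Dataflow graph of: X = A + B; E = X * D; F = (B + C) * X;
   Operations are listed in topological order. */
static const Op ops[] = {
    { ADD, { -1, -1 } },  /* 0: X = A + B */
    { ADD, { -1, -1 } },  /* 1: Y = B + C */
    { MUL, {  0, -1 } },  /* 2: E = X * D */
    { MUL, {  1,  0 } },  /* 3: F = Y * X */
};
enum { NUM_OPS = sizeof ops / sizeof ops[0] };

/* Greedy list scheduling under resource constraints.
   adders / multipliers: number of functional units of each kind.
   max_chain: longest operator chain (in units of T) allowed in one cycle.
   Returns the number of cycles needed. */
int schedule(int adders, int multipliers, int max_chain)
{
    int units[NUM_KINDS];
    int cycle_of[NUM_OPS], depth_of[NUM_OPS];
    int done = 0, cycle = 0;

    assert(adders >= 1 && multipliers >= 1 && max_chain >= 1);
    units[ADD] = adders;
    units[MUL] = multipliers;
    for (int i = 0; i < NUM_OPS; i++) {
        cycle_of[i] = -1;  /* not scheduled yet */
        depth_of[i] = 0;
    }

    while (done < NUM_OPS) {
        int used[NUM_KINDS] = { 0 };
        for (int i = 0; i < NUM_OPS; i++) {
            int depth = 1, ready = 1;
            if (cycle_of[i] >= 0) continue;                      /* already placed */
            if (used[ops[i].kind] >= units[ops[i].kind]) continue; /* no free unit */
            for (int p = 0; p < 2; p++) {
                int j = ops[i].pred[p];
                if (j < 0) continue;
                if (cycle_of[j] < 0) { ready = 0; break; }       /* input not computed */
                if (cycle_of[j] == cycle && depth_of[j] + 1 > depth)
                    depth = depth_of[j] + 1;                     /* chained in this cycle */
            }
            if (!ready || depth > max_chain) continue;
            cycle_of[i] = cycle;
            depth_of[i] = depth;
            used[ops[i].kind]++;
            done++;
        }
        cycle++;
    }
    return cycle;
}
```

Under these assumptions, `schedule(1, 1, 1)` models one adder and one multiplier with a 1T cycle, and `schedule(2, 2, 2)` models two of each with a 2T cycle in which an adder output feeds a multiplier directly.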

VLSI Design Flow

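[Figure: design flow. Behavioral Model -> Behavioral Synthesis -> RTL -> Logic Synthesis -> Gate Level Netlist -> Physical Synthesis -> GDSII. Without behavioral synthesis, the step from the Software Algorithm Designer to the Hardware Designer is a manual translation.]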

A complicated algorithm is tough for an RTL designer to map to H/W.

C-synthesis can synthesize the whole program into an ASIC/FPGA (not only some parts, e.g. loops).


C-based HLS has been used for various types of “ASICs”

Satellite
High-end servers
Automobiles
Mobile phones
Digital Appliances
Cameras
Office Automation
Robots

CyberWorkBench has been used in REAL product development for many years!

C-based HLS CyberWorkBench

Customers*

NEC 93~ (Transmission, Satellite, Computer, Mobile, Encryption, etc.)
  1999 NIC for PC cluster (Xeon)
Renesas Electronics 00~ (MPEG, STB, configurable Processor, etc.) Sold chips worth over $1B
Panasonic 01~ (Audio Video Processing)
Toyota 02~04 (Engine control research)
Dxxxx: automobile controller
JAXA 05~ (Satellite) Hayabusa, SpaceWire, ..
Toshiba 06~ (High Definition Video, Audio, etc.)
Hitachi 07~ (HDTV algorithm development on FPGA)
Advantest 07~ (LSI Tester, 200M Gates ASIC designed only in C, no RTL)
JVC ~08 (Video Camera, controller)
Fujitsu 09~ (Transmission, Network)
Caxxx: Digital Camera, MFP
Fuxxxxxx: Digital Camera, using DRP (Dynamic Reconfigurable Chip)
Suxxxx: Audio Digital
Huxxxx: Router, L2 Cache

…..

Advantages of C Based VLSI Design

Early time to market
Code size reduction (7x-10x)
Accelerated simulation speed
Generates circuit sizes equivalent to hand-coded RTL

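[Figure: project schedules compared on a 3 / 6 / 9 / 12 month axis. RTL flow phases: Spec, RTL design, Sim., SW design, Layout, HW eval., Soft.Eval., Production. C-based flow phases: Spec&Sim, Syn, Veri., SW design, Layout, HW eval, SW eval., Production.]

RTL (200KG): 14M, 100MM
C-based (600KG): 8.5M, 40MM

Shorter Time-to-Market, smaller NRE cost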

Less Bugs

[Figure: bug counts for RTL-based Design versus C-based Design, broken down into basic bug, system bug, sync. bug, and other bug (IP etc.). Chart labels: MAHALO, MARY.]

Case study W-CDMA ASSP

W-CDMA chip design example

[Figure: block diagrams containing MPU, DSP (NEC), S-DSP with SRAM 2KW (x N pairs), and H/W Viterbi Decoder blocks.]

Area/Power Reduction
Saving several $100K per SoC (CWB costs much less)

SoC Design with CWB

3G Mobile Phone Application Chip (MP211: Multicore-ISSCC05)

[Figure: chip blocks labelled Data-Dominant and Control-Dominant.]

All-in-C

1. Synthesize ANY application (data or control dominant, controllers)
2. Design AND verify at the C-level
- Full chip simulation and synthesis
- Import legacy RTL blocks

CyberWorkBench Overview

[Figure: tool overview diagram. Labels, grouped by apparent role:]

Inputs: behavioral description in SystemC/SpecC, ANSI-C, or Verilog/VHDL; Legacy RTL / IP; Software
Synthesis: Behavioral Synthesizer, Low Power Synthesis, Design Space Explorer
Formal Verifier: C-RTL Equivalence Prover, Property Checker
Sim. Model Generator: Bit-accurate Behavioral Simulator, Cycle-accurate HW/SW Co-simulator (C, SystemC)
Analysis: RT FloorPlanner, RT Power Estimator, GUI for QoR Analysis and Synthesis Control
Supporting tools: CPU Bus I/F generator, Behavioral IP library, Library Characterizer, Testbench Generator, C Source code debugger
Outputs: Verilog/VHDL for ASIC, STP, and FPGA logic synthesis & back-end implementation; FPGA fast prototype

CWB Integrated Design Environment

Integrated GUI
- C source editor
- RTL viewer
- Datapath schematic viewer
- Dataflow diagram
- State transition
- Tradeoff chart (Area vs. Cycle)
- Cross probing

Synthesis Controllability (to get better quality of circuit)
- ANSI-C extensions (timing)
- Pragmas
- Synthesis options

Useful feedback to the designers
- QoR report
- Resource info.
- Area, delay, routability, false path

Rich synthesis function
- Parallelization
- Pipelining
- Behavioral IP reuse
- Controller synthesis
- Script generation
- Automatic exploration

Various circuit support
- Synchronous/asynchronous/pipelined memory
- Synchronous/asynchronous reset
- Multiple clock, gated clock
- RTL style selection

Synthesize ANY Application with CWB

Controller (Traditionally NOT fit for Hardware)
Sequencers
- USB I/F, ATA, UART, ...
- PCI bus I/F, AMBA bus
- DMA, TIMER, ...
- SDRAM I/F, NAND flash, ...

Control Intensive (Traditionally NOT fit for Hardware)
Arithmetic operation in complex control
- Video, Voice recognition
- Data compression
- Complex CODEC
- DRM
- Turbo ECC
- Public Key Encryption

Data Intensive (Traditionally Fit for Hardware)
Arithmetic operations, simple algorithm
- FIR, FFT, ...
- secret key Encryption
- simple ECC, EDC
- graphic decoding

2 synthesis ENGINES
Different scheduling modes (manual, automatic, pipelined, mixed mode)
Multiple synthesis directives (local and global)
C syntax extension (implementation description)
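As an illustration of the data-intensive category (FIR, FFT, ...), a FIR filter can be written as ordinary C. The following is a minimal sketch in plain C, not taken from the presentation: the tap count, the coefficients, and the names `FirState` and `fir_step` are hypothetical, and no CyberWorkBench-specific pragmas or syntax extensions are shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_TAPS 4

/* Hypothetical coefficients, chosen only to make the example easy to check. */
static const int16_t coeff[NUM_TAPS] = { 1, 2, 2, 1 };

typedef struct {
    int16_t delay_line[NUM_TAPS];  /* most recent sample at index 0 */
} FirState;

/* Push one input sample and return one filtered output sample. */
int32_t fir_step(FirState *s, int16_t sample)
{
    int32_t acc = 0;

    assert(s != NULL);

    /* Shift the delay line by one position; a synthesis tool could map
       this loop to a register chain, but that is not modelled here. */
    for (int i = NUM_TAPS - 1; i > 0; i--)
        s->delay_line[i] = s->delay_line[i - 1];
    s->delay_line[0] = sample;

    /* Multiply-accumulate over all taps. */
    for (int i = 0; i < NUM_TAPS; i++)
        acc += (int32_t)coeff[i] * s->delay_line[i];

    return acc;
}
```

Feeding a single 1 followed by zeros into a zero-initialized `FirState` returns the coefficients in order, which is a quick way to check the function at the C level before any synthesis step.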

C Source Code Debugging

Software-like debugging environment (look at the C code while debugging the RTL)
Allows breakpoints and stepping, and highlights lines being executed in parallel in the untimed C code
Faster simulation than synthesizable RTL (~10x)

[Figure: debugging flow. C/SystemC source code goes through parse, synthesis, and Verilog generation, producing synthesizable RTL and a cycle-accurate Verilog model for the simulator. Other labels: Testbench, VCD.]

© NEC Corporation 2009

Dynamic Reconfigurable Chip and CWB

[Figure: system with CPU, Memory Controller, DMA Engine, two FIFOs, External Memory, and the STP Engine with its HW Config Memory. Configuration labels: Config #1, Config #M, Config #N, Config #5.]

1. A maximum of 64 (pseudo) hardware configurations are stored inside the STP engine. Among those, zero-cycle hardware reconfiguration is possible.
• Configurations are pre-loaded at system boot-up time
• ⇒ Reconfiguration time: Less than 1ns

2. Externally stored configurations may also be loaded during runtime, totally flashing in/out hardware.
• Virtualizing hardware, in some sense
• ⇒ Reconfiguration time: on the order of 100us
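The two reconfiguration paths (resident configurations versus configurations loaded from external memory) can be pictured with a small cost model. This is a hedged sketch rather than a description of the STP engine's actual control logic: `StpModel`, `stp_preload`, and `stp_switch` are hypothetical names, the round-robin replacement policy is an assumption, and the time constants simply encode the figures quoted above.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_RESIDENT       64       /* from the slide: at most 64 stored configurations */
#define RESIDENT_SWITCH_NS 1L       /* slide says less than 1 ns; 1 ns used as an upper bound */
#define EXTERNAL_LOAD_NS   100000L  /* slide says order of 100 us; assumed to be exactly 100 us */

typedef struct {
    int resident[MAX_RESIDENT];  /* ids of configurations held inside the engine */
    int count;                   /* number of valid entries in resident[] */
    int next_victim;             /* assumed round-robin replacement index */
    int active;                  /* id of the running configuration, -1 if none */
} StpModel;

void stp_init(StpModel *m)
{
    m->count = 0;
    m->next_victim = 0;
    m->active = -1;
}

static int find_resident(const StpModel *m, int id)
{
    for (int i = 0; i < m->count; i++)
        if (m->resident[i] == id)
            return i;
    return -1;
}

/* Boot-time preload; returns false when all 64 slots are already taken. */
bool stp_preload(StpModel *m, int id)
{
    assert(id >= 0);
    if (find_resident(m, id) >= 0)
        return true;
    if (m->count == MAX_RESIDENT)
        return false;
    m->resident[m->count++] = id;
    return true;
}

/* Switch to configuration id; returns the modelled switching time in ns. */
long stp_switch(StpModel *m, int id)
{
    long cost = RESIDENT_SWITCH_NS;

    assert(id >= 0);
    if (find_resident(m, id) < 0) {
        /* Not on chip: model a load from external memory. */
        if (m->count < MAX_RESIDENT) {
            m->resident[m->count++] = id;
        } else {
            m->resident[m->next_victim] = id;
            m->next_victim = (m->next_victim + 1) % MAX_RESIDENT;
        }
        cost = EXTERNAL_LOAD_NS;
    }
    m->active = id;
    return cost;
}
```

In this model the first switch to a configuration that was not preloaded costs the external load time, and later switches to it cost only the resident switching time, which mirrors the difference of roughly five orders of magnitude between the two paths.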

Run-time task switching by Dynamic Loading

Ex. Multi Functional Printer

[Figure: copy, print, and scan configurations held in external memory and loaded into the chip at run time.]

Config size: up to 300KB
0.1-0.5ms for task switching scene resolution

Why CyberWorkBench

Most mature C-Based synthesis tool

Ability to synthesize ANY Digital Application

Complete C-Based Verification

SoC-level Design Capabilities

Best in class High Level Synthesis Design Environment


Proposal

Web Presentation and Demo

Free Evaluation License

On-site Training (Synthesis, Verification), one or two days

[email protected]

www.cyberworkbench.com