Overview and Commercial Examples

Reconfigurable Architectures Overview and Commercial Examples Doug Densmore [email protected] EE249 10/16/03


Page 1: Overview and Commercial Examples

Reconfigurable Architectures: Overview and Commercial Examples

Doug Densmore [email protected]

EE249

10/16/03

Page 2: Overview and Commercial Examples

Outline

• Introduction to Reconfigurable Architectures
• Motivation for Reconfigurable Architectures
• Reconfigurable Architecture Classifications
• Reconfigurable Architecture Challenges
• Commercial Examples
    • Cypress Semiconductor's PSoC
    • Xilinx's Virtex II Pro
• Conclusion

Page 3: Overview and Commercial Examples

Introduction

• What is a reconfigurable architecture?
    • It depends on whom you ask and how you choose to classify the architecture (classifications come later).
    • Static vs. configurable vs. reconfigurable?
• What is an architecture?
    • Again, it depends on whom you ask.
    • Modeling perspective vs. programming perspective vs. design perspective?

Page 4: Overview and Commercial Examples

Introduction

• For our purposes I will propose the most generic definition:
    • Reconfigurable architecture: a device that provides processing and communication services whose relationship can be redefined by user input at some point during either the design or the execution of the device.
• Still an open question: system vs. architecture?

Page 5: Overview and Commercial Examples

Motivation

• Why develop a reconfigurable architecture?
    • IC trends: migration from ASICs to platforms to programmable platforms
• Who uses reconfigurable architectures?
    • This determines which features are relevant
        • ISA, control, computation, abstraction, programming interface
    • This determines at what point the device should be configurable
        • At the plant
        • During the application
        • Middle ground?

Page 6: Overview and Commercial Examples

Motivation - System Design in 200x

• Less like synthesis of an integrated circuit from a high-level description
• More like programming of a complex application-specific processor

[Figure: a hardware design flow (HDL, RTL synthesis, netlist, logic optimization, library, physical design, layout) shown beside a programming flow (C, IMPACT front-end, ELCOR back-end, MESCAL MDES, simulator / visualization).]

Courtesy K. Keutzer

Page 7: Overview and Commercial Examples

Motivation - Evolution of the EDA Industry

[Figure: McKinsey S-curve plotting results (design productivity) against effort (EDA tools effort), with four milestones:
1978: Transistor entry - Calma, Computervision
1985: Schematic entry - Daisy, Mentor, Valid
1992: Synthesis - Cadence, Synopsys
1999: ASIP / programmable platform]

Courtesy K. Keutzer

Page 8: Overview and Commercial Examples

Motivation - Why

• Development of reconfigurable architectures has tremendous potential. Strengths:
    • Rapid time to market
    • Versatility and flexibility, which increase product lifetime
    • In-field upgradability
    • Performance: 2-100X faster than general-purpose microprocessors
• Development of reconfigurable architectures also has potential downsides. Weaknesses:
    • Performance: 2-6X slower than a "hardwired" ASIC
    • Power: 13X greater power dissipation than a "hardwired" ASIC

Page 9: Overview and Commercial Examples

Motivation - Who

• Reconfigurable architectures naturally appeal to different groups of people:
    • Academic
    • Industrial
    • Military
• Each group looks to take advantage of the particular strengths for its application space

Page 10: Overview and Commercial Examples

Getting Started!

• Each reconfigurable piece of hardware has its own:
    • Strengths
    • Weaknesses
    • Tool flow: programming the part, programming the application
• It is important to understand how a particular piece of hardware fits into the global picture of reconfigurable devices in order to gain insight into these areas.
• How can this be done?

Page 11: Overview and Commercial Examples

Characteristics of Reconfigurable Architectures

• There is no "one" reconfigurable architecture or "one" reconfiguration characteristic.
• Reconfiguration manifests itself in particular areas reflecting possible applications.

Characteristic | Description
Spatial Computation | Data is processed by spatially distributing the computations.
Configurable Datapath | The functionality of the computational units and their interconnection network is flexible.
Distributed Control | Units process data based on local control.
Distributed Resources | The required resources for computation are distributed throughout the device.

Bondalapati and Prasanna - USC

Page 12: Overview and Commercial Examples

Characteristics of Reconfigurable Architectures

[Figure: four kinds of reconfigurability, each with an example use:
Reconfigurable logic: bit-level operations, e.g. encoding
Reconfigurable datapaths: dedicated data paths, e.g. filters, AGU
Reconfigurable arithmetic: arithmetic kernels, e.g. convolution
Reconfigurable control: RTOS, process management
Blocks drawn in the figure: CLBs; adder, buffer, reg0, reg1, mux; MAC, address generators, memories; data memory, program memory, instruction decoder and controller, datapath.]

Courtesy K. Keutzer

Page 13: Overview and Commercial Examples

Classification of Reconfigurable Architectures

• Technology
    • A coarse classification can be made based on the technology used to make the device.
    • This provides some insight into programming and organization.

Device | Description
Programmable Logic Device | PROMs, PLAs
Field Programmable Gate Array (FPGA) | Contains uncommitted configurable logic blocks (CLBs)
Field Programmable Analog Array (FPAA) | Contains uncommitted configurable analog blocks (CABs)
System on Chip (SOC) | Contains both static and reconfigurable components
Hybrid Architectures | Mix of discrete and continuous type components

Page 14: Overview and Commercial Examples

Classification of Reconfigurable Architectures

• Properties
    • Technology does not really address the "programming model" for the device.
    • What is available to the designer?
    • Four properties introduced by Bondalapati and Prasanna at USC:
        • Granularity
        • Host Coupling
        • Reconfiguration Methodology
        • Memory Organization

Page 15: Overview and Commercial Examples

Classification of Reconfigurable Architectures

Classification | Description
Granularity | Size of the smallest reconfigurable functional unit addressed by the mapping tools. Tradeoff between flexibility and performance overhead. Examples: CLB, ADC, ISA
Host Coupling | Type of coupling to the host processor: loose system level, loose chip level, or tight chip level. Examples: through I/O (SPLASH), direct communication (PRISM), same chip (GARP, Chameleon)
Reconfiguration Methodology | How the device is programmed. Examples: bitstream (serial, parallel), dynamic, partial
Memory Organization | How computations access memory. Example: large blocks, distributed
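
The four properties can be read as the fields of a small record that describes a device. The following is only a sketch under that reading: the class names, the `summarize` helper, and the example device with all of its field values are hypothetical and are not taken from Bondalapati and Prasanna or from any vendor.

```python
from dataclasses import dataclass
from enum import Enum


class HostCoupling(Enum):
    """The three coupling levels named in the classification."""
    LOOSE_SYSTEM_LEVEL = "loose system level"
    LOOSE_CHIP_LEVEL = "loose chip level"
    TIGHT_CHIP_LEVEL = "tight chip level"


@dataclass(frozen=True)
class DeviceProfile:
    """One device described by the four properties."""
    name: str
    granularity: str            # smallest unit the mapping tools address
    host_coupling: HostCoupling
    reconfiguration: str        # how the device is programmed
    memory_organization: str    # how computations access memory


def summarize(profile: DeviceProfile) -> str:
    """Render a profile as a single comparison line."""
    return (f"{profile.name}: granularity={profile.granularity}, "
            f"coupling={profile.host_coupling.value}, "
            f"reconfiguration={profile.reconfiguration}, "
            f"memory={profile.memory_organization}")


# Hypothetical device; the values are placeholders, not vendor data.
example = DeviceProfile(
    name="example-fpga",
    granularity="CLB",
    host_coupling=HostCoupling.TIGHT_CHIP_LEVEL,
    reconfiguration="partial bitstream",
    memory_organization="distributed",
)
print(summarize(example))
```

Filling in one such record per candidate device gives a quick side-by-side view along the same four axes.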

Page 16: Overview and Commercial Examples

Classification of Reconfigurable Architectures

• An alternate approach by P. Schaumont et al. is based on three orthogonal axes:
    • Vertical: level of abstraction
    • Horizontal: reconfigurable feature density
    • Time: timing relationship of configuration processing

Page 17: Overview and Commercial Examples

Classification of Reconfigurable Architectures - Vertical Axis

• This represents the level of abstraction.
• Four basic descriptions:
    • Implementation (I): the physical implementation can change. Example: power vs. performance.
    • Microarchitecture (M): the functional unit organization can change.
    • Instruction Set Architecture (ISA): the programmer's view changes from an instruction set standpoint.
    • Process/Systems Architecture (P): buffer sizes and task organization can change.

Page 18: Overview and Commercial Examples

Classification of Reconfigurable Architectures - Vertical Axis

Platform | Vertical Axis | Reference
Commercial
Excalibur, Altera | I, M | www.altera.com
MECA41, PMC-Sierra | P | www.pmc-sierra.com
Morphics | P | www.morphics.com
E7/A5, Triscend | ISA, P | www.trisend.com
FPSLIC, ATMEL | I, P | www.atmel.com
CS2112, Chameleon Sys. | M, P | www.chameleonsystems.com
PSoC, Cypress | I, M | www.cypressmicro.com
Academic
GARP, UCB | I, ISA, P | Brass.cs.berkeley.edu
PipeRench, CMU | M | www.ece.cum.edu/research/piperench
MorphoSys, UCI | I, ISA, P | www.eng.uci.edu/morphosys
SPS, UCLA | I, M | www.cs.ucla.edu/elib/reconfigurable
C-RISP, KULeuven | ISA | www.acca.be

Page 19: Overview and Commercial Examples

Classification of Reconfigurable Architectures - Horizontal Axis

• This represents feature diversity.
• Typically features are in communication, storage, and processing.
• There is interaction across the horizontal and vertical axes.

Design elements (horizontal axis) by design level:

Design Level | Communication | Storage | Processing
Implementation | Switches, muxes | RAM organization | CLB / IP block
Microarchitecture | Crossbar / bus | Register file size, cache | Execution unit type
ISA | Address size | Register set | Custom instructions
Process Architecture | Interconnection network | Buffer size | Number/type of tasks

Page 20: Overview and Commercial Examples

Classification of Reconfigurable Architectures - Time Axis

• Timing relationship of configuration processing
• Based on binding time
    • When the configuration data is sent to the part
• Implementation-time vs. design-time binding
    • Implementation time: binding is postponed until actual execution of the part.
    • Design time: binding happens when the part is conceived.
• Typically the lower-level features are bound at design time, while the others are bound at implementation time. In between lies the binding-time continuum.

Page 21: Overview and Commercial Examples

Classification of Reconfigurable Architectures

[Figure: vendor landscape. Region labels: FPGA, Processor, Specialized Micro-Architectures, Specialized Instruction-Set Architectures, Domain Specialization, Network Processors. Vendors shown: Chameleon Systems, Morphics, Frontier Design, Tensilica, ARC, Improv Systems, PMC Sierra, Xilinx, Altera, Atmel, Triscend, Actel, Adaptive Silicon, Proceler.]

Courtesy K. Keutzer

Page 22: Overview and Commercial Examples

Classification of Reconfigurable Architectures - Microcode

• Combining the ISA and microarchitecture levels of the vertical axis leads to a microcode classification by M. Sima et al.
• Two views of how a microinstruction controls resources:
    • Vertical: a microinstruction controls a single resource.
    • Horizontal: a microinstruction controls multiple resources in one cycle. In the extreme case, all resources are controlled.
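
The vertical/horizontal distinction can be sketched in a few lines. This is an illustrative model only, not the encoding used by Sima et al. or by any real machine; the resource names and the `nop` convention are assumptions made for the example.

```python
RESOURCES = ("alu", "shifter", "mem_port", "reg_file")  # hypothetical resource names


def vertical(resource, op):
    """Vertical microinstruction: controls exactly one resource."""
    if resource not in RESOURCES:
        raise ValueError(f"unknown resource: {resource}")
    return {resource: op}


def horizontal(**ops):
    """Horizontal microinstruction: one field per resource, issued in one cycle.

    Resources that are not mentioned receive an explicit no-op.
    """
    unknown = set(ops) - set(RESOURCES)
    if unknown:
        raise ValueError(f"unknown resources: {sorted(unknown)}")
    return {r: ops.get(r, "nop") for r in RESOURCES}


def as_vertical_sequence(word):
    """Expand a horizontal word into equivalent vertical microinstructions,
    one per active resource."""
    return [vertical(r, op) for r, op in word.items() if op != "nop"]


word = horizontal(alu="add", mem_port="load")
sequence = as_vertical_sequence(word)
print(len(sequence))  # number of vertical microinstructions for the same work
```

In this model one horizontal word that drives k resources corresponds to k vertical microinstructions, which is the width-versus-count tradeoff between the two styles.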

Page 23: Overview and Commercial Examples

Classification of Reconfigurable Architectures - Microcode

Page 24: Overview and Commercial Examples

Classification of Reconfigurable Architectures - SET instruction

• In addition to the microcode distinction, there is the notion of a SET instruction.
• This instruction initiates the reconfiguration of raw hardware.
• It can be used in conjunction with the microcode classification.
• It sits at an extreme of the time axis mentioned in the previous classification method.

Page 25: Overview and Commercial Examples

Classification of Reconfigurable Architectures - SET and µcode

[Table: machines grouped by microcode style (vertical µcode, horizontal µcode) and, within each style, by SET instruction (explicit SET, w/o SET). Machines listed: PRISM, PRISC, CoMPARE, PipeRench, PRISM II/RASC, OneChip, Alippi's VLIW, RISA and its primed variants, ConCISe, OneChip-98, VEGA, MIPS + REMARC, DISC, RaPiD, GARP, Multiple-RISA, Colt, Chimaera, rDPA, URISC, Nano-Processor, Gilson's CCM, CCSimP, Xputer/rALU.]

Page 26: Overview and Commercial Examples

Classification of Reconfigurable Architectures - Runtime vs. Compile Time

• Related to the time axis as well as the SET instruction.
    • The time axis allows for a less coarse continuum.
    • SET is the opposite extreme.
• Often referred to as dynamic vs. static reconfigurability
    • Compile time: a predetermined configuration remains until the completion of a particular task.
    • Runtime: the device can be repeatedly programmed with many smaller functions to complete a particular application.
        • There is overhead associated with this reconfiguration.
        • Key performance issues: configuration time reduction and retention of intermediate values.
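
A back-of-the-envelope cost model makes the overhead tradeoff concrete. This is a sketch under simplifying assumptions that are not in the slides: a static design loads one configuration and runs every kernel at baseline speed, while a dynamic design pays a fixed swap cost per kernel and runs each kernel faster by a uniform `speedup`. All cycle counts are made up.

```python
def compile_time_total(kernel_cycles, config_cycles):
    """Static: configure once, then run every kernel at baseline speed."""
    return config_cycles + sum(kernel_cycles)


def runtime_total(kernel_cycles, swap_cycles, speedup):
    """Dynamic: swap in a specialized configuration before each kernel."""
    return sum(swap_cycles + cycles / speedup for cycles in kernel_cycles)


kernels = [1000, 2000, 3000]          # hypothetical per-kernel cycle counts

static_total = compile_time_total(kernels, config_cycles=100)
cheap_swaps = runtime_total(kernels, swap_cycles=100, speedup=2.0)
costly_swaps = runtime_total(kernels, swap_cycles=2000, speedup=2.0)

print(static_total, cheap_swaps, costly_swaps)
```

With cheap swaps the dynamic design wins; with expensive swaps the same speedup is not enough, which is why configuration time reduction is listed as a key performance issue.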

Page 27: Overview and Commercial Examples

Classification of Reconfigurable Architectures - Runtime vs. Compile Time

Page 28: Overview and Commercial Examples

Reconfigurable Challenges

• Notice the similarities as well as the differences among the previously mentioned methods of classification.
    • Similarities point out some fundamental issues with reconfigurable devices
        • Abstraction levels, binding times
    • Differences point out features that may be more a function of the device than of the architectures in general.
• The key challenge is how to cope with:
    • Static vs. dynamic reconfiguration
    • Design methodologies
    • Multi-dimensional optimization
    • Design tools

Page 29: Overview and Commercial Examples

Reconfigurable Challenges - Static vs. Dynamic

• Dynamic reconfiguration requires that configuration scheduling and constraints be accounted for, so that applications can take advantage of hardware that can adapt continuously.
• "Design Methodologies for Partially Reconfigured Systems" - Hadley and Hutchings - Brigham Young
    • Looks at how to reconfigure only those aspects of the device that require a change, thus saving configuration time.
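
The idea of rewriting only what changed can be sketched as a diff over configuration frames. This is not the method of Hadley and Hutchings, only a minimal illustration; the frame contents and the per-frame cost are hypothetical.

```python
def frames_to_rewrite(current, target):
    """Indices of configuration frames whose contents differ."""
    if len(current) != len(target):
        raise ValueError("configurations must cover the same frames")
    return [i for i, (old, new) in enumerate(zip(current, target)) if old != new]


def reconfiguration_cycles(current, target, cycles_per_frame):
    """Cost of a partial reconfiguration that rewrites only changed frames."""
    return len(frames_to_rewrite(current, target)) * cycles_per_frame


current = [0x1A, 0x2B, 0x3C, 0x4D]    # hypothetical frame contents
target = [0x1A, 0xFF, 0x3C, 0x00]

changed = frames_to_rewrite(current, target)
partial_cost = reconfiguration_cycles(current, target, cycles_per_frame=50)
full_cost = len(target) * 50          # rewriting every frame regardless

print(changed, partial_cost, full_cost)
```

The saving grows with the fraction of frames shared between successive configurations.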

Page 30: Overview and Commercial Examples

Reconfigurable Challenges - Design Methodologies

• Platform Based Design
    • Constraints, Applications, Platforms, Estimation (CAPE) - Densmore, ASV - UCB
        • Boolean constraint based, with PBD
• Hybrid System Architecture Model (HySAM) - Bondalapati - USC
    • Von Neumann-style processor and configurable logic unit.
    • Finds "optimal" partitions of the capabilities of the hardware from the implementations.
• SCORE - Wawrzynek et al. - UCB
    • Virtualizes computing resources by dividing computation into fixed-size "pages" and time-multiplexing the pages on the available physical hardware.
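
The paging idea attributed to SCORE above can be illustrated with a toy schedule. This sketch is not SCORE's scheduler: it ignores data dependencies and the buffering between pages that a real runtime has to manage, and the page names and slot count are invented.

```python
def time_multiplex(pages, physical_slots):
    """Assign fixed-size compute pages to time slices, at most
    `physical_slots` pages resident in each slice."""
    if physical_slots < 1:
        raise ValueError("need at least one physical slot")
    return [pages[i:i + physical_slots]
            for i in range(0, len(pages), physical_slots)]


virtual_pages = ["A", "B", "C", "D", "E"]   # hypothetical compute pages
schedule = time_multiplex(virtual_pages, physical_slots=2)

for step, resident in enumerate(schedule):
    print(step, resident)
```

Five virtual pages run on two physical slots in three time slices, so the same design can target devices of different sizes at the cost of more slices.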

Page 31: Overview and Commercial Examples

Reconfigurable Challenges - Multidimensional Optimization

• Design space exploration process in which multiple metrics are examined.
• Three axes:
    • Application constraints
    • Architecture constraints
    • Adaptation constraints
• For example: configuration overhead vs. performance (adaptation vs. architecture, with a requirement to meet application needs)
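
The configuration-overhead-versus-performance example can be sketched as a Pareto filter followed by an application constraint. The design points and the limit below are invented for illustration; lower is better on both metrics.

```python
def dominates(p, q):
    """True if p is no worse than q on both metrics and better on one."""
    return p[1] <= q[1] and p[2] <= q[2] and (p[1] < q[1] or p[2] < q[2])


def pareto_front(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def feasible(points, max_exec_time):
    """Application constraint: execution time must not exceed the limit."""
    return [p for p in points if p[2] <= max_exec_time]


# (name, configuration overhead, execution time), hypothetical values
designs = [("A", 10, 100), ("B", 50, 60), ("C", 60, 70), ("D", 200, 30)]

front = pareto_front(designs)
choices = feasible(front, max_exec_time=80)

print([p[0] for p in front], [p[0] for p in choices])
```

Design C is dropped because B is better on both metrics, and A is dropped because it misses the application's execution-time requirement.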

Page 32: Overview and Commercial Examples

Reconfigurable Challenges - Design Tools

• Architecture based
    • Propose ways of organizing and interfacing configurable logic.
• Theoretical modeling
    • Reconfigurable mesh analysis, virtual hardware operating systems
• Algorithmic synthesis
    • Techniques to schedule computations on dynamically reconfigurable machines.
• Software tools
    • Mapping techniques, run-time reconfiguration, compilation from high-level languages, simulation, operating systems, etc.

Page 33: Overview and Commercial Examples

Reconfigurable Challenges - Design Tools

[Figure: a complex programmable chip (program ROM, A/D, D/A, P=>S, S=>P, core µP, ASIC circuitry, DMA, on-chip program RAM, FPGA, off-chip RAM) surrounded by two groups of tools, captioned "Tools that help build the complex programmable chips" and "Tools that help program them". Tools shown: signal integrity, 3D extraction, SW estimators, performance visualization, runtime scheduling, debugger, gridless router, RTL model, RTL floorplanner, logic synthesis, compiler.]

Courtesy K. Keutzer

Page 34: Overview and Commercial Examples

Cypress Semiconductor's PSoC

• Developed by Cypress Microsystems, a subsidiary of Cypress Semiconductor. Acquired March 6th, 2000.
• PSoC released November 13, 2000
    • "As general purpose solutions, PSoC devices are targeted for implementation in embedded applications, including audio, wireless, handheld, data communications, Internet control, industrial, and consumer systems."
• Named Innovation of the Year 2001 by EDN Magazine.
• Berkeley was provided with a PSoC development kit as a member of GSRC.

http://www.cypressmicro.com

Page 35: Overview and Commercial Examples

Cypress Semiconductor's PSoC - Hardware Overview

• Harvard architecture processor
    • M8C; up to 24 MHz; flexible addressing modes
    • Separate MAC; 8x8 multiply, 32-bit accumulate
• On-chip memory
    • Flash, 4K to 16K, SONOS™-based (Silicon Oxide Nitride Oxide Silicon)
    • 256 bytes SRAM
    • EEPROM emulation in Flash
• Programmable System on a Chip blocks
    • 12 analog blocks
    • 8 digital blocks
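
The "8x8 multiply, 32-bit accumulate" figure for the MAC can be illustrated with a small behavioral model. This is a sketch, not the PSoC register interface: treating the operands as unsigned and wrapping the accumulator on overflow are assumptions of the example, and the function names are invented.

```python
MASK32 = 0xFFFFFFFF


def mac_step(acc, a, b):
    """One multiply-accumulate: 8-bit operands, 32-bit wrapping accumulator.
    Operands are treated as unsigned here (an assumption of this sketch)."""
    if not (0 <= a <= 0xFF and 0 <= b <= 0xFF):
        raise ValueError("operands must fit in 8 bits")
    return (acc + a * b) & MASK32


def dot_product(xs, ys):
    """Accumulate the products of two equal-length 8-bit sequences."""
    acc = 0
    for a, b in zip(xs, ys):
        acc = mac_step(acc, a, b)
    return acc


result = dot_product([1, 2, 3], [4, 5, 6])
# Worst-case products that fit before a 32-bit accumulator can wrap.
max_terms = MASK32 // (0xFF * 0xFF)
print(result, max_terms)
```

The wide accumulator is what lets long sums of 16-bit products, such as filter taps, run without intermediate overflow handling.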

Page 36: Overview and Commercial Examples

Cypress Semiconductor's PSoC - Application Overview

• Company line
    • "PSoC™ Devices Integrate Programmable Analog and Digital Functions To Simplify Design Of Wireless, Handheld, Data Communications, and Industrial Systems"
• Sample application notes
    • Range Finder
    • 1-GHz Vectorial Network Analyzer
    • Remote Human Health Monitoring System
• Dynamic reconfiguration is a key application point.

Page 37: Overview and Commercial Examples

Cypress PSoC System Overview

• Keys to note:
    • Programmable interconnect
    • Digital PSoC blocks
    • Analog PSoC blocks
    • Separate MAC
    • Static peripherals
        • LVD, decimator, etc.
• Exposed to the programmer through the "Module Placement view"
• Exposed to the programmer through the "Application View"

http://www.cypressmicro.com

Page 38: Overview and Commercial Examples

PSoC - M8C

• 8-bit, Harvard architecture microprocessor
• Five hardware registers
    • Flags (F): 3 status bits, global interrupt bit, XIO (register bank switch)
    • Program Counter (PC): 16 bit; full addressing of the 16K Flash
    • Accumulator (A)
    • Stack Pointer (SP)
    • Index (X): used in addressing modes; often used by peripherals

[Figure: CPU with separate program memory and data memory]

Page 39: Overview and Commercial Examples

PSoC - M8C Address Space

Page 40: Overview and Commercial Examples

Cypress Semiconductor's PSoC - Digital Blocks

• Total of eight 8-bit digital blocks
    • Four Digital Basic Type A (DBA) and four Digital Communications Type A (DCA)
    • Each can be configured independently or in combination
    • Each has a unique interrupt vector and interrupt enable bit
• Three configuration registers to program
    • Function register: function and mode
        • Timer, counter, CRC/PRS, deadband (for PWM), UART, Serial Peripheral Interface (SPI)
    • Input register: data input and clock selection
    • Output register: select and enable outputs

Page 41: Overview and Commercial Examples

Cypress Semiconductor's PSoC - Digital Blocks

• Three data registers
    • Data0, Data1, Data2: function dependent
• One control register
• Sample register
• Exposed in the "Module Placement View"

Page 42: Overview and Commercial Examples

Cypress Semiconductor's PSoC - Digital Blocks

Page 43: Overview and Commercial Examples

Cypress Semiconductor's PSoC - Analog Blocks

• 12 analog blocks
    • 4 continuous time blocks, 4 Type A switched capacitor blocks, and 4 Type B switched capacitor blocks
• Three distinct outputs from each analog block
    • The analog output bus (ABUS), which is shared by all blocks in an analog column.
    • The comparator bus (CBUS), a digital resource shared by all blocks in a column.
    • The output bus (Out), which is shared by all blocks in the column and can be reconfigured to send a signal externally.

Page 44: Overview and Commercial Examples

Cypress Semiconductor's PSoC - Analog Blocks

• Analog block registers
    • Analog Column Clock Select register
    • Analog Reference Control register
    • Analog Clock Select register
    • Control0, Control1, Control2 registers (Control3 for switched capacitor blocks)
• Exposed in the "Module Placement View"

Page 45: Overview and Commercial Examples

Cypress Semiconductor's PSoC - Analog Blocks

Page 46: Overview and Commercial Examples

Cypress Semiconductor's PSoC - User Modules

• User modules are what the programmer really sees when configuring the device.
• They could be considered primitive components, along with the M8C and the static peripherals.
• Current user modules (sample in table).
• New modules arrive in software updates.

User Module | PSoC Blocks | Memory (Bytes)
12-Bit ADC | 2D, 1A | 184 Flash, 6 SRAM
Programmable Gain Amp | 1A CT | 32 Flash
8-bit Counter | 1D | 66 Flash
6-Bit DAC | 1A SwCp | 47 Flash
16-bit CRC | 2D | 56 Flash
Two Pole Band Pass Filter | 2A SwCp | 29 Flash
16-bit PWM | 2D | 115 Flash
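
The block and memory figures in the user module table lend themselves to a quick budget check against the 8 digital and 12 analog blocks on the device. This is a sketch only: it ignores the distinction between continuous time and switched capacitor analog blocks, assumes 0 bytes of SRAM where the table lists none, and the helper names are invented.

```python
# (digital blocks, analog blocks, Flash bytes, SRAM bytes) per user module,
# transcribed from the sample table; SRAM is 0 where none is listed.
USER_MODULES = {
    "12-Bit ADC": (2, 1, 184, 6),
    "Programmable Gain Amp": (0, 1, 32, 0),
    "8-bit Counter": (1, 0, 66, 0),
    "6-Bit DAC": (0, 1, 47, 0),
    "16-bit CRC": (2, 0, 56, 0),
    "Two Pole Band Pass Filter": (0, 2, 29, 0),
    "16-bit PWM": (2, 0, 115, 0),
}
DIGITAL_BLOCKS = 8
ANALOG_BLOCKS = 12


def budget(selection):
    """Total (digital, analog, Flash, SRAM) used by a list of modules."""
    return tuple(sum(USER_MODULES[m][i] for m in selection) for i in range(4))


def fits(selection):
    """True if the selection stays within the device's block counts."""
    digital, analog, _, _ = budget(selection)
    return digital <= DIGITAL_BLOCKS and analog <= ANALOG_BLOCKS


design = ["12-Bit ADC", "16-bit PWM", "16-bit CRC", "8-bit Counter"]
print(budget(design), fits(design))
print(fits(design + ["16-bit PWM"]))   # a second PWM needs a ninth digital block
```

A selection that does not fit at once is a natural candidate for splitting across configurations that are swapped at runtime.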

Page 47: Overview and Commercial Examples

Cypress Semiconductor's PSoC - Programming Environment

• Windows-based graphical programming environment for both the configuration of the reconfigurable blocks and interconnect and the development of the software.
• Multiple editors ("views")
    • Device Editor
    • Application Editor
    • Debugger

Page 48: Overview and Commercial Examples

Cypress Semiconductor's PSoC - Dynamic Reconfiguration

• In the Module Selection view, you can import (or export) configurations.
• Configurations consist of user modules, their interconnections, and their parameters.
• Then at runtime you can swap to another configuration via

    call UnloadConfig_newled_proj
    call LoadConfig_dynamic_improved

• This amounts to swapping out and reloading the PSoC block registers mentioned earlier.
    • The configurations are stored in Flash
    • 100+ cycles (best guess)
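
The unload/load pair can be pictured as clearing and rewriting block registers from tables held in Flash. The model below is a sketch of that reading only: the class, the register names, and the register values are hypothetical and are not the code that the PSoC tools generate.

```python
class PSoCModel:
    """Toy model: configurations are register tables stored in Flash."""

    def __init__(self, flash_configs):
        self.flash = flash_configs    # name -> {register: value}
        self.registers = {}           # live block registers
        self.loaded = set()

    def load_config(self, name):
        """Copy a configuration's register values from Flash to the blocks."""
        self.registers.update(self.flash[name])
        self.loaded.add(name)

    def unload_config(self, name):
        """Release the registers that a configuration had claimed."""
        for reg in self.flash[name]:
            self.registers.pop(reg, None)
        self.loaded.discard(name)


# Hypothetical register names and values, for illustration only.
device = PSoCModel({
    "newled_proj": {"DBA00_FN": 0x01, "DBA00_IN": 0x16},
    "dynamic_improved": {"DBA00_FN": 0x21, "DCA04_OUT": 0x05},
})

device.load_config("newled_proj")
device.unload_config("newled_proj")       # like: call UnloadConfig_newled_proj
device.load_config("dynamic_improved")    # like: call LoadConfig_dynamic_improved
print(device.registers)
```

In this model the swap cost scales with the number of registers written, which is consistent with a cycle count in the low hundreds for a handful of blocks.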

Page 49: Overview and Commercial Examples

Xilinx Virtex II Pro

• High-performance FPGA
    • Up to 24 RocketIO embedded multi-gigabit transceivers
    • Up to 4 IBM PowerPC RISC processor blocks
• Based on Virtex II Platform FPGA technology
    • CLB resources and logic cells (4-input LUT, FF + carry logic)
    • SRAM-based in-system configuration
    • Active Interconnect Technology
    • Dedicated 18-bit x 18-bit multiplier blocks

Xilinx Advance Product Specification
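
A 4-input LUT is a 16-entry truth table held in configuration memory, so changing 16 bits changes the logic function. The sketch below models only that idea; the bit ordering (input `a` as the least significant index bit) is an assumption of the example and not the Virtex II Pro bitstream format.

```python
def make_lut4(config_bits):
    """Build a 4-input LUT from a 16-bit truth table.
    Bit i of config_bits is the output for input index i."""
    if not 0 <= config_bits <= 0xFFFF:
        raise ValueError("a 4-input LUT holds exactly 16 configuration bits")

    def lut(a, b, c, d):
        index = (a & 1) | ((b & 1) << 1) | ((c & 1) << 2) | ((d & 1) << 3)
        return (config_bits >> index) & 1

    return lut


and4 = make_lut4(0x8000)   # only index 15 (all inputs high) yields 1
xor4 = make_lut4(0x6996)   # indices with an odd number of 1s yield 1

print(and4(1, 1, 1, 1), and4(1, 0, 1, 1), xor4(1, 0, 0, 0), xor4(1, 1, 0, 0))
```

The same hardware cell becomes an AND gate or a parity generator depending only on the stored bits, which is the basis of SRAM-based in-system configuration.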

Page 50: Overview and Commercial Examples

Virtex II Pro Generic Architecture Overview

• Embedded RocketIO Multi-Gigabit Transceiver (MGT)
• Processor block containing embedded IBM PowerPC
• FPGA fabric

Page 51: Overview and Commercial Examples

Xilinx Virtex II PowerPC Core

Page 52: Overview and Commercial Examples

Virtex II Tool Flow

• Main package is the Xilinx ISE tools
    • HDL-based designs
    • Schematic-based designs
    • Behavioral simulation
        • ModelSim based
    • Design implementation
    • Timing simulation
• Synthesis
    • Xilinx Synthesis Technology (XST)
        • Works for both HDL and schematic designs
        • Part of ISE
    • Synplify/Synplify Pro
        • Schematic based; not part of ISE
    • LeonardoSpectrum
        • Works for both HDL and schematic designs; not part of ISE

Page 53: Overview and Commercial Examples

Virtex II IP Blocks

• Key tool is Xilinx's CORE Generator
• The Xilinx CORE Generator System generates and delivers parameterizable cores optimized for Xilinx FPGAs.
• Both Xilinx and third-party cores
    • Communication/network
    • Math
    • DSP
    • Memories/storage
    • Microprocessors/controllers
    • Video/audio processing

Page 54: Overview and Commercial Examples

Virtex II Applications

• Networking: network switch fabrics
• Wireless base stations
• Mass storage
• Video servers: video-on-demand servers
• Software-defined radio (SDR) with Mercury Computer Systems

Page 55: Overview and Commercial Examples

Conclusions

• Reconfigurable architectures have many different definitions, arising from a diverse system of classification:
    • Technology
    • Properties: granularity, host coupling, reconfiguration methodology, memory organization
    • Abstraction vs. feature density vs. time
    • Microcode organization
    • Runtime vs. compile time
• This talk did not really touch on how SoCs and hybrid architectures fit into this scheme.
• Keep in mind the high-level characteristics mentioned initially as common ground:
    • Spatial computation
    • Configurable datapath
    • Distributed control
    • Distributed resources

Page 56: Overview and Commercial Examples

Conclusions

• The right choice of a reconfigurable device can greatly HELP or HURT your application.
    • Because of the relative strengths and weaknesses of the various devices, you should examine how your application will run on each device.
• Reconfigurable devices fit very nicely into many tool chains that seek to examine various architecture instances.
    • Platform Based Design: many different architecture instances.

Page 57: Overview and Commercial Examples

Conclusions

• Reconfigurable architectures are here to stay!
    • They deal with increased time-to-market pressures
    • They help keep product costs low (reuse, IP blocks, etc.)
    • One supplier can be the vendor of choice.
• Many great research problems can be investigated with relatively simple devices.
    • Scheduling, mapping, hardware/software co-design, testing, etc.