Algorithm Analysis and Mapping Environment for...

8
Eric Pauer, Cory Myers, Ken Smith, and Paul Fiore {pauer,cory,jmsmith,pfiore}@sanders.com Sanders, a Lockheed Martin Company Nashua, NH 03061 Algorithm Analysis and Mapping Environment for Adaptive Computing Systems Adaptive Computing System Design and Implementation using the ACS Domain Third Bi-Annual Ptolemy Miniconference - 1999 PUBS-98-M21_001-P 2/12/99 2 Reconfigurable computing technology offers leap ahead performance, e.g. 10X ops per watt and/or ops per cubic inch, over general purpose programmable solutions without the need to develop custom hardware. However, today generation of a working implementation requires hardware design expertise and generation of a good implementation requires many slow iterations between an algorithm designer and a hardware developer. Statement of the Problem

Transcript of Algorithm Analysis and Mapping Environment for...

Page 1: Algorithm Analysis and Mapping Environment for …ptolemy.eecs.berkeley.edu/conferences/99/pauer.pdfPUBS-98-M21_001-P 2/12/99 2 Reconfigurable computing technology offers leap ahead

Eric Pauer, Cory Myers, Ken Smith, and Paul Fiore{pauer,cory,jmsmith,pfiore}@sanders.com

Sanders, a Lockheed Martin CompanyNashua, NH 03061

Algorithm Analysis and MappingEnvironment for Adaptive Computing

Systems

Adaptive Computing System Design and Implementation using the ACS Domain

Third Bi-Annual Ptolemy Miniconference - 1999

PUBS-98-M21_001-P2/12/99 2

Reconfigurable computing technology offers leap aheadperformance, e.g. 10X ops per watt and/or ops per cubic inch,

over general purpose programmable solutions without theneed to develop custom hardware. However, today

generation of a working implementation requires hardwaredesign expertise and generation of a good implementation

requires many slow iterations between an algorithm designerand a hardware developer.

Statement of the Problem

Page 2: Algorithm Analysis and Mapping Environment for …ptolemy.eecs.berkeley.edu/conferences/99/pauer.pdfPUBS-98-M21_001-P 2/12/99 2 Reconfigurable computing technology offers leap ahead

Adaptive Computing System Design and Implementation using the ACS Domain

Third Bi-Annual Ptolemy Miniconference - 1999

PUBS-98-M21_001-P2/12/99 3

EXCEED(1)1330

ThreshCalc

FilterBank

MeanCalc

ThreshDelayRam

ExceedCalcFilter

SelectY Calc Sigma

Calc

FBINDelayRAM

MNINDelayRam

THRES(19)1304

FBOUT(20)1328

MEAN(19)1323

FSDLY(2)1323

1296

FLSEL(2)

1298

THDL(19)1329

L = 8s, 6fC = 355

L = 545fb, 30fsC = 3880

L = 1037C = 993

L = 25C = 0

L = 7m, 7fs, 2fo, 1tC = 288MNINO(12)

286

L = 286C = 0

L = 1037C = 1002

L = 783C = 0

SIGMA(20)

FBINO(12)783

IN(12)

L = 259C = 287

Y(13)259

Greater than 10X performanceDesign time measured in weeks

Analysis:Bit WidthsLatency (L)Cells Used (C)

Adaptive Computing Performance GainCHAMP TMS 320C80

Image Size 256 x 256 256 x 256Implementation Time 44 Days 28 DaysFrame Rate 305 frames/sec 12 frames/secLatency 68 µsec 82,000 µsecProcessing Load 4.7 Bops 0.2 BopsUtilization 73% UnknownGates 510k N/A

(Operation count increase by 70% if memory loads and stores are counted)

Reconfigurable Architectures

CHAMP

CHAMP 2 MCMRCP

Adaptive Computing System Design and Implementation using the ACS Domain

Third Bi-Annual Ptolemy Miniconference - 1999

PUBS-98-M21_001-P2/12/99 4

Floating PointSimulation

Fixed PointSimulation

Bit Width AnalysisNoise Distribution Analysis

AlgorithmRearrangement

Dataflow Graph

Algorithm Analysis

PerformanceModeling

AutomaticScheduling

Algorithm Mapping

Smart Generators

Partitioning andMapping

AdaptiveComputingResource

Precision Analysis Alternative Implementations

Dataflow Graph

Performance Metrics

Device Programming

Allocated Functions

•SNR analysis•Alternative

implementations•Functional

approximations

•Timing and sizingestimation

•Scheduling – FSM andcontexts

•Partitioning within aresource node

•Device program•Interface program

CommonDatabase

inPtolemy

GeneratorSelection

VHDL Interface Libraries

Analysis and Mapping in ACS Environment

Page 3: Algorithm Analysis and Mapping Environment for …ptolemy.eecs.berkeley.edu/conferences/99/pauer.pdfPUBS-98-M21_001-P 2/12/99 2 Reconfigurable computing technology offers leap ahead

Adaptive Computing System Design and Implementation using the ACS Domain

Third Bi-Annual Ptolemy Miniconference - 1999

PUBS-98-M21_001-P2/12/99 5

OptimizeWordlengths

Floating PointSimulation

Fixed PointRealization

Bit Widths for each flowCost estimates (area/complexity)Quantization Noise (SNR)

Yields

a

b1 b3

b6b

b2 b4

x

x2

x2b5

y

F1 F3 F4F2

F6 F7F5

F’1 F’3 F’4F’2

F’6 F’7F’5

Automated Float to Fixed Point Translation

SNRCost

Maximize SNR(b),subject to Cost(b) " C0 and b " bmin " 0

orMinimize Cost(b),

subject to SNR(b) " SNR0 and b " bmin " 0

Adaptive Computing System Design and Implementation using the ACS Domain

Third Bi-Annual Ptolemy Miniconference - 1999

PUBS-98-M21_001-P2/12/99 6

•Ptolemy - simulation/design environment from the University ofCalifornia, Berkeley (http://ptolemy.eecs.berkeley.edu)

•New ACS domain developed to facilitate movement amongsimulation and code/design generation (released in 0.7.1, 6/98)

•ACS Stars (basic building block) are composed of a Corona(interface) and multiple cores (implementations)

•Core (implementation) selection is via targeting mechanism

Fixed_PointSimulation

Core

C CodeGeneration

Core

FPGA DesignGeneration

Core

FPGA JavaGeneration

Core

Floating_PointSimulation

Core

Corona

Retargetable Implementations

Common Interface

UCB BRASS Project

Ptolemy and the ACS Domain

Available cores/targets

Page 4: Algorithm Analysis and Mapping Environment for …ptolemy.eecs.berkeley.edu/conferences/99/pauer.pdfPUBS-98-M21_001-P 2/12/99 2 Reconfigurable computing technology offers leap ahead

Adaptive Computing System Design and Implementation using the ACS Domain

Third Bi-Annual Ptolemy Miniconference - 1999

PUBS-98-M21_001-P2/12/99 7

Top Level Example• FIR filter to be implemented in both floating point and fixed point simulation

Adaptive Computing System Design and Implementation using the ACS Domain

Third Bi-Annual Ptolemy Miniconference - 1999

PUBS-98-M21_001-P2/12/99 8

• Alternative implementations are represented as “targets”• Targets can have parameters• Floating point simulation, fixed point simulation, and C code

generation are integrated today. FPGA generation almost ready.

Selecting Among Alternative Implementations

ACS-CGFPGAsoon!

Page 5: Algorithm Analysis and Mapping Environment for …ptolemy.eecs.berkeley.edu/conferences/99/pauer.pdfPUBS-98-M21_001-P 2/12/99 2 Reconfigurable computing technology offers leap ahead

Adaptive Computing System Design and Implementation using the ACS Domain

Third Bi-Annual Ptolemy Miniconference - 1999

PUBS-98-M21_001-P2/12/99 9

•Comparison of floating point and fixed point implementations

Comparing Implementations

Adaptive Computing System Design and Implementation using the ACS Domain

Third Bi-Annual Ptolemy Miniconference - 1999

PUBS-98-M21_001-P2/12/99 10

Winograd-based FSK Receiver

Page 6: Algorithm Analysis and Mapping Environment for …ptolemy.eecs.berkeley.edu/conferences/99/pauer.pdfPUBS-98-M21_001-P 2/12/99 2 Reconfigurable computing technology offers leap ahead

Adaptive Computing System Design and Implementation using the ACS Domain

Third Bi-Annual Ptolemy Miniconference - 1999

PUBS-98-M21_001-P2/12/99 11

ACS Domain - CGFPGA TargetWinograd dataflow (ACS domain)

Dataflow/Hardware schedule

VHDL design (generated)

The results are sent to synthesisand place/route, yielding

complete FPGA implementation!

CGFPGAtarget yields:VHDL designand schedule

Adaptive Computing System Design and Implementation using the ACS Domain

Third Bi-Annual Ptolemy Miniconference - 1999

PUBS-98-M21_001-P2/12/99 12

Winograd schedule

Page 7: Algorithm Analysis and Mapping Environment for …ptolemy.eecs.berkeley.edu/conferences/99/pauer.pdfPUBS-98-M21_001-P 2/12/99 2 Reconfigurable computing technology offers leap ahead

Adaptive Computing System Design and Implementation using the ACS Domain

Third Bi-Annual Ptolemy Miniconference - 1999

PUBS-98-M21_001-P2/12/99 13

FPGA-based Winograd DFT

Address generatorWord counter

Data multiplexerSequencer/State Machine

Winograd DFTcomputations

Xilinx 4062XL

Adaptive Computing System Design and Implementation using the ACS Domain

Third Bi-Annual Ptolemy Miniconference - 1999

PUBS-98-M21_001-P2/12/99 14

Hardware-in-the-loop

SDF Wildforce star executes complete FPGA designin hardware on Annapolis Wildforce FPGA board

SDF Galaxy

Page 8: Algorithm Analysis and Mapping Environment for …ptolemy.eecs.berkeley.edu/conferences/99/pauer.pdfPUBS-98-M21_001-P 2/12/99 2 Reconfigurable computing technology offers leap ahead

Adaptive Computing System Design and Implementation using the ACS Domain

Third Bi-Annual Ptolemy Miniconference - 1999

PUBS-98-M21_001-P2/12/99 15

Sim vs. Hardware

Adaptive Computing System Design and Implementation using the ACS Domain

Third Bi-Annual Ptolemy Miniconference - 1999

PUBS-98-M21_001-P2/12/99 16

Context SwitchingReconfigurable

Computing

Adaptive ComputingSmart Modules

Algorithm Analysis andMapping Environment

for AdaptiveComputing Systems

ReconfigurableAlgorithms for Adaptive

Computing

Efficient MathematicalAlgorithms for Image

Processing Applications

Advanced SensorsFederated Laboratory

TechnologyApplications

JSF

ACP

HHSAR

MAV

Reconfigurable andAdaptive Computing

IRADs

ASFL

OSA

REE

ISAC

AdaptiveComputing

SystemInsertions

Related ACS Work at Sanders