An Integrated Debugging Environment for Reprogrammable Hardware Systems Kevin Camera Hayden So Bob Brodersen Berkeley Wireless Research Center University of California, Berkeley AADEBUG 2005

An Integrated Debugging Environment for Reprogrammable Hardware Systems

Kevin Camera, Hayden So, Bob Brodersen

Berkeley Wireless Research Center, University of California, Berkeley

AADEBUG 2005

Outline

• Motivation
• Existing platform
• Existing design/verification flow
• Proposed solution
• Environment features
• Walkthrough
• Implementation strategy

Application Domain

Direct-mapped, reprogrammable hardware systems

FPGA-based signal processing and supercomputing arrays

FPGA Computing Benefits

Better power, computation, and cost efficiency than any processor-based solution, due to direct mapping of algorithms

                              XC2VP70-7   C6415T-1G
Computation rate (Gop/s)      72          4
Power efficiency (Gop/s/W)    2.72        1.84
Price/performance (Mop/s/$)   31.0        14.81

Chang, Wawrzynek, Brodersen; ISCA ’05

BEE2: 2nd Berkeley Emulation Engine

(5) Xilinx V2P100 per board
• ~100K logic cells
• 2 PowerPC405 cores
• 444 dedicated multipliers
• 1MB on-chip SRAM
• 3.125Gb/s duplex links

(4) DDR2 banks per FPGA
• 72 bits per bank with ECC
• Up to 12.8 GB/s (DDR400) or 17 GB/s (DDR533) bandwidth
• Up to 4GB capacity
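
The quoted peak bandwidths can be checked with a little arithmetic. The sketch below assumes that 64 of the 72 bits per bank carry data (the other 8 being ECC) and that DDR400 and DDR533 mean 400 and 533 million transfers per second. Neither assumption is stated on the slide.

```python
# Peak DRAM bandwidth per FPGA, under the assumptions stated above.
BANKS_PER_FPGA = 4          # from the slide
DATA_BITS_PER_BANK = 64     # assumption: 72-bit bank = 64 data bits + 8 ECC bits


def peak_bandwidth_gb_per_s(mega_transfers_per_s: float) -> float:
    """Return the peak bandwidth in GB/s across all banks of one FPGA."""
    bytes_per_transfer = BANKS_PER_FPGA * DATA_BITS_PER_BANK // 8
    return bytes_per_transfer * mega_transfers_per_s * 1e6 / 1e9


if __name__ == "__main__":
    for rate in (400, 533):
        print(f"DDR{rate}: {peak_bandwidth_gb_per_s(rate):.1f} GB/s")
```

Under these assumptions the results agree with the 12.8 and (rounded) 17 GB/s figures above.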

BEE Design Flow

Design entry is in the Matlab/Simulink environment
• Graphical, library based; also allows custom HDL

Typical FPGA path to physical implementation
• HDL synthesis and place and route
• Hierarchy is flattened in each pass (non-modular flow)

[Figure: flow diagram with stages Design, Netlist, Place and Route, Verify Hardware]

Design Verification Methods

[Figure: the three methods below arranged along an axis labeled "Complexity, Accuracy"]

High-level functional simulation

HDL/RTL simulation

Native FPGA execution

High-level Functional Simulation

Design execution in Matlab/Simulink

Intended to be correct by construction

Fastest software-based simulation

Powerful and convenient algorithm exploration

Drawbacks of High-level Simulation

Even with a high level of abstraction, vastly slower than hardware
• Trend is worsening with increased FPGA capacity

Doesn’t cover any side effects or requirements of the backend tool chain

[Figure: bar chart of execution time in seconds: HW 2E-06, SW 38.8]
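
To put the gap in the chart into numbers, the short calculation below takes the two plotted values (2E-06 s for hardware, 38.8 s for software) and reports their ratio. It is only a restatement of the chart data, not an additional measurement.

```python
import math

hw_seconds = 2e-6   # hardware execution time, from the chart
sw_seconds = 38.8   # software simulation time, from the chart

# Ratio of software simulation time to hardware execution time.
speedup = sw_seconds / hw_seconds
orders_of_magnitude = math.log10(speedup)

if __name__ == "__main__":
    print(f"speedup ~ {speedup:.3g}x "
          f"({orders_of_magnitude:.1f} orders of magnitude)")
```

The ratio comes out at roughly 2 x 10^7, a little over seven orders of magnitude.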

HDL/RTL Simulation

Varying levels of accuracy

Access to arbitrary internal signals

But simulation speed is even slower

Parameterization/iteration is much harder

Native FPGA Execution

Runs at full speed of hardware

Three tools for on-FPGA testing:

Xilinx ChipScope Pro

System Generator HW-in-the-loop

Good old-fashioned signal probing

Xilinx ChipScope Pro

Inserts BRAM cores into design and binds to JTAG

Captures selected signals and provides trigger conditions

Signals of interest must be chosen in advance

Captured state is limited by available BRAM

Any changes require tool flow re-iteration

System Generator HW-in-the-loop

Allows hardware itself to accept and process data from Simulink via JTAG

Arbitrary number of data elements can be accessed as “ports”

Very powerful tool, but features limited process control

Hands-on Hardware Debugging

Most accurate method for finding timing-related bugs in a “production” system

Tradeoffs are all too well-known:

• Complex equipment
• Limited probing pins
• A priori signal output
• Limited input options

Drawback of On-FPGA Execution

Place and route time is a major bottleneck

Complete run is needed for every design change

Increasingly problematic due to larger FPGA capacity

[Figure: bar chart of tool run time in minutes for three design sizes]

                   PFB (3,805)   PFB x4 (15,301)   PFB x8 (30,601)
Synthesis          0.28333       1.26667           3.4
Place and Route    3.85          35.4              90.46667
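
The chart data can be reduced to two numbers per design: the total tool time and the fraction of it spent in place and route. The sketch below does only that, using the values plotted above; the helper names are illustrative.

```python
# Tool run times in minutes, copied from the chart data.
designs = ["PFB", "PFB x4", "PFB x8"]
synthesis_min = [0.28333, 1.26667, 3.4]
place_route_min = [3.85, 35.4, 90.46667]


def total_minutes(i: int) -> float:
    """Synthesis plus place-and-route time for design i."""
    return synthesis_min[i] + place_route_min[i]


def place_route_share(i: int) -> float:
    """Fraction of the total tool time spent in place and route."""
    return place_route_min[i] / total_minutes(i)


if __name__ == "__main__":
    for i, name in enumerate(designs):
        print(f"{name:<7} total {total_minutes(i):6.2f} min, "
              f"place and route {place_route_share(i):.0%}")
```

For all three designs, place and route accounts for more than 90% of the run.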

Proposed Solution

Enable extensive debugging and design exploration functionality directly on the hardware platform
• Vastly superior execution time for today’s large-scale computing challenges
• Exploit the spatial resources of the hardware to assist in debugging
• Essentially a -g switch to the hardware design flow
• Minimize or eliminate iterations through implementation flow

Caveats

Final timing of design will not be preserved
• Critical path will definitely be increased, but 10^6 is a lot of headroom
• Timing-driven implementation still needed once verification is complete

Significantly more FPGA capacity and memory will be needed
• Acceptable for scalable BEE-like platforms and for modular, tiled algorithms

Essential Features of Environment

1. Robustly parameterized library components with soft configuration

Design exploration without tool iterations

2. Readily accessible variable contents

Reading and writing of any values by user

3. Complete user-driven control over process execution

Single-step, bursts, breakpoints, assertions

1: Parameterized Library

Library components provide configuration parameters as inputs, which can be set by variables
• Allows runtime modification of function properties, including precision, range, and latency
• Enables design-space exploration at hardware speed, plus correction of configuration errors without re-implementation

• Number of bits

• Saturate / Wrap

• Binary point position

• Microarchitecture
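
As an illustration of soft configuration, the sketch below models a signed fixed-point cast whose word length, binary point, and overflow mode are ordinary runtime values rather than constants fixed at implementation time. It is a software model only; `QuantizerConfig` and `quantize` are hypothetical names, and latency and microarchitecture selection are not modeled.

```python
from dataclasses import dataclass


@dataclass
class QuantizerConfig:
    """Hypothetical soft-configuration inputs for one library block."""
    n_bits: int = 16          # total word length, including the sign bit
    binary_point: int = 8     # number of fractional bits
    saturate: bool = True     # True: saturate on overflow, False: wrap


def quantize(value: float, cfg: QuantizerConfig) -> float:
    """Model a signed fixed-point cast governed by runtime parameters."""
    scale = 1 << cfg.binary_point
    raw = round(value * scale)                 # integer code before overflow handling
    lo = -(1 << (cfg.n_bits - 1))              # most negative representable code
    hi = (1 << (cfg.n_bits - 1)) - 1           # most positive representable code
    if cfg.saturate:
        raw = max(lo, min(hi, raw))            # clamp to the representable range
    else:
        raw = (raw - lo) % (1 << cfg.n_bits) + lo   # two's-complement wrap-around
    return raw / scale


if __name__ == "__main__":
    cfg = QuantizerConfig(n_bits=8, binary_point=4, saturate=True)
    print(quantize(10.0, cfg))   # overflow handled by saturation
    cfg.saturate = False         # changed at "runtime", no rebuild needed
    print(quantize(10.0, cfg))   # same input, now wraps
```

Changing a field of the configuration object stands in for writing a new value to a configuration input of the hardware block, which is what lets a configuration error be corrected without another pass through the tool flow.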

2: Data Management

Ability to dynamically observe any variable’s value at the user’s request

Ability to overwrite a variable’s value at runtime and continue operation

Ability to rewind system state within the bounds of buffer capacity
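
The three abilities above can be pictured with a small software model of one observed variable backed by a bounded history buffer. This is a sketch under assumed names (`VariableHistory` and its methods are hypothetical), not a description of the actual hardware buffers.

```python
from collections import deque


class VariableHistory:
    """Hypothetical model of one observed variable with a bounded history."""

    def __init__(self, capacity: int):
        # Once the buffer is full, the oldest sample is discarded on each append.
        self._samples = deque(maxlen=capacity)

    def record(self, value: int) -> None:
        """Store the variable's value for the current cycle."""
        self._samples.append(value)

    def view(self) -> int:
        """Observe the most recent value."""
        return self._samples[-1]

    def set(self, value: int) -> None:
        """Overwrite the most recent value; execution can then continue."""
        self._samples[-1] = value

    def rewind(self, n_cycles: int) -> int:
        """Discard the last n_cycles samples and return the restored value."""
        if n_cycles >= len(self._samples):
            raise ValueError("rewind exceeds buffered history")
        for _ in range(n_cycles):
            self._samples.pop()
        return self._samples[-1]


if __name__ == "__main__":
    history = VariableHistory(capacity=4)
    for cycle_value in range(10):
        history.record(cycle_value)
    print(history.view())      # latest value
    print(history.rewind(2))   # value two cycles earlier
```

The `ValueError` branch corresponds to the "within the bounds of buffer capacity" restriction: history older than the buffer depth is simply gone.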

2: Data Management Requirements

Too expensive to re-implement the hardware to expose new data
• All variables are streamed into local and off-chip storage, such as DRAM and disks

Unlike software, hardware is highly parallel, and often deeply pipelined
• Memory requirements could be extreme
• Can be offset by hierarchical memory architecture and/or periodic sampling

3: Process Control

Inherit the most useful features of software debuggers like GDB
• Cycle-by-cycle (single-step) execution
• Breakpoints (either state dependent, or fixed cycle count)

Implemented using multiple clock domains and clock buffer control
• Already available for use on BEE2
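
The process-control features listed above can be mimicked in software by treating each call to a step function as one enabled edge of the user clock. The sketch below is such a model; `ClockGate`, `counter`, and the method names are hypothetical and stand in for the clock buffer control, not for any actual BEE2 interface.

```python
from typing import Callable, Dict, List

State = Dict[str, int]


class ClockGate:
    """Hypothetical software model of a gated user clock."""

    def __init__(self, step: Callable[[State], None], state: State):
        self.step = step                      # advances the design by one cycle
        self.state = state
        self.cycle = 0
        self.breakpoints: List[Callable[[State], bool]] = []

    def single_step(self) -> None:
        """Enable the clock for exactly one cycle."""
        self.step(self.state)
        self.cycle += 1

    def run_for(self, n_cycles: int) -> None:
        """Fixed-cycle-count burst."""
        for _ in range(n_cycles):
            self.single_step()

    def cont(self, max_cycles: int = 1_000_000) -> bool:
        """Run until a state-dependent breakpoint fires; True if one did."""
        for _ in range(max_cycles):
            self.single_step()
            if any(bp(self.state) for bp in self.breakpoints):
                return True
        return False


def counter(state: State) -> None:
    """Toy design: an 8-bit counter."""
    state["count"] = (state["count"] + 1) % 256


if __name__ == "__main__":
    gate = ClockGate(counter, {"count": 0})
    gate.run_for(10)
    gate.breakpoints.append(lambda s: s["count"] == 42)
    gate.cont()
    print(gate.cycle, gate.state["count"])
```

In hardware the breakpoint condition would be evaluated by inserted logic in the same cycle; here it is checked after each step, which gives the same stopping point for this toy design.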

Walkthrough: Design

Use specialized libraries to provide soft configuration

Integrates directly into the existing BEE2 tool flow

Walkthrough: Tagging

User tags signals of interest with debugging testpoints
• Defines a variable name
• Defines other parameters of interest for data observation

Also includes breakpoints and assertions

Walkthrough: Stitching

“Stitcher” updates the design before entering back-end tool flow
• Inserts logic as needed for debug functions

Instantiates PowerPC core and master controller

Adds underlying connections to route data

Walkthrough: Runtime

User can monitor variables and control process execution from remote client
• Embedded PowerPC software provides a thin service layer

Client is fully integrated with Matlab and Simulink input description

Control Architecture on BEE2

[Figure: block diagram with labels Control FPGA, User FPGA, PPC, User Design, Inserted Logic, Clock Buffer Logic, Network, and DRAM; signals Control, Breakpoint interrupt, and Single-step; clock domains 100MHz and User Defined (~1-10MHz)]

Stitching

Stitcher traverses the design hierarchy and:
• Replaces debugging component placeholders with necessary logic
• Creates a simple route from all variables to off-chip storage devices

During execution, the stitcher records:
• A mapping between variable names and their physical variable unit in hardware
• The latency within the variable routing network
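
One plausible use of the two recorded items is sketched below: a lookup table from variable name to hardware unit and routing latency, with the latency used to work out the cycle in which a stored sample was actually produced. The record layout, the variable names, and the alignment step are all assumptions made for illustration; the slides do not specify them.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class VariableRecord:
    """Hypothetical entry written by the stitcher for one tagged variable."""
    unit_id: int          # index of the physical variable unit in hardware
    route_latency: int    # cycles between capture and arrival at storage


def source_cycle(table: Dict[str, VariableRecord], name: str,
                 arrival_cycle: int) -> int:
    """Cycle in which a sample was produced, given when it reached storage."""
    return arrival_cycle - table[name].route_latency


if __name__ == "__main__":
    # Variable names below are made up for the example.
    table = {
        "fft/stage1/out": VariableRecord(unit_id=0, route_latency=3),
        "fft/stage2/out": VariableRecord(unit_id=1, route_latency=5),
    }
    # Samples that arrive at different times can belong to the same cycle.
    print(source_cycle(table, "fft/stage1/out", arrival_cycle=103))
    print(source_cycle(table, "fft/stage2/out", arrival_cycle=105))
```

Without the latency entry, samples from differently routed variables could not be lined up on a common cycle axis when the user views them side by side.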

Variable Control Unit (VCU)

Inserted in place of each variable block in design

Automatically implied for every state variable in a state machine

Combination of local buffers and off-chip DRAM
• Exact memory allocation is subject to experimentation

Debug Controller (DC)

Interface between all variable and assertion instances, the runtime user shell, and process control “services”

Regulates the system clock both for exceptions and to prevent variable storage overflows
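
The overflow-prevention role can be illustrated with a toy model in which variable storage fills by a fixed number of samples per user clock cycle and drains at a fixed rate, and the user clock is withheld whenever another cycle would overflow the buffer. All parameters and the function name are hypothetical; the real controller's policy is not described in the slides.

```python
def ticks_to_execute(cycles_wanted: int, buffer_capacity: int,
                     samples_per_cycle: int, drain_per_tick: int) -> int:
    """Return how many controller ticks are needed to run cycles_wanted
    user clock cycles without ever overflowing the variable buffer."""
    fill = 0        # samples currently waiting in the buffer
    executed = 0    # user clock cycles completed so far
    ticks = 0       # controller ticks elapsed
    while executed < cycles_wanted:
        ticks += 1
        fill = max(0, fill - drain_per_tick)        # storage drains every tick
        if fill + samples_per_cycle <= buffer_capacity:
            fill += samples_per_cycle               # user clock enabled this tick
            executed += 1
        # otherwise the user clock is gated off for this tick
    return ticks


if __name__ == "__main__":
    # Drain keeps up with production: no stalls.
    print(ticks_to_execute(10, buffer_capacity=8, samples_per_cycle=1, drain_per_tick=1))
    # Production outpaces drain: the clock is throttled.
    print(ticks_to_execute(10, buffer_capacity=8, samples_per_cycle=4, drain_per_tick=1))
```

When production outpaces the drain rate, the design still runs correctly but at a reduced effective clock rate, which is the trade-off that periodic sampling or a deeper memory hierarchy would relax.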

Runtime Shell Examples

load     Initialize or reset a design
halt     Stop the design as soon as possible
runfor   Run the design for a number of cycles
cont     Run the design until the next exception
break    View, enable, or change a breakpoint
view     View a variable’s value or history
set      Override a variable’s value or source
rewind   Rewind the system state by n cycles
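
A client for such a shell needs at least a command table and a line parser. The sketch below shows one minimal way to do that with the command names listed above. The argument syntax (whitespace-separated tokens after the command name) and the example variable name are assumptions, since the slides list only the commands and their purpose.

```python
import shlex
from typing import List, Tuple

# Command names and descriptions as listed above.
COMMANDS = {
    "load": "Initialize or reset a design",
    "halt": "Stop the design as soon as possible",
    "runfor": "Run the design for a number of cycles",
    "cont": "Run the design until the next exception",
    "break": "View, enable, or change a breakpoint",
    "view": "View a variable's value or history",
    "set": "Override a variable's value or source",
    "rewind": "Rewind the system state by n cycles",
}


def parse_command(line: str) -> Tuple[str, List[str]]:
    """Split a shell line into (command, arguments); reject unknown commands."""
    tokens = shlex.split(line)
    if not tokens:
        raise ValueError("empty command line")
    name, args = tokens[0], tokens[1:]
    if name not in COMMANDS:
        raise ValueError(f"unknown command: {name!r}")
    return name, args


if __name__ == "__main__":
    # "fft_out" is a made-up variable name for the example.
    for line in ("load", "runfor 1000", "view fft_out", "rewind 16"):
        name, args = parse_command(line)
        print(f"{name:<7} {args}  ({COMMANDS[name]})")
```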

Future Work

Complete infrastructure for BEE2

Extensive experiments with variable memory
• Efficient methods for variable routing
• Storage requirements and hierarchy
• Time/space tradeoffs for periodic sampling

Generalize framework to define concepts such as variable priorities, multiple debug levels, and extensions to text-based languages

Questions?