
RAMP Stats and Monitoring

Derek Chiou, Bill Reinhart, Nikhil Patil
with Krste Asanovic and Joel Emer

1/15/2009


Goals/Requirements

• Provide functionality equivalent to software-based simulators at RAMP speeds
  – Full observability
  – Monitoring for events
    • Triggers for breakpoints, dumping state, etc.
  – Trace (lossy and lossless)
  – Aggregate statistics
• Baseline functionality automatically included
• Resource efficient
• Flexible
• Dynamic and static configurability
• Integrated with other infrastructure (component interfaces)


At Least Three Levels of Debug/Monitoring/Stats

• Platform/Unmodel level
  – Bringing up BEE3/ACP system independent of RAMP code
  – May be strange bugs that get exercised with RAMP usage model
• Simulator (Model) level
  – Simulator may model target incorrectly
  – Monitor simulator bandwidth requirements
    • Could be very different from target machine (e.g., cache of target cache)
• Target level
  – The target machine may have been implemented correctly, but the target design itself is incorrect
  – Stats/tracing of working target
• We focus on the simulator (model) and target levels, but hopefully some of this will be useful at the platform level as well


Statistics/Monitoring Philosophy

• Instrument simulator communication (e.g., RAMP channels)
  – Communication mechanisms are logically connected to command network
  – Can export/examine/change anything being communicated
    • No need to add additional code if that is sufficient
  – Turn off to save resources when possible
• Introduce additional communication to export where communication does not already exist
  – Use standard simulator communication (channel) interfaces
    • Automatically provides target timing information
    • Connected to null end-point that logically dumps
      – Pipe to /dev/null
    • Potentially have non-timed interface, but need time reference point
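The idea of tapping an existing channel, and of wiring a tap to a null end-point, can be sketched in a few lines of Python. This is only an illustration under assumed names: `Channel`, `tap`, and `null_sink` are hypothetical and do not come from the RAMP code base.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional, Tuple


def null_sink(target_time: int, value: Any) -> None:
    """Null end-point: accept the message and drop it (like a pipe to /dev/null)."""
    return None


@dataclass
class Channel:
    """Hypothetical timed channel between two simulator modules with an optional tap."""
    name: str
    tap: Optional[Callable[[int, Any], None]] = None  # None means monitoring is off
    queue: List[Tuple[int, Any]] = field(default_factory=list)

    def send(self, target_time: int, value: Any) -> None:
        # Normal simulator traffic is queued for the receiving module.
        self.queue.append((target_time, value))
        # If a tap is attached, export a copy stamped with target time.
        if self.tap is not None:
            self.tap(target_time, value)

    def recv(self) -> Tuple[int, Any]:
        return self.queue.pop(0)


trace: List[Tuple[int, Any]] = []
fetch_to_decode = Channel("F->D", tap=lambda t, v: trace.append((t, v)))
fetch_to_decode.send(10, "add r1, r2, r3")   # exported to the trace
fetch_to_decode.tap = null_sink              # tap stays wired, data is discarded
fetch_to_decode.send(11, "ld r4, 0(r1)")
fetch_to_decode.tap = None                   # monitoring turned off entirely
fetch_to_decode.send(12, "nop")
```

Because the tap sees exactly what the channel carries, the exported records get target timing for free, and the simulator modules themselves need no extra code.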

Bill Reinhart, Nikhil A Patil


Simple Example

[Figure: five pipeline stages (F, D, E, M, W) with two compressors and a State block]


Required Support

• Endpoint support
• Channel support
• Transport (network)
• Naming


User vs Simulator Initiated

• Precise User-Initiated
  – Function call to read/write value at specific target time
  – Can be implemented through timed channels
    • Commands live in target time
  – Can be handled logically as a compressor
    • Discard data unless there is a command
  – How far ahead in target time should pull command be issued?
    • Too close impacts performance but enables precise control
    • Too far makes reacting to an event difficult
• Imprecise User-Initiated
  – Issue a read of state, perform whenever, report back target time
• Simulator-Initiated
  – Dump everything, filter later
  – Can be slow if there is limited bandwidth, storage, filtering
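A precise user-initiated read handled "as a compressor" can be sketched as follows: every target cycle produces a sample, and the sample is discarded unless a read command for that target time has already arrived. The function name `precise_reads` and the `(arrival_time, read_time)` command format are assumptions made for this illustration only.

```python
from typing import Callable, List, Set, Tuple


def precise_reads(
    num_cycles: int,
    value_at: Callable[[int], int],
    commands: List[Tuple[int, int]],
) -> Tuple[List[Tuple[int, int]], List[int]]:
    """Simulate `num_cycles` target cycles of a monitored value.

    Each command is (arrival_time, read_time): it reaches the monitor point at
    target time `arrival_time` and asks for the value at target time `read_time`.
    Returns (exported samples, read times that arrived too late).
    """
    pending: Set[int] = set()
    exported: List[Tuple[int, int]] = []
    missed: List[int] = []
    by_arrival = sorted(commands)
    next_cmd = 0
    for now in range(num_cycles):
        # Accept every command that has arrived by this target cycle.
        while next_cmd < len(by_arrival) and by_arrival[next_cmd][0] <= now:
            _, read_time = by_arrival[next_cmd]
            if read_time < now:
                missed.append(read_time)   # issued too close: the cycle already passed
            else:
                pending.add(read_time)
            next_cmd += 1
        # Compressor behaviour: export only when a command asks for this cycle.
        if now in pending:
            pending.remove(now)
            exported.append((now, value_at(now)))
    return exported, missed


# A counter that grows by 3 per target cycle; the third command arrives too late.
exported, missed = precise_reads(
    num_cycles=10,
    value_at=lambda t: 3 * t,
    commands=[(0, 2), (5, 7), (8, 6)],
)
```

The late command shows the trade-off on the slide: issuing a command close to its read time risks missing the cycle, while issuing it far ahead makes it harder to react to an event that was just observed.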


Required Support: Endpoint

– Provide state connected to command network
  • Same interface as a register, drop-in replacement
  • Stats counters, monitor points, control points, etc.
– Provide default compressors/filters
  • Output every n cycles
  • Output on rollover
  • Output toggled on signal
  • Etc.
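As a rough sketch of an endpoint with the default compressors listed above, the following Python models a narrow stats counter that exports on rollover, optionally every n cycles, and only while an enable signal is set. The class `StatsCounter`, its parameters, and the record layout are hypothetical, chosen only to make the behaviour concrete.

```python
from typing import Callable, List, Optional, Tuple

Record = Tuple[str, int, str, int]  # (name, target_time, reason, value)


class StatsCounter:
    """Hypothetical drop-in counter that reports over the command network."""

    def __init__(self, name: str, width_bits: int,
                 export: Callable[[Record], None],
                 every_n: Optional[int] = None) -> None:
        self.name = name
        self.modulus = 1 << width_bits
        self.value = 0
        self.export = export
        self.every_n = every_n
        self.enabled = True   # models "output toggled on signal"

    def tick(self, target_time: int, increment: int = 1) -> None:
        """Advance one target cycle, adding `increment` events (less than modulus)."""
        total = self.value + increment
        rolled_over = total >= self.modulus
        self.value = total % self.modulus
        if not self.enabled:
            return  # note: rollovers that happen while disabled are not reported
        if rolled_over:
            self.export((self.name, target_time, "rollover", self.modulus))
        if self.every_n is not None and target_time % self.every_n == 0:
            self.export((self.name, target_time, "periodic", self.value))


log: List[Record] = []
hits = StatsCounter("P1.iCache.num_hits", width_bits=2, export=log.append)
for t in range(1, 11):        # ten target cycles, one hit per cycle
    hits.tick(t)

# The host reconstructs the full count from rollover records plus the residue.
total_hits = sum(r[3] for r in log if r[2] == "rollover") + hits.value
```

Exporting only on rollover keeps the on-chip counter small while still letting the host recover the exact total, which is one way to read the "resource efficient" goal.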


Required Support: Channel

• Optional connection to control network
• Use internal buffering to look back in time
  – Channels implemented as circular buffer in BRAM
    • Far more storage than needed (in general)
  – Can look back in time
  – Can save bandwidth by only exporting when needed

[Figure: circular buffer with head and tail pointers]
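The look-back idea can be illustrated with a small circular buffer in which consumed entries are not erased, so they remain readable until overwritten. `HistoryChannel` and `look_back` are hypothetical names; a Python list stands in for the BRAM.

```python
from typing import Any, List, Optional, Tuple

Message = Tuple[int, Any]  # (target_time, value)


class HistoryChannel:
    """Hypothetical channel backed by a circular buffer that keeps old entries."""

    def __init__(self, depth: int) -> None:
        self.depth = depth
        self.slots: List[Optional[Message]] = [None] * depth
        self.head = 0  # total messages written so far
        self.tail = 0  # total messages consumed so far

    def send(self, target_time: int, value: Any) -> None:
        if self.head - self.tail >= self.depth:
            raise BufferError("channel full")
        self.slots[self.head % self.depth] = (target_time, value)
        self.head += 1

    def recv(self) -> Message:
        if self.tail == self.head:
            raise BufferError("channel empty")
        message = self.slots[self.tail % self.depth]
        self.tail += 1           # the slot is not cleared, so it can be read later
        return message

    def look_back(self, count: int) -> List[Message]:
        """Return up to `count` most recent messages, oldest first,
        including consumed ones that have not been overwritten yet."""
        oldest = max(0, self.head - self.depth)
        start = max(oldest, self.head - count)
        return [self.slots[i % self.depth] for i in range(start, self.head)]


channel = HistoryChannel(depth=4)
for t in range(6):
    channel.send(t, t * t)
    channel.recv()               # the receiver keeps up, so the channel never fills

recent = channel.look_back(3)
everything_left = channel.look_back(10)
```

Nothing is exported during normal operation; history is read out only when a trigger asks for it, which is where the bandwidth saving comes from.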


Required Support: Transport

• Transport
  – To units: commands, configuration, state changes, etc.
  – From units: extract target/host state, statistics, etc.
• Could be virtual channel(s) on common physical network
• Lossy network?
  – Lossless for now, support lossy at endpoint
• QoS?
• A ring or a ring of rings for simplicity
• Ordered network simpler
  – Helps reconstruction of data outside
  – But could result in less efficiency


Required Support: Naming/Tagging

• Naming of source of data
  – Command
    • read P1.iCache.num_hits stats register translated to actual register
  – Returned data/Trace entry
    • Needs to be tagged to indicate data
• Each stats entry also includes at least
  – Target time
  – Potentially platform/host time for platform/simulator-level debugging
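A minimal sketch of the naming and tagging scheme: a symbolic command is translated to a register address, and each returned entry carries a tag plus target time (and optionally host time). The addresses, the `REGISTER_MAP` table, and the `StatsEntry` layout are invented for illustration; the slides do not specify them.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Hypothetical name-to-register table; the addresses are made up.
REGISTER_MAP: Dict[str, int] = {
    "P1.iCache.num_hits": 0x0104,
    "P1.iCache.num_misses": 0x0108,
}
TAG_TO_NAME: Dict[int, str] = {addr: name for name, addr in REGISTER_MAP.items()}


@dataclass(frozen=True)
class StatsEntry:
    tag: int                         # identifies which register the data came from
    target_time: int                 # always present
    value: int
    host_time: Optional[int] = None  # only for platform/simulator-level debugging


def translate(command: str) -> Tuple[str, int]:
    """Turn 'read P1.iCache.num_hits' into ('read', register address)."""
    operation, name = command.split()
    return operation, REGISTER_MAP[name]


def describe(entry: StatsEntry) -> str:
    """Map a tagged entry back to its symbolic name for display."""
    return f"{TAG_TO_NAME[entry.tag]} = {entry.value} @ target cycle {entry.target_time}"


operation, address = translate("read P1.iCache.num_hits")
line = describe(StatsEntry(tag=address, target_time=5000, value=42))
```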


FPGA Debug

Hari Angepat, Chris Craik and Derek Chiou
Electrical and Computer Engineering

University of Texas at Austin


Introduction

• FPGA simulators offer order-of-magnitude speedups
  – However, they can suffer from the traditional hardware issues of limited visibility and debugging challenges
• RAMP simulators face additional complexity due to scalability requirements that may prevent instrumenting every signal in the simulator


Challenge

• How to bring software-level debugging visibility to RAMP simulators without dramatically increasing resources or affecting timing closure?


• Revisit the idea of FPGA state readback in combination with GDB-style debug interfaces


Our Technique

• 1) Leverage FPGA readback mechanism to exploit as much free visibility as possible
  – FPGA frame readback exists in V2Pro, V4, V5
  – Can sample flip-flop state dynamically
  – Can sample BRAM/LUT (notes on this later)
  – Can use JTAG hardware for latency-tolerant, low-resource physical link


Our Technique

• 2) Provide a GDB interface that can debug both a software process and an FPGA fabric simultaneously
  – Can display FPGA netlist symbols alongside software symbols
  – Can allow for hybrid CPU/FPGA platform debugging (i.e., x86-FSB-FPGA)


FPGADBG Toolflow

[Figure: FPGADBG toolflow. Software side: Software Sources (C/C++/…), Compiler with Debug Flags (-g -Ox), Binary Executable, Symbol Table, ASCII Disassembly, Software Debugger (GDB). Hardware side: Hardware Sources (Verilog/VHDL/…), Synthesis with Hierarchy Name Preservation Constraints, FPGA Implementation (Map, PAR Netlist), Logic Allocation, FPGA Bitstream.]

FPGADBG: interactive extension that enables non-intrusive debugging of software running on FPGA (GDB-Py)


Architecture

• Designed as a set of C/Python libraries
  – GDB Interface (plugin)
  – Netlist Frontend (parsing, mapping)
  – FPGA Backend (board comm, readback)
  – Hardware library (step control, ICAP readback)
• GDB frontend allows connecting to software-based portions of a simulator
• Assumes design-level support for step
  – Allows design to ensure consistent state before sampling


Architecture

[Figure: FPGADBG architecture on the HW/SW Simulation Platform. FPGA Fabric: User Logic, Domain Step Control, Readback Engine (ICAP), IO Logic (Transport Layer). Target software: Target Application, Target OS, Target Virtual Machine. Debugger side: GDB, GDB Plugin Bindings (Python), FPGADBG Core (Python), FPGA Chip Comm (C), FPGA Readback (C), Netlist Parser (Python).]


Netlist Parsing

Bit  6597758 0x005e0200   5758 Block=SLICE_X88Y18 Latch=XQ Net=dout(3)
Bit  6597838 0x005e0200   5838 Block=SLICE_X88Y16 Latch=XQ Net=dout(1)
Bit  6604350 0x005e0400   5758 Block=SLICE_X88Y18 Latch=YQ Net=dout(2)
Bit  6604430 0x005e0400   5838 Block=SLICE_X88Y16 Latch=YQ Net=dout(0)

inst "regOut(1)" "SLICE",placed R72C45 SLICE_X88Y16  ,
  cfg " BXINV::BX BXOUTUSED::#OFF BYINV::BY BYINVOUTUSED::#OFF BYOUTUSED::#OFF ... DXMUX::0 DYMUX::0 F::#OFF F5USED::#OFF FFX:myREG/dout_1:#FF FFX_INIT_ATTR::INIT0 FFX_SR_ATTR::SRLOW FFY:myREG/dout_0:#FF FFY_INIT_ATTR::INIT0 FFY_SR_ATTR::SRLOW    ... ";
inst "regOut(3)" "SLICE",placed R71C45 SLICE_X88Y18  ,
  cfg " BXINV::BX BXOUTUSED::#OFF BYINV::BY BYINVOUTUSED::#OFF BYOUTUSED::#OFF ... DXMUX::0 DYMUX::0 F::#OFF F5USED::#OFF FFX:myREG/dout_3:#FF FFX_INIT_ATTR::INIT0 FFX_SR_ATTR::SRLOW FFY:myREG/dout_2:#FF FFY_INIT_ATTR::INIT0 FFY_SR_ATTR::SRLOW ... ";

[Figure: design hierarchy with Top containing myREG; signals dout and regOut]
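To make the first step of netlist parsing concrete, here is a small Python sketch that parses the four `Bit ...` lines shown on this slide into records. The field names (`bit_offset`, `frame_address`, `frame_offset`) are an interpretation of the column layout rather than something the slide states, and `parse_bit_lines` is a hypothetical helper, not part of FPGADBG.

```python
import re
from typing import Dict, List, NamedTuple


class BitLocation(NamedTuple):
    bit_offset: int      # first numeric column (assumed: offset in the bitstream)
    frame_address: int   # hexadecimal column (assumed: configuration frame address)
    frame_offset: int    # third numeric column (assumed: offset within the frame)
    block: str
    latch: str
    net: str


BIT_LINE = re.compile(
    r"Bit\s+(\d+)\s+(0x[0-9a-fA-F]+)\s+(\d+)\s+Block=(\S+)\s+Latch=(\S+)\s+Net=(\S+)"
)


def parse_bit_lines(text: str) -> List[BitLocation]:
    """Parse every 'Bit ...' line in `text`; lines that do not match are skipped."""
    locations = []
    for line in text.splitlines():
        match = BIT_LINE.match(line.strip())
        if match is None:
            continue
        offset, frame, frame_off, block, latch, net = match.groups()
        locations.append(
            BitLocation(int(offset), int(frame, 16), int(frame_off), block, latch, net)
        )
    return locations


SAMPLE = """\
Bit  6597758 0x005e0200   5758 Block=SLICE_X88Y18 Latch=XQ Net=dout(3)
Bit  6597838 0x005e0200   5838 Block=SLICE_X88Y16 Latch=XQ Net=dout(1)
Bit  6604350 0x005e0400   5758 Block=SLICE_X88Y18 Latch=YQ Net=dout(2)
Bit  6604430 0x005e0400   5838 Block=SLICE_X88Y16 Latch=YQ Net=dout(0)
"""

locations = parse_bit_lines(SAMPLE)
by_net: Dict[str, BitLocation] = {loc.net: loc for loc in locations}
```

With a table like `by_net`, a debugger frontend can tell which frame to read back and which bit inside it holds a given net.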


Netlist Parsing

[Figure: netlist parsing pipeline. Stages and data shown: Physical Netlist, Alias Detection, Vector Merger, Hierarchy Construction, Frame Address Mapping, Symbolic Netlist, FPGA Cmd Generator, Readback Cmd Parser, Bitstream Reorder, Readback Bitstream, FPGA Board Communication.]

• FPGA toolflow introduces optimizations and naming issues
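Two of the stages named on this slide, vector merging and hierarchy construction, can be sketched as follows. The input is assumed to be a map from flattened flip-flop names of the form `myREG/dout_3` (as in the FFX/FFY entries of the netlist example) to sampled bit values; `merge_vectors` and `build_hierarchy` are hypothetical helpers, and the sampled values are invented.

```python
import re
from collections import defaultdict
from typing import Any, Dict

INDEXED = re.compile(r"^(.*)_(\d+)$")


def merge_vectors(bit_values: Dict[str, int]) -> Dict[str, int]:
    """Merge single-bit signals such as 'myREG/dout_3' into one integer 'myREG/dout'.

    Caveat: a genuine scalar whose name happens to end in '_<digits>' would be
    treated as a vector bit; a real tool needs more information to tell them apart.
    """
    vectors: Dict[str, int] = defaultdict(int)
    for name, bit in bit_values.items():
        match = INDEXED.match(name)
        if match:
            base, index = match.group(1), int(match.group(2))
            vectors[base] |= (bit & 1) << index
        else:
            vectors[name] = bit
    return dict(vectors)


def build_hierarchy(signals: Dict[str, int]) -> Dict[str, Any]:
    """Turn 'scope/scope/leaf' paths into nested dictionaries."""
    root: Dict[str, Any] = {}
    for path, value in signals.items():
        *scopes, leaf = path.split("/")
        node = root
        for scope in scopes:
            node = node.setdefault(scope, {})
        node[leaf] = value
    return root


# Invented sample: bits read back for the four flip-flops of myREG/dout.
sampled = {"myREG/dout_3": 1, "myREG/dout_2": 0, "myREG/dout_1": 1, "myREG/dout_0": 1}
merged = merge_vectors(sampled)
tree = build_hierarchy(merged)
```

The caveat in `merge_vectors` is one small instance of the naming issues the toolflow introduces.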


Limitations

• Hardware readback has limitations:
  – RAMs require offline readback due to resource contention issues
  – FPGA frames span large vertical stripes, potentially restricting visibility if some logic cannot be disabled during sampling
  – Hierarchy must be preserved during synthesis to ensure understandable net names
  – Step control requires design-level support


Status & Future Work

• Current prototype implements board communication with the XUP Virtex2Pro30 using JTAG-based frame readback
• Frontend netlist parser supports hierarchical node generation, bit vector merging, and some handling of aliased signals
• Full GDB shell expected to be released in Q1-2009 with support for Virtex5{110/330}
