Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary...

42
1 Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

Transcript of Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary...

Page 1: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

1

Abstracting the Hardware /Software Boundary through a

Standard System SupportLayer and Architecture

Erik Anderson4 May 2007

Page 2: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

2

Agenda

• Publications and proposal review.• Problem background.• Abstracting HW/SW boundary.• Analysis, comparing HW and SW.• Results• Conclusions

Page 3: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

3

Special Thanks to

• David Andrews• Perry Alexander• Douglass Niehaus• Ron Sass• Yang Zhang

• Jason Agron• Fabrice Baijot• Ed Komp• Andy Schmidt• Jim Stevens• Wesley Peck• Seth Warn

Page 4: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

4

Academic Background

• Bachelor of Science in ComputerScience, University of Kentucky, 1993 -1997, Magna Cum Laude

• Doctoral Candidate in ElectricalEngineering, Kansas University, 2003 -2007

Page 5: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

5

Publications• “Enabling a Uniform Programming Model Across the

Software/Hardware Boundary,” FCCM 2006• “Supporting High Level Language Semantics within Hardware Resident

Threads,” submitted to FPL 2007• “Memory Hierarchy for MCSoPC Multithreaded Systems,” ERSA 2007• “Achieving Programming Model Abstractions for Reconfigurable

Computing,” Transactions on VLSI, to appear in 2007• “Hthreads: A Computational Model for Reconfigurable Devices,” FPL

2006• “The Case for High Level Programming Models for Reconfigurable

Computing,” ERSA 2006• “Run-time Services for Hybrid CPU/FPGA Systems on Chip,” RTSS

2006

Page 6: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

6

Proposal Review• Augment the HWTI

• Extend support for keysubset of Hthread API.

• Semantic andimplementation differences.

• Hthread test suite.

• Application suite.

• Augmented HWTI with:– User interface and protocol.– Globally distributed local

memory.– Function call stack.

• Extended support for keysubset of Hthread API.– Remote procedural calls.

• Chapter 5 in dissertation– Context similarities.

• Hthread test suite.– Abstractions held.

• Application suite.– Framework for HLL to HDL.

Proposed Completed

Page 7: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

7

History of ReconfigurableComputing

• 1959: Gerald Estrin’sFixed plus VariableArchitecture.

• 1984: Xilinx is founded.• 1993: Athana proposed

PRISM-I• 2006: First 65nm FPGA

released.

IEEE Annals of History of Computing, Oct -Dec 2002, page 522FPGA 2006

30FPL 2006

25FCCM 2006

PapersConference

Page 8: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

8

Reconfigurable ComputingTechnology

• Post-fabricationcircuit design.

• Embedded cores,memory, andmultipliers.

From xilinx.com

Page 9: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

9

“Blessing and a Curse”

• FPGA’s can take onany computationalmodel post-fabrication.

• But which one touse?

MISDSISD

MIMDSIMD

Instruction StreamD

ata

Stre

am

Flynn’s Taxonomy

Page 10: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

10

Hardware Acceleration Model• SISD or SIMD.• Advantages:

– Can be successful.– C to HDL tools.

• Disadvantages:– Custom interfaces between

HW and SW.• Write once, run once.

– Costly design-spaceexploration.

– Does not use today’s MIMDprogramming models.

Page 11: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

11

Abstract Interfaces• Parallel programming

models to abstract CPU/ FPGA interface.

• CPU and FPGA bothtarget an equivalentabstract interface.

• OS / Middleware layerprovidescommunication andsynchronizationmechanism.

Page 12: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

12

Thesis Statement

Programming model and high level languageconstructs can be used to abstract the

existing hardware/software boundary thatcurrently exists between CPU and FPGA

components.

Page 13: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

13

Extending the Shared Memory Multi-Threaded Model to Hardware

Conceptual Reality

•Pthreads programming

Page 14: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

14

Extending the Shared Memory Multi-Threaded Model to Hardware

• Key Challenges– HW access to API

library.– HW access to

application data.– Eliminate custom

interface to HW.

Page 15: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

15

Extending the Shared Memory Multi-Threaded Model to Hardware

• Hthread’s Solutions– Access to the same

communicationmedium.

– Equal or equivalentsynchronizationservices migrated toHW.

– Standard systemsupport layer. Hybridthreads System

Page 16: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

16

Hardware Thread Interface

• HWTI provides a standard register set forcommunication and synchronization services.

Page 17: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

17

Creating a MeaningfulAbstraction

• Communication and synchronizationare solved.

• Problems persist:– “scratchpad” memory:

• How to instantiate?• How to maintain the shared memory model?

– System versus user function calls?– Creating threads from hardware?

Page 18: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

18

Globally Distributed LocalMemory

• Dual ported BRAM.• “Globally Distributed” = All threads have access.• “Local” = User logic access is through LOAD and

STORE protocols.

Page 19: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

19

Function Call Stack

• Abstract access to localmemory.

• Consistent function callmodel.– Recursion.

• Works analogously tosoftware based stack.– Only difference, user

logic pushes “returnstate” value instead of“return instruction.”

7RETURN

3CALL

1ADDRESSOF

1WRITE

3READ

1DECLARE

5POP

1PUSH

Clock CyclesOperation

Page 20: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

20

Remote Procedural Calls

• Some functions tooexpensive toimplement in HWTI.

• Utilize existingsynchronizationprimitives to calloutto a special softwaresystem thread toperform function.

Page 21: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

21

Hardware / Software Duality

Page 22: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

22

Hthread System CallImplementation Differences

• HW has dedicated resources allocatedat synthesis time.– HW explicitly blocks.

• SW has shared resources allocated atruntime.– SW context switches.

Page 23: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

23

Hthread Size andPerformance Comparison

• Size– Definition– hthread_create /

hthread_join– hthread_yield

• Performance– Definition– HW outperforms SW

• Create/join notableexception

– Hardware’s bustransactions

Page 24: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

24

Demonstrating an AbstractInterface

• POSIX Test-suite adapted for Hthreads.• Conformance tests

– Version for SW, HW, and mixed.• Stress tests

– Version for SW, HW, and mixed.• Abstractions held across SW/HW.

Page 25: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

25

Demonstrating HLLConstructs

Sharing data between HW and SWHuffman

Task level parallelism, localvariables.

IDEA

Local array access, access toshared memory.

Haar DWT

Recursion, local variables.Factorial

Recursion, local variables, accessto shared memory.

QuicksortDemonstratesAlgorithm

Page 26: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

26

Function Call Stacks andRecursion

• Quicksort– Recursive– O(nlogn)

performance

Page 27: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

27

Memory Latency

• IDEA encryption– Key and data

locationcomparison.

Page 28: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

28

Task Level Parallelism

• Haar DWT– Software’s pseudo-

concurrency.– Hardware’s true

concurrency.• Performance

– 2 SW = 31.1ms– 1HW/SW = 16.5ms– 2 HW = 16.6ms

Page 29: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

29

Future Work

• Memory latency for hardware threads.• Leveraging reconfigurable computing.• High level language to hardware

descriptive language translation.

Page 30: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

30

Conclusions

• Parallel programming models may beused to abstract CPU/FPGA boundary.– Threads communicate and synchronize

with other threads without regard tolocation.

• Abstract virtual machine can beimplemented in either HW or SW.– Created a framework for HLL to HDL.

Page 31: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

31

Questions?

Page 32: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

32

Supplemental Material

Page 33: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

33

Function Call Stack Example

Page 34: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

34

Globally Distributed LocalMemory

Page 35: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

35

Dynamic Memory Allocation

• Pre-allocated Heap.• Light version of

malloc, calloc, andfree.

Page 36: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

36

Demonstration HLLConstructs: Quicksort

• HWTI maintainsO(nlogn) behavior.

• Cache-likeperformance.

Page 37: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

37

Demonstration HLLConstructs: IDEA

• Benefits of task level parallelism.• Comparison with Vuletic’s hardware threads.

Page 38: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

38

Demonstration HLLApplicability: Huffman

• Abstract datapassing betweenSW and HWthreads.

• Data cache on CPU.

Page 39: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

39

Demonstration HLLApplicability: Haar DWT

• Abstract interface vsmeaningful abstractinterface.

• Performance.• Complexity.

Page 40: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

40

Globally Distributed LocalMemory

• “Cache like” performance.• Maintains shared memory

model.• User access without bus

transactions.HWTILocalGlobalOperation

19128Store

19351Load

Page 41: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

41

Join Danger

Page 42: Abstracting the Hardware / Software Boundary … Abstracting the Hardware / Software Boundary through a Standard System Support Layer and Architecture Erik Anderson 4 May 2007

42

Remote Procedural Calls

• Advantages:– HW Access to

shared libraries.– Complete support for

hthread APIs.• Disadvantages:

– Interrupts the CPU.– Comparatively slow. 114µsstrcmp (string.h)

450µscos (math.h)1.66msprintf (stdio.h)120µsfree (stdlib.h)122µsmalloc (stdlib.h)130µshthread_join (hthread.h)160µshthread_create (hthread.h)

ExecutionLibrary Call