3rd 3DDRESD: OSyRIS


Page 1: 3rd 3DDRESD: OSyRIS

POLITECNICO DI MILANO

Operating System Support for Core Management in a Dynamic Reconfigurable Environment

Dynamic Reconfigurability in Embedded Systems Design

Ivan Beretta: [email protected]

Page 2: 3rd 3DDRESD: OSyRIS

Outline

Context definition

Aims

Related works

Proposed Methodology

Validation results

Concluding remarks

Page 4: 3rd 3DDRESD: OSyRIS

Reconfigurable devices

Reconfigurable devices can be re-programmed multiple times during their lifetime

High reusability

Dynamic reconfiguration

Field Programmable Gate Arrays (FPGAs)

Large amount of computational resources

System-on-Chip

Software multitasking

Page 5: 3rd 3DDRESD: OSyRIS

Some Definitions

Configuration Code: the executable active physical (either hardware or software) implementation of a given functionality

E.g. Bitstream and partial bitstreams for hardware implementations on FPGAs

Relocation: the ability to move a configuration code from one location to another

Reconfiguration Controller: the element that is responsible for the physical implementation of the reconfiguration process

E.g. The Internal Configuration Access Port (ICAP) on Xilinx FPGAs

Core: a representation of a functionality

E.g. It is possible to have a core described in VHDL, in C or in an intermediate representation (e.g. a DFG)

IP-Core: a core described using a HDL combined with its communication infrastructure (i.e. the bus interface)

It can be configured on the FPGA as a part of the whole system

Page 6: 3rd 3DDRESD: OSyRIS

Classification of reconfigurable architectures (1 of 2)

Static vs Dynamic

Can an FPGA be reconfigured during its execution?

Internal vs External

Is a reconfiguration controller accessible within the FPGA?

Partial vs Total

Is the whole FPGA area being reconfigured?

Module based vs Difference based

Does the reconfiguration process involve a small number of logic gates or an entire IP-Core?

Page 7: 3rd 3DDRESD: OSyRIS

Classification of reconfigurable architectures (2 of 2)

Mono-dimensional vs Bi-dimensional

Is there any limitation on the IP-Core shape?

Reference architecture

Dynamic, internal, partial, module based and mono-dimensional

[Figure: FPGA floorplans with a static area and a reconfigurable area; labels: GPP, Memory, Ethernet, UART, ImageFilter, CryptoCore, IP-Core, Configured IP-Core, Reconfigured Area]

Page 8: 3rd 3DDRESD: OSyRIS

Reconfiguration challenges

Reconfiguration times heavily impact the latency of the final solution

Possible solutions:

Maximize the reuse of configured modules

Reconfiguration hiding

Alternative implementation (software execution)

Software has a huge role in the reconfiguration process

User applications request functionalities and IP-Cores

Software manages the requests and optimizes the overall latency

Page 9: 3rd 3DDRESD: OSyRIS

Why do applications ask for IP-Cores?

Software applications are executed on a General Purpose Processor (GPP) on the FPGA

They may require hardware acceleration

90-10 rule

90% of the execution time is spent in 10% of the code

Inner loops in algorithms

Computationally intensive code

10% of the execution time is spent in 90% of the code

Exceptions

User interaction

The 10% computationally intensive code has to be executed as hardware on reconfigurable devices

The 90% exception code is run as executable files on GPPs
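As a rough worked example of why the 90-10 rule motivates hardware acceleration, the sketch below applies Amdahl's law, which the slides do not name explicitly. The function name and the sample speedup factors are illustrative assumptions, not figures from the presentation.

```python
def overall_speedup(accelerated_fraction: float, hardware_speedup: float) -> float:
    """Amdahl's law: speedup of the whole run when only a fraction is accelerated.

    accelerated_fraction: share of the execution time moved to hardware (0..1).
    hardware_speedup: how many times faster that share runs as an IP-Core
                      (hypothetical value, not measured in the slides).
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be in [0, 1]")
    if hardware_speedup <= 0.0:
        raise ValueError("hardware_speedup must be positive")
    remaining = 1.0 - accelerated_fraction
    return 1.0 / (remaining + accelerated_fraction / hardware_speedup)


if __name__ == "__main__":
    # 90% of the time is spent in the hot code; try a few hardware speedups.
    for factor in (2.0, 10.0, 100.0):
        print(factor, round(overall_speedup(0.9, factor), 2))
```

Under this model the software-only 10% of the execution time bounds the overall gain at 10x, however fast the IP-Core is.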

Page 10: 3rd 3DDRESD: OSyRIS

Software applications and reconfiguration management

Who manages the reconfiguration process?

Standalone vs Operating System solutions

Standalone solutions directly interact with the reconfigurable device

Deal with the specific hardware and the specific task

Lack of portability and reusability

Operating systems take care of the reconfiguration issues

Increased portability of user applications

Inherited multitasking capabilities

Page 12: 3rd 3DDRESD: OSyRIS

Aims

Problem statement

Definition of an operating system aimed at efficiently exploiting the potential of dynamically reconfigurable architectures.

Aims

[A1] Definition of a portable solution for the partial reconfiguration management, based on a widely used kernel such as Linux

[A2] Implementation of a hardware-independent interface for software applications

[A3] Efficient management of FPGA resources within the operating system

Page 14: 3rd 3DDRESD: OSyRIS

Related works

Two main classes of OSs for FPGAs

Specific operating systems to handle dynamic reconfiguration

Extensions of GNU/Linux kernels

Implementations for multiple architectures and applications

Single and multi FPGA systems

Real-time systems

Area management and hardware preemption explored outside the scope of an OS

Page 15: 3rd 3DDRESD: OSyRIS

Specific OSs for dynamic reconfiguration

Tightly coupled with the specific hardware [1][3]

Advanced runtime functionalities

Lack of portability on different architectures

Lack of a user interface

ReConfigME [2] explicitly deals with user interfaces

Still a custom interface

Not suitable for software development

[Figure: flow diagram for allocating a new hardware IP-Core; stages: Free Area Detection, Partitioning, Placement, Routing, each with Passed/Fail outcomes; other labels: Blocked State, Removed task, Successfully allocated IP-Core]

[1] Wigley and Kearney: The management of applications for reconfigurable computing using an operating system. In Seventh Asia-Pacific Computer Systems Architectures Conference (ACSAC2002), eds. F. Lai and J. Morris, Melbourne, Australia, 2002. ACS.

[2] Wigley et al.: ReConfigME: a detailed implementation of an operating system for reconfigurable computing. Parallel and Distributed Processing Symposium, 2006. IPDPS 2006. 20th International, April 2006.

[3] Steiger et al.: Operating systems for reconfigurable embedded platforms: online scheduling of real-time tasks. Transactions on Computers, 53(11):1393–1407, November 2004.

Page 16: 3rd 3DDRESD: OSyRIS

Operating systems based on Linux

Dynamic reconfiguration support within the Linux kernel

Standard and well-known programming interface

Access to the reconfiguration controller [4][5][7]

Software applications can configure their own bitstreams

Interaction with configured IP-Cores

Only a “bridge” between user applications and the FPGA

Transparency between hardware and software implementations in BORPH [6]

No efficient FPGA exploitation

[4] Donato et al.: Operating system support for dynamically reconfigurable SoC architectures. SOC Conference, 2005. Proceedings. IEEE International, pages 233–238, September 2005.

[5] Williams and Bergmann: Embedded linux as a platform for dynamically self-reconfiguring systems-on-chip. In Proceedings of the International Conference on Engineering of Reconfigurable Systems and Algorithms, ed. T. P. Plaks, pages 163–169. CSREA Press, June 2004.

[6] So and Brodersen: Improving usability of FPGA-based reconfigurable computers through operating system support. Field Programmable Logic and Applications, 2006. FPL ’06. International Conference on, pages 1–6, 2006.

[7] Donato, A., Ferrandi, F., Redaelli, M., Santambrogio, M. D., and Sciuto, D.: Exploiting partial dynamic reconfiguration for SOC design of complex application on FPGA platform. Ricardo Augusto da Luz Reis, Adam Osseiran, Hans-Jorg Pfleiderer (Eds.): VLSI-SoC: From Systems to Silicon, Springer 2007, pages 87–109.

Page 17: 3rd 3DDRESD: OSyRIS

Outline

Context definition

Aims

Related works

Proposed Methodology

The Caronte flow

Reconfiguration support

Linux Reconfiguration Manager

Implementation of the LRM

Validation results

Concluding remarks

Page 18: 3rd 3DDRESD: OSyRIS

The Caronte flow

[Figure: the Caronte flow. Legend: HW = Hardware, RHW = Reconfigurable HW, SW = Software]

Page 19: 3rd 3DDRESD: OSyRIS

The software side of the Caronte flow

[Figure: software side of the Caronte flow; labels: .c Application, SW-Side Design, GCC Frontend, Configuration Set Identification, GCC Backend, elf, Software Integration, Reconfiguration Libraries, Linux OS, SW]

Standalone solution

Specific reconfiguration libraries

Integration in the final bitstream

No multitasking

Page 20: 3rd 3DDRESD: OSyRIS

The software side of the Caronte flow

Standalone solution

Specific reconfiguration libraries

Integration in the final bitstream

No multitasking

Linux support

High-level reconfiguration libraries

Support for multitasking and other OS services

Only a bootloader is integrated in the bitstream

Device driver development for specific hardware

[Figure: software side of the Caronte flow; labels: .c Application, SW-Side Design, GCC Frontend, Configuration Set Identification, GCC Backend, elf, Software Integration, Reconfiguration Libraries, Linux OS, SW]

Page 22: 3rd 3DDRESD: OSyRIS

Reconfiguration Support

Linux kernels do not natively support dynamic reconfiguration

No interaction with the reconfiguration controller

Dynamic addition of hardware modules is supported...

... but area management or module caching are not!

The kernel must be extended in order to perform a set of basic operations

Module configuration upon request

Module release/removal

General assumption: the partial bitstream is provided as part of the module request

Page 23: 3rd 3DDRESD: OSyRIS

IP-Core devices access

Interaction with configured IP-Cores implemented by means of the standard Linux device access

open, close, read, write, ioctl operations

[Figure: userspace software applications 1 and 2 access the device nodes /dev/device_1, /dev/device_2a and /dev/device_2b; the device drivers device_1.o and device_2.o connect them to IP-Core_1, IP-Core_2a and IP-Core_2b on the FPGA. Caption: multiple instances of the same hardware module]

Page 24: 3rd 3DDRESD: OSyRIS

Implementation of reconfiguration support

Implementation by means of two kernel modules and a high-level library

[Figure: software stack; kernel; kernel modules: Reconfiguration Controller Driver, MAC, other kernel modules; library: Reconfiguration Library; userspace: software applications]

Pros: the OS can handle the entire reconfiguration process

Cons: inefficient exploitation of FPGA resources

Each application has a limited knowledge of the FPGA status

User applications are aware of the underlying reconfiguration

Page 26: 3rd 3DDRESD: OSyRIS

Reconfigurable Process

Reconfigurable process: a configuration code in execution

Each reconfigurable process is represented in the system by a Reconfigurable Process Control Block (RPCB).

An RPCB contains all the information associated with a specific reconfigurable process

State: the state the reconfigurable process is in at the current time

Configuration Code Accounting Information:

Configuration Code pointer

Configuration Priority

Resources

Position

[Figure: state diagram of a reconfigurable process; labels: Ready, Positioning, Addressing Space Assignment, Configuring, Configured, Cached, Computing, Executing, Waiting, Preempted, Removed]
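The RPCB fields listed on this slide can be pictured as a small record type. The sketch below is only an illustration: the field types, the default values and the choice of which diagram labels count as states are assumptions, since the slides do not give the actual data layout.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class RPState(Enum):
    """A subset of the labels in the state diagram, treated here as states."""
    READY = auto()
    CONFIGURING = auto()
    CONFIGURED = auto()
    EXECUTING = auto()
    WAITING = auto()
    PREEMPTED = auto()
    CACHED = auto()
    REMOVED = auto()


@dataclass
class RPCB:
    """Hypothetical Reconfigurable Process Control Block."""
    configuration_code: bytes        # stands in for the configuration code pointer
    priority: int                    # configuration priority
    resources: int                   # e.g. number of reconfigurable slots needed
    position: Optional[int] = None   # placement on the FPGA, unknown until positioned
    state: RPState = RPState.READY


if __name__ == "__main__":
    rpcb = RPCB(configuration_code=b"\x00\x01", priority=1, resources=2)
    rpcb.position = 0
    rpcb.state = RPState.CONFIGURING
    print(rpcb)
```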

Page 27: 3rd 3DDRESD: OSyRIS

The Centralized Manager

Userspace applications are not allowed to explicitly request a bitstream

They request high-level functionalities

Userspace requests are collected and served by a centralized manager (Linux Reconfiguration Manager)

The OS chooses the configuration code

A new reconfigurable process is created

Only the LRM can ask for a bitstream to be configured on the FPGA

Centralized knowledge of the device status

Area management and module caching

Page 28: 3rd 3DDRESD: OSyRIS

A further level of abstraction

[Figure: two versions of the software stack (kernel; kernel modules: Reconfiguration Controller Driver, MAC, other kernel modules; library: Reconfiguration Library; userspace: software applications). One version adds a centralized manager layer, the Linux Reconfiguration Manager (LRM)]

Page 30: 3rd 3DDRESD: OSyRIS

Implementation of the LRM

Page 31: 3rd 3DDRESD: OSyRIS

Selection of hardware/software implementations

Multiple implementations of the same functionality

Hardware implementations with different timing performances or area constraints

Software implementations

Selection among the implementations

Expected number of iterations

Reconfiguration overhead

Processor and FPGA usage

The reconfigurable process abstraction hides the chosen implementation from user applications
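One way to picture the selection among implementations is a simple cost comparison. The sketch below is an assumed model, not the LRM's actual policy: it only weighs the expected number of iterations against the reconfiguration overhead, and leaves processor and FPGA usage out. All names and timing values are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Implementation:
    name: str
    time_per_iteration: float              # seconds, assumed known from profiling
    reconfiguration_overhead: float = 0.0  # seconds; 0 for software or cached modules


def estimated_time(impl: Implementation, iterations: int) -> float:
    """Total time: one-off reconfiguration plus the iterations themselves."""
    return impl.reconfiguration_overhead + iterations * impl.time_per_iteration


def select_implementation(candidates: Sequence[Implementation],
                          iterations: int) -> Implementation:
    """Pick the candidate with the lowest estimated total time."""
    if not candidates:
        raise ValueError("no implementation available")
    return min(candidates, key=lambda impl: estimated_time(impl, iterations))


if __name__ == "__main__":
    software = Implementation("software", time_per_iteration=1e-3)
    hardware = Implementation("hardware", time_per_iteration=1e-4,
                              reconfiguration_overhead=0.05)
    for n in (10, 1000):
        print(n, select_implementation([software, hardware], n).name)
```

With these sample numbers the software version wins for short runs, and the hardware version wins once the iterations amortize the reconfiguration overhead.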

Page 32: 3rd 3DDRESD: OSyRIS

Module caching

What happens when a hardware module is released?

Hard removal

Soft removal

Caching

A cached module can be reused without the reconfiguration overhead

Modules are cached according to the Reconfigurable Process Caching Value (RPC)

Stored in the RPCB

Updated when a module is reused or removed
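The caching idea can be sketched as a small table of released modules ranked by a caching value. The update rule is not given in the slides, so the one used here (increment on reuse, evict the lowest value when space runs out) is purely an assumption, as are the class and method names.

```python
from typing import Dict, Optional


class ModuleCache:
    """Keeps released modules configured until their area is needed."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.caching_value: Dict[str, int] = {}  # module name -> caching value

    def release(self, name: str) -> Optional[str]:
        """Cache a released module; return the evicted module name, if any."""
        evicted = None
        if name not in self.caching_value and len(self.caching_value) >= self.capacity:
            # Assumed policy: drop the module with the lowest caching value.
            evicted = min(self.caching_value, key=self.caching_value.get)
            del self.caching_value[evicted]
        self.caching_value.setdefault(name, 0)
        return evicted

    def request(self, name: str) -> bool:
        """Return True on a cache hit, i.e. no reconfiguration is needed."""
        if name in self.caching_value:
            self.caching_value[name] += 1  # assumed update on reuse
            return True
        return False


if __name__ == "__main__":
    cache = ModuleCache(capacity=2)
    cache.release("aes")
    cache.release("des")
    print(cache.request("aes"))   # hit: the module is still configured
    print(cache.release("fir"))   # evicts the least valuable cached module
```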

Page 33: 3rd 3DDRESD: OSyRIS

Preemptive module allocation

Allocation policy decides where a module needs to be placed on the FPGA

Reconfigurable area may be fragmented

An IP-Core cannot be placed even if enough area is available

A module cannot be split into different slots

Preemption of active IP-Cores and defragmentation

[Figure: three FPGA floorplans, each with the static part and a different arrangement of IP-Cores 1 to 4]

Page 34: 3rd 3DDRESD: OSyRIS

Preemptive allocation rules

Fragmentation can be prevented by keeping the free area clustered

If more than one location is available, we need policies to choose among them (e.g. the smallest one is chosen)

If the area is fragmented, defragmentation is performed

E.g. in a mono-dimensional environment, all the IP-Cores are pushed to the left side

If a module has to be relocated, it is preempted

The context must be read and restored
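The rules above can be sketched for a mono-dimensional area modeled as a row of equal slots. This is a simplified illustration under stated assumptions: the class name, the slot model and the trigger for defragmentation are hypothetical, and saving and restoring the context of relocated cores is only indicated by a comment.

```python
from typing import List, Optional, Tuple


class SlotAllocator:
    """Mono-dimensional reconfigurable area as a row of equal-width slots."""

    def __init__(self, total_slots: int) -> None:
        self.slots: List[Optional[str]] = [None] * total_slots

    def free_regions(self) -> List[Tuple[int, int]]:
        """Return (start, length) for each maximal run of free slots."""
        regions, start = [], None
        for i, owner in enumerate(self.slots + ["<end>"]):  # sentinel closes last run
            if owner is None and start is None:
                start = i
            elif owner is not None and start is not None:
                regions.append((start, i - start))
                start = None
        return regions

    def defragment(self) -> List[str]:
        """Push every core to the left; return the cores that had to move."""
        def first_index(layout: List[Optional[str]]) -> dict:
            seen = {}
            for i, owner in enumerate(layout):
                if owner is not None and owner not in seen:
                    seen[owner] = i
            return seen

        before = first_index(self.slots)
        packed = [owner for owner in self.slots if owner is not None]
        self.slots = packed + [None] * (len(self.slots) - len(packed))
        after = first_index(self.slots)
        # A moved core is preempted: its context would be read back and restored.
        return [name for name in before if before[name] != after[name]]

    def allocate(self, name: str, width: int) -> bool:
        """Place a core in the smallest free region that fits it."""
        fits = [r for r in self.free_regions() if r[1] >= width]
        if not fits and self.slots.count(None) >= width:
            self.defragment()  # enough area overall, but fragmented
            fits = [r for r in self.free_regions() if r[1] >= width]
        if not fits:
            return False
        start, _ = min(fits, key=lambda region: region[1])
        self.slots[start:start + width] = [name] * width
        return True

    def release(self, name: str) -> None:
        self.slots = [None if owner == name else owner for owner in self.slots]
```

Choosing the smallest fitting region keeps larger free regions available for wider IP-Cores, which is one way to keep the free area clustered.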

[Figure: two FPGA floorplans with the static part; the first shows IP-Cores 1 to 4, the second IP-Cores 1 to 3]

Page 36: 3rd 3DDRESD: OSyRIS

Validation Results

Experimental results for the proposed OS

Testing of the reconfiguration support

Two case studies to validate the centralized approach

Simulation results for the preemptive allocation policy

Testing framework

Linux kernel: μCLinux

ELDK and PetaLinux distributions

Two FPGAs: Xilinx Virtex-II Pro xc2vp7 and xc2vp30

Internal Configuration Access Port (ICAP)

Two processors: PowerPC and MicroBlaze

Different approaches to memory management

Page 37: 3rd 3DDRESD: OSyRIS

Dynamic reconfiguration support (1 of 2)

PowerPC architecture on Virtex-II Pro vp7

Introduction of a device driver for the ICAP device

Dynamic reconfiguration accessed from the shell

# cat bitstream_name.bit > /dev/icap

# ioctl /dev/icap c 2
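The `cat` command above streams a partial bitstream into the ICAP device node. A userspace program could do the same with ordinary file I/O, as in this minimal sketch; the function name and chunk size are assumptions, and the driver-specific `ioctl` step is intentionally left out because its semantics are not described in the slides.

```python
def configure_bitstream(bitstream_path: str,
                        device_path: str = "/dev/icap",
                        chunk_size: int = 4096) -> int:
    """Copy a partial bitstream to the configuration device.

    Returns the number of bytes handed to the device. Any path that accepts
    writes can be used as device_path, which helps when trying the sketch
    without FPGA hardware.
    """
    written = 0
    with open(bitstream_path, "rb") as source, open(device_path, "wb") as device:
        while True:
            chunk = source.read(chunk_size)
            if not chunk:
                break
            device.write(chunk)  # buffered writer: the whole chunk is accepted
            written += len(chunk)
    return written
```

For example, `configure_bitstream("bitstream_name.bit")` mirrors the shell command shown on the slide.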

[Figure: PowerPC-based architecture; labels: PowerPC 405, Processor Local Bus (PLB), SDRAM Memory Controller, ICAP Controller, PLB-to-OPB Bridge, On-Chip Peripheral Bus (OPB), Ethernet Controller, UART Controller, Finite State Machine, Bus Macro, Static Area, Reconfigurable Area, I/O pins, and UART, SDRAM, Ethernet, I/O and control signals]

Page 38: 3rd 3DDRESD: OSyRIS

Dynamic reconfiguration support (2 of 2)

Throughput enhancement of ~2x compared to [7]

[7] Donato, A., Ferrandi, F., Redaelli, M., Santambrogio, M. D., and Sciuto, D.: Exploiting partial dynamic reconfiguration for SOC design of complex application on FPGA platform. Ricardo Augusto da Luz Reis, Adam Osseiran, Hans-Jorg Pfleiderer (Eds.): VLSI-SoC: From Systems to Silicon, Springer 2007, pages 87–109.

Page 39: 3rd 3DDRESD: OSyRIS

First case study: simple logic application (1 of 2)

Test of the Linux Reconfiguration Manager

Selection between multiple implementations

PowerPC-based hardware architecture

Inversion of 1 to 8 bits in an 8-bit register

Software version writes 1 bit at a time

Hardware version reconfigures the entire register

Communication between clients and LRM based on UNIX sockets

Clients connect to the LRM and request the functionality

LRM generates a new process to serve the request

The new process computes and returns the result
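The client/LRM exchange of this case study can be sketched with a pair of connected UNIX sockets. Everything concrete here is an assumption for illustration: the two-byte request format, the function names, and the choice to invert the least-significant bits. A real LRM would also spawn a new process per request, which this sketch leaves to the caller.

```python
import socket
import struct


def invert_bits(value: int, count: int) -> int:
    """Invert the `count` least-significant bits of an 8-bit value."""
    if not 1 <= count <= 8:
        raise ValueError("count must be between 1 and 8")
    mask = (1 << count) - 1
    return (value ^ mask) & 0xFF


def recv_exact(conn: socket.socket, size: int) -> bytes:
    """Read exactly `size` bytes or fail if the peer closes early."""
    data = b""
    while len(data) < size:
        chunk = conn.recv(size - len(data))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        data += chunk
    return data


def handle_request(conn: socket.socket) -> None:
    """Serve one request: read (value, count), reply with the inverted value."""
    with conn:
        value, count = struct.unpack("BB", recv_exact(conn, 2))
        conn.sendall(struct.pack("B", invert_bits(value, count)))


def request_inversion(conn: socket.socket, value: int, count: int) -> int:
    """Client side: send the request and wait for the one-byte result."""
    conn.sendall(struct.pack("BB", value, count))
    return struct.unpack("B", recv_exact(conn, 1))[0]
```

The client only names the functionality and its arguments; whether the inversion runs in software or on a configured IP-Core stays hidden behind `handle_request`.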

Page 40: 3rd 3DDRESD: OSyRIS

First case study: simple logic application (2 of 2)

Page 41: 3rd 3DDRESD: OSyRIS

Second case study: cryptography application (1 of 2)

Two cryptographic algorithms available on the LRM

Data Encryption Standard (DES)

Advanced Encryption Standard (AES)

MicroBlaze architecture on Virtex-II Pro vp30

[Figure: MicroBlaze-based architecture; labels: MicroBlaze, Processor Local Bus (PLB), SDRAM Memory Controller, ICAP Controller, OPB-to-PLB Bridge, On-Chip Peripheral Bus (OPB), Ethernet Controller, UART Controller, two OPB-to-Wishbone Bridges, Bus Macros, Static Area, Reconfigurable Area with a First and a Second Reconfigurable Slot, each with a Wishbone Bus and an AES/DES Cipher]

Page 42: 3rd 3DDRESD: OSyRIS

Second case study: cryptography application (2 of 2)

Area occupation

Performances

[Figure labels: AES IP-Core, DES IP-Core, Static part]

Throughput w/ caching = 246 kB/s

Throughput w/ caching = 436 kB/s

Page 43: 3rd 3DDRESD: OSyRIS

Evaluation of IP-Core preemption (1 of 2)

Simulation results

Lack of a runtime relocation technique

When an IP-Core is preempted:

Its context must be stored

It must wait until the new location is available

It must be reconfigured on the device

Total delay:

$T_{backread} \propto A_{IP\text{-}Core}$

$T_{reconf} \propto A_{IP\text{-}Core}$

$T_{stall} = T_{backread} + T_{reconf} \propto 2 A_{IP\text{-}Core}$

Each IP-Core is associated with a timing constraint

If it cannot meet the constraint, it is discarded

Otherwise, it is accepted
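The acceptance test for a preempted IP-Core can be sketched numerically. The model below is an assumption: it takes the readback and reconfiguration times to grow linearly with the IP-Core area, adds an optional waiting time for the new location, and compares the total against the timing constraint. The per-unit constants and function names are hypothetical.

```python
def stall_time(area: float,
               backread_time_per_unit: float,
               reconf_time_per_unit: float,
               wait_time: float = 0.0) -> float:
    """Delay suffered by a preempted IP-Core.

    area: size of the IP-Core (e.g. in slots or frames).
    backread_time_per_unit / reconf_time_per_unit: assumed linear cost factors.
    wait_time: time spent waiting for the new location to become available.
    """
    backread = backread_time_per_unit * area  # store the context
    reconf = reconf_time_per_unit * area      # configure at the new location
    return backread + wait_time + reconf


def accept(execution_time: float, stall: float, deadline: float) -> bool:
    """Accept the IP-Core only if it still meets its timing constraint."""
    return execution_time + stall <= deadline


if __name__ == "__main__":
    delay = stall_time(area=100, backread_time_per_unit=0.5, reconf_time_per_unit=0.5)
    print(delay, accept(execution_time=50, stall=delay, deadline=200))
```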

Page 44: 3rd 3DDRESD: OSyRIS

Evaluation of IP-Core preemption (2 of 2)

[3] Steiger et al.: Operating systems for reconfigurable embedded platforms: online scheduling of real-time tasks. Transactions on Computers, 53(11):1393–1407, November 2004.

Page 46: 3rd 3DDRESD: OSyRIS

Concluding Remarks (1 of 2)

[A1] Definition of a portable solution for the partial reconfiguration management, based on a widely used kernel such as Linux

Portable on different processors

Portable on different distributions

[A2] Implementation of a hardware-independent interface for software applications

Hardware-specific issues handled by the LRM

Hardware/software transparency

Page 47: 3rd 3DDRESD: OSyRIS

Concluding Remarks (2 of 2)

[A3] Efficient management of FPGA resources within the operating system

Centralized control of device resources

Module caching

Area management and IP-Core preemption

Collaborations with National Chung Cheng University (Taiwan) and Heinz Nixdorf Institut (Germany)

Page 48: 3rd 3DDRESD: OSyRIS

Future works

Extension to bi-dimensional environments

More complex allocation policies

More complex defragmentation techniques

Integration with a runtime relocation tool

Bitstreams can be relocated anywhere on the FPGA

Implementation of task migration

From hardware to software executions and vice versa

Implementation-independent context saving

Page 49: 3rd 3DDRESD: OSyRIS

References

[1] Wigley, G. and Kearney, D.: The management of applications for reconfigurable computing using an operating system. In Seventh Asia-Pacific Computer Systems Architectures Conference (ACSAC2002), eds. F. Lai and J. Morris, Melbourne, Australia, 2002. ACS.

[2] Wigley, G., Kearney, D., and Jasiunas, M.: ReConfigME: a detailed implementation of an operating system for reconfigurable computing. Parallel and Distributed Processing Symposium, 2006. IPDPS 2006. 20th International, pages 8 pp.–, April 2006.

[3] Steiger, C., Walder, H., and Platzner, M.: Operating systems for reconfigurable embedded platforms: online scheduling of real-time tasks. Transactions on Computers, 53(11):1393–1407, November 2004.

[4] Donato et al.: Operating system support for dynamically reconfigurable SoC architectures. SOC Conference, 2005. Proceedings. IEEE International, pages 233–238, September 2005.

[5] Williams, J. and Bergmann, N.: Embedded linux as a platform for dynamically self-reconfiguring systems-on-chip. In Proceedings of the International Conference on Engineering of Reconfigurable Systems and Algorithms, ed. T. P. Plaks, pages 163–169. CSREA Press, June 2004.

[6] So, H. K.-H. and Brodersen, R. W.: Improving usability of FPGA-based reconfigurable computers through operating system support. Field Programmable Logic and Applications, 2006. FPL ’06. International Conference on, pages 1–6, August 2006.

[7] Donato, A., Ferrandi, F., Redaelli, M., Santambrogio, M. D., and Sciuto, D.: Exploiting partial dynamic reconfiguration for SOC design of complex application on FPGA platform. Ricardo Augusto da Luz Reis, Adam Osseiran, Hans-Jorg Pfleiderer (Eds.): VLSI-SoC: From Systems to Silicon, Springer 2007, pages 87–109.

Page 50: 3rd 3DDRESD: OSyRIS

General Information

Webpage

www.dresd.org/osyris

Mailing [email protected]

Contact

For more information regarding the OSyRiS project:

[email protected]

For a complete list of information on how to contact us:

www.dresd.org/contact_osyris

Page 51: 3rd 3DDRESD: OSyRIS

Questions?

Thanks for your attention