4-LOOP: 4-core LEON3 with Linux Operating system, OpenMP library and hardware Profiling system



[Figure: hardware architecture block diagram. Labels: Linux + OpenMP running on four LEON3 cores; sniffers S1, S2, S3; AHB Controller; JTAG Debug Link; Ethernet MAC; Memory Controller; SRAM; JTAG PHY; AHB/APB Bridge; AMBA AHB; AMBA APB; UART; UART-USB.]

[Figure: OpenMP fork-join execution model]

Non-parallel region: master thread only (ID:0).

Parallel region starts (fork): #pragma omp parallel

Parallel region: several threads (ID:0, ID:1, ID:2, ID:3) execute simultaneously.

Parallel region ends (join): the program waits for all threads to terminate.

The program reverts to single-threaded execution (ID:0).
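The fork-join sequence described by the figure can be sketched in C as follows. This is a minimal illustration and not code from the poster: the function name fork_join_demo and its requested_threads parameter are hypothetical, and the preprocessor guards only let the file compile when OpenMP is disabled.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Runs one fork-join cycle and returns how many threads took part.
 * fork_join_demo and requested_threads are hypothetical names. */
int fork_join_demo(int requested_threads)
{
    int team_size = 1;

    /* Non-parallel region: only the master thread (ID:0) runs here. */
    printf("before fork: master thread only\n");

    /* Fork: the runtime creates a team of threads. */
    #pragma omp parallel num_threads(requested_threads)
    {
        int id = 0;
#ifdef _OPENMP
        id = omp_get_thread_num();
        /* Exactly one thread records the team size. */
        #pragma omp single
        team_size = omp_get_num_threads();
#endif
        printf("parallel region: thread ID:%d\n", id);
    }   /* Join: implicit barrier, every thread finishes before this point. */

    /* Single-threaded execution resumes on the master thread. */
    printf("after join: single-threaded again\n");
    return team_size;
}
```

Calling fork_join_demo(4) from main on a four-core system would normally report a team of four threads, although the OpenMP runtime is allowed to provide fewer.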

[Figure: sniffer block diagram. Blocks: APB Interface, Decode Section, Event Monitor, Time Monitor, Counter, AHB Adapter; connected to the APB bus and the AHB bus.]

[Figure: LEON3 block diagram. Blocks: 7-stage integer pipeline, 3-port register file, IEEE-754 FPU, co-processor, HW MUL/DIV, trace buffer, debug port, interrupt port, I-cache, D-cache, SRMMU, ITLB, DTLB, local IRAM, local DRAM, AHB I/F, AMBA AHB master (32-bit).]

SYSTEM BEHAVIOUR

Performance of the platform is evaluated by means of a Pi calculation algorithm, implemented in four different versions: serial computation, the single program multiple data (SPMD) technique with false sharing, the SPMD technique without false sharing, and the OpenMP reduction clause.
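The four variants can be sketched as below. The poster does not give the algorithm, so this sketch assumes the midpoint-rule integration of 4/(1+x^2) over [0,1] that is common in OpenMP tutorials. All function names, the MAX_THREADS limit, and the PAD value (16 doubles, chosen to exceed any plausible cache line) are assumptions made for illustration.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

#define MAX_THREADS 4   /* one thread per LEON3 core (assumed mapping) */
#define PAD 16          /* assumed padding: 16 doubles between accumulators */

static double f(double x) { return 4.0 / (1.0 + x * x); }

/* Version 1: serial computation. */
double pi_serial(long steps)
{
    double step = 1.0 / (double)steps, sum = 0.0;
    for (long i = 0; i < steps; i++)
        sum += f((i + 0.5) * step);
    return sum * step;
}

/* SPMD kernel: thread `id` handles iterations id, id+n, id+2n, ...
 * and accumulates into sum[id * stride]. With stride 1 the accumulators
 * are adjacent in memory and can share a cache line (false sharing). */
static double pi_spmd(long steps, int stride)
{
    double sum[MAX_THREADS * PAD] = {0.0};
    double step = 1.0 / (double)steps, pi = 0.0;
    int nthreads = 1;

    #pragma omp parallel num_threads(MAX_THREADS)
    {
        int id = 0, n = 1;
#ifdef _OPENMP
        id = omp_get_thread_num();
        n = omp_get_num_threads();
#endif
        if (id == 0)
            nthreads = n;   /* the team may be smaller than requested */
        for (long i = id; i < steps; i += n)
            sum[id * stride] += f((i + 0.5) * step);
    }
    for (int t = 0; t < nthreads; t++)
        pi += sum[t * stride];
    return pi * step;
}

/* Version 2: SPMD with false sharing (adjacent accumulators). */
double pi_spmd_false_sharing(long steps) { return pi_spmd(steps, 1); }

/* Version 3: SPMD without false sharing (padded accumulators). */
double pi_spmd_padded(long steps) { return pi_spmd(steps, PAD); }

/* Version 4: OpenMP reduction clause. */
double pi_reduction(long steps)
{
    double step = 1.0 / (double)steps, sum = 0.0;
    #pragma omp parallel for reduction(+:sum) num_threads(MAX_THREADS)
    for (long i = 0; i < steps; i++)
        sum += f((i + 0.5) * step);
    return sum * step;
}
```

All four functions should return the same approximation of Pi up to floating-point rounding; only their memory access pattern on the shared bus differs, which is what a bus-level profiler can observe.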

The proposed profiling technique, used to monitor the computational behaviour of the 4-LOOP platform, follows a runtime bus-sampling approach.

[Plot legend: 1 thread, 2 threads, 3 threads, 4 threads]

Event monitor: generates a strobe (ld_ac_event) during an access to a specified address range (delimited by sig_out_inf and sig_out_sup).

Time monitor: a counter that is started by a read operation (during_read) and stopped by a write operation (during_write), both on a specified address (0x808).
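The behaviour of the two monitors can be approximated by a small software model. This is only a behavioural sketch, not the hardware description used on the platform: the struct and function names are hypothetical, the range bounds are assumed to be inclusive, and the counter is assumed to accumulate across measurements rather than reset on each read. The signal names and the address 0x808 come from the text above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One observed AHB transfer (hypothetical representation). */
typedef struct {
    uint32_t addr;
    bool is_write;
} bus_access_t;

/* Event monitor: counts accesses inside [sig_out_inf, sig_out_sup]. */
typedef struct {
    uint32_t sig_out_inf;   /* lower bound of the monitored range */
    uint32_t sig_out_sup;   /* upper bound (assumed inclusive) */
    uint32_t event_count;   /* number of ld_ac_event strobes seen */
} event_monitor_t;

/* Returns true when ld_ac_event would be strobed for this access. */
bool event_monitor_step(event_monitor_t *m, const bus_access_t *a)
{
    bool ld_ac_event = a->addr >= m->sig_out_inf && a->addr <= m->sig_out_sup;
    if (ld_ac_event)
        m->event_count++;
    return ld_ac_event;
}

#define TIME_MONITOR_ADDR 0x808u

/* Time monitor: counts bus cycles between a read and a write on 0x808. */
typedef struct {
    bool running;
    uint32_t cycles;
} time_monitor_t;

/* Call once per bus clock cycle; pass NULL when the bus is idle. */
void time_monitor_step(time_monitor_t *m, const bus_access_t *a)
{
    if (m->running)
        m->cycles++;
    if (a != NULL && a->addr == TIME_MONITOR_ADDR) {
        if (a->is_write)
            m->running = false;   /* during_write stops the counter */
        else
            m->running = true;    /* during_read starts the counter */
    }
}
```

In this model, an application would bracket a region of interest with a read and a write to 0x808, and the monitored address range would be set by writing sig_out_inf and sig_out_sup before the run.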

SYSTEM DESCRIPTION

LEON3

The LEON3 processor is designed for embedded applications, combining high performance with low complexity and low power consumption. It is also highly configurable.

HW PROFILING SYSTEM

A distributed hardware profiling system has been developed for runtime analysis. It is composed of distributed monitoring elements (sniffers) that observe the AHB bus and are initialized through the APB bus. A global monitor unit, implemented by one of the LEON3 processors, initializes the sniffers and collects their results.

LINUX

A Linux distribution, customized to work with the multicore platform in SMP mode, has been built with the Buildroot tool, starting from the LEON Linux kernel (provided by Gaisler Research).

OPENMP

The libraries required to implement shared-memory parallel applications with OpenMP C/C++ have been cross-compiled and added to the adopted Linux distribution.

HARDWARE ARCHITECTURE

THE PLATFORM

OVERVIEW

1) The opportunity to build multi-processor systems exploiting soft-cores is increasing the range of applications that can be implemented on FPGAs.

2) In order to maximize performance, a parallel programming model should be used: the OpenMP API is a specification for a set of compiler directives, library routines, and environment variables that can be used to specify high-level parallelism.

3) Runtime analysis on an SoC is useful to optimize reconfigurable systems. However, software profiling systems impose software overhead on application execution.
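The three parts of the OpenMP API named above (compiler directives, library routines, environment variables) can be seen together in a short sketch. The function name report_openmp_setup is hypothetical; OMP_NUM_THREADS, omp_get_max_threads, and the parallel directive with a reduction clause are standard OpenMP.

```c
#include <stdio.h>
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Prints the OpenMP configuration and returns the observed team size.
 * report_openmp_setup is a hypothetical helper name. */
int report_openmp_setup(void)
{
    /* Environment variable: read by the runtime at program start. */
    const char *env = getenv("OMP_NUM_THREADS");
    int max_threads = 1;
    int counted = 0;

#ifdef _OPENMP
    /* Library routine: upper bound on the size of the next team. */
    max_threads = omp_get_max_threads();
#endif

    /* Compiler directive: each thread in the team adds one to the sum. */
    #pragma omp parallel reduction(+:counted)
    counted += 1;

    printf("OMP_NUM_THREADS=%s, max threads=%d, team size=%d\n",
           env ? env : "(unset)", max_threads, counted);
    return counted;
}
```

Running the program with OMP_NUM_THREADS set to different values is a simple way to compare one, two, three, and four threads without recompiling.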

The proposed platform is composed of a working symmetric multi-processor (SMP) system based on four LEON3 cores, enhanced with a custom hardware profiling system that introduces no software overhead. An SMP Linux kernel targeting the proposed system, and including the device drivers needed to collect data from the custom hardware profilers, has also been built. The system has been further customized to support the execution of OpenMP-based applications.

PROPOSED PLATFORM

MOTIVATIONS

4-LOOP IS A PLATFORM DEVELOPED TO OFFER THE ADVANTAGES OF PARALLEL EXECUTION WHILE MONITORING RUNTIME SYSTEM BEHAVIOUR WITHOUT SOFTWARE OVERHEAD

4-LOOP: 4-CORE LEON3 WITH LINUX OPERATING SYSTEM, OPENMP LIBRARY AND HARDWARE PROFILING SYSTEM

G. Valente, V. Muttillo, L. Pomante, M. Faccio, F. Federici, A. Moro

Main Contacts: [email protected], [email protected], [email protected], [email protected]

UNIVERSITA’ degli STUDI dell’AQUILA - CENTER of EXCELLENCE DEWS (ITALY)

http://dews.univaq.it

Graphic Designed By: Tania Valentina Ferro