FP7-612069-HARPA Project
HARPA
Management of mixed criticality and reliability at run-time:
the HARPA approach
Thematic Session on Challenges in Mixed Criticality and Real-time and Reliability in
Networked Complex Embedded Systems
Barcelona, May 15, 2014
HiPEAC CSW
Prof. William Fornaciari
home.deib.polimi.it/fornacia
Outline
The project in a nutshell
• Objectives
• Outputs
• Exploitation
Organization of the activities and WP mapping
Mixed criticality support and run-time management
Some experimental results
How to get information, contact us
Introduction
The Challenge: dependable performance
• Critical for embedded applications: timing correctness
• Paramount for HPC: load balancing and fast execution
• To be considered with other figures of merit (mixed criticality)
The Vision: a synergic approach
• Exploit synergies between the ES and HPC domains
• Merging concepts, assessing key applications
The Goal: HARnessing Performance vAriability
• Cost-effectively confront variations in next-generation ES/HPC systems
• Dependable performance, slack identification, timing
Introduction
CONSORTIUM OVERVIEW
No. | Name | Country | Business activity / Expertise | Main role in project
1 | POLIMI | IT | University | Coordinator of the project. Development of the HARPA OS engine. WP1 and WP7 leader.
2 | IMEC | BE | Research and Technology | Providing link to advanced process technology reliability modeling. WP4 leader.
3 | ICCS | GR | University | Development of the HARPA run-time engine and WP2 leader. Dissemination activities (WP6) leader.
4 | UCY | CY | University | R&D on run-time Monitors, Knobs and Network-on-Chip. WP3 leader.
5 | IT4I | CZ | Research and Technology | Application of HARPA environment for HPC simulations. WP6 leader.
6 | THALES | FR | Industrial | Providing an industrial high-end embedded application which will serve as a use case for the HARPA runtime evaluation. WP5 leader.
7 | HEN | IT | SME | Providing an industrial application for low-end embedded systems which will serve as a use case for the HARPA runtime evaluation.
Introduction
SO1 - Shaving margins
• Adopt Razor-like concepts into different aspects of a system that are typically over-provisioned for the worst case
• Worst-Case Execution Time (WCET) for time predictability is an example of such over-provisioning in the embedded systems domain
• Over-provisioning also characterizes current design practices in the on-chip interconnect of HPC-oriented multi-core CPUs
SO2 - A more predictable system with real-time guarantees
• The different monitors, knobs, and the HARPA engine will make it possible to study the correlation between the different elements of the system
SO3 - Implementation of effective platform monitors/knobs
• The implemented monitors and knobs should be lightweight and should have no or negligible impact on the chip
• Cross-layer approach, whereby monitors and knobs throughout the system stack facilitate a comprehensive control strategy
Introduction
O1 - Performance-dependable multi-core architectures for ES and HPC
• Augment existing multi-core designs to guarantee performance dependability
• Proactive and reactive techniques derived from ES and HPC
O2 - Monitors/knobs in hardware designs
• Monitors will allow the identification of the main sources of performance unpredictability
• Knobs will allow the control of application execution, providing dependable performance
O3 - Monitors/knobs in software designs
• Track the resources that lead to at least 90% of the unpredictability
Introduction
O4 - System sw designs that support high performance dependability
• Provide high commitment in the SLAs in conjunction with the run-time systems
O5 - Run-time designs that support high performance dependability
• Develop run-time engine designs to provide high performance dependability guarantees
O6 - Methodologies for conflicting metrics
• Develop optimization methodologies at hardware level, exploiting models to keep HARPA-OS as architecture-independent as possible
• These methodologies follow high-level directives provided by the HARPA-OS level to trade off different metrics
Introduction
O7 - Develop sw/hw interfaces to provide fluent communication flow
• Develop interfaces between the different computing stack layers that allow each layer to obtain information in a reduced timeframe
O8 - New application guidelines to improve performance dependability
• Develop guidelines that will help improve the performance dependability guarantees (target 25% improvement)
O9 - Validate the results with industrial case studies
• Evaluation of the techniques proposed in the project will be performed on industrial applications provided by the partners THALES, Henesis, and IT4I
Introduction
ITO1 - System Architectural Design Principles
• Define a set of hardware and software design guidelines allowing heterogeneous multicore systems to provide dependable performance guarantees
• Performance guarantees facilitated by the HARPA engine through the use of monitors and knobs orchestrated by appropriate control policies
• The monitors and knobs operate on pertinent non-functional objectives, such as power, energy, timing, wear-out, etc.
• The proposed solution should be low-cost and should be applicable to both embedded systems and high-performance general-purpose environments
Introduction
ITO2 - Dependable Performance Guarantees
• Provide the implementation of the HARPA engine. The HARPA engine is the main outcome of this project
• Develop sufficiently generic software that can easily adapt to different types of hardware depending on the available monitors and knobs in the system
• At the end of this project, this outcome will be directly exploitable, once appropriately adapted to the existing hardware
Introduction
ITO3 - Demonstrators
• Develop case studies with applications representing different scenarios from both the embedded systems world and the HPC world
• These applications will validate the efficacy and efficiency of the various techniques and mechanisms derived from (and cross-fertilized with) both computing paradigms
• The HARPA project will test the HARPA engine on platforms representative of embedded systems and on a full-system evaluation environment simulating typical HPC setups
• The idea is to explore the capabilities of the HARPA engine with the monitors and knobs available in existing and future heterogeneous multi-core architectures
Concept Vehicles
HPC: Floreon+ (IT4I)
• Environmental risk modelling and simulation – risk management
High-end ES: Spectrum sensing (THALES)
• Explore the frequency spectrum to perform radio frequency allocation
Low-end ES: Beesper (HENESIS)
• Monitoring landslides based on WSN and cameras
Cross-domain video processing (POLIMI)
• Example: people identification/searching
• HPC: massively process multiple cameras/images
• Embedded: power-constrained processing
Technology scaling: Challenges
> 20 nm: as transistors aged…
< 20 nm: as transistors will age…
[Figure label: Accelerated test (few years)]
Time-dependent phenomena become prevalent -> dynamics of applications matters
Not to mention variability related to mixed workloads and data dependency on top of that: …but it is not a fault, it is a feature!
Filling the gap
Domain | State-of-the-art | Novelties to be introduced through HARPA
PV Modelling | Averaging out models with a single signal probability value for the entire system lifetime | Use of atomistic models accounting for time- and workload-dependent variability [Grasser12b]
PV Modelling | Highly accurate but CPU-intensive TCAD reliability models [Rodopoulos11] | Reconciliation of CPU-intensity with accuracy in the context of reliability simulations
Knobs & Monitors | Specific metric targeted in isolation | Holistic approach that combines and integrates multiple non-functional requirements
Knobs & Monitors | Non-functional metrics not exposed to end user | User and system establish performance dependability agreement, which is upheld throughout system lifetime
PV Mitigation | Built-in self-test and job scheduling to enhance MTTF [Feng08] | Detailed knob and monitor placement to enable runtime performance dependability
PV Mitigation | Time-zero variability compensation at the testing phase [Pineda12a, Pineda12b] | Time- and workload-dependent variability mitigation at runtime throughout system lifetime
In a nutshell…
User Requirements: Quality & Quality Cost
HARPA Operating System: ~1s responsiveness
HARPA Run Time Engine: ~1ms responsiveness
Monitors and Knobs: Cross-Layer Placement
Hardware System: EC or HPC Platform
Examples of Monitors/Knobs
• Timing violation / DVFS
• Bit-flip / ECC
• Power Consumption / DVFS
• Performance / Scheduling
• QoS / Resource allocation
WP-level organization
Proactively and reactively control
• Guarantee performance of applications on heterogeneous architecture
• Establish Service-Level Agreements (SLA) with the system
• Periodically monitor the system state
• Select and steer the appropriate knobs to provide the performance guarantees, against time-dependent variations
Notion of SLA different for ES and HPC
• ES: SLA primarily focuses on satisfying constraints
• HPC: minimization of deviations from nominal specifications
HARPA engine: split between
• The Operating System (HARPA-OS)
• The Run-Time system (HARPA-RT)
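The monitor-and-steer cycle described on this slide can be pictured with a minimal sketch. It is not HARPA code: the `Sla` fields, the `FREQ_LEVELS_MHZ` operating points, and the function names are all hypothetical, and the policy (step the DVFS knob up on a timing violation, step it down when there is ample slack) is only one simple reactive choice among many.

```python
from dataclasses import dataclass


@dataclass
class Sla:
    deadline_ms: float  # hypothetical timing constraint agreed with the system
    slack_ms: float     # hypothetical margin below the deadline that counts as "ample slack"


# Hypothetical DVFS operating points, lowest to highest frequency.
FREQ_LEVELS_MHZ = [400, 800, 1200, 1600]


def steer_dvfs(level: int, latency_ms: float, sla: Sla) -> int:
    """One reactive control step: return the next DVFS level index."""
    if latency_ms > sla.deadline_ms and level < len(FREQ_LEVELS_MHZ) - 1:
        return level + 1  # timing violation observed: raise the frequency
    if latency_ms < sla.deadline_ms - sla.slack_ms and level > 0:
        return level - 1  # ample slack observed: lower the frequency to save power
    return level          # within the agreed band (or knob saturated): no change


def run_loop(samples_ms, sla: Sla, level: int = 0):
    """Feed a sequence of monitor readings through the controller."""
    trace = []
    for latency in samples_ms:
        level = steer_dvfs(level, latency, sla)
        trace.append(FREQ_LEVELS_MHZ[level])
    return trace
```

For example, `run_loop([12, 11, 9, 5, 5, 5], Sla(deadline_ms=10.0, slack_ms=4.0), level=1)` raises the frequency while the deadline is missed and then steps it back down once latency drops well below the deadline.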
Impact
HARPA OS
• Already available as open source: BBQ open source project (http://bosp.dei.polimi.it, see demo videos)
• Implementation already running on x86 and ARM platforms
• Run-time management of GPGPU in progress
• Customizations for possible customers (it was ready for inclusion in the STHORM SDK platform)
HARPA RT
• Demonstrator of control loop implemented in firmware exploiting low level modeling information. 1 ms reaction time
• Possible porting on commercial platforms
Monitors and Knobs
• New set of Hw and Sw knobs/monitors
• Support for Hw and Sw adaptation
Modeling and PV mitigation
• Filing of a patent before the end of the project
• Possibility to deal with 7nm time-dependent variability
• Methodology to validate platform reliability models
WP1 Objectives
Provide a methodology and framework to guarantee the QoS of application execution
• Run-time system resources allocation
  • to different applications running concurrently
• Meet QoS/SLA requirements while optimizing a mix of figures of merit
  • including reliability and power budgeting
• Interface with the low-level run-time management that monitors/tunes PEs and other system resources
Achieve a wide applicability of the methodology
• Across a number of possible architectures ranging from HPC to embedded many-cores
[Figure labels: WP5 Applications; WP2 HARPA RTE; WP3 Monitors and Knobs]
What is RTRM?
Run-Time Resources Management (RTRM) is about finding the optimal trade-off between QoS requirements and resources availability
Target scenario
• HW standpoint: Shared resources
  • targeting many-core devices, both multi-cores and GPGPUs
    ─ considering process variations and run-time issues
• SW standpoint: Mixed workloads
  • subject to resources sharing and competition
    ─ considering relative criticality and time-varying requirements
Simple solutions are required
• support for frequently changing use-cases
• suitable for both critical and best-effort applications
Main Goals of RTRM
Multiple devices, subsystems
• Heterogeneous -> Homogeneous (Many-Cores, GPGPUs)
-> Scalability and Retargetability
Shared resources among different applications
• Computation, memory, energy, bandwidth…
-> System-wide resources management
Multiple applications and usage scenarios
• Run-time changing requirements
-> Time adaptability
The starting point
Methodology to support system-wide run-time resource management
• exploiting design-time information
• hierarchical and distributed control
BarbequeRTRM Framework
• multi-objective optimization strategy
• easily portable and modular design
• run-time tunable and scalable policies
• open source project
http://bosp.dei.polimi.it
www.harpa-project.eu/
BarbequeRTRM Development
Introduction of a new modular policy (YaMS)
• partition available resources (R) on applications (A)
  • considering A priorities and R “residual” availabilities
• multi-objective optimization
  • support a set of tunable goals
    ─ DONE: performances, overheads, congestion, fairness
    ─ WIP: stability, robustness, thermal and power
• increase overall system value
  • considering discrete and tunable improvements
LP theory, MMKP heuristic
• promote scheduling of some AWMs
  • which improve optimization goals
• demote scheduling of other AWMs
  • which degrade solution metrics
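To make the MMKP (multiple-choice multidimensional knapsack) idea concrete, here is a minimal greedy sketch for a single resource dimension. It is not the YaMS policy itself: the function name, the data layout, and the "best weighted gain per extra resource" promotion rule are assumptions chosen for illustration, and AWM is read here as an application working mode with a value and a resource demand.

```python
def greedy_mmkp(apps, capacity):
    """Pick at most one AWM per application under a single resource budget.

    apps: {name: (priority_weight, [(value, demand), ...])}  (hypothetical layout)
    Returns {name: index of the chosen AWM}; applications that do not fit are left out.
    """
    choice, used = {}, 0

    # Step 1: admit applications in priority order, each with its cheapest AWM.
    for name, (weight, awms) in sorted(apps.items(), key=lambda kv: -kv[1][0]):
        idx = min(range(len(awms)), key=lambda i: awms[i][1])
        if used + awms[idx][1] <= capacity:
            choice[name] = idx
            used += awms[idx][1]

    # Step 2: repeatedly promote the AWM switch with the best weighted gain
    # per unit of extra resource, as long as it still fits in the budget.
    while True:
        best = None
        for name, cur in choice.items():
            weight, awms = apps[name]
            cur_value, cur_demand = awms[cur]
            for i, (value, demand) in enumerate(awms):
                gain = weight * (value - cur_value)
                extra = demand - cur_demand
                if gain <= 0 or used + extra > capacity:
                    continue  # no improvement, or it would exceed the budget
                ratio = gain / extra if extra > 0 else float("inf")
                if best is None or ratio > best[0]:
                    best = (ratio, name, i, extra)
        if best is None:
            return choice  # no promotion improves the solution any further
        _, name, i, extra = best
        choice[name] = i
        used += extra
```

With `apps = {"video": (2.0, [(1, 1), (4, 3)]), "logger": (1.0, [(1, 1), (3, 2)])}` and `capacity=4`, the higher-priority application is promoted to its richer working mode and the other one stays at its cheapest mode. A real policy would handle several resource dimensions and several goals at once; this sketch only shows the promote/keep mechanics.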
BarbequeRTRM Concepts
Track run-time variability
• application requirements
• resources availabilities
Overheads contingency
• design-time profiling
• run-time optimization
Support different granularities
• system-wide optimization
• application-specific tuning
Integrated work-flow
• single framework to support both design-time and run-time
BarbequeRTRM Concepts
System-Wide RTRM
Coarse grained control on platform available resources:
- resource accounting
- partitioning and abstraction
- manage applications priorities
- power/thermal “coarse tuning”
Application-Specific RTM
Fine grained control on application specific parameters:
- task ordering
- application parameters monitoring
Platform-Specific RTRM
Fine grained control on platform available resources:
- resource monitoring
- allocation enforcing
- low-level HW events handling, e.g., critical conditions, faults...
- power/thermal “fine tuning”
[Figure: architecture diagram. Labels: Applications; HARPA OS Engine; HARPA RT Engine; Access Control; Guide Assistance; Business Intelligence; Monitoring and Security; Requirements; Notify; Constraints; Configure; Optimization Policy; CPU Count, Bandwidth, ...; Goal Gap, ...; PVT constraints on Clocks, ...; Upper bound on F and P, ...; Embedded; HPC; WP1; WP2; WP3; WP5]
BarbequeRTRM Development
Design of a run-time CGroup tuning support
Improved code execution efficiency
• more than x1.3 execution time speedup
  • increased IPC: 1.070 => 1.325
• reduced context switches, reduced OS overhead
Future developments
• Extension to embedded multi/many-core architectures
• GPGPUs
• bandwidth allocation
[Figure: plot with axis Power [W]; labels WP5, WP[2,3]]
[1] S. Libutti et al., “Exploiting Performance Counters for Energy Efficient Co-Scheduling of Mixed Workloads on Multi-Core Platforms”. HiPEAC – PARMA-DITAM. 01/2014.
BarbequeRTRM Development
Applications explicitly select the computing device for executing kernels
What if more applications compete for resources?
A system-wide Run-time Resource Manager is required
BarbequeRTRM Development
GPUs load and temperature balancing
• AMD Nbody sample (32768 particles input), from 1 to 4 running instances
• 2 GPUs (AMD Radeon 7750 HD, fGPU=400MHz, fMEM=300MHz)
[Figure: GPU 0 and GPU 1 load [%] and temperature [°C] over time [s], with temperature delta ∆ [°C]]
• Exec time: -50%
• Thermal unbalancing: from 12-13°C to 3-8 °C
[1] G. Massari et al., “Extending a Run-time Resource Management framework to support OpenCL and Heterogeneous Systems”. HiPEAC – PARMA-DITAM. 01/2014.
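As a rough illustration of what a system-wide manager could do when several instances compete for the two GPUs, the sketch below picks the device with the lowest combined load and temperature score. This is not the policy from the cited work: the `pick_device` name, the dictionary fields, and the equal weighting are assumptions made for the example, and temperatures are assumed to be positive values in °C.

```python
def pick_device(devices, temp_weight=0.5):
    """Return the name of the device with the lowest load/temperature score.

    devices: list of dicts with 'name', 'load' (0-100 %) and 'temp_c' (> 0, in °C).
    temp_weight: hypothetical knob trading thermal balance against load balance.
    """
    hottest = max(d["temp_c"] for d in devices)

    def score(d):
        # Scale load and temperature to a comparable 0..1 range before mixing them.
        load_term = d["load"] / 100
        temp_term = d["temp_c"] / hottest
        return (1 - temp_weight) * load_term + temp_weight * temp_term

    return min(devices, key=score)["name"]
```

Calling it with one heavily loaded, hot GPU and one lightly loaded, cooler GPU returns the latter; setting `temp_weight=0` reduces the choice to pure load balancing.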
Impact
IT4I
• Incremental adoption: installation of HARPA OS (BBQ) already started, experiments also on GPGPU management
• Guaranteed QoS for critical tasks, better power management
• The HARPA environment ensures load balancing and error resilience based on the criticality of the situation
HENESIS
• Use of HARPA OS and HARPA technology for a new generation of products before the end of the project
• Improved reliability and extended lifetime
• Deployment of one or more pilot installations to test the device in a real-world scenario
THALES
• Experiments to see how to achieve a 20-year product lifetime
  • With the power budget reduced by one order of magnitude
  • Exploiting multi/many-core and run-time management
  • Analysis of the impact of intensive DVFS decisions on SoC reliability over time
POLIMI
• Exercise the entire HARPA flow
• Vehicle for public dissemination
• Reference design for training
Highly reusable for other applications
Thanks for your attention
HARPA project website: http://www.harpa-project.eu
HARPA OS – BOSP: bosp.dei.polimi.it
In future, use of openAIRE
Meet us during the workshops we organize: HiPEAC, DATE, …
Dissemination Manager: prof. Dimitrios Soudris, ICCS [email protected]
Contact (project coordinator)
Prof. William Fornaciari
Politecnico di Milano - DEIB
home.deib.polimi.it/fornacia