QoS-driven Lifecycle Management of Service-oriented Distributed Real-time & Embedded Systems

Aniruddha Gokhale

Assistant Professor, ISIS, Dept. of EECS

Vanderbilt University, Nashville, Tennessee

February 16th, 2006

www.dre.vanderbilt.edu

http://www.dre.vanderbilt.edu/~gokhale

Service-oriented Style of Distributed Real-time & Embedded Systems

– Regulating & adapting to (dis)continuous changes in runtime environments

• e.g., online prognostics, dependable upgrades

– Satisfying tradeoffs between multiple (often conflicting) QoS demands

• e.g., secure, real-time, reliable, etc.

– Satisfying QoS demands in face of fluctuating and/or insufficient resources

• e.g., mobile ad hoc networks (MANETs)

Characteristics of SOA-style DRE Systems

• Manifestation of Service-Oriented Architectures (SOA) in the distributed real-time & embedded (DRE) systems space

– Applications composed of one or more “operational strings” of services

– A service is a component or an assembly of components

– Dynamic (re)deployment of services into operational strings is necessary

– New class of QoS (performance + survivability) requirements

• Realized using enabling component middleware technologies, e.g., CCM, .NET, and J2EE

QoS Issues for SOA-style DRE Systems

[Figure: components C1, C2, C3, C4, C5 with a “Failover Unit” label]

• Per-component concern – choice of implementation

– Depends on resources, compatibility with other components in assembly

• Communication concern – choice of communication mechanism used

• Assembly concerns – what components to assemble dynamically? What order? What configurations end-to-end are valid?

• Failure recovery concern – what is the unit of failover?

• Sharing concern – shared components will need proactive survivability, since a shared component affects several services simultaneously

• Availability concern – what is the degree of redundancy? What replication styles to use? Does it apply to whole assembly?

• Deployment concern – how to select resources? Risk alleviation?

Tangled Concerns in SOA-style DRE Systems

• Demonstrates numerous tangled para-functional concerns

• Significant sources of variability that affect end-to-end QoS (performance + survivability)

Separation of Concerns & Managing Variability is the Key

[Figure labels: Design-time, Deployment-time, Run-time]

(1) Design-time Variability Management in SOA-style DRE Systems

• Focus on Separation of Concerns

• “What if” Analysis

– Analytical methods

– Simulation methods

– Model-driven generative programming for “what if”

• Understanding the impact of individual concerns

• Students involved:

– Krishnakumar Balasubramanian, Jaiganesh Balasubramanian, Gan Deng, Amogh Kavimandan, James Hill, Sumant Tambe, Arundhati Kogekar, Dimple Kaul

Work partly supported by DARPA PCES program (PI), DARPA ARMS Program, PI on subcontracts from Lockheed Martin ATL, & NSF CSR-SMA Program, PI

Separation of Concerns using CoSMIC

[Figure: CoSMIC tool chain across the component lifecycle. Roles: Component Developer, Component Assembler, Component Packager, Component Configurator, Deployment Planner, Component Deployer, System Analyzer. Steps: (1) develops, (2) assembles, (3) packages, (4) configures, (5) planning, (6) deployment, (7) analysis & benchmarking, (8) reconfiguration & replanning, (9) design feedback. Tool labels: IDML & PICML, PICML, OCML, QoSML, Cadena & BGML. Frameworks: DAnCE, RACE. Annotation: y = f(x1, x2, ..., xn).]

• CoSMIC tools, e.g., PICML, used for separation of concerns in operational strings

• Captures the data model of the OMG D&C specification

• Synthesis of static deployment plans for DRE components

• New capabilities being added for static deployment planning

Work supported by DARPA PCES Program, PI

• Project Lead and PI DARPA PCES program

• CoSMIC project focuses on separation of deployment and configuration concerns

• Model-driven generative programming framework

• Complementary technology to CIAO and DAnCE middleware

• www.dre.vanderbilt.edu/cosmic

Case Study for “What if” Analysis: Virtual Router

[Figure: VPN topology. Labels: Provider Edge (PE) nodes hosting virtual routers (VR), customer edge (CE) devices, Level 1 Service Providers, Level 2 Service Provider, Backbone 1, Backbone 2, VPN1, VPN2, VPN3. Inset: a virtual router with a firewall, multiple tunnels to customer edge or virtual routers, and multiple tunnels to backbone or virtual routers.]

• E.g., VPN service provided by a virtual router

• Provides differentiated services to customers, e.g., prioritized service

• VPN setup messages must be efficiently (de)multiplexed, serviced, and forwarded

• Implemented using middleware

• Need to estimate capacity of the system at design-time

• Network services need support for efficient (de)multiplexing, dispatching, and routing/forwarding

Problem boils down to capacity planning and estimating performance of configured middleware

Performance Analysis of Reactor Pattern in VR

The Reactor architectural pattern allows event-driven applications to demultiplex & dispatch service requests that are delivered to an application from one or more clients.

• Customers send VPN setup messages to router

• VPN setup messages manifest as events at the VR

• VR must service these events (e.g., resource allocation) and honor the prioritized service, if any

• Accepted messages are forwarded

• Events could be dropped in overload conditions

• Reactor pattern decouples the detection, demultiplexing, & dispatching of events from the handling of events

• Participants include the Reactor, event handle, event demultiplexer, and abstract and concrete event handlers
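A minimal sketch of these participants, assuming Python's standard `selectors` module as the event demultiplexer and a plain callable as the concrete event handler. The names `Reactor` and `on_vpn_setup` and the `b"vpn-setup"` payload are illustrative assumptions, not part of the VR implementation described in the slides.

```python
import selectors
import socket


class Reactor:
    """Single-threaded reactor: detects, demultiplexes, and dispatches events."""

    def __init__(self):
        # Synchronous event demultiplexer (the OS-appropriate selector).
        self._demux = selectors.DefaultSelector()

    def register_handler(self, handle, handler):
        # Associate a handle (here a socket) with a concrete event handler.
        self._demux.register(handle, selectors.EVENT_READ, data=handler)

    def remove_handler(self, handle):
        self._demux.unregister(handle)

    def handle_events(self, timeout=None):
        # Wait for ready handles, then dispatch each one to its handler.
        ready = self._demux.select(timeout)
        for key, _mask in ready:
            key.data(key.fileobj)
        return len(ready)

    def close(self):
        self._demux.close()


received = []


def on_vpn_setup(sock):
    # Concrete event handler: application logic lives here, not in the reactor.
    received.append(sock.recv(1024))


# A connected socket pair stands in for a customer and the virtual router.
client, server = socket.socketpair()
reactor = Reactor()
reactor.register_handler(server, on_vpn_setup)
client.sendall(b"vpn-setup")
dispatched = reactor.handle_events(timeout=1.0)
reactor.remove_handler(server)
reactor.close()
client.close()
server.close()
```

The reactor never inspects the message itself; it only maps a ready handle to its handler, which is the decoupling the pattern is meant to provide.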


[Figure: virtual routers (VR) and customer edge (CE) devices of VPN1]

Modeling VR Capabilities in a Reactor

[Figure: single-threaded reactor with a select-based event demultiplexer. Incoming events arrive from the network as two Poisson streams with rates λ1 and λ2, wait in buffers N1 and N2, and are dispatched to two event handlers with exponential service rates μ1 and μ2.]

• Consider VPN service for two customer classes: the Reactor accepts and handles two types of input events

• Differentiated services for two classes: events are handled in prioritized order

• Each event type has a separate queue to hold the incoming events. Buffer capacity for events of type one is N1 and of type two is N2.

• Event arrivals are Poisson for type one and type two events, with rates λ1 and λ2, respectively.

• Event service time is exponential for type one and type two events, with rates μ1 and μ2, respectively.

Model of a single-threaded, select-based reactor implementation

Performance Metrics of Interest for Reactor

• Throughput:

– Number of events that can be processed per unit time

– Applications such as telecommunications call processing

• Queue length:

– Queuing for the event handler queues

– Appropriate scheduling policies for applications with real-time requirements

• Total number of events:

– Total number of events in the system

– Scheduling decisions

– Resource provisioning required to sustain system demands

• Probability of event loss:

– Events discarded due to lack of buffer space

– Safety-critical systems

– Levels of resource provisioning

• Response time:

– Time taken to service the incoming event

– Bounded response time for real-time systems

Performance Analysis using Stochastic Reward Nets

• Stochastic Reward Nets (SRNs) are an extension to Generalized Stochastic Petri Nets (GSPNs) which are an extension to Petri Nets.

• Extend the modeling power of GSPNs by allowing:

– Guard functions

– Marking-dependent arc multiplicities

– General transition probabilities

– Reward rates at the net level

• Allow model specification at a level closer to intuition.

• Solved using tools such as SPNP (Stochastic Petri Net Package).

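[Figure: (a) SRN model of the reactor with transitions A1, A2, Sn1, Sn2, Sr1, Sr2, places B1, B2, S1, S2, and inhibitor-arc multiplicities N1, N2; (b) snapshot subnet with places StSnpSht and SnpShtInProg and transitions T_SrvSnpSht and T_EndSnpSht. Legend: transition, place, immediate transition, inhibitor arc, token.]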

Modeling the Reactor using SRN (1/2)

• Models arrivals, queuing, and prioritized service of events.

• Transitions A1 and A2: event arrivals.

• Places B1 and B2: buffers/queues.

• Places S1 and S2: service of the events.

• Transitions Sr1 and Sr2: service completions.

• Inhibitor arc from place B1 to transition A1 with multiplicity N1 (likewise B2, A2, N2):

– Prevents firing of transition A1 when there are N1 tokens in place B1.

• Inhibitor arc from place S1 to transition Sr2:

– Offers prioritized service to an event of type one over an event of type two.

– Prevents firing of transition Sr2 when there is a token in place S1.
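The two inhibitor-arc rules can be read as enabling predicates over a marking. The sketch below is only an illustration of those rules, assuming a marking is a dictionary from place name to token count; the function names and the example marking are hypothetical and this is not SPNP input.

```python
def arrival_enabled(marking, k, capacity):
    """A_k may fire unless buffer B_k already holds capacity[k] (= N_k) tokens."""
    return marking[f"B{k}"] < capacity[k]


def completion_enabled(marking, k):
    """Sr_k needs a token in S_k; Sr2 is also inhibited by a token in S1."""
    if marking[f"S{k}"] == 0:
        return False
    if k == 2 and marking["S1"] > 0:
        return False
    return True


capacity = {1: 1, 2: 1}  # N1 = N2 = 1
marking = {"B1": 1, "B2": 0, "S1": 1, "S2": 1}

a1 = arrival_enabled(marking, 1, capacity)   # B1 is full, so A1 is inhibited
a2 = arrival_enabled(marking, 2, capacity)   # B2 has room, so A2 may fire
sr1 = completion_enabled(marking, 1)         # type-one service may complete
sr2 = completion_enabled(marking, 2)         # blocked while S1 holds a token
```

A blocked A1 models an event dropped on overflow, and a blocked Sr2 models the priority of type-one events.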

[Figure: SRN model (a) and snapshot subnet (b), annotated with: event arrival, service queue, servicing the event, drop events on overflow, prioritized service, service completion.]

Modeling the Reactor using SRN (2/2)

[Figure: SRN model of the reactor (a) and snapshot subnet (b)]

• Process of taking successive snapshots

• Reactor waits for new events when currently enabled events are handled

• Sn1 enabled: token in StSnpSht, tokens in B1, and no token in S1.

• Sn2 enabled: token in StSnpSht, tokens in B2, and no token in S2.

• T_SrvSnpSht enabled: token in S1 and/or S2.

• T_EndSnpSht enabled: no token in S1 and S2.

• Sn1 and Sn2 have the same priority

• T_SrvSnpSht has lower priority than Sn1 and Sn2
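The enabling conditions and priorities listed above can be sketched as a function from a marking to the set of transitions that may fire. This is a hedged illustration: it checks only the conditions stated on the slide (the input places of T_SrvSnpSht and T_EndSnpSht are not spelled out there), and the dictionary-based marking and the function name are assumptions.

```python
def enabled_snapshot_transitions(m):
    """Return the snapshot-related transitions that may fire in marking m."""
    guards = {
        "Sn1": m["StSnpSht"] > 0 and m["B1"] > 0 and m["S1"] == 0,
        "Sn2": m["StSnpSht"] > 0 and m["B2"] > 0 and m["S2"] == 0,
        "T_SrvSnpSht": m["S1"] > 0 or m["S2"] > 0,
        "T_EndSnpSht": m["S1"] == 0 and m["S2"] == 0,
    }
    enabled = {name for name, holds in guards.items() if holds}
    # T_SrvSnpSht has lower priority than Sn1 and Sn2, so it yields to them.
    if enabled & {"Sn1", "Sn2"}:
        enabled.discard("T_SrvSnpSht")
    return enabled


# A type-one event is waiting while a type-two event is already in service.
taking = enabled_snapshot_transitions(
    {"StSnpSht": 1, "B1": 2, "B2": 0, "S1": 0, "S2": 1}
)
# Nothing more can join the snapshot; the events in service are handled.
serving = enabled_snapshot_transitions(
    {"StSnpSht": 1, "B1": 0, "B2": 1, "S1": 1, "S2": 1}
)
# Both service places are empty, so the snapshot can end.
ending = enabled_snapshot_transitions(
    {"StSnpSht": 0, "B1": 0, "B2": 0, "S1": 0, "S2": 0}
)
```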


VR SRN: Performance Estimates

• SRN model solved using Stochastic Petri Net Package (SPNP) to obtain estimates of performance metrics.

• Parameter values: λ1 = …/sec, λ2 = …/sec, μ1 = 2/sec, μ2 = 2/sec.

• Two cases: N1 = N2 = 1, and N1 = N2 = 5.

Observations:

• Probability of event loss is higher when the buffer space is 1

• Total number of events of type two is higher than type one.

• Events of type two stay in the system longer than events of type one.

• May degrade the response time of event requests for class 2 customers compared to requests from class 1 customers

Perf. metric     N1 = N2 = 1          N1 = N2 = 5
                 #1       #2          #1        #2
Throughput       0.37/s   0.37/s      0.40/s    0.40/s
Queue length     0.065    0.065       0.12      0.12
Total events     0.25     0.27        0.32      0.35
Loss probab.     0.065    0.065       0.00026   0.00026
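The observation that events of type two stay in the system longer can be cross-checked with Little's law (mean time in system = mean number in system / throughput). The sketch below applies it to the rounded table values, so the results are approximate; the slides themselves do not report these response times.

```python
# (throughput in events/s, mean total events in system), copied from the table.
table = {
    ("N1=N2=1", 1): (0.37, 0.25),
    ("N1=N2=1", 2): (0.37, 0.27),
    ("N1=N2=5", 1): (0.40, 0.32),
    ("N1=N2=5", 2): (0.40, 0.35),
}


def mean_time_in_system(throughput, total_events):
    """Little's law: W = L / X for a stable system."""
    return total_events / throughput


response = {key: mean_time_in_system(*values) for key, values in table.items()}

for (case, event_class), seconds in response.items():
    print(f"{case}, class {event_class}: about {seconds:.2f} s in system")
```

In both buffer configurations the class 2 value comes out larger than the class 1 value, which agrees with the stated observation.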


VR SRN: Sensitivity Analysis

• Analyze the sensitivity of performance metrics to variations in input parameter values.

• Vary λ1 from 0.5/sec to 2.0/sec.

• Values of other parameters: λ2 = …/sec, μ1 = 2/sec, μ2 = 2/sec, N1 = N2 = 5.

• Compute performance measures for each one of the input values.

Observations:

• Throughput of event requests from customer class #1 increases, but the rate of increase declines.

• Throughput of event requests from customer class #2 remains unchanged.

[Figure: throughput (y-axis, 0 to 1.6) versus Lambda1 (x-axis ticks 0.4, 0.44, 0.5, 0.57, 0.66, 0.8, 1, 1.33, 2)]

Middleware Pattern Simulations in OMNeT++

[Figure: OMNeT++ workflow. Labels: .ned files; modules Mod, Submod1, Submod2 with sources Mod_n.h/.cpp, Submod1.h/.cpp, Submod2.h/.cpp; simulation kernel; UI library; OMNeT++ initialization file; OMNeT++ message file; output vector file; output scalar file; statistics; visualization and animation.]

• OMNeT++ is a discrete event simulator for networked systems

• Developers write C++ code for simulation

• www.omnetpp.org

The Simulation Model for Reactor

[Figure: simulation modules: Event Generator, Reactor, Synchronous Event Demultiplexer, Event Handlers with queues, Statistics Collector]
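The module structure above can be illustrated with a small discrete-event simulation. This is a simplified sketch in plain Python rather than OMNeT++ code: it assumes one handler thread, non-preemptive priority for class 0, buffers that count waiting events only, and illustrative rates that are not taken from the slides. It also does not model the snapshot behavior of the SRN.

```python
import heapq
import random


def simulate(lam, mu, cap, horizon, seed=0):
    """Simulate two prioritized event classes sharing one handler thread.

    lam, mu, and cap are pairs indexed by class; class 0 has priority.
    """
    rng = random.Random(seed)
    waiting = [0, 0]
    arrived = [0, 0]
    served = [0, 0]
    dropped = [0, 0]
    busy = False
    # Event generator: one pending arrival per class in the future-event list.
    future = [(rng.expovariate(lam[k]), "arrival", k) for k in (0, 1)]
    heapq.heapify(future)

    def dispatch(now):
        """Demultiplexer: hand the highest-priority waiting event to its handler."""
        nonlocal busy
        busy = False
        for k in (0, 1):
            if waiting[k] > 0:
                waiting[k] -= 1
                busy = True
                heapq.heappush(future, (now + rng.expovariate(mu[k]), "done", k))
                break

    while True:
        now, kind, k = heapq.heappop(future)
        if now > horizon:
            break
        if kind == "arrival":
            arrived[k] += 1
            heapq.heappush(future, (now + rng.expovariate(lam[k]), "arrival", k))
            if waiting[k] < cap[k]:
                waiting[k] += 1
            else:
                dropped[k] += 1  # buffer full: the event is lost
            if not busy:
                dispatch(now)
        else:
            served[k] += 1
            dispatch(now)

    # Statistics collector.
    return {
        "arrived": arrived,
        "served": served,
        "dropped": dropped,
        "throughput": [served[k] / horizon for k in (0, 1)],
        "loss_probability": [dropped[k] / max(arrived[k], 1) for k in (0, 1)],
    }


stats = simulate(lam=(0.5, 0.5), mu=(2.0, 2.0), cap=(1, 1), horizon=10_000.0, seed=1)
```

Rerunning `simulate` with larger `cap` values gives a quick feel for how buffer space trades off against loss probability, which is the kind of "what if" question the simulation model is meant to answer.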


Addressing Middleware Variability Challenges

Although middleware provides reusable building blocks that capture commonalities, these blocks and their compositions incur variabilities that impact performance in significant ways.

• Per-Block Configuration Variability

– Incurred due to variations in implementations & configurations for a patterns-based building block

– E.g., single threaded versus thread-pool based reactor implementation dimension that crosscuts the event demultiplexing strategy (e.g., select, poll, WaitForMultipleObjects)

• Compositional Variability

– Incurred due to variations in the compositions of these building blocks

– Need to address compatibility in the compositions and individual configurations

– Dictated by needs of the domain

– E.g., Leader-Follower makes no sense in a single threaded Reactor

[Figure: Reactor feature model. Event handling strategy: single threaded, thread pool. Event demultiplexing strategy: select, poll, WaitForMultipleObjects, Qt, Tk.]
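A compatibility check of this kind can be sketched as a small validity function over the configuration space. The feature names come from the figure labels; grouping Qt and Tk under the demultiplexing strategy, and the function names themselves, are assumptions made for illustration. The only composition rule encoded is the Leader-Follower example from the slide.

```python
from itertools import product

EVENT_HANDLING = ("single threaded", "thread pool")
EVENT_DEMUX = ("select", "poll", "WaitForMultipleObjects", "Qt", "Tk")


def is_valid(handling, demux, concurrency_pattern=None):
    """Check one reactor configuration, optionally composed with another pattern."""
    if handling not in EVENT_HANDLING or demux not in EVENT_DEMUX:
        return False
    # Compositional rule: Leader-Follower needs more than one thread.
    if concurrency_pattern == "Leader-Follower" and handling == "single threaded":
        return False
    return True


def valid_compositions(concurrency_pattern=None):
    """Enumerate every (handling, demux) pair that passes the validity check."""
    return [
        (handling, demux)
        for handling, demux in product(EVENT_HANDLING, EVENT_DEMUX)
        if is_valid(handling, demux, concurrency_pattern)
    ]


standalone = valid_compositions()
with_leader_follower = valid_compositions("Leader-Follower")
```

Enumerating the valid space up front is one way to decide which performance models are worth building before any "what if" analysis is run.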

Automation Goals for “What if” Analysis

• Build and validate performance models for invariant parts of middleware building blocks

• Weaving of variability concerns manifested in a building block into the performance models

• Compose and validate performance models of building blocks mirroring the anticipated software design of DRE systems

• Estimate end-to-end performance of composed system

• Iterate until design meets performance requirements

Applying design-time performance analysis techniques to estimate the impact of variability in middleware-based DRE systems

[Figure: an invariant model of a pattern is refined by weaving in variability; refined models of patterns are combined into the composed system and evaluated against a workload (plot axis ticks 0 to 200).]

Automating & Scaling the “What if” Process

• Model-driven generative technologies

• Developed the SRN Modeling Language (SRNML) in GME

• Applied C-SAW framework (from Univ. of Alabama at Birmingham) for model scalability

R&D supported by NSF CSR-SMA Program in collaboration with Dr. Jeff Gray (UAB) and Dr. Swapna Gokhale (UConn)


Analyzing Impact of Individual Concerns

• Borrow concepts from physical systems to analyze the impact of individual concerns on end-to-end system

• Method of joints, method of sections, free body diagrams, equilibrium conditions

Engineering Mechanics – Statics & Dynamics – for analyzing impact of concerns?


Engineering Mechanics for DRE Systems

A concern is viewed as a “force”

Challenges

• Directionality – are concerns vectors?

• Rigidity – are assemblies rigid or deformable?

• Force distribution – does a concern have components along Cartesian axes?

• Well-defined structures – do software components have properties like trusses?

• Second order effects – transient effects showing up elsewhere

• Notion of friction – these are probably the capacities of resources

[Figure: components C1, C2, C3, C4, C5 with a “Failover Unit” label]

(2) Deployment-time Intelligence

• Near optimal deployment planning decisions

• Specialized middleware stacks

• Students involved:

– Arvind Krishna (graduated), Jaiganesh Balasubramanian, Gan Deng, Dimple Kaul, Arundhati Kogekar, Amogh Kavimandan

Work partly supported by DARPA ARMS Program, PI on subcontracts from Lockheed Martin ATL

Deployment Challenges

• Service workloads and resource capacity issues – service placement depends on workloads and available resources

• Component accessibility patterns – component survivability depends on its sharing degree

• Differentiated levels of service – affects resource provisioning and survivability strategies

• Service failover – different failover possibilities, e.g., as a whole or part assembly or one component at a time

• Resource sharing – increases the risk of component(s) requiring proactive survivability strategy

• No one-size-fits-all dependability strategy – cannot dictate one FT strategy on all services

Service Placement Problem

• A resource configuration is a tuple RC = (C, D, HC, EC) where:

– C is a set of computation nodes, each attributed by PI(c), the processing index (capacity); MI(c), the memory index; and RI(c), the reliability index

– D is a set of data access units of types in {Ai, Sj}

– HC: C → P(D) is a map associating each c in C with a set of data access units

– EC ⊆ C × C is a set of communication links, each attributed by BI(e), the bandwidth index, and RI(e), the reliability index

[Figure: example resource configuration with computation nodes C1, C2, C3, C4 and data access units A1, A2, A3, S1, S3, S4]

• System performance can be measured in a variety of ways. Considering a task assignment TA: T → C:

– Resource utilization: for processing, it is defined as the average of all task processing utilization, given as

$$PU(TA) = \frac{1}{|C|} \sum_{c \in C} \frac{\sum_{t \in TA^{-1}(c)} PI(t)}{PI(c)}$$

– Memory utilization MU(TA) and link utilization LU(TA) can be defined similarly

– System utilization factor: the weighted sum percentage of utilizing the system resources

$$SU(TA) = w_1 \, PU(TA) + w_2 \, MU(TA) + w_3 \, LU(TA)$$

• Reliability is trickier to measure. In general, the reliability of a given computation string is the product of the reliability indices of the underlying nodes and communication edges.

• The reliability factor RF(TA) for a given task assignment, TA, depends on:

– The reliability of all its computation strings.

– The group reliability of the underlying nodes (taking into account their relative distances).

– The resource utilization of the system. The more the system hardware is utilized, the less reliable it is.
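The measures above can be sketched in a few lines, assuming PU(TA) averages, over all nodes, the processing demand placed on a node divided by that node's PI(c), and that each task t carries a processing demand PI(t). The `Node` class, the weight tuple, and every numeric value below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class Node:
    name: str
    pi: float  # PI(c): processing index (capacity)
    mi: float  # MI(c): memory index
    ri: float  # RI(c): reliability index


def processing_utilization(nodes, task_pi, assignment):
    """PU(TA): mean over nodes of (processing demand placed on c) / PI(c)."""
    load = {node.name: 0.0 for node in nodes}
    for task, node_name in assignment.items():
        load[node_name] += task_pi[task]
    return sum(load[node.name] / node.pi for node in nodes) / len(nodes)


def system_utilization(pu, mu, lu, weights):
    """SU(TA): weighted sum of processing, memory, and link utilization."""
    w1, w2, w3 = weights
    return w1 * pu + w2 * mu + w3 * lu


def string_reliability(node_ris, link_ris):
    """Product of the reliability indices of a string's nodes and links."""
    return prod(node_ris) * prod(link_ris)


nodes = [Node("C1", pi=10.0, mi=4.0, ri=0.9), Node("C2", pi=20.0, mi=8.0, ri=0.8)]
task_pi = {"t1": 5.0, "t2": 5.0, "t3": 5.0}        # hypothetical PI(t) demands
assignment = {"t1": "C1", "t2": "C2", "t3": "C2"}  # task assignment TA: T -> C

pu = processing_utilization(nodes, task_pi, assignment)
su = system_utilization(pu, mu=0.2, lu=0.1, weights=(0.5, 0.3, 0.2))
reliability = string_reliability([n.ri for n in nodes], [0.5])
```

Here both nodes end up half loaded, so PU(TA) is 0.5, and the reliability of the two-node string is the product of the two node indices and the single link index. A deployment planner would compare such figures across candidate assignments.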


Specializations via Generative Programming

• GME-based POSAML language for POSA2 pattern language

• Generative programming to synthesize FOCUS and AspectC++ rules

• Synthesize specialized middleware stacks for distributed deployment of operational strings.

[Figure: specialized middleware stack. Labels: container; component lifecycle manager; crosscutting, configurable QoS property manager component (concurrency, security, persistence, instrumentation, others); demuxing & dispatching; marshaling; protocol adapter.]

(3) Run-time QoS-aware Mechanisms

• Focus on Autonomic Mechanisms

• Survivability & Fault tolerance

• Students involved:

– Jaiganesh Balasubramanian, Sumant Tambe, Jules White, Nishanth Shankaran

Work supported by DARPA ARMS Program, PI on subcontracts from Lockheed Martin ATL, BBN Technologies, & Telcordia

Distributed Virtual Container Approach

• Virtual Container Concept for Component M/W

– Based on a virtualization idea

– Spans boundaries across all the replicas, which could be placed on different physical nodes

– Provides a single point for resource provisioning & component programming

– Seamless environment for configuring FT, LB, online swapping

– Handles fine-grained checkpointing across all the replicas in virtual container

– Reliable multicast & state synchronization confined to a virtual container

– Maintains information about how the replicas are connected to the external component assemblies

• Salient features

– Provides an operating context for the components/assemblies requiring QoS

– Relieves programmer from having to configure the middleware for QoS support

– Clients are oblivious to replication

– Normal container programming model

– Middleware hides the virtualization details

[Figure: virtual container spanning primary and secondary replicas]

Run-time QoS & Survivability Mechanisms

• A configurable approach to survivability including micro- (infrastructure) & macro- (assembly & operational string) level strategies

• Micro-level strategies monitor infrastructure state to make proactive decisions at the:

– Component level (swapping & migration)

– Middleware level (configurations)

– Component server level (process resource allocations)

– Node level (multiple components)

• Macro-level strategies monitor assembly health to make failover decisions

– Failover based on type of failover unit

– Affects service placement decisions

– May involve load balancing

– State synchronization issues

– Replication styles (hidden by FT strategies)

• Initial prototype developed using Component-Integrated ACE ORB (CIAO) & Deployment & Configuration Engine (DAnCE) (www.dre.vanderbilt.edu)

Research Summary

R&D in new, holistic approaches to end-to-end QoS management in services-enabled distributed real-time & embedded systems

[Figure: layered stack of Applications, Middleware, OS & Protocols, Hardware]

Research Challenge | Research Approach | Benefits

• Managing problem space variability

• Model-driven generative approach to separation of concerns

• Enhance the state-of-art in MDD and AOSD technologies

• Design-time “What-if” analysis using generative prog

• Variety of analysis techniques including non traditional mechanisms

• Generative technologies for automated analysis

• Application of Engineering Mechanics

• Deployment-time intelligent decisions

• New applications of constraints optimization theory

• Middleware specializations

• Near optimal deployment

• Specialized middleware stacks

• Run-time Mechanisms

• Multilevel, proactive QoS mgmt schemes

• Virtualization ideas

• Largely autonomic

• Survivable systems
