QoS-driven Lifecycle Management of Service-oriented Distributed Real-time & Embedded Systems
Aniruddha Gokhale
[email protected]/~gokhale
Assistant Professor, ISIS, Dept. of EECS
Vanderbilt University, Nashville, Tennessee
February 16th, 2006
www.dre.vanderbilt.edu
Service-oriented Style of Distributed Real-time & Embedded Systems
– Regulating & adapting to (dis)continuous changes in runtime environments
• e.g., online prognostics, dependable upgrades
– Satisfying tradeoffs between multiple (often conflicting) QoS demands
• e.g., secure, real-time, reliable, etc.
– Satisfying QoS demands in face of fluctuating and/or insufficient resources
• e.g., mobile ad hoc networks (MANETs)
Characteristics of SOA-style DRE Systems
• Manifestation of Service-Oriented Architectures (SOA) in the distributed real-time & embedded (DRE) systems space
– Applications composed of one or more “operational strings” of services
– A service is a component or an assembly of components
– Dynamic (re)deployment of services into operational strings is necessary
– New class of QoS (performance + survivability) requirements
• Realized using enabling component middleware technologies, e.g., CCM, .NET, and J2EE
QoS Issues for SOA-style DRE Systems
[Figure: components C1 to C5 in an assembly, with a failover unit marked]
• Per-component concern – choice of implementation
– Depends on resources, compatibility with other components in assembly
• Communication concern – choice of communication mechanism used
• Assembly concerns – what components to assemble dynamically? What order? What configurations end-to-end are valid?
• Failure recovery concern – what is the unit of failover?
• Sharing concern – shared components will need proactive survivability since they affect several services simultaneously
• Availability concern – what is the degree of redundancy? What replication styles to use? Does it apply to whole assembly?
• Deployment concern – how to select resources? Risk alleviation?
Tangled Concerns in SOA-style DRE Systems
• Demonstrates numerous tangled para-functional concerns
• Significant sources of variability that affect end-to-end QoS (performance + survivability)
[Figure: concerns shown across three phases: Design-time, Deployment-time, Run-time]
Separation of Concerns & Managing Variability is the Key
(1) Design-time Variability Management in SOA-style DRE Systems
• Focus on Separation of Concerns
• “What if” Analysis
– Analytical methods
– Simulation methods
– Model-driven generative programming for “what if”
• Understanding the impact of individual concerns
• Students involved:
– Krishnakumar Balasubramanian, Jaiganesh Balasubramanian, Gan Deng, Amogh Kavimandan, James Hill, Sumant Tambe, Arundhati Kogekar, Dimple Kaul
Work partly supported by DARPA PCES program (PI), DARPA ARMS Program, PI on subcontracts from Lockheed Martin ATL, & NSF CSR-SMA Program, PI
Separation of Concerns using CoSMIC
[Figure: CoSMIC tool chain and workflow. Roles: Component Developer, Component Assembler, Component Packager, Component Configurator, Deployment Planner, Component Deployer, System Analyzer. Steps: (1) develops, (2) assembles, (3) packages, (4) configures, (5) planning, (6) deployment, (7) analysis & benchmarking, (8) reconfiguration & replanning, (9) design feedback. Tool labels: IDML & PICML, PICML, OCML, QoSML, Cadena & BGML. Frameworks: DAnCE, RACE. Analysis is annotated with y = f(x1, x2, ..., xn).]
• CoSMIC tools, e.g., PICML, used for separation of concerns in operational strings
• Captures the data model of the OMG D&C specification
• Synthesis of static deployment plans for DRE components
• New capabilities being added for static deployment planning
Work supported by DARPA PCES Program, PI
• Project Lead and PI DARPA PCES program
• CoSMIC project focuses on separation of deployment and configuration concerns
• Model-driven generative programming framework
• Complementary technology to CIAO and DAnCE middleware
• www.dre.vanderbilt.edu/cosmic
Case Study for “What if” Analysis: Virtual Router
[Figure: Level 1 and Level 2 service providers interconnected through Backbone 1 and Backbone 2. Each Provider Edge (PE) hosts several virtual routers (VR) that serve customer edge (CE) sites belonging to VPN1, VPN2, and VPN3. Inset: a virtual router with a firewall, multiple tunnels to customer edge or virtual routers, and multiple tunnels to backbone or virtual routers.]
• E.g., VPN service provided by a virtual router
• Provides differentiated services to customers, e.g., prioritized service
• VPN setup messages must be efficiently (de)multiplexed, serviced, and forwarded
• Implemented using middleware
• Need to estimate capacity of the system at design-time
• Network services need support for efficient (de)multiplexing, dispatching, and routing/forwarding
Problem boils down to capacity planning and estimating performance of configured middleware
Performance Analysis of Reactor Pattern in VR
The Reactor architectural pattern allows event-driven applications to demultiplex & dispatch service requests that are delivered to an application from one or more clients.
• Customers send VPN setup messages to router
• VPN setup messages manifest as events at the VR
• VR must service these events (e.g., resource allocation) and honor the prioritized service, if any
• Accepted messages are forwarded
• Events could be dropped in overload conditions
• Reactor pattern decouples the detection, demultiplexing, & dispatching of events from the handling of events
• Participants include the Reactor, event handles, the event demultiplexer, and abstract and concrete event handlers
[Figure: a Provider Edge (PE) hosting virtual routers (VR) that serve customer edge (CE) sites of VPN1]
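The roles listed above can be sketched in a few lines of Python. This is only an illustrative sketch, not the middleware implementation analyzed in the talk: the `Reactor` and `VpnSetupHandler` classes and the message format are hypothetical, and the standard `selectors` module stands in for the event demultiplexer. A connected socket pair plays the part of a customer-to-router link.

```python
import selectors
import socket


class Reactor:
    """Single-threaded reactor: demultiplexes ready handles and dispatches
    each one to the handler registered for it."""

    def __init__(self):
        # DefaultSelector chooses select/poll/epoll/kqueue depending on the OS.
        self._demux = selectors.DefaultSelector()

    def register(self, handle, handler):
        # Associate an event handle (a socket) with a concrete event handler.
        self._demux.register(handle, selectors.EVENT_READ, data=handler)

    def remove(self, handle):
        self._demux.unregister(handle)

    def handle_events(self, timeout=0.1):
        """Run one demultiplex-and-dispatch cycle; return the events handled."""
        ready = self._demux.select(timeout)
        for key, _mask in ready:
            key.data(key.fileobj)  # dispatch to the registered handler
        return len(ready)

    def close(self):
        self._demux.close()


class VpnSetupHandler:
    """Hypothetical concrete event handler that records setup messages."""

    def __init__(self):
        self.accepted = []

    def __call__(self, sock):
        self.accepted.append(sock.recv(1024))


# A connected socket pair stands in for a customer-to-router link.
customer, router_side = socket.socketpair()
reactor = Reactor()
handler = VpnSetupHandler()
reactor.register(router_side, handler)

customer.sendall(b"VPN-SETUP customer=1")
handled = reactor.handle_events(timeout=1.0)

reactor.remove(router_side)
reactor.close()
customer.close()
router_side.close()
```

The event-detection code in `Reactor` never inspects message contents, which is the decoupling the pattern is meant to provide: new event types only require registering another handler.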
Modeling VR Capabilities in a Reactor
[Figure: incoming events from the network reach a single-threaded reactor with a select-based event demultiplexer. Each event type has a Poisson arrival rate (λ1, λ2), a buffer (N1, N2), and an event handler with exponential service time (μ1, μ2).]
• Consider VPN service for two customer classes ⇒ Reactor accepts and handles two types of input events
• Differentiated services for two classes ⇒ Events are handled in prioritized order
• Each event type has a separate queue to hold the incoming events. Buffer capacity for events of type one is N1 and of type two is N2.
• Event arrivals are Poisson for type one and type two events, with rates λ1 and λ2, resp.
• Event service time is exponential for type one and type two events, with rates μ1 and μ2, resp.
Model of a single-threaded, select-based reactor implementation
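To get a feel for such a model before building a formal one, a rough discrete-event simulation can help. The sketch below is a simplification under stated assumptions, not the model from the talk: one server handles both classes, class 0 has non-preemptive priority over class 1, the buffers hold only waiting events, and the arrival rate of 0.4/sec is an assumed value. The function name `simulate` and all parameter names are hypothetical.

```python
import random


def simulate(lam, mu, cap, horizon=20000.0, seed=1):
    """Simulate two event classes sharing one server (the reactor thread).

    lam, mu, cap: per-class arrival rates, service rates, buffer capacities.
    Class 0 is served before class 1; service is non-preemptive.
    Returns per-class (throughput, loss_probability).
    """
    rng = random.Random(seed)
    now = 0.0
    waiting = [0, 0]   # events buffered per class
    busy = None        # class currently in service, or None when idle
    arrived = [0, 0]
    lost = [0, 0]
    done = [0, 0]
    while now < horizon:
        # Exponential clocks are memoryless, so racing the currently
        # active rates gives the next event of the underlying Markov chain.
        service_rate = mu[busy] if busy is not None else 0.0
        total = lam[0] + lam[1] + service_rate
        now += rng.expovariate(total)
        pick = rng.uniform(0.0, total)
        if busy is not None and pick >= lam[0] + lam[1]:
            done[busy] += 1            # service completion
            busy = None
        else:
            k = 0 if pick < lam[0] else 1
            arrived[k] += 1
            if waiting[k] < cap[k]:
                waiting[k] += 1
            else:
                lost[k] += 1           # buffer full: the event is dropped
        if busy is None:
            for k in (0, 1):           # the higher-priority class is tried first
                if waiting[k] > 0:
                    waiting[k] -= 1
                    busy = k
                    break
    return [(done[k] / now, lost[k] / max(arrived[k], 1)) for k in (0, 1)]


# Assumed rates: 0.4 arrivals/sec and 2 services/sec for both classes.
small = simulate(lam=(0.4, 0.4), mu=(2.0, 2.0), cap=(1, 1))
large = simulate(lam=(0.4, 0.4), mu=(2.0, 2.0), cap=(5, 5))
```

Comparing `small` and `large` shows the qualitative effect of buffer capacity on event loss. The numbers should not be expected to match an analytical solution of a more detailed model.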
Performance Metrics of Interest for Reactor
• Throughput:
– Number of events that can be processed per unit time
– Applications such as telecommunications call processing
• Queue length:
– Queuing for the event handler queues
– Appropriate scheduling policies for applications with real-time requirements
• Total number of events:
– Total number of events in the system
– Scheduling decisions
– Resource provisioning required to sustain system demands
• Probability of event loss:
– Events discarded due to lack of buffer space
– Safety-critical systems
– Levels of resource provisioning
• Response time:
– Time taken to service the incoming event
– Bounded response time for real-time systems
Performance Analysis using Stochastic Reward Nets
• Stochastic Reward Nets (SRNs) are an extension to Generalized Stochastic Petri Nets (GSPNs), which are an extension to Petri Nets.
• Extend the modeling power of GSPNs by allowing:
– Guard functions
– Marking-dependent arc multiplicities
– General transition probabilities
– Reward rates at the net level
• Allow model specification at a level closer to intuition.
• Solved using tools such as SPNP (Stochastic Petri Net Package).
[Figure: SRN model of the reactor. Part (a): transitions A1, A2, Sn1, Sn2, Sr1, Sr2 and places B1, B2, S1, S2, with multiplicities N1 and N2. Part (b): places StSnpSht and SnpShtInProg with transitions T_SrvSnpSht and T_EndSnpSht. Legend: transition, place, immediate transition, inhibitor arc, token.]
Modeling the Reactor using SRN (1/2)
• Models arrivals, queuing, and prioritized service of events.
• Transitions A1 and A2: Event arrivals.
• Places B1 and B2: Buffer/queues.
• Places S1 and S2: Service of the events.
• Transitions Sr1 and Sr2: Service completions.
• Inhibitor arcs: Place B1 and transition A1 with multiplicity N1 (likewise B2, A2, N2)
– Prevents firing of transition A1 when there are N1 tokens in place B1.
• Inhibitor arc from place S1 to transition Sr2:
– Offers prioritized service to an event of type one over an event of type two.
– Prevents firing of transition Sr2 when there is a token in place S1.
[Figure: SRN parts (a) and (b) annotated with callouts: event arrival, service queue, servicing the event, drop events on overflow, prioritized service, service completion.]
Modeling the Reactor using SRN (2/2)
• Process of taking successive snapshots
• Reactor waits for new events when currently enabled events are handled
• Sn1 enabled: Token in StSnpSht & Tokens in B1 & No Token in S1.
• Sn2 enabled: Token in StSnpSht & Tokens in B2 & No Token in S2.
• T_SrvSnpSht enabled: Token in S1 and/or S2.
• T_EndSnpSht enabled: No token in S1 and S2.
• Sn1 and Sn2 have same priority
• T_SrvSnpSht lower priority than Sn1 and Sn2
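The enabling conditions stated on the two SRN slides can be read as predicates over a marking. The sketch below encodes only those stated conditions as a hypothetical helper, `enabled_transitions`. It is not SPNP input, and it deliberately ignores transition priorities and any input places the figure may show beyond the ones named in the text.

```python
def enabled_transitions(marking, n1, n2):
    """Return the transitions whose stated enabling conditions hold.

    marking maps place names to token counts; n1 and n2 are the
    multiplicities of the inhibitor arcs from B1 and B2.
    """
    m = marking
    enabled = set()
    # Inhibitor arcs with multiplicity N1/N2: arrivals stop at a full buffer.
    if m["B1"] < n1:
        enabled.add("A1")
    if m["B2"] < n2:
        enabled.add("A2")
    # Snapshot transitions move buffered events into service.
    if m["StSnpSht"] > 0 and m["B1"] > 0 and m["S1"] == 0:
        enabled.add("Sn1")
    if m["StSnpSht"] > 0 and m["B2"] > 0 and m["S2"] == 0:
        enabled.add("Sn2")
    # Service completions; the inhibitor arc from S1 blocks Sr2.
    if m["S1"] > 0:
        enabled.add("Sr1")
    if m["S2"] > 0 and m["S1"] == 0:
        enabled.add("Sr2")
    # Snapshot bookkeeping, using only the conditions given in the text.
    if m["S1"] > 0 or m["S2"] > 0:
        enabled.add("T_SrvSnpSht")
    if m["S1"] == 0 and m["S2"] == 0:
        enabled.add("T_EndSnpSht")
    return enabled


# Both buffers full, both service places occupied, snapshot already taken.
marking = {"B1": 1, "B2": 1, "S1": 1, "S2": 1, "StSnpSht": 0}
enabled = enabled_transitions(marking, n1=1, n2=1)
```

In this marking the type-two completion `Sr2` is blocked by the token in `S1`, which is how the net expresses prioritized service.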
VR SRN: Performance Estimates
• SRN model solved using Stochastic Petri Net Package (SPNP) to obtain estimates of performance metrics.
• Parameter values: λ1 = …/sec, λ2 = …/sec, μ1 = 2/sec, μ2 = 2/sec.
• Two cases: N1 = N2 = 1, and N1 = N2 = 5.

Perf. metric      N1 = N2 = 1            N1 = N2 = 5
                  #1         #2          #1         #2
Throughput        0.37/s     0.37/s      0.40/s     0.40/s
Queue length      0.065      0.065       0.12       0.12
Total events      0.25       0.27        0.32       0.35
Loss probab.      0.065      0.065       0.00026    0.00026

Observations:
• Probability of event loss is higher when the buffer space is 1
• Total number of events of type two is higher than type one.
• Events of type two stay in the system longer than events of type one.
• May degrade the response time of event requests for class 2 customers compared to requests from class 1 customers
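One quick sanity check on the reported estimates is flow balance: if every arriving event is either lost or eventually served, throughput equals the arrival rate times (1 − loss probability). The sketch below applies that relation to the throughput and loss values in the table. The flow-balance assumption and the helper name `implied_arrival_rate` are not from the slides.

```python
# (throughput per second, loss probability) for one event class, per case.
cases = {
    "N1 = N2 = 1": (0.37, 0.065),
    "N1 = N2 = 5": (0.40, 0.00026),
}


def implied_arrival_rate(throughput, loss_probability):
    """Arrival rate implied by flow balance: throughput = rate * (1 - loss)."""
    return throughput / (1.0 - loss_probability)


implied = {name: implied_arrival_rate(x, p) for name, (x, p) in cases.items()}
for name, rate in implied.items():
    print(f"{name}: implied arrival rate is about {rate:.3f}/sec")
```

Both cases imply an arrival rate close to 0.4/sec, so the two columns of the table are mutually consistent under this assumption.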
VR SRN: Sensitivity Analysis
• Analyze the sensitivity of performance metrics to variations in input parameter values.
• Vary λ1 from 0.5/sec to 2.0/sec.
• Values of other parameters: λ2 = …/sec, μ1 = 2/sec, μ2 = 2/sec, N1 = N2 = 5.
• Compute performance measures for each one of the input values.
Observations:
• Throughput of event requests from customer class #1 increases, but rate of increase declines.
• Throughput of event requests from customer class #2 remains unchanged.
[Figure: throughput (y-axis, 0 to 1.6) versus Lambda1 (x-axis, 0.4 to 2)]
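Middleware Pattern Simulations in OMNeT++
[Figure: OMNeT++ simulation workflow. Inputs: .ned files describing modules (Mod, Submod1, Submod2), module sources (Mod_n.h/.cpp, Submod1.h/.cpp, Submod2.h/.cpp), the simulation kernel, the UI library, the OMNeT++ initialization file, and the OMNeT++ message file. Outputs: output vector file and output scalar file, used for statistics, visualization, and animation.]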
• OMNeT++ is a discrete event simulator for networked systems
• Developers write C++ code for simulation
• www.omnetpp.org
The Simulation Model for Reactor
[Figure: simulation modules: Event Generator, Reactor, Synchronous Event Demultiplexer, Event Handlers with queues, Statistics Collector]
Addressing Middleware Variability Challenges
Although middleware provides reusable building blocks that capture commonalities, these blocks and their compositions incur variabilities that impact performance in significant ways.
• Per-Block Configuration Variability
– Incurred due to variations in implementations & configurations for a patterns-based building block
– E.g., single-threaded versus thread-pool based reactor implementation dimension that crosscuts the event demultiplexing strategy (e.g., select, poll, WaitForMultipleObjects)
• Compositional Variability
– Incurred due to variations in the compositions of these building blocks
– Need to address compatibility in the compositions and individual configurations
– Dictated by needs of the domain
– E.g., Leader-Follower makes no sense in a single-threaded Reactor
[Figure: Reactor variability tree. Event handling strategy: single threaded, thread pool. Event demultiplexing strategy: select, poll, WaitForMultipleObjects, Qt, Tk.]
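The two kinds of variability can be made concrete by enumerating a configuration space and filtering it with compatibility rules. The sketch below uses the Reactor options named on this slide and the single constraint it mentions. The optional composition with Leader-Follower and the helper `is_compatible` are illustrative assumptions, and a real tool would carry many more rules.

```python
from itertools import product

# Per-block options for the Reactor, as named on the slide.
HANDLING = ["single threaded", "thread pool"]
DEMUX = ["select", "poll", "WaitForMultipleObjects", "Qt", "Tk"]
# Compositional choice (assumed): optionally compose with Leader-Follower.
CONCURRENCY_PATTERN = [None, "Leader-Follower"]


def is_compatible(handling, demux, pattern):
    """Only the constraint stated on the slide is encoded:
    Leader-Follower makes no sense in a single-threaded Reactor."""
    return not (pattern == "Leader-Follower" and handling == "single threaded")


all_configs = list(product(HANDLING, DEMUX, CONCURRENCY_PATTERN))
valid = [cfg for cfg in all_configs if is_compatible(*cfg)]
print(f"{len(valid)} of {len(all_configs)} configurations are compatible")
```

Each surviving configuration is a candidate for which a performance model would need to be built or refined, which is why automating that step matters as the space grows.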
Automation Goals for “What if” Analysis
Applying design-time performance analysis techniques to estimate the impact of variability in middleware-based DRE systems
• Build and validate performance models for invariant parts of middleware building blocks
• Weaving of variability concerns manifested in a building block into the performance models
• Compose and validate performance models of building blocks mirroring the anticipated software design of DRE systems
• Estimate end-to-end performance of composed system
• Iterate until design meets performance requirements
[Figure: variability is woven into the invariant model of a pattern to give refined models of the pattern; refined models are combined into a composed system, shown with workload plots (axes 0 to 200).]
Automating & Scaling the “What if” Process
• Model-driven generative technologies
• Developed the SRN Modeling Language (SRNML) in GME
• Applied C-SAW framework (from Univ of Alabama, Birmingham) for model scalability
R&D supported by NSF CSR-SMA Program in collaboration with Dr. Jeff Gray (UAB) and Dr. Swapna Gokhale (UConn)
Analyzing Impact of Individual Concerns
• Borrow concepts from physical systems to analyze the impact of individual concerns on end-to-end system
• Method of joints, method of sections, free body diagrams, equilibrium conditions
Engineering Mechanics – Statics & Dynamics – for analyzing impact of concerns?
Engineering Mechanics for DRE Systems
A concern is viewed as a “force”
Challenges
• Directionality – are concerns vectors?
• Rigidity – are assemblies rigid or deformable?
• Force distribution – does a concern have components along Cartesian axes?
• Well-defined structures – do software components have properties like trusses?
• Second order effects – transient effects showing up elsewhere
• Notion of friction – these are probably the capacities of resources
[Figure: components C1 to C5 with a failover unit marked]
(2) Deployment-time Intelligence
• Near optimal deployment planning decisions
• Specialized middleware stacks
• Students involved:
– Arvind Krishna (graduated), Jaiganesh Balasubramanian, Gan Deng, Dimple Kaul, Arundhati Kogekar, Amogh Kavimandan
Work partly supported by DARPA ARMS Program, PI on subcontracts from Lockheed Martin ATL
Deployment Challenges
• Service workloads and resource capacity issues – service placement depends on workloads and available resources
• Component accessibility patterns – component survivability depends on its sharing degree
• Differentiated levels of service – affects resource provisioning and survivability strategies
• Service failover – different failover possibilities, e.g., as a whole or part assembly or one component at a time
• Resource sharing – increases the risk of component(s) requiring proactive survivability strategy
• No one-size-fits-all dependability strategy – cannot dictate one FT strategy on all services
Service Placement Problem
• A resource configuration is a tuple RC = (C, D, HC, EC) where:
– C is a set of computation nodes, each attributed by:
   PI(c): processing index (capacity)
   MI(c): memory index
   RI(c): reliability index
– D is a set of data access units of types in {Ai, Sj}
– HC: C → P(D) is a map associating each c in C with a set of data access units
– EC ⊆ C × C is a set of comm. links, each attributed by:
   BI(e): bandwidth index
   RI(e): reliability index
[Figure: computation nodes C1 to C4 with data access units A1, A2, A3, S1, S3, S4]
• System performance can be measured in a variety of ways. Considering a task assignment TA: T → C:
– Resource utilization: for processing it is defined as the average of all task processing utilization, given as
$$PU(TA) = \frac{1}{|C|} \sum_{c \in C} \frac{\sum_{t \in TA^{-1}(c)} PI(t)}{PI(c)}$$
– Memory utilization MU(TA) and link utilization LU(TA) can be defined similarly
– System utilization factor: the weighted sum percentage of utilizing the system resources
$$SU(TA) = \alpha_1 \, PU(TA) + \alpha_2 \, MU(TA) + \alpha_3 \, LU(TA)$$
• Reliability is trickier to measure. In general, the reliability of a given computation string is the product of the reliability indices of the underlying nodes and communication edges.
• The reliability factor RF(TA) for a given task assignment, TA, depends on:
– The reliability of all its computation strings.
– The group reliability of the underlying nodes (taking into account their relative distances).
– The resource utilization of the system. The more the system hardware is utilized, the less reliable it is.
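The placement metrics above are simple enough to compute directly. The following sketch evaluates processing utilization, the weighted system utilization factor, and the reliability of one computation string. All function names, the weights, and the numeric values are made up for illustration, and the memory and link utilization figures are simply supplied as inputs rather than derived.

```python
from math import prod


def processing_utilization(assignment, task_pi, node_pi):
    """PU(TA): mean over all nodes of (assigned task demand / node capacity)."""
    load = {node: 0.0 for node in node_pi}
    for task, node in assignment.items():
        load[node] += task_pi[task]
    return sum(load[n] / node_pi[n] for n in node_pi) / len(node_pi)


def system_utilization(pu, mu, lu, weights):
    """SU(TA): weighted sum of processing, memory, and link utilization."""
    w1, w2, w3 = weights
    return w1 * pu + w2 * mu + w3 * lu


def string_reliability(node_ri, edge_ri):
    """Product of the reliability indices of the nodes and links of a string."""
    return prod(node_ri) * prod(edge_ri)


# Hypothetical configuration: two nodes, three tasks.
node_pi = {"C1": 10.0, "C2": 20.0}
task_pi = {"t1": 5.0, "t2": 5.0, "t3": 2.0}
assignment = {"t1": "C1", "t2": "C2", "t3": "C2"}

pu = processing_utilization(assignment, task_pi, node_pi)
su = system_utilization(pu, mu=0.30, lu=0.10, weights=(0.5, 0.3, 0.2))
rel = string_reliability(node_ri=[0.9, 0.9], edge_ri=[0.99])
```

Here C1 runs at 5/10 and C2 at 7/20 of capacity, so `pu` is their average, 0.425. A placement search would evaluate these functions for each candidate assignment and trade utilization against reliability.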
Specializations via Generative Programming
• GME-based POSAML language for POSA2 pattern language
• Generative programming to synthesize FOCUS and AspectC++ rules
• Synthesize specialized middleware stacks for distributed deployment of operational strings.
[Figure: specialized middleware stack. A container with demuxing & dispatching, marshaling, and protocol adapter layers; a Component Lifecycle Manager; and a crosscutting, configurable QoS Property Manager component covering concurrency, security, persistence, instrumentation, and others.]
(3) Run-time QoS-aware Mechanisms
• Focus on Autonomic Mechanisms
• Survivability & Fault tolerance
• Students involved:
– Jaiganesh Balasubramanian, Sumant Tambe, Jules White, Nishanth Shankaran
Work supported by DARPA ARMS Program, PI on subcontracts from Lockheed Martin ATL, BBN Technologies, & Telcordia
Distributed Virtual Container Approach
• Virtual Container Concept for Component M/W
– Based on a virtualization idea
– Spans boundaries across all the replicas, which could be placed on different physical nodes
– Provides a single point for resource provisioning & component programming
– Seamless environment for configuring FT, LB, online swapping
– Handles fine-grained checkpointing across all the replicas in virtual container
– Reliable multicast & state synchronization confined to a virtual container
– Maintains information about how the replicas are connected to the external component assemblies
• Salient features
– Provides an operating context for the components/assemblies requiring QoS
– Relieves programmer from having to configure the middleware for QoS support
– Clients are oblivious to replication
– Normal container programming model
– Middleware hides the virtualization details
[Figure: a virtual container spanning primary and secondary replicas]
Run-time QoS & Survivability Mechanisms
• A configurable approach to survivability including micro- (infrastructure) & macro- (assembly & operational string) level strategies
• Micro-level strategies monitor infrastructure state to make proactive decisions at
– Component level (swapping & migration)
– Middleware level (configurations)
– Component server level (process resource allocations)
– Node level (multiple components)
• Macro-level strategies monitor assembly health to make failover decisions
– Failover based on type of failover unit
– Affects service placement decisions
– May involve load balancing
– State synchronization issues
– Replication styles (hidden by FT strategies)
• Initial prototype developed using Component-Integrated ACE ORB (CIAO) & Deployment & Configuration Engine (DAnCE) (www.dre.vanderbilt.edu)
Research Summary
R&D in new, holistic approaches to end-to-end QoS management in services-enabled distributed real-time & embedded systems
[Figure: system layers: Applications, Middleware, OS & Protocols, Hardware]
Research Challenge: Managing problem space variability
– Research Approach: Model-driven generative approach to separation of concerns
– Benefits: Enhance the state-of-art in MDD and AOSD technologies
Research Challenge: Design-time “What-if” analysis using generative prog
– Research Approach: Variety of analysis techniques including non traditional mechanisms
– Benefits: Generative technologies for automated analysis; application of Engineering Mechanics
Research Challenge: Deployment-time intelligent decisions
– Research Approach: New applications of constraints optimization theory; middleware specializations
– Benefits: Near optimal deployment; specialized middleware stacks
Research Challenge: Run-time Mechanisms
– Research Approach: Multilevel, proactive QoS mgmt schemes; virtualization ideas
– Benefits: Largely autonomic; survivable systems