Dynamic adaptability
Phenix workshop on self-healing and fault tolerant systems
December 7-8, 2006 – IRISA, Rennes
Françoise André, IRISA – Prof. University of Rennes 1
Jérémy Buisson, IRISA – INSA of Rennes

Outline
  Dynamic adaptability
  Dynaco: generic framework for adaptability
  Afpac: tool for the adaptation of SPMD codes
  Evaluations
  Conclusion and future works

Adaptability
  A functionality of applications
    Ability of an application to modify (reconfigure) itself at runtime (dynamically) according to its execution environment
  Some synonyms for “adaptability”
    Autonomous computing, autonomic computing
      • More or less adaptability
      • Sometimes structured as provided functionalities, such as self-healing, self-optimization, …
    Adaptivity, autonomicity
  A related area
    Application steering
      • More or less adaptability triggered by users

Need for adaptability
  When resources vary in the execution environment
    Some resources may appear
    Some resources may disappear
    Possible causes
      • Faults; administrative tasks; resource sharing among users
  When an application has several configurations that use resources differently
    Different possible algorithms
    Some parameters that can be tuned
  Adaptability ensures that the application continuously executes the “best” configuration
    According to the actual execution environment

Overall goal
  Benefit from appearing resources
    Terminate sooner
  Support disappearing resources
    Avoid expected crashes

Adaptability in the PARIS team
  Studied for some years
    Initially
      • Mobile computing
      • Distributed computing
    Most recent work
      • Parallel computing
  Framework approach
    Ad-hoc implementations should be avoided
    The structure should highlight reusable tools
  Current prototypes
    Dynaco: generic framework for adaptability
    Afpac: tool for adapting SPMD codes

Other work on adaptability
  Many ad-hoc implementations
    Specific to one kind of adaptation
      • E.g. adapting the number of processes to the number of processors/machines, redistributing tasks [Paul et al., 1998]
    Specific to one application
      • E.g. video streaming [Plasma]
  Some (more or less generic) frameworks
    [EPSN]
  Some compiler approaches
    [ASSIST]
  Some semantic models
    [Zhang and Cheng, 2005]

Dynaco: a generic adaptability framework
  Decomposition of adaptability into 4 steps
    Observe the execution environment as it evolves
    Decide that the component should adapt
    Plan how to achieve the adaptation
    Schedule and execute planned actions
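
The four steps can be read as a simple control loop. The sketch below is illustrative only and is not Dynaco's API: the function names, the processor-count environment and the action names (`spawn`, `terminate`, `redistribute`) are all hypothetical.

```python
# Hypothetical sketch of the observe / decide / plan / execute loop.
# None of these names come from Dynaco; they only illustrate the decomposition.

def observe(previous_env, current_env):
    """Return the new environment, or None if nothing relevant changed."""
    return current_env if current_env != previous_env else None

def decide(env):
    """Choose a strategy: here, one process per available processor."""
    return {"processes": env["processors"]}

def plan(current_config, strategy):
    """Turn the strategy into an ordered list of predefined actions."""
    delta = strategy["processes"] - current_config["processes"]
    if delta > 0:
        return [("spawn", delta), ("redistribute", None)]
    if delta < 0:
        return [("redistribute", None), ("terminate", -delta)]
    return []

def execute(current_config, actions):
    """Apply the actions and return the new configuration."""
    config = dict(current_config)
    for name, arg in actions:
        if name == "spawn":
            config["processes"] += arg
        elif name == "terminate":
            config["processes"] -= arg
        # "redistribute" would move data; the process count is unchanged.
    return config

def adapt(config, previous_env, current_env):
    """Run one pass of the four steps."""
    event = observe(previous_env, current_env)
    if event is None:
        return config
    return execute(config, plan(config, decide(event)))

# Two more processors appear: the component grows from 2 to 4 processes.
print(adapt({"processes": 2}, {"processors": 2}, {"processors": 4}))
```

Keeping each step behind its own function mirrors the framework idea: any one of them can be swapped for another implementation without touching the others.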

Adaptability step 1: observe
  Collect information about the execution environment
    Connect to the monitoring infrastructure of the environment
  Detect relevant changes
    Trigger adaptability when the adaptable component may not be well adapted anymore
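
As a minimal sketch of "detect relevant changes", the code below compares two snapshots of the available resources and triggers only when they differ. The snapshot format (a set of node names) and the function name `detect_changes` are assumptions made for illustration; a real monitoring infrastructure would supply its own data.

```python
# Hypothetical observe step: compare two snapshots of the set of available
# processors and report only the changes.

def detect_changes(previous, current):
    """Return a list of (event, resource) pairs, empty if nothing changed."""
    events = [("appeared", r) for r in sorted(current - previous)]
    events += [("disappeared", r) for r in sorted(previous - current)]
    return events

before = {"node1", "node2"}
after = {"node2", "node3"}
events = detect_changes(before, after)
if events:  # trigger adaptability only when something changed
    print(events)
```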

Adaptability step 2: decide
  Find the best strategy
    With regard to a developer- or user-provided criterion
      • E.g. performance model
    Depending on information collected at the observe phase
  Possible implementations
    Any optimization algorithm
      • Depending on the properties of the criterion that should be optimized
    Expert systems and decision diagrams
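
One way to picture the decide step is an exhaustive search over candidate configurations, scored by a performance model. The model below (divisible work plus a linear per-process overhead) and its parameters are invented for illustration and are not taken from the talk.

```python
# Hypothetical decide step: pick the process count that minimizes a
# developer-provided performance model. The model itself is an assumption.

def predicted_time(work, processes, overhead_per_process=0.5):
    """Toy model: perfectly divisible work plus a linear overhead."""
    return work / processes + overhead_per_process * processes

def decide(work, available_processors):
    """Exhaustive search over the possible process counts."""
    candidates = range(1, available_processors + 1)
    return min(candidates, key=lambda p: predicted_time(work, p))

print(decide(work=100.0, available_processors=8))  # large job: use all 8
print(decide(work=8.0, available_processors=8))    # small job: 4 is enough
```

Exhaustive search is only reasonable because the search space is tiny here; the slide's point that the algorithm depends on the properties of the criterion applies as soon as the space grows.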

Adaptability step 3: planning
  Find how the decided strategy can be achieved
    Starting from the currently executing configuration
    Assembling predefined actions with some control flow
  Possible algorithms
    Planning algorithms
      • May be costly if too much expressivity is required
    Collection of predefined plans
      • Difficult to construct a sufficient collection
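
The "collection of predefined plans" option can be sketched as a lookup table keyed by the current and target configurations. The configuration names and action names below are hypothetical. The lookup failure shows why a sufficient collection is hard to build: every transition that may occur needs an entry.

```python
# Hypothetical planner backed by a collection of predefined plans.
# Keys are (current configuration, target strategy); values are action lists.
PREDEFINED_PLANS = {
    ("sequential", "parallel"): ["spawn_processes", "scatter_data"],
    ("parallel", "sequential"): ["gather_data", "terminate_processes"],
}

def plan(current, target):
    """Return the list of actions leading from `current` to `target`."""
    if current == target:
        return []  # nothing to do
    key = (current, target)
    if key not in PREDEFINED_PLANS:
        raise LookupError(f"no predefined plan from {current!r} to {target!r}")
    return list(PREDEFINED_PLANS[key])

print(plan("sequential", "parallel"))
```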

Adaptability step 4: execution
  Execute generated plans
    Schedule according to the dependencies highlighted in plans
    Synchronize with the applicative execution flows
  Possible implementations
    Hooks in the applicative code
      • Called “adaptation points”
      • Rendezvous at the next hook in applicative code
      • Rollback to the previous hook in applicative code
    Applicative code suspension
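
Scheduling according to dependencies amounts to ordering the actions of a plan topologically. The sketch below uses Python's standard `graphlib.TopologicalSorter` (Python 3.9+). The action names and their dependencies are invented for illustration, and this is not how Dynaco represents plans.

```python
from graphlib import TopologicalSorter

# Hypothetical plan: each action maps to the set of actions it depends on.
dependencies = {
    "spawn_processes": set(),
    "redistribute_data": {"spawn_processes"},
    "update_communicator": {"spawn_processes"},
    "resume_computation": {"redistribute_data", "update_communicator"},
}

def schedule(deps):
    """Return one execution order that respects every dependency."""
    return list(TopologicalSorter(deps).static_order())

order = schedule(dependencies)
print(order)
```

The two middle actions do not depend on each other, so either order is valid; an executor could also run them concurrently.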

Dynaco: a generic adaptability framework
  In order to instantiate the framework
    Choose implementations for the generic engines
    Implement policy, guide and actions

Dynaco: a generic adaptability framework
  Integrate the framework instance within the adaptable component
    Bind “actions” and “execute” to the content of the component
    Bind the framework to the monitoring infrastructure

Adaptation for parallel components
  Parallel components
    Components that encapsulate parallel codes
  Case of parallel components
    In the execute phase
      • Synchronize adaptation actions with the execution of the applicative code
      • Hook the applicative execution threads
    Adaptation points are global states

Afpac: adaptation for SPMD components
  Rendezvous at the upcoming global state hook
  Locally to each process, adaptation points are indicated by developers
    Call to an Afpac function
  Globally, adaptation points are built as the identity relation over local adaptation points
    SPMD code assumption

Afpac
  Distributed algorithm to find the upcoming adaptation point
    Iterative
      • Each process locally predicts upcoming local adaptation points
        – If prediction is impossible, wait for the applicative execution thread to progress
          » E.g. in case of conditional instructions
      • Each process gathers other processes’ predictions
      • As long as at least one process does not agree, rerun the algorithm
        – Each process computes a least upper bound according to other processes’ predictions
    Concurrent with the applicative execution thread
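
The iteration can be illustrated with a sequential simulation. This is a simplified sketch, not Afpac's distributed implementation. It assumes that adaptation points are totally ordered values, that each process is described by the sorted list of points it can still reach, and that the least upper bound is simply the maximum of the predictions.

```python
from bisect import bisect_left

def next_point(reachable, candidate):
    """First point this process can still reach at or after `candidate`."""
    i = bisect_left(reachable, candidate)
    if i == len(reachable):
        raise RuntimeError("process has no adaptation point left")
    return reachable[i]

def agree(processes):
    """Simulate the rounds; return (agreed point, number of rounds).

    `processes` is a list of sorted lists of reachable adaptation points.
    """
    candidate = min(points[0] for points in processes)
    rounds = 0
    while True:
        rounds += 1
        # Each process predicts, then all predictions are gathered.
        predictions = [next_point(points, candidate) for points in processes]
        bound = max(predictions)  # least upper bound of the predictions
        if all(p == bound for p in predictions):
            return bound, rounds
        candidate = bound         # at least one process disagrees: rerun

# Points are (iteration, hook index) pairs; the third process is further ahead.
print(agree([[(3, 0), (3, 1), (4, 0)], [(3, 1), (4, 0)], [(4, 0), (4, 1)]]))
```

In the example the first round yields three different predictions, so the processes rerun with the maximum as the new candidate and agree on `(4, 0)` in the second round.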

Afpac
  Requirements for the applicative code
    Tracking the progress of the execution in each process
      • Upon local adaptation points
      • Upon control structures containing adaptation points
    Predicting upcoming adaptation points
      • Control flow model of the applicative code
        – With the same granularity as above
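
To make the two requirements concrete, the sketch below tracks progress through a single loop whose body contains a fixed number of adaptation points, and predicts the next point from that trivial control flow model. The class and method names are hypothetical. Real codes with conditionals need a richer model, and prediction may then be impossible until the execution progresses.

```python
class ProgressTracker:
    """Tracks a code shaped like: for it in range(iterations): p0; p1; ..."""

    def __init__(self, iterations, points_per_iteration):
        self.iterations = iterations
        self.points = points_per_iteration
        self.position = None  # last adaptation point passed, if any

    def passed(self, iteration, point):
        """Called by the applicative code upon each local adaptation point."""
        self.position = (iteration, point)

    def predict_next(self):
        """Next adaptation point according to the control flow model."""
        if self.position is None:
            nxt = (0, 0)
        else:
            iteration, point = self.position
            if point + 1 < self.points:
                nxt = (iteration, point + 1)
            else:
                nxt = (iteration + 1, 0)
        return nxt if nxt[0] < self.iterations else None

tracker = ProgressTracker(iterations=2, points_per_iteration=2)
tracker.passed(0, 1)
print(tracker.predict_next())  # end of iteration 0, so the next point is (1, 0)
```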

Taco: AOP tool easing the use of Afpac
  Specific aspect weaver
    Handling of control structures
      • Source code transformation for inserting calls upon control structures
    Extraction of the control flow model
  Task still belonging to developers
    Indicating local adaptation points

Examples of using Dynaco
  FT (from the NAS Parallel Benchmarks suite): numerical kernel
    Adapting the number of processes to the number of available processors
      • i.e. implementing malleability
  Gadget 2: N-body simulator
    Adapting the data distribution to load imbalance
      • i.e. revisiting load balancing
  Dad: home-made genetic algorithm
    Adapting the implementation to the underlying architecture
      • Including the communication facilities

Summary
  Dynaco: a generic framework for adaptability
    Independent of the application
      • E.g.: numerical algorithms, transactional systems
    Independent of formalisms and technologies
      • E.g.: 3 interchangeable formalisms for the policy in the current implementation
        – Objective function, optimized by a genetic algorithm
        – Collection of condition-action rules, interpreted by the Jess expert system
        – Plain Java code, executed by a JVM
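
The condition-action formalism can be pictured as an ordered list of rules where the first matching condition selects the strategy. The sketch below is plain Python, not Jess syntax, and the rule contents and environment fields are hypothetical.

```python
# Hypothetical condition-action policy: the first rule whose condition
# holds selects the strategy. This is not Jess syntax.
RULES = [
    (lambda env: env["processors"] > env["processes"], "grow"),
    (lambda env: env["processors"] < env["processes"], "shrink"),
]

def decide(env):
    """Return the strategy of the first matching rule, or None."""
    for condition, strategy in RULES:
        if condition(env):
            return strategy
    return None  # the component is already well adapted

print(decide({"processors": 8, "processes": 4}))
```

Because the policy is only called through `decide`, it could be replaced by an objective function or by hand-written code without changing the rest of the framework, which is the interchangeability the slide describes.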

Ongoing and future work
  Trying to reduce applicative code suspension while selecting a global adaptation point
    Designing a speculative algorithm
    Guessing what other processes will do, rather than waiting for those processes to do it
      • Compensating small desynchronizations
    Using rollback in case of wrong prediction
  Designing a dialogue between grid resource managers and adaptable applications
    Investigating how resource managers and adaptable applications can mutually benefit from each other
      • Better resource management
      • Avoid considering rescheduling as faults
        – Avoid using checkpoint/restart

Long term goals
  Connections with fault tolerance
    Making Dynaco resilient
      • Dynaco is centralized
        – Even if able to command the adaptation of parallel applications
      • The process executing the Dynaco framework must never fail
        – Furthermore, it should be able to execute actions
    Implementing fault tolerance with Dynaco
      • One adaptation action may be “restart from checkpoint”
      • Adaptability would allow restarting with a different behavior/implementation
    Using fault tolerance features for adaptability
      • Several adaptability implementations use checkpoint/restart
      • It can be useful to implement speculative adaptation point selection