A Chemical Workflow Engine for Scientific Workflows with Dynamicity
Support
Manuel Caeiro, Zsolt Nemeth, Thierry Priol
Manuel Caeiro: CoreGRID Post Doc, IRISA (Rennes, France) and MTA SZTAKI (Budapest, Hungary); Associated Teacher, University of Vigo, Spain; [email protected]
Zsolt Nemeth: MTA SZTAKI, Budapest, Hungary
Thierry Priol: IRISA, Rennes, France
Manuel Caeiro / A Chemical Workflow Engine for Scientific Workflows with Dynamicity Support 2
Outline of the Presentation
1. Introduction
  • Scientific Workflows
  • The Chemical Computation Model
2. Proposal
  • The Scientific Workflow Language
  • The Chemical Workflow Engine
  • Dynamicity Support
3. Validation
4. Conclusions and Future Works
1. Introduction
This work has been performed in the context of the CoreGRID Excellence Network
• IRISA (Rennes): December 2007 – March 2008
• SZTAKI (Budapest): April 2008 – August 2008
[Map labels: Vigo, Rennes, Budapest]
1. Introduction: Scientific Workflows
Scientific applications and experiments involve:
• Large number of operations
• Large data sets
• Complex algorithms
Earth Sciences
Biology
Medical Image Analysis
Astronomy
Weather Prediction
Sub-atomic Physics
1. Introduction: Scientific Workflows
Dynamicity is intrinsic to Scientific Workflows
• Scientists usually introduce modifications and variations in their experiments
• Scientific workflows are not always completely specified
• Data is known dynamically during execution
• Data is distributed and mobile
• The resources are not fixed, but they change during workflow execution
1. Introduction: Scientific Workflows
Dynamicity Requirements (1/2)
– Monitoring
  • To observe the progress of the workflow
  • To obtain the partial and final results
– Automatic Control
  • To support the detection of errors, problems
  • To support the control of data values and events
– Reproducibility
  • To enable the reproduction of the execution
  • It is important to validate the results
– Smart “re-runs”
  • To be able to re-start at an already performed stage
– Version Management
  • To support and distinguish different “attempts”
1. Introduction: Scientific Workflows
Dynamicity Requirements (2/2)
– User steering
  • VCR-like: pause, play, roll-back, etc.
  • Checkpoints
– User Manipulation
  • To be able to change the abstract workflow descriptions
  • To be able to change the data and the parameters
– Adaptation in the Workflow Language
  • Controlled change of workflows
  • Parametric studies
– Adaptation in the Workflow Management System
  • Support execution with different resources
  • Support changes in task assignment to resources and services’ instances
[Side labels: User Driven, Autonomous]
1. Introduction: The Chemical Computation Model
Main Idea: Computation as chemical reactions
Programs are conceived as chemical solutions: sets of molecules of different types that react with one another according to specific reaction conditions and actions
1. Introduction: The Chemical Computation Model
Molecule types:
– Variables (data)
– Reaction conditions and Actions (instructions)
– Molecule Aggregations (pairs)
– Solutions
  • A solution is a container of molecules in which chemical computations can take place
Computation:
– A molecule with a reaction condition “matches” another molecule (or set of molecules) that satisfies its condition
– The molecules react and the actions are performed
– The matched molecules are consumed
– New molecules are created
– Return to step 1 until the solution is inert
1. Introduction: The Chemical Computation Model
An example: Compute the maximum value of a set of numbers
– Chemical solution:
  • Numbers: 1, 2, 7, 8, 9
  • Reaction condition and action: Match x, y; if x>y then replace x, y by x
[Figure: a chemical solution holding the numbers 1, 2, 7, 8, 9 as passive molecules and the reaction condition and action as the active molecule]
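The maximum example can be sketched in Python as follows. This is only an illustration, assuming the solution is held in a plain list and that a reacting pair is picked at random; the name `react_max` and the `seed` parameter are hypothetical and do not come from the original work.

```python
import random


def react_max(solution, seed=None):
    """Apply 'match x, y; if x > y then replace x, y by x' until the solution is inert."""
    rng = random.Random(seed)
    molecules = list(solution)
    while True:
        # Every ordered pair of distinct molecules with x > y satisfies the condition.
        candidates = [
            (i, j)
            for i in range(len(molecules))
            for j in range(len(molecules))
            if i != j and molecules[i] > molecules[j]
        ]
        if not candidates:
            return molecules  # no pair can react: the solution is inert
        # Non-determinism: any matching pair may react next.
        _, j = rng.choice(candidates)
        # x and y are consumed and x is produced, so only y disappears.
        del molecules[j]


print(react_max([1, 2, 7, 8, 9], seed=0))
```

Whatever order the reactions happen in, the solution ends with the maximum as its only molecule. With the strict condition x > y, equal values do not react with each other, so duplicates of the maximum would remain in the inert solution.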
1. Introduction: The Chemical Computation Model
Main properties of the chemical computation model:
• Inherently concurrent
• Natural parallelism. No serialization is imposed
• Non determinism
2. Proposal
Goal:
• To develop a workflow engine for scientific applications based on the chemical computation model and supporting dynamicity
Steps:
• The Scientific Workflow Language
• The Chemical Workflow Engine
• The Support of Dynamicity
2. Proposal: The Scientific Workflow Language
No generally accepted scientific workflow language:
• There exist several languages
• Two main approaches: control-flow and data-flow
• Specific data operators:
  o SCUFL: one-to-one, all-to-all
  o ASKALON: large data set loops
• Solution adopted:
  • To propose a new workflow language involving the most common constructs
2. Proposal: The Scientific Workflow Language
Main Features:
• It is an extension to Event-driven Process Chains (EPCs)
• Events represent the state
• Data Elements are related to Events (inputs and outputs of Functions)
• Resources are used to process Functions
• Connector Types: AND/OR/XOR-split/Join, Sub-process, Loops, Data-Loops, O2O, A2A
[Legend: Function, Connector, Event, Data Element, Resource]
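The two data connectors named above, O2O and A2A, can be illustrated with a short Python sketch. It assumes that O2O pairs the i-th element of one data collection with the i-th element of another, and that A2A combines every element of one collection with every element of the other; the function names are hypothetical and the exact semantics in the proposed language may differ.

```python
from itertools import product


def one_to_one(data_a, data_b):
    """O2O (assumed semantics): pair items position by position."""
    if len(data_a) != len(data_b):
        raise ValueError("one-to-one needs collections of equal length")
    return list(zip(data_a, data_b))


def all_to_all(data_a, data_b):
    """A2A (assumed semantics): pair every item of data_a with every item of data_b."""
    return list(product(data_a, data_b))


a = ["A.1", "A.2"]
b = ["B.1", "B.2"]
print(one_to_one(a, b))  # 2 pairs
print(all_to_all(a, b))  # 4 pairs
```

Under these assumptions, N items on each side give N function activations for O2O and N × N for A2A.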
2. Proposal: The Scientific Workflow Language
An Example: The VIEM workflow from ASKALON
[Figure: workflow diagram. Labels: LAPW0; Data-LOOP-split; Init; R1; Event1; LAPW1-K1, LAPW1-K2, ..., LAPW1-Kn; Event21, Event22, ..., Event2n; Event31, Event32, ..., Event3n; Data-LOOP-join; R2; Data1; Data21; Data31]
2. Proposal: The Chemical Workflow Engine
Two main kinds of molecules:
• Active Molecules: Function, Connector
• Passive Molecules: Event, Data Element, Resource
[Figure: reaction sketch with "Connector + Event(s) + Data Element(s)" on one side and "Event(s) + Data Element(s)" on the other]
2. Proposal: The Chemical Workflow Engine
Functions evolve through 4 states:
• Disabled: the function is not activated; it has not yet matched its input Event
• Enabled: the function has not yet matched its input Data Elements
• Ready: the function has not yet been assigned to appropriate Resources
• Initiated: the function is being performed
Each state is represented by a different molecule
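The four Function states can be read as a small state machine. The Python sketch below assumes that a Function leaves Disabled when its input Event is matched, leaves Enabled when its input Data Elements are matched, and leaves Ready when a Resource is assigned. The names `FunctionState`, `TRANSITIONS` and `advance` are hypothetical.

```python
from enum import Enum


class FunctionState(Enum):
    DISABLED = "disabled"
    ENABLED = "enabled"
    READY = "ready"
    INITIATED = "initiated"


# For each state: (kind of passive molecule that must be matched, next state).
TRANSITIONS = {
    FunctionState.DISABLED: ("event", FunctionState.ENABLED),
    FunctionState.ENABLED: ("data", FunctionState.READY),
    FunctionState.READY: ("resource", FunctionState.INITIATED),
}


def advance(state, available):
    """Return the next state if the needed kind of molecule is available, else the same state."""
    if state not in TRANSITIONS:
        return state  # Initiated has no further transition in this sketch
    needed, next_state = TRANSITIONS[state]
    return next_state if needed in available else state


state = FunctionState.DISABLED
for kinds in ({"event"}, {"data"}, {"resource"}):
    state = advance(state, kinds)
print(state)
```

In the engine itself each state is a different molecule, so a transition would replace one molecule by another rather than update a field.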
2. Proposal: The Chemical Workflow Engine
[Figure: chemical solution. Labels: Disabled Functions, Disabled Connectors, Events, Data Elements, Resources, Enabled Function, Ready Function, Initiated Function; state sequence Disabled, Enabled, Ready, Initiated]
2. Proposal: The Chemical Workflow Engine
Connectors evolve through 2 states:
• Disabled: the connector is not activated; it has not yet matched its input Event(s)
• Enabled: the connector has not yet matched its input Data Elements
Each state is represented by a different molecule
3. An HOCL Workflow Engine
[Figure: Data One-to-One Connector in the chemical solution. Labels: Disabled Functions, Disabled Connectors, Events, Data Elements, Resources; Functions F.A, F.B, F.C; Events Ev.1, Ev.2, Ev.3.1 … 3.N; D.A.1..n, D.B.1..n; Data A.1,2, …, N; Data B.1,2, …, N; Data C.1,2, …, N; + 1 Connector, + 2 Connector, + N Connector]
2. Proposal: The Chemical Workflow Engine
Structure of the Chemical Workflow Engine:
• Separated in 4 sub-solutions: one for each state
• Transfer of molecules among sub-solutions
Operations in the Workflow Engine:
• Compilation: the molecules representing the Disabled Functions and Connectors corresponding to the process definition are introduced
• Data Population: the molecules representing the Input Data Elements related with a case are introduced
• Resource Population: the molecules representing the available Resources are introduced
• Instance Creation: the molecules representing the initial Events are introduced
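The four operations can be pictured as methods that introduce molecules into the solution. The Python sketch below is only a data-structure illustration: the class name, the method names and the choice to keep passive molecules in separate lists are assumptions, and no reactions are modelled.

```python
from dataclasses import dataclass, field

STATES = ("disabled", "enabled", "ready", "initiated")


@dataclass
class ChemicalWorkflowEngine:
    # One sub-solution per state of the active molecules.
    sub_solutions: dict = field(default_factory=lambda: {s: [] for s in STATES})
    # Passive molecules, kept in separate lists (an assumption of this sketch).
    events: list = field(default_factory=list)
    data_elements: list = field(default_factory=list)
    resources: list = field(default_factory=list)

    def compile(self, functions, connectors):
        """Compilation: introduce the Disabled Functions and Connectors."""
        self.sub_solutions["disabled"].extend(functions)
        self.sub_solutions["disabled"].extend(connectors)

    def populate_data(self, input_data):
        """Data Population: introduce the input Data Elements of a case."""
        self.data_elements.extend(input_data)

    def populate_resources(self, resources):
        """Resource Population: introduce the available Resources."""
        self.resources.extend(resources)

    def create_instance(self, initial_events):
        """Instance Creation: introduce the initial Events."""
        self.events.extend(initial_events)


engine = ChemicalWorkflowEngine()
engine.compile(["LAPW0"], ["Data-LOOP-split"])
engine.populate_data(["Data1"])
engine.populate_resources(["R1"])
engine.create_instance(["Init"])
print(engine.sub_solutions["disabled"], engine.events)
```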
2. Proposal: The Chemical Workflow Engine
[Figure: the operations Compilation, Data Population, Instance Creation and Resource Population; label: Input Data]
2. Proposal: The Chemical Workflow Engine
Identifiers:
• Element Identifier: distinguishes among the several elements included in a process specification.
• Process Schema Identifier: distinguishes among process specifications.
  • It has two parts: a process number and a version number.
  • Included in Functions, Connectors and Events.
• Instance Identifier: distinguishes among the several instances.
  • It includes a thread identifier (numbered Data Elements).
  • Included in Events and Data Elements and also in Functions and Connectors in states Enabled, Ready and Initiated.
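The identifiers can be sketched as small Python records. The class names, field names, the `input_event` field and the `enable` function are hypothetical; the sketch only reflects that a Disabled Function carries no Instance Identifier and, as an assumption, receives one from the Event it matches.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProcessSchemaId:
    process: int  # process number
    version: int  # version number


@dataclass(frozen=True)
class InstanceId:
    instance: int
    thread: int = 0  # thread identifier


@dataclass(frozen=True)
class Event:
    element: str  # element identifier
    schema: ProcessSchemaId
    instance: InstanceId


@dataclass(frozen=True)
class Function:
    element: str  # element identifier
    schema: ProcessSchemaId
    input_event: str  # element identifier of the awaited Event (hypothetical field)
    state: str = "disabled"
    instance: Optional[InstanceId] = None  # absent while Disabled


def enable(function, event):
    """Return an Enabled copy bound to the Event's instance, or None if they do not match."""
    if function.state != "disabled":
        return None
    if function.schema != event.schema or function.input_event != event.element:
        return None
    return Function(function.element, function.schema, function.input_event,
                    "enabled", event.instance)


f = Function("LAPW0", ProcessSchemaId(1, 1), "Init")
e = Event("Init", ProcessSchemaId(1, 1), InstanceId(7))
print(enable(f, e))
```

Because the Disabled Function is left untouched, the same specification molecule could be matched again by an Event of another instance.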
2. Proposal: Dynamicity Support
Dynamicity is supported in several ways:
• A workflow specification can be modified by changing the Functions and Connectors contained in the disabled sub-solution.
• The distinction between Event and Data Element molecules makes it possible to separate the workflow specification from the data to be processed.
• Several workflow instances can be initiated and executed in parallel. Disabled molecules are not eliminated.
• The availability of Event molecules makes it possible to develop a steering facility.
• Data Element molecules are not eliminated. This enables the development of monitoring, “smart re-runs” and provenance solutions.
2. Proposal: Dynamicity Support
Addendums to the Identifiers:
• Addendum to the Process Schema Identifier
  • Makes it possible to use modified versions of an existing process specification just by including the new molecules.
• Addendum to the Instance Identifier
  • Makes it possible to use the data of another instance execution.
We support the 13 change patterns proposed in [18]:
3. Validation
Developed in CLIPS:
• CLIPS provides an environment for the construction of rule-based expert systems
• CLIPS programming is performed by assertions and rules
  • Assertions are used to maintain information
  • Rules specify a certain action to be performed when a condition is satisfied
• To validate the CWE we used two kinds of assertions and specific rules:
  • Active molecule assertions of two types (Function and Connector) and four possible states (Disabled, Enabled, Ready, Initiated)
  • Passive molecule assertions of three types (Event, Data Element and Resource)
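The assertion-and-rule style can be imitated in a few lines of Python. This is not CLIPS syntax and not the authors' rule set: the fact layout, the `enable_function` rule and the `run` loop are hypothetical, and the rule keeps the Disabled fact on the assumption that disabled molecules are not eliminated.

```python
def enable_function(facts):
    """Rule: a Disabled Function whose input Event is asserted yields an Enabled Function."""
    derived = set()
    for fact in facts:
        if fact[0] == "function" and fact[2] == "disabled":
            _, name, _, event = fact
            if ("event", event) in facts:
                derived.add(("function", name, "enabled", event))
    return derived


def run(facts, rules):
    """Fire the rules repeatedly until no new assertion is produced."""
    facts = set(facts)
    while True:
        derived = set().union(*(rule(facts) for rule in rules)) - facts
        if not derived:
            return facts
        facts |= derived


initial = {
    ("function", "LAPW0", "disabled", "Init"),  # active molecule assertion
    ("event", "Init"),                          # passive molecule assertion
}
for fact in sorted(run(initial, [enable_function])):
    print(fact)
```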
4. Conclusions
Summary:
• Scientific workflows are gaining great momentum
• Dynamicity is an intrinsic need in scientific workflows
• A workflow engine based on the Chemical Computation Model has been conceived supporting dynamicity needs
[Figure labels: Scientific Workflow, Chemical Workflow Engine, CLIPS]
Future Work:
• To provide an actual validation
4. Conclusions
Opportunities from the Chemical Computation Model:
• It is parallel in nature: it facilitates the distribution of computations, and parallelization is obtained in a transparent way
  • Workflows can be specified in the same way
  • Execution of workflows is automatically parallelized
• Change of the role of resources:
  – Central “chemical solution” vs. central Workflow engine
  – Pull-oriented vs. Push-oriented
Questions and Comments are welcome!!!