Workflow Topics for the Next-Generation SDM-Center
Ilkay Altintas
Bertram Ludäscher
San Diego Supercomputer Center
UC Davis, Department of Computer Science
SciDAC SDM AHM, Oct 5-6, 2005, NCSU, Raleigh, NC
Sir Walter Raleigh
SDM-AHM-10-05 NCSU Next SDM-C: Workflows
Overview
• Kepler/SPA:
– What we have (The GOOD)
– What we don’t (yet) have (The BAD)
– What we really need?? (The UGLY)
Things we might do; prioritization
Macro Definitions …
• #define KEPLER KEPLER/SPA
• #define KEPLER KEPLER*SPA
• By the end:
• #define SPA KEPLERHPC
What we have – The GOOD
• Big Heritage from Ptolemy II
– Vergil GUI for design and (some) execution monitoring
– Actor-Oriented Modeling & Design
  • Director / Actor Separation
  • Models of Computation: PN, SDF, DE, …
  • Nested Workflows & Hierarchical Modeling
  • Research Results on Modeling Complex Systems
    – modal models, mobile models, reconfig’able models, model lifecycle management, higher-order actors, …
head-start for CCA Extensions, e.g.
  • SciRUN-2 Extensions (Steve P. et al.)
  • Self-Managing, Dynamically-Adaptive, Autonomous Components (Manish et al.)
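The director/actor separation listed above can be illustrated with a minimal sketch. This is plain Python, not the Ptolemy II or Kepler API; the class names `Actor`, `SequentialDirector`, and `StagewiseDirector` are hypothetical. The point is only that actors describe what is computed, while the director decides the firing schedule, so the schedule can be swapped without touching the actors.

```python
from typing import Callable, List


class Actor:
    """A component that only knows how to transform one input token."""

    def __init__(self, name: str, func: Callable[[int], int]):
        self.name = name
        self.func = func

    def fire(self, token: int) -> int:
        return self.func(token)


class SequentialDirector:
    """Hypothetical director: pushes each token through the whole pipeline."""

    def run(self, pipeline: List[Actor], tokens: List[int]) -> List[int]:
        results = []
        for token in tokens:
            for actor in pipeline:
                token = actor.fire(token)
            results.append(token)
        return results


class StagewiseDirector:
    """Hypothetical director: pushes the whole stream through one actor at a time."""

    def run(self, pipeline: List[Actor], tokens: List[int]) -> List[int]:
        stream = list(tokens)
        for actor in pipeline:
            stream = [actor.fire(t) for t in stream]
        return stream


# The same actors can run under either director.
pipeline = [Actor("double", lambda x: 2 * x), Actor("increment", lambda x: x + 1)]
```

Both directors produce the same results for this side-effect-free pipeline; they differ only in the order in which actors fire.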
What we have – The GOOD
• Kepler Extensions (to Ptolemy II)
– Mostly: loosely coupled, e.g. WS (web service) workflows
– Many generic actors
  • ssh, scp, cmd-line, SRB, Globus, …
  • new R expression actor
– Many custom actors
  • e.g. in PIW, TSI-1, TSI-2, GEON, SEEK, Resurgence, …
– Several ad-hoc extensions & (initial) research, e.g.
  • External job scheduling (e.g. NIMROD, …)
  • Director extensions (fault tolerance via WS “retry”)
  • WF-Templates (structured combination of dataflow & control-flow: fault-tolerance, reusability)
  • Higher-order functions (map/3, iterate-over-array, …: simpler control-flow, optimization potential, …)
Some KEPLER Actors (out of 160+ … and counting…)
What we have – The GOOD
• Kepler Extensions (Cont’d)
– Some generic extensions
  • Metadata-based (EML/ADN) Dataset Search
  • Concept-based Actor Search (OWL)
  • Documentation Framework
  • Authentication & Authorization Framework (GAMA from GEON)
  • Improved component/WF archival & plug-in (KAR, …)
  • Provenance Recorder (“Listener”)
PS … a growing open-source developers community …
… and some scientific users … (TSI-1/2, PIW, GEON, SEEK, … )
Concept-based Actor Search
– Implemented as proof-of-concept
  • Additional operations slated for next Kepler release (data search, port-based actor search, etc.)
Biggest Challenges
– Building/searching a repository …
– Making changes to MoML (see KAR)
– GUI changes
– Ontology management
[Figure: Concept-based Actor Search. Labels: Workflow Components (MoML/KAR); Semantic Annotations (urn ids, instance expressions); Ontologies (OWL), Default + Other]
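A concept-based search of this kind can be sketched as a walk up a subclass hierarchy. The toy ontology and annotations below are assumptions made up for illustration (only the concept name `ArithmeticMathOperationActor` appears in the slides); a real implementation would query an OWL ontology rather than a Python dictionary.

```python
# Toy ontology (assumed): each concept maps to its parent, None for the root.
SUBCLASS_OF = {
    "Actor": None,
    "MathOperationActor": "Actor",
    "ArithmeticMathOperationActor": "MathOperationActor",
    "DataSourceActor": "Actor",
}

# Semantic annotations (assumed): actor name -> annotated concept.
ANNOTATIONS = {
    "Multiply or Divide": "ArithmeticMathOperationActor",
    "Add or Subtract": "ArithmeticMathOperationActor",
    "File Reader": "DataSourceActor",
}


def is_subconcept(concept, ancestor):
    """Return True if `concept` equals `ancestor` or lies below it."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = SUBCLASS_OF.get(concept)
    return False


def search_actors(query_concept):
    """Find all actors annotated with the query concept or any subconcept."""
    return sorted(
        actor
        for actor, concept in ANNOTATIONS.items()
        if is_subconcept(concept, query_concept)
    )
```

Searching for the broader concept `MathOperationActor` also returns actors annotated with the narrower `ArithmeticMathOperationActor`, which is what distinguishes concept-based from keyword-based search.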
The GOOD: Kepler Archives
• Purpose: Encapsulate WF data and actors in an archive file
– … inlined or by reference
– … version control
More robust workflow exchange
Easy management of semantic annotations
Plug-in architecture (Drop in and use)
Easy documentation updates
• A jar-like archive file (.kar) including a manifest
• All entities have unique ids (LSID)
• Custom object manager and class loader
• UI and API to create, define, search and load .kar files
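The idea of a jar-like archive with a manifest and LSID-identified entries can be sketched with Python's standard `zipfile` module. The manifest layout used here (a `Name:` line followed by an `lsid:` line per entry) is an assumption for illustration; the slides do not show the actual KAR manifest format, and `build_archive` and `find_by_lsid` are hypothetical helper names.

```python
import io
import zipfile


def build_archive(entries):
    """Build an in-memory zip from {lsid: (path, content)} plus a manifest."""
    manifest_lines = ["Manifest-Version: 1.0"]
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for lsid, (path, content) in entries.items():
            zf.writestr(path, content)
            # Assumed manifest layout: one Name/lsid pair per archived entity.
            manifest_lines += ["", f"Name: {path}", f"lsid: {lsid}"]
        zf.writestr("META-INF/MANIFEST.MF", "\n".join(manifest_lines) + "\n")
    return buf.getvalue()


def find_by_lsid(archive_bytes, lsid):
    """Look up an entry by its LSID via the manifest; None if absent."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        manifest = zf.read("META-INF/MANIFEST.MF").decode()
        name = None
        for line in manifest.splitlines():
            if line.startswith("Name: "):
                name = line[len("Name: "):]
            elif line == f"lsid: {lsid}" and name is not None:
                return zf.read(name).decode()
    return None
```

Keeping the id-to-path mapping in the manifest means a loader can resolve an entity by id without scanning every file in the archive.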
KAR File Example
<entity name="Multiply or Divide" class="ptolemy.kernel.ComponentEntity">
  <property name="entityId" value="urn:lsid:localhost:actor:80:1" class="org.kepler.moml.NamedObjId"/>
  <property name="documentation" class="org.kepler.moml.DocumentationAttribute"></property>
  <property name="class" value="ptolemy.actor.lib.MultiplyDivide" class="ptolemy.kernel.util.StringAttribute">
    <property name="id" value="urn:lsid:localhost:class:955:1" class="ptolemy.kernel.util.StringAttribute"/>
  </property>
  <property name="multiply" class="org.kepler.moml.PortAttribute">
    <property name="direction" value="input" class="ptolemy.kernel.util.StringAttribute"/>
    <property name="dataType" value="unknown" class="ptolemy.kernel.util.StringAttribute"/>
    <property name="isMultiport" value="true" class="ptolemy.kernel.util.StringAttribute"/>
  </property>
  <property name="divide" class="org.kepler.moml.PortAttribute">
    <property name="direction" value="input" class="ptolemy.kernel.util.StringAttribute"/>
    <property name="dataType" value="unknown" class="ptolemy.kernel.util.StringAttribute"/>
    <property name="isMultiport" value="true" class="ptolemy.kernel.util.StringAttribute"/>
  </property>
  <property name="output" class="org.kepler.moml.PortAttribute">
    <property name="direction" value="output" class="ptolemy.kernel.util.StringAttribute"/>
    <property name="dataType" value="unknown" class="ptolemy.kernel.util.StringAttribute"/>
    <property name="isMultiport" value="false" class="ptolemy.kernel.util.StringAttribute"/>
  </property>
  <property name="semanticType00" value="http://seek.ecoinformatics.org/ontology#ArithmeticMathOperationActor" class="org.kepler.sms.SemanticType"/>
</entity>
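To show how such an entity description might be consumed, the following sketch reads a shortened copy of the example with Python's standard `xml.etree.ElementTree` and extracts the entity id and port directions. It is not Kepler's own loader; the function name `describe` and the shape of the returned dictionary are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the KAR example: id plus two of the three ports.
MOML = """<entity name="Multiply or Divide" class="ptolemy.kernel.ComponentEntity">
  <property name="entityId" value="urn:lsid:localhost:actor:80:1" class="org.kepler.moml.NamedObjId"/>
  <property name="multiply" class="org.kepler.moml.PortAttribute">
    <property name="direction" value="input" class="ptolemy.kernel.util.StringAttribute"/>
  </property>
  <property name="output" class="org.kepler.moml.PortAttribute">
    <property name="direction" value="output" class="ptolemy.kernel.util.StringAttribute"/>
  </property>
</entity>"""


def describe(moml_text):
    """Return the entity name, its LSID, and a port-name -> direction map."""
    root = ET.fromstring(moml_text)
    info = {"name": root.get("name"), "id": None, "ports": {}}
    for prop in root.findall("property"):
        cls = prop.get("class")
        if cls == "org.kepler.moml.NamedObjId":
            info["id"] = prop.get("value")
        elif cls == "org.kepler.moml.PortAttribute":
            direction = prop.find("property[@name='direction']")
            info["ports"][prop.get("name")] = (
                direction.get("value") if direction is not None else None
            )
    return info
```

Dispatching on the `class` attribute rather than on the property name keeps the reader independent of how individual ports are named.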
Kepler Object Manager
• Designed to access local and distributed objects
• Objects: data, metadata, annotations, actor classes, supporting libraries, native libraries, etc. archived in kar files
• Advantages:
– Reduce the size of Kepler distribution
  • Only ship the core set of generic actors and domains
– Easy exchange of full or partial workflows for collaborations
– Publish full workflows with their bound data
  • Becomes a provenance system for derived data objects
=> Separate SPA workflow repository and distribution
Provenance Framework
• Provenance
– Track origin and derivation information about scientific workflows, their runs and derived information (datasets, metadata, …)
• Need for Provenance
– Association of process and results
– reproduce results
– “explain & debug” results (via lineage tracing, parameter settings, …)
– optimize: “Smart Re-Runs”
• Types of Provenance Information:
– Data provenance
  • Intermediate and end results including files and db references
– Process (= workflow instance) provenance
  • Keep the wf definition with data and parameters used in the run
– Error and execution logs
– Workflow design provenance (quite different)
  • WF design is a (little supported) process (art, magic, …)
  • for free via cvs: edit history
  • need more “structure” (e.g. templates) for individual & collaborative workflow design
Kepler Provenance Recording Utility
• Parametric and customizable
– Different report formats
– Variable levels of detail
  • Verbose-all, verbose-some, medium, on error
– Multiple cache destinations
• Saves information on
– User name, Date, Run, etc…
Joint work with Oscar Barney
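A recorder with variable levels of detail could look roughly like the sketch below. It is not the Kepler provenance recorder; the class `ProvenanceRecorder`, the ordering of the four levels, and the rule that errors are always recorded are assumptions chosen to match the level names on the slide.

```python
from datetime import datetime, timezone

# Assumed ordering of the slide's detail levels, least to most verbose.
LEVELS = {"on-error": 0, "medium": 1, "verbose-some": 2, "verbose-all": 3}


class ProvenanceRecorder:
    """Hypothetical listener that keeps run metadata and filtered events."""

    def __init__(self, user, run_id, level="medium"):
        self.threshold = LEVELS[level]
        self.header = {
            "user": user,
            "run": run_id,
            "date": datetime.now(timezone.utc).isoformat(),
        }
        self.events = []

    def record(self, actor, message, detail="medium", error=False):
        """Keep the event if it is an error or within the configured detail."""
        if error or LEVELS[detail] <= self.threshold:
            self.events.append(
                {"actor": actor, "message": message, "error": error}
            )
```

With the level set to `on-error`, ordinary events are dropped and only failures are kept, which keeps the record small for routine runs.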
Provenance: Next Steps
• .kar file generation, registration and search for provenance information
• Possible data/metadata formats
• Automatic report generation from accumulated data
• A relational schema for the provenance info in addition to the existing XML
• Smart re-runs
The Future
• From GOOD via BAD to UGLY
• The good news (about ‘bad’ and ‘ugly’)
– Lots of interesting challenges!
– … so ‘ugly’ is actually good!
What we don’t (yet) have … THE BAD
• Much is still to do (or still ongoing)
– Detached execution
  • many options; depend on requirements
– Kepler WF repository w/ dynamic actor plug-in
– Smart Reruns
  • avoid doing (old) work twice
– Smarter Reruns (too smart?)
  • reuse previous results for speed-up of (new) work
– NIMROD Director, CONDOR Director …
– Task manager / monitor
– Support for WF design & reuse
  • Semantic extensions
  • “Design Patterns”, Templates
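The “avoid doing (old) work twice” idea behind smart reruns can be sketched as a result cache keyed on what went into a step. This is one possible reading, not a Kepler feature; the class `RerunCache` and the choice to hash actor name, parameters, and inputs together are assumptions.

```python
import hashlib
import json


class RerunCache:
    """Hypothetical cache: re-execute a step only if its inputs changed."""

    def __init__(self):
        self._results = {}
        self.executions = 0  # counts real executions, for illustration

    @staticmethod
    def _key(actor_name, params, inputs):
        # Provenance-style fingerprint of everything that determines the result.
        payload = json.dumps([actor_name, params, inputs], sort_keys=True)
        return hashlib.sha256(payload.encode()).hexdigest()

    def run(self, actor_name, func, params, inputs):
        key = self._key(actor_name, params, inputs)
        if key not in self._results:
            self.executions += 1
            self._results[key] = func(inputs, **params)
        return self._results[key]


def scale(values, factor):
    """Example step: multiply every value by a factor."""
    return [v * factor for v in values]
```

The sketch only holds for deterministic steps with JSON-serializable parameters and inputs; steps with side effects or external state would need explicit invalidation.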
What we don’t have … THE BAD cont’d
• Vertical SDM Integration
– Workflow layer could be used to embed other SDM components and glue them together
– Scope & Architecture unclear
  • Data Mining tools → new WF actors
  • Parallel-R → new WF actors !?
  • SEA, Bitmap tools → new !?
  • MPI-IO → alternative to current Kepler data access !?
  • …
– Not only a technical problem
  • e.g. need for driving use-cases that require combination of several SDM layers together
Challenges
• Easier said …
– “We’re not going to reinvent the wheel …”
– “We just use XYZ …”
  • XYZ in {CCA, HDF5, PnetCDF, Ccaffeine, Condor, MPI-IO, parallel-R, …}
• … than done …
– Incompatible, isolated solutions and frameworks
– Can’t use workflow/actor/director A with B
• Coming up with a coherent, overall architecture is hard!
HTC Example (using: NIMROD)
• need to make Kepler NIMROD/Condor/… “aware”
• similar need for HPC support
Another Distribution Approach
[Figure: Client and Servers on a Computer Network, with a Service Locator (Peer Discovery)]
Simulation is orchestrated in a centralized manner
Source: Daniel Lázaro Cuadrado, Aalborg University
What we don’t have … THE UGLY
• Workflow Design & (Re-)Usability
– Difficult Marriage of Dataflow and Control-flow
  • e.g. PIW, TSI-1/2, GEON-A-type-WF, …
– WF development, deployment, maintenance, use
  • from (Mess…) to Art to Commodity (→ next presentation)
– support for the whole WF life-cycle
• Fault Tolerance
– current embedding of control-flow into dataflow leads to non-maintainable workflows!
• Close Coupling of Components for HPC
– CCA-style
– MPI-style
– Memory-to-Memory (on single nodes)
– large, efficient data transfer
– …
WF-Design: Adapters for Semantic & Structural Incompatibility
Adapters may:
– be abstract (no impl.)
– be concrete
– bridge a semantic gap
– fix a structural mismatch
– be generated automatically (e.g., Taverna’s “list mismatch”)
– be reused components (based on signatures)
[Figure: adapter insertion examples. Labels: semantic types C, C1, C2, D, D1, D2; actors f1 and f2 with types S, T, [S], [T], connected via map; the same with nested lists [[S]], [[T]] and a nested map]
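An automatically generated adapter for a list mismatch can be sketched as lifting a function over one or more levels of list nesting. This is a generic illustration of the `map` idea, not Taverna's or Kepler's mechanism; the helper name `lift` and its `depth` parameter are assumptions.

```python
def lift(f, depth=1):
    """Wrap f (S -> T) so it applies under `depth` levels of list nesting.

    depth=1 gives [S] -> [T]; depth=2 gives [[S]] -> [[T]]; depth=0 is f itself.
    """
    if depth == 0:
        return f
    inner = lift(f, depth - 1)
    return lambda items: [inner(item) for item in items]


# f2 consumes a single value, but the upstream actor produces lists.
def f2(value):
    return str(value)


f2_on_lists = lift(f2)            # adapter for a [S] producer
f2_on_nested = lift(f2, depth=2)  # adapter for a [[S]] producer
```

Because the required nesting depth follows from the port types alone, an adapter like this can be chosen without knowing anything about what `f2` computes.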
Additional Design Primitives for Semantic Types
[Table: Extended Transformations, each paired in the original with a Starting Workflow and a Resulting Workflow diagram; the diagrams and type-relation symbols are not recoverable. Surviving operand labels: T (t9); C, D (t10); f (t15)]
• t9: Actor Semantic Type Refinement
• t10: Port Semantic Type Refinement
• t11: Annotation Constraint Refinement
• t12: I/O Constraint Strengthening
• t13: Data Connection Refinement
• t14: Adapter Insertion
• t15: Actor Replacement
• t16: Workflow Combination (Map)
Workflow Design Primitives
End-to-End Workflow Design and Implementation
– Viewed as a series of primitive “transformations”
– Each takes a WF and produces a new WF
– Can be combined to form design “strategies”
[Figure: workflows W0, W1, W2, …, Wm, …, Wn linked by transformations t, spanning Workflow Design and Workflow Implementation]
Design strategies: Top-Down, Bottom-Up, Input Driven, Output Driven, Structure Driven, Semantic Driven, Task Driven, Data Driven
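The view of design as a series of transformations, each taking a workflow and producing a new one, maps naturally onto function composition. The sketch below uses a deliberately tiny workflow representation (a dictionary of actor names and links); that representation and the helper names `add_actor`, `connect`, `insert_adapter`, and `strategy` are assumptions for illustration only.

```python
from functools import reduce


def add_actor(name):
    """Transformation: add an actor to the workflow."""
    def t(wf):
        return {"actors": wf["actors"] + [name], "links": list(wf["links"])}
    return t


def connect(src, dst):
    """Transformation: add a data connection src -> dst."""
    def t(wf):
        return {"actors": list(wf["actors"]), "links": wf["links"] + [(src, dst)]}
    return t


def insert_adapter(src, dst, adapter):
    """Transformation: replace the link src -> dst by src -> adapter -> dst."""
    def t(wf):
        links = [link for link in wf["links"] if link != (src, dst)]
        return {
            "actors": wf["actors"] + [adapter],
            "links": links + [(src, adapter), (adapter, dst)],
        }
    return t


def strategy(*transformations):
    """Combine transformations into one: apply them left to right."""
    return lambda wf: reduce(lambda w, t: t(w), transformations, wf)


W0 = {"actors": [], "links": []}
design = strategy(
    add_actor("f1"),
    add_actor("f2"),
    connect("f1", "f2"),
    insert_adapter("f1", "f2", "map"),
)
Wn = design(W0)
```

Each transformation returns a new workflow and leaves its input untouched, so every intermediate design W1, W2, … remains available for inspection or for branching into a different strategy.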
Fault Tolerance & Maintenance Challenges
Workflow Templates and Patterns
New Ingredients Proposed Layered Architecture
work w/ Anne Ngu, Shawn Bowers, Terence Critchlow
Use Ideas from Fault Tolerant Shell
Source: Douglas Thain, Miron Livny, “The Ethernet Approach to Grid Computing”
Good ideas in ftsh; some might be (semi-)low hanging fruits for Kepler …
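One idea that could carry over is bounded retry with exponential backoff around an unreliable step. The sketch below is written in Python rather than ftsh syntax and is not taken from the cited paper; the function name `try_for` and its parameters are assumptions. The `sleep` argument is injectable so the behavior can be checked without actually waiting.

```python
import time


def try_for(action, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Run `action` until it succeeds or `attempts` is exhausted.

    The wait doubles after every failure; the last failure is re-raised.
    """
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception:
            if attempt == attempts:
                raise
            sleep(delay)
            delay *= 2
```

Wrapping a step this way keeps the retry policy out of the dataflow graph itself, which is the maintainability concern raised for fault tolerance earlier in the talk.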
Kepler Coupling Components & Codes
• Types of Coupling …
– Loosely coupled (“1st Phase”)
  • Web Services (SPA, GEON, SEEK, …)
  • ssh actors, …
  + reusability (behavioral polymorphism)
  + scalability (# components)
  − efficiency
– Tight(er) coupling (“2nd Phase”)
  • Via CCA (SciRUN-2, Ccaffeine, …) (Cipres uses CORBA)
  • HPC needs: code-coupling as efficient & flexible as possible (e.g. Scott’s challenges…)
    – memory-to-memory (single node or shared memory)
    – MPI (multiple nodes)
    – optimizations for transfer of data & control (streaming, socket-based connections)
Accord-CCA: Ccaffeine w/ Self-Managed Behavior
Source: Hua Liu and Manish Parashar
cf. w/ mobile models, reconfiguration in Ptolemy II
… begging for a Kepler design and implementation …
Different “Directors” for Different Concerns
• Example:
– Ptolemy Directors – “factoring out” the concern of workflow “orchestration” (MoC)
– common aspects of overall execution not left to the actors
• Similarly:
– “Black Box” (“flight recorder”)
  • a kind of “recording central” to avoid wiring 100’s of components to recording-actor(s)
– “Red Box” (error handling, fault tolerance)
  • use ftsh ideas; templates
– “Yellow Box” (type checking)
  • for workflow design
– “Blue Box” (shipping-and-handling)
  • central handling of data transport (by value, by reference, by scp, SRB, GridFTP, …)
– “CCA++ Boxes”
  • Change behavior (e.g. algorithm) of a component
  • Change behavior (i.e., wiring) of a workflow in-flight
[Figure labels: SDF/PN/DE/…; Provenance Recorder; SHA @; Static Analysis; On Error; Component Mgr; Composition Mgr]
Summary
• The GOOD:
– lots to build upon
• The BAD:
– no common / integrated architecture
  • use Kepler/SPA as a glue
  • this might be harder than it sounds
  • needs a mix of end-to-end application-drive and serious design effort for the integration architecture
• The UGLY:
– HPC challenges: close coupling, fault tolerance, …
– The good news: there’s work to be done!
Use of Semantics in SWF…
“Smart” Search
– Concept-based, e.g., “find all datasets containing biomass measurements”
Improved Linking, Merging, Integration
– Establishing links between data through semantic annotations & ontologies
– Combining heterogeneous sources based on annotations
– Concatenate, Union (merge), Join, etc.
Transforming
– Construct mappings from schema S1 to S2 based on annotations
Semantic Propagation
– “Pushing” semantic annotations through transformations/queries
Helping with “shims” / adapters
• Services can be semantically compatible, but structurally incompatible
[Figure: Source Actor (port Ps) and Target Actor (port Pt) with a desired connection from Ps to Pt. Semantic types of Ps and Pt: compatible (⊑), via Ontologies (OWL). Structural types of Ps and Pt: incompatible (⋠). A further label reads “(Ps) (≺)”.]
Source: [Bowers-Ludaescher, DILS’04]
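The situation on this slide, semantically compatible but structurally incompatible ports, can be turned into a simple connection check. The sketch is an illustration only: the two-concept ontology, the string-based structural types, the equality test standing in for a real subtype check, and the function name `plan_connection` are all assumptions, not the method of the cited paper.

```python
# Toy ontology (assumed): concept -> parent concept.
SUBCLASS_OF = {"Biomass": "Measurement", "Measurement": None}


def semantically_compatible(source_concept, target_concept):
    """True if the source concept is the target concept or a subconcept of it."""
    concept = source_concept
    while concept is not None:
        if concept == target_concept:
            return True
        concept = SUBCLASS_OF.get(concept)
    return False


def structurally_compatible(source_type, target_type):
    """Stand-in for a structural subtype check: plain equality."""
    return source_type == target_type


def plan_connection(ps, pt):
    """Decide how to wire output port ps to input port pt."""
    if not semantically_compatible(ps["semantic"], pt["semantic"]):
        return "reject"
    if structurally_compatible(ps["structural"], pt["structural"]):
        return "connect"
    return "insert adapter"
```

Checking the semantic types first means an adapter is only proposed when the connection is meaningful, so a shim never hides a conceptual mismatch.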