Karma Provenance Framework v2
Provenance Challenge Workshop/GGF18
Yogesh L. Simmhan, Beth Plale, Dennis Gannon, Srinath Perera
Indiana University
[2/25]2006-09-13
Outline
Architecture of Karma
Workflow Setup & Collecting Provenance
Provenance Traces
“canonical” Challenge Queries
Suggested Variations
Provenance Collection: Challenges & Uses
Linked Environments for Atmospheric Discovery (LEAD) project
  Weather & severe storm prediction applications
Provenance on workflow (process) & data products at fine granularity
  Dynamic, long-running workflows
Helps scientists to
  Search for workflows & data products
  Track workflow execution
  Analyze & mine data products from runs
Karma Provenance Framework
Lightweight: do not duplicate existing metadata cataloging effort
  myLEAD personal metadata catalog
  ResCat service & data registry
Glue to integrate metadata on data & services with runtime workflow information
Scalability [1]: 500 users, 100's of workflows, 10,000's of data products
[1] Performance Evaluation of the Karma Provenance Framework, Simmhan, Y., et al.; IPAW, 2006
Karma Architecture [2]
[Architecture diagram; labels recovered from the figure:]
Workflow Engine
Workflow Instance: 10 services (Service 1, Service 2, ..., Service 9, Service 10)
  10 data products consumed (10C) & produced (10P) by each service
Message Bus: WS-Eventing Service API, WS-Messenger Notification Broker
Publish provenance activities as notifications
  Application-Started & -Finished, Data-Produced & -Consumed activities
  Workflow-Started & -Finished activities
Karma Provenance Service: Provenance Listener, Activity DB, Provenance Query API
  Subscribe & listen to activity notifications
Provenance Browser Client
  Query for workflow, process, & data provenance
[2] A Framework for Collecting Provenance in Data-Centric Scientific Workflows, Simmhan, Y., et al., submitted to ICWS Conference, 2006
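The publish/subscribe flow on the architecture slide can be illustrated with a small in-memory sketch. This is not Karma's implementation: `NotificationBroker`, `ProvenanceListener`, `Activity`, and the dictionary standing in for the Activity DB are hypothetical names chosen for illustration. Only the six activity names come from the slide.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, DefaultDict, List

# Activity types named on the architecture slide.
ACTIVITY_TYPES = {
    "WorkflowStarted", "WorkflowFinished",
    "ApplicationStarted", "ApplicationFinished",
    "DataProduced", "DataConsumed",
}


@dataclass
class Activity:
    kind: str          # one of ACTIVITY_TYPES
    workflow_id: str   # workflow instance the activity belongs to
    source: str        # component that published the activity
    data_id: str = ""  # only meaningful for DataProduced / DataConsumed


class NotificationBroker:
    """Hypothetical in-memory stand-in for a notification broker."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[Activity], None]] = []

    def subscribe(self, callback: Callable[[Activity], None]) -> None:
        self._subscribers.append(callback)

    def publish(self, activity: Activity) -> None:
        if activity.kind not in ACTIVITY_TYPES:
            raise ValueError(f"unknown activity type: {activity.kind}")
        for callback in self._subscribers:
            callback(activity)


class ProvenanceListener:
    """Subscribes to the broker and stores activities per workflow."""

    def __init__(self, broker: NotificationBroker) -> None:
        # Assumption: a dict of lists stands in for the Activity DB.
        self.activity_db: DefaultDict[str, List[Activity]] = defaultdict(list)
        broker.subscribe(self.record)

    def record(self, activity: Activity) -> None:
        self.activity_db[activity.workflow_id].append(activity)


broker = NotificationBroker()
listener = ProvenanceListener(broker)
for activity in [
    Activity("WorkflowStarted", "wf-1", "engine"),
    Activity("ApplicationStarted", "wf-1", "service-1"),
    Activity("DataProduced", "wf-1", "service-1", data_id="data-1"),
    Activity("ApplicationFinished", "wf-1", "service-1"),
    Activity("WorkflowFinished", "wf-1", "engine"),
]:
    broker.publish(activity)

recorded = [a.kind for a in listener.activity_db["wf-1"]]
print(recorded)
```

Keeping publishers and the listener decoupled through a broker means services only emit notifications and never talk to the provenance store directly, which is the separation the diagram suggests.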
Provenance Challenge Workflow
Applications modeled as web services
  GFac toolkit creates service for command-line applications
  Service invokes a shell-script wrapper of the application, passing command-line arguments
  Created services automatically instrumented to generate provenance using Karma client library
Workflow composed as GPEL* script
  XBaya workflow composer GUI
  Central GPEL workflow engine orchestrates execution
*Grid Process Execution Language, an extension of the Business Process Execution Language (BPEL)
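A rough sketch of what "instrumented to generate provenance" could look like around a wrapped command-line application follows. It assumes nothing about the GFac toolkit or the Karma client library: `run_instrumented`, the `notify` callback, the event dictionaries, and the data identifiers are all hypothetical, and a no-op Python command stands in for the real shell-script wrapper.

```python
import subprocess
import sys
from typing import Callable, Dict, List


def run_instrumented(
    service_id: str,
    command: List[str],
    inputs: List[str],
    outputs: List[str],
    notify: Callable[[Dict[str, object]], None],
) -> int:
    """Run a command and emit provenance events before and after it."""
    notify({"type": "ApplicationStarted", "service": service_id})
    for data_id in inputs:
        notify({"type": "DataConsumed", "service": service_id, "data": data_id})

    result = subprocess.run(command, check=False)

    # Only report outputs when the application exited successfully.
    if result.returncode == 0:
        for data_id in outputs:
            notify({"type": "DataProduced", "service": service_id, "data": data_id})
    notify({
        "type": "ApplicationFinished",
        "service": service_id,
        "exit_code": result.returncode,  # hypothetical extra field
    })
    return result.returncode


events: List[Dict[str, object]] = []
exit_code = run_instrumented(
    "AlignWarpService",
    [sys.executable, "-c", "pass"],  # placeholder for the wrapped application
    inputs=["example-input"],        # hypothetical data IDs
    outputs=["example-output"],
    notify=events.append,
)
print([event["type"] for event in events])
```

Because the instrumentation lives in the wrapper rather than in the application, the application itself needs no changes to take part in provenance collection.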
Provenance Traces
Data Provenance: get[Recursive]DataProvenance
  What (ID), where (URL), when (Timestamp)
  How (Process, inputs)
Provenance Traces
Process Provenance: getProcessProvenance
  What (ID), when (Timestamp), who (Invoker)
  State (execution/completion status)
  Input & output data products
Provenance Traces
Workflow Trace: getWorkflowTrace
  What (ID), when (Timestamp), who (Invoker)
  State (execution/completion status)
  Process provenance of workflow steps
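The three trace types listed on these slides can be pictured as nested records. The sketch below is only one possible shape, assuming plain string fields. The class and field names are hypothetical and do not reflect the actual format returned by the Karma service; only the what/where/when/who/how/state categories come from the slides.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataProvenance:
    data_id: str      # what
    url: str          # where
    timestamp: str    # when
    produced_by: str  # how: producing process
    inputs: List[str] = field(default_factory=list)  # how: input data IDs


@dataclass
class ProcessProvenance:
    process_id: str   # what
    timestamp: str    # when
    invoker: str      # who
    state: str        # execution/completion status
    inputs: List[str] = field(default_factory=list)   # data consumed
    outputs: List[str] = field(default_factory=list)  # data produced


@dataclass
class WorkflowTrace:
    workflow_id: str  # what
    timestamp: str    # when
    invoker: str      # who
    state: str        # execution/completion status
    steps: List[ProcessProvenance] = field(default_factory=list)

    def produced_data(self) -> List[str]:
        """All data IDs produced by the workflow's steps, in step order."""
        return [data_id for step in self.steps for data_id in step.outputs]


# Hypothetical two-step workflow: step-2 consumes what step-1 produced.
trace = WorkflowTrace(
    workflow_id="wf-1",
    timestamp="2006-09-11T10:00:00",
    invoker="user-1",
    state="completed",
    steps=[
        ProcessProvenance("step-1", "2006-09-11T10:00:05", "wf-1", "completed",
                          inputs=["data-0"], outputs=["data-1"]),
        ProcessProvenance("step-2", "2006-09-11T10:01:00", "wf-1", "completed",
                          inputs=["data-1"], outputs=["data-2"]),
    ],
)
final = DataProvenance("data-2", "http://example.org/data-2",
                       "2006-09-11T10:01:30", "step-2", inputs=["data-1"])
print(trace.produced_data())
```

A workflow trace is then simply the process provenance of each step grouped under the workflow's own what/when/who/state fields.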
Provenance Challenge Queries
! Answered by Karma Service API directly
Answered by Karma Service API, with post-processing by client
~ Answered by access to backend DB (SQL)
Not answered
Query 1 2 3 4 5 6 7 8 9
Result ! ! ~ ~ ~ ~
Provenance Challenge Queries: Q1
Find everything that caused Atlas X Graphic to be as it is
! Answered by Karma Service API directly
This is the recursive data provenance of the Atlas X Graphic file
A call to getRecursiveDataProvenance('lead:uuid:1157946992-atlas-x.gif') returns this [www]
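To make "recursive data provenance" concrete, here is a minimal sketch over an in-memory derivation table. It does not call the Karma service: `recursive_data_provenance` and `DERIVATIONS` are hypothetical, the data IDs are invented, and apart from `SoftmeanService` the service names are assumed by analogy with the challenge workflow's stages.

```python
from typing import Dict, List, Tuple

# Hypothetical store: data ID -> (producing process, input data IDs).
DERIVATIONS: Dict[str, Tuple[str, List[str]]] = {
    "atlas-x.gif": ("ConvertService", ["atlas-x.pgm"]),
    "atlas-x.pgm": ("SlicerService", ["atlas.img"]),
    "atlas.img": ("SoftmeanService", ["resliced1.img", "resliced2.img"]),
}


def recursive_data_provenance(
    data_id: str, store: Dict[str, Tuple[str, List[str]]]
) -> dict:
    """Return a nested description of everything data_id was derived from."""
    if data_id not in store:
        # No recorded producer: treat as an original input.
        return {"data": data_id, "produced_by": None, "inputs": []}
    process, inputs = store[data_id]
    return {
        "data": data_id,
        "produced_by": process,
        "inputs": [recursive_data_provenance(i, store) for i in inputs],
    }


tree = recursive_data_provenance("atlas-x.gif", DERIVATIONS)
print(tree["produced_by"], "<-", tree["inputs"][0]["produced_by"])
```

The recursion bottoms out at data products with no recorded producer, so the result is the full derivation tree rooted at the requested file.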
Provenance Challenge Queries: Q2
Find the process that led to Atlas X Graphic, excluding all prior to softmean
Answered by Karma Service API, with post-processing by client
1. First call getDataProvenance
2. Then recursively get data provenance till 'SoftmeanService' is seen
Returns this [www]
1. let $dataList := ['lead:uuid:1157946992-atlas-x.gif']
2. while ($dataList != empty) do
     // get data provenance for this level
     a. $dataProvenance = karma.getDataProvenance($dataList[0])
     // print process information & remove data from list
     b. Print $dataProvenance; $dataList.delete(0)
     c. if ($dataProvenance.getProducedBy() == 'SoftmeanService') break; // found Softmean. Stop.
     // get input data used by this data & recurse up the tree
     d. foreach ($inputData in $dataProvenance.getUsingData()) do
          i. $dataList.add($inputData)
3. End
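The client-side post-processing for Q2 can be sketched as a runnable loop. This is an illustration only: `provenance_until` is a hypothetical helper, the lookup function is a stub over a dictionary rather than a call to the Karma service, and the data IDs and the service names other than `SoftmeanService` are assumptions.

```python
from collections import deque
from typing import Callable, Dict, List, Optional

# Hypothetical stub data: data ID -> one-level data provenance record.
RECORDS: Dict[str, dict] = {
    "atlas-x.gif": {"produced_by": "ConvertService", "inputs": ["atlas-x.pgm"]},
    "atlas-x.pgm": {"produced_by": "SlicerService", "inputs": ["atlas.img"]},
    "atlas.img": {"produced_by": "SoftmeanService", "inputs": ["resliced1.img"]},
    "resliced1.img": {"produced_by": "ResliceService", "inputs": ["warp1.warp"]},
}


def provenance_until(
    start_id: str,
    get_data_provenance: Callable[[str], Optional[dict]],
    stop_process: str,
) -> List[str]:
    """Walk up the derivation chain, stopping once stop_process is reached."""
    processes: List[str] = []
    queue = deque([start_id])
    seen = {start_id}
    while queue:
        record = get_data_provenance(queue.popleft())
        if record is None:
            continue  # no recorded producer for this data product
        processes.append(record["produced_by"])
        if record["produced_by"] == stop_process:
            break  # found the stop process; ignore everything before it
        for input_id in record["inputs"]:
            if input_id not in seen:
                seen.add(input_id)
                queue.append(input_id)
    return processes


chain = provenance_until("atlas-x.gif", RECORDS.get, "SoftmeanService")
print(chain)
```

As in the slide's pseudocode, the first match ends the whole walk. That suits the single chain from softmean to the graphic; a branching derivation graph would need to stop per branch instead.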
Provenance Challenge: Q4
Find all invocations of align_warp (with parameter "-m 12") that ran on a Monday
~ Answered by access to backend DB (SQL)
1. Use SQL query to get matching invocations
2. Call getProcessProvenance to get description of align_warp
Returns this [www]
SELECT invokee.workflow_id, invokee.service_id, invokee.workflow_node_id, invokee.workflow_timestep,
       invoker.workflow_id, invoker.service_id, invoker.workflow_node_id, invoker.workflow_timestep
FROM invocation_state_table invocation, entity_table invokee, entity_table invoker, notification_table notifications
WHERE invokee.entity_id = invocation.invokee_id
  AND invoker.entity_id = invocation.invoker_id
  AND notifications.source_id = invocation.invokee_id
  AND notifications.notification_type = 'ServiceInvoked'
  AND invokee.service_id = 'urn:qname:http://www.extreme.indiana.edu/karma/challenge06:AlignWarpService'
  AND notifications.notification_xml LIKE '%<ModelMenuNumber>12</ModelMenuNumber>%'
  AND DayOfWeek(invocation.request_receive_time) = 2; -- 1=Sunday, 2=Monday, ...
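The two filters in the Q4 query (a substring match on the notification XML and a day-of-week test on the invocation time) can be tried out on a toy table. The sketch below uses Python's built-in `sqlite3` with a single hypothetical `invocation` table, not the Karma schema, and invented rows. SQLite has no `DayOfWeek` function, so `strftime('%w', ...)` is used instead, where Sunday is '0' and Monday is '1'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, heavily simplified table; the real backend joins several tables.
conn.execute(
    "CREATE TABLE invocation ("
    "service_id TEXT, request_receive_time TEXT, notification_xml TEXT)"
)
conn.executemany(
    "INSERT INTO invocation VALUES (?, ?, ?)",
    [
        # 2006-09-11 was a Monday, 2006-09-12 a Tuesday.
        ("AlignWarpService", "2006-09-11 10:00:00",
         "<ModelMenuNumber>12</ModelMenuNumber>"),
        ("AlignWarpService", "2006-09-12 10:00:00",
         "<ModelMenuNumber>12</ModelMenuNumber>"),
        ("AlignWarpService", "2006-09-11 11:00:00",
         "<ModelMenuNumber>3</ModelMenuNumber>"),
        ("SoftmeanService", "2006-09-11 12:00:00", ""),
    ],
)

# strftime('%w', t) yields '0' for Sunday through '6' for Saturday.
matches = conn.execute(
    "SELECT request_receive_time FROM invocation "
    "WHERE service_id = ? "
    "AND notification_xml LIKE ? "
    "AND strftime('%w', request_receive_time) = '1'",
    ("AlignWarpService", "%<ModelMenuNumber>12</ModelMenuNumber>%"),
).fetchall()
print(matches)
conn.close()
```

Matching the parameter with `LIKE` on serialized XML is simple but tied to the exact element text, which is one reason such queries sit outside the service API.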
Provenance Challenge: Q9
Find all the graphical atlas sets that have metadata annotation studyModality with values speech, visual or audio, and return all other annotations to these files.
Not answered
We do not expect to answer such queries through the provenance system
We push the provenance information to external metadata management systems such as myLEAD, which can answer such “join” queries on data product metadata and provenance
Variations of Workflow
Workflows with loops
Workflows whose structure changes dynamically or, as a simpler case, workflows with conditional branches
Hierarchical composition of workflows
  Workflows invoking other workflows
Variations of Queries
Find all [workflows | processes] with a particular execution status [completed | failed | waiting for input]
Show the client view and service view of the provenance and check for differences
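Both suggested query variations reduce to simple operations once the records are in hand. The sketch below assumes, purely for illustration, that each record is a dictionary with `id` and `state` keys and that each view is a set of (activity, entity) pairs. `with_status`, `view_differences`, and all sample values are hypothetical.

```python
from typing import Dict, List, Set, Tuple

View = Set[Tuple[str, str]]  # (activity, entity) pairs seen by one party


def with_status(records: List[Dict[str, str]], status: str) -> List[str]:
    """IDs of workflows or processes whose state equals status."""
    return [r["id"] for r in records if r["state"] == status]


def view_differences(client_view: View, service_view: View) -> Dict[str, list]:
    """Activities reported by only one side of an invocation."""
    return {
        "client_only": sorted(client_view - service_view),
        "service_only": sorted(service_view - client_view),
    }


records = [
    {"id": "wf-1", "state": "completed"},
    {"id": "wf-2", "state": "failed"},
    {"id": "wf-3", "state": "waiting for input"},
    {"id": "wf-4", "state": "failed"},
]
failed = with_status(records, "failed")

client_view = {("invoked", "service-1"), ("invoked", "service-2")}
service_view = {("invoked", "service-1")}
diff = view_differences(client_view, service_view)
print(failed, diff)
```

A non-empty difference would point at an activity one party reported and the other did not, for example an invocation that never reached the service.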
Acknowledgements
Alek Slominski (GPEL Engine)
Satoshi Shirasuna (XBaya Composer)
LEAD Members
NSF
Questions
www.extreme.indiana.edu/karma