2013-05-29 Taverna Provenance (pptx source)

description

Slide deck presenting the provenance support of the Taverna workflow system, detailing the architecture, the ontologies, and how results are exported as Research Object bundles that include the PROV-O provenance of the workflow run. This is the original PPTX version (PowerPoint 2013); for the PDF version see http://www.slideshare.net/soilandreyes/20130529-taverna-provenance

Transcript of 2013-05-29 Taverna Provenance (pptx source)

Page 2: 2013-05-29 Taverna Provenance (pptx source)

ARCHITECTURE

Provenance

Workflow

Workflow run

Process run (iteration)

Parameter bindings

Data

Lists

Values

References

Errors

Process1

portA B C

D E

Process2

portA B C

D E

Invoke

Retry

Failover

Loop

Error bounce

Provenance

Parallelise

Processor dispatch stack

layer injected by plugin

P Missier, S Soiland-Reyes, S Owen, W Tan, A Nenadic, I Dunlop, C Goble (2010, January). Taverna, reloaded. In Scientific and Statistical Database Management (pp. 471-481). Springer Berlin Heidelberg. DOI 10.1007/978-3-642-13818-8_33

captures provenance trace

Workflow execution
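The dispatch stack idea can be sketched as nested layers around a service call, with a provenance layer recording each process run. This is a minimal illustration in Python, not Taverna's Java implementation: the function names, the trace record format, and the position of the provenance layer above the retry layer are all assumptions made for the example.

```python
from typing import Callable, Dict, List

Inputs = Dict[str, object]
Outputs = Dict[str, object]
Job = Callable[[Inputs], Outputs]


def invoke(service: Job) -> Job:
    """Bottom layer: actually call the service."""
    return lambda inputs: service(inputs)


def retry(next_layer: Job, max_attempts: int = 3) -> Job:
    """Re-run the layers below when they raise (assumes max_attempts >= 1)."""
    def layer(inputs: Inputs) -> Outputs:
        last_error = None
        for _ in range(max_attempts):
            try:
                return next_layer(inputs)
            except Exception as error:
                last_error = error
        raise last_error
    return layer


def provenance(next_layer: Job, process_name: str, trace: List[dict]) -> Job:
    """Hypothetical injected layer: record the port bindings of one process run."""
    def layer(inputs: Inputs) -> Outputs:
        record = {"process": process_name, "inputs": dict(inputs)}
        try:
            record["outputs"] = next_layer(inputs)
            return record["outputs"]
        except Exception as error:
            record["error"] = repr(error)
            raise
        finally:
            trace.append(record)
    return layer


# Usage: a service that fails once, wrapped by invoke, retry and provenance.
calls = {"n": 0}


def flaky_upper(inputs: Inputs) -> Outputs:
    calls["n"] += 1
    if calls["n"] == 1:
        raise RuntimeError("temporary failure")
    return {"D": str(inputs["A"]).upper()}


trace: List[dict] = []
stack = provenance(retry(invoke(flaky_upper)), "Process1", trace)
result = stack({"A": "hello"})
```

Because the provenance layer sits above the retry layer in this sketch, the trace holds one record per process run rather than one per attempt.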

Page 3: 2013-05-29 Taverna Provenance (pptx source)

ONTOLOGY STACK

tavernaprov

• Lists, errors, byte content, checksums

wfprov + wfdesc

• Workflow execution, parameters, processes

PROV-O

• Activity start/stop, generation of values

http://purl.org/wf4ever/wfprov#

http://www.w3.org/ns/prov#

http://ns.taverna.org.uk/2012/tavernaprov/
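As a small illustration of how the three layers combine in one trace, the sketch below expands prefixed names into full IRIs. The wfprov and tavernaprov namespaces are the ones on the slide; PROV-O terms are published under http://www.w3.org/ns/prov#, which is what the sketch uses. The prefixes and the `expand` helper are assumptions made for the example, and the term names are quoted from memory, so they should be checked against the ontologies.

```python
# Prefix table for the ontology stack (the prefix labels are a convention, not mandated).
NAMESPACES = {
    "prov": "http://www.w3.org/ns/prov#",
    "wfprov": "http://purl.org/wf4ever/wfprov#",
    "tavernaprov": "http://ns.taverna.org.uk/2012/tavernaprov/",
}


def expand(curie: str) -> str:
    """Expand a prefixed name such as 'prov:Activity' into a full IRI."""
    prefix, _, local = curie.partition(":")
    if prefix not in NAMESPACES or not local:
        raise ValueError(f"unknown prefix or empty local name: {curie!r}")
    return NAMESPACES[prefix] + local


# One illustrative term per layer, from most general to most Taverna-specific.
examples = [
    expand("prov:startedAtTime"),   # activity start time
    expand("wfprov:ProcessRun"),    # execution of one process
    expand("tavernaprov:Error"),    # Taverna-specific error value
]
```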

Page 4: 2013-05-29 Taverna Provenance (pptx source)

INTERMEDIATE RESULTS

• Within the Taverna Workbench, the provenance database is used for showing intermediate results and previous runs

Clicking a processor

Inputs and outputs of individual invocations

Page 5: 2013-05-29 Taverna Provenance (pptx source)

WORKFLOW RESULTS (FOLDER)

workflowrun.prov.ttl

(RDF)

outputA.txt

outputC.jpg

outputB/

Folder structure

intermediates/

1.txt

2.txt

3.txt

de/def2e58b-50e2-4949-9980-fd310166621a.txt

Workflow outputs, one file per value

Provenance trace

Values from intermediate steps in workflow
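A results folder laid out this way can be sorted into the three groups named on the slide with a short script. The sketch below is an assumption-laden illustration: it takes `workflowrun.prov.ttl` at the top level as the provenance trace, everything under `intermediates/` as intermediate values, and all other files as workflow outputs. Placing `1.txt` to `3.txt` inside `outputB/` is a guess about the layout, and `classify_results` is a hypothetical helper, not part of Taverna.

```python
import tempfile
from pathlib import Path
from typing import Dict, List


def classify_results(folder: Path) -> Dict[str, List[str]]:
    """Group the files of a results folder by their role."""
    groups: Dict[str, List[str]] = {"provenance": [], "intermediates": [], "outputs": []}
    for path in sorted(folder.rglob("*")):
        if not path.is_file():
            continue
        relative = path.relative_to(folder)
        name = relative.as_posix()
        if name == "workflowrun.prov.ttl":
            groups["provenance"].append(name)
        elif relative.parts[0] == "intermediates":
            groups["intermediates"].append(name)
        else:
            groups["outputs"].append(name)
    return groups


# Usage: rebuild the example layout in a temporary folder and classify it.
example_files = [
    "workflowrun.prov.ttl",
    "outputA.txt",
    "outputC.jpg",
    "outputB/1.txt",  # assumed: list output stored as one file per value
    "outputB/2.txt",
    "outputB/3.txt",
    "intermediates/de/def2e58b-50e2-4949-9980-fd310166621a.txt",
]
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in example_files:
        target = root / name
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text("example")
    groups = classify_results(root)
```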

Page 6: 2013-05-29 Taverna Provenance (pptx source)

WORKFLOW RESULTS (BUNDLE)

workflowrun.prov.ttl

(RDF)

outputA.txt

outputC.jpg

outputB/

https://w3id.org/bundle

intermediates/

1.txt

2.txt

3.txt

de/def2e58b-50e2-4949-9980-fd310166621a.txt

.ro/manifest.json

inputA.txt

workflow

URI references

attribution

execution environment

Aggregating in Research Object

ZIP folder structure (RO Bundle)

mimetype

application/vnd.wf4ever.robundle+zip
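The ZIP layout can be sketched with Python's standard `zipfile` module. This is a simplified illustration, not Taverna's exporter: the manifest keys (`@context`, `id`, `aggregates`, `uri`) are loosely modelled on the RO Bundle specification at https://w3id.org/bundle and should be checked against it. Writing `mimetype` as the first, uncompressed entry follows the convention used by similar ZIP-based formats and is treated here as an assumption. `write_bundle` and `read_bundle` are hypothetical helper names.

```python
import io
import json
import zipfile
from typing import Dict, Tuple

MIMETYPE = "application/vnd.wf4ever.robundle+zip"


def write_bundle(target, files: Dict[str, str]) -> None:
    """Write files plus a minimal manifest into a ZIP-based bundle."""
    manifest = {
        "@context": ["https://w3id.org/bundle/context"],  # assumed context URI
        "id": "/",
        "aggregates": [{"uri": "/" + name} for name in sorted(files)],
    }
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as bundle:
        # The mimetype entry goes first and is stored without compression.
        bundle.writestr("mimetype", MIMETYPE, compress_type=zipfile.ZIP_STORED)
        for name, content in files.items():
            bundle.writestr(name, content)
        bundle.writestr(".ro/manifest.json", json.dumps(manifest, indent=2))


def read_bundle(source) -> Tuple[str, dict]:
    """Return the declared media type and the parsed manifest."""
    with zipfile.ZipFile(source) as bundle:
        if bundle.infolist()[0].filename != "mimetype":
            raise ValueError("mimetype must be the first entry")
        mimetype = bundle.read("mimetype").decode("ascii")
        manifest = json.loads(bundle.read(".ro/manifest.json"))
    return mimetype, manifest


# Usage: build a bundle in memory and read it back.
buffer = io.BytesIO()
write_bundle(buffer, {
    "workflowrun.prov.ttl": "# provenance trace (placeholder content)\n",
    "outputA.txt": "example output\n",
    "inputA.txt": "example input\n",
})
mimetype, manifest = read_bundle(buffer)
```

Keeping the manifest in `.ro/manifest.json` lets a reader discover which files are aggregated without unpacking the whole archive.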

Page 7: 2013-05-29 Taverna Provenance (pptx source)

ACKNOWLEDGEMENTS

• Paolo Missier – initial provenance engine for Taverna 2

• Ian Dunlop – provenance capture execution layer

• Khalid Belhajjame – ontologies

• Alexandra Nenadic – intermediates, folder structure

• W3C Provenance working group – PROV-O

• Funded by the European Commission’s 7th FWP FP7-ICT-2007-6 270192 and EPSRC platform grant EP/G026238/1