10 Best Practices for Workflow Design

Post on 13-Nov-2014

Presented at the 2nd BioVeL Workshop on taxonomic and phylogenetic workflows (http://www.biovel.eu/index.php?option=com_content&view=article&id=43:ms6-workshop&catid=22:biovel-meetings&Itemid=122)

The 10 Best Practices for Workflow Design

BioVeL M6 Workshop

Göteborg, May 10-11, 2012

Kristina Hettne, Marco Roos (LUMC), Katy Wolstencroft, Carole Goble (myGrid)

Thanks: BioSemantics Group (LUMC), myGrid team (UoM), Yassene Mohamed, Harish Dharuri (LUMC)

http://biosemantics.org

Our specialty: Knowledge Discovery

Substrates for Knowledge Discovery

Disambiguation

*Text Mining

Applications
• Predict protein-protein, protein-disease associations, gene prioritization
• Genotype-phenotype studies, e.g. Huntington’s Disease, Metabolic Syndrome
• Yours?

* Global disambiguation initiative: http://snipurl.com/conceptweballiance

Methods for Knowledge Discovery

Why build good workflows?

Introduction

Good workflow design = good science!

Best practices for workflow design = best practices for experimental science + best practices for software engineering

Introduction

Best practices for workflow design

1. Make a sketch workflow

PowerPoint courtesy of Eleni Mina

Sketch an Abstract Workflow

Best practice 1

2. Use modules

http://www.myexperiment.org/workflows/74.html

3. Think about the output

(and the data in your workflow in general)

http://...

Think about the output

Best practice 3

?

4. Provide example inputs and outputs

Taverna 2.3 Recipe
• Select input/output
• Select tab ‘Details’
• Click ‘Annotation’
• Add Example

Taverna 2.4
• Right-click input/output
• Select ‘Annotation’
• Add Example
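The idea behind this practice can be sketched outside Taverna as well. The following is a minimal Python illustration, not Taverna's API: the `AnnotatedStep` class, its fields, and the toy `reverse_complement` step are all hypothetical names chosen for this sketch. It shows a step that carries its own example input and output, so that anyone can re-run the example and see whether the step still behaves as documented.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AnnotatedStep:
    """Hypothetical workflow step that carries its own example input and output."""
    name: str
    run: Callable[[str], str]
    example_input: str
    example_output: str

    def example_holds(self) -> bool:
        # Re-run the step on its documented example and compare the result.
        return self.run(self.example_input) == self.example_output


def reverse_complement(sequence: str) -> str:
    """Toy step used for illustration: reverse-complement a DNA sequence."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in reversed(sequence.upper()))


step = AnnotatedStep(
    name="reverse_complement",
    run=reverse_complement,
    example_input="ATGC",
    example_output="GCAT",
)
```

Calling `step.example_holds()` re-runs the documented example. A user who has never seen the workflow can use the same example values to learn what kind of data each port expects.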

5. Annotate

Annotate

Best practice 5

Each component in Taverna can be annotated

Annotate and help your users

Best practice 5

6. Make workflow executable from outside the local environment

Make workflow executable by others

Best practice 6

» Try it!
  › Ask a colleague
  › Use an external t2web runner

» Tips
  › Use Web Services
  › If you use local command-line tools:
    • Install the tools on a publicly accessible server (e.g. applies to Rserve)
    • Use a system that your users can set up (e.g. BioLinux)

How to check that others can execute your workflow?

Proof of executability
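Before asking a colleague to try the workflow, a quick self-check can flag steps that obviously depend on the local machine. The Python sketch below is only a heuristic under stated assumptions: the function names, the dependency names, and the example locations are hypothetical, and it inspects location strings rather than contacting any service. It does not replace an actual run by someone else.

```python
from typing import Dict, List
from urllib.parse import urlparse

# Host names that only resolve on the workflow author's own machine.
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def is_portable(location: str) -> bool:
    """Return True if a dependency location looks reachable from another machine."""
    parsed = urlparse(location)
    if parsed.scheme not in ("http", "https"):
        # File paths and other non-web locations are tied to the local setup.
        return False
    return parsed.hostname is not None and parsed.hostname not in LOCAL_HOSTS


def non_portable(dependencies: Dict[str, str]) -> List[str]:
    """List the names of dependencies that others probably cannot reach."""
    return sorted(name for name, loc in dependencies.items() if not is_portable(loc))


# Hypothetical dependencies of a workflow (names and locations are made up).
deps = {
    "sequence_search": "https://example.org/services/search",
    "stats_server": "http://localhost:8080/stats",
    "local_tool": "/usr/local/bin/some_tool",
}
problems = non_portable(deps)
```

Here `problems` holds the two dependencies that point at the local machine. These are the candidates for moving to a publicly accessible server or for packaging in a system that users can set up themselves.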

7. Choose services carefully

Choose services carefully

Best practice 7

8. Reuse existing workflows

The reuse workflow

Best practice 8

Labels in the flowchart:
• Use scripts from colleagues
• Check workflows on myExperiment
• Check services on BioCatalogue
• Search the internet
• Contact authors
• Retry
• Reuse, attribute, respect licences
• Invent a new wheel

Branches are labelled Pos. and Neg.

Not a best practice, but a tip: know-how is important for reuse

9. Advertise

Advertise

A unique reference to use in your papers and for others to cite

10. Maintain

Maintain

Best Practice 10

» Regularly check your workflow
  › Ask colleagues

» Enable support for maintenance
  › Register your workflow on myExperiment
  › Register Web Services on

» Enable peers to repair: annotate!

» Note about versioning
  › No need to register all edits on myExperiment: use Subversion
  › Register important updates on myExperiment

Best practices to support maintenance

Bonus tip

Use common sense as a scientist

Workflow 74, “Protein Discovery”, 2005

Workflow 2876, “Match gene lists by literature”, 2012

Preservation of good workflows for future applications

Workflow Forever

Workflow 2805, “Get Pathway genes”, 2012

myExperiment 2.0

BioCatalogue

Taverna

Research Objects

Linked Data

Methods

Protocols for Preservation and Conservation

Wf4Ever

Outcomes for BioVeL

1. Make a sketch workflow

2. Use modules

3. Think about the output

4. Provide example inputs and outputs

5. Annotate

6. Make it executable from outside the local environment

7. Choose services carefully

8. Reuse existing workflows

9. Advertise

10. Maintain

The 10 Best Practices for Workflow Design

Thank you for your attention

More information:

http://snipurl.com/workflowbestpractices