Merging and sharing Metabolomics analysis tools with Galaxy: transparent, reproducible, open 'omics...

Merging and sharing Metabolomics analysis tools with Galaxy: transparent, reproducible, open 'omics Robert L Davidson PhD. @bobbledavidson #MMW2014 Merlion Metabolomics Workshop 2014 http://dx.doi.org/10.6084/m9.figshare.1243500 http://bit.ly/1EPTbme


Page 1:

Merging and sharing Metabolomics analysis tools with Galaxy:

transparent, reproducible, open 'omics

Robert L Davidson PhD. @bobbledavidson

#MMW2014

Merlion Metabolomics Workshop 2014

http://dx.doi.org/10.6084/m9.figshare.1243500

http://bit.ly/1EPTbme

Page 2:

Researcher bias

Positive result bias

20 teams do studies, 1 publishes p<0.05

Poorly explained analyses

DOI: 10.1371/journal.pmed.0020124
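
The "20 teams, 1 publishes p<0.05" point can be checked with a line of arithmetic. The sketch below assumes (this is not stated on the slide) that all 20 studies are independent and test an effect that does not exist, each at a 0.05 threshold. The function and variable names are illustrative only.

```python
# Assumption: 20 independent studies of a true null effect, each tested at alpha = 0.05.
ALPHA = 0.05
N_TEAMS = 20


def chance_of_any_false_positive(n_studies: int, alpha: float) -> float:
    """Probability that at least one of n independent null studies reaches p < alpha."""
    return 1.0 - (1.0 - alpha) ** n_studies


# On average, one of the 20 teams is expected to see p < 0.05 by chance alone.
expected_false_positives = N_TEAMS * ALPHA
p_any = chance_of_any_false_positive(N_TEAMS, ALPHA)

print(f"Expected false positives: {expected_false_positives:.1f}")
print(f"Chance at least one team finds p < {ALPHA}: {p_any:.2f}")
```

Under these assumptions roughly two times out of three at least one team has a "significant" result to publish, even though there is nothing to find.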

Page 3:

85% of research resources are wasted!

We must ... favor ... unbiased, transparent, collaborative research with greater standardization

Share data, protocols, materials, software, other tools

DOI: 10.1371/journal.pmed.1001747

Page 4:

Data sharing

Supported by government policy: e.g. UK and NIH

MetaboLights repository: www.ebi.ac.uk/metabolights/

NIH Metabolomics Data Repository: www.metabolomicsworkbench.org/data/index.php

ISA-Tab for metadata: http://www.isa-tools.org/format.html

Page 5:

What about methods?

http://reproducibility.cs.arizona.edu/

“The good news is that I was able to find some code. I am just hoping that it is a stable working version of the code... I have lost some data... The bad news is that the code is not commented and/or clean. So, I cannot really guarantee that you will enjoy playing with it.”

613 papers tested

123 successful reproductions

Page 6:

Problem

There is a reproducibility crisis

Published results are untrustworthy

Research is a waste of government money (85%)

What's the solution?

Share data AND methods

Page 7:

Galaxy

Over 36,000 main Galaxy server users

Over 1,000 papers citing Galaxy use

Over 55 Galaxy servers deployed

Open source

http://galaxyproject.org

Page 8:

Galaxy – Toolshed

https://toolshed.g2.bx.psu.edu/

Many 'omics, stats, visualisations

Metabolomics can plug into this

2700+ tools!

Download; run instantly

Page 9:

Any tool in Galaxy

<tool name="myfunction">
  <!-- command line that Galaxy runs; $input1 is replaced by the dataset path -->
  <command>python myfunction $input1</command>
  <inputs>
    <param format="txt" name="input1"/>
  </inputs>
  <outputs>
    <data format="csv" name="output1"/>
  </outputs>
</tool>

Basic XML 'wrapper'

Describe inputs and outputs

Calls command

Monitors for output

Logs/returns to 'history'
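
The four wrapper duties listed above (describe inputs and outputs, call the command, monitor for output, log to the history) can be imitated in a few lines of Python. This is only a sketch of the idea, not Galaxy's own implementation: `run_wrapped_tool`, the `history` list and the inline stand-in tool are all hypothetical names invented for this illustration.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_wrapped_tool(command, output_path, history):
    """Call the command, check that the declared output appeared, log the result."""
    result = subprocess.run(command, capture_output=True, text=True)
    output_path = Path(output_path)
    entry = {
        "command": command,
        "exit_code": result.returncode,
        "output": str(output_path) if output_path.exists() else None,
        "ok": result.returncode == 0 and output_path.exists(),
    }
    history.append(entry)  # the 'history' is just a list in this sketch
    return entry


# Hypothetical stand-in for "python myfunction input1":
# turns a whitespace-delimited text table into CSV.
TOOL_SOURCE = """
import sys
with open(sys.argv[1]) as src, open(sys.argv[2], "w") as dst:
    for line in src:
        dst.write(",".join(line.split()) + "\\n")
"""

history = []
with tempfile.TemporaryDirectory() as workdir:
    input1 = Path(workdir) / "input1.txt"    # declared input (format txt)
    output1 = Path(workdir) / "output1.csv"  # declared output (format csv)
    input1.write_text("mz intensity\n100.5 2000\n")
    entry = run_wrapped_tool(
        [sys.executable, "-c", TOOL_SOURCE, str(input1), str(output1)],
        output1,
        history,
    )
    converted = output1.read_text() if entry["ok"] else ""

print(entry["ok"], converted)
```

The point of the design is that the wrapped tool can be written in any language; the wrapper only needs to know the command line and which files to expect.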

Page 10:

Galaxy

Screenshot panels: Tool List, Tool Parameters, History/results

Page 11:

Birmingham metabolomics workflow

SIM-Stitch DOI:10.1016/j.jasms.2009.02.001

XCMS DOI:10.1021/ac051437y

MI-Pack DOI:10.1016/j.chemolab.2010.04.010

KNN Impute DOI:10.1007/s11306-011-0366-4

PQ-Normalisation DOI:10.1021/ac051632c

G-Log transform DOI:10.1186/1471-2105-8-234

PCA (with statistical test of scores)
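
To give a feel for what two of the listed steps do, here is a minimal pure-Python sketch of PQ-normalisation and the G-log transform. It follows the general form of these methods, not the Birmingham tools' code: it assumes missing values are already imputed, uses the per-feature median across samples as the reference spectrum, and takes the transform parameter `lam` as given instead of optimising it. All function names are illustrative.

```python
import math
from statistics import median


def pqn_normalise(samples):
    """Probabilistic quotient normalisation against the per-feature median spectrum."""
    n_features = len(samples[0])
    reference = [median(sample[j] for sample in samples) for j in range(n_features)]
    normalised = []
    for sample in samples:
        # Quotient of each feature against the reference (skip empty reference features).
        quotients = [sample[j] / reference[j] for j in range(n_features) if reference[j] > 0]
        dilution = median(quotients)
        if dilution == 0:
            normalised.append(list(sample))  # nothing sensible to scale by
        else:
            normalised.append([value / dilution for value in sample])
    return normalised


def glog(value, lam):
    """Generalised log: behaves like log for large values, stays finite near zero."""
    return math.log(value + math.sqrt(value * value + lam))


# Toy peak matrix: the second sample is the first one at twice the concentration.
peaks = [
    [10.0, 20.0, 30.0],
    [20.0, 40.0, 60.0],
    [10.0, 20.0, 30.0],
]
normalised = pqn_normalise(peaks)
transformed = [[glog(value, lam=1.0) for value in row] for row in normalised]
print(normalised[1])
```

In the toy matrix the dilution difference between samples disappears after normalisation, which is the behaviour PQ-normalisation is meant to provide.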

Page 12:

Birmingham metabolomics workflow

Many tools

Many languages

Complex to learn

Many parameters

Complex to report

Page 13:

Metabolomics workflow in Galaxy

User sees website (intuitive)

Centrally stored (secure)

Workflow is recorded

Methods shareable

Page 14:

View, share, edit, rerun workflow
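
Because the workflow is recorded, the "complex to report" problem becomes a matter of reading a file. The sketch below assumes a workflow exported from Galaxy as JSON with a top-level `name`, a `steps` mapping keyed by step number, and per-step `name`, `tool_id` and a JSON-encoded `tool_state`. Those field names are an assumption about the export layout, and the example workflow and its tool id are made up.

```python
import json


def summarise_workflow(workflow):
    """Turn an exported workflow description into plain-text methods lines."""
    lines = [f"Workflow: {workflow.get('name', 'unnamed')}"]
    steps = workflow.get("steps", {})
    for key in sorted(steps, key=int):
        step = steps[key]
        state = step.get("tool_state") or "{}"
        # tool_state is assumed to be a JSON string; accept a dict as well.
        params = json.loads(state) if isinstance(state, str) else state
        tool = step.get("tool_id") or "input dataset"
        lines.append(f"{int(key) + 1}. {step.get('name', '?')} [{tool}]")
        for name, value in sorted(params.items()):
            if not name.startswith("__"):  # skip internal bookkeeping keys
                lines.append(f"   {name} = {value}")
    return lines


# Hypothetical two-step workflow, written inline instead of loaded from a file.
example_workflow = {
    "name": "Example metabolomics workflow",
    "steps": {
        "0": {"name": "Input dataset", "tool_id": None, "tool_state": "{}"},
        "1": {
            "name": "G-Log transform",
            "tool_id": "glog_hypothetical",
            "tool_state": json.dumps({"lambda": "1e-8", "__page__": 0}),
        },
    },
}

report = summarise_workflow(example_workflow)
print("\n".join(report))
```

A summary like this could accompany the workflow file itself when it is shared as supplementary material.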

Page 15:

Citable workflow

Add as supplemental files or publish with distinct DOI via GigaDB or FigShare

Page 16:

Where to get our workflow

Coming soon!

Galaxy Toolshed

GitHub

Submitted to GigaScience (gigasciencejournal.com)

VM/Code/TestData to be available on GigaDB.org

Test server to be available at GigaGalaxy: http://galaxy.cbiit.cuhk.edu.hk/

Page 17:

Summary

Share your data

Share your software

Share your workflow – in full

Galaxy is not a new 'software', it's a flexible sharing platform

Add your tools to ours, in Galaxy Toolshed

Help make metabolomics: Trustworthy, meaningful, reproducible

Page 18:

Acknowledgements

University of Birmingham: Ralf Weber, Mark Viant

GigaScience: Pete Li

Funding: NERC NE/K011294/1

Page 19:

Me: Rob L. Davidson

http://orcid.org/0000-0002-0311-9774

This presentation:

http://bit.ly/1EPTbme

http://dx.doi.org/10.6084/m9.figshare.1243500