Merging and sharing Metabolomics analysis tools with Galaxy: transparent, reproducible, open 'omics...

Post on 19-Jan-2016


Merging and sharing Metabolomics analysis tools with Galaxy:

transparent, reproducible, open 'omics

Robert L Davidson PhD. @bobbledavidson

#MMW2014

Merlion Metabolomics Workshop 2014

http://dx.doi.org/10.6084/m9.figshare.1243500

http://bit.ly/1EPTbme

Researcher bias

Positive result bias: 20 teams do studies, 1 publishes p<0.05

Poorly explained analyses

DOI: 10.1371/journal.pmed.0020124
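The "20 teams, 1 publishes" scenario can be made concrete with a little arithmetic. The sketch below assumes (the slide does not say this) that every team independently tests an effect that does not exist, using a 0.05 significance threshold. The function names are illustrative only.

```python
def prob_at_least_one_false_positive(n_studies, alpha=0.05):
    """Chance that at least one of n independent null studies reaches p < alpha."""
    return 1.0 - (1.0 - alpha) ** n_studies


def expected_false_positives(n_studies, alpha=0.05):
    """Average number of null studies that reach p < alpha."""
    return n_studies * alpha


if __name__ == "__main__":
    # Assumption: 20 independent teams, no real effect, threshold 0.05.
    print(prob_at_least_one_false_positive(20))
    print(expected_false_positives(20))
```

Under these assumptions, about one team in twenty is expected to find a "significant" result by chance alone, and the probability that at least one team does so is roughly 64%. If only that team publishes, the literature records an effect that is not there.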

85% of research resources are wasted!

We must ... favor ... unbiased, transparent, collaborative research with greater standardization

Share data, protocols, materials, software, other tools

DOI: 10.1371/journal.pmed.1001747

Data sharing

Supported by government policy, e.g. UK and NIH

MetaboLights repository: www.ebi.ac.uk/metabolights/

NIH Metabolomics Data Repository: www.metabolomicsworkbench.org/data/index.php

ISA-Tab for metadata: http://www.isa-tools.org/format.html

What about methods?

http://reproducibility.cs.arizona.edu/

“The good news is that I was able to find some code. I am just hoping that it is a stable working version of the code... I have lost some data... The bad news is that the code is not commented and/or clean. So, I cannot really guarantee that you will enjoy playing with it.”

613 papers tested

123 successful reproductions

Problem

There is a reproducibility crisis

Published results are untrustworthy

Research is a waste of government money (85%)

What's the solution?

Share data AND methods

Galaxy

Over 36,000 main Galaxy server users

Over 1,000 papers citing Galaxy use

Over 55 Galaxy servers deployed

Open source

http://galaxyproject.org

Galaxy – Toolshed

https://toolshed.g2.bx.psu.edu/

Many 'omics, stats, visualisations

Metabolomics can plug into this

2700+ tools!

Download; run instantly

Any tool in Galaxy

<tool id="myfunction" name="myfunction">
  <command>python myfunction.py '$input1' '$output1'</command>
  <inputs>
    <param name="input1" type="data" format="txt" label="Input file"/>
  </inputs>
  <outputs>
    <data name="output1" format="csv"/>
  </outputs>
</tool>

Basic XML 'wrapper'

Describe inputs and outputs

Calls command

Monitors for output

Logs/returns to 'history'
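The wrapper steps above (call the command, monitor for the output, log to a history) can be sketched in a few lines of Python. This is not Galaxy's implementation; it is a minimal stand-in to show the idea. `run_wrapped_tool`, the `history` list and the inline stand-in tool are all hypothetical names introduced for illustration.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_wrapped_tool(command, output_path, history):
    """Run a command, check that its declared output appeared, and log it."""
    result = subprocess.run(command, capture_output=True, text=True)
    output_path = Path(output_path)
    entry = {
        "command": command,
        "exit_code": result.returncode,
        "output": str(output_path),
        # Success means a clean exit AND the declared output file exists.
        "ok": result.returncode == 0 and output_path.exists(),
    }
    history.append(entry)
    return entry


def demo():
    """Wrap a stand-in tool that turns a text file into a one-column CSV."""
    history = []
    with tempfile.TemporaryDirectory() as tmp:
        input1 = Path(tmp) / "input1.txt"
        output1 = Path(tmp) / "output1.csv"
        input1.write_text("a\nb\n")
        # Hypothetical tool: argv[1] is the input path, argv[2] the output path.
        tool = (
            "import sys;"
            "lines = open(sys.argv[1]).read().split();"
            "open(sys.argv[2], 'w').write('value\\n' + '\\n'.join(lines) + '\\n')"
        )
        entry = run_wrapped_tool(
            [sys.executable, "-c", tool, str(input1), str(output1)],
            output1,
            history,
        )
        return entry["ok"], output1.read_text(), len(history)


if __name__ == "__main__":
    print(demo())
```

The point of the design is that the tool itself stays unchanged: any script that reads an input path and writes an output path can be wrapped, whatever language it is written in.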

Galaxy

Tool list | Tool parameters | History/results

Birmingham metabolomics workflow

SIM-Stitch DOI:10.1016/j.jasms.2009.02.001

XCMS DOI:10.1021/ac051437y

MI-Pack DOI:10.1016/j.chemolab.2010.04.010

KNN Impute DOI:10.1007/s11306-011-0366-4

PQ-Normalisation DOI:10.1021/ac051632c

G-Log transform DOI:10.1186/1471-2105-8-234

PCA (with statistical test of scores)
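To give a feel for what two of these steps do, here is a minimal pure-Python sketch of probabilistic quotient (PQ) normalisation and the generalised log (G-log) transform. It is not the Birmingham implementation: the median spectrum is assumed as the reference, the usual preliminary total-signal normalisation is omitted, and the G-log parameter `lam` is passed in by hand rather than estimated from replicate data.

```python
import math
from statistics import median


def pqn_normalise(samples):
    """PQ-normalise a list of spectra (one list of intensities per sample)."""
    n_features = len(samples[0])
    # Reference spectrum (assumption): per-feature median across all samples.
    reference = [median(row[j] for row in samples) for j in range(n_features)]
    normalised = []
    for row in samples:
        # Quotients against the reference, skipping zero reference values.
        quotients = [x / r for x, r in zip(row, reference) if r > 0]
        # The median quotient is taken as the sample's dilution factor.
        factor = median(quotients)
        normalised.append([x / factor for x in row])
    return normalised


def glog(value, lam):
    """Generalised log transform: ln(y + sqrt(y^2 + lambda))."""
    return math.log(value + math.sqrt(value * value + lam))


if __name__ == "__main__":
    # Three samples with the same profile at different dilutions.
    spectra = [[1.0, 2.0, 4.0], [2.0, 4.0, 8.0], [4.0, 8.0, 16.0]]
    for row in pqn_normalise(spectra):
        print([round(glog(x, lam=1.0), 3) for x in row])
```

In the toy data the three samples differ only by dilution, so after PQ normalisation they become identical. The G-log behaves like an ordinary log for large intensities but stays finite near zero.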

Birmingham metabolomics workflow

Many tools

Many languages

Complex to learn

Many parameters

Complex to report

Metabolomics workflow in Galaxy

User sees website (intuitive)

Centrally stored (secure)

Workflow is recorded

Methods shareable

View, share, edit, rerun workflow

Citable workflow

Add as supplemental files or publish with distinct DOI via GigaDB or FigShare

Where to get our workflow

Coming soon!

Galaxy Toolshed

GitHub

Submitted to GigaScience (gigasciencejournal.com)

VM/Code/TestData to be available on GigaDB.org

Test server to be available at GigaGalaxy: http://galaxy.cbiit.cuhk.edu.hk/

Summary

Share your data

Share your software

Share your workflow – in full

Galaxy is not a new 'software', it's a flexible sharing platform

Add your tools to ours, in Galaxy Toolshed

Help make metabolomics: trustworthy, meaningful, reproducible

Acknowledgements

University of Birmingham: Ralf Weber, Mark Viant

GigaScience: Pete Li

Funding: NERC NE/K011294/1

Me: Rob L. Davidson

http://orcid.org/0000-0002-0311-9774

This presentation:

http://bit.ly/1EPTbme

http://dx.doi.org/10.6084/m9.figshare.1243500