Merging and sharing Metabolomics analysis tools with Galaxy: transparent, reproducible, open 'omics...
Merging and sharing Metabolomics analysis tools with Galaxy:
transparent, reproducible, open 'omics
Robert L Davidson PhD. @bobbledavidson
#MMW2014
Merlion Metabolomics Workshop 2014
http://dx.doi.org/10.6084/m9.figshare.1243500
http://bit.ly/1EPTbme
Researcher bias
Positive result bias
20 teams do studies, 1 publishes p<0.05
Poorly explained analyses
DOI: 10.1371/journal.pmed.0020124
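The "20 teams, 1 publishes" point can be checked with a little arithmetic. A minimal sketch, assuming the 20 studies are independent, there is no real effect to find, and every team uses a 0.05 significance threshold (the independence assumption and the function name are illustrative, not from the slides):

```python
def prob_at_least_one_false_positive(n_teams, alpha=0.05):
    """Chance that at least one of n independent null studies reaches p < alpha."""
    return 1.0 - (1.0 - alpha) ** n_teams


if __name__ == "__main__":
    p = prob_at_least_one_false_positive(20)
    print(f"{p:.2f}")  # about 0.64
```

Under those assumptions it is more likely than not that at least one team sees a "significant" result by chance alone, and that is the result most likely to be published.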
85% of research resources are wasted!
We must ... favor ... unbiased, transparent, collaborative research with greater standardization
Share data, protocols, materials, software, other tools
DOI: 10.1371/journal.pmed.1001747
Data sharing
Supported by gov policy: e.g. UK and NIH
MetaboLights repository: www.ebi.ac.uk/metabolights/
NIH Metabolomics Data Repository: www.metabolomicsworkbench.org/data/index.php
ISA-Tab for metadata: http://www.isa-tools.org/format.html
What about methods?
http://reproducibility.cs.arizona.edu/
“The good news is that I was able to find some code. I am just hoping that it is a stable working version of the code... I have lost some data... The bad news is that the code is not commented and/or clean. So, I cannot really guarantee that you will enjoy playing with it.”
613 papers tested
123 successful reproductions
Problem
There is a reproducibility crisis
Published results are untrustworthy
Research is a waste of government money (85%)
What's the solution?
Share data AND methods
Galaxy
Over 36,000 main Galaxy server users
Over 1,000 papers citing Galaxy use
Over 55 Galaxy servers deployed
Open source
http://galaxyproject.org
Galaxy – Toolshed
https://toolshed.g2.bx.psu.edu/
Many 'omics, stats, visualisations
Metabolomics can plug into this
2700+ tools!
Download; run instantly
Any tool in Galaxy
<tool id="myfunction" name="myfunction">
  <command>
    python myfunction $input1
  </command>
  <inputs>
    <param name="input1" type="data" format="txt"/>
  </inputs>
  <outputs>
    <data name="output1" format="csv"/>
  </outputs>
</tool>
Basic XML 'wrapper'
Describes inputs and outputs
Calls command
Monitors for output
Logs/returns to 'history'
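To make the wrapper concrete, here is a minimal sketch of the kind of script it could call. Everything in it is an assumption for illustration: the slides do not show what `myfunction` does or how it receives its output path, so this hypothetical version simply converts a whitespace-delimited text table into CSV and takes the output path as a second argument. Galaxy would substitute real file paths into the command line, run it, and collect the output file into the history.

```python
import csv
import sys


def parse_rows(lines):
    """Split whitespace-delimited lines into fields, skipping blank lines."""
    return [line.split() for line in lines if line.strip()]


def txt_to_csv(in_path, out_path):
    """Read a text table from in_path and write it to out_path as CSV."""
    with open(in_path, encoding="utf-8") as src:
        rows = parse_rows(src)
    with open(out_path, "w", newline="", encoding="utf-8") as dst:
        csv.writer(dst).writerows(rows)
    return len(rows)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical invocation: python myfunction.py <input file> <output file>
    txt_to_csv(sys.argv[1], sys.argv[2])
```

The script knows nothing about Galaxy. That is the point of the wrapper approach: an existing command-line tool in any language can be plugged in by describing its inputs and outputs.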
Galaxy
Panels: Tool List, Tool Parameters, History/results
Birmingham metabolomics workflow
SIM-Stitch DOI:10.1016/j.jasms.2009.02.001
XCMS DOI:10.1021/ac051437y
MI-Pack DOI:10.1016/j.chemolab.2010.04.010
KNN Impute DOI:10.1007/s11306-011-0366-4
PQ-Normalisation DOI:10.1021/ac051632c
G-Log transform DOI:10.1186/1471-2105-8-234
PCA (with statistical test of scores)
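Two of the steps listed above are simple enough to sketch in a few lines. The following is an illustrative, standard-library-only sketch of probabilistic quotient normalisation and a generalised log transform. It is not the Birmingham implementation: the median reference spectrum, the form glog(x) = ln(x + sqrt(x^2 + lambda)), and the default value of `lam` are assumptions chosen for illustration, and in practice lambda would be estimated from the data.

```python
import math
from statistics import median


def pqn_normalise(samples):
    """Probabilistic quotient normalisation (sketch).

    samples: list of samples, each a list of positive peak intensities
    with the same features in the same order. Returns rescaled copies.
    """
    n_features = len(samples[0])
    # Reference spectrum: per-feature median across all samples (an assumption).
    reference = [median(s[j] for s in samples) for j in range(n_features)]
    normalised = []
    for s in samples:
        # Quotient of each feature against the reference, skipping zero references.
        quotients = [s[j] / reference[j] for j in range(n_features) if reference[j] > 0]
        factor = median(quotients)  # most probable dilution factor for this sample
        normalised.append([x / factor for x in s])
    return normalised


def glog(x, lam=1.0):
    """Generalised log: close to ln(2x) for large x, finite at x = 0."""
    return math.log(x + math.sqrt(x * x + lam))


if __name__ == "__main__":
    # Three samples that differ only by dilution collapse onto one profile.
    print(pqn_normalise([[1, 2, 4], [2, 4, 8], [4, 8, 16]]))
```

Each function here has at least one tunable choice (reference spectrum, lambda), which is why a recorded workflow that captures the parameters matters for reporting.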
Birmingham metabolomics workflow
Many tools
Many languages
Complex to learn
Many parameters
Complex to report
Metabolomics workflow in Galaxy
User sees website (intuitive)
Centrally stored (secure)
Workflow is recorded
Methods shareable
View, share, edit, rerun workflow
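Because the workflow is recorded, it can be exported as a file and inspected outside Galaxy. A minimal sketch, assuming a workflow exported in Galaxy's JSON-based `.ga` format, where a top-level `steps` object maps step indices to entries with `name` and `tool_id` fields. The tiny example document and its tool ids are hypothetical stand-ins, not the Birmingham workflow.

```python
import json

# Hypothetical, heavily trimmed stand-in for an exported .ga workflow file.
EXAMPLE_GA = """
{
  "a_galaxy_workflow": "true",
  "name": "Example metabolomics workflow",
  "steps": {
    "0": {"id": 0, "name": "Input dataset", "type": "data_input", "tool_id": null},
    "1": {"id": 1, "name": "PQ-Normalisation", "type": "tool", "tool_id": "example_pqn"},
    "2": {"id": 2, "name": "G-Log transform", "type": "tool", "tool_id": "example_glog"}
  }
}
"""


def list_steps(ga_text):
    """Return (index, step name, tool id) tuples sorted by step index."""
    workflow = json.loads(ga_text)
    steps = workflow["steps"]
    return [
        (int(key), steps[key]["name"], steps[key].get("tool_id"))
        for key in sorted(steps, key=int)
    ]


if __name__ == "__main__":
    for index, name, tool_id in list_steps(EXAMPLE_GA):
        print(index, name, tool_id or "(no tool)")
```

A listing like this could accompany a paper's methods section, alongside the exported file itself.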
Citable workflow
Add as supplemental files or publish with distinct DOI via GigaDB or FigShare
Where to get our workflow
Coming soon!
Galaxy Toolshed
GitHub
Submitted to GigaScience (gigasciencejournal.com)
VM/Code/TestData to be available on GigaDB.org
Test server to be available at GigaGalaxy: http://galaxy.cbiit.cuhk.edu.hk/
Summary
Share your data
Share your software
Share your workflow – in full
Galaxy is not a new 'software', it's a flexible sharing platform
Add your tools to ours, in Galaxy Toolshed
Help make metabolomics: Trustworthy, meaningful, reproducible
Acknowledgements
University of Birmingham: Ralf Weber, Mark Viant
GigaScience: Pete Li
Funding: NERC NE/K011294/1
http://orcid.org/0000-0002-0311-9774
Me: Rob L. Davidson
This presentation:
http://bit.ly/1EPTbme
http://dx.doi.org/10.6084/m9.figshare.1243500