Tools for reproducible and accessible science
KnitR, VMs and OMERO
Rob Davidson
Cardiac Physiome Workshop
Auckland, April 8th 2015
DOI for this talk: 10.6084/m9.figshare.1368774
Today’s message
• Tools that fit with GigaDB
  – General purpose Research Object store
• Enhancing
  – Accessibility
  – Reproducibility
• Of some of your research objects
  – Software
  – Images
Problems with scientific software - reproducibility
Measuring software reproducibility
• Systematic study:
  – 515 papers (429 conference, 86 journal)
  – <30% reproducible
http://reproducibility.cs.arizona.edu
Measuring software reproducibility
http://reproducibility.cs.arizona.edu
Reasons for failure
“The good news is that I was able to find some code. I am just hoping that it is a stable working version of the code... I have lost some data... The bad news is that the code is not commented and/or clean. So, I cannot really guarantee that you will enjoy playing with it.”
http://reproducibility.cs.arizona.edu
Cost of failure
• Wasted time
• Wasted money
  – Ioannidis 2014: 85% of resources wasted
• Frustrating
• Distrust
DOI: 10.1371/journal.pmed.1001747
Literate programming - KnitR
Literate programming
• Let us change our traditional attitude to the construction of programs: Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to humans what we want the computer to do.
  – Donald E. Knuth, Literate Programming, 1984
Literate programming options
• See listing: http://www.gigasciencejournal.com/content/3/1/19
  – R: KnitR, Sweave, R Markdown
  – JavaScript: Tangle, Active Markdown (CoffeeScript)
  – Python: IPython Notebooks
  – iReport links this functionality for Galaxy
KnitR is versatile
Diagram labels: R, Python, Ruby, Haskell, Perl, SAS, Coffeescript, .txt, LaTeX, HTML, D3.js, R Markdown, HTML5 slides, command line, any text?, WordPress
KnitR – how does it work?
• Code chunks
  – Basic text (or LaTeX or markdown), interrupted by ‘chunks’ of code
• For LaTeX, similar to Sweave
  …some text \Sexpr{rfunc(var)} more text…
  …some text
  <<language, chunk_name, chunk_options>>=
  Some code
  @
• Process this combined text/code with knit() in R
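The chunk mechanism on this slide can be illustrated with a toy sketch in Python. This is not how knitr itself is implemented; it only mimics the weaving idea: plain text passes through untouched, while each `<<...>>=` ... `@` chunk is executed and replaced by whatever it printed. The `weave` function and the sample document are hypothetical, and inline `\Sexpr{}` expressions are left out.

```python
import contextlib
import io
import re

# A noweb-style chunk: a "<<options>>=" header line, the code, then "@" on its own line.
CHUNK = re.compile(
    r"^<<(?P<opts>[^>]*)>>=\n(?P<code>.*?)^@[ \t]*\n?",
    re.DOTALL | re.MULTILINE,
)


def weave(source):
    """Toy weave: run each chunk and splice its printed output into the text."""
    env = {}  # shared namespace, so later chunks can use earlier variables

    def run_chunk(match):
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(match.group("code"), env)
        return buffer.getvalue()

    return CHUNK.sub(run_chunk, source)


# Hypothetical input document with one chunk.
document = """Some text before the chunk.
<<python, mean_chunk>>=
values = [2, 4, 6]
print("mean =", sum(values) / len(values))
@
More text after the chunk.
"""

woven = weave(document)
print(woven)
```

Because the result is regenerated from the code every time the document is woven, the text and the numbers in it cannot drift apart.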
KnitR uses: easy to explain
http://reproducibility.cs.arizona.edu
KnitR uses: reproducible analysis
• Can string different tools/languages together
• Stores parameters
• Just like a pipeline/workflow system
  – E.g. Galaxy, Taverna, KNIME
• But also: codifies your figures…
KnitR uses: codified figures
• Classic problems:
  – No description of error bars
  – No description of distributions
• Admittedly this could be fixed by ‘proper’ peer review
Source code: http://bit.ly/1NQZlHh
KnitR uses: codified figures
• Code can be found quickly
  – Using text as markers
• Plot can be altered with 1 line of code
• New visualisation produced instantaneously
• Better evaluation of results
Source code: http://bit.ly/1NQZlHh
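A minimal Python sketch of what codifying a figure can buy, separate from the source code linked above. The definition of the error bars lives in the code, so a reader can see it and change it in one line. The measurements are made-up illustrative values, and `summarise` and `ERROR_BAR` are hypothetical names.

```python
import math
import statistics

# Made-up illustrative measurements; not data from the talk.
groups = {
    "control": [4.1, 3.9, 4.4, 4.0],
    "treated": [5.2, 4.8, 5.5, 5.1],
}

# The one line a reader would edit to change what the error bars mean.
ERROR_BAR = "sem"  # "sem" = standard error of the mean, "sd" = standard deviation


def summarise(values, error_bar=ERROR_BAR):
    """Return (mean, error) with the error-bar definition made explicit."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    if error_bar == "sd":
        return mean, sd
    if error_bar == "sem":
        return mean, sd / math.sqrt(len(values))
    raise ValueError(f"unknown error bar type: {error_bar}")


for name, values in groups.items():
    mean, err = summarise(values)
    print(f"{name}: {mean:.2f} +/- {err:.2f} ({ERROR_BAR}, n={len(values)})")
```

Each printed line states which kind of error bar was used and the sample size, which addresses the "no description of error bars" problem at the point where the numbers are produced.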
GigaScience KnitR example
• “This article is an example of a literate programming document. It has been created in R using the knitr package. Figures and tables in this paper are generated dynamically as the document is compiled. Several R packages are required to run the analysis. Materials are archived in the Gigascience database”
DOI: 10.1186/2047-217X-3-3
Environment wrappers - VMs
Measuring software reproducibility
http://reproducibility.cs.arizona.edu
Your environment
• How hard would it be to start from scratch?
• What if you move from Ubuntu to CentOS? Or just upgrade?
• Dependencies / Versions
• System settings
• Hard for you, horrendous for others!
Share your environment
• Virtual machine
  – Copy your exact environment
  – If it works for you, it works for anyone
  – Reproducibility, frozen in time
DOI: 10.1186/2047-217X-3-23
Share your environment
• Docker
  – ‘Light’ VM
  – Discrete unit of code+environment
  – Can be called from command line
  – Can be linked together
• New possibilities e.g. nucleotid.es
  – Benchmarking -> “data-driven peer-review”?
http://nucleotid.es/
Share your environment
• Some concerns:
  – http://ivory.idyll.org/blog/vms-considered-harmful.html
  – VM = black box?
  – Docker == black box!
Solution -> codify the environment
Codify your environment
• Provisioning scripts are ‘research objects’
• Improves adaptability (easier to recode for alternative OS etc)
• Builds in extra documentation
• Easier to share, although GigaDB still wants a compiled snapshot (i.e. full machine)
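One lightweight first step towards codifying an environment, sketched in Python with only the standard library. It records the operating system, interpreter version and installed package versions as a JSON manifest. This is not a provisioning script for any of the tools named in this talk, only the kind of record from which one could be written; the function name and manifest layout are assumptions.

```python
import json
import platform
from importlib import metadata


def describe_environment():
    """Collect OS, interpreter and installed package versions."""
    packages = {}
    for dist in metadata.distributions():
        meta = dist.metadata
        name = meta.get("Name") if meta is not None else None
        if name:  # skip distributions with missing or broken metadata
            packages[name] = meta.get("Version")
    return {
        "os": platform.platform(),
        "python": platform.python_version(),
        "packages": dict(sorted(packages.items())),
    }


manifest = describe_environment()
print(json.dumps(manifest, indent=2))
```

Unlike a full machine snapshot, a manifest like this is plain text, so it can be read, compared between machines and kept under version control alongside the analysis code.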
Short list of provisioning systems
• Vagrant
• Chef
• Salt
• Puppet
• Ansible
• Many more: see link for info
Source: http://bit.ly/1wrYiuI
Images: release ALL the images with OMERO
“And now for something completely different”
NO
Phenotyping with microCT
doi:10.1186/2047-217X-2-14
NO
Phenotyping with microCT
doi:10.1186/2047-217X-3-6
Hosting Images
• Image LIMS
• MetaData!!!
• Can handle most formats
• Web embedding
• View online, no need for software
• Open Source
www.openmicroscopy.org/site/products/omero
OMERO: providing access to imaging data
View, filter, measure raw images with direct links from journal article.
See all image data, not just cherry picked examples.
Download and reprocess.
OMERO: Adding value
http://jcb-dataviewer.rupress.org/
The alternative...
...look but don't touch
Thanks for listening!
Acknowledgements
• GigaTeam
  – Scott Edmunds
  – Peter Li
  – Chris Hunter
  – Jesse Xiao
  – Nicole Edmunds
  – Laurie Goodman
Where to get these slides
• FigShare DOI: 10.6084/m9.figshare.1368774
• http://bit.ly/1JmnRiU