E-Biogenouest: a regional Life Sciences initiative for data integration
![Page 1: E- Biogenouest : a regional Life Sciences initiative for data integration](https://reader036.fdocuments.us/reader036/viewer/2022081603/56813ba9550346895da4daf8/html5/thumbnails/1.jpg)
E-BIOGENOUEST: A REGIONAL LIFE SCIENCES INITIATIVE FOR DATA INTEGRATION
Datacite Annual Conference 2014 - Nancy
Olivier Collin – IRISA/INRIA
http://www.genouest.org
![Page 2: E- Biogenouest : a regional Life Sciences initiative for data integration](https://reader036.fdocuments.us/reader036/viewer/2022081603/56813ba9550346895da4daf8/html5/thumbnails/2.jpg)
Agenda
• Context
  • Biogenouest
  • Biology
• The E-Biogenouest project
  • “Bridging data, metadata and computation”
  • A system of systems: collaborative portal, metadata management environment, data analysis portal
![Page 3: E- Biogenouest : a regional Life Sciences initiative for data integration](https://reader036.fdocuments.us/reader036/viewer/2022081603/56813ba9550346895da4daf8/html5/thumbnails/3.jpg)
Biogenouest
Biogenouest is a network bringing together technological core facilities dedicated to Life and Environmental Sciences in the West of France
![Page 4: E- Biogenouest : a regional Life Sciences initiative for data integration](https://reader036.fdocuments.us/reader036/viewer/2022081603/56813ba9550346895da4daf8/html5/thumbnails/4.jpg)
Biogenouest
Created in 2002, Biogenouest coordinates 31 technological core facilities based in the regions of Brittany and Pays de la Loire, with the aim of organizing and pooling interregional resources.
Biogenouest also federates 70 research units involved in thematic research covering 4 areas of activity: Marine resources, Agri-food, Health and Bioinformatics.
![Page 5: E- Biogenouest : a regional Life Sciences initiative for data integration](https://reader036.fdocuments.us/reader036/viewer/2022081603/56813ba9550346895da4daf8/html5/thumbnails/5.jpg)
GenOuest: Bioinformatics core facility
• Member of the Biogenouest network
• Member of the IFB: French Bioinformatics Institute
• National recognition: IBiSA platform
• Regional strategic facility for INRA (National Institute for Agricultural Research)
• ISO 9001:2008 certified
• Established in 2002
• 10 to 12 people
• Computing infrastructure, storage, software development, expertise, R&D projects
![Page 6: E- Biogenouest : a regional Life Sciences initiative for data integration](https://reader036.fdocuments.us/reader036/viewer/2022081603/56813ba9550346895da4daf8/html5/thumbnails/6.jpg)
R&D projects
[Diagram labels: Computation; Data; Workflows; Portals; Collaboration; Grid; Cloud; Cluster; BioMAJ; SeqCrawler; MetaData; EMME; HubZero; Galaxy; Mobyle; Mobyle2; Ontologies; Biosciences]
![Page 7: E- Biogenouest : a regional Life Sciences initiative for data integration](https://reader036.fdocuments.us/reader036/viewer/2022081603/56813ba9550346895da4daf8/html5/thumbnails/7.jpg)
R&D projects
[Diagram: the R&D projects map (Computation, Data, Workflows, Portals, Collaboration and the associated tools), with the E-Biogenouest label added]
![Page 8: E- Biogenouest : a regional Life Sciences initiative for data integration](https://reader036.fdocuments.us/reader036/viewer/2022081603/56813ba9550346895da4daf8/html5/thumbnails/8.jpg)
![Page 9: E- Biogenouest : a regional Life Sciences initiative for data integration](https://reader036.fdocuments.us/reader036/viewer/2022081603/56813ba9550346895da4daf8/html5/thumbnails/9.jpg)
Context
Kahn. On the future of genomic data. Science (2011) vol. 331 (6018) pp. 728-9
Now: Genomics: Next Generation Sequencing
Next: Proteomics
Next: Bio-imaging
Digital data: huge amounts, heterogeneous
Critical situation for some laboratories
![Page 10: E- Biogenouest : a regional Life Sciences initiative for data integration](https://reader036.fdocuments.us/reader036/viewer/2022081603/56813ba9550346895da4daf8/html5/thumbnails/10.jpg)
E-BIOGENOUEST
![Page 11: E- Biogenouest : a regional Life Sciences initiative for data integration](https://reader036.fdocuments.us/reader036/viewer/2022081603/56813ba9550346895da4daf8/html5/thumbnails/11.jpg)
E-Biogenouest
• Started in May 2012 for 3 years
• Funded by Brittany and Pays de la Loire
• E-science initiative for the Biogenouest network
• Community building
• Training/workshops
• Roadmap preparation
• Experimentation/Pilot project: Virtual Research Environment (VRE)
![Page 12: E- Biogenouest : a regional Life Sciences initiative for data integration](https://reader036.fdocuments.us/reader036/viewer/2022081603/56813ba9550346895da4daf8/html5/thumbnails/12.jpg)
A system of systems
• Combination of various tools
• A data analysis portal: Galaxy
• A metadata management tool: ISAtools suite
• A collaborative portal: HubZero
• Additional utilities:
  • Pydio: file transfer
• Some software glue to make it work…
  • BioBlend: Galaxy API
  • In-house developments
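As an illustration of the kind of glue BioBlend makes possible, here is a minimal Python sketch that creates a Galaxy history and uploads local files into it. It is not the project's actual in-house code: the helper names `connect` and `upload_to_new_history`, the server URL, and the API key are assumptions made for the example. Only the BioBlend calls `GalaxyInstance`, `histories.create_history`, and `tools.upload_file` come from the library itself.

```python
from pathlib import Path


def connect(url: str, api_key: str):
    """Return a BioBlend GalaxyInstance (requires `pip install bioblend`)."""
    # Imported lazily so the helper below can be exercised without a server.
    from bioblend.galaxy import GalaxyInstance

    return GalaxyInstance(url=url, key=api_key)


def upload_to_new_history(gi, files, history_name):
    """Create a Galaxy history and upload each local file into it.

    `gi` is expected to behave like bioblend.galaxy.GalaxyInstance.
    Returns the history id and the ids of the datasets Galaxy created.
    """
    history = gi.histories.create_history(name=history_name)
    dataset_ids = []
    for path in files:
        result = gi.tools.upload_file(str(Path(path)), history["id"])
        dataset_ids.extend(d["id"] for d in result["outputs"])
    return history["id"], dataset_ids


# Hypothetical usage (URL, key and file name are placeholders):
# gi = connect("https://galaxy.example.org", "YOUR_API_KEY")
# history_id, datasets = upload_to_new_history(gi, ["reads.fastq"], "pilot-project")
```

Passing the Galaxy connection in as a parameter keeps the upload logic independent of any particular server, which makes it easier to reuse from other components of the system.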
![Page 13: E- Biogenouest : a regional Life Sciences initiative for data integration](https://reader036.fdocuments.us/reader036/viewer/2022081603/56813ba9550346895da4daf8/html5/thumbnails/13.jpg)
Galaxy portal
• Galaxy: a web-based portal for biomedical data analysis
  • Intuitive interface
  • Workflows
• Galaxy@Genouest
  • 800 tools (transcriptomics, population genetics, quantitative genetics, metagenomics, proteomics, etc.)
• http://galaxyproject.org/
Giardine B, Riemer C, Hardison RC, Burhans R, Elnitski L, Shah P, Zhang Y, Blankenberg D, Albert I, Taylor J, Miller W, Kent WJ, Nekrutenko A. "Galaxy: a platform for interactive large-scale genome analysis." Genome Research. 2005 Oct; 15(10):1451-5.
![Page 14: E- Biogenouest : a regional Life Sciences initiative for data integration](https://reader036.fdocuments.us/reader036/viewer/2022081603/56813ba9550346895da4daf8/html5/thumbnails/14.jpg)
ISAtools Suite
• Open Source tools for experimental metadata management
  • Enforces the description of experiments with standards or ontologies
  • Creates a local repository
  • Allows publication to public repositories
• ISA@GenOuest = EMME
  • Additional developments and auxiliary tools
• http://www.isa-tools.org/
Rocca-Serra, P. et al. ISA software suite: supporting standards-compliant experimental annotation and enabling curation at the community level. Bioinformatics 26, 2354–6 (2010).
![Page 15: E- Biogenouest : a regional Life Sciences initiative for data integration](https://reader036.fdocuments.us/reader036/viewer/2022081603/56813ba9550346895da4daf8/html5/thumbnails/15.jpg)
EMME
[Diagram labels: Wet Lab Experiment; Data; MetaData; IsaTools; ISAtab files; ISAarchive; Link to raw data]
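To make the "link to raw data" idea concrete, here is a small Python sketch that reads a tab-delimited ISA-Tab assay table and collects the raw data files listed for each sample. It assumes the table has "Sample Name" and "Raw Data File" columns; the function name `raw_data_links` and the example rows are hypothetical and do not describe how EMME itself is implemented.

```python
import csv
import io


def raw_data_links(assay_table: str):
    """Map each sample name to the raw data files listed for it.

    `assay_table` is the text of a tab-delimited ISA-Tab assay file
    that has "Sample Name" and "Raw Data File" columns.
    """
    links = {}
    reader = csv.DictReader(io.StringIO(assay_table), delimiter="\t")
    for row in reader:
        sample = row["Sample Name"]
        raw_file = row["Raw Data File"]
        if raw_file:  # skip rows with no raw data attached
            links.setdefault(sample, []).append(raw_file)
    return links


# Hypothetical assay table, for illustration only.
example = (
    "Sample Name\tAssay Name\tRaw Data File\n"
    "sample_A\tassay_1\tsample_A_R1.fastq.gz\n"
    "sample_A\tassay_1\tsample_A_R2.fastq.gz\n"
    "sample_B\tassay_2\tsample_B_R1.fastq.gz\n"
)
print(raw_data_links(example))
```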
![Page 16: E- Biogenouest : a regional Life Sciences initiative for data integration](https://reader036.fdocuments.us/reader036/viewer/2022081603/56813ba9550346895da4daf8/html5/thumbnails/16.jpg)
EMME
[Diagram labels: Wet Lab Experiment; Data; MetaData; ISAarchive; Galaxy; Import; Decompress; Import; Data Analysis]
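The decompress step on this slide could look roughly like the Python sketch below. It assumes the ISA archive is a zip file (the slides do not state the archive format) and relies on the ISA-Tab file naming convention: `i_` for investigation, `s_` for study, `a_` for assay. The function name `unpack_isa_archive` is hypothetical.

```python
import zipfile
from pathlib import Path


def unpack_isa_archive(archive_path, target_dir):
    """Extract a zipped ISA archive and sort its files by ISA-Tab role.

    ISA-Tab names files by prefix: i_ (investigation), s_ (study), a_ (assay).
    Everything else (for example data files) is reported under "other".
    """
    target = Path(target_dir)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
        # Directory entries end with "/" and are not files to import.
        names = [n for n in archive.namelist() if not n.endswith("/")]

    roles = {"investigation": [], "study": [], "assay": [], "other": []}
    prefixes = {"i_": "investigation", "s_": "study", "a_": "assay"}
    for name in sorted(names):
        role = prefixes.get(Path(name).name[:2], "other")
        roles[role].append(target / name)
    return roles
```

The returned paths can then be handed to whatever import mechanism the analysis portal provides.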
![Page 17: E- Biogenouest : a regional Life Sciences initiative for data integration](https://reader036.fdocuments.us/reader036/viewer/2022081603/56813ba9550346895da4daf8/html5/thumbnails/17.jpg)
HubZero
• Scientific web portal
  • Collaboration: wiki, blog, etc.
  • Resources: results, articles, presentations, etc.
  • Lightweight project management
• https://hubzero.org/
M. McLennan, R. Kennell, "HUBzero: A Platform for Dissemination and Collaboration in Computational Science and Engineering," Computing in Science and Engineering, 12(2), pp. 48-52, March/April, 2010
![Page 18: E- Biogenouest : a regional Life Sciences initiative for data integration](https://reader036.fdocuments.us/reader036/viewer/2022081603/56813ba9550346895da4daf8/html5/thumbnails/18.jpg)
Continuum
• Continuum for the management and analysis of biological data
• Collaborative environment
[Diagram labels: HubZero; Galaxy; EMME]
![Page 19: E- Biogenouest : a regional Life Sciences initiative for data integration](https://reader036.fdocuments.us/reader036/viewer/2022081603/56813ba9550346895da4daf8/html5/thumbnails/19.jpg)
VRE: Virtual Research Environment
[Diagram:
Data: Versioning, Provenance, Security, Sharing
Workflows: Versioning, Provenance, Security, Sharing
Web portal: Project management, Collaboration, Dissemination
Data infrastructure
Computing infrastructure]
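As one possible reading of "versioning" and "provenance" for data, the Python sketch below records a content checksum for each version of a dataset together with a pointer to the version it was derived from. The record fields and the function name `provenance_record` are assumptions made for illustration; the slides do not specify how the VRE stores this information.

```python
import hashlib
import json
from datetime import datetime, timezone


def provenance_record(data: bytes, source: str, produced_by: str, parent_sha256=None):
    """Describe one version of a dataset.

    The SHA-256 digest identifies the exact content (versioning);
    `produced_by` and `parent_sha256` say how it was derived (provenance).
    """
    return {
        "sha256": hashlib.sha256(data).hexdigest(),
        "source": source,
        "produced_by": produced_by,
        "parent_sha256": parent_sha256,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }


# Hypothetical example: a raw file and a derived file that points back to it.
raw = provenance_record(b"ACGT\n", "sequencer run", "wet lab experiment")
trimmed = provenance_record(
    b"ACG\n", "trimming step", "workflow v1", parent_sha256=raw["sha256"]
)
print(json.dumps(trimmed, indent=2))
```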
![Page 20: E- Biogenouest : a regional Life Sciences initiative for data integration](https://reader036.fdocuments.us/reader036/viewer/2022081603/56813ba9550346895da4daf8/html5/thumbnails/20.jpg)
A paradigm shift
[Diagram: two panels labelled "From…" and "To…", each showing Data and IT Environment]
![Page 21: E- Biogenouest : a regional Life Sciences initiative for data integration](https://reader036.fdocuments.us/reader036/viewer/2022081603/56813ba9550346895da4daf8/html5/thumbnails/21.jpg)
Next steps
• What we learned:
  • Acceptance and adoption are key issues
• What we will do:
  • Switch to a production environment
  • Identity federation
  • ISA-Dataflow: metadata for bioinformatics workflows
• What we need to do:
  • To connect to other initiatives
  • To define the perimeter:
• Big changes for bioinformatics facilities
![Page 22: E- Biogenouest : a regional Life Sciences initiative for data integration](https://reader036.fdocuments.us/reader036/viewer/2022081603/56813ba9550346895da4daf8/html5/thumbnails/22.jpg)
Conclusion
• Biology becomes a digital science
• New technologies with lower costs create a dangerous situation
• A system of systems: “metadata + collaborative tool + analysis portal”
• Continuum: data-centered philosophy
“Bring back Biology to the biologist”