Approach and outcome of the Biodiversity Virtual e-Laboratory (BioVeL) project
Funded by the
BioVeL - Approach and outcome of the Biodiversity Virtual e-Laboratory project
Alex Hardisty, Coordinator
Cardiff University, United Kingdom
13th November 2014, Paris, France
Final event: BioVeL In Practice and In Future
Overview
1. What we did: Background, objectives and approach
2. What was achieved: Infrastructure and important outcomes
3. What did we learn: Lessons and future development
Background to the work
2001: GRAB demonstrator links climate, species and geographic data in an “e-Science environment” with a simple static workflow
2003-2006: Biodiversity World prototype applied workflow techniques to model climate preferences for the Leguminosae
European Networks of Excellence make case for LifeWatch in ESFRI 2006 roadmap
2008-2011: ESFRI LifeWatch research infrastructure: “Preparatory Phase” project adopts Service Network (“as a Service” model) and workflow paradigms as basis of architecture
2011-2014: BioVeL project explores the practicalities and offers a pilot service for scientists
Parallel developments in the USA, of course
Where we fit in
Important contribution to infrastructure
Diagram labels: Data acquisition; Data curation; Data access; Data processing and analysis; Synthesis Centres
• Biodiversity monitoring and research networks: LTER / NEON, Genomic Observatories, EMBRC, Natural History Museums, GEO BON / EU BON, EMBOS, BioSOS, citizen observatories
• Biodiversity information systems: ViBRANT Scratchpads, CoL i4Life, PESI, WORMS, OBIS, GBIF, BExIS, BOLD, AquaMaps, agINFRA, pro-iBioSphere, OpenUp!, BioFresh, Dryad, Pangaea, GFBio, ALA, SiBBr/SpeciesLink, GBoWS/CAS, SANBI, etc.
• Biodiversity e-science infrastructures: LifeWatch, BioVeL, iMarine, EUBrazilOpenBio
• DataONE
Objectives of the project
• Provide (web) services for the interdisciplinary analysis of biodiversity
• Provide analytical pipelines (workflows) based on these services
• Desired functionalities
– Access data from cross-disciplinary resources (data mining)
– Access analytical methods from a range of disciplines (interoperability)
– Digest large data (scalability)
– Repeat complex analytical processes (reproducibility)
– Access to virtual communities (sociability)
Overall: Build an infrastructure to facilitate cross-disciplinary and holistic analytical approaches in biodiversity and ecosystem research
Building a heterogeneous Service Network
Technical objective: An informatics infrastructure for the next decade
Users’ workflows and applications
Sustained Service and Data Providers: GBIF, CoL, OBIS, WoRMS, EMBL-EBI, BGBM, CRIA, EoL, BHL, ALA, LTER, etc. & more (www.biodiversitycatalogue.org)
Recognised and stable Infrastructure Providers: National, EGI.eu, PRACE, commercial, EUDAT, etc.
Creating powerful data virtual laboratories
Technical objective: Flexible, re-usable, adjustable workflows
Study 1: create a workflow
e.g., Study ecological niche of south east Asian horseshoe crab
• Import south east Asian data from external library
• Apply succession of “services” = workflow
• Result: ecological niche map
Study 2: re-use a workflow
e.g., Study niche of American horseshoe crab
• Import American data
• Re-use south east Asian crab study workflow
Study 3: modify a workflow
e.g., substitute a different model validation method or produce the output in a different format
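The create / re-use / modify pattern in the three studies can be pictured as plain function composition. The sketch below is illustrative only: the step functions (clean_records, fit_niche_model, validate_simple, validate_alternative) and the coordinates are made-up stand-ins, not BioVeL services or real occurrence data, and the "model" is just a bounding box.

```python
from typing import Any, Callable, List

Service = Callable[[Any], Any]


def make_workflow(services: List[Service]) -> Service:
    """Chain services so that each one's output feeds the next."""
    def run(data: Any) -> Any:
        for service in services:
            data = service(data)
        return data
    return run


# Hypothetical stand-ins for remote web services.
def clean_records(records):
    # Drop occurrence records that lack coordinates.
    return [r for r in records if r["lat"] is not None and r["lon"] is not None]


def fit_niche_model(records):
    # Toy "model": the bounding box of the occurrences.
    lats = [r["lat"] for r in records]
    lons = [r["lon"] for r in records]
    return {"lat_range": (min(lats), max(lats)),
            "lon_range": (min(lons), max(lons)),
            "n": len(records)}


def validate_simple(model):
    return {**model, "validation": "simple"}


def validate_alternative(model):
    return {**model, "validation": "alternative"}


# Study 1: create a workflow and run it on the first data set (made-up points).
niche_workflow = make_workflow([clean_records, fit_niche_model, validate_simple])
asian_records = [{"lat": 1.3, "lon": 103.8},
                 {"lat": 13.7, "lon": 100.5},
                 {"lat": None, "lon": 101.0}]
asian_map = niche_workflow(asian_records)

# Study 2: re-use the same workflow, unchanged, on different data.
american_records = [{"lat": 39.0, "lon": -75.0}, {"lat": 41.5, "lon": -70.7}]
american_map = niche_workflow(american_records)

# Study 3: modify the workflow by substituting the validation step.
modified_workflow = make_workflow([clean_records, fit_niche_model, validate_alternative])
modified_map = modified_workflow(american_records)
```

Because a workflow here is only a list of steps, re-use means calling it again with new input, and modification means swapping one entry in the list.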
Develop a portfolio of data access and processing services, composed into ‘workflows’
• Toolbox of many different Web services.
• Connect in sequence to perform required analysis task.
• Workflows can be shared and re-used.
• ‘Pre-cooked’ workflows for users that don’t want to create their own.
Foster cooperation in the community by
• Discussing scientific use cases
• Identifying important Web Services
• Offering workflows
• Training scientists
Development guided by use cases
Science objective: Achieve new research publications and impact
Methods: Taxonomy, Ecosystem modelling, Population modelling, Ecological niche modelling, Genomics, Phylogenetics, …
Research: Carbon sequestration, Ecosystem function, Invasive species
Connecting two communities
Social objective: Building an international social network connecting biodiversity scientists and computing technologists
Diagram labels: Discipline; Scientists; Scientific PAL; Technical PAL; Scientific and Technical Service Providers; Scientific Requirements; Translation; Technical Requirements; Technical Capabilities; Scientific Capabilities; Application Services Team; Prioritisation; Support Centre; Training & Issue Resolution; Service Level Requirements; Sustainability; Community
Source: J Giddy
As an international network cooperating together day-to-day, we deliver:
1. 50 Services and a community catalogue for discovery (www.biodiversitycatalogue.org)
– includes 3rd party services, best practice guidance
2. Several families of workflows, shared via www.myexperiment.org
3. Public virtual laboratory (portal.biovel.eu) as operational service
– users can execute workflows with their own data,
– incl. data/parameter sweeps and keeping details of their experiments, and sharing with colleagues,
– helpdesk and associated training
4. Taverna Player plug-in for website integration
– e.g., for Scratchpads, National LifeWatch, Fisheries and Oceans Canada
5. VRE/VL image: Research groups can take our stuff and create their own virtual labs, under own control
– still using our workflows and services if they like
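A data/parameter sweep of the kind mentioned for the public virtual laboratory amounts to running one workflow over every combination of parameter values and keeping a record of each run. The following is a minimal sketch under assumptions: sweep, toy_workflow and the parameter names threshold and scale are hypothetical and do not describe how the BioVeL portal implements sweeps.

```python
from itertools import product


def sweep(workflow, data, grid):
    """Run `workflow` once per combination of the parameter values in `grid`.

    Each run is stored with its parameters so the experiment can be
    inspected or shared later.
    """
    names = list(grid)
    runs = []
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        runs.append({"params": params, "result": workflow(data, **params)})
    return runs


def toy_workflow(data, threshold, scale):
    # Hypothetical analysis: count values above a threshold, then scale.
    return scale * sum(1 for x in data if x > threshold)


grid = {"threshold": [0.2, 0.5], "scale": [1, 10]}
runs = sweep(toy_workflow, [0.1, 0.3, 0.6, 0.9], grid)
```

Two values for each of two parameters give four runs, each carrying the parameter set that produced it.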
Biodiversity Catalogue (www.biodiversitycatalogue.org)
Web service provider community
• How can I advertise my web services?
• What information do people need about them?
Scientific user community
• How can I find the right web service?
• What can this web service do?
• How do I use it?
• How do I know this service is working?
Relevant analytical and processing code
Web Service wrapper
Multiple and systematic execution of the service in scientific workflows and other applications
Discoverable, scalable, and robust service
Standards
Biodiversity Catalogue
Curation: Annotation
• Scientific annotations
– Description
– Links to publications (about the service, about the algorithms)
– How to cite
• Technical annotations
– How to use the service
– Endpoints
– Data formats
– Sample data
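The two annotation groups listed above map naturally onto a small record type, which also makes it easy to see which annotations a service entry still lacks. This is a sketch only: the class and field names are invented for illustration and are not the BiodiversityCatalogue data model or API, and the example endpoint is a placeholder.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ScientificAnnotations:
    description: str = ""
    publication_links: List[str] = field(default_factory=list)
    how_to_cite: str = ""


@dataclass
class TechnicalAnnotations:
    usage: str = ""
    endpoints: List[str] = field(default_factory=list)
    data_formats: List[str] = field(default_factory=list)
    sample_data: List[str] = field(default_factory=list)


@dataclass
class CatalogueEntry:
    name: str
    scientific: ScientificAnnotations = field(default_factory=ScientificAnnotations)
    technical: TechnicalAnnotations = field(default_factory=TechnicalAnnotations)


def missing_annotations(entry: CatalogueEntry) -> List[str]:
    """List the annotation fields that are still empty, as 'group.field'."""
    missing = []
    for group_name in ("scientific", "technical"):
        group = getattr(entry, group_name)
        for key, value in vars(group).items():
            if not value:
                missing.append(f"{group_name}.{key}")
    return missing


# Hypothetical, partly annotated entry.
entry = CatalogueEntry(
    name="example-service",
    scientific=ScientificAnnotations(description="Toy description"),
    technical=TechnicalAnnotations(endpoints=["https://example.org/service"]),
)
gaps = missing_annotations(entry)
```

A curator could use a check like missing_annotations to decide which entries need follow-up with the service provider.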
BioVeL Workflow Repository
http://biovel.myexperiment.org
• Hosted on the myExperiment public site in a branded space
• Includes scientific and technical specifications
• Internal group:
– Develop
– 44 members, 148 workflows
• Public group: curated content
– Publish
– 39 workflows
• Established workflow approval process
Sign up on the portal: https://portal.biovel.eu/
Linking services and workflows at the user documentation site: https://wiki.biovel.eu
Diagram labels: BioVeL portal; myExperiment; User documentation; BiodiversityCatalogue services; Developer; Scientist
Diagram labels: Scientists and stakeholders; Workflow builders; Service providers; Service Centre; User documentation; BioVeL portal; BiodiversityCatalogue; Taverna workbench; myExperiment; BioVeL infrastructure; Training
Helpdesk: triage, advise, troubleshoot, escalate, resolve
Solving more than 70% of problems in less than 5 days
Achievement in numbers
• International network cooperating together
– >50 ICT and ecology experts, 18 ‘friends’, 20+ EC FP7 projects and national LifeWatch initiatives alongside; wider biodiversity informatics community of 80 persons + many others
• 135 products (assets) arising from the project
– 36 web services deployed in use; 24 R libraries
• BiodiversityCatalogue.org
– 58 services registered (21 BioVeL representing 36 deployed). 37 external to BioVeL. 160,000 discovery queries
– 45 workflows, in several families: Niche modelling (5), Population modelling (25), Phylogenetics (7), Metagenomics (3), Ecosystem functionality & CO2 sequestration (5)
• ~30 regular users. Steady stream of new sign-ups (>105)
• 12 training workshops. 15 papers published
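The catalogue and workflow counts above can be cross-checked with a few lines of arithmetic. The figures are taken from the slide; the variable names are just labels chosen for this sketch.

```python
# Workflow families and their sizes, as listed on the slide.
workflow_families = {
    "Niche modelling": 5,
    "Population modelling": 25,
    "Phylogenetics": 7,
    "Metagenomics": 3,
    "Ecosystem functionality & CO2 sequestration": 5,
}
total_workflows = sum(workflow_families.values())

# Registered catalogue entries, split by origin, as listed on the slide.
registered_services = {"BioVeL": 21, "external": 37}
total_registered = sum(registered_services.values())

# Share of the workflows that belong to the largest family.
largest_family = max(workflow_families, key=workflow_families.get)
largest_share = workflow_families[largest_family] / total_workflows
```

The family sizes add up to the 45 workflows reported, and the BioVeL and external entries add up to the 58 registered services; population modelling accounts for 25 of the 45 workflows.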
What have we learnt? (doing well)
• Our approach works
– “It’s promising for the future” is the view of world-class experts
• It impresses
– The ideas it demonstrates are widely supported
– We have won many friends
• Positive multiplier effects
– From embedding workflows into other applications and websites
• It can deliver new science
– More quickly, more cheaply, more effectively
• It makes the LifeWatch vision more tangible
– Laying the basis for the decadal objectives to be achieved, i.e., “as a Service” model, calculating EBVs, towards predicting the biosphere
What have we learnt? (to help and guide us)
• We have to focus more on the key / most valuable assets
• We need to better promote what is really cool and unique about BioVeL
• We need to refine target audience segmentation into different kinds of users and developers, and to address each with more appropriate services and capabilities (e.g., better support for R users)
• Data management capabilities need to be more obvious
• Need to scale up the science to show something that cannot be done on the desktop or with R alone
What have we learnt? (operational issues)
• Delivering “professional quality” operational service is hard
• Technical challenges
– Delegated authentication and authorization of service use
– Long-running asynchronous jobs have to be handled
– Difficult to maintain large number of services/workflows robust & operational
• Sociological challenges
– Pals (buddy) approach works well but is expensive
– Professionalization of service delivery (long road to ITSM certification)
– Achieving sustainability is still difficult
• Scalability challenges
– Multiple issues: e.g. files in BioSTIF, large data retrievals in DRW
– Peaks of multiple simultaneous usage; presently managed “by hand”
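One common way to handle the long-running asynchronous jobs mentioned above is to submit the job, then poll its status with a growing interval and an overall deadline. The sketch below shows that generic pattern under assumptions: wait_for_job, JobTimeout, FakeJob and the status strings are hypothetical, and nothing here describes how the BioVeL services or Taverna actually manage jobs.

```python
import time
from typing import Callable


class JobTimeout(Exception):
    """Raised when a job does not reach a final state before the deadline."""


def wait_for_job(get_status: Callable[[], str],
                 poll_interval: float = 1.0,
                 max_interval: float = 30.0,
                 timeout: float = 3600.0) -> str:
    """Poll `get_status` until the job is finished or failed.

    The wait between polls doubles each time, up to `max_interval`,
    so a long job does not flood the service with status requests.
    """
    deadline = time.monotonic() + timeout
    interval = poll_interval
    while True:
        status = get_status()
        if status in ("finished", "failed"):
            return status
        if time.monotonic() >= deadline:
            raise JobTimeout(f"job still '{status}' after {timeout} s")
        time.sleep(interval)
        interval = min(interval * 2, max_interval)


class FakeJob:
    """In-memory stand-in for a remote job: 'running' for a fixed number of polls."""

    def __init__(self, polls_until_done: int) -> None:
        self.remaining = polls_until_done

    def status(self) -> str:
        if self.remaining > 0:
            self.remaining -= 1
            return "running"
        return "finished"


job = FakeJob(polls_until_done=3)
final_status = wait_for_job(job.status, poll_interval=0.001,
                            max_interval=0.01, timeout=1.0)
```

The caller decides how long to wait in total, and a JobTimeout makes a stuck job visible instead of blocking the workflow indefinitely.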
Future activity focus
• Promoting biodiversity workflows and services through Friends, National LifeWatch initiatives, LifeWatch ERIC, Horizon 2020 opportunities
• Coordinate, sustain and integrate existing workflows and service initiatives
• Ramp‐up the service to gain a broad user base
A consortium of 15 partners from 9 countries
1. Cardiff University, UK (Coordinator)
2. Centro de Referência em Informação Ambiental, Brazil
3. Foundation for Research on Biodiversity, France
4. Fraunhofer-Gesellschaft, Institute IAIS, Germany
5. Free University of Berlin, Botanical Gardens and Botanical Museum, Germany
6. Hungarian Academy of Sciences Institute of Ecology and Botany, Hungary
7. Max Planck Society, MPI for Marine Microbiology, Germany
8. National Institute of Nuclear Physics, Italy
9. CNR: Inst. for Biomedical Technologies / Inst. of Biomembrane and Bioenergetics, Italy
10. Netherlands Centre for Biodiversity (NCB Naturalis), The Netherlands
11. Stichting European Grid Initiative, The Netherlands
12. University of Amsterdam, Institute of Biodiversity and Ecosystem Dynamics, The Netherlands
13. University of Eastern Finland, Finland
14. University of Gothenburg, Sweden
15. University of Manchester, UK
Thank you for your attention