THOR Workshop - Services EBI

http://project-thor.eu 1 THOR Partner Services Year 1 Workshop, Amsterdam

THOR year 1 review

THOR Partner Services Year 1 Workshop, Amsterdam

Schedule
  EBI - Florian Graef
  PANGAEA - Markus Stocker
  CERN - Robin Dasler
  DataCite - Laura Rueda

Mission
Enable seamless navigation across research space by:
  Embedding PID services in specific research contexts to:
    Make them easily discoverable and usable by researchers
  Enabling:
    Clear credit and provenance for research objects

Where WP3 fits in THOR

Participants

Task timelines
[Figure: timeline of tasks T3.1 to T3.4 over project months M1 to M30]
  Task 3.1 Integration of ORCIDs into data submission services
  Task 3.2 Dataset claiming services and workflows
  Task 3.3 Cross-linking of identifier systems
  Task 3.4 Data citation services

Research (with WP2)
  Services are not born in a vacuum!
  Establishing user/stakeholder requirements
  Using PIDs in workflows to address these

Interactions with WP2 are increasing as the tasks focus on interoperability, cross-linking and workflows

The EBI-ORCID Hub

15/09/15

Genes, genomes & variation: European Nucleotide Archive, European Variation Archive, European Genome-Phenome Archive, Ensembl, Ensembl Genomes, GWAS Catalog, Metagenomics portal
Gene, protein & metabolite expression: RNA Central, ArrayExpress, Expression Atlas, MetaboLights, PRIDE
Protein sequences, families: InterPro, Pfam, UniProt
Molecular structures: Protein Data Bank in Europe, Electron Microscopy Data Bank
Chemical biology: ChEMBL, ChEBI
Reactions, interactions & pathways: IntAct, Reactome, MetaboLights
Systems: BioModels, Enzyme Portal, BioSamples
Literature & ontologies: Europe PMC, Gene Ontology, Experimental Factor Ontology

This slide shows the core resources at the EBI, illustrating the range of data that can be accessed through them.

Institution-Specific Requirements: EMBL-EBI
  Many submission databases
  Knowledge/Added Value databases
  International delivery and governance structures
  Ecosystem of persistent identifiers

Central ORCID services with an API
  No duplication of effort
  ORCID integrations with consistent UIs across resources
  Easy maintenance of a single service

Approach: EMBL-EBI
Developed a middleware layer with an integration API library

Middleware mediates communication between applications

Enables EBI databases to incorporate ORCID iDs in Web Forms

ORCID-EBI Hub
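The middleware idea can be illustrated with a minimal sketch. This is not the hub's actual code: the class name `OrcidHub`, the `Person` type, the per-database field names in `FORM_FIELDS`, and the stubbed lookup are all assumptions made for illustration. The point is only that the registry call lives in one place while each database keeps its own form field names.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Person:
    """Minimal, hypothetical view of an ORCID record as a hub might expose it."""
    orcid: str
    given_name: str
    family_name: str


# Hypothetical form field names per database; real submission forms will differ.
FORM_FIELDS: Dict[str, Dict[str, str]] = {
    "metabolights": {
        "orcid": "submitterOrcid",
        "given_name": "firstName",
        "family_name": "lastName",
    },
    "empiar": {
        "orcid": "orcid_id",
        "given_name": "first_name",
        "family_name": "surname",
    },
}


class OrcidHub:
    """Illustrative middleware: one registry lookup shared by many databases."""

    def __init__(self, lookup: Callable[[str], Person]) -> None:
        # The lookup is injected so the registry access is implemented once.
        self._lookup = lookup

    def prefill(self, database: str, orcid: str) -> Dict[str, str]:
        """Return form values keyed by the given database's own field names."""
        person = self._lookup(orcid)
        mapping = FORM_FIELDS[database]
        return {field: getattr(person, attr) for attr, field in mapping.items()}


def stub_lookup(orcid: str) -> Person:
    # Stand-in for the real registry call; returns ORCID's fictitious test researcher.
    return Person(orcid=orcid, given_name="Josiah", family_name="Carberry")


hub = OrcidHub(stub_lookup)
print(hub.prefill("empiar", "0000-0002-1825-0097"))
```

Adding another database in this sketch means adding one entry to `FORM_FIELDS`; the lookup itself is untouched.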

EBI-ORCID Hub

A service with an API that any number of databases at EBI can use to interface with the ORCID registry, verify user identity, and populate existing data submission forms with data from the user's ORCID profile. Special emphasis was put on making it simple to integrate existing web forms with the API. Two databases, MetaboLights and EMPIAR, have already integrated it into their workflows. EBI is actively engaging with other life science databases (BioStudies, PDBe, ENA, IntAct, PRIDE and ArrayExpress) and expects these to begin using the API over the coming months and years, depending on governance structures and production plans.
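Before a form accepts an ORCID iD typed by hand, a cheap sanity check is possible, because the last character of an iD is an ISO 7064 MOD 11-2 check digit. The sketch below shows that check only; the function names are illustrative and not taken from the hub's code. It catches typos and does not verify identity, which relies on the user signing in to ORCID.

```python
import re

# Four hyphen-separated groups of four; the final character may be a digit or X.
ORCID_PATTERN = re.compile(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]")


def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_well_formed_orcid(orcid: str) -> bool:
    """True if the iD has the expected layout and a matching check character."""
    if not ORCID_PATTERN.fullmatch(orcid):
        return False
    digits = orcid.replace("-", "")
    return orcid_check_digit(digits[:15]) == digits[15]


print(is_well_formed_orcid("0000-0002-1825-0097"))
print(is_well_formed_orcid("0000-0002-1825-0098"))
```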

API Integration Demonstration

Available on GitHub: https://github.com/thor-project/ebi

Early Adoption: MetaboLights

Authenticated ORCID iDs added to the submission form and the underlying data record

Early Adoption: EMPIAR

Outreach (with WP4)
  Many meetings with key data resources, including a workshop at EMBL-EBI for life science data resources
  ECCB presentation (Sept 2016)
  Coordination with Force11 data citation activities
  Literature-data publication workflows workshop (Nov 2016)
  CERN?
  PANGAEA?

Challenges and Lessons Learned
  Open development and code sharing
  Goal: to deliver real services in production environments
  Pragmatic decisions on technology, expertise, and timing required
  Common code base for interacting with ORCID proved complex
    Different contexts and infrastructures at the three participating institutions
    PANGAEA receives access to the ORCID API partially through DataCite and has a single major database service
    EMBL-EBI has direct access to the APIs, but needed to integrate ORCID iDs into many EBI-hosted databases

Challenges
  Complex ecosystem: many moving parts, many different priorities and approaches
  Different disciplines: different states of preparedness, different legacy situations

Next Steps
  Closing the loop:
    Pushing metadata to ORCID record
    Claiming systems for previously published data
  Extending the loop:
    Integration of data submission with article submission systems
    Sharing ORCID information (e.g. based on claims, back to data resources)
    Alignment of identifiers (life sciences data identifiers.org; ORCID and ISNI)
    Enabling data citation
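As a rough idea of what "pushing metadata to ORCID record" could involve, the sketch below builds a dataset description as a plain dictionary. The field names are modelled on the JSON layout of an ORCID work as an assumption; their exact spelling and casing vary between ORCID API versions and should be checked against the version in use. The function name, title, DOI and URL are made-up placeholders, and no request is sent.

```python
import json
from typing import Any, Dict


def dataset_work_payload(title: str, doi: str, url: str) -> Dict[str, Any]:
    """Build a dictionary describing a dataset, shaped like an ORCID work (assumed layout)."""
    return {
        "title": {"title": {"value": title}},
        "type": "data-set",  # assumed work type value; verify for the API version used
        "external-ids": {
            "external-id": [
                {
                    "external-id-type": "doi",
                    "external-id-value": doi,
                    "external-id-url": {"value": url},
                    "external-id-relationship": "self",
                }
            ]
        },
    }


# Placeholder values for illustration only; this is not a real dataset.
payload = dataset_work_payload(
    title="Example metabolomics study",
    doi="10.1234/example-dataset",
    url="https://example.org/datasets/example-dataset",
)
print(json.dumps(payload, indent=2))
```

In a real integration, a payload of this kind would be sent with the permission the user granted during ORCID sign-in, so that the dataset appears on their record.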

Impact
  Life Sciences: only the beginning. Social lag, not only among scientists but also among database providers
  Key to integrate with other workflows, in particular to boost impact (forthcoming with future tasks)
  Expected:
    Datasets claimed to ORCIDs
    ORCIDs used more widely in data space
    Improved crosslinking between literature and data
