EMIF project
EU Project Informatics Alignment Workshop
Imperial College London
April 24th-25th
Rudi Verbeeck (Janssen)
What is EMIF?
• European Medical Information Framework
• Project objectives: 3 projects in 1
  • EMIF-Platform: Develop a framework for evaluating, enhancing and providing access to human health data across Europe
  • EMIF-Metabolic: Identify predictors of metabolic complications in obesity
  • EMIF-AD: Identify predictors of Alzheimer’s disease in the pre-clinical and prodromal phase
• Project data:
  • Large variety of data types: primary care data sets, hospital data, administrative data, regional record-linkage systems, registries and cohorts (broad and disease-specific), biobanks…
  • Combined, more than 52M subjects from 7 EU countries, including 25K subjects in AD cohorts and more than 94K subjects in Metabolic cohorts
  • Population vs cohort
• Funded by: IMI
• Time frame: 5 years (Jan 2013 – Dec 2017)
• Consortium of 57 partners: 10 pharma partners, 37 academic groups, 1 patient organization, 9 SMEs
Data access types in EMIF
(diagram: access routes ranging from free, no patient data, to members-only, patient-level data)
• Subject-level fine-grained data (granular queries, bespoke):
  • Data sharing agreements, ethical approvals, disclosure and usage policies
  • Held by TTP, protected by PET, fully audited
  • Limited availability, deep harmonization
• Aggregated queries (members): pre-fetched repository of aggregated query results for commonly needed data items; cohort selection (flexible reports, aggregated data)
• Fingerprinting (free):
  • Level 1: browsing metadata
  • Level 2: access to profiles, e.g. age breakdown
  • Level 3: Research Request Form to select data sources
• Access spectrum: no patient data → aggregated patient data → patient-level data
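As a rough sketch of the tiered model above (the route names, granularity tiers and approval logic here are illustrative assumptions, not EMIF platform code), each access route can be mapped to the most granular data a requester may see, with subject-level access additionally gated on the contractual safeguards listed on the slide:

```python
# Illustrative sketch of the tiered access model; names are assumptions.
from enum import IntEnum

class DataGranularity(IntEnum):
    NO_PATIENT_DATA = 0   # fingerprinting: metadata and profiles only
    AGGREGATED = 1        # aggregated query results, cohort counts
    PATIENT_LEVEL = 2     # subject-level fine-grained data

# Each access route grants at most one granularity.
ACCESS_ROUTES = {
    "fingerprinting": DataGranularity.NO_PATIENT_DATA,
    "aggregated_queries": DataGranularity.AGGREGATED,
    "granular_queries": DataGranularity.PATIENT_LEVEL,
}

def may_access(route: str, requested: DataGranularity,
               has_sharing_agreement: bool = False,
               has_ethical_approval: bool = False) -> bool:
    """Return True if the route permits the requested granularity."""
    granted = ACCESS_ROUTES.get(route, DataGranularity.NO_PATIENT_DATA)
    if requested > granted:
        return False
    if requested == DataGranularity.PATIENT_LEVEL:
        # Subject-level data also needs the contractual safeguards.
        return has_sharing_agreement and has_ethical_approval
    return True

print(may_access("aggregated_queries", DataGranularity.AGGREGATED))   # True
print(may_access("granular_queries", DataGranularity.PATIENT_LEVEL))  # False: no agreements yet
```

The point of the sketch is that granularity alone is not sufficient: even the route that technically grants patient-level data still requires the off-platform safeguards (agreements, approvals) before access is allowed.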
Specific needs for EMIF
• Support verticals in their research
  • Stable server environment, end-user support, training, account request process
  • Flexible data upload procedures
• EMIF Platform requirements
  • Population data: patient-level data does not leave the source
    → Federated architecture, distributed statistics
  • Data custodians keep control over their data
    → Fine-grained security model, server environment at source / TTP
  • Analysts want to define their own analysis algorithms
    → Flexible plugin mechanism for R et al., controlled data export, longitudinal data, flexible cohort pooling
  • Reproducible research
    → Data versioning / archiving, workflow support
  • High variety of data types, standards and languages
    → Clinical data, lab results, imaging, gene expression, GWAS, proteomics, metabolomics…
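The federated / distributed-statistics requirement can be illustrated with a minimal sketch: each data source computes only aggregates locally, and a coordinator pools them, so patient-level values never leave the site. The site names and BMI values below are invented for illustration.

```python
# Minimal sketch of distributed statistics: only (n, sum) leaves each site.

def local_aggregate(values):
    """Runs at the data source; returns only (count, sum)."""
    return len(values), sum(values)

def pooled_mean(site_aggregates):
    """Runs at the coordinator; never sees patient-level values."""
    total_n = sum(n for n, _ in site_aggregates)
    total_sum = sum(s for _, s in site_aggregates)
    return total_sum / total_n

# Patient-level values stay inside each (hypothetical) site:
site_a = [27.1, 31.4, 29.8]
site_b = [25.0, 33.2]

aggregates = [local_aggregate(site_a), local_aggregate(site_b)]
print(round(pooled_mean(aggregates), 2))  # → 29.3, the mean over all 5 subjects
```

The same pattern extends to counts, variances and many regression algorithms, which is what makes a federated architecture compatible with "data does not leave the source".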
• Semantics requirements
  • Separate technical from semantic harmonization
  • Separate responsibilities:
    • SMEs define standards within a research field, re-using what is available
    • Data custodians define local concepts, vocabulary mappings, mappings to global and FP concepts, and access control
    • Platform makes mappings executable and builds supporting technologies: collaboration, approval flows, metadata management
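"Making mappings executable" can be sketched as follows: each custodian supplies a local-code-to-global-concept mapping, and the platform translates queries through it so one research question, phrased in global concepts, runs against every source. The source names, local codes and concept IDs below are invented examples, not real EMIF vocabularies.

```python
# Sketch of executable vocabulary mappings; all identifiers are hypothetical.

# Each data custodian supplies local-to-global mappings for their source:
MAPPINGS = {
    "source_A": {"T2D_LOCAL_250": "GLOBAL:TYPE2_DIABETES"},
    "source_B": {"E11": "GLOBAL:TYPE2_DIABETES"},
}

def to_global(source: str, local_code: str) -> str:
    """Translate a source-specific code into the shared global concept."""
    try:
        return MAPPINGS[source][local_code]
    except KeyError:
        raise KeyError(f"no mapping for {local_code!r} in {source!r}")

# The same concept is reachable from either source's local vocabulary:
print(to_global("source_A", "T2D_LOCAL_250"))  # GLOBAL:TYPE2_DIABETES
print(to_global("source_B", "E11"))            # GLOBAL:TYPE2_DIABETES
```

This separation mirrors the slide's division of responsibilities: custodians own the mapping tables, while the platform owns the machinery that applies them (and the approval flows around changing them).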