EMIF project
EU Project Informatics Alignment Workshop
Imperial College London
April 24th-25th
Rudi Verbeeck (Janssen)
What is EMIF?
• European Medical Information Framework
• Project objectives: 3 projects in 1
  • EMIF-Platform: Develop a framework for evaluating, enhancing and providing access to human health data across Europe
  • EMIF-Metabolic: Identify predictors of metabolic complications in obesity
  • EMIF-AD: Identify predictors of Alzheimer’s disease in the pre-clinical and prodromal phase
• Project data:
  • Large variety of data types: primary care data sets, hospital data, administrative data, regional record-linkage systems, registries and cohorts (broad and disease-specific), biobanks…
  • Combined, more than 52M subjects from 7 EU countries, including 25K subjects in AD cohorts and more than 94K subjects in Metabolic cohorts
  • Population vs cohort
• Funded by: IMI
• Time frame: 5 years (Jan 2013 – Dec 2017)
• Consortium of 57 partners: 10 pharma partners, 37 academic groups, 1 patient organization, 9 SMEs
Data access types in EMIF
(diagram: access routes ranging from free, no patient data, to members-only, patient-level data)
• Subject-level fine-grained data (granular queries, bespoke):
  • Data sharing agreements, ethical approvals, disclosure and usage policies
  • Held by TTP, protected by PET, fully audited
  • Limited availability, deep harmonization
• Aggregated queries (members): pre-fetched repository of aggregated query results for commonly needed data items; cohort selection (flexible reports, aggregated data)
• Fingerprinting (free):
  • Level 1: browsing metadata
  • Level 2: access to profiles, e.g. age breakdown
  • Level 3: Research Request Form to select data sources
• Access spectrum: no patient data → aggregated patient data → patient-level data
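As a rough sketch of the tiered model above (the route names, granularity tiers and approval logic here are illustrative assumptions, not EMIF platform code), each access route can be mapped to the most granular data a requester may see, with subject-level access additionally gated on the contractual safeguards listed on the slide:

```python
# Illustrative sketch of the tiered access model; names are assumptions.
from enum import IntEnum

class DataGranularity(IntEnum):
    NO_PATIENT_DATA = 0   # fingerprinting: metadata and profiles only
    AGGREGATED = 1        # aggregated query results, cohort counts
    PATIENT_LEVEL = 2     # subject-level fine-grained data

# Each access route grants at most one granularity.
ACCESS_ROUTES = {
    "fingerprinting": DataGranularity.NO_PATIENT_DATA,
    "aggregated_queries": DataGranularity.AGGREGATED,
    "granular_queries": DataGranularity.PATIENT_LEVEL,
}

def may_access(route: str, requested: DataGranularity,
               has_sharing_agreement: bool = False,
               has_ethical_approval: bool = False) -> bool:
    """Return True if the route permits the requested granularity."""
    granted = ACCESS_ROUTES.get(route, DataGranularity.NO_PATIENT_DATA)
    if requested > granted:
        return False
    if requested == DataGranularity.PATIENT_LEVEL:
        # Subject-level data also needs the contractual safeguards.
        return has_sharing_agreement and has_ethical_approval
    return True

print(may_access("aggregated_queries", DataGranularity.AGGREGATED))   # True
print(may_access("granular_queries", DataGranularity.PATIENT_LEVEL))  # False: no agreements yet
```

The point of the sketch is that granularity alone is not sufficient: even the route that technically grants patient-level data still requires the off-platform safeguards (agreements, approvals) before access is allowed.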
Specific needs for EMIF
• Support verticals in their research
  • Stable server environment, end-user support, training, account request process
  • Flexible data upload procedures
• EMIF Platform requirements
  • Population data: patient-level data does not leave the source
    → Federated architecture, distributed statistics
  • Data custodians keep control over their data
    → Fine-grained security model, server environment at source / TTP
  • Analysts want to define their own analysis algorithms
    → Flexible plugin mechanism for R et al., controlled data export, longitudinal data, flexible cohort pooling
  • Reproducible research
    → Data versioning / archiving, workflow support
  • High variety of data types, standards and languages
    → Clinical data, lab results, imaging, gene expression, GWAS, proteomics, metabolomics…
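The federated / distributed-statistics requirement can be illustrated with a minimal sketch: each data source computes only aggregates locally, and a coordinator pools them, so patient-level values never leave the site. The site names and BMI values below are invented for illustration.

```python
# Minimal sketch of distributed statistics: only (n, sum) leaves each site.

def local_aggregate(values):
    """Runs at the data source; returns only (count, sum)."""
    return len(values), sum(values)

def pooled_mean(site_aggregates):
    """Runs at the coordinator; never sees patient-level values."""
    total_n = sum(n for n, _ in site_aggregates)
    total_sum = sum(s for _, s in site_aggregates)
    return total_sum / total_n

# Patient-level values stay inside each (hypothetical) site:
site_a = [27.1, 31.4, 29.8]
site_b = [25.0, 33.2]

aggregates = [local_aggregate(site_a), local_aggregate(site_b)]
print(round(pooled_mean(aggregates), 2))  # → 29.3, the mean over all 5 subjects
```

The same pattern extends to counts, variances and many regression algorithms, which is what makes a federated architecture compatible with "data does not leave the source".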
• Semantics requirements
  • Separate technical from semantic harmonization
  • Separate responsibilities:
    • SMEs define standards within a research field, re-using what is available
    • Data custodians define local concepts, vocabulary mappings, mappings to global and FP concepts, and access control
    • Platform makes mappings executable and builds supporting technologies: collaboration, approval flows, metadata management
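"Making mappings executable" can be sketched as follows: each custodian supplies a local-code-to-global-concept mapping, and the platform translates queries through it so one research question, phrased in global concepts, runs against every source. The source names, local codes and concept IDs below are invented examples, not real EMIF vocabularies.

```python
# Sketch of executable vocabulary mappings; all identifiers are hypothetical.

# Each data custodian supplies local-to-global mappings for their source:
MAPPINGS = {
    "source_A": {"T2D_LOCAL_250": "GLOBAL:TYPE2_DIABETES"},
    "source_B": {"E11": "GLOBAL:TYPE2_DIABETES"},
}

def to_global(source: str, local_code: str) -> str:
    """Translate a source-specific code into the shared global concept."""
    try:
        return MAPPINGS[source][local_code]
    except KeyError:
        raise KeyError(f"no mapping for {local_code!r} in {source!r}")

# The same concept is reachable from either source's local vocabulary:
print(to_global("source_A", "T2D_LOCAL_250"))  # GLOBAL:TYPE2_DIABETES
print(to_global("source_B", "E11"))            # GLOBAL:TYPE2_DIABETES
```

This separation mirrors the slide's division of responsibilities: custodians own the mapping tables, while the platform owns the machinery that applies them (and the approval flows around changing them).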