Dutch medical research and clinical data infrastructure coordinated by NFU

Morris Swertz, Richard Sinke, Marc Rietveld and many others.


Page 2:

Outline

• NFU – Towards program research data infrastructure

• VKGL – Towards data sharing for diagnostics

• UMCG – Some hooks for future collaboration

Page 3:

NFU, DTL, and beyond

Page 4:

NFU - Netherlands Federation of UMCs

The Netherlands Federation of University Medical Centres (Nederlandse Federatie van Universitair Medische Centra) (NFU) represents the eight cooperating UMCs in the Netherlands, as an advocate for and employer of 65,000 people. The NFU was founded in 2004 as a spin-off from the University Hospitals Association (Vereniging Academische Ziekenhuizen) (VAZ), which was established in 1989. The objective has remained the same: to ensure that agencies that decide healthcare issues in the Netherlands take into account the special role of the academic hospitals (in the past) and the UMCs (presently).

Page 5:

Desire to coordinate research data infra

Challenges:
• Increasing demands from government and society
• Inefficient use of available data
• Barriers in access to data and facilities
• Hidden costs because of inadequate infrastructure
• Difficulties integrating health into research
• Fragmentation / duplication of the work

NFU needs:
• Coordination of the existing infrastructure programs
  • In particular because these go beyond one UMC's walls
• Better alignment with other NFU strategic activities (registratie aan de bron, "registration at the source"; kwaliteitsborging mensgebonden onderzoek, "quality assurance of human-subject research")
• Improve position in EU

Page 6:

6 sept: project initiated to define the program

Towards one integrated research infrastructure for UMCs in 10 years
Phase 1: 2014 – 2018: harmonization / standardization
Phase 2: 2018 – 2023: integration

+= DTL health
Jeroen Belien, David v Enckevort, Freek de Bruijn, et al.

Page 7:

Themes (embracing existing WG)

• Data stewardship guidelines
• Standards for process and architecture
• Coordination in EU calls
• Collaboration in ‘hard’ IT infrastructure
• Big data / HPC clusters
• Medical intelligence
• Use EHR for research
• TTPs / pseudonymisation and security policy
• Standards data models / interfaces
• Integration of registries
• Findability / Catalogues
• Data access
• Sharing of expertise (“loket”, i.e. a help desk)

Page 8:

Relevant themes for NGS

Big data infrastructure [definition unclear]
Network, storage and compute needs are large and increasing. We need coordination for base capacity (in each house?) and peak capacity (shared?). Ability to scale out is key. We can coordinate via BBMRI-IT, TraIT, SURF, EYR.

Standards process & architecture
Reference architecture incl. business, service, process, apps, data, technology. NGS is on data (std. of meta data), apps (pipelines, auth) and technology. We can coordinate via CTMM/TraIT, ACZIE/TACZIE and SURF.

Medical intelligence
Patients are increasingly classified based on many phenotypic/imaging profiles. DNA/RNA is becoming dominant in these approaches. Sharing is needed for suitably large populations and efficient IT development. It is not yet clear how to coordinate, as the field is still fragmented.

Data expertise loket (help desk)
Each researcher should have access to a local expertise center. Emerging centers should work together, using each other's specialties, and collaborate on SOPs etc. For NGS we can coordinate this via DTL theme meetings.

Page 9:

VKGL data sharing

Rien Blok, Marielle van Gijn, Ronald Lekanne, Pieter Neerincx, Rolph Pfundt, Claudia Ruivenkamp, Jasper Saris, Rolf Sijmons, Morris Swertz (secretary), Richard Sinke (chair), Peter Taschner, Maartje Vogel, Joeri van der Velde, Terry Vrijenhoek

Page 10:

What is VKGL

• Dutch Society of Clinical Genetic Diagnostic Laboratories
• Aims to promote clinical genetic diagnostics specifically and clinical genetics in general, via
  • Education and registration of its member specialists
  • Quality guidelines and certification
  • Spokesperson in government policy making
  • Coordination of care and diagnostics together with sister VKGN (Society clinical genetics NL)
  • Coordination of research activity with NVHG (Dutch society for human genetics)

Page 11:

Motivation

• In DNA diagnostics there is a strong need to gain insight into the observations of other labs, e.g.
  • (How often) have variants been seen before?
  • (How often) have variants been seen in a patient?
  • Are we trying to solve the same families?
• There are many national and international initiatives
  • For known variants there is a range of services: HGMD, 1KG, GoNL, various LOVDs, DMuDB, EBI, NIH, etc.
  • For data sharing there are many models: centralized; de-centralized; federated; closed; (partially) open; etc.
• What is needed to start sharing?
  • Remove technical barriers between hospital systems and sharing
  • Remove organizational barriers, as sharing is still labour intensive
  • Agree on content, purpose and conditions of data sharing

Page 12:

Work plan

Step 1: share example data as representative test set
• Four pilots: ‘legacy’ BRCA1; cardio (CGD); NGS panel; phenotype
• Collect a list of all gene panels used

Step 2: gap analysis / standardization on format/content
• Evaluate to what extent notations/nomenclature diverge
• Expect to use HGVS nomenclature, references used (LRG)
• Incl. classifications, quality, coverage, etc.

Step 3: demonstrator implementations
• Evaluation of various software and architectures used
• User interfaces that can answer the desired questions
• Evaluation of how to integrate with existing infra (e.g. Cartagenia)

Step 4: evaluate.
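As a rough illustration of the Step 2 gap analysis, the sketch below flags notations that do not match one simplified HGVS-style pattern. Everything in it is an assumption for illustration: the function name looks_like_hgvs is hypothetical, the pattern only covers coding-DNA substitutions on a versioned reference or an LRG transcript, and the example strings are placeholders, not real variants. Real HGVS nomenclature is far broader.

```shell
# Simplified pattern (assumption): versioned reference or LRG transcript,
# followed by a coding-DNA substitution such as c.123A>G.
hgvs_subst='^([A-Z]+_[0-9]+\.[0-9]+|LRG_[0-9]+t[0-9]+):c\.[0-9]+[ACGT]>[ACGT]$'

# Succeeds when the notation in $1 matches the simplified pattern.
looks_like_hgvs() {
  printf '%s\n' "$1" | grep -Eq "$hgvs_subst"
}

# Placeholder notations, as different labs might write the same change.
for v in 'NM_000000.0:c.123A>G' 'LRG_1t1:c.123A>G' 'c.123A>G' '123 A/G'; do
  if looks_like_hgvs "$v"; then echo "ok: $v"; else echo "divergent: $v"; fi
done
# prints:
# ok: NM_000000.0:c.123A>G
# ok: LRG_1t1:c.123A>G
# divergent: c.123A>G
# divergent: 123 A/G
```

Counting the "divergent" lines per lab would give a first, crude measure of how far notations diverge before any standardization.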

Page 13:

Other BBMRI/BioMedBridges/BioSHaRE/UMCG actions

NGS research, diagnostics, patient registries ... Can we share notes?

Page 14:

Mission

GWAS: Explore summary level GWAS data
Compute: Run analysis workflows on big data compute infrastructure
Catalogue: Find data item and sample collections
Protocol: CRFs, Questionnaires, Lab protocols, and assays
xQTL: Multi-omics association & visualization tools
Share: Friends, Groups and Permission management
NGS: Next-Generation Sequencing
File: File storage and drivers for images and data
Mutation: Explore genetic mutations and pathogenicity effects
XGAP: Multi-omics genotypes and phenotypes
Data: Filter individual data sets and download to Excel & SPSS
Organization: Institutes, Departments, People, Locations & Containers

Download as open source at http://github.com/molgenis/molgenis

48h diagnostics; Gene panels; Lung cancer PM; Leukemia PM; CVD PM; LifeLines deep / RP3 (RNA); +5 RD patient registries

Page 15:

Data management SOPs

Page 16:

‘light-weight’ solution for bioinformaticians

1. Design (workflow.csv)

step   script         parameterMapping
step1  assessRisk.sh  sample=user_sample; glucose=user_glucose
step2  report.sh      dis=user_disease; risk=step1_risk

assessRisk.sh
    #input sample
    #input glucose
    #output risk
    # glucose is a decimal, so compare via bc; bash (( )) only handles integers
    if (( $(echo "$glucose > 10" | bc) )); then risk+=("yes"); else risk+=("no"); fi

report.sh
    #list risk
    #string dis
    nRisk=0
    # iterate over all array elements; [[ ]] needs spaces around =
    for r in "${risk[@]}"; do if [[ "yes" = "$r" ]]; then ((nRisk++)); fi; done
    echo "Fraction of samples with $dis risk:"
    echo "scale=2;$nRisk/${#risk[*]}" | bc

2. Parameters (parameters.csv)

sample    glucose  disease
patient1  5.6      diabetes
patient2  7.8      diabetes
patient3  12.3     diabetes

3. Run + logs (generated scripts)

assessRisk_0.sh, assessRisk_1.sh, assessRisk_2.sh (identical bodies on the slide)
    #input glucose
    #output risk
    if (( $(echo "$glucose > 10" | bc) )); then risk="yes"; else risk="no"; fi

report_0.sh
    #input exp
    #input list risk
    nRisk=0
    for r in "${risk[@]}"; do if [[ "yes" = "$r" ]]; then ((nRisk++)); fi; done
    echo "Risk in experiment $exp:"
    echo "scale=2;$nRisk/${#risk[*]}" | bc
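The two steps on this slide can also be tried without any workflow engine. The sketch below is a stand-alone approximation, not the tool shown on the slide: the function names assess_risk and report_fraction are hypothetical, the parameters table is inlined as a here-document, and awk is used for the decimal comparison and the fraction so that no bc is required.

```shell
# Step 1 (cf. assessRisk.sh): classify one glucose value against the
# threshold of 10 used on the slide. awk handles the decimal comparison.
assess_risk() {
  awk -v g="$1" 'BEGIN { if (g + 0 > 10) print "yes"; else print "no" }'
}

# Step 2 (cf. report.sh): fraction of "yes" among all arguments.
report_fraction() {
  [ "$#" -gt 0 ] || return 1
  n_risk=0
  for r in "$@"; do
    if [ "$r" = "yes" ]; then n_risk=$((n_risk + 1)); fi
  done
  awk -v n="$n_risk" -v t="$#" 'BEGIN { printf "%.2f\n", n / t }'
}

# Rows of the parameters table from the slide (header row omitted).
risks=""
dis=""
while IFS=, read -r sample glucose disease; do
  risks="$risks $(assess_risk "$glucose")"
  dis=$disease   # keep the value; read clears its variables at end of input
done <<EOF
patient1,5.6,diabetes
patient2,7.8,diabetes
patient3,12.3,diabetes
EOF

echo "Fraction of samples with $dis risk:"
# Unquoted on purpose: split the space-separated list into arguments.
report_fraction $risks
# prints:
# Fraction of samples with diabetes risk:
# 0.33
```

Step 1 runs once per sample and step 2 once over the collected list, which mirrors the slide's three assessRisk scripts feeding a single report script.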

Page 17:

Data API – to deal with all data modalities

• F: Java, REST or R, built on the Observ-OM format and model
• B: Excel, csv, database, index and custom (VCF) formats

[Diagram labels: molgenis-data; Generic / Specific; Various Repositories: JPA repo, Mongo repo, Indexing Service, Specific repo (VCF, plink), Spreadsheet repo (Excel, csv)]

http://github.com/molgenis/molgenis

Page 18:

VCF / PED backend

https://github.com/molgenis/systemsgenetics

Page 19:

Self-describing file format (anything you want)

Patient  Submit1

Patient    Height  Weight  BMI  ..
LL_123041  176     68      25   ..
LL_123042  163     62      23   ..
LL_123043  188     75      25   ..
LL_123044  180     60      23   ..
LL_123045  165     106     32   ..
..         ..      ..      ..   ..

Name       Sex  …
LL_123041  M    …
LL_123042  F    …
LL_123043  M    …
LL_123044  F    …
LL_123045  F    …
..

Feature

name     description                             unit_name  dataType
Patient  Patient observed                                   ref Patient
Height   Height standing up by nurse             cm         decimal
Weight   Weight on digital floor scale by nurse  kg         decimal
BMI      Body Mass Index                         kg/m^2     decimal
..       ..                                      ..         ..

Protocol

name     features
general  Height, Weight, BMI, …
..

(The four panels are labelled (a) to (d) on the slide.)
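One practical consequence of a self-describing format is that a data sheet can be checked against its own Feature table. The sketch below is a minimal illustration under stated assumptions: the function name check_header is hypothetical, the feature names are copied from the slide, and a comma-separated header line stands in for whatever file format is actually used.

```shell
# Feature names as listed in the Feature table on the slide.
features='Patient
Height
Weight
BMI'

# Succeeds when every comma-separated column name in $1 has a Feature row;
# prints each column that is not described.
check_header() {
  status=0
  old_ifs=$IFS
  IFS=,
  for col in $1; do
    if ! printf '%s\n' "$features" | grep -Fqx -- "$col"; then
      echo "undescribed column: $col"
      status=1
    fi
  done
  IFS=$old_ifs
  return $status
}

check_header "Patient,Height,Weight,BMI" && echo "header ok"
check_header "Patient,Height,Waist" || echo "header rejected"
# prints:
# header ok
# undescribed column: Waist
# header rejected
```

The same idea extends to units and data types: since each feature carries a unit_name and dataType, values in a column could be validated against them as well.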

Page 20:

eDAS + Genome Browser (collab. with U Leic)

http://github.com/molgenis/molgenis

Any data having ‘positions’ will have genome browser

Page 21:

Annotation/integration wizards (NGS/PM)

http://github.com/molgenis/molgenis

Extensible (ws, cmd, script)

Page 22:

Thanks! • NFU – Towards program research data infrastructure

• VKGL – Towards data sharing for diagnostics

• UMCG – Some hooks for future collaboration