Post on 27-Mar-2018

Observational Health Data Sciences and Informatics (OHDSI)
George Hripcsak, MD, MS
Columbia University Medical Center
NewYork-Presbyterian Hospital
Seattle Symposium on Health Care Data Analytics

Observational Health Data Sciences and Informatics (OHDSI, as “Odyssey”)
A multi-stakeholder, interdisciplinary, international collaborative with a coordinating center at Columbia University
Mission: To improve health, by empowering a community to collaboratively generate the evidence that promotes better health decisions and better care
Aiming for a 1,000,000,000-patient data network
http://ohdsi.org

OHDSI’s global research community
•  >140 collaborators from 20 different countries
•  Experts in informatics, statistics, epidemiology, clinical sciences
•  Active participation from academia, government, industry, providers
•  Currently 600 million patient records in 52 databases
http://ohdsi.org/who-we-are/collaborators/

Why large-scale analysis is needed in healthcare
All drugs
All health outcomes of interest
Patient-level predictions for personalized evidence require big data

Does 2 million patients seem excessive or unnecessary?
•  Imagine a provider wants to compare her patient with other patients with the same gender (50%), in the same 10-year age group (10%), and with the same comorbidity of Type 2 diabetes (5%)
•  Imagine the patient is concerned about the risk of ketoacidosis (0.5%) associated with two alternative treatments they are considering
•  With 2 million patients, you’d only expect to observe 25 similar patients with the event, and would only be powered to observe a relative risk > 2.0
Aggregated data across a health system of 1,000 providers may contain 2,000,000 patients
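The figure of 25 patients follows from multiplying the shares quoted on the slide. A minimal sketch of that arithmetic, assuming the characteristics are independent (the slide does not say so explicitly) and using hypothetical variable names:

```python
# Worked arithmetic for the slide's example. The fractions are the slide's
# illustrative figures, not measured rates.
total_patients = 2_000_000

same_gender = 0.50        # share with the same gender
same_age_group = 0.10     # share in the same 10-year age group
same_comorbidity = 0.05   # share with Type 2 diabetes
event_risk = 0.005        # risk of ketoacidosis

# Assumption: the characteristics are independent, so the shares multiply.
similar_patients = total_patients * same_gender * same_age_group * same_comorbidity
expected_events = similar_patients * event_risk

print(f"similar patients: {similar_patients:.0f}")
print(f"expected events:  {expected_events:.0f}")
```

Only about 5,000 of the 2 million patients resemble the index patient, and only about 25 of those would be expected to have the event, which is why the slide argues that 2 million patients is not excessive.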

Evidence OHDSI seeks to generate from observational data
•  Clinical characterization
–  Natural history: Who has diabetes, and who takes metformin?
–  Quality improvement: What proportion of patients with diabetes experience complications?
•  Population-level estimation
–  Safety surveillance: Does metformin cause lactic acidosis?
–  Comparative effectiveness: Does metformin cause lactic acidosis more than glyburide?
•  Patient-level prediction
–  Precision medicine: Given everything you know about me, if I take metformin, what is the chance I will get lactic acidosis?
–  Disease interception: Given everything you know about me, what is the chance I will develop diabetes?

OHDSI’s approach to open science
[Diagram labels: Open source software; Open science; Enable users to do something; Generate evidence]
•  Open science is about sharing the journey to evidence generation
•  Open-source software can be part of the journey, but it’s not a final destination
•  Open processes can enhance the journey through improved reproducibility of research and expanded adoption of scientific best practices
Data + Analytics + Domain expertise

Standardizing workflows to enable transparent, reproducible research
[Diagram labels: Open science; Generate evidence]
Workflow steps: Database summary; Cohort definition; Cohort summary; Compare cohorts; Exposure-outcome summary; Effect estimation & calibration; Compare databases
Population-level estimation for comparative effectiveness research: Is <intervention X> better than <intervention Y> in reducing the risk of <condition Z>?
Defined inputs:
•  Target exposure
•  Comparator group
•  Outcome
•  Time-at-risk
•  Model specification
Consistent outputs:
•  analysis specifications for transparency and reproducibility (protocol + source code)
•  only aggregate summary statistics (no patient-level data)
•  model diagnostics to evaluate accuracy
•  results as evidence to be disseminated
–  static for reporting (e.g. via publication)
–  interactive for exploration (e.g. via app)
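The five defined inputs can be written down as one small, serializable specification, which is one way to get the "protocol + source code" transparency listed under the outputs. The sketch below is illustrative only: the class name `EstimationSpec`, its field names, the 365-day window, and the model description are assumptions, not part of any OHDSI tool.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class EstimationSpec:
    """Hypothetical container for the five defined inputs on the slide."""
    target_exposure: str
    comparator_group: str
    outcome: str
    time_at_risk_days: int
    model_specification: str

    def question(self) -> str:
        # Fills in the slide's template question.
        return (f"Is {self.target_exposure} better than {self.comparator_group} "
                f"in reducing the risk of {self.outcome}?")


spec = EstimationSpec(
    target_exposure="metformin",
    comparator_group="glyburide",
    outcome="lactic acidosis",
    time_at_risk_days=365,                                      # illustrative value
    model_specification="propensity-score-matched Cox model",   # illustrative value
)

# A serialized specification can be shared alongside the protocol.
spec_json = json.dumps(asdict(spec), indent=2, sort_keys=True)
print(spec.question())
print(spec_json)
```

Freezing the dataclass keeps the specification immutable once written, so the same inputs can be re-run unchanged at every site.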

OHDSI Distinguishing Features
•  International effort (size & coverage)
–  43 source terminologies from around the world
•  Open science (depth)
–  Infrastructure serves the science
–  Stack: Terminology, CDM, ETL, QA, Visualization, Novel analytic methods, Clinical research
•  Full information model

How OHDSI Works
[Diagram labels, in extracted order]
Source data warehouse, with identifiable patient-level data
Standardized, de-identified patient-level database (OMOP CDM v5)
ETL
Summary statistics results repository
OHDSI.org
Consistency; Temporality; Strength; Plausibility; Experiment; Coherence; Biological gradient; Specificity; Analogy
Comparative effectiveness
Predictive modeling
OHDSI Data Partners
OHDSI Coordinating Center
Standardized large-scale analytics
Analysis results
Analytics development and testing
Research and education
Data network support

Deep information model: OMOP CDM v5.0.1
Standardized clinical data: Person, Observation_period, Specimen, Death, Visit_occurrence, Procedure_occurrence, Drug_exposure, Device_exposure, Condition_occurrence, Measurement, Note, Observation, Fact_relationship
Standardized health system data: Location, Care_site, Provider
Standardized health economics: Payer_plan_period, Cost
Standardized derived elements: Cohort, Cohort_attribute, Drug_era, Dose_era, Condition_era
Standardized meta-data: CDM_source
Standardized vocabularies: Concept, Vocabulary, Domain, Concept_class, Concept_relationship, Relationship, Concept_synonym, Concept_ancestor, Source_to_concept_map, Drug_strength, Cohort_definition, Attribute_definition
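To show how the clinical tables and the vocabulary tables work together, the sketch below counts patients with a condition or any of its descendant concepts by joining Condition_occurrence to Concept_ancestor. It is a toy built on an in-memory SQLite database: the tables carry only a few columns each, the concept IDs (100, 101, 102, 200) are made up, and the rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (person_id INTEGER PRIMARY KEY, year_of_birth INTEGER);
CREATE TABLE condition_occurrence (
    condition_occurrence_id INTEGER PRIMARY KEY,
    person_id INTEGER,
    condition_concept_id INTEGER,
    condition_start_date TEXT
);
CREATE TABLE concept_ancestor (
    ancestor_concept_id INTEGER,
    descendant_concept_id INTEGER
);
""")

# Hypothetical data: concept 100 is a parent condition, 101 and 102 are its
# subtypes, 200 is unrelated. Each concept is also listed as its own ancestor.
conn.executemany("INSERT INTO person VALUES (?, ?)",
                 [(1, 1950), (2, 1960), (3, 1970), (4, 1980)])
conn.executemany("INSERT INTO condition_occurrence VALUES (?, ?, ?, ?)", [
    (1, 1, 101, "2010-01-05"),
    (2, 1, 102, "2011-03-01"),
    (3, 2, 100, "2012-06-10"),
    (4, 3, 200, "2013-02-02"),
])
conn.executemany("INSERT INTO concept_ancestor VALUES (?, ?)",
                 [(100, 100), (100, 101), (100, 102), (200, 200)])

# Count distinct patients with concept 100 or any of its descendants.
query = """
SELECT COUNT(DISTINCT co.person_id)
FROM condition_occurrence AS co
JOIN concept_ancestor AS ca
  ON ca.descendant_concept_id = co.condition_concept_id
WHERE ca.ancestor_concept_id = ?
"""
(n_persons,) = conn.execute(query, (100,)).fetchone()
print(n_persons)
```

Because the hierarchy lives in a table rather than in the query, the same SQL works for any condition once its ancestor concept ID is known.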

Extensive vocabularies

Preparing your data for analysis
[Diagram labels: Patient-level data in source system/schema; ETL design; ETL implement; ETL test; Patient-level data in OMOP CDM]
OHDSI tools built to help:
WhiteRabbit: profile your source data
RabbitInAHat: map your source structure to CDM tables and fields
ATHENA: standardized vocabularies for all CDM domains
Usagi: map your source codes to CDM vocabulary
CDM: DDL, index, constraints for Oracle, SQL Server, PostgreSQL; Vocabulary tables with loading scripts
ACHILLES: profile your CDM data; review data quality assessment; explore population-level summaries
http://github.com/OHDSI
OHDSI Forums: Public discussions for OMOP CDM implementers/developers

ACHILLES Heel Data Validation

ATLAS to build, visualize, and analyze cohorts

Characterize the cohorts of interest

LAERTES: Knowledge base of what we know: literature, labeling, spontaneous reporting

OHDSI in Action
•  Generate evidence
–  Randomized trial is the gold standard
–  Observational research is supporting
•  Can it become a partnership?

Characterization
•  Today we carry out RCTs without clear knowledge of actual practice
•  There will be no RCTs without an observational precursor
–  It will be required to characterize a population using large-scale observational data before designing an RCT
–  Disease burden
–  Actual treatment practice
–  Time on therapy
–  Course and complication rate
–  Done now somewhat through literature and pilot studies

Treatment Pathways
[Diagram elements: Public; Industry; Regulator; Academics; RCT, Obs; Literature; Lay press; Social media; Guidelines; Formulary; Labels; Advertising; Clinician; Patient; Family; Consultant; Indication; Feasibility; Cost; Preference]
[Diagram groups: Local stakeholders; Global stakeholders; Conduits; Inputs; Evidence]

Network process
1.  Join the collaborative
2.  Propose a study to the open collaborative
3.  Write protocol
–  http://www.ohdsi.org/web/wiki/doku.php?id=research:studies
4.  Code it, run it locally, debug it (minimize others’ work)
5.  Publish it: https://github.com/ohdsi
6.  Each node voluntarily executes on their CDM
7.  Centrally share results
8.  Collaboratively explore results and jointly publish findings
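Steps 6 and 7 amount to running the same code at every site and sending back only aggregates. The sketch below illustrates that pattern with invented data; the function names `run_locally` and `pool`, the row layout, and the site names are assumptions for illustration and do not describe the actual OHDSI study packages.

```python
from typing import Dict, List


def run_locally(patient_rows: List[dict]) -> Dict[str, int]:
    """Runs inside a data partner's environment; returns aggregate counts only."""
    return {
        "patients": len(patient_rows),
        "events": sum(1 for row in patient_rows if row["had_outcome"]),
    }


def pool(site_results: Dict[str, Dict[str, int]]) -> Dict[str, float]:
    """Runs centrally; sees only the aggregates each site chose to share."""
    patients = sum(r["patients"] for r in site_results.values())
    events = sum(r["events"] for r in site_results.values())
    risk = events / patients if patients else float("nan")
    return {"patients": patients, "events": events, "risk": risk}


# Hypothetical patient-level rows that never leave each site.
site_a_rows = [{"had_outcome": flag} for flag in (True, False, False, False)]
site_b_rows = [{"had_outcome": flag} for flag in (True, True, False, False, False, False)]

shared = {
    "site_a": run_locally(site_a_rows),
    "site_b": run_locally(site_b_rows),
}
pooled = pool(shared)
print(shared)
print(pooled)
```

The central step never touches `site_a_rows` or `site_b_rows`, which mirrors the "only aggregate summary statistics" constraint described earlier in the deck.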

OHDSI in action: Chronic disease treatment pathways
•  Conceived at AMIA: 15 Nov 2014
•  Protocol written, code written and tested at 2 sites: 30 Nov 2014
•  Analysis submitted to OHDSI network: 2 Dec 2014
•  Results submitted for 7 databases: 5 Dec 2014

OHDSI participating data partners
Abbreviation | Name | Description | Population, millions
AUSOM | Ajou University School of Medicine | South Korea; inpatient hospital EHR | 2
CCAE | MarketScan Commercial Claims and Encounters | US; private-payer claims | 119
CPRD | UK Clinical Practice Research Datalink | UK; EHR from general practice | 11
CUMC | Columbia University Medical Center | US; inpatient EHR | 4
GE | GE Centricity | US; outpatient EHR | 33
INPC | Regenstrief Institute, Indiana Network for Patient Care | US; integrated health exchange | 15
JMDC | Japan Medical Data Center | Japan; private-payer claims | 3
MDCD | MarketScan Medicaid Multi-State | US; public-payer claims | 17
MDCR | MarketScan Medicare Supplemental and Coordination of Benefits | US; private and public-payer claims | 9
OPTUM | Optum ClinFormatics | US; private-payer claims | 40
STRIDE | Stanford Translational Research Integrated Database Environment | US; inpatient EHR | 2
HKU | Hong Kong University | Hong Kong; EHR | 1

Treatment pathway event flow
Proceedings of the National Academy of Sciences, 2016

T2DM: All databases
Treatment pathways for diabetes
[Figure labels: First drug; Second drug; Only drug]

Type 2 Diabetes Mellitus, Hypertension, Depression
[Figure panels by database: OPTUM, GE, MDCD, CUMC, INPC, MDCR, CPRD, JMDC, CCAE]
Population-level heterogeneity across systems, and patient-level heterogeneity within systems

HTN: All databases

Patient-level heterogeneity
25% of HTN patients (10% of others) have a unique path despite a population of 250 million
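A treatment pathway can be thought of as the ordered list of distinct drugs a patient receives, and a "unique path" as a pathway that no other patient shares. The sketch below shows one way to compute both from dated exposure records. It is not the study's published code: the rows are invented, the drug names are placeholders, and the rule "order by first start date, ignore repeats" is an assumption.

```python
from collections import Counter, defaultdict

# Hypothetical (person_id, drug, start_date) rows with ISO dates,
# which sort correctly as plain strings.
exposures = [
    (1, "metformin", "2010-01-01"),
    (1, "metformin", "2010-04-01"),
    (1, "glyburide", "2011-02-01"),
    (2, "metformin", "2012-05-01"),
    (3, "metformin", "2009-03-01"),
    (3, "glyburide", "2009-09-01"),
    (4, "insulin", "2013-07-01"),
    (4, "metformin", "2014-01-01"),
]


def pathways(rows):
    """Return each person's drugs in order of first use, without repeats."""
    by_person = defaultdict(list)
    for person_id, drug, _start in sorted(rows, key=lambda r: (r[0], r[2])):
        if drug not in by_person[person_id]:
            by_person[person_id].append(drug)
    return {pid: tuple(drugs) for pid, drugs in by_person.items()}


paths = pathways(exposures)
path_counts = Counter(paths.values())

# Share of patients whose pathway is shared with nobody else.
unique_share = sum(1 for p in paths.values() if path_counts[p] == 1) / len(paths)

print(paths)
print(unique_share)
```

In this toy data, patients 1 and 3 share the metformin-then-glyburide path, while patients 2 and 4 each follow a path no one else has, so half of the patients have a unique path.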

Monotherapy – diabetes
[Figure: line chart, y-axis 0 to 1, x-axis 1989 to 2009, one series per database: AUSOM (S Korea*), CCAE (US#), CPRD (UK*), CUMC (US*), GE (US*), INPC (US*#), JMDC (Japan#), MDCD (US#), MDCR (US#), OPTUM (US#), STRIDE (US*)]
General upward trend in monotherapy

Monotherapy – HTN
[Figure: line chart, y-axis 0 to 1, x-axis 1989 to 2009, one series per database: AUSOM (S Korea*), CCAE (US#), CPRD (UK*), CUMC (US*), GE (US*), INPC (US*#), JMDC (Japan#), MDCD (US#), MDCR (US#), OPTUM (US#), STRIDE (US*)]
Academic medical centers differ from general practices

Monotherapy – diabetes
[Figure: line chart, y-axis 0 to 1, x-axis 1989 to 2009, one series per database: AUSOM (S Korea*), CCAE (US#), CPRD (UK*), CUMC (US*), GE (US*), INPC (US*#), JMDC (Japan#), MDCD (US#), MDCR (US#), OPTUM (US#), STRIDE (US*)]
General practices, whether EHR or claims, have similar profiles

Conclusions: Network research
•  It is feasible to encode the world population in a single data model
–  Over 600,000,000 records by voluntary effort (682,000,000)
•  Generating evidence is feasible
•  Stakeholders willing to share results
•  Able to accommodate vast differences in privacy and research regulation