Data of the Dead An HMORN wide Comparison of Death Data EASTMAN

Post on 15-Jun-2015

Data of the Dead
An HMORN-Wide Comparison of Death Data Sources

David Eastman, KP CHR Southeast, Atlanta, GA
Don Bachman, MS, KP CHR, Portland, OR
Daniel Ng, BSE, MBA, KP DOR, Oakland, CA
Wei Tao, MS, KP DOR, Oakland, CA

Topics
Background
Survey of death data sources
  The SOURCE variable
Methods of weaving death data together
  The CONFIDENCE variable
Inter-source agreement analysis at KPGA
VDW death data QA program preliminary findings
  Sprinkled throughout

Background
VDW death files contain:
  Dates of death
  Qualifiers: data source, confidence, date imputation flag
  Causes of death
Typically, VDW sites have access to multiple sources of death data
How data are woven together varies considerably

Death Data Sources
HMO Membership
  Clarity Patient table
  Common membership
Hospital Discharges
State Death Certificates
Social Security Administration
National Death Index
Tumor Data
Clarity “Death Notes”

HMO Death Data: Pros and Cons
Pros
  No probabilistic matching; unlikely to be the wrong person
  Gold standard at some HMOs
Cons
  No cause of death information
  No inactive (prior) member deaths; death after disenrollment will probably be missed
  At some HMOs, family/employer must notify HMO; less rigorously reported & dates may be inaccurate.
  At other sites, hospital, home health and hospice care are well integrated in the EMR and provide very reliable death dates.
  At some sites, this method is more prone to false negatives than Gov’t data

Gov’t Death Data: Pros and Cons
Pros
  HMO enrollment status at time of death is irrelevant; death after disenrollment is more likely to be captured if it is part of the matching algorithm
  Some gov’t sources contain cause of death information
Cons
  Probabilistic matching on names/dates/SSN/etc.; wrong person may get matched. Some sites cannot match on SSN, which makes the method less reliable.
  Some sites do the matching themselves, some only get matches from the gov’t
  May be more far reaching than HMO data, but may not include deaths outside of HMO’s state(s)
  At some sites, this method is more prone to false positives than HMO data

The SOURCE Variable
Spec definition: Source of death data?
Spec values:
  S = State Death files
  N = National Death Index
  T = Tumor data
  Others are locally defined
Based on preliminary QA results from 7 sites:
  5 sites use the State Death files (S)
  1 site uses National Death Index data (N)
  2 sites use the Tumor data (T)
  7 sites include “other” local codes

Methods of Weaving Death Data Together
Descriptions of methods used at:
  KPGA
  KPNC
  KPNW

KPGA Method - Step 1
Merge all possible death data into a research data warehouse table

KPGA Method - Step 2
Select the “best quality” data to populate the VDW
  HMO sources favored (vs. Gov’t sources)
  Confidence variable: source agreement & postmortem activity
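
The two KPGA steps can be sketched roughly as follows. This is only an illustration under stated assumptions, not KPGA's actual algorithm: the `HMO` source code, the priority order, and the E/F/P rule are all hypothetical; the slides say only that HMO sources are favored and that confidence reflects source agreement and postmortem activity.

```python
from datetime import date

# Hypothetical priority order (lower = preferred): HMO sources ahead of
# government sources. "HMO" is an illustrative local code; S and N follow the spec.
SOURCE_PRIORITY = {"HMO": 0, "S": 1, "N": 2}

def select_best(records):
    """Pick one (source_code, death_date) record for a person by source priority."""
    return min(records, key=lambda r: SOURCE_PRIORITY.get(r[0], 99))

def confidence(records, last_activity_date=None):
    """Illustrative E/F/P rule (an assumption, not the site's real logic)."""
    _, best_date = select_best(records)
    # Postmortem activity: any utilization dated after the chosen death date.
    if last_activity_date is not None and last_activity_date > best_date:
        return "P"
    # Source agreement: at least two distinct sources report the chosen date.
    agreeing = {src for src, d in records if d == best_date}
    return "E" if len(agreeing) >= 2 else "F"

# Made-up example: state file and HMO record agree on the date.
records = [("S", date(2010, 3, 4)), ("HMO", date(2010, 3, 4))]
best = select_best(records)
conf = confidence(records)
```

Passing a `last_activity_date` later than the chosen death date downgrades the record to `P` in this sketch, regardless of how many sources agree.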

KPNC Method
1. Input Pre-Processing
  Combine member records containing demographic variables, contact dates, and membership dates
2. QualityStage matching
  Probabilistic matching of KPNC members to CA state and SSA death records
3. Initial Filtering
  Filter large number of match output records down to manageable size
  Resulting files (KPNC-CA and KPNC-SSA matches) have multiple matches per MRN
4. Ranking & Selection
  Select the single, best match per MRN based on weighted comparison of match linkweights, demographic vars, and contact and membership dates
5. Assign Final Variables
  Select best death date
  Assign scores for overall confidence and confidence of CA and SSA matches
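
Step 4 (Ranking & Selection) might look something like the sketch below. The field names, the weights, and the assumption that each component is already scaled to 0-1 are hypothetical; the slides do not give KPNC's actual weighting.

```python
# Hypothetical weights for the three components named on the slide.
WEIGHTS = {"linkweight": 0.6, "demographic_score": 0.3, "date_score": 0.1}

def score(match):
    """Weighted sum of the match components (each assumed scaled to 0-1)."""
    return sum(WEIGHTS[k] * match[k] for k in WEIGHTS)

def best_match_per_mrn(matches):
    """Keep the single highest-scoring candidate match for each MRN."""
    best = {}
    for m in matches:
        current = best.get(m["mrn"])
        if current is None or score(m) > score(current):
            best[m["mrn"]] = m
    return best

# Made-up candidate matches: two for MRN A1, one for MRN B2.
matches = [
    {"mrn": "A1", "record_id": "CA-1", "linkweight": 0.9,
     "demographic_score": 0.5, "date_score": 1.0},
    {"mrn": "A1", "record_id": "CA-2", "linkweight": 0.7,
     "demographic_score": 1.0, "date_score": 0.0},
    {"mrn": "B2", "record_id": "SSA-9", "linkweight": 0.8,
     "demographic_score": 0.8, "date_score": 0.5},
]
selected = best_match_per_mrn(matches)
```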

KPNW Method - Part 1
Internal KP data: only use reliable sources
1. Patient table from Clarity. Most reliable & best source of death dates based on internal validation and subsequent CESR QA.
2. Common Membership, including a specific death table (older sources don’t include death dates, but do correctly identify dead patients)
3. KPNW tumor registry
4. Probabilistic match of KP members to OR and WA state data by CHR staff (unlike many other sites).
  OR & WA state don’t do the matching and won’t share SSNs.
  CHR staff match members from the past 2 years to the state data. Only current source of cause of death. 18-36 month lag.

KPNW Method - Part 2
We have been creating death files for several years
Death files only include those who we believe have truly died
Death dates from KP internal data appear very reliable based on CESR QA
Death dates from the Tumor Registry and state data are also excellent, but not as good as internal KP data
Death more than 2 years after disenrollment will probably be missed with the current system
Would benefit from switching to a common HMORN confidence variable algorithm

The CONFIDENCE Variable
Spec definition: “How you rate the accuracy of the observation based on source, match, # of reporting sources, discrepancies, etc.”
Spec values: E=Excellent, F=Fair, P=Poor
Based on preliminary QA results from 7 sites, by site:
  % E ranges from 20% to 100%
  % F ranges from 0% to 55%
  % P ranges from 0% to 50%
  % E + % F ranges from 50% to 100%
The CONFIDENCE variable is inconsistently implemented!
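
A per-site tabulation like the one summarized above could be produced along these lines. This is a minimal sketch; the function name and the sample values are made up for illustration and are not the QA program's code or data.

```python
from collections import Counter

def confidence_distribution(values):
    """Percent of records with each spec value (E, F, P) at one site."""
    total = len(values)
    if total == 0:
        return {code: 0.0 for code in ("E", "F", "P")}
    counts = Counter(values)
    return {code: 100.0 * counts.get(code, 0) / total for code in ("E", "F", "P")}

# Made-up CONFIDENCE values for a single hypothetical site.
site_values = ["E", "E", "F", "P"]
dist = confidence_distribution(site_values)
```

Running this for each site and comparing the resulting percentages is one simple way to surface the kind of between-site variation described on the slide.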

The CONFIDENCE Variable
What does the confidence variable measure?
  Likelihood of death?
  Accuracy of the death date?
  Likelihood that the cause of death information is linked to the correct person?

Inter-source Agreement Analysis at KPGA
Where do data come from?
Corroborated deaths
Inter-source death date agreement
Postmortem activity
Confidence distribution

Where Do Data Come From? (KPGA)

Corroborated Deaths (KPGA)

Inter-source Death Date Agreement (KPGA)
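
The slide content for this analysis is not in the transcript. As a generic illustration only (not the KPGA analysis itself), agreement between two sources' death dates can be measured as the share of person-level date pairs that fall within a chosen tolerance. The function name, the tolerance parameter, and the sample dates are all assumptions.

```python
from datetime import date

def date_agreement(pairs, tolerance_days=0):
    """Fraction of (date_a, date_b) pairs that differ by at most tolerance_days."""
    if not pairs:
        return None
    agree = sum(1 for a, b in pairs if abs((a - b).days) <= tolerance_days)
    return agree / len(pairs)

# Made-up death dates for four people as reported by two different sources.
pairs = [
    (date(2010, 1, 1), date(2010, 1, 1)),
    (date(2010, 5, 10), date(2010, 5, 12)),
    (date(2011, 7, 1), date(2011, 8, 1)),
    (date(2012, 2, 2), date(2012, 2, 2)),
]
exact = date_agreement(pairs)
within_week = date_agreement(pairs, tolerance_days=7)
```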

Postmortem Activity (KPGA)

Confidence Distribution (KPGA)

Recommendations
Create new confidence variables
  Confidence that the patient is really dead
  Confidence in the death date
  Confidence in the linkage to external source data
  KPNC has implemented these as local variables
Develop a common algorithm to determine the values of these confidence variables to give them a common meaning.

Any Questions?