Post on 15-Jun-2015
Data of the Dead
An HMORN-Wide Comparison of Death Data Sources
David Eastman, KP CHR Southeast, Atlanta, GA
Don Bachman, MS, KP CHR, Portland, OR
Daniel Ng, BSE, MBA, KP DOR, Oakland, CA
Wei Tao, MS, KP DOR, Oakland, CA
Topics
Background
Survey of death data sources
The SOURCE variable
Methods of weaving death data together
The CONFIDENCE variable
Inter-source agreement analysis at KPGA
VDW death data QA program preliminary findings
  Sprinkled throughout
Background
VDW death files contain:
  Dates of death
  Qualifiers: data source, confidence, date imputation flag
  Causes of death
Typically VDW sites have access to multiple sources of death data
How data are woven together varies considerably
Death Data Sources
HMO Membership
Clarity Patient table
Common membership
Hospital Discharges
State Death Certificates
Social Security Administration
National Death Index
Tumor Data
Clarity “Death Notes”
HMO Death Data: Pros and Cons
Pros
  No probabilistic matching; unlikely to be the wrong person
  Gold standard at some HMOs
Cons
  No cause of death information
  No inactive (prior) member deaths; death after disenrollment will probably be missed
  At some HMOs, family/employer must notify HMO; less rigorously reported & dates may be inaccurate.
  At other sites, hospital, home health and hospice care are well integrated in the EMR and provide very reliable death dates.
At some sites, this method is more prone to false negatives than Gov’t data
Gov’t Death Data: Pros and Cons
Pros
  HMO enrollment status at time of death is irrelevant; death after disenrollment is more likely to be captured if it is part of the matching algorithm
  Some gov’t sources contain cause of death information
Cons
  Probabilistic matching on names/dates/SSN/etc.; wrong person may get matched. Some sites cannot match on SSN, which makes the method less reliable.
Some sites do the matching themselves, some only get matches from the gov’t
May be more far reaching than HMO data, but may not include deaths outside of HMO’s state(s)
At some sites, this method is more prone to false positives than HMO data
The SOURCE Variable
Spec definition: Source of death data?
Spec values:
  S = State Death files
  N = National Death Index
  T = Tumor data
  Others are locally defined
Based on preliminary QA results from 7 sites:
  5 sites use the State Death files (S)
  1 site uses National Death Index data (N)
  2 sites use the Tumor data (T)
  7 sites include “other” local codes
Methods of Weaving Death Data Together
Descriptions of methods used at:
  KPGA
  KPNC
  KPNW
KPGA Method - Step 1
Merge all possible death data into a research data warehouse table
KPGA Method - Step 2
Select the “best quality” data to populate the VDW
HMO sources favored (vs. Gov’t sources)
Confidence variable: source agreement & postmortem activity
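A minimal sketch of the two KPGA steps, assuming a simple numeric source priority. The slides say only that HMO sources are favored over Gov’t sources; the source names, the priority order, and the record layout below are hypothetical and not KPGA's actual implementation.

```python
from datetime import date

# Hypothetical priority order: lower number wins. Only "HMO before Gov't"
# comes from the slides; the specific ordering is an assumption.
SOURCE_PRIORITY = {"HMO_MEMBERSHIP": 0, "HOSPITAL_DISCHARGE": 1, "STATE": 2, "SSA": 3}


def select_best(records):
    """Step 2: keep one record per member, preferring higher-priority sources."""
    best = {}
    for rec in records:
        current = best.get(rec["mrn"])
        if current is None or (
            SOURCE_PRIORITY[rec["source"]] < SOURCE_PRIORITY[current["source"]]
        ):
            best[rec["mrn"]] = rec
    return best


# Step 1: all available death records stacked in one table (made-up rows).
warehouse = [
    {"mrn": "A1", "source": "STATE", "death_date": date(2010, 3, 2)},
    {"mrn": "A1", "source": "HMO_MEMBERSHIP", "death_date": date(2010, 3, 1)},
    {"mrn": "B2", "source": "SSA", "death_date": date(2011, 7, 15)},
]
vdw_rows = select_best(warehouse)
```

Member A1 keeps the HMO date even though a state record exists; B2 has only a Gov’t record, so that one is used.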
KPNC Method
1. Input Pre-Processing
  Combine member records containing demographic variables, contact dates, and membership dates
2. QualityStage matching
  Probabilistic matching of KPNC members to CA state and SSA death records
3. Initial Filtering
  Filter large number of match output records down to manageable size
  Resulting files (KPNC-CA and KPNC-SSA matches) have multiple matches per MRN
4. Ranking & Selection
  Select the single, best match per MRN based on weighted comparison of match linkweights, demographic vars, and contact and membership dates
5. Assign Final Variables
  Select best death date
  Assign scores for overall confidence and confidence of CA and SSA matches
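Step 4 (Ranking & Selection) might be sketched as a weighted score per candidate match. The slides name the inputs (linkweights, demographic variables, contact and membership dates) but not how they are combined, so the weights, field names, and 0-1 agreement measures below are all hypothetical.

```python
# Hypothetical weights; the slides do not give actual values.
WEIGHTS = {"linkweight": 1.0, "demographics": 5.0, "dates": 3.0}


def score(match):
    """Weighted sum of the match linkweight and two 0-1 agreement measures."""
    return (
        WEIGHTS["linkweight"] * match["linkweight"]
        + WEIGHTS["demographics"] * match["demo_agreement"]
        + WEIGHTS["dates"] * match["date_plausibility"]
    )


def best_match_per_mrn(matches):
    """Keep the single highest-scoring candidate for each MRN."""
    best = {}
    for m in matches:
        if m["mrn"] not in best or score(m) > score(best[m["mrn"]]):
            best[m["mrn"]] = m
    return best


# Made-up candidates: MRN 100 has two possible CA matches.
candidates = [
    {"mrn": "100", "record_id": "CA-1", "linkweight": 12.0,
     "demo_agreement": 0.5, "date_plausibility": 1.0},
    {"mrn": "100", "record_id": "CA-2", "linkweight": 10.0,
     "demo_agreement": 1.0, "date_plausibility": 1.0},
    {"mrn": "200", "record_id": "SSA-9", "linkweight": 8.0,
     "demo_agreement": 1.0, "date_plausibility": 0.0},
]
selected = best_match_per_mrn(candidates)
```

Here CA-2 scores 18.0 against 17.5 for CA-1, so the candidate with the lower linkweight but better demographic agreement is selected for MRN 100.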
KPNW Method - Part 1
Internal KP data: only use reliable sources
1. Patient table from Clarity. Most reliable & best source of death dates based on internal validation and subsequent CESR QA.
2. Common Membership including a specific death table (older sources don’t include death dates, but do correctly identify dead patients)
3. KPNW tumor registry
4. Probabilistic match of KP members to OR and WA state data by CHR staff (unlike many other sites).
  OR & WA state don’t do the matching and won’t share SSNs.
  CHR staff match members from the past 2 years to the state data. Only current source of cause of death. 18-36 month lag.
KPNW Method - Part 2
Been creating death files for several years
Death files only include those who we believe have truly died
Death dates from KP internal data appear very reliable based on CESR QA
Death dates from the Tumor Registry and state data are also excellent but not as good as internal KP data
Death more than 2 years after disenrollment will probably be missed with current system
Would benefit from switching to a common HMORN confidence variable algorithm
The CONFIDENCE Variable
Spec definition: “How you rate the accuracy of the observation based on source, match, # of reporting sources, discrepancies, etc.”
Spec values: E=Excellent, F=Fair, P=Poor
Based on preliminary QA results from 7 sites, by site:
  % E ranges from 20% to 100%
  % F ranges from 0% to 55%
  % P ranges from 0% to 50%
  % E + % F ranges from 50% to 100%
The CONFIDENCE variable is inconsistently implemented!
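The per-site percentages above come from tabulating the CONFIDENCE values in each site's death file. A sketch of that tabulation, using made-up data for two hypothetical sites (not the actual QA program code or results):

```python
from collections import Counter


def confidence_distribution(values):
    """Return the percentage of E, F, and P values in one site's death file.

    Assumes at least one E, F, or P value is present.
    """
    counts = Counter(values)
    total = sum(counts[v] for v in "EFP")
    return {v: 100.0 * counts[v] / total for v in "EFP"}


# Made-up CONFIDENCE columns for two hypothetical sites.
site_a = confidence_distribution(["E"] * 8 + ["F"] * 2)
site_b = confidence_distribution(["E"] * 2 + ["F"] * 3 + ["P"] * 5)
```

Comparing such distributions side by side is what exposes the inconsistency: two sites with similar underlying data can report very different shares of E, F, and P.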
The CONFIDENCE Variable
What does the confidence variable measure?
  Likelihood of death?
  Accuracy of the death date?
  Likelihood that the cause of death information is linked to the correct person?
Inter-source Agreement Analysis at KPGA
  Where do data come from?
  Corroborated deaths
  Inter-source death date agreement
  Postmortem activity
  Confidence distribution
Where Do Data Come From? (KPGA)
Corroborated Deaths (KPGA)
Inter-source Death Date Agreement (KPGA)
Postmortem Activity (KPGA)
Confidence Distribution (KPGA)
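Two of the checks named in this analysis, inter-source death date agreement and postmortem activity, can be sketched as follows. The tolerance and grace-period parameters are hypothetical; the slides do not state what thresholds KPGA used.

```python
from datetime import date


def date_agreement(date_a, date_b, tolerance_days=0):
    """True when two sources report death dates within the tolerance."""
    return abs((date_a - date_b).days) <= tolerance_days


def has_postmortem_activity(death_date, contact_dates, grace_days=30):
    """True when any contact falls more than grace_days after the death date."""
    return any((c - death_date).days > grace_days for c in contact_dates)


# Made-up example: HMO and state sources differ by one day.
hmo_date = date(2012, 5, 10)
state_date = date(2012, 5, 11)
visits = [date(2012, 4, 30), date(2012, 9, 1)]

exact = date_agreement(hmo_date, state_date)
near = date_agreement(hmo_date, state_date, tolerance_days=1)
flagged = has_postmortem_activity(hmo_date, visits)
```

The grace period is there because some activity shortly after death (billing, lab results posting) may be legitimate, while a visit months later suggests the death record is linked to the wrong person or has the wrong date.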
Recommendations
Create new confidence variables
  Confidence that the patient is really dead
  Confidence in the death date
  Confidence in the linkage to external source data
  KPNC has implemented these as local variables
Develop a common algorithm to determine the values of these confidence variables to give them a common meaning.
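To illustrate what a common algorithm could look like, here is a sketch that assigns each of the three proposed confidence variables separately. Every rule and threshold below is a hypothetical placeholder, not KPNC's local implementation or an agreed HMORN specification.

```python
from dataclasses import dataclass


@dataclass
class DeathConfidence:
    dead: str        # confidence that the patient is really dead
    death_date: str  # confidence in the death date
    linkage: str     # confidence in the linkage to external source data


def rate(n_sources, dates_agree, postmortem_activity, linkage_score):
    """Assign E/F/P to each variable using hypothetical rules and thresholds."""
    # Really dead: postmortem activity undermines it, corroboration supports it.
    if postmortem_activity:
        dead = "P"
    elif n_sources >= 2:
        dead = "E"
    else:
        dead = "F"

    # Death date: only corroborated dates can rate Excellent.
    if n_sources >= 2:
        death_date = "E" if dates_agree else "P"
    else:
        death_date = "F"

    # Linkage: assumed 0-1 match score with made-up cut points.
    if linkage_score >= 0.9:
        linkage = "E"
    elif linkage_score >= 0.7:
        linkage = "F"
    else:
        linkage = "P"

    return DeathConfidence(dead, death_date, linkage)


corroborated = rate(2, True, False, 0.95)
doubtful = rate(1, False, True, 0.5)
```

Separating the three ratings lets a record be Excellent on one dimension and Poor on another, which a single CONFIDENCE value cannot express.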
Any Questions?