Post on 16-Apr-2017
Publicly Available Secondary Data Sources: An Overview and an
Example from the Medical Expenditure Panel Survey (MEPS)
Marion R. Sills, MD, MPH Department of Pediatrics
University of Colorado Health Sciences Center
Goals
• How do I find secondary data sets?
• Once I find one, how do I know it’s right for me and my research question?
• Example of a MEPS analysis
Goals
• How do I find secondary data sets?
• Once I find one, how do I know it’s right for me and my research question?
  – What types of questions was it designed to answer?
  – What data elements are available?
  – How can I figure out if those data elements are useful to me?
• Example of a HCUP and a MEPS analysis
Health Data Online
• Agency for Healthcare Research and Quality (AHRQ)
• CDC WONDER
• HRSA
• National Center for Health Statistics (NCHS)
• Partners in Information Access for the Public Health Workforce
Two Examples
• HCUP (KID) used for background statement in a manuscript
• MEPS used for a full analysis for a manuscript
HCUP--KID
• The only all-payer inpatient care database for children in the United States
• Contains data from 2-3 million pediatric hospital discharges
• Online data available via HCUPnet
HCUP--KID
• Question: What is the utilization of inpatient resources for asthma among children?
• Use: A background statement for a grant, demonstrating why the proposed study is important
HCUP
MEPS Analysis Example
• Question: What is the association between parental mental health (MH) status and
  • pediatric healthcare utilization patterns
  • access to care measures
• Use: A manuscript describing this association
Background: MEPS
• Conducted by
  • Agency for Healthcare Research and Quality (AHRQ)
  • National Center for Health Statistics (NCHS)
• MEPS sample drawn from NCHS’s National Health Interview Survey (NHIS)
• Started data collection in 1996
Background: MEPS
• MEPS
  • gives info about US health care use and costs
  • improves accuracy of economic projections
• Who has used MEPS data:
  • policymakers
  • health care administrators
  • businesses
  • researchers
Background: MEPS
• Questions it was designed to address
  • Growth of managed care
  • Changes in private health insurance
  • Changes in the healthcare delivery system
  • Kinds, amount, and cost of health care
  • Who benefits, who bears the costs
Background: MEPS
• MEPS collects data on
  • the specific health services US residents use
  • how frequently they use them
  • the cost of these services
  • how they are paid for
  • the cost, scope, and breadth of health insurance held by the US population
Background: MEPS
• MEPS is unique for
  • the degree of detail in its data
  • its ability to link data on:
    – health services spending and health insurance
    – demographic, employment, economic, health status, and other characteristics
Questions MEPS Can/Cannot Answer
• CAN
  • How do health care use, insurance, and spending vary for different groups?
  • How do access to care and satisfaction with care vary for different groups?
• CANNOT
  • What are estimates of disease, prevalence of health conditions, or mortality/morbidity?
  • What is the frequency of treatments or costs associated with specific treatments?
Structure of MEPS: HC
• From a nationally representative sample of households
• Unit of analysis can be:
  • Family/household
  • Individual
  • Healthcare encounter
Structure of MEPS: HC
• Household level
  – includes respondents whether or not they seek health care
  – respondent report of health-related experiences
Structure of MEPS HC: N

Year    MEPS HC Population Size
1996    21,571
1997    32,636
1998    22,953
1999    22,365
2000    22,839
2001    33,556
2002    39,165
2003    34,215
2004    34,403
2005    33,961
Weighting
• Sample based on complex, stratified, multi-stage, probability design
• Estimates need to be weighted to reflect sample design and survey non-response
  – If unweighted, results are biased
• Use appropriate methods to calculate standard error to allow for complex design
  – If not, standard error is underestimated
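The first point above can be illustrated with a minimal Python sketch of a weighted mean. The three records and their weights are made up for illustration and are not MEPS values; the function name `weighted_mean` is likewise hypothetical.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(w * y) / sum(w)."""
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(w * y for w, y in zip(weights, values)) / total_weight


# Hypothetical toy data: three sampled persons, each standing in for a
# different number of people in the population.
expenditures = [100.0, 200.0, 1000.0]
person_weights = [9000.0, 9000.0, 2000.0]

unweighted = sum(expenditures) / len(expenditures)
weighted = weighted_mean(expenditures, person_weights)

print(f"unweighted mean: {unweighted:.2f}")
print(f"weighted mean:   {weighted:.2f}")
```

In this toy case the high spender represents relatively few people, so ignoring the weights overstates the population mean.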
Weighting

                1997        1998        1999        2000        2001
Average         8,312       11,917      11,730      11,679      8,849
Minimum         299         321         307         454         336
Maximum         68,518      84,587      80,062      78,157      67,537
Variable Name   WTDPER97    WTDPER98    PERWT99F    PERWT00F    PERWT01F
Weighting
• Basic software procedures assume simple random sampling (SRS)
  – MEPS not SRS
  – Point estimates correct (if weighted)
  – Standard errors usually too small
• Software to account for complex design
  – SUDAAN (stand-alone or callable within SAS)
  – STATA (svy commands)
  – SAS (8.2 or later) (survey procedures)
  – SPSS (complex survey features in 13.0 or later)
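To make the standard-error point concrete, here is a rough Python sketch of one common approach, Taylor-series linearization for a weighted mean under a stratified, clustered, with-replacement design. It is an illustration under stated assumptions, not the internal algorithm of any package listed above; the function name and the toy data are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def design_based_mean_se(y, w, strata, psu):
    """Weighted mean and its linearized SE (strata + PSUs, with replacement)."""
    total_w = sum(w)
    mean = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    # Sum each person's linearized score w*(y - mean)/W within (stratum, PSU).
    totals = defaultdict(lambda: defaultdict(float))
    for yi, wi, h, j in zip(y, w, strata, psu):
        totals[h][j] += wi * (yi - mean) / total_w
    variance = 0.0
    for psu_totals in totals.values():
        n_h = len(psu_totals)
        if n_h < 2:
            raise ValueError("each stratum needs at least two PSUs")
        zbar = sum(psu_totals.values()) / n_h
        variance += n_h / (n_h - 1) * sum((z - zbar) ** 2 for z in psu_totals.values())
    return mean, sqrt(variance)


# Hypothetical toy data: 2 strata x 2 PSUs x 2 persons, equal weights.
y = [1.0, 1.0, 3.0, 3.0, 2.0, 2.0, 4.0, 4.0]
w = [1.0] * 8
strata = [1, 1, 1, 1, 2, 2, 2, 2]
psu = [1, 1, 2, 2, 1, 1, 2, 2]

mean, design_se = design_based_mean_se(y, w, strata, psu)

# Naive SE that treats the 8 persons as a simple random sample.
n = len(y)
srs_var = sum((yi - mean) ** 2 for yi in y) / (n - 1)
naive_se = sqrt(srs_var / n)

print(f"mean={mean:.3f} design SE={design_se:.3f} naive SRS SE={naive_se:.3f}")
```

Because persons within a PSU resemble each other in this toy data, the naive SRS formula gives a smaller standard error than the design-based one.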
Example: Using the 2002 MEPS full year consolidated file (PUF HC-070) as the analytic file, the following statements will produce accurate estimates of the average total expenditures in 2002 for children younger than 18 years of age ($1,085.82) and the corresponding standard error ($70.28).
SAS
  proc surveymeans;
    stratum varstr;
    cluster varpsu;
    weight perwt02f;
    var totexp02;
    domain agegroup;
Note: The domain statement in this example will generate estimates for all categories of the variable agegroup (a hypothetical constructed analytic variable where the youngest group is children under 18). There is no option within the surveymeans procedure to select only a specific population subgroup (e.g., agegroup=1).
SUDAAN
  proc descript filetype=sas design=wr;
    nest varstr varpsu;
    weight perwt02f;
    var totexp02;
    subpopn agegroup=1;
Note: The subpopn statement in this example generates estimates for children under 18 (where agegroup is a constructed analytic variable that is equal to 1 for children under 18).
Stata (syntax below applies to releases 8.0 and higher)
  svyset [pweight=perwt02f], strata(varstr) psu(varpsu)
  svymean totexp02, subpop(children)
Note: The subpop statement in this example generates estimates for children under 18 only (where children is a constructed variable set equal to 1 for persons under 18 and set equal to 0 for all other persons).
SPSS
  csplan analysis
    /plan file='filename'
    /planvars analysisweight=perwt02f
    /design strata=varstr cluster=varpsu
    /estimator type=wr.
  csdescriptives
    /plan file='filename'
    /summary variables=totexp02
    /mean
    /statistics se
    /subpop table=children.
Note: The subpop statement in this example will generate estimates for all categories of the variable children (a hypothetical constructed dichotomous analytic variable where 1=children under 18 and 0=adults 18 and over). There is no option within the csdescriptives procedure to select only a specific population subgroup (e.g., children=1).
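The domain and subpopulation statements in the examples above keep the full sample design while estimating for one subgroup. A minimal Python sketch of that idea follows, assuming a linearized, with-replacement variance estimator. The function `domain_mean_se`, the `in_domain` flag, and the toy data are hypothetical and are not taken from MEPS or from any of the packages above.

```python
from collections import defaultdict
from math import sqrt


def domain_mean_se(y, w, strata, psu, in_domain):
    """Subgroup weighted mean and linearized SE, keeping every PSU in the design."""
    w_dom = sum(wi for wi, d in zip(w, in_domain) if d)
    if w_dom <= 0:
        raise ValueError("domain has no weighted observations")
    mean = sum(wi * yi for yi, wi, d in zip(y, w, in_domain) if d) / w_dom
    totals = defaultdict(lambda: defaultdict(float))
    for yi, wi, h, j, d in zip(y, w, strata, psu, in_domain):
        # Persons outside the domain score zero, but their PSU is still registered.
        totals[h][j] += wi * (yi - mean) / w_dom if d else 0.0
    variance = 0.0
    for psu_totals in totals.values():
        n_h = len(psu_totals)
        if n_h < 2:
            raise ValueError("each stratum needs at least two PSUs")
        zbar = sum(psu_totals.values()) / n_h
        variance += n_h / (n_h - 1) * sum((z - zbar) ** 2 for z in psu_totals.values())
    return mean, sqrt(variance)


# Hypothetical toy data: one stratum, three PSUs; PSU 3 contains no children.
y = [10.0, 30.0, 99.0, 20.0]
w = [1.0, 1.0, 1.0, 1.0]
strata = ["A", "A", "A", "A"]
psu = [1, 2, 3, 1]
children = [True, True, False, True]

full_mean, full_se = domain_mean_se(y, w, strata, psu, children)

# Subsetting the file first silently drops PSU 3 from the design.
keep = [i for i, d in enumerate(children) if d]
sub_mean, sub_se = domain_mean_se(
    [y[i] for i in keep], [w[i] for i in keep],
    [strata[i] for i in keep], [psu[i] for i in keep], [True] * len(keep),
)

print(f"domain mean={full_mean:.2f} SE(full design)={full_se:.3f} SE(subset file)={sub_se:.3f}")
```

The point estimate is the same either way. Only the standard error changes, because deleting records before analysis alters the number of PSUs the variance formula sees.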
MEPSnet/HC
MEPS Analysis Example
• Parental Mental Health and Child Healthcare Utilization
• Objective: to show the association between parental mental health (MH) status and
  • pediatric healthcare utilization patterns
  • access to care measures
Methods
• Data source: MEPS HC, 1996-99
• Inclusion criteria
  • 0-18 years old
  • ≥1 parent in MEPS
Methods: Conceptual Model
[Diagram] Model elements:
• # of parents with MH dx
• healthcare-utilization variables
• access-to-care variables
• child’s demographics
• child’s chronic illness
• year
• parent’s education
MEPS File Source:
• Parent’s Full Year File
• Parent’s Medical Conditions File
• Child’s Full Year File
• Child’s Visit Files
Methods
• Outcome measures:
  • Utilization variables
    • ED/Inpatient visits
    • WCC visits
  • Access-to-care variables
    • changed providers
    • any difficulty obtaining usual care
  • Total healthcare expenditures
Methods
• Primary independent variable: number of parents at home with a MH diagnosis (ICD-9 code 291-314)
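One way such a variable might be derived is sketched below in Python. The record layout (a mapping from person ID to a list of ICD-9 code strings), the function names, and the sample IDs are all hypothetical; this is not the authors' code and it does not use actual MEPS variable names.

```python
def has_mh_diagnosis(icd9_codes, low=291, high=314):
    """True if any code's 3-digit category falls in the range low..high."""
    for code in icd9_codes:
        prefix = str(code).strip()[:3]
        # Skip non-numeric categories such as V or E codes.
        if prefix.isdigit() and low <= int(prefix) <= high:
            return True
    return False


def count_parents_with_mh(parent_ids, conditions_by_person):
    """Number of a child's parents (0, 1, or 2) with at least one MH code."""
    return sum(
        has_mh_diagnosis(conditions_by_person.get(pid, []))
        for pid in parent_ids
    )


# Hypothetical condition records keyed by person ID.
conditions = {
    "p1": ["296", "401"],   # 296 falls within 291-314
    "p2": ["493"],          # no code in range
    "p3": ["V20", "311"],   # V code is skipped; 311 falls within range
}

print(count_parents_with_mh(["p1", "p2"], conditions))
print(count_parents_with_mh(["p1", "p3"], conditions))
```

The resulting count per child would then serve as the 0, 1, or 2 parent grouping used in the analysis.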
Methods
• Other independent variables:
  • Child’s
    • age
    • gender
    • race/ethnicity
    • urbanicity
    • census region
    • family size
    • income
    • insurance
    • chronic illness
  • Year
  • Parent’s education
Methods: Analysis
• Bivariate analyses
• Logistic regression: to determine associations between primary independent variable and
  • healthcare-utilization variables
  • access-to-care variables
Results
• 31,062 children in 1996-99; weighted estimate of 76 million children/year
• 18% (13 million) with ≥1 parent with a MH diagnosis
  • 89% (12 million) with 1 parent with MH diagnosis
  • 11% (1.5 million) with 2 parents with MH diagnosis
Results: Bivariate
Significant Association Between Parents’ MH and Both ER Visits and Hospitalizations
[Bar chart: % with visit, by number of parents with MH diagnosis]

                                ER Visit    Hosp    WCC
0 Parents with MH Diagnosis     11.7        2.8     40.1
1 Parent with MH Diagnosis      14.6        3.4     40.2
2 Parents with MH Diagnosis     15.0        5.0     38.6
Results: Bivariate
Association Between Child’s Mean Total Expenditures and Parent’s MH
[Bar chart: total child healthcare expenditures, by number of parents with MH diagnosis]

0 Parents with MH Diagnosis     $744
1 Parent with MH Diagnosis      $935
2 Parents with MH Diagnosis     $1,817
Regression Results
Increased Acute Care Visits and Expenditures

# Parents with MH Diagnosis (referent = 0)
                           1 Parent             2 Parents
Had WCC visit              1.06 (0.95, 1.19)    0.99 (0.77, 1.27)
Had ER/Hosp visit          1.22 (1.08, 1.36)    1.32 (1.05, 1.67)
Had health expenditures    1.34 (1.17, 1.54)    1.67 (1.13, 2.45)
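The table above can be read with the help of a short Python sketch. It assumes the entries are odds ratios with 95% confidence intervals, which the slide does not state explicitly but which is the usual output of a logistic regression. The function names are hypothetical.

```python
from math import exp


def odds_ratio_ci(beta, se, z=1.96):
    """Convert a logistic coefficient and its SE to (OR, lower, upper)."""
    return exp(beta), exp(beta - z * se), exp(beta + z * se)


def excludes_one(lower, upper):
    """True when the interval lies entirely above or below 1 (no 'null' overlap)."""
    return lower > 1.0 or upper < 1.0


# Intervals copied from the table; the OR interpretation is an assumption.
table = {
    ("Had WCC visit", "1 Parent"): (0.95, 1.19),
    ("Had WCC visit", "2 Parents"): (0.77, 1.27),
    ("Had ER/Hosp visit", "1 Parent"): (1.08, 1.36),
    ("Had ER/Hosp visit", "2 Parents"): (1.05, 1.67),
    ("Had health expenditures", "1 Parent"): (1.17, 1.54),
    ("Had health expenditures", "2 Parents"): (1.13, 2.45),
}

for (outcome, group), (lower, upper) in table.items():
    flag = "excludes 1" if excludes_one(lower, upper) else "includes 1"
    print(f"{outcome:<24} {group:<10} ({lower}, {upper}) {flag}")
```

Under that reading, the intervals for WCC visits include 1, while those for ER/hospital visits and for having any health expenditures do not, which matches the slide title.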
Conclusions
• Parent’s MH diagnoses associated with child’s
  • costlier patterns of health care utilization
  • higher overall healthcare costs