13040 Cornwallis Road ■ P.O. Box 12194 ■ Research Triangle Park, North Carolina, USA 27709 ■ Phone 919-541-6086
Sampling for a Highly Skewed Population: Sample Design for the National Survey of Residential Care Facilities
Margie Byron, Joshua Wiener, John Loft, Vince Iannacchione and Angela Greene
RTI International
International Conference on Establishment Surveys III
June 21, 2007, Montreal, Que.
RTI International is a trade name of Research Triangle Institute
Introduction
National Survey of Residential Care Facilities (NSRCF) to be conducted in early 2009
Joint initiative of the Office of the Assistant Secretary for Planning and Evaluation (ASPE), the National Center for Health Statistics (NCHS), and the Agency for Healthcare Research and Quality (AHRQ)
Very little nationally representative data available on residential care facilities (RCFs)
What are Residential Care Facilities?
There is no commonly used definition
The terms used for these types of residences vary across states in the U.S.
Residential care facilities
Assisted living facilities
Homes for the aged
Board and care homes
Congregate care facilities
Goals of NSRCF
General purpose survey of residential care facilities (RCFs)
How many RCFs are in the U.S. and what are their characteristics?
How many people reside in RCFs and what are their characteristics?
Want sufficient sample sizes and power to perform comparative analyses at the facility and resident levels
RCF Eligibility Criteria
Provide care to a predominantly older population (age 65+)
State licensed or regulated
Licensed to contain 4+ beds
Provide room and board and 2+ meals/day
Provide 24 hour/7 day on-site supervision
Provide assistance with personal care and/or health related services
Nursing homes and retirement communities are not eligible.
Sample Design Challenges
Want sufficient sample sizes and power for both facility and resident level comparative analyses
Want to conduct in-person interviews with facility staff about the facility and its residents
Keep estimated data collection costs within specified budget amount
Adding a facility to the study costs more than adding a resident within an already sampled facility
Estimated Distributions
                  Facilities              Beds
Bed Size       Number   Percent     Number   Percent
4 – 10         20,327      62.1    116,536      15.1
11 – 20         4,274      13.1     64,717       8.4
21 – 50         3,529      10.8    122,542      15.8
51 – 75         2,008       6.1    123,300      16.0
76 – 100        1,040       3.2     91,837      11.9
101 – 900       1,547       4.7    253,557      32.8
Total          32,725     100.0    772,489     100.0
Note: Table total excludes 21,583 facilities in the SSS data file where bed size is missing (36.4% of 59,304 facilities on the file).
Data Source: Social and Statistical Systems, Inc. sampling frame data file; compiled 2003
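The skew in the frame can be seen by putting each bed-size class's share of facilities next to its share of beds. The sketch below only recomputes the percentages from the counts in the table; the names ROWS and shares are illustrative and do not come from the presentation.

```python
# Counts copied from the Estimated Distributions table: (bed size, facilities, beds).
ROWS = [
    ("4-10", 20_327, 116_536),
    ("11-20", 4_274, 64_717),
    ("21-50", 3_529, 122_542),
    ("51-75", 2_008, 123_300),
    ("76-100", 1_040, 91_837),
    ("101-900", 1_547, 253_557),
]


def shares(rows):
    """Return {bed size: (percent of facilities, percent of beds)}."""
    total_facilities = sum(f for _, f, _ in rows)
    total_beds = sum(b for _, _, b in rows)
    return {
        label: (round(100 * f / total_facilities, 1), round(100 * b / total_beds, 1))
        for label, f, b in rows
    }


if __name__ == "__main__":
    for label, (fac_pct, bed_pct) in shares(ROWS).items():
        print(f"{label:>8}: {fac_pct:5.1f}% of facilities, {bed_pct:5.1f}% of beds")
```

The smallest class accounts for most facilities but a small share of beds, while the largest class shows the reverse. That contrast is why bed size drives the design options considered next.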
Sample Design Options
Stratified random sample by bed size
Probability proportional to size (PPS) random sample with bed size used to calculate size measure
Stratified PPS using bed size for stratification and to calculate size measures
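The three options can be sketched in a few lines of Python. This is a minimal illustration and not the selection procedure used in the study: it assumes a frame of (facility_id, beds) pairs, uses the bed count itself as the size measure, and selects PPS samples systematically. All function names are hypothetical.

```python
import random


def stratum_index(beds, upper_cutoffs):
    """Index of the first stratum whose upper cutoff is >= beds."""
    for i, upper in enumerate(upper_cutoffs):
        if beds <= upper:
            return i
    return len(upper_cutoffs)  # open-ended top stratum


def split_into_strata(frame, upper_cutoffs):
    """Group (facility_id, beds) pairs by bed-size stratum."""
    strata = [[] for _ in range(len(upper_cutoffs) + 1)]
    for facility in frame:
        strata[stratum_index(facility[1], upper_cutoffs)].append(facility)
    return strata


def pps_systematic(units, n, rng):
    """Systematic PPS: step through the cumulative bed total at a fixed interval.

    A facility with more beds than the step can be hit more than once;
    a production design would handle such certainty units separately.
    """
    total = sum(beds for _, beds in units)
    step = total / n
    target = rng.random() * step  # random start in [0, step)
    sample, cumulative = [], 0
    for facility in units:
        cumulative += facility[1]
        while target < cumulative and len(sample) < n:
            sample.append(facility)
            target += step
    return sample


def stratified_sample(frame, upper_cutoffs, n_per_stratum, rng, pps=False):
    """Stratified random sample, or stratified PPS when pps=True."""
    sample = []
    for units, n in zip(split_into_strata(frame, upper_cutoffs), n_per_stratum):
        n = min(n, len(units))
        if n == 0:
            continue
        sample.extend(pps_systematic(units, n, rng) if pps else rng.sample(units, n))
    return sample
```

Calling pps_systematic on the whole frame gives the unstratified PPS option; stratified_sample covers the other two, depending on the pps flag.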
Sample Size and Power Simulations
Selected 10 samples of RCFs under each sample design option and various sample sizes to estimate design effects
Determined number of RCFs needed to achieve desired precision requirements
Used equal and unequal subgroup comparison tests
H0: p1=p2
Prevalence rate of 0.50 for subgroup 1
Design Effect Simulation Results
Sample Design     Number of     Design    Effective      Effective      Difference of
Option            Facilities    Effect    Sample Size    Sample Size    Prevalence
                  Needed                  Subgroup 1     Subgroup 2     Estimates
Stratified          1,840        1.15          800            800          0.07
Stratified          1,900        1.00        1,330            570          0.07
Stratified PPS      2,500        1.56          801            801          0.07
Stratified PPS      2,550        1.34        1,332            571          0.07
PPS                 3,710        2.32          800            800          0.07
PPS                 4,430        2.32        1,331            570          0.07
Note: Assumptions: alpha = 0.05, power = 80%, prevalence of characteristic in subgroup 1 = 0.50. Design effect estimates based on sample selection simulations conducted on the SSS sampling frame data.
Source: RTI analysis of SSS data.
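The last column can be approximated from the effective sample sizes with a standard two-sided, two-sample test of proportions. The presentation does not say which formula was used, so the sketch below is an assumption: a normal approximation with unpooled variances, solved by fixed-point iteration. The function name detectable_difference is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def detectable_difference(n1_eff, n2_eff, p1=0.50, alpha=0.05, power=0.80):
    """Smallest p2 - p1 detectable at the given alpha and power.

    n1_eff and n2_eff are effective sample sizes (nominal size divided by
    the design effect). Normal approximation with unpooled variances; this
    is an assumed formula, not one stated in the presentation.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    delta = 0.0
    for _ in range(50):  # fixed-point iteration; settles within a few passes
        p2 = p1 + delta
        delta = z * sqrt(p1 * (1 - p1) / n1_eff + p2 * (1 - p2) / n2_eff)
    return delta


if __name__ == "__main__":
    # Effective sample sizes taken from the table rows.
    for n1, n2 in [(800, 800), (1330, 570)]:
        print(n1, n2, round(detectable_difference(n1, n2), 2))
```

Under these assumptions, both the equal and the unequal subgroup splits in the table give a detectable difference that rounds to 0.07.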
Optimal Stratification Cutoffs
Facility        3-Strata Options                   4-Strata Options
Stratum         Upper Cutoff Values                Upper Cutoff Values
Small           4 to SM beds                       4 to SM beds
                SM = 8, 10, 12 or 15               SM = 8, 10, 12 or 15
Medium          (SM+1) to ML beds                  (SM+1) to ML beds
                ML = 25, 30, 40, 50, 75 or 100     ML = 25, 30, 40 or 50
Large           (ML+1) or more beds                (ML+1) to LX beds
                                                   LX = 75, 100, 125 or 150
Extra-Large                                        (LX+1) or more beds
Number of Options Evaluated: 24 (3-strata), 64 (4-strata)
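The counts of 24 and 64 options follow from crossing the candidate cutoffs. A short sketch that enumerates the grid is given below; the variable names are illustrative, and the cutoff values are those listed for SM, ML and LX.

```python
from itertools import product

# Candidate upper cutoffs (beds) as listed for each stratum boundary.
SM = [8, 10, 12, 15]
ML_THREE_STRATA = [25, 30, 40, 50, 75, 100]
ML_FOUR_STRATA = [25, 30, 40, 50]
LX = [75, 100, 125, 150]

three_strata_options = list(product(SM, ML_THREE_STRATA))
four_strata_options = list(product(SM, ML_FOUR_STRATA, LX))


def describe(cutoffs, minimum_beds=4):
    """Turn a tuple of upper cutoffs into readable bed-size ranges."""
    ranges, lower = [], minimum_beds
    for upper in cutoffs:
        ranges.append(f"{lower}-{upper}")
        lower = upper + 1
    ranges.append(f"{lower}+")
    return ranges


if __name__ == "__main__":
    print(len(three_strata_options), len(four_strata_options))
    print(describe((8, 30)))
    print(describe((8, 25, 100)))
```

Each tuple in either list is one candidate stratification to run through the sample selection simulations.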
Optimal Stratification Results
Facility         3-Strata                      4-Strata
Stratum          Option        # Facilities    Option         # Facilities
Small            4-8 beds           500        4-8 beds            500
Medium           9-30 beds        1,000        9-25 beds           750
Large            31+ beds         1,000        26-100 beds         750
Extra-Large                                    101+ beds           500
Total Sample                      2,500                          2,500
Design Effects                     1.97                           1.99
Resident Sample Selection Simulations
Facility Stratum              Option 1   Option 2   Option 3   Option 4   Option 5
Small                             4          2          2          2          2
Medium                            4          4          3          3          2
Large                             4          5          6          3          4
Extra Large                                                        5          5
Facility Sample Size            2,000      2,000      2,000      2,500      2,500
Total Resident Sample Size      8,000      8,000      8,000      8,000      8,000
Optimal Resident Sample Sizes
Facility         3-Strata                     # Residents/   4-Strata                      # Residents/
Stratum          Option       # Facilities    Facility       Option        # Facilities    Facility
Small            4-8 beds          500             2         4-8 beds           500             2
Medium           9-30 beds       1,000             2         9-25 beds          750             2
Large            31+ beds        1,000             5         26-100 beds        750             4
Extra-Large                                                  101+ beds          500             5
Total Sample                     2,500         8,000                          2,500         8,000
Design Effects                    1.97          1.58                           1.99          1.23
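The resident totals can be checked by multiplying each stratum's facility count by its residents per facility. The sketch below does that for both options, reading the Large row as 5 residents per facility for 31+ beds (3-strata) and 4 residents per facility for 26-100 beds (4-strata). The dictionary layout and the totals function are illustrative.

```python
# stratum -> (facilities sampled, residents sampled per facility)
THREE_STRATA = {
    "Small (4-8 beds)": (500, 2),
    "Medium (9-30 beds)": (1_000, 2),
    "Large (31+ beds)": (1_000, 5),
}
FOUR_STRATA = {
    "Small (4-8 beds)": (500, 2),
    "Medium (9-25 beds)": (750, 2),
    "Large (26-100 beds)": (750, 4),
    "Extra-Large (101+ beds)": (500, 5),
}


def totals(design):
    """Return (total facilities, total residents) for an allocation."""
    facilities = sum(n for n, _ in design.values())
    residents = sum(n * per_facility for n, per_facility in design.values())
    return facilities, residents


if __name__ == "__main__":
    print(totals(THREE_STRATA))
    print(totals(FOUR_STRATA))
```

Both allocations reach the same facility and resident totals; they differ in how the resident interviews are spread across bed-size strata.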
Power for Resident Group Comparisons
Number of    Number of     Percent Interviews   Design    Effective      Effective      Difference of
Residents    Facilities    in Subgroup 1        Effect    Sample Size    Sample Size    Prevalence
                                                          Subgroup 1     Subgroup 2     Estimates
  8,100        1,620             50%             1.47        2,762          2,762          0.04
  8,100        1,620             60%             1.47        3,314          2,210          0.04
  8,100        1,620             80%             1.47        4,419          1,105          0.05
Note: Assumptions: alpha = 0.05, power = 80%, prevalence of characteristic in subgroup 1 = 0.50. Design effect estimates based on sample selection simulations conducted on the SSS sampling frame data. Design effect calculations include an intracluster correlation of 0.01.
Source: RTI analysis of SSS data.
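The note mentions an intracluster correlation of 0.01 but not how it enters the design effect. One common way to account for clustering is Kish's approximation, deff = 1 + (m - 1) * rho, where m is the number of residents interviewed per facility. The sketch below applies that assumed formula and then converts the nominal resident sample into effective subgroup sizes. The function names are illustrative, and the split of the overall 1.47 into components is not given in the presentation.

```python
def clustering_deff(residents_per_facility, rho):
    """Kish's approximate design effect from clustering (assumed formula)."""
    return 1 + (residents_per_facility - 1) * rho


def effective_sizes(n, deff, share_subgroup1):
    """Split the effective sample size n / deff between two subgroups."""
    n_eff = n / deff
    return n_eff * share_subgroup1, n_eff * (1 - share_subgroup1)


if __name__ == "__main__":
    m = 8_100 / 1_620  # average residents per facility in the table
    print(clustering_deff(m, 0.01))
    for share in (0.5, 0.6, 0.8):
        n1, n2 = effective_sizes(8_100, 1.47, share)
        print(f"{share:.0%}: {n1:.0f} and {n2:.0f}")
```

With the published design effect of 1.47, the effective sizes computed this way land within about one percent of the table's values; the small gap may reflect rounding of the design effect.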
Optimal Sample Design Option for NSRCF
Facility Stratum            Number of Facility   Number of Residents /   Number of Resident
                            Interviews           Facility                Interviews
Small (4-10 beds)                  600                    3                    1,800
Medium (11-25 beds)                650                    3                    1,950
Large (26-100 beds)                650                    5                    3,250
Extra Large (101+ beds)            350                    9                    3,150
Total                            2,250                                        10,150
Design Effect                     2.12                                          1.28
Conclusions
Access to preliminary sampling frame data containing characteristics of the target population can be very useful in determining the optimal sample design, even if the frame does not provide complete data for the whole target population.
Because adding a facility to the sample costs more than adding a resident, and because of the power requirements, we focused on finding an optimal design for facility-level analysis that would not sacrifice power for resident-level analysis.
Balancing sample size and power criteria against data collection costs was a complicated, iterative process.
The population of RCFs is very dynamic. The analysis will be repeated once the final sampling frame for the NSRCF is constructed to see if any changes should be made to the optimal bed size stratification cutoffs for the sample design.