SAMPLE DESIGN: WHO WILL BE IN THE SAMPLE?
Lu Ann Aday, Ph.D.
The University of Texas School of Public Health
SAMPLE DESIGN: Key Components
Target population or universe: group about which information is desired
Sampling frame: operational definition of the target population that corresponds directly to the target population, e.g., an existing or constructed list of individuals from which the sample would actually be drawn
Sample elements: types of individuals or units that will be drawn; the ultimate sampling unit is the final sampling unit, which is usually the focus of the analysis, e.g., individuals
SAMPLE DESIGN: Types of Designs
Probability Sample: Relies on laws of chance to pick the sample, where the probability of selection is known, i.e., based on the sampling fraction: n/N
Nonprobability Sample: Relies on human judgment to pick the sample
SAMPLE DESIGN: Types of Nonprobability Designs
Purposive: Pick people for certain purpose, e.g., focus groups
Quota: Pick target number of people in certain categories, e.g., women 18-35
Chunk: Pick convenient “chunk” of people, e.g., church attendees
Volunteer: Ask for volunteers, e.g., healthy male medical students
Snowball: Identify small number of individuals representative of the population of interest, who then identify others that meet the same inclusion criteria, e.g., drug users
SAMPLE DESIGN: Types of Probability Designs
Simple random sample
Systematic random sample
Stratified sample
Cluster sample
SAMPLE DESIGN: Simple Random Sample
Definition: Every unit in the population has a known, nonzero, and equal chance of being selected through a lottery-type procedure
SAMPLE DESIGN: Simple Random Sample
Procedures
Draw sample randomly from numbers assigned to sampling elements placed in a sampling “urn” OR
Use a random numbers table to identify sampling elements to be included OR
Use computer software to randomly select sample from computerized sampling frame
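The third procedure (computer software) can be sketched in a few lines of Python. This is only an illustration: the frame numbered 1-50, the sample size of 10, and the function name `simple_random_sample` are assumptions chosen to match the examples used elsewhere in these slides, not part of the original lecture.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements; every element has the same chance of selection."""
    rng = random.Random(seed)      # the seed is only there to make the draw repeatable
    return rng.sample(frame, n)    # sampling without replacement

# Hypothetical computerized sampling frame, numbered 1-50
frame = list(range(1, 51))
sample = simple_random_sample(frame, n=10, seed=2024)
print(sorted(sample))
```

Each element has probability n/N = 10/50 of ending up in the sample, which is the sampling fraction defined earlier.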
RANDOM NUMBERS TABLE: Example
1-Select random starting point “X”
2-Look at 1st two digits of random numbers
3-Proceed from left to right through table to identify elements from sampling frame (numbered 1-50) until the target sample size (n), e.g., 10, has been reached
91567 42595 X
27958 30134 04024
17955 56349 90999 49127 20044
46503 18584 18845 49618 02304
92157 89634 94824 78171 84610
14577 62765 35065 81263 39667
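The table-reading steps can be sketched as follows. Several things here are assumptions rather than part of the slide: the starting point is taken to be the first entry of the last four rows shown above (the exact position of “X” is not recoverable from this transcript), numbers that fall outside 1-50 or repeat an earlier pick are skipped, and the function name is hypothetical.

```python
def select_from_table(table_rows, frame_size, n):
    """Read a random numbers table left to right and collect frame elements."""
    selected = []
    for row in table_rows:
        for number in row.split():
            candidate = int(number[:2])  # 1st two digits, e.g. "02304" -> 2
            # Assumption: skip values outside the frame and repeated values
            if 1 <= candidate <= frame_size and candidate not in selected:
                selected.append(candidate)
            if len(selected) == n:
                return selected
    return selected  # table exhausted before n elements were found

# The four complete rows of the table above, kept as strings to preserve leading zeros
rows = [
    "17955 56349 90999 49127 20044",
    "46503 18584 18845 49618 02304",
    "92157 89634 94824 78171 84610",
    "14577 62765 35065 81263 39667",
]
print(select_from_table(rows, frame_size=50, n=5))  # [17, 49, 20, 46, 18]
```

Under these assumptions the four rows supply only 9 usable elements, so a target of n = 10 would require reading further into a longer table.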
SAMPLE DESIGN: Systematic Random Sample
Definition: Variation of simple random sample, selected by randomly choosing a starting point and then taking every kth unit thereafter, where the sampling interval k = N/n is the inverse of the sampling fraction n/N
SAMPLE DESIGN: Systematic Random Sample
Procedures
1-Determine the sampling interval required to sample the required number of cases, based on the sampling fraction: n/N, e.g., 10/50 = 1/5, i.e., every 5th case
2-Select a random starting point “X” within the first sampling interval, e.g., elements 1-5
3-Starting at “X”, sample every (N/n)th case, e.g., every 5th case, from the sampling frame until the target sample size (n), e.g., 10, has been reached
SYSTEMATIC RANDOM SAMPLE: Example, n/N = 10/50 = 1/5 (20%)
1 11 21 31 41
2 12 22 32 42
3 X 13 X 23 X 33 X 43 X
4 14 24 34 44
5 15 25 35 45
6 16 26 36 46
7 17 27 37 47
8 X 18 X 28 X 38 X 48 X
9 19 29 39 49
10 20 30 40 50
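The grid above can be reproduced with a short sketch. The function name is hypothetical, and the code assumes N is an exact multiple of n so that the sampling interval is a whole number; in practice the start would be drawn at random from the first interval, while here it is fixed at 3 to match the example.

```python
def systematic_sample(frame, n, start):
    """Take every kth element of the frame, beginning at position `start` (1-based)."""
    interval = len(frame) // n            # sampling interval k = N/n, e.g. 50/10 = 5
    if not 1 <= start <= interval:
        raise ValueError("start must fall within the first sampling interval")
    return frame[start - 1::interval][:n]

frame = list(range(1, 51))                # sampling frame numbered 1-50
sample = systematic_sample(frame, n=10, start=3)
print(sample)  # [3, 8, 13, 18, 23, 28, 33, 38, 43, 48]
```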
SAMPLE DESIGN: Stratified Sample
Definition: Sample based on dividing the population into homogeneous strata and drawing a random-type sample separately from each stratum
Proportionate: Use same sampling fraction in each stratum
Disproportionate: Use different sampling fraction in each (or selected) stratum
SAMPLE DESIGN: Stratified Sample
Procedures
1-Order or group the sampling frame by relevant strata
2-Determine the sampling interval required to sample the required number of cases, based on the sampling fraction
3-Select a random starting point “X” within the first sampling interval
4-Starting at “X”, sample every (N/n)th case from the sampling frame until the target sample size (n) has been reached
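The four steps amount to a systematic sample drawn from a frame that has first been ordered by stratum. A minimal sketch follows; the 50-element frame, the stratum labels "A" and "B", the fixed start of 3, and the function name are all hypothetical and chosen only for illustration.

```python
from collections import Counter

def stratified_systematic_sample(frame, n, start):
    """Order the frame by stratum, then take every kth element from `start`."""
    ordered = sorted(frame, key=lambda element: element[1])  # step 1: group by stratum
    interval = len(ordered) // n                             # step 2: interval k = N/n
    return ordered[start - 1::interval][:n]                  # steps 3-4

# Hypothetical frame of (id, stratum) pairs: 20 elements in "A", 30 in "B", interleaved
frame = [(i, "A" if i % 5 in (0, 1) else "B") for i in range(1, 51)]
sample = stratified_systematic_sample(frame, n=10, start=3)
counts = Counter(stratum for _, stratum in sample)
print(counts)  # 4 from "A", 6 from "B"
```

Because the frame is grouped before sampling, each stratum's share of the sample (4/10 and 6/10) matches its share of the frame (20/50 and 30/50), which is the proportionate case.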
STRATIFIED SAMPLE: Example-Proportionate, e.g., n/N = 1/20 (5%) in all strata
STRATA   N (%)          n/N     n (%)
A        500 (5%)       1/20    25 (5%)
B        3000 (30%)     1/20    150 (30%)
C        2000 (20%)     1/20    100 (20%)
D        500 (5%)       1/20    25 (5%)
E        700 (7%)       1/20    35 (7%)
F        1600 (16%)     1/20    80 (16%)
G        700 (7%)       1/20    35 (7%)
H        1000 (10%)     1/20    50 (10%)
Total    10000                  500
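The n column of the proportionate table can be checked with a small calculation. This is a sketch under the table's own figures; the variable and function names are illustrative, and `Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction

# Stratum sizes (N) from the proportionate example above
strata_sizes = {"A": 500, "B": 3000, "C": 2000, "D": 500,
                "E": 700, "F": 1600, "G": 700, "H": 1000}

def proportionate_allocation(sizes, fraction):
    """Apply the same sampling fraction to every stratum."""
    # int() truncates; here every product is already a whole number
    return {stratum: int(size * fraction) for stratum, size in sizes.items()}

allocation = proportionate_allocation(strata_sizes, Fraction(1, 20))
print(allocation)                 # {'A': 25, 'B': 150, 'C': 100, ...}
print(sum(allocation.values()))   # 500
```

With a single sampling fraction, each stratum's share of the sample equals its share of the population, e.g., stratum B is 3000/10000 = 30% of the population and 150/500 = 30% of the sample.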
STRATIFIED SAMPLE: Example-Disproportionate, e.g., n/N = 1/20 (5%) in strata B, C, F, H & 1/10 (10%) in strata A, D, E, G
STRATA   N (%)          n/N     n (%)
A        500 (5%)       1/10    50 (8.1%)
B        3000 (30%)     1/20    150 (24.2%)
C        2000 (20%)     1/20    100 (16.1%)
D        500 (5%)       1/10    50 (8.1%)
E        700 (7%)       1/10    70 (11.3%)
F        1600 (16%)     1/20    80 (12.9%)
G        700 (7%)       1/10    70 (11.3%)
H        1000 (10%)     1/20    50 (8.1%)
Total    10000                  620
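The disproportionate table can be reproduced the same way, this time with a sampling fraction per stratum. The sketch below uses the table's figures; the names are illustrative only.

```python
from fractions import Fraction

strata_sizes = {"A": 500, "B": 3000, "C": 2000, "D": 500,
                "E": 700, "F": 1600, "G": 700, "H": 1000}

# 1/10 in strata A, D, E, G; 1/20 in strata B, C, F, H (as in the example above)
sampling_fractions = {
    stratum: Fraction(1, 10) if stratum in "ADEG" else Fraction(1, 20)
    for stratum in strata_sizes
}

allocation = {s: int(strata_sizes[s] * sampling_fractions[s]) for s in strata_sizes}
total_n = sum(allocation.values())

# Each stratum's share of the sample, as a percentage rounded to one decimal
sample_share = {s: round(100 * n / total_n, 1) for s, n in allocation.items()}
print(total_n)        # 620
print(sample_share)   # {'A': 8.1, 'B': 24.2, 'C': 16.1, ...}
```

Oversampled strata end up with a larger share of the sample than of the population: stratum A is 5% of the population but 8.1% of the sample, while stratum B drops from 30% to 24.2%.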
SAMPLE DESIGN: Cluster Sample
Definition: Sample based on dividing the population into heterogeneous clusters, drawing a sample of clusters, and then drawing a random-type sample separately within each sampled cluster
CLUSTER SAMPLE: Example-Probability Proportionate to Size (PPS) (Aday & Cornelius, 2006, Table 6.2) (continued in next lecture)
Block A: 100 HUs*
Block B: 50 HUs
Block C: 75 HUs
Block D: 150 HUs*
Block E: 200 HUs*
Block F: 250 HUs*
Block G: 125 HUs*
Block H: 50 HUs
Block I: 100 HUs*
Block J: 50 HUs
Block K: 200 HUs*
Block L: 300 HUs*
Block M: 125 HUs
Block N: 150 HUs*
Block O: 275 HUs*
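As a rough illustration of what "proportionate to size" means for these blocks, the sketch below computes each block's chance of being picked on a single draw as its share of all HUs. It does not attempt to reproduce the selection procedure of Table 6.2 or the asterisks above, which are covered in the next lecture; the variable names are illustrative.

```python
from fractions import Fraction

# HU counts per block, from the list above
block_sizes = {"A": 100, "B": 50, "C": 75, "D": 150, "E": 200,
               "F": 250, "G": 125, "H": 50, "I": 100, "J": 50,
               "K": 200, "L": 300, "M": 125, "N": 150, "O": 275}

total_hus = sum(block_sizes.values())

# Under PPS, a block's chance on one draw is proportional to its size
draw_probability = {block: Fraction(hus, total_hus)
                    for block, hus in block_sizes.items()}

print(total_hus)               # 2200
print(draw_probability["A"])   # 1/22
print(draw_probability["L"])   # 3/22
```

Block L, with 300 HUs, is six times as likely to be drawn as Block B, with 50 HUs.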
CRITERIA FOR EVALUATING SAMPLE DESIGNS
Precision: how close the estimates derived from the sample are to the true population value as a function of variable sampling error
Accuracy: how close the estimates derived from the sample are to the true population value as a function of systematic sampling error (bias)
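The distinction between the two criteria can be made concrete with a small numerical sketch. The estimates below are invented for illustration (imagine the same survey repeated five times under two hypothetical designs); the spread of the estimates stands in for variable error and the gap between their average and the true value stands in for bias.

```python
from statistics import mean, pstdev

def describe_estimates(estimates, true_value):
    """Return (bias, spread) for a set of repeated sample estimates."""
    bias = mean(estimates) - true_value   # systematic error: relates to accuracy
    spread = pstdev(estimates)            # variable error: relates to precision
    return bias, spread

true_value = 50                            # hypothetical true population value
design_1 = [48, 52, 50, 49, 51]            # centered on the truth, more scattered
design_2 = [54, 56, 55, 55, 55]            # tightly clustered, but off target

bias_1, spread_1 = describe_estimates(design_1, true_value)
bias_2, spread_2 = describe_estimates(design_2, true_value)
print(bias_1, round(spread_1, 2))
print(bias_2, round(spread_2, 2))
```

In this made-up case the first design is accurate but less precise, and the second is more precise but biased.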
CRITERIA FOR EVALUATING SAMPLE DESIGNS (cont.)
Complexity: number of stages and steps required to implement the sample design
Efficiency: obtaining the most accurate and precise estimates at the lowest possible costs
ADVANTAGES & DISADVANTAGES: Simple Random Sample
ADVANTAGES
Requires little knowledge of population in advance
DISADVANTAGES
May not capture certain groups of interest
May not be very efficient
ADVANTAGES & DISADVANTAGES: Systematic Random Sample
ADVANTAGES
Easy to analyze and compute sampling (standard) errors
High precision
DISADVANTAGES
Periodic ordering of elements in sample frame may create biases in the data
May not capture certain groups of interest
May not be very efficient
ADVANTAGES & DISADVANTAGES: Stratified Sample
ADVANTAGES
Enables certain groups of interest to be captured
Enables disproportionate sampling within strata
Highest precision
DISADVANTAGES
Requires knowledge of population in advance
May introduce more complexity in analyzing data and computing sampling (standard) errors
ADVANTAGES & DISADVANTAGES: Cluster Sample
ADVANTAGES
Lowers field costs
Enables sampling of groups of individuals for which detail on individuals themselves may not be available
DISADVANTAGES
Introduces more complexity in analyzing data and computing sampling (standard) errors
Lowest precision