SAMPLE DESIGN: WHO WILL BE IN THE SAMPLE? Lu Ann Aday, Ph.D. The University of Texas School of Public Health

SAMPLE DESIGN: Key Components

• Target population or universe: group about which information is desired
• Sampling frame: operational definition of the target population, which should directly match the target population, e.g., existing or constructed list of individuals from which the sample would actually be drawn
• Sample elements: types of individuals or units that will be drawn; the ultimate sampling unit is the final sampling unit, which is usually the focus of the analysis, e.g., individuals

SAMPLE DESIGN: Types of Designs

• Probability sample: Relies on laws of chance to pick the sample, where the probability of selection is known, i.e., based on the sampling fraction n/N
• Nonprobability sample: Relies on human judgment to pick the sample

SAMPLE DESIGN: Types of Nonprobability Designs

• Purposive: Pick people for a certain purpose, e.g., focus groups
• Quota: Pick a target number of people in certain categories, e.g., women 18-35
• Chunk: Pick a convenient “chunk” of people, e.g., church attendees
• Volunteer: Ask for volunteers, e.g., healthy male medical students
• Snowball: Identify a small number of individuals representative of the population of interest, who then identify others that meet the same inclusion criteria, e.g., drug users

SAMPLE DESIGN: Types of Probability Designs

• Simple random sample
• Systematic random sample
• Stratified sample
• Cluster sample

SAMPLE DESIGN: Simple Random Sample

Definition: Every unit in the population has a known, nonzero, and equal chance of being selected through a lottery-type procedure

SAMPLE DESIGN: Simple Random Sample

Procedures
• Draw the sample randomly from numbers assigned to sampling elements placed in a sampling “urn” OR
• Use a random numbers table to identify sampling elements to be included OR
• Use computer software to randomly select the sample from a computerized sampling frame
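The software option can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed tool: the function name `simple_random_sample`, the seed value, and the frame of 50 numbered elements with a target of 10 (borrowed from the random numbers table example) are all assumptions made for the sketch.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements; every element has the same chance of selection."""
    rng = random.Random(seed)      # a fixed seed only makes the draw reproducible
    return rng.sample(frame, n)    # sampling without replacement

# Hypothetical computerized sampling frame: elements numbered 1-50.
frame = list(range(1, 51))
sample = simple_random_sample(frame, n=10, seed=2024)
print(sorted(sample))
```

Each element's chance of selection here is the sampling fraction n/N = 10/50.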

RANDOM NUMBERS TABLE: Example

1-Select a random starting point “X”
2-Look at the first two digits of each random number
3-Proceed from left to right through the table to identify elements from the sampling frame (numbered 1-50) until the target sample size (n), e.g., 10, has been reached

91567 42595 X 27958 30134 04024
17955 56349 90999 49127 20044
46503 18584 18845 49618 02304
92157 89634 94824 78171 84610
14577 62765 35065 81263 39667
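The three steps can be traced in code. The sketch below rests on assumptions the slide does not spell out: that “X” marks the entry immediately after it (27958) as the starting point, and that two-digit values outside 1-50 or already drawn are skipped. The function name `select_from_table` is hypothetical.

```python
def select_from_table(table, frame_size, n, start=0):
    """Walk the table from `start`, using the first two digits of each entry."""
    selected = []
    for entry in table[start:]:
        candidate = int(entry[:2])                 # first two digits
        in_frame = 1 <= candidate <= frame_size    # assumed: skip values outside the frame
        if in_frame and candidate not in selected: # assumed: skip repeats
            selected.append(candidate)
        if len(selected) == n:
            break
    return selected

# The table from the slide, read left to right, row by row.
table = [
    "91567", "42595", "27958", "30134", "04024",
    "17955", "56349", "90999", "49127", "20044",
    "46503", "18584", "18845", "49618", "02304",
    "92157", "89634", "94824", "78171", "84610",
    "14577", "62765", "35065", "81263", "39667",
]

picked = select_from_table(table, frame_size=50, n=10, start=2)  # start at 27958
print(picked)  # [27, 30, 4, 17, 49, 20, 46, 18, 2, 14]
```

Under these assumptions the tenth element is reached at entry 14577, so the walk stops before the end of the table.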

SAMPLE DESIGN: Systematic Random Sample

Definition: Variation of simple random sample in which a starting point is selected at random and every kth unit is taken thereafter, where the sampling interval k = N/n is the inverse of the sampling fraction

SAMPLE DESIGN: Systematic Random Sample

Procedures
1-Determine the sampling interval required to sample the required number of cases, based on the sampling fraction n/N, e.g., 10/50 = 1/5, which gives an interval of 5
2-Select a random starting point “X” within the first sampling interval, e.g., elements 1-5
3-Starting at “X”, sample every kth case (k = N/n, here every 5th case) from the sampling frame until the target sample size (n), e.g., 10, has been reached

SYSTEMATIC RANDOM SAMPLE: Example, n/N = 10/50 = 1/5 (20%)

 1      11      21      31      41
 2      12      22      32      42
 3 X    13 X    23 X    33 X    43 X
 4      14      24      34      44
 5      15      25      35      45
 6      16      26      36      46
 7      17      27      37      47
 8 X    18 X    28 X    38 X    48 X
 9      19      29      39      49
10      20      30      40      50
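The selection pattern in this example can be reproduced with a short sketch. It assumes N is an exact multiple of n so that the interval N/n is a whole number; the function name `systematic_sample` is hypothetical, and the start of 3 is taken from the marked elements in the grid.

```python
import random

def systematic_sample(frame, n, start=None, seed=None):
    """Take every k-th element, k = N // n, from a 1-based random start in 1..k."""
    interval = len(frame) // n                      # e.g. 50 // 10 = 5
    if start is None:
        start = random.Random(seed).randint(1, interval)
    positions = range(start, len(frame) + 1, interval)
    return [frame[p - 1] for p in positions][:n]    # positions are 1-based

frame = list(range(1, 51))
grid_example = systematic_sample(frame, n=10, start=3)
print(grid_example)  # [3, 8, 13, 18, 23, 28, 33, 38, 43, 48]

random_start = systematic_sample(frame, n=10, seed=7)  # start drawn at random
```

Only the starting point is random; once it is fixed, the rest of the sample is determined by the interval.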

SAMPLE DESIGN: Stratified Sample

Definition: Sample based on dividing the population into homogeneous strata and drawing a random-type sample separately from each stratum

• Proportionate: Use the same sampling fraction in each stratum
• Disproportionate: Use a different sampling fraction in each (or selected) stratum

SAMPLE DESIGN: Stratified Sample

Procedures
1-Order or group the sampling frame by relevant strata
2-Determine the sampling interval required to sample the required number of cases, based on the sampling fraction
3-Select a random starting point “X” within the first sampling interval
4-Starting at “X”, sample every kth case (k = N/n) from the sampling frame until the target sample size (n) has been reached
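A rough sketch of these four steps, assuming a single sampling fraction across the ordered frame (the proportionate case). The frame, the stratum labels, and the function name `stratified_systematic_sample` are hypothetical; stratum sizes are chosen as multiples of the interval so the allocation comes out exact.

```python
import random
from collections import Counter

def stratified_systematic_sample(frame, stratum_of, n, seed=None):
    """Order the frame by stratum, then take every k-th unit from a random start."""
    ordered = sorted(frame, key=stratum_of)          # step 1: group by stratum
    interval = len(ordered) // n                     # step 2: k = N // n
    start = random.Random(seed).randrange(interval)  # step 3: 0-based random start
    return ordered[start::interval][:n]              # step 4: every k-th unit

# Hypothetical frame of 100 units: 20 in stratum A, 50 in B, 30 in C, listed in mixed order.
frame = ([("A", i) for i in range(20)]
         + [("B", i) for i in range(50)]
         + [("C", i) for i in range(30)])
random.Random(0).shuffle(frame)

sample = stratified_systematic_sample(frame, stratum_of=lambda unit: unit[0], n=10, seed=1)
counts = Counter(stratum for stratum, _ in sample)
print(dict(counts))
```

Because the frame is ordered by stratum before the systematic pass, each stratum contributes in proportion to its size (2, 5, and 3 units here) whatever the random start.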

STRATIFIED SAMPLE: Example, Proportionate, n/N = 1/20 (5%) in all strata

STRATA   N (%)          n/N     n (%)
A          500 (5%)     1/20     25 (5%)
B         3000 (30%)    1/20    150 (30%)
C         2000 (20%)    1/20    100 (20%)
D          500 (5%)     1/20     25 (5%)
E          700 (7%)     1/20     35 (7%)
F         1600 (16%)    1/20     80 (16%)
G          700 (7%)     1/20     35 (7%)
H         1000 (10%)    1/20     50 (10%)
Total    10000                  500

STRATIFIED SAMPLE: Example, Disproportionate, n/N = 1/20 (5%) in strata B, C, F, H and 1/10 (10%) in strata A, D, E, G

STRATA   N (%)          n/N     n (%)
A          500 (5%)     1/10     50 (8.1%)
B         3000 (30%)    1/20    150 (24.2%)
C         2000 (20%)    1/20    100 (16.1%)
D          500 (5%)     1/10     50 (8.1%)
E          700 (7%)     1/10     70 (11.3%)
F         1600 (16%)    1/20     80 (12.9%)
G          700 (7%)     1/10     70 (11.3%)
H         1000 (10%)    1/20     50 (8.1%)
Total    10000                  620
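The arithmetic behind the proportionate and disproportionate tables can be checked with a small sketch. The stratum sizes and sampling fractions come from the slides; the function name `allocate` and the use of exact fractions are choices made for the illustration.

```python
from fractions import Fraction

# Stratum sizes (N) from the slides.
STRATA = {"A": 500, "B": 3000, "C": 2000, "D": 500,
          "E": 700, "F": 1600, "G": 700, "H": 1000}

def allocate(strata_sizes, fractions):
    """Return per-stratum sample sizes, the total n, and each stratum's % of the sample."""
    sizes = {s: int(N * fractions[s]) for s, N in strata_sizes.items()}
    total = sum(sizes.values())
    shares = {s: round(100 * n / total, 1) for s, n in sizes.items()}
    return sizes, total, shares

# Proportionate: 1/20 in every stratum.
prop_sizes, prop_total, prop_shares = allocate(
    STRATA, {s: Fraction(1, 20) for s in STRATA})

# Disproportionate: 1/10 in strata A, D, E, G and 1/20 elsewhere.
disp_fractions = {s: Fraction(1, 10) if s in "ADEG" else Fraction(1, 20) for s in STRATA}
disp_sizes, disp_total, disp_shares = allocate(STRATA, disp_fractions)

print(prop_total, disp_total)            # 500 620
print(disp_sizes["A"], disp_shares["A"]) # 50 8.1
```

In the proportionate design each stratum's share of the sample equals its share of the population. In the disproportionate design the oversampled strata take a larger share: stratum A is 5% of the population but 8.1% of the sample.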

SAMPLE DESIGN: Cluster Sample

Definition: Sample based on dividing the population into heterogeneous clusters, selecting a sample of clusters, and drawing a random-type sample within each selected cluster

CLUSTER SAMPLE: Example—Probability Proportionate to Size (PPS) (Aday & Cornelius, 2006, Table 6.2) (continued in next lecture)

Block A: 100 HUs*    Block F: 250 HUs*    Block K: 200 HUs*
Block B: 50 HUs      Block G: 125 HUs*    Block L: 300 HUs*
Block C: 75 HUs      Block H: 50 HUs      Block M: 125 HUs
Block D: 150 HUs*    Block I: 100 HUs*    Block N: 150 HUs*
Block E: 200 HUs*    Block J: 50 HUs      Block O: 275 HUs*
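One common way to carry out a PPS selection is a systematic pass over the cumulative sizes, sketched below. The slide defers the actual procedure to the next lecture, so everything beyond the block sizes is an assumption: the alphabetical listing order, the interval of 200, the start of 30, and the function name `pps_systematic` are hypothetical choices for illustration. HUs are presumably housing units.

```python
from itertools import accumulate

# Block sizes (HUs) from the slide, listed alphabetically (assumed order).
BLOCKS = {"A": 100, "B": 50, "C": 75, "D": 150, "E": 200,
          "F": 250, "G": 125, "H": 50, "I": 100, "J": 50,
          "K": 200, "L": 300, "M": 125, "N": 150, "O": 275}

def pps_systematic(sizes, interval, start):
    """Select the block containing each selection point start, start+interval, ..."""
    names = list(sizes)
    cumulative = list(accumulate(sizes.values()))   # running total of HUs
    hits, idx, point = [], 0, start
    while point <= cumulative[-1]:
        while cumulative[idx] < point:              # advance to the block holding `point`
            idx += 1
        hits.append(names[idx])
        point += interval
    return hits

hits = pps_systematic(BLOCKS, interval=200, start=30)  # hypothetical interval and start
print(hits)
# ['A', 'D', 'E', 'F', 'G', 'I', 'K', 'L', 'L', 'N', 'O']
```

Larger blocks span more of the cumulative range and are therefore more likely to contain a selection point; block L (300 HUs) is wider than the interval and is hit twice. With these illustrative values the blocks hit coincide with the ten starred blocks, but the interval and start used in the source are not stated on this slide.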

CRITERIA FOR EVALUATING SAMPLE DESIGNS

• Precision: how close the estimates derived from the sample are to the true population value as a function of variable sampling error
• Accuracy: how close the estimates derived from the sample are to the true population value as a function of systematic sampling error (bias)
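The distinction can be made concrete by enumerating every possible sample from a tiny population. The six values below are made up for illustration, and the "incomplete frame" that omits the largest unit is a hypothetical source of bias; exact fractions are used so the results are not blurred by rounding.

```python
from fractions import Fraction
from itertools import combinations
from statistics import mean, pvariance

# Hypothetical population of six values; the true mean is 7.
population = [Fraction(v) for v in (2, 4, 6, 8, 10, 12)]
true_mean = mean(population)

def sampling_distribution(frame, n):
    """Sample mean of every possible simple random sample of size n."""
    return [mean(s) for s in combinations(frame, n)]

def bias_and_variance(estimates, truth):
    """Systematic error (bias) and variable error (variance) of an estimator."""
    return mean(estimates) - truth, pvariance(estimates)

# Complete frame: all six units can be drawn.
full_bias, full_var = bias_and_variance(sampling_distribution(population, 3), true_mean)

# Incomplete frame: the largest unit is missing from the list.
partial_bias, partial_var = bias_and_variance(
    sampling_distribution(population[:-1], 3), true_mean)

print(full_bias, full_var)        # 0 7/3
print(partial_bias, partial_var)  # -1 4/3
```

The incomplete frame gives estimates that vary less (more precise) yet are centered on the wrong value (less accurate), which is why the two criteria are evaluated separately.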

CRITERIA FOR EVALUATING SAMPLE DESIGNS (cont.)

• Complexity: number of stages and steps required to implement the sample design
• Efficiency: obtaining the most accurate and precise estimates at the lowest possible cost

ADVANTAGES & DISADVANTAGES: Simple Random Sample

ADVANTAGES
• Requires little knowledge of population in advance

DISADVANTAGES
• May not capture certain groups of interest
• May not be very efficient

ADVANTAGES & DISADVANTAGES: Systematic Random Sample

ADVANTAGES
• Easy to analyze and compute sampling (standard) errors
• High precision

DISADVANTAGES
• Periodic ordering of elements in the sampling frame may create biases in the data
• May not capture certain groups of interest
• May not be very efficient
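The periodicity problem is easiest to see with a contrived frame. In the sketch below, the frame of 50 housing units listed five per building, with the third unit in each building being a corner unit, is entirely hypothetical; it is built so that the cycle in the list matches the sampling interval.

```python
# Hypothetical frame: 50 units listed building by building, 5 units per building;
# the 3rd unit listed in every building is a corner unit.
frame = [{"unit": i, "corner": i % 5 == 3} for i in range(1, 51)]

def every_kth(frame, start, k):
    """Systematic selection: 1-based start, then every k-th element."""
    return frame[start - 1::k]

def corner_share(sample):
    """Proportion of corner units in a sample."""
    return sum(u["corner"] for u in sample) / len(sample)

true_share = corner_share(frame)  # 0.2 in the full frame
shares = {start: corner_share(every_kth(frame, start, k=5)) for start in range(1, 6)}
print(true_share, shares)
# 0.2 {1: 0.0, 2: 0.0, 3: 1.0, 4: 0.0, 5: 0.0}
```

Because the interval of 5 coincides with the 5-unit cycle in the list, every possible start yields a sample that is either all corner units or none, and no single sample comes near the true share of 20%.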

ADVANTAGES & DISADVANTAGES: Stratified Sample

ADVANTAGES
• Enables certain groups of interest to be captured
• Enables disproportionate sampling within strata
• Highest precision

DISADVANTAGES
• Requires knowledge of population in advance
• May introduce more complexity in analyzing data and computing sampling (standard) errors

ADVANTAGES & DISADVANTAGES: Cluster Sample

ADVANTAGES
• Lowers field costs
• Enables sampling of groups of individuals for which detail on individuals themselves may not be available

DISADVANTAGES
• Introduces more complexity in analyzing data and computing sampling (standard) errors
• Lowest precision