Survey Methodology Sampling error and sample size EPID 626 Lecture 4.


Survey Methodology: Sampling error and sample size

EPID 626

Lecture 4


Lecture overview

• Finish discussion of nonprobability sampling

• Discuss sampling error

• Discuss sample sizes

• Some practical exercises


Nonprobability sampling designs

• Focus groups: a small group chosen because the investigator believes its members accurately reflect the population of interest; usually used for planning or piloting


Advantages and Disadvantages

• Advantages

– Cost

– Efficiency

– Effort

– May be the only option

• Disadvantages

– Bias

– Bias

– Bias

– Not a valid representation of the population


When would nonprobability sampling be appropriate?

• Surveys of hard-to-identify groups (ex. A survey of goals and aspirations among gang members)

• Surveys of specific groups (ex. A survey about pain among hospice patients)

• Surveys in pilot situations (used for planning purposes, not for research per se) (Fink, 1995)


Sampling error

• The central limit theorem: if many samples of the same size are drawn from a population, the sample estimates of a particular quantity (say, a mean) will be approximately normally distributed around the true population value, provided the samples are reasonably large.


• This variation around the true value is the sampling error. It stems from the fact that, by chance, samples may differ from the population as a whole.


• The larger the sample size and the smaller the variance of what is being measured, the more tightly the sample estimates will “bunch” around the true population value, and the more precise the sample-based estimate will be.
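The "bunching" described above can be seen in a small simulation. This is a minimal sketch, not part of the lecture: the population is a hypothetical, deliberately skewed set of values, and the names `sample_means`, `population`, and `true_mean` are made up for illustration.

```python
import random
import statistics


def sample_means(population, n, draws, seed=0):
    """Draw `draws` simple random samples of size n and return their means."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.sample(population, n)) for _ in range(draws)]


# Hypothetical population: 20,000 skewed (exponential) values.
_rng = random.Random(42)
population = [_rng.expovariate(1.0) for _ in range(20_000)]
true_mean = statistics.fmean(population)

for n in (10, 100, 1000):
    means = sample_means(population, n, draws=500)
    # The standard deviation of the sample means is the empirical sampling error.
    spread = statistics.pstdev(means)
    print(f"n={n:5d}  mean of sample means={statistics.fmean(means):.3f}  "
          f"spread={spread:.3f}  (true mean={true_mean:.3f})")
```

Under these assumptions the spread of the sample means should shrink as n grows, while their average stays close to the true population mean.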


Standard error of a mean

$$SE = \sqrt{\frac{Var}{n}}$$


Standard error of a mean

• The standard deviation of the distribution of sample estimates of the mean that would be formed if an infinite number of samples of a given size were drawn.


• ~68% of the means of samples of a given size and design will fall within 1 SE of the true population mean

• ~95% will fall within 2 SE; this is the basis of the 95% confidence interval
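As a rough illustration of the standard error of a mean and the "2 SE" rule, here is a minimal sketch. It assumes the unknown population variance is estimated by the sample variance; the function names and the `responses` data are hypothetical.

```python
import math
import statistics


def standard_error_of_mean(values):
    """SE = sqrt(Var / n), with Var estimated by the sample variance."""
    n = len(values)
    return math.sqrt(statistics.variance(values) / n)


def approx_95_ci_mean(values):
    """Approximate 95% CI: sample mean plus or minus 2 SE."""
    mean = statistics.fmean(values)
    margin = 2 * standard_error_of_mean(values)
    return mean - margin, mean + margin


# Hypothetical data: hours of sleep reported by 8 respondents.
responses = [6, 7, 8, 5, 7, 6, 9, 8]
print(f"SE = {standard_error_of_mean(responses):.3f}")
low, high = approx_95_ci_mean(responses)
print(f"approx. 95% CI = ({low:.2f}, {high:.2f})")
```

With a sample this small the 2 SE rule is only a rough guide; it is used here to mirror the rule of thumb on the slide.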


Proportions

• Mean of a two-value (binomial) distribution

• Var of a proportion = p(1-p)

• So the standard error of a proportion is

$$SE = \sqrt{\frac{p(1-p)}{n}}$$
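The standard error of a proportion, sqrt(p(1-p)/n), can be computed directly. A minimal sketch follows; the function name `se_proportion` and the sample size of 100 are illustrative choices, not from the lecture.

```python
import math


def se_proportion(p, n):
    """Standard error of a proportion: sqrt(p * (1 - p) / n)."""
    if not 0 <= p <= 1:
        raise ValueError("p must be a proportion between 0 and 1")
    if n <= 0:
        raise ValueError("n must be a positive sample size")
    return math.sqrt(p * (1 - p) / n)


# For a fixed n, the variance p(1-p) is largest at p = 0.5,
# so the standard error is largest there too.
for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"p={p:.1f}  SE={se_proportion(p, 100):.4f}")
```

The loop shows why p = 0.5 is the conservative (worst-case) assumption when the true proportion is unknown.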


Table 2.1: Confidence Ranges for Variability Attributable to Sampling

• Trends

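• If sample size = 75 and p = 20% (0.20),

$$SE = \sqrt{\frac{(0.20)(0.80)}{75}} = \sqrt{\frac{0.16}{75}} = 0.046188$$

$$2 \times SE = 2(0.046188) \approx 0.092$$

$$95\%\ CI = 0.20 \pm 0.092 \approx (0.11, 0.29)$$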
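The trends in a table of this kind can be reproduced with a few lines of code. The sketch below prints the half-width of the approximate 95% interval (2 SE, in percentage points) for a grid of sample sizes and proportions. The grid values are chosen for illustration and are not copied from Table 2.1; `margin_of_error` is a hypothetical helper name.

```python
import math


def margin_of_error(p, n, z=2.0):
    """Half-width of the approximate 95% interval: z * sqrt(p(1-p)/n)."""
    return z * math.sqrt(p * (1 - p) / n)


sample_sizes = [35, 75, 100, 500, 1000]  # illustrative values
proportions = [0.05, 0.20, 0.50]         # illustrative values

# Header row: one column per proportion.
print("n".rjust(6) + "".join(f"{p:>8.0%}" for p in proportions))
for n in sample_sizes:
    row = "".join(f"{100 * margin_of_error(p, n):>8.1f}" for p in proportions)
    print(f"{n:>6}{row}")
```

Reading across a row, the margin grows as p approaches 50%; reading down a column, it shrinks as n grows, but only in proportion to the square root of n.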


Confidence intervals

• In a survey of 100 respondents, 20% say yes. What is the confidence interval for a 95% confidence level?

• In a survey of 250 respondents, 10% say yes. What is the confidence interval for a 95% confidence level? What if 50% said yes?


• In a survey of 100 respondents, 20% say yes. What is the confidence interval for a 95% confidence level?

• Margin of error (2 SE) is 8 percentage points.

• 95% CI=(12%, 28%)


• In a survey of 250 respondents, 10% say yes. What is the confidence interval for a 95% confidence level? What if 50% said yes?

• Margin of error (2 SE) is about 3.8 percentage points.

• 95% CI is about (6.2%, 13.8%)

• If 50% said yes, the margin of error is about 6.3 percentage points and the 95% CI is about (43.7%, 56.3%)
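The exercise answers can be checked with a short script. This is a minimal sketch using the same "plus or minus 2 SE" approximation as the slides; `approx_95_ci` is a hypothetical helper name.

```python
import math


def approx_95_ci(p, n):
    """Approximate 95% CI for a proportion: p plus or minus 2 * sqrt(p(1-p)/n)."""
    se = math.sqrt(p * (1 - p) / n)
    return p - 2 * se, p + 2 * se


# (sample size, observed proportion) for the three exercise cases.
for n, p in [(100, 0.20), (250, 0.10), (250, 0.50)]:
    low, high = approx_95_ci(p, n)
    print(f"n={n}, p={p:.0%}: approx. 95% CI = ({low:.1%}, {high:.1%})")
```

For n = 250 and p = 50%, SE = sqrt(0.25/250) is about 0.0316, so the interval is centered on 50% and runs from roughly 43.7% to 56.3%.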


Sampling error and sampling strategy

• SRS: sampling error is approximated by the standard error

• Systematic sampling

– If not stratified, sampling error is the same as in SRS.

– If stratified, errors are lower than those associated with an SRS of the same size for variables that differ (on average) by stratum, if rates of selection are constant across strata.
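The gain from stratification can be illustrated with a simulation. This is a sketch under stated assumptions, not the lecture's own example: the population is hypothetical, with two equal-sized strata whose means differ, and the stratified design draws at the same rate in each stratum. All names are made up for illustration.

```python
import random
import statistics

rng = random.Random(1)

# Hypothetical population: two equal-sized strata with different means.
stratum_a = [rng.gauss(40, 5) for _ in range(5000)]
stratum_b = [rng.gauss(60, 5) for _ in range(5000)]
population = stratum_a + stratum_b


def srs_mean(n):
    """Mean of a simple random sample of size n from the whole population."""
    return statistics.fmean(rng.sample(population, n))


def stratified_mean(n):
    """Mean of a stratified sample with the same selection rate per stratum.

    Because the strata are equal in size, half the sample comes from each,
    and the pooled mean needs no weighting.
    """
    half = n // 2
    sample = rng.sample(stratum_a, half) + rng.sample(stratum_b, half)
    return statistics.fmean(sample)


def empirical_se(estimator, n, draws=1000):
    """Standard deviation of the estimator over repeated samples."""
    return statistics.pstdev([estimator(n) for _ in range(draws)])


se_srs = empirical_se(srs_mean, 100)
se_strat = empirical_se(stratified_mean, 100)
print(f"empirical SE, SRS:        {se_srs:.3f}")
print(f"empirical SE, stratified: {se_strat:.3f}")
```

In this setup the stratified estimate should vary noticeably less than the SRS estimate, because the between-stratum difference no longer contributes to the sampling error.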


Sampling error and sampling strategy (2)

• Unequal rates of selection decrease sampling error for oversampled groups.

• Unequal rates of selection will generally produce sampling errors for the whole sample that are higher than those associated with an SRS of the same size for variables that differ by stratum.


Sampling error and sampling strategy (3)

• But when oversampling occurs in strata that have higher than average variances, overall sampling errors will be lower than for a sample of the same size with equal probabilities of selection. (see Fowler text p.32 for more extensive discussion)


Sampling error and sampling strategy (4)

• Clusters will produce sampling errors that are higher than those of an SRS of the same size for variables that are more homogeneous within clusters than in the population as a whole.
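One common way to quantify this effect, not given in the lecture itself, is the design effect approximation for equal-sized clusters, deff = 1 + (m - 1) * rho, where m is the number of respondents per cluster and rho is the intraclass correlation (how similar respondents within a cluster are). The sketch below assumes that approximation; the function names and example values are hypothetical.

```python
import math


def design_effect(cluster_size, icc):
    """Approximate design effect for equal-sized clusters: 1 + (m - 1) * rho."""
    return 1 + (cluster_size - 1) * icc


def clustered_se_proportion(p, n, cluster_size, icc):
    """SRS standard error of a proportion, inflated by sqrt(design effect)."""
    srs_se = math.sqrt(p * (1 - p) / n)
    return srs_se * math.sqrt(design_effect(cluster_size, icc))


# Hypothetical design: 1,000 respondents, 20 per cluster, observed p = 0.30.
for icc in (0.0, 0.02, 0.05, 0.10):
    se = clustered_se_proportion(0.30, 1000, 20, icc)
    print(f"rho={icc:.2f}  deff={design_effect(20, icc):.2f}  SE={se:.4f}")
```

When rho is 0 (clusters are no more homogeneous than the population), the clustered standard error equals the SRS value; as within-cluster homogeneity rises, so does the sampling error.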