EMR 6500: Survey Research Dr. Chris L. S. Coryn Kristin A. Hobson Spring 2013.
EMR 6500: Survey Research
Dr. Chris L. S. Coryn
Kristin A. Hobson
Spring 2013
Agenda
• Systematic sampling
• Cluster sampling for means and totals
Systematic Sampling
Systematic Sampling
• Systematic sampling simplifies the sample selection process compared to both simple random sampling and stratified random sampling
• In systematic sampling an interval (k) is used to select sample elements
• The starting point is (should be) selected randomly
Systematic Sampling
• Systematic sampling is a useful alternative to simple random sampling because:
1. It is easier to perform in the field and less subject to selection errors, especially if a good frame is not available
2. It can provide greater information per unit cost than simple random samples for populations with certain patterns in the arrangement of elements
1-in-k Systematic Sampling
• Divide the population size N by the desired sample size n
• Let k = N/n
• k must be equal to or less than N/n (i.e., k ≤ N/n)
– If N = 15,000 and n = 100, then k ≤ 150
1-in-k Systematic Sampling
• If N were 1,000 and n were 100
• k would equal 1,000/100 = 10
• If k = 10, the start value would range from 1 to 10, and all selections thereafter would be every 10th entry on the sampling frame
– If the start value were 8, then the next selection would be 18, followed by 28, and so forth
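The selection rule above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original deck: the function names are hypothetical, and k is taken as the largest whole number satisfying k ≤ N/n.

```python
import random

def systematic_positions(N, k, start):
    """Return 1-based frame positions start, start + k, start + 2k, ... up to N."""
    return list(range(start, N + 1, k))

def draw_systematic_sample(N, n, rng=None):
    """Draw a 1-in-k systematic sample with a random start (hypothetical helper)."""
    rng = rng or random.Random()
    k = N // n                 # largest whole interval with k <= N/n
    start = rng.randint(1, k)  # random start between 1 and k, inclusive
    return k, start, systematic_positions(N, k, start)

# Slide example: N = 1,000 and n = 100 give k = 10
k, start, positions = draw_systematic_sample(1000, 100, random.Random(2013))
print(k, start, positions[:3])

# A start value of 8 selects entries 8, 18, 28, and so forth
print(systematic_positions(1000, 10, 8)[:3])
```

When N is not an exact multiple of n, this rule can return slightly more than n positions, depending on the start value.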
Random Population Elements
Ordered Population Elements
Periodic Population Elements
Estimation of a Population Mean and Total
Estimation of a Population Mean
*Note: This formula assumes a randomly ordered population
Estimation of a Population Total
*Note: This formula assumes a randomly ordered population
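The formulas for these two slides did not survive extraction. The sketch below assumes the standard textbook estimators for a randomly ordered population: the sample mean ȳ with estimated variance (1 − n/N)s²/n, and the total Nȳ with variance N² times that of the mean. The function names and the sample data are hypothetical.

```python
import math
from statistics import mean, variance

def systematic_mean_estimate(y, N):
    """Estimate a population mean from a systematic sample.

    Assumes a randomly ordered population, so the simple random
    sampling variance formula is used as an approximation.
    """
    n = len(y)
    y_bar = mean(y)
    var_hat = (1 - n / N) * variance(y) / n  # finite population correction applied
    bound = 2 * math.sqrt(var_hat)           # approximate bound on the error of estimation
    return y_bar, var_hat, bound

def systematic_total_estimate(y, N):
    """Estimate a population total as N times the estimated mean."""
    y_bar, var_hat, _ = systematic_mean_estimate(y, N)
    tau_hat = N * y_bar
    var_tau = N ** 2 * var_hat
    return tau_hat, var_tau, 2 * math.sqrt(var_tau)

# Hypothetical data: four sampled values from a frame of N = 40 elements
print(systematic_mean_estimate([2, 4, 6, 8], 40))
print(systematic_total_estimate([2, 4, 6, 8], 40))
```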
Estimation of a Population Proportion
Estimation of a Population Proportion
*Note: This formula assumes a randomly ordered population
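The proportion formula is also missing from the transcript. As a hedged sketch, assuming the usual estimator for a randomly ordered population (p̂ is the sample proportion, with estimated variance (1 − n/N)p̂q̂/(n − 1)), the calculation could look like this. The function name and the data are hypothetical.

```python
import math

def systematic_proportion_estimate(y, N):
    """Estimate a population proportion from 0/1 responses in a systematic sample.

    Assumes a randomly ordered population.
    """
    n = len(y)
    p_hat = sum(y) / n
    q_hat = 1 - p_hat
    var_hat = (1 - n / N) * p_hat * q_hat / (n - 1)
    bound = 2 * math.sqrt(var_hat)  # approximate bound on the error of estimation
    return p_hat, var_hat, bound

# Hypothetical data: 7 of 10 sampled elements have the attribute, N = 100
print(systematic_proportion_estimate([1, 0, 1, 1, 0, 1, 0, 1, 1, 1], 100))
```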
Selecting the Sample Size
Sample Size for Estimating a Population Mean
Sample Size for Estimating a Population Proportion
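The sample size formulas for these slides are not in the transcript. The sketch below assumes the standard forms for a randomly ordered population, n = Nσ²/((N − 1)D + σ²) for a mean and n = Npq/((N − 1)D + pq) for a proportion, with D = B²/4 and B the bound on the error of estimation. The function names and input values are hypothetical.

```python
import math

def sample_size_mean(N, sigma2, B):
    """Sample size needed to estimate a mean within bound B (sigma2 is a prior variance guess)."""
    D = B ** 2 / 4
    return math.ceil(N * sigma2 / ((N - 1) * D + sigma2))

def sample_size_proportion(N, p, B):
    """Sample size needed to estimate a proportion within bound B (p is a prior guess)."""
    D = B ** 2 / 4
    pq = p * (1 - p)
    return math.ceil(N * pq / ((N - 1) * D + pq))

# Hypothetical inputs
print(sample_size_mean(N=1000, sigma2=100, B=2))
print(sample_size_proportion(N=1000, p=0.5, B=0.05))
```

Using p = 0.5 when no prior estimate exists gives the most conservative (largest) sample size.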
Variance Estimation for Ordered and Periodic Distributions
Variance Estimates
• Repeated systematic sampling
– Divides a systematic sample into smaller systematic samples to approximate a random population
– Multiple 1-in-k systematic samples
• Successive difference method
– A sample of size n yields n − 1 successive differences that are used to estimate variance
– Best choice when population elements are not randomly ordered
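Both approaches can be sketched briefly. The formulas here are assumptions drawn from standard treatments, since the slides' own formulas are missing. For repeated systematic sampling, the variance of the mean of the n_s sample means is (1 − n/N)s²/n_s, where s² is the variance among those means. For successive differences, it is (1 − n/N)(1/n)Σd_i²/(2(n − 1)), with d_i = y_(i+1) − y_i. The function names and data are hypothetical.

```python
def repeated_systematic_variance(sample_means, n_total, N):
    """Variance estimate from n_s repeated systematic samples.

    sample_means: the mean of each small systematic sample
    n_total: total number of elements across all samples
    """
    n_s = len(sample_means)
    grand_mean = sum(sample_means) / n_s
    s2 = sum((m - grand_mean) ** 2 for m in sample_means) / (n_s - 1)
    return grand_mean, (1 - n_total / N) * s2 / n_s

def successive_difference_variance(y, N):
    """Variance estimate of the sample mean from the n - 1 successive differences.

    y must be in the order in which elements were selected.
    """
    n = len(y)
    diffs = [y[i + 1] - y[i] for i in range(n - 1)]
    return (1 - n / N) * sum(d ** 2 for d in diffs) / (2 * n * (n - 1))

# Hypothetical data
print(repeated_systematic_variance([10, 12, 14], n_total=30, N=300))
print(successive_difference_variance([1, 2, 3, 4, 5], N=50))
```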
Cluster Sampling
Cluster Sampling
• Cluster sampling is a probability sampling method in which each sampling unit is a collection, or cluster, of elements
• Clusters can consist of almost any imaginable natural (and artificial) grouping of elements
Cluster Sampling
• Cluster sampling is an effective sampling design if:
1. A good sampling frame listing population elements is not available or is very costly to obtain, but a frame listing clusters is easily obtained
2. The cost of obtaining observations increases as the distance separating elements increases
Cluster Sampling
• Unlike stratified random sampling, in which elements should be similar within each stratum and strata should differ from one another, in cluster sampling elements should be heterogeneous within each cluster and clusters should be similar to one another
Stratified sampling: Take a simple random sample from every stratum
Cluster sampling: Take a simple random sample of clusters; observe all elements within clusters in the sample
Stratified sampling: Each element of the population is in exactly one stratum
Cluster sampling: Each element of the population is in exactly one cluster
Stratified sampling: Variance of the estimate depends on the variability within strata
Cluster sampling: Variance of the estimate depends primarily on the variability between clusters
Stratified sampling: For greatest precision, individual elements within each stratum should have similar values, but stratum means should differ from each other as much as possible
Cluster sampling: For greatest precision, individual elements within each cluster should be heterogeneous, and cluster means should be similar to one another
Cluster Sampling Notation
Estimation of a Population Mean and Total
Estimation of a Population Mean
*Note: ȳ takes the form of a ratio estimator, with m_i taking the place of x_i
*Note: M̄ can be estimated by m̄ if M is unknown
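The estimator itself was lost in extraction. As a hedged reconstruction, assuming the deck uses the standard ratio-type estimator for cluster sampling (N clusters in the population, n clusters sampled, m_i elements and total y_i in cluster i, and M̄ = M/N the average cluster size), the formulas would read:

```latex
\bar{y} = \frac{\sum_{i=1}^{n} y_i}{\sum_{i=1}^{n} m_i},
\qquad
\hat{V}(\bar{y}) = \left(1 - \frac{n}{N}\right)\frac{s_r^2}{n\,\bar{M}^2},
\qquad
s_r^2 = \frac{\sum_{i=1}^{n}\left(y_i - \bar{y}\,m_i\right)^2}{n - 1}
```

Here ȳ is a ratio of two sample totals, which is why it behaves like a ratio estimator, and m̄ = Σm_i/n can stand in for M̄ when M is unknown.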
Example for a Population Mean
Cluster   Number of residents, m_i   Total income per cluster, y_i
1         8                          $96,000
2         12                         $121,000
3         4                          $42,000
4         5                          $65,000
5         6                          $52,000
6         6                          $40,000
7         7                          $75,000
8         5                          $65,000
9         8                          $45,000
10        3                          $50,000
11        2                          $85,000
12        6                          $43,000
13        5                          $54,000
14        10                         $49,000
15        9                          $53,000
16        3                          $50,000
17        6                          $32,000
18        5                          $22,000
19        5                          $45,000
20        4                          $37,000
21        6                          $51,000
22        8                          $30,000
23        7                          $39,000
24        3                          $47,000
25        8                          $41,000
Example for a Population Mean
                         n    Mean      Median    SD
Residents (m_i)          25   6.040     6.000     2.371
Income (y_i)             25   $53,160   $49,000   $21,784
Residual (y_i − ȳ m_i)   25   0         993       25,189
Example for a Population Mean
*Note: Because M is not known, M̄ is estimated by m̄ = 151/25 = 6.04
Example for a Population Mean
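The worked solution on these slides is missing, so here is a minimal Python sketch of the calculation from the cluster table. It assumes the standard ratio-type estimator ȳ = Σy_i/Σm_i with variance (1 − n/N)s_r²/(nM̄²), and M̄ estimated by m̄. The number of clusters in the population does not appear in the surviving text, so N = 415 is an assumed value used only for illustration. The function and variable names are hypothetical.

```python
import math

# Cluster sizes m_i and cluster income totals y_i for the 25 sampled clusters
residents = [8, 12, 4, 5, 6, 6, 7, 5, 8, 3, 2, 6, 5,
             10, 9, 3, 6, 5, 5, 4, 6, 8, 7, 3, 8]
income = [96000, 121000, 42000, 65000, 52000, 40000, 75000, 65000, 45000,
          50000, 85000, 43000, 54000, 49000, 53000, 50000, 32000, 22000,
          45000, 37000, 51000, 30000, 39000, 47000, 41000]

def cluster_mean_estimate(m, y, N, M_bar=None):
    """Ratio-type estimate of the mean per element under cluster sampling."""
    n = len(m)
    y_bar = sum(y) / sum(m)
    if M_bar is None:
        M_bar = sum(m) / n  # estimate the average cluster size by m-bar when M is unknown
    s_r2 = sum((yi - y_bar * mi) ** 2 for mi, yi in zip(m, y)) / (n - 1)
    var_hat = (1 - n / N) * s_r2 / (n * M_bar ** 2)
    return y_bar, s_r2, var_hat, 2 * math.sqrt(var_hat)

N_ASSUMED = 415  # assumption: number of clusters in the population (not in the transcript)
y_bar, s_r2, var_hat, bound = cluster_mean_estimate(residents, income, N_ASSUMED)
print(round(y_bar, 2), round(math.sqrt(s_r2)), round(bound))
```

With these data the estimated per-capita income is about $8,801, and the square root of s_r² reproduces the 25,189 reported in the summary table. The bound depends on the assumed N.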
Estimation of a Population Total
Estimation of a Population Total
Estimation of a Population Total that Does not Depend on M
Example of Estimation of a Population Total that Does not Depend on M
Example of Estimation of a Population Total that Does not Depend on M
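The formulas and worked numbers for the total are also missing. The sketch below assumes the two standard estimators. When M is known, the total is Mȳ with variance N²(1 − n/N)s_r²/n. When M is not used, the total is N times the average cluster total, with variance N²(1 − n/N)s_t²/n, where s_t² is the sample variance of the cluster totals. M = 2,500 is the value quoted later in the deck. N = 415 is an assumption, since the number of clusters is not in the transcript. The function names are hypothetical.

```python
import math
from statistics import variance

residents = [8, 12, 4, 5, 6, 6, 7, 5, 8, 3, 2, 6, 5,
             10, 9, 3, 6, 5, 5, 4, 6, 8, 7, 3, 8]
income = [96000, 121000, 42000, 65000, 52000, 40000, 75000, 65000, 45000,
          50000, 85000, 43000, 54000, 49000, 53000, 50000, 32000, 22000,
          45000, 37000, 51000, 30000, 39000, 47000, 41000]

def total_with_known_M(m, y, N, M):
    """Estimate the population total as M times the ratio-type mean."""
    n = len(m)
    y_bar = sum(y) / sum(m)
    s_r2 = sum((yi - y_bar * mi) ** 2 for mi, yi in zip(m, y)) / (n - 1)
    tau_hat = M * y_bar
    var_hat = N ** 2 * (1 - n / N) * s_r2 / n
    return tau_hat, 2 * math.sqrt(var_hat)

def total_without_M(y, N):
    """Estimate the population total as N times the mean cluster total."""
    n = len(y)
    y_t_bar = sum(y) / n
    var_hat = N ** 2 * (1 - n / N) * variance(y) / n
    return N * y_t_bar, 2 * math.sqrt(var_hat)

N_ASSUMED = 415  # assumption: number of clusters in the population
print(total_with_known_M(residents, income, N_ASSUMED, M=2500))
print(total_without_M(income, N_ASSUMED))
```

The second estimator needs only the cluster totals, which is why it does not depend on M.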
Equal Cluster Sizes
Equal Cluster Sizes for Estimating a Population Mean
• All m_i values are equal to a common, or constant, value m
• In this case, M = Nm, and the total sample size is nm elements (n clusters of m elements each)
• When cluster sizes are equal, m_1 = m_2 = ... = m_N
• Variance components analysis simplifies estimating the variance using ANOVA methods
Equal Cluster Sizes for Estimating a Population Mean
ANOVA Method
Cluster   Number of Newspapers   Total
1         1 2 1 3 3 2 1 4 1 1    19
2         1 3 2 2 3 1 4 1 1 2    20
3         2 1 1 1 1 3 2 1 3 1    16
4         1 1 3 2 1 5 1 2 3 1    20
• There are 4,000 households (elements)
• There are 400 geographical regions (clusters)
• There are 10 households in each region
ANOVA Method
ANOVA Method
Source   df   SS      MS
Factor   3    1.07    0.36
Error    36   43.30   1.20
Total    39   44.38
*Note: 'Factor' denotes between-cluster variation and 'Error' denotes within-cluster variation
ANOVA Method
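The ANOVA table can be reproduced from the newspaper data with a short script. This is a sketch under the assumption that, for equal cluster sizes, the deck uses the standard result: ȳ is the overall sample mean, with estimated variance (1 − n/N)·MSB/(nm), where MSB is the between-cluster mean square. Variable names are illustrative.

```python
import math

# Newspapers per household: 4 sampled clusters (regions) of m = 10 households each
data = [
    [1, 2, 1, 3, 3, 2, 1, 4, 1, 1],
    [1, 3, 2, 2, 3, 1, 4, 1, 1, 2],
    [2, 1, 1, 1, 1, 3, 2, 1, 3, 1],
    [1, 1, 3, 2, 1, 5, 1, 2, 3, 1],
]
N = 400              # clusters in the population (4,000 households / 10 per region)
n = len(data)        # clusters in the sample
m = len(data[0])     # common cluster size

grand_mean = sum(sum(row) for row in data) / (n * m)
cluster_means = [sum(row) / m for row in data]

# Between-cluster ('Factor') and within-cluster ('Error') sums of squares
ssb = m * sum((cm - grand_mean) ** 2 for cm in cluster_means)
ssw = sum((y - cm) ** 2 for row, cm in zip(data, cluster_means) for y in row)
msb = ssb / (n - 1)
msw = ssw / (n * (m - 1))

var_hat = (1 - n / N) * msb / (n * m)
bound = 2 * math.sqrt(var_hat)
print(grand_mean, round(ssb, 2), round(ssw, 2), round(msb, 2), round(msw, 2))
print(round(var_hat, 4), round(bound, 2))
```

The sums of squares and mean squares agree with the table above to rounding, and the estimated mean is 1.875 newspapers per household.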
Selecting the Sample Size for Estimating Population Means and Totals
Sample Size for Estimating Population Means
Where σ_r² is estimated by s_r²
Example of Sample Size for Estimating Population Means
• How large a sample should be taken to estimate the average per-capita income with a bound on the error of estimation of B = $500?
Example of Sample Size for Estimating Population Means
*Note: Because M is not known, M̄ is estimated by m̄ = 151/25 = 6.04
Example of Sample Size for Estimating Population Means
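The worked solution is not in the transcript. As a sketch, assuming the standard formula n = Nσ_r²/(ND + σ_r²) with D = B²M̄²/4, the calculation could be coded as follows. The values s_r = 25,189 and m̄ = 6.04 come from the earlier summary statistics. N = 415 is an assumed number of clusters, since that figure is missing from the surviving text. The function name is hypothetical.

```python
import math

def cluster_sample_size_mean(N, s_r2, B, M_bar):
    """Number of clusters needed to estimate a mean within bound B."""
    D = (B ** 2) * (M_bar ** 2) / 4
    n = N * s_r2 / (N * D + s_r2)
    return n, math.ceil(n)  # always round up to the next whole cluster

N_ASSUMED = 415  # assumption: number of clusters in the population
n_exact, n_needed = cluster_sample_size_mean(N_ASSUMED, s_r2=25189 ** 2, B=500, M_bar=6.04)
print(round(n_exact, 2), n_needed)
```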
Sample Size for Estimating Population Totals When M is Known
Where σ_r² is estimated by s_r²
Example of Sample Size for Estimating Population Totals When M is Known
• How large a sample should be taken to estimate the total income of all residents with a bound on the error of estimation of B = $1,000,000? (M = 2,500)
Sample Size for Estimating Population Totals When M is Known
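A minimal sketch of this calculation, assuming the standard formula n = Nσ_r²/(ND + σ_r²) with D = B²/(4N²) for the estimator Mȳ. The value s_r = 25,189 comes from the earlier summary statistics. N = 415 is an assumed number of clusters that is not in the surviving text. The function name is hypothetical.

```python
import math

def cluster_sample_size_total_known_M(N, s_r2, B):
    """Number of clusters needed to estimate a total within bound B using M * y-bar."""
    D = B ** 2 / (4 * N ** 2)
    n = N * s_r2 / (N * D + s_r2)
    return n, math.ceil(n)

N_ASSUMED = 415  # assumption: number of clusters in the population
n_exact, n_needed = cluster_sample_size_total_known_M(N_ASSUMED, s_r2=25189 ** 2, B=1_000_000)
print(round(n_exact, 2), n_needed)
```

M does not appear in the function because it cancels out of D under this formula.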
Sample Size for Estimating Population Totals When M is Unknown
Where σ_t² is estimated by s_t²
Example of Sample Size for Estimating Population Totals When M is Unknown
• How large a sample should be taken to estimate the total income of all residents with a bound on the error of estimation of B = $1,000,000? (M = 2,500)
Sample Size for Estimating Population Totals When M is Unknown
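A matching sketch for the case where M is not used, assuming the standard formula n = Nσ_t²/(ND + σ_t²) with D = B²/(4N²), where σ_t² is the variance of the cluster totals. The value s_t = $21,784 is the SD of cluster income from the earlier summary statistics. N = 415 is an assumed number of clusters that is not in the surviving text. The function name is hypothetical.

```python
import math

def cluster_sample_size_total_unknown_M(N, s_t2, B):
    """Number of clusters needed to estimate a total within bound B using N * mean cluster total."""
    D = B ** 2 / (4 * N ** 2)
    n = N * s_t2 / (N * D + s_t2)
    return n, math.ceil(n)

N_ASSUMED = 415  # assumption: number of clusters in the population
n_exact, n_needed = cluster_sample_size_total_unknown_M(N_ASSUMED, s_t2=21784 ** 2, B=1_000_000)
print(round(n_exact, 2), n_needed)
```

This estimator works from the cluster totals alone, so the calculation never uses M.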