Variance Estimation in Complex Surveys
Third International Conference on Establishment Surveys
Montreal, Quebec
June 18-21, 2007
Presented by:
Kirk Wolter, NORC and the University of Chicago
Outline of Lecture
- Introduction (Chapter 1)
- Textbook Methods (Chapter 1)
- Replication-Based Methods
  - Random Group (Chapter 2)
  - Balanced Half-Samples (Chapter 3)
  - Jackknife (Chapter 4)
  - Bootstrap (Chapter 5)
- Taylor Series (Chapter 6)
- Generalized Variance Functions (Chapter 7)
Chapter 1: Introduction
Notation and Basic Definitions
1. Finite population, $U = \{1, \ldots, N\}$
  - Residents of Canada
  - Restaurants in Montreal
  - Farms in Quebec
  - Schools in Ottawa
2. Sample, $s$
  - Simple random sampling, without replacement
  - Systematic sampling
  - Stratification
  - Clustering
  - Double sampling
Chapter 1: Introduction
5. Probability sampling design, $P(s)$
  - $P(s) \ge 0$
  - $\sum_{s} P(s) = 1$
8. Characteristic of interest, $Y_i$
  - $Y_i$ = yield in tons of the $i$-th farm
  - $Y_i = 1$ if the $i$-th resident is employed; $Y_i = 0$ if not employed
Chapter 1: Introduction
12. Parameter, $\theta$
  - Proportion of residents who are employed
  - Total production of farms
  - Trend in price index for restaurants
  - Regression of sales on area for pharmacies
13. Estimator, $\hat{\theta}$
  - $\hat{\theta} = \theta(s, Y)$
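Chapter 1: Introduction
14. Expectation and variance
  - $E\{\hat{\theta}\} = \sum_{s} P(s)\, \theta(s, Y)$
  - $\mathrm{Var}\{\hat{\theta}\} = E\{(\hat{\theta} - E\{\hat{\theta}\})^2\} = \sum_{s} P(s)\, [\theta(s, Y) - E\{\hat{\theta}\}]^2$
16. Estimator of variance, $v(\hat{\theta})$
  - $E\{v(\hat{\theta})\} = \mathrm{Var}\{\hat{\theta}\}$
  - $v(\hat{\theta}) / \mathrm{Var}\{\hat{\theta}\} - 1 \xrightarrow{P} 0$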
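Textbook Methods
1. Design: srs wor of size $n$
Estimator: $\hat{Y} = f^{-1} \sum_{i=1}^{n} y_i$, where $f = n/N$
Variance Estimator: $v(\hat{Y}) = N^2 (1 - f)\, s^2 / n$, where $s^2 = \sum_{i=1}^{n} (y_i - \bar{y})^2 / (n - 1)$ and $\bar{y} = \sum_{i=1}^{n} y_i / n$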
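The definitions of expectation, variance, and unbiased variance estimation can be checked by brute force on a tiny population. The sketch below enumerates every srs wor sample, computes the expansion estimator and its textbook variance estimator for each, and averages over the design. The population values and the function names are illustrative assumptions, not part of the lecture.

```python
from fractions import Fraction
from itertools import combinations


def srs_wor_total(sample, N):
    """Expansion estimator of the total and its textbook variance estimator."""
    n = len(sample)
    f = Fraction(n, N)                          # sampling fraction f = n/N
    y_bar = Fraction(sum(sample), n)
    s2 = sum((y - y_bar) ** 2 for y in sample) / (n - 1)
    y_hat = sum(sample) / f                     # Y-hat = f^{-1} * sum of y_i
    v = N ** 2 * (1 - f) * s2 / n               # v(Y-hat) = N^2 (1 - f) s^2 / n
    return y_hat, v


def design_moments(population, n):
    """Enumerate all srs wor samples; return E{Y-hat}, Var{Y-hat}, E{v(Y-hat)}."""
    N = len(population)
    samples = list(combinations(population, n))
    p = Fraction(1, len(samples))               # P(s) is equal for every sample
    results = [srs_wor_total(s, N) for s in samples]
    expectation = sum(p * y_hat for y_hat, _ in results)
    variance = sum(p * (y_hat - expectation) ** 2 for y_hat, _ in results)
    expected_v = sum(p * v for _, v in results)
    return expectation, variance, expected_v


# Hypothetical population of N = 4 values, samples of size n = 2.
expectation, variance, expected_v = design_moments([3, 5, 8, 12], 2)
```

For this made-up population the total is 28, and both the design variance of the estimator and the design expectation of the variance estimator come out to 184/3, which is the unbiasedness property in action.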
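Textbook Methods
2. Design: srs wor at both the first and second stages of sampling
Estimator: $\hat{Y} = f_1^{-1} \sum_{i=1}^{n} f_{2i}^{-1} \sum_{j=1}^{m_i} y_{ij}$, where $f_1 = n/N$ and $f_{2i} = m_i / M_i$
Variance Estimator: $v(\hat{Y}) = N^2 (1 - f_1) \frac{1}{n} \sum_{i=1}^{n} (M_i \bar{y}_{i.} - \hat{Y}/N)^2 / (n - 1) + \frac{N}{n} \sum_{i=1}^{n} M_i^2 (1 - f_{2i})\, s_{2i}^2 / m_i$, where $\bar{y}_{i.} = \sum_{j=1}^{m_i} y_{ij} / m_i$ and $s_{2i}^2 = \sum_{j=1}^{m_i} (y_{ij} - \bar{y}_{i.})^2 / (m_i - 1)$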
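A minimal sketch of the two-stage estimator and its variance estimator follows, assuming at least two sampled PSUs and at least two sampled units per PSU. The function name, the input layout, and the example numbers are hypothetical.

```python
def two_stage_total(psus, N):
    """psus: list of (M_i, [y_i1, ..., y_im_i]) for the n sampled PSUs.
    N: number of PSUs in the population."""
    n = len(psus)
    f1 = n / N                                  # first-stage sampling fraction
    expanded = []                               # M_i * ybar_i. for each PSU
    within = 0.0                                # sum of M_i^2 (1 - f_2i) s_2i^2 / m_i
    for M_i, ys in psus:
        m_i = len(ys)
        f2 = m_i / M_i                          # second-stage sampling fraction
        y_bar = sum(ys) / m_i
        s2 = sum((y - y_bar) ** 2 for y in ys) / (m_i - 1)
        expanded.append(M_i * y_bar)
        within += M_i ** 2 * (1 - f2) * s2 / m_i
    y_hat = (N / n) * sum(expanded)
    between = sum((t - y_hat / N) ** 2 for t in expanded) / (n - 1)
    v = N ** 2 * (1 - f1) * between / n + (N / n) * within
    return y_hat, v


# Hypothetical data: 2 of N = 4 PSUs sampled, half of the units in each PSU.
y_hat, v = two_stage_total([(4, [1, 3]), (6, [2, 4, 6])], N=4)
```

With these numbers the estimated total is 64 and the variance estimate is 576, of which 512 comes from the between-PSU term and 64 from the within-PSU term.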
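Replication-Based Methods
$v(\hat{\theta}) = C \sum_{\alpha=1}^{k} (\hat{\theta}_{\alpha} - \hat{\theta})^2$
where $\hat{\theta}_{\alpha}$ is the estimate from the $\alpha$-th replicate and $C$ is a constant that depends on the method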
Chapter 2: The Method of Random Groups
- Interpenetrating samples
- Replicated samples
- Ultimate cluster
- Resampling
- Random groups
Chapter 2: The Method of Random Groups
The Case of Independent Random Groups
(i) Draw a sample, $s_1$. No restrictions on the sampling methodology.
(ii) Replace the first sample. Draw a second sample, $s_2$. Use the same sampling methodology.
(iii) Repeat until $k \ge 2$ samples are obtained, $s_1, s_2, \ldots, s_k$.
Chapter 2: The Method of Random Groups
Common estimation procedure:
- Editing procedures
- Adjustments for nonresponse
- Outlier procedures
- Estimator of parameter
Chapter 2: The Method of Random Groups
Common measurement process:
- Field work
- Callbacks
- Clerical screening and coding
- Conversion to machine-readable form
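Chapter 2: The Method of Random Groups
Estimators of the Parameter of Interest, $\theta$
Random group estimators: $\hat{\theta}_1, \hat{\theta}_2, \ldots, \hat{\theta}_k$
Overall estimators: $\hat{\bar{\theta}} = \frac{1}{k} \sum_{\alpha=1}^{k} \hat{\theta}_{\alpha}$, or $\hat{\theta}$ computed from the combined sample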
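Chapter 2: The Method of Random Groups
Two Examples:
Population total
$Y = \sum_{i=1}^{N} Y_i$
$\hat{Y}_{\alpha} = \sum_{i \in s_{\alpha}} W_{\alpha i} Y_i$
$\hat{\bar{Y}} = \frac{1}{k} \sum_{\alpha=1}^{k} \hat{Y}_{\alpha}$
$\hat{Y} = \sum_{i \in s} W_i Y_i$
Ratio
$R = Y / X$
$\hat{R}_{\alpha} = \hat{Y}_{\alpha} / \hat{X}_{\alpha}$
$\hat{\bar{R}} = \frac{1}{k} \sum_{\alpha=1}^{k} \hat{Y}_{\alpha} / \hat{X}_{\alpha}$
$\hat{R} = \hat{Y} / \hat{X}$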
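Chapter 2: The Method of Random Groups
Estimators of $\mathrm{Var}\{\hat{\bar{\theta}}\}$ or $\mathrm{Var}\{\hat{\theta}\}$:
$v(\hat{\bar{\theta}}) = \sum_{\alpha=1}^{k} (\hat{\theta}_{\alpha} - \hat{\bar{\theta}})^2 / [k(k-1)]$
$v_1(\hat{\theta}) = v(\hat{\bar{\theta}})$
$v_2(\hat{\theta}) = \sum_{\alpha=1}^{k} (\hat{\theta}_{\alpha} - \hat{\theta})^2 / [k(k-1)]$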
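Chapter 2: The Method of Random Groups
Properties:
$E\{v(\hat{\bar{\theta}})\} = \mathrm{Var}\{\hat{\bar{\theta}}\}$
$\mathrm{CV}\{v(\hat{\bar{\theta}})\} = [\mathrm{Var}\{v(\hat{\bar{\theta}})\}]^{1/2} / \mathrm{Var}\{\hat{\bar{\theta}}\} = \{[\beta_4(\hat{\theta}_1) - (k-3)/(k-1)] / k\}^{1/2}$, where $\beta_4(\hat{\theta}_1)$ is the kurtosis of $\hat{\theta}_1$
$v_1(\hat{\theta}) \le v_2(\hat{\theta})$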
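The two random group variance estimators can be sketched in a few lines. The example applies them to a ratio, using made-up weighted totals from k = 4 random groups; the function name and the data are assumptions for illustration only.

```python
def random_group_variances(theta_groups, theta_full):
    """Return (theta_bar, v1, v2) from k random group estimates.

    v1 centers the squared deviations at the mean of the group estimates,
    v2 centers them at the full-sample estimate theta_full."""
    k = len(theta_groups)
    theta_bar = sum(theta_groups) / k
    v1 = sum((t - theta_bar) ** 2 for t in theta_groups) / (k * (k - 1))
    v2 = sum((t - theta_full) ** 2 for t in theta_groups) / (k * (k - 1))
    return theta_bar, v1, v2


# Hypothetical weighted totals (Y-hat_alpha, X-hat_alpha) from k = 4 groups.
totals = [(10.0, 5.0), (12.0, 4.0), (9.0, 3.0), (14.0, 7.0)]
r_groups = [y / x for y, x in totals]                   # R-hat_alpha
r_full = sum(y for y, _ in totals) / sum(x for _, x in totals)
r_bar, v1, v2 = random_group_variances(r_groups, r_full)
```

Because the sum of squares about the full-sample estimate equals the sum of squares about the mean plus k times the squared gap between the two overall estimators, v2 exceeds v1 by exactly that gap squared divided by k - 1, which is why v1 can never be the larger of the two.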
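Chapter 2: The Method of Random Groups
Confidence Intervals:
$\left( \hat{\theta} - c \sqrt{v(\hat{\theta})},\ \hat{\theta} + c \sqrt{v(\hat{\theta})} \right)$
$c = z_{\alpha/2}$ or $t_{k-1,\,\alpha/2}$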
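A small sketch of the interval computation, assuming the normal critical value by default and letting the caller supply a Student's t value looked up elsewhere (the Python standard library has no t quantile function). The function name and the example inputs are hypothetical.

```python
import math
from statistics import NormalDist


def confidence_interval(theta_hat, v, level=0.95, c=None):
    """Interval theta_hat +/- c * sqrt(v).

    If c is not given, the normal critical value z_{alpha/2} is used."""
    if c is None:
        c = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half_width = c * math.sqrt(v)
    return theta_hat - half_width, theta_hat + half_width


# Normal-theory interval for an illustrative estimate and variance.
normal_ci = confidence_interval(2.5, 1 / 12)
# With k = 4 random groups, a t table gives t_{3, 0.025} of about 3.182.
t_ci = confidence_interval(2.5, 1 / 12, c=3.182)
```

With few random groups the t-based interval is noticeably wider than the normal one, reflecting the small number of degrees of freedom behind the variance estimate.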
Chapter 3: Variance Estimation Based on Balanced Half-Samples
Description of Basic Techniques
$L$ strata
$N_h$ units per stratum
$N$ = size of entire population
$n_h = 2$ units selected per stratum
srs wr
Example: restaurants in Montreal
Chapter 3: Variance Estimation Based on Balanced Half-Samples
$\bar{Y}$ = average number of customers served by Montreal restaurants on a Monday night
$\bar{y}_{st} = \sum_{h=1}^{L} W_h \bar{y}_h$
$W_h = N_h / N$
$\bar{y}_h = (y_{h1} + y_{h2}) / 2$
Chapter 3: Variance Estimation Based on Balanced Half-Samples
Textbook Estimator of Variance
$v(\bar{y}_{st}) = \sum_{h=1}^{L} W_h^2 s_h^2 / 2 = \sum_{h=1}^{L} W_h^2 d_h^2 / 4$
$s_h^2 = \sum_{i=1}^{2} (y_{hi} - \bar{y}_h)^2 / (2 - 1)$
$d_h = y_{h1} - y_{h2}$
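Chapter 3: Variance Estimation Based on Balanced Half-Samples
Random Group Estimator of Variance
$k = 2$ independent random groups are available:
$(y_{11}, y_{21}, \ldots, y_{L1})$
$(y_{12}, y_{22}, \ldots, y_{L2})$
$\bar{y}_{st,\alpha} = \sum_{h=1}^{L} W_h y_{h\alpha}$
$v_{RG}(\bar{y}_{st}) = \frac{1}{2(2-1)} \sum_{\alpha=1}^{2} (\bar{y}_{st,\alpha} - \bar{y}_{st})^2 = (\bar{y}_{st,1} - \bar{y}_{st,2})^2 / 4$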
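To make the contrast concrete, the sketch below computes the textbook and the two-group random group estimators for a design with two units per stratum. The stratum weights and observations are invented for illustration, and the function names are assumptions.

```python
def stratified_mean(weights, pairs):
    """y-bar_st = sum over h of W_h * (y_h1 + y_h2) / 2."""
    return sum(w * (y1 + y2) / 2 for w, (y1, y2) in zip(weights, pairs))


def textbook_variance(weights, pairs):
    """v(y-bar_st) = sum over h of W_h^2 * d_h^2 / 4, with d_h = y_h1 - y_h2."""
    return sum(w ** 2 * (y1 - y2) ** 2 for w, (y1, y2) in zip(weights, pairs)) / 4


def random_group_variance(weights, pairs):
    """v_RG(y-bar_st) = (y-bar_st,1 - y-bar_st,2)^2 / 4."""
    y_st_1 = sum(w * y1 for w, (y1, _) in zip(weights, pairs))
    y_st_2 = sum(w * y2 for w, (_, y2) in zip(weights, pairs))
    return (y_st_1 - y_st_2) ** 2 / 4


# Hypothetical design: L = 3 strata, two observations per stratum.
W = [0.5, 0.3, 0.2]
y = [(10, 14), (20, 16), (5, 9)]
mean = stratified_mean(W, y)
v_text = textbook_variance(W, y)
v_rg = random_group_variance(W, y)
```

Here the stratified mean is 12.8, the textbook estimate is 1.52, and the random group estimate is 0.64. The two differ because the random group estimator squares the sum of the weighted differences, while the textbook estimator sums their squares.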
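Chapter 3: Variance Estimation Based on Balanced Half-Samples
Half-Sample Methodology
$2^L$ possible half-samples
$\delta_{h1\alpha} = 1$ if unit $(h,1)$ is selected for the $\alpha$-th half-sample; $\delta_{h1\alpha} = 0$ otherwise
$\delta_{h2\alpha} = 1 - \delta_{h1\alpha}$
$\bar{y}_{st,\alpha} = \sum_{h=1}^{L} W_h (\delta_{h1\alpha} y_{h1} + \delta_{h2\alpha} y_{h2})$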
Chapter 3: Variance Estimation Based on Balanced Half-Samples
Choosing a Manageable Number, $k$, of Half-Samples
- random
- balanced
$v_k(\bar{y}_{st}) = \sum_{\alpha=1}^{k} (\bar{y}_{st,\alpha} - \bar{y}_{st})^2 / k$
Chapter 3: Variance Estimation Based on Balanced Half-Samples
Table 3.2.1. Definition of Balanced Half-Sample Replicates for 5, 6, 7, or 8 Strata
                        Stratum (h)
Replicate          1   2   3   4   5   6   7   8
$\delta_{1h}$     +1  -1  -1  +1  -1  +1  +1  -1
$\delta_{2h}$     +1  +1  -1  -1  +1  -1  +1  -1
$\delta_{3h}$     +1  +1  +1  -1  -1  +1  -1  -1
$\delta_{4h}$     -1  +1  +1  +1  -1  -1  +1  -1
$\delta_{5h}$     +1  -1  +1  +1  +1  -1  -1  -1
$\delta_{6h}$     -1  +1  -1  +1  +1  +1  -1  -1
$\delta_{7h}$     -1  -1  +1  -1  +1  +1  +1  -1
$\delta_{8h}$     -1  -1  -1  -1  -1  -1  -1  -1
Chapter 3: Variance Estimation Based on Balanced Half-Samples
Properties of the Balanced Half-Sample Methods
$v_k(\bar{y}_{st}) = v(\bar{y}_{st})$
$\frac{1}{k} \sum_{\alpha=1}^{k} \bar{y}_{st,\alpha} = \bar{y}_{st}$, provided $k > L$
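The balance property can be verified numerically with the replicate pattern of Table 3.2.1. This sketch assumes that +1 means unit (h,1) enters the half-sample and -1 means unit (h,2) does; the data and function names are hypothetical.

```python
# Rows are replicates (alpha), columns are strata (h), as in Table 3.2.1.
TABLE_3_2_1 = [
    [+1, -1, -1, +1, -1, +1, +1, -1],
    [+1, +1, -1, -1, +1, -1, +1, -1],
    [+1, +1, +1, -1, -1, +1, -1, -1],
    [-1, +1, +1, +1, -1, -1, +1, -1],
    [+1, -1, +1, +1, +1, -1, -1, -1],
    [-1, +1, -1, +1, +1, +1, -1, -1],
    [-1, -1, +1, -1, +1, +1, +1, -1],
    [-1, -1, -1, -1, -1, -1, -1, -1],
]


def half_sample_means(weights, pairs):
    """One stratified half-sample mean per row of the table.

    Assumption: +1 selects unit (h,1), -1 selects unit (h,2).
    Only the first L columns are used, where L = len(weights) <= 8."""
    return [
        sum(w * (pair[0] if row[h] == +1 else pair[1])
            for h, (w, pair) in enumerate(zip(weights, pairs)))
        for row in TABLE_3_2_1
    ]


def balanced_half_sample_variance(weights, pairs):
    """v_k = sum over alpha of (y-bar_st,alpha - y-bar_st)^2 / k."""
    y_st = sum(w * (y1 + y2) / 2 for w, (y1, y2) in zip(weights, pairs))
    reps = half_sample_means(weights, pairs)
    return sum((r - y_st) ** 2 for r in reps) / len(reps)


# Hypothetical design: L = 3 strata, two observations per stratum.
W = [0.5, 0.3, 0.2]
y = [(10, 14), (20, 16), (5, 9)]
v_k = balanced_half_sample_variance(W, y)
```

For these data the eight balanced replicates reproduce the textbook value of 1.52, and their average equals the full-sample stratified mean of 12.8, because the three columns used are mutually orthogonal and each sums to zero.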
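Chapter 3: Variance Estimation Based on Balanced Half-Samples
Usage with Multistage Designs
$Y$ = total number of employed persons in Canada
$\hat{Y} = \sum_{h=1}^{L} \left( \hat{Y}_{h1} / (2 p_{h1}) + \hat{Y}_{h2} / (2 p_{h2}) \right)$
$\hat{Y}_{h1}$ = estimator of total number of employed persons in the $(h,1)$-th PSU
$p_{h1} \propto$ housing units
$v(\hat{Y}) = \sum_{h=1}^{L} \left( \hat{Y}_{h1} / p_{h1} - \hat{Y}_{h2} / p_{h2} \right)^2 / 4$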
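Chapter 3: Variance Estimation Based on Balanced Half-Samples
Balanced Half-Sample Methodology
$\hat{Y}_{\alpha} = \sum_{h=1}^{L} \left( \delta_{h1\alpha} \hat{Y}_{h1} / p_{h1} + \delta_{h2\alpha} \hat{Y}_{h2} / p_{h2} \right)$
$v_k(\hat{Y}) = \sum_{\alpha=1}^{k} (\hat{Y}_{\alpha} - \hat{Y})^2 / k$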
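Chapter 3: Variance Estimation Based on Balanced Half-Samples
Alternative Half-Sample Estimators of Variance
$v_k(\hat{\theta}) = \sum_{\alpha=1}^{k} (\hat{\theta}_{\alpha} - \hat{\theta})^2 / k$
$v_k^{c}(\hat{\theta}) = \frac{1}{k} \sum_{\alpha=1}^{k} (\hat{\theta}_{\alpha}^{c} - \hat{\theta})^2$
$\bar{v}_k(\hat{\theta}) = \frac{1}{2} [v_k(\hat{\theta}) + v_k^{c}(\hat{\theta})]$
$v_k^{\dagger}(\hat{\theta}) = \frac{1}{4k} \sum_{\alpha=1}^{k} (\hat{\theta}_{\alpha} - \hat{\theta}_{\alpha}^{c})^2$
Estimators are not necessarily equal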
Chapter 4: The Jackknife Method
Quenouille (1949) – bias reduction
Tukey (1958) – variance estimation, testing, interval estimation
Resampling
Chapter 4: The Jackknife Method
Basic Methodology
Random sample: $y_1, y_2, \ldots, y_n$
Random groups: $n = km$
Parameter: $\theta$ (example: yield per acre of farms in Quebec)
Estimator: $\hat{\theta}$
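Chapter 4: The Jackknife Method
Drop out $m$: $\hat{\theta}_{(\alpha)}$, $\alpha = 1, \ldots, k$ (estimator computed after omitting the $\alpha$-th group of $m$ observations)
Pseudovalue: $\hat{\theta}_{\alpha} = k\hat{\theta} - (k-1)\hat{\theta}_{(\alpha)}$
Quenouille's estimator: $\hat{\bar{\theta}} = \frac{1}{k} \sum_{\alpha=1}^{k} \hat{\theta}_{\alpha}$
Variance estimator:
$v_1(\hat{\bar{\theta}}) = \frac{1}{k(k-1)} \sum_{\alpha=1}^{k} (\hat{\theta}_{\alpha} - \hat{\bar{\theta}})^2$
$v_2(\hat{\bar{\theta}}) = \frac{1}{k(k-1)} \sum_{\alpha=1}^{k} (\hat{\theta}_{\alpha} - \hat{\theta})^2$
Special case: $k = n$, $m = 1$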
Chapter 4: The Jackknife Method
Full-sample estimator: $\hat{\theta}$
Variance estimator: $v(\hat{\theta}) = \frac{k-1}{k} \sum_{\alpha=1}^{k} (\hat{\theta}_{(\alpha)} - \hat{\theta})^2$
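Chapter 4: The Jackknife Method
Example: ratio
$R = Y / X$
$\hat{R} = \bar{y} / \bar{x}$
$\hat{R}_{(\alpha)} = \bar{y}_{(\alpha)} / \bar{x}_{(\alpha)}$
$\hat{R}_{\alpha} = k\, \bar{y} / \bar{x} - (k-1)\, \bar{y}_{(\alpha)} / \bar{x}_{(\alpha)}$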
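The drop-one-group recipe is short enough to sketch directly. The code below computes the full-sample estimate, the pseudovalues, Quenouille's estimator, and the variance estimator that centers the drop-out estimates at the full-sample estimate, then applies them to a ratio. The function names and the three (y, x) observations are illustrative assumptions.

```python
def jackknife(groups, estimator):
    """groups: list of k lists of observations; estimator: function of a list.

    Returns (theta_hat, quenouille, v), where v is
    (k - 1)/k * sum over alpha of (theta_(alpha) - theta_hat)^2."""
    k = len(groups)
    full = [obs for g in groups for obs in g]
    theta_hat = estimator(full)
    # theta_(alpha): estimate with the alpha-th group dropped out
    theta_drop = [
        estimator([obs for b, g in enumerate(groups) if b != a for obs in g])
        for a in range(k)
    ]
    pseudovalues = [k * theta_hat - (k - 1) * t for t in theta_drop]
    quenouille = sum(pseudovalues) / k
    v = (k - 1) / k * sum((t - theta_hat) ** 2 for t in theta_drop)
    return theta_hat, quenouille, v


def ratio(observations):
    """R-hat = y-bar / x-bar for a list of (y, x) pairs."""
    return sum(y for y, _ in observations) / sum(x for _, x in observations)


# Special case k = n, m = 1 with three hypothetical (y, x) pairs.
r_hat, r_quenouille, v_r = jackknife([[(2, 1)], [(4, 2)], [(9, 3)]], ratio)
```

For these three pairs the full-sample ratio is 2.5, Quenouille's estimator is 2.6, and the variance estimate is 0.215. Applied to a plain sample mean with k = n, the same function returns the familiar s^2/n.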
Chapter 4: The Jackknife Method
Usage in Stratified Sampling
Drop out observation(s) from individual strata: $\hat{\theta}_{(hi)}$
$v(\hat{\theta}) = \sum_{h=1}^{L} \frac{n_h - 1}{n_h} \sum_{i=1}^{n_h} (\hat{\theta}_{(hi)} - \hat{\theta})^2$
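A minimal sketch of the stratified version, dropping one observation at a time within each stratum. The estimator is passed in as a function of the list of strata; the stratum weights, the data, and the function names are hypothetical.

```python
def stratified_jackknife(strata, estimator):
    """strata: list of L lists of observations; estimator: function of strata.

    Returns (theta_hat, v) with
    v = sum over h of (n_h - 1)/n_h * sum over i of (theta_(hi) - theta_hat)^2."""
    theta_hat = estimator(strata)
    v = 0.0
    for h, units in enumerate(strata):
        n_h = len(units)
        sum_sq = 0.0
        for i in range(n_h):
            # Remove observation i from stratum h; leave other strata intact.
            reduced = [
                units[:i] + units[i + 1:] if g == h else other
                for g, other in enumerate(strata)
            ]
            sum_sq += (estimator(reduced) - theta_hat) ** 2
        v += (n_h - 1) / n_h * sum_sq
    return theta_hat, v


# Hypothetical stratum weights W_h and data for L = 2 strata.
W = [0.6, 0.4]


def stratified_mean(strata):
    return sum(w * sum(units) / len(units) for w, units in zip(W, strata))


theta_hat, v = stratified_jackknife([[1, 3, 5], [10, 14]], stratified_mean)
```

For a linear estimator such as the stratified mean, this reproduces the with-replacement textbook value, the sum of W_h^2 s_h^2 / n_h, which is 1.12 for the data above.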
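Chapter 4: The Jackknife Method
Application to Cluster Sampling
Example
$Y$ = total employed persons
$\hat{Y} = \frac{1}{n} \sum_{i=1}^{n} \hat{Y}_i / p_i = \sum_{i} \sum_{j} W_{ij} Y_{ij}$
Drop out ultimate clusters
$\hat{Y}_{(\alpha)} = \frac{1}{m(k-1)} \sum_{i=1}^{m(k-1)} \hat{Y}_i / p_i = \sum_{i} \sum_{j} W_{(\alpha)ij} Y_{ij}$
$W_{(\alpha)ij} = W_{ij} \frac{km}{m(k-1)}$ if PSU $i$ is not dropped out; $W_{(\alpha)ij} = 0$ if PSU $i$ is dropped out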
Chapter 5: The Bootstrap Method
Works with replicates of potentially any size, $n^*$
Original Application:
$Y_1, \ldots, Y_n$ are iid random variables (scalar or vector) from a distribution function $F$
$\theta$ is to be estimated
Chapter 5: The Bootstrap Method
A bootstrap sample (or bootstrap replicate) is a simple random sample with replacement (srs wr) of size $n^*$ selected from the original sample.
$Y_1^*, \ldots, Y_{n^*}^*$
$\hat{\theta}^*$ denotes the estimator of the same functional form as $\hat{\theta}$
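Chapter 5: The Bootstrap Method
Ideal Bootstrap Estimator of $\mathrm{Var}\{\hat{\theta}\}$
$v_1(\hat{\theta}) = \mathrm{Var}_*\{\hat{\theta}^*\}$,
where $\mathrm{Var}_*$ signifies the conditional variance, given the original sample
Monte Carlo Bootstrap Estimator of $\mathrm{Var}\{\hat{\theta}\}$
i. Draw a large number, $A$, of independent bootstrap replicates from the main sample and label the corresponding observations as $Y_{\alpha 1}^*, \ldots, Y_{\alpha n^*}^*$, for $\alpha = 1, \ldots, A$;
ii. For each bootstrap replicate, compute the corresponding estimator $\hat{\theta}_{\alpha}^*$ of the parameter of interest; and
iii. Calculate the variance between the $\hat{\theta}_{\alpha}^*$ values
$v_2(\hat{\theta}) = \frac{1}{A-1} \sum_{\alpha=1}^{A} (\hat{\theta}_{\alpha}^* - \hat{\bar{\theta}}^*)^2$, where $\hat{\bar{\theta}}^* = \frac{1}{A} \sum_{\alpha=1}^{A} \hat{\theta}_{\alpha}^*$.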
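Steps i to iii translate almost line for line into code. The sketch below is one possible rendering using only the standard library; the function name, the default of A = 1000 replicates, and the seed argument are assumptions made for the example.

```python
import random


def monte_carlo_bootstrap_variance(sample, estimator, n_star=None, A=1000, seed=0):
    """Monte Carlo bootstrap estimate of Var{theta-hat}.

    sample: list of observations; estimator: function of a list;
    n_star: bootstrap replicate size (defaults to the original sample size);
    A: number of independent bootstrap replicates."""
    rng = random.Random(seed)
    if n_star is None:
        n_star = len(sample)
    # i. and ii. Draw A srs wr replicates and compute the estimator on each.
    replicates = [estimator(rng.choices(sample, k=n_star)) for _ in range(A)]
    # iii. Variance between the replicate estimates.
    mean = sum(replicates) / A
    return sum((t - mean) ** 2 for t in replicates) / (A - 1)


def mean(values):
    return sum(values) / len(values)


# Hypothetical data; the result varies with A and with the seed.
v2 = monte_carlo_bootstrap_variance([2, 4, 6, 8], mean, A=2000, seed=1)
```

Fixing the seed makes the run reproducible. As A grows, the Monte Carlo value settles toward the ideal bootstrap estimator for the chosen replicate size.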
Chapter 5: The Bootstrap Method
Application to the Finite Population: Simple Random Sampling with Replacement (srs wr)
Data: $y_1, \ldots, y_n$
Parameter of Interest: $\bar{Y}$
Standard Estimator: $\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i$
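Chapter 5: The Bootstrap Method
Bootstrap Sample: $y_1^*, \ldots, y_{n^*}^*$
Estimator: $\bar{y}^* = \frac{1}{n^*} \sum_{i=1}^{n^*} y_i^*$
Bootstrap Moments
$E_*\{y_1^*\} = \frac{1}{n} \sum_{i=1}^{n} y_i = \bar{y}$
$\mathrm{Var}_*\{y_1^*\} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \bar{y})^2 = \frac{n-1}{n} s^2$
Ideal Bootstrap Estimator of Variance
$v_1(\bar{y}) = \mathrm{Var}_*\{\bar{y}^*\} = \mathrm{Var}_*\{y_1^*\} / n^* = \frac{n-1}{n} \frac{s^2}{n^*}$
Unbiased Choice
$n^* = n - 1$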
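For a very small sample the ideal bootstrap variance can be computed exactly by listing every possible bootstrap sample, which makes the role of the replicate size visible. This is a sketch with made-up data and a hypothetical function name; full enumeration is only feasible for tiny n.

```python
from fractions import Fraction
from itertools import product


def ideal_bootstrap_variance_of_mean(sample, n_star):
    """Exact Var_*{y-bar*}: enumerate all n ** n_star equally likely
    srs wr bootstrap samples of size n_star and take the variance of
    their means."""
    means = [Fraction(sum(b), n_star) for b in product(sample, repeat=n_star)]
    expectation = sum(means) / len(means)
    return sum((m - expectation) ** 2 for m in means) / len(means)


# Hypothetical sample with n = 4, so s^2 = 20/3 and s^2 / n = 5/3.
sample = [2, 4, 6, 8]
v_n_minus_1 = ideal_bootstrap_variance_of_mean(sample, 3)   # n* = n - 1
v_n = ideal_bootstrap_variance_of_mean(sample, 4)           # n* = n
```

With n* = n - 1 = 3 the result is 5/3, matching the textbook s^2/n, while n* = n gives 5/4, which is too small by the factor (n - 1)/n.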
Chapter 5: The Bootstrap Method
Multistage Sampling with pps wr Sampling at the First Stage
Observed Data: $y_{ij}$, where $i$ indexes the selected PSU and $j$ indexes the completed interview within the PSU
Parameter of Interest: $Y$
Estimator: $\hat{Y} = \sum_{i=1}^{n} \sum_{j} w_{ij} y_{ij} = \frac{1}{n} \sum_{i=1}^{n} \hat{Y}_i / p_i = \frac{1}{n} \sum_{i=1}^{n} z_i$, where $z_i = \hat{Y}_i / p_i$
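Chapter 5: The Bootstrap Method
Bootstrap Sample: $z_1^*, z_2^*, \ldots, z_{n^*}^*$
Bootstrap Moments
$E_*\{z_1^*\} = \frac{1}{n} \sum_{i=1}^{n} z_i = \hat{Y}$
$\mathrm{Var}_*\{z_1^*\} = \frac{1}{n} \sum_{i=1}^{n} (z_i - \hat{Y})^2$
Ideal Bootstrap Estimator of Variance
$v_1(\hat{Y}) = \mathrm{Var}_*\{\hat{Y}^*\} = \mathrm{Var}_*\{z_1^*\} / n^* = \frac{1}{n^*} \frac{1}{n} \sum_{i=1}^{n} (z_i - \hat{Y})^2$
Unbiased Choice
$n^* = n - 1$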
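Chapter 6: Taylor Series Methods
Assume a complex survey design
$\mathbf{Y} = (Y_1, \ldots, Y_p)'$, vector of population totals
$\hat{\mathbf{Y}} = (\hat{Y}_1, \ldots, \hat{Y}_p)'$
$\theta = g(\mathbf{Y})$, parameter of interest, such as the ratio $Y_1 / Y_2$
$\hat{\theta} = g(\hat{\mathbf{Y}})$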
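Chapter 6: Taylor Series Methods
First-order Taylor series approximation
$\hat{\theta} - \theta = \sum_{j=1}^{p} \frac{\partial g(\mathbf{Y})}{\partial y_j} (\hat{Y}_j - Y_j) + R$
MSE
$\mathrm{MSE}\{\hat{\theta}\} \doteq \mathrm{Var}\left\{ \sum_{j=1}^{p} \frac{\partial g(\mathbf{Y})}{\partial y_j} (\hat{Y}_j - Y_j) \right\} = d\, \Sigma\, d'$
$d = (d_1, \ldots, d_p)$, $d_j = \frac{\partial g(\mathbf{Y})}{\partial y_j}$
$\Sigma = E\{(\hat{\mathbf{Y}} - \mathbf{Y})(\hat{\mathbf{Y}} - \mathbf{Y})'\}$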
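Chapter 6: Taylor Series Methods
$v(\hat{\theta}) = \hat{d}\, \hat{\Sigma}\, \hat{d}'$
$\hat{d}_j = \frac{\partial g(\hat{\mathbf{Y}})}{\partial y_j}$
$\hat{\Sigma}$ by textbook or replication-based method applied to the y-data
Alternative algorithm
$\hat{Y}_j = \sum_{i \in s} W_i Y_{ij}$
$U_i = \sum_{j=1}^{p} \frac{\partial g(\mathbf{Y})}{\partial y_j} Y_{ij}$
$\hat{U} = \sum_{i \in s} W_i U_i$
$\mathrm{MSE}\{\hat{\theta}\} \doteq \mathrm{Var}\{\hat{U}\}$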
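Both routes to the Taylor series variance can be sketched for the ratio of two totals. The code assumes an srs wor design with weights W_i = N/n, so that the textbook variance formula for a total can be applied to the linearized variable, and it evaluates the derivatives at the estimated totals. The function names, the design assumption, and the data are illustrative.

```python
def srs_wor_total_variance(values, N):
    """Textbook v for the total of `values` under srs wor: N^2 (1 - f) s^2 / n."""
    n = len(values)
    mean = sum(values) / n
    s2 = sum((u - mean) ** 2 for u in values) / (n - 1)
    return N ** 2 * (1 - n / N) * s2 / n


def sample_cov(a, b):
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (n - 1)


def taylor_ratio(y, x, N):
    """Return (R-hat, v via d Sigma d', v via the linearized variable U_i)."""
    n = len(y)
    w = N / n                                   # W_i under srs wor (assumed)
    y_hat, x_hat = w * sum(y), w * sum(x)
    r_hat = y_hat / x_hat
    # Partial derivatives of g(Y1, Y2) = Y1 / Y2 at the estimated totals.
    d = (1 / x_hat, -y_hat / x_hat ** 2)
    # Route 1: estimated covariance matrix of (Y-hat, X-hat), then d Sigma d'.
    scale = N ** 2 * (1 - n / N) / n
    sigma = [
        [scale * sample_cov(y, y), scale * sample_cov(y, x)],
        [scale * sample_cov(x, y), scale * sample_cov(x, x)],
    ]
    v_matrix = sum(d[a] * sigma[a][b] * d[b] for a in range(2) for b in range(2))
    # Route 2: one linearized value per unit, then a single variance of a total.
    u = [d[0] * yi + d[1] * xi for yi, xi in zip(y, x)]
    v_linear = srs_wor_total_variance(u, N)
    return r_hat, v_matrix, v_linear


# Hypothetical sample of n = 3 units from a population of N = 30.
r_hat, v_matrix, v_linear = taylor_ratio([2, 4, 9], [1, 2, 3], N=30)
```

For these numbers the ratio is 2.5 and both routes give 0.13125. The linearized route needs only one univariate variance computation, which is the practical appeal of the alternative algorithm when p is large.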
Chapter 7: Generalized Variance Functions
1. Population total, $X$
2. Estimator of the total, $\hat{X}$
3. Relative variance, $V^2 = \mathrm{Var}\{\hat{X}\} / X^2$
4. Generalized variance function, $V^2 = \alpha + \beta / X$
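One simple way to use the model is to fit alpha and beta to a set of estimated totals and their estimated relative variances, then read off a standard error for any other total. The sketch below uses ordinary least squares of V^2 on 1/X, which is an assumption made for brevity (weighted or iterative fits are common in practice); the data are invented to follow the model exactly.

```python
import math


def fit_gvf(x_hats, relvars):
    """Ordinary least squares fit of V^2 = alpha + beta / X.

    x_hats: estimated totals; relvars: their estimated relative variances."""
    z = [1.0 / x for x in x_hats]               # regressor 1 / X
    n = len(z)
    z_bar = sum(z) / n
    v_bar = sum(relvars) / n
    beta = (
        sum((zi - z_bar) * (vi - v_bar) for zi, vi in zip(z, relvars))
        / sum((zi - z_bar) ** 2 for zi in z)
    )
    alpha = v_bar - beta * z_bar
    return alpha, beta


def gvf_standard_error(x_hat, alpha, beta):
    """Standard error implied by the fitted model: X-hat * sqrt(alpha + beta / X-hat)."""
    return x_hat * math.sqrt(alpha + beta / x_hat)


# Made-up (X-hat, V^2) pairs generated from alpha = 0.001, beta = 50.
alpha, beta = fit_gvf([1000, 5000, 20000], [0.051, 0.011, 0.0035])
se = gvf_standard_error(10000, alpha, beta)
```

Because the illustrative points lie exactly on the curve, the fit recovers alpha = 0.001 and beta = 50, and the implied standard error for a total of 10,000 is 10,000 times the square root of 0.006.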