MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data...
MEASUREMENT ERROR IN HEALTH STUDIES
• Lecture 1: Introduction, Examples, Effects of Measurement Error in Linear Models
• Lecture 2: Data Types, Nondifferential Error, Estimating Attenuation, Exact Predictors, Berkson/Classical
• Lecture 3: Distinguishing Berkson from Classical, Structural/Functional, Regression Calibration, Non-Additive Errors
• Lecture 4: SIMEX and Instrumental Variables
• Lecture 5: Bayesian Methods
• Lecture 6: Nonparametric Regression with Measurement Error
LECTURE 1: INTRODUCTION, EXAMPLES AND LINEAR
MEASUREMENT ERROR MODELS
OUTLINE
• Why We Need Special Methods For Measurement Errors
• Measurement Error Examples
• Structure of a Measurement Error Problem
• Classical Error Model in Linear Regression
WHY WE NEED SPECIAL METHODS TO HANDLE MEASUREMENT ERRORS
• These lectures are about strategies for regression when some predictors are measured with error.
• Remember your introductory regression text...
∗ Snedecor and Cochran (1967), “Thus far we have assumed that X-variable in regression is measured without error. Since no measuring instrument is perfect this assumption is often unrealistic.”
∗ Steele and Torrie (1980), “... if the X’s are also measured with error, ... an alternative computing procedure should be used...”
∗ Neter and Wasserman (1974), “Unfortunately, a different situation holds if the independent variable X is known only with measurement error.”
WHY SPECIAL METHODS ARE NEEDED (CONT.)
• Measurement error in outcome: If
Y = β0 + β1X + ε
and we observe Y* = Y + U, then
Y* = β0 + β1X + (ε + U)
∗ All that has happened is that the error variance is bigger
∗ Standard regression applies
WHY SPECIAL METHODS ARE NEEDED (CONT.)
• Consider now:
Y = β0 + β1X + ε
W = X + U
and only W and Y are observed. Then
Y = β0 + β1W + (ε − β1U)
∗ This seems like a standard regression problem with additional error.
– Can we analyze it that way?
WHY SPECIAL METHODS ARE NEEDED (CONT.)
• From previous page: X = W − U implies
Y = β0 + β1W + (ε − β1U)
• Problem: W is correlated with the “error” ε − β1U
• Better to replace W by a function of W, namely E(X|W), since
X = E(X|W) + V, where E(X|W) and V are uncorrelated
and then we have the model
Y = β0 + β1E(X|W) + (ε + β1V)
∗ special case of “regression calibration”
∗ leads to unbiased estimates
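The replacement of W by E(X|W) can be checked numerically. The following is a minimal sketch rather than the course's own code: it assumes normal X and U, treats the error variance as known, and uses the linear form E(X|W) = μw + λ(W − μw), which holds under joint normality. The helper `ols`, the seed, and all parameter values are illustrative choices.

```python
import random

def ols(x, y):
    """Least-squares intercept and slope for the regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

rng = random.Random(1)
n, beta0, beta1 = 20000, 1.0, 2.0
sigma_u2 = 1.0  # measurement error variance, assumed known in this sketch

x = [rng.gauss(0.0, 1.0) for _ in range(n)]                  # true predictor (unobserved)
y = [beta0 + beta1 * xi + rng.gauss(0.0, 0.5) for xi in x]   # response
w = [xi + rng.gauss(0.0, sigma_u2 ** 0.5) for xi in x]       # observed W = X + U

# Estimate E(X|W) = mu_w + lambda * (W - mu_w), lambda = (s_w^2 - sigma_u^2) / s_w^2.
mu_w = sum(w) / n
s2_w = sum((wi - mu_w) ** 2 for wi in w) / (n - 1)
lam_hat = (s2_w - sigma_u2) / s2_w
x_hat = [mu_w + lam_hat * (wi - mu_w) for wi in w]

naive_intercept, naive_slope = ols(w, y)   # regression on W: slope is attenuated
rc_intercept, rc_slope = ols(x_hat, y)     # regression on estimated E(X|W)

print(f"naive slope: {naive_slope:.3f}, calibrated slope: {rc_slope:.3f}")
```

With these settings the naive slope should land near half of β1, while the slope from the regression on the estimated E(X|W) should land near β1 itself.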
EXAMPLES OF MEASUREMENT ERROR PROBLEMS
• Different measures of nutrient intake, e.g., by a food frequency questionnaire (FFQ)
• Systolic Blood Pressure
• Radiation Dosimetry
• Exposure to arsenic in drinking water, dust in the workplace, radon gas in the home, and other environmental hazards
MEASURES OF NUTRIENT INTAKE
• Y = average daily % calories from fat by a FFQ.
• X = true long-term average daily percentage of calories from fat
• Assume Y = β0 + βx X + ε
∗ we do not assume that the FFQ is unbiased
• X is never observable. It is measured with error:
∗ Along with the FFQ, on 6 days over the course of a year women are interviewed by phone and asked to recall their food intake over the past day (24-hour recalls). Their average is recorded and denoted by W.
– W is an “alloyed gold standard”
∗ The analysis of 24-hour recall introduces some error ⇒ analysis error
∗ Measurement error = sampling error + analysis error
∗ Classical measurement error model:
$W_i = X_i + U_i$, where the $U_i$ are measurement errors
HEART DISEASE VS SYSTOLIC BLOOD PRESSURE
• Y = indicator of Coronary Heart Disease (CHD)
• X = true long-term average systolic blood pressure (SBP) (maybe transformed)
• Assume P(Y = 1) = H(β0 + βx X), where H is the logistic or probit function
• Data are CHD indicators and determinations of systolic blood pressure for n = 1600 in the Framingham Heart Study
HEART DISEASE VS SYSTOLIC BLOOD PRESSURE (CONT.)
• X measured with error:
∗ SBP measured at two exams (and averaged) ⇒ sampling error
∗ The determination of SBP is subject to machine and reader variability ⇒ analysis error
∗ Measurement error = sampling error + analysis error
∗ Measurement error model
$W_i = X_i + U_i$, where the $U_i$ are measurement errors
GENERAL STRUCTURE OF A MEASUREMENT ERROR PROBLEM
• Y = response, Z = error-free predictor, X = error-prone predictor
• E(Y | Z, X) = f(Z, X, β) (outcome model)
• Observed data: $(Y_i, Z_i, W_i)$, i = 1, ..., n
∗ E(Y | Z, W) ≠ f(Z, W, β) (source of our worries)
• Error model relating $W_i$ and $X_i$ (measurement model)
∗ $W_i = X_i + U_i$ (classical error model)
∗ $W_i = \gamma_{0,em} + \gamma_{x,em}^t X_i + \gamma_{z,em}^t Z_i + U_i$ (error calibration model)
A CLASSICAL ERROR MODEL
• $W_i = X_i + U_i$ (additive)
• $U_i$ are:
∗ independent of all $Y_i$, $Z_i$ and $X_i$
∗ IID with mean 0 and variance $\sigma_u^2$
• In addition, we may have a model for the distribution of (X, Z) (exposure model)
∗ called a structural model
SIMPLE LINEAR REGRESSION WITH A CLASSICAL ERROR MODEL
• Y = response, X = error-prone predictor
• Y = β0 + βx X + ε
• Observed data: $(Y_i, W_i)$, i = 1, ..., n
• $W_i = X_i + U_i$ (additive)
• $U_i$ are:
∗ independent of all $Y_i$, $Z_i$ and $X_i$
∗ IID with mean 0 and variance $\sigma_u^2$
What are the effects of measurement error on the usual analysis?
SIMULATION STUDY
• Generate $X_1, \ldots, X_{50}$, IID N(0, 1)
• Generate $Y_i = \beta_0 + \beta_x X_i + \varepsilon_i$
∗ $\varepsilon_i$ IID N(0, 1/9)
∗ $\beta_0 = 0$
∗ $\beta_x = 1$
• Generate $U_1, \ldots, U_{50}$, IID N(0, 1)
• Set $W_i = X_i + U_i$
• Regress Y on X and Y on W and compare
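The recipe above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the code behind the slides: the seed, the helper names, and the choice to repeat the n = 50 experiment 1000 times (so that the average slopes are stable) are all assumptions added here.

```python
import random

def slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def one_replicate(rng, n=50, beta0=0.0, beta_x=1.0):
    """Simulate one data set as on the slide; return slopes of Y on X and Y on W."""
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [beta0 + beta_x * xi + rng.gauss(0.0, 1.0 / 3.0) for xi in x]  # Var(eps) = 1/9
    w = [xi + rng.gauss(0.0, 1.0) for xi in x]                          # W = X + U
    return slope(x, y), slope(w, y)

rng = random.Random(2024)
results = [one_replicate(rng) for _ in range(1000)]  # repetition count is an added choice
mean_slope_x = sum(r[0] for r in results) / len(results)
mean_slope_w = sum(r[1] for r in results) / len(results)

print(f"average slope, Y on X: {mean_slope_x:.3f}")
print(f"average slope, Y on W: {mean_slope_w:.3f}")
```

Because X and U both have variance 1 here, the slope of Y on W should average about half of the slope of Y on X.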
[Figure: Effects of Measurement Error. Legend: Reliable Data. Caption: True Data Without Measurement Error.]
[Figure: Effects of Measurement Error. Legend: Reliable Data, Error-prone Data. Caption: Observed Data With Measurement Error.]
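THEORY BEHIND THE PICTURES: THE NAIVE ANALYSIS
• Least Squares Estimate of Slope:
$$\hat\beta_x = \frac{S_{y,w}}{S_w^2} = \frac{n^{-1}\sum (Y - \bar Y)(W - \bar W)}{n^{-1}\sum (W - \bar W)^2}$$
where
$$S_{y,w} \longrightarrow \mathrm{Cov}(Y, W) = \mathrm{Cov}(Y, X + U) = \mathrm{Cov}(Y, X) = \sigma_{y,x}$$
$$S_w^2 \longrightarrow \mathrm{Var}(W) = \mathrm{Var}(X + U) = \sigma_x^2 + \sigma_u^2$$
So
$$\hat\beta_x \longrightarrow \frac{\sigma_{y,x}}{\sigma_x^2 + \sigma_u^2} = \left(\frac{\sigma_x^2}{\sigma_x^2 + \sigma_u^2}\right)\frac{\sigma_{y,x}}{\sigma_x^2} = \lambda\beta_x$$
where
$$\lambda = \frac{\sigma_x^2}{\sigma_x^2 + \sigma_u^2} = \frac{\sigma_x^2}{\sigma_w^2} = \text{attenuation factor} = \text{reliability ratio}$$
∗ It is the relative size of the error that matters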
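THEORY BEHIND THE PICTURES: THE NAIVE ANALYSIS
• Least Squares Estimate of Intercept:
$$\hat\beta_0 = \bar Y - \hat\beta_x \bar W \longrightarrow \mu_y - \lambda\beta_x\mu_x = \beta_0 + (1-\lambda)\beta_x\mu_x$$
• Estimate of Residual Variance:
$$\mathrm{MSE} \longrightarrow \sigma_\varepsilon^2 + (1-\lambda)\beta_x^2\sigma_x^2$$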
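As a rough check of the three limits (slope, intercept, residual variance), the sketch below compares the formulas with one large simulated sample. The parameter values, including a nonzero mean of X so that the intercept shift is visible, are hypothetical and chosen only for illustration; normal X, U and ε are assumed.

```python
import random

beta0, beta_x = 1.0, 1.0
mu_x, var_x, var_u, var_eps = 2.0, 1.0, 1.0, 1.0 / 9.0

# Limiting values from the formulas for the naive analysis.
lam = var_x / (var_x + var_u)
limit_slope = lam * beta_x
limit_intercept = beta0 + (1.0 - lam) * beta_x * mu_x
limit_mse = var_eps + (1.0 - lam) * beta_x ** 2 * var_x

# Large-sample check by simulation.
rng = random.Random(7)
n = 50000
x = [rng.gauss(mu_x, var_x ** 0.5) for _ in range(n)]
y = [beta0 + beta_x * xi + rng.gauss(0.0, var_eps ** 0.5) for xi in x]
w = [xi + rng.gauss(0.0, var_u ** 0.5) for xi in x]

mw, my = sum(w) / n, sum(y) / n
sww = sum((wi - mw) ** 2 for wi in w)
swy = sum((wi - mw) * (yi - my) for wi, yi in zip(w, y))
slope_hat = swy / sww
intercept_hat = my - slope_hat * mw
mse_hat = sum((yi - intercept_hat - slope_hat * wi) ** 2
              for wi, yi in zip(w, y)) / (n - 2)

print(f"slope:     limit {limit_slope:.3f}, estimate {slope_hat:.3f}")
print(f"intercept: limit {limit_intercept:.3f}, estimate {intercept_hat:.3f}")
print(f"MSE:       limit {limit_mse:.3f}, estimate {mse_hat:.3f}")
```

The naive fit is internally consistent (it estimates the regression of Y on W well); the point is that all three quantities differ from the parameters of the regression of Y on X.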
MORE THEORY: JOINT NORMALITY
• Y, X, W jointly normal ⇒
∗ Y | W ∼ Normal
∗ $E(Y \mid W) = \beta_0 + (1-\lambda)\beta_x\mu_x + \lambda\beta_x W$
∗ $\mathrm{Var}(Y \mid W) = \sigma_\varepsilon^2 + (1-\lambda)\beta_x^2\sigma_x^2$
• Intercept is shifted by $(1-\lambda)\beta_x\mu_x$
• Slope is attenuated by the factor λ
• Residual variance is inflated by $(1-\lambda)\beta_x^2\sigma_x^2$
MORE THEORY: IMPLICATIONS FOR TESTING HYPOTHESES
• Because
βx = 0 iff λβx = 0
it follows that
[H0 : βx = 0] ≡ [H0 : λβx = 0]
so the naive test of βx = 0 is valid (correct Type I error rate).
• The naive test of H0 : βx = 0 is asymptotically efficient when E(X | W) is linear in W.
• The discussion of naive tests when there are multiple predictors measured with error, or error-free predictors, is more complicated
[Figure: Sample size (vertical axis) versus measurement error variance (horizontal axis). Caption: Sample Size for 80% Power. True slope $\beta_x = 0.75$. Variances $\sigma_x^2 = \sigma_\varepsilon^2 = 1$.]
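MULTIPLE LINEAR REGRESSION WITH ERROR
• Model
$$Y = \beta_0 + \beta_z^t Z + \beta_x^t X + \varepsilon, \qquad W = X + U$$
• Regressing Y on Z and W estimates
$$\begin{pmatrix}\beta_{z*}\\ \beta_{x*}\end{pmatrix} = \Lambda \begin{pmatrix}\beta_z\\ \beta_x\end{pmatrix} \quad \left[\neq \begin{pmatrix}\beta_z\\ \beta_x\end{pmatrix}\right]$$
• Λ is the attenuation matrix or reliability matrix
$$\Lambda = \begin{pmatrix}\sigma_{zz} & \sigma_{zx}\\ \sigma_{xz} & \sigma_{xx}+\sigma_{uu}\end{pmatrix}^{-1}\begin{pmatrix}\sigma_{zz} & \sigma_{zx}\\ \sigma_{xz} & \sigma_{xx}\end{pmatrix}$$
∗ Biases in components of $\beta_{x*}$ and $\beta_{z*}$ can be multiplicative or additive
∗ Naive test of $H_0: \beta_x = 0, \beta_z = 0$ is valid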
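From previous page:
• $$\begin{pmatrix}\beta_{z*}\\ \beta_{x*}\end{pmatrix} = \Lambda \begin{pmatrix}\beta_z\\ \beta_x\end{pmatrix} \quad \left[\neq \begin{pmatrix}\beta_z\\ \beta_x\end{pmatrix}\right]$$
• $$\Lambda = \begin{pmatrix}\sigma_{zz} & \sigma_{zx}\\ \sigma_{xz} & \sigma_{xx}+\sigma_{uu}\end{pmatrix}^{-1}\begin{pmatrix}\sigma_{zz} & \sigma_{zx}\\ \sigma_{xz} & \sigma_{xx}\end{pmatrix}$$
∗ Naive test of $H_0: \beta_x = 0$ is valid
∗ Naive test of $H_0: \beta_{x,1} = 0$ is typically not valid ($\beta_{x,1}$ denotes a subvector of $\beta_x$)
∗ Naive test of $H_0: \beta_z = 0$ is typically not valid (same is true for subvectors)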
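MULTIPLE LINEAR REGRESSION WITH ERROR
• For X scalar, attenuation factor in $\beta_{x*}$ is
$$\lambda_1 = \frac{\sigma_{x|z}^2}{\sigma_{x|z}^2 + \sigma_u^2}$$
∗ $\sigma_{x|z}^2$ = residual variance in regression of X on Z
∗ $\sigma_{x|z}^2 \le \sigma_x^2$ ⇒
$$\lambda_1 = \frac{\sigma_{x|z}^2}{\sigma_{x|z}^2 + \sigma_u^2} \le \frac{\sigma_x^2}{\sigma_x^2 + \sigma_u^2} = \lambda$$
∗ ⇒ Collinearity accentuates attenuation
• Biased estimates of $\beta_z$:
$$\beta_{z*} = \beta_z + (1-\lambda_1)\beta_x\Gamma_z$$
∗ $\Gamma_z$ is from $E(X \mid Z) = \Gamma_1 + \Gamma_z^t Z$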
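For scalar Z and X, the attenuation matrix and the scalar formulas can be compared directly. The sketch below uses hypothetical covariance values and small hand-written 2x2 matrix helpers (`inv2`, `matmul2`, both introduced here for illustration); it only verifies the algebra and involves no data.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def matmul2(p, q):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

# Hypothetical covariances for scalar Z and X, and error variance for U.
s_zz, s_zx, s_xx, s_uu = 1.0, 0.5, 1.0, 0.5
beta_z, beta_x = 0.0, 1.0

cov_zw = ((s_zz, s_zx), (s_zx, s_xx + s_uu))  # covariance matrix of (Z, W)
cov_zx = ((s_zz, s_zx), (s_zx, s_xx))         # covariance matrix of (Z, X)
attenuation = matmul2(inv2(cov_zw), cov_zx)   # the matrix Lambda

# What the naive regression of Y on (Z, W) estimates.
beta_z_star = attenuation[0][0] * beta_z + attenuation[0][1] * beta_x
beta_x_star = attenuation[1][0] * beta_z + attenuation[1][1] * beta_x

# Scalar formulas.
var_x_given_z = s_xx - s_zx ** 2 / s_zz        # residual variance of X given Z
lam1 = var_x_given_z / (var_x_given_z + s_uu)
gamma_z = s_zx / s_zz                          # slope of E(X | Z) in Z
lam = s_xx / (s_xx + s_uu)

print(f"lambda_1 = {lam1:.3f} <= lambda = {lam:.3f}")
print(f"beta_x* = {beta_x_star:.3f}, beta_z* = {beta_z_star:.3f}")
```

With these values the coefficient of Z is pushed away from its true value of 0 even though Z itself is measured without error.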
ANALYSIS OF COVARIANCE
• These results have implications for the two group ANCOVA.
∗ X = true covariate
∗ Z = dummy indicator of group 1, say
• We are interested in estimating $\beta_z$, the group effect. Biased estimates of $\beta_z$:
$$\beta_{z*} = \beta_z + (1-\lambda_1)\beta_x\Gamma_z$$
∗ $\Gamma_z$ is from $E(X \mid Z) = \Gamma_1 + \Gamma_z^t Z$
∗ $\Gamma_z$ is the difference in the mean of X among the two groups.
∗ Thus, biased unless X and Z are unrelated.
– Use a randomized study!
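The spurious group effect can be illustrated with a small simulation. This is a sketch under stated assumptions, not an analysis from the course: the group means of X differ by a hypothetical amount of 1, there is no true group effect, errors are normal, and the two-predictor fit is obtained from the 2x2 normal equations via the illustrative helper `two_predictor_slopes`.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / (n - 1)

def two_predictor_slopes(z, w, y):
    """Least-squares slopes of y on (z, w), from the 2x2 normal equations."""
    szz, szw, sww = cov(z, z), cov(z, w), cov(w, w)
    szy, swy = cov(z, y), cov(w, y)
    det = szz * sww - szw ** 2
    return (sww * szy - szw * swy) / det, (szz * swy - szw * szy) / det

rng = random.Random(11)
n = 20000
beta_z, beta_x = 0.0, 1.0   # no true group effect
gamma_z = 1.0               # groups differ in the mean of X (not randomized)

z = [1.0 if rng.random() < 0.5 else 0.0 for _ in range(n)]
x = [gamma_z * zi + rng.gauss(0.0, 1.0) for zi in z]   # Var(X | Z) = 1
w = [xi + rng.gauss(0.0, 1.0) for xi in x]             # Var(U) = 1
y = [beta_z * zi + beta_x * xi + rng.gauss(0.0, 0.5) for zi, xi in zip(z, x)]

true_fit = two_predictor_slopes(z, x, y)    # uses the unobservable X
naive_fit = two_predictor_slopes(z, w, y)   # uses the error-prone W

lam1 = 1.0 / (1.0 + 1.0)                    # residual Var(X|Z) over itself plus Var(U)
predicted_bias = (1.0 - lam1) * beta_x * gamma_z

print(f"group effect using X: {true_fit[0]:.3f}")
print(f"group effect using W: {naive_fit[0]:.3f} (formula: {predicted_bias:.3f})")
```

Setting `gamma_z = 0.0`, which mimics randomization, should bring the naive group effect back to about zero.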
[Figure: Response versus predictors. Caption: Unbalanced ANCOVA. Red = True Data, Blue = Observed. Solid = First Group, Open = Second Group. No Difference In Groups.]
WHAT IF WE WILL PREDICT FUTURE OUTCOMES BASED ON THE OBSERVED COVARIATE?
• If we wish to predict Y based on W, then the regression of Y on W is the correct model to use
∗ Though this assumes that the distribution of X and the measurement error distribution will be the same in the future as in the study
– since $\lambda = \sigma_x^2/(\sigma_x^2 + \sigma_u^2)$
• In public health, interventions will change the true X, not W, so we are interested in the regression of Y on X
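SUMMARY OF THE EFFECTS OF MEASUREMENT ERROR IN SIMPLE LINEAR REGRESSION
• Regression Model
$$Y = \beta_0 + \beta_x X + \varepsilon, \qquad W = X + U$$
• Attenuation Factor (Reliability Ratio)
$$\lambda = \frac{\sigma_x^2}{\sigma_x^2 + \sigma_u^2}$$
∗ $0 < \lambda \le 1$
∗ $\lambda = 1 \iff \sigma_u^2 = 0$

| Regression | Intercept | Slope | Residual Variance |
|---|---|---|---|
| Y on X | $\beta_0$ | $\beta_x$ | $\sigma_\varepsilon^2$ |
| Y on W | $\beta_0 + (1-\lambda)\beta_x\mu_x$ | $\lambda\beta_x$ | $\sigma_\varepsilon^2 + (1-\lambda)\beta_x^2\sigma_x^2$ |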
END OF LECTURE 1
LECTURE 2: DATA TYPES, NONDIFFERENTIAL ERROR, ESTIMATING ATTENUATION, EXACT PREDICTORS, BERKSON MODEL
OUTLINE
• Nondifferential measurement error
• Estimating the attenuation
• Replication and validation data
• Internal and external subsets
• Transportability across data sets
• Is there an “exact” predictor?
• Berkson and classical measurement error
THE BASIC DATA
• A response Y
• Predictors X measured with error (unobserved).
• Predictors Z measured without error.
• A major proxy W for X.
• Sometimes, a second proxy T for X.
NONDIFFERENTIAL ERROR
• Roughly speaking, the error is said to be nondifferential if W and T would not be measured if one could have measured X.
• More formally, (W, T) are conditionally independent of Y given (X, Z).
⇒ (W, T) would provide no additional information about Y if X were observed
• This often makes sense, but it may be fairly subtle in each application.
NONDIFFERENTIAL ERROR
• Many crucial theoretical calculations revolve around nondifferential error.
• Consider simple linear regression: Y = β0 + βx X + ε, ε independent of X.
$$E(Y \mid W) = \overbrace{E[\{E(Y \mid X, W)\} \mid W] = E[\{E(Y \mid X)\} \mid W]}^{(1)} = \beta_0 + \beta_x E(X \mid W).$$
∗ This reduces the problem to estimating E(X|W). For example,
$$E(X \mid W) = \lambda W + (1-\lambda)\mu_x \quad \text{(under joint normality and classical error)}$$
where $\lambda = \sigma_x^2/\sigma_w^2$.
• If the error is differential, then (1) fails, and no simplification is possible.
HEART DISEASE VS SYSTOLIC BLOOD PRESSURE
• Y = indicator of Coronary Heart Disease (CHD)
• X = true long-term average systolic blood pressure (SBP) (maybe transformed)
• Assume P(Y = 1) = H(β0 + βx X)
• Data are CHD indicators and determinations of systolic blood pressure for n = 1600 in the Framingham Heart Study
• X measured with error:
∗ SBP measured at two exams (and averaged) ⇒ sampling error
∗ The determination of SBP is subject to machine and reader variability
• It is hard to believe that the short term average of two days carries any additional information about the subject’s chance of CHD over and above true SBP.
• Hence, nondifferential
IS THIS NONDIFFERENTIAL?
• From Tosteson et al. (1989).
• Y = I{wheeze}.
• X is personal exposure to NO2.
• W = (NO2 in kitchen, NO2 in bedroom) is observed in the primary study.
IS THIS NONDIFFERENTIAL?
• From Küchenhoff & Carroll
• Y = I{lung irritation}.
• X is actual personal long-term dust exposure
• W is dust exposure as measured by occupational epidemiology techniques.
∗ They sampled the plant for dust.
∗ Then they tried to match the person to where he/she worked.
WHAT IS NECESSARY TO DO AN ANALYSIS?
• In linear regression with classical additive error W = X + U, one needs:
∗ Nondifferential error
∗ An estimate of the error variance var(U), since $\lambda = (\sigma_w^2 - \sigma_u^2)/\sigma_w^2$.
• How do we get the latter information?
• The best way is to get a subsample of the study in which X is observed. This is called validation.
∗ In many applications not possible.
• Another method is to do replications of the process, often called calibration.
• A third way is to get the value from another similar study.
REPLICATION
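• In a replication study, for some of the study participants you measure more than one W.
• The standard model is
$$W_{ij} = X_i + U_{ij}, \quad j = 1, \ldots, m_i.$$
• This is a one-factor ANOVA with mean squared error $\sigma_u^2$ estimated by
$$\hat\sigma_u^2 = \frac{\sum_{i=1}^n \sum_{j=1}^{m_i} (W_{ij} - \bar W_{i\bullet})^2}{\sum_{i=1}^n (m_i - 1)}.$$
• As the proxy for $X_i$ one would use $\bar W_{i\bullet}$, where
$$\bar W_{i\bullet} = X_i + \bar U_{i\bullet}, \qquad \mathrm{var}(\bar U_{i\bullet}) = \sigma_u^2/m_i.$$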
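The pooled within-subject estimator can be written as a short function. This is a minimal sketch: the function name `pooled_error_variance` and the tiny data set are hypothetical, and subjects are allowed to have different numbers of replicates.

```python
def pooled_error_variance(reps):
    """Pooled within-subject variance: squared deviations from each subject's
    mean, summed over subjects, divided by the sum of (m_i - 1)."""
    ss, df = 0.0, 0
    for r in reps:
        m = len(r)
        mean = sum(r) / m
        ss += sum((v - mean) ** 2 for v in r)
        df += m - 1
    if df == 0:
        raise ValueError("need at least one subject with two or more replicates")
    return ss / df

# Hypothetical replicate measurements W_ij, one inner list per subject.
reps = [[1.0, 3.0], [2.0, 2.0, 2.0], [0.0, 4.0]]
sigma_u2_hat = pooled_error_variance(reps)

# The proxy for X_i is the subject mean; its error variance is sigma_u^2 / m_i.
subject_means = [sum(r) / len(r) for r in reps]
mean_error_vars = [sigma_u2_hat / len(r) for r in reps]

print(sigma_u2_hat)      # within sums of squares 2 + 0 + 8 over 4 degrees of freedom
print(mean_error_vars)
```

Subjects with a single measurement contribute nothing to the numerator or the denominator, so they can be left in the input.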
REPLICATION
• Replication allows you to test whether your model is additive with constant error variance.
• If $W_{ij} = X_i + U_{ij}$ with $U_{ij}$ symmetrically distributed about zero and independent of $X_i$, we have a major fact:
∗ The sample mean and sample standard deviation are uncorrelated.
– Eckert, Carroll, and Wang (1997) use this fact to find a transformation to additivity
• Also, if $U_{ij}$ are normally distributed, then so too are differences $W_{i1} - W_{i2} = U_{i1} - U_{i2}$.
• Graphical diagnostics can be implemented easily in any package.
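The mean-versus-standard-deviation diagnostic can be sketched numerically instead of graphically. The example below uses simulated data with multiplicative error (so the error is additive only on the log scale); the distributional settings are hypothetical and are not taken from any study in these lectures.

```python
import math
import random

def corr(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)

def mean_sd(r):
    """Sample mean and sample standard deviation of one subject's replicates."""
    m = len(r)
    mean = sum(r) / m
    return mean, math.sqrt(sum((v - mean) ** 2 for v in r) / (m - 1))

rng = random.Random(5)
n_subjects, m = 2000, 6

# Simulated multiplicative error: W_ij = X_i * exp(U_ij).
reps = []
for _ in range(n_subjects):
    log_x = rng.gauss(7.5, 0.4)
    reps.append([math.exp(log_x + rng.gauss(0.0, 0.3)) for _ in range(m)])

raw = [mean_sd(r) for r in reps]
logged = [mean_sd([math.log(v) for v in r]) for r in reps]

corr_raw = corr([a for a, _ in raw], [s for _, s in raw])
corr_log = corr([a for a, _ in logged], [s for _, s in logged])

print(f"corr(mean, sd), raw scale: {corr_raw:.2f}")
print(f"corr(mean, sd), log scale: {corr_log:.2f}")
```

A clearly positive correlation on the raw scale and a near-zero one after the transformation is the pattern that suggests the additive model holds on the transformed scale.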
REPLICATION: WISH
• The WISH study measured caloric intake using a 24–hour recall.
∗ There were 6 replicates per woman in the study.
• A plot of the caloric intake data showed that W was nowhere close to being normally distributed in the population.
∗ If additive, then either X or U is not normal.
• When plotting standard deviation versus the mean, one often uses the rule that the method “passes” the test if the max-to-min is less than 2.0.
∗ A little bit of non-constant variance does not seem to hurt. See Carroll & Ruppert (1988)
∗ Caveat: Carroll & Ruppert studied effects of heteroscedasticity on efficiency, not on bias due to measurement error.
[Figure: WISH Calories, FFQ, Normal Q-Q Plot. Caption: WISH, Caloric Intake, Q-Q plot of W.]
[Figure: WISH Calories, 24-hour recalls, Normal Q-Q Plot of Pairwise Differences. Caption: WISH, Caloric Intake, Q-Q plot of Differenced Ws.]
[Figure: WISH Calories, 24-hour recalls, s.d. versus mean calories. Caption: WISH, Caloric Intake, plot for additivity, loess and OLS.]
REPLICATION: WISH
• Taking the logarithm of W improves all the plots.
[Figure: WISH Log-Calories, FFQ, Normal Q-Q Plot. Caption: WISH, Log Caloric Intake, Q-Q Plot of Observed W.]
[Figure: WISH, log caloric intake, 24-hour recalls: normal Q–Q plot of pairwise differences of the Ws.]
[Figure: WISH, log caloric intake, 24-hour recalls: s.d. versus mean log-calories. Plot for additivity, with loess and OLS fits.]
• Perhaps a bit over-transformed — square-root might be better.
TRANSPORTABILITY OF ESTIMATES
• In linear regression, we need only the measurement error variance (after
checking for semi-constant variance, additivity, normality).
• In general, though, more is needed. Recall that if we observe W instead
of X, then the observed data have a regression of Y on W that effectively acts as
if
E(Y|W) = β0 + βx E(X|W)
≈ β0 + βx{λW + (1 − λ)µx}.
• As we will see later, in general problems we can do a likelihood analysis if we
know the distribution of X given W.
TRANSPORTABILITY OF ESTIMATES
• It is tempting to try to use outside data and transport an estimate of λ to your problem.
∗ Bad idea!!!
$$\lambda = \frac{\sigma_x^2}{\sigma_x^2 + \sigma_u^2}$$
∗ Note how this depends on the distribution of X.
∗ It is rarely the case that two populations have the same X distribution, even when the same instrument is used.
– The instrument affects only $\sigma_u^2$, not $\sigma_x^2$.
• Maybe one should transport the estimate of $\sigma_u^2$ and use
$$\lambda = \frac{\sigma_w^2 - \sigma_{u,\mathrm{transported}}^2}{\sigma_w^2}$$
EXTERNAL DATA AND TRANSPORTABILITY
• We say that a model is transportable across studies if the model holds with the
same parameters in the two studies.
∗ Internal data are ideal since there is no question about transportability.
• With external data, transportability back to the primary study cannot be
taken for granted.
∗ Sometimes transportability clearly will not hold. Then the value of the external
data is, at best, questionable.
∗ Even if transportability seems to be a reasonable assumption, it is still just that,
an assumption.
EXTERNAL DATA AND TRANSPORTABILITY
• As an illustration, consider two nutrition data sets which use exactly the same FFQ:
• Nurses' Health Study (NHS)
∗ Nurses in the Boston area
• American Cancer Society (ACS)
∗ National sample
• Since the same instrument is used, error properties should be about the same.
∗ But maybe not the entire distribution!!
∗ var(differences, NHS) = 47
∗ var(differences, ACS) = 45
∗ var(sum, NHS) = 152
∗ var(sum, ACS) = 296
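A rough calculation shows why these four numbers argue against transporting λ. The sketch below assumes, which the slide does not state, that "differences" and "sum" refer to two replicate FFQ measurements W1 and W2 per person under the classical additive model W_j = X + U_j with independent errors, so that var(W1 − W2) = 2σ_u² and var(W1 + W2) = 4σ_x² + 2σ_u². The function names are illustrative only.

```python
def variance_components(var_diff, var_sum):
    """Moment estimates under the ASSUMED model W_j = X + U_j, j = 1, 2.

    var(W1 - W2) = 2*sigma_u^2 and var(W1 + W2) = 4*sigma_x^2 + 2*sigma_u^2.
    """
    sigma2_u = var_diff / 2.0
    sigma2_x = (var_sum - var_diff) / 4.0
    return sigma2_x, sigma2_u


def attenuation(sigma2_x, sigma2_u):
    """lambda = sigma_x^2 / (sigma_x^2 + sigma_u^2) for a single measurement."""
    return sigma2_x / (sigma2_x + sigma2_u)


# Variances quoted on the slide: (var of differences, var of sum).
studies = {"NHS": (47.0, 152.0), "ACS": (45.0, 296.0)}

results = {}
for name, (var_diff, var_sum) in studies.items():
    s2x, s2u = variance_components(var_diff, var_sum)
    lam = attenuation(s2x, s2u)
    results[name] = (s2x, s2u, lam)
    print(f"{name}: sigma2_x={s2x:.2f}  sigma2_u={s2u:.2f}  lambda={lam:.3f}")
```

Under these assumptions the two error variances nearly agree (23.5 and 22.5), while the variance of X is more than twice as large in ACS, so the implied single-measurement λ is about 0.53 in NHS and about 0.74 in ACS.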
[Figure: FFQ histograms in NHS and ACS (plotted variables pccal.nhs and pccal.acs).]
IS THERE AN “EXACT” PREDICTOR?
• One can distinguish between two concepts of X:
∗ the actual exact predictor, which may never be observable under any circumstances.
∗ the operationally-defined exact predictor, which could be observed, albeit at
great cost and effort.
• Example: X is long-term caloric intake.
∗ actual X is long-term intake over lifetime.
∗ operationally-defined X is long-term intake since inception of the study, as
measured by the best available instrument.
IS THERE AN “EXACT” PREDICTOR? —CONTINUED
• One could probably construct hypothetical examples where the effects of actual
X and of operationally-defined X are quite different, even opposite in sign.
• Obviously, a policy to change X, say to reduce intake of saturated fat or of calories,
affects the actual X, not the operationally-defined X.
• Nonetheless, the operationally-defined X is all we can work with.
• “Gold standard” generally means the operationally-defined X.
∗ We will take “exact predictor” and “gold standard” to both refer to
operationally-defined X.
∗ However, for some authors, “gold standard” means the actual exact X even if
not operationally-defined.
THE BERKSON MODEL
• The classical Berkson model says that
True Exposure = Observed Exposure + Mean Zero Error
X = W + Ub (or X = W × Ub).
• It is assumed that W and Ub are independent and E(Ub) = 0 (additive error) or E(Ub) = 1
(multiplicative error), so that
E(X|W) = W.
• Compare with the classical measurement error model, where
W = X + U
and E(X|W) = λW + (1 − λ)µx.
THE BERKSON MODEL, CONT.
• From the previous page, E(X|W) = W.
• In the linear regression model,
∗ Ignoring error still leads to unbiased intercept and slope estimates,
∗ but the error about the line is increased.
– To see this, note that
Y = β0 + β1(W + Ub) + ε = β0 + β1W + (β1Ub + ε)
• In the logistic model with normally distributed measurement error, ignoring error
∗ Leads to a bias in slope (and intercept):
$$\text{Est. Slope} \approx \frac{\text{True Slope}}{\sqrt{1 + \beta_1^2 \sigma_{u,b}^2 / 2.9}}.$$
∗ For many problems, this attenuation is only minor.
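The contrast between Berkson and classical error in linear regression can be checked with a small simulation. This is a minimal sketch, assuming normal errors and arbitrary illustrative parameter values (β0 = 1, β1 = 2, all variances equal to 1); the helper names are hypothetical. The last function simply evaluates the slide's logistic attenuation formula.

```python
import math
import random


def ols_slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate_slopes(n=20000, beta0=1.0, beta1=2.0, seed=0):
    """Return the naive slope of Y on W under Berkson and classical error."""
    rng = random.Random(seed)
    # Berkson: W is assigned, and the true X scatters around it.
    w_b = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x_b = [w + rng.gauss(0.0, 1.0) for w in w_b]
    y_b = [beta0 + beta1 * x + rng.gauss(0.0, 1.0) for x in x_b]
    # Classical: the measured W scatters around the true X.
    x_c = [rng.gauss(0.0, 1.0) for _ in range(n)]
    w_c = [x + rng.gauss(0.0, 1.0) for x in x_c]
    y_c = [beta0 + beta1 * x + rng.gauss(0.0, 1.0) for x in x_c]
    return ols_slope(w_b, y_b), ols_slope(w_c, y_c)


def berkson_logistic_factor(beta1, sigma2_ub):
    """Approximate ratio Est. Slope / True Slope from the slide's formula."""
    return 1.0 / math.sqrt(1.0 + beta1 ** 2 * sigma2_ub / 2.9)


berkson_slope, classical_slope = simulate_slopes()
print(f"Berkson:   slope of Y on W = {berkson_slope:.2f}")
print(f"Classical: slope of Y on W = {classical_slope:.2f}")
print(f"Logistic Berkson factor (beta1=1, var=0.5): "
      f"{berkson_logistic_factor(1.0, 0.5):.3f}")
```

With these settings the Berkson slope should land near the true value 2, while the classical slope should land near λβ1 = 1, since λ = 1/2 here.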
END OF LECTURE 2
LECTURE 3: BERKSON/CLASSICAL,
STRUCTURAL/FUNCTIONAL,
REGRESSION CALIBRATION
OUTLINE
• What is Berkson? What is classical?
• Functional versus structural modeling
∗ Classical and flexible structural modeling
• Regression calibration
• Multiplicative error
WHAT’S BERKSON? WHAT’S CLASSICAL?
• Berkson: X = W + Ub, and W and Ub are independent
• Classical: W = X + Uc, and X and Uc are independent
• In practice, it may be hard to distinguish between the classical and the Berkson error models.
∗ In some instances, neither holds exactly.
∗ In some complex situations, errors may have both Berkson and classical components, e.g., when the observed predictor is a combination of 2 or more error-prone predictors.
• Berkson model: a nominal value is assigned.
∗ Direct measures cannot be taken, nor can replicates.
• Classical error structure: direct individual measurements are taken, and can be replicated but with variability.
WHAT’S BERKSON? WHAT’S CLASSICAL?
• The reason why people have trouble distinguishing the two is that in real life, it
takes hard thought to do so.
LET’S PLAY STUMP THE EXPERTS!
• Framingham Heart Study
∗ Predictor is systolic blood pressure
• All workers with the same job classification and age are assigned the same
exposure based on job exposure studies.
• Using a phantom, all persons of a given height and weight with a given recorded
dose are assigned the same radiation exposure.
• The radon gas concentration in houses that have been torn down is estimated by
the average concentration in a sample of nearby houses.
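BERKSON ERROR IN NONLINEAR MODELS
• Suppose that Y = f(X, β) + ε
• Then, expanding f to second order around W,
$$Y \approx f(W,\beta) + \frac{1}{2} f''(W,\beta)\,\sigma_b^2 + \left\{ f'(W,\beta)\,U_b + \frac{1}{2} f''(W,\beta)\,(U_b^2 - \sigma_b^2) + \varepsilon \right\}$$
∗ Here
$$f'(W,\beta) = \frac{\partial}{\partial W} f(W,\beta)$$
and so forth
∗ The nature of the bias, the term $\frac{1}{2} f''(W,\beta)\,\sigma_b^2$, depends on the sign of $f''(W,\beta)$
∗ In the linear case $f''(W,\beta) = 0$ and we get the previous result: no bias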
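A concrete case makes the bias term tangible. The sketch below uses the illustrative choice f(X, β) = exp(βX), which is not from the slides, and assumes Ub is normal with variance σ_b² so that the exact conditional mean is available from the normal moment generating function. It compares the naive mean f(W), the second-order approximation f(W) + (1/2) f''(W) σ_b², and the exact E(Y|W).

```python
import math


def naive_mean(w, beta):
    """f(W, beta) = exp(beta * W): what you get by ignoring Berkson error."""
    return math.exp(beta * w)


def second_order_mean(w, beta, sigma2_b):
    """f(W) + (1/2) f''(W) sigma_b^2, with f''(W) = beta^2 * exp(beta * W)."""
    return math.exp(beta * w) * (1.0 + 0.5 * beta ** 2 * sigma2_b)


def exact_mean(w, beta, sigma2_b):
    """E[exp(beta * (W + Ub)) | W] for normal Ub (normal MGF)."""
    return math.exp(beta * w + 0.5 * beta ** 2 * sigma2_b)


w, beta, sigma2_b = 1.0, 0.5, 0.25  # hypothetical values
naive = naive_mean(w, beta)
approx = second_order_mean(w, beta, sigma2_b)
exact = exact_mean(w, beta, sigma2_b)
print(f"naive={naive:.4f}  second-order={approx:.4f}  exact={exact:.4f}")
```

Because f'' > 0 for the exponential, the naive mean understates E(Y|W); the second-order term recovers most of the gap when β²σ_b² is small.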
MIXTURES OF BERKSON AND CLASSICAL ERROR
• Let L be long-term average radon concentration for all houses in a neighborhood
• X is long-term average radon concentration in a specific house in this neighborhood
• W is average measured radon concentration for a sample of houses in this neighborhood
• Model of Tosteson and Tsiatis (1988)
X = L + Ub
W = L + Uc
L, Ub, Uc are independent
∗ Used by Reeves, Cox, Darby, and Whitley (1998) and Mallick, Hoffman, and Carroll (2002)
MIXTURES OF BERKSON AND CLASSICAL ERROR
• From previous page: Model of Tosteson and Tsiatis (1988)
X = L + Ub
W = L + Uc
• Uc = 0 ⇒ X = W + Ub ⇒ Berkson
• Ub = 0 ⇒ W = X + Uc ⇒ Classical
• More generally,
W = X + Uc − Ub
and Uc is independent of X while Ub is independent of W
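The mixture model is easy to simulate, which helps check the independence statements above. This is a minimal sketch assuming normal L, Ub and Uc with hypothetical standard deviations; `cov` and `corr` are small helpers written for the example. Under the model, cov(X, W) = var(L), so the slope of X on W is var(L) / {var(L) + var(Uc)}.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (n - 1)


def corr(a, b):
    """Sample correlation."""
    return cov(a, b) / (cov(a, a) * cov(b, b)) ** 0.5


rng = random.Random(1)
n = 20000
sd_l, sd_b, sd_c = 1.0, 0.7, 1.0  # hypothetical standard deviations

L = [rng.gauss(0.0, sd_l) for _ in range(n)]
Ub = [rng.gauss(0.0, sd_b) for _ in range(n)]
Uc = [rng.gauss(0.0, sd_c) for _ in range(n)]
X = [l + ub for l, ub in zip(L, Ub)]  # true value: Berkson part
W = [l + uc for l, uc in zip(L, Uc)]  # measurement: classical part

corr_uc_x = corr(Uc, X)               # should be near 0
corr_ub_w = corr(Ub, W)               # should be near 0
slope_x_on_w = cov(X, W) / cov(W, W)  # near sd_l^2 / (sd_l^2 + sd_c^2)
max_identity_gap = max(abs((w - x) - (uc - ub))
                       for w, x, uc, ub in zip(W, X, Uc, Ub))

print(f"corr(Uc, X)={corr_uc_x:.3f}  corr(Ub, W)={corr_ub_w:.3f}  "
      f"slope of X on W={slope_x_on_w:.3f}")
```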
FUNCTIONAL AND STRUCTURAL MODELING CLASSICAL ERROR
MODELS
• The common linear regression texts make the distinction:
∗ Functional: X’s are fixed constants
∗ Structural: X’s are random variables
• If you pretend that the X’s are fixed constants, it seems plausible to try to estimate them as well as all the other model parameters.
• This is the functional maximum likelihood estimator.
∗ Every textbook has the linear regression functional maximum likelihood estimator.
• Unfortunately, the functional MLE in nonlinear problems has two defects.
∗ It’s really nasty to compute.
∗ It’s a lousy estimator (badly inconsistent).
• Structural ⇒ empirical Bayes type analysis
FUNCTIONAL AND STRUCTURAL MODELING CLASSICAL ERROR
MODELS
• More useful distinction:
∗ Functional modeling: No assumptions made about the X’s (could be random
or fixed)
∗ Classical structural modeling: Strong parametric assumptions made about
the distribution of X. Generally normal, lognormal or gamma.
∗ Flexible structural modeling: Structural, but flexible parametric family. Tries
to get the best of both worlds.
• Flexible is similar in spirit to quasi-structural modeling (Pierce et al., 1992)
∗ Assume that the distribution of X is the (unknown) empirical distribution of
the X values in the sample
CHOOSING BETWEEN FUNCTIONAL OR STRUCTURAL MODELS
• Key questions:
∗ How sensitive are inferences to an assumed STRUCTURAL model?
∗ How much does it “cost” to be functional?
• FUNCTIONAL: No need to perform extensive sensitivity analyses.
• Many functional methods are simple to implement (and some are computed using
little more than standard software).
• Functionality focuses emphasis on the error model.
• Because of “latent-model” robustness, a functional analysis serves as a useful
check on a parametric structural model.
CHOOSING BETWEEN FUNCTIONAL OR STRUCTURAL MODELS
• Structural models can be viewed as an empirical Bayes method of dealing with
a large number of nuisance parameters (the true covariate values) (Whittemore,
1989)
• Best known functional methods can be inefficient (missing data, thresholds,
nonparametric regression).
CHOOSING BETWEEN FUNCTIONAL OR STRUCTURAL MODELS
• One should consider the distribution of the true X’s
∗ Example: A-bomb survivor study (Pierce et al., 1992):
– All subjects survived the acute effects
– This provides some information
– A high radiation measurement is likely to be due to a high measurement
error
∗ This information can, perhaps, be captured by structural modeling
– The distribution of true doses is different among survivors than among all
exposed to the attacks
CHOOSING BETWEEN FUNCTIONAL OR STRUCTURAL MODELS
• My opinion is that one should use structural models
∗ But care is needed when modeling the distribution of X
SOME FUNCTIONAL METHODS
• Regression Calibration/Substitution
∗ Replaces true exposure X by an estimate of it based only on covariates but not
on the response.
∗ In the linear model with additive errors, this is the classical correction for attenuation.
∗ In the Berkson model, one simply ignores the measurement error.
• SIMEX is a fairly general functional method.
∗ It assumes only that you have an error model and that you can “add on”
measurement error to make the problem worse.
REGRESSION CALIBRATION: BASIC IDEAS
• Key idea: replace the unknown X by E(X|Z,W), which depends only on the
known (Z,W).
∗ This provides an approximate model for Y in terms of (Z,W).
∗ Called the “conditional expectation approach” by Lyles and Kupper (1997)
• Generally applicable.
∗ Depends on the measurement error being “not too large” in order for the
approximation to be sufficiently accurate.
∗ For some models, if var(X|Z,W) is constant then the only bias due to the
regression calibration approximation is in the intercept parameter.
REGRESSION CALIBRATION: BASIC IDEAS
Why does regression calibration work?
• X = E(X|Z,W) + U, where U is uncorrelated with any function of (Z,W)
• If
$$Y = X^T\beta_x + Z^T\beta_z + \varepsilon$$
then
$$Y = E(X|Z,W)^T\beta_x + Z^T\beta_z + (U^T\beta_x + \varepsilon)$$
where $(U^T\beta_x + \varepsilon)$ is uncorrelated with the regressors Z and E(X|Z,W)
• Therefore, regression of Y on E(X|Z,W) and Z gives unbiased estimates
– E(X|Z,W) could be replaced by the linear regression (best linear predictor)
of X using (Z,W)
THE REGRESSION CALIBRATION ALGORITHM
• The general algorithm is:
∗ Using replication, validation, or instrumental data, develop a model
E(X|Z,W) = m(Z,W,γcm) and estimate γcm.
∗ Replace X by m(Z,W,γcm), evaluated at the estimated γcm, and run your favorite analysis.
∗ Obtain standard errors by the bootstrap or the “sandwich method.”
– Can automatically correct for uncertainty about γcm
• In linear regression, regression calibration is equivalent to the “correction for
attenuation.”
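The first two steps of the algorithm can be sketched for the simplest case: linear regression, no Z, and two replicate measurements per subject under classical additive error. Everything here is an illustrative assumption (normal data, β1 = 2, unit variances, hypothetical helper names), and the standard-error step (bootstrap or sandwich) is left out.

```python
import random


def sample_var(v):
    """Unbiased sample variance."""
    n = len(v)
    m = sum(v) / n
    return sum((a - m) ** 2 for a in v) / (n - 1)


def ols_slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sum((a - mx) ** 2 for a in x)


def regression_calibration_slopes(w1, w2, y):
    """Return (naive slope, calibrated slope, estimated lambda)."""
    wbar = [(a + b) / 2.0 for a, b in zip(w1, w2)]
    # Step 1: estimate the calibration model from the replicates.
    sigma2_u = sample_var([a - b for a, b in zip(w1, w2)]) / 2.0  # var(W1-W2) = 2 sigma_u^2
    mu_x = sum(wbar) / len(wbar)
    sigma2_x = sample_var(wbar) - sigma2_u / 2.0  # var(Wbar) = sigma_x^2 + sigma_u^2 / 2
    lam = sigma2_x / (sigma2_x + sigma2_u / 2.0)
    # Step 2: replace X by the estimated E(X | Wbar) and run the usual analysis.
    x_hat = [mu_x + lam * (w - mu_x) for w in wbar]
    return ols_slope(wbar, y), ols_slope(x_hat, y), lam


rng = random.Random(2)
n, beta0, beta1 = 20000, 1.0, 2.0
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
w1 = [xi + rng.gauss(0.0, 1.0) for xi in x]
w2 = [xi + rng.gauss(0.0, 1.0) for xi in x]
y = [beta0 + beta1 * xi + rng.gauss(0.0, 1.0) for xi in x]

naive_slope, rc_slope, lam_hat = regression_calibration_slopes(w1, w2, y)
print(f"naive={naive_slope:.2f}  calibrated={rc_slope:.2f}  lambda_hat={lam_hat:.2f}")
```

In this setup the naive slope on the replicate mean should be near λβ1 = 4/3, and the calibrated slope should be near β1 = 2.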
THE REGRESSION CALIBRATION ALGORITHM
• Easily adjusts for different amounts of replication
• If $\bar W_i$ is the average of $m_i$ replicates then
$$\lambda_i = \frac{\sigma_x^2}{\sigma_x^2 + \sigma_u^2/m_i}$$
$$E(X_i \mid \bar W_i) = \lambda_i \bar W_i + (1 - \lambda_i)\mu_x$$
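As a small illustration of how the number of replicates enters, the function below evaluates the two formulas above for a given subject. The function name and the numbers in the usage lines are hypothetical, and the population quantities are treated as known.

```python
def calibrate(wbar_i, m_i, mu_x, sigma2_x, sigma2_u):
    """E(X_i | Wbar_i) when Wbar_i averages m_i replicates."""
    lam_i = sigma2_x / (sigma2_x + sigma2_u / m_i)
    return lam_i * wbar_i + (1.0 - lam_i) * mu_x


# Same observed mean, different amounts of replication.
one_replicate = calibrate(2.0, 1, mu_x=0.0, sigma2_x=1.0, sigma2_u=1.0)
four_replicates = calibrate(2.0, 4, mu_x=0.0, sigma2_x=1.0, sigma2_u=1.0)
print(one_replicate, four_replicates)
```

A subject with more replicates has a larger λ_i, so the observed mean is shrunk less toward µx.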
LOGISTIC REGRESSION, NORMAL X
• Consider the logistic regression model
$$\Pr(Y = 1 \mid X) = \{1 + \exp(-\beta_0 - \beta_x X)\}^{-1} = H(\beta_0 + \beta_x X).$$
• Suppose that
∗ X and U are normally distributed.
∗ Then X given W is normal with mean $E(X|W) = \mu_x(1-\lambda) + \lambda W$ and variance $\lambda\sigma_u^2$.
• If the event is “rare”, then $H(\beta_0 + \beta_x X) \approx \exp(\beta_0 + \beta_x X)$.
• Then, by using moment generating functions, the observed data follow
$$\begin{aligned}
\Pr(Y = 1 \mid W) &= E[\{\Pr(Y = 1 \mid X, W)\} \mid W] \overset{\text{why?}}{=} E[\{\Pr(Y = 1 \mid X)\} \mid W] \\
&\approx E\{\exp(\beta_0 + \beta_x X) \mid W\} = \exp\{\beta_0 + (1/2)\beta_x^2\lambda\sigma_u^2 + \beta_x E(X|W)\} \\
&\approx H\{\beta_0 + (1/2)\beta_x^2\lambda\sigma_u^2 + \beta_x E(X|W)\}.
\end{aligned}$$
• Typically, the regression calibration approximation works fine in this case.
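The moment-generating-function step can be spelled out. For a normal random variable V with mean m and variance v, E exp(tV) = exp(tm + t²v/2). Applying this with V equal to X given W, so that m = E(X|W), v = λσ_u² and t = βx, gives the exponential line in the display above:

```latex
E\{\exp(\beta_0 + \beta_x X) \mid W\}
  = \exp(\beta_0)\, E\{\exp(\beta_x X) \mid W\}
  = \exp\left\{\beta_0 + \beta_x E(X \mid W) + \tfrac{1}{2}\beta_x^2 \lambda\sigma_u^2\right\}.
```

The extra term (1/2)βx²λσ_u² does not depend on W, so under the rare-event approximation it changes only the intercept, while the coefficient of E(X|W) remains βx.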
LOGISTIC REGRESSION, NORMAL X
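• A different approximation uses the fact that H(x) ≈ Φ(kx) with k = 0.588 ≈ 1/1.70
[Figure: three panels. Top: the normal curve Φ(kx) and the logistic curve H(x) for x from −10 to 10. Middle: the error for x from −10 to 10 (vertical axis from −0.01 to 0.01). Bottom: the relative error for x from −3 to 3 (vertical axis from −0.2 to 0.2).]
error = {Φ(kx) − H(x)} and relative error = error / H(x).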
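The size of the approximation error can be checked numerically. The sketch below evaluates the error and relative error, as defined above, on a grid; the grid spacing and the helper names `H` and `Phi` are choices made for this example.

```python
import math

K = 0.588  # the constant k from the slide


def H(x):
    """Logistic distribution function."""
    return 1.0 / (1.0 + math.exp(-x))


def Phi(x):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


grid = [i / 100.0 for i in range(-1000, 1001)]  # x from -10 to 10
max_abs_error = max(abs(Phi(K * x) - H(x)) for x in grid)
max_rel_error = max(abs(Phi(K * x) - H(x)) / H(x)
                    for x in grid if abs(x) <= 3.0)

print(f"max |error| on [-10, 10]       : {max_abs_error:.4f}")
print(f"max |relative error| on [-3, 3]: {max_rel_error:.3f}")
```

The absolute error stays within the ±0.01 band of the middle panel, and the relative error on [−3, 3] stays within the ±0.2 band of the bottom panel.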
LOGISTIC REGRESSION, NORMAL X
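$$\Pr(Y = 1 \mid X) \approx \Phi\{k(\beta_0 + \beta_x X)\} = \Pr\{U \le k(\beta_0 + \beta_x X)\},$$
where U here denotes a standard normal random variable independent of X and W.
Therefore, writing $X = E(X|W) + U_b$,
$$\begin{aligned}
\Pr(Y = 1 \mid W) &\approx E[\Pr\{U \le k(\beta_0 + \beta_x X)\} \mid W] \\
&= \Pr[U - \beta_x k U_b \le k\{\beta_0 + \beta_x E(X|W)\}] \\
&= \Phi\left[\frac{k\{\beta_0 + \beta_x E(X|W)\}}{(1 + \beta_x^2 k^2 \sigma_{X|W}^2)^{1/2}}\right] \\
&\approx H\left\{\frac{\beta_0 + \beta_x E(X|W)}{(1 + \beta_x^2 k^2 \sigma_{X|W}^2)^{1/2}}\right\}
\end{aligned}$$
Here
$$U_b \sim N(0, \sigma_{X|W}^2) = N(0, \lambda\sigma_u^2)$$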
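To get a feel for how good this probit-based approximation is, one can compare it with the exact value of E{H(β0 + βxX) | W} computed by numerical integration. This is a minimal sketch assuming X given W is normal; the trapezoid rule on [−8, 8] and the parameter values are arbitrary choices for illustration.

```python
import math

K = 0.588  # the constant k in H(x) ≈ Phi(kx)


def H(x):
    """Logistic distribution function."""
    return 1.0 / (1.0 + math.exp(-x))


def exact_prob(beta0, beta_x, mean_x, var_x, n_grid=2001):
    """E[H(beta0 + beta_x X) | W] for X | W ~ N(mean_x, var_x), trapezoid rule."""
    sd = math.sqrt(var_x)
    lo, hi = -8.0, 8.0
    step = (hi - lo) / (n_grid - 1)
    total = 0.0
    for i in range(n_grid):
        z = lo + i * step
        weight = 0.5 if i in (0, n_grid - 1) else 1.0
        density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        total += weight * density * H(beta0 + beta_x * (mean_x + sd * z))
    return total * step


def probit_approx(beta0, beta_x, mean_x, var_x):
    """H{(beta0 + beta_x E(X|W)) / (1 + beta_x^2 k^2 var)^(1/2)}."""
    scale = math.sqrt(1.0 + beta_x ** 2 * K ** 2 * var_x)
    return H((beta0 + beta_x * mean_x) / scale)


params = (-1.0, 1.0, 0.5, 0.5)  # hypothetical beta0, beta_x, E(X|W), var(X|W)
exact = exact_prob(*params)
approx = probit_approx(*params)
print(f"exact={exact:.4f}  approximation={approx:.4f}")
```

The attenuation factor here, (1 + βx²k²σ²)^(-1/2) with k² ≈ 1/2.9, has the same form as the Berkson logistic attenuation formula given in Lecture 2.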
LOGISTIC REGRESSION, NORMAL X
A third approximation is
$$\begin{aligned}
\Pr(Y = 1 \mid W) &= E\left(H[\beta_0 + \beta_x\{E(X|W) + U_b\}] \mid W\right) \\
&\approx H\{\beta_0 + \beta_x E(X|W)\} + \frac{1}{2}\beta_x^2\, H''\{\beta_0 + \beta_x E(X|W)\}\,\sigma_{X|W}^2
\end{aligned}$$
Note that
H′′(x) > 0 if x < 0
H′′(x) < 0 if x > 0
• This approximation is applicable to most models, not just logistic regression
ESTIMATING THE CALIBRATION FUNCTION
• Need to estimate E(X|Z,W).
∗ How this is done depends, of course, on the type of auxiliary data available.
• Easy case: validation data
∗ Suppose one has internal validation data.
∗ Then one can simply regress X on (Z,W) and transport the model to the
non-validation data.
∗ Of course, for the validation data one regresses Y on (Z,X), and this estimate must be combined with the one from the non-validation data.
• The same approach can be used for external validation data, but with the usual concern for non-transportability.
ESTIMATING THE CALIBRATION FUNCTION: INSTRUMENTAL
DATA: ROSNER’S METHOD
• Internal unbiased instrumental data:
∗ Suppose E(T|Z,X) = E(T|Z,X,W) = X, so that T is an unbiased instrument.
∗ If T is expensive to measure, then T might be available for only a subset of the study. W will generally be available for all subjects.
∗ Then E(T|Z,W) = E{E(T|Z,X,W)|Z,W} = E(X|Z,W).
• Thus, T regressed on (Z,W) follows the same model as X regressed on (Z,W), although with greater variance.
∗ So one regresses T on (Z,W) to estimate the parameters in the regression of X on (Z,W).
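The claim that T regressed on W follows the same model as X regressed on W can be illustrated by simulation. This is a minimal sketch with no Z, normal variables, and T = X + (independent noise) observed only on a hypothetical validation subset of 2000 subjects; all names and values are illustrative.

```python
import random


def ols_slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sum((a - mx) ** 2 for a in x)


rng = random.Random(3)
n, n_valid = 20000, 2000
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
w = [xi + rng.gauss(0.0, 1.0) for xi in x]             # error-prone surrogate
t = [xi + rng.gauss(0.0, 1.0) for xi in x[:n_valid]]   # unbiased instrument, subset only

# Slope of the (unobservable) regression of X on W, using everyone.
slope_x_on_w = ols_slope(w, x)
# Slope of the feasible regression of T on W, using the validation subset.
slope_t_on_w = ols_slope(w[:n_valid], t)

print(f"X on W: {slope_x_on_w:.3f}   T on W (subset): {slope_t_on_w:.3f}")
```

Both slopes estimate λ = 1/2 in this setup; the one based on T is noisier, which matches the "greater variance" remark above.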
ROSNER’S METHOD, CONT.
• More formally, suppose we have the model
$$W_i = \gamma_{0,em} + \gamma_{x,em}^t X_i + \gamma_{z,em}^t Z_i + U_i$$
(error calibration model)
• Under joint normality, this implies a calibration model
$$X_i = \gamma_{0,cm} + \gamma_{w,cm}^t W_i + \gamma_{z,cm}^t Z_i + U_{cm,i}$$
∗ Sometimes one can use validation data to regress X on Z and W
∗ Suppose we only have an “alloyed gold standard” $T_i = X_i + U_{T,i}$ on a “validation sample”
– then regress T on Z and W
– This works! Think of $T_i = X_i + U_{T,i}$ as outcome measurement error
ESTIMATING THE CALIBRATION FUNCTION: REPLICATION DATA
• Suppose that one has unbiased internal replicate data:
∗ $W_{ij} = X_i + U_{ij}$, $i = 1, \ldots, n$ and $j = 1, \ldots, k_i$, where $E(U_{ij} \mid Z_i, X_i) = 0$.
∗ $\bar W_{i\cdot} := \frac{1}{k_i}\sum_j W_{ij}$.
∗ Notation: $\mu_z$ is $E(Z)$, $\Sigma_{xz}$ is the covariance (matrix) between X and Z, etc.
• Use standard least squares theory to get the best linear unbiased predictor of X
from $(\bar W, Z)$:
$$E(X \mid Z, \bar W) \approx \mu_x + (\Sigma_{xx}\ \ \Sigma_{xz}) \begin{pmatrix} \Sigma_{xx} + \Sigma_{uu}/k & \Sigma_{xz} \\ \Sigma_{xz}^t & \Sigma_{zz} \end{pmatrix}^{-1} \begin{pmatrix} \bar W - \mu_w \\ Z - \mu_z \end{pmatrix}$$
(best linear approximation = exact conditional expectation under joint normality).
• Need to estimate the unknown µ’s and Σ’s.
• Essentially a flexible structural method since existence of $\mu_x$ and $\Sigma_{xx}$ is assumed.
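ESTIMATING THE CALIBRATION FUNCTION: REPLICATION DATA,
CONTINUED
• $\hat\mu_z$ and $\hat\Sigma_{zz}$ are the “usual” estimates since the Z’s are observed.
• $\hat\mu_x = \hat\mu_w = \sum_{i=1}^n k_i \bar W_{i\cdot} / \sum_{i=1}^n k_i$.
• $\hat\Sigma_{xz} = \sum_{i=1}^n k_i (\bar W_{i\cdot} - \hat\mu_w)(Z_i - \hat\mu_z)^t / \nu$, where $\nu = \sum k_i - \sum k_i^2 / \sum k_i$.
•
$$\hat\Sigma_{uu} = \frac{\sum_{i=1}^n \sum_{j=1}^{k_i} (W_{ij} - \bar W_{i\cdot})(W_{ij} - \bar W_{i\cdot})^t}{\sum_{i=1}^n (k_i - 1)}.$$
•
$$\hat\Sigma_{xx} = \left[\left\{\sum_{i=1}^n k_i (\bar W_{i\cdot} - \hat\mu_w)(\bar W_{i\cdot} - \hat\mu_w)^t\right\} - (n-1)\hat\Sigma_{uu}\right] / \nu.$$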
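For a scalar W and no Z, the moment estimators above reduce to a few sums. The sketch below implements that special case and then forms the calibration function with a subject-specific number of replicates; the simulated data (normal X with mean 1, unit variances, 2 or 3 replicates per subject) and all function names are illustrative assumptions. At least one subject must have two or more replicates, otherwise the error variance is not estimable.

```python
import random


def replicate_moments(w):
    """Moment estimates (mu_w, sigma2_x, sigma2_u) for scalar replicate data.

    w[i] is the list of the k_i replicate measurements for subject i.
    """
    n = len(w)
    k = [len(wi) for wi in w]
    k_total = sum(k)
    wbar = [sum(wi) / ki for wi, ki in zip(w, k)]
    mu_w = sum(ki * wb for ki, wb in zip(k, wbar)) / k_total
    nu = k_total - sum(ki ** 2 for ki in k) / k_total
    within = sum((wij - wb) ** 2 for wi, wb in zip(w, wbar) for wij in wi)
    sigma2_u = within / sum(ki - 1 for ki in k)
    between = sum(ki * (wb - mu_w) ** 2 for ki, wb in zip(k, wbar))
    sigma2_x = (between - (n - 1) * sigma2_u) / nu
    return mu_w, sigma2_x, sigma2_u


def calibrated_value(wbar_i, k_i, mu_w, sigma2_x, sigma2_u):
    """Best linear predictor of X_i from the mean of k_i replicates."""
    lam_i = sigma2_x / (sigma2_x + sigma2_u / k_i)
    return mu_w + lam_i * (wbar_i - mu_w)


rng = random.Random(4)
data = []
for _ in range(5000):
    x_i = rng.gauss(1.0, 1.0)
    k_i = rng.choice([2, 3])
    data.append([x_i + rng.gauss(0.0, 1.0) for _ in range(k_i)])

mu_hat, s2x_hat, s2u_hat = replicate_moments(data)
print(f"mu_hat={mu_hat:.3f}  sigma2_x_hat={s2x_hat:.3f}  sigma2_u_hat={s2u_hat:.3f}")
```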
REGRESSION CALIBRATION VERSUS CORRECTING THE NAIVE
ESTIMATOR FOR ATTENUATION
• If
g{E(Y|X)} = β0 + β1X (GLM)
and
E(X|W) = α0 + λW
then these two slope estimators are equivalent:
∗ Regress Y on E(X|W) (e.g., Rosner, Spiegelman, Willett, 1989)
∗ Regress Y on W and divide the slope by λ (e.g., Carroll, Ruppert, Stefanski,
1995)
• Equivalence does not hold in general, e.g., for multiplicative error, when E(X|W)
is nonlinear.
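The equivalence is a reparametrization of the linear predictor, which one line of algebra makes explicit. Here α0 and λ are treated as fixed (or as the same estimates used in both fits):

```latex
\beta_0 + \beta_1 E(X \mid W)
  = \beta_0 + \beta_1(\alpha_0 + \lambda W)
  = (\beta_0 + \beta_1\alpha_0) + (\beta_1\lambda)\, W .
```

A GLM fitted with covariate E(X|W) and one fitted with covariate W therefore describe the same set of fitted models. The slope on W equals λ times the slope on E(X|W), so dividing the naive slope by λ recovers the regression calibration slope. When E(X|W) is nonlinear in W, no such linear reparametrization is available.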
MULTIPLICATIVE ERROR
• The multiplicative lognormal error model is
$$W = X\,U, \qquad \log(U) \sim N(\mu_u, \sigma_u^2) \tag{1}$$
• If log(X) is normal then we can convert (1) to a model for X given W
$$\log(X) = \alpha + \lambda \log(W) + U_b$$
where $U_b$ is $N(0, \sigma_b^2)$ and independent of W
• If Y were linear in log(X) then we would work with log(X) and have an additive
error model
∗ But typically it is assumed that Y is linear in X, not log(X)
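The conversion from (1) to the model for X given W can be written out. The notation $\mu_{\ell x}$ and $\sigma_{\ell x}^2$ for the mean and variance of log(X) is introduced here for illustration only, and X and U are assumed independent. Since log(W) = log(X) + log(U) is an additive error model on the log scale, standard bivariate normal theory gives:

```latex
\lambda = \frac{\sigma_{\ell x}^2}{\sigma_{\ell x}^2 + \sigma_u^2}, \qquad
\alpha = (1 - \lambda)\,\mu_{\ell x} - \lambda\,\mu_u, \qquad
\sigma_b^2 = \lambda\,\sigma_u^2,
\qquad\text{so that}\qquad
E(X \mid W) = \exp\left(\alpha + \tfrac{1}{2}\sigma_b^2\right) W^{\lambda}.
```

The last expression follows from the lognormal mean. It shows that E(X|W) is a power function of W rather than a linear one, which is why a model linear in X is not handled by simply rescaling the naive slope.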
![Page 87: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/87.jpg)
Lecture 3 30
MULTIPLICATIVE ERROR
• The multiplicative error model with outcome linear in exposure is well-supported by empirical work:
∗ Lyles and Kupper (1997, Biometrics):
- “there is much evidence for this model”
- “the more biologically relevant predictor is true mean exposure on the original scale”
![Page 88: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/88.jpg)
Lecture 3 31
MULTIPLICATIVE ERROR
∗ Pierce, Stram, Vaeth, Schafer (1992, JASA)
- data from Radiation Effects Research Foundation (RERF) in Hiroshima
- “On both radiobiological and empirical grounds, focus is on models where the expected response is linear or quadratic in dose rather than linear on logistic and logarithmic scales.”
- “It is accepted that radiation dose-estimation errors are more homogeneous on a multiplicative than on an additive scale”
- “The distribution of true doses is extremely nonnormal”
- However, it seemed unlikely that a lognormal model for the true doses would fit as well as the Weibull model that they used
![Page 89: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/89.jpg)
Lecture 3 32
MULTIPLICATIVE ERROR
∗ Seychelles study (preliminary analysis of Thurston)
- validation data (brain versus maternal hair MeHg): error appears to be multiplicative and lognormal
- researchers use MeHg concentration, not log concentration, as the exposure
![Page 90: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/90.jpg)
Lecture 3 33
MULTIPLICATIVE ERROR
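• From the previous page:
$\log(X) = \alpha + \lambda \log(W) + U_b$ (2)
• From (2),
$X = W^{\lambda} \exp(\alpha + U_b)$
and, since $E\{\exp(U_b)\} = \exp(\sigma_b^2/2)$ for $U_b \sim N(0, \sigma_b^2)$ independent of $W$,
$E(X \mid W) = W^{\lambda} \exp(\alpha + \sigma_b^2/2)$
This is not linear in $W$ (when $\lambda \neq 1$)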
![Page 91: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/91.jpg)
Lecture 3 34
MULTIPLICATIVE ERROR
• Therefore if $E(Y \mid X) = \beta_0 + \beta_1 X$ then
$E(Y \mid W) = \beta_0 + \{\beta_1 \exp(\alpha + \sigma_b^2/2)\}\, W^{\lambda}$
• Therefore, regress $Y$ on $W^{\lambda}$ and divide the slope estimate by $\exp(\alpha + \sigma_b^2/2)$
∗ This is regression calibration again
• Note that the regression of $Y$ on $W$ is not linear even though the regression of $Y$ on $X$ is linear
∗ This is because the regression of $X$ on $W$ is not linear
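The recipe above can be tried on simulated data. The sketch below is a hedged illustration, not the lecturers' code: it uses the simulation design quoted on the following slides (log X ∼ N(0, 1/4), log U ∼ N(−1/8, 1/4), Y = X/2 + N(0, 0.04)). The calibration constants λ = 1/2, α = 1/16 and σ_b² = 1/8 are values worked out from that design for this example, and the function names are hypothetical. In practice these constants would have to be estimated.

```python
import math
import random


def ols(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def simulate(n, seed=0):
    """Draw (x, w, y) with multiplicative lognormal error W = X * U."""
    rng = random.Random(seed)
    x = [math.exp(rng.gauss(0.0, 0.5)) for _ in range(n)]     # log X ~ N(0, 1/4)
    u = [math.exp(rng.gauss(-0.125, 0.5)) for _ in range(n)]  # log U ~ N(-1/8, 1/4)
    w = [xi * ui for xi, ui in zip(x, u)]
    y = [xi / 2 + rng.gauss(0.0, 0.2) for xi in x]            # Y = X/2 + N(0, 0.04)
    return x, w, y


def regression_calibration_slope(w, y, lam, alpha, sigma_b2):
    """Regress y on w**lam, then divide the slope by exp(alpha + sigma_b2/2)."""
    slope = ols([wi ** lam for wi in w], y)[1]
    return slope / math.exp(alpha + sigma_b2 / 2)


if __name__ == "__main__":
    x, w, y = simulate(20000)
    # Assumed calibration constants, derived from the simulation design.
    lam, alpha, sigma_b2 = 0.5, 1 / 16, 1 / 8
    print("naive slope (Y on W):", ols(w, y)[1])
    print("regression calibration slope:",
          regression_calibration_slope(w, y, lam, alpha, sigma_b2))
```

With these settings the naive slope should come out well below the true value 1/2, while the calibrated slope should be close to it.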
![Page 92: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/92.jpg)
Lecture 3 35
MULTIPLICATIVE ERROR
[Figure: scatterplot of outcome against predictor, n = 50; legend: true x, surrogate w.]
Simulation: $W = XU$, $\log(X) \sim N(0, 1/4)$, $\log(U) \sim N(-1/8, 1/4)$, $Y = X/2 + N(0, 0.04)$
![Page 93: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/93.jpg)
Lecture 3 36
MULTIPLICATIVE ERROR
[Figure: outcome against w, n = 10000; legend: w−y, fit to x−y, fit to w−y.]
$W = XU$, $\log(X) \sim N(0, 1/4)$, $\log(U) \sim N(-1/8, 1/4)$, $Y = X/2 + N(0, 0.04)$
![Page 94: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/94.jpg)
Lecture 3 37
MULTIPLICATIVE ERROR
[Figure: outcome against $w^{1/2}$, n = 10000; legend: $w^{1/2}$−y, fit.]
$W = XU$, $\log(X) \sim N(0, 1/4)$, $\log(U) \sim N(-1/8, 1/4)$, $Y = X/2 + N(0, 0.04)$. Note: $\lambda = 1/2$
![Page 95: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/95.jpg)
Lecture 3 38
MULTIPLICATIVE ERROR
Other work
• Hwang (1986) regresses $Y$ on $W$ (not $W^{\lambda}$) and then corrects
∗ This is the “correction method” that is equivalent to regression calibration only when $E(X \mid W)$ is linear in $W$, which is not the case here
∗ Consistent but badly biased in simulations of Lyles and Kupper
∗ Bias is still noticeable for $n = 10{,}000$
∗ Previous plots suggest why
∗ These results are consistent with those of Iturria, Carroll, and Firth (1999)
![Page 96: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/96.jpg)
Lecture 3 39
MULTIPLICATIVE ERROR
Other work
• Lyles and Kupper (1999) propose a method based on quasi-likelihood
∗ somewhat superior to regression calibration in a simulation study
- especially when measurement error is large relative to equation error (plus measurement error in $Y$)
∗ it is a weighted version of regression calibration
- more efficient than regression calibration because it weights according to inverse conditional variances
∗ a special case of expanded regression calibration (Carroll, Ruppert, Stefanski, 1995)
![Page 97: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/97.jpg)
Lecture 3 40
MULTIPLICATIVE ERROR
Other work
• Iturria, Carroll, and Firth (1999) study polynomial regression with multiplicative error
∗ One of their general methods is a special case of Hwang’s
∗ Their “partial regression” estimator assumes lognormality and generalizes the estimator discussed earlier
∗ In a simulation with lognormal $X$ and $U$, the partial regression estimator is superior to the ones that make fewer assumptions
- this is similar to the results of Lyles and Kupper
![Page 98: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/98.jpg)
Lecture 3 41
TRANSFORMATIONS TO ADDITIVE ERROR
• Model is $h(W) = h(X) + U$
∗ Includes additive and multiplicative error as special cases
• Eckert, Carroll, and Wang (1997) consider parametric (power) and nonparametric (monotonic cubic spline) models for $h$
∗ can transform to a specific error distribution
∗ could, instead, transform to constant error variance
∗ if model holds, then both methods have the same target transformation
∗ it could be more efficient to use both sources of information (Ruppert and Aldershof, 1989)
• Nusser et al. (1996) use transformations to estimate the distribution of $X$ where $X$ is the daily intake of a dietary component
![Page 99: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/99.jpg)
Lecture 3 42
END OF LECTURE 3
![Page 100: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/100.jpg)
Lecture 4 1
LECTURE 4: SIMEX AND INSTRUMENTAL VARIABLES
OUTLINE
• The Key Idea Behind SIMEX (Simulation/Extrapolation)
• An Empirical Version of SIMEX
• Details of Simulation Extrapolation Algorithm
• Example: Measurement Error in SBP in Framingham Study
• IV (Instrumental Variables): Rationale
• IV via prediction and the IV algorithm
• Example: Calibrating a FFQ
• Example: CHD and Cholesterol
![Page 101: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/101.jpg)
Lecture 4 2
ABOUT SIMULATION EXTRAPOLATION
• A functional method
∗ no assumptions about the true $X$ values
• For bias reduction and variance estimation
∗ like bootstrap and jackknife
• Not model dependent
∗ like bootstrap and jackknife
• Handles complicated problems
• Computer intensive
• Approximate, less efficient for certain problems
![Page 102: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/102.jpg)
Lecture 4 3
THE KEY IDEA
• The effects of measurement error on an estimator can be studied with a simulation experiment in which additional measurement error is added to the measured data and the estimator recalculated.
• “Response variable” is the estimator under study
• “Independent factor” is the measurement error variance
∗ “Factor levels” are the variances of the added measurement errors
• Objective is to study how the estimator depends on the variance of the measurement error
![Page 103: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/103.jpg)
Lecture 4 4
OUTLINE OF THE ALGORITHM
• Add measurement error to variable measured with error
∗ Λ controls amount of added measurement error
∗ $\sigma_u^2$ increased to $(1 + \Lambda)\sigma_u^2$
∗ Average over many simulations to remove Monte Carlo variation
• Recalculate estimates, called pseudo-estimates
• Plot Monte Carlo average pseudo-estimates versus $\Lambda$
• Extrapolate to $\Lambda = -1$
∗ $\Lambda = -1$ corresponds to case of no measurement error
![Page 104: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/104.jpg)
Lecture 4 5
[Figure: Illustration of SIMEX. Coefficient plotted against measurement error variance, with the naive estimate marked.]
Your estimate when you ignore measurement error.
![Page 105: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/105.jpg)
Lecture 4 6
[Figure: Illustration of SIMEX. Coefficient plotted against measurement error variance, with the naive estimate marked.]
What happens to your estimate when you have more error, which you add on by simulation, but you still ignore the error.
![Page 106: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/106.jpg)
Lecture 4 7
[Figure: Illustration of SIMEX. Coefficient plotted against measurement error variance, with the naive estimate marked.]
What statistician can resist fitting a curve?
![Page 107: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/107.jpg)
Lecture 4 8
[Figure: Illustration of SIMEX. Coefficient plotted against measurement error variance, with the naive estimate and the SIMEX estimate marked.]
Now extrapolate to the case of no measurement error.
![Page 108: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/108.jpg)
Lecture 4 9
AN EMPIRICAL VERSION OF SIMEX: FRAMINGHAM DATA
EXAMPLE
• Data
∗ $Y$ = indicator of CHD
∗ $W_k$ = SBP at Exam $k$, $k = 1, 2$
∗ $X$ = “true” SBP
∗ Data: $(Y_j, W_{1,j}, W_{2,j})$, $j = 1, \ldots, 1660$
• Model Assumptions
∗ $W_1, W_2 \mid X$ iid $N(X, \sigma_u^2)$
∗ $\Pr(Y = 1 \mid X) = H(\alpha + \beta X)$, $H$ logistic
![Page 109: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/109.jpg)
Lecture 4 10
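FRAMINGHAM DATA EXAMPLE: THREE NAIVE ANALYSES
• Regress $Y$ on $\overline{W}_{\bullet}$ $\longmapsto \hat\beta_A$
• Regress $Y$ on $W_1$ $\longmapsto \hat\beta_1$
• Regress $Y$ on $W_2$ $\longmapsto \hat\beta_2$

| $\Lambda$ | Measurement error variance $= (1 + \Lambda)\sigma_u^2/2$ | Slope estimate |
|---|---|---|
| $-1$ | $0$ | ? |
| $0$ | $\sigma_u^2/2$ | $\hat\beta_A$ |
| $1$ | $\sigma_u^2$ | $\hat\beta_1, \hat\beta_2$ |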
![Page 110: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/110.jpg)
Lecture 4 11
Logistic regression fits in Framingham using
first replicate, second replicate and average of both
![Page 111: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/111.jpg)
Lecture 4 12
A SIMEX–type plot for the Framingham data,
where the errors are not computer–generated.
![Page 112: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/112.jpg)
Lecture 4 13
A SIMEX–type extrapolation for the Framingham data,
where the errors are not computer–generated.
![Page 113: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/113.jpg)
Lecture 4 14
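SIMULATION AND EXTRAPOLATION STEPS: ADDING
MEASUREMENT ERROR
• Framingham Example:
∗ $\overline{W}_{\bullet} \longmapsto W_1 = \overline{W}_{\bullet} + (W_1 - W_2)/2$
∗ $\overline{W}_{\bullet} \longmapsto W_2 = \overline{W}_{\bullet} + (W_2 - W_1)/2$
• In General:
∗ Best Estimator$(\text{data}) \longmapsto$ Best Estimator$\big(\text{data} + \sqrt{\Lambda}\,\{\text{Independent } N(0, \sigma_u^2) \text{ Error}\}\big)$
∗ $\Lambda$ controls amount (variance) of added measurement error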
![Page 114: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/114.jpg)
Lecture 4 15
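SIMULATION AND EXTRAPOLATION ALGORITHM: ADDING
MEASUREMENT ERROR (DETAILS)
• For $\Lambda \in \{\Lambda_1, \ldots, \Lambda_M\}$
• For $b = 1, \ldots, B$, compute:
∗ $b$th pseudo data set
$W_{b,i}(\Lambda) = W_i + \sqrt{\Lambda}\, \mathrm{Normal}(0, \sigma_u^2)_{b,i}$
∗ $b$th pseudo estimate
$\hat\theta_b(\Lambda) = \hat\theta\big(\{Y_i, W_{b,i}(\Lambda)\}_1^n\big)$
∗ the average of the pseudo estimates
$\hat\theta(\Lambda) = B^{-1} \sum_{b=1}^{B} \hat\theta_b(\Lambda) \approx E\big(\hat\theta_b(\Lambda) \mid \{Y_j, X_j\}_1^n\big)$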
![Page 115: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/115.jpg)
Lecture 4 16
SIMULATION AND EXTRAPOLATION STEPS: EXTRAPOLATION
• Framingham Example: (two points $\Lambda = 0, 1$)
∗ Linear Extrapolation: $a + b\Lambda$
• In General: (multiple $\Lambda$ points)
∗ Linear: $a + b\Lambda$
∗ Quadratic: $a + b\Lambda + c\Lambda^2$
∗ Rational Linear: $(a + b\Lambda)/(c + \Lambda)$ (exact for linear regression)
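For the two-point Framingham case the linear extrapolation can be written out explicitly. The sketch below assumes the two $\Lambda = 1$ estimates are combined by averaging, which is what a least-squares line through the three points $(0, \hat\beta_A)$, $(1, \hat\beta_1)$, $(1, \hat\beta_2)$ gives. The hats and the symbols $\hat a, \hat b$ are notation chosen here.

```latex
\begin{align*}
\hat a &= \hat\beta_A, \qquad
\hat b = \tfrac12(\hat\beta_1 + \hat\beta_2) - \hat\beta_A, \\
\hat\theta(-1) &= \hat a - \hat b
  = 2\hat\beta_A - \tfrac12(\hat\beta_1 + \hat\beta_2).
\end{align*}
```

So the extrapolated slope moves away from the single-exam estimates by exactly the amount that averaging the two exams already moved the estimate.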
![Page 116: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/116.jpg)
Lecture 4 17
SIMULATION AND EXTRAPOLATION ALGORITHM:
EXTRAPOLATION (DETAILS)
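• Plot $\hat\theta(\Lambda)$ vs $\Lambda$ ($\Lambda > 0$)
• Extrapolate to $\Lambda = -1$ to get $\hat\theta(-1) = \hat\theta_{\mathrm{SIMEX}}$
• For certain models, and assuming the extrapolant function is chosen correctly,
$E\{\hat\theta(-1) \mid \text{True Data}\} = \theta_{\mathrm{TRUE}}$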
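The simulation and extrapolation steps can be put together in a short sketch. It makes several simplifying assumptions that are not in the slides: the estimator is the slope of a simple linear regression, the error variance $\sigma_u^2$ is treated as known, and the quadratic extrapolant is used. All function names are hypothetical.

```python
import random


def ols_slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx


def fit_quadratic(lams, vals):
    """Least-squares coefficients (a, b, c) of a + b*L + c*L**2."""
    rows = [[1.0, l, l * l] for l in lams]
    # Augmented normal equations [A | rhs] for the three coefficients.
    m = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
         + [sum(r[i] * v for r, v in zip(rows, vals))] for i in range(3)]
    for col in range(3):  # Gaussian elimination with partial pivoting
        piv = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= f * m[col][c]
    coef = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):  # back substitution
        s = m[i][3] - sum(m[i][j] * coef[j] for j in range(i + 1, 3))
        coef[i] = s / m[i][i]
    return tuple(coef)


def simex_slope(w, y, sigma_u2, lams=(0.0, 0.5, 1.0, 1.5, 2.0), B=50, seed=0):
    """Return (SIMEX slope, list of averaged pseudo-estimates per Lambda)."""
    rng = random.Random(seed)
    sd_u = sigma_u2 ** 0.5
    means = []
    for lam in lams:
        pseudo = []
        for _ in range(B):
            # Pseudo data: add error with variance lam * sigma_u2.
            wb = [wi + (lam ** 0.5) * rng.gauss(0.0, sd_u) for wi in w]
            pseudo.append(ols_slope(wb, y))
        means.append(sum(pseudo) / B)  # Monte Carlo average
    a, b, c = fit_quadratic(lams, means)
    return a - b + c, means  # quadratic extrapolant evaluated at Lambda = -1
```

The quadratic extrapolant is only approximate for linear regression (the slides note that the rational linear form is exact there), so the SIMEX slope from this sketch should land between the naive slope and the true value rather than exactly on it.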
![Page 117: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/117.jpg)
Lecture 4 18
EXAMPLE: MEASUREMENT ERROR IN SYSTOLIC BLOOD
PRESSURE
• Framingham Data: $(Y_j, \mathrm{Age}_j, \mathrm{Smoke}_j, \mathrm{Chol}_j, W_{A,j})$, $j = 1, \ldots, 1615$
∗ $Y$ = indicator of CHD
∗ Age (at Exam 2)
∗ Smoking Status (at Exam 1)
∗ Serum Cholesterol (at Exam 3)
∗ Transformed SBP: $W_A = (W_1 + W_2)/2$, $W_k = \ln(\mathrm{SBP} - 50)$ at Exam $k$
![Page 118: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/118.jpg)
Lecture 4 19
EXAMPLE: PARAMETER ESTIMATION
• Consider logistic regression of $Y$ on Age, Smoke, Chol and SBP with transformed SBP measured with error
∗ The plots on the following page illustrate the simulation extrapolation method for estimating the parameters in the logistic regression model
![Page 119: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/119.jpg)
Lecture 4 20
[Figure: four SIMEX panels (Age, Smoking, Cholesterol, Log(SBP−50)), each plotting the coefficient against Lambda from −1 to 2. Values labelled in the panels: Age 0.0536 and 0.0554; Smoking 0.601 and 0.593; Cholesterol 0.00782 and 0.00787; Log(SBP−50) 1.93 and 1.71.]
![Page 120: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/120.jpg)
Lecture 4 21
INSTRUMENTAL VARIABLES: RATIONALE
• Remember, $W = X + U$, $U \sim \mathrm{Normal}(0, \sigma_u^2)$.
• The most direct and efficient way to get information about $\sigma_u^2$ is to observe $X$ on a subset of the data.
• The next best way is via replication, namely to take $\geq 2$ independent replicates
∗ $W_1 = X + U_1$ and $W_2 = X + U_2$.
∗ If these are indeed replicates, then we can estimate $\sigma_u^2$ via a components of variance analysis.
• The third method is to use Instrumental Variables.
∗ Sometimes $X$ cannot be observed and replicates cannot be taken.
∗ Then IV’s can help.
![Page 121: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/121.jpg)
Lecture 4 22
WHAT IS AN INSTRUMENTAL VARIABLE?
$Y = \beta_0 + \beta_x X + \varepsilon$;
$W = X + U$;
$U \sim \mathrm{Normal}(0, \sigma_u^2)$.
• In linear regression, an instrumental variable $T$ is a random variable which has three properties:
∗ $T$ is independent of $\varepsilon$
∗ $T$ is independent of $U$
∗ $T$ is related to $X$.
∗ You only measure $T$ to get information about measurement error: it is not part of the model.
∗ In our parlance, $T$ is a surrogate for $X$!
• Whether $T$ qualifies as an instrumental variable can be a difficult question.
![Page 122: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/122.jpg)
Lecture 4 23
AN EXAMPLE: CALIBRATING A QUESTIONNAIRE
$X$ = usual (long–term) average intake of Fat (log scale);
$Y$ = Fat as measured by a questionnaire;
$W$ = Fat as measured by 6 days of 24–hour recalls
$T$ = Fat as measured by a diary record
• In this example, the time ordering was:
∗ Questionnaire
∗ Then one year later, the recalls were done fairly close together in time.
∗ Then 6 months later, the diaries were measured.
• One could think of the recalls as replicates, but some researchers have worried that substantial correlations exist, i.e., they are not independent replicates.
• The 6–month gap with the recalls and the 18–month gap with the questionnaire make the diary records a good candidate for an instrument.
![Page 123: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/123.jpg)
Lecture 4 24
USING INSTRUMENTAL VARIABLES: MOTIVATION
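• In what follows, we will use underscores to denote what coefficients go where.
• For example, $\beta_{Y|1\underline{X}}$ is the coefficient for $X$ in the regression of $Y$ on $X$.
• Let’s do a little algebra:
$Y = \beta_{Y|\underline{1}X} + \beta_{Y|1\underline{X}} X + \varepsilon$;
$W = X + U$;
$(\varepsilon, U)$ = independent of $T$.
• This means
$E(Y \mid T) = \beta_{Y|\underline{1}T} + \beta_{Y|1\underline{T}} T = \beta_{Y|\underline{1}X} + \beta_{Y|1\underline{X}} E(X \mid T)$
$= \beta_{Y|\underline{1}X} + \beta_{Y|1\underline{X}} E(W \mid T) = \beta_{Y|\underline{1}T} + \beta_{Y|1\underline{X}}\, \beta_{W|1\underline{T}} T$.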
![Page 124: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/124.jpg)
Lecture 4 25
MOTIVATION — CONT.
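• From previous page
$E(Y \mid T) = \beta_{Y|\underline{1}T} + \beta_{Y|1\underline{X}}\, \beta_{W|1\underline{T}} T$.
• We want to estimate $\beta_{Y|1\underline{X}}$
∗ From above
$\beta_{Y|1\underline{T}} = \beta_{Y|1\underline{X}}\, \beta_{W|1\underline{T}}$
∗ Equivalently,
$\beta_{Y|1\underline{X}} = \dfrac{\beta_{Y|1\underline{T}}}{\beta_{W|1\underline{T}}}$.
∗ Regress $Y$ on $T$ and divide its slope by the slope of the regression of $W$ on $T$!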
![Page 125: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/125.jpg)
Lecture 4 26
THE DANGERS OF A WEAK INSTRUMENT
• Remember that we get the IV estimate using the relationship
$\beta_{Y|1\underline{X}} = \dfrac{\beta_{Y|1\underline{T}}}{\beta_{W|1\underline{T}}}$.
• The division causes increased variability.
∗ If the instrument is very weak, the slope $\beta_{W|1\underline{T}}$ will be near zero.
∗ This will make the IV estimate very unstable.
![Page 126: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/126.jpg)
Lecture 4 27
IV VIA PREDICTION
• Here’s another way to do the same thing and get the IV estimator without doing division explicitly:
∗ Regress $W$ on $T$.
∗ Form predicted values: $\hat\beta_{W|\underline{1}T} + \hat\beta_{W|1\underline{T}} T$.
∗ Regress $Y$ on these predicted values.
∗ The slope is the instrumental variables estimate.
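The two routes to the IV estimate, the ratio of slopes and the regression on predicted values, can be compared directly on simulated data. The sketch below is illustrative only: the function names are hypothetical, and it covers the simple linear case with a single instrument and no extra covariates.

```python
def ols(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def iv_ratio(y, w, t):
    """Slope of Y on T divided by the slope of W on T."""
    return ols(t, y)[1] / ols(t, w)[1]


def iv_prediction(y, w, t):
    """Regress W on T, form predicted values, then regress Y on them."""
    a, b = ols(t, w)
    w_hat = [a + b * ti for ti in t]
    return ols(w_hat, y)[1]


if __name__ == "__main__":
    import random

    rng = random.Random(42)
    n = 5000
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]         # unobserved truth
    t = [xi + rng.gauss(0.0, 1.0) for xi in x]          # instrument
    w = [xi + rng.gauss(0.0, 0.7) for xi in x]          # error-prone measure
    y = [1 + 2 * xi + rng.gauss(0.0, 0.5) for xi in x]  # true slope is 2
    print("naive slope:", ols(w, y)[1])
    print("IV by ratio:", iv_ratio(y, w, t))
    print("IV by prediction:", iv_prediction(y, w, t))
```

The two IV numbers agree up to rounding because the predicted values are a linear function of $T$, so regressing $Y$ on them simply rescales the slope of $Y$ on $T$ by $1/\hat\beta_{W|1\underline{T}}$.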
![Page 127: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/127.jpg)
Lecture 4 28
LOGISTIC REGRESSION EXAMPLE
• These data come from a paper by Satten and Kupper.
$Y$ = Evidence of Coronary Heart Disease (binary)
$W$ = Cholesterol Level
$T$ = LDL
$Z$ = Age and Smoking Status
• It’s not particularly clear that $T$ is really an instrument. While it may well be uncorrelated with the error in $W$ and the random variation in $Y$, it might also be included as part of the model itself!
∗ There are no replicates here to compare with.
• Note here that we have added variables, $Z$, which are measured without error.
• The algorithm does not change: all regressions are done with $Z$ as an extra covariate.
![Page 128: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/128.jpg)
Lecture 4 29
IV ALGORITHM: NON-GAUSSIAN OUTCOME
• The IV algorithm in linear regression is:
STEP 1: Regress $W$ on $T$ and $Z$ (may be a multivariate regression)
STEP 2: Form the predicted values of this regression
STEP 3: Regress $Y$ on the predicted values and $Z$.
STEP 4: The regression coefficients are the IV estimates.
• Only Step 3 changes if you do not have linear regression but instead have logistic regression or a GLM.
∗ Then the “regression” is logistic or GLM.
∗ Very simple to compute.
∗ Easily bootstrapped.
• This method is “valid” in GLM’s to the extent that regression calibration is valid.
![Page 129: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/129.jpg)
Lecture 4 30
LOGISTIC REGRESSION EXAMPLE: RESULTS
• We used LDL/100, Cholesterol/100, Age/100
• The naive logistic regression leads to the following analysis:
∗ Slope and bootstrap s.e. for Cholesterol: 0.65 and 0.29
∗ Slope and bootstrap s.e. for Smoking: 0.065 and 0.260
∗ Slope and bootstrap s.e. for Age: 7.82 and 4.26
• The IV logistic regression leads to the following analysis:
∗ Slope and bootstrap s.e. for Cholesterol: 0.91 and 0.33
∗ Slope and bootstrap s.e. for Smoking: 0.056 and 0.259
∗ Slope and bootstrap s.e. for Age: 7.79 and 4.31
• Note here how only the coefficient for cholesterol is affected by the measurement error.
![Page 130: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/130.jpg)
Lecture 4 31
END OF LECTURE 4
![Page 131: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/131.jpg)
Lecture 5 1
LECTURE 5: BAYESIAN METHODS
OUTLINE
• Likelihood: Outcome, measurement, and exposure models
• Priors and posteriors
• MCMC
• Example: linear regression
• Example: Mixtures of Berkson and classical error and application to thyroid cancer and exposure to fallout
![Page 132: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/132.jpg)
Lecture 5 2
BAYESIAN METHODS
Outcome model: (true disease model)
$f(Y \mid X, Z, \theta_O)$
Could be a GLM or a “nonparametric” spline model
Measurement model: (measurement error model)
$f(W \mid Y, X, Z, \theta_M) = f(W \mid X, Z, \theta_M)$ if nondifferential (which is assumed here)
Could be a classical Gaussian measurement error model
Exposure model: (predictor distribution model)
$f(X \mid Z, \theta_X)\, f(Z \mid \theta_Z)$ (usually $f(Z \mid \theta_Z)$ is ignored)
Could be a low-dimensional parametric model or a more flexible model (e.g., normal mixture model)
![Page 133: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/133.jpg)
Lecture 5 3
PRIOR
$\pi(\theta_O), \pi(\theta_M), \pi(\theta_X), \pi(\theta_Z)$ (usually $\pi(\theta_Z)$ is ignored)
![Page 134: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/134.jpg)
Lecture 5 4
LIKELIHOOD
$f(Y, W \mid \theta_O, \theta_M, \theta_X, Z) = \int f(Y \mid X, Z, \theta_O)\, f(W \mid X, Z, \theta_M)\, f(X \mid Z, \theta_X)\, dX$
Major problem: computing the integral
We will go in another direction: Bayesian analysis by MCMC.
![Page 135: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/135.jpg)
Lecture 5 5
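POSTERIOR AND MCMC
$f(\theta_O, \theta_M, \theta_X, X \mid Y, W, Z) \propto f(Y, W, X, \theta_O, \theta_M, \theta_X \mid Z)$
$= f(Y \mid X, Z, \theta_O)\, f(W \mid X, Z, \theta_M)\, f(X \mid Z, \theta_X)\, \pi(\theta_O)\, \pi(\theta_M)\, \pi(\theta_X)$
MCMC: for example, we might sample successively from:
1. $[\theta_O \mid \text{others}]$
2. $[\theta_M \mid \text{others}]$
3. $[\theta_X \mid \text{others}]$
4. $[X \mid \text{others}]$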
![Page 136: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/136.jpg)
Lecture 5 6
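MCMC: LINEAR REGRESSION WITH REPLICATE
MEASUREMENTS
$[Y_i \mid X_i, Z_i, \beta, \sigma_\varepsilon] \sim N(Z_i^T \beta_z + X_i \beta_x, \sigma_\varepsilon^2)$ (outcome model)
$[W_{i,j} \mid X_i] \sim IN(X_i, \sigma_u^2)$, $j = 1, \ldots, J_i$ (measurement model)
$[X_i \mid Z_i] = [X_i] \sim N(\mu_x, \sigma_x^2)$ (exposure model)
$[\beta_x] = N(0, \sigma_\beta^2)$, $[\beta_z] = N(0, \sigma_\beta^2 I)$, $[\mu_x] = N(0, \sigma_\mu^2)$
$[\sigma_\varepsilon^2] = IG(\delta, \delta)$, $[\sigma_u^2] = IG(\delta, \delta)$, $[\sigma_x^2] = IG(\delta, \delta)$
$\sigma_\beta$ and $\sigma_\mu$ are “large” and $\delta$ is “small”
unknowns: $(\beta_x, \beta_z, \sigma_\varepsilon)$, $(\sigma_u)$, $(X_1, \ldots, X_N)$, $(\mu_x, \sigma_x)$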
![Page 137: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/137.jpg)
Lecture 5 7
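MCMC: LINEAR REGRESSION, LIKELIHOOD
Let
$$C_i = \begin{pmatrix} Z_i \\ X_i \end{pmatrix}, \quad Y = \begin{pmatrix} Y_1 \\ \vdots \\ Y_N \end{pmatrix}, \quad \text{and} \quad \beta = \begin{pmatrix} \beta_z \\ \beta_x \end{pmatrix}$$
Likelihood:
$$[Y_i, W_{i,1}, \dots, W_{i,J_i} \mid \text{others}] \propto \frac{1}{\sigma_\varepsilon \sigma_u^{J_i}} \exp\left\{ -\frac{1}{2} \left( \frac{(Y_i - C_i^T \beta)^2}{\sigma_\varepsilon^2} + \frac{\sum_{j=1}^{J_i} (W_{i,j} - X_i)^2}{\sigma_u^2} \right) \right\}$$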
![Page 138: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/138.jpg)
Lecture 5 8
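MCMC: LINEAR REGRESSION, POSTERIOR FOR $\beta$
$$[\beta \mid \text{others}] \propto \exp\left\{ -\frac{1}{2\sigma_\varepsilon^2} \sum_{i=1}^{N} (Y_i - C_i^T \beta)^2 - \frac{1}{2\sigma_\beta^2} \|\beta\|^2 \right\}$$
Let $C$ have $i$th row $C_i^T$ and let $\lambda = \sigma_\varepsilon^2 / \sigma_\beta^2$. Then
$$[\beta \mid \text{others}] = N\left( \{C^T C + \lambda I\}^{-1} C^T Y,\; \sigma_\varepsilon^2 (C^T C + \lambda I)^{-1} \right)$$
• Exactly what we get for linear regression without measurement error.
∗ Except now $C$ will vary on each iteration of MCMC, since it contains imputed Xs.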
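The β update above can be sketched in NumPy. This is only an illustration under assumptions: the function name `draw_beta`, the toy data, and the numerical values are hypothetical and not part of the lecture; only the conditional mean and covariance come from the slide.

```python
import numpy as np

def draw_beta(C, Y, sigma2_eps, sigma2_beta, rng):
    """One draw from [beta | others] = N((C'C + lam I)^{-1} C'Y, sigma2_eps (C'C + lam I)^{-1})."""
    lam = sigma2_eps / sigma2_beta                 # lambda = sigma_eps^2 / sigma_beta^2
    A = C.T @ C + lam * np.eye(C.shape[1])
    mean = np.linalg.solve(A, C.T @ Y)
    cov = sigma2_eps * np.linalg.inv(A)
    return rng.multivariate_normal(mean, cov), mean, cov

# Toy data (assumed values): C stacks Z (intercept plus one covariate) and the currently imputed X.
rng = np.random.default_rng(0)
N = 200
Z = np.column_stack([np.ones(N), rng.normal(size=N)])
X_imputed = rng.normal(size=N)
C = np.column_stack([Z, X_imputed])
Y = C @ np.array([1.0, -0.5, 2.0]) + rng.normal(scale=0.3, size=N)

beta_draw, beta_mean, beta_cov = draw_beta(C, Y, 0.09, 1e6, rng)
```

With a "large" prior variance such as `1e6`, λ is close to zero and the conditional mean is essentially the least-squares fit on the current imputed design.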
![Page 139: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/139.jpg)
Lecture 5 9
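MCMC: LINEAR REGRESSION, POSTERIOR FOR $\mu_x$
$$[\mu_x \mid \text{others}] \propto \exp\left\{ -\frac{\sum_{i=1}^{N} (X_i - \mu_x)^2}{2\sigma_x^2} - \frac{\mu_x^2}{2\sigma_\mu^2} \right\}$$
Let $\sigma_{\bar X}^2 = \sigma_x^2 / N$. Then
$$[\mu_x \mid \text{others}] \sim N\left\{ \left( \frac{1}{\sigma_{\bar X}^2} + \frac{1}{\sigma_\mu^2} \right)^{-1} \left( \frac{\bar X}{\sigma_{\bar X}^2} + \frac{0}{\sigma_\mu^2} \right),\; \left( \frac{1}{\sigma_{\bar X}^2} + \frac{1}{\sigma_\mu^2} \right)^{-1} \right\}$$
$$\approx N(\bar X, \sigma_{\bar X}^2) \quad \text{if } \sigma_\mu \text{ is large}$$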
![Page 140: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/140.jpg)
Lecture 5 10
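MCMC: LINEAR REGRESSION, POSTERIOR FOR $X_i$
Define
$$\sigma_{w,i}^2 = \frac{\sigma_u^2}{J_i}$$
Then
$$[X_i \mid \text{others}] \propto \exp\left[ -\frac{1}{2} \left\{ \frac{(Y_i - X_i \beta_x - Z_i^T \beta_z)^2}{\sigma_\varepsilon^2} + \frac{(X_i - \mu_x)^2}{\sigma_x^2} + \frac{(\bar W_i - X_i)^2}{\sigma_{w,i}^2} \right\} \right]$$
After some algebra, $[X_i \mid \text{others}]$ is normal with mean
$$\frac{\{(Y_i - Z_i^T \beta_z)/\beta_x\}(\beta_x^2/\sigma_\varepsilon^2) + \mu_x (1/\sigma_x^2) + \bar W_i (1/\sigma_{w,i}^2)}{(\beta_x^2/\sigma_\varepsilon^2) + (1/\sigma_x^2) + (1/\sigma_{w,i}^2)}$$
and variance
$$\left\{ (\beta_x^2/\sigma_\varepsilon^2) + (1/\sigma_x^2) + (1/\sigma_{w,i}^2) \right\}^{-1}$$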
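The imputation step for the unobserved $X_i$ is a precision-weighted average of three sources of information: the outcome, the exposure model, and the replicate mean. A minimal NumPy sketch follows; the function names and argument layout are assumptions made for illustration.

```python
import numpy as np

def x_conditional(Y, Z, W_bar, J, beta_z, beta_x, mu_x, s2_eps, s2_x, s2_u):
    """Mean and variance of [X_i | others] for every i at once."""
    s2_w = s2_u / J                                   # sigma_{w,i}^2 = sigma_u^2 / J_i
    prec = beta_x**2 / s2_eps + 1.0 / s2_x + 1.0 / s2_w
    # (Y - Z beta_z) * beta_x / s2_eps equals {(Y - Z beta_z)/beta_x} * (beta_x^2 / s2_eps)
    # but stays finite when beta_x is zero.
    num = (Y - Z @ beta_z) * beta_x / s2_eps + mu_x / s2_x + W_bar / s2_w
    return num / prec, 1.0 / prec

def draw_x(Y, Z, W_bar, J, beta_z, beta_x, mu_x, s2_eps, s2_x, s2_u, rng):
    """One Gibbs draw of the whole vector (X_1, ..., X_N)."""
    mean, var = x_conditional(Y, Z, W_bar, J, beta_z, beta_x, mu_x, s2_eps, s2_x, s2_u)
    return rng.normal(mean, np.sqrt(var))
```

When $\beta_x = 0$ and $\sigma_x^2$ is very large, the outcome and the exposure model carry no information about $X_i$, and the conditional mean collapses to $\bar W_i$ with variance $\sigma_u^2 / J_i$.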
![Page 141: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/141.jpg)
Lecture 5 11
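MCMC: LINEAR REGRESSION, POSTERIOR FOR $\sigma_\varepsilon^2$
Prior:
$$[\sigma_\varepsilon^2] = \mathrm{IG}(\delta, \delta)$$
IG(α, β): density $\propto x^{-(\alpha+1)} \exp(-\beta/x)$; the reciprocal $1/x$ is Gamma distributed with mean $\alpha/\beta$ and variance $\alpha/\beta^2$
Posterior:
$$[\sigma_\varepsilon^2 \mid \text{others}] \propto \frac{1}{(\sigma_\varepsilon^2)^{\delta + N/2 + 1}} \exp\left\{ -\frac{\delta + \frac{1}{2} \sum_{i=1}^{N} (Y_i - X_i \beta_x - Z_i^T \beta_z)^2}{\sigma_\varepsilon^2} \right\}$$
$$\Rightarrow [\sigma_\varepsilon^2 \mid \text{others}] = \mathrm{IG}\left\{ \delta + \frac{N}{2},\; \delta + \frac{1}{2} \sum_{i=1}^{N} (Y_i - X_i \beta_x - Z_i^T \beta_z)^2 \right\}$$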
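NumPy has no inverse-gamma sampler in its `Generator` API, so one common route is to draw a Gamma variate and take its reciprocal. The sketch below assumes the IG(α, β) parameterization with density proportional to $x^{-(\alpha+1)} \exp(-\beta/x)$; the helper names are hypothetical.

```python
import numpy as np

def draw_inverse_gamma(shape, rate, rng, size=None):
    """If G ~ Gamma(shape, scale = 1/rate), then 1/G ~ IG(shape, rate)."""
    return 1.0 / rng.gamma(shape, 1.0 / rate, size=size)

def draw_sigma2_eps(resid, delta, rng):
    """Draw from IG(delta + N/2, delta + 0.5 * sum(resid^2)); resid = Y - X*beta_x - Z'beta_z."""
    N = resid.size
    return draw_inverse_gamma(delta + N / 2.0, delta + 0.5 * np.sum(resid**2), rng)

rng = np.random.default_rng(0)
resid = rng.normal(scale=0.3, size=500)      # stand-in residuals with true variance 0.09
sigma2_draw = draw_sigma2_eps(resid, delta=0.01, rng=rng)
```

The same helper serves the updates for $\sigma_x^2$ and $\sigma_u^2$ once the residuals are replaced by $X_i - \mu_x$ or $W_{i,j} - X_i$ and $N$ by the matching count.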
![Page 142: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/142.jpg)
Lecture 5 12
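MCMC: LINEAR REGRESSION, POSTERIOR FOR $\sigma_x^2$
$$[\sigma_x^2 \mid \text{others}] \propto \frac{1}{(\sigma_x^2)^{\delta + N/2 + 1}} \exp\left\{ -\frac{\delta + \frac{1}{2} \sum_{i=1}^{N} (X_i - \mu_x)^2}{\sigma_x^2} \right\}$$
$$\Rightarrow [\sigma_x^2 \mid \text{others}] = \mathrm{IG}\left\{ \delta + \frac{N}{2},\; \delta + \frac{1}{2} \sum_{i=1}^{N} (X_i - \mu_x)^2 \right\}$$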
![Page 143: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/143.jpg)
Lecture 5 13
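MCMC: LINEAR REGRESSION, POSTERIOR FOR $\sigma_u^2$
$$[\sigma_u^2 \mid \text{others}] \propto \frac{1}{(\sigma_u^2)^{\delta + \sum_{i=1}^{N} J_i/2 + 1}} \exp\left\{ -\frac{\delta + \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{J_i} (W_{i,j} - X_i)^2}{\sigma_u^2} \right\}$$
$$\Rightarrow [\sigma_u^2 \mid \text{others}] = \mathrm{IG}\left\{ \delta + \frac{\sum_{i=1}^{N} J_i}{2},\; \delta + \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{J_i} (W_{i,j} - X_i)^2 \right\}$$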
![Page 144: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/144.jpg)
Lecture 5 14
MIXTURES OF BERKSON AND CLASSICAL MEASUREMENT
ERROR
Mallick, Hoffman, and Carroll (2002) study models where both Berkson and classical
measurement errors are present.
• W = observed dose of radiation from fallout due to nuclear testing in Nevada
• W = C × DCF × I × TD × FP
∗ C = time-integrated radioiodine concentration of milk
– specific to producer, not individual (Berkson)
– but one component is deposition rate of I-131 across regions of study (Classical)
![Page 145: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/145.jpg)
Lecture 5 15
MIXTURES OF BERKSON AND CLASSICAL MEASUREMENT
ERROR
∗ DCF = ingestion dose conversion factor
– specific to age and isotope (Berkson) (Classical)
∗ I = individual milk intake rate in liters per day, measured by a food frequency
questionnaire (FFQ) (Berkson)
∗ TD = time delay (depends on source of milk, from FFQ) (Berkson)
∗ FP = frequency of purchase correction factor (from FFQ) (Berkson)
![Page 146: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/146.jpg)
Lecture 5 16
LATENT VARIABLE MODEL
• W = measured dose
• X = true dose
• L = latent intermediate variable
• Berkson and classical errors are combined in:
log(X) = log(L) + Ub
log(W ) = log(L) + Uc
∗ Ub = 0 ⇒ log(X) = log(L) ⇒ log(W ) = log(X) + Uc (Classical)
∗ Uc = 0⇒ log(W ) = log(L) ⇒ log(X) = log(W ) + Ub (Berkson)
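To make the two limiting cases concrete, the latent-variable model can be simulated directly. The sketch below is illustrative only: it splits a known total error variance $\sigma_b^2 + \sigma_c^2$ using a Berkson fraction `p` $= \sigma_b^2 / (\sigma_b^2 + \sigma_c^2)$, and the function name and the numerical values are assumptions.

```python
import numpy as np

def simulate_mixture(log_L, sigma2_total, p, rng):
    """Return (X, W) with log X = log L + U_b and log W = log L + U_c.

    var(U_b) = p * sigma2_total (Berkson part), var(U_c) = (1 - p) * sigma2_total (classical part).
    """
    s_b = np.sqrt(p * sigma2_total)
    s_c = np.sqrt((1.0 - p) * sigma2_total)
    U_b = rng.normal(0.0, s_b, size=log_L.shape)
    U_c = rng.normal(0.0, s_c, size=log_L.shape)
    return np.exp(log_L + U_b), np.exp(log_L + U_c)

rng = np.random.default_rng(0)
log_L = rng.normal(loc=-1.0, scale=1.0, size=1000)          # latent log dose (assumed values)
X_classical, W_classical = simulate_mixture(log_L, 0.5, 0.0, rng)   # p = 0: X = L
X_berkson, W_berkson = simulate_mixture(log_L, 0.5, 1.0, rng)       # p = 1: W = L
```

With `p = 0` the true dose equals the latent variable and W scatters around it (classical); with `p = 1` the measured dose equals the latent variable and X scatters around it (Berkson).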
![Page 147: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/147.jpg)
Lecture 5 17
LATENT VARIABLE MODEL
• $\sigma_b^2 + \sigma_c^2$ is known
• $p = \sigma_b^2 / (\sigma_b^2 + \sigma_c^2)$ is unknown
∗ informative priors:
– p = 1 (Berkson)
– p = 0 (Classical)
– p is uniform on [0.2, 0.8] (Mixture)
– compare results from these three priors (sensitivity analysis)
• Z = age at exposure, sex (known exactly)
• state = state of residence (Utah, Nevada, Arizona) (known exactly)
![Page 148: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/148.jpg)
Lecture 5 18
EXPOSURE AND OUTCOME MODELS
• Exposure model:
[ log(L)|Z, state = s ] ∼ N( αs0 + ZTαs1, σ2s )
• Outcome model:
∗ Y = indicator of thyroid cancer
∗ logit{pr(Y = 1|Z, X)} = β0 + ZTβ1 + log(1 + θX)
![Page 149: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/149.jpg)
Lecture 5 19
APPROXIMATE MODEL FOR [Y |W,Z]
• Outcome model (from previous page):
∗ Y = indicator of thyroid cancer
∗ logit {pr(Y = 1|Z, X)} = β0 + ZTβ1 + log(1 + θX)
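• Outcome model based on W:
$$\mathrm{logit}\{\mathrm{pr}(Y = 1 \mid Z, W)\} \approx \beta_0 + Z^T \beta_1 + \log\left(1 + \theta\, \gamma\, W^{\lambda_{x|w,t}}\right)$$
where
$$\gamma = \exp\left\{ \left(1 - \lambda_{x|w,t}\right) \left(\mu_{L|Z} + \sigma_{L|Z}^2/2\right) + \sigma_b^2/2 \right\}$$
– $\mu_{L|Z}$ and $\sigma_{L|Z}^2$ are the mean and variance of log(L) given Z
and
$$\lambda_{x|w,t} = \frac{\sigma_{L|Z}^2}{\sigma_{L|Z}^2 + \sigma_c^2}$$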
![Page 150: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/150.jpg)
Lecture 5 20
BERKSON CASE
• $\lambda_{x|w,t} = 1$
• $\gamma = \exp(\sigma_b^2/2)$
and model on previous page simplifies to
$$\mathrm{logit}\{\mathrm{pr}(Y = 1 \mid Z, W)\} \approx \beta_0 + Z^T \beta_1 + \log(1 + \theta\, \gamma\, W)$$
∗ γ > 1 so effect of Berkson error is for the naive analysis to overestimate the
effect of dose
∗ In the regression, replace W by γW to correct for bias (regression calibration)
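The quantities entering the approximate outcome model are simple closed forms, so they are easy to compute and check. The sketch below evaluates $\lambda_{x|w,t}$, $\gamma$, and the approximate linear predictor; the function names and the numbers used are hypothetical.

```python
import numpy as np

def attenuation_terms(mu_L, s2_L, s2_b, s2_c):
    """Return (lambda, gamma) for the approximate model of [Y | W, Z]."""
    lam = s2_L / (s2_L + s2_c)
    gamma = np.exp((1.0 - lam) * (mu_L + s2_L / 2.0) + s2_b / 2.0)
    return lam, gamma

def approx_logit(W, Z, beta0, beta1, theta, lam, gamma):
    """beta0 + Z'beta1 + log(1 + theta * gamma * W^lambda)."""
    return beta0 + Z @ beta1 + np.log1p(theta * gamma * W**lam)

# Berkson case: no classical error, so lambda = 1 and gamma = exp(sigma_b^2 / 2).
lam_b, gamma_b = attenuation_terms(mu_L=0.3, s2_L=1.2, s2_b=0.5, s2_c=0.0)

# Mixture case with some classical error (assumed values).
lam_m, gamma_m = attenuation_terms(mu_L=0.3, s2_L=1.2, s2_b=0.25, s2_c=0.25)
```

In the Berkson case the correction amounts to rescaling the measured dose by `gamma_b` before fitting, which is the regression-calibration step described on the slide.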
![Page 151: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/151.jpg)
Lecture 5 21
BAYESIAN ANALYSIS
• Several parametric and semiparametric models were used
• nonparametric model for dose is constrained to be monotonically increasing
• nonparametric model for the distribution of log(L) uses a Polya-tree prior (Lavine, 1992)
![Page 152: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/152.jpg)
Lecture 5 22
BAYESIAN ANALYSIS: SOME RESULTS FOR PARAMETRIC
MODELS

| Error model | Post. mean of θ | RR at 1 Gy | 95% credible interval for RR |
|---|---|---|---|
| No error | 38.9 | 9.4 | (4.5, 13.8) |
| Classical | 74.1 | 17.1 | (8.5, 24.6) |
| Berkson | 31.9 | 7.9 | (3.8, 11.4) |
| Mixture | 56.1 | 13.2 | (5.0, 23.1) |

• Effects on a naive analysis:
∗ Berkson error ⇒ overestimate effect
∗ Classical error ⇒ underestimate effect
∗ Mixture of errors ⇒ underestimate effect (but less than if only classical error)
![Page 153: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/153.jpg)
Lecture 5 23
BAYESIAN ANALYSIS: SEMIPARAMETRIC MODELS
• the semiparametric models have better fit by DIC
• the semiparametric model for log(L):
∗ leads to slightly smaller estimated posterior means of θ and RR
∗ credible intervals also widen
• the semiparametric outcome model:
∗ leads to slightly higher estimated posterior means of θ and RR
∗ credible intervals also widen
• Overall effect of both semiparametric models on mixture of errors model:
∗ raise RR from 13.2 to 14.2
∗ change CI from (5.0, 23.1) to (1.7, 33.6)
![Page 154: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/154.jpg)
Lecture 5 24
BAYESIAN ANALYSIS: ASSUMPTION OF INDEPENDENT ERROR
• almost certainly false
• C (radioiodine concentration) includes
∗ deposition of I-131 by region
∗ mass interception on vegetation
∗ consumption of vegetation by cows
∗ effective half-life of I-131 in vegetation
∗ milk transfer coefficient (MTC)
• Utah study generated a distribution of log-normally distributed MTCs with estimated mean and variance
∗ if parameters known, then error is Berkson
∗ parameters estimated from historical data and literature review
⇒ shared classical errors
![Page 155: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/155.jpg)
Lecture 5 25
SENSITIVITY ANALYSIS
• For the six gender/state combinations, Berkson errors for individuals in a group
had a common correlation of ρ

| ρ | 0.0 | 0.2 | 0.4 | 0.6 |
|---|---|---|---|---|
| $E(\theta \mid \text{data})$ | 56.1 | 65.9 | 84.1 | 95.1 |
| CI | (18.6, 102) | (21.4, 120) | (30.4, 143) | (38.2, 152) |
![Page 156: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/156.jpg)
Lecture 5 26
END OF LECTURE 5
![Page 157: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/157.jpg)
Lecture 6 1
LECTURE 6: NONPARAMETRIC REGRESSION WITH
MEASUREMENT ERROR
OUTLINE
• Earlier approaches: deconvolution kernels, SIMEX, regression calibration
• New Bayesian spline approach (Berry, Carroll, and Ruppert, 2002, JASA)
• Simulation results
• Example: A clinical trial
• Example: TSP and mortality, a sensitivity study
![Page 158: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/158.jpg)
Lecture 6 2
THE PROBLEM OF MEASUREMENT ERROR: ILLUSTRATION
[Figure: two panels. Left, “Gold Standard”: y against x, showing the data, the true curve, and a spline fit. Right, “Data with measurement error”: y against w.]
![Page 159: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/159.jpg)
Lecture 6 3
THE PROBLEM OF MEASUREMENT ERROR: ILLUSTRATION
[Figure: second example with two panels. Left, “Gold Standard”: y against x, showing the data, the true curve, and a spline fit. Right, “Data with measurement error”: y against w.]
![Page 160: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/160.jpg)
Lecture 6 4
THE PROBLEM OF MEASUREMENT ERROR
• The regression model is
$$Y = m(X) + \varepsilon$$
where m is only known to be smooth
• Observe
Y and W = X + U, where
– $E(U \mid X) = 0$ and $\mathrm{var}(U \mid X) = \sigma_u^2$
– $U \mid X$ normally distributed
• Measurement error variance $\sigma_u^2$ is estimated from internal replicate data. (Observe
$W_{ij}$, $j = 1, \dots, J_i$.)
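One standard way to estimate $\sigma_u^2$ from internal replicates is the pooled within-subject variance, $\hat\sigma_u^2 = \sum_i \sum_j (W_{ij} - \bar W_i)^2 / \sum_i (J_i - 1)$. The slides do not state which estimator is used, so the sketch below is an assumption; the function name is hypothetical.

```python
import numpy as np

def estimate_sigma2_u(W_reps):
    """Pooled within-subject variance from a list of replicate arrays (one array per subject)."""
    ss = sum(np.sum((w - w.mean()) ** 2) for w in W_reps)   # within-subject sums of squares
    df = sum(len(w) - 1 for w in W_reps)                    # subjects with J_i = 1 contribute nothing
    return ss / df

# Two subjects with J_1 = 2 and J_2 = 3 replicates (made-up numbers).
W_reps = [np.array([1.0, 3.0]), np.array([2.0, 2.0, 5.0])]
sigma2_u_hat = estimate_sigma2_u(W_reps)
```

Because each subject's own mean is subtracted, the estimate does not depend on the distribution of the true $X_i$.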
![Page 161: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/161.jpg)
Lecture 6 5
AVAILABLE METHODS
• Deconvolution kernels
• SIMEX
• Structural spline
• Bayesian spline
![Page 162: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/162.jpg)
Lecture 6 6
DECONVOLUTION KERNELS
• Globally consistent nonparametric regression by deconvolution kernels (Fan and
Truong, 1993, Annals)
– does not work so well
∗ Fan & Truong show very poor asymptotic rates of convergence
∗ simulations show poor finite-sample behavior
– no methodology for bandwidth selection or inference
![Page 163: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/163.jpg)
Lecture 6 7
SIMEX
• The SIMEX method is due to Cook & Stefanski (1995, JASA).
• SIMEX has been previously applied to parametric problems.
• Makes no assumptions about the true X’s. (Functional)
• Results in estimators which are approximately consistent, i.e., consistent at least
to order $O(\sigma_u^6)$.
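As a reminder of how SIMEX operates, here is a small parametric sketch for the slope of a straight-line regression: extra noise with variance $\lambda \sigma_u^2$ is added to W, the naive slope is averaged over simulations at each λ, and a quadratic in λ is extrapolated back to λ = −1. The function name, the λ grid, the number of simulations, and the quadratic extrapolant are all choices made for this illustration, not taken from the slides.

```python
import numpy as np

def simex_slope(W, Y, sigma2_u, lambdas=(0.0, 0.5, 1.0, 1.5, 2.0), B=50, rng=None):
    """Return (SIMEX slope, naive slope) for a regression of Y on the error-prone W."""
    rng = np.random.default_rng() if rng is None else rng
    means = []
    for lam in lambdas:
        slopes = []
        for _ in range(B):
            # Simulation step: inflate the measurement error variance to (1 + lam) * sigma2_u.
            W_b = W + np.sqrt(lam * sigma2_u) * rng.normal(size=W.shape)
            slopes.append(np.polyfit(W_b, Y, 1)[0])
        means.append(np.mean(slopes))
    # Extrapolation step: quadratic in lambda, evaluated at lambda = -1 (no measurement error).
    coef = np.polyfit(lambdas, means, 2)
    return np.polyval(coef, -1.0), means[0]

rng = np.random.default_rng(0)
n, sigma2_u = 5000, 0.5
X = rng.normal(size=n)
W = X + rng.normal(scale=np.sqrt(sigma2_u), size=n)
Y = 1.0 * X + rng.normal(scale=0.2, size=n)
simex_est, naive_est = simex_slope(W, Y, sigma2_u, rng=rng)
```

With a true slope of 1 and reliability 2/3, the naive slope is attenuated toward roughly 0.67, while the quadratic extrapolation recovers most, though not all, of the attenuation.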
![Page 164: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/164.jpg)
Lecture 6 8
SIMEX
• Carroll, Maca, Ruppert (2001, Biometrika) (CMR) applied the SIMEX to nonparametric regression.
• CMR have asymptotic theory in the local polynomial regression (LPR) context.
∗ The estimators have the usual rates of convergence.
∗ They are approximately consistent, to order $O(\sigma_u^6)$.
• An asymptotic theory with rates seems very difficult for splines
∗ but, simulations in CMR indicate that SIMEX/splines works a little better than
SIMEX/kernel
∗ problem seems due to undersmoothing
• Staudenmayer and Ruppert (2004, JRSS-B) look at bandwidth selection for SIMEX/LPR.
∗ With better bandwidth selection, SIMEX/LPR is competitive with other methods.
![Page 165: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/165.jpg)
Lecture 6 9
STRUCTURAL MODELING
• The regression of Y on the observed W is
$$E(Y \mid W) = E\{m(X) \mid W\} = \int m(x)\, f(x \mid W)\, dx.$$
• Suppose that we had:
∗ convenient flexible form for $m(x; \beta)$
∗ convenient flexible distribution for $f(x \mid W)$.
• Then we could estimate $m(X; \beta)$ by minimizing over the data
$$\sum_{i=1}^{n} \left\{ Y_i - \int m(x; \beta)\, f(x \mid W_i)\, dx \right\}^2.$$
![Page 166: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/166.jpg)
Lecture 6 10
REGRESSION SPLINES: “PLUS FUNCTION” BASIS
• Model
$$E(Y \mid X) = m(X; \beta) := \sum_{j=0}^{J} \beta_j X^j + \sum_{k=1}^{K} \beta_{k+J} (X - \xi_k)_+^J$$
[Figure: four panels plotted over x in [0, 1].]
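The plus-function (truncated power) basis is straightforward to construct: J + 1 polynomial columns followed by one truncated column per knot. The sketch below is a minimal NumPy version; the function name, the knot placement, and the test function are assumptions for illustration.

```python
import numpy as np

def plus_basis(x, knots, degree):
    """Design matrix with columns 1, x, ..., x^J, (x - xi_1)_+^J, ..., (x - xi_K)_+^J."""
    x = np.asarray(x, dtype=float)
    knots = np.asarray(knots, dtype=float)
    poly = np.vander(x, degree + 1, increasing=True)
    trunc = np.clip(x[:, None] - knots[None, :], 0.0, None) ** degree
    return np.hstack([poly, trunc])

# Unpenalized least-squares fit of a quadratic spline with 10 equally spaced interior knots.
x = np.linspace(0.0, 1.0, 200)
knots = np.linspace(0.0, 1.0, 12)[1:-1]
B = plus_basis(x, knots, degree=2)
y = np.sin(15.0 * x)
beta_hat, *_ = np.linalg.lstsq(B, y, rcond=None)
fitted = B @ beta_hat
```

The truncated power basis is easy to write down but can be numerically ill-conditioned with many knots; other bases spanning the same spline space (for example B-splines) are often preferred in practice.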
![Page 167: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/167.jpg)
Lecture 6 11
10-KNOT QUADRATIC SPLINE APPROXIMATIONS
[Figure: four panels over x in [0, 1], titled sin(15x), sin{5(1+x+x²)}, Φ{18(x−.05)}, and 0.8x+exp{−35(x−0.5)}.]
“blue” = function, “red” = spline approximation
![Page 168: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/168.jpg)
Lecture 6 12
REGRESSION SPLINES
• Model
$$E(Y \mid X) = m(X; \beta) := \sum_{j=0}^{J} \beta_j X^j + \sum_{k=1}^{K} \beta_{k+J} (X - \xi_k)_+^J$$
• $X^j$ is replaced by $E(X^j \mid W)$
• $(X - \xi_k)_+^J$ is replaced by $E\{(X - \xi_k)_+^J \mid W\}$
• The key remaining issue: the joint distribution of X and U.
∗ CMR used a mixture of normals for [X] and Gibbs sampling to estimate the
parameters. (Flexible structural)
– This is an extension to measurement error of an idea of Roeder & Wasserman (JASA, 1997).
![Page 169: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/169.jpg)
Lecture 6 13
FULLY BAYESIAN MODEL
What’s New? Answer: Fully Bayesian MCMC method in Berry, Carroll, and Ruppert (2002, JASA) (BCR)
• Uses splines
∗ smoothing or penalized
∗ P-splines in this lecture
• Structural
∗ Xi are iid normal
∗ but seems robust to violations of normality
![Page 170: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/170.jpg)
Lecture 6 14
FULLY BAYESIAN MODEL
• Smoothing parameter is automatic
• Inference adjusts for the data-based smoothing parameter and for measurement
error
• Allow global confidence bands
![Page 171: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/171.jpg)
Lecture 6 15
FULLY BAYESIAN MODEL — PARAMETERS
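• Regression Model
$$Y_i = m(x_i; \beta) + \varepsilon_i$$
– $m(x_i; \beta)$ is a P-spline
– $\varepsilon_i$ iid $N(0, \sigma_\varepsilon^2)$
• Measurement Error Model
$W_{ij} = X_i + U_{ij}$ where $U_{ij}$ iid $N(0, \sigma_u^2)$
• Structural Model
$X_i$ iid $N(\mu_x, \sigma_x^2)$
• Parameters: $\beta, \sigma_e^2, \sigma_u^2, \mu_x, \sigma_x^2$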
![Page 172: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/172.jpg)
Lecture 6 16
FULLY BAYESIAN MODEL — PARAMETERS
• Priors
– $\beta$ is $N(0, (\gamma K)^{-1})$ where $K$ is known. [$\alpha := \gamma \sigma_e^2$ is the smoothing parameter.]
– $\gamma$ is Gamma($A_\gamma, B_\gamma$)
– $\sigma_e^2$ is Inv-Gamma($A_e, B_e$)
– $\sigma_u^2$ is Inv-Gamma($A_u, B_u$)
– $\mu_x$ is $N(d_x, t_x^2)$
– $\sigma_x^2$ is Inv-Gamma($A_x, B_x$)
• Hyperparameters: $A_e, B_e, A_u, B_u, A_x, B_x, d_x, t_x^2, A_\gamma, B_\gamma$
– all fixed at values making the priors noninformative
∗ E.g., $t_x^2 = 10^6$.
![Page 173: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/173.jpg)
Lecture 6 17
FULLY BAYESIAN MODEL — SMOOTHING
• Model
$$E(Y \mid X) = m(X; \beta) := \sum_{j=0}^{J} \beta_j X^j + \sum_{k=1}^{K} \beta_{k+J} (X - \xi_k)_+^J$$
• $\beta$ is $N(0, (\gamma K)^{-1})$ where $K$ is known. [$\alpha := \gamma \sigma_e^2$ is the smoothing parameter.]
• Let
$$K = \mathrm{diag}(\Delta, \dots, \Delta, 1, \dots, 1)$$
∗ ∆ is “small”, essentially 0
∗ J + 1 ∆’s followed by K ones
∗ The prior on the jumps at the knots is that they are iid.
∗ The prior on the polynomial coefficients is “diffuse,” i.e., “non-informative”.
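The effect of this prior is easiest to see for a fixed penalty weight, where it reduces to a ridge-type penalized least-squares fit. The sketch below builds the diagonal penalty matrix and solves $(B^T B + \text{pen} \cdot K) \beta = B^T Y$. The names `penalty_matrix` and `pspline_fit`, the value used for ∆, and the generic weight `pen` are assumptions; whether `pen` corresponds to γ or to α = γσ²ₑ depends on the parameterization, which the slides do not spell out.

```python
import numpy as np

def penalty_matrix(degree, n_knots, small=1e-8):
    """diag(Delta, ..., Delta, 1, ..., 1): degree + 1 small entries, then n_knots ones."""
    return np.diag(np.r_[np.full(degree + 1, small), np.ones(n_knots)])

def pspline_fit(B, Y, pen, K):
    """Penalized least squares: solve (B'B + pen * K) beta = B'Y."""
    return np.linalg.solve(B.T @ B + pen * K, B.T @ Y)

# Quadratic plus-function basis with 5 knots on [0, 1] (assumed setup).
degree = 2
x = np.linspace(0.0, 1.0, 100)
knots = np.linspace(0.1, 0.9, 5)
B = np.hstack([np.vander(x, degree + 1, increasing=True),
               np.clip(x[:, None] - knots[None, :], 0.0, None) ** degree])
Y = np.sin(2.0 * np.pi * x)
K = penalty_matrix(degree, len(knots))

beta_rough = pspline_fit(B, Y, 1e-6, K)    # almost unpenalized
beta_smooth = pspline_fit(B, Y, 1e8, K)    # knot coefficients shrunk to nearly zero
```

A very large penalty forces the jumps at the knots toward zero, so the fit approaches a global quadratic, while the nearly unpenalized polynomial coefficients stay free.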
![Page 174: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/174.jpg)
Lecture 6 18
GIBBS SAMPLING
• Iterate through $\beta, \sigma_e^2, \sigma_u^2, \sigma_x^2, \mu_x, \gamma, X_1, \dots, X_n$.
• All steps except one are easy, either gamma, inverse-gamma, or normal
– E.g.,
$[\beta \mid \text{other parameters}, Y, W] \sim$ Normal
Mean $= (X^T X + \gamma K)^{-1} X^T Y$
Cov $= \sigma_e^2 (X^T X + \gamma K)^{-1}$.
∗ Here X is one of the “other parameters”
∗ Essentially we’re fitting a spline to the imputed X’s and the observed Y’s
![Page 175: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/175.jpg)
Lecture 6 19
GIBBS SAMPLING
∗ Estimate of $\beta$, call it $\hat\beta$, is
$$(X^T X + \gamma K)^{-1} X^T Y$$
averaged over $\gamma$ and $X$.
![Page 176: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/176.jpg)
Lecture 6 20
GIBBS SAMPLING
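• The exception to the sampling being quick and easy is that a Metropolis-Hastings
step is needed for $X_1, \dots, X_n$.
$$[X_i \mid \mu_x, \sigma_x^2, \beta, \sigma_e^2, \sigma_u^2, Y, W] \propto \exp\left[ -\frac{1}{2\sigma_u^2} \sum_{j=1}^{m_i} (W_{ij} - X_i)^2 - \frac{1}{2\sigma_e^2} \{Y_i - m(X_i; \beta)\}^2 - \frac{1}{2\sigma_x^2} (X_i - \mu_x)^2 \right].$$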
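A single Metropolis-Hastings update for one $X_i$ can be sketched as follows. The slides do not say which proposal is used, so the Gaussian random-walk proposal, the step size, and the function names here are assumptions; only the log target comes from the display above.

```python
import numpy as np

def log_target(x, W_i, y_i, m, mu_x, s2_x, s2_e, s2_u):
    """Log of the unnormalized conditional density of X_i."""
    return (-np.sum((W_i - x) ** 2) / (2.0 * s2_u)
            - (y_i - m(x)) ** 2 / (2.0 * s2_e)
            - (x - mu_x) ** 2 / (2.0 * s2_x))

def mh_step(x_cur, W_i, y_i, m, mu_x, s2_x, s2_e, s2_u, step, rng):
    """One random-walk Metropolis step; returns (new value, accepted flag)."""
    x_prop = x_cur + step * rng.normal()
    log_ratio = (log_target(x_prop, W_i, y_i, m, mu_x, s2_x, s2_e, s2_u)
                 - log_target(x_cur, W_i, y_i, m, mu_x, s2_x, s2_e, s2_u))
    if np.log(rng.uniform()) < log_ratio:
        return x_prop, True
    return x_cur, False

# Toy run for one subject with two replicates; np.sin stands in for the current spline m(.; beta).
rng = np.random.default_rng(3)
W_i, y_i = np.array([0.8, 1.4]), 0.9
x, n_acc = float(W_i.mean()), 0
for _ in range(500):
    x, accepted = mh_step(x, W_i, y_i, np.sin, 1.0, 1.0, 0.15**2, 1.0, 0.5, rng)
    n_acc += accepted
```

Because the proposal is symmetric, the acceptance ratio involves only the target density, and the normalizing constant never needs to be computed.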
![Page 177: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/177.jpg)
Lecture 6 21
BAYESIAN INFERENCE
• Let $X$ be the spline basis function evaluated on a fine grid over some interval, $[a, b]$.
• $X\beta$ is the curve on $[a, b]$.
• $X\hat\beta$ is the estimated curve.
• Let $K_\alpha$ be the $(1 - \alpha)$ MCMC sample quantile of
$$\max_{\text{grid}} \left\{ \frac{X(\beta - \hat\beta)}{\mathrm{SD}(X\beta)} \right\}.$$
• Then,
$$X\hat\beta \pm K_\alpha\, \mathrm{SD}(X\beta)$$
is a $100(1 - \alpha)\%$ simultaneous confidence band for the curve on $[a, b]$.
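Given MCMC draws of the curve on the grid, the band is a few lines of NumPy. In this sketch the maximum is taken over the absolute standardized deviation so that the band is two-sided; that choice, the function name, and the synthetic draws are assumptions of this illustration.

```python
import numpy as np

def simultaneous_band(curve_draws, alpha=0.05):
    """curve_draws has one row per MCMC draw of X @ beta on the grid.

    Returns (lower, upper, K) with K the (1 - alpha) sample quantile of
    max over the grid of |X(beta - beta_hat)| / SD(X beta).
    """
    est = curve_draws.mean(axis=0)                 # estimated curve
    sd = curve_draws.std(axis=0, ddof=1)           # pointwise posterior SD
    max_dev = np.max(np.abs(curve_draws - est) / sd, axis=1)
    K = np.quantile(max_dev, 1.0 - alpha)
    return est - K * sd, est + K * sd, K

# Synthetic stand-in for posterior draws of a curve on a 50-point grid.
rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 50)
draws = np.sin(2.0 * np.pi * grid) + rng.normal(scale=0.1, size=(1000, 50))
lower, upper, K = simultaneous_band(draws, alpha=0.05)
coverage = np.all((draws >= lower) & (draws <= upper), axis=1).mean()
```

By construction about 95% of the sampled curves lie entirely inside the band, and `K` exceeds the pointwise normal quantile of 1.96.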
![Page 178: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/178.jpg)
Lecture 6 22
BAYESIAN INFERENCE FOR DERIVATIVES
• Let $X'$ be derivatives of the spline basis function evaluated on a fine grid over $[a, b]$.
• $X'\beta$ is the curve’s derivative on $[a, b]$.
• $X'\hat\beta$ is the estimated derivative.
• Let $K'_\alpha$ be the $(1 - \alpha)$ MCMC sample quantile of
$$\max_{\text{grid}} \left\{ \frac{X'(\beta - \hat\beta)}{\mathrm{SD}(X'\beta)} \right\}.$$
• Then,
$$X'\hat\beta \pm K'_\alpha\, \mathrm{SD}(X'\beta)$$
is a $100(1 - \alpha)\%$ simultaneous confidence band for the derivative on $[a, b]$.
![Page 179: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/179.jpg)
Lecture 6 23
SIMULATIONS
Six cases were considered, with $n_i \equiv 2$ in each case.
![Page 180: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/180.jpg)
Lecture 6 24
SIMULATIONS
• [Case 1]
The regression function is
$$m(x) = \frac{\sin(\pi x/2)}{1 + 2x^2\{\mathrm{sign}(x) + 1\}}$$
with $n = 100$, $\sigma_\varepsilon^2 = 0.3^2$, $\sigma_u^2 = 0.8^2$, $\mu_x = 0$ and $\sigma_x^2 = 1$.
[Figure: plot of the regression function over x in [−2, 2].]
![Page 181: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/181.jpg)
Lecture 6 25
SIMULATIONS
• [Case 2] Same as Case 1 except $n = 200$.
• [Case 3] A modification of Case 1 above except that $n = 500$.
• [Case 4]
Case 1 of CMR so that
$$m(x) = 1000\, x_+^3 (1 - x)_+^3,$$
$x_+ = x I(x > 0)$, with $n = 200$, $\sigma_\varepsilon^2 = 0.0015^2$, $\sigma_u^2 = (3/7)\sigma_x^2$, $\mu_x = 0.5$ and $\sigma_x^2 = 0.25^2$.
[Figure: plot of the regression function over x in [−0.5, 1.5].]
![Page 182: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/182.jpg)
Lecture 6 26
SIMULATIONS
• [Case 5]
A modification of Case 4 of CMR so that
$$m(x) = 10 \sin(4\pi x),$$
with $n = 500$, $\sigma_\varepsilon^2 = 0.05^2$, $\sigma_u^2 = 0.141^2$, $\mu_x = 0.5$ and $\sigma_x^2 = 0.25^2$.
[Figure: plot of the regression function over x in [−0.5, 1.5].]
• [Case 6]
The same as Case 1 above except that X is a normalized chi-square(4) random
variable. (Tests robustness against violation of the structural assumptions.)
![Page 183: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/183.jpg)
Lecture 6 27
SIMULATIONS
Mean Squared Bias ×10²

| Method | Case 1 | Case 2 | Case 3 | Case 4 | Case 5 | Case 6 |
|---|---|---|---|---|---|---|
| Naive | 5.59 | 4.92 | 5.21 | 1,108 | 3,733 | 4.83 |
| Bayes | 0.78 | 0.38 | 1.04 | 17.4 | 468 | 1.74 |
| Flex. Structural, 5 knots | 1.38 | 0.62 | 0.46 | 3.7 | 838 | 1.47 |
| Flex. Structural, 15 knots | 1.44 | 0.60 | 0.66 | 3.3 | 226 | 1.75 |

Mean Squared Error ×10²

| Method | Case 1 | Case 2 | Case 3 | Case 4 | Case 5 | Case 6 |
|---|---|---|---|---|---|---|
| Naive | 6.91 | 5.57 | 5.38 | 1,155 | 3,793 | 5.77 |
| Bayes | 2.84 | 1.56 | 1.47 | 195 | 1,031 | 2.69 |
| Flex. Structural, 5 knots | 8.17 | 3.82 | 1.73 | 217 | 2,032 | 7.27 |
| Flex. Structural, 15 knots | 9.90 | 5.40 | 1.85 | 237 | 799 | 6.94 |

Results based on 200 Monte Carlo simulations for each case. SIMEX was not included in the table because it was not among the best estimators.
![Page 184: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/184.jpg)
Lecture 6 28
EXAMPLE — SIMULATED
• Y = sin(2X) + ε
• X is N(1, 1)
• σu = 1
• σe = 0.15
• n = 201
• ni = 2 for all i
• 15 knot quadratic P-splines
• 2,000 iterations of Gibbs. First 667 deleted as burn-in.
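The simulated data set described above can be generated in a few lines. The seed and variable names are arbitrary choices for this sketch; the model settings are the ones listed on the slide.

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_rep = 201, 2                  # sample size and replicates per subject
sigma_u, sigma_e = 1.0, 0.15

X = rng.normal(1.0, 1.0, size=n)                          # true predictor, N(1, 1)
Y = np.sin(2.0 * X) + rng.normal(0.0, sigma_e, size=n)    # Y = sin(2X) + eps
W = X[:, None] + rng.normal(0.0, sigma_u, size=(n, n_rep))  # replicate measurements
W_bar = W.mean(axis=1)

# Reliability of the replicate mean: var(X) / {var(X) + sigma_u^2 / n_rep}.
reliability = 1.0 / (1.0 + sigma_u**2 / n_rep)
```

With two replicates the reliability of $\bar W_i$ is 2/3, so a third of the variance of the observed predictor is measurement error.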
![Page 185: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/185.jpg)
Lecture 6 29
[Figure: four panels. “Gold Standard”: y against x. “Data with measurement error”: y against wbar. wbar against x. mhat against x, showing the fitted curve, the true curve, and the gold-standard fit.]
![Page 186: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/186.jpg)
Lecture 6 30
[Figure: trace plots against iteration (0 to 2000) for beta(1), yhat(1), yhat(51), yhat(101), x(1), sigmau, log(gamma), log(σe), and mux.]
Results of Gibbs Sampling. Every twentieth iteration.
Note: X(1) = −1.45 and W (1) = −0.8. Also, log(σe) = −1.9.
![Page 187: MEASUREMENT ERROR IN HEALTH STUDIEScarroll/talks/Carroll_Short_Course.pdf · † Lecture 2 — Data Types, Nondifferential Error, Estimating Attentuation, Exact Predictors, Berkson/Classical](https://reader034.fdocuments.us/reader034/viewer/2022050117/5f4e38c22ed32b67dd5b7ab3/html5/thumbnails/187.jpg)
Lecture 6 31
EXAMPLE — SIMULATED
Why does the Bayes approach work so well? Here’s my explanation:
Bayes uses all possible information to estimate X and, especially, m(X).
• ‖m(X) − E{m(X)|W, Y, other param.}‖ ≈ ‖m(X) − ave{m(X)}‖ = 2.47
• ‖m(X) − m(E{X|W, Y, other param.})‖ ≈ ‖m(X) − m(ave{X})‖ = 4.67
• ‖m(X) − m(E(X|W))‖ = 10.25
• ‖m(X) − m(W)‖ = 12.36
EXAMPLE — CLINICAL TRIAL
• Study of a psychiatric medication.
• Treatment and control group.
• Evaluation at baseline (W) and at end of study (Y).
– smaller values → more severe disease
– scale is a combination of self-report and clinical interview so there is considerable measurement error
– it is believed that σ_u^2 ≈ 0.35.
• We are interested in ∆(X) := m(X) − X = E(Y − W | X).
• Preliminary Wilcoxon test found a highly significant treatment effect.
• Question: How does the treatment effect depend upon the baseline value?
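The identity ∆(X) = m(X) − X = E(Y − W | X) can be checked in one line. The sketch below assumes the additive model Y = m(X) + ε and W = X + U with E(ε | X) = E(U | X) = 0; this model is not restated on the slide, so treat it as an assumption.

```latex
\begin{align*}
E(Y - W \mid X) &= E\{m(X) + \epsilon \mid X\} - E(X + U \mid X) \\
                &= m(X) - X = \Delta(X).
\end{align*}
```

Because smaller scores indicate more severe disease, ∆(X) > 0 would correspond to an expected improvement for a patient whose true baseline score is X.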
[Figure: Change versus True Baseline Score, with curves for Treatment and Control.]
[Figure: Treatment Effect versus True Baseline Score.]
MAXIMUM LIKELIHOOD VIA MCEM
Ganguli, Staudenmayer, and Wand (2005, to appear):
• use the same model as Berry, Carroll, and Ruppert (2002)
• compute MLEs using van Dyk’s (2002) nested Monte Carlo EM algorithm
• their estimator works well
∗ this suggests that the success of the Berry, Carroll, and Ruppert method is due to its use of the likelihood rather than to the information in the prior
• Ganguli, Staudenmayer, and Wand extend their method to additive models
EXAMPLE — A SENSITIVITY ANALYSIS
Original study from Zanobetti, Wand, Schwartz, and Ryan (2000).
This analysis is from Ganguli, Staudenmayer, and Wand (2005, to appear); also see Ruppert, Wand, and Carroll (2003, Semiparametric Regression).
• Outcome = log(natural mortality)
• Exposure
∗ TSP = total suspended particles
• Confounders
∗ DAY = days since beginning of study
∗ Mean daily temperature
∗ Relative humidity
EXAMPLE — A SENSITIVITY ANALYSIS
reliability ratio = var{log(TSP_i)} / [var{log(TSP_i)} + σ_u^2].
Problem: No replicates, so the measurement error variance σ_u^2 cannot be estimated.
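Since σ_u^2 cannot be estimated, a sensitivity analysis can fix the reliability ratio at several plausible values and back out the implied σ_u^2 for each. The sketch below assumes classical additive error on the log scale, W = X + U with U independent of X, so that var(W) = var(X) + σ_u^2, and it reads var{log(TSP_i)} in the formula above as the variance of the true exposure X. The function name and the sample variance used are hypothetical.

```python
def sigma_u2_from_reliability(var_w, reliability):
    """Measurement error variance implied by an assumed reliability ratio.

    Assumes classical error W = X + U with U independent of X, so that
    var(W) = var(X) + sigma_u^2 and reliability = var(X) / var(W).
    Solving for sigma_u^2 gives (1 - reliability) * var(W).
    """
    if not 0.0 < reliability <= 1.0:
        raise ValueError("reliability ratio must lie in (0, 1]")
    return (1.0 - reliability) * var_w


# Hypothetical sample variance of the measured log(TSP) values.
var_w = 0.25

# A reliability ratio of 1.0 corresponds to no measurement error.
for rel in (1.0, 0.9, 0.8, 0.7):
    s2 = sigma_u2_from_reliability(var_w, rel)
    print(f"reliability ratio = {rel:.1f} -> sigma_u^2 = {s2:.4f}")
```

Each implied σ_u^2 would then be plugged into the measurement error fit in place of an estimate, and the fitted curves compared across reliability ratios.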
EXAMPLE — A SENSITIVITY ANALYSIS
[Figure: estimated additive model terms, four panels. (1) Term for log(TSP) versus Log(TSP), with curves for No Measurement Error and for rel. ratio = 0.9, 0.8, and 0.7. (2) Term for Day versus Day. (3) Term for Temperature versus Mean Daily Temperature. (4) Term for Humidity versus Relative Humidity.]
DISCUSSION
• With the work of CMR and BCR we now have reasonably efficient estimators for nonparametric regression with measurement error.
– SIMEX (LPR and splines), in CMR
– (Flexible) Structural splines, in CMR
– Fully Bayesian (hardcore structural), in BCR
DISCUSSION
• With BCR we have a methodology that
– automatically selects the amount of smoothing
– estimates the unknown X’s
– allows inference that takes account of the effects of smoothing parameter selection and measurement error
• Most efficient methods appear to be structural, though SIMEX may be competitive
– hardcore structural methods seem reasonably robust
END OF LECTURE 6
THANK YOU FOR YOUR ATTENTION!!!