Raymond J. Carroll Texas A&M University stat.tamu/~carroll [email protected]
description
Transcript of Raymond J. Carroll Texas A&M University stat.tamu/~carroll [email protected]
![Page 1: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/1.jpg)
Raymond J. CarrollTexas A&M University
http://stat.tamu.edu/[email protected]
Postdoctoral Training Program:http://stat.tamu.edu/B3NC
Non/Semiparametric Regression and Clustered/Longitudinal Data
![Page 2: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/2.jpg)
College Station, home of Texas A&M University
I-35I-45
Big Bend National Park
Wichita Falls, my hometown
West Texas
Palo DuroCanyon, the Grand Canyon of Texas
Guadalupe Mountains National Park
East Texas
Midland
![Page 3: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/3.jpg)
Raymond Carroll Alan Welsh
Naisyin Wang Enno Mammen Xihong Lin
Oliver Linton
Acknowledgments
Series of papers are on my web site
Lin, Wang and Welsh: Longitudinal data (Mammen & Linton for pseudo-observation methods)
Linton and Mammen: time series data
![Page 4: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/4.jpg)
Outline
• Longitudinal models: • panel data
• Background: • splines = kernels for independent data
• Correlated data: • do splines = kernels?
![Page 5: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/5.jpg)
Panel Data (for simplicity)
• i = 1,…,n clusters/individuals• j = 1,…,m observations per cluster
Subject Wave 1 Wave 2 … Wave m
1 X X X
2 X X X
… X
n X X X
![Page 6: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/6.jpg)
Panel Data (for simplicity)
• i = 1,…,n clusters/individuals• j = 1,…,m observations per cluster• Important points:
• The cluster size m is meant to be fixed• This is not a multiple time series
problem where the cluster size increases to infinity
• Some comments on the single time series problem are given near the end of the talk
![Page 7: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/7.jpg)
The Marginal Nonparametric Model
• Y = Response• X = time-varying covariate
• Question: can we improve efficiency by accounting for correlation?
ij ij ij
ij
Y =Θ(X )+εΘ • unknownfunctioncov(ε )=Σ
![Page 8: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/8.jpg)
Nonstandard Example: Colon carcinogenesis experiments.
Cell-Based Measures in single rats of DNA damage, differentiation, proliferation, apoptosis, P27, etc.
![Page 9: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/9.jpg)
The Marginal Nonparametric Model
• Important assumption• Covariates at other waves are not
conditionally predictive, i.e., they are surrogates
• This assumption is required for any GLS fit, including parametric GLS
ij ij ik ijE(Y| X ,X for k j)=Θ(X )
![Page 10: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/10.jpg)
Independent Data
• Splines (smoothing, P-splines, etc.) with penalty parameter =
• Ridge regression fit• Some bias, smaller variance• is over-parameterized least squares• is a polynomial regression
n T 2
i i i ii 1
Y (X ) Y (mini X )m tize ( ) dt
0
![Page 11: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/11.jpg)
Independent Data
• Kernels (local averages, local linear, etc.), with kernel density function K and bandwidth h
• As the bandwidth h 0, only observations with X near t get any weight in the fit
n1 i
ii 1
n1 i
i 1
t
t
Xn Y K h(t) Xn K h
![Page 12: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/12.jpg)
Independent Data
• Major methods• Splines • Kernels
• Smoothing parameters required for both
• Fits: similar in many (most?) datasets• Expectation: some combination of bandwidths
and kernel functions look like splines
12
![Page 13: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/13.jpg)
Independent Data
• Splines and kernels are linear in the responses
• Silverman showed that there is a kernel function and a bandwidth so that the weight functions
are asymptotically equivalent • In this sense, splines = kernels
n
-1n i i
i=1Θ( ) = n G ( ,t Xt )Y
nG (t,x)
![Page 14: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/14.jpg)
The weight functions Gn(t=.25,x) in a specific case for independent data
Kernel Smoothing Spline
Note the similarity of shape and the locality: only X’s near t=0.25 get any weight
![Page 15: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/15.jpg)
Working Independence
• Working independence: Ignore all correlations
• Fix up standard errors at the end• Advantage: the assumption
is not required• Disadvantage: possible severe loss of
efficiency if carried too far
ij ij ik ijE(Y| X ,X for k j)=Θ(X )
![Page 16: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/16.jpg)
Working Independence
• Working independence: • Ignore all correlations• Should posit some reasonable marginal variances• Weighting important for efficiency
• Weighted versions: Splines and kernels have obvious analogues
• Standard method: Zeger & Diggle, Hoover, Rice, Wu & Yang, Lin & Ying, etc.
![Page 17: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/17.jpg)
Working Independence
• Working independence: • Weighted splines and weighted kernels are
linear in the responses• The Silverman result still holds• In this sense, splines = kernels
![Page 18: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/18.jpg)
Accounting for Correlation• Splines have an obvious analogue for non-
independent data• Let be a working covariance matrix
• Penalized Generalized least squares (GLS)
• GLS ridge regression• Because splines are based on likelihood ideas,
they generalize quickly to new problems
n T 2
i i i ii 1
-1wY (X ) Y (X ) (t) dtΣ
wΣ
![Page 19: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/19.jpg)
Accounting for Correlation
• Splines have an obvious analogue for non-independent data
• Kernels are not so obvious• One can do theory with kernels
• Local likelihood kernel ideas are standard in independent data problems
• Most attempts at kernels for correlated data have tried to use local likelihood kernel methods
![Page 20: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/20.jpg)
Kernels and Correlation
• Problem: how to define locality for kernels?• Goal: estimate the function at t• Let be a diagonal matrix of standard
kernel weights • Standard Kernel method: GLS pretending
inverse covariance matrix is
• The estimate is inherently local
iK(t,X )
-1/ 2 1/w
2i
1iK (t,X ) K (t,XΣ )
![Page 21: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/21.jpg)
Kernels and CorrelationSpecific case: m=3, n=35
Exchangeable correlation structure
Red: = 0.0
Green: = 0.4
Blue: = 0.8
Note the locality of the kernel method
The weight functions Gn(t=.25,x) in a specific case
18
![Page 22: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/22.jpg)
Splines and CorrelationSpecific case: m=3, n=35
Exchangeable correlation structure
Red: = 0.0
Green: = 0.4
Blue: = 0.8
Note the lack of locality of the spline method
The weight functions Gn(t=.25,x) in a specific case
![Page 23: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/23.jpg)
Splines and CorrelationSpecific case: m=3, n=35
Complex correlation structure
Red: Nearly singular
Green: = 0.0
Blue: = AR(0.8)
Note the lack of locality of the spline method
The weight functions Gn(t=.25,x) in a specific case
![Page 24: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/24.jpg)
Splines and Standard Kernels
• Accounting for correlation:• Standard kernels remain local• Splines are not local
• Numerical results can be confirmed theoretically
![Page 25: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/25.jpg)
Results on Kernels and Correlation
• GLS with weights
• Optimal working covariance matrix is working independence!
• Using the correct covariance matrix • Increases variance• Increases MSE• Splines Kernels (or at least these
kernels)
-1/ 2 1/w
2i
1iK (t,X ) K (t,XΣ )
24
![Page 26: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/26.jpg)
Pseudo-Observation Kernel Methods
• Better kernel methods are possible• Pseudo-observation: original method• Construction: specific linear
transformation of Y• Mean = (X)• Covariance = diagonal matrix
• This adjusts the original responses without affecting the mean
* 1i i i i
-1/ 2w= diag( )Σ
Y Y ( ) Y (X )
![Page 27: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/27.jpg)
Pseudo-Observation Kernel Methods
• Construction: specific linear transformation of Y• Mean = (X)• Covariance = diagonal
• Iterative:• Efficiency: More efficient than working
independence• Proof of Principle: kernel methods can be
constructed to take advantage of correlation
* 1i i i iY Y ( ) Y (X )
![Page 28: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/28.jpg)
0
0.5
1
1.5
2
2.5
3
3.5
Excnhg AR Near Sing
SplineP-kernel
Efficiency of Splines and Pseudo-Observation Kernels
Exchng: Exchangeable with correlation 0.6
AR: autoregressive with correlation 0.6
Near Sing: A nearly singular matrix
![Page 29: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/29.jpg)
What Do GLS Splines Do?
• GLS Splines are really working independence splines using pseudo-observations
• Let
• GLS Splines are working independence splines
-1 jkjk
Σ = σ
jjweight =σs
jk
*ij ij ik ikjjk j
σpseudo Y =Y + Y (- X )σobs
![Page 30: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/30.jpg)
GLS Splines and SUR Kernels
• GLS Splines are working independence splines
• Algorithm: iterate until convergence• Idea: for kernels, do same thing• This is Naisyin Wang’s SUR method
-1 jkjk
Σ = σ jjweight =σs
jk
*ij ij ik ikjjk j
σpseudo Y =Y + Y (- X )σobs
![Page 31: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/31.jpg)
Better Kernel Methods: SUR• Another view • Consider current state in iteration• For every j,
• assume function is fixed and known for• Use the seemingly unrelated regression (SUR)
idea • For j, form estimating equation for local
averages/linear for jth component only using GLS with weights
• Sum the estimating equations together, and solveij
-1wK(t,X )Σ
j k
![Page 32: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/32.jpg)
SUR Kernel Methods
• It is well known that the GLS spline has an exact, analytic expression
• We have shown that the SUR kernel method has an exact, analytic expression
• Both methods are linear in the responses• Relatively nontrivial calculations show that
Silverman’s result still holds• Splines = SUR Kernels
![Page 33: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/33.jpg)
Nonlocality
• The lack of locality of GLS splines and SUR kernels is surprising
• Suppose we want to estimate the function at t
• Result: All observations in a cluster contribute to the fit, not just those with covariates near t
• Locality: Defined at the cluster level
![Page 34: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/34.jpg)
Nonlocality
• Wang’s SUR kernels = pseudo kernels with a clever linear transformation. Let
• SUR kernels are working independence kernels
1 jkjk
jjweight =σs
jk
*ij ij ik ikjjk j
σpseudo Y =Y + Y (- X )σobs
![Page 35: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/35.jpg)
Locality of Splines
• Splines = SUR kernels (Silverman-type result)
• GLS spline: • Iterative• standard independent spline smoothing• SUR pseudo-observations at each iteration
• GLS splines are not local• GLS splines are local in (the same!) pseudo-
observations
![Page 36: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/36.jpg)
Time Series Problems
• Time series problems: many of the same issues arise
• Original pseudo-observation method• Two stages• Linear transformation• Mean (X)• Independent errors• Single standard kernel applied
• Potential for great gains in efficiency (even infinite for AR problems with large correlation)
![Page 37: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/37.jpg)
Time Series: AR(1) Illustration, First Pseudo Observation Method
• AR(1), correlation :
• Regress Yt0 on Xt
0t t t-1 t-1Y =Y - Y -Θ(Xρ )
t t-1 tε - ε =u (whitenρ oise)
![Page 38: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/38.jpg)
Time Series Problems
• More efficient methods can be constructed• Series of regression problems: for all j,
• Pseudo observations • Mean• White noise errors• Regress for each j: fits are asymptotically
independent• Then weighted average
• Time series version of SUR-kernels for longitudinal data?
jiY
t j(X )
![Page 39: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/39.jpg)
Time Series: AR(1) Illustration, New Pseudo Observation Method
• AR(1), correlation :
• Regress Yt0 on Xt and Yt
1 on Xt-1
• Weights: 1 and 2
0t t t-1 t-1Y =Y - Y -Θ(Xρ )
t t-1 tρε - ε =u
1 -1t t-1 t tY =Y - Y -Θ(Xρ )
![Page 40: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/40.jpg)
Time Series Problems
• AR(1) errors with correlation • Efficiency of original pseudo-observation
method to working independence:
• Efficiency of new (SUR?) pseudo-observation method to original method:
21 : as 11
21 : 2 as 1
36
![Page 41: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/41.jpg)
The Semiparametric Model
• Y = Response• X,Z = time-varying covariates
• Question: can we improve efficiency for by accounting for correlation?
ij ij ij ij
ij
Y =Z +Θ(X )+εcov(ε
β)=Σ
![Page 42: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/42.jpg)
Profile Methods
• Given , solve for say• Basic idea: Regress
• Working independence• Standard kernels• Pseudo –observations kernels• SUR kernels
*ij ij ij ijY =Y -Z on Xβ
ijΘ(X ,β)
ij ij ij ij
ij
Y =Z +Θ(X )+εcov(ε
β)=Σ
![Page 43: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/43.jpg)
Profile Methods
• Given , solve for say• Then fit GLS or W.I. to the model with mean
• Question: does it matter what kernel method is used?
• Question: How bad is using W.I. everywhere?• Question: are there efficient choices?
ij ijZ + (XΘβ ,β)
ijΘ(X ,β)
ij ij ij ij
ij
Y =Z +Θ(X )+εcov(ε
β)=Σ
![Page 44: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/44.jpg)
The Semiparametric Model: Special Case
• If X does not vary with time, simple semiparametric efficient method available
• The basic point is that has common mean and covariance matrix
• If were a polynomial, GLS likelihood methods would be natural
ij ij i ijY =Z +Θ(X )+εβ
ij ijY -Z βiΘ(X ) Σ
Θ( )
![Page 45: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/45.jpg)
The Semiparametric Model: Special Case
• Method: Replace polynomial GLS likelihood with GLS local likelihood with weights
• Then do GLS on the derived variable
• Semiparametric efficient
ij ij i ijY =Z +Θ(X )+εβ
iXK ht
ij ijZ + (XΘβ ,β)
![Page 46: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/46.jpg)
Profile Method: General Case• Given , solve for say• Then fit GLS or W.I. to the model with mean
• In this general case, how you estimate matters• Working independence• Standard kernel• Pseudo-observation kernel• SUR kernel
ij ijZ + (XΘβ ,β)
ijΘ(X ,β)
![Page 47: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/47.jpg)
Profile Methods• In this general case, how you estimate
matters• Working independence• Standard kernel• Pseudo-observation kernel• SUR kernel
• The SUR method leads to the semiparametric efficient estimate
• So too does the GLS spline
![Page 48: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/48.jpg)
Age .014 .035 .010 .033 .008 .032
# of Smokes .984 .192 .549 .144 .579 .139
Drug Use? 1.05 .53 .58 .33 .58 .33
# of Partners
-.054 .059 .080 .038 .078 .039
Depression? -.033 .021 -.045 .013 -.046 .014
Longitudinal CD4 Count Data (Zeger and Diggle)
Working IndependenceEst. s.e.Est. s.e. Est. s.e.
SemiparametricGLS Z-D
SemiparametricGLS refit
![Page 49: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/49.jpg)
Conclusions (1/3): Nonparametric Regression
• In nonparametric regression• Kernels = splines for working
independence (W.I.)• Working independence is inefficient• Standard kernels splines for
correlated data
![Page 50: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/50.jpg)
Conclusions (2/3): Nonparametric Regression
• In nonparametric regression• Pseudo-observation methods improve
upon working independence• SUR kernels = splines for correlated
data• Splines and SUR kernels are not local• Splines and SUR kernels are local in
pseudo-observations
![Page 51: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/51.jpg)
Conclusions (3/3): Semiparametric Regression
• In semiparametric regression• Profile methods are a general class• Fully efficient parameter estimates are
easily constructed if X is not time-varying• When X is time-varying, method of
estimating affects properties of parameter estimates•Using SUR kernels or GLS splines as the
nonparametric method leads to efficient results• Conclusions can change between working
independence and semiparametric GLS
![Page 52: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/52.jpg)
Conclusions: Splines versus Kernels
• One has to be struck by the fact that all the grief in this problem has come from trying to define kernel methods
• At the end of the day, they are no more efficient than splines, and harder and more subtle to define
• Showing equivalence as we have done suggests the good properties of splines
![Page 53: Raymond J. Carroll Texas A&M University stat.tamu/~carroll carroll@stat.tamu](https://reader035.fdocuments.us/reader035/viewer/2022062315/56815c02550346895dc9ea79/html5/thumbnails/53.jpg)
The decrease in s.e.’s is in accordance with our theory. The other phenomena are more difficult to explain. Nonetheless, they are not unique to semiparametric GEE methods.
Similar discrepant outcomes occurred in parametric GEE estimation in which (t) was replaced by a cubic regression function in time. Furthermore, we simulated data using the observed covariates but having responses generated from the multivariate normal with mean equal to the fitted mean in the parametric correlated GEE estimation, and with correlation given by Zeger and Diggle.
The level of divergence between two sets of results in the simulated data was fairly consistent with what appeared in the Table. For example, among the first 25 generated data sets, 3 had different signs in sex partners and 7 had the scale of drug use coefficient obtained by WI 1.8 times or larger than what was obtained by the proposed method.
The Numbers in the Table