SUG London 2007 Least Angle Regression
Translating the S-Plus/R Least Angle
Regression package to Mata
Adrian Mander
MRC-Human Nutrition Research Unit, Cambridge
Outline
LARS package
Lasso (the constrained OLS)
Forward Stagewise regression
Least Angle Regression
Translating Hastie & Efron’s code from R to Mata
The lars Stata command
Lasso
Let y be the dependent variable and x_j, j = 1, ..., m, the covariates.

The usual linear predictor is

$\hat{\mu}_i = \sum_{j=1}^{m} x_{ij}\hat{\beta}_j$

Want to minimise the squared differences

$S(\hat{\beta}) = \sum_{i=1}^{n} (y_i - \hat{\mu}_i)^2$

subject to the L1-norm constraint

$T(\hat{\beta}) = \sum_{j=1}^{m} |\hat{\beta}_j| \le t$

N.B. Ridge regression instead constrains the L2 norm, $\sum_{j=1}^{m} \hat{\beta}_j^2 \le t$.
Subject to this constraint, a large t gives the OLS solution.
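The constrained problem above is usually solved in the equivalent penalised form, minimising (1/2n)||y - Xb||^2 + λ||b||_1 (large t corresponds to small λ). As a minimal sketch of that, in Python rather than the talk's Mata, with my own function names (`lasso_cd`, `soft_threshold`) and assuming standardised covariates and centred y, coordinate descent with soft-thresholding:

```python
import numpy as np

def soft_threshold(z, g):
    """Scalar lasso solution: shrink z towards 0 by g, exactly to 0 if |z| <= g."""
    return np.sign(z) * max(abs(z) - g, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate-descent lasso for (1/(2n))||y - Xb||^2 + lam*||b||_1.
    Assumes each column of X is standardised (mean 0, x_j'x_j/n = 1)
    and y is centred; lam = 0 recovers OLS."""
    n, m = X.shape
    b = np.zeros(m)
    for _ in range(n_iter):
        for j in range(m):
            r = y - X @ b + X[:, j] * b[j]   # partial residual excluding x_j
            z = X[:, j] @ r / n              # univariate OLS coefficient
            b[j] = soft_threshold(z, lam)
    return b
```

Shrinking t (i.e. raising λ) drives some coefficients exactly to zero, which is the parsimony property discussed on the next slide.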
Lasso graphically
The constraints can be seen below.
[Figure: the L1 constraint region in the (b1, b2) plane]
One property of this constraint is that there will be coefficients = 0 for a subset of variables
Ridge Regression
The constraints can be seen below.
[Figure: the L2 constraint region in the (b1, b2) plane]
The coefficients are shrunk, but ridge does not have the property of parsimony
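Ridge has a closed-form solution, which makes the contrast easy to demonstrate. A small Python sketch (my own construction, not from the talk; the name `ridge` is mine):

```python
import numpy as np

def ridge(X, y, lam):
    """Closed-form ridge solution, minimising ||y - Xb||^2 + lam*||b||^2.
    Every coefficient is shrunk towards zero, but (generically) none
    reaches exactly zero, so ridge does no variable selection."""
    m = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(m), X.T @ y)
```

Compared with the lasso, the L2 penalty rounds the constraint region, so the solution almost never lands on a coordinate axis.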
Forward Stagewise
Builds the fit in many small increments.
The current correlations are $\hat{c} = X^{T}(y - \hat{\mu})$.
Move the mean in the direction of the greatest correlation for some small ε:
$\hat{\mu} \to \hat{\mu} + \varepsilon\,\mathrm{sign}(\hat{c}_j)\,x_j$, where $j = \arg\max_k |\hat{c}_k|$.
(FORWARD STEPWISE, by contrast, is greedy: it selects the most correlated predictor and fits its coefficient fully by least squares.)
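The ε-step update can be sketched in a few lines, an illustration in Python rather than the talk's Mata (the name `forward_stagewise` is mine; columns of X assumed standardised):

```python
import numpy as np

def forward_stagewise(X, y, eps=0.01, n_steps=4000):
    """Incremental forward stagewise: at each step find the predictor
    most correlated with the current residual and nudge its coefficient
    by eps in the direction of that correlation."""
    n, m = X.shape
    b = np.zeros(m)
    r = y.astype(float).copy()
    for _ in range(n_steps):
        c = X.T @ r                    # current correlations X'(y - mu)
        j = int(np.argmax(np.abs(c)))  # greatest absolute correlation
        step = eps * np.sign(c[j])
        b[j] += step
        r -= step * X[:, j]            # keep the residual in sync
    return b
```

Run long enough, the tiny steps creep up to the least-squares fit; with ε small the coefficient path closely resembles the lasso path, which is part of what motivates LARS.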
Least Angle Regression
LARS (the S suggesting LaSso and Stagewise)
Starts like classic Forward Selection:
Find the predictor x_j1 most correlated with the current residual.
Take a step just large enough that another predictor x_j2 has as much correlation with the current residual.
LARS then steps in the direction equiangular between the two predictors until x_j3 earns its way into the "correlated set".
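The first step, stopping exactly when a second predictor ties in absolute correlation, can be sketched as follows. This is my own Python illustration, not the talk's code: it assumes centred, unit-norm columns, and `lars_first_step` computes the tie-step length from the standard LARS crossing condition.

```python
import numpy as np

def lars_first_step(X, y):
    """First LARS step, for unit-norm centred columns of X:
    move the fit along the most correlated predictor exactly far
    enough that a second predictor ties in absolute correlation
    with the residual."""
    c = X.T @ y                      # current correlations
    j1 = int(np.argmax(np.abs(c)))
    C = abs(c[j1])                   # the leading correlation
    u = np.sign(c[j1]) * X[:, j1]    # unit direction of travel
    a = X.T @ u                      # rate at which each correlation changes
    gamma = np.inf                   # smallest positive tie step
    for k in range(X.shape[1]):
        if k == j1:
            continue
        # tie when C - g = +/-(c_k - g*a_k), giving two candidate roots
        for num, den in ((C - c[k], 1 - a[k]), (C + c[k], 1 + a[k])):
            if den > 1e-12 and 1e-12 < num / den < gamma:
                gamma = num / den
    return gamma * u, j1             # new fit mu and the active index
```

After the first step the "equiangular" direction is just x_j1 itself; once two predictors are active, LARS replaces u with the direction making equal angles with both.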
Least Angle Regression Geometrically
Two covariates x1 and x2 and the space L(x1 ,x2) that is
spanned by them
[Figure: the vectors x1, x2, y1, y2 and the fits μ0, μ1 in the plane L(x1, x2)]
y2 is the projection of y onto L(x1, x2)
Start at μ0 = 0
Continued…
The current correlations only depend on the projection of y onto L(x1, x2), i.e. y2
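This invariance is easy to verify numerically; a small self-contained check (my own construction, in Python): since y - y2 is orthogonal to every column of X, replacing y by its projection y2 leaves X'(y - μ) unchanged for any fit μ.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((20, 2))      # two covariates x1, x2
y = rng.standard_normal(20)

# Projection of y onto L(x1, x2), the span of the covariates
P = X @ np.linalg.inv(X.T @ X) @ X.T
y2 = P @ y

mu = 0.7 * X[:, 0] - 0.2 * X[:, 1]    # any current fit mu in the span
corr_y = X.T @ (y - mu)               # correlations computed from y
corr_y2 = X.T @ (y2 - mu)             # correlations computed from y2
```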
Programming similarities
Comparing the S-Plus and Mata code side by side, they look remarkably similar
Programming similarities
There are some differences though:
Arrays of arrays… beta[[k]] = array
Indexing on the left-hand side… beta[positive] = beta0
Being able to "join" null matrices
Row and column vectors are not very strict in S-Plus
Being able to use the minus sign in indexing: beta[-positive]
"Local"-ness of Mata functions within Mata functions? Local is from the first call of Mata
Not the easiest language to debug when you don't know what you are doing (thanks to Statalist/Kit for the push start)
Stata command
LARS is very simple to use:
lars y <varlist>, a(lar)
lars y <varlist>, a(lasso)
lars y <varlist>, a(stagewise)
Not everything in the S-Plus package is implemented, because I didn't have all the data required to test all the code
Graph output
[Figure: coefficient paths, Beta (from -1000 to 1000) plotted against sum mod(beta)/max sum mod(beta), for the covariates age, sex, bmi, bp, s1, s2, s3, s4, s5, s6]
Conclusions
Mata could be a little easier to use
Translating S-Plus code is pretty simple
Least Angle Regression/Lasso/Forward Stagewise are
all very attractive algorithms and certainly an
improvement over Stepwise.