
8/3/2019 Step Wise

    Stepwise Multiple Regression

Your introductory lesson for multiple regression with SAS involved developing a model for predicting graduate students' grade point averages. We had data from 30 graduate students on the following variables: GPA (graduate grade point average), GREQ (score on the quantitative section of the Graduate Record Exam, a commonly used entrance exam for graduate programs), GREV (score on the verbal section of the GRE), MAT (score on the Miller Analogies Test, another graduate entrance exam), and AR, the Average Rating that the student received from 3 professors who interviewed them prior to making admission decisions. GPA can exceed 4.0, since this university attaches pluses and minuses to letter grades. We used a simultaneous multiple regression, entering all of the predictors at once. Now we shall learn how to conduct stepwise regressions, where variables are entered and/or deleted according to statistical criteria. Please run the program STEPWISE.SAS from my SAS Programs page.

    Forward Selection

In a forward selection analysis we start out with no predictors in the model. Each of the available predictors is evaluated with respect to how much R2 would be increased by adding it to the model. The one which will most increase R2 will be added if it meets the statistical criterion for entry. With SAS the statistical criterion is the significance level for the increase in R2 produced by addition of the predictor. If no predictor meets that criterion, the analysis stops. If a predictor is added, then the second step involves re-evaluating all of the available predictors which have not yet been entered into the model. If any satisfy the criterion for entry, the one which most increases R2 is added. This procedure is repeated until there remain no more predictors that are eligible for entry.
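If you want to see the mechanics outside of SAS, here is a minimal Python sketch of the procedure just described. It runs on synthetic data (not the GPA data), and for simplicity it uses an assumed F-to-enter threshold rather than SAS's SLENTRY p-value criterion (computing the p-value would require an F-distribution function); it is an illustration, not SAS's implementation.

```python
# Sketch of forward selection on synthetic data (not the GPA example).
# Entry criterion: largest F-to-enter, compared to an assumed threshold
# f_enter, standing in for SAS's SLENTRY p-value.
import numpy as np

def r_squared(X, y):
    """R2 of an OLS fit of y on X (X already contains an intercept column)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - (resid @ resid) / (((y - y.mean()) ** 2).sum())

def forward_select(X, y, f_enter=4.0):
    n, p = X.shape
    selected, remaining, r2_cur = [], list(range(p)), 0.0
    while remaining:
        # Evaluate every not-yet-entered predictor for its increase in R2.
        candidates = []
        for j in remaining:
            cols = selected + [j]
            Xd = np.column_stack([np.ones(n), X[:, cols]])
            r2_new = r_squared(Xd, y)
            df_resid = n - len(cols) - 1
            f_to_enter = (r2_new - r2_cur) / ((1.0 - r2_new) / df_resid)
            candidates.append((f_to_enter, r2_new, j))
        f_best, r2_best, j_best = max(candidates)
        if f_best < f_enter:      # no predictor meets the entry criterion: stop
            break
        selected.append(j_best)   # enter the predictor that most increases R2
        remaining.remove(j_best)
        r2_cur = r2_best
    return selected, r2_cur

# Synthetic data: y depends only on columns 0 and 2.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 2.0 * X[:, 0] + 1.5 * X[:, 2] + rng.normal(size=200)
print(forward_select(X, y))
```

On these data the strongest single predictor (column 0) enters first, then column 2; the two noise columns fail the entry criterion and the procedure stops, just as described above.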

Look at the program. The first model (A:) asks for a forward selection analysis. The SLENTRY= value specifies the significance level for entry into the model. The defaults are 0.50 for forward selection and 0.15 for fully stepwise selection. I set the entry level at .05 -- I think that is unreasonably low for a forward selection analysis, but I wanted to show you a possible consequence of sticking with the .05 criterion.

Look at the output. The Statistics for Entry on page 1 show that all four predictors meet the criterion for entry. The one which most increases R2 is the Average Rating, so that variable is entered. Now look at the Step 2 Statistics for Entry. The F values there test the null hypotheses that entering a particular predictor will not change the R2 at all. Notice that all of these F values are less than they were at Step 1, because each of the predictors is somewhat redundant with the AR variable which is now in the model. Now look at the Step 3 Statistics for Entry. The F values there are down again, reflecting additional redundancy with the now entered GRE_Verbal predictor. Neither predictor available for entry meets the criterion for entry, so the procedure stops. We are left with a two-predictor model, AR and GRE_V, which accounts for 54% of the variance in grades.

    Copyright 2006, Karl L. Wuensch, All Rights Reserved

    Stepwise.doc

http://core.ecu.edu/psyc/wuenschk/SAS/SAS-Programs.htm

    Backwards Elimination

In a backwards elimination analysis we start out with all of the predictors in the model. At each step we evaluate the predictors which are in the model and eliminate any that meet the criterion for removal.
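The mirror image of the forward sketch can be written the same way. Again this is a Python illustration on synthetic data, with an assumed F-to-remove threshold standing in for SAS's SLSTAY p-value criterion, not SAS's own implementation.

```python
# Sketch of backward elimination on synthetic data (not the GPA example).
# Removal criterion: smallest F-to-remove, compared to an assumed
# threshold f_stay, standing in for SAS's SLSTAY p-value.
import numpy as np

def r_squared(X, y):
    """R2 of an OLS fit of y on X (X already contains an intercept column)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - (resid @ resid) / (((y - y.mean()) ** 2).sum())

def backward_eliminate(X, y, f_stay=4.0):
    n, p = X.shape
    selected = list(range(p))
    while len(selected) > 1:
        X_full = np.column_stack([np.ones(n), X[:, selected]])
        r2_full = r_squared(X_full, y)
        df_resid = n - len(selected) - 1
        # For each in-model predictor, how much would R2 drop without it?
        losses = []
        for j in selected:
            keep = [c for c in selected if c != j]
            X_red = np.column_stack([np.ones(n), X[:, keep]])
            f_to_remove = (r2_full - r_squared(X_red, y)) / ((1.0 - r2_full) / df_resid)
            losses.append((f_to_remove, j))
        f_min, j_min = min(losses)
        if f_min >= f_stay:       # every remaining predictor meets the criterion
            break
        selected.remove(j_min)    # drop the one whose loss least reduces R2
    return selected

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 2.0 * X[:, 0] + 1.5 * X[:, 2] + rng.normal(size=200)
print(backward_eliminate(X, y))
```

Here the noise columns have the smallest F-to-remove and are eliminated one at a time, leaving the genuinely predictive columns in the model.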

Look at the program. Model B asks for a backwards elimination model. The SLSTAY= value specifies the significance level for staying in the model. The defaults are 0.10 for BACKWARD and 0.15 for STEPWISE. I set it at .05.

Look at the output for Step 1. Of the variables eligible for removal (those with p > .05), removing AR would least reduce the R2, so AR is removed. Recall that AR was the first variable to be entered with our forwards selection analysis. AR is the best single predictor of grades, but in the context of the other three predictors it has the smallest unique contribution towards predicting grades. The Step 2 statistics show that only GRE_V is eligible for removal, so it is removed. We are left with a two-predictor model containing GRE_Q and MAT and accounting for 58% of the variance in grades.

Does it make you a little distrustful of stepwise procedures to see that one such procedure produces a two-variable model that contains only predictors A and B, while another such procedure produces a two-variable model containing only predictors C and D? It should make you distrustful!

    Fully Stepwise Selection

With fully stepwise selection we start out just as in forwards selection, but at each step variables that are already in the model are first evaluated for removal, and if any are eligible for removal, the one whose removal would least lower R2 is removed. You might wonder why a variable would enter at one point and leave later -- well, a variable might enter early, being well correlated with the criterion variable, but later become redundant with predictors that follow it into the model.
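Combining the two sketches gives fully stepwise selection: before each entry step, the in-model predictors are checked for removal. As before this is a hedged Python illustration on synthetic data, with assumed F thresholds in place of SAS's SLENTRY/SLSTAY p-value criteria.

```python
# Sketch of fully stepwise selection on synthetic data (not the GPA example).
# A removal pass runs before each entry pass, so a variable that entered
# early can leave once later entries make it redundant.
import numpy as np

def r_squared(X, y):
    """R2 of an OLS fit of y on X (X already contains an intercept column)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - (resid @ resid) / (((y - y.mean()) ** 2).sum())

def stepwise_select(X, y, f_enter=4.0, f_stay=4.0):
    n, p = X.shape
    selected, remaining = [], list(range(p))

    def design(cols):
        return np.column_stack([np.ones(n), X[:, cols]])

    while True:
        # Removal pass: drop the weakest in-model predictor if it fails f_stay.
        if len(selected) > 1:
            r2_full = r_squared(design(selected), y)
            df = n - len(selected) - 1
            losses = []
            for j in selected:
                r2_red = r_squared(design([c for c in selected if c != j]), y)
                losses.append(((r2_full - r2_red) / ((1.0 - r2_full) / df), j))
            f_min, j_min = min(losses)
            if f_min < f_stay:
                selected.remove(j_min)
                remaining.append(j_min)
                continue
        # Entry pass: add the candidate with the largest F-to-enter.
        if not remaining:
            return selected
        r2_cur = r_squared(design(selected), y) if selected else 0.0
        best = None
        for j in remaining:
            cols = selected + [j]
            r2_new = r_squared(design(cols), y)
            f = (r2_new - r2_cur) / ((1.0 - r2_new) / (n - len(cols) - 1))
            if best is None or f > best[0]:
                best = (f, j)
        if best[0] < f_enter:
            return selected
        selected.append(best[1])
        remaining.remove(best[1])

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 2.0 * X[:, 0] + 1.5 * X[:, 2] + rng.normal(size=200)
print(stepwise_select(X, y))
```

Note that keeping f_enter at least as large as f_stay (the analogue of SLSTAY being no smaller than SLENTRY) prevents a just-removed variable from immediately re-entering the same model.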

Look at the program. For Model C I asked for fully stepwise analysis and set both SLSTAY and SLENTRY at .08 (just because I wanted to show you both entry and deletion).

Look at the output. AR entered first, and GRE_V second, just as in the forward selection analysis. At this point, Step 3, both GRE_Q and MAT are eligible for entry, given my .08 criterion for entry. MAT has a p a tiny bit smaller than GRE_Q, so it is selected for entry. Look at the F for entry of GRE_Q on Step 4 -- it is larger than it was on Step 3, reflecting a suppressor relationship between GRE_Q and MAT. GRE_Q enters. We now have all four predictors in the model, but notice that GRE_V and AR no longer have significant partial effects, and thus become eligible for removal. AR is removed first, then GRE_V is removed.

It appears that the combination of GRE_Q and MAT is better than the combination of GRE_V and AR, due to GRE_Q and MAT having a suppressor relationship. That suppressor relationship is not accounted for, however, until one or the other of GRE_Q and MAT is entered into the model, and with the forward selection analysis neither gets the chance to enter.
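The suppressor idea is easy to simulate. In this Python sketch the variables are hypothetical (not the GRE scores): x2 is essentially uncorrelated with the criterion, yet adding it to a model already containing x1 raises R2 by more than x2's own R2, because x2 removes criterion-irrelevant variance from x1.

```python
# Simulated suppressor effect (hypothetical variables, not the GRE data):
# x2 predicts nothing by itself, but it "suppresses" the irrelevant part
# of x1, so the joint R2 exceeds the sum of the two separate R2s.
import numpy as np

def r_squared(X, y):
    """R2 of an OLS fit of y on X, with an intercept added."""
    Xd = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    resid = y - Xd @ beta
    return 1.0 - (resid @ resid) / (((y - y.mean()) ** 2).sum())

rng = np.random.default_rng(42)
n = 1000
signal = rng.normal(size=n)      # the part of x1 that predicts y
junk = rng.normal(size=n)        # the part of x1 unrelated to y
x1 = signal + junk
x2 = junk                        # carries only x1's irrelevant variance
y = signal + 0.5 * rng.normal(size=n)

r2_x1 = r_squared(x1[:, None], y)
r2_x2 = r_squared(x2[:, None], y)
r2_both = r_squared(np.column_stack([x1, x2]), y)
print(r2_x1, r2_x2, r2_both)     # r2_both well exceeds r2_x1 + r2_x2
```

This is also why, as noted above, a forward selection that evaluates each predictor only against the criterion can never discover the suppressor: x2 looks worthless until its partner is already in the model.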



    R2 Selection

SELECTION=RSQUARE finds the best n (BEST=n) combinations of predictors among all possible 1-predictor models, then among 2-predictor models, then 3, etc., where best means highest R2. You may force it to INCLUDE=i the first i predictors, START=n with n-predictor models, and STOP=n with n-predictor models. I specified none of these options, so I got every possible 1-predictor model, every possible 2-predictor model, etc.

I did request Mallows' Cp statistic and MSE. One may define the best model as that which has a small value of Cp which is also close to p (the number of parameters in the model, including the intercept). The small Cp indicates precision, small variance in estimating the population regression coefficients. With Cp small and approximately equal to p, the model should fit the data well, and adding additional predictors should not improve precision much. Models with Cp >> p do not fit the data well.
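An all-possible-subsets search with Cp can be sketched in a few lines of Python. This uses synthetic data and the standard formula Cp = SSEp / MSEfull - (n - 2p), with the full model's MSE estimating the error variance; it is an illustration of the idea, not SAS's RSQUARE implementation.

```python
# All-possible-subsets regression with Mallows' Cp (synthetic data, not
# the GPA example). Cp = SSE_p / MSE_full - (n - 2p), where p counts the
# parameters including the intercept.
import numpy as np
from itertools import combinations

def fit(X, y):
    """Return (R2, SSE) for an OLS fit of y on X plus an intercept."""
    Xd = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    resid = y - Xd @ beta
    sse = resid @ resid
    return 1.0 - sse / (((y - y.mean()) ** 2).sum()), sse

rng = np.random.default_rng(1)
n, k = 100, 4
X = rng.normal(size=(n, k))
y = 2.0 * X[:, 0] + 1.5 * X[:, 2] + rng.normal(size=n)

# The full model's MSE estimates the error variance used in Cp.
_, sse_full = fit(X, y)
mse_full = sse_full / (n - k - 1)

# Report the best subset of each size (highest R2), with its Cp.
for m in range(1, k + 1):
    best = max(
        ((fit(X[:, list(cols)], y), cols) for cols in combinations(range(k), m)),
        key=lambda t: t[0][0])
    (r2v, sse), cols = best
    p = m + 1
    cp = sse / mse_full - (n - 2 * p)
    print(m, cols, round(r2v, 3), round(cp, 1))
```

One caution worth knowing: the full model always has Cp exactly equal to its own p, so "Cp close to p" is a useful guide only for subsets smaller than the full model.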

The output shows that the best one-predictor model is AR, as we already know. The best two-predictor model is GRE_Q and MAT, which should not surprise us, given our evidence of a suppressor relationship between those two predictors. Adding GRE_V to that model gives us the best three-predictor model -- and even if the R2 doesn't go up much, we might as well add GRE_V, because the GRE_V scores come along with the GRE_Q scores at no additional cost. For economic reasons, we might well decide to drop the AR predictor, since it does not contribute much beyond what the other three predictors provide, and it is likely much more expensive to gather scores for that predictor than for the other three.

    My Opinion of Stepwise Multiple Regression

I think it is fun, but dangerous. For the person who understands multiple regression well, a stepwise analysis can help reveal interesting relationships such as the suppressor effects we noted here. My experience has been that the typical user of a stepwise multiple regression has little understanding of multiple regression, and absolutely no appreciation of how a predictor's unique contribution is affected by the context within which it is evaluated (the other predictors in the model). Too many psychologists think that stepwise regression somehow selects out the predictors that are really associated with the criterion and leaves out those which have only spurious or unimportant relationships with the criterion. Stepwise analysis does no such thing. Furthermore, statistical programs such as SPSS for Windows make it all too easy for such psychologists to conduct analyses, such as stepwise multiple regression analysis, which they cannot understand and whose results they are almost certain to misinterpret.



    Voodoo Regression



http://core.ecu.edu/psyc/wuenschk/StatHelp/Stepwise-Voodoo.htm