Quantitative Methods Model Selection II: datasets with several explanatory variables.

Model Selection II: several explanatory variables

The problem of model choice

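With 5 x-variables, there are 2^5 = 32 possible models, not including interactions.

If we include two-way interactions without squared terms, there are 1×1 + 5×1 + 10×2 + 10×8 + 5×64 + 1×1024 = 1450 models.

If we do allow squared terms, there are 1×1 + 5×2 + 10×8 + 10×64 + 5×1024 + 1×32768 = 38619 models.

Each product is (number of ways to choose k of the 5 variables) × (number of subsets of the higher-order terms available among those k), for k = 0 to 5, so an interaction or squared term is counted only when its main effects are in the model.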
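
The counts above can be checked with a short sketch. It assumes the counting rule implied by the sums: choose k of the variables as main effects, then include or omit each two-way interaction (and, optionally, each squared term) among those k. The function name `count_models` is illustrative, not part of the course material.

```python
from math import comb

def count_models(n_vars, squared_terms=False):
    """Count models with main effects and two-way interactions,
    where a higher-order term is allowed only if its main effects are present."""
    total = 0
    for k in range(n_vars + 1):
        # Optional terms among k chosen variables: C(k, 2) interactions,
        # plus k squared terms if those are allowed.
        optional = comb(k, 2) + (k if squared_terms else 0)
        total += comb(n_vars, k) * 2 ** optional
    return total

main_effects_only = 2 ** 5
with_interactions = count_models(5)
with_squares = count_models(5, squared_terms=True)
print(main_effects_only, with_interactions, with_squares)
```

Changing `n_vars` shows how quickly the number of candidate models grows with each extra x-variable.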

With multiple models, there are many p-values and possible “right-leg/left-leg” and “poets’ dates” effects.

• Economy of variables
• Multiplicity of p-values
• Marginality

Economy of variables

All variables increase R² when added.

F < 1 - adding the variable decreased adjusted R²
F > 1 - adding the variable increased adjusted R²
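
The F < 1 / F > 1 rule can be illustrated with a small sketch. The starting sums of squares are borrowed from the LRGWHAL ANOVA table in these slides (n = 232, total SS 133.925, error SS 72.759); the two values of `extra_ss`, the sum of squares explained by a new variable, are hypothetical and chosen only to land on either side of F = 1.

```python
def adjusted_r2(rss, tss, n, n_params):
    """Adjusted R^2 = 1 - (RSS / residual DF) / (TSS / (n - 1))."""
    return 1 - (rss / (n - n_params)) / (tss / (n - 1))

def f_for_added_variable(rss_before, rss_after, resid_df_after):
    """F-ratio on 1 and resid_df_after DF for one extra variable."""
    return (rss_before - rss_after) / (rss_after / resid_df_after)

n, tss = 232, 133.925
rss_before, params_before = 72.759, 2  # intercept plus one x-variable

results = []
for extra_ss in (0.2, 0.5):  # hypothetical SS explained by the added variable
    rss_after = rss_before - extra_ss
    params_after = params_before + 1
    f_ratio = f_for_added_variable(rss_before, rss_after, n - params_after)
    r2_change = extra_ss / tss  # R^2 always goes up
    adj_change = (adjusted_r2(rss_after, tss, n, params_after)
                  - adjusted_r2(rss_before, tss, n, params_before))
    results.append((f_ratio, r2_change, adj_change))
    print(f"F = {f_ratio:.3f}, change in R2 = {r2_change:+.5f}, "
          f"change in adjusted R2 = {adj_change:+.5f}")
```

The link is exact: adjusted R² is 1 minus the ratio of error mean square to total mean square, and the error mean square falls only when the added variable's F-ratio exceeds 1.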

(Predictions for datapoint 39)

Multiplicity of p-values

Multiple bites at the cherry

(Figure: curve plotted against number of tests, 1 to 15; the vertical axis runs from 0 to 0.6)
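
The "multiple bites at the cherry" point can be sketched numerically under an assumption the slides do not state: the tests are independent, every null hypothesis is true, and each test uses a 5% threshold. Under that assumption the chance of at least one spuriously significant result grows with the number of tests as follows.

```python
def chance_of_false_positive(n_tests, alpha=0.05):
    """Probability that at least one of n_tests independent tests is
    significant at level alpha when every null hypothesis is true."""
    return 1 - (1 - alpha) ** n_tests

for n_tests in range(1, 16):
    print(f"{n_tests:2d} tests: {chance_of_false_positive(n_tests):.3f}")
```

With correlated tests, as is typical when the x-variables are related, the exact values differ, but the chance still rises with every extra test.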

Focus, don’t fish

- reduce number of X-variables
- use outside information to decide on inclusion
- use outside information to decide on exclusion

Stringency

- reduce nominal p-value

Combine model terms

- for once, reverse the usual splitting
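
The slides do not say how far to reduce the nominal p-value under "Stringency". As a hedged illustration, two standard adjustments are the Bonferroni and Šidák thresholds, sketched here; the choice of 15 tests and a 5% overall rate is hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that keeps the overall rate at or below alpha."""
    return alpha / n_tests

def sidak_threshold(alpha, n_tests):
    """Per-test threshold giving exactly alpha overall for independent tests."""
    return 1 - (1 - alpha) ** (1 / n_tests)

overall_alpha = 0.05  # hypothetical overall error rate
n_tests = 15          # hypothetical number of p-values examined

bonf = bonferroni_threshold(overall_alpha, n_tests)
sidak = sidak_threshold(overall_alpha, n_tests)
print(f"Bonferroni per-test threshold: {bonf:.5f}")
print(f"Sidak per-test threshold:      {sidak:.5f}")
```

Either threshold makes each individual test harder to pass, which is the price of looking at many p-values at once.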

DF   Seq SS
 1    366.9
 1     42.7
 1     14.7

 3    424.3

MS = 424.3/3 = 141.4

F = 141.4/108.9 = 1.30 on 3 and 30 DF

Single p-value from Minitab using CDF: p = 0.293

CDF 1.30 K1;
  F 3 30.
LET K2=1-K1
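
The same combined test can be reproduced outside Minitab. The sketch below uses only the Python standard library and integrates the beta density numerically, so it is an approximation; in practice a library routine such as `scipy.stats.f.sf(1.30, 3, 30)` would be the usual one-liner. The function name `f_upper_tail` is illustrative.

```python
from math import exp, lgamma

def f_upper_tail(f_value, df1, df2, steps=2000):
    """P(F > f_value) for an F(df1, df2) variable, by Simpson's rule.

    Uses P(F > f) = I_x(df2/2, df1/2) with x = df2 / (df2 + df1 * f),
    where I_x is the regularised incomplete beta function.
    Assumes f_value > 0 and df2 >= 2 so the integrand is finite.
    """
    a, b = df2 / 2, df1 / 2
    x = df2 / (df2 + df1 * f_value)
    beta = exp(lgamma(a) + lgamma(b) - lgamma(a + b))

    def integrand(t):
        return t ** (a - 1) * (1 - t) ** (b - 1)

    h = x / steps  # steps must be even for Simpson's rule
    total = integrand(0.0) + integrand(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3 / beta

combined_ms = 424.3 / 3          # three 1-DF terms pooled into one 3-DF term
f_ratio = combined_ms / 108.9    # error mean square taken from the slide
p_value = f_upper_tail(f_ratio, 3, 30)
print(f"MS = {combined_ms:.1f}, F = {f_ratio:.2f}, p = {p_value:.3f}")
```

Pooling the three terms yields one p-value instead of three, which is the point of combining model terms.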

Stepwise regression

General Linear Model: LRGWHAL versus

Source   DF   Seq SS   Adj SS   Adj MS        F       P
VIS       1   61.166   61.166   61.166   193.35   0.000
Error   230   72.759   72.759    0.316
Total   231  133.925

Term           Coef    SE Coef        T       P
Constant   -4.52464    0.06116   -73.98   0.000
VIS        0.125222   0.009005    13.91   0.000
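
As a check on how the entries in this output relate to one another, the following sketch recomputes the mean squares, F-ratio, t-ratio, R² and adjusted R² from the printed sums of squares and coefficients. Only numbers shown in the output are used; the variable names are illustrative.

```python
ss_vis, df_vis = 61.166, 1
ss_error, df_error = 72.759, 230
ss_total, df_total = 133.925, 231

ms_vis = ss_vis / df_vis
ms_error = ss_error / df_error   # printed as 0.316
f_ratio = ms_vis / ms_error      # printed as 193.35

# With a single 1-DF term, the t-ratio squared equals the F-ratio
# (up to rounding in the printed coefficient and standard error).
t_vis = 0.125222 / 0.009005      # printed as 13.91

r_squared = ss_vis / ss_total
adjusted_r_squared = 1 - ms_error / (ss_total / df_total)

print(f"MS error = {ms_error:.3f}, F = {f_ratio:.2f}, t = {t_vis:.2f}")
print(f"R2 = {r_squared:.3f}, adjusted R2 = {adjusted_r_squared:.3f}")
```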

Forward = Backward

Forward ≠ Backward
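
Forward and backward stepwise procedures can be sketched as below. This is a minimal illustration, not the Minitab algorithm: it assumes NumPy is available, uses a fixed F threshold of 4.0 for both entry and removal (an arbitrary illustrative choice), and runs on simulated data in which only columns 0 and 2 truly affect y. All function names are hypothetical.

```python
import numpy as np

def rss(X, y, columns):
    """Residual SS from least squares on an intercept plus the given columns."""
    design = np.column_stack([np.ones(len(y))] + [X[:, j] for j in columns])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    return float(resid @ resid)

def forward_selection(X, y, f_to_enter=4.0):
    """Add the variable with the largest F until none reaches f_to_enter."""
    n, p = X.shape
    chosen = []
    while len(chosen) < p:
        current = rss(X, y, chosen)
        df_resid = n - (len(chosen) + 2)  # intercept + chosen + candidate
        best_f, best_j = max(
            ((current - rss(X, y, chosen + [j])) / (rss(X, y, chosen + [j]) / df_resid), j)
            for j in range(p) if j not in chosen
        )
        if best_f < f_to_enter:
            break
        chosen.append(best_j)
    return sorted(chosen)

def backward_elimination(X, y, f_to_remove=4.0):
    """Drop the variable with the smallest F until all reach f_to_remove."""
    n, p = X.shape
    chosen = list(range(p))
    while chosen:
        full = rss(X, y, chosen)
        df_resid = n - (len(chosen) + 1)  # intercept + chosen
        worst_f, worst_j = min(
            ((rss(X, y, [k for k in chosen if k != j]) - full) / (full / df_resid), j)
            for j in chosen
        )
        if worst_f >= f_to_remove:
            break
        chosen.remove(worst_j)
    return sorted(chosen)

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))                           # simulated predictors
y = 3 * X[:, 0] - 2 * X[:, 2] + rng.normal(size=100)    # only columns 0 and 2 matter

forward = forward_selection(X, y)
backward = backward_elimination(X, y)
print("forward: ", forward)
print("backward:", backward)
```

With nearly uncorrelated predictors like these, the two procedures will often agree; correlated predictors are where forward and backward choices tend to differ.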

Last words…

• Economy of variables: prediction, adjusted R²

• Multiplicity: outside information, focussing, stringency, combining model terms

• Stepwise regression is not usually suitable, but it is useful for the initial sifting of a large number of potential predictors in a preliminary study

Random Effects

Read Chapter 12
