Lecture #18 - 4/7/2005 Slide 1 of 29
Curvilinear Regression Analysis
Lecture 18
April 7, 2005
Applied Regression Analysis
Overview
Today's Lecture
ANOVA Example
Curvilinear Regression
Wrapping Up
Today’s Lecture
ANOVA with a continuous independent variable.
Curvilinear regression analysis.
Interactions with continuous variables.
An Example
From Pedhazur, pp. 513-514:
"Assume that in an experiment on the learning of paired associates, the independent variable is the number of exposures to a list. Specifically, 15 subjects are randomly assigned, in equal numbers, to five levels of exposure to a list, so that one group is given one exposure, a second group is given two exposures, and so on to five exposures for the fifth group. The dependent variable measure is the number of correct responses on a subsequent test."
The Analysis
Running an ANOVA (from Analyze...General Linear Model...Univariate in SPSS) produces these results:

[SPSS output table not preserved in this text version.]
The Interpretation
From the example, we could test the hypothesis:
H0 : µ1 = µ2 = µ3 = µ4 = µ5
Here, F(4, 10) = 2.10, which gives a p-value of 0.156.
Using any reasonable Type-I error rate (like 0.05), we would fail to reject the null hypothesis.
We would then conclude that there is no effect of number of exposures on learning (as measured by test score).
Note that for this analysis there were four coded vectors produced (four degrees of freedom for the numerator).
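These numbers can be checked directly from the sums of squares reported later in the lecture (SS_between = 8.40 on 4 df, SS_within = 10.00 on 10 df); a minimal sketch, assuming scipy is available for the p-value:

```python
from scipy.stats import f

# Sums of squares and degrees of freedom from the ANOVA
ss_between, df_between = 8.40, 4
ss_within, df_within = 10.00, 10

ms_between = ss_between / df_between   # 2.10
ms_within = ss_within / df_within      # 1.00
F = ms_between / ms_within             # 2.10

p = f.sf(F, df_between, df_within)     # upper-tail p-value (about 0.156)
print(round(F, 2), round(p, 3))
```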
A New Analysis
Instead of running an ANOVA to test for differences between the means of the test scores at each level of X, couldn't we run a linear regression?
In the words of Marv Albert: YES!
For the linear regression to be valid, the means of the levels of X must fall on the linear regression line.
The key point is that the means must follow a linear trend.
Using the difference between the ANOVA and the regression, I will show you how you can test for a linear trend in the analysis.
Multiple Regression Results
Running a regression (from Analyze...Regression...Linear in SPSS) produces these results:

[SPSS output table not preserved in this text version.]
Multiple Regression Results
From the example, we could test the hypothesis:
H0 : b1 = 0
Here, F(1, 13) = 8.95, which gives a p-value of 0.010.
Using any reasonable Type-I error rate (like 0.05), we would reject the null hypothesis.
We would then conclude that there is a significant relationship between number of exposures and learning (as measured by test score).
This conclusion is different from the conclusion we drew before.
What is different about our analysis?
SS Differences
Notice from the ANOVA analysis that SS_treatment = 8.40.
From the regression analysis, SS_regression = 7.50.
Note the difference between the two: SS_treatment is larger.
The difference between SS_treatment and SS_regression is termed SS_deviation:
SS_deviation = SS_treatment − SS_regression = 8.40 − 7.50 = 0.90.
Let's take a look at how that difference comes about.
SS Differences
The estimated regression line is:
Y′ = 2.7 + 0.5X

X    N_X   Ȳ     Y′     Ȳ − Y′   (Ȳ − Y′)²   N_X(Ȳ − Y′)²
1    3     3.0   3.2    -0.2     0.04        0.12
2    3     4.0   3.7     0.3     0.09        0.27
3    3     4.0   4.2    -0.2     0.04        0.12
4    3     5.0   4.7     0.3     0.09        0.27
5    3     5.0   5.2    -0.2     0.04        0.12
                                 Σ N_X(Ȳ − Y′)² = 0.90
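The table can be reproduced in a few lines; a sketch using only the group means, the group sizes, and the fitted line from above:

```python
# Group means of Y at each level of X, with n = 3 subjects per group
means = {1: 3.0, 2: 4.0, 3: 4.0, 4: 5.0, 5: 5.0}
n_per_group = 3

ss_deviation = 0.0
for x, ybar in means.items():
    y_hat = 2.7 + 0.5 * x                          # fitted regression line
    ss_deviation += n_per_group * (ybar - y_hat) ** 2

print(round(ss_deviation, 2))  # 0.9
```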
Data Scatterplot
[Scatterplot: number correct (y-axis, 2.00 to 6.00) versus number of exposures (x-axis, 1.00 to 5.00); the plotted observations and group means are not preserved in this text version.]
SS Differences
The value obtained on the previous slide, 0.90, was equal to the SS_deviation.
The SS_deviation is literally the calculation of a statistic that measures a variable's deviation from linearity.
This value serves as a basis for the question:
"What is the difference between restricting the data to conform to a linear trend and placing no such restriction?" (Pedhazur, p. 517)
SS Differences
When the SS_treatment is calculated, there is no restriction on the means of the treatment groups.
If the means fall on a (straight) line, there will be no difference between SS_treatment and SS_regression: SS_deviation = 0.
With departures from linearity, the SS_treatment will be much larger than the SS_regression.
Do you feel a statistical hypothesis test coming on?
Hypothesis Test
The SS_treatment can be partitioned into two components: SS_regression (also called the SS due to linearity) and the remainder, the SS due to deviation from linearity.

Source                      df    SS      MS     F
Between Treatments           4    8.40
  Linearity                  1    7.50    7.50   7.50
  Deviation From Linearity   3    0.90    0.30   0.30
Within Treatments           10   10.00    1.00
Total                       14   18.40

If the SS due to linearity leads to a significant F value, then one can conclude that a linear trend exists and that linear regression is appropriate.
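The two F tests in this partition follow directly from the sums of squares; a minimal sketch:

```python
# Partition SS_between into linearity and deviation-from-linearity components
ss_between, df_between = 8.40, 4
ss_linearity, df_linearity = 7.50, 1
ss_within, df_within = 10.00, 10

ss_deviation = ss_between - ss_linearity   # 0.90
df_deviation = df_between - df_linearity   # 3

ms_within = ss_within / df_within          # 1.00
F_linearity = (ss_linearity / df_linearity) / ms_within   # 7.50
F_deviation = (ss_deviation / df_deviation) / ms_within   # 0.30
print(F_linearity, round(F_deviation, 2))
```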
Curvilinear Regression
The preceding example demonstrated how a linear trend could be detected using a statistical hypothesis test.
A linear trend is something we are very familiar with, having encountered linear regression for most of this course.
Curvilinear regression analysis can be used to determine if not-so-linear trends exist between Y and X.
Pedhazur distinguishes between two possible types of trends:
Intrinsically linear.
Intrinsically nonlinear.
Curvilinear Regression
An intrinsically linear model is one that is linear in its parameters but not linear in the variables.
By transformation, such a model may be reduced to a linear model.
Such models are the focus of the remainder of this lecture.
An intrinsically nonlinear model is one that may not be coerced into linearity by transformation.
Such models often require more complicated estimation algorithms than what is provided by least squares and the GLM.
The Polynomial Model
A simple regression model extension for curved relations is the polynomial model, such as the following second-degree polynomial:

Y′ = a + b1X + b2X²

One could also estimate a third-degree polynomial:

Y′ = a + b1X + b2X² + b3X³

Or a fourth-degree polynomial:

Y′ = a + b1X + b2X² + b3X³ + b4X⁴
And so on...
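Fitting such a model amounts to ordinary least squares on X and its powers; a minimal sketch with numpy on made-up data (the data and coefficients here are illustrative, not values from the lecture):

```python
import numpy as np

# Illustrative data generated from a known quadratic: y = 2 - 0.5x + 0.1x^2
x = np.arange(10, dtype=float)
y = 2.0 - 0.5 * x + 0.1 * x ** 2

# Fit a second-degree polynomial; polyfit returns [b2, b1, a]
b2, b1, a = np.polyfit(x, y, deg=2)
print(round(a, 3), round(b1, 3), round(b2, 3))  # recovers 2.0, -0.5, 0.1
```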
The Polynomial Model: Estimation
The way of determining the extent to which a given model is applicable is similar to determining whether added variables significantly improve the predictive ability of a regression model.
Beginning with a linear model (a first-degree polynomial), estimate the model, with squared multiple correlation denoted R²_y.x.
The tests of incremental variance accounted for are done for each level of the polynomial:

Linear: R²_y.x
Quadratic: R²_y.x,x² − R²_y.x
Cubic: R²_y.x,x²,x³ − R²_y.x,x²
Quartic: R²_y.x,x²,x³,x⁴ − R²_y.x,x²,x³
A New Example
From Pedhazur, p. 522:
"Suppose that we are interested in the effect of time spent in practice on the performance of a visual discrimination task. Subjects are randomly assigned to different levels of practice, following which a test of visual discrimination is administered, and the number of correct responses is recorded for each subject. As there are six levels, the highest-degree polynomial possible for these data is the fifth. Our aim, however, is to determine the lowest-degree polynomial that best fits the data."
Data Scatterplot
[Scatterplot: Task Score (y-axis, 5.00 to 20.00) versus Practice Time (x-axis, 2.50 to 10.00); the plotted observations are not preserved in this text version.]
Estimation In SPSS
To estimate the degrees of a polynomial, first one must create new variables in SPSS, each representing X raised to a given power.
Then successive regression analyses must be run, each adding a level to the equation:

Model        R²      Increase Over Previous   F
X            0.883   0.883                    121.029 *
X, X²        0.943   0.060                     15.604 *
X, X², X³    0.946   0.003                      0.911

Because adding X³ did not significantly increase R², we stop with the quadratic model.
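The incremental-F values can be checked from the R² column alone; a sketch using the standard formula F = (ΔR²/Δk) / ((1 − R²_full)/(n − k_full − 1)), with n = 18 taken from the SPSS output later in the lecture (the linear fit has 16 df):

```python
def incremental_F(r2_reduced, r2_full, n, k_full, k_added=1):
    """F test for the increase in R^2 from adding k_added predictors."""
    return ((r2_full - r2_reduced) / k_added) / ((1 - r2_full) / (n - k_full - 1))

n = 18
F_quad = incremental_F(0.883, 0.943, n, k_full=2)   # ~15.8
F_cubic = incremental_F(0.943, 0.946, n, k_full=3)  # ~0.78, not significant
print(round(F_quad, 1), round(F_cubic, 2))
```

The small differences from the F values shown in the table (15.604, 0.911) likely come from rounding in the reported R² values.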
Estimation In SPSS
Of course, there is an easier way...
In SPSS go to Analyze...Regression...Curve Estimation
Estimation In SPSS
MODEL: MOD_2.
Independent: x

Dependent  Mth  Rsq   d.f.  F       Sigf  b0       b1      b2      b3
y          LIN  .883  16    121.03  .000   3.2667  1.5571
y          QUA  .943  15    123.55  .000  -1.9000  3.4946  -.1384
y          CUB  .946  14     82.18  .000    .6667  1.8803   .1290  -.0127
Data Scatterplot

[Figure not preserved in this text version.]
Parameter Interpretation
The b parameters in a polynomial regression are nearly impossible to interpret.
An independent variable is represented by more than a single vector - what's held constant?
The relative magnitudes of the b parameters for different degrees cannot be compared because the variance of the higher-degree terms explodes:

X → s²_x
X² → (s²_x)²
X³ → (s²_x)³
. . .
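This explosion is easy to see numerically; a sketch comparing the variances of successive powers of an illustrative predictor:

```python
import numpy as np

x = np.arange(1.0, 11.0)   # illustrative predictor taking values 1..10

# The variance of each successive power of X grows by orders of magnitude
for k in (1, 2, 3):
    print(k, np.var(x ** k))
```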
Variable Centering
Centering variables in a polynomial equation can avoid collinearity problems.
Centering does not change the R² of a model, only the regression parameters.
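A quick numerical check of the collinearity point; a sketch, assuming an illustrative predictor taking the values 1..10:

```python
import numpy as np

x = np.arange(1.0, 11.0)   # illustrative predictor
xc = x - x.mean()          # centered predictor

r_raw = np.corrcoef(x, x ** 2)[0, 1]          # near 1: severe collinearity
r_centered = np.corrcoef(xc, xc ** 2)[0, 1]   # ~0 for a symmetric predictor
print(round(r_raw, 3), round(r_centered, 3))
```

For a predictor that is symmetric about its mean, centering drives the correlation between the linear and quadratic terms to zero; for skewed predictors it is reduced but not eliminated.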
Multiple Curvilinear Regression
Running multiple curvilinear regression models is a straightforward extension of what was shown today:

Y′ = a + b1X + b2Z + b3XZ + b4X² + b5Z²

Note the cross-product XZ.
This cross-product term is tested above and beyond X and Z individually.
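Such a model can be fit by least squares on an expanded design matrix; a sketch with numpy on made-up data (the data and coefficients below are illustrative, chosen only to show that the fit recovers them, not values from the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 5, size=200)   # illustrative predictors
z = rng.uniform(0, 5, size=200)

# Illustrative true model: a=1, b1=2, b2=-1, b3=0.5, b4=0.25, b5=-0.3
y = 1 + 2 * x - 1 * z + 0.5 * x * z + 0.25 * x ** 2 - 0.3 * z ** 2

# Design matrix with columns: intercept, X, Z, XZ, X^2, Z^2
D = np.column_stack([np.ones_like(x), x, z, x * z, x ** 2, z ** 2])
coef, *_ = np.linalg.lstsq(D, y, rcond=None)
print(np.round(coef, 3))
```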
Final Thought
Curvilinear regression can be accomplished using techniques we are familiar with.
Interpretation can be tricky...
We are all lucky to be students during this season...
Next Time
No class next week (I'm in Montreal...if you are there, say hello).
Chapter 14: Continuous and categorical independent variables.
Comedy provided by this guy: