SADC Course in Statistics Comparing Regressions (Session 14)
Date posted: 21-Dec-2015
2To put your footer here go to View > Header and Footer
Learning Objectives
At the end of this session, you will be able to
• understand and interpret the components of a linear model with one quantitative variable and one categorical factor
• interpret output from such models
• write regression equations for each level of the categorical variable using the model estimates
Return to the Paddy example
In the paddy example, consider the possible effects of fertiliser and variety together.
The objective is to explore whether fertiliser, variety, or both affect paddy yields.
Note that the two explanatory variables (we will call them factors) being considered here are of different types: one is a quantitative variable, the other is a categorical variable.
Models with each factor in turn
Previously we have fitted each variable one at a time. Thus the model with fertiliser alone is:
$y_i = \beta_0 + \beta_1 (\text{fert})_i + \varepsilon_i$
while the model with variety alone is:
$y_{ij} = \beta'_0 + v_i + \varepsilon_{ij}$
In the models above, $\beta_0$ and $\beta'_0$ represent constants, $\beta_1$ is the slope of the line in the first model, and $v_i$ ($i = 1, 2, 3$) represent the variety effects in the second model.
One model with both factors
We can put the two factors together into a single model as:
$y_{ij} = \beta_0 + \beta_1 (\text{fert})_{ij} + v_i + \varepsilon_{ij}$
This model fits regression lines with a common slope for each variety, i.e. it represents three parallel lines.
The intercepts of the lines are:
$(\beta_0 + v_1)$, $(\beta_0 + v_2)$ and $(\beta_0 + v_3)$.
Anova results (sequential)
Source       d.f.    S.S.     M.S.      F       Prob.
Fertiliser     1     29.94    29.94     130.8   0.000
Variety        2     12.29     6.14      26.9   0.000
Residual      32      7.32     0.2288
Total         35     49.55
The Residual M.S. ($s^2$) = 0.2288. It describes the variation not explained by fertiliser and variety.
How may the above results be interpreted?
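As a check on the arithmetic in the sequential anova table, the following sketch recomputes the mean squares and F ratios from the sums of squares and degrees of freedom reported above. The variable names are illustrative only; the figures are copied from the table.

```python
# Sums of squares and degrees of freedom copied from the sequential anova table.
anova = {
    "Fertiliser": {"df": 1, "ss": 29.94},
    "Variety": {"df": 2, "ss": 12.29},
}
residual = {"df": 32, "ss": 7.32}

# Residual mean square: the estimate of the error variance s^2.
residual_ms = residual["ss"] / residual["df"]

# Mean square = S.S. / d.f.; F = term mean square / residual mean square.
f_ratios = {}
for term, row in anova.items():
    ms = row["ss"] / row["df"]
    f_ratios[term] = ms / residual_ms
    print(f"{term}: F = {f_ratios[term]:.1f}")

# In a sequential anova the S.S. for the terms and the residual add to the total.
total_ss = sum(row["ss"] for row in anova.values()) + residual["ss"]
print(f"Residual M.S. = {residual_ms:.4f}, Total S.S. = {total_ss:.2f}")
```

Small differences from the printed F values (for example 130.9 against 130.8 for fertiliser) arise only because the table entries are rounded.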
Anova results (adjusted)
Source       d.f.    Adj.SS.   Adj.MS.   F      Prob.
Fertiliser     1      6.95      6.95     30.4   0.000
Variety        2     12.29      6.14     26.9   0.000
Residual      32      7.32      0.2288
Total         35     49.55
In the anova above, each term has been adjusted for the other, so the S.S. for fertiliser, variety and residual do not add to the total S.S.
What conclusions may be drawn from the above?
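The statement that adjusted sums of squares do not add to the total can be verified directly. This is a minimal sketch using only the figures in the adjusted anova table; the names are illustrative.

```python
# Adjusted S.S. copied from the adjusted anova table.
adjusted_ss = {"Fertiliser": 6.95, "Variety": 12.29, "Residual": 7.32}
total_ss = 49.55

# Sum the adjusted S.S. and compare with the total S.S.
sum_adjusted = sum(adjusted_ss.values())
shortfall = total_ss - sum_adjusted
print(f"Sum of adjusted S.S. = {sum_adjusted:.2f}")
print(f"Total S.S.           = {total_ss:.2f}")
print(f"Difference           = {shortfall:.2f}")
```

The difference of 22.99 equals the drop in the fertiliser S.S. from 29.94 in the sequential table (fertiliser fitted first) to 6.95 once it is adjusted for variety.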
Model estimates
Parameter               Coeff.    Std.error    t        t prob
$\beta_0$: constant      4.776     0.322       14.9     0.000
$\beta_1$: fertiliser    0.526     0.096        5.51    0.000
g1 (new)                 0         -            -       -
g2 (old)                -1.207     0.269       -4.49    0.000
g3 (trad)               -2.179     0.304       -7.16    0.000
What do these results tell us?
Comparing variety means
As before, comparisons with the base level (the new variety) can be made using the model estimates. Thus:
Old - New = -1.207 = estimate of g2
Trad - New = -2.179 = estimate of g3
In addition, because the results need to be adjusted for the effect of fertiliser, they again need to be reported in terms of adjusted means.
These are usually calculated at the overall mean of the fertiliser variable, here 1.444.
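The adjusted means described above can be sketched as predictions from the parallel lines model, evaluated at the overall mean of fertiliser. The code below assumes the estimates in the model estimates table; the variable names are illustrative and not part of the course material.

```python
# Estimates copied from the model estimates table (parallel lines model).
intercept = 4.776      # constant: intercept for the base level (new variety)
slope = 0.526          # common fertiliser slope
variety_effect = {
    "New improved": 0.0,      # g1, the base level
    "Old improved": -1.207,   # g2
    "Traditional": -2.179,    # g3
}
mean_fert = 1.444      # overall mean of the fertiliser variable

# Adjusted mean = predicted yield for each variety at the mean fertiliser level.
adjusted_means = {
    variety: intercept + effect + slope * mean_fert
    for variety, effect in variety_effect.items()
}
for variety, value in adjusted_means.items():
    print(f"{variety}: {value:.2f}")
```

The results, 5.54, 4.33 and 3.36, agree with the adjusted means reported in the table of raw and adjusted means.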
Raw means and adjusted means
Raw variety means:
Variety         Sample size (n)    Raw mean    Std.error (s.d./√n)
New improved          4             5.96        0.128
Old improved         17             4.54        0.173
Traditional          15             3.00        0.168
Variety means adjusted for fertiliser effect:
Variety         Adjusted mean    Std.error
New improved        5.54          0.251
Old improved        4.33          0.122
Traditional         3.36          0.139
Parallel lines for each variety
Equations describing the regression of yield on fertiliser for each variety are:
$y = \beta_0 + \beta_1 (\text{fert}) + v_i$
i.e. $y = (\beta_0 + v_i) + \beta_1 (\text{fert})$
Thus for the new improved variety,
$y = (4.776 + 0) + 0.526\,(\text{fert})$
i.e. $y = 4.776 + 0.526\,(\text{fert})$
Similarly, equations can be found for the remaining two varieties.
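One way to obtain the equations for the remaining varieties is to add each variety estimate to the constant while keeping the common slope. This is a minimal sketch under that reading; the function names `line_for` and `predict` are hypothetical helpers introduced for illustration.

```python
# Estimates copied from the model estimates table; names are illustrative.
constant = 4.776
slope = 0.526
variety_effect = {"new": 0.0, "old": -1.207, "trad": -2.179}


def line_for(variety):
    """Return (intercept, slope) of the fitted line for one variety."""
    return constant + variety_effect[variety], slope


def predict(variety, fert):
    """Predicted yield for a variety at a given fertiliser amount."""
    intercept, b = line_for(variety)
    return intercept + b * fert


# Print one equation per variety; all share the same slope (parallel lines).
for variety in variety_effect:
    intercept, b = line_for(variety)
    print(f"{variety}: y = {intercept:.3f} + {b:.3f} (fert)")
```

Because the slope is shared, the vertical gap between any two lines is the same at every fertiliser level and equals the difference between the two variety estimates.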
Model with different slopes
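We can again put the two factors together into a single model, now allowing the slope to differ between varieties:
$y_{ij} = \beta_0 + \beta_1 (\text{fert})_{ij} + v_i + \delta_i (\text{fert})_{ij} + \varepsilon_{ij}$
This model fits regression lines with different intercepts $(\beta_0 + v_i)$ and different slopes $(\beta_1 + \delta_i)$, where $\delta_i$ is the slope adjustment for variety $i$.
The separate slopes are:
$(\beta_1 + \delta_1)$, $(\beta_1 + \delta_2)$ and $(\beta_1 + \delta_3)$.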
Anova with different slopes
Fitting separate lines involves fitting an interaction term (Fert*Var in the table below).
Source       d.f.    Adj.SS.   Adj.MS.   F      Prob.
Fertiliser     1      0.391     0.391    1.6    0.211
Variety        2      1.610     0.805    3.4    0.048
Fert*Var       2      0.143     0.071    0.3    0.745
Residual      30      7.180     0.239
Total         35     49.55
What are your conclusions?
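The test for different slopes can also be viewed as a comparison of residual sums of squares between the parallel lines model and the separate lines model. The sketch below is an illustration of that extra sum of squares calculation, using only the residual rows of the two anova tables; the variable names are assumptions.

```python
# Residual S.S. and d.f. copied from the two anova tables.
rss_parallel, df_parallel = 7.32, 32     # common-slope (parallel lines) model
rss_separate, df_separate = 7.180, 30    # separate-slopes model

# Extra sum of squares explained by allowing the slopes to differ.
extra_ss = rss_parallel - rss_separate
extra_df = df_parallel - df_separate

# F ratio: extra mean square over the residual mean square of the larger model.
residual_ms = rss_separate / df_separate
f_interaction = (extra_ss / extra_df) / residual_ms
print(f"Extra S.S. = {extra_ss:.3f} on {extra_df} d.f., F = {f_interaction:.1f}")
```

The extra S.S. of about 0.14 on 2 d.f. and the F ratio of about 0.3 agree, up to rounding, with the Fert*Var row of the table.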
Final model
It is clear from the anova above that the term added to the model to allow for different slopes is non-significant.
Hence return to the parallel lines model, i.e.
$y = 4.776 + 0.526\,(\text{fert})$ for the new variety
$y = 3.569 + 0.526\,(\text{fert})$ for the old variety
$y = 2.597 + 0.526\,(\text{fert})$ for the traditional variety