Post on 26-Dec-2015
Multiple Linear Regression - Matrix Formulation
Let x = (x1, x2, … , xn)′ be an n × 1 column vector and let g(x) be a scalar function of x. Then, by definition,

∂g/∂x = (∂g/∂x1, ∂g/∂x2, … , ∂g/∂xn)′

For example, let g(x) = x′x = Σ(i=1 to n) xi². Then ∂g/∂x = 2x.
Let a = (a1, a2, … , an)′ be an n × 1 column vector of constants. It is easy to verify that

∂(a′x)/∂x = a

and that, for symmetric A (n × n),

∂(x′Ax)/∂x = 2Ax
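These two identities can be checked numerically. Below is a minimal sketch in Python with NumPy (rather than the R used later in these notes); the helper num_grad is our own illustrative function, not part of any library. It compares the analytic gradients with central finite differences:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
x = rng.standard_normal(n)
a = rng.standard_normal(n)
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                  # make A symmetric

def num_grad(g, x, h=1e-6):
    # central finite differences, one coordinate at a time
    grad = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        grad[i] = (g(x + e) - g(x - e)) / (2 * h)
    return grad

# d(a'x)/dx = a
assert np.allclose(num_grad(lambda v: a @ v, x), a, atol=1e-5)
# d(x'Ax)/dx = 2Ax for symmetric A
assert np.allclose(num_grad(lambda v: v @ A @ v, x), 2 * A @ x, atol=1e-4)
```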
Theory of Multiple Regression
Suppose we have response variables Yi ,
i = 1, 2, … , n and k explanatory variables/predictors X1, X2, … , Xk .
Yi = b0 + b1x1i + b2x2i + … + bkxki + ei,  i = 1, 2, … , n

There are k + 2 parameters: b0, b1, b2, …, bk and σ².
In matrix form,

Y = (Y1, Y2, … , Yn)′   (n × 1)

X =
[ 1  x11  x21  …  xk1 ]
[ 1  x12  x22  …  xk2 ]
[ ⋮   ⋮    ⋮        ⋮  ]
[ 1  x1n  x2n  …  xkn ]

(n × (k+1)). X is called the design matrix.

b = (b0, b1, … , bk)′   ((k+1) × 1)

Model: Y = Xb + e
OLS (ordinary least-squares) estimation

S = (Y − Xb)′(Y − Xb)
  = (Y′ − b′X′)(Y − Xb)
  = Y′Y − 2b′X′Y + b′X′Xb

∂S/∂b = −2X′Y + 2X′Xb = 0

which gives the normal equations

X′Xb̂ = X′Y

so that

b̂ = (X′X)⁻¹X′Y = AY, where A = (X′X)⁻¹X′

E(b̂) = AE(Y) = AXb = (X′X)⁻¹X′Xb = b, so b̂ is unbiased.
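The normal equations give a direct recipe for b̂. A small sketch (Python with NumPy, on made-up data; np.linalg.lstsq serves only as an independent check) solving X′Xb̂ = X′Y:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 50, 3
# design matrix with an intercept column of ones
X = np.column_stack([np.ones(n), rng.standard_normal((n, k))])
beta_true = np.array([1.0, 2.0, -0.5, 0.3])
Y = X @ beta_true + 0.1 * rng.standard_normal(n)

# normal equations: X'X b_hat = X'Y
b_hat = np.linalg.solve(X.T @ X, X.T @ Y)

# agrees with NumPy's least-squares solver
b_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)
assert np.allclose(b_hat, b_lstsq)
```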
Fitted values are given by

Ŷ = Xb̂ = X(X′X)⁻¹X′Y = HY,  where H = X(X′X)⁻¹X′

H is called the “hat matrix” (… it puts the hats on the Y’s)
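The hat matrix’s defining properties (symmetry, idempotence, and HY = Ŷ) are easy to verify numerically. A sketch in Python with NumPy, using an arbitrary design matrix:

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 10, 3
X = np.column_stack([np.ones(n), rng.standard_normal((n, p - 1))])
Y = rng.standard_normal(n)

H = X @ np.linalg.solve(X.T @ X, X.T)   # H = X (X'X)^{-1} X'

assert np.allclose(H, H.T)              # H is symmetric
assert np.allclose(H @ H, H)            # H is idempotent

b_hat = np.linalg.solve(X.T @ X, X.T @ Y)
assert np.allclose(H @ Y, X @ b_hat)    # HY gives the fitted values
```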
The error sum of squares, SSRES, is the minimised value of S:

SSRES = Y′Y − 2b̂′X′Y + b̂′X′Xb̂
      = Y′Y − 2b̂′X′Y + b̂′X′X(X′X)⁻¹X′Y
      = Y′Y − b̂′X′Y   (using X′Xb̂ = X′Y)

The estimate of σ² is based on this.
Example: Find a model of the form

Yi = b0 + b1x1i + b2x2i + ei

for the data below.

y     x1    x2
3.5   3.1   30
3.2   3.4   25
3.0   3.0   20
2.9   3.2   30
4.0   3.9   40
2.5   2.8   25
2.3   2.2   30
Y = (3.5, 3.2, 3.0, 2.9, 4.0, 2.5, 2.3)′

X =
[ 1  3.1  30 ]
[ 1  3.4  25 ]
[ 1  3.0  20 ]
[ 1  3.2  30 ]
[ 1  3.9  40 ]
[ 1  2.8  25 ]
[ 1  2.2  30 ]

X is called the design matrix. The model in matrix form is given by Y = Xb + e.
We have already seen that

X′Xb̂ = X′Y,  i.e.  b̂ = (X′X)⁻¹X′Y

Now calculate this for our example.
R can be used to calculate X′X and the answer is:

X′X =
[   7.0    21.6    200.0 ]
[  21.6    68.3    626.0 ]
[ 200.0   626.0   5950.0 ]
To input the matrix in R use

X = matrix(c(1,1,1,1,1,1,1,
             3.1,3.4,3.0,3.2,3.9,2.8,2.2,
             30,25,20,30,40,25,30),
           7, 3)      # 7 = number of rows, 3 = number of columns
t(X) %*% X            # %*% is the command for matrix multiplication

The inverse of X′X can also be obtained using R, with solve(t(X) %*% X).
We also need to calculate X′Y:

X′Y = (21.4, 67.67, 623.5)′

Now b̂ = (X′X)⁻¹X′Y = (−0.2138, 0.8984, 0.01745)′.

Notice that this is the same result as obtained previously using lm in R.

So y = −0.2138 + 0.8984x1 + 0.01745x2 + e
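As a cross-check on the hand calculation, b̂ can be recomputed from the data above. A sketch in Python with NumPy (an illustrative translation, not the notes’ own R code):

```python
import numpy as np

# data from the example table
y  = np.array([3.5, 3.2, 3.0, 2.9, 4.0, 2.5, 2.3])
x1 = np.array([3.1, 3.4, 3.0, 3.2, 3.9, 2.8, 2.2])
x2 = np.array([30, 25, 20, 30, 40, 25, 30], dtype=float)
X  = np.column_stack([np.ones(7), x1, x2])   # design matrix

XtX = X.T @ X                                # matches the 3x3 matrix computed in R
b_hat = np.linalg.solve(XtX, X.T @ y)        # b_hat ≈ (-0.2138, 0.8984, 0.01745)
```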
The “hat matrix” is given by

H = X(X′X)⁻¹X′

and the fitted Y values are obtained by

Ŷ = HY

Recall once more that we are looking at the model Y = Xb + e; the fitted values Ŷ can be compared with the observed values Y.
Error Terms and Inference
A useful result is:

σ̂² = (Y′Y − b̂′X′Y) / (n − k − 1)

n: number of points
k: number of explanatory variables
In addition we can show that:

(b̂i − bi) / s.e.(b̂i) ~ t(n−k−1)

where s.e.(b̂i) = σ̂ √c(i+1)(i+1) and c(i+1)(i+1) is the (i+1)th diagonal element of (X′X)⁻¹.
For our example:

Y′Y − b̂′X′Y = 67.44 − 67.1031 = 0.3369

σ̂² = 0.3369 / (7 − 2 − 1) = 0.08422,  so  σ̂ = 0.2902

(X′X)⁻¹ was calculated earlier; its diagonal elements are

c11 = 6.683, c22 = 0.7600, c33 = 0.0053

Note that c11 is associated with b0, c22 with b1 and c33 with b2.

We will calculate the standard error for b̂1:

s.e.(b̂1) = √0.7600 × 0.2902 = 0.2530

The value of b̂1 is 0.8984.
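The same machinery reproduces σ̂² and the standard error of b̂1. A Python/NumPy sketch continuing the example:

```python
import numpy as np

# data from the example table
y  = np.array([3.5, 3.2, 3.0, 2.9, 4.0, 2.5, 2.3])
x1 = np.array([3.1, 3.4, 3.0, 3.2, 3.9, 2.8, 2.2])
x2 = np.array([30, 25, 20, 30, 40, 25, 30], dtype=float)
X  = np.column_stack([np.ones(7), x1, x2])
n, k = 7, 2

b_hat = np.linalg.solve(X.T @ X, X.T @ y)
sigma2_hat = (y @ y - b_hat @ (X.T @ y)) / (n - k - 1)   # ≈ 0.08422
C = np.linalg.inv(X.T @ X)             # c11, c22, c33 are its diagonal elements
se_b1 = np.sqrt(sigma2_hat * C[1, 1])  # ≈ 0.2530
```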
Now carry out a hypothesis test.

H0: b1 = 0
H1: b1 ≠ 0

The standard error of b̂1 is 0.2530. The test statistic is

t = (b̂1 − b1) / s.e.(b̂1)

This calculates as (0.8984 − 0)/0.2530 = 3.55.

t tables using 4 degrees of freedom give a cut-off point of 2.776 for 2.5%.

Since 3.55 > 2.776 we reject H0. There is evidence at the 5% level that b1 is non-zero.

The process can be repeated for the other b values and confidence intervals calculated in the usual way.
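The test itself is a one-liner once b̂1 and its standard error are known. A small sketch using the rounded values from above (the 2.776 cut-off is taken from t tables, as in the text):

```python
b1_hat, se_b1 = 0.8984, 0.2530   # rounded values computed in the example above
t_stat = (b1_hat - 0) / se_b1    # tests H0: b1 = 0
t_crit = 2.776                   # two-sided 5% point of t with 4 d.f. (from tables)
# t_stat ≈ 3.55 > 2.776, so H0 is rejected
```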
CI for σ² — based on the χ²(4) distribution of 4σ̂²/σ²:

((4 × 0.08422)/11.14 , (4 × 0.08422)/0.4844)

i.e. (0.030 , 0.695)
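The interval can be reproduced directly from σ̂² and the tabulated χ²(4) points. A sketch using the values quoted above:

```python
sigma2_hat, df = 0.08422, 4
chi2_hi, chi2_lo = 11.14, 0.4844   # upper/lower 2.5% points of chi-square, 4 d.f. (from tables)
ci = (df * sigma2_hat / chi2_hi, df * sigma2_hat / chi2_lo)
# ci ≈ (0.030, 0.695)
```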
The sum of squares of the residuals can also be calculated:

SSRES = (Y − Xb̂)′(Y − Xb̂)