Multiple Regression Analysis: Part 1


Page 1: Multiple Regression Analysis: Part 1


Multiple Regression Analysis: Part 1

Correlation, Simple Regression, Introduction to Multiple Regression and Matrix Algebra

Page 2: Multiple Regression Analysis: Part 1

Background: 3 Aims of Research

1.
2.
3.

Regression Defined:

Page 3: Multiple Regression Analysis: Part 1

Numerical Example

25 CDs
X = Marketing $'s
Y = Sales Index
Question: Can we predict sales by knowing marketing expenditures?

CD   Marketing (x $1000)   SalesIndx
 1        87                 33.7
 2        69                 35.1
 3        70                 36.4
 4        73                 37.8
 5       129                 39.1
 6       189                 40.5
 7        88                 41.8
 8        93                 43.2
 9       111                 44.6
10       123                 45.9
11       255                 47.3
12       113                 48.6
13       201                 50.0
14       189                 51.4
15        99                 52.7
16       125                 54.1
17       222                 55.4
18       198                 56.8
19       236                 58.2
20       172                 59.5
21       144                 60.9
22       139                 62.2
23        92                 63.6
24       189                 64.9
25       200                 66.3

Page 4: Multiple Regression Analysis: Part 1

Correlation

The relationship between x and y…

r_xy = Σ(z_x z_y) / N

Or,

r_xy = [NΣXY − (ΣX)(ΣY)] / √([NΣX² − (ΣX)²] [NΣY² − (ΣY)²])

For our data:

r_xy = [25(187,253.80) − (3606)(1250)] / √([25(596,116) − (3606)²] [25(64,899.87) − (1250)²])
     = 173,845.5 / √((1,899,664)(59,996.75))
     = .515
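If you want to check the arithmetic yourself, here is a minimal Python/numpy sketch (not part of the original deck; the arrays and variable names are mine, keyed in from the page 3 table, so results agree to rounding):

```python
import numpy as np

# The 25 CDs from the Numerical Example: x = marketing ($1000s), y = sales index
x = np.array([87, 69, 70, 73, 129, 189, 88, 93, 111, 123, 255, 113, 201,
              189, 99, 125, 222, 198, 236, 172, 144, 139, 92, 189, 200], float)
y = np.array([33.7, 35.1, 36.4, 37.8, 39.1, 40.5, 41.8, 43.2, 44.6, 45.9,
              47.3, 48.6, 50.0, 51.4, 52.7, 54.1, 55.4, 56.8, 58.2, 59.5,
              60.9, 62.2, 63.6, 64.9, 66.3])
N = len(x)

# Raw-score formula for r
num = N * np.sum(x * y) - x.sum() * y.sum()
den = np.sqrt((N * np.sum(x**2) - x.sum()**2) * (N * np.sum(y**2) - y.sum()**2))
print(num / den)                  # ~0.515
print(np.corrcoef(x, y)[0, 1])    # same value from numpy directly
```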

Page 5: Multiple Regression Analysis: Part 1

Or visually… r = .515, r² = .265

[Scatterplot "CD Sales & Marketing": Marketing Costs on the x-axis (25 to 300), CD Sales Index on the y-axis (20 to 80), with the fitted line and R² = 0.2652]

Page 6: Multiple Regression Analysis: Part 1

Given the relationship, we can predict y by developing the simple regression equation

Predicted score:  y' = a + bx

Actual score:  y = a + bx + e

y' = the predicted score
a = the intercept
b = the slope (regression weight)
x = the score on the predictor
e = the error, or residual (y − y')

Page 7: Multiple Regression Analysis: Part 1

Calculating parameter estimates

If you have the correlation and standard deviations…

b = r (s_y / s_x)

If you do not…

b = [NΣXY − (ΣX)(ΣY)] / [NΣX² − (ΣX)²]

Once you have b, a is easy…

a = Ȳ − bX̄

For our data:

b = .515 (10 / 56.27) = .092

b = [25(187,253.8) − (3606)(1250)] / [25(596,116) − (3606)²] = .092

a = 50 − .092(144.24) = 36.73
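The same estimates in numpy, a sketch continuing the arrays from the earlier block (note the slide rounds b to .092 before computing a, so the hand value 36.73 differs slightly from the unrounded result of roughly 36.8):

```python
# Slope and intercept by the raw-score formulas above
b = (N * np.sum(x * y) - x.sum() * y.sum()) / (N * np.sum(x**2) - x.sum()**2)
a = y.mean() - b * x.mean()
print(b, a)   # b ~0.0915 (slide: .092), a ~36.8 (slide, using rounded b: 36.73)

# Equivalent route if you already have r and the standard deviations
r = np.corrcoef(x, y)[0, 1]
b_alt = r * (y.std(ddof=1) / x.std(ddof=1))   # same slope
```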

Page 8: Multiple Regression Analysis: Part 1

Numerical Example with more stuff

CD     x      y      x·y      y'    y − y'  (y − y')²  y' − M(y)  (y' − M(y))²
 1     87   33.7    2931.5   44.8   -11.1    122.5      -5.24       27.4
 2     69   35.1    2418.8   43.1    -8.1     65.0      -6.89       47.4
 3     70   36.4    2548.9   43.2    -6.8     46.1      -6.79       46.2
 4     73   37.8    2757.3   43.5    -5.7     32.6      -6.52       42.5
 5    129   39.1    5047.8   48.6    -9.5     89.8      -1.39        1.9
 6    189   40.5    7652.4   54.1   -13.6    185.2       4.10       16.8
 7     88   41.8    3682.6   44.9    -3.0      9.0      -5.15       26.5
 8     93   43.2    4018.2   45.3    -2.1      4.4      -4.69       22.0
 9    111   44.6    4946.7   47.0    -2.4      5.7      -3.04        9.3
10    123   45.9    5648.6   48.1    -2.1      4.5      -1.94        3.8
11    255   47.3   12057.1   60.1   -12.9    165.2      10.14      102.7
12    113   48.6    5496.5   47.1     1.5      2.3      -2.86        8.2
13    201   50.0   10050.0   55.2    -5.2     27.0       5.19       27.0
14    189   51.4    9706.8   54.1    -2.7      7.5       4.10       16.8
15     99   52.7    5219.0   45.9     6.9     47.0      -4.14       17.1
16    125   54.1    6759.5   48.2     5.8     34.1      -1.76        3.1
17    222   55.4   12306.5   57.1    -1.7      2.8       7.12       50.6
18    198   56.8   11245.1   54.9     1.9      3.5       4.92       24.2
19    236   58.2   13723.9   58.4    -0.2      0.1       8.40       70.5
20    172   59.5   10235.9   52.5     7.0     48.6       2.54        6.5
21    144   60.9    8765.2   50.0    10.9    118.6      -0.02        0.0
22    139   62.2    8649.7   49.5    12.7    161.5      -0.48        0.2
23     92   63.6    5850.0   45.2    18.4    337.4      -4.78       22.9
24    189   64.9   12274.7   54.1    10.8    117.7       4.10       16.8
25    200   66.3   13260.9   55.1    11.2    125.5       5.10       26.0
Sum  3606 1250.0  187253.8 1250.0     0.0   1763.5       0.0       636.4

Page 9: Multiple Regression Analysis: Part 1

Partitioning Variance – What else?

Total variation = SSy, or SSTOT

What we cannot account for… actual y-scores minus predicted y-scores (y − y'). Can square and sum to get SSRES.

What we can account for… SSTOT − SSRES (a.k.a. SSREG). Or: predicted y-scores minus the mean of y, squared and summed. Why?

Page 10: Multiple Regression Analysis: Part 1

Calculating F, because we can

MSREG = SSREG / dfREG = 636.4 / 1 = 636.4

MSRES = SSRES / dfRES = 1763.5 / 23 = 76.67

F = MSREG / MSRES = 636.4 / 76.67 = 8.301
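As a sketch (continuing the Python from the earlier blocks, with names of my own choosing), the whole partition and F drop out in a few lines:

```python
# Partition SS_TOT into SS_REG and SS_RES, then form F
y_hat = a + b * x
ss_tot = np.sum((y - y.mean())**2)          # ~2399.9
ss_res = np.sum((y - y_hat)**2)             # ~1763.5
ss_reg = ss_tot - ss_res                    # ~636.4
k = 1                                       # number of predictors
F = (ss_reg / k) / (ss_res / (N - k - 1))   # ~8.30 on (1, 23) df
r_squared = ss_reg / ss_tot                 # ~.265
```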

Page 11: Multiple Regression Analysis: Part 1

Effect Size / Fit…

r² = 1 − (SSRES / SSTOT), or SSREG / SSTOT

r² = 1 − (1763.5 / 2399.9) = 0.2652

Take our previously calculated F, 8.301. We can evaluate it at df = (k, N − k − 1).

The null hypothesis of this test is ___________________________________.

Page 12: Multiple Regression Analysis: Part 1


Multiple Regression

Multiple Independent (predictor) variables

One Dependent (criterion) variable

Predicted Score

y’ = a + b1x1 + b2x2 + … + bkxk

Actual Score

yi = a + b1x1 + b2x2 + … + bkxk + ei

Page 13: Multiple Regression Analysis: Part 1

Numerical Example

N = 25 Participants (CDs)
X1: Marketing Expenditures
X2: Airplay/Day
Y: Sales Index
Question: Can the two pieces of information, Marketing Expenditures and Airplay, be used in combination to predict CD Sales?

CD   Marketing (x $1000)   Airplay/day   SalesIndx
 1        87                 12.49        33.696
 2        69                  8.65        35.054
 3        70                 14.41        36.413
 4        73                 13.73        37.772
 5       129                 19.73        39.130
 6       189                 21.65        40.489
 7        88                 16.63        41.848
 8        93                 17.90        43.207
 9       111                 15.95        44.565
10       123                 18.76        45.924
11       255                 28.74        47.283
12       113                 18.62        48.641
13       201                 26.49        50.000
14       189                 21.37        51.359
15        99                 16.78        52.717
16       125                 19.23        54.076
17       222                 24.76        55.435
18       198                 25.83        56.793
19       236                 23.73        58.152
20       172                 21.99        59.511
21       144                 21.61        60.870
22       139                 25.45        62.228
23        92                 15.05        63.587
24       189                 28.98        64.946
25       200                 25.15        66.304

Page 14: Multiple Regression Analysis: Part 1

Selected SPSS Output (1)

ANOVA(b)

Model 1       Sum of Squares    df    Mean Square     F      Sig.
Regression        1049.020       2      524.510     8.542   .002(a)
Residual          1350.844      22       61.402
Total             2399.864      24

a. Predictors: (Constant), Number of plays per day, Marketing in thousands $'s
b. Dependent Variable: Sales Index

Model Summary(b)

Model 1     R       R Square   Adjusted R Square   Std. Error of the Estimate   Durbin-Watson
          .661(a)     .437           .386                   7.83594                 1.010

a. Predictors: (Constant), Number of plays per day, Marketing in thousands $'s
b. Dependent Variable: Sales Index

Page 15: Multiple Regression Analysis: Part 1

Selected SPSS Output (2)

Coefficients(a)

                             Unstandardized          Standardized
Model 1                      B        Std. Error     Beta      t       Sig.    95% CI for B (Lower, Upper)
(Constant)                   22.883     6.934                  3.300   .003    (8.502, 37.265)
Marketing in thousands $'s    -.048      .061        -.273     -.794   .436    (-.175, .078)
Number of plays per day       1.693      .653         .890     2.592   .017    (.339, 3.047)

a. Dependent Variable: Sales Index

Notice the change in b for Marketing!
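A quick way to reproduce this output (to rounding) is ordinary least squares in numpy; a sketch, with the airplay numbers keyed in from page 13 and variable names of my own:

```python
import numpy as np

# x (marketing) and y (sales index) as before; x2 = airplay per day
x2 = np.array([12.49, 8.65, 14.41, 13.73, 19.73, 21.65, 16.63, 17.90, 15.95,
               18.76, 28.74, 18.62, 26.49, 21.37, 16.78, 19.23, 24.76, 25.83,
               23.73, 21.99, 21.61, 25.45, 15.05, 28.98, 25.15])

X = np.column_stack([np.ones_like(x), x, x2])   # design matrix with intercept
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coef)   # ~[22.88, -0.048, 1.693], the SPSS B column
```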

Page 16: Multiple Regression Analysis: Part 1

The equations introduced previously can be extended to the two-IV case.

Involves finding six SS terms: SSX1, SSX2, SSX1&X2, SSY, SSX1&Y, SSX2&Y.

Must also calculate: two b-weights, two beta weights, and the correlation between X1 and X2.

Then SS for Regression, Residual, and Total; then significance tests for each b-weight.

In general, it is a pain in the backside.

Page 17: Multiple Regression Analysis: Part 1

For Instance, to obtain b1 & b2…

SSX1 = ΣX1² − (ΣX1)²/N = 135.15 − (41.9)²/16 = 135.15 − 109.73 = 25.42

SSX2 = ΣX2² − (ΣX2)²/N = 217,576 − (1852)²/16 = 217,576 − 214,369 = 3207

SSY = ΣY² − (ΣY)²/N = 115,149 − (1347)²/16 = 115,149 − 113,400.56 = 1748.44

SSX1Y = ΣX1Y − (ΣX1)(ΣY)/N = 3704.5 − (41.9)(1347)/16 = 3704.5 − 3527.46 = 177.04

SSX2Y = ΣX2Y − (ΣX2)(ΣY)/N = 158,003 − (1852)(1347)/16 = 158,003 − 155,915.25 = 2087.75

SSX1X2 = ΣX1X2 − (ΣX1)(ΣX2)/N = 5101 − (41.9)(1852)/16 = 5101 − 4849.93 = 251.07

b1 = [(SSX2)(SSX1Y) − (SSX1X2)(SSX2Y)] / [(SSX1)(SSX2) − (SSX1X2)²]
   = [(3207)(177.04) − (251.07)(2087.75)] / [(25.42)(3207) − (251.07)²]
   = (567,767.28 − 524,171.39) / (81,521.94 − 63,036.14)
   = 43,595.89 / 18,485.8 = 2.358

b2 = [(SSX1)(SSX2Y) − (SSX1X2)(SSX1Y)] / [(SSX1)(SSX2) − (SSX1X2)²]
   = [(25.42)(2087.75) − (251.07)(177.04)] / 18,485.8
   = (53,070.61 − 44,449.43) / 18,485.8
   = 8621.18 / 18,485.8 = 0.466

Note: this is from a different example… mileage may vary for the current example.
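For what it's worth, the same bookkeeping is easy to script. A sketch (my own helper, not part of the deck) that computes the six SS terms and the two b-weights for any two-IV problem:

```python
import numpy as np

def two_iv_b_weights(x1, x2, y):
    """b1, b2, and a for y' = a + b1*x1 + b2*x2, via the SS formulas above."""
    n = len(y)
    def ss(u, v):                      # sum of squares / cross-products
        return np.sum(u * v) - u.sum() * v.sum() / n
    ssx1, ssx2, ssx1x2 = ss(x1, x1), ss(x2, x2), ss(x1, x2)
    ssx1y, ssx2y = ss(x1, y), ss(x2, y)
    denom = ssx1 * ssx2 - ssx1x2**2
    b1 = (ssx2 * ssx1y - ssx1x2 * ssx2y) / denom
    b2 = (ssx1 * ssx2y - ssx1x2 * ssx1y) / denom
    a = y.mean() - b1 * x1.mean() - b2 * x2.mean()
    return b1, b2, a

# On the CD data it recovers the SPSS coefficients (to rounding):
# two_iv_b_weights(x, x2, y)  ->  (~-0.048, ~1.693, ~22.88)
```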

Page 18: Multiple Regression Analysis: Part 1

Which is why matrix algebra is our friend

There's only one equation to get the standardized regression weights:

Bi = Rij⁻¹ Riy

Then another one to get R²:

R² = Ryi Bi

And so on.

So, let’s take a joyride through the wonderful world of Matrix Algebra
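Before the tour, here is the payoff in miniature: a numpy sketch (names mine) of those two matrix equations, applied to the x, x2, and y arrays from the earlier blocks:

```python
import numpy as np

# Correlation matrix of [X1, X2, Y] for the CD data
R = np.corrcoef(np.column_stack([x, x2, y]), rowvar=False)
R_ij = R[:2, :2]                 # predictor intercorrelations
R_iy = R[:2, 2]                  # predictor-criterion correlations

B = np.linalg.inv(R_ij) @ R_iy   # standardized weights: ~[-0.273, 0.890]
R_sq = R_iy @ B                  # ~0.437, matching the Model Summary
```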

Page 19: Multiple Regression Analysis: Part 1


First, some definitions

For us, matrix algebra is a set of operations that can be carried out on a group of numbers (a matrix) as a whole.

A Matrix is denoted by a bold capital letter. It has R rows and C columns (thus has dimension R×C). R and/or C can be 1: when R = 1, the matrix is a row vector; when C = 1, it is a column vector. When both R and C are 1, it is a scalar (usually denoted by a lowercase bold letter).

Xij – X is a matrix, and i represents the row and j the column. Thus, x31 refers to the element in the third row and first column.

Page 20: Multiple Regression Analysis: Part 1

Example

X = | 5  5 |
    | 4  6 |
    | 3  2 |
    | 4  4 |
    | 4  3 |

The order of X is 5×2; X31 = 3.

Page 21: Multiple Regression Analysis: Part 1

Some other important concepts

A is a diagonal matrix:

A = | 2.40  0.00  0.00 |
    | 0.00  1.76  0.00 |
    | 0.00  0.00  3.94 |

I is an identity matrix:

I = | 1.00  0.00  0.00 |
    | 0.00  1.00  0.00 |
    | 0.00  0.00  1.00 |

Page 22: Multiple Regression Analysis: Part 1

Matrix Transpose

X is our 5×2 matrix previously introduced. X' is the transpose of X.

X = | 5  5 |        X' = | 5  4  3  4  4 |
    | 4  6 |             | 5  6  2  4  3 |
    | 3  2 |
    | 4  4 |
    | 4  3 |

Page 23: Multiple Regression Analysis: Part 1

Matrix Addition

Given two matrices, X and Y

X = | 5  5 |        Y = | 7  7 |
    | 4  6 |            | 6  6 |
    | 3  2 |            | 5  2 |
    | 4  4 |            | 4  4 |
    | 4  3 |            | 4  7 |

Then we can add the individual elements of X and Y to get T

T = X + Y = | 12  12 |
            | 10  12 |
            |  8   4 |
            |  8   8 |
            |  8  10 |

Page 24: Multiple Regression Analysis: Part 1

Similarly, Matrix Subtraction…

Given the same two matrices, X and Y

X = | 5  5 |        Y = | 7  7 |
    | 4  6 |            | 6  6 |
    | 3  2 |            | 5  2 |
    | 4  4 |            | 4  4 |
    | 4  3 |            | 4  7 |

Then we can subtract the individual elements of X and Y to get D

D = X − Y = | -2  -2 |
            | -2   0 |
            | -2   0 |
            |  0   0 |
            |  0  -4 |
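These element-wise operations are one-liners in numpy; a small sketch using the same X and Y:

```python
import numpy as np

X = np.array([[5, 5], [4, 6], [3, 2], [4, 4], [4, 3]])
Y = np.array([[7, 7], [6, 6], [5, 2], [4, 4], [4, 7]])

print(X.T)       # the transpose X' from the earlier page
print(X + Y)     # T, the element-wise sum
print(X - Y)     # D, the element-wise difference above
```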

Page 25: Multiple Regression Analysis: Part 1

We can also use scalars w/matrices

T = | 12  12 |        C = T − 9.2 = |  2.8   2.8 |
    | 10  12 |                      |  0.8   2.8 |
    |  8   4 |                      | -1.2  -5.2 |
    |  8   8 |                      | -1.2  -1.2 |
    |  8  10 |                      | -1.2   0.8 |

Here, I’ve subtracted a scalar, 9.2, from T. I could have also multiplied T by 0.5 to get a matrix of means. The value 9.2 happens to be the mean for each column, meaning we have centered the data within each column.

Page 26: Multiple Regression Analysis: Part 1

Matrix Multiplication: As seen on T.V.!

Matrices must be conformable for multiplication: the first matrix must have the same number of columns as the second matrix has rows. The resulting matrix will be of order R1 × C2.

We then multiply away…

We multiply each element from the first row of the first matrix by the corresponding element of the first column of the second matrix, and sum the products.

Then we multiply each element from the first row of the first matrix by the corresponding element of the second column of the second matrix.

We continue until we run out of columns in the second matrix, and do it over again for the second row of the first matrix.

Page 27: Multiple Regression Analysis: Part 1


Example

If we take the transpose of C (C’) and post-multiply it by C, we could get a new matrix called SSCP. It would go like this.

SSCP11 = (2.8 * 2.8)+(0.8*0.8)+(-1.2*-1.2)+(-1.2*-1.2)+(-1.2*-1.2) = 12.8

SSCP12 = (2.8 * 2.8)+(0.8*2.8)+(-1.2*-5.2)+(-1.2*-1.2)+(-1.2*0.8) = 16.8

SSCP21 = (2.8 * 2.8)+(2.8*0.8)+(-5.2*-1.2)+(-1.2*-1.2)+(0.8*-1.2) = 16.8

SSCP22 = (2.8*2.8)+(2.8*2.8)+(-5.2*-5.2)+(-1.2*-1.2)+(0.8*0.8) = 44.8

C' = | 2.8  0.8  -1.2  -1.2  -1.2 |   ×   C = |  2.8   2.8 |
     | 2.8  2.8  -5.2  -1.2   0.8 |           |  0.8   2.8 |
                                              | -1.2  -5.2 |
                                              | -1.2  -1.2 |
                                              | -1.2   0.8 |

Page 28: Multiple Regression Analysis: Part 1

SSCP, V-C & R

Rearranging the elements into a matrix:

SSCP = | 12.8  16.8 |
       | 16.8  44.8 |

Multiplying by a scalar, 1/(n−1):

V-C = | 3.2   4.2 |
      | 4.2  11.2 |

The above matrix is closely related to the familiar R:

R = | 1.000  0.702 |
    | 0.702  1.000 |
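The whole chain from raw scores to R is a few lines of numpy; a sketch, with names of my choosing:

```python
import numpy as np

T = np.array([[12, 12], [10, 12], [8, 4], [8, 8], [8, 10]], float)
C = T - T.mean(axis=0)            # center each column (the 9.2 from the scalar page)
SSCP = C.T @ C                    # [[12.8, 16.8], [16.8, 44.8]]
VC = SSCP / (len(T) - 1)          # variance-covariance: [[3.2, 4.2], [4.2, 11.2]]
d = np.sqrt(np.diag(VC))          # standard deviations
R = VC / np.outer(d, d)           # correlation matrix, off-diagonal ~0.702
```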

Page 29: Multiple Regression Analysis: Part 1

Matrix Division: It just keeps getting better!

Matrix Division is even stranger than matrix multiplication.

You know most of what you need to know though, since it is accomplished through multiplying by an inverted matrix.

Finding the inverse is the tricky part. We will do a very simple example.

Page 30: Multiple Regression Analysis: Part 1

Inverses

Not all matrices have an inverse. A matrix inverse is defined such that XX⁻¹ = I.

We need two things in order to find the inverse:

1. The determinant of the matrix we wish to take the inverse of, V-C in this case, which is written as |V-C|
2. The adjoint of the same matrix, i.e. V-C, written adj(V-C)

Page 31: Multiple Regression Analysis: Part 1

Determinant and Adjoint

For a 2×2 matrix, V, the determinant is V11·V22 − V12·V21:

|V-C| = (3.2)(11.2) − (4.2)(4.2) = 18.2

The adjoint is formed in the following way:

adj(V) = |  V22  -V12 |
         | -V21   V11 |

Adj(V-C) = | 11.2  -4.2 |
           | -4.2   3.2 |

Page 32: Multiple Regression Analysis: Part 1

Almost there…

We then divide each element of the adjoint matrix by the determinant:

V-C⁻¹ = | 11.2/18.2  -4.2/18.2 |
        | -4.2/18.2   3.2/18.2 |

Or,

V-C⁻¹ = |  0.615  -0.231 |
        | -0.231   0.176 |

Page 33: Multiple Regression Analysis: Part 1

Checking our work…

V-C × V-C⁻¹ = I

V-C = | 3.2   4.2 |        V-C⁻¹ = |  0.615  -0.231 |
      | 4.2  11.2 |                | -0.231   0.176 |

(V-C × V-C⁻¹)11 = 3.2(0.615) + 4.2(-0.231) = 1.968 − 0.970 ≈ 1.0
(V-C × V-C⁻¹)12 = 3.2(-0.231) + 4.2(0.176) = -0.739 + 0.739 = 0
(V-C × V-C⁻¹)21 = 4.2(0.615) + 11.2(-0.231) = 2.583 − 2.587 ≈ 0
(V-C × V-C⁻¹)22 = 4.2(-0.231) + 11.2(0.176) = -0.970 + 1.971 ≈ 1.0

V-C × V-C⁻¹ = | 1.0  0.0 |
              | 0.0  1.0 |
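The same determinant-and-adjoint recipe in Python, checked against numpy's own inverse (a sketch; in practice numpy does all of this for you):

```python
import numpy as np

VC = np.array([[3.2, 4.2], [4.2, 11.2]])

det = VC[0, 0] * VC[1, 1] - VC[0, 1] * VC[1, 0]      # 18.2
adj = np.array([[ VC[1, 1], -VC[0, 1]],
                [-VC[1, 0],  VC[0, 0]]])
VC_inv = adj / det

print(np.round(VC @ VC_inv, 6))                  # the identity matrix
print(np.allclose(VC_inv, np.linalg.inv(VC)))    # True
```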

Page 34: Multiple Regression Analysis: Part 1

Why we leave matrix operations to computers

Finding the determinant of a 3×3 matrix

    | a  b  c |
    | d  e  f |
    | g  h  i |

D = a(ei − fh) + b(fg − di) + c(dh − eg)

Inverting the 3×3 matrix after solving for the determinant:

(1/D) × | ei − fh   ch − bi   bf − ce |
        | fg − di   ai − cg   cd − af |
        | dh − eg   bg − ah   ae − bd |
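As a sketch of why: here is that cofactor formula coded by hand and verified against np.linalg.inv (the function name is mine). Beyond 3×3 the hand formulas get rapidly worse, which is the point.

```python
import numpy as np

def inv3(M):
    """Invert a 3x3 matrix by the determinant/adjoint formula above."""
    (a, b, c), (d, e, f), (g, h, i) = M
    D = a * (e*i - f*h) + b * (f*g - d*i) + c * (d*h - e*g)
    adj = np.array([[e*i - f*h, c*h - b*i, b*f - c*e],
                    [f*g - d*i, a*i - c*g, c*d - a*f],
                    [d*h - e*g, b*g - a*h, a*e - b*d]])
    return adj / D

A = np.array([[2.40, 0.00, 0.00],
              [0.00, 1.76, 0.00],
              [0.00, 0.00, 3.94]])      # the diagonal matrix from the earlier page
print(np.allclose(inv3(A), np.linalg.inv(A)))   # True
```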

Page 35: Multiple Regression Analysis: Part 1


So, why did I drag you through this?