Linear Regression Primer

  • 7/27/2019 Linear Regression Primer

    DERIVING LINEAR REGRESSION COEFFICIENTS

    This sequence shows how the regression coefficients for a simple regression model are derived, using the least squares criterion (OLS, for ordinary least squares).

    True model: $Y = \beta_1 + \beta_2 X + u$

    [Figure: scatter diagram of Y against X (X axis from 0 to 3, Y axis from 0 to 6) with three observations labelled $Y_1$, $Y_2$ and $Y_3$.]

    We will start with a numerical example with just three observations: (1,3), (2,5), and (3,6).

    [Figure: scatter diagram of the three observations, $Y_1 = 3$ at $X = 1$, $Y_2 = 5$ at $X = 2$ and $Y_3 = 6$ at $X = 3$.]

    Writing the fitted regression as $\hat{Y} = b_1 + b_2 X$, we will determine the values of $b_1$ and $b_2$ that minimize RSS, the sum of the squares of the residuals.

    True model: $Y = \beta_1 + \beta_2 X + u$
    Fitted model: $\hat{Y} = b_1 + b_2 X$

    $$\hat{Y}_1 = b_1 + b_2, \qquad \hat{Y}_2 = b_1 + 2b_2, \qquad \hat{Y}_3 = b_1 + 3b_2$$

    [Figure: the three observations with a fitted line of intercept $b_1$ and slope $b_2$; the fitted values $\hat{Y}_1$, $\hat{Y}_2$ and $\hat{Y}_3$ lie on the line.]

    Given our choice of $b_1$ and $b_2$, the residuals are as shown.

    $$\begin{aligned}
    e_1 &= Y_1 - \hat{Y}_1 = 3 - b_1 - b_2 \\
    e_2 &= Y_2 - \hat{Y}_2 = 5 - b_1 - 2b_2 \\
    e_3 &= Y_3 - \hat{Y}_3 = 6 - b_1 - 3b_2
    \end{aligned}$$
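As an illustration of the residual definitions, here is a minimal Python sketch. The function name `residuals` and the trial values of $b_1$ and $b_2$ are assumptions made for this example only; they are not part of the original slides.

```python
# Observations from the numerical example, as (X, Y) pairs.
data = [(1, 3), (2, 5), (3, 6)]


def residuals(b1, b2, observations):
    """Return e_i = Y_i - (b1 + b2 * X_i) for each observation."""
    return [y - (b1 + b2 * x) for x, y in observations]


# Hypothetical trial values, chosen only to show the calculation.
trial_b1, trial_b2 = 2.0, 1.0
trial_residuals = residuals(trial_b1, trial_b2, data)
print(trial_residuals)  # [0.0, 1.0, 1.0]
```

Any other choice of $b_1$ and $b_2$ gives a different set of residuals, which is why RSS is treated as a function of the two coefficients.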

    The sum of the squares of the residuals is thus as shown.

    $$RSS = e_1^2 + e_2^2 + e_3^2 = (3 - b_1 - b_2)^2 + (5 - b_1 - 2b_2)^2 + (6 - b_1 - 3b_2)^2$$

    The quadratics have been expanded.

    $$\begin{aligned}
    RSS &= 9 + b_1^2 + b_2^2 - 6b_1 - 6b_2 + 2b_1 b_2 \\
    &\quad + 25 + b_1^2 + 4b_2^2 - 10b_1 - 20b_2 + 4b_1 b_2 \\
    &\quad + 36 + b_1^2 + 9b_2^2 - 12b_1 - 36b_2 + 6b_1 b_2
    \end{aligned}$$

    Like terms have been added together.

    $$RSS = 70 + 3b_1^2 + 14b_2^2 - 28b_1 - 62b_2 + 12b_1 b_2$$
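The algebra can be checked numerically. The sketch below compares the sum of squared residuals with the collected expression on a small integer grid; the function names and the grid are illustrative assumptions. Because both expressions are quadratics in $b_1$ and $b_2$, agreement at these points implies that they are identical.

```python
def rss_direct(b1, b2):
    """RSS written as the sum of the three squared residuals."""
    return (3 - b1 - b2) ** 2 + (5 - b1 - 2 * b2) ** 2 + (6 - b1 - 3 * b2) ** 2


def rss_expanded(b1, b2):
    """RSS after expanding the quadratics and adding like terms."""
    return 70 + 3 * b1**2 + 14 * b2**2 - 28 * b1 - 62 * b2 + 12 * b1 * b2


# Integer grid, so the comparison is exact (no rounding error).
grid = [(b1, b2) for b1 in range(-3, 4) for b2 in range(-3, 4)]
expansion_matches = all(rss_direct(b1, b2) == rss_expanded(b1, b2) for b1, b2 in grid)
print(expansion_matches)  # True
```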

    For a minimum, the partial derivatives of RSS with respect to $b_1$ and $b_2$ should be zero. (We should also check a second-order condition.)

    $$\frac{\partial RSS}{\partial b_1} = 0 \;\Rightarrow\; 6b_1 + 12b_2 - 28 = 0$$
    $$\frac{\partial RSS}{\partial b_2} = 0 \;\Rightarrow\; 12b_1 + 28b_2 - 62 = 0$$

    The first-order conditions give us two equations in two unknowns.
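The slides mention the second-order condition without showing it. One standard way to check it, offered here as an assumption rather than as the author's method, is to confirm that the matrix of second derivatives of RSS is positive definite. For $RSS = 70 + 3b_1^2 + 14b_2^2 - 28b_1 - 62b_2 + 12b_1 b_2$ the second derivatives are constants.

```python
# Second derivatives of RSS = 70 + 3*b1^2 + 14*b2^2 - 28*b1 - 62*b2 + 12*b1*b2.
h11 = 6   # d2 RSS / d b1^2
h12 = 12  # d2 RSS / (d b1 d b2)
h22 = 28  # d2 RSS / d b2^2

# A symmetric 2x2 matrix is positive definite when its top-left entry
# and its determinant are both positive.
determinant = h11 * h22 - h12**2
is_minimum = h11 > 0 and determinant > 0
print(determinant, is_minimum)  # 24 True
```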

    Solving them, we find that RSS is minimized when $b_1$ and $b_2$ are equal to 1.67 and 1.50, respectively.

    $$b_1 = 1.67, \qquad b_2 = 1.50$$
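The solution can be reproduced by solving the two first-order conditions, $6b_1 + 12b_2 = 28$ and $12b_1 + 28b_2 = 62$. The sketch below uses Cramer's rule with exact fractions; this is one convenient method chosen for illustration, and the variable names are assumptions.

```python
from fractions import Fraction

# First-order conditions written as a linear system:
#    6*b1 + 12*b2 = 28
#   12*b1 + 28*b2 = 62
a11, a12, c1 = Fraction(6), Fraction(12), Fraction(28)
a21, a22, c2 = Fraction(12), Fraction(28), Fraction(62)

# Cramer's rule for a 2x2 system.
det = a11 * a22 - a12 * a21
b1 = (c1 * a22 - a12 * c2) / det
b2 = (a11 * c2 - c1 * a21) / det

print(b1, b2)  # 5/3 3/2
print(round(float(b1), 2), float(b2))  # 1.67 1.5
```

The exact values are 5/3 and 3/2; the slides report them rounded to two decimal places.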

    Here is the scatter diagram again.

    [Figure: the three observations with a fitted line $\hat{Y} = b_1 + b_2 X$ of intercept $b_1$ and slope $b_2$, and fitted values $\hat{Y}_1 = b_1 + b_2$, $\hat{Y}_2 = b_1 + 2b_2$ and $\hat{Y}_3 = b_1 + 3b_2$.]

    The fitted line and the fitted values of Y are as shown.

    Fitted model: $\hat{Y} = 1.67 + 1.50X$

    $$\hat{Y}_1 = 3.17, \qquad \hat{Y}_2 = 4.67, \qquad \hat{Y}_3 = 6.17$$

    [Figure: the three observations and the fitted line, with the fitted values marked on the line.]
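A short sketch of how the fitted values follow from the coefficients. It assumes the exact values 5/3 and 3/2 behind the rounded 1.67 and 1.50; the variable names are illustrative.

```python
from fractions import Fraction

# Exact coefficients behind the rounded values 1.67 and 1.50.
b1, b2 = Fraction(5, 3), Fraction(3, 2)
xs = [1, 2, 3]

# Fitted value for each observation: Y_hat_i = b1 + b2 * X_i.
fitted = [b1 + b2 * x for x in xs]
rounded = [round(float(value), 2) for value in fitted]
print(rounded)  # [3.17, 4.67, 6.17]
```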

    Before we move on to the general case, it is as well to make a small but important mathematical point.

    When we establish the expression for RSS, we do so as a function of $b_1$ and $b_2$. At this stage, $b_1$ and $b_2$ are not specific values. Our task is to determine the particular values that minimize RSS.

    We should give these values special names, to differentiate them from the rest.

    Obvious names would be $b_1^{OLS}$ and $b_2^{OLS}$, OLS standing for Ordinary Least Squares and meaning that these are the values that minimize RSS. We have rewritten the first-order conditions and their solution accordingly.

    $$\frac{\partial RSS}{\partial b_1} = 0 \;\Rightarrow\; 6b_1^{OLS} + 12b_2^{OLS} - 28 = 0$$
    $$\frac{\partial RSS}{\partial b_2} = 0 \;\Rightarrow\; 12b_1^{OLS} + 28b_2^{OLS} - 62 = 0$$
    $$b_1^{OLS} = 1.67, \qquad b_2^{OLS} = 1.50$$

    Now we will proceed to the general case with n observations.

    True model: $Y = \beta_1 + \beta_2 X + u$

    [Figure: scatter diagram of Y against X with n observations, from $(X_1, Y_1)$ to $(X_n, Y_n)$.]

    Given our choice of $b_1$ and $b_2$, we will obtain a fitted line as shown.

    Fitted model: $\hat{Y} = b_1 + b_2 X$

    $$\hat{Y}_1 = b_1 + b_2 X_1, \qquad \hat{Y}_n = b_1 + b_2 X_n$$

    [Figure: the n observations with a fitted line of intercept $b_1$ and slope $b_2$.]

    The residual for the first observation is defined. Similarly we define the residuals for the remaining observations. That for the last one is marked.

    $$\begin{aligned}
    e_1 &= Y_1 - \hat{Y}_1 = Y_1 - b_1 - b_2 X_1 \\
    &\;\;\vdots \\
    e_n &= Y_n - \hat{Y}_n = Y_n - b_1 - b_2 X_n
    \end{aligned}$$

    RSS, the sum of the squares of the residuals, is defined for the general case. The data for the numerical example are shown for comparison.

    $$RSS = e_1^2 + \dots + e_n^2 = (Y_1 - b_1 - b_2 X_1)^2 + \dots + (Y_n - b_1 - b_2 X_n)^2$$

    Numerical example: $RSS = (3 - b_1 - b_2)^2 + (5 - b_1 - 2b_2)^2 + (6 - b_1 - 3b_2)^2$

    The quadratics are expanded.

    $$\begin{aligned}
    RSS &= Y_1^2 + b_1^2 + b_2^2 X_1^2 - 2b_1 Y_1 - 2b_2 X_1 Y_1 + 2b_1 b_2 X_1 \\
    &\quad + \dots \\
    &\quad + Y_n^2 + b_1^2 + b_2^2 X_n^2 - 2b_1 Y_n - 2b_2 X_n Y_n + 2b_1 b_2 X_n
    \end{aligned}$$

    Like terms are added together.

    $$RSS = \sum Y_i^2 + n b_1^2 + b_2^2 \sum X_i^2 - 2b_1 \sum Y_i - 2b_2 \sum X_i Y_i + 2b_1 b_2 \sum X_i$$

    Numerical example: $RSS = 70 + 3b_1^2 + 14b_2^2 - 28b_1 - 62b_2 + 12b_1 b_2$
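To see how the general expression reduces to the numerical one, the sketch below computes the six coefficients of RSS from the data. The function name `rss_coefficients` and the dictionary keys are assumptions made for this illustration.

```python
def rss_coefficients(xs, ys):
    """Coefficients of RSS, viewed as a quadratic in b1 and b2."""
    n = len(xs)
    return {
        "constant": sum(y * y for y in ys),                # sum of Y_i^2
        "b1^2": n,                                         # n
        "b2^2": sum(x * x for x in xs),                    # sum of X_i^2
        "b1": -2 * sum(ys),                                # -2 * sum of Y_i
        "b2": -2 * sum(x * y for x, y in zip(xs, ys)),     # -2 * sum of X_i*Y_i
        "b1*b2": 2 * sum(xs),                              # 2 * sum of X_i
    }


coefficients = rss_coefficients([1, 2, 3], [3, 5, 6])
print(coefficients)
# {'constant': 70, 'b1^2': 3, 'b2^2': 14, 'b1': -28, 'b2': -62, 'b1*b2': 12}
```

The output matches the numerical expression term by term, which is the comparison the slides invite.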

    Note that in this equation the observations on X and Y are just data that determine the coefficients in the expression for RSS.

    $$RSS = \sum Y_i^2 + n b_1^2 + b_2^2 \sum X_i^2 - 2b_1 \sum Y_i - 2b_2 \sum X_i Y_i + 2b_1 b_2 \sum X_i$$

    The choice variables in the expression are $b_1$ and $b_2$. This may seem a bit strange because in elementary calculus courses $b_1$ and $b_2$ are usually constants and X and Y are variables.

    However, if you have any doubts, compare what we are doing in the general case with what we did in the numerical example.

    $$RSS = 70 + 3b_1^2 + 14b_2^2 - 28b_1 - 62b_2 + 12b_1 b_2$$
    $$\frac{\partial RSS}{\partial b_1} = 0 \;\Rightarrow\; 6b_1 + 12b_2 - 28 = 0$$
    $$\frac{\partial RSS}{\partial b_2} = 0 \;\Rightarrow\; 12b_1 + 28b_2 - 62 = 0$$
    $$b_1 = 1.67, \qquad b_2 = 1.50$$

    The first derivative with respect to $b_1$.

    $$\frac{\partial RSS}{\partial b_1} = 0 \;\Rightarrow\; 2nb_1 - 2\sum Y_i + 2b_2 \sum X_i = 0$$

    With some simple manipulation we obtain a tidy expression for $b_1$.

    $$nb_1 = \sum Y_i - b_2 \sum X_i$$
    $$b_1 = \bar{Y} - b_2 \bar{X}$$
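The expression $b_1 = \bar{Y} - b_2 \bar{X}$ can be checked against the numerical example. The sketch below assumes the slope 3/2 found there; the function name is hypothetical.

```python
from fractions import Fraction


def intercept_given_slope(xs, ys, b2):
    """Return b1 = Y_bar - b2 * X_bar for a given slope b2."""
    n = len(xs)
    x_bar = Fraction(sum(xs), n)
    y_bar = Fraction(sum(ys), n)
    return y_bar - b2 * x_bar


# Slope from the numerical example (assumed known at this point).
b1 = intercept_given_slope([1, 2, 3], [3, 5, 6], Fraction(3, 2))
print(b1)  # 5/3
```

With $\bar{X} = 2$ and $\bar{Y} = 14/3$, this gives 5/3, the value reported as 1.67.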

    The first derivative with respect to $b_2$.

    $$\frac{\partial RSS}{\partial b_2} = 0 \;\Rightarrow\; 2b_2 \sum X_i^2 - 2\sum X_i Y_i + 2b_1 \sum X_i = 0$$

    Divide through by 2.

    $$b_2 \sum X_i^2 - \sum X_i Y_i + b_1 \sum X_i = 0$$

    We now substitute for $b_1$ using the expression obtained for it, $b_1 = \bar{Y} - b_2 \bar{X}$, and we thus obtain an equation that contains $b_2$ only.

    $$b_2 \sum X_i^2 - \sum X_i Y_i + (\bar{Y} - b_2 \bar{X}) \sum X_i = 0$$

    The definition of the sample mean has been used.

    $$\bar{X} = \frac{\sum X_i}{n} \;\Rightarrow\; \sum X_i = n\bar{X}$$
    $$b_2 \sum X_i^2 - \sum X_i Y_i + (\bar{Y} - b_2 \bar{X})\, n\bar{X} = 0$$

    The last two terms have been disentangled.

    $$b_2 \sum X_i^2 - \sum X_i Y_i + n\bar{X}\bar{Y} - nb_2 \bar{X}^2 = 0$$

    Terms not involving $b_2$ have been transferred to the right side.

    $$b_2 \left( \sum X_i^2 - n\bar{X}^2 \right) = \sum X_i Y_i - n\bar{X}\bar{Y}$$


    Hence we obtain an expression for $b_2$.

    $$b_2 = \frac{\sum X_i Y_i - n\bar{X}\bar{Y}}{\sum X_i^2 - n\bar{X}^2}$$
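A minimal sketch of this expression for $b_2$, applied to the numerical example. The function name `slope_from_sums` is an assumption, and exact fractions are used only to avoid rounding.

```python
from fractions import Fraction


def slope_from_sums(xs, ys):
    """b2 = (sum X_i*Y_i - n*X_bar*Y_bar) / (sum X_i^2 - n*X_bar^2)."""
    n = len(xs)
    x_bar = Fraction(sum(xs), n)
    y_bar = Fraction(sum(ys), n)
    numerator = sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar
    denominator = sum(x * x for x in xs) - n * x_bar**2
    return numerator / denominator


b2 = slope_from_sums([1, 2, 3], [3, 5, 6])
print(b2)  # 3/2
```

Here the numerator is $31 - 28 = 3$ and the denominator is $14 - 12 = 2$, which reproduces the slope 1.50 from the numerical example.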

    In practice, we shall use an alternative expression. We will demonstrate that it is equivalent.

    $$b_2 = \frac{\sum X_i Y_i - n\bar{X}\bar{Y}}{\sum X_i^2 - n\bar{X}^2}$$
    $$b_2 = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2}$$

    Expanding the numerator, we obtain the terms shown.

    $$\sum (X_i - \bar{X})(Y_i - \bar{Y}) = \sum X_i Y_i - \sum X_i \bar{Y} - \sum \bar{X} Y_i + \sum \bar{X}\bar{Y}$$

    In the second term the mean value of Y is a common factor. In the third, the mean value of X is a common factor. The last term is the same for all i.

    $$\sum (X_i - \bar{X})(Y_i - \bar{Y}) = \sum X_i Y_i - \bar{Y} \sum X_i - \bar{X} \sum Y_i + n\bar{X}\bar{Y}$$

    We use the definitions of the sample means to simplify the expression.

    $$\bar{X} = \frac{\sum X_i}{n} \;\Rightarrow\; \sum X_i = n\bar{X}, \qquad \bar{Y} = \frac{\sum Y_i}{n} \;\Rightarrow\; \sum Y_i = n\bar{Y}$$
    $$\sum (X_i - \bar{X})(Y_i - \bar{Y}) = \sum X_i Y_i - \bar{Y}\, n\bar{X} - \bar{X}\, n\bar{Y} + n\bar{X}\bar{Y}$$

    Hence we have shown that the numerators of the two expressions are the same.

    $$\sum (X_i - \bar{X})(Y_i - \bar{Y}) = \sum X_i Y_i - n\bar{X}\bar{Y}$$

    The denominator is mathematically a special case of the numerator, replacing Y by X. Hence the expressions are equivalent.

    $$\sum (X_i - \bar{X})(Y_i - \bar{Y}) = \sum X_i Y_i - n\bar{X}\bar{Y}$$
    $$\sum (X_i - \bar{X})^2 = \sum X_i^2 - n\bar{X}^2$$
    $$b_2 = \frac{\sum X_i Y_i - n\bar{X}\bar{Y}}{\sum X_i^2 - n\bar{X}^2} = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2}$$
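The equivalence of the two expressions for $b_2$ can also be checked numerically. In the sketch below the function names and the second data set are hypothetical, introduced only for the comparison; exact fractions make the equality test reliable.

```python
from fractions import Fraction


def slope_raw_sums(xs, ys):
    """b2 from sums of products: (sum XY - n*Xbar*Ybar) / (sum X^2 - n*Xbar^2)."""
    n = len(xs)
    x_bar, y_bar = Fraction(sum(xs), n), Fraction(sum(ys), n)
    numerator = sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar
    denominator = sum(x * x for x in xs) - n * x_bar**2
    return numerator / denominator


def slope_deviations(xs, ys):
    """b2 from deviations about the sample means."""
    n = len(xs)
    x_bar, y_bar = Fraction(sum(xs), n), Fraction(sum(ys), n)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator


example = ([1, 2, 3], [3, 5, 6])        # data from the numerical example
other = ([0, 1, 2, 5], [1, 3, 2, 7])    # hypothetical second data set
forms_agree = all(slope_raw_sums(*d) == slope_deviations(*d) for d in (example, other))
print(forms_agree)  # True
```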


    We chose the parameters of the fitted line so as to minimize the sum of the squares of the residuals. As a result, we derived the expressions for $b_1$ and $b_2$.

    True model: $Y = \beta_1 + \beta_2 X + u$
    Fitted model: $\hat{Y} = b_1 + b_2 X$

    $$b_1 = \bar{Y} - b_2 \bar{X}$$
    $$b_2 = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2}$$
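Putting the two expressions together gives a small routine for simple regression. This is a sketch under stated assumptions: the name `ols_simple` and the input checks are additions for illustration, and floating-point arithmetic is used, so results are subject to ordinary rounding.

```python
def ols_simple(xs, ys):
    """Return (b1, b2) for the fitted line Y_hat = b1 + b2 * X."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Denominator of b2: sum of squared deviations of X about its mean.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("X has no variation, so the slope is undefined")
    # Numerator of b2: sum of cross-products of deviations.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b2 = sxy / sxx
    b1 = y_bar - b2 * x_bar
    return b1, b2


b1, b2 = ols_simple([1, 2, 3], [3, 5, 6])
print(round(b1, 2), round(b2, 2))  # 1.67 1.5
```

The check on `sxx` reflects the fact that the expression for $b_2$ is undefined when every observation has the same value of X.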

    Again, we should make the mathematical point discussed in the context of the numerical example. These are the particular values of $b_1$ and $b_2$ that minimize RSS, and we should differentiate them from the rest by giving them special names, for example $b_1^{OLS}$ and $b_2^{OLS}$.

    $$b_1^{OLS} = \bar{Y} - b_2^{OLS} \bar{X}$$
    $$b_2^{OLS} = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2}$$

    However, for the next few chapters, we shall mostly be concerned with the OLS estimators, and so the superscript 'OLS' is not really necessary. It will be dropped, to simplify the notation.

    Typically, an intercept should be included in the regression specification. Occasionally, however, one may have reason to fit the regression without an intercept. In the case of a simple regression model, the true and fitted models become as shown.

    True model: $Y = \beta_2 X + u$
    Fitted model: $\hat{Y} = b_2 X$

    We will derive the expression for $b_2$ from first principles using the least squares criterion. The residual in observation i is $e_i = Y_i - b_2 X_i$.

    $$e_i = Y_i - \hat{Y}_i = Y_i - b_2 X_i$$

    With this, we obtain the expression for the sum of the squares of the residuals.

    $$RSS = \sum (Y_i - b_2 X_i)^2 = \sum Y_i^2 - 2b_2 \sum X_i Y_i + b_2^2 \sum X_i^2$$

    We differentiate with respect to $b_2$. The OLS estimator is the value that makes this slope equal to zero (the first-order condition for a minimum). Note that we have distinguished properly between the general $b_2$ and the specific $b_2^{OLS}$.

    $$\frac{dRSS}{db_2} = 2b_2 \sum X_i^2 - 2\sum X_i Y_i$$
    $$2b_2^{OLS} \sum X_i^2 - 2\sum X_i Y_i = 0$$

    Hence, we obtain the OLS estimator of $b_2$ for this model.

    $$b_2^{OLS} = \frac{\sum X_i Y_i}{\sum X_i^2}$$

    The second derivative is positive, confirming that we have found a minimum.

    $$\frac{d^2 RSS}{db_2^2} = 2\sum X_i^2 > 0$$
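As an illustration of the no-intercept case, the sketch below applies $b_2^{OLS} = \sum X_i Y_i / \sum X_i^2$ to the three observations (1,3), (2,5) and (3,6) used earlier in the sequence. Reusing those data here is an assumption for the example, as are the function names; the slides do not fit this model to them.

```python
from fractions import Fraction


def slope_through_origin(xs, ys):
    """b2 = sum(X_i * Y_i) / sum(X_i^2); integer data assumed for exact fractions."""
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    return Fraction(sum_xy, sum_xx)  # raises ZeroDivisionError if every X is zero


def rss_through_origin(b2, xs, ys):
    """Sum of squared residuals e_i = Y_i - b2 * X_i."""
    return sum((y - b2 * x) ** 2 for x, y in zip(xs, ys))


xs, ys = [1, 2, 3], [3, 5, 6]
b2_ols = slope_through_origin(xs, ys)
second_derivative = 2 * sum(x * x for x in xs)

# RSS at the estimate should be lower than at nearby slopes.
step = Fraction(1, 10)
rss_at_minimum = rss_through_origin(b2_ols, xs, ys)
is_lower_than_neighbours = all(
    rss_at_minimum < rss_through_origin(b2_ols + d, xs, ys) for d in (-step, step)
)
print(b2_ols, second_derivative, is_lower_than_neighbours)  # 31/14 28 True
```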

    Copyright Christopher Dougherty 2012.

    These slideshows may be downloaded by anyone, anywhere for personal use. Subject to respect for copyright and, where appropriate, attribution, they may be used as a resource for teaching an econometrics course. There is no need to refer to the author.

    The content of this slideshow comes from Section 1.3 of C. Dougherty, Introduction to Econometrics, fourth edition 2011, Oxford University Press. Additional (free) resources for both students and instructors may be downloaded from the OUP Online Resource Centre http://www.oup.com/uk/orc/bin/9780199567089/ .

    Individuals studying econometrics on their own who feel that they might benefit from participation in a formal course should consider the London School of Economics summer school course EC212 Introduction to Econometrics http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx or the University of London International Programmes distance learning course EC2020 Elements of Econometrics www.londoninternational.ac.uk/lse .