Lecture 4

Econ 488

Ordinary Least Squares (OLS)

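Objective of OLS

Minimize the sum of squared residuals:

$$\min_{\hat{\beta}} \sum_{i=1}^{n} e_i^2$$

where

$$e_i = Y_i - \hat{Y}_i$$

and the model is

$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_K X_{Ki} + \varepsilon_i$$

Remember that OLS is not the only possible estimator of the βs.

But OLS is the best estimator under certain assumptions…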
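The minimization above can be sketched numerically. This is a minimal illustration, not part of the lecture: the data are simulated, the "true" values 3.0 and 2.0 are made up, and `ols_fit` and `ssr_at` are hypothetical helper names. It uses NumPy's least-squares solver and checks that other candidate lines give a larger sum of squared residuals.

```python
import numpy as np

def ols_fit(X, y):
    """Return OLS coefficients (intercept first) and the sum of squared residuals."""
    X_design = np.column_stack([np.ones(len(y)), X])  # prepend the constant term
    beta_hat, *_ = np.linalg.lstsq(X_design, y, rcond=None)
    residuals = y - X_design @ beta_hat               # e_i = Y_i - Y_hat_i
    return beta_hat, float(residuals @ residuals)

# Simulated data; the intercept 3.0 and slope 2.0 are made-up values.
rng = np.random.default_rng(0)
x = rng.uniform(0, 25, size=200)
y = 3.0 + 2.0 * x + rng.normal(0, 5, size=200)

beta_hat, ssr = ols_fit(x, y)

def ssr_at(b0, b1):
    """Sum of squared residuals for an arbitrary candidate line."""
    e = y - (b0 + b1 * x)
    return float(e @ e)

# The OLS line has the smallest sum of squared residuals of any line,
# including the line built from the values used to simulate the data.
print(beta_hat, ssr, ssr_at(3.0, 2.0))
```

Any other pair of coefficients passed to `ssr_at` should return a value at least as large as `ssr`.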

Classical Assumptions

1. Regression is linear in parameters

2. Error term has zero population mean

3. Error term is not correlated with X’s

4. No serial correlation

5. No heteroskedasticity

6. No perfect multicollinearity

and we usually add:

7. Error term is normally distributed

Assumption 1: Linearity

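The regression model: A) is linear

It can be written as

$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_K X_{Ki} + \varepsilon_i$$

This doesn’t mean that the theory must be linear.

For example, suppose we believe that CEO salary is related to the firm’s sales and the CEO’s tenure. We might believe the model is:

$$\log(\text{salary}_i) = \beta_0 + \beta_1 \log(\text{sales}_i) + \beta_2\, \text{tenure}_i + \beta_3\, \text{tenure}_i^2 + \varepsilon_i$$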
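The CEO salary model is nonlinear in the variables but linear in the βs, so OLS still applies once the transformed variables are built as columns. The sketch below assumes simulated data: the sales and tenure distributions and the coefficients `b0` through `b3` are made-up values chosen only to generate an example.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
sales = rng.lognormal(mean=8.0, sigma=1.0, size=n)  # hypothetical firm sales
tenure = rng.uniform(0, 30, size=n)                 # hypothetical CEO tenure in years

# Made-up "true" coefficients, used only to generate example data.
b0, b1, b2, b3 = 4.0, 0.25, 0.03, -0.0005
log_salary = (b0 + b1 * np.log(sales) + b2 * tenure + b3 * tenure**2
              + rng.normal(0, 0.1, size=n))

# log(sales) and tenure^2 are just additional columns of the design matrix;
# the model remains linear in the coefficients.
X = np.column_stack([np.ones(n), np.log(sales), tenure, tenure**2])
beta_hat, *_ = np.linalg.lstsq(X, log_salary, rcond=None)
print(beta_hat)
```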


The regression model: B) is correctly specified

The model must have the right variables.

No omitted variables.

The model must have the correct functional form.

This is all untestable.

We need to rely on economic theory.


The regression model: C) must have an additive error term

The model must have + εi

Assumption 2: E(εi)=0

Error term has a zero population mean: E(εi)=0

Each observation has a random error with a mean of zero.

What if E(εi)≠0?

This is actually fixed by adding a constant (AKA intercept) term.

Example: Suppose instead the mean of εi was -4.

Then we know E(εi+4)=0.

We can add 4 to the error term and subtract 4 from the constant term:

Yi = β0 + β1Xi + εi

Yi = (β0-4) + β1Xi + (εi+4)

We can rewrite:

Yi = β0* + β1Xi + εi*

where β0* = β0-4 and εi* = εi+4.

Now E(εi*)=0, so we are OK.
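The same point can be checked by simulation. This is a sketch under assumed values: the true intercept 10 and slope 2 are made up, and the errors are drawn with mean -4 as in the example. The estimated intercept should land near β0-4 while the slope estimate is unaffected.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000
x = rng.uniform(0, 10, size=n)
eps = rng.normal(-4.0, 1.0, size=n)  # error term with mean -4, as in the example

beta0, beta1 = 10.0, 2.0             # made-up true values
y = beta0 + beta1 * x + eps

X = np.column_stack([np.ones(n), x])
coef = np.linalg.lstsq(X, y, rcond=None)[0]
b0_hat, b1_hat = coef

# The intercept absorbs the nonzero mean of the error (close to beta0 - 4),
# while the slope estimate stays close to beta1.
print(b0_hat, b1_hat)
```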

Assumption 3: Exogeneity

Important!!

All explanatory variables are uncorrelated with the error term:

E(εi|X1i,X2i,…,XKi)=0

Explanatory variables are determined outside of the model (they are exogenous).

What happens if assumption 3 is violated?

Suppose we have the model

Yi = β0 + β1Xi + εi

Suppose Xi and εi are positively correlated.

When Xi is large, εi tends to be large as well.
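A short derivation shows why this matters for the slope estimate. It is a standard textbook result rather than something stated on the slides, and the limit assumes a random sample with finite variances. Substituting the model into the OLS slope formula gives:

```latex
\hat{\beta}_1
  = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}
  = \beta_1 + \frac{\sum_{i=1}^{n} (X_i - \bar{X})\,\varepsilon_i}{\sum_{i=1}^{n} (X_i - \bar{X})^2}

% As n grows, the second term converges in probability to
\beta_1 + \frac{\mathrm{Cov}(X_i, \varepsilon_i)}{\mathrm{Var}(X_i)}
```

If Xi and εi are positively correlated, the second term is positive, so the estimate of β1 is pushed above its true value no matter how large the sample is.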

[Figure: three panels on the same axes (horizontal axis 0 to 25, vertical axis -40 to 120). Panel 1: the “True” Line. Panel 2: the “True” Line with the Data. Panel 3: the “True” Line, the Data, and the Estimated Line.]


Why would x and ε be correlated?

Suppose you are trying to study the relationship between the price of a hamburger and the quantity sold across a wide variety of Ventura County restaurants.

We estimate the relationship using the following model:

salesi = β0 + β1pricei + εi

What’s the problem?

What else determines sales of hamburgers?

How would you decide between buying a burger at McDonald’s ($0.89) or a burger at TGI Fridays ($9.99)?

Quality differs.

In salesi = β0 + β1pricei + εi, quality isn’t an X variable even though it should be. It becomes part of εi.

Higher quality raises sales, and price and quality are highly positively correlated.

Therefore x and ε are also positively correlated.

This means that the estimate of β1 will be too high.

This is called “Omitted Variables Bias” (more in Chapter 6).
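The hamburger example can be mimicked with simulated data. Everything numeric here is an assumption made for illustration: the made-up true price effect is -3, the made-up quality effect is +10, and price is constructed to rise with quality. The helper `ols` is a hypothetical name. Leaving quality out pushes the price coefficient upward; including it recovers a value near -3.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10_000
quality = rng.normal(0, 1, size=n)
price = 5.0 + 2.0 * quality + rng.normal(0, 1, size=n)  # price rises with quality

# Made-up true model: higher price lowers sales, higher quality raises sales.
sales = 100.0 - 3.0 * price + 10.0 * quality + rng.normal(0, 1, size=n)

def ols(columns, y):
    """OLS coefficients with an intercept in position 0."""
    X = np.column_stack([np.ones(len(y)), *columns])
    return np.linalg.lstsq(X, y, rcond=None)[0]

short = ols([price], sales)            # quality omitted: it sits in the error term
long = ols([price, quality], sales)    # quality included as an X variable

# short[1] is biased upward relative to the true price effect of -3;
# long[1] is close to -3.
print(short[1], long[1])
```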

Assumption 4: No Serial Correlation

Serial Correlation: The error terms across observations are correlated with each other,

i.e. ε1 is correlated with ε2, etc.

This is most important in time series.

If errors are serially correlated, an increase in the error term in one time period affects the error term in the next.

The assumption that there is no serial correlation can be unrealistic in time series.

Think of data from a stock market…

[Figure: Real S&P 500 Stock Price Index (Price) plotted by Year, 1870 to 2020; vertical axis from -500 to 2000.]

Stock data is serially correlated!
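To make serial correlation concrete, the sketch below compares independent errors with errors that carry over from one period to the next. The AR(1) form and the persistence value `rho = 0.8` are assumptions chosen for illustration, not something taken from the stock data, and `lag1_autocorr` is a hypothetical helper.

```python
import numpy as np

def lag1_autocorr(e):
    """Sample correlation between e_t and e_(t-1)."""
    e = np.asarray(e, dtype=float) - np.mean(e)
    return float(e[1:] @ e[:-1] / (e @ e))

rng = np.random.default_rng(4)
T = 2000
shocks = rng.normal(0, 1, size=T)  # independent errors: no serial correlation

rho = 0.8                          # hypothetical persistence parameter
ar_errors = np.empty(T)
ar_errors[0] = shocks[0]
for t in range(1, T):
    # Part of last period's error carries over into this period's error.
    ar_errors[t] = rho * ar_errors[t - 1] + shocks[t]

print(lag1_autocorr(shocks), lag1_autocorr(ar_errors))
```

The first number should be close to zero and the second close to `rho`.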

Assumption 5: Homoskedasticity

Homoskedasticity: The error has a constant variance

This is what we want, as opposed to

Heteroskedasticity: The variance of the error depends on the values of the Xs.

[Figure: Homoskedasticity. The error has constant variance.]

[Figure: Heteroskedasticity. Spread of error depends on X.]

[Figure: Another form of heteroskedasticity.]
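The difference between the two cases can be seen by comparing the spread of the errors at low and high values of X. This is a simulated sketch: the error standard deviations and the choice to make the spread proportional to X are assumptions for illustration, and `spread_ratio` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 4000
x = rng.uniform(1, 25, size=n)

eps_homo = rng.normal(0, 5.0, size=n)        # constant variance at every x
eps_hetero = rng.normal(0, 1.0, size=n) * x  # standard deviation grows with x

def spread_ratio(x, eps):
    """Error variance for x above its median divided by variance below it."""
    high = x > np.median(x)
    return float(np.var(eps[high]) / np.var(eps[~high]))

# Near 1 under homoskedasticity; well above 1 when the spread grows with x.
print(spread_ratio(x, eps_homo), spread_ratio(x, eps_hetero))
```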

Assumption 6: No Perfect Multicollinearity

Two variables are perfectly collinear if one can be determined perfectly from the other (i.e. if you know the value of x, you can always find the value of z).

Example: Suppose we regress income on age, and include both age in months and age in years.

But age in years = age in months/12,

e.g. if we know someone is 246 months old, we also know that they are 20.5 years old.

What’s wrong with this?

incomei = β0 + β1agemonthsi + β2ageyearsi + εi

What is β1?

It is the change in income associated with a one unit increase in “age in months,” holding age in years constant.

But if you hold age in years constant, age in months doesn’t change!

β1 = Δincome/Δagemonths

Holding Δageyears = 0

If Δageyears = 0, then Δagemonths = 0

So β1 = Δincome/0

It is undefined!

When one independent variable is a perfect linear combination of the other independent variables, it is called Perfect Multicollinearity.

Example: Total Cholesterol, HDL and LDL

Total Cholesterol = LDL + HDL

Can’t include all three as independent variables in a regression.

Solution: Drop one of the variables.
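In matrix terms, perfect multicollinearity means the columns of the regressor matrix are linearly dependent, so the matrix does not have full column rank. The sketch below checks this for the age example using simulated ages; the age range and sample size are made-up values.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 100
age_months = rng.integers(216, 780, size=n).astype(float)  # hypothetical ages
age_years = age_months / 12.0                              # exact function of age_months

# Constant, age in months, and age in years: three columns.
X_both = np.column_stack([np.ones(n), age_months, age_years])
# Dropping one of the two age variables removes the redundancy.
X_one = np.column_stack([np.ones(n), age_years])

rank_both = np.linalg.matrix_rank(X_both)
rank_one = np.linalg.matrix_rank(X_one)

# X_both has 3 columns but only 2 independent ones, so the OLS coefficients
# are not uniquely determined. X_one has full column rank.
print(X_both.shape[1], rank_both, X_one.shape[1], rank_one)
```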

Assumption 7: Normally Distributed Error

This is not required for OLS, but it is important for hypothesis testing.

More on this assumption next time.

Putting it all together

Last class, we talked about how to compare estimators. We want:

1. $\hat{\beta}$ is unbiased: $E(\hat{\beta}) = \beta$. On average, the estimator is equal to the population value.

2. $\hat{\beta}$ is efficient. The variance of the estimator is as small as possible.

Gauss-Markov Theorem

Given OLS assumptions 1 through 6, the OLS estimator of βk is the minimum variance estimator from the set of all linear unbiased estimators of βk for k=0,1,2,…,K.

OLS is BLUE: the Best Linear Unbiased Estimator.
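A small Monte Carlo experiment can illustrate the theorem by comparing OLS with one other linear unbiased estimator of the slope. The comparison estimator (the slope of the line through the first and last observations) is chosen here purely for illustration, and the true values and error variance are made up. Both estimators should average out near the true slope, but OLS should vary less across samples.

```python
import numpy as np

rng = np.random.default_rng(7)
x = np.linspace(0, 25, 30)               # fixed regressor values
beta0, beta1, sigma = 3.0, 2.0, 5.0      # made-up true values
reps = 5000

ols_slopes = np.empty(reps)
endpoint_slopes = np.empty(reps)
xd = x - x.mean()

for r in range(reps):
    # Errors satisfy the classical assumptions: mean zero, constant variance,
    # independent of each other and of x.
    y = beta0 + beta1 * x + rng.normal(0, sigma, size=x.size)
    ols_slopes[r] = xd @ (y - y.mean()) / (xd @ xd)
    # Another linear unbiased estimator: slope through the two end points.
    endpoint_slopes[r] = (y[-1] - y[0]) / (x[-1] - x[0])

print(ols_slopes.mean(), endpoint_slopes.mean())
print(ols_slopes.var(), endpoint_slopes.var())
```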


What happens if we add assumption 7?

Given assumptions 1 through 7, OLS is the best unbiased estimator,

even out of the non-linear estimators.

OLS is BUE?


With Assumptions 1-7 OLS is:

1. Unbiased: $E(\hat{\beta}) = \beta$

2. Minimum Variance: the variance of the sampling distribution is as small as possible.

3. Consistent: as n→∞, the estimators converge to the true parameters. As n increases, variance gets smaller, so each estimate approaches the true value of β.

4. Normally Distributed. You can apply statistical tests to them.
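Consistency can be illustrated by repeating the estimation at several sample sizes and watching the spread of the slope estimates shrink. This is a sketch with simulated data: the true values, the error standard deviation, and the sample sizes are assumptions, and `slope_estimates` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(8)
beta0, beta1 = 3.0, 2.0  # made-up true values

def slope_estimates(n, reps=500):
    """OLS slope estimates from `reps` independent samples of size n."""
    out = np.empty(reps)
    for r in range(reps):
        x = rng.uniform(0, 25, size=n)
        y = beta0 + beta1 * x + rng.normal(0, 5.0, size=n)
        xd = x - x.mean()
        out[r] = xd @ (y - y.mean()) / (xd @ xd)
    return out

# Standard deviation of the slope estimates at each sample size.
spread = {n: float(slope_estimates(n).std()) for n in (20, 200, 2000)}
print(spread)
```

The spread should fall as n rises, which is the sense in which each estimate approaches the true value of β.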