Transcript of 9420 (1)
8/11/2019 9420 (1)
Solutions for $\beta_0$ and $\beta_1$

OLS chooses the values of $\hat{\beta}_0$ and $\hat{\beta}_1$ that minimize the unexplained sum of squares. To find the minimum, take partial derivatives with respect to $\hat{\beta}_0$ and $\hat{\beta}_1$:

$$SSR = \sum_{i=1}^{n} \hat{u}_i^2$$

$$SSR = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

$$SSR = \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)^2$$
Solutions for $\beta_0$ and $\beta_1$

The derivatives were transformed into the normal equations. Solving the normal equations for $\hat{\beta}_0$ and $\hat{\beta}_1$ gives us our OLS estimators:

$$\sum_i y_i = n\hat{\beta}_0 + \hat{\beta}_1 \sum_i x_i$$

$$\sum_i x_i y_i = \hat{\beta}_0 \sum_i x_i + \hat{\beta}_1 \sum_i x_i^2$$
Solutions for $\beta_0$ and $\beta_1$

Our estimate of the slope of the line is:

$$\hat{\beta}_1 = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}$$

And our estimate of the intercept is:

$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$
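These two formulas can be computed directly from data. The course's worked examples use Stata; the sketch below is illustrative Python (the data values are made up), and it also checks that the fitted residuals satisfy the normal equations from the previous slide.

```python
# OLS slope and intercept from the textbook formulas:
#   b1 = sum((xi - xbar)(yi - ybar)) / sum((xi - xbar)^2)
#   b0 = ybar - b1 * xbar

def ols(x, y):
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
         sum((xi - xbar) ** 2 for xi in x)
    b0 = ybar - b1 * xbar
    return b0, b1

# Hypothetical data for illustration
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1 = ols(x, y)
print(b0, b1)

# The residuals from the fitted line satisfy the normal equations:
# sum(u_hat) = 0 and sum(x * u_hat) = 0, up to floating-point error.
resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
print(abs(sum(resid)) < 1e-9)
print(abs(sum(xi * ui for xi, ui in zip(x, resid))) < 1e-9)
```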
Gauss-Markov Theorem

Under the 5 Gauss-Markov assumptions, the OLS estimator is the best, linear, unbiased estimator of the true parameter(s), conditional on the sample values of the explanatory variables. In other words, the OLS estimators are BLUE.

(Carl Friedrich Gauss; Andrey Markov)
5 Gauss-Markov Assumptions for the Simple Linear Model (Wooldridge, p. 65)

1. Linear in parameters: $y = \beta_0 + \beta_1 x_1 + u$
2. Random sampling of n observations: $\{(x_i, y_i): i = 1, 2, \ldots, n\}$
3. Sample variation in the explanatory variables: the $x_i$'s are not all the same value
4. Zero conditional mean: the error u has an expected value of 0, given any value of the explanatory variable: $E(u \mid x) = 0$
5. Homoskedasticity: the error has the same variance given any value of the explanatory variable: $Var(u \mid x) = \sigma^2$
The Linearity Assumption

Key to understanding OLS models. The restriction is that our model of the population must be linear in the parameters:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_k x_k + u$$

A model cannot be non-linear in the parameters. Non-linearity in the variables (the x's), however, is fine and quite useful.

Fine (linear in the parameters):

$$OLS[\, y = \beta_0 + \beta_1 x_1^2 + u \,]$$
$$OLS[\, y = \beta_0 + \beta_1 \ln(x_1) + u \,]$$

Not fine (non-linear in the parameters):

$$OLS[\, y = \beta_0 + \beta_1^2 x_1 + u \,]$$
$$OLS[\, y = \ln(\beta_0 + \beta_1 x_1) + u \,]$$
[Figure: plot of y = ln(x) against x, increasing and concave]
Interpretation of Logs in Regression Analysis

Model       | Dependent Variable | Independent Variable | Interpretation of β1
------------|--------------------|----------------------|---------------------
Level-Level | y                  | x                    | Δy = β1·Δx
Level-Log   | y                  | ln(x)                | Δy = (β1/100)·%Δx
Log-Level   | ln(y)              | x                    | %Δy = (100·β1)·Δx
Log-Log     | ln(y)              | ln(x)                | %Δy = β1·%Δx
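The log-log row, for example, says that β1 is an elasticity. A quick numeric sketch (illustrative Python, with made-up parameter values, not from the slides): if ln(y) = β0 + β1·ln(x), then a 1% increase in x changes y by approximately β1 percent.

```python
import math

# Log-log model: ln(y) = b0 + b1*ln(x), with hypothetical b0, b1.
b0, b1 = 2.0, 0.8

def y(x):
    return math.exp(b0 + b1 * math.log(x))

# Raise x by 1% and measure the percentage change in y.
x = 50.0
pct_change_y = (y(x * 1.01) - y(x)) / y(x) * 100
print(pct_change_y)  # approximately b1 percent (0.8), since the change in x is small
```

The approximation is good for small percentage changes; for large changes the exact effect is $(1.01^{\beta_1} - 1) \times 100$.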
[Figure: quadratic function y = 6 + 8x − 2x², plotted against x; intercept at y = 6, maximum at x = 2, y = 14]
Demonstration of the Homoskedasticity Assumption

[Figure: predicted line $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$ drawn under homoskedasticity; the conditional densities f(y|x) at x1, x2, x3, x4 all have the same spread. The variance across values of x is constant.]
Demonstration of the Homoskedasticity Assumption

[Figure: predicted line $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$ drawn under heteroskedasticity; the conditional densities f(y|x) at x1, x2, x3, x4 have different spreads. The variance differs across values of x.]
Second-Best Properties of Estimators

Asymptotic (or large-sample) properties:
- True in the hypothetical instance of infinite data
- In practice, applicable if N > 50 or so
- Asymptotically unbiased
- Consistency
- Asymptotic efficiency
Bias

A parameter estimate is unbiased if

$$E(\hat{\beta}_j) = \beta_j, \quad j = 0, 1, \ldots, k$$

In other words, the average value of the estimator in repeated sampling equals the true parameter. Note that whether an estimator is biased or not implies nothing about its dispersion.
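"Correct on average in repeated sampling" can be illustrated by simulation. A Python sketch (all values hypothetical): draw many samples from a model with a known slope and check that the OLS slope estimates average out to the true β1.

```python
import random

random.seed(42)
TRUE_B0, TRUE_B1 = 3.0, 2.0  # hypothetical true parameters

def ols_slope(x, y):
    # b1 = sum((xi - xbar)(yi - ybar)) / sum((xi - xbar)^2)
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    return sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
           sum((xi - xbar) ** 2 for xi in x)

# Repeated sampling: 2,000 samples of n = 30 from y = b0 + b1*x + u,
# with u ~ N(0, 1). Each sample gives a different estimate of the slope.
estimates = []
for _ in range(2000):
    x = [random.uniform(0, 10) for _ in range(30)]
    y = [TRUE_B0 + TRUE_B1 * xi + random.gauss(0, 1) for xi in x]
    estimates.append(ols_slope(x, y))

# The average estimate should be close to the true slope of 2.0.
print(sum(estimates) / len(estimates))
```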
Efficiency

An estimator is efficient if its variance is less than that of any other estimator of the parameter. This criterion is only useful in combination with others (e.g., $\hat{\beta} = 2$ has low variance, but is biased).

$\hat{\beta}_j$ is the best unbiased estimator if $Var(\hat{\beta}_j) \le Var(\tilde{\beta}_j)$, where $\tilde{\beta}_j$ is any other unbiased estimator of $\beta_j$.

We might want to choose a biased estimator, if it has a smaller variance.
[Figure: sampling distributions f(β̂). A biased estimator of β is centered at β + bias; an unbiased and efficient estimator is centered at the true β; a distribution with high sampling variance is an inefficient estimator of β.]
BLUE

(Best Linear Unbiased Estimator) An estimator $\hat{\beta}_j$ is BLUE if:
- $\hat{\beta}_j$ is a linear function
- $\hat{\beta}_j$ is unbiased: $E(\hat{\beta}_j) = \beta_j, \; j = 0, 1, \ldots, k$
- $\hat{\beta}_j$ is the most efficient: $Var(\hat{\beta}_j) \le Var(\tilde{\beta}_j)$ for any other linear unbiased estimator $\tilde{\beta}_j$
Large Sample Properties

- Asymptotically unbiased: as n becomes larger, $E(\hat{\beta}_j)$ tends toward $\beta_j$
- Consistency: if the bias and variance both decrease as n gets larger, the estimator is consistent
- Asymptotic efficiency: the asymptotic distribution has finite mean and variance, the estimator is consistent, and no other estimator has smaller asymptotic variance
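Consistency can also be seen by simulation. An illustrative Python sketch (hypothetical parameter values): as n grows, the spread of the OLS slope estimates around the true β1 shrinks toward zero.

```python
import random

random.seed(7)

def ols_slope(x, y):
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    return sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
           sum((xi - xbar) ** 2 for xi in x)

def spread(n, reps=500, true_b1=2.0):
    # Standard deviation of the OLS slope across `reps` simulated samples of size n.
    ests = []
    for _ in range(reps):
        x = [random.uniform(0, 10) for _ in range(n)]
        y = [1.0 + true_b1 * xi + random.gauss(0, 1) for xi in x]
        ests.append(ols_slope(x, y))
    m = sum(ests) / len(ests)
    return (sum((e - m) ** 2 for e in ests) / len(ests)) ** 0.5

# The sampling spread falls as n rises -- the estimator is consistent.
print(spread(10), spread(100), spread(1000))
```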
Let's Show that OLS is Unbiased

Begin with our equation: $y_i = \beta_0 + \beta_1 x_i + u$

$u \sim N(0, \sigma^2)$ and $y_i \sim N(\beta_0 + \beta_1 x_i, \sigma^2)$

A linear function of a normal random variable is also a normal random variable. Thus $\hat{\beta}_0$ and $\hat{\beta}_1$ are normal random variables.
The Robust Assumption of Normality

Even if we do not know the distribution of y, $\hat{\beta}_0$ and $\hat{\beta}_1$ will behave like normal random variables. The Central Limit Theorem says that estimates of the mean of any random variable will approach normal as n increases. This assumes cases are independent (errors not correlated) and identically distributed (i.i.d.). This is critical for hypothesis testing: the $\hat{\beta}$'s are normal regardless of y.
Showing $\hat{\beta}_1$ is Unbiased

Recall the formula for $\hat{\beta}_1$:

$$\hat{\beta}_1 = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}$$

From the rules of summation, this reduces to:

$$\hat{\beta}_1 = \frac{\sum_i (x_i - \bar{x})\, y_i}{\sum_i (x_i - \bar{x})^2}$$
Showing $\hat{\beta}_1$ is Unbiased

Now we substitute for $y_i$ to yield:

$$\hat{\beta}_1 = \frac{\sum_i (x_i - \bar{x})(\beta_0 + \beta_1 x_i + u_i)}{\sum_i (x_i - \bar{x})^2}$$

This expands to:

$$\hat{\beta}_1 = \frac{\beta_0 \sum_i (x_i - \bar{x}) + \beta_1 \sum_i (x_i - \bar{x})\, x_i + \sum_i (x_i - \bar{x})\, u_i}{\sum_i (x_i - \bar{x})^2}$$
Showing $\hat{\beta}_1$ is Unbiased

Now, we can separate terms to yield:

$$\hat{\beta}_1 = \frac{\beta_0 \sum_i (x_i - \bar{x})}{\sum_i (x_i - \bar{x})^2} + \frac{\beta_1 \sum_i (x_i - \bar{x})\, x_i}{\sum_i (x_i - \bar{x})^2} + \frac{\sum_i (x_i - \bar{x})\, u_i}{\sum_i (x_i - \bar{x})^2}$$

Now, we need to rely on two more rules of summation:

$$6. \;\; \sum_{i=1}^{n} (x_i - \bar{x}) = 0$$

$$7. \;\; \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - n\bar{x}^2 = \sum_{i=1}^{n} (x_i - \bar{x})\, x_i$$
Showing $\hat{\beta}_1$ is Unbiased

By the first summation rule, the first term = 0. By the second summation rule, the ratio in the second term = 1, so the second term = $\beta_1$. This leaves:

$$\hat{\beta}_1 = \beta_1 + \frac{\sum_i (x_i - \bar{x})\, u_i}{\sum_i (x_i - \bar{x})^2}$$
Showing $\hat{\beta}_1$ is Unbiased

Expanding the summation yields:

$$\hat{\beta}_1 = \beta_1 + \frac{(x_1 - \bar{x})\, u_1}{\sum_i (x_i - \bar{x})^2} + \frac{(x_2 - \bar{x})\, u_2}{\sum_i (x_i - \bar{x})^2} + \ldots + \frac{(x_n - \bar{x})\, u_n}{\sum_i (x_i - \bar{x})^2}$$

To show that $\hat{\beta}_1$ is unbiased, we must show that the expectation of $\hat{\beta}_1$ equals $\beta_1$:

$$E(\hat{\beta}_1) = \beta_1 + E\!\left[\frac{(x_1 - \bar{x})\, u_1}{\sum_i (x_i - \bar{x})^2}\right] + E\!\left[\frac{(x_2 - \bar{x})\, u_2}{\sum_i (x_i - \bar{x})^2}\right] + \ldots + E\!\left[\frac{(x_n - \bar{x})\, u_n}{\sum_i (x_i - \bar{x})^2}\right]$$
Showing $\hat{\beta}_1$ is Unbiased

Now we need Gauss-Markov assumption 4, that the expected value of the error term is zero: $E(u \mid x) = 0$.

Then all terms after $\beta_1$ are equal to 0. This reduces to:

$$E(\hat{\beta}_1) = \beta_1 + 0 + 0 + \ldots + 0$$

$$E(\hat{\beta}_1) = \beta_1$$

Two assumptions were needed to get this result:
1. The x's are fixed (measured without error)
2. The expected value of the error is zero
Showing $\hat{\beta}_0$ is Unbiased

Begin with the equation for $\hat{\beta}_0$:

$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$

Since $y_i = \beta_0 + \beta_1 x_i + u_i$, then $\bar{y} = \beta_0 + \beta_1 \bar{x} + \bar{u}$.

Substituting for the mean of y ($\bar{y}$) gives:

$$\hat{\beta}_0 = \beta_0 + \beta_1 \bar{x} + \bar{u} - \hat{\beta}_1 \bar{x}$$

Taking expectations, $E(\hat{\beta}_1) = \beta_1$ and $E(\bar{u}) = 0$, so $E(\hat{\beta}_0) = \beta_0$.
Notice the Assumptions

Two key assumptions were needed to show that $\hat{\beta}_0$ and $\hat{\beta}_1$ are unbiased:
- x is fixed (meaning, it is measured without error)
- E(u) = 0

Unbiasedness tells us that OLS will give us a best guess at the slope and intercept that is correct on average.
OK, but is it BLUE?

Now we have an estimator ($\hat{\beta}_1$). We know that $\hat{\beta}_1$ is unbiased. We can calculate the variance of $\hat{\beta}_1$ across samples. But is $\hat{\beta}_1$ the Best Linear Unbiased Estimator?
The Variance of the Estimator and Hypothesis Testing
The Variance of the Estimator and Hypothesis Testing

We have derived an estimator for the slope of a line through data: $\hat{\beta}_1$. We have shown that $\hat{\beta}_1$ is an unbiased estimator of the true relationship $\beta_1$:
- We must assume x is measured without error
- We must assume the expected value of the error term is zero
Variance of $\hat{\beta}_0$ and $\hat{\beta}_1$

Even if $\hat{\beta}_0$ and $\hat{\beta}_1$ are right on average, we still want to know how far off they might be in a given sample. Hypotheses are actually about $\beta_1$, not $\hat{\beta}_1$. Thus we need to know the variance of $\hat{\beta}_0$ and $\hat{\beta}_1$, and use probability theory to draw conclusions about $\beta_1$ given our estimate $\hat{\beta}_1$.
Variances of $\hat{\beta}_0$ and $\hat{\beta}_1$

Conceptually, the variances of $\hat{\beta}_0$ and $\hat{\beta}_1$ are the expected (squared) distances from their individual values to their mean values:

$$Var(\hat{\beta}_0) = E[(\hat{\beta}_0 - E(\hat{\beta}_0))^2]$$

$$Var(\hat{\beta}_1) = E[(\hat{\beta}_1 - E(\hat{\beta}_1))^2]$$

We can solve these based on our proof of unbiasedness. Recall from above:

$$\hat{\beta}_1 = \beta_1 + \frac{(x_1 - \bar{x})\, u_1}{\sum_i (x_i - \bar{x})^2} + \frac{(x_2 - \bar{x})\, u_2}{\sum_i (x_i - \bar{x})^2} + \ldots + \frac{(x_n - \bar{x})\, u_n}{\sum_i (x_i - \bar{x})^2}$$
The Variance of $\hat{\beta}_1$

If a random variable ($\hat{\beta}_1$) is a linear combination of other independently distributed random variables (the u's), then the variance of $\hat{\beta}_1$ is the sum of the variances of the u's (each weighted by its squared constant). Note the assumption of independent observations. Applying this principle to the previous equation yields:
The Variance of $\hat{\beta}_1$

$$Var(\hat{\beta}_1) = \frac{(x_1 - \bar{x})^2 \sigma_{u_1}^2}{[\sum_i (x_i - \bar{x})^2]^2} + \frac{(x_2 - \bar{x})^2 \sigma_{u_2}^2}{[\sum_i (x_i - \bar{x})^2]^2} + \ldots + \frac{(x_n - \bar{x})^2 \sigma_{u_n}^2}{[\sum_i (x_i - \bar{x})^2]^2}$$

Now we need another Gauss-Markov assumption, assumption 5:

$$Var(u \mid x) = \sigma^2$$

That is, we must assume that the variance of the errors is constant: $\sigma_{u_1}^2 = \sigma_{u_2}^2 = \ldots = \sigma_{u_n}^2 = \sigma_u^2$. This yields:

$$Var(\hat{\beta}_1) = \frac{\sigma_u^2 \sum_i (x_i - \bar{x})^2}{[\sum_i (x_i - \bar{x})^2]^2}$$
The Variance of $\hat{\beta}_1$!

Or:

$$Var(\hat{\beta}_1) = \frac{\sigma_u^2}{\sum_i (x_i - \bar{x})^2}$$

That is, the variance of $\hat{\beta}_1$ is a function of the variance of the errors ($\sigma_u^2$) and the variation of x. But what is the true variance of the errors?
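This variance formula can be checked by simulation. An illustrative Python sketch (hypothetical values throughout): hold the x's fixed across repeated samples, as the derivation assumes, and compare the empirical variance of the slope estimates to $\sigma_u^2 / \sum_i (x_i - \bar{x})^2$.

```python
import random

random.seed(1)
SIGMA_U = 2.0  # hypothetical true error s.d., so sigma_u^2 = 4

# Fixed x values: the x's are "fixed in repeated samples".
x = [float(i) for i in range(1, 21)]
n = len(x)
xbar = sum(x) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
theoretical = SIGMA_U ** 2 / sxx  # Var(b1_hat) = sigma_u^2 / sum((xi - xbar)^2)

def ols_slope(y):
    ybar = sum(y) / n
    return sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx

# Draw 5,000 samples, re-estimating the slope each time.
ests = []
for _ in range(5000):
    y = [0.5 + 1.5 * xi + random.gauss(0, SIGMA_U) for xi in x]
    ests.append(ols_slope(y))
m = sum(ests) / len(ests)
empirical = sum((e - m) ** 2 for e in ests) / len(ests)

print(theoretical, empirical)  # the two should be close
```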
The Estimated Variance of $\hat{\beta}_1$

We do not observe the $u_i$ (and hence $\sigma_u^2$) because we don't observe $\beta_0$ and $\beta_1$. $\hat{\beta}_0$ and $\hat{\beta}_1$ are unbiased, so we use the variance of the observed residuals as an estimator of the variance of the true errors. We lose 2 degrees of freedom by substituting in the estimators $\hat{\beta}_0$ and $\hat{\beta}_1$.
The Estimated Variance of $\hat{\beta}_1$

Thus:

$$\hat{\sigma}_u^2 = \frac{\sum_i \hat{u}_i^2}{n - 2}$$

This is an unbiased estimator of $\sigma_u^2$. Thus the final equation for the estimated variance of $\hat{\beta}_1$ is:

$$\widehat{Var}(\hat{\beta}_1) = \frac{\hat{\sigma}_u^2}{\sum_i (x_i - \bar{x})^2}$$

New assumptions: independent observations and constant error variance.
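Putting the two formulas together gives the standard error reported in regression output. A minimal Python sketch (illustrative, with made-up data; the course's own examples use Stata):

```python
def ols_with_se(x, y):
    # Returns (b0, b1, se_b1) using:
    #   sigma_hat^2 = sum(u_hat^2) / (n - 2)   (2 df lost to b0, b1)
    #   Var_hat(b1) = sigma_hat^2 / sum((xi - xbar)^2)
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b0 = ybar - b1 * xbar
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sigma2_hat = sum(u ** 2 for u in resid) / (n - 2)
    se_b1 = (sigma2_hat / sxx) ** 0.5
    return b0, b1, se_b1

# Hypothetical data for illustration
x = [1, 2, 3, 4, 5, 6]
y = [1.2, 2.9, 4.1, 6.3, 7.2, 9.4]
b0, b1, se = ols_with_se(x, y)
print(b1, se)
```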
The Estimated Variance of $\hat{\beta}_1$

$$\widehat{Var}(\hat{\beta}_1) = \frac{\hat{\sigma}_u^2}{\sum_i (x_i - \bar{x})^2}$$

This has nice intuitive qualities:
- As the size of the errors decreases, $\widehat{Var}(\hat{\beta}_1)$ decreases: the line fits tightly through the data; few other lines could fit as well.
- As the variation in x increases, $\widehat{Var}(\hat{\beta}_1)$ decreases: few lines will fit without large errors for extreme values of x.
The Estimated Variance of $\hat{\beta}_1$

$$\hat{\sigma}_u^2 = \frac{\sum_i \hat{u}_i^2}{n - 2} \qquad \widehat{Var}(\hat{\beta}_1) = \frac{\hat{\sigma}_u^2}{\sum_i (x_i - \bar{x})^2}$$

Because the variance of the estimated errors has n in the denominator, as n increases, the variance of $\hat{\beta}_1$ decreases:
- The more data points we must fit to the line, the smaller the number of lines that fit with few errors
- We have more information about where the line must go
The Variance of $\hat{\beta}_1$ is Important for Hypothesis Testing

- F-test: tests the hypothesis that the null model does better
- Log-likelihood test: tests the joint significance of variables in an MLE model
- t-test: tests that individual coefficients are not zero. This is the central task for testing most policy theories.
T-Tests

In general, our theories give us hypotheses about the sign of the coefficients, e.g. that $\beta_1 > 0$ or $\beta_1 < 0$.
Z-Scores & Hypothesis Tests

We know that $\hat{\beta}_1 \sim N(\beta_1, \sigma_{\hat{\beta}_1}^2)$.

Subtracting $\beta_1$ from both sides, we can see that $(\hat{\beta}_1 - \beta_1) \sim N(0, \sigma_{\hat{\beta}_1}^2)$.

Then, if we divide by the standard deviation, we can see that:

$$\frac{\hat{\beta}_1 - \beta_1}{\sigma_{\hat{\beta}_1}} \sim N(0, 1)$$

To test the null hypothesis that $\beta_1 = 0$, we can see that:

$$\frac{\hat{\beta}_1}{\sigma_{\hat{\beta}_1}} \sim N(0, 1)$$
Z-Scores & Hypothesis Tests

This variable is a z-score based on the standard normal distribution: 95% of cases are within 1.96 standard deviations of the mean. If $\hat{\beta}_1 / \sigma_{\hat{\beta}_1} > 1.96$, then in a series of random draws there is a 95% chance that $\beta_1 > 0$.

The problem is that we don't know $\sigma_{\hat{\beta}_1}$.
The t-statistic

This statistic is called Student's t, and the t-distribution looks similar to a normal distribution. Thus, for bivariate regression:

$$\frac{\hat{\beta}_1}{\hat{\sigma}_{\hat{\beta}_1}} \sim t_{(n-2)}$$

More generally, $\hat{\beta}_1 / \hat{\sigma}_{\hat{\beta}_1} \sim t_{(n-k)}$, where k is the number of parameters estimated.
The t-statistic

Note the addition of a degrees-of-freedom constraint. Thus the more data points we have relative to the number of parameters we are trying to estimate, the more the t distribution looks like the z distribution. When n > 100 the difference is negligible.
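The test itself is then just the ratio of the coefficient to its standard error. A Python sketch with hypothetical summary numbers (the coefficient and standard error here are made up; the 1.96 cutoff is the large-sample normal approximation discussed above, which is reasonable since n > 100):

```python
# Hypothetical regression summary numbers (illustrative, not from the slides):
b1_hat = 0.42   # estimated slope
se_b1 = 0.15    # estimated standard error of the slope
n, k = 148, 2   # observations, parameters estimated (intercept + slope)

t_stat = b1_hat / se_b1   # distributed t(n - k) under H0: beta1 = 0
df = n - k

# With n > 100 the t distribution is close to normal, so 1.96 is a
# reasonable two-sided 5% critical value here.
critical = 1.96
print(t_stat, df, abs(t_stat) > critical)
```

With few degrees of freedom, the exact t critical value is larger than 1.96 and should be used instead.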
Limited Information in Statistical Significance Tests

- Results are often illustrative rather than precise
- A significance test only tests the "not zero" hypothesis; it does not measure the importance of the variable (look at the confidence interval)
- Generally reflects confidence that results are robust across multiple samples
For Example: Presidential Approval and the CPI

. reg approval cpi
Source | SS df MS Number of obs = 148
---------+------------------------------ F( 1, 146) = 9.76
Model | 1719.69082 1 1719.69082 Prob > F = 0.0022
Residual | 25731.4061 146 176.242507 R-squared = 0.0626
---------+------------------------------ Adj R-squared = 0.0562
Total | 27451.0969 147 186.742156 Root MSE = 13.276
------------------------------------------------------------------------------
approval | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
cpi | -.1348399 .0431667 -3.124 0.002 -.2201522 -.0495277
_cons | 60.95396 2.283144 26.697 0.000 56.44168 65.46624
------------------------------------------------------------------------------
. sum cpi
Variable | Obs Mean Std. Dev. Min Max
---------+-----------------------------------------------------
cpi | 148 46.45878 25.36577 23.5 109
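The pieces of this output hang together. A quick Python check, using only the printed coefficient and standard error: the t statistic is coef/se, and the 95% confidence interval is coef ± t·se, where the t(146) critical value is about 1.976 (close to the normal 1.96 since n > 100).

```python
# Numbers as printed in the Stata output above
coef, se = -0.1348399, 0.0431667

# t statistic is coefficient / standard error
t = coef / se
print(t)  # about -3.124, as reported

# 95% CI: coef +/- t_crit * se, with t_crit for 146 df roughly 1.976
t_crit = 1.976
lo, hi = coef - t_crit * se, coef + t_crit * se
print(lo, hi)  # close to the printed -.2201522 and -.0495277
```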
So the distribution of $\hat{\beta}_1$ is:

[Figure: histogram (fraction) of simulated cpi parameters, centered near −.135 and ranging from about −.3 to .1]
Now Let's Look at Approval and the Unemployment Rate

. reg approval unemrate
Source | SS df MS Number of obs = 148
---------+------------------------------ F( 1, 146) = 0.85
Model | 159.716707 1 159.716707 Prob > F = 0.3568
Residual | 27291.3802 146 186.927262 R-squared = 0.0058
---------+------------------------------ Adj R-squared = -0.0010
Total | 27451.0969 147 186.742156 Root MSE = 13.672
------------------------------------------------------------------------------
approval | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
unemrate | -.5973806 .6462674 -0.924 0.357 -1.874628 .6798672
_cons | 58.05901 3.814606 15.220 0.000 50.52003 65.59799
------------------------------------------------------------------------------
. sum unemrate
Variable | Obs Mean Std. Dev. Min Max
---------+-----------------------------------------------------
unemrate | 148 5.640541 1.744879 2.6 10.7
Now the distribution of $\hat{\beta}_1$ is:

[Figure: histogram (fraction) of simulated unemrate parameters, centered near −.597 and ranging from about −3 to 3]