1. Sampling Distribution of the OLS...

ECON 482 / WH Hong Multiple Regression Analysis: Inference

1

1. Sampling Distribution of the OLS Estimators

Statistical inference in the regression model

Hypothesis tests about population parameters

Construction of confidence intervals

Sampling distribution of the OLS estimators

The OLS estimators are random variables

We already know their expected values and their variances

However, for hypothesis tests we need to know their distribution.

In order to derive their distribution we need additional assumptions

Assumption about distribution of errors: Normal distribution => Assumption MRL. 6

ECON 482 / WH Hong Multiple Regression Analysis: Inference Additional Assumption

MLR. 6 Normality of error terms

The population error i is independent of the explanatory variables iku 1 2, ,...,i ix x x and is

normally distributed with zero mean and variance 2σ . That is,

( )2~ 0,iu N σ independently of ik1 2, ,...,i ix x x

Terminology

MLR.1 - MLR. 5: Gauss-Markov assumptions

MLR.1 - MLR. 6: Gauss-Markov assumptions + Normality

= Classical Linear Model (CLM) assumptions

Under CLM assumption, we can summarize the population assumptions as:

( )20 1 1~ ... ,k ky Normal x xβ β β σ+ + +x

2

ECON 482 / WH Hong Multiple Regression Analysis: Inference Graphically,

Note that Assumption MRL. 6 is much stronger than any of our previous assumptions. In fact, if

we make Assumption MRL. 6, we are necessarily assuming MLR. 4 (Zero conditional mean) and

MLR. 5 (Homoskedasticity).

3

ECON 482 / WH Hong Multiple Regression Analysis: Inference Discussion of the normality assumption

• The error term is the sum of "many" different unobserved factors

• Sums of independent factors are normally distributed by the Central Limit Theorem (CLT).

• Problem

CLT only works when all unobserved factors affect y in a separate additive fashion.

But nothing guarantees that this is so. => may be a complicated functional form

Possibly very heterogeneous distributions of individual factors

How independent are the different factors?

• The normality of the error term is an empirical question. In many cases, normality is

questionable or impossible by definition

• Examples where normality cannot hold:

Wage (nonnegative); Unemployment (indicator variable, takes on only 1 or 0), etc

• In some cases, normality can be achieved through transformations of the dependent variable

(e.g. use log(wage) instead of wage)

• Under normality, OLS is BLUE.

• Important: For the purpose of statistical inference, the assumption of normality can be

replaced by a large sample size.

4

ECON 482 / WH Hong Multiple Regression Analysis: Inference Normal sampling distributions

• Normality of the error term translates into normal sampling distributions of the OLS

estimators.

Theorem 4.1 Normal sampling distribution

Under the CLM assumption from MLR. 1 through MLR. 6,

( )( )ˆ ˆ~ ,varj jNormal jβ β β

where ( ) ( )2

2ˆvar

1jj jSST Rσβ =−

, ( )2

1

n

j j j and i

SST x x=

= −∑

2jR is the R-squared from the regression of jx on all other independent variables

Therefore, the standardized estimators follow:

( )

( ) ( )ˆ

~ 0, 1ˆ. .

j j

j

Normals d

β β

β

−

5

ECON 482 / WH Hong Multiple Regression Analysis: Inference Sketch of the proof.

• The proof uses a property of normality:

Any linear combination of normally-distributed random variables is also normally

distributed.

• Note that the OLS estimators are a linear combination of errors:

1

ˆn

j j w uβ β=

= +∑ ij ii

where Rˆ /ij ij jw r SS= , is the i-th residual from the regression of

the

ijr

jx on all other independent variables, and 2

1

ˆn

j iji

SSR r=

= ∑

• Therefore, ˆjβ is also normally-distributed.

Note that any linear combination of 1ˆ,..., kβ β is also normally distributed since each jβ is

normally distributed.

6


2. Testing Hypotheses about a Single Population Parameter: The t Test (1) Test statistics

Theorem 4.2 t-distribution for the standardized estimator

Under the CLM assumptions MLR. 1 through MLR. 6,

( ) 1

ˆ~

ˆ. .j j

n k df

j

t ts d

β β

β− −

−=

where is the number of unknown parameters, and 1k + 1n k− − is the degrees of freedom (df).

Note: The t-distribution is close to the standard normal distribution if 1n k− − is large

7

ECON 482 / WH Hong Multiple Regression Analysis: Inference Constructing the test statistics and hypothesis testing

0 : 0jH β =• Null hypothesis:

The population parameter is equal to zero, i.e., after controlling for the other independent

variables, jx have no effect on the expected value of . y

• Under this null hypothesis, we can construct the t-statistics (or t-ratio) as:

( )ˆ

ˆ

ˆ. .j

j

j

ts dβ

β

β=

The farther the estimated coefficient is away from zero, the less likely it is that the null

hypothesis hold true.

• Distribution of the t-statistics if the null hypothesis is true

( ) ( )ˆ 1

ˆ ˆ~

ˆ ˆ. . . .j

j j jn k

j j

t ts d s dβ

β β β

β β− −

−= =

• Goal: Define a rejection rule so that, if it is true, 0H is rejected only with a small

probability (= significance level, e.g. 5%)

8

ECON 482 / WH Hong Multiple Regression Analysis: Inference (2) Testing against one-sided alternatives

• Hypothesis (greater than zero)

0 : 0jH β = 1 against : 0jH β >

• Construct the t-statistics under the null hypothesis

• Decide a rejection rule, i.e., define the significance level (10%, 5%, or 1%). And construct

the critical value, . c

• The null is rejected if ; e.g. ˆ 0.05 1,0.05j

n kt c tβ − −> = 0.05 28,0.05 1.701c t= =

• Graphically,

9

ECON 482 / WH Hong Multiple Regression Analysis: Inference (Example) Wage equation

• Estimated equation

( )log 0.284 0.092 0.0041 0.022(0.104) (0.007) (0.0017) (0.003)

wage educ exper tenure= + + +

526n = 2 0.316, R =

• Test whether, after controlling for education and tenure, higher work experience leads to

higher hourly wages

, and standard errors are in the parentheses

• Hypothesis: 00 : experH β = vs. 1 : 0experH β >

• t-statistics: ˆ0.0041 2.410.0017exper

tβ

= ≈

• Critical values:

1 526 3 1 522df n k= − − = − − =

At the 5 % significance level, 0.05 522,0.05 1.645c t= = ;

At the 1 % significance level, 0.01 522,0.01 2.326c t= =

• Since ˆ 0.05exper

t cβ

> , the effect of experience on hourly wage is statistically greater than zero at

the 5 % (even at the 1%) significance level.

10

ECON 482 / WH Hong Multiple Regression Analysis: Inference Similarly, we can test the following hypothesis (less than zero)

0 : 0jH β = against 1 : 0jH β <




• The null is rejected if ˆ 0.05 1,0.05j

n kt c tβ − −< = ; e.g. 0.05 18,0.05 1.734c t= = −

• Graphically,

11

ECON 482 / WH Hong Multiple Regression Analysis: Inference (3) Testing against two-sided alternatives

• Hypothesis

0 : 0jH β = 1 against : 0jH β ≠




• The null is rejected if ˆ 0.05 1,0.025j

n kt c tβ − −> = ; e.g. 0.05 25,0.025 2.06c t= =

• Graphically,

12

ECON 482 / WH Hong Multiple Regression Analysis: Inference (Example) Determinants of college GPA


1.39 0.412 0.015 0.083(0.33) (0.094) (0.011) (0.026)

collGPA hsGPS ACT skipped= + + −

skipped : lectures missed per week

141n = , 2 0.234R =

• 0.014.38 2.58hsGPAt c= > =

0.101.36 1.645ACTt c= < =

0.013.19 2.58skippedt c= − > =

• The effects of hsGPA and skipped are significantly different from zero at the 1 %

significance level. The effect of ACT is not significantly different from zero, not even at the

10 % significance level.

13

ECON 482 / WH Hong Multiple Regression Analysis: Inference "Statistically significant" variables in a regression

If a regression coefficient is different from zero in a two-sided test, the corresponding

variable is said to be "statistically significant"

If the number of degrees of freedom is large enough so that the normal approximation

applies, the following rules of thumb apply:

1.645t ratio− > => "statistically significant at the 10 % level"



14

ECON 482 / WH Hong Multiple Regression Analysis: Inference (4) Testing other hypothesis: general case

• Hypothesis

0 : j jH β α= against 1 : j jH β α≠

where jα is a hypothesized value of the coefficient

• t-statistics

( ) ( )( )

ˆ

ˆ. .

j j

j

estimate hypothesized valuet

standard error s d

β α

β

−−= =

• critical value: 0.05 1,0.025n kc t − −=

• Reject the null if 0.05t c>

• The test works exactly as before, except that the hypothesized value is subtracted from the

estimate when forming the statistic

15

ECON 482 / WH Hong Multiple Regression Analysis: Inference (Example) Campus crime and enrollment

• An interesting hypothesis is whether crime increases by one percent if enrollment is

increased by one percent


( ) ( )log 6.63 1.27 log(1.03) (0.11)

crime enroll= − +

97n = 2, 0.585R =

• Hypothesis: ( )0 log: 1enrollH β = vs. ( )1 log: 1enrollH β ≠

• ( )0.05

1.27 12.45 1.96

0.11t c

−= ≈ > =

16

ECON 482 / WH Hong Multiple Regression Analysis: Inference (5) Computing p-values for t-test

• If the significance level is made smaller and smaller, there will be a point where the null

hypothesis cannot be rejected anymore

• The smallest significance level at which the null hypothesis is rejected, is called the p-value

of the hypothesis test

• p-value= ( )P T t ratio> − where is a t distributed random variable. T

• A small p-value is evidence against the null hypothesis.

• For example, when the t-statistic is 1.85 with df = 40, we can calculate the p-value as:

( ) ( )1.85 2 0.359 0.0718P T > = = (need to use a computer program.)

17

ECON 482 / WH Hong Multiple Regression Analysis: Inference (6) Confidence intervals

• Simple manipulation of the result in Theorem 4.2 implies that

( ) ( )( )0.05 0.05ˆ ˆ ˆ ˆ. . . . 0.95j j j j jP c s d c s dβ β β β β− ⋅ ≤ ≤ + ⋅ =

where 0.05 is the critical value of the sided test c

• Interpretation of the confidence interval

The bound of the interval are random

In repeated samples, the interval that is constructed in the above way will cover the

population regression coefficient in 95% of the cases

• Confidence intervals for typical confidence levels




As a rule of thumb, we can use 0.01 2.576c = , and 0.05 1.96c = 0.10 1.645c =

18

ECON 482 / WH Hong Multiple Regression Analysis: Inference (Example) Model of firms' R&D expenditure


( ) ( )log 4.38 1.084 log 0.0218(0.47) (0.060) (0.0217)

rd sales profmarg= − + +

32n = 2 0., 918R = , : Profits as percentage of sales profmarg

• Critical value: 0.05 29,0.025 2.045c t= =

( )log sales• C.I. of : ( ) [ ]1.084 2.045 0.060 0.961, 1.21± =

The effect of sales on R&D is relatively precisely estimated as the interval is narrow.

Moreover, the effect is significantly different from zero because zero is outside the

interval

19

• C.I. of : profmarg ( ) [ ]0.0217 2.045 0.0218 0.0045, 0.0479± = −

The effect is imprecisely estimated as the interval is very wide

It is not even statistically significant because zero lies in the interval


3. Testing Hypotheses about a Single Linear Combination of the Parameters

(Example) Return to education at 2 year vs. at 4 year colleges

• ( ) 0 1 2 3log wage jc univ exper uβ β β β= + + + +

jc : years of education at 2 year colleges

univ : years of education at 4 year colleges

exper

• Test hypothesis:

: work experience

0 1 2 1 2: 0H orβ β β β= − = vs. 1 1 2: 0H β β− <

(i) A possible test statistic

• t-statistic: ( )1 2

1 2

ˆ ˆˆ ˆ. .

ts d

β ββ β−

=−

, Hence, reject the null if 0.05

• However, it is impossible with standard regression output because

1,0.05n kt c t − −< =

( ) ( ) ( ) ( ) ( )1 2 1 2 1 2 1 2ˆ ˆ ˆ ˆ ˆ ˆ ˆ ˆ. . var var var 2cov ,s d β β β β β β β β and − = − = + −

( )1 2ˆ ˆcov ,β β is unknown.

20

ECON 482 / WH Hong Multiple Regression Analysis: Inference (ii) Alternative method

• Define 21 1θ β= − β and test 00 1:H θ = against 1 1: 0H θ <

( ) ( )( )

0 1 2 2 3

0 1 2 3

log wage jc univ exper u

jc jc univ exper u

β θ β β β

β θ β β

= + + + + +

= + + + + +


( )log 1.472 0.0102 0.0769 0.0049(0.021) (0.0069) (0.0023) (0.0002)

wage jc totcoll exper= − + +

6,763n = , 2 0.222R = , totcoll jc univ= +

• 0.0102 1.480.0069

t −= = −

( )1.48 0.70p value P T− = < − =

C.I. ( ) [ ]0.0102 1.96 0.0069 0.0237, 0.0003− ± = −

=> The null hypothesis is rejected at the 10% level but not at the 5% level

• This method works always for single linear hypothesis

21


4. Testing Multiple Linear Restrictions: The F test (1) Testing exclusion restriction

(Example) MLB players' salaries

• ( ) 0 1 2 3 4 5log salary years gamesyr bavg hrunsyr rbisyr uβ β β β β β= + + + + + +

years : years in the league

gamesyr : average number of games per year

bavg : batting average

hrunsyr : home runs per year

rbisyr : runs batted in per year

• Hypothesis: 00 3 4 5: 0, 0,H β β β= = 1 0:= vs. H H is not true

test whether performance measures have no effect on salary / can be excluded from

regression.

The alternative hypothesis can be read as "at least one performance measure have an

effect on salary."

22

ECON 482 / WH Hong Multiple Regression Analysis: Inference (Example) MLB players' salaries (Cont'd)

• Estimation of the unrestricted model

( )log 11.19 0.0689 0.0126(0.29) (0.0121) (0.0026)0.00098 0.0144 0.0108

(0.00110) (0.0161) (0.0072)

salary years gamesyr

bavg hrunsyr rbisyr

= + +

+ + +

353n = 183.186urSSR = 2 0.6278urR =

None of the performance measures is statistically significant when tested individually

using t-statistic.

, ,

• Estimation of the restricted model ( 3 4 50, 0, 0β β β= = = )

( )log 11.22 0.0713 0.0202(0.11) (0.0125) (0.0013)

salary years gamesyr= + +

353n = 198.311rSSR = 2 0.5971rR =

The sum of squared residuals necessarily increases because possible relevant variables

are dropped in the restricted regression. => Is the increase statistically significant?

, ,

23


• Test statistic

( )( ) , 1

/~

/ 1r ur

q n kur

SSR SSR qF F

SSR n k − −

−=

− −

# rq numerator df of restictions df df= = = ur− and 1 urn k denominator df df− − = =

No derivation, but the definition of an F distributed random variable is 2

21

q

n k

Fχ

χ − −

= .

• Rejection rule: Reject the null if 0.05 , 1,0.05q n kF c F − −> =

A F-distributed variable only takes on positive values.

24


• Test decision

( )( )

198.311 183.186 / 39.55

183.186 / 353 5 1F

−= ≈

− − and 0.05 3,347,0.01 3.78c F= =

( )9.55 0.000p value P F− = > =

Thus, the null hypothesis is overwhelmingly rejected (even very small significance level).

• Discussion

They were not significant when tested individually

But, the three variables are "jointly significant"

The likely reason is multicollinearity between them

25

ECON 482 / WH Hong Multiple Regression Analysis: Inference • The R-squared form of the F statistic

( )( )

( )( ) ( )

2 2

2

/// 1 1 / 1

ur rr ur

ur ur

R R qSSR SSR qF

SSR n k R n k

−−= =

− − − − −

The proof can be easily done by using ) and ( 21r rSSR SST R= − ( )21ur urSSR SST R= −

Note that the R-squared form of the test is only valid for exclusion restriction.

Since , 2 0.5971rR = R2 0.6278ur =

( )( ) ( )

( )( )

2 2

2

/ 0.6278 0.5971 / 39.54

1 0.6278 / 3471 / 1ur r

ur

R R qF

R n k

− −= = ≈

−− − −

- The difference in the last decimal digit is due to rounding error

26

ECON 482 / WH Hong Multiple Regression Analysis: Inference (2) Test of overall significance of a regression

• Unrestricted model

0 1 1 2 2 ...i i i k iy x x x uikβ β β β= + + + + +

• Hypothesis: 00 1: ... kH β β= = = 1 1: vs. H H is not true

The null hypothesis states that the explanatory variables are not useful at all in explaining

the dependent variable

• Restricted model (regression on constant)

0i iy uβ= +

• Test statistic

( )( ) ( ) ( )

2

, 12

/ / ~/ 1 1 / 1

r ur urk n k

ur ur

SSR SSR q R qF FSSR n k R n k − −

−= =

− − − − −

Note that . 2 0rR =

27

ECON 482 / WH Hong Multiple Regression Analysis: Inference (3) Testing general linear restrictions with the F-test

(Example) Test whether house price assessments are rational

• Estimation model:

( ) ( ) ( ) ( )0 1 2 3 4log log log logprice assess lotsize sqrft bdrms uβ β β β β= + + + + +

price : actual house price : the assessed housing value (before sold) assess

lotsize : size of lots (in feet) sqrft : square feet

bdrms : number of bedrooms

28

• Hypothesis: 00 1 2 3 4: 1, 0, 0,H β β β β= = = 1 0:= vs. H H is not tru

If house price assessments are rational, a 1% change in the assessment should be

associated with a 1% change in price.

e

In addition, other known factors should not influence the price once the assessed value

has been controlled for.


29

• Note that the R-squared form of the test statistic CANNOT be applied here.

(Example) Test whether house price assessments are rational (cont'd)

• Test statistics

( )( )

( )( )

/ 1.880 1.822 / 40.661

/ 1 1.822 / 88 4 1r ur

ur

SSR SSR qF

SSR n k− −

= = ≈− − − −

• Unrestricted regression

0 1 1 2 2 3 3 4 4y x x x x uβ β β β β= + + + + + , 1.822urSSR =

• Restricted regression

0 1y x uβ= + + => [ ]1 0y x uβ− = +

Regression of

, 1.880rSSR =

4,83,0.05 0.05~ 2.50

0H cannot be rejected since 0.050.661 2.50F c= < =

[ ]1y x− on a constant

F F c= =

1. Sampling Distribution of the OLS...

Documents

Transcript of 1. Sampling Distribution of the OLS...