1. Sampling Distribution of the OLS...
Transcript of 1. Sampling Distribution of the OLS...
ECON 482 / WH Hong Multiple Regression Analysis: Inference
1
1. Sampling Distribution of the OLS Estimators
Statistical inference in the regression model
Hypothesis tests about population parameters
Construction of confidence intervals
Sampling distribution of the OLS estimators
The OLS estimators are random variables
We already know their expected values and their variances
However, for hypothesis tests we need to know their distribution.
In order to derive their distribution we need additional assumptions
Assumption about distribution of errors: Normal distribution => Assumption MRL. 6
ECON 482 / WH Hong Multiple Regression Analysis: Inference Additional Assumption
MLR. 6 Normality of error terms
The population error i is independent of the explanatory variables iku 1 2, ,...,i ix x x and is
normally distributed with zero mean and variance 2σ . That is,
( )2~ 0,iu N σ independently of ik1 2, ,...,i ix x x
Terminology
MLR.1 - MLR. 5: Gauss-Markov assumptions
MLR.1 - MLR. 6: Gauss-Markov assumptions + Normality
= Classical Linear Model (CLM) assumptions
Under CLM assumption, we can summarize the population assumptions as:
( )20 1 1~ ... ,k ky Normal x xβ β β σ+ + +x
2
ECON 482 / WH Hong Multiple Regression Analysis: Inference Graphically,
Note that Assumption MRL. 6 is much stronger than any of our previous assumptions. In fact, if
we make Assumption MRL. 6, we are necessarily assuming MLR. 4 (Zero conditional mean) and
MLR. 5 (Homoskedasticity).
3
ECON 482 / WH Hong Multiple Regression Analysis: Inference Discussion of the normality assumption
• The error term is the sum of "many" different unobserved factors
• Sums of independent factors are normally distributed by the Central Limit Theorem (CLT).
• Problem
CLT only works when all unobserved factors affect y in a separate additive fashion.
But nothing guarantees that this is so. => may be a complicated functional form
Possibly very heterogeneous distributions of individual factors
How independent are the different factors?
• The normality of the error term is an empirical question. In many cases, normality is
questionable or impossible by definition
• Examples where normality cannot hold:
Wage (nonnegative); Unemployment (indicator variable, takes on only 1 or 0), etc
• In some cases, normality can be achieved through transformations of the dependent variable
(e.g. use log(wage) instead of wage)
• Under normality, OLS is BLUE.
• Important: For the purpose of statistical inference, the assumption of normality can be
replaced by a large sample size.
4
ECON 482 / WH Hong Multiple Regression Analysis: Inference Normal sampling distributions
• Normality of the error term translates into normal sampling distributions of the OLS
estimators.
Theorem 4.1 Normal sampling distribution
Under the CLM assumption from MLR. 1 through MLR. 6,
( )( )ˆ ˆ~ ,varj jNormal jβ β β
where ( ) ( )2
2ˆvar
1jj jSST Rσβ =−
, ( )2
1
n
j j j and i
SST x x=
= −∑
2jR is the R-squared from the regression of jx on all other independent variables
Therefore, the standardized estimators follow:
( )
( ) ( )ˆ
~ 0, 1ˆ. .
j j
j
Normals d
β β
β
−
5
ECON 482 / WH Hong Multiple Regression Analysis: Inference Sketch of the proof.
• The proof uses a property of normality:
Any linear combination of normally-distributed random variables is also normally
distributed.
• Note that the OLS estimators are a linear combination of errors:
1
ˆn
j j w uβ β=
= +∑ ij ii
where Rˆ /ij ij jw r SS= , is the i-th residual from the regression of
the
ijr
jx on all other independent variables, and 2
1
ˆn
j iji
SSR r=
= ∑
• Therefore, ˆjβ is also normally-distributed.
Note that any linear combination of 1ˆ,..., kβ β is also normally distributed since each jβ is
normally distributed.
6
ECON 482 / WH Hong Multiple Regression Analysis: Inference
2. Testing Hypotheses about a Single Population Parameter: The t Test (1) Test statistics
Theorem 4.2 t-distribution for the standardized estimator
Under the CLM assumptions MLR. 1 through MLR. 6,
( ) 1
ˆ~
ˆ. .j j
n k df
j
t ts d
β β
β− −
−=
where is the number of unknown parameters, and 1k + 1n k− − is the degrees of freedom (df).
Note: The t-distribution is close to the standard normal distribution if 1n k− − is large
7
ECON 482 / WH Hong Multiple Regression Analysis: Inference Constructing the test statistics and hypothesis testing
0 : 0jH β =• Null hypothesis:
The population parameter is equal to zero, i.e., after controlling for the other independent
variables, jx have no effect on the expected value of . y
• Under this null hypothesis, we can construct the t-statistics (or t-ratio) as:
( )ˆ
ˆ
ˆ. .j
j
j
ts dβ
β
β=
The farther the estimated coefficient is away from zero, the less likely it is that the null
hypothesis hold true.
• Distribution of the t-statistics if the null hypothesis is true
( ) ( )ˆ 1
ˆ ˆ~
ˆ ˆ. . . .j
j j jn k
j j
t ts d s dβ
β β β
β β− −
−= =
• Goal: Define a rejection rule so that, if it is true, 0H is rejected only with a small
probability (= significance level, e.g. 5%)
8
ECON 482 / WH Hong Multiple Regression Analysis: Inference (2) Testing against one-sided alternatives
• Hypothesis (greater than zero)
0 : 0jH β = 1 against : 0jH β >
• Construct the t-statistics under the null hypothesis
• Decide a rejection rule, i.e., define the significance level (10%, 5%, or 1%). And construct
the critical value, . c
• The null is rejected if ; e.g. ˆ 0.05 1,0.05j
n kt c tβ − −> = 0.05 28,0.05 1.701c t= =
• Graphically,
9
ECON 482 / WH Hong Multiple Regression Analysis: Inference (Example) Wage equation
• Estimated equation
( )log 0.284 0.092 0.0041 0.022(0.104) (0.007) (0.0017) (0.003)
wage educ exper tenure= + + +
526n = 2 0.316, R =
• Test whether, after controlling for education and tenure, higher work experience leads to
higher hourly wages
, and standard errors are in the parentheses
• Hypothesis: 00 : experH β = vs. 1 : 0experH β >
• t-statistics: ˆ0.0041 2.410.0017exper
tβ
= ≈
• Critical values:
1 526 3 1 522df n k= − − = − − =
At the 5 % significance level, 0.05 522,0.05 1.645c t= = ;
At the 1 % significance level, 0.01 522,0.01 2.326c t= =
• Since ˆ 0.05exper
t cβ
> , the effect of experience on hourly wage is statistically greater than zero at
the 5 % (even at the 1%) significance level.
10
ECON 482 / WH Hong Multiple Regression Analysis: Inference Similarly, we can test the following hypothesis (less than zero)
0 : 0jH β = against 1 : 0jH β <
• Construct the t-statistics under the null hypothesis
• Decide a rejection rule, i.e., define the significance level (10%, 5%, or 1%). And construct
the critical value, . c
• The null is rejected if ˆ 0.05 1,0.05j
n kt c tβ − −< = ; e.g. 0.05 18,0.05 1.734c t= = −
• Graphically,
11
ECON 482 / WH Hong Multiple Regression Analysis: Inference (3) Testing against two-sided alternatives
• Hypothesis
0 : 0jH β = 1 against : 0jH β ≠
• Construct the t-statistics under the null hypothesis
• Decide a rejection rule, i.e., define the significance level (10%, 5%, or 1%). And construct
the critical value, . c
• The null is rejected if ˆ 0.05 1,0.025j
n kt c tβ − −> = ; e.g. 0.05 25,0.025 2.06c t= =
• Graphically,
12
ECON 482 / WH Hong Multiple Regression Analysis: Inference (Example) Determinants of college GPA
• Estimated equation
1.39 0.412 0.015 0.083(0.33) (0.094) (0.011) (0.026)
collGPA hsGPS ACT skipped= + + −
skipped : lectures missed per week
141n = , 2 0.234R =
• 0.014.38 2.58hsGPAt c= > =
0.101.36 1.645ACTt c= < =
0.013.19 2.58skippedt c= − > =
• The effects of hsGPA and skipped are significantly different from zero at the 1 %
significance level. The effect of ACT is not significantly different from zero, not even at the
10 % significance level.
13
ECON 482 / WH Hong Multiple Regression Analysis: Inference "Statistically significant" variables in a regression
If a regression coefficient is different from zero in a two-sided test, the corresponding
variable is said to be "statistically significant"
If the number of degrees of freedom is large enough so that the normal approximation
applies, the following rules of thumb apply:
1.645t ratio− > => "statistically significant at the 10 % level"
1.96t ratio− > => "statistically significant at the 5 % level"
2.576t ratio− > => "statistically significant at the 1 % level"
14
ECON 482 / WH Hong Multiple Regression Analysis: Inference (4) Testing other hypothesis: general case
• Hypothesis
0 : j jH β α= against 1 : j jH β α≠
where jα is a hypothesized value of the coefficient
• t-statistics
( ) ( )( )
ˆ
ˆ. .
j j
j
estimate hypothesized valuet
standard error s d
β α
β
−−= =
• critical value: 0.05 1,0.025n kc t − −=
• Reject the null if 0.05t c>
• The test works exactly as before, except that the hypothesized value is subtracted from the
estimate when forming the statistic
15
ECON 482 / WH Hong Multiple Regression Analysis: Inference (Example) Campus crime and enrollment
• An interesting hypothesis is whether crime increases by one percent if enrollment is
increased by one percent
• Estimated equation
( ) ( )log 6.63 1.27 log(1.03) (0.11)
crime enroll= − +
97n = 2, 0.585R =
• Hypothesis: ( )0 log: 1enrollH β = vs. ( )1 log: 1enrollH β ≠
• ( )0.05
1.27 12.45 1.96
0.11t c
−= ≈ > =
16
ECON 482 / WH Hong Multiple Regression Analysis: Inference (5) Computing p-values for t-test
• If the significance level is made smaller and smaller, there will be a point where the null
hypothesis cannot be rejected anymore
• The smallest significance level at which the null hypothesis is rejected, is called the p-value
of the hypothesis test
• p-value= ( )P T t ratio> − where is a t distributed random variable. T
• A small p-value is evidence against the null hypothesis.
• For example, when the t-statistic is 1.85 with df = 40, we can calculate the p-value as:
( ) ( )1.85 2 0.359 0.0718P T > = = (need to use a computer program.)
17
ECON 482 / WH Hong Multiple Regression Analysis: Inference (6) Confidence intervals
• Simple manipulation of the result in Theorem 4.2 implies that
( ) ( )( )0.05 0.05ˆ ˆ ˆ ˆ. . . . 0.95j j j j jP c s d c s dβ β β β β− ⋅ ≤ ≤ + ⋅ =
where 0.05 is the critical value of the sided test c
• Interpretation of the confidence interval
The bound of the interval are random
In repeated samples, the interval that is constructed in the above way will cover the
population regression coefficient in 95% of the cases
• Confidence intervals for typical confidence levels
( ) ( )( )0.01 0.01ˆ ˆ ˆ ˆ. . . . 0.99j j j j jP c s d c s dβ β β β β− ⋅ ≤ ≤ + ⋅ =
( ) ( )( )0.05 0.05ˆ ˆ ˆ ˆ. . . . 0.95j j j j jP c s d c s dβ β β β β− ⋅ ≤ ≤ + ⋅ =
( ) ( )( )0.10 0.10ˆ ˆ ˆ ˆ. . . . 0.90j j j j jP c s d c s dβ β β β β− ⋅ ≤ ≤ + ⋅ =
As a rule of thumb, we can use 0.01 2.576c = , and 0.05 1.96c = 0.10 1.645c =
18
ECON 482 / WH Hong Multiple Regression Analysis: Inference (Example) Model of firms' R&D expenditure
• Estimated equation
( ) ( )log 4.38 1.084 log 0.0218(0.47) (0.060) (0.0217)
rd sales profmarg= − + +
32n = 2 0., 918R = , : Profits as percentage of sales profmarg
• Critical value: 0.05 29,0.025 2.045c t= =
( )log sales• C.I. of : ( ) [ ]1.084 2.045 0.060 0.961, 1.21± =
The effect of sales on R&D is relatively precisely estimated as the interval is narrow.
Moreover, the effect is significantly different from zero because zero is outside the
interval
19
• C.I. of : profmarg ( ) [ ]0.0217 2.045 0.0218 0.0045, 0.0479± = −
The effect is imprecisely estimated as the interval is very wide
It is not even statistically significant because zero lies in the interval
ECON 482 / WH Hong Multiple Regression Analysis: Inference
3. Testing Hypotheses about a Single Linear Combination of the Parameters
(Example) Return to education at 2 year vs. at 4 year colleges
• ( ) 0 1 2 3log wage jc univ exper uβ β β β= + + + +
jc : years of education at 2 year colleges
univ : years of education at 4 year colleges
exper
• Test hypothesis:
: work experience
0 1 2 1 2: 0H orβ β β β= − = vs. 1 1 2: 0H β β− <
(i) A possible test statistic
• t-statistic: ( )1 2
1 2
ˆ ˆˆ ˆ. .
ts d
β ββ β−
=−
, Hence, reject the null if 0.05
• However, it is impossible with standard regression output because
1,0.05n kt c t − −< =
( ) ( ) ( ) ( ) ( )1 2 1 2 1 2 1 2ˆ ˆ ˆ ˆ ˆ ˆ ˆ ˆ. . var var var 2cov ,s d β β β β β β β β and − = − = + −
( )1 2ˆ ˆcov ,β β is unknown.
20
ECON 482 / WH Hong Multiple Regression Analysis: Inference (ii) Alternative method
• Define 21 1θ β= − β and test 00 1:H θ = against 1 1: 0H θ <
( ) ( )( )
0 1 2 2 3
0 1 2 3
log wage jc univ exper u
jc jc univ exper u
β θ β β β
β θ β β
= + + + + +
= + + + + +
• Estimated equation
( )log 1.472 0.0102 0.0769 0.0049(0.021) (0.0069) (0.0023) (0.0002)
wage jc totcoll exper= − + +
6,763n = , 2 0.222R = , totcoll jc univ= +
• 0.0102 1.480.0069
t −= = −
( )1.48 0.70p value P T− = < − =
C.I. ( ) [ ]0.0102 1.96 0.0069 0.0237, 0.0003− ± = −
=> The null hypothesis is rejected at the 10% level but not at the 5% level
• This method works always for single linear hypothesis
21
ECON 482 / WH Hong Multiple Regression Analysis: Inference
4. Testing Multiple Linear Restrictions: The F test (1) Testing exclusion restriction
(Example) MLB players' salaries
• ( ) 0 1 2 3 4 5log salary years gamesyr bavg hrunsyr rbisyr uβ β β β β β= + + + + + +
years : years in the league
gamesyr : average number of games per year
bavg : batting average
hrunsyr : home runs per year
rbisyr : runs batted in per year
• Hypothesis: 00 3 4 5: 0, 0,H β β β= = 1 0:= vs. H H is not true
test whether performance measures have no effect on salary / can be excluded from
regression.
The alternative hypothesis can be read as "at least one performance measure have an
effect on salary."
22
ECON 482 / WH Hong Multiple Regression Analysis: Inference (Example) MLB players' salaries (Cont'd)
• Estimation of the unrestricted model
( )log 11.19 0.0689 0.0126(0.29) (0.0121) (0.0026)0.00098 0.0144 0.0108
(0.00110) (0.0161) (0.0072)
salary years gamesyr
bavg hrunsyr rbisyr
= + +
+ + +
353n = 183.186urSSR = 2 0.6278urR =
None of the performance measures is statistically significant when tested individually
using t-statistic.
, ,
• Estimation of the restricted model ( 3 4 50, 0, 0β β β= = = )
( )log 11.22 0.0713 0.0202(0.11) (0.0125) (0.0013)
salary years gamesyr= + +
353n = 198.311rSSR = 2 0.5971rR =
The sum of squared residuals necessarily increases because possible relevant variables
are dropped in the restricted regression. => Is the increase statistically significant?
, ,
23
ECON 482 / WH Hong Multiple Regression Analysis: Inference (Example) MLB players' salaries (Cont'd)
• Test statistic
( )( ) , 1
/~
/ 1r ur
q n kur
SSR SSR qF F
SSR n k − −
−=
− −
# rq numerator df of restictions df df= = = ur− and 1 urn k denominator df df− − = =
No derivation, but the definition of an F distributed random variable is 2
21
q
n k
Fχ
χ − −
= .
• Rejection rule: Reject the null if 0.05 , 1,0.05q n kF c F − −> =
A F-distributed variable only takes on positive values.
24
ECON 482 / WH Hong Multiple Regression Analysis: Inference (Example) MLB players' salaries (Cont'd)
• Test decision
( )( )
198.311 183.186 / 39.55
183.186 / 353 5 1F
−= ≈
− − and 0.05 3,347,0.01 3.78c F= =
( )9.55 0.000p value P F− = > =
Thus, the null hypothesis is overwhelmingly rejected (even very small significance level).
• Discussion
They were not significant when tested individually
But, the three variables are "jointly significant"
The likely reason is multicollinearity between them
25
ECON 482 / WH Hong Multiple Regression Analysis: Inference • The R-squared form of the F statistic
( )( )
( )( ) ( )
2 2
2
/// 1 1 / 1
ur rr ur
ur ur
R R qSSR SSR qF
SSR n k R n k
−−= =
− − − − −
The proof can be easily done by using ) and ( 21r rSSR SST R= − ( )21ur urSSR SST R= −
Note that the R-squared form of the test is only valid for exclusion restriction.
Since , 2 0.5971rR = R2 0.6278ur =
( )( ) ( )
( )( )
2 2
2
/ 0.6278 0.5971 / 39.54
1 0.6278 / 3471 / 1ur r
ur
R R qF
R n k
− −= = ≈
−− − −
- The difference in the last decimal digit is due to rounding error
26
ECON 482 / WH Hong Multiple Regression Analysis: Inference (2) Test of overall significance of a regression
• Unrestricted model
0 1 1 2 2 ...i i i k iy x x x uikβ β β β= + + + + +
• Hypothesis: 00 1: ... kH β β= = = 1 1: vs. H H is not true
The null hypothesis states that the explanatory variables are not useful at all in explaining
the dependent variable
• Restricted model (regression on constant)
0i iy uβ= +
• Test statistic
( )( ) ( ) ( )
2
, 12
/ / ~/ 1 1 / 1
r ur urk n k
ur ur
SSR SSR q R qF FSSR n k R n k − −
−= =
− − − − −
Note that . 2 0rR =
27
ECON 482 / WH Hong Multiple Regression Analysis: Inference (3) Testing general linear restrictions with the F-test
(Example) Test whether house price assessments are rational
• Estimation model:
( ) ( ) ( ) ( )0 1 2 3 4log log log logprice assess lotsize sqrft bdrms uβ β β β β= + + + + +
price : actual house price : the assessed housing value (before sold) assess
lotsize : size of lots (in feet) sqrft : square feet
bdrms : number of bedrooms
28
• Hypothesis: 00 1 2 3 4: 1, 0, 0,H β β β β= = = 1 0:= vs. H H is not tru
If house price assessments are rational, a 1% change in the assessment should be
associated with a 1% change in price.
e
In addition, other known factors should not influence the price once the assessed value
has been controlled for.
ECON 482 / WH Hong Multiple Regression Analysis: Inference
29
• Note that the R-squared form of the test statistic CANNOT be applied here.
(Example) Test whether house price assessments are rational (cont'd)
• Test statistics
( )( )
( )( )
/ 1.880 1.822 / 40.661
/ 1 1.822 / 88 4 1r ur
ur
SSR SSR qF
SSR n k− −
= = ≈− − − −
• Unrestricted regression
0 1 1 2 2 3 3 4 4y x x x x uβ β β β β= + + + + + , 1.822urSSR =
• Restricted regression
0 1y x uβ= + + => [ ]1 0y x uβ− = +
Regression of
, 1.880rSSR =
4,83,0.05 0.05~ 2.50
0H cannot be rejected since 0.050.661 2.50F c= < =
[ ]1y x− on a constant
F F c= =