Chapter 3. Two-Variable Regression Model: The Problem of Estimation
Ordinary Least Squares Method (OLS)
Recall the PRF: Yi = β1 + β2 Xi + ui
Since the PRF is not directly observable, it is estimated by the SRF; that is,
$$Y_i = \hat{\beta}_1 + \hat{\beta}_2 X_i + \hat{u}_i$$
And,
$$Y_i = \hat{Y}_i + \hat{u}_i$$
More on the Error Term
If $Y_i = \hat{Y}_i + \hat{u}_i$, then
$$\hat{u}_i = Y_i - \hat{Y}_i$$
and
$$\hat{u}_i = Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i$$
More on the Error Term
We need to choose the SRF in such a way that the error terms are as small as possible.
That is, the sum of residuals, represented by
$$\sum \hat{u}_i = \sum \left(Y_i - \hat{Y}_i\right),$$
should be as SMALL as possible.
More on the Error Terms
Therefore, the essential task is to find a criterion that minimizes the error disturbances in the SRF, so that all of the errors lie as close as possible to the fitted SRF line. Note that minimizing the plain sum $\sum \hat{u}_i$ is not enough, because large positive and negative residuals can cancel each other out.
Then the Least Squares Criterion Comes as a Solution
The least squares criterion is based on:
$$\sum \hat{u}_i^2 = \sum \left(Y_i - \hat{Y}_i\right)^2 = \sum \left(Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i\right)^2$$
Thus,
$$\sum \hat{u}_i^2 = f\!\left(\hat{\beta}_1, \hat{\beta}_2\right)$$
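Minimizing this sum of squared residuals with respect to $\hat{\beta}_1$ and $\hat{\beta}_2$ yields the normal equations. The slides do not show the intermediate step, so the following is a standard sketch of it:

```latex
% Set the partial derivatives of \sum \hat{u}_i^2 to zero:
\frac{\partial \sum \hat{u}_i^2}{\partial \hat{\beta}_1}
  = -2 \sum \left(Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i\right) = 0
\frac{\partial \sum \hat{u}_i^2}{\partial \hat{\beta}_2}
  = -2 \sum X_i \left(Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i\right) = 0
% Rearranging gives the two normal equations:
\sum Y_i     = n \hat{\beta}_1 + \hat{\beta}_2 \sum X_i
\sum X_i Y_i = \hat{\beta}_1 \sum X_i + \hat{\beta}_2 \sum X_i^2
% Solving these simultaneously gives the OLS estimators shown below.
```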
Example of the Least Squares Criterion
Which model is better, and why? The model whose sum of squared error disturbances is lower fits the data better.
Regression Equation
$$Y_i = \hat{\beta}_1 + \hat{\beta}_2 X_i + \hat{u}_i$$
$$\hat{\beta}_2 = \frac{n \sum X_i Y_i - \sum X_i \sum Y_i}{n \sum X_i^2 - \left(\sum X_i\right)^2} = \frac{\sum \left(X_i - \bar{X}\right)\left(Y_i - \bar{Y}\right)}{\sum \left(X_i - \bar{X}\right)^2} = \frac{\sum x_i y_i}{\sum x_i^2}$$
$$\hat{\beta}_1 = \frac{\sum X_i^2 \sum Y_i - \sum X_i \sum X_i Y_i}{n \sum X_i^2 - \left(\sum X_i\right)^2} = \bar{Y} - \hat{\beta}_2 \bar{X}$$
where $\bar{X}$ and $\bar{Y}$ are the sample means of X and Y, and $x_i = X_i - \bar{X}$ and $y_i = Y_i - \bar{Y}$ are deviations from those means.
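As a concrete illustration, here is a minimal Python sketch of these formulas using NumPy; the income/consumption figures are hypothetical, chosen only to make the computation runnable:

```python
import numpy as np

# Hypothetical data: weekly income (X) and consumption (Y), illustrative only
X = np.array([80.0, 100.0, 120.0, 140.0, 160.0, 180.0, 200.0, 220.0])
Y = np.array([65.0, 70.0, 84.0, 93.0, 107.0, 115.0, 136.0, 137.0])

# Deviations from the sample means
x = X - X.mean()
y = Y - Y.mean()

# Slope: beta2_hat = sum(x_i * y_i) / sum(x_i^2)
beta2_hat = np.sum(x * y) / np.sum(x ** 2)

# Intercept: beta1_hat = Ybar - beta2_hat * Xbar
beta1_hat = Y.mean() - beta2_hat * X.mean()

print(f"beta1_hat = {beta1_hat:.4f}, beta2_hat = {beta2_hat:.4f}")
```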
The Classical Linear Regression Model (CLRM): The Assumptions Underlying The Method of Least Squares
Inferences about the true β1 and β2 matter because we need the estimated values to be as close as possible to the population values.
Therefore the CLRM, which is the cornerstone of most econometric theory, makes 10 assumptions.
Assumptions of the CLRM:
Assumption 1. Linear Regression Model
The regression model is linear in the parameters, that is:
Yi = β1 + β2 Xi + ui
Assumption 2. X values are fixed in repeated sampling. More technically, X is assumed to be non-stochastic.
X: $80 income level → Y: $60 weekly consumption of a family
X: $80 income level → Y: $75 weekly consumption of another family
Assumption 2 is why this is known as conditional regression analysis, that is, regression conditional on the given values of the regressor(s) X.
Assumption 3. Zero Mean value of disturbance ui
$$E(u_i \mid X_i) = 0$$
Assumption 4. Homoscedasticity or Equal Variance of ui
$$\operatorname{var}(u_i \mid X_i) = E\left[u_i - E(u_i \mid X_i)\right]^2 = E\left(u_i^2 \mid X_i\right) = \sigma^2$$
where the second equality follows from Assumption 3, and var stands for variance.
Homoscedasticity vs Heteroscedasticity
$$\operatorname{var}(u_i \mid X_i) = \sigma^2$$
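To make the contrast concrete, here is a small simulation sketch (the setup and numbers are assumptions for illustration, not from the slides) that generates both kinds of disturbances:

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.linspace(80, 260, 50)

# Homoscedastic: var(u_i | X_i) = sigma^2 is the same at every X_i
u_homo = rng.normal(loc=0.0, scale=10.0, size=X.size)

# Heteroscedastic: the spread of u_i grows with X_i, so the variance is not constant
u_hetero = rng.normal(loc=0.0, scale=0.1 * X)

# Compare the spread of the disturbances at low vs high X values
print("homoscedastic:  ", u_homo[:10].std(), u_homo[-10:].std())
print("heteroscedastic:", u_hetero[:10].std(), u_hetero[-10:].std())
```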
Assumption 5. No Autocorrelation between the disturbances: $\operatorname{cov}(u_i, u_j \mid X_i, X_j) = 0$ for $i \neq j$.
Autocorrelation
If:
PRF: Yt = β1 + β2Xt + ut
and ut and ut−1 are correlated, then Yt depends not only on Xt but also on ut−1.
Autocorrelation in Graphs
Assumption 6. Zero Covariance between ui and Xi, that is, $\operatorname{cov}(u_i, X_i) = E(u_i X_i) = 0$.
Assumption 7. The number of observations n must be greater than the number of parameters to be estimated.
Assumption 8. Variability in X values: the X values in a given sample must not all be the same.
Assumption 9. The regression model is correctly specified; there is no specification bias or error.
Assumption 10. There is No Perfect Multicollinearity
That is, there is no perfect linear relationship among the explanatory variables.
$$Y_t = \beta_0 + \beta_1 X_{1t} + \beta_2 X_{2t} + \cdots + \beta_n X_{nt} + u_t$$
High correlation among the independent variables causes multicollinearity, which inflates standard errors and yields low t values, making hypothesis tests unreliable, etc.
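A quick way to see why perfect multicollinearity is ruled out is the following sketch (hypothetical data; X2 is constructed as an exact linear function of X1, so the normal-equations matrix cannot be inverted):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 30
X1 = rng.normal(size=n)
X2 = 2.0 * X1 + 3.0                      # perfect linear relationship: X2 = 2*X1 + 3
Y = 1.0 + 0.5 * X1 + rng.normal(size=n)

# Design matrix with a constant term, X1, and X2
Z = np.column_stack([np.ones(n), X1, X2])

# Under perfect multicollinearity Z'Z is singular, so the OLS
# normal equations have no unique solution.
print("rank of Z'Z:", np.linalg.matrix_rank(Z.T @ Z))       # 2, not 3
print("condition number of Z'Z:", np.linalg.cond(Z.T @ Z))  # astronomically large
```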
Properties of the Least-Squares Estimators: The Gauss-Markov Theorem
The Gauss-Markov Theorem combines the least squares approach of Gauss (1821) with the minimum variance approach of Markov (1900).
The standard error of estimate is simply the standard deviation of the Y values about the estimated regression line and is often used as a summary measure of the “goodness of fit” of the estimated regression line.
BLUE (Best Linear Unbiased Estimator)
1. An estimator is linear, that is, a linear function of a random variable, such as the dependent variable Y in the regression model.
2. An estimator is unbiased, that is, its average or expected value, E(β̂2), is equal to the true value, β2.
3. An estimator has minimum variance in the class of all such linear unbiased estimators; an unbiased estimator with the least variance is known as an efficient estimator.
Therefore, in the regression context it can be proved that the OLS estimators are BLUE; this result is the Gauss-Markov Theorem.
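The "unbiased" part of BLUE can be illustrated with a small Monte Carlo sketch; the true parameter values, sample design, and number of replications below are all assumptions chosen for the demonstration:

```python
import numpy as np

rng = np.random.default_rng(42)
beta1, beta2, sigma = 3.0, 0.5, 2.0   # assumed true population values
X = np.linspace(10, 100, 40)          # X fixed in repeated sampling (Assumption 2)

estimates = []
for _ in range(5000):                 # repeated samples with fresh disturbances
    u = rng.normal(0.0, sigma, size=X.size)
    Y = beta1 + beta2 * X + u
    x, y = X - X.mean(), Y - Y.mean()
    estimates.append(np.sum(x * y) / np.sum(x ** 2))  # beta2_hat for this sample

# Averaged over many samples, beta2_hat centers on the true beta2
print("mean of beta2_hat:", np.mean(estimates))  # approximately 0.5
```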
The Coefficient of Determination, r2: A Measure of “Goodness of Fit”
The coefficient of determination, r2 (two-variable case) or R2 (multiple regression) is a summary measure that tells how well the sample regression line fits the data.
The Ballentine View of R2
See Peter Kennedy, “Ballentine: A Graphical Aid for Econometrics”, Australian Economics Papers, Vol 20, 1981, 414-416. The name Ballentine is derived from the emblem of the well-known Ballantine beer with its circles.
Coefficient of Determination, r2
TSS = ESS + RSS
where:
TSS = total sum of squares
ESS = explained sum of squares
RSS = residual sum of squares
$$\sum \left(Y_i - \bar{Y}\right)^2 = \sum \left(\hat{Y}_i - \bar{Y}\right)^2 + \sum \hat{u}_i^2$$
If TSS = ESS + RSS, then dividing both sides by TSS gives:
$$1 = \frac{ESS}{TSS} + \frac{RSS}{TSS}$$
More on r²:
r² measures the proportion of the total variation in Y explained by the regression model; therefore,
$$r^2 = \frac{ESS}{TSS} = \frac{\sum \left(\hat{Y}_i - \bar{Y}\right)^2}{\sum \left(Y_i - \bar{Y}\right)^2}$$
Alternatively,
$$r^2 = 1 - \frac{RSS}{TSS} = 1 - \frac{\sum \hat{u}_i^2}{\sum \left(Y_i - \bar{Y}\right)^2}$$
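Continuing the earlier numerical sketch (same hypothetical income/consumption data as above), both formulas give the same r²:

```python
import numpy as np

# Same hypothetical data as in the estimation sketch above
X = np.array([80.0, 100.0, 120.0, 140.0, 160.0, 180.0, 200.0, 220.0])
Y = np.array([65.0, 70.0, 84.0, 93.0, 107.0, 115.0, 136.0, 137.0])

x, y = X - X.mean(), Y - Y.mean()
beta2_hat = np.sum(x * y) / np.sum(x ** 2)
beta1_hat = Y.mean() - beta2_hat * X.mean()

Y_hat = beta1_hat + beta2_hat * X        # fitted values
u_hat = Y - Y_hat                        # residuals

TSS = np.sum((Y - Y.mean()) ** 2)        # total sum of squares
ESS = np.sum((Y_hat - Y.mean()) ** 2)    # explained sum of squares
RSS = np.sum(u_hat ** 2)                 # residual sum of squares

print(ESS / TSS, 1 - RSS / TSS)          # both definitions of r^2 agree
```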
HW # 1:
Problem 3.20 (Chapter 3)
Consumer Prices and Money Supply in Japan, 1982 to 2001