Lecture 1: Introduction

ECONOMETRICS-I Eco-537

Dr. Adnan Haider Assistant Professor

Department of Economics and Finance

IBA, Karachi.

E-Mail: [email protected]

Week 1: Introduction

Aims and Objectives

To explain some of the basic methods in econometrics (the application of statistical methods to economic problems)

To provide an overview of how to carry out and interpret empirical research, which is useful for your dissertations and research projects

The Student's Perspective

"Econometrics is too mathematical; it's the reason my best friend isn't majoring in economics!"

The Lecturer's Perspective

Econometrics allows the measurement and analysis of economic phenomena and the prediction of future economic trends

What is Econometrics?

Why Study Econometrics?

Economic theory deals with the question "why?"; econometrics answers "how much?"

Econometrics: the practice of combining economic theory with data to make statistical inferences and predictions

How do you decide how much to charge for a good or your professional services?

How does the Monetary Policy Committee decide how to set interest rates (policy discount rate)?

Will theory alone provide an answer?

It's clear that data issues are very important in econometrics

You will, sooner or later, be asked to use data in your jobs, in order to support a decision

What is Econometrics?

Historical Context

Econometricians wear many different hats

They are often criticised for "...using sledgehammers to crack open peanuts while turning a blind eye to data deficiencies and the many questionable assumptions required for the successful application of these techniques" (Kennedy, 1998: p. 2)

Econometrics as Art or Science?

To use information effectively:

economic theory + economic data => economic decisions

Econometrics helps us combine economic theory and economic data.

Economic Decisions

Economic Data

Economic data are incomplete and far from perfect => uncertainty!

Economic variables should, in general, be treated as random variables, since we have imperfect knowledge of the actual data-generating mechanism

We usually work with samples, not the entire population, so we infer population features by analysing samples. Statistical inference is a tool for drawing conclusions from limited sets of information, i.e. for quantifying uncertainty

Different samples will lead to different results, so we need to account for sampling variability
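A minimal simulation sketch of sampling variability. The population, its mean and spread, and the sample sizes are all made-up numbers chosen for illustration; nothing here comes from real survey data.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so the sketch is reproducible

# Hypothetical population: 100,000 household incomes (illustrative numbers only)
population = rng.normal(loc=50_000, scale=10_000, size=100_000)

n = 100            # size of each sample
n_samples = 1_000  # number of repeated samples

# Draw repeated random samples and record each sample mean
sample_means = np.array([
    rng.choice(population, size=n, replace=False).mean()
    for _ in range(n_samples)
])

# The spread of the sample means is the sampling variability;
# statistical theory says it should be close to sigma / sqrt(n)
print("population mean:          ", round(population.mean(), 1))
print("first three sample means: ", np.round(sample_means[:3], 1))
print("std. dev. of sample means:", round(sample_means.std(ddof=1), 1))
print("sigma / sqrt(n):          ", round(population.std() / np.sqrt(n), 1))
```

Each sample gives a different mean, but the spread of those means is predictable. That predictability is what lets us attach a standard error to an estimate.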

Cross-sectional; panel/longitudinal; time series

Each type is suited to a different purpose and has its own associated problems

Cross-sectional data

Consists of a sample of individuals, households, firms, regions, countries or other units taken at a given point in time

Often the information comes from government surveys, e.g. the LFS or PSLM in Pakistan

These data have to be obtained by random sampling from the underlying population (apart from a census)

Random sampling often suffers from the problem of non-response => can create biases

Types of Economic Data

Problems can also occur when the sampled units are large relative to the population, e.g. geographical areas

Survey data is also widely analysed in other social sciences e.g. sociology and geography

Used to test microeconomic hypotheses and evaluate economic policies

Mainly used in the fields of financial economics, labour economics, public economics, industrial economics and health economics

Time series data

Consists of observations on a variable or series of variables over a period of time

Chronological ordering of observations conveys potentially important information

More difficult to analyse than cross-sectional data because economic observations are rarely independent over time

Most time series are related to their recent histories

Modifications to standard econometric techniques have been developed to account for and exploit the dependent nature of economic time series and address other issues, e.g. variables have trends

The frequency of the data is also important: the most common frequencies are daily, weekly, monthly, quarterly and annual

Many weekly, monthly and quarterly time series display a strong seasonal pattern

Mainly used to analyse macroeconomic issues and test macroeconomic theories
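A small sketch of what "related to their recent histories" means in practice. The AR(1) process and its coefficient of 0.8 are assumptions chosen for illustration, not a model of any particular economic series.

```python
import numpy as np

def lag1_autocorr(x):
    """Sample correlation between x_t and x_{t-1}."""
    x = np.asarray(x, dtype=float)
    return np.corrcoef(x[1:], x[:-1])[0, 1]

rng = np.random.default_rng(seed=1)
T = 500
shocks = rng.normal(size=T)   # independent shocks

# Hypothetical AR(1) series: y_t = phi * y_{t-1} + e_t  (phi = 0.8 is an assumption)
phi = 0.8
y = np.zeros(T)
for t in range(1, T):
    y[t] = phi * y[t - 1] + shocks[t]

print("lag-1 autocorrelation, AR(1) series:", round(lag1_autocorr(y), 2))
print("lag-1 autocorrelation, pure shocks: ", round(lag1_autocorr(shocks), 2))
```

The dependent series shows a strong correlation with its own previous value, while the independent shocks do not. Standard cross-sectional methods assume the second case.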

Panel or Longitudinal Data

Consists of a time series for each cross-sectional unit in the data set

Quite difficult to collect, as it requires information on the same units over time

Can be obtained either from surveys or collected over time for regions, industries etc.

But biases arise if households/individuals drop out of the sample => attrition

The main advantage is the ability to control for unobserved characteristics of individuals/firms

Also contains the best features of cross section and time series

Disaggregated data e.g. individuals/firms

Allows for the inclusion of dynamics

But some of the econometric techniques needed to analyse panel data are quite complex => covered in Econometrics

Pooled cross sections are similar but do not have repeated observations on the same units
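One way to see how panel data can control for unobserved characteristics is the "within" (fixed effects) transformation, sketched below on simulated data. The data-generating process, the true slope of 1.0 and the panel dimensions are all assumptions for illustration; the technique itself is covered later in the course sequence.

```python
import numpy as np

rng = np.random.default_rng(seed=2)
N, T = 200, 5          # 200 units observed for 5 periods (hypothetical)
beta = 1.0             # true slope used to generate the data

# Unobserved, time-invariant unit characteristic (e.g. "ability")
alpha = rng.normal(size=(N, 1))
# The regressor is correlated with the unobserved characteristic
x = alpha + rng.normal(size=(N, T))
y = beta * x + 2.0 * alpha + rng.normal(size=(N, T))

def slope(x, y):
    """OLS slope of y on x (with intercept), using all observations."""
    xd, yd = x - x.mean(), y - y.mean()
    return (xd * yd).sum() / (xd ** 2).sum()

# Pooled OLS ignores alpha, so the slope also picks up its effect
pooled = slope(x, y)

# Within transformation: subtract each unit's own time average,
# which removes anything that is constant over time for that unit
x_within = x - x.mean(axis=1, keepdims=True)
y_within = y - y.mean(axis=1, keepdims=True)
within = slope(x_within, y_within)

print("pooled OLS slope:", round(pooled, 2))
print("within slope:    ", round(within, 2))
```

In this simulation the pooled slope is pushed well above the true value, while the within slope stays close to it. Repeated observations on the same unit are what make the demeaning possible.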

The Modelling Process

(1) Statement of theory/hypothesis

(2) Specification of mathematical model

(3) Specification of the econometric model

(4) Obtaining the data / conduct preliminary data analysis

(5) Estimation of the econometric model and interpretation of regression results

(6) Diagnostic Analysis

(7) Hypothesis testing

(8) Prediction/forecasting

8 STAGE PROCESS

EXAMPLE: 1

Degree Performance in Economics

A number of factors determine performance: ability, family background, effort

Let us look at this from the perspective of ability only and analyse it using a simple bivariate (2-variable) model

Student Performance

STAGE 1- Statement of Theory /Hypothesis

Student Performance Function:

Student degree performance is determined by ability

STAGE 2 - Mathematical Model

Performance, P, is some function of ability, A :

P = f(A) (1)

In linear form:

$Y = \beta_1 + \beta_2 X$ (2)

where Y = performance and X = ability

Student Performance

STAGE 3 - Econometric Model

$Y = \beta_1 + \beta_2 X + U$ (3)

Y = Performance - the dependent variable

X = Ability - explanatory variable

U = Disturbance (random error) term

Student Performance

STAGE 4a - Obtaining the data

For this particular example we will collect data on the year 2 average and the final year (year 3) average

Observed values of Y (year 3 average) and X (year 2 average)

Student Performance

STAGE 4b - Preliminary Data Analysis

Descriptive statistics, graphical charts (initial identification of possible errors: outliers, influential observations and lurking variables)

[Figure: Year 3 Average Against Year 2 Average; x-axis: Year 2 Average, y-axis: Year 3 Average]

What can we say about the relationship between year 3 average and year 2 average? Subjective judgement

Student Performance


STAGE 5a - Estimation of the Parameters

Y and X are the variables: known

$\beta_1$, $\beta_2$ are the parameters and U is the disturbance: both unknown

Estimators versus estimates

The least squares (OLS) regression line is the line that minimises the sum of squared vertical deviations of the data points from the line
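The least squares idea can be sketched in a few lines for the bivariate case, using the standard closed-form solution. The six pairs of marks below are invented for illustration and are not the lecture's data set; `ols_bivariate` is a hypothetical helper name.

```python
import numpy as np

def ols_bivariate(x, y):
    """Return (b1, b2) minimising sum((y - b1 - b2*x)**2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_dev = x - x.mean()
    y_dev = y - y.mean()
    b2 = (x_dev * y_dev).sum() / (x_dev ** 2).sum()  # slope
    b1 = y.mean() - b2 * x.mean()                     # intercept
    return b1, b2

# Made-up marks for six students (not the lecture's data)
year2 = [52, 58, 61, 65, 70, 48]
year3 = [57, 60, 63, 64, 68, 55]

b1, b2 = ols_bivariate(year2, year3)
residuals = np.asarray(year3) - (b1 + b2 * np.asarray(year2))

print("intercept:", round(b1, 3), "slope:", round(b2, 3))
print("sum of squared residuals:", round((residuals ** 2).sum(), 3))
```

The function is the estimator (a rule); the two numbers it returns for a particular data set are the estimates.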

[Figure: Year 3 Average Against Year 2 Average; x-axis: Year 2 Average, y-axis: Year 3 Average]

Student Performance

STAGE 5b - Interpreting the regression results

Check the coefficient sign against expectations (hypothesis) (e.g. do we expect year 2 average to be positively related to year 3 average?)

Check the coefficient magnitude against expectations, if any (e.g. are there any prior expectations on the possible magnitude of the effect of year 2 average?)

Student Performance

STAGE 6 - Diagnostic Analysis

Is the model correctly specified?

- Correct functional form?

- Omitted variables (or unnecessary ones)?

- Is the regression spurious?

Has the model got good diagnostic properties? (validity of the assumed probability distribution of the disturbance term)

- Is the disturbance term uncorrelated with the regressors?

- Are the values of the disturbance term independently and normally distributed with mean zero and variance $\sigma^2$?
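Two widely used residual checks for the independence and normality questions are the Durbin-Watson statistic and a skewness/kurtosis (Jarque-Bera type) statistic. The sketch below implements their textbook formulas on simulated residuals. The function names are hypothetical helpers, and this is not a claim about exactly how any particular package computes its diagnostics.

```python
import numpy as np

def durbin_watson(resid):
    """DW = sum of squared successive differences / sum of squared residuals.
    Values near 2 suggest no first-order serial correlation."""
    resid = np.asarray(resid, dtype=float)
    return (np.diff(resid) ** 2).sum() / (resid ** 2).sum()

def jarque_bera(resid):
    """JB = n/6 * (S^2 + (K - 3)^2 / 4), with S = skewness, K = kurtosis.
    Under normality JB is approximately chi-squared with 2 degrees of freedom."""
    resid = np.asarray(resid, dtype=float)
    n = resid.size
    centred = resid - resid.mean()
    m2 = (centred ** 2).mean()
    skew = (centred ** 3).mean() / m2 ** 1.5
    kurt = (centred ** 4).mean() / m2 ** 2
    return n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)

rng = np.random.default_rng(seed=3)
u = rng.normal(size=216)   # simulated, well-behaved disturbances

print("DW:", round(durbin_watson(u), 2))
print("JB:", round(jarque_bera(u), 2))
```

A single extreme residual is enough to push the normality statistic far into the rejection region, so a large value is often a prompt to look for outliers first.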

Student Performance

STAGE 7 - Hypothesis Testing

Are the estimates statistically significant? Do they conform with economic theory?

STAGE 8 - Forecasting/Prediction

For example, predicting the level of performance for a particular ability level.

Possible policy implications?

Student Performance

Excel Regression Output: Bivariate Case

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.599443   (correlation coefficient)
R Square             0.359332   (coefficient of determination)
Adjusted R Square    0.356338
Standard Error       4.617821
Observations         216

36% of the variation in year 3 is explained by variation in year 2

ANOVA
             df    SS         MS         F          Significance F
Regression   1     2559.477   2559.477   120.0265   1.84E-22
Residual     214   4563.393   21.32427
Total        215   7122.87

            Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%   Lower 95.0%   Upper 95.0%
Intercept   29.4029        2.799513         10.50286   4.45E-21   23.88475    34.92106    23.88475      34.92106
Y2AV        0.538088       0.049115         10.95566   1.84E-22   0.441277    0.6349      0.441277      0.6349

The Intercept row gives the estimate $\hat{\beta}_1$ and the Y2AV row the estimate $\hat{\beta}_2$, each with its t statistic and P-value
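Most of the summary figures in the Excel output are simple functions of the sums of squares and the coefficient table. The sketch below recomputes them from the reported numbers as a consistency check; the variable names are illustrative.

```python
# Figures copied from the Excel summary output above
ss_regression = 2559.477
ss_residual = 4563.393
ss_total = 7122.87
n_obs = 216
k = 2                      # estimated parameters: intercept and slope
slope, slope_se = 0.538088, 0.049115

r_squared = ss_regression / ss_total
multiple_r = r_squared ** 0.5
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - k)
ms_residual = ss_residual / (n_obs - k)             # residual mean square
std_error = ms_residual ** 0.5                      # standard error of regression
f_stat = (ss_regression / (k - 1)) / ms_residual    # overall F statistic
t_slope = slope / slope_se                          # t statistic for Y2AV

print("R Square:         ", round(r_squared, 6))
print("Multiple R:       ", round(multiple_r, 6))
print("Adjusted R Square:", round(adj_r_squared, 6))
print("Standard Error:   ", round(std_error, 6))
print("F:                ", round(f_stat, 4))
print("t Stat (Y2AV):    ", round(t_slope, 4))
```

In a bivariate regression the F statistic equals the square of the slope's t statistic, which is why the two share the same P-value (1.84E-22).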

Student Performance

MicroFit Regression Output: Bivariate Case

Ordinary Least Squares Estimation

******************************************************************************

Dependent variable is Y3AV

216 observations used for estimation from 1 to 216

******************************************************************************

Regressor Coefficient Standard Error T-Ratio[Prob]

CONSTANT 29.4029 2.7995 10.5029[.000]

Y2AV .53809 .049115 10.9557[.000]

******************************************************************************

R-Squared .35933 R-Bar-Squared .35634

S.E. of Regression 4.6178 F-stat. F( 1, 214) 120.0265[.000]

Mean of Dependent Variable 59.8796 S.D. of Dependent Variable 5.7558

Residual Sum of Squares 4563.4 Equation Log-likelihood -635.9494

Akaike Info. Criterion -637.9494 Schwarz Bayesian Criterion -641.3247

DW-statistic 1.6939

******************************************************************************

Diagnostic Tests

******************************************************************************

* Test Statistics * LM Version * F Version

******************************************************************************

* * *

* A:Serial Correlation*CHSQ( 1)= 4.8159[.028]*F( 1, 213)= 4.8573[.029]

* * *

* B:Functional Form *CHSQ( 1)= .0080270[.929]*F( 1, 213)= .0079158[.929]

* * *

* C:Normality *CHSQ( 2)= 1394.7[.000]* Not applicable

* * *

* D:Heteroscedasticity*CHSQ( 1)= 3.1753[.075]*F( 1, 214)= 3.1928[.075]

******************************************************************************

A:Lagrange multiplier test of residual serial correlation

B:Ramsey's RESET test using the square of the fitted values

C:Based on a test of skewness and kurtosis of residuals

D:Based on the regression of squared residuals on squared fitted values
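A short sketch of how to read the MicroFit block. It first checks that the reported information criteria are consistent with the reported log-likelihood, assuming criteria expressed on the log-likelihood scale (larger is better); that convention is an assumption inferred from the numbers. It then compares the LM-version p-values with a 5% significance level, which is also a choice made here for illustration.

```python
import math

log_likelihood = -635.9494   # "Equation Log-likelihood" from the output above
n_obs = 216
k = 2                        # CONSTANT and Y2AV

# Assumed convention: criteria on the log-likelihood scale (larger is better)
aic = log_likelihood - k
sbc = log_likelihood - 0.5 * k * math.log(n_obs)
print("AIC:", round(aic, 4), "SBC:", round(sbc, 4))

# LM-version p-values from the diagnostic table
p_values = {
    "serial correlation": 0.028,
    "functional form": 0.929,
    "normality": 0.000,
    "heteroscedasticity": 0.075,
}
rejected = [name for name, p in p_values.items() if p < 0.05]
print("null rejected at 5% for:", rejected)
```

On these numbers the functional form and heteroscedasticity tests do not reject at 5%, while the serial correlation and normality tests do, so those two assumptions deserve a closer look before relying on the t and F statistics.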

Student Performance

Annotations on the MicroFit output: the upper block is the basic regression (similar to the results reported in Excel); the lower block reports regression diagnostics

Example: 2 Keynesian theory of consumption

Statement of economic theory or hypothesis e.g. Keynesian consumption function

C = f(Y), with the marginal propensity to consume (MPC) satisfying 0 < MPC < 1

Econometric Methodology: Keynesian Consumption Theory

[Figure: linear consumption function with axes Y and X; slope = MPC]


Specification of econometric model

C = a + bY + u

u is a disturbance or error term

it is a random variable with well-defined properties

this model is probabilistic or stochastic

Obtaining data

involves making further assumptions e.g. which measures of C and Y to use and whether the variables should be in real or nominal terms

a plot of the data shows that usually no exact relationship holds between the variables


Data on C (personal consumption expenditure) and Y (Gross Domestic Product), 1980-1991, in billions of 1987 US$

Year C Y

1980 2447.1 3776.3

1981 2476.9 3843.1

1982 2503.7 3760.3

1983 2619.4 3906.6

1984 2746.1 4148.5

1985 2865.8 4279.8

1986 2969.1 4404.5

1987 3052.2 4539.9

1988 3162.4 4718.6

1989 3223.3 4838.0

1990 3260.4 4877.5

1991 3240.8 4821.0

Source: Gujarati, p. 6. Reproduced from Economic Report of the President, 1993, Table B-2, p. 350.

Estimation of parameters of the model

Basically, we will study how to draw a line through a set of points

But our set of points is just a sample, and we want to know a and b in the population; the estimates are usually denoted using hats, i.e. $\hat{a}$ and $\hat{b}$

Quantifying uncertainty: use statistical theory to assign a standard error to the parameter estimates and interpret those estimates as random variables with a probability distribution. Then we can construct confidence intervals for a and b

Hypothesis testing

Having estimated a and b we can test statistical hypotheses about them, e.g. is the MPC different from unity?
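The estimation, standard error, confidence interval and test steps can be sketched on the twelve observations in the table above. The formulas are the standard bivariate OLS ones; the critical value 2.228 is the two-sided 5% point of the t distribution with 10 degrees of freedom, hard-coded here to keep the example dependency-free.

```python
import numpy as np

# C and Y from the table above (billions of 1987 US dollars, 1980-1991)
C = np.array([2447.1, 2476.9, 2503.7, 2619.4, 2746.1, 2865.8,
              2969.1, 3052.2, 3162.4, 3223.3, 3260.4, 3240.8])
Y = np.array([3776.3, 3843.1, 3760.3, 3906.6, 4148.5, 4279.8,
              4404.5, 4539.9, 4718.6, 4838.0, 4877.5, 4821.0])

n = C.size
y_dev = Y - Y.mean()
b_hat = (y_dev * (C - C.mean())).sum() / (y_dev ** 2).sum()   # slope (MPC)
a_hat = C.mean() - b_hat * Y.mean()                            # intercept

residuals = C - (a_hat + b_hat * Y)
sigma2_hat = (residuals ** 2).sum() / (n - 2)     # estimate of the error variance
se_b = np.sqrt(sigma2_hat / (y_dev ** 2).sum())   # standard error of b_hat

t_crit = 2.228  # two-sided 5% critical value, t distribution, n - 2 = 10 d.o.f.
ci_b = (b_hat - t_crit * se_b, b_hat + t_crit * se_b)
t_mpc_equals_1 = (b_hat - 1.0) / se_b             # test statistic for H0: b = 1

print("a_hat:", round(a_hat, 1), "b_hat:", round(b_hat, 4))
print("se(b_hat):", round(se_b, 4))
print("95% CI for b:", tuple(round(v, 3) for v in ci_b))
print("t statistic for H0: MPC = 1:", round(t_mpc_equals_1, 2))
```

If the confidence interval for b excludes 1, the data reject the hypothesis that the MPC equals unity at the 5% level.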


Forecasting and Prediction

Estimation of a and b also allows us to use the model for prediction or forecasting of C for a given value of Y, using the equation:

$\hat{C} = \hat{a} + \hat{b}Y$

Using the model for policy analysis

The model can be used to answer policy questions, e.g. what is the effect on C of cutting taxes (and thereby raising disposable income) by 5%?
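A minimal sketch of using an estimated consumption function for prediction and a simple "what if" question. The coefficient values are assumptions: rounded OLS estimates for the 1980-1991 data (intercept about -231.8, MPC about 0.7194). The 5% experiment is applied to Y directly, which simplifies the tax-cut question in the text.

```python
# Assumed (rounded) estimates for the 1980-1991 consumption data
A_HAT = -231.8
B_HAT = 0.7194

def predict_consumption(income, a_hat=A_HAT, b_hat=B_HAT):
    """Point forecast C_hat = a_hat + b_hat * Y (billions of 1987 US$)."""
    return a_hat + b_hat * income

income_1991 = 4821.0                                # Y for 1991 from the data table
baseline = predict_consumption(income_1991)
raised = predict_consumption(income_1991 * 1.05)    # income 5% higher

print("predicted C at 1991 income:     ", round(baseline, 1))
print("predicted C with 5% more income:", round(raised, 1))
print("change in predicted C:          ", round(raised - baseline, 1))
```

Because the model is linear, the predicted change in C is just the MPC times the change in Y; the intercept drops out of the comparison.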



Estimation of the model

regression analysis

OLS estimates (details next lecture)

On average, a US$1 increase in real income led to an increase of about 72 US cents in consumption expenditure

Hypothesis testing

theory: 0 < MPC < 1


Can also work out the income multiplier (M):

$M = \frac{1}{1 - MPC} = \frac{1}{1 - 0.72} \approx 3.57$

Using the model for control or policy purposes

The government believes that consumption expenditure of US$4000 billion will leave unemployment unchanged

What level of income leads to the target consumption expenditure?

Control variable X; target variable Y

$4000 = -231.8 + 0.7194X \;\Rightarrow\; X \approx 5882$
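The multiplier and the control calculation are both one-line pieces of arithmetic, sketched here as a check. The intercept and slope are the rounded estimates implied by the slide's own numbers; the variable names are illustrative.

```python
mpc = 0.72                       # rounded MPC used for the multiplier
multiplier = 1.0 / (1.0 - mpc)   # M = 1 / (1 - MPC)

# Invert Y = a + b*X for the income level X that delivers a target Y
a_hat, b_hat = -231.8, 0.7194    # rounded estimates (assumed)
target_consumption = 4000.0      # target consumption expenditure
required_income = (target_consumption - a_hat) / b_hat

print("income multiplier:", round(multiplier, 2))   # about 3.57
print("required income:  ", round(required_income)) # about 5882
```

Note that the control calculation treats the estimated line as exact. In practice the required income level inherits the uncertainty in the estimates of the intercept and slope.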

The Practice of Econometrics

Economic theory

Econometric model

Data

Estimation

Specification testing and diagnostic testing

Is the model adequate? (No: revise the model; Yes: continue)

Hypothesis testing

Policy: prediction and forecasting

Econometric Analysis

Theory -> Model -> Econometric Model

Facts -> Data -> Refined Data

Statistical Theory -> Econometric Techniques

Estimation of Econometric Model with the Refined Data Using Econometric Techniques

Structural Analysis; Forecasting; Policy Evaluation

Summary: Three stages of research

Specification of model: relevant variables, mathematical form, signs and magnitudes of parameters, error terms

Estimation: data requirements (time series, cross section, panel), level of aggregation (households, regional, national), estimation techniques (OLS, etc.)

Model evaluation: a priori beliefs (signs and magnitudes etc.), significance of coefficients, degree of fit within sample, forecasting ability beyond sample, nature of residuals
