Lecture 1: Introduction

ECONOMETRICS-I Eco-537 Dr. Adnan Haider Assistant Professor Department of Economics and Finance IBA, Karachi. E-Mail: [email protected] Week 1: Introduction

description

intro to econometrics lecture.

Transcript of Lecture 1: Introduction

  • ECONOMETRICS-I Eco-537

    Dr. Adnan Haider Assistant Professor

    Department of Economics and Finance

    IBA, Karachi.

    E-Mail: [email protected]

    Week 1: Introduction

  • Aims and Objectives

    To explain some of the basic methods in econometrics (the application of statistical methods to economic problems)

    To provide an overview of how to carry out and interpret empirical research

    useful for your dissertations & Research Projects

  • What is Econometrics?

    The Student's Perspective: "Econometrics is too mathematical; it's the reason my best friend isn't majoring in economics!"

    The Lecturer's Perspective: Econometrics allows the measurement and analysis of economic phenomena and the prediction of future economic trends

  • Why Study Econometrics?

    Economic theory deals with the question "why?"; econometrics answers "how much?"

    Econometrics: the practice of combining economic theory with data to make statistical inferences and predictions

    How do you decide how much to charge for a good or your professional services?

    How does the Monetary Policy Committee decide how to set interest rates (policy discount rate)?

    Will theory alone provide an answer?

    It's clear that data issues are very important in Econometrics

    You will, sooner or later, be asked to use data in your jobs, in order to support a decision

  • What is Econometrics?

    Historical Context

    Econometricians wear many different hats

    They are often criticised for "...using sledgehammers to crack open peanuts while turning a blind eye to data deficiencies and the many questionable assumptions required for the successful application of these techniques" (Kennedy, 1998: p. 2)

    Econometrics as Art or Science?

  • Economic Decisions

    To use information effectively:

    economic theory + economic data => economic decisions

    *Econometrics* helps us combine economic theory and economic data.

  • Economic Data

    Economic data are incomplete and far from perfect => uncertainty!

    Economic variables should, in general, be treated as random variables, since we have imperfect knowledge of the actual data generating mechanism

    On the other hand, we usually work with samples, not the entire population => we infer population features by analysing samples: statistical inference is a tool for drawing conclusions from limited sets of information, i.e., quantifying uncertainty

    Different samples will lead to different results => we need to account for sampling variability
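
    A minimal sketch of sampling variability, not part of the original slides. The population (10,000 simulated "incomes"), the sample size of 100 and the number of samples are all hypothetical choices made purely for illustration: each random sample gives a slightly different estimate of the same population mean.

```python
import random
import statistics

random.seed(0)  # fixed seed so the illustration is reproducible

# Hypothetical population: 10,000 household incomes (arbitrary units).
population = [random.gauss(50, 10) for _ in range(10_000)]
population_mean = statistics.mean(population)

# Draw several independent random samples of 100 households each.
sample_means = [
    statistics.mean(random.sample(population, 100)) for _ in range(5)
]

print(f"population mean: {population_mean:.2f}")
for i, m in enumerate(sample_means, start=1):
    print(f"sample {i} mean:   {m:.2f}")

# The spread of the sample means is one measure of sampling variability.
spread = statistics.stdev(sample_means)
print(f"std. dev. of the sample means: {spread:.2f}")
```

    Every sample mean is close to, but not equal to, the population mean; statistical inference is about quantifying how close.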

  • Types of Economic Data

    Cross-sectional, panel/longitudinal, time series

    Each type is suited to a different purpose and will have associated problems

    Cross-sectional data

    Consists of a sample of individuals, households, firms, regions, countries or other units taken at a given point in time

    Often the information comes from government surveys, e.g. LFS or PSLM in Pakistan

    These data have to be obtained by random sampling from the underlying population (apart from the Census)

    Random sampling often suffers from the problem of non-response => can create biases

  • Problems could also occur if the sampled units are large relative to the population, e.g. geographical areas

    Survey data is also widely analysed in other social sciences e.g. sociology and geography

    Used to test micro-economic hypotheses and evaluate economic policies

    Mainly used in the fields of financial economics, labour economics, public economics, industrial economics and health economics

  • Time series data

    Consists of observations on a variable or series of variables over a period of time

    Chronological ordering of observations conveys potentially important information

    More difficult to analyse than cross-sectional data because economic observations are rarely independent over time

    Most time series are related to their recent histories

  • Modifications to standard econometric techniques have been developed to account for and exploit the dependent nature of economic time series and address other issues, e.g. variables have trends

    The frequency of the data is also important

    The most common frequencies are daily, weekly, monthly, quarterly and annual

    Many weekly, monthly and quarterly time series display a strong seasonal pattern

    Mainly used to analyse macroeconomic issues and test macroeconomic theories
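
    A rough illustration of why time series observations are "rarely independent over time". The two simulated series and the persistence value 0.8 are assumptions for this sketch, not data from the lecture: one series is pure noise, the other depends on its own recent history, and the lag-1 autocorrelation tells them apart.

```python
import random

random.seed(1)  # fixed seed so the illustration is reproducible


def lag1_autocorr(series):
    """Sample correlation between a series and its own one-period lag."""
    n = len(series)
    mean = sum(series) / n
    num = sum((series[t] - mean) * (series[t - 1] - mean) for t in range(1, n))
    den = sum((v - mean) ** 2 for v in series)
    return num / den


# Independent observations: each value is a fresh random shock.
shocks = [random.gauss(0, 1) for _ in range(2000)]
independent = shocks

# Persistent series: today's value is 0.8 times yesterday's plus a shock.
persistent = [0.0]
for e in shocks[1:]:
    persistent.append(0.8 * persistent[-1] + e)

print(f"lag-1 autocorrelation, independent: {lag1_autocorr(independent):.2f}")
print(f"lag-1 autocorrelation, persistent:  {lag1_autocorr(persistent):.2f}")
```

    The first number is close to zero and the second is close to the assumed persistence, which is the kind of dependence that standard cross-sectional techniques ignore.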

  • Panel or Longitudinal Data

    Consists of a time series for each cross-sectional unit in the data set

    Quite difficult to collect this type of data, as it requires information on the same units over time

    Can be obtained either from surveys or collected over time for regions, industries etc

    But biases arise if the household/individual drops out of the sample => attrition

  • The main advantage is the ability to control for unobserved characteristics of individuals/firms

    Also contains the best features of cross section and time series

    Disaggregated data e.g. individuals/firms

    Allows for the inclusion of dynamics

    But some of the econometric techniques needed to analyse panel data are quite complex => covered in Econometrics

    Pooled cross sections are similar but do not have repeated observations on the same units
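
    A small sketch of what "controlling for unobserved characteristics" can look like in practice. The toy panel below is entirely hypothetical (3 units, 3 periods, no noise, a true slope of 2, and a unit-specific effect that is correlated with x); the demeaning step is one common approach, often called the within transformation, and is not taken from the slides.

```python
def slope(x, y):
    """Bivariate least-squares slope of y on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def demean_within_units(rows, column):
    """Subtract each unit's own time average from its observations."""
    out = []
    for unit in sorted({r[0] for r in rows}):
        values = [r[column] for r in rows if r[0] == unit]
        mean = sum(values) / len(values)
        out.extend(v - mean for v in values)
    return out


# Rows are (unit, period, x, y) with y = unit effect + 2 * x.
# The unobserved unit effect (5 * unit) rises with x across units.
panel = [
    (i, t, 10 * i + t, 5 * i + 2 * (10 * i + t))
    for i in range(3)
    for t in range(3)
]

# Ignoring the panel structure mixes the unit effect into the slope.
pooled_slope = slope([r[2] for r in panel], [r[3] for r in panel])

# Using only variation within each unit removes the fixed unit effect.
within_slope = slope(demean_within_units(panel, 2), demean_within_units(panel, 3))

print(f"pooled slope: {pooled_slope:.3f}")
print(f"within slope: {within_slope:.3f}")
```

    Because the same units are observed repeatedly, the time-invariant unit effect drops out after demeaning; a pooled cross section could not do this.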

  • The Modelling Process

    (1) Statement of theory/hypothesis

    (2) Specification of mathematical model

    (3) Specification of the econometric model

    (4) Obtaining the data / conduct preliminary data analysis

    (5) Estimation of the econometric model and interpretation of regression results

    (6) Diagnostic Analysis

    (7) Hypothesis testing

    (8) Prediction/forecasting

    8 STAGE PROCESS

  • EXAMPLE: 1

    Degree Performance in Economics

    A number of factors determine performance: ability, family background, effort

    Let us look at this from the perspective of ability only and analyse this using a simple bivariate (2 variable) model

  • Student Performance

    STAGE 1- Statement of Theory /Hypothesis

    Student Performance Function:

    Student degree performance is determined by ability

  • STAGE 2 - Mathematical Model

    Performance, P, is some function of ability, A:

    $P = f(A)$ (1)

    In linear form:

    $Y = \beta_1 + \beta_2 X$ (2)

    where Y = performance and X = ability

  • STAGE 3 - Econometric Model

    $Y = \beta_1 + \beta_2 X + U$ (3)

    Y = Performance - the dependent variable

    X = Ability - explanatory variable

    U = Disturbance (random error) term

    For this particular example we will collect data on year 2 average and final year average

  • STAGE 4a - Obtaining the data

    Observed values of Y (yr 3 average) and X (yr 2 average)

    STAGE 4b - Preliminary Data Analysis

    Descriptive statistics, graphical charts (initial identification of possible errors: outliers, influential observations and lurking variables)

  • Year 3 Average Against Year 2 Average

    [Figure: scatter plot of Year 3 Average (vertical axis, 0 to 80) against Year 2 Average (horizontal axis, 10 to 90)]

    What can we say about the relationship between year 3 average and year 2 average? Subjective judgement

  • Student Performance

    STAGE 5a - Estimation of the Parameters

    Y and X are the variables - known

    $\beta_1$, $\beta_2$ are the parameters and U the disturbance - both unknown

    Estimators versus estimates

    The least squares (OLS) regression line is the line that minimises the sum of squared deviations of the data points from the line
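
    The least squares idea can be sketched in a few lines. The marks below are made-up numbers used only for illustration (they are not the 216 observations behind the lecture's results), and the function names are hypothetical; the formulas are the standard closed-form bivariate OLS estimates.

```python
def ols_bivariate(x, y):
    """Return (beta1_hat, beta2_hat) for the line y = beta1 + beta2 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    beta2 = s_xy / s_xx             # slope
    beta1 = y_bar - beta2 * x_bar   # intercept
    return beta1, beta2


def sum_squared_residuals(x, y, beta1, beta2):
    """Sum of squared vertical deviations of the points from the line."""
    return sum((yi - (beta1 + beta2 * xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical year 2 and year 3 averages for six students.
year2 = [48.0, 52.0, 55.0, 60.0, 63.0, 68.0]
year3 = [55.0, 58.0, 57.0, 62.0, 65.0, 66.0]

b1, b2 = ols_bivariate(year2, year3)
best = sum_squared_residuals(year2, year3, b1, b2)
print(f"intercept = {b1:.3f}, slope = {b2:.3f}, SSR = {best:.3f}")

# Any other line gives a larger sum of squared deviations.
print(sum_squared_residuals(year2, year3, b1 + 0.5, b2) > best)
print(sum_squared_residuals(year2, year3, b1, b2 + 0.01) > best)
```

    The formulas are the estimators; the numbers printed for this particular sample are the estimates.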

  • Year 3 Average Against Year 2 Average

    [Figure: plot of Year 3 Average (vertical axis, 0 to 80) against Year 2 Average (horizontal axis, 10 to 90)]

  • STAGE 5b - Interpreting the regression results

    Check the coefficient sign against expectations (hypothesis) (e.g. do we expect year 2 average to be positively related to year 3 average?)

    Check coefficient magnitude against expectations (if any) (e.g. are there any prior expectations on the possible magnitude of the effect of year 2 average?)

  • STAGE 6 - Diagnostic Analysis

    Is the model correctly specified?

    - Correct functional form

    - Omitted variables (or unnecessary ones)

    - Is the regression spurious?

    Has the model got good diagnostic properties? (validity of the probability distribution of the disturbance term)

    - Is the disturbance term uncorrelated with the regressors?

    - Are the values of the disturbance term independently and normally distributed with mean zero and variance $\sigma^2$?

  • STAGE 7 - Hypothesis Testing

    Are the estimates statistically significant? Do they conform with economic theory?

    STAGE 8 - Forecasting/Prediction

    For example, predicting the level of performance for a particular ability level.

    Possible policy implications?

  • Excel Regression Output: Bivariate Case

    SUMMARY OUTPUT

    Regression Statistics
    Multiple R           0.599443   (correlation coefficient)
    R Square             0.359332   (coefficient of determination)
    Adjusted R Square    0.356338
    Standard Error       4.617821
    Observations         216

    36% of the variation in year 3 is explained by variation in year 2

    ANOVA
                  df    SS         MS         F          Significance F
    Regression    1     2559.477   2559.477   120.0265   1.84E-22
    Residual      214   4563.393   21.32427
    Total         215   7122.87

                  Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%   Lower 95.0%   Upper 95.0%
    Intercept     29.4029        2.799513         10.50286   4.45E-21   23.88475    34.92106    23.88475      34.92106
    Y2AV          0.538088       0.049115         10.95566   1.84E-22   0.441277    0.6349      0.441277      0.6349

    The Coefficients column gives the estimates of $\beta_1$ (Intercept) and $\beta_2$ (Y2AV); the t Stat and P-value columns give the t statistic and its p-value
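
    As a cross-check, the headline statistics in the Excel output can be recomputed from the ANOVA sums of squares and the Y2AV coefficient row. This is only a sketch using the standard textbook formulas; the variable names are hypothetical and the inputs are copied from the output above.

```python
import math

# Values copied from the Excel output.
n, k = 216, 2                      # observations, estimated parameters
ss_reg, ss_res, ss_tot = 2559.477, 4563.393, 7122.87
coef_y2av, se_y2av = 0.538088, 0.049115

r_squared = ss_reg / ss_tot                        # coefficient of determination
multiple_r = math.sqrt(r_squared)                  # correlation coefficient
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k)
std_error = math.sqrt(ss_res / (n - k))            # standard error of regression
f_stat = (ss_reg / (k - 1)) / (ss_res / (n - k))   # MS regression / MS residual
t_stat = coef_y2av / se_y2av                       # t statistic for the slope

print(f"R Square          {r_squared:.6f}")
print(f"Multiple R        {multiple_r:.6f}")
print(f"Adjusted R Square {adj_r_squared:.6f}")
print(f"Standard Error    {std_error:.6f}")
print(f"F                 {f_stat:.4f}")
print(f"t Stat (Y2AV)     {t_stat:.4f}")
```

    In the bivariate case the F statistic equals the square of the slope's t statistic (up to rounding of the reported inputs), which is why both carry the same p-value of 1.84E-22.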

  • MicroFit Regression Output: Bivariate Case Ordinary Least Squares Estimation

    ******************************************************************************

    Dependent variable is Y3AV

    216 observations used for estimation from 1 to 216

    ******************************************************************************

    Regressor Coefficient Standard Error T-Ratio[Prob]

    CONSTANT 29.4029 2.7995 10.5029[.000]

    Y2AV .53809 .049115 10.9557[.000]

    ******************************************************************************

    R-Squared .35933 R-Bar-Squared .35634

    S.E. of Regression 4.6178 F-stat. F( 1, 214) 120.0265[.000]

    Mean of Dependent Variable 59.8796 S.D. of Dependent Variable 5.7558

    Residual Sum of Squares 4563.4 Equation Log-likelihood -635.9494

    Akaike Info. Criterion -637.9494 Schwarz Bayesian Criterion -641.3247

    DW-statistic 1.6939

    ******************************************************************************

    Diagnostic Tests

    ******************************************************************************

    * Test Statistics * LM Version * F Version

    ******************************************************************************

    * * *

    * A:Serial Correlation*CHSQ( 1)= 4.8159[.028]*F( 1, 213)= 4.8573[.029]

    * * *

    * B:Functional Form *CHSQ( 1)= .0080270[.929]*F( 1, 213)= .0079158[.929]

    * * *

    * C:Normality *CHSQ( 2)= 1394.7[.000]* Not applicable

    * * *

    * D:Heteroscedasticity*CHSQ( 1)= 3.1753[.075]*F( 1, 214)= 3.1928[.075]

    ******************************************************************************

    A:Lagrange multiplier test of residual serial correlation

    B:Ramsey's RESET test using the square of the fitted values

    C:Based on a test of skewness and kurtosis of residuals

    D:Based on the regression of squared residuals on squared fitted values

    Upper block: basic regression (similar to the results reported in Excel)

    Lower block: regression diagnostics
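
    A short sketch for reading the MicroFit output. Two things here are assumptions rather than statements from the slides: the information criteria are taken to be defined as log-likelihood minus a penalty (a convention chosen because it reproduces the reported numbers), and a 5% significance level is used for the diagnostic tests.

```python
import math

# Values copied from the MicroFit output.
log_likelihood = -635.9494
n, k = 216, 2                    # observations, estimated parameters

# Assumed convention: larger values of these criteria are preferred.
aic = log_likelihood - k
sbc = log_likelihood - 0.5 * k * math.log(n)
print(f"Akaike Info. Criterion     {aic:.4f}")
print(f"Schwarz Bayesian Criterion {sbc:.4f}")

# Reported p-values (LM version) of the four diagnostic tests.
# [.000] in the output means the p-value rounds to zero at three decimals.
diagnostics = {
    "serial correlation": 0.028,
    "functional form": 0.929,
    "normality": 0.000,
    "heteroscedasticity": 0.075,
}

alpha = 0.05  # assumed significance level
rejected = [name for name, p in diagnostics.items() if p < alpha]
for name, p in diagnostics.items():
    verdict = "problem indicated" if p < alpha else "no evidence of a problem"
    print(f"{name:20s} p = {p:.3f} -> {verdict}")
```

    On these assumptions the tests flag residual serial correlation and non-normality, while functional form and heteroscedasticity pass at the 5% level.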

  • Example: 2 Keynesian theory of consumption

    Statement of economic theory or hypothesis e.g. Keynesian consumption function

    C = f(Y), with the marginal propensity to consume (MPC) satisfying 0 < MPC < 1

  • Econometric Methodology: Keynesian Consumption Theory

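    [Figure: consumption function plotted with Y on the vertical axis and X on the horizontal axis; the slope of the line equals the MPC]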

  • Econometric Methodology: Keynesian Consumption Theory

    Specification of econometric model

    C = a + bY + u

    u is a disturbance or error term

    it is a random variable with well-defined properties

    this model is probabilistic or stochastic

    Obtaining data

    involves making further assumptions e.g. which measures of C and Y to use and whether the variables should be in real or nominal terms

    a plot of the data shows that usually no exact relationship holds between the variables

  • Econometric Methodology: Keynesian Consumption Theory

    Data on C (personal consumption expenditure) and Y (Gross Domestic Product), 1980-1991, in billions of 1987 US$

    Year C Y

    1980 2447.1 3776.3

    1981 2476.9 3843.1

    1982 2503.7 3760.3

    1983 2619.4 3906.6

    1984 2746.1 4148.5

    1985 2865.8 4279.8

    1986 2969.1 4404.5

    1987 3052.2 4539.9

    1988 3162.4 4718.6

    1989 3223.3 4838.0

    1990 3260.4 4877.5

    1991 3240.8 4821.0

    Source: Gujarati, p. 6. Reproduced from Economic Report of the President, 1993, Table B-2, p. 350.

  • Econometric Methodology: Keynesian Consumption Theory

    Estimation of parameters of the model

    Basically, we will study how to draw a line through a set of points

    But our set of points is just a sample, and we want to know a and b in the population

    Usually these estimates are denoted using hats, i.e. $\hat{a}$ and $\hat{b}$

    Quantifying uncertainty: use statistical theory to assign a standard error to the parameter estimates and interpret those estimates as random variables with a probability distribution. Then we can construct confidence intervals for a and b

    Hypothesis testing

    Having estimated a and b we can test statistical hypotheses about them, e.g. is the MPC different from unity?
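
    The steps on this slide can be sketched with the 1980-1991 data on C and Y from the table in these slides. This is an illustration under textbook assumptions (classical bivariate OLS formulas, a two-sided 5% test, and the tabulated critical value 2.228 for 10 degrees of freedom), not the lecturer's own computation; the variable names are hypothetical.

```python
import math

# Data from the table of C and Y, 1980-1991 (billions of 1987 US$).
income = [3776.3, 3843.1, 3760.3, 3906.6, 4148.5, 4279.8,
          4404.5, 4539.9, 4718.6, 4838.0, 4877.5, 4821.0]
consumption = [2447.1, 2476.9, 2503.7, 2619.4, 2746.1, 2865.8,
               2969.1, 3052.2, 3162.4, 3223.3, 3260.4, 3240.8]

n = len(income)
y_bar = sum(income) / n
c_bar = sum(consumption) / n
s_yy = sum((y - y_bar) ** 2 for y in income)
s_yc = sum((y - y_bar) * (c - c_bar) for y, c in zip(income, consumption))

# Point estimates: b_hat is the estimated MPC, a_hat the intercept.
b_hat = s_yc / s_yy
a_hat = c_bar - b_hat * y_bar

# Quantifying uncertainty: residual variance and standard error of b_hat.
residuals = [c - (a_hat + b_hat * y) for y, c in zip(income, consumption)]
sigma2_hat = sum(e ** 2 for e in residuals) / (n - 2)
se_b = math.sqrt(sigma2_hat / s_yy)

# 95% confidence interval for b, using the t critical value for 10 d.f.
t_crit = 2.228
ci_b = (b_hat - t_crit * se_b, b_hat + t_crit * se_b)

# Hypothesis test: is the MPC different from unity?
t_unity = (b_hat - 1.0) / se_b

print(f"a_hat = {a_hat:.1f}, b_hat = {b_hat:.4f}, se(b_hat) = {se_b:.4f}")
print(f"95% CI for b: ({ci_b[0]:.4f}, {ci_b[1]:.4f})")
print(f"t statistic for H0: b = 1: {t_unity:.2f}")
```

    The slope estimate is about 0.72, the confidence interval lies entirely below one, and the t statistic is far beyond the critical value, so on these assumptions the hypothesis that the MPC equals unity would be rejected.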

  • Econometric Methodology: Keynesian Consumption Theory

    Forecasting and Prediction

    Estimation of a and b also allows us to use the model for prediction or the forecasting of C for a given value of Y, given the equation: $\hat{C} = \hat{a} + \hat{b}Y$

    Using the model for policy analysis

    The model can be used to answer policy questions, e.g. what is the effect on C of cutting taxes (and thereby raising disposable income) by 5%?

  • Econometric Methodology: Keynesian Consumption Theory

    Estimation of the model

    regression analysis

    OLS estimates (details next lecture)

    On average, a US$1 increase in real income led to an increase of about US72c in consumption expenditure

    Hypothesis testing

    theory: 0 < MPC < 1

  • Econometric Methodology: Keynesian Consumption Theory

    Can also work out the Income Multiplier (M)

    $M = \frac{1}{1 - MPC}$

    M = 1/(1 - 0.72) = 3.57

    Using the model for control or policy purposes

    Govt believes expenditure of US$4000 will lead to unchanged unemployment

    What level of income leads to the target consumption expenditure?

    Control variable X; target variable Y

    $4000 = -231.8 + 0.7194X \Rightarrow X \approx 5882$
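
    The arithmetic on this slide can be reproduced directly. The sketch below takes the fitted line to have intercept -231.8 and slope 0.7194, as read from the slide's equation, and uses the rounded MPC of 0.72 for the multiplier as the slide does; the variable names are hypothetical.

```python
# Fitted consumption function, as read from the slide.
a_hat = -231.8    # intercept
b_hat = 0.7194    # slope, i.e. the estimated MPC

# Income multiplier, using the MPC rounded to 0.72 as on the slide.
mpc = 0.72
multiplier = 1 / (1 - mpc)
print(f"income multiplier M = {multiplier:.2f}")

# Control problem: which income X delivers the target expenditure?
# Solve target = a_hat + b_hat * X for X.
target_consumption = 4000.0
required_income = (target_consumption - a_hat) / b_hat
print(f"income needed for the target: {required_income:.0f}")

# Prediction runs the same equation the other way round.
predicted = a_hat + b_hat * required_income
print(f"predicted consumption at that income: {predicted:.1f}")
```

    Prediction plugs a value of X into the fitted line, while control inverts the line to find the X that achieves a chosen target.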

  • The Practice of Econometrics

    Economic theory

    Econometric model

    Data

    Estimation

    Specification testing and diagnostic testing

    Is the model adequate? No Yes

    Hypothesis testing

    Policy: prediction and forecasting

  • Econometric Analysis

    [Flow chart] Theory, Model, Econometric Model; Facts, Data, Refined Data; Statistical Theory, Econometric Techniques

    Estimation of Econometric Model with the Refined Data Using Econometric Techniques

    Structural Analysis; Forecasting; Policy Evaluation

  • Summary

    Three stages of research

    Specification of model: relevant variables, mathematical form, signs and magnitudes of parameters, error terms

    Estimation: data requirements (time series, cross section, panel), level of aggregation (households, regional, national), estimation techniques (OLS, etc.)

    Model evaluation: a priori beliefs (signs and magnitudes etc.), significance of coefficients, degree of fit within sample, forecasting ability beyond sample, nature of residuals