Data organization. Regression Models Time series Cross-sectional Panel Multi-dimensional panel.

Post on 31-Mar-2015


Data organization

Year   Sales
2005   $10,200
2006   $10,900
2007   $11,000
2008   $8,500
2009   $10,400

Time Series

Location   Sales
Virginia   $10,400
Florida    $10,300
Colorado   $8,300
Maine      $10,200

Cross-Sectional

Year   Location   Sales
2005   Virginia   $9,000
2005   Florida    $9,500
2005   Colorado   $9,200
2005   Maine      $8,800
2006   Virginia   $9,200
2006   Florida    $10,500
2006   Colorado   $10,700
2006   Maine      $9,300
2007   Virginia   $8,700
2007   Florida    $8,900
2007   Colorado   $11,000
2007   Maine      $9,700
2008   Virginia   $8,000
2008   Florida    $8,400
2008   Colorado   $9,300
2008   Maine      $9,000
2009   Virginia   $8,000
2009   Florida    $9,700
2009   Colorado   $8,500
2009   Maine      $9,100

Panel

Year   Location   Holiday     Sales
2005   Virginia   Christmas   $9,200
2005   Virginia   July 4      $8,400
2005   Virginia   Labor Day   $8,900
2005   Florida    Christmas   $9,100
2005   Florida    July 4      $8,400
2005   Florida    Labor Day   $10,500
2005   Colorado   Christmas   $10,300
2005   Colorado   July 4      $9,400
2005   Colorado   Labor Day   $10,900
2005   Maine      Christmas   $8,900
2005   Maine      July 4      $9,100
2005   Maine      Labor Day   $8,700
2006   Virginia   Christmas   $8,200
2006   Virginia   July 4      $8,900
2006   Virginia   Labor Day   $8,900
2006   Florida    Christmas   $10,300
2006   Florida    July 4      $11,000
2006   Florida    Labor Day   $8,500
2006   Colorado   Christmas   $8,100
2006   Colorado   July 4      $9,200
2006   Colorado   Labor Day   $10,200
2006   Maine      Christmas   $10,200
2006   Maine      July 4      $8,100
2006   Maine      Labor Day   $8,600
2007   Virginia   Christmas   $9,600
2007   Virginia   July 4      $10,400
2007   Virginia   Labor Day   $10,800
2007   Florida    Christmas   $10,300
2007   Florida    July 4      $9,100
2007   Florida    Labor Day   $10,900
2007   Colorado   Christmas   $10,800
2007   Colorado   July 4      $9,600
2007   Colorado   Labor Day   $10,200
2007   Maine      Christmas   $10,400
2007   Maine      July 4      $9,600
2007   Maine      Labor Day   $11,000
2008   Virginia   Christmas   $8,200
2008   Virginia   July 4      $9,800
2008   Virginia   Labor Day   $8,900
2008   Florida    Christmas   $9,200
2008   Florida    July 4      $10,400
2008   Florida    Labor Day   $9,000
2008   Colorado   Christmas   $10,700
2008   Colorado   July 4      $9,600
2008   Colorado   Labor Day   $8,600
2008   Maine      Christmas   $8,100
2008   Maine      July 4      $8,600
2008   Maine      Labor Day   $8,000
2009   Virginia   Christmas   $9,800
2009   Virginia   July 4      $8,800
2009   Virginia   Labor Day   $10,400
2009   Florida    Christmas   $10,700
2009   Florida    July 4      $8,300
2009   Florida    Labor Day   $9,600
2009   Colorado   Christmas   $9,100
2009   Colorado   July 4      $8,300
2009   Colorado   Labor Day   $9,600
2009   Maine      Christmas   $10,200
2009   Maine      July 4      $9,600
2009   Maine      Labor Day   $8,200

Multi-Dimensional Panel
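
As a rough illustration of how these layouts relate, the sketch below stores a few rows of the Year / Location / Sales panel as records and slices them two ways: fixing a location gives a time series, and fixing a year gives a cross-section. The variable and function names are illustrative assumptions, not part of the original slides.

```python
# A few rows of the Year / Location / Sales panel, stored in long format.
panel = [
    (2005, "Virginia", 9000), (2005, "Florida", 9500),
    (2006, "Virginia", 9200), (2006, "Florida", 10500),
    (2007, "Virginia", 8700), (2007, "Florida", 8900),
]


def time_series(rows, location):
    """Fix one cross-section and let time vary."""
    return [(year, sales) for year, loc, sales in rows if loc == location]


def cross_section(rows, year):
    """Fix one period and let the cross-section vary."""
    return [(loc, sales) for yr, loc, sales in rows if yr == year]


virginia = time_series(panel, "Virginia")
in_2006 = cross_section(panel, 2006)
print(virginia)
print(in_2006)
```

A multi-dimensional panel extends the same idea by adding further key fields (for example, holiday) to each record.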

Regression Models

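• Time series: \( y_t = \alpha + \beta x_t + u_t \)

• Cross-sectional: \( y_i = \alpha + \beta x_i + u_i \)

• Panel: \( y_{i,t} = \alpha + \beta x_{i,t} + u_{i,t} \)

• Multi-dimensional panel: \( y_{i,s,t} = \alpha + \beta x_{i,s,t} + u_{i,s,t} \)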
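
A minimal sketch of the time-series case, assuming the model has one intercept and one slope and is fitted by ordinary least squares to the Year / Sales figures from the time-series table. The helper name `ols_fit` is hypothetical.

```python
def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


years = [2005, 2006, 2007, 2008, 2009]
sales = [10200, 10900, 11000, 8500, 10400]

intercept, slope = ols_fit(years, sales)
# Residuals are the estimated errors; their behavior drives the corrections
# discussed in the slides that follow.
residuals = [s - (intercept + slope * yr) for yr, s in zip(years, sales)]
print(slope)
```

The same function applies to the cross-sectional case by replacing the year with a numeric regressor observed for each location.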

Errors in Uni-dimensional Data

In standard time series or cross-sectional data sets, we must adjust for non-independent errors.

Serial correlation: Errors correlated across time.

Spatial correlation: Errors correlated across cross-sections.

Heteroskedasticity: Error variance changes over time or cross-sections.
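
One simple way to look for serial correlation is the sample first-order autocorrelation of the regression residuals. The sketch below is an illustrative diagnostic under that assumption, not a procedure taken from the slides; both residual series are hypothetical.

```python
def lag1_autocorrelation(residuals):
    """Sample first-order autocorrelation of a residual series."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [r - mean for r in residuals]
    num = sum(dev[t] * dev[t - 1] for t in range(1, n))
    den = sum(d * d for d in dev)
    return num / den


# Hypothetical residuals: one slowly drifting series, one that flips sign.
persistent = [1.0, 0.8, 0.6, 0.4, 0.2, 0.0, -0.2, -0.4]
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

r_pos = lag1_autocorrelation(persistent)
r_neg = lag1_autocorrelation(alternating)
print(r_pos, r_neg)
```

Values far from zero in either direction suggest that the errors are not independent across time.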

Errors in Panel Data

Heterogeneous serial correlation: Errors correlated across time and differently for different cross-sections.

Heterogeneous spatial correlation: Errors correlated across cross-sections but differently for different time periods.

Heterogeneous heteroskedasticity: Error variance changes over time, but does so differently for different cross-sections.

Serial-spatial correlation: Past errors from one cross-section are correlated with future errors from a different cross-section.

Generalized Least Squares

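\[
\Omega =
\begin{bmatrix}
\operatorname{var}(u_{t_1}) & \operatorname{cov}(u_{t_1}, u_{t_2}) & \operatorname{cov}(u_{t_1}, u_{t_3}) \\
\operatorname{cov}(u_{t_1}, u_{t_2}) & \operatorname{var}(u_{t_2}) & \operatorname{cov}(u_{t_2}, u_{t_3}) \\
\operatorname{cov}(u_{t_1}, u_{t_3}) & \operatorname{cov}(u_{t_2}, u_{t_3}) & \operatorname{var}(u_{t_3})
\end{bmatrix}
\]

The error covariance matrix \( \Omega \) shows the covariances of error terms across different observations.

For the regression model

\[ y_t = \alpha + \beta x_t + u_t \]

\[ \hat{\beta} = (X' \Omega^{-1} X)^{-1} X' \Omega^{-1} Y \]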

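Ordinary Least Squares Assumptions

\[
\operatorname{cov}(u_t, u_s) =
\begin{cases}
\sigma_u^2 & t = s \\
0 & t \neq s
\end{cases}
\]

so the error covariance matrix is

\[
\Omega =
\begin{bmatrix}
\sigma_u^2 & 0 & 0 \\
0 & \sigma_u^2 & 0 \\
0 & 0 & \sigma_u^2
\end{bmatrix}
\]

For the regression model

\[ y_t = \alpha + \beta x_t + u_t \]

\[ \hat{\beta} = (X' \Omega^{-1} X)^{-1} X' \Omega^{-1} Y \]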

Ordinary Least Squares (Heteroskedasticity)

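\[
\operatorname{cov}(u_t, u_s) =
\begin{cases}
\sigma_{u_t}^2 & t = s \\
0 & t \neq s
\end{cases}
\]

so the error covariance matrix is

\[
\Omega =
\begin{bmatrix}
\sigma_{u_{t_1}}^2 & 0 & 0 \\
0 & \sigma_{u_{t_2}}^2 & 0 \\
0 & 0 & \sigma_{u_{t_3}}^2
\end{bmatrix}
\]

For the regression model

\[ y_t = \alpha + \beta x_t + u_t \]

\[ \hat{\beta} = (X' \Omega^{-1} X)^{-1} X' \Omega^{-1} Y \]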
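
When the error covariance matrix is diagonal, as under heteroskedasticity without serial correlation, the GLS estimator reduces to weighted least squares with each observation weighted by the inverse of its error variance. The sketch below illustrates that special case for a single regressor; the function name, the data, and the variances are hypothetical, and in practice the variances would have to be estimated.

```python
def wls_fit(x, y, variances):
    """GLS with a diagonal error covariance: weight each point by 1/variance."""
    w = [1.0 / v for v in variances]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


x = [1, 2, 3, 4]
y = [3, 5, 7, 19]  # the last point is far off the line y = 1 + 2x

# Equal variances reproduce ordinary least squares.
_, slope_equal = wls_fit(x, y, [1, 1, 1, 1])
# Treating the last observation as very noisy shrinks its influence.
_, slope_weighted = wls_fit(x, y, [1, 1, 1, 100])
print(slope_equal, slope_weighted)
```

Down-weighting the high-variance observation pulls the slope back toward the value implied by the other three points.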

Ordinary Least Squares (Serial Correlation)

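\[ \operatorname{cov}(u_t, u_s) = \rho^{|t-s|} \sigma_u^2 \]

so the error covariance matrix is

\[
\Omega =
\begin{bmatrix}
\sigma_u^2 & \rho \sigma_u^2 & \rho^2 \sigma_u^2 \\
\rho \sigma_u^2 & \sigma_u^2 & \rho \sigma_u^2 \\
\rho^2 \sigma_u^2 & \rho \sigma_u^2 & \sigma_u^2
\end{bmatrix}
\]

For the regression model

\[ y_t = \alpha + \beta x_t + u_t \]

\[ \hat{\beta} = (X' \Omega^{-1} X)^{-1} X' \Omega^{-1} Y \]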
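
A small sketch of how such a covariance matrix can be built, assuming the covariance between errors |t - s| periods apart decays geometrically with a correlation parameter (called `rho` here) and a common variance (`sigma2`). Both parameter names and the numeric values are assumptions chosen for illustration.

```python
def ar1_covariance(n_periods, rho, sigma2):
    """Covariance matrix whose (t, s) element is rho**|t - s| * sigma2."""
    return [
        [rho ** abs(t - s) * sigma2 for s in range(n_periods)]
        for t in range(n_periods)
    ]


omega = ar1_covariance(3, rho=0.5, sigma2=4.0)
for row in omega:
    print(row)
```

Setting `rho` to zero collapses the matrix to the diagonal form used under the ordinary least squares assumptions.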

Two-Dimensional Panel Data: OLS Assumptions

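For the regression model

\[ y_{i,t} = \alpha + \beta x_{i,t} + v_i + \varepsilon_t + u_{i,t} \]

\[ \hat{\beta} = (X' \Omega^{-1} X)^{-1} X' \Omega^{-1} Y \]

\[
\operatorname{cov}(v_i, v_j) =
\begin{cases}
\sigma_v^2 & i = j \\
0 & \text{otherwise}
\end{cases}
\qquad
\operatorname{cov}(\varepsilon_t, \varepsilon_s) =
\begin{cases}
\sigma_\varepsilon^2 & t = s \\
0 & \text{otherwise}
\end{cases}
\]

\[
\text{and} \quad
\operatorname{cov}(u_{i,t}, u_{j,s}) =
\begin{cases}
\sigma_u^2 & i = j,\ t = s \\
0 & \text{otherwise}
\end{cases}
\]

[9 × 9 error covariance matrix: diagonal, with every off-diagonal element equal to zero.]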

Two-Dimensional Panel Data: OLS Assumptions

\[ y_{i,t} = \alpha + \beta x_{i,t} + v_i + \varepsilon_t + u_{i,t} = \alpha + \beta x_{i,t} + \eta_{i,t} \]

[9 × 9 error covariance matrix: each row has three nonzero variance or covariance terms and six zeros.]

Two-Dimensional Panel Data: OLS (homogeneous serial correlation)

\[ y_{i,t} = \alpha + \beta x_{i,t} + v_i + \varepsilon_t + u_{i,t} = \alpha + \beta x_{i,t} + \eta_{i,t} \]

[9 × 9 block-diagonal error covariance matrix: one 3 × 3 block per cross-section, with the terms in block i carrying the subscript i (i = 1, 2, 3) and zeros elsewhere.]

Two-Dimensional Panel Data: OLS (heterogeneous serial correlation)

\[ y_{i,t} = \alpha + \beta x_{i,t} + v_i + \varepsilon_t + u_{i,t} = \alpha + \beta x_{i,t} + \eta_{i,t} \]

[9 × 9 error covariance matrix with no zero entries: the 3 × 3 diagonal blocks carry the subscripts 1, 2, and 3, and the off-diagonal blocks carry the subscript pairs (1,2), (1,3), and (2,3).]

Two-Dimensional Panel Data: OLS (serial-spatial correlation)

\[ y_{i,t} = \alpha + \beta x_{i,t} + v_i + \varepsilon_t + u_{i,t} = \alpha + \beta x_{i,t} + \eta_{i,t} \]

OLS vs. Panel Estimation

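Estimation Procedure      Estimate   Standard Error   Regression R²
OLS                       0.482      0.017            0.37
Cross-Sectional Effects   0.499      0.014            0.46
Time Effects              0.486      0.013            0.48
Both Effects              0.505      0.009            0.67

\[ y_{i,t} = \alpha + \beta x_{i,t} + v_i + \varepsilon_t + u_{i,t} \]

\[ v_i \sim IIN(0, \sigma_v^2), \quad \varepsilon_t \sim IIN(0, \sigma_\varepsilon^2), \quad u_{i,t} \mid i \sim IIN(0, \sigma_u^2), \quad u_{i,t} \mid t \sim IIN(0, \sigma_u^2) \]

\[ N = 35, \quad T = 40, \quad \beta = 0.5 \]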
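
A minimal simulation sketch in the spirit of this comparison, assuming unit variances for every error component and a regressor drawn independently of the effects (the slides do not state these choices). It compares pooled OLS with the estimator that removes both cross-sectional and time effects by two-way demeaning. All names are hypothetical, and the numbers it prints are not expected to match the table.

```python
import random


def simulate_panel(n_units=35, n_periods=40, beta=0.5, seed=0):
    """Draw y = beta*x + v_i + e_t + u_it with independent normal components."""
    rng = random.Random(seed)
    v = [rng.gauss(0, 1) for _ in range(n_units)]    # cross-sectional effects
    e = [rng.gauss(0, 1) for _ in range(n_periods)]  # time effects
    x, y = {}, {}
    for i in range(n_units):
        for t in range(n_periods):
            x[i, t] = rng.gauss(0, 1)
            y[i, t] = beta * x[i, t] + v[i] + e[t] + rng.gauss(0, 1)
    return x, y


def ols_slope(x, y):
    """OLS slope (with intercept) from two dicts sharing the same keys."""
    keys = list(x)
    n = len(keys)
    x_bar = sum(x[k] for k in keys) / n
    y_bar = sum(y[k] for k in keys) / n
    s_xy = sum((x[k] - x_bar) * (y[k] - y_bar) for k in keys)
    s_xx = sum((x[k] - x_bar) ** 2 for k in keys)
    return s_xy / s_xx


def two_way_demean(z, n_units, n_periods):
    """Subtract unit means and period means, then add back the grand mean."""
    unit_mean = [sum(z[i, t] for t in range(n_periods)) / n_periods
                 for i in range(n_units)]
    time_mean = [sum(z[i, t] for i in range(n_units)) / n_units
                 for t in range(n_periods)]
    grand = sum(unit_mean) / n_units
    return {(i, t): z[i, t] - unit_mean[i] - time_mean[t] + grand
            for i in range(n_units) for t in range(n_periods)}


N, T = 35, 40
x, y = simulate_panel(N, T)
b_ols = ols_slope(x, y)
b_both = ols_slope(two_way_demean(x, N, T), two_way_demean(y, N, T))
print(round(b_ols, 3), round(b_both, 3))
```

Because the regressor is independent of the effects in this setup, both estimators target the same slope; removing the effects mainly reduces the noise around it.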

Fixed versus Random Effects

Under the random effects assumption, \( v_i \) and \( \varepsilon_t \) are treated as stochastic.

Under the fixed effects assumption, they are treated as fixed in repeated samples.

\[ y_{i,t} = \alpha + \beta x_{i,t} + v_i + \varepsilon_t + u_{i,t} \]

Random vs. Fixed Effects

Random Effects Assumption

Pro: Estimators are more efficient.

Con: Estimators are inconsistent if any of the three errors is not IIN(0, σ²) across all dimensions.

Fixed Effects Assumption

Pro: Estimators are consistent regardless of \( v_i \) and \( \varepsilon_t \).

Con: Estimators are less efficient.

\[ y_{i,t} = \alpha + \beta x_{i,t} + v_i + \varepsilon_t + u_{i,t} \]

See the Hausman test for endogeneity.
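
For a single coefficient, the Hausman statistic compares the fixed effects and random effects estimates, scaled by the difference of their variances, and is referred to a chi-squared distribution with one degree of freedom. The sketch below assumes that one-coefficient form; the input values are hypothetical and are not taken from the slides.

```python
def hausman_statistic(b_fe, b_re, se_fe, se_re):
    """Hausman statistic for one coefficient.

    Compare the result with a chi-squared(1) critical value
    (about 3.84 at the 5% level).
    """
    var_diff = se_fe ** 2 - se_re ** 2
    if var_diff <= 0:
        raise ValueError("fixed effects variance must exceed random effects variance")
    return (b_fe - b_re) ** 2 / var_diff


# Hypothetical estimates and standard errors.
h = hausman_statistic(b_fe=0.52, b_re=0.58, se_fe=0.010, se_re=0.006)
reject_random_effects = h > 3.84
print(h, reject_random_effects)
```

A large statistic indicates that the two estimators disagree by more than sampling noise allows, which counts against the random effects assumption.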

Random vs. Fixed Cross-Sectional Effects

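Estimation Procedure   Estimate   Standard Error   Regression R²
OLS                    0.595      0.004            0.63
Random Effects         0.588      0.004            0.59
Fixed Effects          0.518      0.009            0.65

\[ y_{i,t} = \alpha + \beta x_{i,t} + v_i + \varepsilon_t + u_{i,t} \]

\[ \varepsilon_t \sim IIN(0, \sigma_\varepsilon^2), \quad u_{i,t} \mid i \sim IIN(0, \sigma_u^2), \quad u_{i,t} \mid t \sim IIN(0, \sigma_u^2) \]

\[ N = 35, \quad T = 40, \quad \beta = 0.5 \]

Test statistic = 22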

Alternatives to Panel Techniques

Separate Regressions

For cross-section 1: \( y_{1,t} = \alpha_1 + \beta_1 x_{1,t} + u_{1,t} \)

For cross-section 2: \( y_{2,t} = \alpha_2 + \beta_2 x_{2,t} + u_{2,t} \)

etc.

Drawbacks

Less efficient estimators due to lost information about cross-sectional error covariance.

Removes the ability to restrict parameter values across cross-sections.

Alternatives to Panel Techniques

Pooled Regression

Run standard OLS on \( y_{i,t} = \alpha + \beta x_{i,t} + u_{i,t} \)

Drawbacks

Less efficient estimators due to lost information about cross-sectional error covariance.

Restricts parameter values to be equal across cross-sections.

Alternatives to Panel Techniques

Pooled Regression with Cross-Sectional Dummies

Run standard OLS on \( y_{i,t} = \alpha_i + \beta x_{i,t} + u_{i,t} \)

Drawbacks

This is the fixed effects panel technique.

If the cross-sectional dummies are IIN, then parameter estimates are less efficient than under the random effects panel technique.
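
The slope from a pooled regression with one dummy per cross-section equals the slope obtained by first subtracting each cross-section's own means from its data and then running OLS. The sketch below uses that equivalence to contrast the plain pooled slope with the dummy-variable slope on a small hypothetical data set in which the two cross-sections share a slope but differ in level.

```python
def ols_slope(x, y):
    """OLS slope (with intercept) for two equal-length lists."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx


def within_slope(groups):
    """Slope with cross-sectional dummies, via within-group demeaning."""
    xs, ys = [], []
    for x, y in groups:
        x_bar = sum(x) / len(x)
        y_bar = sum(y) / len(y)
        xs += [xi - x_bar for xi in x]
        ys += [yi - y_bar for yi in y]
    return ols_slope(xs, ys)


# Hypothetical data: (x, y) per cross-section; both have slope 1,
# but the second cross-section sits 10 units higher.
groups = [([1, 2, 3], [1, 2, 3]), ([4, 5, 6], [14, 15, 16])]

pooled = ols_slope([xi for x, _ in groups for xi in x],
                   [yi for _, y in groups for yi in y])
fixed = within_slope(groups)
print(pooled, fixed)
```

Here the plain pooled slope absorbs the level difference between cross-sections, while the dummy-variable slope recovers the common within-group slope.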

Procedures to use with panel data

Generalized least squares (GLS)

Generalized method of moments (GMM)

OLS with “automated” corrections for serial correlation, etc., is GLS.

Extra stuff

Panel data reveals information that is unattainable with non-panel data.

Three-Dimensional Structure of the ASA-NBER Data Set

Shock Occurrence vs. Shock Impact

These shocks all occur in quarter 6 but impact inflation in different quarters.

These shocks all impact inflation in quarter 9 but occur in different quarters.

Shock Occurrence vs. Shock Impact

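Cumulative shocks: \( \hat{\lambda}_{th} \), a cross-sectional average (weight \( 1/N \), summed over \( i = 1, \dots, N \)) built from the forecasts \( F_{ith} \) and the forecasts one horizon apart.

Cross-sectional shocks: \( \hat{u}_{th} = \hat{\lambda}_{th} - \hat{\lambda}_{t,h-1} \)

Discrete shocks: \( \hat{v}_{th} = \hat{u}_{th} - \hat{u}_{t-1,h-1} \)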

Shock Occurrence vs. Shock Impact

Shock Measure                   Shocks Occur From                                          Shocks Impact Inflation From
Cumulative shocks (λ_th)        Beginning of quarter t – h to the end of quarter t.        Beginning of quarter t – h to the end of quarter t.
Cross-sectional shocks (u_th)   Beginning of quarter t – h to the end of quarter t – h.    Beginning of quarter t – h to the end of quarter t.
Discrete shocks (v_th)          Beginning of quarter t – h to the end of quarter t – h.    Beginning of quarter t to the end of quarter t.