Chapter 13: Limited Dependent Vars. Zongyi ZHANG College of Economics and Business Administration.


Page 1:

Chapter 13: Limited Dependent Vars.

Zongyi ZHANG

College of Economics and Business Administration

Page 2:

1. Linear Probability Model

Page 3:

Introduction

Sometimes the dependent variable is qualitative in nature: it takes on two (or more) mutually exclusive values.

Examples:

Whether or not a person is in the labor force

Union membership

Page 4:

Linear Probability Model

Examine the choice of whether a family owns a house:

Yi = b1 + b2Xi + ui

where

Yi = 1 if the family owns a house

Yi = 0 if the family does not own a house

Xi = family income

Page 5:

Linear Probability Model

We can estimate such a model by OLS. However, the results suffer from several problems.

This is called a linear probability model because E(Yi|Xi) is the conditional probability that the event (owning a house) will occur given Xi (family income).
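As a rough illustration of fitting the model by OLS, the sketch below uses simulated data, not data from these slides. The income range, the "true" probabilities, the seed, and the helper name ols_line are all assumptions made for the example.

```python
import random

def ols_line(x, y):
    """Least-squares fit of y = b1 + b2*x; returns (b1, b2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b2 = sxy / sxx
    b1 = mean_y - b2 * mean_x
    return b1, b2

# Simulated families (hypothetical): income in thousands, and a 0/1
# ownership dummy that is more likely to be 1 at higher incomes.
rng = random.Random(0)
income = [rng.uniform(5, 25) for _ in range(500)]
true_p = [min(max(-0.2 + 0.05 * x, 0.0), 1.0) for x in income]
owns = [1 if rng.random() < p else 0 for p in true_p]

b1, b2 = ols_line(income, owns)

# Each fitted value is read as an estimated probability of owning a house.
fitted = [b1 + b2 * x for x in income]
print("b1 =", round(b1, 3), "b2 =", round(b2, 3))
print("lowest and highest fitted value:", round(min(fitted), 3), round(max(fitted), 3))
```

The slope b2 is read as the change in the probability of owning a house for a one-unit increase in income.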

Page 6:

Derivation

Taking the expected value of the model above:

E(Yi|Xi) = b1 + b2Xi, since E(ui) = 0.

Let Pi = probability that Yi = 1 (the event occurs). Then 1 - Pi is the probability that Yi = 0.

Then, by the definition of a mathematical expectation:

E(Yi|Xi) = 0(1 - Pi) + 1(Pi) = Pi

Page 7:

Derivation

So E(Yi|Xi) = b1 + b2Xi = Pi

So the conditional expectation of Yi can be interpreted as the conditional probability that Yi = 1.

Page 8:

Problems with LPM

The error term is not normally distributed; it follows a binomial probability distribution.

OLS does not require that the error term be normally distributed, but we do assume normality for the purposes of hypothesis testing.

Page 9:

Problems with LPM

However, we can't assume normality for the error term here. ui takes on only two values:

When Yi = 1, ui = 1 - b1 - b2Xi

When Yi = 0, ui = -b1 - b2Xi

So ui is not normally distributed, but follows a binomial distribution.

Note that the OLS point estimates still remain unbiased.

As the sample size n rises, the OLS estimators tend to be normally distributed.

Page 10:

Problems with LPM

The error term is heteroskedastic.

Though E(ui) = 0, the errors are not homoskedastic:

var(ui) = E(Yi|Xi)[1 - E(Yi|Xi)] = Pi(1 - Pi)

The variance is heteroskedastic because the conditional expectation of Y, and hence Pi, depends on the value taken by X.
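The variance formula can be checked directly from the two values the error takes. The sketch below is only a numerical check. The function name error_variance and the probabilities tried are chosen for illustration.

```python
def error_variance(p):
    """Variance of the LPM error u when P(Y = 1) = p.

    u equals 1 - p with probability p      (the case Y = 1)
    u equals   -p  with probability 1 - p  (the case Y = 0)
    """
    mean_u = p * (1 - p) + (1 - p) * (-p)  # works out to zero
    return p * (1 - p - mean_u) ** 2 + (1 - p) * (-p - mean_u) ** 2

# The computed variance matches p*(1 - p) and changes with p,
# so it changes with X whenever p depends on X.
for p in (0.1, 0.3, 0.5, 0.9):
    print(p, round(error_variance(p), 4), round(p * (1 - p), 4))
```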

Page 11:

Problems with LPM

What does this imply?

With heteroskedasticity, the OLS estimators are unbiased but not efficient: they do not have minimum variance.

We can correct for the heteroskedasticity by transforming the data using the weight wi = Pi(1 - Pi), that is, by dividing each observation through by the square root of wi. This eliminates the heteroskedasticity.

Page 12:

Problems with LPM

In practice we don't know the true probability, so we estimate it:

a. Run OLS on the original model.

b. Get the predicted Yi and construct wi = predictedYi*(1 - predictedYi).

c. Run OLS on the transformed data.
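Steps a to c can be sketched as follows. The data are simulated, and the helper names (ols_line, weighted_line) are hypothetical. Dropping observations whose predicted value falls outside (0, 1) is an assumption of this sketch, made because wi would not be positive there. Step c is written as weighted least squares with weights 1/wi, which gives the same estimates as OLS after dividing every term by the square root of wi.

```python
import random

def ols_line(x, y):
    """Ordinary least squares for y = b1 + b2*x; returns (b1, b2)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b2 = sxy / sxx
    return mean_y - b2 * mean_x, b2

def weighted_line(x, y, w):
    """Minimize sum((y_i - b1 - b2*x_i)**2 / w_i); returns (b1, b2)."""
    inv = [1.0 / wi for wi in w]
    total = sum(inv)
    mean_x = sum(k * xi for k, xi in zip(inv, x)) / total
    mean_y = sum(k * yi for k, yi in zip(inv, y)) / total
    sxx = sum(k * (xi - mean_x) ** 2 for k, xi in zip(inv, x))
    sxy = sum(k * (xi - mean_x) * (yi - mean_y) for k, xi, yi in zip(inv, x, y))
    b2 = sxy / sxx
    return mean_y - b2 * mean_x, b2

# Simulated data for illustration only: P(owns = 1) = 0.1 + 0.03 * income.
rng = random.Random(1)
income = [rng.uniform(5, 25) for _ in range(1000)]
owns = [1 if rng.random() < 0.1 + 0.03 * x else 0 for x in income]

# a. OLS on the original model
b1_ols, b2_ols = ols_line(income, owns)

# b. predicted Yi and weights wi = predictedYi * (1 - predictedYi)
y_hat = [b1_ols + b2_ols * x for x in income]
kept = [(x, y, p * (1 - p)) for x, y, p in zip(income, owns, y_hat) if 0 < p < 1]

# c. least squares on the transformed (weighted) data
xs, ys, ws = zip(*kept)
b1_wls, b2_wls = weighted_line(xs, ys, ws)
print("OLS slope:", round(b2_ols, 4), "weighted slope:", round(b2_wls, 4))
```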

Page 13:

Problems with LPM

Probabilities falling outside 0 and 1 are the main problem with the LPM.

Although in theory P(Yi|Xi) falls between 0 and 1, there is no guarantee that the predicted probabilities from the linear model will.

One option is to estimate by OLS and, if any estimated probabilities lie outside these bounds, set them to 0 or 1.
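A small sketch of this fix, using made-up coefficients rather than estimates from any data set. The values of b1 and b2, the incomes, and the helper name clip_probability are all hypothetical.

```python
def clip_probability(p_hat):
    """Force a predicted probability into the interval [0, 1]."""
    return min(max(p_hat, 0.0), 1.0)

b1, b2 = -0.9, 0.1  # hypothetical fitted LPM coefficients
incomes = [6, 8, 12, 16, 20, 22]

raw = [b1 + b2 * x for x in incomes]          # may fall below 0 or above 1
clipped = [clip_probability(p) for p in raw]  # set to 0 or 1 when out of bounds

for x, r, c in zip(incomes, raw, clipped):
    print(x, round(r, 2), round(c, 2))
```

With these numbers, the lowest incomes give negative predictions and the highest give predictions above 1, so both ends are moved to the bounds.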

Page 14:

Problems with LPM

Alternatively, use a probit or logit model, which guarantees that the estimated probabilities fall between these limits.

Graph

Page 15:

Problems with LPM

The LPM assumes that probabilities increase linearly with the explanatory variables: each unit increase in an X has the same effect on the probability of Y occurring, regardless of the level of X.

It is more realistic to assume a smaller effect at high probability levels.

Probit and logit make this assumption.
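To make the contrast concrete, the sketch below compares the effect of a one-unit change in X on the probability under the three models. The coefficients are hypothetical and are reused across models only for illustration. The slope expressions b2*P(1 - P) for logit and b2*f(Z) for probit, with f the standard normal density, are the standard derivatives of the two CDFs.

```python
import math
from statistics import NormalDist

b1, b2 = -4.0, 0.4  # hypothetical coefficients, same values used for each model

def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))

def lpm_slope(x):
    """LPM: the effect of X on the probability is b2 at every X."""
    return b2

def logit_slope(x):
    """Logit: dP/dX = b2 * P * (1 - P)."""
    p = logistic_cdf(b1 + b2 * x)
    return b2 * p * (1 - p)

def probit_slope(x):
    """Probit: dP/dX = b2 * (standard normal density at Z)."""
    return b2 * NormalDist().pdf(b1 + b2 * x)

for x in (10, 15, 20):
    print(x, lpm_slope(x), round(logit_slope(x), 4), round(probit_slope(x), 4))
```

At x = 10 the index is zero and the probability is 0.5, where the logit and probit slopes are largest. They shrink as the probability moves toward 1.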

Page 16:

2. CDF

Page 17:

Introduction

Probit and logit have an S-shaped probability function.

As X increases, the probability of Y increases, but never steps outside the 0-1 interval.

The relationship between the probability of Y and X is nonlinear.

It approaches zero at slower and slower rates as X gets small.

Page 18:

Introduction

It approaches one at slower and slower rates as X gets large.

The S-shaped curve can be modeled by a cumulative distribution function (CDF). The CDF of a random variable X is F(x) = P(X ≤ x).

The CDF measures the probability that X takes a value less than or equal to a given x.

Page 19:

Introduction

Graph of F(X) vs X

The CDFs most commonly chosen are:

The logistic function (logit)

The cumulative normal (probit)

Logit and probit are different models with different interpretations. The logistic distribution has flatter tails: it approaches the axes more slowly than the normal.
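The difference in the tails can be seen by tabulating the two CDFs side by side. This sketch uses the standard logistic and standard normal forms, which is an assumption of the example, and the grid of z values is arbitrary.

```python
import math
from statistics import NormalDist

def logistic_cdf(z):
    """Standard logistic CDF."""
    return 1.0 / (1.0 + math.exp(-z))

normal_cdf = NormalDist().cdf  # standard normal CDF

for z in (-4, -3, -2, -1, 0, 1, 2, 3, 4):
    print(f"{z:+d}  logistic={logistic_cdf(z):.4f}  normal={normal_cdf(z):.4f}")
```

Both curves pass through 0.5 at z = 0, but at z = -3 or z = 3 the logistic curve is still noticeably further from 0 or 1 than the normal curve.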

Page 20:

3. Probit

Page 21:

Introduction

Suppose the decision to join a union depends on some unobserved index Zi, "the propensity to join", for each individual.

We don't observe the "propensity to join"; we just observe whether or not the worker is in a union.

So we only observe the dummy variable D,

Page 22:

Introduction

Defined as:

D = 0 if a worker is nonunion

D = 1 if a worker is a union member

Behind this "observed" dummy variable is the "unobserved" index.

Assume Z depends on explanatory variables such as the wage, so

Zi = b1 + b2Xi

where Xi is the wage of the i'th individual.

Page 23:

Introduction

Each individual's Z index can be expressed as a function of an intercept term and the wage with its attached coefficient.

In reality there are many X's, not just the wage.

Suppose there is a critical or threshold level of Z, denoted Zi*. If Zi > Zi*, the individual will join; otherwise the individual will not.

Page 24:

Introduction

Assume Zi* is distributed normally with the same mean and variance as Zi.

What is the probability that Zi > Zi*? In other words, what is the probability that this individual will join?

$$P_i = \Pr(D = 1) = F(Z_i) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{Z_i} e^{-s^2/2} \, ds$$
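As a check on what this probability means, the sketch below approximates the integral of the standard normal density up to Zi by a simple midpoint rule. The coefficients and wage are hypothetical, the function name probit_probability is made up for the example, and the lower limit of -10 is an assumption standing in for minus infinity.

```python
import math
from statistics import NormalDist

def probit_probability(z, lower=-10.0, steps=20000):
    """Midpoint-rule approximation of the area under the standard
    normal density between `lower` and z."""
    width = (z - lower) / steps
    total = 0.0
    for k in range(steps):
        s = lower + (k + 0.5) * width
        total += math.exp(-s * s / 2.0)
    return total * width / math.sqrt(2.0 * math.pi)

b1, b2 = -2.0, 0.1  # hypothetical index coefficients
wage = 25.0
z = b1 + b2 * wage  # the index Zi for this individual

print("Zi =", z)
print("numerical integral  :", round(probit_probability(z), 6))
print("library normal CDF  :", round(NormalDist().cdf(z), 6))
```

The two printed probabilities should agree closely, since the library CDF evaluates the same area.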

Page 25:

Introduction

Pi, the probability of joining, is measured by the area under the standard normal curve from -∞ to Zi.

Individuals are at different points along this function: they have different critical values pushing them into joining, depending on their characteristics.
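The threshold story can also be simulated. In the sketch below, each draw of the threshold Zi* comes from a standard normal distribution, which is an assumption made to match the standard normal curve mentioned above, and the individual joins when Zi exceeds it. The function name simulated_join_rate, the number of draws, and the seed are arbitrary.

```python
import random
from statistics import NormalDist

def simulated_join_rate(z, draws=20000, seed=0):
    """Share of simulated thresholds Zi* that fall below the index z."""
    rng = random.Random(seed)
    joins = sum(1 for _ in range(draws) if z > rng.gauss(0.0, 1.0))
    return joins / draws

# Individuals with a higher index Zi sit further along the curve.
for z in (-1.0, 0.0, 1.0):
    print(z, round(simulated_join_rate(z), 4), round(NormalDist().cdf(z), 4))
```

The simulated share of joiners should be close to the normal CDF at each Zi, which is the probability Pi.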

Page 26:

Introduction

How do we estimate Zi? Use the inverse of the cumulative normal function:

Zi = F^-1(Pi) = b1 + b2Xi
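The inverse step can be sketched with the standard library's normal distribution. The probabilities below are hypothetical values of Pi, not estimates from any data.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

proportions = [0.10, 0.25, 0.50, 0.75, 0.90]  # hypothetical Pi values

# Zi = F^-1(Pi): the index value whose normal CDF equals Pi
z_index = [std_normal.inv_cdf(p) for p in proportions]

for p, z in zip(proportions, z_index):
    print(p, round(z, 4), round(std_normal.cdf(z), 4))
```

Applying the CDF to each recovered Zi returns the original Pi, which is what makes the index usable as the left-hand side of Zi = b1 + b2Xi.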