Limited Dependent Variables

7/27/2019

1/23

Limited Dependent Variables

We are often interested in explaining a dependent variable that has only limited measurement.

Frequently it is even dichotomous.


2/23

Examples

War (1) vs. no war (0)

Vote vs. no vote

Regime change vs. no change


3/23

These are often Probability Models

E.g., power disparity leads to war:

$$Y_t = a + B_1 X_t + e_t$$

where $Y_t$ is the occurrence (or not) of war, and $X_t$ is a measure of power disparity.

We call this a Linear Probability Model.


4/23

Problems with LPM Regression

OLS in this case is called the Linear Probability Model.

Running the regression produces some problems:

Errors are not normally distributed

Errors are heteroskedastic

Predicted Ys can fall outside the 0.0-1.0 bounds required for a probability
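The out-of-bounds problem is easy to reproduce. The following is a minimal sketch in Python, assuming a small made-up sample (the `x` and `y` values are hypothetical, not the lecture's data) and a hand-written least-squares helper named `ols_fit`. It fits the line Y = a + B1*X to a 0/1 outcome and lists the fitted values that cannot be read as probabilities.

```python
# Hypothetical data: x is a power-disparity score, y is war (1) or no war (0).
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y = [0, 0, 0, 0, 1, 1, 1, 1]


def ols_fit(xs, ys):
    """Return intercept a and slope b of the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(xs, ys))
    sxx = sum((xi - mean_x) ** 2 for xi in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


a, b = ols_fit(x, y)

# Fitted "probabilities" from the linear probability model.
fitted = [a + b * xi for xi in x]

# Any fitted value below 0 or above 1 is not a valid probability.
out_of_bounds = [p for p in fitted if p < 0.0 or p > 1.0]

print(f"a = {a:.3f}, b = {b:.3f}")
print("fitted values outside [0, 1]:", [round(p, 3) for p in out_of_bounds])
```

With this sample the fitted line crosses both bounds, which is the third problem listed above. The heteroskedasticity follows from the same setup: the error variance of a 0/1 outcome is P(1-P), so it changes with X.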


5/23

Logistic Model

We need a model that produces true probabilities.

The logit, or cumulative logistic distribution, offers one approach:

$$Y_i = \frac{1}{1 + e^{-(B_1 + B_2 X_i)}}$$

This produces a sigmoid curve.

Look at the equation under two conditions:

$X_i = +\infty$

$X_i = -\infty$
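To see the two limiting conditions numerically, here is a short Python sketch of the logistic function Y = 1 / (1 + e^-(B1 + B2*X)). The default coefficients (`b1=0.0`, `b2=1.0`) are assumptions chosen only for illustration; with a positive B2, the value approaches 1 as X grows large and 0 as X grows very negative.

```python
import math


def logistic(x, b1=0.0, b2=1.0):
    """Logistic function, written so that large |z| cannot overflow."""
    z = b1 + b2 * x
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)  # z < 0, so exp(z) is at most 1
    return ez / (1.0 + ez)


# Moving X toward the two extremes pushes Y toward 0 and 1 but never past them.
for x in (-50.0, -2.0, 0.0, 2.0, 50.0):
    print(x, logistic(x))
```

Unlike the linear probability model, every value returned lies between 0 and 1.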


6/23

Sigmoid curve
http://en.wikipedia.org/wiki/Logistic_function


7/23

Probability Ratio

Note that

$$P_i = \frac{1}{1 + e^{-(B_1 + B_2 X_i)}} = \frac{1}{1 + e^{-Z_i}} = \frac{e^{Z_i}}{1 + e^{Z_i}}$$

where

$$Z_i = B_1 + B_2 X_i$$


8/23

Log Odds Ratio

The logit is the log of the odds, $P_i / (1 - P_i)$, and is given by:

$$L_i = \ln\left(\frac{P_i}{1 - P_i}\right) = Z_i = B_1 + B_2 X_i$$

This model gives us a coefficient that may be interpreted as the change in the log odds of the dependent variable for a one-unit change in $X_i$.
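The relationship between probabilities, odds, and the logit can be checked with a few lines of Python. This is a sketch under assumed coefficients (`b1 = -1.0`, `b2 = 0.5` are hypothetical): it confirms that the logit of P recovers B1 + B2*X, that a one-unit change in X shifts the log odds by B2, and that the odds are multiplied by e^B2.

```python
import math

b1, b2 = -1.0, 0.5  # hypothetical coefficients, for illustration only


def prob(x):
    """P = 1 / (1 + exp(-Z)) with Z = b1 + b2*x."""
    return 1.0 / (1.0 + math.exp(-(b1 + b2 * x)))


def odds(p):
    """Odds in favor of the event: P / (1 - P)."""
    return p / (1.0 - p)


def logit(p):
    """Log of the odds; the inverse of the logistic function."""
    return math.log(odds(p))


x = 2.0
logit_now = logit(prob(x))         # equals b1 + b2*x
logit_next = logit(prob(x + 1.0))  # one unit higher on x

log_odds_change = logit_next - logit_now
odds_ratio = odds(prob(x + 1.0)) / odds(prob(x))

print("change in log odds:", log_odds_change)
print("odds ratio:", odds_ratio, "vs exp(b2):", math.exp(b2))
```

The change in log odds is constant in X, while the change in the probability itself depends on where along the sigmoid curve the case sits.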


9/23

Estimation of Model

We estimate this with maximum likelihood

The significance tests are z statistics

We can generate a Pseudo $R^2$, which is an attempt to measure the percent of variation of the underlying logit function explained by the independent variables

We test the full model with the Likelihood Ratio (LR) test, which has a $\chi^2$ distribution with k degrees of freedom, where k is the number of independent variables
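The estimation steps can be sketched in plain Python for the one-variable case. Everything below is illustrative: the data are made up, Newton-Raphson is one common way to maximize the likelihood (not necessarily what a given package does internally), and the pseudo R-squared shown is McFadden's version, which is an assumption since the slides do not say which variant is meant. The p-value shortcut using `math.erfc` is valid only for 1 degree of freedom.

```python
import math

# Hypothetical data: x is a single independent variable, y is the 0/1 outcome.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
y = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]


def prob(b1, b2, xi):
    """Logistic probability P(Y=1) for one case."""
    return 1.0 / (1.0 + math.exp(-(b1 + b2 * xi)))


def log_likelihood(b1, b2):
    """Bernoulli log-likelihood of the sample at (b1, b2)."""
    total = 0.0
    for xi, yi in zip(x, y):
        p = prob(b1, b2, xi)
        total += math.log(p) if yi == 1 else math.log(1.0 - p)
    return total


def score_and_information(b1, b2):
    """Gradient of the log-likelihood and the 2x2 information matrix."""
    g1 = g2 = h11 = h12 = h22 = 0.0
    for xi, yi in zip(x, y):
        p = prob(b1, b2, xi)
        w = p * (1.0 - p)
        g1 += yi - p
        g2 += (yi - p) * xi
        h11 += w
        h12 += w * xi
        h22 += w * xi * xi
    return g1, g2, h11, h12, h22


# Newton-Raphson: repeatedly solve information * step = gradient.
b1, b2 = 0.0, 0.0
for _ in range(25):
    g1, g2, h11, h12, h22 = score_and_information(b1, b2)
    det = h11 * h22 - h12 * h12
    b1 += (h22 * g1 - h12 * g2) / det
    b2 += (h11 * g2 - h12 * g1) / det

# Standard error of b2 from the inverse information matrix at the estimate.
g1, g2, h11, h12, h22 = score_and_information(b1, b2)
det = h11 * h22 - h12 * h12
se_b2 = math.sqrt(h11 / det)
z_b2 = b2 / se_b2  # the z statistic for the slope

# Null model: intercept only, so the fitted probability is the mean of y.
p_bar = sum(y) / len(y)
ll_null = sum(math.log(p_bar) if yi == 1 else math.log(1.0 - p_bar) for yi in y)
ll_full = log_likelihood(b1, b2)

lr = 2.0 * (ll_full - ll_null)            # LR statistic; k = 1 here
p_value = math.erfc(math.sqrt(lr / 2.0))  # chi-square upper tail, 1 df only
pseudo_r2 = 1.0 - ll_full / ll_null       # McFadden's pseudo R-squared

print(f"b1 = {b1:.3f}, b2 = {b2:.3f}, z = {z_b2:.2f}")
print(f"LR = {lr:.3f}, p = {p_value:.4f}, pseudo R2 = {pseudo_r2:.3f}")
```

With more than one independent variable the same logic applies, but the information matrix is larger and the LR p-value needs a general chi-square tail function.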


10/23

Neural Networks

The alternate formulation is representative of a single-layer perceptron in an artificial neural network.


11/23

Probit

If we can assume that the dependent variable is actually the result of an underlying (and unobservable) propensity or utility, we can use the cumulative normal probability function to estimate a Probit model.

It is also more appropriate if the categories (or their propensities) are likely to be normally distributed.

In practice, it looks just like a logit model.


12/23

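The Cumulative Normal Density Function

The normal distribution is given by:

$$f(X) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{(X - \mu)^2}{2\sigma^2}}$$

The Cumulative Normal Density Function is:

$$F(X_0) = \int_{-\infty}^{X_0} \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{(X - \mu)^2}{2\sigma^2}} \, dX$$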


13/23

The Standard Normal CDF

We assume that there is an underlying threshold value ($I_i$): if the case exceeds it, the outcome is 1, and 0 otherwise.

We can standardize and estimate this as

$$F(I_i) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{B_1 + B_2 X_i} e^{-z^2/2} \, dz$$
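The standard normal CDF has no closed form, but it can be evaluated with the error function. The sketch below uses Python's `math.erf` to compute F at B1 + B2*X; the coefficients `b1` and `b2` are hypothetical values chosen for illustration, not estimates from any data.

```python
import math


def standard_normal_cdf(z):
    """Area under the standard normal density from -infinity to z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


b1, b2 = -1.0, 0.5  # hypothetical probit coefficients


def probit_prob(x):
    """P(Y=1) under the probit model: the normal CDF evaluated at b1 + b2*x."""
    return standard_normal_cdf(b1 + b2 * x)


for x in (0.0, 2.0, 4.0, 6.0):
    print(x, round(probit_prob(x), 4))
```

As with the logit, the result is a sigmoid curve bounded by 0 and 1; the two differ mainly in the tails.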


14/23

Probit estimates

Again, maximum likelihood estimation

Again, a Pseudo $R^2$

Again, an LR test with k degrees of freedom


15/23

Assumptions of Models

All Ys are in the set {0, 1}

They are statistically independent

No multicollinearity

P(Yi = 1) follows the cumulative normal distribution for probit, and the logistic function for logit


16/23

Ordered Probit

If the dependent variable can take on ordinal levels, we can extend the dichotomous Probit model to an n-chotomous, or ordered, Probit model.

It simply has several threshold values estimated.

Ordered logit works much the same way.


17/23

Multinomial Logit

If our dependent variable takes on several values, but they are nominal (unordered), we use a multinomial logit model.


18/23

Some additional info

The modal category is a good benchmark.

The % correctly predicted can be calculated and presented.

This, when compared to the modal category, gives us a good indication of fit.
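The comparison can be sketched in a few lines of Python. The outcomes and predicted probabilities below are made up for illustration, and the 0.5 cutoff for classifying a case as a predicted 1 is a common convention that is assumed here, not something the slides specify.

```python
# Hypothetical observed outcomes and model-predicted probabilities.
y = [0, 0, 0, 1, 0, 1, 0, 1, 1, 0]
predicted_prob = [0.10, 0.20, 0.30, 0.60, 0.40, 0.70, 0.55, 0.80, 0.45, 0.20]

# Classify each case as 1 when the predicted probability reaches the cutoff.
cutoff = 0.5
predicted = [1 if p >= cutoff else 0 for p in predicted_prob]

# Share of cases where the predicted class matches the observed outcome.
pct_correct = sum(1 for yi, pi in zip(y, predicted) if yi == pi) / len(y)

# Benchmark: always guess the modal (most frequent) category.
ones = sum(y)
modal_share = max(ones, len(y) - ones) / len(y)

print(f"correctly predicted: {pct_correct:.0%}")
print(f"modal category benchmark: {modal_share:.0%}")
```

A model is only informative on this measure to the extent that its share of correct predictions exceeds the modal benchmark.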


19/23

Stata

Use Leadership Change data

(1992 cross section): logit_data_st9_1992.xls (Excel) and Logit_Data_ST9_1992.dta (Stata)


20/23

Test different models

Dependent variable: leadership change (ledchan1)

Examine distribution tables for ledchan1

Independent variables

Try different ones

Try corr and then pwcorr


21/23

Try the following

regress ledchan1 grwthgdp hlthexp illit_f polity2

logit ledchan1 grwthgdp hlthexp illit_f polity2

logistic ledchan1 grwthgdp hlthexp illit_f polity2

probit ledchan1 grwthgdp hlthexp illit_f polity2

ologit ledchan1 grwthgdp hlthexp illit_f polity2

oprobit ledchan1 grwthgdp hlthexp illit_f polity2

mlogit ledchan1 grwthgdp hlthexp illit_f polity2

tobit ledchan1 grwthgdp hlthexp illit_f polity2, ul ll


22/23

Tobit

Assumes a mass of cases at 0, and then a continuous scale above it

E.g., the decision to incarcerate: 0 or 1 (imprison or not)

If imprisoned, then for how many years?
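For reference, the standard Tobit model censored at zero can be written in the same notation as the earlier slides. This is the textbook formulation, offered as a sketch; the latent variable $Y_i^*$ and the error variance $\sigma^2$ are symbols introduced here and do not appear in the slides.

```latex
$$Y_i^* = B_1 + B_2 X_i + e_i, \qquad e_i \sim N(0, \sigma^2)$$

$$Y_i = \begin{cases} Y_i^* & \text{if } Y_i^* > 0 \\ 0 & \text{if } Y_i^* \le 0 \end{cases}$$
```

In the incarceration example, $Y_i$ would be the sentence length in years, observed as 0 for everyone who is not imprisoned.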


23/23

Other models

This leads to many other models

Count models & Poisson regression

Duration/Survival/hazard models

Censoring and truncation models

Selection bias models
