
Graduate Labor Economics

Section 3: The Neoclassical Model of Labor Demand and Applications

Copyright: Florian Hoffmann. Please do not Circulate.

Florian Hoffmann
Department of Economics
University of British Columbia

August 2018


1 Introduction

It is one of the surprising facts of modern labor economics that there is relatively little research in the area of labor demand. This is particularly true for dynamic versions. A possible reason is that the intertemporal model of labor demand collapses to a sequence of static labor demand models when maintaining the common assumptions of neoclassical models. Another reason might be that to study firm behavior one ideally wants to have a theory of the firm. This quickly leads one to study collective wage setting, internal career dynamics, or dynamics of firm size. Each of these topics turns out to be difficult. Firm size dynamics and the decision to open a business are topics of IO, and we will not cover them here. One should mention, however, that a considerable fraction of flows into and out of the labor market is driven by firm openings and closures, or by substantial downsizing. We will therefore briefly cover a model of labor demand with dynamics.

The majority of this section is concerned with the static model, but keep in mind its limitations relative to a dynamic model:

• The assumption of the neo-classical model that firms can freely hire and fire workers to reach the number of employees that maximizes expected discounted future profits is arguably unrealistic. Adjusting firm size or opening a business may be costly, which introduces a dynamic dimension into the decisions.

• Firm-specific and aggregate labor demand vary considerably over time. In particular, business cycle fluctuations correlate strongly with worker flows into and out of employment and with the number of new firms entering the market. As we will see, a simple neo-classical static model without any adjustment costs or other frictions cannot explain such patterns.

• Comparable to the model of intertemporal labor supply, a truly dynamic model of labor demand produces econometric panel data models rather than cross-sectional models. Hence, one can use the panel structure to identify many structural parameters that are not identified when relying on cross-sectional data.

• A dynamic model of labor demand can be applied to various dimensions of policy analysis. Focusing on static models may mask important dynamics in firm behavior and thus lead to wrong policy conclusions.


2 Static Labor Demand and an Empirical Application

2.1 The Neoclassical Model of Labor Demand

In the static model of labor demand, firms maximize profits $Y - wL - rK$, where $Y$ is output, $w$ is the wage rate, $r$ is the interest rate, $L$ is labor input, and $K$ is capital input. Output is produced according to a technology $F(L,K)$ that satisfies Inada conditions. Formally, the maximization problem is described by

$$\max_{L,K} \; F(L,K) - wL - rK \tag{1}$$

with first-order conditions

$$F_L = w, \qquad F_K = r. \tag{2}$$

The same optimality condition for the input mix can be derived when solving the cost-minimization problem

$$\min_{L,K} \; wL + rK \quad \text{s.t.} \quad F(L,K) = Y \tag{3}$$

instead, where $Y$ is some exogenous output target. In both cases the first-order conditions imply that

$$\frac{F_L}{F_K} = \frac{w}{r}. \tag{4}$$

Hence, at the optimum the marginal rate of technological substitution (MRTS) $F_L/F_K$ is equal to the relative price of the labor input $w/r$. Graphically, this means that the iso-cost line is tangent to the isoquant:

ENTER FIGURE HERE.

A common comparative-statics exercise is to quantify the change in the input mix if relative factor prices change. The relevant measure for the strength of such an adjustment is the elasticity of substitution

$$\frac{d(K/L)\,/\,(K/L)}{d\,\text{MRTS}\,/\,\text{MRTS}} = \left[\frac{\text{MRTS}}{K/L}\right] \cdot \left[\frac{d\,\text{MRTS}}{d(K/L)}\right]^{-1} \tag{5}$$

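which measures the adjustment in relative factor inputs as we move along the isoquant. If we assume that the production function is of the CES-form, we have

$$F(L,K) = \left[\mu K^{\rho} + (1-\mu) L^{\rho}\right]^{\frac{1}{\rho}} \tag{6}$$

$$\Rightarrow \quad F_L = \frac{1}{\rho}\,(1-\mu)\,\rho\, L^{\rho-1} \left[\mu K^{\rho} + (1-\mu) L^{\rho}\right]^{\frac{1}{\rho}-1}, \qquad F_K = \frac{1}{\rho}\,\mu\,\rho\, K^{\rho-1} \left[\mu K^{\rho} + (1-\mu) L^{\rho}\right]^{\frac{1}{\rho}-1}$$

$$\Rightarrow \quad \text{MRTS} = \left(\frac{1-\mu}{\mu}\right)\left(\frac{K}{L}\right)^{1-\rho}$$

$$\Rightarrow \quad \frac{\text{MRTS}}{K/L} = \left(\frac{1-\mu}{\mu}\right)\left(\frac{K}{L}\right)^{-\rho}, \qquad \frac{d\,\text{MRTS}}{d(K/L)} = (1-\rho)\left(\frac{1-\mu}{\mu}\right)\left(\frac{K}{L}\right)^{-\rho}$$

$$\Rightarrow \quad ES = \frac{1}{1-\rho}.$$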

Thus, as the name of the production function implies, the elasticity of substitution is a constant.
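As a quick numerical sanity check of this result, the following Python sketch evaluates definition (5) by finite differences for the CES function in (6). The parameter values (rho = 0.5, mu = 0.3), the evaluation points, and all function names are illustrative assumptions and do not come from the handout.

```python
def ces(l, k, rho, mu):
    """CES production function from (6)."""
    return (mu * k ** rho + (1.0 - mu) * l ** rho) ** (1.0 / rho)


def mrts_closed_form(k_over_l, rho, mu):
    """MRTS = F_L / F_K as derived in the text, written as a function of K/L."""
    return (1.0 - mu) / mu * k_over_l ** (1.0 - rho)


def mrts_numeric(l, k, rho, mu, h=1e-6):
    """MRTS from central-difference approximations of F_L and F_K."""
    f_l = (ces(l + h, k, rho, mu) - ces(l - h, k, rho, mu)) / (2.0 * h)
    f_k = (ces(l, k + h, rho, mu) - ces(l, k - h, rho, mu)) / (2.0 * h)
    return f_l / f_k


def elasticity_of_substitution(k_over_l, rho, mu, h=1e-6):
    """Definition (5): [MRTS / (K/L)] * [dMRTS / d(K/L)]^(-1)."""
    slope = (mrts_closed_form(k_over_l + h, rho, mu)
             - mrts_closed_form(k_over_l - h, rho, mu)) / (2.0 * h)
    return (mrts_closed_form(k_over_l, rho, mu) / k_over_l) / slope


rho, mu = 0.5, 0.3  # illustrative values (assumption)
es_values = [elasticity_of_substitution(x, rho, mu) for x in (0.5, 1.0, 2.0, 5.0)]
print(es_values)  # each entry should be close to 1 / (1 - rho)
```

The elasticity is evaluated at several capital-labor ratios to illustrate that it does not vary along the isoquant.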

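The CES-production function is a popular specification in empirical applications. The tangency condition above together with the CES-production function yields

$$\left(\frac{1-\mu}{\mu}\right)\left(\frac{K}{L}\right)^{1-\rho} = \frac{w}{r}$$

$$\Rightarrow \quad \left(\frac{L}{K}\right)^{1-\rho} = \left(\frac{1-\mu}{\mu}\right)\frac{r}{w}$$

$$\Rightarrow \quad \log\left(\frac{L}{K}\right) = \left(\frac{1}{1-\rho}\right)\log\left(\frac{1-\mu}{\mu}\right) + \left(\frac{1}{1-\rho}\right)\left(\log r - \log w\right) \tag{7}$$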

This is a linear structural labor demand equation, with labor demand expressed relative to capital inputs. It predicts that the labor input share decreases if log-wages increase. Suppose we have cross-sectional data on firms' input choices and factor prices, and index firms by $i$. Then we can run the following linear regression:

$$\log\left(\frac{L}{K}\right)_i = \beta_0 + \beta_1 \log r_i + \beta_2 \log w_i + \varepsilon_i. \tag{8}$$

This equation has two interesting features. First, one can identify all the structural parameters, $\rho$ and $\mu$: comparing (8) with (7) gives $\beta_1 = 1/(1-\rho)$ and $\beta_0 = \beta_1 \log\left(\frac{1-\mu}{\mu}\right)$. Second, we have an overidentifying restriction, $\beta_1 = -\beta_2$, which is a testable implication.
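To make the identification argument concrete, here is a minimal simulation sketch in Python. It generates data from (7), runs the regression in (8) by OLS, and backs out rho and mu. The parameter values, sample size, and noise levels are assumptions, and factor prices are drawn exogenously, which sidesteps the endogeneity problem that real data would pose.

```python
import math
import random


def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y (Gauss-Jordan)."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[a] * row[b] for row in X) for b in range(k)]
         + [sum(row[a] * yi for row, yi in zip(X, y))]
         for a in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [x - factor * p for x, p in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


random.seed(0)
rho_true, mu_true = 0.5, 0.4          # assumed structural parameters
slope = 1.0 / (1.0 - rho_true)        # beta_1 = -beta_2 in (7)
intercept = slope * math.log((1.0 - mu_true) / mu_true)

X, y = [], []
for _ in range(4000):
    log_r = random.gauss(0.0, 0.3)    # factor prices drawn exogenously (assumption)
    log_w = random.gauss(0.0, 0.3)
    eps = random.gauss(0.0, 0.1)
    X.append([1.0, log_r, log_w])
    y.append(intercept + slope * (log_r - log_w) + eps)

b0, b1, b2 = ols(X, y)
rho_hat = 1.0 - 1.0 / b1                   # from beta_1 = 1 / (1 - rho)
mu_hat = 1.0 / (1.0 + math.exp(b0 / b1))   # from beta_0 / beta_1 = log((1 - mu) / mu)
print(round(rho_hat, 3), round(mu_hat, 3), round(b1 + b2, 3))
```

The last printed number is the sample analogue of the overidentifying restriction and should be close to zero when the model is correctly specified.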

The principal problem with this equation is that log-wages are an equilibrium outcome. Hence, we do not know if variation in log-wages is driven by shifts in the labor supply or the labor demand equation. To identify $\beta_2$ we need an instrument that shifts the labor supply curve, holding the labor demand curve stable. Also note that the level of aggregation of this equation is higher than the one in the labor supply model since a firm usually consists of more than just one worker. Most studies in the literature look at a higher level of aggregation, e.g. local labor markets defined by cities or states.

We may want to isolate labor demand on the LHS of the equation, thus bringing capital inputs to the RHS. What is the problem with this approach? Capital is endogenous, and one thus needs another instrument that shifts capital demand while not affecting labor demand directly. It is hard to think of such an instrument.

2.2 Monopsony

The model above assumes that firms are price takers. From their point of view, the wage $w_i$ is a parameter. In reality, there may be situations in which firms have some market power on the demand side for labor. One example is a large firm that operates in a geographically isolated place and that mainly hires low-skill labor. We call this type of market power monopsony power. Firms no longer take wages as given but have a direct influence on the wage rate paid in the local labor market. In the profit maximization problem we thus need to replace the parameter $w$ by the inverse labor supply function $w(L)$.

Let's abstract from capital and assume that the labor supply equation is upward sloping (i.e. $w_L(L) > 0$). Then the firm maximizes $F(L) - w(L)L$ and the FOC is given by

$$F_L(L) = w(L) + w_L(L)\,L. \tag{9}$$

Since $w(L) + w_L(L)L > w(L)$, the marginal cost that is relevant to the firm lies above the wage rate at any given level of employment. Hence, $L_{monop} < L_{comp}$, and we indeed have a situation in which both wages and employment are below the competitive level.
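A small worked example may help to see the wedge in (9). The sketch below assumes a linear marginal product and a linear inverse labor supply curve; the functional forms and all numbers are illustrative assumptions.

```python
# Illustrative linear example (all numbers are assumptions):
# marginal product F_L(L) = a - b*L, inverse labor supply w(L) = c + d*L.
a, b, c, d = 20.0, 1.0, 2.0, 1.0


def marginal_product(l):
    return a - b * l


def inverse_supply(l):
    return c + d * l


# Competitive benchmark: F_L(L) = w(L)
l_comp = (a - c) / (b + d)
w_comp = inverse_supply(l_comp)

# Monopsony: F_L(L) = w(L) + w_L(L) * L = c + 2*d*L, see (9)
l_monop = (a - c) / (b + 2.0 * d)
w_monop = inverse_supply(l_monop)

print(l_monop, w_monop, l_comp, w_comp)  # prints: 6.0 8.0 9.0 11.0
```

In this example the monopsonist hires 6 workers at a wage of 8, compared with 9 workers at a wage of 11 under price taking.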

The effect of introducing a minimum wage that lies between $w_{monop}$ and $w_{comp}$ is quite surprising. At the margin, the cost of labor is now $w_{min}$, and the monopsonist takes this parametrically. As a consequence, it will hire up to where $F_L(L) = w_{min}$, so that given our assumption $w_{min} \in (w_{monop}, w_{comp})$ labor demand increases relative to the monopsony level without minimum wages. Of course, as soon as $w_{min} > w_{comp}$ labor demand will decrease. The intuition for this result is as follows: The firm takes $w_{min}$ parametrically, so that the relevant marginal cost of hiring an additional worker is simply the minimum wage. In contrast, in a pure monopsony the marginal cost includes not only the wage rate $w$, but also the marginal increase in wage payments to all the (inframarginal) workers as given by $w_L(L)L$: To fill the additional jobs the firm needs to pay higher wages to all workers. This additional marginal cost depresses labor demand. But as soon as there is a minimum wage, we have $w_L(L)L = 0$ in the region between $w_{monop}$ and $w_{min}$, and the monopsonist increases labor demand until $F_L(L) = w_{min}$.

2.3 Empirical Application 1: The Evolution of the Wage Structure

Above we showed that a CES-production function generates labor demand equations as described by

$$\log\left(\frac{L^d_i}{K^d_i}\right) = \beta_0 + \beta_1 \log r_i + \beta_2 \log w_i + \varepsilon_i. \tag{10}$$

As we will see below, labor demand decisions in the intertemporal model without adjustment costs collapse to a sequence of static labor demand decisions. Hence this equation remains valid in dynamic neoclassical models of labor demand. We can thus add the index $t$ and pretend that we are actually studying a dynamic economy. But how do we solve for equilibrium? After all, labor supply is endogenous and we cannot simply use the equation above as an equilibrium relationship. Suppose that we are studying aggregate labor demand rather than firm-level labor demand, meaning that we drop the index $i$ from the equation (while adding the index $t$). In a constant-returns-to-scale economy, this only means that we replace firm-level labor demand by aggregate labor demand in the equation above, since an economy with many small firms can be replicated by an economy with one big firm (the representative firm) that takes wages as given. Bringing wages to the LHS and aggregate labor shares to the RHS, we can rewrite the aggregate labor demand equation as

$$\log w_t = \left(\frac{1}{\beta_2}\right)\left[\log\left(\frac{L^d_t}{K^d_t}\right) - \beta_0 - \beta_1 \log r_t - \varepsilon_t\right]. \tag{11}$$

Still, we need to address the problem of endogeneity. Since we have aggregated up the economy, labor supply $L^s_t$ may be thought of as population size in period $t$. A common assumption is that population size is an exogenous variable, and we write it parametrically as $L_t$. This is valid if labor force participation rates and unemployment rates are constant over time and if fertility rates do not depend on aggregate wages. Labor market equilibrium requires that $L^s_t = L_t = L^d_t$. We can thus rewrite the labor demand equation as

$$\log w_t = \left(\frac{1}{\beta_2}\right)\left[\log\left(\frac{L_t}{K^d_t}\right) - \beta_0 - \beta_1 \log r_t - \varepsilon_t\right] \tag{12}$$

clarifying that $L_t$ enters parametrically. The remaining problem is that $r_t$ and $K^d_t$ are endogenous. We can get around this problem if we assume that the country takes $r_t$ as exogenous. Given the exogenous variables $(r_t, L_t)$, the first-order condition for capital demand, $F_{K,t}(K_t, L_t) = r_t + \delta$, can be inverted to get $K^d_t$. With a CRS production function and a time-constant interest rate, capital inputs adjust in such a way that $K_t/L_t$ is a constant. With time-varying interest rates, once we condition on the interest rate we can treat the labor input share as an exogenous variable. We can thus interpret the equation above as an equilibrium relationship. This is how we should interpret the approach taken in the paper we discuss next.
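To see why the first-order condition for capital pins down the capital-labor ratio, one can work the argument out for the CES function in (6). This is only an illustration, since the text requires nothing beyond CRS. The symbol $\delta$ is not defined in the text and is taken here to be the depreciation rate, which is an assumption.

```latex
\begin{align*}
F_K(K_t, L_t)
  &= \mu K_t^{\rho-1}\left[\mu K_t^{\rho} + (1-\mu) L_t^{\rho}\right]^{\frac{1}{\rho}-1}
   = \mu\left[\mu + (1-\mu)\left(\frac{L_t}{K_t}\right)^{\rho}\right]^{\frac{1-\rho}{\rho}}, \\
F_K(K_t, L_t) = r_t + \delta
  \;&\Longrightarrow\;
  \frac{K_t}{L_t}
   = \left[\frac{\left(\frac{r_t+\delta}{\mu}\right)^{\frac{\rho}{1-\rho}} - \mu}{1-\mu}\right]^{-\frac{1}{\rho}} .
\end{align*}
```

The marginal product of capital depends on inputs only through $K_t/L_t$, so (whenever the term in brackets is positive) the ratio is a function of $r_t + \delta$ alone and is constant if the interest rate is constant.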

2.3.1 Card and Lemieux (2001)

The model presented above is the starting point for a vast empirical literature on the evolution of the wage structure and of the skill premium, i.e. the average wage differential between highly and less educated labor, over time. There have been major changes in the wage structure over the last few decades. Most importantly, wage inequality has risen, and some of it can be explained by a rise in the skill premium. Card and Lemieux find that the education premium has increased only for the young, while it has remained remarkably constant for older workers. This is shown in figure 1 in their paper. At the same time, the relative supply of college-educated labor among the young has remained remarkably constant or has even decreased over the last few decades, while it has increased for older workers. See figure 3 in the paper. So one might conjecture that there is a relationship between the two stylized facts. To test this, we need a theory of the age-specific college premium. Card and Lemieux consider a theoretical model that essentially generates age-education specific equations of the form given in (12). Let $H_{jt}$ and $C_{jt}$ be the supplies of high-school and college labor in age group $j$ and year $t$, respectively, assumed to be exogenous, and assume that the aggregate supplies of both types of labor are given by

$$H_t = \left[\sum_j \alpha_j H_{jt}^{\eta}\right]^{\frac{1}{\eta}} \tag{13}$$

$$C_t = \left[\sum_j \beta_j C_{jt}^{\eta}\right]^{\frac{1}{\eta}}. \tag{14}$$

Also assume that aggregate output is produced according to the production function

$$y_t = \left[\theta_{ht} H_t^{\rho} + \theta_{ct} C_t^{\rho}\right]^{\frac{1}{\rho}}. \tag{15}$$

Hence, there are different labor inputs here that are not perfectly substitutable. First, firms use both high-school and college labor, and they are not perfectly substitutable as long as $\rho = 1 - \frac{1}{\sigma_{educ}} \neq 1$. Second, within each education group, firms can hire from different age groups, but the age groups are not perfectly substitutable as long as $\eta = 1 - \frac{1}{\sigma_{age}} \neq 1$. Hence, we cannot simply add up individual-level labor supplies to get the relevant aggregate labor inputs. This is why there will be relations of the form given in (12) for each education-age group, and this is what will give us expressions for age-specific education premia, which allow us to study the patterns documented in figures 1 and 3.

Using the FOC for each input (i.e. each $H_{jt}$ and $C_{jt}$) it is straightforward to derive

$$\log\left(\frac{w^c_{jt}}{w^h_{jt}}\right) = \log\left(\frac{\theta_{ct}}{\theta_{ht}}\right) + \log\left(\frac{\beta_j}{\alpha_j}\right) - \left(\frac{1}{\sigma_{educ}}\right)\log\left(\frac{C_t}{H_t}\right) - \left(\frac{1}{\sigma_{age}}\right)\left[\log\left(\frac{C_{jt}}{H_{jt}}\right) - \log\left(\frac{C_t}{H_t}\right)\right] + \varepsilon_{jt} \tag{16}$$

where $w^c_{jt}/w^h_{jt}$ is the college wage premium, and $\varepsilon_{jt}$ is added to account for sampling error. This is a theory of the evolution of the age-specific college premium over time. Age-specific relative labor supply matters only if $\eta \neq 1$ (i.e. $\sigma_{age} < \infty$). In that case, there will be a relationship between the evolution of the age-specific college premium and the evolution of age-specific college shares in labor supply over time. Otherwise, there might be an age-specific premium given by $\beta_j/\alpha_j$, but it will be constant over time, which contradicts the stylized facts discussed above. Also note that the college premium will change over time irrespective of age if $\theta_{ct}/\theta_{ht}$ changes over time. All in all, the relatively basic model of labor demand generates an empirical model that imposes a lot of interesting and testable restrictions on the data.
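Equation (16) can be checked numerically against the production structure in (13) to (15). The sketch below computes marginal products by finite differences and compares the log wage ratio with the right-hand side of (16), with the sampling error set to zero. All parameter values and supplies are hypothetical, and the function names are not from the paper.

```python
import math


def aggregate(weights, supplies, eta):
    """CES aggregate of age-group supplies, as in (13) and (14)."""
    return sum(a * x ** eta for a, x in zip(weights, supplies)) ** (1.0 / eta)


def output(H, C, alpha, beta, theta_h, theta_c, rho, eta):
    """Aggregate output from (15)."""
    Ht = aggregate(alpha, H, eta)
    Ct = aggregate(beta, C, eta)
    return (theta_h * Ht ** rho + theta_c * Ct ** rho) ** (1.0 / rho)


def marginal_product(H, C, group, j, params, h=1e-5):
    """Central-difference derivative of output with respect to H[j] or C[j]."""
    def bumped(step):
        Hb, Cb = list(H), list(C)
        (Hb if group == "h" else Cb)[j] += step
        return output(Hb, Cb, **params)
    return (bumped(h) - bumped(-h)) / (2.0 * h)


# Hypothetical numbers for three age groups (assumptions, not data)
sigma_educ, sigma_age = 2.5, 5.0
params = dict(alpha=[1.0, 1.2, 0.9], beta=[1.0, 1.5, 1.1],
              theta_h=1.0, theta_c=1.5,
              rho=1.0 - 1.0 / sigma_educ, eta=1.0 - 1.0 / sigma_age)
H = [30.0, 25.0, 20.0]
C = [10.0, 12.0, 8.0]

Ht = aggregate(params["alpha"], H, params["eta"])
Ct = aggregate(params["beta"], C, params["eta"])

gaps = []
for j in range(3):
    # Left-hand side of (16): log ratio of marginal products (wages)
    lhs = math.log(marginal_product(H, C, "c", j, params)
                   / marginal_product(H, C, "h", j, params))
    # Right-hand side of (16) with the sampling error set to zero
    rhs = (math.log(params["theta_c"] / params["theta_h"])
           + math.log(params["beta"][j] / params["alpha"][j])
           - (1.0 / sigma_educ) * math.log(Ct / Ht)
           - (1.0 / sigma_age) * (math.log(C[j] / H[j]) - math.log(Ct / Ht)))
    gaps.append(abs(lhs - rhs))
print(max(gaps))  # should be numerically zero
```

The check treats wages as marginal products, which is the price-taking assumption maintained throughout this section.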

In principle, the parameters of this regression equation can be estimated by NLS using data on age-education specific wages and numbers of workers. For the US, such data can be constructed from the CPS and the decennial Census. The challenge for estimation is the fact that both $C_{jt}$ and $H_{jt}$ enter the equation non-linearly through the term $\log(C_t/H_t)$. Instead of estimating the parameters simultaneously, the authors apply a 2-stage procedure. The 1st stage recognizes that the regression equation can be represented in reduced form as

$$\log\left(\frac{w^c_{jt}}{w^h_{jt}}\right) = d_t + b_j - \left(\frac{1}{\sigma_{age}}\right)\log\left(\frac{C_{jt}}{H_{jt}}\right) + \varepsilon_{jt} \tag{17}$$

where $d_t = \log\left(\frac{\theta_{ct}}{\theta_{ht}}\right) + \left[\left(\frac{1}{\sigma_{age}}\right) - \left(\frac{1}{\sigma_{educ}}\right)\right]\log\left(\frac{C_t}{H_t}\right)$ and $b_j = \log\left(\frac{\beta_j}{\alpha_j}\right)$ are time and age fixed effects.

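This yields an estimate for $\sigma_{age}$. It is straightforward to show that the FOC produce equations of the form

$$\log\left(w^c_{jt}\right) + \left(\frac{1}{\sigma_{age}}\right)\log C_{jt} = d_{ct} + \log(\beta_j) \tag{18}$$

$$\log\left(w^h_{jt}\right) + \left(\frac{1}{\sigma_{age}}\right)\log H_{jt} = d_{ht} + \log(\alpha_j) \tag{19}$$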

which can be used to estimate the education-specific age fixed effects $\alpha_j$ and $\beta_j$, since one can construct the LHS using the 1st-stage estimate of $\sigma_{age}$ and data on age-education specific wages and employment. In the 2nd stage one can then construct the CES-aggregates $C_t$ and $H_t$ and run the regression in (16) to get an estimate of $\sigma_{educ}$. This also provides us with a second estimate of $\sigma_{age}$, which can be used to test equality with its 1st-stage estimate. However, one should note that the constructed variable $C_t/H_t$ varies by time only, and hence $\log(\theta_{ct}/\theta_{ht})$ cannot be identified without imposing additional restrictions on it. The authors assume that $\log(\theta_{ct}/\theta_{ht})$ evolves linearly over time. It seems to me that this identification problem could be avoided by estimating the structural form in one step, since $C_t/H_t$ is a non-linear function of the $C_{jt}$ and $H_{jt}$, which are not collinear with $\log(\theta_{ct}/\theta_{ht})$.

The results are interesting. The first-stage reduced form fits the data almost perfectly, and $\sigma_{age}$ is highly significant. The fit of the structural second stage is much worse, but the estimate for $\sigma_{age}$ is almost identical to the one from the first stage. Why does the fit decrease when moving to the structural form? The reduced-form time dummy from the 1st stage is composed of a relative productivity effect $\log(\theta_{ct}/\theta_{ht})$ and a relative supply effect $\left[\left(\frac{1}{\sigma_{age}}\right) - \left(\frac{1}{\sigma_{educ}}\right)\right]\log\left(\frac{C_t}{H_t}\right)$. In contrast to the 2nd stage, the 1st stage leaves the process for $\log(\theta_{ct}/\theta_{ht})$ unrestricted. In the data, returns to college stagnated in the mid-1970s before taking off in the 1980s. This can be explained by the increase of the relative supply of college workers in the 1970s, which stagnated afterward, together with a non-linear process for $\log(\theta_{ct}/\theta_{ht})$. However, the linear process assumed in the 2nd stage needs to fit the stark increases in the college wage premium after 1980 and thus will have a significant and positive slope. Because of the linearity of the process, the model forces $\log(\theta_{ct}/\theta_{ht})$ to increase during the 1970s as well, generating too much growth in the structural form of $\log\left(\frac{\theta_{ct}}{\theta_{ht}}\right) + \left[\left(\frac{1}{\sigma_{age}}\right) - \left(\frac{1}{\sigma_{educ}}\right)\right]\log\left(\frac{C_t}{H_t}\right)$.

The elasticity of substitution between age groups is estimated to be quite high (in the range of 4 to 6) while the elasticity of substitution between education groups is estimated at a much lower value (in the range of 2 to 2.5). These results are relatively robust across several alternative specifications and countries. The authors conclude that "the increase in the college-high school wage gap over the past two decades is attributable to steadily rising relative demand for college-educated labor, coupled with a dramatic slowdown in the rate of growth of the relative supply of college-educated workers". This of course raises the question of why the relative supply of the highly educated has not picked up during that time period.

2.4 Empirical Application 2: The Effect of Immigration on Local Labor Markets

At a deep economic level, Card and Lemieux estimate labor demand equations and how they shift over time. One challenge is that in the data we may only observe equilibrium outcomes for wages and employment. This is a problem if labor supply is endogenous and not, as in Card and Lemieux, given exogenously. In particular, estimating the labor demand function boils down to identifying the parameters of one of two simultaneous equations. If labor supply is endogenous, then estimates of the labor demand equation will suffer from a simultaneity bias. As you should know from your econometrics classes, this threat to identification can be overcome if we find an exogenous shifter of the labor supply curve. One context that certainly shifts the labor supply curve is immigration. Consider the labor demand equation derived from a simple CES function and isolate the wage rate:

$$\log w_i = \left(\frac{1}{\beta_2}\right)\left[\log\left(\frac{L}{K}\right)_i - \beta_0 - \beta_1 \log r_i - \varepsilon_i\right]. \tag{20}$$

A standard approach in the immigration literature is to index the variables in this equation by city ("c") and time ("t") instead of the firm ("i"). Is the assumption that $L_{ct}$ is shifted by immigration for exogenous reasons credible? Probably not, for at least two reasons: First, immigration changes local labor supply, but it is likely that immigrants go to cities with relatively favorable demand factors, not all of which are observable. Hence, $cov(L_{ct}, \varepsilon_{ct}) \neq 0$ and the OLS estimates will be biased. Second, inflows of immigrants might cause an outflow of former residents who want to avoid the increase in competition in local labor markets. As a consequence, changes in $L_{ct}$ due to immigration, holding everything else constant, and changes in $L_{ct}$ observed in the data might be different. With appropriate data, the second problem can be tested since one observes both inflows into and outflows from a city. Authors writing studies in this area spend considerable time addressing the second issue and usually find that outflows are not systematically related to inflows of immigrants. It thus remains to address the first problem, which is difficult since local demand shocks are usually unobservable or only partially observable. Let's go over some of the most influential papers in the literature and how they address this issue.

One important note is in order, however. The three papers we cover below operate under the assumption that immigration has a partial equilibrium effect on labor markets only. This does not seem to be a good assumption, at least for cities that experience large-scale inflows of immigrants. After all, immigrants also demand goods, and they may thus shift the labor demand curve through an aggregate effect on output. So there is good reason to believe that viewing labor markets and goods markets as isolated entities is problematic.

2.4.1 Card (1990)

This paper is an early study in labor economics that uses a natural experiment to identify a parameter of interest. As a consequence of a decision by Fidel Castro in 1980 to allow anybody who wanted to emigrate to the US to do so, approx. 125,000 Cuban immigrants arrived in Miami between May and September 1980 (this event has been depicted in the movie "Scarface"). The result was a 7% increase in the labor force of Miami and a 20% increase in the number of Cuban workers in Miami. This event is commonly referred to as the "Mariel Boatlift". The source of the experimental variation in the data, an exogenous shift in labor supply that disproportionately affects Miami relative to other cities in the US, lends itself to a DiD research design. Let $y_{gct}$ be some labor market outcome for group $g$ in city $c$ in year $t$. Index Miami by $c = 1$ and index some other cities that serve as a control group by $c = 2$. To estimate the effect of the Mariel Boatlift on labor market outcomes of group $g$ we define a dummy variable $D_{gct} = 1$ if $c = 1$ and $t > 1980$, and $D_{gct} = 0$ otherwise. We are then interested in the parameter $\beta$ in

$$y_{gct} = \beta D_{gct} + \mu_c + \phi_t + \varepsilon_{gct} \tag{21}$$

which measures the causal impact of the Mariel Boatlift on the labor market in Miami if $cov(D_{gct}, \varepsilon_{gct}) = 0$. This assumption is satisfied if there are no labor-market trends that are specific to Miami. This requires a valid choice of the control group. Card chooses Atlanta, LA, Houston and Tampa-St. Petersburg as control cities since, prior to the Mariel Boatlift, they display labor market trends similar to those of Miami while being unaffected by the Mariel Boatlift. Next, we need to choose the native group $g$ for which we want to study systematic changes in average labor market outcomes due to the large labor supply shock. Card focuses on low-skill individuals in one part, and on Cubans and African-Americans in another part of the paper, since these are the groups that are most likely to compete with the immigrants. Hence, we define the group $g$ either by some low-skill measures or by a group of races/ethnicities that are most likely to compete with the "Mariels". Generally, his results suggest that the large immigration wave had negligible effects on labor market outcomes of locals. This is somewhat surprising at first sight since unemployment rates significantly increased and wage rates significantly decreased for the native groups $g$ between 1980 and 1982. But comparison with the control groups suggests that this was a nationwide business cycle effect which was independent of the Boatlift: it was just bad timing. The DiD-estimator controls for such nationwide trends through the fixed effect $\phi_t$.
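The logic of the DiD-estimator can be illustrated with a stylized two-city, two-period example. In that special case the OLS estimate of beta in (21) equals the difference-in-differences of the four cell means. The numbers below are made up for illustration and are not Card's estimates; the cells are generated from (21) without noise so that the arithmetic is transparent.

```python
# Hypothetical cell means built from (21) without noise (all numbers are assumptions):
# y = beta * D + mu_c + phi_t, with D = 1 only for Miami after 1980.
beta_true = 0.5
mu = {"miami": 5.0, "control": 6.0}   # city fixed effects
phi = {1979: 0.0, 1982: 2.0}          # common year effects (nationwide downturn)

y = {}
for city in mu:
    for year in phi:
        treated = 1.0 if (city == "miami" and year > 1980) else 0.0
        y[(city, year)] = beta_true * treated + mu[city] + phi[year]

# Naive before-after comparison for Miami mixes the shock with the year effect
before_after = y[("miami", 1982)] - y[("miami", 1979)]

# Difference-in-differences removes mu_c and phi_t
did = before_after - (y[("control", 1982)] - y[("control", 1979)])
print(before_after, did)  # prints: 2.5 0.5
```

The before-after change in Miami (2.5) overstates the effect because it includes the common year effect (2.0); subtracting the change in the control city recovers the assumed beta of 0.5.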

There are the usual concerns with the DiD-strategy: Are the control groups valid? Have we really identified the group $g$ that is most likely to be affected by the large shock to local labor supply? What about sample sizes? The concern that natives might have moved out of Miami as a response to the shock cannot be ruled out, either. The results are still quite surprising since the shock was so large. Also note that if we take the results at face value, then they seemingly contradict the findings from Card and Krueger (1994) regarding the effects of an increase in the minimum wage on employment: The latter essentially imply that the labor demand curve is inelastic, while the immigration paper suggests that the labor demand curve is perfectly elastic. Beaudry, Green and Sand (2015) show how one can resolve this "puzzle" when using an equilibrium search-matching model of the labor market.

2.4.2 Card (2001)

The Mariel-Boatlift paper is essentially a case study and thus might not be externally valid. Given the controversial nature of the topic, evidence that addresses external validity is needed. Card (2001) uses data for a broad set of cities in the US together with a clever strategy for constructing an Instrumental Variable. The IV-strategy used in the paper goes back to Bartik (1990). It is thus often referred to as the Bartik-Instrument. It is highly instructive to read the paper to see how much work is invested to get empirical results that are convincing.

Card's starting point is the observation that skills of immigrants are highly heterogeneous, so that "the overall fraction of immigrants in a city is too crude an index of immigrant competition for any particular subgroup of natives". He instead assumes that local labor markets are stratified along occupation lines. Let $Y_c$ be the output in city $c$, sold at price $q_c$, and assume that it is produced according to some production function $Y(K_c, L_c)$. To capture stratification along occupation lines, Card assumes that labor inputs are not perfectly substitutable across occupations indexed by $j = 1, ..., J$ and uses the CES-aggregator

$$L_c = \left[\sum_j \left(e_{jc} N_{jc}\right)^{\frac{\sigma-1}{\sigma}}\right]^{\frac{\sigma}{\sigma-1}}, \tag{22}$$

where $\sigma$ is the elasticity of substitution between the different labor inputs. The FOC for $N_{jc}$ yield

$$\log N_{jc} = \theta_c + (\sigma - 1)\log e_{jc} - \sigma \log w_{jc} \tag{23}$$

where $w_{jc}$ are average wages in the cell defined by occupation $j$ and city $c$, and $\theta_c = \sigma \log\left[q_c F_L(K_c, L_c)\, L_c^{\frac{1}{\sigma}}\right]$. We need to account for the endogeneity of $w_{jc}$. Card assumes that the labor supply function for the population of individuals in occupation $j$ in city $c$, denoted by $P_{jc}$, is given by

$$\log\left(\frac{N_{jc}}{P_{jc}}\right) = \varepsilon \log w_{jc}. \tag{24}$$

This is a relationship between the cell-specific wage rate and the cell-specific aggregate employment rate. Imposing an equilibrium condition, solving for the equilibrium wages and employment rates, and assuming that

$$\log e_{jc} = e_j + e_c + e_{jc} \tag{25}$$

(where identification of the fixed effects $e_j$ and $e_c$ requires that the residual term $e_{jc}$ on the right-hand side is some random productivity term) yields the following reduced-form equations:

$$\log w_{jc} = u_j + u_c + d_1 \log f_{jc} + \mu_{jc} \tag{26}$$

$$\log\left(\frac{N_{jc}}{P_{jc}}\right) = v_j + v_c + d_2 \log f_{jc} + \eta_{jc}, \tag{27}$$

where $f_{jc} = \frac{P_{jc}}{\sum_j P_{jc}} \equiv \frac{P_{jc}}{P_c}$. These are cross-sectional fixed effects regression equations for equilibrium labor market outcomes that condition on occupation-specific effects that are common across cities (the $u_j$ and $v_j$), essentially controlling for the possibility that there are systematic differences in labor market outcomes across occupation groups, and on city-specific effects that are common across occupations (the $u_c$ and $v_c$), essentially controlling for the possibility that there are systematic differences in labor market outcomes across cities. The coefficients of interest are $d_1$ and $d_2$, which are the effects of an increase in the relative occupation-city specific labor supply on equilibrium labor market outcomes. These effects are predicted to be negative. In fact, it can be shown that the relationships between the reduced-form labor supply effects and the structural parameters are given by $d_1 = -\frac{1}{\varepsilon+\sigma}$ and $d_2 = -\frac{\varepsilon}{\varepsilon+\sigma}$.
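The mapping from the structural parameters to $d_1$ and $d_2$ can be sketched by equating demand (23) and supply (24). The steps below track only the terms that vary with $\log f_{jc}$; substituting (25) for $\log e_{jc}$ then sorts the remaining terms into occupation effects, city effects, and the error term.

```latex
\begin{align*}
\text{Supply (24):}\quad
  & \log N_{jc} = \log P_{jc} + \varepsilon \log w_{jc} \\
\text{Demand (23):}\quad
  & \log N_{jc} = \theta_c + (\sigma-1)\log e_{jc} - \sigma \log w_{jc} \\
\Longrightarrow\quad
  & \log w_{jc} = \frac{1}{\varepsilon+\sigma}
      \left[\theta_c + (\sigma-1)\log e_{jc} - \log P_{jc}\right] \\
\text{With } \log P_{jc} = \log f_{jc} + \log P_c:\quad
  & \log w_{jc} = \frac{\theta_c - \log P_c}{\varepsilon+\sigma}
      + \frac{\sigma-1}{\varepsilon+\sigma}\log e_{jc}
      - \frac{1}{\varepsilon+\sigma}\log f_{jc}
  \;\Longrightarrow\; d_1 = -\frac{1}{\varepsilon+\sigma} \\
\log\left(\frac{N_{jc}}{P_{jc}}\right) = \varepsilon \log w_{jc}
  \quad\Longrightarrow\quad
  & d_2 = \varepsilon\, d_1 = -\frac{\varepsilon}{\varepsilon+\sigma}.
\end{align*}
```

Both coefficients are negative whenever $\varepsilon > 0$ and $\sigma > 0$, which is the sign prediction stated in the text.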

Card is interested in estimating the effect of an increase in $f_{jc}$ on labor market equilibrium outcomes due to immigrant inflows. To understand the empirical content of the model it is instructive to note that one can use the reduced-form wage equations to derive within-city relative wages across occupations:

$$\log\left(\frac{w_{jc}}{w_{j'c}}\right) = u_j - u_{j'} + d_1 \log\left(\frac{P_{jc}}{P_{j'c}}\right) + \mu_{jc} - \mu_{j'c}. \tag{28}$$

This highlights that everything here is about relative supplies. Given imperfect substitution of labor across occupations, an increase in immigrant inflows disproportionately affects the skill group among natives that is competing with the immigrants, i.e. the skill group $j$ in city $c$ for which $P_{jc}$ increases. Thus, the skill composition of immigrant inflows matters for the wage structure within a city. If one does not take into account the importance of this skill composition, the estimated effect of immigrant inflows might be attenuated towards zero because natives that belong to other skill groups are not in "perfect competition" with the immigrants.

To estimate the effect of an increase in $f_{jc}$ due to immigration, one still needs to address the two issues mentioned in the introduction of this subsection: Natives might move away from a city in response to an inflow of immigrants into their skill groups, and immigrant inflows are potentially endogenous to the extent that they are a reaction to local demand shocks. The latter would imply that $\log f_{jc}$, $\mu_{jc}$ and $\eta_{jc}$ are correlated, thus violating the assumption necessary for consistency of the regression estimates. Card spends a lot of time presenting evidence that immigrant inflows have not been offset by an outflow of natives. In other words, $f_{jc}$ increases significantly due to an inflow of immigrants.

Addressing endogeneity is more difficult. Card constructs an Instrumental Variable for $f_{jc}$ as follows. Let $M_g$ be the number of immigrants from source country $g$ entering the US sometime between 1985 and 1990, where 1990 is the sample year (the CPS and Census include a variable that asks whether the respondent entered the US sometime during the last 5 years, so $M_g$ can be computed from the data). Also let $\tau_{gj}$ be the nationwide fraction of recent immigrants from country $g$ (those arriving between 1985 and 1990) working in occupation $j$, and let $\lambda_{gc}$ be the fraction of an earlier cohort of immigrants from country $g$ who are observed in city $c$ prior to 1985. Then, based on historical settlement patterns, it is predicted that $\lambda_{gc}\,\tau_{gj}\,M_g$ immigrants from country $g$ move to city $c$ and work in occupation $j$. The "supply-push component" of recent immigrants into city $c$ and occupation $j$ is given by

$$SP_{jc} = \sum_g \lambda_{gc}\,\tau_{gj}\,M_g. \tag{29}$$
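The construction in (29) is a simple weighted sum, which the following sketch spells out for two source countries, two cities, and two occupations. All labels and numbers are hypothetical and serve only to show the mechanics.

```python
# Hypothetical inputs (all numbers are assumptions, not Census data)
M = {"A": 1000.0, "B": 500.0}                    # recent arrivals by source country g
lam = {("A", "c1"): 0.7, ("A", "c2"): 0.3,       # earlier cohort's city shares, lambda_gc
       ("B", "c1"): 0.2, ("B", "c2"): 0.8}
tau = {("A", "j1"): 0.6, ("A", "j2"): 0.4,       # nationwide occupation shares, tau_gj
       ("B", "j1"): 0.1, ("B", "j2"): 0.9}

cities = ["c1", "c2"]
occupations = ["j1", "j2"]


def supply_push(j, c):
    """SP_jc = sum over g of lambda_gc * tau_gj * M_g, as in (29)."""
    return sum(lam[(g, c)] * tau[(g, j)] * M[g] for g in M)


SP = {(j, c): supply_push(j, c) for j in occupations for c in cities}
for cell, value in sorted(SP.items()):
    print(cell, round(value, 1))
```

Because the city shares and the occupation shares each sum to one within a source country, the predicted inflows add up to the total number of recent immigrants (1,500 in this example).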

Card uses this variable as an instrument for $f_{jc}$. Remember that for an IV to be valid, the following two assumptions need to be satisfied: (i) $cov(f_{jc}, SP_{jc}) \neq 0$; (ii) $cov(\mu_{jc}, SP_{jc}) = cov(\eta_{jc}, SP_{jc}) = 0$. The first assumption requires the IV to have predictive power for the endogenous variable, and the second assumption is that the IV is uncorrelated with the unobservables. The first assumption is testable using a test for significance of the instrumental variable in the first-stage regression of the 2SLS procedure

$$\log f_{jc} = \delta\, SP_{jc} + \psi_j + \psi_c + \psi_{jc}. \tag{30}$$

Note that we need to include all regressors (other than the endogenous variable itself) on the right hand

side of the equation since otherwise the 2SLS-estimator will be inconsistent. Assumption (i) is satisfied

if δ̂ is significantly different from zero. Card shows this to be the case. He also shows that there is no

significant effect of SPjc on native outflows. The second assumption is not testable, since we do not

observe µjc and ηjc. The supply-push component of immigration into cell (j, c) is a valid instrument if

total inflows of immigrants into the US (the Mg's) and the nationwide skill compositions of immigrant

groups (the τgj's) are not related to local demand shocks in one particular city, and if the same is true for

historical immigration patterns. The latter part is somewhat problematic since local demand shocks might

be serially correlated: A city that faces mean-reverting good demand shocks might have experienced the

shock some time ago, which in turn might have influenced migration patterns in the not-so-distant past.

Card draws the following conclusions from estimating d1 and d2: First, immigration significantly re-

duces employment rates of natives conditional on city and occupation fixed effects. The IV-estimates

are significantly larger, consistent with demand shocks driving at least some of the immigration patterns.

Second, the upper bound on the effect on relative wages is a reduction of 3 percent for the occupation-city

cell that has experienced the largest immigrant inflows. He argues that although effects are significant,

they are quantitatively quite small.

Some Comments on IV-estimation. Before the early 1990s, when the IV-estimator became widely used

in applied econometrics, the instrumental variable estimator of e.g. d1 would have been interpreted as the

effect of an increase in labor supply on equilibrium wages. As proven by Imbens and Angrist (1994) this

interpretation as an Average Treatment Effect (ATE) for the whole population is wrong. Under a

monotonicity assumption, i.e. δ > 0, dIV1 needs to be interpreted as the Local Average Treatment

Effect (LATE), i.e. the average effect of "treatment" on those who change state in response to a change

in the instrument. Here it simply means that dIV1 measures the effect of an increase of occupation-specific

labor supply due to an exogenous increase in immigration into (j, c). Note that the "L" in "LATE" comes

in here because of the second part of the sentence, i.e. the one starting with "due to ...". Because it

has become so important in empirical work, let me write out the interpretation of ATE and LATE in the

context of this paper:

• ATE: labor supply effect on equilibrium labor market outcomes.

• LATE: labor supply effect on equilibrium labor market outcomes due to immigration.

Looking at equations (26) and (27) there is in fact no mention of immigration at all. Immigration

comes in through the instrumental variable. With the LATE-interpretation, dIV1 is the labor market

effect of immigration rather than the labor market effect of an increase in labor supply - exactly what Card

is interested in. So here we have a case where we explicitly rely on the LATE-interpretation to answer the

main empirical question of the paper - the fact that dIV1 is an estimate of the LATE rather than the ATE

is exactly what we want here! This is not always the case when relying on IV. For example, the literature

that estimates returns to schooling often uses variation in compulsory schooling laws as instruments. In

this literature we want to know the ATE, the average effect of schooling on earnings, but what we get is

the LATE to be interpreted as the returns to schooling for those who are induced to take one more year of

school due to compulsory schooling laws. Thus, the LATE is specific for a small subset of the population,

and it is unclear if this is what we are interested in.

The second note about IV-estimation is about assumption (i) for the validity of an instrument. What

happens if we have a weak instrument, i.e. an instrument that is only weakly correlated with the

endogenous variable? Consider the following bivariate regression model of yi on xi:

yi = β0 + β1 ∗ xi + εi. (31)

The OLS-estimator of β1 is given by

$$\hat{\beta}_1^{OLS} = \beta_1 + \frac{\mathrm{cov}(x_i, \varepsilon_i)}{\mathrm{var}(x_i)} \tag{32}$$

and we have an endogeneity problem if cov(xi, εi) ≠ 0. If we have an instrumental variable zi we get the

IV-estimator

$$\hat{\beta}_1^{IV} = \frac{\mathrm{cov}(y_i, z_i)}{\mathrm{cov}(x_i, z_i)} = \beta_1 + \frac{\mathrm{cov}(z_i, \varepsilon_i)}{\mathrm{cov}(x_i, z_i)}. \tag{33}$$

In large samples and with assumption (ii), $\hat{\beta}_1^{IV}$ is consistent while $\hat{\beta}_1^{OLS}$ is not. However, if there is any

small correlation between zi and εi left, the inconsistency of the IV-estimator, $\beta_1 - \hat{\beta}_1^{IV}$, can actually be

larger than the inconsistency in the OLS-estimator. This is likely if cov(xi, zi) is very small. There is a

highly technical literature in econometrics that addresses this problem. Mostly, this is a problem in small

samples, where the sample covariance of zi and εi can be nonzero even if assumption (ii) holds. Indeed, the IV-estimator is consistent

(a large sample property), but it is NOT unbiased (a property independent of sample size), and it has been

shown that the bias can be substantial if the instrument is weak. Hence, it has now become absolutely

crucial in empirical work to convince the reader that the instrument is strong. The "magical value" for

judging whether the instrument is strong is an F-statistic of at least 10 in the first stage. This rule of thumb goes

back to work by Staiger and Stock and by Stock and Yogo.
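The two estimators in (32) and (33) and the first-stage strength check can be illustrated with a small simulation. This is only a sketch under assumed values: the data generating process, the true slope of 2, the seed and all function names are hypothetical and chosen for illustration. The last two helpers evaluate the large-sample inconsistency terms from (32) and (33) directly, to show how a weak first stage magnifies even a small violation of assumption (ii).

```python
import numpy as np

def ols_slope(y, x):
    """Sample analogue of cov(x, y) / var(x)."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

def iv_slope(y, x, z):
    """Sample analogue of cov(y, z) / cov(x, z), as in (33)."""
    return np.cov(y, z)[0, 1] / np.cov(x, z)[0, 1]

def first_stage_F(x, z):
    """F-statistic on z in a regression of x on a constant and z.
    With a single instrument it equals the squared first-stage t-statistic."""
    r2 = np.corrcoef(x, z)[0, 1] ** 2
    return r2 / (1.0 - r2) * (x.size - 2)

def simulate(pi, n=20000, beta1=2.0, seed=0):
    """Hypothetical DGP; pi is the first-stage coefficient (instrument strength)."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                 # instrument, independent of the errors
    u = rng.normal(size=n)                 # unobservable that drives both x and y
    x = pi * z + u + rng.normal(size=n)    # endogenous regressor
    eps = u + rng.normal(size=n)           # structural error, cov(x, eps) = 1
    y = 1.0 + beta1 * x + eps
    return y, x, z

def ols_inconsistency(cov_x_eps, var_x):
    """Second term of (32)."""
    return cov_x_eps / var_x

def iv_inconsistency(cov_z_eps, cov_x_z):
    """Second term of (33)."""
    return cov_z_eps / cov_x_z

# Strong instrument: IV should be close to the true slope, OLS biased upwards.
y, x, z = simulate(pi=1.0)
b_ols, b_iv, F_strong = ols_slope(y, x), iv_slope(y, x, z), first_stage_F(x, z)

# Weak instrument: same DGP, but z barely moves x.
y_w, x_w, z_w = simulate(pi=0.02)
F_weak = first_stage_F(x_w, z_w)
```

With a tiny first-stage covariance, a violation of assumption (ii) that looks negligible can dominate: for instance `iv_inconsistency(0.02, 0.05)` exceeds `ols_inconsistency(0.3, 1.0)`.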

2.4.3 Borjas (2003)

Borjas argues that the spatial approach adopted by Card (2001) cannot rule out that small effects of

immigration on local wages are driven by factor-price equalization across cities, even in light of the evidence

that immigrant inflows into cities are not perfectly offset by native outflows. Firms might quickly adjust

their wage payments to natives to avoid major outflows. Borjas uses the CES-approach from Card and

Lemieux (2001) instead. The idea is that on a national level, immigration does shift the aggregate labor

supply curve because outflows of natives from one city are inflows of natives in another (unless natives

emigrate from the US altogether in response to immigration, which is unlikely). However, aggregating up

the CES-production function with only one type of labor means that the only variation in wages and labor

supply takes place across time. This is problematic for identification. Borjas follows Card and Lemieux

(2001) and assumes that skill groups defined by education and experience are not perfect substitutes

in production. Let t index time, e index education, and j index experience. Also let Qt,Kt and Lt be

aggregate output, capital and labor supply. Assume the following structure:

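$$Q_t = \left[\lambda_{Kt} K_t^{\upsilon} + \lambda_{Lt} L_t^{\upsilon}\right]^{1/\upsilon} \tag{34}$$

$$L_t = \left[\sum_{e} \theta_{et} L_{et}^{\rho}\right]^{1/\rho} \tag{35}$$

$$L_{et} = \left[\sum_{j} \alpha_{ej} L_{ejt}^{\eta}\right]^{1/\eta}. \tag{36}$$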

Capital and labor are imperfect substitutes, and so are employees with different education. Within education

groups, experience groups are not perfectly substitutable. This generates log-wage equations

$$\log w_{ejt} = \log\lambda_{Lt} + (1-\upsilon)\log Q_t + (\upsilon-\rho)\log L_t + \log\theta_{et} + (\rho-\eta)\log L_{et} + \log\alpha_{ej} + (\eta-1)\log L_{ejt}, \tag{37}$$

generating a reduced-form equation

$$\log w_{ejt} = \delta_t + \delta_{et} + \delta_{ej} - \left(\frac{1}{\sigma_X}\right)\log L_{ejt} \tag{38}$$

where $\sigma_X = \frac{1}{1-\eta}$. Note that we have imposed that within an education group, the productivities of the

different experience groups αej are constant over time. This is an assumption required for identification,

since the education-specific productivities θet are allowed to vary across time. There is still an identification

problem because δt and δet are not separately identified. Noting that δet = log θet+ (ρ−η) logLet clarifies

that we can achieve identification by assuming that log θet follows a linear trend.1
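The log-wage equation (37) can be checked numerically against the nested CES structure in (34)-(36): the marginal product of each education-experience cell, computed by finite differences, should coincide with the right-hand side of (37). The sketch below does this for a single period. All parameter values, cell sizes and names (`params`, `aggregates`, `log_wage`) are made-up assumptions for illustration and are not Borjas's estimates.

```python
import numpy as np

def aggregates(L_ej, K, p):
    """Nested CES aggregates (34)-(36) for one period; L_ej has shape (E, J)."""
    L_e = (p["alpha"] * L_ej ** p["eta"]).sum(axis=1) ** (1.0 / p["eta"])    # (36)
    L = (p["theta"] * L_e ** p["rho"]).sum() ** (1.0 / p["rho"])             # (35)
    Q = (p["lam_K"] * K ** p["ups"]
         + p["lam_L"] * L ** p["ups"]) ** (1.0 / p["ups"])                   # (34)
    return Q, L, L_e

def log_wage(L_ej, K, p):
    """Right-hand side of the log-wage equation (37), cell by cell."""
    Q, L, L_e = aggregates(L_ej, K, p)
    return (np.log(p["lam_L"]) + (1 - p["ups"]) * np.log(Q)
            + (p["ups"] - p["rho"]) * np.log(L)
            + np.log(p["theta"])[:, None]
            + (p["rho"] - p["eta"]) * np.log(L_e)[:, None]
            + np.log(p["alpha"])
            + (p["eta"] - 1) * np.log(L_ej))

def numerical_log_mpl(L_ej, K, p, h=1e-6):
    """Log of dQ/dL_ejt for every cell, by central finite differences."""
    out = np.empty_like(L_ej)
    for idx in np.ndindex(L_ej.shape):
        up, dn = L_ej.copy(), L_ej.copy()
        up[idx] += h
        dn[idx] -= h
        out[idx] = (aggregates(up, K, p)[0] - aggregates(dn, K, p)[0]) / (2 * h)
    return np.log(out)

# Hypothetical parameters: 2 education groups, 3 experience groups.
params = dict(
    alpha=np.array([[0.3, 0.4, 0.3], [0.2, 0.5, 0.3]]),
    theta=np.array([0.4, 0.6]),
    lam_K=0.35, lam_L=0.65,
    eta=0.7, rho=0.5, ups=0.3,
)
L_ej = np.array([[1.0, 2.0, 1.5], [0.8, 1.2, 2.5]])
K = 3.0

gap = np.max(np.abs(log_wage(L_ej, K, params) - numerical_log_mpl(L_ej, K, params)))
sigma_X = 1.0 / (1.0 - params["eta"])   # elasticity of substitution across experience groups
```

In this toy setup `gap` should be numerically negligible, and the coefficient on log Lejt in (38) is −1/`sigma_X` = η − 1.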

Once we have estimated the parameters of this equation, we have estimates of log αej = δej and

η = 1 − 1/σX, which enables us to construct Let from (36) and to run the regression

$$\log w_{et} = \delta_t + \log\theta_{et} - \left(\frac{1}{\sigma_E}\right)\log L_{et} \tag{39}$$

where we still assume that log θet follows a linear trend, and $\sigma_E = \frac{1}{1-\rho}$.

Lejt is potentially endogenous. Borjas instruments it by immigration flows into cells (e, j, t). This gives

us the LATE-estimator of $1/\sigma_X$, the effect of an increase of cell-specific labor supply due to immigration

on log-wages. The IV-estimator might still be biased if the skill-composition of immigration depends on

unobservable demand shocks. This is not unlikely. Borjas argues however that the bias is unambiguously

towards zero, and he thus interprets his IV-estimates as a lower bound on the effect of immigration.

Given his estimates, he can compute several types of factor-price elasticities to calculate the effect of

immigration on wages. He finds that the large immigrant influx observed between 1980 and 2000 has

significantly decreased wages of natives. The effects differ across education groups: 8.9 percent for high-school

dropouts, 4.9 percent for college graduates, 2.6 percent for high-school graduates and almost no effect for those

with some college. The effect is much smaller when using the spatial approach instead.

1 A different approach is to drop δt from the equation and let time effects vary freely across education groups. The identification assumption adopted here is needed for estimation of the next equation, i.e. the wage equations that are aggregated to the (e, t)-cell.

3 Intertemporal Labor Demand

3.1 The Basic Neo-Classical Framework

We now discuss the intertemporal version of the simple labor demand model. Given that labor demand

models are commonly based on a production function F (L,K) that uses both labor and capital as inputs,

and since in the absence of complete depreciation a decision maker can use capital to shift resources across

periods, one needs to be specific about the ownership and market structure of the economy, and in particular

about who owns the capital and assets. Following the standard approach we assume:

• Households own all factors of production and collect profits.

• Households make decisions about capital accumulation through savings.

• Firms hire capital at rental prices rt and labor at wage rate wt and sell their output at a price of 1.

• Firms are risk-neutral maximizers of expected future discounted profits.

In contrast to the static model, in which time does not have meaning, capital depreciates in an in-

tertemporal setting. It does so at rate δ. In period t, the firm hires labor and capital, pays wage rate wt

and interest rate rt to the household and needs to cover the depreciation cost δ ∗ Kt at the end of the

period. The cost of capital is thus (rt + δ), and the firm’s maximization problem becomes

$$\max_{\{L_t, K_t\}} E\left\{\sum_{t=0}^{T} \beta^t \left[F(L_t, K_t) - w_t L_t - (r_t + \delta) K_t\right]\right\}. \tag{40}$$

Since no lags or leads of Lt and Kt enter period-t profits, there is no intertemporal link in this optimization

problem. Hence, it collapses to a sequence of (T + 1) static maximization problems where each static

maximization problem coincides with the problem discussed in the previous section. In particular, we

can still use the first-order conditions $F_{L,t} = w_t$ and $F_{K,t} = r_t + \delta$ and thus the tangency condition

$F_{L,t}/F_{K,t} = w_t/(r_t + \delta)$. As a theory of intertemporal labor demand this is not very interesting since it misses

many of the stylized facts about labor flows into and out of firms and about adjustments of average hours

within firm. However, as we will see in the section on labor market equilibrium, it is a very convenient

theory to study trends and dynamics of the wage structure at a fairly aggregate level. For now let’s turn

to a decision-theoretic model of labor demand that introduces an intertemporal dimension into the firms’

problem.
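The collapse of (40) into a sequence of static problems can be illustrated with a short sketch: for a given path of factor prices, each period's optimum is found from that period's first-order conditions alone. The Cobb-Douglas technology with decreasing returns, the parameter values and the price paths are assumptions made purely for illustration.

```python
import numpy as np

def static_demand(w, r, A=1.0, a=0.6, b=0.3, delta=0.1):
    """Per-period optimum (L_t, K_t) for the assumed F(L, K) = A * L**a * K**b, a + b < 1.

    Solves F_L = w and F_K = r + delta in closed form. Inputs may be arrays
    holding a whole path of prices; each period is solved independently.
    """
    w, r = np.asarray(w, dtype=float), np.asarray(r, dtype=float)
    k = (b / a) * w / (r + delta)                         # K/L from the tangency condition
    L = (w / (a * A * k ** b)) ** (1.0 / (a + b - 1.0))   # from F_L = w
    return L, k * L

# Hypothetical price paths over three periods.
w_path = np.array([1.0, 1.1, 0.9])
r_path = np.array([0.04, 0.05, 0.03])
L_path, K_path = static_demand(w_path, r_path)

# Marginal products at the solution, to be compared with factor prices.
F_L = 0.6 * L_path ** (0.6 - 1.0) * K_path ** 0.3
F_K = 0.3 * L_path ** 0.6 * K_path ** (0.3 - 1.0)
```

No value function or expectation is needed here: changing the prices in one period leaves the solution in every other period untouched.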

3.2 Models with Adjustment Costs

The basic neo-classical model of labor demand treats labor markets as "spot markets" where, at the market

wage, any quantity of labor can be hired and fired at any point in time. This is arguably an unrealistic

assumption: Hiring and firing workers is likely to be a costly and time consuming process. In fact, there

is considerable evidence from firm-level panel data against the spot market model of labor markets. In

particular, there are interesting and rich dynamics in observed firm behavior with respect to the number

of workers hired and fired and the average hours worked which are difficult to reconcile with a simple

frictionless labor demand model. For example, periods of little recruitment activity by the firm are often

followed by periods in which a firm becomes very active in hiring or firing workers. Often, firms adjust

their labor demand by changing their number of employees rather than changing the number of hours

worked per employee. However, as only the total quantity of labor demanded matters for a firm in the

neo-classical model, the difference between hours worked per employee and the number of employees does

not have a meaning in this model.

We now discuss a simple extension of a neo-classical labor demand model that is consistent with many

of the empirical facts just described. In this extension, hiring and firing is a costly process.2 The firm

maximizes the expected discounted sum of future profits, and period-t profits are given by Πt = Yt−Wt−Ct,

where Yt is output, Wt is the total wage bill, and Ct are the costs associated with adjusting the number of

employees in a firm, denoted et. With ht denoting the number of hours worked per employee, the firm’s

problem is given by

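$$\max_{\{e_t, h_t\}} E\left\{\sum_{t=0}^{T} \beta^t \Pi_t\right\} \tag{41}$$

$$\text{s.t.}\quad \Pi_t = Y_t - W_t - C_t, \quad Y_t = F(A_t, e_t, h_t), \quad W_t = W(A_t, e_t, h_t), \quad C_t = C(e_t, e_{t-1}); \quad e_0 \text{ given}.$$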

Note that the total number of hours worked in the firm is Lt = et ∗ ht, but the total wage bill Wt

depends directly on et and ht rather than the composite Lt. This specification allows for non-linearities in

wage schedules, such as overtime wages. Furthermore, adjustment costs Ct are a function of et and et−1,

meaning that the firm needs to incur some costs when changing its number of employees from one period

to the next by hiring (et > et−1) or firing (et < et−1) workers.

In period t, et−1 affects current profits, but cannot be influenced anymore, and thus acts as a

state-variable in a dynamic decision problem. Likewise, the choice of et will affect future profits of the

firm, thus introducing an intertemporal link in labor demand. As a consequence, the firm needs to solve

a Dynamic Programming Problem. Assuming that T → ∞ and that At follows some stationary Markov

process with transition density Γ (At) the Bellman Equation is given by

2 This section closely follows the presentation of labor demand in the book by Adda and Cooper.

$$V(A, e_{-1}) = \max_{e,h}\left\{F(A, e, h) - W(A, e, h) - C(e, e_{-1}) + \beta E\left[V(A_{+1}, e) \mid A\right]\right\}. \tag{42}$$

To gain some intuition for the properties of the model we assume that adjustment costs follow the

quadratic function $C(e, e_{-1}) = \frac{\eta}{2}\left[e - (1-q)e_{-1}\right]^2$, where η is a scale parameter and q is the fraction of

employees that leave the firm from one period to the next for exogenous reasons. The policy functions

for the endogenous variables e, h are functions of the state variables A, e−1 and satisfy the two first-order

conditions

$$F_h(A, e, h) - W_h(A, e, h) = 0$$

$$F_e(A, e, h) - W_e(A, e, h) - \eta\left[e - (1-q)e_{-1}\right] + \beta E\left[V_e(A_{+1}, e) \mid A\right] = 0.$$

By the Envelope Theorem, we have

$$V_{e_{-1}}(A, e_{-1}) = -C_{e_{-1}} = \eta(1-q)\left[e - (1-q)e_{-1}\right]$$

$$\Rightarrow \quad V_e(A_{+1}, e) = \eta(1-q)\left[e_{+1} - (1-q)e\right]$$

Combined with the FOC for e we get

$$F_e(A, e, h) - W_e(A, e, h) - \eta\left[e - (1-q)e_{-1}\right] + \beta E\left[\eta(1-q)\left[e_{+1} - (1-q)e\right] \mid A\right] = 0. \tag{43}$$

Note that the "e" are policy functions, i.e. decision rules that map the current state into an optimal

decision. To make this explicit, we need to write e(A, e−1), and thus the expectation in this equation is

over A+1, which enters the policy function e+1 = e(A+1, e). Also note that this equation is nothing other than an Euler equation.

Now let's consider two special cases of the model with quadratic adjustment costs. In the first, we let

η = 0, W(A, e, h) = e ∗ w̃(h) and F(A, e, h) = F̃(A, e ∗ h) = A ∗ F̃(L). The two FOC's reduce to

$$e\,A\,\tilde{F}'(L) = e\,\tilde{w}'(h)$$

$$h\,A\,\tilde{F}'(L) = \tilde{w}(h)$$

and hence $1 = h\,\tilde{w}'(h)/\tilde{w}(h)$. This equation does not depend on A and, with some further assumptions on the

function w̃(h), is satisfied by some constant h. In other words, the firm does not adjust hours worked per

employee. Rather, to accommodate changing business cycle conditions, the firm only varies its number

of employees, e. The only difference from the standard neo-classical model is the non-linearity of w̃(h).
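As a worked illustration of this first case, take a hypothetical wage schedule with a fixed payment per employee plus a convex payment in hours (the functional form and the parameters w_0, w_1 > 0 are assumptions chosen only because they give a closed form):

```latex
\tilde{w}(h) = w_0 + w_1 h^2
\quad\Rightarrow\quad
\frac{h\,\tilde{w}'(h)}{\tilde{w}(h)} = \frac{2 w_1 h^2}{w_0 + w_1 h^2} = 1
\quad\Rightarrow\quad
h^{*} = \sqrt{\frac{w_0}{w_1}} .

% Employment then follows from the hours FOC, A \tilde{F}'(L) = \tilde{w}'(h^{*}):
A\,\tilde{F}'(e\,h^{*}) = 2 w_1 h^{*} = 2\sqrt{w_0 w_1} .
```

Under this assumed schedule, optimal hours depend only on the wage parameters, while A enters only the condition that pins down e, so all adjustment to productivity shocks runs through the number of employees.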

In the second case, we assume η > 0, q = 0, W(A, e, h) = e ∗ h and F(A, e, h) = A ∗ F̃(L). Now the

FOC become

$$e\,A\,\tilde{F}'(L) = e$$

$$h\,A\,\tilde{F}'(L) = h + \eta\left[e - e_{-1}\right] - \beta E\left[\eta\left[e_{+1} - e\right] \mid A\right].$$

The first equation implies that A ∗ F̃ ′ (L) = 1, so that the second equation becomes η [e− e−1] − β ∗

E [η [e+1 − e] |A] = 0. The latter is satisfied at e− e−1 = 0 (implying e+1 − e = 0). In this case e is kept

constant while all the adjustment to fluctuations in total factor productivity takes place at the intensive

margin, i.e. with respect to hours worked per employee h.

While the model with quadratic adjustment costs is a useful starting point and presents the basic logic of

models with adjustment costs, it still cannot produce the discontinuous labor demand behavior observed

in the data. The standard in the literature has become to assume that hiring and firing is associated

with fixed costs rather than a continuous adjustment cost function. Notice how the assumptions on fixed

costs for adjusting the number of workers and the non-linearity of the wage schedule are reminiscent of the

assumptions in Erosa, Fuster and Kambourov (2016) on intertemporal labor supply. Just as in the labor

supply case, to explain movements at the intensive and extensive margins of the endogenous decisions one

needs these model ingredients.

To say anything more about the model without making strong parametric restrictions on the production

function, the adjustment cost function and the total wage bill, one needs to use numerical methods. There

has been some work on estimation of models with adjustment costs, for example by Cooper and Haltiwanger

or Nicolas Roys. I won’t go over these studies here. Instead, I will briefly cover the empirical evidence

on government mandated work-sharing programs, which can be interpreted in the context of the labor

demand model above.
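To give a flavor of the numerical methods mentioned above, here is a minimal value function iteration sketch for the Bellman equation (42). It is deliberately stripped down and rests on assumptions that are not in the text: hours are held fixed at h = 1, output is A e^α, the wage bill is linear in e, adjustment costs take the quadratic form used above, and A follows a two-state Markov chain. All parameter values and names are hypothetical.

```python
import numpy as np

def solve_labor_demand(eta, q=0.0, alpha=0.6, wage=1.0, beta=0.95,
                       A_vals=(0.9, 1.1), P=((0.8, 0.2), (0.2, 0.8)),
                       n_grid=96, tol=1e-8, max_iter=5000):
    """Value function iteration for V(A, e_prev) with hours fixed at h = 1.

    Returns the employment grid, the value function V[a, i_prev] and the
    policy e(A, e_prev) evaluated on the grid.
    """
    A = np.asarray(A_vals, dtype=float)
    P = np.asarray(P, dtype=float)            # P[a, a'] = Prob(A' = a' | A = a)
    e_grid = np.linspace(0.05, 1.0, n_grid)
    # Output minus wage bill for every (A, e): shape (nA, n_grid)
    profit = A[:, None] * e_grid[None, :] ** alpha - wage * e_grid[None, :]
    # Quadratic adjustment cost for every (e_prev, e): shape (n_grid, n_grid)
    adj = 0.5 * eta * (e_grid[None, :] - (1.0 - q) * e_grid[:, None]) ** 2
    V = np.zeros((A.size, n_grid))
    for _ in range(max_iter):
        EV = P @ V                            # EV[a, i] = E[V(A', e_i) | A = a]
        # obj[a, i_prev, i]: payoff of choosing e_i in state (A_a, e_prev)
        obj = profit[:, None, :] - adj[None, :, :] + beta * EV[:, None, :]
        V_new = obj.max(axis=2)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    policy = e_grid[obj.argmax(axis=2)]
    return e_grid, V, policy

e_grid, V_free, pol_free = solve_labor_demand(eta=0.0)   # frictionless benchmark
_, V_adj, pol_adj = solve_labor_demand(eta=5.0)          # costly adjustment
```

With `eta=0` the policy should not depend on last period's employment, which is the static model again. With `eta > 0` the chosen employment level moves only part of the way from the inherited level towards the frictionless target, so the policy is increasing in `e_prev`. Replacing the quadratic cost with a fixed cost, as the literature does, would only require changing the `adj` array.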

3.2.1 Application: Government Mandated Hours Reduction Programs

Which policies may be effective in reducing high unemployment rates? Periodically, unions and social

democrats have suggested the implementation of so called "work sharing programs", that is government

mandated hours of work restrictions, in periods of high unemployment. Such policies consist of an upper

limit on the regular hours of work per week. Workers exceeding this limit must be paid an overtime wage

rate, which is usually also set by the government. Oftentimes, firms are forced to increase wages such that

monthly earnings remain unchanged even though hours of work have decreased. The simultaneous change

of the limits on hours of work and the wage rate makes policy evaluation of such programs difficult. This

applies to both empirical studies briefly discussed below.

The idea of work sharing programs is that if one forces a firm to decrease the hours of work per week it

demands from its workers, then the firm will fill the "missing hours" by hiring additional workers. This is

a very static and mechanical view of an economy according to which the government just needs to turn a

few institutional levers and everything falls nicely into place. Yet, if the wage rate is held constant (which

it was not in the programs evaluated in the papers below), it is consistent with the simplest version of the

static neoclassical labor demand model as discussed at the beginning of this section. Indeed, with L = e∗h

the production function in (1) does not depend on (e, h) conditional on L. Hence, for an optimal level of

labor inputs L∗ the firm is indifferent between any combinations (e, h) such that L∗ = e ∗ h.

This doesn’t seem to be a very realistic view of the labor market. Can it be that easy to reduce the

unemployment rate? No, if we deviate slightly from this simple model. As discussed above, there are at

least three factors that generate a meaningful distinction between the number of workers and the hours

per worker. First, (e, h) may need to be treated as two distinct inputs. Second, the total wage bill W (e, h)

may not factor into w∗e∗h for a fixed hourly wage w. And third, there may be adjustment costs for hiring

and firing workers, C (et, et−1). As a consequence, the employment effects of a work sharing program

are unclear and depend on the shapes of these functions. Furthermore, there will be transition dynamics

because of the hiring and firing costs. It may take time until the firm will have adjusted to the new steady

state level of employment.

A few points can be made, however. Assume that W (e, h) remains stable in response to the imple-

mentation of the policy, ruling out general equilibrium effects or bargaining over the wage schedule. Also

assume that initially the firm made an unconstrained optimal choice, say (e∗, h∗), and abstract from capital

inputs. To continue producing the initial output the total wage bill per worker increases unambiguously

holding constant e because the firm needs to pay more workers the overtime wage premium. Alternatively,

the firm needs to pay additional hiring costs because it needs a higher e holding constant h. Hence, the

costs of producing the same level of output increase. There will thus be a scale effect, tending to reduce

both e and h. The marginal costs of hours and workers change as well, so there will be substitution effects.

These will depend on the shape of the technologies and of the wage schedule. If the output technology

favors longer hours per worker, for example because of human capital effects or because many workers

with few hours generate communication overload, and if overtime premia are not too high, then the firm

will tend to substitute towards more hours per worker. Furthermore, in a dynamic model a firm may be

reluctant to hire additional workers because hiring and firing - and thus any dynamic adjustments to

exogenous shocks in the future - may be costly. On the other hand, if the number of workers and hours

per worker are closely substitutable, and if overtime premia are large relative to adjustment costs, then the

firm will substitute towards more employees. Whether the scale effect is strong, and how the substitution

effect plays out is an empirical question. So let’s see what the evidence says.

Hunt (1999, QJE) and Crepon and Kramarz (2002) study work sharing programs in Germany and

France, respectively, both of which were legislated in the 1980’s. The two papers use DiD-type research

designs. Hunt (1999) exploits the fact that work sharing programs were not introduced at the same time in

different industries, so that the two levels of differencing are the industry and the time period. Crepon and

Kramarz (2002) use groups that tend to work relatively few hours absent any hours of work restrictions as

"control", so that differencing is across groups and time. Both papers find that hours worked per worker

and employment decreased in response to the policy. At the same time, monthly earnings did not decrease.

Hence, those who have jobs did better after the policy, but there are fewer people with jobs. The policy

thus did exactly the opposite of what it was intended to do.

4 Labor Market Flows and Labor Demand

4.1 Introduction

The theoretical and empirical studies of labor supply and labor demand we have covered so far have one

major thing in common: Quantities are measured in terms of stocks rather than flows. When we discussed

dynamics we meant the evolution of stocks over time or the life-cycle, but not the underlying flows. This

is an inherent feature of neoclassical models, where flows are undefined. Indeed, a change in, say, the

quantity of labor demanded can be rationalized by infinitely many combinations of inflows and outflows

from the labor force. Neoclassical models have nothing to say about the magnitudes of these flows. We

will therefore need a different formal framework to study flows, such as search models. In this section we

will briefly talk about the measurement of labor market flows and an application to the empirical study

of labor market effects of immigration.

4.2 Estimating Flow Rates in Discrete Time Non-Parametrically

Let’s go over some of the basic concepts for measurement. The first issue to notice is that studying

flows inherently requires panel data. To see this, let E and U denote employment and unemployment

respectively. Index calendar time by t and abstract for simplicity from non-employment. Hence, there are

only two employment states, E and U , which we will index by e. From a worker’s perspective, the flow

rate in discrete time is the probability that she will change the state in two consecutive time periods. For

example, the rate of going from U to E is:

$$\lambda^{UE}_t \equiv P(e_t = E \mid e_{t-1} = U) = \frac{P(e_t = E, e_{t-1} = U)}{P(e_{t-1} = U)}. \tag{44}$$

By definition,

$$P(e_{t-1} = U) = E\left[1_{e_{t-1}=U}\right], \qquad P(e_t = E, e_{t-1} = U) = E\left[1_{e_t=E,\,e_{t-1}=U}\right] \tag{45}$$

where the notation 1j denotes an indicator variable equal to one if event j happens in the population. By

the Law of Large Numbers, with a random sample we can consistently estimate means by sample averages.

The sample analogue of 1j is a dummy variable equal to one if event j happens to a sampling unit and

zero otherwise. Furthermore, the sample average of a dummy is the sample share of those for whom the

event j is recorded. Combining these arguments we can estimate P(et−1 = U) by the expression $N_{U,t-1}/N_{t-1}$,

where N is the total sample size and NU is the number of unemployed individuals, both measured at

time (t − 1). Likewise, P(et = E, et−1 = U) can be estimated consistently using $N_{UE,t-1}/N_{t-1}$. With $N_{UE,t-1}$

denoting the number of individuals changing from unemployment to employment between (t− 1) and t it

is now clear that we need panel data for estimation. In particular, we need to know the employment state

of an individual in two consecutive months. Our consistent estimator of λUEt is then given by

$$\hat{\lambda}^{UE}_t = \frac{N_{UE,t-1}/N_{t-1}}{N_{U,t-1}/N_{t-1}} = \frac{N_{UE,t-1}}{N_{U,t-1}} \tag{46}$$

which is simply the share of those unemployed in the previous period who are employed in the current

period.
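The estimator in (46) amounts to counting transitions in a two-period panel. A minimal sketch, assuming each individual's labor market state is recorded as the string "E" or "U" in two consecutive periods (the function name and the ten-person toy panel are made up for illustration):

```python
import numpy as np

def flow_rate(prev_state, curr_state, origin="U", destination="E"):
    """Share of individuals in `origin` at t-1 who are in `destination` at t."""
    prev_state = np.asarray(prev_state)
    curr_state = np.asarray(curr_state)
    at_risk = prev_state == origin                 # N_{origin, t-1}
    if at_risk.sum() == 0:
        return float("nan")                        # rate undefined without anyone at risk
    movers = at_risk & (curr_state == destination) # N_{origin -> destination, t-1}
    return movers.sum() / at_risk.sum()

# Hypothetical panel: states of the same ten individuals in t-1 and t.
prev = np.array(["U", "U", "U", "U", "E", "E", "E", "E", "E", "E"])
curr = np.array(["E", "U", "U", "E", "E", "E", "U", "E", "E", "E"])

lam_UE = flow_rate(prev, curr)             # job finding rate
lam_EU = flow_rate(prev, curr, "E", "U")   # separation rate
```

The two arrays must be aligned by individual, which is exactly the panel requirement discussed above: two repeated cross sections would give the stocks but not the numerator.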

This is a non-parametric estimator of λUEt because it does not rely on any distributional assumptions.

It does however rely on an assumption of timing, namely that transitions take place in discrete time, and

that we are measuring the relevant time periods correctly in the data. This is, of course, wrong. Panel data

are usually recorded on the annual, quarterly, monthly, or weekly level, but not on a finer level. However,

transitions do happen within years or even weeks, and our estimator will miss them. We therefore need to

assume either that these flows are quantitatively unimportant, or we need to impose additional assumptions

on the DGP (data generating process). Shimer (2012, Review of Economic Dynamics) estimates $\lambda^{UE}_t$ and

$\lambda^{EU}_t$ parametrically by assuming that the DGP is in continuous time and that the waiting time until an event

takes place is exponentially distributed. Furthermore, within a sampling period the rates

are assumed to be constant. In this case, the transition rates are Poisson-arrival rates, and they can be

estimated from observations on labor market flows, such as NUE,t−1 or NEU,t−1. In the context of labor

demand, Davis, Faberman and Haltiwanger (2013, QJE) assume that the DGP is on the discrete daily level

and that flow rates are constant within a month. Again, a flow equation can be used to derive estimation

equations for the rates.
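The link between a discrete-time transition probability and a constant Poisson arrival rate can be sketched as follows. This is a simplified illustration of the mapping only, not the full procedure in Shimer (2012): it ignores, for example, that a worker who finds a job may lose it again before the next interview. The function names are hypothetical.

```python
import math

def hazard_from_probability(p):
    """Constant Poisson rate f per period such that P(at least one event) = p,
    i.e. p = 1 - exp(-f)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must lie in [0, 1)")
    return -math.log1p(-p)

def probability_from_hazard(f, length=1.0):
    """Probability of at least one event within an interval of the given length."""
    return -math.expm1(-f * length)

monthly_prob = 0.30                                   # hypothetical monthly job finding probability
f = hazard_from_probability(monthly_prob)             # implied monthly arrival rate
weekly_prob = probability_from_hazard(f, length=0.25) # implied probability over a quarter of a month
```

The arrival rate exceeds the measured probability because multiple events within a period are counted only once in discrete-time data.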

4.3 The Relationship between Stocks and Flows

So far we have talked about measuring flow rates from the data in a model-free way. Next, let’s come

back to the relationship between stocks and flows. Again abstracting from non-employment we have the

following laws of motion for the employment and unemployment rates:

$$R^E_t = \left(1 - \lambda^{EU}_t\right) R^E_{t-1} + \lambda^{UE}_t R^U_{t-1}$$

$$R^U_t = \lambda^{EU}_t R^E_{t-1} + \left(1 - \lambda^{UE}_t\right) R^U_{t-1} \tag{47}$$

where $R^e_t$ is the population share of individuals in state e in period t. These equations hold for the

population of workers, not for the population of firms. However, the job finding rate λUEt will be positively

related to the job- or vacancy creation rate, and the worker separation rate λEUt will be positively related

to the job destruction rate. If for example the matching process is described by a matching technology

M (V,NU ) that maps the number of vacancies V and the number of unemployed workers NU into the

number of newly created worker-firm matches, then

$$\lambda^{UE}_t = \frac{M(V_{t-1}, N_{U,t-1})}{N_{U,t-1}}. \tag{48}$$

Furthermore, if M (V,NU ) is increasing in both its arguments, then the rate at which jobs are created

is positively related to the number of vacancies, holding constant the number of unemployed individuals.

Hence, vacancy creation is a fundamental economic force determining the rate at which unemployed workers

find jobs. Likewise, the rate at which a firm shuts down jobs will directly affect the rate at which workers

flow from employment to unemployment. This is the lens through which we look at the labor demand side

in the rest of this section.
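For concreteness, the job finding rate in (48) can be sketched with a Cobb-Douglas matching function. The functional form, the parameters `m` and `a`, and the cap that keeps matches below both vacancies and unemployed workers are assumptions for illustration; the text only requires M to be increasing in both arguments.

```python
def job_finding_rate(V, N_U, m=0.5, a=0.5):
    """lambda^UE = M(V, N_U) / N_U for the assumed M = m * V**a * N_U**(1 - a).

    Matches are capped at min(V, N_U) so that the rate stays in [0, 1].
    """
    if N_U <= 0:
        raise ValueError("N_U must be positive")
    matches = min(m * V ** a * N_U ** (1 - a), V, N_U)
    return matches / N_U

# Hypothetical numbers: 100 unemployed workers and a varying number of vacancies.
rate_low_v = job_finding_rate(V=25, N_U=100)
rate_mid_v = job_finding_rate(V=100, N_U=100)
rate_high_v = job_finding_rate(V=225, N_U=100)
```

Holding the number of unemployed fixed, more vacancies raise the job finding rate, which is the channel from the labor demand side to worker flows described above.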

Consider an economy which is in a stationary state. The flow equations above solve for the stationary

state unemployment rate

$$R^U = \frac{\lambda^{EU}}{\lambda^{EU} + \lambda^{UE}}. \tag{49}$$

If we think of λEU as an exogenous job destruction rate and λUE as determined by the matching function

M (V,NU ), then (49) is the Beveridge curve, i.e. the relationship between vacancy rates and unemployment

rates. The effect of a minimum wage on the unemployment rate will then depend on how it affects

the EU- and UE-rates, which in turn depend on the vacancy rate. Furthermore, a minimum wage hike

will start a transition process to a new stationary state, and the speed of adjustment will depend on the

rates.
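The stationary unemployment rate in (49) and the transition towards it can be illustrated by iterating the law of motion (47) with constant flow rates. The flow rates and the starting value below are hypothetical.

```python
def steady_state_unemployment(lam_EU, lam_UE):
    """Stationary unemployment rate from (49)."""
    return lam_EU / (lam_EU + lam_UE)

def simulate_unemployment(R_U0, lam_EU, lam_UE, periods):
    """Iterate the law of motion (47) for the unemployment rate, using R_E = 1 - R_U."""
    path = [R_U0]
    for _ in range(periods):
        R_U = path[-1]
        R_E = 1.0 - R_U
        path.append(lam_EU * R_E + (1.0 - lam_UE) * R_U)
    return path

lam_EU, lam_UE = 0.02, 0.30                 # hypothetical monthly flow rates
R_U_star = steady_state_unemployment(lam_EU, lam_UE)
path = simulate_unemployment(0.20, lam_EU, lam_UE, periods=100)

# The gap to the stationary state shrinks by the factor (1 - lam_EU - lam_UE) each period.
shrink = (path[2] - R_U_star) / (path[1] - R_U_star)
```

The factor `1 - lam_EU - lam_UE` is what governs the speed of adjustment mentioned above: the higher the two flow rates, the faster the economy reaches its new stationary state after a policy change.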

4.4 Empirical Evidence and some Applications

The empirical literature on measuring labor flows is large. Davis, Haltiwanger and Schuh (1998) and

Shimer (2010) have written entire books about it, the former about the firm side and the latter about the

worker side. On the worker side, interest is in the job finding and the job separating rate. They can be

measured from panel data on workers, preferably on the monthly (or even weekly) frequency. The monthly

CPS is the longest running such data set for the US, but has major issues with measurement errors. The

SIPP has higher data quality, but only starting in the 90s. Shimer (2012) studies flows in and out of

unemployment in the monthly CPS and finds that fluctuations in unemployment rates are mainly driven

by fluctuations in job finding rather than job separation rates. That is, unemployment increases during

recessions mostly because firms stop hiring, not because they increase the firing rate.

Davis and Haltiwanger have pioneered the literature on measuring flows from the firm side, starting

their main work in the early 1990s. Data quality is an issue since one needs to have panel data on the

hiring and firing behavior of firms. Furthermore, one would like to have a measure of vacancies in addition

to actual hires. That is, hires and job filling rates measure the actual rather than the desired amount of

hiring. Hence, the vacancy rate is a better measure of recruitment intensity that is relevant for a worker’s

job finding rate. Unfortunately, data on vacancy creation is not all that great. A large part of Davis

and Haltiwanger’s work relies on the US Census of Manufacturing, but with the limitation that vacancy

creation cannot be measured accurately. Their paper with Faberman in the 2013 QJE uses the JOLTS

instead, which is a data set with the purpose of tracking vacancy creation and job destruction on the

firm level. It is not very long running, however. The most interesting result coming out of the paper is

that firm-level employment dynamics are mostly driven by variation in the vacancy creation and vacancy

filling rate. The job destruction rate is much less important. This is fully consistent with Shimer’s findings

from the worker side. There is now growing international evidence that fluctuations in the unemployment

rate are indeed mostly driven by the recruitment activities of firms, not their job destruction behavior.

It seems that workers with some seniority are particularly well shielded from business cycle movements,

with relatively stable jobs. In contrast, young workers and workers close to retirement are those who carry

the largest burden. For example, the job finding rates of young workers fluctuate dramatically over the

business cycle. This has long-lasting effects on their careers, as shown for example by Oreopoulos, von

Wachter and Heisz (2010). The conclusion is that if labor market policy wants to target unemployment

rates it needs to incentivize hiring of young workers rather than making firing more difficult.

4.4.1 Immigration

As a last empirical application of the job-flow approach we revisit the question of how immigration affects

labor market outcomes of natives. The paper we cover here is by Dustmann, Schoenberg and Stuhler

and is forthcoming in the QJE. This is a really nice paper that addresses several of the issues we have

talked about in this section. It uses an experiment similar to the Mariel Boatlift, but with a much

stronger research design. At the beginning of the 1990s, in the context of the breakdown of the Soviet

Union, Germany made a deal with the Czech Republic that allowed Czech workers to cross the German

border to work, but not to live. That is, Czech workers were allowed to accept jobs in Germany,

but they were required to reside in the Czech Republic.24

For a research design this has three attractive features. First, it implies that only German municipalities

close to the Czech border are affected since Czech workers need to commute daily. This makes geographic

distance to the Czech border a valid instrument. Second, there are none of the general equilibrium effects

we talked about in the earlier section on immigration. Since Czech workers are not allowed to reside where

they work - if their job is in Germany - the additional money they earn is, to a large extent, spent in the

Czech Republic, not in Germany. Hence, the goods demand curve in the affected German municipalities is

not likely to shift substantially. Third, the design has external validity because it involved many

municipalities along the German-Czech border, not only one city.

The authors use two different research designs and combinations of the two. The first is a DiD that uses

municipalities close to the border as treatment group and municipalities with similar pre-trends that are far

away from the border as a control group. It is interesting to see how they choose the municipalities in the

control group. They use an algorithm from the machine learning literature that searches for municipalities

that are similar in observable dimensions, where "similar" is defined by some distance criterion. The

second research design is an IV regression, with the distance to the border used as the instrument. This design also

identifies the "treatment intensity". The data are administrative and come from the Social Security

Records. Have a very good look at the data description. These are also the data I am using in my own

research on life-cycle labor market dynamics.
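
To fix ideas about how the two designs relate, here is a minimal sketch on simulated data. It is not the authors' specification: the data generating process, the variable names, the 25 km and 75 km cutoffs, and the effect size `beta` are all assumptions made for illustration. The DiD compares outcome changes in municipalities near and far from the border, while the IV estimate rescales the relationship between distance and outcome changes by the relationship between distance and the commuter inflow, which is what recovers an effect per unit of treatment intensity.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000       # number of simulated municipalities (assumed)
beta = -0.9    # assumed effect of the commuter share on the native outcome

distance = rng.uniform(0.0, 100.0, n)   # km to the border (simulated)
shock = rng.normal(0.0, 0.01, n)        # unobserved local demand shock
# Commuter inflow share: declines with distance and also responds to the shock.
exposure = 0.10 * np.exp(-distance / 30.0) + shock

fixed_effect = rng.normal(0.0, 0.05, n)  # permanent municipality differences
trend = 0.02                             # common time trend
y_pre = fixed_effect
y_post = (fixed_effect + trend + beta * exposure
          + 0.5 * shock + rng.normal(0.0, 0.01, n))

# DiD: change near the border minus change far from the border.
near = distance < 25.0
far = distance > 75.0
did = ((y_post[near].mean() - y_pre[near].mean())
       - (y_post[far].mean() - y_pre[far].mean()))

# IV with distance as the instrument: reduced form divided by first stage.
dy = y_post - y_pre
iv = np.cov(distance, dy)[0, 1] / np.cov(distance, exposure)[0, 1]

# OLS of the outcome change on exposure, biased here by the common shock.
ols = np.cov(exposure, dy)[0, 1] / np.var(exposure, ddof=1)

print(f"DiD (near vs. far): {did:.4f}")
print(f"IV estimate of beta: {iv:.3f}, OLS estimate: {ols:.3f}, assumed beta: {beta}")
```

In this simulation the DiD is a reduced-form effect of being close to the border, whereas the IV estimate is expressed per unit of commuter inflow. The OLS line is included only to show why an instrument is useful when local shocks drive both the inflow and the outcome.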

Results from the two research designs are robust and very similar. The commuting policy had the effect

of slightly reducing native wages and strongly reducing native employment. The reduction in employment

rates came by way of a dramatic decline in job hiring rates. In contrast, job separation rates remained

fairly stable.

At this point you should pause and connect all the empirical evidence. Whether we look at the

aggregate economy, industries, or specific events, there is remarkably consistent evidence that increases in

unemployment rates and decreases in employment rates come by way of lower flow rates into jobs rather

than higher flow rates out of jobs. This even holds across countries that have very different institutional

settings (like the US and Germany). It is the rate at which workers find jobs rather than the rate at which

they separate into unemployment that explains variation in unemployment rates.
