TOUGH LOVE FOR LAZY KIDS - Society for Economic Dynamics · 2009. 2. 15. · TOUGH LOVE FOR LAZY...

TOUGH LOVE FOR LAZY KIDS

Ctirad Slavík, University of Minnesota and Federal Reserve Bank of Minneapolis

Kevin Wiseman, University of Minnesota and Federal Reserve Bank of Minneapolis

[email protected] [email protected]

Abstract. Simple theories about why parents give money to their children fail to explain a

central puzzle in inter-generational transfers: While parents are alive, they give more money

to their poorer children. Bequests, by contrast, are typically divided evenly between children.

We construct a model in which altruistic parents behave this way when facing a dynamic

insurance problem. Parents concentrate incentives later in life, so that poorer children are

partially insured against income shocks early in life, while insurance and incentive motives

offset each other in determining bequests. We show that equal division of bequests can arise

in the presence of small costs of unequal division.

1. Introduction

Why do parents give money to their kids? Inter-generational transfers are a major feature

of the modern economy, possibly accounting for 50% or more of the US capital stock (Gale

and Scholz, 1994). Gifts from parents to kids play a vital role in explaining wealth distribution,

consumption insurance, and other major features of economic life. Yet the empirical

evidence about the way parents distribute money has defied explanation.

Date: October 14, 2008. We would like to thank Larry Jones, Tim Kehoe, Narayana Kocherlakota, and Fabrizio Perri for their help and encouragement and Ioanna Grypari for useful comments on an earlier version of the paper. All remaining errors are ours.

The empirical literature on inter-generational transfers followed in the wake of Becker’s

‘rotten kid’ theorem and Barro’s dynastic Ricardian equivalence result (Becker, 1976; Barro,

1976). These models imply that poorer children will receive much more than their richer

siblings; in fact, parents will completely offset differences in child income. However, empirical

studies have not found big differences in bequests. Wilhelm (1996) finds in a sample of estate

tax records that 76.6% of bequests are divided almost evenly (within 2% of the mean) across

children. McGarry (1999) finds that 83% of respondents’ wills that include children are

reported to treat all children about equally in the Asset and Health Dynamics Study survey

(AHEAD). Light and McGarry (2004) find that 92.1% of mothers who have a will say that

their estate will be divided equally among their children in the 1999 National Longitudinal

Surveys (NLS) of Mature Women and Young Women.

Recent empirical work documents a very different pattern of transfers while parents are

alive. Altonji et al. (1997) study the ‘transfer derivative,’ the extra amount of money that a

parent gives a child when the parent is one dollar richer and the child is one dollar poorer. A

pure altruism model implies that the parent would transfer an extra dollar to the child, but

they find that the value is $0.13 in data from the Panel Study of Income Dynamics (PSID).

This result rejects a pure altruism model, but also rejects non-altruistic preferences which

predict a zero or even a negative value for the derivative. McGarry and Schoeni find that

lower income increases both the probability and size of an inter-vivos transfer (henceforth: a

gift) from parents in Health and Retirement Survey (1994) and AHEAD (1995) data. Other

empirical work has documented the same patterns for bequests in France (Arrondel and

Masson, 2002) and for gifts in Sweden (Hochguertel and Ohlsson, 2000).

To explain this puzzle we consider a two-period, private-information environment where

parents do not know their children’s productivity levels and children face uncertain productivity

in the second period. A parent cannot provide perfect insurance because she does

not know the skill levels of her children. She must provide incentives for her children to

keep them from pretending to be less productive than they are. This property implies that


overall, parents will give more money to their poorer children, but not as much as is implied

by altruism with perfect information about productivity types.

In our model, the incentives are concentrated later in life in the distribution of bequests,

a common property in dynamic insurance models. Parents are looking for the cheapest way

to keep their productive children from pretending to be less productive. Productive children

are relatively more concerned about the future, when they will possibly be unproductive.

Less productive children are more concerned about the present, since they will likely be

more productive in the future. Parents can exploit this difference by offering a high bequest

relative to today’s gift to productive kids and a high gift relative to tomorrow’s bequest to

the less productive kid. This opportunity implies that bequests will be less progressive than

gifts, possibly even regressive. In section 2 we show that for a wide range of reasonable

parameters bequests are nearly equal across children. In section 3 we consider whether these

bequests are close enough that equal division can be justified as a response to small costs of

unequal division, or as a rule-of-thumb behavior.

Although the facts we address in this paper have been described as a puzzle as recently

as 2007 (Ohlsson, 2007), we know of two other models which attempt to explain the evidence

from both gifts and bequests. McGarry (1999) studies a one-parent, one-child model

with income uncertainty and perfect information. She shows that gifts are progressive, but

bequests may be equal or even regressive depending on parameters. In her model, children

who are rich at the time bequests are determined are likely to have been rich when gifts were

handed out. Their parents did not give as much money in the first period and have more

to offer now. Bequests are regressive if this wealth effect dominates the progressive effect of

altruism. While this model may explain some cross-family correlations, the equal division

puzzle is emphatically a within-family result. If we add a second child to McGarry’s model,

the wealth effect will be the same for both children and the within-family distribution of


bequests will be the strongly progressive one implied by pure altruism. Our basic result is

true in a simple model with no family-level wealth effects, although we extend these results

to one with wealth effects for the sake of realism.

Bernheim and Severinov (2001) address the equal-division puzzle with a more significant

departure from current models. In their model parents love their children with different

intensities and children have strong preferences over how much their parents love them.

Parents divide bequests equally under some parameterizations in order to avoid signalling

their preferences. Although it is not formally treated, they suggest that the same model

could also explain progressive distribution of gifts by exploiting the fact that bequests are

publicly known to all children while gifts may be private. Such an extension would again

imply a perfectly compensatory distribution of gifts unless an incentive component is also

introduced. Our model achieves a similar result with just this incentive problem.

Section 2 introduces a simple version of the model to derive some analytical results and

highlight the forces at work in the model, and considers the model’s robustness to a number

of extensions. Section 3 quantitatively compares a richer version of the model to data.

Section 4 concludes.

2. Simple Model

A parent wants to give more money to her less-productive children, but must be careful to

structure the payments so that her more-productive children do not want to act unproductive.

That is, she picks levels of gifts, output, and bequests for each productivity type to maximize

the well-being of her children subject to a budget constraint and incentive compatibility

constraints. In this section we consider a stylized model of the parent’s choice to highlight

the basic forces at work in the parent’s decision and derive analytical results cleanly. Many

of these results hold in the richer model which we compare with data in section 3.


2.1. Children. Parents have a unit mass of children so that no child’s individual choice

of output affects the total family resources available to the parent. Each child draws a

productivity type zi ∈ {zL, zH} with probability πi, and can produce output from labor

effort linearly, y = zℓ. Draws are i.i.d. and we assume that the law of large numbers

holds so that πi is also the fraction of children of type i.

A child’s utility is additively separable in first period consumption, first period labor, and

expected utility from bequests: u(c)−v(y/z)+W (b, z), where b is bequest and z first period’s

productivity. We assume u′ > 0, u′′ < 0, u′(∞) = 0, u′(0) = ∞, v′ > 0, v′′ > 0, v′(0) = 0,

v′(∞) = ∞.

The child survives into the second period, realizes his second period productivity z′, and

solves an autarky problem:

(2.1) $\mathcal{W}(b, z') = \max_{c,y} \; u(c) - v(y/z') \quad \text{s.t.} \quad c \le y + b$

The probability of being the same productivity type in both periods is π for both types, and

the probability of switching types is 1 − π. So W (b, z) is the expected value of a bequest

conditional on first period productivity.

(2.2) $W(b, z_i) = \pi \, \mathcal{W}(b, z_i) + (1 - \pi) \, \mathcal{W}(b, z_j)$

We show in the appendix, section A.1, that W (b, z) is strictly increasing and strictly concave

in b and Wb(b, z) is strictly decreasing in z for π > 1/2.
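The properties of W stated above can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes the CRRA form for u and the power form for v that appear in section 3, hypothetical parameter values (sigma = 2, phi = 1, gamma = 2), and a bisection on the child's first order condition u′(y + b) = v′(y/z′)/z′. All function names are ours.

```python
def u(c, sigma):
    """CRRA utility from consumption (sigma != 1 assumed here)."""
    return c ** (1.0 - sigma) / (1.0 - sigma)


def v(labor, phi, gamma):
    """Power disutility of labor effort."""
    return phi * labor ** (1.0 + gamma) / (1.0 + gamma)


def autarky_value(b, z, sigma=2.0, phi=1.0, gamma=2.0):
    """Second-period value of a bequest b > 0 for a child with productivity z.

    Solves max_y u(y + b) - v(y / z) by bisection on the first order
    condition u'(y + b) = v'(y / z) / z, which is decreasing in y.
    """
    if z == 0.0:
        return u(b, sigma)  # a disabled child consumes the bequest only

    def foc(y):
        return (y + b) ** (-sigma) - phi * (y / z) ** gamma / z

    lo, hi = 0.0, 1.0
    while foc(hi) > 0.0:  # expand the bracket until the sign changes
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if foc(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    y = 0.5 * (lo + hi)
    return u(y + b, sigma) - v(y / z, phi, gamma)


def expected_value(b, z_today, z_other, pi, **params):
    """Expected bequest value: stay the same type w.p. pi, switch w.p. 1 - pi."""
    stay = autarky_value(b, z_today, **params)
    switch = autarky_value(b, z_other, **params)
    return pi * stay + (1.0 - pi) * switch


def marginal_value(b, z_today, z_other, pi, h=1e-4, **params):
    """Central finite-difference approximation to W_b(b, z_today)."""
    up = expected_value(b + h, z_today, z_other, pi, **params)
    down = expected_value(b - h, z_today, z_other, pi, **params)
    return (up - down) / (2.0 * h)
```

With z_L = 0.5, z_H = 1 and π = 0.8, for example, `marginal_value(1.0, 0.5, 1.0, 0.8)` exceeds `marginal_value(1.0, 1.0, 0.5, 0.8)`: the low type values a marginal dollar of bequest more, as the monotonicity of Wb in z requires when π > 1/2.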

2.2. Parent’s Problem. At the beginning of the first period the parent, who has A to

distribute among her kids, designs a schedule of output, gifts, and bequests. For the sake of

presentation we have the parent picking consumption for the child, c = y + g, output, and

bequests. The parent maximizes a sum of the utilities of her children of each type, weighted


by their prevalence in the population.

(2.3) $\max_{c,y,b} \; \sum_{i=H,L} \pi_i \left( u(c_i) - v(y_i/z_i) + W(b_i, z_i) \right)$

subject to a family-wide budget constraint

(2.4) $\sum_{i=H,L} \pi_i (c_i + b_i) \le \sum_{i=H,L} \pi_i y_i + A$

and incentive compatibility constraints.

(2.5) $\forall i, j: \quad u(c_i) - v(y_i/z_i) + W(b_i, z_i) \ge u(c_j) - v(y_j/z_i) + W(b_j, z_i)$
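To make the constraint set concrete, here is a small sketch that evaluates the budget constraint (2.4) and the incentive constraints (2.5) for a candidate allocation. It is illustrative only: the log form for u, the quadratic form for v, the stand-in bequest value W(b, z) = log b, and all names are our assumptions, not the paper's specification.

```python
import math


def payoff(c, y, b, z, W):
    """Utility of a type-z child from bundle (c, y, b).

    Assumes u = log and v(l) = l**2 / 2 purely for illustration.
    """
    return math.log(c) - 0.5 * (y / z) ** 2 + W(b, z)


def budget_slack(alloc, shares, A):
    """Right-hand side minus left-hand side of (2.4); feasible if >= 0."""
    spending = sum(shares[i] * (alloc[i]["c"] + alloc[i]["b"]) for i in alloc)
    output = sum(shares[i] * alloc[i]["y"] for i in alloc)
    return output + A - spending


def violated_ic(alloc, z, W, tol=1e-12):
    """List the (i, j) pairs for which type i strictly prefers j's bundle."""
    bad = []
    for i in alloc:
        own = alloc[i]
        truth = payoff(own["c"], own["y"], own["b"], z[i], W)
        for j in alloc:
            if j == i:
                continue
            other = alloc[j]
            lie = payoff(other["c"], other["y"], other["b"], z[i], W)
            if lie > truth + tol:
                bad.append((i, j))
    return bad


# Hypothetical example: equal consumption and bequests, unequal output.
z = {"H": 1.0, "L": 0.5}
shares = {"H": 0.5, "L": 0.5}
stand_in_W = lambda b, z_today: math.log(b)  # type-independent stand-in
full_insurance = {
    "H": {"c": 1.0, "y": 1.2, "b": 1.0},
    "L": {"c": 1.0, "y": 0.4, "b": 1.0},
}
```

In this example `violated_ic(full_insurance, z, stand_in_W)` returns `[("H", "L")]`: when consumption and bequests are fully equalized, the high type gains by claiming to be the low type, which is the constraint the characterization below focuses on.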

2.3. Characterization. We are looking for a model which generates a distribution of gifts

which is progressive but not as progressive as that under pure altruism, while bequests are

nearly equal.

Without the incentive constraints the solution to the parent’s problem is stark. The parent’s

first order conditions yield u′(cH) = λ = u′(cL), hence cH = cL, where λ is the Lagrange

multiplier on the budget constraint. This is the sense in which altruistic preferences imply

that income differences will be completely offset. This argument becomes more complicated

when preferences are not additively separable due to the cross partial derivative between

consumption and leisure. Reasonable people disagree about the sign of this derivative, and

assuming it is small or zero means consumptions will still be roughly or exactly equal.

We show in the appendix, section A.4, that for the cases of interest only the constraint

preventing the high type from pretending to be the low type binds at the solution to

the parent’s problem. Solving the problem with just this constraint yields the condition

$u'(c_H) = \frac{\lambda}{1+\mu} < \frac{\lambda}{1-\mu} = u'(c_L)$,

hence cH > cL, where µ is the Lagrange multiplier on the high type’s incentive compatibility

constraint. The multiplier µ represents the intensity with which the incentive constraint

binds, a measure of the need to provide incentives. This imperative prevents

perfect compensation with gifts in our model.
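The size of the consumption wedge implied by this condition is easy to compute once a functional form is fixed. The sketch below assumes CRRA utility, u′(c) = c^(−σ), and treats λ and µ as given numbers rather than solving for them; the function name and the parameter values are hypothetical.

```python
def consumption_from_foc(lam, mu, sigma):
    """Invert u'(c) = c**(-sigma) at the two first order conditions.

    u'(c_H) = lam / (1 + mu) and u'(c_L) = lam / (1 - mu), with 0 <= mu < 1.
    Returns (c_H, c_L).
    """
    if not 0.0 <= mu < 1.0:
        raise ValueError("mu must lie in [0, 1)")
    c_high = (lam / (1.0 + mu)) ** (-1.0 / sigma)
    c_low = (lam / (1.0 - mu)) ** (-1.0 / sigma)
    return c_high, c_low
```

With µ = 0 the two consumption levels coincide (full compensation); any µ > 0 pushes cH above cL, and the gap widens as µ grows.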

How far does this imperative go? We show in the appendix, section A.2 that overall

transfers, g + b, are progressive. The need to provide incentives constrains but does not

override the parent’s wish to compensate poorer children. To establish that gifts alone are

progressive, however, we need to characterize bequests. In the remainder of this section we

state and provide intuition for a sequence of lemmas in service of the following proposition.

Proposition 2.1. There exists a level of persistence π∗ ∈ (.5, 1) under which gL > gH and

bL = bH .

The proof of this proposition highlights the importance of uncertainty of the child’s future

income in our model. When there is no income persistence, π = .5, each type has the same

chance of being a high or low type in the following period, so both types feel the same way

about bequests for tomorrow. We saw earlier that when both types get the same utility

from consumption, consumption is unevenly distributed. Here both children get the same

expected utility from bequests, so the parents use bequests to help provide incentives, giving

more to the richer child but asking him to work more in the first period.

When persistence is perfect, π = 1, both types know that they will be the same type next

period as they are today. In this case low productivity types care as much about tomorrow

as today, and incentives are not concentrated in either period, so both gifts and bequests are

progressive. For intermediate levels of persistence both effects are at work and bequests are

about equal, and for some level they are exactly equal. Below is a typical graph of each

type’s bequests as a function of π.


[Figure: Persistence and the Distribution of Bequests. Bequests b_L and b_H (vertical axis) plotted against persistence π (horizontal axis, from 0.5 to 1.0).]

We prove this result for u with non-increasing absolute risk aversion (NIARA), v CARA

or CRRA, and zL in a neighborhood around zero. Numerically, it appears to be true for a

much broader class of utility functions and any zL < zH .

Lemma 2.2. Total transfers are progressive: gL + bL > gH + bH .

Proof: See appendix section A.2.

This result is a reflection of the parent’s insurance motive. Incentive problems constrain

but do not reverse this motive.

Lemma 2.3. Under no persistence, π = .5, bequests are regressive, bL < bH .

Proof: No persistence implies that the expected value of bequests is independent of today’s

type, W (b, zL) = W (b, zH) = W (b). Thus we can use the same arguments as we did for

consumption. The first order conditions yield

$W_b(b_H) = \frac{\lambda}{1+\mu} < \frac{\lambda}{1-\mu} = W_b(b_L)$,

which implies bH > bL.


Lemma 2.4. Policy functions are continuous in π and zL if u has NIARA, and v has CARA

or CRRA.

Proof: See appendix section A.3. The maximum theorem guarantees upper hemi-continuity.

We need assumptions on the utility functions to show that policies are single valued every-

where. We can relax our assumptions on v, and simply assume that it is NDRRA if we also

assume that W is NIARA. Our stronger assumptions on v are sufficient to show that W is

NIARA.

Lemma 2.5. Under perfect persistence, π = 1, and for zL in a neighborhood around zero,

bequests are progressive, bL > bH .

Proof: Suppose the low type is disabled, i.e. zL = 0. Then he consumes only his transfers,

so that $c_L^1 = g_L$ and $c_L^2 = b_L$, where superscripts refer to periods. Since π = 1, the

expected value W (b, zi) coincides with the second-period autarky value of a type-zi child, and

$W_b(b_L, z_H) < W_b(b_L, z_L)$ as established in the appendix, section A.1. We therefore have:

$u'(c_L^1) = \frac{\lambda}{1-\mu} > \frac{\lambda}{1 - \mu \frac{W_b(b_L, z_H)}{W_b(b_L, z_L)}} = W_b(b_L, z_L) = u'(c_L^2)$

So $c_L^1 < c_L^2$, hence gL < bL. The high type’s first order conditions are undistorted: he smooths

consumption and output, $c_H^1 = c_H^2$, $y_H^1 = y_H^2$, so gH = bH . From overall progressiveness

(lemma 2.2) we have gH + bH = 2bH < gL + bL < 2bL, which proves the claim for zL = 0. Continuity

of policies in zL (lemma 2.4) guarantees that this will be true in a neighborhood around

zL = 0. Numerically we always find that this neighborhood is [0, zH). ∎

We have that bH − bL > 0 for π = .5 and bH − bL < 0 for π = 1 under the assumptions

of the lemmas above. Since policy functions are continuous in π, there exists a π∗ such that

bH − bL = 0 by the intermediate value theorem. By lemma 2.2 this implies that gifts are

progressive. ∎
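The intermediate value argument also suggests how π∗ can be located numerically: bisect on the bequest gap bH − bL, which is positive at π = .5 and negative at π = 1. The sketch below is generic. The `gap` argument would in practice be the solved policy difference from the parent's problem; `illustrative_gap` is a made-up linear stand-in used only to exercise the routine, not a result from the model.

```python
def find_crossing(gap, lo=0.5, hi=1.0, tol=1e-10):
    """Bisection for a zero of a continuous function gap on [lo, hi].

    Requires gap(lo) and gap(hi) to have opposite signs, mirroring
    lemma 2.3 (gap > 0 at pi = .5) and lemma 2.5 (gap < 0 at pi = 1).
    """
    g_lo, g_hi = gap(lo), gap(hi)
    if g_lo * g_hi > 0.0:
        raise ValueError("gap must change sign on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = gap(mid)
        if g_mid == 0.0:
            return mid
        if g_lo * g_mid < 0.0:
            hi = mid
        else:
            lo, g_lo = mid, g_mid
    return 0.5 * (lo + hi)


def illustrative_gap(pi):
    """Hypothetical stand-in for b_H(pi) - b_L(pi); not a model output."""
    return 0.3 - 0.8 * (pi - 0.5)
```

For the stand-in function the crossing is at π = 0.875; with actual policy functions the same routine would return the persistence level at which bequests are exactly equal.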

A technical appendix with a fuller characterization of the model is available by request.


2.4. Extensions and Limitations of the Model. This simple 2 period, 2 state, many

kids model highlights the basic forces driving apart gifts and bequests with a minimum of

distractions. In this section we consider several extensions to this model which add realism

or are prominent features of related models in the literature.

First, we modify the model so that the parent has two children. We will index them by

1 and 2. The state of the family (ij) defines productivity levels for the first kid, zi, and

the second kid, zj. The parent also cares about her own utility from consumption and

effort. The parent’s productivity level is zP , her allocations are indexed by P, and she discounts

the welfare of children by η. Both children are weighted equally. They discount the future

by β and the interest rate is R. We assume that a child’s productivity is known by neither the

parent nor the other child, hence the incentive compatibility constraints are expectations

over the other child’s type. Because this is the model we will consider in section 3, it is

worth writing out here and considering some of the changes. The parent’s full problem is:

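$$\max \; E_{i,j} \Big\{ u(c_P(ij)) - v\Big(\frac{y_P(ij)}{z_P}\Big) + \eta \Big[ u(c_1(ij)) - v\Big(\frac{y_1(ij)}{z_i}\Big) + \beta W(b_1(ij), z_i) + u(c_2(ij)) - v\Big(\frac{y_2(ij)}{z_j}\Big) + \beta W(b_2(ij), z_j) \Big] \Big\}$$

s.t.

$$\forall i, j: \quad c_1(ij) + c_2(ij) + c_P(ij) + \frac{b_1(ij)}{R} + \frac{b_2(ij)}{R} \le y_1(ij) + y_2(ij) + y_P(ij) + A$$

$$\forall i, k: \quad E_j \Big\{ u(c_1(ij)) - v\Big(\frac{y_1(ij)}{z_i}\Big) + \beta W(b_1(ij), z_i) \Big\} \ge E_j \Big\{ u(c_1(kj)) - v\Big(\frac{y_1(kj)}{z_i}\Big) + \beta W(b_1(kj), z_i) \Big\}$$

$$\forall j, k: \quad E_i \Big\{ u(c_2(ij)) - v\Big(\frac{y_2(ij)}{z_j}\Big) + \beta W(b_2(ij), z_j) \Big\} \ge E_i \Big\{ u(c_2(ik)) - v\Big(\frac{y_2(ik)}{z_j}\Big) + \beta W(b_2(ik), z_j) \Big\}$$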

2.4.1. Family-Level Wealth Effects. In the simple model we abstracted from the parent’s

allocations. The solution to the simple model discussed in sections 2.1 - 2.3 is also the

solution to a related problem with parent’s allocations in which parent’s allocations satisfy

u′(cp) = v′(yp/cp) = λ and total family wealth is A = A + yp − cp, so that A is the fixed amount

the parent will spend on her children. With two children the amount the parent spends on

her kids is uncertain. She will spend more if her children are both unproductive and less if

they are both highly productive, so we cannot reduce the problem and hold A fixed across

states.

We have argued above that the distinction between within-family and across-family comparisons

can be critical. We can show, however, that an analog to proposition 2.1 can be

proved in a 2-state version of the 2-child problem above. In addition to the assumptions

above, we need to assume that the perfect-information allocation is not incentive compatible.

To see that this assumption is necessary, consider a high-type child with a disabled

parent (zp = 0) and where zL = 0. If he pretends to be disabled, and his brother happens

to be disabled, no one in the family will produce output. If utility from consumption is unbounded

below, he will receive −∞ utility in this case (loosely speaking), which dominates

his expectation on the left hand side of his IC constraint; thus the IC will not bind. The

proof of the analog of proposition 2.1 is in the appendix, section A.5.

2.4.2. More Than Two Productivity States. Extending the model to more than two productivity

states poses two problems. First, we increase the number of incentive compatibility


constraints to (n− 1) · n. Numerically, we find that only local downward constraints bind.1

It can be proven that this is the pattern of binding constraints in simpler insurance models

(Thomas and Worrall, 1990), but we have not proven it in this environment. Without this

result we cannot establish those that follow in the simple case.
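The count of constraints can be made explicit with a short enumeration. This is only an illustration of the combinatorics: types are indexed 0 (lowest) to n − 1 (highest), a pair (i, j) stands for "type i must not prefer type j's bundle", and the function names are ours.

```python
def all_ic_pairs(n):
    """All (i, j) incentive constraints with i != j: n * (n - 1) of them."""
    return [(i, j) for i in range(n) for j in range(n) if i != j]


def local_downward_pairs(n):
    """Constraints stopping each type from mimicking the next type below."""
    return [(i, i - 1) for i in range(1, n)]
```

With 10 types there are 90 constraints in total but only 9 local downward ones, which is why knowing the pattern of binding constraints matters so much for tractability.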

Second, the proof in the 2-state case used the intermediate value theorem to show that

the two bequest policies cross at some level of persistence. With more than two states, there

is no reason to suspect that all policies will cross at the same level of persistence, even if we

can show monotonicity at extreme levels of persistence. In fact, numerically it appears that

they generically cross at different points.

The graph below illustrates this point. As with the two-state graph, bequests for each of

10 types are plotted as a function of persistence.2 Some bequests are equal to others under

a variety of π’s, but there is no one π where all bequests are equal.

[Figure: Persistence and Bequests - Many States. Horizontal axis: Persistence - π. Vertical axes: Bequests; Cost of Equal Bequests. Legend: b_L - Low-type bequest; b_H - High-type bequest; Cost of dividing bequests equally.]

1 By local downward we mean that for each type but the lowest, the constraint which prevents him from pretending to be the next productivity type below him is the one that binds.

2 In this numerical experiment the probability of remaining in the same state tomorrow is π and the probability of transiting to any other state is (1−π)/(n−1), where n is the number of states.
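The transition process described in footnote 2 can be written down directly. The sketch below builds the n-state transition matrix with probability π on the diagonal and (1−π)/(n−1) elsewhere; the function name is ours.

```python
def transition_matrix(n, pi):
    """n x n matrix: stay with probability pi, move to each other state
    with probability (1 - pi) / (n - 1)."""
    if n < 2:
        raise ValueError("need at least two states")
    off_diagonal = (1.0 - pi) / (n - 1)
    return [[pi if i == j else off_diagonal for j in range(n)]
            for i in range(n)]
```

For n = 2 this reduces to the two-state process of section 2.1, where the probability of switching types is 1 − π.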


To model real-world income processes we will want to use as many states as possible, and

in consequence we cannot hope to achieve perfectly equal bequests. We argue, however, that

dividing bequests equally can be explained as a rule-of-thumb behavior in response to costs,

mental or financial, of dividing bequests unequally. To motivate this line of argument we

plot the costs of unequal division that would be necessary to rationalize equal division at

different levels of persistence, relative to those costs for π = .5. These costs are quite small

for π ∈ (.8, .9), but we discuss our attempts to calculate real costs for a more realistic income

process in section 3.

2.4.3. Child Savings. There is no explicit saving on the part of the children in this model.

Extending the model to allow for savings may help us compare the model to real world

data. But savings also suggests a serious critique of our model. Typically, access to an

unobserved savings technology dramatically changes the allocations in an insurance problem

(e.g. Cole and Kocherlakota, 2001). If we believe that children in the real world have access

to unobserved savings, extending our model to reflect this fact may alter our conclusions.

A simple way of extending the model to include publicly observed savings is to give parents

another choice variable, s. If savings earns the same return as bequests, they are perfect

substitutes in the eyes of the parents, and we will have that savings and bequests, s + b, in

this model are equivalent to b in the model without savings. The magnitude of savings is not

pinned down, and we may choose to set it to match the level of saving in the data. We do

just this in the quantitative section of our paper. From the perspective of theory, however,

this is a cosmetic change which offers a new interpretation of the variables in the model but

does not tell us much about why savings occurs in the real world.

There are a few ways to extend the model to include hidden savings. In our model parents

observe the savings and consumption of their children, thus any kid who tried to sneak off


and save would be caught by a parent who calculates y-c. For hidden savings to get off the

ground at least one of these variables must not be observed.

If parents cannot observe consumption and local downward constraints bind then the

child’s Euler equation is $u'(c_t) \ge W_b(b, z) = \beta R \, E\{u'(c_{t+1})\}$. Given the allocation solving

the parent’s problem in our model the child would like to borrow in the first period. Thus

the allocation of our model is also constrained efficient in a model with hidden consumption

and borrowing constraints. Such a model would explicitly have child’s savings equal to zero,

in contradiction to the data.

Unobserved income has just the opposite effect on savings. The inter-temporal first order

condition in terms of labor is $v'(y_t/z_t) \le W_b(b, z) = \beta R \, E\{v'(y_{t+1}/z_{t+1})\}$. At the solution to

our model the child would prefer to work more in the first period and save. In the presence

of hidden saving, our model is not robust to making income unobserved.

2.4.4. Commitment. A lack of commitment on the part of the child is a popular way to

explain the timing of bequests in the literature (Altonji et al., 1997; Nishiyama, 2000).

Commitment is also a common concern in interpreting dynamic insurance models. Can the

commitment to the contract our parent writes be justified? As the model stands there is no

commitment problem by virtue of the timing in the model. The parent distributes gifts and

sets aside bequests simultaneously. She demonstrates no more commitment than a static

Mirrlees mechanism designer. Yet in the real world a parent can change her will on her

deathbed.

In the data, bequests are overwhelmingly equally divided, suggesting that parents on

their deathbed do not change their will to compensate their poorer children. One plausible

explanation is that parents simply have a behavioral propensity to commit. An alternative

option, which avoids ad hoc deviations from rationality, is to implement the commitment as

the outcome of a repeated game. The parent may be induced to commit because she is in an


equilibrium in which future generations will not commit if she deviates from her contract.

We are working on a dynastic version of the model in which we might state and prove a

more precise version of this claim.

3. Quantitative Analysis

Our analytical results predict equal bequests only for a particular level of persistence,

although we find quantitatively that bequests are similar for low and high types for a large

range of persistence levels around π∗. In this section we quantify the sense in which bequests

may be ‘close’ by taking a richer version of our model to income and transfers data. We

calculate the solution to the multiple-state, 2-child version of our model, and pick skill process

parameters to match income process statistics from the PSID. We also present a number of

extensions to the model which further explain the data or offer robustness checks to our

basic model.

Relative to the simple model presented at the beginning of section 2, we enrich the model

along several dimensions for the sake of realism. First, the parent now has two children with

uncertain incomes, so that the level of family-wide resources is uncertain, particularly to

children who are deciding whether to lie about their type. Second, we increase the number

of productivity levels. The more productivity levels we add, the less insurance the parent

can provide. Third, we will include parents’ allocations explicitly. We discuss these changes

in detail in section 2.4.

The forms of the utility functions are $u(c) = \frac{c^{1-\sigma}}{1-\sigma}$ and $v(\ell) = \phi \frac{\ell^{1+\gamma}}{1+\gamma}$. Both functions are

CRRA and satisfy the assumptions of the proofs in the appendix. Of the parameters on
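For readers who want to experiment with these functional forms, a direct transcription follows. The only addition is the σ = 1 case, where we use log utility, the standard limit of the CRRA form; that convention is our assumption about how the σ = 1 rows of the tables below would be computed, and the function names are ours.

```python
import math


def u(c, sigma):
    """CRRA utility from consumption; log utility when sigma == 1."""
    if sigma == 1.0:
        return math.log(c)
    return c ** (1.0 - sigma) / (1.0 - sigma)


def u_prime(c, sigma):
    """Marginal utility of consumption, c**(-sigma), valid for all sigma > 0."""
    return c ** (-sigma)


def v(labor, phi, gamma):
    """Disutility of labor effort, phi * l**(1 + gamma) / (1 + gamma)."""
    return phi * labor ** (1.0 + gamma) / (1.0 + gamma)


def v_prime(labor, phi, gamma):
    """Marginal disutility of labor effort, phi * l**gamma."""
    return phi * labor ** gamma
```

Higher σ makes marginal utility fall faster in consumption, and higher γ makes marginal disutility rise faster in effort, which is the channel through which these two parameters shape both insurance motives and incentives.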

the utility function, σ and γ are perhaps the most critical. These parameters control the

elasticity of consumption and labor, and influence the child’s incentives as well as parent’s

insurance motives. The parent discounts her children’s utility by η, and we assume she loves

them both equally. We also assume that children’s skill draws are i.i.d. conditional on zP .


Hence a parent of type zP with assets A will solve the following maximization problem:

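$$\max \sum_{ij} \pi(z_i|z_P) \, \pi(z_j|z_P) \Big\{ \frac{c_P(ij)^{1-\sigma}}{1-\sigma} - \phi \frac{(y_P(ij)/z_P)^{1+\gamma}}{1+\gamma} + \eta \Big[ \frac{c_1(ij)^{1-\sigma}}{1-\sigma} - \phi \frac{(y_1(ij)/z_i)^{1+\gamma}}{1+\gamma} + \beta W(b_1(ij), z_i) + \frac{c_2(ij)^{1-\sigma}}{1-\sigma} - \phi \frac{(y_2(ij)/z_j)^{1+\gamma}}{1+\gamma} + \beta W(b_2(ij), z_j) \Big] \Big\}$$

s.t.

$$\forall i, j: \quad c_1(ij) + c_2(ij) + c_P(ij) + \frac{b_1(ij)}{R} + \frac{b_2(ij)}{R} \le y_1(ij) + y_2(ij) + y_P(ij) + A$$

$$\forall i, k: \quad \sum_j \pi(z_j|z_P) \Big[ \frac{c_1(ij)^{1-\sigma}}{1-\sigma} - \phi \frac{(y_1(ij)/z_i)^{1+\gamma}}{1+\gamma} + \beta W(b_1(ij), z_i) \Big] \ge \sum_j \pi(z_j|z_P) \Big[ \frac{c_1(kj)^{1-\sigma}}{1-\sigma} - \phi \frac{(y_1(kj)/z_i)^{1+\gamma}}{1+\gamma} + \beta W(b_1(kj), z_i) \Big]$$

$$\forall j, k: \quad \sum_i \pi(z_i|z_P) \Big[ \frac{c_2(ij)^{1-\sigma}}{1-\sigma} - \phi \frac{(y_2(ij)/z_j)^{1+\gamma}}{1+\gamma} + \beta W(b_2(ij), z_j) \Big] \ge \sum_i \pi(z_i|z_P) \Big[ \frac{c_2(ik)^{1-\sigma}}{1-\sigma} - \phi \frac{(y_2(ik)/z_j)^{1+\gamma}}{1+\gamma} + \beta W(b_2(ik), z_j) \Big]$$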

In the calculations we present below we solve this problem for parents with a variety of

skill and asset levels, weighting to reflect the US distribution of assets and income.

We begin our analysis by calibrating the model to income statistics for a range of σ’s, γ’s,

and η’s. We then examine the model allocations for a fixed set of utility parameters, and

compare the results with transfers data and with a public information version of

the model. Next, we consider changes which improve the fit of the model or address forces

in transfer decisions which are absent in our model.

3.1. Calibrating the Model to a Variety of Parameters. The nature and persistence

of productivity shocks are central to our analytical results. Bequests will be progressive

if the lifetime productivity process is very persistent and regressive if it is very impersistent. To

assess the implications of our model for the income process we see in the data, we pick skill

process parameters in our model so that our model allocations match income statistics.

We examine a finite-state approximation to an AR(1) log skill process with normal innovations

over time and across generations, so that:

$\log z_1 = \mu_{z1} + \rho_{zG} (\log z_p - \mu_{zp}) + \varepsilon_{zG}, \qquad \varepsilon_{zG} \sim N(0, \sigma^2_{zG})$

$\log z_2 = \mu_{z2} + \rho_{zL} (\log z_1 - \mu_{z1}) + \varepsilon_{zL}, \qquad \varepsilon_{zL} \sim N(0, \sigma^2_{zL})$

For each vector of utility parameters we pick the 6 productivity process parameters,

$(\mu_{zp}, \mu_{z1}, \rho_{zG}, \rho_{zL}, \sigma^2_{zG}, \sigma^2_{zL})$, to match the analogous income statistics in the PSID,

$(\mu_{yp}, \mu_{y1}, \rho_{yG}, \rho_{yL}, \sigma^2_{y1}, \sigma^2_{y2})$. We also pick the correlation of assets and parent’s skills to match the asset-income

correlation in the data. A fuller description of all calibration procedures is in Appendix B.
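The two-stage skill process can be simulated directly, which is a convenient way to see what the persistence parameters mean. The sketch below works with the continuous process rather than the finite-state approximation used in our calculations, assumes the parent's log skill is itself normally distributed (an assumption made here only to close the simulation), and uses made-up parameter values. Note that the code takes standard deviations, not variances.

```python
import random


def simulate_log_skills(n, mu_p, sd_p, mu_1, rho_g, sd_g,
                        mu_2, rho_l, sd_l, seed=0):
    """Draw n triples (log z_p, log z_1, log z_2) from the AR(1) process.

    rho_g links parent to child (across generations); rho_l links the
    child's first and second period (over the lifetime).
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        log_zp = rng.gauss(mu_p, sd_p)  # assumed normal parent skill
        log_z1 = mu_1 + rho_g * (log_zp - mu_p) + rng.gauss(0.0, sd_g)
        log_z2 = mu_2 + rho_l * (log_z1 - mu_1) + rng.gauss(0.0, sd_l)
        draws.append((log_zp, log_z1, log_z2))
    return draws


def ols_slope(x, y):
    """Slope from a univariate least-squares regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx
```

Regressing simulated log z1 on log zp recovers the inter-generational persistence parameter, and regressing log z2 on log z1 recovers the lifetime one, up to sampling error.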

In this paper we focus on the within-family distribution of gifts and bequests. In the

following tables we compare the progressiveness of gifts and of bequests for the model calibrated

to a range of parameters. To measure ‘progressiveness’ we regress gifts received as

a fraction of average gift received by all children in the family on income as a fraction of

average child income in the family:

$\frac{g_{1i}}{(|g_{1i}| + |g_{2i}|)/2} = \beta_0 + \beta_{gy} \frac{y_{1i}}{(y_{1i} + y_{2i})/2} + \varepsilon_i$


This regression is a close cousin of a family fixed-effects model in logs, except that this

formulation allows us to account for negative values. We find a value of βgy = −0.5459 in

the 1994 HRS, a highly significant negative relationship. We also regress model bequests as

a fraction of average bequests on model income as a fraction of average child income. There

is no corresponding data on bequests, but given the fraction of equally divided bequests the

value should be close to zero, βby ≈ 0.
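The progressiveness regression is simple enough to sketch in a few lines. The code below normalizes the first child's gift and income by the family averages exactly as in the regression equation and returns the least-squares intercept and slope. Dropping families whose average absolute gift or average income is zero is our assumption, since the normalization is undefined there; the function name and the toy data in the usage note are hypothetical.

```python
def progressiveness_slope(families):
    """Estimate (beta_0, beta_gy) from family-level data.

    families: iterable of ((g1, g2), (y1, y2)) with gifts and incomes
    of child 1 and child 2 in each family.
    """
    xs, ys = [], []
    for (g1, g2), (y1, y2) in families:
        mean_abs_gift = (abs(g1) + abs(g2)) / 2.0
        mean_income = (y1 + y2) / 2.0
        if mean_abs_gift == 0.0 or mean_income == 0.0:
            continue  # normalization undefined; family dropped (assumption)
        ys.append(g1 / mean_abs_gift)
        xs.append(y1 / mean_income)
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope
```

As a toy example, the three families `((0.75, 1.25), (1.5, 0.5))`, `((1.25, 0.75), (0.5, 1.5))` and `((1.0, 1.0), (1.0, 1.0))` lie exactly on a line with slope −0.5: the child with above-average income receives a below-average gift. A slope of zero would correspond to gifts (or bequests) that do not vary with relative income.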

Table: βgy for η = 1 and various σ, γ

σ \ γ 0.5 1 2 4

0.5 0.0222 -0.8132 -1.6547 -1.9402

1 -1.3054 -1.6582 -1.8366 -1.9678

2 -1.7568 -2.0063 -2.0105 -2.0340

4 -2.0483 -2.0149 -2.0694

Table: βby for η = 1 and various σ, γ

σ \ γ 0.5 1 2 4

0.5 -0.7775 -0.7342 -1.0937 -1.6997

1 0.7029 -1.3979 -1.3940 -1.7554

2 -1.7202 -1.5576 -1.6370 -1.4926

4 -1.4206 -1.1278 -1.1245

In general, lower elasticities reduce the degree of redistribution. This is primarily due to

the direct effect on the parent’s motives - less risk-averse children require less insurance,

while incentive constraints are not significantly relaxed. We also find that σ > γ generates

results which are closer to the transfers data. These results are presented for η = 1, perhaps

the upper bound for reasonable values of this parameter. We performed the same exercise

for η = .5, with somewhat less progressive results; see Appendix C.

Overall, these tables confirm that the basic forces in our model are active in the richer

model. Bequests are distributed less progressively than gifts. In contrast to the transfer

evidence, however, both measures are more progressive in the model than in the data. This suggests

that for our specification of the skill process we are in the ‘too persistent’ range in the graph

which motivated our discussion in section 2.


Which parameter values get us closest to the transfers data? In a related exercise, we

picked utility parameters as well as skill process parameters to match income statistics as

well as the regression statistics on bequests and gifts. We find that we can match the transfer

statistics quite closely with σ = .86, γ = .61, φ =???, and η = 1.2. We are not able to match

all income statistics perfectly, however, in spite of the fact that we are picking 11 parameters

to match 9 statistics.

A more direct measure of the ‘closeness’ of bequests is the percentage of households in

which the children’s bequests are within 25% of each other. Wilhelm finds this number to

be 88% in his sample. We report these numbers for our model in the table below.

Table: Bequests ±12.5% for η = 1

σ \ γ 0.5 1 2 4

0.5 58.2% 31.0% 21.4% 20%

1 68.1% 38.1% 20% 20%

2 22.0% 20% 27.9% 20%

4 28.6% 24.1% 33.7%

These numbers include families with ‘identical children.’ We approximate the skill distri-

bution with 5 equally likely states, so that 20% of families have children of the same type.

We find that increasing the number of states to 10, so that only 10% of families have children

of the same type, does not significantly reduce the numbers presented in this table. More

detail is available in appendix C.
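The ‘closeness’ statistic can be computed as follows. This is a sketch under stated assumptions: `share_within_band` is a name chosen for the example, and the bequest values by state in the usage note are made up for illustration, not model output.

```python
def share_within_band(bequest_pairs, band=0.125):
    """Fraction of two-child families with |b1 / ((b1 + b2) / 2) - 1| < band.

    bequest_pairs: iterable of (b1, b2). Families with a zero mean bequest
    are skipped because the ratio is undefined.
    """
    inside = 0
    total = 0
    for b1, b2 in bequest_pairs:
        mean_b = (b1 + b2) / 2
        if mean_b == 0:
            continue
        total += 1
        if abs(b1 / mean_b - 1) < band:
            inside += 1
    return inside / total
```

With 5 equally likely and independent states per child there are 25 equally likely families, 5 of which have identical children. For hypothetical bequests `[1, 2, 3, 4, 5]` by state, `share_within_band([(a, b) for a in [1, 2, 3, 4, 5] for b in [1, 2, 3, 4, 5]])` counts those 5 families plus the two orderings of the pair (4, 5), so the 20% floor mentioned in the text shows up directly.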

While the table above suggests that our model helps explain the equal division of bequests,

we do not find the majority of families have children receiving within 2% of mean bequests.

In the next section we interpret equal division as a ‘rule of thumb’ behavior and examine the

performance of our model against a public information model in justifying this explanation.


3.2. The Cost of Unequal Bequests. The bequest literature frequently considers the

costs, legal or psychological, of dividing bequests unequally. In France, for example, some

parts of bequests must be divided equally by law (Arrondel and Masson, 2002). In the

US, unequally divided bequests may be more subject to legal challenge. Others suggest

that unequal division is painful because it provokes squabbling among kids, in the style of Bernheim and

Severinov (2003), or that equal division is simply an easy rule of thumb. In this section we compare our model

to an identical one with public information. We measure the costs, κ, that a parent would

be willing to pay for the ability to divide her bequest unequally and we compare this value

across the two models. Private information dramatically reduces the κ required to rationalize

equal division of bequests.

First, we fix utility parameters to σ = γ = φ = 1 and η = .75 which are relatively favorable

to our story, but not extremes of the range we consider above. We calibrate our model to

the income data as above. We also calibrate a public information version of our model, one

in which the parent solves the same problem without incentive constraints. The allocations

of both models are summarized below.

Table: Model statistics for σ = γ = φ = 1 and η = .75

                                  data      private info   public info
E(g)/E(y)                         0.0184    0.2266         0.1448
βgy                               -0.5459   -1.5438        -1.6757
E(b)/E(y)                         0.0157    -0.3368        -0.3411
βby                               0         -0.9417        -1.4966
% |b1/((b1+b2)/2) − 1| < .02      77.6%     20%            20%
% |b1/((b1+b2)/2) − 1| < .125     88%       28.7%          20%
% |b1/((b1+b2)/2) − 1| < .25                49.9%          20%


The public information model is more progressive in both measures and only families with

identical children have both kids within even 25% of mean family bequests.

Our model does not predict exactly equal bequests. To measure the improvement of our

model over the pure altruism model we compare them in terms of the cost to the parent

that would be required to rationalize equal division. With the same utility parameters, we

calibrate a model in which bequests are forced to be equal across children. Keeping utility

and skill process parameters the same we relax this requirement and calculate the optimal

allocation. We perform the same exercise for the public information model.

To calculate κ, we calculate the difference in the utilities arising from the problems with

and without an equal-division requirement. We then divide by the marginal utility of the

parent in the forced-equality model to convert this value to dollars. This value can be

interpreted as the smallest cost of unequal division which could rationalize the equal division

behavior in the data.
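The conversion from a utility difference to a dollar cost can be written out as a small sketch. The function names and the numbers in the usage note are hypothetical; the only structure taken from the text is that the utility gain from unequal division is divided by the parent's marginal utility in the forced-equality model.

```python
def kappa_in_dollars(value_unrestricted, value_equal_division, marginal_utility_equal):
    """Smallest cost of unequal division that rationalizes equal division.

    value_unrestricted:     parent's utility when bequests may differ
    value_equal_division:   parent's utility when bequests are forced equal
    marginal_utility_equal: parent's marginal utility in the forced-equality model
    """
    return (value_unrestricted - value_equal_division) / marginal_utility_equal


def kappa_as_income_share(kappa, present_value_income):
    """Express kappa as a fraction of discounted income."""
    return kappa / present_value_income
```

For example, a utility gain of 0.5 with a marginal utility of 0.25 corresponds to a cost of 2 units of consumption; relative to a made-up present value of income of 40 this is 5%.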

In the case of public information, a parent would be willing to pay an average of 60% of

the present value of 20 years discounted income in order to divide bequests unequally. In

our model, this number is 5%. While κ is not negligible, our model does a much better

job supporting equal division as a ‘rule of thumb’ behavior or as a response to legal or

psychological costs.

3.3. Extensions. Our model is constructed to explain a puzzle, that gifts are progressively

distributed while bequests are equally distributed. In this section we consider an extension

of the model which improves its performance in the progressiveness of bequests. We also

consider some extensions which would add realism to the model and possibly improve its

performance in other aspects of transfer data.

Bequest behavior is particularly sensitive to $\sigma^2_{zG}$, the variance of the error term in inter-generational skill persistence. The lower is this variance, the less progressive are bequests. To examine this relationship we drop $\sigma^2_{y1}$ as a target of calibration, and we match βby instead.

We are able to match all seven targets with our 7 parameters, including non-progressive

bequests.

Such a calibration can be motivated by a modification of the benchmark model in which

parents also receive a signal θ about their children’s skills. After seeing the signal, the true variance of children’s income is $\sigma^2_{zG}$, but ‘measured’ skill variance in the model would be $\sigma^2_{\theta} + \sigma^2_{zG}$. This modification speaks to many parents’ impression that they know their child’s productivity better than the child does. In practice, we need the child’s income variance conditional on θ to be extremely small. Results of this exercise are in Appendix C.

Our model fails decisively to explain the magnitude of bequests, a statistic it was not

designed for. Parents care more about themselves than their kids, and their kids expect to

be more than 40% richer in the future. For a range of parameter values parents take from

their children to feed themselves in the model, bequeathing debt to their children, which is

not legally available to parents in the US.

We solve a version of the benchmark model with an added non-negativity constraint on

bequests. We find that this change helps reduce the average size of gifts while increasing

the average size of bequests. Further, this addition significantly increases βby, now restricted

to families leaving positive bequests, while leaving βgy unchanged. The addition of non-negativity constraints does not significantly alter the conclusions of our model while adding

some realism and improving some other transfer statistics. A table of our results is included

in Appendix C.

4. Conclusion

Standard models of intergenerational transfers which are in common use in macroeco-

nomics are not supported by studies of actual transfers. No consensus model of inter-

generational giving has emerged because none can account for transfer behavior in both


gifts and bequests. In this paper we propose that such a difference can arise as a result of

a dynamic insurance problem. We show in a stylized model that there is a level of income

persistence which implies equal bequests and progressive gifts, the pattern we observe in the

data. We then take a richer version of the model to the data to see if these results hold up to

a more realistic skill process which is pinned down by income data. We find that the model

generates bequests which are much closer to equal than those in a public information model, and can

be used to motivate exactly equal division as a rule-of-thumb behavior or response to small

costs of unequal bequest division.


5. References

ALTONJI, J. G., HAYASHI, F. and KOTLIKOFF, L. J. (1997), “Parental Altruism and

Inter Vivos Transfers: Theory and Evidence”, Journal of Political Economy, 105, 1121 -

1166.

ARRONDEL, L. and MASSON, A. (2002), “Altruism, Exchange or Indirect Reciprocity: What do the Data on Family Transfers Show?”, (Working Paper 2002-18, Département et Laboratoire d’Économie Théorique et Appliquée).

BARRO, R. J. (1974) “Are Government Bonds Net Wealth?”, The Journal of Political

Economy, 82, 1095-1117.

BECKER, G. S. (1974), “A Theory of Social Interactions”, The Journal of Political Economy,

82, 1063-1093.

BERNHEIM, B. D. and SEVERINOV, S. (2003), “Bequests as Signals: An Explanation

for the Equal Division Puzzle”, Journal of Political Economy, 111, 733 - 764.

COLE, H. L. and KOCHERLAKOTA, N. R. (2001), “Efficient Allocations with Hidden

Income and Hidden Storage”, Review of Economic Studies, 68, 523 - 542.

GALE, W. and SCHOLZ, J. K. (1994), “Intergenerational Transfers and the Accumulation

of Wealth”, Journal of Economic Perspectives, 8, 145-160.

HEATHCOTE, J., STORESLETTEN, K. and VIOLANTE, G. L. (2008), “The Macroeco-

nomic Implications of Rising Wage Inequality in the United States”, NBER Working Paper

14052.

HOCHGUERTEL, S. and OHLSSON, H. (2000), “Compensatory inter vivos gifts”, unpublished manuscript.

LIGHT, A. and MCGARRY, K. (2004), “Why Parents Play Favorites: Explanations for

Unequal Bequests”, The American Economic Review, 94, 1669 - 1681.


LEE, C. and SOLON, G. (2006) “Trends in Intergenerational Income Mobility”, NBER

Working Paper 12007.

MCGARRY, K. and SCHOENI, R. F. (1994), “Transfer Behavior: Measurement and the

Redistribution of resources within the Family”, NBER Working Paper 4607.

MCGARRY, K. and SCHOENI, R. F. (1995), “Transfer Behavior within the Family:

Results from the Asset and Health Dynamics Survey”, NBER Working Paper 5099.

MCGARRY, K. (1999), “Inter vivos transfers and intended bequests”, Journal of Public

Economics, 73, 321 - 351.

NISHIYAMA, S. (2000), “Measuring Time Preference and Parental Altruism”, CBO Tech-

nical Paper 2000-7.

OHLSSON, H. (2007), “The equal division puzzle - empirical evidence on intergenerational

transfers in Sweden”, Uppsala Universitaet Working Paper 2007:10.

SOLON, G. (1999), “Intergenerational Mobility in the Labor Market”, in O. Ashenfelter

and D. Card (eds.) Handbook of Labor Economics, Volume 3, Elsevier Science B.V.

THOMAS, J. and WORRALL, T. (1990), “Income Fluctuation and Asymmetric Information: An Example of a Repeated Principal-Agent Problem”, Journal of Economic Theory,

51, 367-390.

WILHELM, M. O. (1996), “Bequest Behavior and the Effect of Heirs’ Earnings: Testing

the Altruistic Model of Bequests”, The American Economic Review, 86, 874-892.


Appendix A

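A.1. Some results about $\mathcal{W}$ and $W$.

Lemma A.1. $\mathcal{W}_b > 0$, $\mathcal{W}_{bb} < 0$, and $\mathcal{W}_b$ is decreasing in $z$.

Proof:

(1) Children solve the autarky problem in the second period:

$$\mathcal{W}(b; z) = \max_{c,\ell}\; u(c) - v\!\left(\frac{c - b}{z}\right)$$

The envelope condition is:

$$\mathcal{W}_b = \frac{1}{z}\, v'\!\left(\frac{c - b}{z}\right) = u'(c) > 0$$

Clearly $\mathcal{W}(b, z)$ is increasing in $b$.

(2) Differentiation with respect to $b$ yields:

$$\frac{\partial}{\partial b}\mathcal{W}_b = \frac{1}{z^2}\, v''\!\left(\frac{c - b}{z}\right)\left(\frac{\partial c}{\partial b} - 1\right)$$

Then it is enough that $\frac{\partial c}{\partial b} < 1$, which must be true since more money must reduce output somewhat. We can formally establish that by noting that:

$$\frac{\partial}{\partial b}\mathcal{W}_b = \frac{\partial u'(c)}{\partial b} = u''(c)\,\frac{\partial c}{\partial b}$$

Setting the two expressions equal, we get:

$$\frac{\partial c}{\partial b} = \frac{v''\!\left(\frac{c-b}{z}\right)}{v''\!\left(\frac{c-b}{z}\right) - z^2 \cdot u''(c)} > 0 \;\Rightarrow\; \frac{\partial}{\partial b}\mathcal{W}_b < 0$$

(3) Differentiation with respect to $z$ yields:

$$\frac{\partial}{\partial z}\mathcal{W}_b = u''(c)\,\frac{\partial c}{\partial z}$$

We expect $\frac{\partial c}{\partial z} > 0$, which we can verify by differentiating the first order condition $z u'(c) - v'\!\left(\frac{c-b}{z}\right) = 0$ to get:

$$\frac{\partial c}{\partial z} = \frac{\frac{c-b}{z}\, v''\!\left(\frac{c-b}{z}\right) + z \cdot u'(c)}{v''\!\left(\frac{c-b}{z}\right) - z^2 u''(c)} > 0 \;\Rightarrow\; \frac{\partial}{\partial z}\mathcal{W}_b < 0 \qquad \square$$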
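The three properties in Lemma A.1 can be checked numerically for a particular pair of functions. The sketch below assumes $u(c) = \log c$ and $v(\ell) = \ell^2/2$, which are illustrative choices and not the paper's calibrated functional forms; the function names are also made up for the example. Under these assumptions the first order condition $z u'(c) = v'((c-b)/z)$ has the closed-form solution $c = (b + \sqrt{b^2 + 4z^2})/2$, which the bisection solver should reproduce.

```python
import math


def solve_consumption(b, z, iterations=200):
    """Solve z * u'(c) = v'((c - b) / z) by bisection for u = log, v(l) = l**2 / 2."""
    lo = max(b, 0.0) + 1e-12      # consumption must exceed max(b, 0)
    hi = abs(b) + 2.0 * z + 1.0   # the solution is below |b| + z, so this brackets it
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        # The left side z / c - (c - b) / z is strictly decreasing in c.
        if z / mid - (mid - b) / z > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def autarky_value(b, z):
    """Value of the second-period autarky problem at the optimal consumption."""
    c = solve_consumption(b, z)
    return math.log(c) - 0.5 * ((c - b) / z) ** 2


def marginal_value(b, z):
    """Envelope condition: the derivative of the value in b equals u'(c) = 1 / c."""
    return 1.0 / solve_consumption(b, z)
```

Evaluating `marginal_value` on a grid of `b` and `z` gives positive values that fall in both arguments, and a central difference of `autarky_value` in `b` agrees with the envelope expression.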

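Lemma A.2. $W_b > 0$, $W_{bb} < 0$, and $W_b(b, z_L) \ge W_b(b, z_H)$. The last result holds for $\pi \ge \frac{1}{2}$, with strict inequality when $\pi > \frac{1}{2}$.

Proof: The first 2 are straightforward from the previous lemma and the definition:

$$W_b(b, z_i) := \pi\, \mathcal{W}_b(b, z_i) + (1 - \pi)\, \mathcal{W}_b(b, z_j)$$

Since we have $\mathcal{W}_b(b, z_L) > \mathcal{W}_b(b, z_H)$ and $\pi \ge \frac{1}{2}$ we have:

$$W_b(b, z_L) = \pi \mathcal{W}_b(b, z_L) + (1 - \pi)\mathcal{W}_b(b, z_H) \ge (1 - \pi)\mathcal{W}_b(b, z_L) + \pi \mathcal{W}_b(b, z_H) = W_b(b, z_H) \qquad \square$$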
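The role of the persistence parameter π in this lemma can be seen in a short numerical sketch. It assumes $u(c) = \log c$ and $v(\ell) = \ell^2/2$ (illustrative choices, not the paper's specification), for which the autarky marginal value has a closed form; all function names are hypothetical.

```python
import math


def autarky_marginal_value(b, z):
    """Marginal value of b in autarky for the assumed u = log, v(l) = l**2 / 2.

    The first order condition z / c = (c - b) / z gives
    c = (b + sqrt(b**2 + 4 * z**2)) / 2, and the envelope condition gives 1 / c.
    """
    c = 0.5 * (b + math.sqrt(b * b + 4.0 * z * z))
    return 1.0 / c


def expected_marginal_value(b, z_now, z_other, pi):
    """Weight pi on keeping today's skill and 1 - pi on switching to the other skill."""
    return (pi * autarky_marginal_value(b, z_now)
            + (1.0 - pi) * autarky_marginal_value(b, z_other))


def low_minus_high(b, z_low, z_high, pi):
    """Expected marginal value for today's low type minus that for today's high type."""
    return (expected_marginal_value(b, z_low, z_high, pi)
            - expected_marginal_value(b, z_high, z_low, pi))
```

The difference equals $(2\pi - 1)$ times the autarky gap, so it vanishes at π = 1/2 and is strictly positive for larger π.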

A.2. Proof of Lemma 2.2. At the solution to the relaxed problem: $g_L + b_L > g_H + b_H$, i.e. “overall redistribution” is progressive.

Proof: Consider the problem of a high agent who is given a transfer $x_H$ and we let him decide how much he wants to work, consume and save for next period. His problem is:

$$\max_{c,y,b}\; u(c) - v\!\left(\frac{y}{z_H}\right) + W(b, z_H) \quad \text{s.t.} \quad c + b \le y + x_H$$

Notice that the first order conditions for this problem are the same as those for the high type in the parent’s problem. Thus by setting $x_H = c_H + b_H - y_H$, we guarantee that the unique (footnote 3) solution to the maximization problem is $(c^*, y^*, b^*) = (c_H, y_H, b_H)$, the high type’s allocations in the parent’s problem.

Since the ICH is binding (lemma A.4), $(c_L, y_L, b_L)$ gives the high type the same utility as $(c_H, y_H, b_H)$. Thus $(c_L, y_L, b_L)$ can’t be in the constraint set of the problem above (otherwise a convex combination of $(c_L, y_L, b_L)$ and $(c_H, y_H, b_H)$ would give a strictly higher value of the objective function while still being in the constraint set). Thus:

$$c_L + b_L > y_L + x_H$$

$$c_L + b_L > y_L + y_L + A - c_L - b_L$$

$$(c_L - y_L) + b_L = g_L + b_L > \frac{A}{2} \;\Longrightarrow\; (c_H - y_H) + b_H = g_H + b_H < \frac{A}{2}$$

The second and last lines come from substituting $x_H$ into the parent’s feasibility constraint, so that $x_H = y_L + A - c_L - b_L$. The claim follows. □

A.3. Proof of Lemma 2.4. Policy functions are continuous in π and zL if u has non-

increasing absolute risk aversion, and v CARA or CRRA.

Proof: The proof has 2 steps. In the first we show that under our assumptions the policy

correspondence is single valued, in the second that it is u.h.c. in π and z. Combined these

imply that the policy correspondence is in fact a continuous function. WLOG we assume

that the probability of being of a high and low type is the same.

Footnote 3: The objective function is strictly concave, the constraint set is convex.


Step 1. The policy is a function, i.e. a single valued correspondence. We can convexify

the constraint set by having the parent choose utility values instead of real values. In this

setup the parent chooses utility from consumption, output, and welfare for the high type

(uH , vH , wH), and for the high type pretending to be the low type (uL, vL, wL). Here we use

capital letters to denote the functions U , V , and W . The parent solves:

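$$\max_{u,v,w}\; u_H - v_H + w_H + u_L - V\!\left(\frac{z_H}{z_L}\, V^{-1}(v_L)\right) + W_L\!\left(W_H^{-1}(w_L)\right) \quad \text{s.t.}$$

$$U^{-1}(u_H) + U^{-1}(u_L) + W_H^{-1}(w_H) + W_H^{-1}(w_L) \le z_H \cdot V^{-1}(v_H) + z_H \cdot V^{-1}(v_L) + A$$

$$u_H - v_H + w_H \ge u_L - v_L + w_L$$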

The incentive compatibility constraint is linear. Since U(·) and Wi(·) are strictly increasing

and strictly concave their inverses are strictly convex. Similarly the inverse of V (·) is strictly

concave. The convex functions are on the lesser side of the inequality and the concave ones

are on the greater side, so this constraint is convex as well.

To show uniqueness it remains to show that the objective function is weakly concave. The

first four terms are linear. We will show that our assumptions guarantee that V and W

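are weakly concave. Both of the functions are of the form $h(x) = f(g^{-1}(x))$. Recall that $\frac{\partial}{\partial x} g^{-1}(x) = \frac{1}{g'(g^{-1}(x))}$. Then:

$$h'(x) = \frac{f'(g^{-1}(x))}{g'(g^{-1}(x))}$$

$$h''(x) = \left[ g'(g^{-1}(x))\, \frac{f''(g^{-1}(x))}{g'(g^{-1}(x))} - f'(g^{-1}(x))\, \frac{g''(g^{-1}(x))}{g'(g^{-1}(x))} \right] \Big/ \left(g'(g^{-1}(x))\right)^2$$

The sign of the second derivative is the sign of the numerator, which, up to the positive factor $f'(g^{-1}(x))$, can be written as:

$$\left(-\frac{g''(g^{-1}(x))}{g'(g^{-1}(x))}\right) - \left(-\frac{f''(g^{-1}(x))}{f'(g^{-1}(x))}\right)$$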
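The formula for $h''$ and the curvature comparison can be sanity-checked on a pair of functions with a known composite. The sketch below takes $f(t) = \sqrt{t}$ and $g(t) = \log t$ purely for illustration (they are not the model's $V$ or $W$), so that $h(x) = f(g^{-1}(x)) = e^{x/2}$ and $h''(x) = e^{x/2}/4$; the function names are hypothetical.

```python
import math


def second_derivative_of_composite(f1, f2, g1, g2, t):
    """h''(x) for h = f(g^{-1}(x)), evaluated at t = g^{-1}(x).

    f1, f2 are the first and second derivatives of f; g1, g2 those of g.
    """
    numerator = g1(t) * f2(t) / g1(t) - f1(t) * g2(t) / g1(t)
    return numerator / g1(t) ** 2


def curvature_gap(f1, f2, g1, g2, t):
    """(-g''/g') - (-f''/f') at t; it has the sign of h'' whenever f' > 0."""
    return (-g2(t) / g1(t)) - (-f2(t) / f1(t))


# Illustrative pair: f = sqrt, g = log, so h(x) = exp(x / 2).
sqrt_d1 = lambda t: 0.5 * t ** -0.5
sqrt_d2 = lambda t: -0.25 * t ** -1.5
log_d1 = lambda t: 1.0 / t
log_d2 = lambda t: -1.0 / t ** 2

x = 1.0
t = math.exp(x)  # t = g^{-1}(x) for g = log
h_second = second_derivative_of_composite(sqrt_d1, sqrt_d2, log_d1, log_d2, t)
gap = curvature_gap(sqrt_d1, sqrt_d2, log_d1, log_d2, t)
```

Here `h_second` matches $e^{1/2}/4$ and `gap` equals $1/(2t) > 0$, so $g$ is more concave than $f$ in the Arrow-Pratt sense and the composite is convex, as the sign rule predicts.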


We want to show that the second derivative of WL as a function of wL is negative and

that it is positive for V as a function of vL, since this function is subtracted, so we need it

convex. Let RRV (l) denote the relative risk aversion of V and similarly the absolute risk

aversion ARW (b; z) where the derivatives are taken w.r.t b. Plugging into the formulas above

one gets:

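$$\frac{\partial^2 V\!\left(\frac{z_H}{z_L} V^{-1}(v_L)\right)}{\partial v_L^2} \ge 0 \iff RR_V\!\left(\frac{y}{z_H}\right) \ge RR_V\!\left(\frac{y}{z_L}\right)$$

$$\frac{\partial^2 W_L\!\left(W_H^{-1}(w_L)\right)}{\partial w_L^2} \le 0 \iff AR_W(b; z_H) \le AR_W(b; z_L)$$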

Thus we require v to be NIRRA, which covers CARA as well. For the last term in the

objective function, we will prove the following lemma:

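Lemma A.3. Assume $u$ is NIARA (covers CRRA) and $v$ is CARA or CRRA. Then $-\frac{\mathcal{W}_{bb}}{\mathcal{W}_b}$ is decreasing in $z$ and $\mathcal{W}_{bb}$ is increasing in $z$.

Once we prove this claim, it is easy to show that $-\frac{W_{bb}(b, z_L)}{W_b(b, z_L)} \ge -\frac{W_{bb}(b, z_H)}{W_b(b, z_H)}$ and $W_{bb}(b, z_L) \le W_{bb}(b, z_H)$ using $\pi \ge \frac{1}{2}$ in a way similar to lemma A.1.

Proof: from above we have:

$$\frac{\mathcal{W}_{bb}}{\mathcal{W}_b} = \frac{u''(c)}{u'(c)}\, \frac{\partial c}{\partial b} = \frac{-1}{\dfrac{\frac{1}{z}\, v'\!\left(\frac{c(z)-b}{z}\right)}{\frac{1}{z^2}\, v''\!\left(\frac{c(z)-b}{z}\right)} + \dfrac{u'(c(z))}{-u''(c(z))}}$$

where the second equality uses the expression for $\frac{\partial c}{\partial b}$ from lemma A.1 and the first order condition $z u'(c) = v'\!\left(\frac{c-b}{z}\right)$. So now $\frac{\mathcal{W}_{bb}}{\mathcal{W}_b}$ is increasing in $z$ if and only if

$$\frac{\frac{1}{z}\, v'\!\left(\frac{c(z)-b}{z}\right)}{\frac{1}{z^2}\, v''\!\left(\frac{c(z)-b}{z}\right)} + \frac{u'(c(z))}{-u''(c(z))}$$

is increasing in $z$. The second term is increasing in $z$ e.g. for $u$ NIARA since $c$ is increasing in $z$. We claim that the first term is increasing in $z$ for $v$ CRRA. To see that note that the first term can be rewritten as:

$$\frac{y(z) \cdot v'\!\left(\frac{y(z)}{z}\right)}{\frac{y(z)}{z} \cdot v''\!\left(\frac{y(z)}{z}\right)} = \frac{y(z)}{-RR_v(y(z))}$$

Remember that $RR_v(y(z))$ is negative so that $-RR_v(y(z))$ is positive. Assuming CRRA, $-RR_v(y(z))$ is constant, and we get that

$$\frac{\partial}{\partial z}\, \frac{y(z)}{-RR_v(y(z))} = \frac{\partial y / \partial z}{-RR_v(y(z))} > 0$$

since $y$ is increasing in $z$ (because $c$ is increasing in $z$, which was established above). Note that it is not straightforward to find a sufficient condition in terms of DRRA or IRRA, because we don’t know the sign of $\frac{\partial \ell(z)}{\partial z}$. The argument is similar for CARA (again we don’t need to worry which way $\ell(z)$ goes). Finally, another sufficient condition would be $\ell(z)$ constant in $z$. The functions that we are using do not have this property, so we don’t include this sufficient condition in the statement of the lemma. Finally the fact that $\mathcal{W}_{bb}$ is (strictly) increasing in $z$ follows from the fact that $\mathcal{W}_b$ is (strictly) decreasing in $z$.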

Step 2. The policy correspondences are upper hemi-continuous in π and z.

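Proof: for u.h.c. in π define

$$f(\pi; c_H, c_L, y_H, y_L, b_H, b_L) := u(c_H) - v\!\left(\frac{y_H}{z_H}\right) + \pi \mathcal{W}(b_H, z_H) + (1 - \pi)\mathcal{W}(b_H, z_L) + u(c_L) - v\!\left(\frac{y_L}{z_L}\right) + \pi \mathcal{W}(b_L, z_L) + (1 - \pi)\mathcal{W}(b_L, z_H)$$

$$\Gamma(\pi) := \Big\{ (c_H, c_L, y_H, y_L, b_H, b_L) \in \mathbb{R}^4_+ \times \mathbb{R}^2 : c_H + c_L + b_H + b_L \le y_H + y_L + A,$$

$$u(c_H) - v\!\left(\frac{y_H}{z_H}\right) + \pi \mathcal{W}(b_H, z_H) + (1 - \pi)\mathcal{W}(b_H, z_L) \ge u(c_L) - v\!\left(\frac{y_L}{z_H}\right) + \pi \mathcal{W}(b_L, z_H) + (1 - \pi)\mathcal{W}(b_L, z_L) \Big\}$$

$$h(\pi) := \Big\{ (c_H^*, c_L^*, y_H^*, y_L^*, b_H^*, b_L^*) \in \mathbb{R}^4_+ \times \mathbb{R}^2 : f(\pi; c_H^*, c_L^*, y_H^*, y_L^*, b_H^*, b_L^*) = \max_{(c_H, c_L, y_H, y_L, b_H, b_L) \in \Gamma(\pi)} f(\pi; c_H, c_L, y_H, y_L, b_H, b_L) \Big\}$$

Below we will prove that $\mathcal{W}(b, z)$ is continuous in $b$. This means that $f$ is a continuous mapping $[0, 1] \times \mathbb{R}^4_+ \times \mathbb{R}^2 \to \mathbb{R}$. Clearly $\Gamma$, which maps $[0, 1]$ into subsets of $\mathbb{R}^4_+ \times \mathbb{R}^2$, is a non-empty valued correspondence. $\Gamma$ is also a continuous correspondence, since all the functions are continuous and π enters linearly. One can show that there exist $L$ large enough and $B$ small enough s.t. $\forall \pi$: $\ell_H, \ell_L < L$ and $b_H, b_L > B$ at a solution to the relaxed problem. Then WLOG $\Gamma$ can be made compact valued since these bounds imply an upper bound on $c_H, c_L, y_L, y_H$ as well.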

Thus h(π) is a non-empty, compact, upper hemi-continuous correspondence. To finish the

proof, we need to prove the following lemma.

Lemma A.4. $\mathcal{W}(b, z)$ is continuous in $b$.

Proof: recall that

$$\mathcal{W}(b; z) = \max_{c}\; u(c) - v\!\left(\frac{c - b}{z}\right) \quad \text{s.t.} \quad c \ge b$$

Standard arguments (strict concavity of the objective function, convexity of the constraint set and the Inada conditions) imply that this problem has a unique interior solution $\forall b$. Consumption solves

$$z u'(c) - v'\!\left(\frac{c - b}{z}\right) = 0.$$

The LHS is a $C^1$ function of $c$ and $b$ and the derivative with respect to $c$ is strictly negative: $z u''(c) - v''\!\left(\frac{c-b}{z}\right)/z$, hence invertible for all $c, b$. Thus $c(b)$ is a continuous function by the implicit function theorem, and hence $\mathcal{W}(b, z) = u(c(b)) - v\!\left(\frac{c(b) - b}{z}\right)$ is a continuous function of $b$. □

The proof of upper hemi-continuity of the policy correspondences in z is similar.


A.4. Relaxed problem valid.

Lemma A.5. The constraint preventing the high type from reporting the low skill level, ICH, is sufficient, i.e. at the solution to the problem where ICL is not included ICL will be satisfied for $\pi \in \{\frac{1}{2}, \pi^*, 1\}$. For $\pi = 1$ we have the result as long as $b_L > b_H$, which has been established for $z_L$ small enough and $v$ CRRA or CARA.

Proof: clearly at the solution to the relaxed problem the ICH binds. To prove the lemma, we need to characterize the solutions in more detail. We will proceed case by case (footnote 4).

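(1) $\pi = \frac{1}{2}$. First note that $y_H > y_L$. This is because as we established above $c_H > c_L$, $b_H > b_L$ and the ICH binds. Moreover $W(b_i, z) = W(b_i)$. Denote $W(b_i)$ as $w_i$. Then WTS:

$$u(c_H) - v\!\left(\frac{y_H}{z_H}\right) + w_H = u(c_L) - v\!\left(\frac{y_L}{z_H}\right) + w_L \;\Rightarrow\; u(c_L) - v\!\left(\frac{y_L}{z_L}\right) + w_L \ge u(c_H) - v\!\left(\frac{y_H}{z_L}\right) + w_H$$

This is equivalent to:

$$u(c_H) - u(c_L) + w_H - w_L = v\!\left(\frac{y_H}{z_H}\right) - v\!\left(\frac{y_L}{z_H}\right) \;\Rightarrow\; v\!\left(\frac{y_H}{z_L}\right) - v\!\left(\frac{y_L}{z_L}\right) \ge u(c_H) - u(c_L) + w_H - w_L$$

Thus it is enough to show:

$$v\!\left(\frac{y_H}{z_L}\right) - v\!\left(\frac{y_L}{z_L}\right) \ge v\!\left(\frac{y_H}{z_H}\right) - v\!\left(\frac{y_L}{z_H}\right)$$

This is equivalent to:

$$\int_{y_L}^{y_H} \frac{v'\!\left(\frac{y}{z_L}\right)}{z_L}\, dy \ge \int_{y_L}^{y_H} \frac{v'\!\left(\frac{y}{z_H}\right)}{z_H}\, dy$$

This is true by convexity of $v$, $y_H > y_L$ and $z_H > z_L$.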

Footnote 4: Note that for $z_L = 0$, the validity of the relaxed problem is clear, because the low type cannot pretend to be the high type.


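(2) $\pi = 1$. Showing that $y_H > y_L$ is more complicated here. It deserves a separate lemma.

Lemma A.6. Suppose $v$ is NIRRA. Then $y_L^1 < y_H^1$.

Proof: first we will show $y_L^1 < y_L^2$. We use superscripts for periods. We have $\pi = 1$ so the types are constant over time. Define $\tilde{y}(b_L)$, the output that will be chosen in the second period by a high type who misreports in the first period, as a function of $b_L$:

$$u'(\tilde{y}(b_L) + b_L) = v'\!\left(\frac{\tilde{y}(b_L)}{z_H}\right) \cdot \frac{1}{z_H}$$

The properties of $u$ and $v$ imply that $y^2$ is strictly increasing in $z$ and therefore $\forall b: y_L^2(b) < \tilde{y}(b)$. Now output in the first period is determined by:

$$v'\!\left(\frac{y_L^1}{z_L}\right)\frac{1}{z_L} - \mu\, v'\!\left(\frac{y_L^1}{z_H}\right)\frac{1}{z_H} = W_1(b_L, z_L) - \mu W_1(b_L, z_H)$$

Using $W_1(b_L, z_H) = v'\!\left(\frac{\tilde{y}(b_L)}{z_H}\right) \cdot \frac{1}{z_H}$ and $W_1(b_L, z_L) = v'\!\left(\frac{y_L^2(b_L)}{z_L}\right)\frac{1}{z_L}$ we can rewrite this:

$$v'\!\left(\frac{y_L^1}{z_L}\right)\frac{1}{z_L} - \mu\, v'\!\left(\frac{y_L^1}{z_H}\right)\frac{1}{z_H} = v'\!\left(\frac{y_L^2(b_L)}{z_L}\right)\frac{1}{z_L} - \mu\, v'\!\left(\frac{\tilde{y}(b_L)}{z_H}\right) \cdot \frac{1}{z_H}$$

Since $y_L^2(b) < \tilde{y}(b)$ the above implies (we drop the arguments for simplicity):

$$v'\!\left(\frac{y_L^1}{z_L}\right)\frac{1}{z_L} - \mu\, v'\!\left(\frac{y_L^1}{z_H}\right)\frac{1}{z_H} < v'\!\left(\frac{y_L^2}{z_L}\right)\frac{1}{z_L} - \mu\, v'\!\left(\frac{y_L^2}{z_H}\right) \cdot \frac{1}{z_H}$$

Now to prove the claim we will show that $f(y) := v'\!\left(\frac{y}{z_L}\right)\frac{1}{z_L} - \mu\, v'\!\left(\frac{y}{z_H}\right)\frac{1}{z_H}$ is increasing in $y$, which implies that $y_L^1 < y_L^2$. Taking the derivative yields:

$$f'(y) = v''\!\left(\frac{y}{z_L}\right)\frac{1}{z_L^2} - \mu\, v''\!\left(\frac{y}{z_H}\right)\frac{1}{z_H^2}$$

By NIRRA we have:

$$-\frac{v''\!\left(\frac{y}{z_L}\right)}{v'\!\left(\frac{y}{z_L}\right)}\, \frac{y}{z_L} \le -\frac{v''\!\left(\frac{y}{z_H}\right)}{v'\!\left(\frac{y}{z_H}\right)}\, \frac{y}{z_H} \;\Longrightarrow\; v''\!\left(\frac{y}{z_L}\right)\frac{1}{z_L} > v''\!\left(\frac{y}{z_H}\right)\frac{1}{z_H} \;\Longrightarrow\; v''\!\left(\frac{y}{z_L}\right)\frac{1}{z_L^2} > v''\!\left(\frac{y}{z_H}\right)\frac{1}{z_H^2}$$

Here the reasoning comes from $v', v'' > 0$ and $z_L < z_H$. Combine the last expression with $\mu \in (0, 1)$ to get $f'(y) > 0$. Thus $y_L^1 < y_L^2$. To prove the lemma suppose by way of contradiction that $y_H^1 \le y_L^1$. Then we can have either $b_H \ge b_L$ or $b_H < b_L$. The first one is inconsistent with the ICH binding since we also have $c_H > c_L$. For the second one, we would have $y_L^2 < y_H^2$ and hence $y_L^1 < y_L^2 < y_H^2 = y_H^1$, which is a contradiction. Thus $y_L^1 < y_H^1$. □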

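To show that the relaxed problem is valid WTS:

$$u(c_H) - v\!\left(\frac{y_H}{z_H}\right) + W(b_H, z_H) = u(c_L) - v\!\left(\frac{y_L}{z_H}\right) + W(b_L, z_H)$$

$$\Rightarrow\; u(c_L) - v\!\left(\frac{y_L}{z_L}\right) + W(b_L, z_L) \ge u(c_H) - v\!\left(\frac{y_H}{z_L}\right) + W(b_H, z_L)$$

Thus it is enough to show:

$$v\!\left(\frac{y_H}{z_L}\right) - v\!\left(\frac{y_L}{z_L}\right) + W(b_L, z_L) - W(b_H, z_L) \ge v\!\left(\frac{y_H}{z_H}\right) - v\!\left(\frac{y_L}{z_H}\right) + W(b_L, z_H) - W(b_H, z_H)$$

Since we have $b_L > b_H$, $y_L < y_H$ we can rewrite the inequality as:

$$\int_{y_L}^{y_H} \frac{v'\!\left(\frac{y}{z_L}\right)}{z_L}\, dy + \int_{b_H}^{b_L} W_b(b, z_L)\, db \ge \int_{y_L}^{y_H} \frac{v'\!\left(\frac{y}{z_H}\right)}{z_H}\, dy + \int_{b_H}^{b_L} W_b(b, z_H)\, db$$

Which is true by the convexity of $v$, $z_H > z_L$, and $W_b(b, z_L) \ge W_b(b, z_H)$. □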


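(3) $\pi = \pi^*$. The $W$’s in the IC’s cancel out by $b_L = b_H$. Moreover the ICH binding implies $y_H > y_L$. Thus this reduces to showing the following, which has been proved before:

$$v\!\left(\frac{y_H}{z_L}\right) - v\!\left(\frac{y_L}{z_L}\right) \ge v\!\left(\frac{y_H}{z_H}\right) - v\!\left(\frac{y_L}{z_H}\right)$$

□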
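The inequality used in cases (1) and (3) says that raising output from $y_L$ to $y_H$ costs the low-skill type more disutility than the high-skill type. A minimal numerical sketch follows, assuming the quadratic disutility $v(\ell) = \ell^2/2$ and made-up values for outputs and skills; neither the functional form nor the numbers come from the paper, and the function names are hypothetical.

```python
def disutility_gap(v, y_low, y_high, z):
    """v(y_high / z) - v(y_low / z): the extra disutility of the higher output."""
    return v(y_high / z) - v(y_low / z)


def gap_by_integration(v_prime, y_low, y_high, z, steps=10000):
    """Midpoint-rule approximation of the integral of v'(y / z) / z over [y_low, y_high]."""
    width = (y_high - y_low) / steps
    total = 0.0
    for k in range(steps):
        y = y_low + (k + 0.5) * width
        total += v_prime(y / z) / z * width
    return total


# Assumed convex disutility and illustrative numbers.
v = lambda l: 0.5 * l ** 2
v_prime = lambda l: l
y_L, y_H, z_L, z_H = 1.0, 3.0, 1.0, 2.0

gap_low_skill = disutility_gap(v, y_L, y_H, z_L)
gap_high_skill = disutility_gap(v, y_L, y_H, z_H)
```

With these numbers the low-skill gap is 4 and the high-skill gap is 1, and `gap_by_integration` reproduces both, which is the integral form of the same comparison.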

A.5. Family level uncertainty with 2 types. The general problem of a parent with

assets A and skill level zP is defined in section 2.4. We make several assumptions here.

First, there are only 2 states: H and L so that (ij) ∈ S := {HH,HL, LH, LL}. For

each kid Pr(H) = Pr(L) and the draws are independent across kids. Since the solution is

unique, it will feature symmetric treatment, so we will use xs for allocations of the first kid

∀s ∈ S. Parent’s allocations are defined as xP (s). By symmetry xP (LH) = xP (HL) and

also $\lambda_{LH} = \lambda_{HL}$. To simplify notation we divide the objective function by $\frac{1}{\eta}$. We keep β and $R$ in the model, but we assume that $\beta \cdot R = 1$. As before, we consider a relaxed problem and assume that the ICH is binding. The goal is to prove the following proposition:

Proposition A.7. Suppose $z_L = 0$ and the ICH binds. Then $\exists \pi^* \in (\frac{1}{2}, 1)$ s.t. $b_{HL} = b_{LH}$.

The crucial and somewhat tedious part is to prove the following lemma.

Lemma A.8. At the solution to the relaxed problem: $g_{LH} + \frac{b_{LH}}{R} > g_{HL} + \frac{b_{HL}}{R}$, i.e. “overall redistribution” in state $(LH)$ is progressive.

Proof: for clarity of exposition we will divide the proof into several steps.

Step 1: consider an artificial problem of a high kid who is given a transfer $x_H = c_H + \frac{b_H}{R} - y_H$ or $x_{HL} = c_{HL} + \frac{b_{HL}}{R} - y_{HL}$ with equal probability (the variables on the RHS are the optimal allocations from the relaxed problem), and we let this kid decide how much he wants to work, consume and save for next period conditional on the transfer he got. His problem is:

$$\max_{c_1, y_1, b_1, c_2, y_2, b_2}\; u(c_1) - v\!\left(\frac{y_1}{z_H}\right) + \beta W(b_1, z_H) + u(c_2) - v\!\left(\frac{y_2}{z_H}\right) + \beta W(b_2, z_H) \quad \text{s.t.}$$

$$c_1 + \frac{b_1}{R} \le y_1 + x_H$$

$$c_2 + \frac{b_2}{R} \le y_2 + x_{HL}$$

It is clear that (no distortion on top principle) the unique solution is:

$$(c_1^*, y_1^*, b_1^*, c_2^*, y_2^*, b_2^*) = (c_H, y_H, b_H, c_{HL}, y_{HL}, b_{HL})$$

Since the ICH is assumed binding, $(c_L, y_L, b_L, c_{LH}, y_{LH}, b_{LH})$ gives the same utility as $(c_H, y_H, b_H, c_{HL}, y_{HL}, b_{HL})$. Thus $(c_L, y_L, b_L, c_{LH}, y_{LH}, b_{LH})$ can’t be in the constraint set. Thus one of the 2 constraints must be violated and we get that:

$$c_L - y_L + \frac{b_L}{R} > c_H + \frac{b_H}{R} - y_H \;\;\vee\;\; c_{LH} - y_{LH} + \frac{b_{LH}}{R} > c_{HL} - y_{HL} + \frac{b_{HL}}{R}$$

$$\iff$$

$$g_L + \frac{b_L}{R} > g_H + \frac{b_H}{R} \;\;\vee\;\; g_{LH} + \frac{b_{LH}}{R} > g_{HL} + \frac{b_{HL}}{R}$$

Step 2: Now we will prove $g_{LH} + \frac{b_{LH}}{R} > g_{HL} + \frac{b_{HL}}{R} \Longrightarrow g_L + \frac{b_L}{R} > g_H + \frac{b_H}{R}$. Combined with the result above this will in fact imply $g_L + \frac{b_L}{R} > g_H + \frac{b_H}{R}$. Under our supposition we will prove the following two claims (footnote 5):

(1) $\lambda_{LL} > \lambda_{LH}$. Suppose not, i.e. $\lambda_{LL} \le \lambda_{LH}$. The FOC will imply: $c_L \ge c_{LH}$, $y_L \le y_{LH}$, $b_L \ge b_{LH}$ and $c_P(LL) - y_P(LL) \ge c_P(LH) - y_P(LH)$. The first is obvious. For the second we use that $v'\!\left(\frac{y}{z_L}\right)\frac{1}{z_L} - \mu\, v'\!\left(\frac{y}{z_H}\right)\frac{1}{z_H}$ is increasing in $y$, which was established in lemma A.6. The next one follows from $W_1(b, z_L) - \mu W_1(b, z_H)$ decreasing in $b$, for which we use $W_{bb}$ negative and increasing in $z$ as shown in lemma A.1 and lemma A.3 and $\mu \in (0, 1)$, and the last one is obvious from the FOC. We add all these up and use feasibility and the overall progressiveness assumption to get a contradiction:

$$A = c_P(LL) - y_P(LL) + 2\left[g_L + \frac{b_L}{R}\right] \ge c_P(LH) - y_P(LH) + 2\left[g_{LH} + \frac{b_{LH}}{R}\right] > c_P(LH) - y_P(LH) + g_{LH} + \frac{b_{LH}}{R} + g_{HL} + \frac{b_{HL}}{R} = A$$

(2) $\lambda_{LH} > \lambda_{HH}$. Suppose not, i.e. $\lambda_{LH} \le \lambda_{HH}$. The same reasoning implies $c_H \le c_{HL}$, $y_H \ge y_{HL}$, $b_H \le b_{HL}$, $c_P(HH) - y_P(HH) \le c_P(LH) - y_P(LH)$ and we get a contradiction as before.

Thus we have $\lambda_{LL} > \lambda_{LH} > \lambda_{HH}$ under our assumption. Now $\lambda_{LL} > \lambda_{HH} \Longrightarrow c_P(LL) - y_P(LL) < c_P(LH) - y_P(LH)$. The claim easily follows.

Note that in fact $g_L + \frac{b_L}{R} > g_H + \frac{b_H}{R} \iff \lambda_{LL} > \lambda_{HH}$, so we have established $\lambda_{LL} > \lambda_{HH}$ independent of the supposition as well.

Footnote 5: We will use the fact that since there is no distortion for the parent there is an equivalence between the relative size of the λ’s and how much the parent will give to the kids in each state.

Step 3: now we will show $\lambda_{LL} > \lambda_{LH} > \lambda_{HH}$ (without the assumption $g_{LH} + \frac{b_{LH}}{R} > g_{HL} + \frac{b_{HL}}{R}$).

We will proceed by contradiction. Since we have shown that $\lambda_{LL} > \lambda_{HH}$ we need to rule out 2 cases: $\lambda_{LH} \ge \lambda_{LL} > \lambda_{HH}$ and $\lambda_{LL} > \lambda_{HH} \ge \lambda_{LH}$.

(1) $\lambda_{LH} \ge \lambda_{LL} > \lambda_{HH}$. Note that we can compare the allocations for the high agent in different states because of no distortions and for the low agent across states because of the properties of $v$ and $W$ just like above. We get the following inequalities.

$\lambda_{LH} \ge \lambda_{LL}$ implies:           $\lambda_{LH} > \lambda_{HH}$ implies:
$c_P(LL) \ge c_P(LH)$                 $c_P(HH) > c_P(LH)$
$y_P(LL) \le y_P(LH)$                 $y_P(HH) < y_P(LH)$
$c_L \ge c_{LH}$                        $c_H > c_{HL}$
$y_L \le y_{LH}$                        $y_H < y_{HL}$
$b_L \ge b_{LH}$                        $b_H > b_{HL}$

When we add the first 2 lines and use the BC’s we get $g_L + \frac{b_L}{R} + g_H + \frac{b_H}{R} < g_{LH} + \frac{b_{LH}}{R} + g_{HL} + \frac{b_{HL}}{R}$. When we add the last 3 lines, we get: $g_L + \frac{b_L}{R} + g_H + \frac{b_H}{R} > g_{LH} + \frac{b_{LH}}{R} + g_{HL} + \frac{b_{HL}}{R}$. This is a contradiction.

(2) $\lambda_{LL} > \lambda_{HH} \ge \lambda_{LH}$. A contradiction can be reached in a very similar way, so it won’t be repeated here.

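Step 4: $\lambda_{LL} > \lambda_{LH} > \lambda_{HH} \Longrightarrow g_{LH} + \frac{b_{LH}}{R} > g_{HL} + \frac{b_{HL}}{R}$.

This goes in a very similar way as before:

$$\lambda_{LL} > \lambda_{LH} \;\Rightarrow\; g_{LH} + \frac{b_{LH}}{R} > g_L + \frac{b_L}{R}$$

$$\lambda_{LH} > \lambda_{HH} \;\Rightarrow\; g_H + \frac{b_H}{R} > g_{HL} + \frac{b_{HL}}{R}$$

But we have already established that: $g_L + \frac{b_L}{R} > g_H + \frac{b_H}{R}$. So:

$$g_{LH} + \frac{b_{LH}}{R} > g_L + \frac{b_L}{R} > g_H + \frac{b_H}{R} > g_{HL} + \frac{b_{HL}}{R}$$

□

The FOC also imply that $c_{LH} < c_{HL}$. We can use this along with the above results to characterize the solution as follows:

$$c_L < c_{LH} < c_{HL} < c_H, \quad y_L > y_{LH}, \quad y_{HL} > y_H, \quad b_L < b_{LH}, \quad b_{HL} < b_H$$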

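One would guess that $y_{LH} < y_{HL}$. Recall that even in the simpler model without family level uncertainty, this was established case by case. We’ll follow that strategy here as well. The ordering of $b_{LH}$ and $b_{HL}$ will change with π as before.

Lemma A.9. Suppose $\pi = \frac{1}{2}$. Then $b_{LH} < b_{HL}$, $y_L < y_H$, $y_{LH} < y_{HL}$ and the ICL is satisfied.

Proof: the first one is obvious from the FOC. For the second note that if $\pi = \frac{1}{2}$ the FOC imply that $b_L < b_H$. The claim then follows from $c_L + b_L < c_H + b_H$ and $c_L - y_L + b_L > c_H - y_H + b_H$. Using the characterization above and the first result, we get from the ICH being satisfied with equality that:

$$2 \cdot v\!\left(\frac{y_{HL}}{z_H}\right) > v\!\left(\frac{y_H}{z_H}\right) + v\!\left(\frac{y_{HL}}{z_H}\right) > v\!\left(\frac{y_{LH}}{z_H}\right) + v\!\left(\frac{y_L}{z_H}\right) > 2 \cdot v\!\left(\frac{y_{LH}}{z_H}\right) \;\Longrightarrow\; y_{HL} > y_{LH}$$

To show that the ICL is satisfied it is enough to show:

$$v\!\left(\frac{y_H}{z_L}\right) + v\!\left(\frac{y_{HL}}{z_L}\right) - v\!\left(\frac{y_{LH}}{z_L}\right) - v\!\left(\frac{y_L}{z_L}\right) > v\!\left(\frac{y_H}{z_H}\right) + v\!\left(\frac{y_{HL}}{z_H}\right) - v\!\left(\frac{y_{LH}}{z_H}\right) - v\!\left(\frac{y_L}{z_H}\right)$$

that is,

$$\int_{y_L}^{y_H} \frac{v'\!\left(\frac{y}{z_L}\right)}{z_L}\, dy + \int_{y_{LH}}^{y_{HL}} \frac{v'\!\left(\frac{y}{z_L}\right)}{z_L}\, dy > \int_{y_L}^{y_H} \frac{v'\!\left(\frac{y}{z_H}\right)}{z_H}\, dy + \int_{y_{LH}}^{y_{HL}} \frac{v'\!\left(\frac{y}{z_H}\right)}{z_H}\, dy$$

This is true by $v$ convex, $z_L < z_H$, $y_L < y_H$ and $y_{LH} < y_{HL}$, which has just been shown. □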

Lemma A.10. Assume that $\pi = 1$, $z_L = 0$. Then $b_{LH} > b_{HL}$.

Proof: this is the same as before. The superscript denotes the second period. We get:

$$W_b(b_{LH}, z_L) < u'(c_{LH}) \;\Rightarrow\; u'(c_{LH}^2) < u'(c_{LH}) \;\Rightarrow\; c_{LH} < c_{LH}^2 \;\Rightarrow\; g_{LH} < b_{LH} \;\Rightarrow\; b_{HL} < b_{LH}$$

The last result follows from the overall progressivity and the fact that $b_{HL} = g_{HL}$, which is still true even if $R \ne 1$ as long as $\beta R = 1$. Under our assumption that $z_L = 0$ the ICL is satisfied as well (footnote 6), which is actually true $\forall \pi \in [\frac{1}{2}, 1]$. Hence it is also true for $\pi^*$. We can use continuity in $z$ (this can be established as before) to extend this argument to a neighborhood of $z_L = 0$.

Footnote 6: In general, we can’t show that the ICL is satisfied even if we could show that $y_{LH} < y_{HL}$.


Appendix B

Here we explain our calibration strategy and the details of our numerical implementation.

B.1. Income Process Calibration. We assume the following about the distribution of

skills and assets.

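$$(\log z_P, \log A) \sim N_2\!\left( \begin{pmatrix} \mu_{zp} \\ \mu_A \end{pmatrix}, \begin{pmatrix} \sigma^2_{zp} & \mathrm{corr}(A, z_p) \cdot \sigma_{zp}\sigma_A \\ \mathrm{corr}(A, z_p) \cdot \sigma_{zp}\sigma_A & \sigma^2_A \end{pmatrix} \right)$$

$$\log z_1 = \mu_{z1} + \rho_{zG}\,(\log z_p - \mu_{zp}) + \varepsilon_{zG}, \qquad \varepsilon_{zG} \sim N(0, \sigma^2_{zG})$$

$$\log z_2 = \mu_{z2} + \rho_{zL}\,(\log z_1 - \mu_{z1}) + \varepsilon_{zL}, \qquad \varepsilon_{zL} \sim N(0, \sigma^2_{zL})$$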

We choose not to directly compare the outcomes in the second period to the data. Instead

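we simply assume that: $\mu_{z2} = \mu_{zp} + 20 \log(1.02)$ and $\sigma^2_{z2} = \sigma^2_{zp}$. Then we get:

$$\log z_1 \sim N\!\left( \mu_{z1},\; \frac{\rho^2_{zG}\sigma^2_{zL} + \sigma^2_{zG}}{1 - \rho^2_{zG}\rho^2_{zL}} \right)$$

$$\log z_2 \sim N\!\left( \mu_{zp} + 20 \log(1.02),\; \frac{\rho^2_{zL}\sigma^2_{zG} + \sigma^2_{zL}}{1 - \rho^2_{zG}\rho^2_{zL}} \right)$$

$$\sigma^2_{zp} = \sigma^2_{z2} = \frac{\rho^2_{zL}\sigma^2_{zG} + \sigma^2_{zL}}{1 - \rho^2_{zG}\rho^2_{zL}}$$

Hence to completely define the skill process we need to specify the following 9 parameters:

$\mu_A$, $\sigma^2_A$, $\mu_{zp}$, $\sigma^2_{zL}$, $\mathrm{corr}(A, z_p)$, $\mu_{z1}$, $\sigma^2_{zG}$, $\rho_{zL}$, $\rho_{zG}$.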

We pick $\mu_A$ and $\sigma^2_A$ directly to match the data. For parents' assets, we run the following regression for agents aged 46-65 in the 2005 PSID:

$$\log A_i = \beta_0 + \beta_1 age_i + \beta_2 age_i^2 + \beta_3 age_i^3 + \beta_4 age_i^4 + \varepsilon_i$$

Then we compute the wealth as if everybody were 46 years old. We get the following numbers: $\mu_A = 11.1069$, $\sigma^2_A = 3.5792$.
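One way to read the age adjustment is to keep each agent's regression residual and replace the fitted age profile by its value at age 46. The sketch below follows that reading, which is an assumption about the procedure; the coefficients and observations are hypothetical placeholders, not PSID estimates.

```python
from statistics import mean, pvariance


def age_profile(age, coeffs):
    """Quartic polynomial b0 + b1*age + b2*age**2 + b3*age**3 + b4*age**4."""
    return sum(b * age**k for k, b in enumerate(coeffs))


def adjust_to_age(log_assets, age, coeffs, reference_age=46):
    """Keep the residual, move the fitted profile to the reference age."""
    residual = log_assets - age_profile(age, coeffs)
    return age_profile(reference_age, coeffs) + residual


# Hypothetical coefficients (b0..b4) and observations (log assets, age).
HYPOTHETICAL_COEFFS = (8.0, 0.1, -0.001, 0.0, 0.0)
sample = [(10.2, 46), (11.5, 52), (12.1, 60), (9.8, 65)]

adjusted = [adjust_to_age(a, age, HYPOTHETICAL_COEFFS) for a, age in sample]
mu_a, var_a = mean(adjusted), pvariance(adjusted)
print(mu_a, var_a)
```

The mean and variance of the adjusted values would then play the role of $\mu_A$ and $\sigma^2_A$.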

The remaining parameters are set in our calibration procedure. The table below summarizes which parameters we guess and which targets we want to match. The explanation follows.

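Parameters               Targets
$\mu_{zp}$               $\mu_{yp} = 13.3803$
$\sigma^2_{zL}$          $\sigma^2_{yp} = 0.8332$
$\mathrm{corr}(A, z_p)$  $\mathrm{corr}(A, y_p) = 0.5132$
$\mu_{z1}$               $\mu_{y1} = 13.2512$
$\sigma^2_{zG}$          $\sigma^2_{y1} = 0.6813$
$\rho_{zG}$              $\rho_{yG} = 0.6379$
$\rho_{zL}$              $\rho_{yL} = 0.7054$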

• $\mu_{yp}$, $\sigma^2_{yp}$, $\mathrm{corr}(A, y_p)$, $\mu_{y1}$, $\sigma^2_{y1}$ are all measures of logged income. We approximate the age-earnings profile of an individual $i$ by estimating the following regression equation for the 2005 PSID data:

$$\log y_i = \beta_0 + \beta_1 age_i + \beta_2 age_i^2 + \beta_3 age_i^3 + \beta_4 age_i^4 + \varepsilon_i$$

We estimate these regressions separately for the two age groups. Then we use the computed coefficients and residuals to simulate the earnings profile over the 26-45 and 46-65 spans. Then we simply add the earnings over the 20 years with discounting.

• $\rho_{yL}$ was calculated from the estimated process for logged yearly income in Heathcote, Storesletten and Violante (2008). We simulate the process for 40 years and then take the average income in the first and second 20-year periods. Then we regress the second-period average on the first-period average to get our estimate for $\rho_{yL}$ in the data. For the data generated by the model, we simply regress kids' second-period income on first-period income, with the number of observations for each outcome proportional to its probability.

• $\rho_{yG}$: the empirical literature on inter-generational income elasticity relates parents' logged earnings while young to children's logged earnings while young. However, the outcome of our model is a regression coefficient from $\log y_1 = \beta_0 + \rho_{yG} \log y_G + \varepsilon$. Thus we measure the child's first-period income as a function of the parent's second-period income. The parent's first-period income enters only indirectly as (assuming the process doesn't change over time) $\log y_{p2} = \beta_0 + \rho_{yL} \cdot \log y_{p1} + \varepsilon$. Thus the empirical literature is measuring $\rho_{yL} \cdot \rho_{yG}$. These studies find a range of values between 0.15 and 0.55 (for a survey, see Solon, 1999). For the benchmark simulations we use 0.45⁷ and then $\rho_{yG} = 0.45 / \rho_{yL}$.
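The time-aggregation step behind $\rho_{yL}$ and the conversion to $\rho_{yG}$ can be sketched as follows. The annual AR(1) process, its persistence and its innovation standard deviation are hypothetical stand-ins (the estimated process from Heathcote, Storesletten and Violante is not reproduced here), and averaging income in levels before taking logs is likewise an assumption.

```python
import math
import random


def simulate_period_averages(rho_annual, sigma_eps, n_people, rng, years=40):
    """Simulate annual log income and return the log of average income
    in the first and second halves of the horizon for each person."""
    half = years // 2
    sd_stationary = sigma_eps / math.sqrt(1.0 - rho_annual**2)
    first, second = [], []
    for _ in range(n_people):
        log_y = rng.gauss(0.0, sd_stationary)  # start from the stationary law
        incomes = []
        for _ in range(years):
            incomes.append(math.exp(log_y))
            log_y = rho_annual * log_y + rng.gauss(0.0, sigma_eps)
        first.append(math.log(sum(incomes[:half]) / half))
        second.append(math.log(sum(incomes[half:]) / half))
    return first, second


def ols_slope(x, y):
    """Slope coefficient from regressing y on x with an intercept."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(0)
first, second = simulate_period_averages(0.97, 0.15, 5000, rng)
rho_yl_simulated = ols_slope(first, second)

# Conversion used in the text, with the targets reported in the table.
rho_yg = 0.45 / 0.7054
print(rho_yl_simulated, rho_yg)
```

The last line reproduces the arithmetic linking the two targets: dividing 0.45 by $\rho_{yL} = 0.7054$ gives the value 0.6379 listed for $\rho_{yG}$.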

⁷ 0.448 is a simple average of what Lee and Solon (2006) report for sons and daughters of ages 30-34 and 35-39 and years 1986-2006 (Table 3), based on the PSID data. The independent variable in their regression is logged parents' income when the kids were 15-17 years old, so the dependent variable measures the income of the offspring roughly 20 years later, which is our model period length. If we use just the data for ages 35-39, we get 0.475.

B.2. Numerical Implementation.

• First, we describe how we deal with children's savings. We estimate the following regression equation for agents aged 26-45:

$$\log A_i = \beta_0 + \beta_1 age_i + \beta_2 age_i^2 + \beta_3 age_i^3 + \beta_4 age_i^4 + \beta_5 \log y_i + \varepsilon_i$$

For assets we use the 2005 PSID data and for income we use the simulated income as described above. Then we define the savings function as if the agents were 45 years old. Thus we compute $\alpha = \beta_0 + \beta_1 \cdot 45 + \beta_2 \cdot 45^2 + \beta_3 \cdot 45^3 + \beta_4 \cdot 45^4$ and use the following deterministic savings function: $\log A_i = \alpha + \beta \log y_i$. We get:

$$\log A_i = -4.6971 + 1.1489 \log y_i.$$

• In the benchmark simulations, we use 5 states for assets and 5 states for the parent's skill $z_P$, giving a grid of 25 points. The grid points on either axis are equally spaced in terms of probability under the unconditional distributions. For each $z_P$ we have a grid of 5 values of $z_1$ and 5 values of $z_2$, equally spaced in the conditional probabilities of $z_1 | z_P$ and $z_2 | z_P$. Thus we effectively have 25 grid points for both $z_1$ and $z_2$.
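The two ingredients of this subsection, a grid that is evenly spaced in probability and the deterministic savings function, can be sketched as below. Placing each grid point at the probability midpoint of its bin is an assumption (the text only says the spacing is equal in probability); the mean, variance and savings coefficients are the values reported above.

```python
from statistics import NormalDist


def equiprobable_grid(mean, variance, n_states=5):
    """Grid for a normal variable with points at the probability
    midpoints of n_states equally likely bins (an assumed convention)."""
    dist = NormalDist(mu=mean, sigma=variance ** 0.5)
    return [dist.inv_cdf((i + 0.5) / n_states) for i in range(n_states)]


def log_savings(log_income):
    """Deterministic savings function reported in the text."""
    return -4.6971 + 1.1489 * log_income


# Grid for log parental assets, using mu_A and sigma^2_A from B.1.
log_asset_grid = equiprobable_grid(11.1069, 3.5792)
print(log_asset_grid)
print(log_savings(13.2512))  # savings at the targeted mean of log y1
```

Each of the 5 points carries probability 1/5, so a 5x5 grid over assets and $z_P$ has 25 points in total.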


Appendix C. Additional Quantitative Results

C.1. Results for η = 0.5. Here, we report the results obtained for η = 0.5 and various values of σ = γ.⁸

Table: Results for η = 0.5

σ = γ    $\beta_{gy}$    $\beta_{by}$    % with $\left|\frac{b_1}{(b_1+b_2)/2} - 1\right| < .125$
0.5      1.2214          -0.6090         43.83%
1        -1.0250         -0.7274         38.28%
2        -2.0512         -1.4280         22.67%
4        -2.0836         -1.4872         20.03%
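The last column of the table is the share of families in which a child's bequest lies within a given relative distance of the sibling average. A minimal sketch of that statistic follows; the bequest pairs are hypothetical, and skipping families whose average bequest is zero is an assumption made to avoid division by zero.

```python
def share_near_equal(bequest_pairs, tolerance):
    """Share of families with |b1 / ((b1 + b2) / 2) - 1| < tolerance."""
    count = 0
    for b1, b2 in bequest_pairs:
        average = (b1 + b2) / 2
        # Families with a zero average are counted as not near-equal.
        if average != 0 and abs(b1 / average - 1) < tolerance:
            count += 1
    return count / len(bequest_pairs)


# Hypothetical bequest pairs (b1, b2), for illustration only.
pairs = [(100.0, 100.0), (101.0, 99.0), (110.0, 90.0), (150.0, 50.0)]
for tol in (0.02, 0.125, 0.25):
    print(tol, share_near_equal(pairs, tol))
```

Because b1 and b2 deviate from their average by the same absolute amount, computing the statistic with b1 or with b2 gives the same answer.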

C.2. 10 states. In the table we compare the results obtained using a matrix of 5x5 states

for zP and assets with those obtained for 10 states.

Table: Gifts and bequests statistics for φ = σ = γ = 1, η = 0.75.

                                                      data      private info  private info  public info  public info
                                                                5 states      10 states     5 states     10 states
$E(g)/E(y)$                                           0.0184    0.2266        0.2297        0.1448       0.1494
$\beta_{gy}$                                          -0.5459   -1.5438       -1.3202       -1.6757      -1.6288
$E(b)/E(y)$                                           0.0157    -0.3368       -0.3095       -0.3411      -0.3254
$\beta_{by}$                                          0         -0.9417       -1.0133       -1.4966      -1.4992
% with $\left|\frac{b_1}{(b_1+b_2)/2} - 1\right| < .02$    77.6%     20%           10%           20%          10%
% with $\left|\frac{b_1}{(b_1+b_2)/2} - 1\right| < .125$   88%       28.66%        26.72%        20%          10%
% with $\left|\frac{b_1}{(b_1+b_2)/2} - 1\right| < .25$    n/a       49.9%         48.87%        20%          17.96%

⁸ Cases where σ = γ are much easier to solve numerically, because the solution to the autarky problem becomes a quadratic equation, eliminating costly approximations of the analytical solution.


The message of the table is that the results are robust to the number of states we use. To that end, we want to emphasize two points that are valid for both the private and the public information economy:

(1) The magnitudes of the regression coefficients and mean-to-mean ratios do not change very much as we increase the number of states from 5 to 10.

(2) The percentage of children who get bequests within a given distance of the mean does not change very much as we increase the number of states from 5 to 10. Note that with 5 states there are 20%, and with 10 states 10%, of families whose children are of the same type, so they get the same bequest trivially. The last 2 lines for the private information economy and the last line for the public information economy indicate that these children will actually end up getting bequests within the given bounds as we increase the number of states and they cease to be of exactly the same type (their types will remain quite close nevertheless). This justifies why we included these 20% of children in these statistics in the main text.


C.3. Results for the Alternative Calibration Procedure. Below, we report results from a calibration procedure in which we did not target the variance of children's income in the first period, $\sigma^2_{y1}$, but instead targeted the regression coefficient $\beta_{by}$.

Table: Targeting $\beta_{by}$ instead of $\sigma^2_{y1}$

                                                      data      original model  this model
$E(g)/E(y)$                                           0.0184    0.2266          0.2984
$\beta_{gy}$                                          -0.5459   -1.5438         -0.2217
$E(b)/E(y)$                                           0.0157    -0.3368         -0.3320
$\beta_{by}$                                          0         -0.9417         0
% with $\left|\frac{b_1}{(b_1+b_2)/2} - 1\right| < .02$    77.6%     20%             98.73%
% with $\left|\frac{b_1}{(b_1+b_2)/2} - 1\right| < .125$   88%       28.66%          100%
% with $\left|\frac{b_1}{(b_1+b_2)/2} - 1\right| < .25$    n/a       49.9%           100%
$\sigma^2_{y1}$                                       0.6813    0.6813          0.3433
$\sigma^2_{z1}$                                       n/a       0.2241          0.0038


C.4. Non-negativity Constraint on Bequests. In this section, we report the results from a comparative statics exercise: for a given set of utility parameters we keep the skill process parameters from the calibration procedure and impose a set of constraints: $b_1(ij), b_2(ij) \geq 0$. We chose this set of utility parameters because here more than 50% of children actually get positive bequests in both scenarios.

Table: Results for σ = γ = 2 and η = φ = 1.

                                                      data      without b ≥ 0  with b ≥ 0
$E(g)/E(y)$                                           0.0184    0.1189         0.0932
$\beta_{gy}$                                          -0.5459   -2.0105        -2.0137
$E(b)/E(y)$                                           0.0157    0.0172         0.0512
$\beta_{by}$                                          0         -1.6370        -0.3626
% with $b_1$ or $b_2 > 0$                             n/a       0.6165         0.5990
% with $\left|\frac{b_1}{(b_1+b_2)/2} - 1\right| < .02$    77.6%     20%            12.94%
% with $\left|\frac{b_1}{(b_1+b_2)/2} - 1\right| < .125$   88%       20%            12.94%
% with $\left|\frac{b_1}{(b_1+b_2)/2} - 1\right| < .25$    n/a       27.89%         17.81%

We want to make 2 points:

(1) The numbers do not change very much for gifts.

(2) The regression coefficient on bequests in the constrained problem was computed using only children with $b > 0$. We see quite a dramatic decrease in the magnitude of this coefficient (from -1.6370 to -0.3626): bequests become less progressive. It seems that the parent can't extract resources from the rich kid and transfer them to the poor kid. However, we also see a decrease in the percentage of children with similar bequests.
