Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding...

80
Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford GSB Takuo Sugaya Stanford GSB October 3, 2017 Abstract Contracts that compensate workers if they choose to leave an organization or Pay to Quit programs are becoming increasingly prevalent in both established and new companies. In this paper, we address the puzzle of why such employment contracts are o/ered. We propose a model to study the optimal employment contract when rms face both an adverse selection problem they need to nd a good t for the project and a moral hazard problem they need to incentize employee e/ort. Moreover, hiring the bad t is costly for the rm because taking on a worker requires the rm to relinquish an outside option, coming for instance form being able to search for a new candidate. We fully characterize the optimal employment contract in this environment and derive the conditions under which o/ering a payment for quitting is optimal. Foarta: Stanford Graduate School of Business, 655 Knight Way, Stanford, CA, 94305, tel: (650) 723-4058, email: [email protected]. Sugaya: Stanford Graduate School of Business, 655 Knight Way, Stanford, CA, 94305, tel: (650) 724-3739, email: [email protected]. 1

Transcript of Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding...

Page 1: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

Avoiding the Cost of a Bad Hire: On the Optimality ofPay to Quit Programs∗

Dana FoartaStanford GSB

Takuo SugayaStanford GSB

October 3, 2017

Abstract

Contracts that compensate workers if they choose to leave an organization —or Payto Quit programs —are becoming increasingly prevalent in both established and newcompanies. In this paper, we address the puzzle of why such employment contracts areoffered. We propose a model to study the optimal employment contract when firmsface both an adverse selection problem —they need to find a good fit for the project— and a moral hazard problem — they need to incentize employee effort. Moreover,hiring the bad fit is costly for the firm because taking on a worker requires the firm torelinquish an outside option, coming for instance form being able to search for a newcandidate. We fully characterize the optimal employment contract in this environmentand derive the conditions under which offering a payment for quitting is optimal.

∗Foarta: Stanford Graduate School of Business, 655 Knight Way, Stanford, CA, 94305, tel: (650) 723-4058,email: [email protected]. Sugaya: Stanford Graduate School of Business, 655 Knight Way, Stanford,CA, 94305, tel: (650) 724-3739, email: [email protected].

1

Page 2: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

1 Introduction

An increasingly common practice among both small and large companies has been to in-

centivize the early removal of potentially ‘bad fits’by offering contracts that compensate

employees who decide to quit —called Pay to Quit programs. The online retailer Zappos

famously introduced such contracts in the 2000s.1 As described by the Harvard Business

Review, “Zappos wants to learn if there’s a bad fit between what makes the organization tick

and what makes individual employees tick —and it’s willing to pay to learn sooner rather than

later.”2 Similar Pay to Quit programs were subsequently adopted by Amazon in 2014 and

by some Silicon Valley start-ups afterwards.3 Relatedly, in both the public sector and the

private sector, it is becoming increasingly common to slot employees early on into different

career tracks, with a career advancing track —for example, a track towards becoming a part-

ner —existing alongside a ‘lifestyle’track —the type of job from which career advancement

is limited, pay is lower, but the risk of termination due to low performance is minimal.4

A motivation often offered for this phenomenon is that organizations are structured in

ways that are increasingly reliant on employee cultural fit. A misfit employee does not only

cost his organization in terms of productivity — the usual concern of supplying too little

effort —, but also in terms of the effectiveness of the organizational structure —a problem of

selecting the right type of person for the team. Moreover, in many cases, organizations cannot

immediately compensate for a bad hire by expanding the team and hiring other employees

who are good fits. Organizational charts and team sizes cannot be easily expanded, so an

employee must leave for a new one to be hired. Moreover, even when teams can be expanded,

a bad hire’s negative effect on morale and culture may not be easily neutralized. The process

1The offer was $2000 and the take-up rate about 3% (https://www.cbsnews.com/news/amazon-pays-employees-5000-to-quit/)

2https://hbr.org/2008/05/why-zappos-pays-new-employees (accessed on August 8, 2017).3The monetary compensation was up to $5000 in the case of Amazon and 10% of salary in the case of

startup Mavens. Seehttps://www.cbsnews.com/news/amazon-pays-employees-5000-to-quit/ https://www.cnbc.com/2016/09/30/this-

company-will-pay-you-a-bonus-to-quit-your-job-commentary.html4For example, an increasing trend among law firms is to have a “permanent associates”track along with

the partner track: http://www.nytimes.com/2011/05/24/business/24lawyers.html

2

Page 3: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

of removing a misfit hire is in itself diffi cult and costly,5 even unfeasible in some cases.6

Given the high costs of bad hires, a key concern for both firms and public institutions

is how to optimally structure contracts that incentivize employee effort —the moral hazard

problem —and at the same time incentivize low ability employees to select out in order to free

up the spot for another candidate —the adverse selection problem. Devising such contracts

is not an easy task, as a candidate’s ability is usually better known to the candidate than to

the organization, especially when this refers to "what makes the employee tick". Moreover,

the incentives offered for the low ability candidates to select out must not be so tempting

that high ability candidates also choose to take them.

In this paper, we study this fundamental problem. We derive the optimal contracts that

firms should offer workers in a setting in which (i) the workers’ability is unobservable to the

firm (there is adverse selection), (ii) the worker’s input (effort) is not observable, but it is

correlated with the project’s success (there is moral hazard), and (iii) if and only if a worker

selects out of working on the project, the firm can exercise an outside option —for example,

it can offer the contract to a new candidate.7

Our model considers a firm that has one project and exactly one worker can work on that

project. The pool of available workers consists of both low ability and high ability types. The

high ability types have a lower cost of supplying effort. This matters for the firm because the

worker’s effort increases the payoff from the project, and yet, the amount of effort supplied

cannot be observed by the firm. The firm is matched with exactly one candidate from the

pool of available workers. Based on the contract he is offered, the candidate either works on

the project or selects out. If the candidate stays to work on the project, the outcome of the

project is revealed at the end of the period. If the candidate selects out, the firm exercises

an outside option. The firm can offer contracts that specify a monetary reward based on

the observable project outcome, a probability with which the worker stays to work on the

5In one survey of 6000 hiring professionals worldwide, more than half mentioned that they have beenaffected by the costs of a bad hire (“More Than Half of Companies in the Top Ten World Economies HaveBeen Affected By a Bad Hire,”CareerBuilder, 2013)

6For example, union-protected jobs or public servants.7This outside option can be seen as the baseline productivity achieved without hiring a new employee.

3

Page 4: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

project, and any fixed payment for the worker who does not stay on the project.

We characterize the optimal contract for the firm in this environment. In designing this

contract, the firm faces three challenges. First, in order to respond the moral hazard problem,

the firm would like to offer high powered incentives, that induce the most effort from the

worker. This implies offering higher monetary rewards for better outcomes. Second, in order

to respond to adverse selection, the firm would like for the low ability type to select out of

the project. This would require offering a contract that retains the low ability worker less

often. The relative value to the firm of solving the moral hazard problem —incentivizing

effort —and solving the adverse selection problem —retaining the high ability type —dictates

the balance between monetary rewards and retention probabilities in the optimal contract.

There is, however, a third challenge for the firm, namely that it must offer contracts that lead

each type of worker to select the contract designed for his type —the incentive compatibility

problem. The interaction of this third challenge with the previous two determines whether

payments to select out are optimal.

The main result of the paper identifies the two conditions which, if simultaneously

satisfied, make it optimal for the employment contract to offer the worker a payment when

selecting out of the project. The first condition is that there must be misalignment between

the relative valuation by workers of monetary rewards versus retention and the relative

valuation by the firm of these two instruments. Such misalignment emerges if the high

ability worker values a higher reward more than a higher probability of retention while the

firm values relatively more offering the higher probability of retention to the high ability

type, or vice-versa, if the high ability type values the higher retention probability more when

the firm values more offering the higher reward to the high ability type. The second condition

for the main result to emerge is that the high ability worker is suffi ciently reactive to the

high-powered incentives, so the principal would like to offer him a suffi ciently high reward

for effort.

The result emerges from the interplay of these two conditions. Under the second con-

dition, the principal would like to provide a suffi ciently high monetary reward to the high

4

Page 5: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

ability worker, in order to respond to the moral hazard issue, and she would like to also

provide this worker with a higher probability of retention, in order to respond to the adverse

selection issue. Yet, such a contract would not be incentive compatible for the low ability

worker. To address this issue, the principal could try to modify the contract offered to the

high ability worker by decreasing either the reward or the probability of retention. The

misalignment between the relative values placed on these two instruments by the workers

versus the firm prevents this adjusted contract from being both incentive compatible for

the workers and profitable to the firm. The only way to obtain incentive compatibility and

profitability is by offering the low ability type a fixed payment in case of no retention. Since

the low ability worker has a higher marginal cost of effort than the high ability worker, a

fixed payment is relatively more valuable for the low ability type than for the high ability

type.

Theoretically, our contribution to analyzing this multidimensional mechanism design

problem is to derive the relevant condition for the misalignment between firms and work-

ers, and by doing so, to provide a proper analogue of the single-crossing condition in a

single-dimensional problem. We show that the misalignment condition corresponds to the

log-submodularity/ supermodularity of the net total gain from retaining the worker —the net

gain from retention for the firm and the worker together —and the log-submodularity/ super-

modularity of the worker’s value from the contract. We show that comparing the benefit to

the principal from a change in the reward versus a change in the probability of retention boils

down to determining whether the net total gain from retaining the worker is log-submodular/

supermodular. A similar comparison links the relative benefit to the worker from a change

in these two instruments to whether his value is log-submodular/supermodular When the

two values are both log-submodular or both log-supermodular, there is alignment between

the relative benefit to the firm and to the high ability worker. Conversely, misalignment

emerges when one of the values is log-supermodular while the other is log-submodular.

Both moral hazard and adverse selection are essential to obtaining this key result. With

only moral hazard, a fixed payment would never be used, since it does not incentivize effort.

5

Page 6: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

With only adverse selection, no production is the least informative outcome, hence making

the use of a fixed payment ineffi cient. Therefore, studying an environment with both moral

hazard and adverse selection is essential for understanding the optimality of paying workers

to quit.

We provide a full characterization of when each contract form emerges also when either

one of the two conditions for the main result are not satisfied. We determine when the same

contract is offered to both types, when only the high ability type is retained, and when

differentiated contracts are offered that give both workers positive probabilities of retention

and no fixed payment.

The results of the model have wide policy implications for contracts both in the private

sector and in public organizations. The results imply that contracts with payments to quit

emerge in organizations for which organization-specific fit is of key importance, and in which

there is a misalignment between the value placed by the organization on this fit and the

value placed by employees on fit relative to compensation. This misalignment is suggested

in the motivating examples of Amazon and Zappos. An equivalent representation of the

payment to quit programs are early tracking programs, which are more prevalent in fields

like law firms or public bureaucracies. We discuss the public policy implication of this

result for contracts in both the private and the public sector. If misalignment is present, in

that retention of the "right fits" is valued differently by the organization and the employee,

early tracking programs may improve outcomes, by selecting out low ability types without

being too tempting for the high ability types. We highlight, however, the importance of

differentiating between organization specific fit and genuine productivity by providing an

extension to the model that adds type-dependent outside values for the workers, to reflect

genuine productivity differences. In such cases, offering a payment to select out would, in

some circumstances, backfire and result in the high ability types selecting out.

6

Page 7: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

Related Literature Agency theory has extensively explored the optimal design of con-

tracts with either moral hazard or adverse selection;8 however, the case in which both moral

hazard and adverse selection are present is less well understood. Chade and Swinkels (2016)

analyze the general principal-agent problem with both moral hazard and adverse selection,

and provide a set of suffi cient conditions under which we can decouple the problem into

adverse selection and moral hazard. There are two main differences between their paper

and ours. First, our main focus is on the case when the principal has an outside option.

Second, our principal targets both effort level —as in their model —as well as the retention

probability for each type, which adds another dimension to the mechanism design problem.

Gottlieb and Moreira (2014) also analyze a principal-agent problem in which two types of

agents who have private information about their type and about the distribution of outcomes

conditional on whether effort was exerted. In this general framework, Gottlieb and Moreira

(2014) show that with risk neutral types and limited liability for the agents, all agents are

offered a single contract. We consider this same setting, but add another component to the

principal’s problem: the existence of an outside option if the current worker is not retained.

With this outside option, the optimal contract can in fact be differentiated for each type and

feature a payment when the agent is not retained. This happens because the firm’s ability

to remove a bad hire from the project in order to free up its outside option may be suffi cient

to compensate for any losses from not retaining the current candidate.

The focus of our paper is on optimal contracts in environments in which it is valuable to

select out low ability types rather than employ them and learn about their type —the firm

can be matched with a different candidate if the current one is not retained. This relates

our work to the literature on screening in the presence of both moral hazard and adverse

selection (Chiu and Karni, 1998; De Meza and Webb, 2001; Jullien et al., 2007). While this

literature considers a similar framework, we add to it the outside option for the firm if it

does not retain the current candidate.

There is a rich and growing literature on dynamic contracts with both moral hazard

8Starting with the seminal papers of Mirrlees (1975), Hölmstrom (1979), Myerson (1981), Grossman andHart (1983), and Maskin and Riley (1984).

7

Page 8: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

and adverse selection (Strulovici, 2011; Garrett and Pavan, 2012; Halac et al., 2016); while

the focus in this literature is on the evolution of the relationship between the firm and the

employee once hiring happens, our paper focuses instead on the problem of selecting out

a bad type employee ex ante, when the cost of a ‘bad hire’ is high. To focus on ex ante

selection before production, we analyze the environment with a one-period production.

Finally, the type of applied problem analyzed in this paper is closely related to issues in

regulation and procurement. The seminal work of Laffont and Tirole (1986, 1993) examines

the problem of simultaneous adverse selection and moral hazard in the context of regulation.

Their model considers a setting in which adverse selection and moral hazard can be easily

decoupled, leading to the optimality of linear contracts. In our setting, the relationship

between types, effort and output is not deterministic, leading to a problem in which the

moral hazard component cannot be easily decoupled from adverse selection.

The rest of the paper is organized as follows. Section 2 describes the environment and

the structure of the game. Section 3 establishes the benchmark contract chosen by a so-

cial planner. Section 4 describes a fundamental property used in the analysis, namely log-

submodularity/supermodularity. Section 5 studies the optimal contract for a firm with an

exogenous outside option, relating the results to those of Section 3. Section B presents the

results in a dynamic model, endogenizing the firm’s outside option. Section 6 presents several

extensions that show the robustness of the optimal contract to introducing outside options

for agents, multiple agent types, and multiple project outcomes. Section 8 concludes, and

the Appendix contains the proofs and further extensions of the model.

2 The Model

We consider an environment with two players: a principal and an agent. The agent has two

possible types, θ ∈ {H,L}, with µθ representing the prior that the agent is type θ. The

agent is hired by the principal to work on a project, the outcome of which depends on the

8

Page 9: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

agent’s private effort. The type θ agent has a cost of effort c (θ, e) for e ∈ R+, which satisfies

ce, cee, ceee ≥ 0, cθ ≥ 0, (1)

where the subscripts denote derivatives with respect to the stated parameter. Moreover, the

H type has lower marginal cost of effort than the L type:

ceθ ≤ 0 for each e ∈ R+.

The set of possible outcomes is denoted by Y = {0, 1}, where y = 1 denotes a project

success and y = 0 denotes a failure. The probability of a success is given by q(eθ), where

q(eθ) = q0 + q1eθ, with q0, q1 ∈ R.9

By the revelation principle, we can focus on the contract in which, upon arrival, the

agent truthfully declares his type. Depending on the declared type θ, the principal offers the

contract Cθ. In particular, Cθ specifies a probability of retention pθ, a wage wθ (y) contingent

of the realized output y ∈ Y , if the agent is retained on the project, and a payment wθ (∅) in

case of no retention. If the agent is not retained, he is removed right away and paid wθ (∅),

so he does not exert any effort on the project. For notational convenience, we also define

wθ ≡ wθ(y = 1)− wθ(y = 0),

the relative reward received by a retained agent θ in case of a success rather than a failure,

and

vθ ≡ pθ · wθ(y = 0) + (1− pθ) · wθ (∅) ,

the expected base payment if nothing is produced– either because the project fails or because

the agent is not retained.

9Note that, as long as q is concave, we can always re-measure the effort such that the linearity is obtainedfor convex cost c.

9

Page 10: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

We assume that the agent’s type θ is not observable (adverse selection), and effort eθ

is not also observable (moral hazard). We also assume limited liability: wθ (y) ≥ 0 and

wθ (∅) ≥ 0 ∀θ, y.10

Timing of Actions. The game can be summarized in the following timing of actions:

1. An agent arrives and declares his type θ. On the equilibrium path, he reports his true

type: θ = θ.

2a. With probability pθ, the principal retains the agent for the project. After retention,

the agent chooses private effort eθ and is paid wθ (y) given the observable outcome y.

This is the end of the game.

2b. With probability 1 − pθ, the principal does not retain the agent. She pays wθ (∅) to

the agent. The principal obtains an exogenous outside option with value W .

Payoffs. Given the risk neutrality, the total expected payoff for agent θ from contract

Cθ is given by

pθ · V (θ, wθ) + vθ,

where

V (θ, wθ) ≡ q(e (θ, wθ)) · wθ − c (θ, e (θ, wθ))

denotes the payoff given the optimal effort with reward wθ. The equilibrium effort e (θ, wθ)

for type θ is determined by

e (θ, wθ) = arg maxeq(e) · wθ − c (θ, e) . (2)

We assume that the H type brings strictly higher profit to the principal than the L type

given the same wage:

10Otherwise, “selling a project”with a price accepted only by the high type will be optimal.

10

Page 11: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

Assumption 1 We assume the following condition on c (θ, e) which implies e (H,w) >

e (L,w): lime→0 ce (L, e) = 0, lime→ 1−q0

q1

ce (H, e) ≥ q1,11 c (θ, e) is continuously differentiable

with e ∈[0, 1−q0

q1

], and ce (H, e)− ce (L, e) < 0 for each e > 0.

Upon retaining an agent of type θ, the principal obtains expected payoff

π (θ, wθ) ≡ q (e (θ, wθ)) (1− wθ) .

Denoting by J the equilibrium value of the principal, her expected payoff is given by

J =∑θ

µθ [pθ · π (θ, wθ)− vθ + (1− pθ) ·W ] . (3)

We present the problem and results by assuming for now that W is exogenous, W ∈(0,WMAX

), where WMAX = maxwH π(H,wH).12 In Section B, we present results endo-

genizing W .

3 Welfare Benchmark

To analyze this problem, it proves useful to begin by presenting the type of contracts that

would maximize social welfare. The social welfare upon an agent’s retention is given by:

S (θ, wθ) = q (e (θ, wθ))− c (θ, e (θ, wθ)) . (4)

Consider the problem for a social planner who offers employment contracts (pθ, wθ, vθ) in

order to maximize

maxEθ [pθ · S (θ, wθ) + (1− pθ) ·W ] ,

11Since it is suboptimal for the principal to offer wθ ≥ 1, this condition implies that e (H,wH) < 1.12If W ≥WMAX , not hiring any candidate is trivially optimal.

11

Page 12: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

subject to (2) and the incentive compatibility constraints for the type H agent and the type

L agent, respectively:

pHV (H,wH) + vH ≥ pLV (H,wL) + vL; (5)

pLV (L,wL) + vL ≥ pHV (L,wH) + vH . (6)

If the social planner retains the agent, she obtains S (θ, wθ) , while if she does not retain the

agent, she obtains the exogenous continuation value W.

Proposition 1 The social planner’s solution can take one of two forms:

1. (Retention of type H only) If W > S (L, 1), then wH = 1, pH = 1, vH = 0, and pL = 0,

wL = 0, vL = V (L, 1).

2. (Retention of both agents) If W ≤ S (L, 1), then wθ = 1, pθ = 1, vθ = 0 for θ = H,L

Proof. In Appendix A.2.

If the social planner retains an agent, we obtain the well-known results that she “sells

the project to the worker” in order to generate the maximum effort. Since the exogenous

continuation payoff satisfies W < S (H, 1), the social planner always benefits from keeping

the type H agent; however, keeping the type L agent is optimal only if the continuation

value W is smaller than the social value of retaining the type L agent, S (L, 1).

4 A Useful Condition

To analyze this problem, we make use of the following property.13

Definition 1 A function F (θ, w) is log-supermodular (log-submodular) if

F (θ, w)F (θ′, w′)− F (θ′, w)F (θ, w′) ≥ (≤)0

if and only if (θ − θ′) (w − w′) ≥ (≤)0. (7)

13See Athey (2002) for other economic applications of log-supermodularity/log-submodularity.

12

Page 13: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

Log-supermodularity implies that an increase in w is relatively more valuable (in per-

centage terms, (F (θ, w + ∆)− F (θ, w)) /F (θ, w)) when θ is higher. Applied to V (θ, w),

log-supermodularity means that, all else constant, an increase in the reward w leads to a

relatively higher increase in the payoff of the high ability worker. If S (θ, w) is also log-

supermodular, this implies that an increase in w increases the social welfare upon retention

relatively more when the agent is of type H.

ln order to streamline the analysis and highlight the intuition behind our main result,

we present the results for the case in which V (θ, w) is log-submodular. In Appendix F, we

provide the complementary analysis for the case in which V (θ, w) is log-supermodular.

Remark 1 There exist cost functions c (θ, e) such that V (θ, w) is log-submodular.

An example of a cost function that leads to V (θ, w) log-submodular is presented in

Section 5.3 below.

Assumption 2 The cost function c (θ, e) has a form that leads to V (θ, w) log-submodular.

Under this assumption, we proceed to analyze the optimal employment contract.

5 The Optimal Contract

We now analyze the optimal contract offered by the principal in order to maximize her

expected profit. The principal offers these contracts under the constraint that the ex ante

value for each type θ from contract Cθ is no less than Cθ for each θ 6= θ. This problem is

therefore characterized by the following program:

J (W ) = max{pH ,pL,wH ,wL,vH ,vL}

µH [pH (S (H,wH)− V (H,wH)) + (1− pH)W − vH ]

+µL [pL (S (L,wL)− V (L,wL)) + (1− pL)W − vL] , (8)

13

Page 14: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

subject to14

pHV (H,wH) + vH ≥ pLV (H,wL) + vL; (ICH)

pLV (L,wL) + vL ≥ pHV (L,wH) + vH . (ICL)

The principal maximizes her payoff J (W ) subject to the two incentive compatibility con-

straints. Constraint (ICH) ensures that the type H agent prefers the contract (pH , wH , vH)

to the contract designed for the type L agent. Similarly, constraint (ICL) ensures that the

type L agent prefers the contract (pL, wL, vL) to the contract designed for the type H agent,

(pH , wH , vH) .

5.1 Preliminaries

We begin by deriving several properties of objective (8), that allow us to characterize the

optimal contract. We analyze the non-trivial case in which the principal hires at least one

of the agents with positive probability —this is ensured by assuming that W < S(H, 1).15

We start by showing that the type H agent is offered no fixed base payment, and at least

one type of agent is always retained for the project.

Lemma 1 The optimal contract has the property that vH = 0, and either pL = 1 or pH = 1.

Proof. In Appendix A.3.

The principal only rewards the H type in case of the success of the project. Intuitively,

increasing the reward increases the effort provided by the agent. Since the agent is risk-

neutral, the principal can incentivize the highest effort from the type H agent by loading

the entire reward on the positive outcome. Moreover, this change makes type L less willing

to pretend to be of type H since he has to pay a higher effort cost for a good outcome. Also,

at least one of the agents is always retained. Otherwise, increasing pθ and vθ proportionally

would proportionally increase the principal’s payoff relative to the outside option W .

14The definition of V (θ, w) and S(θ, w) implies (2).15The trivial case is pL = pH = vL = vH = 0.

14

Page 15: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

Next, we establish that the incentive compatibility constraint for the type H agent must

hold with equality whenever vL > 0.

Lemma 2 In the optimal contract, if vL > 0, then constraint ICH binds.

Proof. In Appendix A.4.

The fixed payment vL > 0 amounts to a cost for the principal without any benefit in

terms of output. The principal can then improve her expected payoff by slightly reducing

vL and slightly increasing wL, keeping pL (L,wL) + vL fixed. She can implement this change

as long as the type H still prefers the contract designed for the type H, i.e., up until ICH

binds.

To analyze the optimal contract, we must also establish when constraint ICL binds.

When the incentive compatibility constraint for type L binds, the contract offered to type

H is adjusted by the principal to make it just weakly less desirable to type L than the

contract designed specifically for his type. Conversely, when constraint ICL does not bind,

the contract offered to type H must not be adjusted in order to make it less desirable to the

type L agent.

In the case in which the incentive compatibility constrain does not bind for the low type,

we can first establish the following result.

Lemma 3 Constraint (ICL) does not hold with equality only if wPBH < wPBL , where

wPBθ = arg maxwθ≥0

π (θ, wθ) . (9)

Proof. In Appendix A.10.

If wPBH ≥ wPBL , then the principal would like to both offer a higher wage to type H to

address the moral hazard problem —so wH > wL—, and to also offer a higher probability

of retention of this type, in order to address the adverse selection problem —so pH > pL.

In this case, a contract (pH , wH , vH = 0) would be preferable to both types to a contract

(pL, wL, vL = 0). Therefore, in order to satisfy constraint ICL, the principal must offer a

positive fixed payment vL up to the point at which constraint ICL holds with equality.

15

Page 16: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

Figure 1: Algorithm for solving for the optimal contract

The above preliminary results set up a road map for how to derive the optimal contract

in this environment. We illustrate this road map in Figure 1.

The problem can be divided into three different steps, each capturing a key economic

force. The first issue is making the contract incentive compatible for type L. Lemma 3

implies that problem can be divided into two main cases: (1) the case in which constraint

ICL holds with equality, so that there is a threat of type L imitating type H, and (2) the case

in which constraint ICL does not bind, in which case the principal’s available tools for solving

the moral hazard and adverse selection problem do not raise incentive compatibility concerns

for type L. The second issue is the severity of adverse selection: if type L cannot deliver

an expected benefit higher than the outside option, then type L should never be retained.

Finally, the third issue is the relative value to the principal of higher effort – solving

the moral hazard issue – relative to retention of type H – solving the adverse selection

16

Page 17: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

issue. The principal’s relative valuation of these two instruments, and the relationship to

the agent’s relative valuation, determines the optimality of a fixed payment.

The road map described above allows an interested reader to follow the stepwise program

to solving for the optimal contract. Since the goal of the paper is to focus on the interesting

case in which a fixed payment may be used in the optimal contract, we start by examining

the case relevant to this question. We begin by studying the problem when constraint ICL

binds (Section 5.2), present the main result of the paper, and then also analyze the other

case, in which constraint ICL does not bind (5.4).

5.2 With Threat of Type L Imitating Type H

We being by considering the case in which constraint ICL holds with equality. The condition

under which this occurs will be given explicitly in Proposition 6 in Section 5.4 below. In this

case, the principal faces the threat of type L imitating type H, and she must therefore adjust

the menu of contracts to dissuade type L from choosing the contract designed for type H.

First, if the low type is very unproductive, then the principal does not retain the low

type at all:

Proposition 2 (Retention of type H only) Suppose constraint ICL holds with equality. If

W ≥ S (L, 1), then only the type H agent is retained in the optimal contract: vL > 0, pL = 0,

pH = 1. Conversely, if W < S (L, 1), then we have pL > 0.

Proof. In Appendix A.5.

If W ≥ S (L, 1), then, as in the social planner’s problem, the type L agent is never

retained. He is instead paid a fixed base payment that makes his contract as appealing as

taking the contract designed for typeH. By not retaining the type L agent, the principal loses

at most S (L, 1) and she gains W from the outside option. Conversely, if W < S (L, 1), then

suppose we start from pL = 0. Consider offering the L type a small probability of retention

pL = ε > 0 and a reward wL = 1, while at the same time reducing vL by εV (L, 1) = εS(L, 1),

so as to keep the type L indifferent. When retained, the type L agent creates the social

17

Page 18: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

surplus S (L, 1), and the principal captures ε (S(L, 1)−W ). Since S (L, 1) > W , this is a

gainful deviation for the principal.

The significant departure from the social planner’s solution comes when the condition

from the above proposition is not satisfied. Suppose the principal would like to implement

wH > wL to solve the moral hazard problem since the high type is more reactive to the

high-powered incentive (that is, wPBH > wPBL ). She also would like to implement pH > pL to

solve the adverse selection problem.16 The following lemma shows that, even with the fixed

payment, she cannot achieve these two goals simultaneously:

Lemma 4 It is the case that “wH ≥ wL and pH ≤ pL”or “wH ≤ wL and pH ≥ pL.”

Proof. In Appendix A.6.

The next proposition characterizes when the principal puts more weight on solving the

moral hazard (wH ≥ wL) and when she puts more weight on solving the adverse selection

(pH ≥ pL). It also shows when the fixed payment is needed, which stands as the paper’s

main result:

Proposition 3 When W < S (L, 1) and constraint ICL binds, the optimal contract has the

following properties:

1. (Principal-Agent Alignment) If S(θ, w)−W is log-submodular, then vL = 0, wL ≥

wH , and 1 = pH ≥ pL > 0.

2. (Principal-Agent Misalignment) If S(θ, w)−W is log-supermodular, then vL ≥ 0,

wH ≥ wL, and 1 = pL ≥ pH > 0. Moreover, vL > 0 if and only if wH 6= wL.

Proof. In Appendix A.7.

The results presented in Proposition 3 emerge from two forces, log-supermodularity (or

log-submodularity) of S(θ, w) − W and the log-submodularity of V (θ, w). First, whether

16If she would like to implement wH < wL to solve the moral hazard problem, then the principal’s actionsto address the moral hazard problem and her actions to address adverse selection both go towards wH < wLand pH > pL, which is covered in Proposition 3.

18

Page 19: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

S (θ, w) − W is log-submodular (or log-supermodular) determines whether wL ≥ wH (or

wH ≥ wL). The principal has two possible incentive compatible deviations that she could

undertake: offer all agents the contract designed for the type H, or offer all agents the

contract designed for the type L. If the principal offers all agents the contract designed for

the type H agent, then the principal gains over the original set of contracts if under the new

contract she obtains a higher payoff from the type L agent. Given (8), such a deviation is

unprofitable if

pL (S (L,wL)− V (L,wL)) + (1− pL)W − vL

≥ pH (S (L,wH)− V (L,wH)) + (1− pH)W.

Using (ICL), this condition implies

pL (S (L,wL)−W ) ≥ pH (S (L,wH)−W ) . (10)

Intuitively, the principal’s payoff is the social welfare minus the rent given to the agent. The

agent’s incentive compatibility implies that the principal pays more rent to the agent in the

contract designed for his own type than for the other type. Hence, if it is not profitable for

the principal to offer the uniform contract, the former contract has to generate higher social

welfare than the latter contract.

Similarly, if the principal offers both types the contract designed for the type L agent,

then this deviation is not gainful only if

pL (S (H,wL)−W ) ≤ pH (S (H,wH)−W ) . (11)

Combining (10) and (11) delivers the result that the log-submodularity of S (θ, w)−W

is equivalent to wL ≥ wH , and the log-supermodularity of S (θ, w) − W is equivalent to

wH ≥ wL. Intuitively, log-submodularity of S (θ, w) − W means that an increase in w

granted to the agent of type L is relatively more valuable for the excess social welfare than

19

Page 20: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

an increase in w granted to the agent of type H since we can increase pH . That is, the log-

submodularity implies that the principal puts more weight on solving the adverse selection

problem by offering wH ≤ wL and pH ≥ pL. Symmetrically, the log-supermodularity implies

that the principal puts more weight on solving the moral hazard problem by offering wH ≥ wL

and pH ≤ pL.

While the log-submodularity (or log-supermodularity) of S (θ, w)−W determines wL ≥

wH (or wH ≥ wL), the log-submodularity (or log-supermodularity) of V (θ, w) determines

whether a fixed base payment will be used in the optimal contract. Consider the case in

which S (θ, w)−W is log-submodular, so that wL ≥ wH . Acting towards solving the adverse

selection problem would lead the principal to choose pH ≥ pL. This implies that an agent of

type H can be compensated for the reduced reward wH by being offered a higher probability

of retention pH . If V (θ, w) is log-submodular, then the relative value of an increase in the

probability of retention p (to compensate for a decrease in the reward w) is relatively more

valuable to the agent of type H. Thus, in this case, the principal’s actions towards solving

adverse selection also act towards solving the incentive compatibility issues. Therefore, no

fixed payment vL ≥ 0 is necessary. In the case in which S (θ, w)−W is log-supermodular, we

have wH ≥ wL. Now, the principal puts more weight on the moral hazard problem, and her

actions towards solving it lead to the type H agent receiving more compensation. This would

violate incentive compatibility without fixed payment: since V (θ, w) is log-submodular, the

decrease in pH (and associated increase in wH) would be relatively more beneficial for the

type L agent than to the type H agent, which means that the type L would like to mimic

the type H without a fixed payment. Hence, the fixed payment vL ≥ 0 is necessary in order

to ensure incentive compatibility.

To summarize, if both S (θ, w)−W and V (θ, w) are log-submodular, then the incentives

of the principal and those of the agent’s are aligned; the principal values solving the adverse

selection problem by increasing pH relatively more than solving the moral hazard problem,

and the agent of type H values retention– the action that solves adverse selection– over

reward– the action that the solves moral hazard problem– relatively more compared to

20

Page 21: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

the valuation of the agent of type L over these two components. When S (θ, w) − W is

log-supermodular, the principal values solving the moral hazard problem by increasing wH

(assuming wPBH ≥ wPBL ) relatively more than solving the adverse selection problem. With

V (θ, w) log-submodular, this means that the principal’s incentives are misaligned with the

agent’s, requiring a fixed base payment vL ≥ 0 in order to “correct” for this misalignment

and ensure incentive compatibility of the optimal contract.

Characterization of the Optimal Contract in the Alignment Case

We now provide a more detailed characterization of the optimal contract. First, consider

the case in which both V (θ, w) and S(θ, w)−W are log-submodular. Given Lemma 1 and

Proposition 3, it follows that pH = 1 and vH = vL = 0. Substituting constraint (ICL) into

(8) we obtain

maxwH ,wL,wL≥wH≥0

µH (π (H,wH)−W ) + µLV (L,wH)

V (L,wL)(π (L,wL)−W ) . (12)

Proposition 4 (Alignment case) If both V (θ, w) and S(θ, w)−W are log-submodular, then

the optimal contract solves (12).

In Section 5.3, we provide an example in which the optimal contract features wH < wL

and pH > pL. This result is markedly different from the results obtained by Gottlieb and

Moreira (2014), who study a similar problem, but without an outside option for the principal.

In that environment, all agents are retained and offered the same contract. This highlights

the difference brought about by the existence of the outside option.17 The fact that the

principal has an outside option creates a non-trivial cost of adverse selection– retaining the

agent implies giving up the outside optionW . Hence the principal can benefit from lowering

pL.

17Although Gottlieb and Moreira (2014) only consider the deterministic contract, we can show that al-lowing stochastic replacement without outside options does not change the result that all agents are alwaysretained and offered the same contract. Details are available upon request.

21

Page 22: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

Characterization of the Optimal Contract in the Misalignment Case

Next, consider the case in which S(θ, w)−W is log-supermodular. Since the preferences of the

principal and agent are misaligned, both (ICH) and (ICL) hold with equality. Substituting

constraints (ICH) and (ICL) into (8) we obtain

maxwH ,wL,wH≥wL≥0

µHV (H,wL)− V (L,wL)

V (H,wH)− V (L,wH)(π (H,wH)−W )

+ µL (π (L,wL)−W − V (L,wH) + V (L,wL)) . (13)

Proposition 5 (Misalignment case) If V (θ, w) is log-submodular and S(θ, w) −W is log-

supermodular, then the optimal contract solves (13). Moreover, we have vL > 0 if and only

if wH > wL.

See Section 5.3 for the example with wH > wL and vL > 0. Again, the comparison with

Gottlieb and Moreira (2014) highlights the difference brought about by the existence of the

outside option. Lowering pH is not so costly if the outside option is suffi ciently attractive,

since the principal can at least obtain the outside option. Without outside option, lowering

pH would be too costly, and the optimal contract would be the same for both types.

5.3 Numerical Illustration

In what follows, we provide numerical examples for both the alignment and the misalignment

cases. We assume that the good outcome takes value y = 10. This re-scaling allows us to

minimize computational errors without qualitatively changing the results.18

18With y = 1, everything is the same by also rescaling c(θ, e)/10.

22

Page 23: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

Alignment Case

Consider the case in which µH = .7 and the functions q (e) and c (θ, e) take the following

functional forms:

q (e) = q1e,

c (H, e) =

(1

1− e − e− 1

)/80,

c (L, e) =

(1

1− e − 1

)/40.

Under these specifications, S (θ, w)−W is globally log-submodular for eachW with S (θ, w) ≥

W and V (θ, w) is log-submodular . Hence Propositions 6, 2, and 4 characterize the optimal

contract.

Figure 2 shows the form of the optimal contract when q1 = .1 andW varies between 10−2

and 4. As q1 andW are very low, both types are ineffi cient, and the outside option is too low

to be worth eliminating any of them. Since both types have similarly low productivity, the

principal does not find it optimal to offer separate contracts. As W increases, the adverse

selection problem increases, and offering different contracts becomes optimal. If W = 0.75,

for q1 = .1, the value of q1 is very low, and so the L type is very ineffi cient. Hence, the

principal would like to set pH > pL, which can be achieved in an incentive compatible

contract if wH < wL. ForW suffi ciently large, not hiring any agent and receiving the outside

option is optimal.

We can also examine the effect of increasing q1. For q1 ≥ .2, the principal offers the same

contract to both types, since now the low type is suffi ciently productive. Finally, we can

provide an example where the optimal contract retains only the type H agent and provides a

payment for the type L agent to select out. Setting µH = .99, q1 = .8, andW = S(L, 1)+0.01

delivers this case. The intuition behind the result is that the probability of encountering a

type L is small and W is greater than S (L, 1), so never retaining this type is almost costless

for the principal.

23

Page 24: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

Figure 2: Optimal Contrant Form as a Function of the Outside Option

Misalignment Case

For the misalignment case, consider the case in which µH = .5, W = 3.41 and the functions

q (e) and c (θ, e) take the following functional forms:19

q (e) = .18 + .3e,

c (H, e) = .5e+ max {e− .5, 0}+ .4 max {e− .8, 0} ,

c (L, e) = e+ 2.99 max {e− .5, 0} .

The cost functions are piecewise linear. It immediately follows that the optimal wages

are in the set{

0, 53, 10

3, 19

3, 29.9

3

}, corresponding to the marginal cost divided by q1 = .3. If we

restrict our attention to those wages, we can show that S (θ, w)−W is log-supermodular and

V (θ, w) is log-submodular. Moreover, π (θ, w) is concave in w.20 In addition, wPBH > wPBL ,

and S (L, 1) > W = 3.41. Hence, Proposition 4 characterizes the optimal contract.

We can show that, if the principal is forced to offer the same wage, then she offers

19One may notice that the example below does not satisfy Condition 1. We can make the cost functioncontinuously differentiable by modifying it slightly around the kink to satisfy Condition 1. We keep linearityto simplify the algebra.20We use ceee ≥ 0 only to show the concavity of π (θ, w) in w; hence, piecewise linearity does not change

the statement of the lemmas or propositions.

24

Page 25: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

wH = wL = 103and the expected profit is 6.333. When the principal offers differentiated

contracts by setting wH = 193and wL = 10

3to induce more effort by type H (solving the

moral hazard problem), we can show that the principal’s payoff is increased. By Proposition

4, we have vL > 0 at the optimal contract. Note that W is close to the profit coming from

the same wage contract. Hence, the principal does not bare a high cost of reducing pH in

order to solve the moral hazard problem.

5.4 With No Threat of Imitation from Type L

If constraint ICL does not bind, then type L strictly prefers the contract designed for him to

the contract designed for typeH. Since the contract for typeH is unattractive for type L, the

principal optimally retains type H at all times, pH = 1, and sets the fixed payment vL to 0.

Since there is no threat of type L pretending to be type H, the only binding constraint is the

one for type H. This constrains how high pL can be set. Hence, pL = V (H,wH) /V (H,wL),

and the optimal rewards (wH , wL) are then determined by

(wH , wL) = arg maxwH≥0,wL≥0

µH · π (H,wH) + µL ·V (H,wH)

V (H,wL)(π (L,wL)−W ) . (14)

Proposition 6 When wPBH < wPBL , constraint ICL holds with strict inequality if and only

if wH < wL. In this case, the optimal contract satisfies vL = 0 and 1 = pH > pL =

V (H, wH) /V (H, wL).

Proof. In Appendix A.11.

Intuitively, when wPBH < wPBL and constraint ICL is not binding., the principal’s actions

to address the moral hazard problem– offering a lower wage to type H than to type L– and

her actions to address adverse selection– offering a higher probability of retention to type

H– go towards wH < wL and pH > pL.

The mechanics showing that constraint ICL is not binding can be described through the

graphic illustration depicted in Figure 3 below. Since vL = vH = 0, we can map each contract

onto the two dimensional space of (p, w). For any arbitrary contract CH , consider the set of

25

Page 26: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

Figure 3: Indifference Curves with Log-Submodular V

contracts C that would make type θ indifferent between CH and C. On the indifference curve

obtained in this way, the marginal rate of substitution between p and w satisfies

V (θ, w) dp+ pVw (θ, w) dw = 0⇔ dp

dw= p

Vw (θ, w)

V (θ, w).

The property of log-submodularity of V implies that dp/dw is decreasing in θ: type H values

higher retention probabilities relative to higher rewards compared to type L. Hence, when

constraint ICH is binding and wH < wL, constraint ICL is also satisfied.

In other words, the log-submodularity of V (θ, w) implies that the higher type values the

higher retention probability more than the higher reward. Hence, the principal’s action of

reducing wH and increasing pH is aligned with this agent’s preference. Therefore, in the

optimal contract, there is no need for a fixed base payment.

6 Robustness and Extensions

6.1 Outside Option for Agents

In the following extension, we consider the case with exogenous outside options for both the

principal and the agents. As before, the firm has outside option W . The agent of type θ

26

Page 27: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

has an outside option V Oθ . We assume that V

OH ≥ V O

L , that is, type H has a higher outside

option. In particular, V OH > V O

L implies that the more productive type in the current match

is genuinely more productive and so obtains higher value in the outside option. On the other

hand, V OH = V O

L implies that the type of the current match is just the match-specific fit.

The principal’s problem is

maxpH ,wH ,pL,wL,vL

µHpH (π (H,wH)−W ) + µL [pL (π (L,wL)−W )− vL]

subject to constraints

pθ[V (θ, wθ)− V O

θ

]+ vθ ≥ max

{pθ[V (θ, wθ)− V

]+ vθ, 0

}. (15)

where constraint (15) denote the incentive compatibility and participation constraint for

type θ ∈ {H,L}.

We make the following assumption:

Assumption 3 The following conditions hold:

π(H,wPBH

)> W > π

(L,wPBL

), (16)

with wPBθ defined as in Lemma 3.

Assumption 3 means that , if the principal knew the agent’s type, then she would hire

type H only, even if type H is more expensive because of the higher outside option V OH .

The results of Lemma 2 are readily generalized for the non-trivial contract:

Lemma 5 The optimal contract vH = 0, and pL > 0 together with vL > 0 imply (ICH) is

binding.

Proof. Analogous to the proof of Lemma 2.

Assumption 3 also allows us to focus on the case that is relevant for the purposes of

this paper, the case in which there is a threat of imitation from type L, i.e. constraint ICL

27

Page 28: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

binds.21 This then implies that, for type L, constraint (15) is equivalent to

pL[V (L,wL)− V O

L

]+ vL = pH

[V (L,wH)− V O

L

]≥ 0. (17)

The analysis follows the same steps as the analysis presented in the main text. We present

these steps in the Appendix (sections C.2 and C.3). We obtain the following main result:

Proposition 7 If V OH = V O

L , then not retaining the type L agent (pL = 0) is optimal if and

only if S (L, 1)− V OL −W ≤ 0.

If V OH > V O

L , then not retaining the type L agent (pL = 0) is optimal only if S (L, 1)−

V OL −W ≤ 0.

Proof. In Appendix C.1.

When the agents have the same outside options, Proposition 7 shows that not retaining

type L is equivalent to the net social social gain from keeping this type being negative. The

intuition is that, if pL were to ever be positive, then the principal could increase her objective

by decreasing pL and increasing vL so as to keep type L indifferent. Moreover, the change

would be incentive compatible for type H, since this type obtains a higher value from the

retention than type L. Similarly, if S (L, 1)− V OL −W > 0 but pL = 0, then since (ICH) is

redundant for suffi ciently small pL, the principal could obtain a higher payoff by increasing

pL with wL = 1 and reducing vL, keeping the rent to type L constant.

When the outside options of the two agents differ, so that V OH > V O

L , the problem for the

principal changes if S (L, 1)−V OL −W ≤ 0. In this case, the principal cannot reduce pL if the

incentive compatibility constraint for type H is binding. Intuitively, since the H type has a

higher outside option, if the L type contract allows the agent to capture the outside option

with a higher probability —due to the lower probability of retention —, then such a contract

21Since the problem satisfies the second order condition, as in Proposition 6, (ICL) being slack is equivalentto not having (ICL) at all. Without (ICL), given (16), the optimal contract would satisfy (pH , wH , vH) =(1, wPBH , 0

)and (pL, wL, vL) = (0, 0, 0), which violates (ICL). This is a contradiction.

In other words, the condition in Proposition 6 holds– (ICL) does not bind– only if π(L,wPBL

)≥ W ,

where wPBL is defined without the agent’s outside option in the context of Proposition 6.

28

Page 29: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

may be more attractive to the H type. On the other hand, if S (L, 1)− V OL −W > 0, then

increasing pL and reducing vL is still a profitable deviation from pL = 0. Therefore, pL = 0

is optimal only if S (L, 1)− V OL −W ≤ 0, but not always. This means that offering a fixed

payment vL > 0 in order to completely remove the type L agent becomes harder.

Proposition 7 highlights the key importance of the nature of the agents’types, specifically

whether the agent type is related to the match-specific fit or to genuine productivity. In our

motivating example of the Zappos policy, it was emphasized that the company wants to

encourage “bad fits” to voluntarily leave. If the types that they face are match-specific

fits and V OH = V O

L , then offering vL > 0 to completely replace the agent is easier. On the

other hand, if the types also reflect differences in outside options, then it is more diffi cult

to implement a contract that pays the type L agents to select out. A famous example is

provided by Apple. When the company introduced a generous retirement plan to incentivize

voluntary early retirement, only those of type H, with a higher outside option, left the

company and moved to other companies.22

6.2 Model with More than Two Outcomes

We also consider the extension to the case in which more than two outcomes can be observed:

|Y | ≥ 3. Let Y ={y1, ..., y|Y |

}3 y be the set of possible outcomes with y1 ≤ y2 ≤ · · · ≤ y|Y |.

Now the rewards after outcomes is expressed by a vector wθ = (wθ (y))y. Similarly, the social

welfare S (θ,wθ) and the agent’s value V (θ,wθ) have wθ as an argument. As before, let vθ

be the payment when the agent is not retained. The complete analysis of this problem is

provided in Appendix E.

In order to provide an analysis comparable to the one for the case with two outcomes, we

derive a condition analogous to the log-submodularity/supermodularity of S (θ, w)−W and

V (θ, w) . This condition is based on the log-submodularity/supermodularity of S|Y | (θ, w)−

W and V|Y | (θ, w):

Assumption 4 The following two conditions are satisfied:22Discussion in Kreps, “Microeconomic Theory”(2003).

29

Page 30: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

1. (Monotone Hazard Rate Condition) The hazard rate qe(yn|e)q(yn|e) is increasing in n.

2. (Monotone Likelihood Ratio Condition) For each e and e′ with e > e′, q(yn|e)q(yn|e′) is

increasing in n.

The first part of the condition corresponds to the monotone hazard rate condition. With-

out adverse selection, this condition alone would be suffi cient to show that, in the optimal

contract, the agent is rewarded only after the highest outcome y|Y |. The second part of

the above condition implies that rewarding the agent after outcome y|Y | decreases type L’s

incentive to mimic type H. This implies that the H type is compensated only after the

highest outcome: wH = (wH , 0, . . . , 0).

Nevertheless, it is not always the case that the L type is compensated only after the

highest outcome. To see why, the monotone hazard rate condition implies that compensat-

ing the L type only after the highest outcome incentivizes more effort– it helps solve the

moral hazard problem; however, by the symmetric argument, the monotone likelihood ratio

condition implies that compensating the L type only after the highest outcome increases the

H type’s incentive to mimic the L type. In other words, the compensation after intermediate

outcomes helps solve the adverse selection problem.

Note that the argument that the H type is compensated only after the highest outcome

is analogous to Lemma 1 —the H type is compensated only after y = 1; and the argument

that the L type may be compensated after the intermediate outcomes is analogous to vL > 0

in the binary case —vL > 0 is costly for the principal, but reducing vL and increasing wL

may violate type H’s incentive compatibility.

In order to derive more specific conditions for when vL > 0, another step is required in

the analysis. Given wL, let w (L,wL) be the reward after y|Y | such that type L is indifferent

between w (L,w) = (w (L,w) , 0, 0, ..., 0) and wL. As mentioned, this increases the effort of

the L type. Let ∆S (L,w) be the increase of the social welfare when the agent is of type L:

∆S (L,w) = S (L,w (L,w))− S (L,w) ≥ 0.

30

Page 31: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

At the same time, the H type’s value from CL also increases:

∆V (H,w) = V (H,w (L,w))− V (H,w) ≥ 0.

If the benefit of increasing S (L,w) dominates the cost of increasing ∆V (H,w), such that

∆S (L,w) ≥ ∆V (H,w) for each w, (18)

then we can show that vL ≥ 0 implies the log-supermodularity of S (θ,w).

Proposition 8 Suppose Assumptions 4 and (18) are satisfied. In addition, suppose V|Y | (θ, w)

is log-submodular. Then vL ≥ 0 only if S|Y | (θ, w)−W is log-supermodular.

Proof. In Section D.3.2.

6.3 Optimal Contract with Endogenous Outside Option and Mul-

tiple Agent Types

In Appendix B, we presented the characterization of the optimal employment contract in

which we endogenize W by assuming that, if the current agent is not retained, the principal

then pays a small cost and draws another agent.23 This game repeats itself until an agent

is retained. We show that the main results of the model extend to the case with endogenous

W.

We also extend the model to consider the case in which the agent can have more than

two types. To avoid a complication arising from non-binding incentive compatibility of

lower types, we assume endogenous outside options with a small cost of finding a new agent,

as in the extension presented in Appendix B. Specifically, we assume that the agent can

have types θ1, ..., θn with θ1 < θ2 < · · · < θn−1 < θn and V (θ, w) is well defined for

23We formulate the problem such that a new agent arrives right away, and the principal incurs a directcost κ > 0. We can also formalize κ by discounting and search friction. Details are available upon request.

31

Page 32: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

θ ∈ Θ =[θ, θ]⊃ {θ1, ..., θn}. Contract Ci is designed for type θi. Let ICi,j be the constraint

that type i does not want to pretend to be type j:

piV (θi, wi) + vi ≥ pjV (θi, wj) + vj.

In Appendix E, we provide the complete analysis of this case. We start by deriving a set

of general results analogous to those of Section 5.1. They allow to show that if both V (θ, w)

and S (θ, w)−W are log-submodular, then the main result immediately extends to the case

with multiple agents.

When S (θ, w) − W are log-supermodular, while V (θ, w) is log-submodular, the same

logic as in the case with only two agent types leads to the conclusion that some fixed base

payments will be useful in the optimal contract. The diffi culty, however, is in determin-

ing whether these base payments, the rewards, and retention probabilities are following a

monotone pattern. The complication is driven by the lack of a single crossing condition for

this three dimensional problem (pθ, vθ, wθ), which creates a non-trivial preference ordering

over contracts by different types. In order to make further progress in the analysis, we need to

assume a single-peakedness condition. Specifically, for any pair of contracts C = (p, w, v) and

C ′ = (p′, w′, v′) with w > w′ and p < p′, we assume that the function [pV (θ, w)− pV (θ, w′)]

is single-peaked. Under this condition, we can show that for suffi ciently small κ > 0, if

S (θ, w) − W is log-supermodular, there exists θ ∈ {θ1, ..., θn−1} such that the principal

never retains the types for θi < θ for sure, and offers all types θi ≥ θ the same contract:

0 = vn = vn−1 = · · · = vi, pn = pn−1 = · · · = pi, and wn = wn−1 = · · · = wi.

The above result shows that, when the model is extended to multiple agent types, the

optimal contract no longer offers a positive fixed payment to lower agent types, unless that

type is never retained. As shown in the analysis with only two types, when S (θ, w)−W is

log-supermodular, the excess social value created by the contract increases relatively more

when the higher agent types are offered higher rewards. Solving the moral hazard problem

through higher rewards is more beneficial to the principal in this case. Yet, this would

require offering lower retention probabilities to higher types, in order to maintain incentive

32

Page 33: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

compatibility. Under Assumption 7, however, the higher agent type is highly effi cient, which

implies that committing not retaining this type is highly costly for the principal. We can

show that this cost is suffi ciently high to deter the principal for attempting to separate the

types in the optimal contract.

7 Empirical and Policy Implications

The main results of model shed light upon the conditions necessary for Pay to Quit programs

to be desirable in an organization. It identifies the dimension along which the incentives of

firms and workers must be misaligned for Pay to Quit programs to emerge. First, if the

firm-specific fit between employer and employee is of crucial importance to the firm, so much

so that hiring a wrong fit offers a lower social welfare from the relationship than the firm’s

outside option, then paying the low type to completely select them out is optimal. We show

that the optimal contract involves retaining only the high ability worker– “the right fit”. To

achieve this outcome, the employer must offer the low ability worker the option of a payment

to quit, so that he low type prefers selecting out rather than remaining with the company.

This case is illustrated eloquently in the description of the policy implemented by Amazon,

presented in the introduction.

Second, we have the case when the firm values retaining the “right fits” suffi ciently

highly — relatively more than offering high power incentives. If the firm and workers are

aligned on these values, then high types of workers also value retention in the job where

they are "right fits" suffi ciently highly — relatively more than compensation — compared

to the low types. This corresponds to industries in which the fit of the employee in the

organization is relatively more important for the final product than the effort generated by

the compensation scheme. This is more likely to be the case when team work is essential for

the project success —so fit matters —, and the employees are more intrinsically motivated —

so that monetary compensation matters less. For instance, such a description can apply to

emergency intervention teams, fire fighters or other special units. It might also apply in the

33

Page 34: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

case of teachers, a category for which both compensation and firing have been matters of

intense public policy debates. In these cases, Proposition 3 shows that offering a payment

for the low ability type to select out would not be optimal. Differentiated contracts can be

offered without such a payment being made.

Finally, we have the case when the firm values offering high power incentives relatively

more than retaining the “right fit”workers; yet, the workers are misaligned with the firm.

The high ability workers still value retention in the job they are the right fit for relatively

more than compensation, compared to the low ability types. This corresponds to industries

in which generating high effort through monetary compensation is important for the organi-

zations, but retention is relatively more valuable for the employee. Such a description could

apply, for example, to law firms or financial management organizations, where incentivizing

high effort is highly important for the firm, but being retained at a prestigious firm may

be relatively more important for the employee. In these cases, in which there is misalign-

ment between the preferences of the employer and those of the employee, a fixed payment

for the low ability type to select out, or creating a early tracking system, is necessary for

any contract that differentiates between the worker types. This may explain, for example,

the introduction in recent years of early tracking at prestigious corporate law firms, with a

partner track as well as a "permanent associate" track, that offers a higher probability of

retention, but lower compensation.24

As shown in the Appendix F, the alignment and misalignment cases emerge also when

high ability workers value compensation relatively more than retention, and the fixed pay-

ment is optimal when there is misalignment between the preferences of the firm and those

of the workers.

The above discussion also suggests several public policy implications of the model. The

conditions for when Pay to Quit programs are optimal are also informative for when it might

be desirable for such programs to be supported through fiscal policy. Moreover, the model

suggests that some public employment sectors might benefit from Pay to Quit programs or

24The New York Times, “At Well-Paying Law Firms, a Low-Paid Corner,” May 23, 2011(http://www.nytimes.com/2011/05/24/business/24lawyers.html, accessed August 15, 2017)

34

Page 35: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

early tracking —for example, the bureaucracy might fit the misalignment case —, while in

others —for example, public teachers —such programs would not be desirable.

8 Conclusion and Empirical Implications

We presented a model to study a fundamental problem that arises in employment contracts.

We modeled an environment in which firms offer contracts to employees when the employees

have unobservable ability (there is adverse selection), put in unobservable effort (there is

moral hazard), and the firm has only one slot for the job, so that the firm can exercise an

outside option only if the current candidate is not retained. We characterized the optimal

employment contract in this environment and showed that it can take the form of differ-

entiated contracts for different candidate abilities. Moreover, we showed that the optimal

contract features a fixed base payment, independent of retention or project outcome, when-

ever the firm’s and the agents’incentives are misaligned along a fundamental dimension– the

relative benefit from the project reward, used to solve the moral hazard problem, versus the

retention probability, used to solve the adverse problem.

Our model sheds light on the prevalence of contracts that offer payments for workers to

select out, or allow for early tracking of employees within organizations, and it provides a

framework for understanding which conditions are necessary for such contracts to be optimal.

For applied studies, it helps inform what features of organizations and what worker charac-

teristics should be examined in order to account for the value and desirability of contracts

that offer monetary incentives for selecting out.

35

Page 36: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

References

Athey, S., 2002. Monotone comparative statics under uncertainty. The Quarterly Journal of

Economics 117 (1), 187—223.

Chade, H., Swinkels, J., 2016. The no-upward-crossing condition and the moral hazard

problem. Tech. rep., Arizona State University Working Paper.

Chiu, W. H., Karni, E., 1998. Endogenous adverse selection and unemployment insurance.

Journal of Political Economy 106 (4), 806—827.

De Meza, D., Webb, D. C., 2001. Advantageous selection in insurance markets. RAND

Journal of Economics, 249—262.

Garrett, D. F., Pavan, A., 2012. Managerial turnover in a changing world. Journal of Political

Economy 120 (5), 879—925.

Gottlieb, D., Moreira, H., 2014. Simultaneous adverse selection and moral hazard. mimeo.

Grossman, S. J., Hart, O. D., 1983. An analysis of the principal-agent problem. Economet-

rica: Journal of the Econometric Society, 7—45.

Halac, M., Kartik, N., Liu, Q., 2016. Optimal contracts for experimentation. The Review of

Economic Studies 83 (3), 1040—1091.

Hölmstrom, B., 1979. Moral hazard and observability. The Bell journal of economics, 74—91.

Jullien, B., Salanie, B., Salanie, F., 2007. Screening risk-averse agents under moral hazard:

single-crossing and the cara case. Economic Theory 30 (1), 151—169.

Laffont, J.-J., Tirole, J., 1986. Using cost observation to regulate firms. Journal of political

Economy 94 (3, Part 1), 614—641.

Laffont, J.-J., Tirole, J., 1993. A theory of incentives in procurement and regulation. MIT

press.

36

Page 37: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

Maskin, E., Riley, J., 1984. Monopoly with incomplete information. The RAND Journal of

Economics 15 (2), 171—196.

Mirrlees, J., 1975. On moral hazard and the theory of unobservable behavior. Nuffi eld Col-

lege.

Myerson, R. B., 1981. Optimal auction design. Mathematics of operations research 6 (1),

58—73.

Strulovici, B., 2011. Renegotiation-proof contracts with moral hazard and persistent private

information. SSRN Working Paper No. 1755081.

37

Page 38: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

A Proofs of Results from the Main Text

A.1 Concavity/Convexity

In the proofs, the following concavity/convexity results will be useful:

eθw (θ, w) = −q1ceeθ (θ, e (θ, w))

[cee (θ, e (θ, w))]2+ceee (θ, e (θ, w)) cθe (θ, e (θ, w))

[cee (θ, e (θ, w))]3≥ 0 by (1);

eww (θ, w) =−q1e (θ, w) ceee (θ, e (θ, w))

[cee (θ, e (θ, w))]2≤ 0 by (1);

Sw (θ, w) = q1 (1− w) ew (θ, w) ≥ 0;

Swθ (θ, w) = q1 (1− w) eθw (θ, w) ≥ 0;

Sww (θ, w) = −q1 (e (θ, w)) ew (θ, w) + q1e (θ, w) (1− w) eww (θ, w) ≤ 0;

Vwθ (θ, w) = q1eθ (θ, w) = −cθe (θ, e (θ, w))

cee (θ, e (θ, w))≥ 0;

Vww (θ, w) = q1ew (θ, w) ≥ 0;

πww (θ, w) = Sww (θ, w)− Vww (θ, w) ≤ 0. (19)

A.2 Proof of Proposition 1

We consider the following relaxed problem:

maxwL,wH

Eθ [pθ · S (θ, wθ) + (1− pθ) ·W ] . (20)

For each θ, wθ = 1 (“selling the project”) maximizes the social welfare:

S(θ, 1) = maxe

(q (e)− c (θ, e))

≥ q (e (θ, w))− c (θ, e (θ, w)) for each w

= S(θ, w).

Hence wθ = 1 is optimal for each θ ∈ {H,L}.

Given wθ = 1, (20) is maximized by taking pθ = 1 if S (θ, 1) ≥ W and pθ = 0 otherwise.

38

Page 39: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

Since

W < WMAX = maxwH

π(H,wH) ≤ maxwH

S(H,wH),

we have pH = 1, and pL = 1 if S (L, 1) > W and only if S (θ, 1) ≥ W .25

We are left to verify that there exists vθ such that (wθ, pθ) is incentive compatible (note

that vθ does not affect the incentive). By setting vL = (pH − pL)V (L, 1), it is straightforward

to verify that both (ICH) and (ICL) are satisfied.

A.3 Proof of Lemma 1

A.3.1 Proof of vH = 0

Assume vH > 0. Note that (ICH) is binding since otherwise the principal would reduce vH :

V (H,wH) + vH = V (H,wL) + vL. (21)

From (8), the principal’s payoff from the type H is

pH (S (H,wH)− V (H,wH)) + (1− pH)W − vH

= pH (S (H,wH)−W ) +W − pLV (H,wL)− vL.

Therefore, offering vH − ∆v and wH + ∆w to keep (21) improves the principal’s payoff by

(19). Moreover, (ICL) is satisfied:

V (L,wH + ∆w) + vH −∆v

= V (L,wH) + vH + V (L,wH + ∆w)− V (L,wH)−∆v

≤ V (L,wL) + vL + V (L,wH + ∆w)− V (L,wH)−∆v

by (ICL) for the original contract

≤ V (L,wL) + vL + V (H,wH + ∆w)− V (H,wH)−∆v.

25Since S (θ, 1) =W implies any pL ∈ [0, 1] is optimal, we will simply say pL = 1 if and only if S (θ, 1) ≥W .

39

Page 40: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

The last line follows since Vθw (θ, w) ≥ 0 by (19). Since we keep (21), this inequality implies

V (L,wH + ∆w) + vH −∆v ≤ V (L,wL) + vL.

Therefore, we have vH = 0.

A.3.2 Proof of either pL = 1 or pH = 1 unless pH = pL = vH = vL = 0

Assume next pH < 1 and pL < 1. From (8), the principal’s objective is

µH [pHπ (H,wH) + (1− pH)W − vH ] + µL [pLπ (L,wL) + (1− pL)W − vL]

= µH [pH [π (H,wH)−W ]− vH ] + µL [pL [π (L,wL)−W ]− vL] +W.

We have

µH [pH [π (H,wH)−W ]− vH ] + µL [pL [π (L,wL)−W ]− vL] > 0

since otherwise pH = pL = vH = vL = 0 would be optimal.

Offering (kpH , wH , kvH) and (kpL, wL, kvL) with k > 1 increases the principal’s profit:

µH [kpHπ (H,wH) + (1− kpH)W − kvH ] + µL [kpLπ (L,wL) + (1− kpL)W − vL]

= k {µH [pH [π (H,wH)−W ]− vH ] + µL [pL [π (L,wL)−W ]− vL]}+W.

Since (ICH) and (ICL) are satisfied, this is a profitable deviation, which is a contradiction.

A.4 Proof of Lemma 2

Suppose vL > 0 and (ICH) is not binding. Then, reducing vL > 0 and increasing wL to keep

pLV (L,wL) + vL improves the principal’s payoff and satisfies the incentive compatibility

constraint, by the same proof as Lemma 1.

40

Page 41: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

A.5 Proof of Proposition 2

Given Lemma 1, it suffi ces to prove that pL = 0 is optimal if and only if

S (L, 1)−W ≤ 0. (22)

Suppose pL = 0 is optimal. Then the payoff of type θ from the contract for the low type

is equal to vL for each θ. Since the H type incurs the lower cost of effort, the H type’s

incentive compatibility constraint is slack. Hence we have vL = pHV (L,wH)− pLV (L,wL)

and this is necessary and suffi cient for the incentive compatibility. Moreover, pL ≤ 1 and

vL ≥ 0 is not binding.

Substituting vL into the principal’s problem, we have

maxpH ,pL,wH ,wL

µHpH [q (e (H,wH)) (1− wH)−W ]

+µL [pLq (e (L,wL)) (1− wL)− pLW − vL]

maxpH ,wH

µHpH [q (e (H,wH)) (1− wH)−W ]− pHµLV (L,wH)

+µL maxpL

pL maxwL

[q (e (L,wL))− c (L, e (L,wL))−W ] .

If and only if maxwL [q (e (L,wL))− c (L, e (L,wL))−W ] ≤ 0, the optimal pL without

the constraint pL ≤ 1 is equal to 0. Moreover, q (e (L,wL)) − c (L, e (L,wL)) is the social

welfare from the low type, we have

maxwL

[q (e (L,wL))− c (L, e (L,wL))−W ] = S (L, 1)−W,

as desired.

41

Page 42: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

A.6 Useful Lemma

The following lemma pins down the shape of the optimal contract given log-submodularity

(or log-supermodularity) of V :

Lemma 6 If (ICL) holds with equality, then,

1. If V is log-submodular, then we have wH > wL and pH < pL if vL > 0; and wH ≤ wL

and pH ≥ pL if vL = 0.

2. If V is log-supermodular, then we have wH < wL and pH > pL if vL > 0; and wH ≥ wL

and pH ≤ pL if vL = 0.

In particular, together with Proposition 6, we have Lemma 4: If (ICL) is slack, then

Proposition 6 implies wL ≥ wH and pL ≤ pH . If (ICL) binds, then Lemma 6 implies Lemma

4.

We now prove Lemma 6, focusing on log-submodular V , since the argument for log-

supermodular V is symmetric.

A.6.1 Proof: If vL > 0, then we have wH > wL and pH < pL

Since Lemma 2 implies both (ICL) and (ICH) are binding, we have

pHV (H,wH) = pLV (H,wL) + vL;

pHV (L,wH) = pLV (L,wL) + vL,

and so

pH = pLV (H,wL)− V (L,wL)

V (H,wH)− V (L,wH);

pHV (L,wH) = pLV (L,wL) + vL.

(23)

Since vL > 0, we have pHV (L,wH) > pLV (L,wL), and so

V (H,wL)− V (L,wL)

V (H,wH)− V (L,wH)V (L,wH) > V (L,wL)

42

Page 43: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

V (H,wL)V (L,wH) > V (H,wH)V (L,wL) .

Hence we have wH > wL if V is log-submodular. (23) implies pH < pL.

A.6.2 Proof: If vL = 0, then we have wH ≤ wL and pL ≥ pH

Since (ICL) is satisfied with equality, we have pHV (L,wH) = pLV (L,wL). If V is log-

submodular and wH > wL, then we have V (H,wH) /V (L,wH) < V (H,wL) /V (L,wL),

and so pHV (H,wH) < pLV (H,wL). This contradicts to (ICH). Hence wH ≤ wL, and so

pH/pL = V (L,wL) /V (L,wH) ≥ 1.

A.7 Proof of Proposition 3

If she offers CH to type L, the principal’s profit changes by

0 ≥ [pHπ (L,wH) + (1− pH)W ]− [pLπ (L,wL)− vL + (1− pL)W ]

≥ pH [S (L,wH)−W ]− pL [S (L,wL)−W ] by (ICL). (24)

On the other hand, if she offers CL to type H, her profit changes by

0 ≥ [pLπ (H,wL)− vL + (1− pL)W ]− [pHπ (H,wH) + (1− pH)W ]

≥ pL [S (H,wL)−W ]− pH [S (H,wH)−W ] by (ICH) .

Hence we have

pH [S (L,wH)−W ]− pL [S (L,wL)−W ] ≤ pH [S (H,wH)−W ]− pL [S (H,wL)−W ]

pH [S (H,wH)− S (L,wH)] ≥ pL [S (H,wL)− S (L,wL)] .

43

Page 44: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

Since (24) implies

pH ≤ pLS (L,wL)−WS (L,wH)−W ,

we have

S (L,wL)−WS (L,wH)−W [(S (H,wH)−W )− (S (L,wH)−W )]

≥ [(S (H,wL)−W )− (S (L,wL)−W )] .

(S (L,wL)−W ) (S (H,wH)−W ) ≥ (S (L,wH)−W ) (S (H,wL)−W ) .

Hence if the principal’s payoff is log-supermodular (or log-submodular), then we have wH ≥

wL (or wH ≤ wL).

Given Lemma 6, since V (θ, w) is log-submodular, we have vL > 0 if and only if wH > wL.

A.8 Proof of Proposition 4

By Proposition 3, we have vL = 0 and wH ≤ wL. Lemmas 1 and 6 imply that pH = 1.

Since (ICL) is satisfied with equality, we have V (L,wH) = pLV (L,wL). Since V is log-

submodular and wH ≤ wL, we have V (H,wH) /V (L,wH) ≥ V (H,wL) /V (L,wL), and so

V (H,wH) ≥ pLV (H,wL). Hence (ICH) is satisfied whenever wH ≤ wL.

Hence the principal’s problem becomes

maxwH ,wL,wL≥wH≥0

µH (π (H,wH)−W ) + µLpL (π (L,wL)−W )

subject to V (L,wH) = pLV (L,wL). Substituting the constraint yields (12).

A.9 Proof of Proposition 5

By Proposition 3, we have wH ≥ wL. Lemmas 1 and 6 imply that pL = 1.

Suppose wH 6= wL but (ICH) holds with strict inequality (and so vL = 0). Since (ICL)

44

Page 45: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

is satisfied with equality, we have pHV (L,wH) = V (L,wL). Since V is log-submodular and

wH > wL, we have V (H,wH) /V (L,wH) < V (H,wL) /V (L,wL), and so pHV (H,wH) <

pLV (H,wL). Hence (ICH) is not satisfied.

Hence wH 6= wL implies (ICH) holds with equality. Since wH = wL implies pH = pL and

vL = 0 from Lemma 6, regardless of (wH , wL), (ICH) holds with equality.

pHV (H,wH) = V (H,wL) + vL;

pHV (L,wH) = V (L,wL) + vL.

Substituting them into (8) yields (13).

Moreover, since we have

pH =V (H,wL)− V (L,wL)

V (H,wH)− V (L,wH);

pHV (L,wH) = V (L,wL) + vL,

the fixed payment is equal to

vL =V (H,wL)− V (L,wL)

V (H,wH)− V (L,wH)V (L,wH)− V (L,wL)

=V (H,wL)V (L,wH)− V (L,wL)V (H,wH)

V (H,wH)− V (L,wH),

which is positive if and only if wH > wL given V being log-submodular.

A.10 Proof of Lemma 3

Suppose (ICL) holds with strict inequality. We assume wPBH ≥ wPBL and will derive a

contradiction.

We have vL = 0 (otherwise the principal would reduce vL) and π (L,wL) ≥ W (otherwise

she would reduce pL to obtain the outside option but pL = vL would violate (ICL)). Hence

45

Page 46: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

the principal’s payoff is equal to

µH [pHπ (H,wH) + (1− pH)W − vH ] + µL [pLπ (L,wL) + (1− pL)W − vL]

≤ µH [pHπ (H,wH) + (1− pH)W − vH ] + µLπ (L,wL)

since π (L,wL) is no less than W

≤ µHπ (H,wL) + µLπ (L,wL) .

The last inequality follows since offering (pL, wL, vL) = (1, wL, 0) to both types is incentive

compatible. Hence we have π (H,wH) ≥ W .

Given π (H,wH) ≥ W , without loss, we have pH = 1 (since increasing pH does not violate

(ICL) if (ICL) holds with strict inequality). In addition, πw (H,wH) ≤ 0 (since increasing wH

does not violate (ICL) if (ICL) holds with strict inequality). This implies ddwπ (L,wH) ≤ 0

since we assume wPBH ≥ wPBL and π is concave by (19).

Suppose pL < 1. Then, we have to have wL > wH since otherwise (ICL) would be

violated (recall vH = vL = 0 and pH = 1). However, πw (L,wL) < πw (L,wH) ≤ 0 implies

that reducing wL improves π (L,wL) (note that (ICL) still holds if it originally holds with

strict inequality). Hence we have pL = 1.

Finally, given pH = pL = 1, Gottlieb and Moreira (2014) implies wH = wL and vH =

vL = 0. Hence (ICL) holds with equality.

A.11 Proof of Proposition 6

A.11.1 Proof of the “If”Part

It suffi ces to show that (14) is a relaxed problem of the original problem. We start with

a relaxed problem in which we ignore (ICL), and we show that this relaxed problem is

equivalent to (14).

46

Page 47: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

Suppose we ignore (ICL) and so

J (W ) = max{pH ,pL,wH ,wL,vL}

µH [pH (S (H,wH)− V (H,wH)) + (1− pH)W ]

+µL [pL (S (L,wL)− V (L,wL)) + (1− pL)W − vL]

subject to pHV (H,wH) ≥ pLV (H,wL) + vL. Here, by Lemma 1, we assume vH = 0. Since

reducing vL improves the objective and relaxes the constraint, we have vL = 0.

Next, we will show that pH = 1. Given vH = vL = 0, the problem now becomes

J (W ) = max{pH ,pL,wH ,wL}

µH [pH (S (H,wH)− V (H,wH)) + (1− pH)W ]

+µL [pL (S (L,wL)− V (L,wL)) + (1− pL)W ]

subject to

pHV (H,wH) ≥ pLV (H,wL) .

Since the problem is linear in (pH , pL), we have pH = 1 or pL = 1. Suppose pL = 1 is

optimal. Then we have S (L,wL)−V (L,wL)−W > 0 since otherwise reducing pL improves

the objective and relaxes the constraint. Hence we have S (H,wL) − V (H,wL) −W > 0.

If S (H,wH) − V (H,wH) − W ≤ 0, then offering pH = 1 and wH = wL would improve

the objective (type H brings a higher profit to the principal than type L) and satisfy the

constraint. Therefore, S (H,wH) − V (H,wH) −W > 0 and so increasing pH improves the

objective and relaxes the constraint. Hence, without loss, we can assume pH = 1. Hence the

problem becomes

maxpL,wH ,wL

µH (S (H,wH)− V (H,wH)−W ) + µLpL (S (L,wL)− V (L,wL)−W )

subject to V (H,wH) ≥ pLV (H,wL).

Suppose V (H,wH) ≥ pLV (H,wL) is not binding. Then we have wH = wPBH , wL = wPBL ,

and pL = 1 (recall that we have S (L,wL) − V (L,wL) −W > 0). Since wPBH < wPBL , this

47

Page 48: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

violates the constraint. Hence the constraint is binding. Therefore, we have

maxpL,wH ,wL

µH (S (H,wH)− V (H,wH)−W )

+µLpL (S (L,wL)− V (L,wL)−W )

subject to V (H,wH) = pLV (H,wL). Substituting the constraint yields (14).

Finally, if wH ≤ wL in the solution for (14), then V (H, wH) = pLV (H, wL) implies

(ICL):

pLV (L, wL)− V (L, wH) =V (H, wH)

V (H, wL)V (L, wL)− V (L, wH)

=V (H, wH)V (L, wL)− V (H, wL)V (L, wH)

V (H, wL)≤ 0

by log-submodularity of V . Hence the ignored constraint (ICL) is satisfied.

A.11.2 Proof of the “Only if”Part

We show that, if (ICL) is not binding, then the principal is solving (14). By the same proof

as the “if”part, if (ICL) is not binding, then we have (ICH) binding, vL = 0, and pL < 1.

Hence the principal’s problem becomes

maxpL,wH ,wL

µH (S (H,wH)− V (H,wH)−W )

+µLpL (S (L,wL)− V (L,wL)−W ) (25)

subject to V (H,wH) = pLV (H,wL) and V (L,wH) ≤ pLV (L,wL). Substituting the first

constraint yields the following problem:

maxwH ,wL

µH · π (H,wH) + µL ·V (H,wH)

V (H,wL)(π (L,wL)−W )

48

Page 49: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

subject to

V (L,wH) ≤ V (H,wH)

V (H,wL)V (L,wL)⇔ V (H,wL)V (L,wH) ≤ V (H,wH)V (L,wL) .

By log-submodularity of V , this implies wH ≤ wL.

Hence we are left to show that we can ignore the non-binding constraint (ICL) in this

problem. Since

S (θ, w)− V (θ, w)−W = π (θ, w)−W

is concave and V (θ, w) is convex by (19), the second order condition is satisfied for wL

regardless of wH . Hence, we can ignore the non-binding constraint.

B Optimal Contract with Endogenous Outside Option

In the previous section, we presented the characterization of the optimal employment contract

when the principal had the exogenous outside option W . In what follows, we endogenize W

by assuming that, if the current agent is not retained, the principal then pays a small cost

κ > 0 and draws another agent.26 The game then repeats itself.

With the endogenous outside option W , the planning problem for the principal becomes

J (W ) = max{pH ,pL,wH ,wL,vH ,vL}

µH [pH (S (H,wH)− V (H,wH)) + (1− pH) (W − κ)]

+µL [pL (S (L,wL)− V (L,wL)) + (1− pL) (W − κ)] , (26)

subject to

pHV (H,wH) + vH ≥ pLV (H,wL) + vL; (ICEH)

pLV (L,wL) + vL ≥ pHV (L,wH) + vH . (ICEL )

26We formulate the problem such that a new agent arrives right away, and the principal incurs the directcost κ. We can also formalize κ by discounting and search friction. Details are available upon request.

49

Page 50: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

In equilibrium, J (W ) = W by recursion. Let J be the value of J (W ) such that J (W ) =

W . Analyzing this problem, we obtain equivalents of the results of Section 5. Lemmas 1 and

2 continue to hold as stated, since they do not rely on the exogeneity of W . The analysis

changes when we consider the incentive compatibility constraint for the type L agent.

Lemma 7 If π (L,wL) < W −κ, then constraint (ICL) binds. In addition, for a suffi ciently

small κ, constraint (ICL) binds at the fixed point W = J (W ).

Proof. In Section B.1.

(ICL) binds if π (L,wL) < W − κ– the profit the principal is obtaining from hiring the

L type is lower than the value of getting a new worker after paying the waiting cost. This

happens in equilibrium when the waiting cost κ is suffi ciently small, since the H type is more

effi cient than the L type, as long as cθe < 0. Then, the principal would decrease to pL if

(ICL) were not binding. The next proposition focuses on the case in which κ is small and

(ICL) binds.27

Proposition 9 For suffi ciently small κ > 0, the optimal contract can take one of the fol-

lowing forms:

1. The L type is never retained: pL = 0, pH = 1, vL > 0.

2. Both types are hired with positive probability:

(a) (Principal-Agent alignment) If S(θ, w) −W is log-submodular, then vL = 0,

1 = pH ≥ pL and wL ≥ wH .

(b) (Principal-Agent alignment) If S(θ, w)−W is log-supermodular, then vL ≥ 0,

and 1 = pL ≥ pH and wH ≥ wL.

Proof. In Section B.2.

The intuition for the results is the same as described in Section 5. The details on deriving

the conditions that characterize each of the cases described in the above Proposition are given

in Appendix B.2.27The analysis in the case with (ICL) not binding is analogous to Proposition 6.

50

Page 51: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

B.1 Proof of Lemma 7

Suppose (ICL) is not binding. Let C∗θ = (p∗θ, w∗θ , v∗θ) be the optimal contract to type θ; and

let J∗θ be the principal’s profit from type θ:

J∗θ ≡ p∗θq (e (θ, w∗θ)) (1− w∗θ) + (1− p∗θ)(J − κ

)− v∗θ .

Note that J∗H ≥ J∗L since offering CH = C∗L is incentive compatible.

We haved

dwLq (e (L,wL)) (1− wL)

∣∣∣∣wL=w∗L

≥ 0

since otherwise the principal would decrease w∗L without violating (ICH). Hence w∗L ≤

wPBL ≡ arg maxwL≥0 q (e (L,wL)) (1− wL), and so

q (e (L,w∗L)) (1− w∗L) ≤ πPBL ≡ maxwL≥0

q (e (L,wL)) (1− wL) .

Since offering p = 1, w = wPBL , v = 0 to both agent is possible, we have

J > µHq(e(H,wPBL

)) (1− wPBL

)+ µLπ

PBL .

The strict inequality follows from Assumption 1. Hence there exists ε > 0 such that

J ≥ q (e (L,w∗L)) (1− w∗L) + ε. (27)

Hence, for κ ≤ ε, the principal would decrease pL if (ICL) is not binding.

51

Page 52: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

B.2 Proof of Proposition 9

B.2.1 Retention of Type H Only

Proposition 10 With the contract in which the L type is never retained, pL = 0, pH = 1,

vL > 0, the optimal wH maximizes

J := maxwH

q (e (H,wH)) (1− wH)− µLµH

(V (L,wH) + κ) . (28)

This contract is optimal if and only if S (L, 1)− (J − κ) ≤ 0.

Proof. By the same proof as Proposition 2, pL = 0 is optional if and only if S (L, 1) −

(W − κ) ≤ 0.

Let us guess pL = 0 is optimal, that is, S (L, 1)− (W − κ) ≤ 0. Then, at the fixed point,

the value is

J = maxwH

µHq (e (H,wH)) (1− wH) + µL[(J − κ

)− V (wH , L)

]⇔

J = J := maxwH

q (e (H,wH)) (1− wH)− µLµH

(V (L,wH) + κ) .

This guess is verified if and only if S (L, 1)− (J − κ) ≤ 0.

B.2.2 Retention of Both Types: S (θ, w)−W is Log-Submodular

Proposition 11 If S(θ, w)−W is log-submodular and the condition of Proposition 10 does

not hold, then vL = 0, 1 = pH ≥ pL and wL ≥ wH . In particular, wH and wL solve

maxwH ,wL,wL≥wH≥0

µHπ (H,wH) + µL

(V (L,wH)V (L,wL)

π (L,wL)− V (L,wL)−V (L,wH)V (L,wL)

κ)

µH + µLV (L,wH)V (L,wL)

and pL = V (L,wH)V (L,wL)

.

52

Page 53: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

Proof. Suppose S (θ, w) −W is log-submodular for each W . Then, by the same proof as

Proposition 3, we have wH ≤ wL, pH ≥ pL, and vL = 0. Moreover, (ICH) is not binding.

Hence the problem becomes

J (W ) = maxwH ,wL,wL≥wH≥0

µHπ (H,wH)+µL

(V (L,wH)

V (L,wL)π (L,wL) +

V (L,wL)− V (L,wH)

V (L,wL)(W − κ)

).

At the fixed point, we have

J =µHπ (H,wH) + µL

(V (L,wH)V (L,wL)

π (L,wL)− V (L,wL)−V (L,wH)V (L,wL)

κ)

µH + µLV (L,wH)V (L,wL)

.

By dynamic programming, we have

J = maxwH ,wL,wL≥wH≥0

µHπ (H,wH) + µL

(V (L,wH)V (L,wL)

π (L,wL)− V (L,wL)−V (L,wH)V (L,wL)

κ)

µH + µLV (L,wH)V (L,wL)

.

B.2.3 Retention of Both Types: S (θ, w)−W is Log-Supermodular

Proposition 12 If S(θ, w) − W is log-supermodular and the condition of Proposition 10

does not hold, then vL ≥ 0, pH ≤ pL = 1 and wL ≤ wH . In particular, wH and wL solve

J (W ) = maxwH ,wL,wL≥wH≥0

µH

V (H,wL)−V (L,wL)V (H,wH)−V (L,wH)

π (H,wH)−(

1− V (H,wL)−V (L,wL)V (H,wH)−V (L,wH)

µHV (H,wL)−V (L,wL)V (H,wH)−V (L,wH)

+ µL

+ µLπ (L,wL)− V (L,wH) + V (L,wL)

µHV (H,wL)−V (L,wL)V (H,wH)−V (L,wH)

+ µL.

and pH = V (H,wL)−V (L,wL)V (H,wH)−V (L,wH)

.

Proof. Suppose S (θ, w)−W is log-supermodular for each W . Then, by the same proof as

Proposition 3, we have wH ≥ wL, pH ≤ pL, and vL ≥ 0. Moreover, both (ICH) and (ICL)

53

Page 54: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

are binding. Substituting them into the objective, the problem becomes

J (W ) = maxwH ,wL,wH≥wL≥0

µHV (H,wL)− V (L,wL)

V (H,wH)− V (L,wH)π (H,wH)

+ µH

(1− V (H,wL)− V (L,wL)

V (H,wH)− V (L,wH)

)(W − κ)

+ µL (π (L,wL)− V (L,wH) + V (L,wL)) .

At the fixed point, we have

J = µH

V (H,wL)−V (L,wL)V (H,wH)−V (L,wH)

π (H,wH)−(

1− V (H,wL)−V (L,wL)V (H,wH)−V (L,wH)

µHV (H,wL)−V (L,wL)V (H,wH)−V (L,wH)

+ µL

+µLπ (L,wL)− V (L,wH) + V (L,wL)

µHV (H,wL)−V (L,wL)V (H,wH)−V (L,wH)

+ µL.

By dynamic programming, we have

J (W ) = maxwH ,wL,wL≥wH≥0

µH

V (H,wL)−V (L,wL)V (H,wH)−V (L,wH)

π (H,wH)−(

1− V (H,wL)−V (L,wL)V (H,wH)−V (L,wH)

µHV (H,wL)−V (L,wL)V (H,wH)−V (L,wH)

+ µL

+ µLπ (L,wL)− V (L,wH) + V (L,wL)

µHV (H,wL)−V (L,wL)V (H,wH)−V (L,wH)

+ µL,

as desired.

C Outside Options for Agents

C.1 Proof of Proposition 7

Since (ICL) binds by (16), the principal’s problem is

maxpH ,wH ,pL,wL,vL

µHpH (π (H,wH)−W ) + µL [pL (π (L,wL)−W )− vL] (29)

54

Page 55: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

subject to

pH[V (H,wH)− V O

H

]≥ max

{pL[V (H,wL)− V O

H

]+ vL, 0

}, (ICH)

pL[V (L,wL)− V O

L

]+ vL = pH

[V (L,wH)− V O

L

]. (ICL)

Substituting (ICL) yields

maxpH ,wH ,pL,wL

µHpH (π (H,wH)−W ) + µL{pL(S (L,wL)−W − V O

L

)− pH

[V (L,wH)− V O

L

]}subject to

pH[V (H,wH)− V O

H

]≥ max

pL{[V (H,wL)− V O

H

]−[V (L,wL)− V O

L

]}+pH

[V (L,wH)− V O

L

], 0

,

pL[V (L,wL)− V O

L

]≤ pH

[V (L,wH)− V O

L

].

Note that the second constraint is equivalent to vL ≥ 0.

C.1.1 Case 1: V OH = V O

L

Suppose V OH = V O

L . If S (L, 1)−W −V OL ≤ 0, then reducing pL increases the objective. Note

that, with V OH = V O

L , the constraint becomes

pH[V (H,wH)− V O

H

]≥ max

{pL [V (H,wL)− V (L,wL)] + pH

[V (L,wH)− V O

H

], 0},

pL[V (L,wL)− V O

L

]≤ pH

[V (L,wH)− V O

L

].

Hence reducing pL always relaxes the constraint. Hence pL = 0 is optimal if S (L, 1)−W −

V OL ≤ 0.

On the other hand, if S (L, 1)−W − V OL > 0, then increasing pL from pL = 0 improves

the objective since, for suffi ciently small pL, the constraints are redundant. Hence pL = 0 is

optimal only if S (L, 1)−W − V OL ≤ 0.

55

Page 56: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

C.1.2 Case 2: V OH > V O

L

If V OH > V O

L , then even if S (L, 1)−W − V OL ≤ 0, the principal cannot reduce pL to pL = 0

if the constraint

pH[V (H,wH)− V O

H

]≥ max

{pL{[V (H,wL)− V O

H

]−[V (L,wL)− V O

L

]}+ pH

[V (L,wH)− V O

L

], 0}

is binding. Note that the term multiplied to pL,[V (H,wL)− V O

H

]−[V (L,wL)− V O

L

], can

be negative since V OH > V O

L .

On the other hand, suppose S (L, 1)−W − V OL > 0 and pL = 0 is incentive compatible:

pH[V (H,wH)− V O

H

]≥ max

{pH[V (L,wH)− V O

L

], 0}.

Consider the following two cases:

1. V (L,wH)− V OL < 0. In this case,

max{pL{[V (H,wL)− V O

H

]−[V (L,wL)− V O

L

]}+ pH

[V (L,wH)− V O

L

], 0}

= 0

for suffi ciently small pL. Hence, for suffi ciently small pL, constraint (ICH) is redundant.

Hence the same proof as Proposition 2 implies pL > 0 at optimal.

2. V (L,wH)− V OL ≥ 0. In this case, (ICH) at pL = 0 is equivalent to

V (H,wH)− V (L,wH) ≥ V OH − V O

L ≥ 0.

Suppose V (H,wH)−V (L,wH) > V OH −V O

L . Then, for suffi ciently small pL, constraint

(ICH) is redundant. Hence the same proof as Proposition 2 implies pL > 0 at optimal.

On the other hand, suppose V (H,wH)− V (L,wH) = V OH − V O

L . Since the first order

condition for wH is satisfied for the principal’s problem, the first order effect of increas-

56

Page 57: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

ing wH on the principal’s objective is 0. On the other hand, Assumption 1 implies that

the first order effect on V (H,wH) − V (L,wH) is positive; so, increasing wH slightly

makes (ICH) slack. With this slack, the constraint (ICH) is redundant for suffi ciently

small pL. Hence the same proof as Proposition 2 implies that there is a first-order gain

of increasing pL for the principal.

C.2 Retention of Both Types: V OH = V O

L

If V OH = V O

L , then we can show that, if V (θ, w) is log-submodular, then V (θ, w)−V Oθ is also

log-submodular:

Lemma 8 Suppose V OH = V O

L = V O. If V (θ, w) is log-submodular, then V (θ, w) − V Oθ is

also log-submodular.

Recall that the log-submodularity of V (θ, w) implies that the higher type values retention

probabilities more than monetary rewards. Since both types have the same outside option,

the outside option is relatively less attractive to the high type. Hence the higher type values

retention probabilities even more compared to the low type, with the same outside options

between the two types.

The similar relationship holds for S (θ, w)− V Oθ −W :

Lemma 9 Suppose V OH = V O

L = V O. If S (θ, w) − W is log-submodular, then S (θ, w) −

V Oθ −W is also log-submodular. On the other hand, even if S (θ, w)−W is log-supermodular,

S (θ, w)− V Oθ −W may not be log-supermodular.

These lemmas follow from the following lemma, noting that Vwθ (θ, w) ≥ 0 and Swθ (θ, w) ≥

0 from (19):

Lemma 10 Take any F (θ, w) with Fwθ (θ, w) ≥ 0 andW ≥ 0. If F (θ, w) is log-submodular,

then F (θ, w) − W is also log-submodular. On the other hand, even if F (θ, w) is log-

supermodular, F (θ, w)−W may not be log-supermodular.

57

Page 58: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

Proof. For simplicity, we assume logF (θ, w) is twice-differentiable. Otherwise, we can take

derivative of

(F (θ, w)−W ) (F (θ′, w′)−W )− (F (θ, w′)−W ) (F (θ′, w)−W )

with respect to W to derive the same result.

We have

d2

dθdwlog (F (θ, w)−W ) =

Fwθ (θ, w) [F (θ, w)−W ]− Fw (θ, w)Fθ (θ, w)

[F (θ, w)−W ]2.

By the standard argument, F (θ, w) −W is log-supermodular (or submodular) if this cross

derivative is non-negative (or non-positive). Hence we are left to show that ddW

d2

dθdwlog (F (θ, w)−W ) ≤

0 if d2

dθdwlog (F (θ, w)−W ) ≤ 0.

We have

d

dW

d2

dθdwlog (F (θ, w)−W )

=−Fwθ (θ, w) [F (θ, w)−W ]2 + 2 {Fwθ (θ, w) [F (θ, w)−W ]− Fw (θ, w)Fθ (θ, w)} [F (θ, w)−W ]

[F (θ, w)−W ]4.

Given Fθw (θ, w) ≥ 0, we have

d

dW

Fwθ (θ, w) [F (θ, w)−W ]− Fw (θ, w)Fθ (θ, w)

[F (θ, w)−W ]2≤ −Fwθ (θ, w) [F (θ, w)−W ]2

[F (θ, w)−W ]4≤ 0,

as desired.

By Lemma 8, we have V (θ, w) − Vθ is log-submodular. As in Proposition 3, we can

characterize the optimal contract as follows:

Proposition 13 Suppose V OH = V O

L = V O. Consider the case in which the condition of

Proposition 7 is not satisfied. The optimal contract has the following properties:

58

Page 59: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

1. (Principal-Agent Alignment) If S(θ, w)− V O −W is also log-submodular, then

vL = 0, wL ≥ wH , 1 = pH ≥ pL > 0.

2. (Principal-Agent Misalignment) If S(θ, w) − V O −W is log-supermodular, then

(ICH) holds with equality and

vL ≥ 0, wH ≥ wL, 1 = pL ≥ pH > 0.

In addition, vL > 0 if and only if wH 6= wL.

Proof. The same as Proposition 3.

Moreover, as Propositions 4 and 5, the optimal contract in each case has the following

characterization (the proofs are omitted):

Proposition 14 (Alignment case) Suppose V OH = V O

L = V O. If both V (θ, w) − V O and

S(θ, w)− V O −W are log-submodular, then the optimal contract solves

maxwH ,wL,wL≥wH≥0

µH (π (H,wH)−W ) + µLV (L,wH)− V O

V (L,wL)− V O(π (L,wL)−W ) .

Proposition 15 (Misalignment case) Suppose V OH = V O

L = V O. If V (θ, w) − V O is log-

submodular and S(θ, w)− V O −W is log-supermodular, then the optimal contract solves

maxwH ,wL,wH≥wL≥0

µHV (H,wL)− V (L,wL)

V (H,wH)− V (L,wH)(π (H,wH)−W )

+ µL (π (L,wL)−W − V (L,wH) + V (L,wL)) .

Moreover, we have vL > 0 if and only if wH > wL.

59

Page 60: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

C.3 Retention of Both Types: V OH > V O

L

If V OH > V O

L , then it is possible that type H prefers the outside option V OH to the type L’s

contract. The following proposition characterizes when type H prefers the outside option

V OH to type L’s contract:

Proposition 16 Suppose V OH > V O

L . Consider the case in which the solution of (29) has

pL > 0. Define wOH and wOL such that

wOH = arg maxwH :V (H,wH)≥V OH

S (H,wH)− V (H,wH)−WV (L,wH)− V O

L

;

and

wOL = arg maxwL

V (L,wL)− V OL

V (L,wOH)− V OL

(S(H,wOH

)− V

(H,wOH

)−W

)+S (L,wL)−V (L,wL)−W.

The optimal contract satisfies wH = wOH , wL = wOL , pH =V (L,wOL )−V OLV (L,wOH)−V OL

, pL = 1, and vL = 0

if and only if we have wOH ≥ wOL , and

0 ≥ V(H,wOL

)− V O

H ,

0 ≤V(L,wOL

)− V O

L

V (L,wOH)− V OL

(S(H,wOH

)− V

(H,wOH

)−W

)+ S

(L,wOL

)− V

(L,wOL

)−W .

Moreover, in this case, type H prefers the outside option V OH to the type L’s contract.

Proof. We guess that type H prefers the outside option:

pL[V (H,wL)− V O

H

]+ vL ≤ 0. (30)

In such a case, since CL does not affect (ICH), we have vL = 0. Since (ICL) is binding, the

principal’s problem is

maxpH ,wH ,pL,wL

pH (S (H,wH)− V (H,wH)−W ) + pL (S (L,wL)− V (L,wL)−W )

60

Page 61: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

subject to

V (H,wH)− V OH ≥ 0. (PCH)

pL[V (L,wL)− V O

L

]= pH

[V (L,wH)− V O

L

]. (ICL)

Moreover, if (30) is true, then we have 0 ≥ V (H,wL)− V OH . Hence, at optimal, we have

to have wH ≥ wL. This implies that pH ≤ pL = 1.

Substituting (ICL) and pL = 1, we have

maxwH ,wL

V (L,wL)− V OL

V (L,wH)− V OL

(S (H,wH)− V (H,wH)−W ) + S (L,wL)− V (L,wL)−W

subject to V (H,wH) ≥ V OH . We ignore the constraint pH ≤ 1 since we know that, if the type

H prefers the outside option V OH to type L’s contract– if the conjecture is correct– , then

we have wH ≥ wL given V (H,wH) ≥ V OH (the H type prefers CH to the outside option).

Hence we have

wOH = arg maxwH

S (H,wH)− V (H,wH)−WV (L,wH)− V O

L

subject to V (H,wH) ≥ V OH . Given the solution w

OH ,

wOL = arg maxwL

V (L,wL)− V OL

V (L,wOH)− V OL

(S(H,wOH

)− V

(H,wOH

)−W

)+S (L,wL)−V (L,wL)−W.

The guess is verified if and only if we have V OH is better than CL and pL = 1 is optimal,

that is,

0 ≥ V(H,wOL

)− V O

H ,

0 ≤V(L,wOL

)− V O

L

V (L,wOH)− V OL

(S(H,wOH

)− V

(H,wOH

)−W

)+ S

(L,wOL

)− V

(L,wOL

)−W .

If the condition in the above proposition is not satisfied, then, at the optimal contract,

the type H agent prefers CL to the outside option. Then, the analysis of this case is the

61

Page 62: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

same as above, replacing V (θ, w) with V (θ, w)− Vθ. Since it is possible that V (θ, w)− Vθis log-supermodular even if V (θ, w) is log-submodular,28 the full characterization takes the

following form:

Proposition 17 Suppose V OH > V O

L . Consider the case in which the solution of (29) has

pL > 0 and the condition of Proposition 16 is not satisfied. The optimal contract has the

following properties:

1. (Principal-Agent Alignment: log-submodular V (θ, w)−Vθ) If both V (θ, w)−Vθand S(θ, w)− V O −W are log-submodular, then

vL = 0, wL ≥ wH , 1 = pH ≥ pL > 0.

2. (Principal-Agent Alignment: log-supermodular V (θ, w)−Vθ) If both V (θ, w)−

Vθ and S(θ, w)− V O −W are log-supermodular, then

vL = 0, wH ≥ wL, 1 = pL ≥ pH > 0.

3. (Principal-Agent Misalignment: log-submodular V (θ, w)− Vθ) If V (θ, w)− Vθis log-submodular and S(θ, w) − V O −W is log-supermodular, then (ICH) holds with

equality and

vL ≥ 0, wH ≥ wL, 1 = pL ≥ pH > 0.

In addition, vL > 0 if and only if wH 6= wL.

4. (Principal-Agent Misalignment: log-supermodular V (θ, w)−Vθ) If V (θ, w)−Vθis log-supermodular and S(θ, w)− V O −W is log-submodular, then

vL ≥ 0, wL ≥ wH , 1 = pH ≥ pL > 0.

28On the contrary to Lemma 10, even if the higher type values retention probabilities compared to thelow type without outside options, the H type may value monetary rewards, compared to the L type, if theH-type outside option is very high and he would like to obtain the outside option.

62

Page 63: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

In addition, vL > 0 if and only if wH 6= wL.

Proof. The same as Proposition 3.

Moreover, we can obtain a similar characterization as Propositions 4 and 5 for each of

the four cases.29

D Model with More than Two Outcomes

We also consider the extension to the case in which more than two outcomes can be observed:

|Y | ≥ 3. Let Y ={y1, ..., y|Y |

}3 y be the set of possible outcomes with y1 ≤ y2 ≤ · · · ≤ y|Y |.

Now the rewards after outcomes is expressed by a vector wθ = (wθ (y))y. Similarly, the social

welfare S (θ,wθ) and the agent’s value V (θ,wθ) have wθ as an argument. As before, let vθ

be the payment when the agent is not retained.

D.1 A General Condition

When we prove Proposition 3, we consider the principal’s deviation to offer the same contract

wθ to both types. The non-profitability of this deviation amounts to

pH [S (H,wH)−W ] ≥ pL [S (H,wL)−W ] ;

pH [S (L,wH)−W ] ≤ pL [S (L,wL)−W ] .

That is,

(S (H,wH)−W ) (S (L,wL)−W )− (S (H,wL)−W ) (S (L,wH)−W ) ≥ 0. (31)

29We omit the details since it follows from the straightforward substitution of incentive constraints.

63

Page 64: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

On the other hand, by the same proof as Lemma 2, we can show that vL > 0 implies both

(ICH) and (ICL) are binding. Manipulating them yields

vL = pLV (H,wL)V (L,wH)− V (L,wL)V (H,wH)

V (H,wH)− V (L,wH).

Hence

V (H,wL)V (L,wH)− V (L,wL)V (H,wH) > 0 if and only if vL > 0. (32)

If |Y | = 2, then the log-submodularity/supermodularity of S (θ, w) − W and V (θ, w)

with w = w (1)− w (0) ensures that (31) and (32) imply that vL > 0 only if S (θ, w)−W is

log-supermodular given log submodular V (θ, w). To see why, given log submodular V (θ, w),

we have

[(S (H,wH)−W ) (S (L,wL)−W )− (S (H,wL)−W ) (S (L,wH)−W )]

× [V (H,wH)V (L,wL)− V (H,wL)V (L,wH)]

≥ 0 for each wL, wH

if and only if S (θ, w) −W is log-submodular. The analogous condition with multiple out-

comes is that

[(S (H,wH)−W ) (S (L,wL)−W )− (S (H,wL)−W ) (S (L,wH)−W )]

× [V (H,wH)V (L,wL)− V (H,wL)V (L,wH)]

≥ 0 for each wL,wH . (33)

We obtain a result analogous to the aligned case of Proposition 3:

Proposition 18 If (33) holds, then vL = 0.

Proof. The same as Proposition 3.

64

Page 65: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

D.2 A Specific Condition

One may wonder if there is a condition more closely related to the log-submodularity/supermodularity

of S (θ, w) − W and V (θ, w). To this end, let S|Y | (θ, w) − W and V|Y | (θ, w) be the so-

cial welfare and the agent’s value, when w = (w, 0, . . . , 0) (the agent is compensated

only after the highest outcome y = y|Y |). We derive the condition based on the log-

submodularity/supermodularity of S|Y | (θ, w)−W and V|Y | (θ, w).

Lemma 11 Given Assumption 4, the H type is compensated only after the highest outcome:

wH = (wH , 0, . . . , 0).

Proof. In Section D.3.1.

Intuitively, we consider the following deviation of the principal when the H type is com-

pensated after yn < y|Y |: decrease wH (yn) and increase wH(y|Y |), so as to keep constant type

H’s value from the contract. The monotone hazard rate condition implies that this change

incentivizes more effort. The monotone likelihood ratio condition implies that, with lower

effort, it is harder to achieve y|Y | than yn. Since the L type exerts lower effort compared

to the H type, this reduces the L type’s incentive to mimic the H type. Hence, this is a

profitable and incentive compatible deviation for the principal.

Nevertheless, it is not always the case that the L type is compensated only after the

highest outcome. To see why, the monotone hazard rate condition implies that compensat-

ing the L type only after the highest outcome incentivizes more effort– it helps solve the

moral hazard problem; however, by the symmetric argument, the monotone likelihood ratio

condition implies that compensating the L type only after the highest outcome increases the

H type’s incentive to mimic the L type. In other words, the compensation after intermediate

outcomes helps solve the adverse selection problem.

Note that the argument that the H type is compensated only after the highest outcome

is analogous to Lemma 1– the H type is compensated only after y = 1; and the argument

that the L type may be compensated after the intermediate outcomes is analogous to vL > 0

in the binary case– vL > 0 is costly for the principal, but reducing vL and increasing wL

65

Page 66: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

may violate the H type’s incentive compatibility.

In order to derive more specific conditions for when vL > 0, consider the following:

Given wL, let w (L,wL) be the reward after y|Y | such that type L is indifferent between

w (L,w) = (w (L,w) , 0, 0, ..., 0) and wL. As mentioned, this increases the effort of the L

type. Let ∆S (L,w) be the increase of the social welfare when the agent is of type L:

∆S (L,w) = S (L,w (L,w))− S (L,w) ≥ 0.

At the same time, the H type’s value from CL also increases:

∆V (H,w) = V (H,w (L,w))− V (H,w) ≥ 0.

If the benefit of increasing S (L,w) dominates the cost of increasing ∆V (H,w), such that

∆S (L,w) ≥ ∆V (H,w) for each w, (34)

then we can show that vL ≥ 0 implies the log-supermodularity of S (θ,w).

Proposition 19 Suppose Assumptions 4 and (34) are satisfied. In addition, suppose V|Y | (θ, w)

is log-submodular. Then vL ≥ 0 only if S|Y | (θ, w)−W is log-supermodular.

Proof. In Section D.3.2.

D.3 Proofs

D.3.1 Proof of Lemma ??

Let wnθ be the wage after outcome yn. If there exist n and n′ such that n > n′ and wnθ , w

n′θ > 0,

suppose the principal increases wnH by one unit, which increases the type H’s payoff by

q (yn|eH). To keep the H type indifferent, the principal reduces wn′H by q (yn|eH) /q (yn′ |eH).

Keeping eH constant, the principal would be indifferent. Yet, the monotone hazard rate con-

dition implies that the H type increases his effort, and so the principal’s payoff is increasing.

66

Page 67: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

For the L type, by the envelope theorem, the gain from imitating the H type is changed

by

q (yn|eLH)− q (yn|eH)

q (yn′|eH)q (yn′|eLH) ,

where eLH is the optimal effort of the L type when he takes contract CH . Since eH > eLH ,

the monotone likelihood ratio condition implies that the above gain is negative. Hence (ICL)

gets relaxed. This implies that we have wnH > 0 only for n = |Y |.

D.3.2 Proof of Proposition 8

We prove that vL > 0 only if S|Y | (θ, w) is log-supermodular. To see why, suppose vL > 0.

Then, since (ICH) is binding, we have

pHV (H,wH) = pLV (H,wL) + vL;

pHV (L,wH) = pLV (L,wL) + vL.

HencevLpL

=V (H,wL)V (L,wH)− V (L,wL)V (H,wH)

V (H,wH)− V (L,wH)> 0,

which implies

V (H,wL)V (L,wH)− V (L,wL)V (H,wH) > 0.

Since ∆V (H,w) ≥ 0 and V (L,w (L,wL)) = V (L,wL), this implies

V (H,w (L,wL))V (L,wH)− V (L,w (L,wL))V (H,wH) > 0,

or

V|Y | (H,w (L,wL))V|Y | (L,wH)− V|Y | (L,w (L,wL))V|Y | (H,wH) > 0.

Hence w (L,wL) < wH by log-submodularity of V|Y | (θ, w).

Since the principal’s deviation of offering w (L,w) to both types is not profitable, we

67

Page 68: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

have

pL(S|Y | (H,w (L,wL))−W −∆V (H,wL)

)≤ pH

(S|Y | (H,wH)−W

).

Since w (L,wL) and wH have positive rewards only after y|Y |, we have S|Y | (H,w (L,wL)) =

S (H,w (L,wL)) and S|Y | (H,wH) = S (H,wH). The term ∆V (H,wL) represents the fact

that the rent that the principal needs to pay to the high type in CL increases by ∆V (H,wL).

Hence the lower bound of the change of the principal’s profit is given by

pL(S|Y | (H,w (L,wL))−W −∆V (H,wL)

)− pH

(S|Y | (H,wH)−W

).

Similarly, since the principal’s deviation of offering wH to both types is not profitable,

we have

pH (S (L,wH)−W ) ≤ pL (S (L,wL)−W ) .

Again, since S (L,wH) = S|Y | (L,wH) and S (L,wL) = S (L,w (L,wL)) − ∆S (L,wL) =

S|Y | (L,w (L,wL))−∆S (L,wL), we have

pH(S|Y | (L,wH)−W

)≤ pL

(S|Y | (L,w (L,wL))−∆S (L,wL)−W

).

Note that

pL(S|Y | (H,w (L,wL))−W −∆V (H,wL)

)≤ pH

(S|Y | (H,wH)−W

)and

pH(S|Y | (L,wH)−W

)≤ pL

(S|Y | (L,w (L,wL))−∆S (L,wL)−W

)imply

(S|Y | (L,wH)−W

) (S|Y | (H,w (L,wL))−W −∆V (H,wL)

)≤

(S|Y | (H,wH)−W

) (S|Y | (L,w (L,wL))−∆S (L,wL)−W

).

68

Page 69: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

Since we have S|Y | (L,wH)−W ≤ S|Y | (H,wH)−W , together with (34), we have

(S|Y | (H,wH)−W

)∆S (L,w) ≥

(S|Y | (L,wH)−W

)∆V (H,w) .

Hence we have

(S|Y | (L,wH)−W

) (S|Y | (H,w (L,wL))−W

)≤

(S|Y | (H,wH)−W

) (S|Y | (L,w (L,wL))−W

).

Since w (L,wL) < wH , this implies that S|Y | (θ, w)−W is log-supermodular.

E Optimal Contract with Multiple Agent Types

We extend the model to consider the case in which the agent can have more than two types.

To avoid a complication arising from non-binding incentive compatibility of lower types, we

assume endogenous outside options with small κ > 0.

Specifically, we assume that the agent can have types θ1, ..., θn with θ1 < θ2 < · · · <

θn−1 < θn. Contract Ci is designed for type θi. Let ICi,j be the constraint that type i does

not want to pretend to be type j:

piV (θi, wi) + vi ≥ pjV (θi, wj) + vj.

Assumption 5 We assume that V (θ, w) is well defined for θ ∈ Θ =[θ, θ]⊃ {θ1, ..., θn}.

First, we can derive a set of general properties of the optimal contract with multiple

types of agents.

Lemma 12 With multiple agent types, the optimal contract has the following properties:

1. vi > 0 implies ICi,j is binding for some j.

2. For the highest type, vn = 0.

69

Page 70: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

3. If pi > 0 and vi > 0, then there exists j ≥ i+ 1 such that ICj,i is binding.

4. With κ > 0 suffi ciently small, for each i, there exist i′ ≤ i and j ≥ i + 1 such that

ICi′,j is binding.

5. If pi = 0, then pj = 0 for each j ≤ i.

Proof. In Section E.1.1.

The first property stems from the fact that offering a fixed base payment vi > 0 is

a direct cost to the principal. The only benefit from such a payment comes if it helps

establish incentive compatibility. Therefore, it has to be the case that one of the incentive

compatibility constraints must be binding. For the highest type, the intuition for vn = 0 is

the same as in the two type case. Property 3 is derived from noticing that the principal sets

vi > 0 only if she cannot offer incentives to the agent of type i through a higher reward wi,

because this would make the contract more attractive to an agent of a higher type, so with

lower marginal cost of effort. Property 4 follows because otherwise the principal would stand

to gain from retaining lower types less often in order to have the chance to perform a new

search. Finally, if the agent of type i finds it optimal to accept a contract without retention,

then it must be that all the agents of lower types find it optimal to accept this contract

since they have even higher marginal cost of effort. Hence, we impose the constraint that, if

pi = 0, then pj = 0 for each j ≤ i, and we ignore the incentive compatibility constraint for

j.

First, if both V (θ, w) and S (θ, w)−W are log-submodular, then it is relatively straight-

forward to extend the main result:

Proposition 20 For suffi ciently small κ > 0, if V (θ, w) is log-submodular and S(θ, w)−W

is also log-submodular, then there exists θ ∈ {θ1, ..., θn−1} such that the principal does not

retain the types for θi < θ for sure, and for the retained types, we have 0 = vn = · · · = vi,

pn ≥ pn−1 ≥ · · · ≥ pi, and wn ≤ wn−1 ≤ · · · ≤ wi.

Proof. In Section E.1.2.

70

Page 71: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

The result extends the intuition derived in the case with only two agent types. If S(θ, w)−

W is log-submodular then the excess social value from the contract increases relatively

more when the principal’s offers higher retention to a higher agent type. The principal can

compensate an agent for lower retention by increasing the reward offered to this agent. With

V (θ, w) also log-submodular, the relative value placed by the agent on the reward as opposed

to retention is higher for lower agent types. Therefore, the actions of the principal towards

solving the moral hazard and adverse selection problems align with the incentives of agents.

The principal can offer an incentive compatible contract that gives higher retention to higher

types and higher rewards to lower, and she does not need to offer a fixed base payment.

When S (θ, w) − W are log-supermodular, while V (θ, w) is log-submodular, the same

logic as in the case with only two agent types leads to the conclusion that some fixed base

payments will be useful in the optimal contract. The diffi culty, however, is in determin-

ing whether these base payments, the rewards, and retention probabilities are following a

monotone pattern. The complication is driven by the lack of a single crossing condition for

this three dimensional problem (pθ, vθ, wθ), which creates a non-trivial preference ordering

over contracts by different types. In order to make further progress in the analysis, we need

to assume a single-peakedness condition.

Consider any pair of contracts C = (p, w, v) and C ′ = (p′, w′, v′). Define

f (θ; C, C ′) ≡ pV (θ, w)− pV (θ, w′) ,

the marginal gain of obtaining the contract C than C ′, except for the fixed payment v and

v′.

Assumption 6 For w > w′ and p < p′, we assume that f (θ; C, C ′) is single-peaked.

This assumption is necessary in order to ensure an ordering of agents’preferences over

contracts. To see why this make sense, suppose w > w′, p < p′, and v = v′ = 0. We have

to have p′w′ > pw. Otherwise, C would pay more in terms of expected reward, and it would

allow the agent not to pay the cost of effort more often; and so each type would prefer C.

71

Page 72: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

Then, if θ is very low and that type exerts almost zero effort (and so his payoffs are q0pw

and q0p′w′), or if θ is very high and that type does not incur the cost of effort (and so his

payoffs are pw and p′w′ from C and C ′, respectively), then both types prefer C ′ to C, and so

f (θ; C, C ′) becomes low for extreme types.

Even with the single-peakedness condition, controlling the global incentive compatibility

constraint is diffi cult. In particular, vi may not be monotone in types.30 To overcome this

additional complication, we make an additional assumption about the most effi cient agent.

Assumption 7 The most effi cient agent type, θn, exerts the highest effort possible, e = 1,

without any cost.31

Given this assumption, we can show the following:

Proposition 21 Under the above assumptions, for suffi ciently small κ > 0, if S (θ, w)−W

is log-supermodular, then there exists θ ∈ {θ1, ..., θn−1} such that the principal never retains

the types for θi < θ for sure, and offers all types θi ≥ θ the same contract: 0 = vn = vn−1 =

· · · = vi, pn = pn−1 = · · · = pi, and wn = wn−1 = · · · = wi.

Proof. In Section E.1.3.

The above result shows that, when the model is extended to multiple agent types, the

optimal contract no longer offers a positive fixed payment to lower agent types, unless that

type is never retained. As shown in the analysis with only two types, when S (θ, w)−W is

log-supermodular, the excess social value created by the contract increases relatively more

when the higher agent types are offered higher rewards. Solving the moral hazard problem

through higher rewards is more beneficial to the principal in this case. Yet, this would

require offering lower retention probabilities to higher types, in order to maintain incentive

compatibility. Under Assumption 7, however, the higher agent type is highly effi cient, which

implies that committing not retaining this type is highly costly for the principal. We can

show that this cost is suffi ciently high to deter the principal for attempting to separate the

types in the optimal contract.30More details are available upon request.31This is consistent with our assumption that V is log-submodular.

72

Page 73: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

E.1 Proofs

E.1.1 Proof of Lemma 12

We prove each property in the sequel:

1. vi > 0 implies ICi,j is binding for some j since otherwise the principal would reduce

vi.

2. We have vn = 0 by the same proof as Lemma 1.

3. If pi > 0 and vi > 0, then there exists j ≥ i+ 1 such that ICj,i is binding by the same

proof as Lemma 2.

4. Suppose that, for some i, there do not exist i′ ≤ i and j ≥ i + 1 such that ICi′,j is

binding. Then, reducing (pk, vk) to ((1− η) pk, (1− η) vk) for each k ≤ i keeps incentive

compatibility for all the types. Since this new contract allows the principal to replace

lower types with higher probabilities, for suffi ciently small κ > 0, the principal’s profit

increases by the same proof as Lemma 7.

5. Note that the value from a contract with p > 0 is strictly increasing in θ while the

value from a contract with p = 0 is the same for all the types. Hence whenever type i

prefers the contract with p = 0, so does type j ≤ i. Hence, if pi = 0, then pj = 0 for

each j ≤ i.

E.1.2 Proof of Proposition 20

We ignore incentive compatibility conditions except for types next to each other: For each

i, we consider ICi,i+1 and ICi+1,i only. Then ICi,i+1 is always binding if pi > 0:

Lemma 13 For suffi ciently small κ > 0, for each i with pi > 0, ICi,i+1 is binding in the

relaxed problem.

Proof. The same as Property 4 of lemma 12.

The next lemma shows that vi+1 = 0 implies vi = 0.

73

Page 74: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

Lemma 14 For each i with pi > 0, if vi+1 = 0, then we have wi+1 ≤ wi, pi+1 ≥ pi, and

vi = 0 in the relaxed problem.

Proof. Suppose otherwise: vi > 0. Then ICi+1,i must be binding: If ICi+1,i is not bind-

ing, then reducing vi and increasing wi to keep type i’s payoff constant is a profitable and

incentive-compatible deviation for the principal, as in Lemma 2.

However, if both ICi,i+1 and ICi+1,i are binding, then offering Ci to θi+1 and Ci+1 to θi are

both incentive compatible deviations of the principal. Since S (θ, w)−W is log-submodular,

the same proof as Proposition 3 ensures that wi+1 ≤ wi and pi+1 ≥ pi. Together with

log-submodularity of V and vi+1 = 0, we have vi = 0 by the same proof as Lemma 6.

We have vn = 0 by the same proof as Property 2 of Lemma 12. Hence, together with the

previous two lemmas, we have vi = 0 for each i with pi > 0. Hence log-submodularity of V

and “ICi,i+1 and ICi+1,i for each i”imply that 0 = vn = · · · = vi, pn ≥ pn−1 ≥ · · · ≥ pi and

wn ≤ wn−1 ≤ · · · ≤ wi.

Finally, log-submodularity of V implies all the other IC’s are satisfied. Hence the solution

for the relaxed problem is the true solution.

Lemma 15 Suppose V is log-submodular. Take any (Ci)ni=1 such that there exists ı ∈

{1, ..., n} where (i) pi > 0 if and only if i ≥ ı, (ii) vı−1 = · · · = v1 > 0 if ı ≥ 2, (iii)

0 = vn = · · · = vı, pn ≥ pn−1 ≥ · · · ≥ pı and wn ≤ wn−1 ≤ · · · ≤ wı, and (iv) ICi,i+1 binds

for i ≥ ı. If ICi,i+1 and ICi+1,i are satisfied for each i ∈ {1, ..., n}, then ICi,j is satisfied for

each i, j ∈ {1, ..., n}.

Proof. The constraint ICı,ı−1 implies type j prefers Ci to Cı−1 for each j ≥ ı (higher types

obtain higher value from contracts with positive retention probabilities) and type j prefers

Cı−1 to Cı for each j ≤ ı − 1 (lower types obtain lower value from contracts with positive

retention probabilities). Moreover, contracts Cı−1, ..., C1 are the same. Hence we focus on

proving ICj,i with pi, pj > 0.

74

Page 75: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

First, we prove that binding ICi,i+1 implies that type j ≥ i (weakly) prefers Ci+1 to Ci.

Suppose otherwise:

piV (θj, wi) < pi+1V (θj, wi+1) for some j > i.

Then, together with binding ICi,i+1, we have

V (θj, wi)

V (θj, wi+1)>pi+1

pi=

V (θi, wi)

V (θi, wi+1),

or

V (θj, wi)V (θi, wi+1) > V (θi, wi)V (θj, wi+1) .

Since V is log-submodular and wi ≥ wi+1, this is a contradiction.

Hence ICi+2,i+1 implies ICi+2,i. Similarly, ICi+3,i+2 implies ICi+3,i+1, which in turn

implies ICi+3,i. Recursively we have ICj,i for each j ≥ i+ 1.

The symmetric argument establishes that binding ICi,i+1 implies that type j ≤ i (weakly)

prefers Ci to Ci+1. Using this, we prove that ICi−2,i−1 implies ICi−2,i; ICi−3,i−2 implies

ICi−3,i−1, which in turn implies ICi−3,i; and so on.

E.1.3 Proof of Proposition 21

We relax the problem by ignoring ICi,j for each j > i + 1 (that is, the upward incentive

with more than one steps away) and ignoring ICi,j with i < n and j < i − 1 (that is, the

downward incentive with more than one steps away, except for the highest type θn). It is

straightforward to show that Lemma 12 holds with this relaxed problem. We can also show

that ICi,i+1 binds, as in Lemma 13:

Lemma 16 For suffi ciently small κ > 0, for each i, ICi,i+1 is binding.

Proof. Since we ignore ICj,i+1 for j < i, if ICi,i+1 is not binding, then reducing (pk, vk) to

((1− η) pk, (1− η) vk) for each k ≤ i keeps all the incentive compatibility constraints and

increases the principal’s profit.

75

Page 76: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

Using this lemma, we prove that contracts Ci and Ci−1 are the same, as long as pi, pi−1 > 0

recursively:

Contracts Cn and Cn−1 We know vn = 0. For pn−1, pn, wn−1, wn, vn−1, there are following

three cases:

Case 1. Replacement of n− 1: pn−1 = 0 In this case, Ci with i ≤ n− 1 satisfies pi = 0

by Property 5 of Lemma 12.

Case 2. No Fixed Payment for n− 1: vn−1 = 0 In this case, as in Lemma 6, ICn−1,n

and ICn,n−1 together with log-submodularity of V implies that wn ≤ wn−1 and pn ≥ pn−1.

We now prove that actually we have wn = wn−1 and pn = pn−1. Suppose otherwise.

By the same proof as Proposition 3, log-supermodularity of S (θ, w)−W implies that we

have either

pn [S (θn−1, wn)−W ]− pn−1 [S (θn−1, wn−1)−W ] > 0 (35)

or

pn−1 [S (θn, wn−1)−W ]− pn [S (θn, wn)−W ] > 0. (36)

Since Lemma 16 implies ICn−1,n is binding, it is possible to offer Cn to θn−1; and so

0 ≥ [pnπ (θn−1, wn)− vn + (1− pn)W ]− [pn−1π (θn−1, wn−1)− vn−1 + (1− pn−1)W ]

≥ pn [S (θn−1, wn)−W ]− pn−1 [S (θn−1, wn−1)−W ] by (ICn−1,n).

Hence we cannot have (35) (otherwise the principal would offer Cn to θn−1). Hence, we have

to have (36).

However, since the highest type works with e = 1 without incurring any cost, we have

S (θn, wn−1)−W = S (θn, wn)−W = q (1)−W > 0.

Now pn > pn−1 contradicts to (36).

76

Page 77: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

Case 3. Fixed Payment for n − 1: vn−1 > 0 In this case, we will show that, with

log-submodular V , we have wn ≥ wn−1 and pn ≤ pn−1. Moreover, we have ICn,n−1 binding.

Lemma 17 Suppose pn−1, vn−1 > 0. If V (θ, w) is log-submodular, then ICn,n−1 is binding

and we have wn ≥ wn−1 and pn ≤ pn−1.

Proof. If vn−1 > 0, then Property 3 of Lemma 12 implies ICn,n−1 is binding:

pn−1V (θn, wn−1) + vn−1 = pnV (θn, wn) . (37)

On the other hand, Lemma 16 implies

pn−1V (θn−1, wn−1) + vn−1 = pnV (θn−1, wn) . (38)

From (37) and (38), we have

pn−1 [V (θn, wn−1)− V (θn−1, wn−1)] = pn [V (θn, wn)− V (θn−1, wn)] ,

and so

vn−1 = pnV (θn, wn)− pn−1V (θn, wn−1)

= pn−1

(V (θn, wn−1)− V (θn−1, wn−1)

V (θn, wn)− V (θn−1, wn)V (θn, wn)− V (θn, wn−1)

)= pn−1

(V (θn−1, wn)V (θn, wn−1)− V (θn−1, wn−1)V (θn, wn)

V (θn, wn)− V (θn−1, wn)

).

Since vn−1 ≥ 0, we have that, if V (θ, w) is log-submodular, then we have wn ≥ wn−1 and

pn ≤ pn−1.

77

Page 78: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

Since ICn,n−1 is binding, it is possible to offer Cn−1 to θn. Hence such a deviation should

not be profitable:

0 ≥ [pn−1π (θn, wn−1)− vn−1 + (1− pn−1)W ]− [pnπ (θn−1, wn) + (1− pn)W ]

≥ pn−1 [S (θn, wn−1)−W ]− pn [S (θn, wn)−W ] by (ICn,n−1)

= pn−1 [q (1)−W ]− pn [q (1)−W ] .

Hence we have pn = pn−1, and so vn−1 = 0 and wn = wn−1.

In total, Cn and Cn−1 are the same, as long as pn−1 > 0, as desired.

Contracts Cn−1 and Cn−2 Consider the following two cases:

Case 1. Replacement of n− 2: pn−2 = 0 In this case, Ci with i ≤ n− 2 satisfies pi = 0.

Case 2. Positive Retention of n−2: pn−2 > 0 and vn−2 ≥ 0 Since Cn = Cn−1, ICn−2,n−1

(this constraint is not ignored) is equivalent to ICn−2,n (this constraint is ignored). In

addition, Lemma 16 implies ICn−2,n−1 is binding, and so ICn−2,n is binding given Cn = Cn−1.

Suppose vn−2 = 0. ICn−2,n and ICn,n−2 together with log-submodularity of V implies

that wn ≤ wn−2 and pn ≥ pn−2. Hence, by the same proof as in Case 2 of contracts Cn and

Cn−1, we have wn = wn−2 and pn = pn−2.

Suppose vn−2 > 0. Since we ignore ICn−1,n−2, this implies that ICn,n−2 is binding.

Since both ICn−2,n and ICn,n−2 are binding, we have wn ≥ wn−2 and pn ≤ pn−2 from log-

submodularity of V , by the same proof as Lemma 17. Moreover, since ICn,n−2 is binding,

offering Cn−2 to θn is possible. By the same proof as in Case 3 of contracts Cn and Cn−1, we

have wn = wn−2 and pn = pn−2.

In total, Cn−1 and Cn−2 are the same, as long as pn−2 > 0. Recursively, we can prove that

all the types with positive pi > 0 are offered the same contract. Hence all the constraints of

the original problem is satisfied.

78

Page 79: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

F The Model with Log-Supermodular V

We now provide the characterization with log-supermodular V . We focus on the base model:

W is exogenous and the agent does not have outside options, since the extension from the

base model is analogous to the case with log-submodular V .

Note that Lemmas 1 and 2 hold with log-submodular V since their proofs do not use the

log-submodularity of V .

First, we show that, with log-supermodular V , constraint (ICL) always binds:

Proposition 22 If V is log-supermodular, then (ICL) holds with equality.

Proof. Suppose (ICL) is not binding. Then, the principal would like to increase pH as

much as possible: pH = 1; and no fixed payment is offered to type L: vL = 0. In

addition, since one of the constraints has to bind, constraint (ICH) holds with equality:

pL = V (H,wH) /V (H,wL). This implies wH ≤ wL. However, the log-supermodularity of V ,

pL = V (H,wH) /V (H,wL), and wH ≤ wL implies pL ≤ V (L,wH) /V (L,wL). Hence (ICL)

has to bind.

Hence we focus on (ICL) binding. The case in which type L is not retained is the same

as the case with log-submodular V :

Proposition 23 (Retention of type H only) If W ≥ S (L, 1), then only the type H agent is

retained in the optimal contract: vL > 0, pL = 0, pH = 1. Conversely, if W < S (L, 1), then

we have pL > 0.

Proof. The same as Proposition 2.

Finally, the case in which both types are retained is characterized as follows:

Proposition 24 Consider the case in which the condition of Proposition 23 is not satisfied.

The optimal contract has the following properties:

1. (Principal-Agent Alignment) If S(θ, w)−W is also log-supermodular, then

vL = 0, wL ≤ wH , 1 = pL ≥ pH > 0.

79

Page 80: Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit … · 2017-10-04 · Avoiding the Cost of a Bad Hire: On the Optimality of Pay to Quit Programs Dana Foarta Stanford

2. (Principal-Agent Misalignment) If S(θ, w) − W is log-submodular, then (ICH)

holds with equality and

vL ≥ 0, wL ≥ wH , 1 = pH ≥ pL > 0.

Proof. Symmetric to Proposition 3.

We can derive the full characterization analogous to Propositions 4 and 5. The details

are omitted since it is completely symmetric to the case with log-submodular V .

80