CENSORSHIP AS OPTIMAL PERSUASION

32
CENSORSHIP AS OPTIMAL PERSUASION Anton Kolotilin, Tymofiy Mylovanov, Andriy Zapechelnyuk Abstract. A sender designs a signal about the state of the world to persuade a receiver. Under standard assumptions, an optimal signal censors states on one side of a cutoand reveals all other states. This result holds in continuous and discrete environments with general and monotone partitional signals. The sender optimally censors more information if she is more biased, if she is more certain about the receiver’s preferences, and if the receiver is easier to persuade. We apply our results to the problem of media censorship by a government. JEL Classification: D82, D83, L82 Keywords: Bayesian persuasion, information design, censorship, media Date: December 11, 2019. Kolotilin : School of Economics, UNSW Australia, Sydney, NSW 2052, Australia. E-mail: [email protected]. Mylovanov : Cabinet of Ministers of Ukraine, Government of Ukraine; University of Pittsburgh, De- partment of Economics, 4714 Posvar Hall, 230 South Bouquet Street, Pittsburgh, PA 15260, USA. E-mail: [email protected]. Zapechelnyuk : School of Economics and Finance, University of St Andrews, Castleclie, the Scores, St Andrews KY16 9AR, UK. E-mail: [email protected]. We are grateful for discussions with Ming Li with whom we worked on the companion paper Kolotilin et al. (2017). An early version of the results in this paper and the results in the companion pa- per were presented in our joint working paper Kolotilin et al. (2015). We thank Ricardo Alonso, Dirk Bergemann, Patrick Bolton, Alessandro Bonatti, Steven Callander, Odilon Cˆ amara, Rahul Deb, P´ eter Es¨ o, Johannes H¨ orner, Florian Homan, Roman Inderst, Emir Kamenica, Navin Kar- tik, Daniel Kr¨ aehmer, Hongyi Li, Marco Ottaviani, Mallesh Pai, Andrea Prat, Ilya Segal, Larry Samuelson, Joel Sobel, Konstantin Sonin, as well as many conference and seminar participants, for helpful comments. Kolotilin gratefully acknowledges financial support from the Australian Research Council Discovery Early Career Research Award DE160100964 and from MIT Sloan’s Program on Innovation in Markets and Organizations. Zapechelnyuk gratefully acknowledges financial support from the Economic and Social Research Council Grant ES/N01829X/1. The views expressed in this paper are solely the responsibility of the authors and should not be interpreted as reflecting the views of the Government of Ukraine. 1

Transcript of CENSORSHIP AS OPTIMAL PERSUASION

CENSORSHIP AS OPTIMAL PERSUASION

Anton Kolotilin Tymofiy Mylovanov Andriy Zapechelnyuk

Abstract A sender designs a signal about the state of the world to persuade areceiver Under standard assumptions an optimal signal censors states on one sideof a cutoff and reveals all other states This result holds in continuous and discreteenvironments with general and monotone partitional signals The sender optimallycensors more information if she is more biased if she is more certain about thereceiverrsquos preferences and if the receiver is easier to persuade We apply our resultsto the problem of media censorship by a government

JEL Classification D82 D83 L82

Keywords Bayesian persuasion information design censorship media

Date December 11 2019

Kolotilin School of Economics UNSW Australia Sydney NSW 2052 Australia E-mailakolotilingmailcomMylovanov Cabinet of Ministers of Ukraine Government of Ukraine University of Pittsburgh De-partment of Economics 4714 Posvar Hall 230 South Bouquet Street Pittsburgh PA 15260 USAE-mail mylovanovgmailcomZapechelnyuk School of Economics and Finance University of St Andrews Castlecliffe the ScoresSt Andrews KY16 9AR UK E-mail az48st-andrewsacuk

We are grateful for discussions with Ming Li with whom we worked on the companion paper Kolotilinet al (2017) An early version of the results in this paper and the results in the companion pa-per were presented in our joint working paper Kolotilin et al (2015) We thank Ricardo AlonsoDirk Bergemann Patrick Bolton Alessandro Bonatti Steven Callander Odilon Camara RahulDeb Peter Eso Johannes Horner Florian Hoffman Roman Inderst Emir Kamenica Navin Kar-tik Daniel Kraehmer Hongyi Li Marco Ottaviani Mallesh Pai Andrea Prat Ilya Segal LarrySamuelson Joel Sobel Konstantin Sonin as well as many conference and seminar participants forhelpful comments Kolotilin gratefully acknowledges financial support from the Australian ResearchCouncil Discovery Early Career Research Award DE160100964 and from MIT Sloanrsquos Program onInnovation in Markets and Organizations Zapechelnyuk gratefully acknowledges financial supportfrom the Economic and Social Research Council Grant ESN01829X1

The views expressed in this paper are solely the responsibility of the authors and should not beinterpreted as reflecting the views of the Government of Ukraine

1

2 KOLOTILIN MYLOVANOV ZAPECHELNYUK

1 Introduction

There has been a rapid growth of the literature on Bayesian persuasion and infor-mation design over the past decade The range of applications in this literature isimpressive and includes clinical trials bank stress tests school grading policies qual-ity certification advertising strategies transparency in organizations persuasion ofvoters and media control In many of these applications it is optimal to disclose in-formation about an uncertain state of the world via a censorship signal that censorsstates on one side of a cutoff and reveals states on the other side of the cutoff Fullyinformative and completely uninformative signals are special cases of censorship

We characterize necessary and sufficient conditions under which a censorship signalis optimal in the linear persuasion problem where the sender and receiverrsquos utilitiesdepend on the beliefs about the state only through the expected state We thusunify and generalize sufficient conditions scattered across the literature In additionwe establish new results for environments where the state is discrete and a signal isconstrained to be monotone partitional These environments are relevant in applica-tions and take into account constraints faced by information designers in practice1

Finally we provide monotone comparative statics results on the informativeness ofthe optimal signal

The linear persuasion problem is a tractable workhorse model in the persuasion lit-erature This problem can be described by the prior distribution of the state and thesenderrsquos indirect utility function of the expected state We consider a slightly differentbut equivalent formulation of the problem which is easier to interpret

In our model the receiver chooses one of two actions to accept or reject a proposalIf the receiver rejects the proposal the sender and receiverrsquos utilities are normalizedto zero If the receiver accepts the proposal the sender and receiverrsquos utilities dependon the unknown state of the world and the receiverrsquos private type The state and typeare independent random variables that represent respectively the receiverrsquos benefitand cost from acceptance We assume that the type is a continuous random variablebut we allow the state to be either a continuous or discrete random variable2

The senderrsquos utility is linear in the state and is not perfectly aligned with the receiverrsquosutility An important canonical case of the model is where the senderrsquos utility is aweighted sum of the receiverrsquos utility and action

To influence the receiverrsquos action the sender designs a signal that reveals informationabout the state After observing the signal realization the receiver updates his beliefsabout the state and chooses an action that maximizes his expected utility

1For example a credit rating of financial institutions is typically a monotone partition Neithernonmonotone partitions that pool high- and low-performing institutions nor stochastic disclosurerules that map an institutionrsquos financial performance to a random rating appear to be feasible

2Extending the analysis to the case where the state and type are general random variables presentstechnical difficulties but does not yield new economic insights

CENSORSHIP AS OPTIMAL PERSUASION 3

The main result shows that if the senderrsquos indirect utility is S-shaped in the expectedstate (that is convex below some threshold and concave above the threshold) thenand only then an upper-censorship signal that pools states above a cutoff and revealsstates below the cutoff is optimal for all prior distributions of the state3 In thecanonical case the main result shows that if the probability density of receiver types islog-concave then and only then upper censorship is optimal for all prior distributionsof the state and for all weights that the sender puts on the receiverrsquos utility and action4

The main result is obtained under two scenarios In the first scenario the signal isa random variable with arbitrary correlation with the state In the second scenariothe signal is constrained to be a monotone partition of the state space so that eachstate is either revealed or pooled with some adjacent states

To compare the main result under these two scenarios consider the cutoff of an opti-mal upper-censorship signal Notice that upper censorship does not specify whetherthe cutoff state is pooled or revealed If the state is a continuous random variablethen it does not matter whether this cutoff state is pooled or revealed Thus an op-timal upper-censorship signal is in fact a monotone partition However if the stateis a discrete random variable then the optimal signal is stochastic upper censorshipso that the cutoff state is generally pooled and revealed with interior probabilitiesIn contrast the optimal monotone partition is deterministic upper censorship so thatthe cutoff state is either completely pooled or fully revealed

In many applications of interest the state is discrete and thus optimal signals andoptimal monotone partitions generally differ An optimal upper-censorship signalprovides an upper bound on the value of persuasion To achieve this bound thesender should be able to commit to randomize over two signal realizations conditionalon the cutoff state which may be hard to enforce in practice In contrast monotonepartitions are relatively simple signals that can be enforced and verified ex post

The problem of finding an optimal monotone partition is a discrete optimizationproblem which cannot be solved using existing tools from the persuasion literatureNevertheless we show that if the senderrsquos indirect utility is S-shaped (or in thecanonical case the density of receiver types is log-concave) then upper censorship isan optimal monotone partition To prove this result for any monotone partition dif-ferent from upper censorship we explicitly construct a deterministic upper-censorshipsignal that is preferred by the sender

When upper censorship is optimal it is possible to perform the comparative staticsanalysis on the informativeness of the optimal signal We show that in the canonicalcase the sender optimally censors more information (i) if the sender is more biasedin that the sender puts a smaller weight on the receiverrsquos utility (ii) if the receiver is

3Analogously a lower-censorship signal that pools states below a cutoff is optimal for all priordistributions if and only if the senderrsquos indirect utility has an inverted S-shape

4Most commonly used probability densities are log-concave Log-concave densities exhibit niceproperties such as single-peakedness and the monotonicity of the hazard rate

4 KOLOTILIN MYLOVANOV ZAPECHELNYUK

easier to persuade in that the receiverrsquos cost of accepting the proposal is smaller and(iii) if the sender is more certain about the receiverrsquos preferences in that the densityof receiver types is more concentrated around its mode

We apply our results to the problem of media censorship by the government Weconsider a stylized setting with a finite number of media outlets and a continuumof heterogeneous citizens (receivers) Each media outlet is identified by its editorialpolicy that specifies an interval of endorsement states We permit the aggregate ac-tion of the citizens to affect their utility but not their optimal actions For examplean election outcome impacts all citizens but does not change their preferences overcandidates The government wishes to influence citizensrsquo actions by deciding whichmedia outlets to censor In the canonical case if the probability density of citizenstypes is log-concave then the optimal media censorship policy is to permit all suf-ficiently loyal media outlets and to censor the remaining outlets Our comparativestatics results suggest that the government should optimally increase censorship (i)if influencing society decisions becomes relatively more important than maximizingindividual welfare (ii) if the society experiences an ideology shock in favor of thegovernment and (iii) if the society becomes less diverse in ideology and taste

Related Literature The literature on Bayesian persuasion was set in motion bythe seminal papers of Rayo and Segal (2010) and Kamenica and Gentzkow (2011)The linear case which is the subject of this paper is prevalent in the literature5

Kolotilin (2018) provides necessary and sufficient prior dependent conditions for theoptimality of censorship in the case of a continuous state6 Alonso and Camara(2016b) provide sufficient conditions in the case of a discrete state In contrastour paper provides necessary and sufficient conditions that are prior independent (aswell as bias independent in the canonical case) and apply in both continuous anddiscrete cases More importantly we show that these conditions are also necessaryand sufficient for deterministic censorship to be optimal in the class of monotonepartitional signals (simple signals) There are no counterparts to this result in theliterature

Similar to us Alonso and Camara (2016b) show that the optimal signal is less infor-mative if the receiver is easier to persuade Their proof is simpler but it is valid onlywhen the senderrsquos utility is state-independent But our comparative statics resultswith respect to the senderrsquos bias and uncertainty are novel

Censorship policies commonly emerge as optimal signals in linear persuasion models ina variety of contexts starting from the prosecutor-judge example as well as lobbyingand product advertising examples in Kamenica and Gentzkow (2011) Other contextswhere censorship is optimal include grading policies (Ostrovsky and Schwarz 2010)

5Notable exceptions include Rayo and Segal (2010) Goldstein and Leitner (2018) Guo andShmaya (2019) and Kolotilin and Wolitzky (2019)

6See also Gentzkow and Kamenica (2016) Kolotilin Mylovanov Zapechelnyuk and Li (2017)and Dworczak and Martini (2019) for sufficient conditions

CENSORSHIP AS OPTIMAL PERSUASION 5

media control (Gehlbach and Sonin 2014 Ginzburg 2019) clinical trials (Kolotilin2015) voter persuasion (Alonso and Camara 2016ab) transparency benchmarks(Duffie Dworczak and Zhu 2017) stress tests (Goldstein and Leitner 2018 OrlovZryumov and Skrzypach 2018) trading mechanisms (Romanyuk and Smolin 2019Dworczak 2017 Yamashita 2018) quality certification (Zapechelnyuk 2019) andrelational communication (Kolotilin and Li 2019)

Our application to media censorship fits into the literature on media capture thataddresses the problem of media control by governments political parties or lobbyinggroups Besley and Prat (2006) pioneered this literature by studying how a gov-ernmentrsquos incentives to censor free media depend on plurality (the number of mediaoutlets) and on transaction costs of bribing the media In their model media outletsare identical and all voters have common interests Thus the governmentrsquos optimaldecision reduces to either capturing all media outlets or none Relatedly Gehlbachand Sonin (2014) consider a setting with a government-influenced monopoly mediaoutlet that exploits the trade-off between the governmentrsquos objective to ldquomobilizerdquo thepopulation for some collective goal and to collect the revenue from subscribers whodemand informative news Gehlbach and Sonin (2014) show that the presence of theldquomobilizationrdquo objective increases the news bias whereas the subscription revenuesreduce the bias but can cause the government to nationalize the media outlet7

The models of Besley and Prat (2006) and Gehlbach and Sonin (2014) have two statesof the world and either a single media outlet or a few identical media outlets Hencethe government uses the same censorship policy for each outlet We are the first toconsider a richer model of media censorship with a continuum of states and multipleheterogeneous media outlets As a result the government optimally discriminatesmedia outlets by permitting sufficiently loyal ones and banning the remaining ones

2 Model

21 Setup There are two players a sender (she) and a receiver (he) The receiverchooses whether to accept a proposal (a = 1) or reject it (a = 0) The proposalhas an uncertain value ω isin [0 1] By accepting the proposal the receiver forgoesan outside option worth r isin [0 1] so the receiverrsquos utility is a(ω minus r) The senderrsquosutility is av(ω r) where v(ω r) is linear in ω and continuously differentiable in rWe will refer to ω as state and to r as type and denote their distributions by F andG respectively Throughout the paper we assume that distribution G of r has acontinuously differentiable and strictly positive density g8

As in many applications the state is either a continuous or discrete random variablewe will separately analyze these two cases We say that the state ω and distribution F

7In different contexts media control by a government has also been studied by Egorov Gurievand Sonin (2009) Edmond (2013) and Lorentzen (2014) See also the overview of the literature onmedia capture slant and transparency in Prat and Stromberg (2013)

8This assumption is made for clarity of exposition The results can be extended to general G

6 KOLOTILIN MYLOVANOV ZAPECHELNYUK

are continuous if F has a density f on [0 1] We say that the state ω and distributionF are discrete if the support of F is a finite set

A particularly tractable and intuitive special case is where the senderrsquos utility is aweighted sum of the receiverrsquos utility and action We call this the canonical caseFormally in the canonical case the senderrsquos utility when the proposal is accepted is

v (ω r) = 1 + ρ(ω minus r) ρ isin R (A1)

That is the sender is biased towards a = 1 but also puts a weight ρ on the receiverrsquosutility In particular if the weight ρ is large then the sender and receiverrsquos inter-ests are aligned whereas if the weight is zero then the sender cares only about thereceiverrsquos action

The receiver privately knows his type but he does not observe the state The sendercan influence the action taken by the receiver a = 1 or a = 0 by releasing a signalthat reveals information about the state A signal is a random variable s isin [0 1]that is independent of r but is possibly correlated with ω For example s is fullyinformative if it is perfectly correlated with ω and s is completely uninformative ifit is independent of ω

The timing is as follows First the sender publicly chooses a signal s Then realiza-tions of ω r and s are drawn Finally the receiver observes the realizations of histype r and the signal s and then chooses between a = 0 and a = 1

We consider two scenarios

A The sender can choose any signal

B The sender can choose any monotone partitional signal

Formally a signal s is a monotone partitional signal if there exists a nondecreasingfunction ξ(ω) such that s = ξ(ω) for each ω Every such signal induces a partitionof the state space [0 1] into intervals and singletons and the receiver observes thepartition element that contains the state

Under Scenario A we are interested in an optimal signal that maximizes the senderrsquosexpected utility among all signals This is a standard persuasion problem

Under Scenario B we are interested in an optimal monotone partition that maximizesthe senderrsquos expected utility among all monotone partitional signals This scenarioincorporates constraints that information designers often face in practice For ex-ample a non-monotone grading policy that gives better grades to worse performingstudents will be perceived as unfair and will be manipulated by strategic students

22 Upper Censorship A subset of signals called upper censorship will play aspecial role in this paper

CENSORSHIP AS OPTIMAL PERSUASION 7

An upper-censorship signal reveals states below a specified cutoff and pools statesabove this cutoff When the state is equal to the cutoff it can be revealed or pooledIf the state is continuous it does not matter whether the cutoff state is revealed orpooled because this is a zero probability event However if the state is discretewe will distinguish between deterministic and stochastic upper-censorship signalsdepending on what happens at the cutoff state

A lower-censorship signal is defined symmetrically it pools states below a specifiedcutoff and reveals states above this cutoff Because the case of lower censorshipis symmetric (see Remark 1 below) for clarity of exposition we focus on uppercensorship throughout the paper

A signal s is stochastic upper censorship if there exists a cutoff pair (ωlowast qlowast) consistingof a cutoff state ωlowast isin [0 1] and probability qlowast isin [0 1] such that states below ωlowast arerevealed states above ωlowast are pooled and the state ω = ωlowast is revealed with probabilityqlowast and pooled with probability 1minus qlowast For example s can be expressed as

s =

983099983105983103

983105983101

ω with probability one if ω lt ωlowast

ωlowast and mlowast with probabilities qlowast and 1minus qlowast if ω = ωlowast

mlowast with probability one if ω gt ωlowast

where

mlowast =

983125(ωlowast1]

ωdF (ω) + ωlowast(1minus qlowast) Pr[ω = ωlowast]983125(ωlowast1]

dF (ω) + (1minus qlowast) Pr[ω = ωlowast](1)

is the posterior expected state induced by the pooling signal Notice that mlowast is afunction of ωlowast and qlowast

A stochastic upper-censorship signal s with a cutoff pair (ωlowast qlowast) is deterministic uppercensorship if and only if qlowast isin 0 1 in which case s can be expressed as a monotonepartition of the state space [0 1] For example if qlowast = 0 then the signal s fullyreveals states ω lt ωlowast and pools states ω ge ωlowast so s can be expressed as

s =

983083ω if ω lt ωlowast

mlowast if ω ge ωlowast

wheremlowast = E[ω|ω ge ωlowast]

is the expected state conditional on being at least ωlowast Note that the fully informativesignal and the completely uninformative signal are deterministic upper-censorshipsignals with (ωlowast qlowast) = (1 1) and (ωlowast qlowast) = (0 0) respectively

23 Benchmark To illustrate the difference between the cases of a continuous anddiscrete state as well as between Scenarios A and B we solve a benchmark examplewhere there is no uncertainty about the receiverrsquos type

Let the state ω have the expected value E[ω] = 12 and let the receiverrsquos type r be

known to the sender and satisfy r isin98304312 1983044 In addition suppose that the sender wishes

8 KOLOTILIN MYLOVANOV ZAPECHELNYUK

to maximize the probability that the receiver accepts the proposal so v(ω r) = 1 forall ω and all r

Observe that if the sender reveals no information about ω the receiver will evaluate ωby its expected value of 1

2 thus having utility 1

2minus r lt 0 from accepting the proposal

So revealing no information makes the receiver reject the proposal with certaintyThe sender can do better by fully revealing ω in which case the proposal is acceptedwith probability Pr[ω ge r]

However the sender can do even better by pooling states above r with states below rwhile keeping the expected state induced by the pooling signal at least r Thus thereceiver who is unable to distinguish between the states in the pool will accept theproposal at all of these states

To illustrate the case of the continuous state suppose that ω is uniformly distributedon [0 1] The largest pool that maintains the posterior expectation at least r is theinterval [2r minus 1 1] Indeed once the receiver learns that ω isin [2r minus 1 1] given theuniform prior of ω the posterior expected state is r So the proposal is accepted whenω ge 2rminus 1 and rejected when ω lt 2rminus 19 Thus the deterministic upper-censorshipsignal with cutoff ωlowast = 2r minus 1 is optimal The senderrsquos expected utility is equal to

Pr[proposal is accepted] = Pr[ω ge 2r minus 1] = 2(1minus r)

which is twice as high as that from the fully informative signal

Pr[proposal is accepted] = Pr[ω ge r] = 1minus r

When ω is a discrete random variable an optimal way to reveal information aboutω takes the form of stochastic upper censorship For illustration suppose that ω canonly be 0 or 1 equally likely Now simply pooling states is not helpful for the senderPooling 0 and 1 yields the posterior expected state of 1

2 which is smaller than the

type r resulting in a rejection of the proposal Including any other states in thepool makes no difference as these states never occur Thus an optimal monotonepartition is the fully informative signal and the probability that the receiver acceptsthe proposal is simply equal to the probability that ω = 1

Pr[proposal is accepted] = Pr[ω ge r] = Pr[ω = 1] =1

2

The sender can do strictly better by partial (stochastic) pooling If ω = 1 let thesignal realization be ldquoacceptrdquo with certainty If ω = 0 let the signal realization beldquorejectrdquo with some probability q and ldquoacceptrdquo with probability 1 minus q So when thereceiver observes ldquoacceptrdquo the posterior expected state is

E[ω|ldquoacceptrdquo] = Pr[ω = 1]

Pr[ω = 1] + Pr[ω = 0] middot (1minus q)=

12

12 + 12 middot (1minus q)=

1

2minus q

9Whether states strictly below ωlowast are revealed or pooled among themselves does not matterbecause they are all strictly below r and thus induce the receiver to take action a = 0

CENSORSHIP AS OPTIMAL PERSUASION 9

The sender now wishes to find the lowest q subject to 1(2 minus q) ge r which yieldsqlowast = 2 minus 1

r Thus stochastic upper censorship with ωlowast = 0 and qlowast = 2 minus 1

ris

optimal The probability that the receiver accepts the proposal is exactly equal tothe probability that the signal realization is ldquoacceptrdquo

Pr[proposal is accepted] = Pr[ω = 1] + Pr[ω = 0] middot (1minus qlowast) =1

2+

1

2middot 1minus r

r=

1

2r

We now summarize the insights obtained in this example If the state is continuousthen an optimal signal is deterministic upper censorship If the state is discrete thenan optimal monotone partition is deterministic upper censorship and an optimalsignal is stochastic upper censorship10 In the remainder of the paper we will showthat these insights extend to the case where the receiverrsquos type is uncertain

3 Continuous State

In this section we provide necessary and sufficient conditions for the optimality ofupper censorship when the state is continuous

Let m be the expected state conditional on observing a realization of a signal s Sincethe sender and receiverrsquos utilities are linear in ω they depend on the information aboutω revealed by signal s only through the expected state m In particular the receiverchooses a = 1 if and only if r le m

Let V (m) denote the indirect utility of the sender conditional on m

V (m) =

983133

rlem

E [v(ω r)|m] g(r)dr =

983133 m

0

v(m r)g(r)dr m isin [0 1] (2)

where E [v(ω r)|m] = v(m r) by the linearity of v in ω

A function V is said to be S-shaped if it is convex below some threshold and concaveabove that threshold or equivalently if V primeprime is single-crossing from above

there exists τ such that V primeprime(m) ge (le) 0 for all m lt (gt) τ

We now provide the criterion for the optimality of upper censorship Recall thatmlowast = E[ω|ω ge ωlowast]

Theorem 1 Let V be S-shaped Then and only then for all continuous F anoptimal signal is deterministic upper censorship whose cutoff state ωlowast isin [0 τ ] satisfies

V (mlowast) + V prime(mlowast)(mminusmlowast) ge (le)V (m) for all m ge (lt)ωlowast (3)

The criterion for the optimality of lower censorship is analogous

10See for example Kolotilin (2015) for a detailed treatment of the case where there is no uncer-tainty about the receiverrsquos type

10 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Remark 1 Let minusV be S-shaped Then and only then for all continuous F anoptimal signal is deterministic lower censorship whose cutoff state ωlowast isin [τ 1] satisfies

V (mlowast) + V prime(mlowast)(mminusmlowast) ge (le)V (m) for all m le (gt)ωlowast

Theorem 1 states that optimal persuasion takes a simple form of upper censorshipwhen V is S-shaped11 Moreover the theorem establishes that this result is tight inthe sense that if V is not S-shaped then there exists a distribution F of the statesuch that no upper-censorship signal is optimal

Note that every upper-censorship signal is a monotone partition Therefore Theorem1 applies in both Scenario A where the sender is free to choose any signal andScenario B where the sender is constrained to choose a monotone partition

The intuition behind Theorem 1 is as follows Observe that when no informationabout the state is revealed the receiverrsquos best-response action does not change withthe state The more information is revealed the more variable the receiverrsquos behaviorwill be in response to this information Consider an interval where V is concave Thesender would prefer not to reveal any information when the state is in this intervalsince a certain outcome is preferred to any lottery with the same expected stateConversely consider an interval where V is convex The sender would prefer to fullyreveal the state in this interval since now lotteries are preferred If V is S-shapedthat is it is convex below some threshold and concave above that threshold theinduced optimal persuasion takes the form of upper censorship

When V is S-shaped the senderrsquos optimization problem is reduced to finding anoptimal censorship cutoff ωlowast If the realized state ω is below the cutoff then it isrevealed to the receiver so the expected utility of the sender is V (ω) If the realizedstate ω is above the cutoff then the posterior expected state is mlowast = E[ω|ω ge ωlowast] sothe expected utility of the sender conditional on ω ge ωlowast is V (mlowast) The sender thusneeds to solve the problem

maxωlowastisin[01]

983133 ωlowast

0

V (ω)f(ω)dω +

983133 1

ωlowastV (mlowast)f(ω)dω (4)

The expression in (3) represents the first-order condition to this problem and capturesthree possible cases the boundary solutions ωlowast = 0 and ωlowast = 1 if the expression in(3) has the same sign for all ω and an interior solution ωlowast such that

V (mlowast) + V prime(mlowast)(ωlowast minusmlowast) = V (ωlowast) (5)

This first-order condition (5) is illustrated by Figure 1 The solid line is V (m) andthe dashed line is V (mlowast) + V prime(mlowast)(mminusmlowast) which is tangent to V at mlowast

11For a fixed distribution F Proposition 3 in Kolotilin (2018) implies that upper censorship withcutoff ωlowast is optimal if and only if (3) holds and V (m) is convex on [0ωlowast] This result can be used toshow that upper censorship is optimal when V is S-shaped For completeness we include a simpleself-contained proof inspired by the proof of Theorem 1 in Dworczak and Martini (2019)

CENSORSHIP AS OPTIMAL PERSUASION 11

1ωlowast

V

revealing pooling

mlowast0

Figure 1 Optimal upper censorship with cutoff ωlowast

The solution of the senderrsquos problem (4) is particularly simple if V is either globallyconvex or globally concave In this case the first-order condition in (3) has a constantsign so either ωlowast = 0 or ωlowast = 1 must be optimal If V is convex then ωlowast = 1 isoptimal which corresponds to the fully informative signal Similarly if V is concavethen ωlowast = 0 is optimal which corresponds to the completely uninformative signalThis is summarized in the following corollary

Corollary 1 An optimal signal is

(i) fully informative for all F if and only if V is convex

(ii) completely uninformative for all F if and only if V is concave

31 Canonical Case We now consider the canonical case where the senderrsquos utilitysatisfies assumption (A1)

The density g of receiver types r is said to be log-concave if ln g(r) is concave in r12

Note that ln g(r) is well defined in the canonical case

In this case the shape of the senderrsquos indirect utility V defined by (9) is connectedto the shape of the density g of receiver types as follows

Lemma 1 In the canonical case V is S-shaped for all ρ if and only if g is log-concave

12Table 1 in Bagnoli and Bergstrom (2005) reports distributions with log-concave densities

12 KOLOTILIN MYLOVANOV ZAPECHELNYUK

That is if the density of receiver types g is log-concave then the senderrsquos indirectutility V is S-shaped Moreover this result is tight in the sense that if g is notlog-concave then there exists ρ such that V is not S-shaped

By Theorem 1 and Lemma 1 we obtain a criterion for the optimality of upper cen-sorship with the condition on the primitive of the model the density g

Theorem 2 Consider the canonical case and let the density g be log-concave Thenand only then an optimal signal is deterministic upper censorship for all F and all ρ

The symmetric statement is also true an optimal signal is lower censorship for all Fand all ρ if and only if minusg is log-concave Consequently if both g and minusg are log-concave (that is g is exponential) then there exists an optimal signal which is bothupper censorship and lower censorship There are only two signals with this propertyfully informative and completely uninformative This allows us to obtain conditionson the distribution of receiver types under which the optimal signal is polarizedbetween fully informative and completely uninformative signals as for example inLewis and Sappington (1994) and Johnson and Myatt (2006)13

Corollary 2 In the canonical case an optimal signal is either fully informative orcompletely uninformative for all F and all ρ if and only if there exist λ isin R and c gt 0such that g(r) = ceminusλr for r isin [0 1]

If g(r) = ceminusλr then the fully informative signal is optimal whenever ρ ge λ and thecompletely uninformative signal is optimal whenever ρ le λ (and any signal is optimalwhen ρ = λ) In particular if ρ = 0 the optimal signal is fully determined by thesign of λ which is in turn determined by whether the mean of r is smaller or greaterthan 1

2

32 Comparative Statics Theorem 2 allows for a sharp comparative statics anal-ysis on the informativeness of the optimal signal

We compare signals by their Blackwell informativeness (Blackwell 1953) To compareupper-censorship signals s1 and s2 we only need to compare their cutoffs ωlowast

1 and ωlowast2

Signal s1 is more informative than signal s2 if ωlowast1 ge ωlowast

2 Indeed state ω isin [0ωlowast2) is

fully revealed by both s1 and s2 and state ω isin [ωlowast2 1] is partially revealed by s1 and

not revealed at all by s2 so s1 is more informative than s2

13In the case of log-concave density g the sets of upper- and lower-censorship signals are totallyordered by the rotation order of Johnson and Myatt (2006) If we restrict attention to the set oflower-censorship signals (which are generally suboptimal under log-concave g) then the rotationpoint is decreasing with respect to the rotation order By Lemma 1 of Johnson and Myatt (2006)the senderrsquos expected utility is quasiconvex and one of the extreme lower-censorship signals (thefully informative or completely uninformative signal) is optimal for the sender But if we restrictattention to the set of upper-censorship signals (which are optimal under log-concave g) then therotation point is not decreasing with respect to the rotation order Therefore Lemma 1 of Johnsonand Myatt (2006) does not apply and an interior upper-censorship signal is generally optimal forthe sender

CENSORSHIP AS OPTIMAL PERSUASION 13

For the purpose of comparison we extend the definition of distribution function Gto the real line and assume that its density g is log-concave Consider a family ofdistributions Gtσ of receiver types

Gtσ (r) = G

983061τ minus t+

r minus τ

σ

983062

where τ is the maximum point of g and t isin R and σ gt 0 are parameters Let gtσ (r)denote the corresponding density Note that gtσ (r) is log-concave for all t isin R andall σ gt 0

Because gtσ is log-concave on [0 1] deterministic upper censorship is optimal byTheorem 2 Let ωlowast (ρ t σ) isin [0 1] be the optimal censorship cutoff as given by (3)

The shift parameter t shifts the distribution along the horizontal axis and the stretchparameter σ stretches the distribution horizontally and symmetrically around themode τ so the limit of σ rarr 0 leads to the unit mass on τ We now show that thesender optimally discloses more information when

(i) the senderrsquos preferences are better aligned with the receiverrsquos preferences (thealignment parameter ρ is greater)

(ii) the receiver is more reluctant to accept the proposal (the shift parameter t isgreater)

(iii) the sender is more uncertain about the receiverrsquos type (the stretch parameter σis greater) provided ρ ge 014

Theorem 3 In the canonical case a cutoff state of an optimal signal satisfies

(i) ωlowast (ρ t σ) is increasing in ρ

(ii) ωlowast (ρ t σ) is increasing in t

(iii) ωlowast (ρ t σ) is increasing in σ if ρ ge 0

The intuition for part (i) is that for a higher ρ the sender puts more weight onthe receiverrsquos utility so she optimally endows the receiver with a higher utility byproviding more information

The intuition for part (ii) is that for a higher t each type of the receiver has a greatercost of accepting the proposal so to persuade the same type of the receiver the senderneeds to increase E[ω|ω ge ωlowast] by expanding the full disclosure interval [0ωlowast]

The intuition for part (iii) is that for a higher σ receiver types are more spread out soto persuade the same mass of types the sender again needs to increase E[ω|ω ge ωlowast]

14The case ρ ge 0 is the practically relevant case where the senderrsquos utility is a weighted averageof the receiverrsquos action and utility 1

1+ρv(ω r) =1

1+ρ + ρ1+ρ (ω minus r)

14 KOLOTILIN MYLOVANOV ZAPECHELNYUK

4 Discrete State

In this section we assume that the state is a discrete random variable

Theorem 1prime Let V be S-shaped Then and only then for all discrete F

(A) an optimal signal is stochastic upper censorship whose cutoff pair (ωlowast qlowast) satisfiescondition (3)

(B) an optimal monotone partition is deterministic upper censorship whose cutoff pair(ωlowast

d qlowastd) satisfies ωlowast

d = ωlowast and qlowastd isin 0 1

In the canonical case we obtain the result analogous to Theorem 2

Theorem 2prime Consider the canonical case and let the density g be log-concave Thenand only then for all discrete F and all ρ

(A) an optimal signal is stochastic upper censorship

(B) an optimal monotone partition is deterministic upper censorship

Let us discuss how the comparative statics result (Theorem 3) changes In Section 32we argue that upper-censorship signals can be ordered by their censorship cutoffs Butwhen the state is discrete a censorship cutoff is described by a pair (ωlowast qlowast) where ωlowast

is a cutoff state and qlowast is the probability of revealing this cutoff state when it realizes

Consider two stochastic upper-censorship signals s1 and s2 with cutoff pairs (ωlowast1 q

lowast1)

and (ωlowast2 q

lowast2) Denote (ωlowast

1 qlowast1) ≽ (ωlowast

2 qlowast2) if ω

lowast1 gt ωlowast

2 or ωlowast1 = ωlowast

2 and qlowast1 ge qlowast2 Observethat s1 is more (Blackwell) informative than s2 if and only if (ωlowast

1 qlowast1) ≽ (ωlowast

2 qlowast2) This

comparison also applies to deterministic upper-censorship signals with the constraintthat qlowast1 and qlowast2 are in 0 1

With the above order on cutoff pairs Theorem 3 extend as follows

Theorem 3prime Consider the canonical case

(A) A cutoff pair of an optimal signal satisfies

(i) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in ρ

(ii) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in t

(iii) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in σ if ρ ge 0

(B) A cutoff pair of an optimal monotone partition satisfies15

(i) (ωlowastd (ρ t σ) q

lowastd (ρ t σ)) is increasing in ρ

(ii) (ωlowastd (ρ t σ) q

lowastd (ρ t σ)) is increasing in t

15Part (iii) also holds for an optimal monotone partition when for example the discrete supportof the state ω is a regular grid with sufficiently many points However it is possible to construct acounterexample where part (iii) fails for an optimal monotone partition

CENSORSHIP AS OPTIMAL PERSUASION 15

That is the sender optimally discloses more information when she is less biasedrelative to the receiver (the alignment parameter ρ is greater) when the receiver ismore reluctant to accept the proposal (the shift parameter t is greater) and underScenario A when the sender is more uncertain about the receiverrsquos type (the stretchparameter σ is greater)

Remark 2 Theorem 1prime and Theorem 2prime show that upper censorship emerges as anoptimal persuasion strategy under the same conditions (V is S-shaped in Theorem1prime and g is log-concave in Theorem 2prime) regardless of whether the sender can choosean arbitrary signal or is constrained to choose a monotone partition It may betempting to conjecture that if the sender was allowed to choose an arbitrary partition(not necessarily monotone) the same result would hold This however is not true

For illustration suppose that the sender wishes to maximize the probability of a = 1and the receiver has type r = 1

2with certainty16 Let the state ω take values in

0 14 1 with probabilities (1

6 23 16) so the prior mean is 1

3

An optimal signal is stochastic upper censorship with cutoff pair (ωlowast qlowast) = (14 12)

which sends the pooling message with certainty at ω = 1 and with probability 1minusqlowast =12at ω = 1

4 and reveals the state otherwise The pooling message induces the posterior

mean m = 12 leading to a = 1 all other messages induce posterior means below 1

2

leading to a = 0 Under the optimal signal action a = 1 is chosen with probability23middot 12+ 1

6= 1

2

An optimal monotone partition is deterministic upper censorship that fully revealsthe state where only ω = 1 leads to a = 1 Under this partition action a = 1 ischosen with probability 1

6

An optimal partition pools the extreme states ω = 0 and ω = 1 and reveals theintermediate state ω = 1

4 The pooling message induces the posterior mean m = 1

2

leading to a = 1 This optimal partition is nonmonotone Under this partitionaction a = 1 is chosen with probability 1

6+ 1

6= 1

3 which is strictly better than the

probability of 16obtained under the optimal monotone partition

5 Application to Media Censorship

In this section we apply our results to the problem of media censorship by thegovernment In the modern world people obtain information about the governmentrsquosstate through various media sources such as television newspapers and internetblogs Without the media most people would not know what policies and reformsthe government pursues and how effective they are Media outlets have differentpositions on the political spectrum and differ substantially in how they select andpresent facts to cover the same news People choose their sources of information based

16In this case V (m) = 1 if m ge 12 and V (m) = 0 if m lt 1

2 Observe that the graph of V is

S-shaped and it can be approximated by a continuously differentiable S-shaped function

16 KOLOTILIN MYLOVANOV ZAPECHELNYUK

on their political ideology and socioeconomic status This information is valuable forsignificant individual decisions on migration investment and voting to name a fewIndividuals may not fully internalize externalities that their decisions impose on thesociety Likewise the government may not have the societyrsquos best interest at heartTo further its goals the government then wishes to influence individual decisions bymanipulating the information through media In autocracies and countries with weakchecks and balances the government has power to censor the media content

The governmentrsquos problem of media censorship can be represented as the persua-sion problem in Section 2 We apply our results to provide conditions for the op-timality of upper-censorship policies that censor all media outlets except the mostpro-government ones Furthermore we interpret our comparative statics results

51 Setup There is a continuum of heterogeneous citizens indexed by r isin [0 1]distributed with G Each citizen chooses between a = 0 and a = 1 The utility of acitizen of type r is given by

u(θ r ar a) = (θ minus r)ar + κ(θ r a)

where ar isin 0 1 denotes the citizenrsquos own action a =983125ardG(r) denotes the aggre-

gate action in the society θ isin [0 1] captures an unobserved benefit from action 1 ascompared to action 0 and κ captures the impact of the aggregate action a on thecitizenrsquos utility The term (θ minus r)ar is a private surplus of a citizen of type r Theterm κ(θ r a) is an externality because for a citizen of type r it is optimal to ignorethis term and choose ar = 1 if and only if θ ge r

There is a government which is concerned with a weighted average of the social utilityand the governmentrsquos intrinsic benefit from the aggregate action For a given θ thegovernmentrsquos utility is given by

983133 1

0

ν(θ r ar a)dG(r) + δγ(θ a)

The term ν captures a citizenrsquos utility from the governmentrsquos perspective We allow νto be different from u to reflect paternalistic or other concerns The term γ capturesthe governmentrsquos intrinsic benefit from the aggregate action The parameter δ ge 0captures the weight of the aggregate action in the governmentrsquos utility

Let T be a distribution of the random variable θ We assume that distributions G andT are independent and admit continuously differentiable and strictly positive densi-ties We also assume that κ ν and γ are linear in θ and continuously differentiablein r and a Furthermore to simplify interpretations we assume that ν and γ arenon-decreasing in θ and a so that for the government a high θ is a good news anda higher aggregate action is preferable

Citizens obtain information about the unobservable benefit θ through media outletsEach media outlet is identified by its editorial policy c isin [0 1] and it endorses action

CENSORSHIP AS OPTIMAL PERSUASION 17

a = 1 if θ ge c and criticizes it if θ lt c17 A set C of media outlets is a finite subsetof [0 1]18

The governmentrsquos censorship policy is a set of the media outlets X sub C that arepermitted to broadcast so the rest of the media outlets are censored

The timing is as follows First the government chooses a set X sub C of permittedmedia outlets Second state θ is realized and each permitted media outlet endorsesor criticizes action a = 1 according to its editorial policy Finally each citizen observesmessages from all permitted media outlets updates his beliefs about θ and choosesan action

52 Discussion We now discuss interpretations of the key components of the mediacensorship application As in Gehlbach and Sonin (2014) there can be various inter-pretations of the citizenrsquos action a = 1 such as voting for the government supportinga governmentrsquos policy or taking an individual decision that benefits the governmentA citizenrsquos type r can be interpreted as his ideological position or preference param-eter A citizen who is more supportive of the government has a smaller r

A media outlet with a higher editorial policy c can be interpreted as being less loyalto the government because it criticizes the government on a larger set of states Aneditorial policy c isin C can therefore represent a slant or political bias of the outletagainst the government and can be empirically measured as the frequency with whichthe outlet uses anti-government language Gentzkow and Shapiro (2010) constructsuch a slant index for US newspapers Empirical findings of their paper suggestthat the editorial policies of media outlets are driven by reader preferences justifyingour assumption of the existence of a large variety of editorial policies19 As in Suen(2004) Chan and Suen (2008) and Chiang and Knight (2011) the assumption ofthe binary media reports that communicate only whether the state θ is above somestandard c can be justified by a cursory readerrsquos preference for simple messages suchas positive or negative opinions and yes or no recommendations

The governmentrsquos censorship of media outlets can take various forms For examplethe government can ban access to internet sites withdraw licenses disrupt financingconfiscate print materials and equipment discredit media providers and even arresteditors and journalists using broadly formulated legislation on combating extremismIn some countries the government can exercise direct control over media editorialpolicies either through state ownership or administrative pressure

17The tie-breaking in the event of θ = c is unimportant as θ is a continuous random variable18We discuss the case of a continuum of media outlets in Section 5519Theoretical literature has explored the determinants of media slant of an outlet driven by its

citizens (Mullainathan and Shleifer 2005 Gentzkow and Shapiro 2006 and Chan and Suen 2008)and its owners (Baron 2006 and Besley and Prat 2006)

18 KOLOTILIN MYLOVANOV ZAPECHELNYUK

53 Formulating Media Censorship as Persuasion We now show that the me-dia censorship problem can be formulated as a linear persuasion problem in whichthe government is a sender and a representative citizen is a receiver

In our model the citizensrsquo and the governmentrsquos utilities are linear in θ There-fore given any information from the media outlets the utilities depend only on theposterior mean of θ

Consider an arbitrary posterior mean of θ denoted by m isin [0 1] Each citizen of typer chooses ar = 1 if and only if r le m The externality term κ plays no role in thisdecision Therefore the aggregate action a is simply the mass of all citizens whosetypes do not exceed m so a = G(m)

Next using the citizensrsquo optimal behavior we derive the governmentrsquos expected utilityconditional on the posterior mean m

V (m) = E983063983133 1

0

ν(θ r ar G(m))dG(r) + δγ(θ G(m))

983055983055983055983055m983064

=

983133 1

0

ν(m r1rlem G(m))dG(r) + δγ(mG(m)) (6)

This is the governmentrsquos indirect utility As in Section 2 the government now choosesa signal which is informative about the state to maximize its expected utility How-ever in contrast to Section 2 here the government is restricted to signals that areimplementable by a subset of a given set of media outlets

Next we show that with an appropriate definition of the state the restriction tosignals implementable by a subset of a given set of media outlets can be formulatedas a restriction to monotone partitional signals as in Scenario B in Section 2 Thuswe will be able to use our results in Section 4 to find the optimal censorship policies

Recall that the information about θ is only available through the media outlets Let ωdenote a random variable equal to the conditional expectation of θ given the messagesof all media outlets in C Note that as C is finite ω is a discrete random variablewith values in [0 1] Let F denote the distribution of ω From now on we treat therandom variable ω as the state and denote by s a signal as defined in Section 2

Let HM denote the set of all distributions of the posterior mean state induced byall monotone partitional signals and let HC denote the set of all distributions ofthe posterior mean state induced by all finite subsets of the set C of media outletsObserve that any set of media outlets induces a monotone partitional signal thatis HC sub HM We now show that every outcome implementable by an arbitrarymonotone partition in HM is also implementable by a subset of media outlets

Lemma 2 HC = HM

For illustration suppose that there is only one media outlet with the editorial policyc isin (0 1) There are two observable events θ ge c and θ lt c In each realized event

CENSORSHIP AS OPTIMAL PERSUASION 19

the citizens compute the posterior expectation of θ which we will call ω So in thisexample the distribution F of ω has two mass points one for each of the two events

Given these two mass points every monotone partition of [0 1] induces one of twopossible distributions of the posterior mean state (i) the mass points revealed and (ii)the mass points pooled Observe however that these distributions are implementableby permitting and censoring the media outlet respectively

A censorship policy is upper censorship if it censors all sufficiently disloyal mediaoutlets Specifically there exists a cutoff clowast isin C and an indicator qlowast isin 0 1 suchthat all media outlets whose editorial policies are below clowast are permitted all mediaoutlets whose editorial policies are above clowast are censored and clowast is permitted if andonly if qlowast = 1 That is X = c isin C c le clowast if qlowast = 1 and X = c isin C c lt clowastif qlowast = 0 Note that the full censorship policy clowast = 0 and qlowast = 0 (where all mediaoutlets are censored) and the free media policy clowast = 1 and qlowast = 1 (where all mediaoutlets are permitted) are the two extreme upper-censorship policies

As shown above the media censorship problem is equivalent to the linear persuasionproblem in which the sender is restricted to monotone partitional signals and thesenderrsquos indirect utility V is given by (6) We thus apply Theorem 1prime(B) to obtainthe following result

Theorem 1primeprime If V is S-shaped then an optimal censorship policy is upper censorship

To illustrate Theorem 1primeprime suppose the government is interested only in the aggregateaction so that ν = 0 and γ depends only on the aggregate action a so V (m) =γ(G(m)) Thus an upper-censorship policy is optimal if the composition functionγ(G(middot)) is S-shaped For example this condition holds if the government is interestedin reaching a certain approval threshold (eg a simple majority) so that γ is a stepfunction This condition also holds if γ is S-shaped and G is uniform or S-shapedwith the same inflection point as γ20

54 Canonical Case We now impose more structure on the utilities to obtain asharper result for the optimality of upper censorship and to perform a comparativestatics analysis Analogously to assumption (A1) we assume that the governmentrsquosutility is a weighted average of the citizensrsquo utility and their aggregate action

983133 1

0

u(θ r ar a)dG(r) + δa

where u(θ r ar a) = (θ minus r)ar + ζ(r)a

(A2)

for some continuously differentiable function ζ Define

β =

983133 1

0

ζ(r)dG(r) + δ

20This condition holds in many special cases where γ and G are S-shaped even when γ and Ghave different inflection points

20 KOLOTILIN MYLOVANOV ZAPECHELNYUK

The term983125 1

0ζ(r)dG(r) is the governmentrsquos expected bias towards the citizensrsquo action

a = 1 due to the citizensrsquo externality and the term δ is the governmentrsquos intrinsicbias towards a greater aggregate action a Thus we interpret β as the governmentrsquosaggregate bias This decomposition of the bias allows for different interpretations whythe government is biased So β can be high because the government is too self-serving(high δ) or because the government is benevolent (low δ) but wishes to internalizestrong positive externalities of the citizens (high ζ(r)) To ease interpretations weassume that β gt 021

Under Assumption (A2) the governmentrsquos indirect utility V given by (6) becomes

V (m) =

983133 m

0

(mminus r)dG(r) + βa =

983133 m

0

v(m r)dG(r)

where

v(m r) = β

9830611 +

1

β(mminus r)

983062

This is the same as v given by Assumption (A1) with ρ = βminus1 up to rescaling by apositive constant Consequently we can apply Theorem 2prime(B) to obtain the followingresult

Theorem 2primeprime Let (A2) hold If the density g of citizens types is log-concave then anoptimal censorship policy is upper censorship

We now apply the comparative statics analysis presented in Section 32 Note thatthe upper-censorship policies are ordered according to the amount of informationtransmitted to the citizens in the sense of Blackwell (1953) A greater censorshipthreshold clowast means that more media outlets are permitted With this order in mindwe apply Theorem 3prime(B) to make a comparative statics analysis on the amount ofinformation that is optimally disclosed by the government

First the censorship cutoff is increasing in the alignment parameter ρ Recall thatthis is a reciprocal of the governmentrsquos bias ρ = βminus1 This means that the governmentoptimally discloses more information (the censorship cutoff is greater) when it is lessbiased (β is smaller) Intuitively as β decreases the government puts more weighton the citizensrsquo utility so it optimally endows the population with a higher utility bycensoring fewer media outlets and disclosing more information

Second the censorship cutoff is increasing in the magnitude t of the horizontal shiftof the density g A greater t corresponds to a greater opportunity cost of actiona = 1 for each citizen This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more difficultto persuade to take action a = 1 (parameter t is greater) Informally speaking topersuade the same type of the citizen the government needs to increase the posterior

21If β lt 0 then swapping the roles of a = 0 and a = 1 reverses the sign of the bias If β = 0then the governmentrsquos utility and the citizensrsquo private interests coincide so it is trivially optimal topermit all media outlets

CENSORSHIP AS OPTIMAL PERSUASION 21

mean But the expected posterior mean must be equal to the prior mean so it is notpossible to increase all posterior means Due to the log-concave shape of the densityof the citizensrsquo types this tradeoff is resolved by increasing the posterior mean of thepooling interval E[ω|ω ge clowast] This is done by shrinking the pooling interval [clowast 1]that is increasing the censorship cutoff clowast

Third when the number of media outlets is sufficiently large the censorship cutoffis increasing in the stretch factor σ of the density g around its peak τ (see Footnote15) A greater σ corresponds to a more dispersed distribution of the opportunitycost in the population This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more diverse inhow difficult they are to persuade Intuitively when the types are more spread outpersuading the same mass of types requires a greater posterior mean which leads toa higher censorship cutoff similarly to the comparative statics in t discussed above

55 Extensions and Open Questions Let us now consider a few extensions ofour model of media censorship

In our model the set of media outlets is exogenous and the governmentrsquos only in-strument is censorship We now consider three alternative ways of expanding thegovernmentrsquos instruments of influence

First suppose that the government is able not only to censor existing media outletsbut also to introduce new media outlets with chosen editorial policies This is equiv-alent to our censorship model where all media outlets in [0 1] are initially availableand the government can censor any subset of them When all media outlets in [0 1]are permitted the revealed information about θ (which we call ω) is θ itself Thusω = θ is a continuous random variable with distribution F = T So we can nowapply our results for the continuous state from Section 3 instead of those for the dis-crete state in Section 4 reaching the same conclusion about the optimality of uppercensorship

Second suppose that the government can garble information available from mediaoutlets That is the government is not restricted to monotone partitions it cancreate arbitrary signals about state ω for the citizens to observe This becomes ageneral persuasion problem and our Theorems 1prime(A) 2prime(A) and 3prime(A) apply

Third suppose that the government is able to restrict not only which media outletsare permitted but also how many media outlets each citizen can choose to observeIn this extension the citizens are not allowed to communicate with one another (oth-erwise they could share the information thus observing all permitted media outletsindirectly) This extension does not affect our results as long as each citizen is al-lowed to access at least one media outlet of his choice as in Chan and Suen (2008)Intuitively this is because each citizen categorizes the information from the mediaoutlets into ldquogood newsrdquo where a = 1 is optimal and ldquobad newsrdquo where a = 0 is

22 KOLOTILIN MYLOVANOV ZAPECHELNYUK

optimal Because the information from the media outlets induces a monotone parti-tion it means that ldquogood newsrdquo is separated from ldquobad newsrdquo by a threshold mediaoutlet that depends on the citizenrsquos type That is it is sufficient to observe a singlethreshold media outlet to distinguish ldquogood newsrdquo from ldquobad newsrdquo

In our model each citizenrsquos utility depends on the aggregate action a through theexternality term κ(θ r a) which does not affect the chosen action Let us relax thisassumption so that a citizenrsquos optimal choice can depend on a We can still writethe senderrsquos indirect utility V as a function of the posterior mean state and applyour results However V is no longer uniquely determined by the primitives of themodel It is now endogenous and depends on an equilibrium the citizens play aseach citizenrsquos optimal action now depends on what all citizens do in equilibrium Forexample given the same information about the state a citizen could prefer to choosea = 1 if and only if sufficiently many citizens choose the same action so a is largeenough This creates the problem of multiplicity of equilibria and as a consequencethe dependence of optimal censorship on equilibrium selection

As a side our media censorship problem can also be applied to spatial voting mod-els as in Chiang and Knight (2011) Consider a government party (p = G) and anopposition party (p = O) competing in an election If party p wins a voter withan ideological position r gets utility wp minus (r minus rp)

2 where wp is the quality or va-lence of party p and rp is the ideology or policy platform of party p Voters knowthe partiesrsquo ideologies and obtain information about the partiesrsquo qualities from allavailable media outlets Each voter supports the party that maximizes his expectedutility Within this context our analysis still applies because the voterrsquos utility dif-ference between the government and opposition parties is proportional to θminusr whereθ = (wG minus wO + r2O minus r2G)2(rO minus rG)

There are a few more extensions that can be relevant in applications First insteadof complete censorship of a media outlet there can be a cost of accessing it Forexample an international news channel can be censored by a local government butcitizens may still access it through VPN at some cost Second it can be costly forthe government to censor media outlets So another important question to answeris how a government may prioritize censoring Finally citizens can incur some costof following each media outlet While we have already mentioned that citizens gainno benefit from following more than one outlet it is entirely possible for them tostop watching news altogether if it is sufficiently uninformative These extensions arenontrivial and left for future research

Appendix

Proof of Theorem 1 Suppose that V is S-shaped For mωlowast isin [0 1] and mlowast(ωlowast) =E[ω|ω ge ωlowast] define

V (mωlowast) = V (mlowast(ωlowast)) + V prime(mlowast(ωlowast))(mminusmlowast(ωlowast)) (7)

CENSORSHIP AS OPTIMAL PERSUASION 23

Since V (m) is convex on [0 τ ] and concave on [τ 1] there exists ωlowast isin [0 τ ] thatsatisfies condition (3) that is

V (mωlowast) ge (le)V (m) for all m ge (lt)ωlowast (8)

This is true because V (ωlowastωlowast) minus V (ωlowast) is single-crossing from below To see thisconsider Figure 1 Observe that (3) can hold only if ωlowast is on the convex part of V andmlowast is on the concave part of V As the censorship cutoff ωlowast increases the posteriormean state of the pooling message mlowast(ωlowast) also increases But because V is concaveat mlowast(ωlowast) the dashed tangent line V (mωlowast) becomes flatter in m so it crosses thesolid line V (m) at a smaller m

Consider an arbitrary signal s Let H be the distribution of m = E[ω|s] inducedby signal s Since each signal is a garbling of the fully informative signal F is amean-preserving spread of H Thus the senderrsquos expected utility is smaller undersignal s than under the upper-censorship signal with cutoff ωlowast as follows from

983133 1

0

V (m)dH(m) le983133 1

0

maxV (m) V (mωlowast)dH(m)

le983133 1

0

maxV (m) V (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mlowast(ωlowast))dF (m)

where the second line holds because maxV (m) V (mωlowast) is convex in m and F isa mean-preserving spread of H the third line holds by (8) and the last line holds by

(7) and by983125 1

ωlowast(mminusmlowast(ωlowast))dF (m) = 0

Conversely suppose that V is not S-shaped Then there exist 0 le m1 lt m2 lem3 lt m4 le 1 such that V primeprime(m) lt 0 for m isin (m1m2) V

primeprime(m) = 0 for m isin [m2m3]and V primeprime(m) gt 0 for m isin (m3m4) because V primeprime is continuous by assumption It isstraightforward to show that there exists 983141mlowast isin (m1m2) (sufficiently close to m2) and983141ωlowast isin (m3m4) such that

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) lt V (m) for all m isin (983141ωlowastm4]

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) gt V (m) for all m isin [m1 983141ωlowast] 983141mlowast

Moreover there exists a continuous distribution F such that the support of F is asubset of [m1m4] and 983141mlowast = E[ω|ω le 983141ωlowast] Using a chain of inequalities similar to theone above it is easy to show that lower-censorship signal 983141slowast with cutoff 983141ωlowast is optimalFurthermore it is uniquely optimal as both inequalities hold with equalities only forsignal 983141slowast This in turn implies that any upper-censorship signal is suboptimal

24 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Proof of Lemma 1 Notice that under assumption (A1) we have

V primeprime(m) = gprime(m) + ρg(m) for m isin [0 1]

Recall that V is S-shaped if and only if V primeprime(m) = gprime (m) + ρg (m) is single-crossingfrom above Using Proposition 1 in Quah and Strulovici (2012) it is easy to showthat V primeprime(m) is single-crossing from above for all ρ isin R if and only if gprime (m)g (m) isnonincreasing in m (that is ln g(m) is concave)

Proof of Theorem 2 Immediate by Theorem 1 and Lemma 1

Proof of Theorem 3 Part (i) We fix t and σ and vary ρ Without loss ofgenerality let t = 0 and σ = 1 (so that Gtσ = G) Consider a signal that induces thedistribution H of posterior mean m This signal implements the receiverrsquos interimutility U given by

U(r) =

983133 1

r

(mminus r)dH(m) for r isin [0 1]

which holds because the receiver acts if and only if m ge r Since types r lt 0 alwayschoose a = 1 and types r gt 1 always choose a = 0 we can exclude them fromconsideration and assume that supp(G) = [0 1] The senderrsquos expected utility is then

983133 1

0

V (m)dH(m) =

983133 1

0

983133 m

0

(1 + ρ(mminus r))dG(r)dH(m)

=

983133 1

0

983133 1

r

(1 + ρ(mminus r))dH(m)dG(r)

=

983133 1

0

(1minusH(r) + ρU(r))dG(r)

Consider ρ2 gt ρ1 Suppose by contradiction that the corresponding optimal upper-censorship signals s2 and s1 are such that s1 is strictly more informative than s2 LetU2 and U1 be the receiverrsquos interim utilities implemented by s2 and s1 Since thesender prefers signal s2 under ρ2 and signal s1 under ρ1 we have

983133 1

0

(1minusH2 (r) + ρ2U2 (r)) dG (r) ge983133 1

0

(1minusH1 (r) + ρ2U1 (r)) dG (r)

983133 1

0

(1minusH1 (r) + ρ1U1 (r)) dG (r) ge983133 1

0

(1minusH2 (r) + ρ1U2 (r)) dG (r)

Summing up these inequalities gives

(ρ2 minus ρ1)

983133 1

0

(U2 (r)minus U1 (r)) dG (r) ge 0

leading to a contradiction because U1(r) ge U2(r) for all r with strict inequality forsome r given that s1 is strictly more informative than s2

CENSORSHIP AS OPTIMAL PERSUASION 25

Part (ii) We fix ρ isin R and σ = 1 and vary t The indirect utility of the senderconditional on m isin [0 1] is given by

Vt(m) =

983133

rlem

(1+ρ(mminusr))gt1(r)dr =

983133

rlemminust

(1+ρ(mminustminusr))g(r)dr = V (mminust) (9)

By Theorem 1 when t0 = 0 an optimal cutoff ωlowast0 satisfies

V (m) le V (mlowast0) + V prime(mlowast

0)(mminusmlowast0) for m isin [ωlowast

0 1]

When V is S-shaped and t1 lt 0 it is easy to see that

V (mminus t1) le V (mlowast0 minus t1) + V prime(mlowast

0 minus t1)(mminusmlowast0) for m isin [ωlowast

0 1]

which implies that when t = t1 the above inequality changes its sign at some ωlowast1

that satisfies ωlowast1 le ωlowast

0 It follows from (9) that ωlowast1 is an optimal cutoff for t = t1 and

as we established above ωlowast1 le ωlowast

0

Part (iii) We fix ρ ge 0 and t = 0 and vary σ The indirect utility of the senderconditional on m isin [0 1] is given by

V ρσσ(m) =

983133

rlem

9830591 +

ρ

σ(mminus r)

983060dG0σ(r)

=

983133

rlem

9830591 +

ρ

σ(mminus r)

983060g

983061r minus (1minus σ)τ

σ

9830621

σdr

=

983133

rlemminus(1minusσ)τσ

9830611 + ρ

983061mminus (1minus σ)τ

σminus r

983062983062g(r)dr

= Vρ1

983061mminus (1minus σ)τ

σ

983062 (10)

Let σ0 = 1 σ1 lt 1 and ρ1 = ρσ1 To compare the optimal cutoffs under parameters(ρ σ0) and (ρ σ1) we consider two steps first comparing the cutoffs between (ρ σ0)and (ρ1 σ1) and then comparing the cutoffs between (ρ1 σ1) and (ρ σ1)

First we compare an optimal cutoff ωlowast0 under (ρ σ0) = (ρ 1) and an optimal cutoff

ωlowast1 under (ρ1 σ1) By Theorem 1 ωlowast

0 satisfies

Vρ1(m) le Vρ1(mlowast0) + V prime

ρ1(mlowast0)(mminusmlowast

0) for m isin [ωlowast0 1]

When V is increasing (implied by ρ ge 0) and S-shaped and σ1 lt 1 it is easy to seethat for all m isin [ωlowast

0 1]

Vρ1

983061mminus (1minus σ1)τ

σ1

983062le Vρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062+ V prime

ρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062(mminusmlowast

0)

which implies that when σ = σ1 lt 1 the above inequality changes its sign at someωlowast1 that satisfies ωlowast

1 le ωlowast0 It follows from (10) that ωlowast

1 is an optimal cutoff under(ρσ σ1) = (ρ1 σ1) and as we established above ωlowast

1 le ωlowast0

26 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Second we compare the optimal cutoff ωlowast1 under (ρ1 σ1) and the optimal cutoff ωlowast

2

under (ρ σ1) Because ρ lt ρ1 = ρσ1 by part (i) we have ωlowast2 le ωlowast

1

Combining ωlowast1 le ωlowast

0 and ωlowast2 le ωlowast

1 we obtain ωlowast2 le ωlowast

0

Proof of Theorem 1prime Part (A) The proof is analogous to that of Theorem 1 andis thus omitted

Part (B) The only if statement follows from that in part (A) (see the proof of Theorem1) and the fact that we can always ensure that an optimal stochastic censorship(ωlowast qlowast) is deterministic (so qlowast isin 0 1) by the choice of the support and distributionof ω Thus we only need to prove the if statement

Let F be a distribution of ω with support on n values 0 lt ω0 lt lt ωnminus1 lt 1 Thisis without loss of generality because if 0 andor 1 were in the support of F we couldredefine the state as ω = (ω + ε)(1 + 2ε) and the type as r = (r + ε)(1 + 2ε)

Let X sub ω1 ωnminus1 identify a set of cutoffs defining a monotone partition for eachωi isin X the receiver is informed whether ω lt ωi or ω ge ωi Note that we exclude ω0

from the set of cutoffs because ω ge ω0 always holds

Note that X is deterministic upper censorship if and only if there exists a censorshipthreshold ωlowast isin [0 1] such that for each i = 1 nminus 1 ωi isin X if and only if ωi lt ωlowastthat is we can assume that qlowast = 0 without loss of generality

Let VX be the senderrsquos expected utility when the monotone partition is given by X

If n = 2 then there are only two partitions X = ω1 (full disclosure) and X = empty(no disclosure) Both are upper censorships so part (B) holds trivially

If n = 3 then there are four monotone partitions X = ω1ω2 (full disclosure)X = ω2 X = ω1 and X = empty (no disclosure) Only X = ω2 is not uppercensorship The next lemma proves that X = ω2 cannot be uniquely optimal whenV is S-shaped thus proving the optimality of upper censorship for n = 3

Denote

mjk = E983045ω|ω isin [ωjωk]

983046

For convenience of notation we write mj = ωj

Lemma 3 Let V be S-shaped with an inflexion point τ Let ω be a discrete randomvariable with support on m0m1m2 and probabilities p0 p1 p2 where 0 lt m0 ltm1 lt m2 lt 1 Then

(i) if τ le m0 then Vm2 le Vempty

(ii) if τ ge m1 then Vm2 le Vm1m2

(iii) if m0 lt τ lt m1 then Vm2 le maxVempty Vm1m2

CENSORSHIP AS OPTIMAL PERSUASION 27

Proof Define p01 = p0 + p1 and p02 = p01 + p2 = 1 Observe that

m01 =p0m0 + p1m1

p0 + p1and m02 =

p0m0 + p1m1 + p2m2

p0 + p1 + p2

We thus have

Vempty = p02V (m02) = (p01 + p2)V

983061p01m01 + p2m2

p01 + p2

983062

Vm2 = p01V (m01) + p2V (m2) = (p0 + p1)V

983061p0m0 + p1m1

p0 + p1

983062+ p2V (m2)

Vm1m2 = p0V (m0) + p1V (m1) + p2V (m2)

Clearly

m0 lt m01 lt m1 and m0 lt m02 lt m2

Therefore

Vm2 le Vempty lArrrArrp01

p01 + p2V (m01) +

p2p01 + p2

V (m2) le V

983061p01

p01 + p2m01 +

p2p01 + p2

m2

983062= V (m02)

(11)

and

Vm2 le Vm1m2 lArrrArr

V

983061p0

p0 + p1m0 +

p1p0 + p1

m1

983062le p0

p0 + p1V (m0) +

p1p0 + p1

V (m1)(12)

Suppose that τ le m01 so V is concave on [m01m2] Then (11) holds by Jensenrsquosinequality In particular it holds for τ le m0 lt m01 thus proving part (i)

Suppose that τ ge m1 so V is convex on [m0m1] Then (12) holds by Jensenrsquosinequality thus proving part (ii)

Finally suppose that m01 lt τ lt m1 Let (τ V (τ)) lie below (or on) the straight lineA connecting (m0 V (m0)) and (m2 V (m2)) as illustrated on Figure 2 By convexityof V (m) for m lt τ it must be the case that V (m01) lies below (or on) the dashedline B By concavity of V (m) for m gt τ it must be the case that V (m1) lies above(or on) the dashed line C Consequently (12) is satisfied

Alternatively let (τ V (τ)) lie above the straight line A connecting (m0 V (m0)) and(m2 V (m2)) as illustrated on Figure 3 If Vm2 le Vm1m2 then the proof of part(iii) is complete So suppose that Vm2 gt Vm1m2 so (12) is violated

By convexity of V (m) for m lt τ it must be the case that V (m01) lies below (or on)the line B Since (12) is violated V (m01) must lie between the lines A and B Recallthat m01 lt m02 lt m2 Observe that V (m02) must be in the shaded area on Figure 3Indeed if m02 lt τ then by convexity of V (m) for m lt τ it must be the case thatV (m02) lies between (or on) the lines B and D If m02 ge τ then by concavity of

28 KOLOTILIN MYLOVANOV ZAPECHELNYUK

τ m2m0

A C

B

m01 m1

V (m0) V (τ)

V (m2)

Figure 2 The case where (τ V (τ)) lies below (or on) the straight lineconnecting (m0 V (m0)) and (m2 V (m2))

τ m2m0

A

C

B

m01

V (m0)

V (τ)

V (m2)

D

V (m01)

Figure 3 The case where (τ V (τ)) lies above the straight line con-necting (m0 V (m0)) and (m2 V (m2))

V (m) for m gt τ it must be the case that V (m02) lies above (or on) line C It followsthat (11) is satisfied

We now adapt Lemma 3 to prove the optimality of upper censorship for n ge 4

CENSORSHIP AS OPTIMAL PERSUASION 29

Fix a monotone partition X sub ω1 ωnminus1 that is not upper censorship that isthere exists i isin 2 nminus 1 such that

ωiminus1 ∕isin X and ωi isin X (13)

Denote ωn = 1 Define j and k as follows If X cap ω1 ωiminus2 ∕= empty let ωj be thelargest element inX smaller than i otherwise let ωj = ω0 IfXcapωi+1 ωnminus1 ∕= emptylet ωk be the smallest element in X greater than i otherwise let ωk = ωn = 1 Define

m0 = ωj m1 = ωiminus1 m2 = ωi

and

p0 =Pr[ω isin [ωjωiminus1)]

Pr[ω isin [ωjωk)] p1 =

Pr[ω isin [ωiminus1ωi)]

Pr[ω isin [ωjωk)] p2 =

Pr[ω isin [ωiωk)]

Pr[ω isin [ωjωk)]

To apply Lemma 3 we identify Xωi with empty X with m2 and X cup ωiminus1 withm1m2 By Lemma 3 VX le maxVXωi VXcupωiminus1 Thus replacing X withXωi if VX le VXωi and with X cup ωiminus1 otherwise weakly improves the senderrsquosexpected utility Repeatedly applying such replacements yields upper censorshipIndeed this procedure terminates after a finite number of replacements because byLemma 3 for ωj ge τ we replace X with Xωi and for ωiminus1 le τ we replace Xwith X cup ωiminus1

We thus have shown that when V is S-shaped when optimizing over the set ofmonotone partitions we can restrict attention to deterministic upper censorship Itremains to show that an optimal deterministic censorship cutoff ωlowast

d coincides with anoptimal stochastic censorship cutoff ωlowast from part (A) To see this it is straightforwardto show that

V lowast(ωlowast qlowast) =

983133

[0ωlowast)

V (ω)dF (ω) + V (ωlowast)qlowast Pr[ω = ωlowast]

+ V (mlowast(ωlowast qlowast))((1minus qlowast) Pr[ω = ωlowast] + Pr[ω gt ωlowast]) (14)

is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order Since the two de-terministic censorship cutoff pairs closest to (ωlowast qlowast) in the Blackwell informativenessorder are (ωlowast 0) and (ωlowast 1) we obtain ωlowast

d = ωlowast

Proof of Theorem 2prime Immediate by Theorem 1prime and Lemma 1

Proof of Theorem 3prime The proof for an optimal signal is analogous to that ofTheorem 3 and is thus omitted The same applies to part (i) for an optimal monotonepartition Thus it remains to prove part (ii) for an optimal monotone partition

We fix ρ and σ = 1 and vary t continuously If ωlowast isin supp(F ) then an optimalmonotone partition is an optimal signal because all qlowast isin [0 1] are optimal Thuspart (ii) in (B) holds by part (ii) in (A) If ωlowast isin supp(F ) and qlowast isin (0 1) then either(ωlowast

d qlowastd) = (ωlowast 0) or (ωlowast

d qlowastd) = (ωlowast 1) is optimal because the senderrsquos expected utility

V lowast given by (14) is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order

30 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Therefore it suffices to consider values of t at which the sender is indifferent between(ωlowast 0) and (ωlowast 1) and to show that the sender strictly prefers (ωlowast 1) to (ωlowast 0) at aslightly higher t

V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) =rArr dV lowastt (ω

lowast 0)

dtle dV lowast

t (ωlowast 1)

dt (15)

The condition V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) can be written as

Vt(mlowast(ωlowast 0)) Pr[ω ge ωlowast] = Vt(ω

lowast) Pr[ω = ωlowast] + Vt(mlowast(ωlowast 1)) Pr[ω gt ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)Pr[ω = ωlowast]

Pr[ω ge ωlowast]+ Vt(m

lowast(ωlowast 1))Pr[ω gt ωlowast]

Pr[ω ge ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + Vt(mlowast(ωlowast 1))

mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Analogously the conditiondV lowast

t (ωlowast0)dt

le dV lowastt (ωlowast1)dt

can be written as

V primet (m

lowast(ωlowast 0)) ge V primet (ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + V primet (m

lowast(ωlowast 1))mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Thus (15) states that if Vt(mlowast(ωlowast 0)) is equal to the convex combination of Vt(ω

lowast)and Vt(m

lowast(ωlowast 1)) then the derivative V primet (m

lowast(ωlowast 0)) is greater than the convex com-bination of the derivatives V prime

t (ωlowast) and V prime

t (mlowast(ωlowast 1)) When V is S-shaped it is easy

to see that this property is satisfied

Proof of Lemma 2 Observe that a signal induced by any set X sub C of permittedmedia outlets induces a monotone partition of [0 1] and thus HC sub HM Indeed letcX(ω) and cX(ω) be the media outlets in X that are the closest to ω from above andfrom below

cX(ω) = sup983059c isin X c le ω

983052cup 0

983060and cX(ω) = inf

983059c isin X c gt ω cup 1

983060

By observing the media outlets in X each citizen is informed that ω isin [cX(ω) cX(ω))

Conversely let the set C of media outlets be c0 lt middot middot middot lt cn Without loss of generalityassume that C contains the two uninformative media outlets c0 = 0 and cn = 1 which(almost) always endorse and criticize action a = 1 respectively Then the distributionF of the state ω is a discrete distribution whose support is Ω = ω0 ωnminus1 whereωi = E[θ|θ isin [ci ci+1)] for i isin 0 nminus 1 Consider a monotone partitional signalrepresented by a nondecreasing function ξ [0 1] rarr R For each realization m of thissignal that occurs with a strictly positive probability define

x(m) = min983153ω isin Ω ξ(ω) = m

983154and x(m) = max

983153ω isin Ω ξ(ω) = m

983154

This signal induces a monotone partition of the finite set Ω This partition of Ω canbe implemented by a censorship policy that prohibits a media outlet c isin C if andonly if c isin (x(m) x(m)) for some realization m

CENSORSHIP AS OPTIMAL PERSUASION 31

References

Alonso R and O Camara (2016a) ldquoPersuading Votersrdquo American EconomicReview 106 3590ndash3605

(2016b) ldquoPolitical Disagreement and Information in Electionsrdquo Games andEconomic Behavior 100 390ndash412

Bagnoli M and T Bergstrom (2005) ldquoLog-Concave Probability and Its Ap-plicationsrdquo Economic Theory 26 445ndash469

Baron D P (2006) ldquoPersistent Media Biasrdquo Journal of Public Economics 90(1)1ndash36

Besley T and A Prat (2006) ldquoHandcuffs for the Grabbing Hand MediaCapture and Government Accountabilityrdquo American Economic Review 96(3) 720ndash736

Blackwell D (1953) ldquoEquivalent Comparisons of Experimentsrdquo Annals of Math-ematical Statistics 24 265ndash272

Chan J and W Suen (2008) ldquoA Spatial Theory of News Consumption andElectoral Competitionrdquo Review of Economic Studies 75 699ndash728

Chiang C-F and B Knight (2011) ldquoMedia Bias and Influence Evidence fromNewspaper Endorsementsrdquo Review of Economic Studies 78 795ndash820

Duffie D P Dworczak and H Zhu (2017) ldquoBenchmarks in Search MarketsrdquoJournal of Finance 72 1983ndash2044

Dworczak P (2017) ldquoMechanism Design with Aftermarkets Cutoff Mecha-nismsrdquo mimeo

Dworczak P and G Martini (2019) ldquoThe Simple Economics of Optimal Per-suasionrdquo Journal of Political Economy 127 993ndash2048

Edmond C (2013) ldquoInformation Manipulation Coordination and RegimeChangerdquo Review of Economic Studies 80 1422ndash1458

Egorov G S Guriev and K Sonin (2009) ldquoWhy Resource-poor DictatorsAllow Freer Media A Theory and Evidence from Panel Datardquo American PoliticalScience Review 103 645ndash668

Gehlbach S and K Sonin (2014) ldquoGovernment Control of the Mediardquo Journalof Public Economics 118 163ndash171

Gentzkow M and E Kamenica (2016) ldquoA Rothschild-Stiglitz Approach toBayesian Persuasionrdquo American Economic Review Papers amp Proceedings 106597ndash601

Gentzkow M and J Shapiro (2006) ldquoMedia Bias and Reputationrdquo Journalof Political Economy 114(2) 280ndash316

(2010) ldquoWhat Drives Media Slant Evidence from US Daily NewspapersrdquoEconometrica 78(1) 35ndash71

Ginzburg B (2019) ldquoOptimal Information Censorshiprdquo Journal of EconomicBehavior and Organization 163 377ndash385

Goldstein I and Y Leitner (2018) ldquoStress Tests and Information DisclosurerdquoJournal of Economic Theory 177 34ndash69

32 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Guo Y and E Shmaya (2019) ldquoThe Interval Structure of Optimal DisclosurerdquoEconometrica 87 653ndash675

Johnson J P and D P Myatt (2006) ldquoOn the Simple Economics of Adver-tising Marketing and Product Designrdquo American Economic Review 96 756ndash784

Kamenica E and M Gentzkow (2011) ldquoBayesian Persuasionrdquo American Eco-nomic Review 101 2590ndash2615

Kolotilin A (2015) ldquoExperimental Design to Persuaderdquo Games and EconomicBehavior 90 215ndash226

(2018) ldquoOptimal Information Disclosure A Linear Programming Ap-proachrdquo Theoretical Economics 13 607ndash636

Kolotilin A and H Li (2019) ldquoRelational Communicationrdquo mimeoKolotilin A M Li T Mylovanov and A Zapechelnyuk (2015) ldquoPer-suasion of a Privately Informed Receiverrdquo mimeo

Kolotilin A T Mylovanov A Zapechelnyuk and M Li (2017) ldquoPer-suasion of a Privately Informed Receiverrdquo Econometrica 85 1949ndash1964

Kolotilin A and A Wolitzky (2019) ldquoAssortative Information Disclosurerdquomimeo

Lewis T R and D Sappington (1994) ldquoSupplying Information to FacilitatePrice Discriminationrdquo International Economic Review 35 309ndash327

Lorentzen P (2014) ldquoChinarsquos Strategic Censorshiprdquo American Journal of Polit-ical Science 58 402ndash414

Mullainathan S and A Shleifer (2005) ldquoThe Market for Newsrdquo AmericanEconomic Review 95(4) 1031ndash1053

Orlov D P Zryumov and A Skrzypach (2018) ldquoDesign of Macro-prudentialStress Testsrdquo mimeo

Ostrovsky M and M Schwarz (2010) ldquoInformation Disclosure and Unravelingin Matching Marketsrdquo American Economic Journal Microeconomics 2 34ndash63

Prat A and D Stromberg (2013) ldquoThe Political Economy of Mass MediardquoAdvances in Economics and Econometrics 2 135ndash187

Quah J and B Strulovici (2012) ldquoAggregating the Single Crossing PropertyrdquoEconometrica 80 2333ndash2348

Rayo L and I Segal (2010) ldquoOptimal Information Disclosurerdquo Journal ofPolitical Economy 118 949ndash987

Romanyuk G and A Smolin (2019) ldquoCream Skimming and Information Designin Matching Marketsrdquo American Economic Journal Microeconomics 11 250ndash276

Suen W (2004) ldquoThe Self-Perpetuation of Biased Beliefsrdquo Economic Journal 114377ndash396

Yamashita T (2018) ldquoOptimal Public Information Disclosure by Mechanism De-signerrdquo mimeo

Zapechelnyuk A (2019) ldquoOptimal Quality Certificationrdquo American EconomicReview Insights forthcoming

2 KOLOTILIN MYLOVANOV ZAPECHELNYUK

1 Introduction

There has been a rapid growth of the literature on Bayesian persuasion and infor-mation design over the past decade The range of applications in this literature isimpressive and includes clinical trials bank stress tests school grading policies qual-ity certification advertising strategies transparency in organizations persuasion ofvoters and media control In many of these applications it is optimal to disclose in-formation about an uncertain state of the world via a censorship signal that censorsstates on one side of a cutoff and reveals states on the other side of the cutoff Fullyinformative and completely uninformative signals are special cases of censorship

We characterize necessary and sufficient conditions under which a censorship signalis optimal in the linear persuasion problem where the sender and receiverrsquos utilitiesdepend on the beliefs about the state only through the expected state We thusunify and generalize sufficient conditions scattered across the literature In additionwe establish new results for environments where the state is discrete and a signal isconstrained to be monotone partitional These environments are relevant in applica-tions and take into account constraints faced by information designers in practice1

Finally we provide monotone comparative statics results on the informativeness ofthe optimal signal

The linear persuasion problem is a tractable workhorse model in the persuasion lit-erature This problem can be described by the prior distribution of the state and thesenderrsquos indirect utility function of the expected state We consider a slightly differentbut equivalent formulation of the problem which is easier to interpret

In our model the receiver chooses one of two actions to accept or reject a proposalIf the receiver rejects the proposal the sender and receiverrsquos utilities are normalizedto zero If the receiver accepts the proposal the sender and receiverrsquos utilities dependon the unknown state of the world and the receiverrsquos private type The state and typeare independent random variables that represent respectively the receiverrsquos benefitand cost from acceptance We assume that the type is a continuous random variablebut we allow the state to be either a continuous or discrete random variable2

The senderrsquos utility is linear in the state and is not perfectly aligned with the receiverrsquosutility An important canonical case of the model is where the senderrsquos utility is aweighted sum of the receiverrsquos utility and action

To influence the receiverrsquos action the sender designs a signal that reveals informationabout the state After observing the signal realization the receiver updates his beliefsabout the state and chooses an action that maximizes his expected utility

1For example a credit rating of financial institutions is typically a monotone partition Neithernonmonotone partitions that pool high- and low-performing institutions nor stochastic disclosurerules that map an institutionrsquos financial performance to a random rating appear to be feasible

2Extending the analysis to the case where the state and type are general random variables presentstechnical difficulties but does not yield new economic insights

CENSORSHIP AS OPTIMAL PERSUASION 3

The main result shows that if the senderrsquos indirect utility is S-shaped in the expectedstate (that is convex below some threshold and concave above the threshold) thenand only then an upper-censorship signal that pools states above a cutoff and revealsstates below the cutoff is optimal for all prior distributions of the state3 In thecanonical case the main result shows that if the probability density of receiver types islog-concave then and only then upper censorship is optimal for all prior distributionsof the state and for all weights that the sender puts on the receiverrsquos utility and action4

The main result is obtained under two scenarios In the first scenario the signal isa random variable with arbitrary correlation with the state In the second scenariothe signal is constrained to be a monotone partition of the state space so that eachstate is either revealed or pooled with some adjacent states

To compare the main result under these two scenarios consider the cutoff of an opti-mal upper-censorship signal Notice that upper censorship does not specify whetherthe cutoff state is pooled or revealed If the state is a continuous random variablethen it does not matter whether this cutoff state is pooled or revealed Thus an op-timal upper-censorship signal is in fact a monotone partition However if the stateis a discrete random variable then the optimal signal is stochastic upper censorshipso that the cutoff state is generally pooled and revealed with interior probabilitiesIn contrast the optimal monotone partition is deterministic upper censorship so thatthe cutoff state is either completely pooled or fully revealed

In many applications of interest the state is discrete and thus optimal signals andoptimal monotone partitions generally differ An optimal upper-censorship signalprovides an upper bound on the value of persuasion To achieve this bound thesender should be able to commit to randomize over two signal realizations conditionalon the cutoff state which may be hard to enforce in practice In contrast monotonepartitions are relatively simple signals that can be enforced and verified ex post

The problem of finding an optimal monotone partition is a discrete optimizationproblem which cannot be solved using existing tools from the persuasion literatureNevertheless we show that if the senderrsquos indirect utility is S-shaped (or in thecanonical case the density of receiver types is log-concave) then upper censorship isan optimal monotone partition To prove this result for any monotone partition dif-ferent from upper censorship we explicitly construct a deterministic upper-censorshipsignal that is preferred by the sender

When upper censorship is optimal it is possible to perform the comparative staticsanalysis on the informativeness of the optimal signal We show that in the canonicalcase the sender optimally censors more information (i) if the sender is more biasedin that the sender puts a smaller weight on the receiverrsquos utility (ii) if the receiver is

3Analogously a lower-censorship signal that pools states below a cutoff is optimal for all priordistributions if and only if the senderrsquos indirect utility has an inverted S-shape

4Most commonly used probability densities are log-concave Log-concave densities exhibit niceproperties such as single-peakedness and the monotonicity of the hazard rate

4 KOLOTILIN MYLOVANOV ZAPECHELNYUK

easier to persuade in that the receiverrsquos cost of accepting the proposal is smaller and(iii) if the sender is more certain about the receiverrsquos preferences in that the densityof receiver types is more concentrated around its mode

We apply our results to the problem of media censorship by the government Weconsider a stylized setting with a finite number of media outlets and a continuumof heterogeneous citizens (receivers) Each media outlet is identified by its editorialpolicy that specifies an interval of endorsement states We permit the aggregate ac-tion of the citizens to affect their utility but not their optimal actions For examplean election outcome impacts all citizens but does not change their preferences overcandidates The government wishes to influence citizensrsquo actions by deciding whichmedia outlets to censor In the canonical case if the probability density of citizenstypes is log-concave then the optimal media censorship policy is to permit all suf-ficiently loyal media outlets and to censor the remaining outlets Our comparativestatics results suggest that the government should optimally increase censorship (i)if influencing society decisions becomes relatively more important than maximizingindividual welfare (ii) if the society experiences an ideology shock in favor of thegovernment and (iii) if the society becomes less diverse in ideology and taste

Related Literature The literature on Bayesian persuasion was set in motion bythe seminal papers of Rayo and Segal (2010) and Kamenica and Gentzkow (2011)The linear case which is the subject of this paper is prevalent in the literature5

Kolotilin (2018) provides necessary and sufficient prior dependent conditions for theoptimality of censorship in the case of a continuous state6 Alonso and Camara(2016b) provide sufficient conditions in the case of a discrete state In contrastour paper provides necessary and sufficient conditions that are prior independent (aswell as bias independent in the canonical case) and apply in both continuous anddiscrete cases More importantly we show that these conditions are also necessaryand sufficient for deterministic censorship to be optimal in the class of monotonepartitional signals (simple signals) There are no counterparts to this result in theliterature

Similar to us Alonso and Camara (2016b) show that the optimal signal is less infor-mative if the receiver is easier to persuade Their proof is simpler but it is valid onlywhen the senderrsquos utility is state-independent But our comparative statics resultswith respect to the senderrsquos bias and uncertainty are novel

Censorship policies commonly emerge as optimal signals in linear persuasion models ina variety of contexts starting from the prosecutor-judge example as well as lobbyingand product advertising examples in Kamenica and Gentzkow (2011) Other contextswhere censorship is optimal include grading policies (Ostrovsky and Schwarz 2010)

5Notable exceptions include Rayo and Segal (2010) Goldstein and Leitner (2018) Guo andShmaya (2019) and Kolotilin and Wolitzky (2019)

6See also Gentzkow and Kamenica (2016) Kolotilin Mylovanov Zapechelnyuk and Li (2017)and Dworczak and Martini (2019) for sufficient conditions

CENSORSHIP AS OPTIMAL PERSUASION 5

media control (Gehlbach and Sonin 2014 Ginzburg 2019) clinical trials (Kolotilin2015) voter persuasion (Alonso and Camara 2016ab) transparency benchmarks(Duffie Dworczak and Zhu 2017) stress tests (Goldstein and Leitner 2018 OrlovZryumov and Skrzypach 2018) trading mechanisms (Romanyuk and Smolin 2019Dworczak 2017 Yamashita 2018) quality certification (Zapechelnyuk 2019) andrelational communication (Kolotilin and Li 2019)

Our application to media censorship fits into the literature on media capture thataddresses the problem of media control by governments political parties or lobbyinggroups Besley and Prat (2006) pioneered this literature by studying how a gov-ernmentrsquos incentives to censor free media depend on plurality (the number of mediaoutlets) and on transaction costs of bribing the media In their model media outletsare identical and all voters have common interests Thus the governmentrsquos optimaldecision reduces to either capturing all media outlets or none Relatedly Gehlbachand Sonin (2014) consider a setting with a government-influenced monopoly mediaoutlet that exploits the trade-off between the governmentrsquos objective to ldquomobilizerdquo thepopulation for some collective goal and to collect the revenue from subscribers whodemand informative news Gehlbach and Sonin (2014) show that the presence of theldquomobilizationrdquo objective increases the news bias whereas the subscription revenuesreduce the bias but can cause the government to nationalize the media outlet7

The models of Besley and Prat (2006) and Gehlbach and Sonin (2014) have two statesof the world and either a single media outlet or a few identical media outlets Hencethe government uses the same censorship policy for each outlet We are the first toconsider a richer model of media censorship with a continuum of states and multipleheterogeneous media outlets As a result the government optimally discriminatesmedia outlets by permitting sufficiently loyal ones and banning the remaining ones

2 Model

21 Setup There are two players a sender (she) and a receiver (he) The receiverchooses whether to accept a proposal (a = 1) or reject it (a = 0) The proposalhas an uncertain value ω isin [0 1] By accepting the proposal the receiver forgoesan outside option worth r isin [0 1] so the receiverrsquos utility is a(ω minus r) The senderrsquosutility is av(ω r) where v(ω r) is linear in ω and continuously differentiable in rWe will refer to ω as state and to r as type and denote their distributions by F andG respectively Throughout the paper we assume that distribution G of r has acontinuously differentiable and strictly positive density g8

As in many applications the state is either a continuous or discrete random variablewe will separately analyze these two cases We say that the state ω and distribution F

7In different contexts media control by a government has also been studied by Egorov Gurievand Sonin (2009) Edmond (2013) and Lorentzen (2014) See also the overview of the literature onmedia capture slant and transparency in Prat and Stromberg (2013)

8This assumption is made for clarity of exposition The results can be extended to general G

6 KOLOTILIN MYLOVANOV ZAPECHELNYUK

are continuous if F has a density f on [0 1] We say that the state ω and distributionF are discrete if the support of F is a finite set

A particularly tractable and intuitive special case is where the senderrsquos utility is aweighted sum of the receiverrsquos utility and action We call this the canonical caseFormally in the canonical case the senderrsquos utility when the proposal is accepted is

v (ω r) = 1 + ρ(ω minus r) ρ isin R (A1)

That is the sender is biased towards a = 1 but also puts a weight ρ on the receiverrsquosutility In particular if the weight ρ is large then the sender and receiverrsquos inter-ests are aligned whereas if the weight is zero then the sender cares only about thereceiverrsquos action

The receiver privately knows his type but he does not observe the state The sendercan influence the action taken by the receiver a = 1 or a = 0 by releasing a signalthat reveals information about the state A signal is a random variable s isin [0 1]that is independent of r but is possibly correlated with ω For example s is fullyinformative if it is perfectly correlated with ω and s is completely uninformative ifit is independent of ω

The timing is as follows First the sender publicly chooses a signal s Then realiza-tions of ω r and s are drawn Finally the receiver observes the realizations of histype r and the signal s and then chooses between a = 0 and a = 1

We consider two scenarios

A The sender can choose any signal

B The sender can choose any monotone partitional signal

Formally a signal s is a monotone partitional signal if there exists a nondecreasingfunction ξ(ω) such that s = ξ(ω) for each ω Every such signal induces a partitionof the state space [0 1] into intervals and singletons and the receiver observes thepartition element that contains the state

Under Scenario A we are interested in an optimal signal that maximizes the senderrsquosexpected utility among all signals This is a standard persuasion problem

Under Scenario B we are interested in an optimal monotone partition that maximizesthe senderrsquos expected utility among all monotone partitional signals This scenarioincorporates constraints that information designers often face in practice For ex-ample a non-monotone grading policy that gives better grades to worse performingstudents will be perceived as unfair and will be manipulated by strategic students

22 Upper Censorship A subset of signals called upper censorship will play aspecial role in this paper

CENSORSHIP AS OPTIMAL PERSUASION 7

An upper-censorship signal reveals states below a specified cutoff and pools statesabove this cutoff When the state is equal to the cutoff it can be revealed or pooledIf the state is continuous it does not matter whether the cutoff state is revealed orpooled because this is a zero probability event However if the state is discretewe will distinguish between deterministic and stochastic upper-censorship signalsdepending on what happens at the cutoff state

A lower-censorship signal is defined symmetrically it pools states below a specifiedcutoff and reveals states above this cutoff Because the case of lower censorshipis symmetric (see Remark 1 below) for clarity of exposition we focus on uppercensorship throughout the paper

A signal s is stochastic upper censorship if there exists a cutoff pair (ωlowast qlowast) consistingof a cutoff state ωlowast isin [0 1] and probability qlowast isin [0 1] such that states below ωlowast arerevealed states above ωlowast are pooled and the state ω = ωlowast is revealed with probabilityqlowast and pooled with probability 1minus qlowast For example s can be expressed as

s =

983099983105983103

983105983101

ω with probability one if ω lt ωlowast

ωlowast and mlowast with probabilities qlowast and 1minus qlowast if ω = ωlowast

mlowast with probability one if ω gt ωlowast

where

mlowast =

983125(ωlowast1]

ωdF (ω) + ωlowast(1minus qlowast) Pr[ω = ωlowast]983125(ωlowast1]

dF (ω) + (1minus qlowast) Pr[ω = ωlowast](1)

is the posterior expected state induced by the pooling signal Notice that mlowast is afunction of ωlowast and qlowast

A stochastic upper-censorship signal s with a cutoff pair (ωlowast qlowast) is deterministic uppercensorship if and only if qlowast isin 0 1 in which case s can be expressed as a monotonepartition of the state space [0 1] For example if qlowast = 0 then the signal s fullyreveals states ω lt ωlowast and pools states ω ge ωlowast so s can be expressed as

s =

983083ω if ω lt ωlowast

mlowast if ω ge ωlowast

wheremlowast = E[ω|ω ge ωlowast]

is the expected state conditional on being at least ωlowast Note that the fully informativesignal and the completely uninformative signal are deterministic upper-censorshipsignals with (ωlowast qlowast) = (1 1) and (ωlowast qlowast) = (0 0) respectively

23 Benchmark To illustrate the difference between the cases of a continuous anddiscrete state as well as between Scenarios A and B we solve a benchmark examplewhere there is no uncertainty about the receiverrsquos type

Let the state ω have the expected value E[ω] = 12 and let the receiverrsquos type r be

known to the sender and satisfy r isin98304312 1983044 In addition suppose that the sender wishes

8 KOLOTILIN MYLOVANOV ZAPECHELNYUK

to maximize the probability that the receiver accepts the proposal so v(ω r) = 1 forall ω and all r

Observe that if the sender reveals no information about ω the receiver will evaluate ωby its expected value of 1

2 thus having utility 1

2minus r lt 0 from accepting the proposal

So revealing no information makes the receiver reject the proposal with certaintyThe sender can do better by fully revealing ω in which case the proposal is acceptedwith probability Pr[ω ge r]

However the sender can do even better by pooling states above r with states below rwhile keeping the expected state induced by the pooling signal at least r Thus thereceiver who is unable to distinguish between the states in the pool will accept theproposal at all of these states

To illustrate the case of the continuous state suppose that ω is uniformly distributedon [0 1] The largest pool that maintains the posterior expectation at least r is theinterval [2r minus 1 1] Indeed once the receiver learns that ω isin [2r minus 1 1] given theuniform prior of ω the posterior expected state is r So the proposal is accepted whenω ge 2rminus 1 and rejected when ω lt 2rminus 19 Thus the deterministic upper-censorshipsignal with cutoff ωlowast = 2r minus 1 is optimal The senderrsquos expected utility is equal to

Pr[proposal is accepted] = Pr[ω ge 2r minus 1] = 2(1minus r)

which is twice as high as that from the fully informative signal

Pr[proposal is accepted] = Pr[ω ge r] = 1minus r

When ω is a discrete random variable an optimal way to reveal information aboutω takes the form of stochastic upper censorship For illustration suppose that ω canonly be 0 or 1 equally likely Now simply pooling states is not helpful for the senderPooling 0 and 1 yields the posterior expected state of 1

2 which is smaller than the

type r resulting in a rejection of the proposal Including any other states in thepool makes no difference as these states never occur Thus an optimal monotonepartition is the fully informative signal and the probability that the receiver acceptsthe proposal is simply equal to the probability that ω = 1

Pr[proposal is accepted] = Pr[ω ge r] = Pr[ω = 1] =1

2

The sender can do strictly better by partial (stochastic) pooling If ω = 1 let thesignal realization be ldquoacceptrdquo with certainty If ω = 0 let the signal realization beldquorejectrdquo with some probability q and ldquoacceptrdquo with probability 1 minus q So when thereceiver observes ldquoacceptrdquo the posterior expected state is

E[ω|ldquoacceptrdquo] = Pr[ω = 1]

Pr[ω = 1] + Pr[ω = 0] middot (1minus q)=

12

12 + 12 middot (1minus q)=

1

2minus q

9Whether states strictly below ωlowast are revealed or pooled among themselves does not matterbecause they are all strictly below r and thus induce the receiver to take action a = 0

CENSORSHIP AS OPTIMAL PERSUASION 9

The sender now wishes to find the lowest q subject to 1(2 minus q) ge r which yieldsqlowast = 2 minus 1

r Thus stochastic upper censorship with ωlowast = 0 and qlowast = 2 minus 1

ris

optimal The probability that the receiver accepts the proposal is exactly equal tothe probability that the signal realization is ldquoacceptrdquo

Pr[proposal is accepted] = Pr[ω = 1] + Pr[ω = 0] middot (1minus qlowast) =1

2+

1

2middot 1minus r

r=

1

2r

We now summarize the insights obtained in this example If the state is continuousthen an optimal signal is deterministic upper censorship If the state is discrete thenan optimal monotone partition is deterministic upper censorship and an optimalsignal is stochastic upper censorship10 In the remainder of the paper we will showthat these insights extend to the case where the receiverrsquos type is uncertain

3 Continuous State

In this section we provide necessary and sufficient conditions for the optimality ofupper censorship when the state is continuous

Let m be the expected state conditional on observing a realization of a signal s Sincethe sender and receiverrsquos utilities are linear in ω they depend on the information aboutω revealed by signal s only through the expected state m In particular the receiverchooses a = 1 if and only if r le m

Let V (m) denote the indirect utility of the sender conditional on m

V (m) =

983133

rlem

E [v(ω r)|m] g(r)dr =

983133 m

0

v(m r)g(r)dr m isin [0 1] (2)

where E [v(ω r)|m] = v(m r) by the linearity of v in ω

A function V is said to be S-shaped if it is convex below some threshold and concaveabove that threshold or equivalently if V primeprime is single-crossing from above

there exists τ such that V primeprime(m) ge (le) 0 for all m lt (gt) τ

We now provide the criterion for the optimality of upper censorship Recall thatmlowast = E[ω|ω ge ωlowast]

Theorem 1 Let V be S-shaped Then and only then for all continuous F anoptimal signal is deterministic upper censorship whose cutoff state ωlowast isin [0 τ ] satisfies

V (mlowast) + V prime(mlowast)(mminusmlowast) ge (le)V (m) for all m ge (lt)ωlowast (3)

The criterion for the optimality of lower censorship is analogous

10See for example Kolotilin (2015) for a detailed treatment of the case where there is no uncer-tainty about the receiverrsquos type

10 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Remark 1 Let minusV be S-shaped Then and only then for all continuous F anoptimal signal is deterministic lower censorship whose cutoff state ωlowast isin [τ 1] satisfies

V (mlowast) + V prime(mlowast)(mminusmlowast) ge (le)V (m) for all m le (gt)ωlowast

Theorem 1 states that optimal persuasion takes a simple form of upper censorshipwhen V is S-shaped11 Moreover the theorem establishes that this result is tight inthe sense that if V is not S-shaped then there exists a distribution F of the statesuch that no upper-censorship signal is optimal

Note that every upper-censorship signal is a monotone partition Therefore Theorem1 applies in both Scenario A where the sender is free to choose any signal andScenario B where the sender is constrained to choose a monotone partition

The intuition behind Theorem 1 is as follows Observe that when no informationabout the state is revealed the receiverrsquos best-response action does not change withthe state The more information is revealed the more variable the receiverrsquos behaviorwill be in response to this information Consider an interval where V is concave Thesender would prefer not to reveal any information when the state is in this intervalsince a certain outcome is preferred to any lottery with the same expected stateConversely consider an interval where V is convex The sender would prefer to fullyreveal the state in this interval since now lotteries are preferred If V is S-shapedthat is it is convex below some threshold and concave above that threshold theinduced optimal persuasion takes the form of upper censorship

When V is S-shaped the senderrsquos optimization problem is reduced to finding anoptimal censorship cutoff ωlowast If the realized state ω is below the cutoff then it isrevealed to the receiver so the expected utility of the sender is V (ω) If the realizedstate ω is above the cutoff then the posterior expected state is mlowast = E[ω|ω ge ωlowast] sothe expected utility of the sender conditional on ω ge ωlowast is V (mlowast) The sender thusneeds to solve the problem

maxωlowastisin[01]

983133 ωlowast

0

V (ω)f(ω)dω +

983133 1

ωlowastV (mlowast)f(ω)dω (4)

The expression in (3) represents the first-order condition to this problem and capturesthree possible cases the boundary solutions ωlowast = 0 and ωlowast = 1 if the expression in(3) has the same sign for all ω and an interior solution ωlowast such that

V (mlowast) + V prime(mlowast)(ωlowast minusmlowast) = V (ωlowast) (5)

This first-order condition (5) is illustrated by Figure 1 The solid line is V (m) andthe dashed line is V (mlowast) + V prime(mlowast)(mminusmlowast) which is tangent to V at mlowast

11For a fixed distribution F Proposition 3 in Kolotilin (2018) implies that upper censorship withcutoff ωlowast is optimal if and only if (3) holds and V (m) is convex on [0ωlowast] This result can be used toshow that upper censorship is optimal when V is S-shaped For completeness we include a simpleself-contained proof inspired by the proof of Theorem 1 in Dworczak and Martini (2019)

CENSORSHIP AS OPTIMAL PERSUASION 11

1ωlowast

V

revealing pooling

mlowast0

Figure 1 Optimal upper censorship with cutoff ωlowast

The solution of the senderrsquos problem (4) is particularly simple if V is either globallyconvex or globally concave In this case the first-order condition in (3) has a constantsign so either ωlowast = 0 or ωlowast = 1 must be optimal If V is convex then ωlowast = 1 isoptimal which corresponds to the fully informative signal Similarly if V is concavethen ωlowast = 0 is optimal which corresponds to the completely uninformative signalThis is summarized in the following corollary

Corollary 1 An optimal signal is

(i) fully informative for all F if and only if V is convex

(ii) completely uninformative for all F if and only if V is concave

31 Canonical Case We now consider the canonical case where the senderrsquos utilitysatisfies assumption (A1)

The density g of receiver types r is said to be log-concave if ln g(r) is concave in r12

Note that ln g(r) is well defined in the canonical case

In this case the shape of the senderrsquos indirect utility V defined by (9) is connectedto the shape of the density g of receiver types as follows

Lemma 1 In the canonical case V is S-shaped for all ρ if and only if g is log-concave

12Table 1 in Bagnoli and Bergstrom (2005) reports distributions with log-concave densities

12 KOLOTILIN MYLOVANOV ZAPECHELNYUK

That is if the density of receiver types g is log-concave then the senderrsquos indirectutility V is S-shaped Moreover this result is tight in the sense that if g is notlog-concave then there exists ρ such that V is not S-shaped

By Theorem 1 and Lemma 1 we obtain a criterion for the optimality of upper cen-sorship with the condition on the primitive of the model the density g

Theorem 2 Consider the canonical case and let the density g be log-concave Thenand only then an optimal signal is deterministic upper censorship for all F and all ρ

The symmetric statement is also true an optimal signal is lower censorship for all Fand all ρ if and only if minusg is log-concave Consequently if both g and minusg are log-concave (that is g is exponential) then there exists an optimal signal which is bothupper censorship and lower censorship There are only two signals with this propertyfully informative and completely uninformative This allows us to obtain conditionson the distribution of receiver types under which the optimal signal is polarizedbetween fully informative and completely uninformative signals as for example inLewis and Sappington (1994) and Johnson and Myatt (2006)13

Corollary 2 In the canonical case an optimal signal is either fully informative orcompletely uninformative for all F and all ρ if and only if there exist λ isin R and c gt 0such that g(r) = ceminusλr for r isin [0 1]

If g(r) = ceminusλr then the fully informative signal is optimal whenever ρ ge λ and thecompletely uninformative signal is optimal whenever ρ le λ (and any signal is optimalwhen ρ = λ) In particular if ρ = 0 the optimal signal is fully determined by thesign of λ which is in turn determined by whether the mean of r is smaller or greaterthan 1

2

32 Comparative Statics Theorem 2 allows for a sharp comparative statics anal-ysis on the informativeness of the optimal signal

We compare signals by their Blackwell informativeness (Blackwell 1953) To compareupper-censorship signals s1 and s2 we only need to compare their cutoffs ωlowast

1 and ωlowast2

Signal s1 is more informative than signal s2 if ωlowast1 ge ωlowast

2 Indeed state ω isin [0ωlowast2) is

fully revealed by both s1 and s2 and state ω isin [ωlowast2 1] is partially revealed by s1 and

not revealed at all by s2 so s1 is more informative than s2

13In the case of log-concave density g the sets of upper- and lower-censorship signals are totallyordered by the rotation order of Johnson and Myatt (2006) If we restrict attention to the set oflower-censorship signals (which are generally suboptimal under log-concave g) then the rotationpoint is decreasing with respect to the rotation order By Lemma 1 of Johnson and Myatt (2006)the senderrsquos expected utility is quasiconvex and one of the extreme lower-censorship signals (thefully informative or completely uninformative signal) is optimal for the sender But if we restrictattention to the set of upper-censorship signals (which are optimal under log-concave g) then therotation point is not decreasing with respect to the rotation order Therefore Lemma 1 of Johnsonand Myatt (2006) does not apply and an interior upper-censorship signal is generally optimal forthe sender

CENSORSHIP AS OPTIMAL PERSUASION 13

For the purpose of comparison we extend the definition of distribution function Gto the real line and assume that its density g is log-concave Consider a family ofdistributions Gtσ of receiver types

Gtσ (r) = G

983061τ minus t+

r minus τ

σ

983062

where τ is the maximum point of g and t isin R and σ gt 0 are parameters Let gtσ (r)denote the corresponding density Note that gtσ (r) is log-concave for all t isin R andall σ gt 0

Because gtσ is log-concave on [0 1] deterministic upper censorship is optimal byTheorem 2 Let ωlowast (ρ t σ) isin [0 1] be the optimal censorship cutoff as given by (3)

The shift parameter t shifts the distribution along the horizontal axis and the stretchparameter σ stretches the distribution horizontally and symmetrically around themode τ so the limit of σ rarr 0 leads to the unit mass on τ We now show that thesender optimally discloses more information when

(i) the senderrsquos preferences are better aligned with the receiverrsquos preferences (thealignment parameter ρ is greater)

(ii) the receiver is more reluctant to accept the proposal (the shift parameter t isgreater)

(iii) the sender is more uncertain about the receiverrsquos type (the stretch parameter σis greater) provided ρ ge 014

Theorem 3 In the canonical case a cutoff state of an optimal signal satisfies

(i) ωlowast (ρ t σ) is increasing in ρ

(ii) ωlowast (ρ t σ) is increasing in t

(iii) ωlowast (ρ t σ) is increasing in σ if ρ ge 0

The intuition for part (i) is that for a higher ρ the sender puts more weight onthe receiverrsquos utility so she optimally endows the receiver with a higher utility byproviding more information

The intuition for part (ii) is that for a higher t each type of the receiver has a greatercost of accepting the proposal so to persuade the same type of the receiver the senderneeds to increase E[ω|ω ge ωlowast] by expanding the full disclosure interval [0ωlowast]

The intuition for part (iii) is that for a higher σ receiver types are more spread out soto persuade the same mass of types the sender again needs to increase E[ω|ω ge ωlowast]

14The case ρ ge 0 is the practically relevant case where the senderrsquos utility is a weighted averageof the receiverrsquos action and utility 1

1+ρv(ω r) =1

1+ρ + ρ1+ρ (ω minus r)

14 KOLOTILIN MYLOVANOV ZAPECHELNYUK

4 Discrete State

In this section we assume that the state is a discrete random variable

Theorem 1prime Let V be S-shaped Then and only then for all discrete F

(A) an optimal signal is stochastic upper censorship whose cutoff pair (ωlowast qlowast) satisfiescondition (3)

(B) an optimal monotone partition is deterministic upper censorship whose cutoff pair(ωlowast

d qlowastd) satisfies ωlowast

d = ωlowast and qlowastd isin 0 1

In the canonical case we obtain the result analogous to Theorem 2

Theorem 2prime Consider the canonical case and let the density g be log-concave Thenand only then for all discrete F and all ρ

(A) an optimal signal is stochastic upper censorship

(B) an optimal monotone partition is deterministic upper censorship

Let us discuss how the comparative statics result (Theorem 3) changes In Section 32we argue that upper-censorship signals can be ordered by their censorship cutoffs Butwhen the state is discrete a censorship cutoff is described by a pair (ωlowast qlowast) where ωlowast

is a cutoff state and qlowast is the probability of revealing this cutoff state when it realizes

Consider two stochastic upper-censorship signals s1 and s2 with cutoff pairs (ωlowast1 q

lowast1)

and (ωlowast2 q

lowast2) Denote (ωlowast

1 qlowast1) ≽ (ωlowast

2 qlowast2) if ω

lowast1 gt ωlowast

2 or ωlowast1 = ωlowast

2 and qlowast1 ge qlowast2 Observethat s1 is more (Blackwell) informative than s2 if and only if (ωlowast

1 qlowast1) ≽ (ωlowast

2 qlowast2) This

comparison also applies to deterministic upper-censorship signals with the constraintthat qlowast1 and qlowast2 are in 0 1

With the above order on cutoff pairs Theorem 3 extend as follows

Theorem 3prime Consider the canonical case

(A) A cutoff pair of an optimal signal satisfies

(i) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in ρ

(ii) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in t

(iii) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in σ if ρ ge 0

(B) A cutoff pair of an optimal monotone partition satisfies15

(i) (ωlowastd (ρ t σ) q

lowastd (ρ t σ)) is increasing in ρ

(ii) (ωlowastd (ρ t σ) q

lowastd (ρ t σ)) is increasing in t

15Part (iii) also holds for an optimal monotone partition when for example the discrete supportof the state ω is a regular grid with sufficiently many points However it is possible to construct acounterexample where part (iii) fails for an optimal monotone partition

CENSORSHIP AS OPTIMAL PERSUASION 15

That is the sender optimally discloses more information when she is less biasedrelative to the receiver (the alignment parameter ρ is greater) when the receiver ismore reluctant to accept the proposal (the shift parameter t is greater) and underScenario A when the sender is more uncertain about the receiverrsquos type (the stretchparameter σ is greater)

Remark 2 Theorem 1prime and Theorem 2prime show that upper censorship emerges as anoptimal persuasion strategy under the same conditions (V is S-shaped in Theorem1prime and g is log-concave in Theorem 2prime) regardless of whether the sender can choosean arbitrary signal or is constrained to choose a monotone partition It may betempting to conjecture that if the sender was allowed to choose an arbitrary partition(not necessarily monotone) the same result would hold This however is not true

For illustration suppose that the sender wishes to maximize the probability of a = 1and the receiver has type r = 1

2with certainty16 Let the state ω take values in

0 14 1 with probabilities (1

6 23 16) so the prior mean is 1

3

An optimal signal is stochastic upper censorship with cutoff pair (ωlowast qlowast) = (14 12)

which sends the pooling message with certainty at ω = 1 and with probability 1minusqlowast =12at ω = 1

4 and reveals the state otherwise The pooling message induces the posterior

mean m = 12 leading to a = 1 all other messages induce posterior means below 1

2

leading to a = 0 Under the optimal signal action a = 1 is chosen with probability23middot 12+ 1

6= 1

2

An optimal monotone partition is deterministic upper censorship that fully revealsthe state where only ω = 1 leads to a = 1 Under this partition action a = 1 ischosen with probability 1

6

An optimal partition pools the extreme states ω = 0 and ω = 1 and reveals theintermediate state ω = 1

4 The pooling message induces the posterior mean m = 1

2

leading to a = 1 This optimal partition is nonmonotone Under this partitionaction a = 1 is chosen with probability 1

6+ 1

6= 1

3 which is strictly better than the

probability of 16obtained under the optimal monotone partition

5 Application to Media Censorship

In this section we apply our results to the problem of media censorship by thegovernment In the modern world people obtain information about the governmentrsquosstate through various media sources such as television newspapers and internetblogs Without the media most people would not know what policies and reformsthe government pursues and how effective they are Media outlets have differentpositions on the political spectrum and differ substantially in how they select andpresent facts to cover the same news People choose their sources of information based

16In this case V (m) = 1 if m ge 12 and V (m) = 0 if m lt 1

2 Observe that the graph of V is

S-shaped and it can be approximated by a continuously differentiable S-shaped function

16 KOLOTILIN MYLOVANOV ZAPECHELNYUK

on their political ideology and socioeconomic status This information is valuable forsignificant individual decisions on migration investment and voting to name a fewIndividuals may not fully internalize externalities that their decisions impose on thesociety Likewise the government may not have the societyrsquos best interest at heartTo further its goals the government then wishes to influence individual decisions bymanipulating the information through media In autocracies and countries with weakchecks and balances the government has power to censor the media content

The governmentrsquos problem of media censorship can be represented as the persua-sion problem in Section 2 We apply our results to provide conditions for the op-timality of upper-censorship policies that censor all media outlets except the mostpro-government ones Furthermore we interpret our comparative statics results

51 Setup There is a continuum of heterogeneous citizens indexed by r isin [0 1]distributed with G Each citizen chooses between a = 0 and a = 1 The utility of acitizen of type r is given by

u(θ r ar a) = (θ minus r)ar + κ(θ r a)

where ar isin 0 1 denotes the citizenrsquos own action a =983125ardG(r) denotes the aggre-

gate action in the society θ isin [0 1] captures an unobserved benefit from action 1 ascompared to action 0 and κ captures the impact of the aggregate action a on thecitizenrsquos utility The term (θ minus r)ar is a private surplus of a citizen of type r Theterm κ(θ r a) is an externality because for a citizen of type r it is optimal to ignorethis term and choose ar = 1 if and only if θ ge r

There is a government which is concerned with a weighted average of the social utilityand the governmentrsquos intrinsic benefit from the aggregate action For a given θ thegovernmentrsquos utility is given by

983133 1

0

ν(θ r ar a)dG(r) + δγ(θ a)

The term ν captures a citizenrsquos utility from the governmentrsquos perspective We allow νto be different from u to reflect paternalistic or other concerns The term γ capturesthe governmentrsquos intrinsic benefit from the aggregate action The parameter δ ge 0captures the weight of the aggregate action in the governmentrsquos utility

Let T be a distribution of the random variable θ We assume that distributions G andT are independent and admit continuously differentiable and strictly positive densi-ties We also assume that κ ν and γ are linear in θ and continuously differentiablein r and a Furthermore to simplify interpretations we assume that ν and γ arenon-decreasing in θ and a so that for the government a high θ is a good news anda higher aggregate action is preferable

Citizens obtain information about the unobservable benefit θ through media outletsEach media outlet is identified by its editorial policy c isin [0 1] and it endorses action

CENSORSHIP AS OPTIMAL PERSUASION 17

a = 1 if θ ge c and criticizes it if θ lt c17 A set C of media outlets is a finite subsetof [0 1]18

The governmentrsquos censorship policy is a set of the media outlets X sub C that arepermitted to broadcast so the rest of the media outlets are censored

The timing is as follows First the government chooses a set X sub C of permittedmedia outlets Second state θ is realized and each permitted media outlet endorsesor criticizes action a = 1 according to its editorial policy Finally each citizen observesmessages from all permitted media outlets updates his beliefs about θ and choosesan action

52 Discussion We now discuss interpretations of the key components of the mediacensorship application As in Gehlbach and Sonin (2014) there can be various inter-pretations of the citizenrsquos action a = 1 such as voting for the government supportinga governmentrsquos policy or taking an individual decision that benefits the governmentA citizenrsquos type r can be interpreted as his ideological position or preference param-eter A citizen who is more supportive of the government has a smaller r

A media outlet with a higher editorial policy c can be interpreted as being less loyalto the government because it criticizes the government on a larger set of states Aneditorial policy c isin C can therefore represent a slant or political bias of the outletagainst the government and can be empirically measured as the frequency with whichthe outlet uses anti-government language Gentzkow and Shapiro (2010) constructsuch a slant index for US newspapers Empirical findings of their paper suggestthat the editorial policies of media outlets are driven by reader preferences justifyingour assumption of the existence of a large variety of editorial policies19 As in Suen(2004) Chan and Suen (2008) and Chiang and Knight (2011) the assumption ofthe binary media reports that communicate only whether the state θ is above somestandard c can be justified by a cursory readerrsquos preference for simple messages suchas positive or negative opinions and yes or no recommendations

The governmentrsquos censorship of media outlets can take various forms For examplethe government can ban access to internet sites withdraw licenses disrupt financingconfiscate print materials and equipment discredit media providers and even arresteditors and journalists using broadly formulated legislation on combating extremismIn some countries the government can exercise direct control over media editorialpolicies either through state ownership or administrative pressure

17The tie-breaking in the event of θ = c is unimportant as θ is a continuous random variable18We discuss the case of a continuum of media outlets in Section 5519Theoretical literature has explored the determinants of media slant of an outlet driven by its

citizens (Mullainathan and Shleifer 2005 Gentzkow and Shapiro 2006 and Chan and Suen 2008)and its owners (Baron 2006 and Besley and Prat 2006)

18 KOLOTILIN MYLOVANOV ZAPECHELNYUK

53 Formulating Media Censorship as Persuasion We now show that the me-dia censorship problem can be formulated as a linear persuasion problem in whichthe government is a sender and a representative citizen is a receiver

In our model the citizensrsquo and the governmentrsquos utilities are linear in θ There-fore given any information from the media outlets the utilities depend only on theposterior mean of θ

Consider an arbitrary posterior mean of θ denoted by m isin [0 1] Each citizen of typer chooses ar = 1 if and only if r le m The externality term κ plays no role in thisdecision Therefore the aggregate action a is simply the mass of all citizens whosetypes do not exceed m so a = G(m)

Next using the citizensrsquo optimal behavior we derive the governmentrsquos expected utilityconditional on the posterior mean m

V (m) = E983063983133 1

0

ν(θ r ar G(m))dG(r) + δγ(θ G(m))

983055983055983055983055m983064

=

983133 1

0

ν(m r1rlem G(m))dG(r) + δγ(mG(m)) (6)

This is the governmentrsquos indirect utility As in Section 2 the government now choosesa signal which is informative about the state to maximize its expected utility How-ever in contrast to Section 2 here the government is restricted to signals that areimplementable by a subset of a given set of media outlets

Next we show that with an appropriate definition of the state the restriction tosignals implementable by a subset of a given set of media outlets can be formulatedas a restriction to monotone partitional signals as in Scenario B in Section 2 Thuswe will be able to use our results in Section 4 to find the optimal censorship policies

Recall that the information about θ is only available through the media outlets Let ωdenote a random variable equal to the conditional expectation of θ given the messagesof all media outlets in C Note that as C is finite ω is a discrete random variablewith values in [0 1] Let F denote the distribution of ω From now on we treat therandom variable ω as the state and denote by s a signal as defined in Section 2

Let HM denote the set of all distributions of the posterior mean state induced byall monotone partitional signals and let HC denote the set of all distributions ofthe posterior mean state induced by all finite subsets of the set C of media outletsObserve that any set of media outlets induces a monotone partitional signal thatis HC sub HM We now show that every outcome implementable by an arbitrarymonotone partition in HM is also implementable by a subset of media outlets

Lemma 2 HC = HM

For illustration suppose that there is only one media outlet with the editorial policyc isin (0 1) There are two observable events θ ge c and θ lt c In each realized event

CENSORSHIP AS OPTIMAL PERSUASION 19

the citizens compute the posterior expectation of θ which we will call ω So in thisexample the distribution F of ω has two mass points one for each of the two events

Given these two mass points every monotone partition of [0 1] induces one of twopossible distributions of the posterior mean state (i) the mass points revealed and (ii)the mass points pooled Observe however that these distributions are implementableby permitting and censoring the media outlet respectively

A censorship policy is upper censorship if it censors all sufficiently disloyal mediaoutlets Specifically there exists a cutoff clowast isin C and an indicator qlowast isin 0 1 suchthat all media outlets whose editorial policies are below clowast are permitted all mediaoutlets whose editorial policies are above clowast are censored and clowast is permitted if andonly if qlowast = 1 That is X = c isin C c le clowast if qlowast = 1 and X = c isin C c lt clowastif qlowast = 0 Note that the full censorship policy clowast = 0 and qlowast = 0 (where all mediaoutlets are censored) and the free media policy clowast = 1 and qlowast = 1 (where all mediaoutlets are permitted) are the two extreme upper-censorship policies

As shown above the media censorship problem is equivalent to the linear persuasionproblem in which the sender is restricted to monotone partitional signals and thesenderrsquos indirect utility V is given by (6) We thus apply Theorem 1prime(B) to obtainthe following result

Theorem 1primeprime If V is S-shaped then an optimal censorship policy is upper censorship

To illustrate Theorem 1primeprime suppose the government is interested only in the aggregateaction so that ν = 0 and γ depends only on the aggregate action a so V (m) =γ(G(m)) Thus an upper-censorship policy is optimal if the composition functionγ(G(middot)) is S-shaped For example this condition holds if the government is interestedin reaching a certain approval threshold (eg a simple majority) so that γ is a stepfunction This condition also holds if γ is S-shaped and G is uniform or S-shapedwith the same inflection point as γ20

54 Canonical Case We now impose more structure on the utilities to obtain asharper result for the optimality of upper censorship and to perform a comparativestatics analysis Analogously to assumption (A1) we assume that the governmentrsquosutility is a weighted average of the citizensrsquo utility and their aggregate action

983133 1

0

u(θ r ar a)dG(r) + δa

where u(θ r ar a) = (θ minus r)ar + ζ(r)a

(A2)

for some continuously differentiable function ζ Define

β =

983133 1

0

ζ(r)dG(r) + δ

20This condition holds in many special cases where γ and G are S-shaped even when γ and Ghave different inflection points

20 KOLOTILIN MYLOVANOV ZAPECHELNYUK

The term983125 1

0ζ(r)dG(r) is the governmentrsquos expected bias towards the citizensrsquo action

a = 1 due to the citizensrsquo externality and the term δ is the governmentrsquos intrinsicbias towards a greater aggregate action a Thus we interpret β as the governmentrsquosaggregate bias This decomposition of the bias allows for different interpretations whythe government is biased So β can be high because the government is too self-serving(high δ) or because the government is benevolent (low δ) but wishes to internalizestrong positive externalities of the citizens (high ζ(r)) To ease interpretations weassume that β gt 021

Under Assumption (A2) the governmentrsquos indirect utility V given by (6) becomes

V (m) =

983133 m

0

(mminus r)dG(r) + βa =

983133 m

0

v(m r)dG(r)

where

v(m r) = β

9830611 +

1

β(mminus r)

983062

This is the same as v given by Assumption (A1) with ρ = βminus1 up to rescaling by apositive constant Consequently we can apply Theorem 2prime(B) to obtain the followingresult

Theorem 2primeprime Let (A2) hold If the density g of citizens types is log-concave then anoptimal censorship policy is upper censorship

We now apply the comparative statics analysis presented in Section 32 Note thatthe upper-censorship policies are ordered according to the amount of informationtransmitted to the citizens in the sense of Blackwell (1953) A greater censorshipthreshold clowast means that more media outlets are permitted With this order in mindwe apply Theorem 3prime(B) to make a comparative statics analysis on the amount ofinformation that is optimally disclosed by the government

First the censorship cutoff is increasing in the alignment parameter ρ Recall thatthis is a reciprocal of the governmentrsquos bias ρ = βminus1 This means that the governmentoptimally discloses more information (the censorship cutoff is greater) when it is lessbiased (β is smaller) Intuitively as β decreases the government puts more weighton the citizensrsquo utility so it optimally endows the population with a higher utility bycensoring fewer media outlets and disclosing more information

Second the censorship cutoff is increasing in the magnitude t of the horizontal shiftof the density g A greater t corresponds to a greater opportunity cost of actiona = 1 for each citizen This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more difficultto persuade to take action a = 1 (parameter t is greater) Informally speaking topersuade the same type of the citizen the government needs to increase the posterior

21If β lt 0 then swapping the roles of a = 0 and a = 1 reverses the sign of the bias If β = 0then the governmentrsquos utility and the citizensrsquo private interests coincide so it is trivially optimal topermit all media outlets

CENSORSHIP AS OPTIMAL PERSUASION 21

mean But the expected posterior mean must be equal to the prior mean so it is notpossible to increase all posterior means Due to the log-concave shape of the densityof the citizensrsquo types this tradeoff is resolved by increasing the posterior mean of thepooling interval E[ω|ω ge clowast] This is done by shrinking the pooling interval [clowast 1]that is increasing the censorship cutoff clowast

Third when the number of media outlets is sufficiently large the censorship cutoffis increasing in the stretch factor σ of the density g around its peak τ (see Footnote15) A greater σ corresponds to a more dispersed distribution of the opportunitycost in the population This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more diverse inhow difficult they are to persuade Intuitively when the types are more spread outpersuading the same mass of types requires a greater posterior mean which leads toa higher censorship cutoff similarly to the comparative statics in t discussed above

55 Extensions and Open Questions Let us now consider a few extensions ofour model of media censorship

In our model the set of media outlets is exogenous and the governmentrsquos only in-strument is censorship We now consider three alternative ways of expanding thegovernmentrsquos instruments of influence

First suppose that the government is able not only to censor existing media outletsbut also to introduce new media outlets with chosen editorial policies This is equiv-alent to our censorship model where all media outlets in [0 1] are initially availableand the government can censor any subset of them When all media outlets in [0 1]are permitted the revealed information about θ (which we call ω) is θ itself Thusω = θ is a continuous random variable with distribution F = T So we can nowapply our results for the continuous state from Section 3 instead of those for the dis-crete state in Section 4 reaching the same conclusion about the optimality of uppercensorship

Second suppose that the government can garble information available from mediaoutlets That is the government is not restricted to monotone partitions it cancreate arbitrary signals about state ω for the citizens to observe This becomes ageneral persuasion problem and our Theorems 1prime(A) 2prime(A) and 3prime(A) apply

Third suppose that the government is able to restrict not only which media outletsare permitted but also how many media outlets each citizen can choose to observeIn this extension the citizens are not allowed to communicate with one another (oth-erwise they could share the information thus observing all permitted media outletsindirectly) This extension does not affect our results as long as each citizen is al-lowed to access at least one media outlet of his choice as in Chan and Suen (2008)Intuitively this is because each citizen categorizes the information from the mediaoutlets into ldquogood newsrdquo where a = 1 is optimal and ldquobad newsrdquo where a = 0 is

22 KOLOTILIN MYLOVANOV ZAPECHELNYUK

optimal Because the information from the media outlets induces a monotone parti-tion it means that ldquogood newsrdquo is separated from ldquobad newsrdquo by a threshold mediaoutlet that depends on the citizenrsquos type That is it is sufficient to observe a singlethreshold media outlet to distinguish ldquogood newsrdquo from ldquobad newsrdquo

In our model each citizenrsquos utility depends on the aggregate action a through theexternality term κ(θ r a) which does not affect the chosen action Let us relax thisassumption so that a citizenrsquos optimal choice can depend on a We can still writethe senderrsquos indirect utility V as a function of the posterior mean state and applyour results However V is no longer uniquely determined by the primitives of themodel It is now endogenous and depends on an equilibrium the citizens play aseach citizenrsquos optimal action now depends on what all citizens do in equilibrium Forexample given the same information about the state a citizen could prefer to choosea = 1 if and only if sufficiently many citizens choose the same action so a is largeenough This creates the problem of multiplicity of equilibria and as a consequencethe dependence of optimal censorship on equilibrium selection

As a side our media censorship problem can also be applied to spatial voting mod-els as in Chiang and Knight (2011) Consider a government party (p = G) and anopposition party (p = O) competing in an election If party p wins a voter withan ideological position r gets utility wp minus (r minus rp)

2 where wp is the quality or va-lence of party p and rp is the ideology or policy platform of party p Voters knowthe partiesrsquo ideologies and obtain information about the partiesrsquo qualities from allavailable media outlets Each voter supports the party that maximizes his expectedutility Within this context our analysis still applies because the voterrsquos utility dif-ference between the government and opposition parties is proportional to θminusr whereθ = (wG minus wO + r2O minus r2G)2(rO minus rG)

There are a few more extensions that can be relevant in applications First insteadof complete censorship of a media outlet there can be a cost of accessing it Forexample an international news channel can be censored by a local government butcitizens may still access it through VPN at some cost Second it can be costly forthe government to censor media outlets So another important question to answeris how a government may prioritize censoring Finally citizens can incur some costof following each media outlet While we have already mentioned that citizens gainno benefit from following more than one outlet it is entirely possible for them tostop watching news altogether if it is sufficiently uninformative These extensions arenontrivial and left for future research

Appendix

Proof of Theorem 1 Suppose that V is S-shaped For mωlowast isin [0 1] and mlowast(ωlowast) =E[ω|ω ge ωlowast] define

V (mωlowast) = V (mlowast(ωlowast)) + V prime(mlowast(ωlowast))(mminusmlowast(ωlowast)) (7)

CENSORSHIP AS OPTIMAL PERSUASION 23

Since V (m) is convex on [0 τ ] and concave on [τ 1] there exists ωlowast isin [0 τ ] thatsatisfies condition (3) that is

V (mωlowast) ge (le)V (m) for all m ge (lt)ωlowast (8)

This is true because V (ωlowastωlowast) minus V (ωlowast) is single-crossing from below To see thisconsider Figure 1 Observe that (3) can hold only if ωlowast is on the convex part of V andmlowast is on the concave part of V As the censorship cutoff ωlowast increases the posteriormean state of the pooling message mlowast(ωlowast) also increases But because V is concaveat mlowast(ωlowast) the dashed tangent line V (mωlowast) becomes flatter in m so it crosses thesolid line V (m) at a smaller m

Consider an arbitrary signal s Let H be the distribution of m = E[ω|s] inducedby signal s Since each signal is a garbling of the fully informative signal F is amean-preserving spread of H Thus the senderrsquos expected utility is smaller undersignal s than under the upper-censorship signal with cutoff ωlowast as follows from

983133 1

0

V (m)dH(m) le983133 1

0

maxV (m) V (mωlowast)dH(m)

le983133 1

0

maxV (m) V (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mlowast(ωlowast))dF (m)

where the second line holds because maxV (m) V (mωlowast) is convex in m and F isa mean-preserving spread of H the third line holds by (8) and the last line holds by

(7) and by983125 1

ωlowast(mminusmlowast(ωlowast))dF (m) = 0

Conversely suppose that V is not S-shaped Then there exist 0 le m1 lt m2 lem3 lt m4 le 1 such that V primeprime(m) lt 0 for m isin (m1m2) V

primeprime(m) = 0 for m isin [m2m3]and V primeprime(m) gt 0 for m isin (m3m4) because V primeprime is continuous by assumption It isstraightforward to show that there exists 983141mlowast isin (m1m2) (sufficiently close to m2) and983141ωlowast isin (m3m4) such that

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) lt V (m) for all m isin (983141ωlowastm4]

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) gt V (m) for all m isin [m1 983141ωlowast] 983141mlowast

Moreover there exists a continuous distribution F such that the support of F is asubset of [m1m4] and 983141mlowast = E[ω|ω le 983141ωlowast] Using a chain of inequalities similar to theone above it is easy to show that lower-censorship signal 983141slowast with cutoff 983141ωlowast is optimalFurthermore it is uniquely optimal as both inequalities hold with equalities only forsignal 983141slowast This in turn implies that any upper-censorship signal is suboptimal

24 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Proof of Lemma 1 Notice that under assumption (A1) we have

V primeprime(m) = gprime(m) + ρg(m) for m isin [0 1]

Recall that V is S-shaped if and only if V primeprime(m) = gprime (m) + ρg (m) is single-crossingfrom above Using Proposition 1 in Quah and Strulovici (2012) it is easy to showthat V primeprime(m) is single-crossing from above for all ρ isin R if and only if gprime (m)g (m) isnonincreasing in m (that is ln g(m) is concave)

Proof of Theorem 2 Immediate by Theorem 1 and Lemma 1

Proof of Theorem 3 Part (i) We fix t and σ and vary ρ Without loss ofgenerality let t = 0 and σ = 1 (so that Gtσ = G) Consider a signal that induces thedistribution H of posterior mean m This signal implements the receiverrsquos interimutility U given by

U(r) =

983133 1

r

(mminus r)dH(m) for r isin [0 1]

which holds because the receiver acts if and only if m ge r Since types r lt 0 alwayschoose a = 1 and types r gt 1 always choose a = 0 we can exclude them fromconsideration and assume that supp(G) = [0 1] The senderrsquos expected utility is then

983133 1

0

V (m)dH(m) =

983133 1

0

983133 m

0

(1 + ρ(mminus r))dG(r)dH(m)

=

983133 1

0

983133 1

r

(1 + ρ(mminus r))dH(m)dG(r)

=

983133 1

0

(1minusH(r) + ρU(r))dG(r)

Consider ρ2 gt ρ1 Suppose by contradiction that the corresponding optimal upper-censorship signals s2 and s1 are such that s1 is strictly more informative than s2 LetU2 and U1 be the receiverrsquos interim utilities implemented by s2 and s1 Since thesender prefers signal s2 under ρ2 and signal s1 under ρ1 we have

983133 1

0

(1minusH2 (r) + ρ2U2 (r)) dG (r) ge983133 1

0

(1minusH1 (r) + ρ2U1 (r)) dG (r)

983133 1

0

(1minusH1 (r) + ρ1U1 (r)) dG (r) ge983133 1

0

(1minusH2 (r) + ρ1U2 (r)) dG (r)

Summing up these inequalities gives

(ρ2 minus ρ1)

983133 1

0

(U2 (r)minus U1 (r)) dG (r) ge 0

leading to a contradiction because U1(r) ge U2(r) for all r with strict inequality forsome r given that s1 is strictly more informative than s2

CENSORSHIP AS OPTIMAL PERSUASION 25

Part (ii) We fix ρ isin R and σ = 1 and vary t The indirect utility of the senderconditional on m isin [0 1] is given by

Vt(m) =

983133

rlem

(1+ρ(mminusr))gt1(r)dr =

983133

rlemminust

(1+ρ(mminustminusr))g(r)dr = V (mminust) (9)

By Theorem 1 when t0 = 0 an optimal cutoff ωlowast0 satisfies

V (m) le V (mlowast0) + V prime(mlowast

0)(mminusmlowast0) for m isin [ωlowast

0 1]

When V is S-shaped and t1 lt 0 it is easy to see that

V (mminus t1) le V (mlowast0 minus t1) + V prime(mlowast

0 minus t1)(mminusmlowast0) for m isin [ωlowast

0 1]

which implies that when t = t1 the above inequality changes its sign at some ωlowast1

that satisfies ωlowast1 le ωlowast

0 It follows from (9) that ωlowast1 is an optimal cutoff for t = t1 and

as we established above ωlowast1 le ωlowast

0

Part (iii) We fix ρ ge 0 and t = 0 and vary σ The indirect utility of the senderconditional on m isin [0 1] is given by

V ρσσ(m) =

983133

rlem

9830591 +

ρ

σ(mminus r)

983060dG0σ(r)

=

983133

rlem

9830591 +

ρ

σ(mminus r)

983060g

983061r minus (1minus σ)τ

σ

9830621

σdr

=

983133

rlemminus(1minusσ)τσ

9830611 + ρ

983061mminus (1minus σ)τ

σminus r

983062983062g(r)dr

= Vρ1

983061mminus (1minus σ)τ

σ

983062 (10)

Let σ0 = 1 σ1 lt 1 and ρ1 = ρσ1 To compare the optimal cutoffs under parameters(ρ σ0) and (ρ σ1) we consider two steps first comparing the cutoffs between (ρ σ0)and (ρ1 σ1) and then comparing the cutoffs between (ρ1 σ1) and (ρ σ1)

First we compare an optimal cutoff ωlowast0 under (ρ σ0) = (ρ 1) and an optimal cutoff

ωlowast1 under (ρ1 σ1) By Theorem 1 ωlowast

0 satisfies

Vρ1(m) le Vρ1(mlowast0) + V prime

ρ1(mlowast0)(mminusmlowast

0) for m isin [ωlowast0 1]

When V is increasing (implied by ρ ge 0) and S-shaped and σ1 lt 1 it is easy to seethat for all m isin [ωlowast

0 1]

Vρ1

983061mminus (1minus σ1)τ

σ1

983062le Vρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062+ V prime

ρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062(mminusmlowast

0)

which implies that when σ = σ1 lt 1 the above inequality changes its sign at someωlowast1 that satisfies ωlowast

1 le ωlowast0 It follows from (10) that ωlowast

1 is an optimal cutoff under(ρσ σ1) = (ρ1 σ1) and as we established above ωlowast

1 le ωlowast0

26 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Second we compare the optimal cutoff ωlowast1 under (ρ1 σ1) and the optimal cutoff ωlowast

2

under (ρ σ1) Because ρ lt ρ1 = ρσ1 by part (i) we have ωlowast2 le ωlowast

1

Combining ωlowast1 le ωlowast

0 and ωlowast2 le ωlowast

1 we obtain ωlowast2 le ωlowast

0

Proof of Theorem 1prime Part (A) The proof is analogous to that of Theorem 1 andis thus omitted

Part (B) The only if statement follows from that in part (A) (see the proof of Theorem1) and the fact that we can always ensure that an optimal stochastic censorship(ωlowast qlowast) is deterministic (so qlowast isin 0 1) by the choice of the support and distributionof ω Thus we only need to prove the if statement

Let F be a distribution of ω with support on n values 0 lt ω0 lt lt ωnminus1 lt 1 Thisis without loss of generality because if 0 andor 1 were in the support of F we couldredefine the state as ω = (ω + ε)(1 + 2ε) and the type as r = (r + ε)(1 + 2ε)

Let X sub ω1 ωnminus1 identify a set of cutoffs defining a monotone partition for eachωi isin X the receiver is informed whether ω lt ωi or ω ge ωi Note that we exclude ω0

from the set of cutoffs because ω ge ω0 always holds

Note that X is deterministic upper censorship if and only if there exists a censorshipthreshold ωlowast isin [0 1] such that for each i = 1 nminus 1 ωi isin X if and only if ωi lt ωlowastthat is we can assume that qlowast = 0 without loss of generality

Let VX be the senderrsquos expected utility when the monotone partition is given by X

If n = 2 then there are only two partitions X = ω1 (full disclosure) and X = empty(no disclosure) Both are upper censorships so part (B) holds trivially

If n = 3 then there are four monotone partitions X = ω1ω2 (full disclosure)X = ω2 X = ω1 and X = empty (no disclosure) Only X = ω2 is not uppercensorship The next lemma proves that X = ω2 cannot be uniquely optimal whenV is S-shaped thus proving the optimality of upper censorship for n = 3

Denote

mjk = E983045ω|ω isin [ωjωk]

983046

For convenience of notation we write mj = ωj

Lemma 3 Let V be S-shaped with an inflexion point τ Let ω be a discrete randomvariable with support on m0m1m2 and probabilities p0 p1 p2 where 0 lt m0 ltm1 lt m2 lt 1 Then

(i) if τ le m0 then Vm2 le Vempty

(ii) if τ ge m1 then Vm2 le Vm1m2

(iii) if m0 lt τ lt m1 then Vm2 le maxVempty Vm1m2

CENSORSHIP AS OPTIMAL PERSUASION 27

Proof Define p01 = p0 + p1 and p02 = p01 + p2 = 1 Observe that

m01 =p0m0 + p1m1

p0 + p1and m02 =

p0m0 + p1m1 + p2m2

p0 + p1 + p2

We thus have

Vempty = p02V (m02) = (p01 + p2)V

983061p01m01 + p2m2

p01 + p2

983062

Vm2 = p01V (m01) + p2V (m2) = (p0 + p1)V

983061p0m0 + p1m1

p0 + p1

983062+ p2V (m2)

Vm1m2 = p0V (m0) + p1V (m1) + p2V (m2)

Clearly

m0 lt m01 lt m1 and m0 lt m02 lt m2

Therefore

Vm2 le Vempty lArrrArrp01

p01 + p2V (m01) +

p2p01 + p2

V (m2) le V

983061p01

p01 + p2m01 +

p2p01 + p2

m2

983062= V (m02)

(11)

and

Vm2 le Vm1m2 lArrrArr

V

983061p0

p0 + p1m0 +

p1p0 + p1

m1

983062le p0

p0 + p1V (m0) +

p1p0 + p1

V (m1)(12)

Suppose that τ le m01 so V is concave on [m01m2] Then (11) holds by Jensenrsquosinequality In particular it holds for τ le m0 lt m01 thus proving part (i)

Suppose that τ ge m1 so V is convex on [m0m1] Then (12) holds by Jensenrsquosinequality thus proving part (ii)

Finally suppose that m01 lt τ lt m1 Let (τ V (τ)) lie below (or on) the straight lineA connecting (m0 V (m0)) and (m2 V (m2)) as illustrated on Figure 2 By convexityof V (m) for m lt τ it must be the case that V (m01) lies below (or on) the dashedline B By concavity of V (m) for m gt τ it must be the case that V (m1) lies above(or on) the dashed line C Consequently (12) is satisfied

Alternatively let (τ V (τ)) lie above the straight line A connecting (m0 V (m0)) and(m2 V (m2)) as illustrated on Figure 3 If Vm2 le Vm1m2 then the proof of part(iii) is complete So suppose that Vm2 gt Vm1m2 so (12) is violated

By convexity of V (m) for m lt τ it must be the case that V (m01) lies below (or on)the line B Since (12) is violated V (m01) must lie between the lines A and B Recallthat m01 lt m02 lt m2 Observe that V (m02) must be in the shaded area on Figure 3Indeed if m02 lt τ then by convexity of V (m) for m lt τ it must be the case thatV (m02) lies between (or on) the lines B and D If m02 ge τ then by concavity of

28 KOLOTILIN MYLOVANOV ZAPECHELNYUK

τ m2m0

A C

B

m01 m1

V (m0) V (τ)

V (m2)

Figure 2 The case where (τ V (τ)) lies below (or on) the straight lineconnecting (m0 V (m0)) and (m2 V (m2))

τ m2m0

A

C

B

m01

V (m0)

V (τ)

V (m2)

D

V (m01)

Figure 3 The case where (τ V (τ)) lies above the straight line con-necting (m0 V (m0)) and (m2 V (m2))

V (m) for m gt τ it must be the case that V (m02) lies above (or on) line C It followsthat (11) is satisfied

We now adapt Lemma 3 to prove the optimality of upper censorship for n ge 4

CENSORSHIP AS OPTIMAL PERSUASION 29

Fix a monotone partition X sub ω1 ωnminus1 that is not upper censorship that isthere exists i isin 2 nminus 1 such that

ωiminus1 ∕isin X and ωi isin X (13)

Denote ωn = 1 Define j and k as follows If X cap ω1 ωiminus2 ∕= empty let ωj be thelargest element inX smaller than i otherwise let ωj = ω0 IfXcapωi+1 ωnminus1 ∕= emptylet ωk be the smallest element in X greater than i otherwise let ωk = ωn = 1 Define

m0 = ωj m1 = ωiminus1 m2 = ωi

and

p0 =Pr[ω isin [ωjωiminus1)]

Pr[ω isin [ωjωk)] p1 =

Pr[ω isin [ωiminus1ωi)]

Pr[ω isin [ωjωk)] p2 =

Pr[ω isin [ωiωk)]

Pr[ω isin [ωjωk)]

To apply Lemma 3 we identify Xωi with empty X with m2 and X cup ωiminus1 withm1m2 By Lemma 3 VX le maxVXωi VXcupωiminus1 Thus replacing X withXωi if VX le VXωi and with X cup ωiminus1 otherwise weakly improves the senderrsquosexpected utility Repeatedly applying such replacements yields upper censorshipIndeed this procedure terminates after a finite number of replacements because byLemma 3 for ωj ge τ we replace X with Xωi and for ωiminus1 le τ we replace Xwith X cup ωiminus1

We thus have shown that when V is S-shaped when optimizing over the set ofmonotone partitions we can restrict attention to deterministic upper censorship Itremains to show that an optimal deterministic censorship cutoff ωlowast

d coincides with anoptimal stochastic censorship cutoff ωlowast from part (A) To see this it is straightforwardto show that

V lowast(ωlowast qlowast) =

983133

[0ωlowast)

V (ω)dF (ω) + V (ωlowast)qlowast Pr[ω = ωlowast]

+ V (mlowast(ωlowast qlowast))((1minus qlowast) Pr[ω = ωlowast] + Pr[ω gt ωlowast]) (14)

is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order Since the two de-terministic censorship cutoff pairs closest to (ωlowast qlowast) in the Blackwell informativenessorder are (ωlowast 0) and (ωlowast 1) we obtain ωlowast

d = ωlowast

Proof of Theorem 2prime Immediate by Theorem 1prime and Lemma 1

Proof of Theorem 3prime The proof for an optimal signal is analogous to that ofTheorem 3 and is thus omitted The same applies to part (i) for an optimal monotonepartition Thus it remains to prove part (ii) for an optimal monotone partition

We fix ρ and σ = 1 and vary t continuously If ωlowast isin supp(F ) then an optimalmonotone partition is an optimal signal because all qlowast isin [0 1] are optimal Thuspart (ii) in (B) holds by part (ii) in (A) If ωlowast isin supp(F ) and qlowast isin (0 1) then either(ωlowast

d qlowastd) = (ωlowast 0) or (ωlowast

d qlowastd) = (ωlowast 1) is optimal because the senderrsquos expected utility

V lowast given by (14) is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order

30 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Therefore it suffices to consider values of t at which the sender is indifferent between(ωlowast 0) and (ωlowast 1) and to show that the sender strictly prefers (ωlowast 1) to (ωlowast 0) at aslightly higher t

V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) =rArr dV lowastt (ω

lowast 0)

dtle dV lowast

t (ωlowast 1)

dt (15)

The condition V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) can be written as

Vt(mlowast(ωlowast 0)) Pr[ω ge ωlowast] = Vt(ω

lowast) Pr[ω = ωlowast] + Vt(mlowast(ωlowast 1)) Pr[ω gt ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)Pr[ω = ωlowast]

Pr[ω ge ωlowast]+ Vt(m

lowast(ωlowast 1))Pr[ω gt ωlowast]

Pr[ω ge ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + Vt(mlowast(ωlowast 1))

mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Analogously the conditiondV lowast

t (ωlowast0)dt

le dV lowastt (ωlowast1)dt

can be written as

V primet (m

lowast(ωlowast 0)) ge V primet (ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + V primet (m

lowast(ωlowast 1))mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Thus (15) states that if Vt(mlowast(ωlowast 0)) is equal to the convex combination of Vt(ω

lowast)and Vt(m

lowast(ωlowast 1)) then the derivative V primet (m

lowast(ωlowast 0)) is greater than the convex com-bination of the derivatives V prime

t (ωlowast) and V prime

t (mlowast(ωlowast 1)) When V is S-shaped it is easy

to see that this property is satisfied

Proof of Lemma 2 Observe that a signal induced by any set X sub C of permittedmedia outlets induces a monotone partition of [0 1] and thus HC sub HM Indeed letcX(ω) and cX(ω) be the media outlets in X that are the closest to ω from above andfrom below

cX(ω) = sup983059c isin X c le ω

983052cup 0

983060and cX(ω) = inf

983059c isin X c gt ω cup 1

983060

By observing the media outlets in X each citizen is informed that ω isin [cX(ω) cX(ω))

Conversely let the set C of media outlets be c0 lt middot middot middot lt cn Without loss of generalityassume that C contains the two uninformative media outlets c0 = 0 and cn = 1 which(almost) always endorse and criticize action a = 1 respectively Then the distributionF of the state ω is a discrete distribution whose support is Ω = ω0 ωnminus1 whereωi = E[θ|θ isin [ci ci+1)] for i isin 0 nminus 1 Consider a monotone partitional signalrepresented by a nondecreasing function ξ [0 1] rarr R For each realization m of thissignal that occurs with a strictly positive probability define

x(m) = min983153ω isin Ω ξ(ω) = m

983154and x(m) = max

983153ω isin Ω ξ(ω) = m

983154

This signal induces a monotone partition of the finite set Ω This partition of Ω canbe implemented by a censorship policy that prohibits a media outlet c isin C if andonly if c isin (x(m) x(m)) for some realization m

CENSORSHIP AS OPTIMAL PERSUASION 31

References

Alonso R and O Camara (2016a) ldquoPersuading Votersrdquo American EconomicReview 106 3590ndash3605

(2016b) ldquoPolitical Disagreement and Information in Electionsrdquo Games andEconomic Behavior 100 390ndash412

Bagnoli M and T Bergstrom (2005) ldquoLog-Concave Probability and Its Ap-plicationsrdquo Economic Theory 26 445ndash469

Baron D P (2006) ldquoPersistent Media Biasrdquo Journal of Public Economics 90(1)1ndash36

Besley T and A Prat (2006) ldquoHandcuffs for the Grabbing Hand MediaCapture and Government Accountabilityrdquo American Economic Review 96(3) 720ndash736

Blackwell D (1953) ldquoEquivalent Comparisons of Experimentsrdquo Annals of Math-ematical Statistics 24 265ndash272

Chan J and W Suen (2008) ldquoA Spatial Theory of News Consumption andElectoral Competitionrdquo Review of Economic Studies 75 699ndash728

Chiang C-F and B Knight (2011) ldquoMedia Bias and Influence Evidence fromNewspaper Endorsementsrdquo Review of Economic Studies 78 795ndash820

Duffie D P Dworczak and H Zhu (2017) ldquoBenchmarks in Search MarketsrdquoJournal of Finance 72 1983ndash2044

Dworczak P (2017) ldquoMechanism Design with Aftermarkets Cutoff Mecha-nismsrdquo mimeo

Dworczak P and G Martini (2019) ldquoThe Simple Economics of Optimal Per-suasionrdquo Journal of Political Economy 127 993ndash2048

Edmond C (2013) ldquoInformation Manipulation Coordination and RegimeChangerdquo Review of Economic Studies 80 1422ndash1458

Egorov G S Guriev and K Sonin (2009) ldquoWhy Resource-poor DictatorsAllow Freer Media A Theory and Evidence from Panel Datardquo American PoliticalScience Review 103 645ndash668

Gehlbach S and K Sonin (2014) ldquoGovernment Control of the Mediardquo Journalof Public Economics 118 163ndash171

Gentzkow M and E Kamenica (2016) ldquoA Rothschild-Stiglitz Approach toBayesian Persuasionrdquo American Economic Review Papers amp Proceedings 106597ndash601

Gentzkow M and J Shapiro (2006) ldquoMedia Bias and Reputationrdquo Journalof Political Economy 114(2) 280ndash316

(2010) ldquoWhat Drives Media Slant Evidence from US Daily NewspapersrdquoEconometrica 78(1) 35ndash71

Ginzburg B (2019) ldquoOptimal Information Censorshiprdquo Journal of EconomicBehavior and Organization 163 377ndash385

Goldstein I and Y Leitner (2018) ldquoStress Tests and Information DisclosurerdquoJournal of Economic Theory 177 34ndash69

32 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Guo Y and E Shmaya (2019) ldquoThe Interval Structure of Optimal DisclosurerdquoEconometrica 87 653ndash675

Johnson J P and D P Myatt (2006) ldquoOn the Simple Economics of Adver-tising Marketing and Product Designrdquo American Economic Review 96 756ndash784

Kamenica E and M Gentzkow (2011) ldquoBayesian Persuasionrdquo American Eco-nomic Review 101 2590ndash2615

Kolotilin A (2015) ldquoExperimental Design to Persuaderdquo Games and EconomicBehavior 90 215ndash226

(2018) ldquoOptimal Information Disclosure A Linear Programming Ap-proachrdquo Theoretical Economics 13 607ndash636

Kolotilin A and H Li (2019) ldquoRelational Communicationrdquo mimeoKolotilin A M Li T Mylovanov and A Zapechelnyuk (2015) ldquoPer-suasion of a Privately Informed Receiverrdquo mimeo

Kolotilin A T Mylovanov A Zapechelnyuk and M Li (2017) ldquoPer-suasion of a Privately Informed Receiverrdquo Econometrica 85 1949ndash1964

Kolotilin A and A Wolitzky (2019) ldquoAssortative Information Disclosurerdquomimeo

Lewis T R and D Sappington (1994) ldquoSupplying Information to FacilitatePrice Discriminationrdquo International Economic Review 35 309ndash327

Lorentzen P (2014) ldquoChinarsquos Strategic Censorshiprdquo American Journal of Polit-ical Science 58 402ndash414

Mullainathan S and A Shleifer (2005) ldquoThe Market for Newsrdquo AmericanEconomic Review 95(4) 1031ndash1053

Orlov D P Zryumov and A Skrzypach (2018) ldquoDesign of Macro-prudentialStress Testsrdquo mimeo

Ostrovsky M and M Schwarz (2010) ldquoInformation Disclosure and Unravelingin Matching Marketsrdquo American Economic Journal Microeconomics 2 34ndash63

Prat A and D Stromberg (2013) ldquoThe Political Economy of Mass MediardquoAdvances in Economics and Econometrics 2 135ndash187

Quah J and B Strulovici (2012) ldquoAggregating the Single Crossing PropertyrdquoEconometrica 80 2333ndash2348

Rayo L and I Segal (2010) ldquoOptimal Information Disclosurerdquo Journal ofPolitical Economy 118 949ndash987

Romanyuk G and A Smolin (2019) ldquoCream Skimming and Information Designin Matching Marketsrdquo American Economic Journal Microeconomics 11 250ndash276

Suen W (2004) ldquoThe Self-Perpetuation of Biased Beliefsrdquo Economic Journal 114377ndash396

Yamashita T (2018) ldquoOptimal Public Information Disclosure by Mechanism De-signerrdquo mimeo

Zapechelnyuk A (2019) ldquoOptimal Quality Certificationrdquo American EconomicReview Insights forthcoming

CENSORSHIP AS OPTIMAL PERSUASION 3

The main result shows that if the senderrsquos indirect utility is S-shaped in the expectedstate (that is convex below some threshold and concave above the threshold) thenand only then an upper-censorship signal that pools states above a cutoff and revealsstates below the cutoff is optimal for all prior distributions of the state3 In thecanonical case the main result shows that if the probability density of receiver types islog-concave then and only then upper censorship is optimal for all prior distributionsof the state and for all weights that the sender puts on the receiverrsquos utility and action4

The main result is obtained under two scenarios In the first scenario the signal isa random variable with arbitrary correlation with the state In the second scenariothe signal is constrained to be a monotone partition of the state space so that eachstate is either revealed or pooled with some adjacent states

To compare the main result under these two scenarios consider the cutoff of an opti-mal upper-censorship signal Notice that upper censorship does not specify whetherthe cutoff state is pooled or revealed If the state is a continuous random variablethen it does not matter whether this cutoff state is pooled or revealed Thus an op-timal upper-censorship signal is in fact a monotone partition However if the stateis a discrete random variable then the optimal signal is stochastic upper censorshipso that the cutoff state is generally pooled and revealed with interior probabilitiesIn contrast the optimal monotone partition is deterministic upper censorship so thatthe cutoff state is either completely pooled or fully revealed

In many applications of interest the state is discrete and thus optimal signals andoptimal monotone partitions generally differ An optimal upper-censorship signalprovides an upper bound on the value of persuasion To achieve this bound thesender should be able to commit to randomize over two signal realizations conditionalon the cutoff state which may be hard to enforce in practice In contrast monotonepartitions are relatively simple signals that can be enforced and verified ex post

The problem of finding an optimal monotone partition is a discrete optimizationproblem which cannot be solved using existing tools from the persuasion literatureNevertheless we show that if the senderrsquos indirect utility is S-shaped (or in thecanonical case the density of receiver types is log-concave) then upper censorship isan optimal monotone partition To prove this result for any monotone partition dif-ferent from upper censorship we explicitly construct a deterministic upper-censorshipsignal that is preferred by the sender

When upper censorship is optimal it is possible to perform the comparative staticsanalysis on the informativeness of the optimal signal We show that in the canonicalcase the sender optimally censors more information (i) if the sender is more biasedin that the sender puts a smaller weight on the receiverrsquos utility (ii) if the receiver is

3Analogously a lower-censorship signal that pools states below a cutoff is optimal for all priordistributions if and only if the senderrsquos indirect utility has an inverted S-shape

4Most commonly used probability densities are log-concave Log-concave densities exhibit niceproperties such as single-peakedness and the monotonicity of the hazard rate

4 KOLOTILIN MYLOVANOV ZAPECHELNYUK

easier to persuade in that the receiverrsquos cost of accepting the proposal is smaller and(iii) if the sender is more certain about the receiverrsquos preferences in that the densityof receiver types is more concentrated around its mode

We apply our results to the problem of media censorship by the government Weconsider a stylized setting with a finite number of media outlets and a continuumof heterogeneous citizens (receivers) Each media outlet is identified by its editorialpolicy that specifies an interval of endorsement states We permit the aggregate ac-tion of the citizens to affect their utility but not their optimal actions For examplean election outcome impacts all citizens but does not change their preferences overcandidates The government wishes to influence citizensrsquo actions by deciding whichmedia outlets to censor In the canonical case if the probability density of citizenstypes is log-concave then the optimal media censorship policy is to permit all suf-ficiently loyal media outlets and to censor the remaining outlets Our comparativestatics results suggest that the government should optimally increase censorship (i)if influencing society decisions becomes relatively more important than maximizingindividual welfare (ii) if the society experiences an ideology shock in favor of thegovernment and (iii) if the society becomes less diverse in ideology and taste

Related Literature The literature on Bayesian persuasion was set in motion bythe seminal papers of Rayo and Segal (2010) and Kamenica and Gentzkow (2011)The linear case which is the subject of this paper is prevalent in the literature5

Kolotilin (2018) provides necessary and sufficient prior dependent conditions for theoptimality of censorship in the case of a continuous state6 Alonso and Camara(2016b) provide sufficient conditions in the case of a discrete state In contrastour paper provides necessary and sufficient conditions that are prior independent (aswell as bias independent in the canonical case) and apply in both continuous anddiscrete cases More importantly we show that these conditions are also necessaryand sufficient for deterministic censorship to be optimal in the class of monotonepartitional signals (simple signals) There are no counterparts to this result in theliterature

Similar to us Alonso and Camara (2016b) show that the optimal signal is less infor-mative if the receiver is easier to persuade Their proof is simpler but it is valid onlywhen the senderrsquos utility is state-independent But our comparative statics resultswith respect to the senderrsquos bias and uncertainty are novel

Censorship policies commonly emerge as optimal signals in linear persuasion models ina variety of contexts starting from the prosecutor-judge example as well as lobbyingand product advertising examples in Kamenica and Gentzkow (2011) Other contextswhere censorship is optimal include grading policies (Ostrovsky and Schwarz 2010)

5Notable exceptions include Rayo and Segal (2010) Goldstein and Leitner (2018) Guo andShmaya (2019) and Kolotilin and Wolitzky (2019)

6See also Gentzkow and Kamenica (2016) Kolotilin Mylovanov Zapechelnyuk and Li (2017)and Dworczak and Martini (2019) for sufficient conditions

CENSORSHIP AS OPTIMAL PERSUASION 5

media control (Gehlbach and Sonin 2014 Ginzburg 2019) clinical trials (Kolotilin2015) voter persuasion (Alonso and Camara 2016ab) transparency benchmarks(Duffie Dworczak and Zhu 2017) stress tests (Goldstein and Leitner 2018 OrlovZryumov and Skrzypach 2018) trading mechanisms (Romanyuk and Smolin 2019Dworczak 2017 Yamashita 2018) quality certification (Zapechelnyuk 2019) andrelational communication (Kolotilin and Li 2019)

Our application to media censorship fits into the literature on media capture thataddresses the problem of media control by governments political parties or lobbyinggroups Besley and Prat (2006) pioneered this literature by studying how a gov-ernmentrsquos incentives to censor free media depend on plurality (the number of mediaoutlets) and on transaction costs of bribing the media In their model media outletsare identical and all voters have common interests Thus the governmentrsquos optimaldecision reduces to either capturing all media outlets or none Relatedly Gehlbachand Sonin (2014) consider a setting with a government-influenced monopoly mediaoutlet that exploits the trade-off between the governmentrsquos objective to ldquomobilizerdquo thepopulation for some collective goal and to collect the revenue from subscribers whodemand informative news Gehlbach and Sonin (2014) show that the presence of theldquomobilizationrdquo objective increases the news bias whereas the subscription revenuesreduce the bias but can cause the government to nationalize the media outlet7

The models of Besley and Prat (2006) and Gehlbach and Sonin (2014) have two statesof the world and either a single media outlet or a few identical media outlets Hencethe government uses the same censorship policy for each outlet We are the first toconsider a richer model of media censorship with a continuum of states and multipleheterogeneous media outlets As a result the government optimally discriminatesmedia outlets by permitting sufficiently loyal ones and banning the remaining ones

2 Model

21 Setup There are two players a sender (she) and a receiver (he) The receiverchooses whether to accept a proposal (a = 1) or reject it (a = 0) The proposalhas an uncertain value ω isin [0 1] By accepting the proposal the receiver forgoesan outside option worth r isin [0 1] so the receiverrsquos utility is a(ω minus r) The senderrsquosutility is av(ω r) where v(ω r) is linear in ω and continuously differentiable in rWe will refer to ω as state and to r as type and denote their distributions by F andG respectively Throughout the paper we assume that distribution G of r has acontinuously differentiable and strictly positive density g8

As in many applications the state is either a continuous or discrete random variablewe will separately analyze these two cases We say that the state ω and distribution F

7In different contexts media control by a government has also been studied by Egorov Gurievand Sonin (2009) Edmond (2013) and Lorentzen (2014) See also the overview of the literature onmedia capture slant and transparency in Prat and Stromberg (2013)

8This assumption is made for clarity of exposition The results can be extended to general G

6 KOLOTILIN MYLOVANOV ZAPECHELNYUK

are continuous if F has a density f on [0 1] We say that the state ω and distributionF are discrete if the support of F is a finite set

A particularly tractable and intuitive special case is where the senderrsquos utility is aweighted sum of the receiverrsquos utility and action We call this the canonical caseFormally in the canonical case the senderrsquos utility when the proposal is accepted is

v (ω r) = 1 + ρ(ω minus r) ρ isin R (A1)

That is the sender is biased towards a = 1 but also puts a weight ρ on the receiverrsquosutility In particular if the weight ρ is large then the sender and receiverrsquos inter-ests are aligned whereas if the weight is zero then the sender cares only about thereceiverrsquos action

The receiver privately knows his type but he does not observe the state The sendercan influence the action taken by the receiver a = 1 or a = 0 by releasing a signalthat reveals information about the state A signal is a random variable s isin [0 1]that is independent of r but is possibly correlated with ω For example s is fullyinformative if it is perfectly correlated with ω and s is completely uninformative ifit is independent of ω

The timing is as follows First the sender publicly chooses a signal s Then realiza-tions of ω r and s are drawn Finally the receiver observes the realizations of histype r and the signal s and then chooses between a = 0 and a = 1

We consider two scenarios

A The sender can choose any signal

B The sender can choose any monotone partitional signal

Formally a signal s is a monotone partitional signal if there exists a nondecreasingfunction ξ(ω) such that s = ξ(ω) for each ω Every such signal induces a partitionof the state space [0 1] into intervals and singletons and the receiver observes thepartition element that contains the state

Under Scenario A we are interested in an optimal signal that maximizes the senderrsquosexpected utility among all signals This is a standard persuasion problem

Under Scenario B we are interested in an optimal monotone partition that maximizesthe senderrsquos expected utility among all monotone partitional signals This scenarioincorporates constraints that information designers often face in practice For ex-ample a non-monotone grading policy that gives better grades to worse performingstudents will be perceived as unfair and will be manipulated by strategic students

22 Upper Censorship A subset of signals called upper censorship will play aspecial role in this paper

CENSORSHIP AS OPTIMAL PERSUASION 7

An upper-censorship signal reveals states below a specified cutoff and pools statesabove this cutoff When the state is equal to the cutoff it can be revealed or pooledIf the state is continuous it does not matter whether the cutoff state is revealed orpooled because this is a zero probability event However if the state is discretewe will distinguish between deterministic and stochastic upper-censorship signalsdepending on what happens at the cutoff state

A lower-censorship signal is defined symmetrically it pools states below a specifiedcutoff and reveals states above this cutoff Because the case of lower censorshipis symmetric (see Remark 1 below) for clarity of exposition we focus on uppercensorship throughout the paper

A signal s is stochastic upper censorship if there exists a cutoff pair (ωlowast qlowast) consistingof a cutoff state ωlowast isin [0 1] and probability qlowast isin [0 1] such that states below ωlowast arerevealed states above ωlowast are pooled and the state ω = ωlowast is revealed with probabilityqlowast and pooled with probability 1minus qlowast For example s can be expressed as

s =

983099983105983103

983105983101

ω with probability one if ω lt ωlowast

ωlowast and mlowast with probabilities qlowast and 1minus qlowast if ω = ωlowast

mlowast with probability one if ω gt ωlowast

where

mlowast =

983125(ωlowast1]

ωdF (ω) + ωlowast(1minus qlowast) Pr[ω = ωlowast]983125(ωlowast1]

dF (ω) + (1minus qlowast) Pr[ω = ωlowast](1)

is the posterior expected state induced by the pooling signal Notice that mlowast is afunction of ωlowast and qlowast

A stochastic upper-censorship signal s with a cutoff pair (ωlowast qlowast) is deterministic uppercensorship if and only if qlowast isin 0 1 in which case s can be expressed as a monotonepartition of the state space [0 1] For example if qlowast = 0 then the signal s fullyreveals states ω lt ωlowast and pools states ω ge ωlowast so s can be expressed as

s =

983083ω if ω lt ωlowast

mlowast if ω ge ωlowast

wheremlowast = E[ω|ω ge ωlowast]

is the expected state conditional on being at least ωlowast Note that the fully informativesignal and the completely uninformative signal are deterministic upper-censorshipsignals with (ωlowast qlowast) = (1 1) and (ωlowast qlowast) = (0 0) respectively

23 Benchmark To illustrate the difference between the cases of a continuous anddiscrete state as well as between Scenarios A and B we solve a benchmark examplewhere there is no uncertainty about the receiverrsquos type

Let the state ω have the expected value E[ω] = 12 and let the receiverrsquos type r be

known to the sender and satisfy r isin98304312 1983044 In addition suppose that the sender wishes

8 KOLOTILIN MYLOVANOV ZAPECHELNYUK

to maximize the probability that the receiver accepts the proposal so v(ω r) = 1 forall ω and all r

Observe that if the sender reveals no information about ω the receiver will evaluate ωby its expected value of 1

2 thus having utility 1

2minus r lt 0 from accepting the proposal

So revealing no information makes the receiver reject the proposal with certaintyThe sender can do better by fully revealing ω in which case the proposal is acceptedwith probability Pr[ω ge r]

However the sender can do even better by pooling states above r with states below rwhile keeping the expected state induced by the pooling signal at least r Thus thereceiver who is unable to distinguish between the states in the pool will accept theproposal at all of these states

To illustrate the case of the continuous state suppose that ω is uniformly distributedon [0 1] The largest pool that maintains the posterior expectation at least r is theinterval [2r minus 1 1] Indeed once the receiver learns that ω isin [2r minus 1 1] given theuniform prior of ω the posterior expected state is r So the proposal is accepted whenω ge 2rminus 1 and rejected when ω lt 2rminus 19 Thus the deterministic upper-censorshipsignal with cutoff ωlowast = 2r minus 1 is optimal The senderrsquos expected utility is equal to

Pr[proposal is accepted] = Pr[ω ge 2r minus 1] = 2(1minus r)

which is twice as high as that from the fully informative signal

Pr[proposal is accepted] = Pr[ω ge r] = 1minus r

When ω is a discrete random variable an optimal way to reveal information aboutω takes the form of stochastic upper censorship For illustration suppose that ω canonly be 0 or 1 equally likely Now simply pooling states is not helpful for the senderPooling 0 and 1 yields the posterior expected state of 1

2 which is smaller than the

type r resulting in a rejection of the proposal Including any other states in thepool makes no difference as these states never occur Thus an optimal monotonepartition is the fully informative signal and the probability that the receiver acceptsthe proposal is simply equal to the probability that ω = 1

Pr[proposal is accepted] = Pr[ω ge r] = Pr[ω = 1] =1

2

The sender can do strictly better by partial (stochastic) pooling If ω = 1 let thesignal realization be ldquoacceptrdquo with certainty If ω = 0 let the signal realization beldquorejectrdquo with some probability q and ldquoacceptrdquo with probability 1 minus q So when thereceiver observes ldquoacceptrdquo the posterior expected state is

E[ω|ldquoacceptrdquo] = Pr[ω = 1]

Pr[ω = 1] + Pr[ω = 0] middot (1minus q)=

12

12 + 12 middot (1minus q)=

1

2minus q

9Whether states strictly below ωlowast are revealed or pooled among themselves does not matterbecause they are all strictly below r and thus induce the receiver to take action a = 0

CENSORSHIP AS OPTIMAL PERSUASION 9

The sender now wishes to find the lowest q subject to 1(2 minus q) ge r which yieldsqlowast = 2 minus 1

r Thus stochastic upper censorship with ωlowast = 0 and qlowast = 2 minus 1

ris

optimal The probability that the receiver accepts the proposal is exactly equal tothe probability that the signal realization is ldquoacceptrdquo

Pr[proposal is accepted] = Pr[ω = 1] + Pr[ω = 0] middot (1minus qlowast) =1

2+

1

2middot 1minus r

r=

1

2r

We now summarize the insights obtained in this example If the state is continuousthen an optimal signal is deterministic upper censorship If the state is discrete thenan optimal monotone partition is deterministic upper censorship and an optimalsignal is stochastic upper censorship10 In the remainder of the paper we will showthat these insights extend to the case where the receiverrsquos type is uncertain

3 Continuous State

In this section we provide necessary and sufficient conditions for the optimality ofupper censorship when the state is continuous

Let m be the expected state conditional on observing a realization of a signal s Sincethe sender and receiverrsquos utilities are linear in ω they depend on the information aboutω revealed by signal s only through the expected state m In particular the receiverchooses a = 1 if and only if r le m

Let V (m) denote the indirect utility of the sender conditional on m

V (m) =

983133

rlem

E [v(ω r)|m] g(r)dr =

983133 m

0

v(m r)g(r)dr m isin [0 1] (2)

where E [v(ω r)|m] = v(m r) by the linearity of v in ω

A function V is said to be S-shaped if it is convex below some threshold and concaveabove that threshold or equivalently if V primeprime is single-crossing from above

there exists τ such that V primeprime(m) ge (le) 0 for all m lt (gt) τ

We now provide the criterion for the optimality of upper censorship Recall thatmlowast = E[ω|ω ge ωlowast]

Theorem 1 Let V be S-shaped Then and only then for all continuous F anoptimal signal is deterministic upper censorship whose cutoff state ωlowast isin [0 τ ] satisfies

V (mlowast) + V prime(mlowast)(mminusmlowast) ge (le)V (m) for all m ge (lt)ωlowast (3)

The criterion for the optimality of lower censorship is analogous

10See for example Kolotilin (2015) for a detailed treatment of the case where there is no uncer-tainty about the receiverrsquos type

10 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Remark 1 Let minusV be S-shaped Then and only then for all continuous F anoptimal signal is deterministic lower censorship whose cutoff state ωlowast isin [τ 1] satisfies

V (mlowast) + V prime(mlowast)(mminusmlowast) ge (le)V (m) for all m le (gt)ωlowast

Theorem 1 states that optimal persuasion takes a simple form of upper censorshipwhen V is S-shaped11 Moreover the theorem establishes that this result is tight inthe sense that if V is not S-shaped then there exists a distribution F of the statesuch that no upper-censorship signal is optimal

Note that every upper-censorship signal is a monotone partition Therefore Theorem1 applies in both Scenario A where the sender is free to choose any signal andScenario B where the sender is constrained to choose a monotone partition

The intuition behind Theorem 1 is as follows Observe that when no informationabout the state is revealed the receiverrsquos best-response action does not change withthe state The more information is revealed the more variable the receiverrsquos behaviorwill be in response to this information Consider an interval where V is concave Thesender would prefer not to reveal any information when the state is in this intervalsince a certain outcome is preferred to any lottery with the same expected stateConversely consider an interval where V is convex The sender would prefer to fullyreveal the state in this interval since now lotteries are preferred If V is S-shapedthat is it is convex below some threshold and concave above that threshold theinduced optimal persuasion takes the form of upper censorship

When V is S-shaped the senderrsquos optimization problem is reduced to finding anoptimal censorship cutoff ωlowast If the realized state ω is below the cutoff then it isrevealed to the receiver so the expected utility of the sender is V (ω) If the realizedstate ω is above the cutoff then the posterior expected state is mlowast = E[ω|ω ge ωlowast] sothe expected utility of the sender conditional on ω ge ωlowast is V (mlowast) The sender thusneeds to solve the problem

maxωlowastisin[01]

983133 ωlowast

0

V (ω)f(ω)dω +

983133 1

ωlowastV (mlowast)f(ω)dω (4)

The expression in (3) represents the first-order condition to this problem and capturesthree possible cases the boundary solutions ωlowast = 0 and ωlowast = 1 if the expression in(3) has the same sign for all ω and an interior solution ωlowast such that

V (mlowast) + V prime(mlowast)(ωlowast minusmlowast) = V (ωlowast) (5)

This first-order condition (5) is illustrated by Figure 1 The solid line is V (m) andthe dashed line is V (mlowast) + V prime(mlowast)(mminusmlowast) which is tangent to V at mlowast

11For a fixed distribution F Proposition 3 in Kolotilin (2018) implies that upper censorship withcutoff ωlowast is optimal if and only if (3) holds and V (m) is convex on [0ωlowast] This result can be used toshow that upper censorship is optimal when V is S-shaped For completeness we include a simpleself-contained proof inspired by the proof of Theorem 1 in Dworczak and Martini (2019)

CENSORSHIP AS OPTIMAL PERSUASION 11

1ωlowast

V

revealing pooling

mlowast0

Figure 1 Optimal upper censorship with cutoff ωlowast

The solution of the senderrsquos problem (4) is particularly simple if V is either globallyconvex or globally concave In this case the first-order condition in (3) has a constantsign so either ωlowast = 0 or ωlowast = 1 must be optimal If V is convex then ωlowast = 1 isoptimal which corresponds to the fully informative signal Similarly if V is concavethen ωlowast = 0 is optimal which corresponds to the completely uninformative signalThis is summarized in the following corollary

Corollary 1 An optimal signal is

(i) fully informative for all F if and only if V is convex

(ii) completely uninformative for all F if and only if V is concave

31 Canonical Case We now consider the canonical case where the senderrsquos utilitysatisfies assumption (A1)

The density g of receiver types r is said to be log-concave if ln g(r) is concave in r12

Note that ln g(r) is well defined in the canonical case

In this case the shape of the senderrsquos indirect utility V defined by (9) is connectedto the shape of the density g of receiver types as follows

Lemma 1 In the canonical case V is S-shaped for all ρ if and only if g is log-concave

12Table 1 in Bagnoli and Bergstrom (2005) reports distributions with log-concave densities

12 KOLOTILIN MYLOVANOV ZAPECHELNYUK

That is if the density of receiver types g is log-concave then the senderrsquos indirectutility V is S-shaped Moreover this result is tight in the sense that if g is notlog-concave then there exists ρ such that V is not S-shaped

By Theorem 1 and Lemma 1 we obtain a criterion for the optimality of upper cen-sorship with the condition on the primitive of the model the density g

Theorem 2 Consider the canonical case and let the density g be log-concave Thenand only then an optimal signal is deterministic upper censorship for all F and all ρ

The symmetric statement is also true an optimal signal is lower censorship for all Fand all ρ if and only if minusg is log-concave Consequently if both g and minusg are log-concave (that is g is exponential) then there exists an optimal signal which is bothupper censorship and lower censorship There are only two signals with this propertyfully informative and completely uninformative This allows us to obtain conditionson the distribution of receiver types under which the optimal signal is polarizedbetween fully informative and completely uninformative signals as for example inLewis and Sappington (1994) and Johnson and Myatt (2006)13

Corollary 2 In the canonical case an optimal signal is either fully informative orcompletely uninformative for all F and all ρ if and only if there exist λ isin R and c gt 0such that g(r) = ceminusλr for r isin [0 1]

If g(r) = ceminusλr then the fully informative signal is optimal whenever ρ ge λ and thecompletely uninformative signal is optimal whenever ρ le λ (and any signal is optimalwhen ρ = λ) In particular if ρ = 0 the optimal signal is fully determined by thesign of λ which is in turn determined by whether the mean of r is smaller or greaterthan 1

2

32 Comparative Statics Theorem 2 allows for a sharp comparative statics anal-ysis on the informativeness of the optimal signal

We compare signals by their Blackwell informativeness (Blackwell 1953) To compareupper-censorship signals s1 and s2 we only need to compare their cutoffs ωlowast

1 and ωlowast2

Signal s1 is more informative than signal s2 if ωlowast1 ge ωlowast

2 Indeed state ω isin [0ωlowast2) is

fully revealed by both s1 and s2 and state ω isin [ωlowast2 1] is partially revealed by s1 and

not revealed at all by s2 so s1 is more informative than s2

13In the case of log-concave density g the sets of upper- and lower-censorship signals are totallyordered by the rotation order of Johnson and Myatt (2006) If we restrict attention to the set oflower-censorship signals (which are generally suboptimal under log-concave g) then the rotationpoint is decreasing with respect to the rotation order By Lemma 1 of Johnson and Myatt (2006)the senderrsquos expected utility is quasiconvex and one of the extreme lower-censorship signals (thefully informative or completely uninformative signal) is optimal for the sender But if we restrictattention to the set of upper-censorship signals (which are optimal under log-concave g) then therotation point is not decreasing with respect to the rotation order Therefore Lemma 1 of Johnsonand Myatt (2006) does not apply and an interior upper-censorship signal is generally optimal forthe sender

CENSORSHIP AS OPTIMAL PERSUASION 13

For the purpose of comparison we extend the definition of distribution function Gto the real line and assume that its density g is log-concave Consider a family ofdistributions Gtσ of receiver types

Gtσ (r) = G

983061τ minus t+

r minus τ

σ

983062

where τ is the maximum point of g and t isin R and σ gt 0 are parameters Let gtσ (r)denote the corresponding density Note that gtσ (r) is log-concave for all t isin R andall σ gt 0

Because gtσ is log-concave on [0 1] deterministic upper censorship is optimal byTheorem 2 Let ωlowast (ρ t σ) isin [0 1] be the optimal censorship cutoff as given by (3)

The shift parameter t shifts the distribution along the horizontal axis and the stretchparameter σ stretches the distribution horizontally and symmetrically around themode τ so the limit of σ rarr 0 leads to the unit mass on τ We now show that thesender optimally discloses more information when

(i) the senderrsquos preferences are better aligned with the receiverrsquos preferences (thealignment parameter ρ is greater)

(ii) the receiver is more reluctant to accept the proposal (the shift parameter t isgreater)

(iii) the sender is more uncertain about the receiverrsquos type (the stretch parameter σis greater) provided ρ ge 014

Theorem 3 In the canonical case a cutoff state of an optimal signal satisfies

(i) ωlowast (ρ t σ) is increasing in ρ

(ii) ωlowast (ρ t σ) is increasing in t

(iii) ωlowast (ρ t σ) is increasing in σ if ρ ge 0

The intuition for part (i) is that for a higher ρ the sender puts more weight onthe receiverrsquos utility so she optimally endows the receiver with a higher utility byproviding more information

The intuition for part (ii) is that for a higher t each type of the receiver has a greatercost of accepting the proposal so to persuade the same type of the receiver the senderneeds to increase E[ω|ω ge ωlowast] by expanding the full disclosure interval [0ωlowast]

The intuition for part (iii) is that for a higher σ receiver types are more spread out soto persuade the same mass of types the sender again needs to increase E[ω|ω ge ωlowast]

14The case ρ ge 0 is the practically relevant case where the senderrsquos utility is a weighted averageof the receiverrsquos action and utility 1

1+ρv(ω r) =1

1+ρ + ρ1+ρ (ω minus r)

14 KOLOTILIN MYLOVANOV ZAPECHELNYUK

4 Discrete State

In this section we assume that the state is a discrete random variable

Theorem 1prime Let V be S-shaped Then and only then for all discrete F

(A) an optimal signal is stochastic upper censorship whose cutoff pair (ωlowast qlowast) satisfiescondition (3)

(B) an optimal monotone partition is deterministic upper censorship whose cutoff pair(ωlowast

d qlowastd) satisfies ωlowast

d = ωlowast and qlowastd isin 0 1

In the canonical case we obtain the result analogous to Theorem 2

Theorem 2prime Consider the canonical case and let the density g be log-concave Thenand only then for all discrete F and all ρ

(A) an optimal signal is stochastic upper censorship

(B) an optimal monotone partition is deterministic upper censorship

Let us discuss how the comparative statics result (Theorem 3) changes In Section 32we argue that upper-censorship signals can be ordered by their censorship cutoffs Butwhen the state is discrete a censorship cutoff is described by a pair (ωlowast qlowast) where ωlowast

is a cutoff state and qlowast is the probability of revealing this cutoff state when it realizes

Consider two stochastic upper-censorship signals s1 and s2 with cutoff pairs (ωlowast1 q

lowast1)

and (ωlowast2 q

lowast2) Denote (ωlowast

1 qlowast1) ≽ (ωlowast

2 qlowast2) if ω

lowast1 gt ωlowast

2 or ωlowast1 = ωlowast

2 and qlowast1 ge qlowast2 Observethat s1 is more (Blackwell) informative than s2 if and only if (ωlowast

1 qlowast1) ≽ (ωlowast

2 qlowast2) This

comparison also applies to deterministic upper-censorship signals with the constraintthat qlowast1 and qlowast2 are in 0 1

With the above order on cutoff pairs Theorem 3 extend as follows

Theorem 3prime Consider the canonical case

(A) A cutoff pair of an optimal signal satisfies

(i) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in ρ

(ii) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in t

(iii) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in σ if ρ ge 0

(B) A cutoff pair of an optimal monotone partition satisfies15

(i) (ωlowastd (ρ t σ) q

lowastd (ρ t σ)) is increasing in ρ

(ii) (ωlowastd (ρ t σ) q

lowastd (ρ t σ)) is increasing in t

15Part (iii) also holds for an optimal monotone partition when for example the discrete supportof the state ω is a regular grid with sufficiently many points However it is possible to construct acounterexample where part (iii) fails for an optimal monotone partition

CENSORSHIP AS OPTIMAL PERSUASION 15

That is the sender optimally discloses more information when she is less biasedrelative to the receiver (the alignment parameter ρ is greater) when the receiver ismore reluctant to accept the proposal (the shift parameter t is greater) and underScenario A when the sender is more uncertain about the receiverrsquos type (the stretchparameter σ is greater)

Remark 2 Theorem 1prime and Theorem 2prime show that upper censorship emerges as anoptimal persuasion strategy under the same conditions (V is S-shaped in Theorem1prime and g is log-concave in Theorem 2prime) regardless of whether the sender can choosean arbitrary signal or is constrained to choose a monotone partition It may betempting to conjecture that if the sender was allowed to choose an arbitrary partition(not necessarily monotone) the same result would hold This however is not true

For illustration suppose that the sender wishes to maximize the probability of a = 1and the receiver has type r = 1

2with certainty16 Let the state ω take values in

0 14 1 with probabilities (1

6 23 16) so the prior mean is 1

3

An optimal signal is stochastic upper censorship with cutoff pair (ωlowast qlowast) = (14 12)

which sends the pooling message with certainty at ω = 1 and with probability 1minusqlowast =12at ω = 1

4 and reveals the state otherwise The pooling message induces the posterior

mean m = 12 leading to a = 1 all other messages induce posterior means below 1

2

leading to a = 0 Under the optimal signal action a = 1 is chosen with probability23middot 12+ 1

6= 1

2

An optimal monotone partition is deterministic upper censorship that fully revealsthe state where only ω = 1 leads to a = 1 Under this partition action a = 1 ischosen with probability 1

6

An optimal partition pools the extreme states ω = 0 and ω = 1 and reveals theintermediate state ω = 1

4 The pooling message induces the posterior mean m = 1

2

leading to a = 1 This optimal partition is nonmonotone Under this partitionaction a = 1 is chosen with probability 1

6+ 1

6= 1

3 which is strictly better than the

probability of 16obtained under the optimal monotone partition

5 Application to Media Censorship

In this section we apply our results to the problem of media censorship by thegovernment In the modern world people obtain information about the governmentrsquosstate through various media sources such as television newspapers and internetblogs Without the media most people would not know what policies and reformsthe government pursues and how effective they are Media outlets have differentpositions on the political spectrum and differ substantially in how they select andpresent facts to cover the same news People choose their sources of information based

16In this case V (m) = 1 if m ge 12 and V (m) = 0 if m lt 1

2 Observe that the graph of V is

S-shaped and it can be approximated by a continuously differentiable S-shaped function

16 KOLOTILIN MYLOVANOV ZAPECHELNYUK

on their political ideology and socioeconomic status This information is valuable forsignificant individual decisions on migration investment and voting to name a fewIndividuals may not fully internalize externalities that their decisions impose on thesociety Likewise the government may not have the societyrsquos best interest at heartTo further its goals the government then wishes to influence individual decisions bymanipulating the information through media In autocracies and countries with weakchecks and balances the government has power to censor the media content

The governmentrsquos problem of media censorship can be represented as the persua-sion problem in Section 2 We apply our results to provide conditions for the op-timality of upper-censorship policies that censor all media outlets except the mostpro-government ones Furthermore we interpret our comparative statics results

51 Setup There is a continuum of heterogeneous citizens indexed by r isin [0 1]distributed with G Each citizen chooses between a = 0 and a = 1 The utility of acitizen of type r is given by

u(θ r ar a) = (θ minus r)ar + κ(θ r a)

where ar isin 0 1 denotes the citizenrsquos own action a =983125ardG(r) denotes the aggre-

gate action in the society θ isin [0 1] captures an unobserved benefit from action 1 ascompared to action 0 and κ captures the impact of the aggregate action a on thecitizenrsquos utility The term (θ minus r)ar is a private surplus of a citizen of type r Theterm κ(θ r a) is an externality because for a citizen of type r it is optimal to ignorethis term and choose ar = 1 if and only if θ ge r

There is a government which is concerned with a weighted average of the social utilityand the governmentrsquos intrinsic benefit from the aggregate action For a given θ thegovernmentrsquos utility is given by

983133 1

0

ν(θ r ar a)dG(r) + δγ(θ a)

The term ν captures a citizenrsquos utility from the governmentrsquos perspective We allow νto be different from u to reflect paternalistic or other concerns The term γ capturesthe governmentrsquos intrinsic benefit from the aggregate action The parameter δ ge 0captures the weight of the aggregate action in the governmentrsquos utility

Let T be a distribution of the random variable θ We assume that distributions G andT are independent and admit continuously differentiable and strictly positive densi-ties We also assume that κ ν and γ are linear in θ and continuously differentiablein r and a Furthermore to simplify interpretations we assume that ν and γ arenon-decreasing in θ and a so that for the government a high θ is a good news anda higher aggregate action is preferable

Citizens obtain information about the unobservable benefit θ through media outletsEach media outlet is identified by its editorial policy c isin [0 1] and it endorses action

CENSORSHIP AS OPTIMAL PERSUASION 17

a = 1 if θ ge c and criticizes it if θ lt c17 A set C of media outlets is a finite subsetof [0 1]18

The governmentrsquos censorship policy is a set of the media outlets X sub C that arepermitted to broadcast so the rest of the media outlets are censored

The timing is as follows First the government chooses a set X sub C of permittedmedia outlets Second state θ is realized and each permitted media outlet endorsesor criticizes action a = 1 according to its editorial policy Finally each citizen observesmessages from all permitted media outlets updates his beliefs about θ and choosesan action

52 Discussion We now discuss interpretations of the key components of the mediacensorship application As in Gehlbach and Sonin (2014) there can be various inter-pretations of the citizenrsquos action a = 1 such as voting for the government supportinga governmentrsquos policy or taking an individual decision that benefits the governmentA citizenrsquos type r can be interpreted as his ideological position or preference param-eter A citizen who is more supportive of the government has a smaller r

A media outlet with a higher editorial policy c can be interpreted as being less loyalto the government because it criticizes the government on a larger set of states Aneditorial policy c isin C can therefore represent a slant or political bias of the outletagainst the government and can be empirically measured as the frequency with whichthe outlet uses anti-government language Gentzkow and Shapiro (2010) constructsuch a slant index for US newspapers Empirical findings of their paper suggestthat the editorial policies of media outlets are driven by reader preferences justifyingour assumption of the existence of a large variety of editorial policies19 As in Suen(2004) Chan and Suen (2008) and Chiang and Knight (2011) the assumption ofthe binary media reports that communicate only whether the state θ is above somestandard c can be justified by a cursory readerrsquos preference for simple messages suchas positive or negative opinions and yes or no recommendations

The governmentrsquos censorship of media outlets can take various forms For examplethe government can ban access to internet sites withdraw licenses disrupt financingconfiscate print materials and equipment discredit media providers and even arresteditors and journalists using broadly formulated legislation on combating extremismIn some countries the government can exercise direct control over media editorialpolicies either through state ownership or administrative pressure

17The tie-breaking in the event of θ = c is unimportant as θ is a continuous random variable18We discuss the case of a continuum of media outlets in Section 5519Theoretical literature has explored the determinants of media slant of an outlet driven by its

citizens (Mullainathan and Shleifer 2005 Gentzkow and Shapiro 2006 and Chan and Suen 2008)and its owners (Baron 2006 and Besley and Prat 2006)

18 KOLOTILIN MYLOVANOV ZAPECHELNYUK

53 Formulating Media Censorship as Persuasion We now show that the me-dia censorship problem can be formulated as a linear persuasion problem in whichthe government is a sender and a representative citizen is a receiver

In our model the citizensrsquo and the governmentrsquos utilities are linear in θ There-fore given any information from the media outlets the utilities depend only on theposterior mean of θ

Consider an arbitrary posterior mean of θ denoted by m isin [0 1] Each citizen of typer chooses ar = 1 if and only if r le m The externality term κ plays no role in thisdecision Therefore the aggregate action a is simply the mass of all citizens whosetypes do not exceed m so a = G(m)

Next using the citizensrsquo optimal behavior we derive the governmentrsquos expected utilityconditional on the posterior mean m

V (m) = E983063983133 1

0

ν(θ r ar G(m))dG(r) + δγ(θ G(m))

983055983055983055983055m983064

=

983133 1

0

ν(m r1rlem G(m))dG(r) + δγ(mG(m)) (6)

This is the governmentrsquos indirect utility As in Section 2 the government now choosesa signal which is informative about the state to maximize its expected utility How-ever in contrast to Section 2 here the government is restricted to signals that areimplementable by a subset of a given set of media outlets

Next we show that with an appropriate definition of the state the restriction tosignals implementable by a subset of a given set of media outlets can be formulatedas a restriction to monotone partitional signals as in Scenario B in Section 2 Thuswe will be able to use our results in Section 4 to find the optimal censorship policies

Recall that the information about θ is only available through the media outlets Let ωdenote a random variable equal to the conditional expectation of θ given the messagesof all media outlets in C Note that as C is finite ω is a discrete random variablewith values in [0 1] Let F denote the distribution of ω From now on we treat therandom variable ω as the state and denote by s a signal as defined in Section 2

Let HM denote the set of all distributions of the posterior mean state induced byall monotone partitional signals and let HC denote the set of all distributions ofthe posterior mean state induced by all finite subsets of the set C of media outletsObserve that any set of media outlets induces a monotone partitional signal thatis HC sub HM We now show that every outcome implementable by an arbitrarymonotone partition in HM is also implementable by a subset of media outlets

Lemma 2 HC = HM

For illustration suppose that there is only one media outlet with the editorial policyc isin (0 1) There are two observable events θ ge c and θ lt c In each realized event

CENSORSHIP AS OPTIMAL PERSUASION 19

the citizens compute the posterior expectation of θ which we will call ω So in thisexample the distribution F of ω has two mass points one for each of the two events

Given these two mass points every monotone partition of [0 1] induces one of twopossible distributions of the posterior mean state (i) the mass points revealed and (ii)the mass points pooled Observe however that these distributions are implementableby permitting and censoring the media outlet respectively

A censorship policy is upper censorship if it censors all sufficiently disloyal mediaoutlets Specifically there exists a cutoff clowast isin C and an indicator qlowast isin 0 1 suchthat all media outlets whose editorial policies are below clowast are permitted all mediaoutlets whose editorial policies are above clowast are censored and clowast is permitted if andonly if qlowast = 1 That is X = c isin C c le clowast if qlowast = 1 and X = c isin C c lt clowastif qlowast = 0 Note that the full censorship policy clowast = 0 and qlowast = 0 (where all mediaoutlets are censored) and the free media policy clowast = 1 and qlowast = 1 (where all mediaoutlets are permitted) are the two extreme upper-censorship policies

As shown above the media censorship problem is equivalent to the linear persuasionproblem in which the sender is restricted to monotone partitional signals and thesenderrsquos indirect utility V is given by (6) We thus apply Theorem 1prime(B) to obtainthe following result

Theorem 1primeprime If V is S-shaped then an optimal censorship policy is upper censorship

To illustrate Theorem 1primeprime suppose the government is interested only in the aggregateaction so that ν = 0 and γ depends only on the aggregate action a so V (m) =γ(G(m)) Thus an upper-censorship policy is optimal if the composition functionγ(G(middot)) is S-shaped For example this condition holds if the government is interestedin reaching a certain approval threshold (eg a simple majority) so that γ is a stepfunction This condition also holds if γ is S-shaped and G is uniform or S-shapedwith the same inflection point as γ20

54 Canonical Case We now impose more structure on the utilities to obtain asharper result for the optimality of upper censorship and to perform a comparativestatics analysis Analogously to assumption (A1) we assume that the governmentrsquosutility is a weighted average of the citizensrsquo utility and their aggregate action

983133 1

0

u(θ r ar a)dG(r) + δa

where u(θ r ar a) = (θ minus r)ar + ζ(r)a

(A2)

for some continuously differentiable function ζ Define

β =

983133 1

0

ζ(r)dG(r) + δ

20This condition holds in many special cases where γ and G are S-shaped even when γ and Ghave different inflection points

20 KOLOTILIN MYLOVANOV ZAPECHELNYUK

The term983125 1

0ζ(r)dG(r) is the governmentrsquos expected bias towards the citizensrsquo action

a = 1 due to the citizensrsquo externality and the term δ is the governmentrsquos intrinsicbias towards a greater aggregate action a Thus we interpret β as the governmentrsquosaggregate bias This decomposition of the bias allows for different interpretations whythe government is biased So β can be high because the government is too self-serving(high δ) or because the government is benevolent (low δ) but wishes to internalizestrong positive externalities of the citizens (high ζ(r)) To ease interpretations weassume that β gt 021

Under Assumption (A2) the governmentrsquos indirect utility V given by (6) becomes

V (m) =

983133 m

0

(mminus r)dG(r) + βa =

983133 m

0

v(m r)dG(r)

where

v(m r) = β

9830611 +

1

β(mminus r)

983062

This is the same as v given by Assumption (A1) with ρ = βminus1 up to rescaling by apositive constant Consequently we can apply Theorem 2prime(B) to obtain the followingresult

Theorem 2primeprime Let (A2) hold If the density g of citizens types is log-concave then anoptimal censorship policy is upper censorship

We now apply the comparative statics analysis presented in Section 32 Note thatthe upper-censorship policies are ordered according to the amount of informationtransmitted to the citizens in the sense of Blackwell (1953) A greater censorshipthreshold clowast means that more media outlets are permitted With this order in mindwe apply Theorem 3prime(B) to make a comparative statics analysis on the amount ofinformation that is optimally disclosed by the government

First the censorship cutoff is increasing in the alignment parameter ρ Recall thatthis is a reciprocal of the governmentrsquos bias ρ = βminus1 This means that the governmentoptimally discloses more information (the censorship cutoff is greater) when it is lessbiased (β is smaller) Intuitively as β decreases the government puts more weighton the citizensrsquo utility so it optimally endows the population with a higher utility bycensoring fewer media outlets and disclosing more information

Second the censorship cutoff is increasing in the magnitude t of the horizontal shiftof the density g A greater t corresponds to a greater opportunity cost of actiona = 1 for each citizen This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more difficultto persuade to take action a = 1 (parameter t is greater) Informally speaking topersuade the same type of the citizen the government needs to increase the posterior

21If β lt 0 then swapping the roles of a = 0 and a = 1 reverses the sign of the bias If β = 0then the governmentrsquos utility and the citizensrsquo private interests coincide so it is trivially optimal topermit all media outlets

CENSORSHIP AS OPTIMAL PERSUASION 21

mean But the expected posterior mean must be equal to the prior mean so it is notpossible to increase all posterior means Due to the log-concave shape of the densityof the citizensrsquo types this tradeoff is resolved by increasing the posterior mean of thepooling interval E[ω|ω ge clowast] This is done by shrinking the pooling interval [clowast 1]that is increasing the censorship cutoff clowast

Third when the number of media outlets is sufficiently large the censorship cutoffis increasing in the stretch factor σ of the density g around its peak τ (see Footnote15) A greater σ corresponds to a more dispersed distribution of the opportunitycost in the population This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more diverse inhow difficult they are to persuade Intuitively when the types are more spread outpersuading the same mass of types requires a greater posterior mean which leads toa higher censorship cutoff similarly to the comparative statics in t discussed above

55 Extensions and Open Questions Let us now consider a few extensions ofour model of media censorship

In our model the set of media outlets is exogenous and the governmentrsquos only in-strument is censorship We now consider three alternative ways of expanding thegovernmentrsquos instruments of influence

First suppose that the government is able not only to censor existing media outletsbut also to introduce new media outlets with chosen editorial policies This is equiv-alent to our censorship model where all media outlets in [0 1] are initially availableand the government can censor any subset of them When all media outlets in [0 1]are permitted the revealed information about θ (which we call ω) is θ itself Thusω = θ is a continuous random variable with distribution F = T So we can nowapply our results for the continuous state from Section 3 instead of those for the dis-crete state in Section 4 reaching the same conclusion about the optimality of uppercensorship

Second suppose that the government can garble information available from mediaoutlets That is the government is not restricted to monotone partitions it cancreate arbitrary signals about state ω for the citizens to observe This becomes ageneral persuasion problem and our Theorems 1prime(A) 2prime(A) and 3prime(A) apply

Third suppose that the government is able to restrict not only which media outletsare permitted but also how many media outlets each citizen can choose to observeIn this extension the citizens are not allowed to communicate with one another (oth-erwise they could share the information thus observing all permitted media outletsindirectly) This extension does not affect our results as long as each citizen is al-lowed to access at least one media outlet of his choice as in Chan and Suen (2008)Intuitively this is because each citizen categorizes the information from the mediaoutlets into ldquogood newsrdquo where a = 1 is optimal and ldquobad newsrdquo where a = 0 is

22 KOLOTILIN MYLOVANOV ZAPECHELNYUK

optimal Because the information from the media outlets induces a monotone parti-tion it means that ldquogood newsrdquo is separated from ldquobad newsrdquo by a threshold mediaoutlet that depends on the citizenrsquos type That is it is sufficient to observe a singlethreshold media outlet to distinguish ldquogood newsrdquo from ldquobad newsrdquo

In our model each citizenrsquos utility depends on the aggregate action a through theexternality term κ(θ r a) which does not affect the chosen action Let us relax thisassumption so that a citizenrsquos optimal choice can depend on a We can still writethe senderrsquos indirect utility V as a function of the posterior mean state and applyour results However V is no longer uniquely determined by the primitives of themodel It is now endogenous and depends on an equilibrium the citizens play aseach citizenrsquos optimal action now depends on what all citizens do in equilibrium Forexample given the same information about the state a citizen could prefer to choosea = 1 if and only if sufficiently many citizens choose the same action so a is largeenough This creates the problem of multiplicity of equilibria and as a consequencethe dependence of optimal censorship on equilibrium selection

As a side our media censorship problem can also be applied to spatial voting mod-els as in Chiang and Knight (2011) Consider a government party (p = G) and anopposition party (p = O) competing in an election If party p wins a voter withan ideological position r gets utility wp minus (r minus rp)

2 where wp is the quality or va-lence of party p and rp is the ideology or policy platform of party p Voters knowthe partiesrsquo ideologies and obtain information about the partiesrsquo qualities from allavailable media outlets Each voter supports the party that maximizes his expectedutility Within this context our analysis still applies because the voterrsquos utility dif-ference between the government and opposition parties is proportional to θminusr whereθ = (wG minus wO + r2O minus r2G)2(rO minus rG)

There are a few more extensions that can be relevant in applications First insteadof complete censorship of a media outlet there can be a cost of accessing it Forexample an international news channel can be censored by a local government butcitizens may still access it through VPN at some cost Second it can be costly forthe government to censor media outlets So another important question to answeris how a government may prioritize censoring Finally citizens can incur some costof following each media outlet While we have already mentioned that citizens gainno benefit from following more than one outlet it is entirely possible for them tostop watching news altogether if it is sufficiently uninformative These extensions arenontrivial and left for future research

Appendix

Proof of Theorem 1 Suppose that V is S-shaped For mωlowast isin [0 1] and mlowast(ωlowast) =E[ω|ω ge ωlowast] define

V (mωlowast) = V (mlowast(ωlowast)) + V prime(mlowast(ωlowast))(mminusmlowast(ωlowast)) (7)

CENSORSHIP AS OPTIMAL PERSUASION 23

Since V (m) is convex on [0 τ ] and concave on [τ 1] there exists ωlowast isin [0 τ ] thatsatisfies condition (3) that is

V (mωlowast) ge (le)V (m) for all m ge (lt)ωlowast (8)

This is true because V (ωlowastωlowast) minus V (ωlowast) is single-crossing from below To see thisconsider Figure 1 Observe that (3) can hold only if ωlowast is on the convex part of V andmlowast is on the concave part of V As the censorship cutoff ωlowast increases the posteriormean state of the pooling message mlowast(ωlowast) also increases But because V is concaveat mlowast(ωlowast) the dashed tangent line V (mωlowast) becomes flatter in m so it crosses thesolid line V (m) at a smaller m

Consider an arbitrary signal s Let H be the distribution of m = E[ω|s] inducedby signal s Since each signal is a garbling of the fully informative signal F is amean-preserving spread of H Thus the senderrsquos expected utility is smaller undersignal s than under the upper-censorship signal with cutoff ωlowast as follows from

983133 1

0

V (m)dH(m) le983133 1

0

maxV (m) V (mωlowast)dH(m)

le983133 1

0

maxV (m) V (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mlowast(ωlowast))dF (m)

where the second line holds because maxV (m) V (mωlowast) is convex in m and F isa mean-preserving spread of H the third line holds by (8) and the last line holds by

(7) and by983125 1

ωlowast(mminusmlowast(ωlowast))dF (m) = 0

Conversely suppose that V is not S-shaped Then there exist 0 le m1 lt m2 lem3 lt m4 le 1 such that V primeprime(m) lt 0 for m isin (m1m2) V

primeprime(m) = 0 for m isin [m2m3]and V primeprime(m) gt 0 for m isin (m3m4) because V primeprime is continuous by assumption It isstraightforward to show that there exists 983141mlowast isin (m1m2) (sufficiently close to m2) and983141ωlowast isin (m3m4) such that

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) lt V (m) for all m isin (983141ωlowastm4]

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) gt V (m) for all m isin [m1 983141ωlowast] 983141mlowast

Moreover there exists a continuous distribution F such that the support of F is asubset of [m1m4] and 983141mlowast = E[ω|ω le 983141ωlowast] Using a chain of inequalities similar to theone above it is easy to show that lower-censorship signal 983141slowast with cutoff 983141ωlowast is optimalFurthermore it is uniquely optimal as both inequalities hold with equalities only forsignal 983141slowast This in turn implies that any upper-censorship signal is suboptimal

24 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Proof of Lemma 1 Notice that under assumption (A1) we have

V primeprime(m) = gprime(m) + ρg(m) for m isin [0 1]

Recall that V is S-shaped if and only if V primeprime(m) = gprime (m) + ρg (m) is single-crossingfrom above Using Proposition 1 in Quah and Strulovici (2012) it is easy to showthat V primeprime(m) is single-crossing from above for all ρ isin R if and only if gprime (m)g (m) isnonincreasing in m (that is ln g(m) is concave)

Proof of Theorem 2 Immediate by Theorem 1 and Lemma 1

Proof of Theorem 3 Part (i) We fix t and σ and vary ρ Without loss ofgenerality let t = 0 and σ = 1 (so that Gtσ = G) Consider a signal that induces thedistribution H of posterior mean m This signal implements the receiverrsquos interimutility U given by

U(r) =

983133 1

r

(mminus r)dH(m) for r isin [0 1]

which holds because the receiver acts if and only if m ge r Since types r lt 0 alwayschoose a = 1 and types r gt 1 always choose a = 0 we can exclude them fromconsideration and assume that supp(G) = [0 1] The senderrsquos expected utility is then

983133 1

0

V (m)dH(m) =

983133 1

0

983133 m

0

(1 + ρ(mminus r))dG(r)dH(m)

=

983133 1

0

983133 1

r

(1 + ρ(mminus r))dH(m)dG(r)

=

983133 1

0

(1minusH(r) + ρU(r))dG(r)

Consider ρ2 gt ρ1 Suppose by contradiction that the corresponding optimal upper-censorship signals s2 and s1 are such that s1 is strictly more informative than s2 LetU2 and U1 be the receiverrsquos interim utilities implemented by s2 and s1 Since thesender prefers signal s2 under ρ2 and signal s1 under ρ1 we have

983133 1

0

(1minusH2 (r) + ρ2U2 (r)) dG (r) ge983133 1

0

(1minusH1 (r) + ρ2U1 (r)) dG (r)

983133 1

0

(1minusH1 (r) + ρ1U1 (r)) dG (r) ge983133 1

0

(1minusH2 (r) + ρ1U2 (r)) dG (r)

Summing up these inequalities gives

(ρ2 minus ρ1)

983133 1

0

(U2 (r)minus U1 (r)) dG (r) ge 0

leading to a contradiction because U1(r) ge U2(r) for all r with strict inequality forsome r given that s1 is strictly more informative than s2

CENSORSHIP AS OPTIMAL PERSUASION 25

Part (ii) We fix ρ isin R and σ = 1 and vary t The indirect utility of the senderconditional on m isin [0 1] is given by

Vt(m) =

983133

rlem

(1+ρ(mminusr))gt1(r)dr =

983133

rlemminust

(1+ρ(mminustminusr))g(r)dr = V (mminust) (9)

By Theorem 1 when t0 = 0 an optimal cutoff ωlowast0 satisfies

V (m) le V (mlowast0) + V prime(mlowast

0)(mminusmlowast0) for m isin [ωlowast

0 1]

When V is S-shaped and t1 lt 0 it is easy to see that

V (mminus t1) le V (mlowast0 minus t1) + V prime(mlowast

0 minus t1)(mminusmlowast0) for m isin [ωlowast

0 1]

which implies that when t = t1 the above inequality changes its sign at some ωlowast1

that satisfies ωlowast1 le ωlowast

0 It follows from (9) that ωlowast1 is an optimal cutoff for t = t1 and

as we established above ωlowast1 le ωlowast

0

Part (iii) We fix ρ ge 0 and t = 0 and vary σ The indirect utility of the senderconditional on m isin [0 1] is given by

V ρσσ(m) =

983133

rlem

9830591 +

ρ

σ(mminus r)

983060dG0σ(r)

=

983133

rlem

9830591 +

ρ

σ(mminus r)

983060g

983061r minus (1minus σ)τ

σ

9830621

σdr

=

983133

rlemminus(1minusσ)τσ

9830611 + ρ

983061mminus (1minus σ)τ

σminus r

983062983062g(r)dr

= Vρ1

983061mminus (1minus σ)τ

σ

983062 (10)

Let σ0 = 1 σ1 lt 1 and ρ1 = ρσ1 To compare the optimal cutoffs under parameters(ρ σ0) and (ρ σ1) we consider two steps first comparing the cutoffs between (ρ σ0)and (ρ1 σ1) and then comparing the cutoffs between (ρ1 σ1) and (ρ σ1)

First we compare an optimal cutoff ωlowast0 under (ρ σ0) = (ρ 1) and an optimal cutoff

ωlowast1 under (ρ1 σ1) By Theorem 1 ωlowast

0 satisfies

Vρ1(m) le Vρ1(mlowast0) + V prime

ρ1(mlowast0)(mminusmlowast

0) for m isin [ωlowast0 1]

When V is increasing (implied by ρ ge 0) and S-shaped and σ1 lt 1 it is easy to seethat for all m isin [ωlowast

0 1]

Vρ1

983061mminus (1minus σ1)τ

σ1

983062le Vρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062+ V prime

ρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062(mminusmlowast

0)

which implies that when σ = σ1 lt 1 the above inequality changes its sign at someωlowast1 that satisfies ωlowast

1 le ωlowast0 It follows from (10) that ωlowast

1 is an optimal cutoff under(ρσ σ1) = (ρ1 σ1) and as we established above ωlowast

1 le ωlowast0

26 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Second we compare the optimal cutoff ωlowast1 under (ρ1 σ1) and the optimal cutoff ωlowast

2

under (ρ σ1) Because ρ lt ρ1 = ρσ1 by part (i) we have ωlowast2 le ωlowast

1

Combining ωlowast1 le ωlowast

0 and ωlowast2 le ωlowast

1 we obtain ωlowast2 le ωlowast

0

Proof of Theorem 1prime Part (A) The proof is analogous to that of Theorem 1 andis thus omitted

Part (B) The only if statement follows from that in part (A) (see the proof of Theorem1) and the fact that we can always ensure that an optimal stochastic censorship(ωlowast qlowast) is deterministic (so qlowast isin 0 1) by the choice of the support and distributionof ω Thus we only need to prove the if statement

Let F be a distribution of ω with support on n values 0 lt ω0 lt lt ωnminus1 lt 1 Thisis without loss of generality because if 0 andor 1 were in the support of F we couldredefine the state as ω = (ω + ε)(1 + 2ε) and the type as r = (r + ε)(1 + 2ε)

Let X sub ω1 ωnminus1 identify a set of cutoffs defining a monotone partition for eachωi isin X the receiver is informed whether ω lt ωi or ω ge ωi Note that we exclude ω0

from the set of cutoffs because ω ge ω0 always holds

Note that X is deterministic upper censorship if and only if there exists a censorshipthreshold ωlowast isin [0 1] such that for each i = 1 nminus 1 ωi isin X if and only if ωi lt ωlowastthat is we can assume that qlowast = 0 without loss of generality

Let VX be the senderrsquos expected utility when the monotone partition is given by X

If n = 2 then there are only two partitions X = ω1 (full disclosure) and X = empty(no disclosure) Both are upper censorships so part (B) holds trivially

If n = 3 then there are four monotone partitions X = ω1ω2 (full disclosure)X = ω2 X = ω1 and X = empty (no disclosure) Only X = ω2 is not uppercensorship The next lemma proves that X = ω2 cannot be uniquely optimal whenV is S-shaped thus proving the optimality of upper censorship for n = 3

Denote

mjk = E983045ω|ω isin [ωjωk]

983046

For convenience of notation we write mj = ωj

Lemma 3 Let V be S-shaped with an inflexion point τ Let ω be a discrete randomvariable with support on m0m1m2 and probabilities p0 p1 p2 where 0 lt m0 ltm1 lt m2 lt 1 Then

(i) if τ le m0 then Vm2 le Vempty

(ii) if τ ge m1 then Vm2 le Vm1m2

(iii) if m0 lt τ lt m1 then Vm2 le maxVempty Vm1m2

CENSORSHIP AS OPTIMAL PERSUASION 27

Proof Define p01 = p0 + p1 and p02 = p01 + p2 = 1 Observe that

m01 =p0m0 + p1m1

p0 + p1and m02 =

p0m0 + p1m1 + p2m2

p0 + p1 + p2

We thus have

Vempty = p02V (m02) = (p01 + p2)V

983061p01m01 + p2m2

p01 + p2

983062

Vm2 = p01V (m01) + p2V (m2) = (p0 + p1)V

983061p0m0 + p1m1

p0 + p1

983062+ p2V (m2)

Vm1m2 = p0V (m0) + p1V (m1) + p2V (m2)

Clearly

m0 lt m01 lt m1 and m0 lt m02 lt m2

Therefore

Vm2 le Vempty lArrrArrp01

p01 + p2V (m01) +

p2p01 + p2

V (m2) le V

983061p01

p01 + p2m01 +

p2p01 + p2

m2

983062= V (m02)

(11)

and

Vm2 le Vm1m2 lArrrArr

V

983061p0

p0 + p1m0 +

p1p0 + p1

m1

983062le p0

p0 + p1V (m0) +

p1p0 + p1

V (m1)(12)

Suppose that τ le m01 so V is concave on [m01m2] Then (11) holds by Jensenrsquosinequality In particular it holds for τ le m0 lt m01 thus proving part (i)

Suppose that τ ge m1 so V is convex on [m0m1] Then (12) holds by Jensenrsquosinequality thus proving part (ii)

Finally suppose that m01 lt τ lt m1 Let (τ V (τ)) lie below (or on) the straight lineA connecting (m0 V (m0)) and (m2 V (m2)) as illustrated on Figure 2 By convexityof V (m) for m lt τ it must be the case that V (m01) lies below (or on) the dashedline B By concavity of V (m) for m gt τ it must be the case that V (m1) lies above(or on) the dashed line C Consequently (12) is satisfied

Alternatively let (τ V (τ)) lie above the straight line A connecting (m0 V (m0)) and(m2 V (m2)) as illustrated on Figure 3 If Vm2 le Vm1m2 then the proof of part(iii) is complete So suppose that Vm2 gt Vm1m2 so (12) is violated

By convexity of V (m) for m lt τ it must be the case that V (m01) lies below (or on)the line B Since (12) is violated V (m01) must lie between the lines A and B Recallthat m01 lt m02 lt m2 Observe that V (m02) must be in the shaded area on Figure 3Indeed if m02 lt τ then by convexity of V (m) for m lt τ it must be the case thatV (m02) lies between (or on) the lines B and D If m02 ge τ then by concavity of

28 KOLOTILIN MYLOVANOV ZAPECHELNYUK

τ m2m0

A C

B

m01 m1

V (m0) V (τ)

V (m2)

Figure 2 The case where (τ V (τ)) lies below (or on) the straight lineconnecting (m0 V (m0)) and (m2 V (m2))

τ m2m0

A

C

B

m01

V (m0)

V (τ)

V (m2)

D

V (m01)

Figure 3 The case where (τ V (τ)) lies above the straight line con-necting (m0 V (m0)) and (m2 V (m2))

V (m) for m gt τ it must be the case that V (m02) lies above (or on) line C It followsthat (11) is satisfied

We now adapt Lemma 3 to prove the optimality of upper censorship for n ge 4

CENSORSHIP AS OPTIMAL PERSUASION 29

Fix a monotone partition X sub ω1 ωnminus1 that is not upper censorship that isthere exists i isin 2 nminus 1 such that

ωiminus1 ∕isin X and ωi isin X (13)

Denote ωn = 1 Define j and k as follows If X cap ω1 ωiminus2 ∕= empty let ωj be thelargest element inX smaller than i otherwise let ωj = ω0 IfXcapωi+1 ωnminus1 ∕= emptylet ωk be the smallest element in X greater than i otherwise let ωk = ωn = 1 Define

m0 = ωj m1 = ωiminus1 m2 = ωi

and

p0 =Pr[ω isin [ωjωiminus1)]

Pr[ω isin [ωjωk)] p1 =

Pr[ω isin [ωiminus1ωi)]

Pr[ω isin [ωjωk)] p2 =

Pr[ω isin [ωiωk)]

Pr[ω isin [ωjωk)]

To apply Lemma 3 we identify Xωi with empty X with m2 and X cup ωiminus1 withm1m2 By Lemma 3 VX le maxVXωi VXcupωiminus1 Thus replacing X withXωi if VX le VXωi and with X cup ωiminus1 otherwise weakly improves the senderrsquosexpected utility Repeatedly applying such replacements yields upper censorshipIndeed this procedure terminates after a finite number of replacements because byLemma 3 for ωj ge τ we replace X with Xωi and for ωiminus1 le τ we replace Xwith X cup ωiminus1

We thus have shown that when V is S-shaped when optimizing over the set ofmonotone partitions we can restrict attention to deterministic upper censorship Itremains to show that an optimal deterministic censorship cutoff ωlowast

d coincides with anoptimal stochastic censorship cutoff ωlowast from part (A) To see this it is straightforwardto show that

V lowast(ωlowast qlowast) =

983133

[0ωlowast)

V (ω)dF (ω) + V (ωlowast)qlowast Pr[ω = ωlowast]

+ V (mlowast(ωlowast qlowast))((1minus qlowast) Pr[ω = ωlowast] + Pr[ω gt ωlowast]) (14)

is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order Since the two de-terministic censorship cutoff pairs closest to (ωlowast qlowast) in the Blackwell informativenessorder are (ωlowast 0) and (ωlowast 1) we obtain ωlowast

d = ωlowast

Proof of Theorem 2prime Immediate by Theorem 1prime and Lemma 1

Proof of Theorem 3prime The proof for an optimal signal is analogous to that ofTheorem 3 and is thus omitted The same applies to part (i) for an optimal monotonepartition Thus it remains to prove part (ii) for an optimal monotone partition

We fix ρ and σ = 1 and vary t continuously If ωlowast isin supp(F ) then an optimalmonotone partition is an optimal signal because all qlowast isin [0 1] are optimal Thuspart (ii) in (B) holds by part (ii) in (A) If ωlowast isin supp(F ) and qlowast isin (0 1) then either(ωlowast

d qlowastd) = (ωlowast 0) or (ωlowast

d qlowastd) = (ωlowast 1) is optimal because the senderrsquos expected utility

V lowast given by (14) is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order

30 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Therefore it suffices to consider values of t at which the sender is indifferent between(ωlowast 0) and (ωlowast 1) and to show that the sender strictly prefers (ωlowast 1) to (ωlowast 0) at aslightly higher t

V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) =rArr dV lowastt (ω

lowast 0)

dtle dV lowast

t (ωlowast 1)

dt (15)

The condition V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) can be written as

Vt(mlowast(ωlowast 0)) Pr[ω ge ωlowast] = Vt(ω

lowast) Pr[ω = ωlowast] + Vt(mlowast(ωlowast 1)) Pr[ω gt ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)Pr[ω = ωlowast]

Pr[ω ge ωlowast]+ Vt(m

lowast(ωlowast 1))Pr[ω gt ωlowast]

Pr[ω ge ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + Vt(mlowast(ωlowast 1))

mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Analogously the conditiondV lowast

t (ωlowast0)dt

le dV lowastt (ωlowast1)dt

can be written as

V primet (m

lowast(ωlowast 0)) ge V primet (ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + V primet (m

lowast(ωlowast 1))mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Thus (15) states that if Vt(mlowast(ωlowast 0)) is equal to the convex combination of Vt(ω

lowast)and Vt(m

lowast(ωlowast 1)) then the derivative V primet (m

lowast(ωlowast 0)) is greater than the convex com-bination of the derivatives V prime

t (ωlowast) and V prime

t (mlowast(ωlowast 1)) When V is S-shaped it is easy

to see that this property is satisfied

Proof of Lemma 2 Observe that a signal induced by any set X sub C of permittedmedia outlets induces a monotone partition of [0 1] and thus HC sub HM Indeed letcX(ω) and cX(ω) be the media outlets in X that are the closest to ω from above andfrom below

cX(ω) = sup983059c isin X c le ω

983052cup 0

983060and cX(ω) = inf

983059c isin X c gt ω cup 1

983060

By observing the media outlets in X each citizen is informed that ω isin [cX(ω) cX(ω))

Conversely let the set C of media outlets be c0 lt middot middot middot lt cn Without loss of generalityassume that C contains the two uninformative media outlets c0 = 0 and cn = 1 which(almost) always endorse and criticize action a = 1 respectively Then the distributionF of the state ω is a discrete distribution whose support is Ω = ω0 ωnminus1 whereωi = E[θ|θ isin [ci ci+1)] for i isin 0 nminus 1 Consider a monotone partitional signalrepresented by a nondecreasing function ξ [0 1] rarr R For each realization m of thissignal that occurs with a strictly positive probability define

x(m) = min983153ω isin Ω ξ(ω) = m

983154and x(m) = max

983153ω isin Ω ξ(ω) = m

983154

This signal induces a monotone partition of the finite set Ω This partition of Ω canbe implemented by a censorship policy that prohibits a media outlet c isin C if andonly if c isin (x(m) x(m)) for some realization m

CENSORSHIP AS OPTIMAL PERSUASION 31

References

Alonso R and O Camara (2016a) ldquoPersuading Votersrdquo American EconomicReview 106 3590ndash3605

(2016b) ldquoPolitical Disagreement and Information in Electionsrdquo Games andEconomic Behavior 100 390ndash412

Bagnoli M and T Bergstrom (2005) ldquoLog-Concave Probability and Its Ap-plicationsrdquo Economic Theory 26 445ndash469

Baron D P (2006) ldquoPersistent Media Biasrdquo Journal of Public Economics 90(1)1ndash36

Besley T and A Prat (2006) ldquoHandcuffs for the Grabbing Hand MediaCapture and Government Accountabilityrdquo American Economic Review 96(3) 720ndash736

Blackwell D (1953) ldquoEquivalent Comparisons of Experimentsrdquo Annals of Math-ematical Statistics 24 265ndash272

Chan J and W Suen (2008) ldquoA Spatial Theory of News Consumption andElectoral Competitionrdquo Review of Economic Studies 75 699ndash728

Chiang C-F and B Knight (2011) ldquoMedia Bias and Influence Evidence fromNewspaper Endorsementsrdquo Review of Economic Studies 78 795ndash820

Duffie D P Dworczak and H Zhu (2017) ldquoBenchmarks in Search MarketsrdquoJournal of Finance 72 1983ndash2044

Dworczak P (2017) ldquoMechanism Design with Aftermarkets Cutoff Mecha-nismsrdquo mimeo

Dworczak P and G Martini (2019) ldquoThe Simple Economics of Optimal Per-suasionrdquo Journal of Political Economy 127 993ndash2048

Edmond C (2013) ldquoInformation Manipulation Coordination and RegimeChangerdquo Review of Economic Studies 80 1422ndash1458

Egorov G S Guriev and K Sonin (2009) ldquoWhy Resource-poor DictatorsAllow Freer Media A Theory and Evidence from Panel Datardquo American PoliticalScience Review 103 645ndash668

Gehlbach S and K Sonin (2014) ldquoGovernment Control of the Mediardquo Journalof Public Economics 118 163ndash171

Gentzkow M and E Kamenica (2016) ldquoA Rothschild-Stiglitz Approach toBayesian Persuasionrdquo American Economic Review Papers amp Proceedings 106597ndash601

Gentzkow M and J Shapiro (2006) ldquoMedia Bias and Reputationrdquo Journalof Political Economy 114(2) 280ndash316

(2010) ldquoWhat Drives Media Slant Evidence from US Daily NewspapersrdquoEconometrica 78(1) 35ndash71

Ginzburg B (2019) ldquoOptimal Information Censorshiprdquo Journal of EconomicBehavior and Organization 163 377ndash385

Goldstein I and Y Leitner (2018) ldquoStress Tests and Information DisclosurerdquoJournal of Economic Theory 177 34ndash69

32 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Guo Y and E Shmaya (2019) ldquoThe Interval Structure of Optimal DisclosurerdquoEconometrica 87 653ndash675

Johnson J P and D P Myatt (2006) ldquoOn the Simple Economics of Adver-tising Marketing and Product Designrdquo American Economic Review 96 756ndash784

Kamenica E and M Gentzkow (2011) ldquoBayesian Persuasionrdquo American Eco-nomic Review 101 2590ndash2615

Kolotilin A (2015) ldquoExperimental Design to Persuaderdquo Games and EconomicBehavior 90 215ndash226

(2018) ldquoOptimal Information Disclosure A Linear Programming Ap-proachrdquo Theoretical Economics 13 607ndash636

Kolotilin A and H Li (2019) ldquoRelational Communicationrdquo mimeoKolotilin A M Li T Mylovanov and A Zapechelnyuk (2015) ldquoPer-suasion of a Privately Informed Receiverrdquo mimeo

Kolotilin A T Mylovanov A Zapechelnyuk and M Li (2017) ldquoPer-suasion of a Privately Informed Receiverrdquo Econometrica 85 1949ndash1964

Kolotilin A and A Wolitzky (2019) ldquoAssortative Information Disclosurerdquomimeo

Lewis T R and D Sappington (1994) ldquoSupplying Information to FacilitatePrice Discriminationrdquo International Economic Review 35 309ndash327

Lorentzen P (2014) ldquoChinarsquos Strategic Censorshiprdquo American Journal of Polit-ical Science 58 402ndash414

Mullainathan S and A Shleifer (2005) ldquoThe Market for Newsrdquo AmericanEconomic Review 95(4) 1031ndash1053

Orlov D P Zryumov and A Skrzypach (2018) ldquoDesign of Macro-prudentialStress Testsrdquo mimeo

Ostrovsky M and M Schwarz (2010) ldquoInformation Disclosure and Unravelingin Matching Marketsrdquo American Economic Journal Microeconomics 2 34ndash63

Prat A and D Stromberg (2013) ldquoThe Political Economy of Mass MediardquoAdvances in Economics and Econometrics 2 135ndash187

Quah J and B Strulovici (2012) ldquoAggregating the Single Crossing PropertyrdquoEconometrica 80 2333ndash2348

Rayo L and I Segal (2010) ldquoOptimal Information Disclosurerdquo Journal ofPolitical Economy 118 949ndash987

Romanyuk G and A Smolin (2019) ldquoCream Skimming and Information Designin Matching Marketsrdquo American Economic Journal Microeconomics 11 250ndash276

Suen W (2004) ldquoThe Self-Perpetuation of Biased Beliefsrdquo Economic Journal 114377ndash396

Yamashita T (2018) ldquoOptimal Public Information Disclosure by Mechanism De-signerrdquo mimeo

Zapechelnyuk A (2019) ldquoOptimal Quality Certificationrdquo American EconomicReview Insights forthcoming

4 KOLOTILIN MYLOVANOV ZAPECHELNYUK

easier to persuade in that the receiverrsquos cost of accepting the proposal is smaller and(iii) if the sender is more certain about the receiverrsquos preferences in that the densityof receiver types is more concentrated around its mode

We apply our results to the problem of media censorship by the government Weconsider a stylized setting with a finite number of media outlets and a continuumof heterogeneous citizens (receivers) Each media outlet is identified by its editorialpolicy that specifies an interval of endorsement states We permit the aggregate ac-tion of the citizens to affect their utility but not their optimal actions For examplean election outcome impacts all citizens but does not change their preferences overcandidates The government wishes to influence citizensrsquo actions by deciding whichmedia outlets to censor In the canonical case if the probability density of citizenstypes is log-concave then the optimal media censorship policy is to permit all suf-ficiently loyal media outlets and to censor the remaining outlets Our comparativestatics results suggest that the government should optimally increase censorship (i)if influencing society decisions becomes relatively more important than maximizingindividual welfare (ii) if the society experiences an ideology shock in favor of thegovernment and (iii) if the society becomes less diverse in ideology and taste

Related Literature The literature on Bayesian persuasion was set in motion bythe seminal papers of Rayo and Segal (2010) and Kamenica and Gentzkow (2011)The linear case which is the subject of this paper is prevalent in the literature5

Kolotilin (2018) provides necessary and sufficient prior dependent conditions for theoptimality of censorship in the case of a continuous state6 Alonso and Camara(2016b) provide sufficient conditions in the case of a discrete state In contrastour paper provides necessary and sufficient conditions that are prior independent (aswell as bias independent in the canonical case) and apply in both continuous anddiscrete cases More importantly we show that these conditions are also necessaryand sufficient for deterministic censorship to be optimal in the class of monotonepartitional signals (simple signals) There are no counterparts to this result in theliterature

Similar to us Alonso and Camara (2016b) show that the optimal signal is less infor-mative if the receiver is easier to persuade Their proof is simpler but it is valid onlywhen the senderrsquos utility is state-independent But our comparative statics resultswith respect to the senderrsquos bias and uncertainty are novel

Censorship policies commonly emerge as optimal signals in linear persuasion models ina variety of contexts starting from the prosecutor-judge example as well as lobbyingand product advertising examples in Kamenica and Gentzkow (2011) Other contextswhere censorship is optimal include grading policies (Ostrovsky and Schwarz 2010)

5Notable exceptions include Rayo and Segal (2010) Goldstein and Leitner (2018) Guo andShmaya (2019) and Kolotilin and Wolitzky (2019)

6See also Gentzkow and Kamenica (2016) Kolotilin Mylovanov Zapechelnyuk and Li (2017)and Dworczak and Martini (2019) for sufficient conditions

CENSORSHIP AS OPTIMAL PERSUASION 5

media control (Gehlbach and Sonin 2014 Ginzburg 2019) clinical trials (Kolotilin2015) voter persuasion (Alonso and Camara 2016ab) transparency benchmarks(Duffie Dworczak and Zhu 2017) stress tests (Goldstein and Leitner 2018 OrlovZryumov and Skrzypach 2018) trading mechanisms (Romanyuk and Smolin 2019Dworczak 2017 Yamashita 2018) quality certification (Zapechelnyuk 2019) andrelational communication (Kolotilin and Li 2019)

Our application to media censorship fits into the literature on media capture thataddresses the problem of media control by governments political parties or lobbyinggroups Besley and Prat (2006) pioneered this literature by studying how a gov-ernmentrsquos incentives to censor free media depend on plurality (the number of mediaoutlets) and on transaction costs of bribing the media In their model media outletsare identical and all voters have common interests Thus the governmentrsquos optimaldecision reduces to either capturing all media outlets or none Relatedly Gehlbachand Sonin (2014) consider a setting with a government-influenced monopoly mediaoutlet that exploits the trade-off between the governmentrsquos objective to ldquomobilizerdquo thepopulation for some collective goal and to collect the revenue from subscribers whodemand informative news Gehlbach and Sonin (2014) show that the presence of theldquomobilizationrdquo objective increases the news bias whereas the subscription revenuesreduce the bias but can cause the government to nationalize the media outlet7

The models of Besley and Prat (2006) and Gehlbach and Sonin (2014) have two statesof the world and either a single media outlet or a few identical media outlets Hencethe government uses the same censorship policy for each outlet We are the first toconsider a richer model of media censorship with a continuum of states and multipleheterogeneous media outlets As a result the government optimally discriminatesmedia outlets by permitting sufficiently loyal ones and banning the remaining ones

2 Model

21 Setup There are two players a sender (she) and a receiver (he) The receiverchooses whether to accept a proposal (a = 1) or reject it (a = 0) The proposalhas an uncertain value ω isin [0 1] By accepting the proposal the receiver forgoesan outside option worth r isin [0 1] so the receiverrsquos utility is a(ω minus r) The senderrsquosutility is av(ω r) where v(ω r) is linear in ω and continuously differentiable in rWe will refer to ω as state and to r as type and denote their distributions by F andG respectively Throughout the paper we assume that distribution G of r has acontinuously differentiable and strictly positive density g8

As in many applications the state is either a continuous or discrete random variablewe will separately analyze these two cases We say that the state ω and distribution F

7In different contexts media control by a government has also been studied by Egorov Gurievand Sonin (2009) Edmond (2013) and Lorentzen (2014) See also the overview of the literature onmedia capture slant and transparency in Prat and Stromberg (2013)

8This assumption is made for clarity of exposition The results can be extended to general G

6 KOLOTILIN MYLOVANOV ZAPECHELNYUK

are continuous if F has a density f on [0 1] We say that the state ω and distributionF are discrete if the support of F is a finite set

A particularly tractable and intuitive special case is where the senderrsquos utility is aweighted sum of the receiverrsquos utility and action We call this the canonical caseFormally in the canonical case the senderrsquos utility when the proposal is accepted is

v (ω r) = 1 + ρ(ω minus r) ρ isin R (A1)

That is the sender is biased towards a = 1 but also puts a weight ρ on the receiverrsquosutility In particular if the weight ρ is large then the sender and receiverrsquos inter-ests are aligned whereas if the weight is zero then the sender cares only about thereceiverrsquos action

The receiver privately knows his type but he does not observe the state The sendercan influence the action taken by the receiver a = 1 or a = 0 by releasing a signalthat reveals information about the state A signal is a random variable s isin [0 1]that is independent of r but is possibly correlated with ω For example s is fullyinformative if it is perfectly correlated with ω and s is completely uninformative ifit is independent of ω

The timing is as follows First the sender publicly chooses a signal s Then realiza-tions of ω r and s are drawn Finally the receiver observes the realizations of histype r and the signal s and then chooses between a = 0 and a = 1

We consider two scenarios

A The sender can choose any signal

B The sender can choose any monotone partitional signal

Formally a signal s is a monotone partitional signal if there exists a nondecreasingfunction ξ(ω) such that s = ξ(ω) for each ω Every such signal induces a partitionof the state space [0 1] into intervals and singletons and the receiver observes thepartition element that contains the state

Under Scenario A we are interested in an optimal signal that maximizes the senderrsquosexpected utility among all signals This is a standard persuasion problem

Under Scenario B we are interested in an optimal monotone partition that maximizesthe senderrsquos expected utility among all monotone partitional signals This scenarioincorporates constraints that information designers often face in practice For ex-ample a non-monotone grading policy that gives better grades to worse performingstudents will be perceived as unfair and will be manipulated by strategic students

22 Upper Censorship A subset of signals called upper censorship will play aspecial role in this paper

CENSORSHIP AS OPTIMAL PERSUASION 7

An upper-censorship signal reveals states below a specified cutoff and pools statesabove this cutoff When the state is equal to the cutoff it can be revealed or pooledIf the state is continuous it does not matter whether the cutoff state is revealed orpooled because this is a zero probability event However if the state is discretewe will distinguish between deterministic and stochastic upper-censorship signalsdepending on what happens at the cutoff state

A lower-censorship signal is defined symmetrically it pools states below a specifiedcutoff and reveals states above this cutoff Because the case of lower censorshipis symmetric (see Remark 1 below) for clarity of exposition we focus on uppercensorship throughout the paper

A signal s is stochastic upper censorship if there exists a cutoff pair (ωlowast qlowast) consistingof a cutoff state ωlowast isin [0 1] and probability qlowast isin [0 1] such that states below ωlowast arerevealed states above ωlowast are pooled and the state ω = ωlowast is revealed with probabilityqlowast and pooled with probability 1minus qlowast For example s can be expressed as

s =

983099983105983103

983105983101

ω with probability one if ω lt ωlowast

ωlowast and mlowast with probabilities qlowast and 1minus qlowast if ω = ωlowast

mlowast with probability one if ω gt ωlowast

where

mlowast =

983125(ωlowast1]

ωdF (ω) + ωlowast(1minus qlowast) Pr[ω = ωlowast]983125(ωlowast1]

dF (ω) + (1minus qlowast) Pr[ω = ωlowast](1)

is the posterior expected state induced by the pooling signal Notice that mlowast is afunction of ωlowast and qlowast

A stochastic upper-censorship signal s with a cutoff pair (ωlowast qlowast) is deterministic uppercensorship if and only if qlowast isin 0 1 in which case s can be expressed as a monotonepartition of the state space [0 1] For example if qlowast = 0 then the signal s fullyreveals states ω lt ωlowast and pools states ω ge ωlowast so s can be expressed as

s =

983083ω if ω lt ωlowast

mlowast if ω ge ωlowast

wheremlowast = E[ω|ω ge ωlowast]

is the expected state conditional on being at least ωlowast Note that the fully informativesignal and the completely uninformative signal are deterministic upper-censorshipsignals with (ωlowast qlowast) = (1 1) and (ωlowast qlowast) = (0 0) respectively

23 Benchmark To illustrate the difference between the cases of a continuous anddiscrete state as well as between Scenarios A and B we solve a benchmark examplewhere there is no uncertainty about the receiverrsquos type

Let the state ω have the expected value E[ω] = 12 and let the receiverrsquos type r be

known to the sender and satisfy r isin98304312 1983044 In addition suppose that the sender wishes

8 KOLOTILIN MYLOVANOV ZAPECHELNYUK

to maximize the probability that the receiver accepts the proposal so v(ω r) = 1 forall ω and all r

Observe that if the sender reveals no information about ω the receiver will evaluate ωby its expected value of 1

2 thus having utility 1

2minus r lt 0 from accepting the proposal

So revealing no information makes the receiver reject the proposal with certaintyThe sender can do better by fully revealing ω in which case the proposal is acceptedwith probability Pr[ω ge r]

However the sender can do even better by pooling states above r with states below rwhile keeping the expected state induced by the pooling signal at least r Thus thereceiver who is unable to distinguish between the states in the pool will accept theproposal at all of these states

To illustrate the case of the continuous state suppose that ω is uniformly distributedon [0 1] The largest pool that maintains the posterior expectation at least r is theinterval [2r minus 1 1] Indeed once the receiver learns that ω isin [2r minus 1 1] given theuniform prior of ω the posterior expected state is r So the proposal is accepted whenω ge 2rminus 1 and rejected when ω lt 2rminus 19 Thus the deterministic upper-censorshipsignal with cutoff ωlowast = 2r minus 1 is optimal The senderrsquos expected utility is equal to

Pr[proposal is accepted] = Pr[ω ge 2r minus 1] = 2(1minus r)

which is twice as high as that from the fully informative signal

Pr[proposal is accepted] = Pr[ω ge r] = 1minus r

When ω is a discrete random variable an optimal way to reveal information aboutω takes the form of stochastic upper censorship For illustration suppose that ω canonly be 0 or 1 equally likely Now simply pooling states is not helpful for the senderPooling 0 and 1 yields the posterior expected state of 1

2 which is smaller than the

type r resulting in a rejection of the proposal Including any other states in thepool makes no difference as these states never occur Thus an optimal monotonepartition is the fully informative signal and the probability that the receiver acceptsthe proposal is simply equal to the probability that ω = 1

Pr[proposal is accepted] = Pr[ω ge r] = Pr[ω = 1] =1

2

The sender can do strictly better by partial (stochastic) pooling If ω = 1 let thesignal realization be ldquoacceptrdquo with certainty If ω = 0 let the signal realization beldquorejectrdquo with some probability q and ldquoacceptrdquo with probability 1 minus q So when thereceiver observes ldquoacceptrdquo the posterior expected state is

E[ω|ldquoacceptrdquo] = Pr[ω = 1]

Pr[ω = 1] + Pr[ω = 0] middot (1minus q)=

12

12 + 12 middot (1minus q)=

1

2minus q

9Whether states strictly below ωlowast are revealed or pooled among themselves does not matterbecause they are all strictly below r and thus induce the receiver to take action a = 0

CENSORSHIP AS OPTIMAL PERSUASION 9

The sender now wishes to find the lowest q subject to 1(2 minus q) ge r which yieldsqlowast = 2 minus 1

r Thus stochastic upper censorship with ωlowast = 0 and qlowast = 2 minus 1

ris

optimal The probability that the receiver accepts the proposal is exactly equal tothe probability that the signal realization is ldquoacceptrdquo

Pr[proposal is accepted] = Pr[ω = 1] + Pr[ω = 0] middot (1minus qlowast) =1

2+

1

2middot 1minus r

r=

1

2r

We now summarize the insights obtained in this example If the state is continuousthen an optimal signal is deterministic upper censorship If the state is discrete thenan optimal monotone partition is deterministic upper censorship and an optimalsignal is stochastic upper censorship10 In the remainder of the paper we will showthat these insights extend to the case where the receiverrsquos type is uncertain

3 Continuous State

In this section we provide necessary and sufficient conditions for the optimality ofupper censorship when the state is continuous

Let m be the expected state conditional on observing a realization of a signal s Sincethe sender and receiverrsquos utilities are linear in ω they depend on the information aboutω revealed by signal s only through the expected state m In particular the receiverchooses a = 1 if and only if r le m

Let V (m) denote the indirect utility of the sender conditional on m

V (m) =

983133

rlem

E [v(ω r)|m] g(r)dr =

983133 m

0

v(m r)g(r)dr m isin [0 1] (2)

where E [v(ω r)|m] = v(m r) by the linearity of v in ω

A function V is said to be S-shaped if it is convex below some threshold and concaveabove that threshold or equivalently if V primeprime is single-crossing from above

there exists τ such that V primeprime(m) ge (le) 0 for all m lt (gt) τ

We now provide the criterion for the optimality of upper censorship Recall thatmlowast = E[ω|ω ge ωlowast]

Theorem 1 Let V be S-shaped Then and only then for all continuous F anoptimal signal is deterministic upper censorship whose cutoff state ωlowast isin [0 τ ] satisfies

V (mlowast) + V prime(mlowast)(mminusmlowast) ge (le)V (m) for all m ge (lt)ωlowast (3)

The criterion for the optimality of lower censorship is analogous

10See for example Kolotilin (2015) for a detailed treatment of the case where there is no uncer-tainty about the receiverrsquos type

10 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Remark 1 Let minusV be S-shaped Then and only then for all continuous F anoptimal signal is deterministic lower censorship whose cutoff state ωlowast isin [τ 1] satisfies

V (mlowast) + V prime(mlowast)(mminusmlowast) ge (le)V (m) for all m le (gt)ωlowast

Theorem 1 states that optimal persuasion takes a simple form of upper censorshipwhen V is S-shaped11 Moreover the theorem establishes that this result is tight inthe sense that if V is not S-shaped then there exists a distribution F of the statesuch that no upper-censorship signal is optimal

Note that every upper-censorship signal is a monotone partition Therefore Theorem1 applies in both Scenario A where the sender is free to choose any signal andScenario B where the sender is constrained to choose a monotone partition

The intuition behind Theorem 1 is as follows Observe that when no informationabout the state is revealed the receiverrsquos best-response action does not change withthe state The more information is revealed the more variable the receiverrsquos behaviorwill be in response to this information Consider an interval where V is concave Thesender would prefer not to reveal any information when the state is in this intervalsince a certain outcome is preferred to any lottery with the same expected stateConversely consider an interval where V is convex The sender would prefer to fullyreveal the state in this interval since now lotteries are preferred If V is S-shapedthat is it is convex below some threshold and concave above that threshold theinduced optimal persuasion takes the form of upper censorship

When V is S-shaped the senderrsquos optimization problem is reduced to finding anoptimal censorship cutoff ωlowast If the realized state ω is below the cutoff then it isrevealed to the receiver so the expected utility of the sender is V (ω) If the realizedstate ω is above the cutoff then the posterior expected state is mlowast = E[ω|ω ge ωlowast] sothe expected utility of the sender conditional on ω ge ωlowast is V (mlowast) The sender thusneeds to solve the problem

maxωlowastisin[01]

983133 ωlowast

0

V (ω)f(ω)dω +

983133 1

ωlowastV (mlowast)f(ω)dω (4)

The expression in (3) represents the first-order condition to this problem and capturesthree possible cases the boundary solutions ωlowast = 0 and ωlowast = 1 if the expression in(3) has the same sign for all ω and an interior solution ωlowast such that

V (mlowast) + V prime(mlowast)(ωlowast minusmlowast) = V (ωlowast) (5)

This first-order condition (5) is illustrated by Figure 1 The solid line is V (m) andthe dashed line is V (mlowast) + V prime(mlowast)(mminusmlowast) which is tangent to V at mlowast

11For a fixed distribution F Proposition 3 in Kolotilin (2018) implies that upper censorship withcutoff ωlowast is optimal if and only if (3) holds and V (m) is convex on [0ωlowast] This result can be used toshow that upper censorship is optimal when V is S-shaped For completeness we include a simpleself-contained proof inspired by the proof of Theorem 1 in Dworczak and Martini (2019)

CENSORSHIP AS OPTIMAL PERSUASION 11

1ωlowast

V

revealing pooling

mlowast0

Figure 1 Optimal upper censorship with cutoff ωlowast

The solution of the senderrsquos problem (4) is particularly simple if V is either globallyconvex or globally concave In this case the first-order condition in (3) has a constantsign so either ωlowast = 0 or ωlowast = 1 must be optimal If V is convex then ωlowast = 1 isoptimal which corresponds to the fully informative signal Similarly if V is concavethen ωlowast = 0 is optimal which corresponds to the completely uninformative signalThis is summarized in the following corollary

Corollary 1 An optimal signal is

(i) fully informative for all F if and only if V is convex

(ii) completely uninformative for all F if and only if V is concave

31 Canonical Case We now consider the canonical case where the senderrsquos utilitysatisfies assumption (A1)

The density g of receiver types r is said to be log-concave if ln g(r) is concave in r12

Note that ln g(r) is well defined in the canonical case

In this case the shape of the senderrsquos indirect utility V defined by (9) is connectedto the shape of the density g of receiver types as follows

Lemma 1 In the canonical case V is S-shaped for all ρ if and only if g is log-concave

12Table 1 in Bagnoli and Bergstrom (2005) reports distributions with log-concave densities

12 KOLOTILIN MYLOVANOV ZAPECHELNYUK

That is if the density of receiver types g is log-concave then the senderrsquos indirectutility V is S-shaped Moreover this result is tight in the sense that if g is notlog-concave then there exists ρ such that V is not S-shaped

By Theorem 1 and Lemma 1 we obtain a criterion for the optimality of upper cen-sorship with the condition on the primitive of the model the density g

Theorem 2 Consider the canonical case and let the density g be log-concave Thenand only then an optimal signal is deterministic upper censorship for all F and all ρ

The symmetric statement is also true an optimal signal is lower censorship for all Fand all ρ if and only if minusg is log-concave Consequently if both g and minusg are log-concave (that is g is exponential) then there exists an optimal signal which is bothupper censorship and lower censorship There are only two signals with this propertyfully informative and completely uninformative This allows us to obtain conditionson the distribution of receiver types under which the optimal signal is polarizedbetween fully informative and completely uninformative signals as for example inLewis and Sappington (1994) and Johnson and Myatt (2006)13

Corollary 2 In the canonical case an optimal signal is either fully informative orcompletely uninformative for all F and all ρ if and only if there exist λ isin R and c gt 0such that g(r) = ceminusλr for r isin [0 1]

If g(r) = ceminusλr then the fully informative signal is optimal whenever ρ ge λ and thecompletely uninformative signal is optimal whenever ρ le λ (and any signal is optimalwhen ρ = λ) In particular if ρ = 0 the optimal signal is fully determined by thesign of λ which is in turn determined by whether the mean of r is smaller or greaterthan 1

2

32 Comparative Statics Theorem 2 allows for a sharp comparative statics anal-ysis on the informativeness of the optimal signal

We compare signals by their Blackwell informativeness (Blackwell 1953) To compareupper-censorship signals s1 and s2 we only need to compare their cutoffs ωlowast

1 and ωlowast2

Signal s1 is more informative than signal s2 if ωlowast1 ge ωlowast

2 Indeed state ω isin [0ωlowast2) is

fully revealed by both s1 and s2 and state ω isin [ωlowast2 1] is partially revealed by s1 and

not revealed at all by s2 so s1 is more informative than s2

13In the case of log-concave density g the sets of upper- and lower-censorship signals are totallyordered by the rotation order of Johnson and Myatt (2006) If we restrict attention to the set oflower-censorship signals (which are generally suboptimal under log-concave g) then the rotationpoint is decreasing with respect to the rotation order By Lemma 1 of Johnson and Myatt (2006)the senderrsquos expected utility is quasiconvex and one of the extreme lower-censorship signals (thefully informative or completely uninformative signal) is optimal for the sender But if we restrictattention to the set of upper-censorship signals (which are optimal under log-concave g) then therotation point is not decreasing with respect to the rotation order Therefore Lemma 1 of Johnsonand Myatt (2006) does not apply and an interior upper-censorship signal is generally optimal forthe sender

CENSORSHIP AS OPTIMAL PERSUASION 13

For the purpose of comparison we extend the definition of distribution function Gto the real line and assume that its density g is log-concave Consider a family ofdistributions Gtσ of receiver types

Gtσ (r) = G

983061τ minus t+

r minus τ

σ

983062

where τ is the maximum point of g and t isin R and σ gt 0 are parameters Let gtσ (r)denote the corresponding density Note that gtσ (r) is log-concave for all t isin R andall σ gt 0

Because gtσ is log-concave on [0 1] deterministic upper censorship is optimal byTheorem 2 Let ωlowast (ρ t σ) isin [0 1] be the optimal censorship cutoff as given by (3)

The shift parameter t shifts the distribution along the horizontal axis and the stretchparameter σ stretches the distribution horizontally and symmetrically around themode τ so the limit of σ rarr 0 leads to the unit mass on τ We now show that thesender optimally discloses more information when

(i) the senderrsquos preferences are better aligned with the receiverrsquos preferences (thealignment parameter ρ is greater)

(ii) the receiver is more reluctant to accept the proposal (the shift parameter t isgreater)

(iii) the sender is more uncertain about the receiverrsquos type (the stretch parameter σis greater) provided ρ ge 014

Theorem 3 In the canonical case a cutoff state of an optimal signal satisfies

(i) ωlowast (ρ t σ) is increasing in ρ

(ii) ωlowast (ρ t σ) is increasing in t

(iii) ωlowast (ρ t σ) is increasing in σ if ρ ge 0

The intuition for part (i) is that for a higher ρ the sender puts more weight onthe receiverrsquos utility so she optimally endows the receiver with a higher utility byproviding more information

The intuition for part (ii) is that for a higher t each type of the receiver has a greatercost of accepting the proposal so to persuade the same type of the receiver the senderneeds to increase E[ω|ω ge ωlowast] by expanding the full disclosure interval [0ωlowast]

The intuition for part (iii) is that for a higher σ receiver types are more spread out soto persuade the same mass of types the sender again needs to increase E[ω|ω ge ωlowast]

14The case ρ ge 0 is the practically relevant case where the senderrsquos utility is a weighted averageof the receiverrsquos action and utility 1

1+ρv(ω r) =1

1+ρ + ρ1+ρ (ω minus r)

14 KOLOTILIN MYLOVANOV ZAPECHELNYUK

4 Discrete State

In this section we assume that the state is a discrete random variable

Theorem 1prime Let V be S-shaped Then and only then for all discrete F

(A) an optimal signal is stochastic upper censorship whose cutoff pair (ωlowast qlowast) satisfiescondition (3)

(B) an optimal monotone partition is deterministic upper censorship whose cutoff pair(ωlowast

d qlowastd) satisfies ωlowast

d = ωlowast and qlowastd isin 0 1

In the canonical case we obtain the result analogous to Theorem 2

Theorem 2prime Consider the canonical case and let the density g be log-concave Thenand only then for all discrete F and all ρ

(A) an optimal signal is stochastic upper censorship

(B) an optimal monotone partition is deterministic upper censorship

Let us discuss how the comparative statics result (Theorem 3) changes In Section 32we argue that upper-censorship signals can be ordered by their censorship cutoffs Butwhen the state is discrete a censorship cutoff is described by a pair (ωlowast qlowast) where ωlowast

is a cutoff state and qlowast is the probability of revealing this cutoff state when it realizes

Consider two stochastic upper-censorship signals s1 and s2 with cutoff pairs (ωlowast1 q

lowast1)

and (ωlowast2 q

lowast2) Denote (ωlowast

1 qlowast1) ≽ (ωlowast

2 qlowast2) if ω

lowast1 gt ωlowast

2 or ωlowast1 = ωlowast

2 and qlowast1 ge qlowast2 Observethat s1 is more (Blackwell) informative than s2 if and only if (ωlowast

1 qlowast1) ≽ (ωlowast

2 qlowast2) This

comparison also applies to deterministic upper-censorship signals with the constraintthat qlowast1 and qlowast2 are in 0 1

With the above order on cutoff pairs Theorem 3 extend as follows

Theorem 3prime Consider the canonical case

(A) A cutoff pair of an optimal signal satisfies

(i) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in ρ

(ii) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in t

(iii) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in σ if ρ ge 0

(B) A cutoff pair of an optimal monotone partition satisfies15

(i) (ωlowastd (ρ t σ) q

lowastd (ρ t σ)) is increasing in ρ

(ii) (ωlowastd (ρ t σ) q

lowastd (ρ t σ)) is increasing in t

15Part (iii) also holds for an optimal monotone partition when for example the discrete supportof the state ω is a regular grid with sufficiently many points However it is possible to construct acounterexample where part (iii) fails for an optimal monotone partition

CENSORSHIP AS OPTIMAL PERSUASION 15

That is the sender optimally discloses more information when she is less biasedrelative to the receiver (the alignment parameter ρ is greater) when the receiver ismore reluctant to accept the proposal (the shift parameter t is greater) and underScenario A when the sender is more uncertain about the receiverrsquos type (the stretchparameter σ is greater)

Remark 2 Theorem 1prime and Theorem 2prime show that upper censorship emerges as anoptimal persuasion strategy under the same conditions (V is S-shaped in Theorem1prime and g is log-concave in Theorem 2prime) regardless of whether the sender can choosean arbitrary signal or is constrained to choose a monotone partition It may betempting to conjecture that if the sender was allowed to choose an arbitrary partition(not necessarily monotone) the same result would hold This however is not true

For illustration suppose that the sender wishes to maximize the probability of a = 1and the receiver has type r = 1

2with certainty16 Let the state ω take values in

0 14 1 with probabilities (1

6 23 16) so the prior mean is 1

3

An optimal signal is stochastic upper censorship with cutoff pair (ωlowast qlowast) = (14 12)

which sends the pooling message with certainty at ω = 1 and with probability 1minusqlowast =12at ω = 1

4 and reveals the state otherwise The pooling message induces the posterior

mean m = 12 leading to a = 1 all other messages induce posterior means below 1

2

leading to a = 0 Under the optimal signal action a = 1 is chosen with probability23middot 12+ 1

6= 1

2

An optimal monotone partition is deterministic upper censorship that fully revealsthe state where only ω = 1 leads to a = 1 Under this partition action a = 1 ischosen with probability 1

6

An optimal partition pools the extreme states ω = 0 and ω = 1 and reveals theintermediate state ω = 1

4 The pooling message induces the posterior mean m = 1

2

leading to a = 1 This optimal partition is nonmonotone Under this partitionaction a = 1 is chosen with probability 1

6+ 1

6= 1

3 which is strictly better than the

probability of 16obtained under the optimal monotone partition

5 Application to Media Censorship

In this section we apply our results to the problem of media censorship by thegovernment In the modern world people obtain information about the governmentrsquosstate through various media sources such as television newspapers and internetblogs Without the media most people would not know what policies and reformsthe government pursues and how effective they are Media outlets have differentpositions on the political spectrum and differ substantially in how they select andpresent facts to cover the same news People choose their sources of information based

16In this case V (m) = 1 if m ge 12 and V (m) = 0 if m lt 1

2 Observe that the graph of V is

S-shaped and it can be approximated by a continuously differentiable S-shaped function

16 KOLOTILIN MYLOVANOV ZAPECHELNYUK

on their political ideology and socioeconomic status This information is valuable forsignificant individual decisions on migration investment and voting to name a fewIndividuals may not fully internalize externalities that their decisions impose on thesociety Likewise the government may not have the societyrsquos best interest at heartTo further its goals the government then wishes to influence individual decisions bymanipulating the information through media In autocracies and countries with weakchecks and balances the government has power to censor the media content

The governmentrsquos problem of media censorship can be represented as the persua-sion problem in Section 2 We apply our results to provide conditions for the op-timality of upper-censorship policies that censor all media outlets except the mostpro-government ones Furthermore we interpret our comparative statics results

51 Setup There is a continuum of heterogeneous citizens indexed by r isin [0 1]distributed with G Each citizen chooses between a = 0 and a = 1 The utility of acitizen of type r is given by

u(θ r ar a) = (θ minus r)ar + κ(θ r a)

where ar isin 0 1 denotes the citizenrsquos own action a =983125ardG(r) denotes the aggre-

gate action in the society θ isin [0 1] captures an unobserved benefit from action 1 ascompared to action 0 and κ captures the impact of the aggregate action a on thecitizenrsquos utility The term (θ minus r)ar is a private surplus of a citizen of type r Theterm κ(θ r a) is an externality because for a citizen of type r it is optimal to ignorethis term and choose ar = 1 if and only if θ ge r

There is a government which is concerned with a weighted average of the social utilityand the governmentrsquos intrinsic benefit from the aggregate action For a given θ thegovernmentrsquos utility is given by

983133 1

0

ν(θ r ar a)dG(r) + δγ(θ a)

The term ν captures a citizenrsquos utility from the governmentrsquos perspective We allow νto be different from u to reflect paternalistic or other concerns The term γ capturesthe governmentrsquos intrinsic benefit from the aggregate action The parameter δ ge 0captures the weight of the aggregate action in the governmentrsquos utility

Let T be a distribution of the random variable θ We assume that distributions G andT are independent and admit continuously differentiable and strictly positive densi-ties We also assume that κ ν and γ are linear in θ and continuously differentiablein r and a Furthermore to simplify interpretations we assume that ν and γ arenon-decreasing in θ and a so that for the government a high θ is a good news anda higher aggregate action is preferable

Citizens obtain information about the unobservable benefit θ through media outletsEach media outlet is identified by its editorial policy c isin [0 1] and it endorses action

CENSORSHIP AS OPTIMAL PERSUASION 17

a = 1 if θ ge c and criticizes it if θ lt c17 A set C of media outlets is a finite subsetof [0 1]18

The governmentrsquos censorship policy is a set of the media outlets X sub C that arepermitted to broadcast so the rest of the media outlets are censored

The timing is as follows First the government chooses a set X sub C of permittedmedia outlets Second state θ is realized and each permitted media outlet endorsesor criticizes action a = 1 according to its editorial policy Finally each citizen observesmessages from all permitted media outlets updates his beliefs about θ and choosesan action

52 Discussion We now discuss interpretations of the key components of the mediacensorship application As in Gehlbach and Sonin (2014) there can be various inter-pretations of the citizenrsquos action a = 1 such as voting for the government supportinga governmentrsquos policy or taking an individual decision that benefits the governmentA citizenrsquos type r can be interpreted as his ideological position or preference param-eter A citizen who is more supportive of the government has a smaller r

A media outlet with a higher editorial policy c can be interpreted as being less loyalto the government because it criticizes the government on a larger set of states Aneditorial policy c isin C can therefore represent a slant or political bias of the outletagainst the government and can be empirically measured as the frequency with whichthe outlet uses anti-government language Gentzkow and Shapiro (2010) constructsuch a slant index for US newspapers Empirical findings of their paper suggestthat the editorial policies of media outlets are driven by reader preferences justifyingour assumption of the existence of a large variety of editorial policies19 As in Suen(2004) Chan and Suen (2008) and Chiang and Knight (2011) the assumption ofthe binary media reports that communicate only whether the state θ is above somestandard c can be justified by a cursory readerrsquos preference for simple messages suchas positive or negative opinions and yes or no recommendations

The governmentrsquos censorship of media outlets can take various forms For examplethe government can ban access to internet sites withdraw licenses disrupt financingconfiscate print materials and equipment discredit media providers and even arresteditors and journalists using broadly formulated legislation on combating extremismIn some countries the government can exercise direct control over media editorialpolicies either through state ownership or administrative pressure

17The tie-breaking in the event of θ = c is unimportant as θ is a continuous random variable18We discuss the case of a continuum of media outlets in Section 5519Theoretical literature has explored the determinants of media slant of an outlet driven by its

citizens (Mullainathan and Shleifer 2005 Gentzkow and Shapiro 2006 and Chan and Suen 2008)and its owners (Baron 2006 and Besley and Prat 2006)

18 KOLOTILIN MYLOVANOV ZAPECHELNYUK

53 Formulating Media Censorship as Persuasion We now show that the me-dia censorship problem can be formulated as a linear persuasion problem in whichthe government is a sender and a representative citizen is a receiver

In our model the citizensrsquo and the governmentrsquos utilities are linear in θ There-fore given any information from the media outlets the utilities depend only on theposterior mean of θ

Consider an arbitrary posterior mean of θ denoted by m isin [0 1] Each citizen of typer chooses ar = 1 if and only if r le m The externality term κ plays no role in thisdecision Therefore the aggregate action a is simply the mass of all citizens whosetypes do not exceed m so a = G(m)

Next using the citizensrsquo optimal behavior we derive the governmentrsquos expected utilityconditional on the posterior mean m

V (m) = E983063983133 1

0

ν(θ r ar G(m))dG(r) + δγ(θ G(m))

983055983055983055983055m983064

=

983133 1

0

ν(m r1rlem G(m))dG(r) + δγ(mG(m)) (6)

This is the governmentrsquos indirect utility As in Section 2 the government now choosesa signal which is informative about the state to maximize its expected utility How-ever in contrast to Section 2 here the government is restricted to signals that areimplementable by a subset of a given set of media outlets

Next we show that with an appropriate definition of the state the restriction tosignals implementable by a subset of a given set of media outlets can be formulatedas a restriction to monotone partitional signals as in Scenario B in Section 2 Thuswe will be able to use our results in Section 4 to find the optimal censorship policies

Recall that the information about θ is only available through the media outlets Let ωdenote a random variable equal to the conditional expectation of θ given the messagesof all media outlets in C Note that as C is finite ω is a discrete random variablewith values in [0 1] Let F denote the distribution of ω From now on we treat therandom variable ω as the state and denote by s a signal as defined in Section 2

Let HM denote the set of all distributions of the posterior mean state induced byall monotone partitional signals and let HC denote the set of all distributions ofthe posterior mean state induced by all finite subsets of the set C of media outletsObserve that any set of media outlets induces a monotone partitional signal thatis HC sub HM We now show that every outcome implementable by an arbitrarymonotone partition in HM is also implementable by a subset of media outlets

Lemma 2 HC = HM

For illustration suppose that there is only one media outlet with the editorial policyc isin (0 1) There are two observable events θ ge c and θ lt c In each realized event

CENSORSHIP AS OPTIMAL PERSUASION 19

the citizens compute the posterior expectation of θ which we will call ω So in thisexample the distribution F of ω has two mass points one for each of the two events

Given these two mass points every monotone partition of [0 1] induces one of twopossible distributions of the posterior mean state (i) the mass points revealed and (ii)the mass points pooled Observe however that these distributions are implementableby permitting and censoring the media outlet respectively

A censorship policy is upper censorship if it censors all sufficiently disloyal mediaoutlets Specifically there exists a cutoff clowast isin C and an indicator qlowast isin 0 1 suchthat all media outlets whose editorial policies are below clowast are permitted all mediaoutlets whose editorial policies are above clowast are censored and clowast is permitted if andonly if qlowast = 1 That is X = c isin C c le clowast if qlowast = 1 and X = c isin C c lt clowastif qlowast = 0 Note that the full censorship policy clowast = 0 and qlowast = 0 (where all mediaoutlets are censored) and the free media policy clowast = 1 and qlowast = 1 (where all mediaoutlets are permitted) are the two extreme upper-censorship policies

As shown above the media censorship problem is equivalent to the linear persuasionproblem in which the sender is restricted to monotone partitional signals and thesenderrsquos indirect utility V is given by (6) We thus apply Theorem 1prime(B) to obtainthe following result

Theorem 1primeprime If V is S-shaped then an optimal censorship policy is upper censorship

To illustrate Theorem 1primeprime suppose the government is interested only in the aggregateaction so that ν = 0 and γ depends only on the aggregate action a so V (m) =γ(G(m)) Thus an upper-censorship policy is optimal if the composition functionγ(G(middot)) is S-shaped For example this condition holds if the government is interestedin reaching a certain approval threshold (eg a simple majority) so that γ is a stepfunction This condition also holds if γ is S-shaped and G is uniform or S-shapedwith the same inflection point as γ20

54 Canonical Case We now impose more structure on the utilities to obtain asharper result for the optimality of upper censorship and to perform a comparativestatics analysis Analogously to assumption (A1) we assume that the governmentrsquosutility is a weighted average of the citizensrsquo utility and their aggregate action

983133 1

0

u(θ r ar a)dG(r) + δa

where u(θ r ar a) = (θ minus r)ar + ζ(r)a

(A2)

for some continuously differentiable function ζ Define

β =

983133 1

0

ζ(r)dG(r) + δ

20This condition holds in many special cases where γ and G are S-shaped even when γ and Ghave different inflection points

20 KOLOTILIN MYLOVANOV ZAPECHELNYUK

The term983125 1

0ζ(r)dG(r) is the governmentrsquos expected bias towards the citizensrsquo action

a = 1 due to the citizensrsquo externality and the term δ is the governmentrsquos intrinsicbias towards a greater aggregate action a Thus we interpret β as the governmentrsquosaggregate bias This decomposition of the bias allows for different interpretations whythe government is biased So β can be high because the government is too self-serving(high δ) or because the government is benevolent (low δ) but wishes to internalizestrong positive externalities of the citizens (high ζ(r)) To ease interpretations weassume that β gt 021

Under Assumption (A2) the governmentrsquos indirect utility V given by (6) becomes

V (m) =

983133 m

0

(mminus r)dG(r) + βa =

983133 m

0

v(m r)dG(r)

where

v(m r) = β

9830611 +

1

β(mminus r)

983062

This is the same as v given by Assumption (A1) with ρ = βminus1 up to rescaling by apositive constant Consequently we can apply Theorem 2prime(B) to obtain the followingresult

Theorem 2primeprime Let (A2) hold If the density g of citizens types is log-concave then anoptimal censorship policy is upper censorship

We now apply the comparative statics analysis presented in Section 32 Note thatthe upper-censorship policies are ordered according to the amount of informationtransmitted to the citizens in the sense of Blackwell (1953) A greater censorshipthreshold clowast means that more media outlets are permitted With this order in mindwe apply Theorem 3prime(B) to make a comparative statics analysis on the amount ofinformation that is optimally disclosed by the government

First the censorship cutoff is increasing in the alignment parameter ρ Recall thatthis is a reciprocal of the governmentrsquos bias ρ = βminus1 This means that the governmentoptimally discloses more information (the censorship cutoff is greater) when it is lessbiased (β is smaller) Intuitively as β decreases the government puts more weighton the citizensrsquo utility so it optimally endows the population with a higher utility bycensoring fewer media outlets and disclosing more information

Second the censorship cutoff is increasing in the magnitude t of the horizontal shiftof the density g A greater t corresponds to a greater opportunity cost of actiona = 1 for each citizen This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more difficultto persuade to take action a = 1 (parameter t is greater) Informally speaking topersuade the same type of the citizen the government needs to increase the posterior

21If β lt 0 then swapping the roles of a = 0 and a = 1 reverses the sign of the bias If β = 0then the governmentrsquos utility and the citizensrsquo private interests coincide so it is trivially optimal topermit all media outlets

CENSORSHIP AS OPTIMAL PERSUASION 21

mean But the expected posterior mean must be equal to the prior mean so it is notpossible to increase all posterior means Due to the log-concave shape of the densityof the citizensrsquo types this tradeoff is resolved by increasing the posterior mean of thepooling interval E[ω|ω ge clowast] This is done by shrinking the pooling interval [clowast 1]that is increasing the censorship cutoff clowast

Third when the number of media outlets is sufficiently large the censorship cutoffis increasing in the stretch factor σ of the density g around its peak τ (see Footnote15) A greater σ corresponds to a more dispersed distribution of the opportunitycost in the population This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more diverse inhow difficult they are to persuade Intuitively when the types are more spread outpersuading the same mass of types requires a greater posterior mean which leads toa higher censorship cutoff similarly to the comparative statics in t discussed above

55 Extensions and Open Questions Let us now consider a few extensions ofour model of media censorship

In our model the set of media outlets is exogenous and the governmentrsquos only in-strument is censorship We now consider three alternative ways of expanding thegovernmentrsquos instruments of influence

First suppose that the government is able not only to censor existing media outletsbut also to introduce new media outlets with chosen editorial policies This is equiv-alent to our censorship model where all media outlets in [0 1] are initially availableand the government can censor any subset of them When all media outlets in [0 1]are permitted the revealed information about θ (which we call ω) is θ itself Thusω = θ is a continuous random variable with distribution F = T So we can nowapply our results for the continuous state from Section 3 instead of those for the dis-crete state in Section 4 reaching the same conclusion about the optimality of uppercensorship

Second suppose that the government can garble information available from mediaoutlets That is the government is not restricted to monotone partitions it cancreate arbitrary signals about state ω for the citizens to observe This becomes ageneral persuasion problem and our Theorems 1prime(A) 2prime(A) and 3prime(A) apply

Third suppose that the government is able to restrict not only which media outletsare permitted but also how many media outlets each citizen can choose to observeIn this extension the citizens are not allowed to communicate with one another (oth-erwise they could share the information thus observing all permitted media outletsindirectly) This extension does not affect our results as long as each citizen is al-lowed to access at least one media outlet of his choice as in Chan and Suen (2008)Intuitively this is because each citizen categorizes the information from the mediaoutlets into ldquogood newsrdquo where a = 1 is optimal and ldquobad newsrdquo where a = 0 is

22 KOLOTILIN MYLOVANOV ZAPECHELNYUK

optimal Because the information from the media outlets induces a monotone parti-tion it means that ldquogood newsrdquo is separated from ldquobad newsrdquo by a threshold mediaoutlet that depends on the citizenrsquos type That is it is sufficient to observe a singlethreshold media outlet to distinguish ldquogood newsrdquo from ldquobad newsrdquo

In our model each citizenrsquos utility depends on the aggregate action a through theexternality term κ(θ r a) which does not affect the chosen action Let us relax thisassumption so that a citizenrsquos optimal choice can depend on a We can still writethe senderrsquos indirect utility V as a function of the posterior mean state and applyour results However V is no longer uniquely determined by the primitives of themodel It is now endogenous and depends on an equilibrium the citizens play aseach citizenrsquos optimal action now depends on what all citizens do in equilibrium Forexample given the same information about the state a citizen could prefer to choosea = 1 if and only if sufficiently many citizens choose the same action so a is largeenough This creates the problem of multiplicity of equilibria and as a consequencethe dependence of optimal censorship on equilibrium selection

As a side our media censorship problem can also be applied to spatial voting mod-els as in Chiang and Knight (2011) Consider a government party (p = G) and anopposition party (p = O) competing in an election If party p wins a voter withan ideological position r gets utility wp minus (r minus rp)

2 where wp is the quality or va-lence of party p and rp is the ideology or policy platform of party p Voters knowthe partiesrsquo ideologies and obtain information about the partiesrsquo qualities from allavailable media outlets Each voter supports the party that maximizes his expectedutility Within this context our analysis still applies because the voterrsquos utility dif-ference between the government and opposition parties is proportional to θminusr whereθ = (wG minus wO + r2O minus r2G)2(rO minus rG)

There are a few more extensions that can be relevant in applications First insteadof complete censorship of a media outlet there can be a cost of accessing it Forexample an international news channel can be censored by a local government butcitizens may still access it through VPN at some cost Second it can be costly forthe government to censor media outlets So another important question to answeris how a government may prioritize censoring Finally citizens can incur some costof following each media outlet While we have already mentioned that citizens gainno benefit from following more than one outlet it is entirely possible for them tostop watching news altogether if it is sufficiently uninformative These extensions arenontrivial and left for future research

Appendix

Proof of Theorem 1 Suppose that V is S-shaped For mωlowast isin [0 1] and mlowast(ωlowast) =E[ω|ω ge ωlowast] define

V (mωlowast) = V (mlowast(ωlowast)) + V prime(mlowast(ωlowast))(mminusmlowast(ωlowast)) (7)

CENSORSHIP AS OPTIMAL PERSUASION 23

Since V (m) is convex on [0 τ ] and concave on [τ 1] there exists ωlowast isin [0 τ ] thatsatisfies condition (3) that is

V (mωlowast) ge (le)V (m) for all m ge (lt)ωlowast (8)

This is true because V (ωlowastωlowast) minus V (ωlowast) is single-crossing from below To see thisconsider Figure 1 Observe that (3) can hold only if ωlowast is on the convex part of V andmlowast is on the concave part of V As the censorship cutoff ωlowast increases the posteriormean state of the pooling message mlowast(ωlowast) also increases But because V is concaveat mlowast(ωlowast) the dashed tangent line V (mωlowast) becomes flatter in m so it crosses thesolid line V (m) at a smaller m

Consider an arbitrary signal s Let H be the distribution of m = E[ω|s] inducedby signal s Since each signal is a garbling of the fully informative signal F is amean-preserving spread of H Thus the senderrsquos expected utility is smaller undersignal s than under the upper-censorship signal with cutoff ωlowast as follows from

983133 1

0

V (m)dH(m) le983133 1

0

maxV (m) V (mωlowast)dH(m)

le983133 1

0

maxV (m) V (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mlowast(ωlowast))dF (m)

where the second line holds because maxV (m) V (mωlowast) is convex in m and F isa mean-preserving spread of H the third line holds by (8) and the last line holds by

(7) and by983125 1

ωlowast(mminusmlowast(ωlowast))dF (m) = 0

Conversely suppose that V is not S-shaped Then there exist 0 le m1 lt m2 lem3 lt m4 le 1 such that V primeprime(m) lt 0 for m isin (m1m2) V

primeprime(m) = 0 for m isin [m2m3]and V primeprime(m) gt 0 for m isin (m3m4) because V primeprime is continuous by assumption It isstraightforward to show that there exists 983141mlowast isin (m1m2) (sufficiently close to m2) and983141ωlowast isin (m3m4) such that

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) lt V (m) for all m isin (983141ωlowastm4]

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) gt V (m) for all m isin [m1 983141ωlowast] 983141mlowast

Moreover there exists a continuous distribution F such that the support of F is asubset of [m1m4] and 983141mlowast = E[ω|ω le 983141ωlowast] Using a chain of inequalities similar to theone above it is easy to show that lower-censorship signal 983141slowast with cutoff 983141ωlowast is optimalFurthermore it is uniquely optimal as both inequalities hold with equalities only forsignal 983141slowast This in turn implies that any upper-censorship signal is suboptimal

24 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Proof of Lemma 1 Notice that under assumption (A1) we have

V primeprime(m) = gprime(m) + ρg(m) for m isin [0 1]

Recall that V is S-shaped if and only if V primeprime(m) = gprime (m) + ρg (m) is single-crossingfrom above Using Proposition 1 in Quah and Strulovici (2012) it is easy to showthat V primeprime(m) is single-crossing from above for all ρ isin R if and only if gprime (m)g (m) isnonincreasing in m (that is ln g(m) is concave)

Proof of Theorem 2 Immediate by Theorem 1 and Lemma 1

Proof of Theorem 3 Part (i) We fix t and σ and vary ρ Without loss ofgenerality let t = 0 and σ = 1 (so that Gtσ = G) Consider a signal that induces thedistribution H of posterior mean m This signal implements the receiverrsquos interimutility U given by

U(r) =

983133 1

r

(mminus r)dH(m) for r isin [0 1]

which holds because the receiver acts if and only if m ge r Since types r lt 0 alwayschoose a = 1 and types r gt 1 always choose a = 0 we can exclude them fromconsideration and assume that supp(G) = [0 1] The senderrsquos expected utility is then

983133 1

0

V (m)dH(m) =

983133 1

0

983133 m

0

(1 + ρ(mminus r))dG(r)dH(m)

=

983133 1

0

983133 1

r

(1 + ρ(mminus r))dH(m)dG(r)

=

983133 1

0

(1minusH(r) + ρU(r))dG(r)

Consider ρ2 gt ρ1 Suppose by contradiction that the corresponding optimal upper-censorship signals s2 and s1 are such that s1 is strictly more informative than s2 LetU2 and U1 be the receiverrsquos interim utilities implemented by s2 and s1 Since thesender prefers signal s2 under ρ2 and signal s1 under ρ1 we have

983133 1

0

(1minusH2 (r) + ρ2U2 (r)) dG (r) ge983133 1

0

(1minusH1 (r) + ρ2U1 (r)) dG (r)

983133 1

0

(1minusH1 (r) + ρ1U1 (r)) dG (r) ge983133 1

0

(1minusH2 (r) + ρ1U2 (r)) dG (r)

Summing up these inequalities gives

(ρ2 minus ρ1)

983133 1

0

(U2 (r)minus U1 (r)) dG (r) ge 0

leading to a contradiction because U1(r) ge U2(r) for all r with strict inequality forsome r given that s1 is strictly more informative than s2

CENSORSHIP AS OPTIMAL PERSUASION 25

Part (ii) We fix ρ isin R and σ = 1 and vary t The indirect utility of the senderconditional on m isin [0 1] is given by

Vt(m) =

983133

rlem

(1+ρ(mminusr))gt1(r)dr =

983133

rlemminust

(1+ρ(mminustminusr))g(r)dr = V (mminust) (9)

By Theorem 1 when t0 = 0 an optimal cutoff ωlowast0 satisfies

V (m) le V (mlowast0) + V prime(mlowast

0)(mminusmlowast0) for m isin [ωlowast

0 1]

When V is S-shaped and t1 lt 0 it is easy to see that

V (mminus t1) le V (mlowast0 minus t1) + V prime(mlowast

0 minus t1)(mminusmlowast0) for m isin [ωlowast

0 1]

which implies that when t = t1 the above inequality changes its sign at some ωlowast1

that satisfies ωlowast1 le ωlowast

0 It follows from (9) that ωlowast1 is an optimal cutoff for t = t1 and

as we established above ωlowast1 le ωlowast

0

Part (iii) We fix ρ ge 0 and t = 0 and vary σ The indirect utility of the senderconditional on m isin [0 1] is given by

V ρσσ(m) =

983133

rlem

9830591 +

ρ

σ(mminus r)

983060dG0σ(r)

=

983133

rlem

9830591 +

ρ

σ(mminus r)

983060g

983061r minus (1minus σ)τ

σ

9830621

σdr

=

983133

rlemminus(1minusσ)τσ

9830611 + ρ

983061mminus (1minus σ)τ

σminus r

983062983062g(r)dr

= Vρ1

983061mminus (1minus σ)τ

σ

983062 (10)

Let σ0 = 1 σ1 lt 1 and ρ1 = ρσ1 To compare the optimal cutoffs under parameters(ρ σ0) and (ρ σ1) we consider two steps first comparing the cutoffs between (ρ σ0)and (ρ1 σ1) and then comparing the cutoffs between (ρ1 σ1) and (ρ σ1)

First we compare an optimal cutoff ωlowast0 under (ρ σ0) = (ρ 1) and an optimal cutoff

ωlowast1 under (ρ1 σ1) By Theorem 1 ωlowast

0 satisfies

Vρ1(m) le Vρ1(mlowast0) + V prime

ρ1(mlowast0)(mminusmlowast

0) for m isin [ωlowast0 1]

When V is increasing (implied by ρ ge 0) and S-shaped and σ1 lt 1 it is easy to seethat for all m isin [ωlowast

0 1]

Vρ1

983061mminus (1minus σ1)τ

σ1

983062le Vρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062+ V prime

ρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062(mminusmlowast

0)

which implies that when σ = σ1 lt 1 the above inequality changes its sign at someωlowast1 that satisfies ωlowast

1 le ωlowast0 It follows from (10) that ωlowast

1 is an optimal cutoff under(ρσ σ1) = (ρ1 σ1) and as we established above ωlowast

1 le ωlowast0

26 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Second we compare the optimal cutoff ωlowast1 under (ρ1 σ1) and the optimal cutoff ωlowast

2

under (ρ σ1) Because ρ lt ρ1 = ρσ1 by part (i) we have ωlowast2 le ωlowast

1

Combining ωlowast1 le ωlowast

0 and ωlowast2 le ωlowast

1 we obtain ωlowast2 le ωlowast

0

Proof of Theorem 1prime Part (A) The proof is analogous to that of Theorem 1 andis thus omitted

Part (B) The only if statement follows from that in part (A) (see the proof of Theorem1) and the fact that we can always ensure that an optimal stochastic censorship(ωlowast qlowast) is deterministic (so qlowast isin 0 1) by the choice of the support and distributionof ω Thus we only need to prove the if statement

Let F be a distribution of ω with support on n values 0 lt ω0 lt lt ωnminus1 lt 1 Thisis without loss of generality because if 0 andor 1 were in the support of F we couldredefine the state as ω = (ω + ε)(1 + 2ε) and the type as r = (r + ε)(1 + 2ε)

Let X sub ω1 ωnminus1 identify a set of cutoffs defining a monotone partition for eachωi isin X the receiver is informed whether ω lt ωi or ω ge ωi Note that we exclude ω0

from the set of cutoffs because ω ge ω0 always holds

Note that X is deterministic upper censorship if and only if there exists a censorshipthreshold ωlowast isin [0 1] such that for each i = 1 nminus 1 ωi isin X if and only if ωi lt ωlowastthat is we can assume that qlowast = 0 without loss of generality

Let VX be the senderrsquos expected utility when the monotone partition is given by X

If n = 2 then there are only two partitions X = ω1 (full disclosure) and X = empty(no disclosure) Both are upper censorships so part (B) holds trivially

If n = 3 then there are four monotone partitions X = ω1ω2 (full disclosure)X = ω2 X = ω1 and X = empty (no disclosure) Only X = ω2 is not uppercensorship The next lemma proves that X = ω2 cannot be uniquely optimal whenV is S-shaped thus proving the optimality of upper censorship for n = 3

Denote

mjk = E983045ω|ω isin [ωjωk]

983046

For convenience of notation we write mj = ωj

Lemma 3 Let V be S-shaped with an inflexion point τ Let ω be a discrete randomvariable with support on m0m1m2 and probabilities p0 p1 p2 where 0 lt m0 ltm1 lt m2 lt 1 Then

(i) if τ le m0 then Vm2 le Vempty

(ii) if τ ge m1 then Vm2 le Vm1m2

(iii) if m0 lt τ lt m1 then Vm2 le maxVempty Vm1m2

CENSORSHIP AS OPTIMAL PERSUASION 27

Proof Define p01 = p0 + p1 and p02 = p01 + p2 = 1 Observe that

m01 =p0m0 + p1m1

p0 + p1and m02 =

p0m0 + p1m1 + p2m2

p0 + p1 + p2

We thus have

Vempty = p02V (m02) = (p01 + p2)V

983061p01m01 + p2m2

p01 + p2

983062

Vm2 = p01V (m01) + p2V (m2) = (p0 + p1)V

983061p0m0 + p1m1

p0 + p1

983062+ p2V (m2)

Vm1m2 = p0V (m0) + p1V (m1) + p2V (m2)

Clearly

m0 lt m01 lt m1 and m0 lt m02 lt m2

Therefore

Vm2 le Vempty lArrrArrp01

p01 + p2V (m01) +

p2p01 + p2

V (m2) le V

983061p01

p01 + p2m01 +

p2p01 + p2

m2

983062= V (m02)

(11)

and

Vm2 le Vm1m2 lArrrArr

V

983061p0

p0 + p1m0 +

p1p0 + p1

m1

983062le p0

p0 + p1V (m0) +

p1p0 + p1

V (m1)(12)

Suppose that τ le m01 so V is concave on [m01m2] Then (11) holds by Jensenrsquosinequality In particular it holds for τ le m0 lt m01 thus proving part (i)

Suppose that τ ge m1 so V is convex on [m0m1] Then (12) holds by Jensenrsquosinequality thus proving part (ii)

Finally suppose that m01 lt τ lt m1 Let (τ V (τ)) lie below (or on) the straight lineA connecting (m0 V (m0)) and (m2 V (m2)) as illustrated on Figure 2 By convexityof V (m) for m lt τ it must be the case that V (m01) lies below (or on) the dashedline B By concavity of V (m) for m gt τ it must be the case that V (m1) lies above(or on) the dashed line C Consequently (12) is satisfied

Alternatively let (τ V (τ)) lie above the straight line A connecting (m0 V (m0)) and(m2 V (m2)) as illustrated on Figure 3 If Vm2 le Vm1m2 then the proof of part(iii) is complete So suppose that Vm2 gt Vm1m2 so (12) is violated

By convexity of V (m) for m lt τ it must be the case that V (m01) lies below (or on)the line B Since (12) is violated V (m01) must lie between the lines A and B Recallthat m01 lt m02 lt m2 Observe that V (m02) must be in the shaded area on Figure 3Indeed if m02 lt τ then by convexity of V (m) for m lt τ it must be the case thatV (m02) lies between (or on) the lines B and D If m02 ge τ then by concavity of

28 KOLOTILIN MYLOVANOV ZAPECHELNYUK

τ m2m0

A C

B

m01 m1

V (m0) V (τ)

V (m2)

Figure 2 The case where (τ V (τ)) lies below (or on) the straight lineconnecting (m0 V (m0)) and (m2 V (m2))

τ m2m0

A

C

B

m01

V (m0)

V (τ)

V (m2)

D

V (m01)

Figure 3 The case where (τ V (τ)) lies above the straight line con-necting (m0 V (m0)) and (m2 V (m2))

V (m) for m gt τ it must be the case that V (m02) lies above (or on) line C It followsthat (11) is satisfied

We now adapt Lemma 3 to prove the optimality of upper censorship for n ge 4

CENSORSHIP AS OPTIMAL PERSUASION 29

Fix a monotone partition X sub ω1 ωnminus1 that is not upper censorship that isthere exists i isin 2 nminus 1 such that

ωiminus1 ∕isin X and ωi isin X (13)

Denote ωn = 1 Define j and k as follows If X cap ω1 ωiminus2 ∕= empty let ωj be thelargest element inX smaller than i otherwise let ωj = ω0 IfXcapωi+1 ωnminus1 ∕= emptylet ωk be the smallest element in X greater than i otherwise let ωk = ωn = 1 Define

m0 = ωj m1 = ωiminus1 m2 = ωi

and

p0 =Pr[ω isin [ωjωiminus1)]

Pr[ω isin [ωjωk)] p1 =

Pr[ω isin [ωiminus1ωi)]

Pr[ω isin [ωjωk)] p2 =

Pr[ω isin [ωiωk)]

Pr[ω isin [ωjωk)]

To apply Lemma 3 we identify Xωi with empty X with m2 and X cup ωiminus1 withm1m2 By Lemma 3 VX le maxVXωi VXcupωiminus1 Thus replacing X withXωi if VX le VXωi and with X cup ωiminus1 otherwise weakly improves the senderrsquosexpected utility Repeatedly applying such replacements yields upper censorshipIndeed this procedure terminates after a finite number of replacements because byLemma 3 for ωj ge τ we replace X with Xωi and for ωiminus1 le τ we replace Xwith X cup ωiminus1

We thus have shown that when V is S-shaped when optimizing over the set ofmonotone partitions we can restrict attention to deterministic upper censorship Itremains to show that an optimal deterministic censorship cutoff ωlowast

d coincides with anoptimal stochastic censorship cutoff ωlowast from part (A) To see this it is straightforwardto show that

V lowast(ωlowast qlowast) =

983133

[0ωlowast)

V (ω)dF (ω) + V (ωlowast)qlowast Pr[ω = ωlowast]

+ V (mlowast(ωlowast qlowast))((1minus qlowast) Pr[ω = ωlowast] + Pr[ω gt ωlowast]) (14)

is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order Since the two de-terministic censorship cutoff pairs closest to (ωlowast qlowast) in the Blackwell informativenessorder are (ωlowast 0) and (ωlowast 1) we obtain ωlowast

d = ωlowast

Proof of Theorem 2prime Immediate by Theorem 1prime and Lemma 1

Proof of Theorem 3prime The proof for an optimal signal is analogous to that ofTheorem 3 and is thus omitted The same applies to part (i) for an optimal monotonepartition Thus it remains to prove part (ii) for an optimal monotone partition

We fix ρ and σ = 1 and vary t continuously If ωlowast isin supp(F ) then an optimalmonotone partition is an optimal signal because all qlowast isin [0 1] are optimal Thuspart (ii) in (B) holds by part (ii) in (A) If ωlowast isin supp(F ) and qlowast isin (0 1) then either(ωlowast

d qlowastd) = (ωlowast 0) or (ωlowast

d qlowastd) = (ωlowast 1) is optimal because the senderrsquos expected utility

V lowast given by (14) is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order

30 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Therefore it suffices to consider values of t at which the sender is indifferent between(ωlowast 0) and (ωlowast 1) and to show that the sender strictly prefers (ωlowast 1) to (ωlowast 0) at aslightly higher t

V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) =rArr dV lowastt (ω

lowast 0)

dtle dV lowast

t (ωlowast 1)

dt (15)

The condition V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) can be written as

Vt(mlowast(ωlowast 0)) Pr[ω ge ωlowast] = Vt(ω

lowast) Pr[ω = ωlowast] + Vt(mlowast(ωlowast 1)) Pr[ω gt ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)Pr[ω = ωlowast]

Pr[ω ge ωlowast]+ Vt(m

lowast(ωlowast 1))Pr[ω gt ωlowast]

Pr[ω ge ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + Vt(mlowast(ωlowast 1))

mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Analogously the conditiondV lowast

t (ωlowast0)dt

le dV lowastt (ωlowast1)dt

can be written as

V primet (m

lowast(ωlowast 0)) ge V primet (ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + V primet (m

lowast(ωlowast 1))mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Thus (15) states that if Vt(mlowast(ωlowast 0)) is equal to the convex combination of Vt(ω

lowast)and Vt(m

lowast(ωlowast 1)) then the derivative V primet (m

lowast(ωlowast 0)) is greater than the convex com-bination of the derivatives V prime

t (ωlowast) and V prime

t (mlowast(ωlowast 1)) When V is S-shaped it is easy

to see that this property is satisfied

Proof of Lemma 2 Observe that a signal induced by any set X sub C of permittedmedia outlets induces a monotone partition of [0 1] and thus HC sub HM Indeed letcX(ω) and cX(ω) be the media outlets in X that are the closest to ω from above andfrom below

cX(ω) = sup983059c isin X c le ω

983052cup 0

983060and cX(ω) = inf

983059c isin X c gt ω cup 1

983060

By observing the media outlets in X each citizen is informed that ω isin [cX(ω) cX(ω))

Conversely let the set C of media outlets be c0 lt middot middot middot lt cn Without loss of generalityassume that C contains the two uninformative media outlets c0 = 0 and cn = 1 which(almost) always endorse and criticize action a = 1 respectively Then the distributionF of the state ω is a discrete distribution whose support is Ω = ω0 ωnminus1 whereωi = E[θ|θ isin [ci ci+1)] for i isin 0 nminus 1 Consider a monotone partitional signalrepresented by a nondecreasing function ξ [0 1] rarr R For each realization m of thissignal that occurs with a strictly positive probability define

x(m) = min983153ω isin Ω ξ(ω) = m

983154and x(m) = max

983153ω isin Ω ξ(ω) = m

983154

This signal induces a monotone partition of the finite set Ω This partition of Ω canbe implemented by a censorship policy that prohibits a media outlet c isin C if andonly if c isin (x(m) x(m)) for some realization m

CENSORSHIP AS OPTIMAL PERSUASION 31

References

Alonso R and O Camara (2016a) ldquoPersuading Votersrdquo American EconomicReview 106 3590ndash3605

(2016b) ldquoPolitical Disagreement and Information in Electionsrdquo Games andEconomic Behavior 100 390ndash412

Bagnoli M and T Bergstrom (2005) ldquoLog-Concave Probability and Its Ap-plicationsrdquo Economic Theory 26 445ndash469

Baron D P (2006) ldquoPersistent Media Biasrdquo Journal of Public Economics 90(1)1ndash36

Besley T and A Prat (2006) ldquoHandcuffs for the Grabbing Hand MediaCapture and Government Accountabilityrdquo American Economic Review 96(3) 720ndash736

Blackwell D (1953) ldquoEquivalent Comparisons of Experimentsrdquo Annals of Math-ematical Statistics 24 265ndash272

Chan J and W Suen (2008) ldquoA Spatial Theory of News Consumption andElectoral Competitionrdquo Review of Economic Studies 75 699ndash728

Chiang C-F and B Knight (2011) ldquoMedia Bias and Influence Evidence fromNewspaper Endorsementsrdquo Review of Economic Studies 78 795ndash820

Duffie D P Dworczak and H Zhu (2017) ldquoBenchmarks in Search MarketsrdquoJournal of Finance 72 1983ndash2044

Dworczak P (2017) ldquoMechanism Design with Aftermarkets Cutoff Mecha-nismsrdquo mimeo

Dworczak P and G Martini (2019) ldquoThe Simple Economics of Optimal Per-suasionrdquo Journal of Political Economy 127 993ndash2048

Edmond C (2013) ldquoInformation Manipulation Coordination and RegimeChangerdquo Review of Economic Studies 80 1422ndash1458

Egorov G S Guriev and K Sonin (2009) ldquoWhy Resource-poor DictatorsAllow Freer Media A Theory and Evidence from Panel Datardquo American PoliticalScience Review 103 645ndash668

Gehlbach S and K Sonin (2014) ldquoGovernment Control of the Mediardquo Journalof Public Economics 118 163ndash171

Gentzkow M and E Kamenica (2016) ldquoA Rothschild-Stiglitz Approach toBayesian Persuasionrdquo American Economic Review Papers amp Proceedings 106597ndash601

Gentzkow M and J Shapiro (2006) ldquoMedia Bias and Reputationrdquo Journalof Political Economy 114(2) 280ndash316

(2010) ldquoWhat Drives Media Slant Evidence from US Daily NewspapersrdquoEconometrica 78(1) 35ndash71

Ginzburg B (2019) ldquoOptimal Information Censorshiprdquo Journal of EconomicBehavior and Organization 163 377ndash385

Goldstein I and Y Leitner (2018) ldquoStress Tests and Information DisclosurerdquoJournal of Economic Theory 177 34ndash69

32 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Guo Y and E Shmaya (2019) ldquoThe Interval Structure of Optimal DisclosurerdquoEconometrica 87 653ndash675

Johnson J P and D P Myatt (2006) ldquoOn the Simple Economics of Adver-tising Marketing and Product Designrdquo American Economic Review 96 756ndash784

Kamenica E and M Gentzkow (2011) ldquoBayesian Persuasionrdquo American Eco-nomic Review 101 2590ndash2615

Kolotilin A (2015) ldquoExperimental Design to Persuaderdquo Games and EconomicBehavior 90 215ndash226

(2018) ldquoOptimal Information Disclosure A Linear Programming Ap-proachrdquo Theoretical Economics 13 607ndash636

Kolotilin A and H Li (2019) ldquoRelational Communicationrdquo mimeoKolotilin A M Li T Mylovanov and A Zapechelnyuk (2015) ldquoPer-suasion of a Privately Informed Receiverrdquo mimeo

Kolotilin A T Mylovanov A Zapechelnyuk and M Li (2017) ldquoPer-suasion of a Privately Informed Receiverrdquo Econometrica 85 1949ndash1964

Kolotilin A and A Wolitzky (2019) ldquoAssortative Information Disclosurerdquomimeo

Lewis T R and D Sappington (1994) ldquoSupplying Information to FacilitatePrice Discriminationrdquo International Economic Review 35 309ndash327

Lorentzen P (2014) ldquoChinarsquos Strategic Censorshiprdquo American Journal of Polit-ical Science 58 402ndash414

Mullainathan S and A Shleifer (2005) ldquoThe Market for Newsrdquo AmericanEconomic Review 95(4) 1031ndash1053

Orlov D P Zryumov and A Skrzypach (2018) ldquoDesign of Macro-prudentialStress Testsrdquo mimeo

Ostrovsky M and M Schwarz (2010) ldquoInformation Disclosure and Unravelingin Matching Marketsrdquo American Economic Journal Microeconomics 2 34ndash63

Prat A and D Stromberg (2013) ldquoThe Political Economy of Mass MediardquoAdvances in Economics and Econometrics 2 135ndash187

Quah J and B Strulovici (2012) ldquoAggregating the Single Crossing PropertyrdquoEconometrica 80 2333ndash2348

Rayo L and I Segal (2010) ldquoOptimal Information Disclosurerdquo Journal ofPolitical Economy 118 949ndash987

Romanyuk G and A Smolin (2019) ldquoCream Skimming and Information Designin Matching Marketsrdquo American Economic Journal Microeconomics 11 250ndash276

Suen W (2004) ldquoThe Self-Perpetuation of Biased Beliefsrdquo Economic Journal 114377ndash396

Yamashita T (2018) ldquoOptimal Public Information Disclosure by Mechanism De-signerrdquo mimeo

Zapechelnyuk A (2019) ldquoOptimal Quality Certificationrdquo American EconomicReview Insights forthcoming

CENSORSHIP AS OPTIMAL PERSUASION 5

media control (Gehlbach and Sonin 2014 Ginzburg 2019) clinical trials (Kolotilin2015) voter persuasion (Alonso and Camara 2016ab) transparency benchmarks(Duffie Dworczak and Zhu 2017) stress tests (Goldstein and Leitner 2018 OrlovZryumov and Skrzypach 2018) trading mechanisms (Romanyuk and Smolin 2019Dworczak 2017 Yamashita 2018) quality certification (Zapechelnyuk 2019) andrelational communication (Kolotilin and Li 2019)

Our application to media censorship fits into the literature on media capture thataddresses the problem of media control by governments political parties or lobbyinggroups Besley and Prat (2006) pioneered this literature by studying how a gov-ernmentrsquos incentives to censor free media depend on plurality (the number of mediaoutlets) and on transaction costs of bribing the media In their model media outletsare identical and all voters have common interests Thus the governmentrsquos optimaldecision reduces to either capturing all media outlets or none Relatedly Gehlbachand Sonin (2014) consider a setting with a government-influenced monopoly mediaoutlet that exploits the trade-off between the governmentrsquos objective to ldquomobilizerdquo thepopulation for some collective goal and to collect the revenue from subscribers whodemand informative news Gehlbach and Sonin (2014) show that the presence of theldquomobilizationrdquo objective increases the news bias whereas the subscription revenuesreduce the bias but can cause the government to nationalize the media outlet7

The models of Besley and Prat (2006) and Gehlbach and Sonin (2014) have two statesof the world and either a single media outlet or a few identical media outlets Hencethe government uses the same censorship policy for each outlet We are the first toconsider a richer model of media censorship with a continuum of states and multipleheterogeneous media outlets As a result the government optimally discriminatesmedia outlets by permitting sufficiently loyal ones and banning the remaining ones

2 Model

21 Setup There are two players a sender (she) and a receiver (he) The receiverchooses whether to accept a proposal (a = 1) or reject it (a = 0) The proposalhas an uncertain value ω isin [0 1] By accepting the proposal the receiver forgoesan outside option worth r isin [0 1] so the receiverrsquos utility is a(ω minus r) The senderrsquosutility is av(ω r) where v(ω r) is linear in ω and continuously differentiable in rWe will refer to ω as state and to r as type and denote their distributions by F andG respectively Throughout the paper we assume that distribution G of r has acontinuously differentiable and strictly positive density g8

As in many applications the state is either a continuous or discrete random variablewe will separately analyze these two cases We say that the state ω and distribution F

7In different contexts media control by a government has also been studied by Egorov Gurievand Sonin (2009) Edmond (2013) and Lorentzen (2014) See also the overview of the literature onmedia capture slant and transparency in Prat and Stromberg (2013)

8This assumption is made for clarity of exposition The results can be extended to general G

6 KOLOTILIN MYLOVANOV ZAPECHELNYUK

are continuous if F has a density f on [0 1] We say that the state ω and distributionF are discrete if the support of F is a finite set

A particularly tractable and intuitive special case is where the senderrsquos utility is aweighted sum of the receiverrsquos utility and action We call this the canonical caseFormally in the canonical case the senderrsquos utility when the proposal is accepted is

v (ω r) = 1 + ρ(ω minus r) ρ isin R (A1)

That is the sender is biased towards a = 1 but also puts a weight ρ on the receiverrsquosutility In particular if the weight ρ is large then the sender and receiverrsquos inter-ests are aligned whereas if the weight is zero then the sender cares only about thereceiverrsquos action

The receiver privately knows his type but he does not observe the state The sendercan influence the action taken by the receiver a = 1 or a = 0 by releasing a signalthat reveals information about the state A signal is a random variable s isin [0 1]that is independent of r but is possibly correlated with ω For example s is fullyinformative if it is perfectly correlated with ω and s is completely uninformative ifit is independent of ω

The timing is as follows First the sender publicly chooses a signal s Then realiza-tions of ω r and s are drawn Finally the receiver observes the realizations of histype r and the signal s and then chooses between a = 0 and a = 1

We consider two scenarios

A The sender can choose any signal

B The sender can choose any monotone partitional signal

Formally a signal s is a monotone partitional signal if there exists a nondecreasingfunction ξ(ω) such that s = ξ(ω) for each ω Every such signal induces a partitionof the state space [0 1] into intervals and singletons and the receiver observes thepartition element that contains the state

Under Scenario A we are interested in an optimal signal that maximizes the senderrsquosexpected utility among all signals This is a standard persuasion problem

Under Scenario B we are interested in an optimal monotone partition that maximizesthe senderrsquos expected utility among all monotone partitional signals This scenarioincorporates constraints that information designers often face in practice For ex-ample a non-monotone grading policy that gives better grades to worse performingstudents will be perceived as unfair and will be manipulated by strategic students

22 Upper Censorship A subset of signals called upper censorship will play aspecial role in this paper

CENSORSHIP AS OPTIMAL PERSUASION 7

An upper-censorship signal reveals states below a specified cutoff and pools statesabove this cutoff When the state is equal to the cutoff it can be revealed or pooledIf the state is continuous it does not matter whether the cutoff state is revealed orpooled because this is a zero probability event However if the state is discretewe will distinguish between deterministic and stochastic upper-censorship signalsdepending on what happens at the cutoff state

A lower-censorship signal is defined symmetrically it pools states below a specifiedcutoff and reveals states above this cutoff Because the case of lower censorshipis symmetric (see Remark 1 below) for clarity of exposition we focus on uppercensorship throughout the paper

A signal s is stochastic upper censorship if there exists a cutoff pair (ωlowast qlowast) consistingof a cutoff state ωlowast isin [0 1] and probability qlowast isin [0 1] such that states below ωlowast arerevealed states above ωlowast are pooled and the state ω = ωlowast is revealed with probabilityqlowast and pooled with probability 1minus qlowast For example s can be expressed as

s =

983099983105983103

983105983101

ω with probability one if ω lt ωlowast

ωlowast and mlowast with probabilities qlowast and 1minus qlowast if ω = ωlowast

mlowast with probability one if ω gt ωlowast

where

mlowast =

983125(ωlowast1]

ωdF (ω) + ωlowast(1minus qlowast) Pr[ω = ωlowast]983125(ωlowast1]

dF (ω) + (1minus qlowast) Pr[ω = ωlowast](1)

is the posterior expected state induced by the pooling signal Notice that mlowast is afunction of ωlowast and qlowast

A stochastic upper-censorship signal s with a cutoff pair (ωlowast qlowast) is deterministic uppercensorship if and only if qlowast isin 0 1 in which case s can be expressed as a monotonepartition of the state space [0 1] For example if qlowast = 0 then the signal s fullyreveals states ω lt ωlowast and pools states ω ge ωlowast so s can be expressed as

s =

983083ω if ω lt ωlowast

mlowast if ω ge ωlowast

wheremlowast = E[ω|ω ge ωlowast]

is the expected state conditional on being at least ωlowast Note that the fully informativesignal and the completely uninformative signal are deterministic upper-censorshipsignals with (ωlowast qlowast) = (1 1) and (ωlowast qlowast) = (0 0) respectively

23 Benchmark To illustrate the difference between the cases of a continuous anddiscrete state as well as between Scenarios A and B we solve a benchmark examplewhere there is no uncertainty about the receiverrsquos type

Let the state ω have the expected value E[ω] = 12 and let the receiverrsquos type r be

known to the sender and satisfy r isin98304312 1983044 In addition suppose that the sender wishes

8 KOLOTILIN MYLOVANOV ZAPECHELNYUK

to maximize the probability that the receiver accepts the proposal so v(ω r) = 1 forall ω and all r

Observe that if the sender reveals no information about ω the receiver will evaluate ωby its expected value of 1

2 thus having utility 1

2minus r lt 0 from accepting the proposal

So revealing no information makes the receiver reject the proposal with certaintyThe sender can do better by fully revealing ω in which case the proposal is acceptedwith probability Pr[ω ge r]

However the sender can do even better by pooling states above r with states below rwhile keeping the expected state induced by the pooling signal at least r Thus thereceiver who is unable to distinguish between the states in the pool will accept theproposal at all of these states

To illustrate the case of the continuous state suppose that ω is uniformly distributedon [0 1] The largest pool that maintains the posterior expectation at least r is theinterval [2r minus 1 1] Indeed once the receiver learns that ω isin [2r minus 1 1] given theuniform prior of ω the posterior expected state is r So the proposal is accepted whenω ge 2rminus 1 and rejected when ω lt 2rminus 19 Thus the deterministic upper-censorshipsignal with cutoff ωlowast = 2r minus 1 is optimal The senderrsquos expected utility is equal to

Pr[proposal is accepted] = Pr[ω ge 2r minus 1] = 2(1minus r)

which is twice as high as that from the fully informative signal

Pr[proposal is accepted] = Pr[ω ge r] = 1minus r

When ω is a discrete random variable an optimal way to reveal information aboutω takes the form of stochastic upper censorship For illustration suppose that ω canonly be 0 or 1 equally likely Now simply pooling states is not helpful for the senderPooling 0 and 1 yields the posterior expected state of 1

2 which is smaller than the

type r resulting in a rejection of the proposal Including any other states in thepool makes no difference as these states never occur Thus an optimal monotonepartition is the fully informative signal and the probability that the receiver acceptsthe proposal is simply equal to the probability that ω = 1

Pr[proposal is accepted] = Pr[ω ge r] = Pr[ω = 1] =1

2

The sender can do strictly better by partial (stochastic) pooling If ω = 1 let thesignal realization be ldquoacceptrdquo with certainty If ω = 0 let the signal realization beldquorejectrdquo with some probability q and ldquoacceptrdquo with probability 1 minus q So when thereceiver observes ldquoacceptrdquo the posterior expected state is

E[ω|ldquoacceptrdquo] = Pr[ω = 1]

Pr[ω = 1] + Pr[ω = 0] middot (1minus q)=

12

12 + 12 middot (1minus q)=

1

2minus q

9Whether states strictly below ωlowast are revealed or pooled among themselves does not matterbecause they are all strictly below r and thus induce the receiver to take action a = 0

CENSORSHIP AS OPTIMAL PERSUASION 9

The sender now wishes to find the lowest q subject to 1(2 minus q) ge r which yieldsqlowast = 2 minus 1

r Thus stochastic upper censorship with ωlowast = 0 and qlowast = 2 minus 1

ris

optimal The probability that the receiver accepts the proposal is exactly equal tothe probability that the signal realization is ldquoacceptrdquo

Pr[proposal is accepted] = Pr[ω = 1] + Pr[ω = 0] middot (1minus qlowast) =1

2+

1

2middot 1minus r

r=

1

2r

We now summarize the insights obtained in this example If the state is continuousthen an optimal signal is deterministic upper censorship If the state is discrete thenan optimal monotone partition is deterministic upper censorship and an optimalsignal is stochastic upper censorship10 In the remainder of the paper we will showthat these insights extend to the case where the receiverrsquos type is uncertain

3 Continuous State

In this section we provide necessary and sufficient conditions for the optimality ofupper censorship when the state is continuous

Let m be the expected state conditional on observing a realization of a signal s Sincethe sender and receiverrsquos utilities are linear in ω they depend on the information aboutω revealed by signal s only through the expected state m In particular the receiverchooses a = 1 if and only if r le m

Let V (m) denote the indirect utility of the sender conditional on m

V (m) =

983133

rlem

E [v(ω r)|m] g(r)dr =

983133 m

0

v(m r)g(r)dr m isin [0 1] (2)

where E [v(ω r)|m] = v(m r) by the linearity of v in ω

A function V is said to be S-shaped if it is convex below some threshold and concaveabove that threshold or equivalently if V primeprime is single-crossing from above

there exists τ such that V primeprime(m) ge (le) 0 for all m lt (gt) τ

We now provide the criterion for the optimality of upper censorship Recall thatmlowast = E[ω|ω ge ωlowast]

Theorem 1 Let V be S-shaped Then and only then for all continuous F anoptimal signal is deterministic upper censorship whose cutoff state ωlowast isin [0 τ ] satisfies

V (mlowast) + V prime(mlowast)(mminusmlowast) ge (le)V (m) for all m ge (lt)ωlowast (3)

The criterion for the optimality of lower censorship is analogous

10See for example Kolotilin (2015) for a detailed treatment of the case where there is no uncer-tainty about the receiverrsquos type

10 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Remark 1 Let minusV be S-shaped Then and only then for all continuous F anoptimal signal is deterministic lower censorship whose cutoff state ωlowast isin [τ 1] satisfies

V (mlowast) + V prime(mlowast)(mminusmlowast) ge (le)V (m) for all m le (gt)ωlowast

Theorem 1 states that optimal persuasion takes a simple form of upper censorshipwhen V is S-shaped11 Moreover the theorem establishes that this result is tight inthe sense that if V is not S-shaped then there exists a distribution F of the statesuch that no upper-censorship signal is optimal

Note that every upper-censorship signal is a monotone partition Therefore Theorem1 applies in both Scenario A where the sender is free to choose any signal andScenario B where the sender is constrained to choose a monotone partition

The intuition behind Theorem 1 is as follows Observe that when no informationabout the state is revealed the receiverrsquos best-response action does not change withthe state The more information is revealed the more variable the receiverrsquos behaviorwill be in response to this information Consider an interval where V is concave Thesender would prefer not to reveal any information when the state is in this intervalsince a certain outcome is preferred to any lottery with the same expected stateConversely consider an interval where V is convex The sender would prefer to fullyreveal the state in this interval since now lotteries are preferred If V is S-shapedthat is it is convex below some threshold and concave above that threshold theinduced optimal persuasion takes the form of upper censorship

When V is S-shaped the senderrsquos optimization problem is reduced to finding anoptimal censorship cutoff ωlowast If the realized state ω is below the cutoff then it isrevealed to the receiver so the expected utility of the sender is V (ω) If the realizedstate ω is above the cutoff then the posterior expected state is mlowast = E[ω|ω ge ωlowast] sothe expected utility of the sender conditional on ω ge ωlowast is V (mlowast) The sender thusneeds to solve the problem

maxωlowastisin[01]

983133 ωlowast

0

V (ω)f(ω)dω +

983133 1

ωlowastV (mlowast)f(ω)dω (4)

The expression in (3) represents the first-order condition to this problem and capturesthree possible cases the boundary solutions ωlowast = 0 and ωlowast = 1 if the expression in(3) has the same sign for all ω and an interior solution ωlowast such that

V (mlowast) + V prime(mlowast)(ωlowast minusmlowast) = V (ωlowast) (5)

This first-order condition (5) is illustrated by Figure 1 The solid line is V (m) andthe dashed line is V (mlowast) + V prime(mlowast)(mminusmlowast) which is tangent to V at mlowast

11For a fixed distribution F Proposition 3 in Kolotilin (2018) implies that upper censorship withcutoff ωlowast is optimal if and only if (3) holds and V (m) is convex on [0ωlowast] This result can be used toshow that upper censorship is optimal when V is S-shaped For completeness we include a simpleself-contained proof inspired by the proof of Theorem 1 in Dworczak and Martini (2019)

CENSORSHIP AS OPTIMAL PERSUASION 11

1ωlowast

V

revealing pooling

mlowast0

Figure 1 Optimal upper censorship with cutoff ωlowast

The solution of the senderrsquos problem (4) is particularly simple if V is either globallyconvex or globally concave In this case the first-order condition in (3) has a constantsign so either ωlowast = 0 or ωlowast = 1 must be optimal If V is convex then ωlowast = 1 isoptimal which corresponds to the fully informative signal Similarly if V is concavethen ωlowast = 0 is optimal which corresponds to the completely uninformative signalThis is summarized in the following corollary

Corollary 1 An optimal signal is

(i) fully informative for all F if and only if V is convex

(ii) completely uninformative for all F if and only if V is concave

31 Canonical Case We now consider the canonical case where the senderrsquos utilitysatisfies assumption (A1)

The density g of receiver types r is said to be log-concave if ln g(r) is concave in r12

Note that ln g(r) is well defined in the canonical case

In this case the shape of the senderrsquos indirect utility V defined by (9) is connectedto the shape of the density g of receiver types as follows

Lemma 1 In the canonical case V is S-shaped for all ρ if and only if g is log-concave

12Table 1 in Bagnoli and Bergstrom (2005) reports distributions with log-concave densities

12 KOLOTILIN MYLOVANOV ZAPECHELNYUK

That is if the density of receiver types g is log-concave then the senderrsquos indirectutility V is S-shaped Moreover this result is tight in the sense that if g is notlog-concave then there exists ρ such that V is not S-shaped

By Theorem 1 and Lemma 1 we obtain a criterion for the optimality of upper cen-sorship with the condition on the primitive of the model the density g

Theorem 2 Consider the canonical case and let the density g be log-concave Thenand only then an optimal signal is deterministic upper censorship for all F and all ρ

The symmetric statement is also true an optimal signal is lower censorship for all Fand all ρ if and only if minusg is log-concave Consequently if both g and minusg are log-concave (that is g is exponential) then there exists an optimal signal which is bothupper censorship and lower censorship There are only two signals with this propertyfully informative and completely uninformative This allows us to obtain conditionson the distribution of receiver types under which the optimal signal is polarizedbetween fully informative and completely uninformative signals as for example inLewis and Sappington (1994) and Johnson and Myatt (2006)13

Corollary 2 In the canonical case an optimal signal is either fully informative orcompletely uninformative for all F and all ρ if and only if there exist λ isin R and c gt 0such that g(r) = ceminusλr for r isin [0 1]

If g(r) = ceminusλr then the fully informative signal is optimal whenever ρ ge λ and thecompletely uninformative signal is optimal whenever ρ le λ (and any signal is optimalwhen ρ = λ) In particular if ρ = 0 the optimal signal is fully determined by thesign of λ which is in turn determined by whether the mean of r is smaller or greaterthan 1

2

32 Comparative Statics Theorem 2 allows for a sharp comparative statics anal-ysis on the informativeness of the optimal signal

We compare signals by their Blackwell informativeness (Blackwell 1953) To compareupper-censorship signals s1 and s2 we only need to compare their cutoffs ωlowast

1 and ωlowast2

Signal s1 is more informative than signal s2 if ωlowast1 ge ωlowast

2 Indeed state ω isin [0ωlowast2) is

fully revealed by both s1 and s2 and state ω isin [ωlowast2 1] is partially revealed by s1 and

not revealed at all by s2 so s1 is more informative than s2

13In the case of log-concave density g the sets of upper- and lower-censorship signals are totallyordered by the rotation order of Johnson and Myatt (2006) If we restrict attention to the set oflower-censorship signals (which are generally suboptimal under log-concave g) then the rotationpoint is decreasing with respect to the rotation order By Lemma 1 of Johnson and Myatt (2006)the senderrsquos expected utility is quasiconvex and one of the extreme lower-censorship signals (thefully informative or completely uninformative signal) is optimal for the sender But if we restrictattention to the set of upper-censorship signals (which are optimal under log-concave g) then therotation point is not decreasing with respect to the rotation order Therefore Lemma 1 of Johnsonand Myatt (2006) does not apply and an interior upper-censorship signal is generally optimal forthe sender

CENSORSHIP AS OPTIMAL PERSUASION 13

For the purpose of comparison we extend the definition of distribution function Gto the real line and assume that its density g is log-concave Consider a family ofdistributions Gtσ of receiver types

Gtσ (r) = G

983061τ minus t+

r minus τ

σ

983062

where τ is the maximum point of g and t isin R and σ gt 0 are parameters Let gtσ (r)denote the corresponding density Note that gtσ (r) is log-concave for all t isin R andall σ gt 0

Because gtσ is log-concave on [0 1] deterministic upper censorship is optimal byTheorem 2 Let ωlowast (ρ t σ) isin [0 1] be the optimal censorship cutoff as given by (3)

The shift parameter t shifts the distribution along the horizontal axis and the stretchparameter σ stretches the distribution horizontally and symmetrically around themode τ so the limit of σ rarr 0 leads to the unit mass on τ We now show that thesender optimally discloses more information when

(i) the senderrsquos preferences are better aligned with the receiverrsquos preferences (thealignment parameter ρ is greater)

(ii) the receiver is more reluctant to accept the proposal (the shift parameter t isgreater)

(iii) the sender is more uncertain about the receiverrsquos type (the stretch parameter σis greater) provided ρ ge 014

Theorem 3 In the canonical case a cutoff state of an optimal signal satisfies

(i) ωlowast (ρ t σ) is increasing in ρ

(ii) ωlowast (ρ t σ) is increasing in t

(iii) ωlowast (ρ t σ) is increasing in σ if ρ ge 0

The intuition for part (i) is that for a higher ρ the sender puts more weight onthe receiverrsquos utility so she optimally endows the receiver with a higher utility byproviding more information

The intuition for part (ii) is that for a higher t each type of the receiver has a greatercost of accepting the proposal so to persuade the same type of the receiver the senderneeds to increase E[ω|ω ge ωlowast] by expanding the full disclosure interval [0ωlowast]

The intuition for part (iii) is that for a higher σ receiver types are more spread out soto persuade the same mass of types the sender again needs to increase E[ω|ω ge ωlowast]

14The case ρ ge 0 is the practically relevant case where the senderrsquos utility is a weighted averageof the receiverrsquos action and utility 1

1+ρv(ω r) =1

1+ρ + ρ1+ρ (ω minus r)

14 KOLOTILIN MYLOVANOV ZAPECHELNYUK

4 Discrete State

In this section we assume that the state is a discrete random variable

Theorem 1prime Let V be S-shaped Then and only then for all discrete F

(A) an optimal signal is stochastic upper censorship whose cutoff pair (ωlowast qlowast) satisfiescondition (3)

(B) an optimal monotone partition is deterministic upper censorship whose cutoff pair(ωlowast

d qlowastd) satisfies ωlowast

d = ωlowast and qlowastd isin 0 1

In the canonical case we obtain the result analogous to Theorem 2

Theorem 2prime Consider the canonical case and let the density g be log-concave Thenand only then for all discrete F and all ρ

(A) an optimal signal is stochastic upper censorship

(B) an optimal monotone partition is deterministic upper censorship

Let us discuss how the comparative statics result (Theorem 3) changes In Section 32we argue that upper-censorship signals can be ordered by their censorship cutoffs Butwhen the state is discrete a censorship cutoff is described by a pair (ωlowast qlowast) where ωlowast

is a cutoff state and qlowast is the probability of revealing this cutoff state when it realizes

Consider two stochastic upper-censorship signals s1 and s2 with cutoff pairs (ωlowast1 q

lowast1)

and (ωlowast2 q

lowast2) Denote (ωlowast

1 qlowast1) ≽ (ωlowast

2 qlowast2) if ω

lowast1 gt ωlowast

2 or ωlowast1 = ωlowast

2 and qlowast1 ge qlowast2 Observethat s1 is more (Blackwell) informative than s2 if and only if (ωlowast

1 qlowast1) ≽ (ωlowast

2 qlowast2) This

comparison also applies to deterministic upper-censorship signals with the constraintthat qlowast1 and qlowast2 are in 0 1

With the above order on cutoff pairs Theorem 3 extend as follows

Theorem 3prime Consider the canonical case

(A) A cutoff pair of an optimal signal satisfies

(i) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in ρ

(ii) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in t

(iii) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in σ if ρ ge 0

(B) A cutoff pair of an optimal monotone partition satisfies15

(i) (ωlowastd (ρ t σ) q

lowastd (ρ t σ)) is increasing in ρ

(ii) (ωlowastd (ρ t σ) q

lowastd (ρ t σ)) is increasing in t

15Part (iii) also holds for an optimal monotone partition when for example the discrete supportof the state ω is a regular grid with sufficiently many points However it is possible to construct acounterexample where part (iii) fails for an optimal monotone partition

CENSORSHIP AS OPTIMAL PERSUASION 15

That is the sender optimally discloses more information when she is less biasedrelative to the receiver (the alignment parameter ρ is greater) when the receiver ismore reluctant to accept the proposal (the shift parameter t is greater) and underScenario A when the sender is more uncertain about the receiverrsquos type (the stretchparameter σ is greater)

Remark 2 Theorem 1prime and Theorem 2prime show that upper censorship emerges as anoptimal persuasion strategy under the same conditions (V is S-shaped in Theorem1prime and g is log-concave in Theorem 2prime) regardless of whether the sender can choosean arbitrary signal or is constrained to choose a monotone partition It may betempting to conjecture that if the sender was allowed to choose an arbitrary partition(not necessarily monotone) the same result would hold This however is not true

For illustration suppose that the sender wishes to maximize the probability of a = 1and the receiver has type r = 1

2with certainty16 Let the state ω take values in

0 14 1 with probabilities (1

6 23 16) so the prior mean is 1

3

An optimal signal is stochastic upper censorship with cutoff pair (ωlowast qlowast) = (14 12)

which sends the pooling message with certainty at ω = 1 and with probability 1minusqlowast =12at ω = 1

4 and reveals the state otherwise The pooling message induces the posterior

mean m = 12 leading to a = 1 all other messages induce posterior means below 1

2

leading to a = 0 Under the optimal signal action a = 1 is chosen with probability23middot 12+ 1

6= 1

2

An optimal monotone partition is deterministic upper censorship that fully revealsthe state where only ω = 1 leads to a = 1 Under this partition action a = 1 ischosen with probability 1

6

An optimal partition pools the extreme states ω = 0 and ω = 1 and reveals theintermediate state ω = 1

4 The pooling message induces the posterior mean m = 1

2

leading to a = 1 This optimal partition is nonmonotone Under this partitionaction a = 1 is chosen with probability 1

6+ 1

6= 1

3 which is strictly better than the

probability of 16obtained under the optimal monotone partition

5 Application to Media Censorship

In this section we apply our results to the problem of media censorship by thegovernment In the modern world people obtain information about the governmentrsquosstate through various media sources such as television newspapers and internetblogs Without the media most people would not know what policies and reformsthe government pursues and how effective they are Media outlets have differentpositions on the political spectrum and differ substantially in how they select andpresent facts to cover the same news People choose their sources of information based

16In this case V (m) = 1 if m ge 12 and V (m) = 0 if m lt 1

2 Observe that the graph of V is

S-shaped and it can be approximated by a continuously differentiable S-shaped function

16 KOLOTILIN MYLOVANOV ZAPECHELNYUK

on their political ideology and socioeconomic status This information is valuable forsignificant individual decisions on migration investment and voting to name a fewIndividuals may not fully internalize externalities that their decisions impose on thesociety Likewise the government may not have the societyrsquos best interest at heartTo further its goals the government then wishes to influence individual decisions bymanipulating the information through media In autocracies and countries with weakchecks and balances the government has power to censor the media content

The governmentrsquos problem of media censorship can be represented as the persua-sion problem in Section 2 We apply our results to provide conditions for the op-timality of upper-censorship policies that censor all media outlets except the mostpro-government ones Furthermore we interpret our comparative statics results

51 Setup There is a continuum of heterogeneous citizens indexed by r isin [0 1]distributed with G Each citizen chooses between a = 0 and a = 1 The utility of acitizen of type r is given by

u(θ r ar a) = (θ minus r)ar + κ(θ r a)

where ar isin 0 1 denotes the citizenrsquos own action a =983125ardG(r) denotes the aggre-

gate action in the society θ isin [0 1] captures an unobserved benefit from action 1 ascompared to action 0 and κ captures the impact of the aggregate action a on thecitizenrsquos utility The term (θ minus r)ar is a private surplus of a citizen of type r Theterm κ(θ r a) is an externality because for a citizen of type r it is optimal to ignorethis term and choose ar = 1 if and only if θ ge r

There is a government which is concerned with a weighted average of the social utilityand the governmentrsquos intrinsic benefit from the aggregate action For a given θ thegovernmentrsquos utility is given by

983133 1

0

ν(θ r ar a)dG(r) + δγ(θ a)

The term ν captures a citizenrsquos utility from the governmentrsquos perspective We allow νto be different from u to reflect paternalistic or other concerns The term γ capturesthe governmentrsquos intrinsic benefit from the aggregate action The parameter δ ge 0captures the weight of the aggregate action in the governmentrsquos utility

Let T be a distribution of the random variable θ We assume that distributions G andT are independent and admit continuously differentiable and strictly positive densi-ties We also assume that κ ν and γ are linear in θ and continuously differentiablein r and a Furthermore to simplify interpretations we assume that ν and γ arenon-decreasing in θ and a so that for the government a high θ is a good news anda higher aggregate action is preferable

Citizens obtain information about the unobservable benefit θ through media outletsEach media outlet is identified by its editorial policy c isin [0 1] and it endorses action

CENSORSHIP AS OPTIMAL PERSUASION 17

a = 1 if θ ge c and criticizes it if θ lt c17 A set C of media outlets is a finite subsetof [0 1]18

The governmentrsquos censorship policy is a set of the media outlets X sub C that arepermitted to broadcast so the rest of the media outlets are censored

The timing is as follows First the government chooses a set X sub C of permittedmedia outlets Second state θ is realized and each permitted media outlet endorsesor criticizes action a = 1 according to its editorial policy Finally each citizen observesmessages from all permitted media outlets updates his beliefs about θ and choosesan action

52 Discussion We now discuss interpretations of the key components of the mediacensorship application As in Gehlbach and Sonin (2014) there can be various inter-pretations of the citizenrsquos action a = 1 such as voting for the government supportinga governmentrsquos policy or taking an individual decision that benefits the governmentA citizenrsquos type r can be interpreted as his ideological position or preference param-eter A citizen who is more supportive of the government has a smaller r

A media outlet with a higher editorial policy c can be interpreted as being less loyalto the government because it criticizes the government on a larger set of states Aneditorial policy c isin C can therefore represent a slant or political bias of the outletagainst the government and can be empirically measured as the frequency with whichthe outlet uses anti-government language Gentzkow and Shapiro (2010) constructsuch a slant index for US newspapers Empirical findings of their paper suggestthat the editorial policies of media outlets are driven by reader preferences justifyingour assumption of the existence of a large variety of editorial policies19 As in Suen(2004) Chan and Suen (2008) and Chiang and Knight (2011) the assumption ofthe binary media reports that communicate only whether the state θ is above somestandard c can be justified by a cursory readerrsquos preference for simple messages suchas positive or negative opinions and yes or no recommendations

The governmentrsquos censorship of media outlets can take various forms For examplethe government can ban access to internet sites withdraw licenses disrupt financingconfiscate print materials and equipment discredit media providers and even arresteditors and journalists using broadly formulated legislation on combating extremismIn some countries the government can exercise direct control over media editorialpolicies either through state ownership or administrative pressure

17The tie-breaking in the event of θ = c is unimportant as θ is a continuous random variable18We discuss the case of a continuum of media outlets in Section 5519Theoretical literature has explored the determinants of media slant of an outlet driven by its

citizens (Mullainathan and Shleifer 2005 Gentzkow and Shapiro 2006 and Chan and Suen 2008)and its owners (Baron 2006 and Besley and Prat 2006)

18 KOLOTILIN MYLOVANOV ZAPECHELNYUK

53 Formulating Media Censorship as Persuasion We now show that the me-dia censorship problem can be formulated as a linear persuasion problem in whichthe government is a sender and a representative citizen is a receiver

In our model the citizensrsquo and the governmentrsquos utilities are linear in θ There-fore given any information from the media outlets the utilities depend only on theposterior mean of θ

Consider an arbitrary posterior mean of θ denoted by m isin [0 1] Each citizen of typer chooses ar = 1 if and only if r le m The externality term κ plays no role in thisdecision Therefore the aggregate action a is simply the mass of all citizens whosetypes do not exceed m so a = G(m)

Next using the citizensrsquo optimal behavior we derive the governmentrsquos expected utilityconditional on the posterior mean m

V (m) = E983063983133 1

0

ν(θ r ar G(m))dG(r) + δγ(θ G(m))

983055983055983055983055m983064

=

983133 1

0

ν(m r1rlem G(m))dG(r) + δγ(mG(m)) (6)

This is the governmentrsquos indirect utility As in Section 2 the government now choosesa signal which is informative about the state to maximize its expected utility How-ever in contrast to Section 2 here the government is restricted to signals that areimplementable by a subset of a given set of media outlets

Next we show that with an appropriate definition of the state the restriction tosignals implementable by a subset of a given set of media outlets can be formulatedas a restriction to monotone partitional signals as in Scenario B in Section 2 Thuswe will be able to use our results in Section 4 to find the optimal censorship policies

Recall that the information about θ is only available through the media outlets Let ωdenote a random variable equal to the conditional expectation of θ given the messagesof all media outlets in C Note that as C is finite ω is a discrete random variablewith values in [0 1] Let F denote the distribution of ω From now on we treat therandom variable ω as the state and denote by s a signal as defined in Section 2

Let HM denote the set of all distributions of the posterior mean state induced byall monotone partitional signals and let HC denote the set of all distributions ofthe posterior mean state induced by all finite subsets of the set C of media outletsObserve that any set of media outlets induces a monotone partitional signal thatis HC sub HM We now show that every outcome implementable by an arbitrarymonotone partition in HM is also implementable by a subset of media outlets

Lemma 2 HC = HM

For illustration suppose that there is only one media outlet with the editorial policyc isin (0 1) There are two observable events θ ge c and θ lt c In each realized event

CENSORSHIP AS OPTIMAL PERSUASION 19

the citizens compute the posterior expectation of θ which we will call ω So in thisexample the distribution F of ω has two mass points one for each of the two events

Given these two mass points every monotone partition of [0 1] induces one of twopossible distributions of the posterior mean state (i) the mass points revealed and (ii)the mass points pooled Observe however that these distributions are implementableby permitting and censoring the media outlet respectively

A censorship policy is upper censorship if it censors all sufficiently disloyal mediaoutlets Specifically there exists a cutoff clowast isin C and an indicator qlowast isin 0 1 suchthat all media outlets whose editorial policies are below clowast are permitted all mediaoutlets whose editorial policies are above clowast are censored and clowast is permitted if andonly if qlowast = 1 That is X = c isin C c le clowast if qlowast = 1 and X = c isin C c lt clowastif qlowast = 0 Note that the full censorship policy clowast = 0 and qlowast = 0 (where all mediaoutlets are censored) and the free media policy clowast = 1 and qlowast = 1 (where all mediaoutlets are permitted) are the two extreme upper-censorship policies

As shown above the media censorship problem is equivalent to the linear persuasionproblem in which the sender is restricted to monotone partitional signals and thesenderrsquos indirect utility V is given by (6) We thus apply Theorem 1prime(B) to obtainthe following result

Theorem 1primeprime If V is S-shaped then an optimal censorship policy is upper censorship

To illustrate Theorem 1primeprime suppose the government is interested only in the aggregateaction so that ν = 0 and γ depends only on the aggregate action a so V (m) =γ(G(m)) Thus an upper-censorship policy is optimal if the composition functionγ(G(middot)) is S-shaped For example this condition holds if the government is interestedin reaching a certain approval threshold (eg a simple majority) so that γ is a stepfunction This condition also holds if γ is S-shaped and G is uniform or S-shapedwith the same inflection point as γ20

54 Canonical Case We now impose more structure on the utilities to obtain asharper result for the optimality of upper censorship and to perform a comparativestatics analysis Analogously to assumption (A1) we assume that the governmentrsquosutility is a weighted average of the citizensrsquo utility and their aggregate action

983133 1

0

u(θ r ar a)dG(r) + δa

where u(θ r ar a) = (θ minus r)ar + ζ(r)a

(A2)

for some continuously differentiable function ζ Define

β =

983133 1

0

ζ(r)dG(r) + δ

20This condition holds in many special cases where γ and G are S-shaped even when γ and Ghave different inflection points

20 KOLOTILIN MYLOVANOV ZAPECHELNYUK

The term983125 1

0ζ(r)dG(r) is the governmentrsquos expected bias towards the citizensrsquo action

a = 1 due to the citizensrsquo externality and the term δ is the governmentrsquos intrinsicbias towards a greater aggregate action a Thus we interpret β as the governmentrsquosaggregate bias This decomposition of the bias allows for different interpretations whythe government is biased So β can be high because the government is too self-serving(high δ) or because the government is benevolent (low δ) but wishes to internalizestrong positive externalities of the citizens (high ζ(r)) To ease interpretations weassume that β gt 021

Under Assumption (A2) the governmentrsquos indirect utility V given by (6) becomes

V (m) =

983133 m

0

(mminus r)dG(r) + βa =

983133 m

0

v(m r)dG(r)

where

v(m r) = β

9830611 +

1

β(mminus r)

983062

This is the same as v given by Assumption (A1) with ρ = βminus1 up to rescaling by apositive constant Consequently we can apply Theorem 2prime(B) to obtain the followingresult

Theorem 2primeprime Let (A2) hold If the density g of citizens types is log-concave then anoptimal censorship policy is upper censorship

We now apply the comparative statics analysis presented in Section 32 Note thatthe upper-censorship policies are ordered according to the amount of informationtransmitted to the citizens in the sense of Blackwell (1953) A greater censorshipthreshold clowast means that more media outlets are permitted With this order in mindwe apply Theorem 3prime(B) to make a comparative statics analysis on the amount ofinformation that is optimally disclosed by the government

First the censorship cutoff is increasing in the alignment parameter ρ Recall thatthis is a reciprocal of the governmentrsquos bias ρ = βminus1 This means that the governmentoptimally discloses more information (the censorship cutoff is greater) when it is lessbiased (β is smaller) Intuitively as β decreases the government puts more weighton the citizensrsquo utility so it optimally endows the population with a higher utility bycensoring fewer media outlets and disclosing more information

Second the censorship cutoff is increasing in the magnitude t of the horizontal shiftof the density g A greater t corresponds to a greater opportunity cost of actiona = 1 for each citizen This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more difficultto persuade to take action a = 1 (parameter t is greater) Informally speaking topersuade the same type of the citizen the government needs to increase the posterior

21If β lt 0 then swapping the roles of a = 0 and a = 1 reverses the sign of the bias If β = 0then the governmentrsquos utility and the citizensrsquo private interests coincide so it is trivially optimal topermit all media outlets

CENSORSHIP AS OPTIMAL PERSUASION 21

mean But the expected posterior mean must be equal to the prior mean so it is notpossible to increase all posterior means Due to the log-concave shape of the densityof the citizensrsquo types this tradeoff is resolved by increasing the posterior mean of thepooling interval E[ω|ω ge clowast] This is done by shrinking the pooling interval [clowast 1]that is increasing the censorship cutoff clowast

Third when the number of media outlets is sufficiently large the censorship cutoffis increasing in the stretch factor σ of the density g around its peak τ (see Footnote15) A greater σ corresponds to a more dispersed distribution of the opportunitycost in the population This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more diverse inhow difficult they are to persuade Intuitively when the types are more spread outpersuading the same mass of types requires a greater posterior mean which leads toa higher censorship cutoff similarly to the comparative statics in t discussed above

55 Extensions and Open Questions Let us now consider a few extensions ofour model of media censorship

In our model the set of media outlets is exogenous and the governmentrsquos only in-strument is censorship We now consider three alternative ways of expanding thegovernmentrsquos instruments of influence

First suppose that the government is able not only to censor existing media outletsbut also to introduce new media outlets with chosen editorial policies This is equiv-alent to our censorship model where all media outlets in [0 1] are initially availableand the government can censor any subset of them When all media outlets in [0 1]are permitted the revealed information about θ (which we call ω) is θ itself Thusω = θ is a continuous random variable with distribution F = T So we can nowapply our results for the continuous state from Section 3 instead of those for the dis-crete state in Section 4 reaching the same conclusion about the optimality of uppercensorship

Second suppose that the government can garble information available from mediaoutlets That is the government is not restricted to monotone partitions it cancreate arbitrary signals about state ω for the citizens to observe This becomes ageneral persuasion problem and our Theorems 1prime(A) 2prime(A) and 3prime(A) apply

Third suppose that the government is able to restrict not only which media outletsare permitted but also how many media outlets each citizen can choose to observeIn this extension the citizens are not allowed to communicate with one another (oth-erwise they could share the information thus observing all permitted media outletsindirectly) This extension does not affect our results as long as each citizen is al-lowed to access at least one media outlet of his choice as in Chan and Suen (2008)Intuitively this is because each citizen categorizes the information from the mediaoutlets into ldquogood newsrdquo where a = 1 is optimal and ldquobad newsrdquo where a = 0 is

22 KOLOTILIN MYLOVANOV ZAPECHELNYUK

optimal Because the information from the media outlets induces a monotone parti-tion it means that ldquogood newsrdquo is separated from ldquobad newsrdquo by a threshold mediaoutlet that depends on the citizenrsquos type That is it is sufficient to observe a singlethreshold media outlet to distinguish ldquogood newsrdquo from ldquobad newsrdquo

In our model each citizenrsquos utility depends on the aggregate action a through theexternality term κ(θ r a) which does not affect the chosen action Let us relax thisassumption so that a citizenrsquos optimal choice can depend on a We can still writethe senderrsquos indirect utility V as a function of the posterior mean state and applyour results However V is no longer uniquely determined by the primitives of themodel It is now endogenous and depends on an equilibrium the citizens play aseach citizenrsquos optimal action now depends on what all citizens do in equilibrium Forexample given the same information about the state a citizen could prefer to choosea = 1 if and only if sufficiently many citizens choose the same action so a is largeenough This creates the problem of multiplicity of equilibria and as a consequencethe dependence of optimal censorship on equilibrium selection

As a side our media censorship problem can also be applied to spatial voting mod-els as in Chiang and Knight (2011) Consider a government party (p = G) and anopposition party (p = O) competing in an election If party p wins a voter withan ideological position r gets utility wp minus (r minus rp)

2 where wp is the quality or va-lence of party p and rp is the ideology or policy platform of party p Voters knowthe partiesrsquo ideologies and obtain information about the partiesrsquo qualities from allavailable media outlets Each voter supports the party that maximizes his expectedutility Within this context our analysis still applies because the voterrsquos utility dif-ference between the government and opposition parties is proportional to θminusr whereθ = (wG minus wO + r2O minus r2G)2(rO minus rG)

There are a few more extensions that can be relevant in applications First insteadof complete censorship of a media outlet there can be a cost of accessing it Forexample an international news channel can be censored by a local government butcitizens may still access it through VPN at some cost Second it can be costly forthe government to censor media outlets So another important question to answeris how a government may prioritize censoring Finally citizens can incur some costof following each media outlet While we have already mentioned that citizens gainno benefit from following more than one outlet it is entirely possible for them tostop watching news altogether if it is sufficiently uninformative These extensions arenontrivial and left for future research

Appendix

Proof of Theorem 1 Suppose that V is S-shaped For mωlowast isin [0 1] and mlowast(ωlowast) =E[ω|ω ge ωlowast] define

V (mωlowast) = V (mlowast(ωlowast)) + V prime(mlowast(ωlowast))(mminusmlowast(ωlowast)) (7)

CENSORSHIP AS OPTIMAL PERSUASION 23

Since V (m) is convex on [0 τ ] and concave on [τ 1] there exists ωlowast isin [0 τ ] thatsatisfies condition (3) that is

V (mωlowast) ge (le)V (m) for all m ge (lt)ωlowast (8)

This is true because V (ωlowastωlowast) minus V (ωlowast) is single-crossing from below To see thisconsider Figure 1 Observe that (3) can hold only if ωlowast is on the convex part of V andmlowast is on the concave part of V As the censorship cutoff ωlowast increases the posteriormean state of the pooling message mlowast(ωlowast) also increases But because V is concaveat mlowast(ωlowast) the dashed tangent line V (mωlowast) becomes flatter in m so it crosses thesolid line V (m) at a smaller m

Consider an arbitrary signal s Let H be the distribution of m = E[ω|s] inducedby signal s Since each signal is a garbling of the fully informative signal F is amean-preserving spread of H Thus the senderrsquos expected utility is smaller undersignal s than under the upper-censorship signal with cutoff ωlowast as follows from

983133 1

0

V (m)dH(m) le983133 1

0

maxV (m) V (mωlowast)dH(m)

le983133 1

0

maxV (m) V (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mlowast(ωlowast))dF (m)

where the second line holds because maxV (m) V (mωlowast) is convex in m and F isa mean-preserving spread of H the third line holds by (8) and the last line holds by

(7) and by983125 1

ωlowast(mminusmlowast(ωlowast))dF (m) = 0

Conversely suppose that V is not S-shaped Then there exist 0 le m1 lt m2 lem3 lt m4 le 1 such that V primeprime(m) lt 0 for m isin (m1m2) V

primeprime(m) = 0 for m isin [m2m3]and V primeprime(m) gt 0 for m isin (m3m4) because V primeprime is continuous by assumption It isstraightforward to show that there exists 983141mlowast isin (m1m2) (sufficiently close to m2) and983141ωlowast isin (m3m4) such that

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) lt V (m) for all m isin (983141ωlowastm4]

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) gt V (m) for all m isin [m1 983141ωlowast] 983141mlowast

Moreover there exists a continuous distribution F such that the support of F is asubset of [m1m4] and 983141mlowast = E[ω|ω le 983141ωlowast] Using a chain of inequalities similar to theone above it is easy to show that lower-censorship signal 983141slowast with cutoff 983141ωlowast is optimalFurthermore it is uniquely optimal as both inequalities hold with equalities only forsignal 983141slowast This in turn implies that any upper-censorship signal is suboptimal

24 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Proof of Lemma 1 Notice that under assumption (A1) we have

V primeprime(m) = gprime(m) + ρg(m) for m isin [0 1]

Recall that V is S-shaped if and only if V primeprime(m) = gprime (m) + ρg (m) is single-crossingfrom above Using Proposition 1 in Quah and Strulovici (2012) it is easy to showthat V primeprime(m) is single-crossing from above for all ρ isin R if and only if gprime (m)g (m) isnonincreasing in m (that is ln g(m) is concave)

Proof of Theorem 2 Immediate by Theorem 1 and Lemma 1

Proof of Theorem 3 Part (i) We fix t and σ and vary ρ Without loss ofgenerality let t = 0 and σ = 1 (so that Gtσ = G) Consider a signal that induces thedistribution H of posterior mean m This signal implements the receiverrsquos interimutility U given by

U(r) =

983133 1

r

(mminus r)dH(m) for r isin [0 1]

which holds because the receiver acts if and only if m ge r Since types r lt 0 alwayschoose a = 1 and types r gt 1 always choose a = 0 we can exclude them fromconsideration and assume that supp(G) = [0 1] The senderrsquos expected utility is then

983133 1

0

V (m)dH(m) =

983133 1

0

983133 m

0

(1 + ρ(mminus r))dG(r)dH(m)

=

983133 1

0

983133 1

r

(1 + ρ(mminus r))dH(m)dG(r)

=

983133 1

0

(1minusH(r) + ρU(r))dG(r)

Consider ρ2 gt ρ1 Suppose by contradiction that the corresponding optimal upper-censorship signals s2 and s1 are such that s1 is strictly more informative than s2 LetU2 and U1 be the receiverrsquos interim utilities implemented by s2 and s1 Since thesender prefers signal s2 under ρ2 and signal s1 under ρ1 we have

983133 1

0

(1minusH2 (r) + ρ2U2 (r)) dG (r) ge983133 1

0

(1minusH1 (r) + ρ2U1 (r)) dG (r)

983133 1

0

(1minusH1 (r) + ρ1U1 (r)) dG (r) ge983133 1

0

(1minusH2 (r) + ρ1U2 (r)) dG (r)

Summing up these inequalities gives

(ρ2 minus ρ1)

983133 1

0

(U2 (r)minus U1 (r)) dG (r) ge 0

leading to a contradiction because U1(r) ge U2(r) for all r with strict inequality forsome r given that s1 is strictly more informative than s2

CENSORSHIP AS OPTIMAL PERSUASION 25

Part (ii) We fix ρ isin R and σ = 1 and vary t The indirect utility of the senderconditional on m isin [0 1] is given by

Vt(m) =

983133

rlem

(1+ρ(mminusr))gt1(r)dr =

983133

rlemminust

(1+ρ(mminustminusr))g(r)dr = V (mminust) (9)

By Theorem 1 when t0 = 0 an optimal cutoff ωlowast0 satisfies

V (m) le V (mlowast0) + V prime(mlowast

0)(mminusmlowast0) for m isin [ωlowast

0 1]

When V is S-shaped and t1 lt 0 it is easy to see that

V (mminus t1) le V (mlowast0 minus t1) + V prime(mlowast

0 minus t1)(mminusmlowast0) for m isin [ωlowast

0 1]

which implies that when t = t1 the above inequality changes its sign at some ωlowast1

that satisfies ωlowast1 le ωlowast

0 It follows from (9) that ωlowast1 is an optimal cutoff for t = t1 and

as we established above ωlowast1 le ωlowast

0

Part (iii) We fix ρ ge 0 and t = 0 and vary σ The indirect utility of the senderconditional on m isin [0 1] is given by

V ρσσ(m) =

983133

rlem

9830591 +

ρ

σ(mminus r)

983060dG0σ(r)

=

983133

rlem

9830591 +

ρ

σ(mminus r)

983060g

983061r minus (1minus σ)τ

σ

9830621

σdr

=

983133

rlemminus(1minusσ)τσ

9830611 + ρ

983061mminus (1minus σ)τ

σminus r

983062983062g(r)dr

= Vρ1

983061mminus (1minus σ)τ

σ

983062 (10)

Let σ0 = 1 σ1 lt 1 and ρ1 = ρσ1 To compare the optimal cutoffs under parameters(ρ σ0) and (ρ σ1) we consider two steps first comparing the cutoffs between (ρ σ0)and (ρ1 σ1) and then comparing the cutoffs between (ρ1 σ1) and (ρ σ1)

First we compare an optimal cutoff ωlowast0 under (ρ σ0) = (ρ 1) and an optimal cutoff

ωlowast1 under (ρ1 σ1) By Theorem 1 ωlowast

0 satisfies

Vρ1(m) le Vρ1(mlowast0) + V prime

ρ1(mlowast0)(mminusmlowast

0) for m isin [ωlowast0 1]

When V is increasing (implied by ρ ge 0) and S-shaped and σ1 lt 1 it is easy to seethat for all m isin [ωlowast

0 1]

Vρ1

983061mminus (1minus σ1)τ

σ1

983062le Vρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062+ V prime

ρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062(mminusmlowast

0)

which implies that when σ = σ1 lt 1 the above inequality changes its sign at someωlowast1 that satisfies ωlowast

1 le ωlowast0 It follows from (10) that ωlowast

1 is an optimal cutoff under(ρσ σ1) = (ρ1 σ1) and as we established above ωlowast

1 le ωlowast0

26 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Second we compare the optimal cutoff ωlowast1 under (ρ1 σ1) and the optimal cutoff ωlowast

2

under (ρ σ1) Because ρ lt ρ1 = ρσ1 by part (i) we have ωlowast2 le ωlowast

1

Combining ωlowast1 le ωlowast

0 and ωlowast2 le ωlowast

1 we obtain ωlowast2 le ωlowast

0

Proof of Theorem 1prime Part (A) The proof is analogous to that of Theorem 1 andis thus omitted

Part (B) The only if statement follows from that in part (A) (see the proof of Theorem1) and the fact that we can always ensure that an optimal stochastic censorship(ωlowast qlowast) is deterministic (so qlowast isin 0 1) by the choice of the support and distributionof ω Thus we only need to prove the if statement

Let F be a distribution of ω with support on n values 0 lt ω0 lt lt ωnminus1 lt 1 Thisis without loss of generality because if 0 andor 1 were in the support of F we couldredefine the state as ω = (ω + ε)(1 + 2ε) and the type as r = (r + ε)(1 + 2ε)

Let X sub ω1 ωnminus1 identify a set of cutoffs defining a monotone partition for eachωi isin X the receiver is informed whether ω lt ωi or ω ge ωi Note that we exclude ω0

from the set of cutoffs because ω ge ω0 always holds

Note that X is deterministic upper censorship if and only if there exists a censorshipthreshold ωlowast isin [0 1] such that for each i = 1 nminus 1 ωi isin X if and only if ωi lt ωlowastthat is we can assume that qlowast = 0 without loss of generality

Let VX be the senderrsquos expected utility when the monotone partition is given by X

If n = 2 then there are only two partitions X = ω1 (full disclosure) and X = empty(no disclosure) Both are upper censorships so part (B) holds trivially

If n = 3 then there are four monotone partitions X = ω1ω2 (full disclosure)X = ω2 X = ω1 and X = empty (no disclosure) Only X = ω2 is not uppercensorship The next lemma proves that X = ω2 cannot be uniquely optimal whenV is S-shaped thus proving the optimality of upper censorship for n = 3

Denote

mjk = E983045ω|ω isin [ωjωk]

983046

For convenience of notation we write mj = ωj

Lemma 3 Let V be S-shaped with an inflexion point τ Let ω be a discrete randomvariable with support on m0m1m2 and probabilities p0 p1 p2 where 0 lt m0 ltm1 lt m2 lt 1 Then

(i) if τ le m0 then Vm2 le Vempty

(ii) if τ ge m1 then Vm2 le Vm1m2

(iii) if m0 lt τ lt m1 then Vm2 le maxVempty Vm1m2

CENSORSHIP AS OPTIMAL PERSUASION 27

Proof Define p01 = p0 + p1 and p02 = p01 + p2 = 1 Observe that

m01 =p0m0 + p1m1

p0 + p1and m02 =

p0m0 + p1m1 + p2m2

p0 + p1 + p2

We thus have

Vempty = p02V (m02) = (p01 + p2)V

983061p01m01 + p2m2

p01 + p2

983062

Vm2 = p01V (m01) + p2V (m2) = (p0 + p1)V

983061p0m0 + p1m1

p0 + p1

983062+ p2V (m2)

Vm1m2 = p0V (m0) + p1V (m1) + p2V (m2)

Clearly

m0 lt m01 lt m1 and m0 lt m02 lt m2

Therefore

Vm2 le Vempty lArrrArrp01

p01 + p2V (m01) +

p2p01 + p2

V (m2) le V

983061p01

p01 + p2m01 +

p2p01 + p2

m2

983062= V (m02)

(11)

and

Vm2 le Vm1m2 lArrrArr

V

983061p0

p0 + p1m0 +

p1p0 + p1

m1

983062le p0

p0 + p1V (m0) +

p1p0 + p1

V (m1)(12)

Suppose that τ le m01 so V is concave on [m01m2] Then (11) holds by Jensenrsquosinequality In particular it holds for τ le m0 lt m01 thus proving part (i)

Suppose that τ ge m1 so V is convex on [m0m1] Then (12) holds by Jensenrsquosinequality thus proving part (ii)

Finally suppose that m01 lt τ lt m1 Let (τ V (τ)) lie below (or on) the straight lineA connecting (m0 V (m0)) and (m2 V (m2)) as illustrated on Figure 2 By convexityof V (m) for m lt τ it must be the case that V (m01) lies below (or on) the dashedline B By concavity of V (m) for m gt τ it must be the case that V (m1) lies above(or on) the dashed line C Consequently (12) is satisfied

Alternatively let (τ V (τ)) lie above the straight line A connecting (m0 V (m0)) and(m2 V (m2)) as illustrated on Figure 3 If Vm2 le Vm1m2 then the proof of part(iii) is complete So suppose that Vm2 gt Vm1m2 so (12) is violated

By convexity of V (m) for m lt τ it must be the case that V (m01) lies below (or on)the line B Since (12) is violated V (m01) must lie between the lines A and B Recallthat m01 lt m02 lt m2 Observe that V (m02) must be in the shaded area on Figure 3Indeed if m02 lt τ then by convexity of V (m) for m lt τ it must be the case thatV (m02) lies between (or on) the lines B and D If m02 ge τ then by concavity of

28 KOLOTILIN MYLOVANOV ZAPECHELNYUK

τ m2m0

A C

B

m01 m1

V (m0) V (τ)

V (m2)

Figure 2 The case where (τ V (τ)) lies below (or on) the straight lineconnecting (m0 V (m0)) and (m2 V (m2))

τ m2m0

A

C

B

m01

V (m0)

V (τ)

V (m2)

D

V (m01)

Figure 3 The case where (τ V (τ)) lies above the straight line con-necting (m0 V (m0)) and (m2 V (m2))

V (m) for m gt τ it must be the case that V (m02) lies above (or on) line C It followsthat (11) is satisfied

We now adapt Lemma 3 to prove the optimality of upper censorship for n ge 4

CENSORSHIP AS OPTIMAL PERSUASION 29

Fix a monotone partition X sub ω1 ωnminus1 that is not upper censorship that isthere exists i isin 2 nminus 1 such that

ωiminus1 ∕isin X and ωi isin X (13)

Denote ωn = 1 Define j and k as follows If X cap ω1 ωiminus2 ∕= empty let ωj be thelargest element inX smaller than i otherwise let ωj = ω0 IfXcapωi+1 ωnminus1 ∕= emptylet ωk be the smallest element in X greater than i otherwise let ωk = ωn = 1 Define

m0 = ωj m1 = ωiminus1 m2 = ωi

and

p0 =Pr[ω isin [ωjωiminus1)]

Pr[ω isin [ωjωk)] p1 =

Pr[ω isin [ωiminus1ωi)]

Pr[ω isin [ωjωk)] p2 =

Pr[ω isin [ωiωk)]

Pr[ω isin [ωjωk)]

To apply Lemma 3 we identify Xωi with empty X with m2 and X cup ωiminus1 withm1m2 By Lemma 3 VX le maxVXωi VXcupωiminus1 Thus replacing X withXωi if VX le VXωi and with X cup ωiminus1 otherwise weakly improves the senderrsquosexpected utility Repeatedly applying such replacements yields upper censorshipIndeed this procedure terminates after a finite number of replacements because byLemma 3 for ωj ge τ we replace X with Xωi and for ωiminus1 le τ we replace Xwith X cup ωiminus1

We thus have shown that when V is S-shaped when optimizing over the set ofmonotone partitions we can restrict attention to deterministic upper censorship Itremains to show that an optimal deterministic censorship cutoff ωlowast

d coincides with anoptimal stochastic censorship cutoff ωlowast from part (A) To see this it is straightforwardto show that

V lowast(ωlowast qlowast) =

983133

[0ωlowast)

V (ω)dF (ω) + V (ωlowast)qlowast Pr[ω = ωlowast]

+ V (mlowast(ωlowast qlowast))((1minus qlowast) Pr[ω = ωlowast] + Pr[ω gt ωlowast]) (14)

is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order Since the two de-terministic censorship cutoff pairs closest to (ωlowast qlowast) in the Blackwell informativenessorder are (ωlowast 0) and (ωlowast 1) we obtain ωlowast

d = ωlowast

Proof of Theorem 2prime Immediate by Theorem 1prime and Lemma 1

Proof of Theorem 3prime The proof for an optimal signal is analogous to that ofTheorem 3 and is thus omitted The same applies to part (i) for an optimal monotonepartition Thus it remains to prove part (ii) for an optimal monotone partition

We fix ρ and σ = 1 and vary t continuously If ωlowast isin supp(F ) then an optimalmonotone partition is an optimal signal because all qlowast isin [0 1] are optimal Thuspart (ii) in (B) holds by part (ii) in (A) If ωlowast isin supp(F ) and qlowast isin (0 1) then either(ωlowast

d qlowastd) = (ωlowast 0) or (ωlowast

d qlowastd) = (ωlowast 1) is optimal because the senderrsquos expected utility

V lowast given by (14) is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order

30 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Therefore it suffices to consider values of t at which the sender is indifferent between(ωlowast 0) and (ωlowast 1) and to show that the sender strictly prefers (ωlowast 1) to (ωlowast 0) at aslightly higher t

V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) =rArr dV lowastt (ω

lowast 0)

dtle dV lowast

t (ωlowast 1)

dt (15)

The condition V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) can be written as

Vt(mlowast(ωlowast 0)) Pr[ω ge ωlowast] = Vt(ω

lowast) Pr[ω = ωlowast] + Vt(mlowast(ωlowast 1)) Pr[ω gt ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)Pr[ω = ωlowast]

Pr[ω ge ωlowast]+ Vt(m

lowast(ωlowast 1))Pr[ω gt ωlowast]

Pr[ω ge ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + Vt(mlowast(ωlowast 1))

mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Analogously the conditiondV lowast

t (ωlowast0)dt

le dV lowastt (ωlowast1)dt

can be written as

V primet (m

lowast(ωlowast 0)) ge V primet (ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + V primet (m

lowast(ωlowast 1))mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Thus (15) states that if Vt(mlowast(ωlowast 0)) is equal to the convex combination of Vt(ω

lowast)and Vt(m

lowast(ωlowast 1)) then the derivative V primet (m

lowast(ωlowast 0)) is greater than the convex com-bination of the derivatives V prime

t (ωlowast) and V prime

t (mlowast(ωlowast 1)) When V is S-shaped it is easy

to see that this property is satisfied

Proof of Lemma 2 Observe that a signal induced by any set X sub C of permittedmedia outlets induces a monotone partition of [0 1] and thus HC sub HM Indeed letcX(ω) and cX(ω) be the media outlets in X that are the closest to ω from above andfrom below

cX(ω) = sup983059c isin X c le ω

983052cup 0

983060and cX(ω) = inf

983059c isin X c gt ω cup 1

983060

By observing the media outlets in X each citizen is informed that ω isin [cX(ω) cX(ω))

Conversely let the set C of media outlets be c0 lt middot middot middot lt cn Without loss of generalityassume that C contains the two uninformative media outlets c0 = 0 and cn = 1 which(almost) always endorse and criticize action a = 1 respectively Then the distributionF of the state ω is a discrete distribution whose support is Ω = ω0 ωnminus1 whereωi = E[θ|θ isin [ci ci+1)] for i isin 0 nminus 1 Consider a monotone partitional signalrepresented by a nondecreasing function ξ [0 1] rarr R For each realization m of thissignal that occurs with a strictly positive probability define

x(m) = min983153ω isin Ω ξ(ω) = m

983154and x(m) = max

983153ω isin Ω ξ(ω) = m

983154

This signal induces a monotone partition of the finite set Ω This partition of Ω canbe implemented by a censorship policy that prohibits a media outlet c isin C if andonly if c isin (x(m) x(m)) for some realization m

CENSORSHIP AS OPTIMAL PERSUASION 31

References

Alonso R and O Camara (2016a) ldquoPersuading Votersrdquo American EconomicReview 106 3590ndash3605

(2016b) ldquoPolitical Disagreement and Information in Electionsrdquo Games andEconomic Behavior 100 390ndash412

Bagnoli M and T Bergstrom (2005) ldquoLog-Concave Probability and Its Ap-plicationsrdquo Economic Theory 26 445ndash469

Baron D P (2006) ldquoPersistent Media Biasrdquo Journal of Public Economics 90(1)1ndash36

Besley T and A Prat (2006) ldquoHandcuffs for the Grabbing Hand MediaCapture and Government Accountabilityrdquo American Economic Review 96(3) 720ndash736

Blackwell D (1953) ldquoEquivalent Comparisons of Experimentsrdquo Annals of Math-ematical Statistics 24 265ndash272

Chan J and W Suen (2008) ldquoA Spatial Theory of News Consumption andElectoral Competitionrdquo Review of Economic Studies 75 699ndash728

Chiang C-F and B Knight (2011) ldquoMedia Bias and Influence Evidence fromNewspaper Endorsementsrdquo Review of Economic Studies 78 795ndash820

Duffie D P Dworczak and H Zhu (2017) ldquoBenchmarks in Search MarketsrdquoJournal of Finance 72 1983ndash2044

Dworczak P (2017) ldquoMechanism Design with Aftermarkets Cutoff Mecha-nismsrdquo mimeo

Dworczak P and G Martini (2019) ldquoThe Simple Economics of Optimal Per-suasionrdquo Journal of Political Economy 127 993ndash2048

Edmond C (2013) ldquoInformation Manipulation Coordination and RegimeChangerdquo Review of Economic Studies 80 1422ndash1458

Egorov G S Guriev and K Sonin (2009) ldquoWhy Resource-poor DictatorsAllow Freer Media A Theory and Evidence from Panel Datardquo American PoliticalScience Review 103 645ndash668

Gehlbach S and K Sonin (2014) ldquoGovernment Control of the Mediardquo Journalof Public Economics 118 163ndash171

Gentzkow M and E Kamenica (2016) ldquoA Rothschild-Stiglitz Approach toBayesian Persuasionrdquo American Economic Review Papers amp Proceedings 106597ndash601

Gentzkow M and J Shapiro (2006) ldquoMedia Bias and Reputationrdquo Journalof Political Economy 114(2) 280ndash316

(2010) ldquoWhat Drives Media Slant Evidence from US Daily NewspapersrdquoEconometrica 78(1) 35ndash71

Ginzburg B (2019) ldquoOptimal Information Censorshiprdquo Journal of EconomicBehavior and Organization 163 377ndash385

Goldstein I and Y Leitner (2018) ldquoStress Tests and Information DisclosurerdquoJournal of Economic Theory 177 34ndash69

32 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Guo Y and E Shmaya (2019) ldquoThe Interval Structure of Optimal DisclosurerdquoEconometrica 87 653ndash675

Johnson J P and D P Myatt (2006) ldquoOn the Simple Economics of Adver-tising Marketing and Product Designrdquo American Economic Review 96 756ndash784

Kamenica E and M Gentzkow (2011) ldquoBayesian Persuasionrdquo American Eco-nomic Review 101 2590ndash2615

Kolotilin A (2015) ldquoExperimental Design to Persuaderdquo Games and EconomicBehavior 90 215ndash226

(2018) ldquoOptimal Information Disclosure A Linear Programming Ap-proachrdquo Theoretical Economics 13 607ndash636

Kolotilin A and H Li (2019) ldquoRelational Communicationrdquo mimeoKolotilin A M Li T Mylovanov and A Zapechelnyuk (2015) ldquoPer-suasion of a Privately Informed Receiverrdquo mimeo

Kolotilin A T Mylovanov A Zapechelnyuk and M Li (2017) ldquoPer-suasion of a Privately Informed Receiverrdquo Econometrica 85 1949ndash1964

Kolotilin A and A Wolitzky (2019) ldquoAssortative Information Disclosurerdquomimeo

Lewis T R and D Sappington (1994) ldquoSupplying Information to FacilitatePrice Discriminationrdquo International Economic Review 35 309ndash327

Lorentzen P (2014) ldquoChinarsquos Strategic Censorshiprdquo American Journal of Polit-ical Science 58 402ndash414

Mullainathan S and A Shleifer (2005) ldquoThe Market for Newsrdquo AmericanEconomic Review 95(4) 1031ndash1053

Orlov D P Zryumov and A Skrzypach (2018) ldquoDesign of Macro-prudentialStress Testsrdquo mimeo

Ostrovsky M and M Schwarz (2010) ldquoInformation Disclosure and Unravelingin Matching Marketsrdquo American Economic Journal Microeconomics 2 34ndash63

Prat A and D Stromberg (2013) ldquoThe Political Economy of Mass MediardquoAdvances in Economics and Econometrics 2 135ndash187

Quah J and B Strulovici (2012) ldquoAggregating the Single Crossing PropertyrdquoEconometrica 80 2333ndash2348

Rayo L and I Segal (2010) ldquoOptimal Information Disclosurerdquo Journal ofPolitical Economy 118 949ndash987

Romanyuk G and A Smolin (2019) ldquoCream Skimming and Information Designin Matching Marketsrdquo American Economic Journal Microeconomics 11 250ndash276

Suen W (2004) ldquoThe Self-Perpetuation of Biased Beliefsrdquo Economic Journal 114377ndash396

Yamashita T (2018) ldquoOptimal Public Information Disclosure by Mechanism De-signerrdquo mimeo

Zapechelnyuk A (2019) ldquoOptimal Quality Certificationrdquo American EconomicReview Insights forthcoming

6 KOLOTILIN MYLOVANOV ZAPECHELNYUK

are continuous if F has a density f on [0 1] We say that the state ω and distributionF are discrete if the support of F is a finite set

A particularly tractable and intuitive special case is where the senderrsquos utility is aweighted sum of the receiverrsquos utility and action We call this the canonical caseFormally in the canonical case the senderrsquos utility when the proposal is accepted is

v (ω r) = 1 + ρ(ω minus r) ρ isin R (A1)

That is the sender is biased towards a = 1 but also puts a weight ρ on the receiverrsquosutility In particular if the weight ρ is large then the sender and receiverrsquos inter-ests are aligned whereas if the weight is zero then the sender cares only about thereceiverrsquos action

The receiver privately knows his type but he does not observe the state The sendercan influence the action taken by the receiver a = 1 or a = 0 by releasing a signalthat reveals information about the state A signal is a random variable s isin [0 1]that is independent of r but is possibly correlated with ω For example s is fullyinformative if it is perfectly correlated with ω and s is completely uninformative ifit is independent of ω

The timing is as follows First the sender publicly chooses a signal s Then realiza-tions of ω r and s are drawn Finally the receiver observes the realizations of histype r and the signal s and then chooses between a = 0 and a = 1

We consider two scenarios

A The sender can choose any signal

B The sender can choose any monotone partitional signal

Formally a signal s is a monotone partitional signal if there exists a nondecreasingfunction ξ(ω) such that s = ξ(ω) for each ω Every such signal induces a partitionof the state space [0 1] into intervals and singletons and the receiver observes thepartition element that contains the state

Under Scenario A we are interested in an optimal signal that maximizes the senderrsquosexpected utility among all signals This is a standard persuasion problem

Under Scenario B we are interested in an optimal monotone partition that maximizesthe senderrsquos expected utility among all monotone partitional signals This scenarioincorporates constraints that information designers often face in practice For ex-ample a non-monotone grading policy that gives better grades to worse performingstudents will be perceived as unfair and will be manipulated by strategic students

22 Upper Censorship A subset of signals called upper censorship will play aspecial role in this paper

CENSORSHIP AS OPTIMAL PERSUASION 7

An upper-censorship signal reveals states below a specified cutoff and pools statesabove this cutoff When the state is equal to the cutoff it can be revealed or pooledIf the state is continuous it does not matter whether the cutoff state is revealed orpooled because this is a zero probability event However if the state is discretewe will distinguish between deterministic and stochastic upper-censorship signalsdepending on what happens at the cutoff state

A lower-censorship signal is defined symmetrically it pools states below a specifiedcutoff and reveals states above this cutoff Because the case of lower censorshipis symmetric (see Remark 1 below) for clarity of exposition we focus on uppercensorship throughout the paper

A signal s is stochastic upper censorship if there exists a cutoff pair (ωlowast qlowast) consistingof a cutoff state ωlowast isin [0 1] and probability qlowast isin [0 1] such that states below ωlowast arerevealed states above ωlowast are pooled and the state ω = ωlowast is revealed with probabilityqlowast and pooled with probability 1minus qlowast For example s can be expressed as

s =

983099983105983103

983105983101

ω with probability one if ω lt ωlowast

ωlowast and mlowast with probabilities qlowast and 1minus qlowast if ω = ωlowast

mlowast with probability one if ω gt ωlowast

where

mlowast =

983125(ωlowast1]

ωdF (ω) + ωlowast(1minus qlowast) Pr[ω = ωlowast]983125(ωlowast1]

dF (ω) + (1minus qlowast) Pr[ω = ωlowast](1)

is the posterior expected state induced by the pooling signal Notice that mlowast is afunction of ωlowast and qlowast

A stochastic upper-censorship signal s with a cutoff pair (ωlowast qlowast) is deterministic uppercensorship if and only if qlowast isin 0 1 in which case s can be expressed as a monotonepartition of the state space [0 1] For example if qlowast = 0 then the signal s fullyreveals states ω lt ωlowast and pools states ω ge ωlowast so s can be expressed as

s =

983083ω if ω lt ωlowast

mlowast if ω ge ωlowast

wheremlowast = E[ω|ω ge ωlowast]

is the expected state conditional on being at least ωlowast Note that the fully informativesignal and the completely uninformative signal are deterministic upper-censorshipsignals with (ωlowast qlowast) = (1 1) and (ωlowast qlowast) = (0 0) respectively

23 Benchmark To illustrate the difference between the cases of a continuous anddiscrete state as well as between Scenarios A and B we solve a benchmark examplewhere there is no uncertainty about the receiverrsquos type

Let the state ω have the expected value E[ω] = 12 and let the receiverrsquos type r be

known to the sender and satisfy r isin98304312 1983044 In addition suppose that the sender wishes

8 KOLOTILIN MYLOVANOV ZAPECHELNYUK

to maximize the probability that the receiver accepts the proposal so v(ω r) = 1 forall ω and all r

Observe that if the sender reveals no information about ω the receiver will evaluate ωby its expected value of 1

2 thus having utility 1

2minus r lt 0 from accepting the proposal

So revealing no information makes the receiver reject the proposal with certaintyThe sender can do better by fully revealing ω in which case the proposal is acceptedwith probability Pr[ω ge r]

However the sender can do even better by pooling states above r with states below rwhile keeping the expected state induced by the pooling signal at least r Thus thereceiver who is unable to distinguish between the states in the pool will accept theproposal at all of these states

To illustrate the case of the continuous state suppose that ω is uniformly distributedon [0 1] The largest pool that maintains the posterior expectation at least r is theinterval [2r minus 1 1] Indeed once the receiver learns that ω isin [2r minus 1 1] given theuniform prior of ω the posterior expected state is r So the proposal is accepted whenω ge 2rminus 1 and rejected when ω lt 2rminus 19 Thus the deterministic upper-censorshipsignal with cutoff ωlowast = 2r minus 1 is optimal The senderrsquos expected utility is equal to

Pr[proposal is accepted] = Pr[ω ge 2r minus 1] = 2(1minus r)

which is twice as high as that from the fully informative signal

Pr[proposal is accepted] = Pr[ω ge r] = 1minus r

When ω is a discrete random variable an optimal way to reveal information aboutω takes the form of stochastic upper censorship For illustration suppose that ω canonly be 0 or 1 equally likely Now simply pooling states is not helpful for the senderPooling 0 and 1 yields the posterior expected state of 1

2 which is smaller than the

type r resulting in a rejection of the proposal Including any other states in thepool makes no difference as these states never occur Thus an optimal monotonepartition is the fully informative signal and the probability that the receiver acceptsthe proposal is simply equal to the probability that ω = 1

Pr[proposal is accepted] = Pr[ω ge r] = Pr[ω = 1] =1

2

The sender can do strictly better by partial (stochastic) pooling If ω = 1 let thesignal realization be ldquoacceptrdquo with certainty If ω = 0 let the signal realization beldquorejectrdquo with some probability q and ldquoacceptrdquo with probability 1 minus q So when thereceiver observes ldquoacceptrdquo the posterior expected state is

E[ω|ldquoacceptrdquo] = Pr[ω = 1]

Pr[ω = 1] + Pr[ω = 0] middot (1minus q)=

12

12 + 12 middot (1minus q)=

1

2minus q

9Whether states strictly below ωlowast are revealed or pooled among themselves does not matterbecause they are all strictly below r and thus induce the receiver to take action a = 0

CENSORSHIP AS OPTIMAL PERSUASION 9

The sender now wishes to find the lowest q subject to 1(2 minus q) ge r which yieldsqlowast = 2 minus 1

r Thus stochastic upper censorship with ωlowast = 0 and qlowast = 2 minus 1

ris

optimal The probability that the receiver accepts the proposal is exactly equal tothe probability that the signal realization is ldquoacceptrdquo

Pr[proposal is accepted] = Pr[ω = 1] + Pr[ω = 0] middot (1minus qlowast) =1

2+

1

2middot 1minus r

r=

1

2r

We now summarize the insights obtained in this example If the state is continuousthen an optimal signal is deterministic upper censorship If the state is discrete thenan optimal monotone partition is deterministic upper censorship and an optimalsignal is stochastic upper censorship10 In the remainder of the paper we will showthat these insights extend to the case where the receiverrsquos type is uncertain

3 Continuous State

In this section we provide necessary and sufficient conditions for the optimality ofupper censorship when the state is continuous

Let m be the expected state conditional on observing a realization of a signal s Sincethe sender and receiverrsquos utilities are linear in ω they depend on the information aboutω revealed by signal s only through the expected state m In particular the receiverchooses a = 1 if and only if r le m

Let V (m) denote the indirect utility of the sender conditional on m

V (m) =

983133

rlem

E [v(ω r)|m] g(r)dr =

983133 m

0

v(m r)g(r)dr m isin [0 1] (2)

where E [v(ω r)|m] = v(m r) by the linearity of v in ω

A function V is said to be S-shaped if it is convex below some threshold and concaveabove that threshold or equivalently if V primeprime is single-crossing from above

there exists τ such that V primeprime(m) ge (le) 0 for all m lt (gt) τ

We now provide the criterion for the optimality of upper censorship Recall thatmlowast = E[ω|ω ge ωlowast]

Theorem 1 Let V be S-shaped Then and only then for all continuous F anoptimal signal is deterministic upper censorship whose cutoff state ωlowast isin [0 τ ] satisfies

V (mlowast) + V prime(mlowast)(mminusmlowast) ge (le)V (m) for all m ge (lt)ωlowast (3)

The criterion for the optimality of lower censorship is analogous

10See for example Kolotilin (2015) for a detailed treatment of the case where there is no uncer-tainty about the receiverrsquos type

10 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Remark 1 Let minusV be S-shaped Then and only then for all continuous F anoptimal signal is deterministic lower censorship whose cutoff state ωlowast isin [τ 1] satisfies

V (mlowast) + V prime(mlowast)(mminusmlowast) ge (le)V (m) for all m le (gt)ωlowast

Theorem 1 states that optimal persuasion takes a simple form of upper censorshipwhen V is S-shaped11 Moreover the theorem establishes that this result is tight inthe sense that if V is not S-shaped then there exists a distribution F of the statesuch that no upper-censorship signal is optimal

Note that every upper-censorship signal is a monotone partition Therefore Theorem1 applies in both Scenario A where the sender is free to choose any signal andScenario B where the sender is constrained to choose a monotone partition

The intuition behind Theorem 1 is as follows Observe that when no informationabout the state is revealed the receiverrsquos best-response action does not change withthe state The more information is revealed the more variable the receiverrsquos behaviorwill be in response to this information Consider an interval where V is concave Thesender would prefer not to reveal any information when the state is in this intervalsince a certain outcome is preferred to any lottery with the same expected stateConversely consider an interval where V is convex The sender would prefer to fullyreveal the state in this interval since now lotteries are preferred If V is S-shapedthat is it is convex below some threshold and concave above that threshold theinduced optimal persuasion takes the form of upper censorship

When V is S-shaped the senderrsquos optimization problem is reduced to finding anoptimal censorship cutoff ωlowast If the realized state ω is below the cutoff then it isrevealed to the receiver so the expected utility of the sender is V (ω) If the realizedstate ω is above the cutoff then the posterior expected state is mlowast = E[ω|ω ge ωlowast] sothe expected utility of the sender conditional on ω ge ωlowast is V (mlowast) The sender thusneeds to solve the problem

maxωlowastisin[01]

983133 ωlowast

0

V (ω)f(ω)dω +

983133 1

ωlowastV (mlowast)f(ω)dω (4)

The expression in (3) represents the first-order condition to this problem and capturesthree possible cases the boundary solutions ωlowast = 0 and ωlowast = 1 if the expression in(3) has the same sign for all ω and an interior solution ωlowast such that

V (mlowast) + V prime(mlowast)(ωlowast minusmlowast) = V (ωlowast) (5)

This first-order condition (5) is illustrated by Figure 1 The solid line is V (m) andthe dashed line is V (mlowast) + V prime(mlowast)(mminusmlowast) which is tangent to V at mlowast

11For a fixed distribution F Proposition 3 in Kolotilin (2018) implies that upper censorship withcutoff ωlowast is optimal if and only if (3) holds and V (m) is convex on [0ωlowast] This result can be used toshow that upper censorship is optimal when V is S-shaped For completeness we include a simpleself-contained proof inspired by the proof of Theorem 1 in Dworczak and Martini (2019)

CENSORSHIP AS OPTIMAL PERSUASION 11

1ωlowast

V

revealing pooling

mlowast0

Figure 1 Optimal upper censorship with cutoff ωlowast

The solution of the senderrsquos problem (4) is particularly simple if V is either globallyconvex or globally concave In this case the first-order condition in (3) has a constantsign so either ωlowast = 0 or ωlowast = 1 must be optimal If V is convex then ωlowast = 1 isoptimal which corresponds to the fully informative signal Similarly if V is concavethen ωlowast = 0 is optimal which corresponds to the completely uninformative signalThis is summarized in the following corollary

Corollary 1 An optimal signal is

(i) fully informative for all F if and only if V is convex

(ii) completely uninformative for all F if and only if V is concave

31 Canonical Case We now consider the canonical case where the senderrsquos utilitysatisfies assumption (A1)

The density g of receiver types r is said to be log-concave if ln g(r) is concave in r12

Note that ln g(r) is well defined in the canonical case

In this case the shape of the senderrsquos indirect utility V defined by (9) is connectedto the shape of the density g of receiver types as follows

Lemma 1 In the canonical case V is S-shaped for all ρ if and only if g is log-concave

12Table 1 in Bagnoli and Bergstrom (2005) reports distributions with log-concave densities

12 KOLOTILIN MYLOVANOV ZAPECHELNYUK

That is if the density of receiver types g is log-concave then the senderrsquos indirectutility V is S-shaped Moreover this result is tight in the sense that if g is notlog-concave then there exists ρ such that V is not S-shaped

By Theorem 1 and Lemma 1 we obtain a criterion for the optimality of upper cen-sorship with the condition on the primitive of the model the density g

Theorem 2 Consider the canonical case and let the density g be log-concave Thenand only then an optimal signal is deterministic upper censorship for all F and all ρ

The symmetric statement is also true an optimal signal is lower censorship for all Fand all ρ if and only if minusg is log-concave Consequently if both g and minusg are log-concave (that is g is exponential) then there exists an optimal signal which is bothupper censorship and lower censorship There are only two signals with this propertyfully informative and completely uninformative This allows us to obtain conditionson the distribution of receiver types under which the optimal signal is polarizedbetween fully informative and completely uninformative signals as for example inLewis and Sappington (1994) and Johnson and Myatt (2006)13

Corollary 2 In the canonical case an optimal signal is either fully informative orcompletely uninformative for all F and all ρ if and only if there exist λ isin R and c gt 0such that g(r) = ceminusλr for r isin [0 1]

If g(r) = ceminusλr then the fully informative signal is optimal whenever ρ ge λ and thecompletely uninformative signal is optimal whenever ρ le λ (and any signal is optimalwhen ρ = λ) In particular if ρ = 0 the optimal signal is fully determined by thesign of λ which is in turn determined by whether the mean of r is smaller or greaterthan 1

2

32 Comparative Statics Theorem 2 allows for a sharp comparative statics anal-ysis on the informativeness of the optimal signal

We compare signals by their Blackwell informativeness (Blackwell 1953) To compareupper-censorship signals s1 and s2 we only need to compare their cutoffs ωlowast

1 and ωlowast2

Signal s1 is more informative than signal s2 if ωlowast1 ge ωlowast

2 Indeed state ω isin [0ωlowast2) is

fully revealed by both s1 and s2 and state ω isin [ωlowast2 1] is partially revealed by s1 and

not revealed at all by s2 so s1 is more informative than s2

13In the case of log-concave density g the sets of upper- and lower-censorship signals are totallyordered by the rotation order of Johnson and Myatt (2006) If we restrict attention to the set oflower-censorship signals (which are generally suboptimal under log-concave g) then the rotationpoint is decreasing with respect to the rotation order By Lemma 1 of Johnson and Myatt (2006)the senderrsquos expected utility is quasiconvex and one of the extreme lower-censorship signals (thefully informative or completely uninformative signal) is optimal for the sender But if we restrictattention to the set of upper-censorship signals (which are optimal under log-concave g) then therotation point is not decreasing with respect to the rotation order Therefore Lemma 1 of Johnsonand Myatt (2006) does not apply and an interior upper-censorship signal is generally optimal forthe sender

CENSORSHIP AS OPTIMAL PERSUASION 13

For the purpose of comparison we extend the definition of distribution function Gto the real line and assume that its density g is log-concave Consider a family ofdistributions Gtσ of receiver types

Gtσ (r) = G

983061τ minus t+

r minus τ

σ

983062

where τ is the maximum point of g and t isin R and σ gt 0 are parameters Let gtσ (r)denote the corresponding density Note that gtσ (r) is log-concave for all t isin R andall σ gt 0

Because gtσ is log-concave on [0 1] deterministic upper censorship is optimal byTheorem 2 Let ωlowast (ρ t σ) isin [0 1] be the optimal censorship cutoff as given by (3)

The shift parameter t shifts the distribution along the horizontal axis and the stretchparameter σ stretches the distribution horizontally and symmetrically around themode τ so the limit of σ rarr 0 leads to the unit mass on τ We now show that thesender optimally discloses more information when

(i) the senderrsquos preferences are better aligned with the receiverrsquos preferences (thealignment parameter ρ is greater)

(ii) the receiver is more reluctant to accept the proposal (the shift parameter t isgreater)

(iii) the sender is more uncertain about the receiverrsquos type (the stretch parameter σis greater) provided ρ ge 014

Theorem 3 In the canonical case a cutoff state of an optimal signal satisfies

(i) ωlowast (ρ t σ) is increasing in ρ

(ii) ωlowast (ρ t σ) is increasing in t

(iii) ωlowast (ρ t σ) is increasing in σ if ρ ge 0

The intuition for part (i) is that for a higher ρ the sender puts more weight onthe receiverrsquos utility so she optimally endows the receiver with a higher utility byproviding more information

The intuition for part (ii) is that for a higher t each type of the receiver has a greatercost of accepting the proposal so to persuade the same type of the receiver the senderneeds to increase E[ω|ω ge ωlowast] by expanding the full disclosure interval [0ωlowast]

The intuition for part (iii) is that for a higher σ receiver types are more spread out soto persuade the same mass of types the sender again needs to increase E[ω|ω ge ωlowast]

14The case ρ ge 0 is the practically relevant case where the senderrsquos utility is a weighted averageof the receiverrsquos action and utility 1

1+ρv(ω r) =1

1+ρ + ρ1+ρ (ω minus r)

14 KOLOTILIN MYLOVANOV ZAPECHELNYUK

4 Discrete State

In this section we assume that the state is a discrete random variable

Theorem 1prime Let V be S-shaped Then and only then for all discrete F

(A) an optimal signal is stochastic upper censorship whose cutoff pair (ωlowast qlowast) satisfiescondition (3)

(B) an optimal monotone partition is deterministic upper censorship whose cutoff pair(ωlowast

d qlowastd) satisfies ωlowast

d = ωlowast and qlowastd isin 0 1

In the canonical case we obtain the result analogous to Theorem 2

Theorem 2prime Consider the canonical case and let the density g be log-concave Thenand only then for all discrete F and all ρ

(A) an optimal signal is stochastic upper censorship

(B) an optimal monotone partition is deterministic upper censorship

Let us discuss how the comparative statics result (Theorem 3) changes In Section 32we argue that upper-censorship signals can be ordered by their censorship cutoffs Butwhen the state is discrete a censorship cutoff is described by a pair (ωlowast qlowast) where ωlowast

is a cutoff state and qlowast is the probability of revealing this cutoff state when it realizes

Consider two stochastic upper-censorship signals s1 and s2 with cutoff pairs (ωlowast1 q

lowast1)

and (ωlowast2 q

lowast2) Denote (ωlowast

1 qlowast1) ≽ (ωlowast

2 qlowast2) if ω

lowast1 gt ωlowast

2 or ωlowast1 = ωlowast

2 and qlowast1 ge qlowast2 Observethat s1 is more (Blackwell) informative than s2 if and only if (ωlowast

1 qlowast1) ≽ (ωlowast

2 qlowast2) This

comparison also applies to deterministic upper-censorship signals with the constraintthat qlowast1 and qlowast2 are in 0 1

With the above order on cutoff pairs Theorem 3 extend as follows

Theorem 3prime Consider the canonical case

(A) A cutoff pair of an optimal signal satisfies

(i) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in ρ

(ii) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in t

(iii) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in σ if ρ ge 0

(B) A cutoff pair of an optimal monotone partition satisfies15

(i) (ωlowastd (ρ t σ) q

lowastd (ρ t σ)) is increasing in ρ

(ii) (ωlowastd (ρ t σ) q

lowastd (ρ t σ)) is increasing in t

15Part (iii) also holds for an optimal monotone partition when for example the discrete supportof the state ω is a regular grid with sufficiently many points However it is possible to construct acounterexample where part (iii) fails for an optimal monotone partition

CENSORSHIP AS OPTIMAL PERSUASION 15

That is the sender optimally discloses more information when she is less biasedrelative to the receiver (the alignment parameter ρ is greater) when the receiver ismore reluctant to accept the proposal (the shift parameter t is greater) and underScenario A when the sender is more uncertain about the receiverrsquos type (the stretchparameter σ is greater)

Remark 2 Theorem 1prime and Theorem 2prime show that upper censorship emerges as anoptimal persuasion strategy under the same conditions (V is S-shaped in Theorem1prime and g is log-concave in Theorem 2prime) regardless of whether the sender can choosean arbitrary signal or is constrained to choose a monotone partition It may betempting to conjecture that if the sender was allowed to choose an arbitrary partition(not necessarily monotone) the same result would hold This however is not true

For illustration suppose that the sender wishes to maximize the probability of a = 1and the receiver has type r = 1

2with certainty16 Let the state ω take values in

0 14 1 with probabilities (1

6 23 16) so the prior mean is 1

3

An optimal signal is stochastic upper censorship with cutoff pair (ωlowast qlowast) = (14 12)

which sends the pooling message with certainty at ω = 1 and with probability 1minusqlowast =12at ω = 1

4 and reveals the state otherwise The pooling message induces the posterior

mean m = 12 leading to a = 1 all other messages induce posterior means below 1

2

leading to a = 0 Under the optimal signal action a = 1 is chosen with probability23middot 12+ 1

6= 1

2

An optimal monotone partition is deterministic upper censorship that fully revealsthe state where only ω = 1 leads to a = 1 Under this partition action a = 1 ischosen with probability 1

6

An optimal partition pools the extreme states ω = 0 and ω = 1 and reveals theintermediate state ω = 1

4 The pooling message induces the posterior mean m = 1

2

leading to a = 1 This optimal partition is nonmonotone Under this partitionaction a = 1 is chosen with probability 1

6+ 1

6= 1

3 which is strictly better than the

probability of 16obtained under the optimal monotone partition

5 Application to Media Censorship

In this section we apply our results to the problem of media censorship by thegovernment In the modern world people obtain information about the governmentrsquosstate through various media sources such as television newspapers and internetblogs Without the media most people would not know what policies and reformsthe government pursues and how effective they are Media outlets have differentpositions on the political spectrum and differ substantially in how they select andpresent facts to cover the same news People choose their sources of information based

16In this case V (m) = 1 if m ge 12 and V (m) = 0 if m lt 1

2 Observe that the graph of V is

S-shaped and it can be approximated by a continuously differentiable S-shaped function

16 KOLOTILIN MYLOVANOV ZAPECHELNYUK

on their political ideology and socioeconomic status This information is valuable forsignificant individual decisions on migration investment and voting to name a fewIndividuals may not fully internalize externalities that their decisions impose on thesociety Likewise the government may not have the societyrsquos best interest at heartTo further its goals the government then wishes to influence individual decisions bymanipulating the information through media In autocracies and countries with weakchecks and balances the government has power to censor the media content

The governmentrsquos problem of media censorship can be represented as the persua-sion problem in Section 2 We apply our results to provide conditions for the op-timality of upper-censorship policies that censor all media outlets except the mostpro-government ones Furthermore we interpret our comparative statics results

51 Setup There is a continuum of heterogeneous citizens indexed by r isin [0 1]distributed with G Each citizen chooses between a = 0 and a = 1 The utility of acitizen of type r is given by

u(θ r ar a) = (θ minus r)ar + κ(θ r a)

where ar isin 0 1 denotes the citizenrsquos own action a =983125ardG(r) denotes the aggre-

gate action in the society θ isin [0 1] captures an unobserved benefit from action 1 ascompared to action 0 and κ captures the impact of the aggregate action a on thecitizenrsquos utility The term (θ minus r)ar is a private surplus of a citizen of type r Theterm κ(θ r a) is an externality because for a citizen of type r it is optimal to ignorethis term and choose ar = 1 if and only if θ ge r

There is a government which is concerned with a weighted average of the social utilityand the governmentrsquos intrinsic benefit from the aggregate action For a given θ thegovernmentrsquos utility is given by

983133 1

0

ν(θ r ar a)dG(r) + δγ(θ a)

The term ν captures a citizenrsquos utility from the governmentrsquos perspective We allow νto be different from u to reflect paternalistic or other concerns The term γ capturesthe governmentrsquos intrinsic benefit from the aggregate action The parameter δ ge 0captures the weight of the aggregate action in the governmentrsquos utility

Let T be a distribution of the random variable θ We assume that distributions G andT are independent and admit continuously differentiable and strictly positive densi-ties We also assume that κ ν and γ are linear in θ and continuously differentiablein r and a Furthermore to simplify interpretations we assume that ν and γ arenon-decreasing in θ and a so that for the government a high θ is a good news anda higher aggregate action is preferable

Citizens obtain information about the unobservable benefit θ through media outletsEach media outlet is identified by its editorial policy c isin [0 1] and it endorses action

CENSORSHIP AS OPTIMAL PERSUASION 17

a = 1 if θ ge c and criticizes it if θ lt c17 A set C of media outlets is a finite subsetof [0 1]18

The governmentrsquos censorship policy is a set of the media outlets X sub C that arepermitted to broadcast so the rest of the media outlets are censored

The timing is as follows First the government chooses a set X sub C of permittedmedia outlets Second state θ is realized and each permitted media outlet endorsesor criticizes action a = 1 according to its editorial policy Finally each citizen observesmessages from all permitted media outlets updates his beliefs about θ and choosesan action

52 Discussion We now discuss interpretations of the key components of the mediacensorship application As in Gehlbach and Sonin (2014) there can be various inter-pretations of the citizenrsquos action a = 1 such as voting for the government supportinga governmentrsquos policy or taking an individual decision that benefits the governmentA citizenrsquos type r can be interpreted as his ideological position or preference param-eter A citizen who is more supportive of the government has a smaller r

A media outlet with a higher editorial policy c can be interpreted as being less loyalto the government because it criticizes the government on a larger set of states Aneditorial policy c isin C can therefore represent a slant or political bias of the outletagainst the government and can be empirically measured as the frequency with whichthe outlet uses anti-government language Gentzkow and Shapiro (2010) constructsuch a slant index for US newspapers Empirical findings of their paper suggestthat the editorial policies of media outlets are driven by reader preferences justifyingour assumption of the existence of a large variety of editorial policies19 As in Suen(2004) Chan and Suen (2008) and Chiang and Knight (2011) the assumption ofthe binary media reports that communicate only whether the state θ is above somestandard c can be justified by a cursory readerrsquos preference for simple messages suchas positive or negative opinions and yes or no recommendations

The governmentrsquos censorship of media outlets can take various forms For examplethe government can ban access to internet sites withdraw licenses disrupt financingconfiscate print materials and equipment discredit media providers and even arresteditors and journalists using broadly formulated legislation on combating extremismIn some countries the government can exercise direct control over media editorialpolicies either through state ownership or administrative pressure

17The tie-breaking in the event of θ = c is unimportant as θ is a continuous random variable18We discuss the case of a continuum of media outlets in Section 5519Theoretical literature has explored the determinants of media slant of an outlet driven by its

citizens (Mullainathan and Shleifer 2005 Gentzkow and Shapiro 2006 and Chan and Suen 2008)and its owners (Baron 2006 and Besley and Prat 2006)

18 KOLOTILIN MYLOVANOV ZAPECHELNYUK

53 Formulating Media Censorship as Persuasion We now show that the me-dia censorship problem can be formulated as a linear persuasion problem in whichthe government is a sender and a representative citizen is a receiver

In our model the citizensrsquo and the governmentrsquos utilities are linear in θ There-fore given any information from the media outlets the utilities depend only on theposterior mean of θ

Consider an arbitrary posterior mean of θ denoted by m isin [0 1] Each citizen of typer chooses ar = 1 if and only if r le m The externality term κ plays no role in thisdecision Therefore the aggregate action a is simply the mass of all citizens whosetypes do not exceed m so a = G(m)

Next using the citizensrsquo optimal behavior we derive the governmentrsquos expected utilityconditional on the posterior mean m

V (m) = E983063983133 1

0

ν(θ r ar G(m))dG(r) + δγ(θ G(m))

983055983055983055983055m983064

=

983133 1

0

ν(m r1rlem G(m))dG(r) + δγ(mG(m)) (6)

This is the governmentrsquos indirect utility As in Section 2 the government now choosesa signal which is informative about the state to maximize its expected utility How-ever in contrast to Section 2 here the government is restricted to signals that areimplementable by a subset of a given set of media outlets

Next we show that with an appropriate definition of the state the restriction tosignals implementable by a subset of a given set of media outlets can be formulatedas a restriction to monotone partitional signals as in Scenario B in Section 2 Thuswe will be able to use our results in Section 4 to find the optimal censorship policies

Recall that the information about θ is only available through the media outlets Let ωdenote a random variable equal to the conditional expectation of θ given the messagesof all media outlets in C Note that as C is finite ω is a discrete random variablewith values in [0 1] Let F denote the distribution of ω From now on we treat therandom variable ω as the state and denote by s a signal as defined in Section 2

Let HM denote the set of all distributions of the posterior mean state induced byall monotone partitional signals and let HC denote the set of all distributions ofthe posterior mean state induced by all finite subsets of the set C of media outletsObserve that any set of media outlets induces a monotone partitional signal thatis HC sub HM We now show that every outcome implementable by an arbitrarymonotone partition in HM is also implementable by a subset of media outlets

Lemma 2 HC = HM

For illustration suppose that there is only one media outlet with the editorial policyc isin (0 1) There are two observable events θ ge c and θ lt c In each realized event

CENSORSHIP AS OPTIMAL PERSUASION 19

the citizens compute the posterior expectation of θ which we will call ω So in thisexample the distribution F of ω has two mass points one for each of the two events

Given these two mass points every monotone partition of [0 1] induces one of twopossible distributions of the posterior mean state (i) the mass points revealed and (ii)the mass points pooled Observe however that these distributions are implementableby permitting and censoring the media outlet respectively

A censorship policy is upper censorship if it censors all sufficiently disloyal mediaoutlets Specifically there exists a cutoff clowast isin C and an indicator qlowast isin 0 1 suchthat all media outlets whose editorial policies are below clowast are permitted all mediaoutlets whose editorial policies are above clowast are censored and clowast is permitted if andonly if qlowast = 1 That is X = c isin C c le clowast if qlowast = 1 and X = c isin C c lt clowastif qlowast = 0 Note that the full censorship policy clowast = 0 and qlowast = 0 (where all mediaoutlets are censored) and the free media policy clowast = 1 and qlowast = 1 (where all mediaoutlets are permitted) are the two extreme upper-censorship policies

As shown above the media censorship problem is equivalent to the linear persuasionproblem in which the sender is restricted to monotone partitional signals and thesenderrsquos indirect utility V is given by (6) We thus apply Theorem 1prime(B) to obtainthe following result

Theorem 1primeprime If V is S-shaped then an optimal censorship policy is upper censorship

To illustrate Theorem 1primeprime suppose the government is interested only in the aggregateaction so that ν = 0 and γ depends only on the aggregate action a so V (m) =γ(G(m)) Thus an upper-censorship policy is optimal if the composition functionγ(G(middot)) is S-shaped For example this condition holds if the government is interestedin reaching a certain approval threshold (eg a simple majority) so that γ is a stepfunction This condition also holds if γ is S-shaped and G is uniform or S-shapedwith the same inflection point as γ20

54 Canonical Case We now impose more structure on the utilities to obtain asharper result for the optimality of upper censorship and to perform a comparativestatics analysis Analogously to assumption (A1) we assume that the governmentrsquosutility is a weighted average of the citizensrsquo utility and their aggregate action

983133 1

0

u(θ r ar a)dG(r) + δa

where u(θ r ar a) = (θ minus r)ar + ζ(r)a

(A2)

for some continuously differentiable function ζ Define

β =

983133 1

0

ζ(r)dG(r) + δ

20This condition holds in many special cases where γ and G are S-shaped even when γ and Ghave different inflection points

20 KOLOTILIN MYLOVANOV ZAPECHELNYUK

The term983125 1

0ζ(r)dG(r) is the governmentrsquos expected bias towards the citizensrsquo action

a = 1 due to the citizensrsquo externality and the term δ is the governmentrsquos intrinsicbias towards a greater aggregate action a Thus we interpret β as the governmentrsquosaggregate bias This decomposition of the bias allows for different interpretations whythe government is biased So β can be high because the government is too self-serving(high δ) or because the government is benevolent (low δ) but wishes to internalizestrong positive externalities of the citizens (high ζ(r)) To ease interpretations weassume that β gt 021

Under Assumption (A2) the governmentrsquos indirect utility V given by (6) becomes

V (m) =

983133 m

0

(mminus r)dG(r) + βa =

983133 m

0

v(m r)dG(r)

where

v(m r) = β

9830611 +

1

β(mminus r)

983062

This is the same as v given by Assumption (A1) with ρ = βminus1 up to rescaling by apositive constant Consequently we can apply Theorem 2prime(B) to obtain the followingresult

Theorem 2primeprime Let (A2) hold If the density g of citizens types is log-concave then anoptimal censorship policy is upper censorship

We now apply the comparative statics analysis presented in Section 32 Note thatthe upper-censorship policies are ordered according to the amount of informationtransmitted to the citizens in the sense of Blackwell (1953) A greater censorshipthreshold clowast means that more media outlets are permitted With this order in mindwe apply Theorem 3prime(B) to make a comparative statics analysis on the amount ofinformation that is optimally disclosed by the government

First the censorship cutoff is increasing in the alignment parameter ρ Recall thatthis is a reciprocal of the governmentrsquos bias ρ = βminus1 This means that the governmentoptimally discloses more information (the censorship cutoff is greater) when it is lessbiased (β is smaller) Intuitively as β decreases the government puts more weighton the citizensrsquo utility so it optimally endows the population with a higher utility bycensoring fewer media outlets and disclosing more information

Second the censorship cutoff is increasing in the magnitude t of the horizontal shiftof the density g A greater t corresponds to a greater opportunity cost of actiona = 1 for each citizen This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more difficultto persuade to take action a = 1 (parameter t is greater) Informally speaking topersuade the same type of the citizen the government needs to increase the posterior

21If β lt 0 then swapping the roles of a = 0 and a = 1 reverses the sign of the bias If β = 0then the governmentrsquos utility and the citizensrsquo private interests coincide so it is trivially optimal topermit all media outlets

CENSORSHIP AS OPTIMAL PERSUASION 21

mean But the expected posterior mean must be equal to the prior mean so it is notpossible to increase all posterior means Due to the log-concave shape of the densityof the citizensrsquo types this tradeoff is resolved by increasing the posterior mean of thepooling interval E[ω|ω ge clowast] This is done by shrinking the pooling interval [clowast 1]that is increasing the censorship cutoff clowast

Third when the number of media outlets is sufficiently large the censorship cutoffis increasing in the stretch factor σ of the density g around its peak τ (see Footnote15) A greater σ corresponds to a more dispersed distribution of the opportunitycost in the population This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more diverse inhow difficult they are to persuade Intuitively when the types are more spread outpersuading the same mass of types requires a greater posterior mean which leads toa higher censorship cutoff similarly to the comparative statics in t discussed above

55 Extensions and Open Questions Let us now consider a few extensions ofour model of media censorship

In our model the set of media outlets is exogenous and the governmentrsquos only in-strument is censorship We now consider three alternative ways of expanding thegovernmentrsquos instruments of influence

First suppose that the government is able not only to censor existing media outletsbut also to introduce new media outlets with chosen editorial policies This is equiv-alent to our censorship model where all media outlets in [0 1] are initially availableand the government can censor any subset of them When all media outlets in [0 1]are permitted the revealed information about θ (which we call ω) is θ itself Thusω = θ is a continuous random variable with distribution F = T So we can nowapply our results for the continuous state from Section 3 instead of those for the dis-crete state in Section 4 reaching the same conclusion about the optimality of uppercensorship

Second suppose that the government can garble information available from mediaoutlets That is the government is not restricted to monotone partitions it cancreate arbitrary signals about state ω for the citizens to observe This becomes ageneral persuasion problem and our Theorems 1prime(A) 2prime(A) and 3prime(A) apply

Third suppose that the government is able to restrict not only which media outletsare permitted but also how many media outlets each citizen can choose to observeIn this extension the citizens are not allowed to communicate with one another (oth-erwise they could share the information thus observing all permitted media outletsindirectly) This extension does not affect our results as long as each citizen is al-lowed to access at least one media outlet of his choice as in Chan and Suen (2008)Intuitively this is because each citizen categorizes the information from the mediaoutlets into ldquogood newsrdquo where a = 1 is optimal and ldquobad newsrdquo where a = 0 is

22 KOLOTILIN MYLOVANOV ZAPECHELNYUK

optimal Because the information from the media outlets induces a monotone parti-tion it means that ldquogood newsrdquo is separated from ldquobad newsrdquo by a threshold mediaoutlet that depends on the citizenrsquos type That is it is sufficient to observe a singlethreshold media outlet to distinguish ldquogood newsrdquo from ldquobad newsrdquo

In our model each citizenrsquos utility depends on the aggregate action a through theexternality term κ(θ r a) which does not affect the chosen action Let us relax thisassumption so that a citizenrsquos optimal choice can depend on a We can still writethe senderrsquos indirect utility V as a function of the posterior mean state and applyour results However V is no longer uniquely determined by the primitives of themodel It is now endogenous and depends on an equilibrium the citizens play aseach citizenrsquos optimal action now depends on what all citizens do in equilibrium Forexample given the same information about the state a citizen could prefer to choosea = 1 if and only if sufficiently many citizens choose the same action so a is largeenough This creates the problem of multiplicity of equilibria and as a consequencethe dependence of optimal censorship on equilibrium selection

As a side our media censorship problem can also be applied to spatial voting mod-els as in Chiang and Knight (2011) Consider a government party (p = G) and anopposition party (p = O) competing in an election If party p wins a voter withan ideological position r gets utility wp minus (r minus rp)

2 where wp is the quality or va-lence of party p and rp is the ideology or policy platform of party p Voters knowthe partiesrsquo ideologies and obtain information about the partiesrsquo qualities from allavailable media outlets Each voter supports the party that maximizes his expectedutility Within this context our analysis still applies because the voterrsquos utility dif-ference between the government and opposition parties is proportional to θminusr whereθ = (wG minus wO + r2O minus r2G)2(rO minus rG)

There are a few more extensions that can be relevant in applications First insteadof complete censorship of a media outlet there can be a cost of accessing it Forexample an international news channel can be censored by a local government butcitizens may still access it through VPN at some cost Second it can be costly forthe government to censor media outlets So another important question to answeris how a government may prioritize censoring Finally citizens can incur some costof following each media outlet While we have already mentioned that citizens gainno benefit from following more than one outlet it is entirely possible for them tostop watching news altogether if it is sufficiently uninformative These extensions arenontrivial and left for future research

Appendix

Proof of Theorem 1 Suppose that V is S-shaped For mωlowast isin [0 1] and mlowast(ωlowast) =E[ω|ω ge ωlowast] define

V (mωlowast) = V (mlowast(ωlowast)) + V prime(mlowast(ωlowast))(mminusmlowast(ωlowast)) (7)

CENSORSHIP AS OPTIMAL PERSUASION 23

Since V (m) is convex on [0 τ ] and concave on [τ 1] there exists ωlowast isin [0 τ ] thatsatisfies condition (3) that is

V (mωlowast) ge (le)V (m) for all m ge (lt)ωlowast (8)

This is true because V (ωlowastωlowast) minus V (ωlowast) is single-crossing from below To see thisconsider Figure 1 Observe that (3) can hold only if ωlowast is on the convex part of V andmlowast is on the concave part of V As the censorship cutoff ωlowast increases the posteriormean state of the pooling message mlowast(ωlowast) also increases But because V is concaveat mlowast(ωlowast) the dashed tangent line V (mωlowast) becomes flatter in m so it crosses thesolid line V (m) at a smaller m

Consider an arbitrary signal s Let H be the distribution of m = E[ω|s] inducedby signal s Since each signal is a garbling of the fully informative signal F is amean-preserving spread of H Thus the senderrsquos expected utility is smaller undersignal s than under the upper-censorship signal with cutoff ωlowast as follows from

983133 1

0

V (m)dH(m) le983133 1

0

maxV (m) V (mωlowast)dH(m)

le983133 1

0

maxV (m) V (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mlowast(ωlowast))dF (m)

where the second line holds because maxV (m) V (mωlowast) is convex in m and F isa mean-preserving spread of H the third line holds by (8) and the last line holds by

(7) and by983125 1

ωlowast(mminusmlowast(ωlowast))dF (m) = 0

Conversely suppose that V is not S-shaped Then there exist 0 le m1 lt m2 lem3 lt m4 le 1 such that V primeprime(m) lt 0 for m isin (m1m2) V

primeprime(m) = 0 for m isin [m2m3]and V primeprime(m) gt 0 for m isin (m3m4) because V primeprime is continuous by assumption It isstraightforward to show that there exists 983141mlowast isin (m1m2) (sufficiently close to m2) and983141ωlowast isin (m3m4) such that

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) lt V (m) for all m isin (983141ωlowastm4]

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) gt V (m) for all m isin [m1 983141ωlowast] 983141mlowast

Moreover there exists a continuous distribution F such that the support of F is asubset of [m1m4] and 983141mlowast = E[ω|ω le 983141ωlowast] Using a chain of inequalities similar to theone above it is easy to show that lower-censorship signal 983141slowast with cutoff 983141ωlowast is optimalFurthermore it is uniquely optimal as both inequalities hold with equalities only forsignal 983141slowast This in turn implies that any upper-censorship signal is suboptimal

24 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Proof of Lemma 1 Notice that under assumption (A1) we have

V primeprime(m) = gprime(m) + ρg(m) for m isin [0 1]

Recall that V is S-shaped if and only if V primeprime(m) = gprime (m) + ρg (m) is single-crossingfrom above Using Proposition 1 in Quah and Strulovici (2012) it is easy to showthat V primeprime(m) is single-crossing from above for all ρ isin R if and only if gprime (m)g (m) isnonincreasing in m (that is ln g(m) is concave)

Proof of Theorem 2 Immediate by Theorem 1 and Lemma 1

Proof of Theorem 3 Part (i) We fix t and σ and vary ρ Without loss ofgenerality let t = 0 and σ = 1 (so that Gtσ = G) Consider a signal that induces thedistribution H of posterior mean m This signal implements the receiverrsquos interimutility U given by

U(r) =

983133 1

r

(mminus r)dH(m) for r isin [0 1]

which holds because the receiver acts if and only if m ge r Since types r lt 0 alwayschoose a = 1 and types r gt 1 always choose a = 0 we can exclude them fromconsideration and assume that supp(G) = [0 1] The senderrsquos expected utility is then

983133 1

0

V (m)dH(m) =

983133 1

0

983133 m

0

(1 + ρ(mminus r))dG(r)dH(m)

=

983133 1

0

983133 1

r

(1 + ρ(mminus r))dH(m)dG(r)

=

983133 1

0

(1minusH(r) + ρU(r))dG(r)

Consider ρ2 gt ρ1 Suppose by contradiction that the corresponding optimal upper-censorship signals s2 and s1 are such that s1 is strictly more informative than s2 LetU2 and U1 be the receiverrsquos interim utilities implemented by s2 and s1 Since thesender prefers signal s2 under ρ2 and signal s1 under ρ1 we have

983133 1

0

(1minusH2 (r) + ρ2U2 (r)) dG (r) ge983133 1

0

(1minusH1 (r) + ρ2U1 (r)) dG (r)

983133 1

0

(1minusH1 (r) + ρ1U1 (r)) dG (r) ge983133 1

0

(1minusH2 (r) + ρ1U2 (r)) dG (r)

Summing up these inequalities gives

(ρ2 minus ρ1)

983133 1

0

(U2 (r)minus U1 (r)) dG (r) ge 0

leading to a contradiction because U1(r) ge U2(r) for all r with strict inequality forsome r given that s1 is strictly more informative than s2

CENSORSHIP AS OPTIMAL PERSUASION 25

Part (ii) We fix ρ isin R and σ = 1 and vary t The indirect utility of the senderconditional on m isin [0 1] is given by

Vt(m) =

983133

rlem

(1+ρ(mminusr))gt1(r)dr =

983133

rlemminust

(1+ρ(mminustminusr))g(r)dr = V (mminust) (9)

By Theorem 1 when t0 = 0 an optimal cutoff ωlowast0 satisfies

V (m) le V (mlowast0) + V prime(mlowast

0)(mminusmlowast0) for m isin [ωlowast

0 1]

When V is S-shaped and t1 lt 0 it is easy to see that

V (mminus t1) le V (mlowast0 minus t1) + V prime(mlowast

0 minus t1)(mminusmlowast0) for m isin [ωlowast

0 1]

which implies that when t = t1 the above inequality changes its sign at some ωlowast1

that satisfies ωlowast1 le ωlowast

0 It follows from (9) that ωlowast1 is an optimal cutoff for t = t1 and

as we established above ωlowast1 le ωlowast

0

Part (iii) We fix ρ ge 0 and t = 0 and vary σ The indirect utility of the senderconditional on m isin [0 1] is given by

V ρσσ(m) =

983133

rlem

9830591 +

ρ

σ(mminus r)

983060dG0σ(r)

=

983133

rlem

9830591 +

ρ

σ(mminus r)

983060g

983061r minus (1minus σ)τ

σ

9830621

σdr

=

983133

rlemminus(1minusσ)τσ

9830611 + ρ

983061mminus (1minus σ)τ

σminus r

983062983062g(r)dr

= Vρ1

983061mminus (1minus σ)τ

σ

983062 (10)

Let σ0 = 1 σ1 lt 1 and ρ1 = ρσ1 To compare the optimal cutoffs under parameters(ρ σ0) and (ρ σ1) we consider two steps first comparing the cutoffs between (ρ σ0)and (ρ1 σ1) and then comparing the cutoffs between (ρ1 σ1) and (ρ σ1)

First we compare an optimal cutoff ωlowast0 under (ρ σ0) = (ρ 1) and an optimal cutoff

ωlowast1 under (ρ1 σ1) By Theorem 1 ωlowast

0 satisfies

Vρ1(m) le Vρ1(mlowast0) + V prime

ρ1(mlowast0)(mminusmlowast

0) for m isin [ωlowast0 1]

When V is increasing (implied by ρ ge 0) and S-shaped and σ1 lt 1 it is easy to seethat for all m isin [ωlowast

0 1]

Vρ1

983061mminus (1minus σ1)τ

σ1

983062le Vρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062+ V prime

ρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062(mminusmlowast

0)

which implies that when σ = σ1 lt 1 the above inequality changes its sign at someωlowast1 that satisfies ωlowast

1 le ωlowast0 It follows from (10) that ωlowast

1 is an optimal cutoff under(ρσ σ1) = (ρ1 σ1) and as we established above ωlowast

1 le ωlowast0

26 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Second we compare the optimal cutoff ωlowast1 under (ρ1 σ1) and the optimal cutoff ωlowast

2

under (ρ σ1) Because ρ lt ρ1 = ρσ1 by part (i) we have ωlowast2 le ωlowast

1

Combining ωlowast1 le ωlowast

0 and ωlowast2 le ωlowast

1 we obtain ωlowast2 le ωlowast

0

Proof of Theorem 1prime Part (A) The proof is analogous to that of Theorem 1 andis thus omitted

Part (B) The only if statement follows from that in part (A) (see the proof of Theorem1) and the fact that we can always ensure that an optimal stochastic censorship(ωlowast qlowast) is deterministic (so qlowast isin 0 1) by the choice of the support and distributionof ω Thus we only need to prove the if statement

Let F be a distribution of ω with support on n values 0 lt ω0 lt lt ωnminus1 lt 1 Thisis without loss of generality because if 0 andor 1 were in the support of F we couldredefine the state as ω = (ω + ε)(1 + 2ε) and the type as r = (r + ε)(1 + 2ε)

Let X sub ω1 ωnminus1 identify a set of cutoffs defining a monotone partition for eachωi isin X the receiver is informed whether ω lt ωi or ω ge ωi Note that we exclude ω0

from the set of cutoffs because ω ge ω0 always holds

Note that X is deterministic upper censorship if and only if there exists a censorshipthreshold ωlowast isin [0 1] such that for each i = 1 nminus 1 ωi isin X if and only if ωi lt ωlowastthat is we can assume that qlowast = 0 without loss of generality

Let VX be the senderrsquos expected utility when the monotone partition is given by X

If n = 2 then there are only two partitions X = ω1 (full disclosure) and X = empty(no disclosure) Both are upper censorships so part (B) holds trivially

If n = 3 then there are four monotone partitions X = ω1ω2 (full disclosure)X = ω2 X = ω1 and X = empty (no disclosure) Only X = ω2 is not uppercensorship The next lemma proves that X = ω2 cannot be uniquely optimal whenV is S-shaped thus proving the optimality of upper censorship for n = 3

Denote

mjk = E983045ω|ω isin [ωjωk]

983046

For convenience of notation we write mj = ωj

Lemma 3 Let V be S-shaped with an inflexion point τ Let ω be a discrete randomvariable with support on m0m1m2 and probabilities p0 p1 p2 where 0 lt m0 ltm1 lt m2 lt 1 Then

(i) if τ le m0 then Vm2 le Vempty

(ii) if τ ge m1 then Vm2 le Vm1m2

(iii) if m0 lt τ lt m1 then Vm2 le maxVempty Vm1m2

CENSORSHIP AS OPTIMAL PERSUASION 27

Proof Define p01 = p0 + p1 and p02 = p01 + p2 = 1 Observe that

m01 =p0m0 + p1m1

p0 + p1and m02 =

p0m0 + p1m1 + p2m2

p0 + p1 + p2

We thus have

Vempty = p02V (m02) = (p01 + p2)V

983061p01m01 + p2m2

p01 + p2

983062

Vm2 = p01V (m01) + p2V (m2) = (p0 + p1)V

983061p0m0 + p1m1

p0 + p1

983062+ p2V (m2)

Vm1m2 = p0V (m0) + p1V (m1) + p2V (m2)

Clearly

m0 lt m01 lt m1 and m0 lt m02 lt m2

Therefore

Vm2 le Vempty lArrrArrp01

p01 + p2V (m01) +

p2p01 + p2

V (m2) le V

983061p01

p01 + p2m01 +

p2p01 + p2

m2

983062= V (m02)

(11)

and

Vm2 le Vm1m2 lArrrArr

V

983061p0

p0 + p1m0 +

p1p0 + p1

m1

983062le p0

p0 + p1V (m0) +

p1p0 + p1

V (m1)(12)

Suppose that τ le m01 so V is concave on [m01m2] Then (11) holds by Jensenrsquosinequality In particular it holds for τ le m0 lt m01 thus proving part (i)

Suppose that τ ge m1 so V is convex on [m0m1] Then (12) holds by Jensenrsquosinequality thus proving part (ii)

Finally suppose that m01 lt τ lt m1 Let (τ V (τ)) lie below (or on) the straight lineA connecting (m0 V (m0)) and (m2 V (m2)) as illustrated on Figure 2 By convexityof V (m) for m lt τ it must be the case that V (m01) lies below (or on) the dashedline B By concavity of V (m) for m gt τ it must be the case that V (m1) lies above(or on) the dashed line C Consequently (12) is satisfied

Alternatively let (τ V (τ)) lie above the straight line A connecting (m0 V (m0)) and(m2 V (m2)) as illustrated on Figure 3 If Vm2 le Vm1m2 then the proof of part(iii) is complete So suppose that Vm2 gt Vm1m2 so (12) is violated

By convexity of V (m) for m lt τ it must be the case that V (m01) lies below (or on)the line B Since (12) is violated V (m01) must lie between the lines A and B Recallthat m01 lt m02 lt m2 Observe that V (m02) must be in the shaded area on Figure 3Indeed if m02 lt τ then by convexity of V (m) for m lt τ it must be the case thatV (m02) lies between (or on) the lines B and D If m02 ge τ then by concavity of

28 KOLOTILIN MYLOVANOV ZAPECHELNYUK

τ m2m0

A C

B

m01 m1

V (m0) V (τ)

V (m2)

Figure 2 The case where (τ V (τ)) lies below (or on) the straight lineconnecting (m0 V (m0)) and (m2 V (m2))

τ m2m0

A

C

B

m01

V (m0)

V (τ)

V (m2)

D

V (m01)

Figure 3 The case where (τ V (τ)) lies above the straight line con-necting (m0 V (m0)) and (m2 V (m2))

V (m) for m gt τ it must be the case that V (m02) lies above (or on) line C It followsthat (11) is satisfied

We now adapt Lemma 3 to prove the optimality of upper censorship for n ge 4

CENSORSHIP AS OPTIMAL PERSUASION 29

Fix a monotone partition X sub ω1 ωnminus1 that is not upper censorship that isthere exists i isin 2 nminus 1 such that

ωiminus1 ∕isin X and ωi isin X (13)

Denote ωn = 1 Define j and k as follows If X cap ω1 ωiminus2 ∕= empty let ωj be thelargest element inX smaller than i otherwise let ωj = ω0 IfXcapωi+1 ωnminus1 ∕= emptylet ωk be the smallest element in X greater than i otherwise let ωk = ωn = 1 Define

m0 = ωj m1 = ωiminus1 m2 = ωi

and

p0 =Pr[ω isin [ωjωiminus1)]

Pr[ω isin [ωjωk)] p1 =

Pr[ω isin [ωiminus1ωi)]

Pr[ω isin [ωjωk)] p2 =

Pr[ω isin [ωiωk)]

Pr[ω isin [ωjωk)]

To apply Lemma 3 we identify Xωi with empty X with m2 and X cup ωiminus1 withm1m2 By Lemma 3 VX le maxVXωi VXcupωiminus1 Thus replacing X withXωi if VX le VXωi and with X cup ωiminus1 otherwise weakly improves the senderrsquosexpected utility Repeatedly applying such replacements yields upper censorshipIndeed this procedure terminates after a finite number of replacements because byLemma 3 for ωj ge τ we replace X with Xωi and for ωiminus1 le τ we replace Xwith X cup ωiminus1

We thus have shown that when V is S-shaped when optimizing over the set ofmonotone partitions we can restrict attention to deterministic upper censorship Itremains to show that an optimal deterministic censorship cutoff ωlowast

d coincides with anoptimal stochastic censorship cutoff ωlowast from part (A) To see this it is straightforwardto show that

V lowast(ωlowast qlowast) =

983133

[0ωlowast)

V (ω)dF (ω) + V (ωlowast)qlowast Pr[ω = ωlowast]

+ V (mlowast(ωlowast qlowast))((1minus qlowast) Pr[ω = ωlowast] + Pr[ω gt ωlowast]) (14)

is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order Since the two de-terministic censorship cutoff pairs closest to (ωlowast qlowast) in the Blackwell informativenessorder are (ωlowast 0) and (ωlowast 1) we obtain ωlowast

d = ωlowast

Proof of Theorem 2prime Immediate by Theorem 1prime and Lemma 1

Proof of Theorem 3prime The proof for an optimal signal is analogous to that ofTheorem 3 and is thus omitted The same applies to part (i) for an optimal monotonepartition Thus it remains to prove part (ii) for an optimal monotone partition

We fix ρ and σ = 1 and vary t continuously If ωlowast isin supp(F ) then an optimalmonotone partition is an optimal signal because all qlowast isin [0 1] are optimal Thuspart (ii) in (B) holds by part (ii) in (A) If ωlowast isin supp(F ) and qlowast isin (0 1) then either(ωlowast

d qlowastd) = (ωlowast 0) or (ωlowast

d qlowastd) = (ωlowast 1) is optimal because the senderrsquos expected utility

V lowast given by (14) is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order

30 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Therefore it suffices to consider values of t at which the sender is indifferent between(ωlowast 0) and (ωlowast 1) and to show that the sender strictly prefers (ωlowast 1) to (ωlowast 0) at aslightly higher t

V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) =rArr dV lowastt (ω

lowast 0)

dtle dV lowast

t (ωlowast 1)

dt (15)

The condition V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) can be written as

Vt(mlowast(ωlowast 0)) Pr[ω ge ωlowast] = Vt(ω

lowast) Pr[ω = ωlowast] + Vt(mlowast(ωlowast 1)) Pr[ω gt ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)Pr[ω = ωlowast]

Pr[ω ge ωlowast]+ Vt(m

lowast(ωlowast 1))Pr[ω gt ωlowast]

Pr[ω ge ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + Vt(mlowast(ωlowast 1))

mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Analogously the conditiondV lowast

t (ωlowast0)dt

le dV lowastt (ωlowast1)dt

can be written as

V primet (m

lowast(ωlowast 0)) ge V primet (ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + V primet (m

lowast(ωlowast 1))mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Thus (15) states that if Vt(mlowast(ωlowast 0)) is equal to the convex combination of Vt(ω

lowast)and Vt(m

lowast(ωlowast 1)) then the derivative V primet (m

lowast(ωlowast 0)) is greater than the convex com-bination of the derivatives V prime

t (ωlowast) and V prime

t (mlowast(ωlowast 1)) When V is S-shaped it is easy

to see that this property is satisfied

Proof of Lemma 2 Observe that a signal induced by any set X sub C of permittedmedia outlets induces a monotone partition of [0 1] and thus HC sub HM Indeed letcX(ω) and cX(ω) be the media outlets in X that are the closest to ω from above andfrom below

cX(ω) = sup983059c isin X c le ω

983052cup 0

983060and cX(ω) = inf

983059c isin X c gt ω cup 1

983060

By observing the media outlets in X each citizen is informed that ω isin [cX(ω) cX(ω))

Conversely let the set C of media outlets be c0 lt middot middot middot lt cn Without loss of generalityassume that C contains the two uninformative media outlets c0 = 0 and cn = 1 which(almost) always endorse and criticize action a = 1 respectively Then the distributionF of the state ω is a discrete distribution whose support is Ω = ω0 ωnminus1 whereωi = E[θ|θ isin [ci ci+1)] for i isin 0 nminus 1 Consider a monotone partitional signalrepresented by a nondecreasing function ξ [0 1] rarr R For each realization m of thissignal that occurs with a strictly positive probability define

x(m) = min983153ω isin Ω ξ(ω) = m

983154and x(m) = max

983153ω isin Ω ξ(ω) = m

983154

This signal induces a monotone partition of the finite set Ω This partition of Ω canbe implemented by a censorship policy that prohibits a media outlet c isin C if andonly if c isin (x(m) x(m)) for some realization m

CENSORSHIP AS OPTIMAL PERSUASION 31

References

Alonso R and O Camara (2016a) ldquoPersuading Votersrdquo American EconomicReview 106 3590ndash3605

(2016b) ldquoPolitical Disagreement and Information in Electionsrdquo Games andEconomic Behavior 100 390ndash412

Bagnoli M and T Bergstrom (2005) ldquoLog-Concave Probability and Its Ap-plicationsrdquo Economic Theory 26 445ndash469

Baron D P (2006) ldquoPersistent Media Biasrdquo Journal of Public Economics 90(1)1ndash36

Besley T and A Prat (2006) ldquoHandcuffs for the Grabbing Hand MediaCapture and Government Accountabilityrdquo American Economic Review 96(3) 720ndash736

Blackwell D (1953) ldquoEquivalent Comparisons of Experimentsrdquo Annals of Math-ematical Statistics 24 265ndash272

Chan J and W Suen (2008) ldquoA Spatial Theory of News Consumption andElectoral Competitionrdquo Review of Economic Studies 75 699ndash728

Chiang C-F and B Knight (2011) ldquoMedia Bias and Influence Evidence fromNewspaper Endorsementsrdquo Review of Economic Studies 78 795ndash820

Duffie D P Dworczak and H Zhu (2017) ldquoBenchmarks in Search MarketsrdquoJournal of Finance 72 1983ndash2044

Dworczak P (2017) ldquoMechanism Design with Aftermarkets Cutoff Mecha-nismsrdquo mimeo

Dworczak P and G Martini (2019) ldquoThe Simple Economics of Optimal Per-suasionrdquo Journal of Political Economy 127 993ndash2048

Edmond C (2013) ldquoInformation Manipulation Coordination and RegimeChangerdquo Review of Economic Studies 80 1422ndash1458

Egorov G S Guriev and K Sonin (2009) ldquoWhy Resource-poor DictatorsAllow Freer Media A Theory and Evidence from Panel Datardquo American PoliticalScience Review 103 645ndash668

Gehlbach S and K Sonin (2014) ldquoGovernment Control of the Mediardquo Journalof Public Economics 118 163ndash171

Gentzkow M and E Kamenica (2016) ldquoA Rothschild-Stiglitz Approach toBayesian Persuasionrdquo American Economic Review Papers amp Proceedings 106597ndash601

Gentzkow M and J Shapiro (2006) ldquoMedia Bias and Reputationrdquo Journalof Political Economy 114(2) 280ndash316

(2010) ldquoWhat Drives Media Slant Evidence from US Daily NewspapersrdquoEconometrica 78(1) 35ndash71

Ginzburg B (2019) ldquoOptimal Information Censorshiprdquo Journal of EconomicBehavior and Organization 163 377ndash385

Goldstein I and Y Leitner (2018) ldquoStress Tests and Information DisclosurerdquoJournal of Economic Theory 177 34ndash69

32 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Guo Y and E Shmaya (2019) ldquoThe Interval Structure of Optimal DisclosurerdquoEconometrica 87 653ndash675

Johnson J P and D P Myatt (2006) ldquoOn the Simple Economics of Adver-tising Marketing and Product Designrdquo American Economic Review 96 756ndash784

Kamenica E and M Gentzkow (2011) ldquoBayesian Persuasionrdquo American Eco-nomic Review 101 2590ndash2615

Kolotilin A (2015) ldquoExperimental Design to Persuaderdquo Games and EconomicBehavior 90 215ndash226

(2018) ldquoOptimal Information Disclosure A Linear Programming Ap-proachrdquo Theoretical Economics 13 607ndash636

Kolotilin A and H Li (2019) ldquoRelational Communicationrdquo mimeoKolotilin A M Li T Mylovanov and A Zapechelnyuk (2015) ldquoPer-suasion of a Privately Informed Receiverrdquo mimeo

Kolotilin A T Mylovanov A Zapechelnyuk and M Li (2017) ldquoPer-suasion of a Privately Informed Receiverrdquo Econometrica 85 1949ndash1964

Kolotilin A and A Wolitzky (2019) ldquoAssortative Information Disclosurerdquomimeo

Lewis T R and D Sappington (1994) ldquoSupplying Information to FacilitatePrice Discriminationrdquo International Economic Review 35 309ndash327

Lorentzen P (2014) ldquoChinarsquos Strategic Censorshiprdquo American Journal of Polit-ical Science 58 402ndash414

Mullainathan S and A Shleifer (2005) ldquoThe Market for Newsrdquo AmericanEconomic Review 95(4) 1031ndash1053

Orlov D P Zryumov and A Skrzypach (2018) ldquoDesign of Macro-prudentialStress Testsrdquo mimeo

Ostrovsky M and M Schwarz (2010) ldquoInformation Disclosure and Unravelingin Matching Marketsrdquo American Economic Journal Microeconomics 2 34ndash63

Prat A and D Stromberg (2013) ldquoThe Political Economy of Mass MediardquoAdvances in Economics and Econometrics 2 135ndash187

Quah J and B Strulovici (2012) ldquoAggregating the Single Crossing PropertyrdquoEconometrica 80 2333ndash2348

Rayo L and I Segal (2010) ldquoOptimal Information Disclosurerdquo Journal ofPolitical Economy 118 949ndash987

Romanyuk G and A Smolin (2019) ldquoCream Skimming and Information Designin Matching Marketsrdquo American Economic Journal Microeconomics 11 250ndash276

Suen W (2004) ldquoThe Self-Perpetuation of Biased Beliefsrdquo Economic Journal 114377ndash396

Yamashita T (2018) ldquoOptimal Public Information Disclosure by Mechanism De-signerrdquo mimeo

Zapechelnyuk A (2019) ldquoOptimal Quality Certificationrdquo American EconomicReview Insights forthcoming

CENSORSHIP AS OPTIMAL PERSUASION 7

An upper-censorship signal reveals states below a specified cutoff and pools statesabove this cutoff When the state is equal to the cutoff it can be revealed or pooledIf the state is continuous it does not matter whether the cutoff state is revealed orpooled because this is a zero probability event However if the state is discretewe will distinguish between deterministic and stochastic upper-censorship signalsdepending on what happens at the cutoff state

A lower-censorship signal is defined symmetrically it pools states below a specifiedcutoff and reveals states above this cutoff Because the case of lower censorshipis symmetric (see Remark 1 below) for clarity of exposition we focus on uppercensorship throughout the paper

A signal s is stochastic upper censorship if there exists a cutoff pair (ωlowast qlowast) consistingof a cutoff state ωlowast isin [0 1] and probability qlowast isin [0 1] such that states below ωlowast arerevealed states above ωlowast are pooled and the state ω = ωlowast is revealed with probabilityqlowast and pooled with probability 1minus qlowast For example s can be expressed as

s =

983099983105983103

983105983101

ω with probability one if ω lt ωlowast

ωlowast and mlowast with probabilities qlowast and 1minus qlowast if ω = ωlowast

mlowast with probability one if ω gt ωlowast

where

mlowast =

983125(ωlowast1]

ωdF (ω) + ωlowast(1minus qlowast) Pr[ω = ωlowast]983125(ωlowast1]

dF (ω) + (1minus qlowast) Pr[ω = ωlowast](1)

is the posterior expected state induced by the pooling signal Notice that mlowast is afunction of ωlowast and qlowast

A stochastic upper-censorship signal s with a cutoff pair (ωlowast qlowast) is deterministic uppercensorship if and only if qlowast isin 0 1 in which case s can be expressed as a monotonepartition of the state space [0 1] For example if qlowast = 0 then the signal s fullyreveals states ω lt ωlowast and pools states ω ge ωlowast so s can be expressed as

s =

983083ω if ω lt ωlowast

mlowast if ω ge ωlowast

wheremlowast = E[ω|ω ge ωlowast]

is the expected state conditional on being at least ωlowast Note that the fully informativesignal and the completely uninformative signal are deterministic upper-censorshipsignals with (ωlowast qlowast) = (1 1) and (ωlowast qlowast) = (0 0) respectively

23 Benchmark To illustrate the difference between the cases of a continuous anddiscrete state as well as between Scenarios A and B we solve a benchmark examplewhere there is no uncertainty about the receiverrsquos type

Let the state ω have the expected value E[ω] = 12 and let the receiverrsquos type r be

known to the sender and satisfy r isin98304312 1983044 In addition suppose that the sender wishes

8 KOLOTILIN MYLOVANOV ZAPECHELNYUK

to maximize the probability that the receiver accepts the proposal so v(ω r) = 1 forall ω and all r

Observe that if the sender reveals no information about ω the receiver will evaluate ωby its expected value of 1

2 thus having utility 1

2minus r lt 0 from accepting the proposal

So revealing no information makes the receiver reject the proposal with certaintyThe sender can do better by fully revealing ω in which case the proposal is acceptedwith probability Pr[ω ge r]

However the sender can do even better by pooling states above r with states below rwhile keeping the expected state induced by the pooling signal at least r Thus thereceiver who is unable to distinguish between the states in the pool will accept theproposal at all of these states

To illustrate the case of the continuous state suppose that ω is uniformly distributedon [0 1] The largest pool that maintains the posterior expectation at least r is theinterval [2r minus 1 1] Indeed once the receiver learns that ω isin [2r minus 1 1] given theuniform prior of ω the posterior expected state is r So the proposal is accepted whenω ge 2rminus 1 and rejected when ω lt 2rminus 19 Thus the deterministic upper-censorshipsignal with cutoff ωlowast = 2r minus 1 is optimal The senderrsquos expected utility is equal to

Pr[proposal is accepted] = Pr[ω ge 2r minus 1] = 2(1minus r)

which is twice as high as that from the fully informative signal

Pr[proposal is accepted] = Pr[ω ge r] = 1minus r

When ω is a discrete random variable an optimal way to reveal information aboutω takes the form of stochastic upper censorship For illustration suppose that ω canonly be 0 or 1 equally likely Now simply pooling states is not helpful for the senderPooling 0 and 1 yields the posterior expected state of 1

2 which is smaller than the

type r resulting in a rejection of the proposal Including any other states in thepool makes no difference as these states never occur Thus an optimal monotonepartition is the fully informative signal and the probability that the receiver acceptsthe proposal is simply equal to the probability that ω = 1

Pr[proposal is accepted] = Pr[ω ge r] = Pr[ω = 1] =1

2

The sender can do strictly better by partial (stochastic) pooling If ω = 1 let thesignal realization be ldquoacceptrdquo with certainty If ω = 0 let the signal realization beldquorejectrdquo with some probability q and ldquoacceptrdquo with probability 1 minus q So when thereceiver observes ldquoacceptrdquo the posterior expected state is

E[ω|ldquoacceptrdquo] = Pr[ω = 1]

Pr[ω = 1] + Pr[ω = 0] middot (1minus q)=

12

12 + 12 middot (1minus q)=

1

2minus q

9Whether states strictly below ωlowast are revealed or pooled among themselves does not matterbecause they are all strictly below r and thus induce the receiver to take action a = 0

CENSORSHIP AS OPTIMAL PERSUASION 9

The sender now wishes to find the lowest q subject to 1(2 minus q) ge r which yieldsqlowast = 2 minus 1

r Thus stochastic upper censorship with ωlowast = 0 and qlowast = 2 minus 1

ris

optimal The probability that the receiver accepts the proposal is exactly equal tothe probability that the signal realization is ldquoacceptrdquo

Pr[proposal is accepted] = Pr[ω = 1] + Pr[ω = 0] middot (1minus qlowast) =1

2+

1

2middot 1minus r

r=

1

2r

We now summarize the insights obtained in this example If the state is continuousthen an optimal signal is deterministic upper censorship If the state is discrete thenan optimal monotone partition is deterministic upper censorship and an optimalsignal is stochastic upper censorship10 In the remainder of the paper we will showthat these insights extend to the case where the receiverrsquos type is uncertain

3 Continuous State

In this section we provide necessary and sufficient conditions for the optimality ofupper censorship when the state is continuous

Let m be the expected state conditional on observing a realization of a signal s Sincethe sender and receiverrsquos utilities are linear in ω they depend on the information aboutω revealed by signal s only through the expected state m In particular the receiverchooses a = 1 if and only if r le m

Let V (m) denote the indirect utility of the sender conditional on m

V (m) =

983133

rlem

E [v(ω r)|m] g(r)dr =

983133 m

0

v(m r)g(r)dr m isin [0 1] (2)

where E [v(ω r)|m] = v(m r) by the linearity of v in ω

A function V is said to be S-shaped if it is convex below some threshold and concaveabove that threshold or equivalently if V primeprime is single-crossing from above

there exists τ such that V primeprime(m) ge (le) 0 for all m lt (gt) τ

We now provide the criterion for the optimality of upper censorship Recall thatmlowast = E[ω|ω ge ωlowast]

Theorem 1 Let V be S-shaped Then and only then for all continuous F anoptimal signal is deterministic upper censorship whose cutoff state ωlowast isin [0 τ ] satisfies

V (mlowast) + V prime(mlowast)(mminusmlowast) ge (le)V (m) for all m ge (lt)ωlowast (3)

The criterion for the optimality of lower censorship is analogous

10See for example Kolotilin (2015) for a detailed treatment of the case where there is no uncer-tainty about the receiverrsquos type

10 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Remark 1 Let minusV be S-shaped Then and only then for all continuous F anoptimal signal is deterministic lower censorship whose cutoff state ωlowast isin [τ 1] satisfies

V (mlowast) + V prime(mlowast)(mminusmlowast) ge (le)V (m) for all m le (gt)ωlowast

Theorem 1 states that optimal persuasion takes a simple form of upper censorshipwhen V is S-shaped11 Moreover the theorem establishes that this result is tight inthe sense that if V is not S-shaped then there exists a distribution F of the statesuch that no upper-censorship signal is optimal

Note that every upper-censorship signal is a monotone partition Therefore Theorem1 applies in both Scenario A where the sender is free to choose any signal andScenario B where the sender is constrained to choose a monotone partition

The intuition behind Theorem 1 is as follows Observe that when no informationabout the state is revealed the receiverrsquos best-response action does not change withthe state The more information is revealed the more variable the receiverrsquos behaviorwill be in response to this information Consider an interval where V is concave Thesender would prefer not to reveal any information when the state is in this intervalsince a certain outcome is preferred to any lottery with the same expected stateConversely consider an interval where V is convex The sender would prefer to fullyreveal the state in this interval since now lotteries are preferred If V is S-shapedthat is it is convex below some threshold and concave above that threshold theinduced optimal persuasion takes the form of upper censorship

When V is S-shaped the senderrsquos optimization problem is reduced to finding anoptimal censorship cutoff ωlowast If the realized state ω is below the cutoff then it isrevealed to the receiver so the expected utility of the sender is V (ω) If the realizedstate ω is above the cutoff then the posterior expected state is mlowast = E[ω|ω ge ωlowast] sothe expected utility of the sender conditional on ω ge ωlowast is V (mlowast) The sender thusneeds to solve the problem

maxωlowastisin[01]

983133 ωlowast

0

V (ω)f(ω)dω +

983133 1

ωlowastV (mlowast)f(ω)dω (4)

The expression in (3) represents the first-order condition to this problem and capturesthree possible cases the boundary solutions ωlowast = 0 and ωlowast = 1 if the expression in(3) has the same sign for all ω and an interior solution ωlowast such that

V (mlowast) + V prime(mlowast)(ωlowast minusmlowast) = V (ωlowast) (5)

This first-order condition (5) is illustrated by Figure 1 The solid line is V (m) andthe dashed line is V (mlowast) + V prime(mlowast)(mminusmlowast) which is tangent to V at mlowast

11For a fixed distribution F Proposition 3 in Kolotilin (2018) implies that upper censorship withcutoff ωlowast is optimal if and only if (3) holds and V (m) is convex on [0ωlowast] This result can be used toshow that upper censorship is optimal when V is S-shaped For completeness we include a simpleself-contained proof inspired by the proof of Theorem 1 in Dworczak and Martini (2019)

CENSORSHIP AS OPTIMAL PERSUASION 11

1ωlowast

V

revealing pooling

mlowast0

Figure 1 Optimal upper censorship with cutoff ωlowast

The solution of the senderrsquos problem (4) is particularly simple if V is either globallyconvex or globally concave In this case the first-order condition in (3) has a constantsign so either ωlowast = 0 or ωlowast = 1 must be optimal If V is convex then ωlowast = 1 isoptimal which corresponds to the fully informative signal Similarly if V is concavethen ωlowast = 0 is optimal which corresponds to the completely uninformative signalThis is summarized in the following corollary

Corollary 1 An optimal signal is

(i) fully informative for all F if and only if V is convex

(ii) completely uninformative for all F if and only if V is concave

31 Canonical Case We now consider the canonical case where the senderrsquos utilitysatisfies assumption (A1)

The density g of receiver types r is said to be log-concave if ln g(r) is concave in r12

Note that ln g(r) is well defined in the canonical case

In this case the shape of the senderrsquos indirect utility V defined by (9) is connectedto the shape of the density g of receiver types as follows

Lemma 1 In the canonical case V is S-shaped for all ρ if and only if g is log-concave

12Table 1 in Bagnoli and Bergstrom (2005) reports distributions with log-concave densities

12 KOLOTILIN MYLOVANOV ZAPECHELNYUK

That is if the density of receiver types g is log-concave then the senderrsquos indirectutility V is S-shaped Moreover this result is tight in the sense that if g is notlog-concave then there exists ρ such that V is not S-shaped

By Theorem 1 and Lemma 1 we obtain a criterion for the optimality of upper cen-sorship with the condition on the primitive of the model the density g

Theorem 2 Consider the canonical case and let the density g be log-concave Thenand only then an optimal signal is deterministic upper censorship for all F and all ρ

The symmetric statement is also true an optimal signal is lower censorship for all Fand all ρ if and only if minusg is log-concave Consequently if both g and minusg are log-concave (that is g is exponential) then there exists an optimal signal which is bothupper censorship and lower censorship There are only two signals with this propertyfully informative and completely uninformative This allows us to obtain conditionson the distribution of receiver types under which the optimal signal is polarizedbetween fully informative and completely uninformative signals as for example inLewis and Sappington (1994) and Johnson and Myatt (2006)13

Corollary 2 In the canonical case an optimal signal is either fully informative orcompletely uninformative for all F and all ρ if and only if there exist λ isin R and c gt 0such that g(r) = ceminusλr for r isin [0 1]

If g(r) = ceminusλr then the fully informative signal is optimal whenever ρ ge λ and thecompletely uninformative signal is optimal whenever ρ le λ (and any signal is optimalwhen ρ = λ) In particular if ρ = 0 the optimal signal is fully determined by thesign of λ which is in turn determined by whether the mean of r is smaller or greaterthan 1

2

32 Comparative Statics Theorem 2 allows for a sharp comparative statics anal-ysis on the informativeness of the optimal signal

We compare signals by their Blackwell informativeness (Blackwell 1953) To compareupper-censorship signals s1 and s2 we only need to compare their cutoffs ωlowast

1 and ωlowast2

Signal s1 is more informative than signal s2 if ωlowast1 ge ωlowast

2 Indeed state ω isin [0ωlowast2) is

fully revealed by both s1 and s2 and state ω isin [ωlowast2 1] is partially revealed by s1 and

not revealed at all by s2 so s1 is more informative than s2

13In the case of log-concave density g the sets of upper- and lower-censorship signals are totallyordered by the rotation order of Johnson and Myatt (2006) If we restrict attention to the set oflower-censorship signals (which are generally suboptimal under log-concave g) then the rotationpoint is decreasing with respect to the rotation order By Lemma 1 of Johnson and Myatt (2006)the senderrsquos expected utility is quasiconvex and one of the extreme lower-censorship signals (thefully informative or completely uninformative signal) is optimal for the sender But if we restrictattention to the set of upper-censorship signals (which are optimal under log-concave g) then therotation point is not decreasing with respect to the rotation order Therefore Lemma 1 of Johnsonand Myatt (2006) does not apply and an interior upper-censorship signal is generally optimal forthe sender

CENSORSHIP AS OPTIMAL PERSUASION 13

For the purpose of comparison we extend the definition of distribution function Gto the real line and assume that its density g is log-concave Consider a family ofdistributions Gtσ of receiver types

Gtσ (r) = G

983061τ minus t+

r minus τ

σ

983062

where τ is the maximum point of g and t isin R and σ gt 0 are parameters Let gtσ (r)denote the corresponding density Note that gtσ (r) is log-concave for all t isin R andall σ gt 0

Because gtσ is log-concave on [0 1] deterministic upper censorship is optimal byTheorem 2 Let ωlowast (ρ t σ) isin [0 1] be the optimal censorship cutoff as given by (3)

The shift parameter t shifts the distribution along the horizontal axis and the stretchparameter σ stretches the distribution horizontally and symmetrically around themode τ so the limit of σ rarr 0 leads to the unit mass on τ We now show that thesender optimally discloses more information when

(i) the senderrsquos preferences are better aligned with the receiverrsquos preferences (thealignment parameter ρ is greater)

(ii) the receiver is more reluctant to accept the proposal (the shift parameter t isgreater)

(iii) the sender is more uncertain about the receiverrsquos type (the stretch parameter σis greater) provided ρ ge 014

Theorem 3 In the canonical case a cutoff state of an optimal signal satisfies

(i) ωlowast (ρ t σ) is increasing in ρ

(ii) ωlowast (ρ t σ) is increasing in t

(iii) ωlowast (ρ t σ) is increasing in σ if ρ ge 0

The intuition for part (i) is that for a higher ρ the sender puts more weight onthe receiverrsquos utility so she optimally endows the receiver with a higher utility byproviding more information

The intuition for part (ii) is that for a higher t each type of the receiver has a greatercost of accepting the proposal so to persuade the same type of the receiver the senderneeds to increase E[ω|ω ge ωlowast] by expanding the full disclosure interval [0ωlowast]

The intuition for part (iii) is that for a higher σ receiver types are more spread out soto persuade the same mass of types the sender again needs to increase E[ω|ω ge ωlowast]

14The case ρ ge 0 is the practically relevant case where the senderrsquos utility is a weighted averageof the receiverrsquos action and utility 1

1+ρv(ω r) =1

1+ρ + ρ1+ρ (ω minus r)

14 KOLOTILIN MYLOVANOV ZAPECHELNYUK

4 Discrete State

In this section we assume that the state is a discrete random variable

Theorem 1prime Let V be S-shaped Then and only then for all discrete F

(A) an optimal signal is stochastic upper censorship whose cutoff pair (ωlowast qlowast) satisfiescondition (3)

(B) an optimal monotone partition is deterministic upper censorship whose cutoff pair(ωlowast

d qlowastd) satisfies ωlowast

d = ωlowast and qlowastd isin 0 1

In the canonical case we obtain the result analogous to Theorem 2

Theorem 2prime Consider the canonical case and let the density g be log-concave Thenand only then for all discrete F and all ρ

(A) an optimal signal is stochastic upper censorship

(B) an optimal monotone partition is deterministic upper censorship

Let us discuss how the comparative statics result (Theorem 3) changes In Section 32we argue that upper-censorship signals can be ordered by their censorship cutoffs Butwhen the state is discrete a censorship cutoff is described by a pair (ωlowast qlowast) where ωlowast

is a cutoff state and qlowast is the probability of revealing this cutoff state when it realizes

Consider two stochastic upper-censorship signals s1 and s2 with cutoff pairs (ωlowast1 q

lowast1)

and (ωlowast2 q

lowast2) Denote (ωlowast

1 qlowast1) ≽ (ωlowast

2 qlowast2) if ω

lowast1 gt ωlowast

2 or ωlowast1 = ωlowast

2 and qlowast1 ge qlowast2 Observethat s1 is more (Blackwell) informative than s2 if and only if (ωlowast

1 qlowast1) ≽ (ωlowast

2 qlowast2) This

comparison also applies to deterministic upper-censorship signals with the constraintthat qlowast1 and qlowast2 are in 0 1

With the above order on cutoff pairs Theorem 3 extend as follows

Theorem 3prime Consider the canonical case

(A) A cutoff pair of an optimal signal satisfies

(i) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in ρ

(ii) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in t

(iii) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in σ if ρ ge 0

(B) A cutoff pair of an optimal monotone partition satisfies15

(i) (ωlowastd (ρ t σ) q

lowastd (ρ t σ)) is increasing in ρ

(ii) (ωlowastd (ρ t σ) q

lowastd (ρ t σ)) is increasing in t

15Part (iii) also holds for an optimal monotone partition when for example the discrete supportof the state ω is a regular grid with sufficiently many points However it is possible to construct acounterexample where part (iii) fails for an optimal monotone partition

CENSORSHIP AS OPTIMAL PERSUASION 15

That is the sender optimally discloses more information when she is less biasedrelative to the receiver (the alignment parameter ρ is greater) when the receiver ismore reluctant to accept the proposal (the shift parameter t is greater) and underScenario A when the sender is more uncertain about the receiverrsquos type (the stretchparameter σ is greater)

Remark 2 Theorem 1prime and Theorem 2prime show that upper censorship emerges as anoptimal persuasion strategy under the same conditions (V is S-shaped in Theorem1prime and g is log-concave in Theorem 2prime) regardless of whether the sender can choosean arbitrary signal or is constrained to choose a monotone partition It may betempting to conjecture that if the sender was allowed to choose an arbitrary partition(not necessarily monotone) the same result would hold This however is not true

For illustration suppose that the sender wishes to maximize the probability of a = 1and the receiver has type r = 1

2with certainty16 Let the state ω take values in

0 14 1 with probabilities (1

6 23 16) so the prior mean is 1

3

An optimal signal is stochastic upper censorship with cutoff pair (ωlowast qlowast) = (14 12)

which sends the pooling message with certainty at ω = 1 and with probability 1minusqlowast =12at ω = 1

4 and reveals the state otherwise The pooling message induces the posterior

mean m = 12 leading to a = 1 all other messages induce posterior means below 1

2

leading to a = 0 Under the optimal signal action a = 1 is chosen with probability23middot 12+ 1

6= 1

2

An optimal monotone partition is deterministic upper censorship that fully revealsthe state where only ω = 1 leads to a = 1 Under this partition action a = 1 ischosen with probability 1

6

An optimal partition pools the extreme states ω = 0 and ω = 1 and reveals theintermediate state ω = 1

4 The pooling message induces the posterior mean m = 1

2

leading to a = 1 This optimal partition is nonmonotone Under this partitionaction a = 1 is chosen with probability 1

6+ 1

6= 1

3 which is strictly better than the

probability of 16obtained under the optimal monotone partition

5 Application to Media Censorship

In this section we apply our results to the problem of media censorship by thegovernment In the modern world people obtain information about the governmentrsquosstate through various media sources such as television newspapers and internetblogs Without the media most people would not know what policies and reformsthe government pursues and how effective they are Media outlets have differentpositions on the political spectrum and differ substantially in how they select andpresent facts to cover the same news People choose their sources of information based

16In this case V (m) = 1 if m ge 12 and V (m) = 0 if m lt 1

2 Observe that the graph of V is

S-shaped and it can be approximated by a continuously differentiable S-shaped function

16 KOLOTILIN MYLOVANOV ZAPECHELNYUK

on their political ideology and socioeconomic status This information is valuable forsignificant individual decisions on migration investment and voting to name a fewIndividuals may not fully internalize externalities that their decisions impose on thesociety Likewise the government may not have the societyrsquos best interest at heartTo further its goals the government then wishes to influence individual decisions bymanipulating the information through media In autocracies and countries with weakchecks and balances the government has power to censor the media content

The governmentrsquos problem of media censorship can be represented as the persua-sion problem in Section 2 We apply our results to provide conditions for the op-timality of upper-censorship policies that censor all media outlets except the mostpro-government ones Furthermore we interpret our comparative statics results

51 Setup There is a continuum of heterogeneous citizens indexed by r isin [0 1]distributed with G Each citizen chooses between a = 0 and a = 1 The utility of acitizen of type r is given by

u(θ r ar a) = (θ minus r)ar + κ(θ r a)

where ar isin 0 1 denotes the citizenrsquos own action a =983125ardG(r) denotes the aggre-

gate action in the society θ isin [0 1] captures an unobserved benefit from action 1 ascompared to action 0 and κ captures the impact of the aggregate action a on thecitizenrsquos utility The term (θ minus r)ar is a private surplus of a citizen of type r Theterm κ(θ r a) is an externality because for a citizen of type r it is optimal to ignorethis term and choose ar = 1 if and only if θ ge r

There is a government which is concerned with a weighted average of the social utilityand the governmentrsquos intrinsic benefit from the aggregate action For a given θ thegovernmentrsquos utility is given by

983133 1

0

ν(θ r ar a)dG(r) + δγ(θ a)

The term ν captures a citizenrsquos utility from the governmentrsquos perspective We allow νto be different from u to reflect paternalistic or other concerns The term γ capturesthe governmentrsquos intrinsic benefit from the aggregate action The parameter δ ge 0captures the weight of the aggregate action in the governmentrsquos utility

Let T be a distribution of the random variable θ We assume that distributions G andT are independent and admit continuously differentiable and strictly positive densi-ties We also assume that κ ν and γ are linear in θ and continuously differentiablein r and a Furthermore to simplify interpretations we assume that ν and γ arenon-decreasing in θ and a so that for the government a high θ is a good news anda higher aggregate action is preferable

Citizens obtain information about the unobservable benefit θ through media outletsEach media outlet is identified by its editorial policy c isin [0 1] and it endorses action

CENSORSHIP AS OPTIMAL PERSUASION 17

a = 1 if θ ge c and criticizes it if θ lt c17 A set C of media outlets is a finite subsetof [0 1]18

The governmentrsquos censorship policy is a set of the media outlets X sub C that arepermitted to broadcast so the rest of the media outlets are censored

The timing is as follows First the government chooses a set X sub C of permittedmedia outlets Second state θ is realized and each permitted media outlet endorsesor criticizes action a = 1 according to its editorial policy Finally each citizen observesmessages from all permitted media outlets updates his beliefs about θ and choosesan action

52 Discussion We now discuss interpretations of the key components of the mediacensorship application As in Gehlbach and Sonin (2014) there can be various inter-pretations of the citizenrsquos action a = 1 such as voting for the government supportinga governmentrsquos policy or taking an individual decision that benefits the governmentA citizenrsquos type r can be interpreted as his ideological position or preference param-eter A citizen who is more supportive of the government has a smaller r

A media outlet with a higher editorial policy c can be interpreted as being less loyalto the government because it criticizes the government on a larger set of states Aneditorial policy c isin C can therefore represent a slant or political bias of the outletagainst the government and can be empirically measured as the frequency with whichthe outlet uses anti-government language Gentzkow and Shapiro (2010) constructsuch a slant index for US newspapers Empirical findings of their paper suggestthat the editorial policies of media outlets are driven by reader preferences justifyingour assumption of the existence of a large variety of editorial policies19 As in Suen(2004) Chan and Suen (2008) and Chiang and Knight (2011) the assumption ofthe binary media reports that communicate only whether the state θ is above somestandard c can be justified by a cursory readerrsquos preference for simple messages suchas positive or negative opinions and yes or no recommendations

The governmentrsquos censorship of media outlets can take various forms For examplethe government can ban access to internet sites withdraw licenses disrupt financingconfiscate print materials and equipment discredit media providers and even arresteditors and journalists using broadly formulated legislation on combating extremismIn some countries the government can exercise direct control over media editorialpolicies either through state ownership or administrative pressure

17The tie-breaking in the event of θ = c is unimportant as θ is a continuous random variable18We discuss the case of a continuum of media outlets in Section 5519Theoretical literature has explored the determinants of media slant of an outlet driven by its

citizens (Mullainathan and Shleifer 2005 Gentzkow and Shapiro 2006 and Chan and Suen 2008)and its owners (Baron 2006 and Besley and Prat 2006)

18 KOLOTILIN MYLOVANOV ZAPECHELNYUK

53 Formulating Media Censorship as Persuasion We now show that the me-dia censorship problem can be formulated as a linear persuasion problem in whichthe government is a sender and a representative citizen is a receiver

In our model the citizensrsquo and the governmentrsquos utilities are linear in θ There-fore given any information from the media outlets the utilities depend only on theposterior mean of θ

Consider an arbitrary posterior mean of θ denoted by m isin [0 1] Each citizen of typer chooses ar = 1 if and only if r le m The externality term κ plays no role in thisdecision Therefore the aggregate action a is simply the mass of all citizens whosetypes do not exceed m so a = G(m)

Next using the citizensrsquo optimal behavior we derive the governmentrsquos expected utilityconditional on the posterior mean m

V (m) = E983063983133 1

0

ν(θ r ar G(m))dG(r) + δγ(θ G(m))

983055983055983055983055m983064

=

983133 1

0

ν(m r1rlem G(m))dG(r) + δγ(mG(m)) (6)

This is the governmentrsquos indirect utility As in Section 2 the government now choosesa signal which is informative about the state to maximize its expected utility How-ever in contrast to Section 2 here the government is restricted to signals that areimplementable by a subset of a given set of media outlets

Next we show that with an appropriate definition of the state the restriction tosignals implementable by a subset of a given set of media outlets can be formulatedas a restriction to monotone partitional signals as in Scenario B in Section 2 Thuswe will be able to use our results in Section 4 to find the optimal censorship policies

Recall that the information about θ is only available through the media outlets Let ωdenote a random variable equal to the conditional expectation of θ given the messagesof all media outlets in C Note that as C is finite ω is a discrete random variablewith values in [0 1] Let F denote the distribution of ω From now on we treat therandom variable ω as the state and denote by s a signal as defined in Section 2

Let HM denote the set of all distributions of the posterior mean state induced byall monotone partitional signals and let HC denote the set of all distributions ofthe posterior mean state induced by all finite subsets of the set C of media outletsObserve that any set of media outlets induces a monotone partitional signal thatis HC sub HM We now show that every outcome implementable by an arbitrarymonotone partition in HM is also implementable by a subset of media outlets

Lemma 2 HC = HM

For illustration suppose that there is only one media outlet with the editorial policyc isin (0 1) There are two observable events θ ge c and θ lt c In each realized event

CENSORSHIP AS OPTIMAL PERSUASION 19

the citizens compute the posterior expectation of θ which we will call ω So in thisexample the distribution F of ω has two mass points one for each of the two events

Given these two mass points every monotone partition of [0 1] induces one of twopossible distributions of the posterior mean state (i) the mass points revealed and (ii)the mass points pooled Observe however that these distributions are implementableby permitting and censoring the media outlet respectively

A censorship policy is upper censorship if it censors all sufficiently disloyal mediaoutlets Specifically there exists a cutoff clowast isin C and an indicator qlowast isin 0 1 suchthat all media outlets whose editorial policies are below clowast are permitted all mediaoutlets whose editorial policies are above clowast are censored and clowast is permitted if andonly if qlowast = 1 That is X = c isin C c le clowast if qlowast = 1 and X = c isin C c lt clowastif qlowast = 0 Note that the full censorship policy clowast = 0 and qlowast = 0 (where all mediaoutlets are censored) and the free media policy clowast = 1 and qlowast = 1 (where all mediaoutlets are permitted) are the two extreme upper-censorship policies

As shown above the media censorship problem is equivalent to the linear persuasionproblem in which the sender is restricted to monotone partitional signals and thesenderrsquos indirect utility V is given by (6) We thus apply Theorem 1prime(B) to obtainthe following result

Theorem 1primeprime If V is S-shaped then an optimal censorship policy is upper censorship

To illustrate Theorem 1primeprime suppose the government is interested only in the aggregateaction so that ν = 0 and γ depends only on the aggregate action a so V (m) =γ(G(m)) Thus an upper-censorship policy is optimal if the composition functionγ(G(middot)) is S-shaped For example this condition holds if the government is interestedin reaching a certain approval threshold (eg a simple majority) so that γ is a stepfunction This condition also holds if γ is S-shaped and G is uniform or S-shapedwith the same inflection point as γ20

54 Canonical Case We now impose more structure on the utilities to obtain asharper result for the optimality of upper censorship and to perform a comparativestatics analysis Analogously to assumption (A1) we assume that the governmentrsquosutility is a weighted average of the citizensrsquo utility and their aggregate action

983133 1

0

u(θ r ar a)dG(r) + δa

where u(θ r ar a) = (θ minus r)ar + ζ(r)a

(A2)

for some continuously differentiable function ζ Define

β =

983133 1

0

ζ(r)dG(r) + δ

20This condition holds in many special cases where γ and G are S-shaped even when γ and Ghave different inflection points

20 KOLOTILIN MYLOVANOV ZAPECHELNYUK

The term983125 1

0ζ(r)dG(r) is the governmentrsquos expected bias towards the citizensrsquo action

a = 1 due to the citizensrsquo externality and the term δ is the governmentrsquos intrinsicbias towards a greater aggregate action a Thus we interpret β as the governmentrsquosaggregate bias This decomposition of the bias allows for different interpretations whythe government is biased So β can be high because the government is too self-serving(high δ) or because the government is benevolent (low δ) but wishes to internalizestrong positive externalities of the citizens (high ζ(r)) To ease interpretations weassume that β gt 021

Under Assumption (A2) the governmentrsquos indirect utility V given by (6) becomes

V (m) =

983133 m

0

(mminus r)dG(r) + βa =

983133 m

0

v(m r)dG(r)

where

v(m r) = β

9830611 +

1

β(mminus r)

983062

This is the same as v given by Assumption (A1) with ρ = βminus1 up to rescaling by apositive constant Consequently we can apply Theorem 2prime(B) to obtain the followingresult

Theorem 2primeprime Let (A2) hold If the density g of citizens types is log-concave then anoptimal censorship policy is upper censorship

We now apply the comparative statics analysis presented in Section 32 Note thatthe upper-censorship policies are ordered according to the amount of informationtransmitted to the citizens in the sense of Blackwell (1953) A greater censorshipthreshold clowast means that more media outlets are permitted With this order in mindwe apply Theorem 3prime(B) to make a comparative statics analysis on the amount ofinformation that is optimally disclosed by the government

First the censorship cutoff is increasing in the alignment parameter ρ Recall thatthis is a reciprocal of the governmentrsquos bias ρ = βminus1 This means that the governmentoptimally discloses more information (the censorship cutoff is greater) when it is lessbiased (β is smaller) Intuitively as β decreases the government puts more weighton the citizensrsquo utility so it optimally endows the population with a higher utility bycensoring fewer media outlets and disclosing more information

Second the censorship cutoff is increasing in the magnitude t of the horizontal shiftof the density g A greater t corresponds to a greater opportunity cost of actiona = 1 for each citizen This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more difficultto persuade to take action a = 1 (parameter t is greater) Informally speaking topersuade the same type of the citizen the government needs to increase the posterior

21If β lt 0 then swapping the roles of a = 0 and a = 1 reverses the sign of the bias If β = 0then the governmentrsquos utility and the citizensrsquo private interests coincide so it is trivially optimal topermit all media outlets

CENSORSHIP AS OPTIMAL PERSUASION 21

mean But the expected posterior mean must be equal to the prior mean so it is notpossible to increase all posterior means Due to the log-concave shape of the densityof the citizensrsquo types this tradeoff is resolved by increasing the posterior mean of thepooling interval E[ω|ω ge clowast] This is done by shrinking the pooling interval [clowast 1]that is increasing the censorship cutoff clowast

Third when the number of media outlets is sufficiently large the censorship cutoffis increasing in the stretch factor σ of the density g around its peak τ (see Footnote15) A greater σ corresponds to a more dispersed distribution of the opportunitycost in the population This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more diverse inhow difficult they are to persuade Intuitively when the types are more spread outpersuading the same mass of types requires a greater posterior mean which leads toa higher censorship cutoff similarly to the comparative statics in t discussed above

55 Extensions and Open Questions Let us now consider a few extensions ofour model of media censorship

In our model the set of media outlets is exogenous and the governmentrsquos only in-strument is censorship We now consider three alternative ways of expanding thegovernmentrsquos instruments of influence

First suppose that the government is able not only to censor existing media outletsbut also to introduce new media outlets with chosen editorial policies This is equiv-alent to our censorship model where all media outlets in [0 1] are initially availableand the government can censor any subset of them When all media outlets in [0 1]are permitted the revealed information about θ (which we call ω) is θ itself Thusω = θ is a continuous random variable with distribution F = T So we can nowapply our results for the continuous state from Section 3 instead of those for the dis-crete state in Section 4 reaching the same conclusion about the optimality of uppercensorship

Second suppose that the government can garble information available from mediaoutlets That is the government is not restricted to monotone partitions it cancreate arbitrary signals about state ω for the citizens to observe This becomes ageneral persuasion problem and our Theorems 1prime(A) 2prime(A) and 3prime(A) apply

Third suppose that the government is able to restrict not only which media outletsare permitted but also how many media outlets each citizen can choose to observeIn this extension the citizens are not allowed to communicate with one another (oth-erwise they could share the information thus observing all permitted media outletsindirectly) This extension does not affect our results as long as each citizen is al-lowed to access at least one media outlet of his choice as in Chan and Suen (2008)Intuitively this is because each citizen categorizes the information from the mediaoutlets into ldquogood newsrdquo where a = 1 is optimal and ldquobad newsrdquo where a = 0 is

22 KOLOTILIN MYLOVANOV ZAPECHELNYUK

optimal Because the information from the media outlets induces a monotone parti-tion it means that ldquogood newsrdquo is separated from ldquobad newsrdquo by a threshold mediaoutlet that depends on the citizenrsquos type That is it is sufficient to observe a singlethreshold media outlet to distinguish ldquogood newsrdquo from ldquobad newsrdquo

In our model each citizenrsquos utility depends on the aggregate action a through theexternality term κ(θ r a) which does not affect the chosen action Let us relax thisassumption so that a citizenrsquos optimal choice can depend on a We can still writethe senderrsquos indirect utility V as a function of the posterior mean state and applyour results However V is no longer uniquely determined by the primitives of themodel It is now endogenous and depends on an equilibrium the citizens play aseach citizenrsquos optimal action now depends on what all citizens do in equilibrium Forexample given the same information about the state a citizen could prefer to choosea = 1 if and only if sufficiently many citizens choose the same action so a is largeenough This creates the problem of multiplicity of equilibria and as a consequencethe dependence of optimal censorship on equilibrium selection

As a side our media censorship problem can also be applied to spatial voting mod-els as in Chiang and Knight (2011) Consider a government party (p = G) and anopposition party (p = O) competing in an election If party p wins a voter withan ideological position r gets utility wp minus (r minus rp)

2 where wp is the quality or va-lence of party p and rp is the ideology or policy platform of party p Voters knowthe partiesrsquo ideologies and obtain information about the partiesrsquo qualities from allavailable media outlets Each voter supports the party that maximizes his expectedutility Within this context our analysis still applies because the voterrsquos utility dif-ference between the government and opposition parties is proportional to θminusr whereθ = (wG minus wO + r2O minus r2G)2(rO minus rG)

There are a few more extensions that can be relevant in applications First insteadof complete censorship of a media outlet there can be a cost of accessing it Forexample an international news channel can be censored by a local government butcitizens may still access it through VPN at some cost Second it can be costly forthe government to censor media outlets So another important question to answeris how a government may prioritize censoring Finally citizens can incur some costof following each media outlet While we have already mentioned that citizens gainno benefit from following more than one outlet it is entirely possible for them tostop watching news altogether if it is sufficiently uninformative These extensions arenontrivial and left for future research

Appendix

Proof of Theorem 1 Suppose that V is S-shaped For mωlowast isin [0 1] and mlowast(ωlowast) =E[ω|ω ge ωlowast] define

V (mωlowast) = V (mlowast(ωlowast)) + V prime(mlowast(ωlowast))(mminusmlowast(ωlowast)) (7)

CENSORSHIP AS OPTIMAL PERSUASION 23

Since V (m) is convex on [0 τ ] and concave on [τ 1] there exists ωlowast isin [0 τ ] thatsatisfies condition (3) that is

V (mωlowast) ge (le)V (m) for all m ge (lt)ωlowast (8)

This is true because V (ωlowastωlowast) minus V (ωlowast) is single-crossing from below To see thisconsider Figure 1 Observe that (3) can hold only if ωlowast is on the convex part of V andmlowast is on the concave part of V As the censorship cutoff ωlowast increases the posteriormean state of the pooling message mlowast(ωlowast) also increases But because V is concaveat mlowast(ωlowast) the dashed tangent line V (mωlowast) becomes flatter in m so it crosses thesolid line V (m) at a smaller m

Consider an arbitrary signal s Let H be the distribution of m = E[ω|s] inducedby signal s Since each signal is a garbling of the fully informative signal F is amean-preserving spread of H Thus the senderrsquos expected utility is smaller undersignal s than under the upper-censorship signal with cutoff ωlowast as follows from

983133 1

0

V (m)dH(m) le983133 1

0

maxV (m) V (mωlowast)dH(m)

le983133 1

0

maxV (m) V (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mlowast(ωlowast))dF (m)

where the second line holds because maxV (m) V (mωlowast) is convex in m and F isa mean-preserving spread of H the third line holds by (8) and the last line holds by

(7) and by983125 1

ωlowast(mminusmlowast(ωlowast))dF (m) = 0

Conversely suppose that V is not S-shaped Then there exist 0 le m1 lt m2 lem3 lt m4 le 1 such that V primeprime(m) lt 0 for m isin (m1m2) V

primeprime(m) = 0 for m isin [m2m3]and V primeprime(m) gt 0 for m isin (m3m4) because V primeprime is continuous by assumption It isstraightforward to show that there exists 983141mlowast isin (m1m2) (sufficiently close to m2) and983141ωlowast isin (m3m4) such that

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) lt V (m) for all m isin (983141ωlowastm4]

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) gt V (m) for all m isin [m1 983141ωlowast] 983141mlowast

Moreover there exists a continuous distribution F such that the support of F is asubset of [m1m4] and 983141mlowast = E[ω|ω le 983141ωlowast] Using a chain of inequalities similar to theone above it is easy to show that lower-censorship signal 983141slowast with cutoff 983141ωlowast is optimalFurthermore it is uniquely optimal as both inequalities hold with equalities only forsignal 983141slowast This in turn implies that any upper-censorship signal is suboptimal

24 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Proof of Lemma 1 Notice that under assumption (A1) we have

V primeprime(m) = gprime(m) + ρg(m) for m isin [0 1]

Recall that V is S-shaped if and only if V primeprime(m) = gprime (m) + ρg (m) is single-crossingfrom above Using Proposition 1 in Quah and Strulovici (2012) it is easy to showthat V primeprime(m) is single-crossing from above for all ρ isin R if and only if gprime (m)g (m) isnonincreasing in m (that is ln g(m) is concave)

Proof of Theorem 2 Immediate by Theorem 1 and Lemma 1

Proof of Theorem 3 Part (i) We fix t and σ and vary ρ Without loss ofgenerality let t = 0 and σ = 1 (so that Gtσ = G) Consider a signal that induces thedistribution H of posterior mean m This signal implements the receiverrsquos interimutility U given by

U(r) =

983133 1

r

(mminus r)dH(m) for r isin [0 1]

which holds because the receiver acts if and only if m ge r Since types r lt 0 alwayschoose a = 1 and types r gt 1 always choose a = 0 we can exclude them fromconsideration and assume that supp(G) = [0 1] The senderrsquos expected utility is then

983133 1

0

V (m)dH(m) =

983133 1

0

983133 m

0

(1 + ρ(mminus r))dG(r)dH(m)

=

983133 1

0

983133 1

r

(1 + ρ(mminus r))dH(m)dG(r)

=

983133 1

0

(1minusH(r) + ρU(r))dG(r)

Consider ρ2 gt ρ1 Suppose by contradiction that the corresponding optimal upper-censorship signals s2 and s1 are such that s1 is strictly more informative than s2 LetU2 and U1 be the receiverrsquos interim utilities implemented by s2 and s1 Since thesender prefers signal s2 under ρ2 and signal s1 under ρ1 we have

983133 1

0

(1minusH2 (r) + ρ2U2 (r)) dG (r) ge983133 1

0

(1minusH1 (r) + ρ2U1 (r)) dG (r)

983133 1

0

(1minusH1 (r) + ρ1U1 (r)) dG (r) ge983133 1

0

(1minusH2 (r) + ρ1U2 (r)) dG (r)

Summing up these inequalities gives

(ρ2 minus ρ1)

983133 1

0

(U2 (r)minus U1 (r)) dG (r) ge 0

leading to a contradiction because U1(r) ge U2(r) for all r with strict inequality forsome r given that s1 is strictly more informative than s2

CENSORSHIP AS OPTIMAL PERSUASION 25

Part (ii) We fix ρ isin R and σ = 1 and vary t The indirect utility of the senderconditional on m isin [0 1] is given by

Vt(m) =

983133

rlem

(1+ρ(mminusr))gt1(r)dr =

983133

rlemminust

(1+ρ(mminustminusr))g(r)dr = V (mminust) (9)

By Theorem 1 when t0 = 0 an optimal cutoff ωlowast0 satisfies

V (m) le V (mlowast0) + V prime(mlowast

0)(mminusmlowast0) for m isin [ωlowast

0 1]

When V is S-shaped and t1 lt 0 it is easy to see that

V (mminus t1) le V (mlowast0 minus t1) + V prime(mlowast

0 minus t1)(mminusmlowast0) for m isin [ωlowast

0 1]

which implies that when t = t1 the above inequality changes its sign at some ωlowast1

that satisfies ωlowast1 le ωlowast

0 It follows from (9) that ωlowast1 is an optimal cutoff for t = t1 and

as we established above ωlowast1 le ωlowast

0

Part (iii) We fix ρ ge 0 and t = 0 and vary σ The indirect utility of the senderconditional on m isin [0 1] is given by

V ρσσ(m) =

983133

rlem

9830591 +

ρ

σ(mminus r)

983060dG0σ(r)

=

983133

rlem

9830591 +

ρ

σ(mminus r)

983060g

983061r minus (1minus σ)τ

σ

9830621

σdr

=

983133

rlemminus(1minusσ)τσ

9830611 + ρ

983061mminus (1minus σ)τ

σminus r

983062983062g(r)dr

= Vρ1

983061mminus (1minus σ)τ

σ

983062 (10)

Let σ0 = 1 σ1 lt 1 and ρ1 = ρσ1 To compare the optimal cutoffs under parameters(ρ σ0) and (ρ σ1) we consider two steps first comparing the cutoffs between (ρ σ0)and (ρ1 σ1) and then comparing the cutoffs between (ρ1 σ1) and (ρ σ1)

First we compare an optimal cutoff ωlowast0 under (ρ σ0) = (ρ 1) and an optimal cutoff

ωlowast1 under (ρ1 σ1) By Theorem 1 ωlowast

0 satisfies

Vρ1(m) le Vρ1(mlowast0) + V prime

ρ1(mlowast0)(mminusmlowast

0) for m isin [ωlowast0 1]

When V is increasing (implied by ρ ge 0) and S-shaped and σ1 lt 1 it is easy to seethat for all m isin [ωlowast

0 1]

Vρ1

983061mminus (1minus σ1)τ

σ1

983062le Vρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062+ V prime

ρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062(mminusmlowast

0)

which implies that when σ = σ1 lt 1 the above inequality changes its sign at someωlowast1 that satisfies ωlowast

1 le ωlowast0 It follows from (10) that ωlowast

1 is an optimal cutoff under(ρσ σ1) = (ρ1 σ1) and as we established above ωlowast

1 le ωlowast0

26 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Second we compare the optimal cutoff ωlowast1 under (ρ1 σ1) and the optimal cutoff ωlowast

2

under (ρ σ1) Because ρ lt ρ1 = ρσ1 by part (i) we have ωlowast2 le ωlowast

1

Combining ωlowast1 le ωlowast

0 and ωlowast2 le ωlowast

1 we obtain ωlowast2 le ωlowast

0

Proof of Theorem 1prime Part (A) The proof is analogous to that of Theorem 1 andis thus omitted

Part (B) The only if statement follows from that in part (A) (see the proof of Theorem1) and the fact that we can always ensure that an optimal stochastic censorship(ωlowast qlowast) is deterministic (so qlowast isin 0 1) by the choice of the support and distributionof ω Thus we only need to prove the if statement

Let F be a distribution of ω with support on n values 0 lt ω0 lt lt ωnminus1 lt 1 Thisis without loss of generality because if 0 andor 1 were in the support of F we couldredefine the state as ω = (ω + ε)(1 + 2ε) and the type as r = (r + ε)(1 + 2ε)

Let X sub ω1 ωnminus1 identify a set of cutoffs defining a monotone partition for eachωi isin X the receiver is informed whether ω lt ωi or ω ge ωi Note that we exclude ω0

from the set of cutoffs because ω ge ω0 always holds

Note that X is deterministic upper censorship if and only if there exists a censorshipthreshold ωlowast isin [0 1] such that for each i = 1 nminus 1 ωi isin X if and only if ωi lt ωlowastthat is we can assume that qlowast = 0 without loss of generality

Let VX be the senderrsquos expected utility when the monotone partition is given by X

If n = 2 then there are only two partitions X = ω1 (full disclosure) and X = empty(no disclosure) Both are upper censorships so part (B) holds trivially

If n = 3 then there are four monotone partitions X = ω1ω2 (full disclosure)X = ω2 X = ω1 and X = empty (no disclosure) Only X = ω2 is not uppercensorship The next lemma proves that X = ω2 cannot be uniquely optimal whenV is S-shaped thus proving the optimality of upper censorship for n = 3

Denote

mjk = E983045ω|ω isin [ωjωk]

983046

For convenience of notation we write mj = ωj

Lemma 3 Let V be S-shaped with an inflexion point τ Let ω be a discrete randomvariable with support on m0m1m2 and probabilities p0 p1 p2 where 0 lt m0 ltm1 lt m2 lt 1 Then

(i) if τ le m0 then Vm2 le Vempty

(ii) if τ ge m1 then Vm2 le Vm1m2

(iii) if m0 lt τ lt m1 then Vm2 le maxVempty Vm1m2

CENSORSHIP AS OPTIMAL PERSUASION 27

Proof Define p01 = p0 + p1 and p02 = p01 + p2 = 1 Observe that

m01 =p0m0 + p1m1

p0 + p1and m02 =

p0m0 + p1m1 + p2m2

p0 + p1 + p2

We thus have

Vempty = p02V (m02) = (p01 + p2)V

983061p01m01 + p2m2

p01 + p2

983062

Vm2 = p01V (m01) + p2V (m2) = (p0 + p1)V

983061p0m0 + p1m1

p0 + p1

983062+ p2V (m2)

Vm1m2 = p0V (m0) + p1V (m1) + p2V (m2)

Clearly

m0 lt m01 lt m1 and m0 lt m02 lt m2

Therefore

Vm2 le Vempty lArrrArrp01

p01 + p2V (m01) +

p2p01 + p2

V (m2) le V

983061p01

p01 + p2m01 +

p2p01 + p2

m2

983062= V (m02)

(11)

and

Vm2 le Vm1m2 lArrrArr

V

983061p0

p0 + p1m0 +

p1p0 + p1

m1

983062le p0

p0 + p1V (m0) +

p1p0 + p1

V (m1)(12)

Suppose that τ le m01 so V is concave on [m01m2] Then (11) holds by Jensenrsquosinequality In particular it holds for τ le m0 lt m01 thus proving part (i)

Suppose that τ ge m1 so V is convex on [m0m1] Then (12) holds by Jensenrsquosinequality thus proving part (ii)

Finally suppose that m01 lt τ lt m1 Let (τ V (τ)) lie below (or on) the straight lineA connecting (m0 V (m0)) and (m2 V (m2)) as illustrated on Figure 2 By convexityof V (m) for m lt τ it must be the case that V (m01) lies below (or on) the dashedline B By concavity of V (m) for m gt τ it must be the case that V (m1) lies above(or on) the dashed line C Consequently (12) is satisfied

Alternatively let (τ V (τ)) lie above the straight line A connecting (m0 V (m0)) and(m2 V (m2)) as illustrated on Figure 3 If Vm2 le Vm1m2 then the proof of part(iii) is complete So suppose that Vm2 gt Vm1m2 so (12) is violated

By convexity of V (m) for m lt τ it must be the case that V (m01) lies below (or on)the line B Since (12) is violated V (m01) must lie between the lines A and B Recallthat m01 lt m02 lt m2 Observe that V (m02) must be in the shaded area on Figure 3Indeed if m02 lt τ then by convexity of V (m) for m lt τ it must be the case thatV (m02) lies between (or on) the lines B and D If m02 ge τ then by concavity of

28 KOLOTILIN MYLOVANOV ZAPECHELNYUK

τ m2m0

A C

B

m01 m1

V (m0) V (τ)

V (m2)

Figure 2 The case where (τ V (τ)) lies below (or on) the straight lineconnecting (m0 V (m0)) and (m2 V (m2))

τ m2m0

A

C

B

m01

V (m0)

V (τ)

V (m2)

D

V (m01)

Figure 3 The case where (τ V (τ)) lies above the straight line con-necting (m0 V (m0)) and (m2 V (m2))

V (m) for m gt τ it must be the case that V (m02) lies above (or on) line C It followsthat (11) is satisfied

We now adapt Lemma 3 to prove the optimality of upper censorship for n ge 4

CENSORSHIP AS OPTIMAL PERSUASION 29

Fix a monotone partition X sub ω1 ωnminus1 that is not upper censorship that isthere exists i isin 2 nminus 1 such that

ωiminus1 ∕isin X and ωi isin X (13)

Denote ωn = 1 Define j and k as follows If X cap ω1 ωiminus2 ∕= empty let ωj be thelargest element inX smaller than i otherwise let ωj = ω0 IfXcapωi+1 ωnminus1 ∕= emptylet ωk be the smallest element in X greater than i otherwise let ωk = ωn = 1 Define

m0 = ωj m1 = ωiminus1 m2 = ωi

and

p0 =Pr[ω isin [ωjωiminus1)]

Pr[ω isin [ωjωk)] p1 =

Pr[ω isin [ωiminus1ωi)]

Pr[ω isin [ωjωk)] p2 =

Pr[ω isin [ωiωk)]

Pr[ω isin [ωjωk)]

To apply Lemma 3 we identify Xωi with empty X with m2 and X cup ωiminus1 withm1m2 By Lemma 3 VX le maxVXωi VXcupωiminus1 Thus replacing X withXωi if VX le VXωi and with X cup ωiminus1 otherwise weakly improves the senderrsquosexpected utility Repeatedly applying such replacements yields upper censorshipIndeed this procedure terminates after a finite number of replacements because byLemma 3 for ωj ge τ we replace X with Xωi and for ωiminus1 le τ we replace Xwith X cup ωiminus1

We thus have shown that when V is S-shaped when optimizing over the set ofmonotone partitions we can restrict attention to deterministic upper censorship Itremains to show that an optimal deterministic censorship cutoff ωlowast

d coincides with anoptimal stochastic censorship cutoff ωlowast from part (A) To see this it is straightforwardto show that

V lowast(ωlowast qlowast) =

983133

[0ωlowast)

V (ω)dF (ω) + V (ωlowast)qlowast Pr[ω = ωlowast]

+ V (mlowast(ωlowast qlowast))((1minus qlowast) Pr[ω = ωlowast] + Pr[ω gt ωlowast]) (14)

is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order Since the two de-terministic censorship cutoff pairs closest to (ωlowast qlowast) in the Blackwell informativenessorder are (ωlowast 0) and (ωlowast 1) we obtain ωlowast

d = ωlowast

Proof of Theorem 2prime Immediate by Theorem 1prime and Lemma 1

Proof of Theorem 3prime The proof for an optimal signal is analogous to that ofTheorem 3 and is thus omitted The same applies to part (i) for an optimal monotonepartition Thus it remains to prove part (ii) for an optimal monotone partition

We fix ρ and σ = 1 and vary t continuously If ωlowast isin supp(F ) then an optimalmonotone partition is an optimal signal because all qlowast isin [0 1] are optimal Thuspart (ii) in (B) holds by part (ii) in (A) If ωlowast isin supp(F ) and qlowast isin (0 1) then either(ωlowast

d qlowastd) = (ωlowast 0) or (ωlowast

d qlowastd) = (ωlowast 1) is optimal because the senderrsquos expected utility

V lowast given by (14) is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order

30 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Therefore it suffices to consider values of t at which the sender is indifferent between(ωlowast 0) and (ωlowast 1) and to show that the sender strictly prefers (ωlowast 1) to (ωlowast 0) at aslightly higher t

V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) =rArr dV lowastt (ω

lowast 0)

dtle dV lowast

t (ωlowast 1)

dt (15)

The condition V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) can be written as

Vt(mlowast(ωlowast 0)) Pr[ω ge ωlowast] = Vt(ω

lowast) Pr[ω = ωlowast] + Vt(mlowast(ωlowast 1)) Pr[ω gt ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)Pr[ω = ωlowast]

Pr[ω ge ωlowast]+ Vt(m

lowast(ωlowast 1))Pr[ω gt ωlowast]

Pr[ω ge ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + Vt(mlowast(ωlowast 1))

mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Analogously the conditiondV lowast

t (ωlowast0)dt

le dV lowastt (ωlowast1)dt

can be written as

V primet (m

lowast(ωlowast 0)) ge V primet (ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + V primet (m

lowast(ωlowast 1))mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Thus (15) states that if Vt(mlowast(ωlowast 0)) is equal to the convex combination of Vt(ω

lowast)and Vt(m

lowast(ωlowast 1)) then the derivative V primet (m

lowast(ωlowast 0)) is greater than the convex com-bination of the derivatives V prime

t (ωlowast) and V prime

t (mlowast(ωlowast 1)) When V is S-shaped it is easy

to see that this property is satisfied

Proof of Lemma 2 Observe that a signal induced by any set X sub C of permittedmedia outlets induces a monotone partition of [0 1] and thus HC sub HM Indeed letcX(ω) and cX(ω) be the media outlets in X that are the closest to ω from above andfrom below

cX(ω) = sup983059c isin X c le ω

983052cup 0

983060and cX(ω) = inf

983059c isin X c gt ω cup 1

983060

By observing the media outlets in X each citizen is informed that ω isin [cX(ω) cX(ω))

Conversely let the set C of media outlets be c0 lt middot middot middot lt cn Without loss of generalityassume that C contains the two uninformative media outlets c0 = 0 and cn = 1 which(almost) always endorse and criticize action a = 1 respectively Then the distributionF of the state ω is a discrete distribution whose support is Ω = ω0 ωnminus1 whereωi = E[θ|θ isin [ci ci+1)] for i isin 0 nminus 1 Consider a monotone partitional signalrepresented by a nondecreasing function ξ [0 1] rarr R For each realization m of thissignal that occurs with a strictly positive probability define

x(m) = min983153ω isin Ω ξ(ω) = m

983154and x(m) = max

983153ω isin Ω ξ(ω) = m

983154

This signal induces a monotone partition of the finite set Ω This partition of Ω canbe implemented by a censorship policy that prohibits a media outlet c isin C if andonly if c isin (x(m) x(m)) for some realization m

CENSORSHIP AS OPTIMAL PERSUASION 31

References

Alonso R and O Camara (2016a) ldquoPersuading Votersrdquo American EconomicReview 106 3590ndash3605

(2016b) ldquoPolitical Disagreement and Information in Electionsrdquo Games andEconomic Behavior 100 390ndash412

Bagnoli M and T Bergstrom (2005) ldquoLog-Concave Probability and Its Ap-plicationsrdquo Economic Theory 26 445ndash469

Baron D P (2006) ldquoPersistent Media Biasrdquo Journal of Public Economics 90(1)1ndash36

Besley T and A Prat (2006) ldquoHandcuffs for the Grabbing Hand MediaCapture and Government Accountabilityrdquo American Economic Review 96(3) 720ndash736

Blackwell D (1953) ldquoEquivalent Comparisons of Experimentsrdquo Annals of Math-ematical Statistics 24 265ndash272

Chan J and W Suen (2008) ldquoA Spatial Theory of News Consumption andElectoral Competitionrdquo Review of Economic Studies 75 699ndash728

Chiang C-F and B Knight (2011) ldquoMedia Bias and Influence Evidence fromNewspaper Endorsementsrdquo Review of Economic Studies 78 795ndash820

Duffie D P Dworczak and H Zhu (2017) ldquoBenchmarks in Search MarketsrdquoJournal of Finance 72 1983ndash2044

Dworczak P (2017) ldquoMechanism Design with Aftermarkets Cutoff Mecha-nismsrdquo mimeo

Dworczak P and G Martini (2019) ldquoThe Simple Economics of Optimal Per-suasionrdquo Journal of Political Economy 127 993ndash2048

Edmond C (2013) ldquoInformation Manipulation Coordination and RegimeChangerdquo Review of Economic Studies 80 1422ndash1458

Egorov G S Guriev and K Sonin (2009) ldquoWhy Resource-poor DictatorsAllow Freer Media A Theory and Evidence from Panel Datardquo American PoliticalScience Review 103 645ndash668

Gehlbach S and K Sonin (2014) ldquoGovernment Control of the Mediardquo Journalof Public Economics 118 163ndash171

Gentzkow M and E Kamenica (2016) ldquoA Rothschild-Stiglitz Approach toBayesian Persuasionrdquo American Economic Review Papers amp Proceedings 106597ndash601

Gentzkow M and J Shapiro (2006) ldquoMedia Bias and Reputationrdquo Journalof Political Economy 114(2) 280ndash316

(2010) ldquoWhat Drives Media Slant Evidence from US Daily NewspapersrdquoEconometrica 78(1) 35ndash71

Ginzburg B (2019) ldquoOptimal Information Censorshiprdquo Journal of EconomicBehavior and Organization 163 377ndash385

Goldstein I and Y Leitner (2018) ldquoStress Tests and Information DisclosurerdquoJournal of Economic Theory 177 34ndash69

32 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Guo Y and E Shmaya (2019) ldquoThe Interval Structure of Optimal DisclosurerdquoEconometrica 87 653ndash675

Johnson J P and D P Myatt (2006) ldquoOn the Simple Economics of Adver-tising Marketing and Product Designrdquo American Economic Review 96 756ndash784

Kamenica E and M Gentzkow (2011) ldquoBayesian Persuasionrdquo American Eco-nomic Review 101 2590ndash2615

Kolotilin A (2015) ldquoExperimental Design to Persuaderdquo Games and EconomicBehavior 90 215ndash226

(2018) ldquoOptimal Information Disclosure A Linear Programming Ap-proachrdquo Theoretical Economics 13 607ndash636

Kolotilin A and H Li (2019) ldquoRelational Communicationrdquo mimeoKolotilin A M Li T Mylovanov and A Zapechelnyuk (2015) ldquoPer-suasion of a Privately Informed Receiverrdquo mimeo

Kolotilin A T Mylovanov A Zapechelnyuk and M Li (2017) ldquoPer-suasion of a Privately Informed Receiverrdquo Econometrica 85 1949ndash1964

Kolotilin A and A Wolitzky (2019) ldquoAssortative Information Disclosurerdquomimeo

Lewis T R and D Sappington (1994) ldquoSupplying Information to FacilitatePrice Discriminationrdquo International Economic Review 35 309ndash327

Lorentzen P (2014) ldquoChinarsquos Strategic Censorshiprdquo American Journal of Polit-ical Science 58 402ndash414

Mullainathan S and A Shleifer (2005) ldquoThe Market for Newsrdquo AmericanEconomic Review 95(4) 1031ndash1053

Orlov D P Zryumov and A Skrzypach (2018) ldquoDesign of Macro-prudentialStress Testsrdquo mimeo

Ostrovsky M and M Schwarz (2010) ldquoInformation Disclosure and Unravelingin Matching Marketsrdquo American Economic Journal Microeconomics 2 34ndash63

Prat A and D Stromberg (2013) ldquoThe Political Economy of Mass MediardquoAdvances in Economics and Econometrics 2 135ndash187

Quah J and B Strulovici (2012) ldquoAggregating the Single Crossing PropertyrdquoEconometrica 80 2333ndash2348

Rayo L and I Segal (2010) ldquoOptimal Information Disclosurerdquo Journal ofPolitical Economy 118 949ndash987

Romanyuk G and A Smolin (2019) ldquoCream Skimming and Information Designin Matching Marketsrdquo American Economic Journal Microeconomics 11 250ndash276

Suen W (2004) ldquoThe Self-Perpetuation of Biased Beliefsrdquo Economic Journal 114377ndash396

Yamashita T (2018) ldquoOptimal Public Information Disclosure by Mechanism De-signerrdquo mimeo

Zapechelnyuk A (2019) ldquoOptimal Quality Certificationrdquo American EconomicReview Insights forthcoming

8 KOLOTILIN MYLOVANOV ZAPECHELNYUK

to maximize the probability that the receiver accepts the proposal so v(ω r) = 1 forall ω and all r

Observe that if the sender reveals no information about ω the receiver will evaluate ωby its expected value of 1

2 thus having utility 1

2minus r lt 0 from accepting the proposal

So revealing no information makes the receiver reject the proposal with certaintyThe sender can do better by fully revealing ω in which case the proposal is acceptedwith probability Pr[ω ge r]

However the sender can do even better by pooling states above r with states below rwhile keeping the expected state induced by the pooling signal at least r Thus thereceiver who is unable to distinguish between the states in the pool will accept theproposal at all of these states

To illustrate the case of the continuous state suppose that ω is uniformly distributedon [0 1] The largest pool that maintains the posterior expectation at least r is theinterval [2r minus 1 1] Indeed once the receiver learns that ω isin [2r minus 1 1] given theuniform prior of ω the posterior expected state is r So the proposal is accepted whenω ge 2rminus 1 and rejected when ω lt 2rminus 19 Thus the deterministic upper-censorshipsignal with cutoff ωlowast = 2r minus 1 is optimal The senderrsquos expected utility is equal to

Pr[proposal is accepted] = Pr[ω ge 2r minus 1] = 2(1minus r)

which is twice as high as that from the fully informative signal

Pr[proposal is accepted] = Pr[ω ge r] = 1minus r

When ω is a discrete random variable an optimal way to reveal information aboutω takes the form of stochastic upper censorship For illustration suppose that ω canonly be 0 or 1 equally likely Now simply pooling states is not helpful for the senderPooling 0 and 1 yields the posterior expected state of 1

2 which is smaller than the

type r resulting in a rejection of the proposal Including any other states in thepool makes no difference as these states never occur Thus an optimal monotonepartition is the fully informative signal and the probability that the receiver acceptsthe proposal is simply equal to the probability that ω = 1

Pr[proposal is accepted] = Pr[ω ge r] = Pr[ω = 1] =1

2

The sender can do strictly better by partial (stochastic) pooling If ω = 1 let thesignal realization be ldquoacceptrdquo with certainty If ω = 0 let the signal realization beldquorejectrdquo with some probability q and ldquoacceptrdquo with probability 1 minus q So when thereceiver observes ldquoacceptrdquo the posterior expected state is

E[ω|ldquoacceptrdquo] = Pr[ω = 1]

Pr[ω = 1] + Pr[ω = 0] middot (1minus q)=

12

12 + 12 middot (1minus q)=

1

2minus q

9Whether states strictly below ωlowast are revealed or pooled among themselves does not matterbecause they are all strictly below r and thus induce the receiver to take action a = 0

CENSORSHIP AS OPTIMAL PERSUASION 9

The sender now wishes to find the lowest q subject to 1(2 minus q) ge r which yieldsqlowast = 2 minus 1

r Thus stochastic upper censorship with ωlowast = 0 and qlowast = 2 minus 1

ris

optimal The probability that the receiver accepts the proposal is exactly equal tothe probability that the signal realization is ldquoacceptrdquo

Pr[proposal is accepted] = Pr[ω = 1] + Pr[ω = 0] middot (1minus qlowast) =1

2+

1

2middot 1minus r

r=

1

2r

We now summarize the insights obtained in this example If the state is continuousthen an optimal signal is deterministic upper censorship If the state is discrete thenan optimal monotone partition is deterministic upper censorship and an optimalsignal is stochastic upper censorship10 In the remainder of the paper we will showthat these insights extend to the case where the receiverrsquos type is uncertain

3 Continuous State

In this section we provide necessary and sufficient conditions for the optimality ofupper censorship when the state is continuous

Let m be the expected state conditional on observing a realization of a signal s Sincethe sender and receiverrsquos utilities are linear in ω they depend on the information aboutω revealed by signal s only through the expected state m In particular the receiverchooses a = 1 if and only if r le m

Let V (m) denote the indirect utility of the sender conditional on m

V (m) =

983133

rlem

E [v(ω r)|m] g(r)dr =

983133 m

0

v(m r)g(r)dr m isin [0 1] (2)

where E [v(ω r)|m] = v(m r) by the linearity of v in ω

A function V is said to be S-shaped if it is convex below some threshold and concaveabove that threshold or equivalently if V primeprime is single-crossing from above

there exists τ such that V primeprime(m) ge (le) 0 for all m lt (gt) τ

We now provide the criterion for the optimality of upper censorship Recall thatmlowast = E[ω|ω ge ωlowast]

Theorem 1 Let V be S-shaped Then and only then for all continuous F anoptimal signal is deterministic upper censorship whose cutoff state ωlowast isin [0 τ ] satisfies

V (mlowast) + V prime(mlowast)(mminusmlowast) ge (le)V (m) for all m ge (lt)ωlowast (3)

The criterion for the optimality of lower censorship is analogous

10See for example Kolotilin (2015) for a detailed treatment of the case where there is no uncer-tainty about the receiverrsquos type

10 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Remark 1 Let minusV be S-shaped Then and only then for all continuous F anoptimal signal is deterministic lower censorship whose cutoff state ωlowast isin [τ 1] satisfies

V (mlowast) + V prime(mlowast)(mminusmlowast) ge (le)V (m) for all m le (gt)ωlowast

Theorem 1 states that optimal persuasion takes a simple form of upper censorshipwhen V is S-shaped11 Moreover the theorem establishes that this result is tight inthe sense that if V is not S-shaped then there exists a distribution F of the statesuch that no upper-censorship signal is optimal

Note that every upper-censorship signal is a monotone partition Therefore Theorem1 applies in both Scenario A where the sender is free to choose any signal andScenario B where the sender is constrained to choose a monotone partition

The intuition behind Theorem 1 is as follows Observe that when no informationabout the state is revealed the receiverrsquos best-response action does not change withthe state The more information is revealed the more variable the receiverrsquos behaviorwill be in response to this information Consider an interval where V is concave Thesender would prefer not to reveal any information when the state is in this intervalsince a certain outcome is preferred to any lottery with the same expected stateConversely consider an interval where V is convex The sender would prefer to fullyreveal the state in this interval since now lotteries are preferred If V is S-shapedthat is it is convex below some threshold and concave above that threshold theinduced optimal persuasion takes the form of upper censorship

When V is S-shaped the senderrsquos optimization problem is reduced to finding anoptimal censorship cutoff ωlowast If the realized state ω is below the cutoff then it isrevealed to the receiver so the expected utility of the sender is V (ω) If the realizedstate ω is above the cutoff then the posterior expected state is mlowast = E[ω|ω ge ωlowast] sothe expected utility of the sender conditional on ω ge ωlowast is V (mlowast) The sender thusneeds to solve the problem

maxωlowastisin[01]

983133 ωlowast

0

V (ω)f(ω)dω +

983133 1

ωlowastV (mlowast)f(ω)dω (4)

The expression in (3) represents the first-order condition to this problem and capturesthree possible cases the boundary solutions ωlowast = 0 and ωlowast = 1 if the expression in(3) has the same sign for all ω and an interior solution ωlowast such that

V (mlowast) + V prime(mlowast)(ωlowast minusmlowast) = V (ωlowast) (5)

This first-order condition (5) is illustrated by Figure 1 The solid line is V (m) andthe dashed line is V (mlowast) + V prime(mlowast)(mminusmlowast) which is tangent to V at mlowast

11For a fixed distribution F Proposition 3 in Kolotilin (2018) implies that upper censorship withcutoff ωlowast is optimal if and only if (3) holds and V (m) is convex on [0ωlowast] This result can be used toshow that upper censorship is optimal when V is S-shaped For completeness we include a simpleself-contained proof inspired by the proof of Theorem 1 in Dworczak and Martini (2019)

CENSORSHIP AS OPTIMAL PERSUASION 11

1ωlowast

V

revealing pooling

mlowast0

Figure 1 Optimal upper censorship with cutoff ωlowast

The solution of the senderrsquos problem (4) is particularly simple if V is either globallyconvex or globally concave In this case the first-order condition in (3) has a constantsign so either ωlowast = 0 or ωlowast = 1 must be optimal If V is convex then ωlowast = 1 isoptimal which corresponds to the fully informative signal Similarly if V is concavethen ωlowast = 0 is optimal which corresponds to the completely uninformative signalThis is summarized in the following corollary

Corollary 1 An optimal signal is

(i) fully informative for all F if and only if V is convex

(ii) completely uninformative for all F if and only if V is concave

31 Canonical Case We now consider the canonical case where the senderrsquos utilitysatisfies assumption (A1)

The density g of receiver types r is said to be log-concave if ln g(r) is concave in r12

Note that ln g(r) is well defined in the canonical case

In this case the shape of the senderrsquos indirect utility V defined by (9) is connectedto the shape of the density g of receiver types as follows

Lemma 1 In the canonical case V is S-shaped for all ρ if and only if g is log-concave

12Table 1 in Bagnoli and Bergstrom (2005) reports distributions with log-concave densities

12 KOLOTILIN MYLOVANOV ZAPECHELNYUK

That is if the density of receiver types g is log-concave then the senderrsquos indirectutility V is S-shaped Moreover this result is tight in the sense that if g is notlog-concave then there exists ρ such that V is not S-shaped

By Theorem 1 and Lemma 1 we obtain a criterion for the optimality of upper cen-sorship with the condition on the primitive of the model the density g

Theorem 2 Consider the canonical case and let the density g be log-concave Thenand only then an optimal signal is deterministic upper censorship for all F and all ρ

The symmetric statement is also true an optimal signal is lower censorship for all Fand all ρ if and only if minusg is log-concave Consequently if both g and minusg are log-concave (that is g is exponential) then there exists an optimal signal which is bothupper censorship and lower censorship There are only two signals with this propertyfully informative and completely uninformative This allows us to obtain conditionson the distribution of receiver types under which the optimal signal is polarizedbetween fully informative and completely uninformative signals as for example inLewis and Sappington (1994) and Johnson and Myatt (2006)13

Corollary 2 In the canonical case an optimal signal is either fully informative orcompletely uninformative for all F and all ρ if and only if there exist λ isin R and c gt 0such that g(r) = ceminusλr for r isin [0 1]

If g(r) = ceminusλr then the fully informative signal is optimal whenever ρ ge λ and thecompletely uninformative signal is optimal whenever ρ le λ (and any signal is optimalwhen ρ = λ) In particular if ρ = 0 the optimal signal is fully determined by thesign of λ which is in turn determined by whether the mean of r is smaller or greaterthan 1

2

32 Comparative Statics Theorem 2 allows for a sharp comparative statics anal-ysis on the informativeness of the optimal signal

We compare signals by their Blackwell informativeness (Blackwell 1953) To compareupper-censorship signals s1 and s2 we only need to compare their cutoffs ωlowast

1 and ωlowast2

Signal s1 is more informative than signal s2 if ωlowast1 ge ωlowast

2 Indeed state ω isin [0ωlowast2) is

fully revealed by both s1 and s2 and state ω isin [ωlowast2 1] is partially revealed by s1 and

not revealed at all by s2 so s1 is more informative than s2

13In the case of log-concave density g the sets of upper- and lower-censorship signals are totallyordered by the rotation order of Johnson and Myatt (2006) If we restrict attention to the set oflower-censorship signals (which are generally suboptimal under log-concave g) then the rotationpoint is decreasing with respect to the rotation order By Lemma 1 of Johnson and Myatt (2006)the senderrsquos expected utility is quasiconvex and one of the extreme lower-censorship signals (thefully informative or completely uninformative signal) is optimal for the sender But if we restrictattention to the set of upper-censorship signals (which are optimal under log-concave g) then therotation point is not decreasing with respect to the rotation order Therefore Lemma 1 of Johnsonand Myatt (2006) does not apply and an interior upper-censorship signal is generally optimal forthe sender

CENSORSHIP AS OPTIMAL PERSUASION 13

For the purpose of comparison we extend the definition of distribution function Gto the real line and assume that its density g is log-concave Consider a family ofdistributions Gtσ of receiver types

Gtσ (r) = G

983061τ minus t+

r minus τ

σ

983062

where τ is the maximum point of g and t isin R and σ gt 0 are parameters Let gtσ (r)denote the corresponding density Note that gtσ (r) is log-concave for all t isin R andall σ gt 0

Because gtσ is log-concave on [0 1] deterministic upper censorship is optimal byTheorem 2 Let ωlowast (ρ t σ) isin [0 1] be the optimal censorship cutoff as given by (3)

The shift parameter t shifts the distribution along the horizontal axis and the stretchparameter σ stretches the distribution horizontally and symmetrically around themode τ so the limit of σ rarr 0 leads to the unit mass on τ We now show that thesender optimally discloses more information when

(i) the senderrsquos preferences are better aligned with the receiverrsquos preferences (thealignment parameter ρ is greater)

(ii) the receiver is more reluctant to accept the proposal (the shift parameter t isgreater)

(iii) the sender is more uncertain about the receiverrsquos type (the stretch parameter σis greater) provided ρ ge 014

Theorem 3 In the canonical case a cutoff state of an optimal signal satisfies

(i) ωlowast (ρ t σ) is increasing in ρ

(ii) ωlowast (ρ t σ) is increasing in t

(iii) ωlowast (ρ t σ) is increasing in σ if ρ ge 0

The intuition for part (i) is that for a higher ρ the sender puts more weight onthe receiverrsquos utility so she optimally endows the receiver with a higher utility byproviding more information

The intuition for part (ii) is that for a higher t each type of the receiver has a greatercost of accepting the proposal so to persuade the same type of the receiver the senderneeds to increase E[ω|ω ge ωlowast] by expanding the full disclosure interval [0ωlowast]

The intuition for part (iii) is that for a higher σ receiver types are more spread out soto persuade the same mass of types the sender again needs to increase E[ω|ω ge ωlowast]

14The case ρ ge 0 is the practically relevant case where the senderrsquos utility is a weighted averageof the receiverrsquos action and utility 1

1+ρv(ω r) =1

1+ρ + ρ1+ρ (ω minus r)

14 KOLOTILIN MYLOVANOV ZAPECHELNYUK

4 Discrete State

In this section we assume that the state is a discrete random variable

Theorem 1prime Let V be S-shaped Then and only then for all discrete F

(A) an optimal signal is stochastic upper censorship whose cutoff pair (ωlowast qlowast) satisfiescondition (3)

(B) an optimal monotone partition is deterministic upper censorship whose cutoff pair(ωlowast

d qlowastd) satisfies ωlowast

d = ωlowast and qlowastd isin 0 1

In the canonical case we obtain the result analogous to Theorem 2

Theorem 2prime Consider the canonical case and let the density g be log-concave Thenand only then for all discrete F and all ρ

(A) an optimal signal is stochastic upper censorship

(B) an optimal monotone partition is deterministic upper censorship

Let us discuss how the comparative statics result (Theorem 3) changes In Section 32we argue that upper-censorship signals can be ordered by their censorship cutoffs Butwhen the state is discrete a censorship cutoff is described by a pair (ωlowast qlowast) where ωlowast

is a cutoff state and qlowast is the probability of revealing this cutoff state when it realizes

Consider two stochastic upper-censorship signals s1 and s2 with cutoff pairs (ωlowast1 q

lowast1)

and (ωlowast2 q

lowast2) Denote (ωlowast

1 qlowast1) ≽ (ωlowast

2 qlowast2) if ω

lowast1 gt ωlowast

2 or ωlowast1 = ωlowast

2 and qlowast1 ge qlowast2 Observethat s1 is more (Blackwell) informative than s2 if and only if (ωlowast

1 qlowast1) ≽ (ωlowast

2 qlowast2) This

comparison also applies to deterministic upper-censorship signals with the constraintthat qlowast1 and qlowast2 are in 0 1

With the above order on cutoff pairs Theorem 3 extend as follows

Theorem 3prime Consider the canonical case

(A) A cutoff pair of an optimal signal satisfies

(i) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in ρ

(ii) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in t

(iii) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in σ if ρ ge 0

(B) A cutoff pair of an optimal monotone partition satisfies15

(i) (ωlowastd (ρ t σ) q

lowastd (ρ t σ)) is increasing in ρ

(ii) (ωlowastd (ρ t σ) q

lowastd (ρ t σ)) is increasing in t

15Part (iii) also holds for an optimal monotone partition when for example the discrete supportof the state ω is a regular grid with sufficiently many points However it is possible to construct acounterexample where part (iii) fails for an optimal monotone partition

CENSORSHIP AS OPTIMAL PERSUASION 15

That is the sender optimally discloses more information when she is less biasedrelative to the receiver (the alignment parameter ρ is greater) when the receiver ismore reluctant to accept the proposal (the shift parameter t is greater) and underScenario A when the sender is more uncertain about the receiverrsquos type (the stretchparameter σ is greater)

Remark 2 Theorem 1prime and Theorem 2prime show that upper censorship emerges as anoptimal persuasion strategy under the same conditions (V is S-shaped in Theorem1prime and g is log-concave in Theorem 2prime) regardless of whether the sender can choosean arbitrary signal or is constrained to choose a monotone partition It may betempting to conjecture that if the sender was allowed to choose an arbitrary partition(not necessarily monotone) the same result would hold This however is not true

For illustration suppose that the sender wishes to maximize the probability of a = 1and the receiver has type r = 1

2with certainty16 Let the state ω take values in

0 14 1 with probabilities (1

6 23 16) so the prior mean is 1

3

An optimal signal is stochastic upper censorship with cutoff pair (ωlowast qlowast) = (14 12)

which sends the pooling message with certainty at ω = 1 and with probability 1minusqlowast =12at ω = 1

4 and reveals the state otherwise The pooling message induces the posterior

mean m = 12 leading to a = 1 all other messages induce posterior means below 1

2

leading to a = 0 Under the optimal signal action a = 1 is chosen with probability23middot 12+ 1

6= 1

2

An optimal monotone partition is deterministic upper censorship that fully revealsthe state where only ω = 1 leads to a = 1 Under this partition action a = 1 ischosen with probability 1

6

An optimal partition pools the extreme states ω = 0 and ω = 1 and reveals theintermediate state ω = 1

4 The pooling message induces the posterior mean m = 1

2

leading to a = 1 This optimal partition is nonmonotone Under this partitionaction a = 1 is chosen with probability 1

6+ 1

6= 1

3 which is strictly better than the

probability of 16obtained under the optimal monotone partition

5 Application to Media Censorship

In this section we apply our results to the problem of media censorship by thegovernment In the modern world people obtain information about the governmentrsquosstate through various media sources such as television newspapers and internetblogs Without the media most people would not know what policies and reformsthe government pursues and how effective they are Media outlets have differentpositions on the political spectrum and differ substantially in how they select andpresent facts to cover the same news People choose their sources of information based

16In this case V (m) = 1 if m ge 12 and V (m) = 0 if m lt 1

2 Observe that the graph of V is

S-shaped and it can be approximated by a continuously differentiable S-shaped function

16 KOLOTILIN MYLOVANOV ZAPECHELNYUK

on their political ideology and socioeconomic status This information is valuable forsignificant individual decisions on migration investment and voting to name a fewIndividuals may not fully internalize externalities that their decisions impose on thesociety Likewise the government may not have the societyrsquos best interest at heartTo further its goals the government then wishes to influence individual decisions bymanipulating the information through media In autocracies and countries with weakchecks and balances the government has power to censor the media content

The governmentrsquos problem of media censorship can be represented as the persua-sion problem in Section 2 We apply our results to provide conditions for the op-timality of upper-censorship policies that censor all media outlets except the mostpro-government ones Furthermore we interpret our comparative statics results

51 Setup There is a continuum of heterogeneous citizens indexed by r isin [0 1]distributed with G Each citizen chooses between a = 0 and a = 1 The utility of acitizen of type r is given by

u(θ r ar a) = (θ minus r)ar + κ(θ r a)

where ar isin 0 1 denotes the citizenrsquos own action a =983125ardG(r) denotes the aggre-

gate action in the society θ isin [0 1] captures an unobserved benefit from action 1 ascompared to action 0 and κ captures the impact of the aggregate action a on thecitizenrsquos utility The term (θ minus r)ar is a private surplus of a citizen of type r Theterm κ(θ r a) is an externality because for a citizen of type r it is optimal to ignorethis term and choose ar = 1 if and only if θ ge r

There is a government which is concerned with a weighted average of the social utilityand the governmentrsquos intrinsic benefit from the aggregate action For a given θ thegovernmentrsquos utility is given by

983133 1

0

ν(θ r ar a)dG(r) + δγ(θ a)

The term ν captures a citizenrsquos utility from the governmentrsquos perspective We allow νto be different from u to reflect paternalistic or other concerns The term γ capturesthe governmentrsquos intrinsic benefit from the aggregate action The parameter δ ge 0captures the weight of the aggregate action in the governmentrsquos utility

Let T be a distribution of the random variable θ We assume that distributions G andT are independent and admit continuously differentiable and strictly positive densi-ties We also assume that κ ν and γ are linear in θ and continuously differentiablein r and a Furthermore to simplify interpretations we assume that ν and γ arenon-decreasing in θ and a so that for the government a high θ is a good news anda higher aggregate action is preferable

Citizens obtain information about the unobservable benefit θ through media outletsEach media outlet is identified by its editorial policy c isin [0 1] and it endorses action

CENSORSHIP AS OPTIMAL PERSUASION 17

a = 1 if θ ge c and criticizes it if θ lt c17 A set C of media outlets is a finite subsetof [0 1]18

The governmentrsquos censorship policy is a set of the media outlets X sub C that arepermitted to broadcast so the rest of the media outlets are censored

The timing is as follows First the government chooses a set X sub C of permittedmedia outlets Second state θ is realized and each permitted media outlet endorsesor criticizes action a = 1 according to its editorial policy Finally each citizen observesmessages from all permitted media outlets updates his beliefs about θ and choosesan action

52 Discussion We now discuss interpretations of the key components of the mediacensorship application As in Gehlbach and Sonin (2014) there can be various inter-pretations of the citizenrsquos action a = 1 such as voting for the government supportinga governmentrsquos policy or taking an individual decision that benefits the governmentA citizenrsquos type r can be interpreted as his ideological position or preference param-eter A citizen who is more supportive of the government has a smaller r

A media outlet with a higher editorial policy c can be interpreted as being less loyalto the government because it criticizes the government on a larger set of states Aneditorial policy c isin C can therefore represent a slant or political bias of the outletagainst the government and can be empirically measured as the frequency with whichthe outlet uses anti-government language Gentzkow and Shapiro (2010) constructsuch a slant index for US newspapers Empirical findings of their paper suggestthat the editorial policies of media outlets are driven by reader preferences justifyingour assumption of the existence of a large variety of editorial policies19 As in Suen(2004) Chan and Suen (2008) and Chiang and Knight (2011) the assumption ofthe binary media reports that communicate only whether the state θ is above somestandard c can be justified by a cursory readerrsquos preference for simple messages suchas positive or negative opinions and yes or no recommendations

The governmentrsquos censorship of media outlets can take various forms For examplethe government can ban access to internet sites withdraw licenses disrupt financingconfiscate print materials and equipment discredit media providers and even arresteditors and journalists using broadly formulated legislation on combating extremismIn some countries the government can exercise direct control over media editorialpolicies either through state ownership or administrative pressure

17The tie-breaking in the event of θ = c is unimportant as θ is a continuous random variable18We discuss the case of a continuum of media outlets in Section 5519Theoretical literature has explored the determinants of media slant of an outlet driven by its

citizens (Mullainathan and Shleifer 2005 Gentzkow and Shapiro 2006 and Chan and Suen 2008)and its owners (Baron 2006 and Besley and Prat 2006)

18 KOLOTILIN MYLOVANOV ZAPECHELNYUK

53 Formulating Media Censorship as Persuasion We now show that the me-dia censorship problem can be formulated as a linear persuasion problem in whichthe government is a sender and a representative citizen is a receiver

In our model the citizensrsquo and the governmentrsquos utilities are linear in θ There-fore given any information from the media outlets the utilities depend only on theposterior mean of θ

Consider an arbitrary posterior mean of θ denoted by m isin [0 1] Each citizen of typer chooses ar = 1 if and only if r le m The externality term κ plays no role in thisdecision Therefore the aggregate action a is simply the mass of all citizens whosetypes do not exceed m so a = G(m)

Next using the citizensrsquo optimal behavior we derive the governmentrsquos expected utilityconditional on the posterior mean m

V (m) = E983063983133 1

0

ν(θ r ar G(m))dG(r) + δγ(θ G(m))

983055983055983055983055m983064

=

983133 1

0

ν(m r1rlem G(m))dG(r) + δγ(mG(m)) (6)

This is the governmentrsquos indirect utility As in Section 2 the government now choosesa signal which is informative about the state to maximize its expected utility How-ever in contrast to Section 2 here the government is restricted to signals that areimplementable by a subset of a given set of media outlets

Next we show that with an appropriate definition of the state the restriction tosignals implementable by a subset of a given set of media outlets can be formulatedas a restriction to monotone partitional signals as in Scenario B in Section 2 Thuswe will be able to use our results in Section 4 to find the optimal censorship policies

Recall that the information about θ is only available through the media outlets Let ωdenote a random variable equal to the conditional expectation of θ given the messagesof all media outlets in C Note that as C is finite ω is a discrete random variablewith values in [0 1] Let F denote the distribution of ω From now on we treat therandom variable ω as the state and denote by s a signal as defined in Section 2

Let HM denote the set of all distributions of the posterior mean state induced byall monotone partitional signals and let HC denote the set of all distributions ofthe posterior mean state induced by all finite subsets of the set C of media outletsObserve that any set of media outlets induces a monotone partitional signal thatis HC sub HM We now show that every outcome implementable by an arbitrarymonotone partition in HM is also implementable by a subset of media outlets

Lemma 2 HC = HM

For illustration suppose that there is only one media outlet with the editorial policyc isin (0 1) There are two observable events θ ge c and θ lt c In each realized event

CENSORSHIP AS OPTIMAL PERSUASION 19

the citizens compute the posterior expectation of θ which we will call ω So in thisexample the distribution F of ω has two mass points one for each of the two events

Given these two mass points every monotone partition of [0 1] induces one of twopossible distributions of the posterior mean state (i) the mass points revealed and (ii)the mass points pooled Observe however that these distributions are implementableby permitting and censoring the media outlet respectively

A censorship policy is upper censorship if it censors all sufficiently disloyal mediaoutlets Specifically there exists a cutoff clowast isin C and an indicator qlowast isin 0 1 suchthat all media outlets whose editorial policies are below clowast are permitted all mediaoutlets whose editorial policies are above clowast are censored and clowast is permitted if andonly if qlowast = 1 That is X = c isin C c le clowast if qlowast = 1 and X = c isin C c lt clowastif qlowast = 0 Note that the full censorship policy clowast = 0 and qlowast = 0 (where all mediaoutlets are censored) and the free media policy clowast = 1 and qlowast = 1 (where all mediaoutlets are permitted) are the two extreme upper-censorship policies

As shown above the media censorship problem is equivalent to the linear persuasionproblem in which the sender is restricted to monotone partitional signals and thesenderrsquos indirect utility V is given by (6) We thus apply Theorem 1prime(B) to obtainthe following result

Theorem 1primeprime If V is S-shaped then an optimal censorship policy is upper censorship

To illustrate Theorem 1primeprime suppose the government is interested only in the aggregateaction so that ν = 0 and γ depends only on the aggregate action a so V (m) =γ(G(m)) Thus an upper-censorship policy is optimal if the composition functionγ(G(middot)) is S-shaped For example this condition holds if the government is interestedin reaching a certain approval threshold (eg a simple majority) so that γ is a stepfunction This condition also holds if γ is S-shaped and G is uniform or S-shapedwith the same inflection point as γ20

54 Canonical Case We now impose more structure on the utilities to obtain asharper result for the optimality of upper censorship and to perform a comparativestatics analysis Analogously to assumption (A1) we assume that the governmentrsquosutility is a weighted average of the citizensrsquo utility and their aggregate action

983133 1

0

u(θ r ar a)dG(r) + δa

where u(θ r ar a) = (θ minus r)ar + ζ(r)a

(A2)

for some continuously differentiable function ζ Define

β =

983133 1

0

ζ(r)dG(r) + δ

20This condition holds in many special cases where γ and G are S-shaped even when γ and Ghave different inflection points

20 KOLOTILIN MYLOVANOV ZAPECHELNYUK

The term983125 1

0ζ(r)dG(r) is the governmentrsquos expected bias towards the citizensrsquo action

a = 1 due to the citizensrsquo externality and the term δ is the governmentrsquos intrinsicbias towards a greater aggregate action a Thus we interpret β as the governmentrsquosaggregate bias This decomposition of the bias allows for different interpretations whythe government is biased So β can be high because the government is too self-serving(high δ) or because the government is benevolent (low δ) but wishes to internalizestrong positive externalities of the citizens (high ζ(r)) To ease interpretations weassume that β gt 021

Under Assumption (A2) the governmentrsquos indirect utility V given by (6) becomes

V (m) =

983133 m

0

(mminus r)dG(r) + βa =

983133 m

0

v(m r)dG(r)

where

v(m r) = β

9830611 +

1

β(mminus r)

983062

This is the same as v given by Assumption (A1) with ρ = βminus1 up to rescaling by apositive constant Consequently we can apply Theorem 2prime(B) to obtain the followingresult

Theorem 2primeprime Let (A2) hold If the density g of citizens types is log-concave then anoptimal censorship policy is upper censorship

We now apply the comparative statics analysis presented in Section 32 Note thatthe upper-censorship policies are ordered according to the amount of informationtransmitted to the citizens in the sense of Blackwell (1953) A greater censorshipthreshold clowast means that more media outlets are permitted With this order in mindwe apply Theorem 3prime(B) to make a comparative statics analysis on the amount ofinformation that is optimally disclosed by the government

First the censorship cutoff is increasing in the alignment parameter ρ Recall thatthis is a reciprocal of the governmentrsquos bias ρ = βminus1 This means that the governmentoptimally discloses more information (the censorship cutoff is greater) when it is lessbiased (β is smaller) Intuitively as β decreases the government puts more weighton the citizensrsquo utility so it optimally endows the population with a higher utility bycensoring fewer media outlets and disclosing more information

Second the censorship cutoff is increasing in the magnitude t of the horizontal shiftof the density g A greater t corresponds to a greater opportunity cost of actiona = 1 for each citizen This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more difficultto persuade to take action a = 1 (parameter t is greater) Informally speaking topersuade the same type of the citizen the government needs to increase the posterior

21If β lt 0 then swapping the roles of a = 0 and a = 1 reverses the sign of the bias If β = 0then the governmentrsquos utility and the citizensrsquo private interests coincide so it is trivially optimal topermit all media outlets

CENSORSHIP AS OPTIMAL PERSUASION 21

mean But the expected posterior mean must be equal to the prior mean so it is notpossible to increase all posterior means Due to the log-concave shape of the densityof the citizensrsquo types this tradeoff is resolved by increasing the posterior mean of thepooling interval E[ω|ω ge clowast] This is done by shrinking the pooling interval [clowast 1]that is increasing the censorship cutoff clowast

Third when the number of media outlets is sufficiently large the censorship cutoffis increasing in the stretch factor σ of the density g around its peak τ (see Footnote15) A greater σ corresponds to a more dispersed distribution of the opportunitycost in the population This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more diverse inhow difficult they are to persuade Intuitively when the types are more spread outpersuading the same mass of types requires a greater posterior mean which leads toa higher censorship cutoff similarly to the comparative statics in t discussed above

55 Extensions and Open Questions Let us now consider a few extensions ofour model of media censorship

In our model the set of media outlets is exogenous and the governmentrsquos only in-strument is censorship We now consider three alternative ways of expanding thegovernmentrsquos instruments of influence

First suppose that the government is able not only to censor existing media outletsbut also to introduce new media outlets with chosen editorial policies This is equiv-alent to our censorship model where all media outlets in [0 1] are initially availableand the government can censor any subset of them When all media outlets in [0 1]are permitted the revealed information about θ (which we call ω) is θ itself Thusω = θ is a continuous random variable with distribution F = T So we can nowapply our results for the continuous state from Section 3 instead of those for the dis-crete state in Section 4 reaching the same conclusion about the optimality of uppercensorship

Second suppose that the government can garble information available from mediaoutlets That is the government is not restricted to monotone partitions it cancreate arbitrary signals about state ω for the citizens to observe This becomes ageneral persuasion problem and our Theorems 1prime(A) 2prime(A) and 3prime(A) apply

Third suppose that the government is able to restrict not only which media outletsare permitted but also how many media outlets each citizen can choose to observeIn this extension the citizens are not allowed to communicate with one another (oth-erwise they could share the information thus observing all permitted media outletsindirectly) This extension does not affect our results as long as each citizen is al-lowed to access at least one media outlet of his choice as in Chan and Suen (2008)Intuitively this is because each citizen categorizes the information from the mediaoutlets into ldquogood newsrdquo where a = 1 is optimal and ldquobad newsrdquo where a = 0 is

22 KOLOTILIN MYLOVANOV ZAPECHELNYUK

optimal Because the information from the media outlets induces a monotone parti-tion it means that ldquogood newsrdquo is separated from ldquobad newsrdquo by a threshold mediaoutlet that depends on the citizenrsquos type That is it is sufficient to observe a singlethreshold media outlet to distinguish ldquogood newsrdquo from ldquobad newsrdquo

In our model each citizenrsquos utility depends on the aggregate action a through theexternality term κ(θ r a) which does not affect the chosen action Let us relax thisassumption so that a citizenrsquos optimal choice can depend on a We can still writethe senderrsquos indirect utility V as a function of the posterior mean state and applyour results However V is no longer uniquely determined by the primitives of themodel It is now endogenous and depends on an equilibrium the citizens play aseach citizenrsquos optimal action now depends on what all citizens do in equilibrium Forexample given the same information about the state a citizen could prefer to choosea = 1 if and only if sufficiently many citizens choose the same action so a is largeenough This creates the problem of multiplicity of equilibria and as a consequencethe dependence of optimal censorship on equilibrium selection

As a side our media censorship problem can also be applied to spatial voting mod-els as in Chiang and Knight (2011) Consider a government party (p = G) and anopposition party (p = O) competing in an election If party p wins a voter withan ideological position r gets utility wp minus (r minus rp)

2 where wp is the quality or va-lence of party p and rp is the ideology or policy platform of party p Voters knowthe partiesrsquo ideologies and obtain information about the partiesrsquo qualities from allavailable media outlets Each voter supports the party that maximizes his expectedutility Within this context our analysis still applies because the voterrsquos utility dif-ference between the government and opposition parties is proportional to θminusr whereθ = (wG minus wO + r2O minus r2G)2(rO minus rG)

There are a few more extensions that can be relevant in applications First insteadof complete censorship of a media outlet there can be a cost of accessing it Forexample an international news channel can be censored by a local government butcitizens may still access it through VPN at some cost Second it can be costly forthe government to censor media outlets So another important question to answeris how a government may prioritize censoring Finally citizens can incur some costof following each media outlet While we have already mentioned that citizens gainno benefit from following more than one outlet it is entirely possible for them tostop watching news altogether if it is sufficiently uninformative These extensions arenontrivial and left for future research

Appendix

Proof of Theorem 1 Suppose that V is S-shaped For mωlowast isin [0 1] and mlowast(ωlowast) =E[ω|ω ge ωlowast] define

V (mωlowast) = V (mlowast(ωlowast)) + V prime(mlowast(ωlowast))(mminusmlowast(ωlowast)) (7)

CENSORSHIP AS OPTIMAL PERSUASION 23

Since V (m) is convex on [0 τ ] and concave on [τ 1] there exists ωlowast isin [0 τ ] thatsatisfies condition (3) that is

V (mωlowast) ge (le)V (m) for all m ge (lt)ωlowast (8)

This is true because V (ωlowastωlowast) minus V (ωlowast) is single-crossing from below To see thisconsider Figure 1 Observe that (3) can hold only if ωlowast is on the convex part of V andmlowast is on the concave part of V As the censorship cutoff ωlowast increases the posteriormean state of the pooling message mlowast(ωlowast) also increases But because V is concaveat mlowast(ωlowast) the dashed tangent line V (mωlowast) becomes flatter in m so it crosses thesolid line V (m) at a smaller m

Consider an arbitrary signal s Let H be the distribution of m = E[ω|s] inducedby signal s Since each signal is a garbling of the fully informative signal F is amean-preserving spread of H Thus the senderrsquos expected utility is smaller undersignal s than under the upper-censorship signal with cutoff ωlowast as follows from

983133 1

0

V (m)dH(m) le983133 1

0

maxV (m) V (mωlowast)dH(m)

le983133 1

0

maxV (m) V (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mlowast(ωlowast))dF (m)

where the second line holds because maxV (m) V (mωlowast) is convex in m and F isa mean-preserving spread of H the third line holds by (8) and the last line holds by

(7) and by983125 1

ωlowast(mminusmlowast(ωlowast))dF (m) = 0

Conversely suppose that V is not S-shaped Then there exist 0 le m1 lt m2 lem3 lt m4 le 1 such that V primeprime(m) lt 0 for m isin (m1m2) V

primeprime(m) = 0 for m isin [m2m3]and V primeprime(m) gt 0 for m isin (m3m4) because V primeprime is continuous by assumption It isstraightforward to show that there exists 983141mlowast isin (m1m2) (sufficiently close to m2) and983141ωlowast isin (m3m4) such that

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) lt V (m) for all m isin (983141ωlowastm4]

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) gt V (m) for all m isin [m1 983141ωlowast] 983141mlowast

Moreover there exists a continuous distribution F such that the support of F is asubset of [m1m4] and 983141mlowast = E[ω|ω le 983141ωlowast] Using a chain of inequalities similar to theone above it is easy to show that lower-censorship signal 983141slowast with cutoff 983141ωlowast is optimalFurthermore it is uniquely optimal as both inequalities hold with equalities only forsignal 983141slowast This in turn implies that any upper-censorship signal is suboptimal

24 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Proof of Lemma 1 Notice that under assumption (A1) we have

V primeprime(m) = gprime(m) + ρg(m) for m isin [0 1]

Recall that V is S-shaped if and only if V primeprime(m) = gprime (m) + ρg (m) is single-crossingfrom above Using Proposition 1 in Quah and Strulovici (2012) it is easy to showthat V primeprime(m) is single-crossing from above for all ρ isin R if and only if gprime (m)g (m) isnonincreasing in m (that is ln g(m) is concave)

Proof of Theorem 2 Immediate by Theorem 1 and Lemma 1

Proof of Theorem 3 Part (i) We fix t and σ and vary ρ Without loss ofgenerality let t = 0 and σ = 1 (so that Gtσ = G) Consider a signal that induces thedistribution H of posterior mean m This signal implements the receiverrsquos interimutility U given by

U(r) =

983133 1

r

(mminus r)dH(m) for r isin [0 1]

which holds because the receiver acts if and only if m ge r Since types r lt 0 alwayschoose a = 1 and types r gt 1 always choose a = 0 we can exclude them fromconsideration and assume that supp(G) = [0 1] The senderrsquos expected utility is then

983133 1

0

V (m)dH(m) =

983133 1

0

983133 m

0

(1 + ρ(mminus r))dG(r)dH(m)

=

983133 1

0

983133 1

r

(1 + ρ(mminus r))dH(m)dG(r)

=

983133 1

0

(1minusH(r) + ρU(r))dG(r)

Consider ρ2 gt ρ1 Suppose by contradiction that the corresponding optimal upper-censorship signals s2 and s1 are such that s1 is strictly more informative than s2 LetU2 and U1 be the receiverrsquos interim utilities implemented by s2 and s1 Since thesender prefers signal s2 under ρ2 and signal s1 under ρ1 we have

983133 1

0

(1minusH2 (r) + ρ2U2 (r)) dG (r) ge983133 1

0

(1minusH1 (r) + ρ2U1 (r)) dG (r)

983133 1

0

(1minusH1 (r) + ρ1U1 (r)) dG (r) ge983133 1

0

(1minusH2 (r) + ρ1U2 (r)) dG (r)

Summing up these inequalities gives

(ρ2 minus ρ1)

983133 1

0

(U2 (r)minus U1 (r)) dG (r) ge 0

leading to a contradiction because U1(r) ge U2(r) for all r with strict inequality forsome r given that s1 is strictly more informative than s2

CENSORSHIP AS OPTIMAL PERSUASION 25

Part (ii) We fix ρ isin R and σ = 1 and vary t The indirect utility of the senderconditional on m isin [0 1] is given by

Vt(m) =

983133

rlem

(1+ρ(mminusr))gt1(r)dr =

983133

rlemminust

(1+ρ(mminustminusr))g(r)dr = V (mminust) (9)

By Theorem 1 when t0 = 0 an optimal cutoff ωlowast0 satisfies

V (m) le V (mlowast0) + V prime(mlowast

0)(mminusmlowast0) for m isin [ωlowast

0 1]

When V is S-shaped and t1 lt 0 it is easy to see that

V (mminus t1) le V (mlowast0 minus t1) + V prime(mlowast

0 minus t1)(mminusmlowast0) for m isin [ωlowast

0 1]

which implies that when t = t1 the above inequality changes its sign at some ωlowast1

that satisfies ωlowast1 le ωlowast

0 It follows from (9) that ωlowast1 is an optimal cutoff for t = t1 and

as we established above ωlowast1 le ωlowast

0

Part (iii) We fix ρ ge 0 and t = 0 and vary σ The indirect utility of the senderconditional on m isin [0 1] is given by

V ρσσ(m) =

983133

rlem

9830591 +

ρ

σ(mminus r)

983060dG0σ(r)

=

983133

rlem

9830591 +

ρ

σ(mminus r)

983060g

983061r minus (1minus σ)τ

σ

9830621

σdr

=

983133

rlemminus(1minusσ)τσ

9830611 + ρ

983061mminus (1minus σ)τ

σminus r

983062983062g(r)dr

= Vρ1

983061mminus (1minus σ)τ

σ

983062 (10)

Let σ0 = 1 σ1 lt 1 and ρ1 = ρσ1 To compare the optimal cutoffs under parameters(ρ σ0) and (ρ σ1) we consider two steps first comparing the cutoffs between (ρ σ0)and (ρ1 σ1) and then comparing the cutoffs between (ρ1 σ1) and (ρ σ1)

First we compare an optimal cutoff ωlowast0 under (ρ σ0) = (ρ 1) and an optimal cutoff

ωlowast1 under (ρ1 σ1) By Theorem 1 ωlowast

0 satisfies

Vρ1(m) le Vρ1(mlowast0) + V prime

ρ1(mlowast0)(mminusmlowast

0) for m isin [ωlowast0 1]

When V is increasing (implied by ρ ge 0) and S-shaped and σ1 lt 1 it is easy to seethat for all m isin [ωlowast

0 1]

Vρ1

983061mminus (1minus σ1)τ

σ1

983062le Vρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062+ V prime

ρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062(mminusmlowast

0)

which implies that when σ = σ1 lt 1 the above inequality changes its sign at someωlowast1 that satisfies ωlowast

1 le ωlowast0 It follows from (10) that ωlowast

1 is an optimal cutoff under(ρσ σ1) = (ρ1 σ1) and as we established above ωlowast

1 le ωlowast0

26 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Second we compare the optimal cutoff ωlowast1 under (ρ1 σ1) and the optimal cutoff ωlowast

2

under (ρ σ1) Because ρ lt ρ1 = ρσ1 by part (i) we have ωlowast2 le ωlowast

1

Combining ωlowast1 le ωlowast

0 and ωlowast2 le ωlowast

1 we obtain ωlowast2 le ωlowast

0

Proof of Theorem 1prime Part (A) The proof is analogous to that of Theorem 1 andis thus omitted

Part (B) The only if statement follows from that in part (A) (see the proof of Theorem1) and the fact that we can always ensure that an optimal stochastic censorship(ωlowast qlowast) is deterministic (so qlowast isin 0 1) by the choice of the support and distributionof ω Thus we only need to prove the if statement

Let F be a distribution of ω with support on n values 0 lt ω0 lt lt ωnminus1 lt 1 Thisis without loss of generality because if 0 andor 1 were in the support of F we couldredefine the state as ω = (ω + ε)(1 + 2ε) and the type as r = (r + ε)(1 + 2ε)

Let X sub ω1 ωnminus1 identify a set of cutoffs defining a monotone partition for eachωi isin X the receiver is informed whether ω lt ωi or ω ge ωi Note that we exclude ω0

from the set of cutoffs because ω ge ω0 always holds

Note that X is deterministic upper censorship if and only if there exists a censorshipthreshold ωlowast isin [0 1] such that for each i = 1 nminus 1 ωi isin X if and only if ωi lt ωlowastthat is we can assume that qlowast = 0 without loss of generality

Let VX be the senderrsquos expected utility when the monotone partition is given by X

If n = 2 then there are only two partitions X = ω1 (full disclosure) and X = empty(no disclosure) Both are upper censorships so part (B) holds trivially

If n = 3 then there are four monotone partitions X = ω1ω2 (full disclosure)X = ω2 X = ω1 and X = empty (no disclosure) Only X = ω2 is not uppercensorship The next lemma proves that X = ω2 cannot be uniquely optimal whenV is S-shaped thus proving the optimality of upper censorship for n = 3

Denote

mjk = E983045ω|ω isin [ωjωk]

983046

For convenience of notation we write mj = ωj

Lemma 3 Let V be S-shaped with an inflexion point τ Let ω be a discrete randomvariable with support on m0m1m2 and probabilities p0 p1 p2 where 0 lt m0 ltm1 lt m2 lt 1 Then

(i) if τ le m0 then Vm2 le Vempty

(ii) if τ ge m1 then Vm2 le Vm1m2

(iii) if m0 lt τ lt m1 then Vm2 le maxVempty Vm1m2

CENSORSHIP AS OPTIMAL PERSUASION 27

Proof Define p01 = p0 + p1 and p02 = p01 + p2 = 1 Observe that

m01 =p0m0 + p1m1

p0 + p1and m02 =

p0m0 + p1m1 + p2m2

p0 + p1 + p2

We thus have

Vempty = p02V (m02) = (p01 + p2)V

983061p01m01 + p2m2

p01 + p2

983062

Vm2 = p01V (m01) + p2V (m2) = (p0 + p1)V

983061p0m0 + p1m1

p0 + p1

983062+ p2V (m2)

Vm1m2 = p0V (m0) + p1V (m1) + p2V (m2)

Clearly

m0 lt m01 lt m1 and m0 lt m02 lt m2

Therefore

Vm2 le Vempty lArrrArrp01

p01 + p2V (m01) +

p2p01 + p2

V (m2) le V

983061p01

p01 + p2m01 +

p2p01 + p2

m2

983062= V (m02)

(11)

and

Vm2 le Vm1m2 lArrrArr

V

983061p0

p0 + p1m0 +

p1p0 + p1

m1

983062le p0

p0 + p1V (m0) +

p1p0 + p1

V (m1)(12)

Suppose that τ le m01 so V is concave on [m01m2] Then (11) holds by Jensenrsquosinequality In particular it holds for τ le m0 lt m01 thus proving part (i)

Suppose that τ ge m1 so V is convex on [m0m1] Then (12) holds by Jensenrsquosinequality thus proving part (ii)

Finally suppose that m01 lt τ lt m1 Let (τ V (τ)) lie below (or on) the straight lineA connecting (m0 V (m0)) and (m2 V (m2)) as illustrated on Figure 2 By convexityof V (m) for m lt τ it must be the case that V (m01) lies below (or on) the dashedline B By concavity of V (m) for m gt τ it must be the case that V (m1) lies above(or on) the dashed line C Consequently (12) is satisfied

Alternatively let (τ V (τ)) lie above the straight line A connecting (m0 V (m0)) and(m2 V (m2)) as illustrated on Figure 3 If Vm2 le Vm1m2 then the proof of part(iii) is complete So suppose that Vm2 gt Vm1m2 so (12) is violated

By convexity of V (m) for m lt τ it must be the case that V (m01) lies below (or on)the line B Since (12) is violated V (m01) must lie between the lines A and B Recallthat m01 lt m02 lt m2 Observe that V (m02) must be in the shaded area on Figure 3Indeed if m02 lt τ then by convexity of V (m) for m lt τ it must be the case thatV (m02) lies between (or on) the lines B and D If m02 ge τ then by concavity of

28 KOLOTILIN MYLOVANOV ZAPECHELNYUK

τ m2m0

A C

B

m01 m1

V (m0) V (τ)

V (m2)

Figure 2 The case where (τ V (τ)) lies below (or on) the straight lineconnecting (m0 V (m0)) and (m2 V (m2))

τ m2m0

A

C

B

m01

V (m0)

V (τ)

V (m2)

D

V (m01)

Figure 3 The case where (τ V (τ)) lies above the straight line con-necting (m0 V (m0)) and (m2 V (m2))

V (m) for m gt τ it must be the case that V (m02) lies above (or on) line C It followsthat (11) is satisfied

We now adapt Lemma 3 to prove the optimality of upper censorship for n ge 4

CENSORSHIP AS OPTIMAL PERSUASION 29

Fix a monotone partition X sub ω1 ωnminus1 that is not upper censorship that isthere exists i isin 2 nminus 1 such that

ωiminus1 ∕isin X and ωi isin X (13)

Denote ωn = 1 Define j and k as follows If X cap ω1 ωiminus2 ∕= empty let ωj be thelargest element inX smaller than i otherwise let ωj = ω0 IfXcapωi+1 ωnminus1 ∕= emptylet ωk be the smallest element in X greater than i otherwise let ωk = ωn = 1 Define

m0 = ωj m1 = ωiminus1 m2 = ωi

and

p0 =Pr[ω isin [ωjωiminus1)]

Pr[ω isin [ωjωk)] p1 =

Pr[ω isin [ωiminus1ωi)]

Pr[ω isin [ωjωk)] p2 =

Pr[ω isin [ωiωk)]

Pr[ω isin [ωjωk)]

To apply Lemma 3 we identify Xωi with empty X with m2 and X cup ωiminus1 withm1m2 By Lemma 3 VX le maxVXωi VXcupωiminus1 Thus replacing X withXωi if VX le VXωi and with X cup ωiminus1 otherwise weakly improves the senderrsquosexpected utility Repeatedly applying such replacements yields upper censorshipIndeed this procedure terminates after a finite number of replacements because byLemma 3 for ωj ge τ we replace X with Xωi and for ωiminus1 le τ we replace Xwith X cup ωiminus1

We thus have shown that when V is S-shaped when optimizing over the set ofmonotone partitions we can restrict attention to deterministic upper censorship Itremains to show that an optimal deterministic censorship cutoff ωlowast

d coincides with anoptimal stochastic censorship cutoff ωlowast from part (A) To see this it is straightforwardto show that

V lowast(ωlowast qlowast) =

983133

[0ωlowast)

V (ω)dF (ω) + V (ωlowast)qlowast Pr[ω = ωlowast]

+ V (mlowast(ωlowast qlowast))((1minus qlowast) Pr[ω = ωlowast] + Pr[ω gt ωlowast]) (14)

is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order Since the two de-terministic censorship cutoff pairs closest to (ωlowast qlowast) in the Blackwell informativenessorder are (ωlowast 0) and (ωlowast 1) we obtain ωlowast

d = ωlowast

Proof of Theorem 2prime Immediate by Theorem 1prime and Lemma 1

Proof of Theorem 3prime The proof for an optimal signal is analogous to that ofTheorem 3 and is thus omitted The same applies to part (i) for an optimal monotonepartition Thus it remains to prove part (ii) for an optimal monotone partition

We fix ρ and σ = 1 and vary t continuously If ωlowast isin supp(F ) then an optimalmonotone partition is an optimal signal because all qlowast isin [0 1] are optimal Thuspart (ii) in (B) holds by part (ii) in (A) If ωlowast isin supp(F ) and qlowast isin (0 1) then either(ωlowast

d qlowastd) = (ωlowast 0) or (ωlowast

d qlowastd) = (ωlowast 1) is optimal because the senderrsquos expected utility

V lowast given by (14) is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order

30 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Therefore it suffices to consider values of t at which the sender is indifferent between(ωlowast 0) and (ωlowast 1) and to show that the sender strictly prefers (ωlowast 1) to (ωlowast 0) at aslightly higher t

V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) =rArr dV lowastt (ω

lowast 0)

dtle dV lowast

t (ωlowast 1)

dt (15)

The condition V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) can be written as

Vt(mlowast(ωlowast 0)) Pr[ω ge ωlowast] = Vt(ω

lowast) Pr[ω = ωlowast] + Vt(mlowast(ωlowast 1)) Pr[ω gt ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)Pr[ω = ωlowast]

Pr[ω ge ωlowast]+ Vt(m

lowast(ωlowast 1))Pr[ω gt ωlowast]

Pr[ω ge ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + Vt(mlowast(ωlowast 1))

mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Analogously the conditiondV lowast

t (ωlowast0)dt

le dV lowastt (ωlowast1)dt

can be written as

V primet (m

lowast(ωlowast 0)) ge V primet (ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + V primet (m

lowast(ωlowast 1))mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Thus (15) states that if Vt(mlowast(ωlowast 0)) is equal to the convex combination of Vt(ω

lowast)and Vt(m

lowast(ωlowast 1)) then the derivative V primet (m

lowast(ωlowast 0)) is greater than the convex com-bination of the derivatives V prime

t (ωlowast) and V prime

t (mlowast(ωlowast 1)) When V is S-shaped it is easy

to see that this property is satisfied

Proof of Lemma 2 Observe that a signal induced by any set X sub C of permittedmedia outlets induces a monotone partition of [0 1] and thus HC sub HM Indeed letcX(ω) and cX(ω) be the media outlets in X that are the closest to ω from above andfrom below

cX(ω) = sup983059c isin X c le ω

983052cup 0

983060and cX(ω) = inf

983059c isin X c gt ω cup 1

983060

By observing the media outlets in X each citizen is informed that ω isin [cX(ω) cX(ω))

Conversely let the set C of media outlets be c0 lt middot middot middot lt cn Without loss of generalityassume that C contains the two uninformative media outlets c0 = 0 and cn = 1 which(almost) always endorse and criticize action a = 1 respectively Then the distributionF of the state ω is a discrete distribution whose support is Ω = ω0 ωnminus1 whereωi = E[θ|θ isin [ci ci+1)] for i isin 0 nminus 1 Consider a monotone partitional signalrepresented by a nondecreasing function ξ [0 1] rarr R For each realization m of thissignal that occurs with a strictly positive probability define

x(m) = min983153ω isin Ω ξ(ω) = m

983154and x(m) = max

983153ω isin Ω ξ(ω) = m

983154

This signal induces a monotone partition of the finite set Ω This partition of Ω canbe implemented by a censorship policy that prohibits a media outlet c isin C if andonly if c isin (x(m) x(m)) for some realization m

CENSORSHIP AS OPTIMAL PERSUASION 31

References

Alonso R and O Camara (2016a) ldquoPersuading Votersrdquo American EconomicReview 106 3590ndash3605

(2016b) ldquoPolitical Disagreement and Information in Electionsrdquo Games andEconomic Behavior 100 390ndash412

Bagnoli M and T Bergstrom (2005) ldquoLog-Concave Probability and Its Ap-plicationsrdquo Economic Theory 26 445ndash469

Baron D P (2006) ldquoPersistent Media Biasrdquo Journal of Public Economics 90(1)1ndash36

Besley T and A Prat (2006) ldquoHandcuffs for the Grabbing Hand MediaCapture and Government Accountabilityrdquo American Economic Review 96(3) 720ndash736

Blackwell D (1953) ldquoEquivalent Comparisons of Experimentsrdquo Annals of Math-ematical Statistics 24 265ndash272

Chan J and W Suen (2008) ldquoA Spatial Theory of News Consumption andElectoral Competitionrdquo Review of Economic Studies 75 699ndash728

Chiang C-F and B Knight (2011) ldquoMedia Bias and Influence Evidence fromNewspaper Endorsementsrdquo Review of Economic Studies 78 795ndash820

Duffie D P Dworczak and H Zhu (2017) ldquoBenchmarks in Search MarketsrdquoJournal of Finance 72 1983ndash2044

Dworczak P (2017) ldquoMechanism Design with Aftermarkets Cutoff Mecha-nismsrdquo mimeo

Dworczak P and G Martini (2019) ldquoThe Simple Economics of Optimal Per-suasionrdquo Journal of Political Economy 127 993ndash2048

Edmond C (2013) ldquoInformation Manipulation Coordination and RegimeChangerdquo Review of Economic Studies 80 1422ndash1458

Egorov G S Guriev and K Sonin (2009) ldquoWhy Resource-poor DictatorsAllow Freer Media A Theory and Evidence from Panel Datardquo American PoliticalScience Review 103 645ndash668

Gehlbach S and K Sonin (2014) ldquoGovernment Control of the Mediardquo Journalof Public Economics 118 163ndash171

Gentzkow M and E Kamenica (2016) ldquoA Rothschild-Stiglitz Approach toBayesian Persuasionrdquo American Economic Review Papers amp Proceedings 106597ndash601

Gentzkow M and J Shapiro (2006) ldquoMedia Bias and Reputationrdquo Journalof Political Economy 114(2) 280ndash316

(2010) ldquoWhat Drives Media Slant Evidence from US Daily NewspapersrdquoEconometrica 78(1) 35ndash71

Ginzburg B (2019) ldquoOptimal Information Censorshiprdquo Journal of EconomicBehavior and Organization 163 377ndash385

Goldstein I and Y Leitner (2018) ldquoStress Tests and Information DisclosurerdquoJournal of Economic Theory 177 34ndash69

32 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Guo Y and E Shmaya (2019) ldquoThe Interval Structure of Optimal DisclosurerdquoEconometrica 87 653ndash675

Johnson J P and D P Myatt (2006) ldquoOn the Simple Economics of Adver-tising Marketing and Product Designrdquo American Economic Review 96 756ndash784

Kamenica E and M Gentzkow (2011) ldquoBayesian Persuasionrdquo American Eco-nomic Review 101 2590ndash2615

Kolotilin A (2015) ldquoExperimental Design to Persuaderdquo Games and EconomicBehavior 90 215ndash226

(2018) ldquoOptimal Information Disclosure A Linear Programming Ap-proachrdquo Theoretical Economics 13 607ndash636

Kolotilin A and H Li (2019) ldquoRelational Communicationrdquo mimeoKolotilin A M Li T Mylovanov and A Zapechelnyuk (2015) ldquoPer-suasion of a Privately Informed Receiverrdquo mimeo

Kolotilin A T Mylovanov A Zapechelnyuk and M Li (2017) ldquoPer-suasion of a Privately Informed Receiverrdquo Econometrica 85 1949ndash1964

Kolotilin A and A Wolitzky (2019) ldquoAssortative Information Disclosurerdquomimeo

Lewis T R and D Sappington (1994) ldquoSupplying Information to FacilitatePrice Discriminationrdquo International Economic Review 35 309ndash327

Lorentzen P (2014) ldquoChinarsquos Strategic Censorshiprdquo American Journal of Polit-ical Science 58 402ndash414

Mullainathan S and A Shleifer (2005) ldquoThe Market for Newsrdquo AmericanEconomic Review 95(4) 1031ndash1053

Orlov D P Zryumov and A Skrzypach (2018) ldquoDesign of Macro-prudentialStress Testsrdquo mimeo

Ostrovsky M and M Schwarz (2010) ldquoInformation Disclosure and Unravelingin Matching Marketsrdquo American Economic Journal Microeconomics 2 34ndash63

Prat A and D Stromberg (2013) ldquoThe Political Economy of Mass MediardquoAdvances in Economics and Econometrics 2 135ndash187

Quah J and B Strulovici (2012) ldquoAggregating the Single Crossing PropertyrdquoEconometrica 80 2333ndash2348

Rayo L and I Segal (2010) ldquoOptimal Information Disclosurerdquo Journal ofPolitical Economy 118 949ndash987

Romanyuk G and A Smolin (2019) ldquoCream Skimming and Information Designin Matching Marketsrdquo American Economic Journal Microeconomics 11 250ndash276

Suen W (2004) ldquoThe Self-Perpetuation of Biased Beliefsrdquo Economic Journal 114377ndash396

Yamashita T (2018) ldquoOptimal Public Information Disclosure by Mechanism De-signerrdquo mimeo

Zapechelnyuk A (2019) ldquoOptimal Quality Certificationrdquo American EconomicReview Insights forthcoming

CENSORSHIP AS OPTIMAL PERSUASION 9

The sender now wishes to find the lowest q subject to 1(2 minus q) ge r which yieldsqlowast = 2 minus 1

r Thus stochastic upper censorship with ωlowast = 0 and qlowast = 2 minus 1

ris

optimal The probability that the receiver accepts the proposal is exactly equal tothe probability that the signal realization is ldquoacceptrdquo

Pr[proposal is accepted] = Pr[ω = 1] + Pr[ω = 0] middot (1minus qlowast) =1

2+

1

2middot 1minus r

r=

1

2r

We now summarize the insights obtained in this example If the state is continuousthen an optimal signal is deterministic upper censorship If the state is discrete thenan optimal monotone partition is deterministic upper censorship and an optimalsignal is stochastic upper censorship10 In the remainder of the paper we will showthat these insights extend to the case where the receiverrsquos type is uncertain

3 Continuous State

In this section we provide necessary and sufficient conditions for the optimality ofupper censorship when the state is continuous

Let m be the expected state conditional on observing a realization of a signal s Sincethe sender and receiverrsquos utilities are linear in ω they depend on the information aboutω revealed by signal s only through the expected state m In particular the receiverchooses a = 1 if and only if r le m

Let V (m) denote the indirect utility of the sender conditional on m

V (m) =

983133

rlem

E [v(ω r)|m] g(r)dr =

983133 m

0

v(m r)g(r)dr m isin [0 1] (2)

where E [v(ω r)|m] = v(m r) by the linearity of v in ω

A function V is said to be S-shaped if it is convex below some threshold and concaveabove that threshold or equivalently if V primeprime is single-crossing from above

there exists τ such that V primeprime(m) ge (le) 0 for all m lt (gt) τ

We now provide the criterion for the optimality of upper censorship Recall thatmlowast = E[ω|ω ge ωlowast]

Theorem 1 Let V be S-shaped Then and only then for all continuous F anoptimal signal is deterministic upper censorship whose cutoff state ωlowast isin [0 τ ] satisfies

V (mlowast) + V prime(mlowast)(mminusmlowast) ge (le)V (m) for all m ge (lt)ωlowast (3)

The criterion for the optimality of lower censorship is analogous

10See for example Kolotilin (2015) for a detailed treatment of the case where there is no uncer-tainty about the receiverrsquos type

10 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Remark 1 Let minusV be S-shaped Then and only then for all continuous F anoptimal signal is deterministic lower censorship whose cutoff state ωlowast isin [τ 1] satisfies

V (mlowast) + V prime(mlowast)(mminusmlowast) ge (le)V (m) for all m le (gt)ωlowast

Theorem 1 states that optimal persuasion takes a simple form of upper censorshipwhen V is S-shaped11 Moreover the theorem establishes that this result is tight inthe sense that if V is not S-shaped then there exists a distribution F of the statesuch that no upper-censorship signal is optimal

Note that every upper-censorship signal is a monotone partition Therefore Theorem1 applies in both Scenario A where the sender is free to choose any signal andScenario B where the sender is constrained to choose a monotone partition

The intuition behind Theorem 1 is as follows Observe that when no informationabout the state is revealed the receiverrsquos best-response action does not change withthe state The more information is revealed the more variable the receiverrsquos behaviorwill be in response to this information Consider an interval where V is concave Thesender would prefer not to reveal any information when the state is in this intervalsince a certain outcome is preferred to any lottery with the same expected stateConversely consider an interval where V is convex The sender would prefer to fullyreveal the state in this interval since now lotteries are preferred If V is S-shapedthat is it is convex below some threshold and concave above that threshold theinduced optimal persuasion takes the form of upper censorship

When V is S-shaped the senderrsquos optimization problem is reduced to finding anoptimal censorship cutoff ωlowast If the realized state ω is below the cutoff then it isrevealed to the receiver so the expected utility of the sender is V (ω) If the realizedstate ω is above the cutoff then the posterior expected state is mlowast = E[ω|ω ge ωlowast] sothe expected utility of the sender conditional on ω ge ωlowast is V (mlowast) The sender thusneeds to solve the problem

maxωlowastisin[01]

983133 ωlowast

0

V (ω)f(ω)dω +

983133 1

ωlowastV (mlowast)f(ω)dω (4)

The expression in (3) represents the first-order condition to this problem and capturesthree possible cases the boundary solutions ωlowast = 0 and ωlowast = 1 if the expression in(3) has the same sign for all ω and an interior solution ωlowast such that

V (mlowast) + V prime(mlowast)(ωlowast minusmlowast) = V (ωlowast) (5)

This first-order condition (5) is illustrated by Figure 1 The solid line is V (m) andthe dashed line is V (mlowast) + V prime(mlowast)(mminusmlowast) which is tangent to V at mlowast

11For a fixed distribution F Proposition 3 in Kolotilin (2018) implies that upper censorship withcutoff ωlowast is optimal if and only if (3) holds and V (m) is convex on [0ωlowast] This result can be used toshow that upper censorship is optimal when V is S-shaped For completeness we include a simpleself-contained proof inspired by the proof of Theorem 1 in Dworczak and Martini (2019)

CENSORSHIP AS OPTIMAL PERSUASION 11

1ωlowast

V

revealing pooling

mlowast0

Figure 1 Optimal upper censorship with cutoff ωlowast

The solution of the senderrsquos problem (4) is particularly simple if V is either globallyconvex or globally concave In this case the first-order condition in (3) has a constantsign so either ωlowast = 0 or ωlowast = 1 must be optimal If V is convex then ωlowast = 1 isoptimal which corresponds to the fully informative signal Similarly if V is concavethen ωlowast = 0 is optimal which corresponds to the completely uninformative signalThis is summarized in the following corollary

Corollary 1 An optimal signal is

(i) fully informative for all F if and only if V is convex

(ii) completely uninformative for all F if and only if V is concave

31 Canonical Case We now consider the canonical case where the senderrsquos utilitysatisfies assumption (A1)

The density g of receiver types r is said to be log-concave if ln g(r) is concave in r12

Note that ln g(r) is well defined in the canonical case

In this case the shape of the senderrsquos indirect utility V defined by (9) is connectedto the shape of the density g of receiver types as follows

Lemma 1 In the canonical case V is S-shaped for all ρ if and only if g is log-concave

12Table 1 in Bagnoli and Bergstrom (2005) reports distributions with log-concave densities

12 KOLOTILIN MYLOVANOV ZAPECHELNYUK

That is if the density of receiver types g is log-concave then the senderrsquos indirectutility V is S-shaped Moreover this result is tight in the sense that if g is notlog-concave then there exists ρ such that V is not S-shaped

By Theorem 1 and Lemma 1 we obtain a criterion for the optimality of upper cen-sorship with the condition on the primitive of the model the density g

Theorem 2 Consider the canonical case and let the density g be log-concave Thenand only then an optimal signal is deterministic upper censorship for all F and all ρ

The symmetric statement is also true an optimal signal is lower censorship for all Fand all ρ if and only if minusg is log-concave Consequently if both g and minusg are log-concave (that is g is exponential) then there exists an optimal signal which is bothupper censorship and lower censorship There are only two signals with this propertyfully informative and completely uninformative This allows us to obtain conditionson the distribution of receiver types under which the optimal signal is polarizedbetween fully informative and completely uninformative signals as for example inLewis and Sappington (1994) and Johnson and Myatt (2006)13

Corollary 2 In the canonical case an optimal signal is either fully informative orcompletely uninformative for all F and all ρ if and only if there exist λ isin R and c gt 0such that g(r) = ceminusλr for r isin [0 1]

If g(r) = ceminusλr then the fully informative signal is optimal whenever ρ ge λ and thecompletely uninformative signal is optimal whenever ρ le λ (and any signal is optimalwhen ρ = λ) In particular if ρ = 0 the optimal signal is fully determined by thesign of λ which is in turn determined by whether the mean of r is smaller or greaterthan 1

2

32 Comparative Statics Theorem 2 allows for a sharp comparative statics anal-ysis on the informativeness of the optimal signal

We compare signals by their Blackwell informativeness (Blackwell 1953) To compareupper-censorship signals s1 and s2 we only need to compare their cutoffs ωlowast

1 and ωlowast2

Signal s1 is more informative than signal s2 if ωlowast1 ge ωlowast

2 Indeed state ω isin [0ωlowast2) is

fully revealed by both s1 and s2 and state ω isin [ωlowast2 1] is partially revealed by s1 and

not revealed at all by s2 so s1 is more informative than s2

13In the case of log-concave density g the sets of upper- and lower-censorship signals are totallyordered by the rotation order of Johnson and Myatt (2006) If we restrict attention to the set oflower-censorship signals (which are generally suboptimal under log-concave g) then the rotationpoint is decreasing with respect to the rotation order By Lemma 1 of Johnson and Myatt (2006)the senderrsquos expected utility is quasiconvex and one of the extreme lower-censorship signals (thefully informative or completely uninformative signal) is optimal for the sender But if we restrictattention to the set of upper-censorship signals (which are optimal under log-concave g) then therotation point is not decreasing with respect to the rotation order Therefore Lemma 1 of Johnsonand Myatt (2006) does not apply and an interior upper-censorship signal is generally optimal forthe sender

CENSORSHIP AS OPTIMAL PERSUASION 13

For the purpose of comparison we extend the definition of distribution function Gto the real line and assume that its density g is log-concave Consider a family ofdistributions Gtσ of receiver types

Gtσ (r) = G

983061τ minus t+

r minus τ

σ

983062

where τ is the maximum point of g and t isin R and σ gt 0 are parameters Let gtσ (r)denote the corresponding density Note that gtσ (r) is log-concave for all t isin R andall σ gt 0

Because gtσ is log-concave on [0 1] deterministic upper censorship is optimal byTheorem 2 Let ωlowast (ρ t σ) isin [0 1] be the optimal censorship cutoff as given by (3)

The shift parameter t shifts the distribution along the horizontal axis and the stretchparameter σ stretches the distribution horizontally and symmetrically around themode τ so the limit of σ rarr 0 leads to the unit mass on τ We now show that thesender optimally discloses more information when

(i) the senderrsquos preferences are better aligned with the receiverrsquos preferences (thealignment parameter ρ is greater)

(ii) the receiver is more reluctant to accept the proposal (the shift parameter t isgreater)

(iii) the sender is more uncertain about the receiverrsquos type (the stretch parameter σis greater) provided ρ ge 014

Theorem 3 In the canonical case a cutoff state of an optimal signal satisfies

(i) ωlowast (ρ t σ) is increasing in ρ

(ii) ωlowast (ρ t σ) is increasing in t

(iii) ωlowast (ρ t σ) is increasing in σ if ρ ge 0

The intuition for part (i) is that for a higher ρ the sender puts more weight onthe receiverrsquos utility so she optimally endows the receiver with a higher utility byproviding more information

The intuition for part (ii) is that for a higher t each type of the receiver has a greatercost of accepting the proposal so to persuade the same type of the receiver the senderneeds to increase E[ω|ω ge ωlowast] by expanding the full disclosure interval [0ωlowast]

The intuition for part (iii) is that for a higher σ receiver types are more spread out soto persuade the same mass of types the sender again needs to increase E[ω|ω ge ωlowast]

14The case ρ ge 0 is the practically relevant case where the senderrsquos utility is a weighted averageof the receiverrsquos action and utility 1

1+ρv(ω r) =1

1+ρ + ρ1+ρ (ω minus r)

14 KOLOTILIN MYLOVANOV ZAPECHELNYUK

4 Discrete State

In this section we assume that the state is a discrete random variable

Theorem 1prime Let V be S-shaped Then and only then for all discrete F

(A) an optimal signal is stochastic upper censorship whose cutoff pair (ωlowast qlowast) satisfiescondition (3)

(B) an optimal monotone partition is deterministic upper censorship whose cutoff pair(ωlowast

d qlowastd) satisfies ωlowast

d = ωlowast and qlowastd isin 0 1

In the canonical case we obtain the result analogous to Theorem 2

Theorem 2prime Consider the canonical case and let the density g be log-concave Thenand only then for all discrete F and all ρ

(A) an optimal signal is stochastic upper censorship

(B) an optimal monotone partition is deterministic upper censorship

Let us discuss how the comparative statics result (Theorem 3) changes In Section 32we argue that upper-censorship signals can be ordered by their censorship cutoffs Butwhen the state is discrete a censorship cutoff is described by a pair (ωlowast qlowast) where ωlowast

is a cutoff state and qlowast is the probability of revealing this cutoff state when it realizes

Consider two stochastic upper-censorship signals s1 and s2 with cutoff pairs (ωlowast1 q

lowast1)

and (ωlowast2 q

lowast2) Denote (ωlowast

1 qlowast1) ≽ (ωlowast

2 qlowast2) if ω

lowast1 gt ωlowast

2 or ωlowast1 = ωlowast

2 and qlowast1 ge qlowast2 Observethat s1 is more (Blackwell) informative than s2 if and only if (ωlowast

1 qlowast1) ≽ (ωlowast

2 qlowast2) This

comparison also applies to deterministic upper-censorship signals with the constraintthat qlowast1 and qlowast2 are in 0 1

With the above order on cutoff pairs Theorem 3 extend as follows

Theorem 3prime Consider the canonical case

(A) A cutoff pair of an optimal signal satisfies

(i) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in ρ

(ii) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in t

(iii) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in σ if ρ ge 0

(B) A cutoff pair of an optimal monotone partition satisfies15

(i) (ωlowastd (ρ t σ) q

lowastd (ρ t σ)) is increasing in ρ

(ii) (ωlowastd (ρ t σ) q

lowastd (ρ t σ)) is increasing in t

15Part (iii) also holds for an optimal monotone partition when for example the discrete supportof the state ω is a regular grid with sufficiently many points However it is possible to construct acounterexample where part (iii) fails for an optimal monotone partition

CENSORSHIP AS OPTIMAL PERSUASION 15

That is the sender optimally discloses more information when she is less biasedrelative to the receiver (the alignment parameter ρ is greater) when the receiver ismore reluctant to accept the proposal (the shift parameter t is greater) and underScenario A when the sender is more uncertain about the receiverrsquos type (the stretchparameter σ is greater)

Remark 2 Theorem 1prime and Theorem 2prime show that upper censorship emerges as anoptimal persuasion strategy under the same conditions (V is S-shaped in Theorem1prime and g is log-concave in Theorem 2prime) regardless of whether the sender can choosean arbitrary signal or is constrained to choose a monotone partition It may betempting to conjecture that if the sender was allowed to choose an arbitrary partition(not necessarily monotone) the same result would hold This however is not true

For illustration suppose that the sender wishes to maximize the probability of a = 1and the receiver has type r = 1

2with certainty16 Let the state ω take values in

0 14 1 with probabilities (1

6 23 16) so the prior mean is 1

3

An optimal signal is stochastic upper censorship with cutoff pair (ωlowast qlowast) = (14 12)

which sends the pooling message with certainty at ω = 1 and with probability 1minusqlowast =12at ω = 1

4 and reveals the state otherwise The pooling message induces the posterior

mean m = 12 leading to a = 1 all other messages induce posterior means below 1

2

leading to a = 0 Under the optimal signal action a = 1 is chosen with probability23middot 12+ 1

6= 1

2

An optimal monotone partition is deterministic upper censorship that fully revealsthe state where only ω = 1 leads to a = 1 Under this partition action a = 1 ischosen with probability 1

6

An optimal partition pools the extreme states ω = 0 and ω = 1 and reveals theintermediate state ω = 1

4 The pooling message induces the posterior mean m = 1

2

leading to a = 1 This optimal partition is nonmonotone Under this partitionaction a = 1 is chosen with probability 1

6+ 1

6= 1

3 which is strictly better than the

probability of 16obtained under the optimal monotone partition

5 Application to Media Censorship

In this section we apply our results to the problem of media censorship by thegovernment In the modern world people obtain information about the governmentrsquosstate through various media sources such as television newspapers and internetblogs Without the media most people would not know what policies and reformsthe government pursues and how effective they are Media outlets have differentpositions on the political spectrum and differ substantially in how they select andpresent facts to cover the same news People choose their sources of information based

16In this case V (m) = 1 if m ge 12 and V (m) = 0 if m lt 1

2 Observe that the graph of V is

S-shaped and it can be approximated by a continuously differentiable S-shaped function

16 KOLOTILIN MYLOVANOV ZAPECHELNYUK

on their political ideology and socioeconomic status This information is valuable forsignificant individual decisions on migration investment and voting to name a fewIndividuals may not fully internalize externalities that their decisions impose on thesociety Likewise the government may not have the societyrsquos best interest at heartTo further its goals the government then wishes to influence individual decisions bymanipulating the information through media In autocracies and countries with weakchecks and balances the government has power to censor the media content

The governmentrsquos problem of media censorship can be represented as the persua-sion problem in Section 2 We apply our results to provide conditions for the op-timality of upper-censorship policies that censor all media outlets except the mostpro-government ones Furthermore we interpret our comparative statics results

51 Setup There is a continuum of heterogeneous citizens indexed by r isin [0 1]distributed with G Each citizen chooses between a = 0 and a = 1 The utility of acitizen of type r is given by

u(θ r ar a) = (θ minus r)ar + κ(θ r a)

where ar isin 0 1 denotes the citizenrsquos own action a =983125ardG(r) denotes the aggre-

gate action in the society θ isin [0 1] captures an unobserved benefit from action 1 ascompared to action 0 and κ captures the impact of the aggregate action a on thecitizenrsquos utility The term (θ minus r)ar is a private surplus of a citizen of type r Theterm κ(θ r a) is an externality because for a citizen of type r it is optimal to ignorethis term and choose ar = 1 if and only if θ ge r

There is a government which is concerned with a weighted average of the social utilityand the governmentrsquos intrinsic benefit from the aggregate action For a given θ thegovernmentrsquos utility is given by

983133 1

0

ν(θ r ar a)dG(r) + δγ(θ a)

The term ν captures a citizenrsquos utility from the governmentrsquos perspective We allow νto be different from u to reflect paternalistic or other concerns The term γ capturesthe governmentrsquos intrinsic benefit from the aggregate action The parameter δ ge 0captures the weight of the aggregate action in the governmentrsquos utility

Let T be a distribution of the random variable θ We assume that distributions G andT are independent and admit continuously differentiable and strictly positive densi-ties We also assume that κ ν and γ are linear in θ and continuously differentiablein r and a Furthermore to simplify interpretations we assume that ν and γ arenon-decreasing in θ and a so that for the government a high θ is a good news anda higher aggregate action is preferable

Citizens obtain information about the unobservable benefit θ through media outletsEach media outlet is identified by its editorial policy c isin [0 1] and it endorses action

CENSORSHIP AS OPTIMAL PERSUASION 17

a = 1 if θ ge c and criticizes it if θ lt c17 A set C of media outlets is a finite subsetof [0 1]18

The governmentrsquos censorship policy is a set of the media outlets X sub C that arepermitted to broadcast so the rest of the media outlets are censored

The timing is as follows First the government chooses a set X sub C of permittedmedia outlets Second state θ is realized and each permitted media outlet endorsesor criticizes action a = 1 according to its editorial policy Finally each citizen observesmessages from all permitted media outlets updates his beliefs about θ and choosesan action

52 Discussion We now discuss interpretations of the key components of the mediacensorship application As in Gehlbach and Sonin (2014) there can be various inter-pretations of the citizenrsquos action a = 1 such as voting for the government supportinga governmentrsquos policy or taking an individual decision that benefits the governmentA citizenrsquos type r can be interpreted as his ideological position or preference param-eter A citizen who is more supportive of the government has a smaller r

A media outlet with a higher editorial policy c can be interpreted as being less loyalto the government because it criticizes the government on a larger set of states Aneditorial policy c isin C can therefore represent a slant or political bias of the outletagainst the government and can be empirically measured as the frequency with whichthe outlet uses anti-government language Gentzkow and Shapiro (2010) constructsuch a slant index for US newspapers Empirical findings of their paper suggestthat the editorial policies of media outlets are driven by reader preferences justifyingour assumption of the existence of a large variety of editorial policies19 As in Suen(2004) Chan and Suen (2008) and Chiang and Knight (2011) the assumption ofthe binary media reports that communicate only whether the state θ is above somestandard c can be justified by a cursory readerrsquos preference for simple messages suchas positive or negative opinions and yes or no recommendations

The governmentrsquos censorship of media outlets can take various forms For examplethe government can ban access to internet sites withdraw licenses disrupt financingconfiscate print materials and equipment discredit media providers and even arresteditors and journalists using broadly formulated legislation on combating extremismIn some countries the government can exercise direct control over media editorialpolicies either through state ownership or administrative pressure

17The tie-breaking in the event of θ = c is unimportant as θ is a continuous random variable18We discuss the case of a continuum of media outlets in Section 5519Theoretical literature has explored the determinants of media slant of an outlet driven by its

citizens (Mullainathan and Shleifer 2005 Gentzkow and Shapiro 2006 and Chan and Suen 2008)and its owners (Baron 2006 and Besley and Prat 2006)

18 KOLOTILIN MYLOVANOV ZAPECHELNYUK

53 Formulating Media Censorship as Persuasion We now show that the me-dia censorship problem can be formulated as a linear persuasion problem in whichthe government is a sender and a representative citizen is a receiver

In our model the citizensrsquo and the governmentrsquos utilities are linear in θ There-fore given any information from the media outlets the utilities depend only on theposterior mean of θ

Consider an arbitrary posterior mean of θ denoted by m isin [0 1] Each citizen of typer chooses ar = 1 if and only if r le m The externality term κ plays no role in thisdecision Therefore the aggregate action a is simply the mass of all citizens whosetypes do not exceed m so a = G(m)

Next using the citizensrsquo optimal behavior we derive the governmentrsquos expected utilityconditional on the posterior mean m

V (m) = E983063983133 1

0

ν(θ r ar G(m))dG(r) + δγ(θ G(m))

983055983055983055983055m983064

=

983133 1

0

ν(m r1rlem G(m))dG(r) + δγ(mG(m)) (6)

This is the governmentrsquos indirect utility As in Section 2 the government now choosesa signal which is informative about the state to maximize its expected utility How-ever in contrast to Section 2 here the government is restricted to signals that areimplementable by a subset of a given set of media outlets

Next we show that with an appropriate definition of the state the restriction tosignals implementable by a subset of a given set of media outlets can be formulatedas a restriction to monotone partitional signals as in Scenario B in Section 2 Thuswe will be able to use our results in Section 4 to find the optimal censorship policies

Recall that the information about θ is only available through the media outlets Let ωdenote a random variable equal to the conditional expectation of θ given the messagesof all media outlets in C Note that as C is finite ω is a discrete random variablewith values in [0 1] Let F denote the distribution of ω From now on we treat therandom variable ω as the state and denote by s a signal as defined in Section 2

Let HM denote the set of all distributions of the posterior mean state induced byall monotone partitional signals and let HC denote the set of all distributions ofthe posterior mean state induced by all finite subsets of the set C of media outletsObserve that any set of media outlets induces a monotone partitional signal thatis HC sub HM We now show that every outcome implementable by an arbitrarymonotone partition in HM is also implementable by a subset of media outlets

Lemma 2 HC = HM

For illustration suppose that there is only one media outlet with the editorial policyc isin (0 1) There are two observable events θ ge c and θ lt c In each realized event

CENSORSHIP AS OPTIMAL PERSUASION 19

the citizens compute the posterior expectation of θ which we will call ω So in thisexample the distribution F of ω has two mass points one for each of the two events

Given these two mass points every monotone partition of [0 1] induces one of twopossible distributions of the posterior mean state (i) the mass points revealed and (ii)the mass points pooled Observe however that these distributions are implementableby permitting and censoring the media outlet respectively

A censorship policy is upper censorship if it censors all sufficiently disloyal mediaoutlets Specifically there exists a cutoff clowast isin C and an indicator qlowast isin 0 1 suchthat all media outlets whose editorial policies are below clowast are permitted all mediaoutlets whose editorial policies are above clowast are censored and clowast is permitted if andonly if qlowast = 1 That is X = c isin C c le clowast if qlowast = 1 and X = c isin C c lt clowastif qlowast = 0 Note that the full censorship policy clowast = 0 and qlowast = 0 (where all mediaoutlets are censored) and the free media policy clowast = 1 and qlowast = 1 (where all mediaoutlets are permitted) are the two extreme upper-censorship policies

As shown above the media censorship problem is equivalent to the linear persuasionproblem in which the sender is restricted to monotone partitional signals and thesenderrsquos indirect utility V is given by (6) We thus apply Theorem 1prime(B) to obtainthe following result

Theorem 1primeprime If V is S-shaped then an optimal censorship policy is upper censorship

To illustrate Theorem 1primeprime suppose the government is interested only in the aggregateaction so that ν = 0 and γ depends only on the aggregate action a so V (m) =γ(G(m)) Thus an upper-censorship policy is optimal if the composition functionγ(G(middot)) is S-shaped For example this condition holds if the government is interestedin reaching a certain approval threshold (eg a simple majority) so that γ is a stepfunction This condition also holds if γ is S-shaped and G is uniform or S-shapedwith the same inflection point as γ20

54 Canonical Case We now impose more structure on the utilities to obtain asharper result for the optimality of upper censorship and to perform a comparativestatics analysis Analogously to assumption (A1) we assume that the governmentrsquosutility is a weighted average of the citizensrsquo utility and their aggregate action

983133 1

0

u(θ r ar a)dG(r) + δa

where u(θ r ar a) = (θ minus r)ar + ζ(r)a

(A2)

for some continuously differentiable function ζ Define

β =

983133 1

0

ζ(r)dG(r) + δ

20This condition holds in many special cases where γ and G are S-shaped even when γ and Ghave different inflection points

20 KOLOTILIN MYLOVANOV ZAPECHELNYUK

The term983125 1

0ζ(r)dG(r) is the governmentrsquos expected bias towards the citizensrsquo action

a = 1 due to the citizensrsquo externality and the term δ is the governmentrsquos intrinsicbias towards a greater aggregate action a Thus we interpret β as the governmentrsquosaggregate bias This decomposition of the bias allows for different interpretations whythe government is biased So β can be high because the government is too self-serving(high δ) or because the government is benevolent (low δ) but wishes to internalizestrong positive externalities of the citizens (high ζ(r)) To ease interpretations weassume that β gt 021

Under Assumption (A2) the governmentrsquos indirect utility V given by (6) becomes

V (m) =

983133 m

0

(mminus r)dG(r) + βa =

983133 m

0

v(m r)dG(r)

where

v(m r) = β

9830611 +

1

β(mminus r)

983062

This is the same as v given by Assumption (A1) with ρ = βminus1 up to rescaling by apositive constant Consequently we can apply Theorem 2prime(B) to obtain the followingresult

Theorem 2primeprime Let (A2) hold If the density g of citizens types is log-concave then anoptimal censorship policy is upper censorship

We now apply the comparative statics analysis presented in Section 32 Note thatthe upper-censorship policies are ordered according to the amount of informationtransmitted to the citizens in the sense of Blackwell (1953) A greater censorshipthreshold clowast means that more media outlets are permitted With this order in mindwe apply Theorem 3prime(B) to make a comparative statics analysis on the amount ofinformation that is optimally disclosed by the government

First the censorship cutoff is increasing in the alignment parameter ρ Recall thatthis is a reciprocal of the governmentrsquos bias ρ = βminus1 This means that the governmentoptimally discloses more information (the censorship cutoff is greater) when it is lessbiased (β is smaller) Intuitively as β decreases the government puts more weighton the citizensrsquo utility so it optimally endows the population with a higher utility bycensoring fewer media outlets and disclosing more information

Second the censorship cutoff is increasing in the magnitude t of the horizontal shiftof the density g A greater t corresponds to a greater opportunity cost of actiona = 1 for each citizen This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more difficultto persuade to take action a = 1 (parameter t is greater) Informally speaking topersuade the same type of the citizen the government needs to increase the posterior

21If β lt 0 then swapping the roles of a = 0 and a = 1 reverses the sign of the bias If β = 0then the governmentrsquos utility and the citizensrsquo private interests coincide so it is trivially optimal topermit all media outlets

CENSORSHIP AS OPTIMAL PERSUASION 21

mean But the expected posterior mean must be equal to the prior mean so it is notpossible to increase all posterior means Due to the log-concave shape of the densityof the citizensrsquo types this tradeoff is resolved by increasing the posterior mean of thepooling interval E[ω|ω ge clowast] This is done by shrinking the pooling interval [clowast 1]that is increasing the censorship cutoff clowast

Third when the number of media outlets is sufficiently large the censorship cutoffis increasing in the stretch factor σ of the density g around its peak τ (see Footnote15) A greater σ corresponds to a more dispersed distribution of the opportunitycost in the population This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more diverse inhow difficult they are to persuade Intuitively when the types are more spread outpersuading the same mass of types requires a greater posterior mean which leads toa higher censorship cutoff similarly to the comparative statics in t discussed above

55 Extensions and Open Questions Let us now consider a few extensions ofour model of media censorship

In our model the set of media outlets is exogenous and the governmentrsquos only in-strument is censorship We now consider three alternative ways of expanding thegovernmentrsquos instruments of influence

First suppose that the government is able not only to censor existing media outletsbut also to introduce new media outlets with chosen editorial policies This is equiv-alent to our censorship model where all media outlets in [0 1] are initially availableand the government can censor any subset of them When all media outlets in [0 1]are permitted the revealed information about θ (which we call ω) is θ itself Thusω = θ is a continuous random variable with distribution F = T So we can nowapply our results for the continuous state from Section 3 instead of those for the dis-crete state in Section 4 reaching the same conclusion about the optimality of uppercensorship

Second suppose that the government can garble information available from mediaoutlets That is the government is not restricted to monotone partitions it cancreate arbitrary signals about state ω for the citizens to observe This becomes ageneral persuasion problem and our Theorems 1prime(A) 2prime(A) and 3prime(A) apply

Third suppose that the government is able to restrict not only which media outletsare permitted but also how many media outlets each citizen can choose to observeIn this extension the citizens are not allowed to communicate with one another (oth-erwise they could share the information thus observing all permitted media outletsindirectly) This extension does not affect our results as long as each citizen is al-lowed to access at least one media outlet of his choice as in Chan and Suen (2008)Intuitively this is because each citizen categorizes the information from the mediaoutlets into ldquogood newsrdquo where a = 1 is optimal and ldquobad newsrdquo where a = 0 is

22 KOLOTILIN MYLOVANOV ZAPECHELNYUK

optimal Because the information from the media outlets induces a monotone parti-tion it means that ldquogood newsrdquo is separated from ldquobad newsrdquo by a threshold mediaoutlet that depends on the citizenrsquos type That is it is sufficient to observe a singlethreshold media outlet to distinguish ldquogood newsrdquo from ldquobad newsrdquo

In our model each citizenrsquos utility depends on the aggregate action a through theexternality term κ(θ r a) which does not affect the chosen action Let us relax thisassumption so that a citizenrsquos optimal choice can depend on a We can still writethe senderrsquos indirect utility V as a function of the posterior mean state and applyour results However V is no longer uniquely determined by the primitives of themodel It is now endogenous and depends on an equilibrium the citizens play aseach citizenrsquos optimal action now depends on what all citizens do in equilibrium Forexample given the same information about the state a citizen could prefer to choosea = 1 if and only if sufficiently many citizens choose the same action so a is largeenough This creates the problem of multiplicity of equilibria and as a consequencethe dependence of optimal censorship on equilibrium selection

As a side our media censorship problem can also be applied to spatial voting mod-els as in Chiang and Knight (2011) Consider a government party (p = G) and anopposition party (p = O) competing in an election If party p wins a voter withan ideological position r gets utility wp minus (r minus rp)

2 where wp is the quality or va-lence of party p and rp is the ideology or policy platform of party p Voters knowthe partiesrsquo ideologies and obtain information about the partiesrsquo qualities from allavailable media outlets Each voter supports the party that maximizes his expectedutility Within this context our analysis still applies because the voterrsquos utility dif-ference between the government and opposition parties is proportional to θminusr whereθ = (wG minus wO + r2O minus r2G)2(rO minus rG)

There are a few more extensions that can be relevant in applications First insteadof complete censorship of a media outlet there can be a cost of accessing it Forexample an international news channel can be censored by a local government butcitizens may still access it through VPN at some cost Second it can be costly forthe government to censor media outlets So another important question to answeris how a government may prioritize censoring Finally citizens can incur some costof following each media outlet While we have already mentioned that citizens gainno benefit from following more than one outlet it is entirely possible for them tostop watching news altogether if it is sufficiently uninformative These extensions arenontrivial and left for future research

Appendix

Proof of Theorem 1 Suppose that V is S-shaped For mωlowast isin [0 1] and mlowast(ωlowast) =E[ω|ω ge ωlowast] define

V (mωlowast) = V (mlowast(ωlowast)) + V prime(mlowast(ωlowast))(mminusmlowast(ωlowast)) (7)

CENSORSHIP AS OPTIMAL PERSUASION 23

Since V (m) is convex on [0 τ ] and concave on [τ 1] there exists ωlowast isin [0 τ ] thatsatisfies condition (3) that is

V (mωlowast) ge (le)V (m) for all m ge (lt)ωlowast (8)

This is true because V (ωlowastωlowast) minus V (ωlowast) is single-crossing from below To see thisconsider Figure 1 Observe that (3) can hold only if ωlowast is on the convex part of V andmlowast is on the concave part of V As the censorship cutoff ωlowast increases the posteriormean state of the pooling message mlowast(ωlowast) also increases But because V is concaveat mlowast(ωlowast) the dashed tangent line V (mωlowast) becomes flatter in m so it crosses thesolid line V (m) at a smaller m

Consider an arbitrary signal s Let H be the distribution of m = E[ω|s] inducedby signal s Since each signal is a garbling of the fully informative signal F is amean-preserving spread of H Thus the senderrsquos expected utility is smaller undersignal s than under the upper-censorship signal with cutoff ωlowast as follows from

983133 1

0

V (m)dH(m) le983133 1

0

maxV (m) V (mωlowast)dH(m)

le983133 1

0

maxV (m) V (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mlowast(ωlowast))dF (m)

where the second line holds because maxV (m) V (mωlowast) is convex in m and F isa mean-preserving spread of H the third line holds by (8) and the last line holds by

(7) and by983125 1

ωlowast(mminusmlowast(ωlowast))dF (m) = 0

Conversely suppose that V is not S-shaped Then there exist 0 le m1 lt m2 lem3 lt m4 le 1 such that V primeprime(m) lt 0 for m isin (m1m2) V

primeprime(m) = 0 for m isin [m2m3]and V primeprime(m) gt 0 for m isin (m3m4) because V primeprime is continuous by assumption It isstraightforward to show that there exists 983141mlowast isin (m1m2) (sufficiently close to m2) and983141ωlowast isin (m3m4) such that

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) lt V (m) for all m isin (983141ωlowastm4]

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) gt V (m) for all m isin [m1 983141ωlowast] 983141mlowast

Moreover there exists a continuous distribution F such that the support of F is asubset of [m1m4] and 983141mlowast = E[ω|ω le 983141ωlowast] Using a chain of inequalities similar to theone above it is easy to show that lower-censorship signal 983141slowast with cutoff 983141ωlowast is optimalFurthermore it is uniquely optimal as both inequalities hold with equalities only forsignal 983141slowast This in turn implies that any upper-censorship signal is suboptimal

24 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Proof of Lemma 1 Notice that under assumption (A1) we have

V primeprime(m) = gprime(m) + ρg(m) for m isin [0 1]

Recall that V is S-shaped if and only if V primeprime(m) = gprime (m) + ρg (m) is single-crossingfrom above Using Proposition 1 in Quah and Strulovici (2012) it is easy to showthat V primeprime(m) is single-crossing from above for all ρ isin R if and only if gprime (m)g (m) isnonincreasing in m (that is ln g(m) is concave)

Proof of Theorem 2 Immediate by Theorem 1 and Lemma 1

Proof of Theorem 3 Part (i) We fix t and σ and vary ρ Without loss ofgenerality let t = 0 and σ = 1 (so that Gtσ = G) Consider a signal that induces thedistribution H of posterior mean m This signal implements the receiverrsquos interimutility U given by

U(r) =

983133 1

r

(mminus r)dH(m) for r isin [0 1]

which holds because the receiver acts if and only if m ge r Since types r lt 0 alwayschoose a = 1 and types r gt 1 always choose a = 0 we can exclude them fromconsideration and assume that supp(G) = [0 1] The senderrsquos expected utility is then

983133 1

0

V (m)dH(m) =

983133 1

0

983133 m

0

(1 + ρ(mminus r))dG(r)dH(m)

=

983133 1

0

983133 1

r

(1 + ρ(mminus r))dH(m)dG(r)

=

983133 1

0

(1minusH(r) + ρU(r))dG(r)

Consider ρ2 gt ρ1 Suppose by contradiction that the corresponding optimal upper-censorship signals s2 and s1 are such that s1 is strictly more informative than s2 LetU2 and U1 be the receiverrsquos interim utilities implemented by s2 and s1 Since thesender prefers signal s2 under ρ2 and signal s1 under ρ1 we have

983133 1

0

(1minusH2 (r) + ρ2U2 (r)) dG (r) ge983133 1

0

(1minusH1 (r) + ρ2U1 (r)) dG (r)

983133 1

0

(1minusH1 (r) + ρ1U1 (r)) dG (r) ge983133 1

0

(1minusH2 (r) + ρ1U2 (r)) dG (r)

Summing up these inequalities gives

(ρ2 minus ρ1)

983133 1

0

(U2 (r)minus U1 (r)) dG (r) ge 0

leading to a contradiction because U1(r) ge U2(r) for all r with strict inequality forsome r given that s1 is strictly more informative than s2

CENSORSHIP AS OPTIMAL PERSUASION 25

Part (ii) We fix ρ isin R and σ = 1 and vary t The indirect utility of the senderconditional on m isin [0 1] is given by

Vt(m) =

983133

rlem

(1+ρ(mminusr))gt1(r)dr =

983133

rlemminust

(1+ρ(mminustminusr))g(r)dr = V (mminust) (9)

By Theorem 1 when t0 = 0 an optimal cutoff ωlowast0 satisfies

V (m) le V (mlowast0) + V prime(mlowast

0)(mminusmlowast0) for m isin [ωlowast

0 1]

When V is S-shaped and t1 lt 0 it is easy to see that

V (mminus t1) le V (mlowast0 minus t1) + V prime(mlowast

0 minus t1)(mminusmlowast0) for m isin [ωlowast

0 1]

which implies that when t = t1 the above inequality changes its sign at some ωlowast1

that satisfies ωlowast1 le ωlowast

0 It follows from (9) that ωlowast1 is an optimal cutoff for t = t1 and

as we established above ωlowast1 le ωlowast

0

Part (iii) We fix ρ ge 0 and t = 0 and vary σ The indirect utility of the senderconditional on m isin [0 1] is given by

V ρσσ(m) =

983133

rlem

9830591 +

ρ

σ(mminus r)

983060dG0σ(r)

=

983133

rlem

9830591 +

ρ

σ(mminus r)

983060g

983061r minus (1minus σ)τ

σ

9830621

σdr

=

983133

rlemminus(1minusσ)τσ

9830611 + ρ

983061mminus (1minus σ)τ

σminus r

983062983062g(r)dr

= Vρ1

983061mminus (1minus σ)τ

σ

983062 (10)

Let σ0 = 1 σ1 lt 1 and ρ1 = ρσ1 To compare the optimal cutoffs under parameters(ρ σ0) and (ρ σ1) we consider two steps first comparing the cutoffs between (ρ σ0)and (ρ1 σ1) and then comparing the cutoffs between (ρ1 σ1) and (ρ σ1)

First we compare an optimal cutoff ωlowast0 under (ρ σ0) = (ρ 1) and an optimal cutoff

ωlowast1 under (ρ1 σ1) By Theorem 1 ωlowast

0 satisfies

Vρ1(m) le Vρ1(mlowast0) + V prime

ρ1(mlowast0)(mminusmlowast

0) for m isin [ωlowast0 1]

When V is increasing (implied by ρ ge 0) and S-shaped and σ1 lt 1 it is easy to seethat for all m isin [ωlowast

0 1]

Vρ1

983061mminus (1minus σ1)τ

σ1

983062le Vρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062+ V prime

ρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062(mminusmlowast

0)

which implies that when σ = σ1 lt 1 the above inequality changes its sign at someωlowast1 that satisfies ωlowast

1 le ωlowast0 It follows from (10) that ωlowast

1 is an optimal cutoff under(ρσ σ1) = (ρ1 σ1) and as we established above ωlowast

1 le ωlowast0

26 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Second we compare the optimal cutoff ωlowast1 under (ρ1 σ1) and the optimal cutoff ωlowast

2

under (ρ σ1) Because ρ lt ρ1 = ρσ1 by part (i) we have ωlowast2 le ωlowast

1

Combining ωlowast1 le ωlowast

0 and ωlowast2 le ωlowast

1 we obtain ωlowast2 le ωlowast

0

Proof of Theorem 1prime Part (A) The proof is analogous to that of Theorem 1 andis thus omitted

Part (B) The only if statement follows from that in part (A) (see the proof of Theorem1) and the fact that we can always ensure that an optimal stochastic censorship(ωlowast qlowast) is deterministic (so qlowast isin 0 1) by the choice of the support and distributionof ω Thus we only need to prove the if statement

Let F be a distribution of ω with support on n values 0 lt ω0 lt lt ωnminus1 lt 1 Thisis without loss of generality because if 0 andor 1 were in the support of F we couldredefine the state as ω = (ω + ε)(1 + 2ε) and the type as r = (r + ε)(1 + 2ε)

Let X sub ω1 ωnminus1 identify a set of cutoffs defining a monotone partition for eachωi isin X the receiver is informed whether ω lt ωi or ω ge ωi Note that we exclude ω0

from the set of cutoffs because ω ge ω0 always holds

Note that X is deterministic upper censorship if and only if there exists a censorshipthreshold ωlowast isin [0 1] such that for each i = 1 nminus 1 ωi isin X if and only if ωi lt ωlowastthat is we can assume that qlowast = 0 without loss of generality

Let VX be the senderrsquos expected utility when the monotone partition is given by X

If n = 2 then there are only two partitions X = ω1 (full disclosure) and X = empty(no disclosure) Both are upper censorships so part (B) holds trivially

If n = 3 then there are four monotone partitions X = ω1ω2 (full disclosure)X = ω2 X = ω1 and X = empty (no disclosure) Only X = ω2 is not uppercensorship The next lemma proves that X = ω2 cannot be uniquely optimal whenV is S-shaped thus proving the optimality of upper censorship for n = 3

Denote

mjk = E983045ω|ω isin [ωjωk]

983046

For convenience of notation we write mj = ωj

Lemma 3 Let V be S-shaped with an inflexion point τ Let ω be a discrete randomvariable with support on m0m1m2 and probabilities p0 p1 p2 where 0 lt m0 ltm1 lt m2 lt 1 Then

(i) if τ le m0 then Vm2 le Vempty

(ii) if τ ge m1 then Vm2 le Vm1m2

(iii) if m0 lt τ lt m1 then Vm2 le maxVempty Vm1m2

CENSORSHIP AS OPTIMAL PERSUASION 27

Proof Define p01 = p0 + p1 and p02 = p01 + p2 = 1 Observe that

m01 =p0m0 + p1m1

p0 + p1and m02 =

p0m0 + p1m1 + p2m2

p0 + p1 + p2

We thus have

Vempty = p02V (m02) = (p01 + p2)V

983061p01m01 + p2m2

p01 + p2

983062

Vm2 = p01V (m01) + p2V (m2) = (p0 + p1)V

983061p0m0 + p1m1

p0 + p1

983062+ p2V (m2)

Vm1m2 = p0V (m0) + p1V (m1) + p2V (m2)

Clearly

m0 lt m01 lt m1 and m0 lt m02 lt m2

Therefore

Vm2 le Vempty lArrrArrp01

p01 + p2V (m01) +

p2p01 + p2

V (m2) le V

983061p01

p01 + p2m01 +

p2p01 + p2

m2

983062= V (m02)

(11)

and

Vm2 le Vm1m2 lArrrArr

V

983061p0

p0 + p1m0 +

p1p0 + p1

m1

983062le p0

p0 + p1V (m0) +

p1p0 + p1

V (m1)(12)

Suppose that τ le m01 so V is concave on [m01m2] Then (11) holds by Jensenrsquosinequality In particular it holds for τ le m0 lt m01 thus proving part (i)

Suppose that τ ge m1 so V is convex on [m0m1] Then (12) holds by Jensenrsquosinequality thus proving part (ii)

Finally suppose that m01 lt τ lt m1 Let (τ V (τ)) lie below (or on) the straight lineA connecting (m0 V (m0)) and (m2 V (m2)) as illustrated on Figure 2 By convexityof V (m) for m lt τ it must be the case that V (m01) lies below (or on) the dashedline B By concavity of V (m) for m gt τ it must be the case that V (m1) lies above(or on) the dashed line C Consequently (12) is satisfied

Alternatively let (τ V (τ)) lie above the straight line A connecting (m0 V (m0)) and(m2 V (m2)) as illustrated on Figure 3 If Vm2 le Vm1m2 then the proof of part(iii) is complete So suppose that Vm2 gt Vm1m2 so (12) is violated

By convexity of V (m) for m lt τ it must be the case that V (m01) lies below (or on)the line B Since (12) is violated V (m01) must lie between the lines A and B Recallthat m01 lt m02 lt m2 Observe that V (m02) must be in the shaded area on Figure 3Indeed if m02 lt τ then by convexity of V (m) for m lt τ it must be the case thatV (m02) lies between (or on) the lines B and D If m02 ge τ then by concavity of

28 KOLOTILIN MYLOVANOV ZAPECHELNYUK

τ m2m0

A C

B

m01 m1

V (m0) V (τ)

V (m2)

Figure 2 The case where (τ V (τ)) lies below (or on) the straight lineconnecting (m0 V (m0)) and (m2 V (m2))

τ m2m0

A

C

B

m01

V (m0)

V (τ)

V (m2)

D

V (m01)

Figure 3 The case where (τ V (τ)) lies above the straight line con-necting (m0 V (m0)) and (m2 V (m2))

V (m) for m gt τ it must be the case that V (m02) lies above (or on) line C It followsthat (11) is satisfied

We now adapt Lemma 3 to prove the optimality of upper censorship for n ge 4

CENSORSHIP AS OPTIMAL PERSUASION 29

Fix a monotone partition X sub ω1 ωnminus1 that is not upper censorship that isthere exists i isin 2 nminus 1 such that

ωiminus1 ∕isin X and ωi isin X (13)

Denote ωn = 1 Define j and k as follows If X cap ω1 ωiminus2 ∕= empty let ωj be thelargest element inX smaller than i otherwise let ωj = ω0 IfXcapωi+1 ωnminus1 ∕= emptylet ωk be the smallest element in X greater than i otherwise let ωk = ωn = 1 Define

m0 = ωj m1 = ωiminus1 m2 = ωi

and

p0 =Pr[ω isin [ωjωiminus1)]

Pr[ω isin [ωjωk)] p1 =

Pr[ω isin [ωiminus1ωi)]

Pr[ω isin [ωjωk)] p2 =

Pr[ω isin [ωiωk)]

Pr[ω isin [ωjωk)]

To apply Lemma 3 we identify Xωi with empty X with m2 and X cup ωiminus1 withm1m2 By Lemma 3 VX le maxVXωi VXcupωiminus1 Thus replacing X withXωi if VX le VXωi and with X cup ωiminus1 otherwise weakly improves the senderrsquosexpected utility Repeatedly applying such replacements yields upper censorshipIndeed this procedure terminates after a finite number of replacements because byLemma 3 for ωj ge τ we replace X with Xωi and for ωiminus1 le τ we replace Xwith X cup ωiminus1

We thus have shown that when V is S-shaped when optimizing over the set ofmonotone partitions we can restrict attention to deterministic upper censorship Itremains to show that an optimal deterministic censorship cutoff ωlowast

d coincides with anoptimal stochastic censorship cutoff ωlowast from part (A) To see this it is straightforwardto show that

V lowast(ωlowast qlowast) =

983133

[0ωlowast)

V (ω)dF (ω) + V (ωlowast)qlowast Pr[ω = ωlowast]

+ V (mlowast(ωlowast qlowast))((1minus qlowast) Pr[ω = ωlowast] + Pr[ω gt ωlowast]) (14)

is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order Since the two de-terministic censorship cutoff pairs closest to (ωlowast qlowast) in the Blackwell informativenessorder are (ωlowast 0) and (ωlowast 1) we obtain ωlowast

d = ωlowast

Proof of Theorem 2prime Immediate by Theorem 1prime and Lemma 1

Proof of Theorem 3prime The proof for an optimal signal is analogous to that ofTheorem 3 and is thus omitted The same applies to part (i) for an optimal monotonepartition Thus it remains to prove part (ii) for an optimal monotone partition

We fix ρ and σ = 1 and vary t continuously If ωlowast isin supp(F ) then an optimalmonotone partition is an optimal signal because all qlowast isin [0 1] are optimal Thuspart (ii) in (B) holds by part (ii) in (A) If ωlowast isin supp(F ) and qlowast isin (0 1) then either(ωlowast

d qlowastd) = (ωlowast 0) or (ωlowast

d qlowastd) = (ωlowast 1) is optimal because the senderrsquos expected utility

V lowast given by (14) is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order

30 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Therefore it suffices to consider values of t at which the sender is indifferent between(ωlowast 0) and (ωlowast 1) and to show that the sender strictly prefers (ωlowast 1) to (ωlowast 0) at aslightly higher t

V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) =rArr dV lowastt (ω

lowast 0)

dtle dV lowast

t (ωlowast 1)

dt (15)

The condition V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) can be written as

Vt(mlowast(ωlowast 0)) Pr[ω ge ωlowast] = Vt(ω

lowast) Pr[ω = ωlowast] + Vt(mlowast(ωlowast 1)) Pr[ω gt ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)Pr[ω = ωlowast]

Pr[ω ge ωlowast]+ Vt(m

lowast(ωlowast 1))Pr[ω gt ωlowast]

Pr[ω ge ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + Vt(mlowast(ωlowast 1))

mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Analogously the conditiondV lowast

t (ωlowast0)dt

le dV lowastt (ωlowast1)dt

can be written as

V primet (m

lowast(ωlowast 0)) ge V primet (ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + V primet (m

lowast(ωlowast 1))mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Thus (15) states that if Vt(mlowast(ωlowast 0)) is equal to the convex combination of Vt(ω

lowast)and Vt(m

lowast(ωlowast 1)) then the derivative V primet (m

lowast(ωlowast 0)) is greater than the convex com-bination of the derivatives V prime

t (ωlowast) and V prime

t (mlowast(ωlowast 1)) When V is S-shaped it is easy

to see that this property is satisfied

Proof of Lemma 2 Observe that a signal induced by any set X sub C of permittedmedia outlets induces a monotone partition of [0 1] and thus HC sub HM Indeed letcX(ω) and cX(ω) be the media outlets in X that are the closest to ω from above andfrom below

cX(ω) = sup983059c isin X c le ω

983052cup 0

983060and cX(ω) = inf

983059c isin X c gt ω cup 1

983060

By observing the media outlets in X each citizen is informed that ω isin [cX(ω) cX(ω))

Conversely let the set C of media outlets be c0 lt middot middot middot lt cn Without loss of generalityassume that C contains the two uninformative media outlets c0 = 0 and cn = 1 which(almost) always endorse and criticize action a = 1 respectively Then the distributionF of the state ω is a discrete distribution whose support is Ω = ω0 ωnminus1 whereωi = E[θ|θ isin [ci ci+1)] for i isin 0 nminus 1 Consider a monotone partitional signalrepresented by a nondecreasing function ξ [0 1] rarr R For each realization m of thissignal that occurs with a strictly positive probability define

x(m) = min983153ω isin Ω ξ(ω) = m

983154and x(m) = max

983153ω isin Ω ξ(ω) = m

983154

This signal induces a monotone partition of the finite set Ω This partition of Ω canbe implemented by a censorship policy that prohibits a media outlet c isin C if andonly if c isin (x(m) x(m)) for some realization m

CENSORSHIP AS OPTIMAL PERSUASION 31

References

Alonso R and O Camara (2016a) ldquoPersuading Votersrdquo American EconomicReview 106 3590ndash3605

(2016b) ldquoPolitical Disagreement and Information in Electionsrdquo Games andEconomic Behavior 100 390ndash412

Bagnoli M and T Bergstrom (2005) ldquoLog-Concave Probability and Its Ap-plicationsrdquo Economic Theory 26 445ndash469

Baron D P (2006) ldquoPersistent Media Biasrdquo Journal of Public Economics 90(1)1ndash36

Besley T and A Prat (2006) ldquoHandcuffs for the Grabbing Hand MediaCapture and Government Accountabilityrdquo American Economic Review 96(3) 720ndash736

Blackwell D (1953) ldquoEquivalent Comparisons of Experimentsrdquo Annals of Math-ematical Statistics 24 265ndash272

Chan J and W Suen (2008) ldquoA Spatial Theory of News Consumption andElectoral Competitionrdquo Review of Economic Studies 75 699ndash728

Chiang C-F and B Knight (2011) ldquoMedia Bias and Influence Evidence fromNewspaper Endorsementsrdquo Review of Economic Studies 78 795ndash820

Duffie D P Dworczak and H Zhu (2017) ldquoBenchmarks in Search MarketsrdquoJournal of Finance 72 1983ndash2044

Dworczak P (2017) ldquoMechanism Design with Aftermarkets Cutoff Mecha-nismsrdquo mimeo

Dworczak P and G Martini (2019) ldquoThe Simple Economics of Optimal Per-suasionrdquo Journal of Political Economy 127 993ndash2048

Edmond C (2013) ldquoInformation Manipulation Coordination and RegimeChangerdquo Review of Economic Studies 80 1422ndash1458

Egorov G S Guriev and K Sonin (2009) ldquoWhy Resource-poor DictatorsAllow Freer Media A Theory and Evidence from Panel Datardquo American PoliticalScience Review 103 645ndash668

Gehlbach S and K Sonin (2014) ldquoGovernment Control of the Mediardquo Journalof Public Economics 118 163ndash171

Gentzkow M and E Kamenica (2016) ldquoA Rothschild-Stiglitz Approach toBayesian Persuasionrdquo American Economic Review Papers amp Proceedings 106597ndash601

Gentzkow M and J Shapiro (2006) ldquoMedia Bias and Reputationrdquo Journalof Political Economy 114(2) 280ndash316

(2010) ldquoWhat Drives Media Slant Evidence from US Daily NewspapersrdquoEconometrica 78(1) 35ndash71

Ginzburg B (2019) ldquoOptimal Information Censorshiprdquo Journal of EconomicBehavior and Organization 163 377ndash385

Goldstein I and Y Leitner (2018) ldquoStress Tests and Information DisclosurerdquoJournal of Economic Theory 177 34ndash69

32 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Guo Y and E Shmaya (2019) ldquoThe Interval Structure of Optimal DisclosurerdquoEconometrica 87 653ndash675

Johnson J P and D P Myatt (2006) ldquoOn the Simple Economics of Adver-tising Marketing and Product Designrdquo American Economic Review 96 756ndash784

Kamenica E and M Gentzkow (2011) ldquoBayesian Persuasionrdquo American Eco-nomic Review 101 2590ndash2615

Kolotilin A (2015) ldquoExperimental Design to Persuaderdquo Games and EconomicBehavior 90 215ndash226

(2018) ldquoOptimal Information Disclosure A Linear Programming Ap-proachrdquo Theoretical Economics 13 607ndash636

Kolotilin A and H Li (2019) ldquoRelational Communicationrdquo mimeoKolotilin A M Li T Mylovanov and A Zapechelnyuk (2015) ldquoPer-suasion of a Privately Informed Receiverrdquo mimeo

Kolotilin A T Mylovanov A Zapechelnyuk and M Li (2017) ldquoPer-suasion of a Privately Informed Receiverrdquo Econometrica 85 1949ndash1964

Kolotilin A and A Wolitzky (2019) ldquoAssortative Information Disclosurerdquomimeo

Lewis T R and D Sappington (1994) ldquoSupplying Information to FacilitatePrice Discriminationrdquo International Economic Review 35 309ndash327

Lorentzen P (2014) ldquoChinarsquos Strategic Censorshiprdquo American Journal of Polit-ical Science 58 402ndash414

Mullainathan S and A Shleifer (2005) ldquoThe Market for Newsrdquo AmericanEconomic Review 95(4) 1031ndash1053

Orlov D P Zryumov and A Skrzypach (2018) ldquoDesign of Macro-prudentialStress Testsrdquo mimeo

Ostrovsky M and M Schwarz (2010) ldquoInformation Disclosure and Unravelingin Matching Marketsrdquo American Economic Journal Microeconomics 2 34ndash63

Prat A and D Stromberg (2013) ldquoThe Political Economy of Mass MediardquoAdvances in Economics and Econometrics 2 135ndash187

Quah J and B Strulovici (2012) ldquoAggregating the Single Crossing PropertyrdquoEconometrica 80 2333ndash2348

Rayo L and I Segal (2010) ldquoOptimal Information Disclosurerdquo Journal ofPolitical Economy 118 949ndash987

Romanyuk G and A Smolin (2019) ldquoCream Skimming and Information Designin Matching Marketsrdquo American Economic Journal Microeconomics 11 250ndash276

Suen W (2004) ldquoThe Self-Perpetuation of Biased Beliefsrdquo Economic Journal 114377ndash396

Yamashita T (2018) ldquoOptimal Public Information Disclosure by Mechanism De-signerrdquo mimeo

Zapechelnyuk A (2019) ldquoOptimal Quality Certificationrdquo American EconomicReview Insights forthcoming

10 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Remark 1 Let minusV be S-shaped Then and only then for all continuous F anoptimal signal is deterministic lower censorship whose cutoff state ωlowast isin [τ 1] satisfies

V (mlowast) + V prime(mlowast)(mminusmlowast) ge (le)V (m) for all m le (gt)ωlowast

Theorem 1 states that optimal persuasion takes a simple form of upper censorshipwhen V is S-shaped11 Moreover the theorem establishes that this result is tight inthe sense that if V is not S-shaped then there exists a distribution F of the statesuch that no upper-censorship signal is optimal

Note that every upper-censorship signal is a monotone partition Therefore Theorem1 applies in both Scenario A where the sender is free to choose any signal andScenario B where the sender is constrained to choose a monotone partition

The intuition behind Theorem 1 is as follows Observe that when no informationabout the state is revealed the receiverrsquos best-response action does not change withthe state The more information is revealed the more variable the receiverrsquos behaviorwill be in response to this information Consider an interval where V is concave Thesender would prefer not to reveal any information when the state is in this intervalsince a certain outcome is preferred to any lottery with the same expected stateConversely consider an interval where V is convex The sender would prefer to fullyreveal the state in this interval since now lotteries are preferred If V is S-shapedthat is it is convex below some threshold and concave above that threshold theinduced optimal persuasion takes the form of upper censorship

When V is S-shaped the senderrsquos optimization problem is reduced to finding anoptimal censorship cutoff ωlowast If the realized state ω is below the cutoff then it isrevealed to the receiver so the expected utility of the sender is V (ω) If the realizedstate ω is above the cutoff then the posterior expected state is mlowast = E[ω|ω ge ωlowast] sothe expected utility of the sender conditional on ω ge ωlowast is V (mlowast) The sender thusneeds to solve the problem

maxωlowastisin[01]

983133 ωlowast

0

V (ω)f(ω)dω +

983133 1

ωlowastV (mlowast)f(ω)dω (4)

The expression in (3) represents the first-order condition to this problem and capturesthree possible cases the boundary solutions ωlowast = 0 and ωlowast = 1 if the expression in(3) has the same sign for all ω and an interior solution ωlowast such that

V (mlowast) + V prime(mlowast)(ωlowast minusmlowast) = V (ωlowast) (5)

This first-order condition (5) is illustrated by Figure 1 The solid line is V (m) andthe dashed line is V (mlowast) + V prime(mlowast)(mminusmlowast) which is tangent to V at mlowast

11For a fixed distribution F Proposition 3 in Kolotilin (2018) implies that upper censorship withcutoff ωlowast is optimal if and only if (3) holds and V (m) is convex on [0ωlowast] This result can be used toshow that upper censorship is optimal when V is S-shaped For completeness we include a simpleself-contained proof inspired by the proof of Theorem 1 in Dworczak and Martini (2019)

CENSORSHIP AS OPTIMAL PERSUASION 11

1ωlowast

V

revealing pooling

mlowast0

Figure 1 Optimal upper censorship with cutoff ωlowast

The solution of the senderrsquos problem (4) is particularly simple if V is either globallyconvex or globally concave In this case the first-order condition in (3) has a constantsign so either ωlowast = 0 or ωlowast = 1 must be optimal If V is convex then ωlowast = 1 isoptimal which corresponds to the fully informative signal Similarly if V is concavethen ωlowast = 0 is optimal which corresponds to the completely uninformative signalThis is summarized in the following corollary

Corollary 1 An optimal signal is

(i) fully informative for all F if and only if V is convex

(ii) completely uninformative for all F if and only if V is concave

31 Canonical Case We now consider the canonical case where the senderrsquos utilitysatisfies assumption (A1)

The density g of receiver types r is said to be log-concave if ln g(r) is concave in r12

Note that ln g(r) is well defined in the canonical case

In this case the shape of the senderrsquos indirect utility V defined by (9) is connectedto the shape of the density g of receiver types as follows

Lemma 1 In the canonical case V is S-shaped for all ρ if and only if g is log-concave

12Table 1 in Bagnoli and Bergstrom (2005) reports distributions with log-concave densities

12 KOLOTILIN MYLOVANOV ZAPECHELNYUK

That is if the density of receiver types g is log-concave then the senderrsquos indirectutility V is S-shaped Moreover this result is tight in the sense that if g is notlog-concave then there exists ρ such that V is not S-shaped

By Theorem 1 and Lemma 1 we obtain a criterion for the optimality of upper cen-sorship with the condition on the primitive of the model the density g

Theorem 2 Consider the canonical case and let the density g be log-concave Thenand only then an optimal signal is deterministic upper censorship for all F and all ρ

The symmetric statement is also true an optimal signal is lower censorship for all Fand all ρ if and only if minusg is log-concave Consequently if both g and minusg are log-concave (that is g is exponential) then there exists an optimal signal which is bothupper censorship and lower censorship There are only two signals with this propertyfully informative and completely uninformative This allows us to obtain conditionson the distribution of receiver types under which the optimal signal is polarizedbetween fully informative and completely uninformative signals as for example inLewis and Sappington (1994) and Johnson and Myatt (2006)13

Corollary 2 In the canonical case an optimal signal is either fully informative orcompletely uninformative for all F and all ρ if and only if there exist λ isin R and c gt 0such that g(r) = ceminusλr for r isin [0 1]

If g(r) = ceminusλr then the fully informative signal is optimal whenever ρ ge λ and thecompletely uninformative signal is optimal whenever ρ le λ (and any signal is optimalwhen ρ = λ) In particular if ρ = 0 the optimal signal is fully determined by thesign of λ which is in turn determined by whether the mean of r is smaller or greaterthan 1

2

32 Comparative Statics Theorem 2 allows for a sharp comparative statics anal-ysis on the informativeness of the optimal signal

We compare signals by their Blackwell informativeness (Blackwell 1953) To compareupper-censorship signals s1 and s2 we only need to compare their cutoffs ωlowast

1 and ωlowast2

Signal s1 is more informative than signal s2 if ωlowast1 ge ωlowast

2 Indeed state ω isin [0ωlowast2) is

fully revealed by both s1 and s2 and state ω isin [ωlowast2 1] is partially revealed by s1 and

not revealed at all by s2 so s1 is more informative than s2

13In the case of log-concave density g the sets of upper- and lower-censorship signals are totallyordered by the rotation order of Johnson and Myatt (2006) If we restrict attention to the set oflower-censorship signals (which are generally suboptimal under log-concave g) then the rotationpoint is decreasing with respect to the rotation order By Lemma 1 of Johnson and Myatt (2006)the senderrsquos expected utility is quasiconvex and one of the extreme lower-censorship signals (thefully informative or completely uninformative signal) is optimal for the sender But if we restrictattention to the set of upper-censorship signals (which are optimal under log-concave g) then therotation point is not decreasing with respect to the rotation order Therefore Lemma 1 of Johnsonand Myatt (2006) does not apply and an interior upper-censorship signal is generally optimal forthe sender

CENSORSHIP AS OPTIMAL PERSUASION 13

For the purpose of comparison we extend the definition of distribution function Gto the real line and assume that its density g is log-concave Consider a family ofdistributions Gtσ of receiver types

Gtσ (r) = G

983061τ minus t+

r minus τ

σ

983062

where τ is the maximum point of g and t isin R and σ gt 0 are parameters Let gtσ (r)denote the corresponding density Note that gtσ (r) is log-concave for all t isin R andall σ gt 0

Because gtσ is log-concave on [0 1] deterministic upper censorship is optimal byTheorem 2 Let ωlowast (ρ t σ) isin [0 1] be the optimal censorship cutoff as given by (3)

The shift parameter t shifts the distribution along the horizontal axis and the stretchparameter σ stretches the distribution horizontally and symmetrically around themode τ so the limit of σ rarr 0 leads to the unit mass on τ We now show that thesender optimally discloses more information when

(i) the senderrsquos preferences are better aligned with the receiverrsquos preferences (thealignment parameter ρ is greater)

(ii) the receiver is more reluctant to accept the proposal (the shift parameter t isgreater)

(iii) the sender is more uncertain about the receiverrsquos type (the stretch parameter σis greater) provided ρ ge 014

Theorem 3 In the canonical case a cutoff state of an optimal signal satisfies

(i) ωlowast (ρ t σ) is increasing in ρ

(ii) ωlowast (ρ t σ) is increasing in t

(iii) ωlowast (ρ t σ) is increasing in σ if ρ ge 0

The intuition for part (i) is that for a higher ρ the sender puts more weight onthe receiverrsquos utility so she optimally endows the receiver with a higher utility byproviding more information

The intuition for part (ii) is that for a higher t each type of the receiver has a greatercost of accepting the proposal so to persuade the same type of the receiver the senderneeds to increase E[ω|ω ge ωlowast] by expanding the full disclosure interval [0ωlowast]

The intuition for part (iii) is that for a higher σ receiver types are more spread out soto persuade the same mass of types the sender again needs to increase E[ω|ω ge ωlowast]

14The case ρ ge 0 is the practically relevant case where the senderrsquos utility is a weighted averageof the receiverrsquos action and utility 1

1+ρv(ω r) =1

1+ρ + ρ1+ρ (ω minus r)

14 KOLOTILIN MYLOVANOV ZAPECHELNYUK

4 Discrete State

In this section we assume that the state is a discrete random variable

Theorem 1prime Let V be S-shaped Then and only then for all discrete F

(A) an optimal signal is stochastic upper censorship whose cutoff pair (ωlowast qlowast) satisfiescondition (3)

(B) an optimal monotone partition is deterministic upper censorship whose cutoff pair(ωlowast

d qlowastd) satisfies ωlowast

d = ωlowast and qlowastd isin 0 1

In the canonical case we obtain the result analogous to Theorem 2

Theorem 2prime Consider the canonical case and let the density g be log-concave Thenand only then for all discrete F and all ρ

(A) an optimal signal is stochastic upper censorship

(B) an optimal monotone partition is deterministic upper censorship

Let us discuss how the comparative statics result (Theorem 3) changes In Section 32we argue that upper-censorship signals can be ordered by their censorship cutoffs Butwhen the state is discrete a censorship cutoff is described by a pair (ωlowast qlowast) where ωlowast

is a cutoff state and qlowast is the probability of revealing this cutoff state when it realizes

Consider two stochastic upper-censorship signals s1 and s2 with cutoff pairs (ωlowast1 q

lowast1)

and (ωlowast2 q

lowast2) Denote (ωlowast

1 qlowast1) ≽ (ωlowast

2 qlowast2) if ω

lowast1 gt ωlowast

2 or ωlowast1 = ωlowast

2 and qlowast1 ge qlowast2 Observethat s1 is more (Blackwell) informative than s2 if and only if (ωlowast

1 qlowast1) ≽ (ωlowast

2 qlowast2) This

comparison also applies to deterministic upper-censorship signals with the constraintthat qlowast1 and qlowast2 are in 0 1

With the above order on cutoff pairs Theorem 3 extend as follows

Theorem 3prime Consider the canonical case

(A) A cutoff pair of an optimal signal satisfies

(i) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in ρ

(ii) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in t

(iii) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in σ if ρ ge 0

(B) A cutoff pair of an optimal monotone partition satisfies15

(i) (ωlowastd (ρ t σ) q

lowastd (ρ t σ)) is increasing in ρ

(ii) (ωlowastd (ρ t σ) q

lowastd (ρ t σ)) is increasing in t

15Part (iii) also holds for an optimal monotone partition when for example the discrete supportof the state ω is a regular grid with sufficiently many points However it is possible to construct acounterexample where part (iii) fails for an optimal monotone partition

CENSORSHIP AS OPTIMAL PERSUASION 15

That is the sender optimally discloses more information when she is less biasedrelative to the receiver (the alignment parameter ρ is greater) when the receiver ismore reluctant to accept the proposal (the shift parameter t is greater) and underScenario A when the sender is more uncertain about the receiverrsquos type (the stretchparameter σ is greater)

Remark 2 Theorem 1prime and Theorem 2prime show that upper censorship emerges as anoptimal persuasion strategy under the same conditions (V is S-shaped in Theorem1prime and g is log-concave in Theorem 2prime) regardless of whether the sender can choosean arbitrary signal or is constrained to choose a monotone partition It may betempting to conjecture that if the sender was allowed to choose an arbitrary partition(not necessarily monotone) the same result would hold This however is not true

For illustration suppose that the sender wishes to maximize the probability of a = 1and the receiver has type r = 1

2with certainty16 Let the state ω take values in

0 14 1 with probabilities (1

6 23 16) so the prior mean is 1

3

An optimal signal is stochastic upper censorship with cutoff pair (ωlowast qlowast) = (14 12)

which sends the pooling message with certainty at ω = 1 and with probability 1minusqlowast =12at ω = 1

4 and reveals the state otherwise The pooling message induces the posterior

mean m = 12 leading to a = 1 all other messages induce posterior means below 1

2

leading to a = 0 Under the optimal signal action a = 1 is chosen with probability23middot 12+ 1

6= 1

2

An optimal monotone partition is deterministic upper censorship that fully revealsthe state where only ω = 1 leads to a = 1 Under this partition action a = 1 ischosen with probability 1

6

An optimal partition pools the extreme states ω = 0 and ω = 1 and reveals theintermediate state ω = 1

4 The pooling message induces the posterior mean m = 1

2

leading to a = 1 This optimal partition is nonmonotone Under this partitionaction a = 1 is chosen with probability 1

6+ 1

6= 1

3 which is strictly better than the

probability of 16obtained under the optimal monotone partition

5 Application to Media Censorship

In this section we apply our results to the problem of media censorship by thegovernment In the modern world people obtain information about the governmentrsquosstate through various media sources such as television newspapers and internetblogs Without the media most people would not know what policies and reformsthe government pursues and how effective they are Media outlets have differentpositions on the political spectrum and differ substantially in how they select andpresent facts to cover the same news People choose their sources of information based

16In this case V (m) = 1 if m ge 12 and V (m) = 0 if m lt 1

2 Observe that the graph of V is

S-shaped and it can be approximated by a continuously differentiable S-shaped function

16 KOLOTILIN MYLOVANOV ZAPECHELNYUK

on their political ideology and socioeconomic status This information is valuable forsignificant individual decisions on migration investment and voting to name a fewIndividuals may not fully internalize externalities that their decisions impose on thesociety Likewise the government may not have the societyrsquos best interest at heartTo further its goals the government then wishes to influence individual decisions bymanipulating the information through media In autocracies and countries with weakchecks and balances the government has power to censor the media content

The governmentrsquos problem of media censorship can be represented as the persua-sion problem in Section 2 We apply our results to provide conditions for the op-timality of upper-censorship policies that censor all media outlets except the mostpro-government ones Furthermore we interpret our comparative statics results

51 Setup There is a continuum of heterogeneous citizens indexed by r isin [0 1]distributed with G Each citizen chooses between a = 0 and a = 1 The utility of acitizen of type r is given by

u(θ r ar a) = (θ minus r)ar + κ(θ r a)

where ar isin 0 1 denotes the citizenrsquos own action a =983125ardG(r) denotes the aggre-

gate action in the society θ isin [0 1] captures an unobserved benefit from action 1 ascompared to action 0 and κ captures the impact of the aggregate action a on thecitizenrsquos utility The term (θ minus r)ar is a private surplus of a citizen of type r Theterm κ(θ r a) is an externality because for a citizen of type r it is optimal to ignorethis term and choose ar = 1 if and only if θ ge r

There is a government which is concerned with a weighted average of the social utilityand the governmentrsquos intrinsic benefit from the aggregate action For a given θ thegovernmentrsquos utility is given by

983133 1

0

ν(θ r ar a)dG(r) + δγ(θ a)

The term ν captures a citizenrsquos utility from the governmentrsquos perspective We allow νto be different from u to reflect paternalistic or other concerns The term γ capturesthe governmentrsquos intrinsic benefit from the aggregate action The parameter δ ge 0captures the weight of the aggregate action in the governmentrsquos utility

Let T be a distribution of the random variable θ We assume that distributions G andT are independent and admit continuously differentiable and strictly positive densi-ties We also assume that κ ν and γ are linear in θ and continuously differentiablein r and a Furthermore to simplify interpretations we assume that ν and γ arenon-decreasing in θ and a so that for the government a high θ is a good news anda higher aggregate action is preferable

Citizens obtain information about the unobservable benefit θ through media outletsEach media outlet is identified by its editorial policy c isin [0 1] and it endorses action

CENSORSHIP AS OPTIMAL PERSUASION 17

a = 1 if θ ge c and criticizes it if θ lt c17 A set C of media outlets is a finite subsetof [0 1]18

The governmentrsquos censorship policy is a set of the media outlets X sub C that arepermitted to broadcast so the rest of the media outlets are censored

The timing is as follows First the government chooses a set X sub C of permittedmedia outlets Second state θ is realized and each permitted media outlet endorsesor criticizes action a = 1 according to its editorial policy Finally each citizen observesmessages from all permitted media outlets updates his beliefs about θ and choosesan action

52 Discussion We now discuss interpretations of the key components of the mediacensorship application As in Gehlbach and Sonin (2014) there can be various inter-pretations of the citizenrsquos action a = 1 such as voting for the government supportinga governmentrsquos policy or taking an individual decision that benefits the governmentA citizenrsquos type r can be interpreted as his ideological position or preference param-eter A citizen who is more supportive of the government has a smaller r

A media outlet with a higher editorial policy c can be interpreted as being less loyalto the government because it criticizes the government on a larger set of states Aneditorial policy c isin C can therefore represent a slant or political bias of the outletagainst the government and can be empirically measured as the frequency with whichthe outlet uses anti-government language Gentzkow and Shapiro (2010) constructsuch a slant index for US newspapers Empirical findings of their paper suggestthat the editorial policies of media outlets are driven by reader preferences justifyingour assumption of the existence of a large variety of editorial policies19 As in Suen(2004) Chan and Suen (2008) and Chiang and Knight (2011) the assumption ofthe binary media reports that communicate only whether the state θ is above somestandard c can be justified by a cursory readerrsquos preference for simple messages suchas positive or negative opinions and yes or no recommendations

The governmentrsquos censorship of media outlets can take various forms For examplethe government can ban access to internet sites withdraw licenses disrupt financingconfiscate print materials and equipment discredit media providers and even arresteditors and journalists using broadly formulated legislation on combating extremismIn some countries the government can exercise direct control over media editorialpolicies either through state ownership or administrative pressure

17The tie-breaking in the event of θ = c is unimportant as θ is a continuous random variable18We discuss the case of a continuum of media outlets in Section 5519Theoretical literature has explored the determinants of media slant of an outlet driven by its

citizens (Mullainathan and Shleifer 2005 Gentzkow and Shapiro 2006 and Chan and Suen 2008)and its owners (Baron 2006 and Besley and Prat 2006)

18 KOLOTILIN MYLOVANOV ZAPECHELNYUK

53 Formulating Media Censorship as Persuasion We now show that the me-dia censorship problem can be formulated as a linear persuasion problem in whichthe government is a sender and a representative citizen is a receiver

In our model the citizensrsquo and the governmentrsquos utilities are linear in θ There-fore given any information from the media outlets the utilities depend only on theposterior mean of θ

Consider an arbitrary posterior mean of θ denoted by m isin [0 1] Each citizen of typer chooses ar = 1 if and only if r le m The externality term κ plays no role in thisdecision Therefore the aggregate action a is simply the mass of all citizens whosetypes do not exceed m so a = G(m)

Next using the citizensrsquo optimal behavior we derive the governmentrsquos expected utilityconditional on the posterior mean m

V (m) = E983063983133 1

0

ν(θ r ar G(m))dG(r) + δγ(θ G(m))

983055983055983055983055m983064

=

983133 1

0

ν(m r1rlem G(m))dG(r) + δγ(mG(m)) (6)

This is the governmentrsquos indirect utility As in Section 2 the government now choosesa signal which is informative about the state to maximize its expected utility How-ever in contrast to Section 2 here the government is restricted to signals that areimplementable by a subset of a given set of media outlets

Next we show that with an appropriate definition of the state the restriction tosignals implementable by a subset of a given set of media outlets can be formulatedas a restriction to monotone partitional signals as in Scenario B in Section 2 Thuswe will be able to use our results in Section 4 to find the optimal censorship policies

Recall that the information about θ is only available through the media outlets Let ωdenote a random variable equal to the conditional expectation of θ given the messagesof all media outlets in C Note that as C is finite ω is a discrete random variablewith values in [0 1] Let F denote the distribution of ω From now on we treat therandom variable ω as the state and denote by s a signal as defined in Section 2

Let HM denote the set of all distributions of the posterior mean state induced byall monotone partitional signals and let HC denote the set of all distributions ofthe posterior mean state induced by all finite subsets of the set C of media outletsObserve that any set of media outlets induces a monotone partitional signal thatis HC sub HM We now show that every outcome implementable by an arbitrarymonotone partition in HM is also implementable by a subset of media outlets

Lemma 2 HC = HM

For illustration suppose that there is only one media outlet with the editorial policyc isin (0 1) There are two observable events θ ge c and θ lt c In each realized event

CENSORSHIP AS OPTIMAL PERSUASION 19

the citizens compute the posterior expectation of θ which we will call ω So in thisexample the distribution F of ω has two mass points one for each of the two events

Given these two mass points every monotone partition of [0 1] induces one of twopossible distributions of the posterior mean state (i) the mass points revealed and (ii)the mass points pooled Observe however that these distributions are implementableby permitting and censoring the media outlet respectively

A censorship policy is upper censorship if it censors all sufficiently disloyal mediaoutlets Specifically there exists a cutoff clowast isin C and an indicator qlowast isin 0 1 suchthat all media outlets whose editorial policies are below clowast are permitted all mediaoutlets whose editorial policies are above clowast are censored and clowast is permitted if andonly if qlowast = 1 That is X = c isin C c le clowast if qlowast = 1 and X = c isin C c lt clowastif qlowast = 0 Note that the full censorship policy clowast = 0 and qlowast = 0 (where all mediaoutlets are censored) and the free media policy clowast = 1 and qlowast = 1 (where all mediaoutlets are permitted) are the two extreme upper-censorship policies

As shown above the media censorship problem is equivalent to the linear persuasionproblem in which the sender is restricted to monotone partitional signals and thesenderrsquos indirect utility V is given by (6) We thus apply Theorem 1prime(B) to obtainthe following result

Theorem 1primeprime If V is S-shaped then an optimal censorship policy is upper censorship

To illustrate Theorem 1primeprime suppose the government is interested only in the aggregateaction so that ν = 0 and γ depends only on the aggregate action a so V (m) =γ(G(m)) Thus an upper-censorship policy is optimal if the composition functionγ(G(middot)) is S-shaped For example this condition holds if the government is interestedin reaching a certain approval threshold (eg a simple majority) so that γ is a stepfunction This condition also holds if γ is S-shaped and G is uniform or S-shapedwith the same inflection point as γ20

54 Canonical Case We now impose more structure on the utilities to obtain asharper result for the optimality of upper censorship and to perform a comparativestatics analysis Analogously to assumption (A1) we assume that the governmentrsquosutility is a weighted average of the citizensrsquo utility and their aggregate action

983133 1

0

u(θ r ar a)dG(r) + δa

where u(θ r ar a) = (θ minus r)ar + ζ(r)a

(A2)

for some continuously differentiable function ζ Define

β =

983133 1

0

ζ(r)dG(r) + δ

20This condition holds in many special cases where γ and G are S-shaped even when γ and Ghave different inflection points

20 KOLOTILIN MYLOVANOV ZAPECHELNYUK

The term983125 1

0ζ(r)dG(r) is the governmentrsquos expected bias towards the citizensrsquo action

a = 1 due to the citizensrsquo externality and the term δ is the governmentrsquos intrinsicbias towards a greater aggregate action a Thus we interpret β as the governmentrsquosaggregate bias This decomposition of the bias allows for different interpretations whythe government is biased So β can be high because the government is too self-serving(high δ) or because the government is benevolent (low δ) but wishes to internalizestrong positive externalities of the citizens (high ζ(r)) To ease interpretations weassume that β gt 021

Under Assumption (A2) the governmentrsquos indirect utility V given by (6) becomes

V (m) =

983133 m

0

(mminus r)dG(r) + βa =

983133 m

0

v(m r)dG(r)

where

v(m r) = β

9830611 +

1

β(mminus r)

983062

This is the same as v given by Assumption (A1) with ρ = βminus1 up to rescaling by apositive constant Consequently we can apply Theorem 2prime(B) to obtain the followingresult

Theorem 2primeprime Let (A2) hold If the density g of citizens types is log-concave then anoptimal censorship policy is upper censorship

We now apply the comparative statics analysis presented in Section 32 Note thatthe upper-censorship policies are ordered according to the amount of informationtransmitted to the citizens in the sense of Blackwell (1953) A greater censorshipthreshold clowast means that more media outlets are permitted With this order in mindwe apply Theorem 3prime(B) to make a comparative statics analysis on the amount ofinformation that is optimally disclosed by the government

First the censorship cutoff is increasing in the alignment parameter ρ Recall thatthis is a reciprocal of the governmentrsquos bias ρ = βminus1 This means that the governmentoptimally discloses more information (the censorship cutoff is greater) when it is lessbiased (β is smaller) Intuitively as β decreases the government puts more weighton the citizensrsquo utility so it optimally endows the population with a higher utility bycensoring fewer media outlets and disclosing more information

Second the censorship cutoff is increasing in the magnitude t of the horizontal shiftof the density g A greater t corresponds to a greater opportunity cost of actiona = 1 for each citizen This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more difficultto persuade to take action a = 1 (parameter t is greater) Informally speaking topersuade the same type of the citizen the government needs to increase the posterior

21If β lt 0 then swapping the roles of a = 0 and a = 1 reverses the sign of the bias If β = 0then the governmentrsquos utility and the citizensrsquo private interests coincide so it is trivially optimal topermit all media outlets

CENSORSHIP AS OPTIMAL PERSUASION 21

mean But the expected posterior mean must be equal to the prior mean so it is notpossible to increase all posterior means Due to the log-concave shape of the densityof the citizensrsquo types this tradeoff is resolved by increasing the posterior mean of thepooling interval E[ω|ω ge clowast] This is done by shrinking the pooling interval [clowast 1]that is increasing the censorship cutoff clowast

Third when the number of media outlets is sufficiently large the censorship cutoffis increasing in the stretch factor σ of the density g around its peak τ (see Footnote15) A greater σ corresponds to a more dispersed distribution of the opportunitycost in the population This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more diverse inhow difficult they are to persuade Intuitively when the types are more spread outpersuading the same mass of types requires a greater posterior mean which leads toa higher censorship cutoff similarly to the comparative statics in t discussed above

55 Extensions and Open Questions Let us now consider a few extensions ofour model of media censorship

In our model the set of media outlets is exogenous and the governmentrsquos only in-strument is censorship We now consider three alternative ways of expanding thegovernmentrsquos instruments of influence

First suppose that the government is able not only to censor existing media outletsbut also to introduce new media outlets with chosen editorial policies This is equiv-alent to our censorship model where all media outlets in [0 1] are initially availableand the government can censor any subset of them When all media outlets in [0 1]are permitted the revealed information about θ (which we call ω) is θ itself Thusω = θ is a continuous random variable with distribution F = T So we can nowapply our results for the continuous state from Section 3 instead of those for the dis-crete state in Section 4 reaching the same conclusion about the optimality of uppercensorship

Second suppose that the government can garble information available from mediaoutlets That is the government is not restricted to monotone partitions it cancreate arbitrary signals about state ω for the citizens to observe This becomes ageneral persuasion problem and our Theorems 1prime(A) 2prime(A) and 3prime(A) apply

Third suppose that the government is able to restrict not only which media outletsare permitted but also how many media outlets each citizen can choose to observeIn this extension the citizens are not allowed to communicate with one another (oth-erwise they could share the information thus observing all permitted media outletsindirectly) This extension does not affect our results as long as each citizen is al-lowed to access at least one media outlet of his choice as in Chan and Suen (2008)Intuitively this is because each citizen categorizes the information from the mediaoutlets into ldquogood newsrdquo where a = 1 is optimal and ldquobad newsrdquo where a = 0 is

22 KOLOTILIN MYLOVANOV ZAPECHELNYUK

optimal Because the information from the media outlets induces a monotone parti-tion it means that ldquogood newsrdquo is separated from ldquobad newsrdquo by a threshold mediaoutlet that depends on the citizenrsquos type That is it is sufficient to observe a singlethreshold media outlet to distinguish ldquogood newsrdquo from ldquobad newsrdquo

In our model each citizenrsquos utility depends on the aggregate action a through theexternality term κ(θ r a) which does not affect the chosen action Let us relax thisassumption so that a citizenrsquos optimal choice can depend on a We can still writethe senderrsquos indirect utility V as a function of the posterior mean state and applyour results However V is no longer uniquely determined by the primitives of themodel It is now endogenous and depends on an equilibrium the citizens play aseach citizenrsquos optimal action now depends on what all citizens do in equilibrium Forexample given the same information about the state a citizen could prefer to choosea = 1 if and only if sufficiently many citizens choose the same action so a is largeenough This creates the problem of multiplicity of equilibria and as a consequencethe dependence of optimal censorship on equilibrium selection

As a side our media censorship problem can also be applied to spatial voting mod-els as in Chiang and Knight (2011) Consider a government party (p = G) and anopposition party (p = O) competing in an election If party p wins a voter withan ideological position r gets utility wp minus (r minus rp)

2 where wp is the quality or va-lence of party p and rp is the ideology or policy platform of party p Voters knowthe partiesrsquo ideologies and obtain information about the partiesrsquo qualities from allavailable media outlets Each voter supports the party that maximizes his expectedutility Within this context our analysis still applies because the voterrsquos utility dif-ference between the government and opposition parties is proportional to θminusr whereθ = (wG minus wO + r2O minus r2G)2(rO minus rG)

There are a few more extensions that can be relevant in applications First insteadof complete censorship of a media outlet there can be a cost of accessing it Forexample an international news channel can be censored by a local government butcitizens may still access it through VPN at some cost Second it can be costly forthe government to censor media outlets So another important question to answeris how a government may prioritize censoring Finally citizens can incur some costof following each media outlet While we have already mentioned that citizens gainno benefit from following more than one outlet it is entirely possible for them tostop watching news altogether if it is sufficiently uninformative These extensions arenontrivial and left for future research

Appendix

Proof of Theorem 1 Suppose that V is S-shaped For mωlowast isin [0 1] and mlowast(ωlowast) =E[ω|ω ge ωlowast] define

V (mωlowast) = V (mlowast(ωlowast)) + V prime(mlowast(ωlowast))(mminusmlowast(ωlowast)) (7)

CENSORSHIP AS OPTIMAL PERSUASION 23

Since V (m) is convex on [0 τ ] and concave on [τ 1] there exists ωlowast isin [0 τ ] thatsatisfies condition (3) that is

V (mωlowast) ge (le)V (m) for all m ge (lt)ωlowast (8)

This is true because V (ωlowastωlowast) minus V (ωlowast) is single-crossing from below To see thisconsider Figure 1 Observe that (3) can hold only if ωlowast is on the convex part of V andmlowast is on the concave part of V As the censorship cutoff ωlowast increases the posteriormean state of the pooling message mlowast(ωlowast) also increases But because V is concaveat mlowast(ωlowast) the dashed tangent line V (mωlowast) becomes flatter in m so it crosses thesolid line V (m) at a smaller m

Consider an arbitrary signal s Let H be the distribution of m = E[ω|s] inducedby signal s Since each signal is a garbling of the fully informative signal F is amean-preserving spread of H Thus the senderrsquos expected utility is smaller undersignal s than under the upper-censorship signal with cutoff ωlowast as follows from

983133 1

0

V (m)dH(m) le983133 1

0

maxV (m) V (mωlowast)dH(m)

le983133 1

0

maxV (m) V (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mlowast(ωlowast))dF (m)

where the second line holds because maxV (m) V (mωlowast) is convex in m and F isa mean-preserving spread of H the third line holds by (8) and the last line holds by

(7) and by983125 1

ωlowast(mminusmlowast(ωlowast))dF (m) = 0

Conversely suppose that V is not S-shaped Then there exist 0 le m1 lt m2 lem3 lt m4 le 1 such that V primeprime(m) lt 0 for m isin (m1m2) V

primeprime(m) = 0 for m isin [m2m3]and V primeprime(m) gt 0 for m isin (m3m4) because V primeprime is continuous by assumption It isstraightforward to show that there exists 983141mlowast isin (m1m2) (sufficiently close to m2) and983141ωlowast isin (m3m4) such that

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) lt V (m) for all m isin (983141ωlowastm4]

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) gt V (m) for all m isin [m1 983141ωlowast] 983141mlowast

Moreover there exists a continuous distribution F such that the support of F is asubset of [m1m4] and 983141mlowast = E[ω|ω le 983141ωlowast] Using a chain of inequalities similar to theone above it is easy to show that lower-censorship signal 983141slowast with cutoff 983141ωlowast is optimalFurthermore it is uniquely optimal as both inequalities hold with equalities only forsignal 983141slowast This in turn implies that any upper-censorship signal is suboptimal

24 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Proof of Lemma 1 Notice that under assumption (A1) we have

V primeprime(m) = gprime(m) + ρg(m) for m isin [0 1]

Recall that V is S-shaped if and only if V primeprime(m) = gprime (m) + ρg (m) is single-crossingfrom above Using Proposition 1 in Quah and Strulovici (2012) it is easy to showthat V primeprime(m) is single-crossing from above for all ρ isin R if and only if gprime (m)g (m) isnonincreasing in m (that is ln g(m) is concave)

Proof of Theorem 2 Immediate by Theorem 1 and Lemma 1

Proof of Theorem 3 Part (i) We fix t and σ and vary ρ Without loss ofgenerality let t = 0 and σ = 1 (so that Gtσ = G) Consider a signal that induces thedistribution H of posterior mean m This signal implements the receiverrsquos interimutility U given by

U(r) =

983133 1

r

(mminus r)dH(m) for r isin [0 1]

which holds because the receiver acts if and only if m ge r Since types r lt 0 alwayschoose a = 1 and types r gt 1 always choose a = 0 we can exclude them fromconsideration and assume that supp(G) = [0 1] The senderrsquos expected utility is then

983133 1

0

V (m)dH(m) =

983133 1

0

983133 m

0

(1 + ρ(mminus r))dG(r)dH(m)

=

983133 1

0

983133 1

r

(1 + ρ(mminus r))dH(m)dG(r)

=

983133 1

0

(1minusH(r) + ρU(r))dG(r)

Consider ρ2 gt ρ1 Suppose by contradiction that the corresponding optimal upper-censorship signals s2 and s1 are such that s1 is strictly more informative than s2 LetU2 and U1 be the receiverrsquos interim utilities implemented by s2 and s1 Since thesender prefers signal s2 under ρ2 and signal s1 under ρ1 we have

983133 1

0

(1minusH2 (r) + ρ2U2 (r)) dG (r) ge983133 1

0

(1minusH1 (r) + ρ2U1 (r)) dG (r)

983133 1

0

(1minusH1 (r) + ρ1U1 (r)) dG (r) ge983133 1

0

(1minusH2 (r) + ρ1U2 (r)) dG (r)

Summing up these inequalities gives

(ρ2 minus ρ1)

983133 1

0

(U2 (r)minus U1 (r)) dG (r) ge 0

leading to a contradiction because U1(r) ge U2(r) for all r with strict inequality forsome r given that s1 is strictly more informative than s2

CENSORSHIP AS OPTIMAL PERSUASION 25

Part (ii) We fix ρ isin R and σ = 1 and vary t The indirect utility of the senderconditional on m isin [0 1] is given by

Vt(m) =

983133

rlem

(1+ρ(mminusr))gt1(r)dr =

983133

rlemminust

(1+ρ(mminustminusr))g(r)dr = V (mminust) (9)

By Theorem 1 when t0 = 0 an optimal cutoff ωlowast0 satisfies

V (m) le V (mlowast0) + V prime(mlowast

0)(mminusmlowast0) for m isin [ωlowast

0 1]

When V is S-shaped and t1 lt 0 it is easy to see that

V (mminus t1) le V (mlowast0 minus t1) + V prime(mlowast

0 minus t1)(mminusmlowast0) for m isin [ωlowast

0 1]

which implies that when t = t1 the above inequality changes its sign at some ωlowast1

that satisfies ωlowast1 le ωlowast

0 It follows from (9) that ωlowast1 is an optimal cutoff for t = t1 and

as we established above ωlowast1 le ωlowast

0

Part (iii) We fix ρ ge 0 and t = 0 and vary σ The indirect utility of the senderconditional on m isin [0 1] is given by

V ρσσ(m) =

983133

rlem

9830591 +

ρ

σ(mminus r)

983060dG0σ(r)

=

983133

rlem

9830591 +

ρ

σ(mminus r)

983060g

983061r minus (1minus σ)τ

σ

9830621

σdr

=

983133

rlemminus(1minusσ)τσ

9830611 + ρ

983061mminus (1minus σ)τ

σminus r

983062983062g(r)dr

= Vρ1

983061mminus (1minus σ)τ

σ

983062 (10)

Let σ0 = 1 σ1 lt 1 and ρ1 = ρσ1 To compare the optimal cutoffs under parameters(ρ σ0) and (ρ σ1) we consider two steps first comparing the cutoffs between (ρ σ0)and (ρ1 σ1) and then comparing the cutoffs between (ρ1 σ1) and (ρ σ1)

First we compare an optimal cutoff ωlowast0 under (ρ σ0) = (ρ 1) and an optimal cutoff

ωlowast1 under (ρ1 σ1) By Theorem 1 ωlowast

0 satisfies

Vρ1(m) le Vρ1(mlowast0) + V prime

ρ1(mlowast0)(mminusmlowast

0) for m isin [ωlowast0 1]

When V is increasing (implied by ρ ge 0) and S-shaped and σ1 lt 1 it is easy to seethat for all m isin [ωlowast

0 1]

Vρ1

983061mminus (1minus σ1)τ

σ1

983062le Vρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062+ V prime

ρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062(mminusmlowast

0)

which implies that when σ = σ1 lt 1 the above inequality changes its sign at someωlowast1 that satisfies ωlowast

1 le ωlowast0 It follows from (10) that ωlowast

1 is an optimal cutoff under(ρσ σ1) = (ρ1 σ1) and as we established above ωlowast

1 le ωlowast0

26 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Second we compare the optimal cutoff ωlowast1 under (ρ1 σ1) and the optimal cutoff ωlowast

2

under (ρ σ1) Because ρ lt ρ1 = ρσ1 by part (i) we have ωlowast2 le ωlowast

1

Combining ωlowast1 le ωlowast

0 and ωlowast2 le ωlowast

1 we obtain ωlowast2 le ωlowast

0

Proof of Theorem 1prime Part (A) The proof is analogous to that of Theorem 1 andis thus omitted

Part (B) The only if statement follows from that in part (A) (see the proof of Theorem1) and the fact that we can always ensure that an optimal stochastic censorship(ωlowast qlowast) is deterministic (so qlowast isin 0 1) by the choice of the support and distributionof ω Thus we only need to prove the if statement

Let F be a distribution of ω with support on n values 0 lt ω0 lt lt ωnminus1 lt 1 Thisis without loss of generality because if 0 andor 1 were in the support of F we couldredefine the state as ω = (ω + ε)(1 + 2ε) and the type as r = (r + ε)(1 + 2ε)

Let X sub ω1 ωnminus1 identify a set of cutoffs defining a monotone partition for eachωi isin X the receiver is informed whether ω lt ωi or ω ge ωi Note that we exclude ω0

from the set of cutoffs because ω ge ω0 always holds

Note that X is deterministic upper censorship if and only if there exists a censorshipthreshold ωlowast isin [0 1] such that for each i = 1 nminus 1 ωi isin X if and only if ωi lt ωlowastthat is we can assume that qlowast = 0 without loss of generality

Let VX be the senderrsquos expected utility when the monotone partition is given by X

If n = 2 then there are only two partitions X = ω1 (full disclosure) and X = empty(no disclosure) Both are upper censorships so part (B) holds trivially

If n = 3 then there are four monotone partitions X = ω1ω2 (full disclosure)X = ω2 X = ω1 and X = empty (no disclosure) Only X = ω2 is not uppercensorship The next lemma proves that X = ω2 cannot be uniquely optimal whenV is S-shaped thus proving the optimality of upper censorship for n = 3

Denote

mjk = E983045ω|ω isin [ωjωk]

983046

For convenience of notation we write mj = ωj

Lemma 3 Let V be S-shaped with an inflexion point τ Let ω be a discrete randomvariable with support on m0m1m2 and probabilities p0 p1 p2 where 0 lt m0 ltm1 lt m2 lt 1 Then

(i) if τ le m0 then Vm2 le Vempty

(ii) if τ ge m1 then Vm2 le Vm1m2

(iii) if m0 lt τ lt m1 then Vm2 le maxVempty Vm1m2

CENSORSHIP AS OPTIMAL PERSUASION 27

Proof Define p01 = p0 + p1 and p02 = p01 + p2 = 1 Observe that

m01 =p0m0 + p1m1

p0 + p1and m02 =

p0m0 + p1m1 + p2m2

p0 + p1 + p2

We thus have

Vempty = p02V (m02) = (p01 + p2)V

983061p01m01 + p2m2

p01 + p2

983062

Vm2 = p01V (m01) + p2V (m2) = (p0 + p1)V

983061p0m0 + p1m1

p0 + p1

983062+ p2V (m2)

Vm1m2 = p0V (m0) + p1V (m1) + p2V (m2)

Clearly

m0 lt m01 lt m1 and m0 lt m02 lt m2

Therefore

Vm2 le Vempty lArrrArrp01

p01 + p2V (m01) +

p2p01 + p2

V (m2) le V

983061p01

p01 + p2m01 +

p2p01 + p2

m2

983062= V (m02)

(11)

and

Vm2 le Vm1m2 lArrrArr

V

983061p0

p0 + p1m0 +

p1p0 + p1

m1

983062le p0

p0 + p1V (m0) +

p1p0 + p1

V (m1)(12)

Suppose that τ le m01 so V is concave on [m01m2] Then (11) holds by Jensenrsquosinequality In particular it holds for τ le m0 lt m01 thus proving part (i)

Suppose that τ ge m1 so V is convex on [m0m1] Then (12) holds by Jensenrsquosinequality thus proving part (ii)

Finally suppose that m01 lt τ lt m1 Let (τ V (τ)) lie below (or on) the straight lineA connecting (m0 V (m0)) and (m2 V (m2)) as illustrated on Figure 2 By convexityof V (m) for m lt τ it must be the case that V (m01) lies below (or on) the dashedline B By concavity of V (m) for m gt τ it must be the case that V (m1) lies above(or on) the dashed line C Consequently (12) is satisfied

Alternatively let (τ V (τ)) lie above the straight line A connecting (m0 V (m0)) and(m2 V (m2)) as illustrated on Figure 3 If Vm2 le Vm1m2 then the proof of part(iii) is complete So suppose that Vm2 gt Vm1m2 so (12) is violated

By convexity of V (m) for m lt τ it must be the case that V (m01) lies below (or on)the line B Since (12) is violated V (m01) must lie between the lines A and B Recallthat m01 lt m02 lt m2 Observe that V (m02) must be in the shaded area on Figure 3Indeed if m02 lt τ then by convexity of V (m) for m lt τ it must be the case thatV (m02) lies between (or on) the lines B and D If m02 ge τ then by concavity of

28 KOLOTILIN MYLOVANOV ZAPECHELNYUK

τ m2m0

A C

B

m01 m1

V (m0) V (τ)

V (m2)

Figure 2 The case where (τ V (τ)) lies below (or on) the straight lineconnecting (m0 V (m0)) and (m2 V (m2))

τ m2m0

A

C

B

m01

V (m0)

V (τ)

V (m2)

D

V (m01)

Figure 3 The case where (τ V (τ)) lies above the straight line con-necting (m0 V (m0)) and (m2 V (m2))

V (m) for m gt τ it must be the case that V (m02) lies above (or on) line C It followsthat (11) is satisfied

We now adapt Lemma 3 to prove the optimality of upper censorship for n ge 4

CENSORSHIP AS OPTIMAL PERSUASION 29

Fix a monotone partition X sub ω1 ωnminus1 that is not upper censorship that isthere exists i isin 2 nminus 1 such that

ωiminus1 ∕isin X and ωi isin X (13)

Denote ωn = 1 Define j and k as follows If X cap ω1 ωiminus2 ∕= empty let ωj be thelargest element inX smaller than i otherwise let ωj = ω0 IfXcapωi+1 ωnminus1 ∕= emptylet ωk be the smallest element in X greater than i otherwise let ωk = ωn = 1 Define

m0 = ωj m1 = ωiminus1 m2 = ωi

and

p0 =Pr[ω isin [ωjωiminus1)]

Pr[ω isin [ωjωk)] p1 =

Pr[ω isin [ωiminus1ωi)]

Pr[ω isin [ωjωk)] p2 =

Pr[ω isin [ωiωk)]

Pr[ω isin [ωjωk)]

To apply Lemma 3 we identify Xωi with empty X with m2 and X cup ωiminus1 withm1m2 By Lemma 3 VX le maxVXωi VXcupωiminus1 Thus replacing X withXωi if VX le VXωi and with X cup ωiminus1 otherwise weakly improves the senderrsquosexpected utility Repeatedly applying such replacements yields upper censorshipIndeed this procedure terminates after a finite number of replacements because byLemma 3 for ωj ge τ we replace X with Xωi and for ωiminus1 le τ we replace Xwith X cup ωiminus1

We thus have shown that when V is S-shaped when optimizing over the set ofmonotone partitions we can restrict attention to deterministic upper censorship Itremains to show that an optimal deterministic censorship cutoff ωlowast

d coincides with anoptimal stochastic censorship cutoff ωlowast from part (A) To see this it is straightforwardto show that

V lowast(ωlowast qlowast) =

983133

[0ωlowast)

V (ω)dF (ω) + V (ωlowast)qlowast Pr[ω = ωlowast]

+ V (mlowast(ωlowast qlowast))((1minus qlowast) Pr[ω = ωlowast] + Pr[ω gt ωlowast]) (14)

is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order Since the two de-terministic censorship cutoff pairs closest to (ωlowast qlowast) in the Blackwell informativenessorder are (ωlowast 0) and (ωlowast 1) we obtain ωlowast

d = ωlowast

Proof of Theorem 2prime Immediate by Theorem 1prime and Lemma 1

Proof of Theorem 3prime The proof for an optimal signal is analogous to that ofTheorem 3 and is thus omitted The same applies to part (i) for an optimal monotonepartition Thus it remains to prove part (ii) for an optimal monotone partition

We fix ρ and σ = 1 and vary t continuously If ωlowast isin supp(F ) then an optimalmonotone partition is an optimal signal because all qlowast isin [0 1] are optimal Thuspart (ii) in (B) holds by part (ii) in (A) If ωlowast isin supp(F ) and qlowast isin (0 1) then either(ωlowast

d qlowastd) = (ωlowast 0) or (ωlowast

d qlowastd) = (ωlowast 1) is optimal because the senderrsquos expected utility

V lowast given by (14) is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order

30 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Therefore it suffices to consider values of t at which the sender is indifferent between(ωlowast 0) and (ωlowast 1) and to show that the sender strictly prefers (ωlowast 1) to (ωlowast 0) at aslightly higher t

V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) =rArr dV lowastt (ω

lowast 0)

dtle dV lowast

t (ωlowast 1)

dt (15)

The condition V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) can be written as

Vt(mlowast(ωlowast 0)) Pr[ω ge ωlowast] = Vt(ω

lowast) Pr[ω = ωlowast] + Vt(mlowast(ωlowast 1)) Pr[ω gt ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)Pr[ω = ωlowast]

Pr[ω ge ωlowast]+ Vt(m

lowast(ωlowast 1))Pr[ω gt ωlowast]

Pr[ω ge ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + Vt(mlowast(ωlowast 1))

mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Analogously the conditiondV lowast

t (ωlowast0)dt

le dV lowastt (ωlowast1)dt

can be written as

V primet (m

lowast(ωlowast 0)) ge V primet (ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + V primet (m

lowast(ωlowast 1))mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Thus (15) states that if Vt(mlowast(ωlowast 0)) is equal to the convex combination of Vt(ω

lowast)and Vt(m

lowast(ωlowast 1)) then the derivative V primet (m

lowast(ωlowast 0)) is greater than the convex com-bination of the derivatives V prime

t (ωlowast) and V prime

t (mlowast(ωlowast 1)) When V is S-shaped it is easy

to see that this property is satisfied

Proof of Lemma 2 Observe that a signal induced by any set X sub C of permittedmedia outlets induces a monotone partition of [0 1] and thus HC sub HM Indeed letcX(ω) and cX(ω) be the media outlets in X that are the closest to ω from above andfrom below

cX(ω) = sup983059c isin X c le ω

983052cup 0

983060and cX(ω) = inf

983059c isin X c gt ω cup 1

983060

By observing the media outlets in X each citizen is informed that ω isin [cX(ω) cX(ω))

Conversely let the set C of media outlets be c0 lt middot middot middot lt cn Without loss of generalityassume that C contains the two uninformative media outlets c0 = 0 and cn = 1 which(almost) always endorse and criticize action a = 1 respectively Then the distributionF of the state ω is a discrete distribution whose support is Ω = ω0 ωnminus1 whereωi = E[θ|θ isin [ci ci+1)] for i isin 0 nminus 1 Consider a monotone partitional signalrepresented by a nondecreasing function ξ [0 1] rarr R For each realization m of thissignal that occurs with a strictly positive probability define

x(m) = min983153ω isin Ω ξ(ω) = m

983154and x(m) = max

983153ω isin Ω ξ(ω) = m

983154

This signal induces a monotone partition of the finite set Ω This partition of Ω canbe implemented by a censorship policy that prohibits a media outlet c isin C if andonly if c isin (x(m) x(m)) for some realization m

CENSORSHIP AS OPTIMAL PERSUASION 31

References

Alonso R and O Camara (2016a) ldquoPersuading Votersrdquo American EconomicReview 106 3590ndash3605

(2016b) ldquoPolitical Disagreement and Information in Electionsrdquo Games andEconomic Behavior 100 390ndash412

Bagnoli M and T Bergstrom (2005) ldquoLog-Concave Probability and Its Ap-plicationsrdquo Economic Theory 26 445ndash469

Baron D P (2006) ldquoPersistent Media Biasrdquo Journal of Public Economics 90(1)1ndash36

Besley T and A Prat (2006) ldquoHandcuffs for the Grabbing Hand MediaCapture and Government Accountabilityrdquo American Economic Review 96(3) 720ndash736

Blackwell D (1953) ldquoEquivalent Comparisons of Experimentsrdquo Annals of Math-ematical Statistics 24 265ndash272

Chan J and W Suen (2008) ldquoA Spatial Theory of News Consumption andElectoral Competitionrdquo Review of Economic Studies 75 699ndash728

Chiang C-F and B Knight (2011) ldquoMedia Bias and Influence Evidence fromNewspaper Endorsementsrdquo Review of Economic Studies 78 795ndash820

Duffie D P Dworczak and H Zhu (2017) ldquoBenchmarks in Search MarketsrdquoJournal of Finance 72 1983ndash2044

Dworczak P (2017) ldquoMechanism Design with Aftermarkets Cutoff Mecha-nismsrdquo mimeo

Dworczak P and G Martini (2019) ldquoThe Simple Economics of Optimal Per-suasionrdquo Journal of Political Economy 127 993ndash2048

Edmond C (2013) ldquoInformation Manipulation Coordination and RegimeChangerdquo Review of Economic Studies 80 1422ndash1458

Egorov G S Guriev and K Sonin (2009) ldquoWhy Resource-poor DictatorsAllow Freer Media A Theory and Evidence from Panel Datardquo American PoliticalScience Review 103 645ndash668

Gehlbach S and K Sonin (2014) ldquoGovernment Control of the Mediardquo Journalof Public Economics 118 163ndash171

Gentzkow M and E Kamenica (2016) ldquoA Rothschild-Stiglitz Approach toBayesian Persuasionrdquo American Economic Review Papers amp Proceedings 106597ndash601

Gentzkow M and J Shapiro (2006) ldquoMedia Bias and Reputationrdquo Journalof Political Economy 114(2) 280ndash316

(2010) ldquoWhat Drives Media Slant Evidence from US Daily NewspapersrdquoEconometrica 78(1) 35ndash71

Ginzburg B (2019) ldquoOptimal Information Censorshiprdquo Journal of EconomicBehavior and Organization 163 377ndash385

Goldstein I and Y Leitner (2018) ldquoStress Tests and Information DisclosurerdquoJournal of Economic Theory 177 34ndash69

32 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Guo Y and E Shmaya (2019) ldquoThe Interval Structure of Optimal DisclosurerdquoEconometrica 87 653ndash675

Johnson J P and D P Myatt (2006) ldquoOn the Simple Economics of Adver-tising Marketing and Product Designrdquo American Economic Review 96 756ndash784

Kamenica E and M Gentzkow (2011) ldquoBayesian Persuasionrdquo American Eco-nomic Review 101 2590ndash2615

Kolotilin A (2015) ldquoExperimental Design to Persuaderdquo Games and EconomicBehavior 90 215ndash226

(2018) ldquoOptimal Information Disclosure A Linear Programming Ap-proachrdquo Theoretical Economics 13 607ndash636

Kolotilin A and H Li (2019) ldquoRelational Communicationrdquo mimeoKolotilin A M Li T Mylovanov and A Zapechelnyuk (2015) ldquoPer-suasion of a Privately Informed Receiverrdquo mimeo

Kolotilin A T Mylovanov A Zapechelnyuk and M Li (2017) ldquoPer-suasion of a Privately Informed Receiverrdquo Econometrica 85 1949ndash1964

Kolotilin A and A Wolitzky (2019) ldquoAssortative Information Disclosurerdquomimeo

Lewis T R and D Sappington (1994) ldquoSupplying Information to FacilitatePrice Discriminationrdquo International Economic Review 35 309ndash327

Lorentzen P (2014) ldquoChinarsquos Strategic Censorshiprdquo American Journal of Polit-ical Science 58 402ndash414

Mullainathan S and A Shleifer (2005) ldquoThe Market for Newsrdquo AmericanEconomic Review 95(4) 1031ndash1053

Orlov D P Zryumov and A Skrzypach (2018) ldquoDesign of Macro-prudentialStress Testsrdquo mimeo

Ostrovsky M and M Schwarz (2010) ldquoInformation Disclosure and Unravelingin Matching Marketsrdquo American Economic Journal Microeconomics 2 34ndash63

Prat A and D Stromberg (2013) ldquoThe Political Economy of Mass MediardquoAdvances in Economics and Econometrics 2 135ndash187

Quah J and B Strulovici (2012) ldquoAggregating the Single Crossing PropertyrdquoEconometrica 80 2333ndash2348

Rayo L and I Segal (2010) ldquoOptimal Information Disclosurerdquo Journal ofPolitical Economy 118 949ndash987

Romanyuk G and A Smolin (2019) ldquoCream Skimming and Information Designin Matching Marketsrdquo American Economic Journal Microeconomics 11 250ndash276

Suen W (2004) ldquoThe Self-Perpetuation of Biased Beliefsrdquo Economic Journal 114377ndash396

Yamashita T (2018) ldquoOptimal Public Information Disclosure by Mechanism De-signerrdquo mimeo

Zapechelnyuk A (2019) ldquoOptimal Quality Certificationrdquo American EconomicReview Insights forthcoming

CENSORSHIP AS OPTIMAL PERSUASION 11

1ωlowast

V

revealing pooling

mlowast0

Figure 1 Optimal upper censorship with cutoff ωlowast

The solution of the senderrsquos problem (4) is particularly simple if V is either globallyconvex or globally concave In this case the first-order condition in (3) has a constantsign so either ωlowast = 0 or ωlowast = 1 must be optimal If V is convex then ωlowast = 1 isoptimal which corresponds to the fully informative signal Similarly if V is concavethen ωlowast = 0 is optimal which corresponds to the completely uninformative signalThis is summarized in the following corollary

Corollary 1 An optimal signal is

(i) fully informative for all F if and only if V is convex

(ii) completely uninformative for all F if and only if V is concave

31 Canonical Case We now consider the canonical case where the senderrsquos utilitysatisfies assumption (A1)

The density g of receiver types r is said to be log-concave if ln g(r) is concave in r12

Note that ln g(r) is well defined in the canonical case

In this case the shape of the senderrsquos indirect utility V defined by (9) is connectedto the shape of the density g of receiver types as follows

Lemma 1 In the canonical case V is S-shaped for all ρ if and only if g is log-concave

12Table 1 in Bagnoli and Bergstrom (2005) reports distributions with log-concave densities

12 KOLOTILIN MYLOVANOV ZAPECHELNYUK

That is if the density of receiver types g is log-concave then the senderrsquos indirectutility V is S-shaped Moreover this result is tight in the sense that if g is notlog-concave then there exists ρ such that V is not S-shaped

By Theorem 1 and Lemma 1 we obtain a criterion for the optimality of upper cen-sorship with the condition on the primitive of the model the density g

Theorem 2 Consider the canonical case and let the density g be log-concave Thenand only then an optimal signal is deterministic upper censorship for all F and all ρ

The symmetric statement is also true an optimal signal is lower censorship for all Fand all ρ if and only if minusg is log-concave Consequently if both g and minusg are log-concave (that is g is exponential) then there exists an optimal signal which is bothupper censorship and lower censorship There are only two signals with this propertyfully informative and completely uninformative This allows us to obtain conditionson the distribution of receiver types under which the optimal signal is polarizedbetween fully informative and completely uninformative signals as for example inLewis and Sappington (1994) and Johnson and Myatt (2006)13

Corollary 2 In the canonical case an optimal signal is either fully informative orcompletely uninformative for all F and all ρ if and only if there exist λ isin R and c gt 0such that g(r) = ceminusλr for r isin [0 1]

If g(r) = ceminusλr then the fully informative signal is optimal whenever ρ ge λ and thecompletely uninformative signal is optimal whenever ρ le λ (and any signal is optimalwhen ρ = λ) In particular if ρ = 0 the optimal signal is fully determined by thesign of λ which is in turn determined by whether the mean of r is smaller or greaterthan 1

2

32 Comparative Statics Theorem 2 allows for a sharp comparative statics anal-ysis on the informativeness of the optimal signal

We compare signals by their Blackwell informativeness (Blackwell 1953) To compareupper-censorship signals s1 and s2 we only need to compare their cutoffs ωlowast

1 and ωlowast2

Signal s1 is more informative than signal s2 if ωlowast1 ge ωlowast

2 Indeed state ω isin [0ωlowast2) is

fully revealed by both s1 and s2 and state ω isin [ωlowast2 1] is partially revealed by s1 and

not revealed at all by s2 so s1 is more informative than s2

13In the case of log-concave density g the sets of upper- and lower-censorship signals are totallyordered by the rotation order of Johnson and Myatt (2006) If we restrict attention to the set oflower-censorship signals (which are generally suboptimal under log-concave g) then the rotationpoint is decreasing with respect to the rotation order By Lemma 1 of Johnson and Myatt (2006)the senderrsquos expected utility is quasiconvex and one of the extreme lower-censorship signals (thefully informative or completely uninformative signal) is optimal for the sender But if we restrictattention to the set of upper-censorship signals (which are optimal under log-concave g) then therotation point is not decreasing with respect to the rotation order Therefore Lemma 1 of Johnsonand Myatt (2006) does not apply and an interior upper-censorship signal is generally optimal forthe sender

CENSORSHIP AS OPTIMAL PERSUASION 13

For the purpose of comparison we extend the definition of distribution function Gto the real line and assume that its density g is log-concave Consider a family ofdistributions Gtσ of receiver types

Gtσ (r) = G

983061τ minus t+

r minus τ

σ

983062

where τ is the maximum point of g and t isin R and σ gt 0 are parameters Let gtσ (r)denote the corresponding density Note that gtσ (r) is log-concave for all t isin R andall σ gt 0

Because gtσ is log-concave on [0 1] deterministic upper censorship is optimal byTheorem 2 Let ωlowast (ρ t σ) isin [0 1] be the optimal censorship cutoff as given by (3)

The shift parameter t shifts the distribution along the horizontal axis and the stretchparameter σ stretches the distribution horizontally and symmetrically around themode τ so the limit of σ rarr 0 leads to the unit mass on τ We now show that thesender optimally discloses more information when

(i) the senderrsquos preferences are better aligned with the receiverrsquos preferences (thealignment parameter ρ is greater)

(ii) the receiver is more reluctant to accept the proposal (the shift parameter t isgreater)

(iii) the sender is more uncertain about the receiverrsquos type (the stretch parameter σis greater) provided ρ ge 014

Theorem 3 In the canonical case a cutoff state of an optimal signal satisfies

(i) ωlowast (ρ t σ) is increasing in ρ

(ii) ωlowast (ρ t σ) is increasing in t

(iii) ωlowast (ρ t σ) is increasing in σ if ρ ge 0

The intuition for part (i) is that for a higher ρ the sender puts more weight onthe receiverrsquos utility so she optimally endows the receiver with a higher utility byproviding more information

The intuition for part (ii) is that for a higher t each type of the receiver has a greatercost of accepting the proposal so to persuade the same type of the receiver the senderneeds to increase E[ω|ω ge ωlowast] by expanding the full disclosure interval [0ωlowast]

The intuition for part (iii) is that for a higher σ receiver types are more spread out soto persuade the same mass of types the sender again needs to increase E[ω|ω ge ωlowast]

14The case ρ ge 0 is the practically relevant case where the senderrsquos utility is a weighted averageof the receiverrsquos action and utility 1

1+ρv(ω r) =1

1+ρ + ρ1+ρ (ω minus r)

14 KOLOTILIN MYLOVANOV ZAPECHELNYUK

4 Discrete State

In this section we assume that the state is a discrete random variable

Theorem 1prime Let V be S-shaped Then and only then for all discrete F

(A) an optimal signal is stochastic upper censorship whose cutoff pair (ωlowast qlowast) satisfiescondition (3)

(B) an optimal monotone partition is deterministic upper censorship whose cutoff pair(ωlowast

d qlowastd) satisfies ωlowast

d = ωlowast and qlowastd isin 0 1

In the canonical case we obtain the result analogous to Theorem 2

Theorem 2prime Consider the canonical case and let the density g be log-concave Thenand only then for all discrete F and all ρ

(A) an optimal signal is stochastic upper censorship

(B) an optimal monotone partition is deterministic upper censorship

Let us discuss how the comparative statics result (Theorem 3) changes In Section 32we argue that upper-censorship signals can be ordered by their censorship cutoffs Butwhen the state is discrete a censorship cutoff is described by a pair (ωlowast qlowast) where ωlowast

is a cutoff state and qlowast is the probability of revealing this cutoff state when it realizes

Consider two stochastic upper-censorship signals s1 and s2 with cutoff pairs (ωlowast1 q

lowast1)

and (ωlowast2 q

lowast2) Denote (ωlowast

1 qlowast1) ≽ (ωlowast

2 qlowast2) if ω

lowast1 gt ωlowast

2 or ωlowast1 = ωlowast

2 and qlowast1 ge qlowast2 Observethat s1 is more (Blackwell) informative than s2 if and only if (ωlowast

1 qlowast1) ≽ (ωlowast

2 qlowast2) This

comparison also applies to deterministic upper-censorship signals with the constraintthat qlowast1 and qlowast2 are in 0 1

With the above order on cutoff pairs Theorem 3 extend as follows

Theorem 3prime Consider the canonical case

(A) A cutoff pair of an optimal signal satisfies

(i) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in ρ

(ii) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in t

(iii) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in σ if ρ ge 0

(B) A cutoff pair of an optimal monotone partition satisfies15

(i) (ωlowastd (ρ t σ) q

lowastd (ρ t σ)) is increasing in ρ

(ii) (ωlowastd (ρ t σ) q

lowastd (ρ t σ)) is increasing in t

15Part (iii) also holds for an optimal monotone partition when for example the discrete supportof the state ω is a regular grid with sufficiently many points However it is possible to construct acounterexample where part (iii) fails for an optimal monotone partition

CENSORSHIP AS OPTIMAL PERSUASION 15

That is the sender optimally discloses more information when she is less biasedrelative to the receiver (the alignment parameter ρ is greater) when the receiver ismore reluctant to accept the proposal (the shift parameter t is greater) and underScenario A when the sender is more uncertain about the receiverrsquos type (the stretchparameter σ is greater)

Remark 2 Theorem 1prime and Theorem 2prime show that upper censorship emerges as anoptimal persuasion strategy under the same conditions (V is S-shaped in Theorem1prime and g is log-concave in Theorem 2prime) regardless of whether the sender can choosean arbitrary signal or is constrained to choose a monotone partition It may betempting to conjecture that if the sender was allowed to choose an arbitrary partition(not necessarily monotone) the same result would hold This however is not true

For illustration suppose that the sender wishes to maximize the probability of a = 1and the receiver has type r = 1

2with certainty16 Let the state ω take values in

0 14 1 with probabilities (1

6 23 16) so the prior mean is 1

3

An optimal signal is stochastic upper censorship with cutoff pair (ωlowast qlowast) = (14 12)

which sends the pooling message with certainty at ω = 1 and with probability 1minusqlowast =12at ω = 1

4 and reveals the state otherwise The pooling message induces the posterior

mean m = 12 leading to a = 1 all other messages induce posterior means below 1

2

leading to a = 0 Under the optimal signal action a = 1 is chosen with probability23middot 12+ 1

6= 1

2

An optimal monotone partition is deterministic upper censorship that fully revealsthe state where only ω = 1 leads to a = 1 Under this partition action a = 1 ischosen with probability 1

6

An optimal partition pools the extreme states ω = 0 and ω = 1 and reveals theintermediate state ω = 1

4 The pooling message induces the posterior mean m = 1

2

leading to a = 1 This optimal partition is nonmonotone Under this partitionaction a = 1 is chosen with probability 1

6+ 1

6= 1

3 which is strictly better than the

probability of 16obtained under the optimal monotone partition

5 Application to Media Censorship

In this section we apply our results to the problem of media censorship by thegovernment In the modern world people obtain information about the governmentrsquosstate through various media sources such as television newspapers and internetblogs Without the media most people would not know what policies and reformsthe government pursues and how effective they are Media outlets have differentpositions on the political spectrum and differ substantially in how they select andpresent facts to cover the same news People choose their sources of information based

16In this case V (m) = 1 if m ge 12 and V (m) = 0 if m lt 1

2 Observe that the graph of V is

S-shaped and it can be approximated by a continuously differentiable S-shaped function

16 KOLOTILIN MYLOVANOV ZAPECHELNYUK

on their political ideology and socioeconomic status This information is valuable forsignificant individual decisions on migration investment and voting to name a fewIndividuals may not fully internalize externalities that their decisions impose on thesociety Likewise the government may not have the societyrsquos best interest at heartTo further its goals the government then wishes to influence individual decisions bymanipulating the information through media In autocracies and countries with weakchecks and balances the government has power to censor the media content

The governmentrsquos problem of media censorship can be represented as the persua-sion problem in Section 2 We apply our results to provide conditions for the op-timality of upper-censorship policies that censor all media outlets except the mostpro-government ones Furthermore we interpret our comparative statics results

51 Setup There is a continuum of heterogeneous citizens indexed by r isin [0 1]distributed with G Each citizen chooses between a = 0 and a = 1 The utility of acitizen of type r is given by

u(θ r ar a) = (θ minus r)ar + κ(θ r a)

where ar isin 0 1 denotes the citizenrsquos own action a =983125ardG(r) denotes the aggre-

gate action in the society θ isin [0 1] captures an unobserved benefit from action 1 ascompared to action 0 and κ captures the impact of the aggregate action a on thecitizenrsquos utility The term (θ minus r)ar is a private surplus of a citizen of type r Theterm κ(θ r a) is an externality because for a citizen of type r it is optimal to ignorethis term and choose ar = 1 if and only if θ ge r

There is a government which is concerned with a weighted average of the social utilityand the governmentrsquos intrinsic benefit from the aggregate action For a given θ thegovernmentrsquos utility is given by

983133 1

0

ν(θ r ar a)dG(r) + δγ(θ a)

The term ν captures a citizenrsquos utility from the governmentrsquos perspective We allow νto be different from u to reflect paternalistic or other concerns The term γ capturesthe governmentrsquos intrinsic benefit from the aggregate action The parameter δ ge 0captures the weight of the aggregate action in the governmentrsquos utility

Let T be a distribution of the random variable θ We assume that distributions G andT are independent and admit continuously differentiable and strictly positive densi-ties We also assume that κ ν and γ are linear in θ and continuously differentiablein r and a Furthermore to simplify interpretations we assume that ν and γ arenon-decreasing in θ and a so that for the government a high θ is a good news anda higher aggregate action is preferable

Citizens obtain information about the unobservable benefit θ through media outletsEach media outlet is identified by its editorial policy c isin [0 1] and it endorses action

CENSORSHIP AS OPTIMAL PERSUASION 17

a = 1 if θ ge c and criticizes it if θ lt c17 A set C of media outlets is a finite subsetof [0 1]18

The governmentrsquos censorship policy is a set of the media outlets X sub C that arepermitted to broadcast so the rest of the media outlets are censored

The timing is as follows First the government chooses a set X sub C of permittedmedia outlets Second state θ is realized and each permitted media outlet endorsesor criticizes action a = 1 according to its editorial policy Finally each citizen observesmessages from all permitted media outlets updates his beliefs about θ and choosesan action

52 Discussion We now discuss interpretations of the key components of the mediacensorship application As in Gehlbach and Sonin (2014) there can be various inter-pretations of the citizenrsquos action a = 1 such as voting for the government supportinga governmentrsquos policy or taking an individual decision that benefits the governmentA citizenrsquos type r can be interpreted as his ideological position or preference param-eter A citizen who is more supportive of the government has a smaller r

A media outlet with a higher editorial policy c can be interpreted as being less loyalto the government because it criticizes the government on a larger set of states Aneditorial policy c isin C can therefore represent a slant or political bias of the outletagainst the government and can be empirically measured as the frequency with whichthe outlet uses anti-government language Gentzkow and Shapiro (2010) constructsuch a slant index for US newspapers Empirical findings of their paper suggestthat the editorial policies of media outlets are driven by reader preferences justifyingour assumption of the existence of a large variety of editorial policies19 As in Suen(2004) Chan and Suen (2008) and Chiang and Knight (2011) the assumption ofthe binary media reports that communicate only whether the state θ is above somestandard c can be justified by a cursory readerrsquos preference for simple messages suchas positive or negative opinions and yes or no recommendations

The governmentrsquos censorship of media outlets can take various forms For examplethe government can ban access to internet sites withdraw licenses disrupt financingconfiscate print materials and equipment discredit media providers and even arresteditors and journalists using broadly formulated legislation on combating extremismIn some countries the government can exercise direct control over media editorialpolicies either through state ownership or administrative pressure

17The tie-breaking in the event of θ = c is unimportant as θ is a continuous random variable18We discuss the case of a continuum of media outlets in Section 5519Theoretical literature has explored the determinants of media slant of an outlet driven by its

citizens (Mullainathan and Shleifer 2005 Gentzkow and Shapiro 2006 and Chan and Suen 2008)and its owners (Baron 2006 and Besley and Prat 2006)

18 KOLOTILIN MYLOVANOV ZAPECHELNYUK

53 Formulating Media Censorship as Persuasion We now show that the me-dia censorship problem can be formulated as a linear persuasion problem in whichthe government is a sender and a representative citizen is a receiver

In our model the citizensrsquo and the governmentrsquos utilities are linear in θ There-fore given any information from the media outlets the utilities depend only on theposterior mean of θ

Consider an arbitrary posterior mean of θ denoted by m isin [0 1] Each citizen of typer chooses ar = 1 if and only if r le m The externality term κ plays no role in thisdecision Therefore the aggregate action a is simply the mass of all citizens whosetypes do not exceed m so a = G(m)

Next using the citizensrsquo optimal behavior we derive the governmentrsquos expected utilityconditional on the posterior mean m

V (m) = E983063983133 1

0

ν(θ r ar G(m))dG(r) + δγ(θ G(m))

983055983055983055983055m983064

=

983133 1

0

ν(m r1rlem G(m))dG(r) + δγ(mG(m)) (6)

This is the governmentrsquos indirect utility As in Section 2 the government now choosesa signal which is informative about the state to maximize its expected utility How-ever in contrast to Section 2 here the government is restricted to signals that areimplementable by a subset of a given set of media outlets

Next we show that with an appropriate definition of the state the restriction tosignals implementable by a subset of a given set of media outlets can be formulatedas a restriction to monotone partitional signals as in Scenario B in Section 2 Thuswe will be able to use our results in Section 4 to find the optimal censorship policies

Recall that the information about θ is only available through the media outlets Let ωdenote a random variable equal to the conditional expectation of θ given the messagesof all media outlets in C Note that as C is finite ω is a discrete random variablewith values in [0 1] Let F denote the distribution of ω From now on we treat therandom variable ω as the state and denote by s a signal as defined in Section 2

Let HM denote the set of all distributions of the posterior mean state induced byall monotone partitional signals and let HC denote the set of all distributions ofthe posterior mean state induced by all finite subsets of the set C of media outletsObserve that any set of media outlets induces a monotone partitional signal thatis HC sub HM We now show that every outcome implementable by an arbitrarymonotone partition in HM is also implementable by a subset of media outlets

Lemma 2 HC = HM

For illustration suppose that there is only one media outlet with the editorial policyc isin (0 1) There are two observable events θ ge c and θ lt c In each realized event

CENSORSHIP AS OPTIMAL PERSUASION 19

the citizens compute the posterior expectation of θ which we will call ω So in thisexample the distribution F of ω has two mass points one for each of the two events

Given these two mass points every monotone partition of [0 1] induces one of twopossible distributions of the posterior mean state (i) the mass points revealed and (ii)the mass points pooled Observe however that these distributions are implementableby permitting and censoring the media outlet respectively

A censorship policy is upper censorship if it censors all sufficiently disloyal mediaoutlets Specifically there exists a cutoff clowast isin C and an indicator qlowast isin 0 1 suchthat all media outlets whose editorial policies are below clowast are permitted all mediaoutlets whose editorial policies are above clowast are censored and clowast is permitted if andonly if qlowast = 1 That is X = c isin C c le clowast if qlowast = 1 and X = c isin C c lt clowastif qlowast = 0 Note that the full censorship policy clowast = 0 and qlowast = 0 (where all mediaoutlets are censored) and the free media policy clowast = 1 and qlowast = 1 (where all mediaoutlets are permitted) are the two extreme upper-censorship policies

As shown above the media censorship problem is equivalent to the linear persuasionproblem in which the sender is restricted to monotone partitional signals and thesenderrsquos indirect utility V is given by (6) We thus apply Theorem 1prime(B) to obtainthe following result

Theorem 1primeprime If V is S-shaped then an optimal censorship policy is upper censorship

To illustrate Theorem 1primeprime suppose the government is interested only in the aggregateaction so that ν = 0 and γ depends only on the aggregate action a so V (m) =γ(G(m)) Thus an upper-censorship policy is optimal if the composition functionγ(G(middot)) is S-shaped For example this condition holds if the government is interestedin reaching a certain approval threshold (eg a simple majority) so that γ is a stepfunction This condition also holds if γ is S-shaped and G is uniform or S-shapedwith the same inflection point as γ20

54 Canonical Case We now impose more structure on the utilities to obtain asharper result for the optimality of upper censorship and to perform a comparativestatics analysis Analogously to assumption (A1) we assume that the governmentrsquosutility is a weighted average of the citizensrsquo utility and their aggregate action

983133 1

0

u(θ r ar a)dG(r) + δa

where u(θ r ar a) = (θ minus r)ar + ζ(r)a

(A2)

for some continuously differentiable function ζ Define

β =

983133 1

0

ζ(r)dG(r) + δ

20This condition holds in many special cases where γ and G are S-shaped even when γ and Ghave different inflection points

20 KOLOTILIN MYLOVANOV ZAPECHELNYUK

The term983125 1

0ζ(r)dG(r) is the governmentrsquos expected bias towards the citizensrsquo action

a = 1 due to the citizensrsquo externality and the term δ is the governmentrsquos intrinsicbias towards a greater aggregate action a Thus we interpret β as the governmentrsquosaggregate bias This decomposition of the bias allows for different interpretations whythe government is biased So β can be high because the government is too self-serving(high δ) or because the government is benevolent (low δ) but wishes to internalizestrong positive externalities of the citizens (high ζ(r)) To ease interpretations weassume that β gt 021

Under Assumption (A2) the governmentrsquos indirect utility V given by (6) becomes

V (m) =

983133 m

0

(mminus r)dG(r) + βa =

983133 m

0

v(m r)dG(r)

where

v(m r) = β

9830611 +

1

β(mminus r)

983062

This is the same as v given by Assumption (A1) with ρ = βminus1 up to rescaling by apositive constant Consequently we can apply Theorem 2prime(B) to obtain the followingresult

Theorem 2primeprime Let (A2) hold If the density g of citizens types is log-concave then anoptimal censorship policy is upper censorship

We now apply the comparative statics analysis presented in Section 32 Note thatthe upper-censorship policies are ordered according to the amount of informationtransmitted to the citizens in the sense of Blackwell (1953) A greater censorshipthreshold clowast means that more media outlets are permitted With this order in mindwe apply Theorem 3prime(B) to make a comparative statics analysis on the amount ofinformation that is optimally disclosed by the government

First the censorship cutoff is increasing in the alignment parameter ρ Recall thatthis is a reciprocal of the governmentrsquos bias ρ = βminus1 This means that the governmentoptimally discloses more information (the censorship cutoff is greater) when it is lessbiased (β is smaller) Intuitively as β decreases the government puts more weighton the citizensrsquo utility so it optimally endows the population with a higher utility bycensoring fewer media outlets and disclosing more information

Second the censorship cutoff is increasing in the magnitude t of the horizontal shiftof the density g A greater t corresponds to a greater opportunity cost of actiona = 1 for each citizen This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more difficultto persuade to take action a = 1 (parameter t is greater) Informally speaking topersuade the same type of the citizen the government needs to increase the posterior

21If β lt 0 then swapping the roles of a = 0 and a = 1 reverses the sign of the bias If β = 0then the governmentrsquos utility and the citizensrsquo private interests coincide so it is trivially optimal topermit all media outlets

CENSORSHIP AS OPTIMAL PERSUASION 21

mean But the expected posterior mean must be equal to the prior mean so it is notpossible to increase all posterior means Due to the log-concave shape of the densityof the citizensrsquo types this tradeoff is resolved by increasing the posterior mean of thepooling interval E[ω|ω ge clowast] This is done by shrinking the pooling interval [clowast 1]that is increasing the censorship cutoff clowast

Third when the number of media outlets is sufficiently large the censorship cutoffis increasing in the stretch factor σ of the density g around its peak τ (see Footnote15) A greater σ corresponds to a more dispersed distribution of the opportunitycost in the population This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more diverse inhow difficult they are to persuade Intuitively when the types are more spread outpersuading the same mass of types requires a greater posterior mean which leads toa higher censorship cutoff similarly to the comparative statics in t discussed above

55 Extensions and Open Questions Let us now consider a few extensions ofour model of media censorship

In our model the set of media outlets is exogenous and the governmentrsquos only in-strument is censorship We now consider three alternative ways of expanding thegovernmentrsquos instruments of influence

First suppose that the government is able not only to censor existing media outletsbut also to introduce new media outlets with chosen editorial policies This is equiv-alent to our censorship model where all media outlets in [0 1] are initially availableand the government can censor any subset of them When all media outlets in [0 1]are permitted the revealed information about θ (which we call ω) is θ itself Thusω = θ is a continuous random variable with distribution F = T So we can nowapply our results for the continuous state from Section 3 instead of those for the dis-crete state in Section 4 reaching the same conclusion about the optimality of uppercensorship

Second suppose that the government can garble information available from mediaoutlets That is the government is not restricted to monotone partitions it cancreate arbitrary signals about state ω for the citizens to observe This becomes ageneral persuasion problem and our Theorems 1prime(A) 2prime(A) and 3prime(A) apply

Third suppose that the government is able to restrict not only which media outletsare permitted but also how many media outlets each citizen can choose to observeIn this extension the citizens are not allowed to communicate with one another (oth-erwise they could share the information thus observing all permitted media outletsindirectly) This extension does not affect our results as long as each citizen is al-lowed to access at least one media outlet of his choice as in Chan and Suen (2008)Intuitively this is because each citizen categorizes the information from the mediaoutlets into ldquogood newsrdquo where a = 1 is optimal and ldquobad newsrdquo where a = 0 is

22 KOLOTILIN MYLOVANOV ZAPECHELNYUK

optimal Because the information from the media outlets induces a monotone parti-tion it means that ldquogood newsrdquo is separated from ldquobad newsrdquo by a threshold mediaoutlet that depends on the citizenrsquos type That is it is sufficient to observe a singlethreshold media outlet to distinguish ldquogood newsrdquo from ldquobad newsrdquo

In our model each citizenrsquos utility depends on the aggregate action a through theexternality term κ(θ r a) which does not affect the chosen action Let us relax thisassumption so that a citizenrsquos optimal choice can depend on a We can still writethe senderrsquos indirect utility V as a function of the posterior mean state and applyour results However V is no longer uniquely determined by the primitives of themodel It is now endogenous and depends on an equilibrium the citizens play aseach citizenrsquos optimal action now depends on what all citizens do in equilibrium Forexample given the same information about the state a citizen could prefer to choosea = 1 if and only if sufficiently many citizens choose the same action so a is largeenough This creates the problem of multiplicity of equilibria and as a consequencethe dependence of optimal censorship on equilibrium selection

As a side our media censorship problem can also be applied to spatial voting mod-els as in Chiang and Knight (2011) Consider a government party (p = G) and anopposition party (p = O) competing in an election If party p wins a voter withan ideological position r gets utility wp minus (r minus rp)

2 where wp is the quality or va-lence of party p and rp is the ideology or policy platform of party p Voters knowthe partiesrsquo ideologies and obtain information about the partiesrsquo qualities from allavailable media outlets Each voter supports the party that maximizes his expectedutility Within this context our analysis still applies because the voterrsquos utility dif-ference between the government and opposition parties is proportional to θminusr whereθ = (wG minus wO + r2O minus r2G)2(rO minus rG)

There are a few more extensions that can be relevant in applications First insteadof complete censorship of a media outlet there can be a cost of accessing it Forexample an international news channel can be censored by a local government butcitizens may still access it through VPN at some cost Second it can be costly forthe government to censor media outlets So another important question to answeris how a government may prioritize censoring Finally citizens can incur some costof following each media outlet While we have already mentioned that citizens gainno benefit from following more than one outlet it is entirely possible for them tostop watching news altogether if it is sufficiently uninformative These extensions arenontrivial and left for future research

Appendix

Proof of Theorem 1 Suppose that V is S-shaped For mωlowast isin [0 1] and mlowast(ωlowast) =E[ω|ω ge ωlowast] define

V (mωlowast) = V (mlowast(ωlowast)) + V prime(mlowast(ωlowast))(mminusmlowast(ωlowast)) (7)

CENSORSHIP AS OPTIMAL PERSUASION 23

Since V (m) is convex on [0 τ ] and concave on [τ 1] there exists ωlowast isin [0 τ ] thatsatisfies condition (3) that is

V (mωlowast) ge (le)V (m) for all m ge (lt)ωlowast (8)

This is true because V (ωlowastωlowast) minus V (ωlowast) is single-crossing from below To see thisconsider Figure 1 Observe that (3) can hold only if ωlowast is on the convex part of V andmlowast is on the concave part of V As the censorship cutoff ωlowast increases the posteriormean state of the pooling message mlowast(ωlowast) also increases But because V is concaveat mlowast(ωlowast) the dashed tangent line V (mωlowast) becomes flatter in m so it crosses thesolid line V (m) at a smaller m

Consider an arbitrary signal s Let H be the distribution of m = E[ω|s] inducedby signal s Since each signal is a garbling of the fully informative signal F is amean-preserving spread of H Thus the senderrsquos expected utility is smaller undersignal s than under the upper-censorship signal with cutoff ωlowast as follows from

983133 1

0

V (m)dH(m) le983133 1

0

maxV (m) V (mωlowast)dH(m)

le983133 1

0

maxV (m) V (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mlowast(ωlowast))dF (m)

where the second line holds because maxV (m) V (mωlowast) is convex in m and F isa mean-preserving spread of H the third line holds by (8) and the last line holds by

(7) and by983125 1

ωlowast(mminusmlowast(ωlowast))dF (m) = 0

Conversely suppose that V is not S-shaped Then there exist 0 le m1 lt m2 lem3 lt m4 le 1 such that V primeprime(m) lt 0 for m isin (m1m2) V

primeprime(m) = 0 for m isin [m2m3]and V primeprime(m) gt 0 for m isin (m3m4) because V primeprime is continuous by assumption It isstraightforward to show that there exists 983141mlowast isin (m1m2) (sufficiently close to m2) and983141ωlowast isin (m3m4) such that

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) lt V (m) for all m isin (983141ωlowastm4]

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) gt V (m) for all m isin [m1 983141ωlowast] 983141mlowast

Moreover there exists a continuous distribution F such that the support of F is asubset of [m1m4] and 983141mlowast = E[ω|ω le 983141ωlowast] Using a chain of inequalities similar to theone above it is easy to show that lower-censorship signal 983141slowast with cutoff 983141ωlowast is optimalFurthermore it is uniquely optimal as both inequalities hold with equalities only forsignal 983141slowast This in turn implies that any upper-censorship signal is suboptimal

24 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Proof of Lemma 1 Notice that under assumption (A1) we have

V primeprime(m) = gprime(m) + ρg(m) for m isin [0 1]

Recall that V is S-shaped if and only if V primeprime(m) = gprime (m) + ρg (m) is single-crossingfrom above Using Proposition 1 in Quah and Strulovici (2012) it is easy to showthat V primeprime(m) is single-crossing from above for all ρ isin R if and only if gprime (m)g (m) isnonincreasing in m (that is ln g(m) is concave)

Proof of Theorem 2 Immediate by Theorem 1 and Lemma 1

Proof of Theorem 3 Part (i) We fix t and σ and vary ρ Without loss ofgenerality let t = 0 and σ = 1 (so that Gtσ = G) Consider a signal that induces thedistribution H of posterior mean m This signal implements the receiverrsquos interimutility U given by

U(r) =

983133 1

r

(mminus r)dH(m) for r isin [0 1]

which holds because the receiver acts if and only if m ge r Since types r lt 0 alwayschoose a = 1 and types r gt 1 always choose a = 0 we can exclude them fromconsideration and assume that supp(G) = [0 1] The senderrsquos expected utility is then

983133 1

0

V (m)dH(m) =

983133 1

0

983133 m

0

(1 + ρ(mminus r))dG(r)dH(m)

=

983133 1

0

983133 1

r

(1 + ρ(mminus r))dH(m)dG(r)

=

983133 1

0

(1minusH(r) + ρU(r))dG(r)

Consider ρ2 gt ρ1 Suppose by contradiction that the corresponding optimal upper-censorship signals s2 and s1 are such that s1 is strictly more informative than s2 LetU2 and U1 be the receiverrsquos interim utilities implemented by s2 and s1 Since thesender prefers signal s2 under ρ2 and signal s1 under ρ1 we have

983133 1

0

(1minusH2 (r) + ρ2U2 (r)) dG (r) ge983133 1

0

(1minusH1 (r) + ρ2U1 (r)) dG (r)

983133 1

0

(1minusH1 (r) + ρ1U1 (r)) dG (r) ge983133 1

0

(1minusH2 (r) + ρ1U2 (r)) dG (r)

Summing up these inequalities gives

(ρ2 minus ρ1)

983133 1

0

(U2 (r)minus U1 (r)) dG (r) ge 0

leading to a contradiction because U1(r) ge U2(r) for all r with strict inequality forsome r given that s1 is strictly more informative than s2

CENSORSHIP AS OPTIMAL PERSUASION 25

Part (ii) We fix ρ isin R and σ = 1 and vary t The indirect utility of the senderconditional on m isin [0 1] is given by

Vt(m) =

983133

rlem

(1+ρ(mminusr))gt1(r)dr =

983133

rlemminust

(1+ρ(mminustminusr))g(r)dr = V (mminust) (9)

By Theorem 1 when t0 = 0 an optimal cutoff ωlowast0 satisfies

V (m) le V (mlowast0) + V prime(mlowast

0)(mminusmlowast0) for m isin [ωlowast

0 1]

When V is S-shaped and t1 lt 0 it is easy to see that

V (mminus t1) le V (mlowast0 minus t1) + V prime(mlowast

0 minus t1)(mminusmlowast0) for m isin [ωlowast

0 1]

which implies that when t = t1 the above inequality changes its sign at some ωlowast1

that satisfies ωlowast1 le ωlowast

0 It follows from (9) that ωlowast1 is an optimal cutoff for t = t1 and

as we established above ωlowast1 le ωlowast

0

Part (iii) We fix ρ ge 0 and t = 0 and vary σ The indirect utility of the senderconditional on m isin [0 1] is given by

V ρσσ(m) =

983133

rlem

9830591 +

ρ

σ(mminus r)

983060dG0σ(r)

=

983133

rlem

9830591 +

ρ

σ(mminus r)

983060g

983061r minus (1minus σ)τ

σ

9830621

σdr

=

983133

rlemminus(1minusσ)τσ

9830611 + ρ

983061mminus (1minus σ)τ

σminus r

983062983062g(r)dr

= Vρ1

983061mminus (1minus σ)τ

σ

983062 (10)

Let σ0 = 1 σ1 lt 1 and ρ1 = ρσ1 To compare the optimal cutoffs under parameters(ρ σ0) and (ρ σ1) we consider two steps first comparing the cutoffs between (ρ σ0)and (ρ1 σ1) and then comparing the cutoffs between (ρ1 σ1) and (ρ σ1)

First we compare an optimal cutoff ωlowast0 under (ρ σ0) = (ρ 1) and an optimal cutoff

ωlowast1 under (ρ1 σ1) By Theorem 1 ωlowast

0 satisfies

Vρ1(m) le Vρ1(mlowast0) + V prime

ρ1(mlowast0)(mminusmlowast

0) for m isin [ωlowast0 1]

When V is increasing (implied by ρ ge 0) and S-shaped and σ1 lt 1 it is easy to seethat for all m isin [ωlowast

0 1]

Vρ1

983061mminus (1minus σ1)τ

σ1

983062le Vρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062+ V prime

ρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062(mminusmlowast

0)

which implies that when σ = σ1 lt 1 the above inequality changes its sign at someωlowast1 that satisfies ωlowast

1 le ωlowast0 It follows from (10) that ωlowast

1 is an optimal cutoff under(ρσ σ1) = (ρ1 σ1) and as we established above ωlowast

1 le ωlowast0

26 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Second we compare the optimal cutoff ωlowast1 under (ρ1 σ1) and the optimal cutoff ωlowast

2

under (ρ σ1) Because ρ lt ρ1 = ρσ1 by part (i) we have ωlowast2 le ωlowast

1

Combining ωlowast1 le ωlowast

0 and ωlowast2 le ωlowast

1 we obtain ωlowast2 le ωlowast

0

Proof of Theorem 1prime Part (A) The proof is analogous to that of Theorem 1 andis thus omitted

Part (B) The only if statement follows from that in part (A) (see the proof of Theorem1) and the fact that we can always ensure that an optimal stochastic censorship(ωlowast qlowast) is deterministic (so qlowast isin 0 1) by the choice of the support and distributionof ω Thus we only need to prove the if statement

Let F be a distribution of ω with support on n values 0 lt ω0 lt lt ωnminus1 lt 1 Thisis without loss of generality because if 0 andor 1 were in the support of F we couldredefine the state as ω = (ω + ε)(1 + 2ε) and the type as r = (r + ε)(1 + 2ε)

Let X sub ω1 ωnminus1 identify a set of cutoffs defining a monotone partition for eachωi isin X the receiver is informed whether ω lt ωi or ω ge ωi Note that we exclude ω0

from the set of cutoffs because ω ge ω0 always holds

Note that X is deterministic upper censorship if and only if there exists a censorshipthreshold ωlowast isin [0 1] such that for each i = 1 nminus 1 ωi isin X if and only if ωi lt ωlowastthat is we can assume that qlowast = 0 without loss of generality

Let VX be the senderrsquos expected utility when the monotone partition is given by X

If n = 2 then there are only two partitions X = ω1 (full disclosure) and X = empty(no disclosure) Both are upper censorships so part (B) holds trivially

If n = 3 then there are four monotone partitions X = ω1ω2 (full disclosure)X = ω2 X = ω1 and X = empty (no disclosure) Only X = ω2 is not uppercensorship The next lemma proves that X = ω2 cannot be uniquely optimal whenV is S-shaped thus proving the optimality of upper censorship for n = 3

Denote

mjk = E983045ω|ω isin [ωjωk]

983046

For convenience of notation we write mj = ωj

Lemma 3 Let V be S-shaped with an inflexion point τ Let ω be a discrete randomvariable with support on m0m1m2 and probabilities p0 p1 p2 where 0 lt m0 ltm1 lt m2 lt 1 Then

(i) if τ le m0 then Vm2 le Vempty

(ii) if τ ge m1 then Vm2 le Vm1m2

(iii) if m0 lt τ lt m1 then Vm2 le maxVempty Vm1m2

CENSORSHIP AS OPTIMAL PERSUASION 27

Proof Define p01 = p0 + p1 and p02 = p01 + p2 = 1 Observe that

m01 =p0m0 + p1m1

p0 + p1and m02 =

p0m0 + p1m1 + p2m2

p0 + p1 + p2

We thus have

Vempty = p02V (m02) = (p01 + p2)V

983061p01m01 + p2m2

p01 + p2

983062

Vm2 = p01V (m01) + p2V (m2) = (p0 + p1)V

983061p0m0 + p1m1

p0 + p1

983062+ p2V (m2)

Vm1m2 = p0V (m0) + p1V (m1) + p2V (m2)

Clearly

m0 lt m01 lt m1 and m0 lt m02 lt m2

Therefore

Vm2 le Vempty lArrrArrp01

p01 + p2V (m01) +

p2p01 + p2

V (m2) le V

983061p01

p01 + p2m01 +

p2p01 + p2

m2

983062= V (m02)

(11)

and

Vm2 le Vm1m2 lArrrArr

V

983061p0

p0 + p1m0 +

p1p0 + p1

m1

983062le p0

p0 + p1V (m0) +

p1p0 + p1

V (m1)(12)

Suppose that τ le m01 so V is concave on [m01m2] Then (11) holds by Jensenrsquosinequality In particular it holds for τ le m0 lt m01 thus proving part (i)

Suppose that τ ge m1 so V is convex on [m0m1] Then (12) holds by Jensenrsquosinequality thus proving part (ii)

Finally suppose that m01 lt τ lt m1 Let (τ V (τ)) lie below (or on) the straight lineA connecting (m0 V (m0)) and (m2 V (m2)) as illustrated on Figure 2 By convexityof V (m) for m lt τ it must be the case that V (m01) lies below (or on) the dashedline B By concavity of V (m) for m gt τ it must be the case that V (m1) lies above(or on) the dashed line C Consequently (12) is satisfied

Alternatively let (τ V (τ)) lie above the straight line A connecting (m0 V (m0)) and(m2 V (m2)) as illustrated on Figure 3 If Vm2 le Vm1m2 then the proof of part(iii) is complete So suppose that Vm2 gt Vm1m2 so (12) is violated

By convexity of V (m) for m lt τ it must be the case that V (m01) lies below (or on)the line B Since (12) is violated V (m01) must lie between the lines A and B Recallthat m01 lt m02 lt m2 Observe that V (m02) must be in the shaded area on Figure 3Indeed if m02 lt τ then by convexity of V (m) for m lt τ it must be the case thatV (m02) lies between (or on) the lines B and D If m02 ge τ then by concavity of

28 KOLOTILIN MYLOVANOV ZAPECHELNYUK

τ m2m0

A C

B

m01 m1

V (m0) V (τ)

V (m2)

Figure 2 The case where (τ V (τ)) lies below (or on) the straight lineconnecting (m0 V (m0)) and (m2 V (m2))

τ m2m0

A

C

B

m01

V (m0)

V (τ)

V (m2)

D

V (m01)

Figure 3 The case where (τ V (τ)) lies above the straight line con-necting (m0 V (m0)) and (m2 V (m2))

V (m) for m gt τ it must be the case that V (m02) lies above (or on) line C It followsthat (11) is satisfied

We now adapt Lemma 3 to prove the optimality of upper censorship for n ge 4

CENSORSHIP AS OPTIMAL PERSUASION 29

Fix a monotone partition X sub ω1 ωnminus1 that is not upper censorship that isthere exists i isin 2 nminus 1 such that

ωiminus1 ∕isin X and ωi isin X (13)

Denote ωn = 1 Define j and k as follows If X cap ω1 ωiminus2 ∕= empty let ωj be thelargest element inX smaller than i otherwise let ωj = ω0 IfXcapωi+1 ωnminus1 ∕= emptylet ωk be the smallest element in X greater than i otherwise let ωk = ωn = 1 Define

m0 = ωj m1 = ωiminus1 m2 = ωi

and

p0 =Pr[ω isin [ωjωiminus1)]

Pr[ω isin [ωjωk)] p1 =

Pr[ω isin [ωiminus1ωi)]

Pr[ω isin [ωjωk)] p2 =

Pr[ω isin [ωiωk)]

Pr[ω isin [ωjωk)]

To apply Lemma 3 we identify Xωi with empty X with m2 and X cup ωiminus1 withm1m2 By Lemma 3 VX le maxVXωi VXcupωiminus1 Thus replacing X withXωi if VX le VXωi and with X cup ωiminus1 otherwise weakly improves the senderrsquosexpected utility Repeatedly applying such replacements yields upper censorshipIndeed this procedure terminates after a finite number of replacements because byLemma 3 for ωj ge τ we replace X with Xωi and for ωiminus1 le τ we replace Xwith X cup ωiminus1

We thus have shown that when V is S-shaped when optimizing over the set ofmonotone partitions we can restrict attention to deterministic upper censorship Itremains to show that an optimal deterministic censorship cutoff ωlowast

d coincides with anoptimal stochastic censorship cutoff ωlowast from part (A) To see this it is straightforwardto show that

V lowast(ωlowast qlowast) =

983133

[0ωlowast)

V (ω)dF (ω) + V (ωlowast)qlowast Pr[ω = ωlowast]

+ V (mlowast(ωlowast qlowast))((1minus qlowast) Pr[ω = ωlowast] + Pr[ω gt ωlowast]) (14)

is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order Since the two de-terministic censorship cutoff pairs closest to (ωlowast qlowast) in the Blackwell informativenessorder are (ωlowast 0) and (ωlowast 1) we obtain ωlowast

d = ωlowast

Proof of Theorem 2prime Immediate by Theorem 1prime and Lemma 1

Proof of Theorem 3prime The proof for an optimal signal is analogous to that ofTheorem 3 and is thus omitted The same applies to part (i) for an optimal monotonepartition Thus it remains to prove part (ii) for an optimal monotone partition

We fix ρ and σ = 1 and vary t continuously If ωlowast isin supp(F ) then an optimalmonotone partition is an optimal signal because all qlowast isin [0 1] are optimal Thuspart (ii) in (B) holds by part (ii) in (A) If ωlowast isin supp(F ) and qlowast isin (0 1) then either(ωlowast

d qlowastd) = (ωlowast 0) or (ωlowast

d qlowastd) = (ωlowast 1) is optimal because the senderrsquos expected utility

V lowast given by (14) is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order

30 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Therefore it suffices to consider values of t at which the sender is indifferent between(ωlowast 0) and (ωlowast 1) and to show that the sender strictly prefers (ωlowast 1) to (ωlowast 0) at aslightly higher t

V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) =rArr dV lowastt (ω

lowast 0)

dtle dV lowast

t (ωlowast 1)

dt (15)

The condition V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) can be written as

Vt(mlowast(ωlowast 0)) Pr[ω ge ωlowast] = Vt(ω

lowast) Pr[ω = ωlowast] + Vt(mlowast(ωlowast 1)) Pr[ω gt ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)Pr[ω = ωlowast]

Pr[ω ge ωlowast]+ Vt(m

lowast(ωlowast 1))Pr[ω gt ωlowast]

Pr[ω ge ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + Vt(mlowast(ωlowast 1))

mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Analogously the conditiondV lowast

t (ωlowast0)dt

le dV lowastt (ωlowast1)dt

can be written as

V primet (m

lowast(ωlowast 0)) ge V primet (ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + V primet (m

lowast(ωlowast 1))mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Thus (15) states that if Vt(mlowast(ωlowast 0)) is equal to the convex combination of Vt(ω

lowast)and Vt(m

lowast(ωlowast 1)) then the derivative V primet (m

lowast(ωlowast 0)) is greater than the convex com-bination of the derivatives V prime

t (ωlowast) and V prime

t (mlowast(ωlowast 1)) When V is S-shaped it is easy

to see that this property is satisfied

Proof of Lemma 2 Observe that a signal induced by any set X sub C of permittedmedia outlets induces a monotone partition of [0 1] and thus HC sub HM Indeed letcX(ω) and cX(ω) be the media outlets in X that are the closest to ω from above andfrom below

cX(ω) = sup983059c isin X c le ω

983052cup 0

983060and cX(ω) = inf

983059c isin X c gt ω cup 1

983060

By observing the media outlets in X each citizen is informed that ω isin [cX(ω) cX(ω))

Conversely let the set C of media outlets be c0 lt middot middot middot lt cn Without loss of generalityassume that C contains the two uninformative media outlets c0 = 0 and cn = 1 which(almost) always endorse and criticize action a = 1 respectively Then the distributionF of the state ω is a discrete distribution whose support is Ω = ω0 ωnminus1 whereωi = E[θ|θ isin [ci ci+1)] for i isin 0 nminus 1 Consider a monotone partitional signalrepresented by a nondecreasing function ξ [0 1] rarr R For each realization m of thissignal that occurs with a strictly positive probability define

x(m) = min983153ω isin Ω ξ(ω) = m

983154and x(m) = max

983153ω isin Ω ξ(ω) = m

983154

This signal induces a monotone partition of the finite set Ω This partition of Ω canbe implemented by a censorship policy that prohibits a media outlet c isin C if andonly if c isin (x(m) x(m)) for some realization m

CENSORSHIP AS OPTIMAL PERSUASION 31

References

Alonso R and O Camara (2016a) ldquoPersuading Votersrdquo American EconomicReview 106 3590ndash3605

(2016b) ldquoPolitical Disagreement and Information in Electionsrdquo Games andEconomic Behavior 100 390ndash412

Bagnoli M and T Bergstrom (2005) ldquoLog-Concave Probability and Its Ap-plicationsrdquo Economic Theory 26 445ndash469

Baron D P (2006) ldquoPersistent Media Biasrdquo Journal of Public Economics 90(1)1ndash36

Besley T and A Prat (2006) ldquoHandcuffs for the Grabbing Hand MediaCapture and Government Accountabilityrdquo American Economic Review 96(3) 720ndash736

Blackwell D (1953) ldquoEquivalent Comparisons of Experimentsrdquo Annals of Math-ematical Statistics 24 265ndash272

Chan J and W Suen (2008) ldquoA Spatial Theory of News Consumption andElectoral Competitionrdquo Review of Economic Studies 75 699ndash728

Chiang C-F and B Knight (2011) ldquoMedia Bias and Influence Evidence fromNewspaper Endorsementsrdquo Review of Economic Studies 78 795ndash820

Duffie D P Dworczak and H Zhu (2017) ldquoBenchmarks in Search MarketsrdquoJournal of Finance 72 1983ndash2044

Dworczak P (2017) ldquoMechanism Design with Aftermarkets Cutoff Mecha-nismsrdquo mimeo

Dworczak P and G Martini (2019) ldquoThe Simple Economics of Optimal Per-suasionrdquo Journal of Political Economy 127 993ndash2048

Edmond C (2013) ldquoInformation Manipulation Coordination and RegimeChangerdquo Review of Economic Studies 80 1422ndash1458

Egorov G S Guriev and K Sonin (2009) ldquoWhy Resource-poor DictatorsAllow Freer Media A Theory and Evidence from Panel Datardquo American PoliticalScience Review 103 645ndash668

Gehlbach S and K Sonin (2014) ldquoGovernment Control of the Mediardquo Journalof Public Economics 118 163ndash171

Gentzkow M and E Kamenica (2016) ldquoA Rothschild-Stiglitz Approach toBayesian Persuasionrdquo American Economic Review Papers amp Proceedings 106597ndash601

Gentzkow M and J Shapiro (2006) ldquoMedia Bias and Reputationrdquo Journalof Political Economy 114(2) 280ndash316

(2010) ldquoWhat Drives Media Slant Evidence from US Daily NewspapersrdquoEconometrica 78(1) 35ndash71

Ginzburg B (2019) ldquoOptimal Information Censorshiprdquo Journal of EconomicBehavior and Organization 163 377ndash385

Goldstein I and Y Leitner (2018) ldquoStress Tests and Information DisclosurerdquoJournal of Economic Theory 177 34ndash69

32 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Guo Y and E Shmaya (2019) ldquoThe Interval Structure of Optimal DisclosurerdquoEconometrica 87 653ndash675

Johnson J P and D P Myatt (2006) ldquoOn the Simple Economics of Adver-tising Marketing and Product Designrdquo American Economic Review 96 756ndash784

Kamenica E and M Gentzkow (2011) ldquoBayesian Persuasionrdquo American Eco-nomic Review 101 2590ndash2615

Kolotilin A (2015) ldquoExperimental Design to Persuaderdquo Games and EconomicBehavior 90 215ndash226

(2018) ldquoOptimal Information Disclosure A Linear Programming Ap-proachrdquo Theoretical Economics 13 607ndash636

Kolotilin A and H Li (2019) ldquoRelational Communicationrdquo mimeoKolotilin A M Li T Mylovanov and A Zapechelnyuk (2015) ldquoPer-suasion of a Privately Informed Receiverrdquo mimeo

Kolotilin A T Mylovanov A Zapechelnyuk and M Li (2017) ldquoPer-suasion of a Privately Informed Receiverrdquo Econometrica 85 1949ndash1964

Kolotilin A and A Wolitzky (2019) ldquoAssortative Information Disclosurerdquomimeo

Lewis T R and D Sappington (1994) ldquoSupplying Information to FacilitatePrice Discriminationrdquo International Economic Review 35 309ndash327

Lorentzen P (2014) ldquoChinarsquos Strategic Censorshiprdquo American Journal of Polit-ical Science 58 402ndash414

Mullainathan S and A Shleifer (2005) ldquoThe Market for Newsrdquo AmericanEconomic Review 95(4) 1031ndash1053

Orlov D P Zryumov and A Skrzypach (2018) ldquoDesign of Macro-prudentialStress Testsrdquo mimeo

Ostrovsky M and M Schwarz (2010) ldquoInformation Disclosure and Unravelingin Matching Marketsrdquo American Economic Journal Microeconomics 2 34ndash63

Prat A and D Stromberg (2013) ldquoThe Political Economy of Mass MediardquoAdvances in Economics and Econometrics 2 135ndash187

Quah J and B Strulovici (2012) ldquoAggregating the Single Crossing PropertyrdquoEconometrica 80 2333ndash2348

Rayo L and I Segal (2010) ldquoOptimal Information Disclosurerdquo Journal ofPolitical Economy 118 949ndash987

Romanyuk G and A Smolin (2019) ldquoCream Skimming and Information Designin Matching Marketsrdquo American Economic Journal Microeconomics 11 250ndash276

Suen W (2004) ldquoThe Self-Perpetuation of Biased Beliefsrdquo Economic Journal 114377ndash396

Yamashita T (2018) ldquoOptimal Public Information Disclosure by Mechanism De-signerrdquo mimeo

Zapechelnyuk A (2019) ldquoOptimal Quality Certificationrdquo American EconomicReview Insights forthcoming

12 KOLOTILIN MYLOVANOV ZAPECHELNYUK

That is if the density of receiver types g is log-concave then the senderrsquos indirectutility V is S-shaped Moreover this result is tight in the sense that if g is notlog-concave then there exists ρ such that V is not S-shaped

By Theorem 1 and Lemma 1 we obtain a criterion for the optimality of upper cen-sorship with the condition on the primitive of the model the density g

Theorem 2 Consider the canonical case and let the density g be log-concave Thenand only then an optimal signal is deterministic upper censorship for all F and all ρ

The symmetric statement is also true an optimal signal is lower censorship for all Fand all ρ if and only if minusg is log-concave Consequently if both g and minusg are log-concave (that is g is exponential) then there exists an optimal signal which is bothupper censorship and lower censorship There are only two signals with this propertyfully informative and completely uninformative This allows us to obtain conditionson the distribution of receiver types under which the optimal signal is polarizedbetween fully informative and completely uninformative signals as for example inLewis and Sappington (1994) and Johnson and Myatt (2006)13

Corollary 2 In the canonical case an optimal signal is either fully informative orcompletely uninformative for all F and all ρ if and only if there exist λ isin R and c gt 0such that g(r) = ceminusλr for r isin [0 1]

If g(r) = ceminusλr then the fully informative signal is optimal whenever ρ ge λ and thecompletely uninformative signal is optimal whenever ρ le λ (and any signal is optimalwhen ρ = λ) In particular if ρ = 0 the optimal signal is fully determined by thesign of λ which is in turn determined by whether the mean of r is smaller or greaterthan 1

2

32 Comparative Statics Theorem 2 allows for a sharp comparative statics anal-ysis on the informativeness of the optimal signal

We compare signals by their Blackwell informativeness (Blackwell 1953) To compareupper-censorship signals s1 and s2 we only need to compare their cutoffs ωlowast

1 and ωlowast2

Signal s1 is more informative than signal s2 if ωlowast1 ge ωlowast

2 Indeed state ω isin [0ωlowast2) is

fully revealed by both s1 and s2 and state ω isin [ωlowast2 1] is partially revealed by s1 and

not revealed at all by s2 so s1 is more informative than s2

13In the case of log-concave density g the sets of upper- and lower-censorship signals are totallyordered by the rotation order of Johnson and Myatt (2006) If we restrict attention to the set oflower-censorship signals (which are generally suboptimal under log-concave g) then the rotationpoint is decreasing with respect to the rotation order By Lemma 1 of Johnson and Myatt (2006)the senderrsquos expected utility is quasiconvex and one of the extreme lower-censorship signals (thefully informative or completely uninformative signal) is optimal for the sender But if we restrictattention to the set of upper-censorship signals (which are optimal under log-concave g) then therotation point is not decreasing with respect to the rotation order Therefore Lemma 1 of Johnsonand Myatt (2006) does not apply and an interior upper-censorship signal is generally optimal forthe sender

CENSORSHIP AS OPTIMAL PERSUASION 13

For the purpose of comparison we extend the definition of distribution function Gto the real line and assume that its density g is log-concave Consider a family ofdistributions Gtσ of receiver types

Gtσ (r) = G

983061τ minus t+

r minus τ

σ

983062

where τ is the maximum point of g and t isin R and σ gt 0 are parameters Let gtσ (r)denote the corresponding density Note that gtσ (r) is log-concave for all t isin R andall σ gt 0

Because gtσ is log-concave on [0 1] deterministic upper censorship is optimal byTheorem 2 Let ωlowast (ρ t σ) isin [0 1] be the optimal censorship cutoff as given by (3)

The shift parameter t shifts the distribution along the horizontal axis and the stretchparameter σ stretches the distribution horizontally and symmetrically around themode τ so the limit of σ rarr 0 leads to the unit mass on τ We now show that thesender optimally discloses more information when

(i) the senderrsquos preferences are better aligned with the receiverrsquos preferences (thealignment parameter ρ is greater)

(ii) the receiver is more reluctant to accept the proposal (the shift parameter t isgreater)

(iii) the sender is more uncertain about the receiverrsquos type (the stretch parameter σis greater) provided ρ ge 014

Theorem 3 In the canonical case a cutoff state of an optimal signal satisfies

(i) ωlowast (ρ t σ) is increasing in ρ

(ii) ωlowast (ρ t σ) is increasing in t

(iii) ωlowast (ρ t σ) is increasing in σ if ρ ge 0

The intuition for part (i) is that for a higher ρ the sender puts more weight onthe receiverrsquos utility so she optimally endows the receiver with a higher utility byproviding more information

The intuition for part (ii) is that for a higher t each type of the receiver has a greatercost of accepting the proposal so to persuade the same type of the receiver the senderneeds to increase E[ω|ω ge ωlowast] by expanding the full disclosure interval [0ωlowast]

The intuition for part (iii) is that for a higher σ receiver types are more spread out soto persuade the same mass of types the sender again needs to increase E[ω|ω ge ωlowast]

14The case ρ ge 0 is the practically relevant case where the senderrsquos utility is a weighted averageof the receiverrsquos action and utility 1

1+ρv(ω r) =1

1+ρ + ρ1+ρ (ω minus r)

14 KOLOTILIN MYLOVANOV ZAPECHELNYUK

4 Discrete State

In this section we assume that the state is a discrete random variable

Theorem 1prime Let V be S-shaped Then and only then for all discrete F

(A) an optimal signal is stochastic upper censorship whose cutoff pair (ωlowast qlowast) satisfiescondition (3)

(B) an optimal monotone partition is deterministic upper censorship whose cutoff pair(ωlowast

d qlowastd) satisfies ωlowast

d = ωlowast and qlowastd isin 0 1

In the canonical case we obtain the result analogous to Theorem 2

Theorem 2prime Consider the canonical case and let the density g be log-concave Thenand only then for all discrete F and all ρ

(A) an optimal signal is stochastic upper censorship

(B) an optimal monotone partition is deterministic upper censorship

Let us discuss how the comparative statics result (Theorem 3) changes In Section 32we argue that upper-censorship signals can be ordered by their censorship cutoffs Butwhen the state is discrete a censorship cutoff is described by a pair (ωlowast qlowast) where ωlowast

is a cutoff state and qlowast is the probability of revealing this cutoff state when it realizes

Consider two stochastic upper-censorship signals s1 and s2 with cutoff pairs (ωlowast1 q

lowast1)

and (ωlowast2 q

lowast2) Denote (ωlowast

1 qlowast1) ≽ (ωlowast

2 qlowast2) if ω

lowast1 gt ωlowast

2 or ωlowast1 = ωlowast

2 and qlowast1 ge qlowast2 Observethat s1 is more (Blackwell) informative than s2 if and only if (ωlowast

1 qlowast1) ≽ (ωlowast

2 qlowast2) This

comparison also applies to deterministic upper-censorship signals with the constraintthat qlowast1 and qlowast2 are in 0 1

With the above order on cutoff pairs Theorem 3 extend as follows

Theorem 3prime Consider the canonical case

(A) A cutoff pair of an optimal signal satisfies

(i) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in ρ

(ii) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in t

(iii) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in σ if ρ ge 0

(B) A cutoff pair of an optimal monotone partition satisfies15

(i) (ωlowastd (ρ t σ) q

lowastd (ρ t σ)) is increasing in ρ

(ii) (ωlowastd (ρ t σ) q

lowastd (ρ t σ)) is increasing in t

15Part (iii) also holds for an optimal monotone partition when for example the discrete supportof the state ω is a regular grid with sufficiently many points However it is possible to construct acounterexample where part (iii) fails for an optimal monotone partition

CENSORSHIP AS OPTIMAL PERSUASION 15

That is the sender optimally discloses more information when she is less biasedrelative to the receiver (the alignment parameter ρ is greater) when the receiver ismore reluctant to accept the proposal (the shift parameter t is greater) and underScenario A when the sender is more uncertain about the receiverrsquos type (the stretchparameter σ is greater)

Remark 2 Theorem 1prime and Theorem 2prime show that upper censorship emerges as anoptimal persuasion strategy under the same conditions (V is S-shaped in Theorem1prime and g is log-concave in Theorem 2prime) regardless of whether the sender can choosean arbitrary signal or is constrained to choose a monotone partition It may betempting to conjecture that if the sender was allowed to choose an arbitrary partition(not necessarily monotone) the same result would hold This however is not true

For illustration suppose that the sender wishes to maximize the probability of a = 1and the receiver has type r = 1

2with certainty16 Let the state ω take values in

0 14 1 with probabilities (1

6 23 16) so the prior mean is 1

3

An optimal signal is stochastic upper censorship with cutoff pair (ωlowast qlowast) = (14 12)

which sends the pooling message with certainty at ω = 1 and with probability 1minusqlowast =12at ω = 1

4 and reveals the state otherwise The pooling message induces the posterior

mean m = 12 leading to a = 1 all other messages induce posterior means below 1

2

leading to a = 0 Under the optimal signal action a = 1 is chosen with probability23middot 12+ 1

6= 1

2

An optimal monotone partition is deterministic upper censorship that fully revealsthe state where only ω = 1 leads to a = 1 Under this partition action a = 1 ischosen with probability 1

6

An optimal partition pools the extreme states ω = 0 and ω = 1 and reveals theintermediate state ω = 1

4 The pooling message induces the posterior mean m = 1

2

leading to a = 1 This optimal partition is nonmonotone Under this partitionaction a = 1 is chosen with probability 1

6+ 1

6= 1

3 which is strictly better than the

probability of 16obtained under the optimal monotone partition

5 Application to Media Censorship

In this section we apply our results to the problem of media censorship by thegovernment In the modern world people obtain information about the governmentrsquosstate through various media sources such as television newspapers and internetblogs Without the media most people would not know what policies and reformsthe government pursues and how effective they are Media outlets have differentpositions on the political spectrum and differ substantially in how they select andpresent facts to cover the same news People choose their sources of information based

16In this case V (m) = 1 if m ge 12 and V (m) = 0 if m lt 1

2 Observe that the graph of V is

S-shaped and it can be approximated by a continuously differentiable S-shaped function

16 KOLOTILIN MYLOVANOV ZAPECHELNYUK

on their political ideology and socioeconomic status This information is valuable forsignificant individual decisions on migration investment and voting to name a fewIndividuals may not fully internalize externalities that their decisions impose on thesociety Likewise the government may not have the societyrsquos best interest at heartTo further its goals the government then wishes to influence individual decisions bymanipulating the information through media In autocracies and countries with weakchecks and balances the government has power to censor the media content

The governmentrsquos problem of media censorship can be represented as the persua-sion problem in Section 2 We apply our results to provide conditions for the op-timality of upper-censorship policies that censor all media outlets except the mostpro-government ones Furthermore we interpret our comparative statics results

51 Setup There is a continuum of heterogeneous citizens indexed by r isin [0 1]distributed with G Each citizen chooses between a = 0 and a = 1 The utility of acitizen of type r is given by

u(θ r ar a) = (θ minus r)ar + κ(θ r a)

where ar isin 0 1 denotes the citizenrsquos own action a =983125ardG(r) denotes the aggre-

gate action in the society θ isin [0 1] captures an unobserved benefit from action 1 ascompared to action 0 and κ captures the impact of the aggregate action a on thecitizenrsquos utility The term (θ minus r)ar is a private surplus of a citizen of type r Theterm κ(θ r a) is an externality because for a citizen of type r it is optimal to ignorethis term and choose ar = 1 if and only if θ ge r

There is a government which is concerned with a weighted average of the social utilityand the governmentrsquos intrinsic benefit from the aggregate action For a given θ thegovernmentrsquos utility is given by

983133 1

0

ν(θ r ar a)dG(r) + δγ(θ a)

The term ν captures a citizenrsquos utility from the governmentrsquos perspective We allow νto be different from u to reflect paternalistic or other concerns The term γ capturesthe governmentrsquos intrinsic benefit from the aggregate action The parameter δ ge 0captures the weight of the aggregate action in the governmentrsquos utility

Let T be a distribution of the random variable θ We assume that distributions G andT are independent and admit continuously differentiable and strictly positive densi-ties We also assume that κ ν and γ are linear in θ and continuously differentiablein r and a Furthermore to simplify interpretations we assume that ν and γ arenon-decreasing in θ and a so that for the government a high θ is a good news anda higher aggregate action is preferable

Citizens obtain information about the unobservable benefit θ through media outletsEach media outlet is identified by its editorial policy c isin [0 1] and it endorses action

CENSORSHIP AS OPTIMAL PERSUASION 17

a = 1 if θ ge c and criticizes it if θ lt c17 A set C of media outlets is a finite subsetof [0 1]18

The governmentrsquos censorship policy is a set of the media outlets X sub C that arepermitted to broadcast so the rest of the media outlets are censored

The timing is as follows First the government chooses a set X sub C of permittedmedia outlets Second state θ is realized and each permitted media outlet endorsesor criticizes action a = 1 according to its editorial policy Finally each citizen observesmessages from all permitted media outlets updates his beliefs about θ and choosesan action

52 Discussion We now discuss interpretations of the key components of the mediacensorship application As in Gehlbach and Sonin (2014) there can be various inter-pretations of the citizenrsquos action a = 1 such as voting for the government supportinga governmentrsquos policy or taking an individual decision that benefits the governmentA citizenrsquos type r can be interpreted as his ideological position or preference param-eter A citizen who is more supportive of the government has a smaller r

A media outlet with a higher editorial policy c can be interpreted as being less loyalto the government because it criticizes the government on a larger set of states Aneditorial policy c isin C can therefore represent a slant or political bias of the outletagainst the government and can be empirically measured as the frequency with whichthe outlet uses anti-government language Gentzkow and Shapiro (2010) constructsuch a slant index for US newspapers Empirical findings of their paper suggestthat the editorial policies of media outlets are driven by reader preferences justifyingour assumption of the existence of a large variety of editorial policies19 As in Suen(2004) Chan and Suen (2008) and Chiang and Knight (2011) the assumption ofthe binary media reports that communicate only whether the state θ is above somestandard c can be justified by a cursory readerrsquos preference for simple messages suchas positive or negative opinions and yes or no recommendations

The governmentrsquos censorship of media outlets can take various forms For examplethe government can ban access to internet sites withdraw licenses disrupt financingconfiscate print materials and equipment discredit media providers and even arresteditors and journalists using broadly formulated legislation on combating extremismIn some countries the government can exercise direct control over media editorialpolicies either through state ownership or administrative pressure

17The tie-breaking in the event of θ = c is unimportant as θ is a continuous random variable18We discuss the case of a continuum of media outlets in Section 5519Theoretical literature has explored the determinants of media slant of an outlet driven by its

citizens (Mullainathan and Shleifer 2005 Gentzkow and Shapiro 2006 and Chan and Suen 2008)and its owners (Baron 2006 and Besley and Prat 2006)

18 KOLOTILIN MYLOVANOV ZAPECHELNYUK

53 Formulating Media Censorship as Persuasion We now show that the me-dia censorship problem can be formulated as a linear persuasion problem in whichthe government is a sender and a representative citizen is a receiver

In our model the citizensrsquo and the governmentrsquos utilities are linear in θ There-fore given any information from the media outlets the utilities depend only on theposterior mean of θ

Consider an arbitrary posterior mean of θ denoted by m isin [0 1] Each citizen of typer chooses ar = 1 if and only if r le m The externality term κ plays no role in thisdecision Therefore the aggregate action a is simply the mass of all citizens whosetypes do not exceed m so a = G(m)

Next using the citizensrsquo optimal behavior we derive the governmentrsquos expected utilityconditional on the posterior mean m

V (m) = E983063983133 1

0

ν(θ r ar G(m))dG(r) + δγ(θ G(m))

983055983055983055983055m983064

=

983133 1

0

ν(m r1rlem G(m))dG(r) + δγ(mG(m)) (6)

This is the governmentrsquos indirect utility As in Section 2 the government now choosesa signal which is informative about the state to maximize its expected utility How-ever in contrast to Section 2 here the government is restricted to signals that areimplementable by a subset of a given set of media outlets

Next we show that with an appropriate definition of the state the restriction tosignals implementable by a subset of a given set of media outlets can be formulatedas a restriction to monotone partitional signals as in Scenario B in Section 2 Thuswe will be able to use our results in Section 4 to find the optimal censorship policies

Recall that the information about θ is only available through the media outlets Let ωdenote a random variable equal to the conditional expectation of θ given the messagesof all media outlets in C Note that as C is finite ω is a discrete random variablewith values in [0 1] Let F denote the distribution of ω From now on we treat therandom variable ω as the state and denote by s a signal as defined in Section 2

Let HM denote the set of all distributions of the posterior mean state induced byall monotone partitional signals and let HC denote the set of all distributions ofthe posterior mean state induced by all finite subsets of the set C of media outletsObserve that any set of media outlets induces a monotone partitional signal thatis HC sub HM We now show that every outcome implementable by an arbitrarymonotone partition in HM is also implementable by a subset of media outlets

Lemma 2 HC = HM

For illustration suppose that there is only one media outlet with the editorial policyc isin (0 1) There are two observable events θ ge c and θ lt c In each realized event

CENSORSHIP AS OPTIMAL PERSUASION 19

the citizens compute the posterior expectation of θ which we will call ω So in thisexample the distribution F of ω has two mass points one for each of the two events

Given these two mass points every monotone partition of [0 1] induces one of twopossible distributions of the posterior mean state (i) the mass points revealed and (ii)the mass points pooled Observe however that these distributions are implementableby permitting and censoring the media outlet respectively

A censorship policy is upper censorship if it censors all sufficiently disloyal mediaoutlets Specifically there exists a cutoff clowast isin C and an indicator qlowast isin 0 1 suchthat all media outlets whose editorial policies are below clowast are permitted all mediaoutlets whose editorial policies are above clowast are censored and clowast is permitted if andonly if qlowast = 1 That is X = c isin C c le clowast if qlowast = 1 and X = c isin C c lt clowastif qlowast = 0 Note that the full censorship policy clowast = 0 and qlowast = 0 (where all mediaoutlets are censored) and the free media policy clowast = 1 and qlowast = 1 (where all mediaoutlets are permitted) are the two extreme upper-censorship policies

As shown above the media censorship problem is equivalent to the linear persuasionproblem in which the sender is restricted to monotone partitional signals and thesenderrsquos indirect utility V is given by (6) We thus apply Theorem 1prime(B) to obtainthe following result

Theorem 1primeprime If V is S-shaped then an optimal censorship policy is upper censorship

To illustrate Theorem 1primeprime suppose the government is interested only in the aggregateaction so that ν = 0 and γ depends only on the aggregate action a so V (m) =γ(G(m)) Thus an upper-censorship policy is optimal if the composition functionγ(G(middot)) is S-shaped For example this condition holds if the government is interestedin reaching a certain approval threshold (eg a simple majority) so that γ is a stepfunction This condition also holds if γ is S-shaped and G is uniform or S-shapedwith the same inflection point as γ20

54 Canonical Case We now impose more structure on the utilities to obtain asharper result for the optimality of upper censorship and to perform a comparativestatics analysis Analogously to assumption (A1) we assume that the governmentrsquosutility is a weighted average of the citizensrsquo utility and their aggregate action

983133 1

0

u(θ r ar a)dG(r) + δa

where u(θ r ar a) = (θ minus r)ar + ζ(r)a

(A2)

for some continuously differentiable function ζ Define

β =

983133 1

0

ζ(r)dG(r) + δ

20This condition holds in many special cases where γ and G are S-shaped even when γ and Ghave different inflection points

20 KOLOTILIN MYLOVANOV ZAPECHELNYUK

The term983125 1

0ζ(r)dG(r) is the governmentrsquos expected bias towards the citizensrsquo action

a = 1 due to the citizensrsquo externality and the term δ is the governmentrsquos intrinsicbias towards a greater aggregate action a Thus we interpret β as the governmentrsquosaggregate bias This decomposition of the bias allows for different interpretations whythe government is biased So β can be high because the government is too self-serving(high δ) or because the government is benevolent (low δ) but wishes to internalizestrong positive externalities of the citizens (high ζ(r)) To ease interpretations weassume that β gt 021

Under Assumption (A2) the governmentrsquos indirect utility V given by (6) becomes

V (m) =

983133 m

0

(mminus r)dG(r) + βa =

983133 m

0

v(m r)dG(r)

where

v(m r) = β

9830611 +

1

β(mminus r)

983062

This is the same as v given by Assumption (A1) with ρ = βminus1 up to rescaling by apositive constant Consequently we can apply Theorem 2prime(B) to obtain the followingresult

Theorem 2primeprime Let (A2) hold If the density g of citizens types is log-concave then anoptimal censorship policy is upper censorship

We now apply the comparative statics analysis presented in Section 32 Note thatthe upper-censorship policies are ordered according to the amount of informationtransmitted to the citizens in the sense of Blackwell (1953) A greater censorshipthreshold clowast means that more media outlets are permitted With this order in mindwe apply Theorem 3prime(B) to make a comparative statics analysis on the amount ofinformation that is optimally disclosed by the government

First the censorship cutoff is increasing in the alignment parameter ρ Recall thatthis is a reciprocal of the governmentrsquos bias ρ = βminus1 This means that the governmentoptimally discloses more information (the censorship cutoff is greater) when it is lessbiased (β is smaller) Intuitively as β decreases the government puts more weighton the citizensrsquo utility so it optimally endows the population with a higher utility bycensoring fewer media outlets and disclosing more information

Second the censorship cutoff is increasing in the magnitude t of the horizontal shiftof the density g A greater t corresponds to a greater opportunity cost of actiona = 1 for each citizen This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more difficultto persuade to take action a = 1 (parameter t is greater) Informally speaking topersuade the same type of the citizen the government needs to increase the posterior

21If β lt 0 then swapping the roles of a = 0 and a = 1 reverses the sign of the bias If β = 0then the governmentrsquos utility and the citizensrsquo private interests coincide so it is trivially optimal topermit all media outlets

CENSORSHIP AS OPTIMAL PERSUASION 21

mean But the expected posterior mean must be equal to the prior mean so it is notpossible to increase all posterior means Due to the log-concave shape of the densityof the citizensrsquo types this tradeoff is resolved by increasing the posterior mean of thepooling interval E[ω|ω ge clowast] This is done by shrinking the pooling interval [clowast 1]that is increasing the censorship cutoff clowast

Third when the number of media outlets is sufficiently large the censorship cutoffis increasing in the stretch factor σ of the density g around its peak τ (see Footnote15) A greater σ corresponds to a more dispersed distribution of the opportunitycost in the population This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more diverse inhow difficult they are to persuade Intuitively when the types are more spread outpersuading the same mass of types requires a greater posterior mean which leads toa higher censorship cutoff similarly to the comparative statics in t discussed above

55 Extensions and Open Questions Let us now consider a few extensions ofour model of media censorship

In our model the set of media outlets is exogenous and the governmentrsquos only in-strument is censorship We now consider three alternative ways of expanding thegovernmentrsquos instruments of influence

First suppose that the government is able not only to censor existing media outletsbut also to introduce new media outlets with chosen editorial policies This is equiv-alent to our censorship model where all media outlets in [0 1] are initially availableand the government can censor any subset of them When all media outlets in [0 1]are permitted the revealed information about θ (which we call ω) is θ itself Thusω = θ is a continuous random variable with distribution F = T So we can nowapply our results for the continuous state from Section 3 instead of those for the dis-crete state in Section 4 reaching the same conclusion about the optimality of uppercensorship

Second suppose that the government can garble information available from mediaoutlets That is the government is not restricted to monotone partitions it cancreate arbitrary signals about state ω for the citizens to observe This becomes ageneral persuasion problem and our Theorems 1prime(A) 2prime(A) and 3prime(A) apply

Third suppose that the government is able to restrict not only which media outletsare permitted but also how many media outlets each citizen can choose to observeIn this extension the citizens are not allowed to communicate with one another (oth-erwise they could share the information thus observing all permitted media outletsindirectly) This extension does not affect our results as long as each citizen is al-lowed to access at least one media outlet of his choice as in Chan and Suen (2008)Intuitively this is because each citizen categorizes the information from the mediaoutlets into ldquogood newsrdquo where a = 1 is optimal and ldquobad newsrdquo where a = 0 is

22 KOLOTILIN MYLOVANOV ZAPECHELNYUK

optimal Because the information from the media outlets induces a monotone parti-tion it means that ldquogood newsrdquo is separated from ldquobad newsrdquo by a threshold mediaoutlet that depends on the citizenrsquos type That is it is sufficient to observe a singlethreshold media outlet to distinguish ldquogood newsrdquo from ldquobad newsrdquo

In our model each citizenrsquos utility depends on the aggregate action a through theexternality term κ(θ r a) which does not affect the chosen action Let us relax thisassumption so that a citizenrsquos optimal choice can depend on a We can still writethe senderrsquos indirect utility V as a function of the posterior mean state and applyour results However V is no longer uniquely determined by the primitives of themodel It is now endogenous and depends on an equilibrium the citizens play aseach citizenrsquos optimal action now depends on what all citizens do in equilibrium Forexample given the same information about the state a citizen could prefer to choosea = 1 if and only if sufficiently many citizens choose the same action so a is largeenough This creates the problem of multiplicity of equilibria and as a consequencethe dependence of optimal censorship on equilibrium selection

As a side our media censorship problem can also be applied to spatial voting mod-els as in Chiang and Knight (2011) Consider a government party (p = G) and anopposition party (p = O) competing in an election If party p wins a voter withan ideological position r gets utility wp minus (r minus rp)

2 where wp is the quality or va-lence of party p and rp is the ideology or policy platform of party p Voters knowthe partiesrsquo ideologies and obtain information about the partiesrsquo qualities from allavailable media outlets Each voter supports the party that maximizes his expectedutility Within this context our analysis still applies because the voterrsquos utility dif-ference between the government and opposition parties is proportional to θminusr whereθ = (wG minus wO + r2O minus r2G)2(rO minus rG)

There are a few more extensions that can be relevant in applications First insteadof complete censorship of a media outlet there can be a cost of accessing it Forexample an international news channel can be censored by a local government butcitizens may still access it through VPN at some cost Second it can be costly forthe government to censor media outlets So another important question to answeris how a government may prioritize censoring Finally citizens can incur some costof following each media outlet While we have already mentioned that citizens gainno benefit from following more than one outlet it is entirely possible for them tostop watching news altogether if it is sufficiently uninformative These extensions arenontrivial and left for future research

Appendix

Proof of Theorem 1 Suppose that V is S-shaped For mωlowast isin [0 1] and mlowast(ωlowast) =E[ω|ω ge ωlowast] define

V (mωlowast) = V (mlowast(ωlowast)) + V prime(mlowast(ωlowast))(mminusmlowast(ωlowast)) (7)

CENSORSHIP AS OPTIMAL PERSUASION 23

Since V (m) is convex on [0 τ ] and concave on [τ 1] there exists ωlowast isin [0 τ ] thatsatisfies condition (3) that is

V (mωlowast) ge (le)V (m) for all m ge (lt)ωlowast (8)

This is true because V (ωlowastωlowast) minus V (ωlowast) is single-crossing from below To see thisconsider Figure 1 Observe that (3) can hold only if ωlowast is on the convex part of V andmlowast is on the concave part of V As the censorship cutoff ωlowast increases the posteriormean state of the pooling message mlowast(ωlowast) also increases But because V is concaveat mlowast(ωlowast) the dashed tangent line V (mωlowast) becomes flatter in m so it crosses thesolid line V (m) at a smaller m

Consider an arbitrary signal s Let H be the distribution of m = E[ω|s] inducedby signal s Since each signal is a garbling of the fully informative signal F is amean-preserving spread of H Thus the senderrsquos expected utility is smaller undersignal s than under the upper-censorship signal with cutoff ωlowast as follows from

983133 1

0

V (m)dH(m) le983133 1

0

maxV (m) V (mωlowast)dH(m)

le983133 1

0

maxV (m) V (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mlowast(ωlowast))dF (m)

where the second line holds because maxV (m) V (mωlowast) is convex in m and F isa mean-preserving spread of H the third line holds by (8) and the last line holds by

(7) and by983125 1

ωlowast(mminusmlowast(ωlowast))dF (m) = 0

Conversely suppose that V is not S-shaped Then there exist 0 le m1 lt m2 lem3 lt m4 le 1 such that V primeprime(m) lt 0 for m isin (m1m2) V

primeprime(m) = 0 for m isin [m2m3]and V primeprime(m) gt 0 for m isin (m3m4) because V primeprime is continuous by assumption It isstraightforward to show that there exists 983141mlowast isin (m1m2) (sufficiently close to m2) and983141ωlowast isin (m3m4) such that

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) lt V (m) for all m isin (983141ωlowastm4]

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) gt V (m) for all m isin [m1 983141ωlowast] 983141mlowast

Moreover there exists a continuous distribution F such that the support of F is asubset of [m1m4] and 983141mlowast = E[ω|ω le 983141ωlowast] Using a chain of inequalities similar to theone above it is easy to show that lower-censorship signal 983141slowast with cutoff 983141ωlowast is optimalFurthermore it is uniquely optimal as both inequalities hold with equalities only forsignal 983141slowast This in turn implies that any upper-censorship signal is suboptimal

24 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Proof of Lemma 1 Notice that under assumption (A1) we have

V primeprime(m) = gprime(m) + ρg(m) for m isin [0 1]

Recall that V is S-shaped if and only if V primeprime(m) = gprime (m) + ρg (m) is single-crossingfrom above Using Proposition 1 in Quah and Strulovici (2012) it is easy to showthat V primeprime(m) is single-crossing from above for all ρ isin R if and only if gprime (m)g (m) isnonincreasing in m (that is ln g(m) is concave)

Proof of Theorem 2 Immediate by Theorem 1 and Lemma 1

Proof of Theorem 3 Part (i) We fix t and σ and vary ρ Without loss ofgenerality let t = 0 and σ = 1 (so that Gtσ = G) Consider a signal that induces thedistribution H of posterior mean m This signal implements the receiverrsquos interimutility U given by

U(r) =

983133 1

r

(mminus r)dH(m) for r isin [0 1]

which holds because the receiver acts if and only if m ge r Since types r lt 0 alwayschoose a = 1 and types r gt 1 always choose a = 0 we can exclude them fromconsideration and assume that supp(G) = [0 1] The senderrsquos expected utility is then

983133 1

0

V (m)dH(m) =

983133 1

0

983133 m

0

(1 + ρ(mminus r))dG(r)dH(m)

=

983133 1

0

983133 1

r

(1 + ρ(mminus r))dH(m)dG(r)

=

983133 1

0

(1minusH(r) + ρU(r))dG(r)

Consider ρ2 gt ρ1 Suppose by contradiction that the corresponding optimal upper-censorship signals s2 and s1 are such that s1 is strictly more informative than s2 LetU2 and U1 be the receiverrsquos interim utilities implemented by s2 and s1 Since thesender prefers signal s2 under ρ2 and signal s1 under ρ1 we have

983133 1

0

(1minusH2 (r) + ρ2U2 (r)) dG (r) ge983133 1

0

(1minusH1 (r) + ρ2U1 (r)) dG (r)

983133 1

0

(1minusH1 (r) + ρ1U1 (r)) dG (r) ge983133 1

0

(1minusH2 (r) + ρ1U2 (r)) dG (r)

Summing up these inequalities gives

(ρ2 minus ρ1)

983133 1

0

(U2 (r)minus U1 (r)) dG (r) ge 0

leading to a contradiction because U1(r) ge U2(r) for all r with strict inequality forsome r given that s1 is strictly more informative than s2

CENSORSHIP AS OPTIMAL PERSUASION 25

Part (ii) We fix ρ isin R and σ = 1 and vary t The indirect utility of the senderconditional on m isin [0 1] is given by

Vt(m) =

983133

rlem

(1+ρ(mminusr))gt1(r)dr =

983133

rlemminust

(1+ρ(mminustminusr))g(r)dr = V (mminust) (9)

By Theorem 1 when t0 = 0 an optimal cutoff ωlowast0 satisfies

V (m) le V (mlowast0) + V prime(mlowast

0)(mminusmlowast0) for m isin [ωlowast

0 1]

When V is S-shaped and t1 lt 0 it is easy to see that

V (mminus t1) le V (mlowast0 minus t1) + V prime(mlowast

0 minus t1)(mminusmlowast0) for m isin [ωlowast

0 1]

which implies that when t = t1 the above inequality changes its sign at some ωlowast1

that satisfies ωlowast1 le ωlowast

0 It follows from (9) that ωlowast1 is an optimal cutoff for t = t1 and

as we established above ωlowast1 le ωlowast

0

Part (iii) We fix ρ ge 0 and t = 0 and vary σ The indirect utility of the senderconditional on m isin [0 1] is given by

V ρσσ(m) =

983133

rlem

9830591 +

ρ

σ(mminus r)

983060dG0σ(r)

=

983133

rlem

9830591 +

ρ

σ(mminus r)

983060g

983061r minus (1minus σ)τ

σ

9830621

σdr

=

983133

rlemminus(1minusσ)τσ

9830611 + ρ

983061mminus (1minus σ)τ

σminus r

983062983062g(r)dr

= Vρ1

983061mminus (1minus σ)τ

σ

983062 (10)

Let σ0 = 1 σ1 lt 1 and ρ1 = ρσ1 To compare the optimal cutoffs under parameters(ρ σ0) and (ρ σ1) we consider two steps first comparing the cutoffs between (ρ σ0)and (ρ1 σ1) and then comparing the cutoffs between (ρ1 σ1) and (ρ σ1)

First we compare an optimal cutoff ωlowast0 under (ρ σ0) = (ρ 1) and an optimal cutoff

ωlowast1 under (ρ1 σ1) By Theorem 1 ωlowast

0 satisfies

Vρ1(m) le Vρ1(mlowast0) + V prime

ρ1(mlowast0)(mminusmlowast

0) for m isin [ωlowast0 1]

When V is increasing (implied by ρ ge 0) and S-shaped and σ1 lt 1 it is easy to seethat for all m isin [ωlowast

0 1]

Vρ1

983061mminus (1minus σ1)τ

σ1

983062le Vρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062+ V prime

ρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062(mminusmlowast

0)

which implies that when σ = σ1 lt 1 the above inequality changes its sign at someωlowast1 that satisfies ωlowast

1 le ωlowast0 It follows from (10) that ωlowast

1 is an optimal cutoff under(ρσ σ1) = (ρ1 σ1) and as we established above ωlowast

1 le ωlowast0

26 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Second we compare the optimal cutoff ωlowast1 under (ρ1 σ1) and the optimal cutoff ωlowast

2

under (ρ σ1) Because ρ lt ρ1 = ρσ1 by part (i) we have ωlowast2 le ωlowast

1

Combining ωlowast1 le ωlowast

0 and ωlowast2 le ωlowast

1 we obtain ωlowast2 le ωlowast

0

Proof of Theorem 1prime Part (A) The proof is analogous to that of Theorem 1 andis thus omitted

Part (B) The only if statement follows from that in part (A) (see the proof of Theorem1) and the fact that we can always ensure that an optimal stochastic censorship(ωlowast qlowast) is deterministic (so qlowast isin 0 1) by the choice of the support and distributionof ω Thus we only need to prove the if statement

Let F be a distribution of ω with support on n values 0 lt ω0 lt lt ωnminus1 lt 1 Thisis without loss of generality because if 0 andor 1 were in the support of F we couldredefine the state as ω = (ω + ε)(1 + 2ε) and the type as r = (r + ε)(1 + 2ε)

Let X sub ω1 ωnminus1 identify a set of cutoffs defining a monotone partition for eachωi isin X the receiver is informed whether ω lt ωi or ω ge ωi Note that we exclude ω0

from the set of cutoffs because ω ge ω0 always holds

Note that X is deterministic upper censorship if and only if there exists a censorshipthreshold ωlowast isin [0 1] such that for each i = 1 nminus 1 ωi isin X if and only if ωi lt ωlowastthat is we can assume that qlowast = 0 without loss of generality

Let VX be the senderrsquos expected utility when the monotone partition is given by X

If n = 2 then there are only two partitions X = ω1 (full disclosure) and X = empty(no disclosure) Both are upper censorships so part (B) holds trivially

If n = 3 then there are four monotone partitions X = ω1ω2 (full disclosure)X = ω2 X = ω1 and X = empty (no disclosure) Only X = ω2 is not uppercensorship The next lemma proves that X = ω2 cannot be uniquely optimal whenV is S-shaped thus proving the optimality of upper censorship for n = 3

Denote

mjk = E983045ω|ω isin [ωjωk]

983046

For convenience of notation we write mj = ωj

Lemma 3 Let V be S-shaped with an inflexion point τ Let ω be a discrete randomvariable with support on m0m1m2 and probabilities p0 p1 p2 where 0 lt m0 ltm1 lt m2 lt 1 Then

(i) if τ le m0 then Vm2 le Vempty

(ii) if τ ge m1 then Vm2 le Vm1m2

(iii) if m0 lt τ lt m1 then Vm2 le maxVempty Vm1m2

CENSORSHIP AS OPTIMAL PERSUASION 27

Proof Define p01 = p0 + p1 and p02 = p01 + p2 = 1 Observe that

m01 =p0m0 + p1m1

p0 + p1and m02 =

p0m0 + p1m1 + p2m2

p0 + p1 + p2

We thus have

Vempty = p02V (m02) = (p01 + p2)V

983061p01m01 + p2m2

p01 + p2

983062

Vm2 = p01V (m01) + p2V (m2) = (p0 + p1)V

983061p0m0 + p1m1

p0 + p1

983062+ p2V (m2)

Vm1m2 = p0V (m0) + p1V (m1) + p2V (m2)

Clearly

m0 lt m01 lt m1 and m0 lt m02 lt m2

Therefore

Vm2 le Vempty lArrrArrp01

p01 + p2V (m01) +

p2p01 + p2

V (m2) le V

983061p01

p01 + p2m01 +

p2p01 + p2

m2

983062= V (m02)

(11)

and

Vm2 le Vm1m2 lArrrArr

V

983061p0

p0 + p1m0 +

p1p0 + p1

m1

983062le p0

p0 + p1V (m0) +

p1p0 + p1

V (m1)(12)

Suppose that τ le m01 so V is concave on [m01m2] Then (11) holds by Jensenrsquosinequality In particular it holds for τ le m0 lt m01 thus proving part (i)

Suppose that τ ge m1 so V is convex on [m0m1] Then (12) holds by Jensenrsquosinequality thus proving part (ii)

Finally suppose that m01 lt τ lt m1 Let (τ V (τ)) lie below (or on) the straight lineA connecting (m0 V (m0)) and (m2 V (m2)) as illustrated on Figure 2 By convexityof V (m) for m lt τ it must be the case that V (m01) lies below (or on) the dashedline B By concavity of V (m) for m gt τ it must be the case that V (m1) lies above(or on) the dashed line C Consequently (12) is satisfied

Alternatively let (τ V (τ)) lie above the straight line A connecting (m0 V (m0)) and(m2 V (m2)) as illustrated on Figure 3 If Vm2 le Vm1m2 then the proof of part(iii) is complete So suppose that Vm2 gt Vm1m2 so (12) is violated

By convexity of V (m) for m lt τ it must be the case that V (m01) lies below (or on)the line B Since (12) is violated V (m01) must lie between the lines A and B Recallthat m01 lt m02 lt m2 Observe that V (m02) must be in the shaded area on Figure 3Indeed if m02 lt τ then by convexity of V (m) for m lt τ it must be the case thatV (m02) lies between (or on) the lines B and D If m02 ge τ then by concavity of

28 KOLOTILIN MYLOVANOV ZAPECHELNYUK

τ m2m0

A C

B

m01 m1

V (m0) V (τ)

V (m2)

Figure 2 The case where (τ V (τ)) lies below (or on) the straight lineconnecting (m0 V (m0)) and (m2 V (m2))

τ m2m0

A

C

B

m01

V (m0)

V (τ)

V (m2)

D

V (m01)

Figure 3 The case where (τ V (τ)) lies above the straight line con-necting (m0 V (m0)) and (m2 V (m2))

V (m) for m gt τ it must be the case that V (m02) lies above (or on) line C It followsthat (11) is satisfied

We now adapt Lemma 3 to prove the optimality of upper censorship for n ge 4

CENSORSHIP AS OPTIMAL PERSUASION 29

Fix a monotone partition X sub ω1 ωnminus1 that is not upper censorship that isthere exists i isin 2 nminus 1 such that

ωiminus1 ∕isin X and ωi isin X (13)

Denote ωn = 1 Define j and k as follows If X cap ω1 ωiminus2 ∕= empty let ωj be thelargest element inX smaller than i otherwise let ωj = ω0 IfXcapωi+1 ωnminus1 ∕= emptylet ωk be the smallest element in X greater than i otherwise let ωk = ωn = 1 Define

m0 = ωj m1 = ωiminus1 m2 = ωi

and

p0 =Pr[ω isin [ωjωiminus1)]

Pr[ω isin [ωjωk)] p1 =

Pr[ω isin [ωiminus1ωi)]

Pr[ω isin [ωjωk)] p2 =

Pr[ω isin [ωiωk)]

Pr[ω isin [ωjωk)]

To apply Lemma 3 we identify Xωi with empty X with m2 and X cup ωiminus1 withm1m2 By Lemma 3 VX le maxVXωi VXcupωiminus1 Thus replacing X withXωi if VX le VXωi and with X cup ωiminus1 otherwise weakly improves the senderrsquosexpected utility Repeatedly applying such replacements yields upper censorshipIndeed this procedure terminates after a finite number of replacements because byLemma 3 for ωj ge τ we replace X with Xωi and for ωiminus1 le τ we replace Xwith X cup ωiminus1

We thus have shown that when V is S-shaped when optimizing over the set ofmonotone partitions we can restrict attention to deterministic upper censorship Itremains to show that an optimal deterministic censorship cutoff ωlowast

d coincides with anoptimal stochastic censorship cutoff ωlowast from part (A) To see this it is straightforwardto show that

V lowast(ωlowast qlowast) =

983133

[0ωlowast)

V (ω)dF (ω) + V (ωlowast)qlowast Pr[ω = ωlowast]

+ V (mlowast(ωlowast qlowast))((1minus qlowast) Pr[ω = ωlowast] + Pr[ω gt ωlowast]) (14)

is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order Since the two de-terministic censorship cutoff pairs closest to (ωlowast qlowast) in the Blackwell informativenessorder are (ωlowast 0) and (ωlowast 1) we obtain ωlowast

d = ωlowast

Proof of Theorem 2prime Immediate by Theorem 1prime and Lemma 1

Proof of Theorem 3prime The proof for an optimal signal is analogous to that ofTheorem 3 and is thus omitted The same applies to part (i) for an optimal monotonepartition Thus it remains to prove part (ii) for an optimal monotone partition

We fix ρ and σ = 1 and vary t continuously If ωlowast isin supp(F ) then an optimalmonotone partition is an optimal signal because all qlowast isin [0 1] are optimal Thuspart (ii) in (B) holds by part (ii) in (A) If ωlowast isin supp(F ) and qlowast isin (0 1) then either(ωlowast

d qlowastd) = (ωlowast 0) or (ωlowast

d qlowastd) = (ωlowast 1) is optimal because the senderrsquos expected utility

V lowast given by (14) is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order

30 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Therefore it suffices to consider values of t at which the sender is indifferent between(ωlowast 0) and (ωlowast 1) and to show that the sender strictly prefers (ωlowast 1) to (ωlowast 0) at aslightly higher t

V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) =rArr dV lowastt (ω

lowast 0)

dtle dV lowast

t (ωlowast 1)

dt (15)

The condition V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) can be written as

Vt(mlowast(ωlowast 0)) Pr[ω ge ωlowast] = Vt(ω

lowast) Pr[ω = ωlowast] + Vt(mlowast(ωlowast 1)) Pr[ω gt ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)Pr[ω = ωlowast]

Pr[ω ge ωlowast]+ Vt(m

lowast(ωlowast 1))Pr[ω gt ωlowast]

Pr[ω ge ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + Vt(mlowast(ωlowast 1))

mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Analogously the conditiondV lowast

t (ωlowast0)dt

le dV lowastt (ωlowast1)dt

can be written as

V primet (m

lowast(ωlowast 0)) ge V primet (ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + V primet (m

lowast(ωlowast 1))mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Thus (15) states that if Vt(mlowast(ωlowast 0)) is equal to the convex combination of Vt(ω

lowast)and Vt(m

lowast(ωlowast 1)) then the derivative V primet (m

lowast(ωlowast 0)) is greater than the convex com-bination of the derivatives V prime

t (ωlowast) and V prime

t (mlowast(ωlowast 1)) When V is S-shaped it is easy

to see that this property is satisfied

Proof of Lemma 2 Observe that a signal induced by any set X sub C of permittedmedia outlets induces a monotone partition of [0 1] and thus HC sub HM Indeed letcX(ω) and cX(ω) be the media outlets in X that are the closest to ω from above andfrom below

cX(ω) = sup983059c isin X c le ω

983052cup 0

983060and cX(ω) = inf

983059c isin X c gt ω cup 1

983060

By observing the media outlets in X each citizen is informed that ω isin [cX(ω) cX(ω))

Conversely let the set C of media outlets be c0 lt middot middot middot lt cn Without loss of generalityassume that C contains the two uninformative media outlets c0 = 0 and cn = 1 which(almost) always endorse and criticize action a = 1 respectively Then the distributionF of the state ω is a discrete distribution whose support is Ω = ω0 ωnminus1 whereωi = E[θ|θ isin [ci ci+1)] for i isin 0 nminus 1 Consider a monotone partitional signalrepresented by a nondecreasing function ξ [0 1] rarr R For each realization m of thissignal that occurs with a strictly positive probability define

x(m) = min983153ω isin Ω ξ(ω) = m

983154and x(m) = max

983153ω isin Ω ξ(ω) = m

983154

This signal induces a monotone partition of the finite set Ω This partition of Ω canbe implemented by a censorship policy that prohibits a media outlet c isin C if andonly if c isin (x(m) x(m)) for some realization m

CENSORSHIP AS OPTIMAL PERSUASION 31

References

Alonso R and O Camara (2016a) ldquoPersuading Votersrdquo American EconomicReview 106 3590ndash3605

(2016b) ldquoPolitical Disagreement and Information in Electionsrdquo Games andEconomic Behavior 100 390ndash412

Bagnoli M and T Bergstrom (2005) ldquoLog-Concave Probability and Its Ap-plicationsrdquo Economic Theory 26 445ndash469

Baron D P (2006) ldquoPersistent Media Biasrdquo Journal of Public Economics 90(1)1ndash36

Besley T and A Prat (2006) ldquoHandcuffs for the Grabbing Hand MediaCapture and Government Accountabilityrdquo American Economic Review 96(3) 720ndash736

Blackwell D (1953) ldquoEquivalent Comparisons of Experimentsrdquo Annals of Math-ematical Statistics 24 265ndash272

Chan J and W Suen (2008) ldquoA Spatial Theory of News Consumption andElectoral Competitionrdquo Review of Economic Studies 75 699ndash728

Chiang C-F and B Knight (2011) ldquoMedia Bias and Influence Evidence fromNewspaper Endorsementsrdquo Review of Economic Studies 78 795ndash820

Duffie D P Dworczak and H Zhu (2017) ldquoBenchmarks in Search MarketsrdquoJournal of Finance 72 1983ndash2044

Dworczak P (2017) ldquoMechanism Design with Aftermarkets Cutoff Mecha-nismsrdquo mimeo

Dworczak P and G Martini (2019) ldquoThe Simple Economics of Optimal Per-suasionrdquo Journal of Political Economy 127 993ndash2048

Edmond C (2013) ldquoInformation Manipulation Coordination and RegimeChangerdquo Review of Economic Studies 80 1422ndash1458

Egorov G S Guriev and K Sonin (2009) ldquoWhy Resource-poor DictatorsAllow Freer Media A Theory and Evidence from Panel Datardquo American PoliticalScience Review 103 645ndash668

Gehlbach S and K Sonin (2014) ldquoGovernment Control of the Mediardquo Journalof Public Economics 118 163ndash171

Gentzkow M and E Kamenica (2016) ldquoA Rothschild-Stiglitz Approach toBayesian Persuasionrdquo American Economic Review Papers amp Proceedings 106597ndash601

Gentzkow M and J Shapiro (2006) ldquoMedia Bias and Reputationrdquo Journalof Political Economy 114(2) 280ndash316

(2010) ldquoWhat Drives Media Slant Evidence from US Daily NewspapersrdquoEconometrica 78(1) 35ndash71

Ginzburg B (2019) ldquoOptimal Information Censorshiprdquo Journal of EconomicBehavior and Organization 163 377ndash385

Goldstein I and Y Leitner (2018) ldquoStress Tests and Information DisclosurerdquoJournal of Economic Theory 177 34ndash69

32 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Guo Y and E Shmaya (2019) ldquoThe Interval Structure of Optimal DisclosurerdquoEconometrica 87 653ndash675

Johnson J P and D P Myatt (2006) ldquoOn the Simple Economics of Adver-tising Marketing and Product Designrdquo American Economic Review 96 756ndash784

Kamenica E and M Gentzkow (2011) ldquoBayesian Persuasionrdquo American Eco-nomic Review 101 2590ndash2615

Kolotilin A (2015) ldquoExperimental Design to Persuaderdquo Games and EconomicBehavior 90 215ndash226

(2018) ldquoOptimal Information Disclosure A Linear Programming Ap-proachrdquo Theoretical Economics 13 607ndash636

Kolotilin A and H Li (2019) ldquoRelational Communicationrdquo mimeoKolotilin A M Li T Mylovanov and A Zapechelnyuk (2015) ldquoPer-suasion of a Privately Informed Receiverrdquo mimeo

Kolotilin A T Mylovanov A Zapechelnyuk and M Li (2017) ldquoPer-suasion of a Privately Informed Receiverrdquo Econometrica 85 1949ndash1964

Kolotilin A and A Wolitzky (2019) ldquoAssortative Information Disclosurerdquomimeo

Lewis T R and D Sappington (1994) ldquoSupplying Information to FacilitatePrice Discriminationrdquo International Economic Review 35 309ndash327

Lorentzen P (2014) ldquoChinarsquos Strategic Censorshiprdquo American Journal of Polit-ical Science 58 402ndash414

Mullainathan S and A Shleifer (2005) ldquoThe Market for Newsrdquo AmericanEconomic Review 95(4) 1031ndash1053

Orlov D P Zryumov and A Skrzypach (2018) ldquoDesign of Macro-prudentialStress Testsrdquo mimeo

Ostrovsky M and M Schwarz (2010) ldquoInformation Disclosure and Unravelingin Matching Marketsrdquo American Economic Journal Microeconomics 2 34ndash63

Prat A and D Stromberg (2013) ldquoThe Political Economy of Mass MediardquoAdvances in Economics and Econometrics 2 135ndash187

Quah J and B Strulovici (2012) ldquoAggregating the Single Crossing PropertyrdquoEconometrica 80 2333ndash2348

Rayo L and I Segal (2010) ldquoOptimal Information Disclosurerdquo Journal ofPolitical Economy 118 949ndash987

Romanyuk G and A Smolin (2019) ldquoCream Skimming and Information Designin Matching Marketsrdquo American Economic Journal Microeconomics 11 250ndash276

Suen W (2004) ldquoThe Self-Perpetuation of Biased Beliefsrdquo Economic Journal 114377ndash396

Yamashita T (2018) ldquoOptimal Public Information Disclosure by Mechanism De-signerrdquo mimeo

Zapechelnyuk A (2019) ldquoOptimal Quality Certificationrdquo American EconomicReview Insights forthcoming

CENSORSHIP AS OPTIMAL PERSUASION 13

For the purpose of comparison we extend the definition of distribution function Gto the real line and assume that its density g is log-concave Consider a family ofdistributions Gtσ of receiver types

Gtσ (r) = G

983061τ minus t+

r minus τ

σ

983062

where τ is the maximum point of g and t isin R and σ gt 0 are parameters Let gtσ (r)denote the corresponding density Note that gtσ (r) is log-concave for all t isin R andall σ gt 0

Because gtσ is log-concave on [0 1] deterministic upper censorship is optimal byTheorem 2 Let ωlowast (ρ t σ) isin [0 1] be the optimal censorship cutoff as given by (3)

The shift parameter t shifts the distribution along the horizontal axis and the stretchparameter σ stretches the distribution horizontally and symmetrically around themode τ so the limit of σ rarr 0 leads to the unit mass on τ We now show that thesender optimally discloses more information when

(i) the senderrsquos preferences are better aligned with the receiverrsquos preferences (thealignment parameter ρ is greater)

(ii) the receiver is more reluctant to accept the proposal (the shift parameter t isgreater)

(iii) the sender is more uncertain about the receiverrsquos type (the stretch parameter σis greater) provided ρ ge 014

Theorem 3 In the canonical case a cutoff state of an optimal signal satisfies

(i) ωlowast (ρ t σ) is increasing in ρ

(ii) ωlowast (ρ t σ) is increasing in t

(iii) ωlowast (ρ t σ) is increasing in σ if ρ ge 0

The intuition for part (i) is that for a higher ρ the sender puts more weight onthe receiverrsquos utility so she optimally endows the receiver with a higher utility byproviding more information

The intuition for part (ii) is that for a higher t each type of the receiver has a greatercost of accepting the proposal so to persuade the same type of the receiver the senderneeds to increase E[ω|ω ge ωlowast] by expanding the full disclosure interval [0ωlowast]

The intuition for part (iii) is that for a higher σ receiver types are more spread out soto persuade the same mass of types the sender again needs to increase E[ω|ω ge ωlowast]

14The case ρ ge 0 is the practically relevant case where the senderrsquos utility is a weighted averageof the receiverrsquos action and utility 1

1+ρv(ω r) =1

1+ρ + ρ1+ρ (ω minus r)

14 KOLOTILIN MYLOVANOV ZAPECHELNYUK

4 Discrete State

In this section we assume that the state is a discrete random variable

Theorem 1prime Let V be S-shaped Then and only then for all discrete F

(A) an optimal signal is stochastic upper censorship whose cutoff pair (ωlowast qlowast) satisfiescondition (3)

(B) an optimal monotone partition is deterministic upper censorship whose cutoff pair(ωlowast

d qlowastd) satisfies ωlowast

d = ωlowast and qlowastd isin 0 1

In the canonical case we obtain the result analogous to Theorem 2

Theorem 2prime Consider the canonical case and let the density g be log-concave Thenand only then for all discrete F and all ρ

(A) an optimal signal is stochastic upper censorship

(B) an optimal monotone partition is deterministic upper censorship

Let us discuss how the comparative statics result (Theorem 3) changes In Section 32we argue that upper-censorship signals can be ordered by their censorship cutoffs Butwhen the state is discrete a censorship cutoff is described by a pair (ωlowast qlowast) where ωlowast

is a cutoff state and qlowast is the probability of revealing this cutoff state when it realizes

Consider two stochastic upper-censorship signals s1 and s2 with cutoff pairs (ωlowast1 q

lowast1)

and (ωlowast2 q

lowast2) Denote (ωlowast

1 qlowast1) ≽ (ωlowast

2 qlowast2) if ω

lowast1 gt ωlowast

2 or ωlowast1 = ωlowast

2 and qlowast1 ge qlowast2 Observethat s1 is more (Blackwell) informative than s2 if and only if (ωlowast

1 qlowast1) ≽ (ωlowast

2 qlowast2) This

comparison also applies to deterministic upper-censorship signals with the constraintthat qlowast1 and qlowast2 are in 0 1

With the above order on cutoff pairs Theorem 3 extend as follows

Theorem 3prime Consider the canonical case

(A) A cutoff pair of an optimal signal satisfies

(i) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in ρ

(ii) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in t

(iii) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in σ if ρ ge 0

(B) A cutoff pair of an optimal monotone partition satisfies15

(i) (ωlowastd (ρ t σ) q

lowastd (ρ t σ)) is increasing in ρ

(ii) (ωlowastd (ρ t σ) q

lowastd (ρ t σ)) is increasing in t

15Part (iii) also holds for an optimal monotone partition when for example the discrete supportof the state ω is a regular grid with sufficiently many points However it is possible to construct acounterexample where part (iii) fails for an optimal monotone partition

CENSORSHIP AS OPTIMAL PERSUASION 15

That is the sender optimally discloses more information when she is less biasedrelative to the receiver (the alignment parameter ρ is greater) when the receiver ismore reluctant to accept the proposal (the shift parameter t is greater) and underScenario A when the sender is more uncertain about the receiverrsquos type (the stretchparameter σ is greater)

Remark 2 Theorem 1prime and Theorem 2prime show that upper censorship emerges as anoptimal persuasion strategy under the same conditions (V is S-shaped in Theorem1prime and g is log-concave in Theorem 2prime) regardless of whether the sender can choosean arbitrary signal or is constrained to choose a monotone partition It may betempting to conjecture that if the sender was allowed to choose an arbitrary partition(not necessarily monotone) the same result would hold This however is not true

For illustration suppose that the sender wishes to maximize the probability of a = 1and the receiver has type r = 1

2with certainty16 Let the state ω take values in

0 14 1 with probabilities (1

6 23 16) so the prior mean is 1

3

An optimal signal is stochastic upper censorship with cutoff pair (ωlowast qlowast) = (14 12)

which sends the pooling message with certainty at ω = 1 and with probability 1minusqlowast =12at ω = 1

4 and reveals the state otherwise The pooling message induces the posterior

mean m = 12 leading to a = 1 all other messages induce posterior means below 1

2

leading to a = 0 Under the optimal signal action a = 1 is chosen with probability23middot 12+ 1

6= 1

2

An optimal monotone partition is deterministic upper censorship that fully revealsthe state where only ω = 1 leads to a = 1 Under this partition action a = 1 ischosen with probability 1

6

An optimal partition pools the extreme states ω = 0 and ω = 1 and reveals theintermediate state ω = 1

4 The pooling message induces the posterior mean m = 1

2

leading to a = 1 This optimal partition is nonmonotone Under this partitionaction a = 1 is chosen with probability 1

6+ 1

6= 1

3 which is strictly better than the

probability of 16obtained under the optimal monotone partition

5 Application to Media Censorship

In this section we apply our results to the problem of media censorship by thegovernment In the modern world people obtain information about the governmentrsquosstate through various media sources such as television newspapers and internetblogs Without the media most people would not know what policies and reformsthe government pursues and how effective they are Media outlets have differentpositions on the political spectrum and differ substantially in how they select andpresent facts to cover the same news People choose their sources of information based

16In this case V (m) = 1 if m ge 12 and V (m) = 0 if m lt 1

2 Observe that the graph of V is

S-shaped and it can be approximated by a continuously differentiable S-shaped function

16 KOLOTILIN MYLOVANOV ZAPECHELNYUK

on their political ideology and socioeconomic status This information is valuable forsignificant individual decisions on migration investment and voting to name a fewIndividuals may not fully internalize externalities that their decisions impose on thesociety Likewise the government may not have the societyrsquos best interest at heartTo further its goals the government then wishes to influence individual decisions bymanipulating the information through media In autocracies and countries with weakchecks and balances the government has power to censor the media content

The governmentrsquos problem of media censorship can be represented as the persua-sion problem in Section 2 We apply our results to provide conditions for the op-timality of upper-censorship policies that censor all media outlets except the mostpro-government ones Furthermore we interpret our comparative statics results

51 Setup There is a continuum of heterogeneous citizens indexed by r isin [0 1]distributed with G Each citizen chooses between a = 0 and a = 1 The utility of acitizen of type r is given by

u(θ r ar a) = (θ minus r)ar + κ(θ r a)

where ar isin 0 1 denotes the citizenrsquos own action a =983125ardG(r) denotes the aggre-

gate action in the society θ isin [0 1] captures an unobserved benefit from action 1 ascompared to action 0 and κ captures the impact of the aggregate action a on thecitizenrsquos utility The term (θ minus r)ar is a private surplus of a citizen of type r Theterm κ(θ r a) is an externality because for a citizen of type r it is optimal to ignorethis term and choose ar = 1 if and only if θ ge r

There is a government which is concerned with a weighted average of the social utilityand the governmentrsquos intrinsic benefit from the aggregate action For a given θ thegovernmentrsquos utility is given by

983133 1

0

ν(θ r ar a)dG(r) + δγ(θ a)

The term ν captures a citizenrsquos utility from the governmentrsquos perspective We allow νto be different from u to reflect paternalistic or other concerns The term γ capturesthe governmentrsquos intrinsic benefit from the aggregate action The parameter δ ge 0captures the weight of the aggregate action in the governmentrsquos utility

Let T be a distribution of the random variable θ We assume that distributions G andT are independent and admit continuously differentiable and strictly positive densi-ties We also assume that κ ν and γ are linear in θ and continuously differentiablein r and a Furthermore to simplify interpretations we assume that ν and γ arenon-decreasing in θ and a so that for the government a high θ is a good news anda higher aggregate action is preferable

Citizens obtain information about the unobservable benefit θ through media outletsEach media outlet is identified by its editorial policy c isin [0 1] and it endorses action

CENSORSHIP AS OPTIMAL PERSUASION 17

a = 1 if θ ge c and criticizes it if θ lt c17 A set C of media outlets is a finite subsetof [0 1]18

The governmentrsquos censorship policy is a set of the media outlets X sub C that arepermitted to broadcast so the rest of the media outlets are censored

The timing is as follows First the government chooses a set X sub C of permittedmedia outlets Second state θ is realized and each permitted media outlet endorsesor criticizes action a = 1 according to its editorial policy Finally each citizen observesmessages from all permitted media outlets updates his beliefs about θ and choosesan action

52 Discussion We now discuss interpretations of the key components of the mediacensorship application As in Gehlbach and Sonin (2014) there can be various inter-pretations of the citizenrsquos action a = 1 such as voting for the government supportinga governmentrsquos policy or taking an individual decision that benefits the governmentA citizenrsquos type r can be interpreted as his ideological position or preference param-eter A citizen who is more supportive of the government has a smaller r

A media outlet with a higher editorial policy c can be interpreted as being less loyalto the government because it criticizes the government on a larger set of states Aneditorial policy c isin C can therefore represent a slant or political bias of the outletagainst the government and can be empirically measured as the frequency with whichthe outlet uses anti-government language Gentzkow and Shapiro (2010) constructsuch a slant index for US newspapers Empirical findings of their paper suggestthat the editorial policies of media outlets are driven by reader preferences justifyingour assumption of the existence of a large variety of editorial policies19 As in Suen(2004) Chan and Suen (2008) and Chiang and Knight (2011) the assumption ofthe binary media reports that communicate only whether the state θ is above somestandard c can be justified by a cursory readerrsquos preference for simple messages suchas positive or negative opinions and yes or no recommendations

The governmentrsquos censorship of media outlets can take various forms For examplethe government can ban access to internet sites withdraw licenses disrupt financingconfiscate print materials and equipment discredit media providers and even arresteditors and journalists using broadly formulated legislation on combating extremismIn some countries the government can exercise direct control over media editorialpolicies either through state ownership or administrative pressure

17The tie-breaking in the event of θ = c is unimportant as θ is a continuous random variable18We discuss the case of a continuum of media outlets in Section 5519Theoretical literature has explored the determinants of media slant of an outlet driven by its

citizens (Mullainathan and Shleifer 2005 Gentzkow and Shapiro 2006 and Chan and Suen 2008)and its owners (Baron 2006 and Besley and Prat 2006)

18 KOLOTILIN MYLOVANOV ZAPECHELNYUK

53 Formulating Media Censorship as Persuasion We now show that the me-dia censorship problem can be formulated as a linear persuasion problem in whichthe government is a sender and a representative citizen is a receiver

In our model the citizensrsquo and the governmentrsquos utilities are linear in θ There-fore given any information from the media outlets the utilities depend only on theposterior mean of θ

Consider an arbitrary posterior mean of θ denoted by m isin [0 1] Each citizen of typer chooses ar = 1 if and only if r le m The externality term κ plays no role in thisdecision Therefore the aggregate action a is simply the mass of all citizens whosetypes do not exceed m so a = G(m)

Next using the citizensrsquo optimal behavior we derive the governmentrsquos expected utilityconditional on the posterior mean m

V (m) = E983063983133 1

0

ν(θ r ar G(m))dG(r) + δγ(θ G(m))

983055983055983055983055m983064

=

983133 1

0

ν(m r1rlem G(m))dG(r) + δγ(mG(m)) (6)

This is the governmentrsquos indirect utility As in Section 2 the government now choosesa signal which is informative about the state to maximize its expected utility How-ever in contrast to Section 2 here the government is restricted to signals that areimplementable by a subset of a given set of media outlets

Next we show that with an appropriate definition of the state the restriction tosignals implementable by a subset of a given set of media outlets can be formulatedas a restriction to monotone partitional signals as in Scenario B in Section 2 Thuswe will be able to use our results in Section 4 to find the optimal censorship policies

Recall that the information about θ is only available through the media outlets Let ωdenote a random variable equal to the conditional expectation of θ given the messagesof all media outlets in C Note that as C is finite ω is a discrete random variablewith values in [0 1] Let F denote the distribution of ω From now on we treat therandom variable ω as the state and denote by s a signal as defined in Section 2

Let HM denote the set of all distributions of the posterior mean state induced byall monotone partitional signals and let HC denote the set of all distributions ofthe posterior mean state induced by all finite subsets of the set C of media outletsObserve that any set of media outlets induces a monotone partitional signal thatis HC sub HM We now show that every outcome implementable by an arbitrarymonotone partition in HM is also implementable by a subset of media outlets

Lemma 2 HC = HM

For illustration suppose that there is only one media outlet with the editorial policyc isin (0 1) There are two observable events θ ge c and θ lt c In each realized event

CENSORSHIP AS OPTIMAL PERSUASION 19

the citizens compute the posterior expectation of θ which we will call ω So in thisexample the distribution F of ω has two mass points one for each of the two events

Given these two mass points every monotone partition of [0 1] induces one of twopossible distributions of the posterior mean state (i) the mass points revealed and (ii)the mass points pooled Observe however that these distributions are implementableby permitting and censoring the media outlet respectively

A censorship policy is upper censorship if it censors all sufficiently disloyal mediaoutlets Specifically there exists a cutoff clowast isin C and an indicator qlowast isin 0 1 suchthat all media outlets whose editorial policies are below clowast are permitted all mediaoutlets whose editorial policies are above clowast are censored and clowast is permitted if andonly if qlowast = 1 That is X = c isin C c le clowast if qlowast = 1 and X = c isin C c lt clowastif qlowast = 0 Note that the full censorship policy clowast = 0 and qlowast = 0 (where all mediaoutlets are censored) and the free media policy clowast = 1 and qlowast = 1 (where all mediaoutlets are permitted) are the two extreme upper-censorship policies

As shown above the media censorship problem is equivalent to the linear persuasionproblem in which the sender is restricted to monotone partitional signals and thesenderrsquos indirect utility V is given by (6) We thus apply Theorem 1prime(B) to obtainthe following result

Theorem 1primeprime If V is S-shaped then an optimal censorship policy is upper censorship

To illustrate Theorem 1primeprime suppose the government is interested only in the aggregateaction so that ν = 0 and γ depends only on the aggregate action a so V (m) =γ(G(m)) Thus an upper-censorship policy is optimal if the composition functionγ(G(middot)) is S-shaped For example this condition holds if the government is interestedin reaching a certain approval threshold (eg a simple majority) so that γ is a stepfunction This condition also holds if γ is S-shaped and G is uniform or S-shapedwith the same inflection point as γ20

54 Canonical Case We now impose more structure on the utilities to obtain asharper result for the optimality of upper censorship and to perform a comparativestatics analysis Analogously to assumption (A1) we assume that the governmentrsquosutility is a weighted average of the citizensrsquo utility and their aggregate action

983133 1

0

u(θ r ar a)dG(r) + δa

where u(θ r ar a) = (θ minus r)ar + ζ(r)a

(A2)

for some continuously differentiable function ζ Define

β =

983133 1

0

ζ(r)dG(r) + δ

20This condition holds in many special cases where γ and G are S-shaped even when γ and Ghave different inflection points

20 KOLOTILIN MYLOVANOV ZAPECHELNYUK

The term983125 1

0ζ(r)dG(r) is the governmentrsquos expected bias towards the citizensrsquo action

a = 1 due to the citizensrsquo externality and the term δ is the governmentrsquos intrinsicbias towards a greater aggregate action a Thus we interpret β as the governmentrsquosaggregate bias This decomposition of the bias allows for different interpretations whythe government is biased So β can be high because the government is too self-serving(high δ) or because the government is benevolent (low δ) but wishes to internalizestrong positive externalities of the citizens (high ζ(r)) To ease interpretations weassume that β gt 021

Under Assumption (A2) the governmentrsquos indirect utility V given by (6) becomes

V (m) =

983133 m

0

(mminus r)dG(r) + βa =

983133 m

0

v(m r)dG(r)

where

v(m r) = β

9830611 +

1

β(mminus r)

983062

This is the same as v given by Assumption (A1) with ρ = βminus1 up to rescaling by apositive constant Consequently we can apply Theorem 2prime(B) to obtain the followingresult

Theorem 2primeprime Let (A2) hold If the density g of citizens types is log-concave then anoptimal censorship policy is upper censorship

We now apply the comparative statics analysis presented in Section 32 Note thatthe upper-censorship policies are ordered according to the amount of informationtransmitted to the citizens in the sense of Blackwell (1953) A greater censorshipthreshold clowast means that more media outlets are permitted With this order in mindwe apply Theorem 3prime(B) to make a comparative statics analysis on the amount ofinformation that is optimally disclosed by the government

First the censorship cutoff is increasing in the alignment parameter ρ Recall thatthis is a reciprocal of the governmentrsquos bias ρ = βminus1 This means that the governmentoptimally discloses more information (the censorship cutoff is greater) when it is lessbiased (β is smaller) Intuitively as β decreases the government puts more weighton the citizensrsquo utility so it optimally endows the population with a higher utility bycensoring fewer media outlets and disclosing more information

Second the censorship cutoff is increasing in the magnitude t of the horizontal shiftof the density g A greater t corresponds to a greater opportunity cost of actiona = 1 for each citizen This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more difficultto persuade to take action a = 1 (parameter t is greater) Informally speaking topersuade the same type of the citizen the government needs to increase the posterior

21If β lt 0 then swapping the roles of a = 0 and a = 1 reverses the sign of the bias If β = 0then the governmentrsquos utility and the citizensrsquo private interests coincide so it is trivially optimal topermit all media outlets

CENSORSHIP AS OPTIMAL PERSUASION 21

mean But the expected posterior mean must be equal to the prior mean so it is notpossible to increase all posterior means Due to the log-concave shape of the densityof the citizensrsquo types this tradeoff is resolved by increasing the posterior mean of thepooling interval E[ω|ω ge clowast] This is done by shrinking the pooling interval [clowast 1]that is increasing the censorship cutoff clowast

Third when the number of media outlets is sufficiently large the censorship cutoffis increasing in the stretch factor σ of the density g around its peak τ (see Footnote15) A greater σ corresponds to a more dispersed distribution of the opportunitycost in the population This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more diverse inhow difficult they are to persuade Intuitively when the types are more spread outpersuading the same mass of types requires a greater posterior mean which leads toa higher censorship cutoff similarly to the comparative statics in t discussed above

55 Extensions and Open Questions Let us now consider a few extensions ofour model of media censorship

In our model the set of media outlets is exogenous and the governmentrsquos only in-strument is censorship We now consider three alternative ways of expanding thegovernmentrsquos instruments of influence

First suppose that the government is able not only to censor existing media outletsbut also to introduce new media outlets with chosen editorial policies This is equiv-alent to our censorship model where all media outlets in [0 1] are initially availableand the government can censor any subset of them When all media outlets in [0 1]are permitted the revealed information about θ (which we call ω) is θ itself Thusω = θ is a continuous random variable with distribution F = T So we can nowapply our results for the continuous state from Section 3 instead of those for the dis-crete state in Section 4 reaching the same conclusion about the optimality of uppercensorship

Second suppose that the government can garble information available from mediaoutlets That is the government is not restricted to monotone partitions it cancreate arbitrary signals about state ω for the citizens to observe This becomes ageneral persuasion problem and our Theorems 1prime(A) 2prime(A) and 3prime(A) apply

Third suppose that the government is able to restrict not only which media outletsare permitted but also how many media outlets each citizen can choose to observeIn this extension the citizens are not allowed to communicate with one another (oth-erwise they could share the information thus observing all permitted media outletsindirectly) This extension does not affect our results as long as each citizen is al-lowed to access at least one media outlet of his choice as in Chan and Suen (2008)Intuitively this is because each citizen categorizes the information from the mediaoutlets into ldquogood newsrdquo where a = 1 is optimal and ldquobad newsrdquo where a = 0 is

22 KOLOTILIN MYLOVANOV ZAPECHELNYUK

optimal Because the information from the media outlets induces a monotone parti-tion it means that ldquogood newsrdquo is separated from ldquobad newsrdquo by a threshold mediaoutlet that depends on the citizenrsquos type That is it is sufficient to observe a singlethreshold media outlet to distinguish ldquogood newsrdquo from ldquobad newsrdquo

In our model each citizenrsquos utility depends on the aggregate action a through theexternality term κ(θ r a) which does not affect the chosen action Let us relax thisassumption so that a citizenrsquos optimal choice can depend on a We can still writethe senderrsquos indirect utility V as a function of the posterior mean state and applyour results However V is no longer uniquely determined by the primitives of themodel It is now endogenous and depends on an equilibrium the citizens play aseach citizenrsquos optimal action now depends on what all citizens do in equilibrium Forexample given the same information about the state a citizen could prefer to choosea = 1 if and only if sufficiently many citizens choose the same action so a is largeenough This creates the problem of multiplicity of equilibria and as a consequencethe dependence of optimal censorship on equilibrium selection

As a side our media censorship problem can also be applied to spatial voting mod-els as in Chiang and Knight (2011) Consider a government party (p = G) and anopposition party (p = O) competing in an election If party p wins a voter withan ideological position r gets utility wp minus (r minus rp)

2 where wp is the quality or va-lence of party p and rp is the ideology or policy platform of party p Voters knowthe partiesrsquo ideologies and obtain information about the partiesrsquo qualities from allavailable media outlets Each voter supports the party that maximizes his expectedutility Within this context our analysis still applies because the voterrsquos utility dif-ference between the government and opposition parties is proportional to θminusr whereθ = (wG minus wO + r2O minus r2G)2(rO minus rG)

There are a few more extensions that can be relevant in applications First insteadof complete censorship of a media outlet there can be a cost of accessing it Forexample an international news channel can be censored by a local government butcitizens may still access it through VPN at some cost Second it can be costly forthe government to censor media outlets So another important question to answeris how a government may prioritize censoring Finally citizens can incur some costof following each media outlet While we have already mentioned that citizens gainno benefit from following more than one outlet it is entirely possible for them tostop watching news altogether if it is sufficiently uninformative These extensions arenontrivial and left for future research

Appendix

Proof of Theorem 1 Suppose that V is S-shaped For mωlowast isin [0 1] and mlowast(ωlowast) =E[ω|ω ge ωlowast] define

V (mωlowast) = V (mlowast(ωlowast)) + V prime(mlowast(ωlowast))(mminusmlowast(ωlowast)) (7)

CENSORSHIP AS OPTIMAL PERSUASION 23

Since V (m) is convex on [0 τ ] and concave on [τ 1] there exists ωlowast isin [0 τ ] thatsatisfies condition (3) that is

V (mωlowast) ge (le)V (m) for all m ge (lt)ωlowast (8)

This is true because V (ωlowastωlowast) minus V (ωlowast) is single-crossing from below To see thisconsider Figure 1 Observe that (3) can hold only if ωlowast is on the convex part of V andmlowast is on the concave part of V As the censorship cutoff ωlowast increases the posteriormean state of the pooling message mlowast(ωlowast) also increases But because V is concaveat mlowast(ωlowast) the dashed tangent line V (mωlowast) becomes flatter in m so it crosses thesolid line V (m) at a smaller m

Consider an arbitrary signal s Let H be the distribution of m = E[ω|s] inducedby signal s Since each signal is a garbling of the fully informative signal F is amean-preserving spread of H Thus the senderrsquos expected utility is smaller undersignal s than under the upper-censorship signal with cutoff ωlowast as follows from

983133 1

0

V (m)dH(m) le983133 1

0

maxV (m) V (mωlowast)dH(m)

le983133 1

0

maxV (m) V (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mlowast(ωlowast))dF (m)

where the second line holds because maxV (m) V (mωlowast) is convex in m and F isa mean-preserving spread of H the third line holds by (8) and the last line holds by

(7) and by983125 1

ωlowast(mminusmlowast(ωlowast))dF (m) = 0

Conversely suppose that V is not S-shaped Then there exist 0 le m1 lt m2 lem3 lt m4 le 1 such that V primeprime(m) lt 0 for m isin (m1m2) V

primeprime(m) = 0 for m isin [m2m3]and V primeprime(m) gt 0 for m isin (m3m4) because V primeprime is continuous by assumption It isstraightforward to show that there exists 983141mlowast isin (m1m2) (sufficiently close to m2) and983141ωlowast isin (m3m4) such that

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) lt V (m) for all m isin (983141ωlowastm4]

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) gt V (m) for all m isin [m1 983141ωlowast] 983141mlowast

Moreover there exists a continuous distribution F such that the support of F is asubset of [m1m4] and 983141mlowast = E[ω|ω le 983141ωlowast] Using a chain of inequalities similar to theone above it is easy to show that lower-censorship signal 983141slowast with cutoff 983141ωlowast is optimalFurthermore it is uniquely optimal as both inequalities hold with equalities only forsignal 983141slowast This in turn implies that any upper-censorship signal is suboptimal

24 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Proof of Lemma 1 Notice that under assumption (A1) we have

V primeprime(m) = gprime(m) + ρg(m) for m isin [0 1]

Recall that V is S-shaped if and only if V primeprime(m) = gprime (m) + ρg (m) is single-crossingfrom above Using Proposition 1 in Quah and Strulovici (2012) it is easy to showthat V primeprime(m) is single-crossing from above for all ρ isin R if and only if gprime (m)g (m) isnonincreasing in m (that is ln g(m) is concave)

Proof of Theorem 2 Immediate by Theorem 1 and Lemma 1

Proof of Theorem 3 Part (i) We fix t and σ and vary ρ Without loss ofgenerality let t = 0 and σ = 1 (so that Gtσ = G) Consider a signal that induces thedistribution H of posterior mean m This signal implements the receiverrsquos interimutility U given by

U(r) =

983133 1

r

(mminus r)dH(m) for r isin [0 1]

which holds because the receiver acts if and only if m ge r Since types r lt 0 alwayschoose a = 1 and types r gt 1 always choose a = 0 we can exclude them fromconsideration and assume that supp(G) = [0 1] The senderrsquos expected utility is then

983133 1

0

V (m)dH(m) =

983133 1

0

983133 m

0

(1 + ρ(mminus r))dG(r)dH(m)

=

983133 1

0

983133 1

r

(1 + ρ(mminus r))dH(m)dG(r)

=

983133 1

0

(1minusH(r) + ρU(r))dG(r)

Consider ρ2 gt ρ1 Suppose by contradiction that the corresponding optimal upper-censorship signals s2 and s1 are such that s1 is strictly more informative than s2 LetU2 and U1 be the receiverrsquos interim utilities implemented by s2 and s1 Since thesender prefers signal s2 under ρ2 and signal s1 under ρ1 we have

983133 1

0

(1minusH2 (r) + ρ2U2 (r)) dG (r) ge983133 1

0

(1minusH1 (r) + ρ2U1 (r)) dG (r)

983133 1

0

(1minusH1 (r) + ρ1U1 (r)) dG (r) ge983133 1

0

(1minusH2 (r) + ρ1U2 (r)) dG (r)

Summing up these inequalities gives

(ρ2 minus ρ1)

983133 1

0

(U2 (r)minus U1 (r)) dG (r) ge 0

leading to a contradiction because U1(r) ge U2(r) for all r with strict inequality forsome r given that s1 is strictly more informative than s2

CENSORSHIP AS OPTIMAL PERSUASION 25

Part (ii) We fix ρ isin R and σ = 1 and vary t The indirect utility of the senderconditional on m isin [0 1] is given by

Vt(m) =

983133

rlem

(1+ρ(mminusr))gt1(r)dr =

983133

rlemminust

(1+ρ(mminustminusr))g(r)dr = V (mminust) (9)

By Theorem 1 when t0 = 0 an optimal cutoff ωlowast0 satisfies

V (m) le V (mlowast0) + V prime(mlowast

0)(mminusmlowast0) for m isin [ωlowast

0 1]

When V is S-shaped and t1 lt 0 it is easy to see that

V (mminus t1) le V (mlowast0 minus t1) + V prime(mlowast

0 minus t1)(mminusmlowast0) for m isin [ωlowast

0 1]

which implies that when t = t1 the above inequality changes its sign at some ωlowast1

that satisfies ωlowast1 le ωlowast

0 It follows from (9) that ωlowast1 is an optimal cutoff for t = t1 and

as we established above ωlowast1 le ωlowast

0

Part (iii) We fix ρ ge 0 and t = 0 and vary σ The indirect utility of the senderconditional on m isin [0 1] is given by

V ρσσ(m) =

983133

rlem

9830591 +

ρ

σ(mminus r)

983060dG0σ(r)

=

983133

rlem

9830591 +

ρ

σ(mminus r)

983060g

983061r minus (1minus σ)τ

σ

9830621

σdr

=

983133

rlemminus(1minusσ)τσ

9830611 + ρ

983061mminus (1minus σ)τ

σminus r

983062983062g(r)dr

= Vρ1

983061mminus (1minus σ)τ

σ

983062 (10)

Let σ0 = 1 σ1 lt 1 and ρ1 = ρσ1 To compare the optimal cutoffs under parameters(ρ σ0) and (ρ σ1) we consider two steps first comparing the cutoffs between (ρ σ0)and (ρ1 σ1) and then comparing the cutoffs between (ρ1 σ1) and (ρ σ1)

First we compare an optimal cutoff ωlowast0 under (ρ σ0) = (ρ 1) and an optimal cutoff

ωlowast1 under (ρ1 σ1) By Theorem 1 ωlowast

0 satisfies

Vρ1(m) le Vρ1(mlowast0) + V prime

ρ1(mlowast0)(mminusmlowast

0) for m isin [ωlowast0 1]

When V is increasing (implied by ρ ge 0) and S-shaped and σ1 lt 1 it is easy to seethat for all m isin [ωlowast

0 1]

Vρ1

983061mminus (1minus σ1)τ

σ1

983062le Vρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062+ V prime

ρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062(mminusmlowast

0)

which implies that when σ = σ1 lt 1 the above inequality changes its sign at someωlowast1 that satisfies ωlowast

1 le ωlowast0 It follows from (10) that ωlowast

1 is an optimal cutoff under(ρσ σ1) = (ρ1 σ1) and as we established above ωlowast

1 le ωlowast0

26 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Second we compare the optimal cutoff ωlowast1 under (ρ1 σ1) and the optimal cutoff ωlowast

2

under (ρ σ1) Because ρ lt ρ1 = ρσ1 by part (i) we have ωlowast2 le ωlowast

1

Combining ωlowast1 le ωlowast

0 and ωlowast2 le ωlowast

1 we obtain ωlowast2 le ωlowast

0

Proof of Theorem 1prime Part (A) The proof is analogous to that of Theorem 1 andis thus omitted

Part (B) The only if statement follows from that in part (A) (see the proof of Theorem1) and the fact that we can always ensure that an optimal stochastic censorship(ωlowast qlowast) is deterministic (so qlowast isin 0 1) by the choice of the support and distributionof ω Thus we only need to prove the if statement

Let F be a distribution of ω with support on n values 0 lt ω0 lt lt ωnminus1 lt 1 Thisis without loss of generality because if 0 andor 1 were in the support of F we couldredefine the state as ω = (ω + ε)(1 + 2ε) and the type as r = (r + ε)(1 + 2ε)

Let X sub ω1 ωnminus1 identify a set of cutoffs defining a monotone partition for eachωi isin X the receiver is informed whether ω lt ωi or ω ge ωi Note that we exclude ω0

from the set of cutoffs because ω ge ω0 always holds

Note that X is deterministic upper censorship if and only if there exists a censorshipthreshold ωlowast isin [0 1] such that for each i = 1 nminus 1 ωi isin X if and only if ωi lt ωlowastthat is we can assume that qlowast = 0 without loss of generality

Let VX be the senderrsquos expected utility when the monotone partition is given by X

If n = 2 then there are only two partitions X = ω1 (full disclosure) and X = empty(no disclosure) Both are upper censorships so part (B) holds trivially

If n = 3 then there are four monotone partitions X = ω1ω2 (full disclosure)X = ω2 X = ω1 and X = empty (no disclosure) Only X = ω2 is not uppercensorship The next lemma proves that X = ω2 cannot be uniquely optimal whenV is S-shaped thus proving the optimality of upper censorship for n = 3

Denote

mjk = E983045ω|ω isin [ωjωk]

983046

For convenience of notation we write mj = ωj

Lemma 3 Let V be S-shaped with an inflexion point τ Let ω be a discrete randomvariable with support on m0m1m2 and probabilities p0 p1 p2 where 0 lt m0 ltm1 lt m2 lt 1 Then

(i) if τ le m0 then Vm2 le Vempty

(ii) if τ ge m1 then Vm2 le Vm1m2

(iii) if m0 lt τ lt m1 then Vm2 le maxVempty Vm1m2

CENSORSHIP AS OPTIMAL PERSUASION 27

Proof Define p01 = p0 + p1 and p02 = p01 + p2 = 1 Observe that

m01 =p0m0 + p1m1

p0 + p1and m02 =

p0m0 + p1m1 + p2m2

p0 + p1 + p2

We thus have

Vempty = p02V (m02) = (p01 + p2)V

983061p01m01 + p2m2

p01 + p2

983062

Vm2 = p01V (m01) + p2V (m2) = (p0 + p1)V

983061p0m0 + p1m1

p0 + p1

983062+ p2V (m2)

Vm1m2 = p0V (m0) + p1V (m1) + p2V (m2)

Clearly

m0 lt m01 lt m1 and m0 lt m02 lt m2

Therefore

Vm2 le Vempty lArrrArrp01

p01 + p2V (m01) +

p2p01 + p2

V (m2) le V

983061p01

p01 + p2m01 +

p2p01 + p2

m2

983062= V (m02)

(11)

and

Vm2 le Vm1m2 lArrrArr

V

983061p0

p0 + p1m0 +

p1p0 + p1

m1

983062le p0

p0 + p1V (m0) +

p1p0 + p1

V (m1)(12)

Suppose that τ le m01 so V is concave on [m01m2] Then (11) holds by Jensenrsquosinequality In particular it holds for τ le m0 lt m01 thus proving part (i)

Suppose that τ ge m1 so V is convex on [m0m1] Then (12) holds by Jensenrsquosinequality thus proving part (ii)

Finally suppose that m01 lt τ lt m1 Let (τ V (τ)) lie below (or on) the straight lineA connecting (m0 V (m0)) and (m2 V (m2)) as illustrated on Figure 2 By convexityof V (m) for m lt τ it must be the case that V (m01) lies below (or on) the dashedline B By concavity of V (m) for m gt τ it must be the case that V (m1) lies above(or on) the dashed line C Consequently (12) is satisfied

Alternatively let (τ V (τ)) lie above the straight line A connecting (m0 V (m0)) and(m2 V (m2)) as illustrated on Figure 3 If Vm2 le Vm1m2 then the proof of part(iii) is complete So suppose that Vm2 gt Vm1m2 so (12) is violated

By convexity of V (m) for m lt τ it must be the case that V (m01) lies below (or on)the line B Since (12) is violated V (m01) must lie between the lines A and B Recallthat m01 lt m02 lt m2 Observe that V (m02) must be in the shaded area on Figure 3Indeed if m02 lt τ then by convexity of V (m) for m lt τ it must be the case thatV (m02) lies between (or on) the lines B and D If m02 ge τ then by concavity of

28 KOLOTILIN MYLOVANOV ZAPECHELNYUK

τ m2m0

A C

B

m01 m1

V (m0) V (τ)

V (m2)

Figure 2 The case where (τ V (τ)) lies below (or on) the straight lineconnecting (m0 V (m0)) and (m2 V (m2))

τ m2m0

A

C

B

m01

V (m0)

V (τ)

V (m2)

D

V (m01)

Figure 3 The case where (τ V (τ)) lies above the straight line con-necting (m0 V (m0)) and (m2 V (m2))

V (m) for m gt τ it must be the case that V (m02) lies above (or on) line C It followsthat (11) is satisfied

We now adapt Lemma 3 to prove the optimality of upper censorship for n ge 4

CENSORSHIP AS OPTIMAL PERSUASION 29

Fix a monotone partition X sub ω1 ωnminus1 that is not upper censorship that isthere exists i isin 2 nminus 1 such that

ωiminus1 ∕isin X and ωi isin X (13)

Denote ωn = 1 Define j and k as follows If X cap ω1 ωiminus2 ∕= empty let ωj be thelargest element inX smaller than i otherwise let ωj = ω0 IfXcapωi+1 ωnminus1 ∕= emptylet ωk be the smallest element in X greater than i otherwise let ωk = ωn = 1 Define

m0 = ωj m1 = ωiminus1 m2 = ωi

and

p0 =Pr[ω isin [ωjωiminus1)]

Pr[ω isin [ωjωk)] p1 =

Pr[ω isin [ωiminus1ωi)]

Pr[ω isin [ωjωk)] p2 =

Pr[ω isin [ωiωk)]

Pr[ω isin [ωjωk)]

To apply Lemma 3 we identify Xωi with empty X with m2 and X cup ωiminus1 withm1m2 By Lemma 3 VX le maxVXωi VXcupωiminus1 Thus replacing X withXωi if VX le VXωi and with X cup ωiminus1 otherwise weakly improves the senderrsquosexpected utility Repeatedly applying such replacements yields upper censorshipIndeed this procedure terminates after a finite number of replacements because byLemma 3 for ωj ge τ we replace X with Xωi and for ωiminus1 le τ we replace Xwith X cup ωiminus1

We thus have shown that when V is S-shaped when optimizing over the set ofmonotone partitions we can restrict attention to deterministic upper censorship Itremains to show that an optimal deterministic censorship cutoff ωlowast

d coincides with anoptimal stochastic censorship cutoff ωlowast from part (A) To see this it is straightforwardto show that

V lowast(ωlowast qlowast) =

983133

[0ωlowast)

V (ω)dF (ω) + V (ωlowast)qlowast Pr[ω = ωlowast]

+ V (mlowast(ωlowast qlowast))((1minus qlowast) Pr[ω = ωlowast] + Pr[ω gt ωlowast]) (14)

is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order Since the two de-terministic censorship cutoff pairs closest to (ωlowast qlowast) in the Blackwell informativenessorder are (ωlowast 0) and (ωlowast 1) we obtain ωlowast

d = ωlowast

Proof of Theorem 2prime Immediate by Theorem 1prime and Lemma 1

Proof of Theorem 3prime The proof for an optimal signal is analogous to that ofTheorem 3 and is thus omitted The same applies to part (i) for an optimal monotonepartition Thus it remains to prove part (ii) for an optimal monotone partition

We fix ρ and σ = 1 and vary t continuously If ωlowast isin supp(F ) then an optimalmonotone partition is an optimal signal because all qlowast isin [0 1] are optimal Thuspart (ii) in (B) holds by part (ii) in (A) If ωlowast isin supp(F ) and qlowast isin (0 1) then either(ωlowast

d qlowastd) = (ωlowast 0) or (ωlowast

d qlowastd) = (ωlowast 1) is optimal because the senderrsquos expected utility

V lowast given by (14) is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order

30 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Therefore it suffices to consider values of t at which the sender is indifferent between(ωlowast 0) and (ωlowast 1) and to show that the sender strictly prefers (ωlowast 1) to (ωlowast 0) at aslightly higher t

V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) =rArr dV lowastt (ω

lowast 0)

dtle dV lowast

t (ωlowast 1)

dt (15)

The condition V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) can be written as

Vt(mlowast(ωlowast 0)) Pr[ω ge ωlowast] = Vt(ω

lowast) Pr[ω = ωlowast] + Vt(mlowast(ωlowast 1)) Pr[ω gt ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)Pr[ω = ωlowast]

Pr[ω ge ωlowast]+ Vt(m

lowast(ωlowast 1))Pr[ω gt ωlowast]

Pr[ω ge ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + Vt(mlowast(ωlowast 1))

mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Analogously the conditiondV lowast

t (ωlowast0)dt

le dV lowastt (ωlowast1)dt

can be written as

V primet (m

lowast(ωlowast 0)) ge V primet (ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + V primet (m

lowast(ωlowast 1))mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Thus (15) states that if Vt(mlowast(ωlowast 0)) is equal to the convex combination of Vt(ω

lowast)and Vt(m

lowast(ωlowast 1)) then the derivative V primet (m

lowast(ωlowast 0)) is greater than the convex com-bination of the derivatives V prime

t (ωlowast) and V prime

t (mlowast(ωlowast 1)) When V is S-shaped it is easy

to see that this property is satisfied

Proof of Lemma 2 Observe that a signal induced by any set X sub C of permittedmedia outlets induces a monotone partition of [0 1] and thus HC sub HM Indeed letcX(ω) and cX(ω) be the media outlets in X that are the closest to ω from above andfrom below

cX(ω) = sup983059c isin X c le ω

983052cup 0

983060and cX(ω) = inf

983059c isin X c gt ω cup 1

983060

By observing the media outlets in X each citizen is informed that ω isin [cX(ω) cX(ω))

Conversely let the set C of media outlets be c0 lt middot middot middot lt cn Without loss of generalityassume that C contains the two uninformative media outlets c0 = 0 and cn = 1 which(almost) always endorse and criticize action a = 1 respectively Then the distributionF of the state ω is a discrete distribution whose support is Ω = ω0 ωnminus1 whereωi = E[θ|θ isin [ci ci+1)] for i isin 0 nminus 1 Consider a monotone partitional signalrepresented by a nondecreasing function ξ [0 1] rarr R For each realization m of thissignal that occurs with a strictly positive probability define

x(m) = min983153ω isin Ω ξ(ω) = m

983154and x(m) = max

983153ω isin Ω ξ(ω) = m

983154

This signal induces a monotone partition of the finite set Ω This partition of Ω canbe implemented by a censorship policy that prohibits a media outlet c isin C if andonly if c isin (x(m) x(m)) for some realization m

CENSORSHIP AS OPTIMAL PERSUASION 31

References

Alonso R and O Camara (2016a) ldquoPersuading Votersrdquo American EconomicReview 106 3590ndash3605

(2016b) ldquoPolitical Disagreement and Information in Electionsrdquo Games andEconomic Behavior 100 390ndash412

Bagnoli M and T Bergstrom (2005) ldquoLog-Concave Probability and Its Ap-plicationsrdquo Economic Theory 26 445ndash469

Baron D P (2006) ldquoPersistent Media Biasrdquo Journal of Public Economics 90(1)1ndash36

Besley T and A Prat (2006) ldquoHandcuffs for the Grabbing Hand MediaCapture and Government Accountabilityrdquo American Economic Review 96(3) 720ndash736

Blackwell D (1953) ldquoEquivalent Comparisons of Experimentsrdquo Annals of Math-ematical Statistics 24 265ndash272

Chan J and W Suen (2008) ldquoA Spatial Theory of News Consumption andElectoral Competitionrdquo Review of Economic Studies 75 699ndash728

Chiang C-F and B Knight (2011) ldquoMedia Bias and Influence Evidence fromNewspaper Endorsementsrdquo Review of Economic Studies 78 795ndash820

Duffie D P Dworczak and H Zhu (2017) ldquoBenchmarks in Search MarketsrdquoJournal of Finance 72 1983ndash2044

Dworczak P (2017) ldquoMechanism Design with Aftermarkets Cutoff Mecha-nismsrdquo mimeo

Dworczak P and G Martini (2019) ldquoThe Simple Economics of Optimal Per-suasionrdquo Journal of Political Economy 127 993ndash2048

Edmond C (2013) ldquoInformation Manipulation Coordination and RegimeChangerdquo Review of Economic Studies 80 1422ndash1458

Egorov G S Guriev and K Sonin (2009) ldquoWhy Resource-poor DictatorsAllow Freer Media A Theory and Evidence from Panel Datardquo American PoliticalScience Review 103 645ndash668

Gehlbach S and K Sonin (2014) ldquoGovernment Control of the Mediardquo Journalof Public Economics 118 163ndash171

Gentzkow M and E Kamenica (2016) ldquoA Rothschild-Stiglitz Approach toBayesian Persuasionrdquo American Economic Review Papers amp Proceedings 106597ndash601

Gentzkow M and J Shapiro (2006) ldquoMedia Bias and Reputationrdquo Journalof Political Economy 114(2) 280ndash316

(2010) ldquoWhat Drives Media Slant Evidence from US Daily NewspapersrdquoEconometrica 78(1) 35ndash71

Ginzburg B (2019) ldquoOptimal Information Censorshiprdquo Journal of EconomicBehavior and Organization 163 377ndash385

Goldstein I and Y Leitner (2018) ldquoStress Tests and Information DisclosurerdquoJournal of Economic Theory 177 34ndash69

32 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Guo Y and E Shmaya (2019) ldquoThe Interval Structure of Optimal DisclosurerdquoEconometrica 87 653ndash675

Johnson J P and D P Myatt (2006) ldquoOn the Simple Economics of Adver-tising Marketing and Product Designrdquo American Economic Review 96 756ndash784

Kamenica E and M Gentzkow (2011) ldquoBayesian Persuasionrdquo American Eco-nomic Review 101 2590ndash2615

Kolotilin A (2015) ldquoExperimental Design to Persuaderdquo Games and EconomicBehavior 90 215ndash226

(2018) ldquoOptimal Information Disclosure A Linear Programming Ap-proachrdquo Theoretical Economics 13 607ndash636

Kolotilin A and H Li (2019) ldquoRelational Communicationrdquo mimeoKolotilin A M Li T Mylovanov and A Zapechelnyuk (2015) ldquoPer-suasion of a Privately Informed Receiverrdquo mimeo

Kolotilin A T Mylovanov A Zapechelnyuk and M Li (2017) ldquoPer-suasion of a Privately Informed Receiverrdquo Econometrica 85 1949ndash1964

Kolotilin A and A Wolitzky (2019) ldquoAssortative Information Disclosurerdquomimeo

Lewis T R and D Sappington (1994) ldquoSupplying Information to FacilitatePrice Discriminationrdquo International Economic Review 35 309ndash327

Lorentzen P (2014) ldquoChinarsquos Strategic Censorshiprdquo American Journal of Polit-ical Science 58 402ndash414

Mullainathan S and A Shleifer (2005) ldquoThe Market for Newsrdquo AmericanEconomic Review 95(4) 1031ndash1053

Orlov D P Zryumov and A Skrzypach (2018) ldquoDesign of Macro-prudentialStress Testsrdquo mimeo

Ostrovsky M and M Schwarz (2010) ldquoInformation Disclosure and Unravelingin Matching Marketsrdquo American Economic Journal Microeconomics 2 34ndash63

Prat A and D Stromberg (2013) ldquoThe Political Economy of Mass MediardquoAdvances in Economics and Econometrics 2 135ndash187

Quah J and B Strulovici (2012) ldquoAggregating the Single Crossing PropertyrdquoEconometrica 80 2333ndash2348

Rayo L and I Segal (2010) ldquoOptimal Information Disclosurerdquo Journal ofPolitical Economy 118 949ndash987

Romanyuk G and A Smolin (2019) ldquoCream Skimming and Information Designin Matching Marketsrdquo American Economic Journal Microeconomics 11 250ndash276

Suen W (2004) ldquoThe Self-Perpetuation of Biased Beliefsrdquo Economic Journal 114377ndash396

Yamashita T (2018) ldquoOptimal Public Information Disclosure by Mechanism De-signerrdquo mimeo

Zapechelnyuk A (2019) ldquoOptimal Quality Certificationrdquo American EconomicReview Insights forthcoming

14 KOLOTILIN MYLOVANOV ZAPECHELNYUK

4 Discrete State

In this section we assume that the state is a discrete random variable

Theorem 1prime Let V be S-shaped Then and only then for all discrete F

(A) an optimal signal is stochastic upper censorship whose cutoff pair (ωlowast qlowast) satisfiescondition (3)

(B) an optimal monotone partition is deterministic upper censorship whose cutoff pair(ωlowast

d qlowastd) satisfies ωlowast

d = ωlowast and qlowastd isin 0 1

In the canonical case we obtain the result analogous to Theorem 2

Theorem 2prime Consider the canonical case and let the density g be log-concave Thenand only then for all discrete F and all ρ

(A) an optimal signal is stochastic upper censorship

(B) an optimal monotone partition is deterministic upper censorship

Let us discuss how the comparative statics result (Theorem 3) changes In Section 32we argue that upper-censorship signals can be ordered by their censorship cutoffs Butwhen the state is discrete a censorship cutoff is described by a pair (ωlowast qlowast) where ωlowast

is a cutoff state and qlowast is the probability of revealing this cutoff state when it realizes

Consider two stochastic upper-censorship signals s1 and s2 with cutoff pairs (ωlowast1 q

lowast1)

and (ωlowast2 q

lowast2) Denote (ωlowast

1 qlowast1) ≽ (ωlowast

2 qlowast2) if ω

lowast1 gt ωlowast

2 or ωlowast1 = ωlowast

2 and qlowast1 ge qlowast2 Observethat s1 is more (Blackwell) informative than s2 if and only if (ωlowast

1 qlowast1) ≽ (ωlowast

2 qlowast2) This

comparison also applies to deterministic upper-censorship signals with the constraintthat qlowast1 and qlowast2 are in 0 1

With the above order on cutoff pairs Theorem 3 extend as follows

Theorem 3prime Consider the canonical case

(A) A cutoff pair of an optimal signal satisfies

(i) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in ρ

(ii) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in t

(iii) (ωlowast (ρ t σ) qlowast (ρ t σ)) is increasing in σ if ρ ge 0

(B) A cutoff pair of an optimal monotone partition satisfies15

(i) (ωlowastd (ρ t σ) q

lowastd (ρ t σ)) is increasing in ρ

(ii) (ωlowastd (ρ t σ) q

lowastd (ρ t σ)) is increasing in t

15Part (iii) also holds for an optimal monotone partition when for example the discrete supportof the state ω is a regular grid with sufficiently many points However it is possible to construct acounterexample where part (iii) fails for an optimal monotone partition

CENSORSHIP AS OPTIMAL PERSUASION 15

That is the sender optimally discloses more information when she is less biasedrelative to the receiver (the alignment parameter ρ is greater) when the receiver ismore reluctant to accept the proposal (the shift parameter t is greater) and underScenario A when the sender is more uncertain about the receiverrsquos type (the stretchparameter σ is greater)

Remark 2 Theorem 1prime and Theorem 2prime show that upper censorship emerges as anoptimal persuasion strategy under the same conditions (V is S-shaped in Theorem1prime and g is log-concave in Theorem 2prime) regardless of whether the sender can choosean arbitrary signal or is constrained to choose a monotone partition It may betempting to conjecture that if the sender was allowed to choose an arbitrary partition(not necessarily monotone) the same result would hold This however is not true

For illustration suppose that the sender wishes to maximize the probability of a = 1and the receiver has type r = 1

2with certainty16 Let the state ω take values in

0 14 1 with probabilities (1

6 23 16) so the prior mean is 1

3

An optimal signal is stochastic upper censorship with cutoff pair (ωlowast qlowast) = (14 12)

which sends the pooling message with certainty at ω = 1 and with probability 1minusqlowast =12at ω = 1

4 and reveals the state otherwise The pooling message induces the posterior

mean m = 12 leading to a = 1 all other messages induce posterior means below 1

2

leading to a = 0 Under the optimal signal action a = 1 is chosen with probability23middot 12+ 1

6= 1

2

An optimal monotone partition is deterministic upper censorship that fully revealsthe state where only ω = 1 leads to a = 1 Under this partition action a = 1 ischosen with probability 1

6

An optimal partition pools the extreme states ω = 0 and ω = 1 and reveals theintermediate state ω = 1

4 The pooling message induces the posterior mean m = 1

2

leading to a = 1 This optimal partition is nonmonotone Under this partitionaction a = 1 is chosen with probability 1

6+ 1

6= 1

3 which is strictly better than the

probability of 16obtained under the optimal monotone partition

5 Application to Media Censorship

In this section we apply our results to the problem of media censorship by thegovernment In the modern world people obtain information about the governmentrsquosstate through various media sources such as television newspapers and internetblogs Without the media most people would not know what policies and reformsthe government pursues and how effective they are Media outlets have differentpositions on the political spectrum and differ substantially in how they select andpresent facts to cover the same news People choose their sources of information based

16In this case V (m) = 1 if m ge 12 and V (m) = 0 if m lt 1

2 Observe that the graph of V is

S-shaped and it can be approximated by a continuously differentiable S-shaped function

16 KOLOTILIN MYLOVANOV ZAPECHELNYUK

on their political ideology and socioeconomic status This information is valuable forsignificant individual decisions on migration investment and voting to name a fewIndividuals may not fully internalize externalities that their decisions impose on thesociety Likewise the government may not have the societyrsquos best interest at heartTo further its goals the government then wishes to influence individual decisions bymanipulating the information through media In autocracies and countries with weakchecks and balances the government has power to censor the media content

The governmentrsquos problem of media censorship can be represented as the persua-sion problem in Section 2 We apply our results to provide conditions for the op-timality of upper-censorship policies that censor all media outlets except the mostpro-government ones Furthermore we interpret our comparative statics results

51 Setup There is a continuum of heterogeneous citizens indexed by r isin [0 1]distributed with G Each citizen chooses between a = 0 and a = 1 The utility of acitizen of type r is given by

u(θ r ar a) = (θ minus r)ar + κ(θ r a)

where ar isin 0 1 denotes the citizenrsquos own action a =983125ardG(r) denotes the aggre-

gate action in the society θ isin [0 1] captures an unobserved benefit from action 1 ascompared to action 0 and κ captures the impact of the aggregate action a on thecitizenrsquos utility The term (θ minus r)ar is a private surplus of a citizen of type r Theterm κ(θ r a) is an externality because for a citizen of type r it is optimal to ignorethis term and choose ar = 1 if and only if θ ge r

There is a government which is concerned with a weighted average of the social utilityand the governmentrsquos intrinsic benefit from the aggregate action For a given θ thegovernmentrsquos utility is given by

983133 1

0

ν(θ r ar a)dG(r) + δγ(θ a)

The term ν captures a citizenrsquos utility from the governmentrsquos perspective We allow νto be different from u to reflect paternalistic or other concerns The term γ capturesthe governmentrsquos intrinsic benefit from the aggregate action The parameter δ ge 0captures the weight of the aggregate action in the governmentrsquos utility

Let T be a distribution of the random variable θ We assume that distributions G andT are independent and admit continuously differentiable and strictly positive densi-ties We also assume that κ ν and γ are linear in θ and continuously differentiablein r and a Furthermore to simplify interpretations we assume that ν and γ arenon-decreasing in θ and a so that for the government a high θ is a good news anda higher aggregate action is preferable

Citizens obtain information about the unobservable benefit θ through media outletsEach media outlet is identified by its editorial policy c isin [0 1] and it endorses action

CENSORSHIP AS OPTIMAL PERSUASION 17

a = 1 if θ ge c and criticizes it if θ lt c17 A set C of media outlets is a finite subsetof [0 1]18

The governmentrsquos censorship policy is a set of the media outlets X sub C that arepermitted to broadcast so the rest of the media outlets are censored

The timing is as follows First the government chooses a set X sub C of permittedmedia outlets Second state θ is realized and each permitted media outlet endorsesor criticizes action a = 1 according to its editorial policy Finally each citizen observesmessages from all permitted media outlets updates his beliefs about θ and choosesan action

52 Discussion We now discuss interpretations of the key components of the mediacensorship application As in Gehlbach and Sonin (2014) there can be various inter-pretations of the citizenrsquos action a = 1 such as voting for the government supportinga governmentrsquos policy or taking an individual decision that benefits the governmentA citizenrsquos type r can be interpreted as his ideological position or preference param-eter A citizen who is more supportive of the government has a smaller r

A media outlet with a higher editorial policy c can be interpreted as being less loyalto the government because it criticizes the government on a larger set of states Aneditorial policy c isin C can therefore represent a slant or political bias of the outletagainst the government and can be empirically measured as the frequency with whichthe outlet uses anti-government language Gentzkow and Shapiro (2010) constructsuch a slant index for US newspapers Empirical findings of their paper suggestthat the editorial policies of media outlets are driven by reader preferences justifyingour assumption of the existence of a large variety of editorial policies19 As in Suen(2004) Chan and Suen (2008) and Chiang and Knight (2011) the assumption ofthe binary media reports that communicate only whether the state θ is above somestandard c can be justified by a cursory readerrsquos preference for simple messages suchas positive or negative opinions and yes or no recommendations

The governmentrsquos censorship of media outlets can take various forms For examplethe government can ban access to internet sites withdraw licenses disrupt financingconfiscate print materials and equipment discredit media providers and even arresteditors and journalists using broadly formulated legislation on combating extremismIn some countries the government can exercise direct control over media editorialpolicies either through state ownership or administrative pressure

17The tie-breaking in the event of θ = c is unimportant as θ is a continuous random variable18We discuss the case of a continuum of media outlets in Section 5519Theoretical literature has explored the determinants of media slant of an outlet driven by its

citizens (Mullainathan and Shleifer 2005 Gentzkow and Shapiro 2006 and Chan and Suen 2008)and its owners (Baron 2006 and Besley and Prat 2006)

18 KOLOTILIN MYLOVANOV ZAPECHELNYUK

53 Formulating Media Censorship as Persuasion We now show that the me-dia censorship problem can be formulated as a linear persuasion problem in whichthe government is a sender and a representative citizen is a receiver

In our model the citizensrsquo and the governmentrsquos utilities are linear in θ There-fore given any information from the media outlets the utilities depend only on theposterior mean of θ

Consider an arbitrary posterior mean of θ denoted by m isin [0 1] Each citizen of typer chooses ar = 1 if and only if r le m The externality term κ plays no role in thisdecision Therefore the aggregate action a is simply the mass of all citizens whosetypes do not exceed m so a = G(m)

Next using the citizensrsquo optimal behavior we derive the governmentrsquos expected utilityconditional on the posterior mean m

V (m) = E983063983133 1

0

ν(θ r ar G(m))dG(r) + δγ(θ G(m))

983055983055983055983055m983064

=

983133 1

0

ν(m r1rlem G(m))dG(r) + δγ(mG(m)) (6)

This is the governmentrsquos indirect utility As in Section 2 the government now choosesa signal which is informative about the state to maximize its expected utility How-ever in contrast to Section 2 here the government is restricted to signals that areimplementable by a subset of a given set of media outlets

Next we show that with an appropriate definition of the state the restriction tosignals implementable by a subset of a given set of media outlets can be formulatedas a restriction to monotone partitional signals as in Scenario B in Section 2 Thuswe will be able to use our results in Section 4 to find the optimal censorship policies

Recall that the information about θ is only available through the media outlets Let ωdenote a random variable equal to the conditional expectation of θ given the messagesof all media outlets in C Note that as C is finite ω is a discrete random variablewith values in [0 1] Let F denote the distribution of ω From now on we treat therandom variable ω as the state and denote by s a signal as defined in Section 2

Let HM denote the set of all distributions of the posterior mean state induced byall monotone partitional signals and let HC denote the set of all distributions ofthe posterior mean state induced by all finite subsets of the set C of media outletsObserve that any set of media outlets induces a monotone partitional signal thatis HC sub HM We now show that every outcome implementable by an arbitrarymonotone partition in HM is also implementable by a subset of media outlets

Lemma 2 HC = HM

For illustration suppose that there is only one media outlet with the editorial policyc isin (0 1) There are two observable events θ ge c and θ lt c In each realized event

CENSORSHIP AS OPTIMAL PERSUASION 19

the citizens compute the posterior expectation of θ which we will call ω So in thisexample the distribution F of ω has two mass points one for each of the two events

Given these two mass points every monotone partition of [0 1] induces one of twopossible distributions of the posterior mean state (i) the mass points revealed and (ii)the mass points pooled Observe however that these distributions are implementableby permitting and censoring the media outlet respectively

A censorship policy is upper censorship if it censors all sufficiently disloyal mediaoutlets Specifically there exists a cutoff clowast isin C and an indicator qlowast isin 0 1 suchthat all media outlets whose editorial policies are below clowast are permitted all mediaoutlets whose editorial policies are above clowast are censored and clowast is permitted if andonly if qlowast = 1 That is X = c isin C c le clowast if qlowast = 1 and X = c isin C c lt clowastif qlowast = 0 Note that the full censorship policy clowast = 0 and qlowast = 0 (where all mediaoutlets are censored) and the free media policy clowast = 1 and qlowast = 1 (where all mediaoutlets are permitted) are the two extreme upper-censorship policies

As shown above the media censorship problem is equivalent to the linear persuasionproblem in which the sender is restricted to monotone partitional signals and thesenderrsquos indirect utility V is given by (6) We thus apply Theorem 1prime(B) to obtainthe following result

Theorem 1primeprime If V is S-shaped then an optimal censorship policy is upper censorship

To illustrate Theorem 1primeprime suppose the government is interested only in the aggregateaction so that ν = 0 and γ depends only on the aggregate action a so V (m) =γ(G(m)) Thus an upper-censorship policy is optimal if the composition functionγ(G(middot)) is S-shaped For example this condition holds if the government is interestedin reaching a certain approval threshold (eg a simple majority) so that γ is a stepfunction This condition also holds if γ is S-shaped and G is uniform or S-shapedwith the same inflection point as γ20

54 Canonical Case We now impose more structure on the utilities to obtain asharper result for the optimality of upper censorship and to perform a comparativestatics analysis Analogously to assumption (A1) we assume that the governmentrsquosutility is a weighted average of the citizensrsquo utility and their aggregate action

983133 1

0

u(θ r ar a)dG(r) + δa

where u(θ r ar a) = (θ minus r)ar + ζ(r)a

(A2)

for some continuously differentiable function ζ Define

β =

983133 1

0

ζ(r)dG(r) + δ

20This condition holds in many special cases where γ and G are S-shaped even when γ and Ghave different inflection points

20 KOLOTILIN MYLOVANOV ZAPECHELNYUK

The term983125 1

0ζ(r)dG(r) is the governmentrsquos expected bias towards the citizensrsquo action

a = 1 due to the citizensrsquo externality and the term δ is the governmentrsquos intrinsicbias towards a greater aggregate action a Thus we interpret β as the governmentrsquosaggregate bias This decomposition of the bias allows for different interpretations whythe government is biased So β can be high because the government is too self-serving(high δ) or because the government is benevolent (low δ) but wishes to internalizestrong positive externalities of the citizens (high ζ(r)) To ease interpretations weassume that β gt 021

Under Assumption (A2) the governmentrsquos indirect utility V given by (6) becomes

V (m) =

983133 m

0

(mminus r)dG(r) + βa =

983133 m

0

v(m r)dG(r)

where

v(m r) = β

9830611 +

1

β(mminus r)

983062

This is the same as v given by Assumption (A1) with ρ = βminus1 up to rescaling by apositive constant Consequently we can apply Theorem 2prime(B) to obtain the followingresult

Theorem 2primeprime Let (A2) hold If the density g of citizens types is log-concave then anoptimal censorship policy is upper censorship

We now apply the comparative statics analysis presented in Section 32 Note thatthe upper-censorship policies are ordered according to the amount of informationtransmitted to the citizens in the sense of Blackwell (1953) A greater censorshipthreshold clowast means that more media outlets are permitted With this order in mindwe apply Theorem 3prime(B) to make a comparative statics analysis on the amount ofinformation that is optimally disclosed by the government

First the censorship cutoff is increasing in the alignment parameter ρ Recall thatthis is a reciprocal of the governmentrsquos bias ρ = βminus1 This means that the governmentoptimally discloses more information (the censorship cutoff is greater) when it is lessbiased (β is smaller) Intuitively as β decreases the government puts more weighton the citizensrsquo utility so it optimally endows the population with a higher utility bycensoring fewer media outlets and disclosing more information

Second the censorship cutoff is increasing in the magnitude t of the horizontal shiftof the density g A greater t corresponds to a greater opportunity cost of actiona = 1 for each citizen This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more difficultto persuade to take action a = 1 (parameter t is greater) Informally speaking topersuade the same type of the citizen the government needs to increase the posterior

21If β lt 0 then swapping the roles of a = 0 and a = 1 reverses the sign of the bias If β = 0then the governmentrsquos utility and the citizensrsquo private interests coincide so it is trivially optimal topermit all media outlets

CENSORSHIP AS OPTIMAL PERSUASION 21

mean But the expected posterior mean must be equal to the prior mean so it is notpossible to increase all posterior means Due to the log-concave shape of the densityof the citizensrsquo types this tradeoff is resolved by increasing the posterior mean of thepooling interval E[ω|ω ge clowast] This is done by shrinking the pooling interval [clowast 1]that is increasing the censorship cutoff clowast

Third when the number of media outlets is sufficiently large the censorship cutoffis increasing in the stretch factor σ of the density g around its peak τ (see Footnote15) A greater σ corresponds to a more dispersed distribution of the opportunitycost in the population This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more diverse inhow difficult they are to persuade Intuitively when the types are more spread outpersuading the same mass of types requires a greater posterior mean which leads toa higher censorship cutoff similarly to the comparative statics in t discussed above

55 Extensions and Open Questions Let us now consider a few extensions ofour model of media censorship

In our model the set of media outlets is exogenous and the governmentrsquos only in-strument is censorship We now consider three alternative ways of expanding thegovernmentrsquos instruments of influence

First suppose that the government is able not only to censor existing media outletsbut also to introduce new media outlets with chosen editorial policies This is equiv-alent to our censorship model where all media outlets in [0 1] are initially availableand the government can censor any subset of them When all media outlets in [0 1]are permitted the revealed information about θ (which we call ω) is θ itself Thusω = θ is a continuous random variable with distribution F = T So we can nowapply our results for the continuous state from Section 3 instead of those for the dis-crete state in Section 4 reaching the same conclusion about the optimality of uppercensorship

Second suppose that the government can garble information available from mediaoutlets That is the government is not restricted to monotone partitions it cancreate arbitrary signals about state ω for the citizens to observe This becomes ageneral persuasion problem and our Theorems 1prime(A) 2prime(A) and 3prime(A) apply

Third suppose that the government is able to restrict not only which media outletsare permitted but also how many media outlets each citizen can choose to observeIn this extension the citizens are not allowed to communicate with one another (oth-erwise they could share the information thus observing all permitted media outletsindirectly) This extension does not affect our results as long as each citizen is al-lowed to access at least one media outlet of his choice as in Chan and Suen (2008)Intuitively this is because each citizen categorizes the information from the mediaoutlets into ldquogood newsrdquo where a = 1 is optimal and ldquobad newsrdquo where a = 0 is

22 KOLOTILIN MYLOVANOV ZAPECHELNYUK

optimal Because the information from the media outlets induces a monotone parti-tion it means that ldquogood newsrdquo is separated from ldquobad newsrdquo by a threshold mediaoutlet that depends on the citizenrsquos type That is it is sufficient to observe a singlethreshold media outlet to distinguish ldquogood newsrdquo from ldquobad newsrdquo

In our model each citizenrsquos utility depends on the aggregate action a through theexternality term κ(θ r a) which does not affect the chosen action Let us relax thisassumption so that a citizenrsquos optimal choice can depend on a We can still writethe senderrsquos indirect utility V as a function of the posterior mean state and applyour results However V is no longer uniquely determined by the primitives of themodel It is now endogenous and depends on an equilibrium the citizens play aseach citizenrsquos optimal action now depends on what all citizens do in equilibrium Forexample given the same information about the state a citizen could prefer to choosea = 1 if and only if sufficiently many citizens choose the same action so a is largeenough This creates the problem of multiplicity of equilibria and as a consequencethe dependence of optimal censorship on equilibrium selection

As a side our media censorship problem can also be applied to spatial voting mod-els as in Chiang and Knight (2011) Consider a government party (p = G) and anopposition party (p = O) competing in an election If party p wins a voter withan ideological position r gets utility wp minus (r minus rp)

2 where wp is the quality or va-lence of party p and rp is the ideology or policy platform of party p Voters knowthe partiesrsquo ideologies and obtain information about the partiesrsquo qualities from allavailable media outlets Each voter supports the party that maximizes his expectedutility Within this context our analysis still applies because the voterrsquos utility dif-ference between the government and opposition parties is proportional to θminusr whereθ = (wG minus wO + r2O minus r2G)2(rO minus rG)

There are a few more extensions that can be relevant in applications First insteadof complete censorship of a media outlet there can be a cost of accessing it Forexample an international news channel can be censored by a local government butcitizens may still access it through VPN at some cost Second it can be costly forthe government to censor media outlets So another important question to answeris how a government may prioritize censoring Finally citizens can incur some costof following each media outlet While we have already mentioned that citizens gainno benefit from following more than one outlet it is entirely possible for them tostop watching news altogether if it is sufficiently uninformative These extensions arenontrivial and left for future research

Appendix

Proof of Theorem 1 Suppose that V is S-shaped For mωlowast isin [0 1] and mlowast(ωlowast) =E[ω|ω ge ωlowast] define

V (mωlowast) = V (mlowast(ωlowast)) + V prime(mlowast(ωlowast))(mminusmlowast(ωlowast)) (7)

CENSORSHIP AS OPTIMAL PERSUASION 23

Since V (m) is convex on [0 τ ] and concave on [τ 1] there exists ωlowast isin [0 τ ] thatsatisfies condition (3) that is

V (mωlowast) ge (le)V (m) for all m ge (lt)ωlowast (8)

This is true because V (ωlowastωlowast) minus V (ωlowast) is single-crossing from below To see thisconsider Figure 1 Observe that (3) can hold only if ωlowast is on the convex part of V andmlowast is on the concave part of V As the censorship cutoff ωlowast increases the posteriormean state of the pooling message mlowast(ωlowast) also increases But because V is concaveat mlowast(ωlowast) the dashed tangent line V (mωlowast) becomes flatter in m so it crosses thesolid line V (m) at a smaller m

Consider an arbitrary signal s Let H be the distribution of m = E[ω|s] inducedby signal s Since each signal is a garbling of the fully informative signal F is amean-preserving spread of H Thus the senderrsquos expected utility is smaller undersignal s than under the upper-censorship signal with cutoff ωlowast as follows from

983133 1

0

V (m)dH(m) le983133 1

0

maxV (m) V (mωlowast)dH(m)

le983133 1

0

maxV (m) V (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mlowast(ωlowast))dF (m)

where the second line holds because maxV (m) V (mωlowast) is convex in m and F isa mean-preserving spread of H the third line holds by (8) and the last line holds by

(7) and by983125 1

ωlowast(mminusmlowast(ωlowast))dF (m) = 0

Conversely suppose that V is not S-shaped Then there exist 0 le m1 lt m2 lem3 lt m4 le 1 such that V primeprime(m) lt 0 for m isin (m1m2) V

primeprime(m) = 0 for m isin [m2m3]and V primeprime(m) gt 0 for m isin (m3m4) because V primeprime is continuous by assumption It isstraightforward to show that there exists 983141mlowast isin (m1m2) (sufficiently close to m2) and983141ωlowast isin (m3m4) such that

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) lt V (m) for all m isin (983141ωlowastm4]

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) gt V (m) for all m isin [m1 983141ωlowast] 983141mlowast

Moreover there exists a continuous distribution F such that the support of F is asubset of [m1m4] and 983141mlowast = E[ω|ω le 983141ωlowast] Using a chain of inequalities similar to theone above it is easy to show that lower-censorship signal 983141slowast with cutoff 983141ωlowast is optimalFurthermore it is uniquely optimal as both inequalities hold with equalities only forsignal 983141slowast This in turn implies that any upper-censorship signal is suboptimal

24 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Proof of Lemma 1 Notice that under assumption (A1) we have

V primeprime(m) = gprime(m) + ρg(m) for m isin [0 1]

Recall that V is S-shaped if and only if V primeprime(m) = gprime (m) + ρg (m) is single-crossingfrom above Using Proposition 1 in Quah and Strulovici (2012) it is easy to showthat V primeprime(m) is single-crossing from above for all ρ isin R if and only if gprime (m)g (m) isnonincreasing in m (that is ln g(m) is concave)

Proof of Theorem 2 Immediate by Theorem 1 and Lemma 1

Proof of Theorem 3 Part (i) We fix t and σ and vary ρ Without loss ofgenerality let t = 0 and σ = 1 (so that Gtσ = G) Consider a signal that induces thedistribution H of posterior mean m This signal implements the receiverrsquos interimutility U given by

U(r) =

983133 1

r

(mminus r)dH(m) for r isin [0 1]

which holds because the receiver acts if and only if m ge r Since types r lt 0 alwayschoose a = 1 and types r gt 1 always choose a = 0 we can exclude them fromconsideration and assume that supp(G) = [0 1] The senderrsquos expected utility is then

983133 1

0

V (m)dH(m) =

983133 1

0

983133 m

0

(1 + ρ(mminus r))dG(r)dH(m)

=

983133 1

0

983133 1

r

(1 + ρ(mminus r))dH(m)dG(r)

=

983133 1

0

(1minusH(r) + ρU(r))dG(r)

Consider ρ2 gt ρ1 Suppose by contradiction that the corresponding optimal upper-censorship signals s2 and s1 are such that s1 is strictly more informative than s2 LetU2 and U1 be the receiverrsquos interim utilities implemented by s2 and s1 Since thesender prefers signal s2 under ρ2 and signal s1 under ρ1 we have

983133 1

0

(1minusH2 (r) + ρ2U2 (r)) dG (r) ge983133 1

0

(1minusH1 (r) + ρ2U1 (r)) dG (r)

983133 1

0

(1minusH1 (r) + ρ1U1 (r)) dG (r) ge983133 1

0

(1minusH2 (r) + ρ1U2 (r)) dG (r)

Summing up these inequalities gives

(ρ2 minus ρ1)

983133 1

0

(U2 (r)minus U1 (r)) dG (r) ge 0

leading to a contradiction because U1(r) ge U2(r) for all r with strict inequality forsome r given that s1 is strictly more informative than s2

CENSORSHIP AS OPTIMAL PERSUASION 25

Part (ii) We fix ρ isin R and σ = 1 and vary t The indirect utility of the senderconditional on m isin [0 1] is given by

Vt(m) =

983133

rlem

(1+ρ(mminusr))gt1(r)dr =

983133

rlemminust

(1+ρ(mminustminusr))g(r)dr = V (mminust) (9)

By Theorem 1 when t0 = 0 an optimal cutoff ωlowast0 satisfies

V (m) le V (mlowast0) + V prime(mlowast

0)(mminusmlowast0) for m isin [ωlowast

0 1]

When V is S-shaped and t1 lt 0 it is easy to see that

V (mminus t1) le V (mlowast0 minus t1) + V prime(mlowast

0 minus t1)(mminusmlowast0) for m isin [ωlowast

0 1]

which implies that when t = t1 the above inequality changes its sign at some ωlowast1

that satisfies ωlowast1 le ωlowast

0 It follows from (9) that ωlowast1 is an optimal cutoff for t = t1 and

as we established above ωlowast1 le ωlowast

0

Part (iii) We fix ρ ge 0 and t = 0 and vary σ The indirect utility of the senderconditional on m isin [0 1] is given by

V ρσσ(m) =

983133

rlem

9830591 +

ρ

σ(mminus r)

983060dG0σ(r)

=

983133

rlem

9830591 +

ρ

σ(mminus r)

983060g

983061r minus (1minus σ)τ

σ

9830621

σdr

=

983133

rlemminus(1minusσ)τσ

9830611 + ρ

983061mminus (1minus σ)τ

σminus r

983062983062g(r)dr

= Vρ1

983061mminus (1minus σ)τ

σ

983062 (10)

Let σ0 = 1 σ1 lt 1 and ρ1 = ρσ1 To compare the optimal cutoffs under parameters(ρ σ0) and (ρ σ1) we consider two steps first comparing the cutoffs between (ρ σ0)and (ρ1 σ1) and then comparing the cutoffs between (ρ1 σ1) and (ρ σ1)

First we compare an optimal cutoff ωlowast0 under (ρ σ0) = (ρ 1) and an optimal cutoff

ωlowast1 under (ρ1 σ1) By Theorem 1 ωlowast

0 satisfies

Vρ1(m) le Vρ1(mlowast0) + V prime

ρ1(mlowast0)(mminusmlowast

0) for m isin [ωlowast0 1]

When V is increasing (implied by ρ ge 0) and S-shaped and σ1 lt 1 it is easy to seethat for all m isin [ωlowast

0 1]

Vρ1

983061mminus (1minus σ1)τ

σ1

983062le Vρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062+ V prime

ρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062(mminusmlowast

0)

which implies that when σ = σ1 lt 1 the above inequality changes its sign at someωlowast1 that satisfies ωlowast

1 le ωlowast0 It follows from (10) that ωlowast

1 is an optimal cutoff under(ρσ σ1) = (ρ1 σ1) and as we established above ωlowast

1 le ωlowast0

26 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Second we compare the optimal cutoff ωlowast1 under (ρ1 σ1) and the optimal cutoff ωlowast

2

under (ρ σ1) Because ρ lt ρ1 = ρσ1 by part (i) we have ωlowast2 le ωlowast

1

Combining ωlowast1 le ωlowast

0 and ωlowast2 le ωlowast

1 we obtain ωlowast2 le ωlowast

0

Proof of Theorem 1prime Part (A) The proof is analogous to that of Theorem 1 andis thus omitted

Part (B) The only if statement follows from that in part (A) (see the proof of Theorem1) and the fact that we can always ensure that an optimal stochastic censorship(ωlowast qlowast) is deterministic (so qlowast isin 0 1) by the choice of the support and distributionof ω Thus we only need to prove the if statement

Let F be a distribution of ω with support on n values 0 lt ω0 lt lt ωnminus1 lt 1 Thisis without loss of generality because if 0 andor 1 were in the support of F we couldredefine the state as ω = (ω + ε)(1 + 2ε) and the type as r = (r + ε)(1 + 2ε)

Let X sub ω1 ωnminus1 identify a set of cutoffs defining a monotone partition for eachωi isin X the receiver is informed whether ω lt ωi or ω ge ωi Note that we exclude ω0

from the set of cutoffs because ω ge ω0 always holds

Note that X is deterministic upper censorship if and only if there exists a censorshipthreshold ωlowast isin [0 1] such that for each i = 1 nminus 1 ωi isin X if and only if ωi lt ωlowastthat is we can assume that qlowast = 0 without loss of generality

Let VX be the senderrsquos expected utility when the monotone partition is given by X

If n = 2 then there are only two partitions X = ω1 (full disclosure) and X = empty(no disclosure) Both are upper censorships so part (B) holds trivially

If n = 3 then there are four monotone partitions X = ω1ω2 (full disclosure)X = ω2 X = ω1 and X = empty (no disclosure) Only X = ω2 is not uppercensorship The next lemma proves that X = ω2 cannot be uniquely optimal whenV is S-shaped thus proving the optimality of upper censorship for n = 3

Denote

mjk = E983045ω|ω isin [ωjωk]

983046

For convenience of notation we write mj = ωj

Lemma 3 Let V be S-shaped with an inflexion point τ Let ω be a discrete randomvariable with support on m0m1m2 and probabilities p0 p1 p2 where 0 lt m0 ltm1 lt m2 lt 1 Then

(i) if τ le m0 then Vm2 le Vempty

(ii) if τ ge m1 then Vm2 le Vm1m2

(iii) if m0 lt τ lt m1 then Vm2 le maxVempty Vm1m2

CENSORSHIP AS OPTIMAL PERSUASION 27

Proof Define p01 = p0 + p1 and p02 = p01 + p2 = 1 Observe that

m01 =p0m0 + p1m1

p0 + p1and m02 =

p0m0 + p1m1 + p2m2

p0 + p1 + p2

We thus have

Vempty = p02V (m02) = (p01 + p2)V

983061p01m01 + p2m2

p01 + p2

983062

Vm2 = p01V (m01) + p2V (m2) = (p0 + p1)V

983061p0m0 + p1m1

p0 + p1

983062+ p2V (m2)

Vm1m2 = p0V (m0) + p1V (m1) + p2V (m2)

Clearly

m0 lt m01 lt m1 and m0 lt m02 lt m2

Therefore

Vm2 le Vempty lArrrArrp01

p01 + p2V (m01) +

p2p01 + p2

V (m2) le V

983061p01

p01 + p2m01 +

p2p01 + p2

m2

983062= V (m02)

(11)

and

Vm2 le Vm1m2 lArrrArr

V

983061p0

p0 + p1m0 +

p1p0 + p1

m1

983062le p0

p0 + p1V (m0) +

p1p0 + p1

V (m1)(12)

Suppose that τ le m01 so V is concave on [m01m2] Then (11) holds by Jensenrsquosinequality In particular it holds for τ le m0 lt m01 thus proving part (i)

Suppose that τ ge m1 so V is convex on [m0m1] Then (12) holds by Jensenrsquosinequality thus proving part (ii)

Finally suppose that m01 lt τ lt m1 Let (τ V (τ)) lie below (or on) the straight lineA connecting (m0 V (m0)) and (m2 V (m2)) as illustrated on Figure 2 By convexityof V (m) for m lt τ it must be the case that V (m01) lies below (or on) the dashedline B By concavity of V (m) for m gt τ it must be the case that V (m1) lies above(or on) the dashed line C Consequently (12) is satisfied

Alternatively let (τ V (τ)) lie above the straight line A connecting (m0 V (m0)) and(m2 V (m2)) as illustrated on Figure 3 If Vm2 le Vm1m2 then the proof of part(iii) is complete So suppose that Vm2 gt Vm1m2 so (12) is violated

By convexity of V (m) for m lt τ it must be the case that V (m01) lies below (or on)the line B Since (12) is violated V (m01) must lie between the lines A and B Recallthat m01 lt m02 lt m2 Observe that V (m02) must be in the shaded area on Figure 3Indeed if m02 lt τ then by convexity of V (m) for m lt τ it must be the case thatV (m02) lies between (or on) the lines B and D If m02 ge τ then by concavity of

28 KOLOTILIN MYLOVANOV ZAPECHELNYUK

τ m2m0

A C

B

m01 m1

V (m0) V (τ)

V (m2)

Figure 2 The case where (τ V (τ)) lies below (or on) the straight lineconnecting (m0 V (m0)) and (m2 V (m2))

τ m2m0

A

C

B

m01

V (m0)

V (τ)

V (m2)

D

V (m01)

Figure 3 The case where (τ V (τ)) lies above the straight line con-necting (m0 V (m0)) and (m2 V (m2))

V (m) for m gt τ it must be the case that V (m02) lies above (or on) line C It followsthat (11) is satisfied

We now adapt Lemma 3 to prove the optimality of upper censorship for n ge 4

CENSORSHIP AS OPTIMAL PERSUASION 29

Fix a monotone partition X sub ω1 ωnminus1 that is not upper censorship that isthere exists i isin 2 nminus 1 such that

ωiminus1 ∕isin X and ωi isin X (13)

Denote ωn = 1 Define j and k as follows If X cap ω1 ωiminus2 ∕= empty let ωj be thelargest element inX smaller than i otherwise let ωj = ω0 IfXcapωi+1 ωnminus1 ∕= emptylet ωk be the smallest element in X greater than i otherwise let ωk = ωn = 1 Define

m0 = ωj m1 = ωiminus1 m2 = ωi

and

p0 =Pr[ω isin [ωjωiminus1)]

Pr[ω isin [ωjωk)] p1 =

Pr[ω isin [ωiminus1ωi)]

Pr[ω isin [ωjωk)] p2 =

Pr[ω isin [ωiωk)]

Pr[ω isin [ωjωk)]

To apply Lemma 3 we identify Xωi with empty X with m2 and X cup ωiminus1 withm1m2 By Lemma 3 VX le maxVXωi VXcupωiminus1 Thus replacing X withXωi if VX le VXωi and with X cup ωiminus1 otherwise weakly improves the senderrsquosexpected utility Repeatedly applying such replacements yields upper censorshipIndeed this procedure terminates after a finite number of replacements because byLemma 3 for ωj ge τ we replace X with Xωi and for ωiminus1 le τ we replace Xwith X cup ωiminus1

We thus have shown that when V is S-shaped when optimizing over the set ofmonotone partitions we can restrict attention to deterministic upper censorship Itremains to show that an optimal deterministic censorship cutoff ωlowast

d coincides with anoptimal stochastic censorship cutoff ωlowast from part (A) To see this it is straightforwardto show that

V lowast(ωlowast qlowast) =

983133

[0ωlowast)

V (ω)dF (ω) + V (ωlowast)qlowast Pr[ω = ωlowast]

+ V (mlowast(ωlowast qlowast))((1minus qlowast) Pr[ω = ωlowast] + Pr[ω gt ωlowast]) (14)

is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order Since the two de-terministic censorship cutoff pairs closest to (ωlowast qlowast) in the Blackwell informativenessorder are (ωlowast 0) and (ωlowast 1) we obtain ωlowast

d = ωlowast

Proof of Theorem 2prime Immediate by Theorem 1prime and Lemma 1

Proof of Theorem 3prime The proof for an optimal signal is analogous to that ofTheorem 3 and is thus omitted The same applies to part (i) for an optimal monotonepartition Thus it remains to prove part (ii) for an optimal monotone partition

We fix ρ and σ = 1 and vary t continuously If ωlowast isin supp(F ) then an optimalmonotone partition is an optimal signal because all qlowast isin [0 1] are optimal Thuspart (ii) in (B) holds by part (ii) in (A) If ωlowast isin supp(F ) and qlowast isin (0 1) then either(ωlowast

d qlowastd) = (ωlowast 0) or (ωlowast

d qlowastd) = (ωlowast 1) is optimal because the senderrsquos expected utility

V lowast given by (14) is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order

30 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Therefore it suffices to consider values of t at which the sender is indifferent between(ωlowast 0) and (ωlowast 1) and to show that the sender strictly prefers (ωlowast 1) to (ωlowast 0) at aslightly higher t

V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) =rArr dV lowastt (ω

lowast 0)

dtle dV lowast

t (ωlowast 1)

dt (15)

The condition V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) can be written as

Vt(mlowast(ωlowast 0)) Pr[ω ge ωlowast] = Vt(ω

lowast) Pr[ω = ωlowast] + Vt(mlowast(ωlowast 1)) Pr[ω gt ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)Pr[ω = ωlowast]

Pr[ω ge ωlowast]+ Vt(m

lowast(ωlowast 1))Pr[ω gt ωlowast]

Pr[ω ge ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + Vt(mlowast(ωlowast 1))

mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Analogously the conditiondV lowast

t (ωlowast0)dt

le dV lowastt (ωlowast1)dt

can be written as

V primet (m

lowast(ωlowast 0)) ge V primet (ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + V primet (m

lowast(ωlowast 1))mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Thus (15) states that if Vt(mlowast(ωlowast 0)) is equal to the convex combination of Vt(ω

lowast)and Vt(m

lowast(ωlowast 1)) then the derivative V primet (m

lowast(ωlowast 0)) is greater than the convex com-bination of the derivatives V prime

t (ωlowast) and V prime

t (mlowast(ωlowast 1)) When V is S-shaped it is easy

to see that this property is satisfied

Proof of Lemma 2 Observe that a signal induced by any set X sub C of permittedmedia outlets induces a monotone partition of [0 1] and thus HC sub HM Indeed letcX(ω) and cX(ω) be the media outlets in X that are the closest to ω from above andfrom below

cX(ω) = sup983059c isin X c le ω

983052cup 0

983060and cX(ω) = inf

983059c isin X c gt ω cup 1

983060

By observing the media outlets in X each citizen is informed that ω isin [cX(ω) cX(ω))

Conversely let the set C of media outlets be c0 lt middot middot middot lt cn Without loss of generalityassume that C contains the two uninformative media outlets c0 = 0 and cn = 1 which(almost) always endorse and criticize action a = 1 respectively Then the distributionF of the state ω is a discrete distribution whose support is Ω = ω0 ωnminus1 whereωi = E[θ|θ isin [ci ci+1)] for i isin 0 nminus 1 Consider a monotone partitional signalrepresented by a nondecreasing function ξ [0 1] rarr R For each realization m of thissignal that occurs with a strictly positive probability define

x(m) = min983153ω isin Ω ξ(ω) = m

983154and x(m) = max

983153ω isin Ω ξ(ω) = m

983154

This signal induces a monotone partition of the finite set Ω This partition of Ω canbe implemented by a censorship policy that prohibits a media outlet c isin C if andonly if c isin (x(m) x(m)) for some realization m

CENSORSHIP AS OPTIMAL PERSUASION 31

References

Alonso R and O Camara (2016a) ldquoPersuading Votersrdquo American EconomicReview 106 3590ndash3605

(2016b) ldquoPolitical Disagreement and Information in Electionsrdquo Games andEconomic Behavior 100 390ndash412

Bagnoli M and T Bergstrom (2005) ldquoLog-Concave Probability and Its Ap-plicationsrdquo Economic Theory 26 445ndash469

Baron D P (2006) ldquoPersistent Media Biasrdquo Journal of Public Economics 90(1)1ndash36

Besley T and A Prat (2006) ldquoHandcuffs for the Grabbing Hand MediaCapture and Government Accountabilityrdquo American Economic Review 96(3) 720ndash736

Blackwell D (1953) ldquoEquivalent Comparisons of Experimentsrdquo Annals of Math-ematical Statistics 24 265ndash272

Chan J and W Suen (2008) ldquoA Spatial Theory of News Consumption andElectoral Competitionrdquo Review of Economic Studies 75 699ndash728

Chiang C-F and B Knight (2011) ldquoMedia Bias and Influence Evidence fromNewspaper Endorsementsrdquo Review of Economic Studies 78 795ndash820

Duffie D P Dworczak and H Zhu (2017) ldquoBenchmarks in Search MarketsrdquoJournal of Finance 72 1983ndash2044

Dworczak P (2017) ldquoMechanism Design with Aftermarkets Cutoff Mecha-nismsrdquo mimeo

Dworczak P and G Martini (2019) ldquoThe Simple Economics of Optimal Per-suasionrdquo Journal of Political Economy 127 993ndash2048

Edmond C (2013) ldquoInformation Manipulation Coordination and RegimeChangerdquo Review of Economic Studies 80 1422ndash1458

Egorov G S Guriev and K Sonin (2009) ldquoWhy Resource-poor DictatorsAllow Freer Media A Theory and Evidence from Panel Datardquo American PoliticalScience Review 103 645ndash668

Gehlbach S and K Sonin (2014) ldquoGovernment Control of the Mediardquo Journalof Public Economics 118 163ndash171

Gentzkow M and E Kamenica (2016) ldquoA Rothschild-Stiglitz Approach toBayesian Persuasionrdquo American Economic Review Papers amp Proceedings 106597ndash601

Gentzkow M and J Shapiro (2006) ldquoMedia Bias and Reputationrdquo Journalof Political Economy 114(2) 280ndash316

(2010) ldquoWhat Drives Media Slant Evidence from US Daily NewspapersrdquoEconometrica 78(1) 35ndash71

Ginzburg B (2019) ldquoOptimal Information Censorshiprdquo Journal of EconomicBehavior and Organization 163 377ndash385

Goldstein I and Y Leitner (2018) ldquoStress Tests and Information DisclosurerdquoJournal of Economic Theory 177 34ndash69

32 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Guo Y and E Shmaya (2019) ldquoThe Interval Structure of Optimal DisclosurerdquoEconometrica 87 653ndash675

Johnson J P and D P Myatt (2006) ldquoOn the Simple Economics of Adver-tising Marketing and Product Designrdquo American Economic Review 96 756ndash784

Kamenica E and M Gentzkow (2011) ldquoBayesian Persuasionrdquo American Eco-nomic Review 101 2590ndash2615

Kolotilin A (2015) ldquoExperimental Design to Persuaderdquo Games and EconomicBehavior 90 215ndash226

(2018) ldquoOptimal Information Disclosure A Linear Programming Ap-proachrdquo Theoretical Economics 13 607ndash636

Kolotilin A and H Li (2019) ldquoRelational Communicationrdquo mimeoKolotilin A M Li T Mylovanov and A Zapechelnyuk (2015) ldquoPer-suasion of a Privately Informed Receiverrdquo mimeo

Kolotilin A T Mylovanov A Zapechelnyuk and M Li (2017) ldquoPer-suasion of a Privately Informed Receiverrdquo Econometrica 85 1949ndash1964

Kolotilin A and A Wolitzky (2019) ldquoAssortative Information Disclosurerdquomimeo

Lewis T R and D Sappington (1994) ldquoSupplying Information to FacilitatePrice Discriminationrdquo International Economic Review 35 309ndash327

Lorentzen P (2014) ldquoChinarsquos Strategic Censorshiprdquo American Journal of Polit-ical Science 58 402ndash414

Mullainathan S and A Shleifer (2005) ldquoThe Market for Newsrdquo AmericanEconomic Review 95(4) 1031ndash1053

Orlov D P Zryumov and A Skrzypach (2018) ldquoDesign of Macro-prudentialStress Testsrdquo mimeo

Ostrovsky M and M Schwarz (2010) ldquoInformation Disclosure and Unravelingin Matching Marketsrdquo American Economic Journal Microeconomics 2 34ndash63

Prat A and D Stromberg (2013) ldquoThe Political Economy of Mass MediardquoAdvances in Economics and Econometrics 2 135ndash187

Quah J and B Strulovici (2012) ldquoAggregating the Single Crossing PropertyrdquoEconometrica 80 2333ndash2348

Rayo L and I Segal (2010) ldquoOptimal Information Disclosurerdquo Journal ofPolitical Economy 118 949ndash987

Romanyuk G and A Smolin (2019) ldquoCream Skimming and Information Designin Matching Marketsrdquo American Economic Journal Microeconomics 11 250ndash276

Suen W (2004) ldquoThe Self-Perpetuation of Biased Beliefsrdquo Economic Journal 114377ndash396

Yamashita T (2018) ldquoOptimal Public Information Disclosure by Mechanism De-signerrdquo mimeo

Zapechelnyuk A (2019) ldquoOptimal Quality Certificationrdquo American EconomicReview Insights forthcoming

CENSORSHIP AS OPTIMAL PERSUASION 15

That is the sender optimally discloses more information when she is less biasedrelative to the receiver (the alignment parameter ρ is greater) when the receiver ismore reluctant to accept the proposal (the shift parameter t is greater) and underScenario A when the sender is more uncertain about the receiverrsquos type (the stretchparameter σ is greater)

Remark 2 Theorem 1prime and Theorem 2prime show that upper censorship emerges as anoptimal persuasion strategy under the same conditions (V is S-shaped in Theorem1prime and g is log-concave in Theorem 2prime) regardless of whether the sender can choosean arbitrary signal or is constrained to choose a monotone partition It may betempting to conjecture that if the sender was allowed to choose an arbitrary partition(not necessarily monotone) the same result would hold This however is not true

For illustration suppose that the sender wishes to maximize the probability of a = 1and the receiver has type r = 1

2with certainty16 Let the state ω take values in

0 14 1 with probabilities (1

6 23 16) so the prior mean is 1

3

An optimal signal is stochastic upper censorship with cutoff pair (ωlowast qlowast) = (14 12)

which sends the pooling message with certainty at ω = 1 and with probability 1minusqlowast =12at ω = 1

4 and reveals the state otherwise The pooling message induces the posterior

mean m = 12 leading to a = 1 all other messages induce posterior means below 1

2

leading to a = 0 Under the optimal signal action a = 1 is chosen with probability23middot 12+ 1

6= 1

2

An optimal monotone partition is deterministic upper censorship that fully revealsthe state where only ω = 1 leads to a = 1 Under this partition action a = 1 ischosen with probability 1

6

An optimal partition pools the extreme states ω = 0 and ω = 1 and reveals theintermediate state ω = 1

4 The pooling message induces the posterior mean m = 1

2

leading to a = 1 This optimal partition is nonmonotone Under this partitionaction a = 1 is chosen with probability 1

6+ 1

6= 1

3 which is strictly better than the

probability of 16obtained under the optimal monotone partition

5 Application to Media Censorship

In this section we apply our results to the problem of media censorship by thegovernment In the modern world people obtain information about the governmentrsquosstate through various media sources such as television newspapers and internetblogs Without the media most people would not know what policies and reformsthe government pursues and how effective they are Media outlets have differentpositions on the political spectrum and differ substantially in how they select andpresent facts to cover the same news People choose their sources of information based

16In this case V (m) = 1 if m ge 12 and V (m) = 0 if m lt 1

2 Observe that the graph of V is

S-shaped and it can be approximated by a continuously differentiable S-shaped function

16 KOLOTILIN MYLOVANOV ZAPECHELNYUK

on their political ideology and socioeconomic status This information is valuable forsignificant individual decisions on migration investment and voting to name a fewIndividuals may not fully internalize externalities that their decisions impose on thesociety Likewise the government may not have the societyrsquos best interest at heartTo further its goals the government then wishes to influence individual decisions bymanipulating the information through media In autocracies and countries with weakchecks and balances the government has power to censor the media content

The governmentrsquos problem of media censorship can be represented as the persua-sion problem in Section 2 We apply our results to provide conditions for the op-timality of upper-censorship policies that censor all media outlets except the mostpro-government ones Furthermore we interpret our comparative statics results

51 Setup There is a continuum of heterogeneous citizens indexed by r isin [0 1]distributed with G Each citizen chooses between a = 0 and a = 1 The utility of acitizen of type r is given by

u(θ r ar a) = (θ minus r)ar + κ(θ r a)

where ar isin 0 1 denotes the citizenrsquos own action a =983125ardG(r) denotes the aggre-

gate action in the society θ isin [0 1] captures an unobserved benefit from action 1 ascompared to action 0 and κ captures the impact of the aggregate action a on thecitizenrsquos utility The term (θ minus r)ar is a private surplus of a citizen of type r Theterm κ(θ r a) is an externality because for a citizen of type r it is optimal to ignorethis term and choose ar = 1 if and only if θ ge r

There is a government which is concerned with a weighted average of the social utilityand the governmentrsquos intrinsic benefit from the aggregate action For a given θ thegovernmentrsquos utility is given by

983133 1

0

ν(θ r ar a)dG(r) + δγ(θ a)

The term ν captures a citizenrsquos utility from the governmentrsquos perspective We allow νto be different from u to reflect paternalistic or other concerns The term γ capturesthe governmentrsquos intrinsic benefit from the aggregate action The parameter δ ge 0captures the weight of the aggregate action in the governmentrsquos utility

Let T be a distribution of the random variable θ We assume that distributions G andT are independent and admit continuously differentiable and strictly positive densi-ties We also assume that κ ν and γ are linear in θ and continuously differentiablein r and a Furthermore to simplify interpretations we assume that ν and γ arenon-decreasing in θ and a so that for the government a high θ is a good news anda higher aggregate action is preferable

Citizens obtain information about the unobservable benefit θ through media outletsEach media outlet is identified by its editorial policy c isin [0 1] and it endorses action

CENSORSHIP AS OPTIMAL PERSUASION 17

a = 1 if θ ge c and criticizes it if θ lt c17 A set C of media outlets is a finite subsetof [0 1]18

The governmentrsquos censorship policy is a set of the media outlets X sub C that arepermitted to broadcast so the rest of the media outlets are censored

The timing is as follows First the government chooses a set X sub C of permittedmedia outlets Second state θ is realized and each permitted media outlet endorsesor criticizes action a = 1 according to its editorial policy Finally each citizen observesmessages from all permitted media outlets updates his beliefs about θ and choosesan action

52 Discussion We now discuss interpretations of the key components of the mediacensorship application As in Gehlbach and Sonin (2014) there can be various inter-pretations of the citizenrsquos action a = 1 such as voting for the government supportinga governmentrsquos policy or taking an individual decision that benefits the governmentA citizenrsquos type r can be interpreted as his ideological position or preference param-eter A citizen who is more supportive of the government has a smaller r

A media outlet with a higher editorial policy c can be interpreted as being less loyalto the government because it criticizes the government on a larger set of states Aneditorial policy c isin C can therefore represent a slant or political bias of the outletagainst the government and can be empirically measured as the frequency with whichthe outlet uses anti-government language Gentzkow and Shapiro (2010) constructsuch a slant index for US newspapers Empirical findings of their paper suggestthat the editorial policies of media outlets are driven by reader preferences justifyingour assumption of the existence of a large variety of editorial policies19 As in Suen(2004) Chan and Suen (2008) and Chiang and Knight (2011) the assumption ofthe binary media reports that communicate only whether the state θ is above somestandard c can be justified by a cursory readerrsquos preference for simple messages suchas positive or negative opinions and yes or no recommendations

The governmentrsquos censorship of media outlets can take various forms For examplethe government can ban access to internet sites withdraw licenses disrupt financingconfiscate print materials and equipment discredit media providers and even arresteditors and journalists using broadly formulated legislation on combating extremismIn some countries the government can exercise direct control over media editorialpolicies either through state ownership or administrative pressure

17The tie-breaking in the event of θ = c is unimportant as θ is a continuous random variable18We discuss the case of a continuum of media outlets in Section 5519Theoretical literature has explored the determinants of media slant of an outlet driven by its

citizens (Mullainathan and Shleifer 2005 Gentzkow and Shapiro 2006 and Chan and Suen 2008)and its owners (Baron 2006 and Besley and Prat 2006)

18 KOLOTILIN MYLOVANOV ZAPECHELNYUK

53 Formulating Media Censorship as Persuasion We now show that the me-dia censorship problem can be formulated as a linear persuasion problem in whichthe government is a sender and a representative citizen is a receiver

In our model the citizensrsquo and the governmentrsquos utilities are linear in θ There-fore given any information from the media outlets the utilities depend only on theposterior mean of θ

Consider an arbitrary posterior mean of θ denoted by m isin [0 1] Each citizen of typer chooses ar = 1 if and only if r le m The externality term κ plays no role in thisdecision Therefore the aggregate action a is simply the mass of all citizens whosetypes do not exceed m so a = G(m)

Next using the citizensrsquo optimal behavior we derive the governmentrsquos expected utilityconditional on the posterior mean m

V (m) = E983063983133 1

0

ν(θ r ar G(m))dG(r) + δγ(θ G(m))

983055983055983055983055m983064

=

983133 1

0

ν(m r1rlem G(m))dG(r) + δγ(mG(m)) (6)

This is the governmentrsquos indirect utility As in Section 2 the government now choosesa signal which is informative about the state to maximize its expected utility How-ever in contrast to Section 2 here the government is restricted to signals that areimplementable by a subset of a given set of media outlets

Next we show that with an appropriate definition of the state the restriction tosignals implementable by a subset of a given set of media outlets can be formulatedas a restriction to monotone partitional signals as in Scenario B in Section 2 Thuswe will be able to use our results in Section 4 to find the optimal censorship policies

Recall that the information about θ is only available through the media outlets Let ωdenote a random variable equal to the conditional expectation of θ given the messagesof all media outlets in C Note that as C is finite ω is a discrete random variablewith values in [0 1] Let F denote the distribution of ω From now on we treat therandom variable ω as the state and denote by s a signal as defined in Section 2

Let HM denote the set of all distributions of the posterior mean state induced byall monotone partitional signals and let HC denote the set of all distributions ofthe posterior mean state induced by all finite subsets of the set C of media outletsObserve that any set of media outlets induces a monotone partitional signal thatis HC sub HM We now show that every outcome implementable by an arbitrarymonotone partition in HM is also implementable by a subset of media outlets

Lemma 2 HC = HM

For illustration suppose that there is only one media outlet with the editorial policyc isin (0 1) There are two observable events θ ge c and θ lt c In each realized event

CENSORSHIP AS OPTIMAL PERSUASION 19

the citizens compute the posterior expectation of θ which we will call ω So in thisexample the distribution F of ω has two mass points one for each of the two events

Given these two mass points every monotone partition of [0 1] induces one of twopossible distributions of the posterior mean state (i) the mass points revealed and (ii)the mass points pooled Observe however that these distributions are implementableby permitting and censoring the media outlet respectively

A censorship policy is upper censorship if it censors all sufficiently disloyal mediaoutlets Specifically there exists a cutoff clowast isin C and an indicator qlowast isin 0 1 suchthat all media outlets whose editorial policies are below clowast are permitted all mediaoutlets whose editorial policies are above clowast are censored and clowast is permitted if andonly if qlowast = 1 That is X = c isin C c le clowast if qlowast = 1 and X = c isin C c lt clowastif qlowast = 0 Note that the full censorship policy clowast = 0 and qlowast = 0 (where all mediaoutlets are censored) and the free media policy clowast = 1 and qlowast = 1 (where all mediaoutlets are permitted) are the two extreme upper-censorship policies

As shown above the media censorship problem is equivalent to the linear persuasionproblem in which the sender is restricted to monotone partitional signals and thesenderrsquos indirect utility V is given by (6) We thus apply Theorem 1prime(B) to obtainthe following result

Theorem 1primeprime If V is S-shaped then an optimal censorship policy is upper censorship

To illustrate Theorem 1primeprime suppose the government is interested only in the aggregateaction so that ν = 0 and γ depends only on the aggregate action a so V (m) =γ(G(m)) Thus an upper-censorship policy is optimal if the composition functionγ(G(middot)) is S-shaped For example this condition holds if the government is interestedin reaching a certain approval threshold (eg a simple majority) so that γ is a stepfunction This condition also holds if γ is S-shaped and G is uniform or S-shapedwith the same inflection point as γ20

54 Canonical Case We now impose more structure on the utilities to obtain asharper result for the optimality of upper censorship and to perform a comparativestatics analysis Analogously to assumption (A1) we assume that the governmentrsquosutility is a weighted average of the citizensrsquo utility and their aggregate action

983133 1

0

u(θ r ar a)dG(r) + δa

where u(θ r ar a) = (θ minus r)ar + ζ(r)a

(A2)

for some continuously differentiable function ζ Define

β =

983133 1

0

ζ(r)dG(r) + δ

20This condition holds in many special cases where γ and G are S-shaped even when γ and Ghave different inflection points

20 KOLOTILIN MYLOVANOV ZAPECHELNYUK

The term983125 1

0ζ(r)dG(r) is the governmentrsquos expected bias towards the citizensrsquo action

a = 1 due to the citizensrsquo externality and the term δ is the governmentrsquos intrinsicbias towards a greater aggregate action a Thus we interpret β as the governmentrsquosaggregate bias This decomposition of the bias allows for different interpretations whythe government is biased So β can be high because the government is too self-serving(high δ) or because the government is benevolent (low δ) but wishes to internalizestrong positive externalities of the citizens (high ζ(r)) To ease interpretations weassume that β gt 021

Under Assumption (A2) the governmentrsquos indirect utility V given by (6) becomes

V (m) =

983133 m

0

(mminus r)dG(r) + βa =

983133 m

0

v(m r)dG(r)

where

v(m r) = β

9830611 +

1

β(mminus r)

983062

This is the same as v given by Assumption (A1) with ρ = βminus1 up to rescaling by apositive constant Consequently we can apply Theorem 2prime(B) to obtain the followingresult

Theorem 2primeprime Let (A2) hold If the density g of citizens types is log-concave then anoptimal censorship policy is upper censorship

We now apply the comparative statics analysis presented in Section 32 Note thatthe upper-censorship policies are ordered according to the amount of informationtransmitted to the citizens in the sense of Blackwell (1953) A greater censorshipthreshold clowast means that more media outlets are permitted With this order in mindwe apply Theorem 3prime(B) to make a comparative statics analysis on the amount ofinformation that is optimally disclosed by the government

First the censorship cutoff is increasing in the alignment parameter ρ Recall thatthis is a reciprocal of the governmentrsquos bias ρ = βminus1 This means that the governmentoptimally discloses more information (the censorship cutoff is greater) when it is lessbiased (β is smaller) Intuitively as β decreases the government puts more weighton the citizensrsquo utility so it optimally endows the population with a higher utility bycensoring fewer media outlets and disclosing more information

Second the censorship cutoff is increasing in the magnitude t of the horizontal shiftof the density g A greater t corresponds to a greater opportunity cost of actiona = 1 for each citizen This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more difficultto persuade to take action a = 1 (parameter t is greater) Informally speaking topersuade the same type of the citizen the government needs to increase the posterior

21If β lt 0 then swapping the roles of a = 0 and a = 1 reverses the sign of the bias If β = 0then the governmentrsquos utility and the citizensrsquo private interests coincide so it is trivially optimal topermit all media outlets

CENSORSHIP AS OPTIMAL PERSUASION 21

mean But the expected posterior mean must be equal to the prior mean so it is notpossible to increase all posterior means Due to the log-concave shape of the densityof the citizensrsquo types this tradeoff is resolved by increasing the posterior mean of thepooling interval E[ω|ω ge clowast] This is done by shrinking the pooling interval [clowast 1]that is increasing the censorship cutoff clowast

Third when the number of media outlets is sufficiently large the censorship cutoffis increasing in the stretch factor σ of the density g around its peak τ (see Footnote15) A greater σ corresponds to a more dispersed distribution of the opportunitycost in the population This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more diverse inhow difficult they are to persuade Intuitively when the types are more spread outpersuading the same mass of types requires a greater posterior mean which leads toa higher censorship cutoff similarly to the comparative statics in t discussed above

55 Extensions and Open Questions Let us now consider a few extensions ofour model of media censorship

In our model the set of media outlets is exogenous and the governmentrsquos only in-strument is censorship We now consider three alternative ways of expanding thegovernmentrsquos instruments of influence

First suppose that the government is able not only to censor existing media outletsbut also to introduce new media outlets with chosen editorial policies This is equiv-alent to our censorship model where all media outlets in [0 1] are initially availableand the government can censor any subset of them When all media outlets in [0 1]are permitted the revealed information about θ (which we call ω) is θ itself Thusω = θ is a continuous random variable with distribution F = T So we can nowapply our results for the continuous state from Section 3 instead of those for the dis-crete state in Section 4 reaching the same conclusion about the optimality of uppercensorship

Second suppose that the government can garble information available from mediaoutlets That is the government is not restricted to monotone partitions it cancreate arbitrary signals about state ω for the citizens to observe This becomes ageneral persuasion problem and our Theorems 1prime(A) 2prime(A) and 3prime(A) apply

Third suppose that the government is able to restrict not only which media outletsare permitted but also how many media outlets each citizen can choose to observeIn this extension the citizens are not allowed to communicate with one another (oth-erwise they could share the information thus observing all permitted media outletsindirectly) This extension does not affect our results as long as each citizen is al-lowed to access at least one media outlet of his choice as in Chan and Suen (2008)Intuitively this is because each citizen categorizes the information from the mediaoutlets into ldquogood newsrdquo where a = 1 is optimal and ldquobad newsrdquo where a = 0 is

22 KOLOTILIN MYLOVANOV ZAPECHELNYUK

optimal Because the information from the media outlets induces a monotone parti-tion it means that ldquogood newsrdquo is separated from ldquobad newsrdquo by a threshold mediaoutlet that depends on the citizenrsquos type That is it is sufficient to observe a singlethreshold media outlet to distinguish ldquogood newsrdquo from ldquobad newsrdquo

In our model each citizenrsquos utility depends on the aggregate action a through theexternality term κ(θ r a) which does not affect the chosen action Let us relax thisassumption so that a citizenrsquos optimal choice can depend on a We can still writethe senderrsquos indirect utility V as a function of the posterior mean state and applyour results However V is no longer uniquely determined by the primitives of themodel It is now endogenous and depends on an equilibrium the citizens play aseach citizenrsquos optimal action now depends on what all citizens do in equilibrium Forexample given the same information about the state a citizen could prefer to choosea = 1 if and only if sufficiently many citizens choose the same action so a is largeenough This creates the problem of multiplicity of equilibria and as a consequencethe dependence of optimal censorship on equilibrium selection

As a side our media censorship problem can also be applied to spatial voting mod-els as in Chiang and Knight (2011) Consider a government party (p = G) and anopposition party (p = O) competing in an election If party p wins a voter withan ideological position r gets utility wp minus (r minus rp)

2 where wp is the quality or va-lence of party p and rp is the ideology or policy platform of party p Voters knowthe partiesrsquo ideologies and obtain information about the partiesrsquo qualities from allavailable media outlets Each voter supports the party that maximizes his expectedutility Within this context our analysis still applies because the voterrsquos utility dif-ference between the government and opposition parties is proportional to θminusr whereθ = (wG minus wO + r2O minus r2G)2(rO minus rG)

There are a few more extensions that can be relevant in applications First insteadof complete censorship of a media outlet there can be a cost of accessing it Forexample an international news channel can be censored by a local government butcitizens may still access it through VPN at some cost Second it can be costly forthe government to censor media outlets So another important question to answeris how a government may prioritize censoring Finally citizens can incur some costof following each media outlet While we have already mentioned that citizens gainno benefit from following more than one outlet it is entirely possible for them tostop watching news altogether if it is sufficiently uninformative These extensions arenontrivial and left for future research

Appendix

Proof of Theorem 1 Suppose that V is S-shaped For mωlowast isin [0 1] and mlowast(ωlowast) =E[ω|ω ge ωlowast] define

V (mωlowast) = V (mlowast(ωlowast)) + V prime(mlowast(ωlowast))(mminusmlowast(ωlowast)) (7)

CENSORSHIP AS OPTIMAL PERSUASION 23

Since V (m) is convex on [0 τ ] and concave on [τ 1] there exists ωlowast isin [0 τ ] thatsatisfies condition (3) that is

V (mωlowast) ge (le)V (m) for all m ge (lt)ωlowast (8)

This is true because V (ωlowastωlowast) minus V (ωlowast) is single-crossing from below To see thisconsider Figure 1 Observe that (3) can hold only if ωlowast is on the convex part of V andmlowast is on the concave part of V As the censorship cutoff ωlowast increases the posteriormean state of the pooling message mlowast(ωlowast) also increases But because V is concaveat mlowast(ωlowast) the dashed tangent line V (mωlowast) becomes flatter in m so it crosses thesolid line V (m) at a smaller m

Consider an arbitrary signal s Let H be the distribution of m = E[ω|s] inducedby signal s Since each signal is a garbling of the fully informative signal F is amean-preserving spread of H Thus the senderrsquos expected utility is smaller undersignal s than under the upper-censorship signal with cutoff ωlowast as follows from

983133 1

0

V (m)dH(m) le983133 1

0

maxV (m) V (mωlowast)dH(m)

le983133 1

0

maxV (m) V (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mlowast(ωlowast))dF (m)

where the second line holds because maxV (m) V (mωlowast) is convex in m and F isa mean-preserving spread of H the third line holds by (8) and the last line holds by

(7) and by983125 1

ωlowast(mminusmlowast(ωlowast))dF (m) = 0

Conversely suppose that V is not S-shaped Then there exist 0 le m1 lt m2 lem3 lt m4 le 1 such that V primeprime(m) lt 0 for m isin (m1m2) V

primeprime(m) = 0 for m isin [m2m3]and V primeprime(m) gt 0 for m isin (m3m4) because V primeprime is continuous by assumption It isstraightforward to show that there exists 983141mlowast isin (m1m2) (sufficiently close to m2) and983141ωlowast isin (m3m4) such that

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) lt V (m) for all m isin (983141ωlowastm4]

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) gt V (m) for all m isin [m1 983141ωlowast] 983141mlowast

Moreover there exists a continuous distribution F such that the support of F is asubset of [m1m4] and 983141mlowast = E[ω|ω le 983141ωlowast] Using a chain of inequalities similar to theone above it is easy to show that lower-censorship signal 983141slowast with cutoff 983141ωlowast is optimalFurthermore it is uniquely optimal as both inequalities hold with equalities only forsignal 983141slowast This in turn implies that any upper-censorship signal is suboptimal

24 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Proof of Lemma 1 Notice that under assumption (A1) we have

V primeprime(m) = gprime(m) + ρg(m) for m isin [0 1]

Recall that V is S-shaped if and only if V primeprime(m) = gprime (m) + ρg (m) is single-crossingfrom above Using Proposition 1 in Quah and Strulovici (2012) it is easy to showthat V primeprime(m) is single-crossing from above for all ρ isin R if and only if gprime (m)g (m) isnonincreasing in m (that is ln g(m) is concave)

Proof of Theorem 2 Immediate by Theorem 1 and Lemma 1

Proof of Theorem 3 Part (i) We fix t and σ and vary ρ Without loss ofgenerality let t = 0 and σ = 1 (so that Gtσ = G) Consider a signal that induces thedistribution H of posterior mean m This signal implements the receiverrsquos interimutility U given by

U(r) =

983133 1

r

(mminus r)dH(m) for r isin [0 1]

which holds because the receiver acts if and only if m ge r Since types r lt 0 alwayschoose a = 1 and types r gt 1 always choose a = 0 we can exclude them fromconsideration and assume that supp(G) = [0 1] The senderrsquos expected utility is then

983133 1

0

V (m)dH(m) =

983133 1

0

983133 m

0

(1 + ρ(mminus r))dG(r)dH(m)

=

983133 1

0

983133 1

r

(1 + ρ(mminus r))dH(m)dG(r)

=

983133 1

0

(1minusH(r) + ρU(r))dG(r)

Consider ρ2 gt ρ1 Suppose by contradiction that the corresponding optimal upper-censorship signals s2 and s1 are such that s1 is strictly more informative than s2 LetU2 and U1 be the receiverrsquos interim utilities implemented by s2 and s1 Since thesender prefers signal s2 under ρ2 and signal s1 under ρ1 we have

983133 1

0

(1minusH2 (r) + ρ2U2 (r)) dG (r) ge983133 1

0

(1minusH1 (r) + ρ2U1 (r)) dG (r)

983133 1

0

(1minusH1 (r) + ρ1U1 (r)) dG (r) ge983133 1

0

(1minusH2 (r) + ρ1U2 (r)) dG (r)

Summing up these inequalities gives

(ρ2 minus ρ1)

983133 1

0

(U2 (r)minus U1 (r)) dG (r) ge 0

leading to a contradiction because U1(r) ge U2(r) for all r with strict inequality forsome r given that s1 is strictly more informative than s2

CENSORSHIP AS OPTIMAL PERSUASION 25

Part (ii) We fix ρ isin R and σ = 1 and vary t The indirect utility of the senderconditional on m isin [0 1] is given by

Vt(m) =

983133

rlem

(1+ρ(mminusr))gt1(r)dr =

983133

rlemminust

(1+ρ(mminustminusr))g(r)dr = V (mminust) (9)

By Theorem 1 when t0 = 0 an optimal cutoff ωlowast0 satisfies

V (m) le V (mlowast0) + V prime(mlowast

0)(mminusmlowast0) for m isin [ωlowast

0 1]

When V is S-shaped and t1 lt 0 it is easy to see that

V (mminus t1) le V (mlowast0 minus t1) + V prime(mlowast

0 minus t1)(mminusmlowast0) for m isin [ωlowast

0 1]

which implies that when t = t1 the above inequality changes its sign at some ωlowast1

that satisfies ωlowast1 le ωlowast

0 It follows from (9) that ωlowast1 is an optimal cutoff for t = t1 and

as we established above ωlowast1 le ωlowast

0

Part (iii) We fix ρ ge 0 and t = 0 and vary σ The indirect utility of the senderconditional on m isin [0 1] is given by

V ρσσ(m) =

983133

rlem

9830591 +

ρ

σ(mminus r)

983060dG0σ(r)

=

983133

rlem

9830591 +

ρ

σ(mminus r)

983060g

983061r minus (1minus σ)τ

σ

9830621

σdr

=

983133

rlemminus(1minusσ)τσ

9830611 + ρ

983061mminus (1minus σ)τ

σminus r

983062983062g(r)dr

= Vρ1

983061mminus (1minus σ)τ

σ

983062 (10)

Let σ0 = 1 σ1 lt 1 and ρ1 = ρσ1 To compare the optimal cutoffs under parameters(ρ σ0) and (ρ σ1) we consider two steps first comparing the cutoffs between (ρ σ0)and (ρ1 σ1) and then comparing the cutoffs between (ρ1 σ1) and (ρ σ1)

First we compare an optimal cutoff ωlowast0 under (ρ σ0) = (ρ 1) and an optimal cutoff

ωlowast1 under (ρ1 σ1) By Theorem 1 ωlowast

0 satisfies

Vρ1(m) le Vρ1(mlowast0) + V prime

ρ1(mlowast0)(mminusmlowast

0) for m isin [ωlowast0 1]

When V is increasing (implied by ρ ge 0) and S-shaped and σ1 lt 1 it is easy to seethat for all m isin [ωlowast

0 1]

Vρ1

983061mminus (1minus σ1)τ

σ1

983062le Vρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062+ V prime

ρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062(mminusmlowast

0)

which implies that when σ = σ1 lt 1 the above inequality changes its sign at someωlowast1 that satisfies ωlowast

1 le ωlowast0 It follows from (10) that ωlowast

1 is an optimal cutoff under(ρσ σ1) = (ρ1 σ1) and as we established above ωlowast

1 le ωlowast0

26 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Second we compare the optimal cutoff ωlowast1 under (ρ1 σ1) and the optimal cutoff ωlowast

2

under (ρ σ1) Because ρ lt ρ1 = ρσ1 by part (i) we have ωlowast2 le ωlowast

1

Combining ωlowast1 le ωlowast

0 and ωlowast2 le ωlowast

1 we obtain ωlowast2 le ωlowast

0

Proof of Theorem 1prime Part (A) The proof is analogous to that of Theorem 1 andis thus omitted

Part (B) The only if statement follows from that in part (A) (see the proof of Theorem1) and the fact that we can always ensure that an optimal stochastic censorship(ωlowast qlowast) is deterministic (so qlowast isin 0 1) by the choice of the support and distributionof ω Thus we only need to prove the if statement

Let F be a distribution of ω with support on n values 0 lt ω0 lt lt ωnminus1 lt 1 Thisis without loss of generality because if 0 andor 1 were in the support of F we couldredefine the state as ω = (ω + ε)(1 + 2ε) and the type as r = (r + ε)(1 + 2ε)

Let X sub ω1 ωnminus1 identify a set of cutoffs defining a monotone partition for eachωi isin X the receiver is informed whether ω lt ωi or ω ge ωi Note that we exclude ω0

from the set of cutoffs because ω ge ω0 always holds

Note that X is deterministic upper censorship if and only if there exists a censorshipthreshold ωlowast isin [0 1] such that for each i = 1 nminus 1 ωi isin X if and only if ωi lt ωlowastthat is we can assume that qlowast = 0 without loss of generality

Let VX be the senderrsquos expected utility when the monotone partition is given by X

If n = 2 then there are only two partitions X = ω1 (full disclosure) and X = empty(no disclosure) Both are upper censorships so part (B) holds trivially

If n = 3 then there are four monotone partitions X = ω1ω2 (full disclosure)X = ω2 X = ω1 and X = empty (no disclosure) Only X = ω2 is not uppercensorship The next lemma proves that X = ω2 cannot be uniquely optimal whenV is S-shaped thus proving the optimality of upper censorship for n = 3

Denote

mjk = E983045ω|ω isin [ωjωk]

983046

For convenience of notation we write mj = ωj

Lemma 3 Let V be S-shaped with an inflexion point τ Let ω be a discrete randomvariable with support on m0m1m2 and probabilities p0 p1 p2 where 0 lt m0 ltm1 lt m2 lt 1 Then

(i) if τ le m0 then Vm2 le Vempty

(ii) if τ ge m1 then Vm2 le Vm1m2

(iii) if m0 lt τ lt m1 then Vm2 le maxVempty Vm1m2

CENSORSHIP AS OPTIMAL PERSUASION 27

Proof Define p01 = p0 + p1 and p02 = p01 + p2 = 1 Observe that

m01 =p0m0 + p1m1

p0 + p1and m02 =

p0m0 + p1m1 + p2m2

p0 + p1 + p2

We thus have

Vempty = p02V (m02) = (p01 + p2)V

983061p01m01 + p2m2

p01 + p2

983062

Vm2 = p01V (m01) + p2V (m2) = (p0 + p1)V

983061p0m0 + p1m1

p0 + p1

983062+ p2V (m2)

Vm1m2 = p0V (m0) + p1V (m1) + p2V (m2)

Clearly

m0 lt m01 lt m1 and m0 lt m02 lt m2

Therefore

Vm2 le Vempty lArrrArrp01

p01 + p2V (m01) +

p2p01 + p2

V (m2) le V

983061p01

p01 + p2m01 +

p2p01 + p2

m2

983062= V (m02)

(11)

and

Vm2 le Vm1m2 lArrrArr

V

983061p0

p0 + p1m0 +

p1p0 + p1

m1

983062le p0

p0 + p1V (m0) +

p1p0 + p1

V (m1)(12)

Suppose that τ le m01 so V is concave on [m01m2] Then (11) holds by Jensenrsquosinequality In particular it holds for τ le m0 lt m01 thus proving part (i)

Suppose that τ ge m1 so V is convex on [m0m1] Then (12) holds by Jensenrsquosinequality thus proving part (ii)

Finally suppose that m01 lt τ lt m1 Let (τ V (τ)) lie below (or on) the straight lineA connecting (m0 V (m0)) and (m2 V (m2)) as illustrated on Figure 2 By convexityof V (m) for m lt τ it must be the case that V (m01) lies below (or on) the dashedline B By concavity of V (m) for m gt τ it must be the case that V (m1) lies above(or on) the dashed line C Consequently (12) is satisfied

Alternatively let (τ V (τ)) lie above the straight line A connecting (m0 V (m0)) and(m2 V (m2)) as illustrated on Figure 3 If Vm2 le Vm1m2 then the proof of part(iii) is complete So suppose that Vm2 gt Vm1m2 so (12) is violated

By convexity of V (m) for m lt τ it must be the case that V (m01) lies below (or on)the line B Since (12) is violated V (m01) must lie between the lines A and B Recallthat m01 lt m02 lt m2 Observe that V (m02) must be in the shaded area on Figure 3Indeed if m02 lt τ then by convexity of V (m) for m lt τ it must be the case thatV (m02) lies between (or on) the lines B and D If m02 ge τ then by concavity of

28 KOLOTILIN MYLOVANOV ZAPECHELNYUK

τ m2m0

A C

B

m01 m1

V (m0) V (τ)

V (m2)

Figure 2 The case where (τ V (τ)) lies below (or on) the straight lineconnecting (m0 V (m0)) and (m2 V (m2))

τ m2m0

A

C

B

m01

V (m0)

V (τ)

V (m2)

D

V (m01)

Figure 3 The case where (τ V (τ)) lies above the straight line con-necting (m0 V (m0)) and (m2 V (m2))

V (m) for m gt τ it must be the case that V (m02) lies above (or on) line C It followsthat (11) is satisfied

We now adapt Lemma 3 to prove the optimality of upper censorship for n ge 4

CENSORSHIP AS OPTIMAL PERSUASION 29

Fix a monotone partition X sub ω1 ωnminus1 that is not upper censorship that isthere exists i isin 2 nminus 1 such that

ωiminus1 ∕isin X and ωi isin X (13)

Denote ωn = 1 Define j and k as follows If X cap ω1 ωiminus2 ∕= empty let ωj be thelargest element inX smaller than i otherwise let ωj = ω0 IfXcapωi+1 ωnminus1 ∕= emptylet ωk be the smallest element in X greater than i otherwise let ωk = ωn = 1 Define

m0 = ωj m1 = ωiminus1 m2 = ωi

and

p0 =Pr[ω isin [ωjωiminus1)]

Pr[ω isin [ωjωk)] p1 =

Pr[ω isin [ωiminus1ωi)]

Pr[ω isin [ωjωk)] p2 =

Pr[ω isin [ωiωk)]

Pr[ω isin [ωjωk)]

To apply Lemma 3 we identify Xωi with empty X with m2 and X cup ωiminus1 withm1m2 By Lemma 3 VX le maxVXωi VXcupωiminus1 Thus replacing X withXωi if VX le VXωi and with X cup ωiminus1 otherwise weakly improves the senderrsquosexpected utility Repeatedly applying such replacements yields upper censorshipIndeed this procedure terminates after a finite number of replacements because byLemma 3 for ωj ge τ we replace X with Xωi and for ωiminus1 le τ we replace Xwith X cup ωiminus1

We thus have shown that when V is S-shaped when optimizing over the set ofmonotone partitions we can restrict attention to deterministic upper censorship Itremains to show that an optimal deterministic censorship cutoff ωlowast

d coincides with anoptimal stochastic censorship cutoff ωlowast from part (A) To see this it is straightforwardto show that

V lowast(ωlowast qlowast) =

983133

[0ωlowast)

V (ω)dF (ω) + V (ωlowast)qlowast Pr[ω = ωlowast]

+ V (mlowast(ωlowast qlowast))((1minus qlowast) Pr[ω = ωlowast] + Pr[ω gt ωlowast]) (14)

is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order Since the two de-terministic censorship cutoff pairs closest to (ωlowast qlowast) in the Blackwell informativenessorder are (ωlowast 0) and (ωlowast 1) we obtain ωlowast

d = ωlowast

Proof of Theorem 2prime Immediate by Theorem 1prime and Lemma 1

Proof of Theorem 3prime The proof for an optimal signal is analogous to that ofTheorem 3 and is thus omitted The same applies to part (i) for an optimal monotonepartition Thus it remains to prove part (ii) for an optimal monotone partition

We fix ρ and σ = 1 and vary t continuously If ωlowast isin supp(F ) then an optimalmonotone partition is an optimal signal because all qlowast isin [0 1] are optimal Thuspart (ii) in (B) holds by part (ii) in (A) If ωlowast isin supp(F ) and qlowast isin (0 1) then either(ωlowast

d qlowastd) = (ωlowast 0) or (ωlowast

d qlowastd) = (ωlowast 1) is optimal because the senderrsquos expected utility

V lowast given by (14) is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order

30 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Therefore it suffices to consider values of t at which the sender is indifferent between(ωlowast 0) and (ωlowast 1) and to show that the sender strictly prefers (ωlowast 1) to (ωlowast 0) at aslightly higher t

V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) =rArr dV lowastt (ω

lowast 0)

dtle dV lowast

t (ωlowast 1)

dt (15)

The condition V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) can be written as

Vt(mlowast(ωlowast 0)) Pr[ω ge ωlowast] = Vt(ω

lowast) Pr[ω = ωlowast] + Vt(mlowast(ωlowast 1)) Pr[ω gt ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)Pr[ω = ωlowast]

Pr[ω ge ωlowast]+ Vt(m

lowast(ωlowast 1))Pr[ω gt ωlowast]

Pr[ω ge ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + Vt(mlowast(ωlowast 1))

mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Analogously the conditiondV lowast

t (ωlowast0)dt

le dV lowastt (ωlowast1)dt

can be written as

V primet (m

lowast(ωlowast 0)) ge V primet (ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + V primet (m

lowast(ωlowast 1))mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Thus (15) states that if Vt(mlowast(ωlowast 0)) is equal to the convex combination of Vt(ω

lowast)and Vt(m

lowast(ωlowast 1)) then the derivative V primet (m

lowast(ωlowast 0)) is greater than the convex com-bination of the derivatives V prime

t (ωlowast) and V prime

t (mlowast(ωlowast 1)) When V is S-shaped it is easy

to see that this property is satisfied

Proof of Lemma 2 Observe that a signal induced by any set X sub C of permittedmedia outlets induces a monotone partition of [0 1] and thus HC sub HM Indeed letcX(ω) and cX(ω) be the media outlets in X that are the closest to ω from above andfrom below

cX(ω) = sup983059c isin X c le ω

983052cup 0

983060and cX(ω) = inf

983059c isin X c gt ω cup 1

983060

By observing the media outlets in X each citizen is informed that ω isin [cX(ω) cX(ω))

Conversely let the set C of media outlets be c0 lt middot middot middot lt cn Without loss of generalityassume that C contains the two uninformative media outlets c0 = 0 and cn = 1 which(almost) always endorse and criticize action a = 1 respectively Then the distributionF of the state ω is a discrete distribution whose support is Ω = ω0 ωnminus1 whereωi = E[θ|θ isin [ci ci+1)] for i isin 0 nminus 1 Consider a monotone partitional signalrepresented by a nondecreasing function ξ [0 1] rarr R For each realization m of thissignal that occurs with a strictly positive probability define

x(m) = min983153ω isin Ω ξ(ω) = m

983154and x(m) = max

983153ω isin Ω ξ(ω) = m

983154

This signal induces a monotone partition of the finite set Ω This partition of Ω canbe implemented by a censorship policy that prohibits a media outlet c isin C if andonly if c isin (x(m) x(m)) for some realization m

CENSORSHIP AS OPTIMAL PERSUASION 31

References

Alonso R and O Camara (2016a) ldquoPersuading Votersrdquo American EconomicReview 106 3590ndash3605

(2016b) ldquoPolitical Disagreement and Information in Electionsrdquo Games andEconomic Behavior 100 390ndash412

Bagnoli M and T Bergstrom (2005) ldquoLog-Concave Probability and Its Ap-plicationsrdquo Economic Theory 26 445ndash469

Baron D P (2006) ldquoPersistent Media Biasrdquo Journal of Public Economics 90(1)1ndash36

Besley T and A Prat (2006) ldquoHandcuffs for the Grabbing Hand MediaCapture and Government Accountabilityrdquo American Economic Review 96(3) 720ndash736

Blackwell D (1953) ldquoEquivalent Comparisons of Experimentsrdquo Annals of Math-ematical Statistics 24 265ndash272

Chan J and W Suen (2008) ldquoA Spatial Theory of News Consumption andElectoral Competitionrdquo Review of Economic Studies 75 699ndash728

Chiang C-F and B Knight (2011) ldquoMedia Bias and Influence Evidence fromNewspaper Endorsementsrdquo Review of Economic Studies 78 795ndash820

Duffie D P Dworczak and H Zhu (2017) ldquoBenchmarks in Search MarketsrdquoJournal of Finance 72 1983ndash2044

Dworczak P (2017) ldquoMechanism Design with Aftermarkets Cutoff Mecha-nismsrdquo mimeo

Dworczak P and G Martini (2019) ldquoThe Simple Economics of Optimal Per-suasionrdquo Journal of Political Economy 127 993ndash2048

Edmond C (2013) ldquoInformation Manipulation Coordination and RegimeChangerdquo Review of Economic Studies 80 1422ndash1458

Egorov G S Guriev and K Sonin (2009) ldquoWhy Resource-poor DictatorsAllow Freer Media A Theory and Evidence from Panel Datardquo American PoliticalScience Review 103 645ndash668

Gehlbach S and K Sonin (2014) ldquoGovernment Control of the Mediardquo Journalof Public Economics 118 163ndash171

Gentzkow M and E Kamenica (2016) ldquoA Rothschild-Stiglitz Approach toBayesian Persuasionrdquo American Economic Review Papers amp Proceedings 106597ndash601

Gentzkow M and J Shapiro (2006) ldquoMedia Bias and Reputationrdquo Journalof Political Economy 114(2) 280ndash316

(2010) ldquoWhat Drives Media Slant Evidence from US Daily NewspapersrdquoEconometrica 78(1) 35ndash71

Ginzburg B (2019) ldquoOptimal Information Censorshiprdquo Journal of EconomicBehavior and Organization 163 377ndash385

Goldstein I and Y Leitner (2018) ldquoStress Tests and Information DisclosurerdquoJournal of Economic Theory 177 34ndash69

32 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Guo Y and E Shmaya (2019) ldquoThe Interval Structure of Optimal DisclosurerdquoEconometrica 87 653ndash675

Johnson J P and D P Myatt (2006) ldquoOn the Simple Economics of Adver-tising Marketing and Product Designrdquo American Economic Review 96 756ndash784

Kamenica E and M Gentzkow (2011) ldquoBayesian Persuasionrdquo American Eco-nomic Review 101 2590ndash2615

Kolotilin A (2015) ldquoExperimental Design to Persuaderdquo Games and EconomicBehavior 90 215ndash226

(2018) ldquoOptimal Information Disclosure A Linear Programming Ap-proachrdquo Theoretical Economics 13 607ndash636

Kolotilin A and H Li (2019) ldquoRelational Communicationrdquo mimeoKolotilin A M Li T Mylovanov and A Zapechelnyuk (2015) ldquoPer-suasion of a Privately Informed Receiverrdquo mimeo

Kolotilin A T Mylovanov A Zapechelnyuk and M Li (2017) ldquoPer-suasion of a Privately Informed Receiverrdquo Econometrica 85 1949ndash1964

Kolotilin A and A Wolitzky (2019) ldquoAssortative Information Disclosurerdquomimeo

Lewis T R and D Sappington (1994) ldquoSupplying Information to FacilitatePrice Discriminationrdquo International Economic Review 35 309ndash327

Lorentzen P (2014) ldquoChinarsquos Strategic Censorshiprdquo American Journal of Polit-ical Science 58 402ndash414

Mullainathan S and A Shleifer (2005) ldquoThe Market for Newsrdquo AmericanEconomic Review 95(4) 1031ndash1053

Orlov D P Zryumov and A Skrzypach (2018) ldquoDesign of Macro-prudentialStress Testsrdquo mimeo

Ostrovsky M and M Schwarz (2010) ldquoInformation Disclosure and Unravelingin Matching Marketsrdquo American Economic Journal Microeconomics 2 34ndash63

Prat A and D Stromberg (2013) ldquoThe Political Economy of Mass MediardquoAdvances in Economics and Econometrics 2 135ndash187

Quah J and B Strulovici (2012) ldquoAggregating the Single Crossing PropertyrdquoEconometrica 80 2333ndash2348

Rayo L and I Segal (2010) ldquoOptimal Information Disclosurerdquo Journal ofPolitical Economy 118 949ndash987

Romanyuk G and A Smolin (2019) ldquoCream Skimming and Information Designin Matching Marketsrdquo American Economic Journal Microeconomics 11 250ndash276

Suen W (2004) ldquoThe Self-Perpetuation of Biased Beliefsrdquo Economic Journal 114377ndash396

Yamashita T (2018) ldquoOptimal Public Information Disclosure by Mechanism De-signerrdquo mimeo

Zapechelnyuk A (2019) ldquoOptimal Quality Certificationrdquo American EconomicReview Insights forthcoming

16 KOLOTILIN MYLOVANOV ZAPECHELNYUK

on their political ideology and socioeconomic status This information is valuable forsignificant individual decisions on migration investment and voting to name a fewIndividuals may not fully internalize externalities that their decisions impose on thesociety Likewise the government may not have the societyrsquos best interest at heartTo further its goals the government then wishes to influence individual decisions bymanipulating the information through media In autocracies and countries with weakchecks and balances the government has power to censor the media content

The governmentrsquos problem of media censorship can be represented as the persua-sion problem in Section 2 We apply our results to provide conditions for the op-timality of upper-censorship policies that censor all media outlets except the mostpro-government ones Furthermore we interpret our comparative statics results

51 Setup There is a continuum of heterogeneous citizens indexed by r isin [0 1]distributed with G Each citizen chooses between a = 0 and a = 1 The utility of acitizen of type r is given by

u(θ r ar a) = (θ minus r)ar + κ(θ r a)

where ar isin 0 1 denotes the citizenrsquos own action a =983125ardG(r) denotes the aggre-

gate action in the society θ isin [0 1] captures an unobserved benefit from action 1 ascompared to action 0 and κ captures the impact of the aggregate action a on thecitizenrsquos utility The term (θ minus r)ar is a private surplus of a citizen of type r Theterm κ(θ r a) is an externality because for a citizen of type r it is optimal to ignorethis term and choose ar = 1 if and only if θ ge r

There is a government which is concerned with a weighted average of the social utilityand the governmentrsquos intrinsic benefit from the aggregate action For a given θ thegovernmentrsquos utility is given by

983133 1

0

ν(θ r ar a)dG(r) + δγ(θ a)

The term ν captures a citizenrsquos utility from the governmentrsquos perspective We allow νto be different from u to reflect paternalistic or other concerns The term γ capturesthe governmentrsquos intrinsic benefit from the aggregate action The parameter δ ge 0captures the weight of the aggregate action in the governmentrsquos utility

Let T be a distribution of the random variable θ We assume that distributions G andT are independent and admit continuously differentiable and strictly positive densi-ties We also assume that κ ν and γ are linear in θ and continuously differentiablein r and a Furthermore to simplify interpretations we assume that ν and γ arenon-decreasing in θ and a so that for the government a high θ is a good news anda higher aggregate action is preferable

Citizens obtain information about the unobservable benefit θ through media outletsEach media outlet is identified by its editorial policy c isin [0 1] and it endorses action

CENSORSHIP AS OPTIMAL PERSUASION 17

a = 1 if θ ge c and criticizes it if θ lt c17 A set C of media outlets is a finite subsetof [0 1]18

The governmentrsquos censorship policy is a set of the media outlets X sub C that arepermitted to broadcast so the rest of the media outlets are censored

The timing is as follows First the government chooses a set X sub C of permittedmedia outlets Second state θ is realized and each permitted media outlet endorsesor criticizes action a = 1 according to its editorial policy Finally each citizen observesmessages from all permitted media outlets updates his beliefs about θ and choosesan action

52 Discussion We now discuss interpretations of the key components of the mediacensorship application As in Gehlbach and Sonin (2014) there can be various inter-pretations of the citizenrsquos action a = 1 such as voting for the government supportinga governmentrsquos policy or taking an individual decision that benefits the governmentA citizenrsquos type r can be interpreted as his ideological position or preference param-eter A citizen who is more supportive of the government has a smaller r

A media outlet with a higher editorial policy c can be interpreted as being less loyalto the government because it criticizes the government on a larger set of states Aneditorial policy c isin C can therefore represent a slant or political bias of the outletagainst the government and can be empirically measured as the frequency with whichthe outlet uses anti-government language Gentzkow and Shapiro (2010) constructsuch a slant index for US newspapers Empirical findings of their paper suggestthat the editorial policies of media outlets are driven by reader preferences justifyingour assumption of the existence of a large variety of editorial policies19 As in Suen(2004) Chan and Suen (2008) and Chiang and Knight (2011) the assumption ofthe binary media reports that communicate only whether the state θ is above somestandard c can be justified by a cursory readerrsquos preference for simple messages suchas positive or negative opinions and yes or no recommendations

The governmentrsquos censorship of media outlets can take various forms For examplethe government can ban access to internet sites withdraw licenses disrupt financingconfiscate print materials and equipment discredit media providers and even arresteditors and journalists using broadly formulated legislation on combating extremismIn some countries the government can exercise direct control over media editorialpolicies either through state ownership or administrative pressure

17The tie-breaking in the event of θ = c is unimportant as θ is a continuous random variable18We discuss the case of a continuum of media outlets in Section 5519Theoretical literature has explored the determinants of media slant of an outlet driven by its

citizens (Mullainathan and Shleifer 2005 Gentzkow and Shapiro 2006 and Chan and Suen 2008)and its owners (Baron 2006 and Besley and Prat 2006)

18 KOLOTILIN MYLOVANOV ZAPECHELNYUK

53 Formulating Media Censorship as Persuasion We now show that the me-dia censorship problem can be formulated as a linear persuasion problem in whichthe government is a sender and a representative citizen is a receiver

In our model the citizensrsquo and the governmentrsquos utilities are linear in θ There-fore given any information from the media outlets the utilities depend only on theposterior mean of θ

Consider an arbitrary posterior mean of θ denoted by m isin [0 1] Each citizen of typer chooses ar = 1 if and only if r le m The externality term κ plays no role in thisdecision Therefore the aggregate action a is simply the mass of all citizens whosetypes do not exceed m so a = G(m)

Next using the citizensrsquo optimal behavior we derive the governmentrsquos expected utilityconditional on the posterior mean m

V (m) = E983063983133 1

0

ν(θ r ar G(m))dG(r) + δγ(θ G(m))

983055983055983055983055m983064

=

983133 1

0

ν(m r1rlem G(m))dG(r) + δγ(mG(m)) (6)

This is the governmentrsquos indirect utility As in Section 2 the government now choosesa signal which is informative about the state to maximize its expected utility How-ever in contrast to Section 2 here the government is restricted to signals that areimplementable by a subset of a given set of media outlets

Next we show that with an appropriate definition of the state the restriction tosignals implementable by a subset of a given set of media outlets can be formulatedas a restriction to monotone partitional signals as in Scenario B in Section 2 Thuswe will be able to use our results in Section 4 to find the optimal censorship policies

Recall that the information about θ is only available through the media outlets Let ωdenote a random variable equal to the conditional expectation of θ given the messagesof all media outlets in C Note that as C is finite ω is a discrete random variablewith values in [0 1] Let F denote the distribution of ω From now on we treat therandom variable ω as the state and denote by s a signal as defined in Section 2

Let HM denote the set of all distributions of the posterior mean state induced byall monotone partitional signals and let HC denote the set of all distributions ofthe posterior mean state induced by all finite subsets of the set C of media outletsObserve that any set of media outlets induces a monotone partitional signal thatis HC sub HM We now show that every outcome implementable by an arbitrarymonotone partition in HM is also implementable by a subset of media outlets

Lemma 2 HC = HM

For illustration suppose that there is only one media outlet with the editorial policyc isin (0 1) There are two observable events θ ge c and θ lt c In each realized event

CENSORSHIP AS OPTIMAL PERSUASION 19

the citizens compute the posterior expectation of θ which we will call ω So in thisexample the distribution F of ω has two mass points one for each of the two events

Given these two mass points every monotone partition of [0 1] induces one of twopossible distributions of the posterior mean state (i) the mass points revealed and (ii)the mass points pooled Observe however that these distributions are implementableby permitting and censoring the media outlet respectively

A censorship policy is upper censorship if it censors all sufficiently disloyal mediaoutlets Specifically there exists a cutoff clowast isin C and an indicator qlowast isin 0 1 suchthat all media outlets whose editorial policies are below clowast are permitted all mediaoutlets whose editorial policies are above clowast are censored and clowast is permitted if andonly if qlowast = 1 That is X = c isin C c le clowast if qlowast = 1 and X = c isin C c lt clowastif qlowast = 0 Note that the full censorship policy clowast = 0 and qlowast = 0 (where all mediaoutlets are censored) and the free media policy clowast = 1 and qlowast = 1 (where all mediaoutlets are permitted) are the two extreme upper-censorship policies

As shown above the media censorship problem is equivalent to the linear persuasionproblem in which the sender is restricted to monotone partitional signals and thesenderrsquos indirect utility V is given by (6) We thus apply Theorem 1prime(B) to obtainthe following result

Theorem 1primeprime If V is S-shaped then an optimal censorship policy is upper censorship

To illustrate Theorem 1primeprime suppose the government is interested only in the aggregateaction so that ν = 0 and γ depends only on the aggregate action a so V (m) =γ(G(m)) Thus an upper-censorship policy is optimal if the composition functionγ(G(middot)) is S-shaped For example this condition holds if the government is interestedin reaching a certain approval threshold (eg a simple majority) so that γ is a stepfunction This condition also holds if γ is S-shaped and G is uniform or S-shapedwith the same inflection point as γ20

54 Canonical Case We now impose more structure on the utilities to obtain asharper result for the optimality of upper censorship and to perform a comparativestatics analysis Analogously to assumption (A1) we assume that the governmentrsquosutility is a weighted average of the citizensrsquo utility and their aggregate action

983133 1

0

u(θ r ar a)dG(r) + δa

where u(θ r ar a) = (θ minus r)ar + ζ(r)a

(A2)

for some continuously differentiable function ζ Define

β =

983133 1

0

ζ(r)dG(r) + δ

20This condition holds in many special cases where γ and G are S-shaped even when γ and Ghave different inflection points

20 KOLOTILIN MYLOVANOV ZAPECHELNYUK

The term983125 1

0ζ(r)dG(r) is the governmentrsquos expected bias towards the citizensrsquo action

a = 1 due to the citizensrsquo externality and the term δ is the governmentrsquos intrinsicbias towards a greater aggregate action a Thus we interpret β as the governmentrsquosaggregate bias This decomposition of the bias allows for different interpretations whythe government is biased So β can be high because the government is too self-serving(high δ) or because the government is benevolent (low δ) but wishes to internalizestrong positive externalities of the citizens (high ζ(r)) To ease interpretations weassume that β gt 021

Under Assumption (A2) the governmentrsquos indirect utility V given by (6) becomes

V (m) =

983133 m

0

(mminus r)dG(r) + βa =

983133 m

0

v(m r)dG(r)

where

v(m r) = β

9830611 +

1

β(mminus r)

983062

This is the same as v given by Assumption (A1) with ρ = βminus1 up to rescaling by apositive constant Consequently we can apply Theorem 2prime(B) to obtain the followingresult

Theorem 2primeprime Let (A2) hold If the density g of citizens types is log-concave then anoptimal censorship policy is upper censorship

We now apply the comparative statics analysis presented in Section 32 Note thatthe upper-censorship policies are ordered according to the amount of informationtransmitted to the citizens in the sense of Blackwell (1953) A greater censorshipthreshold clowast means that more media outlets are permitted With this order in mindwe apply Theorem 3prime(B) to make a comparative statics analysis on the amount ofinformation that is optimally disclosed by the government

First the censorship cutoff is increasing in the alignment parameter ρ Recall thatthis is a reciprocal of the governmentrsquos bias ρ = βminus1 This means that the governmentoptimally discloses more information (the censorship cutoff is greater) when it is lessbiased (β is smaller) Intuitively as β decreases the government puts more weighton the citizensrsquo utility so it optimally endows the population with a higher utility bycensoring fewer media outlets and disclosing more information

Second the censorship cutoff is increasing in the magnitude t of the horizontal shiftof the density g A greater t corresponds to a greater opportunity cost of actiona = 1 for each citizen This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more difficultto persuade to take action a = 1 (parameter t is greater) Informally speaking topersuade the same type of the citizen the government needs to increase the posterior

21If β lt 0 then swapping the roles of a = 0 and a = 1 reverses the sign of the bias If β = 0then the governmentrsquos utility and the citizensrsquo private interests coincide so it is trivially optimal topermit all media outlets

CENSORSHIP AS OPTIMAL PERSUASION 21

mean But the expected posterior mean must be equal to the prior mean so it is notpossible to increase all posterior means Due to the log-concave shape of the densityof the citizensrsquo types this tradeoff is resolved by increasing the posterior mean of thepooling interval E[ω|ω ge clowast] This is done by shrinking the pooling interval [clowast 1]that is increasing the censorship cutoff clowast

Third when the number of media outlets is sufficiently large the censorship cutoffis increasing in the stretch factor σ of the density g around its peak τ (see Footnote15) A greater σ corresponds to a more dispersed distribution of the opportunitycost in the population This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more diverse inhow difficult they are to persuade Intuitively when the types are more spread outpersuading the same mass of types requires a greater posterior mean which leads toa higher censorship cutoff similarly to the comparative statics in t discussed above

55 Extensions and Open Questions Let us now consider a few extensions ofour model of media censorship

In our model the set of media outlets is exogenous and the governmentrsquos only in-strument is censorship We now consider three alternative ways of expanding thegovernmentrsquos instruments of influence

First suppose that the government is able not only to censor existing media outletsbut also to introduce new media outlets with chosen editorial policies This is equiv-alent to our censorship model where all media outlets in [0 1] are initially availableand the government can censor any subset of them When all media outlets in [0 1]are permitted the revealed information about θ (which we call ω) is θ itself Thusω = θ is a continuous random variable with distribution F = T So we can nowapply our results for the continuous state from Section 3 instead of those for the dis-crete state in Section 4 reaching the same conclusion about the optimality of uppercensorship

Second suppose that the government can garble information available from mediaoutlets That is the government is not restricted to monotone partitions it cancreate arbitrary signals about state ω for the citizens to observe This becomes ageneral persuasion problem and our Theorems 1prime(A) 2prime(A) and 3prime(A) apply

Third suppose that the government is able to restrict not only which media outletsare permitted but also how many media outlets each citizen can choose to observeIn this extension the citizens are not allowed to communicate with one another (oth-erwise they could share the information thus observing all permitted media outletsindirectly) This extension does not affect our results as long as each citizen is al-lowed to access at least one media outlet of his choice as in Chan and Suen (2008)Intuitively this is because each citizen categorizes the information from the mediaoutlets into ldquogood newsrdquo where a = 1 is optimal and ldquobad newsrdquo where a = 0 is

22 KOLOTILIN MYLOVANOV ZAPECHELNYUK

optimal Because the information from the media outlets induces a monotone parti-tion it means that ldquogood newsrdquo is separated from ldquobad newsrdquo by a threshold mediaoutlet that depends on the citizenrsquos type That is it is sufficient to observe a singlethreshold media outlet to distinguish ldquogood newsrdquo from ldquobad newsrdquo

In our model each citizenrsquos utility depends on the aggregate action a through theexternality term κ(θ r a) which does not affect the chosen action Let us relax thisassumption so that a citizenrsquos optimal choice can depend on a We can still writethe senderrsquos indirect utility V as a function of the posterior mean state and applyour results However V is no longer uniquely determined by the primitives of themodel It is now endogenous and depends on an equilibrium the citizens play aseach citizenrsquos optimal action now depends on what all citizens do in equilibrium Forexample given the same information about the state a citizen could prefer to choosea = 1 if and only if sufficiently many citizens choose the same action so a is largeenough This creates the problem of multiplicity of equilibria and as a consequencethe dependence of optimal censorship on equilibrium selection

As a side our media censorship problem can also be applied to spatial voting mod-els as in Chiang and Knight (2011) Consider a government party (p = G) and anopposition party (p = O) competing in an election If party p wins a voter withan ideological position r gets utility wp minus (r minus rp)

2 where wp is the quality or va-lence of party p and rp is the ideology or policy platform of party p Voters knowthe partiesrsquo ideologies and obtain information about the partiesrsquo qualities from allavailable media outlets Each voter supports the party that maximizes his expectedutility Within this context our analysis still applies because the voterrsquos utility dif-ference between the government and opposition parties is proportional to θminusr whereθ = (wG minus wO + r2O minus r2G)2(rO minus rG)

There are a few more extensions that can be relevant in applications First insteadof complete censorship of a media outlet there can be a cost of accessing it Forexample an international news channel can be censored by a local government butcitizens may still access it through VPN at some cost Second it can be costly forthe government to censor media outlets So another important question to answeris how a government may prioritize censoring Finally citizens can incur some costof following each media outlet While we have already mentioned that citizens gainno benefit from following more than one outlet it is entirely possible for them tostop watching news altogether if it is sufficiently uninformative These extensions arenontrivial and left for future research

Appendix

Proof of Theorem 1 Suppose that V is S-shaped For mωlowast isin [0 1] and mlowast(ωlowast) =E[ω|ω ge ωlowast] define

V (mωlowast) = V (mlowast(ωlowast)) + V prime(mlowast(ωlowast))(mminusmlowast(ωlowast)) (7)

CENSORSHIP AS OPTIMAL PERSUASION 23

Since V (m) is convex on [0 τ ] and concave on [τ 1] there exists ωlowast isin [0 τ ] thatsatisfies condition (3) that is

V (mωlowast) ge (le)V (m) for all m ge (lt)ωlowast (8)

This is true because V (ωlowastωlowast) minus V (ωlowast) is single-crossing from below To see thisconsider Figure 1 Observe that (3) can hold only if ωlowast is on the convex part of V andmlowast is on the concave part of V As the censorship cutoff ωlowast increases the posteriormean state of the pooling message mlowast(ωlowast) also increases But because V is concaveat mlowast(ωlowast) the dashed tangent line V (mωlowast) becomes flatter in m so it crosses thesolid line V (m) at a smaller m

Consider an arbitrary signal s Let H be the distribution of m = E[ω|s] inducedby signal s Since each signal is a garbling of the fully informative signal F is amean-preserving spread of H Thus the senderrsquos expected utility is smaller undersignal s than under the upper-censorship signal with cutoff ωlowast as follows from

983133 1

0

V (m)dH(m) le983133 1

0

maxV (m) V (mωlowast)dH(m)

le983133 1

0

maxV (m) V (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mlowast(ωlowast))dF (m)

where the second line holds because maxV (m) V (mωlowast) is convex in m and F isa mean-preserving spread of H the third line holds by (8) and the last line holds by

(7) and by983125 1

ωlowast(mminusmlowast(ωlowast))dF (m) = 0

Conversely suppose that V is not S-shaped Then there exist 0 le m1 lt m2 lem3 lt m4 le 1 such that V primeprime(m) lt 0 for m isin (m1m2) V

primeprime(m) = 0 for m isin [m2m3]and V primeprime(m) gt 0 for m isin (m3m4) because V primeprime is continuous by assumption It isstraightforward to show that there exists 983141mlowast isin (m1m2) (sufficiently close to m2) and983141ωlowast isin (m3m4) such that

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) lt V (m) for all m isin (983141ωlowastm4]

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) gt V (m) for all m isin [m1 983141ωlowast] 983141mlowast

Moreover there exists a continuous distribution F such that the support of F is asubset of [m1m4] and 983141mlowast = E[ω|ω le 983141ωlowast] Using a chain of inequalities similar to theone above it is easy to show that lower-censorship signal 983141slowast with cutoff 983141ωlowast is optimalFurthermore it is uniquely optimal as both inequalities hold with equalities only forsignal 983141slowast This in turn implies that any upper-censorship signal is suboptimal

24 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Proof of Lemma 1 Notice that under assumption (A1) we have

V primeprime(m) = gprime(m) + ρg(m) for m isin [0 1]

Recall that V is S-shaped if and only if V primeprime(m) = gprime (m) + ρg (m) is single-crossingfrom above Using Proposition 1 in Quah and Strulovici (2012) it is easy to showthat V primeprime(m) is single-crossing from above for all ρ isin R if and only if gprime (m)g (m) isnonincreasing in m (that is ln g(m) is concave)

Proof of Theorem 2 Immediate by Theorem 1 and Lemma 1

Proof of Theorem 3 Part (i) We fix t and σ and vary ρ Without loss ofgenerality let t = 0 and σ = 1 (so that Gtσ = G) Consider a signal that induces thedistribution H of posterior mean m This signal implements the receiverrsquos interimutility U given by

U(r) =

983133 1

r

(mminus r)dH(m) for r isin [0 1]

which holds because the receiver acts if and only if m ge r Since types r lt 0 alwayschoose a = 1 and types r gt 1 always choose a = 0 we can exclude them fromconsideration and assume that supp(G) = [0 1] The senderrsquos expected utility is then

983133 1

0

V (m)dH(m) =

983133 1

0

983133 m

0

(1 + ρ(mminus r))dG(r)dH(m)

=

983133 1

0

983133 1

r

(1 + ρ(mminus r))dH(m)dG(r)

=

983133 1

0

(1minusH(r) + ρU(r))dG(r)

Consider ρ2 gt ρ1 Suppose by contradiction that the corresponding optimal upper-censorship signals s2 and s1 are such that s1 is strictly more informative than s2 LetU2 and U1 be the receiverrsquos interim utilities implemented by s2 and s1 Since thesender prefers signal s2 under ρ2 and signal s1 under ρ1 we have

983133 1

0

(1minusH2 (r) + ρ2U2 (r)) dG (r) ge983133 1

0

(1minusH1 (r) + ρ2U1 (r)) dG (r)

983133 1

0

(1minusH1 (r) + ρ1U1 (r)) dG (r) ge983133 1

0

(1minusH2 (r) + ρ1U2 (r)) dG (r)

Summing up these inequalities gives

(ρ2 minus ρ1)

983133 1

0

(U2 (r)minus U1 (r)) dG (r) ge 0

leading to a contradiction because U1(r) ge U2(r) for all r with strict inequality forsome r given that s1 is strictly more informative than s2

CENSORSHIP AS OPTIMAL PERSUASION 25

Part (ii) We fix ρ isin R and σ = 1 and vary t The indirect utility of the senderconditional on m isin [0 1] is given by

Vt(m) =

983133

rlem

(1+ρ(mminusr))gt1(r)dr =

983133

rlemminust

(1+ρ(mminustminusr))g(r)dr = V (mminust) (9)

By Theorem 1 when t0 = 0 an optimal cutoff ωlowast0 satisfies

V (m) le V (mlowast0) + V prime(mlowast

0)(mminusmlowast0) for m isin [ωlowast

0 1]

When V is S-shaped and t1 lt 0 it is easy to see that

V (mminus t1) le V (mlowast0 minus t1) + V prime(mlowast

0 minus t1)(mminusmlowast0) for m isin [ωlowast

0 1]

which implies that when t = t1 the above inequality changes its sign at some ωlowast1

that satisfies ωlowast1 le ωlowast

0 It follows from (9) that ωlowast1 is an optimal cutoff for t = t1 and

as we established above ωlowast1 le ωlowast

0

Part (iii) We fix ρ ge 0 and t = 0 and vary σ The indirect utility of the senderconditional on m isin [0 1] is given by

V ρσσ(m) =

983133

rlem

9830591 +

ρ

σ(mminus r)

983060dG0σ(r)

=

983133

rlem

9830591 +

ρ

σ(mminus r)

983060g

983061r minus (1minus σ)τ

σ

9830621

σdr

=

983133

rlemminus(1minusσ)τσ

9830611 + ρ

983061mminus (1minus σ)τ

σminus r

983062983062g(r)dr

= Vρ1

983061mminus (1minus σ)τ

σ

983062 (10)

Let σ0 = 1 σ1 lt 1 and ρ1 = ρσ1 To compare the optimal cutoffs under parameters(ρ σ0) and (ρ σ1) we consider two steps first comparing the cutoffs between (ρ σ0)and (ρ1 σ1) and then comparing the cutoffs between (ρ1 σ1) and (ρ σ1)

First we compare an optimal cutoff ωlowast0 under (ρ σ0) = (ρ 1) and an optimal cutoff

ωlowast1 under (ρ1 σ1) By Theorem 1 ωlowast

0 satisfies

Vρ1(m) le Vρ1(mlowast0) + V prime

ρ1(mlowast0)(mminusmlowast

0) for m isin [ωlowast0 1]

When V is increasing (implied by ρ ge 0) and S-shaped and σ1 lt 1 it is easy to seethat for all m isin [ωlowast

0 1]

Vρ1

983061mminus (1minus σ1)τ

σ1

983062le Vρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062+ V prime

ρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062(mminusmlowast

0)

which implies that when σ = σ1 lt 1 the above inequality changes its sign at someωlowast1 that satisfies ωlowast

1 le ωlowast0 It follows from (10) that ωlowast

1 is an optimal cutoff under(ρσ σ1) = (ρ1 σ1) and as we established above ωlowast

1 le ωlowast0

26 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Second we compare the optimal cutoff ωlowast1 under (ρ1 σ1) and the optimal cutoff ωlowast

2

under (ρ σ1) Because ρ lt ρ1 = ρσ1 by part (i) we have ωlowast2 le ωlowast

1

Combining ωlowast1 le ωlowast

0 and ωlowast2 le ωlowast

1 we obtain ωlowast2 le ωlowast

0

Proof of Theorem 1prime Part (A) The proof is analogous to that of Theorem 1 andis thus omitted

Part (B) The only if statement follows from that in part (A) (see the proof of Theorem1) and the fact that we can always ensure that an optimal stochastic censorship(ωlowast qlowast) is deterministic (so qlowast isin 0 1) by the choice of the support and distributionof ω Thus we only need to prove the if statement

Let F be a distribution of ω with support on n values 0 lt ω0 lt lt ωnminus1 lt 1 Thisis without loss of generality because if 0 andor 1 were in the support of F we couldredefine the state as ω = (ω + ε)(1 + 2ε) and the type as r = (r + ε)(1 + 2ε)

Let X sub ω1 ωnminus1 identify a set of cutoffs defining a monotone partition for eachωi isin X the receiver is informed whether ω lt ωi or ω ge ωi Note that we exclude ω0

from the set of cutoffs because ω ge ω0 always holds

Note that X is deterministic upper censorship if and only if there exists a censorshipthreshold ωlowast isin [0 1] such that for each i = 1 nminus 1 ωi isin X if and only if ωi lt ωlowastthat is we can assume that qlowast = 0 without loss of generality

Let VX be the senderrsquos expected utility when the monotone partition is given by X

If n = 2 then there are only two partitions X = ω1 (full disclosure) and X = empty(no disclosure) Both are upper censorships so part (B) holds trivially

If n = 3 then there are four monotone partitions X = ω1ω2 (full disclosure)X = ω2 X = ω1 and X = empty (no disclosure) Only X = ω2 is not uppercensorship The next lemma proves that X = ω2 cannot be uniquely optimal whenV is S-shaped thus proving the optimality of upper censorship for n = 3

Denote

mjk = E983045ω|ω isin [ωjωk]

983046

For convenience of notation we write mj = ωj

Lemma 3 Let V be S-shaped with an inflexion point τ Let ω be a discrete randomvariable with support on m0m1m2 and probabilities p0 p1 p2 where 0 lt m0 ltm1 lt m2 lt 1 Then

(i) if τ le m0 then Vm2 le Vempty

(ii) if τ ge m1 then Vm2 le Vm1m2

(iii) if m0 lt τ lt m1 then Vm2 le maxVempty Vm1m2

CENSORSHIP AS OPTIMAL PERSUASION 27

Proof Define p01 = p0 + p1 and p02 = p01 + p2 = 1 Observe that

m01 =p0m0 + p1m1

p0 + p1and m02 =

p0m0 + p1m1 + p2m2

p0 + p1 + p2

We thus have

Vempty = p02V (m02) = (p01 + p2)V

983061p01m01 + p2m2

p01 + p2

983062

Vm2 = p01V (m01) + p2V (m2) = (p0 + p1)V

983061p0m0 + p1m1

p0 + p1

983062+ p2V (m2)

Vm1m2 = p0V (m0) + p1V (m1) + p2V (m2)

Clearly

m0 lt m01 lt m1 and m0 lt m02 lt m2

Therefore

Vm2 le Vempty lArrrArrp01

p01 + p2V (m01) +

p2p01 + p2

V (m2) le V

983061p01

p01 + p2m01 +

p2p01 + p2

m2

983062= V (m02)

(11)

and

Vm2 le Vm1m2 lArrrArr

V

983061p0

p0 + p1m0 +

p1p0 + p1

m1

983062le p0

p0 + p1V (m0) +

p1p0 + p1

V (m1)(12)

Suppose that τ le m01 so V is concave on [m01m2] Then (11) holds by Jensenrsquosinequality In particular it holds for τ le m0 lt m01 thus proving part (i)

Suppose that τ ge m1 so V is convex on [m0m1] Then (12) holds by Jensenrsquosinequality thus proving part (ii)

Finally suppose that m01 lt τ lt m1 Let (τ V (τ)) lie below (or on) the straight lineA connecting (m0 V (m0)) and (m2 V (m2)) as illustrated on Figure 2 By convexityof V (m) for m lt τ it must be the case that V (m01) lies below (or on) the dashedline B By concavity of V (m) for m gt τ it must be the case that V (m1) lies above(or on) the dashed line C Consequently (12) is satisfied

Alternatively let (τ V (τ)) lie above the straight line A connecting (m0 V (m0)) and(m2 V (m2)) as illustrated on Figure 3 If Vm2 le Vm1m2 then the proof of part(iii) is complete So suppose that Vm2 gt Vm1m2 so (12) is violated

By convexity of V (m) for m lt τ it must be the case that V (m01) lies below (or on)the line B Since (12) is violated V (m01) must lie between the lines A and B Recallthat m01 lt m02 lt m2 Observe that V (m02) must be in the shaded area on Figure 3Indeed if m02 lt τ then by convexity of V (m) for m lt τ it must be the case thatV (m02) lies between (or on) the lines B and D If m02 ge τ then by concavity of

28 KOLOTILIN MYLOVANOV ZAPECHELNYUK

τ m2m0

A C

B

m01 m1

V (m0) V (τ)

V (m2)

Figure 2 The case where (τ V (τ)) lies below (or on) the straight lineconnecting (m0 V (m0)) and (m2 V (m2))

τ m2m0

A

C

B

m01

V (m0)

V (τ)

V (m2)

D

V (m01)

Figure 3 The case where (τ V (τ)) lies above the straight line con-necting (m0 V (m0)) and (m2 V (m2))

V (m) for m gt τ it must be the case that V (m02) lies above (or on) line C It followsthat (11) is satisfied

We now adapt Lemma 3 to prove the optimality of upper censorship for n ge 4

CENSORSHIP AS OPTIMAL PERSUASION 29

Fix a monotone partition X sub ω1 ωnminus1 that is not upper censorship that isthere exists i isin 2 nminus 1 such that

ωiminus1 ∕isin X and ωi isin X (13)

Denote ωn = 1 Define j and k as follows If X cap ω1 ωiminus2 ∕= empty let ωj be thelargest element inX smaller than i otherwise let ωj = ω0 IfXcapωi+1 ωnminus1 ∕= emptylet ωk be the smallest element in X greater than i otherwise let ωk = ωn = 1 Define

m0 = ωj m1 = ωiminus1 m2 = ωi

and

p0 =Pr[ω isin [ωjωiminus1)]

Pr[ω isin [ωjωk)] p1 =

Pr[ω isin [ωiminus1ωi)]

Pr[ω isin [ωjωk)] p2 =

Pr[ω isin [ωiωk)]

Pr[ω isin [ωjωk)]

To apply Lemma 3 we identify Xωi with empty X with m2 and X cup ωiminus1 withm1m2 By Lemma 3 VX le maxVXωi VXcupωiminus1 Thus replacing X withXωi if VX le VXωi and with X cup ωiminus1 otherwise weakly improves the senderrsquosexpected utility Repeatedly applying such replacements yields upper censorshipIndeed this procedure terminates after a finite number of replacements because byLemma 3 for ωj ge τ we replace X with Xωi and for ωiminus1 le τ we replace Xwith X cup ωiminus1

We thus have shown that when V is S-shaped when optimizing over the set ofmonotone partitions we can restrict attention to deterministic upper censorship Itremains to show that an optimal deterministic censorship cutoff ωlowast

d coincides with anoptimal stochastic censorship cutoff ωlowast from part (A) To see this it is straightforwardto show that

V lowast(ωlowast qlowast) =

983133

[0ωlowast)

V (ω)dF (ω) + V (ωlowast)qlowast Pr[ω = ωlowast]

+ V (mlowast(ωlowast qlowast))((1minus qlowast) Pr[ω = ωlowast] + Pr[ω gt ωlowast]) (14)

is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order Since the two de-terministic censorship cutoff pairs closest to (ωlowast qlowast) in the Blackwell informativenessorder are (ωlowast 0) and (ωlowast 1) we obtain ωlowast

d = ωlowast

Proof of Theorem 2prime Immediate by Theorem 1prime and Lemma 1

Proof of Theorem 3prime The proof for an optimal signal is analogous to that ofTheorem 3 and is thus omitted The same applies to part (i) for an optimal monotonepartition Thus it remains to prove part (ii) for an optimal monotone partition

We fix ρ and σ = 1 and vary t continuously If ωlowast isin supp(F ) then an optimalmonotone partition is an optimal signal because all qlowast isin [0 1] are optimal Thuspart (ii) in (B) holds by part (ii) in (A) If ωlowast isin supp(F ) and qlowast isin (0 1) then either(ωlowast

d qlowastd) = (ωlowast 0) or (ωlowast

d qlowastd) = (ωlowast 1) is optimal because the senderrsquos expected utility

V lowast given by (14) is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order

30 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Therefore it suffices to consider values of t at which the sender is indifferent between(ωlowast 0) and (ωlowast 1) and to show that the sender strictly prefers (ωlowast 1) to (ωlowast 0) at aslightly higher t

V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) =rArr dV lowastt (ω

lowast 0)

dtle dV lowast

t (ωlowast 1)

dt (15)

The condition V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) can be written as

Vt(mlowast(ωlowast 0)) Pr[ω ge ωlowast] = Vt(ω

lowast) Pr[ω = ωlowast] + Vt(mlowast(ωlowast 1)) Pr[ω gt ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)Pr[ω = ωlowast]

Pr[ω ge ωlowast]+ Vt(m

lowast(ωlowast 1))Pr[ω gt ωlowast]

Pr[ω ge ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + Vt(mlowast(ωlowast 1))

mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Analogously the conditiondV lowast

t (ωlowast0)dt

le dV lowastt (ωlowast1)dt

can be written as

V primet (m

lowast(ωlowast 0)) ge V primet (ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + V primet (m

lowast(ωlowast 1))mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Thus (15) states that if Vt(mlowast(ωlowast 0)) is equal to the convex combination of Vt(ω

lowast)and Vt(m

lowast(ωlowast 1)) then the derivative V primet (m

lowast(ωlowast 0)) is greater than the convex com-bination of the derivatives V prime

t (ωlowast) and V prime

t (mlowast(ωlowast 1)) When V is S-shaped it is easy

to see that this property is satisfied

Proof of Lemma 2 Observe that a signal induced by any set X sub C of permittedmedia outlets induces a monotone partition of [0 1] and thus HC sub HM Indeed letcX(ω) and cX(ω) be the media outlets in X that are the closest to ω from above andfrom below

cX(ω) = sup983059c isin X c le ω

983052cup 0

983060and cX(ω) = inf

983059c isin X c gt ω cup 1

983060

By observing the media outlets in X each citizen is informed that ω isin [cX(ω) cX(ω))

Conversely let the set C of media outlets be c0 lt middot middot middot lt cn Without loss of generalityassume that C contains the two uninformative media outlets c0 = 0 and cn = 1 which(almost) always endorse and criticize action a = 1 respectively Then the distributionF of the state ω is a discrete distribution whose support is Ω = ω0 ωnminus1 whereωi = E[θ|θ isin [ci ci+1)] for i isin 0 nminus 1 Consider a monotone partitional signalrepresented by a nondecreasing function ξ [0 1] rarr R For each realization m of thissignal that occurs with a strictly positive probability define

x(m) = min983153ω isin Ω ξ(ω) = m

983154and x(m) = max

983153ω isin Ω ξ(ω) = m

983154

This signal induces a monotone partition of the finite set Ω This partition of Ω canbe implemented by a censorship policy that prohibits a media outlet c isin C if andonly if c isin (x(m) x(m)) for some realization m

CENSORSHIP AS OPTIMAL PERSUASION 31

References

Alonso R and O Camara (2016a) ldquoPersuading Votersrdquo American EconomicReview 106 3590ndash3605

(2016b) ldquoPolitical Disagreement and Information in Electionsrdquo Games andEconomic Behavior 100 390ndash412

Bagnoli M and T Bergstrom (2005) ldquoLog-Concave Probability and Its Ap-plicationsrdquo Economic Theory 26 445ndash469

Baron D P (2006) ldquoPersistent Media Biasrdquo Journal of Public Economics 90(1)1ndash36

Besley T and A Prat (2006) ldquoHandcuffs for the Grabbing Hand MediaCapture and Government Accountabilityrdquo American Economic Review 96(3) 720ndash736

Blackwell D (1953) ldquoEquivalent Comparisons of Experimentsrdquo Annals of Math-ematical Statistics 24 265ndash272

Chan J and W Suen (2008) ldquoA Spatial Theory of News Consumption andElectoral Competitionrdquo Review of Economic Studies 75 699ndash728

Chiang C-F and B Knight (2011) ldquoMedia Bias and Influence Evidence fromNewspaper Endorsementsrdquo Review of Economic Studies 78 795ndash820

Duffie D P Dworczak and H Zhu (2017) ldquoBenchmarks in Search MarketsrdquoJournal of Finance 72 1983ndash2044

Dworczak P (2017) ldquoMechanism Design with Aftermarkets Cutoff Mecha-nismsrdquo mimeo

Dworczak P and G Martini (2019) ldquoThe Simple Economics of Optimal Per-suasionrdquo Journal of Political Economy 127 993ndash2048

Edmond C (2013) ldquoInformation Manipulation Coordination and RegimeChangerdquo Review of Economic Studies 80 1422ndash1458

Egorov G S Guriev and K Sonin (2009) ldquoWhy Resource-poor DictatorsAllow Freer Media A Theory and Evidence from Panel Datardquo American PoliticalScience Review 103 645ndash668

Gehlbach S and K Sonin (2014) ldquoGovernment Control of the Mediardquo Journalof Public Economics 118 163ndash171

Gentzkow M and E Kamenica (2016) ldquoA Rothschild-Stiglitz Approach toBayesian Persuasionrdquo American Economic Review Papers amp Proceedings 106597ndash601

Gentzkow M and J Shapiro (2006) ldquoMedia Bias and Reputationrdquo Journalof Political Economy 114(2) 280ndash316

(2010) ldquoWhat Drives Media Slant Evidence from US Daily NewspapersrdquoEconometrica 78(1) 35ndash71

Ginzburg B (2019) ldquoOptimal Information Censorshiprdquo Journal of EconomicBehavior and Organization 163 377ndash385

Goldstein I and Y Leitner (2018) ldquoStress Tests and Information DisclosurerdquoJournal of Economic Theory 177 34ndash69

32 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Guo Y and E Shmaya (2019) ldquoThe Interval Structure of Optimal DisclosurerdquoEconometrica 87 653ndash675

Johnson J P and D P Myatt (2006) ldquoOn the Simple Economics of Adver-tising Marketing and Product Designrdquo American Economic Review 96 756ndash784

Kamenica E and M Gentzkow (2011) ldquoBayesian Persuasionrdquo American Eco-nomic Review 101 2590ndash2615

Kolotilin A (2015) ldquoExperimental Design to Persuaderdquo Games and EconomicBehavior 90 215ndash226

(2018) ldquoOptimal Information Disclosure A Linear Programming Ap-proachrdquo Theoretical Economics 13 607ndash636

Kolotilin A and H Li (2019) ldquoRelational Communicationrdquo mimeoKolotilin A M Li T Mylovanov and A Zapechelnyuk (2015) ldquoPer-suasion of a Privately Informed Receiverrdquo mimeo

Kolotilin A T Mylovanov A Zapechelnyuk and M Li (2017) ldquoPer-suasion of a Privately Informed Receiverrdquo Econometrica 85 1949ndash1964

Kolotilin A and A Wolitzky (2019) ldquoAssortative Information Disclosurerdquomimeo

Lewis T R and D Sappington (1994) ldquoSupplying Information to FacilitatePrice Discriminationrdquo International Economic Review 35 309ndash327

Lorentzen P (2014) ldquoChinarsquos Strategic Censorshiprdquo American Journal of Polit-ical Science 58 402ndash414

Mullainathan S and A Shleifer (2005) ldquoThe Market for Newsrdquo AmericanEconomic Review 95(4) 1031ndash1053

Orlov D P Zryumov and A Skrzypach (2018) ldquoDesign of Macro-prudentialStress Testsrdquo mimeo

Ostrovsky M and M Schwarz (2010) ldquoInformation Disclosure and Unravelingin Matching Marketsrdquo American Economic Journal Microeconomics 2 34ndash63

Prat A and D Stromberg (2013) ldquoThe Political Economy of Mass MediardquoAdvances in Economics and Econometrics 2 135ndash187

Quah J and B Strulovici (2012) ldquoAggregating the Single Crossing PropertyrdquoEconometrica 80 2333ndash2348

Rayo L and I Segal (2010) ldquoOptimal Information Disclosurerdquo Journal ofPolitical Economy 118 949ndash987

Romanyuk G and A Smolin (2019) ldquoCream Skimming and Information Designin Matching Marketsrdquo American Economic Journal Microeconomics 11 250ndash276

Suen W (2004) ldquoThe Self-Perpetuation of Biased Beliefsrdquo Economic Journal 114377ndash396

Yamashita T (2018) ldquoOptimal Public Information Disclosure by Mechanism De-signerrdquo mimeo

Zapechelnyuk A (2019) ldquoOptimal Quality Certificationrdquo American EconomicReview Insights forthcoming

CENSORSHIP AS OPTIMAL PERSUASION 17

a = 1 if θ ge c and criticizes it if θ lt c17 A set C of media outlets is a finite subsetof [0 1]18

The governmentrsquos censorship policy is a set of the media outlets X sub C that arepermitted to broadcast so the rest of the media outlets are censored

The timing is as follows First the government chooses a set X sub C of permittedmedia outlets Second state θ is realized and each permitted media outlet endorsesor criticizes action a = 1 according to its editorial policy Finally each citizen observesmessages from all permitted media outlets updates his beliefs about θ and choosesan action

52 Discussion We now discuss interpretations of the key components of the mediacensorship application As in Gehlbach and Sonin (2014) there can be various inter-pretations of the citizenrsquos action a = 1 such as voting for the government supportinga governmentrsquos policy or taking an individual decision that benefits the governmentA citizenrsquos type r can be interpreted as his ideological position or preference param-eter A citizen who is more supportive of the government has a smaller r

A media outlet with a higher editorial policy c can be interpreted as being less loyalto the government because it criticizes the government on a larger set of states Aneditorial policy c isin C can therefore represent a slant or political bias of the outletagainst the government and can be empirically measured as the frequency with whichthe outlet uses anti-government language Gentzkow and Shapiro (2010) constructsuch a slant index for US newspapers Empirical findings of their paper suggestthat the editorial policies of media outlets are driven by reader preferences justifyingour assumption of the existence of a large variety of editorial policies19 As in Suen(2004) Chan and Suen (2008) and Chiang and Knight (2011) the assumption ofthe binary media reports that communicate only whether the state θ is above somestandard c can be justified by a cursory readerrsquos preference for simple messages suchas positive or negative opinions and yes or no recommendations

The governmentrsquos censorship of media outlets can take various forms For examplethe government can ban access to internet sites withdraw licenses disrupt financingconfiscate print materials and equipment discredit media providers and even arresteditors and journalists using broadly formulated legislation on combating extremismIn some countries the government can exercise direct control over media editorialpolicies either through state ownership or administrative pressure

17The tie-breaking in the event of θ = c is unimportant as θ is a continuous random variable18We discuss the case of a continuum of media outlets in Section 5519Theoretical literature has explored the determinants of media slant of an outlet driven by its

citizens (Mullainathan and Shleifer 2005 Gentzkow and Shapiro 2006 and Chan and Suen 2008)and its owners (Baron 2006 and Besley and Prat 2006)

18 KOLOTILIN MYLOVANOV ZAPECHELNYUK

53 Formulating Media Censorship as Persuasion We now show that the me-dia censorship problem can be formulated as a linear persuasion problem in whichthe government is a sender and a representative citizen is a receiver

In our model the citizensrsquo and the governmentrsquos utilities are linear in θ There-fore given any information from the media outlets the utilities depend only on theposterior mean of θ

Consider an arbitrary posterior mean of θ denoted by m isin [0 1] Each citizen of typer chooses ar = 1 if and only if r le m The externality term κ plays no role in thisdecision Therefore the aggregate action a is simply the mass of all citizens whosetypes do not exceed m so a = G(m)

Next using the citizensrsquo optimal behavior we derive the governmentrsquos expected utilityconditional on the posterior mean m

V (m) = E983063983133 1

0

ν(θ r ar G(m))dG(r) + δγ(θ G(m))

983055983055983055983055m983064

=

983133 1

0

ν(m r1rlem G(m))dG(r) + δγ(mG(m)) (6)

This is the governmentrsquos indirect utility As in Section 2 the government now choosesa signal which is informative about the state to maximize its expected utility How-ever in contrast to Section 2 here the government is restricted to signals that areimplementable by a subset of a given set of media outlets

Next we show that with an appropriate definition of the state the restriction tosignals implementable by a subset of a given set of media outlets can be formulatedas a restriction to monotone partitional signals as in Scenario B in Section 2 Thuswe will be able to use our results in Section 4 to find the optimal censorship policies

Recall that the information about θ is only available through the media outlets Let ωdenote a random variable equal to the conditional expectation of θ given the messagesof all media outlets in C Note that as C is finite ω is a discrete random variablewith values in [0 1] Let F denote the distribution of ω From now on we treat therandom variable ω as the state and denote by s a signal as defined in Section 2

Let HM denote the set of all distributions of the posterior mean state induced byall monotone partitional signals and let HC denote the set of all distributions ofthe posterior mean state induced by all finite subsets of the set C of media outletsObserve that any set of media outlets induces a monotone partitional signal thatis HC sub HM We now show that every outcome implementable by an arbitrarymonotone partition in HM is also implementable by a subset of media outlets

Lemma 2 HC = HM

For illustration suppose that there is only one media outlet with the editorial policyc isin (0 1) There are two observable events θ ge c and θ lt c In each realized event

CENSORSHIP AS OPTIMAL PERSUASION 19

the citizens compute the posterior expectation of θ which we will call ω So in thisexample the distribution F of ω has two mass points one for each of the two events

Given these two mass points every monotone partition of [0 1] induces one of twopossible distributions of the posterior mean state (i) the mass points revealed and (ii)the mass points pooled Observe however that these distributions are implementableby permitting and censoring the media outlet respectively

A censorship policy is upper censorship if it censors all sufficiently disloyal mediaoutlets Specifically there exists a cutoff clowast isin C and an indicator qlowast isin 0 1 suchthat all media outlets whose editorial policies are below clowast are permitted all mediaoutlets whose editorial policies are above clowast are censored and clowast is permitted if andonly if qlowast = 1 That is X = c isin C c le clowast if qlowast = 1 and X = c isin C c lt clowastif qlowast = 0 Note that the full censorship policy clowast = 0 and qlowast = 0 (where all mediaoutlets are censored) and the free media policy clowast = 1 and qlowast = 1 (where all mediaoutlets are permitted) are the two extreme upper-censorship policies

As shown above the media censorship problem is equivalent to the linear persuasionproblem in which the sender is restricted to monotone partitional signals and thesenderrsquos indirect utility V is given by (6) We thus apply Theorem 1prime(B) to obtainthe following result

Theorem 1primeprime If V is S-shaped then an optimal censorship policy is upper censorship

To illustrate Theorem 1primeprime suppose the government is interested only in the aggregateaction so that ν = 0 and γ depends only on the aggregate action a so V (m) =γ(G(m)) Thus an upper-censorship policy is optimal if the composition functionγ(G(middot)) is S-shaped For example this condition holds if the government is interestedin reaching a certain approval threshold (eg a simple majority) so that γ is a stepfunction This condition also holds if γ is S-shaped and G is uniform or S-shapedwith the same inflection point as γ20

54 Canonical Case We now impose more structure on the utilities to obtain asharper result for the optimality of upper censorship and to perform a comparativestatics analysis Analogously to assumption (A1) we assume that the governmentrsquosutility is a weighted average of the citizensrsquo utility and their aggregate action

983133 1

0

u(θ r ar a)dG(r) + δa

where u(θ r ar a) = (θ minus r)ar + ζ(r)a

(A2)

for some continuously differentiable function ζ Define

β =

983133 1

0

ζ(r)dG(r) + δ

20This condition holds in many special cases where γ and G are S-shaped even when γ and Ghave different inflection points

20 KOLOTILIN MYLOVANOV ZAPECHELNYUK

The term983125 1

0ζ(r)dG(r) is the governmentrsquos expected bias towards the citizensrsquo action

a = 1 due to the citizensrsquo externality and the term δ is the governmentrsquos intrinsicbias towards a greater aggregate action a Thus we interpret β as the governmentrsquosaggregate bias This decomposition of the bias allows for different interpretations whythe government is biased So β can be high because the government is too self-serving(high δ) or because the government is benevolent (low δ) but wishes to internalizestrong positive externalities of the citizens (high ζ(r)) To ease interpretations weassume that β gt 021

Under Assumption (A2) the governmentrsquos indirect utility V given by (6) becomes

V (m) =

983133 m

0

(mminus r)dG(r) + βa =

983133 m

0

v(m r)dG(r)

where

v(m r) = β

9830611 +

1

β(mminus r)

983062

This is the same as v given by Assumption (A1) with ρ = βminus1 up to rescaling by apositive constant Consequently we can apply Theorem 2prime(B) to obtain the followingresult

Theorem 2primeprime Let (A2) hold If the density g of citizens types is log-concave then anoptimal censorship policy is upper censorship

We now apply the comparative statics analysis presented in Section 32 Note thatthe upper-censorship policies are ordered according to the amount of informationtransmitted to the citizens in the sense of Blackwell (1953) A greater censorshipthreshold clowast means that more media outlets are permitted With this order in mindwe apply Theorem 3prime(B) to make a comparative statics analysis on the amount ofinformation that is optimally disclosed by the government

First the censorship cutoff is increasing in the alignment parameter ρ Recall thatthis is a reciprocal of the governmentrsquos bias ρ = βminus1 This means that the governmentoptimally discloses more information (the censorship cutoff is greater) when it is lessbiased (β is smaller) Intuitively as β decreases the government puts more weighton the citizensrsquo utility so it optimally endows the population with a higher utility bycensoring fewer media outlets and disclosing more information

Second the censorship cutoff is increasing in the magnitude t of the horizontal shiftof the density g A greater t corresponds to a greater opportunity cost of actiona = 1 for each citizen This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more difficultto persuade to take action a = 1 (parameter t is greater) Informally speaking topersuade the same type of the citizen the government needs to increase the posterior

21If β lt 0 then swapping the roles of a = 0 and a = 1 reverses the sign of the bias If β = 0then the governmentrsquos utility and the citizensrsquo private interests coincide so it is trivially optimal topermit all media outlets

CENSORSHIP AS OPTIMAL PERSUASION 21

mean But the expected posterior mean must be equal to the prior mean so it is notpossible to increase all posterior means Due to the log-concave shape of the densityof the citizensrsquo types this tradeoff is resolved by increasing the posterior mean of thepooling interval E[ω|ω ge clowast] This is done by shrinking the pooling interval [clowast 1]that is increasing the censorship cutoff clowast

Third when the number of media outlets is sufficiently large the censorship cutoffis increasing in the stretch factor σ of the density g around its peak τ (see Footnote15) A greater σ corresponds to a more dispersed distribution of the opportunitycost in the population This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more diverse inhow difficult they are to persuade Intuitively when the types are more spread outpersuading the same mass of types requires a greater posterior mean which leads toa higher censorship cutoff similarly to the comparative statics in t discussed above

55 Extensions and Open Questions Let us now consider a few extensions ofour model of media censorship

In our model the set of media outlets is exogenous and the governmentrsquos only in-strument is censorship We now consider three alternative ways of expanding thegovernmentrsquos instruments of influence

First suppose that the government is able not only to censor existing media outletsbut also to introduce new media outlets with chosen editorial policies This is equiv-alent to our censorship model where all media outlets in [0 1] are initially availableand the government can censor any subset of them When all media outlets in [0 1]are permitted the revealed information about θ (which we call ω) is θ itself Thusω = θ is a continuous random variable with distribution F = T So we can nowapply our results for the continuous state from Section 3 instead of those for the dis-crete state in Section 4 reaching the same conclusion about the optimality of uppercensorship

Second suppose that the government can garble information available from mediaoutlets That is the government is not restricted to monotone partitions it cancreate arbitrary signals about state ω for the citizens to observe This becomes ageneral persuasion problem and our Theorems 1prime(A) 2prime(A) and 3prime(A) apply

Third suppose that the government is able to restrict not only which media outletsare permitted but also how many media outlets each citizen can choose to observeIn this extension the citizens are not allowed to communicate with one another (oth-erwise they could share the information thus observing all permitted media outletsindirectly) This extension does not affect our results as long as each citizen is al-lowed to access at least one media outlet of his choice as in Chan and Suen (2008)Intuitively this is because each citizen categorizes the information from the mediaoutlets into ldquogood newsrdquo where a = 1 is optimal and ldquobad newsrdquo where a = 0 is

22 KOLOTILIN MYLOVANOV ZAPECHELNYUK

optimal Because the information from the media outlets induces a monotone parti-tion it means that ldquogood newsrdquo is separated from ldquobad newsrdquo by a threshold mediaoutlet that depends on the citizenrsquos type That is it is sufficient to observe a singlethreshold media outlet to distinguish ldquogood newsrdquo from ldquobad newsrdquo

In our model each citizenrsquos utility depends on the aggregate action a through theexternality term κ(θ r a) which does not affect the chosen action Let us relax thisassumption so that a citizenrsquos optimal choice can depend on a We can still writethe senderrsquos indirect utility V as a function of the posterior mean state and applyour results However V is no longer uniquely determined by the primitives of themodel It is now endogenous and depends on an equilibrium the citizens play aseach citizenrsquos optimal action now depends on what all citizens do in equilibrium Forexample given the same information about the state a citizen could prefer to choosea = 1 if and only if sufficiently many citizens choose the same action so a is largeenough This creates the problem of multiplicity of equilibria and as a consequencethe dependence of optimal censorship on equilibrium selection

As a side our media censorship problem can also be applied to spatial voting mod-els as in Chiang and Knight (2011) Consider a government party (p = G) and anopposition party (p = O) competing in an election If party p wins a voter withan ideological position r gets utility wp minus (r minus rp)

2 where wp is the quality or va-lence of party p and rp is the ideology or policy platform of party p Voters knowthe partiesrsquo ideologies and obtain information about the partiesrsquo qualities from allavailable media outlets Each voter supports the party that maximizes his expectedutility Within this context our analysis still applies because the voterrsquos utility dif-ference between the government and opposition parties is proportional to θminusr whereθ = (wG minus wO + r2O minus r2G)2(rO minus rG)

There are a few more extensions that can be relevant in applications First insteadof complete censorship of a media outlet there can be a cost of accessing it Forexample an international news channel can be censored by a local government butcitizens may still access it through VPN at some cost Second it can be costly forthe government to censor media outlets So another important question to answeris how a government may prioritize censoring Finally citizens can incur some costof following each media outlet While we have already mentioned that citizens gainno benefit from following more than one outlet it is entirely possible for them tostop watching news altogether if it is sufficiently uninformative These extensions arenontrivial and left for future research

Appendix

Proof of Theorem 1 Suppose that V is S-shaped For mωlowast isin [0 1] and mlowast(ωlowast) =E[ω|ω ge ωlowast] define

V (mωlowast) = V (mlowast(ωlowast)) + V prime(mlowast(ωlowast))(mminusmlowast(ωlowast)) (7)

CENSORSHIP AS OPTIMAL PERSUASION 23

Since V (m) is convex on [0 τ ] and concave on [τ 1] there exists ωlowast isin [0 τ ] thatsatisfies condition (3) that is

V (mωlowast) ge (le)V (m) for all m ge (lt)ωlowast (8)

This is true because V (ωlowastωlowast) minus V (ωlowast) is single-crossing from below To see thisconsider Figure 1 Observe that (3) can hold only if ωlowast is on the convex part of V andmlowast is on the concave part of V As the censorship cutoff ωlowast increases the posteriormean state of the pooling message mlowast(ωlowast) also increases But because V is concaveat mlowast(ωlowast) the dashed tangent line V (mωlowast) becomes flatter in m so it crosses thesolid line V (m) at a smaller m

Consider an arbitrary signal s Let H be the distribution of m = E[ω|s] inducedby signal s Since each signal is a garbling of the fully informative signal F is amean-preserving spread of H Thus the senderrsquos expected utility is smaller undersignal s than under the upper-censorship signal with cutoff ωlowast as follows from

983133 1

0

V (m)dH(m) le983133 1

0

maxV (m) V (mωlowast)dH(m)

le983133 1

0

maxV (m) V (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mlowast(ωlowast))dF (m)

where the second line holds because maxV (m) V (mωlowast) is convex in m and F isa mean-preserving spread of H the third line holds by (8) and the last line holds by

(7) and by983125 1

ωlowast(mminusmlowast(ωlowast))dF (m) = 0

Conversely suppose that V is not S-shaped Then there exist 0 le m1 lt m2 lem3 lt m4 le 1 such that V primeprime(m) lt 0 for m isin (m1m2) V

primeprime(m) = 0 for m isin [m2m3]and V primeprime(m) gt 0 for m isin (m3m4) because V primeprime is continuous by assumption It isstraightforward to show that there exists 983141mlowast isin (m1m2) (sufficiently close to m2) and983141ωlowast isin (m3m4) such that

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) lt V (m) for all m isin (983141ωlowastm4]

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) gt V (m) for all m isin [m1 983141ωlowast] 983141mlowast

Moreover there exists a continuous distribution F such that the support of F is asubset of [m1m4] and 983141mlowast = E[ω|ω le 983141ωlowast] Using a chain of inequalities similar to theone above it is easy to show that lower-censorship signal 983141slowast with cutoff 983141ωlowast is optimalFurthermore it is uniquely optimal as both inequalities hold with equalities only forsignal 983141slowast This in turn implies that any upper-censorship signal is suboptimal

24 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Proof of Lemma 1 Notice that under assumption (A1) we have

V primeprime(m) = gprime(m) + ρg(m) for m isin [0 1]

Recall that V is S-shaped if and only if V primeprime(m) = gprime (m) + ρg (m) is single-crossingfrom above Using Proposition 1 in Quah and Strulovici (2012) it is easy to showthat V primeprime(m) is single-crossing from above for all ρ isin R if and only if gprime (m)g (m) isnonincreasing in m (that is ln g(m) is concave)

Proof of Theorem 2 Immediate by Theorem 1 and Lemma 1

Proof of Theorem 3 Part (i) We fix t and σ and vary ρ Without loss ofgenerality let t = 0 and σ = 1 (so that Gtσ = G) Consider a signal that induces thedistribution H of posterior mean m This signal implements the receiverrsquos interimutility U given by

U(r) =

983133 1

r

(mminus r)dH(m) for r isin [0 1]

which holds because the receiver acts if and only if m ge r Since types r lt 0 alwayschoose a = 1 and types r gt 1 always choose a = 0 we can exclude them fromconsideration and assume that supp(G) = [0 1] The senderrsquos expected utility is then

983133 1

0

V (m)dH(m) =

983133 1

0

983133 m

0

(1 + ρ(mminus r))dG(r)dH(m)

=

983133 1

0

983133 1

r

(1 + ρ(mminus r))dH(m)dG(r)

=

983133 1

0

(1minusH(r) + ρU(r))dG(r)

Consider ρ2 gt ρ1 Suppose by contradiction that the corresponding optimal upper-censorship signals s2 and s1 are such that s1 is strictly more informative than s2 LetU2 and U1 be the receiverrsquos interim utilities implemented by s2 and s1 Since thesender prefers signal s2 under ρ2 and signal s1 under ρ1 we have

983133 1

0

(1minusH2 (r) + ρ2U2 (r)) dG (r) ge983133 1

0

(1minusH1 (r) + ρ2U1 (r)) dG (r)

983133 1

0

(1minusH1 (r) + ρ1U1 (r)) dG (r) ge983133 1

0

(1minusH2 (r) + ρ1U2 (r)) dG (r)

Summing up these inequalities gives

(ρ2 minus ρ1)

983133 1

0

(U2 (r)minus U1 (r)) dG (r) ge 0

leading to a contradiction because U1(r) ge U2(r) for all r with strict inequality forsome r given that s1 is strictly more informative than s2

CENSORSHIP AS OPTIMAL PERSUASION 25

Part (ii) We fix ρ isin R and σ = 1 and vary t The indirect utility of the senderconditional on m isin [0 1] is given by

Vt(m) =

983133

rlem

(1+ρ(mminusr))gt1(r)dr =

983133

rlemminust

(1+ρ(mminustminusr))g(r)dr = V (mminust) (9)

By Theorem 1 when t0 = 0 an optimal cutoff ωlowast0 satisfies

V (m) le V (mlowast0) + V prime(mlowast

0)(mminusmlowast0) for m isin [ωlowast

0 1]

When V is S-shaped and t1 lt 0 it is easy to see that

V (mminus t1) le V (mlowast0 minus t1) + V prime(mlowast

0 minus t1)(mminusmlowast0) for m isin [ωlowast

0 1]

which implies that when t = t1 the above inequality changes its sign at some ωlowast1

that satisfies ωlowast1 le ωlowast

0 It follows from (9) that ωlowast1 is an optimal cutoff for t = t1 and

as we established above ωlowast1 le ωlowast

0

Part (iii) We fix ρ ge 0 and t = 0 and vary σ The indirect utility of the senderconditional on m isin [0 1] is given by

V ρσσ(m) =

983133

rlem

9830591 +

ρ

σ(mminus r)

983060dG0σ(r)

=

983133

rlem

9830591 +

ρ

σ(mminus r)

983060g

983061r minus (1minus σ)τ

σ

9830621

σdr

=

983133

rlemminus(1minusσ)τσ

9830611 + ρ

983061mminus (1minus σ)τ

σminus r

983062983062g(r)dr

= Vρ1

983061mminus (1minus σ)τ

σ

983062 (10)

Let σ0 = 1 σ1 lt 1 and ρ1 = ρσ1 To compare the optimal cutoffs under parameters(ρ σ0) and (ρ σ1) we consider two steps first comparing the cutoffs between (ρ σ0)and (ρ1 σ1) and then comparing the cutoffs between (ρ1 σ1) and (ρ σ1)

First we compare an optimal cutoff ωlowast0 under (ρ σ0) = (ρ 1) and an optimal cutoff

ωlowast1 under (ρ1 σ1) By Theorem 1 ωlowast

0 satisfies

Vρ1(m) le Vρ1(mlowast0) + V prime

ρ1(mlowast0)(mminusmlowast

0) for m isin [ωlowast0 1]

When V is increasing (implied by ρ ge 0) and S-shaped and σ1 lt 1 it is easy to seethat for all m isin [ωlowast

0 1]

Vρ1

983061mminus (1minus σ1)τ

σ1

983062le Vρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062+ V prime

ρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062(mminusmlowast

0)

which implies that when σ = σ1 lt 1 the above inequality changes its sign at someωlowast1 that satisfies ωlowast

1 le ωlowast0 It follows from (10) that ωlowast

1 is an optimal cutoff under(ρσ σ1) = (ρ1 σ1) and as we established above ωlowast

1 le ωlowast0

26 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Second we compare the optimal cutoff ωlowast1 under (ρ1 σ1) and the optimal cutoff ωlowast

2

under (ρ σ1) Because ρ lt ρ1 = ρσ1 by part (i) we have ωlowast2 le ωlowast

1

Combining ωlowast1 le ωlowast

0 and ωlowast2 le ωlowast

1 we obtain ωlowast2 le ωlowast

0

Proof of Theorem 1prime Part (A) The proof is analogous to that of Theorem 1 andis thus omitted

Part (B) The only if statement follows from that in part (A) (see the proof of Theorem1) and the fact that we can always ensure that an optimal stochastic censorship(ωlowast qlowast) is deterministic (so qlowast isin 0 1) by the choice of the support and distributionof ω Thus we only need to prove the if statement

Let F be a distribution of ω with support on n values 0 lt ω0 lt lt ωnminus1 lt 1 Thisis without loss of generality because if 0 andor 1 were in the support of F we couldredefine the state as ω = (ω + ε)(1 + 2ε) and the type as r = (r + ε)(1 + 2ε)

Let X sub ω1 ωnminus1 identify a set of cutoffs defining a monotone partition for eachωi isin X the receiver is informed whether ω lt ωi or ω ge ωi Note that we exclude ω0

from the set of cutoffs because ω ge ω0 always holds

Note that X is deterministic upper censorship if and only if there exists a censorshipthreshold ωlowast isin [0 1] such that for each i = 1 nminus 1 ωi isin X if and only if ωi lt ωlowastthat is we can assume that qlowast = 0 without loss of generality

Let VX be the senderrsquos expected utility when the monotone partition is given by X

If n = 2 then there are only two partitions X = ω1 (full disclosure) and X = empty(no disclosure) Both are upper censorships so part (B) holds trivially

If n = 3 then there are four monotone partitions X = ω1ω2 (full disclosure)X = ω2 X = ω1 and X = empty (no disclosure) Only X = ω2 is not uppercensorship The next lemma proves that X = ω2 cannot be uniquely optimal whenV is S-shaped thus proving the optimality of upper censorship for n = 3

Denote

mjk = E983045ω|ω isin [ωjωk]

983046

For convenience of notation we write mj = ωj

Lemma 3 Let V be S-shaped with an inflexion point τ Let ω be a discrete randomvariable with support on m0m1m2 and probabilities p0 p1 p2 where 0 lt m0 ltm1 lt m2 lt 1 Then

(i) if τ le m0 then Vm2 le Vempty

(ii) if τ ge m1 then Vm2 le Vm1m2

(iii) if m0 lt τ lt m1 then Vm2 le maxVempty Vm1m2

CENSORSHIP AS OPTIMAL PERSUASION 27

Proof Define p01 = p0 + p1 and p02 = p01 + p2 = 1 Observe that

m01 =p0m0 + p1m1

p0 + p1and m02 =

p0m0 + p1m1 + p2m2

p0 + p1 + p2

We thus have

Vempty = p02V (m02) = (p01 + p2)V

983061p01m01 + p2m2

p01 + p2

983062

Vm2 = p01V (m01) + p2V (m2) = (p0 + p1)V

983061p0m0 + p1m1

p0 + p1

983062+ p2V (m2)

Vm1m2 = p0V (m0) + p1V (m1) + p2V (m2)

Clearly

m0 lt m01 lt m1 and m0 lt m02 lt m2

Therefore

Vm2 le Vempty lArrrArrp01

p01 + p2V (m01) +

p2p01 + p2

V (m2) le V

983061p01

p01 + p2m01 +

p2p01 + p2

m2

983062= V (m02)

(11)

and

Vm2 le Vm1m2 lArrrArr

V

983061p0

p0 + p1m0 +

p1p0 + p1

m1

983062le p0

p0 + p1V (m0) +

p1p0 + p1

V (m1)(12)

Suppose that τ le m01 so V is concave on [m01m2] Then (11) holds by Jensenrsquosinequality In particular it holds for τ le m0 lt m01 thus proving part (i)

Suppose that τ ge m1 so V is convex on [m0m1] Then (12) holds by Jensenrsquosinequality thus proving part (ii)

Finally suppose that m01 lt τ lt m1 Let (τ V (τ)) lie below (or on) the straight lineA connecting (m0 V (m0)) and (m2 V (m2)) as illustrated on Figure 2 By convexityof V (m) for m lt τ it must be the case that V (m01) lies below (or on) the dashedline B By concavity of V (m) for m gt τ it must be the case that V (m1) lies above(or on) the dashed line C Consequently (12) is satisfied

Alternatively let (τ V (τ)) lie above the straight line A connecting (m0 V (m0)) and(m2 V (m2)) as illustrated on Figure 3 If Vm2 le Vm1m2 then the proof of part(iii) is complete So suppose that Vm2 gt Vm1m2 so (12) is violated

By convexity of V (m) for m lt τ it must be the case that V (m01) lies below (or on)the line B Since (12) is violated V (m01) must lie between the lines A and B Recallthat m01 lt m02 lt m2 Observe that V (m02) must be in the shaded area on Figure 3Indeed if m02 lt τ then by convexity of V (m) for m lt τ it must be the case thatV (m02) lies between (or on) the lines B and D If m02 ge τ then by concavity of

28 KOLOTILIN MYLOVANOV ZAPECHELNYUK

τ m2m0

A C

B

m01 m1

V (m0) V (τ)

V (m2)

Figure 2 The case where (τ V (τ)) lies below (or on) the straight lineconnecting (m0 V (m0)) and (m2 V (m2))

τ m2m0

A

C

B

m01

V (m0)

V (τ)

V (m2)

D

V (m01)

Figure 3 The case where (τ V (τ)) lies above the straight line con-necting (m0 V (m0)) and (m2 V (m2))

V (m) for m gt τ it must be the case that V (m02) lies above (or on) line C It followsthat (11) is satisfied

We now adapt Lemma 3 to prove the optimality of upper censorship for n ge 4

CENSORSHIP AS OPTIMAL PERSUASION 29

Fix a monotone partition X sub ω1 ωnminus1 that is not upper censorship that isthere exists i isin 2 nminus 1 such that

ωiminus1 ∕isin X and ωi isin X (13)

Denote ωn = 1 Define j and k as follows If X cap ω1 ωiminus2 ∕= empty let ωj be thelargest element inX smaller than i otherwise let ωj = ω0 IfXcapωi+1 ωnminus1 ∕= emptylet ωk be the smallest element in X greater than i otherwise let ωk = ωn = 1 Define

m0 = ωj m1 = ωiminus1 m2 = ωi

and

p0 =Pr[ω isin [ωjωiminus1)]

Pr[ω isin [ωjωk)] p1 =

Pr[ω isin [ωiminus1ωi)]

Pr[ω isin [ωjωk)] p2 =

Pr[ω isin [ωiωk)]

Pr[ω isin [ωjωk)]

To apply Lemma 3 we identify Xωi with empty X with m2 and X cup ωiminus1 withm1m2 By Lemma 3 VX le maxVXωi VXcupωiminus1 Thus replacing X withXωi if VX le VXωi and with X cup ωiminus1 otherwise weakly improves the senderrsquosexpected utility Repeatedly applying such replacements yields upper censorshipIndeed this procedure terminates after a finite number of replacements because byLemma 3 for ωj ge τ we replace X with Xωi and for ωiminus1 le τ we replace Xwith X cup ωiminus1

We thus have shown that when V is S-shaped when optimizing over the set ofmonotone partitions we can restrict attention to deterministic upper censorship Itremains to show that an optimal deterministic censorship cutoff ωlowast

d coincides with anoptimal stochastic censorship cutoff ωlowast from part (A) To see this it is straightforwardto show that

V lowast(ωlowast qlowast) =

983133

[0ωlowast)

V (ω)dF (ω) + V (ωlowast)qlowast Pr[ω = ωlowast]

+ V (mlowast(ωlowast qlowast))((1minus qlowast) Pr[ω = ωlowast] + Pr[ω gt ωlowast]) (14)

is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order Since the two de-terministic censorship cutoff pairs closest to (ωlowast qlowast) in the Blackwell informativenessorder are (ωlowast 0) and (ωlowast 1) we obtain ωlowast

d = ωlowast

Proof of Theorem 2prime Immediate by Theorem 1prime and Lemma 1

Proof of Theorem 3prime The proof for an optimal signal is analogous to that ofTheorem 3 and is thus omitted The same applies to part (i) for an optimal monotonepartition Thus it remains to prove part (ii) for an optimal monotone partition

We fix ρ and σ = 1 and vary t continuously If ωlowast isin supp(F ) then an optimalmonotone partition is an optimal signal because all qlowast isin [0 1] are optimal Thuspart (ii) in (B) holds by part (ii) in (A) If ωlowast isin supp(F ) and qlowast isin (0 1) then either(ωlowast

d qlowastd) = (ωlowast 0) or (ωlowast

d qlowastd) = (ωlowast 1) is optimal because the senderrsquos expected utility

V lowast given by (14) is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order

30 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Therefore it suffices to consider values of t at which the sender is indifferent between(ωlowast 0) and (ωlowast 1) and to show that the sender strictly prefers (ωlowast 1) to (ωlowast 0) at aslightly higher t

V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) =rArr dV lowastt (ω

lowast 0)

dtle dV lowast

t (ωlowast 1)

dt (15)

The condition V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) can be written as

Vt(mlowast(ωlowast 0)) Pr[ω ge ωlowast] = Vt(ω

lowast) Pr[ω = ωlowast] + Vt(mlowast(ωlowast 1)) Pr[ω gt ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)Pr[ω = ωlowast]

Pr[ω ge ωlowast]+ Vt(m

lowast(ωlowast 1))Pr[ω gt ωlowast]

Pr[ω ge ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + Vt(mlowast(ωlowast 1))

mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Analogously the conditiondV lowast

t (ωlowast0)dt

le dV lowastt (ωlowast1)dt

can be written as

V primet (m

lowast(ωlowast 0)) ge V primet (ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + V primet (m

lowast(ωlowast 1))mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Thus (15) states that if Vt(mlowast(ωlowast 0)) is equal to the convex combination of Vt(ω

lowast)and Vt(m

lowast(ωlowast 1)) then the derivative V primet (m

lowast(ωlowast 0)) is greater than the convex com-bination of the derivatives V prime

t (ωlowast) and V prime

t (mlowast(ωlowast 1)) When V is S-shaped it is easy

to see that this property is satisfied

Proof of Lemma 2 Observe that a signal induced by any set X sub C of permittedmedia outlets induces a monotone partition of [0 1] and thus HC sub HM Indeed letcX(ω) and cX(ω) be the media outlets in X that are the closest to ω from above andfrom below

cX(ω) = sup983059c isin X c le ω

983052cup 0

983060and cX(ω) = inf

983059c isin X c gt ω cup 1

983060

By observing the media outlets in X each citizen is informed that ω isin [cX(ω) cX(ω))

Conversely let the set C of media outlets be c0 lt middot middot middot lt cn Without loss of generalityassume that C contains the two uninformative media outlets c0 = 0 and cn = 1 which(almost) always endorse and criticize action a = 1 respectively Then the distributionF of the state ω is a discrete distribution whose support is Ω = ω0 ωnminus1 whereωi = E[θ|θ isin [ci ci+1)] for i isin 0 nminus 1 Consider a monotone partitional signalrepresented by a nondecreasing function ξ [0 1] rarr R For each realization m of thissignal that occurs with a strictly positive probability define

x(m) = min983153ω isin Ω ξ(ω) = m

983154and x(m) = max

983153ω isin Ω ξ(ω) = m

983154

This signal induces a monotone partition of the finite set Ω This partition of Ω canbe implemented by a censorship policy that prohibits a media outlet c isin C if andonly if c isin (x(m) x(m)) for some realization m

CENSORSHIP AS OPTIMAL PERSUASION 31

References

Alonso R and O Camara (2016a) ldquoPersuading Votersrdquo American EconomicReview 106 3590ndash3605

(2016b) ldquoPolitical Disagreement and Information in Electionsrdquo Games andEconomic Behavior 100 390ndash412

Bagnoli M and T Bergstrom (2005) ldquoLog-Concave Probability and Its Ap-plicationsrdquo Economic Theory 26 445ndash469

Baron D P (2006) ldquoPersistent Media Biasrdquo Journal of Public Economics 90(1)1ndash36

Besley T and A Prat (2006) ldquoHandcuffs for the Grabbing Hand MediaCapture and Government Accountabilityrdquo American Economic Review 96(3) 720ndash736

Blackwell D (1953) ldquoEquivalent Comparisons of Experimentsrdquo Annals of Math-ematical Statistics 24 265ndash272

Chan J and W Suen (2008) ldquoA Spatial Theory of News Consumption andElectoral Competitionrdquo Review of Economic Studies 75 699ndash728

Chiang C-F and B Knight (2011) ldquoMedia Bias and Influence Evidence fromNewspaper Endorsementsrdquo Review of Economic Studies 78 795ndash820

Duffie D P Dworczak and H Zhu (2017) ldquoBenchmarks in Search MarketsrdquoJournal of Finance 72 1983ndash2044

Dworczak P (2017) ldquoMechanism Design with Aftermarkets Cutoff Mecha-nismsrdquo mimeo

Dworczak P and G Martini (2019) ldquoThe Simple Economics of Optimal Per-suasionrdquo Journal of Political Economy 127 993ndash2048

Edmond C (2013) ldquoInformation Manipulation Coordination and RegimeChangerdquo Review of Economic Studies 80 1422ndash1458

Egorov G S Guriev and K Sonin (2009) ldquoWhy Resource-poor DictatorsAllow Freer Media A Theory and Evidence from Panel Datardquo American PoliticalScience Review 103 645ndash668

Gehlbach S and K Sonin (2014) ldquoGovernment Control of the Mediardquo Journalof Public Economics 118 163ndash171

Gentzkow M and E Kamenica (2016) ldquoA Rothschild-Stiglitz Approach toBayesian Persuasionrdquo American Economic Review Papers amp Proceedings 106597ndash601

Gentzkow M and J Shapiro (2006) ldquoMedia Bias and Reputationrdquo Journalof Political Economy 114(2) 280ndash316

(2010) ldquoWhat Drives Media Slant Evidence from US Daily NewspapersrdquoEconometrica 78(1) 35ndash71

Ginzburg B (2019) ldquoOptimal Information Censorshiprdquo Journal of EconomicBehavior and Organization 163 377ndash385

Goldstein I and Y Leitner (2018) ldquoStress Tests and Information DisclosurerdquoJournal of Economic Theory 177 34ndash69

32 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Guo Y and E Shmaya (2019) ldquoThe Interval Structure of Optimal DisclosurerdquoEconometrica 87 653ndash675

Johnson J P and D P Myatt (2006) ldquoOn the Simple Economics of Adver-tising Marketing and Product Designrdquo American Economic Review 96 756ndash784

Kamenica E and M Gentzkow (2011) ldquoBayesian Persuasionrdquo American Eco-nomic Review 101 2590ndash2615

Kolotilin A (2015) ldquoExperimental Design to Persuaderdquo Games and EconomicBehavior 90 215ndash226

(2018) ldquoOptimal Information Disclosure A Linear Programming Ap-proachrdquo Theoretical Economics 13 607ndash636

Kolotilin A and H Li (2019) ldquoRelational Communicationrdquo mimeoKolotilin A M Li T Mylovanov and A Zapechelnyuk (2015) ldquoPer-suasion of a Privately Informed Receiverrdquo mimeo

Kolotilin A T Mylovanov A Zapechelnyuk and M Li (2017) ldquoPer-suasion of a Privately Informed Receiverrdquo Econometrica 85 1949ndash1964

Kolotilin A and A Wolitzky (2019) ldquoAssortative Information Disclosurerdquomimeo

Lewis T R and D Sappington (1994) ldquoSupplying Information to FacilitatePrice Discriminationrdquo International Economic Review 35 309ndash327

Lorentzen P (2014) ldquoChinarsquos Strategic Censorshiprdquo American Journal of Polit-ical Science 58 402ndash414

Mullainathan S and A Shleifer (2005) ldquoThe Market for Newsrdquo AmericanEconomic Review 95(4) 1031ndash1053

Orlov D P Zryumov and A Skrzypach (2018) ldquoDesign of Macro-prudentialStress Testsrdquo mimeo

Ostrovsky M and M Schwarz (2010) ldquoInformation Disclosure and Unravelingin Matching Marketsrdquo American Economic Journal Microeconomics 2 34ndash63

Prat A and D Stromberg (2013) ldquoThe Political Economy of Mass MediardquoAdvances in Economics and Econometrics 2 135ndash187

Quah J and B Strulovici (2012) ldquoAggregating the Single Crossing PropertyrdquoEconometrica 80 2333ndash2348

Rayo L and I Segal (2010) ldquoOptimal Information Disclosurerdquo Journal ofPolitical Economy 118 949ndash987

Romanyuk G and A Smolin (2019) ldquoCream Skimming and Information Designin Matching Marketsrdquo American Economic Journal Microeconomics 11 250ndash276

Suen W (2004) ldquoThe Self-Perpetuation of Biased Beliefsrdquo Economic Journal 114377ndash396

Yamashita T (2018) ldquoOptimal Public Information Disclosure by Mechanism De-signerrdquo mimeo

Zapechelnyuk A (2019) ldquoOptimal Quality Certificationrdquo American EconomicReview Insights forthcoming

18 KOLOTILIN MYLOVANOV ZAPECHELNYUK

53 Formulating Media Censorship as Persuasion We now show that the me-dia censorship problem can be formulated as a linear persuasion problem in whichthe government is a sender and a representative citizen is a receiver

In our model the citizensrsquo and the governmentrsquos utilities are linear in θ There-fore given any information from the media outlets the utilities depend only on theposterior mean of θ

Consider an arbitrary posterior mean of θ denoted by m isin [0 1] Each citizen of typer chooses ar = 1 if and only if r le m The externality term κ plays no role in thisdecision Therefore the aggregate action a is simply the mass of all citizens whosetypes do not exceed m so a = G(m)

Next using the citizensrsquo optimal behavior we derive the governmentrsquos expected utilityconditional on the posterior mean m

V (m) = E983063983133 1

0

ν(θ r ar G(m))dG(r) + δγ(θ G(m))

983055983055983055983055m983064

=

983133 1

0

ν(m r1rlem G(m))dG(r) + δγ(mG(m)) (6)

This is the governmentrsquos indirect utility As in Section 2 the government now choosesa signal which is informative about the state to maximize its expected utility How-ever in contrast to Section 2 here the government is restricted to signals that areimplementable by a subset of a given set of media outlets

Next we show that with an appropriate definition of the state the restriction tosignals implementable by a subset of a given set of media outlets can be formulatedas a restriction to monotone partitional signals as in Scenario B in Section 2 Thuswe will be able to use our results in Section 4 to find the optimal censorship policies

Recall that the information about θ is only available through the media outlets Let ωdenote a random variable equal to the conditional expectation of θ given the messagesof all media outlets in C Note that as C is finite ω is a discrete random variablewith values in [0 1] Let F denote the distribution of ω From now on we treat therandom variable ω as the state and denote by s a signal as defined in Section 2

Let HM denote the set of all distributions of the posterior mean state induced byall monotone partitional signals and let HC denote the set of all distributions ofthe posterior mean state induced by all finite subsets of the set C of media outletsObserve that any set of media outlets induces a monotone partitional signal thatis HC sub HM We now show that every outcome implementable by an arbitrarymonotone partition in HM is also implementable by a subset of media outlets

Lemma 2 HC = HM

For illustration suppose that there is only one media outlet with the editorial policyc isin (0 1) There are two observable events θ ge c and θ lt c In each realized event

CENSORSHIP AS OPTIMAL PERSUASION 19

the citizens compute the posterior expectation of θ which we will call ω So in thisexample the distribution F of ω has two mass points one for each of the two events

Given these two mass points every monotone partition of [0 1] induces one of twopossible distributions of the posterior mean state (i) the mass points revealed and (ii)the mass points pooled Observe however that these distributions are implementableby permitting and censoring the media outlet respectively

A censorship policy is upper censorship if it censors all sufficiently disloyal mediaoutlets Specifically there exists a cutoff clowast isin C and an indicator qlowast isin 0 1 suchthat all media outlets whose editorial policies are below clowast are permitted all mediaoutlets whose editorial policies are above clowast are censored and clowast is permitted if andonly if qlowast = 1 That is X = c isin C c le clowast if qlowast = 1 and X = c isin C c lt clowastif qlowast = 0 Note that the full censorship policy clowast = 0 and qlowast = 0 (where all mediaoutlets are censored) and the free media policy clowast = 1 and qlowast = 1 (where all mediaoutlets are permitted) are the two extreme upper-censorship policies

As shown above the media censorship problem is equivalent to the linear persuasionproblem in which the sender is restricted to monotone partitional signals and thesenderrsquos indirect utility V is given by (6) We thus apply Theorem 1prime(B) to obtainthe following result

Theorem 1primeprime If V is S-shaped then an optimal censorship policy is upper censorship

To illustrate Theorem 1primeprime suppose the government is interested only in the aggregateaction so that ν = 0 and γ depends only on the aggregate action a so V (m) =γ(G(m)) Thus an upper-censorship policy is optimal if the composition functionγ(G(middot)) is S-shaped For example this condition holds if the government is interestedin reaching a certain approval threshold (eg a simple majority) so that γ is a stepfunction This condition also holds if γ is S-shaped and G is uniform or S-shapedwith the same inflection point as γ20

54 Canonical Case We now impose more structure on the utilities to obtain asharper result for the optimality of upper censorship and to perform a comparativestatics analysis Analogously to assumption (A1) we assume that the governmentrsquosutility is a weighted average of the citizensrsquo utility and their aggregate action

983133 1

0

u(θ r ar a)dG(r) + δa

where u(θ r ar a) = (θ minus r)ar + ζ(r)a

(A2)

for some continuously differentiable function ζ Define

β =

983133 1

0

ζ(r)dG(r) + δ

20This condition holds in many special cases where γ and G are S-shaped even when γ and Ghave different inflection points

20 KOLOTILIN MYLOVANOV ZAPECHELNYUK

The term983125 1

0ζ(r)dG(r) is the governmentrsquos expected bias towards the citizensrsquo action

a = 1 due to the citizensrsquo externality and the term δ is the governmentrsquos intrinsicbias towards a greater aggregate action a Thus we interpret β as the governmentrsquosaggregate bias This decomposition of the bias allows for different interpretations whythe government is biased So β can be high because the government is too self-serving(high δ) or because the government is benevolent (low δ) but wishes to internalizestrong positive externalities of the citizens (high ζ(r)) To ease interpretations weassume that β gt 021

Under Assumption (A2) the governmentrsquos indirect utility V given by (6) becomes

V (m) =

983133 m

0

(mminus r)dG(r) + βa =

983133 m

0

v(m r)dG(r)

where

v(m r) = β

9830611 +

1

β(mminus r)

983062

This is the same as v given by Assumption (A1) with ρ = βminus1 up to rescaling by apositive constant Consequently we can apply Theorem 2prime(B) to obtain the followingresult

Theorem 2primeprime Let (A2) hold If the density g of citizens types is log-concave then anoptimal censorship policy is upper censorship

We now apply the comparative statics analysis presented in Section 32 Note thatthe upper-censorship policies are ordered according to the amount of informationtransmitted to the citizens in the sense of Blackwell (1953) A greater censorshipthreshold clowast means that more media outlets are permitted With this order in mindwe apply Theorem 3prime(B) to make a comparative statics analysis on the amount ofinformation that is optimally disclosed by the government

First the censorship cutoff is increasing in the alignment parameter ρ Recall thatthis is a reciprocal of the governmentrsquos bias ρ = βminus1 This means that the governmentoptimally discloses more information (the censorship cutoff is greater) when it is lessbiased (β is smaller) Intuitively as β decreases the government puts more weighton the citizensrsquo utility so it optimally endows the population with a higher utility bycensoring fewer media outlets and disclosing more information

Second the censorship cutoff is increasing in the magnitude t of the horizontal shiftof the density g A greater t corresponds to a greater opportunity cost of actiona = 1 for each citizen This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more difficultto persuade to take action a = 1 (parameter t is greater) Informally speaking topersuade the same type of the citizen the government needs to increase the posterior

21If β lt 0 then swapping the roles of a = 0 and a = 1 reverses the sign of the bias If β = 0then the governmentrsquos utility and the citizensrsquo private interests coincide so it is trivially optimal topermit all media outlets

CENSORSHIP AS OPTIMAL PERSUASION 21

mean But the expected posterior mean must be equal to the prior mean so it is notpossible to increase all posterior means Due to the log-concave shape of the densityof the citizensrsquo types this tradeoff is resolved by increasing the posterior mean of thepooling interval E[ω|ω ge clowast] This is done by shrinking the pooling interval [clowast 1]that is increasing the censorship cutoff clowast

Third when the number of media outlets is sufficiently large the censorship cutoffis increasing in the stretch factor σ of the density g around its peak τ (see Footnote15) A greater σ corresponds to a more dispersed distribution of the opportunitycost in the population This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more diverse inhow difficult they are to persuade Intuitively when the types are more spread outpersuading the same mass of types requires a greater posterior mean which leads toa higher censorship cutoff similarly to the comparative statics in t discussed above

55 Extensions and Open Questions Let us now consider a few extensions ofour model of media censorship

In our model the set of media outlets is exogenous and the governmentrsquos only in-strument is censorship We now consider three alternative ways of expanding thegovernmentrsquos instruments of influence

First suppose that the government is able not only to censor existing media outletsbut also to introduce new media outlets with chosen editorial policies This is equiv-alent to our censorship model where all media outlets in [0 1] are initially availableand the government can censor any subset of them When all media outlets in [0 1]are permitted the revealed information about θ (which we call ω) is θ itself Thusω = θ is a continuous random variable with distribution F = T So we can nowapply our results for the continuous state from Section 3 instead of those for the dis-crete state in Section 4 reaching the same conclusion about the optimality of uppercensorship

Second suppose that the government can garble information available from mediaoutlets That is the government is not restricted to monotone partitions it cancreate arbitrary signals about state ω for the citizens to observe This becomes ageneral persuasion problem and our Theorems 1prime(A) 2prime(A) and 3prime(A) apply

Third suppose that the government is able to restrict not only which media outletsare permitted but also how many media outlets each citizen can choose to observeIn this extension the citizens are not allowed to communicate with one another (oth-erwise they could share the information thus observing all permitted media outletsindirectly) This extension does not affect our results as long as each citizen is al-lowed to access at least one media outlet of his choice as in Chan and Suen (2008)Intuitively this is because each citizen categorizes the information from the mediaoutlets into ldquogood newsrdquo where a = 1 is optimal and ldquobad newsrdquo where a = 0 is

22 KOLOTILIN MYLOVANOV ZAPECHELNYUK

optimal Because the information from the media outlets induces a monotone parti-tion it means that ldquogood newsrdquo is separated from ldquobad newsrdquo by a threshold mediaoutlet that depends on the citizenrsquos type That is it is sufficient to observe a singlethreshold media outlet to distinguish ldquogood newsrdquo from ldquobad newsrdquo

In our model each citizenrsquos utility depends on the aggregate action a through theexternality term κ(θ r a) which does not affect the chosen action Let us relax thisassumption so that a citizenrsquos optimal choice can depend on a We can still writethe senderrsquos indirect utility V as a function of the posterior mean state and applyour results However V is no longer uniquely determined by the primitives of themodel It is now endogenous and depends on an equilibrium the citizens play aseach citizenrsquos optimal action now depends on what all citizens do in equilibrium Forexample given the same information about the state a citizen could prefer to choosea = 1 if and only if sufficiently many citizens choose the same action so a is largeenough This creates the problem of multiplicity of equilibria and as a consequencethe dependence of optimal censorship on equilibrium selection

As a side our media censorship problem can also be applied to spatial voting mod-els as in Chiang and Knight (2011) Consider a government party (p = G) and anopposition party (p = O) competing in an election If party p wins a voter withan ideological position r gets utility wp minus (r minus rp)

2 where wp is the quality or va-lence of party p and rp is the ideology or policy platform of party p Voters knowthe partiesrsquo ideologies and obtain information about the partiesrsquo qualities from allavailable media outlets Each voter supports the party that maximizes his expectedutility Within this context our analysis still applies because the voterrsquos utility dif-ference between the government and opposition parties is proportional to θminusr whereθ = (wG minus wO + r2O minus r2G)2(rO minus rG)

There are a few more extensions that can be relevant in applications First insteadof complete censorship of a media outlet there can be a cost of accessing it Forexample an international news channel can be censored by a local government butcitizens may still access it through VPN at some cost Second it can be costly forthe government to censor media outlets So another important question to answeris how a government may prioritize censoring Finally citizens can incur some costof following each media outlet While we have already mentioned that citizens gainno benefit from following more than one outlet it is entirely possible for them tostop watching news altogether if it is sufficiently uninformative These extensions arenontrivial and left for future research

Appendix

Proof of Theorem 1 Suppose that V is S-shaped For mωlowast isin [0 1] and mlowast(ωlowast) =E[ω|ω ge ωlowast] define

V (mωlowast) = V (mlowast(ωlowast)) + V prime(mlowast(ωlowast))(mminusmlowast(ωlowast)) (7)

CENSORSHIP AS OPTIMAL PERSUASION 23

Since V (m) is convex on [0 τ ] and concave on [τ 1] there exists ωlowast isin [0 τ ] thatsatisfies condition (3) that is

V (mωlowast) ge (le)V (m) for all m ge (lt)ωlowast (8)

This is true because V (ωlowastωlowast) minus V (ωlowast) is single-crossing from below To see thisconsider Figure 1 Observe that (3) can hold only if ωlowast is on the convex part of V andmlowast is on the concave part of V As the censorship cutoff ωlowast increases the posteriormean state of the pooling message mlowast(ωlowast) also increases But because V is concaveat mlowast(ωlowast) the dashed tangent line V (mωlowast) becomes flatter in m so it crosses thesolid line V (m) at a smaller m

Consider an arbitrary signal s Let H be the distribution of m = E[ω|s] inducedby signal s Since each signal is a garbling of the fully informative signal F is amean-preserving spread of H Thus the senderrsquos expected utility is smaller undersignal s than under the upper-censorship signal with cutoff ωlowast as follows from

983133 1

0

V (m)dH(m) le983133 1

0

maxV (m) V (mωlowast)dH(m)

le983133 1

0

maxV (m) V (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mlowast(ωlowast))dF (m)

where the second line holds because maxV (m) V (mωlowast) is convex in m and F isa mean-preserving spread of H the third line holds by (8) and the last line holds by

(7) and by983125 1

ωlowast(mminusmlowast(ωlowast))dF (m) = 0

Conversely suppose that V is not S-shaped Then there exist 0 le m1 lt m2 lem3 lt m4 le 1 such that V primeprime(m) lt 0 for m isin (m1m2) V

primeprime(m) = 0 for m isin [m2m3]and V primeprime(m) gt 0 for m isin (m3m4) because V primeprime is continuous by assumption It isstraightforward to show that there exists 983141mlowast isin (m1m2) (sufficiently close to m2) and983141ωlowast isin (m3m4) such that

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) lt V (m) for all m isin (983141ωlowastm4]

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) gt V (m) for all m isin [m1 983141ωlowast] 983141mlowast

Moreover there exists a continuous distribution F such that the support of F is asubset of [m1m4] and 983141mlowast = E[ω|ω le 983141ωlowast] Using a chain of inequalities similar to theone above it is easy to show that lower-censorship signal 983141slowast with cutoff 983141ωlowast is optimalFurthermore it is uniquely optimal as both inequalities hold with equalities only forsignal 983141slowast This in turn implies that any upper-censorship signal is suboptimal

24 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Proof of Lemma 1 Notice that under assumption (A1) we have

V primeprime(m) = gprime(m) + ρg(m) for m isin [0 1]

Recall that V is S-shaped if and only if V primeprime(m) = gprime (m) + ρg (m) is single-crossingfrom above Using Proposition 1 in Quah and Strulovici (2012) it is easy to showthat V primeprime(m) is single-crossing from above for all ρ isin R if and only if gprime (m)g (m) isnonincreasing in m (that is ln g(m) is concave)

Proof of Theorem 2 Immediate by Theorem 1 and Lemma 1

Proof of Theorem 3 Part (i) We fix t and σ and vary ρ Without loss ofgenerality let t = 0 and σ = 1 (so that Gtσ = G) Consider a signal that induces thedistribution H of posterior mean m This signal implements the receiverrsquos interimutility U given by

U(r) =

983133 1

r

(mminus r)dH(m) for r isin [0 1]

which holds because the receiver acts if and only if m ge r Since types r lt 0 alwayschoose a = 1 and types r gt 1 always choose a = 0 we can exclude them fromconsideration and assume that supp(G) = [0 1] The senderrsquos expected utility is then

983133 1

0

V (m)dH(m) =

983133 1

0

983133 m

0

(1 + ρ(mminus r))dG(r)dH(m)

=

983133 1

0

983133 1

r

(1 + ρ(mminus r))dH(m)dG(r)

=

983133 1

0

(1minusH(r) + ρU(r))dG(r)

Consider ρ2 gt ρ1 Suppose by contradiction that the corresponding optimal upper-censorship signals s2 and s1 are such that s1 is strictly more informative than s2 LetU2 and U1 be the receiverrsquos interim utilities implemented by s2 and s1 Since thesender prefers signal s2 under ρ2 and signal s1 under ρ1 we have

983133 1

0

(1minusH2 (r) + ρ2U2 (r)) dG (r) ge983133 1

0

(1minusH1 (r) + ρ2U1 (r)) dG (r)

983133 1

0

(1minusH1 (r) + ρ1U1 (r)) dG (r) ge983133 1

0

(1minusH2 (r) + ρ1U2 (r)) dG (r)

Summing up these inequalities gives

(ρ2 minus ρ1)

983133 1

0

(U2 (r)minus U1 (r)) dG (r) ge 0

leading to a contradiction because U1(r) ge U2(r) for all r with strict inequality forsome r given that s1 is strictly more informative than s2

CENSORSHIP AS OPTIMAL PERSUASION 25

Part (ii) We fix ρ isin R and σ = 1 and vary t The indirect utility of the senderconditional on m isin [0 1] is given by

Vt(m) =

983133

rlem

(1+ρ(mminusr))gt1(r)dr =

983133

rlemminust

(1+ρ(mminustminusr))g(r)dr = V (mminust) (9)

By Theorem 1 when t0 = 0 an optimal cutoff ωlowast0 satisfies

V (m) le V (mlowast0) + V prime(mlowast

0)(mminusmlowast0) for m isin [ωlowast

0 1]

When V is S-shaped and t1 lt 0 it is easy to see that

V (mminus t1) le V (mlowast0 minus t1) + V prime(mlowast

0 minus t1)(mminusmlowast0) for m isin [ωlowast

0 1]

which implies that when t = t1 the above inequality changes its sign at some ωlowast1

that satisfies ωlowast1 le ωlowast

0 It follows from (9) that ωlowast1 is an optimal cutoff for t = t1 and

as we established above ωlowast1 le ωlowast

0

Part (iii) We fix ρ ge 0 and t = 0 and vary σ The indirect utility of the senderconditional on m isin [0 1] is given by

V ρσσ(m) =

983133

rlem

9830591 +

ρ

σ(mminus r)

983060dG0σ(r)

=

983133

rlem

9830591 +

ρ

σ(mminus r)

983060g

983061r minus (1minus σ)τ

σ

9830621

σdr

=

983133

rlemminus(1minusσ)τσ

9830611 + ρ

983061mminus (1minus σ)τ

σminus r

983062983062g(r)dr

= Vρ1

983061mminus (1minus σ)τ

σ

983062 (10)

Let σ0 = 1 σ1 lt 1 and ρ1 = ρσ1 To compare the optimal cutoffs under parameters(ρ σ0) and (ρ σ1) we consider two steps first comparing the cutoffs between (ρ σ0)and (ρ1 σ1) and then comparing the cutoffs between (ρ1 σ1) and (ρ σ1)

First we compare an optimal cutoff ωlowast0 under (ρ σ0) = (ρ 1) and an optimal cutoff

ωlowast1 under (ρ1 σ1) By Theorem 1 ωlowast

0 satisfies

Vρ1(m) le Vρ1(mlowast0) + V prime

ρ1(mlowast0)(mminusmlowast

0) for m isin [ωlowast0 1]

When V is increasing (implied by ρ ge 0) and S-shaped and σ1 lt 1 it is easy to seethat for all m isin [ωlowast

0 1]

Vρ1

983061mminus (1minus σ1)τ

σ1

983062le Vρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062+ V prime

ρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062(mminusmlowast

0)

which implies that when σ = σ1 lt 1 the above inequality changes its sign at someωlowast1 that satisfies ωlowast

1 le ωlowast0 It follows from (10) that ωlowast

1 is an optimal cutoff under(ρσ σ1) = (ρ1 σ1) and as we established above ωlowast

1 le ωlowast0

26 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Second we compare the optimal cutoff ωlowast1 under (ρ1 σ1) and the optimal cutoff ωlowast

2

under (ρ σ1) Because ρ lt ρ1 = ρσ1 by part (i) we have ωlowast2 le ωlowast

1

Combining ωlowast1 le ωlowast

0 and ωlowast2 le ωlowast

1 we obtain ωlowast2 le ωlowast

0

Proof of Theorem 1prime Part (A) The proof is analogous to that of Theorem 1 andis thus omitted

Part (B) The only if statement follows from that in part (A) (see the proof of Theorem1) and the fact that we can always ensure that an optimal stochastic censorship(ωlowast qlowast) is deterministic (so qlowast isin 0 1) by the choice of the support and distributionof ω Thus we only need to prove the if statement

Let F be a distribution of ω with support on n values 0 lt ω0 lt lt ωnminus1 lt 1 Thisis without loss of generality because if 0 andor 1 were in the support of F we couldredefine the state as ω = (ω + ε)(1 + 2ε) and the type as r = (r + ε)(1 + 2ε)

Let X sub ω1 ωnminus1 identify a set of cutoffs defining a monotone partition for eachωi isin X the receiver is informed whether ω lt ωi or ω ge ωi Note that we exclude ω0

from the set of cutoffs because ω ge ω0 always holds

Note that X is deterministic upper censorship if and only if there exists a censorshipthreshold ωlowast isin [0 1] such that for each i = 1 nminus 1 ωi isin X if and only if ωi lt ωlowastthat is we can assume that qlowast = 0 without loss of generality

Let VX be the senderrsquos expected utility when the monotone partition is given by X

If n = 2 then there are only two partitions X = ω1 (full disclosure) and X = empty(no disclosure) Both are upper censorships so part (B) holds trivially

If n = 3 then there are four monotone partitions X = ω1ω2 (full disclosure)X = ω2 X = ω1 and X = empty (no disclosure) Only X = ω2 is not uppercensorship The next lemma proves that X = ω2 cannot be uniquely optimal whenV is S-shaped thus proving the optimality of upper censorship for n = 3

Denote

mjk = E983045ω|ω isin [ωjωk]

983046

For convenience of notation we write mj = ωj

Lemma 3 Let V be S-shaped with an inflexion point τ Let ω be a discrete randomvariable with support on m0m1m2 and probabilities p0 p1 p2 where 0 lt m0 ltm1 lt m2 lt 1 Then

(i) if τ le m0 then Vm2 le Vempty

(ii) if τ ge m1 then Vm2 le Vm1m2

(iii) if m0 lt τ lt m1 then Vm2 le maxVempty Vm1m2

CENSORSHIP AS OPTIMAL PERSUASION 27

Proof Define p01 = p0 + p1 and p02 = p01 + p2 = 1 Observe that

m01 =p0m0 + p1m1

p0 + p1and m02 =

p0m0 + p1m1 + p2m2

p0 + p1 + p2

We thus have

Vempty = p02V (m02) = (p01 + p2)V

983061p01m01 + p2m2

p01 + p2

983062

Vm2 = p01V (m01) + p2V (m2) = (p0 + p1)V

983061p0m0 + p1m1

p0 + p1

983062+ p2V (m2)

Vm1m2 = p0V (m0) + p1V (m1) + p2V (m2)

Clearly

m0 lt m01 lt m1 and m0 lt m02 lt m2

Therefore

Vm2 le Vempty lArrrArrp01

p01 + p2V (m01) +

p2p01 + p2

V (m2) le V

983061p01

p01 + p2m01 +

p2p01 + p2

m2

983062= V (m02)

(11)

and

Vm2 le Vm1m2 lArrrArr

V

983061p0

p0 + p1m0 +

p1p0 + p1

m1

983062le p0

p0 + p1V (m0) +

p1p0 + p1

V (m1)(12)

Suppose that τ le m01 so V is concave on [m01m2] Then (11) holds by Jensenrsquosinequality In particular it holds for τ le m0 lt m01 thus proving part (i)

Suppose that τ ge m1 so V is convex on [m0m1] Then (12) holds by Jensenrsquosinequality thus proving part (ii)

Finally suppose that m01 lt τ lt m1 Let (τ V (τ)) lie below (or on) the straight lineA connecting (m0 V (m0)) and (m2 V (m2)) as illustrated on Figure 2 By convexityof V (m) for m lt τ it must be the case that V (m01) lies below (or on) the dashedline B By concavity of V (m) for m gt τ it must be the case that V (m1) lies above(or on) the dashed line C Consequently (12) is satisfied

Alternatively let (τ V (τ)) lie above the straight line A connecting (m0 V (m0)) and(m2 V (m2)) as illustrated on Figure 3 If Vm2 le Vm1m2 then the proof of part(iii) is complete So suppose that Vm2 gt Vm1m2 so (12) is violated

By convexity of V (m) for m lt τ it must be the case that V (m01) lies below (or on)the line B Since (12) is violated V (m01) must lie between the lines A and B Recallthat m01 lt m02 lt m2 Observe that V (m02) must be in the shaded area on Figure 3Indeed if m02 lt τ then by convexity of V (m) for m lt τ it must be the case thatV (m02) lies between (or on) the lines B and D If m02 ge τ then by concavity of

28 KOLOTILIN MYLOVANOV ZAPECHELNYUK

τ m2m0

A C

B

m01 m1

V (m0) V (τ)

V (m2)

Figure 2 The case where (τ V (τ)) lies below (or on) the straight lineconnecting (m0 V (m0)) and (m2 V (m2))

τ m2m0

A

C

B

m01

V (m0)

V (τ)

V (m2)

D

V (m01)

Figure 3 The case where (τ V (τ)) lies above the straight line con-necting (m0 V (m0)) and (m2 V (m2))

V (m) for m gt τ it must be the case that V (m02) lies above (or on) line C It followsthat (11) is satisfied

We now adapt Lemma 3 to prove the optimality of upper censorship for n ge 4

CENSORSHIP AS OPTIMAL PERSUASION 29

Fix a monotone partition X sub ω1 ωnminus1 that is not upper censorship that isthere exists i isin 2 nminus 1 such that

ωiminus1 ∕isin X and ωi isin X (13)

Denote ωn = 1 Define j and k as follows If X cap ω1 ωiminus2 ∕= empty let ωj be thelargest element inX smaller than i otherwise let ωj = ω0 IfXcapωi+1 ωnminus1 ∕= emptylet ωk be the smallest element in X greater than i otherwise let ωk = ωn = 1 Define

m0 = ωj m1 = ωiminus1 m2 = ωi

and

p0 =Pr[ω isin [ωjωiminus1)]

Pr[ω isin [ωjωk)] p1 =

Pr[ω isin [ωiminus1ωi)]

Pr[ω isin [ωjωk)] p2 =

Pr[ω isin [ωiωk)]

Pr[ω isin [ωjωk)]

To apply Lemma 3 we identify Xωi with empty X with m2 and X cup ωiminus1 withm1m2 By Lemma 3 VX le maxVXωi VXcupωiminus1 Thus replacing X withXωi if VX le VXωi and with X cup ωiminus1 otherwise weakly improves the senderrsquosexpected utility Repeatedly applying such replacements yields upper censorshipIndeed this procedure terminates after a finite number of replacements because byLemma 3 for ωj ge τ we replace X with Xωi and for ωiminus1 le τ we replace Xwith X cup ωiminus1

We thus have shown that when V is S-shaped when optimizing over the set ofmonotone partitions we can restrict attention to deterministic upper censorship Itremains to show that an optimal deterministic censorship cutoff ωlowast

d coincides with anoptimal stochastic censorship cutoff ωlowast from part (A) To see this it is straightforwardto show that

V lowast(ωlowast qlowast) =

983133

[0ωlowast)

V (ω)dF (ω) + V (ωlowast)qlowast Pr[ω = ωlowast]

+ V (mlowast(ωlowast qlowast))((1minus qlowast) Pr[ω = ωlowast] + Pr[ω gt ωlowast]) (14)

is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order Since the two de-terministic censorship cutoff pairs closest to (ωlowast qlowast) in the Blackwell informativenessorder are (ωlowast 0) and (ωlowast 1) we obtain ωlowast

d = ωlowast

Proof of Theorem 2prime Immediate by Theorem 1prime and Lemma 1

Proof of Theorem 3prime The proof for an optimal signal is analogous to that ofTheorem 3 and is thus omitted The same applies to part (i) for an optimal monotonepartition Thus it remains to prove part (ii) for an optimal monotone partition

We fix ρ and σ = 1 and vary t continuously If ωlowast isin supp(F ) then an optimalmonotone partition is an optimal signal because all qlowast isin [0 1] are optimal Thuspart (ii) in (B) holds by part (ii) in (A) If ωlowast isin supp(F ) and qlowast isin (0 1) then either(ωlowast

d qlowastd) = (ωlowast 0) or (ωlowast

d qlowastd) = (ωlowast 1) is optimal because the senderrsquos expected utility

V lowast given by (14) is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order

30 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Therefore it suffices to consider values of t at which the sender is indifferent between(ωlowast 0) and (ωlowast 1) and to show that the sender strictly prefers (ωlowast 1) to (ωlowast 0) at aslightly higher t

V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) =rArr dV lowastt (ω

lowast 0)

dtle dV lowast

t (ωlowast 1)

dt (15)

The condition V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) can be written as

Vt(mlowast(ωlowast 0)) Pr[ω ge ωlowast] = Vt(ω

lowast) Pr[ω = ωlowast] + Vt(mlowast(ωlowast 1)) Pr[ω gt ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)Pr[ω = ωlowast]

Pr[ω ge ωlowast]+ Vt(m

lowast(ωlowast 1))Pr[ω gt ωlowast]

Pr[ω ge ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + Vt(mlowast(ωlowast 1))

mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Analogously the conditiondV lowast

t (ωlowast0)dt

le dV lowastt (ωlowast1)dt

can be written as

V primet (m

lowast(ωlowast 0)) ge V primet (ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + V primet (m

lowast(ωlowast 1))mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Thus (15) states that if Vt(mlowast(ωlowast 0)) is equal to the convex combination of Vt(ω

lowast)and Vt(m

lowast(ωlowast 1)) then the derivative V primet (m

lowast(ωlowast 0)) is greater than the convex com-bination of the derivatives V prime

t (ωlowast) and V prime

t (mlowast(ωlowast 1)) When V is S-shaped it is easy

to see that this property is satisfied

Proof of Lemma 2 Observe that a signal induced by any set X sub C of permittedmedia outlets induces a monotone partition of [0 1] and thus HC sub HM Indeed letcX(ω) and cX(ω) be the media outlets in X that are the closest to ω from above andfrom below

cX(ω) = sup983059c isin X c le ω

983052cup 0

983060and cX(ω) = inf

983059c isin X c gt ω cup 1

983060

By observing the media outlets in X each citizen is informed that ω isin [cX(ω) cX(ω))

Conversely let the set C of media outlets be c0 lt middot middot middot lt cn Without loss of generalityassume that C contains the two uninformative media outlets c0 = 0 and cn = 1 which(almost) always endorse and criticize action a = 1 respectively Then the distributionF of the state ω is a discrete distribution whose support is Ω = ω0 ωnminus1 whereωi = E[θ|θ isin [ci ci+1)] for i isin 0 nminus 1 Consider a monotone partitional signalrepresented by a nondecreasing function ξ [0 1] rarr R For each realization m of thissignal that occurs with a strictly positive probability define

x(m) = min983153ω isin Ω ξ(ω) = m

983154and x(m) = max

983153ω isin Ω ξ(ω) = m

983154

This signal induces a monotone partition of the finite set Ω This partition of Ω canbe implemented by a censorship policy that prohibits a media outlet c isin C if andonly if c isin (x(m) x(m)) for some realization m

CENSORSHIP AS OPTIMAL PERSUASION 31

References

Alonso R and O Camara (2016a) ldquoPersuading Votersrdquo American EconomicReview 106 3590ndash3605

(2016b) ldquoPolitical Disagreement and Information in Electionsrdquo Games andEconomic Behavior 100 390ndash412

Bagnoli M and T Bergstrom (2005) ldquoLog-Concave Probability and Its Ap-plicationsrdquo Economic Theory 26 445ndash469

Baron D P (2006) ldquoPersistent Media Biasrdquo Journal of Public Economics 90(1)1ndash36

Besley T and A Prat (2006) ldquoHandcuffs for the Grabbing Hand MediaCapture and Government Accountabilityrdquo American Economic Review 96(3) 720ndash736

Blackwell D (1953) ldquoEquivalent Comparisons of Experimentsrdquo Annals of Math-ematical Statistics 24 265ndash272

Chan J and W Suen (2008) ldquoA Spatial Theory of News Consumption andElectoral Competitionrdquo Review of Economic Studies 75 699ndash728

Chiang C-F and B Knight (2011) ldquoMedia Bias and Influence Evidence fromNewspaper Endorsementsrdquo Review of Economic Studies 78 795ndash820

Duffie D P Dworczak and H Zhu (2017) ldquoBenchmarks in Search MarketsrdquoJournal of Finance 72 1983ndash2044

Dworczak P (2017) ldquoMechanism Design with Aftermarkets Cutoff Mecha-nismsrdquo mimeo

Dworczak P and G Martini (2019) ldquoThe Simple Economics of Optimal Per-suasionrdquo Journal of Political Economy 127 993ndash2048

Edmond C (2013) ldquoInformation Manipulation Coordination and RegimeChangerdquo Review of Economic Studies 80 1422ndash1458

Egorov G S Guriev and K Sonin (2009) ldquoWhy Resource-poor DictatorsAllow Freer Media A Theory and Evidence from Panel Datardquo American PoliticalScience Review 103 645ndash668

Gehlbach S and K Sonin (2014) ldquoGovernment Control of the Mediardquo Journalof Public Economics 118 163ndash171

Gentzkow M and E Kamenica (2016) ldquoA Rothschild-Stiglitz Approach toBayesian Persuasionrdquo American Economic Review Papers amp Proceedings 106597ndash601

Gentzkow M and J Shapiro (2006) ldquoMedia Bias and Reputationrdquo Journalof Political Economy 114(2) 280ndash316

(2010) ldquoWhat Drives Media Slant Evidence from US Daily NewspapersrdquoEconometrica 78(1) 35ndash71

Ginzburg B (2019) ldquoOptimal Information Censorshiprdquo Journal of EconomicBehavior and Organization 163 377ndash385

Goldstein I and Y Leitner (2018) ldquoStress Tests and Information DisclosurerdquoJournal of Economic Theory 177 34ndash69

32 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Guo Y and E Shmaya (2019) ldquoThe Interval Structure of Optimal DisclosurerdquoEconometrica 87 653ndash675

Johnson J P and D P Myatt (2006) ldquoOn the Simple Economics of Adver-tising Marketing and Product Designrdquo American Economic Review 96 756ndash784

Kamenica E and M Gentzkow (2011) ldquoBayesian Persuasionrdquo American Eco-nomic Review 101 2590ndash2615

Kolotilin A (2015) ldquoExperimental Design to Persuaderdquo Games and EconomicBehavior 90 215ndash226

(2018) ldquoOptimal Information Disclosure A Linear Programming Ap-proachrdquo Theoretical Economics 13 607ndash636

Kolotilin A and H Li (2019) ldquoRelational Communicationrdquo mimeoKolotilin A M Li T Mylovanov and A Zapechelnyuk (2015) ldquoPer-suasion of a Privately Informed Receiverrdquo mimeo

Kolotilin A T Mylovanov A Zapechelnyuk and M Li (2017) ldquoPer-suasion of a Privately Informed Receiverrdquo Econometrica 85 1949ndash1964

Kolotilin A and A Wolitzky (2019) ldquoAssortative Information Disclosurerdquomimeo

Lewis T R and D Sappington (1994) ldquoSupplying Information to FacilitatePrice Discriminationrdquo International Economic Review 35 309ndash327

Lorentzen P (2014) ldquoChinarsquos Strategic Censorshiprdquo American Journal of Polit-ical Science 58 402ndash414

Mullainathan S and A Shleifer (2005) ldquoThe Market for Newsrdquo AmericanEconomic Review 95(4) 1031ndash1053

Orlov D P Zryumov and A Skrzypach (2018) ldquoDesign of Macro-prudentialStress Testsrdquo mimeo

Ostrovsky M and M Schwarz (2010) ldquoInformation Disclosure and Unravelingin Matching Marketsrdquo American Economic Journal Microeconomics 2 34ndash63

Prat A and D Stromberg (2013) ldquoThe Political Economy of Mass MediardquoAdvances in Economics and Econometrics 2 135ndash187

Quah J and B Strulovici (2012) ldquoAggregating the Single Crossing PropertyrdquoEconometrica 80 2333ndash2348

Rayo L and I Segal (2010) ldquoOptimal Information Disclosurerdquo Journal ofPolitical Economy 118 949ndash987

Romanyuk G and A Smolin (2019) ldquoCream Skimming and Information Designin Matching Marketsrdquo American Economic Journal Microeconomics 11 250ndash276

Suen W (2004) ldquoThe Self-Perpetuation of Biased Beliefsrdquo Economic Journal 114377ndash396

Yamashita T (2018) ldquoOptimal Public Information Disclosure by Mechanism De-signerrdquo mimeo

Zapechelnyuk A (2019) ldquoOptimal Quality Certificationrdquo American EconomicReview Insights forthcoming

CENSORSHIP AS OPTIMAL PERSUASION 19

the citizens compute the posterior expectation of θ which we will call ω So in thisexample the distribution F of ω has two mass points one for each of the two events

Given these two mass points every monotone partition of [0 1] induces one of twopossible distributions of the posterior mean state (i) the mass points revealed and (ii)the mass points pooled Observe however that these distributions are implementableby permitting and censoring the media outlet respectively

A censorship policy is upper censorship if it censors all sufficiently disloyal mediaoutlets Specifically there exists a cutoff clowast isin C and an indicator qlowast isin 0 1 suchthat all media outlets whose editorial policies are below clowast are permitted all mediaoutlets whose editorial policies are above clowast are censored and clowast is permitted if andonly if qlowast = 1 That is X = c isin C c le clowast if qlowast = 1 and X = c isin C c lt clowastif qlowast = 0 Note that the full censorship policy clowast = 0 and qlowast = 0 (where all mediaoutlets are censored) and the free media policy clowast = 1 and qlowast = 1 (where all mediaoutlets are permitted) are the two extreme upper-censorship policies

As shown above the media censorship problem is equivalent to the linear persuasionproblem in which the sender is restricted to monotone partitional signals and thesenderrsquos indirect utility V is given by (6) We thus apply Theorem 1prime(B) to obtainthe following result

Theorem 1primeprime If V is S-shaped then an optimal censorship policy is upper censorship

To illustrate Theorem 1primeprime suppose the government is interested only in the aggregateaction so that ν = 0 and γ depends only on the aggregate action a so V (m) =γ(G(m)) Thus an upper-censorship policy is optimal if the composition functionγ(G(middot)) is S-shaped For example this condition holds if the government is interestedin reaching a certain approval threshold (eg a simple majority) so that γ is a stepfunction This condition also holds if γ is S-shaped and G is uniform or S-shapedwith the same inflection point as γ20

54 Canonical Case We now impose more structure on the utilities to obtain asharper result for the optimality of upper censorship and to perform a comparativestatics analysis Analogously to assumption (A1) we assume that the governmentrsquosutility is a weighted average of the citizensrsquo utility and their aggregate action

983133 1

0

u(θ r ar a)dG(r) + δa

where u(θ r ar a) = (θ minus r)ar + ζ(r)a

(A2)

for some continuously differentiable function ζ Define

β =

983133 1

0

ζ(r)dG(r) + δ

20This condition holds in many special cases where γ and G are S-shaped even when γ and Ghave different inflection points

20 KOLOTILIN MYLOVANOV ZAPECHELNYUK

The term983125 1

0ζ(r)dG(r) is the governmentrsquos expected bias towards the citizensrsquo action

a = 1 due to the citizensrsquo externality and the term δ is the governmentrsquos intrinsicbias towards a greater aggregate action a Thus we interpret β as the governmentrsquosaggregate bias This decomposition of the bias allows for different interpretations whythe government is biased So β can be high because the government is too self-serving(high δ) or because the government is benevolent (low δ) but wishes to internalizestrong positive externalities of the citizens (high ζ(r)) To ease interpretations weassume that β gt 021

Under Assumption (A2) the governmentrsquos indirect utility V given by (6) becomes

V (m) =

983133 m

0

(mminus r)dG(r) + βa =

983133 m

0

v(m r)dG(r)

where

v(m r) = β

9830611 +

1

β(mminus r)

983062

This is the same as v given by Assumption (A1) with ρ = βminus1 up to rescaling by apositive constant Consequently we can apply Theorem 2prime(B) to obtain the followingresult

Theorem 2primeprime Let (A2) hold If the density g of citizens types is log-concave then anoptimal censorship policy is upper censorship

We now apply the comparative statics analysis presented in Section 32 Note thatthe upper-censorship policies are ordered according to the amount of informationtransmitted to the citizens in the sense of Blackwell (1953) A greater censorshipthreshold clowast means that more media outlets are permitted With this order in mindwe apply Theorem 3prime(B) to make a comparative statics analysis on the amount ofinformation that is optimally disclosed by the government

First the censorship cutoff is increasing in the alignment parameter ρ Recall thatthis is a reciprocal of the governmentrsquos bias ρ = βminus1 This means that the governmentoptimally discloses more information (the censorship cutoff is greater) when it is lessbiased (β is smaller) Intuitively as β decreases the government puts more weighton the citizensrsquo utility so it optimally endows the population with a higher utility bycensoring fewer media outlets and disclosing more information

Second the censorship cutoff is increasing in the magnitude t of the horizontal shiftof the density g A greater t corresponds to a greater opportunity cost of actiona = 1 for each citizen This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more difficultto persuade to take action a = 1 (parameter t is greater) Informally speaking topersuade the same type of the citizen the government needs to increase the posterior

21If β lt 0 then swapping the roles of a = 0 and a = 1 reverses the sign of the bias If β = 0then the governmentrsquos utility and the citizensrsquo private interests coincide so it is trivially optimal topermit all media outlets

CENSORSHIP AS OPTIMAL PERSUASION 21

mean But the expected posterior mean must be equal to the prior mean so it is notpossible to increase all posterior means Due to the log-concave shape of the densityof the citizensrsquo types this tradeoff is resolved by increasing the posterior mean of thepooling interval E[ω|ω ge clowast] This is done by shrinking the pooling interval [clowast 1]that is increasing the censorship cutoff clowast

Third when the number of media outlets is sufficiently large the censorship cutoffis increasing in the stretch factor σ of the density g around its peak τ (see Footnote15) A greater σ corresponds to a more dispersed distribution of the opportunitycost in the population This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more diverse inhow difficult they are to persuade Intuitively when the types are more spread outpersuading the same mass of types requires a greater posterior mean which leads toa higher censorship cutoff similarly to the comparative statics in t discussed above

55 Extensions and Open Questions Let us now consider a few extensions ofour model of media censorship

In our model the set of media outlets is exogenous and the governmentrsquos only in-strument is censorship We now consider three alternative ways of expanding thegovernmentrsquos instruments of influence

First suppose that the government is able not only to censor existing media outletsbut also to introduce new media outlets with chosen editorial policies This is equiv-alent to our censorship model where all media outlets in [0 1] are initially availableand the government can censor any subset of them When all media outlets in [0 1]are permitted the revealed information about θ (which we call ω) is θ itself Thusω = θ is a continuous random variable with distribution F = T So we can nowapply our results for the continuous state from Section 3 instead of those for the dis-crete state in Section 4 reaching the same conclusion about the optimality of uppercensorship

Second suppose that the government can garble information available from mediaoutlets That is the government is not restricted to monotone partitions it cancreate arbitrary signals about state ω for the citizens to observe This becomes ageneral persuasion problem and our Theorems 1prime(A) 2prime(A) and 3prime(A) apply

Third suppose that the government is able to restrict not only which media outletsare permitted but also how many media outlets each citizen can choose to observeIn this extension the citizens are not allowed to communicate with one another (oth-erwise they could share the information thus observing all permitted media outletsindirectly) This extension does not affect our results as long as each citizen is al-lowed to access at least one media outlet of his choice as in Chan and Suen (2008)Intuitively this is because each citizen categorizes the information from the mediaoutlets into ldquogood newsrdquo where a = 1 is optimal and ldquobad newsrdquo where a = 0 is

22 KOLOTILIN MYLOVANOV ZAPECHELNYUK

optimal Because the information from the media outlets induces a monotone parti-tion it means that ldquogood newsrdquo is separated from ldquobad newsrdquo by a threshold mediaoutlet that depends on the citizenrsquos type That is it is sufficient to observe a singlethreshold media outlet to distinguish ldquogood newsrdquo from ldquobad newsrdquo

In our model each citizenrsquos utility depends on the aggregate action a through theexternality term κ(θ r a) which does not affect the chosen action Let us relax thisassumption so that a citizenrsquos optimal choice can depend on a We can still writethe senderrsquos indirect utility V as a function of the posterior mean state and applyour results However V is no longer uniquely determined by the primitives of themodel It is now endogenous and depends on an equilibrium the citizens play aseach citizenrsquos optimal action now depends on what all citizens do in equilibrium Forexample given the same information about the state a citizen could prefer to choosea = 1 if and only if sufficiently many citizens choose the same action so a is largeenough This creates the problem of multiplicity of equilibria and as a consequencethe dependence of optimal censorship on equilibrium selection

As a side our media censorship problem can also be applied to spatial voting mod-els as in Chiang and Knight (2011) Consider a government party (p = G) and anopposition party (p = O) competing in an election If party p wins a voter withan ideological position r gets utility wp minus (r minus rp)

2 where wp is the quality or va-lence of party p and rp is the ideology or policy platform of party p Voters knowthe partiesrsquo ideologies and obtain information about the partiesrsquo qualities from allavailable media outlets Each voter supports the party that maximizes his expectedutility Within this context our analysis still applies because the voterrsquos utility dif-ference between the government and opposition parties is proportional to θminusr whereθ = (wG minus wO + r2O minus r2G)2(rO minus rG)

There are a few more extensions that can be relevant in applications First insteadof complete censorship of a media outlet there can be a cost of accessing it Forexample an international news channel can be censored by a local government butcitizens may still access it through VPN at some cost Second it can be costly forthe government to censor media outlets So another important question to answeris how a government may prioritize censoring Finally citizens can incur some costof following each media outlet While we have already mentioned that citizens gainno benefit from following more than one outlet it is entirely possible for them tostop watching news altogether if it is sufficiently uninformative These extensions arenontrivial and left for future research

Appendix

Proof of Theorem 1 Suppose that V is S-shaped For mωlowast isin [0 1] and mlowast(ωlowast) =E[ω|ω ge ωlowast] define

V (mωlowast) = V (mlowast(ωlowast)) + V prime(mlowast(ωlowast))(mminusmlowast(ωlowast)) (7)

CENSORSHIP AS OPTIMAL PERSUASION 23

Since V (m) is convex on [0 τ ] and concave on [τ 1] there exists ωlowast isin [0 τ ] thatsatisfies condition (3) that is

V (mωlowast) ge (le)V (m) for all m ge (lt)ωlowast (8)

This is true because V (ωlowastωlowast) minus V (ωlowast) is single-crossing from below To see thisconsider Figure 1 Observe that (3) can hold only if ωlowast is on the convex part of V andmlowast is on the concave part of V As the censorship cutoff ωlowast increases the posteriormean state of the pooling message mlowast(ωlowast) also increases But because V is concaveat mlowast(ωlowast) the dashed tangent line V (mωlowast) becomes flatter in m so it crosses thesolid line V (m) at a smaller m

Consider an arbitrary signal s Let H be the distribution of m = E[ω|s] inducedby signal s Since each signal is a garbling of the fully informative signal F is amean-preserving spread of H Thus the senderrsquos expected utility is smaller undersignal s than under the upper-censorship signal with cutoff ωlowast as follows from

983133 1

0

V (m)dH(m) le983133 1

0

maxV (m) V (mωlowast)dH(m)

le983133 1

0

maxV (m) V (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mlowast(ωlowast))dF (m)

where the second line holds because maxV (m) V (mωlowast) is convex in m and F isa mean-preserving spread of H the third line holds by (8) and the last line holds by

(7) and by983125 1

ωlowast(mminusmlowast(ωlowast))dF (m) = 0

Conversely suppose that V is not S-shaped Then there exist 0 le m1 lt m2 lem3 lt m4 le 1 such that V primeprime(m) lt 0 for m isin (m1m2) V

primeprime(m) = 0 for m isin [m2m3]and V primeprime(m) gt 0 for m isin (m3m4) because V primeprime is continuous by assumption It isstraightforward to show that there exists 983141mlowast isin (m1m2) (sufficiently close to m2) and983141ωlowast isin (m3m4) such that

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) lt V (m) for all m isin (983141ωlowastm4]

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) gt V (m) for all m isin [m1 983141ωlowast] 983141mlowast

Moreover there exists a continuous distribution F such that the support of F is asubset of [m1m4] and 983141mlowast = E[ω|ω le 983141ωlowast] Using a chain of inequalities similar to theone above it is easy to show that lower-censorship signal 983141slowast with cutoff 983141ωlowast is optimalFurthermore it is uniquely optimal as both inequalities hold with equalities only forsignal 983141slowast This in turn implies that any upper-censorship signal is suboptimal

24 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Proof of Lemma 1 Notice that under assumption (A1) we have

V primeprime(m) = gprime(m) + ρg(m) for m isin [0 1]

Recall that V is S-shaped if and only if V primeprime(m) = gprime (m) + ρg (m) is single-crossingfrom above Using Proposition 1 in Quah and Strulovici (2012) it is easy to showthat V primeprime(m) is single-crossing from above for all ρ isin R if and only if gprime (m)g (m) isnonincreasing in m (that is ln g(m) is concave)

Proof of Theorem 2 Immediate by Theorem 1 and Lemma 1

Proof of Theorem 3 Part (i) We fix t and σ and vary ρ Without loss ofgenerality let t = 0 and σ = 1 (so that Gtσ = G) Consider a signal that induces thedistribution H of posterior mean m This signal implements the receiverrsquos interimutility U given by

U(r) =

983133 1

r

(mminus r)dH(m) for r isin [0 1]

which holds because the receiver acts if and only if m ge r Since types r lt 0 alwayschoose a = 1 and types r gt 1 always choose a = 0 we can exclude them fromconsideration and assume that supp(G) = [0 1] The senderrsquos expected utility is then

983133 1

0

V (m)dH(m) =

983133 1

0

983133 m

0

(1 + ρ(mminus r))dG(r)dH(m)

=

983133 1

0

983133 1

r

(1 + ρ(mminus r))dH(m)dG(r)

=

983133 1

0

(1minusH(r) + ρU(r))dG(r)

Consider ρ2 gt ρ1 Suppose by contradiction that the corresponding optimal upper-censorship signals s2 and s1 are such that s1 is strictly more informative than s2 LetU2 and U1 be the receiverrsquos interim utilities implemented by s2 and s1 Since thesender prefers signal s2 under ρ2 and signal s1 under ρ1 we have

983133 1

0

(1minusH2 (r) + ρ2U2 (r)) dG (r) ge983133 1

0

(1minusH1 (r) + ρ2U1 (r)) dG (r)

983133 1

0

(1minusH1 (r) + ρ1U1 (r)) dG (r) ge983133 1

0

(1minusH2 (r) + ρ1U2 (r)) dG (r)

Summing up these inequalities gives

(ρ2 minus ρ1)

983133 1

0

(U2 (r)minus U1 (r)) dG (r) ge 0

leading to a contradiction because U1(r) ge U2(r) for all r with strict inequality forsome r given that s1 is strictly more informative than s2

CENSORSHIP AS OPTIMAL PERSUASION 25

Part (ii) We fix ρ isin R and σ = 1 and vary t The indirect utility of the senderconditional on m isin [0 1] is given by

Vt(m) =

983133

rlem

(1+ρ(mminusr))gt1(r)dr =

983133

rlemminust

(1+ρ(mminustminusr))g(r)dr = V (mminust) (9)

By Theorem 1 when t0 = 0 an optimal cutoff ωlowast0 satisfies

V (m) le V (mlowast0) + V prime(mlowast

0)(mminusmlowast0) for m isin [ωlowast

0 1]

When V is S-shaped and t1 lt 0 it is easy to see that

V (mminus t1) le V (mlowast0 minus t1) + V prime(mlowast

0 minus t1)(mminusmlowast0) for m isin [ωlowast

0 1]

which implies that when t = t1 the above inequality changes its sign at some ωlowast1

that satisfies ωlowast1 le ωlowast

0 It follows from (9) that ωlowast1 is an optimal cutoff for t = t1 and

as we established above ωlowast1 le ωlowast

0

Part (iii) We fix ρ ge 0 and t = 0 and vary σ The indirect utility of the senderconditional on m isin [0 1] is given by

V ρσσ(m) =

983133

rlem

9830591 +

ρ

σ(mminus r)

983060dG0σ(r)

=

983133

rlem

9830591 +

ρ

σ(mminus r)

983060g

983061r minus (1minus σ)τ

σ

9830621

σdr

=

983133

rlemminus(1minusσ)τσ

9830611 + ρ

983061mminus (1minus σ)τ

σminus r

983062983062g(r)dr

= Vρ1

983061mminus (1minus σ)τ

σ

983062 (10)

Let σ0 = 1 σ1 lt 1 and ρ1 = ρσ1 To compare the optimal cutoffs under parameters(ρ σ0) and (ρ σ1) we consider two steps first comparing the cutoffs between (ρ σ0)and (ρ1 σ1) and then comparing the cutoffs between (ρ1 σ1) and (ρ σ1)

First we compare an optimal cutoff ωlowast0 under (ρ σ0) = (ρ 1) and an optimal cutoff

ωlowast1 under (ρ1 σ1) By Theorem 1 ωlowast

0 satisfies

Vρ1(m) le Vρ1(mlowast0) + V prime

ρ1(mlowast0)(mminusmlowast

0) for m isin [ωlowast0 1]

When V is increasing (implied by ρ ge 0) and S-shaped and σ1 lt 1 it is easy to seethat for all m isin [ωlowast

0 1]

Vρ1

983061mminus (1minus σ1)τ

σ1

983062le Vρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062+ V prime

ρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062(mminusmlowast

0)

which implies that when σ = σ1 lt 1 the above inequality changes its sign at someωlowast1 that satisfies ωlowast

1 le ωlowast0 It follows from (10) that ωlowast

1 is an optimal cutoff under(ρσ σ1) = (ρ1 σ1) and as we established above ωlowast

1 le ωlowast0

26 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Second we compare the optimal cutoff ωlowast1 under (ρ1 σ1) and the optimal cutoff ωlowast

2

under (ρ σ1) Because ρ lt ρ1 = ρσ1 by part (i) we have ωlowast2 le ωlowast

1

Combining ωlowast1 le ωlowast

0 and ωlowast2 le ωlowast

1 we obtain ωlowast2 le ωlowast

0

Proof of Theorem 1prime Part (A) The proof is analogous to that of Theorem 1 andis thus omitted

Part (B) The only if statement follows from that in part (A) (see the proof of Theorem1) and the fact that we can always ensure that an optimal stochastic censorship(ωlowast qlowast) is deterministic (so qlowast isin 0 1) by the choice of the support and distributionof ω Thus we only need to prove the if statement

Let F be a distribution of ω with support on n values 0 lt ω0 lt lt ωnminus1 lt 1 Thisis without loss of generality because if 0 andor 1 were in the support of F we couldredefine the state as ω = (ω + ε)(1 + 2ε) and the type as r = (r + ε)(1 + 2ε)

Let X sub ω1 ωnminus1 identify a set of cutoffs defining a monotone partition for eachωi isin X the receiver is informed whether ω lt ωi or ω ge ωi Note that we exclude ω0

from the set of cutoffs because ω ge ω0 always holds

Note that X is deterministic upper censorship if and only if there exists a censorshipthreshold ωlowast isin [0 1] such that for each i = 1 nminus 1 ωi isin X if and only if ωi lt ωlowastthat is we can assume that qlowast = 0 without loss of generality

Let VX be the senderrsquos expected utility when the monotone partition is given by X

If n = 2 then there are only two partitions X = ω1 (full disclosure) and X = empty(no disclosure) Both are upper censorships so part (B) holds trivially

If n = 3 then there are four monotone partitions X = ω1ω2 (full disclosure)X = ω2 X = ω1 and X = empty (no disclosure) Only X = ω2 is not uppercensorship The next lemma proves that X = ω2 cannot be uniquely optimal whenV is S-shaped thus proving the optimality of upper censorship for n = 3

Denote

mjk = E983045ω|ω isin [ωjωk]

983046

For convenience of notation we write mj = ωj

Lemma 3 Let V be S-shaped with an inflexion point τ Let ω be a discrete randomvariable with support on m0m1m2 and probabilities p0 p1 p2 where 0 lt m0 ltm1 lt m2 lt 1 Then

(i) if τ le m0 then Vm2 le Vempty

(ii) if τ ge m1 then Vm2 le Vm1m2

(iii) if m0 lt τ lt m1 then Vm2 le maxVempty Vm1m2

CENSORSHIP AS OPTIMAL PERSUASION 27

Proof Define p01 = p0 + p1 and p02 = p01 + p2 = 1 Observe that

m01 =p0m0 + p1m1

p0 + p1and m02 =

p0m0 + p1m1 + p2m2

p0 + p1 + p2

We thus have

Vempty = p02V (m02) = (p01 + p2)V

983061p01m01 + p2m2

p01 + p2

983062

Vm2 = p01V (m01) + p2V (m2) = (p0 + p1)V

983061p0m0 + p1m1

p0 + p1

983062+ p2V (m2)

Vm1m2 = p0V (m0) + p1V (m1) + p2V (m2)

Clearly

m0 lt m01 lt m1 and m0 lt m02 lt m2

Therefore

Vm2 le Vempty lArrrArrp01

p01 + p2V (m01) +

p2p01 + p2

V (m2) le V

983061p01

p01 + p2m01 +

p2p01 + p2

m2

983062= V (m02)

(11)

and

Vm2 le Vm1m2 lArrrArr

V

983061p0

p0 + p1m0 +

p1p0 + p1

m1

983062le p0

p0 + p1V (m0) +

p1p0 + p1

V (m1)(12)

Suppose that τ le m01 so V is concave on [m01m2] Then (11) holds by Jensenrsquosinequality In particular it holds for τ le m0 lt m01 thus proving part (i)

Suppose that τ ge m1 so V is convex on [m0m1] Then (12) holds by Jensenrsquosinequality thus proving part (ii)

Finally suppose that m01 lt τ lt m1 Let (τ V (τ)) lie below (or on) the straight lineA connecting (m0 V (m0)) and (m2 V (m2)) as illustrated on Figure 2 By convexityof V (m) for m lt τ it must be the case that V (m01) lies below (or on) the dashedline B By concavity of V (m) for m gt τ it must be the case that V (m1) lies above(or on) the dashed line C Consequently (12) is satisfied

Alternatively let (τ V (τ)) lie above the straight line A connecting (m0 V (m0)) and(m2 V (m2)) as illustrated on Figure 3 If Vm2 le Vm1m2 then the proof of part(iii) is complete So suppose that Vm2 gt Vm1m2 so (12) is violated

By convexity of V (m) for m lt τ it must be the case that V (m01) lies below (or on)the line B Since (12) is violated V (m01) must lie between the lines A and B Recallthat m01 lt m02 lt m2 Observe that V (m02) must be in the shaded area on Figure 3Indeed if m02 lt τ then by convexity of V (m) for m lt τ it must be the case thatV (m02) lies between (or on) the lines B and D If m02 ge τ then by concavity of

28 KOLOTILIN MYLOVANOV ZAPECHELNYUK

τ m2m0

A C

B

m01 m1

V (m0) V (τ)

V (m2)

Figure 2 The case where (τ V (τ)) lies below (or on) the straight lineconnecting (m0 V (m0)) and (m2 V (m2))

τ m2m0

A

C

B

m01

V (m0)

V (τ)

V (m2)

D

V (m01)

Figure 3 The case where (τ V (τ)) lies above the straight line con-necting (m0 V (m0)) and (m2 V (m2))

V (m) for m gt τ it must be the case that V (m02) lies above (or on) line C It followsthat (11) is satisfied

We now adapt Lemma 3 to prove the optimality of upper censorship for n ge 4

CENSORSHIP AS OPTIMAL PERSUASION 29

Fix a monotone partition X sub ω1 ωnminus1 that is not upper censorship that isthere exists i isin 2 nminus 1 such that

ωiminus1 ∕isin X and ωi isin X (13)

Denote ωn = 1 Define j and k as follows If X cap ω1 ωiminus2 ∕= empty let ωj be thelargest element inX smaller than i otherwise let ωj = ω0 IfXcapωi+1 ωnminus1 ∕= emptylet ωk be the smallest element in X greater than i otherwise let ωk = ωn = 1 Define

m0 = ωj m1 = ωiminus1 m2 = ωi

and

p0 =Pr[ω isin [ωjωiminus1)]

Pr[ω isin [ωjωk)] p1 =

Pr[ω isin [ωiminus1ωi)]

Pr[ω isin [ωjωk)] p2 =

Pr[ω isin [ωiωk)]

Pr[ω isin [ωjωk)]

To apply Lemma 3 we identify Xωi with empty X with m2 and X cup ωiminus1 withm1m2 By Lemma 3 VX le maxVXωi VXcupωiminus1 Thus replacing X withXωi if VX le VXωi and with X cup ωiminus1 otherwise weakly improves the senderrsquosexpected utility Repeatedly applying such replacements yields upper censorshipIndeed this procedure terminates after a finite number of replacements because byLemma 3 for ωj ge τ we replace X with Xωi and for ωiminus1 le τ we replace Xwith X cup ωiminus1

We thus have shown that when V is S-shaped when optimizing over the set ofmonotone partitions we can restrict attention to deterministic upper censorship Itremains to show that an optimal deterministic censorship cutoff ωlowast

d coincides with anoptimal stochastic censorship cutoff ωlowast from part (A) To see this it is straightforwardto show that

V lowast(ωlowast qlowast) =

983133

[0ωlowast)

V (ω)dF (ω) + V (ωlowast)qlowast Pr[ω = ωlowast]

+ V (mlowast(ωlowast qlowast))((1minus qlowast) Pr[ω = ωlowast] + Pr[ω gt ωlowast]) (14)

is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order Since the two de-terministic censorship cutoff pairs closest to (ωlowast qlowast) in the Blackwell informativenessorder are (ωlowast 0) and (ωlowast 1) we obtain ωlowast

d = ωlowast

Proof of Theorem 2prime Immediate by Theorem 1prime and Lemma 1

Proof of Theorem 3prime The proof for an optimal signal is analogous to that ofTheorem 3 and is thus omitted The same applies to part (i) for an optimal monotonepartition Thus it remains to prove part (ii) for an optimal monotone partition

We fix ρ and σ = 1 and vary t continuously If ωlowast isin supp(F ) then an optimalmonotone partition is an optimal signal because all qlowast isin [0 1] are optimal Thuspart (ii) in (B) holds by part (ii) in (A) If ωlowast isin supp(F ) and qlowast isin (0 1) then either(ωlowast

d qlowastd) = (ωlowast 0) or (ωlowast

d qlowastd) = (ωlowast 1) is optimal because the senderrsquos expected utility

V lowast given by (14) is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order

30 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Therefore it suffices to consider values of t at which the sender is indifferent between(ωlowast 0) and (ωlowast 1) and to show that the sender strictly prefers (ωlowast 1) to (ωlowast 0) at aslightly higher t

V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) =rArr dV lowastt (ω

lowast 0)

dtle dV lowast

t (ωlowast 1)

dt (15)

The condition V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) can be written as

Vt(mlowast(ωlowast 0)) Pr[ω ge ωlowast] = Vt(ω

lowast) Pr[ω = ωlowast] + Vt(mlowast(ωlowast 1)) Pr[ω gt ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)Pr[ω = ωlowast]

Pr[ω ge ωlowast]+ Vt(m

lowast(ωlowast 1))Pr[ω gt ωlowast]

Pr[ω ge ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + Vt(mlowast(ωlowast 1))

mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Analogously the conditiondV lowast

t (ωlowast0)dt

le dV lowastt (ωlowast1)dt

can be written as

V primet (m

lowast(ωlowast 0)) ge V primet (ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + V primet (m

lowast(ωlowast 1))mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Thus (15) states that if Vt(mlowast(ωlowast 0)) is equal to the convex combination of Vt(ω

lowast)and Vt(m

lowast(ωlowast 1)) then the derivative V primet (m

lowast(ωlowast 0)) is greater than the convex com-bination of the derivatives V prime

t (ωlowast) and V prime

t (mlowast(ωlowast 1)) When V is S-shaped it is easy

to see that this property is satisfied

Proof of Lemma 2 Observe that a signal induced by any set X sub C of permittedmedia outlets induces a monotone partition of [0 1] and thus HC sub HM Indeed letcX(ω) and cX(ω) be the media outlets in X that are the closest to ω from above andfrom below

cX(ω) = sup983059c isin X c le ω

983052cup 0

983060and cX(ω) = inf

983059c isin X c gt ω cup 1

983060

By observing the media outlets in X each citizen is informed that ω isin [cX(ω) cX(ω))

Conversely let the set C of media outlets be c0 lt middot middot middot lt cn Without loss of generalityassume that C contains the two uninformative media outlets c0 = 0 and cn = 1 which(almost) always endorse and criticize action a = 1 respectively Then the distributionF of the state ω is a discrete distribution whose support is Ω = ω0 ωnminus1 whereωi = E[θ|θ isin [ci ci+1)] for i isin 0 nminus 1 Consider a monotone partitional signalrepresented by a nondecreasing function ξ [0 1] rarr R For each realization m of thissignal that occurs with a strictly positive probability define

x(m) = min983153ω isin Ω ξ(ω) = m

983154and x(m) = max

983153ω isin Ω ξ(ω) = m

983154

This signal induces a monotone partition of the finite set Ω This partition of Ω canbe implemented by a censorship policy that prohibits a media outlet c isin C if andonly if c isin (x(m) x(m)) for some realization m

CENSORSHIP AS OPTIMAL PERSUASION 31

References

Alonso R and O Camara (2016a) ldquoPersuading Votersrdquo American EconomicReview 106 3590ndash3605

(2016b) ldquoPolitical Disagreement and Information in Electionsrdquo Games andEconomic Behavior 100 390ndash412

Bagnoli M and T Bergstrom (2005) ldquoLog-Concave Probability and Its Ap-plicationsrdquo Economic Theory 26 445ndash469

Baron D P (2006) ldquoPersistent Media Biasrdquo Journal of Public Economics 90(1)1ndash36

Besley T and A Prat (2006) ldquoHandcuffs for the Grabbing Hand MediaCapture and Government Accountabilityrdquo American Economic Review 96(3) 720ndash736

Blackwell D (1953) ldquoEquivalent Comparisons of Experimentsrdquo Annals of Math-ematical Statistics 24 265ndash272

Chan J and W Suen (2008) ldquoA Spatial Theory of News Consumption andElectoral Competitionrdquo Review of Economic Studies 75 699ndash728

Chiang C-F and B Knight (2011) ldquoMedia Bias and Influence Evidence fromNewspaper Endorsementsrdquo Review of Economic Studies 78 795ndash820

Duffie D P Dworczak and H Zhu (2017) ldquoBenchmarks in Search MarketsrdquoJournal of Finance 72 1983ndash2044

Dworczak P (2017) ldquoMechanism Design with Aftermarkets Cutoff Mecha-nismsrdquo mimeo

Dworczak P and G Martini (2019) ldquoThe Simple Economics of Optimal Per-suasionrdquo Journal of Political Economy 127 993ndash2048

Edmond C (2013) ldquoInformation Manipulation Coordination and RegimeChangerdquo Review of Economic Studies 80 1422ndash1458

Egorov G S Guriev and K Sonin (2009) ldquoWhy Resource-poor DictatorsAllow Freer Media A Theory and Evidence from Panel Datardquo American PoliticalScience Review 103 645ndash668

Gehlbach S and K Sonin (2014) ldquoGovernment Control of the Mediardquo Journalof Public Economics 118 163ndash171

Gentzkow M and E Kamenica (2016) ldquoA Rothschild-Stiglitz Approach toBayesian Persuasionrdquo American Economic Review Papers amp Proceedings 106597ndash601

Gentzkow M and J Shapiro (2006) ldquoMedia Bias and Reputationrdquo Journalof Political Economy 114(2) 280ndash316

(2010) ldquoWhat Drives Media Slant Evidence from US Daily NewspapersrdquoEconometrica 78(1) 35ndash71

Ginzburg B (2019) ldquoOptimal Information Censorshiprdquo Journal of EconomicBehavior and Organization 163 377ndash385

Goldstein I and Y Leitner (2018) ldquoStress Tests and Information DisclosurerdquoJournal of Economic Theory 177 34ndash69

32 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Guo Y and E Shmaya (2019) ldquoThe Interval Structure of Optimal DisclosurerdquoEconometrica 87 653ndash675

Johnson J P and D P Myatt (2006) ldquoOn the Simple Economics of Adver-tising Marketing and Product Designrdquo American Economic Review 96 756ndash784

Kamenica E and M Gentzkow (2011) ldquoBayesian Persuasionrdquo American Eco-nomic Review 101 2590ndash2615

Kolotilin A (2015) ldquoExperimental Design to Persuaderdquo Games and EconomicBehavior 90 215ndash226

(2018) ldquoOptimal Information Disclosure A Linear Programming Ap-proachrdquo Theoretical Economics 13 607ndash636

Kolotilin A and H Li (2019) ldquoRelational Communicationrdquo mimeoKolotilin A M Li T Mylovanov and A Zapechelnyuk (2015) ldquoPer-suasion of a Privately Informed Receiverrdquo mimeo

Kolotilin A T Mylovanov A Zapechelnyuk and M Li (2017) ldquoPer-suasion of a Privately Informed Receiverrdquo Econometrica 85 1949ndash1964

Kolotilin A and A Wolitzky (2019) ldquoAssortative Information Disclosurerdquomimeo

Lewis T R and D Sappington (1994) ldquoSupplying Information to FacilitatePrice Discriminationrdquo International Economic Review 35 309ndash327

Lorentzen P (2014) ldquoChinarsquos Strategic Censorshiprdquo American Journal of Polit-ical Science 58 402ndash414

Mullainathan S and A Shleifer (2005) ldquoThe Market for Newsrdquo AmericanEconomic Review 95(4) 1031ndash1053

Orlov D P Zryumov and A Skrzypach (2018) ldquoDesign of Macro-prudentialStress Testsrdquo mimeo

Ostrovsky M and M Schwarz (2010) ldquoInformation Disclosure and Unravelingin Matching Marketsrdquo American Economic Journal Microeconomics 2 34ndash63

Prat A and D Stromberg (2013) ldquoThe Political Economy of Mass MediardquoAdvances in Economics and Econometrics 2 135ndash187

Quah J and B Strulovici (2012) ldquoAggregating the Single Crossing PropertyrdquoEconometrica 80 2333ndash2348

Rayo L and I Segal (2010) ldquoOptimal Information Disclosurerdquo Journal ofPolitical Economy 118 949ndash987

Romanyuk G and A Smolin (2019) ldquoCream Skimming and Information Designin Matching Marketsrdquo American Economic Journal Microeconomics 11 250ndash276

Suen W (2004) ldquoThe Self-Perpetuation of Biased Beliefsrdquo Economic Journal 114377ndash396

Yamashita T (2018) ldquoOptimal Public Information Disclosure by Mechanism De-signerrdquo mimeo

Zapechelnyuk A (2019) ldquoOptimal Quality Certificationrdquo American EconomicReview Insights forthcoming

20 KOLOTILIN MYLOVANOV ZAPECHELNYUK

The term983125 1

0ζ(r)dG(r) is the governmentrsquos expected bias towards the citizensrsquo action

a = 1 due to the citizensrsquo externality and the term δ is the governmentrsquos intrinsicbias towards a greater aggregate action a Thus we interpret β as the governmentrsquosaggregate bias This decomposition of the bias allows for different interpretations whythe government is biased So β can be high because the government is too self-serving(high δ) or because the government is benevolent (low δ) but wishes to internalizestrong positive externalities of the citizens (high ζ(r)) To ease interpretations weassume that β gt 021

Under Assumption (A2) the governmentrsquos indirect utility V given by (6) becomes

V (m) =

983133 m

0

(mminus r)dG(r) + βa =

983133 m

0

v(m r)dG(r)

where

v(m r) = β

9830611 +

1

β(mminus r)

983062

This is the same as v given by Assumption (A1) with ρ = βminus1 up to rescaling by apositive constant Consequently we can apply Theorem 2prime(B) to obtain the followingresult

Theorem 2primeprime Let (A2) hold If the density g of citizens types is log-concave then anoptimal censorship policy is upper censorship

We now apply the comparative statics analysis presented in Section 32 Note thatthe upper-censorship policies are ordered according to the amount of informationtransmitted to the citizens in the sense of Blackwell (1953) A greater censorshipthreshold clowast means that more media outlets are permitted With this order in mindwe apply Theorem 3prime(B) to make a comparative statics analysis on the amount ofinformation that is optimally disclosed by the government

First the censorship cutoff is increasing in the alignment parameter ρ Recall thatthis is a reciprocal of the governmentrsquos bias ρ = βminus1 This means that the governmentoptimally discloses more information (the censorship cutoff is greater) when it is lessbiased (β is smaller) Intuitively as β decreases the government puts more weighton the citizensrsquo utility so it optimally endows the population with a higher utility bycensoring fewer media outlets and disclosing more information

Second the censorship cutoff is increasing in the magnitude t of the horizontal shiftof the density g A greater t corresponds to a greater opportunity cost of actiona = 1 for each citizen This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more difficultto persuade to take action a = 1 (parameter t is greater) Informally speaking topersuade the same type of the citizen the government needs to increase the posterior

21If β lt 0 then swapping the roles of a = 0 and a = 1 reverses the sign of the bias If β = 0then the governmentrsquos utility and the citizensrsquo private interests coincide so it is trivially optimal topermit all media outlets

CENSORSHIP AS OPTIMAL PERSUASION 21

mean But the expected posterior mean must be equal to the prior mean so it is notpossible to increase all posterior means Due to the log-concave shape of the densityof the citizensrsquo types this tradeoff is resolved by increasing the posterior mean of thepooling interval E[ω|ω ge clowast] This is done by shrinking the pooling interval [clowast 1]that is increasing the censorship cutoff clowast

Third when the number of media outlets is sufficiently large the censorship cutoffis increasing in the stretch factor σ of the density g around its peak τ (see Footnote15) A greater σ corresponds to a more dispersed distribution of the opportunitycost in the population This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more diverse inhow difficult they are to persuade Intuitively when the types are more spread outpersuading the same mass of types requires a greater posterior mean which leads toa higher censorship cutoff similarly to the comparative statics in t discussed above

55 Extensions and Open Questions Let us now consider a few extensions ofour model of media censorship

In our model the set of media outlets is exogenous and the governmentrsquos only in-strument is censorship We now consider three alternative ways of expanding thegovernmentrsquos instruments of influence

First suppose that the government is able not only to censor existing media outletsbut also to introduce new media outlets with chosen editorial policies This is equiv-alent to our censorship model where all media outlets in [0 1] are initially availableand the government can censor any subset of them When all media outlets in [0 1]are permitted the revealed information about θ (which we call ω) is θ itself Thusω = θ is a continuous random variable with distribution F = T So we can nowapply our results for the continuous state from Section 3 instead of those for the dis-crete state in Section 4 reaching the same conclusion about the optimality of uppercensorship

Second suppose that the government can garble information available from mediaoutlets That is the government is not restricted to monotone partitions it cancreate arbitrary signals about state ω for the citizens to observe This becomes ageneral persuasion problem and our Theorems 1prime(A) 2prime(A) and 3prime(A) apply

Third suppose that the government is able to restrict not only which media outletsare permitted but also how many media outlets each citizen can choose to observeIn this extension the citizens are not allowed to communicate with one another (oth-erwise they could share the information thus observing all permitted media outletsindirectly) This extension does not affect our results as long as each citizen is al-lowed to access at least one media outlet of his choice as in Chan and Suen (2008)Intuitively this is because each citizen categorizes the information from the mediaoutlets into ldquogood newsrdquo where a = 1 is optimal and ldquobad newsrdquo where a = 0 is

22 KOLOTILIN MYLOVANOV ZAPECHELNYUK

optimal Because the information from the media outlets induces a monotone parti-tion it means that ldquogood newsrdquo is separated from ldquobad newsrdquo by a threshold mediaoutlet that depends on the citizenrsquos type That is it is sufficient to observe a singlethreshold media outlet to distinguish ldquogood newsrdquo from ldquobad newsrdquo

In our model each citizenrsquos utility depends on the aggregate action a through theexternality term κ(θ r a) which does not affect the chosen action Let us relax thisassumption so that a citizenrsquos optimal choice can depend on a We can still writethe senderrsquos indirect utility V as a function of the posterior mean state and applyour results However V is no longer uniquely determined by the primitives of themodel It is now endogenous and depends on an equilibrium the citizens play aseach citizenrsquos optimal action now depends on what all citizens do in equilibrium Forexample given the same information about the state a citizen could prefer to choosea = 1 if and only if sufficiently many citizens choose the same action so a is largeenough This creates the problem of multiplicity of equilibria and as a consequencethe dependence of optimal censorship on equilibrium selection

As a side our media censorship problem can also be applied to spatial voting mod-els as in Chiang and Knight (2011) Consider a government party (p = G) and anopposition party (p = O) competing in an election If party p wins a voter withan ideological position r gets utility wp minus (r minus rp)

2 where wp is the quality or va-lence of party p and rp is the ideology or policy platform of party p Voters knowthe partiesrsquo ideologies and obtain information about the partiesrsquo qualities from allavailable media outlets Each voter supports the party that maximizes his expectedutility Within this context our analysis still applies because the voterrsquos utility dif-ference between the government and opposition parties is proportional to θminusr whereθ = (wG minus wO + r2O minus r2G)2(rO minus rG)

There are a few more extensions that can be relevant in applications First insteadof complete censorship of a media outlet there can be a cost of accessing it Forexample an international news channel can be censored by a local government butcitizens may still access it through VPN at some cost Second it can be costly forthe government to censor media outlets So another important question to answeris how a government may prioritize censoring Finally citizens can incur some costof following each media outlet While we have already mentioned that citizens gainno benefit from following more than one outlet it is entirely possible for them tostop watching news altogether if it is sufficiently uninformative These extensions arenontrivial and left for future research

Appendix

Proof of Theorem 1 Suppose that V is S-shaped For mωlowast isin [0 1] and mlowast(ωlowast) =E[ω|ω ge ωlowast] define

V (mωlowast) = V (mlowast(ωlowast)) + V prime(mlowast(ωlowast))(mminusmlowast(ωlowast)) (7)

CENSORSHIP AS OPTIMAL PERSUASION 23

Since V (m) is convex on [0 τ ] and concave on [τ 1] there exists ωlowast isin [0 τ ] thatsatisfies condition (3) that is

V (mωlowast) ge (le)V (m) for all m ge (lt)ωlowast (8)

This is true because V (ωlowastωlowast) minus V (ωlowast) is single-crossing from below To see thisconsider Figure 1 Observe that (3) can hold only if ωlowast is on the convex part of V andmlowast is on the concave part of V As the censorship cutoff ωlowast increases the posteriormean state of the pooling message mlowast(ωlowast) also increases But because V is concaveat mlowast(ωlowast) the dashed tangent line V (mωlowast) becomes flatter in m so it crosses thesolid line V (m) at a smaller m

Consider an arbitrary signal s Let H be the distribution of m = E[ω|s] inducedby signal s Since each signal is a garbling of the fully informative signal F is amean-preserving spread of H Thus the senderrsquos expected utility is smaller undersignal s than under the upper-censorship signal with cutoff ωlowast as follows from

983133 1

0

V (m)dH(m) le983133 1

0

maxV (m) V (mωlowast)dH(m)

le983133 1

0

maxV (m) V (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mlowast(ωlowast))dF (m)

where the second line holds because maxV (m) V (mωlowast) is convex in m and F isa mean-preserving spread of H the third line holds by (8) and the last line holds by

(7) and by983125 1

ωlowast(mminusmlowast(ωlowast))dF (m) = 0

Conversely suppose that V is not S-shaped Then there exist 0 le m1 lt m2 lem3 lt m4 le 1 such that V primeprime(m) lt 0 for m isin (m1m2) V

primeprime(m) = 0 for m isin [m2m3]and V primeprime(m) gt 0 for m isin (m3m4) because V primeprime is continuous by assumption It isstraightforward to show that there exists 983141mlowast isin (m1m2) (sufficiently close to m2) and983141ωlowast isin (m3m4) such that

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) lt V (m) for all m isin (983141ωlowastm4]

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) gt V (m) for all m isin [m1 983141ωlowast] 983141mlowast

Moreover there exists a continuous distribution F such that the support of F is asubset of [m1m4] and 983141mlowast = E[ω|ω le 983141ωlowast] Using a chain of inequalities similar to theone above it is easy to show that lower-censorship signal 983141slowast with cutoff 983141ωlowast is optimalFurthermore it is uniquely optimal as both inequalities hold with equalities only forsignal 983141slowast This in turn implies that any upper-censorship signal is suboptimal

24 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Proof of Lemma 1 Notice that under assumption (A1) we have

V primeprime(m) = gprime(m) + ρg(m) for m isin [0 1]

Recall that V is S-shaped if and only if V primeprime(m) = gprime (m) + ρg (m) is single-crossingfrom above Using Proposition 1 in Quah and Strulovici (2012) it is easy to showthat V primeprime(m) is single-crossing from above for all ρ isin R if and only if gprime (m)g (m) isnonincreasing in m (that is ln g(m) is concave)

Proof of Theorem 2 Immediate by Theorem 1 and Lemma 1

Proof of Theorem 3 Part (i) We fix t and σ and vary ρ Without loss ofgenerality let t = 0 and σ = 1 (so that Gtσ = G) Consider a signal that induces thedistribution H of posterior mean m This signal implements the receiverrsquos interimutility U given by

U(r) =

983133 1

r

(mminus r)dH(m) for r isin [0 1]

which holds because the receiver acts if and only if m ge r Since types r lt 0 alwayschoose a = 1 and types r gt 1 always choose a = 0 we can exclude them fromconsideration and assume that supp(G) = [0 1] The senderrsquos expected utility is then

983133 1

0

V (m)dH(m) =

983133 1

0

983133 m

0

(1 + ρ(mminus r))dG(r)dH(m)

=

983133 1

0

983133 1

r

(1 + ρ(mminus r))dH(m)dG(r)

=

983133 1

0

(1minusH(r) + ρU(r))dG(r)

Consider ρ2 gt ρ1 Suppose by contradiction that the corresponding optimal upper-censorship signals s2 and s1 are such that s1 is strictly more informative than s2 LetU2 and U1 be the receiverrsquos interim utilities implemented by s2 and s1 Since thesender prefers signal s2 under ρ2 and signal s1 under ρ1 we have

983133 1

0

(1minusH2 (r) + ρ2U2 (r)) dG (r) ge983133 1

0

(1minusH1 (r) + ρ2U1 (r)) dG (r)

983133 1

0

(1minusH1 (r) + ρ1U1 (r)) dG (r) ge983133 1

0

(1minusH2 (r) + ρ1U2 (r)) dG (r)

Summing up these inequalities gives

(ρ2 minus ρ1)

983133 1

0

(U2 (r)minus U1 (r)) dG (r) ge 0

leading to a contradiction because U1(r) ge U2(r) for all r with strict inequality forsome r given that s1 is strictly more informative than s2

CENSORSHIP AS OPTIMAL PERSUASION 25

Part (ii) We fix ρ isin R and σ = 1 and vary t The indirect utility of the senderconditional on m isin [0 1] is given by

Vt(m) =

983133

rlem

(1+ρ(mminusr))gt1(r)dr =

983133

rlemminust

(1+ρ(mminustminusr))g(r)dr = V (mminust) (9)

By Theorem 1 when t0 = 0 an optimal cutoff ωlowast0 satisfies

V (m) le V (mlowast0) + V prime(mlowast

0)(mminusmlowast0) for m isin [ωlowast

0 1]

When V is S-shaped and t1 lt 0 it is easy to see that

V (mminus t1) le V (mlowast0 minus t1) + V prime(mlowast

0 minus t1)(mminusmlowast0) for m isin [ωlowast

0 1]

which implies that when t = t1 the above inequality changes its sign at some ωlowast1

that satisfies ωlowast1 le ωlowast

0 It follows from (9) that ωlowast1 is an optimal cutoff for t = t1 and

as we established above ωlowast1 le ωlowast

0

Part (iii) We fix ρ ge 0 and t = 0 and vary σ The indirect utility of the senderconditional on m isin [0 1] is given by

V ρσσ(m) =

983133

rlem

9830591 +

ρ

σ(mminus r)

983060dG0σ(r)

=

983133

rlem

9830591 +

ρ

σ(mminus r)

983060g

983061r minus (1minus σ)τ

σ

9830621

σdr

=

983133

rlemminus(1minusσ)τσ

9830611 + ρ

983061mminus (1minus σ)τ

σminus r

983062983062g(r)dr

= Vρ1

983061mminus (1minus σ)τ

σ

983062 (10)

Let σ0 = 1 σ1 lt 1 and ρ1 = ρσ1 To compare the optimal cutoffs under parameters(ρ σ0) and (ρ σ1) we consider two steps first comparing the cutoffs between (ρ σ0)and (ρ1 σ1) and then comparing the cutoffs between (ρ1 σ1) and (ρ σ1)

First we compare an optimal cutoff ωlowast0 under (ρ σ0) = (ρ 1) and an optimal cutoff

ωlowast1 under (ρ1 σ1) By Theorem 1 ωlowast

0 satisfies

Vρ1(m) le Vρ1(mlowast0) + V prime

ρ1(mlowast0)(mminusmlowast

0) for m isin [ωlowast0 1]

When V is increasing (implied by ρ ge 0) and S-shaped and σ1 lt 1 it is easy to seethat for all m isin [ωlowast

0 1]

Vρ1

983061mminus (1minus σ1)τ

σ1

983062le Vρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062+ V prime

ρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062(mminusmlowast

0)

which implies that when σ = σ1 lt 1 the above inequality changes its sign at someωlowast1 that satisfies ωlowast

1 le ωlowast0 It follows from (10) that ωlowast

1 is an optimal cutoff under(ρσ σ1) = (ρ1 σ1) and as we established above ωlowast

1 le ωlowast0

26 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Second we compare the optimal cutoff ωlowast1 under (ρ1 σ1) and the optimal cutoff ωlowast

2

under (ρ σ1) Because ρ lt ρ1 = ρσ1 by part (i) we have ωlowast2 le ωlowast

1

Combining ωlowast1 le ωlowast

0 and ωlowast2 le ωlowast

1 we obtain ωlowast2 le ωlowast

0

Proof of Theorem 1prime Part (A) The proof is analogous to that of Theorem 1 andis thus omitted

Part (B) The only if statement follows from that in part (A) (see the proof of Theorem1) and the fact that we can always ensure that an optimal stochastic censorship(ωlowast qlowast) is deterministic (so qlowast isin 0 1) by the choice of the support and distributionof ω Thus we only need to prove the if statement

Let F be a distribution of ω with support on n values 0 lt ω0 lt lt ωnminus1 lt 1 Thisis without loss of generality because if 0 andor 1 were in the support of F we couldredefine the state as ω = (ω + ε)(1 + 2ε) and the type as r = (r + ε)(1 + 2ε)

Let X sub ω1 ωnminus1 identify a set of cutoffs defining a monotone partition for eachωi isin X the receiver is informed whether ω lt ωi or ω ge ωi Note that we exclude ω0

from the set of cutoffs because ω ge ω0 always holds

Note that X is deterministic upper censorship if and only if there exists a censorshipthreshold ωlowast isin [0 1] such that for each i = 1 nminus 1 ωi isin X if and only if ωi lt ωlowastthat is we can assume that qlowast = 0 without loss of generality

Let VX be the senderrsquos expected utility when the monotone partition is given by X

If n = 2 then there are only two partitions X = ω1 (full disclosure) and X = empty(no disclosure) Both are upper censorships so part (B) holds trivially

If n = 3 then there are four monotone partitions X = ω1ω2 (full disclosure)X = ω2 X = ω1 and X = empty (no disclosure) Only X = ω2 is not uppercensorship The next lemma proves that X = ω2 cannot be uniquely optimal whenV is S-shaped thus proving the optimality of upper censorship for n = 3

Denote

mjk = E983045ω|ω isin [ωjωk]

983046

For convenience of notation we write mj = ωj

Lemma 3 Let V be S-shaped with an inflexion point τ Let ω be a discrete randomvariable with support on m0m1m2 and probabilities p0 p1 p2 where 0 lt m0 ltm1 lt m2 lt 1 Then

(i) if τ le m0 then Vm2 le Vempty

(ii) if τ ge m1 then Vm2 le Vm1m2

(iii) if m0 lt τ lt m1 then Vm2 le maxVempty Vm1m2

CENSORSHIP AS OPTIMAL PERSUASION 27

Proof Define p01 = p0 + p1 and p02 = p01 + p2 = 1 Observe that

m01 =p0m0 + p1m1

p0 + p1and m02 =

p0m0 + p1m1 + p2m2

p0 + p1 + p2

We thus have

Vempty = p02V (m02) = (p01 + p2)V

983061p01m01 + p2m2

p01 + p2

983062

Vm2 = p01V (m01) + p2V (m2) = (p0 + p1)V

983061p0m0 + p1m1

p0 + p1

983062+ p2V (m2)

Vm1m2 = p0V (m0) + p1V (m1) + p2V (m2)

Clearly

m0 lt m01 lt m1 and m0 lt m02 lt m2

Therefore

Vm2 le Vempty lArrrArrp01

p01 + p2V (m01) +

p2p01 + p2

V (m2) le V

983061p01

p01 + p2m01 +

p2p01 + p2

m2

983062= V (m02)

(11)

and

Vm2 le Vm1m2 lArrrArr

V

983061p0

p0 + p1m0 +

p1p0 + p1

m1

983062le p0

p0 + p1V (m0) +

p1p0 + p1

V (m1)(12)

Suppose that τ le m01 so V is concave on [m01m2] Then (11) holds by Jensenrsquosinequality In particular it holds for τ le m0 lt m01 thus proving part (i)

Suppose that τ ge m1 so V is convex on [m0m1] Then (12) holds by Jensenrsquosinequality thus proving part (ii)

Finally suppose that m01 lt τ lt m1 Let (τ V (τ)) lie below (or on) the straight lineA connecting (m0 V (m0)) and (m2 V (m2)) as illustrated on Figure 2 By convexityof V (m) for m lt τ it must be the case that V (m01) lies below (or on) the dashedline B By concavity of V (m) for m gt τ it must be the case that V (m1) lies above(or on) the dashed line C Consequently (12) is satisfied

Alternatively let (τ V (τ)) lie above the straight line A connecting (m0 V (m0)) and(m2 V (m2)) as illustrated on Figure 3 If Vm2 le Vm1m2 then the proof of part(iii) is complete So suppose that Vm2 gt Vm1m2 so (12) is violated

By convexity of V (m) for m lt τ it must be the case that V (m01) lies below (or on)the line B Since (12) is violated V (m01) must lie between the lines A and B Recallthat m01 lt m02 lt m2 Observe that V (m02) must be in the shaded area on Figure 3Indeed if m02 lt τ then by convexity of V (m) for m lt τ it must be the case thatV (m02) lies between (or on) the lines B and D If m02 ge τ then by concavity of

28 KOLOTILIN MYLOVANOV ZAPECHELNYUK

τ m2m0

A C

B

m01 m1

V (m0) V (τ)

V (m2)

Figure 2 The case where (τ V (τ)) lies below (or on) the straight lineconnecting (m0 V (m0)) and (m2 V (m2))

τ m2m0

A

C

B

m01

V (m0)

V (τ)

V (m2)

D

V (m01)

Figure 3 The case where (τ V (τ)) lies above the straight line con-necting (m0 V (m0)) and (m2 V (m2))

V (m) for m gt τ it must be the case that V (m02) lies above (or on) line C It followsthat (11) is satisfied

We now adapt Lemma 3 to prove the optimality of upper censorship for n ge 4

CENSORSHIP AS OPTIMAL PERSUASION 29

Fix a monotone partition X sub ω1 ωnminus1 that is not upper censorship that isthere exists i isin 2 nminus 1 such that

ωiminus1 ∕isin X and ωi isin X (13)

Denote ωn = 1 Define j and k as follows If X cap ω1 ωiminus2 ∕= empty let ωj be thelargest element inX smaller than i otherwise let ωj = ω0 IfXcapωi+1 ωnminus1 ∕= emptylet ωk be the smallest element in X greater than i otherwise let ωk = ωn = 1 Define

m0 = ωj m1 = ωiminus1 m2 = ωi

and

p0 =Pr[ω isin [ωjωiminus1)]

Pr[ω isin [ωjωk)] p1 =

Pr[ω isin [ωiminus1ωi)]

Pr[ω isin [ωjωk)] p2 =

Pr[ω isin [ωiωk)]

Pr[ω isin [ωjωk)]

To apply Lemma 3 we identify Xωi with empty X with m2 and X cup ωiminus1 withm1m2 By Lemma 3 VX le maxVXωi VXcupωiminus1 Thus replacing X withXωi if VX le VXωi and with X cup ωiminus1 otherwise weakly improves the senderrsquosexpected utility Repeatedly applying such replacements yields upper censorshipIndeed this procedure terminates after a finite number of replacements because byLemma 3 for ωj ge τ we replace X with Xωi and for ωiminus1 le τ we replace Xwith X cup ωiminus1

We thus have shown that when V is S-shaped when optimizing over the set ofmonotone partitions we can restrict attention to deterministic upper censorship Itremains to show that an optimal deterministic censorship cutoff ωlowast

d coincides with anoptimal stochastic censorship cutoff ωlowast from part (A) To see this it is straightforwardto show that

V lowast(ωlowast qlowast) =

983133

[0ωlowast)

V (ω)dF (ω) + V (ωlowast)qlowast Pr[ω = ωlowast]

+ V (mlowast(ωlowast qlowast))((1minus qlowast) Pr[ω = ωlowast] + Pr[ω gt ωlowast]) (14)

is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order Since the two de-terministic censorship cutoff pairs closest to (ωlowast qlowast) in the Blackwell informativenessorder are (ωlowast 0) and (ωlowast 1) we obtain ωlowast

d = ωlowast

Proof of Theorem 2prime Immediate by Theorem 1prime and Lemma 1

Proof of Theorem 3prime The proof for an optimal signal is analogous to that ofTheorem 3 and is thus omitted The same applies to part (i) for an optimal monotonepartition Thus it remains to prove part (ii) for an optimal monotone partition

We fix ρ and σ = 1 and vary t continuously If ωlowast isin supp(F ) then an optimalmonotone partition is an optimal signal because all qlowast isin [0 1] are optimal Thuspart (ii) in (B) holds by part (ii) in (A) If ωlowast isin supp(F ) and qlowast isin (0 1) then either(ωlowast

d qlowastd) = (ωlowast 0) or (ωlowast

d qlowastd) = (ωlowast 1) is optimal because the senderrsquos expected utility

V lowast given by (14) is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order

30 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Therefore it suffices to consider values of t at which the sender is indifferent between(ωlowast 0) and (ωlowast 1) and to show that the sender strictly prefers (ωlowast 1) to (ωlowast 0) at aslightly higher t

V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) =rArr dV lowastt (ω

lowast 0)

dtle dV lowast

t (ωlowast 1)

dt (15)

The condition V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) can be written as

Vt(mlowast(ωlowast 0)) Pr[ω ge ωlowast] = Vt(ω

lowast) Pr[ω = ωlowast] + Vt(mlowast(ωlowast 1)) Pr[ω gt ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)Pr[ω = ωlowast]

Pr[ω ge ωlowast]+ Vt(m

lowast(ωlowast 1))Pr[ω gt ωlowast]

Pr[ω ge ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + Vt(mlowast(ωlowast 1))

mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Analogously the conditiondV lowast

t (ωlowast0)dt

le dV lowastt (ωlowast1)dt

can be written as

V primet (m

lowast(ωlowast 0)) ge V primet (ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + V primet (m

lowast(ωlowast 1))mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Thus (15) states that if Vt(mlowast(ωlowast 0)) is equal to the convex combination of Vt(ω

lowast)and Vt(m

lowast(ωlowast 1)) then the derivative V primet (m

lowast(ωlowast 0)) is greater than the convex com-bination of the derivatives V prime

t (ωlowast) and V prime

t (mlowast(ωlowast 1)) When V is S-shaped it is easy

to see that this property is satisfied

Proof of Lemma 2 Observe that a signal induced by any set X sub C of permittedmedia outlets induces a monotone partition of [0 1] and thus HC sub HM Indeed letcX(ω) and cX(ω) be the media outlets in X that are the closest to ω from above andfrom below

cX(ω) = sup983059c isin X c le ω

983052cup 0

983060and cX(ω) = inf

983059c isin X c gt ω cup 1

983060

By observing the media outlets in X each citizen is informed that ω isin [cX(ω) cX(ω))

Conversely let the set C of media outlets be c0 lt middot middot middot lt cn Without loss of generalityassume that C contains the two uninformative media outlets c0 = 0 and cn = 1 which(almost) always endorse and criticize action a = 1 respectively Then the distributionF of the state ω is a discrete distribution whose support is Ω = ω0 ωnminus1 whereωi = E[θ|θ isin [ci ci+1)] for i isin 0 nminus 1 Consider a monotone partitional signalrepresented by a nondecreasing function ξ [0 1] rarr R For each realization m of thissignal that occurs with a strictly positive probability define

x(m) = min983153ω isin Ω ξ(ω) = m

983154and x(m) = max

983153ω isin Ω ξ(ω) = m

983154

This signal induces a monotone partition of the finite set Ω This partition of Ω canbe implemented by a censorship policy that prohibits a media outlet c isin C if andonly if c isin (x(m) x(m)) for some realization m

CENSORSHIP AS OPTIMAL PERSUASION 31

References

Alonso R and O Camara (2016a) ldquoPersuading Votersrdquo American EconomicReview 106 3590ndash3605

(2016b) ldquoPolitical Disagreement and Information in Electionsrdquo Games andEconomic Behavior 100 390ndash412

Bagnoli M and T Bergstrom (2005) ldquoLog-Concave Probability and Its Ap-plicationsrdquo Economic Theory 26 445ndash469

Baron D P (2006) ldquoPersistent Media Biasrdquo Journal of Public Economics 90(1)1ndash36

Besley T and A Prat (2006) ldquoHandcuffs for the Grabbing Hand MediaCapture and Government Accountabilityrdquo American Economic Review 96(3) 720ndash736

Blackwell D (1953) ldquoEquivalent Comparisons of Experimentsrdquo Annals of Math-ematical Statistics 24 265ndash272

Chan J and W Suen (2008) ldquoA Spatial Theory of News Consumption andElectoral Competitionrdquo Review of Economic Studies 75 699ndash728

Chiang C-F and B Knight (2011) ldquoMedia Bias and Influence Evidence fromNewspaper Endorsementsrdquo Review of Economic Studies 78 795ndash820

Duffie D P Dworczak and H Zhu (2017) ldquoBenchmarks in Search MarketsrdquoJournal of Finance 72 1983ndash2044

Dworczak P (2017) ldquoMechanism Design with Aftermarkets Cutoff Mecha-nismsrdquo mimeo

Dworczak P and G Martini (2019) ldquoThe Simple Economics of Optimal Per-suasionrdquo Journal of Political Economy 127 993ndash2048

Edmond C (2013) ldquoInformation Manipulation Coordination and RegimeChangerdquo Review of Economic Studies 80 1422ndash1458

Egorov G S Guriev and K Sonin (2009) ldquoWhy Resource-poor DictatorsAllow Freer Media A Theory and Evidence from Panel Datardquo American PoliticalScience Review 103 645ndash668

Gehlbach S and K Sonin (2014) ldquoGovernment Control of the Mediardquo Journalof Public Economics 118 163ndash171

Gentzkow M and E Kamenica (2016) ldquoA Rothschild-Stiglitz Approach toBayesian Persuasionrdquo American Economic Review Papers amp Proceedings 106597ndash601

Gentzkow M and J Shapiro (2006) ldquoMedia Bias and Reputationrdquo Journalof Political Economy 114(2) 280ndash316

(2010) ldquoWhat Drives Media Slant Evidence from US Daily NewspapersrdquoEconometrica 78(1) 35ndash71

Ginzburg B (2019) ldquoOptimal Information Censorshiprdquo Journal of EconomicBehavior and Organization 163 377ndash385

Goldstein I and Y Leitner (2018) ldquoStress Tests and Information DisclosurerdquoJournal of Economic Theory 177 34ndash69

32 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Guo Y and E Shmaya (2019) ldquoThe Interval Structure of Optimal DisclosurerdquoEconometrica 87 653ndash675

Johnson J P and D P Myatt (2006) ldquoOn the Simple Economics of Adver-tising Marketing and Product Designrdquo American Economic Review 96 756ndash784

Kamenica E and M Gentzkow (2011) ldquoBayesian Persuasionrdquo American Eco-nomic Review 101 2590ndash2615

Kolotilin A (2015) ldquoExperimental Design to Persuaderdquo Games and EconomicBehavior 90 215ndash226

(2018) ldquoOptimal Information Disclosure A Linear Programming Ap-proachrdquo Theoretical Economics 13 607ndash636

Kolotilin A and H Li (2019) ldquoRelational Communicationrdquo mimeoKolotilin A M Li T Mylovanov and A Zapechelnyuk (2015) ldquoPer-suasion of a Privately Informed Receiverrdquo mimeo

Kolotilin A T Mylovanov A Zapechelnyuk and M Li (2017) ldquoPer-suasion of a Privately Informed Receiverrdquo Econometrica 85 1949ndash1964

Kolotilin A and A Wolitzky (2019) ldquoAssortative Information Disclosurerdquomimeo

Lewis T R and D Sappington (1994) ldquoSupplying Information to FacilitatePrice Discriminationrdquo International Economic Review 35 309ndash327

Lorentzen P (2014) ldquoChinarsquos Strategic Censorshiprdquo American Journal of Polit-ical Science 58 402ndash414

Mullainathan S and A Shleifer (2005) ldquoThe Market for Newsrdquo AmericanEconomic Review 95(4) 1031ndash1053

Orlov D P Zryumov and A Skrzypach (2018) ldquoDesign of Macro-prudentialStress Testsrdquo mimeo

Ostrovsky M and M Schwarz (2010) ldquoInformation Disclosure and Unravelingin Matching Marketsrdquo American Economic Journal Microeconomics 2 34ndash63

Prat A and D Stromberg (2013) ldquoThe Political Economy of Mass MediardquoAdvances in Economics and Econometrics 2 135ndash187

Quah J and B Strulovici (2012) ldquoAggregating the Single Crossing PropertyrdquoEconometrica 80 2333ndash2348

Rayo L and I Segal (2010) ldquoOptimal Information Disclosurerdquo Journal ofPolitical Economy 118 949ndash987

Romanyuk G and A Smolin (2019) ldquoCream Skimming and Information Designin Matching Marketsrdquo American Economic Journal Microeconomics 11 250ndash276

Suen W (2004) ldquoThe Self-Perpetuation of Biased Beliefsrdquo Economic Journal 114377ndash396

Yamashita T (2018) ldquoOptimal Public Information Disclosure by Mechanism De-signerrdquo mimeo

Zapechelnyuk A (2019) ldquoOptimal Quality Certificationrdquo American EconomicReview Insights forthcoming

CENSORSHIP AS OPTIMAL PERSUASION 21

mean But the expected posterior mean must be equal to the prior mean so it is notpossible to increase all posterior means Due to the log-concave shape of the densityof the citizensrsquo types this tradeoff is resolved by increasing the posterior mean of thepooling interval E[ω|ω ge clowast] This is done by shrinking the pooling interval [clowast 1]that is increasing the censorship cutoff clowast

Third when the number of media outlets is sufficiently large the censorship cutoffis increasing in the stretch factor σ of the density g around its peak τ (see Footnote15) A greater σ corresponds to a more dispersed distribution of the opportunitycost in the population This means that the government optimally discloses moreinformation (the censorship cutoff is greater) when the citizens are more diverse inhow difficult they are to persuade Intuitively when the types are more spread outpersuading the same mass of types requires a greater posterior mean which leads toa higher censorship cutoff similarly to the comparative statics in t discussed above

55 Extensions and Open Questions Let us now consider a few extensions ofour model of media censorship

In our model the set of media outlets is exogenous and the governmentrsquos only in-strument is censorship We now consider three alternative ways of expanding thegovernmentrsquos instruments of influence

First suppose that the government is able not only to censor existing media outletsbut also to introduce new media outlets with chosen editorial policies This is equiv-alent to our censorship model where all media outlets in [0 1] are initially availableand the government can censor any subset of them When all media outlets in [0 1]are permitted the revealed information about θ (which we call ω) is θ itself Thusω = θ is a continuous random variable with distribution F = T So we can nowapply our results for the continuous state from Section 3 instead of those for the dis-crete state in Section 4 reaching the same conclusion about the optimality of uppercensorship

Second suppose that the government can garble information available from mediaoutlets That is the government is not restricted to monotone partitions it cancreate arbitrary signals about state ω for the citizens to observe This becomes ageneral persuasion problem and our Theorems 1prime(A) 2prime(A) and 3prime(A) apply

Third suppose that the government is able to restrict not only which media outletsare permitted but also how many media outlets each citizen can choose to observeIn this extension the citizens are not allowed to communicate with one another (oth-erwise they could share the information thus observing all permitted media outletsindirectly) This extension does not affect our results as long as each citizen is al-lowed to access at least one media outlet of his choice as in Chan and Suen (2008)Intuitively this is because each citizen categorizes the information from the mediaoutlets into ldquogood newsrdquo where a = 1 is optimal and ldquobad newsrdquo where a = 0 is

22 KOLOTILIN MYLOVANOV ZAPECHELNYUK

optimal Because the information from the media outlets induces a monotone parti-tion it means that ldquogood newsrdquo is separated from ldquobad newsrdquo by a threshold mediaoutlet that depends on the citizenrsquos type That is it is sufficient to observe a singlethreshold media outlet to distinguish ldquogood newsrdquo from ldquobad newsrdquo

In our model each citizenrsquos utility depends on the aggregate action a through theexternality term κ(θ r a) which does not affect the chosen action Let us relax thisassumption so that a citizenrsquos optimal choice can depend on a We can still writethe senderrsquos indirect utility V as a function of the posterior mean state and applyour results However V is no longer uniquely determined by the primitives of themodel It is now endogenous and depends on an equilibrium the citizens play aseach citizenrsquos optimal action now depends on what all citizens do in equilibrium Forexample given the same information about the state a citizen could prefer to choosea = 1 if and only if sufficiently many citizens choose the same action so a is largeenough This creates the problem of multiplicity of equilibria and as a consequencethe dependence of optimal censorship on equilibrium selection

As a side our media censorship problem can also be applied to spatial voting mod-els as in Chiang and Knight (2011) Consider a government party (p = G) and anopposition party (p = O) competing in an election If party p wins a voter withan ideological position r gets utility wp minus (r minus rp)

2 where wp is the quality or va-lence of party p and rp is the ideology or policy platform of party p Voters knowthe partiesrsquo ideologies and obtain information about the partiesrsquo qualities from allavailable media outlets Each voter supports the party that maximizes his expectedutility Within this context our analysis still applies because the voterrsquos utility dif-ference between the government and opposition parties is proportional to θminusr whereθ = (wG minus wO + r2O minus r2G)2(rO minus rG)

There are a few more extensions that can be relevant in applications First insteadof complete censorship of a media outlet there can be a cost of accessing it Forexample an international news channel can be censored by a local government butcitizens may still access it through VPN at some cost Second it can be costly forthe government to censor media outlets So another important question to answeris how a government may prioritize censoring Finally citizens can incur some costof following each media outlet While we have already mentioned that citizens gainno benefit from following more than one outlet it is entirely possible for them tostop watching news altogether if it is sufficiently uninformative These extensions arenontrivial and left for future research

Appendix

Proof of Theorem 1 Suppose that V is S-shaped For mωlowast isin [0 1] and mlowast(ωlowast) =E[ω|ω ge ωlowast] define

V (mωlowast) = V (mlowast(ωlowast)) + V prime(mlowast(ωlowast))(mminusmlowast(ωlowast)) (7)

CENSORSHIP AS OPTIMAL PERSUASION 23

Since V (m) is convex on [0 τ ] and concave on [τ 1] there exists ωlowast isin [0 τ ] thatsatisfies condition (3) that is

V (mωlowast) ge (le)V (m) for all m ge (lt)ωlowast (8)

This is true because V (ωlowastωlowast) minus V (ωlowast) is single-crossing from below To see thisconsider Figure 1 Observe that (3) can hold only if ωlowast is on the convex part of V andmlowast is on the concave part of V As the censorship cutoff ωlowast increases the posteriormean state of the pooling message mlowast(ωlowast) also increases But because V is concaveat mlowast(ωlowast) the dashed tangent line V (mωlowast) becomes flatter in m so it crosses thesolid line V (m) at a smaller m

Consider an arbitrary signal s Let H be the distribution of m = E[ω|s] inducedby signal s Since each signal is a garbling of the fully informative signal F is amean-preserving spread of H Thus the senderrsquos expected utility is smaller undersignal s than under the upper-censorship signal with cutoff ωlowast as follows from

983133 1

0

V (m)dH(m) le983133 1

0

maxV (m) V (mωlowast)dH(m)

le983133 1

0

maxV (m) V (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mlowast(ωlowast))dF (m)

where the second line holds because maxV (m) V (mωlowast) is convex in m and F isa mean-preserving spread of H the third line holds by (8) and the last line holds by

(7) and by983125 1

ωlowast(mminusmlowast(ωlowast))dF (m) = 0

Conversely suppose that V is not S-shaped Then there exist 0 le m1 lt m2 lem3 lt m4 le 1 such that V primeprime(m) lt 0 for m isin (m1m2) V

primeprime(m) = 0 for m isin [m2m3]and V primeprime(m) gt 0 for m isin (m3m4) because V primeprime is continuous by assumption It isstraightforward to show that there exists 983141mlowast isin (m1m2) (sufficiently close to m2) and983141ωlowast isin (m3m4) such that

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) lt V (m) for all m isin (983141ωlowastm4]

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) gt V (m) for all m isin [m1 983141ωlowast] 983141mlowast

Moreover there exists a continuous distribution F such that the support of F is asubset of [m1m4] and 983141mlowast = E[ω|ω le 983141ωlowast] Using a chain of inequalities similar to theone above it is easy to show that lower-censorship signal 983141slowast with cutoff 983141ωlowast is optimalFurthermore it is uniquely optimal as both inequalities hold with equalities only forsignal 983141slowast This in turn implies that any upper-censorship signal is suboptimal

24 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Proof of Lemma 1 Notice that under assumption (A1) we have

V primeprime(m) = gprime(m) + ρg(m) for m isin [0 1]

Recall that V is S-shaped if and only if V primeprime(m) = gprime (m) + ρg (m) is single-crossingfrom above Using Proposition 1 in Quah and Strulovici (2012) it is easy to showthat V primeprime(m) is single-crossing from above for all ρ isin R if and only if gprime (m)g (m) isnonincreasing in m (that is ln g(m) is concave)

Proof of Theorem 2 Immediate by Theorem 1 and Lemma 1

Proof of Theorem 3 Part (i) We fix t and σ and vary ρ Without loss ofgenerality let t = 0 and σ = 1 (so that Gtσ = G) Consider a signal that induces thedistribution H of posterior mean m This signal implements the receiverrsquos interimutility U given by

U(r) =

983133 1

r

(mminus r)dH(m) for r isin [0 1]

which holds because the receiver acts if and only if m ge r Since types r lt 0 alwayschoose a = 1 and types r gt 1 always choose a = 0 we can exclude them fromconsideration and assume that supp(G) = [0 1] The senderrsquos expected utility is then

983133 1

0

V (m)dH(m) =

983133 1

0

983133 m

0

(1 + ρ(mminus r))dG(r)dH(m)

=

983133 1

0

983133 1

r

(1 + ρ(mminus r))dH(m)dG(r)

=

983133 1

0

(1minusH(r) + ρU(r))dG(r)

Consider ρ2 gt ρ1 Suppose by contradiction that the corresponding optimal upper-censorship signals s2 and s1 are such that s1 is strictly more informative than s2 LetU2 and U1 be the receiverrsquos interim utilities implemented by s2 and s1 Since thesender prefers signal s2 under ρ2 and signal s1 under ρ1 we have

983133 1

0

(1minusH2 (r) + ρ2U2 (r)) dG (r) ge983133 1

0

(1minusH1 (r) + ρ2U1 (r)) dG (r)

983133 1

0

(1minusH1 (r) + ρ1U1 (r)) dG (r) ge983133 1

0

(1minusH2 (r) + ρ1U2 (r)) dG (r)

Summing up these inequalities gives

(ρ2 minus ρ1)

983133 1

0

(U2 (r)minus U1 (r)) dG (r) ge 0

leading to a contradiction because U1(r) ge U2(r) for all r with strict inequality forsome r given that s1 is strictly more informative than s2

CENSORSHIP AS OPTIMAL PERSUASION 25

Part (ii) We fix ρ isin R and σ = 1 and vary t The indirect utility of the senderconditional on m isin [0 1] is given by

Vt(m) =

983133

rlem

(1+ρ(mminusr))gt1(r)dr =

983133

rlemminust

(1+ρ(mminustminusr))g(r)dr = V (mminust) (9)

By Theorem 1 when t0 = 0 an optimal cutoff ωlowast0 satisfies

V (m) le V (mlowast0) + V prime(mlowast

0)(mminusmlowast0) for m isin [ωlowast

0 1]

When V is S-shaped and t1 lt 0 it is easy to see that

V (mminus t1) le V (mlowast0 minus t1) + V prime(mlowast

0 minus t1)(mminusmlowast0) for m isin [ωlowast

0 1]

which implies that when t = t1 the above inequality changes its sign at some ωlowast1

that satisfies ωlowast1 le ωlowast

0 It follows from (9) that ωlowast1 is an optimal cutoff for t = t1 and

as we established above ωlowast1 le ωlowast

0

Part (iii) We fix ρ ge 0 and t = 0 and vary σ The indirect utility of the senderconditional on m isin [0 1] is given by

V ρσσ(m) =

983133

rlem

9830591 +

ρ

σ(mminus r)

983060dG0σ(r)

=

983133

rlem

9830591 +

ρ

σ(mminus r)

983060g

983061r minus (1minus σ)τ

σ

9830621

σdr

=

983133

rlemminus(1minusσ)τσ

9830611 + ρ

983061mminus (1minus σ)τ

σminus r

983062983062g(r)dr

= Vρ1

983061mminus (1minus σ)τ

σ

983062 (10)

Let σ0 = 1 σ1 lt 1 and ρ1 = ρσ1 To compare the optimal cutoffs under parameters(ρ σ0) and (ρ σ1) we consider two steps first comparing the cutoffs between (ρ σ0)and (ρ1 σ1) and then comparing the cutoffs between (ρ1 σ1) and (ρ σ1)

First we compare an optimal cutoff ωlowast0 under (ρ σ0) = (ρ 1) and an optimal cutoff

ωlowast1 under (ρ1 σ1) By Theorem 1 ωlowast

0 satisfies

Vρ1(m) le Vρ1(mlowast0) + V prime

ρ1(mlowast0)(mminusmlowast

0) for m isin [ωlowast0 1]

When V is increasing (implied by ρ ge 0) and S-shaped and σ1 lt 1 it is easy to seethat for all m isin [ωlowast

0 1]

Vρ1

983061mminus (1minus σ1)τ

σ1

983062le Vρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062+ V prime

ρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062(mminusmlowast

0)

which implies that when σ = σ1 lt 1 the above inequality changes its sign at someωlowast1 that satisfies ωlowast

1 le ωlowast0 It follows from (10) that ωlowast

1 is an optimal cutoff under(ρσ σ1) = (ρ1 σ1) and as we established above ωlowast

1 le ωlowast0

26 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Second we compare the optimal cutoff ωlowast1 under (ρ1 σ1) and the optimal cutoff ωlowast

2

under (ρ σ1) Because ρ lt ρ1 = ρσ1 by part (i) we have ωlowast2 le ωlowast

1

Combining ωlowast1 le ωlowast

0 and ωlowast2 le ωlowast

1 we obtain ωlowast2 le ωlowast

0

Proof of Theorem 1prime Part (A) The proof is analogous to that of Theorem 1 andis thus omitted

Part (B) The only if statement follows from that in part (A) (see the proof of Theorem1) and the fact that we can always ensure that an optimal stochastic censorship(ωlowast qlowast) is deterministic (so qlowast isin 0 1) by the choice of the support and distributionof ω Thus we only need to prove the if statement

Let F be a distribution of ω with support on n values 0 lt ω0 lt lt ωnminus1 lt 1 Thisis without loss of generality because if 0 andor 1 were in the support of F we couldredefine the state as ω = (ω + ε)(1 + 2ε) and the type as r = (r + ε)(1 + 2ε)

Let X sub ω1 ωnminus1 identify a set of cutoffs defining a monotone partition for eachωi isin X the receiver is informed whether ω lt ωi or ω ge ωi Note that we exclude ω0

from the set of cutoffs because ω ge ω0 always holds

Note that X is deterministic upper censorship if and only if there exists a censorshipthreshold ωlowast isin [0 1] such that for each i = 1 nminus 1 ωi isin X if and only if ωi lt ωlowastthat is we can assume that qlowast = 0 without loss of generality

Let VX be the senderrsquos expected utility when the monotone partition is given by X

If n = 2 then there are only two partitions X = ω1 (full disclosure) and X = empty(no disclosure) Both are upper censorships so part (B) holds trivially

If n = 3 then there are four monotone partitions X = ω1ω2 (full disclosure)X = ω2 X = ω1 and X = empty (no disclosure) Only X = ω2 is not uppercensorship The next lemma proves that X = ω2 cannot be uniquely optimal whenV is S-shaped thus proving the optimality of upper censorship for n = 3

Denote

mjk = E983045ω|ω isin [ωjωk]

983046

For convenience of notation we write mj = ωj

Lemma 3 Let V be S-shaped with an inflexion point τ Let ω be a discrete randomvariable with support on m0m1m2 and probabilities p0 p1 p2 where 0 lt m0 ltm1 lt m2 lt 1 Then

(i) if τ le m0 then Vm2 le Vempty

(ii) if τ ge m1 then Vm2 le Vm1m2

(iii) if m0 lt τ lt m1 then Vm2 le maxVempty Vm1m2

CENSORSHIP AS OPTIMAL PERSUASION 27

Proof Define p01 = p0 + p1 and p02 = p01 + p2 = 1 Observe that

m01 =p0m0 + p1m1

p0 + p1and m02 =

p0m0 + p1m1 + p2m2

p0 + p1 + p2

We thus have

Vempty = p02V (m02) = (p01 + p2)V

983061p01m01 + p2m2

p01 + p2

983062

Vm2 = p01V (m01) + p2V (m2) = (p0 + p1)V

983061p0m0 + p1m1

p0 + p1

983062+ p2V (m2)

Vm1m2 = p0V (m0) + p1V (m1) + p2V (m2)

Clearly

m0 lt m01 lt m1 and m0 lt m02 lt m2

Therefore

Vm2 le Vempty lArrrArrp01

p01 + p2V (m01) +

p2p01 + p2

V (m2) le V

983061p01

p01 + p2m01 +

p2p01 + p2

m2

983062= V (m02)

(11)

and

Vm2 le Vm1m2 lArrrArr

V

983061p0

p0 + p1m0 +

p1p0 + p1

m1

983062le p0

p0 + p1V (m0) +

p1p0 + p1

V (m1)(12)

Suppose that τ le m01 so V is concave on [m01m2] Then (11) holds by Jensenrsquosinequality In particular it holds for τ le m0 lt m01 thus proving part (i)

Suppose that τ ge m1 so V is convex on [m0m1] Then (12) holds by Jensenrsquosinequality thus proving part (ii)

Finally suppose that m01 lt τ lt m1 Let (τ V (τ)) lie below (or on) the straight lineA connecting (m0 V (m0)) and (m2 V (m2)) as illustrated on Figure 2 By convexityof V (m) for m lt τ it must be the case that V (m01) lies below (or on) the dashedline B By concavity of V (m) for m gt τ it must be the case that V (m1) lies above(or on) the dashed line C Consequently (12) is satisfied

Alternatively let (τ V (τ)) lie above the straight line A connecting (m0 V (m0)) and(m2 V (m2)) as illustrated on Figure 3 If Vm2 le Vm1m2 then the proof of part(iii) is complete So suppose that Vm2 gt Vm1m2 so (12) is violated

By convexity of V (m) for m lt τ it must be the case that V (m01) lies below (or on)the line B Since (12) is violated V (m01) must lie between the lines A and B Recallthat m01 lt m02 lt m2 Observe that V (m02) must be in the shaded area on Figure 3Indeed if m02 lt τ then by convexity of V (m) for m lt τ it must be the case thatV (m02) lies between (or on) the lines B and D If m02 ge τ then by concavity of

28 KOLOTILIN MYLOVANOV ZAPECHELNYUK

τ m2m0

A C

B

m01 m1

V (m0) V (τ)

V (m2)

Figure 2 The case where (τ V (τ)) lies below (or on) the straight lineconnecting (m0 V (m0)) and (m2 V (m2))

τ m2m0

A

C

B

m01

V (m0)

V (τ)

V (m2)

D

V (m01)

Figure 3 The case where (τ V (τ)) lies above the straight line con-necting (m0 V (m0)) and (m2 V (m2))

V (m) for m gt τ it must be the case that V (m02) lies above (or on) line C It followsthat (11) is satisfied

We now adapt Lemma 3 to prove the optimality of upper censorship for n ge 4

CENSORSHIP AS OPTIMAL PERSUASION 29

Fix a monotone partition X sub ω1 ωnminus1 that is not upper censorship that isthere exists i isin 2 nminus 1 such that

ωiminus1 ∕isin X and ωi isin X (13)

Denote ωn = 1 Define j and k as follows If X cap ω1 ωiminus2 ∕= empty let ωj be thelargest element inX smaller than i otherwise let ωj = ω0 IfXcapωi+1 ωnminus1 ∕= emptylet ωk be the smallest element in X greater than i otherwise let ωk = ωn = 1 Define

m0 = ωj m1 = ωiminus1 m2 = ωi

and

p0 =Pr[ω isin [ωjωiminus1)]

Pr[ω isin [ωjωk)] p1 =

Pr[ω isin [ωiminus1ωi)]

Pr[ω isin [ωjωk)] p2 =

Pr[ω isin [ωiωk)]

Pr[ω isin [ωjωk)]

To apply Lemma 3 we identify Xωi with empty X with m2 and X cup ωiminus1 withm1m2 By Lemma 3 VX le maxVXωi VXcupωiminus1 Thus replacing X withXωi if VX le VXωi and with X cup ωiminus1 otherwise weakly improves the senderrsquosexpected utility Repeatedly applying such replacements yields upper censorshipIndeed this procedure terminates after a finite number of replacements because byLemma 3 for ωj ge τ we replace X with Xωi and for ωiminus1 le τ we replace Xwith X cup ωiminus1

We thus have shown that when V is S-shaped when optimizing over the set ofmonotone partitions we can restrict attention to deterministic upper censorship Itremains to show that an optimal deterministic censorship cutoff ωlowast

d coincides with anoptimal stochastic censorship cutoff ωlowast from part (A) To see this it is straightforwardto show that

V lowast(ωlowast qlowast) =

983133

[0ωlowast)

V (ω)dF (ω) + V (ωlowast)qlowast Pr[ω = ωlowast]

+ V (mlowast(ωlowast qlowast))((1minus qlowast) Pr[ω = ωlowast] + Pr[ω gt ωlowast]) (14)

is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order Since the two de-terministic censorship cutoff pairs closest to (ωlowast qlowast) in the Blackwell informativenessorder are (ωlowast 0) and (ωlowast 1) we obtain ωlowast

d = ωlowast

Proof of Theorem 2prime Immediate by Theorem 1prime and Lemma 1

Proof of Theorem 3prime The proof for an optimal signal is analogous to that ofTheorem 3 and is thus omitted The same applies to part (i) for an optimal monotonepartition Thus it remains to prove part (ii) for an optimal monotone partition

We fix ρ and σ = 1 and vary t continuously If ωlowast isin supp(F ) then an optimalmonotone partition is an optimal signal because all qlowast isin [0 1] are optimal Thuspart (ii) in (B) holds by part (ii) in (A) If ωlowast isin supp(F ) and qlowast isin (0 1) then either(ωlowast

d qlowastd) = (ωlowast 0) or (ωlowast

d qlowastd) = (ωlowast 1) is optimal because the senderrsquos expected utility

V lowast given by (14) is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order

30 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Therefore it suffices to consider values of t at which the sender is indifferent between(ωlowast 0) and (ωlowast 1) and to show that the sender strictly prefers (ωlowast 1) to (ωlowast 0) at aslightly higher t

V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) =rArr dV lowastt (ω

lowast 0)

dtle dV lowast

t (ωlowast 1)

dt (15)

The condition V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) can be written as

Vt(mlowast(ωlowast 0)) Pr[ω ge ωlowast] = Vt(ω

lowast) Pr[ω = ωlowast] + Vt(mlowast(ωlowast 1)) Pr[ω gt ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)Pr[ω = ωlowast]

Pr[ω ge ωlowast]+ Vt(m

lowast(ωlowast 1))Pr[ω gt ωlowast]

Pr[ω ge ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + Vt(mlowast(ωlowast 1))

mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Analogously the conditiondV lowast

t (ωlowast0)dt

le dV lowastt (ωlowast1)dt

can be written as

V primet (m

lowast(ωlowast 0)) ge V primet (ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + V primet (m

lowast(ωlowast 1))mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Thus (15) states that if Vt(mlowast(ωlowast 0)) is equal to the convex combination of Vt(ω

lowast)and Vt(m

lowast(ωlowast 1)) then the derivative V primet (m

lowast(ωlowast 0)) is greater than the convex com-bination of the derivatives V prime

t (ωlowast) and V prime

t (mlowast(ωlowast 1)) When V is S-shaped it is easy

to see that this property is satisfied

Proof of Lemma 2 Observe that a signal induced by any set X sub C of permittedmedia outlets induces a monotone partition of [0 1] and thus HC sub HM Indeed letcX(ω) and cX(ω) be the media outlets in X that are the closest to ω from above andfrom below

cX(ω) = sup983059c isin X c le ω

983052cup 0

983060and cX(ω) = inf

983059c isin X c gt ω cup 1

983060

By observing the media outlets in X each citizen is informed that ω isin [cX(ω) cX(ω))

Conversely let the set C of media outlets be c0 lt middot middot middot lt cn Without loss of generalityassume that C contains the two uninformative media outlets c0 = 0 and cn = 1 which(almost) always endorse and criticize action a = 1 respectively Then the distributionF of the state ω is a discrete distribution whose support is Ω = ω0 ωnminus1 whereωi = E[θ|θ isin [ci ci+1)] for i isin 0 nminus 1 Consider a monotone partitional signalrepresented by a nondecreasing function ξ [0 1] rarr R For each realization m of thissignal that occurs with a strictly positive probability define

x(m) = min983153ω isin Ω ξ(ω) = m

983154and x(m) = max

983153ω isin Ω ξ(ω) = m

983154

This signal induces a monotone partition of the finite set Ω This partition of Ω canbe implemented by a censorship policy that prohibits a media outlet c isin C if andonly if c isin (x(m) x(m)) for some realization m

CENSORSHIP AS OPTIMAL PERSUASION 31

References

Alonso R and O Camara (2016a) ldquoPersuading Votersrdquo American EconomicReview 106 3590ndash3605

(2016b) ldquoPolitical Disagreement and Information in Electionsrdquo Games andEconomic Behavior 100 390ndash412

Bagnoli M and T Bergstrom (2005) ldquoLog-Concave Probability and Its Ap-plicationsrdquo Economic Theory 26 445ndash469

Baron D P (2006) ldquoPersistent Media Biasrdquo Journal of Public Economics 90(1)1ndash36

Besley T and A Prat (2006) ldquoHandcuffs for the Grabbing Hand MediaCapture and Government Accountabilityrdquo American Economic Review 96(3) 720ndash736

Blackwell D (1953) ldquoEquivalent Comparisons of Experimentsrdquo Annals of Math-ematical Statistics 24 265ndash272

Chan J and W Suen (2008) ldquoA Spatial Theory of News Consumption andElectoral Competitionrdquo Review of Economic Studies 75 699ndash728

Chiang C-F and B Knight (2011) ldquoMedia Bias and Influence Evidence fromNewspaper Endorsementsrdquo Review of Economic Studies 78 795ndash820

Duffie D P Dworczak and H Zhu (2017) ldquoBenchmarks in Search MarketsrdquoJournal of Finance 72 1983ndash2044

Dworczak P (2017) ldquoMechanism Design with Aftermarkets Cutoff Mecha-nismsrdquo mimeo

Dworczak P and G Martini (2019) ldquoThe Simple Economics of Optimal Per-suasionrdquo Journal of Political Economy 127 993ndash2048

Edmond C (2013) ldquoInformation Manipulation Coordination and RegimeChangerdquo Review of Economic Studies 80 1422ndash1458

Egorov G S Guriev and K Sonin (2009) ldquoWhy Resource-poor DictatorsAllow Freer Media A Theory and Evidence from Panel Datardquo American PoliticalScience Review 103 645ndash668

Gehlbach S and K Sonin (2014) ldquoGovernment Control of the Mediardquo Journalof Public Economics 118 163ndash171

Gentzkow M and E Kamenica (2016) ldquoA Rothschild-Stiglitz Approach toBayesian Persuasionrdquo American Economic Review Papers amp Proceedings 106597ndash601

Gentzkow M and J Shapiro (2006) ldquoMedia Bias and Reputationrdquo Journalof Political Economy 114(2) 280ndash316

(2010) ldquoWhat Drives Media Slant Evidence from US Daily NewspapersrdquoEconometrica 78(1) 35ndash71

Ginzburg B (2019) ldquoOptimal Information Censorshiprdquo Journal of EconomicBehavior and Organization 163 377ndash385

Goldstein I and Y Leitner (2018) ldquoStress Tests and Information DisclosurerdquoJournal of Economic Theory 177 34ndash69

32 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Guo Y and E Shmaya (2019) ldquoThe Interval Structure of Optimal DisclosurerdquoEconometrica 87 653ndash675

Johnson J P and D P Myatt (2006) ldquoOn the Simple Economics of Adver-tising Marketing and Product Designrdquo American Economic Review 96 756ndash784

Kamenica E and M Gentzkow (2011) ldquoBayesian Persuasionrdquo American Eco-nomic Review 101 2590ndash2615

Kolotilin A (2015) ldquoExperimental Design to Persuaderdquo Games and EconomicBehavior 90 215ndash226

(2018) ldquoOptimal Information Disclosure A Linear Programming Ap-proachrdquo Theoretical Economics 13 607ndash636

Kolotilin A and H Li (2019) ldquoRelational Communicationrdquo mimeoKolotilin A M Li T Mylovanov and A Zapechelnyuk (2015) ldquoPer-suasion of a Privately Informed Receiverrdquo mimeo

Kolotilin A T Mylovanov A Zapechelnyuk and M Li (2017) ldquoPer-suasion of a Privately Informed Receiverrdquo Econometrica 85 1949ndash1964

Kolotilin A and A Wolitzky (2019) ldquoAssortative Information Disclosurerdquomimeo

Lewis T R and D Sappington (1994) ldquoSupplying Information to FacilitatePrice Discriminationrdquo International Economic Review 35 309ndash327

Lorentzen P (2014) ldquoChinarsquos Strategic Censorshiprdquo American Journal of Polit-ical Science 58 402ndash414

Mullainathan S and A Shleifer (2005) ldquoThe Market for Newsrdquo AmericanEconomic Review 95(4) 1031ndash1053

Orlov D P Zryumov and A Skrzypach (2018) ldquoDesign of Macro-prudentialStress Testsrdquo mimeo

Ostrovsky M and M Schwarz (2010) ldquoInformation Disclosure and Unravelingin Matching Marketsrdquo American Economic Journal Microeconomics 2 34ndash63

Prat A and D Stromberg (2013) ldquoThe Political Economy of Mass MediardquoAdvances in Economics and Econometrics 2 135ndash187

Quah J and B Strulovici (2012) ldquoAggregating the Single Crossing PropertyrdquoEconometrica 80 2333ndash2348

Rayo L and I Segal (2010) ldquoOptimal Information Disclosurerdquo Journal ofPolitical Economy 118 949ndash987

Romanyuk G and A Smolin (2019) ldquoCream Skimming and Information Designin Matching Marketsrdquo American Economic Journal Microeconomics 11 250ndash276

Suen W (2004) ldquoThe Self-Perpetuation of Biased Beliefsrdquo Economic Journal 114377ndash396

Yamashita T (2018) ldquoOptimal Public Information Disclosure by Mechanism De-signerrdquo mimeo

Zapechelnyuk A (2019) ldquoOptimal Quality Certificationrdquo American EconomicReview Insights forthcoming

22 KOLOTILIN MYLOVANOV ZAPECHELNYUK

optimal Because the information from the media outlets induces a monotone parti-tion it means that ldquogood newsrdquo is separated from ldquobad newsrdquo by a threshold mediaoutlet that depends on the citizenrsquos type That is it is sufficient to observe a singlethreshold media outlet to distinguish ldquogood newsrdquo from ldquobad newsrdquo

In our model each citizenrsquos utility depends on the aggregate action a through theexternality term κ(θ r a) which does not affect the chosen action Let us relax thisassumption so that a citizenrsquos optimal choice can depend on a We can still writethe senderrsquos indirect utility V as a function of the posterior mean state and applyour results However V is no longer uniquely determined by the primitives of themodel It is now endogenous and depends on an equilibrium the citizens play aseach citizenrsquos optimal action now depends on what all citizens do in equilibrium Forexample given the same information about the state a citizen could prefer to choosea = 1 if and only if sufficiently many citizens choose the same action so a is largeenough This creates the problem of multiplicity of equilibria and as a consequencethe dependence of optimal censorship on equilibrium selection

As a side our media censorship problem can also be applied to spatial voting mod-els as in Chiang and Knight (2011) Consider a government party (p = G) and anopposition party (p = O) competing in an election If party p wins a voter withan ideological position r gets utility wp minus (r minus rp)

2 where wp is the quality or va-lence of party p and rp is the ideology or policy platform of party p Voters knowthe partiesrsquo ideologies and obtain information about the partiesrsquo qualities from allavailable media outlets Each voter supports the party that maximizes his expectedutility Within this context our analysis still applies because the voterrsquos utility dif-ference between the government and opposition parties is proportional to θminusr whereθ = (wG minus wO + r2O minus r2G)2(rO minus rG)

There are a few more extensions that can be relevant in applications First insteadof complete censorship of a media outlet there can be a cost of accessing it Forexample an international news channel can be censored by a local government butcitizens may still access it through VPN at some cost Second it can be costly forthe government to censor media outlets So another important question to answeris how a government may prioritize censoring Finally citizens can incur some costof following each media outlet While we have already mentioned that citizens gainno benefit from following more than one outlet it is entirely possible for them tostop watching news altogether if it is sufficiently uninformative These extensions arenontrivial and left for future research

Appendix

Proof of Theorem 1 Suppose that V is S-shaped For mωlowast isin [0 1] and mlowast(ωlowast) =E[ω|ω ge ωlowast] define

V (mωlowast) = V (mlowast(ωlowast)) + V prime(mlowast(ωlowast))(mminusmlowast(ωlowast)) (7)

CENSORSHIP AS OPTIMAL PERSUASION 23

Since V (m) is convex on [0 τ ] and concave on [τ 1] there exists ωlowast isin [0 τ ] thatsatisfies condition (3) that is

V (mωlowast) ge (le)V (m) for all m ge (lt)ωlowast (8)

This is true because V (ωlowastωlowast) minus V (ωlowast) is single-crossing from below To see thisconsider Figure 1 Observe that (3) can hold only if ωlowast is on the convex part of V andmlowast is on the concave part of V As the censorship cutoff ωlowast increases the posteriormean state of the pooling message mlowast(ωlowast) also increases But because V is concaveat mlowast(ωlowast) the dashed tangent line V (mωlowast) becomes flatter in m so it crosses thesolid line V (m) at a smaller m

Consider an arbitrary signal s Let H be the distribution of m = E[ω|s] inducedby signal s Since each signal is a garbling of the fully informative signal F is amean-preserving spread of H Thus the senderrsquos expected utility is smaller undersignal s than under the upper-censorship signal with cutoff ωlowast as follows from

983133 1

0

V (m)dH(m) le983133 1

0

maxV (m) V (mωlowast)dH(m)

le983133 1

0

maxV (m) V (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mlowast(ωlowast))dF (m)

where the second line holds because maxV (m) V (mωlowast) is convex in m and F isa mean-preserving spread of H the third line holds by (8) and the last line holds by

(7) and by983125 1

ωlowast(mminusmlowast(ωlowast))dF (m) = 0

Conversely suppose that V is not S-shaped Then there exist 0 le m1 lt m2 lem3 lt m4 le 1 such that V primeprime(m) lt 0 for m isin (m1m2) V

primeprime(m) = 0 for m isin [m2m3]and V primeprime(m) gt 0 for m isin (m3m4) because V primeprime is continuous by assumption It isstraightforward to show that there exists 983141mlowast isin (m1m2) (sufficiently close to m2) and983141ωlowast isin (m3m4) such that

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) lt V (m) for all m isin (983141ωlowastm4]

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) gt V (m) for all m isin [m1 983141ωlowast] 983141mlowast

Moreover there exists a continuous distribution F such that the support of F is asubset of [m1m4] and 983141mlowast = E[ω|ω le 983141ωlowast] Using a chain of inequalities similar to theone above it is easy to show that lower-censorship signal 983141slowast with cutoff 983141ωlowast is optimalFurthermore it is uniquely optimal as both inequalities hold with equalities only forsignal 983141slowast This in turn implies that any upper-censorship signal is suboptimal

24 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Proof of Lemma 1 Notice that under assumption (A1) we have

V primeprime(m) = gprime(m) + ρg(m) for m isin [0 1]

Recall that V is S-shaped if and only if V primeprime(m) = gprime (m) + ρg (m) is single-crossingfrom above Using Proposition 1 in Quah and Strulovici (2012) it is easy to showthat V primeprime(m) is single-crossing from above for all ρ isin R if and only if gprime (m)g (m) isnonincreasing in m (that is ln g(m) is concave)

Proof of Theorem 2 Immediate by Theorem 1 and Lemma 1

Proof of Theorem 3 Part (i) We fix t and σ and vary ρ Without loss ofgenerality let t = 0 and σ = 1 (so that Gtσ = G) Consider a signal that induces thedistribution H of posterior mean m This signal implements the receiverrsquos interimutility U given by

U(r) =

983133 1

r

(mminus r)dH(m) for r isin [0 1]

which holds because the receiver acts if and only if m ge r Since types r lt 0 alwayschoose a = 1 and types r gt 1 always choose a = 0 we can exclude them fromconsideration and assume that supp(G) = [0 1] The senderrsquos expected utility is then

983133 1

0

V (m)dH(m) =

983133 1

0

983133 m

0

(1 + ρ(mminus r))dG(r)dH(m)

=

983133 1

0

983133 1

r

(1 + ρ(mminus r))dH(m)dG(r)

=

983133 1

0

(1minusH(r) + ρU(r))dG(r)

Consider ρ2 gt ρ1 Suppose by contradiction that the corresponding optimal upper-censorship signals s2 and s1 are such that s1 is strictly more informative than s2 LetU2 and U1 be the receiverrsquos interim utilities implemented by s2 and s1 Since thesender prefers signal s2 under ρ2 and signal s1 under ρ1 we have

983133 1

0

(1minusH2 (r) + ρ2U2 (r)) dG (r) ge983133 1

0

(1minusH1 (r) + ρ2U1 (r)) dG (r)

983133 1

0

(1minusH1 (r) + ρ1U1 (r)) dG (r) ge983133 1

0

(1minusH2 (r) + ρ1U2 (r)) dG (r)

Summing up these inequalities gives

(ρ2 minus ρ1)

983133 1

0

(U2 (r)minus U1 (r)) dG (r) ge 0

leading to a contradiction because U1(r) ge U2(r) for all r with strict inequality forsome r given that s1 is strictly more informative than s2

CENSORSHIP AS OPTIMAL PERSUASION 25

Part (ii) We fix ρ isin R and σ = 1 and vary t The indirect utility of the senderconditional on m isin [0 1] is given by

Vt(m) =

983133

rlem

(1+ρ(mminusr))gt1(r)dr =

983133

rlemminust

(1+ρ(mminustminusr))g(r)dr = V (mminust) (9)

By Theorem 1 when t0 = 0 an optimal cutoff ωlowast0 satisfies

V (m) le V (mlowast0) + V prime(mlowast

0)(mminusmlowast0) for m isin [ωlowast

0 1]

When V is S-shaped and t1 lt 0 it is easy to see that

V (mminus t1) le V (mlowast0 minus t1) + V prime(mlowast

0 minus t1)(mminusmlowast0) for m isin [ωlowast

0 1]

which implies that when t = t1 the above inequality changes its sign at some ωlowast1

that satisfies ωlowast1 le ωlowast

0 It follows from (9) that ωlowast1 is an optimal cutoff for t = t1 and

as we established above ωlowast1 le ωlowast

0

Part (iii) We fix ρ ge 0 and t = 0 and vary σ The indirect utility of the senderconditional on m isin [0 1] is given by

V ρσσ(m) =

983133

rlem

9830591 +

ρ

σ(mminus r)

983060dG0σ(r)

=

983133

rlem

9830591 +

ρ

σ(mminus r)

983060g

983061r minus (1minus σ)τ

σ

9830621

σdr

=

983133

rlemminus(1minusσ)τσ

9830611 + ρ

983061mminus (1minus σ)τ

σminus r

983062983062g(r)dr

= Vρ1

983061mminus (1minus σ)τ

σ

983062 (10)

Let σ0 = 1 σ1 lt 1 and ρ1 = ρσ1 To compare the optimal cutoffs under parameters(ρ σ0) and (ρ σ1) we consider two steps first comparing the cutoffs between (ρ σ0)and (ρ1 σ1) and then comparing the cutoffs between (ρ1 σ1) and (ρ σ1)

First we compare an optimal cutoff ωlowast0 under (ρ σ0) = (ρ 1) and an optimal cutoff

ωlowast1 under (ρ1 σ1) By Theorem 1 ωlowast

0 satisfies

Vρ1(m) le Vρ1(mlowast0) + V prime

ρ1(mlowast0)(mminusmlowast

0) for m isin [ωlowast0 1]

When V is increasing (implied by ρ ge 0) and S-shaped and σ1 lt 1 it is easy to seethat for all m isin [ωlowast

0 1]

Vρ1

983061mminus (1minus σ1)τ

σ1

983062le Vρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062+ V prime

ρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062(mminusmlowast

0)

which implies that when σ = σ1 lt 1 the above inequality changes its sign at someωlowast1 that satisfies ωlowast

1 le ωlowast0 It follows from (10) that ωlowast

1 is an optimal cutoff under(ρσ σ1) = (ρ1 σ1) and as we established above ωlowast

1 le ωlowast0

26 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Second we compare the optimal cutoff ωlowast1 under (ρ1 σ1) and the optimal cutoff ωlowast

2

under (ρ σ1) Because ρ lt ρ1 = ρσ1 by part (i) we have ωlowast2 le ωlowast

1

Combining ωlowast1 le ωlowast

0 and ωlowast2 le ωlowast

1 we obtain ωlowast2 le ωlowast

0

Proof of Theorem 1prime Part (A) The proof is analogous to that of Theorem 1 andis thus omitted

Part (B) The only if statement follows from that in part (A) (see the proof of Theorem1) and the fact that we can always ensure that an optimal stochastic censorship(ωlowast qlowast) is deterministic (so qlowast isin 0 1) by the choice of the support and distributionof ω Thus we only need to prove the if statement

Let F be a distribution of ω with support on n values 0 lt ω0 lt lt ωnminus1 lt 1 Thisis without loss of generality because if 0 andor 1 were in the support of F we couldredefine the state as ω = (ω + ε)(1 + 2ε) and the type as r = (r + ε)(1 + 2ε)

Let X sub ω1 ωnminus1 identify a set of cutoffs defining a monotone partition for eachωi isin X the receiver is informed whether ω lt ωi or ω ge ωi Note that we exclude ω0

from the set of cutoffs because ω ge ω0 always holds

Note that X is deterministic upper censorship if and only if there exists a censorshipthreshold ωlowast isin [0 1] such that for each i = 1 nminus 1 ωi isin X if and only if ωi lt ωlowastthat is we can assume that qlowast = 0 without loss of generality

Let VX be the senderrsquos expected utility when the monotone partition is given by X

If n = 2 then there are only two partitions X = ω1 (full disclosure) and X = empty(no disclosure) Both are upper censorships so part (B) holds trivially

If n = 3 then there are four monotone partitions X = ω1ω2 (full disclosure)X = ω2 X = ω1 and X = empty (no disclosure) Only X = ω2 is not uppercensorship The next lemma proves that X = ω2 cannot be uniquely optimal whenV is S-shaped thus proving the optimality of upper censorship for n = 3

Denote

mjk = E983045ω|ω isin [ωjωk]

983046

For convenience of notation we write mj = ωj

Lemma 3 Let V be S-shaped with an inflexion point τ Let ω be a discrete randomvariable with support on m0m1m2 and probabilities p0 p1 p2 where 0 lt m0 ltm1 lt m2 lt 1 Then

(i) if τ le m0 then Vm2 le Vempty

(ii) if τ ge m1 then Vm2 le Vm1m2

(iii) if m0 lt τ lt m1 then Vm2 le maxVempty Vm1m2

CENSORSHIP AS OPTIMAL PERSUASION 27

Proof Define p01 = p0 + p1 and p02 = p01 + p2 = 1 Observe that

m01 =p0m0 + p1m1

p0 + p1and m02 =

p0m0 + p1m1 + p2m2

p0 + p1 + p2

We thus have

Vempty = p02V (m02) = (p01 + p2)V

983061p01m01 + p2m2

p01 + p2

983062

Vm2 = p01V (m01) + p2V (m2) = (p0 + p1)V

983061p0m0 + p1m1

p0 + p1

983062+ p2V (m2)

Vm1m2 = p0V (m0) + p1V (m1) + p2V (m2)

Clearly

m0 lt m01 lt m1 and m0 lt m02 lt m2

Therefore

Vm2 le Vempty lArrrArrp01

p01 + p2V (m01) +

p2p01 + p2

V (m2) le V

983061p01

p01 + p2m01 +

p2p01 + p2

m2

983062= V (m02)

(11)

and

Vm2 le Vm1m2 lArrrArr

V

983061p0

p0 + p1m0 +

p1p0 + p1

m1

983062le p0

p0 + p1V (m0) +

p1p0 + p1

V (m1)(12)

Suppose that τ le m01 so V is concave on [m01m2] Then (11) holds by Jensenrsquosinequality In particular it holds for τ le m0 lt m01 thus proving part (i)

Suppose that τ ge m1 so V is convex on [m0m1] Then (12) holds by Jensenrsquosinequality thus proving part (ii)

Finally suppose that m01 lt τ lt m1 Let (τ V (τ)) lie below (or on) the straight lineA connecting (m0 V (m0)) and (m2 V (m2)) as illustrated on Figure 2 By convexityof V (m) for m lt τ it must be the case that V (m01) lies below (or on) the dashedline B By concavity of V (m) for m gt τ it must be the case that V (m1) lies above(or on) the dashed line C Consequently (12) is satisfied

Alternatively let (τ V (τ)) lie above the straight line A connecting (m0 V (m0)) and(m2 V (m2)) as illustrated on Figure 3 If Vm2 le Vm1m2 then the proof of part(iii) is complete So suppose that Vm2 gt Vm1m2 so (12) is violated

By convexity of V (m) for m lt τ it must be the case that V (m01) lies below (or on)the line B Since (12) is violated V (m01) must lie between the lines A and B Recallthat m01 lt m02 lt m2 Observe that V (m02) must be in the shaded area on Figure 3Indeed if m02 lt τ then by convexity of V (m) for m lt τ it must be the case thatV (m02) lies between (or on) the lines B and D If m02 ge τ then by concavity of

28 KOLOTILIN MYLOVANOV ZAPECHELNYUK

τ m2m0

A C

B

m01 m1

V (m0) V (τ)

V (m2)

Figure 2 The case where (τ V (τ)) lies below (or on) the straight lineconnecting (m0 V (m0)) and (m2 V (m2))

τ m2m0

A

C

B

m01

V (m0)

V (τ)

V (m2)

D

V (m01)

Figure 3 The case where (τ V (τ)) lies above the straight line con-necting (m0 V (m0)) and (m2 V (m2))

V (m) for m gt τ it must be the case that V (m02) lies above (or on) line C It followsthat (11) is satisfied

We now adapt Lemma 3 to prove the optimality of upper censorship for n ge 4

CENSORSHIP AS OPTIMAL PERSUASION 29

Fix a monotone partition X sub ω1 ωnminus1 that is not upper censorship that isthere exists i isin 2 nminus 1 such that

ωiminus1 ∕isin X and ωi isin X (13)

Denote ωn = 1 Define j and k as follows If X cap ω1 ωiminus2 ∕= empty let ωj be thelargest element inX smaller than i otherwise let ωj = ω0 IfXcapωi+1 ωnminus1 ∕= emptylet ωk be the smallest element in X greater than i otherwise let ωk = ωn = 1 Define

m0 = ωj m1 = ωiminus1 m2 = ωi

and

p0 =Pr[ω isin [ωjωiminus1)]

Pr[ω isin [ωjωk)] p1 =

Pr[ω isin [ωiminus1ωi)]

Pr[ω isin [ωjωk)] p2 =

Pr[ω isin [ωiωk)]

Pr[ω isin [ωjωk)]

To apply Lemma 3 we identify Xωi with empty X with m2 and X cup ωiminus1 withm1m2 By Lemma 3 VX le maxVXωi VXcupωiminus1 Thus replacing X withXωi if VX le VXωi and with X cup ωiminus1 otherwise weakly improves the senderrsquosexpected utility Repeatedly applying such replacements yields upper censorshipIndeed this procedure terminates after a finite number of replacements because byLemma 3 for ωj ge τ we replace X with Xωi and for ωiminus1 le τ we replace Xwith X cup ωiminus1

We thus have shown that when V is S-shaped when optimizing over the set ofmonotone partitions we can restrict attention to deterministic upper censorship Itremains to show that an optimal deterministic censorship cutoff ωlowast

d coincides with anoptimal stochastic censorship cutoff ωlowast from part (A) To see this it is straightforwardto show that

V lowast(ωlowast qlowast) =

983133

[0ωlowast)

V (ω)dF (ω) + V (ωlowast)qlowast Pr[ω = ωlowast]

+ V (mlowast(ωlowast qlowast))((1minus qlowast) Pr[ω = ωlowast] + Pr[ω gt ωlowast]) (14)

is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order Since the two de-terministic censorship cutoff pairs closest to (ωlowast qlowast) in the Blackwell informativenessorder are (ωlowast 0) and (ωlowast 1) we obtain ωlowast

d = ωlowast

Proof of Theorem 2prime Immediate by Theorem 1prime and Lemma 1

Proof of Theorem 3prime The proof for an optimal signal is analogous to that ofTheorem 3 and is thus omitted The same applies to part (i) for an optimal monotonepartition Thus it remains to prove part (ii) for an optimal monotone partition

We fix ρ and σ = 1 and vary t continuously If ωlowast isin supp(F ) then an optimalmonotone partition is an optimal signal because all qlowast isin [0 1] are optimal Thuspart (ii) in (B) holds by part (ii) in (A) If ωlowast isin supp(F ) and qlowast isin (0 1) then either(ωlowast

d qlowastd) = (ωlowast 0) or (ωlowast

d qlowastd) = (ωlowast 1) is optimal because the senderrsquos expected utility

V lowast given by (14) is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order

30 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Therefore it suffices to consider values of t at which the sender is indifferent between(ωlowast 0) and (ωlowast 1) and to show that the sender strictly prefers (ωlowast 1) to (ωlowast 0) at aslightly higher t

V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) =rArr dV lowastt (ω

lowast 0)

dtle dV lowast

t (ωlowast 1)

dt (15)

The condition V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) can be written as

Vt(mlowast(ωlowast 0)) Pr[ω ge ωlowast] = Vt(ω

lowast) Pr[ω = ωlowast] + Vt(mlowast(ωlowast 1)) Pr[ω gt ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)Pr[ω = ωlowast]

Pr[ω ge ωlowast]+ Vt(m

lowast(ωlowast 1))Pr[ω gt ωlowast]

Pr[ω ge ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + Vt(mlowast(ωlowast 1))

mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Analogously the conditiondV lowast

t (ωlowast0)dt

le dV lowastt (ωlowast1)dt

can be written as

V primet (m

lowast(ωlowast 0)) ge V primet (ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + V primet (m

lowast(ωlowast 1))mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Thus (15) states that if Vt(mlowast(ωlowast 0)) is equal to the convex combination of Vt(ω

lowast)and Vt(m

lowast(ωlowast 1)) then the derivative V primet (m

lowast(ωlowast 0)) is greater than the convex com-bination of the derivatives V prime

t (ωlowast) and V prime

t (mlowast(ωlowast 1)) When V is S-shaped it is easy

to see that this property is satisfied

Proof of Lemma 2 Observe that a signal induced by any set X sub C of permittedmedia outlets induces a monotone partition of [0 1] and thus HC sub HM Indeed letcX(ω) and cX(ω) be the media outlets in X that are the closest to ω from above andfrom below

cX(ω) = sup983059c isin X c le ω

983052cup 0

983060and cX(ω) = inf

983059c isin X c gt ω cup 1

983060

By observing the media outlets in X each citizen is informed that ω isin [cX(ω) cX(ω))

Conversely let the set C of media outlets be c0 lt middot middot middot lt cn Without loss of generalityassume that C contains the two uninformative media outlets c0 = 0 and cn = 1 which(almost) always endorse and criticize action a = 1 respectively Then the distributionF of the state ω is a discrete distribution whose support is Ω = ω0 ωnminus1 whereωi = E[θ|θ isin [ci ci+1)] for i isin 0 nminus 1 Consider a monotone partitional signalrepresented by a nondecreasing function ξ [0 1] rarr R For each realization m of thissignal that occurs with a strictly positive probability define

x(m) = min983153ω isin Ω ξ(ω) = m

983154and x(m) = max

983153ω isin Ω ξ(ω) = m

983154

This signal induces a monotone partition of the finite set Ω This partition of Ω canbe implemented by a censorship policy that prohibits a media outlet c isin C if andonly if c isin (x(m) x(m)) for some realization m

CENSORSHIP AS OPTIMAL PERSUASION 31

References

Alonso R and O Camara (2016a) ldquoPersuading Votersrdquo American EconomicReview 106 3590ndash3605

(2016b) ldquoPolitical Disagreement and Information in Electionsrdquo Games andEconomic Behavior 100 390ndash412

Bagnoli M and T Bergstrom (2005) ldquoLog-Concave Probability and Its Ap-plicationsrdquo Economic Theory 26 445ndash469

Baron D P (2006) ldquoPersistent Media Biasrdquo Journal of Public Economics 90(1)1ndash36

Besley T and A Prat (2006) ldquoHandcuffs for the Grabbing Hand MediaCapture and Government Accountabilityrdquo American Economic Review 96(3) 720ndash736

Blackwell D (1953) ldquoEquivalent Comparisons of Experimentsrdquo Annals of Math-ematical Statistics 24 265ndash272

Chan J and W Suen (2008) ldquoA Spatial Theory of News Consumption andElectoral Competitionrdquo Review of Economic Studies 75 699ndash728

Chiang C-F and B Knight (2011) ldquoMedia Bias and Influence Evidence fromNewspaper Endorsementsrdquo Review of Economic Studies 78 795ndash820

Duffie D P Dworczak and H Zhu (2017) ldquoBenchmarks in Search MarketsrdquoJournal of Finance 72 1983ndash2044

Dworczak P (2017) ldquoMechanism Design with Aftermarkets Cutoff Mecha-nismsrdquo mimeo

Dworczak P and G Martini (2019) ldquoThe Simple Economics of Optimal Per-suasionrdquo Journal of Political Economy 127 993ndash2048

Edmond C (2013) ldquoInformation Manipulation Coordination and RegimeChangerdquo Review of Economic Studies 80 1422ndash1458

Egorov G S Guriev and K Sonin (2009) ldquoWhy Resource-poor DictatorsAllow Freer Media A Theory and Evidence from Panel Datardquo American PoliticalScience Review 103 645ndash668

Gehlbach S and K Sonin (2014) ldquoGovernment Control of the Mediardquo Journalof Public Economics 118 163ndash171

Gentzkow M and E Kamenica (2016) ldquoA Rothschild-Stiglitz Approach toBayesian Persuasionrdquo American Economic Review Papers amp Proceedings 106597ndash601

Gentzkow M and J Shapiro (2006) ldquoMedia Bias and Reputationrdquo Journalof Political Economy 114(2) 280ndash316

(2010) ldquoWhat Drives Media Slant Evidence from US Daily NewspapersrdquoEconometrica 78(1) 35ndash71

Ginzburg B (2019) ldquoOptimal Information Censorshiprdquo Journal of EconomicBehavior and Organization 163 377ndash385

Goldstein I and Y Leitner (2018) ldquoStress Tests and Information DisclosurerdquoJournal of Economic Theory 177 34ndash69

32 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Guo Y and E Shmaya (2019) ldquoThe Interval Structure of Optimal DisclosurerdquoEconometrica 87 653ndash675

Johnson J P and D P Myatt (2006) ldquoOn the Simple Economics of Adver-tising Marketing and Product Designrdquo American Economic Review 96 756ndash784

Kamenica E and M Gentzkow (2011) ldquoBayesian Persuasionrdquo American Eco-nomic Review 101 2590ndash2615

Kolotilin A (2015) ldquoExperimental Design to Persuaderdquo Games and EconomicBehavior 90 215ndash226

(2018) ldquoOptimal Information Disclosure A Linear Programming Ap-proachrdquo Theoretical Economics 13 607ndash636

Kolotilin A and H Li (2019) ldquoRelational Communicationrdquo mimeoKolotilin A M Li T Mylovanov and A Zapechelnyuk (2015) ldquoPer-suasion of a Privately Informed Receiverrdquo mimeo

Kolotilin A T Mylovanov A Zapechelnyuk and M Li (2017) ldquoPer-suasion of a Privately Informed Receiverrdquo Econometrica 85 1949ndash1964

Kolotilin A and A Wolitzky (2019) ldquoAssortative Information Disclosurerdquomimeo

Lewis T R and D Sappington (1994) ldquoSupplying Information to FacilitatePrice Discriminationrdquo International Economic Review 35 309ndash327

Lorentzen P (2014) ldquoChinarsquos Strategic Censorshiprdquo American Journal of Polit-ical Science 58 402ndash414

Mullainathan S and A Shleifer (2005) ldquoThe Market for Newsrdquo AmericanEconomic Review 95(4) 1031ndash1053

Orlov D P Zryumov and A Skrzypach (2018) ldquoDesign of Macro-prudentialStress Testsrdquo mimeo

Ostrovsky M and M Schwarz (2010) ldquoInformation Disclosure and Unravelingin Matching Marketsrdquo American Economic Journal Microeconomics 2 34ndash63

Prat A and D Stromberg (2013) ldquoThe Political Economy of Mass MediardquoAdvances in Economics and Econometrics 2 135ndash187

Quah J and B Strulovici (2012) ldquoAggregating the Single Crossing PropertyrdquoEconometrica 80 2333ndash2348

Rayo L and I Segal (2010) ldquoOptimal Information Disclosurerdquo Journal ofPolitical Economy 118 949ndash987

Romanyuk G and A Smolin (2019) ldquoCream Skimming and Information Designin Matching Marketsrdquo American Economic Journal Microeconomics 11 250ndash276

Suen W (2004) ldquoThe Self-Perpetuation of Biased Beliefsrdquo Economic Journal 114377ndash396

Yamashita T (2018) ldquoOptimal Public Information Disclosure by Mechanism De-signerrdquo mimeo

Zapechelnyuk A (2019) ldquoOptimal Quality Certificationrdquo American EconomicReview Insights forthcoming

CENSORSHIP AS OPTIMAL PERSUASION 23

Since V (m) is convex on [0 τ ] and concave on [τ 1] there exists ωlowast isin [0 τ ] thatsatisfies condition (3) that is

V (mωlowast) ge (le)V (m) for all m ge (lt)ωlowast (8)

This is true because V (ωlowastωlowast) minus V (ωlowast) is single-crossing from below To see thisconsider Figure 1 Observe that (3) can hold only if ωlowast is on the convex part of V andmlowast is on the concave part of V As the censorship cutoff ωlowast increases the posteriormean state of the pooling message mlowast(ωlowast) also increases But because V is concaveat mlowast(ωlowast) the dashed tangent line V (mωlowast) becomes flatter in m so it crosses thesolid line V (m) at a smaller m

Consider an arbitrary signal s Let H be the distribution of m = E[ω|s] inducedby signal s Since each signal is a garbling of the fully informative signal F is amean-preserving spread of H Thus the senderrsquos expected utility is smaller undersignal s than under the upper-censorship signal with cutoff ωlowast as follows from

983133 1

0

V (m)dH(m) le983133 1

0

maxV (m) V (mωlowast)dH(m)

le983133 1

0

maxV (m) V (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mωlowast)dF (m)

=

983133 ωlowast

0

V (m)dF (m) +

983133 1

ωlowastV (mlowast(ωlowast))dF (m)

where the second line holds because maxV (m) V (mωlowast) is convex in m and F isa mean-preserving spread of H the third line holds by (8) and the last line holds by

(7) and by983125 1

ωlowast(mminusmlowast(ωlowast))dF (m) = 0

Conversely suppose that V is not S-shaped Then there exist 0 le m1 lt m2 lem3 lt m4 le 1 such that V primeprime(m) lt 0 for m isin (m1m2) V

primeprime(m) = 0 for m isin [m2m3]and V primeprime(m) gt 0 for m isin (m3m4) because V primeprime is continuous by assumption It isstraightforward to show that there exists 983141mlowast isin (m1m2) (sufficiently close to m2) and983141ωlowast isin (m3m4) such that

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) lt V (m) for all m isin (983141ωlowastm4]

V (983141mlowast) + V prime(983141mlowast)(mminus 983141mlowast) gt V (m) for all m isin [m1 983141ωlowast] 983141mlowast

Moreover there exists a continuous distribution F such that the support of F is asubset of [m1m4] and 983141mlowast = E[ω|ω le 983141ωlowast] Using a chain of inequalities similar to theone above it is easy to show that lower-censorship signal 983141slowast with cutoff 983141ωlowast is optimalFurthermore it is uniquely optimal as both inequalities hold with equalities only forsignal 983141slowast This in turn implies that any upper-censorship signal is suboptimal

24 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Proof of Lemma 1 Notice that under assumption (A1) we have

V primeprime(m) = gprime(m) + ρg(m) for m isin [0 1]

Recall that V is S-shaped if and only if V primeprime(m) = gprime (m) + ρg (m) is single-crossingfrom above Using Proposition 1 in Quah and Strulovici (2012) it is easy to showthat V primeprime(m) is single-crossing from above for all ρ isin R if and only if gprime (m)g (m) isnonincreasing in m (that is ln g(m) is concave)

Proof of Theorem 2 Immediate by Theorem 1 and Lemma 1

Proof of Theorem 3 Part (i) We fix t and σ and vary ρ Without loss ofgenerality let t = 0 and σ = 1 (so that Gtσ = G) Consider a signal that induces thedistribution H of posterior mean m This signal implements the receiverrsquos interimutility U given by

U(r) =

983133 1

r

(mminus r)dH(m) for r isin [0 1]

which holds because the receiver acts if and only if m ge r Since types r lt 0 alwayschoose a = 1 and types r gt 1 always choose a = 0 we can exclude them fromconsideration and assume that supp(G) = [0 1] The senderrsquos expected utility is then

983133 1

0

V (m)dH(m) =

983133 1

0

983133 m

0

(1 + ρ(mminus r))dG(r)dH(m)

=

983133 1

0

983133 1

r

(1 + ρ(mminus r))dH(m)dG(r)

=

983133 1

0

(1minusH(r) + ρU(r))dG(r)

Consider ρ2 gt ρ1 Suppose by contradiction that the corresponding optimal upper-censorship signals s2 and s1 are such that s1 is strictly more informative than s2 LetU2 and U1 be the receiverrsquos interim utilities implemented by s2 and s1 Since thesender prefers signal s2 under ρ2 and signal s1 under ρ1 we have

983133 1

0

(1minusH2 (r) + ρ2U2 (r)) dG (r) ge983133 1

0

(1minusH1 (r) + ρ2U1 (r)) dG (r)

983133 1

0

(1minusH1 (r) + ρ1U1 (r)) dG (r) ge983133 1

0

(1minusH2 (r) + ρ1U2 (r)) dG (r)

Summing up these inequalities gives

(ρ2 minus ρ1)

983133 1

0

(U2 (r)minus U1 (r)) dG (r) ge 0

leading to a contradiction because U1(r) ge U2(r) for all r with strict inequality forsome r given that s1 is strictly more informative than s2

CENSORSHIP AS OPTIMAL PERSUASION 25

Part (ii) We fix ρ isin R and σ = 1 and vary t The indirect utility of the senderconditional on m isin [0 1] is given by

Vt(m) =

983133

rlem

(1+ρ(mminusr))gt1(r)dr =

983133

rlemminust

(1+ρ(mminustminusr))g(r)dr = V (mminust) (9)

By Theorem 1 when t0 = 0 an optimal cutoff ωlowast0 satisfies

V (m) le V (mlowast0) + V prime(mlowast

0)(mminusmlowast0) for m isin [ωlowast

0 1]

When V is S-shaped and t1 lt 0 it is easy to see that

V (mminus t1) le V (mlowast0 minus t1) + V prime(mlowast

0 minus t1)(mminusmlowast0) for m isin [ωlowast

0 1]

which implies that when t = t1 the above inequality changes its sign at some ωlowast1

that satisfies ωlowast1 le ωlowast

0 It follows from (9) that ωlowast1 is an optimal cutoff for t = t1 and

as we established above ωlowast1 le ωlowast

0

Part (iii) We fix ρ ge 0 and t = 0 and vary σ The indirect utility of the senderconditional on m isin [0 1] is given by

V ρσσ(m) =

983133

rlem

9830591 +

ρ

σ(mminus r)

983060dG0σ(r)

=

983133

rlem

9830591 +

ρ

σ(mminus r)

983060g

983061r minus (1minus σ)τ

σ

9830621

σdr

=

983133

rlemminus(1minusσ)τσ

9830611 + ρ

983061mminus (1minus σ)τ

σminus r

983062983062g(r)dr

= Vρ1

983061mminus (1minus σ)τ

σ

983062 (10)

Let σ0 = 1 σ1 lt 1 and ρ1 = ρσ1 To compare the optimal cutoffs under parameters(ρ σ0) and (ρ σ1) we consider two steps first comparing the cutoffs between (ρ σ0)and (ρ1 σ1) and then comparing the cutoffs between (ρ1 σ1) and (ρ σ1)

First we compare an optimal cutoff ωlowast0 under (ρ σ0) = (ρ 1) and an optimal cutoff

ωlowast1 under (ρ1 σ1) By Theorem 1 ωlowast

0 satisfies

Vρ1(m) le Vρ1(mlowast0) + V prime

ρ1(mlowast0)(mminusmlowast

0) for m isin [ωlowast0 1]

When V is increasing (implied by ρ ge 0) and S-shaped and σ1 lt 1 it is easy to seethat for all m isin [ωlowast

0 1]

Vρ1

983061mminus (1minus σ1)τ

σ1

983062le Vρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062+ V prime

ρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062(mminusmlowast

0)

which implies that when σ = σ1 lt 1 the above inequality changes its sign at someωlowast1 that satisfies ωlowast

1 le ωlowast0 It follows from (10) that ωlowast

1 is an optimal cutoff under(ρσ σ1) = (ρ1 σ1) and as we established above ωlowast

1 le ωlowast0

26 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Second we compare the optimal cutoff ωlowast1 under (ρ1 σ1) and the optimal cutoff ωlowast

2

under (ρ σ1) Because ρ lt ρ1 = ρσ1 by part (i) we have ωlowast2 le ωlowast

1

Combining ωlowast1 le ωlowast

0 and ωlowast2 le ωlowast

1 we obtain ωlowast2 le ωlowast

0

Proof of Theorem 1prime Part (A) The proof is analogous to that of Theorem 1 andis thus omitted

Part (B) The only if statement follows from that in part (A) (see the proof of Theorem1) and the fact that we can always ensure that an optimal stochastic censorship(ωlowast qlowast) is deterministic (so qlowast isin 0 1) by the choice of the support and distributionof ω Thus we only need to prove the if statement

Let F be a distribution of ω with support on n values 0 lt ω0 lt lt ωnminus1 lt 1 Thisis without loss of generality because if 0 andor 1 were in the support of F we couldredefine the state as ω = (ω + ε)(1 + 2ε) and the type as r = (r + ε)(1 + 2ε)

Let X sub ω1 ωnminus1 identify a set of cutoffs defining a monotone partition for eachωi isin X the receiver is informed whether ω lt ωi or ω ge ωi Note that we exclude ω0

from the set of cutoffs because ω ge ω0 always holds

Note that X is deterministic upper censorship if and only if there exists a censorshipthreshold ωlowast isin [0 1] such that for each i = 1 nminus 1 ωi isin X if and only if ωi lt ωlowastthat is we can assume that qlowast = 0 without loss of generality

Let VX be the senderrsquos expected utility when the monotone partition is given by X

If n = 2 then there are only two partitions X = ω1 (full disclosure) and X = empty(no disclosure) Both are upper censorships so part (B) holds trivially

If n = 3 then there are four monotone partitions X = ω1ω2 (full disclosure)X = ω2 X = ω1 and X = empty (no disclosure) Only X = ω2 is not uppercensorship The next lemma proves that X = ω2 cannot be uniquely optimal whenV is S-shaped thus proving the optimality of upper censorship for n = 3

Denote

mjk = E983045ω|ω isin [ωjωk]

983046

For convenience of notation we write mj = ωj

Lemma 3 Let V be S-shaped with an inflexion point τ Let ω be a discrete randomvariable with support on m0m1m2 and probabilities p0 p1 p2 where 0 lt m0 ltm1 lt m2 lt 1 Then

(i) if τ le m0 then Vm2 le Vempty

(ii) if τ ge m1 then Vm2 le Vm1m2

(iii) if m0 lt τ lt m1 then Vm2 le maxVempty Vm1m2

CENSORSHIP AS OPTIMAL PERSUASION 27

Proof Define p01 = p0 + p1 and p02 = p01 + p2 = 1 Observe that

m01 =p0m0 + p1m1

p0 + p1and m02 =

p0m0 + p1m1 + p2m2

p0 + p1 + p2

We thus have

Vempty = p02V (m02) = (p01 + p2)V

983061p01m01 + p2m2

p01 + p2

983062

Vm2 = p01V (m01) + p2V (m2) = (p0 + p1)V

983061p0m0 + p1m1

p0 + p1

983062+ p2V (m2)

Vm1m2 = p0V (m0) + p1V (m1) + p2V (m2)

Clearly

m0 lt m01 lt m1 and m0 lt m02 lt m2

Therefore

Vm2 le Vempty lArrrArrp01

p01 + p2V (m01) +

p2p01 + p2

V (m2) le V

983061p01

p01 + p2m01 +

p2p01 + p2

m2

983062= V (m02)

(11)

and

Vm2 le Vm1m2 lArrrArr

V

983061p0

p0 + p1m0 +

p1p0 + p1

m1

983062le p0

p0 + p1V (m0) +

p1p0 + p1

V (m1)(12)

Suppose that τ le m01 so V is concave on [m01m2] Then (11) holds by Jensenrsquosinequality In particular it holds for τ le m0 lt m01 thus proving part (i)

Suppose that τ ge m1 so V is convex on [m0m1] Then (12) holds by Jensenrsquosinequality thus proving part (ii)

Finally suppose that m01 lt τ lt m1 Let (τ V (τ)) lie below (or on) the straight lineA connecting (m0 V (m0)) and (m2 V (m2)) as illustrated on Figure 2 By convexityof V (m) for m lt τ it must be the case that V (m01) lies below (or on) the dashedline B By concavity of V (m) for m gt τ it must be the case that V (m1) lies above(or on) the dashed line C Consequently (12) is satisfied

Alternatively let (τ V (τ)) lie above the straight line A connecting (m0 V (m0)) and(m2 V (m2)) as illustrated on Figure 3 If Vm2 le Vm1m2 then the proof of part(iii) is complete So suppose that Vm2 gt Vm1m2 so (12) is violated

By convexity of V (m) for m lt τ it must be the case that V (m01) lies below (or on)the line B Since (12) is violated V (m01) must lie between the lines A and B Recallthat m01 lt m02 lt m2 Observe that V (m02) must be in the shaded area on Figure 3Indeed if m02 lt τ then by convexity of V (m) for m lt τ it must be the case thatV (m02) lies between (or on) the lines B and D If m02 ge τ then by concavity of

28 KOLOTILIN MYLOVANOV ZAPECHELNYUK

τ m2m0

A C

B

m01 m1

V (m0) V (τ)

V (m2)

Figure 2 The case where (τ V (τ)) lies below (or on) the straight lineconnecting (m0 V (m0)) and (m2 V (m2))

τ m2m0

A

C

B

m01

V (m0)

V (τ)

V (m2)

D

V (m01)

Figure 3 The case where (τ V (τ)) lies above the straight line con-necting (m0 V (m0)) and (m2 V (m2))

V (m) for m gt τ it must be the case that V (m02) lies above (or on) line C It followsthat (11) is satisfied

We now adapt Lemma 3 to prove the optimality of upper censorship for n ge 4

CENSORSHIP AS OPTIMAL PERSUASION 29

Fix a monotone partition X sub ω1 ωnminus1 that is not upper censorship that isthere exists i isin 2 nminus 1 such that

ωiminus1 ∕isin X and ωi isin X (13)

Denote ωn = 1 Define j and k as follows If X cap ω1 ωiminus2 ∕= empty let ωj be thelargest element inX smaller than i otherwise let ωj = ω0 IfXcapωi+1 ωnminus1 ∕= emptylet ωk be the smallest element in X greater than i otherwise let ωk = ωn = 1 Define

m0 = ωj m1 = ωiminus1 m2 = ωi

and

p0 =Pr[ω isin [ωjωiminus1)]

Pr[ω isin [ωjωk)] p1 =

Pr[ω isin [ωiminus1ωi)]

Pr[ω isin [ωjωk)] p2 =

Pr[ω isin [ωiωk)]

Pr[ω isin [ωjωk)]

To apply Lemma 3 we identify Xωi with empty X with m2 and X cup ωiminus1 withm1m2 By Lemma 3 VX le maxVXωi VXcupωiminus1 Thus replacing X withXωi if VX le VXωi and with X cup ωiminus1 otherwise weakly improves the senderrsquosexpected utility Repeatedly applying such replacements yields upper censorshipIndeed this procedure terminates after a finite number of replacements because byLemma 3 for ωj ge τ we replace X with Xωi and for ωiminus1 le τ we replace Xwith X cup ωiminus1

We thus have shown that when V is S-shaped when optimizing over the set ofmonotone partitions we can restrict attention to deterministic upper censorship Itremains to show that an optimal deterministic censorship cutoff ωlowast

d coincides with anoptimal stochastic censorship cutoff ωlowast from part (A) To see this it is straightforwardto show that

V lowast(ωlowast qlowast) =

983133

[0ωlowast)

V (ω)dF (ω) + V (ωlowast)qlowast Pr[ω = ωlowast]

+ V (mlowast(ωlowast qlowast))((1minus qlowast) Pr[ω = ωlowast] + Pr[ω gt ωlowast]) (14)

is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order Since the two de-terministic censorship cutoff pairs closest to (ωlowast qlowast) in the Blackwell informativenessorder are (ωlowast 0) and (ωlowast 1) we obtain ωlowast

d = ωlowast

Proof of Theorem 2prime Immediate by Theorem 1prime and Lemma 1

Proof of Theorem 3prime The proof for an optimal signal is analogous to that ofTheorem 3 and is thus omitted The same applies to part (i) for an optimal monotonepartition Thus it remains to prove part (ii) for an optimal monotone partition

We fix ρ and σ = 1 and vary t continuously If ωlowast isin supp(F ) then an optimalmonotone partition is an optimal signal because all qlowast isin [0 1] are optimal Thuspart (ii) in (B) holds by part (ii) in (A) If ωlowast isin supp(F ) and qlowast isin (0 1) then either(ωlowast

d qlowastd) = (ωlowast 0) or (ωlowast

d qlowastd) = (ωlowast 1) is optimal because the senderrsquos expected utility

V lowast given by (14) is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order

30 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Therefore it suffices to consider values of t at which the sender is indifferent between(ωlowast 0) and (ωlowast 1) and to show that the sender strictly prefers (ωlowast 1) to (ωlowast 0) at aslightly higher t

V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) =rArr dV lowastt (ω

lowast 0)

dtle dV lowast

t (ωlowast 1)

dt (15)

The condition V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) can be written as

Vt(mlowast(ωlowast 0)) Pr[ω ge ωlowast] = Vt(ω

lowast) Pr[ω = ωlowast] + Vt(mlowast(ωlowast 1)) Pr[ω gt ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)Pr[ω = ωlowast]

Pr[ω ge ωlowast]+ Vt(m

lowast(ωlowast 1))Pr[ω gt ωlowast]

Pr[ω ge ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + Vt(mlowast(ωlowast 1))

mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Analogously the conditiondV lowast

t (ωlowast0)dt

le dV lowastt (ωlowast1)dt

can be written as

V primet (m

lowast(ωlowast 0)) ge V primet (ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + V primet (m

lowast(ωlowast 1))mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Thus (15) states that if Vt(mlowast(ωlowast 0)) is equal to the convex combination of Vt(ω

lowast)and Vt(m

lowast(ωlowast 1)) then the derivative V primet (m

lowast(ωlowast 0)) is greater than the convex com-bination of the derivatives V prime

t (ωlowast) and V prime

t (mlowast(ωlowast 1)) When V is S-shaped it is easy

to see that this property is satisfied

Proof of Lemma 2 Observe that a signal induced by any set X sub C of permittedmedia outlets induces a monotone partition of [0 1] and thus HC sub HM Indeed letcX(ω) and cX(ω) be the media outlets in X that are the closest to ω from above andfrom below

cX(ω) = sup983059c isin X c le ω

983052cup 0

983060and cX(ω) = inf

983059c isin X c gt ω cup 1

983060

By observing the media outlets in X each citizen is informed that ω isin [cX(ω) cX(ω))

Conversely let the set C of media outlets be c0 lt middot middot middot lt cn Without loss of generalityassume that C contains the two uninformative media outlets c0 = 0 and cn = 1 which(almost) always endorse and criticize action a = 1 respectively Then the distributionF of the state ω is a discrete distribution whose support is Ω = ω0 ωnminus1 whereωi = E[θ|θ isin [ci ci+1)] for i isin 0 nminus 1 Consider a monotone partitional signalrepresented by a nondecreasing function ξ [0 1] rarr R For each realization m of thissignal that occurs with a strictly positive probability define

x(m) = min983153ω isin Ω ξ(ω) = m

983154and x(m) = max

983153ω isin Ω ξ(ω) = m

983154

This signal induces a monotone partition of the finite set Ω This partition of Ω canbe implemented by a censorship policy that prohibits a media outlet c isin C if andonly if c isin (x(m) x(m)) for some realization m

CENSORSHIP AS OPTIMAL PERSUASION 31

References

Alonso R and O Camara (2016a) ldquoPersuading Votersrdquo American EconomicReview 106 3590ndash3605

(2016b) ldquoPolitical Disagreement and Information in Electionsrdquo Games andEconomic Behavior 100 390ndash412

Bagnoli M and T Bergstrom (2005) ldquoLog-Concave Probability and Its Ap-plicationsrdquo Economic Theory 26 445ndash469

Baron D P (2006) ldquoPersistent Media Biasrdquo Journal of Public Economics 90(1)1ndash36

Besley T and A Prat (2006) ldquoHandcuffs for the Grabbing Hand MediaCapture and Government Accountabilityrdquo American Economic Review 96(3) 720ndash736

Blackwell D (1953) ldquoEquivalent Comparisons of Experimentsrdquo Annals of Math-ematical Statistics 24 265ndash272

Chan J and W Suen (2008) ldquoA Spatial Theory of News Consumption andElectoral Competitionrdquo Review of Economic Studies 75 699ndash728

Chiang C-F and B Knight (2011) ldquoMedia Bias and Influence Evidence fromNewspaper Endorsementsrdquo Review of Economic Studies 78 795ndash820

Duffie D P Dworczak and H Zhu (2017) ldquoBenchmarks in Search MarketsrdquoJournal of Finance 72 1983ndash2044

Dworczak P (2017) ldquoMechanism Design with Aftermarkets Cutoff Mecha-nismsrdquo mimeo

Dworczak P and G Martini (2019) ldquoThe Simple Economics of Optimal Per-suasionrdquo Journal of Political Economy 127 993ndash2048

Edmond C (2013) ldquoInformation Manipulation Coordination and RegimeChangerdquo Review of Economic Studies 80 1422ndash1458

Egorov G S Guriev and K Sonin (2009) ldquoWhy Resource-poor DictatorsAllow Freer Media A Theory and Evidence from Panel Datardquo American PoliticalScience Review 103 645ndash668

Gehlbach S and K Sonin (2014) ldquoGovernment Control of the Mediardquo Journalof Public Economics 118 163ndash171

Gentzkow M and E Kamenica (2016) ldquoA Rothschild-Stiglitz Approach toBayesian Persuasionrdquo American Economic Review Papers amp Proceedings 106597ndash601

Gentzkow M and J Shapiro (2006) ldquoMedia Bias and Reputationrdquo Journalof Political Economy 114(2) 280ndash316

(2010) ldquoWhat Drives Media Slant Evidence from US Daily NewspapersrdquoEconometrica 78(1) 35ndash71

Ginzburg B (2019) ldquoOptimal Information Censorshiprdquo Journal of EconomicBehavior and Organization 163 377ndash385

Goldstein I and Y Leitner (2018) ldquoStress Tests and Information DisclosurerdquoJournal of Economic Theory 177 34ndash69

32 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Guo Y and E Shmaya (2019) ldquoThe Interval Structure of Optimal DisclosurerdquoEconometrica 87 653ndash675

Johnson J P and D P Myatt (2006) ldquoOn the Simple Economics of Adver-tising Marketing and Product Designrdquo American Economic Review 96 756ndash784

Kamenica E and M Gentzkow (2011) ldquoBayesian Persuasionrdquo American Eco-nomic Review 101 2590ndash2615

Kolotilin A (2015) ldquoExperimental Design to Persuaderdquo Games and EconomicBehavior 90 215ndash226

(2018) ldquoOptimal Information Disclosure A Linear Programming Ap-proachrdquo Theoretical Economics 13 607ndash636

Kolotilin A and H Li (2019) ldquoRelational Communicationrdquo mimeoKolotilin A M Li T Mylovanov and A Zapechelnyuk (2015) ldquoPer-suasion of a Privately Informed Receiverrdquo mimeo

Kolotilin A T Mylovanov A Zapechelnyuk and M Li (2017) ldquoPer-suasion of a Privately Informed Receiverrdquo Econometrica 85 1949ndash1964

Kolotilin A and A Wolitzky (2019) ldquoAssortative Information Disclosurerdquomimeo

Lewis T R and D Sappington (1994) ldquoSupplying Information to FacilitatePrice Discriminationrdquo International Economic Review 35 309ndash327

Lorentzen P (2014) ldquoChinarsquos Strategic Censorshiprdquo American Journal of Polit-ical Science 58 402ndash414

Mullainathan S and A Shleifer (2005) ldquoThe Market for Newsrdquo AmericanEconomic Review 95(4) 1031ndash1053

Orlov D P Zryumov and A Skrzypach (2018) ldquoDesign of Macro-prudentialStress Testsrdquo mimeo

Ostrovsky M and M Schwarz (2010) ldquoInformation Disclosure and Unravelingin Matching Marketsrdquo American Economic Journal Microeconomics 2 34ndash63

Prat A and D Stromberg (2013) ldquoThe Political Economy of Mass MediardquoAdvances in Economics and Econometrics 2 135ndash187

Quah J and B Strulovici (2012) ldquoAggregating the Single Crossing PropertyrdquoEconometrica 80 2333ndash2348

Rayo L and I Segal (2010) ldquoOptimal Information Disclosurerdquo Journal ofPolitical Economy 118 949ndash987

Romanyuk G and A Smolin (2019) ldquoCream Skimming and Information Designin Matching Marketsrdquo American Economic Journal Microeconomics 11 250ndash276

Suen W (2004) ldquoThe Self-Perpetuation of Biased Beliefsrdquo Economic Journal 114377ndash396

Yamashita T (2018) ldquoOptimal Public Information Disclosure by Mechanism De-signerrdquo mimeo

Zapechelnyuk A (2019) ldquoOptimal Quality Certificationrdquo American EconomicReview Insights forthcoming

24 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Proof of Lemma 1 Notice that under assumption (A1) we have

V primeprime(m) = gprime(m) + ρg(m) for m isin [0 1]

Recall that V is S-shaped if and only if V primeprime(m) = gprime (m) + ρg (m) is single-crossingfrom above Using Proposition 1 in Quah and Strulovici (2012) it is easy to showthat V primeprime(m) is single-crossing from above for all ρ isin R if and only if gprime (m)g (m) isnonincreasing in m (that is ln g(m) is concave)

Proof of Theorem 2 Immediate by Theorem 1 and Lemma 1

Proof of Theorem 3 Part (i) We fix t and σ and vary ρ Without loss ofgenerality let t = 0 and σ = 1 (so that Gtσ = G) Consider a signal that induces thedistribution H of posterior mean m This signal implements the receiverrsquos interimutility U given by

U(r) =

983133 1

r

(mminus r)dH(m) for r isin [0 1]

which holds because the receiver acts if and only if m ge r Since types r lt 0 alwayschoose a = 1 and types r gt 1 always choose a = 0 we can exclude them fromconsideration and assume that supp(G) = [0 1] The senderrsquos expected utility is then

983133 1

0

V (m)dH(m) =

983133 1

0

983133 m

0

(1 + ρ(mminus r))dG(r)dH(m)

=

983133 1

0

983133 1

r

(1 + ρ(mminus r))dH(m)dG(r)

=

983133 1

0

(1minusH(r) + ρU(r))dG(r)

Consider ρ2 gt ρ1 Suppose by contradiction that the corresponding optimal upper-censorship signals s2 and s1 are such that s1 is strictly more informative than s2 LetU2 and U1 be the receiverrsquos interim utilities implemented by s2 and s1 Since thesender prefers signal s2 under ρ2 and signal s1 under ρ1 we have

983133 1

0

(1minusH2 (r) + ρ2U2 (r)) dG (r) ge983133 1

0

(1minusH1 (r) + ρ2U1 (r)) dG (r)

983133 1

0

(1minusH1 (r) + ρ1U1 (r)) dG (r) ge983133 1

0

(1minusH2 (r) + ρ1U2 (r)) dG (r)

Summing up these inequalities gives

(ρ2 minus ρ1)

983133 1

0

(U2 (r)minus U1 (r)) dG (r) ge 0

leading to a contradiction because U1(r) ge U2(r) for all r with strict inequality forsome r given that s1 is strictly more informative than s2

CENSORSHIP AS OPTIMAL PERSUASION 25

Part (ii) We fix ρ isin R and σ = 1 and vary t The indirect utility of the senderconditional on m isin [0 1] is given by

Vt(m) =

983133

rlem

(1+ρ(mminusr))gt1(r)dr =

983133

rlemminust

(1+ρ(mminustminusr))g(r)dr = V (mminust) (9)

By Theorem 1 when t0 = 0 an optimal cutoff ωlowast0 satisfies

V (m) le V (mlowast0) + V prime(mlowast

0)(mminusmlowast0) for m isin [ωlowast

0 1]

When V is S-shaped and t1 lt 0 it is easy to see that

V (mminus t1) le V (mlowast0 minus t1) + V prime(mlowast

0 minus t1)(mminusmlowast0) for m isin [ωlowast

0 1]

which implies that when t = t1 the above inequality changes its sign at some ωlowast1

that satisfies ωlowast1 le ωlowast

0 It follows from (9) that ωlowast1 is an optimal cutoff for t = t1 and

as we established above ωlowast1 le ωlowast

0

Part (iii) We fix ρ ge 0 and t = 0 and vary σ The indirect utility of the senderconditional on m isin [0 1] is given by

V ρσσ(m) =

983133

rlem

9830591 +

ρ

σ(mminus r)

983060dG0σ(r)

=

983133

rlem

9830591 +

ρ

σ(mminus r)

983060g

983061r minus (1minus σ)τ

σ

9830621

σdr

=

983133

rlemminus(1minusσ)τσ

9830611 + ρ

983061mminus (1minus σ)τ

σminus r

983062983062g(r)dr

= Vρ1

983061mminus (1minus σ)τ

σ

983062 (10)

Let σ0 = 1 σ1 lt 1 and ρ1 = ρσ1 To compare the optimal cutoffs under parameters(ρ σ0) and (ρ σ1) we consider two steps first comparing the cutoffs between (ρ σ0)and (ρ1 σ1) and then comparing the cutoffs between (ρ1 σ1) and (ρ σ1)

First we compare an optimal cutoff ωlowast0 under (ρ σ0) = (ρ 1) and an optimal cutoff

ωlowast1 under (ρ1 σ1) By Theorem 1 ωlowast

0 satisfies

Vρ1(m) le Vρ1(mlowast0) + V prime

ρ1(mlowast0)(mminusmlowast

0) for m isin [ωlowast0 1]

When V is increasing (implied by ρ ge 0) and S-shaped and σ1 lt 1 it is easy to seethat for all m isin [ωlowast

0 1]

Vρ1

983061mminus (1minus σ1)τ

σ1

983062le Vρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062+ V prime

ρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062(mminusmlowast

0)

which implies that when σ = σ1 lt 1 the above inequality changes its sign at someωlowast1 that satisfies ωlowast

1 le ωlowast0 It follows from (10) that ωlowast

1 is an optimal cutoff under(ρσ σ1) = (ρ1 σ1) and as we established above ωlowast

1 le ωlowast0

26 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Second we compare the optimal cutoff ωlowast1 under (ρ1 σ1) and the optimal cutoff ωlowast

2

under (ρ σ1) Because ρ lt ρ1 = ρσ1 by part (i) we have ωlowast2 le ωlowast

1

Combining ωlowast1 le ωlowast

0 and ωlowast2 le ωlowast

1 we obtain ωlowast2 le ωlowast

0

Proof of Theorem 1prime Part (A) The proof is analogous to that of Theorem 1 andis thus omitted

Part (B) The only if statement follows from that in part (A) (see the proof of Theorem1) and the fact that we can always ensure that an optimal stochastic censorship(ωlowast qlowast) is deterministic (so qlowast isin 0 1) by the choice of the support and distributionof ω Thus we only need to prove the if statement

Let F be a distribution of ω with support on n values 0 lt ω0 lt lt ωnminus1 lt 1 Thisis without loss of generality because if 0 andor 1 were in the support of F we couldredefine the state as ω = (ω + ε)(1 + 2ε) and the type as r = (r + ε)(1 + 2ε)

Let X sub ω1 ωnminus1 identify a set of cutoffs defining a monotone partition for eachωi isin X the receiver is informed whether ω lt ωi or ω ge ωi Note that we exclude ω0

from the set of cutoffs because ω ge ω0 always holds

Note that X is deterministic upper censorship if and only if there exists a censorshipthreshold ωlowast isin [0 1] such that for each i = 1 nminus 1 ωi isin X if and only if ωi lt ωlowastthat is we can assume that qlowast = 0 without loss of generality

Let VX be the senderrsquos expected utility when the monotone partition is given by X

If n = 2 then there are only two partitions X = ω1 (full disclosure) and X = empty(no disclosure) Both are upper censorships so part (B) holds trivially

If n = 3 then there are four monotone partitions X = ω1ω2 (full disclosure)X = ω2 X = ω1 and X = empty (no disclosure) Only X = ω2 is not uppercensorship The next lemma proves that X = ω2 cannot be uniquely optimal whenV is S-shaped thus proving the optimality of upper censorship for n = 3

Denote

mjk = E983045ω|ω isin [ωjωk]

983046

For convenience of notation we write mj = ωj

Lemma 3 Let V be S-shaped with an inflexion point τ Let ω be a discrete randomvariable with support on m0m1m2 and probabilities p0 p1 p2 where 0 lt m0 ltm1 lt m2 lt 1 Then

(i) if τ le m0 then Vm2 le Vempty

(ii) if τ ge m1 then Vm2 le Vm1m2

(iii) if m0 lt τ lt m1 then Vm2 le maxVempty Vm1m2

CENSORSHIP AS OPTIMAL PERSUASION 27

Proof Define p01 = p0 + p1 and p02 = p01 + p2 = 1 Observe that

m01 =p0m0 + p1m1

p0 + p1and m02 =

p0m0 + p1m1 + p2m2

p0 + p1 + p2

We thus have

Vempty = p02V (m02) = (p01 + p2)V

983061p01m01 + p2m2

p01 + p2

983062

Vm2 = p01V (m01) + p2V (m2) = (p0 + p1)V

983061p0m0 + p1m1

p0 + p1

983062+ p2V (m2)

Vm1m2 = p0V (m0) + p1V (m1) + p2V (m2)

Clearly

m0 lt m01 lt m1 and m0 lt m02 lt m2

Therefore

Vm2 le Vempty lArrrArrp01

p01 + p2V (m01) +

p2p01 + p2

V (m2) le V

983061p01

p01 + p2m01 +

p2p01 + p2

m2

983062= V (m02)

(11)

and

Vm2 le Vm1m2 lArrrArr

V

983061p0

p0 + p1m0 +

p1p0 + p1

m1

983062le p0

p0 + p1V (m0) +

p1p0 + p1

V (m1)(12)

Suppose that τ le m01 so V is concave on [m01m2] Then (11) holds by Jensenrsquosinequality In particular it holds for τ le m0 lt m01 thus proving part (i)

Suppose that τ ge m1 so V is convex on [m0m1] Then (12) holds by Jensenrsquosinequality thus proving part (ii)

Finally suppose that m01 lt τ lt m1 Let (τ V (τ)) lie below (or on) the straight lineA connecting (m0 V (m0)) and (m2 V (m2)) as illustrated on Figure 2 By convexityof V (m) for m lt τ it must be the case that V (m01) lies below (or on) the dashedline B By concavity of V (m) for m gt τ it must be the case that V (m1) lies above(or on) the dashed line C Consequently (12) is satisfied

Alternatively let (τ V (τ)) lie above the straight line A connecting (m0 V (m0)) and(m2 V (m2)) as illustrated on Figure 3 If Vm2 le Vm1m2 then the proof of part(iii) is complete So suppose that Vm2 gt Vm1m2 so (12) is violated

By convexity of V (m) for m lt τ it must be the case that V (m01) lies below (or on)the line B Since (12) is violated V (m01) must lie between the lines A and B Recallthat m01 lt m02 lt m2 Observe that V (m02) must be in the shaded area on Figure 3Indeed if m02 lt τ then by convexity of V (m) for m lt τ it must be the case thatV (m02) lies between (or on) the lines B and D If m02 ge τ then by concavity of

28 KOLOTILIN MYLOVANOV ZAPECHELNYUK

τ m2m0

A C

B

m01 m1

V (m0) V (τ)

V (m2)

Figure 2 The case where (τ V (τ)) lies below (or on) the straight lineconnecting (m0 V (m0)) and (m2 V (m2))

τ m2m0

A

C

B

m01

V (m0)

V (τ)

V (m2)

D

V (m01)

Figure 3 The case where (τ V (τ)) lies above the straight line con-necting (m0 V (m0)) and (m2 V (m2))

V (m) for m gt τ it must be the case that V (m02) lies above (or on) line C It followsthat (11) is satisfied

We now adapt Lemma 3 to prove the optimality of upper censorship for n ge 4

CENSORSHIP AS OPTIMAL PERSUASION 29

Fix a monotone partition X sub ω1 ωnminus1 that is not upper censorship that isthere exists i isin 2 nminus 1 such that

ωiminus1 ∕isin X and ωi isin X (13)

Denote ωn = 1 Define j and k as follows If X cap ω1 ωiminus2 ∕= empty let ωj be thelargest element inX smaller than i otherwise let ωj = ω0 IfXcapωi+1 ωnminus1 ∕= emptylet ωk be the smallest element in X greater than i otherwise let ωk = ωn = 1 Define

m0 = ωj m1 = ωiminus1 m2 = ωi

and

p0 =Pr[ω isin [ωjωiminus1)]

Pr[ω isin [ωjωk)] p1 =

Pr[ω isin [ωiminus1ωi)]

Pr[ω isin [ωjωk)] p2 =

Pr[ω isin [ωiωk)]

Pr[ω isin [ωjωk)]

To apply Lemma 3 we identify Xωi with empty X with m2 and X cup ωiminus1 withm1m2 By Lemma 3 VX le maxVXωi VXcupωiminus1 Thus replacing X withXωi if VX le VXωi and with X cup ωiminus1 otherwise weakly improves the senderrsquosexpected utility Repeatedly applying such replacements yields upper censorshipIndeed this procedure terminates after a finite number of replacements because byLemma 3 for ωj ge τ we replace X with Xωi and for ωiminus1 le τ we replace Xwith X cup ωiminus1

We thus have shown that when V is S-shaped when optimizing over the set ofmonotone partitions we can restrict attention to deterministic upper censorship Itremains to show that an optimal deterministic censorship cutoff ωlowast

d coincides with anoptimal stochastic censorship cutoff ωlowast from part (A) To see this it is straightforwardto show that

V lowast(ωlowast qlowast) =

983133

[0ωlowast)

V (ω)dF (ω) + V (ωlowast)qlowast Pr[ω = ωlowast]

+ V (mlowast(ωlowast qlowast))((1minus qlowast) Pr[ω = ωlowast] + Pr[ω gt ωlowast]) (14)

is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order Since the two de-terministic censorship cutoff pairs closest to (ωlowast qlowast) in the Blackwell informativenessorder are (ωlowast 0) and (ωlowast 1) we obtain ωlowast

d = ωlowast

Proof of Theorem 2prime Immediate by Theorem 1prime and Lemma 1

Proof of Theorem 3prime The proof for an optimal signal is analogous to that ofTheorem 3 and is thus omitted The same applies to part (i) for an optimal monotonepartition Thus it remains to prove part (ii) for an optimal monotone partition

We fix ρ and σ = 1 and vary t continuously If ωlowast isin supp(F ) then an optimalmonotone partition is an optimal signal because all qlowast isin [0 1] are optimal Thuspart (ii) in (B) holds by part (ii) in (A) If ωlowast isin supp(F ) and qlowast isin (0 1) then either(ωlowast

d qlowastd) = (ωlowast 0) or (ωlowast

d qlowastd) = (ωlowast 1) is optimal because the senderrsquos expected utility

V lowast given by (14) is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order

30 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Therefore it suffices to consider values of t at which the sender is indifferent between(ωlowast 0) and (ωlowast 1) and to show that the sender strictly prefers (ωlowast 1) to (ωlowast 0) at aslightly higher t

V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) =rArr dV lowastt (ω

lowast 0)

dtle dV lowast

t (ωlowast 1)

dt (15)

The condition V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) can be written as

Vt(mlowast(ωlowast 0)) Pr[ω ge ωlowast] = Vt(ω

lowast) Pr[ω = ωlowast] + Vt(mlowast(ωlowast 1)) Pr[ω gt ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)Pr[ω = ωlowast]

Pr[ω ge ωlowast]+ Vt(m

lowast(ωlowast 1))Pr[ω gt ωlowast]

Pr[ω ge ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + Vt(mlowast(ωlowast 1))

mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Analogously the conditiondV lowast

t (ωlowast0)dt

le dV lowastt (ωlowast1)dt

can be written as

V primet (m

lowast(ωlowast 0)) ge V primet (ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + V primet (m

lowast(ωlowast 1))mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Thus (15) states that if Vt(mlowast(ωlowast 0)) is equal to the convex combination of Vt(ω

lowast)and Vt(m

lowast(ωlowast 1)) then the derivative V primet (m

lowast(ωlowast 0)) is greater than the convex com-bination of the derivatives V prime

t (ωlowast) and V prime

t (mlowast(ωlowast 1)) When V is S-shaped it is easy

to see that this property is satisfied

Proof of Lemma 2 Observe that a signal induced by any set X sub C of permittedmedia outlets induces a monotone partition of [0 1] and thus HC sub HM Indeed letcX(ω) and cX(ω) be the media outlets in X that are the closest to ω from above andfrom below

cX(ω) = sup983059c isin X c le ω

983052cup 0

983060and cX(ω) = inf

983059c isin X c gt ω cup 1

983060

By observing the media outlets in X each citizen is informed that ω isin [cX(ω) cX(ω))

Conversely let the set C of media outlets be c0 lt middot middot middot lt cn Without loss of generalityassume that C contains the two uninformative media outlets c0 = 0 and cn = 1 which(almost) always endorse and criticize action a = 1 respectively Then the distributionF of the state ω is a discrete distribution whose support is Ω = ω0 ωnminus1 whereωi = E[θ|θ isin [ci ci+1)] for i isin 0 nminus 1 Consider a monotone partitional signalrepresented by a nondecreasing function ξ [0 1] rarr R For each realization m of thissignal that occurs with a strictly positive probability define

x(m) = min983153ω isin Ω ξ(ω) = m

983154and x(m) = max

983153ω isin Ω ξ(ω) = m

983154

This signal induces a monotone partition of the finite set Ω This partition of Ω canbe implemented by a censorship policy that prohibits a media outlet c isin C if andonly if c isin (x(m) x(m)) for some realization m

CENSORSHIP AS OPTIMAL PERSUASION 31

References

Alonso R and O Camara (2016a) ldquoPersuading Votersrdquo American EconomicReview 106 3590ndash3605

(2016b) ldquoPolitical Disagreement and Information in Electionsrdquo Games andEconomic Behavior 100 390ndash412

Bagnoli M and T Bergstrom (2005) ldquoLog-Concave Probability and Its Ap-plicationsrdquo Economic Theory 26 445ndash469

Baron D P (2006) ldquoPersistent Media Biasrdquo Journal of Public Economics 90(1)1ndash36

Besley T and A Prat (2006) ldquoHandcuffs for the Grabbing Hand MediaCapture and Government Accountabilityrdquo American Economic Review 96(3) 720ndash736

Blackwell D (1953) ldquoEquivalent Comparisons of Experimentsrdquo Annals of Math-ematical Statistics 24 265ndash272

Chan J and W Suen (2008) ldquoA Spatial Theory of News Consumption andElectoral Competitionrdquo Review of Economic Studies 75 699ndash728

Chiang C-F and B Knight (2011) ldquoMedia Bias and Influence Evidence fromNewspaper Endorsementsrdquo Review of Economic Studies 78 795ndash820

Duffie D P Dworczak and H Zhu (2017) ldquoBenchmarks in Search MarketsrdquoJournal of Finance 72 1983ndash2044

Dworczak P (2017) ldquoMechanism Design with Aftermarkets Cutoff Mecha-nismsrdquo mimeo

Dworczak P and G Martini (2019) ldquoThe Simple Economics of Optimal Per-suasionrdquo Journal of Political Economy 127 993ndash2048

Edmond C (2013) ldquoInformation Manipulation Coordination and RegimeChangerdquo Review of Economic Studies 80 1422ndash1458

Egorov G S Guriev and K Sonin (2009) ldquoWhy Resource-poor DictatorsAllow Freer Media A Theory and Evidence from Panel Datardquo American PoliticalScience Review 103 645ndash668

Gehlbach S and K Sonin (2014) ldquoGovernment Control of the Mediardquo Journalof Public Economics 118 163ndash171

Gentzkow M and E Kamenica (2016) ldquoA Rothschild-Stiglitz Approach toBayesian Persuasionrdquo American Economic Review Papers amp Proceedings 106597ndash601

Gentzkow M and J Shapiro (2006) ldquoMedia Bias and Reputationrdquo Journalof Political Economy 114(2) 280ndash316

(2010) ldquoWhat Drives Media Slant Evidence from US Daily NewspapersrdquoEconometrica 78(1) 35ndash71

Ginzburg B (2019) ldquoOptimal Information Censorshiprdquo Journal of EconomicBehavior and Organization 163 377ndash385

Goldstein I and Y Leitner (2018) ldquoStress Tests and Information DisclosurerdquoJournal of Economic Theory 177 34ndash69

32 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Guo Y and E Shmaya (2019) ldquoThe Interval Structure of Optimal DisclosurerdquoEconometrica 87 653ndash675

Johnson J P and D P Myatt (2006) ldquoOn the Simple Economics of Adver-tising Marketing and Product Designrdquo American Economic Review 96 756ndash784

Kamenica E and M Gentzkow (2011) ldquoBayesian Persuasionrdquo American Eco-nomic Review 101 2590ndash2615

Kolotilin A (2015) ldquoExperimental Design to Persuaderdquo Games and EconomicBehavior 90 215ndash226

(2018) ldquoOptimal Information Disclosure A Linear Programming Ap-proachrdquo Theoretical Economics 13 607ndash636

Kolotilin A and H Li (2019) ldquoRelational Communicationrdquo mimeoKolotilin A M Li T Mylovanov and A Zapechelnyuk (2015) ldquoPer-suasion of a Privately Informed Receiverrdquo mimeo

Kolotilin A T Mylovanov A Zapechelnyuk and M Li (2017) ldquoPer-suasion of a Privately Informed Receiverrdquo Econometrica 85 1949ndash1964

Kolotilin A and A Wolitzky (2019) ldquoAssortative Information Disclosurerdquomimeo

Lewis T R and D Sappington (1994) ldquoSupplying Information to FacilitatePrice Discriminationrdquo International Economic Review 35 309ndash327

Lorentzen P (2014) ldquoChinarsquos Strategic Censorshiprdquo American Journal of Polit-ical Science 58 402ndash414

Mullainathan S and A Shleifer (2005) ldquoThe Market for Newsrdquo AmericanEconomic Review 95(4) 1031ndash1053

Orlov D P Zryumov and A Skrzypach (2018) ldquoDesign of Macro-prudentialStress Testsrdquo mimeo

Ostrovsky M and M Schwarz (2010) ldquoInformation Disclosure and Unravelingin Matching Marketsrdquo American Economic Journal Microeconomics 2 34ndash63

Prat A and D Stromberg (2013) ldquoThe Political Economy of Mass MediardquoAdvances in Economics and Econometrics 2 135ndash187

Quah J and B Strulovici (2012) ldquoAggregating the Single Crossing PropertyrdquoEconometrica 80 2333ndash2348

Rayo L and I Segal (2010) ldquoOptimal Information Disclosurerdquo Journal ofPolitical Economy 118 949ndash987

Romanyuk G and A Smolin (2019) ldquoCream Skimming and Information Designin Matching Marketsrdquo American Economic Journal Microeconomics 11 250ndash276

Suen W (2004) ldquoThe Self-Perpetuation of Biased Beliefsrdquo Economic Journal 114377ndash396

Yamashita T (2018) ldquoOptimal Public Information Disclosure by Mechanism De-signerrdquo mimeo

Zapechelnyuk A (2019) ldquoOptimal Quality Certificationrdquo American EconomicReview Insights forthcoming

CENSORSHIP AS OPTIMAL PERSUASION 25

Part (ii) We fix ρ isin R and σ = 1 and vary t The indirect utility of the senderconditional on m isin [0 1] is given by

Vt(m) =

983133

rlem

(1+ρ(mminusr))gt1(r)dr =

983133

rlemminust

(1+ρ(mminustminusr))g(r)dr = V (mminust) (9)

By Theorem 1 when t0 = 0 an optimal cutoff ωlowast0 satisfies

V (m) le V (mlowast0) + V prime(mlowast

0)(mminusmlowast0) for m isin [ωlowast

0 1]

When V is S-shaped and t1 lt 0 it is easy to see that

V (mminus t1) le V (mlowast0 minus t1) + V prime(mlowast

0 minus t1)(mminusmlowast0) for m isin [ωlowast

0 1]

which implies that when t = t1 the above inequality changes its sign at some ωlowast1

that satisfies ωlowast1 le ωlowast

0 It follows from (9) that ωlowast1 is an optimal cutoff for t = t1 and

as we established above ωlowast1 le ωlowast

0

Part (iii) We fix ρ ge 0 and t = 0 and vary σ The indirect utility of the senderconditional on m isin [0 1] is given by

V ρσσ(m) =

983133

rlem

9830591 +

ρ

σ(mminus r)

983060dG0σ(r)

=

983133

rlem

9830591 +

ρ

σ(mminus r)

983060g

983061r minus (1minus σ)τ

σ

9830621

σdr

=

983133

rlemminus(1minusσ)τσ

9830611 + ρ

983061mminus (1minus σ)τ

σminus r

983062983062g(r)dr

= Vρ1

983061mminus (1minus σ)τ

σ

983062 (10)

Let σ0 = 1 σ1 lt 1 and ρ1 = ρσ1 To compare the optimal cutoffs under parameters(ρ σ0) and (ρ σ1) we consider two steps first comparing the cutoffs between (ρ σ0)and (ρ1 σ1) and then comparing the cutoffs between (ρ1 σ1) and (ρ σ1)

First we compare an optimal cutoff ωlowast0 under (ρ σ0) = (ρ 1) and an optimal cutoff

ωlowast1 under (ρ1 σ1) By Theorem 1 ωlowast

0 satisfies

Vρ1(m) le Vρ1(mlowast0) + V prime

ρ1(mlowast0)(mminusmlowast

0) for m isin [ωlowast0 1]

When V is increasing (implied by ρ ge 0) and S-shaped and σ1 lt 1 it is easy to seethat for all m isin [ωlowast

0 1]

Vρ1

983061mminus (1minus σ1)τ

σ1

983062le Vρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062+ V prime

ρ1

983061mlowast

0 minus (1minus σ1)τ

σ1

983062(mminusmlowast

0)

which implies that when σ = σ1 lt 1 the above inequality changes its sign at someωlowast1 that satisfies ωlowast

1 le ωlowast0 It follows from (10) that ωlowast

1 is an optimal cutoff under(ρσ σ1) = (ρ1 σ1) and as we established above ωlowast

1 le ωlowast0

26 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Second we compare the optimal cutoff ωlowast1 under (ρ1 σ1) and the optimal cutoff ωlowast

2

under (ρ σ1) Because ρ lt ρ1 = ρσ1 by part (i) we have ωlowast2 le ωlowast

1

Combining ωlowast1 le ωlowast

0 and ωlowast2 le ωlowast

1 we obtain ωlowast2 le ωlowast

0

Proof of Theorem 1prime Part (A) The proof is analogous to that of Theorem 1 andis thus omitted

Part (B) The only if statement follows from that in part (A) (see the proof of Theorem1) and the fact that we can always ensure that an optimal stochastic censorship(ωlowast qlowast) is deterministic (so qlowast isin 0 1) by the choice of the support and distributionof ω Thus we only need to prove the if statement

Let F be a distribution of ω with support on n values 0 lt ω0 lt lt ωnminus1 lt 1 Thisis without loss of generality because if 0 andor 1 were in the support of F we couldredefine the state as ω = (ω + ε)(1 + 2ε) and the type as r = (r + ε)(1 + 2ε)

Let X sub ω1 ωnminus1 identify a set of cutoffs defining a monotone partition for eachωi isin X the receiver is informed whether ω lt ωi or ω ge ωi Note that we exclude ω0

from the set of cutoffs because ω ge ω0 always holds

Note that X is deterministic upper censorship if and only if there exists a censorshipthreshold ωlowast isin [0 1] such that for each i = 1 nminus 1 ωi isin X if and only if ωi lt ωlowastthat is we can assume that qlowast = 0 without loss of generality

Let VX be the senderrsquos expected utility when the monotone partition is given by X

If n = 2 then there are only two partitions X = ω1 (full disclosure) and X = empty(no disclosure) Both are upper censorships so part (B) holds trivially

If n = 3 then there are four monotone partitions X = ω1ω2 (full disclosure)X = ω2 X = ω1 and X = empty (no disclosure) Only X = ω2 is not uppercensorship The next lemma proves that X = ω2 cannot be uniquely optimal whenV is S-shaped thus proving the optimality of upper censorship for n = 3

Denote

mjk = E983045ω|ω isin [ωjωk]

983046

For convenience of notation we write mj = ωj

Lemma 3 Let V be S-shaped with an inflexion point τ Let ω be a discrete randomvariable with support on m0m1m2 and probabilities p0 p1 p2 where 0 lt m0 ltm1 lt m2 lt 1 Then

(i) if τ le m0 then Vm2 le Vempty

(ii) if τ ge m1 then Vm2 le Vm1m2

(iii) if m0 lt τ lt m1 then Vm2 le maxVempty Vm1m2

CENSORSHIP AS OPTIMAL PERSUASION 27

Proof Define p01 = p0 + p1 and p02 = p01 + p2 = 1 Observe that

m01 =p0m0 + p1m1

p0 + p1and m02 =

p0m0 + p1m1 + p2m2

p0 + p1 + p2

We thus have

Vempty = p02V (m02) = (p01 + p2)V

983061p01m01 + p2m2

p01 + p2

983062

Vm2 = p01V (m01) + p2V (m2) = (p0 + p1)V

983061p0m0 + p1m1

p0 + p1

983062+ p2V (m2)

Vm1m2 = p0V (m0) + p1V (m1) + p2V (m2)

Clearly

m0 lt m01 lt m1 and m0 lt m02 lt m2

Therefore

Vm2 le Vempty lArrrArrp01

p01 + p2V (m01) +

p2p01 + p2

V (m2) le V

983061p01

p01 + p2m01 +

p2p01 + p2

m2

983062= V (m02)

(11)

and

Vm2 le Vm1m2 lArrrArr

V

983061p0

p0 + p1m0 +

p1p0 + p1

m1

983062le p0

p0 + p1V (m0) +

p1p0 + p1

V (m1)(12)

Suppose that τ le m01 so V is concave on [m01m2] Then (11) holds by Jensenrsquosinequality In particular it holds for τ le m0 lt m01 thus proving part (i)

Suppose that τ ge m1 so V is convex on [m0m1] Then (12) holds by Jensenrsquosinequality thus proving part (ii)

Finally suppose that m01 lt τ lt m1 Let (τ V (τ)) lie below (or on) the straight lineA connecting (m0 V (m0)) and (m2 V (m2)) as illustrated on Figure 2 By convexityof V (m) for m lt τ it must be the case that V (m01) lies below (or on) the dashedline B By concavity of V (m) for m gt τ it must be the case that V (m1) lies above(or on) the dashed line C Consequently (12) is satisfied

Alternatively let (τ V (τ)) lie above the straight line A connecting (m0 V (m0)) and(m2 V (m2)) as illustrated on Figure 3 If Vm2 le Vm1m2 then the proof of part(iii) is complete So suppose that Vm2 gt Vm1m2 so (12) is violated

By convexity of V (m) for m lt τ it must be the case that V (m01) lies below (or on)the line B Since (12) is violated V (m01) must lie between the lines A and B Recallthat m01 lt m02 lt m2 Observe that V (m02) must be in the shaded area on Figure 3Indeed if m02 lt τ then by convexity of V (m) for m lt τ it must be the case thatV (m02) lies between (or on) the lines B and D If m02 ge τ then by concavity of

28 KOLOTILIN MYLOVANOV ZAPECHELNYUK

τ m2m0

A C

B

m01 m1

V (m0) V (τ)

V (m2)

Figure 2 The case where (τ V (τ)) lies below (or on) the straight lineconnecting (m0 V (m0)) and (m2 V (m2))

τ m2m0

A

C

B

m01

V (m0)

V (τ)

V (m2)

D

V (m01)

Figure 3 The case where (τ V (τ)) lies above the straight line con-necting (m0 V (m0)) and (m2 V (m2))

V (m) for m gt τ it must be the case that V (m02) lies above (or on) line C It followsthat (11) is satisfied

We now adapt Lemma 3 to prove the optimality of upper censorship for n ge 4

CENSORSHIP AS OPTIMAL PERSUASION 29

Fix a monotone partition X sub ω1 ωnminus1 that is not upper censorship that isthere exists i isin 2 nminus 1 such that

ωiminus1 ∕isin X and ωi isin X (13)

Denote ωn = 1 Define j and k as follows If X cap ω1 ωiminus2 ∕= empty let ωj be thelargest element inX smaller than i otherwise let ωj = ω0 IfXcapωi+1 ωnminus1 ∕= emptylet ωk be the smallest element in X greater than i otherwise let ωk = ωn = 1 Define

m0 = ωj m1 = ωiminus1 m2 = ωi

and

p0 =Pr[ω isin [ωjωiminus1)]

Pr[ω isin [ωjωk)] p1 =

Pr[ω isin [ωiminus1ωi)]

Pr[ω isin [ωjωk)] p2 =

Pr[ω isin [ωiωk)]

Pr[ω isin [ωjωk)]

To apply Lemma 3 we identify Xωi with empty X with m2 and X cup ωiminus1 withm1m2 By Lemma 3 VX le maxVXωi VXcupωiminus1 Thus replacing X withXωi if VX le VXωi and with X cup ωiminus1 otherwise weakly improves the senderrsquosexpected utility Repeatedly applying such replacements yields upper censorshipIndeed this procedure terminates after a finite number of replacements because byLemma 3 for ωj ge τ we replace X with Xωi and for ωiminus1 le τ we replace Xwith X cup ωiminus1

We thus have shown that when V is S-shaped when optimizing over the set ofmonotone partitions we can restrict attention to deterministic upper censorship Itremains to show that an optimal deterministic censorship cutoff ωlowast

d coincides with anoptimal stochastic censorship cutoff ωlowast from part (A) To see this it is straightforwardto show that

V lowast(ωlowast qlowast) =

983133

[0ωlowast)

V (ω)dF (ω) + V (ωlowast)qlowast Pr[ω = ωlowast]

+ V (mlowast(ωlowast qlowast))((1minus qlowast) Pr[ω = ωlowast] + Pr[ω gt ωlowast]) (14)

is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order Since the two de-terministic censorship cutoff pairs closest to (ωlowast qlowast) in the Blackwell informativenessorder are (ωlowast 0) and (ωlowast 1) we obtain ωlowast

d = ωlowast

Proof of Theorem 2prime Immediate by Theorem 1prime and Lemma 1

Proof of Theorem 3prime The proof for an optimal signal is analogous to that ofTheorem 3 and is thus omitted The same applies to part (i) for an optimal monotonepartition Thus it remains to prove part (ii) for an optimal monotone partition

We fix ρ and σ = 1 and vary t continuously If ωlowast isin supp(F ) then an optimalmonotone partition is an optimal signal because all qlowast isin [0 1] are optimal Thuspart (ii) in (B) holds by part (ii) in (A) If ωlowast isin supp(F ) and qlowast isin (0 1) then either(ωlowast

d qlowastd) = (ωlowast 0) or (ωlowast

d qlowastd) = (ωlowast 1) is optimal because the senderrsquos expected utility

V lowast given by (14) is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order

30 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Therefore it suffices to consider values of t at which the sender is indifferent between(ωlowast 0) and (ωlowast 1) and to show that the sender strictly prefers (ωlowast 1) to (ωlowast 0) at aslightly higher t

V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) =rArr dV lowastt (ω

lowast 0)

dtle dV lowast

t (ωlowast 1)

dt (15)

The condition V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) can be written as

Vt(mlowast(ωlowast 0)) Pr[ω ge ωlowast] = Vt(ω

lowast) Pr[ω = ωlowast] + Vt(mlowast(ωlowast 1)) Pr[ω gt ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)Pr[ω = ωlowast]

Pr[ω ge ωlowast]+ Vt(m

lowast(ωlowast 1))Pr[ω gt ωlowast]

Pr[ω ge ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + Vt(mlowast(ωlowast 1))

mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Analogously the conditiondV lowast

t (ωlowast0)dt

le dV lowastt (ωlowast1)dt

can be written as

V primet (m

lowast(ωlowast 0)) ge V primet (ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + V primet (m

lowast(ωlowast 1))mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Thus (15) states that if Vt(mlowast(ωlowast 0)) is equal to the convex combination of Vt(ω

lowast)and Vt(m

lowast(ωlowast 1)) then the derivative V primet (m

lowast(ωlowast 0)) is greater than the convex com-bination of the derivatives V prime

t (ωlowast) and V prime

t (mlowast(ωlowast 1)) When V is S-shaped it is easy

to see that this property is satisfied

Proof of Lemma 2 Observe that a signal induced by any set X sub C of permittedmedia outlets induces a monotone partition of [0 1] and thus HC sub HM Indeed letcX(ω) and cX(ω) be the media outlets in X that are the closest to ω from above andfrom below

cX(ω) = sup983059c isin X c le ω

983052cup 0

983060and cX(ω) = inf

983059c isin X c gt ω cup 1

983060

By observing the media outlets in X each citizen is informed that ω isin [cX(ω) cX(ω))

Conversely let the set C of media outlets be c0 lt middot middot middot lt cn Without loss of generalityassume that C contains the two uninformative media outlets c0 = 0 and cn = 1 which(almost) always endorse and criticize action a = 1 respectively Then the distributionF of the state ω is a discrete distribution whose support is Ω = ω0 ωnminus1 whereωi = E[θ|θ isin [ci ci+1)] for i isin 0 nminus 1 Consider a monotone partitional signalrepresented by a nondecreasing function ξ [0 1] rarr R For each realization m of thissignal that occurs with a strictly positive probability define

x(m) = min983153ω isin Ω ξ(ω) = m

983154and x(m) = max

983153ω isin Ω ξ(ω) = m

983154

This signal induces a monotone partition of the finite set Ω This partition of Ω canbe implemented by a censorship policy that prohibits a media outlet c isin C if andonly if c isin (x(m) x(m)) for some realization m

CENSORSHIP AS OPTIMAL PERSUASION 31

References

Alonso R and O Camara (2016a) ldquoPersuading Votersrdquo American EconomicReview 106 3590ndash3605

(2016b) ldquoPolitical Disagreement and Information in Electionsrdquo Games andEconomic Behavior 100 390ndash412

Bagnoli M and T Bergstrom (2005) ldquoLog-Concave Probability and Its Ap-plicationsrdquo Economic Theory 26 445ndash469

Baron D P (2006) ldquoPersistent Media Biasrdquo Journal of Public Economics 90(1)1ndash36

Besley T and A Prat (2006) ldquoHandcuffs for the Grabbing Hand MediaCapture and Government Accountabilityrdquo American Economic Review 96(3) 720ndash736

Blackwell D (1953) ldquoEquivalent Comparisons of Experimentsrdquo Annals of Math-ematical Statistics 24 265ndash272

Chan J and W Suen (2008) ldquoA Spatial Theory of News Consumption andElectoral Competitionrdquo Review of Economic Studies 75 699ndash728

Chiang C-F and B Knight (2011) ldquoMedia Bias and Influence Evidence fromNewspaper Endorsementsrdquo Review of Economic Studies 78 795ndash820

Duffie D P Dworczak and H Zhu (2017) ldquoBenchmarks in Search MarketsrdquoJournal of Finance 72 1983ndash2044

Dworczak P (2017) ldquoMechanism Design with Aftermarkets Cutoff Mecha-nismsrdquo mimeo

Dworczak P and G Martini (2019) ldquoThe Simple Economics of Optimal Per-suasionrdquo Journal of Political Economy 127 993ndash2048

Edmond C (2013) ldquoInformation Manipulation Coordination and RegimeChangerdquo Review of Economic Studies 80 1422ndash1458

Egorov G S Guriev and K Sonin (2009) ldquoWhy Resource-poor DictatorsAllow Freer Media A Theory and Evidence from Panel Datardquo American PoliticalScience Review 103 645ndash668

Gehlbach S and K Sonin (2014) ldquoGovernment Control of the Mediardquo Journalof Public Economics 118 163ndash171

Gentzkow M and E Kamenica (2016) ldquoA Rothschild-Stiglitz Approach toBayesian Persuasionrdquo American Economic Review Papers amp Proceedings 106597ndash601

Gentzkow M and J Shapiro (2006) ldquoMedia Bias and Reputationrdquo Journalof Political Economy 114(2) 280ndash316

(2010) ldquoWhat Drives Media Slant Evidence from US Daily NewspapersrdquoEconometrica 78(1) 35ndash71

Ginzburg B (2019) ldquoOptimal Information Censorshiprdquo Journal of EconomicBehavior and Organization 163 377ndash385

Goldstein I and Y Leitner (2018) ldquoStress Tests and Information DisclosurerdquoJournal of Economic Theory 177 34ndash69

32 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Guo Y and E Shmaya (2019) ldquoThe Interval Structure of Optimal DisclosurerdquoEconometrica 87 653ndash675

Johnson J P and D P Myatt (2006) ldquoOn the Simple Economics of Adver-tising Marketing and Product Designrdquo American Economic Review 96 756ndash784

Kamenica E and M Gentzkow (2011) ldquoBayesian Persuasionrdquo American Eco-nomic Review 101 2590ndash2615

Kolotilin A (2015) ldquoExperimental Design to Persuaderdquo Games and EconomicBehavior 90 215ndash226

(2018) ldquoOptimal Information Disclosure A Linear Programming Ap-proachrdquo Theoretical Economics 13 607ndash636

Kolotilin A and H Li (2019) ldquoRelational Communicationrdquo mimeoKolotilin A M Li T Mylovanov and A Zapechelnyuk (2015) ldquoPer-suasion of a Privately Informed Receiverrdquo mimeo

Kolotilin A T Mylovanov A Zapechelnyuk and M Li (2017) ldquoPer-suasion of a Privately Informed Receiverrdquo Econometrica 85 1949ndash1964

Kolotilin A and A Wolitzky (2019) ldquoAssortative Information Disclosurerdquomimeo

Lewis T R and D Sappington (1994) ldquoSupplying Information to FacilitatePrice Discriminationrdquo International Economic Review 35 309ndash327

Lorentzen P (2014) ldquoChinarsquos Strategic Censorshiprdquo American Journal of Polit-ical Science 58 402ndash414

Mullainathan S and A Shleifer (2005) ldquoThe Market for Newsrdquo AmericanEconomic Review 95(4) 1031ndash1053

Orlov D P Zryumov and A Skrzypach (2018) ldquoDesign of Macro-prudentialStress Testsrdquo mimeo

Ostrovsky M and M Schwarz (2010) ldquoInformation Disclosure and Unravelingin Matching Marketsrdquo American Economic Journal Microeconomics 2 34ndash63

Prat A and D Stromberg (2013) ldquoThe Political Economy of Mass MediardquoAdvances in Economics and Econometrics 2 135ndash187

Quah J and B Strulovici (2012) ldquoAggregating the Single Crossing PropertyrdquoEconometrica 80 2333ndash2348

Rayo L and I Segal (2010) ldquoOptimal Information Disclosurerdquo Journal ofPolitical Economy 118 949ndash987

Romanyuk G and A Smolin (2019) ldquoCream Skimming and Information Designin Matching Marketsrdquo American Economic Journal Microeconomics 11 250ndash276

Suen W (2004) ldquoThe Self-Perpetuation of Biased Beliefsrdquo Economic Journal 114377ndash396

Yamashita T (2018) ldquoOptimal Public Information Disclosure by Mechanism De-signerrdquo mimeo

Zapechelnyuk A (2019) ldquoOptimal Quality Certificationrdquo American EconomicReview Insights forthcoming

26 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Second we compare the optimal cutoff ωlowast1 under (ρ1 σ1) and the optimal cutoff ωlowast

2

under (ρ σ1) Because ρ lt ρ1 = ρσ1 by part (i) we have ωlowast2 le ωlowast

1

Combining ωlowast1 le ωlowast

0 and ωlowast2 le ωlowast

1 we obtain ωlowast2 le ωlowast

0

Proof of Theorem 1prime Part (A) The proof is analogous to that of Theorem 1 andis thus omitted

Part (B) The only if statement follows from that in part (A) (see the proof of Theorem1) and the fact that we can always ensure that an optimal stochastic censorship(ωlowast qlowast) is deterministic (so qlowast isin 0 1) by the choice of the support and distributionof ω Thus we only need to prove the if statement

Let F be a distribution of ω with support on n values 0 lt ω0 lt lt ωnminus1 lt 1 Thisis without loss of generality because if 0 andor 1 were in the support of F we couldredefine the state as ω = (ω + ε)(1 + 2ε) and the type as r = (r + ε)(1 + 2ε)

Let X sub ω1 ωnminus1 identify a set of cutoffs defining a monotone partition for eachωi isin X the receiver is informed whether ω lt ωi or ω ge ωi Note that we exclude ω0

from the set of cutoffs because ω ge ω0 always holds

Note that X is deterministic upper censorship if and only if there exists a censorshipthreshold ωlowast isin [0 1] such that for each i = 1 nminus 1 ωi isin X if and only if ωi lt ωlowastthat is we can assume that qlowast = 0 without loss of generality

Let VX be the senderrsquos expected utility when the monotone partition is given by X

If n = 2 then there are only two partitions X = ω1 (full disclosure) and X = empty(no disclosure) Both are upper censorships so part (B) holds trivially

If n = 3 then there are four monotone partitions X = ω1ω2 (full disclosure)X = ω2 X = ω1 and X = empty (no disclosure) Only X = ω2 is not uppercensorship The next lemma proves that X = ω2 cannot be uniquely optimal whenV is S-shaped thus proving the optimality of upper censorship for n = 3

Denote

mjk = E983045ω|ω isin [ωjωk]

983046

For convenience of notation we write mj = ωj

Lemma 3 Let V be S-shaped with an inflexion point τ Let ω be a discrete randomvariable with support on m0m1m2 and probabilities p0 p1 p2 where 0 lt m0 ltm1 lt m2 lt 1 Then

(i) if τ le m0 then Vm2 le Vempty

(ii) if τ ge m1 then Vm2 le Vm1m2

(iii) if m0 lt τ lt m1 then Vm2 le maxVempty Vm1m2

CENSORSHIP AS OPTIMAL PERSUASION 27

Proof Define p01 = p0 + p1 and p02 = p01 + p2 = 1 Observe that

m01 =p0m0 + p1m1

p0 + p1and m02 =

p0m0 + p1m1 + p2m2

p0 + p1 + p2

We thus have

Vempty = p02V (m02) = (p01 + p2)V

983061p01m01 + p2m2

p01 + p2

983062

Vm2 = p01V (m01) + p2V (m2) = (p0 + p1)V

983061p0m0 + p1m1

p0 + p1

983062+ p2V (m2)

Vm1m2 = p0V (m0) + p1V (m1) + p2V (m2)

Clearly

m0 lt m01 lt m1 and m0 lt m02 lt m2

Therefore

Vm2 le Vempty lArrrArrp01

p01 + p2V (m01) +

p2p01 + p2

V (m2) le V

983061p01

p01 + p2m01 +

p2p01 + p2

m2

983062= V (m02)

(11)

and

Vm2 le Vm1m2 lArrrArr

V

983061p0

p0 + p1m0 +

p1p0 + p1

m1

983062le p0

p0 + p1V (m0) +

p1p0 + p1

V (m1)(12)

Suppose that τ le m01 so V is concave on [m01m2] Then (11) holds by Jensenrsquosinequality In particular it holds for τ le m0 lt m01 thus proving part (i)

Suppose that τ ge m1 so V is convex on [m0m1] Then (12) holds by Jensenrsquosinequality thus proving part (ii)

Finally suppose that m01 lt τ lt m1 Let (τ V (τ)) lie below (or on) the straight lineA connecting (m0 V (m0)) and (m2 V (m2)) as illustrated on Figure 2 By convexityof V (m) for m lt τ it must be the case that V (m01) lies below (or on) the dashedline B By concavity of V (m) for m gt τ it must be the case that V (m1) lies above(or on) the dashed line C Consequently (12) is satisfied

Alternatively let (τ V (τ)) lie above the straight line A connecting (m0 V (m0)) and(m2 V (m2)) as illustrated on Figure 3 If Vm2 le Vm1m2 then the proof of part(iii) is complete So suppose that Vm2 gt Vm1m2 so (12) is violated

By convexity of V (m) for m lt τ it must be the case that V (m01) lies below (or on)the line B Since (12) is violated V (m01) must lie between the lines A and B Recallthat m01 lt m02 lt m2 Observe that V (m02) must be in the shaded area on Figure 3Indeed if m02 lt τ then by convexity of V (m) for m lt τ it must be the case thatV (m02) lies between (or on) the lines B and D If m02 ge τ then by concavity of

28 KOLOTILIN MYLOVANOV ZAPECHELNYUK

τ m2m0

A C

B

m01 m1

V (m0) V (τ)

V (m2)

Figure 2 The case where (τ V (τ)) lies below (or on) the straight lineconnecting (m0 V (m0)) and (m2 V (m2))

τ m2m0

A

C

B

m01

V (m0)

V (τ)

V (m2)

D

V (m01)

Figure 3 The case where (τ V (τ)) lies above the straight line con-necting (m0 V (m0)) and (m2 V (m2))

V (m) for m gt τ it must be the case that V (m02) lies above (or on) line C It followsthat (11) is satisfied

We now adapt Lemma 3 to prove the optimality of upper censorship for n ge 4

CENSORSHIP AS OPTIMAL PERSUASION 29

Fix a monotone partition X sub ω1 ωnminus1 that is not upper censorship that isthere exists i isin 2 nminus 1 such that

ωiminus1 ∕isin X and ωi isin X (13)

Denote ωn = 1 Define j and k as follows If X cap ω1 ωiminus2 ∕= empty let ωj be thelargest element inX smaller than i otherwise let ωj = ω0 IfXcapωi+1 ωnminus1 ∕= emptylet ωk be the smallest element in X greater than i otherwise let ωk = ωn = 1 Define

m0 = ωj m1 = ωiminus1 m2 = ωi

and

p0 =Pr[ω isin [ωjωiminus1)]

Pr[ω isin [ωjωk)] p1 =

Pr[ω isin [ωiminus1ωi)]

Pr[ω isin [ωjωk)] p2 =

Pr[ω isin [ωiωk)]

Pr[ω isin [ωjωk)]

To apply Lemma 3 we identify Xωi with empty X with m2 and X cup ωiminus1 withm1m2 By Lemma 3 VX le maxVXωi VXcupωiminus1 Thus replacing X withXωi if VX le VXωi and with X cup ωiminus1 otherwise weakly improves the senderrsquosexpected utility Repeatedly applying such replacements yields upper censorshipIndeed this procedure terminates after a finite number of replacements because byLemma 3 for ωj ge τ we replace X with Xωi and for ωiminus1 le τ we replace Xwith X cup ωiminus1

We thus have shown that when V is S-shaped when optimizing over the set ofmonotone partitions we can restrict attention to deterministic upper censorship Itremains to show that an optimal deterministic censorship cutoff ωlowast

d coincides with anoptimal stochastic censorship cutoff ωlowast from part (A) To see this it is straightforwardto show that

V lowast(ωlowast qlowast) =

983133

[0ωlowast)

V (ω)dF (ω) + V (ωlowast)qlowast Pr[ω = ωlowast]

+ V (mlowast(ωlowast qlowast))((1minus qlowast) Pr[ω = ωlowast] + Pr[ω gt ωlowast]) (14)

is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order Since the two de-terministic censorship cutoff pairs closest to (ωlowast qlowast) in the Blackwell informativenessorder are (ωlowast 0) and (ωlowast 1) we obtain ωlowast

d = ωlowast

Proof of Theorem 2prime Immediate by Theorem 1prime and Lemma 1

Proof of Theorem 3prime The proof for an optimal signal is analogous to that ofTheorem 3 and is thus omitted The same applies to part (i) for an optimal monotonepartition Thus it remains to prove part (ii) for an optimal monotone partition

We fix ρ and σ = 1 and vary t continuously If ωlowast isin supp(F ) then an optimalmonotone partition is an optimal signal because all qlowast isin [0 1] are optimal Thuspart (ii) in (B) holds by part (ii) in (A) If ωlowast isin supp(F ) and qlowast isin (0 1) then either(ωlowast

d qlowastd) = (ωlowast 0) or (ωlowast

d qlowastd) = (ωlowast 1) is optimal because the senderrsquos expected utility

V lowast given by (14) is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order

30 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Therefore it suffices to consider values of t at which the sender is indifferent between(ωlowast 0) and (ωlowast 1) and to show that the sender strictly prefers (ωlowast 1) to (ωlowast 0) at aslightly higher t

V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) =rArr dV lowastt (ω

lowast 0)

dtle dV lowast

t (ωlowast 1)

dt (15)

The condition V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) can be written as

Vt(mlowast(ωlowast 0)) Pr[ω ge ωlowast] = Vt(ω

lowast) Pr[ω = ωlowast] + Vt(mlowast(ωlowast 1)) Pr[ω gt ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)Pr[ω = ωlowast]

Pr[ω ge ωlowast]+ Vt(m

lowast(ωlowast 1))Pr[ω gt ωlowast]

Pr[ω ge ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + Vt(mlowast(ωlowast 1))

mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Analogously the conditiondV lowast

t (ωlowast0)dt

le dV lowastt (ωlowast1)dt

can be written as

V primet (m

lowast(ωlowast 0)) ge V primet (ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + V primet (m

lowast(ωlowast 1))mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Thus (15) states that if Vt(mlowast(ωlowast 0)) is equal to the convex combination of Vt(ω

lowast)and Vt(m

lowast(ωlowast 1)) then the derivative V primet (m

lowast(ωlowast 0)) is greater than the convex com-bination of the derivatives V prime

t (ωlowast) and V prime

t (mlowast(ωlowast 1)) When V is S-shaped it is easy

to see that this property is satisfied

Proof of Lemma 2 Observe that a signal induced by any set X sub C of permittedmedia outlets induces a monotone partition of [0 1] and thus HC sub HM Indeed letcX(ω) and cX(ω) be the media outlets in X that are the closest to ω from above andfrom below

cX(ω) = sup983059c isin X c le ω

983052cup 0

983060and cX(ω) = inf

983059c isin X c gt ω cup 1

983060

By observing the media outlets in X each citizen is informed that ω isin [cX(ω) cX(ω))

Conversely let the set C of media outlets be c0 lt middot middot middot lt cn Without loss of generalityassume that C contains the two uninformative media outlets c0 = 0 and cn = 1 which(almost) always endorse and criticize action a = 1 respectively Then the distributionF of the state ω is a discrete distribution whose support is Ω = ω0 ωnminus1 whereωi = E[θ|θ isin [ci ci+1)] for i isin 0 nminus 1 Consider a monotone partitional signalrepresented by a nondecreasing function ξ [0 1] rarr R For each realization m of thissignal that occurs with a strictly positive probability define

x(m) = min983153ω isin Ω ξ(ω) = m

983154and x(m) = max

983153ω isin Ω ξ(ω) = m

983154

This signal induces a monotone partition of the finite set Ω This partition of Ω canbe implemented by a censorship policy that prohibits a media outlet c isin C if andonly if c isin (x(m) x(m)) for some realization m

CENSORSHIP AS OPTIMAL PERSUASION 31

References

Alonso R and O Camara (2016a) ldquoPersuading Votersrdquo American EconomicReview 106 3590ndash3605

(2016b) ldquoPolitical Disagreement and Information in Electionsrdquo Games andEconomic Behavior 100 390ndash412

Bagnoli M and T Bergstrom (2005) ldquoLog-Concave Probability and Its Ap-plicationsrdquo Economic Theory 26 445ndash469

Baron D P (2006) ldquoPersistent Media Biasrdquo Journal of Public Economics 90(1)1ndash36

Besley T and A Prat (2006) ldquoHandcuffs for the Grabbing Hand MediaCapture and Government Accountabilityrdquo American Economic Review 96(3) 720ndash736

Blackwell D (1953) ldquoEquivalent Comparisons of Experimentsrdquo Annals of Math-ematical Statistics 24 265ndash272

Chan J and W Suen (2008) ldquoA Spatial Theory of News Consumption andElectoral Competitionrdquo Review of Economic Studies 75 699ndash728

Chiang C-F and B Knight (2011) ldquoMedia Bias and Influence Evidence fromNewspaper Endorsementsrdquo Review of Economic Studies 78 795ndash820

Duffie D P Dworczak and H Zhu (2017) ldquoBenchmarks in Search MarketsrdquoJournal of Finance 72 1983ndash2044

Dworczak P (2017) ldquoMechanism Design with Aftermarkets Cutoff Mecha-nismsrdquo mimeo

Dworczak P and G Martini (2019) ldquoThe Simple Economics of Optimal Per-suasionrdquo Journal of Political Economy 127 993ndash2048

Edmond C (2013) ldquoInformation Manipulation Coordination and RegimeChangerdquo Review of Economic Studies 80 1422ndash1458

Egorov G S Guriev and K Sonin (2009) ldquoWhy Resource-poor DictatorsAllow Freer Media A Theory and Evidence from Panel Datardquo American PoliticalScience Review 103 645ndash668

Gehlbach S and K Sonin (2014) ldquoGovernment Control of the Mediardquo Journalof Public Economics 118 163ndash171

Gentzkow M and E Kamenica (2016) ldquoA Rothschild-Stiglitz Approach toBayesian Persuasionrdquo American Economic Review Papers amp Proceedings 106597ndash601

Gentzkow M and J Shapiro (2006) ldquoMedia Bias and Reputationrdquo Journalof Political Economy 114(2) 280ndash316

(2010) ldquoWhat Drives Media Slant Evidence from US Daily NewspapersrdquoEconometrica 78(1) 35ndash71

Ginzburg B (2019) ldquoOptimal Information Censorshiprdquo Journal of EconomicBehavior and Organization 163 377ndash385

Goldstein I and Y Leitner (2018) ldquoStress Tests and Information DisclosurerdquoJournal of Economic Theory 177 34ndash69

32 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Guo Y and E Shmaya (2019) ldquoThe Interval Structure of Optimal DisclosurerdquoEconometrica 87 653ndash675

Johnson J P and D P Myatt (2006) ldquoOn the Simple Economics of Adver-tising Marketing and Product Designrdquo American Economic Review 96 756ndash784

Kamenica E and M Gentzkow (2011) ldquoBayesian Persuasionrdquo American Eco-nomic Review 101 2590ndash2615

Kolotilin A (2015) ldquoExperimental Design to Persuaderdquo Games and EconomicBehavior 90 215ndash226

(2018) ldquoOptimal Information Disclosure A Linear Programming Ap-proachrdquo Theoretical Economics 13 607ndash636

Kolotilin A and H Li (2019) ldquoRelational Communicationrdquo mimeoKolotilin A M Li T Mylovanov and A Zapechelnyuk (2015) ldquoPer-suasion of a Privately Informed Receiverrdquo mimeo

Kolotilin A T Mylovanov A Zapechelnyuk and M Li (2017) ldquoPer-suasion of a Privately Informed Receiverrdquo Econometrica 85 1949ndash1964

Kolotilin A and A Wolitzky (2019) ldquoAssortative Information Disclosurerdquomimeo

Lewis T R and D Sappington (1994) ldquoSupplying Information to FacilitatePrice Discriminationrdquo International Economic Review 35 309ndash327

Lorentzen P (2014) ldquoChinarsquos Strategic Censorshiprdquo American Journal of Polit-ical Science 58 402ndash414

Mullainathan S and A Shleifer (2005) ldquoThe Market for Newsrdquo AmericanEconomic Review 95(4) 1031ndash1053

Orlov D P Zryumov and A Skrzypach (2018) ldquoDesign of Macro-prudentialStress Testsrdquo mimeo

Ostrovsky M and M Schwarz (2010) ldquoInformation Disclosure and Unravelingin Matching Marketsrdquo American Economic Journal Microeconomics 2 34ndash63

Prat A and D Stromberg (2013) ldquoThe Political Economy of Mass MediardquoAdvances in Economics and Econometrics 2 135ndash187

Quah J and B Strulovici (2012) ldquoAggregating the Single Crossing PropertyrdquoEconometrica 80 2333ndash2348

Rayo L and I Segal (2010) ldquoOptimal Information Disclosurerdquo Journal ofPolitical Economy 118 949ndash987

Romanyuk G and A Smolin (2019) ldquoCream Skimming and Information Designin Matching Marketsrdquo American Economic Journal Microeconomics 11 250ndash276

Suen W (2004) ldquoThe Self-Perpetuation of Biased Beliefsrdquo Economic Journal 114377ndash396

Yamashita T (2018) ldquoOptimal Public Information Disclosure by Mechanism De-signerrdquo mimeo

Zapechelnyuk A (2019) ldquoOptimal Quality Certificationrdquo American EconomicReview Insights forthcoming

CENSORSHIP AS OPTIMAL PERSUASION 27

Proof Define p01 = p0 + p1 and p02 = p01 + p2 = 1 Observe that

m01 =p0m0 + p1m1

p0 + p1and m02 =

p0m0 + p1m1 + p2m2

p0 + p1 + p2

We thus have

Vempty = p02V (m02) = (p01 + p2)V

983061p01m01 + p2m2

p01 + p2

983062

Vm2 = p01V (m01) + p2V (m2) = (p0 + p1)V

983061p0m0 + p1m1

p0 + p1

983062+ p2V (m2)

Vm1m2 = p0V (m0) + p1V (m1) + p2V (m2)

Clearly

m0 lt m01 lt m1 and m0 lt m02 lt m2

Therefore

Vm2 le Vempty lArrrArrp01

p01 + p2V (m01) +

p2p01 + p2

V (m2) le V

983061p01

p01 + p2m01 +

p2p01 + p2

m2

983062= V (m02)

(11)

and

Vm2 le Vm1m2 lArrrArr

V

983061p0

p0 + p1m0 +

p1p0 + p1

m1

983062le p0

p0 + p1V (m0) +

p1p0 + p1

V (m1)(12)

Suppose that τ le m01 so V is concave on [m01m2] Then (11) holds by Jensenrsquosinequality In particular it holds for τ le m0 lt m01 thus proving part (i)

Suppose that τ ge m1 so V is convex on [m0m1] Then (12) holds by Jensenrsquosinequality thus proving part (ii)

Finally suppose that m01 lt τ lt m1 Let (τ V (τ)) lie below (or on) the straight lineA connecting (m0 V (m0)) and (m2 V (m2)) as illustrated on Figure 2 By convexityof V (m) for m lt τ it must be the case that V (m01) lies below (or on) the dashedline B By concavity of V (m) for m gt τ it must be the case that V (m1) lies above(or on) the dashed line C Consequently (12) is satisfied

Alternatively let (τ V (τ)) lie above the straight line A connecting (m0 V (m0)) and(m2 V (m2)) as illustrated on Figure 3 If Vm2 le Vm1m2 then the proof of part(iii) is complete So suppose that Vm2 gt Vm1m2 so (12) is violated

By convexity of V (m) for m lt τ it must be the case that V (m01) lies below (or on)the line B Since (12) is violated V (m01) must lie between the lines A and B Recallthat m01 lt m02 lt m2 Observe that V (m02) must be in the shaded area on Figure 3Indeed if m02 lt τ then by convexity of V (m) for m lt τ it must be the case thatV (m02) lies between (or on) the lines B and D If m02 ge τ then by concavity of

28 KOLOTILIN MYLOVANOV ZAPECHELNYUK

τ m2m0

A C

B

m01 m1

V (m0) V (τ)

V (m2)

Figure 2 The case where (τ V (τ)) lies below (or on) the straight lineconnecting (m0 V (m0)) and (m2 V (m2))

τ m2m0

A

C

B

m01

V (m0)

V (τ)

V (m2)

D

V (m01)

Figure 3 The case where (τ V (τ)) lies above the straight line con-necting (m0 V (m0)) and (m2 V (m2))

V (m) for m gt τ it must be the case that V (m02) lies above (or on) line C It followsthat (11) is satisfied

We now adapt Lemma 3 to prove the optimality of upper censorship for n ge 4

CENSORSHIP AS OPTIMAL PERSUASION 29

Fix a monotone partition X sub ω1 ωnminus1 that is not upper censorship that isthere exists i isin 2 nminus 1 such that

ωiminus1 ∕isin X and ωi isin X (13)

Denote ωn = 1 Define j and k as follows If X cap ω1 ωiminus2 ∕= empty let ωj be thelargest element inX smaller than i otherwise let ωj = ω0 IfXcapωi+1 ωnminus1 ∕= emptylet ωk be the smallest element in X greater than i otherwise let ωk = ωn = 1 Define

m0 = ωj m1 = ωiminus1 m2 = ωi

and

p0 =Pr[ω isin [ωjωiminus1)]

Pr[ω isin [ωjωk)] p1 =

Pr[ω isin [ωiminus1ωi)]

Pr[ω isin [ωjωk)] p2 =

Pr[ω isin [ωiωk)]

Pr[ω isin [ωjωk)]

To apply Lemma 3 we identify Xωi with empty X with m2 and X cup ωiminus1 withm1m2 By Lemma 3 VX le maxVXωi VXcupωiminus1 Thus replacing X withXωi if VX le VXωi and with X cup ωiminus1 otherwise weakly improves the senderrsquosexpected utility Repeatedly applying such replacements yields upper censorshipIndeed this procedure terminates after a finite number of replacements because byLemma 3 for ωj ge τ we replace X with Xωi and for ωiminus1 le τ we replace Xwith X cup ωiminus1

We thus have shown that when V is S-shaped when optimizing over the set ofmonotone partitions we can restrict attention to deterministic upper censorship Itremains to show that an optimal deterministic censorship cutoff ωlowast

d coincides with anoptimal stochastic censorship cutoff ωlowast from part (A) To see this it is straightforwardto show that

V lowast(ωlowast qlowast) =

983133

[0ωlowast)

V (ω)dF (ω) + V (ωlowast)qlowast Pr[ω = ωlowast]

+ V (mlowast(ωlowast qlowast))((1minus qlowast) Pr[ω = ωlowast] + Pr[ω gt ωlowast]) (14)

is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order Since the two de-terministic censorship cutoff pairs closest to (ωlowast qlowast) in the Blackwell informativenessorder are (ωlowast 0) and (ωlowast 1) we obtain ωlowast

d = ωlowast

Proof of Theorem 2prime Immediate by Theorem 1prime and Lemma 1

Proof of Theorem 3prime The proof for an optimal signal is analogous to that ofTheorem 3 and is thus omitted The same applies to part (i) for an optimal monotonepartition Thus it remains to prove part (ii) for an optimal monotone partition

We fix ρ and σ = 1 and vary t continuously If ωlowast isin supp(F ) then an optimalmonotone partition is an optimal signal because all qlowast isin [0 1] are optimal Thuspart (ii) in (B) holds by part (ii) in (A) If ωlowast isin supp(F ) and qlowast isin (0 1) then either(ωlowast

d qlowastd) = (ωlowast 0) or (ωlowast

d qlowastd) = (ωlowast 1) is optimal because the senderrsquos expected utility

V lowast given by (14) is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order

30 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Therefore it suffices to consider values of t at which the sender is indifferent between(ωlowast 0) and (ωlowast 1) and to show that the sender strictly prefers (ωlowast 1) to (ωlowast 0) at aslightly higher t

V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) =rArr dV lowastt (ω

lowast 0)

dtle dV lowast

t (ωlowast 1)

dt (15)

The condition V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) can be written as

Vt(mlowast(ωlowast 0)) Pr[ω ge ωlowast] = Vt(ω

lowast) Pr[ω = ωlowast] + Vt(mlowast(ωlowast 1)) Pr[ω gt ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)Pr[ω = ωlowast]

Pr[ω ge ωlowast]+ Vt(m

lowast(ωlowast 1))Pr[ω gt ωlowast]

Pr[ω ge ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + Vt(mlowast(ωlowast 1))

mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Analogously the conditiondV lowast

t (ωlowast0)dt

le dV lowastt (ωlowast1)dt

can be written as

V primet (m

lowast(ωlowast 0)) ge V primet (ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + V primet (m

lowast(ωlowast 1))mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Thus (15) states that if Vt(mlowast(ωlowast 0)) is equal to the convex combination of Vt(ω

lowast)and Vt(m

lowast(ωlowast 1)) then the derivative V primet (m

lowast(ωlowast 0)) is greater than the convex com-bination of the derivatives V prime

t (ωlowast) and V prime

t (mlowast(ωlowast 1)) When V is S-shaped it is easy

to see that this property is satisfied

Proof of Lemma 2 Observe that a signal induced by any set X sub C of permittedmedia outlets induces a monotone partition of [0 1] and thus HC sub HM Indeed letcX(ω) and cX(ω) be the media outlets in X that are the closest to ω from above andfrom below

cX(ω) = sup983059c isin X c le ω

983052cup 0

983060and cX(ω) = inf

983059c isin X c gt ω cup 1

983060

By observing the media outlets in X each citizen is informed that ω isin [cX(ω) cX(ω))

Conversely let the set C of media outlets be c0 lt middot middot middot lt cn Without loss of generalityassume that C contains the two uninformative media outlets c0 = 0 and cn = 1 which(almost) always endorse and criticize action a = 1 respectively Then the distributionF of the state ω is a discrete distribution whose support is Ω = ω0 ωnminus1 whereωi = E[θ|θ isin [ci ci+1)] for i isin 0 nminus 1 Consider a monotone partitional signalrepresented by a nondecreasing function ξ [0 1] rarr R For each realization m of thissignal that occurs with a strictly positive probability define

x(m) = min983153ω isin Ω ξ(ω) = m

983154and x(m) = max

983153ω isin Ω ξ(ω) = m

983154

This signal induces a monotone partition of the finite set Ω This partition of Ω canbe implemented by a censorship policy that prohibits a media outlet c isin C if andonly if c isin (x(m) x(m)) for some realization m

CENSORSHIP AS OPTIMAL PERSUASION 31

References

Alonso R and O Camara (2016a) ldquoPersuading Votersrdquo American EconomicReview 106 3590ndash3605

(2016b) ldquoPolitical Disagreement and Information in Electionsrdquo Games andEconomic Behavior 100 390ndash412

Bagnoli M and T Bergstrom (2005) ldquoLog-Concave Probability and Its Ap-plicationsrdquo Economic Theory 26 445ndash469

Baron D P (2006) ldquoPersistent Media Biasrdquo Journal of Public Economics 90(1)1ndash36

Besley T and A Prat (2006) ldquoHandcuffs for the Grabbing Hand MediaCapture and Government Accountabilityrdquo American Economic Review 96(3) 720ndash736

Blackwell D (1953) ldquoEquivalent Comparisons of Experimentsrdquo Annals of Math-ematical Statistics 24 265ndash272

Chan J and W Suen (2008) ldquoA Spatial Theory of News Consumption andElectoral Competitionrdquo Review of Economic Studies 75 699ndash728

Chiang C-F and B Knight (2011) ldquoMedia Bias and Influence Evidence fromNewspaper Endorsementsrdquo Review of Economic Studies 78 795ndash820

Duffie D P Dworczak and H Zhu (2017) ldquoBenchmarks in Search MarketsrdquoJournal of Finance 72 1983ndash2044

Dworczak P (2017) ldquoMechanism Design with Aftermarkets Cutoff Mecha-nismsrdquo mimeo

Dworczak P and G Martini (2019) ldquoThe Simple Economics of Optimal Per-suasionrdquo Journal of Political Economy 127 993ndash2048

Edmond C (2013) ldquoInformation Manipulation Coordination and RegimeChangerdquo Review of Economic Studies 80 1422ndash1458

Egorov G S Guriev and K Sonin (2009) ldquoWhy Resource-poor DictatorsAllow Freer Media A Theory and Evidence from Panel Datardquo American PoliticalScience Review 103 645ndash668

Gehlbach S and K Sonin (2014) ldquoGovernment Control of the Mediardquo Journalof Public Economics 118 163ndash171

Gentzkow M and E Kamenica (2016) ldquoA Rothschild-Stiglitz Approach toBayesian Persuasionrdquo American Economic Review Papers amp Proceedings 106597ndash601

Gentzkow M and J Shapiro (2006) ldquoMedia Bias and Reputationrdquo Journalof Political Economy 114(2) 280ndash316

(2010) ldquoWhat Drives Media Slant Evidence from US Daily NewspapersrdquoEconometrica 78(1) 35ndash71

Ginzburg B (2019) ldquoOptimal Information Censorshiprdquo Journal of EconomicBehavior and Organization 163 377ndash385

Goldstein I and Y Leitner (2018) ldquoStress Tests and Information DisclosurerdquoJournal of Economic Theory 177 34ndash69

32 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Guo Y and E Shmaya (2019) ldquoThe Interval Structure of Optimal DisclosurerdquoEconometrica 87 653ndash675

Johnson J P and D P Myatt (2006) ldquoOn the Simple Economics of Adver-tising Marketing and Product Designrdquo American Economic Review 96 756ndash784

Kamenica E and M Gentzkow (2011) ldquoBayesian Persuasionrdquo American Eco-nomic Review 101 2590ndash2615

Kolotilin A (2015) ldquoExperimental Design to Persuaderdquo Games and EconomicBehavior 90 215ndash226

(2018) ldquoOptimal Information Disclosure A Linear Programming Ap-proachrdquo Theoretical Economics 13 607ndash636

Kolotilin A and H Li (2019) ldquoRelational Communicationrdquo mimeoKolotilin A M Li T Mylovanov and A Zapechelnyuk (2015) ldquoPer-suasion of a Privately Informed Receiverrdquo mimeo

Kolotilin A T Mylovanov A Zapechelnyuk and M Li (2017) ldquoPer-suasion of a Privately Informed Receiverrdquo Econometrica 85 1949ndash1964

Kolotilin A and A Wolitzky (2019) ldquoAssortative Information Disclosurerdquomimeo

Lewis T R and D Sappington (1994) ldquoSupplying Information to FacilitatePrice Discriminationrdquo International Economic Review 35 309ndash327

Lorentzen P (2014) ldquoChinarsquos Strategic Censorshiprdquo American Journal of Polit-ical Science 58 402ndash414

Mullainathan S and A Shleifer (2005) ldquoThe Market for Newsrdquo AmericanEconomic Review 95(4) 1031ndash1053

Orlov D P Zryumov and A Skrzypach (2018) ldquoDesign of Macro-prudentialStress Testsrdquo mimeo

Ostrovsky M and M Schwarz (2010) ldquoInformation Disclosure and Unravelingin Matching Marketsrdquo American Economic Journal Microeconomics 2 34ndash63

Prat A and D Stromberg (2013) ldquoThe Political Economy of Mass MediardquoAdvances in Economics and Econometrics 2 135ndash187

Quah J and B Strulovici (2012) ldquoAggregating the Single Crossing PropertyrdquoEconometrica 80 2333ndash2348

Rayo L and I Segal (2010) ldquoOptimal Information Disclosurerdquo Journal ofPolitical Economy 118 949ndash987

Romanyuk G and A Smolin (2019) ldquoCream Skimming and Information Designin Matching Marketsrdquo American Economic Journal Microeconomics 11 250ndash276

Suen W (2004) ldquoThe Self-Perpetuation of Biased Beliefsrdquo Economic Journal 114377ndash396

Yamashita T (2018) ldquoOptimal Public Information Disclosure by Mechanism De-signerrdquo mimeo

Zapechelnyuk A (2019) ldquoOptimal Quality Certificationrdquo American EconomicReview Insights forthcoming

28 KOLOTILIN MYLOVANOV ZAPECHELNYUK

τ m2m0

A C

B

m01 m1

V (m0) V (τ)

V (m2)

Figure 2 The case where (τ V (τ)) lies below (or on) the straight lineconnecting (m0 V (m0)) and (m2 V (m2))

τ m2m0

A

C

B

m01

V (m0)

V (τ)

V (m2)

D

V (m01)

Figure 3 The case where (τ V (τ)) lies above the straight line con-necting (m0 V (m0)) and (m2 V (m2))

V (m) for m gt τ it must be the case that V (m02) lies above (or on) line C It followsthat (11) is satisfied

We now adapt Lemma 3 to prove the optimality of upper censorship for n ge 4

CENSORSHIP AS OPTIMAL PERSUASION 29

Fix a monotone partition X sub ω1 ωnminus1 that is not upper censorship that isthere exists i isin 2 nminus 1 such that

ωiminus1 ∕isin X and ωi isin X (13)

Denote ωn = 1 Define j and k as follows If X cap ω1 ωiminus2 ∕= empty let ωj be thelargest element inX smaller than i otherwise let ωj = ω0 IfXcapωi+1 ωnminus1 ∕= emptylet ωk be the smallest element in X greater than i otherwise let ωk = ωn = 1 Define

m0 = ωj m1 = ωiminus1 m2 = ωi

and

p0 =Pr[ω isin [ωjωiminus1)]

Pr[ω isin [ωjωk)] p1 =

Pr[ω isin [ωiminus1ωi)]

Pr[ω isin [ωjωk)] p2 =

Pr[ω isin [ωiωk)]

Pr[ω isin [ωjωk)]

To apply Lemma 3 we identify Xωi with empty X with m2 and X cup ωiminus1 withm1m2 By Lemma 3 VX le maxVXωi VXcupωiminus1 Thus replacing X withXωi if VX le VXωi and with X cup ωiminus1 otherwise weakly improves the senderrsquosexpected utility Repeatedly applying such replacements yields upper censorshipIndeed this procedure terminates after a finite number of replacements because byLemma 3 for ωj ge τ we replace X with Xωi and for ωiminus1 le τ we replace Xwith X cup ωiminus1

We thus have shown that when V is S-shaped when optimizing over the set ofmonotone partitions we can restrict attention to deterministic upper censorship Itremains to show that an optimal deterministic censorship cutoff ωlowast

d coincides with anoptimal stochastic censorship cutoff ωlowast from part (A) To see this it is straightforwardto show that

V lowast(ωlowast qlowast) =

983133

[0ωlowast)

V (ω)dF (ω) + V (ωlowast)qlowast Pr[ω = ωlowast]

+ V (mlowast(ωlowast qlowast))((1minus qlowast) Pr[ω = ωlowast] + Pr[ω gt ωlowast]) (14)

is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order Since the two de-terministic censorship cutoff pairs closest to (ωlowast qlowast) in the Blackwell informativenessorder are (ωlowast 0) and (ωlowast 1) we obtain ωlowast

d = ωlowast

Proof of Theorem 2prime Immediate by Theorem 1prime and Lemma 1

Proof of Theorem 3prime The proof for an optimal signal is analogous to that ofTheorem 3 and is thus omitted The same applies to part (i) for an optimal monotonepartition Thus it remains to prove part (ii) for an optimal monotone partition

We fix ρ and σ = 1 and vary t continuously If ωlowast isin supp(F ) then an optimalmonotone partition is an optimal signal because all qlowast isin [0 1] are optimal Thuspart (ii) in (B) holds by part (ii) in (A) If ωlowast isin supp(F ) and qlowast isin (0 1) then either(ωlowast

d qlowastd) = (ωlowast 0) or (ωlowast

d qlowastd) = (ωlowast 1) is optimal because the senderrsquos expected utility

V lowast given by (14) is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order

30 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Therefore it suffices to consider values of t at which the sender is indifferent between(ωlowast 0) and (ωlowast 1) and to show that the sender strictly prefers (ωlowast 1) to (ωlowast 0) at aslightly higher t

V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) =rArr dV lowastt (ω

lowast 0)

dtle dV lowast

t (ωlowast 1)

dt (15)

The condition V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) can be written as

Vt(mlowast(ωlowast 0)) Pr[ω ge ωlowast] = Vt(ω

lowast) Pr[ω = ωlowast] + Vt(mlowast(ωlowast 1)) Pr[ω gt ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)Pr[ω = ωlowast]

Pr[ω ge ωlowast]+ Vt(m

lowast(ωlowast 1))Pr[ω gt ωlowast]

Pr[ω ge ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + Vt(mlowast(ωlowast 1))

mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Analogously the conditiondV lowast

t (ωlowast0)dt

le dV lowastt (ωlowast1)dt

can be written as

V primet (m

lowast(ωlowast 0)) ge V primet (ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + V primet (m

lowast(ωlowast 1))mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Thus (15) states that if Vt(mlowast(ωlowast 0)) is equal to the convex combination of Vt(ω

lowast)and Vt(m

lowast(ωlowast 1)) then the derivative V primet (m

lowast(ωlowast 0)) is greater than the convex com-bination of the derivatives V prime

t (ωlowast) and V prime

t (mlowast(ωlowast 1)) When V is S-shaped it is easy

to see that this property is satisfied

Proof of Lemma 2 Observe that a signal induced by any set X sub C of permittedmedia outlets induces a monotone partition of [0 1] and thus HC sub HM Indeed letcX(ω) and cX(ω) be the media outlets in X that are the closest to ω from above andfrom below

cX(ω) = sup983059c isin X c le ω

983052cup 0

983060and cX(ω) = inf

983059c isin X c gt ω cup 1

983060

By observing the media outlets in X each citizen is informed that ω isin [cX(ω) cX(ω))

Conversely let the set C of media outlets be c0 lt middot middot middot lt cn Without loss of generalityassume that C contains the two uninformative media outlets c0 = 0 and cn = 1 which(almost) always endorse and criticize action a = 1 respectively Then the distributionF of the state ω is a discrete distribution whose support is Ω = ω0 ωnminus1 whereωi = E[θ|θ isin [ci ci+1)] for i isin 0 nminus 1 Consider a monotone partitional signalrepresented by a nondecreasing function ξ [0 1] rarr R For each realization m of thissignal that occurs with a strictly positive probability define

x(m) = min983153ω isin Ω ξ(ω) = m

983154and x(m) = max

983153ω isin Ω ξ(ω) = m

983154

This signal induces a monotone partition of the finite set Ω This partition of Ω canbe implemented by a censorship policy that prohibits a media outlet c isin C if andonly if c isin (x(m) x(m)) for some realization m

CENSORSHIP AS OPTIMAL PERSUASION 31

References

Alonso R and O Camara (2016a) ldquoPersuading Votersrdquo American EconomicReview 106 3590ndash3605

(2016b) ldquoPolitical Disagreement and Information in Electionsrdquo Games andEconomic Behavior 100 390ndash412

Bagnoli M and T Bergstrom (2005) ldquoLog-Concave Probability and Its Ap-plicationsrdquo Economic Theory 26 445ndash469

Baron D P (2006) ldquoPersistent Media Biasrdquo Journal of Public Economics 90(1)1ndash36

Besley T and A Prat (2006) ldquoHandcuffs for the Grabbing Hand MediaCapture and Government Accountabilityrdquo American Economic Review 96(3) 720ndash736

Blackwell D (1953) ldquoEquivalent Comparisons of Experimentsrdquo Annals of Math-ematical Statistics 24 265ndash272

Chan J and W Suen (2008) ldquoA Spatial Theory of News Consumption andElectoral Competitionrdquo Review of Economic Studies 75 699ndash728

Chiang C-F and B Knight (2011) ldquoMedia Bias and Influence Evidence fromNewspaper Endorsementsrdquo Review of Economic Studies 78 795ndash820

Duffie D P Dworczak and H Zhu (2017) ldquoBenchmarks in Search MarketsrdquoJournal of Finance 72 1983ndash2044

Dworczak P (2017) ldquoMechanism Design with Aftermarkets Cutoff Mecha-nismsrdquo mimeo

Dworczak P and G Martini (2019) ldquoThe Simple Economics of Optimal Per-suasionrdquo Journal of Political Economy 127 993ndash2048

Edmond C (2013) ldquoInformation Manipulation Coordination and RegimeChangerdquo Review of Economic Studies 80 1422ndash1458

Egorov G S Guriev and K Sonin (2009) ldquoWhy Resource-poor DictatorsAllow Freer Media A Theory and Evidence from Panel Datardquo American PoliticalScience Review 103 645ndash668

Gehlbach S and K Sonin (2014) ldquoGovernment Control of the Mediardquo Journalof Public Economics 118 163ndash171

Gentzkow M and E Kamenica (2016) ldquoA Rothschild-Stiglitz Approach toBayesian Persuasionrdquo American Economic Review Papers amp Proceedings 106597ndash601

Gentzkow M and J Shapiro (2006) ldquoMedia Bias and Reputationrdquo Journalof Political Economy 114(2) 280ndash316

(2010) ldquoWhat Drives Media Slant Evidence from US Daily NewspapersrdquoEconometrica 78(1) 35ndash71

Ginzburg B (2019) ldquoOptimal Information Censorshiprdquo Journal of EconomicBehavior and Organization 163 377ndash385

Goldstein I and Y Leitner (2018) ldquoStress Tests and Information DisclosurerdquoJournal of Economic Theory 177 34ndash69

32 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Guo Y and E Shmaya (2019) ldquoThe Interval Structure of Optimal DisclosurerdquoEconometrica 87 653ndash675

Johnson J P and D P Myatt (2006) ldquoOn the Simple Economics of Adver-tising Marketing and Product Designrdquo American Economic Review 96 756ndash784

Kamenica E and M Gentzkow (2011) ldquoBayesian Persuasionrdquo American Eco-nomic Review 101 2590ndash2615

Kolotilin A (2015) ldquoExperimental Design to Persuaderdquo Games and EconomicBehavior 90 215ndash226

(2018) ldquoOptimal Information Disclosure A Linear Programming Ap-proachrdquo Theoretical Economics 13 607ndash636

Kolotilin A and H Li (2019) ldquoRelational Communicationrdquo mimeoKolotilin A M Li T Mylovanov and A Zapechelnyuk (2015) ldquoPer-suasion of a Privately Informed Receiverrdquo mimeo

Kolotilin A T Mylovanov A Zapechelnyuk and M Li (2017) ldquoPer-suasion of a Privately Informed Receiverrdquo Econometrica 85 1949ndash1964

Kolotilin A and A Wolitzky (2019) ldquoAssortative Information Disclosurerdquomimeo

Lewis T R and D Sappington (1994) ldquoSupplying Information to FacilitatePrice Discriminationrdquo International Economic Review 35 309ndash327

Lorentzen P (2014) ldquoChinarsquos Strategic Censorshiprdquo American Journal of Polit-ical Science 58 402ndash414

Mullainathan S and A Shleifer (2005) ldquoThe Market for Newsrdquo AmericanEconomic Review 95(4) 1031ndash1053

Orlov D P Zryumov and A Skrzypach (2018) ldquoDesign of Macro-prudentialStress Testsrdquo mimeo

Ostrovsky M and M Schwarz (2010) ldquoInformation Disclosure and Unravelingin Matching Marketsrdquo American Economic Journal Microeconomics 2 34ndash63

Prat A and D Stromberg (2013) ldquoThe Political Economy of Mass MediardquoAdvances in Economics and Econometrics 2 135ndash187

Quah J and B Strulovici (2012) ldquoAggregating the Single Crossing PropertyrdquoEconometrica 80 2333ndash2348

Rayo L and I Segal (2010) ldquoOptimal Information Disclosurerdquo Journal ofPolitical Economy 118 949ndash987

Romanyuk G and A Smolin (2019) ldquoCream Skimming and Information Designin Matching Marketsrdquo American Economic Journal Microeconomics 11 250ndash276

Suen W (2004) ldquoThe Self-Perpetuation of Biased Beliefsrdquo Economic Journal 114377ndash396

Yamashita T (2018) ldquoOptimal Public Information Disclosure by Mechanism De-signerrdquo mimeo

Zapechelnyuk A (2019) ldquoOptimal Quality Certificationrdquo American EconomicReview Insights forthcoming

CENSORSHIP AS OPTIMAL PERSUASION 29

Fix a monotone partition X sub ω1 ωnminus1 that is not upper censorship that isthere exists i isin 2 nminus 1 such that

ωiminus1 ∕isin X and ωi isin X (13)

Denote ωn = 1 Define j and k as follows If X cap ω1 ωiminus2 ∕= empty let ωj be thelargest element inX smaller than i otherwise let ωj = ω0 IfXcapωi+1 ωnminus1 ∕= emptylet ωk be the smallest element in X greater than i otherwise let ωk = ωn = 1 Define

m0 = ωj m1 = ωiminus1 m2 = ωi

and

p0 =Pr[ω isin [ωjωiminus1)]

Pr[ω isin [ωjωk)] p1 =

Pr[ω isin [ωiminus1ωi)]

Pr[ω isin [ωjωk)] p2 =

Pr[ω isin [ωiωk)]

Pr[ω isin [ωjωk)]

To apply Lemma 3 we identify Xωi with empty X with m2 and X cup ωiminus1 withm1m2 By Lemma 3 VX le maxVXωi VXcupωiminus1 Thus replacing X withXωi if VX le VXωi and with X cup ωiminus1 otherwise weakly improves the senderrsquosexpected utility Repeatedly applying such replacements yields upper censorshipIndeed this procedure terminates after a finite number of replacements because byLemma 3 for ωj ge τ we replace X with Xωi and for ωiminus1 le τ we replace Xwith X cup ωiminus1

We thus have shown that when V is S-shaped when optimizing over the set ofmonotone partitions we can restrict attention to deterministic upper censorship Itremains to show that an optimal deterministic censorship cutoff ωlowast

d coincides with anoptimal stochastic censorship cutoff ωlowast from part (A) To see this it is straightforwardto show that

V lowast(ωlowast qlowast) =

983133

[0ωlowast)

V (ω)dF (ω) + V (ωlowast)qlowast Pr[ω = ωlowast]

+ V (mlowast(ωlowast qlowast))((1minus qlowast) Pr[ω = ωlowast] + Pr[ω gt ωlowast]) (14)

is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order Since the two de-terministic censorship cutoff pairs closest to (ωlowast qlowast) in the Blackwell informativenessorder are (ωlowast 0) and (ωlowast 1) we obtain ωlowast

d = ωlowast

Proof of Theorem 2prime Immediate by Theorem 1prime and Lemma 1

Proof of Theorem 3prime The proof for an optimal signal is analogous to that ofTheorem 3 and is thus omitted The same applies to part (i) for an optimal monotonepartition Thus it remains to prove part (ii) for an optimal monotone partition

We fix ρ and σ = 1 and vary t continuously If ωlowast isin supp(F ) then an optimalmonotone partition is an optimal signal because all qlowast isin [0 1] are optimal Thuspart (ii) in (B) holds by part (ii) in (A) If ωlowast isin supp(F ) and qlowast isin (0 1) then either(ωlowast

d qlowastd) = (ωlowast 0) or (ωlowast

d qlowastd) = (ωlowast 1) is optimal because the senderrsquos expected utility

V lowast given by (14) is single-peaked in (ωlowast qlowast) in the Blackwell informativeness order

30 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Therefore it suffices to consider values of t at which the sender is indifferent between(ωlowast 0) and (ωlowast 1) and to show that the sender strictly prefers (ωlowast 1) to (ωlowast 0) at aslightly higher t

V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) =rArr dV lowastt (ω

lowast 0)

dtle dV lowast

t (ωlowast 1)

dt (15)

The condition V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) can be written as

Vt(mlowast(ωlowast 0)) Pr[ω ge ωlowast] = Vt(ω

lowast) Pr[ω = ωlowast] + Vt(mlowast(ωlowast 1)) Pr[ω gt ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)Pr[ω = ωlowast]

Pr[ω ge ωlowast]+ Vt(m

lowast(ωlowast 1))Pr[ω gt ωlowast]

Pr[ω ge ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + Vt(mlowast(ωlowast 1))

mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Analogously the conditiondV lowast

t (ωlowast0)dt

le dV lowastt (ωlowast1)dt

can be written as

V primet (m

lowast(ωlowast 0)) ge V primet (ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + V primet (m

lowast(ωlowast 1))mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Thus (15) states that if Vt(mlowast(ωlowast 0)) is equal to the convex combination of Vt(ω

lowast)and Vt(m

lowast(ωlowast 1)) then the derivative V primet (m

lowast(ωlowast 0)) is greater than the convex com-bination of the derivatives V prime

t (ωlowast) and V prime

t (mlowast(ωlowast 1)) When V is S-shaped it is easy

to see that this property is satisfied

Proof of Lemma 2 Observe that a signal induced by any set X sub C of permittedmedia outlets induces a monotone partition of [0 1] and thus HC sub HM Indeed letcX(ω) and cX(ω) be the media outlets in X that are the closest to ω from above andfrom below

cX(ω) = sup983059c isin X c le ω

983052cup 0

983060and cX(ω) = inf

983059c isin X c gt ω cup 1

983060

By observing the media outlets in X each citizen is informed that ω isin [cX(ω) cX(ω))

Conversely let the set C of media outlets be c0 lt middot middot middot lt cn Without loss of generalityassume that C contains the two uninformative media outlets c0 = 0 and cn = 1 which(almost) always endorse and criticize action a = 1 respectively Then the distributionF of the state ω is a discrete distribution whose support is Ω = ω0 ωnminus1 whereωi = E[θ|θ isin [ci ci+1)] for i isin 0 nminus 1 Consider a monotone partitional signalrepresented by a nondecreasing function ξ [0 1] rarr R For each realization m of thissignal that occurs with a strictly positive probability define

x(m) = min983153ω isin Ω ξ(ω) = m

983154and x(m) = max

983153ω isin Ω ξ(ω) = m

983154

This signal induces a monotone partition of the finite set Ω This partition of Ω canbe implemented by a censorship policy that prohibits a media outlet c isin C if andonly if c isin (x(m) x(m)) for some realization m

CENSORSHIP AS OPTIMAL PERSUASION 31

References

Alonso R and O Camara (2016a) ldquoPersuading Votersrdquo American EconomicReview 106 3590ndash3605

(2016b) ldquoPolitical Disagreement and Information in Electionsrdquo Games andEconomic Behavior 100 390ndash412

Bagnoli M and T Bergstrom (2005) ldquoLog-Concave Probability and Its Ap-plicationsrdquo Economic Theory 26 445ndash469

Baron D P (2006) ldquoPersistent Media Biasrdquo Journal of Public Economics 90(1)1ndash36

Besley T and A Prat (2006) ldquoHandcuffs for the Grabbing Hand MediaCapture and Government Accountabilityrdquo American Economic Review 96(3) 720ndash736

Blackwell D (1953) ldquoEquivalent Comparisons of Experimentsrdquo Annals of Math-ematical Statistics 24 265ndash272

Chan J and W Suen (2008) ldquoA Spatial Theory of News Consumption andElectoral Competitionrdquo Review of Economic Studies 75 699ndash728

Chiang C-F and B Knight (2011) ldquoMedia Bias and Influence Evidence fromNewspaper Endorsementsrdquo Review of Economic Studies 78 795ndash820

Duffie D P Dworczak and H Zhu (2017) ldquoBenchmarks in Search MarketsrdquoJournal of Finance 72 1983ndash2044

Dworczak P (2017) ldquoMechanism Design with Aftermarkets Cutoff Mecha-nismsrdquo mimeo

Dworczak P and G Martini (2019) ldquoThe Simple Economics of Optimal Per-suasionrdquo Journal of Political Economy 127 993ndash2048

Edmond C (2013) ldquoInformation Manipulation Coordination and RegimeChangerdquo Review of Economic Studies 80 1422ndash1458

Egorov G S Guriev and K Sonin (2009) ldquoWhy Resource-poor DictatorsAllow Freer Media A Theory and Evidence from Panel Datardquo American PoliticalScience Review 103 645ndash668

Gehlbach S and K Sonin (2014) ldquoGovernment Control of the Mediardquo Journalof Public Economics 118 163ndash171

Gentzkow M and E Kamenica (2016) ldquoA Rothschild-Stiglitz Approach toBayesian Persuasionrdquo American Economic Review Papers amp Proceedings 106597ndash601

Gentzkow M and J Shapiro (2006) ldquoMedia Bias and Reputationrdquo Journalof Political Economy 114(2) 280ndash316

(2010) ldquoWhat Drives Media Slant Evidence from US Daily NewspapersrdquoEconometrica 78(1) 35ndash71

Ginzburg B (2019) ldquoOptimal Information Censorshiprdquo Journal of EconomicBehavior and Organization 163 377ndash385

Goldstein I and Y Leitner (2018) ldquoStress Tests and Information DisclosurerdquoJournal of Economic Theory 177 34ndash69

32 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Guo Y and E Shmaya (2019) ldquoThe Interval Structure of Optimal DisclosurerdquoEconometrica 87 653ndash675

Johnson J P and D P Myatt (2006) ldquoOn the Simple Economics of Adver-tising Marketing and Product Designrdquo American Economic Review 96 756ndash784

Kamenica E and M Gentzkow (2011) ldquoBayesian Persuasionrdquo American Eco-nomic Review 101 2590ndash2615

Kolotilin A (2015) ldquoExperimental Design to Persuaderdquo Games and EconomicBehavior 90 215ndash226

(2018) ldquoOptimal Information Disclosure A Linear Programming Ap-proachrdquo Theoretical Economics 13 607ndash636

Kolotilin A and H Li (2019) ldquoRelational Communicationrdquo mimeoKolotilin A M Li T Mylovanov and A Zapechelnyuk (2015) ldquoPer-suasion of a Privately Informed Receiverrdquo mimeo

Kolotilin A T Mylovanov A Zapechelnyuk and M Li (2017) ldquoPer-suasion of a Privately Informed Receiverrdquo Econometrica 85 1949ndash1964

Kolotilin A and A Wolitzky (2019) ldquoAssortative Information Disclosurerdquomimeo

Lewis T R and D Sappington (1994) ldquoSupplying Information to FacilitatePrice Discriminationrdquo International Economic Review 35 309ndash327

Lorentzen P (2014) ldquoChinarsquos Strategic Censorshiprdquo American Journal of Polit-ical Science 58 402ndash414

Mullainathan S and A Shleifer (2005) ldquoThe Market for Newsrdquo AmericanEconomic Review 95(4) 1031ndash1053

Orlov D P Zryumov and A Skrzypach (2018) ldquoDesign of Macro-prudentialStress Testsrdquo mimeo

Ostrovsky M and M Schwarz (2010) ldquoInformation Disclosure and Unravelingin Matching Marketsrdquo American Economic Journal Microeconomics 2 34ndash63

Prat A and D Stromberg (2013) ldquoThe Political Economy of Mass MediardquoAdvances in Economics and Econometrics 2 135ndash187

Quah J and B Strulovici (2012) ldquoAggregating the Single Crossing PropertyrdquoEconometrica 80 2333ndash2348

Rayo L and I Segal (2010) ldquoOptimal Information Disclosurerdquo Journal ofPolitical Economy 118 949ndash987

Romanyuk G and A Smolin (2019) ldquoCream Skimming and Information Designin Matching Marketsrdquo American Economic Journal Microeconomics 11 250ndash276

Suen W (2004) ldquoThe Self-Perpetuation of Biased Beliefsrdquo Economic Journal 114377ndash396

Yamashita T (2018) ldquoOptimal Public Information Disclosure by Mechanism De-signerrdquo mimeo

Zapechelnyuk A (2019) ldquoOptimal Quality Certificationrdquo American EconomicReview Insights forthcoming

30 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Therefore it suffices to consider values of t at which the sender is indifferent between(ωlowast 0) and (ωlowast 1) and to show that the sender strictly prefers (ωlowast 1) to (ωlowast 0) at aslightly higher t

V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) =rArr dV lowastt (ω

lowast 0)

dtle dV lowast

t (ωlowast 1)

dt (15)

The condition V lowastt (ω

lowast 0) = V lowastt (ω

lowast 1) can be written as

Vt(mlowast(ωlowast 0)) Pr[ω ge ωlowast] = Vt(ω

lowast) Pr[ω = ωlowast] + Vt(mlowast(ωlowast 1)) Pr[ω gt ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)Pr[ω = ωlowast]

Pr[ω ge ωlowast]+ Vt(m

lowast(ωlowast 1))Pr[ω gt ωlowast]

Pr[ω ge ωlowast]

lArrrArr Vt(mlowast(ωlowast 0)) = Vt(ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + Vt(mlowast(ωlowast 1))

mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Analogously the conditiondV lowast

t (ωlowast0)dt

le dV lowastt (ωlowast1)dt

can be written as

V primet (m

lowast(ωlowast 0)) ge V primet (ω

lowast)mlowast(ωlowast 1)minusmlowast(ωlowast 0)

mlowast(ωlowast 1)minus ωlowast + V primet (m

lowast(ωlowast 1))mlowast(ωlowast 0)minus ωlowast

mlowast(ωlowast 1)minus ωlowast

Thus (15) states that if Vt(mlowast(ωlowast 0)) is equal to the convex combination of Vt(ω

lowast)and Vt(m

lowast(ωlowast 1)) then the derivative V primet (m

lowast(ωlowast 0)) is greater than the convex com-bination of the derivatives V prime

t (ωlowast) and V prime

t (mlowast(ωlowast 1)) When V is S-shaped it is easy

to see that this property is satisfied

Proof of Lemma 2 Observe that a signal induced by any set X sub C of permittedmedia outlets induces a monotone partition of [0 1] and thus HC sub HM Indeed letcX(ω) and cX(ω) be the media outlets in X that are the closest to ω from above andfrom below

cX(ω) = sup983059c isin X c le ω

983052cup 0

983060and cX(ω) = inf

983059c isin X c gt ω cup 1

983060

By observing the media outlets in X each citizen is informed that ω isin [cX(ω) cX(ω))

Conversely let the set C of media outlets be c0 lt middot middot middot lt cn Without loss of generalityassume that C contains the two uninformative media outlets c0 = 0 and cn = 1 which(almost) always endorse and criticize action a = 1 respectively Then the distributionF of the state ω is a discrete distribution whose support is Ω = ω0 ωnminus1 whereωi = E[θ|θ isin [ci ci+1)] for i isin 0 nminus 1 Consider a monotone partitional signalrepresented by a nondecreasing function ξ [0 1] rarr R For each realization m of thissignal that occurs with a strictly positive probability define

x(m) = min983153ω isin Ω ξ(ω) = m

983154and x(m) = max

983153ω isin Ω ξ(ω) = m

983154

This signal induces a monotone partition of the finite set Ω This partition of Ω canbe implemented by a censorship policy that prohibits a media outlet c isin C if andonly if c isin (x(m) x(m)) for some realization m

CENSORSHIP AS OPTIMAL PERSUASION 31

References

Alonso R and O Camara (2016a) ldquoPersuading Votersrdquo American EconomicReview 106 3590ndash3605

(2016b) ldquoPolitical Disagreement and Information in Electionsrdquo Games andEconomic Behavior 100 390ndash412

Bagnoli M and T Bergstrom (2005) ldquoLog-Concave Probability and Its Ap-plicationsrdquo Economic Theory 26 445ndash469

Baron D P (2006) ldquoPersistent Media Biasrdquo Journal of Public Economics 90(1)1ndash36

Besley T and A Prat (2006) ldquoHandcuffs for the Grabbing Hand MediaCapture and Government Accountabilityrdquo American Economic Review 96(3) 720ndash736

Blackwell D (1953) ldquoEquivalent Comparisons of Experimentsrdquo Annals of Math-ematical Statistics 24 265ndash272

Chan J and W Suen (2008) ldquoA Spatial Theory of News Consumption andElectoral Competitionrdquo Review of Economic Studies 75 699ndash728

Chiang C-F and B Knight (2011) ldquoMedia Bias and Influence Evidence fromNewspaper Endorsementsrdquo Review of Economic Studies 78 795ndash820

Duffie D P Dworczak and H Zhu (2017) ldquoBenchmarks in Search MarketsrdquoJournal of Finance 72 1983ndash2044

Dworczak P (2017) ldquoMechanism Design with Aftermarkets Cutoff Mecha-nismsrdquo mimeo

Dworczak P and G Martini (2019) ldquoThe Simple Economics of Optimal Per-suasionrdquo Journal of Political Economy 127 993ndash2048

Edmond C (2013) ldquoInformation Manipulation Coordination and RegimeChangerdquo Review of Economic Studies 80 1422ndash1458

Egorov G S Guriev and K Sonin (2009) ldquoWhy Resource-poor DictatorsAllow Freer Media A Theory and Evidence from Panel Datardquo American PoliticalScience Review 103 645ndash668

Gehlbach S and K Sonin (2014) ldquoGovernment Control of the Mediardquo Journalof Public Economics 118 163ndash171

Gentzkow M and E Kamenica (2016) ldquoA Rothschild-Stiglitz Approach toBayesian Persuasionrdquo American Economic Review Papers amp Proceedings 106597ndash601

Gentzkow M and J Shapiro (2006) ldquoMedia Bias and Reputationrdquo Journalof Political Economy 114(2) 280ndash316

(2010) ldquoWhat Drives Media Slant Evidence from US Daily NewspapersrdquoEconometrica 78(1) 35ndash71

Ginzburg B (2019) ldquoOptimal Information Censorshiprdquo Journal of EconomicBehavior and Organization 163 377ndash385

Goldstein I and Y Leitner (2018) ldquoStress Tests and Information DisclosurerdquoJournal of Economic Theory 177 34ndash69

32 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Guo Y and E Shmaya (2019) ldquoThe Interval Structure of Optimal DisclosurerdquoEconometrica 87 653ndash675

Johnson J P and D P Myatt (2006) ldquoOn the Simple Economics of Adver-tising Marketing and Product Designrdquo American Economic Review 96 756ndash784

Kamenica E and M Gentzkow (2011) ldquoBayesian Persuasionrdquo American Eco-nomic Review 101 2590ndash2615

Kolotilin A (2015) ldquoExperimental Design to Persuaderdquo Games and EconomicBehavior 90 215ndash226

(2018) ldquoOptimal Information Disclosure A Linear Programming Ap-proachrdquo Theoretical Economics 13 607ndash636

Kolotilin A and H Li (2019) ldquoRelational Communicationrdquo mimeoKolotilin A M Li T Mylovanov and A Zapechelnyuk (2015) ldquoPer-suasion of a Privately Informed Receiverrdquo mimeo

Kolotilin A T Mylovanov A Zapechelnyuk and M Li (2017) ldquoPer-suasion of a Privately Informed Receiverrdquo Econometrica 85 1949ndash1964

Kolotilin A and A Wolitzky (2019) ldquoAssortative Information Disclosurerdquomimeo

Lewis T R and D Sappington (1994) ldquoSupplying Information to FacilitatePrice Discriminationrdquo International Economic Review 35 309ndash327

Lorentzen P (2014) ldquoChinarsquos Strategic Censorshiprdquo American Journal of Polit-ical Science 58 402ndash414

Mullainathan S and A Shleifer (2005) ldquoThe Market for Newsrdquo AmericanEconomic Review 95(4) 1031ndash1053

Orlov D P Zryumov and A Skrzypach (2018) ldquoDesign of Macro-prudentialStress Testsrdquo mimeo

Ostrovsky M and M Schwarz (2010) ldquoInformation Disclosure and Unravelingin Matching Marketsrdquo American Economic Journal Microeconomics 2 34ndash63

Prat A and D Stromberg (2013) ldquoThe Political Economy of Mass MediardquoAdvances in Economics and Econometrics 2 135ndash187

Quah J and B Strulovici (2012) ldquoAggregating the Single Crossing PropertyrdquoEconometrica 80 2333ndash2348

Rayo L and I Segal (2010) ldquoOptimal Information Disclosurerdquo Journal ofPolitical Economy 118 949ndash987

Romanyuk G and A Smolin (2019) ldquoCream Skimming and Information Designin Matching Marketsrdquo American Economic Journal Microeconomics 11 250ndash276

Suen W (2004) ldquoThe Self-Perpetuation of Biased Beliefsrdquo Economic Journal 114377ndash396

Yamashita T (2018) ldquoOptimal Public Information Disclosure by Mechanism De-signerrdquo mimeo

Zapechelnyuk A (2019) ldquoOptimal Quality Certificationrdquo American EconomicReview Insights forthcoming

CENSORSHIP AS OPTIMAL PERSUASION 31

References

Alonso R and O Camara (2016a) ldquoPersuading Votersrdquo American EconomicReview 106 3590ndash3605

(2016b) ldquoPolitical Disagreement and Information in Electionsrdquo Games andEconomic Behavior 100 390ndash412

Bagnoli M and T Bergstrom (2005) ldquoLog-Concave Probability and Its Ap-plicationsrdquo Economic Theory 26 445ndash469

Baron D P (2006) ldquoPersistent Media Biasrdquo Journal of Public Economics 90(1)1ndash36

Besley T and A Prat (2006) ldquoHandcuffs for the Grabbing Hand MediaCapture and Government Accountabilityrdquo American Economic Review 96(3) 720ndash736

Blackwell D (1953) ldquoEquivalent Comparisons of Experimentsrdquo Annals of Math-ematical Statistics 24 265ndash272

Chan J and W Suen (2008) ldquoA Spatial Theory of News Consumption andElectoral Competitionrdquo Review of Economic Studies 75 699ndash728

Chiang C-F and B Knight (2011) ldquoMedia Bias and Influence Evidence fromNewspaper Endorsementsrdquo Review of Economic Studies 78 795ndash820

Duffie D P Dworczak and H Zhu (2017) ldquoBenchmarks in Search MarketsrdquoJournal of Finance 72 1983ndash2044

Dworczak P (2017) ldquoMechanism Design with Aftermarkets Cutoff Mecha-nismsrdquo mimeo

Dworczak P and G Martini (2019) ldquoThe Simple Economics of Optimal Per-suasionrdquo Journal of Political Economy 127 993ndash2048

Edmond C (2013) ldquoInformation Manipulation Coordination and RegimeChangerdquo Review of Economic Studies 80 1422ndash1458

Egorov G S Guriev and K Sonin (2009) ldquoWhy Resource-poor DictatorsAllow Freer Media A Theory and Evidence from Panel Datardquo American PoliticalScience Review 103 645ndash668

Gehlbach S and K Sonin (2014) ldquoGovernment Control of the Mediardquo Journalof Public Economics 118 163ndash171

Gentzkow M and E Kamenica (2016) ldquoA Rothschild-Stiglitz Approach toBayesian Persuasionrdquo American Economic Review Papers amp Proceedings 106597ndash601

Gentzkow M and J Shapiro (2006) ldquoMedia Bias and Reputationrdquo Journalof Political Economy 114(2) 280ndash316

(2010) ldquoWhat Drives Media Slant Evidence from US Daily NewspapersrdquoEconometrica 78(1) 35ndash71

Ginzburg B (2019) ldquoOptimal Information Censorshiprdquo Journal of EconomicBehavior and Organization 163 377ndash385

Goldstein I and Y Leitner (2018) ldquoStress Tests and Information DisclosurerdquoJournal of Economic Theory 177 34ndash69

32 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Guo Y and E Shmaya (2019) ldquoThe Interval Structure of Optimal DisclosurerdquoEconometrica 87 653ndash675

Johnson J P and D P Myatt (2006) ldquoOn the Simple Economics of Adver-tising Marketing and Product Designrdquo American Economic Review 96 756ndash784

Kamenica E and M Gentzkow (2011) ldquoBayesian Persuasionrdquo American Eco-nomic Review 101 2590ndash2615

Kolotilin A (2015) ldquoExperimental Design to Persuaderdquo Games and EconomicBehavior 90 215ndash226

(2018) ldquoOptimal Information Disclosure A Linear Programming Ap-proachrdquo Theoretical Economics 13 607ndash636

Kolotilin A and H Li (2019) ldquoRelational Communicationrdquo mimeoKolotilin A M Li T Mylovanov and A Zapechelnyuk (2015) ldquoPer-suasion of a Privately Informed Receiverrdquo mimeo

Kolotilin A T Mylovanov A Zapechelnyuk and M Li (2017) ldquoPer-suasion of a Privately Informed Receiverrdquo Econometrica 85 1949ndash1964

Kolotilin A and A Wolitzky (2019) ldquoAssortative Information Disclosurerdquomimeo

Lewis T R and D Sappington (1994) ldquoSupplying Information to FacilitatePrice Discriminationrdquo International Economic Review 35 309ndash327

Lorentzen P (2014) ldquoChinarsquos Strategic Censorshiprdquo American Journal of Polit-ical Science 58 402ndash414

Mullainathan S and A Shleifer (2005) ldquoThe Market for Newsrdquo AmericanEconomic Review 95(4) 1031ndash1053

Orlov D P Zryumov and A Skrzypach (2018) ldquoDesign of Macro-prudentialStress Testsrdquo mimeo

Ostrovsky M and M Schwarz (2010) ldquoInformation Disclosure and Unravelingin Matching Marketsrdquo American Economic Journal Microeconomics 2 34ndash63

Prat A and D Stromberg (2013) ldquoThe Political Economy of Mass MediardquoAdvances in Economics and Econometrics 2 135ndash187

Quah J and B Strulovici (2012) ldquoAggregating the Single Crossing PropertyrdquoEconometrica 80 2333ndash2348

Rayo L and I Segal (2010) ldquoOptimal Information Disclosurerdquo Journal ofPolitical Economy 118 949ndash987

Romanyuk G and A Smolin (2019) ldquoCream Skimming and Information Designin Matching Marketsrdquo American Economic Journal Microeconomics 11 250ndash276

Suen W (2004) ldquoThe Self-Perpetuation of Biased Beliefsrdquo Economic Journal 114377ndash396

Yamashita T (2018) ldquoOptimal Public Information Disclosure by Mechanism De-signerrdquo mimeo

Zapechelnyuk A (2019) ldquoOptimal Quality Certificationrdquo American EconomicReview Insights forthcoming

32 KOLOTILIN MYLOVANOV ZAPECHELNYUK

Guo Y and E Shmaya (2019) ldquoThe Interval Structure of Optimal DisclosurerdquoEconometrica 87 653ndash675

Johnson J P and D P Myatt (2006) ldquoOn the Simple Economics of Adver-tising Marketing and Product Designrdquo American Economic Review 96 756ndash784

Kamenica E and M Gentzkow (2011) ldquoBayesian Persuasionrdquo American Eco-nomic Review 101 2590ndash2615

Kolotilin A (2015) ldquoExperimental Design to Persuaderdquo Games and EconomicBehavior 90 215ndash226

(2018) ldquoOptimal Information Disclosure A Linear Programming Ap-proachrdquo Theoretical Economics 13 607ndash636

Kolotilin A and H Li (2019) ldquoRelational Communicationrdquo mimeoKolotilin A M Li T Mylovanov and A Zapechelnyuk (2015) ldquoPer-suasion of a Privately Informed Receiverrdquo mimeo

Kolotilin A T Mylovanov A Zapechelnyuk and M Li (2017) ldquoPer-suasion of a Privately Informed Receiverrdquo Econometrica 85 1949ndash1964

Kolotilin A and A Wolitzky (2019) ldquoAssortative Information Disclosurerdquomimeo

Lewis T R and D Sappington (1994) ldquoSupplying Information to FacilitatePrice Discriminationrdquo International Economic Review 35 309ndash327

Lorentzen P (2014) ldquoChinarsquos Strategic Censorshiprdquo American Journal of Polit-ical Science 58 402ndash414

Mullainathan S and A Shleifer (2005) ldquoThe Market for Newsrdquo AmericanEconomic Review 95(4) 1031ndash1053

Orlov D P Zryumov and A Skrzypach (2018) ldquoDesign of Macro-prudentialStress Testsrdquo mimeo

Ostrovsky M and M Schwarz (2010) ldquoInformation Disclosure and Unravelingin Matching Marketsrdquo American Economic Journal Microeconomics 2 34ndash63

Prat A and D Stromberg (2013) ldquoThe Political Economy of Mass MediardquoAdvances in Economics and Econometrics 2 135ndash187

Quah J and B Strulovici (2012) ldquoAggregating the Single Crossing PropertyrdquoEconometrica 80 2333ndash2348

Rayo L and I Segal (2010) ldquoOptimal Information Disclosurerdquo Journal ofPolitical Economy 118 949ndash987

Romanyuk G and A Smolin (2019) ldquoCream Skimming and Information Designin Matching Marketsrdquo American Economic Journal Microeconomics 11 250ndash276

Suen W (2004) ldquoThe Self-Perpetuation of Biased Beliefsrdquo Economic Journal 114377ndash396

Yamashita T (2018) ldquoOptimal Public Information Disclosure by Mechanism De-signerrdquo mimeo

Zapechelnyuk A (2019) ldquoOptimal Quality Certificationrdquo American EconomicReview Insights forthcoming