tariff Choice with Consumer learning and Switching...

20
Journal of Marketing Research Vol. XLVIII (August 2011), 633 –652 *Ronald L. Goettler is Assistant Professor of Marketing, Booth School of Business, University of Chicago (e-mail: ronald.goettler@chicagobooth. edu). Karen Clay is Associate Professor of Economics, Heinz School of Public Policy and Management, Carnegie Mellon University (e-mail: karen. [email protected]). The authors gratefully acknowledge helpful com- ments from Dan Ackerberg, J.P. Dubé, Shane Greenstein, Bob Miller, Christine Parlour, Uday Rajan, and Holger Sieg. They especially thank Eric Wolff and the online grocer for providing the data. Ronald Goettler appre- ciates financial support from the True North Communications, Inc. Faculty Research Fund at the University of Chicago Booth School of Business. Ronald l. GoettleR and KaRen Clay* Consumers choosing flat-rate contracts tend to have insufficient usage to warrant the cost, particularly for new products. We propose and estimate a Bayesian learning model of tariff and usage choice that explains this flat-rate bias without relying on behavioral misjudgments or tariff-specific preferences. For new products, consumers are uncertain about both their utility relative to the population mean and the mean itself. We show that this latter uncertainty inflates prior variances, which leads consumers to weight their private signals more heavily when updating beliefs. Posteriors are unbiased across products. For a given product, however, the unknown mean yields a “winner’s curse”: Consumers with high posteriors tend to overestimate their utility. these consumers choose fixed-rate tariffs and lower their usage as they correct their beliefs. the flat-rate bias arises when switching costs deter them from changing tariffs. Using the estimated model, the authors find that tariff menus are ineffective screening devices for price discrimination by an online grocer. Predicted revenues increase by 20% when the per-use tariff is dropped, because more consumers choose and stay with the flat rate. Keywords: consumer learning, tariff choice, dynamic discrete choice tariff Choice wi th Consumer learning and Switching Costs © 2011, American Marketing Association ISSN: 0022-2437 (print), 1547-7193 (electronic) 633 Firms often price discriminate among consumers with heterogeneous preferences by offering menus of tariff or bundle options. Consumers are often uncertain about their future demands and base subscription choices on beliefs of future usage. Researchers have begun to explore the accu- racy of these beliefs and of optimal contract design given the uncertainty and potential biases. For example, Grubb (2009) shows that three-part tariffs, as used by cellular phone service providers, are nearly optimal when con- sumers are overconfident (i.e., underestimate the variance of future demand). Miravete (2003), DellaVigna and Mal- mendier (2006), and Lambrecht and Skiera (2006) show that consumers often choose and retain the wrong tariff for telephone calling plans, health clubs, and Internet service, respectively. The stylized facts indicate a flat-rate bias, with con- sumers tending to choose flat-rate contracts when per-use contracts would be better. DellaVigna and Malmendier (2006) present evidence that biased beliefs about future behavior best explain the flat-rate bias and attribute such biases to irrational or behavioral inclinations. Lambrecht and Skiera (2006), Lambrecht, Seim, and Skiera (2007), and Iyengar, Ansari, and Gupta (2007) explain the flat-rate bias by estimating a high mean utility for flat-rate contracts in models with fixed effects and idiosyncratic preferences for each tariff. This approach, however, is ill-suited for predict- ing choices and revenues under alternative tariffs because the alternative tariffs’ mean utilities are unknown. More- over, the idiosyncratic preferences for tariffs imply that sub- scriptions necessarily increase as tariffs are added, even if they are identical to those already offered. In this study, we propose a Bayesian learning model of tariff and usage choice that can explain the flat-rate bias within the rational expectations framework and without invoking tariff-specific preferences. The intuition is simple. Consider

Transcript of tariff Choice with Consumer learning and Switching...

Journal of Marketing Research

Vol. XLVIII (August 2011), 633 –652

*Ronald L. Goettler is Assistant Professor of Marketing, Booth Schoolof Business, University of Chicago (e-mail: [email protected]). Karen Clay is Associate Professor of Economics, Heinz School ofPublic Policy and Management, Carnegie Mellon University (e-mail:karen. [email protected]). The authors gratefully acknowledge helpful com-ments from Dan Ackerberg, J.P. Dubé, Shane Greenstein, Bob Miller,Christine Parlour, Uday Rajan, and Holger Sieg. They especially thank EricWolff and the online grocer for providing the data. Ronald Goettler appre-ciates financial support from the True North Communications, Inc. FacultyResearch Fund at the University of Chicago Booth School of Business.

Ronald l. GoettleR and KaRen Clay*

Consumers choosing flat-rate contracts tend to have insufficient usageto warrant the cost, particularly for new products. We propose andestimate a Bayesian learning model of tariff and usage choice thatexplains this flat-rate bias without relying on behavioral misjudgments ortariff-specific preferences. For new products, consumers are uncertainabout both their utility relative to the population mean and the meanitself. We show that this latter uncertainty inflates prior variances, whichleads consumers to weight their private signals more heavily whenupdating beliefs. Posteriors are unbiased across products. For a givenproduct, however, the unknown mean yields a “winner’s curse”:Consumers with high posteriors tend to overestimate their utility. theseconsumers choose fixed-rate tariffs and lower their usage as they correcttheir beliefs. the flat-rate bias arises when switching costs deter themfrom changing tariffs. Using the estimated model, the authors find thattariff menus are ineffective screening devices for price discrimination byan online grocer. Predicted revenues increase by 20% when the per-usetariff is dropped, because more consumers choose and stay with the flatrate.

Keywords: consumer learning, tariff choice, dynamic discrete choice

tariff Choice with Consumer learning andSwitching Costs

© 2011, American Marketing Association

ISSN: 0022-2437 (print), 1547-7193 (electronic) 633

Firms often price discriminate among consumers withheterogeneous preferences by offering menus of tariff orbundle options. Consumers are often uncertain about theirfuture demands and base subscription choices on beliefs offuture usage. Researchers have begun to explore the accu-racy of these beliefs and of optimal contract design giventhe uncertainty and potential biases. For example, Grubb(2009) shows that three-part tariffs, as used by cellularphone service providers, are nearly optimal when con-sumers are overconfident (i.e., underestimate the varianceof future demand). Miravete (2003), DellaVigna and Mal-mendier (2006), and Lambrecht and Skiera (2006) show

that consumers often choose and retain the wrong tariff fortelephone calling plans, health clubs, and Internet service,respectively.

The stylized facts indicate a flat-rate bias, with con-sumers tending to choose flat-rate contracts when per-usecontracts would be better. DellaVigna and Malmendier(2006) present evidence that biased beliefs about futurebehavior best explain the flat-rate bias and attribute suchbiases to irrational or behavioral inclinations. Lambrechtand Skiera (2006), Lambrecht, Seim, and Skiera (2007), andIyengar, Ansari, and Gupta (2007) explain the flat-rate biasby estimating a high mean utility for flat-rate contracts inmodels with fixed effects and idiosyncratic preferences foreach tariff. This approach, however, is ill-suited for predict-ing choices and revenues under alternative tariffs becausethe alternative tariffs’ mean utilities are unknown. More-over, the idiosyncratic preferences for tariffs imply that sub-scriptions necessarily increase as tariffs are added, even ifthey are identical to those already offered.

In this study, we propose a Bayesian learning model oftariff and usage choice that can explain the flat-rate bias withinthe rational expectations framework and without invokingtariff-specific preferences. The intuition is simple. Consider

a new product for which consumers’ “match-values” (i.e.,idiosyncratic utilities) are distributed N(mproduct, s2

product).Being a new product, mproduct is unknown. Suppose consumersknow that new products in this category have mproduct dis-tributed N(mcategory, s2

category). Therefore, the initial commonprior is N(mcategory, s2

product + s2category) with a variance that

sums the idiosyncratic uncertainty s2product and the aggre-

gate uncertainty s2category. Consumers have private informa-

tion about themselves that is relevant for their match-value,which we model as an unbiased signal. They incorporatethis signal in a Bayesian fashion and choose either the flat-rate or per-use tariff according to whether their posteriormean is above or below some critical value. Aggregateuncertainty (i.e., s2

category) inflates the prior variance, caus-ing consumers to weight private signals more heavily.Although posteriors are unbiased across products, the heavyweight on private signals implies that consumers with highposteriors for a given product tend to overestimate their util-ity. Consumers eventually correct their beliefs and lowertheir usage, but switching costs deter them from changingtariffs, which yields the flat-rate bias.

This phenomenon is similar to the winner’s curse in common-value auctions (and affiliated value auctions), inthat consumers with high posteriors have positive biases.The way these posteriors are used, however, differs: Whenbidding, a person conditions on having higher beliefs thanothers, whereas when choosing a tariff, he or she cannotcondition on others’ beliefs or actions. The winner’s curse isactually harmless in auctions because bidders use valuationsconditional on winning, which are lower than the uncondi-tional valuations. That is, bidders “shave” their bids becauseif they win the auction, they are likely to have overestimatedthe item’s value. In contrast, consumers choosing tariffscannot correct their beliefs before making the choicebecause they have no way of knowing whether their poste-rior is high. They should not, however, shave their valuationbecause the selected tariff is “won” regardless of other con-sumers’ actions and their valuations are actually correct onaverage across products.

We use our model to study the online grocer market, inwhich consumers order groceries for home delivery from amonopolist offering a menu with flat-rate, per-use, andtwo-part tariffs. This new service is an “experience good,”in that consumers learn their match-values through con-sumption (Nelson 1970). Therefore, forward-looking con-sumers have an informational incentive to experiment withthe service, leading them to trade off current utility forfuture utility. Anticipated costs to switching tariffs also leadto dynamic trade-offs.1 Previous studies of consumers’ tar-iff choices, including DellaVigna and Malmendier (2006),Miravete (1996, 2002), Lambrecht and Skiera (2006), Lam-brecht, Seim, and Skiera (2007), Narayanan, Chintagunta,and Miravete (2007), and Iyengar, Ansari, and Gupta (2007),avoid these dynamic trade-offs by assuming that switchingcosts are negligible and uncertainty is resolved independentof consumption. Neither of these assumptions suits ourstudy. Using household-level data, we estimate the modeland perform counterfactual simulations to evaluate alterna-

tive pricing schemes and the effect of uncertainty andswitching costs on revenue and consumer surplus.

Our model builds on Eckstein, Horsky, and Raban’s(1989) dynamic brand choice model, in which consumersupdate their beliefs as they experience the brands. Usingvariants of this model, Erdem and Keane (1996) investigatethe role of advertising in consumer learning about laundrydetergent, Ackerberg (2003) distinguishes between informa-tive and prestige advertising for a new yogurt, and Crawfordand Shum (2005) estimate how rapidly consumers learn theeffectiveness and side effects of antiulcer drugs. Predatingthese applications, Miller (1984) estimates a Bayesianlearning model matching workers to jobs.

We extend the dynamic brand choice model to includeconsumers’ initial and subsequent tariff choices. Consumersselect tariffs to maximize expected discounted utility giventheir beliefs, which induces a sorting by beliefs: Consumerswho expect to purchase often choose tariffs with high flatfees and low per-use prices. Therefore, the observed highusage of such consumers reflects both the self-selection andthe low per-use price. By endogenizing consumers’ tariffchoices, our model disentangles these two drivers ofobserved usage.

Using our estimated model, we assess the effectivenessof tariff menus as screening mechanisms to price discrimi-nate among heterogeneous consumers who face uncertainty.We numerically solve for optimal tariffs menus, assumingthey are fixed over time (as in our data), and find that menusare ineffective screening devices for price discrimination.Adding a second two-part tariff increases profits marginally,and a third tariff adds no gain. Miravete (2002) also findslimited gains from complex tariffs when consumers learntheir demand over time, as do Courty and Hao (2000) whenex ante consumer heterogeneity is high. The gains are mini-mal because the incentive compatibility constraints arecostly to satisfy: Offering additional tariffs reduces the firm’sability to extract surplus from high-demand consumers.

Moreover, fixed fees play two roles when consumers facehigh costs for switching tariffs. In addition to extracting sur-plus from inframarginal units consumed, they may lockconsumers into paying the fee even when they learn theirmatch-value is lower than initially expected. Including aper-use option on the menu may therefore lower revenuesbecause fewer consumers will choose the tariffs with fixedfees. Indeed, eliminating the per-use option from the onlinegrocer’s current menu increases predicted revenues by morethan 20%.

Data

We use consumer-level data on grocery deliveries to 5368households in a single metropolitan market during the 70weeks from September 16, 1997, to January 23, 1999. Theearlier date is the online grocer’s commencement of service.The online grocer was a monopolist in the home-deliverybusiness that partnered with an existing local grocery chainto supply groceries. Online prices and discounts were thesame as in the chain’s stores. Consumers learned about theservice through advertising in the form of mass mailings,media stories, print and radio advertising, in-store advertis-ing by the partner chain, and displays on the delivery trucks.Most consumers enrolled while shopping in the partner-chain’s stores. No free trials were offered. Once enrolled,

634 JoURnal oF MaRKetinG ReSeaRCh, aUGUSt 2011

1These switching costs are conceptually distinct from the switchingcosts related to brand loyalty.

tariff Choice with Consumer learning and Switching Costs 635

consumers placed orders from their computers using installedsoftware or a web-based interface. Consumers selected atwo-hour delivery window, typically the next day, duringwhich someone would be home to accept the delivery.

The online grocer offered consumers a menu of threetwo-part tariffs with weekly fixed fees of F = ($5.76, $1.14,0) and per-delivery prices of p = (0, $5, $11.95), respec-tively. Plan 1 is therefore a flat-rate plan, and plan 3 is a uni-form pricing plan. F and p are sometimes called ex ante andex post prices, respectively, because consumers pay Fbefore knowing their usage and pay p only for units con-sumed. Although the online grocer quotes fees on a monthlybasis, consumers could change plans at any time, with feesprorated.

We observe each consumer’s enrollment date and initialtariff choice, the date of each of his or her orders, the sub-scription plan at the time of each order, and the amount ofeach grocery order, which averages $119. We treat theenrollment date as exogenous, reflecting the randomnesswith which consumers become aware of the service. Toensure sufficient opportunity to observe usage behavior, weonly used consumers who had enrolled by week 60.

Grocery shopping typically occurs weekly, so we modelweekly usage of the online grocer, with cit = 1 if consumer iorders at least once during week t and cit = 0 for no orders.We use t = 1 to denote the week of enrollment, not the firstweek of our sample. Only 2.8% of customers’ weeks withorders have more than one order.

In addition, sit Œ 1, 2, 3 denotes consumer i’s subscrip-tion plan in week t. Unfortunately, we only observe a con-sumer’s plan upon enrollment and with each order. Surpris-ingly, we never observe consumers switch (i.e., no oneswitches and orders again).2 Thus, we assume sit isunchanged between orders. The plan beyond the last order,however, is censored. Although we do not observe switches,many consumers likely “quit” by switching to plan 3, whichhas no fee, and never ordering again. To account for thisevent, our likelihood function (Equation A1) integrates overall plans to compute the probability of observing no usageduring the “trailing weeks” between a customer’s last orderand the end of our sample period.

In Table 1, we provide relevant summary statistics foreach plan. Only 12.3% of consumers signed up for plan 1,compared with 31.9% for plan 2 and 55.8% for plan 3. Themean weekly usage rate was .61 for plan 1 enrollees, .38 forplan 2 enrollees, and .21 for plan 3 enrollees. We computeeach consumer’s usage rate from the weeks spanning enroll-ment and the last observed order. This measure is an upperbound because it ignores weeks beyond the last order dur-ing which the consumer may have remained on the plan butdid not order. In Table 1, we also report the range of usagerates for which each plan minimizes the expected cost perorder. Figure 1 plots this expected cost, which equals p +

(F/usage probability), for each plan as a function ofexpected usage. Plan 1 minimizes costs for usage ratesabove .67, plan 3 minimizes costs for usage rates below .23,and plan 2 minimizes costs elsewhere. As the last row ofTable 1 indicates, more than half of the consumers whoenrolled in plans with fees minimized ex post delivery costs,assuming they quit after the last observed order. The flipside is that nearly half did not minimize costs, which sug-gests consumers indeed faced substantial uncertainty andswitching costs. If consumers were uncertain but faced noswitching costs, they would have switched plans if their

2Consumers even ignored letters explaining they should switch plans.The firm also assures us new customer identifications would not have beenassigned upon switching. Being skeptical, we searched our data formatches within zip codes between new customers and customers whorecently placed their last order. Candidates are regular customers on plans2 or 3 whose last order is shortly before a new customer enrolls in plan 1and orders regularly. We found only four possible matches, which had anegligible effect on the estimates.

table 1taRiFF ChoiCeS

Plan 1 Plan 2 Plan 3

F, flat fee $5.76 $1.14 $0p, per-use price $0 $6.95 $11.95Number of households enrolled 658 1712 2998Initial plan shares .123 .319 .558Usage rates for which plan .67–1 .23–.67 0–.23

minimizes costMean usage (between enrollment .61 .38 .21

and last order)Share of enrollees who never order .12 .17 .57Share of ordering enrollees who .57 .53 .32

minimize costs

Notes: Weekly F is the quoted monthly fee ($24.95, $5.00, $0) dividedby 4.33.

Figure 1eXPeCted CoSt PeR deliVeRy

30

25

20

15

10

5

Exp

ecte

d C

ost

per

Use

Usage Probability per Week

.2 .3 .4 .5 .6 .7 .8 .9 1

expected cost = p + F/(usage probability)

Plan 1 (flat fee)

Plan 3 (uniform pricing)

Plan 2 (two-part tariff)

beliefs crossed the expected-usage thresholds depicted inFigure 1. The absence in our data of switches followed byusage therefore suggests switching costs are high.

Some consumers enroll in plans with fixed fees but nevertry the service: 12% of those on plan 1 and 17% of those onplan 2 never order. Because we only use consumers whoenroll at least ten weeks before the end of our sampleperiod, such outcomes suggest consumers can receive infor-mation and revise beliefs without actually placing an order.A “post-enrollment” signal can reflect difficulty installingthe software or navigating the company’s website for order-ing, thereby leading the consumer to quit without everordering. The details of this specification and the initial tar-iff choice are discussed subsequently.

Our data are best summarized by plotting usage ratesover time for enrollees of each plan. In Figure 2, the topsolid line is the average cit across consumers who enrolledin plan 1, even if they may have quit (i.e., including weeksbeyond their last order). Nearly 74% of plan 1 enrolleesused the online grocer during their first week. Their usagedeclined to around .2 by 60 weeks after enrollment. Averagecit becomes noisy as “weeks since enrollment” increasesbecause few consumers signed up early enough to providesuch data. If switching costs were low, plan 1 enrollees withlow initial beliefs quickly dropping after experientially con-

firming their beliefs could explain this usage drop. That is,consumers with low initial beliefs can justifiably sign up forplan 1 if switching costs are low because the incentive toexperiment implies that initial usage is higher than theexpected long-term usage. A high probability of needing toswitch does not deter consumers from signing up for high-fee tariffs when switching costs are low. However, highswitching costs, as the lack of observed switches suggests,imply that plan 1 enrollees should have had initial beliefshigh enough to warrant this plan in the long run. Therefore,the quick drop in usage of plan 1 enrollees implies that theywere overly optimistic. Our specification of beliefs neststwo alternative explanations for this pattern: aggregatebiases and conditional biases. Aggregate biases shift all con-sumers’ beliefs equally, whereas conditional biases dependon consumers’ signals. We show that conditional bias ismore important than aggregate bias for explaining our data.

The dotted lines in Figure 2 are usage rates conditional onconsuming some time beyond the week for which the usagerate is being computed. Because this conditioning eventselects from consumers who are learning their match-valuesare (relatively) high, the rates are higher than the uncondi-tional usage rates and increase at the end of the sample period.

MoDel

We first propose a Bayesian learning model of consumers’tariff choices and usage of an experience good. Second, wepresent comparative statics to illustrate the effects of pricesensitivity, switching costs, and uncertainty on tariff choiceand usage conditional on tariff choice. We then specify thedistribution of match-values and initial beliefs and distin-guish between aggregate and conditional biases. We con-clude the section with a model of the initial tariff choice,which differs slightly from the subsequent tariff choices.

a Bayesian learning Model

We model the consumer’s weekly decision of whether touse the online grocer. The consumer’s decision has twodynamic aspects. First, the online grocer’s service is a newexperience good about which consumers initially have lim-ited information. As consumers use the service, they learntheir utility from it (i.e., their match-values). In particular,consumers may learn whether they can efficiently use theordering software, whether they like the produce selected,and whether they indeed save much time by using the onlinegrocer. If a consumer’s prior belief suggests his or hermatch-value is likely low, he or she may still try the onlinegrocer because the lower expected current utility may beoffset by the possibility of learning that the service, in fact,provides high utility.

The second dynamic aspect arises from the firm’s use ofsubscription plans combined with the presence of switchingcosts. The firm offers a fixed menu of M two-part tariffsdenoted by (F1, p1, …, FM, pM), where F denotes the vector offlat fees and p denotes the vector of per-use prices. Each con-sumer chooses the best subscription plan (i.e., tariff) given hisor her beliefs about the value of the service and the costs ofswitching plans in the event that a switch is warranted. Formany products, switching costs are explicit financial charges.Because the online grocer does not charge consumers tochange plans, this cost represents the hassle of thinkingabout which plan is best and calling to request the change.

636 JoURnal oF MaRKetinG ReSeaRCh, aUGUSt 2011

Figure 2WeeKly USaGe RateS By Plan

.8

.7

.6

.5

.4

.3

.2

.1

0

Weeks Since Enrollment

10 20 30 40 50 600

initial plan shares 12.3, 31.9, 55.8

Plan 2

Plan 3

Plan 1

Notes: Solid lines average over consumers initially enrolled in specifiedplan. Dotted lines condition on ordering in the current or any future week.

tariff Choice with Consumer learning and Switching Costs 637

Each week, consumers choose plans and usage to maxi-mize expected discount utility from grocery shopping, netof switching costs, conditional on available information Iit:

where ct Œ {0, 1} is the consumer’s usage choice in period t(ct = 1 for the online grocer), st Œ {0, …, M} is the subscrip-tion (i.e., tariff) choice, uit is a vector of i.i.d. shocks to util-ity from each of the usage choices, b is the weekly discountfactor, a is the constant marginal utility of money, d is theswitching cost, and � is an indicator function. Importantly,the consumer knows uit before the choice of cit but notbefore the choice of sit.

The utility consumer i obtains from using only the tradi-tional grocery store in period t is simply the idiosyncraticshock: Ui0t = ui0t. The utility from using the online grocer is

(2) Ui1t = mi + eit – apsit+ ui1t,

where ui1t is the idiosyncratic shock, psit is the per-use com-ponent of tariff sit, and mi + eit is the experience signal,which is noisy because of variation in the provision of theservice (e.g., the quality of the fresh produce, the time ittook for the delivery to arrive).3 The consumer uses thenoisy signal to learn about his or her match-value mi.

Following Eckstein, Horsky, and Raban (1988), we spec-ify a Bayesian learning process that exploits conjugate distri-butions, as described in DeGroot (1970). In particular, eit ~i.i.d. N(0, s2

e) with a prior mi ~ N(mit, s2it) yields a learning

process in which the posterior on mi after an experience sig-nal mi + eit is mi ~ N(mi, t+1, s2

i, t+1), where

This model of Bayesian learning is tractable because theseclosed-form expressions for the posterior mean mit and pos-terior variance sit capture all the consumer’s informationregarding mi. Because the posterior is normal, the subsequentupdate follows the same process, with this posterior servingas the prior. Information at t is therefore Iit = (mit, sit).

Consumers choose their subscription plans prior toobserving the idiosyncratic shocks u.4 Without loss of gen-erality, we model the sequence of decisions within a periodto first entail the usage choice for the current period, fol-lowed by the plan choice for the following period. Thus, sitis a state variable chosen at the end of period t – 1. The con-sumer changes plans if an experience signal shifts his or her

( )( )

, ,3 1

2 2

2 2 12m

mi t

it it i it

iti t+ +=

+ ++

σ σ µ εσ σ

σε

εand ==

+σ σ

σ σε

ε

2 2

2 2it

it

.

( ) max( ), ( , , )

1s I c I s u

t

t

ici i i t

E Uτ τ τ τ τ τ τ

βτ

τ{ }−

=

=∞ ∑ ττ

τ

τ τ τ

τ ττ

α δ

( , )

( ) |

s u

F s s I

i

s it

t

(

− − ≠ )

=

−∑ ( 1 ,,

beliefs enough to warrant incurring the switching costs. Thesequence is summarized in the following timeline:

3We do not consider risk aversion in the usage choice, because Coscelliand Shum (2004) show that risk aversion is not identified when priors areestimated separately from the distribution of match-values. We also do notconsider risk aversion in the tariff choice because consumers do not facethe risk of “overage” charges, as they do with three-part tariffs for cellularphone service.

4Clay, Sibley, and Srinagesh (1992) compare ex post pricing (i.e., choos-ing tariffs after usage occurs) with ex ante tariff choice. They find that exante choice yields higher profits for low to moderate levels of uncertainty.

fixed

Iit, sit

t

observe uit

choose cit

if cit = 1

observe mi + eit

update to Ii, t + 1 choose si, t + 1

fixed

Ii, t + 1, si, t + 1

t + 1

Expected discounted utility as a function of the currentstate, assuming optimal policies are implemented eachperiod, is given by Bellman’s equation:

where the expectation is over next period’s state and thecurrent period’s experience signal mi + eit. Following Rust(1987), we integrate over the i.i.d. u shocks to remove themfrom the state space as they only affect current utility.Assuming u are Type I extreme value, this integration hasan analytic solution. The stochastic evolution of beliefs isthe remaining unknown determinant of the continuationvalue. Accordingly, let mit = mi + eit denote the realizedexperience signal (when cit = 1). In the following integratedvalue function, the first exponent corresponds to cit = 0 andthe second exponent corresponds to cit = 1:

where g is Euler’s constant and mi, t + 1(.) and si, t + 1(.) arethe posterior mean and variance from Equation 3. The per-ceived distribution of signals F is normal with mean mit andvariance s2

it + s2e to account for both the signal’s noise and

the uncertainty of current beliefs.When cit = 0, the beliefs mit and sit do not change because

no signal is received. When cit = 1, the maximization thatdetermines si, t + 1 occurs inside the integration over theexperience signal because consumers make this choice afterobserving the signal. Our model of tariff choice differs fromLambrecht and Skiera (2006), Lambrecht, Seim, and Skiera(2007), and Iyengar, Ansari, and Gupta (2007) in that con-sumers do not have idiosyncratic preferences for tariffs. Theoptimal tariff in our model maximizes the consumer’sexpected discounted utility from usage choices. Tariffs them-selves do not provide utility in our model; they only affectthe cost of usage and have no differential transactions costs.

Implications and Comparative Statics

We numerically solve the model and present comparativestatics to illustrate the effect of price sensitivity, switchingcosts, and uncertainty on tariff choice and usage across thethree plans, when consumers face the tariff menu observedin our data. In Figure 3, we present consumers’ optimal tar-iffs and usage rates after match-value uncertainty has been

( ) ( , , , ) max, ,

41

V m s u U

V

Eu it it it itc s

ic tit i t

itσ

β

=

+

[+

uu i t i t i t i t it it it itm s u m s u( , , , ) | , , ,, , , ,+ + + +1 1 1 1σ σ ,, ,cit

( ) ( , , ) ln[exp( max{ ( ,,

51

V m s V mit it its

it iti t

σ γ β σ= ++

,, )

( )}) exp(

,

,,

s

F s s m

p

i t

s i t it it

s

i t

+

+− − ≠ +

+

1

11α δ

α

(

iiti ts

i t it it it

i t it

V m m++

∫ +

+

β σ µ

σ σ

max ( ( , , ),

(

,

,

,

1

1

1 )), ) (

) ( | ,

, ,,s F s

s d m

i t s i t

it it it i

i t+ +− −

+1 11α δ

µ σ

(

Φ tt ))],

638 JoURnal oF MaRKetinG ReSeaRCh, aUGUSt 2011

Figure 3oPtiMal neW Plan and USaGe GiVen CURRent Plan and no ReMaininG UnCeRtainty

a: a = .1, d/a = $0 B: a = .3, d/a = $0

C: a = .1, d/a = $50 D: a = .3, d/a = $50

e: a = .1, d/a = $300 F: a = .3, d/a = $300

1.0

.8

.6

.4

.2

0

Usag

e F

req

uen

cy

Match-Value

–3 –2 –1 0 1 2 3

No switching costs implies new plan is independent of current plan.

Choose no fee

Choose low fee

High fee

high fee

Choose low fee

Choose no fee

no switching costs implies new plan

is independent of current plan

1.0

.8

.6

.4

.2

0

Usag

e F

req

uen

cy

Match-Value

–3 –2 –1 0 1 2 3

Choose no fee

Choose low fee

High fee

high fee

Choose low fee

Choose no fee

1.0

.8

.6

.4

.2

0

Usag

e F

req

uen

cy

Match-Value

–3 –2 –1 0 1 2 3

No switching costs implies new plan is independent of current plan.

No fee

Choose low fee

High fee

high fee

Choose low fee

no fee

no switching costs implies new plan

is independent of current plan

+ high-fee plan currently

◯ low-fee plan currently

◻ no-fee plan currently

1.0

.8

.6

.4

.2

0

Usag

e F

req

uen

cy

Match-Value

–3 –2 –1 0 1 2 3

No fee

Choose low fee

High fee

high fee

Choose low fee

no fee

+ high-fee plan currently

◯ low-fee plan currently

◻ no-fee plan currently

1.0

.8

.6

.4

.2

0

Usag

e F

req

uen

cy

Match-Value

–3 –2 –1 0 1 2 3

Choose no fee

Choose high fee

Choose no fee

+ high-fee plan currently

◯ low-fee plan currently

◻ no-fee plan currently

1.0

.8

.6

.4

.2

0

Usag

e F

req

uen

cy

Match-Value

–3 –2 –1 0 1 2 3

No feeChoose low fee

Choose high fee

Choose low fee

Choose high fee

no fee

+ high-fee plan currently

◯ low-fee plan currently

◻ no-fee plan currently

Choose high fee

tariff Choice with Consumer learning and Switching Costs 639

fully resolved as functions of the match-value (on the x-axis) and the consumer’s current plan (distinguished byplotting symbols). The price coefficient, a, is .1 for plots inthe left column and .3 in the right column. Switching costsin dollar terms, d/a, are zero for the top row, $50 for themiddle row, and $300 for the bottom row.

In the top row, we do not distinguish across current plansbecause without switching costs, plan choice is independentof the current plan. Consumers with low match-valueschoose the no-fee plan, consumers with high match-valueschoose the high-fee plan, and moderate match-value con-sumers choose the low-fee plan. With known match-values,usage frequencies by plan necessarily fall within the ranges,depicted in Figure 1, for which the chosen plan minimizesexpected delivery costs, p + (F/usage probability). Usageincreases monotonically with match-value, with discretejumps at the thresholds separating no-fee from low-fee sub-scribers and low-fee from high-fee subscribers. The usagejumps increase in a because higher a implies a greater sen-sitivity to the different marginal prices. The thresholds alsoshift to the right as a increases because higher match-valuesare needed to justify using the service. Moreover, thethreshold separating no-fee from low-fee subscribers shiftsmore than the threshold separating low-fee from high-feesubscribers because the expected cost per use is lower forconsumers with higher match-values. Consequently, therange of match-values choosing the low-fee plan shrinks asa increases. Indeed, no consumers choose the low-fee planwhen a exceeds $.3414.

The lower plots demonstrate the effect of switching costson tariff choice, given a previous plan. With moderateswitching costs of $50 and low price sensitivity, high-feesubscribers will switch to either the low-fee or no-fee plan iftheir realized match-value falls sufficiently from the rela-tively high prior that induced the consumer to initiallychoose the high-fee plan. Moderate switching costs withhigh price sensitivity, however, lead the high-fee subscriberto either stay on his or her current plan or to switch to theno-fee plan. The higher a reduces usage on the low-fee planenough that switching to it is never worth $50.

In Panels E and F of Figure 3, we show that with highswitching costs, consumers never switch to or from the low-fee plan and only switch across the no-fee and high-fee plansin extreme cases. Consequently, consumers with similarmatch-values face different marginal prices if their initialbeliefs varied sufficiently that they first chose differentplans.

Whereas Figure 3 focuses on tariff choice and usage givena current subscription plan and no remaining uncertainty,Figures 4 and 5 illustrate the determinants of the initial tar-iff choice, when consumers’ match-values are unknown. InFigure 4, we plot the thresholds that divide match-valuebeliefs on the y-axis, according to the corresponding opti-mal initial plan, as functions of price sensitivity on the x-axis. We depict three scenarios: no uncertainty and uncer-tainty with and without switching costs. The no-uncertaintycase, the solid line, is the same scenario as Panels A and Bof Figure 3. As such, the threshold line dividing no fee fromlow fee converges to the line dividing low fee from high feeas price sensitivity increases: No consumers choose low feefor a > .3414. As a increases, the market share of no feemonotonically increases. When consumers are uncertain

and face high switching costs (the dotted line), the set ofbeliefs for which the low-fee plan is optimal expands. Theincreased attraction of the middle plan is that it enables con-sumers to at least be near the plan that will be optimal afterthe uncertainty has been resolved, without incurring theswitching costs. When consumers face uncertainty but noswitching costs (the dashed line), the threshold lines shiftdown because consumers know they will increase theirusage, relative to the no-uncertainty case, to gain informa-tion about their match-value. Higher usage with no cost ofbeing stuck on the wrong plan implies that consumers willshift to the higher-fee plans. Figures 3 and 4 show that ainfluences both the initial plan shares and the gaps in usagerates across plans. Thus, these moments help identify awhen we estimate the model.

The effect of switching costs on initial plan shares andusage conditional on plan (i.e., the allocation of match-valuesacross plans) implies d is identified by more than justswitching behavior. Figure 4 illustrates this effect for twoparticular values of switching costs. In Figure 5, we putswitching costs on the x-axis to provide more details of thiseffect, with a of .1 in the top plot and .3 in the bottom plot.In both plots, each plan has some subscribers for low

no uncertainty

Uncertainty with no switching costs

Uncertainty with high switching costs

Notes: Upper line of each pair is lowest belief for which high fee is opti-mal. Lower line of each pair is highest belief for which no fee is optimal.Between each pair are beliefs, if any, for which low fee is optimal.

Figure 4eFFeCt oF UnCeRtainty, SWitChinG CoStS, and PRiCe

SenSitiVity on initial taRiFF ChoiCe

4

3

2

1

0

–1

–2

–3

Init

ial B

elief

(mi0

)

Price Coefficient (a)

.1 .2 .3 .4 .5 .60

high beliefs choose high-fee plan

Moderate beliefs choose low-fee plan

low beliefs choose no-fee plan

switching costs, although in the bottom plot, the range ofbeliefs for which low fee is optimal is small and requires verylow switching costs. As switching costs increase to moderatelevels, the low-fee plan loses its appeal but then becomesattractive again as switching costs increase further. In bothplots, switching costs near $400 are sufficiently high thatconsumers know they will never switch. Note that switch-ing to the no-fee plan will never occur if d/a exceeds F/(1 –b), the present discounted value of paying F each period.

Initial Beliefs: aggregate and Conditional Biases

We estimate the distribution of match-values across con-sumers and consumers’ initial beliefs.5 For parsimony andto facilitate our discussion of rational expectations, we

assume that match-values are distributed N(mpop, s2pop),

though any distribution may be used. Beliefs are distributednormal, with a common initial prior mi ~ N(~mpop, ~s2

pop),where ~mpop and ~s2

pop denote consumers’ beliefs about thepopulation mean and variance of match-values.

Consumers typically have private information beforeenrolling, such as their proximity to grocers and computersavvy. We model such information as unbiased private sig-nals (mi + ei, pre) ~ N(mi, s2

pre), where the subscript pre indi-cates preenrollment. Consumers then update beliefs to mi ~N(mi0, s2

i0), where

In empirical learning models, “rational expectations”means consumers know the distribution of match-values. Inmost applications, such as Miller (1984) and Crawford andShum (2005), this notion is applied at the product level,implying ~mpop = mpop and ~s2

pop = s2pop. For new products, this

interpretation is too narrow: Consumers may overestimatethe value of a particular new product while still being cor-rect on average across many new products.

In line with Ackerberg (2003), we take a broader view ofrational expectations and consider the population mean itselfdistributed normal: mpop ~ N(mcategory, s2

category) for somecategory of products. Rational expectations over the categoryimplies an initial prior with ~mpop = mcategory and ~s2

pop = s2pop +

s2category for each product because the perceived uncertainty

~spop accounts for both the dispersion in match-values for thefocus product and the uncertainty in its mean match-value.

The posterior mean mi0 is unbiased with the usual condi-tioning on the consumer’s information set: mi0 = E(mi | Ii0),where Ii0 consists of the initial prior and the realized signal.But firms and researchers are usually interested in whetherconsumers’ beliefs are correct regarding a particular prod-uct. Therefore, we define product-level bias by also condi-tioning on mpop:

where z(mi0) is the signal value that leads to mi0 when mpopis unknown (i.e., using Equation 6). The term [s2

prempop +s2

popz(mi0)]/(s2pop + s2

pre) is the posterior mean if mpop wereknown (i.e., [~mpop, ~s2

pop] = [mpop, s2pop]), which therefore

equals E(mi | Ii0, mpop).6

Product-level biases do not imply irrational or suboptimalbehavior. Rational expectations require beliefs to be unbiasedgiven consumers’ information sets, but product-level bias

( ) ( ; ) ( | , )7 0 0 0

0

2

bias m m E I

m

i pop i i i pop

ipre

µ µ µ

σ µ

≡ −

= − ppop pop i

pop pre

z m++σ

σ σ

20

2 2

( ),

( )( ),

6 0

2 2

2 2mi

pre pop pop i i pre

pop pre

=+ +

+σ µ σ µ ε

σ σ% %%

,, .σσ σ

σ σipre pop

pre pop02

2 2

2 2=

+%%

640 JoURnal oF MaRKetinG ReSeaRCh, aUGUSt 2011

Figure 5eFFeCt oF SWitChinG CoStS on initial taRiFF ChoiCe

a: a = .1

B: a = .3

2.0

1.5

1.0

.5

0

–.5

–1.0

–1.5

Init

ial B

elief

(mi0

)

Switching Cost in Dollar Terms (d/a)

0 50 100 150 200 250 300 350 4000 50 100 150 200 250 300 350 400

0

1

2

switching cost in dollar terms ( / )

initi

al b

elie

f (m

i0)

0 50 100 150 200 250 300 350 400

0

1

22.0

1.5

1.0

.5

0

–.5

–1.0

–1.5

Init

ial B

elief

(mi0

)

Switching Cost in Dollar Terms (d/a)

0 50 100 150 200 250 300 350 400

0 50 100 150 200 250 300 350 400

0

1

2

switching cost in dollar terms ( / )

initi

al b

elie

f (m

i0)

0 50 100 150 200 250 300 350 400

0

1

2

Notes: Upper line of each pair is lowest belief for which high fee is opti-mal. Lower line of each pair is highest belief for which no fee is optimal.Between each pair are beliefs, if any, for which low fee is optimal.

5Because we do not observe the set of consumers who consideredenrolling but chose not to, we estimate the distribution of mi conditional onenrolling. Consumers slowly became aware of this service, so we would beuncomfortable using an assumed market size to infer the proportion of con-sumers who deliberately chose not to sign up. Without the distribution of mifor nonenrollees, we cannot predict the full effect of lower prices. However,we can predict the effect of higher prices (because nonenrollees remainnonenrollees), and if lower prices are found to be more profitable for theenrollee population, we can indeed conclude that overall profits wouldincrease because adoption by some nonenrollees would only add to profits.

6By inverting mi0 in Equation 6 for the signal mi + ei, pre and pluggingthe resulting expression for z(mi0) into Equation 7, we can solve for the mi0that yields zero bias:

When ~s2pop > s2

pop, all consumers with mi0 > mi0NoBias have positive product-

level bias. Because mpop is unknown, mi0NoBias is also unknown: Consumers

cannot infer when their mi0 is biased.

miNoBias

poppop pop pop

pop pop0

2

2 2= −

−−

µµ µ σ

σ σ( )

.%%

tariff Choice with Consumer learning and Switching Costs 641

conditions on the unknown mpop. Nonetheless, we follow theconvention of using “bias” to describe product-level dis-crepancies between expectations and (average) realizations.

We decompose product-level bias into two components:aggregate bias due to ~mpop π mpop shifting mi0 equally for allconsumers and conditional bias due to ~s2

pop π s2pop shifting

mi0 by an amount that depends on each consumer’s signal.For new products, ~s2

pop > s2pop, which increases the signal’s

weight, causing consumers with high signals for a givenproduct to be overly optimistic, on average, for that product.7

We illustrate product-level bias and its implications fortariff choice in Figure 6. For this example, we use mcategory =0, s2

category = 3, s2pop = 1, s2

pre = 1, and consider three prod-ucts with mpop realizations of –2, 0, and 2. If consumersknew these mpop realizations, their priors would be N(–2, 1),N(0, 1), and N(2, 1), respectively, and their posteriors wouldbe unbiased at the product level. Because the mpop areunknown, the prior for each product is N(0, 1 + 3) and the pos-teriors exhibit product-level biases as plotted in the top graph.

The quality shock mpop affects all consumers equally,shifting bias (mi0; mpop) up and down. We refer to this shiftas “aggregate bias.” The posterior’s aggregate bias is lessthan the difference between ~mpop and mpop because theinformative signal partially corrects beliefs.

The degree of uncertainty about mpop (i.e., s2category) deter-

mines the slope of product-level bias. As s2category increases,

consumers place more weight on their private signals, whichincreases the relationship between product-level bias and mio.We refer to this variation in bias across beliefs as “condi-tional bias,” because the bias is conditional on one’s signal.

Consider the case of no aggregate bias (i.e., ~mpop = mpop).Despite having unbiased priors and signals, consumers’ pos-teriors exhibit product-level bias: On average, those withsignals above mpop will be overly optimistic, and those withsignals below mpop will be overly pessimistic, as illustratedby the middle line in Panel A of Figure 6. Thus, posteriorscan exhibit product-level biases even when both the priorand the signal are unbiased.

Conditional bias is particularly relevant for models of tariffchoice because consumers use threshold rules: Consumerswith beliefs above some threshold m* choose the flat-rate

plan, whereas consumers with mi0 < m* choose plans withlower fixed fees and higher per-use charges. Therefore, con-sumers on the flat-rate plan will have higher bias than con-sumers on other plans. If the product is weakly worse thanaverage (i.e., mpop £ mcategory), consumers on the flat-rate planwill necessarily have positive bias. If the product is betterthan average (i.e., has a negative aggregate bias due to mpop >mcategory), consumers on the flat-rate plan will, on average,have a positive bias only if threshold m* is sufficiently high.

We illustrate this result in Panel B of Figure 6, whichreports for each product the average product-level bias forconsumers with mi0 exceeding hypothetical m* thresholds.At the lowest cutoff, few consumers are excluded, so thelowest reported bias is approximately the average across thepopulation (i.e., the aggregate bias). The average product-level bias across all consumers is 0 for the product with noaggregate bias, as expected, and is approximately .5 and –.5for products with quality shocks of –2 and 2, respectively.The average bias conditional on mi0 > m* increases in m*and, for the pessimistic example with mpop = 2, becomespositive at m* = 1.9.8

Finally, Panel C of Figure 6 presents the share of con-sumers who are optimistic, conditional on their posteriormean exceeding the hypothetical m*. Again, the lowest cut-off values exclude few consumers, so the lowest shares areapproximately population measures. With mpop = 0, half theconsumers are optimistic; with the negative quality shock,more than half are optimistic; and with the positive qualityshock, less than half are optimistic. Because of conditionalbias, the share of consumers who are optimistic increases inthe threshold m*.9 Even for products with negative aggre-gate biases (i.e., ~mpop < mpop), a threshold exists above whichconsumers tend to be overly optimistic. As such, conditionalbias can explain a flat-rate bias even when consumers areon average pessimistic.

Thus far, we have presented a rational expectationsframework in which conditional bias provides a basis forthe oft-cited intuition that people with high beliefs for aproduct tend to be overly optimistic and often erroneouslychoose flat-rate tariffs. However, conditional biases can alsoarise from behavioral conjectures that violate strict rational-ity. The high weight on private signals that causes condi-tional bias may, for example, reflect projection bias.10 In

7Expressions for each of these components can be obtained from Equa-tion 7 by noting that mi0 in Equation 6 may be written as follows:

which implies

The first term captures aggregate bias; the second term is the conditional bias.

micategory

pop category pre

pr

0

2

2 2 21= −

+ +

σ

σ σ σ

σ ee pop pop pop pop i i pre

pop p

2 2

2

[ ( )] ( ),µ µ µ σ µ εσ σ

+ − + ++

%

rre

category i i pre

pop category pre

2

2

2 2+

++ +

σ µ εσ σ σ

( ),

22,

bias mi poppre pop pop

pop pre

ca( ; )

( )0

2

2 2µ

σ µ µσ σ

σ=

−+

+% ttegory

pop category pre

i i prepre

2

2 2 2

2

σ σ σ

µ εσ µ

+ +

+ −,

[ ppop pop pop pop i i pre

pop pre

+ − + ++

( )] ( ),%µ µ σ µ εσ σ

2

2 2

.

8Averaging bias(mi0; mpop) across all products (i.e., mpop realizations)yields zero for all mi0 when each is weighted by the product’s density ofconsumers at mi0. The average bias conditional on mi0 > m* is thereforealso zero when averaging across products. Although consumers whochoose flat-rate tariffs have unbiased beliefs across products, for mostproducts their product-level biases are positive. We obtain the overall biasof zero by balancing positive product-level biases for many products withrelatively low weights (i.e., lower density of mi0 > m* due to lower mpop)with negative product-level biases for fewer products with relatively highweights (i.e., higher density of mi0 > m* due to higher mpop).

9Although the share of consumers who are optimistic about a givenproduct is increasing in their posterior mi0, across all products, exactly halfof consumers are optimistic regardless of their mi0, which is consistentwith posteriors being unbiased across products.

10Loewenstein, O’Donoghue, and Rabin (2003) attribute biases aboutfuture demand to projection bias, in which consumers overestimate thesimilarity of their current and future preferences. Consistent with the pro-jection bias hypothesis, Conlin, O’Donoghue, and Vogelsang (2007) reportthat consumers are more likely to return catalog purchases of winter gearwhen the temperature is low on the day of ordering. Using experimentaldata, Lovett, Boulding, and Staelin (2009) also find that consumers over-weight private signals.

most data, including ours, the behavioral and fully rationalscenarios are observationally equivalent. The source ofproduct-level bias does not affect our estimator. Therefore,we separately estimate consumers’ initial beliefs (~mpop, ~spop)and the distribution of match-values (mpop, spop) and remainagnostic about whether their differences reflect behavioralmisjudgments or the rational expectations structure withuncertainty about mpop.

The special case of ~s2pop = • is of particular interest. With

flat priors, the posterior after processing one signal is cen-

tered around the private signal: mi ~ N(mi + ei, pre~s2

pre) . This

specification maximizes conditional bias and eliminates

aggregate bias because ~mpop receives no weight in the poste-

rior (and is therefore not identified). Narayanan, Chinta-

gunta, and Miravete (2007) use this specification to study

tariff choice for local telephone service. For models with

severe conditional bias, this specification is appealing with

two fewer parameters and a similar ability to fit the data, but

it is nonetheless a restriction that should be tested.

642 JoURnal oF MaRKetinG ReSeaRCh, aUGUSt 2011

Figure 6Conditional BiaSeS FoR neW PRodUCtS

a: Bias after one Update of N(0, s2pop + s2

category) Prior, Where s2pop = 1, s2

category = 3

B: average Product-level Bias Conditional on mi0 > m*

C: Share of Consumers Who are optimistic Conditional on mi0 > m*

2

1

0

–1

–2

Pro

du

ct-

Level B

ias

Posterior Mean Belief mi0 After One Unbiased Signal (spre = 1)

–4 –3 –2 –1 0 1 2 3 4

2.5

2.0

1.5

1.0

.5

0

–.5

Avera

ge P

rod

uct-

Level B

ias

Hypothetical Beliefs Threshold m*

–4 –3 –2 –1 0 1 2 3 4

1.00

.75

.50

.25

Sh

are

Op

tim

isti

c

Hypothetical Beliefs Threshold m*

–4 –3 –2 –1 0 1 2 3 4

mi ~ n(–2, 1)

mi ~ n(0, 1)

mi ~ n(2, 1)

mi ~ n(–2, 1)

mi ~ n(0, 1)

mi ~ n(2, 1)

mi ~ n(–2, 1)

mi ~ n(0, 1)

mi ~ n(2, 1)

tariff Choice with Consumer learning and Switching Costs 643

Consumers’ Initial tariff Choices

Given preenrollment beliefs mi0, the consumer choosesthe initial tariff si0 that maximizes expected discounted util-ity. After enrolling, consumers obtain an unbiased signal ofmi based, for example, on their experience installing the soft-ware or perusing the items available to purchase. To accountfor this anticipated post enrollment signal, we integrate overthe signal value mi0 = mi + ei, post and consider the option ofswitching tariffs (i.e., si1 ≠ si0) given the realized signal. Theoptimal initial tariff therefore solves

where mi1(mi0) = (s2postmi0 + s2

i0mi0)/(s2post + s2

i0) is the pos-terior mean and s2

i1 = (s2posts2

i0)/(s2post + s2

i0) is the posteriorvariance. The perceived distribution of signals F is normalwith mean mi0 and variance s2

post + s2i0 to account for both

the noise in the signal and the uncertainty of current beliefs.After choosing si0, a consumer realizes the postenrollment

signal (mi + ei, post), updates his or her beliefs to mi ~ N(mi1,si1) and then chooses his or her continuation plan si1 accord-ing to the maximization embedded in the integral in Equa-tion 8. The value function in Equation 5 governs her subse-quent behavior. The postenrollment signal is important forexplaining why some consumers enroll on tariffs with highfixed fees and never consume the product. However, thissignal plays no role in our explanation of the flat-rate bias.

eStIMatIoN

In the Appendix, we derive the likelihood function anddiscuss our simulation estimator. Because beliefs and sig-nals are unobserved, we use Monte Carlo methods tonumerically integrate over the distribution of possible sig-nal realizations. The likelihood involves the joint probabili-ties of consumers’ usage and tariff choices. Because tariffchoice is deterministic given beliefs, the probability of aconsumer’s tariff choice is the mass of beliefs for which thechosen tariff is optimal. To efficiently compute this proba-bility and avoid throwing away signal draws for which thechosen tariff is not optimal, we explicitly solve for theregions of beliefs that map into each tariff being optimaland use a Geweke-Hajivassiliou-Keane–like importancesampler to draw signals only from regions for which poste-riors are consistent with the chosen tariffs.

Our likelihood also addresses one other nonstandard fea-ture: Consumers may stop using the online grocer becauseof an exogenous shock such as moving, marriage, divorce,childbirth, retirement, and so forth. To account for suchshocks, we assume each household exogenously quits ineach period with probability g. Allowing for these shocksincreases the probability of the string of nonusage weeks atthe end of each consumer’s usage history.

In the Appendix, we also provide details for using theimportance-sampling methodology of Ackerberg (2009) toestimate random coefficients.

We now discuss identification. The match-value distribu-tion parameters mpop and s2

pop are identified by consumers’usage rates at the end of their histories, after experience sig-nals have eliminated much of the uncertainty. The extent towhich initial beliefs are biased, and whether the bias is

( ) max max{ [ ( ), , ]{ , , }

80 11 2 3

1 0 1 1s s

i i i ii i

V m s F∈ ∫ −µ σ α ss

i i i i i

i

s s d m

1

1 0 0 0 0− ≠δ µ σ(( )} ( | , ),Φ

aggregate (i.e., ~m ≠ μ) or conditional (i.e., ~s2pop π s2

pop),depends on the degree to which average usage rates adjustover time and how the adjustments vary across plans. Themarked drop in usage of consumers on plan 1 (with the highfee) relative to the change in usage of consumers on the otherplans suggests a significant conditional bias. We identify thevariance of the preenrollment signal s2

pre by the degree ofheterogeneity in ex ante beliefs needed to fit variation inconsumers’ initial tariff choices and initial usage rates. Weidentify the variance of the postenrollment signal s2

post bythe degree to which consumers initial usage rates appearinconsistent with their tariff choice. For example, a (relatively)high s2

post enables the model to predict that some consumerswho enroll on the high fixed-fee plan will never use theservice. The speed with which behavior adjusts over timeidentifies se, the informativeness of the experience signals.

With no observed plan switches, we might expect infiniteswitching costs. However, high switching costs reduce themodel’s ability to explain nonusage between a consumer’slast order and the end of our sample. Long trailing periodsof nonusage are possibly a result of consumers quitting (i.e.,switching to plan 3) and never ordering again. Such eventsonly receive significant weight in the integration over theunobserved plan during trailing weeks if switching costs arenot too high. We also identify switching costs by their effecton consumers’ initial tariff choices, as depicted by the com-parative statics.

Gaps in usage rates across consumers on different plans,as depicted in Figure 3 and discussed in our presentation ofcomparative statics, identify the price coefficient. The dis-count factor b is identified by the degree to which earlyusage exceeds later usage, even for plan 3 enrollees, becausethe information value of early uses increases in b. However,b would not be identified without parametric assumptionson utility, as Rust (1994) shows.

ReSUltS

Table 2 presents estimates for six specifications. The firstis the full model with both aggregate and conditional product-level biases and forward-looking consumers. The next fourare special cases of the full model: conditional bias only(~spop = •), aggregate bias only (~spop = spop), no product-level biases (~mpop = mpop, ~spop = spop), and myopic consumers(b = 0). The additional parameters of the full model, ~mpop,~spop, and b, significantly improve the log-likelihood, asrevealed in the bottom row of the table. Thus, chi-squaretests reject ~mpop = mpop, ~spop = spop, and b = 0 at the .01 sig-nificance level. The final specification adds random coeffi-cients to the full model.

Figure 7 depicts key moments of the data and their simu-lated values for each of the first five models. Although wedo not observe when consumers quit by switching (see the“Data” section), we know when such quits occur in simu-lated data. Thus, in addition to usage rates and initial planshares, we report the share of consumers on each plan whoswitch (and order again) and the share who quit (i.e., switchto plan 3 and never order again).

The full model in Figure 7, Panel B, indeed replicates thedata. The model accurately predicts usage by enrollees in eachplan and in each week, except week 1. The very steep declinesin usage from week 1 to week 2 by consumers on plans 2 and3, compared with the subsequent declines in the data, suggest

that an element not present in our model may drive week 1usage. In particular, consumers could sign up online as partof placing their first order. To avoid possible specificationerror of the week 1 usage, we ignore these usage choices(i.e., Pr[cit | mit(ei, mi), sit, sit; q] = 1 for t = 1 in Equation A1).

For the models with conditional bias, we might expectusage of consumers on plan 3 to increase as they correcttheir pessimistic beliefs. For the full model, 98% of theseconsumers have mi0 < mi, and the average bias is –8.73.However, the postenrollment signal raises their beliefs byan average of 8, such that only 58% are pessimistic imme-diately before their first usage choice. As a result, relative totheir extremely low (unobserved) usage rates before thepostenrollment signal, these consumers indeed increasetheir usage. The observed decline reflects the fact that plan3 enrollees who receive high postenrollment signals developpositive biases. These optimistic consumers tend to use theservice and revise their beliefs down, whereas the pes-simistic consumers tend to never use the service and there-fore never adjust their beliefs.

The full model also fits consumers’ plan choices, despitethe absence of plan-specific utility terms. Predicted planshares are .129, .317, and .554, respectively, compared withactual shares of .123, .319, and .558. Moreover, the modelpredicts that only .25% of consumers switch and order again,whereas more than half of plan 1 enrollees choose to quit.Exogenous quitting due to g = .003 adds to these quits, andis the only mechanism for simulated plan 2 enrollees to quitbecause estimated switching costs are prohibitively high.

The model with only conditional biases is the second-best-fitting model (of the first five) in terms of log-likelihood

and usage patterns, but it overpredicts the plan 3 share bythe largest margin across all models. The model with onlyaggregate biases does not deliver enough of a decline inusage and predicts that nearly a quarter of plan 3 enrolleesand 2.2% of plan 1 enrollees will switch plans and orderagain. Conditional biases are more important than aggregatebiases for explaining our data.

The predicted usage rates of the myopic model, in Figure7, Panel E, decline much faster than in the real data, and thesimulated consumers never quit or switch plans. Both ofthese aspects of the myopic model are unappealing. In Figure7, Panel F, we observe that the model without any product-level biases fits the data poorly. With neither aggregate norconditional biases, the decline in usage over time is due onlyto exogenous quits and the resolution of uncertainty (whichreduces the information-acquisition incentive to purchase).

Having established that the full model fits the data well, wenow turn to interpreting its parameter estimates. Consumershad imprecise information about the value of this service forthe overall population, as implied by ~spop/a = 17.549/.358 =49. Indeed, ~spop is nearly eight times the degree of hetero-geneity in actual match-values (spop = 2.2). For a new serviceusing new technology, we expect uncertainty to be high.Given spre = 23.565, the weight on the preenrollment signalis .357, and the weight on ~mpop = –6.473 is .643. Because~mpop is lower than mpop (–1.687), the aggregate bias is nega-tive. The average bias (i.e., average mi0 – mi) acrossenrollees in each plan is 10.38, 1.75, and −8.73, respectively.

Figure 8 illustrates the positive relationship betweenproduct-level bias and mi0 (i.e., mean belief beforeenrolling) and the resolution of bias as signals are received.

644 JoURnal oF MaRKetinG ReSeaRCh, aUGUSt 2011

table 2PaRaMeteR eStiMateS

Random CoefficientsConditional aggregate

Parameter Full Model Bias only Bias only b = 0 No Biases Mean qi SD qi

mpop (mean match quality) –1.687 –4.049 –6.871 –.604 –3.228 –3.049 2.255(.111) (.112) (.084) (.079) (.077) (.084) (.089)

m~

pop (perceived mean match quality) –6.473 N.A. –.872 –47.357 mpop –4.627 2.393.589 (.070) (7.785) (.316) (.278)

spop (SD match quality) 2.200 2.839 5.849 1.524 7.209(.047) (.061) (.075) (.039) (.110)

s~

pop (perceived SD match quality) 17.549 • spop 28.306 spop 14.237 3.313(1.135) 2.252 (.257) (.159)

spre (SD preenroll signal) 23.565 7.313 10.099 14.018 8.286 10.170 2.608(5.473) (.133) (.368) (.587) (.244) (.215) (.182)

spost (SD postenroll signal) 4.100 5.034 4.076 6.645 3.643 5.233 2.099(.081) (.079) (.060) (.234) (.041) (.102) (.110)

se (SD experience signal) 6.200 7.621 5.718 4.164 4.616 8.272 2.791(.130) (.148) (.094) (.083) (.110) (.185) (.172)

b (weekly discount factor) .974 .987 .998 0 .991 .981(.001) .001 (.001) (.001)

a (price sensitivity) .358 .176 .030 .273 .099 .234 .052(.013) (.010) (.003) (.009) (.009) (.004) (.003)

d (switching cost) 74.468 66.658 49.466 • 49.039 69.504 13.434(2.925) (6.521) (4.170) (7.151) (1.097) (.945)

log-likelihood –50,604.7 –50,714.7 –51,500.0 –50,806.7 –52,313.3 –50,502.5

Notes: We fix b in the random coefficients model to avoid convergence delays when a random b draw is close to 1. We started the random coefficients esti-mation from two local maxima of the full model. The run based on the local maximum with b = .981 yielded the highest likelihood. In the random coefficientsmodel, the standard deviation of mpop serves the role of spop. All models use an exogenous weekly quit rate of g = .003. m

~pop is not identified in the conditional-

bias-only model because it receives no weight in the posterior. Estimated switching costs when b = 0 deter all switching, so we fix d = •. Asymptotic stan-dard errors are in parentheses.

tariff Choice with Consumer learning and Switching Costs 645

Figure 7aCtUal and PRediCted USaGe RateS, By Plan

a: actual Data B: Full Model

C: Flat Prior (Conditional Bias only) D: aggregate Bias only

e: Myopic Consumers: b = 0 F: No Biases

.8

.6

.4

.2

0

Weeks Since Enrollment

0 10 20 30 40 50 600 1

(a) Actual Data

50 601

2

3

(b) Full Model

0 10 20 30 40 50 600

0.2

0.4

0.6

0.8

.8

.6

.4

.2

0

Weeks Since Enrollment

0 10 20 30 40 50 60

0 1

(a) Actual Data

50 601

2

3

(b) Full Model

0 10 20 30 40 50 600

0.2

0.4

0.6

0.8

.8

.6

.4

.2

0

Weeks Since Enrollment

0 10 20 30 40 50 600 1

(a) Actual Data

50 601

2

3

(b) Full Model

0 10 20 30 40 50 600

0.2

0.4

0.6

0.8

.8

.6

.4

.2

0

Weeks Since Enrollment

0 10 20 30 40 50 60

0 1

(a) Actual Data

50 601

2

3

(b) Full Model

0 10 20 30 40 50 600

0.2

0.4

0.6

0.8

.8

.6

.4

.2

0

Weeks Since Enrollment

0 10 20 30 40 50 60

0 1

(a) Actual Data

50 601

2

3

(b) Full Model

0 10 20 30 40 50 600

0.2

0.4

0.6

0.8

.8

.6

.4

.2

0

Weeks Since Enrollment

0 10 20 30 40 50 60

0 1

(a) Actual Data

50 601

2

3

(b) Full Model

0 10 20 30 40 50 600

0.2

0.4

0.6

0.8

Solid lines average over consumersinitially enrolled in specified plan

dotted lines condition on ordering inthe current or any future week

initial plan shares: 12.4, 32.0, 55.6% switch: 3.3, 0, 5.7

% quit: 9.5, 0, n.a.

initial plan shares: 13.1, 32.8, 54.1% switch: 0, 0, 0

% quit: 0, 0, n.a.

initial plan shares: 17.5, 28.0, 54.5% switch: 3.8, 0, 0.2

% quit: 46.3, 0, n.a.

initial plan shares: 13.9, 26.6, 59.5% switch: 23.6, 0, 2.2

% quit: 22.9, 0, n.a.

initial plan shares: 12.9, 31.7, 55.4% switch: .7, 0, .3

% quit: 50.7, 0, n.a.

initial plan shares: 12.3, 31.9, 55.8

Plan

1

2

3

Plan

1

2

3

Plan

1

2

3

Plan

1

2

3

Plan

1

2

3

Plan

1

2

3

646 JoURnal oF MaRKetinG ReSeaRCh, aUGUSt 2011

Figure 8Model CoMPaRiSon: PRodUCt-leVel BiaSeS

a: Share of Consumers Who are optimistic Conditional on mi0 > m* (on x-axis)

B: average Bias after Preenrollment Signal Conditional on mi0 > m* (on x-axis)

1.0

.9

.8

.7

.6

Sh

are

Op

tim

isti

c

Hypothetical Beliefs Threshold m*

–8 –6 –4 –2 0 2 4 60

0.7

0.9

1

00

i0 > m*

aver

age

bias

00

1

3

i0 > m*

aver

age

bias

0

◯ lowest mi0 of plan 2 enrollees

✸ highest mi0 of plan 2 enrollees

C: average Bias after Postenrollment Signal Conditional on mi0 > m* (on x-axis)

12

8

4

0

Avera

ge B

ias

–8 –6 –4 –2 0 2 4 6

0

0.7

0.9

1

00

i0 > m*

aver

age

bias

00

1

3

i0 > m*

aver

age

bias

0

◯ lowest mi0 of plan 2 enrollees

✸ highest mi0 of plan 2 enrollees

D: average Bias after one Usage Conditional on mi0 > m* (on x-axis)

4

3

2

1

0

Avera

ge B

ias

–8 –6 –4 –2 0 2 4 6

0

0.7

0.9

1

00

i0 > m*

aver

age

bias

00

1

3

i0 > m*

aver

age

bias

0

◯ lowest mi0 of plan 2 enrollees

✸ highest mi0 of plan 2 enrollees

3

2

1

0

Avera

ge B

ias

–8 –6 –4 –2 0 2 4 6

0

0.7

0.9

1

00

i0 > m*

aver

age

bias

00

1

3

i0 > m*

aver

age

bias

0

◯ lowest mi0 of plan 2 enrollees

✸ highest mi0 of plan 2 enrollees

Hypothetical Beliefs Threshold m*

Hypothetical Beliefs Threshold m*

Hypothetical Beliefs Threshold m*

aggregate bias only

aggregate bias only

aggregate bias only

aggregate bias only

Flat prior

Full model

Full model

Full model

Full model

Flat prior

Flat prior

Flat prior

tariff Choice with Consumer learning and Switching Costs 647

The top graph plots the share of consumers who are opti-mistic (i.e., have mi0 > mi) conditional on their mi0 exceed-ing the hypothetical cutoff m* that varies along the x-axis.The other three graphs plot the average bias of consumerswith mi0 exceeding the cutoff m* at three points in time:after the preenrollment signal, after the postenrollment sig-nal, and after the first experience signal. To provide a senseof the distribution of consumers’ mi0, we mark the lowestand highest values of mi0 for which plan 2 is chosen(approximately the 55th and 87th percentiles). For compari-son, we also show the biases implied by the conditional-bias-only model (i.e., flat prior) and the aggregate-bias-onlymodel. In this latter model, the share of consumers who areoptimistic and bias level do not vary across beliefs. Formodels with conditional biases, the share of consumers whoare optimistic and bias levels are increasing in mi0 because~spop exceeding spop implies the private signals receivehigher weights.

Essentially, all plan 1 enrollees are optimistic in the twomodels with conditional biases, whereas only 81% are opti-mistic in the model with aggregate biases only. Fewer plan2 enrollees are optimistic: 69% for the full model and 86%with the flat prior. Note that the conditional bias in the fullmodel is sufficiently large to overcome the negative aggre-gate biases (~mpop < mpop) for both plan 1 and plan 2enrollees. The lower three panels in Figure 8 reveal the res-olution of biases as consumers update beliefs with theirpreenrollment signal and their first consumption experience.

Although biases conditional on enrollment plan dissipatequickly in the full model, consumers still require many sig-nals to precisely learn their idiosyncratic mi. After the post -installation signal, the standard deviation of the posterior is3.9, or in dollar equivalents (by dividing by a = .358), $11.After three experience signals, this standard deviation is$7.40 and hits $4.90 after ten uses. The lack of switching,particularly by plan 3 enrollees who order frequently overlong periods, contributes to the estimation of slow learning.

The full model’s discount factor is .974. Figure 9 depictsthe shift in usage due to information acquisition by forward-looking consumers. In Panel A, the week 1 usage probabili-ties for forward-looking consumers exceed the usage proba-bilities for myopic (though otherwise identical) consumersfor all values of week 1 beliefs and subscribed plans. Thedifferences in these usage rates reaches .40, as plotted inPanel B.

The implied price sensitivity of consumers is also evidentin Figure 9: Usage probabilities, holding beliefs fixed, aresubstantially higher for consumers on plan 1 than for con-sumers on the plans with per-use charges.

Given the lack of plan switching, our estimate of d is high.Dividing the estimate of 74.468 utils by .358 dollars per utilimplies a switching cost equivalent to $208. Iyengar,Ansari, and Gupta (2007) estimate even higher “hasslecosts” for switching wireless service plans, and Handel(2010) estimates switching costs exceeding $2,500. As Fig-ure 7, Panel B indicates, we predict that more than half ofplan 1 enrollees switch to plan 3 despite the high switchingcosts.

The last two columns of Table 2 report the means andstandard deviations of the random coefficients. The meansfall between the estimates of the full model and the modelwith only conditional bias, except spost and se are slightly

higher. The standard deviations of a, d, and the s parame-ters are 20%–40% of their respective means.11

CoUNteRFaCtUal SIMUlatIoNS

We conduct two sets of counterfactuals. First, we decom-pose the effects of switching costs and uncertainty on con-sumer behavior, consumer surplus, and the firm’s revenues.Then, we compute optimal menus of two-part tariffs toinvestigate the effectiveness of menus as screening devicesfor experience goods. In each counterfactual, we simulate50,000 consumers over 100 weeks and report discountedrevenue and surplus values in dollars per consumer, assum-ing the firm’s annual discount factor is .9.

Decomposing the effects of Switching Costs andUncertainty on outcomes

Table 3 summarizes consumer behavior and surplus andfirm’s discounted revenues, for various specifications of the

Figure 9inFoRMation aCQUiSition

a: Week 1 Usage by Forward-looking and Myopic Consumers

B: Usage | b = .974 minus Usage | b = 0

1.0

.8

.6

.4

.2

0

Week 1 Prior Mean mi1

–4 –2 0 2 4 6−4 −2 0 2 4 60

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

week 1 prior mean mi1

week 1 usage by forward−looking and myopic consumers

Solid lines: β = .974

dashed lines: β = 0

Plan: 1 2 3

1 2 3

−4 −2 0 2 4 60

0.05

0.1

0.15

0.2

0.25

0.3

0.35

0.4

0.45

week 1 prior mean mi1

usage|β=.974 minus usage|β=0

Plan: 1 2 3

.40

.30

.20

.10

0

Week 1 Prior Mean mi1

–4 –2 0 2 4 6

−4 −2 0 2 4 60

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

week 1 prior mean mi1

week 1 usage by forward−looking and myopic consumers

Solid lines: β = .974

dashed lines: β = 0

Plan: 1 2 3

1 2 3

−4 −2 0 2 4 60

0.05

0.1

0.15

0.2

0.25

0.3

0.35

0.4

0.45

week 1 prior mean mi1

usage|β=.974 minus usage|β=0

Plan: 1 2 3

Solid lines: b = .974

dashed lines: b = 0

11We might expect priors and match-values to depend on consumers’enrollment dates. We therefore allowed “week of enrollment” to linearlyshift mpop, ~mpop, and their respective standard deviations in the random coeffi-cient model (which accepts conditioning variables without adding dynamicprograms to solve). The log-likelihood improves from −50,502.5 to−50,473.3, but the other estimates are virtually unchanged. Given that the ran-dom coefficients specification is primarily a robustness check for our findingthat menus of tariffs fail to increase revenues, we do not report the estimates.

model, when consumers face the actual tariff menu. The firsttwo rows use the full model as estimated. In the long run (i.e.,week 100), the firm receives $.96 per consumer, yielding asteady-state discounted revenue of $476.5 per consumer.The steady-state revenue is generally lower than the dis-counted revenue along the transition path because the firmearns revenues from consumers’ experiential consumption.

A consequence of consumers not knowing the populationdistribution of mi is that their average realized consumer sur-plus (CS) differs from their expected surplus. In the basecase, consumers lost $39.2 on average when they expectedsurplus of $349.12 Given ~mpop < mpop, one might expect real-ized CS to exceed expected CS. However, with uncertainty,expected CS is driven primarily by the possibility that mi ishigh, and the actual option value is lower than the perceivedoption value because spop is lower than ~spop.

The second model in Table 3 removes switching costs bysetting d to zero. Both expected and realized CS increase,though realized CS is still negative because consumersinvest too much in experiential consumption as a result oftheir overestimation of the service’s option value. Firms dis-counted revenues fall dramatically, from $497.2 per con-sumer with d to $185.4 without d, as consumers who enrollin plans with fixed fees, given their conditional biases,switch to the fee-less plan 3 after learning their mi is lowerthan expected.

The third model in Table 3 removes uncertainty aboutmatch-values by setting spre = 0, so consumers fully learnmi from the preenrollment signal. Discounted revenues of$175.8 are lower than the previous model with uncertaintyand d = 0 because revenues are no longer received fromexperimenting consumers. Realized CS equals expected CSand is finally positive, albeit at only $5.60 per consumerbecause few consumers use the service. Plan 1 enrolleeshave CS of a mere $105 on average, or $2.68 per week.

In summary, Table 3 shows that the combination ofuncertainty and switching costs largely drive consumers’behavior and surplus and the firm’s revenues. The combina-tion of biased beliefs (due to either aggregate or conditionalbiases) and high switching costs creates a windfall for firmsoffering tariffs with fixed fees and a costly learning experi-ence for consumers. If consumers’ hassle costs of switchingare small compared with their cost savings from switchingtariffs, firms should consider contracts with termination feesto increase the proportion of consumers who remain onhigh-fee plans after correcting their beliefs. However, theshort-term benefits of using termination fees must beweighed against the potential long-term cost of exploitedconsumers switching to other firms when the contract ends.

Table 3 also reports simulated outcomes for the modelswith no bias, only aggregate bias, and only conditional bias.The specification of initial beliefs, which determines product-level biases, has a significant impact on predicted revenuesand consumer surplus.

Price Discrimination

We evaluate the use of flat-rate tariffs, two-part tariffs,and menus of two-part tariffs to price discriminate when afirm sells an experience good.13 With consumer uncertaintyand switching costs, fixed fees play two roles: They extractsurplus in the traditional (static) sense from inframarginalunits, and they enable firms to capitalize on consumers’biases. The per-use prices determine the degree to whichconsumers learn: On the one hand, high per-use prices dis-courage experimentation, thereby reducing the firm’s long-term profits because some consumers who should use theproduct regularly will never discover this. On the other

648 JoURnal oF MaRKetinG ReSeaRCh, aUGUSt 2011

table 3eFFeCt oF SWitChinG CoStS, UnCeRtainty, and BiaSeS on ChoiCeS, ReVenUe, and SURPlUS

Usage: Initial, Final Revenue DiscountedCS

(Plan Share: Initial, Final)Realized

Model Description Plan 1 Plan 2 Plan 3 (expected)

Full model .712, .579 .324, .071 .148, .015 497.2 –39.2(.129, .058) (.317, .317) (.554, .625) (476.5) (349.0)

No switching costs .949, .941 .N.A., .N.A. .011, .010 185.4 –10.2(full model except d = 0) (1.00, .037) (.000, .000) (.000, .940) (166.1) (395.7)

No uncertainty .935, .935 .N.A., .N.A. .009, .009 175.8 5.6(full model except spre = 0) (.044, .044) (.000, .000) (.956, .956) (175.8) (5.6)

No-biases model .883, .913 .529, .419 .134, .049 1196.6 1415.9(.124, .138) (.320, .320) (.556, .542) (1190.3) (1415.9)

Aggregate-bias-only model .677, .738 .363, .182 .134, .039 721.2 6916.6(.139, .085) (.266, .266) (.595, .649) (704.1) (25780.1)

Conditional-bias-only model .766, .492 .363, .079 .098, .016 552.2 –85.4(.175, .082) (.280, .280) (.545, .638) (530.0) (856.0)

Notes: Values in parentheses correspond to the label in parentheses in the column header. All revenue and surplus values are in dollars per consumer.Weekly bfirm = .997976. Thus, one dollar per week has present value of nearly $500. Revfinal/(1 – bfirm) measures the firm’s steady-state discounted revenueper consumer. All values were generated by simulating 50,000 consumers over 100 weeks with g = 0.

Revfinal

firm( )1− β

12Choosing the outside good every week yields expected discounted util-ity of 22.6 (Euler’s constant divided by 1 – b). With the online grocer avail-able, expected discounted utility, averaged across consumers, is 147.4 utils.Expected CS is therefore (147.4 – 22.6)/a = $349, or $349(1 – b) = $9.07on a weekly basis.

13Firms frequently use two-part tariffs to extract surplus from peoplewho consume multiple units of a given good. In our model, consumerseither use the service once or not at all, each week. Nonetheless, becausethe fee component of the two-part tariff is paid before the consumer’sobserving the idiosyncratic shock, surplus can still be extracted. In essence,the unit of consumption is the probability of using the service. When a firmfaces consumers with (unobserved) heterogeneous preferences, offering amenu of two-part tariffs can induce them to reveal their preferencesthrough their tariff choices.

tariff Choice with Consumer learning and Switching Costs 649

hand, the firm may earn substantial revenue from con-sumers who are willing to pay high prices during the learn-ing period, given their initial beliefs.

We numerically compute optimal tariffs assuming thefirm knows the match-value distribution and all demandparameters. Because we focus on consumer uncertainty andlearning, we do not consider a need for the firm to discoverthe demand curve. We assume the firm maximizes expecteddiscounted revenues, which is consistent with profit maxi-mization when net marginal costs are zero, as suggested byindustry analysts.14

Table 4 presents the optimal prices and fees for varioustariffs when consumers behave according to the estimatedfull model. Given the high switching costs, the optimal flat-fee tariff of F = $5.32 generates more than three times thediscounted revenue of the optimal uniform price of p =

$5.39. In this model, all consumers have the same switchingcosts, so the optimal fixed fee is just low enough that no onequits (i.e., F = [1 – b]d/a). The optimal single two-part tar-iff, with F = $1.98 and p = $4.09, yields 20% more revenuethan the optimal flat-rate tariff. Adding a second two-parttariff, however, fails to increase revenues.

We also compute optimal tariffs for the random coeffi-cients model to assess whether richer consumer heterogene-ity leads to revenue gains with tariff menus. Table 5 presentsthe results. Although the exact fees and prices differ fromthose derived from the full model, the broad implicationsare the same: A fixed fee, either by itself or as part of a two-part tariff, enables the firm to earn approximately four timesmore than with uniform pricing, and menus offer little to nogain over the optimal two-part tariff. The revenue gain fromthe second two-part tariff is less than .1%.

In our application, menus of two-part tariffs are unable tosegment consumers without letting high-usage consumersretain too much surplus: The incentive compatibility con-straints are too costly to satisfy. Courty and Hao (2000) alsofind that segmenting consumers fails to increase profitswhen ex ante consumer heterogeneity is high.

CoNClUSIoN

We show that aggregate uncertainty about a new prod-uct’s quality yields posteriors that exhibit a phenomenon

14In addition to the revenue from the tariffs, the online grocer received akickback from the partner chain of 15% of each grocery order. The averagekickback was nearly $18 per order because the average order size was $119.We do not observe the firm’s costs of delivering groceries. An industry ana-lyst who estimates “picking and delivery costs” for several online grocersestimates that our firm’s costs were $25.41 per delivery (Wellman 1999).Marginal costs are presumably lower than this average cost because thedelivery truck is already delivering orders to other customers. For simplic-ity, we treat the kickback amount as exactly offsetting the marginal costs ofdelivery and instead maximize discounted revenues from delivery tariffs.

table 4oPtiMal taRiFFS: FUll Model

Usage: Initial, Final Revenue DiscountedCS

(Plan Share: Initial, Final)Realized

tariff Description Plan 1 Plan 2 Plan 3 (expected)

F1 = 5.32, p1 = 0 .662, .332 858.0 –49.8(flat-fee tariff) (.257, .327) (858.0) (345.3)

F3 = 0, p3 = 5.39 .331, .091 255.4 –2.0(per-use tariff) (1.0, 1.0) (241.5) (404.2)

F2 = 1.98, p2 = 4.09 .398, .126 1029.8 –59.6(one two-part tariff) (.814, .826) (1020.2) (352.5)

Notes: Values in parentheses correspond to the label in parentheses in the column header. All revenue and surplus values are in dollars per consumer.Weekly bfirm = .997976. Thus, one dollar per week has present value of nearly $500. Revfinal/(1 – bfirm) measures the firm’s steady-state value. All valueswere generated by simulating 50,000 consumers over 100 weeks with g = 0 for 100. For ease of comparison with the actual menu, single tariffs appear under“Plan 1” if a flat rate, under “Plan 2” if a two-part tariff, and under “Plan 3” if a uniform price. Menus of two-part tariffs offered no advantage over the singletwo-part tariff. In each simulation, consumers can also choose to not enroll in any plan.

Revfinal

firm( )1− β

table 5oPtiMal taRiFFS: RandoM CoeFFiCientS Model

Usage: Initial, Final Revenue DiscountedCS

(Plan Share: Initial, Final)Realized

tariff Description Plan 1 Plan 2 Plan 3 (expected)

F1 = 4.25, p1 = 0 .622, .208 883.6 –88.4(flat-fee tariff) (.403, .420) (882.2) (662.26)

F3 = 0, p3 = 8.35 .261, .048 220.2 –20.0(per-use tariff) (1.0 , 1.0) (199.2) (645.0)

F2 = 4.25, p2 = 1.35 .604, .183 890.8 –89.6(one two-part tariff) (.385, .399) (886.4) (636.7)

F1 = 4.25, p1 = 1.12 .621, .194 .447, .117 891.4 –89.5F2 = 4.08, p2 = 1.55 (.352, .366) (.037, .037) (887.4) (640.9)

Notes: Values in parentheses correspond to the label in parentheses in the column header. All revenue and surplus values are in dollars per consumer.Weekly bfirm = .997976. Thus, one dollar per week has present value of nearly $500. Revfinal/(1 – bfirm) measures the rm's steady-state value. Values gener-ated by simulating 5000 consumers over 100 weeks with g = 0 for 100 draws of q. For ease of comparison with the actual menu, single tariffs appear under“Plan 1” if a flat rate, under “Plan 2” if a two-part tariff, and under “Plan 3” if a uniform price. An additional ex post only tariff offered no advantage over twotwo-part tariffs. In each simulation, consumers can also choose to not enroll in any plan.

Revfinal

firm( )1− β

similar to the winner’s curse: Consumers with high posteri-ors tend to overestimate the product’s utility. This condi-tional bias (i.e., bias conditional on the consumer’s poste-rior) can explain consumers’ tariff and usage choices fornew products without invoking violations of rationality orexplicit preferences for particular types of tariffs.

Either conditional biases or positive aggregate biases(which affect all consumers equally) can explain why con-sumers on flat-rate plans often fail to use the productenough to justify the fixed fee. Our estimation results indi-cate that for the online grocer, the aggregate bias was in factnegative. DellaVigna and Malmendier’s (2006) finding thatbiased beliefs best explain consumers’ contract choices forhealth clubs suggests our results may be relevant for otherproducts as well. Even when consumers are pessimistic ingeneral, those with high posteriors can have positive biases.

However, our data cannot distinguish between the rationalmodel of the flat-rate bias and alternative behavioral expla-nations. For example, with switching costs, consumers maycommit to future use of a health club by enrolling in a flat-rate tariff with no per-use charge. We leave such distinctionsto future work. Although we have focused on tariff choice,our insights about beliefs and choice are potentially relevantfor all new products. In particular, the rational model ofconditional bias implies that most new products will fail tofulfill buyers’ expectations, even when expectations are cor-rect on average across products.

aPPeNDIx

In this Appendix, we derive the likelihood function, tak-ing into account the possibility of exogenous quits, and dis-cuss our use of importance-sampling to estimate randomcoefficients.

the likelihood Function

Because consumers’ match-values and beliefs are notobserved, we integrate over mi, ei, pre , ei, post, and eit. Let qdenote the vector of parameters to estimate, which will be asubset of (mpop, ~mpop, spop, ~spop, spre, spost, se, b, a, d),depending on the particular specification.

For each draw of unobservables over a consumer’s entirehistory, we compute the likelihood of the observed sequenceof cit and sit over the Ti weeks between the consumer’senrollment and the end of the sample. Because sit is unob-served in weeks after the last usage, the likelihood for thesetrailing weeks is based only on the observed cit = 0. Let ti –1 be the ith consumer’s last week with cit = 1 (so ti is thefirst week with censored s). Integrating over the unobservedmatch-value and signals, the likelihood for consumer i isthen

( ) ( ) [ | ( , ), ; ]A L Pr s m Pi i i i i i

t

T

i

i

1 0 0 0θ ε µ σ θτ

=

=∏∫∫

∏=

−Pr s m s

Pr c

t

it it i i it i t

it

i

1

1

1

τ

ε µ σ θ[ | ( , ), , ; ]

[

,

|| ( , ), , ; ]

[ | ( , )

m s

Pr s m

it i i it it

s

i i i i

i i

i i

ε µ σ θ

ε µτ

τ τ∑ ,, , ; ]

[ | ( , ), ,

σ θ

ε µ σ

τ τ

τ

i i

t

T

it it i i it i

i i

i

i

s

Pr c m s

=∏

1

ττ θ ε µ θ; ] ( , ; ),

Φ d di i

where mit (ei, mi) makes explicit the dependence of beliefs onthe unobserved match-value mi and signals ei = (ei, pre, ei, post,{eit}t = 1

ti – 1). The first line of this likelihood entails the proba-bility of the weeks from enrollment to the last usage,whereas the second line entails the likelihood for the trail-ing weeks after the last usage. The summation in the lasttwo lines integrates over the censored tariff choice sit afterthe last usage. Because sit only changes after signals arereceived, sit = sit for t ≥ t.

As Miller (1984) and Rust (1987) show, Pr(cit | mit, sit,sit; q) has the familiar logit formula. Net of the idiosyncraticutility shock uit, the value of choosing cit = 0 is

and the value of choosing cit = 1 is

We use the standard normalizations to fix the scale and levelof utility and obtain

The plan choice sit is deterministic given beliefs mitand sit. That is, Pr(sit | mit, sit, si, t – 1; q) equals one if sit isoptimal given (mit, sit, si, t – 1) and equals zero otherwise.Mathematically,

(A5) Pr(sit | mit, sit, si, t – 1; q = �{sit = s(mit, sit, si,t–1; q)},

where s(mit, sit, si, t – 1; q) ∫ argmaxs{V (mit, sit, s) – aFs –d�(s ≠ sit)} denotes the optimal tariff that solves the maxi-mization embedded in the continuation value integrands ofboth Equations 8 and 5. Although this choice is determinis-tic given beliefs, from the econometrician’s perspective, sitis probabilistic because beliefs are unobserved.

Now consider the integration over the censored planchoice for the trailing weeks t ≥ ti. Beliefs are fixed afterthe last usage because no more signals are received. Thus,the censored plan is s(miti

, siti, siti – 1; q). Integration over

the censored plan is therefore automatically handled by theintegration over unobserved beliefs.

We use Monte Carlo simulation with 2000 draws to evalu-ate Li(q) for each consumer and obtain our estimator bymaximizing the product of the consumers’ simulated likeli-hoods, using the nested fixed-point algorithm of Rust(1987). We solve the dynamic model numerically usingGauss-Hermite quadrature (Judd 1998) to integrate over mit.We discretize mit and use linear interpolation to evaluate Vat points off the grid. The posterior variance is a determinis-

( )

max{ [ ( ,,

,

A V m p

V m m

it it s

si t it it

it

i t

3 1

11

= −

++

∫ +

α

β σ ,, ), ( ), ]

(

, ,

,,

µ σ σ

α δ

it i t it i t

s i t

s

F s si t

+ +

+− − ≠+

1 1

11( iit it it itd m)} ( | , ).Φ µ σ

( ) ( | , , ; )exp( )

exp( )A Pr c m s

V

Vit it it it

c it

it

it40

σ θ =+ eexp( )

.V it1

( ) {max ( , , ),

,,A V V m s Fits

it it i t si t

i t2 0 1

11

= − −+

++β σ α δδ(( )},s si t it+ ≠1

( ) ( ) [ | ( , ), ; ]A L Pr s m Pi i i i i i

t

T

i

i

1 0 0 0θ ε µ σ θτ

=

=∏∫∫

∏=

−Pr s m s

Pr c

t

it it i i it i t

it

i

1

1

1

τ

ε µ σ θ[ | ( , ), , ; ]

[

,

|| ( , ), , ; ]

[ | ( , )

m s

Pr s m

it i i it it

s

i i i i

i i

i i

ε µ σ θ

ε µτ

τ τ∑ ,, , ; ]

[ | ( , ), ,

σ θ

ε µ σ

τ τ

τ

i i

t

T

it it i i it i

i i

i

i

s

Pr c m s

=∏

1

ττ θ ε µ θ; ] ( , ; ),

Φ d di i

650 JoURnal oF MaRKetinG ReSeaRCh, aUGUSt 2011

tariff Choice with Consumer learning and Switching Costs 651

tic function of the number of experience signals processed,which we allow to vary from 0 to 99.

Hajivassiliou and Ruud (1994) show that simulated maxi-mum likelihood yields an inconsistent estimator for a fixednumber of draws. To increase the efficiency of our simula-tion estimator, we draw experience signals from the distri-bution for which the observed plan choice is indeed optimaland reweight the likelihood by the probability mass of thistruncated normal. The sampling scheme is a nonstandardapplication of the Geweke-Hajivassiliou-Keane importancesampler. As a result of switching costs, the set of experiencesignals for which a given plan is optimal may contain twononcontiguous regions. In the absence of this samplingscheme, the contribution to the likelihood of many of thedraws of (mi, ei) would be zero because of a zero probabilityof the tariff choice in Equation A5, which is deterministicgiven beliefs. Sampling from truncated normals andreweighting also yields a smooth simulated likelihood func-tion because the weights are smooth functions of theparameters, which aids in the numerical optimization andcomputing of standard errors.

exogenous Quits

Many households use the online grocer regularly over along period and suddenly stop. Although such behavior canindicate slow learning, it may also reflect permanent house-hold shocks, such as moving, marriage, divorce, childbirth,retirement, and so on. To account for such shocks, weassume each household exogenously quits in each periodwith probability g. We assume consumers are unaware ofthese possible shocks. To account for exogenous quits in thelikelihood, multiply each Pr[cit | mit(ei, mi), sit, sit; q] by 1 −g in the third line of Equation A1, and replace PTi

t = tiPr[cit |

mit(ei, mi), sit, sit; q] in the last line with

Each trailing week’s inactivity may be caused by eitheran exogenous quit in the current period or a prior period orchoosing the traditional store despite still being a subscriber.When the consumer exogenously quits, however, the subse-quent periods of inactivity no longer represent observations.For example, if Ti = 52 and ti = 51, the probability of thesetrailing weeks is the sum of the probability he or she exoge-nously quits at t = 51, the probability he or she chooses nousage at t = 51 and exogenously quits at t = 52, and theprobability he or she chooses no usage at both t = 51 and t =52. Equation A6 sums the probabilities of these differentexplanations for the trailing weeks of nonusage.

We fix g = .003 based on the frequency of household relo-cations in our market, which implies a .145 annualizedexogenous quit rate. We experimented with estimating g, butits estimate of .019 (which implies an annualized rate of.63) seems implausibly high. The higher g increases theswitching costs estimate because (censored) endogenousquits are less important for explaining the trailing weeks ofnonusage when g is high. If the data contained observedswitches, we would be more inclined to estimate g.

( ) ( ) [ | ( , ), ,( )A Pr c m sI t T

t

T

it it i i iti

i

i

6 1γ γ ε µ στ

<

=∑ − ii

t iτ

τθ; ] .{ } −

Parameter Heterogeneity

We use the importance sampling methodology of Acker-berg (2009) to vary q across consumers. Integrating overrandom coefficients typically involves averaging Li(qi) overmany draws of qi for each consumer, which leads to aninfeasible number of fixed points to compute because eachqi for each i requires an associated V. Furthermore, these Vmust be recomputed each time the likelihood function iscalled during the nonlinear maximization over distributionsof random coefficients. Ackerberg suggests computing andretaining Li(qi) for a set of qi. The likelihood under an alter-native distribution of random coefficients is obtained not byredrawing qi from this new distribution and recomputingconditional likelihoods, but by changing the weights in theaveraging of the retained Li(qi).

To be more precise, let g(q | r) be the probability densityfunction of random coefficients parameterized by r, and leth(q) be an arbitrary distribution (independent of r). Then,

We draw (qi1, ..., qi

NS) from h and compute the simulatedlikelihood:

We center the initial h on estimates without random coeffi-cients and use a high variance so the reweighting (with h inthe denominator) does not explode. We then iterate by settingh to the previous iteration’s g until the estimated r implies gis similar to h. We impose restrictions such as spop > 0 byusing truncated normals for g and h. For each consumer, weuse 50 draws of the parameters that affect the consumer’sdynamic program (i.e., ~spop, spre, spost, se, b, a, d). Foreach vector of these dynamic parameters, we solve thedynamic program and evaluate the likelihood for 100 drawsof (mi,

~mpop, ei).

REFERENCES

Ackerberg, Daniel (2003), “Advertising, Learning, and ConsumerChoice in Experience Good Markets: An Empirical Examina-tion,” International economic Review, 44 (3), 1007–1040.

——— (2009), “A New Use of Importance Sampling to ReduceComputational Burden in Simulation Estimation,” QuantitativeMarketing and economics, 7 (4), 343–76.

Clay, Karen, David Sibley, and Padmanabhan Srinagesh (1992),“Ex Post vs. Ex Ante Pricing: Optional Calling Plans andTapered Tariffs,” Journal of Regulatory economics, 4 (2),115–38.

Conlin, Michael, Ted O’Donoghue, and Timothy Vogelsang(2007), “Projection Bias in Catalog Orders,” american eco-nomic Review, 97 (4), 1217–49.

Coscelli, Andrea and Matthew Shum (2004), “An EmpiricalModel of Learning and Patient Spillovers in New Drug Entry,”Journal of econometrics, 122 (2), 213–46.

Courty, Pascal and Li Hao (2000), “Sequential Screening,” Reviewof economic Studies, 67 (4), 697–717.

Crawford, Gregory and Matthew Shum (2005), “Uncertainty andLearning in Pharmaceutical Demand,” econometrica, 73 (4),1137–74.

DeGroot, Morris (1970), optimal Statistical Decisions. New York:McGraw-Hill.

( ) ( ) ( )( | )

( )A L

NSL

g

hiNS

i

ns

NS

ins i

ns

ins

81

1

% ρ θθ ρ

θ=

=∑ ..

( ) ( ) ( ) ( | ) ( )( | )

( )A L L g d L

g

hi i i i i i i

i

i

7 ρ θ θ ρ θ θ θ ρθ

= =∫ ∫ hh di i( ) .θ θ

DellaVigna, Stefano and Ulrike Malmendier (2006), “Paying Notto Go to the Gym,” american economic Review, 96 (3),694–719.

Eckstein, Zvi, Dan Horsky, and Yoel Raban (1988), “An EmpiricalDynamic Model of Optimal Brand Choice,” working paper,Eitan Berglas School of Economics, Tel-Aviv University.

Erdem, Tulin and Michael Keane (1996), “Decision-MakingUnder Uncertainty: Capturing Dynamic Brand Choice Pro-cesses in Turbulent Consumer Goods Markets,” Marketing Sci-ence, 15 (1), 1–20.

Grubb, Michael (2009), “Selling to Overconfident Consumers,”american economic Review, 99 (5), 1770–1807.

Hajivassiliou, Vassilis and Paul A. Ruud (1994), “Classical Esti-mation Methods for LDV Models Using Simulation,” in Hand-book of econometrics, Vol. 4, R.F. Engel and D.L. McFadden,eds. Amsterdam: Elsevier Science, 2384–2441.

Handel, Ben (2010), “Adverse Selection and Switching Costs inHealth Insurance Markets: When Nudging Hurts,” workingpaper, Department of Economics, Northwestern University.

Iyengar, Raghuram, Asim Ansari, and Sunil Gupta (2007), “AModel of Consumer Learning for Service Quality and Usage,”Journal of Marketing Research, 44 (November), 529–44.

Judd, Kenneth L. (1998), Numerical Methods in economics. Cam-bridge, MA: MIT Press.

Lambrecht, Anja, Katja Seim, and Bernd Skiera (2007), “DoesUncertainty Matter? Consumer Behavior Under Three-Part Tar-iffs,” Marketing Science, 26 (5), 698–710.

——— and Bernd Skiera (2006), “Paying Too Much and BeingHappy About It: Existence, Causes, and Consequences of Tariff-Choice Biases,” Journal of Marketing Research, 43 (May),212–23.

Loewenstein, George, Ted O’Donoghue, and Matthew Rabin(2003), “Projection Bias in Predicting Future Utility,” theQuarterly Journal of economics, 118 (4), 1209–1248.

Lovett, Mitchell, William Boulding, and Richard Staelin (2009),“Consumer Learning Models for Perceived and Actual ProductInstability,” working paper, Simon Graduate School of Busi-ness, University of Rochester.

Miller, Robert A. (1984), “Job Matching and OccupationalChoice,” Journal of Political economy, 92 (6), 1086–1120.

Miravete, Eugenio (1996), “Screening Consumers Through Alter-native Pricing Mechanisms,” Journal of Regulatory economics,9 (2), 111–32.

——— (2002), “Estimating Demand for Local Telephone Servicewith Asymmetric Information and Optional Calling Plans,” theReview of economic Studies, 69 (4), 943–71.

——— (2003), “Choosing the Wrong Calling Plan? Ignorance andLearning,” american economic Review, 93 (1), 297–310.

Narayanan, Sridhar, Pradeep Chintagunta, and Eugenio Miravete(2007), “The Role of Self Selection, Usage Uncertainty, andLearning in the Demand for Local Telephone Service,” Quanti-tative Marketing and economics, 5 (1), 1–34.

Nelson, Philip (1970), “Information and Consumer Behavior,”Journal of Political economy, 78 (2), 311–29.

Rust, John (1987), “Optimal Replacement of GMC Bus Engines:An Empirical Model of Harold Zurcher,” econometrica, 55 (5),999–1033.

——— (1994), “Structural Estimation of Markov Decision Pro-cesses,” in Handbook of econometrics, Vol. 4, R.F. Engle andD.L. McFadden, eds. Amsterdam: North-Holland, 3081–3143.

Wellman, D. (1999), “Are We On?” Supermarket Business, 54(12), 35.

652 JoURnal oF MaRKetinG ReSeaRCh, aUGUSt 2011