The Impact of Ebook Distribution on Print Sales: Analysis...
-
Upload
trinhthien -
Category
Documents
-
view
213 -
download
0
Transcript of The Impact of Ebook Distribution on Print Sales: Analysis...
Electronic copy available at: http://ssrn.com/abstract=1966115
The Impact of Ebook Distribution on Print Sales: Analysis of a Natural Experiment
Yu (Jeffrey) Hu‡ and Michael D. Smith†
[email protected], [email protected]
This Version: August 2011
Acknowledgements: The authors thank an anonymous publisher for generously providing data to support this research. Smith acknowledges generous support from Carnegie Mellon’s iLab. ‡ Krannert School of Management, Purdue University, West Lafayette, IN, 47907 † School of Information Systems and Management, Heinz College, Carnegie Mellon University,
Pittsburgh, PA, 15213.
Electronic copy available at: http://ssrn.com/abstract=1966115
1
The Impact of Ebook Distribution on Print Sales: Analysis of a Natural Experiment
ABSTRACT
Digital distribution channels introduce several new strategic questions for the creative industries. Two notable questions are (1) how will participating in a digital channels impact physical sales, and (2) where should digital product releases occur relative to existing “physical” release dates. These questions are particularly salient in the publishing industry, where publishers have expressed concern that the early release of ebooks may cannibalize high margin hardcover sales. We analyze the impact of digital channels on physical sales using a natural experiment that occurred between April and June 2010. We find that delaying the release of ebooks causes in an insignificant change in overall hardcover sales but a significant decrease in ebook sales, total sales, and likely total revenue and profit to the publisher. Together our results suggest that channel (platform) choice is more important than product choice in the minds of the majority of ebook consumers. These results add to a growing academic literature analyzing the impact of digitization on marketing strategies in the media industries. Our results also highlight the usefulness of natural experiments derived from a rapidly changing media distribution environment for identifying the impact of digital distribution channels on sales. Keywords: Digital distribution, channel, publishing industry, natural experiment.
2
1. Introduction
Digital distribution channels introduce a variety of new challenges and opportunities for the
creative industries. Ebooks represent one prominent (and understudied) setting where these
challenges are on display. The last few years have witnessed significant growth in ebook
platform adoption, and an associated growth in ebook sales. The Association of American
Publishers reports that, in the ten months between January and October 2010, ebook sales grew
by 171 percent reaching $345 million, or 9 percent of total trade book sales (Association of
American Publishers 2010). Similarly, Amazon.com, the largest online seller of print and digital
books, reported that Kindle sales surpassed Amazon’s total hardcover sales in July 2010, and
surpassed total print sales as of April 1, 2011 (Amazon.com 2011).
As the popularity of ebooks grows, it is increasingly important for academics and practitioners to
understand the dynamics of consumer behavior when facing dual print and digital distribution
channels for books. The answers to these questions will shape the future of the book publishing
industry as it adjusts to the new digital channels, and will help book publishers and book retailers
when making key marketing decisions in these changing environments.
Currently, the book industry is engaged in an active debate about how the use of ebook channels
will impact print sales. On one side of the debate are ebook platform providers, such as Amazon,
who claim that ebooks do not cannibalize print sales, but rather that ebook sales are mostly
incremental. For example, Jeff Bezos, CEO of Amazon, observed in an earnings call with
analysts (SeekingAlpha 2009),
“When people buy a Kindle, they actually continue to buy the same number of
physical books going forward as they did before they owned a Kindle. And then
3
incrementally, they buy about 1.6 to 1.7 electronic books, Kindle books, for every
physical book that they buy.”
On the other side of the debate many book publishers worry that because print books and ebooks
offer essentially the same content in different formats, it is natural to think that consumption of
ebooks would cannibalize sales of print books, particularly those that are priced significantly
higher than their ebook counterparts. As Carolyn Reidy, CEO of Simon & Schuster, represented
this view in a New York Times article (Rich 2009),
“What a (book) consumer is buying is the content, not necessarily the format.”
In 2009 and 2010 these sorts of cannibalization concerns led many publishers to delay the release
ebook releases in the hopes of not cannibalizing hardcover sales. Table 1 reviews several
announcements that publishers made regarding their decisions to delay ebook releases.
Table 1: Delaying Ebooks Event Quote
September 2009: HarperCollins delays the ebook release of Sarah Palin’s memoirs by 5 months after the hardcover release date.
"The publishing plan is focused on maximizing velocity of the hardcover before Christmas."
Brian Murray, CEO HarperCollins November 2009: Viacom/Schribner delays the ebook release of Stephen King’s new novel by 6 weeks after the hardcover release date.
"We think that this publishing sequence gives us the opportunity to maximize hardcover sales"
Adam Rothberg, Spokesperson Schribner Early 2010: Hachette Book Group delays the ebook release of nearly all of its titles by 3-4 months after the hardcover release date.
"I can't sit back and watch years of building authors sold off at bargain-basement prices."
David Young, CEO Hachette Early 2010: Simon & Schuster delays ebook release for 35 major titles by 4 months after the hardcover release date.
"The right place for the e-book is after the hardcover but before the paperback,"
Carolyn Reidy, CEO Simon & Schuster
4
However, discussions we have had with publishers suggest that these decisions have been made
with little empirical analysis. In this paper we attempt to provide empirical evidence regarding
how delaying the release of ebooks impacts hardcover sales, ebook sales, and total sales. To do
this we use a novel natural experiment where, over the course of a 2-month period, a particular
publisher stopped releasing new Kindle ebooks to Amazon, but continued to release new
hardcover books to Amazon. This event, which we describe in more detail below, creates an
environment where ebook release dates are exogenously delayed by between 1 to 8 weeks for a
relatively homogeneous set of titles. This event provides a unique opportunity for studying the
cross-channel effect between ebooks and print books.
Our analyses of this event suggest that delaying ebook release dates relative to their hardcover
counterparts results in an insignificant increase in hardcover sales, but results in a large decrease
in ebook sales, total sales, and likely to total revenue and profit for the publisher. We also find
evidence that the cross-channel effect between ebooks and print books is moderated by the
strength of a book’s popularity. For popular books, delaying ebook release dates leads to a
significant substitution toward print books. In contrast, for niche books, that do not have strong
brand awareness among consumers, we find an insignificant substitution toward print books
when ebook release dates are delayed.
We believe these results make three main contributions. First, these results inform an important
managerial question currently facing the book industry. Second, these results add to a growing
academic literature analyzing the impact of digital distribution channels on existing sales
channels and analyzing optimal marketing strategies for firms responding to the availability of
new digital distribution channels. Third, these results provide evidence that consumers can form
strong channel or platform preferences between physical and digital products.
5
2. Literature Review
Our research is related to a large marketing literature analyzing cannibalization and release
timing in media channels (e.g., Lehman and Weinberg 2000, Luan and Sudhir 2006, and Prasad
et al. 2004). Within this literature, or research is most closely related to a growing literature
analyzing the degree to which legitimate digital distribution channels impact sales in physical
channels. In this literature, Balasubramanian (1998) uses an analytic model to analyze how the
match between product characteristics and a new digital channel impacts sales. In an empirical
context, Deleersnyder et al. (2002) find that the introduction of online newspapers resulted in a
relatively small cannibalization of physical newspaper sales, Biyalogorsky and Naik (2003) find
that the introduction of online storefronts for music did not significantly cannibalize physical
record sales, Waldfogel (2007) shows that Youtube viewing has only a small negative impact on
television viewing, and Danaher et al. (2011) show that the presence of the iTunes distribution
channel has no statistical impact on DVD sales (but results in a large reduction in digital piracy).
There is also a long and related literature on the impact of digital piracy on sales in existing
channels (see Oberholzer-Gee and Strumpf 2010 for a review of this literature), in general
finding that piracy harms physical sales (see for example Zentner (2005), Hui and Png (2003),
Peitz and Waelbroeck (2004), Danaher and Waldfogel (2011), Rob and Waldfogel (2006), and
Rob and Waldfogel (2007)). However, a few notable exceptions to this are Oberholzer and
Strumpf (2007) who find little displacement of record sales from digital music piracy and Smith
and Telang (2009) who find little displacement of DVD sales by digital movie piracy for catalog
(i.e., older) content.
6
Our research differs from these papers in that we study cannibalization between digital and
physical channels in the context of the timing of releases between channels — a current
managerial question for several of the creative industries. In addition, our research informs the
literature on cross-channel effects by analyzing a novel natural experiment in the understudied
(and increasingly important) electronic publishing industry.
In addition to this literature, our findings regarding how the cross-channel effect is moderated by
product popularity are related to results in Simester at al. (2009), which show that the strength of
advertising effects vary based on brand awareness among consumers, and to Frambach, Roest,
and Krishnan (2007) who show that consumers have higher preferences for the online purchasing
channel when they have more favorable Internet experience.
Finally, our work relates to a literature studying online sales of books. In this literature, Ghose,
Smith, and Telang (2006) show that Internet channels for used books result in a relatively small
cannibalization of used book sales, Chevalier and Goolsbee (2003) show that the cross price
elasticity between books sold at Amazon and Barnes and Noble is relatively low, and Kannan,
Pope, and Jain (2009) analyze optimal pricing strategies for dual channel — print and ebook —
publishers.
3. Theory
Theory does not clearly predict the cross-channel effect between ebooks and print books. On one
hand, ebooks and print books offer exactly the same content. When the same product is offered
in two different channels, one would generally expect that consumers will switch easily between
the two channels, and as a result, that these two channels would be strong substitutes for each
other. Similar arguments for the existence of cross-channel substitution can be traced to
7
Balasubramanian (1998)’s analytic model, and have been empirically validated in the context of
online newspaper vs. traditional newspaper (e.g., Deleersnyder et al. 2002), Internet commerce
vs. brick-and-mortar commerce (e.g., Simester at al. 2008, Brynjolfsson, Hu, and Rahman 2009),
and Internet advertising vs. offline advertising (e.g., Goldfarb and Tucker 2011a, 2011b). If these
findings were to hold in the context of ebooks vs. print books, we would expect to strong
substitution toward print sales when Kindle books are released later than their print counterparts.
On the other hand, there are reasons to believe that some of the unique features of the ebook
format can differentiate ebooks from print books. For instance, ebooks provide the capability of
searching, changing font sizes, highlighting, and storing many books on a single device. In
addition, marketing theory shows that markets can be segmented based on customers’ channel
preferences (Kotler 2002). Thus, it is conceivable that ebook consumers are a relatively distinct
market segment from consumers who buy print books. Together, these arguments would suggest
that there will not be strong cross-channel substitution between ebooks and print books, and thus
delaying Kindle releases may not lead to an increase in print sales.
These two theoretical views about the cross-channel effect between ebooks and print books
parallel the ongoing debate in the industry about purchasing behaviors of book consumers. Do
book consumers choose content first and channel later, or do book consumers choose channel
first and content later? As seen in the quotes above, many book publishers hold the view that the
content matters and book consumers typically choose content first. Thus, a natural conclusion is
that there exists substitution between ebooks and print books.
In contrast, some industry observers have expressed the opposite view: that book consumers may
choose channel first. For instance, Lowe (2009) commented,
8
“Consumers who want ebooks now go to dedicated ebook sites to browse and buy.
This type of purchasing behavior is possible because ebook distributors have
amassed enough content to satisfy the majority of ebook consumers. Amazon, the
most popular ebook distributor, has over 300,000 titles and nearly all of the titles on
the New York Times Best Seller list. Individuals buying from these distributors (and
especially those who have purchased a dedicated e-reading device) have chosen their
desired format before choosing the content.”
Previous research has demonstrated that as consumers gain more experience with a channel, their
preference for that channel can grow (Frambach, Roest, and Krishnan 2007). Thus, it is possible
that consumers’ strong channel preference can lead to a behavior of choosing channel first.
We also note that a consumers’ behavior of choosing content or channel will likely depend on
the popularity of the book in the market. If a book has strong popularity among consumers,
consumers’ behavior of searching for alternatives will be shaped by their prior knowledge of the
focal product (Engel et al. 1990). They may choose the content first and search for a channel
offering the content. On the other hand, for less popular products consumers may be more likely
to search for alternatives in their preferred channel.
In summary, theory alone cannot predict the direction of the cross-channel effect between ebooks
and print books. Ultimately this is an empirical question. In this paper, we test whether there
exists a cross-channel effect between ebook sales and print book sales and whether this cross-
channel effect varies across books as a function of popularity.
9
4. Data and Description of Natural Experiment
An ideal starting point for understanding how delayed release dates for ebooks affect sales would
be a pure experiment where a publisher randomly assigns books to treatment groups with
different amounts of delay. Lacking this data, our research employs a natural experiment that
approximates this ideal setup.
Specifically, we use data from a publisher that stopped distributing new Kindle titles to Amazon
from April 1, 2010 and June 1, 2010. Prior to April 1, the policy of this publisher was to release
Kindle titles on the same day that a book is initially released in print (typically in a hardcover
version). Starting on April 1, as a result of a dispute with Amazon, this publisher stopped
releasing new Kindle titles to Amazon, while still releasing new print titles. On June 1, 2010 the
publisher returned to the Kindle store, releasing all Kindle titles that had been released in print
from April 1 through May 31 and returning to their previous policy of releasing Kindle ebooks
on the same day as the print (hardcover) version was released.
Table 1: Kindle Release “Experiment” Week(s) Print Release Kindle Release Kindle Delay
Before April 1, 2010 Print and Kindle titles released same day 0 weeks Week of April 4 April 4 June 1 8 weeks Week of April 11 April 11 June 1 7 weeks Week of April 18 April 18 June 1 6 weeks Week of April 25 April 25 June 1 5 weeks Week of May 2 May 2 June 1 4 weeks Week of May 9 May 9 June 1 3 weeks Week of May 16 May 16 June 1 2 weeks Week of May 23 May 23 June 1 1 week After June 1, 2010 Print and Kindle titles released same day 0 weeks
This results in the release schedule of Kindle and print titles shown in Table 1. Notably for our
purposes, if the titles released in April and May are statistically similar to those released in
10
March and June, this creates a “natural experiment” where a sample of Kindle titles are delayed
by between 1 to 8 weeks relative to their print counterparts.
To analyze the results of this “experiment,” we obtained data from the publisher in question,
covering print and Kindle sales, prices, and release dates for all books released from March 1,
2010 through June 30, 2010. One significant confounding factor in our experiment is that
Apple’s iBookstore opened on April 3 and included content from this publisher. So, to take this
event into account, we also obtained this publisher’s sales on the iBookstore for the titles in our
data.
We then divide this sample into two groups as follows:
1. “Control” Group: books that are released from March 1 to March 31 and books released
from June 1 to June 30. Books in this group are released simultaneously in both print and
Kindle formats.
2. “Experiment” Group: books that are released from April 1 to May 31. Books in this
group are released first in print and then one to eight weeks later (on June 1) released in
Kindle format.
Overall, we have 84 “control” titles and 103 “experiment” titles. After removing outliers: books
that have unusually large sales (average weekly sales higher than 1,600 copies), we are left with
79 “control” titles and 99 “experiment” titles.1
1 Our results are robust to including these high sales titles. Removing them allows us to maintain a more homogeneous group of “experiment” and “control” titles, and to partially ensure that our results are not driven by outliers.
11
4.1. Comparing The Control and Experiment Groups
For this to be a valid setting for a natural experiment we first must establish that books in the
control group have a similar profile to books in the experiment group on observable dimensions
— which in our case include the list price, number of pages, weight, and physical dimensions of
the books.2 Summary statistics on these dimensions are provided in Table 2 separately for books
in the control group and books in the experiment group.
Table 2: Summary Statistics
Mean S.D. Min Max Median Obs.
“Control” Group
List Price ($) 26.43 2.82 22.95 45.00 25.95 79 Number of Pages 351.25 129.45 224.00 1184.00 304.00 79 Weight (pound) 1.18 0.40 0.65 3.55 1.10 79 Height (inch) 1.32 0.26 1.00 2.50 1.26 79 Length (inch) 8.81 0.50 6.90 10.00 9.00 79 Width (inch) 6.04 0.38 5.20 7.80 6.10 79
“Experiment” Group
List Price ($) 25.88 1.98 21.95 37.95 25.95 99 Number of Pages 339.56 88.90 192.00 704.00 320.00 99 Weight (pound) 1.13 0.32 0.25 2.25 1.15 99 Height (inch) 1.32 0.25 0.80 2.30 1.42 99 Length (inch) 8.91 0.45 7.10 10.60 9.06 99 Width (inch) 6.11 0.36 5.10 8.00 6.20 99
We first run t-tests to examine whether the books in the control group and the books in the
experiment group have the same mean values for each of the observable dimensions. The
resulting t-stats are 1.54 for list price, 0.71 for number of pages, 0.80 for weight, -0.22 for height,
-1.59 for length, and -1.33 for width. Each of these t-stats are insignificant at the 5% confidence
level. This suggests that books in the control group are materially similar to — and thus a
reasonable control for — books in the experiment group.
2 Recall that because these are all new releases, format (hardcover versus paperback) is not an issue.
12
In addition to these t-tests, we also test whether any of the observable variables can be used to
predict a particular book belonging to the experiment group. To accomplish this, we estimate the
following Probit model:
P(Experimenti =1| Xi ) =Φ(Xiβ) , (1)
where Experimenti is an indicator of whether book i belongs to the experiment group, and Xi is a
vector of independent variables including book i’s list price, number of pages, weight, height,
length, and width. The results from estimating this model are reported in Table 3.
Table 3: Results from A Probit Model
Experiment Group Dummy
Constant -2.586 (2.267)
List Price -0.125 (0.077)
Number of Pages 0.0004 (0.003)
Weight -0.008 (0.005)
Height 0.008 (0.009)
Length 0.007 (0.004)
Width -0.001 (0.005)
Log Likelihood -116.522 Number of Obs. 178
Standard errors are in parentheses. **Significantly different from zero, p < 0.01. * p < 0.05.
The results in Table 3 show that none of the coefficients is statistically significant at the 5% or
10% confidence level. This is again consistent with the hypothesis that the control and
experiment titles are materially similar along observable dimensions.
13
4.2. Initial Analyses of Book Sales
When studying the sales and sales patterns of these books, we focus on the sales in the first
twenty weeks since each book’s initial release in their respective channels. We do this because
this is the period where the majority of sales occur, allowing us to keep the sales numbers
comparable across different books.
Summary statistics can provide an initial test for whether there are differences in sales patterns
between the control and experiment titles. These summary statistics are provided in Table 4 for
sales of books in the control and experiment groups in the first twenty weeks since each book’s
release in the respective channel (print and Kindle). We note that digital sales of the books in the
experiment group would be unambiguously lower if we studied digital sales in the first twenty
weeks since each book’s print release: this is because digital sales in the first few weeks would
have been zero for the digital books in the experiment group.
Table 4: Summary Statistics (Weeks 1-20 since Each Version’s Release)
Mean S.D. Min Max Median Obs.
“Control” Titles
Total Sales 212.81 609.49 0 13,441 59 1,580 Print Sales 111.30 394.38 0 10,234 31 1,580 Digital Sales 101.51 293.92 0 3,860 20 1,580
“Experiment” Titles
Total Sales† 168.71 381.15 0 5,573 53 1,980 Print Sales 132.54 364.27 0 5,573 34 1,980 Digital Sales 40.12 86.86 0 1,247 13 1,980 Digital Sales after Print Release†
36.17 86.94 0 1,247 8 1,980
† Sales are shown for weeks 1-20 after print release
Looking first at the “Control” titles, these summary statistics suggest that digital sales are a
significant sales channel. Digital sales make up nearly half of total sales for this publisher, which
is generally consistent with available data regarding sales patterns at Amazon (e.g., Amazon.com
2011). More importantly for our study, these summary statistics suggest that ebook sales are
14
significantly lower in the experiment group than in the control group, and that print sales, while
slightly higher in the experiment group, are nowhere near large enough to compensate for the
lost ebook sales.
To empirically test for this effect, we first run a linear regression of book sales onto a dummy
variable that equals one for all the books in the experiment group. We then run a quantile
regression to replace the linear regression so that the results are less likely to be driven by any
potential outliers in our sample. The model we estimate for each version (print or Kindle) of
book i in each week t is simply:
Salesit = β0 +β1Experimenti +εit (2)
where Salesit is the sales generated by a particular version of book i in week t and Experimenti is
a dummy variable that equals one if book i belongs to the experiment group. We study print sales
before moving on digital sales, and report the results from such a linear regression model and a
quantile regression model in Table 5.
In terms of the mean print sales level, we find that there are, on average, 21.2 more print sales for
the experiment group than there are for the control group (132.5 sales for the experiment group
versus 111.3 sales for the control group), but this difference is statistically insignificant. A
quantile regression, which controls for outliers, reveals similar results: the median print sales
level for the experiment group is 31.0 compared to 34.0 for the control group, a difference that is
also statistically insignificant. Thus, initial analyses show that delaying Kindle releases has not
led to significant increases in print sales, suggesting that cross-channel substitution between print
and ebooks is not significant.
15
Table 5: Initial Analyses of Sales Data (Weeks 1-20 since Each Version’s Release)
Standard errors are in parentheses. **Significantly different from zero, p < 0.01. * p < 0.05.
On the other hand, examining digital sales, we find that there are 61.4 fewer digital sales for the
experiment group than for the control group. Likewise, in terms of median sales levels, there are
7.0 fewer digital sales for the experiment group than for the control group. Both results are
statistically significant at the 1% confidence level. These initial results suggest that delaying
Kindle releases leads to a significant loss in digital sales, and that this loss has not been
compensated by an increase in print sales.
These analyses are sufficient for our data if the control group serves as a valid control for our
experiment group. This statement is true as long as other factors that may influence book sales
— such as book-level characteristics, over-time seasonality or sales trends — are orthogonal to
the experiment treatment. Our discussions with the publisher who provides these data suggest
that this is likely true: the publisher was unaware of and had not planned any systematic
differences between the control and experiment groups. In fact, the publisher’s publishing
schedule for the first six months of 2010 had been set by October 2009, and at that time, no one
from the publisher expected a dispute to occur with Amazon in 2010. However, in spite of these
reassurances, in the next section we use a more sophisticated model to control for the possibility
of other factors unrelated to our experiment that may influence book sales.
Print Sales Digital Sales Linear
Regression Quantile
Regression Linear
Regression Quantile
Regression
Constant 111.3** (9.5)
31.0** (1.5)
101.5** (5.2)
20.0** (0.8)
Experiment 21.2 (12.7)
3.0 (2.0)
-61.4** (7.0)
-7.0** (1.0)
R2 (or pseudo R2) 0.001 0.0002 0.021 0.002 Number of Obs. 3,560 3,560 3,560 3,560
16
5. Extended Model and Results
The first factor we control for in our extended model is the number of weeks since release. We
do this because book sales typically follow a declining curve over time. The price of a book may
also influence sales. Thus, we add a control variable that is the natural log of weekly average
price. We also control for a set of book-level characteristics, including the number of pages,
weight, height, length, width, and list price. Finally, we add a set of weekly dummy variables
(weekly fixed effects) for each calendar week in our data. These weekly dummy variables will
capture any variation in book sales over time that are driven by seasonality or that are due to the
effect of the opening of iBookstore on April 3. Accordingly, we estimate the following model for
each version (print or digital) of book i in each week t:
ln(Salesit ) = β0 +β1Experimenti +β2NumWeeksit +β3 ln(PrintPriceit )
+β4 ln(DigitalPriceit )+ β5kBookVarikk∑ + β6tWeekt
t∑ +εit
(3)
where ln(Salesit) is the natural log of sales generated by a particular version of book i in week t;
Experimenti is a dummy variable that equals one if book i belongs to the experiment group;
NumWeeksit is the number of weeks from the release of that version of book i until week t;
ln(PrintPriceit) is the natural log of price for the print version of book i in week t;
ln(DigitalPriceit) is the natural log of price for the digital version of book i in week t; BookVarik
is a series of book level characteristics such as book i’s list price, number of pages, weight,
height, length, and width; and Weekt is a series of dummy variables for each calendar week.3
3 Taking the natural log of sales allows the dependent variable to better conform to a normal distribution and improves the model fit. We add one to sales before taking the natural log. Our results are robust to adding different constants.
17
An advantage of this model is that it allows us to directly control for any factors that may have
affected book sales during this timeframe that may differ between the control and experiment
groups. We note that if sales of the control books are statistically representative of what sales of
the experiment books would have been if the Kindle version had been released along with the
print version (as we believe is the case), then this test is unnecessary. However, if the market
environment that the experiment group books faced is systematically different from the market
environment the control group faced, then this more sophisticated model can be used to control
for some of these outside factors.
5.1. The Effect on Print Sales and Digital Sales
Column 1 of Table 6 reports our results from estimating the model in (3) with print sales as the
dependent variable. These results show that the coefficient on variable Experiment is statistically
insignificant, indicating that delaying Kindle releases does not lead to a significant change in
print sales. Our control variables have the expected signs and approximate magnitudes, with
print sales declining at roughly 13.6% per week according to the coefficient -0.136 on
NumWeeks in Column 1 of Table 6.
The results in Column 1 of Table 6 indicate that the cross-channel effect between ebooks and
print books is weak. We interpret this as evidence that the digital channel of ebooks is quite
sticky: in general, ebooks consumers do not seem to consider a print copy a strong substitute for
a digital copy.
Table 6, Column 2, reports the results when this more sophisticated model is applied to analyze
digital sales. These results show a strong decrease in sales for delayed Kindle titles: the
(statistically significant) -0.520 coefficient on variable Experiment means that delaying Kindle
18
releases leads to a roughly 52.0% decrease in digital sales. As before the coefficients of the
control variables have the expected signs and approximate magnitudes.
Table 6: Analyses of Sales Data (Weeks 1-20 after Each Version’s Release)
Standard errors are in parentheses. **Significantly different from zero, p < 0.01. * p < 0.05.
These results suggest that the main effect of delaying Kindle releases is a dramatic decline in
ebook sales. This result is consistent with the summary statistics above (in Table 4) and suggests
that ebook consumers do not seem to consider print books as a strong substitute for their
preferred purchase channel. This result is consistent with Elberse and Eliashberg (2003)’s
finding that delaying the release of movies in international markets compared with the release in
U.S. market reduces such movies’ box office in international markets, mainly because of the loss
of marketing buzz.
Print Sales Digital Sales
Constant -7.521** (1.209)
2.910 (1.659)
Experiment 0.004 (0.043)
-0.520** (0.063)
NumWeeks -0.136** (0.005)
-0.091** (0.008)
Ln(PrintPrice) -2.923** (0.165)
-3.074** (0.256)
Ln(DigitalPrice) -0.524** (0.183)
-3.148** (0.275)
Other Control Variables
Book characteristics,
Weekly dummies
Book characteristics,
Weekly dummies
R2 0.412 0.178 Number of Obs. 3,560 3,560
19
5.2. The Cross-Channel Effect Varies across Popular and Niche Books
In this section we estimate the model in equation (3) separately for popular and niche books. We
do this to analyze whether popular books, which presumably have stronger brand awareness,
have different cross-channel cannibalization effects than less popular books do.
To do this, we rank all the 178 books by their average weekly sales in the first twenty weeks
since being released. We then split our sample, defining the top 20% books as “popular books”
and the bottom 80% books as “niche books”. On average, top 20% books have total sales of
587.4 copies per week in the first twenty weeks since being released, among which 358.1 copies
are sold in print versions and 229.2 copies are sold in digital versions. In contrast, bottom 80%
books have much lower sales numbers — 87.1 total sale per week, among which 63.5 copies are
sold in print versions and 23.6 copies are sold in digital versions — during the same period.
Thus, top 20% books account for 63.1 percent of the total sales on all the books in our sample.
Next, we estimate equation (3) separately for the two samples of top 20% books and bottom 80%
books, and report the results in Table 7.4
These results show that delaying Kindle releases leads to a statistically significant decrease in
digital sales of both the popular and niche titles: we find the coefficient on Experiment is -0.566
for the top 20% books and -0.522 for the bottom 80% books, with both coefficients being highly
significant. We also find that the coefficient on Experiment is insignificant for the bottom 80%
books when we study print sales. In other words, delaying Kindle releases leads to no significant
increase in print sales of niche titles, consistent with our results in Section 5.1.
4 We note that our results remain qualitatively the same if we split our sample into the top 50% and bottom 50% of titles on the basis of popularity. These results are reported in Table A1 of the Appendix.
20
Table 7: Analyses of Sales Data for Top 20% and Bottom 80% Titles (Weeks 1-20 after Each Version’s Release)
Standard errors are in parentheses. **Significantly different from zero, p < 0.01. * p < 0.05.
However, our results in Table 7 also show that delaying Kindle releases leas to a significant
increase in the sales of print versions for the top 20% books. The coefficient on Experiment is
0.475 and significant in Column 1, Table 7. This suggests that there is some substitution from
digital to print when digital content is delayed — but only for the most popular titles in our
study.
Our interpretation of the results in Table 7 is that we have found evidence that popularity can
moderate the cross-channel effect between ebooks and print books. The promotion associated
with popular books may increase a consumer’s association with a product such that when digital
versions of popular books are delayed, a significant portion of consumers who otherwise would
have bought digital versions may switch to the books’ print versions. This conjecture is tentative
however, and deserves further analysis by future research.
Top 20% Books Bottom 80% Books Print Sales Digital Sales Print Sales Digital Sales
Constant 2.606 (2.310)
-31.462** (3.319)
-5.298** (1.171)
14.994** (1.563)
Experiment 0.475** (0.065)
-0.566** (0.088)
-0.054 (0.042)
-0.522** (0.061)
NumWeeks -0.192** (0.008)
-0.044** (0.013)
-0.129** (0.005)
-0.099** (0.008)
Ln(PrintPrice) -1.239** (0.201)
-1.168** (0.348)
-1.892** (0.175)
-1.097** (0.263)
Ln(DigitalPrice) 0.785** (0.208)
-2.364** (0.382)
-0.215 (0.205)
-1.785** (0.303)
Other Control Variables
Book characteristics,
Weekly dummies
Book characteristics,
Weekly dummies
Book characteristics,
Weekly dummies
Book characteristics,
Weekly dummies
R2 0.742 0.522 0.422 0.160 Number of Obs. 720 720 2,840 2,840
21
5.3. Sensitivity: Adding iBooks Sales Does Not Change Our Results
One could argue that at least part of the drop in Kindle sales for the experiment group could be
attributed to the opening of iBookstore on Apr 3 if the decrease in sales came from consumers
who substituted from the (unavailable) Kindle channel to the iBookstore. We first note that
iBookstore purchases can only be viewed on Apple iOS devices (iPhone, iPod Touch, iPad),
reducing the potential market segment that could make the tradeoff between the Kindle and
iBookstore. We also note that we have already controlled for this effect in part by adding a set of
weekly dummy variables.
However, a more comprehensive way to control for this effect is to add all iBook sales to Kindle
sales and study whether our original results still hold. To do this, we first add weekly iBook sales
to Kindle sales to create a new digital sales variable. We then repeat our analyses from Table 6
(Column 2) and Table 7 (Columns 2 and 4), and report these new results as Table 8 below. We
note that this represents a very conservative test given that it includes both iBookstore purchase
from customers who may have switched from the Kindle to the iBook channel, and purchases
from new iBookstore customers who were not previously Kindle customers (and therefore whose
sales are unrelated to the experiment).
The results in Column 1 of Table 8 are similar to those in Column 2 of Table 6. They show that
delaying Kindle releases leads to a significant decrease in total digital (Kindle and iBook) sales.
The -0.447 coefficient on the variable Experiment in Column 1 of Table 8 is significant at a 1%
confidence level and is similar to the -0.520 coefficient on Experiment in Column 2 of Table 6.
Similarly, the coefficients on Experiment in Columns 2 and 3 of Table 8 (-0.551 and -0.438)
resemble the coefficients on Experiment in Columns 2 and 4 of Table 7 (-0.566 and -0.522).
22
Thus, our results are robust to including iBook sales: The lost Kindle sales we observe seem to
be truly lost sales, as opposed to representing a shift from the Kindle store to Apple’s iBookstore.
Table 8: Analyses of Digital Sales Data (Weeks 1-20 after Each Version’s Release)
Standard errors are in parentheses. **Significantly different from zero, p < 0.01. * p < 0.05.
5.4. Longer and Shorter Kindle Delays
An additional question that one might ask is whether the change in sales is different for books
that are delayed for a longer period than for books that than for those delayed for a shorter
period. To test this, we note that books that are published in April will see their Kindle version
experience longer delays (5-8 weeks) compared with books that are published in May (1-4
weeks). We then create a dummy variable LongExperiment for the former group and a dummy
variable ShortExperiment for latter group. We then replicate the analyses above, but replace the
Experiment dummy variable with these two new dummy variables. We report these results in
Table 9.
All Books Top 20% Books
Bottom 80% Books
Digital Sales (including
iBook sales)
Digital Sales (including
iBook sales)
Digital Sales (including
iBook sales)
Constant 0.287 (1.737)
-28.924** (3.205)
12.857** (1.654)
Experiment -0.447** (0.066)
-0.551** (0.083)
-0.438** (0.063)
NumWeeks -0.101** (0.008)
-0.072** (0.012)
-0.105** (0.008)
Ln(PrintPrice) -3.209** (0.261)
-1.515** (0.333)
-1.176** (0.267)
Ln(DigitalPrice) -3.117** (0.282)
-1.915** (0.358)
-1.726** (0.313)
Other Control Variables
Book characteristics,
Weekly dummies
Book characteristics,
Weekly dummies
Book characteristics,
Weekly dummies
R2 0.182 0.554 0.163 Number of Obs. 3,332 673 2,659
23
Table 9: Analyses of Sales Data (Weeks 1-20 after Each Version’s Release) for Books with Longer and Shorter Delays
Standard errors are in parentheses. **Significantly different from zero, p < 0.01. * p < 0.05.
Reassuringly, the results in Table 9 are similar to the results in Table 7. As one might expect, the
effect of Kindle delays is larger for books with longer delays than for books with shorter delays,
although the difference is only statistically significant for niche titles. Specifically, the
coefficient on LongExperiment for digital sales of popular books is -0.604, while the coefficient
on ShortExperiment for popular digital books is -0.510. Likewise, for niche titles the coefficient
is -0.671 for LongExperiment and -0.325 for ShortExperiment.
The only results in Table 9 that differ from those in Table 7 are those obtained when we study
print sales of niche books. The coefficient on LongExperiment is -0.131 and significant while the
coefficient on ShortExperiment is 0.044 and insignificant. This seems to suggest that long delays
of Kindle releases can lead to a drop in print sales for niche books. A plausible explanation for
this drop in print sales is that delaying Kindle release reduces Kindle sales, which in turn leads to
Top 20% Books Bottom 80% Books Print Sales Digital Sales Print Sales Digital Sales
Constant 2.458 (2.339)
-30.754** (3.443)
-5.609** (1.173)
14.953** (1.556)
LongExperiment 0.495** (0.080)
-0.604** (0.100)
-0.131** (0.048)
-0.671** (0.067)
ShortExperiment 0.457** (0.078)
-0.510** (0.114)
0.044 (0.052)
-0.325** (0.072)
NumWeeks -0.193** (0.008)
-0.044** (0.013)
-0.124** (0.005)
-0.099** (0.008)
Ln(PrintPrice) -1.225** (0.204)
-1.186** (0.348)
-1.970** (0.177)
-1.072** (0.262)
Ln(DigitalPrice) 0.764** (0.214)
-2.334** (0.384)
-0.210 (0.204)
-1.801** (0.301)
Other Control Variables
Book characteristics,
Weekly dummies
Book characteristics,
Weekly dummies
Book characteristics,
Weekly dummies
Book characteristics,
Weekly dummies
R2 0.742 0.523 0.424 0.168 Number of Obs. 720 720 2,840 2,840
24
a drop in word-of-mouth regarding the focal books. Because of the drop in word-of-mouth, print
sales of these niche books can also decline. However, we note that this result disappears when a
quantile regression, instead of a linear regression, is used to estimate the same model. Thus, the
result regarding the drop in print sales for niche books could have been driven by outliers. We
report the results from such a quantile regression in Table 10.
All the results in Table 10 are quite similar to the results in Table 9, except for the insignificant
coefficient on LongExperiment when we study print sales of niche books. We also note that we
have checked all the other tables using quantile regressions and found that the results in all other
tables remain unchanged. These results are not included here for the sake of brevity, but they are
available from the authors upon request.
Table 10: Analyses of Sales Data (Weeks 1-20 after Each Version’s Release) for Books with Longer and Shorter Delays Using Quantile Regression
Standard errors are in parentheses. **Significantly different from zero, p < 0.01. * p < 0.05.
Top 20% Books Bottom 80% Books Print Sales Digital Sales Print Sales Digital Sales
Constant -4.666 (3.702)
-32.213** (7.367)
-5.841** (1.189)
12.875** (2.065)
LongExperiment 0.562** (0.127)
-0.744** (0.216)
0.017 (0.049)
-0.838** (0.089)
ShortExperiment 0.566** (0.123)
-0.645** (0.247)
0.053 (0.053)
-0.443** (0.095)
NumWeeks -0.206** (0.013)
-0.052 (0.028)
-0.128** (0.005)
-0.112** (0.010)
Ln(PrintPrice) -1.345** (0.325)
-1.037 (0.737)
-1.595** (0.179)
-1.460** (0.347)
Ln(DigitalPrice) 0.468 (0.329)
-1.903* (0.802)
-0.254 (0.208)
-2.977** (0.400)
Other Control Variables
Book characteristics,
Weekly dummies
Book characteristics,
Weekly dummies
Book characteristics,
Weekly dummies
Book characteristics,
Weekly dummies
Pseudo R2 0.538 0.385 0.242 0.102 Number of Obs. 720 720 2,840 2,840
25
6. Discussion
Our research analyzes how the availability of a product in a digital channel impacts sales in
physical channels — in our case in the digital and physical channels for book sales. This
question is important to both managerial and academic audiences. From a managerial
perspective, content owners across a variety of industries are making decisions about whether to
include digital products in their existing set of distribution channels and where to place these
products into their existing product release cycles. These decisions are particularly salient for
book publishers, many of whom have experimented with releasing ebook versions in between the
(high margin) hardcover release date and the (lower margin) paperback release date. From an
academic perspective, our work adds to a growing literature analyzing the impact of new digital
distribution channels on physical sales.
We analyze this question using a novel “natural experiment” where a publisher stopped releasing
new ebook content to Amazon for a period of about two months. Because this publisher still
released print copies of their titles to Amazon during this period, the net effect of this event is
that books released during this timeframe were delayed on the electronic channel relative to the
print channel by between one to eight weeks.
Our results suggest that delaying the publication of the ebook relative to its print version causes a
large and persistent decrease in overall digital sales. Moreover, while there may be some
substitution between the (unavailable) Kindle channel and the print channel for popular books, in
total this substitution does not compensate for the lost digital sales. Thus, the net effect of
delayed Kindle releases is an overall loss in sales and, based on the best available data, a net loss
in revenue and profit to the publisher.
26
These results also point to the possibility that consumers are more tied to their mode of
consumption (physical or digital) than they are to a particular product. Said another way, when
facing new digital channels, publishers and other media firms have frequently conceptualized the
consumer’s decision process as being driven by product choice first and then channel choice.
This conceptualization is seen in the frequent strategy to delay digital availability as a way of
retaining physical sales. Our results suggest that this framing is wrong, at least for books. Rather,
our results suggest that, in general, consumers choose their channel (digital versus physical) first,
and then choose among the products available in this channel.
However, we also note that our results are not without limitations. First, and most notably, our
results are based on the assumption that the books released immediately before and after our
“experiment” period are good controls for the books released during the experiment. We believe
this is true based on the empirical tests above and based on conversations with the publisher in
question, however, we cannot conclusively rule out the possibility that unobserved differences
between the control and experiment groups may be driving some of our observed results.
Second, our results are situated within a particular market (books) and at a particular stage of
development of that market. Finally, our results suggest a difference in response for popular and
less popular books, a finding that we believe deserves further analysis.
27
References
Amazon.com. 2011. Amazon.com Now Selling More Kindle Titles than Print Books. Press
Release, May 19. Available from http://phx.corporate-
ir.net/phoenix.zhtml?c=176060&p=irol-newsArticle&ID=1565581, last accessed August 18,
2011.
Association of American Publishers. 2010. AAP Reports October Book Sales. December 8.
http://www.publishers.org/press/11/
Balasubramanian, S. 1998. Mail versus Mall: A Strategic Analysis of Competition between
Direct Marketers and Conventional Retailers. Marketing Science, 17(3) 181-195.
Biyalogorsky, E., P. Naik. 2003. Clicks and Mortar: The Effect of On-line Activities on Off-line
Sales. Marketing Letters, 14(1) 21-32.
Brynjolfsson, E., Y. J. Hu, M. Rahman. 2009. Battle of the Retail Channels: How Product
Selection and Geography Drive Cross-Channel Competition. Management Science 55 (11),
1755-1765.
Chevalier, J., A. Goolsbee. 2003. Measuring Prices and Price Competition Online: Amazon and
Barnes and Noble. Quantitative Marketing and Economics, 1(2) 203-222.
Danaher, B., S. Dhanasobhon, M.D. Smith, R. Telang. 2010. Converting Pirates without
Cannibalizing Purchasers: The Impact of Digital Distribution on Physical Sales and Internet
Piracy. Marketing Science, 29(6) 1138-1151.
Danaher, B., J. Waldfogel. 2011. Reel Piracy: The Effect of Online Movie Piracy on Film Box
Office Sales. Working paper. University of Pennsylvania, Philadelphia, Pennsylvania.
28
Deleersnyder, B., I. Geyskens, K. Gielens, M.G. Dekimpe. 2002. How Cannibalistic is the
Internet Channel? A study of the newspaper industry in the United Kingdom and The
Netherlands. International Journal of Research in Marketing, 19(4) 337-348.
Elberse, A., J. Eliashberg. 2003. Demand and Supply Dynamics for Sequentially Released
Products in International Markets: The Case of Motion Pictures. Marketing Science 22 (3)
329-354.
Engel, J., R. Blackwell, P. Winiard. 1990. Consumer Behavior. Dryden Press, Hinsdale, IL.
Frambach, R., H. Roest, T. Krishnan. 2007. The Impact of Consumer Internet Experience on
Channel Preference and Usage Intentions Across the Different Stages of the Buying Process.
Journal of Interactive Marketing 21 (2) 26-41.
Ghose A., M D Smith, R Telang R. 2006. Internet Exchanges for Used Books: An Empirical
Analysis of Product Cannibalization and Welfare Impact. Information Systems Research,
17(1), 3-19.
Goldfarb, Avi, and Catherine Tucker. 2011a. Search Engine Advertising: Channel Substitution
when Pricing Ads to Context. Management Science 57(3), 458-470.
Goldfarb, Avi and Catherine Tucker. 2011b. Advertising Bans and the Substitutability of Online
and Offline Advertising. Journal of Marketing Research 48(2), 207-228.
Hui, K., I. Png. 2003. Piracy and the Legitimate Demand for Recorded Music. Contributions to
Economic Analysis and Policy, 2(1).
Kannan, P. K., B. Kline Pope, S. Jain. 2009. Pricing Digital Content Product Lines: A Model and
Application for the National Academies Press. Marketing Science, 28(4) 620-636.
Kotler, P. 2002. Marketing Management. Prentice-Hall, Upper Saddle River, NJ.
Prasad et al. 2004
29
Lehmann, Donald R., Charles B Weinberg. 2000. Sales through Sequential Distribution
Channels: An Application to Movies and Videos. Journal of Marketing 64(July) 18-33.
Luan, Jackie Y., K. Sudhir, 2006. Optimal Inter-Release Timing for Sequentially Released
Products. Working paper. Yale School of Management, Yale University, New Haven,
Connecticut. Available at
http://faculty.som.yale.edu/ksudhir/workingpapers/SeqRelease_Sep13_KS.pdf (last accessed
April 26, 2011.)
Lowe, S. 2009. Do Ebooks Cannibalize Print Sales? PublishingBits. July 28.
http://publishingbits.com/ebook-strategy/7-do-ebooks-cannibalize-print-sales.html
Oberholzer-Gee, F., K. Strumpf. 2010. File-Sharing and Copyright. In Innovation Policy and the
Economy, Volume 10, J. Lerner and S. Stern eds., University of Chicago Press, Chicago, Il,
pp. 19-55.
Oberholzer, F., K. Strumpf. 2007. The Effect of File Sharing on Record Sales. An Empirical
Analysis. Journal of Political Economy, 115(1) 1-42.
Peitz, M, P. Waelbroeck. 2004. The Effect of Internet Piracy on Music Sales: Cross-Section
Evidence. Review of Economic Research on Copyright Issues, 1(2) 71-79.
Prasad, Ashutosh, Bart Bronnenberg and Vijay Mahajan .2004. Product Entry Timing in Dual
Distribution Channels: The Case of the Movie Industry. Review of Marketing Science, 2(1).
Rob, R., J. Waldfogel. 2006. Piracy on the High C’s: Music Downloading, Sales Displacement,
and Social Welfare in a Sample of College Students. Journal of Law and Economics, 49(1)
29-62.
Rob, R., J. Waldfogel. 2007. Piracy on the Silver Screen. Journal of Industrial Economics, 55(3)
379-393.
30
Rich, M. 2009. Steal This Book (for $9.99). New York Times, May 17, p.WK3.
SeekingAlpha. 2009. Amazon.com, Inc. Q4 2008 Earnings Call Transcript, Q&A.
http://seekingalpha.com/article/117508-amazon-com-inc-q4-2008-earnings-call-transcript
Simester, D., Y. J. Hu, E. Brynjolfsson, E. Anderson. 2009. Dynamics of Retail Advertising:
Evidence from a Field Experiment. Economic Inquiry 47 (3), 482-499.
Smith, M., R. Telang. 2009. Competing with Free: The Impact of Movie Broadcasts on DVD
Sales and Internet Piracy. Management Information Systems Quarterly, 33(2) 312-338.
Waldfogel, J. 2007. Lost on the web: Does Web Distribution Stimulate or Depress Television
Viewing? Information Economics and Policy.
Zentner, A. 2005. File Sharing and International Sales of Copyrighted Music: An Empirical
Analysis with a Panel of Countries. Topics in Economic Analysis & Policy 5(1), Article 21.
Available at: http://www.bepress.com/bejeap/topics/vol5/iss1/art21.
31
Appendix
Table A1: Analyses of Sales Data for Top 50% and Bottom 50% Titles (Weeks 1-20 after Each Version’s Release)
** Significantly different from zero, p<0.01;* p<0.05
Top 50% Books Bottom 50% Books Print Sales Digital Sales Print Sales Digital Sales
Constant -2.837* (1.201)
0.387 (2.214)
-3.652** (1.314)
13.433** (1.758)
Experiment 0.210** (0.045)
-0.439** (0.085)
-0.042 (0.045)
-0.554** (0.065)
NumWeeks -0.140** (0.005)
-0.096** (0.011)
-0.125** (0.005)
-0.074** (0.008)
Ln(PrintPrice) -1.755** (0.167)
-1.701** (0.344)
-1.167** (0.178)
-1.075** (0.270)
Ln(DigitalPrice) 0.323** (0.166)
-3.457** (0.333)
-0.321** (0.227)
-0.819* (0.343)
Other Control Variables
Book characteristics,
Weekly dummies
Book characteristics,
Weekly dummies
Book characteristics,
Weekly dummies
Book characteristics,
Weekly dummies
R2 0.623 0.200 0.486 0.199 Number of Obs. 1,780 1,780 1,780 1,780