Advertising, Searching, Blogging and New-Product Sales

Ho Kim and Dominique M. Hanssens

August 2013

Ho Kim is Assistant Professor of Marketing, School of Business and Management, Azusa Pacific University, 901

E. Alosta Ave. Azusa, CA 91702 (e-mail: [email protected]). Dominique M. Hanssens is the Bud Knapp Professor of

Marketing, UCLA Anderson School of Management, 110 Westwood Plaza, Los Angeles, CA 90095 (Email:

[email protected], Phone: (1) 310-825-4497, Fax: (1) 310-206-7422).

ABSTRACT

This paper presents new findings about how advertising and online word-of-mouth influence product sales by examining the relationship between advertising, blog volume, search volume, and revenue of theatrically released movies. To study the relationship over the three phases of a movie’s life cycle (the pre-launch period, the opening week, and the post-launch period), we develop three separate models and estimate them while correcting for endogeneity. We find that the majority of the advertising effect on revenue is realized without consumers’ online search activity, while the effect of blogs on revenue materializes only indirectly, through online search. Moreover, advertising is about three times more effective than blog volume in generating movie revenue. Other findings include the following. Search volume has better predictive power than blog volume: pre-launch blog volume does not explain the variation in opening-week revenue once pre-launch search volume is controlled for; after the opening week, weekly blog volume does not explain the variation in weekly revenue once weekly search volume is included. In the pre-launch period, consumers respond to advertising by searching and blogging online, but searching is five times more responsive than blogging. Our findings indicate the importance of monitoring consumers’ online search activity.

Keywords: advertising, online search volume, blog volume, revenue prediction, endogeneity, instrumental

variables.

1. Introduction

The sales of an entertainment product depend on advertising, word-of-mouth, and search activity around the product, among other things. Advertising is an effective tool to lift sales of an entertainment product. For example, the advertising elasticity of a movie is over 0.2 (Elberse and Eliashberg 2003), which is twice the average sales-to-advertising elasticity (Hanssens 2009). Online word-of-mouth (WOM) is also an important factor due to the experiential nature of entertainment products. Previous research finds that the volume and valence of online WOM influence the success of an entertainment product (Chevalier and Mayzlin 2006, Chintagunta et al. 2010, Dhar and Chang 2009, Duan et al. 2008ab, Gopinath et al. 2013, Liu 2006, Onishi and Manchanda 2012). Finally, to the extent that online WOM matters, the sales of such products will also depend on consumers’ online search activity. This is because a consumer’s online WOM can influence the purchase behavior of others only if it reaches them, and because online search is an effective way of finding WOM on the Internet.

The three influencers, however, may not be equally important to managers for the following reasons. First, advertising is delivered on various media while online WOM is carried only on the Internet. Thus, online WOM is more limited than advertising in reaching consumers. Second, online search activity is substantially more prevalent than online WOM activity. According to a recent study, 32% of U.S. adult Internet users read someone else’s online journal or blog and 32% posted a comment or review online about products they bought, while 91% used a search engine to find information (The Pew Research Center 2012). These statistics suggest that online search data may be more useful than online WOM data for predicting market outcomes such as product sales. Third, market outcomes are influenced not by WOM generated but by WOM consumed (Onishi and Manchanda 2012), because a consumer’s online WOM can influence the purchase behavior of others only if it reaches them. Therefore, consumers’ search intensity for a product, or online search volume, should play a role in the relationship between sales and online WOM volume.

Thus, the following questions arise: (i) How does online WOM differ from advertising in generating sales of a new entertainment product? (ii) What role does consumers’ online search activity play in the process by which WOM and advertising generate revenue? (iii) Between WOM volume and search volume, which is more important for accurately predicting the sales of an entertainment product? (iv) Finally, how are the focal variables related to each other before and after product launch?

To answer the questions, we develop empirical models of advertising, online WOM, online

search, and revenue, and apply them to a weekly data set of 153 theatrically released movies.

The weekly data set spans from 30 weeks before each movie’s release until 10 weeks after the

release. Weekly blog volume is used to measure online WOM activity. We prefer blogs to user

reviews because consumers post blogs even long before a movie’s launch, while user reviews are

available only after the launch. The weekly Google keyword search index is used to measure

online search activity.

Three separate periods exist in the life of a theatrically released movie: the pre-launch period, the opening week, and the post-launch period. Thus we develop three separate models corresponding to

each period. The pre-launch period model is a panel data model that examines the relationship

between advertising, blog volume, and online search volume over the pre-launch period. The

post-launch period model is also a panel data model that examines the relationship between

advertising, blog volume, online search volume, and movie revenue over the post-launch period.

The opening-week model is a cross-sectional data model that examines the effects of pre-launch

blog and search volume on the opening-week revenue of movies. It is challenging to isolate causal relationships among the focal variables because they are simultaneously determined. We find

exogenous variables for the weekly blog and search volume of individual movies and use

exclusion restrictions to identify the causal effects. To find the effect of movie consumption on

search and blog volume, we apply a covariance restriction technique.

We find that the majority of the advertising effect on revenue is realized without consumers’ online search activity, while the effect of blogs on revenue materializes only indirectly, through online search. Moreover, advertising is about three times more effective than blog volume in generating movie revenue. Other findings include the following. Search volume has better predictive power than blog volume: pre-launch blog volume does not explain the variation in opening-week revenue once pre-launch search volume is controlled for; after the opening week, weekly blog volume does not explain the variation in weekly revenue once weekly search volume is included. In the pre-launch period, consumers respond to advertising by searching and blogging online, but searching is five times more responsive than blogging. There is concurrent feedback from movie consumption to online searching and blogging within a week, implying a virtuous cycle. Our findings indicate the importance of monitoring consumers’ online search activity.

The remainder of the study is organized as follows. We first summarize previous research

and describe our movie data set. Next, we develop three models to answer our research

questions, apply them to our movie data set and discuss the findings and managerial

implications. We then formulate conclusions and areas for future research. The appendix and web appendix explain, respectively, how to identify the parameters and how to transform the Google search index into a cross-sectionally comparable search volume metric.

2. Relevant Research

Previous studies find that both volume and valence of WOM influence product sales. Liu (2006)

and Duan et al. (2008ab) find that WOM volume offers significant explanatory power for

aggregate and weekly box-office revenue. Dhar and Chang (2009) show that future sales of

music albums are positively correlated with the volume of blog posts about the albums.

Chintagunta et al. (2010) find that valence captured by the average user rating explains

designated market area (DMA)-level opening-day box-office revenues. Gopinath et al. (2013)

find that the volume of blogs has significant effects on opening-week movie sales, but after release the valence of blogs predicts movie sales. Finally, Chevalier and Mayzlin (2006) find that

an improvement in book ratings leads to an increase in relative sales.

Others have examined how commercial media and social media influence market outcomes.

Onishi and Manchanda (2012) find that advertising and blogging are synergistic, and cumulative

blogs are predictive of market outcomes. Villanueva et al. (2008) find that customers acquired

through WOM add more long-term value to a firm than customers acquired through advertising.

Trusov et al. (2009) find that WOM referrals have substantially longer carryover effects than

traditional marketing actions in acquiring new members at an Internet social networking site.

When it comes to the antecedents of online WOM, previous literature finds that advertising,

the volume of the previous period’s WOM, and current and past market outcomes are positively

correlated with the volume of the current period’s online reviews and blog postings (Duan et al.

2008ab, Liu 2006, Onishi and Manchanda 2012). Yang et al. (2013) find that consumers’ propensity to generate WOM is positively correlated with their media exposure, whereas the evidence for their propensity to consume WOM is mixed.

In contrast to the ample research on online WOM, few studies have examined how consumers’ online search is associated with other managerial variables. Two exceptions are Joo et al. (2012), who find a significant effect of TV advertising on consumers’ online search activity, and Kulkarni et al. (2012), who study the predictive power of online search volume. However, neither study examines online search in a setting that includes advertising, online WOM, and market outcomes together.

To summarize, prior empirical research has examined the relationship between advertising,

online WOM and market outcomes, but such studies have not considered how consumers’ online

search interacts with the other variables. Therefore, the managerial implications of online search

are not well appreciated. Also, the changing relationship between the variables before and after a

new-product’s release has received little attention. By examining the relationship between

advertising, blog volume, online search volume, and revenue in the U.S. motion picture industry,

we aim to provide new insights for movie studios and entertainment product managers. Table 1 summarizes the previous studies mentioned above to help clarify the contribution of this study.

== Table 1 about here ==

3. The Data

Our database consists of 153 movies, most of which were released in 2009. For each of the 153

movies, we collect weekly advertising spending, blog volume, and search volume, from 30

weeks before its release through 10 weeks after the release. Additionally, weekly screen numbers

and revenue are collected from the opening week through week 10 after release. For use as

instruments in estimating our models, we also collect various movie characteristics, weekly

Google search indices of the keyword “opening movie,” and weekly traffic to the five most

popular blog sites—i.e., blogger.com, tumblr.com, wordpress.com, squarespace.com, and

posterous.com.

3.1. Advertising and Box Office Revenue

Our advertising data cover the major media outlets such as television, print, radio, and outdoor

expenditure as collected by Nielsen. The average advertising spending of the 153 movies during

the analysis period—i.e., from 30 weeks before release to 10 weeks after release—is $21 million,

with 80% of the advertising budget spent in the pre-launch periods or during the release week.

Box-office revenue is collected from The Numbers (www.the-numbers.com). One hundred and

thirty-nine movies were exhibited for at least 10 weeks, and the shortest movie run was five

weeks. The median U.S. box-office revenue in our sample is $44 million. Figure 1(a) shows

weekly advertising spending and box-office revenue averaged across the 153 movies.

3.2. Blog Postings and Search Volume

Weekly blog postings for each movie were collected from the Google blog search engine

(www.google.com/blogsearch). To minimize noise in the data collection process, we constrained

our search to blogs whose titles contain either the word movie or film. Our general rule of

constructing search keywords for blog postings is <movie title> + “movie.”1 For example, to find

blog postings of the movie Avatar, we used the keyword “Avatar movie.” For movies with long

titles, reduced search keywords were used. For instance, to search for blog postings of the movie Bad Lieutenant: Port of Call New Orleans, we searched for the postings that contain the keyword “Bad Lieutenant” in their title. For each week of each movie, we repeated the search trials five times and used the mode of the number of blog postings so gathered.2

1 We developed several different versions of search keywords and found that <movie title> + “movie” is the most appropriate keyword to collect blog postings of the focal movie. This judgment is based on the empirical fit between the movie’s weekly advertising, launch schedule, and the collected blog volumes.
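The mode rule for repeated search trials can be sketched in a few lines of Python. This is an illustration only: the function name `modal_count` and the trial counts are hypothetical and are not taken from the study's data or code.

```python
from collections import Counter


def modal_count(trial_counts):
    """Return the most frequent posting count among repeated search trials."""
    if not trial_counts:
        raise ValueError("at least one search trial is required")
    # most_common(1) returns a one-element list [(value, frequency)]
    value, _frequency = Counter(trial_counts).most_common(1)[0]
    return value


# Hypothetical example: five trials for one movie-week, with one outlier.
trials = [118, 118, 131, 118, 118]
weekly_blog_volume = modal_count(trials)
```

Taking the mode rather than the mean keeps a single deviating trial from shifting the recorded weekly count.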

For weekly search volume of movies, we relied on Google Trends. Google Trends shows the

weekly search index of the entered keyword. The raw index provided by Google is normalized to

conceal the actual search volume of the keyword entered into the Google search engine. This

normalization causes a problem when comparing search volumes of different movies. We

propose a method that transforms the raw search index of Google to a cross-sectionally

comparable search volume measure. The detailed methodology of collecting weekly Google

search indices of movie keywords and transforming them into cross-sectionally comparable

measures can be found in the web appendix. Figures 1(b) and 1(c) show how the average blog

volume, search volume, and revenue change on a weekly basis over the movie lifecycle.

== Figure 1 about here ==

3.3. Other Variables

We collect the weekly Google search index of the keyword “opening movie,” weekly traffic to

the five most popular blog sites, and various movie characteristics. They are used as instruments

in estimating our models.

The weekly search index of the keyword “opening movie” was collected from Google

Trends. Using the search filters of Google Trends, we consider only the search activities related

to the U.S. movie industry. The weekly traffic of the five most popular blog sites (blogger.com, tumblr.com, wordpress.com, squarespace.com, and posterous.com) was collected from Alexa.com, a subsidiary of Amazon.com. Alexa collects daily traffic information of websites based on a global panel of toolbar users. The panel consists of millions of people using toolbars created by over 25,000 publishers, including Alexa and Amazon. We aggregate the daily reach of the five blog sites at the weekly level to derive the weekly total reach of the blog sites. Figure 2 illustrates the weekly search index of the keyword “opening movie” and the weekly reach of the five blog sites, from the first week of 2008 to the last week of 2009.

2 In most cases, the five search trials give a unanimous result. That is, the Google blog search engine gives the same number of blog postings for the same search condition. In some rare cases, the five search trials disagree, with one outlier and four identical numbers. In these cases, we use the four identical numbers (i.e., the mode) for the number of blog postings.


We also collect various movie characteristics. They include genre, MPAA rating, monthly

seasonality, whether the movie is a sequel, average critic rating from Metacritic

(www.metacritic.com), and director power. For the director power, we collect three variables:

total revenue of past movies with which the focal movie’s director was involved as either a

director, writer, or producer, since 1990 up to one calendar year before the focal movie’s release;

average user rating of such movies; and the standard deviation of user ratings of such movies.

Table 2 summarizes the variables and their sources and Table 3 shows the descriptive statistics

of the main variables.

== Table 2 about here ==


4. Modeling

We develop three separate models (the pre-launch period model, the opening-week model, and the post-launch period model) to deal with the three different phases that a theatrically released movie goes through. The pre-launch period model examines the relationships between advertising, blog

volume, and search volume of a movie until one week before its release. The model consists of

two equations whose dependent variables are weekly blog volume (blog equation) and weekly

search volume (search equation). The post-launch period model examines the relationships

between advertising, blog volume, search volume, and revenue of a movie after its opening

week. This model consists of three equations whose dependent variables are weekly blog volume

(blog equation), weekly search volume (search equation), and weekly revenue (revenue

equation). The opening-week model examines the effect of pre-launch search volume and pre-

launch blog volume on opening-week revenue after controlling for the effects of other relevant

variables. We distinguish the post-launch period model from the opening-week model because the opening week of a movie has distinctive significance to studio managers. Note that the

opening-week model is necessarily a cross-sectional data model while the pre- and post-launch

period models are panel data models.

We first discuss the two panel data models, whose specifications are based on the following

considerations.

First, we assume that a weekly variation in advertising spending is exogenous based on

the following institutional feature of movie advertising (Elberse and Anand 2007, Onishi and

Manchanda 2012). Movie studios typically purchase the vast majority of advertising in the

upfront market. This makes it difficult for a movie studio to buy additional advertising based on

new information. Additional purchase of advertising in the opportunistic market, if any, is

affected by exogenous events, which implies week-to-week variation in advertising spending of

a movie is mainly driven by exogenous factors.

Second, we explain the weekly variation in each endogenous variable with the weekly

variation in advertising and other endogenous variables. For example, we explain the weekly search volume of a movie in the post-launch period with weekly advertising, blog volume and revenue. We discuss the lag-length determination below.

Third, we do not include lagged dependent variables to explain the weekly variation in

dependent variables. For instance, we exclude lagged search volume to explain the variation in

current week’s search volume. Instead, we include a sufficient number of lags of advertising and

the other endogenous variables, wherever possible. This is a preferred approach in previous

studies (Basuroy et al. 2006, Elberse and Eliashberg 2003, Liu 2006) because it avoids

endogeneity caused by the possible correlation between the lagged dependent variables and the

error term of the equation.

Fourth, we include the time-specific effect but not the individual movie-specific effect in

each equation. In the panel data model, it is usually recommended to control for unobserved

individual or time effects. However, when lagged endogenous variables are included as covariates, as in our models, controlling for individual-specific effects leads to biased

estimators (Baltagi 1995, Elberse and Eliashberg 2003). An alternative is to include only the

common intercept. This is preferred if cross-sections do not have significantly different

individual fixed effects after controlling for the effects of covariates. We conducted the Holtz-

Eakin (1988) test to examine if there exist significantly different individual fixed effects. The test

results revealed no supporting evidence for significant individual fixed effects. Based on these

test results, we include only the common intercept in each equation.

Fifth, based on unit-root test results, we first-difference the pre-launch period model but

not the post-launch period model.

Sixth, we need to determine the lag lengths of the covariates. More lagged variables are

preferred to mitigate the bias from potential model misspecification (Enders 2004). However,

including many lags reduces the degrees of freedom of the model. Also, if the time-series

variables are highly autocorrelated, including both contemporaneous and lagged values as

covariates can cause a collinearity problem. In Table 4, we check the serial correlation

coefficients of variables in the pre- (equations (1) and (2)) and post-launch (equations (3) – (5))

models.


The serial correlation coefficients suggest that including lagged explanatory variables in the post-

launch period model can cause a serious collinearity problem. To alleviate this potential effect,

we include only the contemporaneous variables in the post-launch period model.3 The serial

correlations of the variables of the (first-differenced) pre-launch period models are weak and we

have thirty data points for each movie. As such, in the pre-launch period model, we include four

lags of the explanatory variables to lessen the misspecification problem that may result from

including an insufficient number of lags.
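As a rough illustration of the data preparation this implies for the pre-launch model, the sketch below first-differences a log series and then builds the contemporaneous value together with four lags. It assumes strictly positive weekly values (the treatment of zero-spending weeks is not described here), and the helper names and numbers are hypothetical.

```python
import math


def log_diff(series):
    """First difference of the natural log: ln(x_t) - ln(x_{t-1})."""
    logs = [math.log(x) for x in series]
    return [logs[t] - logs[t - 1] for t in range(1, len(logs))]


def lagged_rows(series, max_lag=4):
    """Rows [x_t, x_{t-1}, ..., x_{t-max_lag}] for each t with a full lag history."""
    return [
        [series[t - k] for k in range(max_lag + 1)]
        for t in range(max_lag, len(series))
    ]


# Hypothetical weekly advertising spending for one movie (strictly positive).
ad = [10.0, 12.0, 15.0, 15.0, 20.0, 30.0, 45.0]
d_ln_ad = log_diff(ad)        # one observation is lost to differencing
rows = lagged_rows(d_ln_ad)   # four more are lost to the lags
```

Each additional lag removes one usable observation per movie, which is why this specification is feasible with thirty pre-launch weeks but costly with only ten post-launch weeks.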

Seventh, we improve the identification of parameters by adding exogenous variables in

the blog and search equations. In the search equation, we include the weekly Google search

index of the keyword “opening movie”; in the blog equation, we include the weekly reach of the

five most popular blog sites.

Finally, we do not include movie characteristics on the RHSs of the equations. They are

reserved for instruments (Elberse and Eliashberg 2003).

With the above considerations, the pre- and post-launch period models are developed as follows.

3 Moreover, we have only ten data points for each movie in the post-launch period. This indicates that the post-launch period model may suffer from a degrees-of-freedom problem if we include lagged covariates.

4.1. The Pre-Launch Period Model

The pre-launch period model is specified by equations (1) and (2).

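\[
\begin{aligned}
\Delta\ln(\mathit{Blog}_{it}) ={}& \sum_{k=0}^{4}\alpha^{A}_{k}\,\Delta\ln(\mathit{Ad}_{i,t-k}) + \sum_{k=0}^{4}\alpha^{S}_{k}\,\Delta\ln(\mathit{Search}_{i,t-k}) \\
&+ \alpha^{B}\,\Delta\ln(\text{Traffic to blog sites}_{it}) + c_{it}'\alpha^{c} + \tau^{B}_{t} + u^{B}_{it}
\end{aligned}
\tag{1}
\]

\[
\begin{aligned}
\Delta\ln(\mathit{Search}_{it}) ={}& \sum_{k=0}^{4}\beta^{A}_{k}\,\Delta\ln(\mathit{Ad}_{i,t-k}) + \sum_{k=0}^{4}\beta^{B}_{k}\,\Delta\ln(\mathit{Blog}_{i,t-k}) \\
&+ \beta^{S}\,\Delta\ln(\text{Volume of keyword `opening movie'}_{it}) + c_{it}'\beta^{c} + \tau^{S}_{t} + u^{S}_{it},
\end{aligned}
\tag{2}
\]

where $\Delta$ represents first-differencing. $\mathit{Blog}_{it}$ is the blog volume, $\mathit{Ad}_{it}$ is the advertising spending, and $\mathit{Search}_{it}$ is the search volume of movie $i$ in pre-launch week $t$. The column vector $c_{it}$ is the set of control variables that might influence the blog volume and search volume of movie $i$ in pre-launch week $t$. In our analysis, $c_{it}$ consists of the holiday dummy variable. $\tau^{B}_{t}$ and $\tau^{S}_{t}$ are the time-fixed effects of pre-launch week $t$, $u^{B}_{it}$ and $u^{S}_{it}$ are the error terms, and $\alpha$ and $\beta$ denote the coefficients of the blog and search equations, respectively.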

4.2. The Post-Launch Period Model

The post-launch period model is specified by equations (3) – (5).

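\[
\begin{aligned}
\ln(\mathit{Blog}_{it}) ={}& \gamma^{A}\ln(\mathit{Ad}_{it}) + \gamma^{S}\ln(\mathit{Search}_{it}) + \gamma^{R}\ln(\mathit{Revenue}_{it}) \\
&+ \gamma^{B}\ln(\text{Traffic to blog sites}_{it}) + c_{it}'\gamma^{c} + \tau^{B}_{t} + u^{B}_{it}
\end{aligned}
\tag{3}
\]

\[
\begin{aligned}
\ln(\mathit{Search}_{it}) ={}& \delta^{A}\ln(\mathit{Ad}_{it}) + \delta^{B}\ln(\mathit{Blog}_{it}) + \delta^{R}\ln(\mathit{Revenue}_{it}) \\
&+ \delta^{S}\ln(\text{Volume of keyword `opening movie'}_{it}) + c_{it}'\delta^{c} + \tau^{S}_{t} + u^{S}_{it}
\end{aligned}
\tag{4}
\]

\[
\begin{aligned}
\ln(\mathit{Revenue}_{it}) ={}& \lambda^{A}\ln(\mathit{Ad}_{it}) + \lambda^{B}\ln(\mathit{Blog}_{it}) + \lambda^{S}\ln(\mathit{Search}_{it}) \\
&+ \lambda^{D}\ln(\mathit{Scrns}_{it}) + c_{it}'\lambda^{c} + \tau^{R}_{t} + u^{R}_{it}
\end{aligned}
\tag{5}
\]

$\mathit{Blog}_{it}$ is the blog volume, $\mathit{Ad}_{it}$ is the advertising spending, $\mathit{Search}_{it}$ is the search volume, $\mathit{Revenue}_{it}$ is the revenue, and $\mathit{Scrns}_{it}$ is the number of screens of movie $i$ in post-launch week $t$. The vector $c_{it}$ contains the variables other than our focal variables that might influence the weekly blog volume, search volume, and revenue; it consists of the holiday dummy variable. $\tau^{B}_{t}$, $\tau^{S}_{t}$, and $\tau^{R}_{t}$ are the time-fixed effects of post-launch week $t$, and $\gamma$, $\delta$, and $\lambda$ denote the coefficients of the blog, search, and revenue equations, respectively. In the blog and search equations, we include weekly revenue to examine the effect of movie consumption on blogging (blog equation) and searching (search equation) activity. As the number of screens is an important determinant of movie revenue, we control for the screen effect in the revenue equation. For identification purposes, we assume that the error term of the revenue equation is uncorrelated with those of the blog and search equations:

\[
\mathrm{Cov}(u^{B}_{it}, u^{R}_{it}) = \mathrm{Cov}(u^{S}_{it}, u^{R}_{it}) = 0 \quad \text{for any } i \text{ and } t.
\]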

4.3. The Opening-Week Model

We develop the opening-week model to examine the effect of the pre-launch blog and search

volume on the opening-week revenue of movies. It is a cross-section data model as only one data

point is observed per movie in the opening week. The model is specified as follows.

\[
\begin{aligned}
\ln(\mathit{Open\_Revenue}_{i}) ={}& \theta_{0} + \theta_{1}\ln(\mathit{Open\_Ad}_{i}) + \theta_{2}\ln(\mathit{Open\_Scrns}_{i}) \\
&+ \theta_{3}\ln(\mathit{PreLaunch\_Blog}_{i}) + \theta_{4}\ln(\mathit{PreLaunch\_Search}_{i}) \\
&+ \theta_{5}(\mathit{Holiday}_{i}) + \theta_{6}(\mathit{Critic\_Review}_{i}) + u^{R}_{i},
\end{aligned}
\tag{6}
\]

where $\mathit{Open\_Revenue}_{i}$ is the opening-week revenue, $\mathit{Open\_Ad}_{i}$ is the opening-week advertising spending, $\mathit{Open\_Scrns}_{i}$ is the number of opening-week screens, $\mathit{PreLaunch\_Blog}_{i}$ is the blog volume cumulated during the pre-launch period of movie $i$, $\mathit{PreLaunch\_Search}_{i}$ is the search volume cumulated during the pre-launch period of movie $i$, $\mathit{Holiday}_{i}$ is an indicator of whether movie $i$’s opening week contains any national holidays, and $\mathit{Critic\_Review}_{i}$ is the average critic rating of movie $i$ as collected from Metacritic.com. Other movie characteristics are reserved for instruments.

4.4. Identification and Estimation

Identifying the model parameters is a challenging task. We use exclusion restrictions and a

covariance restriction to identify the parameters. Then, we use a generalized method of moments

procedure to estimate the parameters. Details on the identification and estimation can be found in

Appendix A.
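To illustrate why an excluded exogenous variable helps when regressors are simultaneously determined, here is a minimal sketch on simulated data. It uses the simple instrumental-variable estimator with a single instrument, which coincides with GMM only in the just-identified linear case; it is not the estimator described in Appendix A. The data-generating process, the coefficients, and all variable names are assumptions made for illustration.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)


random.seed(0)
n = 20000
true_effect = 0.5  # assumed causal effect of x on y

z = [random.gauss(0, 1) for _ in range(n)]  # instrument: moves x, excluded from y
e = [random.gauss(0, 1) for _ in range(n)]  # unobserved shock hitting both x and y
x = [zi + ei + random.gauss(0, 1) for zi, ei in zip(z, e)]
y = [true_effect * xi + 2.0 * ei + random.gauss(0, 1) for xi, ei in zip(x, e)]

# OLS slope is contaminated by the common shock e.
ols_slope = cov(x, y) / cov(x, x)
# IV slope uses only the variation in x that is driven by the instrument.
iv_slope = cov(z, y) / cov(z, x)
```

In this simulation the OLS slope is pulled well above the assumed effect because the unobserved shock moves both variables, while the IV slope stays close to it. The same logic motivates using the “opening movie” search index and the blog-site reach as excluded variables in the search and blog equations.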

5. Empirical Results

5.1. Post-Launch Analysis Results

The post-launch period model is estimated by ordinary least squares (OLS) and the generalized method of moments (GMM), and the results are reported in Table 5. We also estimate two nested versions of the post-launch period model, the results of which are reported in Tables 6 and 7. The first nested version in Table 6 assumes that weekly search volume data are not collected, while the second nested version in Table 7 assumes that they are collected. We juxtapose the statistically significant relationships between the endogenous variables of the three versions in Figure 3.





First, we uncover how advertising and blogging influence revenue differently. Advertising

influences revenue directly as well as indirectly through search. Moreover, the direct effect of

advertising accounts for the majority of total advertising effectiveness: Doubling the weekly

advertising increases the same-week revenue by 22.5 percent directly and 3.3 percent (= 0.104 ×

0.234) indirectly through consumer search. In contrast, the effect of blogs on revenue is realized

only indirectly through consumer search activity. The elasticity of revenue to blog volume is 8.1

percent (= 0.348 × 0.234), which is less than one third of the advertising elasticity of revenue. In sum, most of the advertising effect on revenue is realized regardless of consumer online search, while none of the blog effect on revenue materializes without consumer online search.
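The indirect blog effect quoted above is the product of the elasticities along the path from blog volume to search volume to revenue. The short sketch below reproduces that arithmetic from the two coefficients given in the text. It captures only this first-round path and ignores the feedback from revenue to searching and blogging; the function name is hypothetical.

```python
from math import prod


def indirect_elasticity(path_elasticities):
    """Multiply the elasticities along a causal path (chain rule in a log-log model)."""
    return prod(path_elasticities)


blog_to_search = 0.348     # elasticity of search volume to blog volume (from the text)
search_to_revenue = 0.234  # elasticity of revenue to search volume (from the text)
blog_direct = 0.0          # no significant direct blog effect on revenue

blog_indirect = indirect_elasticity([blog_to_search, search_to_revenue])
blog_total = blog_direct + blog_indirect
```

Because the direct blog effect is zero, the total blog elasticity equals the indirect one, about 0.081.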

We can explain the difference between advertising effect and blog effect in terms of how the

two are delivered to consumers. Advertisements are available through various media including

TV, print, radio and the Internet, while blogs are available only on the Internet. As such, blogs reach consumers only when the consumers search for them, whereas advertisements reach consumers regardless of their online search activity. This is consistent with

Onishi and Manchanda (2012) who find that blogs viewed, not blogs generated, matter for

market success of new products. Also, it implies that the effect of WOM on sales is moderated

by online search intensity.

Second, weekly online search volume predicts weekly movie revenue better than weekly blog volume does (compare the revenue equations in Tables 6 and 7). Furthermore, weekly blog

volume does not explain weekly revenue once weekly advertising and search volume are

controlled for (Table 5). Thus, if the purpose of an analysis is to build a model to predict weekly

revenue, weekly search volume, not blog volume, is the variable that managers should collect.

Third, the weekly search volume of a movie is significantly associated with the weekly

Google search index of the keyword “opening movie”. In the same week, a one hundred percent

increase in the latter would lead to a 44.5 percent increase in the former. As such, the same ad

spending of a movie could generate substantially different search volumes, influenced by the

search index of the keyword “opening movie”. Knowing when the weekly search index of

“opening movie” is high during a year may help studio managers allocate advertising budgets.

The search index of the keyword “opening movie” represents how much consumers are

interested in opening movies in general, while the search volume of a movie measures how much

they are interested in the specific movie. Our finding shows that consumers’ generic interest in

opening movies triggers their search for individual movies.

Similarly, we find that the weekly blog volume of an individual movie is significantly

associated with the weekly blogging intensity of the blogger population, as measured by the

weekly reach of the five most popular blog sites. A one hundred percent increase in the weekly

reach of the five sites would lead to a 230 percent increase in the blogs of an individual movie in

the same week.

Fourth, a movie’s weekly revenue influences the search and blog volume of the movie in the

same week, implying that movie consumption is both a consequence and an antecedent of

blogging and online search. This has an implication for using the vector autoregressive (VAR)

framework (e.g., Pauwels et al. 2002) to find the long-term effects of the three variables: As blog

volume, search volume and movie revenue have contemporaneous effects on each other, imposing a causal order among the variables at the weekly level will bias the long-term effects of the variables.

5.2. Pre-Launch Analysis Results

Table 8 shows the pre-launch analysis results, comparing OLS with GMM estimates.


First, OLS indicates a significant contemporaneous association between blogging and searching activity around a movie, while GMM finds no significant causal effects between the two activities in the same week; rather, blogging and searching in a pre-launch week are attributed to advertising and their respective exogenous variables.

The difference between the two results indicates there may exist omitted underlying factors

that influence online search and blogging simultaneously. The OLS finds the effects of

underlying factors by reporting the significant association between search and blog volume. For

example, the level of excitement among consumers may increase as a new movie’s release is

approaching. If the level of excitement about to-be-released movies influences blogging and

searching activities simultaneously but is not controlled for, we may observe a significant

association between the blogging and searching activities as in the OLS analysis.

Second, we find that consumers respond to a movie’s pre-launch advertising by posting blogs

about and searching for the movie. However, the responsiveness is substantially different

between the two activities. The GMM finds that the elasticity of weekly blog volume to weekly

advertising is 0.028, while that of weekly search volume to advertising is 0.104 (0.053 in the

same week and 0.051 in the following week).
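To make these elasticities concrete, the following sketch converts them into approximate percentage responses. It assumes the constant-elasticity (log-log) form implied by the reported estimates; the 10% advertising increase and the helper name `pct_response` are illustrative choices and not part of the original analysis.

```python
def pct_response(elasticity: float, pct_change_in_ads: float) -> float:
    """Percent change in the outcome implied by a constant elasticity.

    Under a log-log specification, scaling advertising by (1 + p)
    scales the outcome by (1 + p) ** elasticity.
    """
    p = pct_change_in_ads / 100.0
    return ((1.0 + p) ** elasticity - 1.0) * 100.0


# GMM elasticities reported in the text.
BLOG_ELASTICITY = 0.028
SEARCH_SAME_WEEK = 0.053
SEARCH_NEXT_WEEK = 0.051
SEARCH_TOTAL = SEARCH_SAME_WEEK + SEARCH_NEXT_WEEK  # the 0.104 quoted in the text

# Hypothetical scenario: weekly pre-launch advertising rises by 10%.
blog_pct = pct_response(BLOG_ELASTICITY, 10.0)
search_same_pct = pct_response(SEARCH_SAME_WEEK, 10.0)
search_next_pct = pct_response(SEARCH_NEXT_WEEK, 10.0)

print(f"blog volume, same week:        +{blog_pct:.2f}%")
print(f"search volume, same week:      +{search_same_pct:.2f}%")
print(f"search volume, following week: +{search_next_pct:.2f}%")
```

The same-week and following-week search responses are reported separately because they occur in different weeks; only their elasticities, not the percentage changes, are summed in the text.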

The difference may be explained by the behavioral costs that blogging and searching

activities incur. Online search activity costs a consumer just a few keystrokes while blogging

activity requires much more labor. Furthermore, consumers have little to write about a movie in

its pre-launch period because the trailers are the only information source. This suggests that in

the pre-launch period, blogging may be less responsive to advertising than searching is, implying

that movie studio managers should consider not only WOM but also online search volume to

measure advertising effectiveness in the pre-launch period of a movie.

Third, the suggested exogenous variables—the weekly reach of the five blog sites and the

weekly Google search index of the keyword “opening movie”—are associated with their

respective endogenous variables. As such, a given amount of weekly advertising generates more blog

and search volumes when consumers’ overall blogging intensity and their generic interest in

movies are higher than average.

5.3. Opening-Week Analysis Results

Table 9 shows the OLS and GMM results for the opening-week model, with two different sets of

covariates. The first set includes only pre-launch blog volume whereas the second set includes

pre-launch search as well as blog volume.


Pre-launch blog volume explains variations in opening-week movie revenue when pre-launch

search volume is omitted. When pre-launch search volume is added, however, pre-launch blog

volume loses its predictive power to pre-launch search volume. Furthermore, the model fit

improves when pre-launch search volume is included.
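The pattern described above, in which blog volume loses its explanatory power once search volume enters the model, is what one would expect if blogs influence revenue only through search. The simulation below is a minimal sketch of that mechanism under an assumed data-generating process; the coefficients 0.6 and 0.8, the sample size, and the helper `ols` are hypothetical and are not estimates from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000  # hypothetical number of observations (far more than the paper's sample)

# Assumed data-generating process: blogs affect revenue only through search.
blog = rng.normal(size=n)
search = 0.6 * blog + rng.normal(size=n)
revenue = 0.8 * search + rng.normal(size=n)


def ols(y, *regressors):
    """OLS coefficients (intercept first) via least squares."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


blog_only = ols(revenue, blog)                # [intercept, blog]
blog_and_search = ols(revenue, blog, search)  # [intercept, blog, search]

print("blog coefficient, search omitted: ", round(blog_only[1], 3))
print("blog coefficient, search included:", round(blog_and_search[1], 3))
```

In this stylized setting the blog coefficient is clearly positive when search is omitted and close to zero once search is included, mirroring the qualitative result in Table 9.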

We attribute the superior predictive ability of pre-launch search volume to the fact

that (i) searching is a more prevalent behavior of consumers than blogging and (ii) blogs

influence revenue insofar as they are exposed to consumers—as we find in the post-launch

period analysis. This finding complements previous studies that examine the predictive

performance of only search (e.g., Kulkarni et al. 2012) or online WOM (e.g., Gopinath et al.

2013, Liu 2006). To the best of our knowledge, this is the first study that compares the predictive

ability of pre-launch search volume and blog volume for new entertainment products. For

managers, this implies that the pre-launch search activity, not pre-launch blogging activity, is the

metric they need to monitor to better predict opening-week movie revenue.

6. Implications

This study implies that monitoring consumer search activity is important to i) better allocate

marketing budgets and ii) better predict the demand for a new product. Firms invest not only in

traditional advertising but also in social media advertising through promotional chat (Dellarocas

2006, Mayzlin 2006) and firm-created WOM (Godes and Mayzlin 2009). As such, it is important

to know how traditional and social media advertising influence revenue in order to maximize

communication effectiveness. Relevant to this objective is our finding that blogs require search

activity to influence revenue, whereas traditional advertising does not. Firms can improve

their communication effectiveness by allocating their budget according to consumers’ search

intensity. For example, social media advertising can generate sales more effectively in

geographic markets where online search is more active than in markets where it is less active.

Demand forecasting is an important issue to any firm, and especially so to new-product

managers. Our findings indicate that managers can better forecast demand by monitoring search

volume of relevant keywords, as blog volume loses its predictive power to search volume when

both are included to predict movie revenue. For this purpose, the Google search index can serve

as a readily available data source.

Other implications are as follows. New-product managers want to measure pre-launch

advertising effectiveness to find a better pre-launch advertising schedule. To this end, previous

studies used virtual stock prices traded in the Hollywood Stock Exchange (e.g., Bruce et al.

2012). However, virtual stock prices have several limitations. First, the market where the virtual

stocks are traded is not a real but a simulated market. Furthermore, not all moviegoers are

involved in the virtual stock trading. As such, HSX stock prices do not measure the awareness

about movies in the entire consumer population. Second, virtual stock markets are not available

for all new products. For example, only movie and TV show stocks are traded on the HSX. Thus,

managers in other industries cannot conduct a similar analysis. This study suggests another,

potentially superior metric for measuring pre-launch advertising effectiveness: online search volume.

For researchers, this study shows that the online keyword search index and website traffic

can provide a source of exogenous variation in certain online actions of consumers. Figure 4

summarizes the key influencers of movie revenue and the resulting managerial implications.


7. Conclusions

Despite the prevalence of consumers’ media consumption activity, researchers have not paid

sufficient attention to it. Using blog and search volume of movies, this study has examined the

relationship between advertising, consumers’ media generation, media consumption and market

outcomes.

Several important findings emerge. First, there is an important difference as to how

advertising and blogging activity influence movie revenue. We find that blog postings require

consumer search for them to influence movie revenue. Advertising, on the other hand, influences

revenue without consumers’ online search activity. In fact, in the post-launch period of a movie,

the indirect effects of advertising on revenue through consumer searching activity are so small

that they barely contribute to advertising’s overall revenue impact.

Second, advertising is the main driver of movie revenue throughout the movie life cycle. The

opening-week revenue of a movie is influenced by its opening-week advertising and pre-launch

search volume. However, the pre-launch search volume of a movie is influenced mainly by its

advertising. As such, advertising is the dominant cause of opening-week revenue of movies. In

the post-launch period, both advertising and consumer blogging activity influence weekly movie

revenue, but the effectiveness of advertising is almost twice that of blog volumes.

Third, once online search volume of a movie is controlled for, blog volume of the movie does

not improve the performance of movie revenue predictions. Thus, for the purpose of predicting

the market success of a movie, managers should focus on search volume, not blog volume.

This study has several managerial implications. First, online search activity provides guidance

in allocating a firm’s marketing budget between traditional advertising and firm-created WOM

management (or social media management). Second, to predict opening-week revenue of a

movie, pre-launch search volume, not pre-launch WOM volume, is the metric that managers

should monitor. Likewise, to predict the post-launch weekly revenue of a movie, weekly online search

volume of the movie should be monitored, not its weekly blog volume. Third, almost 80% of

movie advertising is executed in the pre-launch period or during the opening week. Therefore, finding

an efficient pre-launch advertising schedule is an important task. Our findings that pre-launch

advertising is the main driver of pre-launch search activity and that the pre-launch search activity

has a substantial effect on opening-week revenue suggest that studio managers may use the time series

of pre-launch search volume to measure the effectiveness of pre-launch movie advertising.

This study is subject to several limitations. First, we use blog volume to examine the effect of

consumers’ media generation. While blog postings are the only WOM of consumers before a

new product’s launch, in the post-launch period consumers express their opinions through

various channels, including review sites. Second, we do not consider the valence of blog

postings. This is expected in the pre-launch period because the new product is not yet available,

and as such there should be no valence information. But in the post-launch period, the WOM

valence can influence movie-going decisions. One way of controlling for WOM

valence would be to collect user review data and include it in the model. Third, this study is conducted

in the movie industry. For the generalization of its findings, this study needs to be extended to

other sectors such as video games, music albums, and books.

Several research opportunities remain in this field. First, the content of blogs may influence

consumer search behavior. For example, search may be greater when there is strong

disagreement among consumers’ opinions. Finding what products receive more searches from

consumers and why can be an interesting research question. Second, given that pre-launch search

volume influences post-launch sales, determining the optimal allocation of a pre-launch

advertising budget to maximize search volume is important. Lastly, consumer search activity

may lead them to the related products’ websites. Examining the relationship between search

activity and traffic to product websites can be an interesting topic.

Appendix A: Identification and Estimation

This appendix details how to identify and estimate the model parameters.

A.1 Identifying the Parameters of the Pre- and Post-Launch Period Models

We rely on a covariance restriction and several exclusion restrictions to identify the model

parameters. The covariance restriction identifies the effect of revenue on blog and search

volume in the post-launch period model. It uses the assumption that the error term of the revenue

equation is uncorrelated with those of the blog and search equations for any movie and week:

$\mathrm{Cov}(u_{it}^{B}, u_{it}^{R}) = \mathrm{Cov}(u_{it}^{S}, u_{it}^{R}) = 0$. The exclusion restrictions identify the effect of blog and

search volume on revenue and on each other. For each endogenous variable, we first find a variable

that creates exogenous variations in only the focal endogenous variable. Then, we include the

exogenous variable as a covariate in only the focal endogenous variable’s equation (i.e. the

equation where the focal endogenous variable is the dependent variable). Also, the exogenous

variable is used as an instrument in the equations where the focal endogenous variable is

included as a covariate. For example, in order to identify the effect of weekly search volume on

revenue (the search-volume parameters, indexed by $k$, in (5)), we find a variable that creates exogenous variations in only

weekly search volume and include it in the search equation, but not in blog and revenue

equations. We also use the exogenous variable as an instrument for weekly search volume in the

blog and revenue equations. In the following subsections, we explain how to identify the model.

An Exogenous Variable for the Weekly Blog Volume of a Movie. We argue that the overall

weekly blogging activity of the entire blogger population—i.e., blogging intensity of the blogger

population across all topics and issues in each week—provides exogenous variation for the

weekly blog volume of an individual movie. The intuition is that the overall blogging intensity of

the blogger population in a week, which is influenced by various exogenous factors such as

weather and holidays (Gopinath et al. 2013), is likely to influence their blogging activity on any

given topic in that week. Thus, the overall blogging intensity of the blogger population in a

week will provide exogenous variations for the weekly blog volumes of an individual movie in

the week.

The suggested exogenous variable—the weekly blogging intensity of the blogger

population—will have little influence on the weekly revenue of a specific movie given that we

control for the effects of weekly advertising, blog volume, and search volume of the movie—i.e.,

if we include weekly advertising, blog volume and search volume of the movie in the revenue

equation. Also, the exogenous variable will have little influence on consumers’ online search

activity for a specific movie given that we control for the effects of weekly advertising, blog

volume and search volume of the movie—i.e., if we include weekly advertising, blog volume

and search volume of the movie in the search equation. Finally, the weekly advertising spending

of an individual movie is not likely to be correlated with the weekly blogging intensity of the

blogger population because an individual movie’s advertising spending will hardly contribute to

the overall blogging activity of the blogger population.

The above arguments suggest that the weekly blogging intensity of the blogger population

provides exogenous variations to the weekly blog volume of an individual movie, but not to

other endogenous variables of the movie. Assuming that the weekly reach of popular blog sites

represents the overall weekly blogging activity of the blogger population, we include the weekly

reach of the five most popular blog sites (blogger.com, tumblr.com, wordpress.com,

squarespace.com, and posterous.com) as a covariate of the blog equation. We use the exogenous

variable as an instrument in the search and revenue equations.

An Exogenous Variable for the Weekly Search Volume of a Movie. We argue that the weekly

Google search index of the keyword “opening movie” provides exogenous variation for the

weekly search volume of an individual movie. The weekly search index of the keyword “opening

movie”, which reflects consumers’ generic interest in movies, measures the seasonality in the

movie industry (see Figure 2). As the seasonality is mainly determined by exogenous factors, the

weekly search index of the keyword “opening movie” contains exogenous variations. Moreover,

the generic interest in movies will influence the search volume of an individual movie as

consumers’ generic interest in movies in a week may transfer to interest in individual movies in

the week. For example, consumers may first search with the keyword “opening movies” to find

what movies are available to them and then narrow down to a few specific movies to acquire

detailed information about those movies. Rutz and Bucklin (2011) find a similar phenomenon in

the lodging industry.

The suggested exogenous variable—the weekly Google search index of the keyword

“opening movie”—will have little influence on the weekly revenue of a specific movie given that

we control for the effects of weekly advertising, blog volume, and search volume of the movie.

Also, the exogenous variable will have little influence on consumers’ blogging activity for a

specific movie given that we control for the effects of weekly advertising, blog volume and

search volume of the movie. Finally, weekly advertising spending of an individual movie should

not contribute to the weekly search volume of the generic keyword “opening movie”, as the latter

represents consumers’ generic interest in opening movies without a specific movie under

consideration (e.g., Joo et al. 2012 for the financial services product industry).

The above arguments suggest that the weekly search index of the keyword “opening movie”

provides exogenous variations for the weekly search volume of individual movies, but not to

other endogenous variables. We include the exogenous variable as a covariate of the search

equation and use it as an instrument in the blog and revenue equations.

Exogenous Variation in the Weekly Advertising Spending of a Movie. Elberse and Anand

(2007) argue that the week-to-week change in movie advertising spending is exogenous. The

reason is that movie studios typically purchase the vast majority of advertising time in the

upfront market. This practice makes it extremely hard for a movie studio to buy additional

advertising time based on new information, once the advertising schedule is set. Additional

purchase of advertising time in the opportunistic market, if any, is affected by exogenous events

such as sports broadcasts and award shows. Thus, week-to-week variation in advertising

spending of a movie is mainly driven by exogenous factors, implying that weekly variation in

advertising spending of a movie can be used as an instrument for the movie’s weekly advertising

spending.

Based on the above argument, we use $\log(A_{i,t-1}) - \log(A_{i,t-2})$ as an instrument for $\log(A_{it})$ in

the post-launch model and the opening-week model (see footnote 4). We also include movie characteristics such

as director power, star power and production budget as instruments because movie

characteristics such as the participation of high-profile stars and directors influence the total

advertising budget of a movie (Basuroy et al. 2006, Hennig-Thurau et al. 2006). Note that we do

not need an instrument for pre-launch period model as the first-differenced weekly advertising in

the model is exogenous.
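The lagged log-difference instrument described above can be computed per movie as in the following sketch. The function name `lagged_log_ad_change` and the toy spending series are hypothetical, and the sketch assumes strictly positive weekly spending, since the logarithm of zero is undefined.

```python
import numpy as np


def lagged_log_ad_change(ad_spend):
    """Instrument for log(A_t): log(A_{t-1}) - log(A_{t-2}).

    ad_spend: weekly advertising spending of one movie, in time order.
    Returns an array of the same length; the first two entries are NaN
    because two lags are not yet available. Assumes strictly positive
    spending.
    """
    log_a = np.log(np.asarray(ad_spend, dtype=float))
    instrument = np.full(log_a.shape, np.nan)
    instrument[2:] = log_a[1:-1] - log_a[:-2]
    return instrument


# Hypothetical weekly spending for one movie (arbitrary units).
weekly_ads = [100.0, 200.0, 400.0, 400.0, 100.0]
z = lagged_log_ad_change(weekly_ads)
print(z)
```

Weeks for which the instrument is NaN would have to be dropped or handled separately before estimation.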

Footnote 4: For the opening-week model where $t = 0$, we use $\log(A_{i,t-1}) - \log(A_{i,t-2})$ as an instrument for the opening-week advertising spending, $\log(A_{i0})$.

Identifying the Effect of Weekly Revenue on Weekly Blog Volume and Search Volume.
Unexpected shocks on the supply side (e.g., an increase or decrease in the available theaters that
consumers do not anticipate) can create exogenous variation in the weekly revenue of individual
movies. However, such supply-side shocks are not readily observable to researchers (see footnote 5).
To identify the effect of the weekly revenue of a movie on its weekly blog volume and search volume,
we rely on the assumption that the error term of the revenue equation (5) is contemporaneously
uncorrelated with the error terms of the blog and search equations (3) and (4), i.e.,
$\mathrm{Cov}(u_{it}^{R}, u_{it}^{B}) = \mathrm{Cov}(u_{it}^{R}, u_{it}^{S}) = 0$. This assumption is justified if
unobserved factors in a post-launch week do not influence the weekly search volume, blog volume,
and revenue simultaneously in that week. Suppose, for instance, that more consumers search, blog,
and watch movies in a week that contains a national holiday than in a week that does not. If this is
the case and the model does not include the holiday dummy variable, the assumption
$\mathrm{Cov}(u_{it}^{R}, u_{it}^{B}) = \mathrm{Cov}(u_{it}^{R}, u_{it}^{S}) = 0$ may not hold. This type of unobserved
simultaneous effect can be alleviated by including the weekly dummy variables, which account for
the unobserved effect of each week, and any observable variables that are known to influence our
endogenous variables, making the assumption reasonable.

If the model specification makes the assumption
$\mathrm{Cov}(u_{it}^{R}, u_{it}^{B}) = \mathrm{Cov}(u_{it}^{R}, u_{it}^{S}) = 0$ reasonable, we can identify the
effect of weekly revenue on weekly blog and search volume by estimating (5) first and using its
residuals, $\hat{u}_{it}^{R}$, as an instrument for $\ln(\mathit{Revenue}_{it})$ in (3) and (4). The intuition is
that if the parameters in (5) are known, $u_{it}^{R}$ is effectively known. The assumption states that
$u_{it}^{R}$ is uncorrelated with $u_{it}^{B}$ and $u_{it}^{S}$, whereas it is partially correlated with
$\ln(\mathit{Revenue}_{it})$. Thus, we effectively have $u_{it}^{R}$ as an instrument for
$\ln(\mathit{Revenue}_{it})$ in the blog and search equations. The estimation procedure is as follows.
First, we estimate (5) by an instrumental variable technique and save the residuals,
$\hat{u}_{it}^{R}$. Then we estimate the blog and search equations using $\hat{u}_{it}^{R}$ as an instrument
for $\ln(\mathit{Revenue}_{it})$. The fact that $\hat{u}_{it}^{R}$ depends on estimates from a prior stage does
not affect the consistency of the estimators of the blog and search equations (Wooldridge 2002, p. 207).

Footnote 5: Gopinath, Chintagunta and Venkataraman (2013) suggest the number of temporarily closed
theaters in a designated market area (DMA) as a source of exogenous variation in the number of
theaters in the DMA. This information was not available to us.
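The two-stage procedure can be illustrated on simulated data. The sketch below uses a stylized two-equation system (revenue and search only) rather than the paper's full model; the parameter values, the variable names, and the helper `iv` are assumptions made for illustration. The covariance restriction is imposed by drawing the two error terms independently.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000  # hypothetical sample size, chosen large so estimates are stable

# Assumed true parameters of a stylized two-equation system.
beta, gamma, delta = 0.5, 0.3, 1.0

z = rng.normal(size=n)    # exogenous shifter of search, excluded from the revenue equation
u_r = rng.normal(size=n)  # revenue error
u_s = rng.normal(size=n)  # search error, drawn independently of u_r (covariance restriction)

# Solve the simultaneous system
#   revenue = beta * search + u_r
#   search  = gamma * revenue + delta * z + u_s
search = (gamma * u_r + delta * z + u_s) / (1.0 - gamma * beta)
revenue = beta * search + u_r


def iv(y, X, W):
    """Just-identified IV estimator: solve (W'X) b = W'y."""
    return np.linalg.solve(W.T @ X, W.T @ y)


# Step 1: revenue equation, instrumenting search with z; keep the residuals.
X_r = search[:, None]
b_rev = iv(revenue, X_r, z[:, None])
resid_r = revenue - X_r @ b_rev

# Step 2: search equation, instrumenting revenue with the step-1 residuals.
X_s = np.column_stack([revenue, z])
W_s = np.column_stack([resid_r, z])
b_search_iv = iv(search, X_s, W_s)

# Naive OLS on the search equation, for comparison.
b_search_ols, *_ = np.linalg.lstsq(X_s, search, rcond=None)

print("true gamma:", gamma)
print("IV gamma:  ", round(b_search_iv[0], 3))
print("OLS gamma: ", round(b_search_ols[0], 3))
```

In this setting the residual-based instrument recovers the effect of revenue on search, whereas OLS overstates it because revenue and the search error are correlated through the feedback loop.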

A.2 Identifying the Parameters of the Opening-Week Model

The opening-week model (6) is a cross-section data model. The opening-week advertising,

opening-week screen, pre-launch search and blog volume are potentially correlated with the error

term. We use movie characteristics as common instruments for all the endogenous variables. As

an additional instrument for pre-launch blog volume of a movie, we use the total pre-launch blog

traffic of the movie, which is the sum of weekly traffic to the five blog sites during the movie’s

pre-launch weeks. Also, as an additional instrument for pre-launch search volume of a movie, we

use the sum of the weekly Google search index of “opening movie” during the movie’s pre-launch

weeks. Lastly, we use $\log(A_{i,t-1}) - \log(A_{i,t-2})$ as an instrument for the opening-week advertising

spending, $\log(A_{i0})$.

A.3 Estimation of the Pre- and Post-Launch Period Models

We apply a generalized method of moments (GMM) procedure to each equation of each model.

Let i be the index for a movie (i = 1, …, N) where N = 153 and t be the index for time (t = 1, …,

Ti). In the pre-launch period model, Ti = 30 for each movie; in the post-launch period model, Ti ≤

10. Let yit be the dependent variable of the estimation equation, xit be the corresponding row

vector of explanatory variables, and zit be the corresponding row vector of instruments. For

movie i, let yi be the Ti×1 vector of the dependent variable of the focal equation, obtained by

stacking yit from t = 1, …, Ti. Xi and Zi are similarly constructed by stacking xit and zit.

For each equation, the GMM estimation steps are as follows. (i) For the focal equation, apply

the two-stage least squares (2SLS) estimation and obtain residuals. (ii) Use these residuals to

obtain the GMM weighting matrix that is robust to arbitrary serial correlation of the error term.

(iii) With the weighting matrix, estimate the parameters of the focal equation by GMM. The

GMM weighting matrix in step (ii) is in (A-1).

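(A-1) $\hat{W} = \left( N^{-1} \sum_{i=1}^{N} Z_i' \hat{u}_i \hat{u}_i' Z_i \right)^{-1}$,

where $\hat{u}_i$ is the $T_i \times 1$ vector of residuals obtained from the 2SLS regression in (i). The GMM
estimator and its asymptotic robust covariance matrix are

(A-2) $\hat{\beta}_{GMM} = \left( X'Z \hat{W} Z'X \right)^{-1} X'Z \hat{W} Z'Y$,

$\hat{V}(\hat{\beta}_{GMM}) = \left( \sum_{i=1}^{N} \hat{X}_i' \hat{X}_i \right)^{-1} \left( \sum_{i=1}^{N} \hat{X}_i' \hat{u}_i \hat{u}_i' \hat{X}_i \right) \left( \sum_{i=1}^{N} \hat{X}_i' \hat{X}_i \right)^{-1}$,

where X, Z, and Y are obtained by stacking Xi, Zi, and yi from i = 1, …, N,
$\hat{X}_i = Z_i (Z_i' Z_i)^{-1} Z_i' X_i$, and $\hat{u}_i = y_i - X_i \hat{\beta}_{GMM}$. In the post-launch analysis, we
first estimate the revenue equation (5) and obtain $\hat{u}_{it}^{R} = y_{it} - x_{it} \hat{\beta}_{GMM}$, the GMM
residual of the revenue equation. Then we use $\hat{u}_{it}^{R}$ as an instrument for log(Revenueit) in
estimating the blog and search equations.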

The following variables are used as instruments.

Blog equation: previous week’s advertising variation, weekly search index of the

keyword “opening movie,” weekly traffic to the five blog sites, the holiday dummy variable,

movie characteristics, the week dummy variables, and residuals from the revenue equation (for

the post-launch period model).

Search equation: previous week’s advertising variation, weekly search index of the

keyword “opening movie,” weekly traffic to the five blog sites, the holiday dummy variable,

movie characteristics, the week dummy variables, and residuals from the revenue equation (for

the post-launch period model).

Revenue equation: previous week’s advertising variation, weekly search index of the

keyword “opening movie,” weekly traffic to the five blog sites, the holiday dummy variable,

movie characteristics, and the week dummy variables.
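The three GMM steps in A.3 can be sketched for a single equation as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the function name `two_step_gmm`, the simulated panel, and its parameter values are hypothetical, and only the point estimator is computed (the covariance matrix in (A-2) is omitted).

```python
import numpy as np


def two_step_gmm(y, X, Z, movie_ids):
    """Two-step GMM for one equation with a movie-clustered weighting matrix.

    y: (n,) outcome; X: (n, k) regressors; Z: (n, l) instruments, l >= k;
    movie_ids: (n,) movie label per row. Returns (beta_2sls, beta_gmm).
    """
    movies = np.unique(movie_ids)

    # Step (i): 2SLS estimates and residuals.
    X_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)
    beta_2sls = np.linalg.solve(X_hat.T @ X, X_hat.T @ y)
    resid = y - X @ beta_2sls

    # Step (ii): W = (N^-1 * sum_i Z_i' u_i u_i' Z_i)^-1, summing over movies,
    # which allows arbitrary serial correlation within a movie.
    moment_cov = np.zeros((Z.shape[1], Z.shape[1]))
    for m in movies:
        rows = movie_ids == m
        g = Z[rows].T @ resid[rows]
        moment_cov += np.outer(g, g)
    W = np.linalg.inv(moment_cov / len(movies))

    # Step (iii): GMM estimator (X'Z W Z'X)^-1 X'Z W Z'y.
    XZ = X.T @ Z
    beta_gmm = np.linalg.solve(XZ @ W @ XZ.T, XZ @ W @ (Z.T @ y))
    return beta_2sls, beta_gmm


# Simulated panel (hypothetical): 200 movies x 10 weeks, one endogenous regressor.
rng = np.random.default_rng(2)
n_movies, n_weeks = 200, 10
n = n_movies * n_weeks
movie_ids = np.repeat(np.arange(n_movies), n_weeks)
z1, z2, v = rng.normal(size=(3, n))
movie_shock = np.repeat(rng.normal(size=n_movies), n_weeks)  # induces serial correlation
u = 0.8 * v + 0.5 * movie_shock + rng.normal(size=n)
x = z1 + 0.5 * z2 + v   # correlated with u through v
y = 1.0 + 2.0 * x + u   # true coefficients: intercept 1, slope 2

X = np.column_stack([np.ones(n), x])
Z = np.column_stack([np.ones(n), z1, z2])
beta_2sls, beta_gmm = two_step_gmm(y, X, Z, movie_ids)
print("2SLS:", beta_2sls.round(3))
print("GMM: ", beta_gmm.round(3))
```

Summing the moment contributions within each movie before taking the outer product is what makes the weighting matrix robust to serial correlation; summing row by row instead would give the heteroskedasticity-only version.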

A.4 Estimation of the Opening-Week Model

The opening-week model is a cross-section data model. The GMM estimator for this model is

similar to (A-1) and (A-2) except that the cross-section heteroskedasticity, instead of the

arbitrary serial correlation, is considered to construct the weighting matrix W.
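For reference, a standard heteroskedasticity-robust weighting matrix for a cross-section model is sketched below. The paper does not print this expression, so the notation is an assumption that mirrors (A-1): $z_i$ is the row vector of instruments and $\hat{u}_i$ the scalar 2SLS residual for movie $i$. It coincides with (A-1) when every movie contributes a single observation.

```latex
\hat{W} = \left( N^{-1} \sum_{i=1}^{N} \hat{u}_i^{2} \, z_i' z_i \right)^{-1}
```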

Web Appendix: Constructing Cross-Sectionally Comparable Search Volume Measures

from the Google Search Index

Google Trends provides weekly search indices of keyword queries entered into the Google

search engine. Because the index is normalized to conceal the actual search volume of the

keyword, researchers cannot compare the search volumes across different keywords if the raw

search index is used as provided by Google Trends. In this section, we introduce a methodology

to transform the weekly search indices from Google into cross-sectionally comparable search

volume metrics. The cross-sectionally comparable search volume metric introduced in the data

section is obtained by applying the following methodology to the weekly Google search index of

the focal movies.

The method consists of three steps. The first is the keyword selection step, where basis

keywords and movie keywords are selected. Any set of words can be selected for the basis

keywords. The only requirement is that the search volume is neither too high nor too low when

compared with the search volume of the focal movies. For our analysis, we select the following

eight basis keywords: “mac os,” lamp, hello, windows, weather, tomatoes, video, and imdb.

They are listed in ascending order of search volume in the U.S. movie industry. That is, among the eight

keywords, “mac os” is the least searched keyword and “imdb” is the most searched keyword in

the U.S. movie industry. Then, for each movie, we select a set of keywords that are considered to

be used most by consumers to search for the movie. For example, for the movie 12 Rounds, we

choose “12 Rounds” as the keyword for the movie. For the movie Paul Blart: Mall Cop, we

choose “blart + mall cop,” which means either blart or “mall cop” (see footnote 6).

Footnote 6: The selection of movie keywords is guided by the “Related terms” section of Google Trends. The chosen keywords

for each movie are available upon request.

The second step is the keyword matching step. To each movie, we assign an appropriate

basis keyword and collect the Google search index of the movie keyword along with that of the

assigned basis keyword to the movie. Any basis keyword can be assigned to any movie as long

as the search index of the movie keyword is comparable to that of the chosen basis keyword for

the movie. That is, if the search volume of a certain basis keyword is too large compared to the

search volume of a movie keyword, that basis keyword should not be used for that movie

because the movie’s search index so collected will be shrunk to zero for many or all of the

weeks. Google Trends provides diverse filters to minimize the measurement error in collecting

intended search indices. We limit our search so that the search volume is measured only from the

U.S. movie industry.

The last step is the transformation step. We transform each movie’s search index into our

cross-sectionally comparable search volume measure. The mathematics behind this step can be

explained as follows. Let $k_j$ be the basis keyword at the j’th position (i.e., $k_1$ = “mac os”,
$k_2$ = lamp, …, $k_8$ = imdb), and let $I_t^{k_j}$ represent the search index of the j’th basis keyword
at week t. We calculate the ratio of the Google search index of two adjacent basis keywords,
$r_t^{j,j-1} = I_t^{k_j} / I_t^{k_{j-1}}$, for each t and for all seven pairs of adjacent basis keywords.
Let $I_t^{m}$ be the search index of movie m at week t. Suppose that, in the second step, the basis
keyword of position j was assigned to movie m. Then, for movie m at week t, our cross-sectionally
comparable search volume measure, denoted by $S_t^{m}$, is calculated as in (WA-1).

(WA-1) $S_t^{m} = I_t^{m} \times \left( r_t^{j,j-1} \times \cdots \times r_t^{2,1} \times r_t^{1,0} \right)$,

where $r_t^{1,0}$ is the weekly search index of the basis keyword “mac os” collected together with the
keyword “lamp”. For example, if movie m is compared with the basis keyword of the eighth position
(i.e., “imdb”), then $S_t^{m} = I_t^{m} \times \left( r_t^{8,7} \times \cdots \times r_t^{2,1} \times r_t^{1,0} \right)$ for
that movie. If movie m is compared with the basis keyword of the first position (i.e., “mac os”),
then $S_t^{m} = I_t^{m} \times r_t^{1,0}$ for movie m.

Figure WA.1(a) shows the weekly multiplier associated with each basis keyword, i.e.,

$\left( r_t^{j,j-1} \times \cdots \times r_t^{2,1} \times r_t^{1,0} \right)$ if the keyword is at the j’th position. For movies Zombieland and X-Men

Origins: Wolverine, Figure WA.1(b) exemplifies the raw search indices of Google Trends and

their transformed cross-sectionally comparable search volume measures from 60 weeks before

the movies’ releases to 10 weeks after their releases. Note that our transformed search volume

measures show a substantial difference in consumer search activities between the two movies.
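The transformation in (WA-1) amounts to multiplying a movie's raw index by a chain of adjacent-keyword ratios. The sketch below shows that computation; the function name `comparable_search_volume`, its argument layout, and the toy numbers are hypothetical and do not come from Google Trends.

```python
import numpy as np


def comparable_search_volume(movie_index, assigned_position, adjacent_ratios, r_1_0):
    """Apply (WA-1): S_t^m = I_t^m * (r_t^{j,j-1} * ... * r_t^{2,1} * r_t^{1,0}).

    movie_index: (T,) raw Google index of the movie, I_t^m.
    assigned_position: j, the 1-based position of the basis keyword
        assigned to the movie.
    adjacent_ratios: dict mapping k -> (T,) array of r_t^{k,k-1}, k = 2..j.
    r_1_0: (T,) array of r_t^{1,0}.
    """
    multiplier = np.asarray(r_1_0, dtype=float).copy()
    for k in range(2, assigned_position + 1):
        multiplier *= np.asarray(adjacent_ratios[k], dtype=float)
    return np.asarray(movie_index, dtype=float) * multiplier


# Toy numbers for two weeks and three basis keywords (not real data).
r_1_0 = [2.0, 2.0]
adjacent_ratios = {2: [3.0, 4.0], 3: [5.0, 1.0]}
movie_index = [10.0, 20.0]

s_pos3 = comparable_search_volume(movie_index, 3, adjacent_ratios, r_1_0)
s_pos1 = comparable_search_volume(movie_index, 1, adjacent_ratios, r_1_0)
print(s_pos3, s_pos1)
```

A movie matched to a higher-position basis keyword accumulates more ratio terms, which is what places movies with very different popularity on a common scale.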

== Figure WA.1 about here ==

References

Baltagi, B. H. 1995. Econometric Analysis of Panel Data. John Wiley & Sons. New York, NY.

Basuroy, S., K. K. Desai, D. Talukdar. 2006. An empirical investigation of signaling in the

motion picture industry. Journal of Marketing Research 43 (May) 287-295.

Bruce, N. I., N. Z. Foutz, C. Kolsarici. 2012. Dynamic effectiveness of advertising and word of

mouth in sequential distribution of new products. Journal of Marketing Research 49

(August) 469-486.

Chevalier, J., D. Mayzlin. 2006. The effect of word of mouth on sales: online book reviews.

Journal of Marketing Research 43 (3) 345-354.

Chintagunta, P. K., S. Gopinath, S. Venkataraman. 2010. The effects of online user reviews on

movie box office performance: accounting for sequential rollout and aggregation across local

markets. Marketing Science 29 (5) 944-957.

Dhar, V., E. A. Chang. 2009. Does chatter matter? The impact of user-generated content on

music sales. Journal of Interactive Marketing 23 (4) 300-307.

Dellarocas, C. 2006. Strategic manipulation of internet opinion forums: implications for

consumers and firms. Management Science 52 (10) 1577-1593.

Duan, W., B. Gu, A. B. Whinston. 2008a. The dynamics of online word-of-mouth and product

sales – an empirical investigation of the movie industry. Journal of Retailing 84 (2) 233-242.

______, ______, ______. 2008b. Do online reviews matter? – an empirical investigation of panel

data. Decision Support Systems 45 (4) 1007-1016.

Elberse, A., B. Anand. 2007. The effectiveness of pre-release advertising for motion pictures: an

empirical investigation using a simulated market. Information Economics and Policy 19 319-

343.

______, J. Eliashberg. 2003. Demand and supply dynamics for sequentially released products in

international markets: the case of motion pictures. Marketing Science 22 (3) 329-354.

Enders, W. 2004. Applied Econometric Time Series. John Wiley & Sons. Hoboken, NJ.

Godes, D., D. Mayzlin. 2009. Firm-created word-of-mouth communication: evidence from a

field test. Marketing Science 28 (4) 721-739.

Gopinath, S., P. K. Chintagunta, S. Venkataraman. 2013. Blogs and local-market movie box-

office performance. Forthcoming in Management Science.

Hanssens, D. M. 2009. Empirical Generalizations about Marketing Impact. Marketing Science

Institute. Cambridge, MA.

Hennig-Thurau, T., M. B. Houston, S. Sridhar. 2006. Can good marketing carry a bad product?

Evidence from the motion picture industry. Marketing Letters 17 (3) 205-219.

Holtz-Eakin, D. 1988. Testing for individual effects in autoregressive models. Journal of

Econometrics 39 297-307.

Joo, M., K. C. Wilbur, Y. Zhu. 2012. Television advertising and online search. SSRN Working

Paper.

Kulkarni, G., P. K. Kannan, W. Moe. 2012. Using online search data to forecast new product

sales. Decision Support Systems 52 (3) 604-611.

Liu, Y. 2006. Word of mouth for movies: its dynamics and impact on box office revenue.

Journal of Marketing 70 (July) 74-89.

Mayzlin, D. 2006. Promotional chat on the internet. Marketing Science 25 (2) 155-163.

Onishi, H., P. Manchanda. 2012. Marketing activity, blogging, and sales. International Journal

of Research in Marketing 29 (3) 221-234.

Pauwels, K., D. M. Hanssens, S. Siddarth. 2002. The long-term effects of price promotions on

category incidence, brand choice, and purchase quantity. Journal of Marketing Research 39

(November) 421-439.

Rutz, O. J., R. E. Bucklin. 2011. From generic to branded: a model of spillover in paid search

advertising. Journal of Marketing Research 48 (February) 87-102.

The Pew Research Center. 2012. Pew Internet & American Life Project Tracking Surveys.

Available at http://pewinternet.org/Trend-Data-(Adults)/Online-Activites-Total.aspx.

Trusov, M., R. E. Bucklin, K. Pauwels. 2009. Effects of word-of-mouth versus traditional

marketing: findings from an internet social networking site. Journal of Marketing 73

(September) 90-102.

Villanueva, J., S. Yoo, D. M. Hanssens. 2008. The impact of marketing-induced versus

word-of-mouth customer acquisition on customer equity growth. Journal of Marketing Research 45

(February) 48-59.

Wooldridge, J. M. 2002. Econometric Analysis of Cross Section and Panel Data. The MIT Press.

Cambridge, MA.

Yang, S., M. Hu, R. S. Winer, H. Assael, X. Chen. 2013. An empirical study of word-of-mouth

generation and consumption. Forthcoming in Marketing Science.

Table 1: Comparison of Relevant Studies

Study Variables Examined (Traditional Media; Consumer-Generated Media; Search or Media Consumption; Market Outcome)

Chevalier and Mayzlin (2006)

Dhar and Chang (2009)

Duan et al. (2008a, b)

Liu (2006)

√ √

Chintagunta et al. (2010)

Gopinath et al. (2011)

Onishi and Manchanda (2012)

Trusov et al. (2009)

Villanueva et al. (2008)

√ √ √

Joo et al. (2012) √ √

Kulkarni et al. (2012) √ √ √

Yang et al. (2013) √ √ √

This study √ √ √ √

Table 2: Variables and Data Sources

Category Variable Source of Data

Marketing activities Weekly advertising spending Nielsen

Weekly number of screens The Numbers

Focal endogenous variables Weekly blog postings Google blog search engine

Weekly search volume Google Trends

Weekly revenue The Numbers

Movie Characteristics Genre, MPAA rating, Sequel IMDb

Average critic rating [range: 1 – 100] Metacritic

Director power variables IMDb

Monthly Seasonality: January-April; May-August; September-October; November-December Einav (2007), Ho et al. (2009)

Holiday National holiday

Others Weekly Google search index of the keyword “opening movie” Google

Daily reach of the five popular blog sites Alexa.com

Table 3: Descriptive Statistics

(a) Pre-Launch Period (N = 153, t = -30 ~ -1)

Mean Median Std. Dev. Min Max

Weekly advertising spending ($ 000) 389.5 0 1,250.0 0 10,009

Weekly blog postings 10.2 2 40.0 0 1,279

Weekly search volume 13,087.8 2,188.0 46,616.2 0 1,366,626

(b) Opening Week (N = 153, t = 0)


Advertising spending ($ 000) 5,080.2 5,651.1 2,960.4 0.022 12,562

Blog postings 177.2 45 483.5 2 4,733

Search volume 107,624 49,774.6 194,005.4 651 1,531,028

Screens 7,254.0 8,262 4,915.4 9 21,625

Revenue ($ 000) 22,960.5 14,118.4 29,393.3 46.2 200,077.3

(c) Post-Launch Period (N = 153, t = 1 up to 10)


Weekly advertising spending ($ 000) 429.2 3.4 990.3 0 8,318.2

Weekly blog postings 66.9 10 621.9 0 18,543

Weekly search volume 41,710.8 13,500 106,197.4 370 2,009,781

Weekly screens 8,993.7 5,152 8,546.5 14 30,479

Weekly revenue ($ 000) 5,189.5 928.8 11,273.8 0 139,403.7

(d) Movie Characteristics (N = 153) and Other Instruments


Critic Rating [0 ~ 100] 57.5 58.4 15.5 14.3 92.7

Production budget ($ 000) 59,041.1 40,000 54,694.9 11 250,000.0

Past B-O revenue of the focal director $873 M $451 M $1,090 M $0 M $6,520 M

Average director rating from the past 6.73 6.78 0.66 4.81 8.71

S. D. of director ratings from the past 1.98 1.96 0.27 1.46 3.45

Genre (%) Action: 21.6, Comedy: 28.1, Drama: 19.4

MPAA (%) G: 2.6, PG: 20.7, PG13: 43.0, R: 33.7

Sequel 15 movies (9.8%) are sequels.

Monthly Seasonality (%) Jan – Apr: 24.3, May – Aug: 31.1, Sep – Oct: 17.6

Weekly search index of keyword “opening movie” [0 ~ 100] 43.6 40.0 14.2 19.0 100.0

Weekly traffic to five blog sites (000) 572.1 547.6 90.5 441.6 914.1

Table 4: Serial Correlation Coefficients of the Main Variables

(a) The Variables in the Pre-Launch Period Model

Δ(Advertising) Δ(Search) Δ(Blog)

Serial Correlation Coefficient -0.181 -0.234 -0.235

Δ represents first-differencing.

(b) The Variables in the Post-Launch Period Model

Advertising Blogs Search Revenue

Serial Correlation Coefficient 0.707 0.505 0.903 0.895
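The coefficients in Table 4 can be read as lag-1 serial correlations. The following is a minimal sketch of that calculation for a single series, assuming a plain Pearson correlation between a series and its one-week lag; the paper does not state how the coefficients were pooled across the 153 movies, and the series `log_ad` and both function names are hypothetical, chosen only for illustration.

```python
import numpy as np


def lag1_serial_correlation(series):
    """Pearson correlation between x_t and x_{t-1} for one series."""
    x = np.asarray(series, dtype=float)
    return float(np.corrcoef(x[1:], x[:-1])[0, 1])


def first_difference(series):
    """Return x_t - x_{t-1}, the transformation written as Δ in Table 4(a)."""
    return np.diff(np.asarray(series, dtype=float))


# Hypothetical weekly log-advertising values for one movie (not from the paper).
log_ad = [0.0, 0.4, 0.9, 1.1, 1.8, 2.0, 2.9, 3.1]

rho_level = lag1_serial_correlation(log_ad)                   # trending series
rho_diff = lag1_serial_correlation(first_difference(log_ad))  # after differencing

print(f"level: {rho_level:.3f}, first difference: {rho_diff:.3f}")
```

In this made-up series the level is strongly positively autocorrelated while the first difference is negatively autocorrelated, the same qualitative contrast as between panels (b) and (a) of Table 4.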

Table 5: Estimation Results: Post-Launch Period (N=153, T=10)

(a) Blog equation (DV: log of weekly blog volume)

OLS GMM

Variable Coef. SE P-val. Coef. SE P-val.

Advertising 0.102 0.037 0.006 *** 0.086 0.041 0.037 **

Searching 0.344 0.076 0.000 *** 0.460 0.164 0.005 ***

Revenue 0.092 0.069 0.181 0.240 0.133 0.071 *

Holiday -0.108 0.093 0.242 -0.161 0.079 0.040 **

Traffic to the five blog sites 1.867 0.568 0.001 *** 2.301 0.556 0.000 ***

R2 0.278 N.A.

Adj. R2 0.269 N.A.

SSR 1.358 1.393

*** P-val < 0.01, ** P-val < 0.05, * P-val < 0.1. Note: The intercept and fixed effects are not reported.

(b) Search equation (DV: log of weekly search volume)

OLS GMM


Advertising 0.062 0.034 0.073 * 0.104 0.029 0.000 ***

Blogging 0.141 0.045 0.002 *** 0.348 0.092 0.000 ***

Revenue 0.426 0.062 0.000 *** 0.369 0.097 0.000 ***

Holiday 0.109 0.063 0.085 * 0.050 0.051 0.322

Search index of keyword “opening movie” 0.318 0.180 0.077 * 0.445 0.183 0.015 **

R2 0.510 N.A.

Adj. R2 0.504 N.A.


(c) Revenue equation (DV: log of weekly revenue)

OLS GMM


Advertising 0.196 0.014 0.000 *** 0.225 0.021 0.000 ***

Blogging 0.052 0.025 0.041 ** 0.045 0.045 0.321

Searching 0.221 0.036 0.000 *** 0.234 0.061 0.000 ***

Holiday 0.107 0.038 0.004 *** 0.101 0.034 0.003 ***

Screens 0.769 0.033 0.000 *** 0.799 0.057 0.000 ***

R2 0.918 N.A.

Adj. R2 0.917 N.A.
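One way to read the GMM estimates in Table 5 is to compare each driver's direct elasticity on revenue with its indirect elasticity through search. The sketch below assumes a first-order path-product approximation (driver-to-search elasticity times search-to-revenue elasticity) that ignores the feedback from revenue to search and blogging, and it treats the insignificant direct blog coefficient as zero; this is an illustration, not necessarily the calculation used in the paper. All variable names are hypothetical.

```python
# GMM elasticities copied from Table 5 (post-launch period).
AD_TO_REVENUE = 0.225      # Table 5(c), Advertising
SEARCH_TO_REVENUE = 0.234  # Table 5(c), Searching
AD_TO_SEARCH = 0.104       # Table 5(b), Advertising
BLOG_TO_SEARCH = 0.348     # Table 5(b), Blogging
BLOG_TO_REVENUE = 0.0      # Assumption: Table 5(c) estimate 0.045 has p = 0.321


def via_search(driver_to_search, search_to_revenue=SEARCH_TO_REVENUE):
    """First-order indirect elasticity of revenue through search."""
    return driver_to_search * search_to_revenue


ad_indirect = via_search(AD_TO_SEARCH)
blog_indirect = via_search(BLOG_TO_SEARCH)

ad_total = AD_TO_REVENUE + ad_indirect
blog_total = BLOG_TO_REVENUE + blog_indirect

ad_share_direct = AD_TO_REVENUE / ad_total  # share of ad effect not via search
ad_to_blog_ratio = ad_total / blog_total

print(f"advertising: direct {AD_TO_REVENUE:.3f}, via search {ad_indirect:.3f}")
print(f"blogging:    direct {BLOG_TO_REVENUE:.3f}, via search {blog_indirect:.3f}")
print(f"direct share of advertising effect: {ad_share_direct:.2f}")
print(f"advertising / blogging total effect: {ad_to_blog_ratio:.2f}")
```

Under these assumptions roughly nine-tenths of the advertising elasticity is direct, and the total advertising elasticity is about three times the blog elasticity, which is in line with the pattern summarized in the abstract.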


Table 6: Estimation Results: The Post-Launch Period Model Omitting Search Volume

(a) Blog equation (DV: log of weekly blog volume)

OLS GMM


Advertising 0.097 0.035 0.006 *** 0.233 0.029 0.000 ***

Searching N.A. N.A. N.A. N.A. N.A. N.A.

Revenue 0.281 0.063 0.000 *** 0.422 0.142 0.003 ***

Holiday -0.072 0.089 0.416 -0.152 0.089 0.090 *


R2 0.241 N.A.

Adj. R2 0.234 N.A.

SSR 1.388 1.466

*** P-val < 0.01, ** P-val < 0.05, * P-val < 0.1. Note: Fixed effects and intercepts are not reported.

(b) Revenue equation (DV: log of weekly revenue)

OLS GMM


Advertising 0.237 0.016 0.000 *** 0.273 0.021 0.000 ***

Blogging 0.093 0.026 0.001 *** 0.140 0.039 0.000 ***

Searching N.A. N.A. N.A. N.A. N.A. N.A.

Holiday 0.174 0.040 0.000 *** 0.157 0.038 0.000 ***

Screens 0.821 0.030 0.000 *** 0.925 0.065 0.000 ***

R2 0.902 N.A.

Adj. R2 0.901 N.A.


Table 7: Estimation Results: The Post-Launch Period Model Omitting Blog Volume

(a) Search equation (DV: log of weekly search volume)

OLS GMM


Advertising 0.068 0.035 0.055 * 0.160 0.030 0.000 ***

Blogging N.A. N.A. N.A. N.A. N.A. N.A.

Revenue 0.501 0.058 0.000 *** 0.512 0.089 0.000 ***

Holiday 0.102 0.060 0.090 * 0.001 0.065 0.985

Search index of keyword “opening movie” 0.309 0.172 0.073 * 0.340 0.166 0.041 **

R2 0.509 N.A.

Adj. R2 0.504 N.A.


(b) Revenue equation (DV: log of weekly revenue)

OLS GMM


Advertising 0.209 0.014 0.000 *** 0.230 0.022 0.000 ***

Blogging N.A. N.A. N.A. N.A. N.A. N.A.

Searching 0.214 0.034 0.000 *** 0.293 0.057 0.000 ***

Holiday 0.113 0.037 0.002 *** 0.071 0.037 0.054 *

Screens 0.757 0.031 0.000 *** 0.799 0.059 0.000 ***

R2 0.912 N.A.

Adj. R2 0.911 N.A.


Table 8: Estimation Results: Pre-Launch Period (N=153, T=30)

(a) Blog equation (DV: log of weekly blog volume)

OLS GMM


Ad, same week 0.020 0.009 0.031 ** 0.027 0.011 0.010 ***

one week ago 0.012 0.007 0.110 0.015 0.009 0.114

two weeks ago -0.004 0.007 0.607 -0.011 0.010 0.289

three weeks ago 0.007 0.009 0.430 -0.001 0.011 0.957

four weeks ago 0.015 0.010 0.132 0.012 0.009 0.214

Searching, same week 0.101 0.015 0.000 *** -0.042 0.109 0.701

one week ago -0.018 0.013 0.159 0.015 0.113 0.896

two weeks ago -0.019 0.011 0.089 * 0.072 0.097 0.458

three weeks ago -0.029 0.009 0.001 *** -0.008 0.136 0.951

four weeks ago -0.029 0.010 0.003 *** -0.118 0.099 0.234

Holiday 0.001 0.024 0.975 -0.002 0.024 0.937


R2 0.043 N.A.

Adj. R2 0.035 N.A.

SSR 0.754 0.786

Corr. coef. between actual and fitted values in level 0.750 0.730

*** P-val < 0.01, ** P-val < 0.05, * P-val <0.1. Note: Fixed effects are not reported.

(b) Search equation (DV: log of weekly search volume)

OLS GMM


Ad, same week 0.041 0.011 0.000 *** 0.053 0.014 0.000 ***

one week ago 0.036 0.010 0.001 *** 0.051 0.016 0.001 ***

two weeks ago -0.016 0.007 0.033 ** -0.007 0.014 0.627

three weeks ago -0.005 0.009 0.584 0.005 0.016 0.759

four weeks ago -0.006 0.007 0.389 -0.008 0.017 0.646

Blogging, same week 0.310 0.036 0.000 *** 0.020 0.320 0.950

one week ago 0.221 0.031 0.000 *** 0.235 0.258 0.363

two weeks ago 0.064 0.025 0.010 *** -0.157 0.404 0.698

three weeks ago 0.037 0.024 0.121 0.058 0.316 0.854

four weeks ago 0.001 0.023 0.969 0.137 0.242 0.573

Holiday 0.038 0.030 0.205 0.056 0.047 0.238

Search index of keyword “opening movie” 0.226 0.073 0.002 *** 0.237 0.085 0.005 ***

R2 0.064 N.A.

Adj. R2 0.057 N.A.

SSR 1.078 1.121

Corr. coef. between actual and fitted values in level 0.923 0.680

*** P-val < 0.01, ** P-val < 0.05, * P-val <0.1. Note: Fixed effects are not reported.

Table 9: Estimation Results: Opening-Week (N=153)

(a) OLS Estimation


Advertising 0.252 0.083 0.003 *** 0.163 0.075 0.032 **

Screens 0.698 0.037 0.000 *** 0.672 0.033 0.000 ***

Pre-launch blog volume 0.120 0.037 0.001 *** 0.021 0.035 0.543

Pre-launch search volume N.A. N.A. N.A. 0.218 0.037 0.000 ***

Holiday 0.326 0.148 0.029 ** 0.277 0.122 0.024 **

Critical Review 0.541 0.210 0.011 ** 0.507 0.181 0.006 ***

R2 0.904 0.928

Adj. R2 0.900 0.925

SSR 0.655 0.567

*** P-val < 0.01, ** P-val < 0.05, * P-val <0.1. Note: The intercept is not reported.

(b) GMM Estimation


Advertising 0.303 0.146 0.041 ** 0.272 0.144 0.062 *

Screens 0.727 0.073 0.000 *** 0.655 0.063 0.000 ***

Pre-launch blog volume 0.211 0.062 0.001 *** -0.026 0.075 0.728

Pre-launch search volume N.A. N.A. N.A. 0.354 0.059 0.000 ***

Holiday 0.369 0.148 0.014 ** 0.167 0.120 0.165

Critical Review 0.501 0.267 0.063 * 0.513 0.209 0.016 **

R2 N.A. N.A.

Adj. R2 N.A. N.A.

SSR 0.686 0.599

*** P-val < 0.01, ** P-val < 0.05, * P-val <0.1. Note: The intercept is not reported.

Figure 1: Weekly Trends of Advertising, Blog Volume, Search Volume and Box Office Revenue

(a) Average Weekly Advertising Spending and Box Office Revenue (N=153)

(b) Average Weekly Blog Volume and Box Office Revenue (N=153)

(c) Average Weekly Search Volume and Box Office Revenue (N=153)

[Three dual-axis line charts. Horizontal axis: week, from -30 to 10. One vertical axis: B-O revenue ($ 000). Other vertical axis: (a) advertising ($ 000), (b) blogs, (c) search volume (000).]

Figure 2: Weekly Search Index of the Keyword “Opening Movie” and Weekly Traffic to the Five Blog Sites

[Dual-axis line chart. Horizontal axis: week, from 1/6/2008 to 12/6/2009. Vertical axes: search index of "opening movie"; reach of the blog sites (000).]

Figure 3: Comparison of Statistically Significant Relationships in the Three Versions of the Post-Launch Period Model

(a) The Full Model [path diagram with nodes Ad_t, Blog_t, Search_t, Revenue_t]

(b) The Model Omitting Search Activity [path diagram with nodes Ad_t, Blog_t, Revenue_t]

(c) The Model Omitting Blogging Activity [path diagram with nodes Ad_t, Search_t, Revenue_t]

Figure 4: Key Influencers on Revenue

Opening-week revenue

Key influencers

• Pre-launch search volume

• Opening-week advertising

Implications

• Monitoring pre-launch search activity can help managers better predict opening-week revenues of movies.

• The time series of a movie's pre-launch search volume can be used to develop an efficient pre-launch advertising schedule for the movie.

Weekly revenue after opening week

Key influencers

• Weekly advertising

• Weekly online search activity

Implications

• For the purpose of predicting weekly revenue, knowing weekly advertising and search volume is sufficient; weekly blog volume does not contribute to predictive performance.

• Advertising influences revenue without the help of consumer search; online WOM needs consumer search activity to influence movie revenue.

Figure WA.1 Constructing Cross-Sectionally Comparable Search Volume Measure

(a) Weekly Multiplier Associated With Basis Keywords

[Line chart. Horizontal axis: week. Vertical axis: weekly multiplier. Series (basis keywords): imdb, video, tomatoes, weather, windows, hello, lamp, mac os.]

(b) Raw Search Indexes and Transformed Search Volume Measures

[Two line charts of the raw Google search index (0 to 100) by week, from -60 to 10, one for Zombieland and one for X-Men Origins: Wolverine, and one line chart of the transformed measure by week for both movies.]
