The Hollywood Stock Exchange: Efficiency and The Power of Twitter
by
Nathaniel Harley
A special thanks to Professor Richard Walker for advising on this thesis. Also, thanks to Professor Joseph Ferrie, Sarah Ferrer, and the MMSS department.
Abstract
Online prediction markets are becoming increasingly popular and useful for forecasting real-world events. The Hollywood Stock Exchange is one of the most successful online prediction markets and forecasts real-world box-office returns. This thesis asks whether The Hollywood Stock Exchange is an efficient market and, if it is not, what factors can be used to predict future changes in MovieStock prices. Most importantly, it focuses on the usefulness of social media, specifically Twitter, in predicting future changes to these prices.
Introduction
A market is a place where individuals can exchange items. Prices
assign these items values so that buyers and sellers can easily trade them.
Embedded in these prices is a large amount of information that reflects the
collective opinion of informed and uninformed traders.
The two main types of markets are financial markets and prediction markets.
We are all familiar with financial markets, such as stock markets, bond markets,
futures markets, commodity markets, currency markets, and money markets.
Depending on the type of market, the price of an asset can represent different
meanings. On a stock market, such as the New York Stock Exchange, the price of
common stock represents how much an individual is willing to pay for one share of
a specific company. On a futures market, the price represents a forecast of what the
underlying asset will cost in the future. For prediction markets, the price of an asset
is used to indicate the likelihood of an event occurring.
Prediction markets are slowly becoming more popular and are being used as
an informational resource to predict events. Some prediction markets, such as
Intrade Prediction Market, forecast the likelihood of political events. Others, such as
The Hollywood Stock Exchange, trade prediction shares of movies, actors, and other
film-related options. As more and more prediction markets expand onto the
electronic platform, individuals have more access to trade on these markets.
The question driving my thesis is this: if The Hollywood Stock Exchange is not an
efficient market, what information can people use to predict future changes in stock
prices? Recently, a lot of work has gone into capturing social media data,
such as Twitter activity, and using it as a measurement for quantitative predictions.
Using Twitter data, along with other non-social-media variables, I test whether
MovieStock prices can be predicted by Twitter information.
Prediction Markets
There are many barriers that exist for establishing a new market, such as
high costs, government regulation, and the threat of lawsuits; however, artificial
online prediction markets do not have these barriers. Web market games are
increasingly easy to create because they have small operating costs for setup,
maintenance, advertising, searching, and transacting, and benefit from a global
group of Internet users. They do not need to get permission from government
officials and do not need to create strict rules that would limit trading because there
is little risk of lawsuits against them. Users can remain anonymous and record
keeping does not need to be as tight. As a result, online markets, such as The
Hollywood Stock Exchange, can exist and function effectively [1]. However, as Justin
Wolfers and Eric Zitzewitz illustrate in their paper Five Open Questions About
Prediction Markets, there are five open questions that must be answered in order for
prediction markets to fulfill their potential and ultimately succeed.
The first question Wolfers and Zitzewitz pose is “how to attract uninformed
traders?” (Wolfers and Zitzewitz, p. 2). Uninformed traders are important to any
marketplace because they create an uninformed order flow, which in turn attracts
informed, profit-motivated groups to trade. In order to attract these traders, it is
essential to have low transaction costs as well as interest or buzz surrounding a new
prediction market. The Hollywood Stock Exchange has successfully attracted both
uninformed and informed traders by creating an attractive, easy-to-use platform
that has positioned itself as the premier box-office forecasting market available.
Their second question is “how to tradeoff interest and contractibility?”
(Wolfers and Zitzewitz, p. 3). Wolfers and Zitzewitz conclude that it is important to
establish clear guidelines outlining contracts traded on prediction exchanges, but
that there is a lot of leeway in doing so. The Hollywood Stock Exchange does not face
some of the complexities that other prediction markets encounter. For example,
box-office revenue and the winner of the Oscar for best actor leave little room
for interpretation.
The third question they pose is “how to limit manipulation?” (Wolfers and
Zitzewitz, p. 3). This question addresses two types of manipulation: First,
manipulation of the outcomes on which the prediction markets are based and,
second, manipulation of the market prices themselves. Theoretically, traders could
buy a large number of tickets for a specific movie in order to increase
box-office revenue, but in practice they would never do so: it would have
little impact on national box-office revenue, and the cost of the tickets
greatly outweighs the potential profit, since the exchange uses Hollywood Dollars and
not actual money. In terms of price manipulation, buying a large amount of a specific
asset would have no effect on the real-world outcome, so it would be pointless.
The fourth question Wolfers and Zitzewitz ask is “are markets well calibrated
on small probabilities?” (Wolfers and Zitzewitz, p. 3). They conclude that declining
transaction costs and carefully framed contracts will produce more accurate
responses from traders who are inherently bad at distinguishing small probabilities
and overvalue unlikely events. Although traders on The Hollywood Stock Exchange
are subject to this bias, it is important to realize that this behavior exists not only in
prediction markets, but also in real-world markets.
The final question Wolfers and Zitzewitz pose is “how to separate correlation
from causation?” (Wolfers and Zitzewitz, p. 3). The assets traded on The Hollywood
Stock Exchange, such as MovieStocks, are directly correlated with real world events
and the outcome of one event, in the movie world, does not cause the probability of
another event to change [2]. Therefore, there is not the problem of determining
correlation versus causation.
By examining these five open questions proposed by Wolfers and Zitzewitz, it
is clear that The Hollywood Stock Exchange has all the elements to make it a
successful prediction market and function effectively. However, this does not
guarantee that The Hollywood Stock Exchange is an efficient market.
Efficient Markets
If markets are, in fact, efficient, the market price of an asset is the best estimate of
its value; however, if markets are not efficient, the market price may deviate from the
true value. That being said, market efficiency does not require that the market price
equal the true value at every point in time, only that any deviation from the
true value be random. There are three main types of efficient
markets: Weak Form, Semi-Strong Form, and Strong Form [3].
o Weak Form – Future changes in prices are not predictable based on
information contained in all past prices suggesting that analysis of
past prices alone would not be helpful in determining undervalued or
overvalued assets.
o Semi-Strong Form – Future changes in prices are not predictable
based on past prices or any currently available public information
(including prices, economic variables, etc.).
o Strong Form – Prices fully reflect all available information,
public and private. Informed experts cannot consistently
outperform uninformed traders.
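As a rough illustration of what a weak form check might look like, the sketch below computes the lag-1 autocorrelation of week-over-week log price changes. Under weak form efficiency this correlation should be close to zero; a value far from zero suggests that past price changes carry predictive information. The price series is hypothetical and the function names are illustrative assumptions, not part of the analysis in this thesis.

```python
import math

def log_changes(prices):
    """Week-over-week changes in log price (approximate percent changes)."""
    return [math.log(b) - math.log(a) for a, b in zip(prices, prices[1:])]

def lag1_autocorrelation(xs):
    """Sample correlation between each change and the change before it."""
    prev, curr = xs[:-1], xs[1:]
    mean_prev = sum(prev) / len(prev)
    mean_curr = sum(curr) / len(curr)
    cov = sum((p - mean_prev) * (c - mean_curr) for p, c in zip(prev, curr))
    var_prev = sum((p - mean_prev) ** 2 for p in prev)
    var_curr = sum((c - mean_curr) ** 2 for c in curr)
    return cov / math.sqrt(var_prev * var_curr)

# Hypothetical weekly closing prices in H$ (not data from this thesis).
prices = [40.0, 44.0, 41.0, 45.0, 42.0, 46.0]
changes = log_changes(prices)
rho = lag1_autocorrelation(changes)
print(f"lag-1 autocorrelation of log changes: {rho:.3f}")
```

The made-up series alternates up and down, so the autocorrelation comes out strongly negative; a real test would need many more observations and a significance test.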
The Hollywood Stock Exchange
The Hollywood Stock Exchange is an online artificial prediction market game.
Participants can buy and sell virtual shares of celebrities and movies with a
currency called the Hollywood Dollar (H$). New users can join for free, and when
they do, they receive 2 million H$. They can trade various assets such as
MovieStocks, StarBonds, TVStocks, MovieFunds, and Derivatives. The Hollywood
Stock Exchange then syndicates the data collected from the Exchange and sells it as
market research to entertainment companies, consumer product companies, and
financial institutions.
For the purpose of this thesis, I will focus on MovieStocks and whether it is
possible to predict future changes in MovieStock prices. MovieStocks represent
films both still in production and currently in theaters. The price of a MovieStock
reflects how much money traders think the film will make, with each $1 million
earned domestically equal to H$1. The price of a MovieStock is
adjusted to reflect its actual box-office earnings. The price begins to adjust
after the movie’s opening weekend in order to bring the expected box-office gross
revenue in line with the actual box-office gross revenue. For example, if Movie A
grosses $20 million in its first week in theaters, then the price after the first week
would be something like H$45 on the exchange. However, if Movie A grossed only
$3 million in the second week, then its price would most likely drop
drastically to something like H$28. On average, a film makes 2.7 times its opening
weekend box-office during its first four weeks of wide release. A MovieStock delists
and cashes out from the market on the first business day after its fourth weekend of
wide release or 12 weeks of limited release. The driving question behind this thesis
is whether or not The Hollywood Stock Exchange is an efficient market, specifically
with respect to MovieStocks, and if it is not, what factors can be used to predict future
changes in MovieStock prices.
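The pricing arithmetic described above can be sketched in a few lines. This is a simplified illustration only: it assumes the post-opening adjustment is exactly 2.7 times the opening weekend gross, and that the cash-out price equals the cumulative four-week gross at H$1 per $1 million. The exchange's actual adjustment rules may differ, and the film figures are hypothetical.

```python
def halt_adjusted_price(opening_weekend_millions, multiplier=2.7):
    """Price (H$) implied by the opening weekend under the 2.7x rule of thumb."""
    return opening_weekend_millions * multiplier

def delist_price(weekly_gross_millions):
    """Assumed cash-out price (H$): gross over the first four weeks, H$1 per $1 million."""
    return sum(weekly_gross_millions[:4])

# Hypothetical film (not from the thesis data set).
opening_weekend = 10.0                  # $ millions
weekly_gross = [14.0, 7.0, 4.0, 2.0]    # $ millions, weeks 1 to 4

print(halt_adjusted_price(opening_weekend))  # about H$27
print(delist_price(weekly_gross))            # H$27.0
```

In this made-up case the film's four-week total happens to match the 2.7x projection; a film that fades faster or holds longer would cash out below or above the adjusted price.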
Social media is quickly changing the social landscape because it is easy to use
and reaches a global audience extremely quickly. As a result, social media is setting
trends in topics that range across the board from politics to technology to the
entertainment industry. Social media can be a very powerful tool and the question
becomes whether it is possible to aggregate social media and use it as a
measurement to gauge collective opinion.
We all know Twitter. Twitter is essentially a real-time information network that
connects you to the latest stories, ideas, opinions, and news. Tweets are limited to 140
characters, but they can be very powerful. Twitter uses the # symbol, called a "hash-tag",
to mark keywords or topics in a Tweet. Interestingly enough, Twitter users created it
organically as a way to categorize messages. People place the hash-tag symbol before
relevant keywords or phrases in their Tweets to categorize them and help
them show up more easily in Twitter Search. Hash-tagged words that become very
popular are often referred to as “trending topics.” As a result, Twitter has a lot of
power because it can identify important topics and also the sentiment surrounding
those topics. For example, if a keyword is being used a lot, we can conclude
that many people find it important. Looking further, analyzing the
individual tweets can help us identify whether people feel positively or negatively
about it. Using this, we can create a measurement of collective opinion and use it to make
quantitative predictions.
Many people have already started to use Twitter to build forecasting models.
Specifically, they look at the sentiment of Tweets and use it to gauge collective
opinion. For example, Duncan Watts, an Internet researcher who heads one of
Yahoo!’s research labs in New York, uses Twitter to forecast video-game and music
sales. He found that adding Twitter data greatly increased the accuracy of his
forecasting model. Similarly, Derwent Capital Markets, a hedge fund based in
London, uses a Twitter model to help guide its investments [4].
Related Studies
When looking at previous studies, I came across a few that were very
influential in shaping how I tested The Hollywood Stock Exchange for efficiency. In
The Power of Play: Efficiency and Forecast Accuracy in Web Market Games, David
M. Pennock et al. analyzed the efficiency and forecast accuracy of two market
games: The Hollywood Stock Exchange and the Foresight Exchange. For the purpose
of this thesis, I will focus on their results regarding The Hollywood Stock Exchange.
Their paper asked whether efficiency breaks
down in artificial markets when there is no monetary incentive. The goal of their
research was to test whether The Hollywood Stock Exchange satisfies two types of
market efficiency: internal coherence and strong form. They presented evidence
that some market simulations can act sufficiently well as both aggregators and
disseminators of information. In conclusion, they found that The Hollywood Stock
Exchange MovieStock prices were good indicators of which movies would do well at the
box office.
First, it is important to understand what internal coherence is. Prices are
internally coherent when they are self-consistent, or arbitrage-free: no trader
can make a sure profit without any risk. In efficient markets, arbitrage should not
exist. For example, arbitrage exists when you can buy a security on one exchange,
such as The New York Stock Exchange, for a certain price and then sell the same
security on the Tokyo Exchange for a higher price. The security should have the
same price on both exchanges. Another example involves a pair of complementary
securities. Take for instance a security that pays $1 if and only if it rains
tomorrow. If another security existed that pays $1 if and only if it does not rain
tomorrow, then owning both securities would guarantee a $1 payoff. In order for
there to be no arbitrage opportunity, the price to buy both securities should be
at least $1 and the price to sell both securities should be at most $1.
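The rain example can be turned into a small check. The sketch below assumes each security has an ask price (what it costs to buy) and a bid price (what one receives for selling), and reports the riskless profit per pair if one exists. The function name and the quotes are hypothetical.

```python
def arbitrage_profit(ask_rain, ask_no_rain, bid_rain, bid_no_rain, payoff=1.0):
    """Return the sure profit per pair of securities, or 0.0 if there is none."""
    # Buy both securities: exactly one pays out, so the pair is worth `payoff`.
    buy_both = payoff - (ask_rain + ask_no_rain)
    # Sell both securities: exactly one must be paid out, costing `payoff`.
    sell_both = (bid_rain + bid_no_rain) - payoff
    return max(buy_both, sell_both, 0.0)

# Coherent quotes: buying both costs $1.05, selling both yields $0.95.
print(arbitrage_profit(0.55, 0.50, 0.50, 0.45))   # 0.0

# Incoherent quotes: buying both costs only $0.90 for a guaranteed $1.
print(arbitrage_profit(0.40, 0.50, 0.35, 0.45))   # roughly 0.10
```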
One of the driving questions behind their study was: “do HSX players have
utility for Hollywood dollars and, if so, are their resulting incentives strong enough
to maintain internal price consistency in the game?” (Pennock, p. 7). In order to
determine the degree of internal coherence in MovieStocks, they tested how closely
MovieStock and option prices conformed to put-call parity. In conducting their
experiment, they used weekend halt prices (the price before the MovieStock adjusts to
approximately 2.7 times the opening weekend box-office proceeds) for 75
MovieStocks and their corresponding options during the period of March 3, 2000 to
September 1, 2000. They found that the relationship between the stock estimates of
weekend box-office returns and the option estimates adhered relatively closely to
put-call parity at the halt price. They then tested whether prices
adhered to put-call parity at all times, not just at the halt price. Their results
indicated that The Hollywood Stock Exchange market was not completely free of
arbitrage because prices diverged at times from parity by as much as H$6.5. Even so,
they concluded that the market showed signs of internal coherence
because when prices were too high, they were much more likely to
correct and go down on subsequent days. Similarly, when they were too low, they
were more likely to increase. They hypothesized that this price self-correction could
be attributed to traders taking advantage of arbitrage opportunities.
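For readers unfamiliar with put-call parity, a minimal sketch follows. It assumes the textbook relationship with no discounting (call minus put equals stock minus strike), which seems reasonable for a play-money market but is an assumption here rather than the exact specification used by Pennock et al. All prices are hypothetical.

```python
def parity_gap(stock, call, put, strike):
    """Deviation (H$) of the stock price from its parity-implied value.

    Assumes zero discounting, so parity reads: stock = call - put + strike.
    A gap of zero means the three prices are mutually consistent.
    """
    return stock - (call - put + strike)

# Hypothetical quotes in H$.
print(parity_gap(stock=50.0, call=7.0, put=2.0, strike=45.0))   # 0.0, coherent
print(parity_gap(stock=56.5, call=7.0, put=2.0, strike=45.0))   # 6.5, stock looks too high
```

A positive gap means the stock is expensive relative to the options, so a trader could sell the stock and buy the parity-equivalent option position; this kind of trading is what would push prices back toward parity.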
Pennock et al. also wanted to test the forecast accuracy of The Hollywood Stock
Exchange and whether MovieStocks were good predictors of box-office returns. In
order to understand their process and results, it is important to understand Rational
Expectations Theory: prices are not only coherent, but also reflect the sum total of all
information available to all market participants. Essentially, Rational
Expectations Theory states that even when some individuals have insider
information, prices equilibrate as if everyone has access to all the same information.
They wanted to go further and test whether strong form efficiency holds in The
Hollywood Stock Exchange because internal coherence is only a minimal standard of
market efficiency, whereas stronger forms of efficiency imply “market competence
as well and coherence: prices actually reflect an aggregation of information
distributed among the participants, and market forecasts are as accurate as expert
assessments” (Pennock, p. 11). Ultimately, they proposed that if The Hollywood
Stock Exchange is strong form efficient, then it would offer
a societal benefit in the form of “cheap and reliable forecasts”
(Pennock, p. 11). In order to test strong form efficiency, they assessed the forecast
accuracy of Hollywood Stock Exchange MovieStocks. Pennock et al. quantified and
compared MovieStock prices (The Hollywood Stock Exchange prediction) to
Brandon Gray’s published forecasts at Box-office Mojo for 50 movies appearing in
both sources. Their results showed that there was a slight bias to underprice the
best performing movies and overprice the worst performing movies. They
attributed this bias to risk-seeking behavior in which traders
preferred potential “sleepers” with the opportunity for a very large payoff, rather
than known quantities with a moderate payoff. They also found a correlation
between MovieStock estimates and Box-office Mojo estimates, which they
hypothesized resulted from the possibility that Box-office Mojo observes Hollywood
Stock Exchange prices, and/or some Hollywood Stock Exchange traders read
Box-office Mojo forecasts.
Ultimately, they concluded that The Hollywood Stock Exchange showed signs
of efficiency, in the form of price coherence and forecast accuracy. They deduced
that The Hollywood Stock Exchange is a good forecaster of box-office returns and
provides a reasonable likelihood assessment of uncertain events (the final
four-week box-office returns). The implications that Pennock et al. derived from their
study were that existing artificial markets, like The Hollywood Stock Exchange,
could be a valuable resource for information. Also, The Hollywood Stock Exchange
provides a good example of a successful artificial market and should promote the
creation of similar markets in the future [1].
In Predicting the Future with Social Media, Sitaram Asur and Bernardo A. Huberman
demonstrated how social media content could be used to predict
real-world outcomes. In particular, they used the chatter from Twitter.com to
forecast box-office revenues for movies. They showed how a simple model built
from the rate at which tweets were created about particular topics could
outperform market-based predictors. Furthermore, they demonstrated how
sentiments extracted from Twitter could be further utilized to improve the
forecasting power of social media.
Social media has the ability to aggregate opinions and act as a form of
“collective wisdom” that can be used to make “quantitative predictions that
outperform those of artificial markets” (Asur & Huberman, p. 1). Their goal was to
assess how buzz and attention were created for different movies and how they changed
over time. They also focused on the mechanism of viral marketing and pre-release
hype on Twitter, and the role that attention played in forecasting real-world
box-office performance. Finally, they examined how sentiments were created and how
positive and negative opinions influenced people.
Their hypothesis was that “movies that are well talked about will be well-watched”
(Asur & Huberman, p. 1).
Asur and Huberman wanted to look at how attention and popularity were
generated for movies on Twitter, and what effects this had on box-office
performance for the movies considered. Their results indicated that movies that had
greater publicity, in terms of linked URLs on Twitter, did not necessarily perform
better at the box office. Their initial analysis of tweet rates (defined as tweets for a
movie per hour) showed a positive correlation with box-office revenue. When they
compared their results to
The Hollywood Stock Exchange index, they found that their model outperformed
The Hollywood Stock Exchange based model in predicting actual box-office returns.
They then tested whether they could predict the price of The Hollywood
Stock Exchange MovieStock at the end of the opening weekend for the movies they
considered. In order to do so, they used historical Hollywood Stock Exchange prices
as well as the tweet rates for the week prior to the release as predictive variables.
They created a simple time series regression of tweet rates, over the 7 days before the
weekend, to predict the box-office revenue for a particular weekend. Again, they
found that their tweet-rate model was better at predicting the actual values than the
historical Hollywood Stock Exchange prices. Their results showed how Twitter could
be used as an accurate indicator of future outcomes.
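To make the idea of a tweet-rate regression concrete, here is a deliberately simplified sketch: a one-variable least squares fit of weekend revenue on average pre-release tweet rate. Asur and Huberman's model used a time series of daily tweet rates rather than a single average, so this is a stand-in for illustration, and every number below is hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def r_squared(xs, ys, intercept, slope):
    """Share of the variance in y explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# Hypothetical data: average tweets per hour in the week before release,
# and opening weekend revenue in $ millions.
tweet_rate = [100.0, 250.0, 400.0, 800.0]
revenue = [5.0, 12.0, 20.0, 41.0]

intercept, slope = fit_line(tweet_rate, revenue)
fit = r_squared(tweet_rate, revenue, intercept, slope)
print(f"revenue = {intercept:.2f} + {slope:.4f} * tweet_rate, R^2 = {fit:.3f}")
```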
Asur and Huberman also wanted to see if the sentiment of Tweets could
increase forecasting accuracy. They classified tweets as Positive, Negative, or
Neutral. Their results indicated that tweets after the release had more value than
tweets before, which matched their expectation that people would give more
weight to tweets posted after people had seen the movie. They also found that there were
more positive sentiments than negative for almost all of the movies. They
concluded that adding Twitter sentiment to the equation did not significantly
increase the predictive power of the tweet rates themselves.
In conclusion, they found that social media feeds could be an effective
indicator of real-world performance. Specifically, the rate at which movie Tweets
were generated could be used to build a powerful model for predicting movie
box-office revenue. They showed how their predictions were more accurate than The
Hollywood Stock Exchange prices. Finally, the sentiment of tweets could improve
box-office revenue predictions based on tweet rates, but only after the movies were
released [5].
In Twitter Mood Predicts the Stock Market, Johan Bollen et al. looked
at the question of whether societies can experience mood states that affect their
collective decision-making and, by extension, whether the public mood is
correlated with or even predictive of economic indicators. They investigated whether
measurements of collective mood states obtained from Twitter feeds were
correlated with the value of the Dow Jones Industrial Average over time. In their study
they analyzed the text in Tweets using two mood-tracking tools, OpinionFinder
(measures positive vs. negative mood) and Google-Profile of Mood States (measures
mood in terms of 6 dimensions: Calm, Alert, Sure, Vital, Kind, and Happy).
Their results showed that changes in the public mood state could be tracked
from the content of large-scale Twitter feeds by text processing techniques and that
“such changes respond to a variety of socio-cultural drivers in a highly differentiated
manner” (Bollen, p. 7). Also, they found that the inclusion of some public mood
dimensions, but not others, could significantly improve the accuracy of Dow Jones
Industrial Average predictions. They found that the calmness of the public was
predictive of the Dow Jones Industrial Average, whereas general levels of positive
sentiment as measured by OpinionFinder were not [6].
These three studies helped shape how I designed my own study of
MovieStock prices in relation to Twitter.
Data Summary
Before collecting my movie data, I needed to establish consistent criteria so
that all movies in the data set would share similar properties. First, I collected data
for the top 2-3 grossing movies that opened in wide release (so that every movie in my
data set would delist after four weeks) for each week over the time period of
September 2011 to December 2011. This gave me a data set of 26 movies. For every
movie, I collected the release date MovieStock price and then the price at the end of
each week, up until the delist date. This gave me five data points for each movie.
Ultimately, I only used the end-of-week stock prices for my regression analysis and
dropped the release date stock price [12].
In order to capture Twitter data, I used a program called Hootsuite. First, I
tracked how many times a movie title was mentioned as a keyword on Twitter over
the four-week period. The keyword analysis performed by Hootsuite gave me daily
Twitter hits for each keyword. In order to make my Twitter data line up with the
MovieStock price data, I summed the daily Twitter hits into
weekly Twitter hit totals. This gave me four data points for each
movie: Week 1 Twitter total, Week 2 Twitter total, Week 3 Twitter total, and Week 4
Twitter total [13].
[Figure: # Weekly Twitter Hits (Thousands), by week (weeks 1 to 4)]
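The weekly aggregation step can be sketched as follows. The sketch assumes the daily hit counts for one movie are available as a plain list of 28 numbers, in date order starting on the release date; the actual Hootsuite export format is not described here, so the input shape is an assumption and the counts are hypothetical.

```python
def weekly_totals(daily_hits, days_per_week=7, weeks=4):
    """Sum consecutive blocks of daily hits into weekly totals."""
    return [
        sum(daily_hits[w * days_per_week:(w + 1) * days_per_week])
        for w in range(weeks)
    ]

# Hypothetical daily keyword hits for one movie over 28 days.
daily_hits = list(range(1, 29))
print(weekly_totals(daily_hits))   # [28, 77, 126, 175]
```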
I then calculated the log of the number of weekly Twitter hits in order to analyze
the percent change from week to week. For the regression analysis, I wanted to
determine whether a change in Twitter hits was more predictive than the total
number of Twitter hits. This would also allow me to control for movies that were
more popular due to external factors, such as a higher budget or more proactive
advertising, and as a result generated more discussion on Twitter [13].
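The reason logs capture percent change is that a difference in logs depends only on the ratio of the two values, not on their scale. The short sketch below illustrates this with hypothetical hit counts: a small movie and a large movie that both grow by 5% show the same log change, even though their raw totals differ by a factor of 100.

```python
import math

def log_change(prev, curr):
    """Difference in natural logs, approximately the percent change for small moves."""
    return math.log(curr) - math.log(prev)

def pct_change(prev, curr):
    """Exact proportional change."""
    return (curr - prev) / prev

# Hypothetical weekly Twitter hits for a small and a large movie.
small = log_change(1_000, 1_050)
large = log_change(100_000, 105_000)
print(f"{small:.4f} {large:.4f} {pct_change(1_000, 1_050):.4f}")

# For large moves the approximation is looser: a 50% drop is about -0.69 in logs.
print(f"{log_change(100_000, 50_000):.4f}")
```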
[Figure: Log # Weekly Twitter Hits, by week (weeks 1 to 4)]
In order to capture the sentiment of each Tweet, I used Hootsuite’s Twitter
sentiment analytics, which capture the conversational tone of my keyword search. I
was able to analyze Tweet sentiment for each week, which gave me a total of four
data points. Hootsuite broke the data out into eight different
categories based on the sentiment of the tweet: affection/friendliness,
enjoyment/elation, amusement/excitement, contentment/gratitude,
sadness/grief, anger/loathing, fear/uneasiness, and
humiliation/shame. The analysis gave me a
percentage breakdown of the weekly Tweets for each
category [13].
I treated affection/friendliness, enjoyment/elation,
amusement/excitement, and contentment/gratitude
as positive Twitter sentiment, and sadness/grief, anger/loathing,
fear/uneasiness, and humiliation/shame as negative Twitter sentiment. I
summed the positive sentiment categories on a weekly basis in order to
capture the percent of Twitter hits that were positive. This again gave me four data
points for each movie [13].
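The positive-share calculation amounts to summing the four positive categories in each weekly breakdown. In the sketch below the dictionary keys are hypothetical labels chosen for readability, not Hootsuite's actual field names, and the percentages are made up.

```python
# Hypothetical category labels for the four positive sentiment groups.
POSITIVE = (
    "affection_friendliness",
    "enjoyment_elation",
    "amusement_excitement",
    "contentment_gratitude",
)

def positive_share(breakdown):
    """Percent of a week's Tweets falling into the positive categories."""
    return sum(breakdown[category] for category in POSITIVE)

# Hypothetical percentage breakdown for one movie in one week (sums to 100).
week1 = {
    "affection_friendliness": 20.0,
    "enjoyment_elation": 25.0,
    "amusement_excitement": 15.0,
    "contentment_gratitude": 10.0,
    "sadness_grief": 10.0,
    "anger_loathing": 8.0,
    "fear_uneasiness": 7.0,
    "humiliation_shame": 5.0,
}
print(positive_share(week1))   # 70.0
```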
[Figure: % Twitter Hits Positive, by week (weeks 1 to 4)]
I then calculated the log of the percent of Twitter hits that were positive in
order to capture the percent change from week to week. For the regression analysis,
I wanted to determine whether a change in Twitter sentiment was more predictive
than the level of sentiment [13].
I then used BoxOfficeMojo.com, a movie web site with the most comprehensive
box-office database, to capture the weekly box-office returns for each movie. I also
captured the weekly number of theaters in which the movie was showing. I then calculated
the log of both weekly box-office revenue and weekly number of theaters in order to
capture the percent change from week to week [14].
Finally, I used Rottentomatoes.com, a website devoted to reviews, information,
and news of films and widely known as a film review aggregator, to incorporate user
ratings. I used a dummy variable (1 or 0) to indicate whether the movie had
received a positive Rotten Tomatoes rating or a negative one (rotten) [15].
[Figure: Log % Twitter Hits Positive, by week (weeks 1 to 4)]
Graphs
Before creating my regression equation, I graphed the relationship between
MovieStock prices and a few key variables. I used the log of MovieStock price
because I wanted to focus on the percent change from week to week.
Looking at the relationship between the logStockPrice and
log#WeeklyTwitterHits, there appeared to be some positive correlation with a few
outliers.
[Figure: Correlation Between Change in Stock Price and Change in # Twitter Hits. Axes: Log Stock Price and Log # Weekly Twitter Hits]
The relationship between logStockPrice and log%TwitterHitsPositive did not
appear to show a clear correlation and appeared to be random at first glance.
[Figure: Correlation Between Change in Stock Price and Change in % Twitter Hits Positive. Axes: Log Stock Price and Log % Twitter Hits Pos.]
Initially, it looked like there was a clear correlation between logStockPrice
and logWeeklyBoxOffice. This would be expected given that the MovieStock price is
a forecast of actual box-office revenue.
[Figure: Correlation Between Change in Stock Price and Change in Weekly Box Office Revenue. Axes: Log Stock Price and Log Weekly Box Office Revenue]
Finally, I looked at the relationship between logStockPrice and
logWeeklyTheaters. Based on the relationship pictured in the graph, it was hard to
conclude that there was a strong positive correlation. I had expected that an
increase in the number of theaters showing a movie would be positively
correlated with the MovieStock price, because an expanding theater count
most likely indicates that people were going to see the movie.
[Figure: Correlation Between Change in Stock Price and Change in Weekly # Theaters. Axes: Log Stock Price and Log Weekly Theaters]
Regression Equation
Using the data described above, I was able to create a comprehensive panel
data set. It was important to use panel data rather than an ordinary linear
regression because I wanted to see not only how each weekly change in
MovieStock price was affected by the corresponding week's data, but also
whether previous weeks had an effect on the current week, which required
lagged time series variables. As a result, I was able to create a regression equation that tested
whether or not the change in MovieStock price could be determined by specific
independent variables.
logStockPrice = logStockPrice_{x-1} + logStockPrice_{x-2}
    + ReleaseDateTwitterHits
    + WeekTwitterHits + WeekTwitterHits_{x-1} + WeekTwitterHits_{x-2}
    + logWeekTwitterHits + logWeekTwitterHits_{x-1} + logWeekTwitterHits_{x-2}
    + TwitterHitsPositive + TwitterHitsPositive_{x-1} + TwitterHitsPositive_{x-2}
    + logTwitterHitsPositive + logTwitterHitsPositive_{x-1} + logTwitterHitsPositive_{x-2}
    + logWeekBoxOffice + logWeekBoxOffice_{x-1} + logWeekBoxOffice_{x-2}
    + WeekTheater + WeekTheater_{x-1} + WeekTheater_{x-2}
    + logWeekTheater + logWeekTheater_{x-1} + logWeekTheater_{x-2}
    + RottenTomatoes
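The lagged terms in the equation above (the x-1 and x-2 variables) can be built by pairing each week's value with the values from the one and two weeks before it. The sketch below shows this for a single movie and a single variable; the function name, the row layout, and the prices are hypothetical, and the thesis may have handled weeks without a full lag history differently.

```python
def add_lags(weekly_values, max_lag=2):
    """Pair each week's value with its previous `max_lag` weekly values.

    Weeks without a complete lag history are dropped in this sketch.
    """
    rows = []
    for x in range(max_lag, len(weekly_values)):
        row = {"week": x + 1, "current": weekly_values[x]}
        for k in range(1, max_lag + 1):
            row[f"lag{k}"] = weekly_values[x - k]
        rows.append(row)
    return rows

# Hypothetical end-of-week log prices for one movie, weeks 1 to 4.
log_prices = [3.2, 3.4, 3.5, 3.45]
for row in add_lags(log_prices):
    print(row)
```

With four weekly observations and two lags, only weeks 3 and 4 have a complete lag history in this sketch, which shows how quickly lagged variables reduce the usable sample.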
Results and Discussion
logStockPrice                   Coef.       Std. Err.   z       P>|z|   [95% Conf. Interval]
logStockPrice_{x-1}             0.8913884   0.0934942    9.53   0.000    0.7081431   1.0746340
logStockPrice_{x-2}             0.0481264   0.1215238    0.40   0.692   -0.190056    0.2863087
ReleaseDateTwitterHits         -0.000006    0.0000032   -1.92   0.055   -0.000012    0.0000001
WeekTwitterHits                 0.0000005   0.0000011    0.46   0.646   -0.000002    0.0000026
WeekTwitterHits_{x-1}          -0.000001    0.0000012   -0.59   0.554   -0.000003    0.0000016
WeekTwitterHits_{x-2}          -0.000000    0.0000003   -0.25   0.800   -0.000001    0.0000005
logWeekTwitterHits              0.0156865   0.0073672    2.13   0.033    0.0012470   0.0301259
logWeekTwitterHits_{x-1}        0.0166760   0.0074143    2.25   0.025    0.0021442   0.0312078
logWeekTwitterHits_{x-2}       -0.007142    0.0084267   -0.85   0.397   -0.023658    0.0093739
TwitterHitsPositive             0.0245292   0.0125585    1.95   0.051   -0.000085    0.0491435
TwitterHitsPositive_{x-1}      -0.005177    0.0111295   -0.47   0.642   -0.026990    0.0166361
TwitterHitsPositive_{x-2}      -0.022194    0.0183033   -1.21   0.225   -0.058067    0.0136798
logTwitterHitsPositive         -1.467054    0.7915974   -1.85   0.064   -3.018557    0.0844481
logTwitterHitsPositive_{x-1}    0.0808984   0.7752506    0.10   0.917   -1.438565    1.6003620
logTwitterHitsPositive_{x-2}    1.4688500   1.3364470    1.10   0.272   -1.150537    4.0882380
logWeekBoxOffice                0.0007364   0.0175517    0.04   0.967   -0.033664    0.0351371
logWeekBoxOffice_{x-1}          0.0014782   0.0206449    0.07   0.943   -0.038985    0.0419414
logWeekBoxOffice_{x-2}          0.0930352   0.0613214    1.52   0.129   -0.027152    0.2132229
WeekTheater                    -0.000000    0.0000341   -0.01   0.991   -0.000067    0.0000665
WeekTheater_{x-1}              -0.000058    0.0000854   -0.68   0.496   -0.000225    0.0001092
WeekTheater_{x-2}              -0.000162    0.0001541   -1.05   0.292   -0.000464    0.0001397
logWeekTheater                 -0.003385    0.0407300   -0.08   0.934   -0.083214    0.0764439
logWeekTheater_{x-1}            0.0307020   0.1550999    0.20   0.843   -0.273288    0.3346922
logWeekTheater_{x-2}            0.4900553   0.3653695    1.34   0.180   -0.226055    1.2061660
RottenTomatoes                  0.0353428   0.0200099    1.77   0.077   -0.003875    0.0745616
Constant                       -3.843140    2.0975930   -1.83   0.067   -7.954347    0.2680673
R2 Value
Within 0.4624
Between 0.9994
Overall 0.9982
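The z and P>|z| columns in the table above are linked through the standard normal distribution: the p-value is the two-sided tail probability beyond |z|. The short sketch below, using only the Python standard library, reproduces a few of the reported p-values and shows which |z| thresholds correspond to the 95% and 90% confidence levels. The function names are illustrative and are not part of the original analysis.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def two_sided_p(z):
    """Two-sided p-value for a standard-normal test statistic z."""
    return 2.0 * (1.0 - STANDARD_NORMAL.cdf(abs(z)))

def critical_value(confidence):
    """|z| threshold for a two-sided test at the given confidence level."""
    return STANDARD_NORMAL.inv_cdf(0.5 + confidence / 2.0)

# z-values taken from the regression table above.
for name, z in [
    ("logWeekTwitterHits", 2.13),
    ("ReleaseDateTwitterHits", -1.92),
    ("TwitterHitsPositive", 1.95),
]:
    print(f"{name}: z = {z:+.2f}, p = {two_sided_p(z):.3f}")

print(f"95% threshold: {critical_value(0.95):.3f}")
print(f"90% threshold: {critical_value(0.90):.3f}")
```

A |z| of at least 1.96 therefore corresponds to significance at the 95% level, and a |z| of at least 1.645 to significance at the 90% level.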
The variables highlighted in green are all significant at a critical value greater
than or equal to 1.96, indicating a 95% confidence level. The variables highlighted
in yellow are all significant at a critical value greater than or equal to 1.645,
indicating a 90% confidence level. The R-squared “between” is equal to 0.9994, which
is very high; however, the R-squared “within,” which is the R-squared for a
fixed-effects regression, is much lower. Since I used a random-effects model, the
R-squared “between” is the relevant number. One reason the R-squared is so high
could be that there are a lot of independent variables in the regression. The
most significant variables are LogStockPrice lagged one week, Release Date Twitter
Hits, Log Week Twitter Hits, Log Week Twitter Hits lagged one week, and % Weekly
Twitter Hits Positive. If The Hollywood Stock Exchange were a completely efficient
market, then past prices would have no correlation with current prices; however,
this is not the case: the previous week's change in MovieStock price has a direct
correlation with the current week's change in MovieStock price. We would also expect
that a positive change in Twitter hits would indicate a positive change in MovieStock
prices, which the positive coefficient on Log Week Twitter Hits confirms. It is
interesting that a change in Twitter hits one week prior also indicates a
positive change in MovieStock price. This relationship suggests a momentum
effect: if a movie generates a lot of buzz on Twitter, more people will go to see it and
talk about it. We would likewise expect an increase in the percentage of weekly
Twitter hits that are positive to correlate directly with an increase in MovieStock
price. If individuals are feeling positive about the movie, and
the collective opinion is becoming more positive, then people will recommend the
movie, and more people will go to see it. The most interesting finding, however, is
that release date Twitter hits are inversely correlated with MovieStock price. We
would expect the opposite, especially since the change in weekly Twitter hits is
positively correlated. One possible explanation is that people who Tweet on the day
a movie is released are complaining about the movie and giving it bad reviews. The
coefficient is so small, though, that it seems almost negligible even though the
variable is considered significant.
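Because the dependent variable is in logs, the size of a coefficient has to be read together with the scale of its regressor. The sketch below converts two of the reported coefficients into implied percent changes in MovieStock price. It assumes natural logarithms, and the 10% increase and the 10,000 additional release-date hits are illustrative inputs chosen here, not figures from the data. The release-date coefficient is also only reported to one significant digit, so the second result is rough.

```python
import math

def pct_change_log_log(beta, pct_increase):
    """Percent change in price when a logged regressor rises by pct_increase percent."""
    return ((1.0 + pct_increase / 100.0) ** beta - 1.0) * 100.0

def pct_change_log_level(beta, delta):
    """Percent change in price when an unlogged regressor rises by delta units.

    Assumes the dependent variable is a natural log.
    """
    return (math.exp(beta * delta) - 1.0) * 100.0

# Coefficients as reported in the regression table.
BETA_LOG_WEEK_TWITTER_HITS = 0.0156865
BETA_RELEASE_DATE_TWITTER_HITS = -0.000006

# Illustrative scenario 1: weekly Twitter hits rise by 10%.
weekly_effect = pct_change_log_log(BETA_LOG_WEEK_TWITTER_HITS, 10.0)

# Illustrative scenario 2: 10,000 additional Twitter hits on the release date.
release_effect = pct_change_log_level(BETA_RELEASE_DATE_TWITTER_HITS, 10_000)

print(f"10% more weekly hits: {weekly_effect:+.2f}% price change")
print(f"10,000 more release-date hits: {release_effect:+.2f}% price change")
```

Under these assumptions, a coefficient that looks tiny per individual hit can still imply a visible price change when the number of hits is large, so whether the release-date effect is negligible depends on how many hits a typical movie receives.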
Conclusion and Looking Further
Overall, the results show that Twitter can provide some indication of when
the MovieStock price will increase or decrease; however, it is hard to determine
exactly how accurate this relationship is. It would also appear that The Hollywood
Stock Exchange is not a completely efficient market despite successfully operating
as an online market game. Traders could use information from Twitter to help them
predict how MovieStocks will perform in the future and potentially exploit this
information to make excess returns.
Ideally, I would have liked to capture data for more movies in order to get a
more comprehensive data set. Also, with more resources, the collection of
Twitter data could have been more comprehensive, and I would not have relied
solely on Hootsuite as my main form of collection. It is questionable how accurate
Hootsuite’s method of capturing the number of keywords was. Also, in Hootsuite’s
Twitter sentiment analysis, there are some problems in how the different
categories were assigned. For example, the movie Killer Elite had an extremely high
percentage in the “fear uneasiness” category. This was probably because the word
“Killer” in the movie title was assigned to sentiments of fear. In order to
get a more comprehensive and accurate data set, every individual tweet would need
to be analyzed, but clearly this process is too arduous for one person.
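The Killer Elite problem can be illustrated with a toy keyword counter. This is only a sketch of how a title word might inflate a sentiment category. Hootsuite's actual method is not known, and the word list, the tweets, and the function `fear_share` are all hypothetical. Masking the movie title before matching keywords is one simple way to reduce this kind of false positive.

```python
import re

# Hypothetical fear lexicon; not Hootsuite's actual word list.
FEAR_WORDS = {"killer", "scary", "terrifying", "afraid"}

def fear_share(tweets, title=None):
    """Fraction of tweets containing at least one fear keyword.

    If `title` is given, occurrences of the full title are blanked out first,
    so words inside the title are not counted as sentiment.
    """
    if not tweets:
        return 0.0
    flagged = 0
    for tweet in tweets:
        text = tweet.lower()
        if title:
            text = text.replace(title.lower(), " ")
        words = re.findall(r"[a-z']+", text)
        if any(word in FEAR_WORDS for word in words):
            flagged += 1
    return flagged / len(tweets)

# Made-up tweets for illustration.
tweets = [
    "Just saw Killer Elite, great action",
    "Killer Elite was boring",
    "Killer Elite tonight with friends",
    "That chase in Killer Elite was terrifying",
]

naive = fear_share(tweets)
masked = fear_share(tweets, title="Killer Elite")
print(f"fear share without masking: {naive:.2f}")
print(f"fear share with title masked: {masked:.2f}")
```

In this toy example every tweet is flagged as fearful until the title is masked, after which only the tweet that actually uses a fear word is counted.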
Thinking about the effects of Twitter raises other questions. Is there a
threshold effect for movies, meaning that after a certain amount of “chatter” on
Twitter, the power of Twitter becomes less significant? Also, how important are
the number of Tweets and the change in the number of Tweets prior to the release of
the movie? It would be interesting to try to predict how well a movie would do at
the box office on opening weekend.
I would also have liked to break down the Twitter data and box-office returns by
geographical region in the US to see if certain regions have more
predictive power than others. For example, if more people in LA are talking about a
movie on Twitter, does that have implications for how well the movie performs just
in LA, or, because LA is a central city in the movie industry, does it have implications
for national box-office revenue? Also, it would be interesting to compare different
major cities, such as LA, New York, and Chicago, to test whether one city has more
influence and predictive power than another.
In conclusion, it does appear that Twitter has some effect on MovieStock prices
and, in turn, some predictive power in determining real world box-office returns;
however, it is unclear to what extent. In order to predict future changes in
MovieStock price, one could use information collected from Twitter, but based on
these results, it cannot be definitively determined how accurate such analysis would
be.
References
[1] – David M. Pennock, Steve Lawrence, C. Lee Giles, and Finn Arup Nielsen. The Power of Play: Efficiency and Forecast Accuracy in Web Market Games
[2] – Justin Wolfers and Eric Zitzewitz. Five Open Questions About Prediction Markets
[3] – Aswath Damodaran. Market Efficiency: Definitions and Tests. http://www.e-m-h.org/Damo.pdf.
[4] – The Economist. Can Twitter predict the future? Internet forecasting: Businesses are mining online messages to unearth consumers’ moods – and even make market predictions. http://www.economist.com/node/18750604.
[5] – Sitaram Asur and Bernardo A. Huberman. Predicting the Future with Social Media
[6] – Johan Bollen, Huina Mao, and Xiaojun Zeng. Twitter Mood Predicts the Stock Market
[7] – Ian Saxon. Intrade Prediction Market Accuracy and Efficiency: An Analysis of the 2004 and 2008 Democratic Presidential Nomination Contests
[8] – Shyam Gopinath, Pradeep K. Chintagunta, and Sriram Venkataraman. Blogs and Local-market Movie Box-office Performance
[9] – Eugene F. Fama. Market Efficiency, Long-Term Returns, and Behavioral Finance
[10] – Allan Timmermann and Clive W.J. Granger. Efficient Market Hypothesis and Forecasting
[11] – Eugene F. Fama. Efficient Capital Markets: II
Websites
[12] – HSX.com
[13] – Hootsuite.com
[14] – Boxofficemojo.com
[15] – Rottentomatoes.com