

The Hollywood Stock Exchange: Efficiency and The Power of Twitter

by

Nathaniel Harley

A special thanks to Professor Richard Walker for advising on this thesis. Also, thanks to Professor Joseph Ferrie, Sarah Ferrer, and the MMSS department.


Abstract

Online prediction markets are becoming increasingly popular and useful for forecasting real-world events. The Hollywood Stock Exchange is one of the most successful online prediction markets and forecasts real-world box-office returns. This thesis sets out to answer whether The Hollywood Stock Exchange is an efficient market and, if it is not, what factors can be used to predict future changes in MovieStock prices. Most importantly, this thesis focuses on the usefulness of social media, specifically Twitter, in predicting future changes to these prices.

Introduction

Markets are a place where individuals can exchange items. Prices are used to

assign these items values so that buyers and sellers can easily trade them.

Embedded in these prices is a large amount of information that reflects the

collective opinion of informed and uninformed traders.

The two main types of markets are financial markets and prediction markets.

We are all familiar with financial markets, such as stock markets, bond markets,

futures markets, commodity markets, currency markets, and money markets.

Depending on the type of market, the price of an asset can represent different

meanings. On a stock market, such as the New York Stock Exchange, the price of

common stock represents how much an individual is willing to pay for one share of

a specific company. On a futures market, the price represents a forecast of what the

underlying asset will cost in the future. For prediction markets, the price of an asset

is used to indicate the likelihood of an event occurring.

Prediction markets are slowly becoming more popular and are being used as

an informational resource to predict events. Some prediction markets, such as

Intrade Prediction Market, forecast the likelihood of political events. Others, such as

The Hollywood Stock Exchange, trade prediction shares of movies, actors, and other


film-related options. As more and more prediction markets expand onto the

electronic platform, individuals have more access to trade on these markets.

The question driving my thesis is this: if The Hollywood Stock Exchange is not an efficient market, what information can people use to predict future changes in stock prices? Recently, a lot of work has been done to capture social media data, such as Twitter data, and use it as a measurement to make quantitative predictions. Using Twitter data, along with other non-social-media variables, I attempt to test whether MovieStock prices can be predicted by Twitter information.

Prediction Markets

There are many barriers that exist for establishing a new market, such as

high costs, government regulation, and the threat of lawsuits; however, artificial

online prediction markets do not have these barriers. Web market games are

increasingly easy to create because they have small operating costs for setup,

maintenance, advertising, searching, and transacting, and benefit from a global

group of Internet users. They do not need to get permission from government

officials and do not need to create strict rules that would limit trading because there

is little risk of lawsuits against them. Users can remain anonymous and record

keeping does not need to be as tight. As a result, online markets, such as The

Hollywood Stock Exchange, can exist and function effectively [1]. However, as Justin

Wolfers and Eric Zitzewitz illustrate in their paper Five Open Questions About

Prediction Markets, there are five open questions that must be answered in order for

prediction markets to fulfill their potential and ultimately succeed.


The first question Wolfers and Zitzewitz pose is “how to attract uninformed

traders?” (Wolfers and Zitzewitz, p. 2). Uninformed traders are important to any

market place because they create an uninformed order flow, which actually attracts

informed profit motivated groups to trade. In order to attract these traders, it is

essential to have low transaction costs as well as interest or buzz surrounding a new

predictive market. The Hollywood Stock Exchange has successfully attracted both

uninformed and informed traders by creating an attractive, easy to use platform

that has positioned itself as the premier box-office forecasting market available.

Their second question is “how to tradeoff interest and contractibility?”

(Wolfers and Zitzewitz, p. 3). Wolfers and Zitzewitz conclude that it is important to

establish clear guidelines outlining contracts traded on prediction exchanges, but

that there is a lot of leeway in doing so. The Hollywood Stock Exchange does not face

some of the complexities that other prediction markets encounter. For example,

box-office revenue and which actor won the Oscar for best actor leave little room

for interpretation.

The third question they pose is “how to limit manipulation?” (Wolfers and

Zitzewitz, p. 3). This question addresses two types of manipulation: First,

manipulation of the outcomes on which the prediction markets are based and,

second, manipulation of the market prices themselves. Theoretically, traders could

go out and buy a mass amount of tickets for a specific movie in order to increase

box-office revenue, but practically, they would never do so because it would have

little impact on national box-office revenue and the cost of buying movie tickets

greatly outweighs the potential profit, since the exchange uses Hollywood Dollars and

not actual money. In terms of price manipulation, buying a mass amount of a specific

asset would have no effect on the real world outcome, so it would be pointless.

The fourth question Wolfers and Zitzewitz ask is “are markets well calibrated

on small probabilities?” (Wolfers and Zitzewitz, p. 3). They conclude that declining

transaction costs and carefully framed contracts will produce more accurate

responses from traders who are inherently bad at distinguishing small probabilities

and overvalue unlikely events. Although traders on The Hollywood Stock Exchange

are subject to this bias, it is important to realize that this behavior exists not only in

prediction markets, but also in real-world markets.

The final question Wolfers and Zitzewitz pose is “how to separate correlation

from causation?” (Wolfers and Zitzewitz, p. 3). The assets traded on The Hollywood

Stock Exchange, such as MovieStocks, are directly correlated with real world events

and the outcome of one event, in the movie world, does not cause the probability of

another event to change [2]. Therefore, there is not the problem of determining

correlation versus causation.

By examining these five open questions proposed by Wolfers and Zitzewitz, it

is clear that The Hollywood Stock Exchange has all the elements to make it a

successful prediction market and function effectively. However, this does not

guarantee that The Hollywood Stock Exchange is an efficient market.

Efficient Markets

If markets are, in fact, efficient, the market asset price is the best estimate of

value; however if markets are not efficient, the market price may deviate from the


true value. That being said, market efficiency does not require that the market price

be equal to true value at every point in time, but that if there is a deviation from the

true value, that the deviation is random. There are three main types of efficient

markets: Weak Form, Semi-Strong Form, and Strong Form [3].

o Weak Form – Future changes in prices are not predictable based on

information contained in all past prices suggesting that analysis of

past prices alone would not be helpful in determining undervalued or

overvalued assets.

o Semi-Strong Form – Future changes in prices are not predictable

based on past prices or any currently available public information

(including prices, economic variables, etc.).

o Strong Form – Prices fully reflect all information

available, public and private. Informed experts cannot consistently

outperform uninformed traders.

The Hollywood Stock Exchange

The Hollywood Stock Exchange is an online artificial prediction market game.

Participants can buy and sell virtual shares of celebrities and movies with a

currency called the Hollywood Dollar (H$). New users can join for free, and when

they do, they receive 2 million H$. They can trade various assets such as

MovieStocks, StarBonds, TVStocks, MovieFunds, and Derivatives. The Hollywood

Stock Exchange then syndicates the data collected from the Exchange and sells it as


market research to entertainment companies, consumer product companies and

financial institutions.

For the purpose of this thesis, I will focus on MovieStocks and whether it is possible to predict future changes in MovieStock prices. MovieStocks represent films both still in production and currently in theaters. The price of a MovieStock reflects how much money traders think the film will make, with each $1 million earned domestically equal to H$1. The price of a MovieStock is adjusted to reflect its exact earnings in the box-office. The price begins to adjust after the movie’s opening weekend in order to bring the expected box-office gross revenue in line with the actual box-office gross revenue. For example, if Movie A grosses $20 million its first week in theaters, then the price after the first week would be something like H$45 on the exchange. However, if Movie A only grossed $3 million in the second week, then the price of Movie A would most likely drop drastically to something like H$28. On average, a film makes 2.7 times its opening weekend box-office during its first four weeks of wide release. A MovieStock delists and cashes out from the market on the first business day after its fourth weekend of wide release or 12 weeks of limited release. The driving question behind this thesis is whether or not The Hollywood Stock Exchange is an efficient market, specifically looking at MovieStocks, and if it is not, what factors can be used to predict future changes in MovieStock prices.
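The pricing convention described above can be illustrated with a minimal sketch. It assumes only the two figures quoted in the text (H$1 per $1 million of domestic gross, and an average multiplier of 2.7 applied to the opening weekend); the function names are hypothetical, and the exchange's actual adjustment rule may differ in detail.

```python
# Average ratio quoted in the text: four-week gross / opening-weekend gross.
OPENING_WEEKEND_MULTIPLIER = 2.7


def gross_to_hollywood_dollars(gross_usd: float) -> float:
    """Convert a domestic gross in US dollars to H$ (H$1 per $1 million)."""
    return gross_usd / 1_000_000


def projected_price_after_opening(opening_weekend_gross_usd: float) -> float:
    """Illustrative projection of the four-week gross, expressed in H$.

    This is a sketch of the 2.7x rule of thumb, not the exchange's own code.
    """
    opening_in_h = gross_to_hollywood_dollars(opening_weekend_gross_usd)
    return OPENING_WEEKEND_MULTIPLIER * opening_in_h


if __name__ == "__main__":
    # A hypothetical film with a $10 million opening weekend.
    print(projected_price_after_opening(10_000_000))
```

Under these assumptions a hypothetical $10 million opening weekend projects to roughly H$27. Note that the input is the opening weekend gross, not the first full week quoted in the Movie A example.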


Twitter

Social media is quickly changing the social landscape because it is easy to use

and reaches a global audience extremely quickly. As a result, social media is setting

trends in topics that range across the board from politics to technology to the

entertainment industry. Social media can be a very powerful tool and the question

becomes whether it is possible to aggregate social media and use it as a

measurement to gauge collective opinion.

We all know Twitter. Twitter is essentially a real-time information network that connects you to the latest stories, ideas, opinions and news. Tweets are limited to 140 characters, but they can be very powerful. Twitter uses the # symbol, called a "hash-tag", to mark keywords or topics in a Tweet. Interestingly enough, Twitter users created it organically as a way to categorize messages. People use the hash-tag symbol before relevant keywords or phrases in their Tweet to categorize those Tweets and help them show up more easily in Twitter Search. Hash-tagged words that become very popular are often referred to as “trending topics.” As a result, Twitter has a lot of power because it can identify important topics and also the sentiment surrounding those topics. For example, if a keyword is being used a lot, we can conclude that many people find it important. Looking further, analyzing the individual tweets can help us identify whether people feel positively or negatively. Using this, we can create a measurement of collective opinion and use it to make quantitative predictions.

Many people have already started to use Twitter to build forecasting models.

Specifically, they look at the sentiment of a Tweet and use it to gauge collective


opinion. For example, Duncan Watts, an Internet researcher who heads one of

Yahoo!’s research labs in New York, uses Twitter to forecast video-game and music

sales. He found that adding Twitter data greatly increased the accuracy of his

forecasting model. Similarly, Derwent Capital Markets, a hedge fund based in

London, implements a Twitter model to help guide their investments [4].

Related Studies

When looking at previous studies, I came across a few that were very

influential in shaping how I tested The Hollywood Stock Exchange for efficiency. In

The Power of Play: Efficiency and Forecast Accuracy in Web Market Games by David

M. Pennock et al. they analyzed the efficiency and forecast accuracy of two market

games: The Hollywood Stock Exchange and the Foresight Exchange. For the purpose

of this thesis, I will focus on their results regarding The Hollywood Stock Exchange.

In their paper, they focused on the question of whether or not efficiency breaks

down in artificial markets when there is no monetary incentive. The goal of their

research was to test whether The Hollywood Stock Exchange holds for two types of

market efficiency: internal coherence and strong form. They presented evidence

that some market simulations can act sufficiently well as both aggregators and

disseminators of information. In conclusion, they found that The Hollywood Stock

Exchange MovieStock prices were good indicators of which movies will do well in the

box-office.

First, it is important to understand what internal coherence is. Internal coherence means that prices are self-consistent or arbitrage-free: no trader can make a sure profit without any risk. In efficient markets, arbitrage should not exist. For example, arbitrage exists when you can buy a security on one exchange, such as The New York Stock Exchange, for a certain price and then sell the same security on the Tokyo Exchange for a higher price. The security should have the same price on both exchanges. Another example can be shown in relation to the securities market. Take for instance a security that pays $1 if and only if it rains tomorrow. If another security existed that pays $1 if and only if it does not rain tomorrow, then owning both securities would guarantee a $1 payoff. In order for there not to be an arbitrage opportunity, the price to buy both securities should never be less than $1 and the price to sell both securities should never be more than $1.
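The rain example can be checked mechanically. The following is a minimal sketch, assuming each security has a quoted buying price (ask) and selling price (bid); the function name and the sample quotes are hypothetical and are not taken from Pennock et al.

```python
def arbitrage_profit(ask_rain: float, ask_no_rain: float,
                     bid_rain: float, bid_no_rain: float,
                     payoff: float = 1.0) -> float:
    """Return the risk-free profit per pair of securities, or 0.0 if none.

    Exactly one of the two securities pays `payoff`, so holding both
    is worth `payoff` for certain.
    """
    # Buy both: pay the two asks now, collect the payoff for sure.
    profit_from_buying = payoff - (ask_rain + ask_no_rain)
    # Sell both: collect the two bids now, owe the payoff for sure.
    profit_from_selling = (bid_rain + bid_no_rain) - payoff
    return max(profit_from_buying, profit_from_selling, 0.0)


if __name__ == "__main__":
    # Coherent quotes: buying both costs more than $1, selling yields less.
    print(arbitrage_profit(0.55, 0.50, 0.52, 0.47))
    # Incoherent quotes: both securities can be bought for $0.95 in total.
    print(arbitrage_profit(0.45, 0.50, 0.43, 0.48))
```

A return value of zero corresponds to internally coherent prices in the sense used above; any positive value is a sure profit.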

One of the driving questions behind their study was: “do HSX players have utility for Hollywood dollars and, if so, are their resulting incentives strong enough to maintain internal price consistency in the game?” (Pennock, p. 7). In order to determine the degree of internal coherence in MovieStocks, they tested how closely MovieStock and option prices conformed to put-call parity. In conducting their experiment, they used weekend halt prices (the price before the movie adjusts to approximately 2.7 times the opening weekend box-office proceeds) for 75 MovieStocks and their corresponding options during the period of March 3, 2000 to September 1, 2000. They found that the relationship between the stock estimates of weekend box-office returns and the option estimates adhered relatively closely to put-call parity at the halt price. They then wanted to test whether prices adhered to put-call parity at all times, not just at the halt price. Their results indicated that The Hollywood Stock Exchange market was not completely free of arbitrage because prices diverged at times from parity by as much as H$6.5. When examining whether the market showed signs of internal coherence, they concluded that it did because when prices were too high, they were much more likely to correct and go down on subsequent days. Similarly, when they were too low, they were more likely to increase. They hypothesized that this price self-correction could be attributed to traders taking advantage of arbitrage opportunities.
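For reference, the textbook form of put-call parity for European options is sketched below. The symbols are assumptions introduced here for illustration (call price C, put price P, underlying price S, strike K, interest rate r, time to expiry T) and are not necessarily the notation or the exact contract terms used by Pennock et al. or by the exchange.

```latex
% Textbook put-call parity for European options on the same underlying,
% with the same strike K and expiry T:
C - P = S - K e^{-rT}
% With no interest on the play-money currency (r = 0, an assumption here):
C - P = S - K
% A deviation from parity can then be measured as
d = (C - P) - (S - K)
% so that d = 0 means the stock and option prices are mutually consistent.
```

Read this way, the reported divergence of up to H$6.5 would correspond to the largest observed value of |d|, under the stated assumptions.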

Pennock et al. wanted to test the forecast accuracy of The Hollywood Stock Exchange and whether MovieStocks were good predictors of box-office returns. In order to understand their process and results, it is important to understand Rational Expectations Theory: prices are not only coherent, but also reflect the sum total of all information available to all market participants. Essentially, the Rational Expectations Theory states that even when some individuals have insider information, prices equilibrate as if everyone has access to all the same information. They wanted to go further and test whether strong form efficiency holds in The Hollywood Stock Exchange because internal coherence is only a minimal standard of market efficiency, whereas stronger forms of efficiency imply “market competence as well and coherence: prices actually reflect an aggregation of information distributed among the participants, and market forecasts are as accurate as expert assessments” (Pennock, p. 11). Ultimately, they proposed that if The Hollywood Stock Exchange holds for strong form efficiency, then the implications would be more relevant to the societal benefit in the form of “cheap and reliable forecasts” (Pennock, p. 11). In order to test strong form efficiency, they assessed the forecast accuracy of The Hollywood Stock Exchange stock. Pennock et al. quantified and compared MovieStock prices (The Hollywood Stock Exchange prediction) to Brandon Gray’s published forecasts at Box-office Mojo for 50 movies appearing on both sources. Their results showed that there was a slight bias to underprice the best performing movies and overprice the worst performing movies. They attributed this bias to a manifestation of risk-seeking behavior where traders preferred potential “sleepers” with the opportunity for a very large payoff, rather than known quantities with a moderate payoff. They also found a correlation between MovieStock estimates and Box-office Mojo estimates, which they hypothesized resulted from the possibility that Box-office Mojo observes Hollywood Stock Exchange prices, and/or some Hollywood Stock Exchange traders read Box-office Mojo forecasts.

Ultimately, they concluded that The Hollywood Stock Exchange showed signs

of efficiency, in the form of price coherence and forecast accuracy. They deduced

that The Hollywood Stock Exchange is a good forecast for box-office returns and

provides a reasonable likelihood assessment of uncertain events (the final four

week box-office returns). The implications that Pennock et al. derived from their

study were that existing artificial markets, like The Hollywood Stock Exchange,

could be a valuable resource for information. Also, The Hollywood Stock Exchange

provides a good example for a successful artificial market and should promote the

creation of similar markets in the future [1].

In Predicting the Future with Social Media, Sitaram Asur and Bernardo A. Huberman demonstrated how social media content could be used to predict real-world outcomes. In particular, they used the chatter from Twitter.com to forecast box-office revenues for movies. They showed how a simple model built from the rate at which tweets were created about particular topics could outperform market-based predictors. Furthermore, they demonstrated how sentiments extracted from Twitter could be further utilized to improve the forecasting power of social media.

Social media has the ability to aggregate opinions and act as a form of

“collective wisdom” that can be used to make “quantitative predictions that

outperform those of artificial markets” (Asur & Huberman, p. 1). Their goal was to

assess how buzz and attention were created for different movies and how they changed

over time. Also, they focused on the mechanism of viral marketing and pre-release

hype on Twitter, and the role that attention played in forecasting real-world

box-office performance. They also focused on how sentiments were created and how

positive and negative opinions influenced people.

Their hypothesis was that “movies that are well talked about will be well-watched” (Asur & Huberman, p. 1).

Asur and Huberman wanted to look at how attention and popularity were generated for movies on Twitter, and what effects this had on box-office performance for the movies considered. Their results indicated that movies that had greater publicity, in terms of linked URLs on Twitter, did not necessarily perform better in the box-office. Their initial analysis of tweet rates (defined as tweets for a movie per hour) showed a positive correlation with box-office performance. When they compared their results to The Hollywood Stock Exchange index, they found that their model outperformed The Hollywood Stock Exchange based model in predicting actual box-office returns.


They then tested whether they could predict the price of The Hollywood Stock Exchange MovieStock at the end of the opening weekend for the movies they considered. In order to do so, they used historical Hollywood Stock Exchange prices as well as the tweet-rates for the week prior to the release as predictive variables. They created a simple time series regression of tweet-rates, over 7 days before the weekend, to predict the box-office revenue for a particular weekend. Again, they found that their tweet-rate model was better at predicting the actual values than the historical Hollywood Stock Exchange prices. Their results showed how Twitter could be used as an accurate indicator of future outcomes.
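The idea of regressing revenue on tweet-rates can be illustrated with a deliberately simplified sketch. It fits an ordinary least-squares line with a single predictor (the average pre-release tweet-rate), whereas Asur and Huberman used the tweet-rates of the seven preceding days; the data below are synthetic values invented for illustration, not figures from their paper.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic data: average tweets per hour in the week before release,
# and opening-weekend gross in $ millions (constructed as 0.05 * rate).
avg_tweet_rate = [100.0, 250.0, 400.0, 800.0]
opening_gross = [5.0, 12.5, 20.0, 40.0]

intercept, slope = fit_line(avg_tweet_rate, opening_gross)


def predict(rate):
    """Predicted opening gross ($ millions) for a given tweet-rate."""
    return intercept + slope * rate
```

With real data the fit would of course be noisy; the point is only that the tweet-rate enters as an explanatory variable for box-office revenue.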

Asur and Huberman also wanted to see if the sentiment of Tweets could

increase forecasting accuracy. They sectioned off tweets into Positive, Negative, and

Neutral. Their results indicated that tweets after the release had more value than

tweets before, which matched their expectation that people would give more

weight to a tweet after they had seen the movie. They also found that there were

more positive sentiments than negative for almost all of the movies. They

concluded that adding Twitter sentiment to the equation did not significantly

increase the predictive power of tweets themselves.

In conclusion, they found that social media feeds could be an effective indicator of real-world performance. Specifically, the rate at which movie Tweets were generated could be used to build a powerful model for predicting movie box-office revenue. They showed how their predictions were more accurate than The Hollywood Stock Exchange prices. Finally, the sentiment of tweets could improve box-office revenue predictions based on tweet rates, but only after the movies were released [5].

In Twitter Mood Predicts the Stock Market, Johan Bollen et al. looked at the question of whether societies can experience mood states that affect their collective decision-making and, by extension, whether the public mood was correlated with or even predictive of economic indicators. They investigated whether measurements of collective mood states obtained from Twitter feeds were correlated with the value of the Dow Jones Industrial Average over time. In their study they analyzed the text in Tweets using two mood-tracking tools, OpinionFinder (measures positive vs. negative mood) and Google-Profile of Mood States (measures mood in terms of 6 dimensions: Calm, Alert, Sure, Vital, Kind, and Happy).

Their results showed that changes in the public mood state could be tracked

from the content of large-scale Twitter feeds by text processing techniques and that

“such changes respond to a variety of socio-cultural drivers in a highly differentiated

manner” (Bollen, p. 7). Also, they found that the inclusion of specific public mood

dimensions, but not others, could significantly improve the accuracy of Dow Jones

Industrial Average predictions. They found that the calmness of the public was

predictive of the Dow Jones Industrial Average rather than general levels of positive

sentiment as measured by OpinionFinder [6].

These three studies helped shape how I wanted to form my own study of

MovieStock prices in relation with Twitter.


Data Summary

Before collecting my movie data, I needed to establish consistent criteria so

that all movies in the data set would share similar properties. First, I collected data

for the top 2-3 grossing movies that opened in wide release (so that every movie in my

data set would delist after four weeks) for each week over the time period of

September 2011 to December 2011. This gave me a data set of 26 movies. For every

movie, I collected the release date MovieStock price and then the price at the end of

each week, up until the delist date. This gave me five data points for each movie.

Ultimately, I only used the end of week stock price for my regression analysis and

dropped the release date stock price [12].

In order to capture Twitter data, I used a program called Hootsuite. First, I tracked how many times a movie title was mentioned as a keyword on Twitter over the four-week period. The keyword analysis performed by Hootsuite gave me daily Twitter hits for each keyword. In order to make my Twitter data line up with the MovieStock price data, I summed the number of daily Twitter hits and calculated weekly Twitter hit totals for each week. This gave me four data points for each movie: Week 1 Twitter total, Week 2 Twitter total, Week 3 Twitter total, and Week 4 Twitter total [13].

[Figure: # Weekly Twitter Hits (Thousands) by Week, weeks 1 to 4]

I then calculated the log of the number of weekly Twitter hits in order to analyze

the percent change from week to week. For the regression analysis, I wanted to

determine whether a change in Twitter hits was more predictive than the total

number of Twitter hits. This also would allow me to control for movies that were

more popular due to external factors, such as a higher budget or more proactive

advertising, and as a result, generated more discussion on Twitter [13].
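The aggregation and log transform described above can be sketched as follows. This is a minimal illustration that assumes natural logarithms and seven-day weeks; the function names and the sample counts are hypothetical, and a week with zero hits would need special handling because its logarithm is undefined.

```python
import math


def weekly_totals(daily_hits, days_per_week=7):
    """Sum consecutive daily Twitter hit counts into weekly totals."""
    return [
        sum(daily_hits[start:start + days_per_week])
        for start in range(0, len(daily_hits), days_per_week)
    ]


def log_changes(weekly_hits):
    """Week-to-week differences in log hits.

    A difference of logs approximates the proportional change between weeks,
    which removes the level effect of very popular movies.
    """
    logs = [math.log(hits) for hits in weekly_hits]
    return [later - earlier for earlier, later in zip(logs, logs[1:])]


if __name__ == "__main__":
    # Synthetic daily counts for a four-week run (28 days).
    daily = [1000] * 7 + [500] * 7 + [250] * 7 + [125] * 7
    totals = weekly_totals(daily)
    print(totals)
    print(log_changes(totals))
```

In this synthetic series the hits halve every week, so each log difference equals the log of one half regardless of how large the movie's absolute hit count is.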

In order to capture the sentiment of each Tweet, I used Hootsuite’s Twitter sentiment analytics, which capture the conversational tone of my keyword search. I was able to analyze Tweet sentiment for each week, which gave me a total of four data points. Hootsuite analyzed the data by breaking it out into eight different categories based on the sentiment of the tweet: affection/friendliness, enjoyment/elation, amusement/excitement, contentment/gratitude, sadness/grief, anger/loathing, fear/uneasiness, and humiliation/shame. The analysis gave me a percentage breakdown of the weekly Tweets for each category [13].

[Figure: Log # Weekly Twitter Hits by Week, weeks 1 to 4]

I considered affection/friendliness, enjoyment/elation, amusement/excitement, and contentment/gratitude as positive Twitter sentiment, and sadness/grief, anger/loathing, fear/uneasiness, and humiliation/shame as negative Twitter sentiment. I aggregated the positive Tweet sentiments on a weekly basis in order to capture the percent of Twitter hits that were positive. This again gave me four data points for each movie [13].
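The grouping of the eight categories into a single positive share can be sketched as below. The category labels follow the text; the dictionary layout, the function name, and the sample percentages are assumptions made for illustration and do not reflect Hootsuite's actual export format.

```python
# The four categories treated as positive sentiment in the text.
POSITIVE_CATEGORIES = {
    "affection/friendliness",
    "enjoyment/elation",
    "amusement/excitement",
    "contentment/gratitude",
}


def percent_positive(weekly_breakdown):
    """Sum the percentage shares of the positive categories for one week."""
    return sum(
        share
        for category, share in weekly_breakdown.items()
        if category in POSITIVE_CATEGORIES
    )


# Synthetic weekly breakdown (percent of tweets per category, summing to 100).
example_week = {
    "affection/friendliness": 10,
    "enjoyment/elation": 20,
    "amusement/excitement": 15,
    "contentment/gratitude": 5,
    "sadness/grief": 10,
    "anger/loathing": 20,
    "fear/uneasiness": 15,
    "humiliation/shame": 5,
}
```

Applying the function to each of the four weekly breakdowns yields the four percent-positive data points per movie.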

I then calculated the log of the percent of Twitter hits that were positive in order to capture the percent change from week to week. For the regression analysis, I wanted to determine whether a change in Twitter sentiment was more predictive than total sentiment [13].

[Figure: % Twitter Hits Positive by Week, weeks 1 to 4]

I then used BoxOfficeMojo.com, a movie web site with the most comprehensive

box-office database, to capture the weekly box-office returns for each movie. Also, I

captured the weekly number of theaters the movie was released in. I then calculated

the log of both weekly box-office revenue and weekly number of theaters in order to

capture the percent change from week to week [14].

Finally, I used Rottentomatoes.com (a website devoted to reviews, information,

and news of films, widely known as a film review aggregator) to incorporate user

ratings. I used a dummy variable (1 or 0) to indicate whether the movie had

received a positive Rotten Tomatoes rating or a negative one (rotten) [15].

[Figure: Log % Twitter Hits Positive by Week, weeks 1 to 4]

Graphs

Before creating my regression equation, I graphed the relationship between

MovieStock prices and a few key variables. I used the log of MovieStock price

because I wanted to focus on the percent change from week to week.

Looking at the relationship between the logStockPrice and

log#WeeklyTwitterHits, there appeared to be some positive correlation with a few

outliers.

[Figure: Correlation Between Change in Stock Price and Change in # Twitter Hits. x-axis: Log # Weekly Twitter Hits; y-axis: Log Stock Price]

The relationship between logStockPrice and log%TwitterHitsPositive did not

appear to have a clear correlation and appeared to be random at first glance.

[Figure: Correlation Between Change in Stock Price and Change in % Twitter Hits Positive. x-axis: Log % Twitter Hits Positive; y-axis: Log Stock Price]

Initially, it looked like there was a clear correlation between logStockPrice

and logWeeklyBoxOffice. This would be expected given that the MovieStock price is

a forecast of actual box-office revenue.

[Figure: Correlation Between Change in Stock Price and Change in Weekly Box Office Revenue. x-axis: Log Weekly Box Office Revenue; y-axis: Log Stock Price]

Finally, I looked at the relationship between logStockPrice and

logWeeklyTheaters. Based on the relationship pictured in the graph, it was hard to

conclude that there was a strong positive correlation. I would expect that a positive

change in the number of theaters a movie was released in would be positively

correlated with the MovieStock price because if they were increasing the number of

theaters it most likely indicates that people were going to see the movie.

[Figure: Correlation Between Change in Stock Price and Change in Weekly # Theaters. x-axis: Log Weekly Theaters; y-axis: Log Stock Price]

Regression Equation

Using the data described above, I was able to create a comprehensive panel data set. It was important to use a panel data model as opposed to a normal linear regression because not only did I want to see how each weekly change in MovieStock price was affected by the corresponding week's data, but I also wanted to incorporate time series variables to test whether previous weeks had an effect on the current week. As a result, I was able to create a regression equation that tested whether or not the change in MovieStock price could be determined by specific independent variables.

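\[
\begin{aligned}
\text{logStockPrice} ={} & \text{logStockPrice}_{x-1} + \text{logStockPrice}_{x-2} + \text{ReleaseDateTwitterHits} \\
& + \text{WeekTwitterHits} + \text{WeekTwitterHits}_{x-1} + \text{WeekTwitterHits}_{x-2} \\
& + \text{logWeekTwitterHits} + \text{logWeekTwitterHits}_{x-1} + \text{logWeekTwitterHits}_{x-2} \\
& + \text{TwitterHitsPositive} + \text{TwitterHitsPositive}_{x-1} + \text{TwitterHitsPositive}_{x-2} \\
& + \text{logTwitterHitsPositive} + \text{logTwitterHitsPositive}_{x-1} + \text{logTwitterHitsPositive}_{x-2} \\
& + \text{logWeekBoxOffice} + \text{logWeekBoxOffice}_{x-1} + \text{logWeekBoxOffice}_{x-2} \\
& + \text{WeekTheater} + \text{WeekTheater}_{x-1} + \text{WeekTheater}_{x-2} \\
& + \text{logWeekTheater} + \text{logWeekTheater}_{x-1} + \text{logWeekTheater}_{x-2} \\
& + \text{RottenTomatoes}
\end{aligned}
\]

Terms without a subscript refer to the current week x; the subscripts x-1 and x-2 denote the values one and two weeks earlier.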
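The lagged terms in the equation can be built from the weekly series as sketched below. This is a minimal illustration in plain Python, assuming each movie has one value per week; the function name, the dictionary layout, and the sample values are hypothetical and are not taken from the thesis data set.

```python
def build_lagged_rows(series_by_movie, max_lag=2):
    """Turn each movie's weekly series into rows with lagged values.

    A row is produced only for weeks that have `max_lag` earlier weeks,
    so a four-week series with two lags yields rows for weeks 3 and 4.
    """
    rows = []
    for movie, values in series_by_movie.items():
        for index in range(max_lag, len(values)):
            row = {"movie": movie, "week": index + 1, "current": values[index]}
            for lag in range(1, max_lag + 1):
                row[f"lag_{lag}"] = values[index - lag]
            rows.append(row)
    return rows


# Synthetic log stock prices for one hypothetical movie over four weeks.
example = {"Movie A": [4.0, 3.8, 3.7, 3.65]}
rows = build_lagged_rows(example)
```

Under this construction, each movie with four weekly observations contributes two complete rows when two lags are required, which is worth keeping in mind when judging how many observations support the 25 regressors.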

Results and Discussion

logStockPrice                 Coef.      Std. Err.      z  P>|z|  [95% Conf. Interval]
logStockPrice_{x-1}           0.8913884  0.0934942   9.53  0.000  0.7081431  1.0746340
logStockPrice_{x-2}           0.0481264  0.1215238   0.40  0.692  -0.190056  0.2863087
ReleaseDateTwitterHits        -0.000006  0.0000032  -1.92  0.055  -0.000012  0.0000001
WeekTwitterHits               0.0000005  0.0000011   0.46  0.646  -0.000002  0.0000026
WeekTwitterHits_{x-1}         -0.000001  0.0000012  -0.59  0.554  -0.000003  0.0000016
WeekTwitterHits_{x-2}         -0.000000  0.0000003  -0.25  0.800  -0.000001  0.0000005
logWeekTwitterHits            0.0156865  0.0073672   2.13  0.033  0.0012470  0.0301259
logWeekTwitterHits_{x-1}      0.0166760  0.0074143   2.25  0.025  0.0021442  0.0312078
logWeekTwitterHits_{x-2}      -0.007142  0.0084267  -0.85  0.397  -0.023658  0.0093739
TwitterHitsPositive           0.0245292  0.0125585   1.95  0.051  -0.000085  0.0491435
TwitterHitsPositive_{x-1}     -0.005177  0.0111295  -0.47  0.642  -0.026990  0.0166361
TwitterHitsPositive_{x-2}     -0.022194  0.0183033  -1.21  0.225  -0.058067  0.0136798
logTwitterHitsPositive        -1.467054  0.7915974  -1.85  0.064  -3.018557  0.0844481
logTwitterHitsPositive_{x-1}  0.0808984  0.7752506   0.10  0.917  -1.438565  1.6003620
logTwitterHitsPositive_{x-2}  1.4688500  1.3364470   1.10  0.272  -1.150537  4.0882380
logWeekBoxOffice              0.0007364  0.0175517   0.04  0.967  -0.033664  0.0351371
logWeekBoxOffice_{x-1}        0.0014782  0.0206449   0.07  0.943  -0.038985  0.0419414
logWeekBoxOffice_{x-2}        0.0930352  0.0613214   1.52  0.129  -0.027152  0.2132229
WeekTheater                   -0.000000  0.0000341  -0.01  0.991  -0.000067  0.0000665
WeekTheater_{x-1}             -0.000058  0.0000854  -0.68  0.496  -0.000225  0.0001092
WeekTheater_{x-2}             -0.000162  0.0001541  -1.05  0.292  -0.000464  0.0001397
logWeekTheater                -0.003385  0.0407300  -0.08  0.934  -0.083214  0.0764439
logWeekTheater_{x-1}          0.0307020  0.1550999   0.20  0.843  -0.273288  0.3346922
logWeekTheater_{x-2}          0.4900553  0.3653695   1.34  0.180  -0.226055  1.2061660
RottenTomatoes                0.0353428  0.0200099   1.77  0.077  -0.003875  0.0745616
Constant                      -3.843140  2.0975930  -1.83  0.067  -7.954347  0.2680673

R-squared Value
Within   0.4624
Between  0.9994
Overall  0.9982
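The z, P>|z|, and confidence-interval columns in the regression table follow from each coefficient and its standard error. As a check on how to read those columns, the sketch below recomputes them for the logWeekTwitterHits row under a standard normal approximation. It is an illustrative recomputation, not the statistical package's own routine, and the helper names `normal_cdf`, `summarize`, and `two_sided_confidence` are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def summarize(coef, std_err, z_crit=1.959964):
    """Return (z, two-sided p-value, CI lower bound, CI upper bound)."""
    z = coef / std_err
    p = 2.0 * (1.0 - normal_cdf(abs(z)))
    return z, p, coef - z_crit * std_err, coef + z_crit * std_err


def two_sided_confidence(z_crit):
    """Two-sided confidence level implied by a critical value."""
    return 2.0 * normal_cdf(z_crit) - 1.0


# Coefficient and standard error copied from the logWeekTwitterHits row.
z, p, lo, hi = summarize(0.0156865, 0.0073672)
print(f"z = {z:.2f}, p = {p:.3f}, 95% CI = [{lo:.7f}, {hi:.7f}]")

# Confidence levels implied by the two critical values used in the text.
print(f"{two_sided_confidence(1.96):.3f}", f"{two_sided_confidence(1.645):.3f}")
```

The recomputed values agree with the table row to the precision shown there, and the last line confirms that a critical value of 1.96 corresponds to roughly 95% confidence and 1.645 to roughly 90%.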

The variables highlighted in green have a z-statistic whose absolute value is greater than or equal to the critical value of 1.96, indicating significance at the 95% confidence level. The variables highlighted in yellow have a z-statistic whose absolute value is greater than or equal to the critical value of 1.645, indicating significance at the 90% confidence level. The R-squared “between” is equal to 0.9994, which is very high; however, the R-squared “within,” which is the R-squared for a fixed-effects regression, is much lower. Since I used a random-effects model, the R-squared “between” is the relevant number. One reason the R-squared is so high could be that there are many independent variables in the regression.

The most significant variables are logStockPrice lagged one week, Release Date Twitter Hits, Log Week Twitter Hits, Log Week Twitter Hits lagged one week, and % Weekly Twitter Hits Positive. If The Hollywood Stock Exchange were a completely efficient market, then past prices would have no correlation with current prices; however, this is not the case: the previous week's change in MovieStock price is positively correlated with the current week's change in MovieStock price. We would also expect a positive change in Twitter hits to indicate a positive change in MovieStock prices, and it does. It is interesting that a change in Twitter hits one week prior also indicates a positive change in MovieStock price. This relationship suggests a momentum effect: if a movie generates a lot of buzz on Twitter, more people will go to see it and talk about it. Likewise, we would expect a change in the percentage of weekly Twitter hits that are positive to be positively correlated with MovieStock price. If individuals are feeling positive about the movie, and the collective opinion is becoming more positive, then people will recommend the movie, and more people will go to see it. The most interesting finding, however, is that release date Twitter hits are inversely correlated with MovieStock price. We would expect the opposite, especially since the change in weekly Twitter hits is positively correlated. One possible explanation could be that people who tweet on the day a movie is released are complaining about the movie and giving it bad reviews. The coefficient is so small, though, that it seems almost negligible even though the variable is considered significant.
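To put the magnitudes discussed above in perspective, the sketch below converts two of the reported coefficients into approximate percentage changes in MovieStock price, holding the other regressors fixed. It assumes the logs in the regression are natural logs, which the thesis does not state, and the 10% and 1,000-tweet changes are hypothetical inputs chosen purely for illustration. The function names are likewise hypothetical.

```python
import math

# Coefficients copied from the regression table (dependent variable: logStockPrice).
COEF_LOG_WEEK_TWITTER_HITS = 0.0156865  # logged regressor
COEF_RELEASE_DATE_HITS = -0.000006      # level regressor, per release-date tweet


def pct_change_from_log_regressor(coef, pct_increase):
    """Percent change in price when a logged regressor rises by pct_increase percent."""
    return (math.exp(coef * math.log(1.0 + pct_increase / 100.0)) - 1.0) * 100.0


def pct_change_from_level_regressor(coef, unit_increase):
    """Percent change in price when a level regressor rises by unit_increase units."""
    return (math.exp(coef * unit_increase) - 1.0) * 100.0


# Hypothetical scenario 1: weekly Twitter hits rise by 10%.
print(round(pct_change_from_log_regressor(COEF_LOG_WEEK_TWITTER_HITS, 10), 3))

# Hypothetical scenario 2: 1,000 additional release-date tweets.
print(round(pct_change_from_level_regressor(COEF_RELEASE_DATE_HITS, 1000), 3))
```

Under these assumptions, a 10% rise in weekly Twitter hits corresponds to a price increase of roughly 0.15%, and 1,000 additional release-date tweets to a decrease of roughly 0.6%. How large the release-date effect is in practice therefore depends on how many tweets a movie typically receives on its release date.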


Conclusion and Looking Further

Overall, the results show that Twitter can provide some indication of when

the MovieStock price will increase or decrease; however, it is hard to determine

exactly how accurate this relationship is. It would also appear that The Hollywood

Stock Exchange is not a completely efficient market despite successfully operating

as an online market game. Traders could use information from Twitter to help them

predict how MovieStocks will perform in the future and potentially exploit this

information to make excess returns.

Ideally, I would have liked to capture data for more movies in order to get a more comprehensive data set. Also, with more resources, the collection of Twitter data could have been more comprehensive, and I would not have relied solely on Hootsuite as my main form of collection. It is questionable how accurate Hootsuite’s method of capturing the number of keywords was. There are also some problems in how Hootsuite’s Twitter sentiment analysis assigned tweets to the different categories. For example, the movie Killer Elite had an extremely high percentage in the “fear uneasiness” category. This was probably because the word “Killer” in the movie title was assigned to sentiments of fear. In order to get a more comprehensive and accurate data set, every individual tweet would need to be analyzed, but clearly this process is too arduous for one person.
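The Killer Elite problem can be illustrated with a toy keyword-based classifier. The sketch below is not Hootsuite's actual method, which is not documented here. The fear lexicon, the function names, and the sample tweets are all hypothetical. It only shows how matching on keywords can inflate a category when a movie title itself contains one of those keywords, and how removing the title before matching changes the result.

```python
import re

# Hypothetical fear lexicon; NOT Hootsuite's actual word list.
FEAR_WORDS = {"killer", "scary", "terrifying"}


def tokens(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def fear_share(tweets, title=None):
    """Share of tweets containing at least one fear word.

    If title is given, occurrences of the title are removed before matching.
    """
    flagged = 0
    for tweet in tweets:
        text = tweet.lower()
        if title:
            text = text.replace(title.lower(), " ")
        if FEAR_WORDS & set(tokens(text)):
            flagged += 1
    return flagged / len(tweets) if tweets else 0.0


# Made-up tweets for illustration only.
tweets = [
    "Killer Elite was so much fun",
    "Loved Killer Elite, great cast!",
    "Killer Elite tonight, that chase scene was terrifying",
]

naive = fear_share(tweets)
title_aware = fear_share(tweets, title="Killer Elite")
print(naive, title_aware)
```

In this toy example every tweet is flagged as fearful when the title is left in, while only the one tweet that actually uses a fear word outside the title is flagged once the title is removed.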

When thinking about the effects of Twitter, other questions arise. Is there a threshold effect for movies, meaning that after a certain amount of “chatter” on Twitter, the power of Twitter becomes less significant? Also, how important are the number of tweets and the change in the number of tweets prior to the release of the movie? It would be interesting to try to predict how well a movie would do at the box office on its opening weekend.

I would also have liked to break down the Twitter data and box-office returns by geographical region in the US to see if certain regions have more predictive power than others. For example, if more people in LA are talking about a movie on Twitter, does that have implications for how well the movie performs just in LA, or, because LA is a central city in the movie industry, does it have implications for national box-office revenue? It would also be interesting to compare different major cities, such as LA, New York, and Chicago, to test whether one city has more influence and predictive power than another.

In conclusion, it does appear that Twitter has some effect on MovieStock prices and, in turn, some predictive power in determining real-world box-office returns; however, it is unclear to what extent. In order to predict future changes in MovieStock price, one could use information collected from Twitter, but based on these results, it cannot be definitively determined how accurate such analysis would be.


References

[1] David M. Pennock, Steve Lawrence, C. Lee Giles, and Finn Arup Nielsen. The Power of Play: Efficiency and Forecast Accuracy in Web Market Games.
[2] Justin Wolfers and Eric Zitzewitz. Five Open Questions About Prediction Markets.
[3] Aswath Damodaran. Market Efficiency: Definitions and Tests. http://www.e-m-h.org/Damo.pdf
[4] The Economist. Can Twitter predict the future? Internet forecasting: Businesses are mining online messages to unearth consumers’ moods – and even make market predictions. http://www.economist.com/node/18750604
[5] Sitaram Asur and Bernardo A. Huberman. Predicting the Future with Social Media.
[6] Johan Bollen, Huina Mao, and Xiaojun Zeng. Twitter Mood Predicts the Stock Market.
[7] Ian Saxon. Intrade Prediction Market Accuracy and Efficiency: An Analysis of the 2004 and 2008 Democratic Presidential Nomination Contests.
[8] Shyam Gopinath, Pradeep K. Chintagunta, and Sriram Venkataraman. Blogs and Local-market Movie Box-office Performance.
[9] Eugene F. Fama. Market Efficiency, Long-Term Returns, and Behavioral Finance.
[10] Allan Timmermann and Clive W.J. Granger. Efficient Market Hypothesis and Forecasting.
[11] Eugene F. Fama. Efficient Capital Markets: II.

Websites
[12] HSX.com
[13] Hootsuite.com
[14] Boxofficemojo.com
[15] Rottentomatoes.com
