The Hollywood Stock Exchange: Efficiency and The Power of Twitter

by Nathaniel Harley

A special thanks to Professor Richard Walker for advising on this thesis. Also, thanks to Professor Joseph Ferrie, Sarah Ferrer, and the MMSS department.




Abstract  

Online prediction markets are becoming increasingly popular and useful for forecasting real-world events. The Hollywood Stock Exchange is one of the most successful online prediction markets and forecasts real-world box-office returns. This thesis asks whether The Hollywood Stock Exchange is an efficient market and, if it is not, what factors can be used to predict future changes in MovieStock prices. Most importantly, it focuses on the usefulness of social media, specifically Twitter, in predicting future changes in these prices.

Introduction  

A market is a place where individuals can exchange items. Prices assign values to these items so that buyers and sellers can easily trade them. Embedded in these prices is a large amount of information that reflects the collective opinion of informed and uninformed traders.

The two main types of markets are financial markets and prediction markets. We are all familiar with financial markets, such as stock markets, bond markets, futures markets, commodity markets, currency markets, and money markets. Depending on the type of market, the price of an asset can mean different things. On a stock market, such as the New York Stock Exchange, the price of common stock represents how much an individual is willing to pay for one share of a specific company. On a futures market, the price represents a forecast of what the underlying asset will cost in the future. In a prediction market, the price of an asset indicates the likelihood of an event occurring.

Prediction markets are slowly becoming more popular and are being used as an informational resource to predict events. Some prediction markets, such as Intrade Prediction Market, forecast the likelihood of political events. Others, such as The Hollywood Stock Exchange, trade prediction shares of movies, actors, and other film-related options. As more and more prediction markets expand onto the electronic platform, individuals have more access to trade on these markets.

The question driving my thesis is: if The Hollywood Stock Exchange is not an efficient market, what information can people use to predict future changes in stock prices? Recently, a lot of work has been done to capture social media data, such as Twitter data, and use it as a measurement to make quantitative predictions. Using Twitter data, along with other non-social-media variables, I test whether MovieStock prices can be predicted by Twitter information.

 

Prediction  Markets  

There are many barriers to establishing a new market, such as high costs, government regulation, and the threat of lawsuits; however, artificial online prediction markets do not face these barriers. Web market games are increasingly easy to create because they have small operating costs for setup, maintenance, advertising, searching, and transacting, and they benefit from a global group of Internet users. They do not need permission from government officials and do not need strict rules that would limit trading, because there is little risk of lawsuits against them. Users can remain anonymous, and record keeping does not need to be as tight. As a result, online markets such as The Hollywood Stock Exchange can exist and function effectively [1]. However, as Justin Wolfers and Eric Zitzewitz illustrate in their paper Five Open Questions About Prediction Markets, there are five open questions that must be answered in order for prediction markets to fulfill their potential and ultimately succeed.


The first question Wolfers and Zitzewitz pose is “how to attract uninformed traders?” (Wolfers and Zitzewitz, p. 2). Uninformed traders are important to any marketplace because they create an uninformed order flow, which in turn attracts informed, profit-motivated traders. In order to attract uninformed traders, it is essential to have low transaction costs as well as interest or buzz surrounding a new prediction market. The Hollywood Stock Exchange has successfully attracted both uninformed and informed traders by creating an attractive, easy-to-use platform and positioning itself as the premier box-office forecasting market available.

Their second question is “how to tradeoff interest and contractibility?” (Wolfers and Zitzewitz, p. 3). Wolfers and Zitzewitz conclude that it is important to establish clear guidelines outlining the contracts traded on prediction exchanges, but that there is a lot of leeway in doing so. The Hollywood Stock Exchange does not face some of the complexities that other prediction markets encounter. For example, a film’s box-office revenue and which actor won the Oscar for best actor leave little room for interpretation.

The third question they pose is “how to limit manipulation?” (Wolfers and Zitzewitz, p. 3). This question addresses two types of manipulation: first, manipulation of the outcomes on which the prediction markets are based and, second, manipulation of the market prices themselves. Theoretically, traders could buy a large number of tickets for a specific movie in order to increase its box-office revenue, but practically they would never do so, because it would have little impact on national box-office revenue and the cost of the tickets greatly outweighs the potential profit, since the exchange uses Hollywood Dollars and not actual money. In terms of price manipulation, buying a large amount of a specific asset would have no effect on the real-world outcome, so it would be pointless.

The fourth question Wolfers and Zitzewitz ask is “are markets well calibrated on small probabilities?” (Wolfers and Zitzewitz, p. 3). They conclude that declining transaction costs and carefully framed contracts will produce more accurate responses from traders, who are inherently bad at distinguishing small probabilities and tend to overvalue unlikely events. Although traders on The Hollywood Stock Exchange are subject to this bias, it is important to realize that this behavior exists not only in prediction markets but also in real-world markets.

The final question Wolfers and Zitzewitz pose is “how to separate correlation from causation?” (Wolfers and Zitzewitz, p. 3). The assets traded on The Hollywood Stock Exchange, such as MovieStocks, are directly correlated with real-world events, and the outcome of one event in the movie world does not cause the probability of another event to change [2]. Therefore, the problem of distinguishing correlation from causation does not arise.

Examining these five open questions proposed by Wolfers and Zitzewitz makes it clear that The Hollywood Stock Exchange has all the elements needed to be a successful prediction market and to function effectively. However, this does not guarantee that The Hollywood Stock Exchange is an efficient market.

 

Efficient  Markets  

If markets are, in fact, efficient, the market price of an asset is the best estimate of its value; however, if markets are not efficient, the market price may deviate from the true value. That being said, market efficiency does not require that the market price equal the true value at every point in time, only that any deviation from the true value be random. There are three main forms of market efficiency: Weak Form, Semi-Strong Form, and Strong Form [3].

o Weak Form – Future changes in prices are not predictable from the information contained in past prices, which suggests that analysis of past prices alone would not help in identifying undervalued or overvalued assets.

o Semi-Strong Form – Future changes in prices are not predictable from past prices or from any currently available public information (including prices, economic variables, etc.).

o Strong Form – Prices fully reflect all available information, public and private, so informed experts cannot consistently outperform uninformed traders.
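One simple way to probe the weak form is to check whether past price changes help predict the next one. The sketch below is illustrative only and is not the test used in this thesis: it computes the lag-1 autocorrelation of successive price changes, and the function name and sample prices are hypothetical. A value near zero is consistent with weak-form efficiency, while a value far from zero suggests that past changes carry predictive information.

```python
def lag1_autocorrelation(prices):
    """Lag-1 autocorrelation of successive price changes (hypothetical helper)."""
    changes = [b - a for a, b in zip(prices, prices[1:])]
    if len(changes) < 3:
        raise ValueError("need at least four prices")
    mean = sum(changes) / len(changes)
    deviations = [c - mean for c in changes]
    denominator = sum(d * d for d in deviations)
    if denominator == 0:
        raise ValueError("price changes are constant")
    # Pair each change with the change that follows it.
    numerator = sum(x * y for x, y in zip(deviations, deviations[1:]))
    return numerator / denominator


# Made-up weekly prices in H$, for illustration only.
example_prices = [40.0, 42.0, 41.0, 43.0, 42.0, 44.0]
print(round(lag1_autocorrelation(example_prices), 3))
```

In this made-up series the changes alternate in sign, so the statistic is strongly negative; a real test would also need a significance threshold, which is left out here.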

 

The  Hollywood  Stock  Exchange  

The Hollywood Stock Exchange is an online artificial prediction market game. Participants can buy and sell virtual shares of celebrities and movies with a currency called the Hollywood Dollar (H$). New users can join for free, and when they do, they receive H$2 million. They can trade various assets such as MovieStocks, StarBonds, TVStocks, MovieFunds, and Derivatives. The Hollywood Stock Exchange then syndicates the data collected from the Exchange and sells it as market research to entertainment companies, consumer product companies, and financial institutions.

For the purpose of this thesis, I will focus on MovieStocks and whether it is possible to predict future changes in MovieStock prices. MovieStocks represent films both still in production and currently in theaters. The price of a MovieStock reflects how much money traders think the film will make, with each $1 million earned domestically equal to H$1. The price of a MovieStock is adjusted to reflect its actual box-office earnings. The price begins to adjust after the movie’s opening weekend in order to bring the expected box-office gross revenue in line with the actual box-office gross revenue. For example, if Movie A grosses $20 million in its first week in theaters, then the price after the first week would be something like H$45 on the exchange. However, if Movie A grossed only $3 million in the second week, then the price of Movie A would most likely drop drastically, to something like H$28. On average, a film makes 2.7 times its opening weekend box-office during its first four weeks of wide release. A MovieStock delists and cashes out from the market on the first business day after its fourth weekend of wide release or after 12 weeks of limited release. The driving question behind this thesis is whether or not The Hollywood Stock Exchange is an efficient market, specifically looking at MovieStocks, and if it is not, what factors can be used to predict future changes in MovieStock prices.
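The pricing convention described above can be illustrated with a short sketch. It assumes, based only on the figures quoted in this section, that H$1 corresponds to $1 million of domestic gross and that the post-opening adjustment sets the price to about 2.7 times the opening weekend gross. The function names are hypothetical, and the exchange's actual adjustment procedure may differ.

```python
OPENING_MULTIPLIER = 2.7  # average four-week gross relative to opening weekend, per the text


def gross_to_hsx_price(gross_dollars):
    """Convert a domestic gross in US dollars to H$ (H$1 per $1 million)."""
    return gross_dollars / 1_000_000


def adjusted_price_after_opening(opening_weekend_gross_dollars):
    """Estimate the adjusted MovieStock price from the opening weekend gross."""
    return OPENING_MULTIPLIER * gross_to_hsx_price(opening_weekend_gross_dollars)


# A hypothetical film that opens with $10 million over its first weekend.
print(adjusted_price_after_opening(10_000_000))
```

Under these assumptions, a trader who expects a larger or smaller opening weekend than the market does would expect the adjusted price to land above or below the current price.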

 

 

 


Twitter  

Social media is quickly changing the social landscape because it is easy to use and reaches a global audience extremely quickly. As a result, social media is setting trends in topics ranging from politics to technology to the entertainment industry. Social media can be a very powerful tool, and the question becomes whether it is possible to aggregate social media and use it as a measurement to gauge collective opinion.

We all know Twitter. Twitter is essentially a real-time information network that connects you to the latest stories, ideas, opinions, and news. Tweets are only 140 characters long, but they can be very powerful. Twitter uses the # symbol, called a "hash-tag", to mark keywords or topics in a Tweet. Interestingly enough, Twitter users created it organically as a way to categorize messages. People use the hash-tag symbol before relevant keywords or phrases in their Tweets to categorize those Tweets and help them show up more easily in Twitter Search. Hash-tagged words that become very popular are often referred to as “trending topics.” As a result, Twitter has a lot of power because it can identify important topics and also the sentiment surrounding those topics. For example, if a keyword is being used a lot, we can conclude that many people find it important. Looking further, analyzing the individual tweets can help us identify whether people feel positively or negatively. Using this, we can create a measurement of collective opinion and use it to make quantitative predictions.

Many people have already started to use Twitter to build forecasting models. Specifically, they look at the sentiment of Tweets and use it to gauge collective opinion. For example, Duncan Watts, an Internet researcher who heads one of Yahoo!’s research labs in New York, uses Twitter to forecast video-game and music sales. He found that adding Twitter data greatly increased the accuracy of his forecasting model. Similarly, Derwent Capital Markets, a hedge fund based in London, uses a Twitter model to help guide its investments [4].

 

Related  Studies  

When looking at previous studies, I came across a few that were very influential in shaping how I tested The Hollywood Stock Exchange for efficiency. In The Power of Play: Efficiency and Forecast Accuracy in Web Market Games, David M. Pennock et al. analyzed the efficiency and forecast accuracy of two market games: The Hollywood Stock Exchange and the Foresight Exchange. For the purpose of this thesis, I will focus on their results regarding The Hollywood Stock Exchange. Their paper focused on the question of whether efficiency breaks down in artificial markets when there is no monetary incentive. The goal of their research was to test whether The Hollywood Stock Exchange satisfies two types of market efficiency: internal coherence and strong form. They presented evidence that some market simulations can act sufficiently well as both aggregators and disseminators of information. In conclusion, they found that The Hollywood Stock Exchange MovieStock prices were good indicators of which movies will do well at the box office.

First, it is important to understand what internal coherence is. Prices are internally coherent when they are self-consistent or arbitrage-free: no trader can make a sure profit without any risk. In efficient markets, arbitrage should not exist. For example, arbitrage exists when you can buy a security on one exchange, such as The New York Stock Exchange, for a certain price and then sell the same security on the Tokyo Exchange for a higher price. The security should have the same price on both exchanges. Another example involves a pair of complementary securities. Take, for instance, a security that pays $1 if and only if it rains tomorrow. If another security existed that pays $1 if and only if it does not rain tomorrow, then owning both securities would guarantee a $1 payoff. In order for there to be no arbitrage opportunity, the price to buy both securities should always be greater than $1 and the price to sell both securities should always be less than $1.
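The rain example above can be turned into a small check. This is a minimal sketch, assuming each security has a separate buy (ask) and sell (bid) price and ignoring transaction costs; the function name and all prices are hypothetical.

```python
def find_arbitrage(ask_rain, ask_no_rain, bid_rain, bid_no_rain, payoff=1.0):
    """Return 'buy both', 'sell both', or None for two complementary securities.

    ask_* is the price to buy one security and bid_* is the price to sell it.
    Exactly one of the two securities pays `payoff` tomorrow.
    """
    if ask_rain + ask_no_rain < payoff:
        # Buying both costs less than the guaranteed payoff.
        return "buy both"
    if bid_rain + bid_no_rain > payoff:
        # Selling both brings in more than the single payoff owed.
        return "sell both"
    return None


print(find_arbitrage(0.55, 0.40, 0.50, 0.35))  # buying both costs 0.95
print(find_arbitrage(0.60, 0.45, 0.58, 0.44))  # selling both brings in 1.02
print(find_arbitrage(0.60, 0.45, 0.55, 0.40))  # prices are coherent
```

Prices are internally coherent in this sense when the function returns None for every such pair.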

One of the driving questions behind their study was: “do HSX players have utility for Hollywood dollars and, if so, are their resulting incentives strong enough to maintain internal price consistency in the game?” (Pennock, p. 7). In order to determine the degree of internal coherence in MovieStocks, they tested how closely MovieStock and option prices conformed to put-call parity. In conducting their experiment, they used weekend halt prices (the price before the MovieStock adjusts to approximately 2.7 times the opening weekend box-office proceeds) for 75 MovieStocks and their corresponding options during the period of March 3, 2000 to September 1, 2000. They found that the stock estimates of weekend box-office returns and the option estimates adhered relatively closely to put-call parity at the halt price. They then tested whether prices adhered to put-call parity at all times, not just at the halt price. Their results indicated that The Hollywood Stock Exchange market was not completely free of arbitrage, because prices diverged at times from parity by as much as H$6.5. When examining whether the market showed signs of internal coherence, they concluded that it did, because when prices were too high, they were much more likely to correct and go down on subsequent days. Similarly, when they were too low, they were more likely to increase. They hypothesized that this price self-correction could be attributed to traders taking advantage of arbitrage opportunities.
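Put-call parity ties the prices of a call and a put with the same strike to the price of the underlying. As a rough sketch, and assuming the textbook form with no discounting, C - P = S - K, a deviation from parity in H$ could be measured as below. The function names, the tolerance parameter, and the sample prices are hypothetical, and the exact form of parity that Pennock et al. applied to HSX prices is not reproduced here.

```python
def parity_deviation(call_price, put_price, underlying_estimate, strike):
    """Deviation from textbook put-call parity with no discounting.

    Parity states call - put = underlying - strike, so a result of zero
    means the four prices are mutually consistent.
    """
    return (call_price - put_price) - (underlying_estimate - strike)


def violates_parity(call_price, put_price, underlying_estimate, strike, tolerance=0.0):
    """True when the absolute deviation exceeds the allowed tolerance."""
    deviation = parity_deviation(call_price, put_price, underlying_estimate, strike)
    return abs(deviation) > tolerance


# Made-up H$ prices for one film's call, put, underlying estimate, and strike.
print(parity_deviation(5.0, 2.0, 23.0, 20.0))
print(violates_parity(8.0, 1.5, 23.0, 20.0, tolerance=0.5))
```

A nonzero tolerance is a design choice that allows for trading frictions, so that only deviations large enough to be worth exploiting are flagged.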

Pennock et al. also wanted to test the forecast accuracy of The Hollywood Stock Exchange and whether MovieStocks were good predictors of box-office returns. In order to understand their process and results, it is important to understand Rational Expectations Theory: prices are not only coherent, but also reflect the sum total of all information available to all market participants. Essentially, Rational Expectations Theory states that even when some individuals have insider information, prices equilibrate as if everyone had access to all the same information. They wanted to go further and test whether strong form efficiency holds in The Hollywood Stock Exchange, because internal coherence is only a minimal standard of market efficiency, whereas stronger forms of efficiency imply “market competence as well as coherence: prices actually reflect an aggregation of information distributed among the participants, and market forecasts are as accurate as expert assessments” (Pennock, p. 11). Ultimately, they proposed that if strong form efficiency holds for The Hollywood Stock Exchange, then the implications would be more relevant to society, in the form of “cheap and reliable forecasts” (Pennock, p. 11). In order to test strong form efficiency, they assessed the forecast accuracy of Hollywood Stock Exchange MovieStocks. Pennock et al. compared MovieStock prices (The Hollywood Stock Exchange prediction) to Brandon Gray’s published forecasts at Box Office Mojo for 50 movies appearing in both sources. Their results showed a slight bias to underprice the best-performing movies and overprice the worst-performing movies. They attributed this bias to risk-seeking behavior, in which traders preferred potential “sleepers” with the opportunity for a very large payoff over known quantities with a moderate payoff. They also found a correlation between MovieStock estimates and Box Office Mojo estimates, which they hypothesized arose because Box Office Mojo observes Hollywood Stock Exchange prices, and/or some Hollywood Stock Exchange traders read Box Office Mojo forecasts.

Ultimately, they concluded that The Hollywood Stock Exchange showed signs of efficiency, in the form of price coherence and forecast accuracy. They deduced that The Hollywood Stock Exchange is a good forecaster of box-office returns and provides a reasonable likelihood assessment of uncertain events (the final four-week box-office returns). The implications that Pennock et al. derived from their study were that existing artificial markets, like The Hollywood Stock Exchange, could be a valuable resource for information. Also, The Hollywood Stock Exchange provides a good example of a successful artificial market and should promote the creation of similar markets in the future [1].

In Predicting the Future with Social Media, Sitaram Asur and Bernardo A. Huberman demonstrated how social media content could be used to predict real-world outcomes. In particular, they used the chatter from Twitter.com to forecast box-office revenues for movies. They showed how a simple model built from the rate at which tweets were created about particular topics could outperform market-based predictors. Furthermore, they demonstrated how sentiments extracted from Twitter could be further utilized to improve the forecasting power of social media.

Social media has the ability to aggregate opinions and act as a form of “collective wisdom” that can be used to make “quantitative predictions that outperform those of artificial markets” (Asur & Huberman, p. 1). Their goal was to assess how buzz and attention were created for different movies and how they changed over time. They also focused on the mechanism of viral marketing and pre-release hype on Twitter, and on the role that attention played in forecasting real-world box-office performance. Finally, they examined how sentiments were created and how positive and negative opinions influenced people.

Their hypothesis was that “movies that are well talked about will be well-watched” (Asur & Huberman, p. 1).

Asur and Huberman wanted to look at how attention and popularity were generated for movies on Twitter, and what effects this had on box-office performance for the movies considered. Their results indicated that movies that had greater publicity, in terms of linked URLs on Twitter, did not necessarily perform better at the box office. Their initial analysis of tweet rates (defined as tweets for a movie per hour) showed a positive correlation with box-office performance. When they compared their results to The Hollywood Stock Exchange index, they found that their model outperformed the Hollywood Stock Exchange-based model in predicting actual box-office returns.


They then tested whether they could predict the price of The Hollywood Stock Exchange MovieStock at the end of the opening weekend for the movies they considered. In order to do so, they used historical Hollywood Stock Exchange prices as well as the tweet rates for the week prior to the release as predictive variables. They created a simple time series regression of tweet rates, over the 7 days before the weekend, to predict the box-office revenue for a particular weekend. Again, they found that their tweet-rate model was better at predicting the actual values than the historical Hollywood Stock Exchange prices. Their results showed how Twitter could be used as an accurate indicator of future outcomes.
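To make the idea of a tweet-rate regression concrete, the sketch below fits an ordinary least squares line with a single predictor. It is a simplified illustration and not Asur and Huberman's model, which the text describes as using tweet rates over the seven days before the weekend; the function name and all numbers are made up.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values are all identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up data: average tweets per hour before release, and
# opening weekend gross in millions of dollars.
tweet_rates = [100.0, 200.0, 300.0, 400.0]
revenues = [12.0, 22.0, 32.0, 42.0]

intercept, slope = fit_line(tweet_rates, revenues)
predicted = intercept + slope * 250.0  # forecast for a new film at 250 tweets/hour
print(round(intercept, 3), round(slope, 3), round(predicted, 3))
```

Comparing such forecasts with those implied by MovieStock prices, for example by their average error against actual revenue, is one way to judge which predictor is more accurate.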

Asur and Huberman also wanted to see if the sentiment of Tweets could increase forecasting accuracy. They classified tweets as Positive, Negative, or Neutral. Their results indicated that tweets after the release had more value than tweets before, which coincided with their expectation that people would give more weight to a tweet after they had seen the movie. They also found that there were more positive sentiments than negative for almost all of the movies. They concluded that adding Twitter sentiment to the equation did not significantly increase the predictive power of tweets themselves.

In conclusion, they found that social media feeds could be an effective indicator of real-world performance. Specifically, the rate at which movie Tweets were generated could be used to build a powerful model for predicting movie box-office revenue. They showed that their predictions were more accurate than The Hollywood Stock Exchange prices. Finally, the sentiment of tweets could improve box-office revenue predictions based on tweet rates, but only after the movies were released [5].

In Twitter Mood Predicts the Stock Market, Johan Bollen et al. asked whether societies can experience mood states that affect their collective decision-making and, by extension, whether the public mood is correlated with or even predictive of economic indicators. They investigated whether measurements of collective mood states obtained from Twitter feeds were correlated with the value of the Dow Jones Industrial Average over time. In their study, they analyzed the text of Tweets using two mood-tracking tools: OpinionFinder (which measures positive vs. negative mood) and Google-Profile of Mood States (which measures mood in terms of 6 dimensions: Calm, Alert, Sure, Vital, Kind, and Happy).

Their results showed that changes in the public mood state could be tracked from the content of large-scale Twitter feeds using text processing techniques and that “such changes respond to a variety of socio-cultural drivers in a highly differentiated manner” (Bollen, p. 7). They also found that including specific public mood dimensions, but not others, could significantly improve the accuracy of Dow Jones Industrial Average predictions. They found that the calmness of the public, rather than the general level of positive sentiment as measured by OpinionFinder, was predictive of the Dow Jones Industrial Average [6].

These three studies helped shape how I designed my own study of MovieStock prices in relation to Twitter.

 

 


Data  Summary  

Before  collecting  my  movie  data,  I  needed  to  establish  consistent  criteria  so  

that  all  movies  in  the  data  set  would  share  similar  properties.  First  I  collected  data  

for the top 2-3 grossing movies that opened in wide release (so that every movie in my

data  set  would  delist  after  four  weeks)  for  each  week  over  the  time  period  of  

September  2011  to  December  2011.  This  gave  me  a  data  set  of  26  movies.  For  every  

movie,  I  collected  the  release  date  MovieStock  price  and  then  the  price  at  the  end  of  

each  week,  up  until  the  delist  date.  This  gave  me  five  data  points  for  each  movie.  

Ultimately,  I  only  used  the  end  of  week  stock  price  for  my  regression  analysis  and  

dropped  the  release  date  stock  price  [12].  

In order to capture Twitter data, I used a program called Hootsuite. First, I tracked how many times a movie title was mentioned as a keyword on Twitter over the four-week period. The keyword analysis performed by Hootsuite gave me daily Twitter hits for each keyword. In order to make my Twitter data line up with the MovieStock price data, I summed the daily Twitter hits to calculate a weekly Twitter hit total for each week. This gave me four data points for each movie: Week 1 Twitter total, Week 2 Twitter total, Week 3 Twitter total, and Week 4 Twitter total [13].

[Figure: # Weekly Twitter Hits (Thousands) by Week, Weeks 1 to 4]
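The daily-to-weekly aggregation described above can be sketched in a few lines of Python. This is only an illustration, assuming the daily counts for one movie are available as a list covering the four-week window; the function name `weekly_totals` and the sample counts are hypothetical and are neither Hootsuite output nor thesis data.

```python
from typing import List


def weekly_totals(daily_hits: List[int], days_per_week: int = 7) -> List[int]:
    """Sum consecutive blocks of daily Twitter hits into weekly totals."""
    if len(daily_hits) % days_per_week != 0:
        raise ValueError("daily_hits must cover a whole number of weeks")
    return [
        sum(daily_hits[start:start + days_per_week])
        for start in range(0, len(daily_hits), days_per_week)
    ]


# Hypothetical daily counts for one movie over its four-week window (28 days).
example_daily = [1000] * 7 + [600] * 7 + [300] * 7 + [100] * 7
example_weekly = weekly_totals(example_daily)  # one total per week
```

Each movie then contributes one weekly total per week, which lines up with the end-of-week MovieStock prices.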

I then calculated the log of the number of weekly Twitter hits in order to analyze the percent change from week to week. For the regression analysis, I wanted to determine whether a change in Twitter hits was more predictive than the total number of Twitter hits. Using the change would also allow me to control for movies that were more popular due to external factors, such as a higher budget or more proactive advertising, and as a result generated more discussion on Twitter [13].
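The link between logs and percent change can be made concrete: the difference between the logs of two weekly totals approximates the fractional change between them. The following is a minimal sketch under that reading; the helper names and the hit totals are hypothetical, not values from the data set.

```python
import math


def log_change(previous: float, current: float) -> float:
    """Difference of natural logs, an approximation of the fractional change."""
    if previous <= 0 or current <= 0:
        raise ValueError("log is only defined for positive counts")
    return math.log(current) - math.log(previous)


def pct_change(previous: float, current: float) -> float:
    """Exact fractional change from previous to current."""
    return (current - previous) / previous


# Hypothetical weekly hit totals, not thesis data.
week1_hits, week2_hits = 10_000, 11_000
approx = log_change(week1_hits, week2_hits)  # log difference
exact = pct_change(week1_hits, week2_hits)   # true fractional change
```

For small changes the two numbers are close. For large swings, such as a movie losing most of its Twitter volume in one week, the log difference and the percent change diverge noticeably.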

[Figure: Log # Weekly Twitter Hits by Week, Weeks 1 to 4]

In order to capture the sentiment of each Tweet, I used Hootsuite’s Twitter sentiment analytics, which capture the conversational tone of my keyword search. I was able to analyze Tweet sentiment for each week, which gave me a total of four data points. Hootsuite broke the data out into eight categories based on the sentiment of the tweet: affection friendliness, enjoyment elation, amusement excitement, contentment gratitude, sadness grief, anger loathing, fear uneasiness, and finally, humiliation and shame. The analysis gave me a percentage breakdown of the weekly Tweets for each category [13].

I considered affection friendliness, enjoyment elation, amusement excitement, and contentment gratitude as positive Twitter sentiment, and sadness grief, anger loathing, fear uneasiness, and humiliation and shame as negative Twitter sentiment. I summed the positive sentiment categories on a weekly basis in order to capture the percent of Twitter hits that were positive. This again gave me four data points for each movie [13].
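The grouping of the eight categories into a single positive share might look like the sketch below. It assumes the weekly breakdown is available as a mapping from category name to percentage; the function name, the dictionary layout, and the sample percentages are hypothetical and do not reflect Hootsuite's actual export format.

```python
from typing import Dict

POSITIVE_CATEGORIES = (
    "affection friendliness",
    "enjoyment elation",
    "amusement excitement",
    "contentment gratitude",
)
NEGATIVE_CATEGORIES = (
    "sadness grief",
    "anger loathing",
    "fear uneasiness",
    "humiliation and shame",
)


def positive_share(breakdown: Dict[str, float]) -> float:
    """Return the percent of weekly Twitter hits in the positive categories."""
    unknown = set(breakdown) - set(POSITIVE_CATEGORIES) - set(NEGATIVE_CATEGORIES)
    if unknown:
        raise ValueError(f"unexpected categories: {sorted(unknown)}")
    return sum(breakdown.get(name, 0.0) for name in POSITIVE_CATEGORIES)


# Hypothetical one-week percentage breakdown (sums to 100), not thesis data.
example_week = {
    "affection friendliness": 10.0,
    "enjoyment elation": 25.0,
    "amusement excitement": 20.0,
    "contentment gratitude": 5.0,
    "sadness grief": 15.0,
    "anger loathing": 10.0,
    "fear uneasiness": 10.0,
    "humiliation and shame": 5.0,
}
example_positive = positive_share(example_week)
```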

[Figure: % Twitter Hits Positive by Week, Weeks 1 to 4]

I then calculated the log of the percent of Twitter hits that were positive in order to capture the percent change from week to week. For the regression analysis, I wanted to determine whether a change in Twitter sentiment was more predictive than the level of sentiment [13].

 

I  then  used  BoxOfficeMojo.com,  a  movie  web  site  with  the  most  comprehensive  

box-­‐office  database,  to  capture  the  weekly  box-­‐office  returns  for  each  movie.  Also,  I  

captured  the  weekly  number  of  theaters  the  movie  was  released  in.  I  then  calculated  

the  log  of  both  weekly  box-­‐office  revenue  and  weekly  number  of  theaters  in  order  to  

capture  the  percent  change  from  week  to  week  [14].  

Finally, I used Rottentomatoes.com, a website devoted to film reviews, information, and news and widely known as a film review aggregator, to incorporate user ratings. I used a dummy variable (1 or 0) to indicate whether the movie had received a positive Rotten Tomatoes rating or a negative one (rotten) [15].

 

[Figure: Log % Twitter Hits Positive by Week, Weeks 1 to 4]

Graphs  

Before  creating  my  regression  equation,  I  graphed  the  relationship  between  

MovieStock  prices  and  a  few  key  variables.  I  used  the  log  of  MovieStock  price  

because  I  wanted  to  focus  on  the  percent  change  from  week  to  week.  

Looking  at  the  relationship  between  the  logStockPrice  and  

log#WeeklyTwitterHits,  there  appeared  to  be  some  positive  correlation  with  a  few  

outliers.  
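The visual impression from a scatter plot can also be checked numerically with a Pearson correlation coefficient. The sketch below is an assumption about how one might do this; the thesis itself reports only the graphs, and the sample values are made up for illustration.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length (at least 2 points)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical (log # weekly Twitter hits, log stock price) pairs, not thesis data.
example_log_hits = [8.0, 9.5, 10.2, 11.0, 12.3]
example_log_price = [3.1, 3.6, 4.4, 4.2, 5.5]
example_r = pearson_r(example_log_hits, example_log_price)
```

A value near +1 or -1 would indicate a strong linear relationship, while a value near 0 would match a scatter that looks random.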

 

 

   

   

 

 

[Figure: Correlation Between Change in Stock Price and Change in # Twitter Hits. x-axis: Log # Weekly Twitter Hits; y-axis: Log Stock Price]

The relationship between logStockPrice and log%TwitterHitsPositive did not

appear to have a clear correlation and appeared to be random at first glance.

 

 

 

 

 

 

 

 

 

 

[Figure: Correlation Between Change in Stock Price and Change in % Twitter Hits Positive. x-axis: Log % Twitter Hits Pos.; y-axis: Log Stock Price]

Initially,  it  looked  like  there  was  a  clear  correlation  between  logStockPrice  

and  logWeeklyBoxOffice.  This  would  be  expected  given  that  the  MovieStock  price  is  

a  forecast  of  actual  box-­‐office  revenue.  

 

 

 

 

 

 

[Figure: Correlation Between Change in Stock Price and Change in Weekly Box Office Revenue. x-axis: Log Weekly Box Office Revenue; y-axis: Log Stock Price]

Finally,  I  looked  at  the  relationship  between  logStockPrice  and  

logWeeklyTheaters.  Based  on  the  relationship  pictured  in  the  graph,  it  was  hard  to  

conclude  that  there  was  a  strong  positive  correlation.  I  would  expect  that  a  positive  

change  in  the  number  of  theaters  a  movie  was  released  in  would  be  positively  

correlated with the MovieStock price, because an increase in the number of

theaters most likely indicates that people were going to see the movie.

 

 

 

 

 

 

[Figure: Correlation Between Change in Stock Price and Change in Weekly # Theaters. x-axis: Log Weekly Theaters; y-axis: Log Stock Price]

Regression  Equation  

Using the data described above, I was able to create a comprehensive panel data set. It was important to use panel data rather than an ordinary linear regression because I wanted not only to see how each weekly change in MovieStock price was affected by the corresponding week’s data, but also to incorporate time series variables to test whether previous weeks had an effect on the current week. As a result, I was able to create a regression equation that tested whether or not the change in MovieStock price could be determined by specific independent variables.

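\[
\begin{aligned}
\text{logStockPrice} ={}& \text{logStockPrice}_{x-1} + \text{logStockPrice}_{x-2} + \text{ReleaseDateTwitterHits} \\
&+ \text{WeekTwitterHits} + \text{WeekTwitterHits}_{x-1} + \text{WeekTwitterHits}_{x-2} \\
&+ \text{logWeekTwitterHits} + \text{logWeekTwitterHits}_{x-1} + \text{logWeekTwitterHits}_{x-2} \\
&+ \text{TwitterHitsPositive} + \text{TwitterHitsPositive}_{x-1} + \text{TwitterHitsPositive}_{x-2} \\
&+ \text{logTwitterHitsPositive} + \text{logTwitterHitsPositive}_{x-1} + \text{logTwitterHitsPositive}_{x-2} \\
&+ \text{logWeekBoxOffice} + \text{logWeekBoxOffice}_{x-1} + \text{logWeekBoxOffice}_{x-2} \\
&+ \text{WeekTheater} + \text{WeekTheater}_{x-1} + \text{WeekTheater}_{x-2} \\
&+ \text{logWeekTheater} + \text{logWeekTheater}_{x-1} + \text{logWeekTheater}_{x-2} \\
&+ \text{RottenTomatoes}
\end{aligned}
\]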
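The lagged terms (subscripts x-1 and x-2) require each movie's weekly series to be shifted by one and two weeks within that movie only. The sketch below illustrates one way to build such rows; it is an assumption about the data preparation, not the procedure actually used, and the function name, column names, and sample values are hypothetical.

```python
from typing import Dict, List, Optional


def add_lags(series_by_movie: Dict[str, List[float]], max_lag: int = 2) -> List[dict]:
    """Build one row per movie-week with lagged copies of the weekly value."""
    rows = []
    for movie, values in series_by_movie.items():
        for week, value in enumerate(values, start=1):
            row = {"movie": movie, "week": week, "x": value}
            for lag in range(1, max_lag + 1):
                idx = week - 1 - lag  # zero-based position of the lagged week
                lagged: Optional[float] = values[idx] if idx >= 0 else None
                row[f"x_lag{lag}"] = lagged  # None when no earlier week exists
            rows.append(row)
    return rows


# Hypothetical weekly log prices for one movie, not thesis data.
example_rows = add_lags({"Movie A": [4.0, 3.5, 3.0, 2.5]})
```

In this sketch, lags never cross from one movie to another, and the first weeks of each movie have missing lag values, so only later weeks supply complete rows.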

Results  and  Discussion  

logStockPrice                    Coef.  Std. Err.      z  P>|z|  [95% Conf. Interval]

logStockPrice(x-1)           0.8913884  0.0934942   9.53  0.000  0.7081431  1.0746340
logStockPrice(x-2)           0.0481264  0.1215238   0.40  0.692  -0.190056  0.2863087
ReleaseDateTwitterHits       -0.000006  0.0000032  -1.92  0.055  -0.000012  0.0000001
WeekTwitterHits              0.0000005  0.0000011   0.46  0.646  -0.000002  0.0000026
WeekTwitterHits(x-1)         -0.000001  0.0000012  -0.59  0.554  -0.000003  0.0000016
WeekTwitterHits(x-2)         -0.000000  0.0000003  -0.25  0.800  -0.000001  0.0000005
logWeekTwitterHits           0.0156865  0.0073672   2.13  0.033  0.0012470  0.0301259
logWeekTwitterHits(x-1)      0.0166760  0.0074143   2.25  0.025  0.0021442  0.0312078
logWeekTwitterHits(x-2)      -0.007142  0.0084267  -0.85  0.397  -0.023658  0.0093739
TwitterHitsPositive          0.0245292  0.0125585   1.95  0.051  -0.000085  0.0491435
TwitterHitsPositive(x-1)     -0.005177  0.0111295  -0.47  0.642  -0.026990  0.0166361
TwitterHitsPositive(x-2)     -0.022194  0.0183033  -1.21  0.225  -0.058067  0.0136798
logTwitterHitsPositive       -1.467054  0.7915974  -1.85  0.064  -3.018557  0.0844481
logTwitterHitsPositive(x-1)  0.0808984  0.7752506   0.10  0.917  -1.438565  1.6003620
logTwitterHitsPositive(x-2)  1.4688500  1.3364470   1.10  0.272  -1.150537  4.0882380
logWeekBoxOffice             0.0007364  0.0175517   0.04  0.967  -0.033664  0.0351371
logWeekBoxOffice(x-1)        0.0014782  0.0206449   0.07  0.943  -0.038985  0.0419414
logWeekBoxOffice(x-2)        0.0930352  0.0613214   1.52  0.129  -0.027152  0.2132229
WeekTheater                  -0.000000  0.0000341  -0.01  0.991  -0.000067  0.0000665
WeekTheater(x-1)             -0.000058  0.0000854  -0.68  0.496  -0.000225  0.0001092
WeekTheater(x-2)             -0.000162  0.0001541  -1.05  0.292  -0.000464  0.0001397
logWeekTheater               -0.003385  0.0407300  -0.08  0.934  -0.083214  0.0764439
logWeekTheater(x-1)          0.0307020  0.1550999   0.20  0.843  -0.273288  0.3346922
logWeekTheater(x-2)          0.4900553  0.3653695   1.34  0.180  -0.226055  1.2061660
RottenTomatoes               0.0353428  0.0200099   1.77  0.077  -0.003875  0.0745616
Constant                     -3.843140  2.0975930  -1.83  0.067  -7.954347  0.2680673

R2         Value
Within     0.4624
Between    0.9994
Overall    0.9982

The variables highlighted in green are all significant with a z-statistic whose absolute value is greater than or equal to 1.96, indicating a 95% confidence level. The variables highlighted in yellow are all significant with a z-statistic whose absolute value is greater than or equal to 1.645, indicating a 90% confidence level. The R-squared “between” is equal to 0.9994, which is very high; however, the R-squared “within,” which is the R-squared for a fixed-effects regression, is much lower at 0.4624. Since I used a random-effects model, the R-squared “between” is the relevant number. One reason the R-squared is so high could be the large number of independent variables in the regression.

The most significant variables are logStockPrice lagged one week, Release Date Twitter Hits, Log Week Twitter Hits, Log Week Twitter Hits lagged one week, and % Weekly Twitter Hits Positive. If The Hollywood Stock Exchange were a completely efficient market, then past prices would have no correlation with current prices; however, this is not the case: the previous week’s change in MovieStock price has a direct correlation with the current week’s change in MovieStock price.

We would also expect that a positive change in Twitter hits would indicate a positive change in MovieStock prices. It is interesting that a change in Twitter hits one week prior also indicates a positive change in MovieStock price. This relationship suggests a momentum effect: if a movie generates a lot of buzz on Twitter, more people will go to see it and talk about it. We would likewise expect an increase in the percent of weekly Twitter hits that are positive to have a direct correlation with an increase in MovieStock price. If individuals are feeling positive about the movie, and the collective opinion is increasingly positive, then people will recommend the movie, and more people will go to see it.

The most interesting finding, however, is that release date Twitter hits are inversely correlated with MovieStock price. We would expect the opposite, especially since the change in weekly Twitter hits is positively correlated. One possible explanation could be that people who Tweet on the day a movie is released are complaining about the movie and giving it bad reviews. The coefficient is so small, though (about -0.000006), that it almost seems negligible even though the variable is considered significant.
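The significance thresholds used above can be reproduced from the table: the z-statistic is the coefficient divided by its standard error, and the two-sided p-value follows from the standard normal distribution. The sketch below checks one row under that assumption; the function names are hypothetical, and only the coefficient and standard error for logWeekTwitterHits are taken from the table.

```python
import math


def z_statistic(coef: float, std_err: float) -> float:
    """z-statistic: coefficient divided by its standard error."""
    return coef / std_err


def two_sided_p(z: float) -> float:
    """Two-sided p-value under a standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def confidence_band(z: float) -> str:
    """Classify a z-statistic by the 1.96 and 1.645 critical values."""
    if abs(z) >= 1.96:
        return "95%"
    if abs(z) >= 1.645:
        return "90%"
    return "not significant"


# Coefficient and standard error reported for logWeekTwitterHits.
z = z_statistic(0.0156865, 0.0073672)
p = two_sided_p(z)
band = confidence_band(z)
```

The same check can be run on any other row of the table by substituting its coefficient and standard error.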


 

 

Conclusion  and  Looking  Further  

Overall,  the  results  show  that  Twitter  can  provide  some  indication  of  when  

the  MovieStock  price  will  increase  or  decrease;  however,  it  is  hard  to  determine  

exactly  how  accurate  this  relationship  is.  It  would  also  appear  that  The  Hollywood  

Stock  Exchange  is  not  a  completely  efficient  market  despite  successfully  operating  

as  an  online  market  game.  Traders  could  use  information  from  Twitter  to  help  them  

predict  how  MovieStocks  will  perform  in  the  future  and  potentially  exploit  this  

information  to  make  excess  returns.  

Ideally, I would have liked to capture data for more movies in order to get a more comprehensive data set. Also, with more resources, the collection of Twitter data could have been more comprehensive, and I would not have relied solely on Hootsuite as my main form of collection. It is questionable how accurate Hootsuite’s method of capturing the number of keyword mentions was. There are also some problems with how Hootsuite’s Twitter sentiment analysis assigned the different categories. For example, the movie Killer Elite had an extremely high percentage in the “fear uneasiness” category, probably because the word “Killer” in the movie title was assigned to sentiments of fear. In order to get a more comprehensive and accurate data set, every individual tweet would need to be analyzed, but clearly this process is too arduous for one person.

When thinking about the effects of Twitter, other questions arise. Is there a threshold effect for movies, meaning that after a certain amount of “chatter” on Twitter, the power of Twitter becomes less significant? Also, how important are the number of Tweets and the change in the number of Tweets prior to the release of the movie? It would be interesting to try to predict how well a movie would do at the box office on its opening weekend.

I would also have liked to break down the Twitter data and box-office returns by geographical region in the US to see if certain regions have more predictive power than others. For example, if more people in LA are talking about a movie on Twitter, does that have implications for how well the movie performs just in LA, or, because LA is a central city in the movie industry, does it have implications for national box-office revenue? Also, it would be interesting to compare different major cities, such as LA, New York, and Chicago, to test whether one city has more influence and predictive power than another.

In  conclusion,  it  does  appear  that  Twitter  has  some  effect  on  MovieStock  prices  

and  in  turn,  some  predictive  power  in  determining  real  world  box-­‐office  returns;  

however,  it  is  unclear  to  what  extent.  In  order  to  predict  future  changes  in  

MovieStock  price,  one  could  use  information  they  collect  from  Twitter,  but  based  on  

these  results,  it  cannot  be  definitively  determined  how  accurate  such  analysis  would  

be.    

 

 

 

 

 


References  

[1] David M. Pennock, Steve Lawrence, C. Lee Giles, and Finn Arup Nielsen. The Power of Play: Efficiency and Forecast Accuracy in Web Market Games.
[2] Justin Wolfers and Eric Zitzewitz. Five Open Questions About Prediction Markets.
[3] Aswath Damodaran. Market Efficiency: Definitions and Tests. http://www.e-m-h.org/Damo.pdf.
[4] The Economist. Can Twitter predict the future? Internet forecasting: Businesses are mining online messages to unearth consumers’ moods, and even make market predictions. http://www.economist.com/node/18750604.
[5] Sitaram Asur and Bernardo A. Huberman. Predicting the Future with Social Media.
[6] Johan Bollen, Huina Mao, and Xiaojun Zeng. Twitter Mood Predicts the Stock Market.
[7] Ian Saxon. Intrade Prediction Market Accuracy and Efficiency: An Analysis of the 2004 and 2008 Democratic Presidential Nomination Contests.
[8] Shyam Gopinath, Pradeep K. Chintagunta, and Sriram Venkataraman. Blogs and Local-market Movie Box-office Performance.
[9] Eugene F. Fama. Market Efficiency, Long-Term Returns, and Behavioral Finance.
[10] Allan Timmermann and Clive W.J. Granger. Efficient Market Hypothesis and Forecasting.
[11] Eugene F. Fama. Efficient Capital Markets: II.

Websites

[12] HSX.com
[13] Hootsuite.com
[14] Boxofficemojo.com
[15] Rottentomatoes.com