Stock Prediction

SOPS: Stock Prediction using Web Sentiment

Vivek Sehgal and Charles Song

Department of Computer Science
University of Maryland
College Park, Maryland, USA
{viveks, csfalcon}@cs.umd.edu

Abstract

Recently, the web has rapidly emerged as a great source of financial information, ranging from news articles to personal opinions. Data mining and analysis of such financial information can aid stock market predictions. Traditional approaches have usually relied on predictions based on the past performance of stocks. In this paper, we introduce a novel way to do stock market prediction based on the sentiments of web users. Our method involves scanning financial message boards and extracting the sentiments expressed by individual authors. The system then learns the correlation between the sentiments and the stock values. The learned model can then be used to make future predictions about stock values. In our experiments, we show that our method is able to predict the sentiment with high precision, and that stock performance and recent web sentiment are closely correlated.

1. Introduction

As web-based technologies continue to be embraced by the financial sector, an abundance of financial information is becoming available to investors. One of the popular forms of financial information is the message board; these websites have emerged as a major source for exchanging ideas and information. Message boards provide a platform for investors from across the spectrum to come together and share their opinions on companies and stocks. However, extracting good information from message boards is still difficult. The problem is that the good information is hidden within a vast amount of data, and it is nearly impossible for an investor to read all these websites and sort out the information. Therefore, software that can extract this information can help investors greatly.

The data contained in these websites are almost always unstructured. Unstructured data makes for an interesting yet challenging machine learning problem. On a message board, each data entry relates to a discussion of some stock. This can be visualized as temporal data where the frequency of words and topics changes with time. The opinions on a stock change both with time and with its performance on the stock exchanges. To put it more formally, there is a correlation between the sentiment and the performance of a stock. In a study done recently [4], email spam and blogs have been found to closely predict stock market behavior. Message board data poses a challenge to data mining; opinions on message boards can be bullish, bearish, spam, rumor, vitriolic or simply unrelated to the stock. We found that the number of useless messages surpasses the number of useful ones. Fortunately for us, the sheer number of messages allows a significant number of posts with relevant opinions to be analyzed, as long as we are able to filter out the noise.

In this paper, we introduce a novel way to do sentiment prediction using features extracted from the messages. As mentioned before, many of the sentiments extracted are irrelevant. To solve this problem, we develop a new measure known as "TrustValue", which assigns trust to each message based on its author. This method rewards those authors who write relevant information and whose sentiments closely follow stock performance. The sentiments, along with their "TrustValue", are then used to learn their relation to the stock behavior. In our experiments, we find that our hypothesis, that stock values and sentiments are correlated, holds for many message boards.

2. Related Work

In previous work on stock market predictions, [8] analyzed how daily variations in financial message counts affect volumes in the stock market. [5] used computer algorithms to identify news stories that influence markets, and then traded successfully on this information. Both ran real market simulations to test the efficacy of their systems.

Another approach, used in [4], tried to extract sentiments for a stock from the messages posted on the web. The authors trained a meta-classifier which can classify messages as "good", "bad" or "neutral" based on their contents. The system used both naive bag-of-words models and a primitive language model. They found that time series and cross-sectional aggregation of message sentiment improved the quality of the sentiment index. [4] reaffirmed the findings of [8] by showing that market activity is strongly correlated with small investor sentiment. [4] further showed that overnight message posting volume predicts changes in next-day stock trading volume.

There has been a lot of work on sentiment extraction. In [3], a social network is used to study how sentiment propagates across like-minded blogs. The problem required natural language processing of text [6]; the authors concluded that computation can become intractable for a corpus as large as the Web. Our goal is to develop an efficient model to predict the sentiment of message boards.

3. Approach

3.1 System Overview

Figure 1 gives an overall outline of our system. The first step was data collection. In this step we crawled message boards and stored the data in a database. The next step was extraction of information from the unstructured data. We removed HTML tags and extracted useful features such as date, rating and message text. The extracted information was then used to build sentiment classifiers. By comparing the sentiments predicted from the web data with the actual stock value, our system calculated each author's trust value. This trust value was then applied to filter noise, thus improving our classifier's performance. Finally, our system can make predictions on stock behavior using all the features extracted or calculated.

3.2 Data Collection

We collected over 260,000 messages for 52 popular stocks on http://finance.yahoo.com. The stocks were chosen to cover a good spectrum of sectors, from technology to oil. The messages covered a period of over 6 months; this large amount of data gave us a wide time window in which to analyze stock behavior. All the extracted data was stored in a repository for later processing.

On this website, the messages are organized by stock symbol; a message board exists for each stock traded on major stock exchanges such as NYSE and NASDAQ. Users must sign up for an account before they can post messages, and every message posted is associated with its author. This feature makes author accountability possible in our data processing step. Along with text messages, the authors can express the sentiments of their posts as "StrongBuy", "Buy", "Hold", "Sell" and "StrongSell". Also, other users can rate the messages they read according to their opinions and views; the rating is on a scale of five stars. And as with any message board, each message has a date, a title and message text.

Figure 1. System Overview: The message board data is collected and processed. Then we predict the sentiments and calculate the TrustValues. These new features are then used to predict stock behavior.

We used a scraper program to extract message board posts for the chosen set of stock symbols. Figure 2 shows a sample message from Yahoo! Finance along with the relevant information circled. In this sample message, the author expressed extreme optimism about Yahoo! stock and encouraged others to buy it, as it is currently very underpriced with the possibility of a stock split. Our scraper program would extract the subject, text, stock symbol, rating, number of ratings, author name, date and author's sentiment.
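The fields extracted by the scraper can be gathered into one record per post. The sketch below is a hypothetical layout rather than the schema actually used; the class name `BoardMessage`, the field names and the validation rules are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Sentiment labels an author can attach to a post, as listed in the text.
SENTIMENTS = ("StrongBuy", "Buy", "Hold", "Sell", "StrongSell")


@dataclass
class BoardMessage:
    """One scraped post. Field names are hypothetical."""
    symbol: str
    subject: str
    text: str
    author: str
    posted: date
    rating: Optional[float] = None     # star rating given by readers, out of five
    num_ratings: int = 0
    sentiment: Optional[str] = None    # None when the author attached no label

    def __post_init__(self) -> None:
        if self.sentiment is not None and self.sentiment not in SENTIMENTS:
            raise ValueError(f"unknown sentiment: {self.sentiment!r}")
        if self.rating is not None and not 0.0 <= self.rating <= 5.0:
            raise ValueError("rating must lie between 0 and 5 stars")

    @property
    def is_labeled(self) -> bool:
        """True when the author stated a sentiment explicitly."""
        return self.sentiment is not None
```

Keeping the author-supplied sentiment optional separates posts that carry an explicit label from posts whose sentiment has to be predicted.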

3.3 Feature Representation

After the relevant information had been extracted, we converted each message to a vector of words and author names. The dates were mapped to integer values. The value of each entry in the vector was then calculated using the TFIDF formula:

$$\mathrm{TFIDF}(w) = \mathrm{TF}(w) \times \mathrm{IDF}(w)$$

$$\mathrm{TF}(w) = \frac{n(w)}{\sum_{w'} n(w')}$$

$$\mathrm{IDF}(w) = \log\left(\frac{|M|}{|\{m : w \in m\}|}\right)$$

M is the set of all messages, while n(w) is the frequency of the term w in a message. The TFIDF (Term Frequency Inverse Document Frequency) weight is a statistical measure used to evaluate how important a term (i.e., word, feature, etc.) is to a message in a corpus. The importance increases proportionally to the number of times the term appears in the message, but it is offset by the frequency of the term in the corpus. Thus, if the term is common in all the messages of the corpus, then it is not a good indicator for differentiating messages.

Figure 2. We extracted relevant information from the above message such as subject, text, date, author etc.
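The TFIDF weighting defined above can be computed in a few lines. The following is a minimal sketch, assuming messages have already been split into tokens; the function name `tfidf`, the whitespace tokenization and the toy corpus are illustrative assumptions, not part of the original system.

```python
import math
from collections import Counter


def tfidf(messages):
    """Return one {term: weight} dict per tokenized message.

    TF(w)  = n(w) / sum of n(w') over all terms w' in the message
    IDF(w) = log(|M| / number of messages that contain w)
    """
    doc_freq = Counter()
    for tokens in messages:
        doc_freq.update(set(tokens))   # count each message once per term

    weights = []
    for tokens in messages:
        counts = Counter(tokens)
        total = sum(counts.values())
        weights.append({
            w: (n / total) * math.log(len(messages) / doc_freq[w])
            for w, n in counts.items()
        })
    return weights


# Toy corpus (made up for illustration).
corpus = [
    "buy apple buy".split(),
    "sell apple".split(),
    "hold apple now".split(),
]
vectors = tfidf(corpus)
```

In this toy corpus the term "apple" occurs in every message, so its IDF is log(1) = 0 and it receives zero weight, which matches the observation that a term common to all messages does not help differentiate them.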

3.4 Sentiment Prediction

We assumed that the sentiment about a stock is highly responsive to the performance of the stock and to recent news about the company. For example, news about the introduction of the iPhone by Apple can fuel considerable interest among users and can also affect their sentiments positively. Similarly, the sentiment can change when there has been a sharp change in stock performance in the recent past. Using this intuition, we modeled the sentiments as conditionally dependent upon the messages and stock value over the past day (this can be extended to a longer time period in our system). The sentiment for a message m at time instant i is modeled as follows:

$$P(\mathrm{Sentiment} \mid \theta) = P(\mathrm{Sentiment} \mid m, M_{i-1}, SV_{i-1})$$

This can also be visualized as a Markov process where the prediction at time instant i depends upon the values at the previous time instant. In the above formula, $M_{i-1}$ and $SV_{i-1}$ correspond to the set of messages and the stock value at time instant i-1, respectively. The parameters of the above model can now be learned using a suitable learning algorithm. For our experiments, we used Naive Bayes, Decision Trees [3] and Bagging [2] to learn the corresponding classifier. Naive Bayes is a simple model and can easily scale to large data. Unfortunately, it is not able to model complex relationships. Decision trees, on the other hand, can encode complex relationships but can often lead to over-fitting. Over-fitting causes the model to perform well on training data while failing to show similar performance on actual data.
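To make the idea behind Bagging [2] concrete, the sketch below trains several base classifiers on bootstrap resamples of the training set and combines them by majority vote. It is only an illustration: the base learner here is a one-word decision stump written for this example (an assumption; it is not the learner used in our experiments), and the function names and toy data are hypothetical.

```python
import random
from collections import Counter


def fit_stump(samples):
    """Base learner: choose the single word whose presence best predicts the label.

    samples is a list of (set_of_words, label) pairs.
    """
    majority = Counter(label for _, label in samples).most_common(1)[0][0]
    vocab = sorted(set().union(*(words for words, _ in samples)))
    best_correct, best_rule = -1, (None, majority, majority)
    for word in vocab:
        present = Counter(label for words, label in samples if word in words)
        absent = Counter(label for words, label in samples if word not in words)
        yes = present.most_common(1)[0][0]
        no = absent.most_common(1)[0][0] if absent else majority
        correct = present[yes] + absent[no]
        if correct > best_correct:
            best_correct, best_rule = correct, (word, yes, no)
    word, yes, no = best_rule
    return lambda words: yes if word in words else no


def bagging_fit(samples, n_models=25, seed=0):
    """Train n_models stumps, each on a bootstrap resample of the data."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        boot = [rng.choice(samples) for _ in samples]   # sample with replacement
        models.append(fit_stump(boot))
    return models


def bagging_predict(models, words):
    """Combine the base classifiers by majority vote."""
    votes = Counter(model(words) for model in models)
    return votes.most_common(1)[0][0]


# Toy training data (made up for illustration).
train = [
    ({"buy", "cheap"}, "Buy"),
    ({"buy", "split"}, "Buy"),
    ({"buy", "now"}, "Buy"),
    ({"sell", "sheep"}, "Sell"),
    ({"sell", "drop"}, "Sell"),
    ({"sell", "weak"}, "Sell"),
]
models = bagging_fit(train)
```

Averaging over resamples reduces the variance of an unstable base learner, which is why bagging is often paired with decision trees that would otherwise over-fit.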

For classifier training, we used a popular toolkit known as Weka [1], which provides all the standard classifiers such as Naive Bayes and Decision Trees. In our data, a small fraction of messages had sentiments already assigned to them; their authors expressed the sentiment explicitly. These messages were used as ground truth while training the sentiment classifier. We trained a classifier for each stock on a daily basis, and each message was classified as "StrongBuy", "Buy", "Hold", "Sell" or "StrongSell". The number of features used by each classifier was around 10,000.

3.5 TrustValue Calculation

We acknowledge that some authors are more knowledgeable than others about the stock market. For example, professional financial analysts should receive more trust, meaning their posts should carry more weight than posts by less knowledgeable authors. However, obtaining an author's background is difficult. Message boards commonly provide a user profile feature where authors can fill in information about their background, but this feature is often left unused or filled with inaccurate information.

Instead of discovering each author's background, we use an algorithm to calculate an author's TrustValue based on his or her historical performance on the message boards. For each message, we took the author's sentiment from the sentiment prediction step and compared it to the actual stock performance around the time of the post. If the author's post agreed with the actual performance, then the author's trust value was increased. We care not only about the direction in which the stock price went, but also about the magnitude. For example, if the author gave a strong sell sentiment in the post, but in reality the stock price dropped only slightly, then the author should earn less TrustValue. We used the percentage difference in stock price at the closing bell as the normalized measure of stock performance.

Our algorithm also takes into account the fact that a single author cannot be an expert on all stocks. It is commonplace for even professional financial analysts to keep track of only a set of stocks. This means an author can only be trusted for the set of stocks he or she knows best. Each author can be assigned different trust values for different stocks; this feature enables our algorithm to paint a clear picture of each author's abilities. The TrustValue is calculated as follows:

$$\mathrm{TrustValue} = \frac{\mathrm{PredictionScore}}{\mathrm{NumberOfPredictions}} + \frac{\mathrm{ExactPredictions} + \mathrm{ClosePredictions}}{\mathrm{NumberOfPredictions} + \mathrm{ActivityConstant}}$$

PredictionScore is the author's prediction performance, that is, how closely the author's predictions follow the stock market. NumberOfPredictions is the total number of predictions made by the author. ExactPredictions is the number of exact predictions made by the author. ClosePredictions is the number of "good enough" predictions made by the author. ActivityConstant is a constant used to penalize authors with low activity, that is, few predictions.
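As a worked illustration of the TrustValue formula, the sketch below reads it as the sum of two ratios and keeps one value per (author, stock) pair. How PredictionScore is accumulated is not specified here, so it is passed in as a number; the function name, the zero returned for authors with no predictions, the ActivityConstant of 5 and the sample counts are all assumptions.

```python
def trust_value(prediction_score, num_predictions, exact, close, activity_constant):
    """TrustValue as the sum of an average score and a damped hit rate.

    Returning 0.0 for an author with no predictions is an assumption;
    the formula itself is undefined in that case.
    """
    if num_predictions == 0:
        return 0.0
    average_score = prediction_score / num_predictions
    damped_hit_rate = (exact + close) / (num_predictions + activity_constant)
    return average_score + damped_hit_rate


# Hypothetical counts, keyed by (author, stock symbol) so that the same
# author can hold different trust values for different stocks.
history = {
    ("author_a", "AAPL"): dict(prediction_score=3.0, num_predictions=4, exact=2, close=1),
    ("author_a", "XOM"): dict(prediction_score=1.0, num_predictions=1, exact=1, close=0),
}
trust = {
    key: trust_value(activity_constant=5.0, **counts)
    for key, counts in history.items()
}
```

With an ActivityConstant of 5, one correct prediction out of one contributes only 1/6 through the second term, whereas twenty correct out of twenty would contribute 20/25; this is how the constant penalizes low activity.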

3.6 Stock Prediction

Stock prediction is a difficult task. In our method, we performed stock prediction on the basis of web sentiments about the stock. To formalize, we predicted whether the stock value at time instant i would go up or down on the basis of recent sentiments about the stock:

$$P(\Delta\,\text{stock-value}_i \mid \Theta) = P(\Delta\,\text{stock-value}_i \mid \text{sentiment}_i, \text{trust}_{i-1}, \text{features}_{i-1})$$

Figure 3 illustrates our stock prediction model. We learned a classifier which can predict whether the stock price will go up or down using the features extracted or calculated over the past day. We used all of the features, including sentiment and TrustValue, to train classifiers such as Decision Tree, Naive Bayes and Bagging. In our experiments, we show strong evidence that stock value and sentiment are indeed correlated and that one can predict changes in stock value using sentiments.
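The indexing in the stock prediction model can be illustrated by building training rows from daily series. The sketch below is an assumption-laden simplification: it supposes that sentiment and trust have already been aggregated into one number per day, it treats an unchanged closing price as "down", and the function names and sample values are hypothetical.

```python
def pct_change(prev_close, close):
    """Percentage difference between two consecutive closing prices."""
    return (close - prev_close) / prev_close * 100.0


def build_rows(closes, sentiment, trust):
    """Pair each day's features with the direction of that day's price move.

    Follows the model's indexing: the label for day i is predicted from
    sentiment[i] and trust[i - 1]. Day 0 has no previous day and is skipped.
    """
    rows = []
    for i in range(1, len(closes)):
        label = "up" if pct_change(closes[i - 1], closes[i]) > 0 else "down"
        rows.append(((sentiment[i], trust[i - 1]), label))
    return rows


# Made-up daily values for three trading days.
closes = [100.0, 102.0, 101.0]
sentiment = [0.1, 0.5, -0.2]
trust = [0.9, 0.8, 0.7]
rows = build_rows(closes, sentiment, trust)
```

Each row can then be fed to any of the classifiers named above, with the feature tuple as input and the up or down label as the target.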

4. Experiments and Results

4.1 Evaluation

Sentiment prediction can be evaluated using statistical measures. Accuracy, which is defined as the percentage of sentiments correctly predicted, is one method for evaluating approaches. The quality of results is also measured by two standard performance measures, recall and precision. Recall is defined as the proportion of positive sentiments which are correctly identified:

$$\mathrm{recall} = \frac{\text{True positive instances predicted}}{\text{Total positive instances}}$$

Precision is defined as the ratio of the number of correct sentiments predicted to the total number of matches predicted:

$$\mathrm{precision} = \frac{\text{True positive instances predicted}}{\text{Total instances predicted}}$$

Figure 3. Our hypothesis: change in stock value is affected by sentiments of the past day. The figure illustrates our model as a Bayesian network.

One can increase recall by increasing the number of sentiments predicted or by relaxing the threshold criteria, but this often decreases the precision of the result. In general, there is an inverse relationship between recall and precision. An ideal learning model has high recall and high precision. Sometimes recall and precision are combined into a single number called F1, the harmonic mean of recall and precision:

$$F_1 = \frac{2 \times \mathrm{recall} \times \mathrm{precision}}{\mathrm{recall} + \mathrm{precision}}$$
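The three measures can be computed directly from paired lists of actual and predicted labels. The sketch below treats one sentiment class as the positive class; the function name, the convention of returning 0.0 when a denominator is zero and the toy labels are assumptions made for illustration.

```python
def precision_recall_f1(actual, predicted, positive):
    """Precision, recall and F1 for one class treated as positive."""
    true_pos = sum(a == positive and p == positive for a, p in zip(actual, predicted))
    predicted_pos = sum(p == positive for p in predicted)
    actual_pos = sum(a == positive for a in actual)

    # Returning 0.0 for an empty denominator is a convention, not part of the definitions.
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / actual_pos if actual_pos else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * recall * precision / (recall + precision)
    return precision, recall, f1


# Toy labels (made up for illustration).
actual = ["StrongBuy", "StrongBuy", "Hold", "StrongBuy", "Sell"]
predicted = ["StrongBuy", "Hold", "StrongBuy", "StrongBuy", "Sell"]
precision, recall, f1 = precision_recall_f1(actual, predicted, "StrongBuy")
```

Here two of the three predicted StrongBuy labels are correct and two of the three actual StrongBuy messages are found, so precision, recall and F1 all equal 2/3.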

4.2 Experiments

4.2.1 Sentiment Prediction

We chose 3 stocks to show our sentiment predictions, namely Apple, ExxonMobil and Starbucks, covering different sectors of the economy. The aim of this experiment is to find out whether sentiment prediction is possible using only features present in the message. Tables 1 to 3 show our results. The sentiment classes are StrongBuy and StrongSell. We find that sentiment prediction can be done with both high accuracy and recall in our system. This implies that the classifier is able to predict all the messages for a particular sentiment quite accurately, though it might produce a few false positives.

Table 1. StrongBuy & StrongSell sentiment prediction for Apple.

STRONGBUY

CLASSIFIER      RECALL   PRECISION   F1
NAIVE BAYES     0.24     0.42        0.31
DECISION TREE   0.30     0.40        0.35
BAGGING         0.56     0.20        0.30

STRONGSELL

Table 2. StrongBuy & StrongSell sentiment prediction for ExxonMobil.

STRONGBUY

STRONGSELL

In our decision tree model for Apple's sentiment, a feature near the root of the tree was the word "sheep". At first this word seemed to be completely unrelated to the stock market or Apple. We were puzzled why it played such a big role in classifying negative sentiments. After some research, we learned that in Wall Street, a popular 1987 movie about the stock market, the main character used the word sheep to describe weak fund managers. The entire phrase was "they're sheep, and sheep get slaughtered". We then confirmed the validity of our model when we found the word "slaughtered" a few features away from "sheep". We also observed that when the word "sheep" was present, the classifier predicted a negative sentiment.

In our Starbucks sentiment model, the words "china" and "interested" were important features indicating negative sentiment. With the growing Chinese economy and Starbucks' interest in a strong presence in China, it was surprising to see these words negatively associated with the stock. But a quick search of recent news revealed that Chinese bloggers had expressed extremely negative opinions about Starbucks' interest in setting up store locations inside the Imperial Forbidden City. This shows that our models were able to pick up recent news related to the stock. We also found positive words associated with positive sentiment. For example, when building a decision tree for Apple stock, we found that if the authors used words such as "keep", "good", "relax", "announced" or "going over", then the sentiment was also positive. Similarly, if the word "dividends" was used while discussing ExxonMobil, then the sentiment was again positive.

Table 3. StrongBuy & StrongSell sentiment prediction for Starbucks.

STRONGBUY

STRONGSELL

Figure 4. Sentiment correlated with change in stock price.

Our results showed that web sentiment prediction can be done with relatively high accuracy by building a model from the message boards. The model also gives us interesting insights into the various things people are discussing about a stock and how these affect their opinions. Message boards can be a useful corpus for companies to assess how they are perceived by the public.

Table 4. Accuracy results for stock prediction with and without TrustValue.

COMPANY      CLASSIFIER      SENTIMENT ONLY (%)   WITH TRUSTVALUE (%)
APPLE        NAIVE BAYES     71                   79
APPLE        DECISION TREE   72                   81
STARBUCKS    NAIVE BAYES     70                   69
STARBUCKS    DECISION TREE   70                   71
EXXONMOBIL   NAIVE BAYES     61                   62
EXXONMOBIL   DECISION TREE   61                   63

4.2.2 Stock Prediction

The aim of this set of experiments is to find out whether there is a correlation between the sentiment, the TrustValue and the corresponding stock value. For these experiments, we did not use any other features, to prevent biasing the results. Table 4 shows our results on the 3 stocks. Our model was able to make accurate predictions, especially for Apple, where it achieved an accuracy of 81% using both web sentiment and the TrustValue. The model also performed well for Starbucks and ExxonMobil.

We also found that TrustValue is a helpful feature in stock prediction. As shown in Table 4, accuracy increased in five of the six cases when TrustValue was used; the exception is Naive Bayes on Starbucks, which dropped from 70% to 69%. For Apple, the decision tree's accuracy increased by 9 percentage points. This suggests that TrustValue can help remove many irrelevant or noisy sentiments.
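The effect of TrustValue can be read off Table 4 with a little arithmetic. The snippet below simply restates the table's accuracy figures and computes the change in percentage points for each company and classifier; the variable names are illustrative.

```python
# Accuracy (%) from Table 4: (sentiment only, with TrustValue).
table4 = {
    ("Apple", "Naive Bayes"): (71, 79),
    ("Apple", "Decision Tree"): (72, 81),
    ("Starbucks", "Naive Bayes"): (70, 69),
    ("Starbucks", "Decision Tree"): (70, 71),
    ("ExxonMobil", "Naive Bayes"): (61, 62),
    ("ExxonMobil", "Decision Tree"): (61, 63),
}

# Change in accuracy, in percentage points, when TrustValue is added.
gains = {key: with_trust - without for key, (without, with_trust) in table4.items()}

largest = max(gains, key=gains.get)            # case with the biggest improvement
improved = sum(g > 0 for g in gains.values())  # number of cases that improved
mean_gain = sum(gains.values()) / len(gains)   # average change across all six cases
```

The largest gain is for the Apple decision tree (9 points), five of the six cases improve, and the average change across all six is about 3.3 points.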

To conclude, we have shown that there is a strong correlation between stock prices and web sentiment, and that one can use sentiment to make predictions about stock prices over a short period (a day). Figure 4 shows a snapshot of the decision tree validating the correlation between change in stock price and sentiment.

5. Future Work

This is an exciting new area which has a lot of potential for research. There are many things that can be done to extend our current approach. In our current approach, we have not exploited information within a sector. For example, a change in the stock price of a technology stock (say Yahoo) might be a good indicator of similar trends in other stocks (e.g., Google) in the same sector. It would be interesting to train a stock prediction classifier using information within a sector, in other words, a more specialized classifier. One can go even further and use information across related sectors [7]. For example, the automobile industry is strongly linked with the performance of oil companies.

Another easy extension to our project would be to extend the model to make long-term predictions. It is difficult to say how well this approach would perform, but it would definitely give insight into how sentiments and stock values change over time.

6. Conclusion

In this paper, we introduced a novel method to predict sentiment about stocks using financial message boards. In our experiments, we found that sentiment can be predicted with high precision and recall. We also found interesting relationships between the message text used on the message board and the corresponding sentiments.

Web financial information is not always reliable. To take this into consideration, we introduced a new measure known as TrustValue, which takes into account the trustworthiness of an author. We showed that TrustValue improves prediction accuracy by filtering irrelevant or noisy sentiments.

We used the sentiment and TrustValue to build our model for stock prediction. We used the intuition that sentiments affect stock performance over a short time period, and we captured this with a Markov model. Our stock prediction results showed that sentiment and stock value are closely related and that web sentiment can be used to predict stock behavior with reasonable accuracy. To conclude, our results showed promising prospects for automatic stock market prediction using web sentiments.

References

[1] Weka: Data mining software in Java.

[2] L. Breiman. Bagging predictors. Machine Learning, 24:123-140, 1996.

[3] L. Breiman, J. Friedman, R. Olshen, and C. Stone. Classification and regression trees.

[4] S. R. Das and M. Y. Chen. Yahoo! for Amazon: Sentiment extraction from small talk on the web. 8th Asia Pacific Finance Association Annual Conference, 2001.

[5] V. Lavrenko, M. Schmill, D. Lawrie, P. Ogilvie, D. Jensen, and J. Allen. Mining of concurrent text and time series. Proceedings of the International Conference on KDD Text Mining Workshop, 2001.

[6] J. Nasukawa, T. Bunescu, R. Niblack, and W. Yi. Sentiment analyzer: Extracting sentiments about a given topic using natural language processing techniques. Proceedings of International Conference of Data Mining, 2003.

[7] S. Sarawagi, S. Chakrabarti, and S. Godbole. Cross-training: Learning probabilistic mapping between topics. Special Interest Group KDD, 2003.

[8] P. Wysocki. Cheap talk on the web: The determinant of postings on stock message boards. University of Michigan Business School, (98025), 1998.
