S-Sense: A Sentiment Analysis Framework for Social Media ...for analyzing social media contents....

IJCNLP 2013 Workshop on Natural Language Processing for Social Media (SocialNLP), pages 6–13,Nagoya, Japan, 14 October 2013.

S-Sense: A Sentiment Analysis Framework for Social Media Sensing

Choochart Haruechaiyasak, Alisa Kongthon,Pornpimon Palingoon and Kanokorn Trakultaweekoon

Speech and Audio Technology Laboratory (SPT)National Electronics and Computer Technology Center (NECTEC)Thailand Science Park, Klong Luang, Pathumthani 12120, Thailand

[email protected], [email protected]@nectec.or.th, [email protected]

Abstract

Due to the explosive growth of socialmedia usage in Thailand, many busi-nesses and organizations including mar-ket research agencies are seeking for toolswhich could perform real-time sentimentanalysis on the large contents. In this pa-per, we propose S-Sense, a framework foranalyzing sentiment on Thai social media.The proposed framework consists of anal-ysis modules and language resources. Twomain analysis modules, intention and sen-timent, are based on classification algo-rithm to automatically assign appropriateintention and sentiment class labels for agiven text. To train classification models,language resources, i.e., corpus and lexi-con, are needed. Corpus consists of a col-lection of texts manually labeled with ap-propriate intention and sentiment classes.Lexicon consists of both general termsfrom dictionary and clue terms which helpidentifying the intention and sentiment. Toevaluate performance and robustness ofthe analysis modules, we prepare a data setfrom Twitter posts and Pantip web boardin mobile service domain. The experi-ments are set up to compare the perfor-mance between two different lexicon sets,i.e., general and clue terms. The resultsshow that incorporating clue terms intofeature vectors for constructing the classi-fication models yield significant improve-ment in terms of accuracy.

1 Introduction

Due to the enormous volume, social media has be-come recognized as a good example of Big Data.One of the challenging issues in handling big datais to perform real-time analysis on the contents.

Today social media has been widely accepted asan active communication channel between com-panies and customers. Many companies regularlyuse social networking websites to promote newproducts and services, and post announcements tothe customers. On the other hand, customers oftenpost their comments to express some sentimentstowards products and services. Many customersalso post questions and requests to get answers andhelps from the customer services. Due to the real-time nature of the social media, monitoring cus-tomers’ comments has become a critical task incustomer relation management (CRM). Sentimentanalysis has received much attention among mar-ket research community as an effective approachfor analyzing social media contents. Some high-lighted applications of sentiment analysis includebrand monitoring, campaign monitoring and com-petitive analysis.

Thailand is among the top countries having alarge population on social networking websitessuch as Facebook and Twitter. The recent statis-tics show that the number of Facebook users inThailand has reached 17 millions as of October 29,20121. Many companies in Thailand start to seethe importance of using social media analysis togain some insight on what people think about theirbrands, products and services. Although manycommercial software tools for social media anal-ysis are available, they do not support Thai lan-guage. In this paper, we propose S-Sense, a frame-work for analyzing sentiment on Thai social mediacontents. To provide a complete solution, our pro-posed framework consists of many components in-cluding tagging tool, language resources, analysisand visualizing modules.

Among all of the components in S-Sense, lan-guage resources are considered very essential forproviding the infrastructure to train both inten-

1Facebook statistics, http://en.wikipedia.org/wiki/Facebook statistics

6

tion and sentiment analysis models. In our pro-posed framework, language resources consist oftwo components, corpus and lexicon. Corpus con-sists of a collection of texts manually labeled withappropriate intention and sentiment classes. Lex-icon consists of two types of terms, general andclue. The general lexicon includes terms found inLEXiTRON2, which is a well-known Thai-Englishelectronic dictionary. In S-Sense, the general lex-icon is modified by including new terms suchas slangs, chat language, transliterated words,found in Thai Twitter corpus. The second lex-icon consists of clue terms which help identify-ing the intention and sentiment. Example of clueterms for sentiment analysis are polar terms (suchas “stylish”, “beautiful” and “expensive”), whichcontain either positive or negative sentiment.

For the analysis modules, we apply classifica-tion algorithm to automatically assign appropriateintention and sentiment class labels for a giventext. The performance of classification modelsgenerally depends on the choice of classificationalgorithms including parameter settings, the sizeof training corpus and the design of term featuresets. The current version of S-Sense applies themultinomial Naive Bayes algorithm. The reasonwe used Naive Bayes is its requirement of a smallamount of training data to estimate the parame-ters for learning the models. Also Naive Bayesis a descriptive and probabilistic machine learn-ing, therefore, the results could be easily analyzedand explained. The classification results are re-turned with a probability value which could be in-terpreted as the confidence level. In addition tothe proposed framework, another contribution ofthis paper is the comparative study of using dif-ferent lexicon sets for training the analysis mod-els. We compare the performance of intention andsentiment analysis models by using two differentsets of lexicons, general and clue terms. The eval-uation corpus consist of Twitter posts and Pantipweb board topics in mobile service domain. Theexperimental results will be presented along withthe discussion on the error analysis.

The remainder of this paper is organized as fol-lows. In next section, we review some relatedworks on sentiment analysis and many differentapproaches for constructing language resourcesfor sentiment analysis. In Section 3, we presentthe proposed S-Sense framework for Thai inten-

2LEXiTRON, http://lexitron.nectec.or.th

tion and sentiment analysis. Details on each com-ponents are given with illustration. In Section 4,we evaluate the framework by using a data set col-lected from Twitter and Pantip Thai web board.Examples of difficult cases are discussed alongwith some possible solutions. Section 5 concludesthe paper with the future work.

2 Related work

Due to its potential and useful applications, opin-ion mining and sentiment analysis has gained alot of interest in text mining and NLP communi-ties (Ding et al., 2008; Jin et al., 2009; Tsytsarauand Palpanas., 2012). Much work in this area fo-cused on evaluating reviews as being positive ornegative either at the document level (Pang et al.,2002; Beineke et al., 2004) or sentence level (Kimand Hovy, 2004; Wilson et al., 2009). For in-stance, given some reviews of a product, the sys-tem classifies them into positive or negative re-views. No specific details or features are identi-fied about what customers like or dislike. To ob-tain such details, a feature-based opinion miningapproach has been proposed (Hu and Liu, 2004).

The problem of developing subjectivity lexi-cons for training and testing sentiment classifiershas recently attracted some attention. Althoughmost of the reference corpora has been focusedon English language, work on other languages isgrowing as well. Ku and Chen (2007) proposedthe bag-of-characters approach to determine senti-ment words in Chinese. This approach calculatesthe observation probabilities of characters from aset of seed sentiment words first, then dynami-cally expands the set and adjusts their probabili-ties. Later in 2009, Ku et al. (2009), extended theirbag-of-characters approach by including morpho-logical structures and syntactic structures betweensentence segment. Their experiments showed bet-ter performance of word polarity detection andopinion sentence extraction. Haruechaiyasak etal. (2010), proposed a framework for constructingThai language resource for feature-based opinionmining. The proposed approach for extracting fea-tures and polar words is based on syntactic patternanalysis.

Our main contribution in this paper is the pro-posed framework for analyzing intention, senti-ment, and language usage from social media texts.We initially performed some evaluation on Thaitexts to show the effectiveness of the proposed

7

components and modules. The proposed frame-work can be easily extended to support other lan-guages, especially for unsegmented languages, byproviding the plugged-in resources including lexi-con and corpus.

3 The proposed framework

In this paper, we focus on both language resourcesand the analysis modules as a complete frameworkfor Thai-language intention and sentiment analy-sis. The proposed framework could easily be ex-tended to support other languages by construct-ing language-specific resources. Our frameworkis also designed for easy adaptation to businessesin different domains. Similar to language-specificsupport, to apply the proposed framework for aspecific domain, one can use the provided taggingtool to prepare domain-specific resources, i.e., an-notated corpus and lexicon.

3.1 Components and modules

The proposed S-Sense framework (shown in Fig-ure 1) consists of the following components.

• Text collecting & processing: This com-ponent involves the process of crawling andcollecting social media contents from differ-ent websites. The process includes basic textprocessing, i.e., sentence segmentation, tok-enization and normalization. Term normal-ization is the process of converting a wordas appeared in the text into a predefined termand cleaning extra repeated characters whichare not part of the term. For example, aword “thnxsss” can be normalized to the term“thank”.

• UREKA: The main task of UREKA (Utiliza-tion on REsource for Knowledge Acquisi-tion) is to extract key feature terms or phrasesfrom a given text. Terms or phrases which arestatistically significant in the corpus can bepresented as interesting issues to the users.Another task is to filter and classify a giventext into a topic. When collecting texts fromsocial networking websites, it is very com-mon to see many collected texts are not rel-evant to the brands or products being moni-tored. Therefore, a classification model couldbe trained to filter out the irrelevant texts fromthe corpus. After obtaining the relevant texts,another classification model could be trained

to classify each text into a predefined set oftopics. For example, in mobile service do-main, topics could include signal quality, pro-motion and customer service.

• S-Sense: This is the main analysis compo-nent under the framework. S-Sense consistsof four analysis modules. Language usageanalysis classifies each text based on two as-pects, the use of obscene language and theuse of chat or informal languages. Detect-ing obscenity is useful since many texts withstrongly negative sentiment could sometimescontain obscene language. Intention analy-sis classifies each text into four classes: an-nouncement, request, question and sentiment.Sentiment analysis further classifies each textbased on its sentiment, i.e., positive or neg-ative. Emotion analysis is set in our futurework. The task of emotion analysis moduleis to perform an in-depth sentiment analysisregarding to the emotion or feeling such assad, happy and angry. Other components ofS-Sense include visualizing modules includ-ing adaptive emoticon and interactive dash-board. These modules are used for display-ing the summarized reports for the analyzedtexts.

• Tagging tool and language resources: Un-der the proposed framework, language re-sources include two components, annotatedcorpus with domain and language-specificlexicons. To construct language resources,we provide a tagging tool for linguists towork with. The tagging tool is a web-basedapplication which consists of a DBMS and aGUI.

3.2 Analysis tasksThe current version of S-Sense framework focuseson two main analysis modules, intention and senti-ment. The intention analysis include the followingcategories.

1. Announcement: This type of intention refersto messages or posts in which a company in-tends to communicate with their customers,e.g., advertisement of new products or eventannouncement.

2. Request: This intention is used for customersto ask for help when having trouble or prob-lem with the company’s products or services.

8

Figure 1: The proposed S-Sense framework.

Customers would expect immediate responsefrom the company to solve the problem.

3. Question: This intention refers to messagesor posts from customers asking for informa-tion related to products and services. Thequestion is, for example, a customer’s postasking for more details of a new mobile ser-vice promotion.

4. Sentiment: This intention is when customersexpress their opinions or sentiments towardsthe company’s brand, products and services.Sentiment can be divided into positive, neu-tral and negative aspects.

It is important to analyze intention before per-forming sentiment analysis. Without intentionanalysis, a sentence containing positive polarwords such as an advertisement would be identi-fied as containing the sentiment intention. For ex-ample, a sentence “The new high-speed Internet isfaster and cheaper. Apply today at the shop nearyou.” is an advertisement, but could be incorrectlyidentified as having positive sentiment. Therefore,Identifying a sentence as announcement or adver-tisement would help improve overall performanceof sentiment analysis.

3.3 Potential applications

S-Sense can be applied in many different applica-tions. Some of the potential applications are asfollows.

• Brand monitoring: With the widespreadof social media, today customers have morefreedom to express their sentiments towardsproducts and services. Analyzing sentimentsof the customers could help companies gainsome insight on how they feel when usingtheir products and services. More impor-tantly, many companies are highly associatedwith their brands. Negative sentiments to-wards the company’s brand could have neg-ative impact on the product sales. Therefore,it is very important for companies to monitoror track the mentions and sentiments of thecustomers on social media.

• Campaign Monitoring: Many timesthroughout the year, the company wouldlaunch different campaigns involving newproducts and services. The goal of campaignmonitoring (i.e, tracking) is to measure thecustomers’ feedback on each campaign.The results could be analyzed in terms ofnumber of mentions, positive and negativesentiments and the key product or servicefeatures in which customers feel positive ornegative about.

• Competitive Analysis: This task is to mon-itor and analyze the activities including sen-timents of customers towards the company’scompetitors. The analysis results could helpgain some insight on strengths and weak-nesses of the competitors in the market. For

9

example, if a competitor has many com-plaints on certain product features, the com-pany could grab the opportunity by advertis-ing its own product features which are betterthan the competitor’s.

• Employee Engagement: One of the mainproblems in many organizations today is thehigh turnover rate. One of the solutions isto monitor and analyze the employee engage-ment level. This task is to measure the em-ployees’ sentiments towards their jobs, col-leagues and organization. The measure couldreveal how much employees are willing tolearn and perform at work, and to get in-volved in different activities initiated by theorganization.

4 Experiments and discussion

To evaluate the proposed framework, we performexperiments using a corpus in the domain of mo-bile service. The corpus is obtained betweenMarch and June in 2013 from two sources, Twit-ter3 and Pantip4, one of the top visited web boardsin Thailand. The total number of randomly se-lected texts in the corpus is 2,723. The corpuswas annotated in two aspects, intention and sen-timent. Table 1 summarizes the number of taggedtexts in four different intentions. The majority ofintentions is sentiment which accounts for approx-imately 64% of the corpus. The reason is whenusing social networking websites or web boards,users often express their opinion and sentimentmore than other intentions.

For the sentiment intention, we further anno-tated each text based on its sentiment, i.e., positiveor negative. Table 2 summarizes the number oftagged texts in positive and negative sentiment. Itcan be observed that negative sentiment accountsfor approximately 91%. This is not very surprisingsince users tend to complain when having prob-lems using the mobile service. Major reportedproblems in mobile service industry include, forexample, weak or unavailable signal, call drop,slow data transfer rate, impolite service and longwaiting time for call center.

Table 3 shows some examples of annotated cor-pus in different intention and sentiment. In addi-tion to annotating each text with an intention label,

3Twitter, http://twitter.com4Pantip, http://pantip.com

Table 1: Number of tagged texts in four differentintentions.

Table 2: Number of tagged texts in positive andnegative sentiments.

we collect clue terms which could help identify theintention. For example, from the announcementintention, the terms and phrases “new promotion”,“best-deal” and “will start on” are collected intothe clue lexicon. From the sentiment intention, wecollected the terms “annoyed” and “impressive”.Other clue terms are underlined for each examplein the table.

Table 4 shows the statistics of lexicons usedin the experiments. There are two types of lex-icons: general and clue terms. General lexiconinclude two sets of terms, LEXiTRON5 whichare general words from Thai dictionary, and Twit-ter which contains newly found words from ThaiTwitter corpus. Words obtained from Twitter in-clude slangs and transliterated words from otherlanguages. Clue lexicon include terms or phraseswhich could help identify intention and sentiment.One of the main objectives in the experiments isto observe the effect of incorporating clue lexiconin constructing classification models for intentionand sentiment analysis. Therefore, we perform acomparative study on using different sets of lexi-cons.

To perform experiments, we apply the multino-mial Naive Bayes algorithm to learn the classifica-tion models (McCallum and Nigam, 1998). Thereason we use Naive Bayes is due to the smallnumber of sample texts in the corpus, especially

5LEXiTRON, http://lexitron.nectec.or.th

10

Table 3: Example of annotated texts categorized by different intentions and sentiments.

Table 4: Two types of lexicons: general and clue

for the announcement intention. Naive Bayes onlyrequires a small amount of training data to esti-mate the parameters for learning the models. AlsoNaive Bayes is a descriptive and probabilistic ma-chine learning, therefore, the results could be eas-ily analyzed and explained. The classification re-sults are returned with a probability value whichcould be interpreted as the confidence level.

The first experiment is the intention analysis.For each intention listed in Table 1, we train a bi-nary classification model with two classes, relatedand other. If a given text is analyzed as contain-

ing a particular intention, it will be assigned withthe class label related. We prepare the data set byusing the same amount of texts in each class. Forexample, in announcement intention, we use 94announcement texts and randomly select another94 texts from other intentions. To see the advan-tage of using clue terms as additional term feature,we compare the results between using only gen-eral lexicon and using both general and clue lexi-cons. The performance metric is accuracy whichis defined as the number of correctly classified in-stances over the total number of test instances.

Table 5 shows the experimental results for in-tention analysis. The results are based on 10-foldcross validation. From the table, it can be observedthat adding clue terms into the term feature helpsimprove the classification accuracy for all inten-tions. Especially for request, question and senti-ment, the improvement is over 6%. For announce-ment, the improvement is approximately 2%. Thisis probably due to the difficulty in defining andcollecting the clue terms for announcement inten-tion. For example, some of the terms like “new”must be collocated with other term in a phrase,e.g. “new promotion”. As the phrase becomes

11

more specific, it will not be found in the test in-stances. Another observation is the request inten-tion is the most difficult to analyze. This is dueto often when users wish to request for something,there is no specific term or clue term in the mes-sage. The request intention is implicitly expressedwith verbs or polar terms, therefore causing con-fusion to other intention classes.

Table 5: Experimental results on intention analysis

The second experiment is the sentiment analy-sis. We train a binary classification model withtwo classes, positive and negative. The number ofinstances for each class is given in Table 2. Ta-ble 6 shows the experimental results on sentimentanalysis. The results are based on 10-fold crossvalidation. From the table, we can observe thatusing clue terms as additional term features helpsincrease the accuracy by approximately 2%. Thesmall amount in improvement is probably due toterms in general dictionary and from Twitter con-tain sentiment which already helps identify the po-larity of the texts.

Table 6: Experimental results on sentiment analy-sis

To perform error analysis, we look at the testinstances which are misclassified, i.e., classifyingpositive into negative and vice versa. We can sum-marize two major causes of errors as word senseambiguity and sarcasm. The first problem occurswhen a polar term contains both positive and neg-ative senses depending on the contexts. For ex-

ample, the word “strong”, when appearing withthe term “signal” will give positive polarity. How-ever, when it appears with the term “employee”,the term has the meaning of ”impolite” and a neg-ative polarity should be assigned. However, dueto the small corpus size and simple feature vectorwhich treats each term independent, sometimes,the terms cannot be learned properly. To solve thisproblem, we will explore the idea of incorporatingcontextual terms with the clue terms in our futurework. Each clue term will be associated with somecontext terms to identify the polarity of the texts.

The second problem is sarcasm which is muchmore difficult to solve. This problem is still a dif-ficult and challenging task in sentiment analysisof any languages (Gonzalez-Ibanez et al., 2004).While there are some research work to identifysarcasm in given texts, the performance is stillpoor. However, some of the sarcastic texts can stillbe identified by detecting some common slangswhich are usually used in sarcastic texts. In Thailanguage, if users express a positive sentiment inan exaggerated way or in a contradicting way, thenthe message is most likely sarcastic. For example,“Today the download speed is faster than the speedof light. Thank you very much!” is considered assarcastic.

5 Conclusion and future work

We proposed a framework called S-Sense (SocialMedia Sensing) for developing a social media an-alyzing tool. The current version focuses on inten-tion and sentiment analysis. We applied the NaiveBayes as the classification algorithm to analyzefour different intentions (announcement, request,question and sentiment) and two sentiments (pos-itive and negative). The proposed framework wasevaluated by using a social media corpus in the do-main of mobile service obtained from Twitter andPantip web board.

To study the effect of using different lexiconsets to train the models, we compared two ap-proaches: using only general lexicon and usingboth general lexicon and clue terms. The resultsshowed that adding clue terms into feature vec-tor for training the classification models helps im-prove the accuracy for all intention and sentimentanalysis models. For intention models of request,question and sentiment, the accuracy is increasedby approximately 6%. For sentiment model, theaccuracy is increased by approximately 2%.

12

From the error analysis, we found that two ma-jor problems are word sense ambiguity and sar-casm. For future work, we plan to improve theperformance of both intention and sentiment anal-ysis models by incorporating the contexts nearbythe clue terms. Considering contexts could helpreduce the disambiguation of the word sense. An-other plan is to construct the lexicon and corpusfor other different domains. In addition to mo-bile service, other business domains in Thailandoften mentioned in the social media are automo-tive, consumer electronics, fashion, healthcare andtourism.

ReferencesPhilip Beineke, Trevor Hastie and Shivakumar

Vaithyanathan. 2004. The sentimental factor: im-proving review classification via human-providedinformation. Proc. of the 42nd Annual Meetingon Association for Computational Linguistics, 263–270.

Xiaowen Ding, Bing Liu and Philip S. Yu. 2008. Aholistic lexicon-based approach to opinion mining.Proc. of the int. conf. on web search and web datamining, 231–240.

Roberto Gonzalez-Ibanez, Smaranda Muresan andNina Wacholder. 2011. Identifying sarcasm in Twit-ter: a closer look. Proc. of the 49th ACL: HumanLanguage Technologies, 581–586.

Choochart Haruechaiyasak, Alisa Kongthon, Porn-pimon Palingoon, Chatchawal Sangkeettrakarn.2010. Constructing Thai Opinion Mining Resource:A Case Study on Hotel Reviews. Proc. of the EighthWorkshop on Asian Language Resources, 64–71.

Minqing Hu and Bing Liu. 2004. Mining and sum-marizing customer reviews. Proc. of the 10th ACMSIGKDD international conference on Knowledgediscovery and data mining, 168–177.

Wei Jin, Hung Hay Ho and Rohini K. Srihari. 2009.OpinionMiner: a novel machine learning system forweb opinion mining and extraction. Proc. of the15th ACM SIGKDD, 1195–1204.

Soo-Min Kim and Eduard Hovy. 2004. Determiningthe sentiment of opinions. Proc. of the 20th inter-national conference on Computational Linguistics,1367–1373.

Lun-Wei Ku and Hsin-Hsi Chen. 2007 Mining opin-ions from the Web: Beyond relevance retrieval.Journal of American Society for Information Scienceand Technology, 58(12):1838–1850.

Lun-Wei Ku, Ting-Hao Huang and Hsin-Hsi Chen.2009. Using morphological and syntactic structures

for Chinese opinion analysis. Proc. of the 2009empirical methods in natural language processing,1260–1269.

Andrew McCallum and Kamal Nigam. 1998. AComparison of Event Models for Naive Bayes TextClassification. Proc. of the AAAI-98 Workshop on’Learning for Text Categorization’ 41–48.

Bo Pang, Lillian Lee and Shivakumar Vaithyanathan.2002. Thumbs up?: sentiment classification usingmachine learning techniques. Proc. of the ACL-02conf. on empirical methods in natural language pro-cessing, 79–86.

Mikalai Tsytsarau and Themis Palpanas. 2012. Surveyon mining subjective data on the web. Data Minin-ing and Knowledge Discovery, 24(3): 478–514.

Theresa Wilson, Janyce Wiebe and Paul Hoffmann.2009. Recognizing contextual polarity: An explo-ration of features for phrase-level sentiment analy-sis. Comput. Linguist., 35(3):399–433.

13

S-Sense: A Sentiment Analysis Framework for Social Media ...for analyzing social media contents....

Documents

Transcript of S-Sense: A Sentiment Analysis Framework for Social Media ...for analyzing social media contents....