Chinese Blog Clustering by Hidden Sentiment Factors
![Page 1: Chinese Blog Clustering by Hidden Sentiment Factors](https://reader035.fdocuments.us/reader035/viewer/2022062721/56813811550346895d9fc7d6/html5/thumbnails/1.jpg)
Chinese Blog Clustering by Hidden Sentiment Factors
ADMA 2009
Shi Feng, Daling Wang, Ge Yu, Chao Yang, and Nan Yang
College of Information Science and Engineering, Northeastern University
![Page 2: Chinese Blog Clustering by Hidden Sentiment Factors](https://reader035.fdocuments.us/reader035/viewer/2022062721/56813811550346895d9fc7d6/html5/thumbnails/2.jpg)
Hidden Sentiment Factors(HSF)
• Probabilistic latent semantic analysis (PLSA)
– Blog set B = {b1, b2, …, bN}
– Sentiment word set W = {w1, w2, …, wM}
  • NTUSD: 2,812 positive words and 8,276 negative words
  • HowNet Sentiment Dictionary: 4,566 positive words and 4,370 negative words
– A = N×M matrix, A(i,j) = Freq(bi, wj)
– HSF Z = {z1, z2, …, zk}
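The matrix A can be illustrated with a minimal Python sketch. It assumes the blogs are already segmented into tokens and that the sentiment lexicon is a plain list of unique words; the function name `build_frequency_matrix` and the four toy words are hypothetical stand-ins, not entries checked against NTUSD or HowNet.

```python
from collections import Counter

def build_frequency_matrix(blogs, sentiment_words):
    """Return A as a list of rows, where A[i][j] = Freq(b_i, w_j).

    blogs: list of token lists (assumed to be pre-segmented).
    sentiment_words: list of unique lexicon words (the set W).
    """
    lexicon = set(sentiment_words)
    A = []
    for tokens in blogs:
        # Count only tokens that appear in the sentiment lexicon.
        counts = Counter(t for t in tokens if t in lexicon)
        A.append([counts[w] for w in sentiment_words])
    return A

# Toy data (hypothetical): like, wonderful, disappointed, boring.
W = ["喜欢", "精彩", "失望", "无聊"]
B = [
    ["喜欢", "电影", "精彩", "喜欢"],
    ["无聊", "失望", "电影"],
]
A = build_frequency_matrix(B, W)
print(A)  # [[2, 1, 0, 0], [0, 0, 1, 1]]
```

Words outside the lexicon (here "电影", "movie") are ignored, so each row describes only the sentiment vocabulary of a blog.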
![Page 3: Chinese Blog Clustering by Hidden Sentiment Factors](https://reader035.fdocuments.us/reader035/viewer/2022062721/56813811550346895d9fc7d6/html5/thumbnails/3.jpg)
Hidden Sentiment Factors(HSF)
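The equations on this slide did not survive in the transcript. For orientation, the standard PLSA formulation (asymmetric parameterization) is sketched below using the symbols defined above; whether the paper uses exactly this parameterization is an assumption.

```latex
% Each blog's word distribution is a mixture over the hidden factors z in Z.
P(w_j \mid b_i) = \sum_{z \in Z} P(w_j \mid z)\, P(z \mid b_i)

% Log-likelihood to maximize (up to a term that does not depend on the factors).
\mathcal{L} = \sum_{i=1}^{N} \sum_{j=1}^{M} A(i,j)\, \log P(w_j \mid b_i)

% E-step: posterior of factor z for the pair (b_i, w_j).
P(z \mid b_i, w_j) =
  \frac{P(w_j \mid z)\, P(z \mid b_i)}
       {\sum_{z' \in Z} P(w_j \mid z')\, P(z' \mid b_i)}

% M-step: re-estimate both distributions from expected counts.
P(w_j \mid z) \propto \sum_{i=1}^{N} A(i,j)\, P(z \mid b_i, w_j)
\qquad
P(z \mid b_i) \propto \sum_{j=1}^{M} A(i,j)\, P(z \mid b_i, w_j)
```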
![Page 4: Chinese Blog Clustering by Hidden Sentiment Factors](https://reader035.fdocuments.us/reader035/viewer/2022062721/56813811550346895d9fc7d6/html5/thumbnails/4.jpg)
Hidden Sentiment Factors(HSF)
P(w|b) -> P(z|b)
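The step from P(w|b) to P(z|b) can be sketched with a small EM loop for PLSA. This is a minimal illustration under the standard asymmetric parameterization, not the authors' implementation; the function name `plsa`, the iteration count, the random initialization, and the toy matrix are all assumptions.

```python
import random

def _normalize(row):
    """Scale a non-negative row to sum to 1 (uniform if the row is all zeros)."""
    total = sum(row)
    if total > 0:
        return [x / total for x in row]
    return [1.0 / len(row)] * len(row)

def plsa(A, k, iters=50, seed=0):
    """Fit PLSA to the frequency matrix A with k hidden factors.

    Returns (p_z_b, p_w_z): p_z_b[i][z] estimates P(z|b_i) and
    p_w_z[z][j] estimates P(w_j|z).
    """
    rng = random.Random(seed)
    n, m = len(A), len(A[0])
    p_z_b = [_normalize([rng.random() for _ in range(k)]) for _ in range(n)]
    p_w_z = [_normalize([rng.random() for _ in range(m)]) for _ in range(k)]
    for _ in range(iters):
        new_zb = [[0.0] * k for _ in range(n)]
        new_wz = [[0.0] * m for _ in range(k)]
        for i in range(n):
            for j in range(m):
                if A[i][j] == 0:
                    continue
                # E-step: posterior P(z | b_i, w_j).
                post = _normalize([p_z_b[i][z] * p_w_z[z][j] for z in range(k)])
                # Accumulate expected counts for the M-step.
                for z in range(k):
                    expected = A[i][j] * post[z]
                    new_zb[i][z] += expected
                    new_wz[z][j] += expected
        # M-step: renormalize expected counts into probabilities.
        p_z_b = [_normalize(row) for row in new_zb]
        p_w_z = [_normalize(row) for row in new_wz]
    return p_z_b, p_w_z

# Toy matrix (hypothetical): 4 blogs, 4 sentiment words, 2 factors.
A_toy = [[3, 2, 0, 0], [2, 4, 0, 0], [0, 0, 5, 1], [0, 0, 2, 3]]
p_z_b, p_w_z = plsa(A_toy, k=2)
for row in p_z_b:
    print([round(x, 3) for x in row])
```

Each blog is then represented by its k-dimensional row of `p_z_b` instead of its M-dimensional word counts.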
![Page 5: Chinese Blog Clustering by Hidden Sentiment Factors](https://reader035.fdocuments.us/reader035/viewer/2022062721/56813811550346895d9fc7d6/html5/thumbnails/5.jpg)
Clustering by HSF
• K-Means algorithm
– k’: number of clusters. In this paper, k’ = k (the number of hidden sentiment factors).
– Fig. 1: Similarity = 0
– Fig. 2: Similarity = ?
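A minimal K-Means sketch over the P(z|b) vectors is given below. It uses squared Euclidean distance and random initial centroids, which are assumptions: the similarity measure illustrated in Fig. 1 and Fig. 2 is not recoverable from the transcript. The names `kmeans` and `sq_dist` and the toy vectors are hypothetical.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=100, seed=0):
    """Cluster points into k groups; returns (labels, centroids)."""
    rng = random.Random(seed)
    # Initialize centroids with k distinct input points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [None] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid for every point.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the loop has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Toy P(z|b) vectors (hypothetical) for four blogs and k = 2 factors.
hsf_vectors = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
labels, centroids = kmeans(hsf_vectors, k=2)
print(labels)
```

Setting the number of clusters equal to the number of factors follows the slide's choice k’ = k.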
![Page 6: Chinese Blog Clustering by Hidden Sentiment Factors](https://reader035.fdocuments.us/reader035/viewer/2022062721/56813811550346895d9fc7d6/html5/thumbnails/6.jpg)
Label Words Extraction
![Page 7: Chinese Blog Clustering by Hidden Sentiment Factors](https://reader035.fdocuments.us/reader035/viewer/2022062721/56813811550346895d9fc7d6/html5/thumbnails/7.jpg)
Experiment
– 1. Collect blog reviews of Stephen Chow’s movie “CJ7” (Long River 7).
– 2. Collect blog entries about Liu Xiang since 2008/8/18.
• Tag 1: “Positive”, “Negative”, or “Neutral”
• Tag 2: “Irrelevant” or not
• Ex: A blog may be tagged {“Positive”, “Irrelevant”}, {“Neutral”}, or {“Negative”, “Irrelevant”}
• Evaluate the clustering purity.
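Clustering purity is commonly defined as the fraction of items that carry the majority tag of their cluster. The sketch below follows that common definition and assumes one tag per blog (for example, Tag 1 only); how the paper combines Tag 1 and Tag 2 is not stated in the transcript. The function name `purity` and the toy data are hypothetical.

```python
from collections import Counter, defaultdict

def purity(cluster_ids, tags):
    """Fraction of items whose tag matches the majority tag of their cluster."""
    if len(cluster_ids) != len(tags):
        raise ValueError("cluster_ids and tags must have the same length")
    if not tags:
        raise ValueError("at least one item is required")
    by_cluster = defaultdict(Counter)
    for cluster, tag in zip(cluster_ids, tags):
        by_cluster[cluster][tag] += 1
    # Sum the size of the largest tag group in each cluster.
    majority_total = sum(max(c.values()) for c in by_cluster.values())
    return majority_total / len(tags)

# Toy example (hypothetical): six blogs in two clusters.
clusters = [0, 0, 0, 1, 1, 1]
manual_tags = ["Positive", "Positive", "Negative",
               "Negative", "Negative", "Neutral"]
score = purity(clusters, manual_tags)
print(round(score, 4))  # 0.6667
```

A purity of 1.0 means every cluster contains blogs with a single tag; the value drops as clusters mix tags.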
![Page 8: Chinese Blog Clustering by Hidden Sentiment Factors](https://reader035.fdocuments.us/reader035/viewer/2022062721/56813811550346895d9fc7d6/html5/thumbnails/8.jpg)
Experiment
![Page 9: Chinese Blog Clustering by Hidden Sentiment Factors](https://reader035.fdocuments.us/reader035/viewer/2022062721/56813811550346895d9fc7d6/html5/thumbnails/9.jpg)
Experiment
![Page 10: Chinese Blog Clustering by Hidden Sentiment Factors](https://reader035.fdocuments.us/reader035/viewer/2022062721/56813811550346895d9fc7d6/html5/thumbnails/10.jpg)
Experiment
![Page 11: Chinese Blog Clustering by Hidden Sentiment Factors](https://reader035.fdocuments.us/reader035/viewer/2022062721/56813811550346895d9fc7d6/html5/thumbnails/11.jpg)
Experiment