SemEval-2017 Task 5: Fine-Grained Sentiment Analysis on Financial Microblogs and News
NLP & Semantic Computing Group
Keith Cortis, Andre Freitas, Tobias Daudert, Manuela Hurlimann, Manel Zarrouk, Siegfried Handschuh, Brian Davis
SemEval, August 2017
Outline
• Motivation & relevance
• Task description
• Challenge results
• Discussions
Motivation & Relevance
High societal impact
(Material implication of information and perception)
Interpretation of Events and Perception at Scale
Fine-grained (in which sense?)
One sentence may contain multiple target-sentiment attributions
“Sales at the tilmari business went down by 8% to eur 11.8 million, while gallerix stores saw 29% growth to eur 2 million.”
Continuous polarity scale
“Sales at the tilmari business went down by 8% to eur 11.8 million, while gallerix stores saw 29% growth to eur 2 million.”
tilmari business: -0.20, gallerix stores: +0.65
Other NLP Challenges
• Interpretation requires both domain-specific and commonsense knowledge
• Domain-specific can get really specific!
“$AAPL at pivot area on intraday chart - break here could send this to 50-day SMA, 457.80”
• Need for contextual information
“Sales at the tilmari business went down by 8% to eur 11.8 million, while gallerix stores saw 29% growth to eur 2 million.”
Relevant context: balance sheet, market analysis
Task Description
Two Subtasks
Subtask 1: Microblogs
Subtask 2: News Statements & Headlines
Microblog Messages
$EL: 0.95, $NKE: 0.5, $SBUX: 0.5, $AAPL: 0.5
“Este Lauder beats on Revenues and EPS and boosts dividend 25% - global growth in the Middle Class trend continues. $EL $NKE $SBUX $AAPL”
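The microblog example above attaches one sentiment score to each cashtag in the message. A minimal Python sketch of that target-level representation follows. The cashtag pattern and the helper name `extract_cashtags` are assumptions made for illustration and are not part of the task's tooling; the scores are the ones shown on the slide.

```python
import re

# Assumed simplification: a cashtag is "$" followed by 1 to 6 letters.
CASHTAG = re.compile(r"\$([A-Za-z]{1,6})\b")


def extract_cashtags(message: str) -> list[str]:
    """Return the distinct cashtags in a message, in order of first appearance."""
    seen: list[str] = []
    for match in CASHTAG.finditer(message):
        tag = "$" + match.group(1).upper()
        if tag not in seen:
            seen.append(tag)
    return seen


message = (
    "Este Lauder beats on Revenues and EPS and boosts dividend 25% - "
    "global growth in the Middle Class trend continues. "
    "$EL $NKE $SBUX $AAPL"
)

# Scores copied from the slide: one score per target entity, each in [-1, 1].
gold = {"$EL": 0.95, "$NKE": 0.5, "$SBUX": 0.5, "$AAPL": 0.5}

targets = extract_cashtags(message)
# One (target, score) pair per cashtag: the unit a system has to predict.
annotation = [(tag, gold[tag]) for tag in targets]
```

A single message therefore yields several instances, one per cashtag, which is why the same text can carry different scores for different companies.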
Data Source Types
• News Statements & Headlines
“First Solar, Vivint Solar Lead Short Interest Trend”
First Solar: -0.7, Vivint Solar: -0.7
Test Collection Creation
Raw Data Collection
• Collection periods: March 11th and 18th, 2016; October 2011 to June 2015; August and November 2015
• News sources: AP News, Reuters, Handelsblatt, Bloomberg and Forbes
Sampling & Filtering
• Random sampling
• Spam filtering
• 1591 messages
• 1847 messages
• 1780 news statements
Annotation
• 4 paid domain experts
• 120 hours (30 hours per expert)
• Identify target entities
• Associated sentiment score:
Continuous scale between:
• -1 (very negative/bearish)
• 1 (very positive/bullish)
• The sentiment is assigned from the point of view of an investment decision
Inter-annotator agreement
• Average Spearman’s rank correlation on sentiment scores was calculated for each pair of annotators:
0.54 for news headlines
0.69 for microblogs
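The pairwise agreement computation described above can be sketched in Python as follows. This is an illustrative sketch, not the organisers' script: the annotator names and scores are invented, and averaging the pairwise correlations with a plain mean is an assumption about how the figures were aggregated.

```python
from itertools import combinations
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def mean_pairwise_spearman(scores_by_annotator):
    """Average Spearman correlation over every pair of annotators."""
    pairs = list(combinations(scores_by_annotator.values(), 2))
    return sum(spearman(a, b) for a, b in pairs) / len(pairs)


# Invented scores for five instances from three hypothetical annotators.
scores = {
    "annotator_a": [0.9, 0.2, -0.5, -0.8, 0.1],
    "annotator_b": [0.8, 0.3, -0.4, -0.9, 0.0],
    "annotator_c": [0.7, -0.1, -0.6, -0.3, 0.2],
}
agreement = mean_pairwise_spearman(scores)
```

Because the correlation is computed on ranks, two annotators agree perfectly when they order the instances identically, even if their absolute scores differ.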
Test Collection Creation
• Subtask 1: 2510 Twitter and StockTwits messages
• Subtask 2: 1647 Headlines and News Statements
Evaluation
• Inspired by Ghosh et al. (2015)
• Vector space abstraction
• Rewards systems which attempt to classify more instances
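A minimal sketch of such a measure, under two assumptions that the slide text does not spell out: the vector-space comparison is cosine similarity between the gold and predicted score vectors, and the reward for attempting more instances is a weight equal to the fraction of gold instances the system answered. The function names and the data below are illustrative only.

```python
from math import sqrt


def cosine(gold, pred):
    """Cosine similarity between two equal-length score vectors."""
    dot = sum(g * p for g, p in zip(gold, pred))
    norm_gold = sqrt(sum(g * g for g in gold))
    norm_pred = sqrt(sum(p * p for p in pred))
    if norm_gold == 0 or norm_pred == 0:
        return 0.0
    return dot / (norm_gold * norm_pred)


def weighted_cosine(gold_by_id, pred_by_id):
    """Cosine over the answered instances, scaled by coverage (assumed weighting)."""
    answered = [i for i in gold_by_id if i in pred_by_id]
    if not answered:
        return 0.0
    gold = [gold_by_id[i] for i in answered]
    pred = [pred_by_id[i] for i in answered]
    coverage = len(answered) / len(gold_by_id)
    return coverage * cosine(gold, pred)


# Invented gold scores for four instances.
gold_scores = {"a": 0.5, "b": -0.5, "c": 0.2, "d": -0.7}

# A system that answers everything correctly versus one that answers only half.
full_run = dict(gold_scores)
partial_run = {"a": 0.5, "b": -0.5}

full_score = weighted_cosine(gold_scores, full_run)
partial_score = weighted_cosine(gold_scores, partial_run)
```

Under this weighting, a system that is perfectly accurate on half of the instances scores lower than one that is perfectly accurate on all of them.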
Challenge Results
Participation
• Total: 32 teams
• Track 1: 25 teams
• Track 2: 29 teams
• 22 teams addressed both tracks
• 19 teams submitted a paper
Strong engagement and active mailing list.
Results - Subtask 1 (Microblog Messages)
Machine Learning Methods
Linguistic Resources
Common Frameworks
Results - Subtask 2 (News Statements and Headlines)
Discussions
Alternative Evaluation Metric
Linguistic Resources
• Common linguistic resources:
  • Loughran and McDonald Sentiment Word (rank 2)
  • Opinion Lexicon (rank 1, 2)
  • MPQA Subjectivity Lexicon (rank 1, 2)
• New resources were created during the task:
  • Moore and Rayson
  • Seyeditabari et al.
  • Cabanski et al.
  • Schouten et al.
  • Li
Approaches
• Most approaches used ML + multiple sentiment lexicon features
• The spectrum of applied ML methods was very broad:
  • SVM-Regression (SVR)
  • Ensemble methods
  • Neural Network-based sequence models
• No conclusive or significant difference between different ML categories
• The use of domain-specific lexicons impacted results for Subtask 1
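To make the "ML + sentiment lexicon features" pattern concrete, here is a small Python sketch of the feature-extraction step. The two word lists are toy lexicons invented for this illustration, not any of the resources named in the talk, and the three features are one plausible choice among many. In a participant-style system, vectors like these would be passed to a regressor such as SVR.

```python
import re

# Toy lexicons, invented for illustration only.
POSITIVE = {"beats", "growth", "boosts", "gain"}
NEGATIVE = {"down", "loss", "short", "miss"}


def lexicon_features(text):
    """Return [positive count, negative count, length-normalised net polarity]."""
    tokens = re.findall(r"[a-z]+", text.lower())
    pos = sum(token in POSITIVE for token in tokens)
    neg = sum(token in NEGATIVE for token in tokens)
    total = max(len(tokens), 1)  # avoid division by zero on empty input
    return [pos, neg, (pos - neg) / total]


# The news headline quoted earlier in the talk.
headline = "First Solar, Vivint Solar Lead Short Interest Trend"
features = lexicon_features(headline)
```

A general-purpose lexicon would not necessarily list "short" as negative, which is one way a domain-specific lexicon can change the features a model sees.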
Open Questions
• Impact of the use of financial background information as background knowledge
• Deeper discussion on the relation between ML and linguistic/semantic phenomena
Acknowledgements
Horizon 2020 ICT Program Project SSIX: Social Sentiment analysis financial IndeXes has received funding from the European Union’s Horizon 2020 Research and Innovation Program ICT 2014 – Information and Communications Technologies under grant agreement No. 645425.