P05- DINA: A Multi-Dialect Dataset for Arabic Emotion Analysis

Muhammad Abdul-Mageed¹,², Hassan AlHuzliy¹, Duaa’ Abu Elhija¹, Mona Diab²
¹Indiana University, ²The George Washington University

Emotions

• Categories of emotion:
  – Ekman (e.g., 1992) proposes there are 6 basic emotions: anger, disgust, fear, happiness, sadness, and surprise
  – Plutchik (1980, 1985, 1994) adds trust and anticipation
• Emotion on 3 dimensions:
  – e.g., Francisco and Gervas (2006) mark the attributes of pleasantness, activation, and dominance in the genre of fairy tales.
• DINA is focused on the Ekman emotions.

Motivations

• Opinion Mining:
  – Provides an enriching component beyond the mere binary valence (i.e., positive and negative) of most sentiment analysis systems.
• Health & Wellness:
  – Early detection of certain emotional disorders such as depression.
  – Improving the well-being of people by exposing them to desired emotions (since emotion is contagious [Kramer et al., 2014]).
• Education:
  – Integrating emotionally-aware agents in intelligent computer-assisted language learning, for example, should prove useful and enhance the naturalness of the pedagogical experience.

Motivations Cont.

• Marketing:
  – e.g., emotion-sensitive language generation can help with marketing (Heath et al., 2001; Tan et al., 2014), political campaigning, etc.
• Security:
  – Deflect potential hazards and anticipate dangerous behaviors
• Author Profiling:
  – Useful for predicting age and gender (Meina et al., 2013; Flekova and Gurevych, 2013; Farias et al., 2013; Bamman et al., 2014; Forner et al., 2013) and personality (Mohammad and Kiritchenko, 2013)

Related Work

• SemEval-2007 Affective Text task (Strapparava and Mihalcea, 2007) [SEM07]:
  – Collection and classification of emotion and valence in news headlines
• Aman and Szpakowicz (2007):
  – Annotation and detection of emotions from blogs
• Qadir and Riloff (2014), Mohammad (2012), Wang et al. (2012):
  – Use hashtags as an approximation of emotion categories to collect emotion data

Arabic: Motivations

• Morphologically Rich Language:
  – Highly inflected: person, number, gender, case, mood, aspect, voice
• Strategic Language:
  – One of the 6 official languages of the UN, with ~300M speakers worldwide
• Exponential Web growth:
  – More than 2000% growth rate on the Web in 2010 onwards (www.internetworldstats.com).

Arabic Dialects

Data Collection

• Crawled Twitter data using a seed set of fewer than 10 phrases for each of the six Ekman emotion types.
• Each phrase is composed of an emotion word (e.g., “happy”) and the first-person pronoun “I”.
• We collect only tweets where a seed phrase occurs in the tweet body text.
• This approach does not depend on hashtags.
• We collect 500 tweets from each of the 6 emotion types. Total = 3,000.
• Seeds capture various Arabic dialects.
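
The seed-based filter described on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' crawler: the seed phrases are hypothetical English stand-ins (the real seeds are Arabic and appear in Table 1), and the names `SEEDS`, `match_emotion`, and `collect` are assumptions introduced here. Matching is plain substring search on whitespace-normalized text.

```python
# Hypothetical English stand-ins for the seed phrases; the actual DINA seeds
# are Arabic phrases (emotion word + first-person pronoun) and are not shown here.
SEEDS = {
    "happiness": ["i am happy", "i'm happy"],
    "anger": ["i am angry"],
}


def match_emotion(tweet_text, seeds=SEEDS):
    """Return the first emotion whose seed phrase occurs in the tweet text, else None."""
    # Lowercase and collapse whitespace so spacing differences do not block a match.
    body = " ".join(tweet_text.lower().split())
    for emotion, phrases in seeds.items():
        if any(phrase in body for phrase in phrases):
            return emotion
    return None


def collect(tweets, per_class=500, seeds=SEEDS):
    """Bucket tweets by matched emotion, keeping at most per_class tweets per emotion."""
    buckets = {emotion: [] for emotion in seeds}
    for text in tweets:
        emotion = match_emotion(text, seeds)
        if emotion is not None and len(buckets[emotion]) < per_class:
            buckets[emotion].append(text)
    return buckets
```

With six emotion types and `per_class=500`, a full run would yield the 3,000 tweets mentioned above. A hashtag such as `#happy` alone does not trigger a match, since the whole seed phrase must occur in the text.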

Seeds

Table 1. Example seeds

Annotation

• To verify the utility of this seed-based approach, two college-educated native speakers of Arabic labeled the data.

• For labeling, we use one of four tags from the set {“no-emotion/zero”, “weak-emotion”, “moderate/fair-emotion”, “strong-emotion”}.

• We measure inter-annotator agreement on these intensity labels using Cohen’s Kappa.

• We also calculate the % of emotion-carrying tweets per category (those that did not end up assigned the label “no-emotion/zero”).
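
As a rough illustration of the two measures on this slide, the sketch below computes Cohen's Kappa for two annotators and the percentage of emotion-carrying tweets for one annotator. The function names and the toy labels are assumptions made for this example; averaging the percentage over the two annotators is also an assumption about how the "average percentage of emotion" is obtained.

```python
from collections import Counter

NO_EMOTION = "no-emotion/zero"


def cohen_kappa(labels_a, labels_b):
    """Cohen's Kappa for two annotators labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both annotators.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement, from each annotator's label distribution.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(count_a[lab] * count_b[lab] for lab in set(count_a) | set(count_b)) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (p_o - p_e) / (1.0 - p_e)


def percent_emotion(labels):
    """Percentage of items not labeled 'no-emotion/zero'."""
    return 100.0 * sum(lab != NO_EMOTION for lab in labels) / len(labels)


# Toy labels for four tweets (illustrative only, not DINA data).
annotator_1 = ["strong-emotion", "strong-emotion", NO_EMOTION, "weak-emotion"]
annotator_2 = ["strong-emotion", "weak-emotion", NO_EMOTION, "weak-emotion"]

kappa = cohen_kappa(annotator_1, annotator_2)
avg_percent = (percent_emotion(annotator_1) + percent_emotion(annotator_2)) / 2
```

In practice both measures would be computed separately for each of the six emotion classes.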

DINA: Agreement & % Emotion

Table 3. Agreement in fine-grained annotation and average percentage of emotion

Gold Labels from Happiness Class

Table 2. Agreement in happiness annotation

Examples: Anger

Examples: Disgust

Examples: Fear

Examples: Happiness

Examples: Sadness

Examples: Surprise

Context of No- and Mixed Emotions

• Even with a list of well-crafted seeds, both annotators assign “no-emotion” for 7.5% of the data.

• This is a function of emotion being a pragmatics-level phenomenon.

• Contexts for “no-emotion” include:
  – Reported speech
  – Sarcasm

Reported Speech

Sarcasm

Conclusion

• Emotion is like other pragmatic-level phenomena; hence a seed-collection approach is useful, but not perfect.

• Phenomena like reported speech and sarcasm interact with our method for emotion data collection.

• DINA is multidialectal, but we do not have exact dialect labels on the tweets.

• DINA currently contains 3,000 tweets, and we plan to grow it.
• Full evaluation of DINA is only possible when we build models exploiting these data, which we plan to do.