P05- DINA: A Multi-Dialect Dataset for Arabic Emotion Analysis

Post on 15-Feb-2017

DINA: A Multi-Dialect Dataset for Arabic Emotion Analysis

Muhammad Abdul-Mageed (1, 2), Hassan AlHuzliy (1), Duaa’ Abu Elhija (1), Mona Diab (2)

(1) Indiana University, (2) The George Washington University

Emotions

• Categories of emotion:
  – Ekman (e.g., 1992) proposes there are 6 basic emotions: anger, disgust, fear, happiness, sadness, and surprise.
  – Plutchik (1980, 1985, 1994) adds trust and anticipation.

• Emotion on 3 dimensions:
  – e.g., Francisco and Gervas (2006) mark the attributes of pleasantness, activation, and dominance in the genre of fairy tales.

• DINA is focused on the Ekman emotions.

Motivations

• Opinion Mining:
  – Provides an enriching component beyond the mere binary valence (i.e., positive and negative) of most sentiment analysis systems.

• Health & Wellness:
  – Early detection of certain emotional disorders such as depression.
  – Improving the well-being of people by exposing them to desired emotions (since emotion is contagious [Kramer et al., 2014]).

• Education:
  – Integrating emotionally-aware agents in intelligent computer-assisted language learning, for example, should prove useful and enhance the naturalness of the pedagogical experience.

Motivations Cont.

• Marketing:
  – e.g., emotion-sensitive language generation can help with marketing (Heath et al., 2001; Tan et al., 2014), political campaigning, etc.

• Security:
  – Deflect potential hazards and anticipate dangerous behaviors.

• Author Profiling:
  – Useful for predicting age and gender (Meina et al., 2013; Flekova and Gurevych, 2013; Farias et al., 2013; Bamman et al., 2014; Forner et al., 2013) and personality (Mohammad and Kiritchenko, 2013).

Related Work

• SemEval-2007 Affective Text task (Strapparava and Mihalcea, 2007) [SEM07]:
  – Collection and classification of emotion and valence in news headlines

• Aman and Szpakowicz (2007):
  – Annotation and detection of emotions from blogs

• Qadir and Riloff (2014), Mohammad (2012), Wang et al. (2012):
  – Use hashtags as an approximation of emotion categories to collect emotion data

Arabic: Motivations

• Morphologically Rich Language:
  – Highly inflected: person, number, gender, case, mood, aspect, voice

• Strategic Language:
  – One of the 6 official languages of the UN, with ~300M speakers worldwide

• Exponential Web growth:
  – More than 2000% growth rate on the Web in 2010 onwards (www.internetworldstats.com).

Arabic Dialects

Data Collection

• Crawled Twitter data using a seed set of fewer than 10 phrases for each of the six Ekman emotion types.

• Each phrase is composed of an emotion word (e.g., “happy”) and the first-person pronoun “I”.

• We collect only tweets where a seed phrase occurs in the tweet body text.

• This approach does not depend on hashtags.

• We collect 500 tweets from each of the 6 emotion types. Total = 3,000.

• Seeds capture various Arabic dialects.
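
The seed-based filtering described on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' crawler: the seed phrases are hypothetical English stand-ins (the real seeds are Arabic phrases, see Table 1), and the names `SEEDS`, `collect_by_seeds`, and `limit_per_emotion` are assumptions introduced here. Matching is a plain substring test on the tweet body, so hashtags play no role.

```python
# Hypothetical English stand-ins for the Arabic seed phrases (assumption).
# Each seed pairs the first-person pronoun with an emotion word.
SEEDS = {
    "happiness": ["i am happy", "i feel happy"],
    "anger": ["i am angry"],
}


def collect_by_seeds(tweets, seeds, limit_per_emotion=500):
    """Keep tweets whose body text contains a seed phrase.

    Collection for an emotion stops once `limit_per_emotion` tweets are
    gathered (500 per emotion type in the slides). A tweet matching seeds
    from several emotions is kept under each of them.
    """
    collected = {emotion: [] for emotion in seeds}
    for text in tweets:
        body = text.lower()
        for emotion, phrases in seeds.items():
            if len(collected[emotion]) >= limit_per_emotion:
                continue  # this emotion already has enough tweets
            if any(phrase in body for phrase in phrases):
                collected[emotion].append(text)
    return collected
```

Lower-casing is only meaningful for the English placeholders; for Arabic text, some form of orthographic normalization would likely take its place, which is left out of this sketch.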

Seeds

Table 1. Example seeds

Annotation

• To verify the utility of this seeds approach, two college-educated native speakers of Arabic labeled the data.

• For labeling, we use one of four tags from the set {“no-emotion/zero”, “weak-emotion”, “moderate/fair-emotion”, “strong-emotion”}.

• We measure inter-annotator agreement on these intensity labels using Cohen’s Kappa.

• We also calculate the % of emotion-carrying tweets per category (those that did not end up assigned the label “no-emotion/zero”).
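
To make the two measurements above concrete, here is a small sketch of Cohen's Kappa for two annotators, together with the percentage of emotion-carrying tweets. It uses the standard unweighted Kappa definition, (observed agreement - chance agreement) / (1 - chance agreement). Whether the slides use a weighted variant is not stated, so that choice is an assumption, as are the function names `cohen_kappa` and `percent_emotion`.

```python
from collections import Counter

NO_EMOTION = "no-emotion/zero"  # label name taken from the tag set above


def cohen_kappa(labels_a, labels_b):
    """Unweighted Cohen's Kappa for two annotators over the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's own label distribution.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1.0 - expected)


def percent_emotion(labels):
    """Percentage of tweets not labeled 'no-emotion/zero'."""
    return 100.0 * sum(label != NO_EMOTION for label in labels) / len(labels)
```

Applying `percent_emotion` to each annotator's labels and averaging the two values is one plausible reading of the "average percentage of emotion" reported in Table 3; the slides do not spell out the averaging.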

DINA: Agreement & % Emotion

Table 3. Agreement in fine-grained annotation and average percentage of emotion

Gold Labels from Happiness Class

Table 2. Agreement in happiness annotation

Examples: Anger

Examples: Disgust

Examples: Fear

Examples: Happiness

Examples: Sadness

Examples: Surprise

Context of No- and Mixed Emotions

• Even with a list of well-crafted seeds, both annotators assign “no-emotion” for 7.5% of the data.

• This is a function of emotion being a pragmatics-level phenomenon.

• Contexts for “no-emotion” include:
  – Reported speech
  – Sarcasm

Reported Speech

Sarcasm

Conclusion

• Emotion is like other pragmatic-level phenomena; hence a seed-collection approach is useful, but not perfect.

• Phenomena like reported speech and sarcasm interact with our method for emotion data collection.

• DINA is multidialectal, but we do not have exact dialect labels on the tweets.

• DINA is at 3,000 tweets, and we plan to grow the size.

• Full evaluation of DINA is only possible when we build models exploiting these data, which we plan to do.