Three hybrid classifiers for the detection of emotions in suicide notes



EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK

T: +44 (0) 1223 494 444  F: +44 (0) 1223 494 468  http://www.ebi.ac.uk

Acknowledgements

We would like to thank EMBL-EBI, the Leverhulme Trust and the Swiss National Science Foundation for funding our work. We would also like to thank

Dr Ian Lewin and Dr Cristina Soriano for their feedback.

Jee-Hyub Kim [email protected]
Maria Liakata [email protected]
Shyamasree Saha [email protected]
Janna Hastings [email protected]

Three hybrid classifiers for the detection of emotions in suicide notes
Jee-Hyub Kim a,b, Maria Liakata a,b,c, Shyamasree Saha b, Janna Hastings b,d and Dietrich Rebholz-Schuhmann b

Introduction

Suicide is a growing concern in today's society. We describe our system for recognising emotions in suicide notes. Motivated by the sparse, imbalanced data and the complex annotation scheme, we considered three hybrid approaches to distinguishing between the different categories. Each approach combines machine learning with manually derived rules, the latter targeting very sparse emotion categories. The first approach treats the task as single-label multi-class classification: an SVM and a CRF classifier are trained to recognise fifteen different categories and their results are combined. The second approach trains individual binary classifiers (SVM and CRF) for each of the fifteen sentence categories and returns the union of their predictions as the final result. The third approach combines binary and multi-class classifiers (SVM and CRF) trained on different subsets of the training data. We considered a number of feature configurations. All three systems were tested on 300 unseen messages. Our second system performed best overall, yielding an F1 score of 45.6% and a precision of 60.1%, whereas the best recall (43.6%) was obtained with the third system.

System 1: Multi-class single label

We followed three hybrid approaches combining supervised machine learning with manually created rules to cater for sparse categories.

Features: Lexical (DATE, ADDRESS, NAME, unigrams, bigrams, trigrams); Grammatical (POS, verb, verb tense and voice, subjects, objects and grammatical triples) [1]; Negation [2] in sentence; sentence length and position.

Multi-class annotation: Single label. Independent LibSVM [3] and CRFSuite [4] models are trained, and the final label is assigned according to confidence score.
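A minimal sketch of System 1's combination step, assuming "assignment according to confidence score" means taking the label from whichever model is more confident. The function and example labels are illustrative, not the authors' code.

```python
# Pick the single label from the more confident of two models.
# Each prediction is a (label, confidence) pair.
def combine_single_label(svm_pred, crf_pred):
    return max([svm_pred, crf_pred], key=lambda p: p[1])[0]

print(combine_single_label(("hopelessness", 0.72), ("instructions", 0.55)))
# prints: hopelessness
```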

Union of binary classifiers: Multi-label. We trained individual binary classifiers for each of the fifteen categories present in the training data, for both LibSVM and CRFSuite, and took the union of their predictions. Best performance.
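An illustrative sketch of System 2's union step, under the assumption that a sentence receives every label that any per-category binary classifier fires for.

```python
# Multi-label union: a sentence gets every category predicted positive
# by either the LibSVM or the CRFSuite binary classifier for that category.
def union_of_binary(svm_labels, crf_labels):
    # svm_labels, crf_labels: sets of categories predicted positive
    return svm_labels | crf_labels

print(sorted(union_of_binary({"love", "thankfulness"}, {"love", "guilt"})))
# prints: ['guilt', 'love', 'thankfulness']
```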

Hybrid binary classifiers: Multi-label. We trained a binary emotion/non-emotion classifier on all the data, individual binary classifiers for emotions on emotion data only, and a multi-class classifier on non-emotion data. Instances are assigned according to weighted voting among the classifiers. Best recall, but many false positives.
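A hedged sketch of System 3's two-stage routing: a binary emotion/non-emotion gate, then weighted voting among per-emotion binary classifiers for emotional sentences, or a multi-class classifier otherwise. The weighting scheme and the 1.0 threshold are assumptions for illustration.

```python
from collections import defaultdict

def system3(sentence, is_emotion, emotion_classifiers, non_emotion_classifier,
            threshold=1.0):
    # emotion_classifiers: list of (weight, classify) pairs, where classify
    # returns the set of emotion labels it predicts for the sentence.
    if not is_emotion(sentence):
        return {non_emotion_classifier(sentence)}
    votes = defaultdict(float)
    for weight, classify in emotion_classifiers:
        for label in classify(sentence):
            votes[label] += weight
    # keep labels whose weighted vote clears the (hypothetical) threshold
    return {label for label, v in votes.items() if v >= threshold}

clfs = [(0.6, lambda s: {"sorrow"} if "lonely" in s else set()),
        (0.5, lambda s: {"sorrow", "fear"} if "afraid" in s else set())]
print(system3("i feel lonely and afraid", lambda s: True, clfs,
              lambda s: "instructions"))
# prints: {'sorrow'}
```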

System 2: Union of binary classifiers, multi-label

System 3: Combination of binary and multiclass classifiers, multi-label

We used 48 manual rules for the 8 sparsest categories, implemented as Perl regular expressions. The rules were applied only to sentences for which the automatic classifiers gave no prediction; the choice of features therefore affects both classifier performance and the results obtained from the rules.

Rules were proposed through manual inspection of the relevant annotated sentences in the training data and by examining synonym sets from WordNet-Affect [5]. Rules were validated against the corpus as a whole, and those retrieving too many false positives were discarded. Only one rule is allowed to fire: regular expressions are ordered from most specific to most general and from most frequent categories to least frequent.

Example:
$SORRY = "(sorry|miserable|lonely|sad|tired|alone|weary)"
"i* .{0,10}be .{0,10}$SORRY" => 'sorrow'
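The rule cascade above can be re-expressed as a short Python sketch (the authors used Perl). Rules are tried in order and only the first match fires; the $SORRY pattern is taken from the example, while the surrounding code and the exact anchoring of the pattern are illustrative assumptions.

```python
import re

SORRY = r"(sorry|miserable|lonely|sad|tired|alone|weary)"
RULES = [  # ordered from most specific to most general
    (re.compile(r"\bi.{0,10}be .{0,10}" + SORRY, re.IGNORECASE), "sorrow"),
]

def apply_rules(sentence):
    for pattern, category in RULES:
        if pattern.search(sentence):
            return category  # only one rule is allowed to fire
    return None  # no rule fired; left to the statistical classifiers

print(apply_rules("I will be so lonely without you"))
# prints: sorrow
```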

Methods and Features

Manual Rules

Results on training data

Discussion and Conclusion

References

a Authors contributed equally
b EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
c Department of Computer Science, Aberystwyth University
d Swiss Center for Affective Sciences, University of Geneva, Switzerland

1 Curran, James, Clark, Stephen and Bos, Johan. Linguistically motivated large-scale NLP with C&C and Boxer. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics Companion Volume: Proceedings of the Demo and Poster Sessions, pages 33–36, Prague, Czech Republic, June 2007.

2 Morante, Roser, Van Asch, Vincent and Daelemans, Walter. Memory-based resolution of in-sentence scopes of hedge cues. In Proceedings of the Fourteenth Conference on Computational Natural Language Learning — Shared Task, CoNLL '10, pages 40–47, Stroudsburg, PA, USA, 2010.

3 www.csie.ntu.edu.tw/cjlin/libsvm
4 www.chokkan.org/software/crfsuite
5 Valitutti, Ro. WordNet-Affect: an affective extension of WordNet. In Proceedings of the 4th International Conference on Language Resources and Evaluation, pages 1083–1086, 2004.

6 Pestian, John, Nasrallah, Henry, Matykiewicz, Pawel, Bennett, Aurora and Leenaars, Antoon. Suicide note classification using natural language processing: A content analysis. Biomed Inform Insights, 2010:19–28, 2010.

Annotation categories are mixed and the data distribution is imbalanced. Some categories are very sparse. There is a need to distinguish between emotions and non-emotions.

Emotion language is highly heterogeneous and overlaps across categories.

A combination of manual rules and machine learning classifiers was required for the task of emotion recognition.

We obtained a highest precision of 60.1%, a highest recall of 43.6% and a highest F1 of 45.6% on the test data. The union of binary classifiers performed best.

At present, the manual rules lower overall performance while increasing recall for sparse categories; further fine-tuning is required to improve this result.

Future work will involve more sophisticated ways of combining classifiers using stacking.

We believe a core problem of suicide prevention could be substantially reduced to that of detecting hopelessness, the most significant predictive emotion category.

[Figure: system architectures. System 1 (multi-class classifier): sentences → pre-processing → feature extractor → multi-class LibSVM and CRFSuite classifiers → voting; sentences left unlabelled are passed to the rule-based classifier → labelled sentences. System 2 (a set of binary classifiers): binary LibSVM and CRFSuite classifiers, one per emotion → union. System 3: a binary LibSVM emotion/non-emotion classifier routes emotional sentences to weighted voting over per-emotion binary LibSVM and CRFSuite classifiers, and the rest to a multi-class LibSVM classifier on non-emotions.]

Training        Feature   SVM      CRF      SVM+CRF   SVM+CRF+Rules
Multi Class     All       0.3957   0.4347   0.4415    0.438
Multi Class     Ngram     0.3606   0.4416   0.4437    0.44
Multi Class     No-Neg    0.3665   0.4332   0.4349    0.4318
Binary All      All       -        -        0.464     0.461
Binary All      Ngram     -        -        0.46      0.457
Binary All      No-Neg    -        -        0.46      0.457
Hybrid Binary   All       -        -        0.39      0.389
Hybrid Binary   Ngram     -        -        0.384     0.384
Hybrid Binary   No-Neg    -        -        0.3875    0.3865

Category       All P/R/F   Ngram P/R/F   Unigram P/R/F   GR P/R/F   Subject P/R/F   Verb P/R/F   Negation P/R/F

Instructions 0.63/0.53/0.58 0.64/0.55/0.60 0.58/0.57/0.57 0.5/0.16/0.24 0.56/0.12/0.19 0.5/0.14/0.22 0/0/0

Hopelessness 0.55/0.37/0.44 0.51/0.37/0.43 0.45/0.43/0.44 0.29/0.1/0.15 0.45/0.07/0.13 0.28/0.05/0.08 0/0/0

Love 0.65/0.53/0.58 0.63/0.52/0.57 0.57/0.57/0.57 0.45/0.17/0.25 0.17/0/0.01 0.48/0.03/0.06 0/0/0

Information 0.53/0.29/0.37 0.46/0.26/0.33 0.35/0.3/0.32 0.44/0.09/0.15 0.37/0.05/0.09 0.27/0.03/0.06 0/0/0

Guilt 0.47/0.24/0.32 0.41/0.26/0.31 0.28/0.29/0.28 0.09/0.04/0.06 0.25/0.02/0.04 0.15/0.04/0.06 0/0/0

Blame 0.32/0.06/0.1 0.25/0.07/0.11 0.15/0.12/0.13 0.11/0.02/0.03 0/0/0 0.13/0.02/0.03 0/0/0

Thankfulness 0.75/0.46/0.57 0.71/0.47/0.57 0.59/0.46/0.52 0.21/0.06/0.1 0.36/0.09/0.14 0.24/0.09/0.13 0/0/0

Anger 0/0/0 0/0/0 0.03/0.02/0.02 0/0/0 0.2/0.03/0.05 0/0/0 0/0/0

Sorrow 0/0/0 0/0/0 0.06/0.06/0.06 0/0/0 0/0/0 0/0/0 0/0/0

Hopefulness 0/0/0 0/0/0 0.07/0.05/0.05 0/0/0 0/0/0 0/0/0 0/0/0

Happiness 0/0/0 0/0/0 0.09/0.04/0.06 0/0/0 0/0/0 0/0/0 0/0/0

Fear 0/0/0 0/0/0 0/0/0 0/0/0 0/0/0 0.1/0.09/0.09 0/0/0

Pride 0/0/0 0/0/0 0/0/0 0/0/0 0/0/0 0/0/0 0/0/0

Abuse 0/0/0 0/0/0 0/0/0 0/0/0 0/0/0 0/0/0 0/0/0

Forgiveness 0/0/0 1/0.17/0.29 0.5/0.17/0.25 0/0/0 0/0/0 0/0/0 0/0/0

Overall 0.60/0.38/0.46 0.57/0.39/0.46 0.45/0.42/0.44 0.34/0.11/0.16 0.42/0.06/0.11 0/0/0

Category       All P/R/F   Ngram P/R/F   Unigram P/R/F   GR P/R/F   Subject P/R/F   Verb P/R/F   Negation P/R/F   Rules Only P/R/F

Instructions 0.63/0.53/0.58 0.64/0.55/0.60 0.58/0.57/0.57 0.5/0.16/0.24 0.56/0.12/0.19 0.5/0.14/0.22 0/0/0 0/0/0

Hopelessness 0.55/0.37/0.44 0.51/0.37/0.43 0.45/0.43/0.44 0.29/0.1/0.15 0.45/0.07/0.13 0.28/0.05/0.08 0/0/0 0/0/0

Love 0.65/0.53/0.58 0.63/0.52/0.57 0.57/0.57/0.57 0.45/0.17/0.25 0.17/0/0.01 0.48/0.03/0.06 0/0/0 0/0/0

Information 0.53/0.29/0.37 0.46/0.26/0.33 0.35/0.3/0.32 0.44/0.09/0.15 0.37/0.05/0.09 0.27/0.03/0.06 0/0/0 0/0/0

Guilt 0.47/0.24/0.32 0.41/0.26/0.31 0.28/0.29/0.28 0.09/0.04/0.06 0.25/0.02/0.04 0.15/0.04/0.06 0/0/0 0/0/0

Blame 0.32/0.06/0.1 0.25/0.07/0.11 0.15/0.12/0.13 0.11/0.02/0.03 0/0/0 0.13/0.02/0.03 0/0/0 0/0/0

Thankfulness 0.75/0.46/0.57 0.71/0.47/0.57 0.59/0.46/0.52 0.21/0.06/0.1 0.36/0.09/0.14 0.24/0.09/0.13 0/0/0 0/0/0

Anger 0.17/0.03/0.05 0.13/0.03/0.05 0.07/0.05/0.06 0.06/0.03/0.04 0.21/0.08/0.11 0.08/0.03/0.04 0.07/0.05/0.06 0.19/0.05/0.08

Sorrow 0.14/0.14/0.14 0.12/0.14/0.13 0.09/0.16/0.12 0.2/0.24/0.22 0.17/0.27/0.21 0.14/0.22/0.18 0.18/0.29/0.22 0.18/0.29/0.22

Hopefulness 0.13/0.07/0.09 0.13/0.07/0.09 0.09/0.09/0.09 0.09/0.07/0.08 0.11/0.09/0.1 0.14/0.11/0.13 0.18/0.14/0.16 0.18/0.14/0.16

Happiness 0.08/0.04/0.06 0.08/0.04/0.06 0.09/0.09/0.09 0.06/0.04/0.05 0.04/0.04/0.04 0.06/0.04/0.05 0.05/0.04/0.05 0.05/0.04/0.05

Fear 0.2/0.18/0.19 0.2/0.18/0.19 0.13/0.14/0.13 0.14/0.18/0.16 0.22/0.27/0.24 0.16/0.32/0.22 0.23/0.27/0.25 0.23/0.27/0.25

Pride 0.67/0.13/0.22 1/0.2/0.33 1/0.07/0.13 0.4/0.13/0.2 0.6/0.2/0.3 0.43/0.2/0.27 1/0.2/0.33 1/0.2/0.33

Abuse 0.5/0.38/0.43 0.25/0.13/0.17 0.05/0.25/0.33 0.6/0.38/0.46 0.33/0.13/0.18 0.5/0.25/0.33 0.6/0.38/0.46 0.6/0.38/0.46

Forgiveness 0.22/0.33/0.27 0.3/0.5/0.38 0.17/0.17/0.17 0/0/0 0.15/0.33/0.21 0.1/0.17/0.13 0.25/0.5/0.33 0.25/0.5/0.33

Overall 0.57/0.39/0.46 0.54/0.4/0.46 0.44/0.42/0.43 0.32/0.12/0.17 0.35/0.08/0.13 0.3/0.08/0.13 0.18/0.02/0.03 0.20/0.02/0.03

Table 1: System 2 (binary classifiers), no manual rules

Table 2: System 2 (binary classifiers) with manual rules

Table 3: Comparison of all three systems in three different feature configurations