University of Stuttgart Institute for Natural Language Processing Adversarial Training for Satire Detection: Controlling for Confounding Variables June 3rd, 2019 Robert McHardy, Heike Adel and Roman Klinger


University of Stuttgart Institute for Natural Language Processing

Adversarial Training for Satire Detection: Controlling for Confounding Variables

June 3rd, 2019

Robert McHardy, Heike Adel and Roman Klinger


Satire & Research Goals Model/Data Experiments & Results Conclusion

Motivation 1: Satire or not?

“After years of fighting there finally is a settlement between the Gema and Youtube. It became known today, that in future every music video is allowed to be played back in Germany again, as long as the audio is removed” (translated from German)

University of Stuttgart McHardy/Adel/Klinger June 3rd, 2019 2 / 12


Motivation 2: Satire or not?

“Erfurt ( dpo ) – It is an organization which operates outside of law and order, funds numerous NPD operatives and is to a not inconsiderable extent involved in the series of murders of the so-called Zwickauer Zelle.” (translated from German)

DPA is a German news agency; DPO does not exist (in this context).


Outline

1 Satire, Previous Work and Research Goals

2 Model and Data

3 Experiments & Results

4 Conclusion & Availability


Satire

● Form of art to criticize in an entertaining manner
● Stylistic devices include humor, irony, sarcasm
● Goal: Mimic regular news in diction
● It is not misinformation or disinformation (fake news): articles typically contain satire markers (similar to irony or sarcasm)

Automatic Satire Detection

Automatically distinguish satirical news from regular news
⇒ Challenging task (even for humans)


Previous Work

Yang et al. 2017, De Sarkar et al. 2018

● Created data sets which are automatically labeled from publication source
● Potential limitation: Models might learn characteristics of publication sources instead of actual characteristics of satire
● (evaluation is not faulty, they use different publication sources for validation than for training)

⇒ Bad generalization to unseen publication sources?
⇒ Interpretation of models (regarding concepts of satire) misleading?


Our Contributions

● We propose adversarial training: Improve robustness of model against confounding variable of publication sources
● We show that adversarial training is crucial for the model to pay attention to satire instead of publication characteristics
● We publish a large German data set for satire detection.
  ● First dataset in German
  ● First dataset including publication sources
  ● Largest resource for satire detection so far




Model

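Feature extractor: input layer → LSTM layer → attention layer

Satire detector → satire? (yes/no)

Publication identifier → publication name

Gradients shown in the figure:

● Satire detector: $\partial J_s / \partial \theta_s$
● Feature extractor, from the satire detector: $\partial J_s / \partial \theta_f$
● Publication identifier: $\partial J_p / \partial \theta_p$
● Feature extractor, from the publication identifier: $-\lambda \, \partial J_p / \partial \theta_f$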
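The gradient flow on this slide can be illustrated with a toy example. The sketch below is not the authors' implementation. It assumes a scalar feature extractor, sigmoid heads trained with binary cross-entropy, and a binary publication label, all chosen only to keep the arithmetic visible. The names `adversarial_step`, `theta_f`, `theta_s`, `theta_p`, `lam` and `lr` are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def adversarial_step(params, x, y_satire, y_pub, lam=0.5, lr=0.1):
    """One gradient-descent step with a reversed publication gradient.

    Toy assumptions (not from the slides): scalar parameters, a linear
    "feature extractor" f = theta_f * x, and two sigmoid heads.
    """
    theta_f, theta_s, theta_p = params

    # Forward pass
    f = theta_f * x               # shared features
    s = sigmoid(theta_s * f)      # satire detector output
    p = sigmoid(theta_p * f)      # publication identifier output

    # For sigmoid + binary cross-entropy, dJ/dlogit = prediction - label
    err_s = s - y_satire
    err_p = p - y_pub

    # Head gradients: each head minimizes its own loss
    grad_s = err_s * f            # dJ_s / dtheta_s
    grad_p = err_p * f            # dJ_p / dtheta_p

    # Feature-extractor gradients from both heads
    grad_f_s = err_s * theta_s * x    # dJ_s / dtheta_f
    grad_f_p = err_p * theta_p * x    # dJ_p / dtheta_f

    # Gradient reversal: the publication gradient enters with factor -lam
    grad_f = grad_f_s - lam * grad_f_p

    new_params = (
        theta_f - lr * grad_f,
        theta_s - lr * grad_s,
        theta_p - lr * grad_p,
    )
    return new_params, grad_f


if __name__ == "__main__":
    new_params, grad_f = adversarial_step((0.5, 1.0, 1.0), x=2.0,
                                          y_satire=1, y_pub=1, lam=0.5)
    print(new_params, grad_f)
```

With `lam=0` the feature extractor is trained on the satire loss alone. Larger values of `lam` push the shared features away from whatever helps the publication identifier, while the publication head itself still tries to minimize its own loss.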


Data Collection and Selection

● Regular news: Der Spiegel, Der Standard, Die Zeit, Süddeutsche Zeitung
● Satire: Der Enthüller, Eulenspiegel, Nordd. Nach., Der Postillon, Satirepatzer, Die Tagespresse, Titanic, Welt (Satire), Der Zeitspiegel, Eine Zeitung, Zynismus24
● Articles published between January 1st, 2000 and May 1st, 2018

                           Average Length
Publication   #Articles    Article    Sent.    Title
Regular       320,219      663.45     17.79    6.86
Satire        9,643        269.28     18.73    9.52


Research Question 1: Performance

How does a decrease in publication classification performance through adversarial training affect the satire classification performance?


Research Question 2: Attention Weights

Is adversarial training effective in keeping the model from paying most attention to the characteristics of the publication source rather than to actual satire?

[Attention-weight highlighting from the slide is not preserved.]

no adv: Erfurt ( dpo ) - It is an organization which operates outside of law and order , funds numerous NPD operatives and is to a not inconsiderable extent involved in the series of murders of the so called Zwickauer Zelle .

adv: Erfurt ( dpo ) - It is an organization which operates outside of law and order , funds numerous NPD operatives and is to a not inconsiderable extent involved in the series of murders of the so called Zwickauer Zelle .

no adv: After all , the proposal to allow family reunion only inclusive mothers-in-law is being discussed , whereof the Union hopes for an off-putting effect .

adv: After all , the proposal to allow family reunion only inclusive mothers-in-law is being discussed , whereof the Union hopes for an off-putting effect .


Conclusion and Availability

● Observation: Satire detection models learn characteristics of publication sources

Our Contributions

● Adversarial training to control for this confounding variable
  ⇒ Considerable reduction of publication identification performance while satire detection remains on comparable levels
  ⇒ Attention weights show effectiveness of our approach
● First German dataset for satire detection
  ⇒ Dataset and code available at: http://www.ims.uni-stuttgart.de/data/germansatire

