Transcript of: Emily Pitler, Annie Louis, Ani Nenkova, University of Pennsylvania.
Automatic Sense Prediction for Implicit Discourse Relations in Text
Emily Pitler, Annie Louis, Ani Nenkova
University of Pennsylvania
Implicit Discourse Relations
2
I am in Singapore, but I live in the United States.
◦ Explicit Comparison
The main conference is over Wednesday. I am staying for EMNLP.
◦ Implicit Comparison
Implicit discourse relations are hard
3
I am here because I have a presentation to give at ACL.
◦ Explicit Contingency
I am a little tired; there is a 13 hour time difference.
◦ Implicit Contingency
Implicit discourse relations are hard
4
Focus on implicit discourse relations
◦ in a realistic distribution
Better understanding of lexical features
◦ Showed they do not capture semantic oppositions
Empirical validation of new and old features
◦ Polarity, verb classes, context, and some lexical features indicate discourse relations
First experiments on implicits
5
Classify both implicits and explicits
◦ Same sentence [Soricut and Marcu, 2003]
◦ GraphBank corpus: doesn’t distinguish implicit and explicit [Wellner et al., 2006]
Create artificial implicits by deleting connective
◦ I am in Singapore, [but] I live in the United States.
◦ [Marcu and Echihabi, 2001; Blair-Goldensohn et al., 2007; Sporleder and Lascarides, 2008]
Related work on relation sense
6
Word Pairs Investigation
7
Most basic feature for implicits
I_there, I_is, …, tired_time, tired_difference
Word pairs as features
8
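The word-pair feature above can be expressed as a cross product of the two arguments' tokens. A minimal sketch, assuming simple whitespace tokenization:

```python
def word_pairs(arg1, arg2):
    """One feature per (Arg1 word, Arg2 word) pair, written w1_w2."""
    return [f"{w1}_{w2}" for w1 in arg1.split() for w2 in arg2.split()]

pairs = word_pairs("I am a little tired",
                   "there is a 13 hour time difference")
# 5 x 7 tokens -> 35 pair features, including "I_there" and "tired_difference"
```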
Arg1: I am a little tired
Arg2: there is a 13 hour time difference
Marcu and Echihabi, 2001
The recent explosion of country funds mirrors the “closed-end fund mania” of the 1920s, Mr. Foot says, when narrowly focused funds grew wildly popular.
They fell into oblivion after the 1929 crash.
Intuition: with large amounts of data, will find semantically-related pairs
9
Using just content words reduces performance (but has a steeper learning curve)
◦ Marcu and Echihabi, 2001
Nouns and adjectives don’t help at all
◦ Lapata and Lascarides, 2004
Filtering out stopwords lowers results
◦ Blair-Goldensohn et al., 2007
Meta error analysis of prior work
10
Synthetic implicits: Cause/Contrast/None sentences
◦ Explicit instances from Gigaword with connective deleted
◦ “Because” → Cause, “But” → Contrast
◦ At least 3 sentences apart → None
◦ Blair-Goldensohn et al., 2007
Random selection
◦ 5,000 Cause
◦ 5,000 Other
Computed information gain of word pairs
Word pairs experiments
11
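The information-gain computation over word-pair features can be sketched as follows: the entropy of the class labels minus their conditional entropy given whether the feature fires. This is a minimal sketch on toy data; the slide's 5,000/5,000 Cause/Other sample would supply the real input.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(feature_fires, labels):
    """Class entropy minus conditional entropy given a binary feature."""
    gain = entropy(labels)
    for value in (True, False):
        subset = [y for f, y in zip(feature_fires, labels) if f == value]
        if subset:
            gain -= len(subset) / len(labels) * entropy(subset)
    return gain

# A perfectly predictive word pair recovers the full class entropy (1 bit here)
g = info_gain([True, True, False, False], ["Cause", "Cause", "Other", "Other"])
```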
The government says it has reached most isolated townships by now, but because roads are blocked, getting anything but basic food supplies to people remains difficult.
◦ “but” → Comparison
◦ “because” → Contingency
“but” signals “Not-Comparison” in synthetic data
12
Maybe even with lots and lots of data, we won’t see “popular…but…oblivion” that often
What are we trying to get at?
Positive: popular, desirable, mollify
Negative: oblivion, abhorrent, enrage
Sentiment orientation relieves lexical sparsity
13
Features for sense prediction
14
Multi-perspective Question Answering (MPQA) Opinion Corpus
◦ Wilson et al., 2005
Sentiment words annotated as
◦ Positive
◦ Negative
◦ Both
◦ Neutral
Resource for Polarity Tags
15
Similar to word pairs, but words replaced with polarity tags
Arg1: Executives at Time Inc. Magazine Co., a subsidiary of Time Warner, have said the joint venture with Mr. Lang wasn’t a good one.
Arg2: The venture, formed in 1986, was supposed to be Time’s low-cost, safe entry into women’s magazines.
Arg1: NegatePositive, Arg2: Positive → pair feature Arg1NegatePositive_Arg2Positive
Polarity Tags pairs
16
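A minimal sketch of the polarity-pair feature: words are mapped to polarity tags, a preceding negation flips a tag to Negate+tag, and the tags of the two arguments are paired. The lexicon and negation list here are toy stand-ins for the MPQA annotations, not the actual resource.

```python
# Toy lexicon standing in for the MPQA polarity annotations (Wilson et al., 2005)
LEXICON = {"good": "Positive", "safe": "Positive", "bad": "Negative"}
NEGATIONS = {"not", "n't", "never", "no"}

def polarity_tags(tokens):
    """Tag sentiment words; a preceding negation yields Negate+tag."""
    tags, negated = [], False
    for tok in tokens:
        low = tok.lower()
        if low in NEGATIONS:
            negated = True
        elif low in LEXICON:
            tags.append(("Negate" if negated else "") + LEXICON[low])
            negated = False
    return tags

def polarity_pairs(arg1_tokens, arg2_tokens):
    """Cross product of the two arguments' polarity tags."""
    return [f"Arg1{t1}_Arg2{t2}"
            for t1 in polarity_tags(arg1_tokens)
            for t2 in polarity_tags(arg2_tokens)]

pairs = polarity_pairs("the venture was n't a good one".split(),
                       "it was supposed to be a safe entry".split())
# -> ["Arg1NegatePositive_Arg2Positive"]
```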
General Inquirer lexicon
◦ Stone et al., 1966
◦ Semantic categories of words
Complementary classes
◦ “Understatement” vs. “Overstatement”
◦ “Rise” vs. “Fall”
◦ “Pleasure” vs. “Pain”
Features ~ Tag pairs, only verbs
Inquirer Tags
17
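The Inquirer tag-pair feature over verbs can be sketched the same way as the polarity pairs, with a toy category map standing in for the General Inquirer lexicon:

```python
# Toy stand-in for General Inquirer categories (Stone et al., 1966)
INQUIRER = {"grew": "Rise", "rose": "Rise", "fell": "Fall", "dropped": "Fall"}

def inquirer_pairs(arg1_verbs, arg2_verbs):
    """Pairs of Inquirer category tags for the two arguments' verbs."""
    tags1 = [INQUIRER[v] for v in arg1_verbs if v in INQUIRER]
    tags2 = [INQUIRER[v] for v in arg2_verbs if v in INQUIRER]
    return [f"{t1}_{t2}" for t1 in tags1 for t2 in tags2]

# "funds grew wildly popular" / "fell into oblivion"
pairs = inquirer_pairs(["grew"], ["fell"])
# -> ["Rise_Fall"], a complementary-class pair
```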
Newsweek's circulation for the first six months of 1989 was 3,288,453, flat from the same period last year
U.S. News' circulation in the same time was 2,303,328, down 2.6%
Probably WSJ-specific
Money/Percent/Num
18
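The money/percent/number flags can be sketched as simple token tests. This is a hypothetical regex-based version; the exact patterns used in the original system are not specified here.

```python
import re

def money_percent_num(tokens):
    """Flags for money, percent, and plain number mentions (WSJ-style text)."""
    feats = set()
    for tok in tokens:
        if tok.startswith("$"):
            feats.add("Money")
        elif tok.endswith("%"):
            feats.add("Percent")
        elif re.fullmatch(r"\d[\d,.]*", tok):
            feats.add("Num")
    return feats

feats = money_percent_num("circulation was 2,303,328 , down 2.6%".split())
# -> {"Num", "Percent"}
```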
Levin verb class level in LCS database
◦ Levin, 1993; Dorr, 2001
◦ More related verbs ~ Expansion
Average length of verb chunk
◦ They [are allowed to proceed] ~ Contingency
◦ They [proceed] ~ Expansion, Temporal
POS tags of the main verb
◦ Same tense ~ Expansion
◦ Different tense ~ Contingency, Temporal
Verbs
19
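Two of the verb features can be sketched directly over pre-chunked input. This is a minimal sketch with hypothetical input shapes; the Levin-class lookup via the LCS database is omitted.

```python
def verb_features(chunks1, chunks2, main_pos1, main_pos2):
    """chunksN: verb chunks of one argument, each a list of tokens.
    main_posN: Penn Treebank POS tag of the argument's main verb."""
    def avg_len(chunks):
        return sum(len(c) for c in chunks) / len(chunks) if chunks else 0.0
    return {
        "avg_verb_chunk_len_arg1": avg_len(chunks1),
        "avg_verb_chunk_len_arg2": avg_len(chunks2),
        "same_tense": main_pos1 == main_pos2,  # e.g. VBP vs. VBD -> False
    }

feats = verb_features([["are", "allowed", "to", "proceed"]], [["proceed"]],
                      "VBP", "VBD")
```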
Prior work found first and last words very helpful in predicting sense
◦ Wellner et al., 2006
◦ Often explicit connectives
First-Last, First3
20
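These features are straightforward to extract; a minimal sketch, assuming whitespace tokenization:

```python
def first_last_first3(tokens):
    """First word, last word, and first three words of one argument."""
    return {"first": tokens[0],
            "last": tokens[-1],
            "first3": tuple(tokens[:3])}

feats = first_last_first3("I am staying for EMNLP".split())
# first="I", last="EMNLP", first3=("I", "am", "staying")
```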
Was preceding/following relation explicit?
◦ If so, which sense?
◦ If so, which connective?
Does Arg1 begin a paragraph?
Context
21
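The context features can be sketched as a small feature dictionary. The neighbouring-relation records here are a hypothetical representation (dicts with explicit/sense/connective fields), not the PDTB's actual data format.

```python
def context_features(prev_rel, next_rel, arg1_starts_paragraph):
    """prev_rel / next_rel: None, or a dict like
    {"explicit": True, "sense": "Comparison", "connective": "but"}."""
    feats = {"arg1_starts_paragraph": arg1_starts_paragraph}
    for name, rel in (("prev", prev_rel), ("next", next_rel)):
        explicit = bool(rel and rel.get("explicit"))
        feats[name + "_explicit"] = explicit
        if explicit:
            feats[name + "_sense"] = rel["sense"]
            feats[name + "_connective"] = rel["connective"]
    return feats

feats = context_features({"explicit": True, "sense": "Comparison",
                          "connective": "but"}, None, True)
```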
Largest available annotated corpus of discourse relations
◦ Penn Treebank WSJ articles
◦ 16,224 implicit relations between adjacent sentences
I am a little tired; [because] there is a 13 hour time difference.
◦ Contingency.cause.reason
Penn Discourse Treebank
22
Relation sense    Proportion of implicits
Expansion 53%
Contingency 26%
Comparison 15%
Temporal 6%
Top level senses in PDTB
23
Developed features on sections 0-1
Trained on sections 2-20
Tested on sections 21-22
Binary classification task for each sense
Trained on equal numbers of positive and negative examples
Tested on natural distribution
Naïve Bayes classifier
Classification Experiments on PDTB Implicits
24
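The training setup, training on equal numbers of positive and negative examples per sense while testing on the natural distribution, amounts to a downsampling step. A sketch with a hypothetical helper, not the authors' code:

```python
import random

def balance(examples, labels, positive_sense, seed=0):
    """Downsample so positive and negative training examples are equal in number."""
    pos = [(x, y) for x, y in zip(examples, labels) if y == positive_sense]
    neg = [(x, y) for x, y in zip(examples, labels) if y != positive_sense]
    n = min(len(pos), len(neg))
    rng = random.Random(seed)
    sample = rng.sample(pos, n) + rng.sample(neg, n)
    rng.shuffle(sample)
    return sample

train = balance(list(range(10)),
                ["Comparison"] * 2 + ["Expansion"] * 8, "Comparison")
# 2 Comparison + 2 non-Comparison examples
```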
Results
25
Motivation in prior work
◦ Train on synthetic implicits
What works better
◦ Train on actual implicits
◦ With only best features selected from synthetic implicits
Synthetic examples can still help!
F-scores (Comp. / Cont.):
Train on synthetic implicits: 17.13 / 31.10
Train on actual implicits: 20.96 / 43.79
Train on actual implicits, with best synthetic-selected word pairs: 21.96 / 45.60
Results: Word pairs for comparison and contingency
26
Features f-score
First-Last, First3 21.01
Context 19.32
Money/Percent/Num 19.04
Random 9.91
Results: Comparison
27
Polarity is actually the worst feature: 16.63
Positive-Negative or Negative-Positive pairs:
◦ Comparison: 30%
◦ Not Comparison: 31%
Distribution of Opposite Polarity Pairs
28
Features f-score
First-Last, First3 36.75
Verbs 36.59
Context 29.55
Random 19.11
Results: Contingency
29
Features f-score
Polarity Tags 71.29
Inquirer Tags 70.21
Context 67.77
Random 64.74
Results: Expansion
30
• Expansion is majority class
• Precision more problematic than recall
• These features all help other senses
Features f-score
First-Last, First3 15.93
Verbs 12.61
Context 12.34
Random 5.38
Results: Temporal
31
Temporals often end with words like “Monday” or “yesterday”
Comparison
◦ Selected word pairs
Contingency
◦ Polarity, Verb, First/Last, Modality, Context, Selected word pairs
Best feature sets
32
Expansion
◦ Polarity, Inquirer Tags, Context
Temporal
◦ First/Last + word pairs
Best feature sets
33
Comparison: 21.96 (17.13)
Contingency: 47.13 (31.10)
Expansion: 76.41 (63.84)
Temporal: 16.76 (16.21)
Best Results: f-scores
34
Comparison/Contingency baseline: synthetic implicits word pairs
Expansion/Temporal baseline: real implicits word pairs
Results from classifying each relation independently
◦ Naïve Bayes, MaxEnt, AdaBoost
Since context features were helpful, tried CRF
6-way classification, word pairs as features
◦ Naïve Bayes accuracy: 43.27%
◦ CRF accuracy: 44.58%
Further experiments using context
35
Focus on implicit discourse relations
◦ in a realistic distribution
Better understanding of word pairs
◦ Showed they do not capture semantic oppositions
Empirical validation of new and old features
◦ Polarity, verb classes, context, and some lexical features indicate discourse relations
Conclusion
36