Intrinsic and Extrinsic Approaches to Recognizing Textual Entailment
Rui Wang DFKI GmbH & Saarland University
-
Intrinsic vs. Extrinsic
5/23/11 Oslo, Norway 2
-
Recognizing Textual Entailment (RTE)
• Textual Entailment (Chierchia and McConnell-Ginet, 2000)
  – One text can be inferred from another text.
• An example
  – Text (T): Google files for its long awaited IPO.
  – Hypothesis (H): Google goes public.
-
Scope
• Logic Entailment
  – H is true OR T is not true
• Linguistic Implication
  – Conventional Implicature
  – Conversational Implicature
• Modality
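The logic entailment bullet reads as material implication: T entails H exactly when H is true or T is not true. A minimal sketch of that truth function follows; the names `entails` and `truth_table` are illustrative and do not come from the talk.

```python
def entails(t_true: bool, h_true: bool) -> bool:
    """Material implication: holds when H is true or T is not true."""
    return h_true or not t_true


# Enumerate all four truth assignments for (T, H).
# The only falsifying case is a true T paired with a false H.
truth_table = {
    (t, h): entails(t, h)
    for t in (True, False)
    for h in (True, False)
}
```

This covers only the logical reading; the linguistic implication and modality bullets are not captured by a truth table.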
-
Machine Translation (MT) Triangle
-
RTE Rectangle
-
Outline
-
Outline (cont.)
• Intrinsic Approaches
  – Architecture
  – RTE with Event Triples
  – RTE with Inference Rules
• Extrinsic Approaches
  – Motivation
  – Multi-Dimensional Classification Model
• Summary and Perspectives
-
Intrinsic Approaches
-
An Example
• T: Bush used his weekly radio address to try to build support for his plan to allow workers to divert part of their Social Security payroll taxes into private investment accounts.
• H: Mr. Bush is proposing that workers be allowed to divert their payroll taxes into private accounts.
-
Performance of the Existing Systems
[Bar chart; y-axis from 0.5 to 0.85; groups RTE-3, RTE-4, RTE-5; series 1st, 2nd, 3rd, 4th, 5th, Wang. Surviving data labels: 0.8, 0.746, 0.735, 0.685, 0.706, 0.669.]
-
Architecture
-
Specialized RTE Module
• Two requirements
  – A good target
  – A good tackle
• Two examples
  – Event tuples containing named entities
  – Tree skeleton with inference rules
-
Event Tuple
• <Event type, Time, Location, List>
  – Persons
  – Organizations
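The tuple layout above can be pictured as a small record type. This is only a sketch under assumptions: the class name `EventTuple`, the field names, and the choice to store persons and organizations as two separate tuples are illustrative, not taken from the talk.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class EventTuple:
    """Hypothetical container for <Event type, Time, Location, List>."""
    event_type: str
    time: Optional[str] = None        # e.g. a normalized year such as "1997"
    location: Optional[str] = None
    persons: Tuple[str, ...] = ()     # named entities of type person
    organizations: Tuple[str, ...] = ()


# Illustrative instance; the values are made up for the sketch.
example = EventTuple(
    event_type="win",
    time="1996",
    persons=("Tyson",),
    organizations=("World Boxing Council",),
)
```

Making the record immutable (`frozen=True`) keeps extracted events hashable, so they can be collected into sets when comparing T against H.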
-
An Example
• T: Released in 1995, Tyson returned to boxing, winning the World Boxing Council title in 1996. The same year, however, he lost to Evander Holyfield, and in a 1997 rematch bit Holyfield’s ear, for which he was temporarily banned from boxing.
• H: In 1996 Mike Tyson bit Holyfield’s ear.
-
Event Time Pair
• T:
  – – – –
• H:
  –
-
Event Tuple
• T:
  – – – –
• H:
  –
-
Entailment Recognition
• Event type comparison
  – Lexical resources, e.g. WordNet, VerbOcean
• Named-entity comparison
  – Time expression normalization and anchoring
  – Ontology of geographical terms
  – Person name / organization name partial matching
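The two comparison steps listed above can be sketched on the Tyson example. Everything here is a simplified assumption: the `Event` record, the toy `RELATED_VERBS` table (a stand-in for a lexical resource such as WordNet or VerbOcean), the token-subset name matching, and the use of plain years instead of real time normalization are illustrative and not the system's actual implementation.

```python
from typing import NamedTuple, Optional, Tuple


class Event(NamedTuple):
    event_type: str              # lemmatized event word
    year: Optional[int]          # already-normalized time expression
    persons: Tuple[str, ...]


# Toy stand-in for a lexical resource; the single entry is hypothetical.
RELATED_VERBS = {frozenset({"receive", "get"})}


def same_event_type(a: str, b: str) -> bool:
    return a == b or frozenset({a, b}) in RELATED_VERBS


def names_match(a: str, b: str) -> bool:
    """Partial matching: one name's tokens are a subset of the other's."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return ta <= tb or tb <= ta


def event_entails(t: Event, h: Event) -> bool:
    if not same_event_type(t.event_type, h.event_type):
        return False
    if h.year is not None and t.year != h.year:
        return False             # time slots disagree
    # Every person mentioned in H must be matched by some person in T.
    return all(any(names_match(p, q) for q in t.persons) for p in h.persons)


t_event = Event("bite", 1997, ("Tyson", "Holyfield"))
h_event = Event("bite", 1996, ("Mike Tyson", "Holyfield"))
result = event_entails(t_event, h_event)  # False: the years differ
```

With the same event type and matching names, the mismatch between 1997 and 1996 alone is enough to reject the hypothesis.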
-
Inference Rules
• DIRT collection (Lin and Pantel, 2001)
-
Tree Skeletons
T: Doctor Robin Warren and Barry Marshall received Nobel Prize …
H: Robin Warren was awarded a Nobel Prize.
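The T/H pair above shares its arguments and differs mainly in the predicate ("received" vs. "was awarded"). The sketch below shows how an inference rule might license that step. It is a simplification under assumptions: real DIRT rules relate dependency paths between two slots X and Y, whereas here a skeleton is reduced to a hypothetical (x, predicate, y) record and the single rule in `RULES` is made up for illustration.

```python
from typing import NamedTuple, Set, Tuple


class Skeleton(NamedTuple):
    x: str            # left slot filler
    predicate: str    # lemmatized predicate linking the two slots
    y: str            # right slot filler


# Hypothetical rule set: (predicate in T, predicate in H).
RULES: Set[Tuple[str, str]] = {("receive", "be awarded")}


def rule_matches(t: Skeleton, h: Skeleton, rules: Set[Tuple[str, str]] = RULES) -> bool:
    """True when the slots agree and the predicates are equal or linked by a rule."""
    same_slots = t.x == h.x and t.y == h.y
    same_pred = t.predicate == h.predicate
    return same_slots and (same_pred or (t.predicate, h.predicate) in rules)


t_skel = Skeleton("Robin Warren", "receive", "Nobel Prize")
h_skel = Skeleton("Robin Warren", "be awarded", "Nobel Prize")
matched = rule_matches(t_skel, h_skel)  # True: the rule bridges the predicates
```

The rule is directional in this sketch: swapping T and H would not match unless the reverse pair were also listed.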
-
Results
RTE-2
Systems     BoW     TACTE   with ExDIRT
Coverage    100%    10.2%   8.1%
Accuracy*   0.579   0.684   0.654
RTE-3
Systems     BoW     TACTE   with ExDIRT
Coverage    100%    8.1%    5.8%
Accuracy*   0.611   0.655   0.721
*On covered data
-
Summary and Future Extensions
                                     Target                          Tackle                                Preferences
Event tuples                         NE recognition                  NE relation/resolution                “No”
Tree skeleton with inference rules   Matched by DIRT rules           Match tree skeleton with DIRT rules   “Yes”
Future Extensions
Negation and modality                Negation words or modal verbs   Scope and entailment rules            N/A
Logic inferencer                     High confidence                 Theorem prover                        N/A
External knowledge bases             Covered cases                   Knowledge application                 N/A
-
Extrinsic Approaches
-
Relevance and Necessity
No “necessity”
-
An Example
• T: At least five people have been killed in a head-on train collision in north-eastern France, while others are still trapped in the wreckage. All the victims are adults.
• H: A French train crash killed children.
• Contradictory but relevant!
-
Textual Semantic Relation (TSR)
Entailment
Unknown
-
Related Tasks
• Contradiction recognition
  – Contradiction vs. Others (de Marneffe et al., 2008)
• Paraphrase acquisition
  – Paraphrase vs. Others (a survey by Androutsopoulos and Malakasiotis (2010))
• Directionality recognition
  – Entailment vs. Paraphrase (e.g., Kotlerman et al., 2009)
-
RTE TSR Rectangle
-
Meaning Representation
• The TACTE system: Event time pairs
• The ExTACTE system: Event tuples
• The ExDIRT system: Tree skeletons
• Dependency triple sets: {DEP_T} and {DEP_H}
  – Syntactic dependency tree
  – Semantic dependency graph
  – Joint dependency graph
-
Meaning Representation (cont.)
• H: Value is questioned.
• Syntactic dependency
  – –
• Semantic dependency
  –
-
Meaning Representation (cont.)
• T: Devotees of the market question the value of the work.
-
Features
• H_NULL? (Boolean): whether H has dependencies
• T_NULL? (Boolean): whether T has dependencies
• DIR? (Boolean): whether T, H have same direction
• MULTI? (Boolean): Add "m_" to REL_PAIR if yes
• DEP_SAME? (Boolean): whether same dependency
• REL_SIM? (Boolean): whether similar relation
• REL_SAME? (Boolean): whether same relation
• REL_PAIR (String): dependency relation names
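A rough sketch of how a few of these features might be computed for one dependency from T paired with one from H. The slide gives only the feature names, so the details are assumptions: the (head, relation, dependent) triple layout, the reading of the NULL flags as "no dependency available", the `|` separator in REL_PAIR, and the relation labels in the example are all illustrative. DIR?, MULTI? and REL_SIM? are left out because the slide does not define them precisely enough.

```python
from typing import Dict, Optional, Tuple, Union

Triple = Tuple[str, str, str]  # (head, relation, dependent): assumed layout


def pair_features(
    dep_t: Optional[Triple], dep_h: Optional[Triple]
) -> Dict[str, Union[bool, str]]:
    """Compute a subset of the listed features for one T/H dependency pair."""
    feats: Dict[str, Union[bool, str]] = {
        "T_NULL": dep_t is None,   # assumed reading: no dependency on the T side
        "H_NULL": dep_h is None,   # assumed reading: no dependency on the H side
    }
    if dep_t is None or dep_h is None:
        return feats
    feats["DEP_SAME"] = dep_t == dep_h              # identical triple
    feats["REL_SAME"] = dep_t[1] == dep_h[1]        # identical relation label
    feats["REL_PAIR"] = f"{dep_t[1]}|{dep_h[1]}"    # relation names as one string
    return feats


# Hypothetical labels for "question the value" vs. "Value is questioned".
feats = pair_features(
    ("question", "dobj", "value"),
    ("question", "nsubjpass", "value"),
)
```

Returning only the NULL flags when one side is missing keeps the remaining features from being computed on absent data.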
-
Direct Classification
-
Two-Stage Classification
3.4%
-
Performance of the Existing Systems
• Three-way RTE: Entailment, Contradiction, Unknown
[Bar chart; y-axis from 0.5 to 0.7; groups RTE-3, RTE-4; series 1st, 2nd, 3rd, 4th, 5th, Wang. Surviving data labels: 0.685, 0.683, 0.637, 0.614.]
-
The 3-Dimensional Model
-
The 3-Dimensional Model (cont.)
• Two-stage classification
• Three measurements
  – Relatedness
  – Consistency
  – Equivalence

                    Relatedness   Consistency   Equivalence
Paraphrase (P)      +             +             +
Entailment (E)      +             +             -
Contradiction (C)   +             -             -
Unknown (U)         -             +             -
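The table above amounts to a lookup from three binary decisions to one of the four relations. A minimal sketch of that mapping follows; the names `LABELS` and `tsr_label` are illustrative, and returning `None` for sign combinations the table does not list is an assumption, since the slide does not say how those are handled.

```python
from typing import Dict, Optional, Tuple

# (relatedness, consistency, equivalence) -> relation, as in the table.
LABELS: Dict[Tuple[bool, bool, bool], str] = {
    (True, True, True): "P",     # Paraphrase
    (True, True, False): "E",    # Entailment
    (True, False, False): "C",   # Contradiction
    (False, True, False): "U",   # Unknown
}


def tsr_label(related: bool, consistent: bool, equivalent: bool) -> Optional[str]:
    """Map the three measurements to P, E, C or U; None if not in the table."""
    return LABELS.get((related, consistent, equivalent))


label = tsr_label(related=True, consistent=False, equivalent=False)  # "C"
```

In a full system each of the three inputs would come from its own classifier; here they are plain booleans so that only the combination step is shown.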
-
Corpora
• Overview
  – The RTE corpora
  – The PETE corpus
  – The MSR corpus
• Two issues
  – The annotation scheme
  – Data sources
-
The TSR Corpus
• Adjacent sentence pairs from the RST Discourse Treebank
• Six categories: backward entailment, forward entailment, equality, contradiction, overlapping, and independent
-
The AMT Corpus
• Crowd-sourcing with Amazon Mechanical Turk
• Facts vs. Counter-Facts
-
Datasets
Corpora        Paraphrase (P)      Entailment (E)                        Contradiction (C)     Unknown (U)
AMT (584)      /                   Facts (406)                           Counter-facts (178)   /
MSR (5841)     Paraphrase (3940)   Non-Paraphrase (1901), spanning E, C and U
PETE (367)     /                   YES (194)                             NO (173), spanning C and U
RTE (2200)     ENTAILMENT (1100), spanning P and E                       CONTRADICTION (330)   UNKNOWN (770)
TSR (260)      Equality (3)        Forward/Backward Entailment (10/27)   Contradiction (17)    Overlapping & Independent (203)
Total (9252)   3943                637                                   525                   973
-
Results (RTE)
Systems                          4-Way          3-Way         2-Way
                                 (C, E, P, U)   (C, E&P, U)   (E&P, C&U)
Direct BoW                       39.3%          54.5%         63.2%
Direct Joint                     42.3%          50.9%         66.8%
Only Relatedness (Our Prev.)     /              59.1%         69.2%
3-D Model                        45.9%          58.2%         69.9%
MacCartney and Manning (2007)*   /              /             59.4%
Heilman and Smith (2010)*        /              /             62.8%
*Different test data
-
Results (Paraphrase)
P vs. C&E&U                    Accuracy   Precision   Recall
3-D Model                      79.6%      57.2%       72.8%
Das and Smith (2009) (QG)*     73.9%      74.9%       91.3%
Das and Smith (2009) (PoE)*    76.1%      79.6%       86.0%
Heilman and Smith (2010)*      73.2%      75.7%       87.8%
*Different test data
-
Results (cont.)
-
Summary and Perspectives
-
Intrinsic Approaches
• Event tuple
  – Event time pair
  – Extended to other slots (NEs)
• (Textual) inference rules
  – DIRT (with tree skeleton)
  – More general paraphrase resources (ongoing)
-
Extrinsic Approaches
• Textual semantic relations
  – Relatedness recognition
  – Extended with two other dimensions
• Specialized TSR modules
  – Split the data (as for RTE)
  – Systematic feature engineering
-
Applications of RTE
• Answer validation (Peñas et al., 2007; Rodrigo et al., 2008)
  – The best results for both English and German (Wang and Neumann, 2008a)
• Relation validation (Wang and Neumann, 2008b)
• Parser evaluation (Yuret et al., 2010)
  – The 3rd place (Wang and Zhang, 2010)
-
Acknowledgements
• Chris Callison-Burch
• Georgiana Dinu
• Guenter Neumann
• Caroline Sporleder
• Hans Uszkoreit
• Yajing Zhang
• Yi Zhang
-
Thank YOU!
Questions?