Modeling Semantic Containment and Exclusion in Natural Language Inference
Bill MacCartney and Christopher D. Manning
NLP Group, Stanford University
22 August 2008
Natural language inference (NLI)
• Aka recognizing textual entailment (RTE)
• Does premise P justify an inference to hypothesis H?
  • An informal, intuitive notion of inference: not strict logic
  • Emphasis on variability of linguistic expression
Introduction • A Theory of Natural Logic • The NatLog System • Experiments with FraCaS • Experiments with RTE • Conclusion
• Necessary to goal of natural language understanding (NLU)
• Can also enable semantic search, question answering, …
P Every firm polled saw costs grow more than expected, even after adjusting for inflation.
H Every big company in the poll reported cost increases. yes
With Some in place of Every in both P and H: no
NLI: a spectrum of approaches
Approaches, from robust but shallow to deep but brittle:
• lexical/semantic overlap (Jijkoun & de Rijke 2005)
• patterned relation extraction (Romano et al. 2006)
• semantic graph matching (Hickl et al. 2006, MacCartney et al. 2006)
• FOL & theorem proving (Bos & Markert 2006)

• Problem at the shallow end: imprecise; easily confounded by negation, quantifiers, conditionals, factive & implicative verbs, etc.
• Problem at the deep end: hard to translate NL to FOL (idioms, anaphora, ellipsis, intensionality, tense, aspect, vagueness, modals, indexicals, reciprocals, propositional attitudes, scope ambiguities, anaphoric adjectives, non-intersective adjectives, temporal & causal relations, unselective quantifiers, adverbs of quantification, donkey sentences, generic determiners, comparatives, phrasal verbs, …)
• Solution? Natural logic (this work)
Outline
• Introduction
• A Theory of Natural Logic
• The NatLog System
• Experiments with FraCaS
• Experiments with RTE
• Conclusion
What is natural logic? (≠ natural deduction)
• Characterizes valid patterns of inference via surface forms
  • precise, yet sidesteps difficulties of translating to FOL
• A long history
  • traditional logic: Aristotle’s syllogisms, scholastics, Leibniz, …
  • modern natural logic begins with Lakoff (1970)
  • van Benthem & Sánchez Valencia (1986-91): monotonicity calculus
  • Nairn et al. (2006): an account of implicatives & factives
• We introduce a new theory of natural logic
  • extends monotonicity calculus to account for negation & exclusion
  • incorporates elements of Nairn et al.’s model of implicatives
7 basic entailment relations
symbol  name                                    example
P = Q   equivalence                             couch = sofa
P ⊏ Q   forward entailment (strict)             crow ⊏ bird
P ⊐ Q   reverse entailment (strict)             European ⊐ French
P ^ Q   negation (exhaustive exclusion)         human ^ nonhuman
P | Q   alternation (non-exhaustive exclusion)  cat | dog
P _ Q   cover (exhaustive non-exclusion)        animal _ nonhuman
P # Q   independence                            hungry # hippo
Relations are defined for all semantic types: tiny ⊏ small, hover ⊏ fly, kick ⊏ strike, this morning ⊏ today, in Beijing ⊏ in China, everyone ⊏ someone, all ⊏ most ⊏ some
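
The seven relations can be read set-theoretically, as the names in the table suggest (strict containment, exhaustive or non-exhaustive exclusion, and so on). The following is a minimal sketch under that reading; the function name `basic_relation`, the toy universe, and the toy denotations are illustrative assumptions, not part of NatLog. It also assumes non-degenerate denotations (neither empty nor the whole universe).

```python
def basic_relation(x: frozenset, y: frozenset, universe: frozenset) -> str:
    """Classify the relation between two denotations x and y (toy sketch)."""
    if x == y:
        return "="          # equivalence
    if x < y:
        return "⊏"          # forward entailment (strict subset)
    if x > y:
        return "⊐"          # reverse entailment (strict superset)
    disjoint = not (x & y)
    exhaustive = (x | y) == universe
    if disjoint and exhaustive:
        return "^"          # negation: exhaustive exclusion
    if disjoint:
        return "|"          # alternation: non-exhaustive exclusion
    if exhaustive:
        return "_"          # cover: exhaustive non-exclusion
    return "#"              # independence

# Hypothetical toy universe, chosen only to exercise each case.
U = frozenset({"crow", "robin", "cat", "dog", "alice", "rock"})
crow = frozenset({"crow"})
bird = frozenset({"crow", "robin"})
cat = frozenset({"cat"})
dog = frozenset({"dog"})
human = frozenset({"alice"})
nonhuman = U - human
animal = U - {"rock"}
pet = frozenset({"cat", "dog", "robin"})

print(basic_relation(crow, bird, U))        # ⊏
print(basic_relation(human, nonhuman, U))   # ^
print(basic_relation(animal, nonhuman, U))  # _
```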
Entailment & semantic composition
• Ordinarily, semantic composition preserves entailment relations: eat pork ⊏ eat meat, big bird | big fish
• But many semantic functions behave differently:
  • tango ⊏ dance, but refuse to tango ⊐ refuse to dance
  • French | German, but not French _ not German
• We categorize functions by how they project entailment
  • a generalization of monotonicity classes, implication signatures
  • e.g., not has projectivity {=:=, ⊏:⊐, ⊐:⊏, ^:^, |:_, _:|, #:#}
  • e.g., refuse has projectivity {=:=, ⊏:⊐, ⊐:⊏, ^:|, |:#, _:#, #:#}
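
A projectivity signature is just a mapping from the relation between two arguments to the relation between the results. A minimal sketch, with the two signatures copied from the slide; the names `PROJECTIVITY` and `project` are assumptions made for illustration:

```python
# Projectivity signatures as given on the slide: input relation -> projected relation.
PROJECTIVITY = {
    "not":    {"=": "=", "⊏": "⊐", "⊐": "⊏", "^": "^", "|": "_", "_": "|", "#": "#"},
    "refuse": {"=": "=", "⊏": "⊐", "⊐": "⊏", "^": "|", "|": "#", "_": "#", "#": "#"},
}

def project(function: str, relation: str) -> str:
    """Relation between f(x) and f(y), given the relation between x and y."""
    return PROJECTIVITY[function][relation]

# tango ⊏ dance, so refuse to tango ⊐ refuse to dance
print(project("refuse", "⊏"))  # ⊐
# French | German, so not French _ not German
print(project("not", "|"))     # _
```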
Projecting entailment relations upward
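• If two compound expressions differ by a single atom, their entailment relation can be determined compositionally
  • Assume idealized semantic composition trees
  • Propagate entailment relation between atoms upward, according to projectivity class of each node on path to root

Example (composition trees flattened, atom first, root last):
a shirt ⊏ clothes
without a shirt ⊐ without clothes
enter without a shirt ⊐ enter without clothes
can enter without a shirt ⊐ can enter without clothes
nobody can enter without a shirt ⊏ nobody can enter without clothes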
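
The upward propagation for the shirt/clothes example can be sketched as a walk from the atom to the root. This sketch is deliberately restricted to the containment relations (=, ⊏, ⊐, #) and to plain upward/downward monotone nodes; treating "without" and "nobody" as downward monotone and "enter" and "can" as upward monotone is an assumption consistent with the relation symbols on the slide. The function name `project_up` is hypothetical.

```python
FLIP = {"=": "=", "⊏": "⊐", "⊐": "⊏", "#": "#"}

def project_up(atom_relation: str, path: list[str]) -> list[str]:
    """Return the relation at each level, from the atom up to the root.

    `path` gives the monotonicity ("up" or "down") of each node between
    the atom and the root. Only containment relations are handled here.
    """
    if atom_relation not in FLIP:
        raise ValueError("this sketch handles only =, ⊏, ⊐ and #")
    relation = atom_relation
    trace = [relation]
    for monotonicity in path:
        if monotonicity == "down":
            relation = FLIP[relation]  # downward monotone nodes swap ⊏ and ⊐
        trace.append(relation)
    return trace

# a shirt ⊏ clothes, under: without (down), enter (up), can (up), nobody (down)
print(project_up("⊏", ["down", "up", "up", "down"]))
# ['⊏', '⊐', '⊐', '⊐', '⊏']
```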
A (weak) inference procedure
1. Find sequence of edits connecting P and H
  • Insertions, deletions, substitutions, …
2. Determine lexical entailment relation for each edit
  • Substitutions: depends on meaning of substituends: cat | dog
  • Deletions: ⊏ by default: red socks ⊏ socks
  • But some deletions are special: not ill ^ ill, refuse to go | go
  • Insertions are symmetric to deletions: ⊐ by default
3. Project up to find entailment relation across each edit
4. Compose entailment relations across sequence of edits
  • à la Tarski’s relation algebra
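
The defaults for deletions and insertions in step 2 can be sketched as a small lookup. The two special cases are the ones named on the slide; the table and function names are hypothetical, and a real system would learn these relations rather than list them by hand.

```python
# Special deletions from the slide: not ill ^ ill, refuse to go | go.
SPECIAL_DELETIONS = {"not": "^", "refuse to": "|"}

# Converse of each relation: swapping the two sides flips ⊏ and ⊐ only.
CONVERSE = {"=": "=", "⊏": "⊐", "⊐": "⊏", "^": "^", "|": "|", "_": "_", "#": "#"}

def deletion_relation(phrase: str) -> str:
    """Lexical relation for deleting `phrase`: ⊏ unless the phrase is special."""
    return SPECIAL_DELETIONS.get(phrase, "⊏")

def insertion_relation(phrase: str) -> str:
    """Insertions are symmetric to deletions, so take the converse."""
    return CONVERSE[deletion_relation(phrase)]

print(deletion_relation("red"))    # ⊏  (red socks ⊏ socks)
print(insertion_relation("red"))   # ⊐
print(insertion_relation("not"))   # ^
```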
The NatLog system
NLI problem
→ 1 linguistic analysis
→ 2 alignment
→ 3 lexical entailment classification
→ 4 entailment projection
→ 5 entailment composition
→ prediction
Running example
OK, the example is contrived, but it compactly exhibits containment, exclusion, and implicativity
P Jimmy Dean refused to move without blue jeans.
H James Dean didn’t dance without pants. yes
Step 1: Linguistic analysis
• Tokenize & parse input sentences (future: & NER & coref & …)
• Identify items w/ special projectivity & determine scope
  • Problem: PTB-style parse tree ≠ semantic structure!
  • Solution: specify scope in PTB trees using Tregex [Levy & Andrew 06]

[Figure: PTB-style parse tree (NP, VP, S nodes) over the tokens below, with the polarity of each token's context]
token     Jimmy  Dean  refused  to  move  without  blue  jeans
POS tag   NNP    NNP   VBD      TO  VB    IN       JJ    NNS
polarity  +      +     +        –   –     –        +     +

category: –/o implicatives
examples: refuse, forbid, prohibit, …
scope: S complement
pattern: __ > (/VB.*/ > VP $. S=arg)
projectivity: {=:=, ⊏:⊐, ⊐:⊏, ^:|, |:#, _:#, #:#}
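
Once the scope of each polarity-inverting item is known, the + and – marks on the slide follow from counting how many such scopes enclose each token. The sketch below takes the scopes as given token spans, which is an assumption: NatLog derives them from Tregex patterns over the parse tree, and that step is not reproduced here. The name `mark_polarity` is hypothetical.

```python
def mark_polarity(tokens: list[str], inverting_scopes: list[tuple[int, int]]) -> list[str]:
    """Mark each token "+" or "-" by the number of inverting scopes around it.

    `inverting_scopes` are half-open (start, end) token spans; an even
    number of enclosing scopes yields "+", an odd number yields "-".
    """
    marks = []
    for i in range(len(tokens)):
        depth = sum(1 for start, end in inverting_scopes if start <= i < end)
        marks.append("+" if depth % 2 == 0 else "-")
    return marks

tokens = "Jimmy Dean refused to move without blue jeans".split()
scopes = [
    (3, 8),  # scope of "refused": to move without blue jeans
    (6, 8),  # scope of "without": blue jeans
]
print(" ".join(mark_polarity(tokens, scopes)))  # + + + - - - + +
```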
Step 2: Alignment
edit index     1            2           3        4        5      6        7     8
P              Jimmy Dean   refused to                    move   without  blue  jeans
H              James Dean               did      n’t      dance  without        pants
edit type      SUB          DEL         INS      INS      SUB    MAT      DEL   SUB
• Alignment as sequence of atomic phrase edits
  • Ordering of edits defines path through intermediate forms
  • Need not correspond to sentence order
• Decomposes problem into atomic inference problems
• We haven’t (yet) invested much effort here
  • Experimental results use alignments from other sources
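
To make "path through intermediate forms" concrete, the sketch below applies the eight edits of the running example one at a time and records the sentence after each. Representing the alignment as two parallel lists of slots (with empty strings for insertions and deletions) is an assumption made for illustration, as is the function name.

```python
P_SLOTS = ["Jimmy Dean", "refused to", "", "", "move", "without", "blue", "jeans"]
H_SLOTS = ["James Dean", "", "did", "n't", "dance", "without", "", "pants"]

def intermediate_forms(p_slots, h_slots, order):
    """Apply edits in the given order; return P, each intermediate form, and H."""
    current = list(p_slots)
    forms = [" ".join(s for s in current if s)]
    for index in order:
        current[index] = h_slots[index]  # SUB, DEL, INS or MAT at this slot
        forms.append(" ".join(s for s in current if s))
    return forms

forms = intermediate_forms(P_SLOTS, H_SLOTS, range(8))
for form in forms:
    print(form)
```

Any permutation of the indices could be passed as `order`, reflecting the point that the edit order need not follow sentence order.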
Step 3: Lexical entailment classification
• Goal: predict entailment relation for each edit, based solely on lexical features, independent of context
• Approach: use lexical resources & machine learning
• Feature representation:
  • WordNet features: synonymy (=), hyponymy (⊏/⊐), antonymy (|)
  • Other relatedness features: Jiang-Conrath (WN-based), NomBank
  • Fallback: string similarity (based on Levenshtein edit distance)
  • Also lexical category, quantifier category, implication signature
• Decision tree classifier
  • Trained on 2,449 hand-annotated lexical entailment problems
  • E.g., SUB(gun, weapon): ⊏, SUB(big, small): |, DEL(often): ⊏
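
The fallback feature, string similarity based on Levenshtein edit distance, can be sketched as follows. The slides do not say how the distance is normalized into a similarity, so dividing by the longer string's length is an assumption of this sketch, as are the function names.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]

def string_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; normalizing by the longer length is an assumption."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest

print(levenshtein("Jimmy", "James"))  # 3
```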
Step 3: Lexical entailment classification (continued)
edit index     1            2           3        4        5      6        7     8
P              Jimmy Dean   refused to                    move   without  blue  jeans
H              James Dean               did      n’t      dance  without        pants
edit type      SUB          DEL         INS      INS      SUB    MAT      DEL   SUB
lex feats      strsim=0.67  implic:–/o  cat:aux  cat:neg  hypo                  hyper
lex entrel     =            |           =        ^        ⊐      =        ⊏     ⊏
Step 4: Entailment projection
edit index     1            2           3        4        5      6        7     8
P              Jimmy Dean   refused to                    move   without  blue  jeans
H              James Dean               did      n’t      dance  without        pants
edit type      SUB          DEL         INS      INS      SUB    MAT      DEL   SUB
lex feats      strsim=0.67  implic:–/o  cat:aux  cat:neg  hypo                  hyper
lex entrel     =            |           =        ^        ⊐      =        ⊏     ⊏
atomic entrel  =            |           =        ^        ⊏      =        ⊏     ⊏
Inversion: projection turns the lexical ⊐ of edit 5 into an atomic ⊏.
Step 5: Entailment composition
edit index     1            2           3        4        5      6        7     8
P              Jimmy Dean   refused to                    move   without  blue  jeans
H              James Dean               did      n’t      dance  without        pants
edit type      SUB          DEL         INS      INS      SUB    MAT      DEL   SUB
lex feats      strsim=0.67  implic:–/o  cat:aux  cat:neg  hypo                  hyper
lex entrel     =            |           =        ^        ⊐      =        ⊏     ⊏
atomic entrel  =            |           =        ^        ⊏      =        ⊏     ⊏
composition    =            |           |        ⊏        ⊏      ⊏        ⊏     ⊏
Final answer: ⊏, the last entry in the composition row.

For example:
fish | human
human ^ nonhuman
fish ⊏ nonhuman
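
Composition can be sketched as a running join over the atomic relations. The join table below is an assumption of this sketch: each determinate cell was derived from the set-theoretic reading of the relations (for instance, x | y and y ^ z give x ⊏ z, as in the fish example), and every cell where more than one outcome is possible is collapsed to #. The names `RELATIONS`, `JOIN_ROWS`, `join`, and `compose` are hypothetical.

```python
RELATIONS = "=⊏⊐^|_#"

# JOIN_ROWS[r1][k] is the join of r1 with RELATIONS[k].
# Indeterminate joins are approximated by "#" (a simplification).
JOIN_ROWS = {
    "=": "=⊏⊐^|_#",
    "⊏": "⊏⊏#||##",
    "⊐": "⊐#⊐_#_#",
    "^": "^_|=⊐⊏#",
    "|": "|#|⊏#⊏#",
    "_": "__#⊐⊐##",
    "#": "#######",
}

def join(r1: str, r2: str) -> str:
    """Relation between x and z, given x r1 y and y r2 z."""
    return JOIN_ROWS[r1][RELATIONS.index(r2)]

def compose(atomic_relations: list[str]) -> list[str]:
    """Running composition across a sequence of atomic entailment relations."""
    result = "="  # P stands in the equivalence relation to itself
    running = []
    for relation in atomic_relations:
        result = join(result, relation)
        running.append(result)
    return running

atomic = ["=", "|", "=", "^", "⊏", "=", "⊏", "⊏"]  # atomic entrel row above
print(" ".join(compose(atomic)))  # = | | ⊏ ⊏ ⊏ ⊏ ⊏
```

The running result reproduces the composition row of the running example, ending in ⊏, which corresponds to the answer yes.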
The FraCaS test suite
• FraCaS: a project in computational semantics [Cooper et al. 96]
• 346 “textbook” examples of NLI problems
• 3 possible answers: yes, no, unknown (not balanced!)
• 55% single-premise, 45% multi-premise (excluded)
P At most ten commissioners spend time at home.
H At most ten commissioners spend a lot of time at home. yes

P Dumbo is a large animal.
H Dumbo is a small animal. no

P Smith believed that ITEL had won the contract in 1992.
H ITEL won the contract in 1992. unk
Results on FraCaS

System                   #    prec %  rec %  acc %
most common class        183  55.7    100.0  55.7
MacCartney & Manning 07  183  68.9    60.8   59.6
this work                183  89.3    65.7   70.5

• 27% error reduction

§  Category       #    prec %  rec %  acc %
1  Quantifiers    44   95.2    100.0  97.7
2  Plurals        24   90.0    64.3   75.0
3  Anaphora       6    100.0   60.0   50.0
4  Ellipsis       25   100.0   5.3    24.0
5  Adjectives     15   71.4    83.3   80.0
6  Comparatives   16   88.9    88.9   81.3
7  Temporal       36   85.7    70.6   58.3
8  Verbs          8    80.0    66.7   62.5
9  Attitudes      9    100.0   83.3   88.9
   1, 2, 5, 6, 9  108  90.4    85.5   87.0

• high precision even outside areas of expertise
• in largest category, all but one correct
• high accuracy in sections most amenable to natural logic
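
The "27% error reduction" figure is consistent with the accuracies in the system table if it is read as a relative reduction in error rate against MacCartney & Manning 07. A small check under that reading (the function name is hypothetical):

```python
def relative_error_reduction(baseline_acc: float, new_acc: float) -> float:
    """Percentage of the baseline's errors removed; accuracies are in percent."""
    baseline_error = 100.0 - baseline_acc
    new_error = 100.0 - new_acc
    return 100.0 * (baseline_error - new_error) / baseline_error

# 59.6% -> 70.5% accuracy: error falls from 40.4 to 29.5 points.
print(round(relative_error_reduction(59.6, 70.5)))  # 27
```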
The RTE3 test suite
P As leaders gather in Argentina ahead of this weekends regional talks, Hugo Chávez, Venezuela’s populist president is using an energy windfall to win friends and promote his vision of 21st-century socialism.
H Hugo Chávez acts as Venezuela’s president. yes
P Democrat members of the Ways and Means Committee, where tax bills are written and advanced, do not have strong small business voting records.
H Democrat members had strong small business voting records. no
• Somewhat more “natural”, but not ideal for NatLog
  • Many kinds of inference not addressed by NatLog: paraphrase, temporal reasoning, relation extraction, …
  • Big edit distance → propagation of errors from atomic model
Results on RTE3: NatLog

System        Data  % Yes  Prec %  Rec %  Acc %
Stanford RTE  dev   50.2   68.7    67.0   67.2
              test  50.0   61.8    60.2   60.5
NatLog        dev   22.5   73.9    32.4   59.2
              test  26.4   70.1    36.1   59.4
(each data set contains 800 problems)

• Accuracy is unimpressive, but precision is relatively high
• Strategy: hybridize with Stanford RTE system
  • As in Bos & Markert 2006
  • But NatLog makes positive prediction far more often (~25% vs. 4%)
Results on RTE3: hybrid system

System        Data  % Yes  Prec %  Rec %  Acc %
Stanford RTE  dev   50.2   68.7    67.0   67.2
              test  50.0   61.8    60.2   60.5
NatLog        dev   22.5   73.9    32.4   59.2
              test  26.4   70.1    36.1   59.4
Hybrid        dev   56.0   69.2    75.2   70.0
              test  54.5   64.4    68.5   64.5
(each data set contains 800 problems)

• 4% gain in test accuracy over Stanford RTE, 60.5 → 64.5 (significant, p < 0.05)
Conclusion: what natural logic can’t do
• Not a universal solution for NLI
• Many types of inference not amenable to natural logic• Paraphrase: Eve was let go = Eve lost her job
• Verb/frame alternation: he drained the oil ⊏ the oil drained
• Relation extraction: Aho, a trader at UBS… ⊏ Aho works for UBS
• Common-sense reasoning: the sink overflowed ⊏ the floor got wet
• etc.
• Also, has a weaker proof theory than FOL
  • Can’t explain, e.g., de Morgan’s laws for quantifiers: Not all birds fly = Some birds don’t fly
Conclusion: what natural logic can do
Natural logic enables precise reasoning about containment, exclusion, and implicativity, while sidestepping the difficulties of translating to FOL.
The NatLog system successfully handles a broad range of such inferences, as demonstrated on the FraCaS test suite.
Ultimately, open-domain NLI is likely to require combining disparate reasoners, and a facility for natural logic is a good candidate to be a component of such a system.
:-) Thanks! Questions?