Syntactic Contributions in the Entailment Task

Syntactic Contributions in the Entailment Task Lucy Vanderwende, Arul Menezes, Rion Snow (Stanford)

Page 1: Syntactic Contributions in the Entailment Task

Syntactic Contributions in the Entailment Task

Lucy Vanderwende, Arul Menezes, Rion Snow (Stanford)

Page 2: Syntactic Contributions in the Entailment Task

RTE-1 analysis

• Recap of MSR’s manual analysis of RTE-1 test data; in principle, 74% is achievable using syntax and thesaurus

              Without thesaurus    Using thesaurus
True          69 (9%)              147 (18%)
False         197 (25%)            243 (30%)
Not syntax    534 (67%)            410 (51%)
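The slide does not say how the 74% figure is derived. One reading that reproduces it, offered here only as an assumption, is that the pairs syntax can decide are all answered correctly and the "Not syntax" pairs are guessed at chance. A minimal sketch of that arithmetic, using the "Using thesaurus" counts:

```python
# Counts from the "Using thesaurus" column (RTE-1 test pairs).
decided_true = 147
decided_false = 243
not_syntax = 410

total = decided_true + decided_false + not_syntax

# Assumption (not stated on the slide): pairs that syntax cannot decide
# are guessed at chance, so half of them come out right.
expected_correct = decided_true + decided_false + 0.5 * not_syntax
upper_bound = 100 * expected_correct / total

print(f"{upper_bound:.1f}%")
```

Under that assumption the estimate comes to roughly 74.4%, which is consistent with the "74% is achievable" claim.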

Page 4: Syntactic Contributions in the Entailment Task

MENT algorithm

Predicting negative entailment using syntactic features:

• Obtain syntactic dependency graphs for T and H sentences

• Attempt to align each H node to a node in T

• Check syntactic heuristics on aligned nodes

– If a heuristic matches, predict false

– If no heuristic matches, use lexical similarity model (with threshold)
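The control flow of these steps can be sketched as follows. This is an illustrative outline, not the MSR implementation: the function name, the representation of heuristics as callables, and the strict `>` comparison are all assumptions.

```python
from typing import Callable, Iterable, List, Tuple

# An aligned pair is (H node, T node); plain strings stand in for graph nodes.
AlignedPair = Tuple[str, str]


def ment_predict(
    aligned_pairs: List[AlignedPair],
    heuristics: Iterable[Callable[[List[AlignedPair]], bool]],
    lexical_similarity: float,
    threshold: float,
) -> bool:
    """Return True for predicted entailment, False otherwise."""
    # Any syntactic heuristic that matches predicts "false" outright.
    if any(heuristic(aligned_pairs) for heuristic in heuristics):
        return False
    # Otherwise fall back to the thresholded lexical similarity score.
    return lexical_similarity > threshold
```

Each heuristic here is a hypothetical function that inspects the aligned nodes and returns True when it matches.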

Page 5: Syntactic Contributions in the Entailment Task

MENT: heuristic alignment

Page 6: Syntactic Contributions in the Entailment Task

MENT: superlative heuristic

Superlative heuristic (100% accurate, 5 test items):

– If the superlatives align, and their heads are aligned, and the head in Text has any additional modifiers, and those modifiers are aligned to some modifier in H, say yes, else say no.

(RTE2 test #477)

• Crater Lake is the deepest lake in the United States, the second deepest in the Western Hemisphere, and the seventh deepest in the world, dropping downward to 1,932 feet just southeast of Merriam Cone.

• Crater Lake is the deepest lake in the world.
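A minimal sketch of this rule on the Crater Lake pair, under simplifying assumptions: each superlative phrase is reduced to a hypothetical `SuperlativePhrase` record, and "aligned" is approximated by case-insensitive string equality rather than real graph alignment.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class SuperlativePhrase:
    superlative: str
    head: str
    modifiers: Set[str] = field(default_factory=set)


def aligned(a: str, b: str) -> bool:
    # Stand-in for node alignment: case-insensitive string match (assumption).
    return a.lower() == b.lower()


def superlative_heuristic(t: SuperlativePhrase, h: SuperlativePhrase) -> Optional[bool]:
    """True = say yes, False = say no, None = heuristic does not apply."""
    if not (aligned(t.superlative, h.superlative) and aligned(t.head, h.head)):
        return None
    if not t.modifiers:
        return None
    # Every extra modifier on the Text head must align to some modifier in H.
    return all(any(aligned(tm, hm) for hm in h.modifiers) for tm in t.modifiers)


text = SuperlativePhrase("deepest", "lake", {"in the United States"})
hypothesis = SuperlativePhrase("deepest", "lake", {"in the world"})
verdict = superlative_heuristic(text, hypothesis)
```

Here `verdict` is False: "in the United States" has no aligned modifier in H, so the sketch says no for this pair.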

Page 10: Syntactic Contributions in the Entailment Task

MENT: Counterfactual heuristic

Counterfactual heuristic (80% accurate, 15 test items):

– If there is a pair of aligned nodes, and a second pair of aligned nodes, and the dependency PATH between them contains a conditional or counterfactual, say no.

(RTE2 test #473)

• Blondlot was trying to polarize X-rays when he claimed to have discovered this new form of radiation.

• Blondlot discovered x-rays.
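The path check can be sketched with a breadth-first search over a small hand-built graph. Everything specific here is an assumption for illustration: the marker list, the lemma-level nodes, and the toy edges for the Blondlot sentence are not taken from the MENT system or from real parser output.

```python
from collections import deque

# Hypothetical marker lemmas; the slide does not list the ones MENT uses.
COUNTERFACTUAL_MARKERS = {"if", "unless", "claim"}


def dependency_path(edges, start, goal):
    """Shortest undirected path between two nodes of a dependency graph."""
    graph = {}
    for head, dependent in edges:
        graph.setdefault(head, set()).add(dependent)
        graph.setdefault(dependent, set()).add(head)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(graph.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def counterfactual_heuristic(edges, node_a, node_b):
    """Return True when the heuristic fires, i.e. the prediction is "no"."""
    path = dependency_path(edges, node_a, node_b)
    if path is None:
        return False
    # Only nodes strictly between the two aligned nodes are inspected.
    return any(node in COUNTERFACTUAL_MARKERS for node in path[1:-1])


# Toy graph over lemmas of the Text sentence (hand-built, not parser output).
text_edges = [
    ("try", "Blondlot"),
    ("try", "polarize"),
    ("polarize", "X-ray"),
    ("try", "claim"),
    ("claim", "discover"),
    ("discover", "radiation"),
]
fires = counterfactual_heuristic(text_edges, "Blondlot", "discover")
```

In this toy graph the path from "Blondlot" to "discover" passes through "claim", so the heuristic fires and the pair is predicted false.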

Page 12: Syntactic Contributions in the Entailment Task

MENT: training feature weights

• “run2”: treating a syntactic heuristic match as a yes/no vote, alignment threshold set using training data

• “run1”: learning weights (using MaxEnt) for each syntactic and alignment heuristic, as well as for sub-components of these heuristics
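For a yes/no decision, a MaxEnt model is equivalent to logistic regression, so the "run1" idea can be sketched with a few lines of gradient ascent. The feature layout, the toy training rows, and the hyperparameters below are invented for illustration and are not the features or data used in the submission.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train(rows, labels, epochs=1000, lr=0.5):
    """Batch gradient ascent on the log-likelihood of a logistic model."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = y - p
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        weights = [w + lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias += lr * grad_b / len(rows)
    return weights, bias


def predict(weights, bias, x) -> float:
    """Probability of entailment for one feature vector."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Hypothetical features: [heuristic A voted no, heuristic B voted no, alignment score]
rows = [
    [0, 0, 0.9], [0, 0, 0.8], [1, 0, 0.9],
    [0, 1, 0.7], [0, 0, 0.2], [0, 0, 0.3],
]
labels = [1, 1, 0, 0, 0, 0]  # 1 = entailment holds

weights, bias = train(rows, labels)
```

With learned weights, a heuristic firing lowers the entailment probability by a trained amount instead of acting as a hard veto, which is the contrast with "run2".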

Page 13: Syntactic Contributions in the Entailment Task

MENT: results

                         Run1 (with feature weights)    Run2
Training (1717 sents)    67.79                          65.40
Dev (450 sents)          66.22                          63.77
RTE2 test (800 sents)    60.25                          58.50

RUN1 on RTE2 test (rows: truth; columns: system output)

         RUN1
TRUTH    Yes    No
Yes      268    132
No       186    214

MENT Run1 says no 43.25% of the time
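The accuracy and the "says no" rate both follow from the Run1 confusion matrix, reading rows as truth and columns as system output. A short sketch of that calculation (the dictionary layout is just one convenient representation):

```python
def summarize(confusion):
    """confusion[truth][system] -> count, with keys "yes" and "no".

    Returns (accuracy %, share of pairs where the system says no %).
    """
    total = sum(sum(row.values()) for row in confusion.values())
    correct = confusion["yes"]["yes"] + confusion["no"]["no"]
    says_no = confusion["yes"]["no"] + confusion["no"]["no"]
    return 100 * correct / total, 100 * says_no / total


run1 = {
    "yes": {"yes": 268, "no": 132},
    "no": {"yes": 186, "no": 214},
}
accuracy, no_rate = summarize(run1)
print(accuracy, no_rate)
```

This gives 60.25 and 43.25, matching the RTE2 test accuracy for Run1 and the stated share of "no" answers.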

Page 14: Syntactic Contributions in the Entailment Task

MENT variations – no thresholds

• If heuristics apply, say no

• Else say yes

• 56% accurate

• System says no 35% of the time

         SYSTEM
TRUTH    Yes    No
Yes      284    116
No       236    164

• Say no, unless everything is aligned and no heuristics apply

• 59.25% accurate

• System says no 74.5% of the time

         SYSTEM
TRUTH    Yes    No
Yes      139    261
No        65    335

** Note: Run2 = if no heuristics apply, and alignment score is above a threshold trained on the training set, then say yes, else no. Accuracy: 58.50
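The two threshold-free variants differ only in their default answer, which a small sketch makes explicit. The function names and boolean inputs are hypothetical simplifications of what the system computes.

```python
def default_yes(heuristic_fired: bool) -> bool:
    """Variant 1: say no only if a heuristic applies. True means "yes"."""
    return not heuristic_fired


def default_no(all_aligned: bool, heuristic_fired: bool) -> bool:
    """Variant 2: say no unless every H node is aligned and no heuristic applies."""
    return all_aligned and not heuristic_fired


# A pair with partial alignment and no heuristic match is answered differently.
optimistic = default_yes(heuristic_fired=False)
cautious = default_no(all_aligned=False, heuristic_fired=False)
```

The second variant answers no far more often, which fits the reported jump in "no" answers from 35% to 74.5%.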

Page 15: Syntactic Contributions in the Entailment Task

MENT variations – with threshold

• With learned alignment and syntactic heuristic weights, with alignment threshold from training, say no

• Else say yes

• 60.25% accurate

• System says no 43% of the time

         RUN1
TRUTH    Yes    No
Yes      268    132
No       186    214

• Say no, unless alignment score is above an oracle threshold and no heuristics apply

• 61.25% accurate

• System says no 70% of the time

         SYSTEM
TRUTH    Yes    No
Yes      168    232
No        75    325

Page 16: Syntactic Contributions in the Entailment Task

Lessons?

• Use syntactic heuristics and sub-components as features and apply discriminative training

• Thresholding for lexical similarity isn’t stable across data sets

• Error Analysis …

Page 17: Syntactic Contributions in the Entailment Task

bad parses (e.g., rte2 test #550)

Page 18: Syntactic Contributions in the Entailment Task

How far do you take syntactic heuristics?

Location: for a pair of aligned verb nodes, if there is an argument in H, and that argument is aligned to a node in T, say no if that node is not also the same argument of the aligned verb (applied 7 times, 5 incorrect)

• Brandenburg Gate is one of Berlin's best known landmarks and is now regarded as one of the greatest symbols of German unity.

• Brandenburg Gate is in Berlin.
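A sketch of the argument check shows why it misfires on this pair. The role names, the dictionary representation, and the alignment below are hypothetical; they only illustrate the rule as worded on the slide.

```python
def argument_mismatch(h_args, t_args, alignment):
    """Return True when the heuristic fires, i.e. the prediction is "no".

    h_args / t_args map a role name to the node filling that role for the
    aligned verb pair; alignment maps H nodes to T nodes.
    """
    for role, h_node in h_args.items():
        t_node = alignment.get(h_node)
        if t_node is None:
            continue  # unaligned arguments are outside this heuristic
        if t_args.get(role) != t_node:
            return True
    return False


# H: "Brandenburg Gate is in Berlin."
h_args = {"subject": "Brandenburg Gate", "location": "Berlin"}
# T: "Brandenburg Gate is one of Berlin's best known landmarks ..."
t_args = {"subject": "Brandenburg Gate", "predicate": "one"}
alignment = {"Brandenburg Gate": "Brandenburg Gate", "Berlin": "Berlin's"}

fires = argument_mismatch(h_args, t_args, alignment)
```

"Berlin" aligns to a possessive inside the Text's predicate rather than to a location argument of the verb, so the check fires and predicts no even though the hypothesis is entailed.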

Page 19: Syntactic Contributions in the Entailment Task

A great heuristic …but

Unaligned Verb: if there is an aligned subject and an aligned object, then if their verb is not aligned, say no

• This heuristic was not used because of its poor performance, for example:

– Rodriguez told detectives he never touched the burning backpack, which was loaded with plastic pipes packed with gunpowder and BBs.

– The burning backpack contained plastic pipes packed with gunpowder and BBs.

• Need to learn paraphrase similarity for verbs – see NAACL-HLT paper forthcoming.

Page 20: Syntactic Contributions in the Entailment Task

Directions and Plans

• MSR submission available at http://research.microsoft.com/~lucyv/

• Might it be possible to have access to all sites’ submissions?

• Need to learn paraphrase similarity for verbs

• More feature engineering

• Different graph-matching strategies to avoid brittleness of syntactic heuristics

• Find more data for training to build more stable systems

Page 22: Syntactic Contributions in the Entailment Task

A plug for Pyramids

• Conservatives oppose any form of devolution.
• The conservatives are opposed to devolution.
• The UK’s Tory Prime Minister adamantly resisted calls for devolution of British rule.

• Scots want self-rule
• … as buoyed as most Scots by North Ireland’s prospective self-rule
• Wales is following Scotland, and moving towards a call for an elected assembly with devolved powers …

• A self-governing Wales would be part of the EU
• … an independent Wales within the European community
• … Wales could participate directly in forthcoming EC meetings …
• … a fully self-governing Wales within the European Community.

SCU name, given by annotator

Candidate hypothesis? Candidate Text?