Dependency Hashing for n-best CCG Parsing
Dominick Ng and James R. Curran
Presented by Yun Huang
Background: CCG
• CCG derivation
• Dependency
• Evaluation
– All components of a dependency structure must match the gold standard
– Precision / recall / F-score
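A minimal sketch of the evaluation described on this slide, assuming a dependency is represented as a tuple of (head word, lexical category, argument slot, argument word). That field layout and the names `Dep` and `prf` are illustrative assumptions, not taken from the slides. A predicted dependency counts as correct only if every field matches a gold dependency.

```python
from typing import Iterable, NamedTuple, Tuple


class Dep(NamedTuple):
    # Hypothetical field layout for one CCG dependency.
    head: str
    category: str
    slot: int
    argument: str


def prf(predicted: Iterable[Dep], gold: Iterable[Dep]) -> Tuple[float, float, float]:
    """Precision, recall and F-score over exactly matching dependencies."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)  # tuples match only if all fields match
    p = correct / len(predicted) if predicted else 0.0
    r = correct / len(gold) if gold else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f


gold = [Dep("swim", "S\\NP", 1, "fish"), Dep("in", "PP/NP", 1, "lake")]
# The second prediction has the right words but the wrong category, so it is wrong.
pred = [Dep("swim", "S\\NP", 1, "fish"), Dep("in", "(NP\\NP)/NP", 1, "lake")]
p, r, f = prf(pred, gold)
```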
Background: CCGbank
• CCGbank was created by converting the phrase-structure trees in the PTB into normal-form CCG derivations. (99.44% covered)
Background: C&C parser
• Supertagger: assigns possible lexical categories to each word (e.g. S\NP, (S\NP)/PP for swim)
– Tag dictionary extracted from training data
– Adaptive supertagging: β and k
• C&C parser: log-linear model parser
– POS tags and lexical categories as input
– CKY chart parsing
– N-best reranking
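A rough sketch of adaptive supertagging, under the assumption that β is a relative probability threshold: a category is kept for a word if its probability is at least β times that of the word's most probable category, and β is relaxed step by step until the parser finds an analysis. The function names, the `try_parse` callback, and the β schedule are illustrative assumptions, and the tag-dictionary cutoff k is left out for brevity.

```python
from typing import Callable, Dict, List, Optional, Sequence, Set, Tuple


def prune_categories(probs: Dict[str, float], beta: float) -> Set[str]:
    """Keep categories whose probability is within a factor beta of the best."""
    best = max(probs.values())
    return {cat for cat, p in probs.items() if p >= beta * best}


def adaptive_supertag(
    sentence_probs: List[Dict[str, float]],
    try_parse: Callable[[List[Set[str]]], Optional[object]],
    betas: Sequence[float] = (0.075, 0.03, 0.01, 0.005, 0.001),  # illustrative schedule
) -> Tuple[Optional[float], Optional[object]]:
    """Try tight beta first; loosen it only when the parser fails."""
    for beta in betas:
        cats = [prune_categories(probs, beta) for probs in sentence_probs]
        result = try_parse(cats)  # hypothetical parser hook, returns None on failure
        if result is not None:
            return beta, result
    return None, None


swim = {"S\\NP": 0.9, "(S\\NP)/PP": 0.05, "N": 0.001}
```

Starting with a tight β keeps the chart small for most sentences, and the looser levels are only paid for when a sentence cannot be parsed.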
Ambiguity in n-best CCG parsing
• Spurious ambiguity
– Normal form (usually right-branching)
• Absorption ambiguity
• Diversity problem: n-best CCG derivations, but with duplicated dependencies
Dependency Hashing (1)
• Constraint: no n-best candidate may have the same dependencies as a candidate already in the list.
– Similar to SMT: remove duplicated strings
– Which one to delete: the later-inserted one? The lower-scoring one?
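The constraint can be sketched as a filter over scored candidates. This version compares full dependency sets exactly rather than by hash value, and it answers the "which one to delete" question by keeping the higher-scoring candidate. Both choices are assumptions made for illustration, and `dedupe_nbest` is a hypothetical name.

```python
from typing import FrozenSet, Iterable, List, Tuple

Candidate = Tuple[float, Iterable[tuple]]  # (model score, dependencies)


def dedupe_nbest(candidates: Iterable[Candidate], n: int) -> List[Tuple[float, FrozenSet[tuple]]]:
    """Return up to n candidates, none sharing a dependency set with another."""
    seen = set()
    kept = []
    # Visit candidates from best to worst so the best-scoring duplicate survives.
    for score, deps in sorted(candidates, key=lambda c: c[0], reverse=True):
        key = frozenset(deps)
        if key in seen:
            continue  # same dependencies as a candidate already in the list
        seen.add(key)
        kept.append((score, key))
        if len(kept) == n:
            break
    return kept


a = ("swim", "S\\NP", 1, "fish")
b = ("in", "PP/NP", 1, "lake")
c = ("in", "(NP\\NP)/NP", 1, "lake")
# The first two derivations differ only in derivation order, not in dependencies.
nbest = dedupe_nbest([(-1.0, [a, b]), (-1.2, [b, a]), (-1.5, [a, c])], n=2)
```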
Dependency Hashing (2)
• Implementation:
– 32-bit hash value for each dependency
– Bit-wise XOR to combine sub-derivations
– Only hash value, no hash table
• Collision: miss some useful dependencies
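A small sketch of the hashing scheme, with CRC-32 standing in for whatever 32-bit hash function the parser actually uses (the slides do not say). `dep_hash` and `combine` are hypothetical names. Because XOR is commutative and associative, the combined value depends only on which dependencies are present, not on how the sub-derivations were bracketed.

```python
import zlib


def dep_hash(dep: tuple) -> int:
    """32-bit hash of one dependency (CRC-32 is a stand-in, not the parser's hash)."""
    return zlib.crc32(repr(dep).encode("utf-8")) & 0xFFFFFFFF


def combine(*hashes: int) -> int:
    """XOR the hash values of sub-derivations into a single 32-bit value."""
    out = 0
    for h in hashes:
        out ^= h
    return out


a = ("swim", "S\\NP", 1, "fish")
b = ("in", "PP/NP", 1, "lake")
c = ("lake", "N/N", 1, "cold")

# Two different bracketings of the same three dependencies.
left_first = combine(combine(dep_hash(a), dep_hash(b)), dep_hash(c))
right_first = combine(dep_hash(a), combine(dep_hash(c), dep_hash(b)))
```

Since only the combined value is compared, two different dependency sets that happen to share a 32-bit value are treated as duplicates, which is the collision case the slide mentions. XOR also cancels a dependency that is hashed twice, so this sketch assumes each dependency is contributed once.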
Diversity experiments
• Dependency
• Grammatical relation
Parsing Results
• Oracle
– Reranking upper bound
• Reranking
Gap
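To make the oracle concrete: for each sentence, the oracle picks the parse in the n-best list that scores highest against the gold standard, which is the best any reranker over that list could do. The sketch below assumes parses are sets of dependency tuples and uses hypothetical names `f_score` and `oracle_parse`.

```python
from typing import Iterable, List, Set


def f_score(predicted: Iterable[tuple], gold: Iterable[tuple]) -> float:
    """F-score of one parse's dependencies against the gold dependencies."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    if correct == 0:
        return 0.0
    p, r = correct / len(predicted), correct / len(gold)
    return 2 * p * r / (p + r)


def oracle_parse(nbest: List[Set[tuple]], gold: Set[tuple]) -> Set[tuple]:
    """Pick the n-best entry with the highest F-score (the reranking upper bound)."""
    return max(nbest, key=lambda deps: f_score(deps, gold))


a = ("swim", "S\\NP", 1, "fish")
b = ("in", "PP/NP", 1, "lake")
c = ("in", "(NP\\NP)/NP", 1, "lake")
gold = {a, b}
nbest = [{a, c}, {a, b}]  # ordered by model score; the 1-best is not the best parse

baseline_f = f_score(nbest[0], gold)
oracle_f = f_score(oracle_parse(nbest, gold), gold)
```

The difference between `oracle_f` and the score of the parse a reranker actually selects corresponds to the gap shown on the slide.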
Three types of error
• Grammar error
– Only a subset of CCGbank rules are used
– Seen rule constraint
• Supertagger error
– Restricted categories by frequency cutoff
– Probability threshold β and cutoff k
• Model error
– Suboptimal parse
Grammar Error
• Given gold-standard categories, the parser F-score is 99.49%, with 95.61% coverage
• Grammar error accounts for about 0.5% of overall parser error and a 4.4% drop in coverage
Supertagger and model error
• Supertagger error: differ from oracle
• Model error: differ from baseline
More experiments
• Tradeoff of speed and accuracy
• Gold/automatic POS tags
Conclusion
• Dependency hashing for n-best CCG
– Avoid derivations with the same dependencies
– Increase diversity in the n-best list
• Comprehensive error analysis
– Grammar error: 0.5%
– Supertagger error: 5%
– Model error: 7.5%
Thank you
Q & A