Straight to the Tree: Constituency Parsing with Neural Syntactic Distance
Yikang Shen*, Zhouhan Lin*, Athul Paul Jacob, Alessandro Sordoni, Aaron Courville, Yoshua Bengio
University of Montreal, Microsoft Research, University of Waterloo
Overview
- Motivation
- Syntactic Distance based Parsing Framework
- Model
- Experimental Results
ICLR 2018: Neural Language Modeling by Jointly Learning Syntax and Lexicon
Syntactic Distance
Structured Self-Attention
LSTM
Language Model (61 ppl)
Unsupervised Constituency Parser (68 UF1)
Supervised Constituency Parsing with Syntactic Distance?
[Shen et al. 2018]
Chart Neural Parsers
1. High computational cost: the complexity of CYK is O(n^3).
2. Complicated loss function.
Transition based Neural Parsers
1. Greedy decoding: may produce an incomplete tree (the shift and reduce steps may not match).
2. Exposure bias: the model is never exposed to its own mistakes during training.
[Stern et al., 2017; Cross and Huang, 2016]
Intuitions
Only the order of the splits (or combinations) matters for reconstructing the tree.
Can we model the order directly?
Syntactic distance
For each split point, the syntactic distance should follow the same order as the height of the corresponding tree node.
[Figure: tree with nodes N1, N2 and split points S1, S2]
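One way to write this ordering property down, as a sketch: let d_i be the syntactic distance at split point i and h_i the height of the tree node that splits there. This notation is assumed here and does not appear on the slide.

```latex
\operatorname{sign}(d_i - d_j) = \operatorname{sign}(h_i - h_j)
\quad \text{for all split points } i \neq j
```

Only the relative order of the distances is constrained, so any monotone rescaling of the d_i describes the same tree.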
Convert to binary tree
[Stern et al., 2017]
Tree to Distance
The height for each non-terminal node is the maximum height of its children plus 1
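The height rule above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: the binary tree is represented as nested 2-tuples with words as strings, leaves are given height 0, and the function name `tree_to_distances` is hypothetical. The height of each non-terminal node is used directly as the distance of the split point it covers.

```python
def tree_to_distances(tree):
    """Return (distances, height) for a binary tree of nested 2-tuples.

    Assumptions (not from the slides): leaves are strings with height 0,
    and each internal node is a (left, right) tuple.
    """
    if isinstance(tree, str):
        # A single word has no split point.
        return [], 0
    left, right = tree
    left_d, left_h = tree_to_distances(left)
    right_d, right_h = tree_to_distances(right)
    # Height of a non-terminal: maximum height of its children plus 1.
    height = max(left_h, right_h) + 1
    # The node's own split point sits between its two children.
    return left_d + [height] + right_d, height


example = ("a", (("b", "c"), "d"))
distances, root_height = tree_to_distances(example)
print(distances)  # one distance per gap between adjacent words
```

A sentence with n words yields n - 1 distances, one for each gap between adjacent words.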
[Figure: example tree with non-leaf labels S, VP, S-VP, ∅ and leaf labels NP, ∅, ∅, NP, ∅]
Distance to Tree
The split point for each bracket is the one with the maximum distance.
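The top-down reconstruction can be sketched as a short recursion: pick the split point with the largest distance, split the span there, and recurse on both halves. This is a minimal illustration, not the authors' implementation; the function name `distances_to_tree`, the nested-tuple output, and the tie-breaking rule (the leftmost maximum wins) are assumptions.

```python
def distances_to_tree(words, distances):
    """Rebuild a binary tree (nested 2-tuples) from per-gap distances.

    Assumptions (not from the slides): distances[i] belongs to the gap
    between words[i] and words[i + 1]; ties go to the leftmost maximum.
    """
    assert len(distances) == len(words) - 1
    if len(words) == 1:
        return words[0]
    # The split point of this bracket is the gap with the maximum distance.
    split = max(range(len(distances)), key=distances.__getitem__)
    left = distances_to_tree(words[:split + 1], distances[:split])
    right = distances_to_tree(words[split + 1:], distances[split + 1:])
    return (left, right)


tree = distances_to_tree(["a", "b", "c", "d"], [3, 1, 2])
print(tree)
```

Because only the ordering of the distances is used, real-valued predictions work the same way as integer heights, and every input yields a complete binary tree.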
Framework for inferring the distances and labels
Distances
Labels for leaf nodes
Labels for non-leaf nodes
Inferring the distances
Distances
Pairwise learning-to-rank loss for distances
a variant of hinge loss
[Figure: the loss L plotted in two panels, one for d_i > d_j and one for d_i < d_j, with axis ticks at -1 and 1]
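A pairwise hinge-style ranking loss of this kind can be sketched as follows. The exact form is an assumption: for each pair of split points, the loss is max(0, margin - sign(d_i - d_j) * (pred_i - pred_j)), with a margin of 1 chosen to match the ±1 ticks in the plot. Skipping pairs whose gold distances are tied is also a choice made here, and the function name `pairwise_rank_loss` is hypothetical.

```python
def pairwise_rank_loss(gold, pred, margin=1.0):
    """Sum of hinge penalties over all pairs of split points.

    A pair costs nothing once the predicted distances are ordered like
    the gold ones and separated by at least `margin` (assumed to be 1).
    """
    assert len(gold) == len(pred)
    loss = 0.0
    for i in range(len(gold)):
        for j in range(i + 1, len(gold)):
            # sign is +1 if gold[i] > gold[j], -1 if smaller, 0 if tied.
            sign = (gold[i] > gold[j]) - (gold[i] < gold[j])
            if sign == 0:
                continue  # assumption: tied gold distances impose no order
            loss += max(0.0, margin - sign * (pred[i] - pred[j]))
    return loss


print(pairwise_rank_loss([3, 1, 2], [3.0, 1.0, 2.0]))  # correctly ordered
print(pairwise_rank_loss([3, 1, 2], [0.0, 0.0, 0.0]))  # no ordering learned
```

The loss depends only on differences between predictions, which matches the observation that only the order of the splits matters for recovering the tree.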
Inferring the Labels
Putting it together
Experiments: Penn Treebank
Experiments: Chinese Treebank
Experiments: Detailed statistics in PTB and CTB
Experiments: Ablation Test
Experiments: Parsing Speed
Conclusions and Highlights
- A novel constituency parsing scheme: predicting tree structure from a set of real-valued scalars (syntactic distances).
- Completely free from compounding errors.
- Strong performance compared to previous models.
- Significantly more efficient than previous models.
- Easy deployment: the architecture of the model is no more than a stack of standard recurrent and convolutional layers.
One more thing... Why does it work now?
- Rank losses have been well studied in the learning-to-rank literature since 2005 (Burges et al. 2005).
- Models that are good at learning these syntactic distances were not widely known until the rediscovery of the LSTM in 2013 (Graves 2013).
- Efficient regularization methods for LSTMs did not mature until 2017 (Merity 2017).
Thank you!
Questions?
Yikang Shen, Zhouhan Lin
MILA, Université de Montréal
{yikang.shn, lin.zhouhan}@gmail.com
Paper:
Code: