Constituency Parsing
李宏毅 Hung-yi Lee
Slides: speech.ee.ntu.edu.tw/~tlkagk/courses/DLHLP20/ParsingC (v2).pdf
Input: One Sequence, Multiple Sequences
Output:
• One Class: Sentiment Classification, Stance Detection, Veracity Prediction, Intent Classification, Dialogue Policy, NLI, Search Engine, Relation Extraction
• Class for each Token: POS Tagging, Word Segmentation, Extractive Summarization, Slot Filling, NER
• Copy from Input: Extractive QA
• General Sequence: Abstractive Summarization, Translation, Grammar Correction, NLG, General QA, Chatbot, State Tracker, Task Oriented Dialogue
• Other? Parsing, Coreference Resolution
Constituency Parsing
• Some text spans are constituents (“units”)
• Each constituent has a label.
deep learning is very powerful
[Figure: spans of the sentence marked as "constituent" (with labels such as NP and ADJP) or "not constituent"]
Constituency Parsing - Labels
+ All POS tags
Constituency Parsing
deep learning is very powerful
• Each word is a constituent (its label is its POS tag).
• Some consecutive constituents can form a larger one.
• The constituents form a tree: (S (NP deep learning ) (VP is (ADJP very powerful ) ) )
(Only binary trees are considered in this course, for simplicity.)
Each constituent is a node.
Chart-based Approach
Source of image: https://web.stanford.edu/~jurafsky/slp3/13.pdf
CKY chart parsing
Chart-based
Span: …… w2 w3 w4 w5 ……
Span → Classifier:
• Constituent? (binary classification)
• Which label? (multi-class classification)
Chart-based
deep learning is very powerful
• "deep learning" → Classifier → Constituent? YES. Which label? NP
• "very powerful" → Classifier → Constituent? YES. Which label? ADJP
• A span that is not a constituent → Classifier → Constituent? NO. Which label? Don’t care
Chart-based – Classifier
w1 w2 w3 w4 w5 w6 w7 w8 w9 w10 → Pre-trained Model (ELMo, BERT …) → Span Feature Extraction → Classifier
• Constituent? → Yes/No
• Which Label? → Label
Chart-based
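• Given a sequence of N tokens, run the classifier N(N-1)/2 times, once for every span of two or more tokens ……
deep learning is very powerful
[Figure: the classifier answers "Constituent? YES" for two spans that overlap without one containing the other, so both cannot belong to the same tree. Contradiction!]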
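The span count and the contradiction above can be checked with a small sketch. It assumes spans are written as inclusive token-index pairs (i, j) with i < j; the helper names `all_spans` and `crosses`, and the two example spans, are illustrative choices and not taken from the slides.

```python
from itertools import combinations


def all_spans(n):
    """All spans of two or more tokens, as inclusive index pairs (i, j), i < j."""
    return list(combinations(range(n), 2))


def crosses(a, b):
    """True if the two spans overlap but neither contains the other."""
    (i, j), (k, l) = sorted([a, b])
    return i < k <= j < l


tokens = "deep learning is very powerful".split()
spans = all_spans(len(tokens))
print(len(spans))  # N(N-1)/2 = 10 for N = 5

# Hypothetical classifier output: both spans get YES.
span_a = (0, 2)  # "deep learning is"
span_b = (2, 4)  # "is very powerful"
print(crosses(span_a, span_b))  # True: they cannot both be in one tree
```

Running the classifier independently on every span gives no guarantee that the YES answers are mutually compatible, which is why a separate inference step is needed.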
Inference: enumerate all possible trees and use the classifier to score them. Doing this efficiently is where the CKY algorithm is needed.
[Figure: candidate trees over "I am good"; the classifier scores shown for their spans are 0.1, 0.9, 0.8 and 0.9]
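A minimal sketch of CKY-style inference follows. It assumes that a tree's score is the sum of the classifier scores of its multi-token spans and that single words score 0; the slides do not fix the scoring rule, so this is only one possible choice. The function name `best_tree` and the example scores are hypothetical, and the dynamic program is written as memoized recursion rather than an explicit chart.

```python
from functools import lru_cache


def best_tree(tokens, span_score):
    """Return (score, tree) of the best binary tree over tokens.

    span_score maps inclusive index pairs (i, j), i < j, to a classifier score.
    Assumption: tree score = sum of the scores of its multi-token spans.
    """

    @lru_cache(maxsize=None)
    def solve(i, j):
        if i == j:  # a single word is always a constituent
            return 0.0, tokens[i]
        own = span_score.get((i, j), 0.0)
        best = None
        for k in range(i, j):  # split into (i..k) and (k+1..j)
            left_score, left_tree = solve(i, k)
            right_score, right_tree = solve(k + 1, j)
            total = own + left_score + right_score
            if best is None or total > best[0]:
                best = (total, (left_tree, right_tree))
        return best

    return solve(0, len(tokens) - 1)


tokens = ["I", "am", "good"]
# Made-up scores for illustration only.
scores = {(0, 1): 0.1, (1, 2): 0.9, (0, 2): 0.9}
score, tree = best_tree(tokens, scores)
print(tree)  # ('I', ('am', 'good'))
```

Each span is solved once and reused, so the work grows polynomially with sentence length instead of with the number of possible trees.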
Training? [Stern, et al., ACL’17]
Transition-based
Source of image: https://arxiv.org/pdf/1602.07776.pdf
Transition-based [Dyer, et al., NAACL’16]
Stack: (empty at the beginning)
Buffer: deep learning is very powerful
Actions:
• CREATE(X): create a constituent X (X = NP, VP …)
• SHIFT: move a token from the buffer to the stack
• REDUCE: close a constituent
Transition-based
deep learning is very powerful
Stack: (empty at the beginning)
Action          Pushed onto the stack
CREATE(S)       (S
CREATE(NP)      (NP
SHIFT           deep
SHIFT           learning
REDUCE          )
CREATE(VP)      (VP
SHIFT           is
CREATE(ADJP)    (ADJP
SHIFT           very
SHIFT           powerful
REDUCE          )
REDUCE          )
REDUCE          )
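The trace above can be replayed with a short sketch. The function name `run_actions` is hypothetical, and the stack here simply records the bracket symbols as drawn on the slide; a full RNN Grammar implementation would instead pop the completed children on REDUCE and push a composed constituent. ADJP is used as the adjective-phrase label.

```python
def run_actions(tokens, actions):
    """Replay CREATE(X) / SHIFT / REDUCE and return the bracketed parse string."""
    stack = []              # empty at the beginning
    buffer = list(tokens)   # holds the whole sentence at the beginning
    for action in actions:
        if action == "SHIFT":
            stack.append(buffer.pop(0))   # move a token from buffer to stack
        elif action == "REDUCE":
            stack.append(")")             # close the most recent open constituent
        elif action.startswith("CREATE(") and action.endswith(")"):
            stack.append("(" + action[len("CREATE("):-1])  # open constituent X
        else:
            raise ValueError(f"unknown action: {action}")
    if buffer:
        raise ValueError("buffer not empty after the last action")
    return " ".join(stack)


tokens = "deep learning is very powerful".split()
actions = [
    "CREATE(S)", "CREATE(NP)", "SHIFT", "SHIFT", "REDUCE",
    "CREATE(VP)", "SHIFT", "CREATE(ADJP)", "SHIFT", "SHIFT",
    "REDUCE", "REDUCE", "REDUCE",
]
print(run_actions(tokens, actions))
# (S (NP deep learning ) (VP is (ADJP very powerful ) ) )
```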
RNN Grammar
Stack → RNN
Buffer → RNN
Previous actions → RNN
The three RNN outputs → Network → next action: CREATE(X), SHIFT or REDUCE
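RNN Grammar – Training
deep learning is very powerful
[Figure: after the actions CREATE(S), CREATE(NP), SHIFT, SHIFT, REDUCE, the stack holds "(S (NP deep learning )". Three RNNs encode the stack, the buffer and the previous actions, and the Network is trained to output the ground-truth next action.]
• Typical classification task
• RL is not needed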
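To illustrate why this is an ordinary classification task, the sketch below turns one ground-truth action sequence into (state, target action) pairs, one per step. The function name `training_examples` and the tuple layout of the state are assumptions made for illustration; in the actual model each part of the state would be encoded by an RNN rather than kept as raw symbols.

```python
def training_examples(tokens, gold_actions):
    """Build one ((stack, buffer, history), gold_action) pair per parsing step."""
    stack, buffer, history = [], list(tokens), []
    examples = []
    for action in gold_actions:
        # The state seen by the network before it predicts this action.
        state = (tuple(stack), tuple(buffer), tuple(history))
        examples.append((state, action))
        # Apply the ground-truth action (teacher forcing) to get the next state.
        if action == "SHIFT":
            stack.append(buffer.pop(0))
        elif action == "REDUCE":
            stack.append(")")
        else:  # CREATE(X)
            stack.append("(" + action[len("CREATE("):-1])
        history.append(action)
    return examples


tokens = "deep learning is very powerful".split()
gold = ("CREATE(S) CREATE(NP) SHIFT SHIFT REDUCE CREATE(VP) SHIFT "
        "CREATE(ADJP) SHIFT SHIFT REDUCE REDUCE REDUCE").split()
examples = training_examples(tokens, gold)
state, target = examples[5]
print(state[0])  # ('(S', '(NP', 'deep', 'learning', ')')
print(target)    # CREATE(VP)
```

Because every step has a known correct action, a cross-entropy loss over the action set suffices.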
Source of image: https://papers.nips.cc/paper/5635-grammar-as-a-foreign-language.pdf
[Vinyals, et al., NIPS’15]
Tree to Sequence
deep learning is very powerful
[Figure: the parse tree (S over NP and VP; VP over "is" and ADJP), with the 13 output tokens numbered 1 to 13 in the order they are emitted]
(S (NP deep learning ) (VP is (ADJP very powerful ) ) )
Of course, you can try other tree traversal approaches
[Liu, et al., TACL’17]
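A minimal sketch of the tree-to-sequence step, assuming the tree is stored as nested tuples of the form (label, child, child, ...) with plain strings for words. This representation and the name `linearize` are illustrative. The traversal is pre-order (parent first, then children left to right); other traversal orders would give other sequences.

```python
def linearize(tree):
    """Pre-order traversal of a nested-tuple tree into bracket tokens."""
    if isinstance(tree, str):      # a word
        return [tree]
    label, *children = tree
    out = ["(" + label]            # open the constituent
    for child in children:
        out.extend(linearize(child))
    out.append(")")                # close the constituent
    return out


tree = ("S",
        ("NP", "deep", "learning"),
        ("VP", "is", ("ADJP", "very", "powerful")))
sequence = linearize(tree)
print(len(sequence))       # 13
print(" ".join(sequence))  # (S (NP deep learning ) (VP is (ADJP very powerful ) ) )
```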
Seq2seq vs. RNN grammar
Input: deep learning is very powerful
Seq2seq! [Vinyals, et al., NIPS’15]
(S (NP deep learning ) (VP is (ADJP very powerful ) ) )
RNN grammar [Dyer, et al., NAACL’16]
CREATE(S) CREATE(NP) SHIFT SHIFT REDUCE CREATE(VP) SHIFT CREATE(ADJP) SHIFT SHIFT REDUCE REDUCE REDUCE
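The two output formats carry the same information, which the following sketch makes explicit by mapping each bracket token to one action. The function name `brackets_to_actions` is hypothetical, and the sketch assumes the whitespace-separated token format used on the slides.

```python
def brackets_to_actions(sequence):
    """Map a whitespace-separated bracket sequence to CREATE/SHIFT/REDUCE actions."""
    actions = []
    for token in sequence.split():
        if token == ")":
            actions.append("REDUCE")                # close a constituent
        elif token.startswith("(") and len(token) > 1:
            actions.append(f"CREATE({token[1:]})")  # open a constituent
        else:
            actions.append("SHIFT")                 # a word of the sentence
    return actions


seq = "(S (NP deep learning ) (VP is (ADJP very powerful ) ) )"
print(" ".join(brackets_to_actions(seq)))
# CREATE(S) CREATE(NP) SHIFT SHIFT REDUCE CREATE(VP) SHIFT CREATE(ADJP) SHIFT SHIFT REDUCE REDUCE REDUCE
```

The mapping is one-to-one per token, so the two approaches differ in how the model conditions on the partial parse, not in what they predict.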
Unsupervised Parsing?
deep learning is very powerful
Can we find parse trees without labeled data?
YES!
Reference: https://youtu.be/YIuBHB9Ejok
Reference
• [Vinyals, et al., NIPS’15] Oriol Vinyals, Lukasz Kaiser, Terry Koo, Slav Petrov, Ilya Sutskever, Geoffrey Hinton, Grammar as a foreign language, NIPS, 2015
• [Dyer, et al., NAACL’16] Chris Dyer, Adhiguna Kuncoro, Miguel Ballesteros, Noah A. Smith, Recurrent Neural Network Grammars, NAACL, 2016
• [Stern, et al., ACL’17] Mitchell Stern, Jacob Andreas, Dan Klein, A Minimal Span-Based Neural Constituency Parser, ACL, 2017
• [Liu, et al., TACL’17] Jiangming Liu, Yue Zhang, In-Order Transition-based Constituent Parsing, TACL, 2017