Constituency Parsingspeech.ee.ntu.edu.tw/~tlkagk/courses/DLHLP20/ParsingC (v2).pdf · Constituency...

Constituency Parsing

李宏毅 Hung-yi Lee

Input: One Sequence, Multiple Sequences

Output:

• One Class: Sentiment Classification, Stance Detection, Veracity Prediction, Intent Classification, Dialogue Policy, NLI, Search Engine, Relation Extraction

• Class for each Token: POS tagging, Word segmentation, Extractive Summarization, Slot Filling, NER

• Copy from Input: Extractive QA

• General Sequence: Abstractive Summarization, Translation, Grammar Correction, NLG, General QA, Chatbot, State Tracker, Task Oriented Dialogue

• Other? Parsing, Coreference Resolution

Constituency Parsing

• Some text spans are constituents (“units”)

• Each constituent has a label.

Example: deep learning is very powerful

[Figure: spans of the sentence marked as “constituent” or “not constituent”; “deep learning” is labeled NP and “very powerful” is labeled ADJP.]

Constituency Parsing - Labels

Labels: phrase-level labels (e.g. NP, VP, ADJP, S) + all POS tags



• Each word is a constituent (its label is its POS tag).

• Some consecutive constituents can form a larger one.

• The constituents form a tree, and each constituent is a node.

Example: (S (NP deep learning ) (VP is (ADJP very powerful ) ) )

(Only binary trees are considered in this course, for simplicity.)
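A minimal sketch of the idea that every constituent is a tree node covering a span of words. The nested-tuple representation and the helper name `constituents` are illustrative assumptions, not something the slides define; POS-tag nodes for single words are left out to keep it short.

```python
# Hypothetical representation: a node is (label, child, child, ...); a leaf is a word string.
TREE = ("S",
        ("NP", "deep", "learning"),
        ("VP", "is", ("ADJP", "very", "powerful")))

def constituents(node, start=0):
    """Return (end, spans): spans lists (label, start, end) for every labeled
    node under `node`, with `end` exclusive."""
    if isinstance(node, str):          # a single word covers one position
        return start + 1, []
    label, *children = node
    spans, pos = [], start
    for child in children:
        pos, child_spans = constituents(child, pos)
        spans.extend(child_spans)
    spans.append((label, start, pos))  # the node itself, after its children
    return pos, spans

_, SPANS = constituents(TREE)
for label, i, j in SPANS:
    print(label, (i, j))
```

Each printed triple is one node of the tree together with the word positions it covers.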

Chart-based Approach

Source of image: https://web.stanford.edu/~jurafsky/slp3/13.pdf

CKY chart parsing

Chart-based

A classifier takes a span (e.g. …… w2 w3 w4 w5 ……) as input and answers two questions:

• Constituent? (binary classification)

• Which label? (multi-class classification)

Chart-based

The classifier answers both questions (Constituent? Which label?) for each span:

• YES, NP

• YES, ADJP

• NO, Don’t Care

• NO, Don’t Care

When the answer to “Constituent?” is NO, the label output is ignored.

Chart-based – Classifier

• Input: w1 w2 w3 w4 w5 w6 w7 w8 w9 w10

• Pre-trained Model (ELMo, BERT …)

• Span Feature Extraction

• Constituent? (output: Yes/No)

• Which Label? (output: Label)
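The pipeline above (token vectors, span feature, two output heads) can be sketched in a few lines of plain Python. Everything concrete here is an assumption made for illustration: `fake_encoder` stands in for a pre-trained model, the span feature is simply the first and last token vectors concatenated (the slides do not say how the feature is extracted), and the weights are random rather than trained.

```python
import math
import random

LABELS = ["NP", "VP", "ADJP", "S"]   # toy label set for illustration
DIM = 4                               # toy embedding size
random.seed(0)

def fake_encoder(tokens):
    """Stand-in for a pre-trained model (ELMo, BERT ...): one vector per token."""
    return [[random.uniform(-1, 1) for _ in range(DIM)] for _ in tokens]

def span_feature(vectors, i, j):
    """Assumed span feature: first and last token vectors of span [i, j), concatenated."""
    return vectors[i] + vectors[j - 1]

def linear(weights, x):
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

# Untrained random weights; a real parser would learn these.
W_CONST = [[random.uniform(-1, 1) for _ in range(2 * DIM)]]
W_LABEL = [[random.uniform(-1, 1) for _ in range(2 * DIM)] for _ in LABELS]

def classify_span(vectors, i, j):
    feat = span_feature(vectors, i, j)
    # Binary head: sigmoid gives P(span is a constituent).
    p_const = 1.0 / (1.0 + math.exp(-linear(W_CONST, feat)[0]))
    # Multi-class head: softmax over the labels.
    logits = linear(W_LABEL, feat)
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    p_label = {lab: e / total for lab, e in zip(LABELS, exps)}
    return p_const, p_label

tokens = "deep learning is very powerful".split()
vectors = fake_encoder(tokens)
p_const, p_label = classify_span(vectors, 0, 2)   # span "deep learning"
print(round(p_const, 3), max(p_label, key=p_label.get))
```

With untrained weights the outputs are meaningless; the point is only the shape of the computation: one span feature feeding a Yes/No head and a label head.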

Chart-based

• Given a sequence with N tokens, run the classifier on every span of two or more tokens, i.e. N(N-1)/2 times.
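A quick check of the N(N-1)/2 count, enumerating every span that covers at least two tokens. The helper name `spans` and the half-open `[i, j)` indexing are choices made for this sketch.

```python
def spans(n):
    """All spans [i, j) covering at least two of n tokens."""
    return [(i, j) for i in range(n) for j in range(i + 2, n + 1)]

tokens = "deep learning is very powerful".split()
n = len(tokens)
all_spans = spans(n)

# The two numbers printed here should agree.
print(len(all_spans), n * (n - 1) // 2)
for i, j in all_spans[:3]:
    print(" ".join(tokens[i:j]))
```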


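Contradiction!

Because the classifier is run on each span independently, it can answer YES for two overlapping spans that cannot both be constituents of the same tree, e.g. both “I am” and “am good” in “I am good”.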

Inference: enumerate all possible trees and use the classifier to score them. This is where the CKY algorithm is needed.

[Figure: two candidate trees for “I am good”, with classifier scores 0.1, 0.9, 0.8 and 0.9 on their spans.]

Training? [Stern, et al., ACL’17]
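A minimal sketch of CKY-style inference over binary trees: instead of listing every tree, dynamic programming keeps the best-scoring tree for each span. It assumes that a tree's score is the sum of its span scores, and the scores in `SCORE` are made up for illustration (they are not the classifier outputs of any real model).

```python
from functools import lru_cache

tokens = ["I", "am", "good"]
# Made-up span scores; in practice they would come from the classifier.
SCORE = {(0, 2): 0.1, (1, 3): 0.9, (0, 3): 0.8}

def span_score(i, j):
    return SCORE.get((i, j), 0.0)   # single words score 0 in this sketch

@lru_cache(maxsize=None)
def best(i, j):
    """Best (score, tree) for span [i, j), trying every binary split point."""
    if j - i == 1:
        return span_score(i, j), tokens[i]
    candidates = []
    for k in range(i + 1, j):
        left_score, left_tree = best(i, k)
        right_score, right_tree = best(k, j)
        candidates.append((left_score + right_score, left_tree, right_tree))
    score, left_tree, right_tree = max(candidates, key=lambda c: c[0])
    return span_score(i, j) + score, (left_tree, right_tree)

total, tree = best(0, len(tokens))
print(round(total, 2), tree)
```

Because each span's best subtree is computed once and reused, overlapping spans such as “I am” and “am good” can never both end up in the returned tree.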

Transition-based

Source of image: https://arxiv.org/pdf/1602.07776.pdf

Transition-based [Dyer, et al., NAACL’16]

• Stack (empty at the beginning)

• Buffer (holds the unprocessed tokens)

Actions

• CREATE(X): create a constituent X (X = NP, VP …)

• SHIFT: move a token from the buffer to the stack

• REDUCE: close a constituent

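Transition-based

Example: “deep learning is very powerful” (the stack is empty at the beginning). Each line shows an action and the symbol it produces.

• CREATE(S): (S

• CREATE(NP): (NP

• SHIFT: deep

• SHIFT: learning

• REDUCE: )

• CREATE(VP): (VP

• SHIFT: is

• CREATE(ADJP): (ADJP

• SHIFT: very

• SHIFT: powerful

• REDUCE: )

• REDUCE: )

• REDUCE: )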
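The walkthrough above can be replayed with a small simulator. This is a sketch under assumptions: the function name `run_actions`, the `("OPEN", label)` marker for an unclosed constituent, and the string form of finished subtrees are all choices made here, not part of the slides.

```python
def run_actions(tokens, actions):
    """Apply CREATE(X) / SHIFT / REDUCE actions and return the bracketed tree."""
    buffer = list(tokens)   # input tokens, consumed left to right
    stack = []              # empty at the beginning
    for action in actions:
        if action == "SHIFT":
            stack.append(buffer.pop(0))
        elif action == "REDUCE":
            # Pop words and finished subtrees back to the latest open constituent.
            children = []
            while isinstance(stack[-1], str):
                children.append(stack.pop())
            label = stack.pop()[1]
            stack.append("(" + label + " " + " ".join(reversed(children)) + ")")
        elif action.startswith("CREATE(") and action.endswith(")"):
            stack.append(("OPEN", action[len("CREATE("):-1]))
        else:
            raise ValueError(f"unknown action: {action}")
    assert not buffer and len(stack) == 1, "action sequence did not finish the parse"
    return stack[0]

ACTIONS = ["CREATE(S)", "CREATE(NP)", "SHIFT", "SHIFT", "REDUCE",
           "CREATE(VP)", "SHIFT", "CREATE(ADJP)", "SHIFT", "SHIFT",
           "REDUCE", "REDUCE", "REDUCE"]
RESULT = run_actions("deep learning is very powerful".split(), ACTIONS)
print(RESULT)
```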

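RNN Grammar

• The stack, the buffer and the previous actions are each encoded by an RNN.

• A network takes the three encodings and predicts the next action: CREATE(X), SHIFT or REDUCE.

Example: after CREATE(S), CREATE(NP), SHIFT, SHIFT, REDUCE, the stack holds “(S (NP deep learning )” and the network predicts the next action.

RNN Grammar – Training

• The ground truth gives the correct action at every step.

• Predicting the next action is a typical classification task.

• RL is not needed.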
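Since each step is a classification over actions, training can use an ordinary cross-entropy loss against the ground-truth action sequence. The sketch below assumes teacher forcing (the parser state follows the ground-truth actions) and uses a hypothetical `toy_network` that returns a uniform distribution in place of the RNN-based model; the stack is kept as a flat list of symbols for simplicity.

```python
import math

ACTIONS = ["CREATE(S)", "CREATE(NP)", "CREATE(VP)", "CREATE(ADJP)", "SHIFT", "REDUCE"]

def toy_network(stack, buffer, previous_actions):
    """Stand-in for the real model: a uniform distribution over actions.
    A real model would encode its three inputs with RNNs."""
    return {a: 1.0 / len(ACTIONS) for a in ACTIONS}

def sequence_loss(network, tokens, gold_actions):
    """Sum of per-step cross-entropy losses against the ground-truth actions."""
    stack, buffer, history, loss = [], list(tokens), [], 0.0
    for gold in gold_actions:
        probs = network(stack, buffer, history)
        loss += -math.log(probs[gold])
        # Teacher forcing: advance the state with the ground-truth action.
        if gold == "SHIFT":
            stack.append(buffer.pop(0))
        elif gold == "REDUCE":
            stack.append(")")          # simplified: flat symbol list, no composition
        else:
            stack.append("(" + gold[len("CREATE("):-1])
        history.append(gold)
    return loss

GOLD = ["CREATE(S)", "CREATE(NP)", "SHIFT", "SHIFT", "REDUCE",
        "CREATE(VP)", "SHIFT", "CREATE(ADJP)", "SHIFT", "SHIFT",
        "REDUCE", "REDUCE", "REDUCE"]
loss = sequence_loss(toy_network, "deep learning is very powerful".split(), GOLD)
print(round(loss, 3))
```

No reward signal or sampling is involved: every step has a known correct answer, which is why RL is unnecessary here.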

Source of image: https://papers.nips.cc/paper/5635-grammar-as-a-foreign-language.pdf

[Vinyals, et al., NIPS’15]


Tree to Sequence

The tree is written out as a sequence of 13 tokens:

(S (NP deep learning ) (VP is (ADJP very powerful ) ) )

Of course, you can try other tree traversal approaches [Liu, et al., TACL’17].
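A minimal sketch of turning a tree into the bracketed token sequence with a depth-first traversal. The nested-tuple tree and the function name `linearize` are assumptions made for this example.

```python
# Hypothetical representation: a node is (label, child, child, ...); a leaf is a word string.
TREE = ("S",
        ("NP", "deep", "learning"),
        ("VP", "is", ("ADJP", "very", "powerful")))

def linearize(node):
    """Depth-first traversal: "(LABEL", then the children, then ")"."""
    if isinstance(node, str):
        return [node]
    label, *children = node
    tokens = ["(" + label]
    for child in children:
        tokens.extend(linearize(child))
    tokens.append(")")
    return tokens

SEQ = linearize(TREE)
print(len(SEQ), " ".join(SEQ))
```

A different traversal order would only change where the label token is emitted relative to the children.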

Seq2seq!


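Seq2seq vs. RNN grammar

Seq2seq output [Vinyals, et al., NIPS’15]:

(S (NP deep learning ) (VP is (ADJP very powerful ) ) )

RNN grammar actions [Dyer, et al., NAACL’16]:

CREATE(S) CREATE(NP) SHIFT SHIFT REDUCE CREATE(VP) SHIFT CREATE(ADJP) SHIFT SHIFT REDUCE REDUCE REDUCE

The two sequences line up token by token: “(X” corresponds to CREATE(X), a word to SHIFT, and “)” to REDUCE.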
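The correspondence between the two output formats can be made concrete with a small converter. This is a sketch; the function name `to_actions` is hypothetical and the input is assumed to be a whitespace-separated bracketed sequence.

```python
def to_actions(sequence):
    """Map each token of a bracketed sequence to a transition action."""
    actions = []
    for token in sequence.split():
        if token == ")":
            actions.append("REDUCE")                       # close a constituent
        elif token.startswith("("):
            actions.append("CREATE(" + token[1:] + ")")    # open a constituent
        else:
            actions.append("SHIFT")                        # an ordinary word
    return actions

SEQUENCE = "(S (NP deep learning ) (VP is (ADJP very powerful ) ) )"
CONVERTED = to_actions(SEQUENCE)
print(" ".join(CONVERTED))
```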

Unsupervised Parsing?


Can we find parse trees without labeled data?

YES!

Reference: https://youtu.be/YIuBHB9Ejok

Reference

• [Vinyals, et al., NIPS’15] Oriol Vinyals, Lukasz Kaiser, Terry Koo, Slav Petrov, Ilya Sutskever, Geoffrey Hinton, Grammar as a foreign language, NIPS, 2015

• [Dyer, et al., NAACL’16] Chris Dyer, Adhiguna Kuncoro, Miguel Ballesteros, Noah A. Smith, Recurrent Neural Network Grammars, NAACL, 2016

• [Stern, et al., ACL’17] Mitchell Stern, Jacob Andreas, Dan Klein, A Minimal Span-Based Neural Constituency Parser, ACL, 2017

• [Liu, et al., TACL’17] Jiangming Liu, Yue Zhang, In-Order Transition-based Constituent Parsing, TACL, 2017
