Recent Developments in Natural Language Parsing

Giorgio Satta, University of Padua

Venice, May 31st, 2016

Giorgio Satta University of Padua NL Parsing

Summary

Part I

Introduction to natural language parsing

A little bit of history

Part II

Dependency grammar

Abstract meaning representation


Parsing

The term parsing derives from the Latin expression pars orationis (lit. part of speech), meaning the analysis of sentence components and their grammatical relations

Example: Rolls-Royce said it expects its U.S. sales to remain steady

1. Rolls-Royce   proper name   subject of 2
2. said          verb          main
3. it            pronoun       subject of 4
4. expects       verb          subordinate of 2

...

Parsing

In computer science, parsing refers to any process of recognition of an object on the basis of a formal grammar (e.g., compiler theory, syntactic pattern matching)

The importance of parsing stems from the fact that you cannot extract meaning without the support of syntax

Example:
What is the value of (13 + 5) ∗ 7 ?
What is the value of ( ) ∗ + 13 5 7 ? (lexicographic order)
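The arithmetic example can be made concrete with a small sketch. It assumes Python's standard `ast` module as a stand-in parser, and the helper names `evaluate` and `value_of` are illustrative only. The well-formed string receives a syntax tree and therefore a value, while the same symbols in lexicographic order receive no analysis at all.

```python
import ast

def evaluate(node):
    """Evaluate a tiny arithmetic syntax tree (only + and * on numbers)."""
    if isinstance(node, ast.Expression):
        return evaluate(node.body)
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
        return evaluate(node.left) + evaluate(node.right)
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Mult):
        return evaluate(node.left) * evaluate(node.right)
    raise ValueError("unsupported construct")

def value_of(text):
    """Return the value of `text`, or None if it has no syntactic analysis."""
    try:
        tree = ast.parse(text, mode="eval")
    except SyntaxError:
        return None
    return evaluate(tree)

print(value_of("(13 + 5) * 7"))    # 126
print(value_of("( ) * + 13 5 7"))  # None
```

The value is computed by recursion over the tree, so without a tree there is nothing to compute.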


Parsing Applications

In natural language (NL) processing, the parser is not a stand-alone application; it is rather used as a component of some end-to-end system

Most popular systems exploiting parsing:

Automatic speech understanding

Information extraction

Intelligent personal assistant

Machine translation

Question answering

Text summarization


Parsing Applications

1968: HAL 9000 is a sentient computer appearing in Stanley Kubrick’s 2001: A Space Odyssey

Parsing Applications

2011: IBM Watson is a question-answering system which outperformed its human opponents, both former winners, on the quiz show Jeopardy!

Parsing Overview

NL parsing is strongly rooted in

Generative linguistics

Formal language and automata theory

Computer algorithms

Machine learning


Generative Linguistics

In 1957, American scientist Noam Chomsky advocated for a formalized theory of linguistic structure, based on mathematically defined models

The idea started the field of generative linguistics, which revolutionized the scientific study of language

An Early Generative Model

Phrase-structure grammars (from Chomsky) are a generative model that strongly influenced NL parsing

Example: Rule system for a fragment of English (operator ‘|’ denotes alternative)

S → NP VP
NP → NP PP | Det N | N
VP → VP PP | V NP
PP → P NP
N → chocolate | I | fork | strawberries
V → eat
Det → a
P → with

An Early Generative Model

Example: Phrase structure

[S [NP [N I]] [VP [V eat] [NP [NP [N strawberries]] [PP [P with] [NP [N chocolate]]]]]]

An Early Generative Model

Example: Underlying grammatical relations

[S [NP [N I]] [VP [V eat] [NP [NP [N strawberries]] [PP [P with] [NP [N chocolate]]]]]]

[Figure: the phrase-structure tree above, overlaid with arcs for the grammatical relations sbj, obj, and mod]

An Early Generative Model

Example: Long distance syntactic movement

[CP [NP_i What] [C′ [C do] [S [NP [N I]] [VP [V eat] [NP [NP t_i] [PP [P with] [NP [N chocolate]]]]]]]]

[Figure: an arc labeled mov links the trace t_i and the moved phrase NP_i]

Ambiguity

In contrast with programming languages, NL is highly ambiguous

The number of possible syntactic interpretations of a sentence with n words can grow exponentially with n

Lexical, semantic and pragmatic knowledge is needed to rule out undesired/unlikely interpretations (example next)

Ambiguity

[S [NP [N I]] [VP [V eat] [NP [NP [N strawberries]] [PP [P with] [NP [N chocolate]]]]]]

Ambiguity

[S [NP [N I]] [VP [VP [V eat] [NP [N strawberries]]] [PP [P with] [NP [Det a] [N fork]]]]]

Ambiguity

?? [S [NP [N I]] [VP [V eat] [NP [NP [N strawberries]] [PP [P with] [NP [Det a] [N fork]]]]]]

Ambiguity

?? [S [NP [N I]] [VP [VP [V eat] [NP [N strawberries]]] [PP [P with] [NP [N chocolate]]]]]
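The ambiguity illustrated by these trees can be counted mechanically. The following is a minimal sketch, assuming the toy rule system for the fragment of English given earlier; the dictionary encoding and the name `count_parses` are illustrative choices, not part of the slides. It counts derivations per span with memoization, and relies on the fact that this grammar has no empty rules and no cycles of unary rules.

```python
from functools import lru_cache

# Toy grammar from the slides: nonterminal -> list of right-hand sides.
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("NP", "PP"), ("Det", "N"), ("N",)],
    "VP": [("VP", "PP"), ("V", "NP")],
    "PP": [("P", "NP")],
    "N": [("chocolate",), ("I",), ("fork",), ("strawberries",)],
    "V": [("eat",)],
    "Det": [("a",)],
    "P": [("with",)],
}

def count_parses(sentence, grammar=GRAMMAR, root="S"):
    """Count the parse trees that `grammar` assigns to `sentence`."""
    words = tuple(sentence.split())

    @lru_cache(maxsize=None)
    def count(symbol, i, j):
        # Number of ways `symbol` derives words[i:j].
        if symbol not in grammar:  # terminal symbol: must match one word
            return int(j - i == 1 and words[i] == symbol)
        total = 0
        for rhs in grammar[symbol]:
            if len(rhs) == 1:  # unary or lexical rule, same span
                total += count(rhs[0], i, j)
            else:  # binary rule: try every split point
                left, right = rhs
                for k in range(i + 1, j):
                    total += count(left, i, k) * count(right, k, j)
        return total

    return count(root, 0, len(words))

print(count_parses("I eat strawberries with chocolate"))              # 2
print(count_parses("I eat strawberries with a fork"))                 # 2
print(count_parses("I eat strawberries with chocolate with a fork"))  # 5
```

The grammar alone licenses both attachments for each sentence, including the ones marked ?? above; choosing between them takes knowledge that the rules do not encode.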


Parsing Algorithms

Chart parsing, credited to Martin Kay, is an algorithm for context-free grammar parsing that became very popular in the 1980s

Although independently discovered, it is very similar to an algorithm developed by Jay Earley in 1968

The algorithm uses dynamic programming to efficiently cope with syntactic ambiguity, running in time O(n^3) and space O(n^2), where n is the number of words in the input

Parsing Algorithms

Example: Chart parsing

Input positions: 0 I 1 eat 2 strb 3 . . .

Position 0: [S → • NP VP] [NP → • NP PP] [NP → • det N] [NP → • N] [N → • I] ...
Position 1: [N → I •] [NP → N •] [S → NP • VP] [VP → • VP PP] [VP → • V NP] [V → • eat]
Position 2: [V → eat •] [VP → V • NP] [NP → • NP PP] [NP → • det N] [NP → • N] [N → • strb]
Position 3: [S → NP VP •] [VP → V NP •] [N → strb •] [NP → N •] [NP → NP • PP] [PP → • P NP] [P → • with]
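The dotted items in the chart can be produced by a short recognizer. The following is a minimal Earley-style sketch, not the exact algorithm from the slides: it assumes the toy grammar for the fragment of English, assumes the grammar has no empty rules, and only answers yes or no (it does not build a parse forest). The names `Item` and `earley_recognize` are illustrative.

```python
from collections import namedtuple

GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("NP", "PP"), ("Det", "N"), ("N",)],
    "VP": [("VP", "PP"), ("V", "NP")],
    "PP": [("P", "NP")],
    "N": [("chocolate",), ("I",), ("fork",), ("strawberries",)],
    "V": [("eat",)],
    "Det": [("a",)],
    "P": [("with",)],
}

# Dotted item [lhs -> rhs[:dot] . rhs[dot:]] whose left edge is at `start`.
Item = namedtuple("Item", "lhs rhs dot start")

def earley_recognize(words, grammar=GRAMMAR, root="S"):
    """Return True if `words` is derivable from `root` (no empty rules assumed)."""
    n = len(words)
    chart = [set() for _ in range(n + 1)]
    chart[0] = {Item(root, rhs, 0, 0) for rhs in grammar[root]}
    for j in range(n + 1):
        agenda = list(chart[j])
        while agenda:
            item = agenda.pop()
            new_items = []
            if item.dot < len(item.rhs):
                nxt = item.rhs[item.dot]
                if nxt in grammar:
                    # Predict: expand the nonterminal after the dot.
                    new_items = [Item(nxt, rhs, 0, j) for rhs in grammar[nxt]]
                elif j < n and words[j] == nxt:
                    # Scan: the next word matches the terminal after the dot.
                    chart[j + 1].add(item._replace(dot=item.dot + 1))
            else:
                # Complete: advance every item that was waiting for item.lhs.
                new_items = [
                    parent._replace(dot=parent.dot + 1)
                    for parent in chart[item.start]
                    if parent.dot < len(parent.rhs)
                    and parent.rhs[parent.dot] == item.lhs
                ]
            for new in new_items:
                if new not in chart[j]:
                    chart[j].add(new)
                    agenda.append(new)
    return any(
        it.lhs == root and it.dot == len(it.rhs) and it.start == 0
        for it in chart[n]
    )

print(earley_recognize("I eat strawberries".split()))  # True
print(earley_recognize("eat I".split()))               # False
```

Each item is stored once per position, which is what keeps the work polynomial even when the number of parse trees is not.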


Parsing Algorithms

Example: Parse forest

[Figure: the chart from the previous example over positions 0 I 1 eat 2 strb 3 . . ., with the same dotted items, shown as a parse forest]

Parsing Algorithms

Chart parsing and many other parsing algorithms using dynamic programming can be seen as

Special constructions that ‘intersect’ a context-free grammar and a finite-state automaton

Simulations of specific push-down automata

Turning Point

1970s and 1980s: most NL parsers were based on hand-written rules developed by linguists

High cost

Poor system coverage

Ad hoc evaluation


Turning Point

Every time I fire a linguist, the performance of the speech recognizer goes up (Frederick Jelinek, IBM)

In contrast to mainstream research, IBM developed its own statistical, data-centric approach, leading to the Penn Treebank Project (1989-1992): the construction of a large bank of linguistic trees, developed at the University of Pennsylvania

Turning Point

Starting with the 1990s

Surge of empirical approaches in all areas of NL processing

Many linguistically annotated corpora released

Nowadays parsing is viewed as a structured prediction problem: an input sentence has to be assigned a structured object (syntactic tree) from an infinite space

Supervised machine learning techniques used to train parsers

Lexicalized Context-Free Grammars

Lexicalized context-free grammars are a syntactic model more fine-grained than phrase structure grammars

Accounts for lexical use

If used with scores (e.g., probabilities), very effective in disambiguation

Basic idea: enrich grammar symbols with lexical heads

NP[strawberry] → NP[strawberry] PP[chocolate]


Lexicalized Context-Free Grammars

[S[eat] [NP[I] [N[I] I]] [VP[eat] [V[eat] eat] [NP[strb] [NP[strb] [N[strb] strawberries]] [PP[choc] [P[with] with] [NP[choc] [N[choc] chocolate]]]]]]
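The head annotation in this tree can be computed bottom-up from a plain phrase-structure tree. The sketch below is illustrative only: the `HEAD_CHILD` table is a hypothetical set of head rules chosen to agree with the annotated tree above (for instance, the PP takes its head from its NP), and the nested-tuple tree encoding and the name `lexicalize` are assumptions rather than anything prescribed by the slides.

```python
# Hypothetical head rules: for each parent category, the child categories
# to try (in priority order) when choosing the head child.
HEAD_CHILD = {
    "S": ["VP"],
    "VP": ["VP", "V"],
    "NP": ["NP", "N"],
    "PP": ["NP"],
}

def lexicalize(tree):
    """Return (annotated_tree, head_word) for a tree of nested tuples.

    A tree is (label, child, ...) and a preterminal is (label, word).
    """
    label, children = tree[0], tree[1:]
    if len(children) == 1 and isinstance(children[0], str):
        word = children[0]
        return (f"{label}[{word}]", word), word
    annotated, heads = [], {}
    for child in children:
        sub, head = lexicalize(child)
        annotated.append(sub)
        heads.setdefault(child[0], head)  # keep the leftmost child per category
    for category in HEAD_CHILD[label]:
        if category in heads:
            head = heads[category]
            return (f"{label}[{head}]", *annotated), head
    raise ValueError(f"no head rule applies to {label}")

tree = ("S",
        ("NP", ("N", "I")),
        ("VP", ("V", "eat"),
               ("NP", ("NP", ("N", "strawberries")),
                      ("PP", ("P", "with"), ("NP", ("N", "chocolate"))))))
annotated, head = lexicalize(tree)
print(head)          # eat
print(annotated[0])  # S[eat]
```

Once every symbol carries its head word, a scored rule such as NP[strawberries] → NP[strawberries] PP[chocolate] can prefer one attachment over another.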


Lexicalized Context-Free Grammars

Naive parsing algorithm for lexicalized context-free grammars runs in time O(n^5), where n is the number of words in the input

Using advanced dynamic programming, parsing can be done in time O(n^3) and space O(n^2)

More Advanced Formalisms

In 1985, Stuart Shieber showed that natural language is not context-free

This boosted the investigation of more powerful formalisms

Tree-adjoining grammars

Combinatory categorial grammars

Linear context-free rewriting systems

More Advanced Formalisms

[Figure: three panels labeled CFG, TAG, and LCFRS, with nodes labeled S, A, and B]

Summary

Part I

• Introduction to natural language parsing

• A little bit of history

Part II

Dependency grammar

• Abstract meaning representation


Dependency Grammars

Dependency grammars can be traced back to the work of French linguist Lucien Tesnière (1893-1954)

Very good balance between linguistic expressivity, annotation cost, and processing efficiency

10K to 102K words per second with greedy parsers

50++ languages covered

Universal Dependencies project (ongoing)

Dependency Tree

In a dependency tree, clausal structure is determined by a binary relation, called dependency, between pairs of words called head and dependent

root Rolls-Royce said it expects its U.S. sales to remain steady .

[Figure: dependency arcs over the sentence, labeled root, nsubj, ccomp, punc, nsubj, xcomp, nn, poss, aux, nsubj, acomp]

Projectivity

A node is projective if it generates a substring of the input string

Example: Node 6 dominates substring [6, 9]

1 root   2 Mr.   3 Tomash   4 will   5 remain   6 as   7 a   8 director   9 emeritus   10 .

[Figure: dependency arcs over the sentence, labeled root, sbj, nmod, vc, pp, nmod, nmod, np, punc]

Projectivity

A node is non-projective if it is not projective

Example: Node 3 dominates substrings [2, 3] and [6, 8]

1 root   2 A   3 hearing   4 is   5 scheduled   6 on   7 the   8 issue   9 today   10 .

[Figure: dependency arcs over the sentence, labeled root, sbj, nmod, vc, tmp, pp, np, nmod, punc]
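The projectivity test can be written directly from the definition: collect the positions a node dominates and check that they form one contiguous block. The sketch below is illustrative. The `HEADS` map (dependent position to head position, with the slide's numbering where root is node 1) is an assumed reading of the figure; it was chosen to agree with the arc labels and with the stated yield of node 3, but the slide text itself does not spell out the arcs.

```python
# Assumed head assignment for "root A hearing is scheduled on the issue today ."
# Keys are dependents, values are heads; node 1 (root) has no head.
HEADS = {2: 3, 3: 4, 4: 1, 5: 4, 6: 3, 7: 8, 8: 6, 9: 5, 10: 4}

def yield_of(node, heads):
    """Return the set of positions dominated by `node`, including itself."""
    children = {}
    for dep, head in heads.items():
        children.setdefault(head, []).append(dep)
    dominated, stack = set(), [node]
    while stack:
        current = stack.pop()
        dominated.add(current)
        stack.extend(children.get(current, []))
    return dominated

def is_projective(node, heads):
    """A node is projective if its yield is one contiguous span."""
    span = yield_of(node, heads)
    return max(span) - min(span) + 1 == len(span)

print(sorted(yield_of(3, HEADS)))  # [2, 3, 6, 7, 8]
print(is_projective(3, HEADS))     # False
print(is_projective(6, HEADS))     # True
```

Node 3 fails the test because positions 4 and 5 lie inside its span without being dominated by it.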


Accuracy (Greedy Parsing)

Labeled (LAS) and unlabeled (UAS) attachment scores on the CoNLL 2007 dataset (including punctuation) and on the Penn Treebank (PTB, excluding punctuation)

parser  Arabic  Basque  Catalan  Chinese  Czech  English  Greek  Hungarian  Italian  Turkish  PTB
UAS
std     81.39   75.37   90.32    85.17    78.90  85.69    79.90  77.67      82.98    77.04    89.86
dyn     82.56   74.39   90.95    85.65    81.01  87.70    81.85  78.72      84.37    77.21    90.92
spine   84.54   75.82   91.92    86.72    81.19  89.37    81.78  77.48      85.38    78.61    91.77
LAS
std     71.93   65.64   84.90    80.35    71.39  84.60    72.25  67.66      78.77    65.9     87.56
dyn     72.89   65.27   85.82    81.28    72.92  86.79    74.22  69.57      80.25    66.71    88.72
spine   74.54   66.91   86.83    82.38    72.72  88.44    74.04  68.76      81.50    68.06    89.53
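For readers unfamiliar with the two metrics in the table, here is a minimal sketch of how they are usually computed: UAS is the percentage of tokens that receive the correct head, and LAS the percentage that receive both the correct head and the correct label. The function name and the tiny gold and predicted analyses are made up for illustration; whether punctuation tokens are counted is decided before calling it.

```python
def attachment_scores(gold, predicted):
    """Return (UAS, LAS) in percent.

    `gold` and `predicted` are equal-length lists with one (head, label)
    pair per token.
    """
    if not gold or len(gold) != len(predicted):
        raise ValueError("need two non-empty analyses of the same length")
    correct_heads = sum(g[0] == p[0] for g, p in zip(gold, predicted))
    correct_both = sum(g == p for g, p in zip(gold, predicted))
    total = len(gold)
    return 100.0 * correct_heads / total, 100.0 * correct_both / total

# Invented four-token example: one wrong label, one wrong head.
gold = [(2, "nmod"), (3, "sbj"), (0, "root"), (3, "vc")]
pred = [(2, "nmod"), (3, "obj"), (0, "root"), (2, "vc")]
print(attachment_scores(gold, pred))  # (75.0, 50.0)
```

LAS can never exceed UAS, which matches every column of the table.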


Arc-Standard Parser

The arc-standard parser is very similar to a shift-reduce push-down automaton

Internal states are represented by a very large feature vector (fortunately, also very sparse)

Transition relation inferred from data (not fully observed)

Arc-Standard Parser

Example (each row shows a configuration and the transition applied to it; dependents already attached are written in parentheses after their head):

Trans  Stack                                Buffer
sh     —                                    root Mr. · · ·
sh     root                                 Mr. Tomash · · ·
sh     root Mr.                             Tomash will · · ·
la     root Mr. Tomash                      will remain · · ·
sh     root Tomash(Mr.)                     will remain · · ·
la     root Tomash(Mr.) will                remain as · · ·
sh     root will(Tomash(Mr.))               remain as · · ·
ra     root will(Tomash(Mr.)) remain        as a · · ·
sh     root will(Tomash(Mr.), remain)       as a · · ·
sh     root will(Tomash(Mr.), remain) as    a director · · ·
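The trace above can be replayed with a few lines of code. This is a minimal sketch of the three arc-standard transitions only (sh shifts the next buffer word onto the stack, la makes the second-topmost stack word a dependent of the topmost, ra makes the topmost a dependent of the second-topmost); it takes the transition sequence as given and leaves out the learned model that would choose it. The name `arc_standard` and the (head, dependent) index encoding are illustrative.

```python
def arc_standard(words, transitions):
    """Apply sh/la/ra transitions; return (stack, buffer, arcs).

    Stack and buffer hold word indices; arcs are (head, dependent) pairs.
    """
    stack, buffer, arcs = [], list(range(len(words))), []
    for trans in transitions:
        if trans == "sh":
            if not buffer:
                raise ValueError("sh needs a non-empty buffer")
            stack.append(buffer.pop(0))
        elif trans in ("la", "ra"):
            if len(stack) < 2:
                raise ValueError(f"{trans} needs two words on the stack")
            if trans == "la":
                dependent = stack.pop(-2)  # second-topmost leaves the stack
            else:
                dependent = stack.pop()    # topmost leaves the stack
            arcs.append((stack[-1], dependent))
        else:
            raise ValueError(f"unknown transition {trans!r}")
    return stack, buffer, arcs

words = "root Mr. Tomash will remain as a director emeritus .".split()
steps = ["sh", "sh", "sh", "la", "sh", "la", "sh", "ra", "sh", "sh"]
stack, buffer, arcs = arc_standard(words, steps)
print([words[i] for i in stack])               # ['root', 'will', 'as', 'a']
print([(words[h], words[d]) for h, d in arcs])
# [('Tomash', 'Mr.'), ('will', 'Tomash'), ('will', 'remain')]
```

Every transition does a constant amount of work and every word is shifted once and attached at most once, so a greedy parser of this kind runs in time linear in the sentence length.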


Summary

Part I

• Introduction to natural language parsing

• A little bit of history

Part II

• Dependency grammar

Abstract meaning representation


Semantic Parsing

Syntax: Single parsing task rather than separate tasks such as base noun identification, prepositional phrase attachment, trace recovery, verb-argument dependencies, etc.

Semantics: Separate annotations for named entities, co-reference, semantic relations, discourse connectives, temporal entities, etc.

We lack a sembank of (sentence, logical meaning) pairs

Semantic Parsing

Abstract meaning representations (AMRs) are directed acyclic graphs representing abstract concepts, predicates and semantic relations realised by natural language sentences

Sentences with the same basic meaning are assigned the same AMR

He described her as a genius

His description of her: genius

She was a genius, according to his description
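To make the graph view concrete, the shared meaning of the three sentences can be stored as a small labeled graph. The sketch below is an assumption-laden illustration: the concept names (describe-01, he, she, genius) and role labels (ARG0, ARG1, ARG2) follow common AMR conventions but are not given in the slides, and the helper names are made up. It checks acyclicity and prints a bracketed form; the printer handles tree-shaped graphs only.

```python
# Assumed AMR for the three paraphrases: variables, concepts, labeled edges.
INSTANCES = {"d": "describe-01", "h": "he", "s": "she", "g": "genius"}
EDGES = [("d", "ARG0", "h"), ("d", "ARG1", "s"), ("d", "ARG2", "g")]

def is_acyclic(nodes, edges):
    """Kahn-style check: repeatedly remove nodes with no incoming edges."""
    indegree = {node: 0 for node in nodes}
    for _, _, target in edges:
        indegree[target] += 1
    ready = [node for node, degree in indegree.items() if degree == 0]
    removed = 0
    while ready:
        node = ready.pop()
        removed += 1
        for source, _, target in edges:
            if source == node:
                indegree[target] -= 1
                if indegree[target] == 0:
                    ready.append(target)
    return removed == len(indegree)

def to_bracketed(node, instances, edges):
    """Render the subgraph below `node` (tree-shaped graphs only)."""
    parts = [f"{node} / {instances[node]}"]
    for source, role, target in edges:
        if source == node:
            parts.append(f":{role} {to_bracketed(target, instances, edges)}")
    return "(" + " ".join(parts) + ")"

print(is_acyclic(INSTANCES, EDGES))  # True
print(to_bracketed("d", INSTANCES, EDGES))
# (d / describe-01 :ARG0 (h / he) :ARG1 (s / she) :ARG2 (g / genius))
```

The graph says nothing about word order or part of speech, which is why a verb, a noun phrase, and a prepositional paraphrase can all map to it.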


Abstract Meaning Representation

‘Then the little prince flashed back at me, with a kind of resentfulness: I don’t believe you!’

Abstract Meaning Representation

Formalisms being explored for AMR parsing

Finite automata for directed acyclic graphs

Hyperedge replacement grammars

Long term research program:

incorporate syntax and semantics into parsing

switch from conditional to joint models