Treebank-Based Wide Coverage Probabilistic LFG Resources


Page 1: Treebank-Based Wide Coverage Probabilistic LFG Resources

Paris 2008

Treebank-Based LFG Resources 1

Treebank-Based Wide Coverage Probabilistic LFG Resources

Josef van Genabith, Aoife Cahill, Grzegorz Chrupala, Jennifer Foster, Deirdre Hogan, Conor Cafferkey, Mick Burke, Ruth O’Donovan, Yvette Graham, Karolina Owczarzak, Yuqing Guo, Ines Rehbein, Natalie Schluter and Djame Sedah

National Centre for Language Technology NCLT

School of Computing, Dublin City University

Page 2: Treebank-Based Wide Coverage Probabilistic LFG Resources


Overview

• Context/Motivation

• Treebank-Based Acquisition of Wide-Coverage LFG Resources (Penn-II)

– LFG

– Automatic F-Structure Annotation Algorithm

– Acquisition of Lexical Resources

• Parsing

– Parsing Architectures

– LDD-Resolution

– Comparison with Hand-Crafted (XLE, RASP) and Treebank-Based (CCG, HPSG) Resources

• Generation

– Basic Generator

– Generation Grammar Transforms

– “History-Based” Generation

• MT Evaluation

Page 3: Treebank-Based Wide Coverage Probabilistic LFG Resources


Motivation

• What do grammars do?

– Grammars define languages as sets of strings

– Grammars define what strings are grammatical and what strings are not

– Grammars tell us about the syntactic structure of (associated with) strings

• “Shallow” vs. “Deep” grammars

• Shallow grammars do all of the above

• Deep grammars (in addition) relate text to information/meaning representation

• Information: predicate-argument-adjunct structure, deep dependency relations, logical forms, …

• In natural languages, linguistic material is not always interpreted locally where you encounter it: long-distance dependencies (LDDs)

• Resolution of LDDs crucial to construct accurate and complete information/meaning representations.

• Deep grammars := (text <-> meaning) + (LDD resolution)

Page 4: Treebank-Based Wide Coverage Probabilistic LFG Resources


Motivation

• Constraint-Based Grammar Formalisms (FU, GPSG, PATR-II, …)

– Lexical-Functional Grammar (LFG)

– Head-Driven Phrase Structure Grammar (HPSG)

– Combinatory Categorial Grammar (CCG)

– Tree-Adjoining Grammar (TAG)

• Traditionally, deep constraint-based grammars are hand-crafted

• LFG ParGram, HPSG LinGO ERG, Core Language Engine CLE, Alvey Tools, RASP, ALPINO, …

• Wide-coverage, deep constraint-based grammar development is very time-consuming, knowledge-intensive and expensive!

• Very hard to scale hand-crafted grammars to unrestricted text!

• English XLE (Riezler et al. 2002); German XLE (Forst and Rohrer 2006); Japanese XLE (Masuichi and Okuma 2003); RASP (Carroll and Briscoe 2002); ALPINO (Bouma, van Noord and Malouf, 2000)

Page 5: Treebank-Based Wide Coverage Probabilistic LFG Resources


Motivation

• Instance of “knowledge acquisition bottleneck” familiar from classical “rationalist rule/knowledge-based” AI/NLP

• Alternative to classical “rationalist” rule/knowledge-based AI/NLP

• “Empiricist data-driven” research paradigm (AI/NLP):

– Corpora, …, machine-learning-based and statistical approaches, …

– Treebank-based grammar acquisition, probabilistic parsing

– Advantage: grammars can be induced (learned) automatically

– Very low development cost, wide-coverage, robust, but …

• Most treebank-based grammar induction/parsing technology produces “shallow” grammars

• Shallow grammars don’t resolve LDDs (but see (Johnson 2002); …), do not map strings to information/meaning representations …

Page 6: Treebank-Based Wide Coverage Probabilistic LFG Resources


Motivation

• Poses a number of research questions:

• Can we address the knowledge acquisition bottleneck for deep grammar development by combining insights from rationalist and empiricist research paradigms?

• Specifically:

• Can we automatically acquire wide-coverage “deep”, probabilistic, constraint-based grammars from treebanks?

• How do we use them in parsing?

• Can we use them for generation?

• Can we acquire resources for different languages and treebank encodings?

• How do these resources compare with hand-crafted resources?

• How do they fare in applications … ?

Page 7: Treebank-Based Wide Coverage Probabilistic LFG Resources


Context

• TAG (Xia, 2001)

• LFG (Cahill, McCarthy, van Genabith and Way, 2002)

• CCG (Hockenmaier & Steedman, 2002)

• HPSG (Miyao and Tsujii, 2003)

• LFG

– (van Genabith, Sadler and Way, 1999)

– (Frank, 2000)

– (Sadler, van Genabith and Way, 2000)

– (Frank, Sadler, van Genabith and Way, 2003)

Page 8: Treebank-Based Wide Coverage Probabilistic LFG Resources


Lexical-Functional Grammar (LFG)

Parsing

Page 9: Treebank-Based Wide Coverage Probabilistic LFG Resources


LFG Acquisition for English - Overview

• Treebank-Based Acquisition of LFG Resources (Penn-II)

– Lexical Functional Grammar LFG

– Penn-II Treebank & Preprocessing/Clean-Up

– F-Str Annotation Algorithm

– Grammar and Lexicon Extraction

• Parsing Architectures (LDD Resolution)

• Comparison with best hand-crafted resources: XLE and RASP

• Comparison with treebank-based CCG and HPSG resources

Page 10: Treebank-Based Wide Coverage Probabilistic LFG Resources


Lexical-Functional Grammar (LFG)

Lexical-Functional Grammar (LFG) (Bresnan & Kaplan 1981, Bresnan 2001, Dalrymple 2001) is a constraint-based theory of grammar.

Two (basic) levels of representation:

• C-structure: represents surface grammatical configurations such as word order, annotated CFG rules/trees

• F-structure: represents abstract syntactic functions such as SUBJ(ject), OBJ(ect), OBL(ique), PRED(icate), COMP(lement), ADJ(unct) …, AVM attribute-value matrices/feature structures

F-structure approximates to basic predicate-argument structure, dependency representation, logical form (van Genabith and Crouch, 1996; 1997)

Page 11: Treebank-Based Wide Coverage Probabilistic LFG Resources


Lexical-Functional Grammar (LFG)

Page 12: Treebank-Based Wide Coverage Probabilistic LFG Resources


Lexical-Functional Grammar (LFG)

• Subcategorisation:

– Semantic forms (subcat frames): see<SUBJ,OBJ>

– Completeness: all GFs in semantic form present at local f-structure

– Coherence: only the GFs in semantic form present at local f-structure

• Long Distance Dependencies (LDDs): resolved at f-structure with

– Functional Uncertainty Equations (regular expressions specifying paths in f-structure): e.g. TOPICREL = COMP* OBJ

– subcat frames

– Completeness/Coherence.
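
The Completeness and Coherence conditions can be sketched in a few lines. This is only an illustration, assuming an f-structure is a plain Python dict and a semantic form is a list of grammatical functions; the names `GOVERNABLE`, `is_complete` and `is_coherent` are hypothetical and not taken from any LFG toolkit.

```python
# Governable grammatical functions (an assumed, simplified inventory).
GOVERNABLE = {"SUBJ", "OBJ", "OBJ2", "OBL", "COMP", "XCOMP"}

def is_complete(fstr, subcat):
    """Completeness: every GF in the semantic form is present locally."""
    return all(gf in fstr for gf in subcat)

def is_coherent(fstr, subcat):
    """Coherence: every governable GF present locally is in the semantic form."""
    return all(gf in subcat for gf in fstr if gf in GOVERNABLE)

# see<SUBJ,OBJ>
see_frame = ["SUBJ", "OBJ"]
ok = {"PRED": "see", "SUBJ": {"PRED": "John"}, "OBJ": {"PRED": "Mary"}}
missing_obj = {"PRED": "see", "SUBJ": {"PRED": "John"}}   # incomplete
extra_comp = dict(ok, COMP={"PRED": "leave"})             # incoherent
```

Only `ok` passes both checks; `missing_obj` fails Completeness and `extra_comp` fails Coherence.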

Page 13: Treebank-Based Wide Coverage Probabilistic LFG Resources


Lexical-Functional Grammar (LFG)

Page 14: Treebank-Based Wide Coverage Probabilistic LFG Resources


Introduction: Penn-II & LFG

• If we had f-structure annotated version of Penn-II, we could use (standard) machine learning methods to extract probabilistic, wide-coverage LFG resources

• How do we get f-structure annotated Penn-II?

• Manually? No: ~50,000 trees …!

• Automatically! Yes: F-Structure annotation algorithm … !

• Penn-II is a 2nd generation treebank – contains lots of annotations to support derivation of deep meaning representations:

– trees, Penn-II “functional” tags (-SBJ, -TMP, -LOC), traces & coindexation

• f-structure annotation algorithm exploits those.

Page 15: Treebank-Based Wide Coverage Probabilistic LFG Resources


Treebank Annotation: Penn-II & LFG

Page 16: Treebank-Based Wide Coverage Probabilistic LFG Resources


Treebank Annotation: Penn-II & LFG

Page 17: Treebank-Based Wide Coverage Probabilistic LFG Resources


Treebank Preprocessing/Clean-Up: Penn-II & LFG

• Penn-II treebank: often flat analyses (coordination, NPs …), a certain amount of noise: inconsistent annotations, errors …

• No treebank preprocessing or clean-up in the LFG approach (unlike CCG- and HPSG-based approaches)

– Take Penn-II treebank as is, but

– Remove all trees with FRAG or X labelled constituents

– FRAG = fragments, X = not known how to annotate

• Total of 48,424 trees as they are.
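
The FRAG/X filter can be sketched as follows. This is a minimal illustration, assuming trees are available as Penn-style bracketed strings; the regular expression and the `keep_tree` helper are hypothetical, not the authors' code.

```python
import re

# An opening bracket followed by FRAG or X, optionally carrying
# function tags or indices (e.g. FRAG-HLN, X-ADV), then whitespace.
EXCLUDED = re.compile(r"\((?:FRAG|X)(?:[-=][A-Z0-9]+)*\s")

def keep_tree(bracketed):
    """True if the tree has no FRAG- or X-labelled constituent."""
    return EXCLUDED.search(bracketed) is None

# Hypothetical example trees.
trees = [
    "(S (NP-SBJ (NNP John)) (VP (VBD slept)))",
    "(FRAG-HLN (NP (NN nonsense)))",
    "(S (NP-SBJ (PRP It)) (VP (VBD was) (X (RB so))))",
]
kept = [t for t in trees if keep_tree(t)]
```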

Page 18: Treebank-Based Wide Coverage Probabilistic LFG Resources


Treebank Annotation: Penn-II & LFG

• Annotation-based (rather than conversion-based)

• Automatic annotation of nodes in Penn-II treebank trees with f-structure equations

• Annotation Algorithm exploits:

– Head information

– Categorial information

– Configurational information

– Penn-II functional tags

– Trace information

Page 19: Treebank-Based Wide Coverage Probabilistic LFG Resources


Treebank Annotation: Penn-II & LFG

Architecture of a modular algorithm to assign LFG f-structure equations to trees in the Penn-II treebank:

Left-Right Context Annotation Principles

Coordination Annotation Principles

Catch-All and Clean-Up

Traces

Proto F-Structures

Proper F-Structures

Head-Lexicalisation [Magerman,1994]

Page 20: Treebank-Based Wide Coverage Probabilistic LFG Resources


Treebank Annotation: Penn-II & LFG

• Head Lexicalisation: modified rules based on (Magerman, 1994)

Page 21: Treebank-Based Wide Coverage Probabilistic LFG Resources


Treebank Annotation: Penn-II & LFG

Left-Right Context Annotation Principles:

• Head of NP likely to be rightmost noun …

• Mother → Left Context Head Right Context

Page 22: Treebank-Based Wide Coverage Probabilistic LFG Resources


Treebank Annotation: Penn-II & LFG

Left-Right Annotation Matrix for NP:

Left Context: DT: ↑spec:det=↓; QP: ↑spec:quant=↓; JJ, ADJP: ↓∈↑adjunct

Head: NN, NNS: ↑=↓

Right Context: NP: ↓∈↑app; PP: ↓∈↑adjunct; S, SBAR: ↓∈↑relmod

Example: [NP [DT a] [ADJP [RB very] [JJ politicized]] [NN deal]]

Annotated: [NP [DT:↑spec:det=↓ a] [ADJP:↓∈↑adjunct [RB very] [JJ politicized]] [NN:↑=↓ deal]]
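
A rough sketch of how such a matrix could be applied to the daughters of an NP. The matrix entries are the ones on the slide; the head finder is an assumed simplification (rightmost NN/NNS daughter), and `NP_MATRIX`, `find_head` and `annotate_np` are hypothetical names.

```python
# NP annotation matrix as given on the slide.
NP_MATRIX = {
    "left": {"DT": "↑spec:det=↓", "QP": "↑spec:quant=↓",
             "JJ": "↓∈↑adjunct", "ADJP": "↓∈↑adjunct"},
    "head": "↑=↓",
    "right": {"NP": "↓∈↑app", "PP": "↓∈↑adjunct",
              "S": "↓∈↑relmod", "SBAR": "↓∈↑relmod"},
}

def find_head(daughters):
    """Assumed simplification of the head rules: rightmost noun daughter."""
    for i in range(len(daughters) - 1, -1, -1):
        if daughters[i] in ("NN", "NNS"):
            return i
    raise ValueError("no nominal head found")

def annotate_np(daughters):
    """Pair each daughter category with an equation (None if unlisted)."""
    h = find_head(daughters)
    out = []
    for i, cat in enumerate(daughters):
        if i == h:
            eq = NP_MATRIX["head"]
        elif i < h:
            eq = NP_MATRIX["left"].get(cat)
        else:
            eq = NP_MATRIX["right"].get(cat)
        out.append((cat, eq))
    return out

# NP -> DT ADJP NN ("a very politicized deal")
annotated = annotate_np(["DT", "ADJP", "NN"])
```

Daughters not listed in the matrix come back with `None`, which is where a catch-all step would take over.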

Page 23: Treebank-Based Wide Coverage Probabilistic LFG Resources


Treebank Annotation: Penn-II & LFG

ADJP ADJP-ADV ADJP-CLR ADJP-HLN ADJP-LOC ADJP-MNR ADJP-PRD ADJP-SBJ ADJP-TMP ADJP-TPC ADJP-TTL ADVP ADVP-CLR ADVP-DIR ADVP-EXT ADVP-HLN ADVP-LOC ADVP-MNR ADVP-PRD ADVP-PRP ADVP-PUT ADVP-TMP ADVP-TPC ADVP|PRT CONJP FRAG FRAG-ADV FRAG-HLN FRAG-PRD FRAG-TPC FRAG-TTL

INTJ INTJ-CLR INTJ-HLN LST NAC NAC-LOC NAC-TMP NAC-TTL NP NP-ADV NP-BNF NP-CLR NP-DIR NP-EXT NP-HLN NP-LGS NP-LOC NP-MNR NP-PRD NP-SBJ NP-TMP NP-TPC NP-TTL NP-VOC NX NX-TTL PP PP-BNF PP-CLR PP-DIR PP-DTV

PP-EXT PP-HLN PP-LGS PP-LOC PP-MNR PP-NOM PP-PRD PP-PRP PP-PUT PP-SBJ PP-TMP PP-TPC PP-TTL PRN PRT PRT|ADVP QP RRC S S-ADV S-CLF S-CLR S-HLN S-LOC S-MNR S-NOM S-PRD S-PRP S-SBJ S-TMP S-TPC

S-TTL SBAR SBAR-ADV SBAR-CLR SBAR-DIR SBAR-HLN SBAR-LOC SBAR-MNR SBAR-NOM SBAR-PRD SBAR-PRP SBAR-PUT SBAR-SBJ SBAR-TMP SBAR-TPC SBAR-TTL SBARQ SBARQ-HLN SBARQ-NOM SBARQ-PRD SBARQ-TPC SBARQ-TTL SINV SINV-ADV SINV-HLN SINV-TPC SINV-TTL SQ SQ-PRD SQ-TPC SQ-TTL

UCP UCP-ADV UCP-CLR UCP-DIR UCP-EXT UCP-HLN UCP-LOC UCP-MNR UCP-PRD UCP-PRP UCP-TMP UCP-TPC VP VP-TPC VP-TTL WHADJP WHADVP WHADVP-TMP WHNP WHPP X X-ADV X-CLF X-DIR X-EXT X-HLN X-PUT X-TMP X-TTL

Page 24: Treebank-Based Wide Coverage Probabilistic LFG Resources


Treebank Annotation: Penn-II & LFG

• Do annotation matrix for each of the monadic categories (without –Fun tags) in Penn-II

• Based on analysing the most frequent rule types for each category such that sum total of token frequencies of these rule types is greater than 85% of total number of rule tokens for that category

Number of rule types covering 100% and 85% of rule tokens:

Category   100%    85%
NP         6595    102
VP         10239   307
S          2602    20
ADVP       234     6

• Apply annotation matrix to all (i.e. also unseen) rules/sub-trees, i.e. also those NP-LOC, NP-TMP etc.
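
The 85% selection criterion can be sketched like this. The rule counts are invented for illustration, and `rules_for_coverage` is a hypothetical helper, not part of the original system.

```python
from collections import Counter

def rules_for_coverage(rule_counts, threshold=0.85):
    """Most frequent rule types whose token counts together
    exceed `threshold` of all rule tokens for the category."""
    total = sum(rule_counts.values())
    selected, covered = [], 0
    for rule, count in rule_counts.most_common():
        if covered > threshold * total:
            break
        selected.append(rule)
        covered += count
    return selected

# Hypothetical NP rule-token counts, for illustration only.
np_rules = Counter({
    "NP -> DT NN": 50, "NP -> NNP": 25, "NP -> DT JJ NN": 15,
    "NP -> NP PP": 7, "NP -> DT NN NN NN": 3,
})
core = rules_for_coverage(np_rules)
```

Here three of the five rule types already account for 90 of the 100 tokens, which mirrors the slide's point that a small set of frequent rule types (e.g. 102 of 6595 for NP) is enough to design the matrix.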

Page 25: Treebank-Based Wide Coverage Probabilistic LFG Resources


Treebank Annotation: Penn-II & LFG

Traces Module:

• Long Distance Dependencies (LDDs)

• Topicalisation

• Questions

• Wh- and wh-less relative clauses

• Passivisation

• Control constructions

• ICH (interpret constituent here)

• RNR (right node raising)

• …

• Translate Penn-II traces and coindexation into corresponding reentrancy in f-structure

Page 26: Treebank-Based Wide Coverage Probabilistic LFG Resources


Treebank Annotation: Control & Wh-Rel. LDD

Page 27: Treebank-Based Wide Coverage Probabilistic LFG Resources


Treebank Annotation: Penn-II & LFG

Left-Right Context Annotation Principles

Coordination Annotation Principles

Catch-All and Clean-Up

Traces

Proto F-Structures

Proper F-Structures

Head-Lexicalisation [Magerman,1994]

Constraint Solver

Page 28: Treebank-Based Wide Coverage Probabilistic LFG Resources


Treebank Annotation: Penn-II & LFG

• Collect f-structure equations

• Send to constraint solver

• Generates f-structures

• F-structure annotation algorithm in Java, constraint solver in Prolog

• ~3 min annotating ~50,000 Penn-II trees

• ~5 min producing ~50,000 f-structures

Page 29: Treebank-Based Wide Coverage Probabilistic LFG Resources


Treebank Annotation: Penn-II & LFG

Evaluation (Quantitative):

• Coverage: over 99.8% of Penn-II sentences (without X and FRAG constituents) receive a single covering and connected f-structure:

F-structures   Sentences   Percentage
0              45          0.093%
1              48329       99.804%
2              50          0.103%

Page 30: Treebank-Based Wide Coverage Probabilistic LFG Resources


Treebank Annotation: Penn-II & LFG

• F-structure quality evaluation against DCU 105 Dependency Bank, a manually annotated dependency gold standard of 105 sentences randomly extracted from WSJ section 23.

• Triples are extracted from the gold standard

• Evaluation software from (Crouch et al. 2002) and (Riezler et al. 2002)

relation(predicate~0, argument~1)

DCU 105     All Annotations   Preds-Only
Precision   97.06%            94.28%
Recall      96.80%            94.28%
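
Triple-based scoring of this kind can be sketched as set overlap. This is an illustration only, not the (Crouch et al. 2002) evaluation software; the triples below are hypothetical and `prf` is an assumed helper name.

```python
def prf(gold, test):
    """Precision, recall and f-score over sets of dependency triples."""
    gold, test = set(gold), set(test)
    matched = len(gold & test)
    precision = matched / len(test) if test else 0.0
    recall = matched / len(gold) if gold else 0.0
    f = 2 * precision * recall / (precision + recall) if matched else 0.0
    return precision, recall, f

# Hypothetical triples in the relation(predicate~0, argument~1) style.
gold = {("subj", "resign~0", "john~1"),
        ("adjunct", "resign~0", "yesterday~2"),
        ("tense", "resign~0", "past")}
test = {("subj", "resign~0", "john~1"),
        ("obj", "resign~0", "yesterday~2"),     # wrong relation label
        ("tense", "resign~0", "past")}
p, r, f = prf(gold, test)
```

A preds-only evaluation would apply the same function after discarding triples whose value is an atomic feature such as `tense`.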

Page 31: Treebank-Based Wide Coverage Probabilistic LFG Resources


Treebank Annotation: Penn-II & LFG

• Following (Kaplan et al. 2004) evaluation against PARC 700 Dependency Bank calculated for: all annotations, PARC features, preds-only

• Mapping required (Burke 2004, 2006)

PARC 700    PARC features
Precision   88.31%
Recall      86.38%

Page 32: Treebank-Based Wide Coverage Probabilistic LFG Resources


Grammar and Lexicon Extraction : Penn-II & LFG

Lexical Resources:

• Lexical information extremely important in modern lexicalised grammar formalisms

• LFG, HPSG, CCG, TAG, …

• Lexicon development is time consuming and extremely expensive

• Rarely if ever complete

• Familiar knowledge acquisition bottleneck …

• Treebank-based subcategorisation frame induction (LFG semantic forms) from Penn-II and -III

• Parser-based induction from British National Corpus (BNC)

• Evaluation against COMLEX, OALD, Korhonen’s data set

Page 33: Treebank-Based Wide Coverage Probabilistic LFG Resources


Grammar and Lexicon Extraction: Penn-II & LFG

• Lexicon Construction

– Manual vs. Automated

Our Approach:

– Subcat Frames not Predefined

– Functional and/or Categorial Information

– Parameterised for Prepositions and Particles

– Active and Passive

– Long Distance Dependencies

– Conditional Probabilities

Page 34: Treebank-Based Wide Coverage Probabilistic LFG Resources


Grammar and Lexicon Extraction: Penn-II & LFG

Page 35: Treebank-Based Wide Coverage Probabilistic LFG Resources


Grammar and Lexicon Extraction: Penn-II & LFG

apply<SUBJ,OBL:for>

win<SUBJ,OBJ>

Page 36: Treebank-Based Wide Coverage Probabilistic LFG Resources


Grammar and Lexicon Extraction: Penn-II & LFG

Lexicon extracted from Penn-II (O’Donovan et al. 2005):

                 Without Prep/Part   With Prep/Part
Lemmas           3586                3586
Semantic Forms   10969               14348
Frame Types      38                  577

Semantic Form                 Conditional Probability
accept([subj,obj])            0.813
accept([subj],p)              0.060
accept([subj,comp])           0.033
accept([subj,obl:as],p)       0.020
accept([subj,obj,obl:as])     0.020
accept([subj,obj,obl:from])   0.020
accept([subj])                0.013
Others                        0.021
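
Conditional probabilities of this kind can be estimated by relative frequency, P(frame | lemma) = count(lemma, frame) / count(lemma). The sketch below assumes that estimate; the observations are invented and `frame_probabilities` is a hypothetical helper.

```python
from collections import Counter, defaultdict

def frame_probabilities(observations):
    """Relative-frequency estimate of P(frame | lemma)
    from a list of (lemma, frame) pairs."""
    by_lemma = defaultdict(Counter)
    for lemma, frame in observations:
        by_lemma[lemma][frame] += 1
    return {
        lemma: {frame: n / sum(frames.values()) for frame, n in frames.items()}
        for lemma, frames in by_lemma.items()
    }

# Hypothetical observations, not the Penn-II counts.
obs = [("accept", "[subj,obj]")] * 8 + [("accept", "[subj,comp]")] * 2
probs = frame_probabilities(obs)
```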

Page 37: Treebank-Based Wide Coverage Probabilistic LFG Resources


Grammar and Lexicon Extraction: Penn-II & LFG

Page 38: Treebank-Based Wide Coverage Probabilistic LFG Resources


Grammar and Lexicon Extraction: Penn-II & LFG

Parsing-Based Subcat Frame Extraction (O’Donovan 2006):

• Treebank-based vs. parsing-based subcat frame extraction

• Parsed British National Corpus BNC (100 million words) with our automatically induced LFGs

• 19 days on single machine: ~5 million words per day

• Subcat frame extraction for ~10,000 verb lemmas

• Evaluation against COMLEX and OALD

• Evaluation against Korhonen (2002) gold standard

• Our method is statistically significantly better than Korhonen (2002)

Page 39: Treebank-Based Wide Coverage Probabilistic LFG Resources


Parsing: Penn-II and LFG

• Overview Parsing Architectures:

Pipeline & Integrated

• Long-Distance Dependency (LDD) Resolution at F-Structure

• Evaluation & Comparison with Hand-Crafted Resources (XLE and RASP)

• Comparison against Treebank-Based CCG and HPSG Resources

Page 40: Treebank-Based Wide Coverage Probabilistic LFG Resources


Parsing: Penn-II and LFG

Page 41: Treebank-Based Wide Coverage Probabilistic LFG Resources


Lexical-Functional Grammar (LFG)

Page 42: Treebank-Based Wide Coverage Probabilistic LFG Resources


Parsing: Penn-II and LFG

• Require:

– subcategorisation frames (O’Donovan et al., 2004, 2005; O’Donovan 2006)

– functional uncertainty equations

• Previous Example:

– claim([subj,comp]), deny([subj,obj])

– topicrel = comp* obj (search along a path of 0 or more comps)
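
Since a functional uncertainty equation is a regular expression over paths, checking whether a concrete path is an instance of `comp* obj` can be sketched with Python's `re` module. The encoding of paths as space-separated GF names and the helper `licensed` are assumptions made for this illustration.

```python
import re

# topicrel = comp* obj, as a regular expression over
# space-separated grammatical function names.
FU_TOPICREL = re.compile(r"(?:comp )*obj")

def licensed(path):
    """True if the path (a list of GF names) is an instance of comp* obj."""
    return FU_TOPICREL.fullmatch(" ".join(path)) is not None
```

For example, `licensed(["comp", "obj"])` holds (the object of an embedded comp), while `licensed(["subj"])` does not.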

Page 43: Treebank-Based Wide Coverage Probabilistic LFG Resources


Parsing: Penn-II and LFG

Subcat frames: as above (O’Donovan et al. 2004, 2005)

Functional Uncertainty equations:

• Automatically acquire finite approximations of FU-equations

• Extract paths between co-indexed material in automatically generated f-structures from sections 02-21 from Penn-II

• 26 TOPIC, 60 TOPICREL, 13 FOCUS path types

• 99.69% coverage of paths in WSJ Section 23

• Each path type associated with a probability

LDD resolution ranked by Path x Subcat probabilities (Cahill et al., 2004)
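
A rough sketch of ranking candidate resolutions by path probability times subcat-frame probability. Everything here is an assumption made for illustration: the dict encoding of f-structures, the probabilities, the simplified `GOVERNABLE` set and the helper names. It is not the algorithm of (Cahill et al., 2004), only the general idea that a path is usable when some frame of the local predicate licenses the missing GF.

```python
GOVERNABLE = {"subj", "obj", "comp"}   # assumed, simplified inventory

def get(fstr, path):
    """Follow a list of attributes from fstr; None if the path breaks."""
    for attr in path:
        if not isinstance(fstr, dict) or attr not in fstr:
            return None
        fstr = fstr[attr]
    return fstr

def rank_resolutions(fstr, paths, subcat):
    """Score each way of resolving the displaced element to a missing GF.

    paths:  {tuple of GF names: probability}
    subcat: {lemma: {tuple of GFs: probability}}
    """
    ranked = []
    for path, p_path in paths.items():
        *prefix, gf = path
        local = get(fstr, prefix)
        if not isinstance(local, dict) or gf in local:
            continue                      # no such site, or already filled
        present = {g for g in local if g in GOVERNABLE}
        for frame, p_frame in subcat.get(local.get("pred"), {}).items():
            # Completeness and Coherence once gf has been filled in.
            if set(frame) == present | {gf}:
                ranked.append((p_path * p_frame, path, frame))
    return sorted(ranked, reverse=True)

# Hypothetical relative clause "... which Bush claimed Clinton denied".
fstr = {"topicrel": {"pred": "which"}, "pred": "claim",
        "subj": {"pred": "bush"},
        "comp": {"pred": "deny", "subj": {"pred": "clinton"}}}
paths = {("obj",): 0.6, ("comp", "obj"): 0.3}
subcat = {"claim": {("subj", "comp"): 0.7, ("subj", "obj"): 0.2},
          "deny": {("subj", "obj"): 0.9}}
best = rank_resolutions(fstr, paths, subcat)
```

The shorter path `obj` is more probable on its own, but no frame of `claim` licenses it here, so only `comp obj` with `deny([subj,obj])` survives.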

Page 44: Treebank-Based Wide Coverage Probabilistic LFG Resources


Parsing: Penn-II and LFG

• How do treebank-based constraint grammars compare to deep hand-crafted grammars like XLE and RASP?

• XLE (Riezler et al. 2002, Kaplan et al. 2004)

– hand-crafted, wide-coverage, deep, state-of-the-art English LFG and XLE parsing system with log-linear-based probability models for disambiguation

– PARC 700 Dependency Bank gold standard (King et al. 2003), Penn-II Section 23-based

• RASP (Carroll and Briscoe 2002)

– hand-crafted, wide-coverage, deep, state-of-the-art English probabilistic unification grammar and parsing system (RASP: Robust Accurate Statistical Parsing)

– CBS 500 Dependency Bank gold standard (Carroll, Briscoe and Sanfilippo 1999), Susanne-based

Page 45: Treebank-Based Wide Coverage Probabilistic LFG Resources


Parsing: Penn-II and LFG

• (Bikel 2002) retrained to retain Penn-II functional tags (-SBJ, -LOC, -TMP, -CLR, -LGS, etc.)

• Pipeline architecture:

• tag text → Bikel retrained + f-structure annotation algorithm + LDD resolution → f-structures → automatic conversion → evaluation against XLE/RASP gold standards PARC-700/CBS-500 Dependency Banks

Page 46: Treebank-Based Wide Coverage Probabilistic LFG Resources


Parsing: Penn-II and LFG

• Systematic differences between f-structures and PARC 700 and CBS 500 dependency representations

• Automatic conversion of f-structures to PARC 700 / CBS 500 -like structures (Burke et al. 2004, Burke 2006, Cahill et al. 2008)

• Evaluation software (Crouch et al. 2002) and (Carroll and Briscoe 2002)

• Approximate Randomisation Test (Noreen 1989) for statistical significance
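
An approximate randomisation test can be sketched as below. This is a simplified illustration: it compares mean per-sentence scores of two systems, whereas a dependency evaluation would normally recompute the f-score from pooled triple counts after each shuffle. The function name, the fixed seed and the default number of trials are assumptions.

```python
import random

def approximate_randomisation(scores_a, scores_b, trials=10000, seed=0):
    """Two-sided approximate randomisation test on paired
    per-sentence scores; returns an estimated p-value."""
    rng = random.Random(seed)
    n = len(scores_a)
    observed = abs(sum(scores_a) - sum(scores_b)) / n
    at_least_as_extreme = 0
    for _ in range(trials):
        sum_a = sum_b = 0.0
        for a, b in zip(scores_a, scores_b):
            if rng.random() < 0.5:    # swap the two outputs for this sentence
                a, b = b, a
            sum_a += a
            sum_b += b
        if abs(sum_a - sum_b) / n >= observed:
            at_least_as_extreme += 1
    return (at_least_as_extreme + 1) / (trials + 1)
```

A difference would be reported as significant at the 95% level when the returned value is below 0.05.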

Page 47: Treebank-Based Wide Coverage Probabilistic LFG Resources


Parsing: Penn-II and LFG

• Result dependency f-scores (CL 2008 paper):

PARC 700: XLE vs. DCU-LFG

• 80.55% XLE

• 82.73% DCU-LFG (+2.18%)

CBS 500: RASP vs. DCU-LFG

• 76.57% RASP

• 80.23% DCU-LFG (+3.66%)

• Results statistically significant at 95% level (Noreen 1989)

• Best result now against PARC 700 84.00% (+3.45%) Charniak + Reranker + Grzegorz’ Penn-II function-tag labeler

Page 48: Treebank-Based Wide Coverage Probabilistic LFG Resources


Parsing: Penn-II and LFG

PARC 700 Evaluation:

Page 49: Treebank-Based Wide Coverage Probabilistic LFG Resources


Parsing: Penn-II and LFG

Page 50: Treebank-Based Wide Coverage Probabilistic LFG Resources


Parsing: Penn-II and LFG

Page 51: Treebank-Based Wide Coverage Probabilistic LFG Resources


Parsing: Penn-II and LFG

Page 52: Treebank-Based Wide Coverage Probabilistic LFG Resources


Parsing: Penn-II and LFG

Page 53: Treebank-Based Wide Coverage Probabilistic LFG Resources


Parsing: Penn-II and LFG

Page 54: Treebank-Based Wide Coverage Probabilistic LFG Resources


Evaluation against Treebank-Based CCG and HPSG

• CCG = Combinatory Categorial Grammar (Steedman 2000)

• HPSG = Head-Driven Phrase Structure Grammar (Pollard & Sag 1994)

– Both constraint-based grammar formalisms

– Treebank-based CCG resources (Hockenmaier & Steedman 2002, Hockenmaier 2003, Clark & Curran 2004, …)

– Treebank-based HPSG resources (Miyao, Ninomiya & Tsujii 2003, Miyao & Tsujii 2004, …)

• DepBank = reannotated version of PARC 700 (Briscoe & Carroll 2006) with CBS 500–style GRs

• RASP (version 2) (Briscoe & Carroll 2006)

Page 55: Treebank-Based Wide Coverage Probabilistic LFG Resources


Evaluation against Treebank-Based CCG and HPSG

• CCG:

– Small set of basic categories: {NP, N, PP, S}

– Complex categories: VP = S\NP, Vi = S\NP, Vdi = (S\NP)/NP

– Small set of combination rules:

• X/Y Y ⇒ X

• Y X\Y ⇒ X

• X/Y Y/Z ⇒ X/Z

• …
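
The three combination rules can be sketched directly. The encoding of categories as strings and `(result, slash, argument)` tuples, and all function names, are assumptions made for this illustration rather than any CCG parser's API.

```python
# A category is an atomic string ("S", "NP") or a tuple
# (result, slash, argument).
def fwd(result, arg):
    """Build result/arg."""
    return (result, "/", arg)

def bwd(result, arg):
    """Build result\\arg."""
    return (result, "\\", arg)

def forward_application(left, right):
    """X/Y  Y  =>  X"""
    if isinstance(left, tuple) and left[1] == "/" and left[2] == right:
        return left[0]
    return None

def backward_application(left, right):
    """Y  X\\Y  =>  X"""
    if isinstance(right, tuple) and right[1] == "\\" and right[2] == left:
        return right[0]
    return None

def forward_composition(left, right):
    """X/Y  Y/Z  =>  X/Z"""
    if (isinstance(left, tuple) and isinstance(right, tuple)
            and left[1] == "/" and right[1] == "/" and left[2] == right[0]):
        return (left[0], "/", right[2])
    return None

VP = bwd("S", "NP")                      # S\NP
TV = fwd(VP, "NP")                       # (S\NP)/NP
vp = forward_application(TV, "NP")       # verb + object
s = backward_application("NP", vp)       # subject + verb phrase
```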

Page 56: Treebank-Based Wide Coverage Probabilistic LFG Resources


Evaluation against Treebank-Based CCG and HPSG

• HPSG:

– Uniform representation: typed feature structures and inheritance

– Sign: PHON, SYNSEM, DTRS

– Inheritance hierarchy

– Principles (HEAD-FEATURE, VALENCE, …)

– Id-Schemata (HEAD-COMP, HEAD-MOD, …)

Page 57: Treebank-Based Wide Coverage Probabilistic LFG Resources


Evaluation against Treebank-Based CCG and HPSG

Page 58: Treebank-Based Wide Coverage Probabilistic LFG Resources


Evaluation against Treebank-Based CCG and HPSG

Page 59: Treebank-Based Wide Coverage Probabilistic LFG Resources


Evaluation against Treebank-Based CCG and HPSG

Page 60: Treebank-Based Wide Coverage Probabilistic LFG Resources


Probability Models: Penn-II & LFG

Page 61: Treebank-Based Wide Coverage Probabilistic LFG Resources


Probability Models: Penn-II & LFG

Evaluation Results:

Page 62: Treebank-Based Wide Coverage Probabilistic LFG Resources


Probability Models: Penn-II & LFG

Results are interesting as:

• Extensive treebank preprocessing (clean-up, correction and restructuring) in CCG and (some in) HPSG

• none in LFG

• Custom-designed parsers and sophisticated (log-linear, max ent) parse selection probability models in HPSG and CCG

• Mix of off-the-shelf and custom designed components, each with their own probability model in early-disambiguation processing pipeline in LFG, no proper overall probability model, but an approximation at best …

• Still competitive results …

Page 63: Treebank-Based Wide Coverage Probabilistic LFG Resources


Probability Models: Penn-II & LFG

Probability Models:

• Our approach does not constitute proper probability model (Abney, 1996)

• Why? Probability model leaks:

• Highest ranking parse tree may feature f-structure equations that cannot be resolved into f-structure

• Probability associated with that parse tree is lost

• Doesn’t happen often in practice (coverage >99.5% on unseen data)

• Research on appropriate discriminative, log-linear or maximum entropy models is important (Miyao and Tsujii, 2002) (Riezler et al. 2002)

Page 64: Treebank-Based Wide Coverage Probabilistic LFG Resources


Demo System

• http://lfg-demo.computing.dcu.ie/lfgparser.html

Page 65: Treebank-Based Wide Coverage Probabilistic LFG Resources


Applications: Generation


Page 66: Treebank-Based Wide Coverage Probabilistic LFG Resources


Applications: Generation

Research Question:

• Can we make the automatically induced LFG resources reversible/bi-directional?

• Can they be used for both (probabilistic) parsing and generation?

Page 67: Treebank-Based Wide Coverage Probabilistic LFG Resources


Generation: Penn-II & LFG

Page 68: Treebank-Based Wide Coverage Probabilistic LFG Resources


Generation: Penn-II & LFG

Page 69: Treebank-Based Wide Coverage Probabilistic LFG Resources


Generation: Penn-II & LFG

Page 70: Treebank-Based Wide Coverage Probabilistic LFG Resources


Generation: Penn-II & LFG

Page 71: Treebank-Based Wide Coverage Probabilistic LFG Resources


Generation: Penn-II & LFG

Page 72: Treebank-Based Wide Coverage Probabilistic LFG Resources


Generation: Penn-II & LFG

Page 73: Treebank-Based Wide Coverage Probabilistic LFG Resources


Generation: Penn-II & LFG

Page 74: Treebank-Based Wide Coverage Probabilistic LFG Resources


Generation: Penn-II & LFG

Problem: conditioning of generation rules on purely local f-str features

Solution I: generation grammar transformation (Cahill et al. 2006)

Solution II: history-based probabilistic generation (Hogan et al. 2007, Cafferkey et al. 2007): condition generation rules on parent GF
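
The effect of conditioning on the parent GF can be illustrated with a toy relative-frequency model. The rules, features and counts below are invented, and `rule_probabilities` is a hypothetical helper; this is not the model of (Hogan et al. 2007), only a sketch of why extra conditioning context can sharpen the distribution over generation rules.

```python
from collections import Counter

def rule_probabilities(events):
    """Relative-frequency estimate of P(rule | context).

    Each event is (rule, context); a context is any hashable key,
    e.g. the local f-structure features alone, or those features
    together with the parent grammatical function.
    """
    joint = Counter(events)
    totals = Counter(context for _, context in events)
    return {(rule, ctx): n / totals[ctx] for (rule, ctx), n in joint.items()}

# Hypothetical training events: (rule, local features, parent GF).
feats = ("pred", "num", "pers")
raw = [
    ("NP -> PRP[nom]", feats, "subj"),
    ("NP -> PRP[nom]", feats, "subj"),
    ("NP -> PRP[acc]", feats, "obj"),
    ("NP -> PRP[acc]", feats, "obj"),
]
local_only = rule_probabilities([(r, f) for r, f, _ in raw])
with_parent = rule_probabilities([(r, (f, gf)) for r, f, gf in raw])
```

With local features only, both pronoun rules get probability 0.5; once the parent GF is part of the context, each rule gets probability 1.0 in its own context.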

Page 75: Treebank-Based Wide Coverage Probabilistic LFG Resources


Generation: Penn-II & LFG

Page 76: Treebank-Based Wide Coverage Probabilistic LFG Resources


Generation: Penn-II & LFG

Page 77: Treebank-Based Wide Coverage Probabilistic LFG Resources


Generation: Penn-II & LFG

Page 78: Treebank-Based Wide Coverage Probabilistic LFG Resources


Generation: the Good, the Bad and the Ugly

• Orig: Supporters of the legislation view the bill as an effort to add stability and certainty to the airline-acquisition process , and to preserve the safety and fitness of the industry .

• Gen: Supporters of the legislation view the bill as an effort to add stability and certainty to the airline-acquisition process , and to preserve the safety and fitness of the industry.

• Orig: The upshot of the downshoot is that the A 's go into San Francisco 's Candlestick Park tonight up two games to none in the best-of-seven fest .

• Gen: The upshot of the downshoot is that the A 's tonight go into San Francisco 's Candlestick Park up two games to none in the best-of-seven fest .

• Orig: By this time , it was 4:30 a.m. in New York , and Mr. Smith fielded a call from a New York customer wanting an opinion on the British stock market , which had been having troubles of its own even before Friday 's New York market break .

• Gen: Mr. Smith fielded a call from New a customer York wanting an opinion on the market British stock which had been having troubles of its own even before Friday 's New York market break by this time and in New York , it was 4:30 a.m. .

• Orig: Only half the usual lunchtime crowd gathered at the tony Corney & Barrow wine bar on Old Broad Street nearby .

• Gen: At wine tony Corney & Barrow the bar on Old Broad Street nearby gathered usual , lunchtime only half the crowd , .

Page 79: Treebank-Based Wide Coverage Probabilistic LFG Resources


Generation: Penn-II & LFG

Page 80: Treebank-Based Wide Coverage Probabilistic LFG Resources


Generation: Penn-II & LFG

Problem: conditioning of generation rules on purely local f-str features

Solution: generation grammar transformation (Cahill et al. 2006)

Solution: history-based probabilistic generation (Hogan et al. 2007, Cafferkey et al. 2007): condition generation rules on parent GF

Page 81: Treebank-Based Wide Coverage Probabilistic LFG Resources


Generation: the Good, the Bad and the Ugly

• Orig: By this time , it was 4:30 a.m. in New York , and Mr. Smith fielded a call from a New York customer wanting an opinion on the British stock market , which had been having troubles of its own even before Friday 's New York market break .

• Gen (GGT: generation grammar transform, Cahill et al. 2006): Mr. Smith fielded a call from New a customer York wanting an opinion on the market British stock which had been having troubles of its own even before Friday 's New York market break by this time and in New York , it was 4:30 a.m. .

• Gen (HB: history-based, Hogan et al. 2007): By this time , in New York , it was 4:30 a.m. , and Mr. Smith fielded a call from New a customer York , wanting an opinion on the market British stock which had been having troubles of its own even before Friday ’s New York market break .

• Gen (HB + MWU: history-based with multi-word units, Hogan et al. 2007): By this time , in New York , it was 4:30 a.m. , and Mr. Smith fielded a call from a New York customer , wanting an opinion on the market British stock which had been having troubles of its own even before Friday ’s New York market break .

Generation: Chinese CTB2

• CTB2 (Yuqing Guo - Toshiba China Beijing R&D Lab)

• (Cahill et al. 2006) out of the box

• Training articles 1-270 (3,480 sentences)

• Testing articles 301-325 (351 sentences)

Applications: Machine Translation

• Labelled Dependency-Based MT Evaluation (LaDEva)

• Automatic Acquisition of Transfer Rules

Applications: Machine Translation

Labelled-Dependency-Based MT Evaluation

• Most automatic MT evaluation metrics (BLEU, NIST) are string (n-gram) based.

• They unfairly punish perfectly legitimate syntactic and lexical variation:

• Yesterday John resigned.

• John resigned yesterday.

• Yesterday John quit.

• Legitimate lexical variation: add WordNet synonyms to the string match

• What about syntactic variation?
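The penalty for word-order variation is easy to see with a clipped n-gram precision, the basic ingredient of BLEU-style metrics. The sketch below is a simplified illustration only: it is not the full BLEU or NIST formula, it uses a single reference, and it lowercases the example sentences and drops the final full stop.

```python
from collections import Counter

def ngrams(tokens, n):
    """Multiset of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def clipped_precision(candidate, reference, n):
    """Share of candidate n-grams that also occur in the reference."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    # Clip each candidate n-gram count at its count in the reference.
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total

reference = "yesterday john resigned"
for candidate in ("john resigned yesterday", "yesterday john quit"):
    scores = [round(clipped_precision(candidate, reference, n), 2) for n in (1, 2)]
    print(candidate, scores)
```

"John resigned yesterday" contains exactly the reference words, yet only one of its two bigrams matches, so it gets the same bigram precision (0.5) as "Yesterday John quit".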

Applications: Machine Translation

• Idea: use labelled dependencies for MT evaluation

• Why: dependencies abstract away from some particulars of surface realisation

• Adjunct placement, order of conjuncts in a coordination, topicalisation, ...
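A minimal sketch of the idea: score a translation by the overlap between its labelled dependency triples and those of the reference. The triples below are written by hand for illustration. In a real system they would be read off the f-structures produced by the parser, and the label names here are assumptions.

```python
def f_score(candidate_deps, reference_deps):
    """Harmonic mean of precision and recall over dependency triples."""
    cand, ref = set(candidate_deps), set(reference_deps)
    matched = len(cand & ref)
    if matched == 0:
        return 0.0
    precision = matched / len(cand)
    recall = matched / len(ref)
    return 2 * precision * recall / (precision + recall)

# Hand-written (label, head, dependent) triples, hypothetical parser output.
# "Yesterday John resigned." and "John resigned yesterday." differ only in
# adjunct placement, so they receive the same triples.
deps_fronted = {("subj", "resign", "john"), ("adjunct", "resign", "yesterday")}
deps_final = {("subj", "resign", "john"), ("adjunct", "resign", "yesterday")}
# "Yesterday John quit."
deps_quit = {("subj", "quit", "john"), ("adjunct", "quit", "yesterday")}

print(f_score(deps_final, deps_fronted))
print(f_score(deps_quit, deps_fronted))
```

Adjunct placement no longer affects the score, but exact triple matching still gives no credit to the synonym "quit", which is why lexical variation needs separate treatment.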

Applications: Machine Translation

• Idea is intuitive

• To make it happen you need a robust parser that can parse MT output

• Treebank-induced parsers parse anything …!

• How do we judge whether the labelled dependency-based method is better than string-based methods?

• We measure how well each metric correlates with human judgements …

• Why: humans are not fooled by legitimate syntactic variation
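The comparison can be sketched as a correlation between each metric's segment scores and the human scores for the same segments. The sketch assumes Pearson correlation (the slides do not say which coefficient was used), and all scores below are invented.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / (norm_x * norm_y)

# Hypothetical scores for five segments (not from the MTC data).
human = [1, 2, 3, 4, 5]
metric_a = [0.1, 0.2, 0.3, 0.4, 0.5]   # tracks the human ranking
metric_b = [0.3, 0.1, 0.4, 0.2, 0.5]   # tracks it less well

print(round(pearson(human, metric_a), 3))
print(round(pearson(human, metric_b), 3))
```

The metric with the higher correlation agrees better with human judgement on these segments.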

Applications: Machine Translation

• Experiment: use LDC Multiple Translation Chinese (MTC) Parts 2 and 4 data

• 16,807 segments, each a translation-reference pair with a human score

• 5,007 for testing, the rest for training (weights … etc.)

• To make this work, we add

– n-best parsing

– WordNet synonyms

– partial matching

– training weights

– etc …
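Two of these ingredients, synonym matching and partial matching, can be sketched as a softer triple comparison. Everything here is an assumption for illustration: the synonym table stands in for a WordNet lookup, the partial-match weight of 0.5 is arbitrary (the slides mention trained weights), and each candidate triple is simply credited with its best reference match rather than aligned one-to-one.

```python
# Hypothetical synonym table standing in for a WordNet lookup.
SYNONYMS = {"quit": {"resign"}, "resign": {"quit"}}

def same_word(a, b):
    """True if the two lemmas are identical or listed as synonyms."""
    return a == b or b in SYNONYMS.get(a, set())

def triple_credit(cand, ref, partial_weight=0.5):
    """Credit for one candidate triple against one reference triple."""
    c_label, c_head, c_dep = cand
    r_label, r_head, r_dep = ref
    if c_label != r_label:
        return 0.0
    head_ok = same_word(c_head, r_head)
    dep_ok = same_word(c_dep, r_dep)
    if head_ok and dep_ok:
        return 1.0
    if head_ok or dep_ok:
        # Partial match: the label and one of the two arguments agree.
        return partial_weight
    return 0.0

def soft_precision(candidate_deps, reference_deps):
    """Average best credit per candidate triple (simplified, no alignment)."""
    if not candidate_deps:
        return 0.0
    total = sum(
        max((triple_credit(c, r) for r in reference_deps), default=0.0)
        for c in candidate_deps
    )
    return total / len(candidate_deps)

reference = [("subj", "resign", "john"), ("adjunct", "resign", "yesterday")]
print(soft_precision([("subj", "quit", "john"), ("adjunct", "quit", "yesterday")], reference))
print(soft_precision([("subj", "resign", "mary")], reference))
```

With the synonym table, "Yesterday John quit" now matches the reference triples in full, and a triple with the right label and head but the wrong dependent earns partial credit.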

References (MT Eval)

• Karolina Owczarzak, Yvette Graham and Josef van Genabith: Using F-structures in Machine Translation Evaluation. In Proceedings of the 12th International Conference on Lexical Functional Grammar, July 28-30, 2007, Stanford, CA

• Karolina Owczarzak, Josef van Genabith, and Andy Way. Labelled Dependencies in Machine Translation Evaluation. In Proceedings of ACL 2007 Workshop on Statistical Machine Translation, pages 104-111, Prague, Czech Republic

• Karolina Owczarzak, Josef van Genabith, and Andy Way. Dependency-Based Automatic Evaluation for Machine Translation. In Proceedings of HLT-NAACL 2007 Workshop on Syntax and Structure in Statistical Translation. Rochester, NY.

References (Parsing)

• Aoife Cahill, Michael Burke, Ruth O'Donovan, Stefan Riezler, Josef van Genabith and Andy Way. 2008. Wide-Coverage Statistical Parsing Using Automatic Dependency Structure Annotation. Computational Linguistics 34(1), MIT Press, March 2008 (accepted for publication).

• Joachim Wagner, Djamé Seddah, Jennifer Foster and Josef van Genabith: C-Structures and F-Structures for the British National Corpus. In Proceedings of the 12th International Conference on Lexical Functional Grammar, July 28-30, 2007, Stanford, CA

• A. Cahill, M. Burke, R. O'Donovan, J. van Genabith, and A. Way. Long-Distance Dependency Resolution in Automatically Acquired Wide-Coverage PCFG-Based LFG Approximations, In Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04), July 21-26 2004, pages 320-327, Barcelona, Spain, 2004

• A. Cahill, M. McCarthy, J. van Genabith and A. Way. Parsing with PCFGs and Automatic F-Structure Annotation. In M. Butt and T. Holloway-King (eds.): Proceedings of the Seventh International Conference on LFG, CSLI Publications, Stanford, CA, pp. 76-95, 2002

References (Generation, Lex. Acq.)

• Deirdre Hogan, Conor Cafferkey, Aoife Cahill and Josef van Genabith. Exploiting Multi-Word Units in History-Based Probabilistic Generation. In Proceedings of the Joint Conference on Empirical Methods in Natural Language Processing and Natural Language Learning (EMNLP-CoNLL 2007), Prague, Czech Republic, pp. 267-276

• A. Cahill and J. van Genabith. Robust PCFG-Based Generation using Automatically Acquired LFG-Approximations. COLING/ACL 2006, Sydney, Australia

• R. O'Donovan, M. Burke, A. Cahill, J. van Genabith and A. Way. Large-Scale Induction and Evaluation of Lexical Resources from the Penn-II and Penn-III Treebanks, Computational Linguistics, 2005

• R. O'Donovan, M. Burke, A. Cahill, J. van Genabith, and A. Way. Large-Scale Induction and Evaluation of Lexical Resources from the Penn-II Treebank, In Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04), July 21-26 2004, pages 368-375, Barcelona, Spain, 2004