CSA2050 Introduction to Computational Linguistics

07/05/2005 CSA2050: DCG3 1

CSA2050 Introduction to Computational Linguistics

Lecture DCG3

Handling Subcategorisation

Handling Relative Clauses

Subcategories of Verbs

• English includes the following sentences:
  – John disappeared
  – John broke the window
  – John gave Mary the book

• In each case the verb phrase has a different structure corresponding to the number of noun phrases that follow.

Subcategories of VP

[VP [V disappeared]]

[VP [V broke] [NP the window]]

[VP [V gave] [NP Mary] [NP the book]]

• To handle this we need separate rules for VP

Some Rules for Subcategories of Verb

vp --> v.

vp --> v, np.

vp --> v, np, np.

• Checkpoint: how good are these rules for handling English?

Counterexamples

• The following sentences obey the above rules:
  – John broke
  – Mary disappeared the window
  – Mike told Mary the book

• All of them violate the subcategorisation constraints of the individual verbs.

• How can we make sure that the right rule goes with the right verb?
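
The over-generation can be checked directly. The following is a minimal sketch, assuming a toy lexicon: the word-level np and v rules are illustrative additions and not part of the original slides.

```prolog
% Naive grammar: any verb may combine with any of the three VP rules.
s --> np, vp.

vp --> v.
vp --> v, np.
vp --> v, np, np.

% Toy lexicon (assumed for illustration only).
np --> [john].
np --> [mary].
np --> [mike].
np --> [the, window].
np --> [the, book].

v --> [disappeared].
v --> [broke].
v --> [told].
v --> [gave].

% Every query below succeeds, including the three ungrammatical ones:
% ?- phrase(s, [john, broke, the, window]).
% ?- phrase(s, [john, broke]).
% ?- phrase(s, [mary, disappeared, the, window]).
% ?- phrase(s, [mike, told, mary, the, book]).
```

Because v carries no information about the verb it covers, nothing stops "broke" from being used with the intransitive rule.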

Handling Subcategorisation - I

vp --> v1.

vp --> v2, np.

vp --> v3, np, np.

• This solution has the problem that v1, v2 and v3 are distinct symbols with no relationship between them. We thus miss the fact that they are all verbs.

Handling Subcategorisation - II

• A slightly better method is to adopt rules of the following shape:

vp --> v(intrans).

vp --> v(trans), np.

vp --> v(ditrans), np, np.

• This is essentially the same method that we used to avoid multiplying categories when handling agreement.

Handling Subcategorisation: The Lexicon

• Assume that the lexicon looks something like this:

lex(disappeared, v, intrans).

lex(broke,v, trans).

lex(gave, v, ditrans).

Two Approaches – One Lexicon

• Approach 1

v1 --> [X], {lex(X,v,intrans)}.

v2 --> [X], {lex(X,v,trans)}.

v3 --> [X], {lex(X,v,ditrans)}.

• Approach 2

v(Type) --> [X], {lex(X,v,Type)}.
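
Putting Approach 2 together with the lexicon and the v(Type) VP rules gives a grammar that can be run. This is a sketch only: the word-level np rules are assumed for illustration and do not come from the slides.

```prolog
% Lexicon: each verb is tagged with its subcategorisation type.
lex(disappeared, v, intrans).
lex(broke,       v, trans).
lex(gave,        v, ditrans).

s --> np, vp.

% One VP rule per subcategory, all sharing the single category v.
vp --> v(intrans).
vp --> v(trans), np.
vp --> v(ditrans), np, np.

% Approach 2: a single rule looks the verb up in the lexicon.
v(Type) --> [X], {lex(X, v, Type)}.

% Toy noun phrases (assumed for illustration only).
np --> [john].
np --> [mary].
np --> [the, window].
np --> [the, book].

% ?- phrase(s, [john, disappeared]).               succeeds
% ?- phrase(s, [john, gave, mary, the, book]).     succeeds
% ?- phrase(s, [john, broke]).                     fails
% ?- phrase(s, [mary, disappeared, the, window]).  fails
```

The Type argument ties each verb to exactly one VP rule, so the earlier counterexamples are now rejected.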

Other Verb Complements

• So far we have only discussed three possible VP rules. But in practice there are many more:
  – np, pp: John told Bill about the accident
  – np, vp: John persuaded Bill to come
  – np, pp, pp: John rented an apartment to Bill for Lm100
  – np, s: John informed Bill that Jack was coming

• Taking prepositions into account, about 40 different patterns are required to handle verbs in English.

• This will lead to an undesirable proliferation of grammar rules.

Subcategorisation: Approach III

• Shift responsibility towards the lexicon

• Key idea: the lexical entry for each verb contains a complement list.
lex(told,v,[np,pp]).
lex(persuaded,v,[np,vp]).
lex(rented,v,[np,pp,pp]).

• The list is then managed by the grammar.

Grammar Rules for VP

• Basic idea is to avoid multiple grammar rules of the form
vp --> v([]).
vp --> v([np]), np(...).
vp --> v([np,np]), np(...), np(...).
vp --> v([np,pp]), np(...), pp(...).
vp --> v([np,s]), np(...), s(...).

• Instead have a single rule of this form
vp --> v(SubCatList), comps(SubCatList).
which allows a verb phrase to be formed from a verb followed by a sequence of zero or more complements.

The Grammar

s --> np, vp.

np --> n.

np --> d, n.

vp --> v(SC), comps(SC).

comps([]) --> [].

comps([X|R]) --> x(X), comps(R).

The Comps Rule

comps([]) --> [].

comps([X|R]) --> x(X), comps(R).

• Challenge: how to write the definition of the "x" predicate.
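
One possible answer to the challenge, offered as a sketch and not as the lecture's own solution, is to give x one clause per complement category, each handing over to the nonterminal of the same name. The word/2 lexicon for non-verbs, the pp rule and the lexical rules for n, d and p are assumptions added so that the example runs.

```prolog
% Verb lexicon: the third argument is the complement list.
lex(disappeared, v, []).
lex(broke,       v, [np]).
lex(gave,        v, [np, np]).
lex(told,        v, [np, pp]).

% Hypothetical lexicon for the other categories.
word(john, n).   word(mary, n).     word(bill, n).
word(book, n).   word(window, n).   word(accident, n).
word(the, d).    word(about, p).

s  --> np, vp.
np --> n.
np --> d, n.
pp --> p, np.

% A single VP rule: the verb's complement list drives what follows.
vp --> v(SC), comps(SC).

comps([])    --> [].
comps([X|R]) --> x(X), comps(R).

% x maps a category name from the list onto the matching nonterminal.
x(np) --> np.
x(pp) --> pp.

v(SC) --> [W], {lex(W, v, SC)}.
n --> [W], {word(W, n)}.
d --> [W], {word(W, d)}.
p --> [W], {word(W, p)}.

% ?- phrase(s, [john, told, bill, about, the, accident]).  succeeds
% ?- phrase(s, [john, gave, mary, the, book]).             succeeds
% ?- phrase(s, [john, broke]).                             fails
```

Supporting a new complement type such as vp or s then means adding one x clause and the corresponding lexical entries, leaving the VP rule untouched.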

Relative Clauses

• Sentences like
  Terry runs a program that Bertrand wrote
  The man who came to dinner vomited
include relative clauses ("that Bertrand wrote" and "who came to dinner", shown in boldface on the slide)

• Relative clauses are interesting because they involve "filler-gap dependency"

Structure of Relative Clauses

• Relative clauses such as "that Bertrand wrote" are not well described by

relclause --> [that], vp.

• This is because the underlying structure is not the concatenation of "that" and a VP, i.e.

• If this is not the structure, what is?

The Structure of the Relative Clause

• To understand the structure we must first look at the wider context in which it occurs, in this case the noun phrase:
  a program that Bertrand wrote

• This noun phrase is derived from the underlying sentence:
  Bertrand wrote a program

Derivation of Relative Clause

• Basic sentence: Bertrand wrote a program

• Step 1: identify the object NP.
  Bertrand wrote [a program]

• Step 2: move it to the front, leaving a gap where the object NP was.
  [A program] Bertrand wrote npgap

• Note that "Bertrand wrote npgap" (underlined on the slide) is a sentence with a gap instead of an object.

• Step 3: insert the relative pronoun just before the sentence containing the gap.
  [A program] that Bertrand wrote npgap

• N.B. The gap is invisible.

Structure of the Noun Phrase

NP
  d: a                  (filler: a program)
  N: program
  RelCl
    rel: that
    Sgap
      NP
        n: bertrand
      VPgap
        v: wrote
        NPgap           (gap)

Long Distance Dependencies

• A filler-gap dependency occurs in an NL sentence when a subpart of some phrase is missing from its normal location and another phrase, outside the incomplete one, stands in its place.

• Filler-gap dependencies are examples of long-distance dependencies.

• The amount of material between dependents (in this case filler and gap) is unbounded, at least in principle, e.g.
  Terry read a book that Bertrand told a student to ask a professor to write.

Handling Gaps within a DCG

• Use a nonterminal argument position to indicate the presence or absence of a gap, e.g.
  np(nogap) = an NP without a gap
  np(gap) = an NP with a gap

• Introduce the term gap(T) to indicate the presence or absence of a gap of category T. So gap(np) will indicate where an NP used to be.

• For each category c that admits gaps, introduce a rule in which c(gap(c)) rewrites to nothing, e.g.

np(gap(np)) --> [ ].
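
The empty np(gap(np)) rule can be combined with a gap argument threaded through s and vp to parse the earlier example. This is a minimal sketch under stated assumptions: the relcl rule, the way the gap argument is passed down, and the toy lexicon are illustrative choices and not taken from the slides.

```prolog
% The argument records whether the phrase contains a gap.
s(Gap) --> np(nogap), vp(Gap).

% Ordinary noun phrases contain no gap.
np(nogap) --> [bertrand].
np(nogap) --> [terry].
np(nogap) --> d, n.
np(nogap) --> d, n, relcl.

% The gap itself: an NP that rewrites to nothing.
np(gap(np)) --> [].

% A relative clause is "that" followed by a sentence missing an NP.
relcl --> [that], s(gap(np)).

% The VP passes its gap status down to its object NP.
vp(Gap) --> v, np(Gap).

% Toy lexicon (assumed for illustration only).
d --> [a].
n --> [program].
v --> [wrote].
v --> [runs].

% ?- phrase(np(nogap), [a, program, that, bertrand, wrote]).            succeeds
% ?- phrase(s(nogap), [terry, runs, a, program, that, bertrand, wrote]). succeeds
% ?- phrase(s(nogap), [bertrand, wrote]).                               fails
```

A complete sentence is parsed as s(nogap), so an object can only be missing inside a relative clause, where relcl asks for s(gap(np)).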