2007CLINT-LIN-FEATSTR1 Computational Linguistics for Linguists Feature Structures.


Computational Linguistics for Linguists

Feature Structures


Example PATR-II Grammar and Lexicon

Grammar (grammar.grm)

Rule s -> np vp
Rule np -> n
Rule vp -> v

Lexicon (lexicon.lex)

\w uther
\c n

\w sleeps
\c v


Example PATR-II Grammar and Lexicon

Grammar (grammar.grm)

Rule s -> np vp
Rule np -> n
Rule vp -> v

Lexicon (lexicon.lex)

\w uther
\c n

\w sleeps
\c v

\w sleep
\c v
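Adding "sleep" as a plain v makes the grammar overgenerate. The following is a minimal Python sketch of that effect; the dict encoding and the `accepts` helper are illustrative assumptions, not PC-PATR's own parser.

```python
# Toy recognizer for the three-rule grammar above (hypothetical encoding,
# not PC-PATR itself).
LEXICON = {"uther": "n", "sleeps": "v", "sleep": "v"}

def accepts(sentence: str) -> bool:
    """Return True if the sentence matches s -> np vp, np -> n, vp -> v."""
    cats = [LEXICON.get(word) for word in sentence.split()]
    # With np -> n and vp -> v, s rewrites to exactly one n followed by one v.
    return cats == ["n", "v"]

print(accepts("uther sleeps"))  # True: grammatical and accepted
print(accepts("uther sleep"))   # True: ungrammatical, but accepted as well
```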


Example PATR-II Grammar and Lexicon

% Grammar (grammar.grm)

Rule s -> npsg vpsg
Rule s -> nppl vppl
Rule npsg -> nsg
Rule nppl -> npl
Rule vpsg -> vsg
Rule vppl -> vpl

% Lexicon (lexicon.lex)

\w cows
\c npl

\w uther
\c nsg

\w sleeps
\c vsg

\w sleep
\c vpl


Grammar and Lexicon with Pronouns

% Grammar (grammar.grm)

Rule s -> npsg vpsg
Rule s -> nppl vppl
Rule npsg -> nsg
Rule nppl -> npl
Rule vpsg -> vsg
Rule vppl -> vpl

% Lexicon (lexicon.lex)

\w he
\c nsg

\w him
\c nsg

\w she
\c nsg

\w her
\c nsg

\w they
\c npl

\w them
\c npl

\w sleeps
\c vsg

\w sleep
\c vpl


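Problem with the Grammar

• The grammar allows:

  he/him/she/her sleeps
  they/them sleep

• It therefore accepts ungrammatical strings such as "him sleeps" and "them sleep", because case is not distinguished.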


Grammar and Lexicon with Pronouns

% Grammar (grammar.grm)

Rule s -> npsgnom vpsg
Rule s -> npplnom vppl
Rule npsgnom -> nsgnom
Rule npplnom -> nplnom
Rule npsgacc -> nsgacc
Rule npplacc -> nplacc
Rule vpsg -> vsg
Rule vppl -> vpl

% Lexicon (lexicon.lex)

\w he
\c nsgnom

\w him
\c nsgacc

\w she
\c nsgnom

\w her
\c nsgacc

\w they
\c nplnom

\w them
\c nplacc

\w sleeps
\c vsg

\w sleep
\c vpl


Remarks

• The only mechanism available to a CFG to prevent overgeneration is the creation of new categories.

• Whenever we add new categories, the grammar gets longer and less understandable.

• Is there another way?
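To make the growth concrete, here is a small Python sketch that enumerates the atomic categories a plain CFG needs once number and case are both encoded in category names. The `FEATURES` inventory and the `split_categories` helper are hypothetical, chosen only to mirror the category names used in the grammars above.

```python
from itertools import product

# Hypothetical feature inventory: each new distinction multiplies the categories.
FEATURES = {"num": ["sg", "pl"], "case": ["nom", "acc"]}

def split_categories(base: str, features: dict) -> list:
    """Enumerate the atomic categories a plain CFG needs for one base category."""
    return [base + "".join(values) for values in product(*features.values())]

print(split_categories("n", FEATURES))
# ['nsgnom', 'nsgacc', 'nplnom', 'nplacc']
```

Adding a three-valued person feature to the same inventory would triple the count to twelve noun categories, each needing its own rules.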


Constraints and Information Structures

• PATR-II handles this problem by associating words with feature structures.

• Feature structures are commonly written as attribute-value matrices, e.g.

  [ cat  noun
    num  sing ]

• Items on the left are attributes

• Items on the right are corresponding values


Constraints and Information Structures

• Rules are then augmented with constraint equations between feature structures associated with constituents.

• These can be used to express constraints between constituents (e.g. subject/verb agreement),

• or to pass information from words up to higher constituents (e.g. np inherits information from n).


Example of PATR Rules with Constraints

Rule s -> np vp
    <np num> = <vp num>

Rule np -> n
    <np head> = <n head>


Feature Constraints

Feature constraints comprise three parts, in this order:

1. a feature path, the first element of which is one of the symbols from the phrase structure rule

2. an equal sign (=)

3. either a simple value, or another feature path that also starts with a symbol from the phrase structure rule
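The three-part shape can be sketched as a small Python parser. The regular expression and the `parse_constraint` helper are illustrative assumptions and are not taken from PC-PATR's grammar-file reader.

```python
import re

# Matches "<path> = <path>" or "<path> = value"; an illustrative pattern,
# not the PC-PATR grammar-file parser.
CONSTRAINT = re.compile(r"^\s*<([^<>]+)>\s*=\s*(?:<([^<>]+)>|(\S+))\s*$")

def parse_constraint(text: str):
    """Split a constraint into its left path and its right path or simple value."""
    match = CONSTRAINT.match(text)
    if match is None:
        raise ValueError(f"not a feature constraint: {text!r}")
    left, right_path, value = match.groups()
    # A right-hand path becomes a tuple of feature names; a simple value stays a string.
    right = tuple(right_path.split()) if right_path else value
    return tuple(left.split()), right

print(parse_constraint("<np num> = <vp num>"))
# (('np', 'num'), ('vp', 'num'))
print(parse_constraint("<np head agr num> = sg"))
# (('np', 'head', 'agr', 'num'), 'sg')
```

The sketch does not check that the first element of each path is a symbol from the rule; that check would need the rule itself as an extra argument.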


Unification

• Unification is the basic operation applied to feature structures in PC-PATR

• It consists of the merging of the information from two feature structures.

• Two feature structures can unify if their common features have the same values, but do not unify if any feature values conflict.


Examples

[num sg] unified with [person first] gives [num sg person first]

[num sg] unified with [num sg] gives [num sg]

[num sg] unified with [num pl] gives …


Examples

[num sg] unified with [person first] gives [num sg person first]

[num sg] unified with [num sg] gives [num sg]

[num sg] unified with [num pl] gives NOTHING
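These three cases can be reproduced with a minimal Python sketch for flat feature structures. Representing a structure as a dict and failure as `None` are assumptions made for illustration only.

```python
def unify(a: dict, b: dict):
    """Unify two flat feature structures; return None when values conflict."""
    result = dict(a)
    for feature, value in b.items():
        if feature in result and result[feature] != value:
            return None  # conflicting values: unification fails
        result[feature] = value
    return result

print(unify({"num": "sg"}, {"person": "first"}))  # {'num': 'sg', 'person': 'first'}
print(unify({"num": "sg"}, {"num": "sg"}))        # {'num': 'sg'}
print(unify({"num": "sg"}, {"num": "pl"}))        # None
```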


Complex-Valued FS

• Feature structures can have either simple values, or complex values, such as this

  [ cat   np
    head  [ agr      [ num  sg
                       gen  masc ]
            deftype  indef ] ]

• Feature structures can be arbitrarily nested and used to build linguistic representations.


Building Up Structures

• Agreement Features – 3rd person singular

  [ num     sing
    person  3 ]

• Noun Phrase – 3rd person sing noun phrase

  [ cat  np
    agr  [ num     sing
           person  3 ] ]

• Sentence – with 3rd person singular subject

  [ cat   s
    subj  [ cat  np
            agr  [ num     sing
                   person  3 ] ] ]
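The same build-up can be sketched with nested Python dicts standing in for attribute-value matrices. The encoding and the variable names are assumptions for illustration.

```python
# Nested dicts as a stand-in for attribute-value matrices (an assumed encoding).
agr_3sg = {"num": "sing", "person": 3}
np_3sg = {"cat": "np", "agr": agr_3sg}
s_3sg_subj = {"cat": "s", "subj": np_3sg}

print(s_3sg_subj)
# {'cat': 's', 'subj': {'cat': 'np', 'agr': {'num': 'sing', 'person': 3}}}
```

Each larger structure embeds the smaller one as the value of a single feature, which is all that nesting amounts to.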


Simple Unification Examples

1. [ agreement: [ number: singular
                  person: first ] ]

2. [ agreement: [ number: singular
                  case: nominative ] ]

3. [ agreement: [ number: singular
                  person: third ] ]

4. [ agreement: [ number: singular
                  person: first
                  case: nominative ] ]

5. [ agreement: [ number: singular
                  person: third
                  case: nominative ] ]


Checkpoint

Satisfy yourself that, using the previous examples:

• unify(1,2) = 4

• unify(2,3) = 5

• unify(1,3) = fail
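The checkpoint can also be verified mechanically. The sketch below is a simple recursive unifier over nested Python dicts, not PC-PATR's implementation. It assumes that `case` sits inside `agreement`, as structure 2 is written, and the names `fs1` to `fs5` are hypothetical labels for the five numbered structures.

```python
def unify(a, b):
    """Recursively unify two feature structures; return None on failure."""
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for feature, value in b.items():
            if feature in result:
                merged = unify(result[feature], value)
                if merged is None:
                    return None  # a shared feature has conflicting values
                result[feature] = merged
            else:
                result[feature] = value
        return result
    # Atomic values unify only with an identical atomic value.
    return a if a == b else None

fs1 = {"agreement": {"number": "singular", "person": "first"}}
fs2 = {"agreement": {"number": "singular", "case": "nominative"}}
fs3 = {"agreement": {"number": "singular", "person": "third"}}
fs4 = {"agreement": {"number": "singular", "person": "first", "case": "nominative"}}
fs5 = {"agreement": {"number": "singular", "person": "third", "case": "nominative"}}

print(unify(fs1, fs2) == fs4)  # True
print(unify(fs2, fs3) == fs5)  # True
print(unify(fs1, fs3))         # None
```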


Paths

• Portions of a feature structure can be referred to using the path notation.

• A path is a sequence of one or more feature names enclosed in angle brackets (< >). For instance,

  (1) <head>
  (2) <head deftype>
  (3) <head agr num>

• Paths are used to express feature constraints.
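Following a path amounts to repeated lookup. Here is a minimal Python sketch over nested dicts; the `get_path` helper and the sample structure `np_fs` are illustrative assumptions.

```python
def get_path(fs: dict, path: str):
    """Follow a path such as "<head agr num>" through nested dicts."""
    node = fs
    for feature in path.strip("<> ").split():
        if not isinstance(node, dict) or feature not in node:
            return None  # the path does not exist in this structure
        node = node[feature]
    return node

np_fs = {"cat": "np",
         "head": {"agr": {"num": "sg", "gen": "masc"}, "deftype": "indef"}}

print(get_path(np_fs, "<head deftype>"))  # indef
print(get_path(np_fs, "<head agr num>"))  # sg
```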


Examples of Constraints

• <head deftype> = indef

• <np head agr> = <vp head agr>


Constraint Equations

• The feature constraints associated with phrase structure rules in PC-PATR consist of a set of unification expressions.

• Each expression has three parts, in this order:
  • a feature path, the first element of which is one of the symbols from the phrase structure rule
  • an equal sign (=)
  • either a simple value, or another feature path that also starts with a symbol from the phrase structure rule


Execution of Equations

• Each equation is interpreted as an instruction to unify the left and right hand sides

• First, each side is "evaluated" before any unification is attempted. If the path does not exist it is created.

• After successful unification, the two structures are not merely equivalent, but identical, so that any change to one also changes the other.
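The steps above (evaluate both paths, create missing ones, unify, then share) can be sketched in Python with nested dicts. This is a deliberately simplified illustration, not PC-PATR's algorithm: the helpers `node_at`, `merge_into` and `equate` are hypothetical, a failed unification is not rolled back, empty structures nested deeper inside a value get no special treatment, and other references to the old right-hand object are not updated.

```python
def node_at(root: dict, path: tuple) -> tuple:
    """Return (parent, key) for a path, creating missing features as empty structures."""
    node = root
    for feature in path[:-1]:
        node = node.setdefault(feature, {})
    node.setdefault(path[-1], {})
    return node, path[-1]

def merge_into(target: dict, source: dict) -> bool:
    """Destructively add the information in source to target; False on conflict."""
    for feature, value in source.items():
        if feature not in target:
            target[feature] = value
        elif isinstance(target[feature], dict) and isinstance(value, dict):
            if not merge_into(target[feature], value):
                return False
        elif target[feature] != value:
            return False
    return True

def equate(root: dict, left: tuple, right: tuple) -> bool:
    """Execute <left> = <right>: evaluate both paths, unify, then share one object."""
    lparent, lkey = node_at(root, left)
    rparent, rkey = node_at(root, right)
    lval, rval = lparent[lkey], rparent[rkey]
    if lval == {}:
        lparent[lkey] = lval = rval   # left side carried no information
    elif isinstance(lval, dict) and isinstance(rval, dict):
        if not merge_into(lval, rval):
            return False              # conflicting values
    elif rval != {} and lval != rval:
        return False                  # conflicting atoms, or atom vs. structure
    rparent[rkey] = lval              # both paths now name the same object
    return True

rule = {"np": {"agr": {"num": "sg"}}, "vp": {}}
print(equate(rule, ("np", "agr"), ("vp", "agr")))  # True
print(rule["vp"]["agr"] is rule["np"]["agr"])      # True
rule["vp"]["agr"]["person"] = 3
print(rule["np"]["agr"])                           # {'num': 'sg', 'person': 3}
```

The last two lines show the point about identity: a value added through the vp path is visible through the np path because both name one object.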


Lexical Entries

• Lexical entries define the basic properties of words.

• Each definition is divided into fields, each of which begins with a standard format marker at the beginning of a line:
  – \w the lexical form of the word
  – \c word category (part of speech)
  – \g word gloss
  – \f additional features of this word


Lexical Entry Examples

\w fox
\c N
\g canine
\f <number> = singular

\w foxes
\c N
\g canine+PL
\f <number> = plural
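Records in this field-marker format are easy to read programmatically. The Python sketch below groups fields into one dict per entry; `parse_lexicon` is a hypothetical helper, and treating `\w` as the start of each record is an assumption based on the examples above.

```python
def parse_lexicon(text: str) -> list:
    """Parse field-marked records; each \\w marker starts a new entry."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("\\"):
            continue  # skip blank lines and anything without a field marker
        marker, _, content = line.partition(" ")
        if marker == "\\w":
            entries.append({})
        if entries:
            # A marker may repeat within a record, so values are kept in lists.
            entries[-1].setdefault(marker, []).append(content.strip())
    return entries

LEXICON = r"""
\w fox
\c N
\g canine
\f <number> = singular

\w foxes
\c N
\g canine+PL
\f <number> = plural
"""

entries = parse_lexicon(LEXICON)
for entry in entries:
    print(entry["\\w"][0], entry["\\c"][0], entry["\\f"][0])
```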


Corresponding Feature Structures

• When these entries are used by the grammar, they are represented by these feature structures:

  [ cat: N
    gloss: canine
    lex: fox
    number: singular ]

  [ cat: N
    gloss: canine+PL
    lex: foxes
    number: plural ]
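The mapping from a lexical entry to its feature structure can be sketched as follows. The `entry_to_fs` helper is hypothetical, and the sketch assumes that every `\f` field has the form `<path> = value`.

```python
def entry_to_fs(word: str, category: str, gloss: str, features=()) -> dict:
    """Build a feature structure (nested dict) from the fields of one lexical entry."""
    fs = {"cat": category, "gloss": gloss, "lex": word}
    for equation in features:
        path, _, value = equation.partition("=")
        names = path.strip().strip("<>").split()
        node = fs
        for name in names[:-1]:
            node = node.setdefault(name, {})  # create intermediate structure
        node[names[-1]] = value.strip()
    return fs

print(entry_to_fs("fox", "N", "canine", ["<number> = singular"]))
# {'cat': 'N', 'gloss': 'canine', 'lex': 'fox', 'number': 'singular'}
print(entry_to_fs("foxes", "N", "canine+PL", ["<number> = plural"]))
# {'cat': 'N', 'gloss': 'canine+PL', 'lex': 'foxes', 'number': 'plural'}
```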