Minimalist Parsing Scott Drellishak CompLing Lab Meeting 2/1/2006.

Minimalist Parsing

Scott Drellishak

CompLing Lab Meeting

2/1/2006

Overview

Four parts:

1. Whirlwind tour of Minimalism
2. Formal definition of a Minimalist Grammar
3. Algorithms for parsing MGs
4. Software and web sites

Minimalism

Recent version of transformational generative grammar: Chomsky’s (1995) The Minimalist Program.

Updates and supersedes earlier GB/P&P.

Explores “the extent to which previous empirical coverage can be maintained with fewer grammatical devices.” (Stabler 1999: 299)

Minimalism

Sentence derivations proceed according to this (famous) diagram:

[Diagram: Lexicon → Logical Form (LF), with a branch to Phonetic Form (PF)]

Minimalism

Items come out of the lexicon fully inflected and with features: interpretable and uninterpretable.

Uninterpretable features must cancel out before the derivation reaches LF.

The branch to PF allows the surface form to “peek” into the middle of the derivation.

Cross-linguistic differences are accounted for by variations in the lexicon.

Minimalism

Trees are derived by starting with singleton trees (lexical items) and combining them.

Only two operations:

    On two trees: merge them together into a single tree (with one “projecting over” the other).

    On a single tree: move a node in the tree up to the root to cancel a feature.

(We’ll see a couple of derivations later.)

Minimalist Grammars

For parsing, Minimalism needs formalization.

Stabler (1997) defines a MG as:

    V = phonetic and interpretable features
    Cat = categories, selectors, licensors, licensees
    Lex = expressions (trees) built from V and Cat
    F = { merge, move }

(Based on an earlier grammar formalism, so the names don’t mean what you think.)

V = Lexicon

Lexical entries like:

    =n d -case every        (category D, selects a N, needs case)
    n language              (category N)
    =d +case =d v speaks    (category V, 2 DPs, assigns case to 1)

This is a DP analysis.

“speaks” stands for /speaks/(speaks).
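The entries above can be read as a list of features followed by a word. The Python sketch below assumes exactly that layout; `parse_entry` and `category` are hypothetical helper names chosen for illustration, not part of Stabler's formalism.

```python
from typing import List, Tuple

def parse_entry(entry: str) -> Tuple[List[str], str]:
    """Split 'feat1 feat2 ... word' into (features, word)."""
    *features, word = entry.split()
    return features, word

def category(features: List[str]) -> str:
    """Return the base (category) feature: the one carrying no =, + or - marking."""
    for f in features:
        if not (f[0] in "=+-" or f.endswith("=")):
            return f
    raise ValueError("entry has no category feature")

# The three entries from the slide, with plain ASCII hyphens.
lexicon = [parse_entry(e) for e in (
    "=n d -case every",
    "n language",
    "=d +case =d v speaks",
)]

for features, word in lexicon:
    print(word, "is of category", category(features), "with features", features)
```

Keeping the features in their written order matters, since they are cancelled from left to right during a derivation.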

Cat = Features

Base: c, t, v, d, n, … (parts of speech)

Select: =x, =X, X= (selects arguments)

    Select features trigger merge.

    Upper-case moves phonetic content to merged node; “=” determines prefix or postfix.

Licensees: -case, -wh, … (needs…)

Licensors: +case, +wh, … (provides…)

    L* features trigger move; upper-case = “strong”.
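The four feature kinds can be told apart from their spelling alone. The following Python sketch classifies a feature token under that assumption; the `Feature` record and the `classify` and `triggers` functions are hypothetical names introduced here for illustration.

```python
from typing import NamedTuple, Optional

class Feature(NamedTuple):
    kind: str     # "base", "select", "licensor" or "licensee"
    name: str     # lower-cased name, e.g. "d" or "case"
    upper: bool   # True if the feature was written in upper case

def classify(token: str) -> Feature:
    """Classify a feature token by its prefix or suffix marking."""
    if token.startswith("=") or token.endswith("="):
        kind, name = "select", token.strip("=")
    elif token.startswith("+"):
        kind, name = "licensor", token[1:]
    elif token.startswith("-"):
        kind, name = "licensee", token[1:]
    else:
        kind, name = "base", token
    return Feature(kind, name.lower(), name.isupper())

def triggers(feature: Feature) -> Optional[str]:
    """Select features trigger merge; licensors and licensees trigger move."""
    if feature.kind == "select":
        return "merge"
    if feature.kind in ("licensor", "licensee"):
        return "move"
    return None

for token in ("v", "=d", "D=", "+case", "+CASE", "-wh"):
    f = classify(token)
    print(token, f, triggers(f))
```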

Lex = Trees

A set of nodes and three relations:

    Dominance (x ⊳ y = x is y’s parent): Who’s higher in the tree?

    Precedence (x ≺ y = x precedes y): Who’s before who in the tree?

    Projection (x < y = x projects over y): Whose features percolate up to the parent?
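One way to picture the three relations is a binary tree whose internal nodes record which daughter projects. The Python sketch below is an illustration under that assumption; `Leaf`, `Node`, `is_parent`, `leaves` and `head` are hypothetical names, not Stabler's definitions.

```python
from dataclasses import dataclass
from typing import List, Union

@dataclass
class Leaf:
    label: str            # e.g. "=d v make"

@dataclass
class Node:
    proj: str             # "<": left daughter projects; ">": right daughter projects
    left: "Tree"
    right: "Tree"

Tree = Union[Leaf, Node]

def is_parent(x: Tree, y: Tree) -> bool:
    """Dominance as defined on the slide: x is y's parent."""
    return isinstance(x, Node) and (x.left is y or x.right is y)

def leaves(tree: Tree) -> List[Leaf]:
    """Precedence: the leaves in left-to-right order."""
    if isinstance(tree, Leaf):
        return [tree]
    return leaves(tree.left) + leaves(tree.right)

def head(tree: Tree) -> Leaf:
    """Projection: follow the projecting daughter down to the leaf whose features percolate up."""
    while isinstance(tree, Node):
        tree = tree.left if tree.proj == "<" else tree.right
    return tree

make, lunch = Leaf("=d v make"), Leaf("lunch")
vp = Node("<", make, lunch)
print(head(vp).label)                    # the projecting leaf
print([l.label for l in leaves(vp)])     # precedence order
print(is_parent(vp, lunch))
```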

F = Operations

merge: Combines two trees. A head selects and combines with a phrase to its right:

    =d =d v make  +  d lunch  ⇒

              <
             / \
    =d v make   lunch
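The merge step shown above can be sketched in Python as follows. This covers only the case on the slide (a head selecting a phrase to its right); `Leaf`, `Node`, `cancel_first` and `merge` are hypothetical names, and the representation of a leaf as a feature tuple plus a word is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Tuple, Union

@dataclass(frozen=True)
class Leaf:
    features: Tuple[str, ...]
    word: str

@dataclass(frozen=True)
class Node:
    proj: str             # "<": left daughter projects; ">": right daughter projects
    left: "Tree"
    right: "Tree"

Tree = Union[Leaf, Node]

def head(tree: Tree) -> Leaf:
    """Follow the projecting daughter down to a leaf."""
    while isinstance(tree, Node):
        tree = tree.left if tree.proj == "<" else tree.right
    return tree

def cancel_first(tree: Tree) -> Tree:
    """Copy the tree with the first feature of its head removed."""
    if isinstance(tree, Leaf):
        return Leaf(tree.features[1:], tree.word)
    if tree.proj == "<":
        return Node("<", cancel_first(tree.left), tree.right)
    return Node(">", tree.left, cancel_first(tree.right))

def merge(selector: Tree, selected: Tree) -> Tree:
    """Cancel =x against x and let the selector project over the phrase to its right."""
    want = head(selector).features[0]
    have = head(selected).features[0]
    if want != "=" + have:
        raise ValueError(f"{want} does not select {have}")
    return Node("<", cancel_first(selector), cancel_first(selected))

make = Leaf(("=d", "=d", "v"), "make")
lunch = Leaf(("d",), "lunch")
vp = merge(make, lunch)
print(vp)
```

After the merge, the head still carries `=d v`, so it can go on to select a second DP.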

F = Operations

If the selector feature is upper case, only the phonetic features combine:

    D= =d v make  +  d lunch  ⇒

                          <
                         / \
    =d v /lunch make/(make)   (lunch)

F = Operations

move: One tree’s head’s +x feature attracts the nearest -x feature to the root of the tree:

                 <
                / \
    +case v speak   -case Nahuatl

    ⇒

             >
            / \
    (Nahuatl)   <
               / \
        v speak   /Nahuatl/

A Sample Derivation

Let’s take a look at the derivation of a simple sentence from Stabler (1997)…

Parsing MGs

Stabler (2000 and 2001) describes a CYK-like algorithm for parsing MGs.

Defines a set of operations on strings of features that are arranged in “chains” (forests of incomplete trees).

Each of these operations operates on a contiguous range of the forest, so they can be chart-parsed to recognize input sentences.

MG Operations

CYK?

Somewhat different from the version of CYK used to parse CFGs, but it’s still the same idea.

Each operation transforms a string of features, canceling out selection and licensing features, producing more strings, which are stored in the chart.

Then, look for further operations that take them as input, building a hierarchy.

Another Recognizer

Stabler refers to Harkema (2000), which defines a MG recognizer that works more like an Earley parser.

It has an agenda and a chart. As operations are applied to make new items, those go into the agenda. Stop when a “goal item” appears in the chart.

Overall time complexity is O(n^(4k+4)).
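The agenda-and-chart control loop described above can be sketched generically in Python. This is not Harkema's recognizer: the items here are plain spans and the single toy rule `join` just glues adjacent spans together, both assumptions made to keep the example small. Only the loop structure (agenda, chart, stop at a goal item) is the point.

```python
from collections import deque
from typing import Callable, Hashable, Iterable

def recognize(axioms: Iterable[Hashable],
              combine: Callable[[Hashable, Hashable], Iterable[Hashable]],
              is_goal: Callable[[Hashable], bool]) -> bool:
    """Move items from the agenda to the chart until a goal item shows up."""
    chart = set()
    agenda = deque(axioms)
    while agenda:
        item = agenda.popleft()
        if item in chart:
            continue                      # already processed
        chart.add(item)
        if is_goal(item):
            return True                   # a goal item appeared in the chart
        for other in list(chart):
            # Try the new item on either side of everything in the chart.
            for new in list(combine(item, other)) + list(combine(other, item)):
                if new not in chart:
                    agenda.append(new)
    return False

def join(a, b):
    """Toy rule (hypothetical): two adjacent spans yield the span covering both."""
    if a[1] == b[0]:
        yield (a[0], b[1])

words = [(0, 1), (1, 2), (2, 3)]
print(recognize(words, join, lambda item: item == (0, 3)))
```

In an MG recognizer the items would be sequences of chains and the rules would be the merge and move variants, but the agenda discipline stays the same.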

Another Sample Derivation

Here’s a derivation from Stabler (2000), in a slightly different format; note indices:

    1. (0,1)::=d v -w                             lexical
    2. (1,2)::d -case                             lexical
    3. (x,x)::=v +case acc                        lexical
    4. (x,x)::=acc +w w                           lexical
    5. (0,1):v -w,(1,2):-case                     merge3(1,2)
    6. (x,x):+case acc,(0,1):-w,(1,2):-case       merge3(3,5)
    7. (1,2):acc,(0,1):-w                         move1(6)
    8. (1,2):+w w,(0,1):-w                        merge1(4,7)
    9. (0,2):w                                    move1(8)
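The nine steps above can be replayed mechanically. The Python sketch below implements only the three rules the derivation uses (merge1, merge3, move1), with their conditions inferred from the steps themselves; the item encoding and the helper names `concat` and `lexical` are assumptions made for illustration, and the remaining rules of Stabler's system are left out.

```python
from typing import Optional, Tuple

Span = Optional[Tuple[int, int]]         # None is the empty span, written (x,x) above
Chain = Tuple[Span, Tuple[str, ...]]     # (span, remaining features)
Item = Tuple[bool, Tuple[Chain, ...]]    # (is_lexical, chains); chains[0] is the head chain

def concat(a: Span, b: Span) -> Span:
    """Join two adjacent spans; the empty span joins with anything."""
    if a is None:
        return b
    if b is None:
        return a
    if a[1] != b[0]:
        raise ValueError(f"{a} and {b} are not adjacent")
    return (a[0], b[1])

def lexical(span: Span, features: str) -> Item:
    return (True, ((span, tuple(features.split())),))

def merge1(sel: Item, arg: Item) -> Item:
    """Lexical selector; the selected head has only its category left, so it attaches to the right."""
    is_lex, ((s, fs),) = sel
    _, ((t, ft), *movers) = arg
    assert is_lex and fs[0] == "=" + ft[0] and len(ft) == 1
    return (False, ((concat(s, t), fs[1:]), *movers))

def merge3(sel: Item, arg: Item) -> Item:
    """The selected head still has features, so it is kept as a separate chain (a future mover)."""
    _, ((s, fs), *sel_movers) = sel
    _, ((t, ft), *arg_movers) = arg
    assert fs[0] == "=" + ft[0] and len(ft) > 1
    return (False, ((s, fs[1:]), *sel_movers, (t, ft[1:]), *arg_movers))

def move1(item: Item) -> Item:
    """+f on the head attracts the unique chain whose only feature is -f; it lands to the left."""
    _, ((s, fs), *movers) = item
    assert fs[0].startswith("+")
    wanted = "-" + fs[0][1:]
    matches = [m for m in movers if m[1][0] == wanted]
    assert len(matches) == 1 and len(matches[0][1]) == 1
    mover = matches[0]
    rest = [m for m in movers if m is not mover]
    return (False, ((concat(mover[0], s), fs[1:]), *rest))

i1 = lexical((0, 1), "=d v -w")
i2 = lexical((1, 2), "d -case")
i3 = lexical(None, "=v +case acc")
i4 = lexical(None, "=acc +w w")
i5 = merge3(i1, i2)
i6 = merge3(i3, i5)
i7 = move1(i6)
i8 = merge1(i4, i7)
i9 = move1(i8)
print(i9)
```

The final item is a single chain covering the whole input (0,2) with only the category w left, which is what step 9 shows.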

Parsers

Stabler’s parsers: MG parsers in OCaml and two flavors of Prolog. (Also requires Tcl/Tk.)

Sourabh Niyogi: Stabler-based MG parser in Scheme, does verb subcategorization.

Willemijn Vermaat: Stabler-based MG parser w/ web interface (that I couldn’t figure out).

Dekang Lin: MINIPAR. Executable only, based on PRINCIPAR, not clear what the internals are like.

References

Chomsky (1995). The Minimalist Program.

Harkema (2000). A Recognizer for Minimalist Grammars.

Stabler (1997). Derivational Minimalism.

Stabler (1999). Remnant Movement and Structural Complexity.

Stabler (2000). Minimalist Grammars and Recognition.