Formal Issues in Natural Language Generation

44
Kees van Deemter Matthew Stone Formal Issues in Natural Language Generation Lecture 5 Stone, Doran,Webber, Bleam & Palmer

description

Formal Issues in Natural Language Generation. Lecture 5 Stone, Doran,Webber, Bleam & Palmer. GRE and surface realization. Arguably, GRE uses a grammar. Parameters such as the preference order on properties reflect knowledge of how to communicate effectively. - PowerPoint PPT Presentation

Transcript of Formal Issues in Natural Language Generation

Page 1: Formal Issues in Natural Language Generation

Kees van DeemterMatthew Stone

Formal Issuesin

Natural Language Generation

Lecture 5Stone, Doran,Webber, Bleam &

Palmer

Page 2: Formal Issues in Natural Language Generation

GRE and surface realization

Arguably, GRE uses a grammar.– Parameters such as the preference order on

properties reflect knowledge of how to communicate effectively.

– Decisions about usefulness or completeness of a referring expression reflect beliefs about utterance interpretation.

Maybe this is a good idea for NLG generally.

Page 3: Formal Issues in Natural Language Generation

GRE and surface realization

But we’ve thought GRE outputs semantics:

referent: furniture886

type: desk

status: definite

color: brown

origin: sweden

Page 4: Formal Issues in Natural Language Generation

GRE and surface realization

We also need to link this up with surface form:

the brown Swedish desk

Note: not

?the Swedish brown desk

Page 5: Formal Issues in Natural Language Generation

Today’s initial observations

It’s hard to do realization on its ownmapping from semantics to surface

structure.

It’s easy to combine GRE and realizationbecause GRE is grammatical reasoning!if you have a good representation for

syntax.

Page 6: Formal Issues in Natural Language Generation

Why it’s hard to do realizationA pathological grammar of adjective

order:

NP the N(w).N(w) w N(w) if w is an adjective and

wRw.N(w) w if w is a noun.

Page 7: Formal Issues in Natural Language Generation

Syntax with this grammar

Derivation of example:

the brown Swedish desk

NP

N(brown)

N(Swedish)

N(desk)

Requires: brown R Swedish, Swedish R desk

Page 8: Formal Issues in Natural Language Generation

Realization, formally

You start with k properties.Each property can be realized lexically.

assume: one noun, many adjectives(not that it’s easy to enforce this)

Realization solution:NP which realizes each property exactly

once.

Page 9: Formal Issues in Natural Language Generation

Quick formal analysis

View problem graph-theoretically:k words, corresponding to vertices in a graphR is a graph on the k wordsSurface structure is a Hamiltonian path

(which visits each vertex exactly once)through R.

This is a famous NP complete problemSo surface realization itself is intractable!

Page 10: Formal Issues in Natural Language Generation

Moral of the example

Semantics underdetermines syntactic relations.Here, semantics underdetermines syntactic

relations of adjectives to one another and to the head.

Searching for the correspondence is hard.See also Brew 92, Koller and Striegnitz 02.

Page 11: Formal Issues in Natural Language Generation

Today’s initial observations

It’s hard to do realization on its ownmapping from semantics to surface

structure.

It’s easy to combine GRE and realizationbecause GRE is grammatical reasoning!if you have a good representation for

syntax.

Page 12: Formal Issues in Natural Language Generation

Syntactic processing for GRE

LexicalizationSteps of grammatical derivation correspond to meaningful choices in NLG.

E.g., steps of grammar are synched with steps of adding a property to a description.

Page 13: Formal Issues in Natural Language Generation

Syntactic processing for GRE

Key ideas: lexicalization, plusFlat dependency structure (adjs modify

noun)Hierarchical representation of word-order

NP

N(color)

N(origin)

N(size)

N(material)the desk

Page 14: Formal Issues in Natural Language Generation

Syntactic processing for GRE

Other syntactic lexical entries

Adj

N(origin)

Swedish

N(color)

Adj

brown

Page 15: Formal Issues in Natural Language Generation

Describing syntactic combinationOperation of combination 1: Substitution

NP + =NP

N(color)

N(origin)

N(size)

N(material)the desk

NP

N(color)

N(origin)

N(size)

N(material)the desk

Page 16: Formal Issues in Natural Language Generation

Describing syntactic combinationOperation of combination 2: Sister adjunction

+ =NP

N(color)

N(origin)

N(size)

N(material)the desk

NP

N(color)

N(origin)

N(size)

N(material)the desk

N(color)

Adj

brownAdj

brown

Page 17: Formal Issues in Natural Language Generation

Abstracting syntax

Tree rewriting:Each lexical item is associated with a structure.You have a starting structure.You have ways of combining two structures together.

Page 18: Formal Issues in Natural Language Generation

Abstracting syntax

Derivation treerecords elements and how they are combined

the desk

brown(s.a. @ color)

Swedish(s.a. @ origin)

Page 19: Formal Issues in Natural Language Generation

An extended incremental algorithm• r = individual to be described• P = lexicon of entries, in preference

orderP is an individual entrysem(P) is a property or set of entries from

the contextsyn(P) is a syntactic element

• L = surface syntax of description

Page 20: Formal Issues in Natural Language Generation

Extended incremental algorithm

L := NPC := DomainFor each P P do:

If r sem(P) & C sem(P)Then do

L := add(syn(P), L)C := C sem(P)If C = {r} then return L

Return failure

Page 21: Formal Issues in Natural Language Generation

Observations

Why use tree-rewriting - not,e.g. CFG derivation?

NP the N(w).N(w) w N(w) if w is an adjective and

wRw.N(w) w if w is a noun.

CFG derivation forces you to select properties in the surface word-order.

Page 22: Formal Issues in Natural Language Generation

Observations

Tree-rewriting frees word-order from choice-order.

NP

N(color)

N(origin)

N(size)

N(material)

the

desk

NP

N(color)

N(origin)

N(size)

N(material)

the

desk

Adj

brown

NP

N(color)

N(origin)

N(size)

N(material)

the

desk

Adj

brown

Adj

Swedish

Page 23: Formal Issues in Natural Language Generation

Observations

Tree-rewriting frees word-order from choice-order.

NP

N(color)

N(origin)

N(size)

N(material)

the

desk

NP

N(color)

N(size)the

NP

N(color)

N(origin)

N(size)

N(material)

the

desk

Adj

brown

Adj

Swedish

N(origin)

N(material)

desk

Adj

Swedish

Page 24: Formal Issues in Natural Language Generation

This is reflected in derivation treeDerivation tree

records elements and how they are combined

the desk

brown(s.a. @ color)

Swedish(s.a. @ origin)

Page 25: Formal Issues in Natural Language Generation

Formal results

Logical completeness.If there’s a flat derivation tree for an NP that identifies referent r, Then the incremental algorithm finds it.

ButSensible combinations of properties may not yield surface NPs.Hierarchical derivation trees may require lookahead in usefulness check.

Page 26: Formal Issues in Natural Language Generation

Formal results

Computational complexityNothing changes – we just add properties, one after another…

Page 27: Formal Issues in Natural Language Generation

Now, though, we’re choosing specific lexical entries

NP

N(departure)

N(destination)

N(stops)

the

express

Adj

3:35

N

Trenton

vsNP

N(departure)

N(destination)

N(stops)

the

express

Adj

15:35

N

Trenton

maybe these lexical items express the same property…

Page 28: Formal Issues in Natural Language Generation

• Use

in 12-hour time context

• Use

in 24-hour time context

What motivates these choices?

N(departure)

Adj

3:35

N(departure)

Adj

15:35

Page 29: Formal Issues in Natural Language Generation

• P = lexicon of entries, in preference orderP is an individual entrysem(P) is a property or set of entries from

the contextsyn(P) is a syntactic elementprags(P) is a test which the context must

satisfy for the entry to be appropriate

Need to extend grammar again

Page 30: Formal Issues in Natural Language Generation

For example:

syn:

sem: departure(x, 1535)prags: twentyfourhourtime

Need to extend grammar again

N(departure)

Adj

15:35

Page 31: Formal Issues in Natural Language Generation

Extended incremental algorithm

L := NPC := DomainFor each P P do:

If r sem(P) & C sem(P) & prags(P) is trueThen do

L := add(syn(P), L)C := C sem(P)If C = {r} then return L

Return failure

Page 32: Formal Issues in Natural Language Generation

Discussion:What does this entry do?

syn:

sem: thing(x)prags: in-focus(x)

NP

it

Page 33: Formal Issues in Natural Language Generation

Suggestion: find best value

Given: – A set of entries that combine syntactically with

L in the same way– Related by semantic generality and pragmatic

specificity.– Current distractors

Take entries that remove the most distractorsOf those, take the most semantically generalOf those, take the most pragmatically specific

Page 34: Formal Issues in Natural Language Generation

Extended incremental algorithm

L := NP C := DomainRepeat

Choices := { P : add(syn(P), L) at next node & r sem(P) & prags(P) is true }

P := find best value(Choices)L := add(syn(P), L)C := C sem(P)If C = {r} then return L

Return failure

Page 35: Formal Issues in Natural Language Generation

What is generation anyway?

Generation is intentional (or rational) actionthat’s why Grice’s maxims apply, for

example.

You have a goalYou build a plan to achieve it

(& achieve it economically in a recognizable way)

You carry out the plan

Page 36: Formal Issues in Natural Language Generation

In GRE…

The goal is for hearer to know the identity of r(in general g)

The plan will be to utter some NP Usuch that the interpretation of U identifies { r }(in general c u cg)

Carrying out the plan means realizing this utterance.

Page 37: Formal Issues in Natural Language Generation

In other words

GRE amounts to a process of deliberation.

Adding a property to L incrementally is like committing to an action.These commitments are called intentions.Incrementality is characteristic of intentions –

though in general intentions are open to revision.

Note: this connects with belief-desire-intention models of bounded rationality.

Page 38: Formal Issues in Natural Language Generation

GRE as (BDI) rational agency

L := NP // Initial plan C := Domain // Interpretationwhile (P := FindBest(P, C, L)) { //

DeliberationL := add(syn(P), L) // Adopt new intentionC := C sem(P) // Update interpretationif C = { r } return L // Goal satisfied

}fail

Page 39: Formal Issues in Natural Language Generation

NLG as (BDI) rational agency

L := X C := Initial Interpretationwhile (P := FindBest(P, C, L)) {

L := AddSyntax(syn(P), L)C := AddInterpretation(sem(P), C)if GoalSatisfied(C) return L

}fail

Page 40: Formal Issues in Natural Language Generation

Conclusionsfor NLG researchers

It’s worth asking (and answering) formal questions about NLG.

Questions of logical completeness – can a generator express everything it ought to?Questions of computational complexity – is the cost of a generation algorithm worth the results?

Page 41: Formal Issues in Natural Language Generation

Conclusionsfor linguists

NLG offers a precise perspective on questions of language use.

For example, what’s the best way of communicating some message?

NLG – as opposed to other perspectives – gives more complete, smaller-scale models.

Page 42: Formal Issues in Natural Language Generation

Conclusionsfor AI in general

NLG does force us to characterize and implement representations & inference for practical interactive systems

Good motivation for computational semantics.Meaty problems like logical form equivalence.

Many connections and possibilities for implementation (graphs, CSPs, circuit optimization, data mining,…)

Page 43: Formal Issues in Natural Language Generation

Open Problems

• Sets and salience in REs.• Generating parallel REs.• Theoretical and empirical measures of

quality/utility for REs.• Avoiding ambiguity in REs.

Any problem in RE generalizes to one in NLG.

Page 44: Formal Issues in Natural Language Generation

Followup information

Course web page:http://www.itri.brighton.ac.uk/home/Kees.van.Deemter/esslli-notes.html

– downloadable papers– final lecture notes– papers we’ve talked about– links (recent/upcoming events, siggen,

sigsem)

by Monday August 26.