
Page 1: Generative Models of Discourse. Eugene Charniak, Brown Laboratory for Linguistic Information Processing (BLLIP).

Generative Models of Discourse

Eugene Charniak

Brown Laboratory for Linguistic Information Processing

BLLIP

Page 2:

Joint Work With

• Micha Elsner

(PhD student, Brown)

• Joseph Osterwile

(former undergraduate, Brown)

Page 3:

Abstract

Discourse, the study of how the meaning of a document is built out of the meanings of its sentences, is the inter-sentential analogue of semantics.  In this talk we consider the following abstract problem in discourse.  Given a document, randomly permute the order of the sentences and then attempt to distinguish the original from the permuted version.  We present a sequence of generative models that can handle the problem with increasing accuracy.  Each model accounts for some aspect of the document, and assigns a probability to the document's contents.  In the standard generative way the subsequent models simply multiply individual probabilities to get their results. We also discuss the linkage of this abstract task to more realistic ones such as essay grading, document summarization and document generation.

Page 4:

Revised Abstract

We present a sequence of generative models that can handle the problem with increasing accuracy. Each model accounts for some aspect of the document, and assigns a probability to the document's contents. Given a document, randomly permute the order of its sentences and then attempt to distinguish the original from the permuted version. In the standard generative way the subsequent models simply multiply individual probabilities to get their results. In this talk we consider the following abstract problem in discourse. We also discuss the linkage of this abstract task to more realistic ones such as essay grading, document summarization and document generation. Discourse, the study of how the meaning of a document is built out of the meanings of its sentences, is the inter-sentential analogue of semantics.

NOTICE! This example is doctored to illustrate the program. You can ask me about the real randomized abstract if you like.

Page 5:

A Note on “Generative”

When we talk about a “generative” model

we do NOT mean a model that actually generates language. (If we do mean that, we will say "literally generate".) Rather, "generative" is used in machine learning to describe a model that assigns a probability to its input. So "generate" = "assign a probability to".

Page 6:

Our Three Models

• Each of our three models assigns a probability to some aspect of the input (head nouns, pronouns, and noun-phrase syntax, respectively).

• The idea is that the probability assigned to the original document should be higher than that assigned to the random one.

• One advantage of such generative models is that if done correctly, they can be combined by just multiplying their probabilities together. This is, in fact, exactly what we do.
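Concretely, the combination step can be sketched as follows (a minimal illustration; the numbers and function names here are hypothetical, not from the talk). Multiplying probabilities is summing log-probabilities, and the combined model prefers whichever ordering gets the higher total.

```python
def combined_logprob(model_logprobs):
    """Combine independent generative models by multiplying their
    probabilities, i.e. summing their log-probabilities."""
    return sum(model_logprobs)

def prefer_original(original_scores, permuted_scores):
    """True when the combined model ranks the original document
    above its permutation."""
    return combined_logprob(original_scores) > combined_logprob(permuted_scores)

# Hypothetical per-model log-probabilities
# (head-noun, pronoun, and NP-syntax models, in that order):
original = [-120.5, -48.2, -210.9]
permuted = [-131.7, -49.0, -215.3]
print(prefer_original(original, permuted))  # True
```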

Page 7:

More Formally

$P(D) = \prod_i P(S_i \mid S_{1,i-1})$

$P(S_i \mid S_{1,i-1}) = P_N(S_i \mid S_{1,i-1})\, P_P(S_i \mid S_{1,i-1})\, P_S(S_i \mid S_{1,i-1})$

We generate each sentence conditioned on the previous sentences

For each sentence we compute three probabilities, head-nouns, pronouns, and NP syntax.
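This decomposition can be sketched in code (an illustrative skeleton, with a made-up stand-in model; none of these names come from the talk): iterate over sentences, and for each one add the log-probabilities assigned by the component models given the history.

```python
def document_logprob(sentences, models):
    """log P(D) = sum_i log P(S_i | S_1..i-1); each sentence factor is
    the product of the component models (so their logs add)."""
    total = 0.0
    for i, sentence in enumerate(sentences):
        history = sentences[:i]
        for model in models:  # head-noun, pronoun, and NP-syntax models
            total += model(sentence, history)
    return total

# A toy stand-in model (purely illustrative): penalize each word.
def toy_model(sentence, history):
    return -float(len(sentence.split()))

print(document_logprob(["a b", "c"], [toy_model]))  # -3.0
```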

Page 8:

Generative Models of Discourse

I Introduction

II Model 1 – Head Nouns (Entity Grids)

III Model 2 - Pronominal Reference

IV Model 3 – Noun-Phrase Syntax

V Real Problems (Future Work)

Page 9:

Nouns Tend to Repeat

Discourse, the study of how the meaning of a document is built out of the meanings of its sentences, is the inter-sentential analogue of semantics.  In this talk we consider the following abstract problem in discourse.  Given a document, randomly permute the order of the sentences and then attempt to distinguish the original from the permuted version.  We present a sequence of generative models that can handle the problem with increasing accuracy.  Each model accounts for some aspect of the document, and assigns a probability to the document's contents.  In the standard generative way the subsequent models simply multiply individual probabilities to get their results. We also discuss the linkage of this abstract task to more realistic ones such as essay grading, document summarization and document generation.

Page 10:

Entity Grids

• Following Barzilay, Lapata, and Lee, an entity grid is an array with the "entities" (really just the head nouns) of the document on one axis and the sentence ordering on the other; each cell records the role the entity plays in that sentence. As in previous work we limit the roles to subject (S), object (O), other (X), and not mentioned (-).
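Building such a grid is straightforward. Here is a minimal sketch, assuming the parser's output has already been reduced to per-sentence head-noun/role pairs (the input format and names are my own, not from the talk):

```python
def entity_grid(doc):
    """Build an entity grid from a list of per-sentence dictionaries
    mapping head noun -> role ('S', 'O', or 'X').  Nouns absent from
    a sentence get '-'."""
    nouns = []
    for sentence in doc:
        for noun in sentence:
            if noun not in nouns:
                nouns.append(noun)
    return {n: [sentence.get(n, "-") for sentence in doc] for n in nouns}

# Two sentences of a toy document (roles are illustrative):
doc = [{"discourse": "S", "meaning": "X"},
       {"discourse": "X", "problem": "O"}]
grid = entity_grid(doc)
print(grid["discourse"])  # ['S', 'X']
print(grid["problem"])    # ['-', 'O']
```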

Page 11:

A (Partial) Entity Grid

Discourse S X - - - - -

Meaning X - - - - - -

Document X - X - X - -

Sentences X - X - - - -

Talk - X - - - - -

Problem - O - O - - -

Order - - O - - - -

Original - - X - - - -

Version - - X - - - -

Models - - - X - S -

Page 12:

The Grid for the Randomized Document

Discourse - - - - X - S

Meaning - - - - - - X

Document - X X - - - X

Sentences - - X - - - X

Talk - - - - X - -

Problem O - - - O - -

Order - - O - - - -

Original - - X - - - -

Version - - X - - - -

Models X - - S - - -

Page 13:

The Basic E-grid Probability

For head-noun probabilities we look at each head noun's probability given its two-sentence history (the roles (S, O, X, -) it filled in the two previous sentences).

$P_N(S_i \mid S_{1,i-1}) = \prod_{n \in S_i} P\big(n \mid r_{i-1}(n),\, r_{i-2}(n)\big)$

The product ranges over each noun n in the sentence; $r_{i-1}(n)$ is the role n plays in the (i-1)th sentence.
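Given a grid and a table of role-transition probabilities, this quantity is a short loop. A minimal sketch (the probability table here is hypothetical; a real model estimates it from counts over a parsed training corpus):

```python
import math

def egrid_logprob(grid, probs):
    """log P_N of a document: for each noun, multiply the probability
    of its role in each sentence given its roles in the two previous
    sentences.  `probs` maps (r_{i-2}, r_{i-1}, r_i) -> probability."""
    total = 0.0
    for roles in grid.values():
        padded = ["-", "-"] + roles  # no history before the document starts
        for i in range(2, len(padded)):
            total += math.log(probs[(padded[i - 2], padded[i - 1], padded[i])])
    return total

# Hypothetical transition probabilities:
probs = {("-", "-", "S"): 0.1, ("-", "S", "X"): 0.3,
         ("-", "-", "X"): 0.05, ("-", "X", "-"): 0.5}
lp = egrid_logprob({"discourse": ["S", "X"], "meaning": ["X", "-"]}, probs)
```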

Page 14:

Model 1 Results

Baseline 50%

Model 1 82.2%

Trained on 10,000 automatically parsed documents from the NTC corpus, tested on 1323 other documents from the same corpus.

Page 15:

Generative Models of Discourse

I Introduction

II Model 1 - Entity Grids

III Model 2 - Pronominal Reference

IV Model 3 – Noun-Phrase Syntax

V Real Problems (Future Work)

Page 16:

Can Pronouns Help?

• In our abstract the only important pronouns have intra-sentential antecedents.

• Furthermore, when the document is out of order, there will almost always be something for the pronoun to point back to.

• As we will see, pronouns are the weakest of our models, but they do help.

Page 17:

Adding Pronouns to the Mix

To handle pronouns we need to consider the various pronoun-resolution possibilities:

$P_p(D) = \sum_a P(A = a, D)$

Unfortunately this sum is intractable, so we approximate it with

$P_p(D) \approx \max_a P(A = a, D)$

This is reasonable because most documents have only one set of reference assignments that makes sense.
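As a small sketch of the max approximation (with made-up candidate scores, and under the extra simplifying assumption that pronouns are resolved independently, so the max over joint assignments factors into per-pronoun maxima):

```python
def pronoun_logprob(candidates):
    """Approximate log sum_a P(A=a, D) by log max_a P(A=a, D).
    `candidates` maps each pronoun to a dict
    {antecedent: log P(A=a, pronoun)}.  Assuming pronouns are resolved
    independently, the joint max is the sum of per-pronoun maxima."""
    return sum(max(scores.values()) for scores in candidates.values())

# Hypothetical candidate antecedents and scores for two pronouns:
candidates = {"it":   {"document": -1.0, "talk": -2.5},
              "they": {"models": -0.7, "sentences": -3.0}}
print(pronoun_logprob(candidates))  # -1.7
```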

Page 18:

The Probability of an Antecedent, and of the Pronoun Given the Antecedent

$P_p(A = a, S_i \mid S_{i-1}, S_{i-2}) = P(A = a \mid \mathrm{dist}(a), \mathrm{mentions}(a))\; P(\mathrm{gender}(\mathit{pronoun}) \mid a)\; P(\mathrm{number}(\mathit{pronoun}) \mid a)$

Probability that the antecedent is a given how far away a is, and how often it has been mentioned

Probability of the pronoun gender given the antecedent.

Probability of the pronoun number given the antecedent.
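A toy sketch of this three-way factorization (the lookup tables and names are mine; the 0.25 and 0.998 entries echo the example probabilities the talk gives, the rest are made up):

```python
def antecedent_prob(a, pronoun, tables):
    """P(A=a, pronoun) = P(a | dist(a), mentions(a))
                         * P(gender(pronoun) | a)
                         * P(number(pronoun) | a)."""
    return (tables["position"][(a["dist"], a["mentions"])]
            * tables["gender"][(pronoun["gender"], a["head"])]
            * tables["number"][(pronoun["number"], a["head"])])

tables = {"position": {(1, 1): 0.25},
          "gender":   {("neuter", "asbestos"): 0.998},
          "number":   {("singular", "asbestos"): 0.96}}
a = {"head": "asbestos", "dist": 1, "mentions": 1}
p = {"gender": "neuter", "number": "singular"}
print(round(antecedent_prob(a, p, tables), 3))  # 0.24
```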

Page 19:

Example Pronoun Probabilities

P(ref = x | x is 1 back and appeared 1 time) = 0.25

If x is 1 back and appeared > 4 times, the probability is 0.86

P("asbestos" is neuter) = 0.998

P("alice" is feminine) = 0.84

P("it" has a plural antecedent) = 0.04

Page 20:

Model 2 Results

Model 1 82.2%

Model 2 71.3%

Model 1+2 85.3%

Page 21:

Pronoun Reference vs. Discourse Modeling

                              Best Gender Model   Weak Gender Model
Model 2 discourse accuracy         71.3%               66.7%
Pronoun-reference accuracy         79.1%               75.5%

Page 22:

Generative Models of Discourse

I Introduction

II Model 1 - Entity Grids

III Model 2 - Pronominal Reference

IV Model 3 – Noun-Phrase Syntax

V Real Problems (Future Work)

Page 23:

Abstract

Discourse, the study of how the meaning of a document is built out of the meanings of its sentences, is the inter-sentential analogue of semantics.  In this talk we consider the following abstract problem in discourse.  Given a document, randomly permute the order of the sentences and then attempt to distinguish the original from the permuted version.  We present a sequence of generative models that can handle the problem with increasing accuracy.  Each model accounts for some aspect of the document, and assigns a probability to the document's contents.  In the standard generative way the subsequent models simply multiply individual probabilities to get their results. We also discuss the linkage of this abstract task to more realistic ones such as essay grading, document summarization and document generation.

Page 24:

Distinctions Between First and Non-First Mentions

• The first mention of an entity tends to have more deeply embedded syntax,

• is longer at every level of embedding,

• uses the determiner "a" more often,

• and uses certain key words at different rates. E.g., most newspapers seem to follow the convention that "John Doe" will subsequently be referred to as "Mr. Doe".

Page 25:

Using This Information

• We assume that the first time a particular head noun occurs is the first mention, and all subsequent uses are non-first.

• We have a generative model of the noun-phrase syntax/key-words that should pick out the correct ordering.
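The first-mention assumption above reduces to a one-pass labeling of the document's head nouns, which can be sketched as:

```python
def label_mentions(head_nouns):
    """Label each head-noun mention 'first' the first time that noun
    occurs in the document and 'notfirst' on every later occurrence."""
    seen = set()
    labels = []
    for noun in head_nouns:
        labels.append("first" if noun not in seen else "notfirst")
        seen.add(noun)
    return labels

print(label_mentions(["document", "talk", "document"]))
# ['first', 'first', 'notfirst']
```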

Page 26:

Generative NP Syntax

$P_S(S_i \mid S_{1,i-1}) = \prod_{np \in S_i} P(np \mid l)$

$P(np \mid l) = P(h \mid l) \prod_{k=1}^{h} P(\mathrm{syntax}_k \mid k, l)$

$P(\mathrm{syntax}_k \mid k, l) = \prod_i P(s_i \mid s_{i-1}, k, l)$

l ∈ {first, notfirst}

h = height; the probability of larger h is higher for l = first

s is either a non-terminal or a key word

Page 27:

A Simple Example

NP

DET NOUN

the document

P(h = 1 | l) is high for l = notfirst

P(the | start, h = 1, l) is high for l = notfirst
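The example above can be sketched numerically (a one-level toy version of the NP model; the table entries are made up, chosen only so that non-first mentions prefer short, definite NPs like "the document"):

```python
import math

def np_logprob(np_, l, tables):
    """log P(np | l) = log P(h | l) + sum over symbols of
    log P(s | prev, h, l), for a single-level NP."""
    logp = math.log(tables["height"][(np_["h"], l)])
    prev = "start"
    for s in np_["symbols"]:
        logp += math.log(tables["emit"][(s, prev, np_["h"], l)])
        prev = s
    return logp

tables = {"height": {(1, "first"): 0.3, (1, "notfirst"): 0.7},
          "emit": {("the", "start", 1, "first"): 0.2,
                   ("the", "start", 1, "notfirst"): 0.5,
                   ("document", "the", 1, "first"): 0.1,
                   ("document", "the", 1, "notfirst"): 0.1}}
np_ = {"h": 1, "symbols": ["the", "document"]}
print(np_logprob(np_, "notfirst", tables) > np_logprob(np_, "first", tables))  # True
```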

Page 28:

Model 3 Results

Model 1 82.2%

Model 1+2 85.3%

Model 3 86.2%

Model 1+2+3 90.3%

Model 1+3 89.1%

Page 29:

Generative Models of Discourse

I Introduction

II Model 1 - Entity Grids

III Model 2 - Pronominal Reference

IV Model 3 – Noun-Phrase Syntax

V Real Problems (Future Work)

Page 30:

Future Models

• Next week: Probabilistic choice of pronoun/full-NP.

• Next month: Insert quotations. (Almost) never in first sentence. Usually clustered together.

• Next year: Temporal relations between sentences, relations between verbs, different kinds of descriptions.

Page 31:

Real Problems

• Given an abstract representation of what we know about the entities in the document, (really) generate the words for those entities

• Given the sentences of two documents, and the first sentence of one of them, pick out the rest of the sentences of that document.

• The same, but with 10 documents on (roughly) the same topic.


Page 32: