Shallow Semantics


Transcript of Shallow Semantics

Page 1: Shallow Semantics

Shallow Semantics

Page 2: Shallow Semantics


Semantics and Pragmatics

High-level Linguistics (the good stuff!)

Semantics: the study of meaning that can be determined from a sentence, phrase or word.

Pragmatics: the study of meaning as it depends on context (speaker, situation, dialogue history).

Page 3: Shallow Semantics


Language to (Simplistic) Logic

• John went to the book store. → go(John, store1)

• John bought a book. → buy(John, book1)

• John gave the book to Mary. → give(John, book1, Mary)

• Mary put the book on the table. → put(Mary, book1, on table1)
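A minimal sketch of this representation in Python (the predicate names and constants are the slide's; plain tuples stand in for logical terms):

```python
# A minimal sketch (not from the slides): each sentence's meaning is a
# predicate name applied to constant arguments, stored as plain tuples.
facts = [
    ("go",   ("John", "store1")),
    ("buy",  ("John", "book1")),
    ("give", ("John", "book1", "Mary")),
    ("put",  ("Mary", "book1", "on table1")),
]

for predicate, args in facts:
    print(f"{predicate}({', '.join(args)})")  # e.g. go(John, store1)
```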

Page 4: Shallow Semantics

What’s missing?

• Word sense disambiguation
• Quantification
• Coreference
• Interpreting within a phrase
• Many, many more issues …

• But it’s still more than you get from parsing!

Page 5: Shallow Semantics

Some problems in shallow semantics

1. Identifying entities
   – noun-phrase chunking
   – named-entity recognition
   – coreference resolution (involves discourse/pragmatics too)

2. Identifying relationship names
   – verb-phrase chunking
   – predicate identification (step 0 of semantic role labeling)
   – synonym resolution (e.g., get = receive)

3. Identifying arguments to predicates
   – information extraction
   – argument identification (step 1 of semantic role labeling)

4. Assigning semantic roles (step 2 of semantic role labeling)

5. Sentiment classification
   – That is, does the relationship express an opinion?
   – If so, is the opinion positive or negative?

Page 6: Shallow Semantics

1. Identifying Entities

Named Entity Tagging: Identify all the proper names in a text.

Sally [Person] went to see Up in the Air [Film] at the local theater.

Noun Phrase Chunking: Find all base noun phrases (that is, noun phrases that don't have smaller noun phrases nested inside them).

[Sally] went to see [Up in the Air] at [the local theater] on [Elm Street].
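As a concrete starting point, here is a hedged sketch using spaCy (the slides don't prescribe a tool, and a small general-purpose model may miss or mislabel the film title):

```python
# A hedged sketch using spaCy (the slides don't prescribe a tool).
# Requires: pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Sally went to see Up in the Air at the local theater on Elm Street.")

# Named entity tagging: proper names with their types.
# (A small model may mislabel or miss the film title.)
for ent in doc.ents:
    print(ent.text, ent.label_)

# Base noun phrase chunking: noun phrases with no smaller NPs inside them.
for chunk in doc.noun_chunks:
    print(chunk.text)
```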

Page 7: Shallow Semantics

1. Identifying Entities (2)

Parsing: Identify all phrase constituents, which will of course include all noun phrases.

[Parse tree for "Sally saw Up in the Air at the theater on Elm St.": S splits into NP and VP; the NP is "Sally" (N), and the VP contains the verb "saw", the NP "Up in the Air", the PP "at the theater" (P + NP), and the PP "on Elm St." (P + NP).]

Page 8: Shallow Semantics

1. Identifying Entities (3)

Coreference Resolution: Identify all references (aka ‘mentions’) of people, places and things in text, and determine which mentions are ‘co-referential’.

John stuck his foot in his mouth.

Page 9: Shallow Semantics

2. Identifying relationship names

Verb phrase chunking: the most common approach.

Some issues:
1. Often, prepositions/particles "belong" with the relation name:
   You're ticking me off.
2. Many relationships are expressed without a verb:
   Jack Welch, CEO of GE, …
3. Some verbs don't really express a meaningful relationship by themselves:
   Jim is the father of 12 boys.
4. Verb sense disambiguation
5. Synonymy:
   ticking off = bothering

Page 10: Shallow Semantics

2. Identifying relationship names (2)

Synonym Resolution: Discovery of Inference Rules from Text (DIRT) (Lin and Pantel, 2001)

1. They collect millions of (Subject, Verb, Object) triples by parsing a Web corpus.

2. For a pair of verbs, v1 and v2, they compute mutual information scores between
   – the vector space model (VSM) for the subjects of v1 and the VSM for the subjects of v2
   – the VSM for the objects of v1 and the VSM for the objects of v2

3. They cluster verbs with high MI scores between them.

[Figure: a DIRT-style cluster linking "give" and "donate", whose subject and object slots share fillers such as you, money, blood, gift, and life.]

See (Yates and Etzioni, JAIR 2009) for a more recent approach using probabilistic models.
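A toy sketch of the DIRT idea, with invented triples, raw counts, and cosine similarity standing in for DIRT's mutual-information statistics:

```python
# Toy sketch of DIRT-style synonym resolution. Assumptions: invented
# triples, raw counts, and cosine similarity in place of DIRT's
# mutual-information weighting.
import math
from collections import Counter

triples = [  # (verb, slot, filler) extracted by a parser in the real system
    ("give", "subj", "you"), ("give", "obj", "money"), ("give", "obj", "blood"),
    ("donate", "subj", "you"), ("donate", "obj", "money"), ("donate", "obj", "blood"),
    ("break", "subj", "hammer"), ("break", "obj", "window"),
]

def slot_vector(verb, slot):
    """Distribution of fillers seen in one argument slot of a verb."""
    return Counter(f for v, s, f in triples if v == verb and s == slot)

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a)
    norm = math.sqrt(sum(x * x for x in a.values())) * \
           math.sqrt(sum(x * x for x in b.values()))
    return dot / norm if norm else 0.0

# Verbs whose subject AND object distributions are both similar get clustered.
for v in ("donate", "break"):
    sim = min(cosine(slot_vector("give", s), slot_vector(v, s))
              for s in ("subj", "obj"))
    print("give ~", v, round(sim, 2))  # donate scores high, break scores 0
```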

Page 11: Shallow Semantics

5. Sentiment Classification

Given a review (about a movie, hotel, Amazon product, etc.), a sentiment classification system tries to determine what opinions are expressed in the review.

Coarse-level objective: is the review positive, negative, or neutral overall?

Fine-grained objective: what are the positive aspects (according to the reviewer), and what are the negative aspects?

Question: what technique(s) would you use to solve these two problems?
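For the coarse-level objective, one plausible baseline is a bag-of-ngrams classifier; a hedged scikit-learn sketch with invented toy data:

```python
# Hedged sketch of coarse-level sentiment classification with a
# bag-of-ngrams logistic regression (scikit-learn). Data is invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

reviews = ["great movie, loved every minute",
           "terrible plot and wooden acting",
           "the hotel room was spotless and quiet",
           "the product broke after two days"]
labels = ["positive", "negative", "positive", "negative"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(reviews, labels)
print(clf.predict(["the acting was great"]))
```

The fine-grained objective usually needs more structure, e.g., first finding the aspect phrases (the entity machinery above) and then classifying the sentiment expressed about each.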

Page 12: Shallow Semantics

Semantic Role Labeling

a.k.a. Shallow Semantic Parsing

Page 13: Shallow Semantics

Semantic Role Labeling

Semantic role labeling is the computational task of assigning semantic roles to phrases.

It's usually divided into three subtasks:
1. Predicate identification
2. Argument identification
3. Argument classification – assigning semantic roles

John broke the window with a hammer.

Tokens:    John    broke   the      window   with    a       hammer
BIO tags:  B-Arg   Pred    B-Arg    I-Arg    B-Arg   I-Arg   I-Arg
Roles:     Agent   (pred)  Patient  Patient  Means   Means   Means

(Means is also called instrument.)
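A small sketch (my framing, not the slides') of how those BIO tags decode into role-labeled argument spans:

```python
# Sketch: decoding the BIO tags from the example into argument spans.
tokens = ["John", "broke", "the", "window", "with", "a", "hammer"]
tags = ["B-Arg", "Pred", "B-Arg", "I-Arg", "B-Arg", "I-Arg", "I-Arg"]
roles = ["Agent", None, "Patient", "Patient", "Means", "Means", "Means"]

spans, current = [], None
for token, tag, role in zip(tokens, tags, roles):
    if tag == "B-Arg":                 # a new argument span starts
        current = (role, [token])
        spans.append(current)
    elif tag == "I-Arg" and current:   # continue the open span
        current[1].append(token)
    else:                              # the predicate, or outside material
        current = None

print([(role, " ".join(words)) for role, words in spans])
# [('Agent', 'John'), ('Patient', 'the window'), ('Means', 'with a hammer')]
```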

Page 14: Shallow Semantics


Same event - different sentences

John broke the window with a hammer.

John broke the window with the crack.

The hammer broke the window.

The window broke.

Page 15: Shallow Semantics


Same event - different syntactic frames

[John]SUBJ [broke]VERB [the window]OBJ [with a hammer]MODIFIER

[John]SUBJ [broke]VERB [the window]OBJ [with the crack]MODIFIER

[The hammer]SUBJ [broke]VERB [the window]OBJ

[The window]SUBJ [broke]VERB

Page 16: Shallow Semantics


Semantic role example

break(AGENT, INSTRUMENT, PATIENT)

[John]AGENT broke [the window]PATIENT [with a hammer]INSTRUMENT.

[The hammer]INSTRUMENT broke [the window]PATIENT.

[The window]PATIENT broke.

(Fillmore, 1968, "The Case for Case")

Page 17: Shallow Semantics


[John]AGENT/SUBJ broke [the window]PATIENT/OBJ [with a hammer]INSTRUMENT/MODIFIER.

[The hammer]INSTRUMENT/SUBJ broke [the window]PATIENT/OBJ.

[The window]PATIENT/SUBJ broke.

Page 18: Shallow Semantics

Semantic roles

Semantic roles (or just roles) are slots, belonging to a predicate, which arguments can fill.
– There are different naming conventions, but one common set of names for semantic roles is agent, patient, means/instrument, ….

Some constraints:
1. Only certain kinds of phrases can fill certain kinds of semantic roles:
   "with a crack" will never be an agent.
   But many phrases are ambiguous: is "hammer" a patient or an instrument?
2. Syntax provides a clue, but it is not the full answer:
   Subject → Agent? Patient? Instrument?

Page 19: Shallow Semantics

Slot Filling

Argument classification maps phrases to slots:

  broke → Pred
  John → Agent
  the window → Patient
  with a hammer → Means (or instrument)

Page 20: Shallow Semantics

Slot Filling

Argument classification maps phrases to slots:

  broke → Pred
  The hammer → Means (or instrument)
  the window → Patient
  (the Agent slot is unfilled)

Page 21: Shallow Semantics

Slot Filling

Argument classification maps phrases to slots:

  broke → Pred
  The window → Patient
  (the Agent and Means slots are unfilled)

Page 22: Shallow Semantics

Slot Filling and Shallow Semantics

Phrases map to slots:

  broke → Pred
  John → Agent
  the window → Patient
  with a hammer → Means (or instrument)

Shallow semantics:

  broke(John, the window, with a hammer)
        Agent  Patient     Means (or instrument)

Page 23: Shallow Semantics

Slot Filling and Shallow Semantics

Phrases map to slots:

  broke → Pred
  The window → Patient
  (the Agent and Means slots are unfilled)

Shallow semantics:

  broke( ?x , the window, ?y )

where ?x is the unfilled Agent and ?y the unfilled Means (or instrument).
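A minimal sketch of this last step: filled slots become predicate arguments, and unfilled slots surface as variables, as on the slide (the slot ordering is my assumption):

```python
# Sketch: turning filled slots into the shallow-semantics notation above;
# unfilled slots become variables (?x, ?y).
def shallow_semantics(pred, slots):
    variables = iter("xyz")
    args = [slots.get(role) or "?" + next(variables)
            for role in ("agent", "patient", "means")]
    return pred + "(" + ", ".join(args) + ")"

print(shallow_semantics("broke", {"agent": "John", "patient": "the window",
                                  "means": "with a hammer"}))
# broke(John, the window, with a hammer)
print(shallow_semantics("broke", {"patient": "the window"}))
# broke(?x, the window, ?y)
```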

Page 24: Shallow Semantics

Semantic Role Labeling Techniques

Page 25: Shallow Semantics

Semantic Role Labeling Techniques

We'll cover 3 approaches to SRL:

1. Basic (Gildea and Jurafsky, Comp. Ling. 2002)

2. Joint inference for argument structure (Toutanova et al., Comp. Ling. 2008)

3. Open-domain (Huang and Yates, ACL 2010)

Page 26: Shallow Semantics

1. Gildea and Jurafsky

Main idea: start with a parse tree, and try to identify the constituents that are arguments.

Page 27: Shallow Semantics

G&J (1)

Build a (probabilistic) classifier for predicting, for each constituent, which role it fills. It is essentially a maximum-entropy classifier, although it's not described that way.

Features for Argument Classification:
1. Phrase type of constituent
2. Governing category of NPs – S or VP (differentiates between subjects and objects)
3. Position w.r.t. predicate (before or after)
4. Voice of predicate (active or passive verb)
5. Head word of constituent
6. Parse tree path between predicate and constituent
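A hedged sketch of such a classifier: logistic regression (a maximum-entropy model) over one-hot encodings of these categorical features; the training examples are invented:

```python
# Hedged sketch: a max-ent (logistic regression) argument classifier over
# G&J-style categorical features. The training examples are invented.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

X = [{"phrase_type": "NP", "gov_cat": "S", "position": "before",
      "voice": "active", "head": "John", "path": "VB ↑ VP ↑ S ↓ NP"},
     {"phrase_type": "NP", "gov_cat": "VP", "position": "after",
      "voice": "active", "head": "window", "path": "VB ↑ VP ↓ NP"}]
y = ["Agent", "Patient"]

clf = make_pipeline(DictVectorizer(), LogisticRegression())
clf.fit(X, y)
print(clf.predict([{"phrase_type": "NP", "gov_cat": "S", "position": "before",
                    "voice": "active", "head": "Mary",
                    "path": "VB ↑ VP ↑ S ↓ NP"}]))
```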

Page 28: Shallow Semantics

G&J (2) – Parse Tree Path Feature

The parse tree path (or just path) feature captures the syntactic relationship between the predicate and the current constituent. For the predicate and the subject NP of a simple transitive sentence, the path feature is:

VB ↑ VP ↑ S ↓ NP

(from the predicate's VB node, up through VP to S, then down to the subject NP).
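A sketch of how the path can be computed from an NLTK-style constituency tree (my implementation; it starts from the predicate's part-of-speech node, and the example tree is invented):

```python
# Sketch: computing the G&J path feature on an NLTK constituency tree.
# Assumptions: the path starts at the predicate's POS node; tree invented.
from nltk import Tree

def path_feature(tree, pred_leaf_index, constituent_pos):
    pred_pos = tree.leaf_treeposition(pred_leaf_index)[:-1]  # POS node
    # Lowest common ancestor = longest shared prefix of the two positions.
    k = 0
    while (k < min(len(pred_pos), len(constituent_pos))
           and pred_pos[k] == constituent_pos[k]):
        k += 1
    up = [tree[pred_pos[:i]].label() for i in range(len(pred_pos), k - 1, -1)]
    down = [tree[constituent_pos[:i]].label()
            for i in range(k + 1, len(constituent_pos) + 1)]
    return " ↑ ".join(up) + "".join(" ↓ " + label for label in down)

t = Tree.fromstring(
    "(S (NP (NNP John)) (VP (VBD broke) (NP (DT the) (NN window))))")
print(path_feature(t, 1, (0,)))  # VBD ↑ VP ↑ S ↓ NP
```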

Page 29: Shallow Semantics

G&J (3)

There are 4,086 possible values of the Path feature in the training data. A sparse feature!

Page 30: Shallow Semantics

G&J (4)

Build a (probabilistic) classifier for predicting, for each constituent, whether it is an argument of the predicate at all. Again, it is essentially a maximum-entropy classifier, although it's not described that way.

Features for Argument Identification:
1. Predicate word
2. Head word of constituent
3. Parse tree path between predicate and constituent

Page 31: Shallow Semantics

G&J (5): Results

Task                             Best Result
Argument identification (only)   92% precision, 86% recall, 0.89 F1
Argument classification (only)   78.5% assigned the correct role

Page 32: Shallow Semantics

2. Toutanova, Haghighi, and Manning

A Global Joint Model for SRL (Comp. Ling., 2008)

Main idea(s):
– Include features that depend on multiple arguments
– Use multiple parsers as input, for robustness

Page 33: Shallow Semantics

THM (1): Motivation

1. "The day that the ogre cooked the children is still remembered."
2. "The meal that the ogre cooked the children is still remembered."

Both sentences have identical syntax. They differ in only one word (day vs. meal).

If we classify arguments one at a time, "the children" will be labeled the same thing in both cases.

But in (1), "the children" is the Patient (the thing being cooked), while in (2), "the children" is the Beneficiary (the people for whom the cooking is done).

Intuitively, we can’t classify these arguments independently.

Page 34: Shallow Semantics

THM (2): Features

1. Whole label sequence
   a. [voice:active, Arg1, pred, Arg4, ArgM-TMP]
   b. [voice:active, lemma:accelerated, Arg1, pred, Arg4, ArgM-TMP]
   c. [voice:active, lemma:accelerated, Arg1, pred, Arg4] (no adjuncts)
   d. [voice:active, lemma:accelerated, Arg, pred, Arg] (no adjuncts, no #s)

2. Syntax and semantics in the label sequence
   a. [voice:active, NP-Arg1, pred, PP-Arg4]
   b. [voice:active, lemma:accelerated, NP-Arg1, pred, PP-Arg4]

3. Repetition features: whether Arg1 (for example) appears multiple times
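A sketch of how the whole-label-sequence feature templates might be generated for one candidate frame (the string encoding is my assumption; the frame matches the slide's example):

```python
# Sketch: generating THM-style whole-label-sequence features for one
# candidate frame. The string encoding is an assumption, not THM's code.
def label_sequence_features(voice, lemma, frame):
    """frame: list of (phrase type or None, role) pairs, in sentence order."""
    roles = [role for _, role in frame]
    core = [r for r in roles if not r.startswith("ArgM")]   # drop adjuncts
    syn_sem = [f"{pt}-{r}" if pt else r for pt, r in frame]
    return [
        f"voice:{voice}|" + "|".join(roles),
        f"voice:{voice}|lemma:{lemma}|" + "|".join(roles),
        f"voice:{voice}|lemma:{lemma}|" + "|".join(core),
        f"voice:{voice}|" + "|".join(syn_sem),
    ]

frame = [("NP", "Arg1"), (None, "pred"), ("PP", "Arg4"), (None, "ArgM-TMP")]
for feat in label_sequence_features("active", "accelerated", frame):
    print(feat)
# voice:active|Arg1|pred|Arg4|ArgM-TMP
# voice:active|lemma:accelerated|Arg1|pred|Arg4|ArgM-TMP
# voice:active|lemma:accelerated|Arg1|pred|Arg4
# voice:active|NP-Arg1|pred|PP-Arg4|ArgM-TMP
```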

Page 35: Shallow Semantics

THM (3): Classifier

• First, for each sentence, obtain the 10 most likely parse tree / semantic role label outputs from the G&J-style local model

• Build a max-ent classifier to select from these 10, using the features above

• Also, include the top 10 parses from the Charniak parser
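Schematically, the joint step is a reranker: score each candidate analysis with a linear model over the joint features and keep the argmax. A minimal sketch with invented weights:

```python
# Sketch of the reranking step: pick the highest-scoring candidate
# analysis under a linear model over joint features. Weights invented.
def rerank(candidates, featurize, weights):
    def score(candidate):
        return sum(weights.get(f, 0.0) for f in featurize(candidate))
    return max(candidates, key=score)

# Toy usage: two candidate role sequences for one sentence; the repeated
# Arg1 in the second candidate is penalized by a repetition-style feature.
candidates = [("Arg1", "pred", "Arg4"), ("Arg1", "pred", "Arg1")]
weights = {"Arg1|pred|Arg4": 1.2, "Arg1|pred|Arg1": -0.7}
print(rerank(candidates, lambda c: ["|".join(c)], weights))
# ('Arg1', 'pred', 'Arg4')
```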

Page 36: Shallow Semantics

THM (4): Results

These are on a different data set from G&J, so the results are not directly comparable. But the local model is similar to G&J, so think of that as the comparison.

Model                  WSJ (ID & CLS)   Brown (ID & CLS)
Local                  78.00            65.55
Joint (1 parse)        79.71            67.79
Joint (top 5 parses)   80.32            68.81

Results show F1 scores for IDentification and CLaSsification of arguments together. WSJ is the Wall Street Journal test set, a collection of approximately 4,000 news sentences. Brown is a smaller collection of fiction stories. The system is trained on a separate set of WSJ sentences.

Page 37: Shallow Semantics

3. Huang and Yates

Open-Domain SRL by Modeling Word Spans (ACL 2010)

Main idea: One of the biggest problems for SRL systems is that they need lexical features to classify arguments, but lexical features are sparse.

We build a simple SRL system that outperforms the previous state of the art on out-of-domain data, by learning new lexical representations.

Page 38: Shallow Semantics

Simple, open-domain SRL

Baseline features:

                       Chris        broke   the    window   with    a      hammer
POS tag                Proper Noun  Verb    Det.   Noun     Prep.   Det.   Noun
Chunk tag              B-NP         B-VP    B-NP   I-NP     B-PP    B-NP   I-NP
Dist. from predicate   -1           0       +1     +2       +3      +4     +5

SRL labels: Breaker (Chris), Pred (broke), Thing Broken (the window), Means (with a hammer)

Page 39: Shallow Semantics

Simple, open-domain SRL

Baseline + HMM features: the same table as above, plus one new row of HMM labels, i.e., each word's latent state from an HMM trained on unlabeled text. [The HMM label values shown on the slide are not preserved in the transcript.]

SRL labels: Breaker (Chris), Pred (broke), Thing Broken (the window), Means (with a hammer)

Page 40: Shallow Semantics

The importance of paths

Chris [predicate broke] [thing broken a hammer].

Chris [predicate broke] a window with [means a hammer].

Chris [predicate broke] the desk, so she fetched [not an arg a hammer] and nails.

The same phrase, "a hammer," plays a different role in each sentence; the words along the path from the predicate signal which.

Page 41: Shallow Semantics

Simple, open-domain SRL

Baseline + HMM + Paths. The new word-path feature lists the words strictly between the predicate and each token:

            Chris   broke   the    window   with         a                  hammer
Word path   None    None    None   the      the-window   the-window-with    the-window-with-a

SRL labels: Breaker (Chris), Pred (broke), Thing Broken (the window), Means (with a hammer)

Page 42: Shallow Semantics

Simple, open-domain SRL

Baseline + HMM + Paths, adding a POS-path row:

            Chris   broke   the    window   with         a                  hammer
Word path   None    None    None   the      the-window   the-window-with    the-window-with-a
POS path    None    None    None   Det      Det-Noun     Det-Noun-Prep      Det-Noun-Prep-Det

SRL labels: Breaker (Chris), Pred (broke), Thing Broken (the window), Means (with a hammer)

Page 43: Shallow Semantics

Simple, open-domain SRL

Baseline + HMM + Paths, adding an HMM-path row (the same paths computed over HMM states rather than words or POS tags):

            Chris   broke   the    window   with         a                  hammer
Word path   None    None    None   the      the-window   the-window-with    the-window-with-a
POS path    None    None    None   Det      Det-Noun     Det-Noun-Prep      Det-Noun-Prep-Det
HMM path    None    None    None   [HMM-state paths; values not preserved in the transcript]

SRL labels: Breaker (Chris), Pred (broke), Thing Broken (the window), Means (with a hammer)
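A sketch of the word-path feature as reconstructed above (my implementation); the same computation over POS tags or HMM states yields the POS-path and HMM-path features:

```python
# Sketch: the word-path feature from the table above. Running the same
# function over POS tags or HMM states gives the POS- and HMM-path rows.
def path(items, pred_idx, tok_idx):
    """Items strictly between the predicate and the token, joined by '-';
    None when nothing lies between them."""
    lo, hi = sorted((pred_idx, tok_idx))
    between = items[lo + 1:hi]
    return "-".join(between) if between else None

tokens = ["Chris", "broke", "the", "window", "with", "a", "hammer"]
for i, tok in enumerate(tokens):
    print(tok, path(tokens, 1, i))
# Chris None ... window the ... hammer the-window-with-a
```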

Page 44: Shallow Semantics

Experimental results – F1

All systems were trained on newswire text from the Wall Street Journal (WSJ), and tested on WSJ and fiction texts from the Brown corpus (Brown). [F1 results chart not preserved in the transcript.]


Page 46: Shallow Semantics

Span-HMMs

Page 47: Shallow Semantics

Span-HMM features

Chris broke the window with a hammer

[Figure: the span-HMM for "hammer", run over the span of words between the predicate and "hammer"; the resulting feature values are not preserved in the transcript.]

SRL labels: Breaker (Chris), Pred (broke), Thing Broken (the window), Means (with a hammer)


Page 49: Shallow Semantics

Span-HMM features

Chris broke the window with a hammer

[Figure: the span-HMM for "a", run over the span of words between the predicate and "a"; the resulting feature values are not preserved in the transcript.]

SRL labels: Breaker (Chris), Pred (broke), Thing Broken (the window), Means (with a hammer)


Page 51: Shallow Semantics

Span-HMM features

Chris broke the window with a hammer

Span-HMM feature: None for the first three tokens (Chris, broke, the), as with the path features; the remaining values are not preserved in the transcript.

SRL labels: Breaker (Chris), Pred (broke), Thing Broken (the window), Means (with a hammer)

Page 52: Shallow Semantics

Experimental results – SRL F1

All systems were trained on newswire text from the Wall Street Journal (WSJ), and tested on WSJ and fiction texts from the Brown corpus (Brown).

Page 53: Shallow Semantics

Experimental results – feature sparsity

Page 54: Shallow Semantics

Benefit grows with distance from predicate