Language Technologies Institute School of Computer Science Carnegie Mellon University, USA

Page 1: Diversifiable Bootstrapping for Acquiring High-Coverage Paraphrase Resource

Hideki Shima, Teruko Mitamura

Language Technologies Institute, School of Computer Science, Carnegie Mellon University, USA

LREC 2012, May 24th, 2012

Pages 2-8: Can a machine recognize similarity in meaning?

John killed Mary.
Mary was killed by John. (passivization)
John is the killer of Mary. (nominalization)
John assassinated Mary. (entailment)
John is the 187 suspect of Mary. (slang)
John terminated Mary with extreme prejudice. (euphemism)

187 means: “California penal code for murder, made popular in west coast gangsta rap”. – From The Urban Dictionary dot com

Usage: “This is Gavilan. In pursuit of possible 187 suspects.” – From the movie, Hollywood Homicide

“In military and other covert operations, terminate with extreme prejudice is a euphemism for execution” – Wikipedia

Humans use various expressions to convey the same or similar meaning, which makes it difficult for machines to “read” text.

Page 9: Can a machine recognize similarity in meaning?

X killed Y.
Y was killed by X. (passivization)
X is the killer of Y. (nominalization)
X assassinated Y. (entailment)
X is the 187 suspect of Y. (slang)
X terminated Y with extreme prejudice. (euphemism)

Goal: automatically acquire paraphrase patterns that are lexically diverse

Page 10: Paraphrase Recognition / Generation is a common need in various applications

Automatic Evaluation
– In Machine Translation [Kauchak & Barzilay, 2006][Padó et al., 2009]
– In Text Summarization [Zhou et al., 2006]
– In Question Answering [Ibrahim et al., 2003][Dalmas, 2007]

Text Summarization [Lloret et al., 2008][Tatar et al., 2009]

Information Retrieval [Parapar et al., 2005][Riezler et al., 2007]

Information Extraction [Romano et al., 2006]

Question Answering [Harabagiu & Hickl, 2006][Dogdan et al., 2008]

Collocation Error Correction [Dahlmeier and Ng, 2011]

Page 11: Outline

Motivation
Method: Diversifiable Bootstrapping
Experiment
Related Works
Conclusion

Page 12: Bootstrap Paraphrase Learning

INPUT: monolingual plain corpus, seed instances

BOOTSTRAP LEARNING ALGORITHM

OUTPUT: more instances, patterns

Page 13: Bootstrap Paraphrase Learning

INPUT: monolingual plain corpus, seed instances

BOOTSTRAP LEARNING ALGORITHM

OUTPUT: more instances, patterns

Seed instances:

X (killer) | Y (victim)
John Wilkes Booth | Abraham Lincoln
Mark David Chapman | John Lennon
Nathuram Godse | Mahatma Gandhi
Yigal Amir | Yitzhak Rabin
John Bellingham | Spencer Perceval
Mohammed Bouyeri | Theo van Gogh
Dan White | Mayor George Moscone
Sirhan Sirhan | Robert F. Kennedy
El Sayyid Nosair | Meir Kahane
Mijailo Mijailovic | Anna Lindh

Page 14: Bootstrap Paraphrase Learning

INPUT: monolingual plain corpus, seed instances

OUTPUT: more instances, patterns

Patterns:

X, the assassin of Y
assassination of Y by X
X assassinated Y
the assassination of Y by X
of X, the assassin of Y
X assassinated Y in
...

Unlike many other bootstrapping works, the goal is to acquire patterns, not instances

Page 16: Bootstrap Learning Algorithm

1st iteration: Seed Instances → Sentences → Extracted Patterns → Ranked Patterns → Sentences → Extracted Instances → Ranked Instances

2nd iteration: the ranked instances feed the next round, and so on.

This framework is based on ESPRESSO [Pantel & Pennacchiotti, 2006]

Page 17: Bootstrap Learning Algorithm

Step: Search sentences by instances (Seed Instances → Sentences)

Edwin Booth was brother of John Wilkes Booth, the assassin of Abraham Lincoln.

John Wilkes Booth, the assassin of Abraham Lincoln, was inspired by Brutus.

In 1969 Berman was part of the defense team of Sirhan Sirhan, the assassin of Robert F. Kennedy.

...

Page 18: Bootstrap Learning Algorithm

Step: Search sentences by instances (Seed Instances → Sentences), with instances replaced by the slots X and Y

Edwin Booth was brother of X, the assassin of Y.

X, the assassin of Y, was inspired by Brutus.

In 1969 Berman was part of the defense team of X, the assassin of Y.

...
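The slot substitution shown on this slide can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function name `generalize` is hypothetical, and plain string replacement is assumed to be good enough for the example sentences.

```python
def generalize(sentence, x, y):
    """Replace one (x, y) instance pair with the slot variables X and Y.

    Returns None when the sentence does not mention both members of the pair.
    Plain substring replacement is an assumption made for this sketch.
    """
    if x not in sentence or y not in sentence:
        return None
    return sentence.replace(x, "X").replace(y, "Y")


sentence = ("Edwin Booth was brother of John Wilkes Booth, "
            "the assassin of Abraham Lincoln.")
generalized = generalize(sentence, "John Wilkes Booth", "Abraham Lincoln")
```

A sentence that mentions only one member of the pair is skipped, since it gives no evidence about how X and Y are related.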

Pages 19-20: Bootstrap Learning Algorithm

Step: Extract patterns from sentences (Sentences → Extracted Patterns)

… brother of X, the assassin of Y.

X, the assassin of Y, was …

… team of X, the assassin of Y.

Extracted Pattern: Longest Common Substring among retrieved sentences
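A minimal sketch of the longest-common-substring idea, assuming character-level matching over slot-substituted sentences. The function name and the brute-force search are illustrative choices, not taken from the talk.

```python
def longest_common_substring(strings):
    """Return the longest substring shared by every string in the list.

    Brute force: try every substring of the shortest string, longest first.
    This is fine for a handful of short sentences but not for a large corpus.
    """
    shortest = min(strings, key=len)
    for length in range(len(shortest), 0, -1):
        for start in range(len(shortest) - length + 1):
            candidate = shortest[start:start + length]
            if all(candidate in s for s in strings):
                return candidate
    return ""


sentences = [
    "Edwin Booth was brother of X, the assassin of Y.",
    "X, the assassin of Y, was inspired by Brutus.",
    "In 1969 Berman was part of the defense team of X, the assassin of Y.",
]
pattern = longest_common_substring(sentences)
```

Applied to only the first and third sentence, the same function returns a longer string that begins with "of X", which is how variants such as "of X, the assassin of Y" can arise.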

Page 21: Bootstrap Learning Algorithm

Step: Score and rank patterns (Extracted Patterns → Ranked Patterns)

Rank by reliability of pattern: r(p).

r(p) is based on an association measure with each instance in the corpus.
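The slide does not give the formula for r(p). As a rough sketch, the version below assumes an ESPRESSO-style definition: the average, over instances, of the pointwise mutual information (PMI) between instance and pattern, normalized by the maximum PMI and weighted by instance reliability. All names and counts are hypothetical toy values.

```python
import math


def pmi(n_ip, n_i, n_p, total):
    """Pointwise mutual information of an instance and a pattern from counts."""
    return math.log((n_ip * total) / (n_i * n_p))


def pattern_reliability(pattern, cooc, inst_count, pat_count, total, inst_rel):
    """Average over instances of (pmi / max_pmi) * r(i); unseen pairs add 0."""
    scores = {
        (i, p): pmi(n, inst_count[i], pat_count[p], total)
        for (i, p), n in cooc.items() if n > 0
    }
    max_pmi = max(scores.values())
    weighted = sum(
        scores.get((i, pattern), 0.0) / max_pmi * r_i
        for i, r_i in inst_rel.items()
    )
    return weighted / len(inst_rel)


# Toy counts, invented for illustration only.
cooc = {("A", "p1"): 4, ("B", "p1"): 4, ("A", "p2"): 1}
inst_count = {"A": 5, "B": 4}
pat_count = {"p1": 8, "p2": 10}
inst_rel = {"A": 1.0, "B": 1.0}  # assumption: seed instances have reliability 1

r_p1 = pattern_reliability("p1", cooc, inst_count, pat_count, 100, inst_rel)
r_p2 = pattern_reliability("p2", cooc, inst_count, pat_count, 100, inst_rel)
```

In this toy setting p1 co-occurs with both instances and rarely with anything else, so it scores higher than p2, which matches one instance once and is frequent elsewhere.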

Page 22: Bootstrap Learning Algorithm

Step: Score and rank patterns (Extracted Patterns → Ranked Patterns)

1. 0.422 X, the assassin of Y
2. 0.324 assassination of Y by X
3. 0.312 X assassinated Y
4. 0.231 the assassination of Y by X
5. 0.208 of X, the assassin of Y
...

Page 23: Bootstrap Learning Algorithm

Step: Search sentences by pattern(s) (Ranked Patterns → Sentences)

Still shot from the CCTV video footage showing Oguen Samast, the assassin of Hrant Dink.

Henry Bellingham is a descendant of John Bellingham, the assassin of Spencer Perceval.

Page 24: Bootstrap Learning Algorithm

Step: Extract instances from sentences (Sentences → Extracted Instances)

Still shot from the CCTV video footage showing Oguen Samast, the assassin of Hrant Dink.

Henry Bellingham is a descendant of John Bellingham, the assassin of Spencer Perceval.

With the pattern "X, the assassin of Y", these sentences yield the instances (Oguen Samast, Hrant Dink) and (John Bellingham, Spencer Perceval).
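Instance extraction can be sketched by compiling a surface pattern into a regular expression. This is an illustrative assumption rather than the method used in the talk: the `NAME` expression (a run of capitalized tokens) and the function names are hypothetical, and a real system would more likely rely on a named-entity recognizer or chunker.

```python
import re

# Hypothetical approximation of a name: one or more capitalized tokens.
NAME = r"[A-Z][\w.]*(?: [A-Z][\w.]*)*"


def pattern_to_regex(pattern):
    """Compile a pattern such as 'X, the assassin of Y' into a regex
    with named groups X and Y (each slot is assumed to occur once)."""
    parts = re.split(r"\b([XY])\b", pattern)
    regex = ""
    for part in parts:
        if part in ("X", "Y"):
            regex += f"(?P<{part}>{NAME})"
        else:
            regex += re.escape(part)
    return re.compile(regex)


def extract_instances(pattern, sentences):
    """Return (x, y) pairs matched by the pattern in the given sentences."""
    regex = pattern_to_regex(pattern)
    found = []
    for sentence in sentences:
        for m in regex.finditer(sentence):
            # Drop a sentence-final period that the NAME expression may absorb.
            found.append((m.group("X").rstrip("."), m.group("Y").rstrip(".")))
    return found


sentences = [
    "Still shot from the CCTV video footage showing Oguen Samast, "
    "the assassin of Hrant Dink.",
    "Henry Bellingham is a descendant of John Bellingham, "
    "the assassin of Spencer Perceval.",
]
instances = extract_instances("X, the assassin of Y", sentences)
```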

Page 25: Bootstrap Learning Algorithm

Step: Score and rank instances (Extracted Instances → Ranked Instances)

Rank instances by reliability: r(i) (similar to pattern reliability scoring)

Page 26: Issue: Lack of Lexical Diversity

X, the assassin of Y
assassination of Y by X
X assassinated Y
the assassination of Y by X
of X, the assassin of Y
X assassinated Y in

Words participating in patterns are skewed

As a solution, we propose Diversifiable Bootstrapping

Pages 27-29: Diversifiable Bootstrapping

$r'(p) = \lambda \cdot r(p) + (1 - \lambda) \cdot \mathrm{diversity}(p)$

r(p): original reliability score of a pattern

diversity(p): how lexically different is this pattern from the other patterns originally ranked higher than it?

Interpolation parameter: $0 \le \lambda \le 1$

Key contribution: by tuning the parameter λ, one can control the degree to which the acquired patterns are diversified.
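The interpolation can be sketched as a re-ranking step. The diversity function used in the talk is based on Shima & Mitamura [2011] and is not spelled out on the slides, so the version below is a stand-in: it assumes diversity is one minus the highest Jaccard overlap of crude word prefixes with any higher-ranked pattern. The score for "X murdered Y" is invented for the example; the other three scores come from the ranked list on Page 22.

```python
import re


def stems(pattern, prefix_len=5):
    """Crude lexical signature (an assumption): lowercase 5-letter word
    prefixes, ignoring the slot variables X and Y."""
    words = re.findall(r"[A-Za-z']+", pattern)
    return {w.lower()[:prefix_len] for w in words if w not in ("X", "Y")}


def jaccard(a, b):
    return len(a & b) / len(a | b) if (a | b) else 1.0


def diversity(pattern, higher_ranked):
    """1 minus the highest overlap with any pattern originally ranked higher."""
    if not higher_ranked:
        return 1.0
    s = stems(pattern)
    return 1.0 - max(jaccard(s, stems(q)) for q in higher_ranked)


def rerank(scored, lam):
    """scored: list of (pattern, r(p)). Returns the list sorted by
    r'(p) = lam * r(p) + (1 - lam) * diversity(p)."""
    original = sorted(scored, key=lambda t: t[1], reverse=True)
    rescored = []
    for rank, (p, r) in enumerate(original):
        higher = [q for q, _ in original[:rank]]
        rescored.append((p, lam * r + (1 - lam) * diversity(p, higher)))
    return sorted(rescored, key=lambda t: t[1], reverse=True)


scored = [
    ("X, the assassin of Y", 0.422),
    ("assassination of Y by X", 0.324),
    ("X assassinated Y", 0.312),
    ("X murdered Y", 0.100),  # hypothetical score, for illustration only
]
no_diversification = [p for p, _ in rerank(scored, 1.0)]
diversified = [p for p, _ in rerank(scored, 0.3)]
```

With λ = 1 the original order is kept. With λ = 0.3, "X murdered Y" moves from last place to second because it shares no word prefix with the patterns ranked above it.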

Page 30: Experimental Settings

Bootstrapping Algorithm
– Based on ESPRESSO framework [Pantel & Pennacchiotti, 2006]
– Unlike ESPRESSO, we aim to obtain patterns, not instances

Lexical diversity scoring function
– Based on Shima & Mitamura [2011]

Seed instances: Schlaefer et al. [2006]

Corpus: English Wikipedia

Pages 31-33: Acquired Paraphrases: killed

λ = 1 (no diversification):
X, the assassin of Y
assassination of Y by X
X assassinated Y
the assassination of Y by X
of X, the assassin of Y
X assassinated Y in
X, the man who assassinated Y
Y's assassin, X
of Y's assassin X
of the assassination of Y by X
X shot and killed Y
Y was assassinated by X
named X assassinated Y
Y was shot by X
X to assassinate Y

λ = 0.7:
X, the assassin of Y
X assassinated Y
assassination of Y by X
Y was shot by X
X, who killed Y
the assassination of Y by X
X assassinated Y in
X tells his version of Y
X shoot Y
X murdered Y
Y's killer, X
Y, at the theatre after X
Y, push X to his breaking point
X to assassinate Y
of X, the assassin of Y

λ = 0.3:
X, the assassin of Y
X, who killed Y
Y was shot by X
X tells his version of Y
X shoot Y
X murdered Y
Y's killer, X
Y, at the theatre after X
Y, push X to his breaking point
X assassinated Y
assassination of Y by X
X to assassinate Y
X kills Y
of X shooting Y
X assassinated Y in

Page 34: Acquired Paraphrases: died-of

λ = 1:
X died of Y
X died of Y in
X died of Y on
X died of lung Y
X died of lung Y in
X died of lung Y on
X died of Y in the
X died of Y at
X died of stomach Y
X died of natural Y
X died of breast Y in
X died of a Y
X died of Y in his
X passed away from Y
X died of a Y in

λ = 0.7:
X died of Y in
X died of Y
X's death from Y
X passed away from Y
Y of X, news
Y of X, a former
that X was suffering from Y
the suspected Y of X
X to breast Y in
X was diagnosed with ovarian Y
X dies of Y
X was dying of Y
X died of lung Y
X died of Y on
X died of lung Y in

λ = 0.3:
X died of Y in
X's death from Y
X passed away from Y
Y of X, news
Y of X, a former
that X was suffering from Y
the suspected Y of X
X succumbed to lung Y
X to breast Y in
X was diagnosed with ovarian Y
X dies of Y
X was dying of Y
X died of Y
X's death from Y in
X died of lung Y

Page 35: Acquired Paraphrases: was-led-by

λ = 1:
Y came to power in X in
Y came to power in X
Y to power in X
Y came to power in X in the
when Y came to power in X in
when Y came to power in X
Y took power in X
Y rose to power in X
after Y came to power in X
Y became chancellor of X
Y came to power in X and
Y seized power in X
Y gained power in X
to power of Y in X
Y's rise to power in X

λ = 0.7:
Y came to power in X
Y to power in X
regime of Y in X
Y came to power in X in
Y to power in X in
Y became chancellor of X
the rise of Y in X
X's dictator Y
X's president Y
Y took control of X
Y, who ruled X
Y's success and X's saviour
Y declared that X had
X's leader Y
government of Y in X

λ = 0.3:
Y came to power in X in
regime of Y in X
X's dictator Y
Y became chancellor of X
X's president Y
the rise of Y in X
X's leader Y
Y, who ruled X
Y took control of X
government of Y in X
X, led by Y
quisling had visited Y in X
to flee X after Y
Y in X the year before
X, under the leadership of Y

Pages 36-37: Related Works – Use of Thesaurus

E.g., WordNet [Miller, 1995], FrameNet [Baker et al., 1998], Nomlex [Macleod et al., 1998], VerbNet [Kipper et al., 2006]

Synonyms of “lead (v)” in WordNet:

ID | Words | Definition
S1 | lead, take, direct, conduct, guide | take somebody somewhere
S2 | leave, result, lead | produce as a result or residue
...
S6 | run, go, pass, lead, extend | stretch out over a distance, space, time, or scope
...
S14 | moderate, chair, lead | preside over

WEAKNESS

Need WSD or contexts to avoid false-positives.

Page 38: Related Works – Paraphrase Acquisition

Alignment Approach
– Monolingual Comparable Corpus [Shinyama et al, 2002]
– Bilingual Parallel Corpus [Barzilay & McKeown, 2001][Bannard & Callison-Burch, 2005][Callison-Burch, 2008]

Distributional Approach
– Context as Vector Space [Pasca & Dienes, 2005][Bhagat & Ravichandran, 2008]
– Context as Surface Pattern [Lin & Pantel, 2001][Ravichandran & Hovy, 2002]

Page 39: Related Works – Paraphrase Acquisition

Paraphrases acquired by Metzler et al. [2011]

[Bannard & Callison-Burch, 2005] [Callison-Burch, 2008] [Bhagat & Ravichandran, 2008] [Pasca & Dienes, 2005]

murdered murdered killed in used
died dead killed , made
beaten death that killed involved
been killed deaths killed NN people found
are died killed NN born
lost victims killed by done
were killed killing were wounded in injured
kill been killed and wounding seen
have died dead , including taken
, hundreds released

Page 40: Differences from Related Works

Our work requires just a plain non-parallel corpus
– Language portability:
  • Good news for resource/tool-scarce languages
– There is potential to learn words used in a closed community (slang, technical terms, etc.) by providing a domain-specific corpus

Bootstrapping works iteratively with minimum supervision
– Less human effort is required compared to heavily supervised learning methods, or to relying on human domain experts to hand-craft patterns.

Page 41: Conclusion

We proposed Diversifiable Bootstrapping, which can acquire lexically diverse paraphrase patterns.

We gave initial experimental results on a few relations, which look promising.

As future work, we hope to conduct formal evaluations on a larger set of relations in different languages.

Page 42: Acknowledgment

We also gratefully acknowledge the support of Defense Advanced Research Projects Agency (DARPA) Machine Reading Program under Air Force Research Laboratory (AFRL) prime contract no. FA8750-09-C-0172. Any opinions, findings, and conclusion or recommendations expressed in this material are those of the authors and do not necessarily reflect the view of the DARPA, AFRL, or the US government.

This publication was made possible in part by a NPRP grant (No: 09-873-1-129) from the Qatar National Research Fund (a member of The Qatar Foundation). The statements made herein are solely the responsibility of the authors.

Page 43: Questions?