Paraphrase Detection Using Recursive Autoencoders CS224n Eric Huang Richard Socher, Jeffrey...

Post on 18-Dec-2015

Paraphrase Detection Using Recursive Autoencoders

CS224n

Eric Huang

Richard Socher, Jeffrey Pennington, Professor Andrew Ng

Paraphrase Detection

• Microsoft Research Paraphrase Corpus

• Sentence 1: Amrozi accused his brother, whom he called "the witness", of deliberately distorting his evidence.

• Sentence 2: Referring to him as only "the witness", Amrozi accused his brother of deliberately distorting his evidence.

• Class: 1 (true paraphrase)

Autoencoder

Recursive Autoencoder

Unsupervised Training

• 152,487 sentences from English Gigaword dataset

• Minimize the sum of reconstruction errors at all nodes
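
The objective above can be sketched as a recursive sum over a binary parse tree. This is only a minimal illustration, not the authors' implementation: the tanh encoder, the linear decoder, the squared-error loss, the random initialisation, and all names (`encode`, `total_error`, `random_params`) are assumptions, and the sketch only evaluates the objective rather than minimizing it.

```python
import math
import random

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def encode(W_e, b_e, c1, c2):
    """Parent vector p = tanh(W_e [c1; c2] + b_e) (tanh is an assumption)."""
    return [math.tanh(z + b) for z, b in zip(matvec(W_e, c1 + c2), b_e)]

def reconstruction_error(W_d, b_d, p, c1, c2):
    """Squared error between [c1; c2] and its linear reconstruction from p."""
    recon = [z + b for z, b in zip(matvec(W_d, p), b_d)]
    return sum((r - c) ** 2 for r, c in zip(recon, c1 + c2))

def total_error(tree, params):
    """Return (node vector, sum of reconstruction errors in the subtree).

    A tree is either a leaf (a list of floats) or a (left, right) tuple.
    """
    if isinstance(tree, list):
        return tree, 0.0  # leaves are given word vectors; nothing to reconstruct
    W_e, b_e, W_d, b_d = params
    left_vec, left_err = total_error(tree[0], params)
    right_vec, right_err = total_error(tree[1], params)
    parent = encode(W_e, b_e, left_vec, right_vec)
    err = reconstruction_error(W_d, b_d, parent, left_vec, right_vec)
    return parent, left_err + right_err + err

def random_params(d, seed=0):
    """Hypothetical small random initialisation, for illustration only."""
    rng = random.Random(seed)
    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return mat(d, 2 * d), [0.0] * d, mat(2 * d, d), [0.0] * (2 * d)

# Toy tree ((the, dog), sits) over made-up 2-d word vectors.
the, dog, sits = [0.1, 0.2], [0.3, -0.1], [-0.2, 0.4]
root_vec, objective = total_error(((the, dog), sits), random_params(2))
print(len(root_vec), objective >= 0.0)
```

Unsupervised training would adjust the four parameter arrays to reduce `objective` summed over all training sentences, for example by gradient descent.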

Nearest Neighbors

• the U.S.

  • a U.S., the second biggest U.S., the most experienced U.S.

• executive director

  • council director, general director, assistant director
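
A nearest-neighbor lookup over phrase vectors could look like the sketch below. The slides do not state the distance metric, so Euclidean distance is an assumption here, and the 2-d vectors in `toy` are made up purely for illustration (real phrase vectors would come from the trained recursive autoencoder).

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def nearest_neighbors(query, phrase_vectors, k=3):
    """Return the k phrases whose vectors lie closest to the query phrase."""
    q = phrase_vectors[query]
    scored = [(euclidean(q, vec), phrase)
              for phrase, vec in phrase_vectors.items() if phrase != query]
    return [phrase for _, phrase in sorted(scored)[:k]]

# Made-up vectors, not learned representations.
toy = {
    "executive director": [0.90, 0.10],
    "council director": [0.85, 0.15],
    "general director": [0.80, 0.05],
    "assistant director": [0.95, 0.20],
    "the U.S.": [0.10, 0.90],
    "a U.S.": [0.12, 0.88],
}

print(nearest_neighbors("executive director", toy, k=3))
```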

Aggregate Features

• 10 Settings

  • Top node

  • Avg/Min/Max of:

    • Leaf nodes

    • Non-Leaf nodes

    • All nodes
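
The ten settings (the top node, plus avg/min/max over each of three node groups) can be enumerated as in this sketch. Taking avg/min/max element-wise across the node vectors is an assumption, and the function and key names are hypothetical.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)

def elementwise(op, vectors):
    """Apply op to each dimension across a list of equal-length vectors."""
    return [op(column) for column in zip(*vectors)]

def aggregate_features(leaves, non_leaves, top):
    """Build the 10 aggregate feature vectors for one sentence tree."""
    groups = {
        "leaf": leaves,
        "non_leaf": non_leaves,
        "all": leaves + non_leaves,
    }
    features = {"top": list(top)}  # setting 1: the top node on its own
    for group_name, vectors in groups.items():
        for op_name, op in (("avg", mean), ("min", min), ("max", max)):
            features[f"{op_name}_{group_name}"] = elementwise(op, vectors)
    return features

# Made-up 2-d node vectors for a two-word sentence with a single parent node.
features = aggregate_features(
    leaves=[[1.0, 2.0], [3.0, 0.0]],
    non_leaves=[[2.0, 2.0]],
    top=[2.0, 2.0],
)
print(len(features))
```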

Similarity Matrix

         The     dog     sits
The      1       0.001   0.001
puppy    0.001   0.9     0.001
stays    0.001   0.001   0.5

Similarity Matrix

                   The     dog     sits    The dog   The dog sits
The                1       0.001   0.001   0.05      0.05
puppy              0.001   0.9     0.001   0.8       0.4
stays              0.001   0.001   0.5     0.001     0.4
The puppy          0.05    0.8     0.001   0.9       0.5
The puppy stays    0.05    0.4     0.4     0.5       0.8
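
A matrix like this, comparing every node (word or phrase) of one sentence with every node of the other, could be computed as sketched below. The slides do not say which similarity function produced the numbers; `exp(-distance)` is an assumption chosen only because it yields 1 for identical vectors and values near 0 for distant ones. The node vectors are made up.

```python
import math

def similarity(u, v):
    """Assumed similarity: exp of the negative Euclidean distance."""
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    return math.exp(-dist)

def similarity_matrix(nodes_a, nodes_b):
    """Rows index the nodes of sentence A, columns the nodes of sentence B."""
    return [[similarity(u, v) for v in nodes_b] for u in nodes_a]

# Made-up 2-d node vectors; real ones would be leaf and non-leaf tree nodes.
nodes_a = [[0.0, 0.0], [1.0, 0.0]]
nodes_b = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
matrix = similarity_matrix(nodes_a, nodes_b)
for row in matrix:
    print([round(value, 3) for value in row])
```

Because the two sentences generally have different numbers of nodes, the matrix is rectangular and its size varies from pair to pair.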

Results