Paraphrase Detection Using Recursive Autoencoders CS224n Eric Huang Richard Socher, Jeffrey Pennington, Professor Andrew Ng

Paraphrase Detection Using Recursive Autoencoders

CS224n

Eric Huang

Richard Socher, Jeffrey Pennington, Professor Andrew Ng

Paraphrase Detection

• Microsoft Research Paraphrase Corpus

• Sentence 1: Amrozi accused his brother, whom he called "the witness", of deliberately distorting his evidence.

• Sentence 2: Referring to him as only "the witness", Amrozi accused his brother of deliberately distorting his evidence.

• Class: 1 (true paraphrase)

Autoencoder

Recursive Autoencoder

Unsupervised Training

• 152,487 sentences from the English Gigaword dataset

• Minimize the sum of reconstruction errors at all nodes
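The objective on this slide can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the tanh encoder, the linear decoder, the squared-error measure, the tuple-based tree format, and all function and parameter names (`encode`, `decode`, `total_reconstruction_error`, `W_e`, `W_d`) are choices made for this sketch and are not stated in the slides.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def encode(W_e, b_e, left, right):
    """Assumed encoder: parent = tanh(W_e [left; right] + b_e)."""
    z = matvec(W_e, left + right)
    return [math.tanh(zi + bi) for zi, bi in zip(z, b_e)]


def decode(W_d, b_d, parent):
    """Assumed linear decoder: reconstructs [left'; right'] from the parent."""
    z = matvec(W_d, parent)
    return [zi + bi for zi, bi in zip(z, b_d)]


def total_reconstruction_error(tree, params):
    """Return (node vector, summed squared reconstruction error).

    A tree is either a leaf (a list of floats) or a tuple (left, right).
    Leaves contribute no error; every internal node adds the squared
    distance between its children and their reconstruction.
    """
    if isinstance(tree, list):
        return tree, 0.0
    W_e, b_e, W_d, b_d = params
    left_vec, left_err = total_reconstruction_error(tree[0], params)
    right_vec, right_err = total_reconstruction_error(tree[1], params)
    parent = encode(W_e, b_e, left_vec, right_vec)
    recon = decode(W_d, b_d, parent)
    children = left_vec + right_vec
    node_err = sum((c - r) ** 2 for c, r in zip(children, recon))
    return parent, left_err + right_err + node_err


# Example with random parameters for 2-dimensional word vectors (toy values).
rng = random.Random(0)
n = 2
example_params = (
    [[rng.uniform(-0.1, 0.1) for _ in range(2 * n)] for _ in range(n)],  # W_e
    [0.0] * n,                                                           # b_e
    [[rng.uniform(-0.1, 0.1) for _ in range(n)] for _ in range(2 * n)],  # W_d
    [0.0] * (2 * n),                                                     # b_d
)
example_tree = (([1.0, 0.0], [0.0, 1.0]), [0.5, 0.5])
_, example_error = total_reconstruction_error(example_tree, example_params)
print("summed reconstruction error:", example_error)
```

Training would adjust the four parameter arrays to reduce this sum over all sentences; the optimizer is left out of the sketch.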

Nearest Neighbors

• the U.S.
    • a U.S., the second biggest U.S., the most experienced U.S.

• executive director
    • council director, general director, assistant director
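A nearest-neighbor lookup of this kind can be sketched as follows. The slides do not say which distance was used, so Euclidean distance is an assumption here, and the 2-dimensional vectors are invented toy values, not vectors learned by the model. The names `euclidean`, `nearest_neighbors`, and `toy_vectors` are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbors(query, phrase_vectors, k=3):
    """Return up to k phrases whose vectors lie closest to the query phrase."""
    q = phrase_vectors[query]
    scored = [
        (euclidean(q, vec), phrase)
        for phrase, vec in phrase_vectors.items()
        if phrase != query
    ]
    return [phrase for _, phrase in sorted(scored)[:k]]


# Toy vectors, invented purely for illustration.
toy_vectors = {
    "the U.S.": [1.0, 0.0],
    "a U.S.": [0.9, 0.1],
    "executive director": [0.0, 1.0],
    "general director": [0.1, 0.9],
}
print(nearest_neighbors("the U.S.", toy_vectors, k=1))
```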

Aggregate Features

• 10 Settings
    • Top node
    • Avg/Min/Max of:
        • Leaf nodes
        • Non-leaf nodes
        • All nodes
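The count of ten follows from one top-node setting plus three statistics over three node sets (1 + 3 × 3). The sketch below enumerates them. The slide does not say what quantity is aggregated, so applying the statistics element-wise to node vectors is an assumption, as are the names `aggregate_settings` and `elementwise` and the choice to pass the top node inside the non-leaf list.

```python
def elementwise(fn, vectors):
    """Apply fn to each coordinate across a list of equal-length vectors."""
    return [fn(column) for column in zip(*vectors)]


def mean(values):
    return sum(values) / len(values)


def aggregate_settings(leaf_vecs, nonleaf_vecs, top_vec):
    """Build the ten aggregate settings as a dict of name -> vector.

    Assumes the caller includes the top node in nonleaf_vecs.
    """
    node_sets = {
        "leaf": leaf_vecs,
        "non-leaf": nonleaf_vecs,
        "all": leaf_vecs + nonleaf_vecs,
    }
    stats = {"avg": mean, "min": min, "max": max}
    settings = {"top": list(top_vec)}
    for set_name, vecs in node_sets.items():
        for stat_name, fn in stats.items():
            settings[f"{stat_name}/{set_name}"] = elementwise(fn, vecs)
    return settings


# Toy tree with two leaves and one internal (top) node; values are invented.
toy_settings = aggregate_settings(
    leaf_vecs=[[1.0, 2.0], [3.0, 4.0]],
    nonleaf_vecs=[[5.0, 6.0]],
    top_vec=[5.0, 6.0],
)
print(sorted(toy_settings))
```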

Similarity Matrix

         The     dog     sits
The      1       0.001   0.001
puppy    0.001   0.9     0.001
stays    0.001   0.001   0.5

Similarity Matrix

                  The     dog     sits    The dog   The dog sits
The               1       0.001   0.001   0.05      0.05
puppy             0.001   0.9     0.001   0.8       0.4
stays             0.001   0.001   0.5     0.001     0.4
The puppy         0.05    0.8     0.001   0.9       0.5
The puppy stays   0.05    0.4     0.4     0.5       0.8
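A matrix like this compares every node (word or phrase) of one sentence with every node of the other. The sketch below shows one way to fill it in. The slides do not give the similarity function, so `exp(-squared distance)` is an assumption, chosen only because it yields 1 for identical vectors and values near 0 for distant ones. The node vectors are invented toy values, and `similarity` and `similarity_matrix` are hypothetical names.

```python
import math


def similarity(a, b):
    """Assumed similarity: exp of the negative squared Euclidean distance."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-d2)


def similarity_matrix(row_nodes, col_nodes):
    """Similarity between every node of one sentence and every node of the other."""
    return [[similarity(r, c) for c in col_nodes] for r in row_nodes]


# Toy node vectors (words first, then phrases); values are invented.
sentence_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]   # e.g. two words and their parent
sentence_b = [[1.0, 0.0], [0.0, 0.8], [0.5, 0.4]]
matrix = similarity_matrix(sentence_b, sentence_a)
for row in matrix:
    print([round(value, 3) for value in row])
```

Including phrase nodes as extra rows and columns is what turns the 3 × 3 word-level matrix into the 5 × 5 matrix on the slide.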

Results