Final Presentation: Neural Network Doc Summarization

CS4624 Multimedia, Hypertext, and Information Access

Team: Junjie Cheng

Instructor: Dr. Edward A. Fox

Virginia Tech, Blacksburg VA 24061, Apr 30th, 2018

Outline

• Project Overview
• Data Preprocess
• Model Architecture
• Training
• Model Performance
• References and Acknowledgements

Project Overview

• Purpose: generate summaries of long documents through deep learning.
• Model: sequence-to-sequence model with RNNs.
• Dataset: CNN/Daily Mail news.

Data Preprocess

• Vocab size: 50000
• Input sequence max length: 400
• Target sequence max length: 100
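
As a rough illustration of how these three settings might be applied, the sketch below builds a frequency-capped vocabulary and truncates token sequences. It is not the project's code: the helper names `build_vocab` and `encode`, the whitespace-tokenized toy data, and the special tokens other than `<UNK>` (which appears in the generated summary later in the slides) are assumptions made for this example.

```python
from collections import Counter

# Settings taken from the slide above.
VOCAB_SIZE = 50000
MAX_INPUT_LEN = 400
MAX_TARGET_LEN = 100

# Hypothetical special tokens; only <UNK> appears in the slides.
PAD, UNK, SOS, EOS = "<PAD>", "<UNK>", "<SOS>", "<EOS>"


def build_vocab(token_lists, vocab_size=VOCAB_SIZE):
    """Map the most frequent tokens to ids, reserving slots for special tokens."""
    counts = Counter(tok for toks in token_lists for tok in toks)
    specials = [PAD, UNK, SOS, EOS]
    keep = [tok for tok, _ in counts.most_common(vocab_size - len(specials))]
    return {tok: idx for idx, tok in enumerate(specials + keep)}


def encode(tokens, vocab, max_len):
    """Truncate to max_len tokens and replace out-of-vocabulary tokens with <UNK>."""
    unk_id = vocab[UNK]
    return [vocab.get(tok, unk_id) for tok in tokens[:max_len]]


# Toy article/summary pair standing in for CNN/Daily Mail data.
articles = [["the", "match", "ended", "in", "a", "draw"]]
summaries = [["match", "ended", "in", "draw"]]
vocab = build_vocab(articles + summaries)
source_ids = encode(articles[0], vocab, MAX_INPUT_LEN)
target_ids = encode(summaries[0], vocab, MAX_TARGET_LEN)
```

Capping the vocabulary is what produces `<UNK>` tokens in generated text: any word outside the kept set is mapped to the same id.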

Model Architecture

Sequence to Sequence Model

Encoder Architecture

Encoder
• Shared embedding layer
• Bidirectional LSTM layer

Encoder Workflow

Embedding layer
• Embedded input sequence

LSTM layer
• Context
• Last hidden vector
• Last LSTM cell state

Decoder Architecture

Decoder
• Shared embedding layer
• LSTM layer
• MLP attention layer
• Dropout layer
• Out layer

Decoder Workflow

Embedding layer
• Embedded input sequence

LSTM layer
• Context

Attention layer
• Attention-applied context

Dropout layer
• Attention-applied context

Out layer
• Context with vocab size

Log softmax function
• Probability of each token in the vocab
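
The attention step in this workflow can be sketched in plain Python. The slides only name an "MLP attention layer", so the version below assumes it means additive scoring, where a small one-hidden-layer network scores each encoder output against the current decoder state. The function names, the weight matrices `w_enc`, `w_dec`, and `v`, and the toy sizes are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def mlp_attention(decoder_state, encoder_outputs, w_enc, w_dec, v):
    """Score each encoder output with a small MLP, then take a weighted average.

    Assumed scoring rule: score_i = v . tanh(w_enc @ h_i + w_dec @ s)
    """
    projected_state = matvec(w_dec, decoder_state)
    scores = []
    for h in encoder_outputs:
        hidden = [math.tanh(a + b) for a, b in zip(matvec(w_enc, h), projected_state)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))
    weights = softmax(scores)
    size = len(encoder_outputs[0])
    context = [sum(w * h[d] for w, h in zip(weights, encoder_outputs)) for d in range(size)]
    return context, weights


# Toy sizes (the slides use hidden size 256); all weights here are made up.
encoder_outputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [0.5, -0.5]
identity = [[1.0, 0.0], [0.0, 1.0]]
context, weights = mlp_attention(decoder_state, encoder_outputs, identity, identity, [1.0, 1.0])
```

The returned `context` corresponds to the "attention-applied context" above; in the workflow it would then pass through dropout and the out layer before the log softmax.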

Training Workflow

Load data

Train model

Compute loss

Backpropagation

Training Architecture

• Optimizer: SGD
• Criterion: NLLLoss
• Batch size: 3
• Epoch number: 100
• Loss: 6.7 → 1.4
• Learning rate: 1
• Hidden size: 256
• Word embedding size: 128
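
To make the loss and optimizer choices concrete, here is a minimal pure-Python sketch of a negative log-likelihood loss on log-softmax outputs with a plain SGD update. It trains only a toy linear output layer, not the full seq2seq model, and the function names, the three-token vocabulary, and the toy batch are assumptions for illustration. The batch size of 3 and learning rate of 1 come from the slide.

```python
import math

LEARNING_RATE = 1.0  # from the slide
BATCH_SIZE = 3       # from the slide


def log_softmax(logits):
    """Numerically stable log softmax over a list of floats."""
    peak = max(logits)
    log_total = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return [z - log_total for z in logits]


def nll_loss(log_probs, target):
    """Negative log-likelihood of the target token id."""
    return -log_probs[target]


def sgd_step(weights, batch, lr=LEARNING_RATE):
    """One SGD update of a linear output layer; returns the mean pre-update loss."""
    grads = [[0.0] * len(row) for row in weights]
    total_loss = 0.0
    for features, target in batch:
        logits = [sum(w * x for w, x in zip(row, features)) for row in weights]
        log_probs = log_softmax(logits)
        total_loss += nll_loss(log_probs, target)
        for k, row in enumerate(grads):
            # d(loss)/d(logit_k) = softmax_k - 1[k == target]
            delta = math.exp(log_probs[k]) - (1.0 if k == target else 0.0)
            for j, x in enumerate(features):
                row[j] += delta * x / len(batch)
    for row, grad_row in zip(weights, grads):
        for j, g in enumerate(grad_row):
            row[j] -= lr * g
    return total_loss / len(batch)


# Toy batch: (feature vector, target token id) pairs over a 3-token vocabulary.
batch = [([1.0, 0.0], 0), ([0.0, 1.0], 1), ([0.5, 0.5], 2)]
assert len(batch) == BATCH_SIZE
weights = [[0.0, 0.0] for _ in range(3)]
losses = [sgd_step(weights, batch) for _ in range(100)]
```

With zero-initialized weights the first loss equals ln(3), the loss of a uniform guess over three tokens; by the same reasoning a uniform guess over a 50000-token vocabulary would give ln(50000) ≈ 10.8.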

Model Performance

• Generated summary: “have beaten three of their last three league games . the <UNK> scored in the second half of the last minute . the win takes all three points to move ahead of champions league place”

• Human-produced summary: “two goals from lionel messi help barcelona to a 3-1 win over almeria . kaka bags brace as real madrid coast to 3-0 victory at athletic bilbao . inter milan move up to second place in serie a with 2-0 win over chievo .”

Acknowledgements

• Client: Yufeng Ma
• Mr. Ma is a PhD student at Virginia Tech. He served as the client for this project and guided it through all of its phases.

Reference

• Gokumohandas. Recurrent Neural Networks (RNN) – part 3: encoder-decoder. https://theneuralperspective.com/2016/11/20/recurrent-neural-networks-rnn-part-3-encoder-decoder/. Web. Accessed: March 26, 2018.