Deep Learning for Text Analytics - SAS Institute

Post on 19-Jun-2020

Copyright © SAS Institute Inc. All rights reserved.

Deep Learning for Text Analytics

SAS User Group Malaysia

12th April 2018

Using Deep Learning in Natural Language

SAS in a Chatbot

Natural Language Interaction

Natural Language Processing (NLP)

Natural Language Understanding (NLU)

Natural Language Generation (NLG)

Natural Language Interaction (NLI)

Natural Language Processing

NLP Layer (Natural Language Processing)

Knowledge Base (Source Content)

Data Storage (Interaction History & Analytics)

Recurrent Neural Network

Deep Learning

Recurrent Neural Network (RNN)

• Type of Neural Network

• Recurring over time

• Sequential data

  • Words

  • Time

• Common Methods

  • GRU (Gated Recurrent Unit)

  • LSTM (Long Short-Term Memory)

Output

Input
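
The idea of a network "recurring over time" on sequential data can be sketched in plain Python. This is only an illustration of a basic recurrent cell (the hidden state from the previous step is fed back in with each new input), not the GRU or LSTM variants listed on the slide and not SAS code. The function names and the weight values are assumptions made up for the example.

```python
import math

def rnn_step(x, h_prev, w_xh, w_hh, b_h):
    """One step of a basic recurrent cell: h_t = tanh(W_xh x_t + W_hh h_prev + b)."""
    h_new = []
    for i in range(len(h_prev)):
        total = b_h[i]
        total += sum(w_xh[i][j] * x[j] for j in range(len(x)))
        total += sum(w_hh[i][k] * h_prev[k] for k in range(len(h_prev)))
        h_new.append(math.tanh(total))
    return h_new

def run_rnn(sequence, w_xh, w_hh, b_h):
    """Feed a sequence through the cell one item at a time, carrying the state forward."""
    h = [0.0] * len(b_h)  # initial hidden state
    states = []
    for x in sequence:
        h = rnn_step(x, h, w_xh, w_hh, b_h)
        states.append(h)
    return states

# Illustrative (untrained) weights for 2 inputs and 2 hidden units.
W_XH = [[0.5, -0.3], [0.1, 0.8]]
W_HH = [[0.2, 0.0], [-0.4, 0.3]]
B_H = [0.0, 0.0]

# The same two inputs in opposite order.
seq_a = [[1.0, 0.0], [0.0, 1.0]]
seq_b = [[0.0, 1.0], [1.0, 0.0]]

final_a = run_rnn(seq_a, W_XH, W_HH, B_H)[-1]
final_b = run_rnn(seq_b, W_XH, W_HH, B_H)[-1]
print(final_a)
print(final_b)
```

Because the hidden state carries earlier inputs forward, the two orderings end in different final states, which is what makes this family of models suitable for words and time series.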

Recurrent Neural Network (RNN)

Word Vector

Unlabeled Corpus

The 15th American President

The 16th American President

The 17th American President

Alex reads this sentence

Alex read this sentence

Alex is reading this sentence

15th

17th

16th

read

reading

reads

Word Vector Algorithm

Words with similar context should have similar vectors
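
The slide does not say which word vector algorithm is used, so the following is only a stand-in: it counts the neighbouring words of each word in the slide's six-sentence corpus and compares the resulting count vectors with cosine similarity. The function names and the window size of 2 are assumptions for illustration; trained word embeddings work differently in detail but rest on the same "similar context" idea.

```python
import math
from collections import Counter, defaultdict

CORPUS = [
    "The 15th American President",
    "The 16th American President",
    "The 17th American President",
    "Alex reads this sentence",
    "Alex read this sentence",
    "Alex is reading this sentence",
]

def context_vectors(sentences, window=2):
    """Map each word to a count vector of the words seen within `window` positions."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

vectors = context_vectors(CORPUS)
sim_ordinals = cosine(vectors["15th"], vectors["16th"])    # identical contexts
sim_verbs = cosine(vectors["reads"], vectors["reading"])   # mostly shared contexts
sim_unrelated = cosine(vectors["15th"], vectors["reads"])  # no shared contexts
print(sim_ordinals, sim_verbs, sim_unrelated)
```

In this toy corpus the ordinals share exactly the same neighbours, the verb forms share most of them, and an ordinal and a verb share none, so the similarities fall in that order.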

Recurrent Neural Network (RNN)

Text Classification

Text Classification

Raw Text Document

Technology

Politics

Sport

Recurrent Neural Network (RNN)

Text Classification

Text Classification

The 16th American President

number

order

entity

context

Recurrent Neural Network (RNN)

Text Generation

Translating vector back to text

Convert text into vectorized input

RNN

Calculate vector weight

Vector representing a sentence based on the text

Use weight vector to refine model
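
The two outer steps of this pipeline, converting text into vectorized input and translating a vector back to text, can be sketched without any model in between. The example below uses one-hot vectors over a tiny vocabulary; this encoding, the function names, and the score values are assumptions for illustration, and the RNN that would sit between the two steps is deliberately left out.

```python
def build_vocab(sentences):
    """Sorted list of distinct lowercase words; the list index is the vector position."""
    return sorted({word for s in sentences for word in s.lower().split()})

def encode(sentence, vocab):
    """Text -> list of one-hot vectors, one vector per word."""
    index = {word: i for i, word in enumerate(vocab)}
    vectors = []
    for word in sentence.lower().split():
        vec = [0.0] * len(vocab)
        vec[index[word]] = 1.0
        vectors.append(vec)
    return vectors

def decode(vectors, vocab):
    """List of score vectors -> text, taking the highest-scoring word at each position."""
    words = []
    for vec in vectors:
        best = max(range(len(vec)), key=lambda i: vec[i])
        words.append(vocab[best])
    return " ".join(words)

vocab = build_vocab(["The 16th American President"])
encoded = encode("The 16th American President", vocab)
round_trip = decode(encoded, vocab)
print(round_trip)

# A generator would emit scores rather than exact one-hot vectors;
# decoding still picks the most likely word at each position.
noisy = decode([[0.1, 0.7, 0.1, 0.1]], vocab)
print(noisy)
```

Lowercasing is a simplification here, so the round trip returns the sentence in lower case.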

Text Generation

Text Structure – Word Order

Who is the 16th American President

The 16th President who is American

Text Structure

Word Order

Two random sentences from a corpus

• I don’t like this director but I like this movie

• I like this director but I don’t like this movie

• Specific words can be strong indicators

  • Sentiment – boring, exciting

  • Topic – deep learning, Siri

Positive sentiment

Negative sentiment
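
Why word order matters for these two sentences can be checked directly: they contain exactly the same words, so a representation that ignores order cannot tell them apart. The sketch below assumes the first sentence is the positive one and the second the negative one (sentiment toward the movie); the variable and function names are illustrative.

```python
from collections import Counter

POSITIVE = "I don't like this director but I like this movie"
NEGATIVE = "I like this director but I don't like this movie"

def bag_of_words(text):
    """Word counts with order discarded."""
    return Counter(text.lower().split())

# Same words with the same counts...
same_bag = bag_of_words(POSITIVE) == bag_of_words(NEGATIVE)
# ...but a different sequence.
same_sequence = POSITIVE.lower().split() == NEGATIVE.lower().split()

print(same_bag, same_sequence)
```

A model that reads the words in sequence, such as an RNN, sees two different inputs here, while a bag-of-words model sees only one.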

Creating, Training, Scoring an RNN

Using Deep Learning

Sample RNN Model

Loading the Action Sets

Sample RNN Model

The Dataset

Sample RNN Model

Text Classification Model

Sample RNN Model

Training the Model

Sample RNN Model

Scoring the Model

Sample RNN Model

Scoring the Model

Sample RNN Model

Text Generation Model

Sample RNN Model

Training the Model

Sample RNN Model

Scoring the Model

Sample RNN Model

Text Generation Output

Useful Links

• What’s New In SAS Deep Learning (Documentation)

http://go.documentation.sas.com/?docsetId=casdlpg&docsetTarget=n0gv3jjm5obouun1uvducbzl8nlf.htm&docsetVersion=8.2&locale=en

• Understanding Recurrent Neural Networks

http://karpathy.github.io/2015/05/21/rnn-effectiveness/

• RNN Simplified

https://www.youtube.com/watch?v=_aCuOwF1ZjU

sas.com

Thank You