Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Confidential + Proprietary Smart Reply: Learning a Model of Conversation from Data Anjuli Kannan Software Engineer, Google Brain

Transcript of Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Page 1: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Confidential + Proprietary

Smart Reply: Learning a Model of Conversation from Data

Anjuli Kannan

Software Engineer, Google Brain

Page 2: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Problem

Page 3: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Can you do Tuesday or Wednesday?

Phil Sharp

Page 4: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Tuesday Wednesday

Can you do Tuesday or Wednesday?

Phil Sharp

Page 5: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Smart Reply feature

● Provide text assistance for email reply composition

● Targeted at mobile

● Responses can be sent on their own or extended

Page 6: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Smart Reply feature predicts email responses

Incoming email → Smart Reply → Response email

Page 7: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Why is this task hard?

● Extracting meaning from the previous message

● Generating language

● Grammatical transformations between call and response

● Matching style/tone

Page 8: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Why is this solution interesting?

● Model is learned fully from data

Page 9: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Model

Page 10: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Neural network

Is a 4

Is a 5

...

...

Image: Wikipedia

Page 11: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Neural network

Neuron

Is a 4

Is a 5

Page 12: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Basic building block is the neuron

Greg Corrado

Page 13: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Neural network

Is a 4

Is a 5

...

...

Page 14: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Learn a function from one space to another

$y = f(x), \quad x \in \mathbb{R}^n, \quad y \in \mathbb{R}^m$
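As a rough illustration of a function from one space to another, the sketch below maps a vector with n entries to a vector with m entries using one layer of neurons. The tanh activation, the helper name `dense_layer`, and all parameter values are assumptions made for the example; the slides do not specify them, and in practice the weights would be learned from data.

```python
import math

def dense_layer(x, weights, biases):
    """Map x (length n) to y (length m): one tanh neuron per row of `weights`."""
    return [
        math.tanh(sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b)
        for row, b in zip(weights, biases)
    ]

# Hypothetical parameters: n = 3 inputs, m = 2 outputs.
W = [[0.1, 0.2, 0.3],
     [0.0, -0.5, 0.5]]
b = [0.0, 0.0]

y = dense_layer([1.0, 2.0, 3.0], W, b)  # a list with two entries
```

Each output entry is a weighted sum of the inputs passed through a nonlinearity, which is the neuron building block repeated m times.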

Page 15: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Smart Reply feature predicts email responses

Incoming email → Smart Reply → Response email

Page 16: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Recurrent neural networks handle sequences of input

Diagram by Felix Gers

Page 18: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Recurrent neural networks handle sequences of input

Page 19: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Reading a word into a feed-forward neural network

cat

output

Page 20: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Reading a sequence of words into an RNN

That

Page 21: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Reading a sequence of words into an RNN

That is

Page 22: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Reading a sequence of words into an RNN

That is good

Page 23: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Reading a sequence of words into an RNN

That is good !

Page 24: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Reading a sequence of words into an RNN

That is good !

output
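The word-by-word reading shown on the preceding slides can be sketched as a loop that updates a hidden state once per token. This is a minimal plain recurrent update under stated assumptions: the tanh nonlinearity, the two-dimensional state, and every embedding and weight value are hypothetical, and the slides do not say which recurrent cell the production model uses.

```python
import math

def rnn_step(state, x, w_state, w_input):
    """One recurrent update: new_state[i] = tanh(w_state[i]·state + w_input[i]·x)."""
    return [
        math.tanh(
            sum(ws * s for ws, s in zip(w_state[i], state))
            + sum(wx * xi for wx, xi in zip(w_input[i], x))
        )
        for i in range(len(state))
    ]

def read_sequence(tokens, embeddings, w_state, w_input, state_size):
    """Feed tokens one at a time; the final state summarizes the whole sequence."""
    state = [0.0] * state_size
    for token in tokens:
        state = rnn_step(state, embeddings[token], w_state, w_input)
    return state

# Hypothetical toy parameters (a trained model would learn these).
embeddings = {"That": [1.0, 0.0], "is": [0.0, 1.0],
              "good": [1.0, 1.0], "!": [0.5, 0.5]}
w_state = [[0.5, -0.2], [0.1, 0.3]]
w_input = [[0.4, 0.1], [-0.3, 0.2]]

final_state = read_sequence(["That", "is", "good", "!"],
                            embeddings, w_state, w_input, state_size=2)
```

The state has the same length no matter how many words are read, and because each update depends on the previous state, word order affects the result.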

Page 25: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Sequence-to-sequence model

Sutskever et al., NIPS 2014

Page 26: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Sequence-to-sequence model

encoder decoder

Page 27: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Sequence-to-sequence model

Ingests incoming message

Generates reply message

Page 28: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Inference

Page 29: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Reading a sequence of words into an RNN

How

Page 30: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Reading a sequence of words into an RNN

How are

Page 31: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Reading a sequence of words into an RNN

How are you

Page 32: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Reading a sequence of words into an RNN

How are you ?

Page 33: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Encoder ingests the incoming message

How are you ?

Internal state is a fixed-length encoding of the message

Page 34: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Decoder is initialized with final state of encoder

How are you ? __

Page 36: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Decoder predicts next word

How are you ? __

Page 37: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Decoder predicts next word

How are you ? ____ I

Page 38: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Smart Reply model

How

Message

Page 39: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Smart Reply model

How are

Message

Page 40: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Smart Reply model

How are you

Message

Page 41: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Smart Reply model

How are you ?

Message

Page 42: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Smart Reply model

How are you ? __

I

Message

Response

Page 43: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Smart Reply model

How are you ? __ I

I am

Message

Response

Page 44: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Smart Reply model

How are you ? __ I am

I am great

Message

Response

Page 45: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Smart Reply model

How are you ? __ I am great

I am great !

Message

Response

Vinyals & Le, ICML DL 2015
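The step-by-step generation on these slides, where each predicted word is fed back in as the next decoder input, can be sketched as a greedy decoding loop. Everything named here is an assumption for illustration: `next_word_probs` is a hypothetical stand-in for the trained decoder, `TOY_MODEL` is a hand-written table rather than a real model, and `<eos>` is an assumed end-of-reply token.

```python
END = "<eos>"  # hypothetical end-of-reply token

# Hand-written stand-in for a trained decoder, keyed by the reply so far.
TOY_MODEL = {
    (): {"I": 0.7, "Fine": 0.3},
    ("I",): {"am": 0.9, END: 0.1},
    ("I", "am"): {"great": 0.6, "fine": 0.4},
    ("I", "am", "great"): {"!": 0.8, END: 0.2},
    ("I", "am", "great", "!"): {END: 1.0},
}

def next_word_probs(message, prefix):
    """Hypothetical decoder step: P(next word | message, reply so far).
    This toy version ignores `message`; a real decoder starts from the
    encoder's final state for that message."""
    return TOY_MODEL.get(tuple(prefix), {END: 1.0})

def greedy_reply(message, max_len=10):
    """Repeatedly append the single most likely next word until END."""
    reply = []
    for _ in range(max_len):
        probs = next_word_probs(message, reply)
        word = max(probs, key=probs.get)
        if word == END:
            break
        reply.append(word)
    return reply

reply = greedy_reply(["How", "are", "you", "?"])
# reply == ["I", "am", "great", "!"]
```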

Page 46: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Inference

● Resulting model is fully generative

● Output distribution can be used to determine the most likely responses using a beam search
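A minimal beam search sketch follows, assuming a hypothetical `next_word_probs` function that stands in for the model's output distribution for one fixed incoming message. The probability table, the `<eos>` token, and the beam width are illustrative choices, not details from the talk.

```python
import math

END = "<eos>"  # hypothetical end-of-reply token

# Hand-written stand-in for the decoder's output distribution.
TOY_MODEL = {
    (): {"I": 0.6, "Sounds": 0.4},
    ("I",): {"am": 0.5, "will": 0.5},
    ("I", "am"): {END: 1.0},
    ("I", "will"): {END: 1.0},
    ("Sounds",): {"good": 0.9, "fine": 0.1},
    ("Sounds", "good"): {END: 1.0},
    ("Sounds", "fine"): {END: 1.0},
}

def next_word_probs(prefix):
    """Hypothetical decoder step: P(next word | reply so far)."""
    return TOY_MODEL.get(tuple(prefix), {END: 1.0})

def beam_search(beam_width=2, max_len=5):
    """Return finished replies as (log_prob, words), most likely first.
    Replies that have not ended after max_len steps are dropped."""
    beams = [(0.0, [])]
    finished = []
    for _ in range(max_len):
        candidates = []
        for log_p, words in beams:
            for word, p in next_word_probs(words).items():
                if p <= 0.0:
                    continue
                score = log_p + math.log(p)
                if word == END:
                    finished.append((score, words))
                else:
                    candidates.append((score, words + [word]))
        if not candidates:
            break
        # Keep only the beam_width most likely partial replies.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_width]
    finished.sort(key=lambda c: c[0], reverse=True)
    return finished

best = beam_search(beam_width=2)[0][1]      # ["Sounds", "good"], probability 0.36
narrow = beam_search(beam_width=1)[0][1]    # ["I", "am"], probability 0.30
```

With a beam of width 1 the search commits to the most likely first word and ends up with a less likely full reply; keeping two partial replies alive recovers the better one.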

Page 47: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Training

Page 48: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Training

● Training data is a corpus of email-reply pairs

● Both encoder and decoder are trained together (end-to-end)
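One way to picture training on email-reply pairs is as minimizing the negative log-likelihood of each observed reply given its email. The slides do not state the loss, so treating it as maximum likelihood (the usual choice for sequence-to-sequence models) is an assumption, as are the helper names and the untrained `uniform_model` used here.

```python
import math

END = "<eos>"  # hypothetical end-of-reply token

def reply_loss(next_word_probs, message, reply):
    """Negative log-likelihood of `reply` given `message`, feeding the true
    previous words at every step (teacher forcing)."""
    loss = 0.0
    prefix = []
    for word in list(reply) + [END]:
        p = next_word_probs(message, prefix).get(word, 1e-12)  # floor avoids log(0)
        loss -= math.log(p)
        prefix.append(word)
    return loss

def uniform_model(message, prefix):
    """Hypothetical untrained model: uniform over a 4-word vocabulary."""
    vocab = ["I", "am", "great", END]
    return {w: 1.0 / len(vocab) for w in vocab}

# A one-pair toy corpus of (email, reply).
pairs = [(["How", "are", "you", "?"], ["I", "am", "great"])]
total_loss = sum(reply_loss(uniform_model, m, r) for m, r in pairs)
```

End-to-end training would adjust encoder and decoder parameters jointly so that this total goes down across the corpus; the uniform model above simply gives the starting point of 4·ln 4 for a three-word reply plus the end token.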

Page 50: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Key points about model

● Everything is learned from data, even features

● Neural network smooths across language variation

Page 51: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Smart Reply in Production

Page 52: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Deployment & coverage

● Deployed in Inbox by Gmail

● Used to assist with more than 10% of all mobile replies

Page 53: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Examples

Page 54: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Quality

● How do we ensure that the response options are always high quality in content and language?

  ○ Avoid incorrect grammar and mechanics, misspellings, e.g., "your the best"

  ○ Avoid inappropriate, offensive responses, e.g., "Leave me alone."

  ○ Deal with wide variability, informal language, e.g., "got it thx"

● Restricting model vocabulary is not sufficient!

Solution: Restrict to a fixed set of valid responses, derived automatically from data.
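The restriction described above can be sketched as scoring only the responses in an allowed set with the model and showing the top few. The slides do not say how the production system searches the set, so the exhaustive scoring here is an assumption, and `toy_model`, `allowed`, and the `<eos>` token are hypothetical.

```python
import math

END = "<eos>"  # hypothetical end-of-reply token

def response_log_prob(next_word_probs, message, response):
    """Log-probability the model assigns to a complete candidate response."""
    log_p = 0.0
    prefix = []
    for word in list(response) + [END]:
        p = next_word_probs(message, prefix).get(word, 1e-12)  # floor avoids log(0)
        log_p += math.log(p)
        prefix.append(word)
    return log_p

def suggest(next_word_probs, message, allowed_responses, k=3):
    """Rank only the allowed responses; anything else can never be shown."""
    ranked = sorted(
        allowed_responses,
        key=lambda r: response_log_prob(next_word_probs, message, r),
        reverse=True,
    )
    return [" ".join(r) for r in ranked[:k]]

def toy_model(message, prefix):
    """Hand-written stand-in whose favorite reply is ungrammatical."""
    table = {
        (): {"your": 0.5, "Tuesday": 0.3, "Wednesday": 0.2},
        ("your",): {"the": 1.0},
        ("your", "the"): {"best": 1.0},
        ("your", "the", "best"): {END: 1.0},
        ("Tuesday",): {"works": 1.0},
        ("Tuesday", "works"): {END: 1.0},
        ("Wednesday",): {"works": 1.0},
        ("Wednesday", "works"): {END: 1.0},
    }
    return table.get(tuple(prefix), {})

allowed = [("Wednesday", "works"), ("Tuesday", "works")]
message = ["Can", "you", "do", "Tuesday", "or", "Wednesday", "?"]
suggestions = suggest(toy_model, message, allowed, k=2)
# suggestions == ["Tuesday works", "Wednesday works"]
```

Even though the toy model's single most likely output is "your the best", it is never suggested because it is not in the allowed set.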

Page 55: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Most frequently used clusters

Page 56: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

What the model can do

Page 57: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

What the model can't do

● Match every user's tone and style

Page 58: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

What the model can't do

● Match every user's tone and style

● Ensure diverse options

Page 59: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

What the model can't do

● Match every user's tone and style

● Ensure diverse options

● Access and update any kind of state or knowledge base

Page 60: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Conclusions

Page 61: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Conclusions

● Sequence-to-sequence produces plausible email replies in many common scenarios, when trained on an email corpus

● Smart Reply is deployed in Inbox by Gmail and generates more than 10% of mobile replies

Page 62: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Conclusions

● A conversation model learned entirely from data is very powerful

● A data-driven approach can be complementary to hand-crafted rules and scenarios

Page 63: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Collaborators

- Greg Corrado, Oriol Vinyals (Google Brain)

- Balint Miklos, Tobias Kaufman, Laszlo Lukacs, and Karol Kurach (Gmail)

- Sujith Ravi (Google Research)

Page 64: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Thank you!

Page 65: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Extra slides

Page 66: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Example

Page 67: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Unique cluster and suggestion usage

Page 68: Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Ranking experiments