Anjuli Kannan, Software Engineer, Google at MLconf SF 2016
Transcript of Anjuli Kannan, Software Engineer, Google at MLconf SF 2016
Confidential + Proprietary
Smart Reply: Learning a Model of Conversation from Data
Anjuli Kannan, Software Engineer, Google Brain
Problem
Can you do Tuesday or Wednesday?
Phil Sharp
Suggested replies: Tuesday / Wednesday
Smart Reply feature
● Provide text assistance for email reply composition
● Targeted at mobile
● Responses can be sent on their own or extended
Smart Reply feature predicts email responses
Incoming email → Smart Reply → Response email
Why is this task hard?
● extracting meaning from previous message
● generating language
● grammatical transformations between call and response
● matching style/tone
Why is this solution interesting?
● Model is learned fully from data
Model
Neural network
[Figure: a network classifying a handwritten digit, with outputs "Is a 4" and "Is a 5". Image: Wikipedia]
Basic building block is the neuron
[Figure: a single neuron within the network. Diagram: Greg Corrado]
Learn a function from one space to another
f(·): x ∈ Rⁿ → y ∈ Rᵐ
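As a minimal sketch of this idea, a one-layer network is just a parametric function from Rⁿ to Rᵐ. The weights below are random stand-ins (in practice they are learned from data); the dimensions n and m are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# A one-layer network as a function f: R^n -> R^m.
# W and b are stand-in random parameters; the real ones are learned.
n, m = 8, 2
W = rng.normal(scale=0.1, size=(m, n))
b = np.zeros(m)

def f(x):
    """Map a point in R^n to a point in R^m through one nonlinear layer."""
    return np.tanh(W @ x + b)

y = f(rng.normal(size=n))
print(y.shape)  # (2,)
```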
Smart Reply feature predicts email responses
Incoming email → Smart Reply → Response email
Recurrent neural networks handle sequences of input
Diagram by Felix Gers
Reading a word into a feed-forward neural network
"cat" → network → output
Reading a sequence of words into an RNN
That → is → good → ! → output
(words are fed in one at a time; the network's state carries the sequence so far)
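The word-at-a-time reading above can be sketched as a vanilla RNN in a few lines of numpy. The vocabulary, dimensions, and random parameters here are illustrative stand-ins, not the talk's actual model.

```python
import numpy as np

rng = np.random.default_rng(0)

vocab = {"That": 0, "is": 1, "good": 2, "!": 3}
d_emb, d_hid = 4, 5

# Stand-in random parameters; the real ones are learned from data.
E   = rng.normal(scale=0.1, size=(len(vocab), d_emb))  # word embeddings
W_x = rng.normal(scale=0.1, size=(d_hid, d_emb))       # input weights
W_h = rng.normal(scale=0.1, size=(d_hid, d_hid))       # recurrent weights

def read_sequence(words):
    """Feed words into the RNN one at a time; the state carries the sequence."""
    h = np.zeros(d_hid)
    for w in words:
        h = np.tanh(W_x @ E[vocab[w]] + W_h @ h)
    return h

state = read_sequence(["That", "is", "good", "!"])
print(state.shape)  # (5,)
```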
Sequence-to-sequence model
Sutskever et al, NIPS 2014
Encoder: ingests incoming message
Decoder: generates reply message
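The encoder/decoder pairing can be sketched structurally as two RNNs sharing a hidden state, the decoder starting from the encoder's final state. All parameters, sizes, and token ids below are hypothetical; in the real model everything is learned end-to-end.

```python
import numpy as np

rng = np.random.default_rng(0)
V, d = 6, 5  # toy vocabulary size and hidden/embedding size

# Hypothetical stand-in parameters; in practice these are learned.
E    = rng.normal(scale=0.1, size=(V, d))  # shared word embeddings
W_xe = rng.normal(scale=0.1, size=(d, d))  # encoder input weights
W_he = rng.normal(scale=0.1, size=(d, d))  # encoder recurrent weights
W_xd = rng.normal(scale=0.1, size=(d, d))  # decoder input weights
W_hd = rng.normal(scale=0.1, size=(d, d))  # decoder recurrent weights
W_o  = rng.normal(scale=0.1, size=(V, d))  # output projection to vocabulary

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def encode(token_ids):
    """Encoder ingests the message; final state is a fixed-length encoding."""
    h = np.zeros(d)
    for t in token_ids:
        h = np.tanh(W_xe @ E[t] + W_he @ h)
    return h

def decode_step(h, prev_token):
    """Decoder advances one step and emits a distribution over next words."""
    h = np.tanh(W_xd @ E[prev_token] + W_hd @ h)
    return h, softmax(W_o @ h)

h = encode([0, 1, 2, 3])   # e.g. "How are you ?"
h, p = decode_step(h, 4)   # 4 = hypothetical start-of-reply token
print(p.shape)             # distribution over the vocabulary
```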
Inference
Encoder ingests the incoming message: How → are → you → ?
Internal state is a fixed-length encoding of the message
Decoder is initialized with final state of encoder: How are you ? __
Decoder predicts next word: How are you ? __ → I
Smart Reply model
Message (read by encoder): How are you ?
Response (generated by decoder, one word at a time): I → I am → I am great → I am great !
Vinyals & Le, ICML DL 2015
Inference
● Resulting model is fully generative
● Output distribution can be used to determine the most likely responses using a beam search
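A beam search over the decoder's output distribution can be sketched as follows. The `step_fn` interface and the toy probability table are assumptions for illustration; in the real system the distribution comes from the trained decoder.

```python
import math

def beam_search(step_fn, start, eos, beam_width=3, max_len=10):
    """Keep the `beam_width` best partial replies by total log-probability.

    `step_fn(prefix)` returns a dict of next-token -> probability; in the
    real model this would be the decoder's output distribution.
    """
    beams = [([start], 0.0)]
    finished = []
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            for tok, p in step_fn(seq).items():
                candidates.append((seq + [tok], score + math.log(p)))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_width]:
            (finished if seq[-1] == eos else beams).append((seq, score))
        if not beams:
            break
    return sorted(finished + beams, key=lambda c: c[1], reverse=True)

# Toy distribution standing in for the decoder's output.
probs = {
    ("<s>",): {"I": 0.6, "Sure": 0.4},
    ("<s>", "I"): {"am": 1.0},
    ("<s>", "I", "am"): {"great": 0.7, "good": 0.3},
}
step_fn = lambda seq: probs.get(tuple(seq), {"</s>": 1.0})

best = beam_search(step_fn, "<s>", "</s>")[0][0]
print(" ".join(best[1:-1]))  # I am great
```

Note that the highest-probability full reply ("I am great", 0.42) beats the shorter "Sure" (0.4), which greedy one-word-at-a-time decoding could miss.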
Training
● Training data is a corpus of email-reply pairs
● Both encoder and decoder are trained together (end-to-end)
Key points about model
● Everything is learned from data, even features
● Neural network smooths across language variation
Smart Reply in Production
Deployment & coverage
● Deployed in Inbox by Gmail
● Used to assist with more than 10% of all mobile replies
Examples
Quality
● How do we ensure that the response options are always high quality in content and language?
○ Avoid incorrect grammar, mechanics, and misspellings, e.g., "your the best"
○ Avoid inappropriate, offensive responses, e.g., "Leave me alone."
○ Deal with wide variability and informal language, e.g., "got it thx"
● Restricting model vocabulary is not sufficient!
Solution: Restrict to a fixed set of valid responses, derived automatically from data.
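Given such a fixed set of valid responses, the final step is just a filter over the model's scored candidates. The whitelist contents and candidate scores below are hypothetical; in the real system the set is derived automatically from data and the scores come from the beam search.

```python
def top_suggestions(scored_responses, valid_responses, k=3):
    """Keep only responses in a fixed, data-derived valid set,
    then return the k highest-scoring ones."""
    kept = [(r, s) for r, s in scored_responses if r in valid_responses]
    kept.sort(key=lambda rs: rs[1], reverse=True)
    return [r for r, _ in kept[:k]]

# Hypothetical whitelist and beam-search candidates (scores are log-probs).
valid_responses = {"Tuesday works for me.", "Wednesday works for me.", "I can do either."}
candidates = [
    ("Tuesday works for me.", -0.4),
    ("tuesady wrks", -0.6),        # malformed model output is filtered out
    ("I can do either.", -0.9),
]
print(top_suggestions(candidates, valid_responses))
# ['Tuesday works for me.', 'I can do either.']
```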
Most frequently used clusters
What the model can do
What the model can't do
● Match every user's tone and style
● Ensure diverse options
● Access and update any kind of state or knowledge base
Conclusions
● Sequence-to-sequence produces plausible email replies in many common scenarios, when trained on an email corpus
● Smart Reply is deployed in Inbox by Gmail and generates more than 10% of mobile replies
● A conversation model learned entirely from data is very powerful
● A data-driven approach can be complementary to hand-crafted rules and scenarios
Collaborators
- Greg Corrado, Oriol Vinyals (Google Brain)
- Balint Miklos, Tobias Kaufman, Laszlo Lukacs, and Karol Kurach (Gmail)
- Sujith Ravi (Google Research)
Thank you!
Extra slides
Example
Unique cluster and suggestion usage
Ranking experiments