
Neural Dialog
Shrimai Prabhumoye

Alan W Black
Speech Processing 11-[468]92

Review

• Task-Oriented Systems
  • Intents, slots, actions and response

• Non-Task-Oriented Systems
  • No agenda, for fun

• Building dialog systems
  • Rule-Based Systems
    • Eliza

• Retrieval Techniques
  • Representations: TF-IDF, n-grams, the words themselves
  • Similarity Measures: Jaccard, cosine, Euclidean distance
  • Limitations: fixed set of responses, no variation in the response

Review

• Task-Oriented Systems
• Non-Task-Oriented Systems
• Building dialog systems
• Retrieval Techniques
  • Representation: Word Vectors
  • Similarity Measures
  • Limitations: fixed set of responses, no variation in the response

• Generative Models

Overview

• Word Embeddings

• Language Modelling

• Recurrent Neural Networks
• Sequence to Sequence Models

• How to Build a Dialog System

• Issues and Examples

• Alexa Prize

Neural Dialog

• We want to model: P(response | input)
  • How do we represent a sentence (P(response), P(input))?
  • How do we build a language model?
  • How do we represent words (word embeddings)?

Natural Language Processing

• Typical preprocessing steps
  o Form a vocabulary of words that maps each word to a unique ID
  o Different criteria can be used to select which words are part of the vocabulary (e.g. a frequency threshold)
  o All words not in the vocabulary will be mapped to a special 'out-of-vocabulary' token
• Typical vocabulary sizes vary between 10,000 and 250,000

(Salakhutdinov, 2017)

Preprocessing Techniques

• Tokenization
  • "I am a girl." is tokenized to "I", "am", "a", "girl", "."
• Lowercase all words
• Removing Stop Words
  • e.g. "the", "a", "and", etc.
• Frequency of Words
  • Set a threshold and map all words below this frequency to UNK
• Add <START> and <EOS> tags at the beginning and end of each sentence.

(Salakhutdinov, 2017)
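A minimal Python sketch of these preprocessing steps (the toy corpus, the frequency threshold, and the token names are illustrative assumptions; stop-word removal is omitted):

```python
from collections import Counter

corpus = ["I am a girl .", "I am a student ."]   # toy corpus (illustrative)
MIN_FREQ = 2                                     # assumed frequency threshold

# Tokenize (here: whitespace split) and lowercase
tokenized = [sentence.lower().split() for sentence in corpus]

# Count word frequencies and keep only words at or above the threshold
counts = Counter(w for sent in tokenized for w in sent)
vocab = {"UNK": 0, "<START>": 1, "<EOS>": 2}
for word, freq in counts.items():
    if freq >= MIN_FREQ:
        vocab[word] = len(vocab)                 # map each word to a unique ID

def encode(sentence):
    """Lowercase, tokenize, add <START>/<EOS>, map rare words to UNK."""
    tokens = ["<START>"] + sentence.lower().split() + ["<EOS>"]
    return [vocab.get(tok, vocab["UNK"]) for tok in tokens]

print(encode("I am a girl ."))
```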

Vocabulary

One-Hot Encoding

• From its word ID, we get a basic representation of a word through the one-hot encoding of the ID
• The one-hot vector of an ID is a vector filled with 0s, except for a 1 at the position associated with the ID
• For vocabulary size D = 10, the one-hot vector of word ID w = 4 is:

  e(w) = [0 0 0 1 0 0 0 0 0 0]

(Salakhutdinov, 2017)
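The same example in NumPy, assuming the word IDs are 1-indexed as on the slide:

```python
import numpy as np

D = 10          # vocabulary size
w = 4           # word ID (1-indexed, matching the slide)

e = np.zeros(D)
e[w - 1] = 1.0  # a single 1 at the position associated with the ID

print(e)        # [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
```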

Limitations of One-Hot Encoding

• A one-hot encoding makes no assumption about word similarity.
  o ["working", "on", "Friday", "is", "tiring"] does not appear in our training set.
  o ["working", "on", "Monday", "is", "tiring"] is in the training set.
  o We want to model P("tiring" | "working", "on", "Friday", "is")
  o If the word representations of "Monday" and "Friday" are similar, the model can generalize

(Salakhutdinov, 2017)

Limitations of One-Hot Encoding

• The major problem with the one-hot representation is that it is very high-dimensional
  o The dimensionality of e(w) is the size of the vocabulary
  o A typical vocabulary size is ≈ 100,000
  o A window of 10 words would correspond to an input vector of at least 1,000,000 units!

(Salakhutdinov, 2017)

Continuous Representation of Words

• Each word w is associated with a real-valued vector C(w)
• The typical size of a word embedding is 300 or more.

(Salakhutdinov, 2017)

Continuous Representation of Words

• We would like the distance ||C(w) − C(w′)|| to reflect meaningful similarities between words

(Salakhutdinov, 2017)
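For example, a small sketch of measuring that distance (and the related cosine similarity) between two word vectors; the random vectors here merely stand in for learned embeddings C(w):

```python
import numpy as np

rng = np.random.default_rng(0)
C = {w: rng.normal(size=300) for w in ["monday", "friday", "banana"]}  # stand-in embeddings

def euclidean(u, v):
    return np.linalg.norm(u - v)

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# With trained embeddings we would expect "monday" to be closer to "friday" than to "banana"
print(euclidean(C["monday"], C["friday"]), euclidean(C["monday"], C["banana"]))
print(cosine(C["monday"], C["friday"]), cosine(C["monday"], C["banana"]))
```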

Language Modeling

• A language model allows us to predict the probability of observing a sentence (in a given dataset) as:

  P(x_1, …, x_n) = ∏_{i=1}^{n} P(x_i | x_1, …, x_{i-1})

• Here the length of the sentence is n.
• We can build a language model using a Recurrent Neural Network.
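As a concrete instance of this chain rule, the probability of the 4-word sentence "I hate this movie" factorizes into four conditional probabilities; the numbers below are made up purely for illustration:

```python
# P("I", "hate", "this", "movie")
#   = P("I") * P("hate" | "I") * P("this" | "I", "hate") * P("movie" | "I", "hate", "this")
conditionals = [0.02, 0.1, 0.3, 0.25]   # illustrative values, not real estimates

p_sentence = 1.0
for p in conditionals:
    p_sentence *= p
print(p_sentence)   # 0.00015
```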

Word Embeddings from Language Models

(Neubig, 2017)

Continuous Bag of Words (CBOW)

• Predict a word based on the sum of the surrounding embeddings

(Neubig, 2017)
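A minimal NumPy sketch of the CBOW prediction: sum the embeddings of the surrounding words and score every vocabulary word against that sum (the weights are random, so this shows only the computation, not a trained model):

```python
import numpy as np

V, d = 1000, 64                      # vocabulary and embedding sizes (assumed)
rng = np.random.default_rng(0)
E = rng.normal(size=(V, d)) * 0.1    # input word embeddings
W = rng.normal(size=(d, V)) * 0.1    # output projection

context_ids = [3, 17, 42, 99]        # IDs of the surrounding words
h = E[context_ids].sum(axis=0)       # sum of the surrounding embeddings
scores = h @ W
probs = np.exp(scores - scores.max())
probs /= probs.sum()                 # softmax over the vocabulary
print(probs.argmax())                # predicted center word ID
```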

Skip-gram

• Use the current word to predict the surrounding window of context words

(Neubig, 2017)
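Skip-gram turns this around: each (current word, context word) pair becomes a training example. A toy sketch of generating those pairs (the window size of 2 is an assumption):

```python
sentence = ["i", "am", "a", "girl", "."]
window = 2   # assumed context window size

pairs = []
for i, center in enumerate(sentence):
    for j in range(max(0, i - window), min(len(sentence), i + window + 1)):
        if j != i:
            pairs.append((center, sentence[j]))   # (current word, context word to predict)

print(pairs[:4])   # [('i', 'am'), ('i', 'a'), ('am', 'i'), ('am', 'a')]
```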

BERT (Bidirectional Encoder Representations from Transformers)

• BERT is a method of pretraining language representations

• Data: Wikipedia (2.5B words) + BookCorpus (800M words)

• Mask out k% of the input words, and then predict the masked words

• Word Embedding Size: 768
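One way to see the masked-word objective in action is the Hugging Face transformers library (not part of the slides; this assumes `pip install transformers` and the `bert-base-uncased` checkpoint):

```python
from transformers import pipeline

# Fill-mask pipeline: predict the word hidden behind [MASK]
unmasker = pipeline("fill-mask", model="bert-base-uncased")
for pred in unmasker("Working on [MASK] is tiring."):
    print(pred["token_str"], round(pred["score"], 3))
```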

Use of Word Embeddings

• To represent a sentence (a simple sketch follows this list)

• As input to a neural network

• To understand properties of words
  • Part of speech
  • Do two words mean the same thing?
  • Semantic relations (is-a, part-of, went-to-school-at)?
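For the first bullet above, one simple (baseline) way to represent a sentence with word embeddings is to average the vectors of its words; the random vectors below are stand-ins for real embeddings:

```python
import numpy as np

rng = np.random.default_rng(0)
embeddings = {w: rng.normal(size=300) for w in ["i", "am", "a", "girl", "."]}  # stand-ins

def sentence_vector(tokens):
    """Average the word embeddings to get a single fixed-size sentence vector."""
    return np.mean([embeddings[t] for t in tokens], axis=0)

print(sentence_vector(["i", "am", "a", "girl", "."]).shape)   # (300,)
```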

NLP and Sequential Data

• NLP is full of sequential data
  • Characters in words
  • Words in sentences
  • Sentences in discourse
  • …

(Neubig, 2017)

Long-distance Dependencies in Language

• Agreement in number, gender, etc.
  • He does not have very much confidence in himself.
  • She does not have very much confidence in herself.

• Selectional preference
  • The reign has lasted as long as the life of the queen.
  • The rain has lasted as long as the life of the clouds.

(Neubig, 2017)

Recurrent Neural Networks

• Tools to remember information

(Neubig, 2017)

[Figure: a feed-forward NN compared with a recurrent NN, which feeds its hidden state back into itself.]

Unrolling in Time

• What does processing a sequence look like?

[Figure: an RNN unrolled over the sentence "I hate this movie"; at each time step the RNN reads one word and makes a prediction, producing one label per word.]

(Neubig, 2017)

Training RNNs

[Figure: the unrolled RNN over "I hate this movie" produces Prediction 1 through Prediction 4; each prediction is compared with its label (Label 1 through Label 4) to give Loss 1 through Loss 4, which are summed into the total loss.]

(Neubig, 2017)
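A PyTorch-flavored sketch of that training setup: one cross-entropy loss per time step, summed into a single total loss before backpropagation (the module sizes and random data are illustrative assumptions):

```python
import torch
import torch.nn as nn

vocab_size, emb_dim, hidden_dim, seq_len = 100, 32, 64, 4
embed = nn.Embedding(vocab_size, emb_dim)
rnn = nn.RNN(emb_dim, hidden_dim, batch_first=True)
out = nn.Linear(hidden_dim, vocab_size)
loss_fn = nn.CrossEntropyLoss()

tokens = torch.randint(0, vocab_size, (1, seq_len))   # "I hate this movie" as word IDs
labels = torch.randint(0, vocab_size, (1, seq_len))   # one label per time step

hidden_states, _ = rnn(embed(tokens))                 # (1, seq_len, hidden_dim)
total_loss = sum(loss_fn(out(hidden_states[:, t]), labels[:, t])
                 for t in range(seq_len))             # Loss 1 + Loss 2 + Loss 3 + Loss 4
total_loss.backward()
```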

What can RNNs do?

• Represent a sentence
  • Read the whole sentence, make a prediction

• Represent a context within a sentence
  • Read the context up until that point

(Neubig, 2017)

Representing a Sentence

• h_4 is the representation of the sentence
• h_4 is the representation of the probability of observing "I hate this movie"

[Figure: RNN cells applied to "I hate this movie", producing hidden states h_0, h_1, h_2, h_3, h_4.]

(Neubig, 2017)

Language Modeling using RNNs

[Figure: an RNN language model reads "<start> I hate this movie"; at each step it predicts the next word: "I", "hate", "this", "movie", "<end>".]

(Neubig, 2017)
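A sketch of using such an RNN language model to generate text greedily, starting from <start> and stopping at <end> (the weights here are untrained, so the actual output is arbitrary; a trained model would be fit with the summed per-step losses shown earlier):

```python
import torch
import torch.nn as nn

vocab = ["<start>", "<end>", "I", "hate", "this", "movie"]
V, emb_dim, hid_dim = len(vocab), 16, 32
embed = nn.Embedding(V, emb_dim)
cell = nn.RNNCell(emb_dim, hid_dim)
out = nn.Linear(hid_dim, V)

token = torch.tensor([vocab.index("<start>")])
h = torch.zeros(1, hid_dim)
generated = []
for _ in range(10):                          # cap the generated length
    h = cell(embed(token), h)                # update the hidden state
    token = out(h).argmax(dim=-1)            # greedy choice of the most likely next word
    word = vocab[token.item()]
    if word == "<end>":
        break
    generated.append(word)
print(generated)
```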

Bidirectional RNNs

• A simple extension: run the RNN in both directions

[Figure: forward and backward RNNs over the sentence, combined to produce Prediction 1 through Prediction 4.]

(Neubig, 2017)
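In PyTorch this extension is a single flag: bidirectional=True runs a forward and a backward RNN and concatenates their states at every position (the sizes below are illustrative):

```python
import torch
import torch.nn as nn

emb_dim, hid_dim, seq_len = 32, 64, 4
birnn = nn.RNN(emb_dim, hid_dim, batch_first=True, bidirectional=True)

x = torch.randn(1, seq_len, emb_dim)   # embedded sentence, e.g. "I hate this movie"
outputs, _ = birnn(x)
print(outputs.shape)                   # (1, 4, 128): forward and backward states per word
# outputs[:, t] can feed a per-word classifier to produce Prediction t+1
```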

Recurrent Neural Networks

• The idea behind RNNs is to make use of sequential information.

Recurrent Neural Networks

• x_t is the input at time step t
• x_t is the word embedding
• s_t is the hidden representation at time step t

  s_t = f(U x_t + W s_{t-1})
  o_t = softmax(V s_t)

• Note: U, V, W are shared across all time steps
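The two equations in NumPy, taking f to be tanh (a common choice); all weights are random stand-ins:

```python
import numpy as np

d_in, d_hid, vocab = 300, 128, 10000       # embedding, hidden, vocabulary sizes (assumed)
rng = np.random.default_rng(0)
U = rng.normal(size=(d_hid, d_in)) * 0.01
W = rng.normal(size=(d_hid, d_hid)) * 0.01
V = rng.normal(size=(vocab, d_hid)) * 0.01

def softmax(z):
    z = z - z.max()
    return np.exp(z) / np.exp(z).sum()

def rnn_step(x_t, s_prev):
    """One time step: s_t = f(U x_t + W s_{t-1}), o_t = softmax(V s_t)."""
    s_t = np.tanh(U @ x_t + W @ s_prev)
    o_t = softmax(V @ s_t)
    return s_t, o_t

s = np.zeros(d_hid)
for x in rng.normal(size=(4, d_in)):       # four word embeddings, e.g. "I hate this movie"
    s, o = rnn_step(x, s)                  # U, W, V are reused (shared) at every step
```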

RNN Problems and Alternatives

• Vanishing gradients
  • Gradients decrease as they get pushed back
  • Solution: Long Short-Term Memory (Hochreiter and Schmidhuber, 1997)

(Neubig, 2017)

RNN Strengths and Weaknesses

• RNNs, particularly deep RNNs/LSTMs, are quite powerful and flexible

• But they require a lot of data

• They also have trouble with the weak error signals passed back from the end of the sentence

Building Chatbots

• We want to model P(response | input_sentence)

• We learned how to build word embeddings

• We learned how to build a language model

• We learned how to represent a sentence.

• We want to get a representation of the input_sentence and then generate the response conditioned on the input.

Conditional Language Models

• Language Model

  P(X) = ∏_{i=1}^{I} P(x_i | x_1, …, x_{i-1})

• Conditional Language Model

  P(Y | X) = ∏_{j=1}^{J} P(y_j | X, y_1, …, y_{j-1})

(Neubig, 2017)

(In the language model, the previously generated words are the context for the next word; in the conditional model, X is the added context.)

Conditional Language Model (Sutskever et al., 2014)

(Neubig, 2017)

How to pass the hidden state?

• Initialize the decoder with the encoder's final state (Sutskever et al., 2014)

• Transform it (the encoder and decoder can have different dimensions)

• Input it at every decoder time step (Kalchbrenner & Blunsom, 2013)

(Neubig, 2017)
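A compact encoder-decoder sketch in PyTorch, following the first option above (Sutskever et al., 2014): the encoder's final hidden state initializes the decoder, which then acts as a conditional language model over the response. All sizes and names are illustrative assumptions:

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, vocab_size=1000, emb_dim=64, hid_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.decoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, input_sentence, response):
        # Encode the input sentence; keep only its final hidden state
        _, h = self.encoder(self.embed(input_sentence))
        # Initialize the decoder with the encoder state and unroll it over the response
        dec_states, _ = self.decoder(self.embed(response), h)
        return self.out(dec_states)          # logits for P(y_j | X, y_1, ..., y_{j-1})

model = Seq2Seq()
src = torch.randint(0, 1000, (1, 6))         # input_sentence token IDs
tgt = torch.randint(0, 1000, (1, 5))         # response token IDs (teacher forcing)
print(model(src, tgt).shape)                 # (1, 5, 1000)
```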

Sequence to Sequence Models

Constraints of Neural Models

• Gesture
• Gaze
• Laughter
• Backchanneling
• Long-term conversation planning
• Context
• Engagement

Examples of Neural Chatbots

• Tay
• Zo
• Xiaoice
• https://www.youtube.com/watch?v=dg-x1WuGhuI

Alexa Prize Challenge

• Challenge: build a chatbot that engages users for 20 minutes.
• Sponsored 12 university teams with $100k.
• CMU Magnus and CMU Ruby.
• Systems are multi-component
  o Combinations of task / non-task
  o Hand-written and statistical/neural models
• It's about engaging researchers
  o Having more PhD students work on dialog
  o Giving developers access to users
  o Collecting data: what do users say

CMU Magnus

• High average number of turns

• Average rating

• Topics: Movies, Sports, Travel, GoT

• Users had longer conversations but did not enjoy the conversation.
  o Identify when the user is frustrated or wants to change the topic.
  o Identify what the user would like to talk about (intent).

• Detecting "abusive" remarks and responding appropriately

Summary

• How to represent words in continuous space.
• What RNNs are and how to use them to represent a sentence.
• Sequence to sequence models for P(response | input_sentence)
• Issues in neural models
• Issues with live systems!

References

• http://www.phontron.com/class/nn4nlp2017/assets/slides/nn4nlp-03-wordemb.pdf
• http://www.phontron.com/class/nn4nlp2017/assets/slides/nn4nlp-06-rnn.pdf
• http://www.phontron.com/class/nn4nlp2017/assets/slides/nn4nlp-08-condlm.pdf
• https://www.cs.cmu.edu/~rsalakhu/10707/Lectures/Lecture_Language_2019.pdf
• http://www.phontron.com/class/mtandseq2seq2017/mt-spring2017.chapter6.pdf

References

• http://www.wildml.com/2016/04/deep-learning-for-chatbots-part-1-introduction/
• http://www.wildml.com/2015/09/recurrent-neural-networks-tutorial-part-1-introduction-to-rnns/
• http://www.wildml.com/2015/09/recurrent-neural-networks-tutorial-part-2-implementing-a-language-model-rnn-with-python-numpy-and-theano/
• http://www.wildml.com/2016/07/deep-learning-for-chatbots-2-retrieval-based-model-tensorflow/
• https://nlp.stanford.edu/seminar/details/jdevlin.pdf

RNN to Represent a Sentence

[Figure: an RNN reads the embeddings of "how", "are", "you", "?", producing hidden states s_0, s_1, s_2, s_3, s_4.]

• s_4 is the representation of the entire sentence
• s_4 is the representation of the probability of observing "how are you?"