Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain...

51
1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh [email protected], [email protected], [email protected]

Transcript of Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain...

Page 1: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

1

Edina: Building an Open-Domain

Socialbot using Self-Dialogues

ILCC, School of Informatics, University of Edinburgh

[email protected], [email protected],[email protected]

Page 3: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

3

2016: The year of the chatbot

from ‘Tracxn Research, Chatbot Startup Landscape’, June 2016

Page 4: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

4

Chatbot Applications

I Customer service

I IoT

I Other: help people with disabilities, etc.

Page 5: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

5

Amazon vs. Google vs. Microsoft

https://www.amazon.com/Amazon-Echo-Bluetooth-Speaker-with-WiFi-Alexa/dp/B00X4WHP5E

https://www.bhphotovideo.com/images/images2500x2500/google_ga3a00417a14_home_1297281.jpg

https://blogs.msdn.microsoft.com/ukhe/2015/09/15/student-survival-tips-from-cortana/

Page 6: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

6

Amazon Alexa Prize

I Goal: to build on open-domain conversation AI forcommercial purposes

I Currently, Alexa mostly is mostly rule-based (skills)

I 18 teams involved (12 sponsored by Amazon)

I Users in the U.S. evaluate the conversation with bot on ascale from 1 to 5

Page 7: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

7

Our team

Page 8: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

8

The problem(s)

Page 9: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

9

Where do we start?

I How do we build a chatbot?I No idea!I Let’s look at previous work!

Page 10: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

10

Rule-based bots: Mitsuku (try it at mistuku.com!)

Page 11: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

11

Rule-based vs. Machine-learning

I Rule-basedI 3 Fully deterministicI 3 Output fully intelligibleI 7 Very constrainedI 7 Time-consuming, Difficult to maintainI 7 Full of fallback strategies

Page 12: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

12

Machine-learning methods: Neural Networks

Page 13: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

13

Rule-based vs. Machine-learning

I Rule-basedI 3Fully deterministicI 3Output fully intelligibleI 7Very constrainedI 7Time-consuming, Difficult to maintainI 7Full of fallback strategies

I Machine-learningI 3 Easy to maintainI 3 Flexible, broader-coverageI 7 Non-deterministicI 7 Constrained to the domain of the training data

Page 14: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

14

Where do we start?

I How do we build a chatbot?I No idea!I Let’s look at previous work!

I What does Amazon want?I Open-domainI The user needs to be happy!!!

Page 15: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

15

Open-domain

Page 16: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

16

Rule-based vs. Machine-learning

I Rule-basedI 3Fully deterministicI 3Output fully intelligibleI 7 Very constrainedI 7Time-consuming, Difficult to maintainI 7Full of fallback strategies

I Machine-learningI 3 Easy to maintainI 3 Flexible, broader-coverageI 7Non-deterministicI 7Constrained to the domain of the training data

Page 17: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

17

Rule-based vs. Machine-learning

I Rule-basedI 3Fully deterministicI 3Output fully intelligibleI 7Very constrainedI 7Time-consuming, Difficult to maintainI 7Full of fallback strategies

I Machine-learningI 3Easy to maintainI 3Flexible, broader-coverageI 7Non-deterministicI 7 Constrained to the domain of the training data

Page 18: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

18

Corpora

I OpenSubtitles: Crowdsourced movie subtitles

I Movie scripts from IMDB

I Fisher: Phone conversations

I Ubuntu dialogue corpus: Technical support forUbuntu-related problems

No corpora that fit our domain!!!

Page 19: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

19

The user needs to be happy

I Avoid offensive language

I Avoid sensitive topics (politics, religion, sex)

I Be empathetic

Page 20: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

20

Be empathetic

[User is looking for info on a stage 4 cancer]

BOT: Who is your favorite actor?USER: <says they have cancer>BOT: I like a few of his other films .USER: <mentions cancer again>BOT: Me too ! I like him in the notebook .

Page 21: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

21

Rule-based vs. Machine-learning

I Rule-basedI 3 Fully deterministicI 3 Output fully intelligibleI 7Very constrainedI 7Time-consuming, Difficult to maintainI 7Full of fallback strategies

I Machine-learningI 3Easy to maintainI 3Flexible, broader-coverageI 7 Non-deterministicI 7Constrained to the domain of the training data

Page 22: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

22

What is ideal?

I A model that...I mostly machine-learning basedI feeds on clean data that is relevant to the task (what and

how the user wants it!)I maintainable from an engineering and financial perspectiveI outputs intelligible responses

Page 23: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

23

What is ideal?

I A model that...I mostly machine-learning basedI feeds on clean data that is relevant to the task (what and

how the user wants it!)I maintainable from an engineering and financial perspectiveI outputs intelligible responses

Page 24: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

24

Ask people!

I If you want to know what do people talk about and how theydo it, ask people.

I Two people conversing with each other on a topic

Page 25: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

25

Ask people the Turkers!

I Crowdsourcing platform

I Create and upload a task (e.g. ‘have a conversation withanother user on a topic’)

I Have people around the world solve the task

I Collect data

https://pbs.twimg.com/profile_images/661394940816035840/1R9_KPHN.png

Page 26: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

26

Visual Dialogue(Abhishek et al., 2016)

Page 27: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

27

However...

I Having two turkers to chat with each other requires goodtiming and a common ground (the image in VisDial)

E.g.A: Hey, have you seen Guardians of the Galaxy?B: NoA: Not your type I guess.B: Have you?A: I haveB: Sounds nice

I Costs double (when people two people at a time)

Page 28: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

28

Self-dialogues

The Turker makes up a fictitious conversation

Page 29: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

29

Self-dialogue: example

Page 30: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

30

Self-dialogues, cont’d

I 3 Speed and set-up: takes less effort and waiting time togather data from a single user

I 3 Cost effectiveness: halves the cost; after an initial bulk,only sporadic updates to keep on track with trendy topics

I 3 Quality: the users is always an expert in what is talkingabout; knows about the entities introduced in the dialogues

I 3 Naturalness: the flow conversation is natural

I 7 Not 2-people conversations: further analysis (dialogueacts etc.) are hindered

Page 31: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

31

Data collected

I 24,283 self-dialogues spread across 23 tasks.

I A peak of 2,307 conversations a day

I Total cost: US $17,947.54

You need a lot of $$$ for these tasks!

Page 32: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

32

Data collected, cont’d

Topic/subtopic # Conversations # Words # Turns

Movies 4,126 814,842 82,018Action 414 37,037 4,140Comedy 414 36,401 4,140Fast & Furious 343 33,964 3,430Harry Potter 414 44,220 4,140Disney 2,331 232,573 23,287Horror 414 428,33 4,138Thriller 828 77,975 8,277Star Wars 1,726 178,351 17,260Superhero 414 40,967 4,140

Music 4,911 924,993 98,123Pop 684 62,383 6,840Rap / Hip-Hop 684 66,376 6,840Rock 684 63,349 6,837The Beatles 679 68,396 6,781Lady Gaga 558 49,313 5,566

Music and Movies 216 37,303 4,320

NFL Football 2,801 562,801 55,939

Page 33: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

33

The system

Page 34: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

34

System overview

Page 35: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

35

A deterministic queue

I Queue of components: when a component fails, the next oneis called

1. EVI: a factoid Q&A component provided by Amazon2. Rule-based: deals with general chit-chat3. Edina’s likes and dislikes: a bit of personality (based on Wiki

views)4. Matching score: our main component. Retrieves the

most-likely answer from the self-dialogue database.5. Proactive: change the topic on its own volition6. Neural network: A generative neural network kicks in if

everything else fails.

Page 36: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

36

A deterministic queue

I Queue of components: when a component fails, the next oneis called

1. EVI: a factoid Q&A component provided by Amazon2. Rule-based: deals with general chit-chat3. Edina’s likes and dislikes: a bit of personality (based on Wiki

views)4. Matching score: our main component. Retrieves the

most-likely answer from the self-dialogue database.5. Proactive: change the topic on its own volition6. Neural network: A generative neural network kicks in if

everything else fails.

Page 37: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

37

Rule-based

I Bot’s identity: anonymized until the finals

I Edina’s favorites: favorite actor, artist, singer, etc.

I Sensitive topics: suicide, cancer, death as well as promptscontaining offensive contents that needed to be ‘gracefully’caught

I Topic shifting: deals with requests of topic shifting

I Games and jokes

I + a set of the most frequent prompts from Alexa users,provided by Amazon

Page 38: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

38

Matching score

I Our main component

I Matches a user query q with the conversation contexts cof all potential responses from the pool of self-dialoguesgathered through AMT, to return the most likely response r(and a confidence score).

E.g.q: Have you seen Hidden Figures?

c−2: Any cool new movie?c−1: What about Hidden Figures?r : I thought Hidden Figures was very thin on the actualmathematics of it all. S: 0.87

Page 39: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

39

Matching score - cont’d

I The matching score is an interpolation of bag-of-words,IDF-based scores (rare words are upweighted).

S(q, ri , ci ) =(Sc + Scr )(Sc)n

η+ λS2cq (1)

where Sc ,Scr , (Sc)n and S2cq are subscores and λ, η and nare constants.

Page 40: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

40

Neural network

I Language model with multiplicative LSTM (Krause et al.,2017)

I Trained on OpenSubtitles and fine-tuned on our data

Page 41: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

41

Evaluation

Page 42: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

42

Evaluation

I Evaluating the usefulness of the matching score

I Qualitative evaluation

I Evaluations we haven’t done but we would like to do

Page 43: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

43

Evaluating the matching score

I We sample conversation triplets from our self-dialogue poolA:It is my favorite Disney movie.B:I think Beauty and the Beast is my favorite.A:I think my favorite is ‘Sound of Music’

I We manually score the actual reply against what thematching score returns.

Page 44: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

44

Evaluating the matching score - Results

Response type Mean human evaluation score

Human AMT worker 4.09± 0.15Matching score top 4, confidence > 0.7 3.60± 0.07Matching score top-4, confidence > 0.5 3.50± 0.08Matching score top-4 3.08± 0.09

Corr.: r = 0.259 (p < 0.0005)

Page 45: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

45

Qualitative evaluation of conversation

B: What bands have you seen live?H: I’ve seen Sufjan Stevens, do you like them?B: What’s he like?H: Quite boring to be honestB: I know what you are talking about .H: Right?! What kind of bands have you seen live?B: Yeah! I saw Cake a few months ago, and just last week I sawThe Protomen.H: Really! How was Cake?B: Eating cake?H: No i’m not eating a cakeB: No, the band. Have you ever heard of them?H: No i haven’tB: She does more traditional type music, but I absolutely love theway she sings.H: I’ll have to check her outB: You should ! she ‘s the best singer for sure .

Page 46: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

46

Things we haven’t done but we would have loved to do

I Assessing whether the entities mentioned in the self-dialoguesreflect the entities Alexa users talk about

I Using the scores from Alexa users to tune our system

Evaluating open-domain chatbots is difficult!

Page 47: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

47

Conclusion

Page 48: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

48

Conclusion

I Open-domain conversational AI is hard and still a (very) openproblem

I Data collection/annotation is a real challenge, butself-dialogues are efficient and surprisingly effective.

I Evaluation is still an open problem

I An hybrid-system is a reasonable solution for this challenge(rule-based x machine-learning x IR)

Page 49: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

49

Final remarks

I We got 6th place (out of 15 team)

I ...despite being the underdogs of the competition

I Teamwork is hard but pays off!

Page 50: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

50

What’s next?

Page 51: Edina: Building an Open-Domain Socialbot using Self-Dialogues · 1 Edina: Building an Open-Domain Socialbot using Self-Dialogues ILCC, School of Informatics, University of Edinburgh

51

https://imgflip.com/i/1yie8f