
Intent Sets - Architectural Choices for Building Practical Chatbots

Saurabh Srivastava
ssri@iitk.ac.in

Indian Institute of Technology Kanpur, India

T.V. Prabhakar
[email protected]

Indian Institute of Technology Kanpur, India

Abstract

“Chatbot” is a colloquial term used to refer to software components that possess the ability to interact with the end-user using natural language phrases. Many commercial platforms are offering sophisticated dashboards to build these chatbots with no or minimal coding. However, the job of composing the chatbot from real-world scenarios is not a trivial activity, and requires a significant understanding of the problem as well as the domain. In this work, we present the concept of Intent Sets, an architectural choice that impacts the overall accuracy of the chatbot. We show that the same chatbot can be built by choosing one out of many possible Intent Sets. We also present our observations, collected through a set of experiments, while building the same chatbot over three commercial platforms - Google Dialogflow, IBM Watson Assistant and Amazon Lex.

CCS Concepts: • Software and its engineering → Software architectures; Software design tradeoffs.

Keywords: Intent Sets, Chatbots, Architectural Choices

ACM Reference Format:
Saurabh Srivastava and T.V. Prabhakar. 2020. Intent Sets - Architectural Choices for Building Practical Chatbots. In 2020 12th International Conference on Computer and Automation Engineering (ICCAE 2020), February 14–16, 2020, Sydney, NSW, Australia. ACM, New York, NY, USA, 6 pages. https://doi.org/10.1145/3384613.3384639

1 Introduction

Chatbots (or bots in short) are software components which can interact with users in English or any other natural language. These bots are becoming increasingly popular.

Many commercial platforms provide a dashboard to build these bots. These dashboards provide the required tools to

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
ICCAE 2020, February 14–16, 2020, Sydney, NSW, Australia
© 2020 Association for Computing Machinery.
ACM ISBN 978-1-4503-7678-5/20/02. . . $15.00
https://doi.org/10.1145/3384613.3384639

define, test, modify and deploy these bots with no or minimal custom coding. The job of a bot involves accepting a user query, and replying with a suitable response. These platforms usually expect the bot details to be expressed using four key pieces of information. In the absence of standard terminology, we refer to it as the Intent-Entity Paradigm [7].

• Intents: The platform categorises the query into one of the predefined classes, called Intents. These classes roughly represent the types of queries that this chatbot is supposed to handle. For example, if we are building a bot to answer sports queries, possible Intents could be football_query, cricket_query, chess_query etc. These platforms usually have a default Intent too (i.e. the “Others” category), which is used if the query cannot be matched to any defined Intent.

• Entities: The platform can be trained to look for specific information within a query, called Entities. Entities represent real-world objects or their specific properties. For instance, to answer a Cricket query about the runs a particular player scored in a calendar year, values for two Entities, namely Player name and Year, would have to be provided by the user. Entities, when they are bound to a particular Intent, are also known as Slots or Parameters. These platforms usually provide prompts - predefined questions that can be sent as a response to the query - to fill values for any required Slots (e.g. “For which calendar year?”). Some platforms also allow setting default values for the Slots, in case the user doesn’t provide a value explicitly.

• Fulfilment Logic: For each query, the bot must provide a response to the user. This response depends on the Intent associated with the query and the details provided by the user as part of the Slot-filling process. Based on this information, the platforms allow the developers to construct a response to be sent to the user. This process is termed the Fulfilment for a specific combination of Intents and Slots. The responses can be supplied as templates with placeholders (e.g. “$player_name scored $fetched_runs in the calendar year $year”) or they can be generated arbitrarily on a case-by-case basis.

• Sample Queries: The platform expects the developer to provide some examples of the questions a user may ask. We refer to them as Sample Queries. These examples make up the training data that the platforms use to train their models internally (the details of the training process are


[Figure 1: the Statsguru QBE interface (Team / Player / Match Stats tabs, Batting / Bowling / Fielding Stats, format check-boxes for Tests / ODIs / T20s, and numeric filters such as “scored more than __ runs”) contrasted with a chatbot dialogue fetching the same data - user: “Which batsman has scored more than 50 centuries?”; bot: “Care to tell me the format(s)? Test / ODI / T20?”; user: “I am interested in Test matches only!”; bot: “Sachin Tendulkar from India scored 51 centuries in Test matches!! I could not find any other player matching your query. Do you have another query for me?”]

Figure 1. Contrast between QBE interface and a Chatbot for fetching the same data

generally “blackboxed”). Typical information provided along with a Sample Query are the associated Intent and the values for specific Slots. For instance, for the Sample Query “How many runs were scored by Steven Smith in 2009?”, the developer would provide the name of the Intent (e.g. runs_in_a_year) and values for two Slots (i.e. $player_name and $year).
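Taken together, the four pieces of information above can be sketched as plain, platform-neutral data. The schema, names and the run figure below are illustrative, following the running Cricket example; each commercial platform uses its own format:

```python
from string import Template

# A platform-neutral sketch of the four pieces of information.
# Intent, Entity and Slot names follow the running Cricket example;
# the schema itself is illustrative, not any real platform's format.
bot_definition = {
    "intents": {
        "runs_in_a_year": {
            "slots": {
                "player_name": {"entity": "Player name", "required": True},
                "year": {"entity": "Year", "required": True,
                         "prompt": "For which calendar year?"},
            },
            "fulfilment_template":
                "$player_name scored $fetched_runs in the calendar year $year",
            "sample_queries": [
                "How many runs were scored by Steven Smith in 2009?",
                "How many runs did Steven Smith score in 2009?",
            ],
        },
    },
    "default_intent": "others",   # the "Others" category
}

def fulfil(intent: str, values: dict) -> str:
    """Fill the Intent's response template with Slot values plus any
    values fetched by the (out-of-scope) business logic."""
    template = bot_definition["intents"][intent]["fulfilment_template"]
    return Template(template).substitute(values)

print(fulfil("runs_in_a_year",
             {"player_name": "Steven Smith",
              "fetched_runs": "1000",   # fabricated figure, illustration only
              "year": "2009"}))
```

The placeholder style matches the template form of Fulfilment described above; the alternative, generating responses case by case, would replace `fulfil` with arbitrary code.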

The overall accuracy of a bot can then be measured by firing some unseen queries at it and counting the number of times it mapped the query to the expected Intent. Another dimension along which to evaluate the bot is its ability to parse Parameter values with the least number of prompts.
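The intent-level part of this measurement amounts to a simple match rate over unseen queries. The classifier below is a toy stand-in, since the platforms keep theirs blackboxed:

```python
def intent_accuracy(test_set, classify):
    """Fraction of unseen queries mapped to their expected Intent.

    `test_set` is a list of (query, expected_intent) pairs and
    `classify` is whatever Intent classifier the platform exposes.
    """
    hits = sum(1 for query, expected in test_set
               if classify(query) == expected)
    return hits / len(test_set)

# A toy keyword classifier, for illustration only.
def toy_classify(query):
    return "cricket_query" if "cricket" in query.lower() else "others"

tests = [("Who won the cricket world cup?", "cricket_query"),
         ("Tell me a joke", "others"),
         ("Explain cricket rules", "cricket_query"),
         ("What is chess?", "chess_query")]
print(intent_accuracy(tests, toy_classify))  # 3 of 4 queries match
```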

The first step, thus, in the process of building a bot, is to come up with appropriate Intents to cover all possible types of queries that the user may ask. Next, there needs to be an understanding of the Entities that will crop up in the conversations, and they need to be related to the respective Intents. Finally, a set of Sample Queries for each defined Intent needs to be collected, and supplied to the platform, before it can be tested. We do not put our focus on Fulfilments, assuming that they involve applying business logic to prepare a response, a task that is independent of the chatbot itself. There are two natural questions that arise out of this process. First, can there be more than one way to pick the Intents and Entities for the same use case? Second, if there are indeed multiple choices available, does the accuracy of the bot get affected when choosing one option over others? In this work, we attempt to look at these questions in greater detail. We call these choices Intent Sets, and comment upon their properties with the help of some experiments.

The rest of this paper is structured as follows. We begin

by mentioning some related literature in Section 2. We then introduce the concept of Intent Sets in Section 3. This is followed by observations from some experiments that we performed by building and evaluating a sample bot over three commercial platforms in Section 4. Finally, we conclude the paper citing future possibilities in Section 5.

2 Related Work

Brandtzaeg et al. [2] provide an overview of how we need to rethink our user interfaces in the wake of the Chatbot revolution. Researchers have started investigating these changes ([3], [13] etc.), as more systems are built with a “conversational interface” ([14], [5], [8] etc.). The success of Chatbots is fueled by advances in Neural IR, which deploys Deep Learning methods internally. These methods are thoroughly surveyed in multiple works ([6], [9] etc.). However, the models built this way are hard to explain, change or tune. This is why the Chatbots built on commercial platforms can have unpredictable behaviour. Efforts towards Information Retrieval with Chatbots, where the source data is in the form of documents, are already ongoing ([15], [11]). Some researchers are also working on a close, but different, task of generating textual responses from tabular data ([1], [12] etc.). However, these works focus on text generation rather than looking into the issues that arise when a commercial platform is used to build the conversational interface.

To the best of our knowledge, our work is the first to address architectural issues with solutions involving the use of chatbots. We have previously investigated [10] a formal approach towards picking a chatbot platform, given a set of quality attributes to achieve. This work intends to continue efforts towards bringing out architectural abstractions to support the use of chatbots in practical use cases.

3 Intent Sets

We now provide a semi-formal definition of Intent Sets:

An Intent Set is a collection of Intents and their associated Parameters, which can collectively cover all possible queries that a chatbot may encounter.

An Intent Set categorises any possible natural language phrase into one of the pre-defined categories, i.e. assigns an Intent to it. At the outset, it might seem a challenging task to come up with a set of Intents that can cover any user


[Figure 2: the query “Which batsman has scored more than 50 centuries?” parsed under two Intent Sets. IS1 predicts the Intent Player_Stat with Parameters type:batting, stat:centuries, input_op_1:>, input_val_1:50, venue:any (a default value) and format:--missing--, which makes the platform prompt the user (“Care to tell me the format(s)?”); IS2 predicts Batting_Stat with type:individual and the same remaining Parameters. Both reach the relevant data (Sachin Tendulkar | India | 51) and the response “Sachin Tendulkar from India scored 51 centuries in Test matches!!”]

Figure 2. Two possible Intent Sets - IS1 and IS2, for the query in Figure 1. The default Intent is a part of both Intent Sets.

input. However, commercial platforms provide a default Intent, pre-added to a new Chatbot project, which serves as an “others” category for the queries. This ensures that the architect can handle queries which are irrelevant or ambiguous in a graceful fashion (similar to a catch block that follows a try block in many programming languages). So, the task of defining the Intents can be done incrementally, by creating Intents in the order of their priority, leaving the rest to be handled by the default Intent.

In any case, modelling a real-world use case with the Intent-Entity paradigm is not straightforward. As an example, consider the Cricket Statistics website called Statsguru [4]. The website can be browsed to get peculiar details from the world of Cricket about players, teams, venues etc. However, the website provides a traditional Query-By-Example (QBE) style interface for filtering the required information. Consider the problem of building a chatbot which can accept queries in the form of a question, “map” it to a structured form, say an SQL statement or a Relational Algebra (RA) expression, pull out the required information and present it back to the user as a response. The difference between the QBE interface and a chatbot is highlighted in Figure 1.
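A minimal sketch of this “mapping” step, assuming the platform has already produced the Intent and Slot values. The table and column names follow the stats attributes used later in the experiments; a real system would use parameterised queries rather than string interpolation:

```python
# Mapping a parsed query (Intent + Slot values) to a structured form.
# The parse itself would come from the platform; names are illustrative.
def to_sql(parse: dict) -> str:
    """Render a Slot dictionary as a naive, illustration-only SQL
    statement (sorted for a deterministic condition order)."""
    conditions = " AND ".join(f"{col} = '{val}'"
                              for col, val in sorted(parse["slots"].items()))
    return f"SELECT * FROM stats WHERE {conditions}"

parse = {"intent": "Batting_Stat",
         "slots": {"Stat": "centuries", "MatchType": "test"}}
print(to_sql(parse))
# SELECT * FROM stats WHERE MatchType = 'test' AND Stat = 'centuries'
```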

For simplicity, we assume that the user asks about only one piece of information per query. The QBE interface provides hints on how to categorise a query into one of the pre-defined sets, i.e. Intents. For example, a query may be asking a stat about a “player”, a “team” or a particular “match”. We can thus create three Intents for our chatbot, and along with the default Intent, we should be able to cater to any possible query that the chatbot may receive. The finer filters that we see in the QBE interface need to be abstracted as Entities and mapped to the defined Intents as the set of Parameters required to prepare a response. Similarly, there can be another dimension to categorise the queries - as “batting”, “bowling” or “fielding” queries. Figure 2 shows how two Intent Sets can be “equivalent” to each other. It can be gleaned that the overall information required to process a query remains the same. What differs from one Intent Set to another is the top-level categorisation of the query, and the Parameters associated with the Intents. We now list some properties of Intent Sets:

• If the platform maps every query to the correct Intent, and parses all the Parameter values correctly, then the response produced by any Intent Set for a given chatbot is exactly the same. This may seem to run counter to the purpose of the current work, as it would make the study of Intent Sets uninteresting. However, no platform can practically have a 100% accuracy for either of these activities, meaning that in practice, different Intent Sets would have different accuracies.

• An Intent Set divides the real world into a finite number of disjoint scenarios, the union of which constitutes all possibilities that the chatbot can handle. The important term here is disjoint. For example, IS1 in Figure 2 assumes that every


MatchType | StatSpan | StatType1 | StatType2 | Stat | Player | Team | Details
test | innings | highest | batting | runs | Brian Lara | West Indies | scored 400* runs
odi | career | highest | bowling | wickets | Muttiah Muralitharan | Sri Lanka | took 534 wickets
odi | career | lowest | bowling | strike rate | Rashid Khan | Afghanistan | bowled 2623 deliveries
test | career | highest | batting | average | Sir Donald Bradman | Australia | had an average of 99.94
t20 | innings | highest | batting | strike rate | Dwayne Smith | West Indies | had a strike rate of 414.28
t20 | career | highest | batting | centuries | Rohit Sharma | India | scored 4 centuries
t20 | career | lowest | bowling | average | Rashid Khan | Afghanistan | has an average of 12.4
t20 | career | lowest | bowling | economy rate | Daniel Vettori | New Zealand | had economy rate of 5.7
test | innings | best | bowling | figures | Jim Laker | England | had figures of 51.2-23-10-53
test | career | highest | batting | centuries | Sachin Tendulkar | India | scored 51 centuries
test | career | highest | batting | double centuries | Sir Donald Bradman | Australia | scored 12 double centuries
. . .

Table 1. Some of the stats known to the Cricket Novice. The shaded columns (Player, Team and Details) are used by the bot to stitch a response.

query can be about (at most) one of the three - teams, players or matches. However, if it is theoretically possible to formulate a query that can be simultaneously, say, a “team” as well as an “individual” query, then IS1 ceases to be a valid Intent Set for the chatbot.
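When the categorisation rules can be made explicit, the disjointness property can be checked mechanically. The keyword predicates below are crude, illustrative stand-ins for a platform's opaque classifier:

```python
def is_valid_intent_set(queries, matchers):
    """Check the disjointness property: every query must match at most
    one defined Intent (queries matching none fall to the default).

    `matchers` maps an Intent name to a predicate over a query.
    Returns (ok, offending_query, clashing_intents).
    """
    for query in queries:
        matched = [name for name, pred in matchers.items() if pred(query)]
        if len(matched) > 1:          # overlap => not a valid Intent Set
            return False, query, matched
    return True, None, []

matchers = {
    "team_query":   lambda q: "team" in q.lower(),
    "player_query": lambda q: "player" in q.lower() or "batsman" in q.lower(),
}
ok, query, clash = is_valid_intent_set(
    ["Which batsman has scored more than 50 centuries?",
     "Which team has the best players?"],   # matches both Intents
    matchers)
print(ok, clash)
```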

Besides, if the chatbot only serves the purpose of providing a conversational wrapper over some multi-dimensional data (e.g. for Statsguru [4]), there are some derived properties for these Intent Sets:

• Only nominal attributes can be used to create Intent Sets. This is rather straightforward. Only an attribute that can take a fixed number of possible values can be used to create finite disjoint partitions of the data. If required, non-nominal attributes can be grouped into categories by defining ranges. However, platforms usually provide better support for modelling such attributes as Entities.

• The queries that the chatbot can handle must be atomic for the attribute used to create the Intent Set. This follows from the fact that the platform categorises the query into one and only one of the defined Intents. Formally, if the RA expression that fetches the data looks like:

Π_{A1, A2, A3, ...} σ_{(Ai = x ∧ Aj = y ∧ Ak = z ...)}

and the Intent Set was created with the attribute Ai, then one and only one condition involving Ai can be part of the RA expression.
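The atomicity constraint is mechanical to check once the selection conditions are available in a structured form; here they are assumed to be (attribute, operator, value) triples:

```python
def is_atomic_for(conditions, intent_attribute):
    """Return True if at most one selection condition involves the
    attribute used to create the Intent Set (the atomicity property).

    `conditions` is a list of (attribute, operator, value) triples
    standing in for the selection part of the RA expression.
    """
    uses = [attr for attr, _, _ in conditions if attr == intent_attribute]
    return len(uses) <= 1

# "centuries > 50 in Test matches" touches MatchType exactly once ...
assert is_atomic_for([("Stat", "=", "centuries"),
                      ("MatchType", "=", "test")], "MatchType")
# ... but "in Tests or ODIs" would touch it twice, so an Intent Set
# built on MatchType cannot serve that query.
print(is_atomic_for([("MatchType", "=", "test"),
                     ("MatchType", "=", "odi")], "MatchType"))  # False
```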

4 Experimental Observations

As discussed in Section 3, the activities of Intent matching and Parameter parsing involve Natural Language Processing (NLP) tasks, where errors cannot be avoided fully. The platform may not map some queries to their correct Intents, and may mess up the parsing of Parameters as well. Platforms usually keep the details of these activities “blackboxed”, so there is no way to predict these failures beforehand. To observe how the choice of picking one Intent Set over the other may affect a bot’s accuracy, we performed a set of experiments. We built a bot called Cricket Novice, which can answer basic questions about the game of Cricket. It can also answer some statistical questions about players in International Cricket, such as those with the best average or strike rate. We describe our experimental setup briefly:

• We created 25 Descriptive Intents with a static response. These Intents did not have any Parameters, as they only informed the user about the basics of the game. For instance, these Intents answer queries such as “How is Cricket played?” or “What is the role of an Umpire?”.

• We hand-picked 35 rows of statistics about individuals from Statsguru [4], and stored them in a table. So, Cricket Novice also acts as a conversational wrapper to extract stats out of this table. The structure of this table and some of its rows are shown in Table 1.

• We then created two Intent Sets with Statistical Intents to cover all the user queries that Cricket Novice could be asked. These Intent Sets varied in the number of Intents, as well as in their Parameters. In many cases, even a genuine Cricket query was supposed to be handled by the default Intent, due to the lack of data in our table. Figure 3 provides an overview of the two Intent Sets used for our experiments.

• For every row in the stats table, we created 3 sample user queries which the user could ask to retrieve that stat. For example, “Who has the highest batting average in Test matches?” and “Which player holds the record for the best batting average in Test cricket?” are formulations of the same query. These Sample Queries are used to train the platform for the Intent matching step. We created similar formulations for the descriptive Intents as well.

• We picked three platforms - Dialogflow, Watson Assistant and Lex - and created two versions of Cricket Novice on all of them. The Descriptive Intents remained the same for both versions. The Statistical Intents were created separately using the Intent Sets shown in Figure 3. We


[Figure 3: for each Intent Set, the Entities with their allowed values (MatchType: test / odi / t20; StatSpan: innings / match / career; Stat or StatType with values such as runs, wickets, batting and bowling average, batting and bowling strike rate, economy rate, figures, centuries and double centuries, or batting / bowling), and a grid marking, per Intent, which Parameters are required (✔ Parameter required; ✖ Parameter not required; ✖ Parameter not required because of lack of data).]

Figure 3. Statistical Intents for Cricket Novice. The Descriptive and default Intents were common.

added the Sample Queries for Statistical Intents in phases. During Phase 1, we provided only one-third of the Sample Queries for each Intent. In Phases 2 and 3, we added the remaining two-thirds of the formulations.

• We also added a set of 10 negative examples in Phase 4, to aid the triggering of the default Intent. These queries were meant to reduce the false positives, i.e. the cases where the query was mapped to one of the defined Intents when, ideally, it should have triggered the default Intent.

• We created a set of 20 unseen queries, which were fired at the bot after each phase to observe its behaviour. If the bot could match the query to the correct Intent, as well as parse all the Parameters correctly, we called it a success (a score of 1). If the matched Intent was correct, but the bot botched up the Parameter parsing, we called it a partial success (a score of 0.5). Otherwise, we called it a failure (a score of 0). For each phase, we summed up the scores for all 20 queries as the Utility Score of the bot.

• We also performed an Ordering Experiment over the bots created on Dialogflow and Watson Assistant. In this experiment, we evaluated the states of the bots after Phase 2. These states were reached in two different ways - by adding more Sample Queries to the Phase 1 bot, and by removing Sample Queries from the Phase 3 bot. In both cases, the final set of Sample Queries available to the platform was the same; however, the order in which they were received changed. This experiment was done because, in an attempt to provide a better testing experience, Dialogflow and Watson Assistant re-train their models even on the slightest changes in Sample Queries. This hinted towards an incremental approach to building the model, meaning that the order in which these platforms received the queries affected the final model. Lex, on the other hand, allows users to make all changes to the bot, and then explicitly re-build the model.
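The 1 / 0.5 / 0 scoring rule described above can be written down directly; the per-query outcome counts below are fabricated, purely to exercise the rule:

```python
def query_score(intent_correct: bool, params_correct: bool) -> float:
    """Score one unseen query: 1 for full success, 0.5 if only the
    Intent was matched, 0 otherwise (a wrong Intent scores 0 even if
    some Parameters happened to parse)."""
    if not intent_correct:
        return 0.0
    return 1.0 if params_correct else 0.5

def utility_score(outcomes) -> float:
    """Sum of per-query scores over the set of unseen test queries."""
    return sum(query_score(i, p) for i, p in outcomes)

# 14 full successes, 4 partial successes, 2 failures (illustrative).
outcomes = [(True, True)] * 14 + [(True, False)] * 4 + [(False, False)] * 2
print(utility_score(outcomes))  # 14*1 + 4*0.5 = 16.0
```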

Figure 4 shows our observations in a nutshell. They can be summarised as:

1. The two Intent Sets produce significantly different results. The Utility Scores showed variations for the same bot across different phases, Intent Sets as well as platforms.

2. The accuracy of the system generally got better with the addition of more Sample Queries.

3. Except for Watson Assistant, Intent Set2 seemed to be performing better than Intent Set1.

4. There was no conclusive inference on the effect of the negative examples on the overall accuracy of the system.

5. During the Ordering Experiment, we observed that the bot built over Dialogflow showed significant variations


[Figure 4: bar charts of Utility Scores (scale 0–20) over Phases 1–4 for the two Intent Sets on each platform. Dialogflow: 14.0, 17.0, 18.0, 18.0 for one Intent Set against 5.5, 7.0, 13.0, 12.0 for the other; Watson Assistant: 14.0, 16.0, 14.5, 15.0 against 8.0, 16.0, 16.5, 16.5; Lex: 12.0, 16.5, 17.0, 17.0 against 8.0, 9.5, 9.5, 11.5.]

Figure 4. Variations in Utility Scores for the two Intent Sets shown in Figure 3 for Cricket Novice over three platforms.

when the order of the samples was changed, for both Intent Sets. The bot built over Watson Assistant showed minor variations with Intent Set1, but no variations were seen with Intent Set2.

We must clarify that these observations are not evidence of the superiority of one Intent Set over others, or of one platform being better than the others. These observations are specific to the data set we used to build our bot, and cannot be generalised. The notable point here is that picking one Intent Set over another changes the behaviour of the bot significantly.

5 Conclusion and Future Work

In this work, we presented the concept of Intent Sets, an architectural choice which can impact the accuracy of a chatbot. We discussed our observations noted while creating an experimental bot using two Intent Sets over three platforms - Dialogflow, Watson Assistant and Lex.

We briefly discuss some future dimensions for our work.

First, for chatbots that work over tabular data, there can be efforts at “predicting” if one Intent Set is better than another, based on a statistical analysis of the data. One exciting dimension to explore is the relationship between the number of possible values for an attribute and the Intent Set that is built using it. Another dimension could be performing similar experiments for more complex data, and providing helpful guidelines to practitioners about picking one Intent Set over others. Second, there can be efforts to comment on the “blackboxed” platforms, by observing their behaviour over multiple Intent Sets for the same problem. This may help in picking the right platform for the right problem.

References

[1] Junwei Bao, Duyu Tang, Nan Duan, Zhao Yan, Yuanhua Lv, Ming Zhou, and Tiejun Zhao. 2018. Table-to-text: Describing table region with natural language. In Thirty-Second AAAI Conference on Artificial Intelligence.

[2] Petter Bae Brandtzaeg and Asbjørn Følstad. 2018. Chatbots: changing user needs and motivations. Interactions 25, 5 (2018), 38–43.

[3] Ana Paula Chaves and Marco Aurelio Gerosa. 2019. How should my chatbot interact? A survey on human-chatbot interaction design. arXiv preprint arXiv:1904.02743 (2019).

[4] ESPNcricinfo.com. 1993. Statsguru | Searchable Cricket Statistics database. http://stats.espncricinfo.com/ci/engine/stats/index.html [Online; accessed 9-December-2019].

[5] Ethan Fast, Binbin Chen, Julia Mendelsohn, Jonathan Bassen, and Michael S Bernstein. 2018. Iris: A conversational agent for complex tasks. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. ACM, 473.

[6] Jianfeng Gao, Michel Galley, and Lihong Li. 2019. Neural Approaches to Conversational AI: Question Answering, Task-oriented Dialogues and Social Chatbots. Now Foundations and Trends.

[7] Gidi Shperber. 2017. ChatBots vs Reality: how to build an efficient chatbot, with wise usage of NLP. https://towardsdatascience.com/chatbots-vs-reality-how-to-build-an-efficient-chatbot-with-wise-usage-of-nlp-77f41949bf08 [Online; accessed 9-December-2019].

[8] Robin Håvik, Jo Dugstad Wake, Eivind Flobak, Astri Lundervold, and Frode Guribye. 2018. A Conversational Interface for Self-screening for ADHD in Adults. In International Conference on Internet Science. Springer, 133–144.

[9] Kezban Dilek Onal, Ye Zhang, Ismail Sengor Altingovde, Md Mustafizur Rahman, Pinar Karagoz, Alex Braylan, Brandon Dang, Heng-Lu Chang, Henna Kim, Quinten McNamara, et al. 2018. Neural information retrieval: At the end of the early years. Information Retrieval Journal 21, 2-3 (2018), 111–182.

[10] Saurabh Srivastava and T.V. Prabhakar. 2019. Hospitality of chatbot building platforms. In Proceedings of the 2nd ACM SIGSOFT International Workshop on Software Qualities and Their Dependencies. ACM, 12–19.

[11] theta.co.nz. 1995. FAQ Bot - AI chatbot that answers questions, instantly, 24/7. https://www.theta.co.nz/technologies/faq-bot/ [Online; accessed 9-December-2019].

[12] Svitlana Vakulenko and Vadim Savenkov. 2017. TableQA: Question answering on tabular data. arXiv preprint arXiv:1705.06504 (2017).

[13] Lisa Waldera. 2019. Development of a Preliminary Measurement Tool of User Satisfaction for Information-Retrieval Chatbots. B.S. thesis. University of Twente.

[14] Xinyi Wang, Samuel S Sohn, and Mubbasir Kapadia. 2019. Towards a Conversational Interface for Authoring Intelligent Virtual Characters. In Proceedings of the 19th ACM International Conference on Intelligent Virtual Agents. ACM, 127–129.

[15] Zhao Yan, Nan Duan, Junwei Bao, Peng Chen, Ming Zhou, Zhoujun Li, and Jianshe Zhou. 2016. DocChat: An information retrieval approach for chatbot engines using unstructured documents. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 516–525.