Challenges & Design Patterns for Conversational AI
Peter Skomoroch, Head of Data Products
Introductions
Peter Skomoroch (@peteskomoroch)
• Co-Founder and CEO of SkipFlag, Enterprise AI startup acquired by Workday
• Co-Host of O’Reilly AI Bots Podcast
• Principal Scientist and early member of the data team at LinkedIn
• Machine Learning and Search at MIT, AOL
Common Scenarios in AI Conversations
• Didn't understand request
• Wrong interpretation of request
• No results found
• No memory of past conversations
• No knowledge of user’s identity
• No grasp of slang, typos, jargon
• Entity disambiguation errors
Credit: @JamieSkella
Common Approaches to Conversational AI
• Rule Based Bots & Heuristics
• Slot Filling & Intent Classification
• Generative Models
• Retrieval Based Models
Rule Based Bots & Heuristics
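As an illustration of the rule-based approach, here is a minimal Python sketch, assuming a small hand-written list of patterns and canned responses. The rules, responses, and the function name rule_based_reply are hypothetical and are not taken from the talk.

```python
import re

# Hypothetical rules: (pattern, canned response). Order matters; the first match wins.
RULES = [
    (re.compile(r"\b(hi|hello|hey)\b", re.I), "Hello! How can I help?"),
    (re.compile(r"\bhours?\b", re.I), "We are open 9am to 5pm, Monday to Friday."),
    (re.compile(r"\b(bye|goodbye)\b", re.I), "Goodbye!"),
]

# Returned when no rule matches (the "didn't understand request" scenario).
FALLBACK = "Sorry, I didn't understand that."


def rule_based_reply(message: str) -> str:
    """Return the response of the first rule whose pattern matches the message."""
    for pattern, response in RULES:
        if pattern.search(message):
            return response
    return FALLBACK
```

A bot like this is easy to build and debug, but anything outside the listed patterns falls through to the fallback reply.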
Slot Filling & Intents
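A rough sketch of slot filling and intent classification in Python, assuming keyword overlap stands in for a trained intent classifier and regular expressions stand in for a trained slot tagger. The intents, keywords, and slot patterns below are hypothetical examples.

```python
import re
from dataclasses import dataclass, field

# Hypothetical intents, each described by a small keyword set.
INTENT_KEYWORDS = {
    "book_flight": {"flight", "fly", "book"},
    "get_weather": {"weather", "forecast", "rain"},
}

# Hypothetical slot patterns; group 1 of each pattern is the slot value.
SLOT_PATTERNS = {
    "destination": re.compile(r"\bto ([A-Z][a-z]+)"),
    "date": re.compile(r"\b(today|tomorrow)\b", re.I),
}


@dataclass
class Parse:
    intent: str
    slots: dict = field(default_factory=dict)


def parse(utterance: str) -> Parse:
    """Pick the intent with the most keyword hits, then fill slots by regex."""
    tokens = set(re.findall(r"[a-z]+", utterance.lower()))
    scores = {name: len(tokens & kws) for name, kws in INTENT_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    intent = best if scores[best] > 0 else "unknown"

    slots = {}
    for name, pattern in SLOT_PATTERNS.items():
        match = pattern.search(utterance)
        if match:
            slots[name] = match.group(1)
    return Parse(intent, slots)
```

For example, parse("Book a flight to Boston tomorrow") yields the intent book_flight with the slots destination = "Boston" and date = "tomorrow". A production system would typically learn both steps from labeled utterances.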
Generative Models
Smart Reply: Automated Response Suggestion for Email (Kannan et al.)
Retrieval Based Models
https://rajpurkar.github.io/SQuAD-explorer/
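One simple way to picture a retrieval-based model is nearest-neighbor lookup over stored question and answer pairs. The Python sketch below assumes bag-of-words cosine similarity as the ranking function; the FAQ entries, the retrieve function, and the min_score threshold are hypothetical, and a real system would more likely use learned embeddings or a reading-comprehension model.

```python
import math
import re
from collections import Counter

# Hypothetical FAQ: stored question -> canned answer.
FAQ = {
    "How do I reset my password?": "Use the 'Forgot password' link on the login page.",
    "How do I request time off?": "Submit a time-off request in the HR portal.",
    "Where can I find my pay slip?": "Pay slips are under the Pay section of your profile.",
}


def bow(text: str) -> Counter:
    """Lowercased bag-of-words vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, min_score: float = 0.3):
    """Return the answer of the most similar stored question, or None ("no results found")."""
    q = bow(query)
    best_question, best_score = max(
        ((question, cosine(q, bow(question))) for question in FAQ),
        key=lambda pair: pair[1],
    )
    return FAQ[best_question] if best_score >= min_score else None
```

The threshold makes the "no results found" case explicit instead of always returning the closest match.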
Narrow Domains vs. Unconstrained Conversations
Knowledge Graphs & Conversational AI
SkipFlag: A Knowledge Base That Builds Itself
• Smart Knowledge Base
• Expert Identification
• Instant Answers
Content is auto-organized into a Knowledge Graph
Entity Understanding and Linking
Job Description: Know python and django, and have some experience with docker
Python: High-level programming language
Docker: Computer program
Django: Software
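The linking step on this slide can be sketched as a dictionary lookup in Python. The toy knowledge base below contains only the three entities shown above; the link_entities function is a hypothetical illustration and does no disambiguation, which a real entity linker would need (for example, "python" the language versus the snake).

```python
import re

# Toy knowledge base built from the slide: lowercase mention -> (canonical name, description).
KB = {
    "python": ("Python", "High-level programming language"),
    "docker": ("Docker", "Computer program"),
    "django": ("Django", "Software"),
}


def link_entities(text: str):
    """Return (mention, canonical name, description) for each word found in the KB."""
    links = []
    for match in re.finditer(r"[A-Za-z]+", text):
        entry = KB.get(match.group().lower())
        if entry:
            links.append((match.group(), entry[0], entry[1]))
    return links
```

Applied to the job description above, this returns links for python, django, and docker in the order they appear.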
Fact Extraction from Text with Linked Entities
Workday was founded by David Duffield, founder and former CEO of ERP company PeopleSoft, and former PeopleSoft chief strategist Aneel Bhusri. It is an on-demand (cloud-based) financial management and human capital management software vendor.
<Workday, Inc.> <founded by> <David Duffield>
<Workday, Inc.> <founded by> <Aneel Bhusri>
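A minimal sketch of how triples like these could be produced in Python, assuming the entity mentions have already been linked and typed by an earlier step. The Entity type, the extract_founded_by function, and the single "founded by" trigger are hypothetical simplifications, not the method used in the talk.

```python
from typing import NamedTuple


class Entity(NamedTuple):
    surface: str    # text as it appears in the sentence
    canonical: str  # linked knowledge base name
    kind: str       # "ORG" or "PERSON"


def extract_founded_by(sentence: str, entities: list):
    """Heuristic: first ORG before the trigger is the subject; every PERSON after it is an object."""
    trigger = "founded by"
    pos = sentence.find(trigger)
    if pos == -1:
        return []
    subjects = [
        e for e in entities
        if e.kind == "ORG" and 0 <= sentence.find(e.surface) < pos
    ]
    if not subjects:
        return []
    people = [
        e for e in entities
        if e.kind == "PERSON" and sentence.find(e.surface) > pos
    ]
    return [(subjects[0].canonical, trigger, p.canonical) for p in people]


sentence = (
    "Workday was founded by David Duffield, founder and former CEO of ERP "
    "company PeopleSoft, and former PeopleSoft chief strategist Aneel Bhusri."
)
entities = [
    Entity("Workday", "Workday, Inc.", "ORG"),
    Entity("David Duffield", "David Duffield", "PERSON"),
    Entity("PeopleSoft", "PeopleSoft", "ORG"),
    Entity("Aneel Bhusri", "Aneel Bhusri", "PERSON"),
]
triples = extract_founded_by(sentence, entities)
```

Because the entities are linked first, the subject comes out as the canonical name "Workday, Inc." rather than the surface form "Workday".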
Good Training Data is Often the Bottleneck
Credit: @mrogati
Entity Understanding Training Data
Common Crawl: ~4B pages monthly
Challenge: Workplace Dialogue and Internal Jargon
Product Manager: Job Title
Agora Project: Internal Project
Site Analytics: Internal Team
Workplace Conversations
● Short messages
● Incomplete sentences
● Grammatical errors
● Alternating speakers
● Meandering topics
● Internal jargon
● Overlapping chat conversations
Conover et al., “Pangloss: Fast entity linking in noisy text environments”, KDD 2018 (to appear)
Question Answering Training Data
Parting thoughts: Platform Level Challenges
• Don’t assume that building a bot for a messaging platform is easier than building an app. If you are training conversational AI, it is much harder.
• User retention issues will cause most bots to fail unless platforms let them be ambient and contextual.
• Distribution and discovery are still challenging on messaging platforms. You need users to get the conversation data flywheel going.
• Google Assistant and Alexa are becoming a higher-level discovery layer that delegates requests to third-party skills or conversational agents.
Q&A