The LIFELIKE SYSTEM
ASK ALEX STORYBOARD
ISL
LifeLike System Overview
LifeLike Recognizer (Leon-Barth, Dookhoo)
LifeLike Dialog Manager (Nguyen, Hung)
LifeLike Speech Output System (UIC)
User Input
Voice Input
LifeLike Output
Multi-modal, voice-input avatar with expertise on the National Science Foundation Industry/University Cooperative Research Centers (I/UCRC) program
LifeLike Recognizer uses untrained Automatic Speech Recognition methods enhanced by context-specific grammars
LifeLike Dialog Manager implements a Context-Based Reasoning architecture driven by multiple user-centered goals
Funded by the National Science Foundation
Synthesized Response
Recognized Input
Context
Response String
Response String
LifeLike Recognizer
Chant middleware provides dictation and grammar control
SAPI 5.1 recognizes speech and provides services
LifeLike Recognizer Architecture (Leon-Barth, Dookhoo)
Layers: Smart Layer, SRC Layer, SRE Layer
Components: LifeLike Recognizer, ChantSR
Speech APIs: SAPI4, SAPI5, SMAPI, Dragon, VoCon
Recognizers: SAPI4 Recognizer, SAPI5 Recognizer, IBM ViaVoice, Dragon NaturallySpeaking, Nuance VoCon 3200
Grammar rule content follows the W3C Speech Recognition Grammar Specification (SRGS)
Grammars contain a limited, context-specific vocabulary
Matching phonemes against grammars improves recognition accuracy and speed
MS SAPI 5.1
LifeLike Recognizer
Grammars XML
Text Repository
Dictation Mode
Grammar Recognition
LifeLike Dialog System
Synchronization
ChantSR
Microsoft SAPI 5.1 & Chant
Speech recognition: converting an acoustic signal (i.e., audio data) captured by a microphone into text
Microsoft Speech SDK: toolkit for building speech engines and applications for Microsoft Windows
CHANT SpeechKit: speech recognition management class that provides a productive way to develop software that listens
Your application sets properties and invokes methods through the speech recognition management class
The class handles the low-level functions of speech recognition engines (i.e., recognizers)
Voice
Chant/C#
MS SDK
SAPI
Speech Recognition
XML W3C Grammars
<GRAMMAR LANGID="409">
<RULE TOPLEVEL="ACTIVE" NAME="CenterID">
<PHRASE>
<!--Welcome back avelino which center are you from?-->
<RULEREF NAME="agencyCenters"/>
</PHRASE>
</RULE>
<P><DICTATION MIN="1" MAX="3"/></P>
<LIST PROPNAME="agencyCenters">
<PHRASE VALSTR="UT">u t</PHRASE>
<PHRASE VALSTR="UCF">central florida</PHRASE>
<PHRASE VALSTR="UIC">illinois ?at chicago</PHRASE>
<PHRASE VALSTR="UT">university ?of texas</PHRASE>
<PHRASE VALSTR="UIC">u i c</PHRASE>
<PHRASE VALSTR="UCF">u c f</PHRASE>
</LIST>
Main Rule
Dictation Rule
Phrase List
Tradeoff between coverage and conflicts
Standardized way to write grammars for recognizers
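To make the phrase list concrete, here is a minimal Python sketch of how a recognized utterance could be mapped to its VALSTR value. It is only an illustration, not the LifeLike Recognizer or SAPI itself: the helper names `expand`, `load_phrases` and `match` are hypothetical, and the sketch assumes that a leading `?` marks an optional word, as in the phrases on the slide.

```python
import xml.etree.ElementTree as ET
from itertools import product

# Phrase list copied from the grammar fragment on the slide.
PHRASE_LIST_XML = """
<LIST PROPNAME="agencyCenters">
  <PHRASE VALSTR="UT">u t</PHRASE>
  <PHRASE VALSTR="UCF">central florida</PHRASE>
  <PHRASE VALSTR="UIC">illinois ?at chicago</PHRASE>
  <PHRASE VALSTR="UT">university ?of texas</PHRASE>
  <PHRASE VALSTR="UIC">u i c</PHRASE>
  <PHRASE VALSTR="UCF">u c f</PHRASE>
</LIST>
"""

def expand(phrase):
    """Return every word sequence a phrase accepts ('?word' is optional)."""
    choices = []
    for word in phrase.split():
        if word.startswith("?"):
            choices.append(("", word[1:]))  # word may be absent or present
        else:
            choices.append((word,))
    return {" ".join(w for w in combo if w) for combo in product(*choices)}

def load_phrases(xml_text):
    """Build a lookup table from accepted word sequence to VALSTR."""
    table = {}
    for node in ET.fromstring(xml_text.strip()).iter("PHRASE"):
        for variant in expand(node.text):
            table[variant] = node.get("VALSTR")
    return table

def match(table, utterance):
    """Normalize case and spacing, then look the utterance up (None if absent)."""
    return table.get(" ".join(utterance.lower().split()))

CENTERS = load_phrases(PHRASE_LIST_XML)
print(match(CENTERS, "Illinois Chicago"))     # UIC
print(match(CENTERS, "university of texas"))  # UT
```

The small table also shows the coverage/conflict tradeoff: every added phrase variant widens what can be matched, but two centers must never share a variant.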
LifeLike Dialog System
LifeLike Dialog System (Hung, Nguyen)
Knowledge Manager (Nguyen, Hung)
Speech Disambiguator (Hung)
Context-Based Dialogue Manager (Hung)
LifeLike Recognizer (Leon-Barth, Dookhoo)
LifeLike Speech Output (UIC)
Speech Disambiguator
• Spell Check (contextualized spelling and phonetic matching)
• Semantics Check (linguistic analysis using an NLP toolkit)
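As a rough illustration of contextualized spelling correction, the sketch below snaps each dictated word to the closest word in the active context's vocabulary. It uses Python's standard `difflib` as a stand-in, covers only the spelling half (no phonetic matching), and is not the Speech Disambiguator's actual method; the function name, vocabulary and cutoff are assumptions.

```python
import difflib

def correct_to_context(dictated, vocabulary, cutoff=0.75):
    """Replace each word with its closest context-vocabulary word, if any."""
    corrected = []
    for word in dictated.lower().split():
        if word in vocabulary:
            corrected.append(word)
            continue
        # get_close_matches returns at most n candidates scoring >= cutoff.
        close = difflib.get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
        corrected.append(close[0] if close else word)
    return " ".join(corrected)

# Hypothetical vocabulary for a "which center are you from?" context.
CENTER_VOCABULARY = ["university", "central", "florida",
                     "illinois", "chicago", "texas"]

print(correct_to_context("centrel flordia", CENTER_VOCABULARY))  # central florida
```

Restricting candidates to the current context keeps the vocabulary small, which is what makes a loose similarity cutoff usable.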
Knowledge Manager
• User-centered (NSF user profiles in XML format)
• Domain-specific (AskAlex Ontology)
• General knowledge (WordNet, ConceptNet, Semantic Web)
• Subsets of data extracted into Context-Specific Knowledge
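The idea of extracting a context-specific subset from an XML user profile can be sketched as follows. The profile schema here (`user`, `fact`, `context`, `key`) and its values are invented purely for illustration; the slides do not show the real NSF profile format.

```python
import xml.etree.ElementTree as ET

# Hypothetical profile layout: each fact is tagged with the context it serves.
PROFILE_XML = """
<user name="avelino">
  <fact context="CenterID" key="center">UCF</fact>
  <fact context="Funding" key="program">I/UCRC</fact>
</user>
"""

def context_subset(profile_xml, context):
    """Return only the facts tagged for the requested context."""
    root = ET.fromstring(profile_xml.strip())
    return {fact.get("key"): fact.text
            for fact in root.iter("fact")
            if fact.get("context") == context}

print(context_subset(PROFILE_XML, "CenterID"))  # {'center': 'UCF'}
```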
Context-Based Dialogue Manager
• Context-Based Reasoning architecture
• Multiple, asynchronous user goal recognition
• Conversational Primitives
• Domain-Specific Contexts
• AIML (Artificial Intelligence Markup Language) chatbot models
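One way to picture context switching with several pending user goals is a stack of contexts: a new goal suspends the current one, and finishing it resumes the one below. The sketch is a simplified stand-in, not the Context-Based Reasoning architecture used by the authors; the class, the keyword triggers and the responses are all hypothetical.

```python
class DialogueManager:
    """Keeps the active context on top of a stack of suspended goals."""

    def __init__(self, contexts, default):
        self.contexts = contexts  # name -> (trigger keywords, response)
        self.stack = [default]

    @property
    def active(self):
        return self.stack[-1]

    def handle(self, utterance):
        """Switch context when a trigger word is heard, then respond."""
        words = set(utterance.lower().split())
        for name, (triggers, response) in self.contexts.items():
            if name != self.active and words & triggers:
                self.stack.append(name)  # suspend the current goal
                return response
        return self.contexts[self.active][1]

    def complete(self):
        """Finish the active goal and resume the previous one."""
        if len(self.stack) > 1:
            self.stack.pop()
        return self.active

# Hypothetical contexts for illustration only.
CONTEXTS = {
    "Greeting": (set(), "Hello, how can I help?"),
    "CenterID": ({"center"}, "Which center are you from?"),
    "Funding": ({"funding", "grant"}, "Which program are you asking about?"),
}

dm = DialogueManager(CONTEXTS, "Greeting")
print(dm.handle("tell me about my center"))  # Which center are you from?
print(dm.handle("what about funding"))       # Which program are you asking about?
print(dm.complete())                         # CenterID
```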
LifeLike Dialog System Architecture (Hung, Nguyen)
Components: Knowledge Manager (Nguyen, Hung), Speech Disambiguator (Hung), Context-Based Dialogue Manager (Hung)
Speech Disambiguator checks: Spell Check, Semantics Check
Knowledge sources: NSF User Data, General Knowledge, AskAlex Ontology, Context-Specific Knowledge
External systems: LifeLike Recognizer (Leon-Barth, Dookhoo), LifeLike Speech Output (UIC)
Data flows: Dictation String, Phrase String, Context, Disambiguated String, Context Dataset, Response String, Updated Data
Knowledge Manager Repository Building
Problem – Disconnect between user profile and avatar
Objective – Create a relational memory profile of the user
Approach – XML Representation, AskAlex Ontology, Relational Interaction
System Communication Protocol
• Avatar and LifeLike Dialog System (LDS)
  – Shared Memory and Socket
  – Variable Length Delimiter Protocol
    • Avatar Origin - start, stop
    • LDS Origin - Speech, Text, and Documents
      – In-dialog XML-style markup
• LifeLike Recognizer (LR) and LDS
  – Socket
  – Variable Length Delimiter Protocol
    • LR Origin - speech interpretation
    • LDS Origin - start, stop, contextual information
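A variable-length, delimiter-terminated message protocol can be sketched in a few lines. The slides do not say which delimiter or encoding LifeLike uses, so the newline delimiter, UTF-8 encoding and the names `frame` and `FrameReader` below are assumptions. The reader buffers partial data because a socket read may return any fraction of a message.

```python
DELIMITER = b"\n"  # assumed delimiter; the real protocol's delimiter is not given

def frame(message):
    """Encode one message and terminate it with the delimiter."""
    data = message.encode("utf-8")
    if DELIMITER in data:
        raise ValueError("message must not contain the delimiter")
    return data + DELIMITER

class FrameReader:
    """Reassembles complete messages from arbitrarily split chunks."""

    def __init__(self):
        self._buffer = b""

    def feed(self, chunk):
        """Add received bytes; return every message completed so far."""
        self._buffer += chunk
        # The last piece is an unfinished message (possibly empty): keep it.
        *complete, self._buffer = self._buffer.split(DELIMITER)
        return [part.decode("utf-8") for part in complete]

reader = FrameReader()
stream = frame("start") + frame("speech:hello")
print(reader.feed(stream[:3]))   # []
print(reader.feed(stream[3:9]))  # ['start']
print(reader.feed(stream[9:]))   # ['speech:hello']
```

In a live system each chunk passed to `feed` would come from a socket receive call rather than from slicing a byte string.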
January Storyboard Sequence
1. Initiate Interaction
• Avatar → IE
• IE → Avatar, SR
2. User Speaks
• SR → IE
3. Avatar Reacts
• IE → Avatar, SR
4. Go to 2 until complete
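The storyboard loop can be traced with a short sketch that lists the message hops for a dialog of a given length. It reads each pair in the storyboard as sender then receiver, which is an assumption about the arrows on the original slide, and the function name is hypothetical.

```python
def storyboard_hops(user_turns):
    """List (step, sender, receiver) hops for a dialog with `user_turns` turns."""
    hops = [
        ("1. Initiate Interaction", "Avatar", "IE"),
        ("1. Initiate Interaction", "IE", "Avatar"),
        ("1. Initiate Interaction", "IE", "SR"),
    ]
    for _ in range(user_turns):  # step 4: repeat from step 2 until complete
        hops.append(("2. User Speaks", "SR", "IE"))
        hops.append(("3. Avatar Reacts", "IE", "Avatar"))
        hops.append(("3. Avatar Reacts", "IE", "SR"))
    return hops

for step, sender, receiver in storyboard_hops(1):
    print(f"{step}: {sender} -> {receiver}")
```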
LIFELIKE OPEN DISCUSSION
Open Discussion
Synchronization
Communication Protocols
Testing/Integration
Recognition
Dialog system
Animation system