Voice-Enabling Chatbots

Post on 05-Dec-2014


description

You don't have to put your ear to the ground to hear it coming. The broad introduction of Voice User Interfaces, which allow users to interact with mobile devices by voice, may become the biggest advancement in user interface design since the transition from text-based to graphical user interfaces.

Transcript of Voice-Enabling Chatbots

Slides: http://wolfpaulus.com/slides
Code: git clone https://github.com/wolfpaulus/bots.git

Voice-Enabling Chatbots

© 2012-2013 Wolf Paulus - http://wolfpaulus.com

Star Trek

Graphical User Interfaces - Mac System 1 (1984), Windows 95 (1995)

Voice User Interfaces - Ford Sync, Siri, and Cora

Speed and Accuracy of Speech Recognition

Wearable Computing - Google Project Glass, Pebble Watch, Apple iWatch (concept)

Artist on Android

Artist on Android, Voice User Interface flattens navigation and configuration hierarchies

Speech Recognition Client - Record, Encode, Compress, Send, Receive Transcription and Confidence

Adaptive Multi-Rate Narrowband Speech Codec: 8 kHz sampling rate and 12 kbit/s encoding rate
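To get a feel for why the client compresses audio before sending it, the codec figures above can be turned into a rough size estimate. This is a back-of-the-envelope sketch, not part of the original deck: the 16-bit mono PCM baseline and the 5-second utterance are assumptions, and frame headers and container overhead are ignored.

```java
// Rough payload estimate for one utterance, raw PCM versus AMR-NB.
final class AmrBandwidth {

    /** Bit rate of uncompressed mono PCM audio. */
    static int pcmBitsPerSecond(int sampleRateHz, int bitsPerSample) {
        return sampleRateHz * bitsPerSample;
    }

    /** Payload size in bytes for a clip of the given length at the given bit rate. */
    static long bytesFor(int bitsPerSecond, double seconds) {
        return Math.round(bitsPerSecond * seconds / 8.0);
    }

    public static void main(String[] args) {
        int raw = pcmBitsPerSecond(8_000, 16); // assumption: 16-bit samples at 8 kHz
        int amr = 12_000;                      // encoding rate quoted on the slide
        double seconds = 5.0;                  // assumption: a 5-second utterance

        System.out.println("raw PCM: " + bytesFor(raw, seconds) + " bytes");
        System.out.println("AMR-NB : " + bytesFor(amr, seconds) + " bytes");
        System.out.printf("ratio  : %.1fx%n", (double) raw / amr);
    }
}
```

Under these assumptions the encoded utterance is roughly a tenth the size of the raw recording, which matters on a mobile uplink.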


Horsemen of Speech Recognition

2001: A Space Odyssey

Speech Synthesis

Use a pre-installed Text-To-Speech Engine

Package and ship a distinct Synthesizer and Voice with mobile application

Use a web service to synthesize text into speech audio (VAAS)

Voice Matters

Speech Synthesis on Android

Creating a simple Voice-Enabled Android App

• Echo Bot
• AIML Bot


Cora

Speech Recognition

private void startVoiceRecognitionActivity() {
    final Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);

    // Specify the calling package to identify your application.
    intent.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE, getClass().getPackage().getName());

    // Display a hint to the user about what to say.
    intent.putExtra(RecognizerIntent.EXTRA_PROMPT, getResources().getString(R.string.speakPROMPT));

    // Give the recognizer a hint about what the user is going to say.
    intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);

    // Specify how many results you want to receive. The results are sorted
    // so that the first result is the one with the highest confidence.
    intent.putExtra(RecognizerIntent.EXTRA_MAX_RESULTS, 1);

    // Optional: request a specific recognition language, e.g. Spanish.
    //intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, new Locale("es").getLanguage());
    startActivityForResult(intent, VOICE_RECOGNITION_REQUEST_CODE);
}
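Once the recognizer activity returns, the app has to decide whether the top hypothesis is good enough to act on. On Android the hypotheses arrive in `onActivityResult` under `RecognizerIntent.EXTRA_RESULTS`, optionally accompanied by `RecognizerIntent.EXTRA_CONFIDENCE_SCORES`. The following is a minimal plain-Java sketch of that decision, kept free of Android classes so it can run anywhere. The class name, the `best` helper and the threshold parameter are hypothetical and not taken from the author's code.

```java
import java.util.List;

// Picks the transcription to act on from a recognizer's sorted hypotheses.
final class RecognitionResults {

    /**
     * Returns the top hypothesis, or null when there is none or when its
     * confidence is known and falls below minConfidence.
     * A missing or negative score is treated as "confidence unavailable".
     */
    static String best(List<String> hypotheses, float[] scores, float minConfidence) {
        if (hypotheses == null || hypotheses.isEmpty()) {
            return null;
        }
        String top = hypotheses.get(0); // results are sorted, best first
        if (scores == null || scores.length == 0 || scores[0] < 0f) {
            return top; // no usable score: accept the top result as-is
        }
        return scores[0] >= minConfidence ? top : null;
    }
}
```

A caller might re-prompt the user when `best(...)` returns null rather than echoing a low-confidence transcription.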


Speech Synthesis

private android.speech.tts.TextToSpeech mTts;
..
// Instantiate TextToSpeech with the current Context and an OnInitListener
mTts = new TextToSpeech(this, this);
..
// TextToSpeech.OnInitListener callback, invoked once the engine is ready
public void onInit(final int status) {
    if (status == TextToSpeech.SUCCESS && mTts != null) {
        // Start listening again as soon as the bot has finished speaking
        mTts.setOnUtteranceCompletedListener(new TextToSpeech.OnUtteranceCompletedListener() {
            public void onUtteranceCompleted(final String s) {
                startVoiceRecognitionActivity();
            }
        });
    }
}

private void say(final String s) {
    // The completion listener only fires for utterances that carry an ID
    final HashMap<String, String> map = new HashMap<String, String>(1);
    map.put(TextToSpeech.Engine.KEY_PARAM_UTTERANCE_ID, UTTERANCE_ID);
    mTts.speak(s, TextToSpeech.QUEUE_FLUSH, map);
    mTV_TTS.setText(s);
}


Echo Bot

• Capture Speech Input
• Convert Speech into Text
• Synthesize Voice (Message)
• Speak Message

Legend: access Web Service, perform on Device

Stock Quote Bot

Example command: “stock quote for ...”

• Capture Speech Input
• Convert Speech into Text
• Execute Command
• Synthesize Voice (Message)
• Speak Message

Legend: access Web Service, perform on Device
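The "Execute Command" step of the Stock Quote Bot first has to recognize the spoken command and pull out what the user asked about. Here is one way that could look, as a hedged sketch: the class and method names are hypothetical, the matching is a simple case-insensitive prefix check, and the actual quote lookup (a web service call in the deck) is deliberately left out because the slides do not show it.

```java
import java.util.Locale;
import java.util.Optional;

// Extracts the subject of a spoken "stock quote for ..." command.
final class StockQuoteCommand {

    private static final String PREFIX = "stock quote for ";

    /** Returns the text after the command prefix, or empty if this is not a stock quote command. */
    static Optional<String> subject(String transcription) {
        if (transcription == null) {
            return Optional.empty();
        }
        String text = transcription.trim();
        if (!text.toLowerCase(Locale.ROOT).startsWith(PREFIX)) {
            return Optional.empty();
        }
        String rest = text.substring(PREFIX.length()).trim();
        return rest.isEmpty() ? Optional.empty() : Optional.of(rest);
    }
}
```

Anything that does not match would fall through to whatever default behavior the bot has, for example echoing the text back.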

AIML Bot

• Capture Speech Input
• Convert Speech into Text
• Create Text Response
• Message or Command?
• Cmd: Execute Command, which yields a Msg
• Msg: Synthesize Voice (Message)
• Speak Message

Legend: access Web Service, perform on Device
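The "Message or Command?" decision in the AIML Bot flow can be sketched as a small router: a plain message is spoken directly, while a command is executed first and its result becomes the message to speak. The slides do not say how the bot tells the two apart, so the `CMD:` marker below is purely an assumption for illustration, as are the class and method names.

```java
import java.util.function.Function;

// Routes a bot's text response: speak it directly, or execute it as a command first.
final class ResponseRouter {

    // Hypothetical marker; the deck does not show how commands are flagged.
    static final String CMD_PREFIX = "CMD:";

    /**
     * Returns the text that should be handed to speech synthesis.
     * Commands are passed to the executor, whose return value is spoken.
     */
    static String toSpokenMessage(String response, Function<String, String> executor) {
        if (response.startsWith(CMD_PREFIX)) {
            String command = response.substring(CMD_PREFIX.length()).trim();
            return executor.apply(command);
        }
        return response;
    }
}
```

In an app, the executor would be whatever performs the action, on the device or through a web service, and reports back in a sentence.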

Voice-Enabled Web Bots

• Recognition

<script>function processspeech() {document.form.submit();}</script>

<input type="TEXT" autocomplete="off" name="input" speech="speech" x-webkit-speech="x-webkit-speech" onspeechchange="processspeech();" onwebkitspeechchange="processspeech();" />

• Synthesis

<audio autoplay="true"><source type="audio/mpeg" src="http://goo.gl/r9Mhm" ></audio>


Voice User Interfaces - Ford Sync, Siri, and Cora

Thanks


Slides: http://wolfpaulus.com/slides
Code: git clone https://github.com/wolfpaulus/bots.git