Voice-Enabling Chatbots

Slides: http://wolfpaulus.com/slides
Code: git clone https://github.com/wolfpaulus/bots.git
© 2012-2013 Wolf Paulus - http://wolfpaulus.com

description

You don't have to put your ear to the ground to hear it coming. The broad introduction of voice user interfaces, which let people interact with mobile devices by voice, may become the biggest advance in user interface design since the transition from text-based to graphical user interfaces.

Transcript of Voice-Enabling Chatbots

Page 1: Voice-Enabling Chatbots

Slides: http://wolfpaulus.com/slides
Code: git clone https://github.com/wolfpaulus/bots.git

Voice-Enabling Chatbots


Page 2: Voice-Enabling Chatbots

Star Trek

Page 3: Voice-Enabling Chatbots

Graphical User Interfaces - Mac System 1 (1984), Windows 95 (1995)

Page 4: Voice-Enabling Chatbots

Voice User Interfaces - Ford Sync, Siri, and Cora

Page 5: Voice-Enabling Chatbots

Speed and Accuracy of Speech Recognition

Page 6: Voice-Enabling Chatbots

Wearable Computing - Google Project Glass, Pebble Watch, Apple iWatch (concept)

Page 7: Voice-Enabling Chatbots

Artist on Android

Page 8: Voice-Enabling Chatbots

Artist on Android: a Voice User Interface flattens navigation and configuration hierarchies

Page 9: Voice-Enabling Chatbots

Speech Recognition Client - Record, Encode, Compress, Send, Receive Transcription and Confidence

Adaptive Multi-Rate Narrowband (AMR-NB) Speech Codec: 8 kHz sampling rate and 12 kbit/s encoding rate
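
The codec figures on this slide can be turned into a quick back-of-the-envelope calculation. The sketch below is not from the slides: it assumes AMR-NB's 20 ms frame length and its highest mode of 12.2 kbit/s (which the slide rounds to 12), and it assumes the recorded audio starts out as 16-bit PCM. The class and method names are made up for illustration.

```java
// Rough arithmetic for the AMR-NB figures on the slide (a sketch, not the author's code).
final class AmrNbMath {
    static final int SAMPLE_RATE_HZ = 8_000;     // from the slide
    static final int FRAME_MS = 20;              // assumption: AMR-NB frame length
    static final int PCM_BITS_PER_SAMPLE = 16;   // assumption: uncompressed 16-bit PCM input

    // Number of audio samples covered by one codec frame.
    static int samplesPerFrame() {
        return SAMPLE_RATE_HZ * FRAME_MS / 1000;
    }

    // Number of encoded bits produced per frame at the given bit rate (bits per second).
    static int encodedBitsPerFrame(int bitRateBps) {
        return bitRateBps * FRAME_MS / 1000;
    }

    // Bit rate of the uncompressed PCM stream.
    static int pcmBitRateBps() {
        return SAMPLE_RATE_HZ * PCM_BITS_PER_SAMPLE;
    }

    // How many times smaller the encoded stream is than the PCM stream.
    static double compressionRatio(int bitRateBps) {
        return (double) pcmBitRateBps() / bitRateBps;
    }
}
```

Under these assumptions, a frame holds 160 samples and encodes to 244 bits at 12.2 kbit/s, roughly a tenfold reduction before the audio is sent to the recognition service.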


Page 10: Voice-Enabling Chatbots

Horsemen of Speech Recognition

Page 11: Voice-Enabling Chatbots

2001: A Space Odyssey

Page 12: Voice-Enabling Chatbots

Speech Synthesis

Use a pre-installed Text-To-Speech Engine

Package and ship a distinct Synthesizer and Voice with mobile application

Use a web service to synthesize text into speech audio (VAAS)

Voice Matters

Page 13: Voice-Enabling Chatbots

Speech Synthesis on Android

Page 14: Voice-Enabling Chatbots

Echo Bot

AIML Bot

Creating a simple Voice-Enabled Android App

Cora

Page 15: Voice-Enabling Chatbots

Speech Recognition

private void startVoiceRecognitionActivity() {
    final Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);

    // Specify the calling package to identify your application.
    intent.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE, getClass().getPackage().getName());

    // Display a hint to the user about what to say.
    intent.putExtra(RecognizerIntent.EXTRA_PROMPT, getResources().getString(R.string.speakPROMPT));

    // Give the recognizer a hint about what the user is going to say.
    intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);

    // Specify how many results you want to receive. The results are sorted
    // so that the first result is the one with the highest confidence.
    intent.putExtra(RecognizerIntent.EXTRA_MAX_RESULTS, 1);

    // intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, new Locale("es").getLanguage());
    startActivityForResult(intent, VOICE_RECOGNITION_REQUEST_CODE);
}
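
The slide stops at launching the recognizer; how the result is consumed is not shown. As a hedged sketch, the helper below picks the transcription with the highest confidence from a list of hypotheses and a parallel array of scores. On Android such values would typically be read from the result Intent via RecognizerIntent.EXTRA_RESULTS and RecognizerIntent.EXTRA_CONFIDENCE_SCORES; that wiring is assumed here and left out so the example stays plain Java. The class name BestHypothesis is hypothetical.

```java
import java.util.List;
import java.util.Optional;

// Picks the most confident transcription (a sketch; names are illustrative).
final class BestHypothesis {

    // results: recognizer hypotheses; confidences: parallel scores, may be null.
    static Optional<String> pick(List<String> results, float[] confidences) {
        if (results == null || results.isEmpty()) {
            return Optional.empty();
        }
        // Without usable scores, fall back to the first entry, which the
        // recognizer is expected to rank highest.
        if (confidences == null || confidences.length != results.size()) {
            return Optional.of(results.get(0));
        }
        int best = 0;
        for (int i = 1; i < confidences.length; i++) {
            if (confidences[i] > confidences[best]) {
                best = i;
            }
        }
        return Optional.of(results.get(best));
    }
}
```

With EXTRA_MAX_RESULTS set to 1, as in the slide, the list holds at most one entry and the helper simply returns it.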


Page 16: Voice-Enabling Chatbots

Speech Synthesis

private android.speech.tts.TextToSpeech mTts;
..
// Instantiate TextToSpeech with the current Context and an OnInitListener
mTts = new TextToSpeech(this, this);
..
// OnInitListener callback; it must be public to implement the interface.
public void onInit(final int status) {
    if (status == TextToSpeech.SUCCESS && mTts != null) {
        // Start listening again as soon as the bot has finished speaking.
        mTts.setOnUtteranceCompletedListener(new TextToSpeech.OnUtteranceCompletedListener() {
            public void onUtteranceCompleted(final String s) {
                startVoiceRecognitionActivity();
            }
        });
    }
}

private void say(final String s) {
    // The utterance ID is required for the completion listener to be called.
    final HashMap<String, String> map = new HashMap<String, String>(1);
    map.put(TextToSpeech.Engine.KEY_PARAM_UTTERANCE_ID, UTTERANCE_ID);
    mTts.speak(s, TextToSpeech.QUEUE_FLUSH, map);
    mTV_TTS.setText(s);
}


Page 17: Voice-Enabling Chatbots

Capture Speech Input

Convert Speech into Text

Synthesize Voice (Message)

Speak Message

Legend: access Web Service, perform on Device

Echo Bot
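
The Echo Bot steps listed on this slide can be sketched as a two-stage pipeline. This is only an illustration under stated assumptions: the recognizer and synthesizer are passed in as plain functions, the class name EchoBot and its method are hypothetical, and no Android or web-service calls are made.

```java
import java.util.function.Function;

// Echo bot pipeline: speech in, the same words spoken back (a sketch).
final class EchoBot {
    private final Function<byte[], String> recognizer;   // speech audio -> text
    private final Function<String, byte[]> synthesizer;  // text -> speech audio

    EchoBot(Function<byte[], String> recognizer, Function<String, byte[]> synthesizer) {
        this.recognizer = recognizer;
        this.synthesizer = synthesizer;
    }

    // Convert the captured speech into text, then synthesize that text again.
    byte[] respond(byte[] capturedAudio) {
        String text = recognizer.apply(capturedAudio);
        return synthesizer.apply(text);
    }
}
```

Passing the two stages in as functions keeps the sketch independent of whether a step runs on the device or through a web service, which is the distinction the slide's legend draws.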


Page 18: Voice-Enabling Chatbots

Capture Speech Input

Convert Speech into Text

Execute Command

Synthesize Voice (Message)

Speak Message

Legend: access Web Service, perform on Device

“stock quote for ...” Stock Quote Bot
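
For the Execute Command step, the transcribed text has to be matched against the "stock quote for ..." phrase before anything can be looked up. The following is a minimal sketch of that matching only; the class name StockQuoteCommand and its method are hypothetical, and the actual quote lookup is not shown because the slides do not say which service is used.

```java
import java.util.Locale;
import java.util.Optional;

// Extracts the company name from a "stock quote for ..." utterance (a sketch).
final class StockQuoteCommand {
    private static final String PREFIX = "stock quote for ";

    // Returns the lower-cased text after the prefix, or empty if the
    // transcript is not a stock quote command.
    static Optional<String> parseCompany(String transcript) {
        if (transcript == null) {
            return Optional.empty();
        }
        String normalized = transcript.trim().toLowerCase(Locale.ROOT);
        if (!normalized.startsWith(PREFIX)) {
            return Optional.empty();
        }
        String company = normalized.substring(PREFIX.length()).trim();
        return company.isEmpty() ? Optional.<String>empty() : Optional.of(company);
    }
}
```

Lower-casing the transcript first makes the match tolerant of how the recognizer capitalizes its output.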


Page 19: Voice-Enabling Chatbots

Capture Speech Input

Convert Speech into Text

Create Text Response

Message or Command?

Execute Command

Synthesize Voice (Message)

Speak Message

Branch labels: Cmd, Msg

Legend: access Web Service, perform on Device

AIML Bot
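
The "Message or Command?" decision in this flow can be sketched as a small router. The slides do not say how the bot's text response marks a command, so the "CMD:" prefix below is a purely hypothetical convention, as are the class and method names. The sketch also assumes that executing a command yields a message that is then spoken.

```java
import java.util.function.UnaryOperator;

// Routes a bot response to either direct speech or command execution (a sketch).
final class ResponseRouter {
    static final String COMMAND_MARKER = "CMD:";  // hypothetical convention, not from the slides

    private final UnaryOperator<String> commandExecutor;  // command -> message to speak

    ResponseRouter(UnaryOperator<String> commandExecutor) {
        this.commandExecutor = commandExecutor;
    }

    static boolean isCommand(String botResponse) {
        return botResponse != null && botResponse.startsWith(COMMAND_MARKER);
    }

    // Returns the text that should be handed to the speech synthesizer.
    String toSpokenMessage(String botResponse) {
        if (isCommand(botResponse)) {
            String command = botResponse.substring(COMMAND_MARKER.length()).trim();
            return commandExecutor.apply(command);
        }
        return botResponse;
    }
}
```

A plain message passes through unchanged, while a marked response is executed first and its result is what gets synthesized.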


Page 20: Voice-Enabling Chatbots

Voice-Enabled Web Bots

• Recognition

<script>function processspeech() {document.form.submit();}</script>

<input type="TEXT" autocomplete="off" name="input" speech="speech" x-webkit-speech="x-webkit-speech" onspeechchange="processspeech();" onwebkitspeechchange="processspeech();" />

• Synthesis

<audio autoplay="true"><source type="audio/mpeg" src="http://goo.gl/r9Mhm" ></audio>


Page 21: Voice-Enabling Chatbots

Voice User Interfaces - Ford Sync, Siri, and Cora

Page 22: Voice-Enabling Chatbots

Thanks


Slides: http://wolfpaulus.com/slides
Code: git clone https://github.com/wolfpaulus/bots.git