7/27/2019 Voice Browser Original

1/27


2/27

Presented By

Akhil Sajeendran

R7A CS

Roll No:05

Reg No : 10016175

SNGCE


3/27

Browser technology is changing very fast these days, and we are moving from the visual paradigm to the voice paradigm. The voice browser is the technology for entering this paradigm. A voice browser is a device which interprets a (voice) markup language and is capable of generating voice output and/or interpreting voice input, and possibly other input/output modalities.


4/27

A voice browser is a device:

that interprets voice input and interprets voice markup languages to generate voice output.

that interprets a script which specifies exactly what to verbally present to the user as well as when to present each piece of information.


5/27

Time frame: 1998 to ??

Hands-free access to the web.

Pragmatic interface for functionally blind users.


6/27

Speech Recognition

Speech Synthesis


7/27

Voice input -> VoXML file -> Text

Automatic speech recognition is the process by which a computer maps an acoustic speech signal to text.

Speech is first digitized and then matched against a dictionary of coded waveforms. The matches are converted into text.
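The digitize-then-match step described above can be sketched as a toy template matcher. This is only an illustration under simplifying assumptions: the quantizer, the Euclidean distance, and the two-word `DICTIONARY` with four samples per word are all hypothetical, and practical recognizers rely on far richer acoustic models.

```python
import math

def digitize(samples, levels=8):
    """Quantize analog samples in [-1.0, 1.0] to integer codes 0..levels-1."""
    codes = []
    for s in samples:
        clipped = max(-1.0, min(1.0, s))
        codes.append(min(levels - 1, int((clipped + 1.0) / 2.0 * levels)))
    return codes

def distance(a, b):
    """Euclidean distance between two equal-length code sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def recognize(samples, dictionary):
    """Return the word whose stored coded waveform is closest to the input."""
    codes = digitize(samples)
    return min(dictionary, key=lambda word: distance(codes, dictionary[word]))

# Hypothetical dictionary of coded waveforms (toy data, four samples each).
DICTIONARY = {
    "yes": digitize([0.9, 0.5, -0.2, -0.8]),
    "no": digitize([-0.7, -0.1, 0.6, 0.9]),
}

noisy_yes = [0.8, 0.4, -0.1, -0.9]
print(recognize(noisy_yes, DICTIONARY))
```

The matched dictionary key plays the role of the text output: the noisy input is mapped to the nearest stored word.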


8/27

Text -> VoXML file -> Output (pre-recorded)

The specification defines a markup language for prompting users via a combination of prerecorded speech, synthetic speech and music.

You can select voice characteristics (name, gender and age) and the speed, volume, pitch, and emphasis. There is also provision for overriding the synthesis engine's default pronunciation.


9/27

World Wide Web Consortium(W3C)

Voice Browser Working Group

Speech Interface Framework


10/27

Established on 26 March 1999.

Re-chartered through 31 January 2009.

W3C Team Contacts are Kazuyuki Ashimura and Matt Womer.

Co-chaired by Jim Larson and Scott McGlashan.


11/27

VoiceXML

Speech Synthesis

Speech Recognition

Speech Grammars

Semantic Interpretation

Stochastic Language Models


12/27

VoiceXML is a dialog markup language designed for telephony applications, where users are restricted to voice and DTMF (touch tone) input.

[Figure labels: text.html, text.vxml, Web Server, Internet, Browser]
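To make the idea of a dialog markup language concrete, the sketch below walks a small VoiceXML-style form, collects each prompt, and fills each field from a list of simulated caller replies (spoken words or DTMF digits). The form contents, the `run_form` helper, and the reply list are hypothetical; a real voice browser follows the much more detailed interpretation rules of the VoiceXML specification.

```python
import xml.etree.ElementTree as ET

VXML_NS = "{http://www.w3.org/2001/vxml}"

# Hypothetical document, as a server might deliver it to a voice browser.
DOCUMENT = """<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="weather">
    <field name="city">
      <prompt>Which city?</prompt>
    </field>
    <field name="day">
      <prompt>Which day? Say a day or press a key.</prompt>
    </field>
  </form>
</vxml>"""

def run_form(document, answers):
    """Visit each field in order, record its prompt, and fill it from answers."""
    root = ET.fromstring(document)
    spoken, filled = [], {}
    replies = iter(answers)
    for field in root.iter(VXML_NS + "field"):
        prompt = field.find(VXML_NS + "prompt")
        spoken.append(prompt.text.strip())      # what the browser would say
        filled[field.get("name")] = next(replies)  # what the caller answered
    return spoken, filled

spoken, filled = run_form(DOCUMENT, ["Kochi", "2"])
print(spoken)
print(filled)
```

Here the second reply, "2", stands in for a touch-tone key press, while the first stands in for a recognized spoken word.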


13/27

The specification defines a markup language for prompting users via a combination of prerecorded speech, synthetic speech and music. We can select voice characteristics (name, gender and age) and the speed, volume, pitch, and emphasis. There is also provision for overriding the synthesis engine's default pronunciation.
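As an illustration of these controls, the following sketch builds a small prompt using element names from W3C's Speech Synthesis Markup Language (`audio`, `voice`, `prosody`, `emphasis`, `phoneme`). The helper `build_prompt`, the audio URL, the voice name, and the sample sentence are assumptions made for the example, and the phonetic string is illustrative rather than authoritative.

```python
import xml.etree.ElementTree as ET

def build_prompt(jingle_url, sentence, stressed, word, ipa):
    """Return a markup string mixing recorded audio with synthetic speech."""
    speak = ET.Element("speak", {"version": "1.0", "xml:lang": "en-US"})
    # Prerecorded speech or music; the text is a fallback if the file is missing.
    audio = ET.SubElement(speak, "audio", {"src": jingle_url})
    audio.text = "Welcome."
    # Voice characteristics: name, gender and age (the name is hypothetical).
    voice = ET.SubElement(
        speak, "voice", {"name": "narrator", "gender": "female", "age": "30"}
    )
    # Speed (rate), volume and pitch.
    prosody = ET.SubElement(
        voice, "prosody", {"rate": "slow", "volume": "loud", "pitch": "high"}
    )
    prosody.text = sentence + " "
    # Emphasis on a single word.
    emphasis = ET.SubElement(prosody, "emphasis", {"level": "strong"})
    emphasis.text = stressed
    # Override the engine's default pronunciation of one word.
    phoneme = ET.SubElement(voice, "phoneme", {"alphabet": "ipa", "ph": ipa})
    phoneme.text = word
    return ET.tostring(speak, encoding="unicode")

ssml = build_prompt(
    "http://example.com/jingle.wav",
    "Your flight leaves at", "noon", "tomato", "təˈmɑːtoʊ",
)
print(ssml)
```

Each attribute corresponds to one of the characteristics listed above, so changing the prompt's delivery means editing markup rather than re-recording audio.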


14/27

[Figure labels: User, Speech, Speech Grammars, Stochastic Language Models, Semantic Interpretation]


15/27

In most cases, user prompts are very carefully designed to encourage the user to answer in a form that matches context-free grammar rules.

Speech Grammars allow authors to specify rules covering the sequences of words that users are expected to say in particular contexts. These contextual clues allow the recognition engine to focus on likely utterances, improving the chances of a correct match.
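A minimal sketch of such a grammar, assuming a tiny non-recursive rule set for a hypothetical drink-ordering prompt: each rule lists alternative word sequences, and an utterance is accepted only if the rules can produce it. The rule names and the `$` prefix are illustrative conventions, not the syntax of any particular grammar format.

```python
from itertools import product

# Hypothetical grammar: each rule maps to a list of alternatives, and each
# alternative is a sequence of words or references to other rules ($name).
GRAMMAR = {
    "$order": [["i", "want", "$size", "$drink"], ["$size", "$drink", "please"]],
    "$size": [["small"], ["large"]],
    "$drink": [["coffee"], ["tea"]],
}

def expand(symbol, grammar):
    """Return every word sequence the symbol can produce (non-recursive rules only)."""
    if symbol not in grammar:
        return [[symbol]]  # a plain word
    sentences = []
    for alternative in grammar[symbol]:
        parts = [expand(token, grammar) for token in alternative]
        for combo in product(*parts):
            sentences.append([word for part in combo for word in part])
    return sentences

def matches(utterance, grammar, root="$order"):
    """True if the utterance is one of the sentences the grammar allows."""
    return utterance.lower().split() in expand(root, grammar)

print(matches("I want large tea", GRAMMAR))
print(matches("give me some tea", GRAMMAR))
```

Because the grammar admits only eight sentences, the recognizer has very few candidates to choose between, which is the focusing effect described above.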


16/27

In some applications it is appropriate to use open-ended prompts ("How can I help?"). In these cases, context-free grammars are not useful. The solution is to use a stochastic language model.

Such models specify the probability that one word occurs following certain others. The probabilities are computed from a collection of utterances collected from many users.
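The simplest case of such a model conditions each word on only the one before it (a bigram model). The sketch below estimates those probabilities by counting word pairs; the three-utterance corpus and the function names are hypothetical, and the estimate is unsmoothed, so unseen pairs get probability zero.

```python
from collections import Counter, defaultdict

def train_bigrams(utterances):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for utterance in utterances:
        words = ["<s>"] + utterance.lower().split()  # <s> marks the start
        for prev, word in zip(words, words[1:]):
            counts[prev][word] += 1
    return counts

def probability(counts, prev, word):
    """Estimate P(word | prev) as a relative frequency (no smoothing)."""
    total = sum(counts[prev].values())
    return counts[prev][word] / total if total else 0.0

# Hypothetical utterances collected from callers.
CORPUS = [
    "i want to check my balance",
    "i want to pay a bill",
    "i need help with my bill",
]

model = train_bigrams(CORPUS)
print(probability(model, "i", "want"))
print(probability(model, "my", "bill"))
```

With a larger corpus, the same counts let a recognizer prefer word sequences that callers actually say, even when no fixed grammar covers them.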


17/27

The recognition process matches an utterance to a speech grammar, building a parse tree as a byproduct.

There are two approaches to harvesting semantic results from the parse tree:

1. Annotating grammar rules with semantic interpretation tags.

2. Representing the result in XML.
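Both approaches can be shown together in a simplified sketch: each phrase in a rule carries a semantic tag, and the tags found in an utterance are serialized as XML. This toy version spots tagged phrases directly instead of walking a real parse tree, and the rule set, tag values, and `result` element name are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotated rules: for each slot, phrase -> semantic tag.
TAGGED_RULES = {
    "drink": {"coffee": "COFFEE", "java": "COFFEE", "tea": "TEA"},
    "size": {"small": "S", "little": "S", "large": "L", "big": "L"},
}

def interpret(utterance):
    """Return the tag of the first matching phrase (in rule order) per slot."""
    text = " " + utterance.lower() + " "
    result = {}
    for slot, phrases in TAGGED_RULES.items():
        for phrase, tag in phrases.items():
            if " " + phrase + " " in text:
                result[slot] = tag
                break
    return result

def to_xml(result):
    """Represent the semantic result as a small XML document."""
    root = ET.Element("result")
    for slot, tag in result.items():
        ET.SubElement(root, slot).text = tag
    return ET.tostring(root, encoding="unicode")

meaning = interpret("I would like a big coffee")
print(meaning)
print(to_xml(meaning))
```

Different wordings ("big", "large") map to the same tag, so the application receives a normalized meaning rather than the raw words.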


18/27

Voice browsers can be divided into three categories:

Web Browsing

Limited information Access

Spoken Dialog Systems


19/27

Browse any web page using speech input.

Parsing for the purpose of voice recognition is done when the page is accessed.

May or may not produce voice feedback.


20/27

Provides useful information in limited domains, such as the weather in a city or stock updates.

Audio feedback.


21/27

A client-server architecture is used.

A Java applet (the client) connects to a remote server.

Examples include connecting to email servers.


22/27

Voice is a very natural user interface which speeds up browsing.

Lower space requirements.

Portable voice browsers can also be implemented.

Practical interface for functionally blind users.

Users can browse the web while keeping their hands and eyes free for other jobs.


23/27

Voice browsing will become visual (multimodal).

Can be integrated into an OS.

Can be integrated into every application.


24/27

Browser technology is changing very fast these days, and we are moving from the visual paradigm to the voice paradigm.

The voice browser is the technology for entering this paradigm.

A voice browser is a device which interprets voice input and generates voice output.



