Voxeo Summit 2010: Standards Update: VoiceXML3

Standards Update: VoiceXML 3 Dan Burnett, Ph.D. Dir. of Speech Technologies, Voxeo (Dir. of Standards, Voxeo)

description

At the Voxeo Customer Summit 2010, Dir. of Speech Technologies Dan Burnett provided an update on the evolving VoiceXML 3 standard.

More information at:
http://www.voxeo.com/
http://www.voxeo.com/summit2010
http://blogs.voxeo.com/speakingofstandards/

Transcript of Voxeo Summit 2010: Standards Update: VoiceXML3

Page 1: Voxeo Summit 2010: Standards Update: VoiceXML3

Standards Update: VoiceXML 3

Dan Burnett, Ph.D. Dir. of Speech Technologies, Voxeo (Dir. of Standards, Voxeo)

Page 2: Voxeo Summit 2010: Standards Update: VoiceXML3

© Voxeo Corporation

Voxeo on Standards

  Develop ahead of standards

  Make it Open Source

  Lead in standards creation

  Lead in standards adoption

Page 3: Voxeo Summit 2010: Standards Update: VoiceXML3

Past Leadership

  W3C
    •  VoiceXML 2.0/2.1, SRGS 1.0, SISR 1.0, SSML 1.0
    •  CCXML 1.0, SCXML 1.0, EMMA 1.0

  IETF
    •  MRCPv1 extensions, MRCPv2, P-charge-info, SIP security

Page 4: Voxeo Summit 2010: Standards Update: VoiceXML3

Where we are now

  W3C
    •  VoiceXML 3, SSML 1.1, Pronunciation Alphabet Registry, Speech in HTML 5
    •  CCXML 1.0, SCXML 1.0, EMMA next, MMI architecture

  IETF, 3GPP
    •  MRCPv2, XMPP (incl. multi-party Jingle and multiple chat), Media Control, SIP Overload, SIPREC, CODEC (Speex)

  JCP
    •  JSR 289, 309 – SIP servlets, media control
    •  JSR 154, 254 – Java servlets and servlet pages
    •  XMPP SIP servlet – submitting to JCP

Page 5: Voxeo Summit 2010: Standards Update: VoiceXML3

VoiceXML

  Timeline, 2000 to 2010
    •  2000: VoiceXML 1.0
    •  2004: VoiceXML 2.0
    •  2007: VoiceXML 2.1
    •  2010: VoiceXML 3

Page 7: Voxeo Summit 2010: Standards Update: VoiceXML3

V3 Motivations

  FIA flexibility

  New features

  Extensibility

  Better integration with other W3C languages

Page 8: Voxeo Summit 2010: Standards Update: VoiceXML3

V3 is . . .

  a restructured core

  some new features

  convenience elements to mimic VoiceXML 2.1

Page 9: Voxeo Summit 2010: Standards Update: VoiceXML3

V3 Architecture

  Core functionality defined in modules

  Modules combined with convenience syntax into profiles

Page 10: Voxeo Summit 2010: Standards Update: VoiceXML3

Core functionality defined in modules

  Module behavior defined precisely as state machines

Page 11: Voxeo Summit 2010: Standards Update: VoiceXML3

Modules + Conv. Syntax = Profiles

  Modules grouped into profiles

  Legacy (V2.1), Basic, Maximal

  Convenience syntax simplifies authoring

Page 12: Voxeo Summit 2010: Standards Update: VoiceXML3

Convenience Syntax

  New elements and attributes, but no new functionality

  Behavior defined in terms of core functionality

  For example, <menu> defined in terms of <form> with grammars and prompts
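
The idea of defining <menu> purely in terms of <form> can be sketched as a mechanical rewrite. The Python below is only an illustration under stated assumptions: it uses VoiceXML 2.x element names (field, option, filled, goto), the field name "choice" and the function name menu_to_form are hypothetical, and the actual VoiceXML 3 definition of <menu> may differ.

```python
import xml.etree.ElementTree as ET


def menu_to_form(menu: ET.Element) -> ET.Element:
    """Rewrite a <menu> into a <form> with a single <field> (illustrative only)."""
    form = ET.Element("form")
    if "id" in menu.attrib:
        form.set("id", menu.attrib["id"])
    # Hypothetical field name; any unused variable name would do.
    field = ET.SubElement(form, "field", name="choice")
    for child in menu:
        if child.tag == "prompt":
            # The menu prompt becomes the field prompt.
            prompt = ET.SubElement(field, "prompt")
            prompt.text = child.text
        elif child.tag == "choice":
            # Each choice becomes an option whose value is the target URI.
            option = ET.SubElement(field, "option", value=child.attrib["next"])
            option.text = child.text
    # Once the field is filled, jump to the URI stored in its variable.
    filled = ET.SubElement(field, "filled")
    ET.SubElement(filled, "goto", expr="choice")
    return form


MENU = """<menu id="main">
  <prompt>Say sales or support.</prompt>
  <choice next="#sales">sales</choice>
  <choice next="#support">support</choice>
</menu>"""

form = menu_to_form(ET.fromstring(MENU))
print(ET.tostring(form, encoding="unicode"))
```

The rewrite adds elements but no behavior that a hand-written form could not already express, which is the point of convenience syntax.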

Page 13: Voxeo Summit 2010: Standards Update: VoiceXML3

Convenience Syntax

  Definite candidates are
    •  menu/choice/enumerate/option
    •  error/help/noinput/nomatch shortcuts
    •  link

  Possible (but different) candidates might be
    •  if/else/elseif (using SCXML)
    •  transfer (using CCXML)

Page 14: Voxeo Summit 2010: Standards Update: VoiceXML3

New Stuff

  New media, SIV functions

  Session root documents

  Real-time controls

  Author-specifiable transition controllers

  V2 eventing model now async & compatible with DOM Level 3

Page 15: Voxeo Summit 2010: Standards Update: VoiceXML3

New Functionality – Video

  Video -- <audio> replaced by <media>, which allows both audio and video

<media type="audio/x-wav" src="http://www.example.com/resource.wav"/>

<media type="video/3gpp" src="http://www.example.com/resource.3gp"/>

<media>
  <!-- inline SSML with audio media fallback -->
  <speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis">
    Ich bin ein Berliner.
  </speak>
  <media type="audio/x-wav" src="ichbineinberliner.wav"/>
</media>

Page 16: Voxeo Summit 2010: Standards Update: VoiceXML3

New Functionality – Media Control

  Media control -- media clipping, speed, and volume control now possible without resorting to SSML

<media type="audio/x-wav" soundLevel="+6.0dB" speed="50%" repeatcount="2" src="http://www.example.com/resource.wav"/>

<media type="video/3gpp" clipBegin="2s" clipEnd="5s" repeatDur="25s" src="http://www.example.com/resource.3gp"/>

Page 17: Voxeo Summit 2010: Standards Update: VoiceXML3

New Functionality – SIV

  SIV (Speaker Identification and Verification) – speaker authentication capabilities available as core functionality

•  Enrollment – creates voice model, associates it with id in speaker database

•  Identification – which voice model in speaker database is a match for the speech?

•  Verification – for the claimed id, does the speech match the voice model in the speaker database?
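
The three SIV operations differ mainly in what they take and return, which a toy model can make concrete. The sketch below is not a VoiceXML 3 API: the class SpeakerDatabase, the use of plain feature vectors as "voice models", the cosine-similarity scoring, and the 0.9 threshold are all assumptions chosen for illustration.

```python
from math import sqrt


class SpeakerDatabase:
    """Toy speaker database: a voice model is just a feature vector (assumption)."""

    def __init__(self, threshold=0.9):
        self.models = {}            # speaker id -> voice model
        self.threshold = threshold  # minimum similarity counted as a match

    @staticmethod
    def _similarity(a, b):
        # Cosine similarity between two feature vectors.
        dot = sum(x * y for x, y in zip(a, b))
        norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    def enroll(self, speaker_id, features):
        """Enrollment: create a voice model and associate it with an id."""
        self.models[speaker_id] = list(features)

    def identify(self, features):
        """Identification: which enrolled model, if any, best matches the speech?"""
        best_id, best_score = None, self.threshold
        for speaker_id, model in self.models.items():
            score = self._similarity(model, features)
            if score >= best_score:
                best_id, best_score = speaker_id, score
        return best_id

    def verify(self, claimed_id, features):
        """Verification: does the speech match the model for the claimed id?"""
        model = self.models.get(claimed_id)
        return model is not None and self._similarity(model, features) >= self.threshold


db = SpeakerDatabase()
db.enroll("alice", [1.0, 0.0, 0.0])
db.enroll("bob", [0.0, 1.0, 0.0])
sample = [0.9, 0.1, 0.0]
print(db.identify(sample), db.verify("alice", sample), db.verify("bob", sample))
```

Identification searches the whole database, while verification checks a single claimed id, so verification can succeed or fail without revealing who else is enrolled.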

Page 18: Voxeo Summit 2010: Standards Update: VoiceXML3

New Control – Session Root

  Just like application root

  Well, not exactly
    •  If not specified, no session root
    •  Session root change is ignored or causes error

  First, let’s review application roots

<vxml session="blahblah.vxml" ...>

Page 19: Voxeo Summit 2010: Standards Update: VoiceXML3

Application Root Review

Document              Resulting application root
A: <vxml>             AppRoot A
B: <vxml>             AppRoot B
C: <vxml root="B">    AppRoot B
D: <vxml root="E">    AppRoot E
F: <vxml root="E">    AppRoot E
G: <vxml>             AppRoot G
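
The application-root review above follows a simple rule: each document's application root depends only on that document. A minimal Python sketch, assuming the slide's shorthand of a root attribute and treating a document without one as its own root (the function name application_root is hypothetical):

```python
def application_root(doc_name, root_attr=None):
    """Return the application root in effect while doc_name is executing."""
    # A document that names no root acts as its own application root.
    return root_attr if root_attr is not None else doc_name


# (document, root attribute) pairs in the order shown on the slide.
documents = [("A", None), ("B", None), ("C", "B"), ("D", "E"), ("F", "E"), ("G", None)]
app_roots = [application_root(name, root) for name, root in documents]
print(app_roots)
```

Because the function keeps no state, the application root can change freely from one document to the next, which is the contrast the session root is about to draw.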

Page 20: Voxeo Summit 2010: Standards Update: VoiceXML3

Session Root

Document (in load order)                       Resulting session root
A: <vxml>                                      No Session Root
B: <vxml session="C">                          Session Root C
D: <vxml>                                      Session Root C
E: <vxml session="F">                          Session Root C
G: <vxml session="H" requiresession="true">    error.badfetch
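
Unlike the application root, the session root is sticky: once set it stays for the rest of the session. The Python sketch below reproduces the sequence on this slide under one reading of it, namely that a conflicting session attribute is ignored unless requiresession is true, in which case the load fails with error.badfetch. The class VoiceSession and that interpretation are assumptions, not text from the specification.

```python
class VoiceSession:
    """Toy model of session-root selection across successive document loads."""

    def __init__(self):
        self.session_root = None  # no session root until a document names one

    def load(self, name, session=None, requiresession=False):
        """Load document `name`; return the session root in effect, or an error event."""
        if session is not None and session != self.session_root:
            if self.session_root is None:
                # The first document to name a session root sets it.
                self.session_root = session
            elif requiresession:
                # The document insists on a different root: treat as a fetch error.
                return "error.badfetch"
            # Otherwise the requested change is ignored.
        return self.session_root


voice_session = VoiceSession()
loads = [
    ("A", None, False),
    ("B", "C", False),
    ("D", None, False),
    ("E", "F", False),
    ("G", "H", True),
]
session_roots = [voice_session.load(*args) for args in loads]
print(session_roots)
```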

Page 21: Voxeo Summit 2010: Standards Update: VoiceXML3

Real-time Controls

  Special grammars that are always active (not just in the wait state)
    •  Allows arbitrary speech/DTMF
    •  Immediate: volume, speed, skip
    •  At next event processing: cancel, goto

  Acts as pre-filter on input stream, replacing matches with silence

<form>
  <rtc grammar="digit3.grxml" action="volume" params="+5"/>
  <field name="a"> ... </field>
  <field name="b">
    <cancelrtc grammar="digit3.grxml"/>
    ...
  </field>
</form>
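
The pre-filter behavior can be pictured as a pass over the input stream that runs before normal recognition. In the Python sketch below, the token-list stream, the SILENCE marker, and the function rtc_prefilter are hypothetical simplifications; a real platform would operate on audio and DTMF events rather than strings.

```python
SILENCE = "<silence>"  # placeholder for the audio that replaces a matched control


def rtc_prefilter(stream, controls):
    """Replace inputs that match a real-time control with silence.

    stream   -- list of input tokens (stand-ins for speech/DTMF events)
    controls -- dict mapping a matching token to its (action, params) pair
    Returns the filtered stream plus the control actions that fired, in order.
    """
    filtered, fired = [], []
    for token in stream:
        if token in controls:
            fired.append(controls[token])
            filtered.append(SILENCE)  # downstream grammars never see the match
        else:
            filtered.append(token)
    return filtered, fired


# DTMF "3" raises the volume, mirroring the <rtc> example on this slide.
controls = {"3": ("volume", "+5")}
filtered, fired = rtc_prefilter(["1", "3", "2"], controls)
print(filtered, fired)
```

Because the match is removed before field grammars run, a caller pressing the control key mid-field does not trigger a nomatch in that field.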

Page 22: Voxeo Summit 2010: Standards Update: VoiceXML3

Transition Controllers

  Inter-element transitions now under author control

  Controllers at form, document, application, and perhaps session levels
    •  e.g. form controller specifies which form item to execute next

  Controllers can be in SCXML or another flow control language

  Default controllers will give FIA behavior in Legacy Profile
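
A form-level controller can be thought of as a function that, given the form items and the ones already filled, picks what runs next. The Python sketch below contrasts a default controller that behaves roughly like the FIA (first unfilled item, honoring a goto-style suggestion) with an author-supplied alternative. All names here are hypothetical, and in VoiceXML 3 the controller would be written in SCXML or another flow control language rather than Python.

```python
def fia_like_controller(items, filled, suggestion=None):
    """Default behavior: follow a valid suggestion, else take the first unfilled item."""
    if suggestion in items and suggestion not in filled:
        return suggestion
    for item in items:
        if item not in filled:
            return item
    return None  # nothing left to visit


def last_first_controller(items, filled, suggestion=None):
    """Author-defined behavior: visit unfilled items in reverse document order."""
    # The suggestion is deliberately ignored: it is only a hint to the controller.
    for item in reversed(items):
        if item not in filled:
            return item
    return None


def run_form(items, controller):
    """Ask the controller for the next item until it returns None."""
    filled, order = set(), []
    while True:
        next_item = controller(items, filled)
        if next_item is None:
            return order
        order.append(next_item)
        filled.add(next_item)  # pretend the caller filled the item


fields = ["field_a", "field_b", "field_c", "field_d"]
print(run_form(fields, fia_like_controller))
print(run_form(fields, last_first_controller))
```

Swapping the controller changes the visiting order without touching the fields themselves, which is the separation the slides describe.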

Page 23: Voxeo Summit 2010: Standards Update: VoiceXML3

Transition Controllers Example 1

<!-- document-level transition controller controls inter-form transitions -->
<vxml ...>
  <controller ...>
    <scxml:scxml version="1.0" ...>
      <!-- SCXML code determining which form to go to next -->
    </scxml:scxml>
  </controller>

  <form id="form_a">
    ...
    <goto next="form_b"/> <!-- goto is only a suggestion now -->
  </form>

  <form id="form_b">
    ...
  </form>
  ...
</vxml>

Page 24: Voxeo Summit 2010: Standards Update: VoiceXML3

Transition Controllers Example 2

<!-- form-level transition controller controls inter-field transitions -->
<vxml ...>
  <form>
    <controller src="myformbehavior.scxml"/>

    <field name="field_a"> ... </field>
    <field name="field_b"> ... </field>
    <field name="field_c"> ... </field>
    <field name="field_d"> ... </field>
  </form>
  ...
</vxml>

Page 25: Voxeo Summit 2010: Standards Update: VoiceXML3

For More V3 Info

  Follow the work
    •  http://www.w3.org/Voice

  Check out our recent Developer Jam Session
    •  http://developers.voiceobjects.com/tech-topics/monthly-jam-sessions/

  Contact me
    •  dburnett at voxeo dot com

Dan Burnett, Ph.D. Dir. of Speech Technologies, Voxeo