Forschungszentrum Telekommunikation Wien An initiative of the Kplus Programme Multimodal applications for mobile devices in Java Michael Pucher (FTW Vienna) Georg Niklfeld (FTW Vienna), Robert Finan (Mobilkom Austria AG), Wolfgang Eckhart (Sonorys Vienna AG)


Forschungszentrum Telekommunikation Wien

An initiative of the Kplus Programme

Multimodal applications for mobile devices in Java

Michael Pucher (FTW Vienna)

Georg Niklfeld (FTW Vienna),

Robert Finan (Mobilkom Austria AG),

Wolfgang Eckhart (Sonorys Vienna AG)

Contents

Multimodality
- History and types of multimodality
- The importance of multimodality for mobile devices
- Applications

Architectures and Algorithms
- Logical design of multimodal applications
- Server and client side speech processing
- Java class architecture
- Multimodal integration algorithms in Java
  - Parsing and Integration
  - Servlet/Midlet architecture

VoiceXML

History and types of multimodality

Multimodality research since the 1980s

Early versus late fusion

Types of multimodality

- First-order multimodality allows sequential multimodal input
- Second-order multimodality allows uncoordinated, simultaneous multimodal input
- Third-order multimodality allows coordinated, simultaneous multimodal input

The importance of multimodality for mobile devices

Multimodal communication is perceived as natural

Disadvantages of unimodal interfaces for mobile devices

- Small displays
- No comfortable alphanumeric keyboards
- Visual access to the display is not always possible

Disadvantages cannot be overcome by increasing processor and memory capabilities

Applications

- List selection (e.g. addresses)
- Map navigation (location-based services, GPS)
- Voice mail
- Car environments
- Advanced call management
- Specialized applications for mobile working environments

Logical design of multimodal applications

[Figure: architecture diagrams labeled "Visual Browser", "Voice Browser", and "Final architecture"]

Server and client side speech processing

- Server-based ASR and TTS
- Embedded ASR and TTS
- Distributed Speech Recognition (DSR)
  - ETSI standard
  - Feature extraction
  - Compression and error detection (4800 bit/s)
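To put the 4800 bit/s figure in perspective, the following minimal sketch converts it to bytes per second and compares it with an uncompressed audio stream. The PCM parameters (8 kHz, 16 bit, mono) are an assumption chosen for comparison only and are not taken from the slides; the class name is hypothetical.

```java
// Back-of-the-envelope comparison of a DSR feature stream with raw PCM audio.
final class DsrBandwidth {
    // From the slide: DSR stream after compression and error detection.
    static final int DSR_BITS_PER_SECOND = 4800;

    // Assumption for comparison only: 8 kHz, 16-bit, mono PCM.
    static final int PCM_SAMPLE_RATE_HZ = 8000;
    static final int PCM_BITS_PER_SAMPLE = 16;

    static int bytesPerSecond(int bitsPerSecond) {
        return bitsPerSecond / 8;
    }

    static int pcmBitsPerSecond() {
        return PCM_SAMPLE_RATE_HZ * PCM_BITS_PER_SAMPLE;
    }

    // How many times smaller the DSR stream is than the assumed PCM stream.
    static double reductionFactor() {
        return (double) pcmBitsPerSecond() / DSR_BITS_PER_SECOND;
    }

    public static void main(String[] args) {
        System.out.println("DSR bytes/s: " + bytesPerSecond(DSR_BITS_PER_SECOND));
        System.out.println("PCM bits/s:  " + pcmBitsPerSecond());
        System.out.println("Reduction:   " + reductionFactor());
    }
}
```

Under these assumptions the feature stream needs 600 bytes per second, which is why DSR reduces network traffic compared with sending the audio itself to a server-based recognizer.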

[Figure: chart on a scale from 0 to 100 with the labels Optimum, Server, Embedded, and DSR, and the labels Data requirements, Application complexity, and Network traffic]

Java class architecture

MapInterface
- Methods: drawRoute(), drawDistance(), showContent() (three overloads), setAdress()

SocketApplet
- Fields: res : java.util.ResourceBundle, otherport : int, appletport : int
- Methods: SocketApplet(), init(), start(), run(), sendMMObject(), destroy()

TestApplet
- Fields: res : java.util.ResourceBundle
- Methods: TestApplet(), init(), setAdress(), drawRoute(), drawDistance(), showContent() (three overloads)

MMIntegrator
- Fields: transactionHashTable, ruleList : java.util.LinkedList
- Methods: MMIntegrator(), setTransaction(), removeTransaction(), transactionExists(), getTransaction(), clearActions(), getReactions(), updateMMObject()

SocketThread
- Fields: res : java.util.ResourceBundle, appletPort : int
- Methods: SocketThread(), run(), sendMMObject()

VXMLServlet
- Fields: res : java.util.ResourceBundle, appletPort : int, vxmlsend : boolean = false
- Methods: init(), doGet(), doPost(), updateAndSendObject(), sendVXML() (two overloads), sendMMObject(), destroy()

MMObject (from mmobject)
- Fields: expired : boolean = false, timestamp : java.util.Date
- Methods: MMObject(), factory(), setExpired(), isExpired(), setTime(), getTime(), clone()

Association role labels in the diagram: +$mMIntegrator, -socketThread, -mmActual

MMAction (class diagram)

MMObject
- Fields: expired : boolean = false, timestamp : java.util.Date
- Methods: MMObject(), factory(), setExpired(), isExpired(), setTime(), getTime()

MMAction
- Methods: MMAction()

SpeechAction
- Methods: SpeechAction()

SpeechCommand
- Fields: name : String = ""
- Methods: SpeechCommand(), setName(), getName()

ContentCommand
- Methods: ContentCommand()

RouteDistance
- Methods: RouteDistance()

RouteShow
- Methods: RouteShow()

VisualAction
- Methods: VisualAction()

MapSelect
- Methods: MapSelect()

MenuItemSelect
- Methods: MenuItemSelect()

PointClick
- Methods: PointClick(), setPoint(), getPoint()

MMReaction (class diagram)

MMObject
- Fields: expired : boolean = false, timestamp : java.util.Date
- Methods: MMObject(), factory(), setExpired(), isExpired(), setTime(), getTime()

MMReaction
- Methods: MMReaction(), act()

SpeechReaction
- Fields: prompt : String = ""
- Methods: SpeechReaction(), setPrompt(), getPrompt(), act()

SayDistance
- Fields: res : java.util.ResourceBundle, x1 : Integer, y1 : Integer, x2 : Integer, y2 : Integer
- Methods: SayDistance(), act()

SayRoute
- Fields: res : java.util.ResourceBundle, x1 : Integer, y1 : Integer, x2 : Integer, y2 : Integer
- Methods: SayRoute(), act()

Answer
- Methods: Answer(), act()

VisualReaction
- Methods: VisualReaction(), act()

ShowDistance
- Methods: ShowDistance(), act()

ShowRoute
- Methods: ShowRoute(), act()

ShowContent
- Fields: name : String = ""
- Methods: ShowContent(), setName(), getName(), act()

PointClick
- Methods: PointClick(), setPoint(), getPoint()

MMRule (class diagram)

MMRule
- Fields: intActionSize : int = 0
- Methods: MMRule(), factory(), addMMAction(), integrateActions(), disintegrateActions(), clearActions(), getMMReaction()

Route
- Methods: Route(), integrateActions(), disintegrateActions(), addMMAction()

Route1
- Methods: Route1(), integrateActions(), disintegrateActions(), addMMAction()

Content
- Methods: Content(), integrateActions(), disintegrateActions(), addMMAction()

Content1
- Methods: Content1(), integrateActions(), disintegrateActions(), addMMAction()

Content2
- Methods: Content2(), integrateActions(), disintegrateActions(), addMMAction()

Multimodal integration algorithms in Java

public MMReaction[] getReactions(String id)
{
    Transaction trans = this.getTransaction(id);
    // Discard actions that have expired before matching them against the rules.
    trans.removeOldObjects();

    // Offer every remaining action to every rule.
    ListIterator actions = trans.getAllObjects();
    while (actions.hasNext())
    {
        MMAction mma = (MMAction) actions.next();
        ListIterator rules = ruleList.listIterator();
        while (rules.hasNext())
        {
            ((MMRule) rules.next()).addMMAction(mma);
        }
    }

    ...

    // Let every rule try to integrate the actions it has collected.
    ListIterator rulesI = ruleList.listIterator();
    while (rulesI.hasNext())
    {
        ((MMRule) rulesI.next()).integrateActions();
    }

    // Return the reactions of the first rule that produced any.
    ListIterator rulesR = ruleList.listIterator();
    while (rulesR.hasNext())
    {
        MMReaction[] mmreac = ((MMRule) rulesR.next()).getMMReaction();
        if (mmreac != null)
            return mmreac;
    }
    return null;
}

Handling Parsing and Integration in MMIntegrator
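The control flow of getReactions can be tried out in isolation with the minimal sketch below. It is not the project's code: TimedAction, IntegrationRule, CountingRule, and IntegratorSketch are hypothetical stand-ins for MMAction, MMRule, a concrete rule, and MMIntegrator, reactions are reduced to strings, and expiry is assumed to be a simple maximum age in milliseconds.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for MMAction: a name plus a creation time. */
final class TimedAction {
    final String name;
    final long timestampMillis;

    TimedAction(String name, long timestampMillis) {
        this.name = name;
        this.timestampMillis = timestampMillis;
    }
}

/** Hypothetical stand-in for MMRule. */
interface IntegrationRule {
    void addAction(TimedAction action);
    void integrateActions();
    String[] getReactions(); // null while the rule has not fired
}

/** Toy rule: fires once it has seen enough actions with a given name. */
final class CountingRule implements IntegrationRule {
    private final String wanted;
    private final int needed;
    private int seen;
    private String[] reactions;

    CountingRule(String wanted, int needed) {
        this.wanted = wanted;
        this.needed = needed;
    }

    public void addAction(TimedAction action) {
        if (action.name.equals(wanted)) seen++;
    }

    public void integrateActions() {
        if (seen >= needed) reactions = new String[] {"fired:" + wanted};
    }

    public String[] getReactions() {
        return reactions;
    }
}

/** Hypothetical stand-in for MMIntegrator. */
final class IntegratorSketch {
    private final List<IntegrationRule> rules = new ArrayList<>();
    private final long maxAgeMillis; // assumed expiry criterion

    IntegratorSketch(long maxAgeMillis) {
        this.maxAgeMillis = maxAgeMillis;
    }

    void addRule(IntegrationRule rule) {
        rules.add(rule);
    }

    String[] getReactions(List<TimedAction> actions, long nowMillis) {
        // Pass 1: remove actions that are older than the allowed age.
        for (Iterator<TimedAction> it = actions.iterator(); it.hasNext();) {
            if (nowMillis - it.next().timestampMillis > maxAgeMillis) it.remove();
        }
        // Pass 2: offer every remaining action to every rule.
        for (TimedAction action : actions) {
            for (IntegrationRule rule : rules) rule.addAction(action);
        }
        // Pass 3: let every rule integrate what it has collected.
        for (IntegrationRule rule : rules) rule.integrateActions();
        // Pass 4: the first rule with a result wins.
        for (IntegrationRule rule : rules) {
            String[] reactions = rule.getReactions();
            if (reactions != null) return reactions;
        }
        return null;
    }
}
```

Because the first rule with a non-null result wins, the order of the rule list acts as a priority order in this sketch.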

public void addMMAction(MMAction mmo)
{
    // Slots 0 and 1 take the two map clicks in arrival order;
    // slot 2 takes the spoken route command.
    if (mmo instanceof PointClick && actArray[0] == null)
    {
        this.intActionSize++;
        actArray[0] = mmo;
    }
    else if (mmo instanceof PointClick && actArray[1] == null)
    {
        this.intActionSize++;
        actArray[1] = mmo;
    }
    else if (mmo instanceof RouteShow && actArray[2] == null)
    {
        this.intActionSize++;
        actArray[2] = mmo;
    }
}

public void integrateActions()
{
    // Integrate only when both clicks and the route command are present.
    if (this.intActionSize == 3)
    {
        ShowRoute show = (ShowRoute) this.reacArray[0];
        show.pc0 = (PointClick) this.actArray[0];
        show.pc1 = (PointClick) this.actArray[1];

        SayRoute say = (SayRoute) this.reacArray[1];
        say.pc0 = (PointClick) this.actArray[0];
        say.pc1 = (PointClick) this.actArray[1];
    }
}

Handling Parsing and Integration in Route (MMRule)
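The slot-filling idea behind the Route rule can be reduced to a few lines. The sketch below is a simplified illustration, not the original class: RouteRuleSketch and its method names are hypothetical, clicks are plain coordinate pairs, and the reaction is a descriptive string instead of ShowRoute and SayRoute objects.

```java
/** Hypothetical, simplified rule: two map clicks plus one "show route" command. */
final class RouteRuleSketch {
    private int[] start;          // first click (x, y), or null
    private int[] end;            // second click (x, y), or null
    private boolean commandSeen;  // has the route command arrived?

    void addClick(int x, int y) {
        if (start == null) {
            start = new int[] {x, y};
        } else if (end == null) {
            end = new int[] {x, y};
        }
        // Further clicks are ignored once both slots are filled.
    }

    void addRouteCommand() {
        commandSeen = true;
    }

    boolean isComplete() {
        return start != null && end != null && commandSeen;
    }

    /** Returns a description of the route, or null while a slot is still empty. */
    String integrate() {
        if (!isComplete()) return null;
        return "route (" + start[0] + "," + start[1] + ") -> (" + end[0] + "," + end[1] + ")";
    }

    /** Empties all slots so the rule can be reused for the next interaction. */
    void clear() {
        start = null;
        end = null;
        commandSeen = false;
    }
}
```

The command may arrive before, between, or after the clicks; only the order of the two clicks determines which point is the start and which is the end.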

public MMReaction[] getReactions(String id)
{
    ...
    while (actions.hasNext())
    {
        MMAction mma = (MMAction) actions.next();
        ListIterator rules = partialRuleList.listIterator();
        while (rules.hasNext())
        {
            ((MMRule) rules.next()).addMMAction(mma);
        }
    }
    ...

Optimizing parsing and using probabilistic information

1. Add a probability to each MMAction, based on empirical investigations (usability studies).
2. Calculate the probability after the integration, depending either on a specific rule for each MMRule or on a global rule, using the timestamp variable of MMObject. For example, it is likely that the SpeechCommand occurs between the PointClick commands and not before them.

public void integrateActions()
{
    ...
    ((ShowRoute) this.reacArray[0]).calcProb();
    ((SayRoute) this.reacArray[1]).calcProb();
    ...
}
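The slides do not show the body of calcProb. One possible reading of the two steps above is sketched here: each action carries its own probability, and a temporal factor rewards the case where the speech command falls between the two clicks. The class name, the method signatures, and the weights 0.8 and 0.2 are illustrative assumptions; real values would have to come from usability studies.

```java
/** Hypothetical scoring of an integrated route interpretation. */
final class TemporalScoreSketch {
    // Illustrative weights only, not measured values.
    static final double SPEECH_BETWEEN_CLICKS = 0.8;
    static final double SPEECH_OUTSIDE_CLICKS = 0.2;

    /** Rewards a speech timestamp that lies between the two click timestamps. */
    static double temporalFactor(long click0Millis, long speechMillis, long click1Millis) {
        long first = Math.min(click0Millis, click1Millis);
        long last = Math.max(click0Millis, click1Millis);
        boolean between = speechMillis >= first && speechMillis <= last;
        return between ? SPEECH_BETWEEN_CLICKS : SPEECH_OUTSIDE_CLICKS;
    }

    /** Combines the per-action probabilities with the temporal factor. */
    static double calcProb(double[] actionProbabilities,
                           long click0Millis, long speechMillis, long click1Millis) {
        double probability = temporalFactor(click0Millis, speechMillis, click1Millis);
        for (double p : actionProbabilities) {
            probability *= p;
        }
        return probability;
    }
}
```

Multiplying the factors treats the actions as independent, which is a simplification; a rule-specific model could replace temporalFactor without changing the caller.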

Servlet/Midlet architecture

The act method is executed in the context of a Servlet

public void act(Object obj) throws Exception
{
    ((HttpServletResponse) obj).setContentType(res.getString("contenttype"));
    PrintWriter out = ((HttpServletResponse) obj).getWriter();
    out.println(res.getString("xmlversion"));
    out.println(res.getString("vxmlversion"));
    .....
    // Print VoiceXML page here
    .....
}

The act method is executed in the context of an Applet/Midlet. The Applet/Midlet implements MapInterface.

public void act(Object obj) throws Exception
{
    ((MapInterface) obj).drawRoute(pc0.getPoint(), pc1.getPoint());
}

Act method of SayRoute and ShowRoute
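Both act methods receive their execution context as a plain Object and cast it. The sketch below illustrates that dispatch pattern with stand-in types so that it runs without a servlet container or a device: MapView replaces MapInterface, an Appendable replaces the servlet response writer, and all class names are hypothetical. It also adds an explicit type check before the cast, which is a suggestion rather than something shown on the slides.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for MapInterface. */
interface MapView {
    void drawRoute(int x0, int y0, int x1, int y1);
}

/** Hypothetical stand-in for MMReaction. */
interface Reaction {
    void act(Object context) throws Exception;
}

/** Visual reaction: needs a map as its context (Applet/Midlet side). */
final class ShowRouteSketch implements Reaction {
    private final int x0, y0, x1, y1;

    ShowRouteSketch(int x0, int y0, int x1, int y1) {
        this.x0 = x0; this.y0 = y0; this.x1 = x1; this.y1 = y1;
    }

    @Override
    public void act(Object context) {
        if (!(context instanceof MapView)) {
            throw new IllegalArgumentException("ShowRoute needs a MapView context");
        }
        ((MapView) context).drawRoute(x0, y0, x1, y1);
    }
}

/** Speech reaction: needs a text sink as its context (Servlet side). */
final class SayRouteSketch implements Reaction {
    private final int x0, y0, x1, y1;

    SayRouteSketch(int x0, int y0, int x1, int y1) {
        this.x0 = x0; this.y0 = y0; this.x1 = x1; this.y1 = y1;
    }

    @Override
    public void act(Object context) {
        if (!(context instanceof Appendable)) {
            throw new IllegalArgumentException("SayRoute needs an Appendable context");
        }
        try {
            ((Appendable) context).append(
                "<prompt>Route from " + x0 + "," + y0 + " to " + x1 + "," + y1 + "</prompt>");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

/** Test double that records draw calls instead of painting. */
final class RecordingMap implements MapView {
    final List<String> calls = new ArrayList<>();

    @Override
    public void drawRoute(int x0, int y0, int x1, int y1) {
        calls.add(x0 + "," + y0 + "->" + x1 + "," + y1);
    }
}
```

Passing the context as Object keeps the reaction classes free of compile-time dependencies on either the servlet or the client API, at the cost of a runtime check.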

Servlet/Midlet architecture

MIDlet (from midlet)
- Fields: PausedResume : int = 0, Paused : int = 1, Active : int = 2, Destroyed : int = 3, DestroyPending : int = 4
- Methods: MIDlet(), startApp(), pauseApp(), destroyApp(), notifyDestroyed(), notifyPaused(), getAppProperty(), resumeRequest(), setState(), getState()

Datagram (from io)
- Methods: getAddress(), getData(), getLength(), getOffset(), setAddress() (two overloads), setLength(), setData(), reset()

DatagramConnection (from io)
- Methods: getMaximumLength(), getNominalLength(), send(), receive(), newDatagram() (four overloads)

Runnable (from lang)
- Methods: run()

VoiceXML

Dialogs

<?xml version="1.0" encoding="Cp1252"?>
<!DOCTYPE vxml PUBLIC '-//Nuance/DTD VoiceXML 1.0//EN'
  'http://voicexml.nuance.com/dtd/nuancevoicexml-1-2.dtd'>
<vxml>
  <form id="form1">
    <field name="pagename">
      <grammar src="http://mars.ftw.tuwien.ac.at/callmanag/gram/Main.grammar#Main" type="text/gsl" />
      <!-- German prompt: "You can leave a message, listen to a note, or access the calendar" -->
      <prompt bargein="true">Sie können eine Nachricht hinterlassen eine Notiz abhören oder auf den Kalender zugreifen</prompt>
      <filled mode="any">
        <submit method="get" enctype="application/x-www-form-urlencoded"
          next="http://mars.ftw.tuwien.ac.at/callmanag/servlet/at.ftw.voicexml.GetVoiceXMLPageServlet"
          namelist="pagename" />
      </filled>
      <catch event="noinput">
        <reprompt />
      </catch>
    </field>
  </form>
</vxml>

Grammars

[
  (
    (?eine ?neue nachricht)
    ?[hinterlassen aufnehmen aufzeichnen hinterlegen]
    ?bitte
  )
  { return("storemessage.vxml") }
]
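To make clear which utterances this grammar accepts, the sketch below re-expresses it as a Java regular expression. It assumes the usual GSL reading: parentheses form a sequence, square brackets a choice, and a leading question mark marks an optional element. The class and method names are hypothetical, and a regular expression is only an approximation of what a speech recognizer does with the grammar.

```java
import java.util.Locale;
import java.util.regex.Pattern;

/** Hypothetical text-only approximation of the "leave a message" grammar. */
final class StoreMessageGrammarSketch {
    // (?eine ?neue nachricht) ?[hinterlassen aufnehmen aufzeichnen hinterlegen] ?bitte
    private static final Pattern STORE_MESSAGE = Pattern.compile(
        "(eine )?(neue )?nachricht"
        + "( (hinterlassen|aufnehmen|aufzeichnen|hinterlegen))?"
        + "( bitte)?");

    /** Returns the grammar's return value for a matching utterance, otherwise null. */
    static String interpret(String utterance) {
        String normalized = utterance.trim()
            .toLowerCase(Locale.ROOT)
            .replaceAll("\\s+", " ");
        return STORE_MESSAGE.matcher(normalized).matches() ? "storemessage.vxml" : null;
    }

    public static void main(String[] args) {
        System.out.println(interpret("Eine neue Nachricht aufnehmen bitte"));
        System.out.println(interpret("Kalender"));
    }
}
```

The shortest accepted utterance is the single word "nachricht" ("message"); the articles, the verb ("leave", "record", "deposit"), and "bitte" ("please") are all optional.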