Multimodal Apps: Tablet PC & Speech Development in .NET
casey chesnut
brains-N-brawn.com
Wisconsin .NET June 2005
Source Code
• The associated source can be found here:
  – http://www.brains-n-brawn.com/artifacts/ugTabletSpeech.zip
Seamless Computing
• Advanced Web Services (MVP05)
• Compact Framework (MVP04)
• MapPoint
• Tablet PC (MVP03)
• Speech
• Artificial Intelligence
• Direct3D
• Media Center
Questions
• How many programmers?
  – Tablet PC
  – Speech
  – Media Center
Outline
• Tablet PC
• Speech
  – Speech API (SAPI)
  – Speech Application SDK (SASDK)
  – Speech Server
• Demo
  – Tablet and Speech
  – Media Center and Speech
Outline : Tablet PC
• Development environment
• How it works
• Working with Ink
• Opinion
• Future
Development Environment
• Windows XP Pro (non Tablet edition)
• Visual Studio .NET 1.1
• Tablet PC SDK 1.7
  – http://www.microsoft.com/downloads/details.aspx?familyid=b46d4b83-a821-40bc-aa85-c9ee3d6e9699&displaylang=en
• Recognizer Pack
  – http://www.microsoft.com/downloads/details.aspx?FamilyId=080184DD-5E92-4464-B907-10762E9F918B&displaylang=en
• Digitizer Board
  – http://www.wacom.com/productinfo/index.cfm
• Tablet PC
How Ink works
• Digitizer collects stroke information
• Strokes are broken up into characters / words / drawings
• Character / word stroke info is transformed into some feature set
• Feature set is run through some sort of pre-trained AI
• Output is mapped to a dictionary of words
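The pipeline on this slide can be sketched in a few lines of Python. This is a toy illustration, not how the Tablet PC recognizers work internally. The feature set (an 8-bin direction histogram), the nearest-template lookup standing in for the pre-trained AI, and every name below (`stroke_features`, `classify`, `snap_to_dictionary`) are assumptions made up for the example.

```python
import math
from difflib import get_close_matches


def stroke_features(points, bins=8):
    """Turn a stroke (list of (x, y) points) into a normalized direction histogram."""
    hist = [0.0] * bins
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        length = math.hypot(dx, dy)
        if length == 0:
            continue  # repeated point, no direction information
        angle = math.atan2(dy, dx) % (2 * math.pi)
        # Bins are centered on the compass directions (0, 45, 90, ... degrees).
        index = int(angle / (2 * math.pi) * bins + 0.5) % bins
        hist[index] += length
    total = sum(hist) or 1.0
    return [h / total for h in hist]


def classify(features, templates):
    """Nearest-template lookup, a stand-in for a pre-trained recognizer."""
    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(features, templates[label]))
    return min(templates, key=distance)


def snap_to_dictionary(raw_text, dictionary):
    """Map raw recognizer output to the closest dictionary word, if any."""
    matches = get_close_matches(raw_text, dictionary, n=1, cutoff=0.6)
    return matches[0] if matches else raw_text


# Hypothetical templates: one horizontal and one vertical stroke.
TEMPLATES = {
    "-": stroke_features([(0, 0), (10, 0)]),
    "|": stroke_features([(0, 0), (0, 10)]),
}
```

For example, `classify(stroke_features([(0, 0), (5, 0), (10, 1)]), TEMPLATES)` picks the horizontal template, and `snap_to_dictionary("helo", ["hello", "help", "world"])` corrects the raw text to a dictionary word.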
Demo
• Digitizer collects stroke information
• Tablet PC Inspector
  – http://codebetter.com/blogs/peter.van.ooijen/archive/0001/01/01/56161.aspx
Demo
• Strokes are broken up into characters / words / drawings
• InkDivider
  – Tablet PC SDK Sample
Demo
• Character / word stroke info is transformed into some feature set
• Feature set is run through some sort of pre-trained AI
• Demo
  – /aiTabletOcr
• Article
  – http://www.brains-N-brawn.com/aiTabletOcr/
Demo
• Output is mapped to a dictionary of words
• Dictionary Tool
  – http://blogs.msdn.com/omars/archive/2004/04/15/113597.aspx
• Article
  – http://www.brains-N-brawn.com/tabletDic/
Working with Ink
• InkControls
• InkOverlay
  – Collection
  – Recognition
• RealTimeStylus
• Ink on the web
Ink Controls
• InkEdit
• InkPicture
• Code from scratch
InkOverlay
• Collection
• Recognition
• Demo apps
RealTimeStylus
• RealTimeStylusPlugin
  – Tablet PC SDK Sample
Ink on the Web
• IE only
• InkBlogWeb
  – Tablet PC SDK Sample
• Article
  – http://www.brains-N-brawn.com/tabletWeb/
Opinion
• Green Light
  – Tablet PC Edition 2005 improved recognition and usability dramatically
  – Recognition Pack made development more accessible
  – Language Support
    • Chinese (Traditional and Simplified), U.S. English, U.K. English, French, German, Italian, Japanese, Korean, Spanish
Possible Future
• VS.NET 2005?
• Avalon?
• Will IE7 have tighter integration with ink?
• Longhorn – baked in
• Possibility for training ink recognition
What about Pocket PCs
• Handwriting Recognition
• Form factors
Outline : Speech
• How does it work?
  – Synthesis (TTS)
  – Recognition (SR)
• Development
  – Speech API (SAPI)
  – Speech Application SDK (SASDK)
  – Speech Server (MSS)
How Synthesis Works
• Text is converted to phonemes
• Phonemes are appended together
• Audio is played back
• Demo
  – /ttSpeech app
• Article
  – http://www.brains-N-brawn.com/ttSpeech/
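The three synthesis steps on this slide (text to phonemes, phonemes appended together, audio played back) can be sketched as below. This is a minimal illustration under stated assumptions, not the ttSpeech app or the SAPI engine: the two-word `LEXICON`, its phoneme symbols, and the placeholder tone generated for each phoneme (a real engine would concatenate recorded or modeled speech units) are all invented for the example. The result is an in-memory WAV file rather than live playback.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 8000  # samples per second

# Hypothetical pronunciation lexicon: word -> phoneme symbols.
LEXICON = {
    "hello": ["HH", "AH", "L", "OW"],
    "world": ["W", "ER", "L", "D"],
}


def text_to_phonemes(text):
    """Step 1: convert text to a flat phoneme sequence (unknown words are skipped)."""
    phonemes = []
    for word in text.lower().split():
        phonemes.extend(LEXICON.get(word, []))
    return phonemes


def phoneme_unit(phoneme, duration_ms=80):
    """Placeholder audio for one phoneme: a short tone whose pitch depends on its name."""
    freq = 200 + 20 * (sum(map(ord, phoneme)) % 20)
    count = SAMPLE_RATE * duration_ms // 1000
    return [int(12000 * math.sin(2 * math.pi * freq * i / SAMPLE_RATE))
            for i in range(count)]


def synthesize(text):
    """Steps 2 and 3: append the phoneme units and package them as 16-bit mono WAV bytes."""
    samples = []
    for phoneme in text_to_phonemes(text):
        samples.extend(phoneme_unit(phoneme))
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)
        out.setframerate(SAMPLE_RATE)
        out.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return buffer.getvalue()
```

Writing `synthesize("hello world")` to a `.wav` file gives something any audio player can open, which stands in for the playback step.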
How Recognition Works
• Audio wav is transformed to some meaningful form
• Phonemes are found in audio signals
• Phonemes are mapped to a dictionary of words
• Demo
  – wavReader app
• Article
  – http://www.brains-N-brawn.com/noReco/
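The recognition steps on this slide can be mimicked with a deliberately simplified sketch. The assumptions are loud ones: each "phoneme" here is just a pure tone, the "meaningful form" is a per-frame frequency estimate from zero crossings, and the names `PHONEME_TONES`, `DICTIONARY`, `tone`, and `recognize` are hypothetical. Real engines use far richer acoustic features and statistical models.

```python
import math

SAMPLE_RATE = 8000  # samples per second
FRAME = 400         # samples per analysis frame (50 ms)

# Hypothetical "phonemes", each represented by a pure tone (Hz).
PHONEME_TONES = {"A": 300.0, "B": 600.0, "C": 1200.0}

# Hypothetical dictionary: phoneme sequence -> word.
DICTIONARY = {("A", "B"): "ab", ("C", "A", "B"): "cab"}


def tone(freq, count):
    """Helper for the usage example: 'count' samples of a sine tone."""
    return [math.sin(2 * math.pi * freq * i / SAMPLE_RATE + 0.3) for i in range(count)]


def frame_features(samples):
    """Step 1: reduce each frame of audio to an estimated dominant frequency."""
    features = []
    for start in range(0, len(samples) - FRAME + 1, FRAME):
        frame = samples[start:start + FRAME]
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
        # A sine wave crosses zero twice per cycle.
        features.append(crossings * SAMPLE_RATE / (2 * FRAME))
    return features


def features_to_phonemes(features):
    """Step 2: label each frame with the nearest phoneme and merge repeated labels."""
    phonemes = []
    for value in features:
        label = min(PHONEME_TONES, key=lambda p: abs(PHONEME_TONES[p] - value))
        if not phonemes or phonemes[-1] != label:
            phonemes.append(label)
    return phonemes


def recognize(samples):
    """Step 3: map the phoneme sequence to a dictionary word (None if no match)."""
    return DICTIONARY.get(tuple(features_to_phonemes(frame_features(samples))))
```

Feeding it `tone(1200, 800) + tone(300, 800) + tone(600, 800)` walks through all three steps and comes back with the dictionary entry for the sequence C, A, B.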
Speech API (SAPI)
• Old school COM
• Windows applications
• Can do dictation
• Demo
  – SAPI app
Opinion
• Yellow light
  – It works, but is aging
  – Has to be trained for dictation
  – Limited language support
• Green light for Tablet PCs
  – Tablet PC has recognition and synthesis engines installed
  – Some Tablets have microphone arrays built in
Future
• System.Speech
  – Simple API
  – Reflection capabilities
  – Standards support (SSML, SRGS)
  – Engines should be improved from all the Speech Server work
What about Pocket PCs
• OEMs can add VoiceCommand
• WindowsMobile has the SAPI API, but no engines
• PlatformBuilder is supposed to have engines
• There are 3rd party engines for purchase
Speech Application SDK
• VS.NET 1.1 integration
• For web based apps
  – Voice-only telephony
  – Multimodal browser
• Demo
  – Code voice-only from scratch
• Article
  – http://www.brains-N-brawn.com/noHands/
SASDK
• Speech Synthesis
  – Inline
  – Code behind
  – Prompt functions
  – Prompt databases
• Speech Recognition
  – Inline
  – Static Grammar
  – Dynamic Grammar
  – DTMF
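To make the static versus dynamic grammar distinction concrete: a static grammar is a file written ahead of time, while a dynamic grammar is generated at run time from data such as a database query. The sketch below builds a small grammar in the W3C SRGS XML format from a list of phrases. It assumes the recognizer accepts SRGS XML; the function name `build_grammar` is hypothetical, and the SASDK calls that attach a grammar to a page are not shown.

```python
import xml.etree.ElementTree as ET

SRGS_NS = "http://www.w3.org/2001/06/grammar"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def build_grammar(rule_id, phrases, lang="en-US"):
    """Build an SRGS XML grammar whose root rule accepts any one of 'phrases'."""
    ET.register_namespace("", SRGS_NS)  # serialize SRGS as the default namespace
    grammar = ET.Element(
        "{%s}grammar" % SRGS_NS,
        {"version": "1.0", "root": rule_id, "{%s}lang" % XML_NS: lang},
    )
    rule = ET.SubElement(grammar, "{%s}rule" % SRGS_NS, {"id": rule_id, "scope": "public"})
    one_of = ET.SubElement(rule, "{%s}one-of" % SRGS_NS)
    for phrase in phrases:
        ET.SubElement(one_of, "{%s}item" % SRGS_NS).text = phrase
    return ET.tostring(grammar, encoding="unicode")


# Example: phrases that might come from a database lookup at request time.
CITY_GRAMMAR = build_grammar("city", ["Madison", "Milwaukee", "Green Bay"])
```

Because the phrase list is an ordinary Python list, the same function covers both cases: call it once and save the output for a static grammar, or call it per request for a dynamic one.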
Speech Server
• Runs SASDK applications
• Primarily for Voice-only apps
• Also for Multimodal PocketPC apps
• Speech Language Packs
  – North American Spanish
  – Canadian French
• Article
  – http://www.brains-N-brawn.com/speechMulti/
Deployment
Opinion
• Green light for Voice-Only
  – Great tool support
  – Cheap hardware
  – Language support
• Red light for Multimodal
  – Standards battle with VoiceXML
  – IE Speech Add-Ins are not accessible
  – Pocket IE Speech Add-In not updated for R2 release, nor does it support Smartphone
Possible Future
• VS.NET 2005?
• XAML?
• Will IE7 have voice browsing built-in?
• Other browsers to add SALT support?
• Pocket IE Professional?
Combo Demos
• Ink and Speech (WinForm)
  – InkCollection app
  – http://www.brains-N-brawn.com/tabletStrator/
• Ink and Speech (WebForm)
  – Video
  – http://www.brains-N-brawn.com/tabletWeb/
• Remote and Speech (AddIn)
  – http://www.brains-N-brawn.com/mceSAPI/
• Remote and Speech (HostedHTML)
  – http://www.brains-N-brawn.com/mceSALT/
Questions