Andrew Sutherland Presentation
Uploaded by ajax-experience-2009
Transcript of Andrew Sutherland Presentation
Voice-enabled web apps with WAMI
By Andrew Sutherland
Founder, Quizlet.com
Who am I?
• Founder of Quizlet – online flashcards and study tool
• Founded in 2005 in high school
• 500,000 registered users
• 32,000,000 flashcards uploaded
• Sophomore at MIT
• I should be in Chemistry lecture right now…
What is WAMI?
• A research project at MIT
• A free web service API
• Plug and play:
  • voice recognition
  • audio recording
How WAMI works
• Microphone activated with a Java applet
• Audio streams to WAMI servers
• WAMI processes audio in real-time
• Javascript receives structured data of what the person said
WAMI is a web service
• Plug-and-play javascript one-liner
• You don’t have to maintain audio processing servers
• reCAPTCHA model
• More apps -> more utterances -> better quality voice recognition for all
WAMI lets javascript do the work
• Javascript can activate the microphone
  • myWami.startRecording()
• Javascript receives the text of what you said
• No clunky extra UI necessary – you build your web app how you like.
WAMI is fast
• WAMI can send results before you finish your sentence:
  1. “Put an X…”
  2. Javascript displays an “X”
  3. “…on square five”
  4. Javascript moves that “X” to square five.
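The four steps above could be handled by re-interpreting the text recognized so far each time a result arrives. The sketch below is only an illustration under assumptions, not WAMI's actual API: it assumes every callback delivers the words heard so far as a plain string, and `interpretPartial` and the `SQUARES` table are hypothetical helpers invented for this example.

```javascript
// Hypothetical lookup: spoken number words mapped to board squares 1-9.
var SQUARES = { one: 1, two: 2, three: 3, four: 4, five: 5,
                six: 6, seven: 7, eight: 8, nine: 9 };

// Interpret the text recognized so far (assumed to be a plain string).
// Returns null until a mark ("x" or "o") has been heard; afterwards returns
// { mark: "X" | "O", square: number | null }, where square stays null
// until a number word follows the mark.
function interpretPartial(text) {
  var words = text.toLowerCase().split(/\s+/);
  var result = null;
  for (var i = 0; i < words.length; i++) {
    var w = words[i];
    if (w === "x" || w === "o") {
      result = { mark: w.toUpperCase(), square: null };
    } else if (result && SQUARES.hasOwnProperty(w)) {
      result.square = SQUARES[w];
    }
  }
  return result;
}
```

With this shape, a page could draw a floating "X" as soon as `square` is still null and move it into place once a later result fills in the square.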
WAMI is grammar-based
• Recognition is restricted to a grammar defined by your app
• Grammar is compiled on page load or recompiled at any time
• Very flexible JSGF format
What’s a grammar?
#JSGF V1.0;
grammar SampleGrammar;
public <top> = turtle | giraffe | pony;
What’s a grammar?
#JSGF V1.0;
grammar SampleGrammar;
public <top> = turtle {[id=1]} | giraffe {[id=2]} |
pony {[id=3]};
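Because the grammar is just a string, an app could generate it from its own data, for example a list of vocabulary words. The following is a minimal sketch of that idea; `buildTaggedGrammar` is a hypothetical helper, and the `{[id=N]}` tag layout simply copies the slide above with ids numbered from 1.

```javascript
// Hypothetical helper: build a JSGF grammar string in which each word
// carries an {[id=N]} tag, with N counted from 1 in list order.
function buildTaggedGrammar(name, words) {
  var alternatives = words.map(function (word, index) {
    return word + " {[id=" + (index + 1) + "]}";
  });
  return "#JSGF V1.0;\n" +
         "grammar " + name + ";\n" +
         "public <top> = " + alternatives.join(" | ") + ";";
}

var grammar = buildTaggedGrammar("SampleGrammar", ["turtle", "giraffe", "pony"]);
```

The resulting string has the same form as the grammar on the slide and could presumably be passed to the app's grammar setter whenever the word list changes.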
What’s a grammar?
#JSGF V1.0;
grammar SampleGrammar;
public <top> = i [really] want (a <animal>)+;
<animal> = turtle {[id=1]} | giraffe {[id=2]} |
pony {[id=3]};
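To make the optional `[really]` and the repeated `(a <animal>)+` concrete, the sketch below expresses the same set of word sequences as a regular expression. This is purely an illustration of which sentences the grammar covers; the recognizer works on audio rather than on regexes, and `matchesGrammar` is a hypothetical name.

```javascript
var ANIMALS = ["turtle", "giraffe", "pony"];

// "i", an optional "really", "want", then one or more "a <animal>" groups.
var pattern = new RegExp(
  "^i( really)? want( a (" + ANIMALS.join("|") + "))+$"
);

// Normalize case and whitespace, then test the word sequence.
function matchesGrammar(utterance) {
  var normalized = utterance.trim().toLowerCase().replace(/\s+/g, " ");
  return pattern.test(normalized);
}
```

For example, "i want a pony" and "i really want a turtle a giraffe" fit the grammar, while "i want pony" does not, because every animal must be preceded by "a".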
Getting started
<script src="http://wami.csail.mit.edu/portal/wami.js?devKey=a1234"></script>
<script>
// Create the WAMI app inside the element with id "wamiDiv".
var myWami = new WamiApp($('wamiDiv'), {
  onRecognitionResult : receiveWAMIguess,  // called with each recognition result
  onReady : startApp                       // called when WAMI is ready
});
myWami.setGrammar("#JSGF V1.0 …");
</script>
Javascript Data receiver
function receiveWAMIguess(obj) {
  // e.g. "You want a giraffe"
  alert("You want a " + obj.hyps[0].text);
}
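The receiver above assumes that `obj.hyps` always holds at least one hypothesis. A more defensive version might look like the sketch below. The result shape (`hyps` as an array of objects with a `text` field) is inferred only from the slide, and `topHypothesisText` and `describeGuess` are hypothetical helpers rather than part of WAMI.

```javascript
// Return the text of the top hypothesis, or null if none is available.
// Assumed shape (from the slide): { hyps: [ { text: "giraffe" }, ... ] }
function topHypothesisText(obj) {
  if (!obj || !obj.hyps || obj.hyps.length === 0) {
    return null;
  }
  var text = obj.hyps[0].text;
  return typeof text === "string" ? text : null;
}

// Build the message to show the user, with a fallback when nothing was heard.
function describeGuess(obj) {
  var text = topHypothesisText(obj);
  return text === null ? "Sorry, I didn't catch that." : "You want a " + text;
}
```

Keeping the message-building separate from `alert` also makes the handler easy to check without a browser.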
WAMI saves your audio
• Instantly replay user’s audio.
• You can download audio files to your server for long-term storage.
Real-world application
• Built WAMI into Quizlet.com studying tool.
• Users control vocabulary games by voice.
• Thousands of students using it now
• Over 1 million utterances recorded
Live DEMO!
WAMI To Do:
• Complete real-time improvement system.
• Switch from Java to Flash
Please complete an evaluation.