SpeeG - A Multimodal Speech- and Gesture-based Text Input Solution
Lode Hoste, Bruno Dumas and Beat Signer
SpeeG - Lode Hoste, Vrije Universiteit Brussel
Text-input for set-top boxes
Text input solutions:
- Dasher
- 8Pen
- SwiftKey
- Speech Dasher
- SpeeG
- EdgeWriter
- 1D Keyboard for Kinect
- Virtual Keyboard for Xbox
- Chatpad Controller
Virtual keyboard
Kinect 1D keyboard
Dasher
- Continuous input (joystick / gaze / ...)
- Open vocabulary
- Allows imprecise navigation
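The closing slide of this deck describes Dasher as a zoomable interface driven by probabilities, with characters in alphabetic order. A minimal sketch of that layout idea follows; it is an illustration only, not Dasher's or JDasher's actual code. The names `allocate_intervals` and `pick_symbol` and the probability values are hypothetical.

```python
def allocate_intervals(probabilities, lo=0.0, hi=1.0):
    """Split [lo, hi) into one sub-interval per symbol.

    Symbols are laid out in alphabetical order and each sub-interval is
    sized in proportion to the symbol's probability.
    """
    total = sum(probabilities.values())
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    intervals = {}
    start = lo
    for symbol in sorted(probabilities):
        size = (hi - lo) * probabilities[symbol] / total
        intervals[symbol] = (start, start + size)
        start += size
    return intervals


def pick_symbol(intervals, position):
    """Return the symbol whose interval contains the pointer position."""
    for symbol, (start, end) in intervals.items():
        if start <= position < end:
            return symbol
    return None


# Hypothetical next-character probabilities, made up for illustration.
example_probs = {"a": 0.1, "e": 0.5, "t": 0.4}
example_intervals = allocate_intervals(example_probs)
```

Because likely characters receive larger intervals, a pointer that lands only roughly in the right area still tends to select them, which is one way to read "allows imprecise navigation".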
SpeeG
Goals:
- Controller-free
- Text input
- Without training
Used technologies:
- Kinect
- CMU Sphinx
- Dasher
SpeeG Architecture
[Diagram: User, Speech Recogniser (CMU Sphinx 4), Hand Tracking (Microsoft Kinect and NITE) and GUI (JDasher), connected by steps numbered 1 to 5]
Evaluation
Input methods shown:
- SpeeG
- Speech-only
- Virtual Keyboard
- Kinect Keyboard
- 7 (male) users, aged 23-31
- Quantitative evaluation (words per minute and number of errors) and qualitative evaluation (feedback and preference)
- Sentences 1-3 (DARPA's TIMIT): "this was easy for us", "he will allow a rare lie", "did you eat yet"
- Sentences 4-6 (MacKenzie and Soukoreff): "my watch fell in the water", "the world is a stage", "peek out the window"
show 2 about 'expertise of users'
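The slides do not say how words per minute or errors were computed. The sketch below shows one common convention from text-entry research (a "word" is 5 characters, and the first character is not timed) and a word-level edit distance as one possible error count. The function names and the 20-second timing in the usage note are hypothetical, not taken from the study.

```python
def words_per_minute(transcribed, seconds):
    """WPM using the 5-characters-per-word convention.

    The first character is excluded because timing usually starts
    when it is entered. This is an assumed convention, not the
    formula confirmed by the slides.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    chars = max(len(transcribed) - 1, 0)
    return (chars / seconds) * 60.0 / 5.0


def word_errors(reference, transcribed):
    """Word-level Levenshtein distance between reference and transcription.

    Counts the minimum number of word insertions, deletions and
    substitutions. This is one possible definition of "number of errors".
    """
    ref = reference.lower().split()
    hyp = transcribed.lower().split()
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]
```

For example, `words_per_minute("the world is a stage", 20.0)` evaluates the fifth test sentence with a made-up entry time of 20 seconds, and `word_errors("did you eat yet", "did you eat jet")` counts a single substituted word.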
Virtual keyboard: 6.3 WPM
[Chart: WPM per sentence (S1-S6) for users 1-7]
Kinect Keyboard: 1.83 WPM
[Chart: WPM per sentence (S1-S6) for users 1-7, with a * marker]
Speech-only: 11 WPM
[Chart: WPM per sentence (S1-S6) for users 1-7]
SpeeG: 5.8 WPM
[Chart: WPM per sentence (S1-S6) for users 1-7]
SpeeG (chart annotations: 2.6 WPM and 7.8 WPM)
[Chart: WPM per sentence (S1-S6) for users 1-7]
Mean WPM per sentence and input device
[Chart: mean WPM per sentence (S1-S6) for Controller, Speech only, Kinect only and SpeeG. Icons shown: SpeeG, 1D Keyboard for Xbox, Virtual Keyboard for Xbox, Speech-only]
Errors per sentence and input device
[Chart: mean number of errors per sentence (S1-S6) for Controller, Speech only, Kinect only and SpeeG. Icons shown: SpeeG, 1D Keyboard for Xbox, Virtual Keyboard for Xbox, Speech-only]
Future work
- Other visualisations
- Smaller gestures
- Dedicated commands (gesture / voice)
SpeeG: A Multimodal Speech- and Gesture-based Text Input Solution
Lode Hoste, Bruno Dumas, Beat Signer
Kinect
- Controller-free text input
- Real-time correction
- Dasher, zoomable interface
  - probabilities
  - alphabetic order
  - character-level
Speech
- Non-native speakers
- Untrained voice recogniser
- 6-12 WPM
- Perceived fastest
- Game-like character
- Novice and experts
Special thanks to Jorn De Baerdenmaeker and Keith Vertaenen