Speech Recognition
Introduction
• What is Speech Recognition?
- Voice Recognition?
• Where can it be used?
- Dictation
- System control/navigation
- Commercial/Industrial applications
- Hand held digital recorders
Contents:
• Continuous/Discrete
• How does it work?
• Recent improvements
• Current software options
• Future of SR
Continuous or Discrete?
• Continuous speech
- dictation
• Discrete speech
- system controls
How does SR work?
• Recognition
• Training
• Correction
• Command/Control
Recognition (1)
[Diagram components: Voice Input, Analog to Digital, Acoustic Model, Language Model, Speech Engine, Display, Feedback]
Recognition (2)
Acoustic Modeling
• Spoken words: “I think there are…”
• Phonemes: ‘ ay th-in-nk-kd dh-eh-r aa-r’
• HMMs (hidden Markov models): 5-state representation
• Speech Engine
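The "5-state representation" above can be illustrated with a small left-to-right HMM scored by the forward algorithm. This is only a sketch under stated assumptions: the discrete observation symbols, the self-loop and advance probabilities, and the emission model are all made up for illustration and are not taken from any particular speech engine.

```python
# Hypothetical 5-state left-to-right HMM; all probabilities are illustrative.
N_STATES = 5
STAY, ADVANCE = 0.6, 0.4  # assumed self-loop and next-state probabilities


def transition(i, j):
    """Left-to-right topology: stay in state i or move on to state i + 1."""
    if i == N_STATES - 1:
        return 1.0 if j == i else 0.0  # the final state can only loop
    if j == i:
        return STAY
    if j == i + 1:
        return ADVANCE
    return 0.0


def emission(state, symbol):
    """Assumed emission model: state k favours symbol k, others share the rest."""
    return 0.8 if symbol == state else 0.2 / (N_STATES - 1)


def forward_likelihood(observations):
    """Forward algorithm: P(observations | model), always starting in state 0."""
    alpha = [0.0] * N_STATES
    alpha[0] = emission(0, observations[0])
    for symbol in observations[1:]:
        alpha = [
            sum(alpha[i] * transition(i, j) for i in range(N_STATES))
            * emission(j, symbol)
            for j in range(N_STATES)
        ]
    return sum(alpha)


in_order = forward_likelihood([0, 1, 2, 3, 4])
shuffled = forward_likelihood([4, 2, 0, 3, 1])
print(in_order > shuffled)
```

A sequence that moves through the states in order scores higher than a shuffled one, which is the property an engine relies on when it compares candidate phoneme models against the incoming audio.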
Recognition (3)
Language Modeling
• Word context
• Word frequency
• Transition possibilities
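The three ideas above (word context, word frequency, transition possibilities) can be sketched with a tiny bigram model that chooses between homophones. The corpus, the add-one smoothing, and the function names are assumptions made for illustration; a real language model is trained on far more text.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus, used only to illustrate the idea.
corpus = (
    "i think there are two . i think their dog is here . there are many"
).split()

unigram_counts = Counter(corpus)        # word frequency
bigram_counts = defaultdict(Counter)    # word context: counts of (prev, word)
for prev, word in zip(corpus, corpus[1:]):
    bigram_counts[prev][word] += 1


def transition_probability(prev, word):
    """P(word | prev) with add-one smoothing over the vocabulary."""
    vocab_size = len(unigram_counts)
    seen_after_prev = sum(bigram_counts[prev].values())
    return (bigram_counts[prev][word] + 1) / (seen_after_prev + vocab_size)


def best_homophone(prev, candidates, following):
    """Pick the candidate that best fits between its two neighbouring words."""
    return max(
        candidates,
        key=lambda w: transition_probability(prev, w)
        * transition_probability(w, following),
    )


print(best_homophone("think", ["there", "their"], "are"))
print(best_homophone("think", ["there", "their"], "dog"))
```

The acoustic model alone cannot separate "there" from "their"; the surrounding words decide which one is more likely.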
Voice Training (1)
Can be done by:
• Predetermined text segments
• Individual words
Compares the new acoustic data with the old and combines them
• More training = better recognition
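One simple way to picture "combining" new acoustic data with old is a count-weighted average of feature vectors. The sketch below assumes the voice file stores a mean feature vector and a sample count per sound; that storage format and the function name `adapt_mean` are hypothetical, not how any specific product works.

```python
def adapt_mean(old_mean, old_count, new_samples):
    """Combine a stored mean with new samples, weighting each side by its count."""
    new_count = len(new_samples)
    if new_count == 0:
        return old_mean, old_count  # nothing to learn from
    dims = len(old_mean)
    new_mean = [sum(s[d] for s in new_samples) / new_count for d in range(dims)]
    total = old_count + new_count
    combined = [
        (old_mean[d] * old_count + new_mean[d] * new_count) / total
        for d in range(dims)
    ]
    return combined, total


# Stored model built from 10 samples, then 2 new samples from a training session.
mean, count = adapt_mean([1.0, 2.0], 10, [[2.0, 4.0], [4.0, 6.0]])
print(mean, count)
```

Because each side is weighted by its sample count, every extra training session pulls the stored model a little further toward the user's own voice.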
Voice Training (2)
User specific Voice file
• Voice qualities
• Pronunciation
• Patterns of word use
• Preferred vocabulary
Making Corrections
• Move cursor by voice command
• Memorize edit commands
• List of possible alternatives
• Make correction manually
Command/Control
• Desktop grid
• Program or Link name/number
• URL name
• Memorized commands
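The desktop grid item can be sketched as a numbered grid that the user narrows down by voice. The 3x3 layout, the row-major numbering, and the function names are assumptions for illustration; actual products may number or size their grids differently.

```python
def zoom(region, cell, rows=3, cols=3):
    """Return the sub-rectangle for a spoken cell number (1-based, row-major)."""
    left, top, width, height = region
    if not 1 <= cell <= rows * cols:
        raise ValueError("cell number out of range")
    row, col = divmod(cell - 1, cols)
    cell_w, cell_h = width / cols, height / rows
    return (left + col * cell_w, top + row * cell_h, cell_w, cell_h)


def pointer_target(screen, spoken_cells):
    """Narrow the screen by each spoken cell, then aim at the final cell's centre."""
    region = screen
    for cell in spoken_cells:
        region = zoom(region, cell)
    left, top, width, height = region
    return (left + width / 2, top + height / 2)


# Saying "five" then "one" on a 900x900 screen (left, top, width, height).
print(pointer_target((0, 0, 900, 900), [5, 1]))
```

Each spoken number shrinks the target area by a factor of nine, so a few words are enough to place the pointer on a small control.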
Recent Improvements in SR
• Faster training ~10 min.
• Better recognition ~95%
• More compatible software
• Better system control/command
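A recognition figure such as ~95% is commonly measured as word accuracy: one minus the word error rate, where errors are counted with an edit distance over words. The slides do not say how their figure was measured, so the sketch below is one common definition rather than the method used here; the sample sentence is made up.

```python
def word_errors(reference, hypothesis):
    """Minimum substitutions, insertions and deletions between two word lists."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            curr.append(
                min(
                    prev[j] + 1,                           # deletion
                    curr[j - 1] + 1,                       # insertion
                    prev[j - 1] + (ref_word != hyp_word),  # substitution or match
                )
            )
        prev = curr
    return prev[-1]


def word_accuracy(reference, hypothesis):
    """Word accuracy = 1 - (word errors / number of reference words)."""
    n_words = len(reference.split())
    if n_words == 0:
        raise ValueError("reference must contain at least one word")
    return 1.0 - word_errors(reference, hypothesis) / n_words


spoken = (
    "i think there are twenty words in this short sample sentence "
    "which we use to check the accuracy score here"
)
recognized = spoken.replace("there", "their")
print(round(word_accuracy(spoken, recognized), 2))
```

One wrong word in a 20-word sentence corresponds to 95% accuracy, which still means roughly one correction per sentence or two of dictation.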
Current Software Options for PC
• Dragon Systems – NaturallySpeaking
• Philips – FreeSpeech
• IBM – ViaVoice
• Lernout & Hauspie – Voice Xpress
How well do they work?
Rated on: Training, Dictation, Correction, Application Integration, Command/Control
Dragon Excellent Excellent Good Good
Philips Fair Fair Good Good
IBM Excellent Good Good Excellent
L & H Good Good Good Good
Future of SR
• SUI – Speech-based User Interface
• Improvements needed:
- Greater accuracy
- Greater system control/command
- More compatible software
Conclusion
• SR Uses
• How does it work?
• Current Software
• Problems of SR
• More SR coming soon….
References
• 1. Alwang, Greg. “Speech Recognition,” PC Magazine, December 1, 1999.
• 2. Hauptmann, Alexander G., and Jang, Photina Jaeyun (Carnegie Mellon University). “Learning to Recognize Speech by Watching Television,” IEEE Intelligent Systems, September/October 1999.
• 3. Miastkowski, Stan. “Latest Speech Software Gets You Up and Running Faster,” PC World, November 1999.