Speech recognition 2 DAY 15 – Sept 30, 2013



Brain & Language
LING 4110-4890-5110-7960
NSCI 4110-4891-6110
Harry Howard
Tulane University


Course organization
• The syllabus, these slides and my recordings are available at http://www.tulane.edu/~howard/LING4110/.
• If you want to learn more about EEG and neurolinguistics, you are welcome to participate in my lab. This is also a good way to get started on an honors thesis.
• The grades are posted to Blackboard.

9/30/13 Brain & Language, Harry Howard, Tulane University

REVIEW


Review
• Pitch shows fundamental frequency (F0)
• Spectrogram shows formants (F1-3)
• Sound wave
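To make the link between the sound wave and the pitch track concrete, here is a minimal sketch of one common way to estimate F0: find the lag at which a voiced signal best matches a shifted copy of itself (normalized autocorrelation). It is not Praat's actual algorithm. The synthetic three-harmonic signal, the function names, and the 75-500 Hz search range are all assumptions made for illustration.

```python
import math

def synth_voiced(f0_hz, sample_rate, duration_s):
    """Sum of the first three harmonics of f0_hz: a crude stand-in for a voiced sound."""
    n = int(sample_rate * duration_s)
    return [
        sum(math.sin(2 * math.pi * k * f0_hz * t / sample_rate) / k for k in (1, 2, 3))
        for t in range(n)
    ]

def estimate_f0(samples, sample_rate, f0_min=75.0, f0_max=500.0):
    """Return sample_rate / lag for the lag with the highest normalized autocorrelation.

    The search range (f0_min, f0_max) is an assumed default, not taken from the slides.
    """
    lag_min = int(sample_rate / f0_max)
    lag_max = int(sample_rate / f0_min)
    window = len(samples) - lag_max  # compare the same number of samples at every lag
    head = samples[:window]
    head_energy = sum(x * x for x in head)
    best_lag, best_score = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        shifted = samples[lag:lag + window]
        dot = sum(a * b for a, b in zip(head, shifted))
        norm = math.sqrt(head_energy * sum(b * b for b in shifted))
        score = dot / norm if norm > 0 else 0.0
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag

SAMPLE_RATE = 8000  # Hz, chosen so that one period of 125 Hz is exactly 64 samples
signal = synth_voiced(f0_hz=125.0, sample_rate=SAMPLE_RATE, duration_s=0.1)
f0 = estimate_f0(signal, SAMPLE_RATE)
print(round(f0, 1))
```

The estimate is only as fine as one sample of lag, so real pitch trackers interpolate around the best lag and add a voicing decision; both are left out here.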


SPEECH RECOGNITION
Ingram §5


• use Praat in class


Vowel articulation
• Tongue height: high, (mid), low
  • put your hand under your jaw and say the vowel of:
    • mat, met, mate, mitt, meat
    • meat, mitt, mate, met, mat
• Tongue advancement: front, central, back
• Lip configuration: rounded, neutral, retracted


Vowel description

         Front       Central    Back
High     i ɪ                    u ʊ
(Mid)    e ɛ         ɝ ə ɚ ʌ    o ɔ
Low      æ           a
         Retracted   Neutral    Rounded
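The chart can also be read as a small lookup table: each vowel is a bundle of feature values, and a natural class is the set of vowels that share some of them. The sketch below is one possible encoding, not part of the course materials. The names `Vowel`, `VOWELS`, `describe` and `natural_class` are made up for illustration, and only vowels whose cell in the chart is unambiguous are included.

```python
from typing import NamedTuple

class Vowel(NamedTuple):
    height: str       # "high", "mid", or "low"
    advancement: str  # "front", "central", or "back"

# Lip configuration per column, following the bottom row of the chart.
LIPS_BY_ADVANCEMENT = {"front": "retracted", "central": "neutral", "back": "rounded"}

# Illustrative subset of the chart (assumption: the remaining symbols are omitted
# because their exact cell is not certain from the flattened slide text).
VOWELS = {
    "i": Vowel("high", "front"),
    "ɪ": Vowel("high", "front"),
    "u": Vowel("high", "back"),
    "ʊ": Vowel("high", "back"),
    "e": Vowel("mid", "front"),
    "ɛ": Vowel("mid", "front"),
    "ə": Vowel("mid", "central"),
    "o": Vowel("mid", "back"),
    "ɔ": Vowel("mid", "back"),
    "æ": Vowel("low", "front"),
}

def describe(symbol):
    """Return 'height advancement lips' for one vowel symbol."""
    v = VOWELS[symbol]
    return f"{v.height} {v.advancement} {LIPS_BY_ADVANCEMENT[v.advancement]}"

def natural_class(height=None, advancement=None):
    """Return the vowels matching every feature value that is given."""
    return [
        s for s, v in VOWELS.items()
        if (height is None or v.height == height)
        and (advancement is None or v.advancement == advancement)
    ]

print(describe("u"))
print(natural_class(height="high", advancement="front"))
```

Tying lip configuration to the column mirrors the chart's bottom row; a language with front rounded vowels would need lips as a separate feature.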


Sample vowel spectrograms


• Wide-band spectrograms of the vowels of American English in a /b__d/ context.
• Top row, left to right: [i, ɪ, eɪ, ɛ, æ].
• Bottom row, left to right: [ɑ, ɔ, o, ʊ, u].


Acoustic cues and distinctive features
• Three problems
  a. Input signal
  b. Internal representation
  c. Interface between (a) and (b)
• Lexical information retrieval
  • but we only need the phonological form of a lexical item


Why speech recognition is difficult
• The segmentation problem
• The variability problem
  • coarticulation
  • The speaking environment
  • Speakers’ vocal tracts
  • Speech rate and style
  • Rate of information transmission


Lexical retrieval
• Speech perception involves phonological parsing prior to lexical access
• It is not enough to know the lexicon beforehand.
• Phonetic forms and phonological representations
• Speech/speaker normalization
• Distinctive features and acoustic cues
• Underspecified vs. fully specified
• Discrete vs. continuous
• Hierarchical organization vs. entrainment
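As one illustration of what speaker normalization can mean in practice, the sketch below z-scores each formant within a speaker (often called Lobanov normalization). This is a swapped-in example technique, not necessarily the approach Ingram or the lecture has in mind. The formant values are made-up round numbers, not measurements, and the second speaker is hypothetical: every formant is simply 20% higher, a crude model of a shorter vocal tract.

```python
import statistics

def lobanov(formants_by_vowel):
    """Z-score F1 and F2 within one speaker. Input: {vowel: (F1_hz, F2_hz)}."""
    vowels = list(formants_by_vowel)
    # Transpose to one tuple per formant: (all F1 values, all F2 values).
    columns = list(zip(*(formants_by_vowel[v] for v in vowels)))
    means = [statistics.mean(col) for col in columns]
    sds = [statistics.stdev(col) for col in columns]
    return {
        v: tuple((f - m) / s for f, m, s in zip(formants_by_vowel[v], means, sds))
        for v in vowels
    }

# Made-up values for illustration only (assumption, not data from the slides).
speaker_a = {"i": (300.0, 2300.0), "æ": (700.0, 1800.0), "u": (300.0, 900.0)}
# Hypothetical second speaker: all formants scaled up by 20%.
speaker_b = {v: (f1 * 1.2, f2 * 1.2) for v, (f1, f2) in speaker_a.items()}

norm_a = lobanov(speaker_a)
norm_b = lobanov(speaker_b)
for vowel in speaker_a:
    print(vowel, [round(z, 3) for z in norm_a[vowel]], [round(z, 3) for z in norm_b[vowel]])
```

The raw Hz values of the two speakers differ for every vowel, but their z-scores coincide, because z-scoring cancels any uniform scaling of a speaker's formants. Real speakers do not differ by a single scale factor, so real normalized values only come closer together rather than matching exactly.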


NEXT TIME
Finish Ingram §6.

☞ Go over questions at end of chapter.
