CS 551/651: Structure of Spoken Language Lecture 3: Phonetic Symbols and Physiology of Speech Production John-Paul Hosom Fall 2010

Page 1:

CS 551/651: Structure of Spoken Language

Lecture 3: Phonetic Symbols and Physiology of Speech Production

John-Paul Hosom, Fall 2010

Page 2:

Phonetic Symbols: the IPA

The International Phonetic Alphabet (IPA) (reproduced compliments of the International Phonetic Association, Department of Linguistics, University of Victoria, Victoria, British Columbia, Canada)

Page 3:

Phonetic Symbols: the IPA

Page 4:

Phonetic Symbols: the IPA

Tongue tip or blade touching upper lip

Produced with tip of tongue, e.g. Spanish /r/

Page 5:

Phonetic Symbols: the IPA

Other IPA symbols…

Page 6:

Phonetic Symbols: the IPA

Examples:

Page 7:

Phonetic Symbols: Worldbet

An ASCII representation of IPA, developed by Hieronymus (AT&T)

Page 8:

Phonetic Symbols: ARPAbet, TIMITbet, OGIbet

ASCII representation of English used in TIMIT corpus.
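
As a rough illustration of what an ASCII phonetic alphabet does, the sketch below maps a handful of ARPAbet-style symbols to their usual IPA counterparts. The table is a small, partial subset chosen for this example (it is not the table from the slide), and the names `ARPABET_TO_IPA` and `arpabet_to_ipa` are hypothetical. Symbols are matched case-insensitively, on the assumption that transcriptions may arrive in either upper or lower case.

```python
# Partial, illustrative ARPAbet -> IPA table (assumed subset, not exhaustive).
ARPABET_TO_IPA = {
    "IY": "i", "IH": "ɪ", "EH": "ɛ", "AE": "æ",
    "AA": "ɑ", "AH": "ʌ", "UW": "u", "UH": "ʊ",
    "SH": "ʃ", "ZH": "ʒ", "TH": "θ", "DH": "ð",
    "NG": "ŋ", "CH": "tʃ", "JH": "dʒ",
    "P": "p", "T": "t", "K": "k", "S": "s", "Z": "z",
}

def arpabet_to_ipa(transcription):
    """Convert a space-separated ARPAbet string to an IPA string.

    Raises KeyError for any symbol missing from the partial table above.
    """
    return "".join(ARPABET_TO_IPA[sym.upper()] for sym in transcription.split())

print(arpabet_to_ipa("SH IY"))     # "she"
print(arpabet_to_ipa("th ih ng"))  # "thing"
```

The point of the ASCII form is that every phone is a plain token that can be typed, stored, and split on whitespace, which is awkward to do with IPA glyphs and diacritics.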

Page 9:

Phonetic Symbols: SAMPA

An ASCII representation for multiple (European) languages

Page 10:

Acoustic Phonetics: Anatomy

[Figure: The Speech Production Apparatus (from Olive, p. 23). Labeled parts: nasal tract; (hard) palate; oral tract; velum (soft palate); velic port; tongue; tongue tip; pharynx; glottis (vocal folds and space between vocal cords); vocal folds (larynx) = vocal cords; alveolar ridge; lips; teeth]

Page 11:

Acoustic Phonetics: Anatomy

Breathing and Speech (from Daniloff, chapter 5):

• Duration of expiration in soft speech is 2.4 to 3.5 seconds; maximum value (singers, orators) is 15 to 20 seconds without distress.

• A louder voice requires inhaling more deeply after expiration; inhalation is also deeper when a longer utterance follows.

• More intense voicing requires greater lung pressure.

• Lung pressure relatively constant throughout an utterance.

• Emphasis in speech: greater tenseness in vocal folds yielding higher F0; greater lung pressure increases airflow (energy).

Page 12:

Acoustic Phonetics: Anatomy

The false vocal folds narrow the glottis during swallowing, preventing pieces of food from getting into the trachea.

Page 13:

Acoustic Phonetics: Anatomy

Phonation (from Daniloff, chapter 6):

Phonation is “conversion of potential energy of compressed air into kinetic energy of acoustic vibration.” For voiced speech:

1. Buildup of Pressure: air pressure from the lungs pushes against closed vocal folds so that P_subglottal > P_oral; buildup continues until P_subglottal - P_oral > elastic recoil force of vocal folds

2. Release: vocal folds forced open by pressure difference; burst of compressed air hits air in vocal tract, causing acoustic shock wave moving along vocal tract

Page 14:

Acoustic Phonetics: Anatomy

Phonation

3. Closure of Vocal Folds, two factors:
(a) force of elastic recoil in vocal folds
    Vocal folds have elastic or recoil force proportional to the degree of change from the resting position.
(b) Bernoulli Effect
    (i) energy at location of vocal folds is conserved: E = KE + PE
    (ii) increase in KE causes decrease in PE
    (iii) PE corresponds to pressure of air
    (iv) drop in pressure causes walls of glottis to be drawn closer together

Summary: air burst causes high rate of airflow, causes drop in pressure, causes folds to be pulled together
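
The energy argument in (b) can be put in numbers with the standard Bernoulli relation for steady, incompressible flow at constant height, P + ½ρv² = constant. The sketch below is only an idealization under that assumption; the air density of 1.2 kg/m³ and the example velocities are illustrative values, not figures from the lecture, and `pressure_drop_pa` is a hypothetical helper name.

```python
AIR_DENSITY_KG_M3 = 1.2  # approximate density of air; assumed for illustration

def pressure_drop_pa(v_slow_m_s, v_fast_m_s, density=AIR_DENSITY_KG_M3):
    """Pressure drop (Pa) when flow speeds up from v_slow to v_fast.

    From P + 0.5 * rho * v**2 = constant: the gain in kinetic energy per
    unit volume equals the loss in pressure.
    """
    return 0.5 * density * (v_fast_m_s ** 2 - v_slow_m_s ** 2)

# Hypothetical speeds: slow flow below the folds, fast flow through the
# narrow glottal opening.
for v_glottis in (5.0, 10.0, 20.0):
    print(v_glottis, pressure_drop_pa(1.0, v_glottis))
```

Because the drop grows with the square of the velocity, a faster air burst through the glottis pulls the folds together more strongly, which is the chain of causes given in the summary.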

Page 15:

Acoustic Phonetics: Anatomy

Implications:

1. vocal folds do not open and close because of separate muscle movements

2. opening and closing is automatic as long as the resting position of the vocal folds is (near) closure, and there is sufficient pressure buildup below vocal folds

3. Factors governing vocal fold vibration:
(a) position of vocal folds (degree of closeness between folds)
(b) elasticity of vocal folds, depending on position and degree of tension
(c) degree of pressure drop across vocal folds

Page 16:

Acoustic Phonetics: Anatomy

Types of phonation (from Daniloff, p. 194)

[Figure panels: quiet breathing; forced inhalation; normal phonation; whisper]

Page 17:

Acoustic Phonetics: Anatomy

The cycle of glottal vibration (from Daniloff, p. 171)

1. folds at rest
2. muscle contraction
3. increase in pressure
4. forcing folds apart
5. “explosion” open
6. acoustic shockwave
7. rebound toward closure
8. folds close, go to step (3)

Page 18:

Acoustic Phonetics: Anatomy

The cycle of glottal vibration (from Pickett, p. 50)

closure to opening, 0 to 2.1 msec

opening to closure, 2.4 to 4.5 msec

(F0 = 222 Hz)
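
The F0 value follows from the length of one full cycle: if closure, opening, and return to closure together take 4.5 msec, the fundamental frequency is the reciprocal of that period. A minimal sketch of the arithmetic (the function name `f0_from_period_msec` is just a label for this example):

```python
def f0_from_period_msec(period_msec):
    """Fundamental frequency in Hz for a glottal cycle lasting period_msec."""
    return 1000.0 / period_msec

# One full cycle on the slide runs from 0 to 4.5 msec.
f0_hz = f0_from_period_msec(4.5)
print(round(f0_hz))
```

The same relation run in reverse gives the cycle length for any pitch, e.g. a 100 Hz voice has a 10 msec glottal cycle.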

Page 19:

Acoustic Phonetics: Anatomy

Types of phonation (from Daniloff, p. 174)

voiceless, whisper, breathy voiced, creak, glottal stop

Page 20:

Acoustic Phonetics: Anatomy

Some cool (gross?) videos:

Video of fiberoptic stroboscopy exam:

http://www.youtube.com/watch?v=ajbcJiYhFKY

(ignore the background music!)

And here’s another video from http://www.voiceinfo.org/ showing the vibration of the vocal folds as a person’s pitch increases:

Vcglide.mov

Page 21:

Acoustic Phonetics: Anatomy

The effects of nasalization on vowels (from Pickett, p. 71)

Figure 4-17. An example of the effects of vowel nasalization on the vowel spectrum. The spectrum envelopes of a normal [a] and a heavily nasalized [a] were plotted… The first three formants are labeled in the normal vowel. In the nasalized vowel, there are three local reductions in spectrum level, indicated by “z’s”; these are the result of the addition of anti-resonant zeros to the vocal tract response, due to a wide-open velar port.

Page 22:

Acoustic Phonetics: Anatomy

The effects of nasalization on vowels (from Pickett, p. 71)

Coupling of the oral and nasal tract introduces pole-zero pairs (resonances & anti-resonances, occurring in pairs) in the spectrum. The amount of coupling affects the spacing between each pole and its corresponding zero, as well as their frequency locations.

1. The presence of a pole-zero pair increases the apparent bandwidth of the neighboring formants.

2. The presence of a spectral zero below F1 tends to make the location of F1 appear slightly higher (50-100 Hz) than it normally would.

3. If the zero is higher in frequency than its corresponding pole, the net effect is to reduce the amplitude of higher frequencies
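
Point 3 can be checked numerically with a toy model. The sketch below treats one pole-zero pair as a digital second-order resonator divided into a second-order anti-resonator and reports the gain relative to 0 Hz. This filter form, the 16 kHz sampling rate, the 100 Hz bandwidth, and the pole and zero frequencies are all assumptions made for illustration; they are not taken from Pickett or from the lecture.

```python
import cmath
import math

def second_order_magnitude(freq_hz, center_hz, bandwidth_hz, fs_hz):
    """Magnitude of 1 - 2r*cos(theta)*z^-1 + r^2*z^-2 evaluated at freq_hz."""
    r = math.exp(-math.pi * bandwidth_hz / fs_hz)      # radius set by bandwidth
    theta = 2.0 * math.pi * center_hz / fs_hz          # angle set by center frequency
    z_inv = cmath.exp(-2j * math.pi * freq_hz / fs_hz)
    return abs(1.0 - 2.0 * r * math.cos(theta) * z_inv + (r * z_inv) ** 2)

def pole_zero_gain_db(freq_hz, pole_hz, zero_hz, bandwidth_hz=100.0, fs_hz=16000.0):
    """Gain (dB) of one pole-zero pair at freq_hz, normalized to 0 dB at 0 Hz."""
    def raw_db(f):
        zero_part = second_order_magnitude(f, zero_hz, bandwidth_hz, fs_hz)
        pole_part = second_order_magnitude(f, pole_hz, bandwidth_hz, fs_hz)
        return 20.0 * math.log10(zero_part / pole_part)
    return raw_db(freq_hz) - raw_db(0.0)

# Hypothetical pair with the zero above the pole.
for f in (1000.0, 1400.0, 4000.0):
    print(f, round(pole_zero_gain_db(f, pole_hz=1000.0, zero_hz=1400.0), 1))
```

With the zero above the pole, the gain well above the pair comes out below the 0 Hz reference, so higher frequencies are attenuated; swapping the two frequencies reverses the sign. The local peak at the pole and dip at the zero correspond to the extra resonance and the “z” reductions described for the nasalized vowel.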