The Neurophysiology of Speech

Neurophysiology of Speech T.S. Yo

Description: A 50-minute talk introducing some basics of the neurobiology of human speech to a group of researchers working on speech recognition. Date: 2008-03-05

Transcript of The Neurophysiology of Speech

Page 1: The Neurophysiology of Speech

Neurophysiology of Speech

T.S. Yo

Page 2: The Neurophysiology of Speech

References

Carlson, N. R. (1998). Audition, the body senses, and the chemical senses. Physiology of Behavior, 6th ed., pp. 185-223.

Carlson, N. R. (1998). Human communication. Physiology of Behavior, 6th ed., pp. 477-508.

Bookheimer, S. (2002). Functional MRI of language: New approaches to understanding the cortical organization of semantic processing. Annu. Rev. Neurosci., pp. 151-188.

Friederici, A. D. and Alter, K. (2004). Lateralization of auditory language functions: A dynamic dual pathway model. Brain and Language, 89, 267-276.

Page 3: The Neurophysiology of Speech

Outline

● Auditory apparatus
● MFCC
● Lesion study
● Neuroimaging
● Dynamic dual pathway model
● Can we design ASR systems by mimicking organic systems?

Page 4: The Neurophysiology of Speech

Auditory system

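[Figure: anatomy of the ear. Labels: tympanic membrane (eardrum), pinna (auricle), malleus, incus, stapes, Eustachian tube, cochlea, vestibule]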

Page 5: The Neurophysiology of Speech

Cochlea

Page 6: The Neurophysiology of Speech

Cochlea (2)

Page 7: The Neurophysiology of Speech

Auditory Pathway

Page 8: The Neurophysiology of Speech

Detecting Acoustic Features
● Pitch
  – High frequencies: place coding
  – Low frequencies: rate coding
● Loudness
  – Firing rate of the cochlear nerve fibers
● Timbre
  – Waveform decomposition

Page 9: The Neurophysiology of Speech

Localization with Neural Circuits

Page 10: The Neurophysiology of Speech

Localization with Neural Circuits

Page 11: The Neurophysiology of Speech

Vestibular System

Page 12: The Neurophysiology of Speech

MFCC
● Mel Frequency Cepstral Coefficients
  – Take the Fourier transform of (a windowed excerpt of) a signal.
  – Map the powers of the spectrum obtained above onto the mel scale, using triangular overlapping windows.
  – Take the log of the power in each mel band.
  – Take the Discrete Cosine Transform of the list of mel log-powers, as if it were a signal.
  – The MFCCs are the amplitudes of the resulting spectrum.
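The MFCC steps on this slide can be sketched in plain Python for a single frame. This is a minimal illustration under stated assumptions, not a production front end. The function names, the mel formula 2595·log10(1 + f/700), the Hamming window, the 20 filters and the 12 coefficients are all assumptions that do not come from the slide. The log is taken after the triangular mel filters are applied to the power spectrum, which is the common ordering. A direct DFT is used for transparency; a real system would likely use an FFT library.

```python
import cmath
import math


def hz_to_mel(f):
    """Convert Hz to mels (one common formula; other variants exist)."""
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def power_spectrum(frame):
    """Fourier transform via a direct DFT, returned as squared magnitudes."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(frame))) ** 2
        for k in range(n // 2 + 1)
    ]


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular overlapping windows, equally spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(top * j / (n_filters + 1)) for j in range(n_filters + 2)]
    freqs = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for j in range(n_filters):
        lo, mid, hi = edges[j], edges[j + 1], edges[j + 2]
        # Each filter rises from lo to mid, then falls from mid to hi.
        bank.append([
            max(0.0, min((f - lo) / (mid - lo), (hi - f) / (hi - mid)))
            for f in freqs
        ])
    return bank


def mfcc_frame(frame, sample_rate, n_filters=20, n_coeffs=12):
    """Return the first n_coeffs cepstral coefficients of one frame."""
    n = len(frame)
    # Hamming window to reduce spectral leakage (an assumption of this sketch).
    windowed = [x * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    power = power_spectrum(windowed)
    bank = mel_filterbank(n_filters, n, sample_rate)
    # Log of the energy in each mel band (small offset avoids log(0)).
    log_mel = [math.log(sum(w * p for w, p in zip(row, power)) + 1e-10)
               for row in bank]
    # DCT-II of the mel log-energies, treated as if they were a signal.
    return [
        sum(v * math.cos(math.pi * k * (2 * j + 1) / (2 * n_filters))
            for j, v in enumerate(log_mel))
        for k in range(n_coeffs)
    ]


# Usage: one 256-sample frame of a 440 Hz tone sampled at 8 kHz.
sample_rate = 8000
frame = [math.sin(2.0 * math.pi * 440.0 * i / sample_rate) for i in range(256)]
coeffs = mfcc_frame(frame, sample_rate)
print(len(coeffs))  # 12
```

The mel filters are narrow at low frequencies and wide at high frequencies, a rough analogue of the frequency resolution of the cochlea.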

Page 13: The Neurophysiology of Speech

From the ears to the brain
● Ear
  – Spectral signals.
  – Fourier transform done by neural circuits.
● Brain
  – Two pathways in two hemispheres
  – Left: semantics and syntax
  – Right: prosody

Page 14: The Neurophysiology of Speech

Brain Mechanisms for Language

● From lesion study to neuroimaging
● Localization of functions
● Lateralization
● Speech Production and Comprehension
● Prosody

Page 15: The Neurophysiology of Speech

Lesion Studies
● Aphasia
  – Difficulty in producing or comprehending speech caused by brain damage.
● Broca's aphasia
  – agrammatism
  – anomia
● Wernicke's aphasia
  – poor speech comprehension

Page 16: The Neurophysiology of Speech

Broca's Aphasia
● Agrammatism:
  – difficulty in understanding / using grammar
● Anomia:
  – difficulty in finding the appropriate word to describe an object, action, or attribute.
● Apraxia of speech:
  – impairment in the ability to program movements of the tongue, lips, and throat required to produce the proper sequence of speech sounds.

Page 17: The Neurophysiology of Speech

Broca's Aphasia Example
● "Yes ... Monday ... Dad, and Dad ... hospital, and ... Wednesday, Wednesday, nine o'clock and ... Thursday, ten o'clock ... doctors, two, two ... doctors and ... teeth, yah."

Page 18: The Neurophysiology of Speech

Broca's Aphasia

Page 19: The Neurophysiology of Speech

Wernicke's Aphasia
● Poor speech comprehension
● Fluent but meaningless speech
● Pure word deafness:
  – The ability to hear, to speak, and to read and write without being able to comprehend the meaning of speech.

Page 20: The Neurophysiology of Speech

Wernicke's Aphasia Example
● Examiner: What kind of work have you done?
● Patient: We, the kids, all of us, and I, we were working for a long time in the ... you know ... it's the kind of space, I mean place rear to the spedawn ...

● Examiner: Excuse me, but I wanted to know what work you have been doing.

● Patient: If you had said that, we had said that, poomer, near the fortunate, porpunate, tamppoo, all around the fourth of martz. Oh, I get all confused.

Page 21: The Neurophysiology of Speech

Wernicke's Aphasia

Page 22: The Neurophysiology of Speech

Neuroimaging Studies
● Neuroimaging
  – Functional magnetic resonance imaging (fMRI)
  – Positron emission tomography (PET)
● Subjects are asked to perform cognitive tasks while being scanned.

Page 23: The Neurophysiology of Speech

Neuroimaging
● fMRI
● PET

Page 24: The Neurophysiology of Speech

Normalizing Neuroimages
● Talairach coordinate space
  – Center: Anterior Commissure
  – X: [-65, +65]
  – Y: [+70, -90]
  – Z: [-40, +65]
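As a small illustration of what normalizing to a common coordinate space allows, the sketch below stores the axis extents listed on this slide and checks whether a reported activation coordinate falls inside them. The names `Extent`, `TALAIRACH_EXTENTS` and `in_talairach_box` are hypothetical, and the example coordinates are made up. The sketch assumes coordinates in millimetres relative to the anterior commissure, with X running left to right, Y posterior to anterior, and Z inferior to superior.

```python
from typing import NamedTuple


class Extent(NamedTuple):
    low: float
    high: float


# Axis extents as listed on the slide. The slide writes Y as [+70, -90]
# (anterior to posterior); it is stored low-to-high here.
TALAIRACH_EXTENTS = {
    "x": Extent(-65, 65),
    "y": Extent(-90, 70),
    "z": Extent(-40, 65),
}


def in_talairach_box(x, y, z):
    """Return True if (x, y, z) lies within the listed extents on every axis."""
    point = {"x": x, "y": y, "z": z}
    return all(
        TALAIRACH_EXTENTS[axis].low <= value <= TALAIRACH_EXTENTS[axis].high
        for axis, value in point.items()
    )


print(in_talairach_box(0, 0, 0))      # True: the origin (anterior commissure)
print(in_talairach_box(-50, 20, 10))  # True: a made-up left-hemisphere point
print(in_talairach_box(0, 80, 0))     # False: beyond the anterior Y limit
```

Because every subject's brain is mapped into the same box, coordinates reported by different studies can be compared directly.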

Page 25: The Neurophysiology of Speech

Semantic Conditions
● Same
  – The lawyer questioned the witness.
  – The attorney questioned the witness.
● Different
  – The man was attacked by the doberman.
  – The man was attacked by the pitbull.

Page 26: The Neurophysiology of Speech

Syntactic Conditions
● Same
  – The policeman arrested the thief.
  – The thief was arrested by the policeman.
● Different
  – The teacher was outsmarted by the student.
  – The teacher outsmarted the student.

Page 27: The Neurophysiology of Speech

Summary by Bookheimer, 2002

● The role of the left inferior frontal lobe in semantic processing and dissociations from other frontal lobe language functions.

● The organization of categories of objects and concepts in the temporal lobe.

● The role of the right hemisphere in comprehending contextual and figurative meaning.

Page 28: The Neurophysiology of Speech

Overview by Ahrens, 2007
● Past
  – Functional localization (brain damage)
● Present
  – Narrower localization + discussion of overlap and integration (neuroimaging techniques)
● Future
  – Language as a brain function (integrate knowledge about timing, context, and individual differences)

Page 29: The Neurophysiology of Speech

The Three Myths
● Myth 1: Broca’s area deals with syntax/production
  – Fact: Semantics and phonology cluster in different areas of the inferior frontal gyrus (IFG); syntax seems to be distributed throughout the IFG.
  – Fact: The IFG is activated during non-language tasks.
● Myth 2: Wernicke’s area deals with semantics/comprehension
  – Fact: There are functional subdivisions for language in the posterior temporal area.

Page 30: The Neurophysiology of Speech

The Three Myths
● Myth 3: The right hemisphere is not used when processing language
  – Fact: The right hemisphere is called upon for many integrative language processes.
    > Figurative Language and Metaphor
    > Linguistic Context
    > Prosody

Page 31: The Neurophysiology of Speech

Summary of Neuroimaging Studies

Page 32: The Neurophysiology of Speech

Dynamic Dual Pathway Model

● Spoken language comprehension requires the coordination of different subprocesses in time.
● Segmental information:
  – phonemes, syntactic elements and lexical-semantic elements.
● Suprasegmental information:
  – accentuation and intonational phrases, i.e., prosody.

Page 33: The Neurophysiology of Speech

Localization of Different Subsystems

● Segmental information:
  – syntactic and semantic information are primarily processed in a left hemispheric temporo-frontal pathway, including separate circuits for syntactic and semantic information.
● Suprasegmental information:
  – sentence-level prosody is processed in a right hemispheric temporo-frontal pathway.

Page 34: The Neurophysiology of Speech

Dynamic Interaction
● Corpus Callosum

Page 35: The Neurophysiology of Speech

Can we design ASR systems by imitating the brain?

● An open question
  – Is it possible? Is it more effective?
● Complexity
  – Basic computation power of a neuron: 60 Hz
  – Neurons: 10^8 for input, 10^10 in the brain, each with >8000 connections
● Training time
  – How long would it take for a human being to understand language?
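To give the complexity figures a sense of scale, the arithmetic can be worked through in a few lines. This is a rough upper-bound sketch only. It assumes the slide's numbers (60 Hz per neuron, 10^10 neurons, 8000 connections each) and, as a simplification not stated on the slide, that every connection is active at the full rate. The variable names are illustrative.

```python
# Figures taken from the slide.
firing_rate_hz = 60            # "basic computation power" of one neuron
neurons_in_brain = 10**10
connections_per_neuron = 8000  # the slide says ">8000", so this is a lower bound

# Total number of connections in the brain.
total_connections = neurons_in_brain * connections_per_neuron

# Simplifying assumption: every connection carries a signal at the full rate.
events_per_second = total_connections * firing_rate_hz

print(f"{total_connections:.1e} connections")          # 8.0e+13 connections
print(f"{events_per_second:.1e} events per second")    # 4.8e+15 events per second
```

Even as a crude estimate, the result shows why imitating the brain connection by connection is a question of scale as much as of design.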

Page 36: The Neurophysiology of Speech

Some factors in human neural system