Contrast enhancement in frequency spectra of cochlear microphonic responses to complex stimuli


Hearing Research, 21 (1986) 59-65

Elsevier

HRR 00674

Contrast enhancement in frequency spectra of cochlear microphonic responses to complex stimuli

Hero P. Wit and J. Wiebe Horst

Institute of Audiology, University Hospital, P.O. Box 30.001, 9700 RB Groningen, The Netherlands

(Received 30 April 1985; accepted 7 October 1985)

Cochlear microphonic responses were measured in pigeons and guinea pigs during stimulation with complex sounds. The acoustical stimuli had many-component frequency spectra with a more or less undulating envelope. Enhancement of the peak-and-valley structure of the envelope occurred at high stimulus levels, especially if all frequency components in the stimulus had equal (cosine) phase. The observed effects could be modelled well with a simple passive electronic network, as well as with an analytical expression that describes saturation of the cochlear microphonic at high stimulus levels.

cochlear microphonic, complex stimulus, nonlinear

Introduction

In many cochlear physiology studies pure tones are used as stimuli. Such stimuli hardly exist in daily life. Natural stimuli (speech, animal sounds) have a complex nature: many components are present in frequency spectra of these sounds. Things already start getting complicated with two components, as is for instance found in studies of two-tone interactions (Wever et al., 1940; Sachs and Kiang, 1968; Legouix et al., 1973; Abbas and Sachs, 1976; Javel et al., 1978; Sellick and Russell, 1979; Cheatham and Dallos, 1982).

In measurements of auditory-nerve fiber responses to complex sounds, steady-state vowels (Sachs and Young, 1979; Young and Sachs, 1979) and synthesized consonant-vowel syllables (Sinex and Geisler, 1984) have been used. Suppression, as observed in two-tone studies, also plays an important role in the auditory-nerve encoding of more complex stimuli (Sachs et al., 1980). At high sound levels cochlear frequency analysis of periodic vowels is highly nonlinear (Sachs et al., 1980). Sinex and Geisler (1984) suggest that one role for peripheral nonlinearities in the auditory system may be to enhance the neural representation of spectral features such as formants. The same suggestion was made earlier by Houtgast (1974) after a psychoacoustical study with vowel-like sounds.

In order to obtain more insight into the processing of complex signals by the cochlea, Horst et al. (1985, 1986) used harmonic complexes in single fiber response measurements in the cat. Their stimuli consisted of a large number (typically 15) of successive equal intensity harmonics of a fundamental frequency of approximately 50 Hz. In this study, too, the majority of the fibers exhibited intensity-dependent nonlinear behaviour. At higher stimulus levels ‘edge enhancement’ was observed. This means that components near the edges of the frequency spectra of fiber responses have a higher intensity than the middle components, even though the components in the acoustical stimulus had equal intensity.

Recently Legouix and Avan (1984, 1985) described similar enhancement effects in the cochlear microphonic response to wide-band clicks in guinea pigs. This made us decide to record frequency spectra of cochlear microphonics in response to multicomponent complex stimuli, especially at the higher stimulus levels that elicit the largest enhancement effects in single nerve fiber measurements.

0378-5955/86/$03.50 © 1986 Elsevier Science Publishers B.V. (Biomedical Division)


Many of the experimentally observed nonlinear phenomena of the cochlear microphonic, including two-tone interaction, can be modeled by a simple three-diode network in cascade with a low-pass filter, as was shown by Engebretson and Eldredge (1968). Pfeiffer (1970) extended this hardware model and made an analytical equivalent to describe two-tone inhibition * of single cochlear nerve fibers. Both with the Engebretson and Eldredge model and with Pfeiffer’s simple analytical description of the cochlear microphonic nonlinearity, we could model the enhancement effects present in the recorded frequency spectra of cochlear microphonics very well.

* ‘Suppression’ is now usually preferred over the expression ‘inhibition’, which is generally reserved for processes mediated by inhibitory synapses. In this paper we will use the same expression as in the cited studies.

Material and Methods

Cochlear microphonic recordings were made in three adult pigeons (Columba livia) and in three pigmented guinea pigs approximately 6 weeks old; all animals were anaesthetized with a gas mixture of oxygen and fluothane (1.5%).

In the pigeons access to the labyrinth was achieved by the classic retro-auricular approach of Ewald (1892). One nichrome wire electrode (diameter 0.05 mm) was placed in the recessus scalae tympani, another in the vestibulum. Differential recordings were made from the signals of these two electrodes in order to cancel, to a great extent, the compound action potential component in the response signal. An earth electrode was in contact with the exposed neck muscles.

The same recording technique was used for the guinea pigs, where the active electrodes were placed in scala tympani and scala vestibuli of the basal turn of the cochlea, after a ventral approach to the bulla (Tasaki et al., 1952).

Acoustic stimuli were generated with a TDH 39 earphone and delivered to the ear canal of the animal through a silastic tube with a length of approximately 0.4 m. Acoustic resonances in this tube created a stimulus with successive maxima and minima in the envelope of its amplitude frequency spectrum (Figs. 2 and 3). Stimuli were monitored close to the ear canal with a $-inch Brüel & Kjaer condenser microphone. The electrical signal fed to the earphone was either a train of 0.1 ms long pulses (repetition frequency 50 or 100 Hz) or pseudo-random noise from a Hewlett-Packard 3722A noise generator. Settings of this noise generator were chosen in such a way that the amplitude frequency spectrum of the pseudo-random noise stimulus was identical to that of the pulse train. High-pass filtering was applied to the signal in both cases.

Frequency spectra of acoustic stimuli or cochlear microphonic signals were recorded during an experiment with a Princeton Applied Research (Unigon) type 4512 real-time spectrum analyzer in the spectrum averaging mode. These spectra were plotted with an XY-recorder.

With the sound delivery system connected to a small cavity instead of to an animal’s ear canal, the signal from the a-inch monitor microphone was applied to the Engebretson-Eldredge model (Fig. 1). The output signal from the model was analyzed in the same way as the cochlear microphonic signals.

Frequency spectra for artificial signals, similar to those used in the animal experiments, were calculated by computer with an FFT algorithm. This was done both for a cosine phase relation between all components in the signal (‘filtered pulse train’) and for random phases of the components (‘pseudo-random noise’). These calculations were repeated after nonlinear transformation of x (the time-domain representation of the signal). This transformation was simply $f(x) = x^{1/3}$, the same as that used by Pfeiffer (1970) in his two-tone inhibition model. The transformation mimics saturation of cochlear microphonics at higher stimulus levels.
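The calculation described in this paragraph can be sketched in a few lines of Python with NumPy. This is an illustrative reconstruction, not the authors' program. It reads the transformation as a real, sign-preserving cube root (`np.cbrt`). The 2048 samples, the 2048 Hz sample frequency and the 17 equal-amplitude components are taken from the Results section; the 8 Hz component spacing, the choice of harmonics 30 to 46, the random seed, the use of the full 2048-sample real FFT and all function names are assumptions made here.

```python
import numpy as np

FS = 2048                        # sample frequency in Hz (from the text)
N = 2048                         # number of time samples (from the text)
F0 = 8                           # ASSUMED spacing of the components in Hz
HARMONICS = np.arange(30, 47)    # ASSUMED positions of the 17 components


def harmonic_complex(phases):
    """Sum of equal-amplitude cosines at HARMONICS * F0 with the given phases."""
    t = np.arange(N) / FS
    return sum(np.cos(2 * np.pi * k * F0 * t + p)
               for k, p in zip(HARMONICS, phases))


def component_amplitudes(x):
    """Single-sided amplitudes at the 17 stimulus frequencies (1 Hz bins)."""
    spectrum = 2 * np.abs(np.fft.rfft(x)) / len(x)
    return spectrum[HARMONICS * F0]


rng = np.random.default_rng(0)   # fixed seed, only for repeatability
signals = {
    "cosine": harmonic_complex(np.zeros(HARMONICS.size)),
    "random": harmonic_complex(rng.uniform(0, 2 * np.pi, HARMONICS.size)),
}

# Spectra before and after the assumed cube-root transformation.
before = {name: component_amplitudes(x) for name, x in signals.items()}
after = {name: component_amplitudes(np.cbrt(x)) for name, x in signals.items()}

for name in signals:
    # Normalized output amplitudes, lowest to highest component.
    print(name, np.round(after[name] / after[name].max(), 2))
```

For the cosine-phase signal the outer components should come out larger than the middle ones, whereas the random-phase signal gives an irregular pattern that depends on the particular phases drawn. Other component positions can be tried by changing `F0` and `HARMONICS`, which is one way to probe the influence of aliasing.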

Fig. 1. Nonlinear analog model for cochlear microphonics distortion, after Engebretson and Eldredge (1968).


Results

Cochlear microphonics

There were no essential differences between the results from different animals of the same species, nor did the results from pigeons and guinea pigs differ qualitatively. In all cases the amplitude spectrum of the cochlear microphonic resembled that of the acoustic stimulus at moderate stimulus levels. At higher stimulus levels, however, the cochlear microphonic spectrum differed markedly from that of the stimulus. Apart from the presence of more distortion components in the cochlear microphonic spectrum, the shape of the spectrum envelope is of special interest: for filtered pulse train stimuli the peak-to-valley ratio in this envelope becomes larger at higher stimulus levels (Figs. 2 and 3). When the stimulus was filtered pseudo-random noise, this ‘contrast enhancement’ was not present. Instead, the envelope of the spectra became irregular at high stimulus levels.

Fig. 2. Amplitude spectra of the acoustical stimulus (upper panel, STIM.) and cochlear microphonic responses at different stimulus levels (3 lower panels: 70, 90 and 110 dB) in pigeon. Horizontal axis: frequency (kHz). Note the larger peak-to-valley ratio in the spectrum envelope at 110 dB.

Fig. 3. Amplitude spectra of the acoustical stimulus (upper panel, STIM.) and cochlear microphonic responses at two different stimulus levels (80 and 110 dB) in guinea pig. Horizontal axis: frequency (kHz). The peak-to-valley ratio in the spectrum envelope at 110 dB is larger than at 80 dB.

Fig. 4. Amplitude spectra of the input and output signal of the Engebretson-Eldredge model (Fig. 1). Horizontal axis: frequency (kHz). The input signal was a filtered pulse train. Nonlinear distortion creates enhancement of the spectral peaks in the output signal.


Nonlinear analog model

Fig. 4 shows that contrast enhancement is also present in the frequency spectrum of the output signal from the Engebretson-Eldredge model, when a filtered pulse train stimulus of sufficient level is applied to the input of this model.

Fig. 5. Results of fast Fourier transformation of a signal that is the sum of 17 equal intensity components, before (upper panel) and after a nonlinear transformation was applied to it. The components of the signal have either all equal phase (middle panel, cosine phase) or random phases (lower panel). Horizontal axes: frequency. The nonlinearity causes ‘edge enhancement’ if all components have equal phase.


Fast Fourier transformation after nonlinear transformation

A 1024-point fast Fourier transform was calculated for a time-domain signal (2048 samples; sample frequency 2048 Hz), being the sum of 17 cosine functions with equal amplitude. As before, the component frequencies were harmonically related, the spectral spacing being equal to the frequency of the (not presented) fundamental component. This was done for all cosine functions having the same phase and for random phases of the individual components. These calculations were repeated after nonlinear transformation of the signal, as described under Material and Methods. The results are given in Fig. 5. Edge enhancement, as described by Horst et al. (1985), is clearly present after nonlinear transformation, when all signal components have the same (cosine) phase. In order to check whether aliasing might cause these effects, calculations were repeated for other positions (and spacings) of the 17 spectrum components along the frequency axis. Although aliasing turned out to have a small influence on the shape of the spectra, it did not cause the edge enhancement effects.

Fig. 6. Results of fast Fourier transformation of a multicomponent signal with a spectral envelope that is the sum of two gaussian functions, before (upper panel) and after nonlinear transformation of the signal in the time domain. Spectral contrast enhancement is present if all signal components have the same (cosine) phase (middle panel). If the signal components have random phases, an irregular spectrum envelope results after transformation (lower panel). Horizontal axes: frequency.

FFTs before and after nonlinear transformation were also calculated for a signal that is again the sum of many cosine functions, but now having an envelope in the frequency domain that is the sum of two gaussian functions instead of being flat. This signal resembles the acoustical stimulus that was used in the cochlear microphonic recordings. It was reassuring to see that contrast enhancement was also present in the frequency spectra after nonlinear transformation in this model calculation (Fig. 6).
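A minimal sketch of this second calculation follows, again in Python with NumPy and again only as an illustration under stated assumptions. The paper does not give the component spacing or the gaussian centres and widths, so the 4 Hz spacing, the centres at 120 and 200 Hz, the 20 Hz width, the equal heights of the two humps and the helper names are all hypothetical. The nonlinearity is taken to be a real, sign-preserving cube root. The peak-to-valley ratio is measured here as the largest over the smallest component amplitude between the two hump centres.

```python
import numpy as np

FS = 2048                    # sample frequency in Hz (from the text)
N = 2048                     # number of samples, so FFT bins are 1 Hz apart
F0 = 4                       # ASSUMED component spacing in Hz
PEAKS_HZ = (120.0, 200.0)    # ASSUMED centres of the two gaussian humps
SIGMA_HZ = 20.0              # ASSUMED width of each hump

# Component frequencies below the Nyquist frequency and their amplitudes.
freqs = np.arange(F0, FS // 2, F0)
envelope = sum(np.exp(-0.5 * ((freqs - c) / SIGMA_HZ) ** 2) for c in PEAKS_HZ)


def synthesize(phases):
    """Time signal whose spectral envelope is the sum of two gaussians."""
    t = np.arange(N) / FS
    carriers = np.cos(2 * np.pi * freqs[:, None] * t[None, :] + phases[:, None])
    return envelope @ carriers


def component_amplitudes(x):
    """Single-sided amplitudes at the component frequencies."""
    return (2 * np.abs(np.fft.rfft(x)) / N)[freqs]


def peak_to_valley_db(amps):
    """Largest over smallest amplitude between the two hump centres, in dB."""
    between = (freqs >= PEAKS_HZ[0]) & (freqs <= PEAKS_HZ[1])
    return 20 * np.log10(amps[between].max() / amps[between].min())


cosine = synthesize(np.zeros(freqs.size))            # all components in cosine phase
ptv_before = peak_to_valley_db(component_amplitudes(cosine))
ptv_after = peak_to_valley_db(component_amplitudes(np.cbrt(cosine)))
print(f"peak-to-valley before: {ptv_before:.1f} dB, after: {ptv_after:.1f} dB")
```

Passing random phases to `synthesize` instead of zeros gives the random-phase case; the resulting spectrum then depends on the phases drawn, so a single peak-to-valley number is less meaningful there.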

Discussion

The observed enhancement effects in frequency spectra of cochlear microphonic signals are caused by interaction of signal components created by nonlinear distortion with components that were already present in the signal before it was distorted. Phase relations between the interacting components are apparently such that contrast enhancement occurs preferentially if all components of the complex stimulus have equal (cosine) phase. Whether this enhancement effect is important for auditory perception or is just a side-effect of the cochlear transduction process cannot be decided from the experiments described in this paper. On the one hand, clear effects are only observed for high stimulus levels. If the enhancement effect were functionally meaningful, we would expect it to be present at lower stimulus levels as well. On the other hand, complex natural sounds often have equal phase for all components, a condition necessary to produce enhancement. Speech, for instance, is a filtered acoustical pulse train.
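As a hedged illustration of this phase dependence (a worked step that is not part of the original analysis), consider 2M+1 equal-amplitude harmonics centred on harmonic number k_c of a fundamental f_0, all in cosine phase. The symbols M, k_c and θ are introduced here for illustration only, and the nonlinearity is assumed to be a real, sign-preserving cube root. The sum then factorizes exactly into a slowly varying envelope times a carrier, and the cube root acts on each factor separately:

```latex
\begin{align}
x(t) &= \sum_{k=k_c-M}^{k_c+M} \cos(k\theta)
      = \frac{\sin\!\big((2M+1)\theta/2\big)}{\sin(\theta/2)}\,\cos(k_c\theta),
      \qquad \theta = 2\pi f_0 t, \\
x(t)^{1/3} &= \left[\frac{\sin\!\big((2M+1)\theta/2\big)}{\sin(\theta/2)}\right]^{1/3}
      \big[\cos(k_c\theta)\big]^{1/3}.
\end{align}
```

Compression reduces the main lobe of the envelope far more than its sidelobes, and those sidelobes alternate in sign at a rate of about (M + 1/2) f_0, which corresponds to the outer components of the complex. This is at least consistent with the edge enhancement seen in Fig. 5. With random component phases no such factorization exists, so a regular enhancement pattern would not be expected.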

Many of the enhancement effects (if not all), as described for auditory nerve fibers, are already present at the level in the cochlear transduction process where the cochlear microphonic is generated. We are not the first authors to observe this. As early as 1973 Legouix et al. wrote that inhibitory effects seen in nerve fibers, stimulated by two tones simultaneously, reflect mechanical events in the cochlear partition and subsequent changes in the effective stimulating waveform triggering the auditory nerve. A similar conclusion was drawn by Sellick and Russell (1979), who showed that two-tone suppression occurs at the level of the hair cell receptor potential. This observation, in their opinion, strengthens existing physiological and anatomical evidence against a neural mechanism, such as lateral inhibition, as the basis of two-tone suppression.

Whether or not neural mechanisms are involved could be decided in future experiments in which responses to complex stimuli are recorded from auditory nerve fibers simultaneously with the cochlear microphonic response.

We have shown that there is no essential difference between the enhancement effects in pigeon and in guinea pig. This may exclude interaction between inner and outer hair cells as the cause of these effects, because the sensory cell distribution on the pigeon basilar papilla is quite different from that on the basilar membrane in mammals (Takasaka and Smith, 1971); pigeons do not have inner and outer hair cells in the strict sense, as is the case for mammals.

Acknowledgements

We thank Professor J.D. Bleeker and Hans Segenhout for assistance with the animal preparation. Miss Lida op ‘t Ende typed the manuscript and Meindert Goslinga improved the figures. This work was supported by the Heinsius Houbolt Fund.


References

Abbas, P.J. and Sachs, M.B. (1976): Two-tone suppression in auditory-nerve fibers. J. Acoust. Soc. Am. 59, 112-122.

Cheatham, M.A. and Dallos, P. (1982): Two-tone interactions in the cochlear microphonic. Hearing Res. 8, 29-48.

Engebretson, A.M. and Eldredge, D.H. (1968): Model for the nonlinear characteristics of cochlear potentials. J. Acoust. Soc. Am. 44, 548-554.

Ewald, J.R. (1892): Physiologische Untersuchungen über das Endorgan des Nervus octavus. Bergmann, Wiesbaden.

Horst, J.W., Javel, E. and Farley, G.R. (1985): Extraction and enhancement of spectral structure by the cochlea. J. Acoust. Soc. Am. 78, 1898-1901.

Horst, J.W., Javel, E. and Farley, G.R. (1986): Coding of spectral fine structure in the auditory nerve. I. Fourier analysis of period and interspike interval histograms. J. Acoust. Soc. Am. 79 (in press).

Houtgast, T. (1974): Auditory analysis of vowel-like sounds. Acustica 31, 320-324.

Javel, E., Geisler, C.D. and Ravindran, A. (1978): Two-tone suppression in auditory nerve of the cat: Rate-intensity and temporal analysis. J. Acoust. Soc. Am. 63, 1093-1104.

Legouix, J.P. and Avan, P. (1984): Étude du filtrage cochléaire dans le cas de sons complexes. J. Physiol. (Paris) 79, 64A.

Legouix, J.P. and Avan, P. (1985): Role of suppressive interactions in the cochlear microphonic response to wide-band clicks. Hearing Res. 19, 227-234.

Legouix, J.P., Remond, M.D. and Greenbaum, H. (1973): Interference and two-tone inhibition. J. Acoust. Soc. Am. 53, 409-419.

Pfeiffer, R.R. (1970): A model for two-tone inhibition of single cochlear nerve fibers. J. Acoust. Soc. Am. 48, 1373-1378.

Sachs, M.B. and Kiang, N.Y.S. (1968): Two-tone inhibition in auditory-nerve fibers. J. Acoust. Soc. Am. 43, 1120-1128.

Sachs, M.B. and Young, E.D. (1979): Encoding of steady-state vowels in the auditory nerve: representation in terms of discharge rate. J. Acoust. Soc. Am. 66, 470-479.

Sachs, M.B., Young, E.D., Schalk, T.B. and Bernardin, C.P. (1980): Suppression effects in the responses of auditory-nerve fibers to broadband stimuli. In: Psychophysical, Physiological and Behavioural Studies in Hearing, pp. 284-291. Editors: G. van den Brink and F.A. Bilsen. Delft University Press, Delft, The Netherlands.

Sellick, P.M. and Russell, I.J. (1979): Two-tone suppression in cochlear hair cells. Hearing Res. 1, 227-236.

Sinex, D.G. and Geisler, C.D. (1984): Comparison of the responses of auditory nerve fibers to consonant-vowel syllables with predictions from linear models. J. Acoust. Soc. Am. 76, 116-121.

Takasaka, T. and Smith, C.A. (1971): The structure and innervation of the pigeon’s basilar papilla. J. Ultrastruct. Res. 35, 20-65.

Tasaki, I., Davis, H. and Legouix, J.P. (1952): The space-time pattern of the cochlear microphonics (guinea pig), as recorded by differential electrodes. J. Acoust. Soc. Am. 24, 502-519.

Wever, E.G., Bray, C.W. and Lawrence, M. (1940): The interference of tones in the cochlea. J. Acoust. Soc. Am. 12, 268-280.

Young, E.D. and Sachs, M.B. (1979): Representation of steady-state vowels in the temporal aspects of the discharge pattern of populations of auditory-nerve fibers. J. Acoust. Soc. Am. 66, 1381-1403.