All India Institute of Speech and Hearing Manasagangothri...
STUDENT RESEARCH AT A.I.I.S.H. MYSORE
(ARTICLES BASED ON DISSERTATION DONE AT AIISH)
VOLUME V: 2006-2007
PART - A
AUDIOLOGY
Compiled by
Dr. Vijayalakshmi Basavaraj
Director
Dr. Y. V. Geetha
Prof. of Speech Sciences
All India Institute of Speech and Hearing
Manasagangothri, Mysore – 570 006
© 2010
A Publication of the All India Institute of Speech and Hearing
Under the title: “Student Research at A.I.I.S.H Mysore”
Articles based on dissertations done at AIISH: Vol. V
Telephone : 0821-2514449, 2515410, 2515805, 2515218, 2514313
Guest House : 2515786, Security (after office hours): 2514449
Telefax : 0821- 2510515
E-mail : [email protected], [email protected]
Website : www.aiishmysore.org
Work Hours : 9.00 am to 5.30 pm – Monday through Friday
Holidays : Central Government Holidays
Price: Rs. 150/-
Published by Dr Vijayalakshmi Basavaraj
Director, AIISH, Mysore
Foreword
We are happy to bring out the fifth volume of full-length articles based on the
dissertation work done by our postgraduate students in partial fulfillment of their PG degrees in
Audiology and Speech-Language Pathology.
This volume includes articles based on dissertations done by the postgraduate students
during the year 2006-07. There are 38 articles in total. Part A comprises 19 papers in
Audiology, covering a wide range of topics including evoked potential
testing and related issues; auditory processing disorders; issues related to hearing and aging;
speech recognition and speech tests; and aids and appliances for persons with hearing impairment.
Part B comprises 19 papers in Speech-Language Pathology, covering
topics in language and cognition; aphasia; motor speech disorders;
phonological processing; and voice and fluency disorders. I am glad that our faculty have
enthused the students to undertake research on a variety of topics in Audiology and
Speech-Language Pathology. The titles of the articles are the titles of the dissertations. The first authors
are the II M.Sc. students of 2006-07 and the second authors are their respective guides, who
supervised the research work. The articles reflect the recent research interests
of the students in the field of Audiology and Speech-Language Pathology. The AIISH
faculty members who guided the dissertations have modified and edited the papers to
bring them to their present shape, to the best of their abilities and in spite of their busy academic
schedules. Dr. Y. V. Geetha, Professor of Speech Sciences, has put in efforts to procure and
compile the edited articles and has herself edited many of them. This is highly
appreciated. Any mistakes in print and references that remain in spite of our best efforts
are regretted.
We hope to bring out the pending volumes of
dissertation papers at the earliest. Please e-mail your valuable feedback about
this volume to [email protected] with the subject “Student research, Volume V A/B,
2006-07”.
Dr. Vijayalakshmi Basavaraj
Director
Table of Contents
Sl. No. Title
1. Speech Identification with Single Channel, Multichannel and Channel Free Hearing Aids
Abhay Kumar R & K Rajalakshmi
2. Brainstem and Cortical Responses to Speech Stimuli in Individuals with Cochlear Hearing Loss
Anirban Chaudhury & C S Vanaja
3. Binaural Amplification – Does Technology Matter?
Anusha R & Manjula P
4. Reduction of Stimulus Artifacts in ASSR: An Investigation of a Stimulus Approach
Arivudai Nambi.P & C S Vanaja
5. Comparison of Word Recognition Scores using Different Settings of Telecoil in a Digital Hearing Aid
Bijan Saikia & P Manjula
6. Estimation of Auditory Thresholds in Cochlear Implant Subjects Using ASSR
Dayal Goswami & Rajalakshmi K
7. A search to possible pathways for later peaks of VEMP and N3 potential
Deepashri Agrawal & Animesh Barman
8. Acoustic Analysis of the Speech Processed Through Three Amplification Strategies and Their Effect on Speech Recognition Scores of Individual with Severe Hearing Impairment
Gunjan Chand, Vijayalakshmi Basavaraj & Ajish Abraham
9. Some Aspects of Temporal Processing Deficits in Individual with Learning Disability
Kishan M. M & Animesh Barman
10. Speech recognition of spectrums with 'holes' by children
Manasa Ranjan Panda & Asha Yathiraj
11. Importance of Long Latency Potential in Pediatric Hearing Assessment
Niraj Kumar & Animesh Barman
12. Pitch perception in individuals with sensorineural hearing loss with and without dead regions
Palash Dutta & K Rajalakshmi
13. Development of High and Low Predictable English Sentence Test (Ehlps)
V.V Rahana Nandan & Asha Yathiraj
14. NRT: Comparison of Artefact Cancellation and Threshold Estimation Techniques
Shibasis Chowdhury & P. Manjula
15. Speech-Evoked Auditory Late Latency Response (ALLR) in hearing aid selection
Shruti Kaul & C.S Vanaja
16. Brainstem Responses to Speech in Normal Hearing and Cochlear Hearing Loss Individuals
Sumesh.K & Animesh Barman
17. Music Processed by Hearing Aids
Sushmit Mishra & P Manjula
18. Efficacy of Frequency Transposition Hearing Aid in Dead Region Subjects
Swapna raj S & K Rajalakshmi
19. Effect of Cochlear Hearing Loss on Tone Burst Evoked Stacked Auditory Brainstem Response
Yatin Mahajan & Vanaja C
Dissertation Vol.V, Part-A, AIISH, Mysore
1
Speech Identification with Single Channel, Multichannel and
Channel-free Hearing Aids
Abhay Kumar Roy & K Rajalakshmi
Abstract
Results of investigations comparing multichannel and single channel hearing aids are equivocal; this is
due to some acoustical modifications brought about by multichannel hearing aids. The
present study aimed to compare the performance of single channel, multichannel and channel
free hearing aids in quiet and in two SNR conditions. Twelve participants in the age range of 30-60 years
took part in the study. All the participants had sensorineural hearing loss with
degree ranging from 41 to 70 dB. Speech identification scores were assessed in quiet and in two SNR
conditions (+10 dB and 0 dB SNR) for single channel, three channel, eight channel and channel free hearing
aids. Results revealed that participants performed better with the channel free hearing aid in quiet
and in noise. Performance was better with the eight channel hearing aid than with the three
and single channel hearing aids only in noise. Results suggest that the new technology overcomes
the disadvantages of multichannel hearing aids. Multichannel hearing aids
performed better only in noise, and there was no difference in performance between single and
three channel hearing aids. Thus, increasing the number of channels improves performance only in
noise.
Introduction
The last decade has seen numerous and significant improvements in the technology of
hearing aids. With the advancement of digital technology, digital hearing aids have become
increasingly common. Modern digital signal processing technology includes non-linear,
adaptive, multiple channels/bands, speech enhancement, noise reduction, feedback management
etc. The ideal number of channels has been a hot topic in rehabilitative
amplification for over a decade. Despite the ongoing debate, conventional wisdom holds that a greater
number of channels in a digital hearing aid is better, and there has been a surge in the number of
channels in commercially available instruments over the last few years.
Reader in Audiology, All India Institute of Speech and Hearing, Mysore, India
e-mail: [email protected]
Compression is one such technology, which helps to optimize the dynamic range of the
individual with hearing impairment. A compressor is a nonlinear amplifier that
automatically adjusts its gain depending upon the level of the incoming signal. Such a signal processing
feature helps to improve perception in hearing impaired individuals by normalizing
loudness, increasing sound comfort and reducing inter-syllabic and inter-phoneme
intensity differences (Dillon, 2001). Although compression technology helps the hearing impaired
individual to perceive better, the benefit that compression provides partly depends on the way it
is implemented in hearing aids. Broadly, based on the number of compression
circuits implemented, hearing aids can be classified as either single channel or multichannel hearing
aids.
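The input-dependent gain described above can be illustrated with a minimal sketch of a static compression input/output function. The linear gain (30 dB), compression threshold (50 dB) and compression ratio (2:1) are assumed values chosen only for illustration; they are not the settings of any hearing aid used in this study.

```python
def compressed_output_db(input_db, gain_db=30.0, threshold_db=50.0, ratio=2.0):
    """Output level (dB) of a simple static compressor.

    All parameter defaults are illustrative assumptions.
    Below the threshold the amplifier is linear; above it, every
    `ratio` dB of extra input produces only 1 dB of extra output.
    """
    if input_db <= threshold_db:
        return input_db + gain_db
    return threshold_db + gain_db + (input_db - threshold_db) / ratio


for level in (40, 50, 70, 90):
    out = compressed_output_db(level)
    # The applied gain shrinks as the input level rises above threshold.
    print(f"input {level} dB -> output {out:.1f} dB (gain {out - level:.1f} dB)")
```

With these assumed settings a soft input receives the full gain while an intense input receives much less, which is how compression fits a wide range of input levels into a reduced dynamic range.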
In single channel compression the entire dynamic range is optimized across the full range
of frequencies by a single compressor. In multichannel compression hearing aids this dynamic
range is optimized at discrete frequencies by using multiple compressors. Currently, hearing aids
with 1 to 20 channels are commercially available. Over the decades, attempts have been made to
investigate whether increasing the number of channels helps the hearing impaired individual to perceive better. It
may appear that the larger the number of channels, the better the compensation for individual
hearing impairment. However, an increased number of channels may also have drawbacks
worthy of consideration.
Yund and Buckles (1993) measured speech discrimination for 8 channel compression and
linear amplification. As the signal-to-noise ratio (SNR) decreased, speech identification
became relatively better with multichannel compression than with linear amplification. Yund and Buckles
(1995) reported that speech identification scores improved as the number of channels increased
from 4 to 8 and did not vary significantly between 8 and 16 channels. On the contrary, Bustamante
and Braida (1987) reported that multichannel amplification reduces speech intelligibility in
hearing impaired individuals. These findings are also supported by Drullman and Smoorenburg
(1997). Hickson (1994) reported that performance with a 4 channel hearing aid is similar
to that with a single channel hearing aid.
Studies have revealed equivocal results about the advantages and disadvantages of
multichannel hearing aids. Relative to single channel compression, multichannel compression
can increase intelligibility because it gives frequency specific amplification, which in turn
provides better audibility of speech. Unfortunately, multichannel compression also decreases
some of the essential differences between phonemes. Because a compressor gives less
amplification to intense signals than to weak signals, multichannel compressors tend to decrease
the height of spectral peaks and raise the floor of spectral valleys. That is, they partially flatten
spectral shapes. Spectral peaks and valleys give speech sounds much of their identity, so spectral
flattening makes it harder for hearing aid users to identify the place of articulation of
consonants (De Gennaro, Braida & Durlach, 1986).
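The spectral flattening described above can be made concrete with a small sketch. It compares one compressor driven by the overall level against independent compressors in each band. The four band levels, the 40 dB threshold, the 30 dB linear gain and the 2:1 ratio are hypothetical values used only to show the effect, not measurements from any hearing aid.

```python
import math


def gain_db(level_db, linear_gain_db=30.0, threshold_db=40.0, ratio=2.0):
    """Gain of a static compressor for a given detector level (assumed settings)."""
    if level_db <= threshold_db:
        return linear_gain_db
    return linear_gain_db - (level_db - threshold_db) * (1.0 - 1.0 / ratio)


def overall_level_db(band_levels_db):
    """Power sum of the band levels, expressed in dB."""
    return 10.0 * math.log10(sum(10.0 ** (l / 10.0) for l in band_levels_db))


def single_channel(band_levels_db):
    """One compressor: the same gain is applied to every band."""
    g = gain_db(overall_level_db(band_levels_db))
    return [l + g for l in band_levels_db]


def multichannel(band_levels_db):
    """One compressor per band: each band sets its own gain."""
    return [l + gain_db(l) for l in band_levels_db]


def peak_to_valley_db(levels_db):
    return max(levels_db) - min(levels_db)


# Hypothetical vowel-like spectrum: peaks at 70 and 65 dB, valleys at 50 and 45 dB.
bands = [70.0, 50.0, 65.0, 45.0]
print("input contrast      :", peak_to_valley_db(bands))
print("single channel      :", round(peak_to_valley_db(single_channel(bands)), 2))
print("four-channel, 2:1   :", round(peak_to_valley_db(multichannel(bands)), 2))
```

Under these assumptions the single compressor preserves the 25 dB peak-to-valley contrast, whereas the per-band compressors halve it to 12.5 dB, which is the flattening attributed to multichannel compression.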
Considering these opposing effects of multichannel compression, it is not surprising that
some experiments have shown multichannel compression to be better than single channel
compression (Kiessling & Steffens, 1991; Moore & Glasberg, 1986, 1988) and some have failed
to show any advantage for multichannel compression (Moore, Peters & Stone, 1998; Plomp,
1976; Walker, Byrne & Dillon, 1984). Multichannel compression decreases speech intelligibility for normal
hearing people (Hohmann & Kollmeier, 1995; Yund & Buckles, 1995). If a high compression ratio
is used in a multichannel compression hearing aid, intelligibility is also decreased for hearing
impaired listeners (Bustamante & Braida, 1987; De Gennaro, Braida & Durlach, 1986).
Whether the positive effects of multichannel compression outweigh the negative effects
depends on how much audibility is achieved in the reference condition. A net advantage for
multichannel compression is thus least likely for sounds that, in the single channel condition,
are comfortably loud and have been amplified by an appropriate gain-frequency response
shape. Thus, there is a dearth of studies comparing single channel and multichannel compression,
and most of those available show equivocal results. Hence further research is needed in the area to resolve the
ambiguity seen in the literature. The emergence of new techniques such as channel free
hearing aids also requires that they be validated alongside existing techniques such as single
channel and multichannel hearing aids. Hence the current study was undertaken to compare speech
identification scores with single channel, three channel, eight channel and channel free
hearing aids in quiet as well as in two noise conditions (+10 dB SNR and 0 dB SNR).
Method
The present study was designed to compare hearing aid performance across numbers of channels
in quiet and in different noise conditions.
Subjects
Twelve subjects (9 men and 3 women) in the age range of 35 to 60 years (mean age of
48.5 years) with a confirmed diagnosis of sensorineural hearing loss participated in the study. They had
three-frequency average (500, 1000 and 2000 Hz) pure-tone thresholds in the range of
41 to 70 dB HL, with speech identification scores greater than 50%. Tympanometry results
indicated no middle ear pathology. All of them were first-time hearing aid users. All the
participants were native speakers of Kannada (a language spoken in the Karnataka state of India).
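The audiometric inclusion criteria above amount to simple arithmetic, sketched below. The function names and the example thresholds are hypothetical and are not data from the study.

```python
def pure_tone_average(thresholds_db_hl, frequencies_hz=(500, 1000, 2000)):
    """Three-frequency pure-tone average (dB HL) from a frequency -> threshold map."""
    return sum(thresholds_db_hl[f] for f in frequencies_hz) / len(frequencies_hz)


def meets_audiometric_criteria(thresholds_db_hl, speech_identification_percent):
    """PTA of 41 to 70 dB HL and a speech identification score above 50%."""
    pta = pure_tone_average(thresholds_db_hl)
    return 41 <= pta <= 70 and speech_identification_percent > 50


# Hypothetical audiogram for one ear (dB HL at each frequency in Hz).
example_ear = {250: 40, 500: 45, 1000: 55, 2000: 65, 4000: 70, 8000: 75}
print(pure_tone_average(example_ear))               # average of 45, 55 and 65
print(meets_audiometric_criteria(example_ear, 72))  # True for this example
```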
Instrumentation
A calibrated two channel diagnostic audiometer (OB922) was used for estimation of pure-tone
thresholds. A calibrated GSI-Tympstar middle ear analyzer was used for immittance
measurements. Single channel (Terra), three channel (Cielo), eight channel (Syncro) and
channel free (Symbio XT 110) hearing aids were used for comparison of
performance. Hearing aids were programmed with NOAH based Connexx 5.3 (Terra and Cielo),
Genie 6 (Syncro) and Oasis plus 7 (Symbio) software. Hearing aids were connected to the
computer using HiPro.
Stimuli were played on a laptop at a 44.1 kHz sampling rate, with 32 bit software, using
Cyberlink Power DVD Ultra software. Stimuli were routed through the OB922 two channel
audiometer to two calibrated Martin Audio C115 loudspeakers.
Stimuli
The speech stimuli used in the present study were taken from the bi-syllabic word lists in
Kannada developed by Yathiraj and Vijayalakshmi (2005). This test contains four word lists,
each with 25 bi-syllabic words, which are phonetically balanced and equally difficult. All
four lists were selected for the present study. The words were spoken in conversational style by a
female native speaker of Kannada. They were digitally recorded in an acoustically treated room on
a data acquisition system using a 44.1 kHz sampling frequency and a 32-bit analog-to-digital
converter. All the word lists were mixed with speech babble (Anitha & Manjula, 2005) at +10 dB
and 0 dB SNR. The speech babble was mixed with the words with reference to RMS amplitude using a
program written in MATLAB 7.
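The MATLAB program itself is not reproduced in the article. The following Python sketch shows one common way of mixing a masker with speech at a target SNR defined on RMS amplitude; it is an assumption about the general approach, not the authors' code. The synthetic tone and random noise merely stand in for the recorded words and the speech babble.

```python
import math
import random


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix_at_snr(speech, babble, snr_db):
    """Scale `babble` so that 20*log10(rms(speech)/rms(babble)) equals snr_db,
    then add it to `speech`. Returns (mixture, scaled_babble)."""
    if len(babble) < len(speech):
        raise ValueError("babble must be at least as long as the speech")
    babble = babble[:len(speech)]
    target_babble_rms = rms(speech) / (10.0 ** (snr_db / 20.0))
    scale = target_babble_rms / rms(babble)
    scaled = [b * scale for b in babble]
    mixture = [s + n for s, n in zip(speech, scaled)]
    return mixture, scaled


# Stand-in signals (assumptions): a 1 kHz tone for the word, noise for the babble.
fs = 44100
random.seed(0)
speech = [0.5 * math.sin(2 * math.pi * 1000 * n / fs) for n in range(fs // 10)]
babble = [random.uniform(-1.0, 1.0) for _ in range(fs // 10)]

for snr in (10, 0):
    mixture, scaled = mix_at_snr(speech, babble, snr)
    achieved = 20.0 * math.log10(rms(speech) / rms(scaled))
    print(f"target {snr:+d} dB SNR, achieved {achieved:+.2f} dB")
```

In practice the mixture would also be checked for clipping before being written to file, since adding the masker raises the peak amplitude.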
Procedure
Pure-tone thresholds were obtained using the modified Hughson and Westlake procedure
(Carhart & Jerger, 1959) at octave frequencies from 250 to 8000 Hz for air conduction and
250 to 4000 Hz for bone conduction.
Tympanometric measurements were done using a 226 Hz probe tone to
rule out conductive hearing loss due to middle ear pathology. Appropriate probe tips were used
to obtain a hermetic seal and comfortable pressure for the subject. The parameters documented
were the type of tympanogram and the acoustic reflex thresholds, together with ear canal volume,
acoustic admittance and tympanometric peak pressure. The results were also correlated with
the ENT findings.
Hearing aids were programmed on the basis of audiometric thresholds with the default
gain provided by the software. Syncro and Cielo had noise management technology. During
programming, these noise management options were switched off in order to avoid any unwanted
effect on the results. All the hearing aids were set to omnidirectional microphone mode, as
there was no need for noise reduction during testing.
Testing was done in an acoustically treated room with noise levels within permissible limits as per
ANSI (1991) specifications. Subjects were seated at a distance of one meter and at 45° azimuth
from the speakers. Testing was done first in the unaided condition and later in the aided conditions. In
the aided conditions, hearing aids were selected in random order for fitting and testing. Stimuli were
played on a laptop at a 44.1 kHz sampling rate with a 32 bit operating system and were routed
through the two channel audiometer (OB922). The intensity level was maintained at 40 dB HL
throughout the testing and the inter-stimulus interval was kept constant at 5 seconds. Written
responses were obtained from the subjects; in the case of illiterate subjects, the responses were
scored by a Kannada speaker.
Results
A. Speech Identification Scores in Quiet
The speech identification scores of 12 subjects (15 ears) in unaided and aided conditions
are presented in Figure 1. A repeated measures ANOVA was performed to assess
differences across conditions (unaided and 4 aided conditions). Results showed a
significant difference across conditions (F (4, 39.3) = 14.7, p<0.01). Scheffé post hoc
analysis revealed a significant difference between the unaided condition and the aided conditions (p<0.01),
but the differences in means across the hearing aids did not reach significance.
However, Figure 1 shows that the channel free hearing aid had higher scores than
the other hearing aids.
Figure 1: Speech identification scores in quiet condition
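For readers unfamiliar with the analysis, a one-way repeated measures ANOVA partitions the total variability into condition, subject and error components. The sketch below is a minimal illustration on made-up scores (3 subjects, 3 conditions), not the study data, and it omits any correction for violations of sphericity and any post hoc testing.

```python
def rm_anova_f(scores):
    """One-way repeated measures ANOVA.

    scores[i][j] is the score of subject i in condition j.
    Returns (F, df_conditions, df_error).
    """
    n = len(scores)          # number of subjects
    k = len(scores[0])       # number of conditions
    grand = sum(sum(row) for row in scores) / (n * k)
    cond_means = [sum(row[j] for row in scores) / n for j in range(k)]
    subj_means = [sum(row) / k for row in scores]

    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_error = ss_total - ss_cond - ss_subj   # condition x subject residual

    df_cond = k - 1
    df_error = (k - 1) * (n - 1)
    f_value = (ss_cond / df_cond) / (ss_error / df_error)
    return f_value, df_cond, df_error


# Made-up scores, for illustration only.
made_up_scores = [[1, 2, 3],
                  [2, 3, 5],
                  [3, 5, 6]]
f_value, df1, df2 = rm_anova_f(made_up_scores)
print(f"F({df1}, {df2}) = {f_value:.2f}")
```

Removing the between-subject sum of squares from the error term is what distinguishes this analysis from an independent-groups ANOVA and makes it suited to designs in which every participant is tested with every hearing aid.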
B. Speech Identification Scores in Noise
Figure 2 shows the percentage speech identification scores in quiet, +10 dB SNR and 0
dB SNR conditions for the different channel hearing aids (1 channel, 3 channel and 8 channel) and the
channel free hearing aid. It can be observed that participants performed better with the channel free
and 8 channel hearing aids than with the single and 3 channel hearing aids in all the conditions (quiet,
+10 dB SNR and 0 dB SNR). Participants also performed better
with the channel free hearing aid than with the 8 channel hearing aid in quiet and 0 dB SNR conditions.
Repeated measures ANOVA was performed to assess differences across
hearing aids in quiet and in the two SNR conditions. It revealed a
significant main effect of listening condition (quiet and two SNR conditions) (F (1.67, 112) = 143.05, p<0.01) but no
significant interaction (F (4.8, 96.6) = 0.283, p=0.98). To examine
differences across the different channel hearing aids and the channel free hearing aid, post hoc analysis
was performed; the mean differences across the different channel
hearing aids and the channel free hearing aid did not reach significance (p>0.05).
Figure 2: Speech identification scores in quiet and two noise conditions. Open squares
indicate single channel, open triangles indicate three channel and open diamonds indicate channel
free hearing aids.
Discussion
A. Performance in quiet condition
Aided performance with all the hearing aids was better than unaided performance.
Results of the present study revealed no significant difference across the hearing aids used in this
study. Although the differences did not reach significance, the mean scores did differ
across hearing aids. Furthermore, considerable variability in the scores was observed, which could have
contributed to the lack of a significant difference across hearing aids. Another possible reason is the age range
of the participants, which could have contributed to the variability in the scores.
The performance with multichannel hearing aids was almost similar to that observed with the
single channel and 3 channel hearing aids. A number of investigators have reported no significant
improvement in speech identification with an increase in the number of channels in multichannel
hearing aids (Hickson, 1994). Souza (2002) reported that multichannel hearing aids
with fast compression time constants distort some speech cues, offsetting the benefits of
improved audibility. In the present study, the multichannel hearing aids used syllabic compression, which has fast
attack and release time constants; this could have caused distortion and led to the large
variability in performance. O'Brien (2002) has provided an explanation for the poorer
performance observed across studies with multichannel hearing aids. According to her,
theoretically, when vowels, diphthongs and other phonemes are processed by a multichannel
instrument, their key formants may be handled by different channels,
receiving more or less amplification and compression than was originally present and intended.
This possible outcome distorts relationships among formants and potentially other key features
for vowel, phoneme and word recognition (see Figure 3). As observed in Figure 3, an annotation
described by Dillon (2001) shows that for the stimulus /ii/ the spectral difference is lost and the formant
frequencies are distorted.
In addition, the number of channels, the compression ratios and
their time constants (attack and release times) all interact. Taken to an extreme, a large number
of channels with high compression ratios can result in an amplified signal
stripped of many of the identifiable speech elements (Plomp, 1988). This effect is known as "spectral
smearing." Because of the distorted formant information, spectral smearing is most deleterious to
"place" of consonant articulation (e.g. difficulty discriminating between /b/, /d/ and /g/) and
increases susceptibility to noise (Boothroyd et al., 1996).
The mean scores with the channel free hearing aid were 10% higher than those with the
multichannel and single channel hearing aids. Similar to the present study, Dillon et al. (2003)
showed that the performance of subjects in quiet and in impulse noise, for male and female voices, was
better with a channel free hearing aid than with multichannel hearing aids. They also showed
that the internal noise and distortion in the channel free hearing aid are lower than those observed
with multichannel hearing aids, and that low distortion and low internal noise could have
contributed to the better performance with the channel free hearing aid. Continuously Adaptive
Speech Integrity (CASI), the technology used in the channel free hearing aid, offers unique frequency
shaping for optimal hearing-loss appropriate frequency response curves. Flexible
input-dependent filter characteristics are applied to the whole signal, allowing frequency-dependent
compression, without splitting the signal into channels and incurring the consequent spectral
smearing potentially present in many-channel instruments.
Figure 3: Annotated diagram of vowel spectra (vertical axes: sound level)
B. Performance in Noisy condition
Results revealed that mean performance dropped significantly in noise for all hearing
impaired subjects. No significant effect of number of channels was observed. The drop in performance
across hearing aids may be due to the poorer performance of hearing impaired subjects in
adverse conditions. From Figure 2 it can be noted that the channel free and 8 channel hearing aids
performed better in the two noise conditions. In addition, the channel free hearing aid provided better
performance than the 8 channel hearing aid in the 0 dB SNR condition. However, these differences
were not significant in the present study. This may be due to the large variability in the data
arising from the small number of subjects and the age range studied (30-68 years).
A number of investigators have reported that performance with 8 channel hearing aids is better
than with single to 6 channel hearing aids (Yund & Buckles, 1995). A greater number of channels
provides the possibility of a better fit to the individual hearing impairment. The greater the number
of channels and the narrower the channels, the greater the likelihood that important frequency
components of the signal will fall into channels which do not include higher-intensity
components of the noise or of the signal itself. It is important that a signal component has a
positive signal-to-noise ratio (S/N) within a channel, because only then can the signal component
determine the amplification in the channel, be amplified appropriately and become useful to
the subject. Whenever the S/N is negative in the channel, the noise controls the amplification and
the signal and noise components are amplified less than would have been appropriate for the
signal component alone. In multichannel compression hearing aids with few broad channels,
however, a signal component may be amplified too little (i.e., "masked electronically") due to the
presence of a noise component which would not have masked it perceptually had the signal and
noise components been amplified appropriately in two separate channels (Stone et al., 1999).
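The "electronic masking" argument above can be sketched numerically. The example compares the gain a weak signal component receives when it shares a broad channel with a more intense noise component against the gain it receives in a channel of its own. The component levels (50 dB signal, 70 dB noise) and the compressor settings are assumptions chosen for illustration.

```python
import math


def gain_db(level_db, linear_gain_db=30.0, threshold_db=40.0, ratio=2.0):
    """Gain of a static compressor for a given detector level (assumed settings)."""
    if level_db <= threshold_db:
        return linear_gain_db
    return linear_gain_db - (level_db - threshold_db) * (1.0 - 1.0 / ratio)


def combine_db(*levels_db):
    """Level (dB) of the power sum of several components in one channel."""
    return 10.0 * math.log10(sum(10.0 ** (l / 10.0) for l in levels_db))


signal_db, noise_db = 50.0, 70.0   # within-channel S/N of -20 dB (hypothetical)

# Broad channel: the compressor sees signal plus noise, so the noise sets the gain.
shared_gain = gain_db(combine_db(signal_db, noise_db))

# Narrow channels: the signal has a channel to itself and sets its own gain.
separate_gain = gain_db(signal_db)

print(f"gain for the signal in a shared channel  : {shared_gain:.2f} dB")
print(f"gain for the signal in its own channel   : {separate_gain:.2f} dB")
print(f"under-amplification in the shared channel: {separate_gain - shared_gain:.2f} dB")
```

With these assumed values the signal component receives about 10 dB less gain in the shared channel than it would alone, which is the loss that narrower channels are expected to avoid.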
Although a number of studies have shown that multichannel hearing aid performance is
better, another group of researchers has shown that performance varies among listeners with sensorineural hearing
loss (Yund, Simon & Efron, 1987). This is because of the speech distortions caused by the
type of compression and the time constants applied in multichannel hearing aids. That is, when
the input signal is split into channels and compression with fast time constants is applied, the
spectro-temporal characteristics become distorted and important speech transition information is
lost, which has been found to impair speech understanding (Boothroyd et al., 1996). In the present
study also, mean scores were higher but there was more variability (SD), indicating that not all
subjects improved with the 8 channel hearing aid. Lippmann (1978) reported a deterioration of
scores when the signal was compressed together with the noise, and Barfod (1978) obtained
equivalent scores in his study.
Performance with the channel free hearing aid was higher and less variable than with the
multichannel hearing aids. Similar results have been reported by Dillon (2002). This is because the
channel free hearing aid utilizes a recently developed technology, Continuously Adaptive Speech
Integrity (CASI). This strategy offers unique frequency shaping for optimal hearing loss
appropriate frequency response curves. Flexible input-dependent filter characteristics are applied
to the whole signal, allowing frequency-dependent compression, without splitting the signal into
channels and incurring the consequent spectral smearing potentially present in many-channel
instruments. CASI analyses incoming signals according to their intensity and dominant spectral
elements and calculates the corresponding gain characteristic to be applied. Spectral
characteristics of speech are maintained, resulting in more "natural" sounding amplification. Thus,
the reduced spectral smearing and the frequency-dependent compression could have improved the
performance of subjects with the channel free hearing aid.
One important observation made in the study was that the channel free hearing aid showed
better performance than the eight channel hearing aid in 0 dB SNR and in quiet. There
was no difference in performance between the eight channel and channel free hearing aids at +10 dB
SNR. Baer and Moore (1993) and Ter Krause (1993) have shown that spectral smearing has no effect
on speech identification scores in normal hearing subjects in quiet but has a significant effect in
adverse conditions. They further stated that the poor frequency resolution observed in subjects with cochlear hearing
loss affects identification scores in noise rather than in quiet. From the above it is understood
that in adverse conditions (such as 0 dB SNR) more spectral information is needed for
understanding speech than in conditions such as +10 dB SNR and quiet.
In multichannel hearing aids there is temporal distortion and spectral smearing. The small
improvement observed with the channel free hearing aid may be due to the reduction in the spectral smearing
and temporal distortion that would have affected the speech identification scores with
multichannel hearing aids.
To conclude, the performance of subjects with the channel free hearing aid was better in quiet
and in noise. Multichannel hearing aids performed better only
in noise, and there was no difference in performance between single and three channel hearing aids. Thus,
increasing the number of channels improves performance only in noise.
References
Bentler, R. A. & Duve, M. R. (2000). Comparison of hearing aids over the 20th century. Ear and
Hearing, 21, 625-639.
Boike, K. T. & Souza, P. E. (2000a). Effect of compression ratio on speech recognition and
speech-quality ratings with wide dynamic range compression amplification. Journal of Speech,
Language, and Hearing Research, 43, 456-468.
Boike, K. T. & Souza, P. E. (2000b). Effect of compression ratio on speech recognition in
temporally complex background noise. Presented at the International Hearing Aid
Conference, Lake Tahoe, CA.
Bustamante, D. K. & Braida, L. D. (1987). Multiband compression limiting for severely
impaired listeners. Journal of Rehabilitation Research and Development, 24, 149-160.
Hickson, L. M. H. (1994). Compression amplification in hearing aids. American Journal of
Audiology, 3, 51-65.
Hohmann, V. & Kollmeier, B. (1995). The effect of multichannel dynamic compression on speech
intelligibility. Journal of the Acoustical Society of America, 97, 1191-1195.
Kiessling, J. & Steffens, T. (1991). Clinical evaluation of a programmable three channel
automatic gain control amplification system. Audiology, 30(2), 70-81.
Moore, B. C. J. & Glasberg, B. R. (1986). A comparison of two-channel and single-channel
compression hearing aids. Audiology, 25, 210-226.
Moore, B. C. J. & Glasberg, B. R. (1988). A comparison of four methods of implementing automatic
gain control (AGC) in hearing aids. British Journal of Audiology, 22(2), 93-104.
Moore, B. C. J., Alcantara, J. I., Stone, M. A. & Glasberg, B. R. (1999). Use of a loudness model
for hearing aid fitting: II. Hearing aids with multi-channel compression. British Journal
of Audiology, 33, 157-170.
Plomp, R. (1988). The negative effect of amplitude compression in multichannel hearing aids in
the light of the modulation-transfer function. Journal of the Acoustical Society of America, 83,
2322-2327.
Souza, P. E. & Turner, C. W. (1999). Quantifying the contribution of audibility to recognition of
compression-amplified speech. Ear and Hearing, 20, 12-20.
Stelmachowicz, P. G., Kopun, J., Mace, A. & Lewis, D. (1995). The perception of amplified
speech by listeners with hearing loss: Acoustic correlates. Journal of the Acoustical Society of
America, 98, 1388-1399.
Stone, M. A., Moore, B. C. J., Wojtczak, M. & Gudgin, E. (1997). Effects of fast-acting high-frequency
compression on the intelligibility of speech in steady and fluctuating
background sounds. British Journal of Audiology, 31, 257-273.
Yund, E. W. & Buckles, K. M. (1995a). Enhanced speech perception at low signal-to-noise ratios
with multichannel compression hearing aids. Journal of the Acoustical Society of America, 97,
1224-1240.
Yund, E. W. & Buckles, K. M. (1995b). Multichannel compression hearing aids: Effect of
number of channels on speech discrimination in noise. Journal of the Acoustical Society of
America, 97, 1206-1223.
Brainstem and Cortical Responses to Speech Stimuli in Individuals
with Cochlear Hearing Loss
Anirban Chaudhury & Vanaja C S
Abstract
This study investigated the effect of cochlear pathology on brainstem and cortical responses
to speech burst and transition. The relationship between these potentials and speech
identification scores (SIS) was also investigated. Ten adult subjects with cochlear pathology and 12
age matched normal hearing subjects were included in the study. Burst and transition portions
were extracted separately from the stimuli /pa/, /ta/ and /ka/. Burst evoked brainstem responses were
analyzed for wave V, transition evoked brainstem responses were analyzed for peaks V, A, C, D, E
and F, and cortical evoked potentials were analyzed for P1, N1, P2 and N2. Speech identification
scores in quiet and in the presence of noise were obtained for a bisyllabic word list in Kannada.
Burst evoked responses showed a significant difference in the latency of wave V between
subjects with cochlear hearing loss and those with normal hearing, but no significant
difference in wave V amplitude. For the transition stimuli, the latencies of waves
V, A, C, D, E and F as well as the amplitude of wave V were significantly different between the
two groups. All the components (V, A to F) evoked by transition stimuli correlated significantly
with SIS in noise, but no correlation was observed for burst evoked brainstem responses.
There was no significant difference between groups for any of the components of the LLR (P1, N1, P2 and
N2), but N1-P2 amplitude was significantly different between the groups. These findings suggest
that cochlear hearing loss impairs the processing of the burst and transition portions of the speech
signal mainly at the brainstem level.
Introduction
Individuals with sensorineural hearing loss have difficulty in understanding speech
(Glasberg & Moore, 1989). Behavioral tests have been devised and used to assess these speech
processing difficulties, but such tests cannot be used in some difficult-to-test
populations. In such individuals, objective electrophysiological tests may be helpful in
predicting speech perception.
Conventionally brief acoustic signals such as clicks, tone bursts and tone pips have been
used to elicit the ABR. Recent investigations have shown that brainstem responses to speech
stimuli can also be reliably recorded and analyzed (Khaladkar, Karthik & Vanaja, 2005). As
brainstem responses can be best recorded using short duration signals, burst or transition portion
Professor of Audiology, School of Audiology and Speech Language Pathology, Bharathiya Vidya Peet University,
Katra-Dhanakawadi, Pune, India. e-mail: [email protected]
have been used to elicit brainstem responses in these studies. The speech-evoked ABR recorded
in human brainstem can be divided into transient and sustained portions, specifically the onset
response and the frequency-following response (FFR) (Kraus & Nicol, 2005). Onset responses
are transient, similar to click evoked ABR with peak durations lasting tenths of milliseconds.
The FFR arises from the harmonic portion of the stimulus and is characterized as a series of
transient neural events phase locked to periodic information within the stimulus (Batra, Kuwada
& Maher, 1986). Galbraith, Arbagey, Branski, Comerci and Rector (1995) demonstrated that the
FFR elicited by word stimuli reflects the stimulus accurately enough to allow it to be recognized
as intelligible speech when “played back” as an auditory stimulus. More recently, Galbraith,
Amaya, de Rivera, Donan, Duong and Hsu (2004) have suggested, based on the FFR pattern of
activation for forward and backward speech, that synaptic processing at the level of the brainstem is
more effective for forward speech stimuli, characterized by highly familiar prosodic and
phonemic structure, than for backward speech. Studies carried out on children with learning
disability have shown that responses to speech stimuli were deviant in these children even when
responses to nonspeech stimuli were normal (Khaladkar, 2005).
Khaladkar, Karthik and Vanaja (2005) obtained speech burst ABRs for 20 ears with mild
to moderate sensorineural hearing loss. Two stimuli were used to evoke the ABR: a standard
acoustic click and the burst portion of the syllable /t/. The results of their study indicated that while
click evoked ABRs exhibited latency values within normal limits, speech burst evoked ABRs
showed more deviant results. There was a significant correlation between speech identification
score and speech burst ABR, perhaps suggesting that using speech sounds to elicit the ABR
offers an opportunity to better isolate normal speech processing from abnormal speech
processing. Hedrick and Jesteadt (1996) reported that sensorineural hearing loss may disrupt
formant transition coding or any type of dynamic process in the periphery (i.e. rapidly changing
aspects of the speech signal are not coded). It can therefore be hypothesized that the ABR
evoked by transitions may provide useful information about processing of speech at the
brainstem level. There is also a need to study the cortical representation of burst and transition
of speech stimuli in subjects with normal hearing and those with hearing loss.
Speech evoked LLRs have frequently been used to study the neural representation of speech
sounds in populations with impaired speech understanding. The underlying assumption is that
speech perception is dependent on the neural detection of time-varying spectral and temporal
cues contained in the speech signal (Tremblay, Billings & Rohila, 2004). The P1-N1-P2
complex reflects the neural detection of time-varying acoustic cues. Because abnormal P1-N1-P2
response patterns have been reported in children and adults with varying types of speech
perception impairments, there is a current surge of interest in learning more about this
brain-behavior relationship (Rance, Cone-Wesson, Wunderlich & Dowell, 2002). There is a dearth of
studies correlating both brainstem and cortical responses with SIS in subjects with sensorineural (SN) hearing
loss. Also, research has not been carried out to study the cortical responses to only the burst or
transition portion of a syllable. Hence the present study aimed to investigate whether there is a
difference between subjects with normal hearing and those with cochlear pathology in the
following responses:
Brainstem responses to speech burst
Brainstem responses to transition of speech
Cortical responses to speech burst
Cortical responses to transition of speech
The study also investigated the relationship between the following in subjects with cochlear
pathology:
Brainstem responses to speech burst and speech identification scores
Brainstem responses to transition of speech and speech identification scores
Cortical responses to speech burst and speech identification scores
Cortical responses to transition of speech and speech identification scores
Method
Participants:
Participants of the present study were divided into two groups. The control group included
twelve ears of normal hearing individuals aged 15-50 years with hearing sensitivity within 15
dB HL. The clinical group included twenty two ears with cochlear hearing loss of subjects aged
15-50 years with hearing sensitivity within 55 dB HL. The hearing impairment was post-lingual.
Participants had no history of speech and language problems and all of them were native speakers
of Kannada.
Instrumentation:
A calibrated dual channel OB922 clinical audiometer (Version 2) with TDH 39
earphones housed in MX/41 AR ear cushions and a Radioear B-71 bone vibrator was used for
estimating pure tone thresholds and for speech audiometry. A calibrated GSI Tympstar middle ear
analyzer was used for tympanometry and acoustic reflex measurement to rule out middle ear
pathology. The IHS Smart EP, version 2.39 (Intelligent Hearing Systems, Florida, USA) with
Eartone 3A insert earphones was used to record and analyze auditory evoked potentials.
Materials:
The transition and burst portions extracted from the syllables /pa/, /ta/ and /ka/, naturally produced by an
adult female Kannada speaker, were used to elicit brainstem and cortical responses. The syllables
were spoken into a unidirectional microphone connected to the computer. To view and edit the
speech sounds, PRAAT (version 4.4.27) was used. The wave file was then converted to a stimulus
file for ALLR recording using 'Stim conv' provided by Intelligent Hearing Systems (version
2.39). All the stimuli were calibrated in dB nHL. Paired words in Kannada were used to
determine the Speech Reception Thresholds (SRT), and a recorded version of the word list from the
speech identification test in Kannada developed by Vandana (1998) was used to determine SIS.
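The extraction of burst and transition portions described above can be pictured as slicing a digitized syllable between two time markers. The following is only a minimal sketch of that idea, not the procedure followed in PRAAT: the sampling rate, the placeholder waveform and the segment boundaries are all hypothetical, since the article does not report them.

```python
def extract_segment(samples, fs_hz, start_ms, end_ms):
    """Return the samples lying between start_ms and end_ms (end exclusive)."""
    if not 0 <= start_ms < end_ms:
        raise ValueError("need 0 <= start_ms < end_ms")
    start = int(start_ms * fs_hz / 1000)
    end = int(end_ms * fs_hz / 1000)
    if end > len(samples):
        raise ValueError("segment runs past the end of the recording")
    return samples[start:end]


# Hypothetical example: 100 ms of placeholder samples at an assumed 44.1 kHz.
fs = 44100
syllable = [0.0] * 4410

# Assumed boundaries, for illustration only (not taken from the study).
burst = extract_segment(syllable, fs, 0, 10)
transition = extract_segment(syllable, fs, 10, 30)
```

In practice the boundaries would be set by inspecting the waveform and spectrogram of each syllable, so they would differ across /pa/, /ta/ and /ka/.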
Test procedure:
Pure tone thresholds were assessed using the modified Hughson Westlake method (Carhart &
Jerger, 1959) for air conduction stimuli from 250 Hz to 8000 Hz and for bone conduction stimuli
from 250 Hz to 4000 Hz. Speech Reception Threshold (SRT) was obtained using paired words in
Kannada. Speech identification scores (SIS) were obtained at 40 dB SL (ref: SRT) in both quiet
and noise (speech babble, 0 dB SNR) with the PB word list developed by Vandana (1998). All the
auditory evoked potentials were recorded using a conventional electrode montage with the
noninverting electrode on the vertex, the inverting electrode on the mastoid and the common electrode on the
forehead. Stimuli were presented at 40 dB SL (ref: SRT). The repetition rate of the stimuli was
11.1/sec for brainstem responses and 3.1/sec for cortical responses. The analysis window was 20
ms for brainstem responses to burst, 50 ms for brainstem responses to transition and 300 ms for
cortical responses. The analysis included a 50 ms pre-stimulus window while recording cortical
responses. Responses for 1500 stimuli were filtered using a band pass filter of 100 Hz to 3 kHz
and amplified 100 K times for brainstem responses. Cortical responses for 300 stimuli were
filtered using a band pass filter of 1-30 Hz and amplified 50 K times.
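The recording parameters above fix the interval between stimuli and the time needed for one averaged response. The sketch below works through that arithmetic. It also reports the improvement in signal-to-noise ratio expected from averaging, under the textbook assumption (not stated in the article) that the background noise is stationary and uncorrelated with the stimulus, so that averaging N sweeps improves amplitude SNR by a factor of sqrt(N), that is 10*log10(N) dB. The function name is illustrative.

```python
import math


def averaging_summary(n_sweeps, rate_per_s):
    """Inter-stimulus interval, total stimulation time and ideal SNR gain."""
    isi_ms = 1000.0 / rate_per_s              # time between stimulus onsets
    total_s = n_sweeps / rate_per_s           # time to collect all sweeps
    snr_gain_db = 10.0 * math.log10(n_sweeps)  # ideal gain from averaging
    return {"isi_ms": isi_ms, "total_s": total_s, "snr_gain_db": snr_gain_db}


# Values taken from the test procedure: 1500 sweeps at 11.1/sec (brainstem)
# and 300 sweeps at 3.1/sec (cortical).
brainstem = averaging_summary(1500, 11.1)
cortical = averaging_summary(300, 3.1)

for name, s in (("brainstem", brainstem), ("cortical", cortical)):
    print(f"{name}: ISI {s['isi_ms']:.1f} ms, "
          f"time {s['total_s']:.1f} s, ideal gain {s['snr_gain_db']:.1f} dB")
```

The totals exclude any sweeps rejected as artifacts, so actual recording times would be somewhat longer.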
Results
Wave V and its negative trough (wave A) of the ABR evoked by burst were marked. The
peak to trough amplitude of wave V was measured. Similarly, the Vth peak of the transition
evoked ABR was identified and its amplitude was measured. In addition, since the transition evoked ABR
has a steady state portion, the FFR was also analyzed as described by Kraus (2000). For the transition
evoked ABR, the latency of waves V, A, C, D, E and F and the amplitude of wave V were
considered. For LLR, peaks P1, N1, P2 and N2 were measured. The data obtained were tabulated
and statistical analyses were carried out using SPSS software (V15, SPSS Inc).
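The peak to trough measurement described above can be illustrated with a short sketch. This is not the marking procedure used in the study, where peaks were identified on the recorded waveforms; it simply takes the largest positive point inside an assumed search window as wave V and the lowest point that follows it as wave A. The waveform values and the window are hypothetical.

```python
def peak_to_trough(times_ms, amplitudes_uv, window_ms):
    """Locate the largest peak in the window and the deepest trough after it."""
    lo, hi = window_ms
    idx = [i for i, t in enumerate(times_ms) if lo <= t <= hi]
    if not idx:
        raise ValueError("no samples inside the search window")
    peak_i = max(idx, key=lambda i: amplitudes_uv[i])
    following = [i for i in idx if i >= peak_i]
    trough_i = min(following, key=lambda i: amplitudes_uv[i])
    return {
        "peak_latency_ms": times_ms[peak_i],
        "trough_latency_ms": times_ms[trough_i],
        "amplitude_uv": amplitudes_uv[peak_i] - amplitudes_uv[trough_i],
    }


# Hypothetical averaged waveform sampled every 0.5 ms (illustrative values).
times = [5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0]
amps = [0.02, 0.05, 0.10, 0.20, 0.30, 0.10, -0.14, -0.05, 0.00]

wave_v = peak_to_trough(times, amps, window_ms=(5.0, 9.0))
```

An automatic rule of this kind is sensitive to noise and to the choice of window, which is one reason peaks are usually confirmed visually and on replicated recordings.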
1. Latency and amplitude of wave V in individuals with normal hearing and individuals
with cochlear hearing loss for burst evoked ABR
Table 1 shows the mean latency and amplitude of wave V evoked by bursts of /pa/, /ta/
and /ka/ in individuals with normal hearing and those with cochlear hearing loss. It can be noted
from the table that the latency of wave V for /p/ and /k/ was similar but the /t/ latency was shorter in
both the groups. The amplitude for /p/ and /k/ was similar but the /t/ amplitude was smaller than that for /p/
and /k/ in both the groups. It can also be observed from the table that the latency was longer and the
amplitude was smaller in individuals with hearing impairment than in those with normal hearing, for all
the stimuli.
Multivariate Analysis of Variance was administered to assess the effect of groups and
three stimuli on latency and amplitude for wave V. Results revealed that there was a significant
effect of cochlear hearing loss on latency (p<0.05) but there was no significant difference in
amplitude of wave V between the groups (p>0.05). Also there was no interaction of stimulus
with group for latency and amplitude. Scheffe's post hoc test showed no significant effect of
stimulus on latency and amplitude of wave V in either group.
Table 1: Mean (SD) of latency (in msec) and amplitude (in µV) of wave V
Group               /pa/                       /ta/                       /ka/
                    Latency      Amplitude     Latency      Amplitude     Latency      Amplitude
Normal hearing      7.1 (0.5)    0.44 (0.04)   6.2 (0.4)    0.4 (0.2)     7.4 (0.7)    0.5 (0.3)
Hearing impairment  7.2 (0.02)   0.38 (0.8)    6.99 (0.7)   0.35 (0.1)    7.7 (1.6)    0.4 (0.2)
2. Latency and amplitude of peaks in individuals with normal hearing and individuals with
cochlear hearing loss for transition evoked ABR and FFR
Table 2 shows the mean latency of peaks V, A, C, D, E, F and amplitude of peak V,
elicited by transition portion of /pa/, /ta/ and /ka/ in individuals with normal hearing and hearing
impairment. For individuals with normal hearing the latency of wave V for /pa/ and /ka/ was
similar but /ta/ latency was longer than the other two stimuli. The amplitude of the /k/ was higher
than /t/ and /p/. However, the standard deviation for amplitude of /k/ was larger indicating
greater variability. For individuals with hearing impairment the trend obtained for different
stimuli was similar to that observed for participants with normal hearing. The latency of wave V
for /pa/ and /ka/ was similar but /ta/ was longer than the other two stimuli. However, the
amplitude of the /pa/ and /ka/ was similar but /ta/ amplitude was lesser than /pa/ and /ka/ in this
group.
Table 2: Mean (SD) latency (in msec) of waves V, A, C, D, E, F and amplitude (in µV) of wave V in individuals with normal
hearing and hearing impairment
Group             Stimulus  V latency    V amplitude  A latency    C latency    D latency    E latency    F latency
Normal hearing    pa        9.9 (2.3)    0.3 (0.04)   11.1 (2.3)   13.8 (2.7)   17.2 (3.09)  21.4 (3.1)   25.6 (3.2)
                  ta        12.5 (1.6)   0.3 (0.07)   13.66 (1.8)  18.3 (2.7)   21.9 (2.9)   25.5 (3)     29.8 (3.3)
                  ka        9.8 (2.4)    0.39 (0.2)   11.5 (2.4)   14.8 (1.7)   18.3 (1.5)   22.4 (1.6)   26.8 (1.6)
Hearing impaired  pa        15.4 (2.5)   0.2 (0.07)   17.2 (2.9)   22.1 (2.66)  25.7 (2.6)   30.9 (3.4)   35.1 (3.6)
                  ta        19.6 (1.3)   0.22 (0.2)   21.7 (1.6)   27.3 (2.0)   31.4 (2.0)   36 (1.3)     40.4 (1.6)
                  ka        16.2 (1.7)   0.28 (0.2)   18.1 (1.9)   22 (1.4)     26.6 (2.0)   30.9 (1.9)   35.2 (1.8)
Multivariate analysis of variance was administered to assess differences
between groups across the three stimuli in latency and amplitude. There was a main effect of group
(cochlear hearing loss) on latency of all the peaks and amplitude of wave V (p<0.01) and there
was no interaction between stimulus and group. Scheffe's post hoc analysis revealed
that the amplitude and latency of /ta/ differed significantly from that of /pa/ and /ka/ (p<0.01) for
wave V of ABR and other waves of FFR but there was no significant difference between /pa/ and
/ka/ (p>0.05).
3. Long latency responses evoked by speech bursts
Table 3 shows the mean latencies of the components (P1, N1, P2, N2) of LLR and the
amplitude of the N1-P2 complex in individuals with normal hearing and individuals with cochlear
hearing loss across the three speech burst stimuli. Multivariate analysis of variance was carried out
to check if there was a main effect of cochlear hearing loss on latencies of components of LLR and
N1-P2 amplitude. Results revealed that there was no significant main effect of group (cochlear
hearing loss) on the latency of any of the peaks (p>0.05), but N1-P2 amplitude differed
significantly (p<0.01), and no interaction was observed between stimulus and group.
Table 3: Mean (SD) of latency (in ms) for components of LLR and N1-P2 amplitude (in µV) recorded with
burst in individuals with normal hearing and hearing impairment
Group               Stimulus  P1             N1              P2              N2              N1-P2 amplitude
Normal hearing      pa        86.7 (13.8)    127.2 (12.3)    188.2 (14.7)    197.6 (24.1)    1.3 (0.3)
                    ta        82.3 (8.3)     128.4 (16.7)    181.8 (14.09)   229.2 (11.8)    1.5 (0.4)
                    ka        86.9 (18.6)    130.7 (12.4)    187.1 (25.4)    225.4 (20.5)    1.3 (0.5)
Hearing impairment  pa        87.3 (34.2)    127.12 (37.93)  184.65 (42.06)  235.42 (43.52)  0.98 (0.17)
                    ta        83.3 (22.7)    121.32 (17.99)  173.25 (16.19)  230.02 (11.13)  0.83 (0.16)
                    ka        86.07 (21.90)  123.2 (25.80)   177.47 (17.91)  233.42 (13.06)  1.04 (0.30)
4. Long latency responses evoked by formant transition
Table 4 shows latencies of the components (P1, N1, P2, N2) of LLR and the amplitude of the
N1-P2 complex in individuals with normal hearing and individuals with cochlear hearing loss
across the three speech formant transitions. Multivariate analysis of variance was carried out to
check if there was a main effect of cochlear hearing loss on latencies of P1, N1, P2, N2 and
amplitude of N1-P2.
Table 4: Mean (SD) of latency (in ms) and amplitude (in µV) of LLR peaks elicited by transition
Group             Stimulus  P1 latency     N1 latency      P2 latency      N2 latency      N1-P2 amplitude
Normal hearing    pa        92.75 (15.24)  139.31 (2)      194.46 (24.83)  251.31 (21.29)  1.07 (0.34)
                  ta        95.6 (24.82)   142.91 (29.02)  212.12 (48.43)  240.36 (36.17)  1.33 (1.33)
                  ka        97.1 (9.88)    143.19 (13.07)  211.94 (31.01)  242.94 (32.91)  1.47 (0.4)
Hearing impaired  pa        90.82 (25.92)  123.52 (24.82)  176.10 (31.82)  230.85 (44.91)  0.94 (0.17)
                  ta        86.62 (25.53)  119.87 (23.95)  178.57 (35.64)  227.92 (41.21)  1.01 (0.16)
                  ka        96.3 (19.11)   137.07 (19.66)  202.12 (10.06)  257.4 (15.99)   1.04 (0.22)
Results revealed that there was no significant main effect of cochlear hearing loss (p>0.05)
and no interaction was observed between group and stimulus. Scheffe's post hoc analysis
revealed no significant effect of stimulus on latency or amplitude of LLR.
5. Speech identification scores in individuals with normal hearing and individuals with
cochlear hearing loss
Table 5 shows the speech identification scores in quiet and in the presence of noise
for the participants with normal hearing and those with hearing impairment. An independent samples
t test revealed that there was a significant difference between the scores of participants with
normal hearing and those with hearing impairment in both the quiet (t=13.0, p<0.01) and the noise condition
(t=19.9, p<0.01).
Table 5: Mean (SD) of speech identification scores in quiet and in the presence of noise
Group             In quiet     In noise
Normal hearing    100          97 (3.9)
Hearing impaired  76.2 (6.4)   15.6 (13.4)
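For readers who wish to see how an independent samples t statistic follows from summary values like those in Table 5, a minimal sketch of the pooled-variance form is given below. It is not the SPSS analysis reported here. The group sizes of 12 and 22 ears are taken from the Participants section and are assumed to apply to the speech scores; because the tabulated means and SDs are rounded and the exact group sizes behind the test are not stated, the result should not be expected to reproduce the reported t values exactly. Only the in-noise condition is computed, since no SD is tabulated for the normal hearing group in quiet.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent samples t statistic with pooled variance, plus its df."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / se, df


# In-noise SIS from Table 5; n = 12 and n = 22 are assumptions (see above).
t_noise, df_noise = pooled_t(97.0, 3.9, 12, 15.6, 13.4, 22)
print(f"t({df_noise}) = {t_noise:.2f}")
```

The two groups have clearly unequal SDs in noise, so a test that does not assume equal variances (such as Welch's) would be a reasonable alternative.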
6. Relationship between speech identification scores and brainstem and cortical responses
Pearson product moment correlation analysis was carried out to check the relationship of
latency and amplitude of brainstem potentials for the three stimuli with speech identification
scores (SIS) in quiet and in the presence of noise. Results revealed that SIS in noise correlated
significantly with formant transition evoked FFR and wave V for all the three stimuli (refer
Table 6 for r values) but SIS score in quiet did not show a significant correlation. Speech burst
evoked ABR and LLR as well as transition evoked LLR did not show a significant correlation
with SIS scores in quiet or in the presence of noise.
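The Pearson product moment correlation used above can be written out in a few lines. The sketch below is a plain implementation of the standard formula; the latency and score values are hypothetical and are included only to show the direction of the relationship (longer latency going with lower scores gives a negative r), not to reproduce the values in Table 6.

```python
import math


def pearson_r(x, y):
    """Pearson product moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical wave V latencies (ms) and SIS in noise (%), for illustration.
latency_ms = [14.0, 15.0, 16.0, 17.0, 18.0]
sis_noise = [30, 24, 20, 10, 6]

r = pearson_r(latency_ms, sis_noise)
```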
To summarize the results of the present study revealed that brainstem and cortical
responses to bursts and transition of speech stimuli can be recorded from participants with
normal hearing as well as those with hearing loss.
Table 6: Correlation (r) of SIS scores with brainstem responses evoked by transition of /pa/, /ta/ and /ka/
Latency &      pa                     ta                     ka
Amplitude      In quiet   In noise    In quiet   In noise    In quiet   In noise
V latency      -0.217     -0.740**    -0.281     -0.896**    -0.235     -0.813**
V amplitude    0.286      0.640**     0.129      0.491       0.190      -0.251
A latency      -0.293     -0.728**    -0.394     -0.909**    -0.287     -0.778**
C latency      -0.325     -0.726**    -0.341     -0.862**    -0.397     -0.829**
D latency      -0.288     -0.630**    -0.354     -0.867**    -0.413     -0.829**
E latency      -0.303     -0.644**    -0.365     -0.862**    -0.416     -0.837**
F latency      -0.368     -0.641**    -0.345     -0.845**    -0.387     -0.820**
*p<0.05 and **p<0.01
There was a significant effect of hearing loss on brainstem responses to speech but,
apart from the N1-P2 amplitude for bursts, cortical responses to speech were not affected by
hearing loss. Speech identification scores
obtained in the presence of noise showed a significant correlation with wave V and FFR evoked
by transition of speech.
Discussion
In the present study brainstem and cortical responses could be recorded for all
the stimuli from all the participants with normal hearing as well as those with mild to moderate
hearing loss. The latencies of peak V for different stimuli obtained in the present study are
comparable with those reported by Reddy, Kumar and Vanaja (2004) except for /ka/, which had a
longer latency in the present study. This could be due to the difference in the stimuli used in the
two studies. The signals used in the present study were longer in duration than those used in
the earlier study and the difference was largest for /ka/. ABR is an onset response and the latency
and amplitude of the response depend on stimulus onset/rise time, spectrum of the stimulus and
the duration of the signal (Gorga et al., 1984). Differences in latencies can therefore be attributed to the
differences in spectrum, rise time and duration of the stimuli used in
the two studies.
The prolongation of latencies in subjects with hearing impairment may be due to the
overall reduction in audibility. Previous studies on click evoked ABR have also reported that the
latency of all the peaks increases with increase in hearing threshold (Oates & Stapells, 1992).
Though not statistically significant, the mean amplitude was smaller in subjects with hearing
impairment than in those with normal hearing. This is probably due to a reduction in the
number of nerve fibers responding to the stimuli. It has been reported in the literature that the
amplitude of ABR depends on the number of nerve fibers firing (Hecox, Squires & Galambos,
1976). Thus the results of the present study suggest that processing of the burst is
affected in subjects with hearing impairment. However, speech identification scores in quiet or in
the presence of noise did not show a significant correlation with latency or amplitude of ABR
elicited by burst. These results contradict the report of Khaladkar, Karthik and Vanaja (2005),
who observed a significant correlation between SIS and speech burst ABR in
subjects with sensorineural hearing loss.
The latency of the onset response (Wave V and A) for the transition portion of the signal
in the present study was longer than that reported by King, Warrier, Hayes and Kraus (2002) but
the latency for the other peaks (C, D, E and F) was shorter. It has been reported that the wave V
and A signal the onset of sound at the brainstem whereas wave C is the response to the onset of
the vowel (Kraus & Nicol, 2005). The other peaks, D, E and F are responses to sustained portion
of the signal. So probably the difference in latency reflects the difference in the stimulus used in
the two studies. King, Warrier, Hayes and Kraus (2002) used synthesized transition of /da/ with
40 msec duration. On the other hand in the present study a natural stimulus was taken and the
transition part was extracted. The duration of transition in the present study was around 25 msec
for /pa/, 49 msec for /ta/ and 41 msec for /ka/. The fundamental frequency ranged from 103 to
121 Hz in their study and it was around 230 Hz in the present study.
The latency of the FFR portion in hearing impaired subjects was prolonged compared to
normal hearing subjects and the amplitude was significantly reduced in these subjects. This
suggests that the encoding of the sustained portion was affected in the participants with hearing
impairment. The inter-peak latency difference between D and E as well as E and F were around 4
msec in subjects with normal hearing whereas it was around 5 msec in subjects with hearing loss.
This indicates that processing of the fundamental frequencies was affected in subjects with
hearing impairment. It has been reported in literature that the F0 and F1 coding are affected in
persons with hearing impairment at the brainstem level and this is reflected in the abnormalities
in the waveform of ABR (Kraus & Nicol, 2005).
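The link between inter-peak intervals and the fundamental frequency can be made explicit with a little arithmetic. The sketch below assumes, for illustration only, that successive FFR peaks (D, E, F) are spaced one fundamental period apart, so that an interval of T msec corresponds to 1000/T Hz. The latencies are the group means from Table 2; the helper names are hypothetical.

```python
# Mean latencies (ms) of peaks D, E and F, taken from Table 2.
NORMAL = {"pa": (17.2, 21.4, 25.6), "ta": (21.9, 25.5, 29.8), "ka": (18.3, 22.4, 26.8)}
IMPAIRED = {"pa": (25.7, 30.9, 35.1), "ta": (31.4, 36.0, 40.4), "ka": (26.6, 30.9, 35.2)}


def interpeak_intervals(d_ms, e_ms, f_ms):
    """Return the D-E and E-F intervals in ms."""
    return (e_ms - d_ms, f_ms - e_ms)


def interval_to_hz(interval_ms):
    """Frequency whose period equals the given interval."""
    return 1000.0 / interval_ms


for label, group in (("normal hearing", NORMAL), ("hearing impaired", IMPAIRED)):
    for stim, peaks in group.items():
        de, ef = interpeak_intervals(*peaks)
        print(f"{label} /{stim}/: D-E {de:.1f} ms, E-F {ef:.1f} ms")

# Reference points: 4 ms and 5 ms intervals, and the period of a 230 Hz F0.
print(interval_to_hz(4.0), interval_to_hz(5.0), 1000.0 / 230.0)
```

Under this assumption a 4 msec interval corresponds to 250 Hz and a 5 msec interval to 200 Hz, while the stimulus F0 of around 230 Hz has a period of about 4.3 msec. Intervals computed from rounded group means are only approximate and may differ from values measured on individual waveforms.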
The auditory system encodes the F0 from the fine structure. It can also encode the F0 from the
envelope, but this encoding is weaker than that extracted from
the fine structure (Zeng et al., 2004). In addition, psychoacoustical studies have shown that
subjects with cochlear hearing impairment are impaired in coding the temporal fine structure of the
speech signal, which contains the F0 and harmonics (Lorenzi, Gilbert, Carn, Garnier & Moore,
2006). This indicates a greatly reduced ability to use temporal fine structure in speech in
individuals with moderate hearing loss. This loss of ability to use temporal fine structure
information is perhaps related to a loss of neural synchrony (Woolf, Ryan & Bone, 1981), which
would have contributed to the reduced amplitude and prolonged latencies in subjects with cochlear
hearing loss.
Recent studies have shown that speech in quiet can be completely understood with
only envelope cues (amplitude variation of the speech signal) (Nambi, Mahajan, Narne &
Vanaja, 2007). But understanding of speech in noise depends on the encoding of the fine
structure of the speech signal as well as the envelope. It has been reported that coding of the envelope of
the speech signal is normal in subjects with cochlear hearing loss but processing of temporal fine
structure is impaired. The results of the correlation analysis also revealed that SIS in noise
correlated well with components of FFR. This supports the hypothesis that processing of
temporal fine structure is affected in subjects with cochlear hearing loss.
There is a dearth of studies investigating LLR with burst and transition in subjects with
hearing loss. However, the results obtained in this study are comparable with those reported in the
literature for other stimuli. There was no significant difference in latency of LLR between the
participants with normal hearing and those with hearing impairment. This may be because the
hearing loss did not exceed a moderate degree. Mild to moderate hearing
impairment does not significantly influence the latency of LLR (Albera et al., 1991). It has been
reported that at suprathreshold levels the latency of LLR is not significantly affected by intensity
of the stimulus (Picton & Smith, 1978). Variability of the LLR latency in normal subjects is also
high. This may have been one of the reasons for obtaining no significant difference in the latency
of LLR in the two groups. The N1-P2 amplitude was significantly smaller in subjects with hearing
loss when compared to that of normal hearing subjects. This suggests that probably fewer
cortical cells were responding in subjects with hearing loss. It has been reported that the
amplitude of LLR depends on the number of cells responding to the stimulus and that long
deprivation of auditory stimuli may lead to loss of cells at the cortical level (Polley, Chen-Bee &
Frostig, 1999). However, the duration of hearing impairment in a majority of subjects in the
present study was not more than 9 months. Probably there would have been a significant effect
on LLR if the duration of hearing impairment had been longer. The absence of a significant correlation between SIS
and LLR measures suggests that the poor speech perception in the subjects was probably mainly
due to abnormal encoding of speech at the brainstem level.
Conclusion
In this study there was a significant difference in burst evoked wave V latency between the
cochlear hearing loss group and the normal hearing group but no significant difference was found in
terms of wave V amplitude. For the transition stimuli, latencies of waves V, A, C, D, E & F and
amplitude of wave V were significantly different between the two groups. All the components
(V, A to F) evoked by transition stimuli correlated significantly with SIS in noise, but no
correlation was observed for burst evoked brainstem responses. There was no significant
difference between groups in the latency of any LLR component (P1, N1, P2 & N2) but N1-P2
amplitude was significantly different between groups. LLR measures showed no correlation with SIS in quiet
or in noise. It can be concluded from the results of the present study that cochlear hearing
loss impairs the processing of the burst and transition portions of the speech signal.
References
Albera, R., Roberto, C., Magnano, M., Lacilla, M., Morra, B. & Cortesina, G. (1991).
Identification of the waveform of cortical auditory evoked potentials. Acta
Otorhinolaryngologica Italica, 11(6), 543-549.
Batra, R., Kuwada, S. & Maher, V. L. (1986). The frequency following response to continuous
tones in humans. Hearing Research, 21, 167–177.
Galbraith, G. C., Amaya, E. M., de Rivera, J. M., Donan, N. M., Duong, M. T. & Hsu, J. N.
(2004). Brainstem evoked response to forward and reversed speech in humans.
Neuroreport, 15, 2057–2060.
Galbraith, G. C., Arbagey, P. W., Branski, R., Comerci, N. & Rector, P. M. (1995). Intelligible
speech encoded in the human brain stem frequency-following response. Neuroreport, 6,
2363–2367.
Glasberg, B. R. & Moore, B. C. (1989). Psychoacoustic abilities of subjects with unilateral and
bilateral cochlear hearing impairments and their relationship to the ability to understand
speech. Scandinavian Audiology (Supplementum), 32, 1-25.
Gorga, M. P., Beauchaine, K. A., Reiland, J. K., Worthington, D. W. & Javel, E. (1984). Effects
of stimulus duration on ABR and behavioral threshold. Journal of the Acoustical Society
of America, 76, 616-619.
Hecox, K., Squires, N. & Galambos, R. (1976). Brainstem evoked responses in man: I. Effect of
stimulus rise-fall time and duration. Journal of the Acoustical Society of America, 60,
1187-1192.
Khaladkar, A. A. (2005). Speech elicited ABR: An exploratory study in normals and in children
with learning disability. Unpublished Master's dissertation submitted to University of
Mysore, Mysore.
Khaladkar, A. A., Kartik, N. & Vanaja, C. S. (2005). Speech Burst and Click Evoked ABR.
Scientific paper presented at the 37th National conference of the Indian Speech &
Hearing Association (Indore).
King, C., Warrier, C. M., Hayes, E. & Kraus, N. (2002). Deficits in auditory brainstem pathway
encoding of speech sounds in children with learning problems. Neuroscience Letters,
319, 111–115.
Kraus, N. & Nicol, T. G. (2005). Brainstem origins for cortical 'what' and 'where' pathways in
the auditory system. Trends in Neurosciences, 28, 176–181.
Lorenzi, C., Gilbert, G., Carn, H., Garnier, S. & Moore, B. C. J. (2006). Speech perception
problems of the hearing impaired reflect inability to use temporal fine structure.
Proceedings of the National Academy of Sciences, 103(49), 18866-18869.
Nambi, P. A., Mahajan, Y., Narne, V. & Vanaja, C. S. (2007). Importance of amplitude and
frequency modulation cues for speech recognition. Paper presented at the 39th annual
conference of the Indian Speech and Hearing Association, Calicut.
Oates, P. & Stapells, D. R. (1992). Interaction of click intensity and cochlear hearing loss on
auditory brain stem response wave V latency. Ear and Hearing, 13(1), 28-34.
Picton, T. W. & Smith, A. D. (1978). The practice of evoked potential audiometry.
Otolaryngologic Clinics of North America, 11(2), 263-282.
Polley, D. B., Chen-Bee, C. H. & Frostig, R. D. (1999). Two directions of plasticity in the sensory-
deprived adult cortex. Neuron, 24(3), 623-637.
Rance, G., Cone-Wesson, B., Wunderlich, J. & Dowell, R. (2002). Speech perception and
cortical event related potentials in children with auditory neuropathy. Ear & Hearing, 23,
239-253.
Reddy, S. M., Kumar, U. A. & Vanaja, C. S. (2004). Characteristics of ABR evoked by speech
bursts. Scientific paper presented at the 36th National conference of the Indian Speech &
Hearing Association, Mysore.
Tremblay, K. L., Billings, C. & Rohila, N. (2004). Speech evoked cortical potentials: effects of
age and stimulus presentation rate. Journal of American Academy of Audiology, 15(3),
226-237.
Woolf, N. K., Ryan, A. F. & Bone, R. C. (1981). Neural phase-locking properties in the absence
of cochlear outer hair cells. Hearing Research, 4(3-4), 335-346.
Zeng, F. G., Nie, K., Liu, S., Stickney, G. S., Del Rio, E., Kong, Y. Y. & Chen, H. B. (2004). On the
dichotomy in auditory perception between temporal envelope and fine structure cues.
Journal of the Acoustical Society of America, 116(3), 1351-1354.
Dissertation Vol.V, Part-A, AIISH, Mysore
22
Binaural Amplification – Does Technology Matter?
Anusha & Manjula P
Abstract
Hearing impairment is a reduction in hearing sensitivity that causes deterioration in
speech abilities (Stach, 1997). According to Markides (1977), hearing aids, and binaural
amplification in particular, help restore the perception of speech and of environmental
sounds, thereby improving communication skills. The present study compared the performance
of individuals with hearing loss, in terms of improved audibility, understanding and quality
of speech, using binaural analog hearing aids (AA), binaural digital hearing aids (DD), and
binaural amplification with analog and digital hearing aids in opposite ears (AD). Results
indicated no significant difference between the DD and the AD conditions, though the mean
performance in the DD condition was higher than that in the AD condition for both the speech
recognition scores (SRS) and the quality rating of speech. The present study implies that,
given the high cost of digital hearing aids, individuals with hearing impairment can be
advised to use an analog hearing aid in one ear and a digital hearing aid in the opposite ear
until they can afford binaural digital hearing aids.
Key words: Analog hearing aids, digital hearing aids, speech recognition scores, quality rating
of speech, processing delay
Introduction
The problems caused by cochlear or sensory hearing impairment may be ameliorated
with the use of hearing aids. When compared to unaided hearing, hearing with amplification can
increase the amount of relevant information reaching the listener's speech recognition
system. As a primary management tool, amplification aims to raise the input signal level
sufficiently to activate the residual hearing while keeping the intensity within comfort range. It
also aims to shape the amplified signal to provide appropriate gain at each frequency in
accordance with the pattern of the deficit. Further, it should provide the best quality of sound for
different acoustic environments.
Amplification cannot restore the lost capacity, but it can help maximize the usefulness of
residual hearing (Sanders, 1977). There are different types of hearing aids, analog and digital,
depending on the technology they use to process the signal. The so-called digital hearing aids are
an outgrowth of the computer revolution that has transformed our society (Ross, 1997). The
digital signal processing used in a digital hearing aid provides a better speech performance,
Professor in Audiology, All India Institute of Speech and Hearing, Mysore, India
e-mail: [email protected]
hearing levels, noise reduction mechanism, feedback suppression, etc., compared to the analog
hearing aids (Markides, 1977).
The analog amplification schemes had already reached a high level of sophistication
before the digital technology was introduced. Nonetheless, the major benefits from the digital
hearing aids are not to be underestimated. Also, as the technology improved, the cost of the
hearing aid using digital technology also increased. Effective use of hearing aids depends on the
optimal fitting. It should be ensured that the individuals with hearing impairment are given
adequate opportunity to become sophisticated users of amplification (Sanders, 1982).
Now that the technology has improved so much, it is considered worthwhile to fit
amplification in both ears, namely binaural amplification (Sanders, 1993). Unless there are
specific contra-indications for fitting both the ears, every candidate for a hearing aid with bilateral
hearing loss should be considered a candidate for binaural amplification. It should be considered
a disservice to individuals with hearing impairment with two usable ears to make only a
monaural recommendation (MacKeith & Coles, 1971). Binaural performance is researched to a
greater extent due to the advantage of binaural amplification (Nabelek & Robinson, 1982). It is
essential that we educate the individuals with hearing impairment about the new technological
advancements so that they can decide about purchasing new hearing aids that might improve
listening (Sanders, 1993).
Even though digital hearing aids have been commercially available since 1995 (Hirsh,
1995), their high cost reduces the number of seekers, especially when amplification is to be
provided for both the ears. Some individuals are ready to use a digital hearing aid in one ear and an
analog hearing aid in the other until they can afford binaural digital hearing aids. Condie,
Scollie and Checkly (1984) studied the speech performance of children and concluded that the
performance with binaural digital hearing aids was better than that with binaural analog hearing
aids. Also, a significant improvement in speech perception in quiet and in noise was noticed.
The present study focuses on the performance of individuals with hearing impairment while using one
digital and one analog hearing aid in opposite ears.
The aim of the present study was to compare the performance of the individuals with
hearing loss, in terms of improved audibility, understanding and quality of speech using:
1. binaural analog hearing aids
2. binaural digital hearing aids
3. binaural amplification with analog and digital hearing aids in opposite ears
Method
Participants
The data from 15 participants were collected. The participants gave informed consent and
fulfilled the following selection criteria:
Age range from 16 to 60 years (mean = 49.06 years)
First-time hearing aid users
Native and fluent speakers of Kannada
Bilateral moderate to moderately severe (PTA = 41 to 70 dBHL) symmetrical (<20dB difference
between the PTA of right and left ears) sensorineural hearing loss
Equipment
The following instruments were used:
1. A calibrated sound field audiometer for unaided and aided testing.
2. A CD player connected to the audiometer for playing the speech material.
3. Two digital Behind-The-Ear (BTE) hearing aids and two analog hearing aids with a fitting range
to suit the hearing loss of the participants. Appropriate sized ear tips to fit the ears of the
participants.
4. Hipro and a personal computer with software to program the digital hearing aids
5. Calibrated hearing aid test system to measure the Real Ear Insertion Gain (REIG).
6. Calibrated hearing aid test system for measuring group delay of the hearing aids.
Speech Material Used
1. Four lists of Phonemically Balanced Kannada words developed by Yathiraj and
Vijayalakshmi (2006) were used for measuring Speech Recognition Scores. Each list
consists of 25 words. All the four lists were recorded using an adult female voice with
normal vocal effort.
2. A standard passage in Kannada was used for rating the quality of speech through the
hearing aid combinations. The passage contained all the speech sounds of the language.
The sample (one minute and twenty seconds) was recorded by an adult male voice with
normal vocal effort.
Test Environment
The test was conducted in a sound treated environment with ambient noise levels within
permissible limits.
Procedure
The testing procedure consisted of five phases:
1. Pre-selection and programming of the hearing aids
2. Measurement of the insertion gain for hearing aid selection
3. Establishing the Speech Recognition Scores (SRS) using various combinations of the
selected hearing aids
4. Quality rating of each combination of the selected hearing aids
5. Group and phase delay measurement
Phase 1: Pre-selection and programming of hearing aids
Two commercially available digital and two analog BTE hearing aids were selected with
a fitting range to suit the hearing loss of the participants. The digital hearing aid was connected
through a Hipro to the Personal Computer (PC) with software for programming. After the
hearing thresholds were fed into the software (NOAH-3.0 and Connex 5), the digital hearing aids
were programmed based on the NAL-NL1 prescriptive procedure. An acclimatization level of 2
was used while programming and the volume control was disabled for the digital hearing aids.
Phase 2: Insertion gain measurements
The following steps were used to select the hearing aid for each participant before the
measurement of the aided benefit.
1. Otoscopic examination was done before the commencement of the testing to rule out any
contraindication for insertion gain measurement.
2. The participant was seated in front of the loudspeaker of the hearing aid test system for
the measurement of the insertion gain. For this purpose the loudspeaker of the test
system was located on the speaker stand at 45 degrees azimuth and at a distance of one
foot from the participant's test ear.
3. The reference and the probe tube microphones were placed near the test ear and the
instrument was leveled.
4. The audiometric thresholds of the participant were fed into the hearing aid test system
and the target gain curve was derived based on NAL-2 fitting formula.
5. The stimulus was routed through the loudspeaker. The stimulus was American National
Standards Institute (ANSI) digi-speech at 50, 65 and 90 dB SPL. The sound pressure
level in the ear canal of the test ear was measured by means of a pre-measured length of
the probe tube microphone inserted in the test ear. The reference microphone was located
on the band above the test ear. The reference microphone also measured the signal level
at different frequencies. The difference between the levels measured by the probe tube
microphone and the reference microphone was displayed on the monitor of the hearing
aid test system. Thus the Real Ear Unaided Response (REUR) was measured and stored.
6. Then the programmed digital hearing aid was switched 'on' and fitted to the test ear of
the participant. Care was taken to ensure that the length of the probe tube in the ear
canal was not changed. The Real Ear Aided Response (REAR) was measured for the
same stimulus, i.e., ANSI digi-speech at 50, 65 and 90 dB SPL.
7. After the measurement of the REUR and REAR, the Real Ear Insertion Gain (REIG =
REAR - REUR) at different frequencies was computed by the instrument. The REIG at
different frequencies was displayed as the Real Ear Insertion Response curve (REIR). Thus,
the REIR was obtained for the hearing aid for both the ears of each participant.
8. This was repeated for three hearing aids in order to select the best hearing aid based on
the REIR that matched the target gain curve for testing the performance.
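The insertion gain computation in the steps above can be sketched in a few lines of Python. This is only an illustration: the function names, the frequencies and all dB values are hypothetical, and the use of root-mean-square deviation as the measure of how well an REIR matches the target gain curve is an assumption, since the text does not state the matching criterion that was applied.

```python
import math

def insertion_gain(rear_db, reur_db):
    """REIG at each frequency = aided response (REAR) minus unaided response (REUR), in dB."""
    return {freq: rear_db[freq] - reur_db[freq] for freq in rear_db}

def rms_deviation(reig_db, target_db):
    """Root-mean-square difference (dB) between a measured REIR and the target gain curve.
    Assumed matching criterion, not taken from the study."""
    diffs = [reig_db[freq] - target_db[freq] for freq in target_db]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

def best_match(reirs_by_aid, target_db):
    """Return the name of the hearing aid whose REIR lies closest to the target."""
    return min(reirs_by_aid, key=lambda aid: rms_deviation(reirs_by_aid[aid], target_db))

# Hypothetical values for illustration only (frequency in Hz -> level in dB).
reur = {500: 60, 1000: 62, 2000: 70, 4000: 65}
rear_aid_1 = {500: 70, 1000: 80, 2000: 95, 4000: 85}
rear_aid_2 = {500: 75, 1000: 90, 2000: 90, 4000: 70}
target = {500: 12, 1000: 20, 2000: 24, 4000: 20}

reirs = {
    "aid_1": insertion_gain(rear_aid_1, reur),
    "aid_2": insertion_gain(rear_aid_2, reur),
}
print(reirs["aid_1"])
print(best_match(reirs, target))
```

With these made-up numbers the first aid lies closer to the target at every frequency, so it would be the one carried forward to the performance testing.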
Phase 3: Establishing of Speech Recognition Scores (SRS)
The participants were seated comfortably in the sound treated audiological test room with the
appropriate loud speakers located at one meter from the participant at 45 degrees Azimuth. The
Speech Recognition Scores (SRS) for the Kannada Phonemically Balanced word list (Yathiraj &
Vijayalakshmi, 2006) were noted for each participant in the following four conditions.
1. Unaided condition
2. Aided conditions -
2. A. with binaural analog hearing aids (AA)
2. B. with binaural digital hearing aids (DD)
2. C. with digital hearing aid in one ear and analog hearing aid in the opposite ear (AD).
In the unaided condition the SRS was obtained. For this one of the four Phonemically
Balanced word lists (Yathiraj & Vijayalakshmi, 2006) was presented using monitored live voice,
at 45 dBHL, in sound field condition where the stimulus was routed through the speaker. The
participant was instructed to repeat the words presented. The number of words repeated correctly
was scored. Each correct repetition was given the score of one, the maximum score being 25, as
the list consisted of 25 words.
Then one of the combinations of the hearing aids (2A, 2B or 2C) was fitted in the
participant's ears. In condition 2C, the analog hearing aid was fitted in one ear and the
digital hearing aid in the other ear. The order of testing the aided conditions was randomized
across the participants in order to avoid an order effect.
Thus, at the end of the third phase, SRS in the four test conditions (unaided and three aided
conditions) were obtained for each participant.
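The randomization of the aided conditions can be illustrated with a short Python sketch. The text does not say how the random orders were generated, so the independent shuffle per participant, the function name and the seed shown here are assumptions.

```python
import random

CONDITIONS = ("AA", "DD", "AD")  # the three aided conditions named in the text

def randomized_orders(n_participants, seed=None):
    """Draw an independent random testing order of the aided conditions for each participant."""
    rng = random.Random(seed)
    return [rng.sample(CONDITIONS, k=len(CONDITIONS)) for _ in range(n_participants)]

# 15 participants as in the study; the seed is arbitrary and only makes the sketch repeatable.
orders = randomized_orders(15, seed=1)
for participant, order in enumerate(orders, start=1):
    print(participant, order)
```

A fully counterbalanced design, with each of the six possible orders used equally often, would be an alternative; independent shuffling is shown only because it is the simplest reading of "randomized across the participants".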
Phase 4: Quality rating of speech through the hearing aids
In each test condition (AA, DD & AD), after the SRS was obtained the participant
listened carefully to a CD recorded sample of a passage in Kannada, played through a CD player
connected to the audiometer. Each participant was instructed to listen to the recorded passage
and to rate the quality of the recorded passage. The rating was based on three parameters of aided
speech, namely loudness, clarity and intelligibility. Loudness was defined as the perception of
psychological impression of intensity of sound (Stach, 1997). Clarity was defined as the
distinctness or purity of tone (Cecil & Patridge, 1970). Intelligibility was defined as the correct
understanding ability of the speech units. A ten point rating scale was used for rating each of
these parameters. The scale for the three parameters of aided speech was as follows:
For loudness: 1 - Very soft (can't hear) to 10 - Uncomfortable
For clarity: 1 - Completely unclear to 10 - Very clear
For intelligibility: 1 - Completely unintelligible to 10 - Fully intelligible
The participant was instructed to listen to the recorded passage presented at 45 dBHL
through loud speaker of the audiometer. Using the above rating scale the participant was
requested to listen to and rate the recorded passage in each of the three aided conditions. The
quality rating of connected discourse was also analyzed in order to find out the difference in
speech perception when using any of the three conditions.
Phase 5: Measurement of signal processing delay
The group delay is the amount of time it takes the digital hearing aid to process sound.
The processing delay for some hearing aids is so slight that it is imperceptible to the human ear.
The processing delay for other aids can extend to several milliseconds, longer than the
processing time of an analog hearing aid. If an individual is fitted monaurally with an aid with a
significant digital processing delay, that person might experience some confusion because the
unaided ear will be hearing sounds slightly earlier than the aided ear. The processing delay of the
hearing aids used was measured for the purpose of comparing the performance of the hearing
aids in the three aided conditions. The following steps were used for measuring the processing
delay of the hearing aids used.
1. The enhanced DSP screen was selected from the opening screen of the hearing aid test system.
2. The sound chamber was leveled and the hearing aid was placed in the sound chamber for testing.
That is, the hearing aid was connected to a 2 cc coupler and the output was collected through a
test microphone for analysis.
3. The hearing aid with the coupler and the microphone were placed in the anechoic chamber and
the measurement of the processing delay was performed.
4. The processing delay measurement was taken by sending a short impulse from the sound chamber
speaker to the hearing aid.
5. For the processing delay measurements, the hearing aid test system microphone collected the
output of the hearing aid, a series of varying amplitudes, for 20 msec. from the time the impulse
was delivered.
6. The data collected in the digital processing delay measurement were displayed in graphical
format as amplitude vs. time (Figures 1 & 2). The delay point is represented by a dotted vertical
line along with the display of numerical data. There was a second dotted vertical line showing the
delay for reference.
Fig. 1. Measurement of the processing or group delay of a digital hearing aid
Fig. 2. Measurement of the processing or group delay of an analog hearing aid
The data collected from this measurement were displayed in a graphical format 20 msec. wide. The
measurement from the vertical point to the response wave of the hearing aid is taken as the processing
delay of that hearing aid.
Similarly the processing delay of all the test hearing aids used was measured and the values were
compared according to different aided conditions for the testing.
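The idea of reading the processing delay off the recorded impulse response can be sketched as follows. This is not the algorithm of the hearing aid test system, which the text does not describe; the onset rule (first sample reaching half of the peak amplitude), the sampling rate and the simulated response are all assumptions made for illustration.

```python
def processing_delay_ms(samples, sample_rate_hz, threshold_ratio=0.5):
    """Estimate the delay as the time of the first sample whose magnitude reaches
    threshold_ratio times the peak magnitude (an assumed onset criterion)."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        raise ValueError("no response found in the recording window")
    for index, value in enumerate(samples):
        if abs(value) >= threshold_ratio * peak:
            return 1000.0 * index / sample_rate_hz
    raise ValueError("unreachable: the peak sample always meets the threshold")

# Hypothetical 20 msec. recording window sampled at 20 kHz (400 samples).
SAMPLE_RATE_HZ = 20000
window = [0.0] * 400
window[18:22] = [1.0, -0.6, 0.3, -0.1]  # simulated response starting at sample 18

print(processing_delay_ms(window, SAMPLE_RATE_HZ))
```

In this simulated recording the response begins 18 samples after the impulse, which at 20 kHz corresponds to a delay of 0.9 msec.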
Thus the SRS and ratings for quality of speech (loudness, clarity and intelligibility) were
made for the three aided conditions and the data on processing delay were collected. The results
were analyzed to evaluate the objective of the study.
Results and Discussions
The objective of the study was to compare the performance of 15 participants with
bilateral symmetrical sensorineural hearing loss, in terms of audibility, recognition and quality of
speech using the three aided conditions, viz., binaural analog hearing aids, binaural digital
hearing aids and binaural amplification with analog and digital hearing aids in opposite ears.
The objective of the present study was evaluated by obtaining the Speech Recognition
Scores (SRS) and the quality rating of speech. The quality of speech was rated on three
parameters, namely loudness, clarity and intelligibility, in the three aided conditions: AA, DD and
AD. The data were tabulated and statistically analyzed using the Statistical Package for Social
Sciences (SPSS, version 10.0).
The mean and standard deviations of the following measures were computed for all the
three aided conditions. Following this, pair-wise comparisons were made to see if there was a
significant difference between the aided conditions. The results are given below under three main
headings.
I) Speech Recognition Scores (SRS)
II) Quality rating of speech
III) Hearing aid processing delay.
I. Speech Recognition Scores (SRS)
The SRS in the three aided conditions were analyzed. Table 1 depicts the mean and
standard deviation (SD) values of the SRS obtained in the three aided conditions. The three aided
conditions include analog binaural hearing aids (AA), digital binaural hearing aids (DD) and
analog hearing aid in one ear and digital hearing aid in the other ear (AD). As can be observed in
Table 1, the mean SRS value was highest in the DD condition and lowest in the AA condition.
The variation of SRS, i.e., the SD, was comparable across the conditions.
Table 1: Mean and Standard Deviation values of SRS obtained in the three aided conditions
Condition    Mean     SD
SRS – AA     19.60    4.03
SRS – DD     22.13    3.54
SRS – AD     21.33    3.59
Note: SRS-AA: SRS with binaural analog hearing aids; SRS-DD: SRS with binaural digital hearing aids; SRS-AD:
SRS with analog hearing aid in one ear and digital hearing aid in the opposite ear.
Further, a one-way repeated measures Analysis of Variance (ANOVA) was performed to
see if the mean SRS values in the three aided conditions were significantly
different. The results showed a significant difference between the three conditions,
F(2, 28) = 9.73, p < 0.01, indicating a significant effect of the aided condition.
The mean SRS in the AD condition was higher than that in the AA
condition, and the mean SRS in the DD condition was higher than that in the AD condition.
Bonferroni's multiple comparison showed no significant difference
between the AA and AD conditions or between the AD and DD conditions (p > 0.05). However, there
was a significant difference between the AA and DD conditions (p < 0.05).
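As a quick check on the reported statistics, the degrees of freedom of a one-way repeated measures ANOVA and the size of the Bonferroni family can be worked out from the design alone (15 participants, three aided conditions). The sketch below assumes no sphericity correction was applied to the degrees of freedom and expresses the Bonferroni adjustment as a per-comparison alpha; the function names are illustrative.

```python
from itertools import combinations

def rm_anova_df(n_participants, n_conditions):
    """Degrees of freedom (effect, error) for a one-way repeated measures ANOVA,
    assuming no sphericity correction."""
    df_effect = n_conditions - 1
    df_error = (n_conditions - 1) * (n_participants - 1)
    return df_effect, df_error

def pairwise_comparisons(conditions):
    """All unordered pairs of conditions that enter the multiple comparison."""
    return list(combinations(conditions, 2))

def bonferroni_alpha(family_alpha, n_comparisons):
    """Per-comparison significance level under the Bonferroni adjustment."""
    return family_alpha / n_comparisons

pairs = pairwise_comparisons(["AA", "DD", "AD"])
print(rm_anova_df(15, 3))
print(pairs)
print(bonferroni_alpha(0.05, len(pairs)))
```

The degrees of freedom come out as (2, 28), matching the reported F(2, 28), and the three aided conditions give three pairwise comparisons.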
This result is consistent with that reported by Prinz, Nubel and Gross (2002), who obtained
similar results while testing individuals with bilateral moderately-severe symmetrical hearing
loss. Interestingly, all the studies that demonstrated the advantage of binaural amplification
used ear level hearing aids, such as those by Jerger and Dirks (1961), MacKeith and Coles
(1971) and others.
II. Quality rating of speech
Quality rating of hearing aid processed speech was done on three sub-scales. They were
loudness, clarity and intelligibility. The three sub-scales were rated on a ten- point rating scale,
one being very soft and ten being uncomfortable for loudness, one being completely unclear and
ten being very clear for clarity and one being unintelligible and ten being fully intelligible for
intelligibility.
From Table 2, it can be observed that the mean quality ratings of loudness,
clarity and intelligibility are higher for the DD and the AD conditions than for the AA
condition, and that in all the quality ratings the value for the DD condition is higher than that for the AD
condition. The variation as revealed by the SD was slightly higher in the AA condition than in
the AD condition, which in turn was higher than in the DD condition.
Table 2: Mean and SD of ratings on loudness, clarity and intelligibility in AA, DD and AD
conditions
Quality sub-scale   Aided condition   Mean (N=15)   SD
Loudness            AA                6.93          2.46
                    DD                7.93          1.98
                    AD                7.66          2.05
Clarity             AA                6.40          2.13
                    DD                8.20          1.42
                    AD                8.00          2.00
Intelligibility     AA                7.06          2.15
                    DD                8.46          1.30
                    AD                7.86          1.95
II. a. Loudness
Friedman's test, a non-parametric equivalent of one-way repeated measures ANOVA,
was used for comparison of ratings for loudness between the three aided conditions.
Friedman's test showed no significant difference between the loudness ratings of the three
aided conditions [χ²(2) = 5.568, p > 0.05].
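For readers unfamiliar with Friedman's test, the statistic can be computed by ranking the three ratings within each participant and comparing the rank sums of the conditions. The Python sketch below implements the basic formula without a correction for tied ratings, whereas statistical packages typically apply one, so it will not reproduce package output exactly when ties occur. The ratings are hypothetical and are not the study's data.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks

def friedman_chi_square(rows):
    """Friedman statistic (no tie correction) for rows of per-participant ratings,
    one column per condition. Degrees of freedom = number of conditions - 1."""
    n = len(rows)
    k = len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        for column, rank in enumerate(average_ranks(row)):
            rank_sums[column] += rank
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)

# Hypothetical ratings for four listeners; columns are AA, DD, AD.
ratings = [
    [6, 8, 7],
    [5, 9, 8],
    [7, 8, 6],
    [4, 7, 6],
]
print(friedman_chi_square(ratings))
```

Here the rank sums are 5, 12 and 7 for AA, DD and AD, giving a statistic of 6.5 with 2 degrees of freedom.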
II. b. Clarity
Friedman's test was used for comparison of ratings for clarity in the three aided
conditions and the results showed a significant difference between the aided conditions AA, DD
and AD [χ²(2) = 13.792, p < 0.01]. Wilcoxon's signed rank test (a non-parametric equivalent
of the paired t-test) was used for pair-wise comparison of the three aided conditions, which revealed
no significant difference between the AD and DD conditions (p > 0.05). However, significant
differences between the AA and AD conditions (p < 0.01) and the AA and DD conditions (p < 0.01)
were noted.
II. c. Intelligibility
Friedman's test was used for the comparison of intelligibility ratings for the three aided
conditions, which showed a significant difference between the aided conditions [χ²(2) = 6.545, p <
0.05]. Wilcoxon's signed rank test was performed for the pairwise comparison of the three aided
conditions, which revealed no significant difference between the AD and AA (p > 0.05) and AD
and DD (p > 0.05) conditions. However, a significant difference between the DD and AA (p <
0.05) conditions was noted.
III. Hearing aid processing delay
Frye (2001) reported that one of the properties of the digital technology is that it always
takes time to process digital data. Group delay is the delay between input and output of the
digital hearing aid (Kates, 2003). The processing delay for some hearing aids is so small that it is
imperceptible to the human ear. The processing delay for other hearing aids can extend to several
milliseconds. For analog hearing aids, the processing delay would be comparatively very small
because they do not perform signal processing activities like the digital hearing aid. Kates
(2003) suggested that, even before the delay in the digital processing is considered, the other
components of the hearing aid (microphone, receiver, A/D or D/A) and the acoustic interactions
will contribute from 2 to almost 5 msec. of group delay. This depends on the sampling rate and the
algorithm(s) implemented. The processing delay of the hearing aids used in the present study is
tabulated below.
Table 3: Processing delay of the hearing aids used
Hearing aid    Processing delay (msec)
Analog 1       0.4
Analog 2       0.4
Digital 1      0.9
Digital 2      0.9
From Table 3 it can be seen that the processing delays of the two analog hearing aids
were the same, as were those of the two digital hearing aids. As suggested by Stone
and Moore (2003), a processing delay of 32 msec. could be allowed for effective speech
perception for individuals having hearing impairment of about 55 dBHL. The finding of the
study done by Henrickson (2004) also supported that of Stone and Moore, concluding that
processing delays of about 0.3 to 0.7 msec. are acceptable for analog hearing aids and about 1 to
11 msec. for digital hearing aids. Dillon (2001), by comparing five digital hearing
aids on individuals with hearing impairment, concluded that processing delays of 1.2 to 10 msec.
were acceptable and that there was no correlation between hearing aid preference and the processing
delay. Flame (2002) concluded that if the mismatch in delay times did not change often, individuals
with hearing impairment adapt within a period of hours or days. Though there is a
difference in the group delay between the analog (0.4 msec.) and the digital (0.9 msec.)
hearing aids used in the present study, it is inferred from the evidence in the literature that
individuals with hearing impairment will adapt to the difference in processing delay between the
analog and the digital hearing aids. However, audiologists should keep in mind that the group delays
of the two hearing aids should not differ much if effective speech perception is to be provided.
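The interaural mismatch implied by the measured delays can be worked out directly from the values in Table 3. The short Python sketch below does this for each aided condition; the dictionary and function names are illustrative, and the pairing of specific aids to ears is an assumption.

```python
# Processing delays (msec) as reported in Table 3.
DELAY_MS = {"Analog 1": 0.4, "Analog 2": 0.4, "Digital 1": 0.9, "Digital 2": 0.9}

def interaural_mismatch_ms(aid_in_one_ear, aid_in_other_ear):
    """Absolute difference in processing delay between the aids worn in the two ears."""
    return abs(DELAY_MS[aid_in_one_ear] - DELAY_MS[aid_in_other_ear])

# Assumed pairing of aids for each aided condition.
conditions = {
    "AA": ("Analog 1", "Analog 2"),
    "DD": ("Digital 1", "Digital 2"),
    "AD": ("Analog 1", "Digital 1"),
}
for name, (first, second) in conditions.items():
    print(name, round(interaural_mismatch_ms(first, second), 1))
```

Only the AD condition carries a mismatch, of 0.5 msec., while the AA and DD conditions have none.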
From the findings of the present study it can be inferred that clients could be
advised to use one analog and one digital hearing aid in opposite ears until they can
afford another digital hearing aid. However, the processing delay of the hearing aids needs to be
measured for optimum performance in such situations. This recommendation rests on the advantages
of binaural amplification and on the better performance of binaural digital hearing aids
compared to analog, in terms of clarity, intelligibility and loudness.
References
Condie, R., Scollie, S. & Checkly, S. (1984). Children's performance with analog vs. digital
adaptive dual microphone hearing aid. Ear & Hearing, 19(5), 407-413.
Frye, G. J. (2001). Testing digital and Analog Hearing Instruments: Processing Time Delays and
Phase Measurements. A look at the potential side effects and ways of measuring them.
http://www.frye.com/library/acrobat/hrarticle2.pdf.
Jerger, J. & Dirks, D. (1961). Binaural hearing aids: An enigma. Journal of the Acoustical
Society of America, 19, 629-631.
MacKeith, N.W. & Coles, R.A. (1971). Binaural advantages of hearing in speech. Journal of
Laryngology and Otology, 85, 231-232.
Markides, A. (1977). Binaural amplification. NY: Academic Press.
Yathiraj, A. & Vijayalakshmi, C. S. (2006). Phonemically balanced word list in Kannada
developed at the dept. of Audiology, AIISH.
Kates, J. M. (2003). Dynamic range compression using digital frequency wrapping. US patent
application 20030081804, tia.sagepub.com/cgi/content/ref/8/3/84.
Hirsh, I. J. (1995). Pre-attentive discriminability of sound order as a function of tone duration
and interstimulus interval: A mismatch negativity study. Audiology and Neurotology,
4(6), 303-310.
Nabelek, A. K. & Robinson, P. K. (1982). Monaural and binaural speech perception in
reverberation for listeners of various ages. Journal of Acoustical Society of America, 71,
1242-1248.
Prinz, I., Nubel, K. & Gross, M. (2002). Digital and analog hearing aids in children. Is there a
method for making an objective comparison possible? http://www/medical/mm_
0093_coveragepositioncriteria_hearingaids.pdf
Ross, M. (1997). A retrospective look at the future of aural rehabilitation. Journal of American
Academy of Audiology, 30, 11-28.
Sanders, D.A. (1977). Auditory perception of speech. NJ: Prentice Hall, Inc.
Sanders, D.A. (1982). Aural rehabilitation: A management model. 2nd edn. NJ: Prentice Hall, Inc.
Sanders, D.A. (1993). Management of hearing handicap: Infants to elderly. NJ: Prentice Hall,
Inc.
Stone, M.A. & Moore, B.C.J. (2003). Tolerable hearing aid delays: effect on speech production
and perception of across frequency variation of delays. Ear & Hearing, 24(2), 175-183.
Reduction of Stimulus Artifacts in ASSR: An Investigation of a
Stimulus Approach
Arivudai Nambi P & C S Vanaja
Abstract
Auditory steady state responses at high stimulus levels are contaminated by artifacts
when the weighted averaging method or the phase coherence method is used to detect responses.
The current study aimed to determine the upper limit for artifact free ASSR for stimuli presented
through headphone and bone vibrator. The current study also investigated whether artifacts can
be avoided by changing the carrier frequency in such a way that it is not an integer multiple of
the sampling frequency. ASSR was recorded in 30 individuals with profound hearing loss who did
not show behavioural responses even at the upper limit of the ASSR system. The results revealed that
the upper limit for artifact free ASSR measurement is 90 dB HL for headphones and 50 dB HL for
the bone vibrator. Changing the carrier frequency enhanced the dynamic range for artifact
free ASSR measurement obtained through headphone and bone vibrator.
Introduction
Auditory steady state response (ASSR) is one of the objective methods for estimation of
behavioural thresholds (Cone-Wesson et al., 2002; Aoyagi et al., 1994; Luts & Wouter, 2005).
ASSR examines the response to sinusoids that are amplitude, frequency or mixed modulated.
Since the stimuli used for ASSR recording are continuous and modulated in nature, it is
possible to present them at levels up to 120 dB HL. This feature allows the ASSR to assess
hearing threshold levels greater than 80 dB HL and to differentiate severe from profound hearing
loss (Rance et al., 1998; Rance et al., 1993; Swanepoel, Hugo & Roode, 2004). Gorga et al.
(2004) first pointed out the presence of artifactual ASSR at higher levels (>95 dBHL) in
individuals with profound hearing loss who did not show any behavioral responses to modulated
stimuli even at the upper limits of the ASSR system. The presence of these artifacts was also
confirmed by other investigators (Small & Stapells, 2004; Picton & John, 2004). The presence of
these artifacts at higher levels might limit the ASSR's application in differentiating between severe
and profound hearing loss. Picton and John (2004) reasoned that "aliasing" error causes
the occurrence of artifacts at higher levels. Aliasing occurs when a signal is sampled at a rate
lower than twice its frequency. The signal is then seen at a frequency equal to the absolute
difference between its frequency and the closest integer multiple of the sampling rate. The sampling rate used in the ASSRs is
Professor of Audiology, School of Audiology and Speech Language Pathology, Bharathiya Vidya Peet University,
Katra-Dhanakawadi, Pune, India. e-mail: [email protected]
designed for the efficient analysis of the responses at the frequencies of modulation. Small and
Stapells (2004) explained that the frequency of the aliasing error which occurs in ASSR can be
predicted by the formula,
Alias frequency = |Closest integer multiple of sampling frequency - Input frequency|
For example a 500 Hz tone that is amplitude modulated at 80 Hz would have energy at
420, 500, 580 Hz. If this energy is present in the EEG being digitized at 500 Hz an alias
frequency would be 500Hz-420Hz = 80Hz which is exactly the same as the modulation rate for
this 500 carrier frequency. This was explained with respect to weighted averaging method of
response detection (spectrum of the responses was considered) which makes use of F-test.
Artifacts were also present in an instrument that uses phase coherence to detect responses; the
upper limit for artifact-free ASSR measurement was 95 dBHL for headphones and 105 dBHL for insert
earphones (Narne, Nambi & Vanaja, 2006). Picton and John (2004) used different
approaches to eliminate artifacts by shifting the aliasing frequency away from the modulation
frequency. These approaches include the use of different stimuli, such as ‘beats’, which have
energy at the carrier frequency ± half the modulation frequency, and sinusoidally alternated amplitude
modulated tones, which have energy at the carrier frequency ± 3/2 times the modulation frequency
and at the carrier frequency ± half the modulation frequency, as well as changing the A/D conversion rates.
These approaches have been shown to be effective in avoiding artifacts when ASSR is recorded using
the weighted averaging method. It is not known whether these techniques will help in reducing
artifacts in ASSR recorded using the phase coherence method. Another approach that can
probably shift the aliasing frequency away from the modulation frequency is to use a carrier
frequency that is not an integer multiple of the sampling frequency (Picton & John, 2004).
Research needs to be done to check whether changing the carrier frequency can avoid the
artifacts or enhance the dynamic range of ASSR for artifact-free measurements. The current
study therefore aimed to investigate whether changing the carrier frequency can avoid artifacts while
recording ASSR using the phase coherence method. The study also aimed at determining the upper
limit for artifact-free ASSR measurements for stimuli presented through a bone vibrator.
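The aliasing argument above can be illustrated with a short Python sketch that applies the formula quoted from Small and Stapells (2004). It is only a sketch under stated assumptions: the stimulus is treated as a simple sinusoidally amplitude-modulated tone (components at the carrier and at the carrier plus and minus the modulation rate), the A/D rate is assumed to be 500 Hz or 1000 Hz, and the function names are illustrative. The mixed modulated stimuli used in the study would contain additional components that are not modelled here.

```python
def alias_frequency(input_hz, sampling_hz):
    """Distance between input_hz and the closest integer multiple of sampling_hz."""
    nearest_multiple = round(input_hz / sampling_hz) * sampling_hz
    return abs(nearest_multiple - input_hz)


def am_components(carrier_hz, modulation_hz):
    """Components of a sinusoidally amplitude-modulated tone (assumed stimulus model)."""
    return [carrier_hz - modulation_hz, carrier_hz, carrier_hz + modulation_hz]


def aliases_onto_modulation(carrier_hz, modulation_hz, sampling_hz):
    """True if any stimulus component aliases exactly onto the modulation rate."""
    return any(
        alias_frequency(component, sampling_hz) == modulation_hz
        for component in am_components(carrier_hz, modulation_hz)
    )


if __name__ == "__main__":
    # Carrier / modulation pairs taken from the Method section of this article.
    pairs = [(500, 74), (522, 74), (1000, 81), (1022, 81),
             (2000, 88), (2022, 88), (4000, 95), (4022, 95)]
    for carrier, modulation in pairs:
        for sampling in (500, 1000):  # assumed A/D rates
            hit = aliases_onto_modulation(carrier, modulation, sampling)
            print(f"carrier {carrier} Hz, A/D {sampling} Hz: alias at modulation rate = {hit}")
```

Under these assumptions, each conventional carrier frequency has a component that aliases exactly onto its modulation rate for at least one of the two A/D rates, whereas none of the experimental carrier frequencies does.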
Method
Participants
Thirty individuals with profound hearing loss ranging in age from 18 to 40 years
participated in the current study. ASSR was recorded from 15 participants for stimuli presented
through headphones and from 15 participants for stimuli presented through a bone vibrator. It
was ensured that the subjects did not show any behavioral responses to mixed modulated stimuli
used for recording ASSR at the presentation level used in the experiment.
Instrumentation
A calibrated two-channel diagnostic audiometer (Madsen OB 922) with TDH
39 headphones and a Radioear B71 bone vibrator was used to estimate the behavioral thresholds.
A GSI Audera (version 1.0.2.2) was used to record ASSR as well as to obtain behavioral
thresholds to mixed modulated stimuli presented through the headphone and bone vibrator.
Procedure
Pure tone audiometric thresholds and behavioral thresholds to the modulated tones were
measured using the modified Hughson-Westlake procedure (Carhart & Jerger, 1959). Behavioral
thresholds to modulated stimuli were obtained at 500, 1000, 2000 and 4000 Hz, hereafter
referred to as conventional carrier frequencies, and at 522, 1022, 2022 and 4022 Hz, hereafter
referred to as experimental carrier frequencies. These measurements were
carried out to ensure that the participants met the subject selection criteria.
For recording ASSR, subjects were seated comfortably in a reclining chair and were
asked to relax or sleep. Electrode sites were cleaned using skin prepping gel. Silver chloride
electrodes were used to record the ASSR using a three-electrode placement. For air conduction
measurements, the inverting electrode was placed on the test ear mastoid, the non-inverting electrode was
placed on the forehead and the ground electrode was placed on the mastoid of the non-test ear. For bone
conduction measurements, the non-inverting electrode was placed at the vertex and the sites for the inverting
and ground electrodes were the same as those used for air conduction measurements. It was
ensured that electrode impedances were less than 5 kOhms and inter-electrode impedance was
less than 2 kOhms.
For air conduction measurements, a supra-aural headset was placed over the pinna and for
bone conduction measurements a bone vibrator was placed on the forehead. ASSRs were
recorded for the conventional carrier frequencies (500, 1000, 2000 and 4000 Hz) as well as for
the experimental carrier frequencies (522, 1022, 2022 and 4022 Hz) in all the subjects. ASSR
measurements were performed using high modulation frequencies: 74 Hz for 500 & 522 Hz, 81
Hz for 1000 & 1022 Hz, 88 Hz for 2000 & 2022 Hz, and 95 Hz for 4000 & 4022 Hz carrier
frequencies. Testing was initiated at the maximum limits of the instrument and the intensity was
varied in 5 dB steps to find the highest intensity level at which artifact-free ASSR
measurements could be obtained. Responses were determined automatically by the instrument using
the phase coherence method.
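The level search described above can be sketched in Python as a simple descending loop. This is a minimal illustration, not the instrument's algorithm: `response_detected` is a hypothetical stand-in for the instrument's automatic phase coherence decision, and the function name and the level limits in the example are assumptions. Because the participants showed no behavioral response at the levels tested, any response flagged by the detector is treated as an artifact.

```python
from typing import Callable, Optional


def highest_artifact_free_level(
    max_level_db: int,
    min_level_db: int,
    response_detected: Callable[[int], bool],
    step_db: int = 5,
) -> Optional[int]:
    """Descend from max_level_db in step_db steps and return the first
    (that is, the highest) level at which no response is flagged.
    Returns None if a response is flagged at every level tested."""
    level = max_level_db
    while level >= min_level_db:
        if not response_detected(level):
            return level
        level -= step_db
    return None


if __name__ == "__main__":
    # Hypothetical detector: flags an (artifactual) response at 100 dBHL and above.
    def example_detector(level_db: int) -> bool:
        return level_db >= 100

    print(highest_artifact_free_level(115, 60, example_detector))
```

The returned level corresponds to what Table 2 reports as the maximum limit for obtaining artifact-free ASSR for a given carrier frequency and transducer.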
Results
It was observed that fewer individuals had artifacts for the experimental carrier
frequencies than for the conventional carrier frequencies. Figures 1 and 2 depict the number of
individuals with artifacts for conventional and experimental carrier frequencies for air conducted
and bone conducted stimuli. From the figures it is clear that as the carrier frequency increases, the
number of individuals with artifacts reduces for both conventional and experimental carrier
frequencies, for stimuli presented through both the headphone and the bone vibrator.
[Figure 1: bar chart. X-axis: Frequency (500, 1000, 2000, 4000); Y-axis: No. of subjects (0 to 14); series: Conventional CF, Experimental CF]
Figure 1: Number of individuals with artifacts for air conducted stimuli
[Figure 2: bar chart. X-axis: Frequency (500, 1000, 2000, 4000); Y-axis: No. of subjects (0 to 12); series: Conventional CF, Experimental CF]
Figure 2: Number of individuals with artifacts for bone conducted stimuli
The overall percentage of subjects in whom artifacts were observed was higher for the air
conduction transducer than for the bone conduction transducer. Table 1 depicts the
percentages of subjects in whom artifacts were observed for conventional and experimental
carrier frequencies recorded using the air conduction and bone conduction transducers.
Table 1: Percentage of subjects with artifacts for conventional & experimental carrier frequencies
Frequency (Hz)    Percentage (%)
                  Headphone    Bone vibrator
500               93.3         91.6
1000              80.0         46.15
2000              66.6         40.0
4000              26.6         26.6
522               66.6         33.3
1022              40.0         23.07
2022              13.5         0.0
4022              0.0          0.0
It was observed that the minimum level at which artifacts occurred was higher for
experimental carrier frequencies than for conventional carrier frequencies. Table 2
shows the mean and standard deviation of minimum levels at which artifacts occurred across the
carrier frequencies for air conducted and bone conducted stimuli and maximum intensity at
which artifact free ASSR can be recorded.
Table 2: Mean and SD of minimum level at which artifact occurred and the maximum limit for
artifact free ASSR measurement in dBHL
             Headphone                                Bone vibrator
Frequency    Lowest level at    Max limit for         Lowest level at    Max limit for
             which artifacts    obtaining artifact    which artifacts    obtaining artifact
             occurred,          free ASSR             occurred,          free ASSR
             Mean (SD)                                Mean (SD)
500 Hz       98.57 (4.97)       85                    56.81 (3.37)       45
522 Hz       106.0 (3.94)       95                    60.0 (0.00)        55
1000 Hz      109.16 (6.33)      90                    73.33 (2.58)       65
1022 Hz      112.5 (3.94)       100                   78.33 (5.00)       70
2000 Hz      108.0 (5.37)       95                    80.83 (6.40)       65
2022 Hz      110 (0.00)         105                   >80                80
4000 Hz      112.5 (5.00)       100                   70.00 (0.00)       65
4022 Hz      >115               115                   >70                70
Discussion
In the present study it was found that using the experimental carrier frequencies reduces
the presence of artifacts. The exact sampling frequency used in the instrument is not known.
However, Small and Stapells (2004) reported that commercially available ASSR systems use
A/D conversion rates of 500 Hz or 1000 Hz as the default setting. It was therefore assumed that this
instrument also uses a similar A/D rate, and the carrier frequencies were changed in such a way that
they were not integer multiples of 500 Hz or 1000 Hz. Results revealed that using the
experimental carrier frequencies reduced the occurrence of artifacts. This supports the notion that
these artifacts might be due to an electromagnetic aliasing effect. The change in carrier frequency
may not allow the electromagnetic carrier signal to alias at the modulation frequency if the A/D
rates are 500 Hz or 1000 Hz (Picton & John, 2004). Similar results have been reported by other
investigators when they changed the sampling rate (Picton & John, 2004; Small & Stapells, 2004).
The artifact reduction for low frequency stimuli (522 & 1022 Hz) was less than that
for high frequency stimuli (2022 & 4022 Hz). The persistence of artifacts at low frequencies
may be attributed to physiological artifacts. The artifacts may be of vestibular origin. Vestibular
stimulation is larger at low frequencies than at high frequencies (Townsend & Cody,
1971; Todd, Cody & Banks, 2000). Small and Stapells (2004) reported that a vestibular evoked
myogenic potential from the inion muscle could be recorded by an electrode placed on the nape.
However, in the present study electrodes were placed on the mastoids and forehead/vertex. This
electrode placement is not suited for picking up vestibular evoked myogenic potentials. There are
reports of a negative potential, N3, generated from the vestibular nuclei through stimulation of
the saccule (Nong, Ura & Noda, 2000), which could be picked up using conventional electrode
placements (forehead/vertex to mastoid) in individuals with profound hearing loss. An investigation
by Narne, Nambi and Vanaja (2006) also supported the possibility of vestibular artifacts while
recording ASSR at high intensity, based on latency calculations from the phase delay, which falls
around 3-5 msec. They reported that these physiological artifacts mainly contaminated the ASSR
elicited by the 500 Hz carrier signal. However, in the current study the artifacts persisted for the 1022 Hz
carrier signal also. Occurrence of artifacts at this frequency may also be of vestibular origin.
Vestibular stimulation by a 1000 Hz acoustic signal has also been reported in the literature (Cheng,
Huang & Young, 2003; Welgampola & Colebatch, 2001). Another possible reason for obtaining
more artifacts at low frequencies is related to the electrical energy required to drive the
oscillator. It has been reported that less electrical energy is required to drive the oscillator at high
frequencies (2000 Hz & 4000 Hz) than at low frequencies (Small & Stapells, 2004).
So the electromagnetic energy radiated during the generation of high frequency signals will
be less than that for low frequency signals, which in turn might reduce the amplitude of
electromagnetic stimulus artifacts. A third reason may be that the higher carrier frequencies are
farther from the EEG low pass filter setting and thus the stimulus artifact would be smaller in
amplitude (Small & Stapells, 2004).
Results revealed that there were fewer artifacts for bone conducted stimuli than
for air conducted stimuli. In the current study the bone vibrator was placed on the
forehead and the electrodes were placed on the mastoids and vertex. For air conduction testing the
electrodes were placed on the mastoids and forehead. So the physical separation between the
transducer and the electrodes was greater in the case of the bone vibrator than the headphone.
This might have reduced the amplitude of electromagnetic energy reaching the electrode, which
in turn probably reduced the occurrence of artifacts. Also, forehead placement of the bone
oscillator might result in less vestibular stimulation, possibly due to the different mode of
stimulation compared with temporal bone placement (Small & Stapells, 2004). In the current
study ASSRs for air conducted and bone conducted stimuli were not obtained from the same
subjects due to time constraints. The individual variability among the subjects may also have
accounted for the difference in the percentage of subjects in whom artifacts were observed.
The upper limits for artifact-free ASSRs elicited by experimental carrier frequencies were
higher than those for ASSRs elicited by conventional carrier frequencies. This may be because
the experimental carrier frequencies reduced the electromagnetic artifacts, so that the recordings were mainly
contaminated by physiological artifacts. The physiological artifacts might occur at slightly higher
intensities when compared to stimulus artifacts and this probably enhanced the upper limits for
artifact free ASSR measurements for experimental carrier frequencies.
It can be concluded from the present study that the percentage of subjects in whom artifacts
were observed is higher for conventional carrier frequencies than for experimental
carrier frequencies. The important clinical implication of the current study is that the
experimental carrier frequencies can be used for threshold estimation, as the dynamic range for
artifact-free ASSR measurement is higher for experimental carrier frequencies than for
conventional carrier frequencies.
References
Aoyagi, M., Kiren, T., Furuse, H, Fuse, T., Suzuki, Y. & Yokota, M. (1994). Pure-tone threshold
prediction by 80 Hz amplitude-modulation following response. Acta Otolaryngology,
Supplement, 511, 7–14.
Carhart, R. & Jerger, J. (1959). Preferred method for clinical determination of pure-tone
thresholds. Journal of Speech and Hearing Research, 24, 330–45.
Cheng, P.W., Huang, T.W. & Young, Y.H. (2003). The influence of clicks versus short tone
bursts on the vestibular evoked myogenic potential. Ear and Hearing, 24, 195-197.
Cone-Wesson, B., Rickards, F., Poulis, C., Parker, J., Tan, L. & Pollard, J. (2002). The auditory
steady-state response: clinical observations and applications in infants and children.
Journal of the American Academy of Audiology, 13, 270-282.
Gorga, M.P., Neely, S.T., Hoover, B.M., Dierking, D.M., Beauchaine, K.L. & Manning, C.
(2004). Determining the upper limits of stimulation for auditory steady state response
measurements. Ear & Hearing, 25, 302–307.
Luts, H. & Wouters, J. (2005). Comparison of MASTER and AUDERA for measurement of
auditory steady state responses. International Journal of Audiology, 44, 244-253.
Narne, V. K., Nambi, P. A. & Vanaja, C.S. (2006). Artifactual responses in auditory steady state
responses. Paper presented at the 38th Annual Conference of the Indian Speech and Hearing
Association, Ahmedabad, India.
Nong, D.X., Ura, M. & Noda, Y. (2000). An acoustically-evoked short latency negative response
in profound subjects. Acta Otolaryngologica, 128, 960-966.
Picton, T.W. & John, M. (2004). Avoiding electromagnetic artifacts when recording auditory
steady state responses. International Journal of Audiology, 15, 541-554.
Rance, G, Dowell, R.C., Rickards, F.W., Beer, D.E. & Clark, G.M. (1998). Steady state evoked
potential and behavioral hearing thresholds in a group of children with absent click
evoked auditory brainstem response. Ear & Hearing, 19, 48-61.
Rance, G., Rickards, F.W., Cohen, L.T., Burton, M.J. & Clark, G.M. (1993). Steady state evoked
potentials. Advancements in Otorhinolaryngology, 48, 44-48.
Small, S. A. & Stapells, D. R. (2004). Artifactual responses when recording auditory steady state
responses. Ear and Hearing, 25, 611-623.
Swanepoel, D., Schmulian, D. & Hugo, R. (2004). Establishing normal hearing with the dichotic
multiple frequency auditory steady state response compared to an ABR protocol. Acta
Otolaryngology, 124, 62-67.
Todd, N., Cody, P. & Banks, P. (2000). A saccular origin of frequency tuning in myogenic
vestibular evoked potentials. Hearing Research, 141, 180-188.
Townsend, G. L. & Cody, D. (1971). The averaged inion response evoked by acoustic
stimulation: its relation to the saccule. Annals of Otology, 80, 121-132.
Welgampola, M.S. & Colebatch, J.G. (2001). Characteristics of tone burst-evoked myogenic
potentials in the sternocleidomastoid muscles. Otology and Neurotology, 22, 796-802.
Comparison of Word Recognition Scores using Different Settings of
Telecoil in a Digital Hearing Aid
Bijan Saikia & P Manjula
Abstract
This article describes the performance of individuals with hearing impairment while
using the telephone as a means of communication. Listeners with hearing impairment almost always
perform more poorly over the telephone than in face-to-face conversation. The present
study intended to evaluate the optimal settings by comparing the default and modified telecoil
settings needed for best performance over the telephone. Fifteen participants with moderate to
moderately-severe sensorineural hearing loss participated in the study. The word recognition
scores, using Phonemically Balanced Kannada word lists, were compared in different telecoil
settings of the hearing aid. The results revealed a significant difference between the scores
obtained with the different settings of the hearing aid, that is, the default and modified telecoil
settings. Modifications of the default telecoil setting are required to optimize
performance.
Key words: telecoil settings, teleboard, telewand.
Introduction
Hearing aid users have often expressed their dissatisfaction while using the telephone
(Kochkin, 2002). Listeners with hearing impairment almost always perform more poorly over the
telephone than in face-to-face conversation. Although audiologists carefully adjust the
electroacoustic characteristics of a hearing aid in the microphone mode in an attempt to improve
speech understanding, very little attention has been given to the output characteristics of the telecoil
and how well individuals with hearing loss understand speech over the telephone when using
the telecoil mode (Takahashi, 2005). A hearing aid telecoil enables a hearing aid user to listen over
the telephone without any acoustic feedback by taking advantage of electromagnetic
induction (Skinner, 1988). This phenomenon is known as inductive coupling and was first
reported in 1947 by Sam Lybarger. Inductive coupling refers to the phenomenon whereby an
electrical current is induced in a coil of wire as a time-varying magnetic field passes through the
coil (Ross, 2005). Advantages of using the induction coil input mode rather than the microphone mode
include reduced acoustic feedback problems and elimination of ambient noise surrounding the
hearing aid wearer while using a telephone.
The telecoil of the hearing aid is not as sensitive as the microphone input. In addition, the
telephone system itself imposes limitations on the signal received, the most notable being the
Professor of Audiology, All India Institute of Speech and Hearing, Mysore, India
e-mail: [email protected]
band-limiting nature of telephone transmissions. The transmitting bandwidth of the telephone is
approximately 3000 Hz, transmitting frequencies between 300 and 3300 Hz, with a relatively flat
spectrum. Although listening via inductive pick-up is often effective, problems can occur if
interfering magnetic sources such as digital cellular phones, power lines, transformers,
fluorescent lights and computer peripherals are nearby (Levitt, 2001). Another issue that comes
into consideration while using a hearing aid with a telephone is hearing aid compatibility (Smith,
1971). The 1988 law states that a telephone is hearing aid compatible (HAC) if it “provides
internal means for effective use with hearing aids that are designed to be compatible with
telephones which meet established technical standards for hearing aid compatibility” established
by Electronic Industries Association (EIA) Standard RS-504.
A major factor influencing responses through inductive coupling is the physical
orientation of the T-coil relative to the electromagnetic field. Compton (1994) discussed the
relationship between telecoil sensitivity and its orientation, and Revit (1996), from the perspective
of an audiologist and a hearing aid wearer, cited positioning and orientation as an essential part
of instructions to new telecoil users.
Many newer hearing aids now allow programmable adjustments of the level and
frequency response of the telecoil. Some manufacturers offer a telecoil that enables the hearing
aid user to access the telecoil without even having to manually switch to the telecoil mode, that
is, the “touchless” telecoil. It is often problematic to set the telecoil response as there are many
factors that influence real world speech communication over the telephone, such as the
characteristics of the listener’s as well as the talker’s telephone, variability in transmission line
characteristics, variability in the position of the telephone handset relative to the telecoil and
interference from other electromagnetic sources (Takahashi, 2005).
Despite these challenges, in a clinical situation it may be valuable to program a hearing
aid’s telecoil response based on performance. Programming the telecoil seems to require the
same target that is used for the acoustic response, which may not be appropriate for telephone use
because of factors such as the frequencies transmitted through a telephone line and the
electromagnetic field strength around the telephone (Davidson & Noe, 1994). The target is for acoustic
input through the microphone mode. Input through the telecoil differs and hence the target should be
different. Hence, performance in the telecoil mode is poorer than in the microphone mode.
At present, hearing aid analyzer systems have provisions for measurement with a
‘teleboard’ as well as a ‘telewand’ to evaluate the electroacoustic performance of the hearing aid in
the ‘T’ mode. It is also important to know how the electroacoustic characteristics for the telecoil
vary depending on the type of input source, that is, ‘teleboard’ or ‘telewand’.
Although there is a capability to program the overall gain and frequency response of the
telecoil in some hearing aids, there is no standardized method of setting these parameters. Yanz
and Preves (2003) reported that inductive coupling is better in the high frequencies than in the
low frequencies. For this reason the low frequency response of the hearing aid is often boosted
relative to the high frequency response to compensate for poor coupling in the low frequency
(Compton, 1994). Chowdhury, Manjula and Abraham (2007) also reported findings
similar to those of Yanz and Preves (2003); that is, they reported inductive coupling to be greater
in the high frequencies than in the low frequencies. Takahashi (2005) studied the effect of
different telecoil settings on word recognition scores. The two settings used were a default
setting, obtained by simply switching to the telecoil mode without making any modifications, and
a modified telecoil setting, obtained by changing the telecoil settings in an attempt to match
the acoustic frequency response target.
The present study aimed at:
1. Evaluating the optimal settings of the telecoil by comparing the performance in
terms of default and modified telecoil settings.
2. Finding out which input method of measuring the electroacoustic performance of the
telecoil of a hearing aid, that is, ‘teleboard’ or ‘telewand’, would result in better
speech performance.
Method
Participants
Fifteen adult subjects ranging in age from 15 to 55 years (mean age of 34.13
years), all native and fluent speakers of Kannada, participated in the study. They had an acquired
sensorineural hearing loss ranging from moderate to moderately-severe degree in the test ear.
Instrumentation
1. A calibrated diagnostic sound field audiometer was used for estimation of pure tone
thresholds, unaided and aided performance.
2. A CD player connected to a calibrated sound field audiometer for playing the speech material
in the unaided and aided testing condition.
3. A calibrated middle ear analyzer used for confirming normal functioning of the middle ear.
4. A digital behind-the-ear hearing aid with programmable telecoil, with a fitting range to suit
the degree of hearing loss of the participants. Appropriate sized ear tips to fit the test ears of
the participants were used.
5. A calibrated hearing aid analyzer to measure the root mean square (RMS) output of the
hearing aid in the microphone mode and for evaluating the electroacoustic performance using
both teleboard and telewand options.
6. Hipro and a personal computer with software to program the digital hearing aid.
7. Two landline telephones, one for sending and one for receiving the speech material.
Speech material used
Four different lists of phonemically balanced Kannada words developed by Yathiraj and
Vijaylakshmi (2005) were recorded onto an audio CD. The recorded word lists were presented at
a moderate level approximating the normal conversational level through the CD player via the
telephone. The telephone receiving the speech material was held/oriented by the participant in
such a way that it provided the best signal. This was achieved by asking the participant to adjust
the placement of the telephone during an informal talk over the telephone prior to beginning with
the test material.
Instructions
The participants were instructed to repeat the words they heard during the testing in free-
field condition with the hearing aid in microphone mode and over telephone in telecoil mode.
Test environment
The test was conducted in a sound treated double room with ambient noise levels within
permissible limits for testing in the microphone mode. A quiet environment, free of electromagnetic
disturbances, especially those caused by fluorescent lights and power lines, was used for testing
in the telecoil mode.
Procedure
Pure-tone thresholds were obtained using the modified Hughson-Westlake procedure (Carhart
& Jerger, 1959), across octave frequencies from 250 to 8000 Hz for air conduction and 250 to
4000 Hz for bone conduction. Tympanometry and reflexometry were done to rule out any middle
ear pathology. The testing procedure consisted of the following five stages:
Stage I: Pre-selection and programming of the hearing aid in the microphone mode
The digital hearing aid was connected through a Hipro to the personal computer (PC)
that had the software for programming. After the hearing thresholds were fed into the software
(NOAH 3.0 and Connex 5), the digital hearing aid was programmed based on the NAL-NL1
prescriptive procedure. The gain at different frequencies was set as per the hearing loss of each
participant by fine tuning the hearing aid in the microphone mode. Different program settings, such
as listening in quiet for program 1 and telecoil mode for program 2, were activated. This was
done for each participant.
Stage II: Measurement of the output of the hearing aid in microphone or acoustic mode
and measurement of the speech recognition scores
The output characteristic of the hearing aid was measured with the help of a hearing aid
analyzer in the microphone mode. The participant was seated comfortably in the sound treated
audiological test room. The speakers of the audiometer were located at a distance of one meter from
the participant, at 45° azimuth from the test ear. The Speech Recognition Scores (SRS) for
Kannada PB word lists (Yathiraj & Vijaylakshmi, 2005) for each participant were noted down in
the unaided sound field and aided sound field conditions.
Stage III: Programming the telecoil of the digital behind-the-ear hearing aid in default and
modified settings
The hearing aid was programmed in two settings for the telecoil program:
1. As suggested by the programming software when the audiogram of a particular configuration was
plotted, using the prescriptive formula given by NAL-NL1. This was referred to as the ‘default’ program
of the telecoil.
2. With ‘modified’ settings based on NAL-NL1 and on the feedback given by the
participant, that is, the ‘modified’ program of the telecoil. The gain at different
frequencies was increased in the ‘modified’ program such that the gain in the T-mode almost matched
that of the target gain curve in the microphone mode. That is, the gain in the telecoil mode was
matched as closely as possible to the target gain in the acoustic mode.
For each participant, electroacoustic performance and speech recognition scores were
measured in the ‘default’ and ‘modified’ telecoil settings.
Stage IV: Measurement of electroacoustic performance in default and modified settings,
with teleboard and telewand
Electroacoustic measurement with the two program settings (i.e., default and modified)
was carried out. Teleboard and telewand were used as the input for creating the magnetic field
while carrying out electroacoustic measurements. The hearing aid analyzer was used for the
purpose. The sound chamber of the hearing aid test system has a telecoil board that was used to
measure the hearing aid performance with telecoil setting. The telewand is a device that is
supposed to provide a more realistic test of the telecoil features of a hearing aid than the
teleboard because it more closely simulates the magnetic field strength produced by a telephone
receiver. The telewand was connected to hearing aid analyzer instead of the teleboard to serve as
input to the hearing aid.
Stage V: Measurement of the Speech Recognition Scores (SRS) over the telephone with
hearing aid telecoil programmed to default and modified settings
The performance of the participants, in terms of SRS, over the telephone was evaluated in
the following two conditions:
1. Aided, over telephone, with the telecoil of the hearing aid programmed to the ‘default’ setting.
2. Aided, over telephone, with the telecoil of the hearing aid programmed to the ‘modified’
setting.
In the first condition, the speech recognition score was obtained with the hearing aid in
the telecoil mode, set at the default program. The participant was asked to repeat the words heard
over the telephone. The number of words repeated correctly was scored. Each word repeated
correctly was given a score of one, the maximum score being 25, as the list consisted of 25
words. In the second condition, the speech recognition score was obtained with the hearing aid in
the telecoil mode set at the modified program. A procedure similar to that in the first condition was
followed and the responses were scored. The first and the second conditions were repeated for all
the participants.
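The scoring rule described above (one point per correctly repeated word, maximum 25) can be written as a short Python sketch. The function name is illustrative, and the word list below uses placeholder strings rather than the actual Kannada PB words; a response of None stands for a word that was not repeated.

```python
from typing import Optional, Sequence


def speech_recognition_score(
    presented: Sequence[str], repeated: Sequence[Optional[str]]
) -> int:
    """Give one point for each presented word that was repeated correctly."""
    if len(presented) != len(repeated):
        raise ValueError("one response (or None) is needed per presented word")
    score = 0
    for target, response in zip(presented, repeated):
        # Compare case-insensitively, ignoring surrounding whitespace.
        if response is not None and response.strip().lower() == target.strip().lower():
            score += 1
    return score


if __name__ == "__main__":
    # Placeholder 25-word list (hypothetical, not the actual test material).
    word_list = [f"word{i:02d}" for i in range(1, 26)]
    responses = list(word_list)
    for missed_index in (3, 9, 17, 22):
        responses[missed_index] = None  # words the listener did not repeat
    print(speech_recognition_score(word_list, responses))
```

The same function would be applied once per condition (default and modified telecoil settings), giving two scores out of 25 for each participant.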
These five stages were repeated for all participants for data collection. The data was
subjected to statistical analysis.
Results and Discussion
To investigate the aims of the present study, statistical analysis using SPSS software
(version 10.0) was carried out for the data obtained. The following statistical analyses were
carried out:
1. Comparison between Moderate and Moderately-Severe Groups
The following table (Table 1) shows the mean and standard deviation (SD) values of SRS and
RMS output (dBSPL) for the participants in the moderate and moderately-severe groups.
Table 1: Mean and SD values of SRS (max. score 25) and RMS output (dBSPL)
Conditions                                     Mean (SD)
                                               Moderate (N = 7)    Mod.-Severe (N = 8)
SRS (Max score = 25)
1. in unaided condition                        10.71 (2.14)        8.75 (1.49)
2. with HA in microphone mode                  23.71 (1.25)        22.38 (2.26)
3. with HA in T-coil default                   18.14 (3.13)        15.88 (2.53)
4. with HA in T-coil modified                  22.00 (2.45)        20.63 (2.62)
RMS output (dB SPL)
1. HA in microphone mode                       92.76 (5.57)        104.03 (4.18)
2. HA in T-coil default mode with teleboard    65.66 (6.92)        68.38 (7.03)
3. HA in T-coil modified with teleboard        79.27 (5.47)        85.25 (6.10)
4. HA in T-coil default mode with telewand     77.66 (5.06)        78.88 (5.82)
5. HA in T-coil modified with telewand         92.13 (5.03)        98.88 (4.79)
From Table 1 it can be noted that the mean SRS was higher for subjects with moderate
hearing loss than for subjects with moderately-severe hearing loss in both the unaided and aided
conditions. The fact that individuals with a greater degree of hearing impairment have more
difficulty in the perception of speech has been well documented. The mean values of RMS output
were higher for the hearing aids programmed for moderately-severe hearing loss than for those
programmed for moderate hearing loss. This is because the target gain required for the moderately-severe
hearing loss category was more than that needed for the moderate hearing loss
category.
The Mann-Whitney U test (non-parametric equivalent of the independent t-test) was
performed to analyze the significance of difference between the moderate and moderately-severe
Dissertation Vol.V, Part-A, AIISH, Mysore
47
groups, in SRS and RMS output. The test revealed no significant difference, even at 0.05 level of
significance, between the mean values of SRS and RMS output, in moderate and moderately-
severe groups. Hence, for further analysis, the scores of the subjects with moderate and
moderately-severe degree of hearing loss were combined to form one single group. Table 2
depicts the mean and standard deviation values of SRS when the groups were combined. The
Table 2 shows that the mean SRS values in the unaided condition was the lowest and those in the
aided conditions were higher. Among the aided mean SRS values the SRS was highest in
microphone mode and least when the telecoil was programmed to „default‟ mode.
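The group comparison described above can be illustrated with a minimal sketch of how the Mann-Whitney U statistic is obtained from pooled ranks. This is not the SPSS procedure used in the study; the function name and the example scores are hypothetical, and the individual participant scores are not reported in the article.

```python
def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two independent samples, using average ranks for ties."""
    pooled = sorted(list(group_a) + list(group_b))
    # Assign each distinct value the mean of the 1-based ranks it occupies.
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        i = j
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(rank_of[x] for x in group_a)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2.0
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Hypothetical unaided scores (out of 25) for 7 and 8 listeners; NOT the study data.
moderate = [9, 10, 11, 12, 8, 13, 12]
mod_severe = [8, 9, 7, 10, 9, 8, 11, 8]
u_mod, u_sev = mann_whitney_u(moderate, mod_severe)
print("U (moderate) =", u_mod, "U (mod.-severe) =", u_sev)
```

The smaller of the two U values is then compared with the critical value for the two group sizes (here 7 and 8) to decide whether the groups differ.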
Table 2: Mean and SD values of SRS (max. score = 25) when scores from moderate and
moderately-severe groups were combined (N = 15) in different conditions
Conditions Mean SD
Unaided SRS in free field 9.67 2.02
SRS with HA in microphone mode 23.00 1.93
SRS with HA in telecoil default mode 16.93 2.96
SRS with HA in telecoil modified mode 21.27 2.55
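As a quick arithmetic check, the combined means in Table 2 should equal the size-weighted average of the two group means in Table 1. The sketch below uses only the values printed in Table 1; the function name is an illustrative assumption, and small discrepancies of about 0.01 are expected because the group means are already rounded.

```python
def combined_mean(mean_a, n_a, mean_b, n_b):
    """Size-weighted mean of two groups."""
    return (mean_a * n_a + mean_b * n_b) / (n_a + n_b)


# (condition, moderate mean with N = 7, moderately-severe mean with N = 8), from Table 1
TABLE_1_SRS = [
    ("Unaided", 10.71, 8.75),
    ("Microphone mode", 23.71, 22.38),
    ("Telecoil default", 18.14, 15.88),
    ("Telecoil modified", 22.00, 20.63),
]

for condition, mean_mod, mean_sev in TABLE_1_SRS:
    pooled = combined_mean(mean_mod, 7, mean_sev, 8)
    print(f"{condition}: combined mean SRS = {pooled:.3f}")
```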
2. Comparison between SRS in different conditions, i.e., unaided SRS, SRS with hearing
aid in microphone mode, SRS with hearing aid in telecoil default mode and SRS with
hearing aid in telecoil modified mode
[Figure 1 plot: x-axis, Hearing Aid Conditions (Unaided, Microphone, Telecoil default, Telecoil modified; N = 15 in each); y-axis, Mean SRS and 95% confidence interval (0 to 30)]
Fig.1: Mean SRS, with 95% confidence interval, in unaided and aided (microphone, telecoil default and telecoil modified) conditions
One-way repeated measures analysis of variance (ANOVA) was administered to check the significance of the difference between the unaided condition, the hearing aid in the microphone mode, the hearing aid in the telecoil default mode and the hearing aid in the telecoil modified mode. A significant effect of condition was found, with F (3, 42) = 325.87, p < 0.01. Bonferroni's multiple comparison test was carried out to examine the pair-wise differences. It revealed that all the conditions, i.e., unaided SRS, SRS with HA in microphone mode, SRS with HA in telecoil default mode and SRS with HA in telecoil modified mode, were significantly different from one another at the 0.01 level of significance.
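The degrees of freedom reported for the ANOVA and the number of pair-wise comparisons follow directly from the design (4 conditions, 15 participants). The sketch below shows that bookkeeping; the function names are illustrative, and dividing the family-wise alpha by the number of comparisons is the textbook form of the Bonferroni correction, assumed here rather than taken from the SPSS output.

```python
from itertools import combinations


def rm_anova_df(n_conditions, n_subjects):
    """Degrees of freedom (effect, error) for a one-way repeated measures ANOVA."""
    df_effect = n_conditions - 1
    df_error = (n_conditions - 1) * (n_subjects - 1)
    return df_effect, df_error


def bonferroni_alpha(family_alpha, n_comparisons):
    """Per-comparison alpha that keeps the family-wise error rate at family_alpha."""
    return family_alpha / n_comparisons


conditions = ["Unaided", "Microphone", "Telecoil default", "Telecoil modified"]
pairs = list(combinations(conditions, 2))

print("df:", rm_anova_df(len(conditions), 15))
print("pair-wise comparisons:", len(pairs))
print("per-comparison alpha:", bonferroni_alpha(0.01, len(pairs)))
```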
Figure 1 shows that among the aided conditions, the microphone mode had the highest mean SRS, followed by the telecoil modified mode and then the telecoil default mode. The unaided condition had the lowest mean SRS. The reason for the poorer SRS with the hearing aid in the telecoil default mode is that inductive coupling is poor in this mode, especially in the low frequencies. Yanz and Preves (2003) have also reported that inductive coupling is better in the high frequencies than in the low frequencies.
Chowdhury, Manjula and Abraham (2007) found results similar to those of Yanz and Preves (2003). They found that inductive coupling was poorer in the low frequencies and hence that gain should be increased in the low frequencies for better inductive coupling and thus better SRS. Although the SRS in the telecoil modified mode never equalled or exceeded that in the microphone mode in the present study, in most cases the scores in the telecoil modified mode came close to those in the microphone mode. The residual difference could be because the range of frequencies transmitted through the telephone is different, and hence appropriate changes are required in the 'telecoil modified' program in order to equate the performance to that of the hearing aid in microphone mode.
Table 3: Comparison of SRS within different pairs of SRS in aided condition using microphone
and telecoil (default and modified) settings
Pair     SRS in                                        Correlation (r)   Paired difference in mean SRS   t(14)
Pair 1   HA-Microphone Mode - HA-Telecoil Default      0.63**             6.07**                          10.02
Pair 2   HA-Microphone Mode - HA-Telecoil Modified     0.77**             1.73**                           4.13
Pair 3   HA-Telecoil Default - HA-Telecoil Modified    0.88**            -4.33**                         -12.01
Note: ** p < 0.01
Table 3 depicts the results of the paired t-test. The test revealed that although the pairs were highly correlated, there was a significant difference within each pair at the 0.01 level of significance. The mean difference was greater in pair 1 (hearing aid microphone mode and hearing aid telecoil default mode) than in pair 2 (hearing aid microphone mode and hearing aid telecoil modified mode). That is, when the telecoil gain in the modified setting was increased and matched to the target gain curve of the microphone mode, the SRS improved and its mean difference from the microphone mode was reduced. Thus, from the above findings it can be concluded that increasing the gain of the telecoil, especially in the low frequency region, can lead to better SRS. Hence it is recommended that the telecoil gain be manipulated in order to achieve better speech recognition scores while using the telephone.
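The t values in Table 3 can be approximately recovered from the summary statistics alone, since the variance of a paired difference depends on the two SDs and their correlation. The sketch below is a consistency check under that standard identity, not the authors' computation; the function name is illustrative, the SDs are those of Table 2, and small deviations from the printed t values are expected because r and the SDs are rounded to two decimals.

```python
import math


def paired_t_from_summary(mean_diff, sd_1, sd_2, r, n):
    """Paired t statistic from the mean difference, the two SDs and their correlation."""
    # Var(X - Y) = Var(X) + Var(Y) - 2 * r * SD(X) * SD(Y)
    var_diff = sd_1 ** 2 + sd_2 ** 2 - 2.0 * r * sd_1 * sd_2
    standard_error = math.sqrt(var_diff / n)
    return mean_diff / standard_error


# SDs from Table 2 (N = 15): microphone 1.93, telecoil default 2.96, telecoil modified 2.55
t_pair_1 = paired_t_from_summary(6.07, 1.93, 2.96, 0.63, 15)
t_pair_2 = paired_t_from_summary(1.73, 1.93, 2.55, 0.77, 15)
t_pair_3 = paired_t_from_summary(-4.33, 2.96, 2.55, 0.88, 15)
print(round(t_pair_1, 2), round(t_pair_2, 2), round(t_pair_3, 2))
```

The higher the correlation between two conditions, the smaller the variance of their difference, which is why pair 3 yields a large t value despite a moderate mean difference.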
3. Comparison within pairs of RMS Output of the Hearing Aid Programmed in Different
Modes and Settings:
Table 4: Comparison of selected pairs of RMS output of hearing aid in different conditions
Pairwise comparison of RMS output of HA                   t(14)
1. Microphone vs. Teleboard default mode                  15.69**
2. Microphone vs. Teleboard modified mode                  8.19**
3. Microphone vs. Telewand default mode                   10.77**
4. Microphone vs. Telewand modified mode                   1.72
5. Teleboard default vs. Teleboard modified mode           5.76**
6. Telewand default vs. Telewand modified mode             8.08**
7. Teleboard default vs. Telewand default mode             9.95**
8. Teleboard modified vs. Telewand modified mode          10.93**
Note: ** Significant at 0.01 level
The results of the paired t-test revealed that seven of the eight pairs were significantly different at the 0.01 level of significance. The exception was Pair 4, that is, RMS output in microphone mode versus RMS output in telecoil modified mode with the telewand as the input source. In other words, the RMS output in the telecoil modified mode was nearly equal to that in the microphone mode when the measurement was made with the telewand as the input.
4. Test of Significance of Correlation Coefficient between RMS Output and SRS
The relationship between the RMS output in different hearing aid settings and the SRS in
the respective hearing aid settings was measured using Pearson's correlation coefficient (r).
Table 5: Pearson's correlation coefficient (r) value between RMS output (dB SPL) and SRS in
different hearing aid (HA) settings
RMS Output of HA in SRS with HA in r
Microphone mode Microphone mode -0.30
Telecoil default (Teleboard) Telecoil default 0.30
Telecoil default (Telewand) Telecoil default 0.29
Telecoil modified (Teleboard) Telecoil modified -0.25
Telecoil modified (Telewand) Telecoil modified -0.12
Table 5 shows that the correlation between the RMS output and the SRS was low and not significant in all the hearing aid settings. That is, an increase in the RMS output of the hearing aid was not accompanied by an increase in the speech recognition scores of the participants at the different settings of the hearing aid. This may be explained by the phenomenon called speech level distortion, according to which not only audibility but also the speech perception ability of individuals with hearing impairment is affected by high presentation levels. In such conditions, a level of just 10 to 20 dB SL may yield a better speech recognition score, but a further increase may actually lead to poorer speech intelligibility. Hence, the findings may be attributed to this phenomenon.
Table 6: Pearson's correlation coefficient (r) value between RMS output of the hearing aid in telecoil default and modified mode using teleboard and telewand
RMS Output in                     RMS Output in                     r
Telecoil default (Teleboard)      Telecoil default (Telewand)       0.77**
Telecoil modified (Teleboard)     Telecoil modified (Telewand)      0.71**
Note: ** Correlation is significant at 0.01 level
Table 6 shows that there was a highly significant correlation between the RMS output measured using the teleboard and that measured using the telewand, in both the default and modified settings. That is, an increase in RMS output with the teleboard was reflected in an increase in RMS output with the telewand. However, the RMS output measured using the telewand was always slightly higher than that measured with the teleboard. The reason for this is that the telewand is a device intended to provide a more realistic simulation of the telecoil features of a hearing aid than the built-in telecoil board in the sound chamber, because it more closely simulates the magnetic field produced by a telephone receiver.
To recapitulate the findings, the Mann-Whitney U test showed no significant difference between the moderate and moderately-severe groups when tested in the different aided conditions. With regard to the main aim of the study, which was to determine whether the telecoil default mode or the telecoil modified mode gives the higher SRS, it was found that the participants performed significantly better in the telecoil modified mode than in the telecoil default mode. This was evident from the mean and standard deviation values of SRS in these modes. Next, a comparison of the RMS output of the hearing aid in different modes was made using the paired t-test. This revealed that all the pairs were significantly different at the 0.01 level of significance, except for one, that is, RMS output in microphone mode versus RMS output in telecoil modified mode using the telewand as the input magnetic source.
To evaluate the second aim of the study, Pearson's correlation between the RMS output and the SRS in the different hearing aid settings was computed. It revealed that there was no significant correlation between the RMS output and the SRS in any of the settings of the hearing aid.
References
Carhart, R. & Jerger, J. (1959). Preferred method for clinical determination of pure tone thresholds.
Journal of Speech and Hearing Disorders, 24, 330-345.
Compton, C. (1994). Providing effective telecoil performance with in-the-ear hearing instruments. The
Hearing Journal, 47, 23-26.
Chowdhury, S., Manjula, P. & Abraham, K. (2007). Programming Modifications for Optimal Inductive
Coupling in Hearing Aids. Presented in Indian Speech and Hearing Association, 39th Annual
Conference, January 19-21, Kerala, Calicut.
Davidson, S.A. & Noe, C.M. (1994). Digitally programmable telecoil responses: Potential advantages for
assistive listening device fitting. American Journal of Audiology, 3, 59-64.
EIA Standards RS-504 (1988). Magnetic Field Intensity Criteria for Telephone Compatibility with
Hearing Aids. Washington, DC: Electronic Industry Association.
Kochkin, S. (2002). MarkeTrak VI: Consumers rate improvements sought in hearing
instruments. Hearing Review, 9, 18-22.
Levitt, H. (2001). The nature of electromagnetic interference. Journal of American Academy of
Audiology, 12, 322-326.
Lybarger, S. (1947). As cited in Yanz, J.L. & Preves, D. (2002). Telecoils: principles, pitfalls, fixes, and the
future. Seminars in Hearing, 24, 1:29-41.
Revit, L. (1996). Proper use of telecoils: it's as easy as 1, 2, 3. . . 4! Available at
www.frye.com/library/application/larry.html
Ross, M. (2005). Telecoils: Issues and Relevancy. Seminars in Hearing, 26, 2, 99-108.
Skinner, M.W. (1988), Hearing Aid Evaluation. Englewood Cliffs. NJ: Prentice Hall, 222-223.
Smith, G. (1971). Coupling hearing aids to the telephone. The Volta Review 1, 47-50.
Takahashi, G. (2005). Programming the Telecoil: A Case Study. Seminars in Hearing, 26, 2:109-113.
Yanz, J.L. & Preves, D. (2003). Telecoils: principles, pitfalls, fixes and the future. Seminars in Hearing,
24, 29-41.
Yathiraj, A. & Vijaylakshmi, C.S. (2005). Phonemically Balanced word list in Kannada. Developed in
Dept. of Audiology, AIISH.
Estimation of Auditory Thresholds in Cochlear Implant Subjects
Using ASSR
Dayal Goswami & Rajalakshmi K
Abstract
The aims of this work were to characterize the electrophysiologic response obtained by
measurement of the Auditory Steady State Response (ASSR) in patients with a cochlear implant
and to study the relationship between the subjective thresholds of the implantees and those
estimated using acoustical auditory steady-state response-based objective audiometry. Nine
subjects were examined with the use of four carrier frequencies--500, 1000, 2000 and 4000 Hz--
modulated at frequencies between 78 and 95 Hz. Free field behavioral threshold estimation was
carried out using standard test protocols. The results revealed that there was a good correlation
between the acoustical ASSR and behavioral thresholds. In addition, there was no statistically
significant difference between the two sets of thresholds, suggesting acceptable accuracy of
behavioral threshold estimation with the use of ASSR. Thus, the ASSR technique shows great promise
as a way to assess auditory sensitivity in subjects with cochlear implants who cannot reliably
respond on behavioral testing. The results of the present study also suggest the need for an
in-depth investigation into the efficacy of some measure of supra-threshold processes, which would
be much more helpful in terms of monitoring device performance and adjusting mapping. Finally, the
results need to be confirmed in a greater number of subjects, not only for further validation of
the method, but also to acquire sufficient data to support the development of an expert system,
allowing the automated assessment of cochlear implantees and the programming of these
processors. The present study highlights the potential of ASSR as an alternative to behavioral
methods in the fitting of young children and other difficult-to-test patients.
Introduction
The scope of cochlear implantation is changing with increased emphasis on early identification and remediation of hearing loss and the establishment of newborn hearing screening programs throughout the world. This has increased the need for reliable, objective techniques for determining candidacy and evaluating cochlear implant efficacy in infants and very young children. The measurement of auditory thresholds or comfort levels for HI subjects with cochlear implants currently requires their attention and active cooperation. Unfortunately, subjective methods are of little or no value for the assessment of very young children and other difficult-to-test populations. The field of clinical objective audiometry has recently gained a new technique that promises to be a valuable addition to the AEP test battery. The ASSR, evoked by continuous amplitude modulated or mixed modulated tones, demonstrates unique characteristics developed primarily to address many of the limitations presented by the most widely used AEP, the auditory brainstem response (ABR). While the 40 Hz response initially kindled interest, its application has been limited by its susceptibility to the state of consciousness (Hall, 1992). Responses at faster modulation rates of 75–110 Hz are not significantly affected by sleep or sedation and reflect essentially the same generators as the ABR (Lins, Picton & Picton, 1995). These higher rates are suitable for audiometric purposes across populations (Lins, Picton, Boucher et al., 1996; Rickards et al., 1994). ASSR can also be used for assessing cochlear implant candidacy. Swanepoel and Hugo (2004) studied the estimation of auditory sensitivity in young cochlear implant candidates using the ASSR. Their preliminary results indicate that absent ABR and behavioral thresholds do not preclude the possibility of residual hearing, making the ASSR a primary source of information regarding profound levels of hearing loss, since the ASSR can be measured even with a 120 dB HL signal.
Reader in Audiology, All India Institute of Speech and Hearing, Mysore, India
e-mail:[email protected]
ASSR has attractive features that make it of potential interest in the assessment of cochlear implant function. It has also been shown that there is a relationship between the thresholds estimated objectively via electrical ASSR measurement and the subjective thresholds of cochlear implantees (Menard et al., 2004). ASSR is very useful for determining the post-implant audiogram and for assessing speech perception through the cochlear implant by objective audiometry. This information may be helpful in the mapping process and in post-implant therapeutic rehabilitation and management.
Threshold estimation is very important in diagnostic as well as rehabilitative audiology. The ASSR is an auditory evoked potential particularly suited to frequency-specific measurement in clinical populations in whom behavioural measures of speech detection and discrimination are difficult to obtain (e.g., infants, children and difficult-to-test populations). Several studies have stated that auditory steady state responses can be used to estimate frequency-specific auditory sensitivity as accurately as other electrophysiological tests such as ABR. These studies reported that there was a good correlation between behavioral thresholds and estimated ASSR thresholds.
Over the years, many studies have demonstrated that steady state response to modulation
frequencies 75-100 Hz can provide reliable estimate of hearing thresholds in children and adults.
In general, the 80 Hz response can be recognized at 15 dB above hearing threshold. Rance and
colleagues (1995) predicted hearing thresholds using ASSR in a sample that included children and
adults who had sensorineural hearing loss of moderate to profound degree. ASSR
thresholds were estimated using tones with mixed modulation frequency of 90 Hz for carrier
frequencies 250 to 4000 Hz. Correlation between pure tone and ASSR thresholds was 0.96 for
250 Hz and as high as 0.99 for 2000 and 4000 Hz. The difference between ASSR threshold and
behavioral threshold decreased with increase in degree of hearing loss.
ASSR gained wider acceptance as a clinical tool after Rance et al. (1998) demonstrated its advantage in determining residual hearing thresholds for those infants and children from whom ABR could not be evoked (at 100 dBnHL) using click stimuli. ASSRs were obtained using mixed modulation for stimulus frequencies of 250 to 4000 Hz with a modulation frequency of 90 Hz. In a sample of 109 children, whose hearing loss ranged from moderate to profound, the average discrepancy between ASSR and behavioral thresholds was only 3 to 6 dB (with a standard deviation of 6 to 8 dB), with larger discrepancies and standard deviations found at 250 Hz and 500 Hz, as in previous studies. ASSR thresholds were within 20 dB of pure tone thresholds for 99 % of comparisons and less than 10 dB for 82 % of subjects.
Cone-Wesson, Dowell, Tomlin, Rance and Ming (2002) reported the findings of two
studies in which the threshold estimates from ASSR tests were compared to those of click- or tone
burst-evoked auditory brainstem responses (ABRs). The first, a retrospective review of 51 cases,
demonstrated that both the click-evoked ABR and the ASSR threshold estimates in infants and
children could be used to predict the pure-tone threshold. The second, a prospective study of
normal hearing adults, provided evidence that the tone burst-evoked ABR and the modulated
tone-evoked ASSR thresholds were similar when both were detected with an automatic detection
algorithm and that threshold estimates varied with frequency, stimulus rate and detection
method. The study illustrates that ASSRs can be used to estimate pure-tone threshold in infants
and children at risk for hearing loss and also in normal hearing adults.
There is an imperative need to estimate the improvement in auditory thresholds and to evaluate the benefit in terms of achieving normal auditory behaviors using frequency-specific objective audiometry in cochlear implant subjects. Such audiometry would assist objectively in the fitting, accurate programming and post-operative evaluation of cochlear implants in young and difficult-to-test populations by providing frequency-specific information. Because ASSR stimuli are similar to speech stimuli, ASSR may also be used to assess speech perception abilities objectively (validating the cochlear implant fitting by showing that stimuli across the speech spectrum evoke a neural response and are therefore likely to be perceived by the subject). The outcomes of this study would also document the efficacy of sound-field ASSR in estimating behavioral thresholds in children with CI.
Aim:
The present study aimed at determining the auditory thresholds in cochlear implant subjects using Auditory Steady State Responses and comparing them with behavioral thresholds, so that this technique may be used for young cochlear implantees from whom behavioral thresholds cannot be obtained.
Method
Participants: In total, 9 subjects (3 females & 6 males) who used a cochlear implant system in one ear participated in this study. Their ages ranged from 4 to 15 years. The subjects were recruited from the Listening Training Unit, Dept of Audiology, AIISH, Mysore. They had been attending listening therapy for 1 to 4 years after implantation and had a recent and stable mapping. The descriptive data of these children are given in Table 1.
Instrumentation: Calibrated diagnostic two channel audiometer was used for estimation of
auditory thresholds for frequencies 500 Hz to 4000 Hz in free field set-up. GSI Audera ASSR
(Version 1.0.2.2) was used for recording ASSR.
Test environment: All the measurements were carried out in an acoustically treated double-room set-up. The ambient noise levels were within the permissible limits according to ANSI (1991).
Procedure: Sound field audiometry
Free field behavioral threshold estimation was carried out in sound treated rooms using
standard test protocols. For this measurement FM tones ranging from 500 to 4000 Hz were
presented through calibrated loudspeakers.
Table 1: Descriptive data of all the children who participated in the study
Serial   Age/sex     Ear of         Age of         Duration of   Strategy   Contour or   Mode of       No. of active   Company
number               implantation   implantation   CI use                   straight     stimulation   electrodes
                                                                                                       inserted
1        7 yrs/M     L              5.5 yrs        1.5 yrs       ACE        Contour      Monopolar     22              N 24 Freedom
2        6 yrs/M     R              4 yrs          2 yrs         ACE        Contour      Monopolar     22              N 24
3        15 yrs/M    R              11 yrs         4 yrs         ACE        Contour      Monopolar     22              N 24
4        4.5 yrs     L              4 yrs          0.5 yrs       ACE        Straight     Monopolar     22              N 24 Freedom
5        7 yrs/M     R              6.5 yrs        0.5 yrs       ACE        Contour      Monopolar     22              N 24
6        7 yrs/M     L              6 yrs          1 yr          ACE        Contour      Monopolar     22              N 24
7        7 yrs/F     R              5.5 yrs        1.5 yrs       ACE        Contour      Monopolar     22              N 24
8        7 yrs/F     R              5 yrs          2 yrs         ACE        Contour      Monopolar     22              N 24
9        7.5 yrs/F   R              6.5 yrs        1 yr          ACE        Contour      Monopolar     22              N 24
The ISO 8253-3 standard prescribes a loudspeaker set-up as shown in Figure 1. The participant was positioned on an axis in front of the loudspeaker from which the test signal was emitted (Figure 1). Noise was emitted from two loudspeakers placed symmetrically at a 45° angle on each side of the subject. The loudspeaker was placed level with the head of the subject in the sitting position. The speaker faced the reference point, which was defined as the midpoint of a straight line between the openings of the subject's ear canals. The distance between the reference point and the loudspeaker was 1 m.
ASSR: Recording of Auditory Steady State Response: ASSRs were recorded using the test
protocol given in the Table 2.
Table 2: ASSR stimulus and recording parameters
Stimulus                                        Recording
Stimuli: AM/FM tones                            Electrode montage: FPz (+), Cz (-) and ground on neck
Carrier frequency: 500, 1k, 2k and 4k Hz        Subject state: awake
Modulation frequency: 78, 81, 88, 95 Hz         Number of samples: maximum of 64
Modulation depth: 100% AM & 10% FM
Transducer: loudspeaker at 45 degree angle
Thresholds were obtained using a bracketing approach. Steps of 10-20 dB were used at higher intensities and steps of 5 dB at lower intensities. The testing was carried out only in the implanted ear. The threshold was defined as the minimum level at which the phase coherence was significant.
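The threshold criterion above rests on deciding whether the response phases across samples are clustered rather than random. The sketch below illustrates one common way to quantify this: squared phase coherence, tested with the first-order Rayleigh approximation p ≈ exp(-N·PC²). It is an illustration under those assumptions and is not claimed to be the algorithm implemented in the GSI Audera; the function names and the example phases are hypothetical.

```python
import math


def phase_coherence_squared(phases_rad):
    """Squared length of the mean unit phasor: 0 for scattered phases, 1 for identical phases."""
    n = len(phases_rad)
    mean_cos = sum(math.cos(p) for p in phases_rad) / n
    mean_sin = sum(math.sin(p) for p in phases_rad) / n
    return mean_cos ** 2 + mean_sin ** 2


def rayleigh_p_value(phases_rad):
    """First-order Rayleigh approximation to the probability of this coherence arising by chance."""
    n = len(phases_rad)
    return math.exp(-n * phase_coherence_squared(phases_rad))


# Hypothetical phases (radians) at the modulation frequency for 16 samples.
locked = [0.5] * 16                                   # perfectly phase-locked response
scattered = [2 * math.pi * k / 16 for k in range(16)]  # phases spread evenly around the circle

print(phase_coherence_squared(locked), rayleigh_p_value(locked))
print(phase_coherence_squared(scattered), rayleigh_p_value(scattered))
```

In a bracketing search, a level would count as "response present" when this p value falls below the chosen criterion, and the lowest such level would be taken as the threshold.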
Results
The ASSR determined thresholds and measured behavioral thresholds were statistically
analyzed using descriptive statistics and linear regression. Statistical evaluations were carried out
using SPSS for Windows (Version 14.0).
Initial calculations assessed the mean and standard deviation of the two measures, i.e., behavioral thresholds and auditory steady state response (ASSR) thresholds. Table 3 shows the mean and standard deviation of the ASSR and behavioral thresholds at all the test frequencies.
Table 3: Mean and SD of ASSR thresholds, behavioral thresholds at each test frequency
Frequency (Hz) ASSR threshold (dB HL) Behavioral threshold (dB HL)
500 33.89 ± 4.859 25.56 ± 4.640
1000 36.11 ± 7.407 25.56 ± 4.640
2000 32.78 ± 7.949 24.44 ± 4.640
4000 38.33 ± 7.906 26.67 ± 5.000
[Figure 2 plot: x-axis, Frequency (Hz): 500, 1k, 2k, 4k; y-axis, ASSR - Behavioral threshold (dB), 0 to 20]
Figure 2: Mean differences between auditory steady state responses and behavioural thresholds at each test frequency. Error bars represent the standard error of mean.
It can be observed in Figure 2 that the mean difference is greatest at 4 kHz compared to the other frequencies and smallest at 2 kHz. There was a good correlation (p<0.001) between ASSR thresholds and behavioural thresholds. An independent t-test was run to test the significance of the difference between behavioral thresholds and ASSR thresholds. No significant differences were found between the two measures. Although the mean difference did not reach significance, a difference between behavioral and ASSR thresholds does exist.
Table 4: The correlation and significance values between the ASSR thresholds and behavioral
thresholds at each test frequency
Frequency Pearson's correlation Significance
500 Hz 0.862 0.001**
1000 Hz 0.798 0.010**
2000 Hz 0.894 0.001**
4000 Hz 0.632 0.034*
**. Correlation is significant at the 0.01 level (2-tailed)
*. Correlation is significant at the 0.05 level (1- tailed)
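The significance of a Pearson correlation depends on the sample size as well as on r, through t = r·sqrt(n - 2)/sqrt(1 - r²) with n - 2 degrees of freedom. The sketch below is a standard-library illustration of that relationship for the nine participants of this study; it is not the SPSS computation, the function names are illustrative, and the tail probability is obtained by simple numerical integration of the t density.

```python
import math


def t_statistic(r, n):
    """t statistic for testing a Pearson correlation r from n paired observations."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


def t_pdf(x, df):
    """Probability density of Student's t distribution."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1.0 + x * x / df) ** (-(df + 1) / 2)


def one_tailed_p(t, df, steps=2000):
    """Upper-tail probability P(T > |t|), using Simpson's rule on [0, |t|]."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3.0


n = 9  # number of participants
for r in (0.862, 0.798, 0.894, 0.632):
    t = t_statistic(r, n)
    p1 = one_tailed_p(t, n - 2)
    print(f"r = {r:.3f}: t({n - 2}) = {t:.2f}, one-tailed p = {p1:.3f}, two-tailed p = {2 * p1:.3f}")
```

With only nine participants, a correlation as large as 0.632 gives a one-tailed p of about 0.034, which illustrates how strongly small samples limit the precision of such estimates.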
Table 5: t-value and level of significance between behavioral thresholds and auditory steady state
response thresholds at each of the test frequencies
Frequency (Hz) t- Value Significance
500 2.931 .430
1000 0.638 .220
2000 2.326 .127
4000 0.636 .099
Linear regression analysis was performed at each of the test frequencies to predict behavioural thresholds from ASSR thresholds. Linear regression curves for 500, 1000, 2000 and 4000 Hz, and for all frequencies combined, are shown in Figures 3, 4, 5, 6 and 7 respectively. In Figures 3 through 7 the linear regression is shown as the solid line. The dotted line represents equal values in dB HL. Correlation coefficients (r), regression equations, and the standard error of regression (se) are shown in the upper left of each graph.
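The regression step can be sketched as an ordinary least-squares fit of behavioural threshold on ASSR threshold, followed by prediction for a new ASSR value. The function names and the threshold pairs below are hypothetical, since the individual data and the fitted equations appear only in the figures; the standard error of regression is computed with n - 2 degrees of freedom.

```python
import math


def fit_line(x, y):
    """Least-squares slope, intercept and standard error of regression for y = a*x + b."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(residual_ss / (n - 2))
    return slope, intercept, se


def predict_behavioural(assr_threshold, slope, intercept):
    """Predicted behavioural threshold (dB HL) for a measured ASSR threshold (dB HL)."""
    return slope * assr_threshold + intercept


# Hypothetical ASSR and behavioural thresholds (dB HL) for nine listeners; NOT the study data.
assr = [30, 30, 35, 35, 35, 40, 30, 40, 30]
behavioural = [20, 25, 25, 30, 25, 30, 20, 35, 20]
slope, intercept, se = fit_line(assr, behavioural)
print(f"behavioural = {slope:.2f} * ASSR + {intercept:.2f} (se = {se:.2f} dB)")
print("predicted for ASSR = 35 dB HL:", round(predict_behavioural(35, slope, intercept), 1))
```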
Figure 3: Relationship between ASSR thresholds (x axis) and behavioural thresholds (y axis) at 500 Hz
Figure 4: Relationship between ASSR thresholds (x axis) and behavioural thresholds (y axis) at 1000 Hz
Figure 5: Relationship between ASSR thresholds (x axis) and behavioural thresholds (y axis) at 2000 Hz
Figure 6: Relationship between ASSR thresholds (x axis) and behavioural thresholds (y axis) at 4000 Hz
Figure 7: Relationship between ASSR (x axis) & behavioural thresholds (y axis) at all the test frequencies
Dotted lines represent equal values in dB HL. Linear regression is shown as the solid
lines. Correlation coefficients (r), regression equations and standard error of regression (se) are
shown in the upper right of each graph.
Figure 8: Estimated behavioural thresholds and actual measured behavioural thresholds at 500 Hz
Figure 9: Estimated behavioural thresholds and actual measured behavioural thresholds at 1000 Hz
Figure 10: Estimated behavioural thresholds and actual measured behavioural thresholds at 2000 Hz
Figure 11: Estimated behavioural thresholds and actual measured behavioural thresholds at 4000 Hz
As can be seen in Figures 8 through 11, the behavioral thresholds estimated by the Audera are significantly correlated with the actual measured behavioral thresholds.
Discussion
The aim of the study was to estimate the auditory thresholds in cochlear implanted
children using an objective technique (auditory steady state responses) and compare that with
behavioral thresholds.
Comparison of behavioral and ASSR threshold
The ASSR thresholds obtained in the present study for the subjects with cochlear implants are higher than the measured behavioral thresholds. These results are similar to those of earlier investigators in normal and hearing impaired populations (Aoyagi et al., 1994; Rance et al., 1995; Rickards et al., 1994). The difference between ASSR threshold and behavioral threshold has been reported to be lower in subjects with hearing loss than in normal-hearing subjects. This may be attributed to softness imperception or recruitment (Rance et al., 1995).
The difference between ASSR threshold and behavioral threshold was higher at high frequencies than at low frequencies. These findings are not consistent with earlier findings showing strong correlation at high frequencies and poor correlation at low frequencies in the hearing impaired population (Aoyagi et al., 1994a; Rance et al., 1995; Rickards et al., 1994). The poor correlation at low frequencies in those studies could be because normal biological and environmental noise is concentrated in the low frequencies and might affect ASSR measurement there.
Strength of prediction
Truy et al. (1998), using other electrophysiological methods to estimate behavioral thresholds, observed a linear decrease in the strength of prediction as test frequency increased. In other words, the prediction was more accurate at low frequencies than at high frequencies. The results of the present study followed the same general trend as observed in the study by Truy et al. (1998). However, in the present study, the decrease was not linear: prediction at 2000 Hz was the best of all the test frequencies.
The following explanations may account for why prediction is better at low and mid frequencies than at high frequencies. First, the dynamic range is reduced at high frequencies compared to low frequencies (Menard et al., 2004). Second, it can be hypothesized that because the current level required is lower at low frequencies, threshold prediction is better at low and mid frequencies than at higher frequencies. This might be due to greater neuronal survival of low and mid frequency fibers, leading to better synchrony. Third, Picton et al. (2003) reported that at threshold level there is more jitter in neural responses, so that a larger number of averages is required for the estimation of thresholds. As observed from the threshold (T) levels of the cochlear implant (CI) subjects who participated in the present study, T levels were higher at high frequencies than at low and mid frequencies, which may have contributed to greater threshold variation. However, it is difficult to draw conclusions about the underlying physiology owing to the small number of subjects in the present study.
There was a significant correlation between ASSR and behavioral thresholds, and the
correlation coefficient ranged from 0.632 to 0.894. ASSR detection involves only objective
procedures, which restricts the audiologist's role. Sometimes artifactual responses may be
accepted as true responses in objective procedures. This could lead to spurious results and hence
may be a drawback for the clinical use of ASSR. From the discussion it can be concluded that ASSRs
>70 Hz are efficient in predicting thresholds in the CI population and in estimating the audiogram
configuration of the same population. However, the conclusions should be taken with caution as
the total number of participants was small. It is important to control the subject state while
recording ASSR at higher modulation frequencies.
Conclusions
The results of the present study indicated that there is a good correlation between the
acoustical ASSR and behavioral thresholds. In addition, the accuracy of behavioral threshold
estimation with ASSR in subjects with cochlear implants is acceptable. Thus the ASSR technique
shows great promise as a way to assess auditory sensitivity in subjects with cochlear implants
who cannot reliably respond on behavioral testing. Moreover, the results of the present study
suggest the need for an in-depth investigation into the efficacy of some measure of suprathreshold
processes, which would be much more helpful for monitoring device performance and
adjusting the mapping. Finally, the results need to be confirmed in a greater number of subjects, not only
for further validation of the method but also to acquire sufficient data to support the development
of an expert system allowing the automated assessment of cochlear implantees and the
programming of their processors. The present study highlights the potential of
ASSR as an alternative to behavioral methods in the fitting of young children and other difficult-to-test
patients.
References
Aoyagi, M., Kiren, T., Furuse, H., Fuse, T., Suzuki, Y., Yokota, M. & Koike, Y. (1994a). Pure
tone threshold predicted by 80 Hz amplitude-modulation following response. Acta-
otolaryngologica, (supplement-511): 7-14.
Aoyagi, M., Kiren, T., Furuse, H., Fuse, T., Suzuki, Y., Yokota, M. & Koike, Y. (1994b). Effects
of aging on amplitude-modulation following response. Acta-otolaryngologica,
(supplement-511):15-22.
Aoyagi, M., Suzuki, Y. M. Y., Furuse, H., Watanabe, T. & Tsukasa, I. (1999). Reliability of 80
Hz amplitude- modulation- following response detected by phase coherence. Audiology
and Neuro-otology, 4, 28-37.
Cone-Wesson, B., Parker, J., Swiderski, N. & Rickards, F. (2002). The auditory steady state
response: full-term and premature infants. Journal of the American Academy of
Audiology, 13, 260-269.
Hall III, J. W. (1992). Handbook of Auditory Evoked Responses, Boston: Allyn & Bacon.
Herdmann, A. T. & Stapells, D. R. (2003). Auditory steady-state response thresholds of adults
with sensorineural hearing impairments. International Journal of Audiology, 42, 237-254.
Lins, O. G., Picton, T.W. & Picton, E.W. (1995). Auditory steady state responses to multiple
simultaneous stimuli. Electroencephalography Clinical Neurophysiology, 96, 420-432.
Lins, O. G., Picton, T. W., Boucher, B. L., Durieux-Smith, A., Champagne,S.G., Moran, L. M.,
Perez-Abalo, M. C., Martin, V. & Savio, G. (1996). Frequency specific audiometry using
steady state response. Ear and Hearing, 17, 81-96.
Ménard, M., Gallego, S., Truy, E., Berger-Vachon, C., Durrant J. D. & Collet, L. (2004)
Auditory steady-state response evaluation of auditory thresholds in cochlear implant
patients. International Journal of Audiology, 43 (Suppl 1: S39-43).
Picton, T.W., Skinner, C.R., Champagne, S.C., Kellett, A.J. & Maiste, A.C. (1987). Potentials
evoked by the sinusoidal modulation of the amplitude or the frequency of the tone.
Journal of the Acoustical Society of America, 82: 165-178.
Picton, T. W., Durieux-Smith, A., Champagne, S. C., Whittingham, J., Moran, L. M., Giguère,
C. & Beauregard, Y. (1998). Objective evaluation of aided thresholds using auditory
steady-state responses. Journal of the American Academy of Audiology, 9, 315–331.
Picton, T. W., John, M. S. & Dimitrijevic, A. (2002). Possible role for auditory steady state
responses in identification, evaluation and management of hearing loss in infancy.
Audiology Today, 14, 29-34.
Picton, T. W., John, M. S., Dimitrijevic, A. & Purcell, D. (2003). Human auditory steady state
responses. International Journal of Audiology, 42, 177-219.
Rance, G., Rickards, F. W., Cohen, L. T., De Vidi, S. & Clark, G. M. (1995). Automated
prediction of hearing thresholds in sleeping subjects using auditory steady-state evoked
potentials. Ear and Hearing, 16, 499-507.
Rance, G., Dowell, R. C., Rickards, F. W., Beer, D. E. & Clark, G. M. (1998). Steady state
evoked potential and behavioral thresholds in a group of children with absent click
evoked auditory brainstem response. Ear and Hearing, 19, 48-61.
Rance, G., Richard, R., Lindsay, S., Lisa-Jane, M., Christane, P., Melissa, D. & Therese, K.
(2005). Hearing threshold estimation in infants using auditory steady state responses.
Journal of the American Academy of Audiology, 16, 291-300.
Rickards, F. W., Tan, L. E., Cohen, L. T., Wilson, O. J., Drew, J. H. & Clark, G.M. (1994).
Auditory steady state evoked potentials in newborns. British Journal of Audiology, 28,
327-337.
Rickards, F. W. & Clark, G.M. (1998). Steady- state potentials to amplitude modulated tones.
Journal of American Academy of Audiology, 9, 163-168.
Small, A.S. & Stapells, D. R. (2004). Artifactual responses when recording auditory steady state
responses. Ear and Hearing, 25(6), 611-623.
Stapells, D. R., Galambos, R., Costello, J. A. & Makeig, S. (1988). Inconsistency of auditory middle
latency and steady state responses in infants. Electroencephalography Clinical
Neurophysiology, 71, 289-295.
Stapells, D.R., Picton, T.W., Perez-Abalo, M., Read, D. & Smith, A. (1985). Frequency
specificity in evoked potential audiometry. In J.T. Jacobson (Ed.), The Auditory
Brainstem Response (pp.147-177). San Diego: College Hill press.
Stapells, D. R., Linden, D., Sufeld, B., Hamel, G. & Picton, T. W. (1984). Human auditory steady
state potentials. Ear and Hearing, 5, 105-113.
Swanepoel, D. W. & Hugo, R. (2004). Estimations of auditory sensitivity for young cochlear
implant candidates using the ASSR: preliminary results. International Journal of
Audiology, 43, 377-382.
Truy, E., Gallego, S., Frachet, B., Micheyl, C. & Collet, L. (1998). Correlation between electrical
auditory brainstem response and perceptual thresholds in diagnostic cochlear implant
users. Laryngoscope, 108, 554-559.
A search for possible pathways for later peaks of VEMP and N3
potential
Deepashri Agrawal & Animesh Barman
Abstract
The VEMP is an inhibitory potential recorded from the sternocleidomastoid (SCM)
muscle in response to loud sounds. There are four peaks in the VEMP, which are labeled
according to their latencies as P13, N23, N34 and P44. Waves P13-N23, which are possibly saccular in
origin, have been studied extensively owing to their higher response rate in normals,
whereas N34-P44, which are believed to be of cochlear origin, have scarcely been explored, so
their clinical significance has been ignored. Another potential thought to
originate from the vestibular system is the N3 potential, a negative peak at 3 msec in the ABR recording. The aim of
the study was to examine the relation between the later peaks of VEMP and the N3 potential and also
to explore the possible routes for generation of these potentials. Two groups of subjects
participated in the study. The first group (control group) consisted of subjects in the age
range of 16 to 45 years with normal hearing (N=30), and the second group (experimental
group) consisted of 30 subjects with different configurations of sensorineural hearing loss. The
experimental group was further divided into three subgroups based upon their hearing loss.
Results revealed that all four groups differed significantly in terms of the presence of later
peaks of VEMP. One-way ANOVA showed a significant difference among the four groups. The
occurrence of later peaks of VEMP increased with increase in severity of hearing
loss. In only 6.7% of the subjects were the later peaks of VEMP and the N3 potential both absent,
and in 40% of the subjects both were present,
suggesting that there might be some similarity in pathway between the later peaks of
VEMP and the N3 potential. Based on the results and the literature available, it was felt that there could
be three possible pathways for the later peaks of VEMP: first via the vestibulo-cochlear anastomosis,
second through cochlear afferents and third via the olivocochlear bundle. Similarly, for the N3 potential
there could be two sites of origin: the vestibular nucleus and the
cochlear nucleus.
Key words: Later peaks of VEMP, N3 potential, Vestibulo-cochlear anastomosis, cochlear
nucleus, olivocochlear bundle
Introduction
The VEMP by definition is a short-latency electromyogram recorded from the tonically
contracted SCM in response to high-intensity acoustic stimulation (Bickford, Jacobson & Cody,
Lecturer in Audiology, All India Institute of Speech and Hearing, Mysore, India
e-mail: [email protected]
1964; Cody & Bickford, 1969; Colebatch & Halmagyi, 1994). Colebatch et al. (1994) labeled
the serial peaks P13, N23, N34 and P44, based on their latencies. Research studies have
almost solely investigated the P13-N23 complex. In contrast, Colebatch, Halmagyi and Skuse
(1994) and Robertson and Ireland (1995) reported the presence of N34-P44 peaks of VEMP in 60 to
68% of their normal hearing subjects, which has hindered investigation of their clinical significance.
There is another potential which is thought to be vestibular in origin and can be recorded
during ABR recording. This evoked potential, known as the N3 potential, is a large
negative deflection with a latency of 3 ms, and it has been recorded in patients with
peripheral profound deafness (Kato et al., 1998). The later peaks of VEMP are thought to be of
cochlear origin and N3 potentials are thought to be of vestibular origin, though both are
elicited by acoustic stimulation. Thus it is essential to know the anatomical and functional
relationship between the cochlear and vestibular systems that results in the initial peaks of
VEMP being of vestibular origin and the later peaks being of cochlear origin.
If the N34-P44 peaks recorded from the sternocleidomastoid muscle are of
cochlear origin, this implies that the auditory nerve needs to stimulate the vestibular nerve to elicit
responses from the sternocleidomastoid muscle. It would follow that the later potentials are likely
to be absent or reduced in cases with hearing loss. However, there are a few studies in which
N34-P44 potentials were present in individuals with profound hearing loss (Ferber-Viart, Dubreuil &
Duclaux, 1999), which raises the question of whether these potentials are really of cochlear origin.
This could be understood if the study is carried out in subjects with normal hearing and in subjects with hearing loss.
Kato et al. (1998) studied the N3 potential in cases with profound hearing loss, and the presence of this
negative peak confirms the vestibular system, especially the saccule, as a site of origin, whereas the N34-P44
peaks of VEMP recorded from the SCM muscle are thought to be of cochlear origin.
Hence it was suggested that there could be some relation between the two potentials. If there
is a relationship between the two, then what could be the possible pathway for both the later peaks of
VEMP and the N3 potential? How can these potentials help in understanding vestibulo-cochlear
nerve pathology?
To find answers to these questions, the present study was taken up with the
following aims:
To study the relationship between the severity of hearing loss and N34-P44 potentials of VEMP
To understand the relationship between N3 potentials and VEMP
To understand the possible route for later peaks of VEMP and the possible route for the N3 potential
To understand the structural and functional relationship between the Vestibulo-cochlear nerve root
Method
Subjects
The present study comprised two major subject groups, the control group and the clinical
group. The control group comprised 30 (60 ears) normal hearing individuals in the age range
of 16 to 45 years. The clinical group consisted of 35 subjects with sensorineural hearing loss in
the age range of 16 to 45 years. This clinical group was further subdivided into three groups
based on the severity of hearing loss:
Group A: 10 (20 ears) individuals with mild sensorineural hearing loss
Group B: 10 (20 ears) individuals with severe sensorineural hearing loss
Group C: 15 (30 ears) individuals with profound hearing loss
Subject selection criteria
Control group:
1. All the subjects had normal hearing sensitivity, with pure tone thresholds within 15 dB HL
at octave frequencies from 250 to 8000 Hz in both ears.
2. All of them showed 'A' type tympanograms with normal reflexes in both ears.
3. They did not have any history or presence of otological problems.
4. They reported no complaints of giddiness, vertigo, balance problems, spondylitis or high
blood pressure.
5. All the subjects had a UCL (uncomfortable level) above 105 dB HL.
Clinical group:
1. Subjects with sensorineural hearing loss of varying degrees, based on pure tone thresholds, were
chosen.
2. Attempts were made to rule out space occupying lesions based on ABR results or
neurological assessment as and when required.
3. They had 'A' type tympanograms with present, elevated or absent acoustic reflexes in
both ears.
4. Subjects did not have any history of middle ear pathology.
5. Subjects had a UCL above 105 dB HL, as confirmed by UCL testing.
Instrumentation:
The vestibular evoked myogenic potentials and N3 potentials were recorded using the IHS Smart EP
version 2.39 instrument (Intelligent Hearing Systems, Florida, USA).
Procedure
Phase I: Routine evaluations for the selection of subjects
1. A detailed case history was taken for all the individuals to rule out any history or presence of
otological problems, general weakness, giddiness, vertigo, high blood pressure and spondylitis.
2. The pure tone audiometric thresholds were obtained using the modified version of the Hughson and
Westlake procedure (Carhart & Jerger, 1959) with the help of a GSI-61 audiometer.
3. Tympanometry and acoustic reflexes were tested using a GSI Tympstar to rule out the presence
of middle ear pathology.
4. ABR was recorded, after cleaning the electrode sites with skin preparation gel, to rule out the
possibility of a space occupying lesion.
Subjects who fulfilled the selection criteria for either the control group or the clinical group
were considered for the study.
Phase II: Experiment
VEMP: VEMPs were recorded for all the subjects in control and clinical group.
Subjects were instructed:
To sit straight and turn their head to the side opposite the test ear so as to stretch the ipsilateral
sternocleidomastoid muscle.
To close their eyes at the time of recording to avoid interference from oculomotor reflexes.
To avoid extraneous movements of the head, neck and jaw to prevent muscle artifacts.
Prior to the VEMP and N3 potential recordings, the electrode sites were cleaned using skin
preparation gel to reduce the impedance. The electrodes were placed on the respective sites given
in Tables 2 and 3 for VEMP and N3 potential respectively, with conducting paste to improve
the conduction. It was ensured that impedance was within 5 kΩ at each recording site and
inter-electrode impedance was within 3 kΩ to obtain good responses.
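The impedance criteria stated above can be written as a simple acceptance check. The following is only a minimal sketch in Python: the function name, the electrode labels and the example readings are hypothetical, and it assumes that inter-electrode impedance means the absolute difference between the impedances of any two electrodes, which the text does not spell out.

```python
from itertools import combinations

# Limits taken from the text; units are kilo-ohms.
MAX_ELECTRODE_KOHM = 5.0
MAX_INTER_ELECTRODE_KOHM = 3.0


def impedances_acceptable(readings_kohm):
    """Return True if every electrode and every electrode pair is within limits.

    Assumption: inter-electrode impedance is taken as the absolute
    difference between the impedances of two electrodes.
    """
    each_ok = all(z <= MAX_ELECTRODE_KOHM for z in readings_kohm.values())
    pairs_ok = all(
        abs(a - b) <= MAX_INTER_ELECTRODE_KOHM
        for a, b in combinations(readings_kohm.values(), 2)
    )
    return each_ok and pairs_ok


# Hypothetical readings for a three-electrode montage (not data from the study).
example = {"non_inverting": 3.2, "inverting": 2.1, "ground": 4.0}
print(impedances_acceptable(example))
```

A check of this kind would be run before each recording, with the electrode sites cleaned again whenever it fails.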
The protocol proposed by Huang, Young and Cheng (2004) was used in the
present study to record VEMP, as given below:
Table 2: Parameters used for VEMP recording
Electrode montage
    Inverting electrode        Sternoclavicular joint
    Non-inverting electrode    Midpoint of SCM
    Ground electrode           Forehead
Acquisition parameters
    Analysis time              100 msec
    Filter setting             20-2000 Hz
    Amplification              50,000
Stimulus parameters
    Type of stimulus           Clicks
    Repetition rate            5/sec
    Polarity                   Rarefaction
    Intensity                  105 dBnHL
    Stimulus duration          500 µsec
N3 potential: N3 potentials were recorded for individuals with profound hearing loss.
Instructions given to the subjects were as follows:
1. Subjects were asked to sit in a chair and close their eyes.
2. They were told not to stretch their neck and were asked to sit quietly.
3. They were instructed to avoid extraneous movements of the head, neck and jaw to prevent muscle
artifacts.
Table 3: Parameters used to record N3 potential
Electrode montage
    Non-inverting electrode    Vertex
    Inverting electrode        Ipsilateral mastoid
    Ground electrode           Nape of the neck
Acquisition parameters
    Analysis time              10 msec
    Filter setting             100-3000 Hz
    Amplification              10,000
Stimulus parameters
    Type of stimulus           Clicks
    Repetition rate            10/sec
    Polarity                   Rarefaction
    Intensity                  95 dBnHL
    Total no. of stimuli       500
    Stimulus duration          100 µsec
N3 potential was recorded using the protocol proposed by Toshihisa, Iwasaki, Takai
and Takegoshi (2005), as shown in Table 3. For all the subjects, VEMPs and N3
potential were recorded for the right side first and then for the left side. Each
recording was repeated to ensure reproducibility of the responses. For subjects in Group C, N3
potential was recorded after the VEMP recording.
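Two quantities follow directly from the stimulus and acquisition parameters in Tables 2 and 3: the time needed to present one averaged run, and whether the analysis window fits within the interval between successive clicks. The sketch below is illustrative only; the function names are hypothetical and the calculation ignores any artifact rejection or pauses added by the recording software.

```python
def run_time_seconds(n_stimuli, rate_per_s):
    """Time to present n_stimuli clicks at a fixed repetition rate."""
    return n_stimuli / rate_per_s


def window_fits(analysis_ms, rate_per_s):
    """True if the analysis window is no longer than the inter-stimulus interval."""
    inter_stimulus_ms = 1000.0 / rate_per_s
    return analysis_ms <= inter_stimulus_ms


# N3 protocol (Table 3): 500 clicks at 10/sec with a 10 msec window.
print(run_time_seconds(500, 10))   # seconds per averaged run
print(window_fits(10, 10))

# VEMP protocol (Table 2): 100 msec window at 5/sec.
print(window_fits(100, 5))
```

With these parameters each N3 run takes 50 seconds of stimulation, and in both protocols the analysis window is shorter than the interval between clicks, so successive sweeps do not overlap.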
Results
The data were statistically analyzed using one-way ANOVA, Duncan's post-hoc test, the chi-square
test and Cramer's V test. All the statistical analyses were carried out using SPSS 10 software.
1. Relationship between the severity of hearing loss and N34 – P44 potentials of VEMP:
Table 4: Mean and SD of latencies and peak-to-peak amplitude (PP) of later peaks of VEMP
Entity    No. of ears    Group       Mean       SD        Min.     Max.
N34       13             Mild        32.5538    3.1532    28.80    37.60
          14             Severe      35.1714    2.2228    32.00    38.60
          24             Profound    34.7937    3.0226    29.00    39.80
          31             Normal      33.3161    3.1136    27.80    39.20
P44       13             Mild        42.6154    3.4024    37.60    47.20
          14             Severe      46.0143    2.1260    43.00    49.80
          24             Profound    44.8138    3.2994    38.20    50.00
          31             Normal      43.7574    3.1351    38.40    49.00
PP        13             Mild         4.4885    2.2823     2.18     9.32
          14             Severe       4.1950    1.5442     2.09     6.78
          24             Profound     4.4579    1.8855     1.58     8.97
          31             Normal       3.3110    2.7835     0.90    15.00
Table 4 shows that there is a slight variation in the latency and peak-to-peak amplitude of the N34
– P44 peaks of VEMP with varying degrees of hearing sensitivity.
One-way ANOVA was administered to check whether latency and peak-to-peak
amplitude differed significantly across groups, and where ANOVA was significant Duncan's post-hoc
test was done to identify which groups of subjects differed. Results revealed
that the control group and the subgroups of the clinical group were significantly different for the latency of the N34
peak and the P44 peak (p<0.05), but not for the peak-to-peak amplitude of the later peaks of VEMP.
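A one-way ANOVA F ratio can be recomputed from group sizes, means and standard deviations alone, which gives a way to check a summary table against the reported outcome. The sketch below applies the standard between-groups and within-groups sums of squares to the N34 latency rows of Table 4. It is not the SPSS output of the study; the function name is hypothetical, and rounding in the tabulated values will carry through to the result.

```python
# (number of ears, mean latency, SD) for the N34 peak, copied from Table 4.
n34 = {
    "Mild": (13, 32.5538, 3.1532),
    "Severe": (14, 35.1714, 2.2228),
    "Profound": (24, 34.7937, 3.0226),
    "Normal": (31, 33.3161, 3.1136),
}


def anova_from_summary(groups):
    """Return (F, df_between, df_within) from per-group (n, mean, sd)."""
    total_n = sum(n for n, _, _ in groups.values())
    grand_mean = sum(n * m for n, m, _ in groups.values()) / total_n
    # Between-groups sum of squares: spread of group means around the grand mean.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups.values())
    # Within-groups sum of squares: recovered from each group's sample SD.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups.values())
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


f_ratio, df_b, df_w = anova_from_summary(n34)
print(f"F({df_b}, {df_w}) = {f_ratio:.2f}")
```

The resulting F ratio can then be compared with the tabulated critical value for the same degrees of freedom at the chosen alpha level.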
Duncan's post-hoc test showed a significant difference between the data
obtained from individuals with mild hearing loss and those from individuals with profound and severe hearing
loss; the rest of the groups were not different for the N34 peak of VEMP.
Table 5: Duncan's post-hoc test results for N34 peak of VEMP
Groups      No. of ears    Subset for alpha = .05
                           1          2
Mild        13             32.5538
Normal      31             33.3161    33.3161
Profound    24                        34.7937
Severe      14                        35.1714
Sig.                       0.443      0.079
For the P44 peak of VEMP, individuals with mild hearing loss differed significantly from
those with severe and profound hearing loss but not from individuals with
normal hearing, whereas individuals with normal hearing differed significantly from individuals
with severe hearing loss but not from individuals with mild or profound hearing loss (p<0.05).
Table 6: Duncan's post-hoc test results for P44 peak of VEMP
Groups      No. of ears    Subset for alpha = .05
                           1          2          3
Mild        13             42.6154
Normal      31             43.7574    43.7574
Profound    24                        44.8137    44.8137
Severe      14                                   46.0143
Sig.                       .270       .308       .247
Further, the data were analyzed to see if there is any association between the presence or
absence of the N34-P44 wave and hearing sensitivity. The chi-square test
was administered, and results revealed a significant association between the
presence of the N34 – P44 wave of VEMP and the severity of hearing loss, χ2 (1) = 0.464,
(0.05<p<0.1).
On Cramer's V test it was seen that the association was 23%, and the response rates for eliciting
wave N34-P44 in the groups with normal hearing sensitivity and with mild, severe and profound
hearing loss were 51.7%, 65%, 70% and 80% respectively, as seen in the following table.
Table 7: No. of ears and percentage of the presence or absence of later peaks of VEMP
Groups                     Absent    Present    Total
Mild        No. of ears    7         13         20
            %              35.0%     65.0%      100.0%
Severe      No. of ears    6         14         20
            %              30.0%     70.0%      100.0%
Profound    No. of ears    6         24         30
            %              20.0%     80.0%      100.0%
Normal      No. of ears    29        31         60
            %              48.3%     51.7%      100.0%
Total       No. of ears    48        82         130
            %              36.9%     63.1%      100.0%
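The chi-square statistic and Cramer's V for a contingency table such as Table 7 can be recomputed directly from the ear counts. The sketch below uses the textbook Pearson chi-square without continuity correction and the usual definition of Cramer's V; the function names are hypothetical and this is not the SPSS procedure used in the study, so it is offered only as a way to check the counts against the reported statistics.

```python
import math

# Ears with later peaks of VEMP (absent, present), copied from Table 7.
table7 = {
    "Mild": (7, 13),
    "Severe": (6, 14),
    "Profound": (6, 24),
    "Normal": (29, 31),
}


def chi_square(rows):
    """Pearson chi-square for a list of equal-length rows of counts."""
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(col_totals)
    statistic = 0.0
    for row in rows:
        row_total = sum(row)
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return statistic, dof, grand_total


def cramers_v(statistic, n, n_rows, n_cols):
    """Cramer's V: chi-square scaled by sample size and table dimensions."""
    return math.sqrt(statistic / (n * (min(n_rows, n_cols) - 1)))


rows = list(table7.values())
statistic, dof, n = chi_square(rows)
v = cramers_v(statistic, n, len(rows), 2)
print(f"chi-square({dof}) = {statistic:.3f}, Cramer's V = {v:.2f}")
```

On these counts the statistic comes out near 7.49 on 3 degrees of freedom, with Cramer's V near 0.24.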
2. Relationship between N3 potentials and N34 – P44 peaks of VEMPs:
The later peaks of VEMP and N3 potential were recorded in individuals with profound
hearing loss. The mean and the standard deviation along with minimum and maximum values for
N3 potential latency and absolute amplitude are given in the table below:
Table 8: Mean, standard deviation (SD), minimum and maximum values for N3 potential
latency and absolute amplitude in subjects with profound hearing loss
Parameters           Mean       SD        Minimum    Maximum
N3 latency (msec)    3.3033     0.4782    2.28       4.05
Amplitude (µV)       -0.2142    0.3941    0.22       -1.25
It was seen that N3 potential latency had a range of 1.77 msec, and for
absolute amplitude the range was 1.47 microvolts. The chi-square test was done to see if there
is any association between the presence or absence of the N34-P44 peaks of VEMP and the presence
or absence of N3 potential. The chi-square test showed no significant association between the
presence and absence of the two potentials, χ2 (3) = 7.486, (p>0.05). On cross tabulation it
was evident that N3 potential was present in 4 ears when the N34-P44 peaks of VEMP were absent,
whereas 12 ears had absence of N3 potential with presence of later peaks of VEMP.
From Table 9 it is evident that in only 6.7% of the subjects were both the later peaks of VEMP and the N3
potential absent, and in 40% of the subjects both were present.
Thus 46.7% of the population had similar test results and 53.3%
did not show the same findings.
Table 9: Distribution of data for presence/absence of later peaks of VEMP and N3 potential in
subjects with profound hearing loss
Parameter                         N3 absent    N3 present    Total
N34-P44 absent     No. of ears    2            4             6
                   %              6.7%         13.3%         20%
N34-P44 present    No. of ears    12           12            24
                   %              40%          40%           80%
Total              No. of ears    14           16            30
                   %              46.7%        53.3%         100%
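The agreement figures quoted for Table 9 are the share of ears in which the two potentials were either both absent or both present. A minimal sketch of that arithmetic, using the ear counts from the table and hypothetical variable names, is:

```python
# Ear counts from Table 9; keys are (later peaks of VEMP, N3 potential).
table9 = {
    ("absent", "absent"): 2,
    ("absent", "present"): 4,
    ("present", "absent"): 12,
    ("present", "present"): 12,
}

total = sum(table9.values())
# Ears where both potentials gave the same outcome.
agree = sum(count for (vemp, n3), count in table9.items() if vemp == n3)
agreement_pct = 100 * agree / total

print(f"agreement: {agreement_pct:.1f}%, disagreement: {100 - agreement_pct:.1f}%")
```

This reproduces the 46.7% agreement and 53.3% disagreement reported for the profound hearing loss group.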
These results are discussed to understand the possible route of VEMP and the physiological
relationship between the vestibulo-cochlear nerve routes.
Discussion
The results of the present study have shown that there is an association between the presence of
later peaks of VEMP and the severity of hearing loss. Statistical analysis (chi-square test with Cramer's V) showed an
association of 23% between the presence of the N34-P44 wave of VEMP and the severity of hearing loss.
Duncan's post-hoc test did not show any specific trend of increase or decrease in the latencies and
amplitude of later peaks of VEMP in individuals with different hearing sensitivity. It was
seen that only 51.7% of individuals with normal hearing sensitivity had
N34-P44 peaks of VEMP, compared to 80% of individuals with profound hearing
loss. Among the initial findings reported in the literature, Colebatch et al. (1994) could record
later peaks of VEMP from 60% of individuals with normal hearing, whereas Huang et al. (2004)
could record them from 80% of the individuals with normal hearing sensitivity.
In the present study later peaks of VEMP could be recorded from 80% of the individuals
with profound hearing loss, which is much higher than what has been reported in the literature by
Wu and Young (2002), who could record later peaks of VEMP from only 45% of the
individuals with sudden hearing loss. It is also evident from the current study that as the severity
of hearing loss increased, the percentage of presence of later peaks of VEMP also increased.
However, there are no studies available in the literature which have compared the presence of
later peaks of VEMP in populations with different degrees of hearing loss.
The findings of the current study suggest that the later peaks of VEMP may not be of
cochlear origin, as the wave complex was present in 80% of individuals with profound hearing loss.
However, there are reports in the literature which support the view that later peaks of VEMP are
cochlear in origin, as Colebatch et al. (1994) found later peaks of VEMP both before
and after selective vestibular nerve section.
Taken together, these results might suggest that wave N34-P44 may have both a
cochlear and a vestibular origin. The possible pathways for generation of the N34-P44 peaks of VEMP
are discussed below:
The first possible pathway:
There are histopathological and morphological studies which have shown that the
inferior vestibular nerve (saccular nerve) links to the cochlear nerve in the internal acoustic canal,
and this intimate connection was named the vestibulocochlear anastomosis (Oort, 1918;
Rasmussen, 1940; House, 1961; Kim et al., 1998; Nageris, Kalmanowitz, Segal & Frenkiel,
2000; Labrousse, Ouedraogo, Avisse, Chays & Delattre, 2005). It was also found that cochlear
afferent fibers go to the ipsilateral lateral vestibular nucleus (Cazals, Erre & Aurousseau, 1987),
possibly via the vestibulocochlear anastomosis. There are reports which suggest that vestibulospinal
nerve fibers from the medial and lateral vestibular nuclei descend to the SCM muscle via the
MVST and to the leg muscles via the LVST (Colebatch, 1992, 1994).
Figure 1: Possible pathways for generation of later peaks of VEMP (Pathway I and Pathway II)
Figure 1 illustrates the possibility that acoustic stimulation of
the cochlea stimulates the cochlear afferents, which in turn might stimulate the vestibular
nucleus in the brainstem via the vestibulocochlear anastomosis. From there impulses are sent to the
neck muscles via the medial vestibulospinal tract (MVST) and to the leg muscles via the lateral
vestibulospinal tract (LVST).
Second possible pathway:
On the other hand, the current study suggests that the wave N34-P44 could also be
obtained in deaf ears, implying that these peaks were probably not of cochlear afferent origin; Wu
and Young (2002) likewise observed the presence of later peaks of VEMP in 45% of subjects after sudden
hearing loss. This implies that these peaks might also occur via a polysynaptic pathway
terminating on the motor neurons of the SCM muscle (Wilson et al., 1969; Murofushi, Halmagyi,
Yavor & Colebatch, 1996; Kushiro, Zakir, Sato, Ono, Ogawa & Meng, 2000).
It can be seen from Figure 1 that the inferior vestibular nerve originates from the saccule and
utricle and sends a few projections into the dorsal cochlear nucleus. Hence the possible pathway
could be that, from the saccule, the inferior vestibular nerve fibers progress to the vestibular nuclei,
which are responsible for generation of the early peaks of VEMP, whereas a few fibers progress to
the cochlear nucleus. These few fibers later terminate, via the intimate connection between the cochlear and
vestibular nuclei, in the medial and lateral vestibular nuclei. From these vestibular nuclei
they innervate the SCM, i.e. the neck muscle, via the MVST and the leg muscles via the LVST, resulting in the
generation of later peaks of VEMP.
From the above discussion and information from the literature, it may be concluded that
the above mentioned route could generate the later peaks of VEMP. This route is
plausible because 80% of individuals with profound hearing loss showed later peaks
of VEMP. The longer latencies of the later peaks could thus be due to the longer path travelled by
the nerve fibers to stimulate the neck muscle.
Third possible pathway:
There could also be a third route for generation of later peaks of VEMP
via the olivocochlear bundle. Brown (1993), Benson and Brown (1996) and Winter, Robertson and
Cole (1989) found that, in the mouse and guinea pig, the medial and lateral olivocochlear fiber systems give off branches to the
inferior vestibular nucleus and the lateral vestibular nucleus respectively, apart from those given
to the cochlear nucleus. This suggests that these few olivocochlear
fibers might progress further to the vestibulospinal tract to stimulate the SCM muscle via the MVST.
Thus from the above discussion it is evident that there could be three possible pathways
which help in eliciting later peaks of VEMP in individuals with normal hearing sensitivity as
well as in individuals with profound hearing loss. The first and third possible pathways are more
applicable to the generation of later peaks of VEMP in individuals with normal hearing sensitivity,
whereas the second possible pathway might explain the generation of later peaks of VEMP in
individuals with profound hearing loss. These different mechanisms for generation of later peaks of
VEMP in the two groups might account for the lower percentage of presence of
later peaks of VEMP in individuals with normal hearing sensitivity and the higher percentage in individuals with profound
hearing loss. Thus, the presence of later peaks of VEMP can give some information about the integrity
of the cochlear nerve, which in turn might help in selecting candidates for cochlear implants. The
proposed second pathway might encourage consideration of acoustic stimulation of the
saccule as an alternative to the cochlear implant.
Pathways for N3 potential:
To understand the possible pathway for the N3 potential, later peaks of VEMP and the N3
potential were recorded from individuals with profound hearing loss. Results revealed that 46.7% of
the population had similar results for the two potentials, suggesting that there could be a similar pathway for later peaks
of VEMP and N3 potential, whereas 53.3% showed disagreement between the results,
suggesting that there could be two different sites for generation of the N3 potential:
one could be the vestibular nucleus and the other
could be the cochlear nucleus.
The possible site for generation of the N3 potential suggested in the literature is the vestibular
nucleus. Mason (1996) reported a short latency negative component during ABR recordings in
child candidates for cochlear implant and suggested it to be of vestibular origin; hence a relation
between sound and the vestibular system is well accepted. Studies by Elidan and
Honrubia (1987) and Cazals et al. (1987) suggested that the origin of the N3 potential could be the
vestibular nerve and vestibular nuclei. Nong, Kyuna, Owa and Noda (2002) and Ochi and Ohashi
(2001) suggested that the N3 potential might be of vestibular origin, as is VEMP. Thus, a high level
of acoustic stimulation to the cochlea stimulates the saccule. The excited sensory cells of
the saccule then send the information to the vestibular nucleus via the inferior vestibular nerve,
resulting in a low amplitude negative potential at around 3 msec. In the process of exploring the
possible pathway for later peaks of VEMP it was observed that a few fibers of the
vestibulospinal tract progress to the dorsal cochlear nucleus (Bukoswka, 2002). This suggests
that the negative peak at 3 msec might also originate from the cochlear nucleus.
The second possible route can be explained by the fact that a few saccular fibers
enter the cochlear nucleus while progressing towards the vestibular nucleus (Kevetter & Perachio,
1989). Intense stimulation of the cochlea stimulates the saccule. The sensory
cells of the saccule in turn send the information to the dorsal cochlear nucleus, which might result in
the generation of a low amplitude negative peak at around 3 msec. The above discussion suggests that
there could be three different pathways for the generation of the later peaks of VEMP and two
possible pathways for the generation of the N3 potential.
Conclusion
It can be concluded that later peaks of VEMP can be recorded in a higher percentage of
individuals with profound hearing loss than of individuals with normal hearing. There is a relationship
between later peaks of VEMP and the N3 potential, as 46.7% of the population showed similar results.
Thus there might be multiple pathways for later peaks of VEMP as well as for the N3 potential. For
later peaks of VEMP, one pathway might run from the cochlea through the vestibulocochlear anastomosis
to the vestibular nuclei, ending at the SCM muscle, and a second could run via the
cochlear nucleus to the vestibular nucleus and on to the SCM muscle. There could also be a third
pathway from the cochlea to the medial olivocochlear bundle and the vestibular nucleus, descending to
the SCM muscle.
For the N3 potential, there could be two possible sites of origin when the saccule is stimulated
acoustically: the first is the vestibular nucleus and the second might be the cochlear nucleus.
This communication between the two systems might have many implications in the field of
Audiology.
Implications of the study:
The study has implications for understanding the pathophysiology of profound
hearing loss.
The condition of the vestibular system (especially the saccule, the inferior
vestibular nerve, and the medial and lateral vestibular nuclei) can be assessed non-invasively, as the
later peaks of VEMP and N3 potentials can only be obtained when the saccule is intact; the two can
serve as two different tests of the saccular system.
The presence of later peaks of VEMP in profound hearing loss might indicate residual
function of the cochlea/cochlear nerve, in turn helping to decide on candidacy for cochlear
implantation.
On the basis of all the possible pathways, there might be a possibility for a
totally deaf person to process acoustic stimulation via a saccular implant.
Limitations of the study:
All the proposed pathways are based on electrophysiological results. It
would have been better if the results had been correlated with direct methods such as injection of
wheat germ agglutinin-horseradish peroxidase (WGA-HRP).
The existence of multiple pathways for each potential makes it difficult to locate the exact lesion
when the potentials are absent, which limits their diagnostic value.
References
Benson, T. E. & Brown, M. C. (1998). Synapses from medial olivocochlear branches in the
inferior vestibular nucleus. The Journal of Comparative Neurology, 372(2), 176-188.
Bickford, R. G., Jacobson, J. L. & Cody, D. T. (1964). Nature of average evoked potentials to
sound and other stimuli in man. Ann N Y Acad. Sci, 112, 204-223.
Brown, M. C. (1993). Fiber pathways and branching patterns of biocytin-labeled olivocochlear
neurons in the mouse brainstem. J Comp Neurol, 337, 600–613.
Bukowska, D. (2002). Morphological Evidence for the Secondary Vestibular Afferent
Connections to the Dorsal Cochlear Nucleus in the Rabbit. Cells Tissues Organs, 170,
61-68.
Carhart, R. & Jerger, J. F. (1959). Preferred method for clinical determination of pure-tone
thresholds. Journal of Speech and Hearing Research, 24, 330.
Cazals, Y., Erre, J. & Aurousseau, C. (1987). Eighth nerve auditory evoked responses recorded
at the base of the vestibular nucleus in the guinea pig. Hear Res, 31, 93–97.
Cody, T. R. & Bickford, R. (1969). Averaged evoked myogenic responses in normal man.
Laryngoscope, 79, 400–446.
Colebatch, J. G. & Halmagyi, G. M. (1992). Vestibular evoked potentials in human neck muscles
before and after unilateral vestibular deafferentation. Neurology, 42, 1635–1636.
Colebatch, J. G., Halmagyi, G. M. & Skuse, N. F. (1994). Myogenic potentials generated by
click-evoked vestibulocollic reflex. J Neurol Neurosurg Psychiatry, 57, 190–197.
Elidan, J. J. & Honrubia, V. (1987). Vestibular ototoxicity of gentamicin assessed by the
recording of a short-latency vestibular-evoked response in cats. Laryngoscope, 97, 865-
870.
Ferber-Viart, C., Dubreuil, C. & Duclaux, R. (1999). Vestibular evoked myogenic potentials in
humans: A review. Acta Otolaryngol. Suppl. (Stockh), 119, 6-15.
Huang, T-W., Young, Y-H. & Cheng, P-W. (2004). Eliciting constant and prominent waves n34
- p44 of vestibular-evoked myogenic potentials. Journal of neurology neurosurgery
neuropsychiatry, 57, 190-197.
Kato, T., Shiraishi, K., Eura, Y., Shibata, K., Sakata, T., Morizono, T. & Soda, T. A. (1998).
'Neural' responses with 3 ms latency evoked by loud sound in profoundly deaf patients.
Audiol Neurootol, 3, 253-264.
Kevetter, G. A. & Perachio, A. (1989). Projections from the sacculus to the cochlear nuclei in the
Mongolian gerbil. Brain Behav. Evol, 34, 193-200.
Kim, H.S., Chung, I.H., Lee, W. S. & Kim, K. (1998). Topographical relationship of the facial
and vestibulocochlear nerves in the subarachnoid space and internal auditory canal. AJNR
Am. J. Neuroradiol, 19, 1155–1161.
Kushiro, K., Zakir, M., Sato, H., Ono, S., Ogawa, Y. & Meng, H. (2000). Saccular and utricular
inputs to single vestibular neurons in cats. Exp Brain Res, 131, 406-415.
Labrousse, M., Leveque, M., Ouedraogo, K., Avisse, C., Chays, A. & Delattre, J.-F. (2005). An
anatomical study of the vestibulocochlear anastomosis (anastomosis of Oort) in humans:
preliminary results. Surgical and Radiologic Anatomy, 27, 238-242.
Mason, S., Garnham, C. & Hudson, B. (1996). Electric response audiometry in young children
before cochlear implantation: a short latency component. Ear Hear, 17, 537-543.
Murofushi, T., Halmagyi, G., Yavor, R. A. & Colebatch, J. G. (1996). Absent vestibular evoked
myogenic potentials in vestibular neurolabyrinthitis. Arch Otolaryngol Head Neck Surg,
122, 845-848.
Nageris, B., Kalmanowitz, M., Segal, K. & Frenkiel, S. (2000). Connections of the facial and
vestibular nerves: an anatomic study. J. Otolaryngol., 29, 159–161.
Nong, D. X., Kyuna, A., Owa, T. & Noda, Y. (2002). Saccular origin of acoustically evoked short
latency negative response. Otol Neurotol, 23, 953-957.
Ochi, K., Ohashi, T. & Nishino, H. (2001). Variance of vestibular evoked myogenic potentials.
Laryngoscope, 111, 522-527.
Oort, H. (1918). Über die Verästelung des Nervus octavus bei Säugetieren. (Modell des Utriculus
und Sacculus des Kaninchens). Anat. Anz, 51, 272–280.
Rasmussen, A. T. (1940). Studies of eighth cranial nerve of man. Laryngoscope, 50, 67–83.
Robertson, D.D. & Ireland, D. (1995). Vestibular evoked myogenic potentials. The Journal of
Otolaryngology, 24, 3-8.
Toshihisa, M., Iwasakib, S., Takaib, Y. & Takegoshic, H. (2005). Sound-evoked neurogenic
responses with short latency of vestibular origin. Clinical Neurophysiology, 116, 401–405.
Wilber, L. A. (1994). Calibration, pure tone, speech and noise signals. In J. Katz (Ed.),
Handbook of Clinical Audiology (4th Edn.) (pp. 73-96). Baltimore: Williams & Wilkins.
Wilson, V. J., Fukushima, K., Rose, P. & Shinoda, K. (1995). The vestibulocollic reflex. J
Vestib Res, 5, 147-170.
Winter, I. M., Robertson, D. & Cole, K. S. (1989). Descending projections from auditory
brainstem nuclei to the cochlea and cochlear nucleus of the guinea pig. J Comp Neurol,
280, 143–157.
Wu, C. & Young, Y. (2002). Vestibular evoked myogenic potentials are intact after sudden
deafness. Ear Hear, 23, 235-238.
Young, Y. H., Huang, T. W. & Cheng, P. W. (2003). Assessing the stage of Meniere's disease
using vestibular evoked myogenic potentials. Archives of Otolaryngology-Head & Neck
Surgery, 129, 815–818.
Acoustic Analysis of the Speech Processed Through Three Amplification
Strategies and Their Effect on Speech Recognition Scores of Individuals with
Severe Hearing Impairment
Gunjan Chand, Vijayalakshmi Basavaraj & Ajish Abraham
Abstract
Individuals with severe hearing impairment exhibit reduced frequency resolution and
temporal discrimination. Therefore, the amplification requirements of this group
differ from those of individuals with lesser degrees of hearing loss. The aim of the present
study was to investigate acoustic changes to the speech signal [in terms of Consonant Vowel
Ratio (CVR) and Envelope Difference Index (EDI)] that occurred with different amplification
strategies and to examine the relationship between such changes and speech perception in
individuals with severe sensorineural impairment. A total of 10 subjects having moderately
severe to severe hearing loss participated in the study. Speech Recognition Scores were
calculated for a CV nonsense syllable list at input levels of 65 and 80 dBSPL for the three
amplification strategies, viz. Peak Clipping, Compression Limiting, and Wide Dynamic Range
Compression (WDRC). Using Matlab software, the Consonant Vowel Ratio was calculated for the
unprocessed and processed stimuli for 5 subjects, and the Correlation Index for one subject, at
input levels of 65 and 80 dBSPL for all three strategies. The scores were better with Compression
Limiting than with WDRC at both 65 and 80 dBSPL. The CVR
values for the processed stimuli were higher for the Compression Limiting strategy than for WDRC
at 65 and 80 dBSPL in the vowel environment /u/, and at 80 dBSPL in the vowel environments /a/ and /i/.
The EDI value was greater at 65 dBSPL and decreased as the level increased to 80 dBSPL
for all three strategies. The results of the present study indicate a
relationship between acoustic changes to the hearing aid processed speech signal and the speech
perception performance of individuals with severe hearing impairment.
Key words: Peak Clipping, Compression Limiting, Wide Dynamic Range Compression,
Consonant Vowel Ratio, Envelope Difference Index.
Introduction
Sensorineural hearing loss is often associated with loudness recruitment, an abnormally
rapid growth of loudness level with increasing sound level (Moore, 2004). Recruitment could be
due to reduced compressive nonlinearity on the basilar membrane produced by loss of outer hair
cell function (Moore, 1998). The effect of recruitment is represented on the audiogram by the
reduced range between hearing thresholds and uncomfortable loudness levels. In some patients
with large losses, and thus a small dynamic range, even the dynamics of the speech signal itself cause
a problem: amplifying the weak parts of speech to an audible level causes the strong parts to
be uncomfortably loud.
Director, All India Institute of Speech and Hearing, Mysore, India [email protected] Reader in Electronics, All India Institute of Speech and Hearing, Mysore, India
Multichannel wide dynamic range compression (WDRC) gives more gain for weak
sounds than for intense sounds. WDRC compresses most of the speech spectrum into the residual
range, giving increased audibility and comfort and making loudness perception more similar to
normal (Villchur, 1973). Various studies in the literature have compared
WDRC with linear amplification and found the greatest benefits for WDRC with low level speech in
quiet and conversational level speech in quiet (Souza, 2002), and some studies have even shown
small benefits for speech in background noise (Moore, Peters & Stone, 1999). However, most of
these studies dealt with listeners with mild to moderate hearing loss.
The audiological profile differs for different degrees of hearing loss and hence the choice
of amplification also varies. Individuals with severe hearing loss are characterized by
suprathreshold processing deficits, primarily dramatically reduced frequency selectivity
(Faulkner, Rosen & Moore, 1990) and, in some circumstances, reduced temporal discrimination
(Lamore, Verweij & Brocaar, 1990). It has long been accepted that listeners with a severe loss
require different linear amplification characteristics than listeners with a mild to moderate loss
(Byrne, 1978; Byrne et al., 1990; Schwartz et al., 1988; Van Tasell, 1993). Because of their
broader auditory filters (Faulkner et al., 1990), listeners with a severe-to-profound loss may not
be able to take full advantage of spectral information (e.g., Erber, 1972) and must rely to a greater
extent on temporal cues, which are altered by WDRC amplification (Moore, 1996; Van Tasell et
al., 1987). One effect of WDRC amplification is alteration of the natural time-intensity
variations of the speech signal. For listeners with a mild-to-moderate loss, who presumably
depend to a greater extent on spectral cues, these changes in time-intensity variations do not
significantly offset the benefits of improved speech audibility (Souza & Turner, 1996, 1998,
1999).
Souza and Jenstad (2005) compared speech recognition scores across
different amplification strategies for listeners with severe hearing loss and found that the benefits
of fast acting WDRC relative to more linear amplification may be reduced in listeners with
severe loss. In contrast, Moore and Marriage (2005) studied the effect of three amplification
strategies on speech perception by children with severe and profound hearing loss and found that
speech scores on closed-set testing for the profound group showed a significant benefit for WDRC
over the other two algorithms. The contradictory results could be because the latter
study was done on children with congenital hearing loss in whom the dynamic range is reduced
as seen in adults with sensorineural hearing loss.
A review of hearing aids for hearing impairment has shown that signal processing
techniques that take the acoustic-phonetic structure of speech into account promise to be more
effective in improving intelligibility than non-phonetically based methods of signal processing,
provided the relevant speech features are extracted reliably. A form of signal processing which is
phonetically based and which holds some promise for improving intelligibility is that of
adjusting the ratio of consonant intensity to vowel intensity (C-V ratio). The consonant vowel
ratio therefore appears promising as a measure for selecting a suitable strategy for an individual.
Acoustic analysis of single channel syllabic compression and linear amplification has revealed
that compression may result in changes in the intensity relationships between parts of the speech
signal (Hickson & Byrne, 1995). An increase in the CVR could be expected to
improve consonant perception for people with hearing impairment. Research in linguistics
with normal hearing subjects also reveals that the CVR itself is an important cue for the perception
of certain sounds. Thus, calculating the Consonant Vowel Ratio of the speech signal after signal
processing through a hearing aid might help in predicting performance with that hearing aid. Hickson
and Thyer (1999) reported that it is possible to predict speech perception performance with
compression by examining the acoustic characteristics of the processed speech signal.
The aim of the present study was to investigate the acoustic changes to the speech signal
(in terms of Consonant Vowel Ratio and Envelope Difference Index) that occurred with different
amplification strategies and to examine the relationship between such changes and speech
perception in individuals with severe sensorineural impairment.
Objectives:
1) To study the effects of different amplification strategies on the speech recognition scores of
listeners with severe hearing impairment.
2) To objectively measure the acoustic effects of different amplification strategies on amplified
speech (by calculating the Consonant Vowel Ratio and Envelope Difference Index).
3) To evaluate the relation between acoustic changes and speech recognition.
Method
Subjects: Ten (5 males and 5 females) hearing aid users between 20 and 55 years of age
participated in the study. All subjects had bilateral moderately severe to severe sensorineural
hearing loss (65-90 dBHL) with normal middle ear functioning.
Stimuli: A list of nonsense CV monosyllables was recorded with a
unidirectional microphone fixed at a distance of 6 inches from the speaker. The recording was
done by a native Kannada speaker seated in a sound treated room. The CV list consisted of
16 consonants paired with three different vowels, /a/, /i/ and /u/, such that most of the speech
frequencies were covered. The list consisted of the 48 CV items shown in Table 1. The stimuli
were recorded at a sampling rate of 44.1 kHz with 16 bit resolution and stored in the
computer memory. An inter-stimulus interval of 3 s was introduced between stimuli using
Wave pad Software.
The speech stimuli and a 1 kHz tone were delivered from the same loudspeaker at a
distance of 1 meter from the client's head. The level in dBA was set using the calibration track on the
computer output, with the sound level meter (Larson Davis) placed in the position of the client's
head without the client present.
Table 1- Consonants, Vowels and combination of CV stimuli used in the study
Manner  Stop                          Nasal     Affricate Fricative      Liquid    Glide
Vowel   p    b    T    d    k    g    m    n    t∫   dz   s    ∫    h    r    l    v
a       pa   ba   Ta   da   ka   ga   ma   na   t∫a  dza  sa   ∫a   ha   ra   la   va
i       pi   bi   Ti   di   ki   gi   mi   ni   t∫i  dzi  si   ∫i   hi   ri   li   vi
u       pu   bu   Tu   du   ku   gu   mu   nu   t∫u  dzu  su   ∫u   hu   ru   lu   vu
Procedure
[A] Experiment-I: To measure the effect of different amplification strategies on speech
recognition scores:
A routine audiological evaluation was carried out. It included pure tone audiometry using the
Carhart-Jerger modified Hughson-Westlake procedure (1959) with a calibrated (ISO-389, 1994) dual
channel diagnostic audiometer (MADSEN OB922) and TDH 39 headphones. Speech
recognition scores and the uncomfortable loudness level for speech were measured. Immittance
measurements, including tympanometry and acoustic reflex thresholds, were carried out using a
GSI-Tympstar immittance audiometer to rule out any middle ear pathology. Testing was carried out in
an acoustically treated room with noise levels within permissible limits (ANSI S3.1-1991,
cited in Wilber, 1994). After the audiological evaluation, subjects were fitted with a Phonak Supero 412
digital BTE hearing aid offering different signal processing strategies: Wide
Dynamic Range Compression, Peak Clipping and Compression Limiting. The hearing aid was
programmed for all three signal processing strategies with the NAL-NL1 prescriptive formula
(Dillon et al., 1998) using the NOAH Link Compass Version 4 programming software. CV
items were presented from the computer sound card connected to the two channel diagnostic
audiometer. The stimuli were presented via a loudspeaker at a distance of 1 m from the client
at levels of 65 and 80 dBSPL. The responses of the client were noted and scored.
[B] Experiment II: To measure the effect of different amplification strategies on speech
acoustics:
The acoustic measures used in the study were the Consonant Vowel Ratio (CVR) and the
Envelope Difference Index (EDI), which quantifies the effect of amplification strategies on the
temporal envelope of speech.
1) CVR calculation: The CV items were presented at levels of 65 and 80 dBSPL in an
anechoic chamber through a PC soundcard. A microphone connected to the sound level meter (SLM)
was placed in the anechoic chamber to record the input stimuli. The stimuli picked up by the
microphone were routed through the SLM and stored in the computer memory. Using the same
procedure, all the CV items were recorded at 65 and 80 dBSPL. In the next step, the hearing aid
programmed for each of the different conditions was kept in the anechoic chamber with the
receiver output coupled to a 2cc coupler. The stimuli presented at 65 and 80 dBSPL in the
anechoic chamber were picked up by the hearing aid microphone, and the output of the hearing aid was
picked up by the microphone of the SLM and stored in the computer memory. The CVR was
calculated using an algorithm in Matlab. From the acquired waveforms for the processed and
unprocessed stimuli, the consonant and vowel amplitudes were separated out through high pass and
low pass Butterworth filters respectively.
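The filter-based separation described above can be illustrated with a minimal Python sketch. It is not the Matlab algorithm used in the study: the filter order (second order), the 1 kHz cutoff, the use of RMS amplitude, and the definition of the ratio as high-band amplitude divided by low-band amplitude are all assumptions made for illustration, and the function names are hypothetical.

```python
import math


def butterworth_biquad(kind, cutoff_hz, fs_hz):
    """Second-order Butterworth coefficients (bilinear-transform biquad, Q = 1/sqrt(2))."""
    w0 = 2.0 * math.pi * cutoff_hz / fs_hz
    cos_w0 = math.cos(w0)
    alpha = math.sin(w0) / (2.0 * (1.0 / math.sqrt(2.0)))
    if kind == "low":
        b = [(1.0 - cos_w0) / 2.0, 1.0 - cos_w0, (1.0 - cos_w0) / 2.0]
    elif kind == "high":
        b = [(1.0 + cos_w0) / 2.0, -(1.0 + cos_w0), (1.0 + cos_w0) / 2.0]
    else:
        raise ValueError("kind must be 'low' or 'high'")
    a = [1.0 + alpha, -2.0 * cos_w0, 1.0 - alpha]
    return b, a


def apply_filter(b, a, samples):
    """Run the biquad over the samples (direct form I)."""
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x0 in samples:
        y0 = (b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2) / a[0]
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out


def rms(samples):
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def consonant_vowel_ratio(samples, fs_hz, cutoff_hz=1000.0):
    """Assumed CVR: RMS of the high-passed (consonant) band over RMS of the low-passed (vowel) band."""
    b_hp, a_hp = butterworth_biquad("high", cutoff_hz, fs_hz)
    b_lp, a_lp = butterworth_biquad("low", cutoff_hz, fs_hz)
    consonant_band = apply_filter(b_hp, a_hp, samples)
    vowel_band = apply_filter(b_lp, a_lp, samples)
    return rms(consonant_band) / rms(vowel_band)


# Synthetic stand-in for a CV item: a strong 200 Hz component plus a weaker 4 kHz component.
fs = 16000
demo = [math.sin(2 * math.pi * 200 * n / fs) + 0.5 * math.sin(2 * math.pi * 4000 * n / fs)
        for n in range(fs)]
print(round(consonant_vowel_ratio(demo, fs), 2))
```

Applying the same function to the recorded input waveform and to the hearing aid output would give one unprocessed and one processed CVR value per CV item, which is the form of data compared in the Results.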
2) Envelope Difference Index: The EDI quantifies the temporal changes caused by amplification
by comparing the temporal envelopes of two acoustic signals.
The recording procedure was the same as for the CVR calculation. The input
waveform of the speech signal and the waveform of the amplified signal after passing through the
hearing aid were acquired, and the absolute values of the waveforms were taken. The waveforms
were scaled to a mean value of 1. Both scaled waveforms were correlated using the cross
correlation technique. The CI value was calculated using the formula:
\[ \mathrm{CI} = \frac{\sum_{n=1}^{N} \left| \mathrm{SAMPLE1}_n - \mathrm{SAMPLE2}_n \right|}{2N} \]
The procedure was repeated using each of the three amplification strategies.
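As a rough illustration of the index described above, the following Python sketch rectifies two equal-length waveforms, scales each to a mean of 1 and applies the formula. It assumes the two recordings are already time-aligned and of equal length, and it omits any envelope smoothing; the function name is hypothetical and this is not the study's Matlab code.

```python
def envelope_difference_index(reference, processed):
    """Sum of absolute differences between mean-normalised rectified waveforms, divided by 2N."""
    if not reference or len(reference) != len(processed):
        raise ValueError("waveforms must be non-empty and of equal length")

    def rectify_and_scale(waveform):
        rectified = [abs(v) for v in waveform]
        mean = sum(rectified) / len(rectified)
        if mean == 0:
            raise ValueError("waveform is silent")
        return [v / mean for v in rectified]  # mean value becomes 1

    env1 = rectify_and_scale(reference)
    env2 = rectify_and_scale(processed)
    n = len(env1)
    return sum(abs(a - b) for a, b in zip(env1, env2)) / (2 * n)


# A pure gain change leaves the normalised envelope unchanged, so the index is 0.
print(envelope_difference_index([1, -2, 3], [2, -4, 6]))
# Envelopes that never overlap in time give the maximum value of 1.
print(envelope_difference_index([1, 0, 1, 0], [0, 1, 0, 1]))
```

Because both envelopes are scaled to a mean of 1, the index lies between 0 (identical envelope shape) and 1 (no overlap), so it reflects changes in envelope shape rather than overall gain.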
Results
[A] Experiment-I: The speech recognition scores obtained for the 10 subjects were analyzed to
study the effect of amplification strategy and level. SPSS (Statistical Package for the Social
Sciences, version 10) for Windows was used to analyze the data. The following parameters were
analyzed.
1) Effect of strategy on speech recognition scores: Table 2 shows the overall mean
speech recognition scores and standard deviations for the different amplification strategies at 65
and 80 dBSPL. The mean scores were better for Peak Clipping (PC) at 65 dBSPL and
for the Compression Limiting (CL) strategy at 80 dBSPL.
Table 2- Mean and SD of speech recognition scores across conditions
Level (dBSPL)  Strategy                          Mean   SD
65             Compression Limiting              23.8   2.49
65             Peak Clipping                     24.0   3.83
65             Wide Dynamic Range Compression    21.6   2.72
80             Compression Limiting              28.5   2.01
80             Peak Clipping                     26.7   4.11
80             Wide Dynamic Range Compression    26.7   4.00
a) At 65 dBSPL input: One-way repeated measures ANOVA was performed for comparison
across strategies at 65 dBSPL. The effect of amplification strategy was significant, F (2, 18)
= 3.661; p < 0.05. Since there was a significant difference across strategies, pair-wise differences
among them were tested with Bonferroni's multiple comparison. There was a significant
difference between WDRC and CL at the 0.05 level of significance. The mean scores were better for
Compression Limiting than for WDRC. The remaining pairs were not significantly different at the 0.05 level.
Graph 1: Mean Percentage Speech recognition scores across strategies (x-axis: strategy, CL, PC and WDRC; y-axis: mean SRS; separate data series for the 65 and 80 dBSPL levels)
b) At 80 dBSPL input: One-way repeated measure ANOVA was performed for comparison
across strategy for 80 dBSPL input. The effect of amplification strategy was not significant, F (2,
18) = 1.254; p > 0.05.
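The degrees of freedom reported above, F (2, 18), are consistent with 3 strategies and 10 subjects, since a one-way repeated measures ANOVA has (k - 1) and (k - 1)(n - 1) degrees of freedom. The Python sketch below shows how such an F ratio is formed; the scores in it are hypothetical and are not the study's data, and the analysis in the study itself was run in SPSS.

```python
def repeated_measures_anova(scores):
    """One-way repeated measures ANOVA.

    scores[i][j] is the score of subject i under condition j.
    Returns (F, df_condition, df_error).
    """
    n = len(scores)          # subjects
    k = len(scores[0])       # conditions (here, amplification strategies)
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    condition_means = [sum(row[j] for row in scores) / n for j in range(k)]
    subject_means = [sum(row) / k for row in scores]

    ss_condition = n * sum((m - grand_mean) ** 2 for m in condition_means)
    ss_subject = k * sum((m - grand_mean) ** 2 for m in subject_means)
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_condition - ss_subject  # condition-by-subject residual

    df_condition = k - 1
    df_error = (k - 1) * (n - 1)
    f_ratio = (ss_condition / df_condition) / (ss_error / df_error)
    return f_ratio, df_condition, df_error


# Hypothetical scores for 10 subjects under three strategies (CL, PC, WDRC).
hypothetical = [
    [24, 25, 21], [22, 20, 19], [26, 27, 24], [23, 24, 22], [25, 29, 23],
    [21, 22, 20], [27, 25, 24], [24, 23, 21], [22, 26, 20], [24, 19, 22],
]
f_ratio, df1, df2 = repeated_measures_anova(hypothetical)
print(df1, df2)  # degrees of freedom for 3 conditions and 10 subjects
```

The same function applied to 16 CV items per vowel under three strategies would give the (2, 30) degrees of freedom reported for the CVR comparisons in Experiment II.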
2) Effect of presentation level on speech recognition Scores
A paired t-test was done for comparison across levels within each strategy. The effect of
presentation level was significant for all the strategies: WDRC [t (9) = 4.680, p < 0.05], PC [t (9)
= 6.384, p < 0.05] and CL [t (9) = 6.567; p < 0.05]. The scores were higher at 80 dBSPL than at 65
dBSPL.
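For reference, the paired t statistic is the mean of the within-subject differences divided by its standard error, with n - 1 degrees of freedom, which gives t (9) for 10 subjects. A minimal Python sketch follows; the scores are hypothetical and do not reproduce the study's data.

```python
import math


def paired_t(first, second):
    """Paired t statistic for second minus first; returns (t, degrees of freedom)."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # Sample variance of the differences (n - 1 in the denominator).
    variance = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    standard_error = math.sqrt(variance / n)
    return mean_diff / standard_error, n - 1


# Hypothetical scores of 10 subjects at 65 and 80 dBSPL for one strategy.
scores_65 = [22, 20, 25, 21, 23, 19, 24, 22, 20, 21]
scores_80 = [27, 24, 28, 27, 26, 25, 30, 25, 26, 24]
t_value, df = paired_t(scores_65, scores_80)
print(round(t_value, 3), df)
```

The CVR comparisons in Experiment II report t (15), which would be consistent with pairing the 16 CV items within each vowel environment.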
[B] Experiment II
1. Effect of strategy on consonant vowel ratio: The consonant vowel ratio values were
calculated for 5 subjects. They were analyzed to study the effect of amplification strategy.
a) Peak clipping condition: The CVR values were calculated for the stimuli with vowel
environment /a/, /i/, /u/. Table 3 shows the mean CVR values and Standard Deviation of the
input (unprocessed) stimuli and the output (processed) stimuli in 3 different vowel environments
for 5 subjects.
Paired t test was done to compare the CVR values of the unprocessed and processed
stimuli.
i) Stimuli with vowel environment /a/: There was a significant difference between the
CVR values of the unprocessed and processed stimuli at 65 dBSPL [t (15) = 3.451,
p<0.01] and 80 dBSPL [t (15) = 4.351, p<0.01]. The CVR values were higher for the
processed stimuli than for the unprocessed stimuli at both 65 and 80 dBSPL.
Table 3: Mean and SD of CVR values for the processed and unprocessed stimuli
Vowel  Level (dBSPL)  Stimuli  Mean  SD
/a/    65             Input    0.68  0.14
/a/    65             Output   0.80  0.04
/a/    80             Input    0.60  0.17
/a/    80             Output   0.77  0.04
/i/    65             Input    0.39  0.22
/i/    65             Output   0.37  0.10
/i/    80             Input    0.21  0.18
/i/    80             Output   0.35  0.09
/u/    65             Input    0.66  0.23
/u/    65             Output   0.67  0.08
/u/    80             Input    0.58  0.19
/u/    80             Output   0.69  0.08
ii) Stimuli with vowel environment /i/: There was no significant difference between
the CVR values of the unprocessed and processed stimuli at 65 dBSPL [t (15) = 0.736,
p>0.01], whereas a significant difference was seen at 80 dBSPL input [t (15) = 2.899, p<0.01]. The
CVR was enhanced significantly after processing at 80 dBSPL but not at 65 dBSPL.
iii) Stimuli with vowel environment /u/: There was no significant difference between
the CVR values of the unprocessed and processed stimuli at 65 dBSPL [t (15) = 0.259,
p>0.01], whereas a significant difference was seen at 80 dBSPL input [t (15) = 2.940, p<0.01]. The
CVR was enhanced significantly after processing at 80 dBSPL but not at 65 dBSPL.
b) Compression Limiting: The CVR values were calculated for stimuli with vowel
environment /a/, /i/, /u/. Table 4 shows the mean CVR values and SD of the input
unprocessed stimuli and the output processed stimuli in 3 different vowel environments.
Table 4: Mean CVR values and Standard Deviation for unprocessed and processed stimuli
Vowel  Level (dBSPL)  Input/Output  Mean  SD
/a/    65             Input         0.68  0.14
/a/    65             Output        0.79  0.04
/a/    80             Input         0.60  0.17
/a/    80             Output        0.80  0.02
/i/    65             Input         0.39  0.22
/i/    65             Output        0.41  0.11
/i/    80             Input         0.20  0.18
/i/    80             Output        0.40  0.08
/u/    65             Input         0.66  0.23
/u/    65             Output        0.75  0.07
/u/    80             Input         0.58  0.19
/u/    80             Output        0.71  0.05
Paired t test was done to compare the CVR values of the unprocessed input and processed
output stimuli.
i) Stimuli with vowel environment /a/: There was a significant difference between the CVR
values of the unprocessed and processed stimuli at 65 dBSPL [t (15) = 2.840, p<0.01] and 80
dBSPL [t (15) = 5.035, p<0.01].
ii) Stimuli with vowel environment /i/: There was no significant difference between the
CVR values of the unprocessed and processed stimuli at 65 dBSPL [t (15) = 0.194, p>0.01],
whereas a significant difference was seen at 80 dBSPL input [t (15) = 4.454, p<0.01].
iii) Stimuli with vowel environment /u/: There was no significant difference between the
CVR values of the unprocessed and processed stimuli at 65 dBSPL [t (15) = 1.389, p>0.01],
whereas a significant difference was seen at 80 dBSPL input [t (15) = 2.720, p<0.01].
c) Wide dynamic range compression: The CVR values were calculated for the stimuli with
vowel environment /a/, /i/ and /u/ divided into 6 consonants groups (stops, nasals, affricates,
fricatives, liquids and glides respectively). Table 5 shows the mean CVR values and Standard
Deviation of the input unprocessed stimuli and the output processed stimuli in 3 different vowel
environments.
Table 5: Mean CVR values and Standard Deviation for unprocessed and processed stimuli
Vowel  Level (dBSPL)  Input/Output  Mean  SD
/a/    65             Input         0.68  0.14
/a/    65             Output        0.70  0.50
/a/    80             Input         0.60  0.17
/a/    80             Output        0.74  0.05
/i/    65             Input         0.40  0.22
/i/    65             Output        0.49  0.10
/i/    80             Input         0.21  0.18
/i/    80             Output        0.35  0.08
/u/    65             Input         0.66  0.23
/u/    65             Output        0.56  0.06
/u/    80             Input         0.58  0.19
/u/    80             Output        0.66  0.07
Paired t test was done to compare the CVR values of the unprocessed input and processed
output stimuli.
i) Stimuli with vowel environment /a/: There was no significant difference between the
CVR values of the unprocessed and processed stimuli at 65 dBSPL [t (15) = 0.520, p>0.01],
whereas there was a significant difference at 80 dBSPL [t (15) = 3.152, p<0.01].
ii) Stimuli with vowel environment /i/: There was no significant difference between the
CVR values of the unprocessed and processed stimuli at 65 dBSPL [t (15) = 1.431, p>0.01],
whereas a significant difference was seen at 80 dBSPL input [t (15) = 3.118, p<0.01].
iii) Stimuli with vowel environment /u/: There was no significant difference between the
CVR values of the unprocessed and processed stimuli either at 65 dBSPL [t (15) = 1.488, p>0.01] or at
80 dBSPL input [t (15) = 1.790, p>0.01].
2) Comparison of CVR values across different amplification strategies
i) Stimuli with vowel environment /a/
a) At 65 dBSPL: The mean CVR values and Standard Deviation for the processed stimuli for all
the 3 conditions are shown in Table 6. The mean CVR values were higher for peak clipping.
Table 6: Mean CVR values across strategies for stimuli with vowel /a/ at 65 dBSPL
Strategies Mean SD
Peak Clipping 0.81 0.04
Compression Limiting 0.79 0.03
Wide Dynamic Range Compression 0.70 0.05
One-way repeated measures ANOVA was done to see the effect of amplification
condition. There was a significant effect of amplification condition on the CVR values [F (2, 30)
= 4.946, p< 0.05]. Since there was a significant difference across strategies, pair-wise differences
among them were tested with Bonferroni's multiple comparison. There was a significant difference
between Peak Clipping and WDRC, and between WDRC and Compression Limiting, at the 0.05 level of
significance. There was no significant difference between Peak Clipping and Compression
Limiting at the 0.05 level.
b) At 80 dBSPL: The mean CVR values and Standard Deviation for the processed stimuli for all
the 3 conditions are shown in Table 7. The mean CVR values were higher for compression
limiting.
Table 7: Mean CVR values across strategies for stimuli with vowel /a/ at 80 dBSPL
Strategies Mean SD
Peak Clipping 0.77 0.39
Compression Limiting 0.80 0.02
Wide Dynamic Range Compression 0.74 0.05
One-way repeated measures ANOVA was done to see the effect of amplification
condition. There was a significant effect of amplification condition on the CVR values [F (2, 30)
= 10.659, p< 0.05]. Since there was a significant difference across strategies, pair-wise differences
among them were tested with Bonferroni's multiple comparison. There was a significant difference
between Peak Clipping and Compression Limiting, and between WDRC and Compression Limiting, at the
0.05 level of significance. There was no significant difference between Peak Clipping and WDRC.
ii) Stimuli with vowel environment /i/
a) At 65 dBSPL: The mean CVR values and Standard Deviation for the processed stimuli
for all the 3 conditions are shown in Table 8. The mean CVR values were higher for
Wide Dynamic range Compression.
Table 8: Mean CVR values across strategies for stimuli with vowel /i/ at 65 dBSPL
Strategies Mean SD
Peak Clipping 0.37 0.99
Compression Limiting 0.41 0.10
Wide Dynamic Range Compression 0.49 0.09
One-way repeated measures ANOVA was done to see the effect of amplification
condition. There was a significant effect of amplification condition on the CVR values [F (2, 30)
= 12.961, p< 0.05]. Since there was a significant difference across strategies, pair-wise
differences among them were tested with Bonferroni's multiple comparison. There was a
significant difference between Peak Clipping and WDRC, and between WDRC and Compression Limiting,
at the 0.05 level of significance. There was no significant difference between Peak Clipping and
Compression Limiting.
b) At 80 dBSPL: The mean CVR values and Standard Deviation for the processed stimuli
for all the 3 conditions are shown in Table 9. The mean CVR values were higher for
compression limiting.
Table 9: Mean CVR values across strategies for stimuli with vowel /i/ at 80 dBSPL
Strategies Mean SD
Peak Clipping 0.35 0.95
Compression Limiting 0.40 0.08
Wide Dynamic Range Compression 0.35 0.07
A one-way repeated measures ANOVA was done to see the effect of amplification
condition. There was a significant effect of amplification condition on the CVR values [F (2, 30)
= 5.533, p < 0.05]. Since there was a significant difference across strategies, pair-wise differences
among them were tested with Bonferroni's multiple comparison. There was a significant difference
between Peak Clipping and Compression Limiting, and between WDRC and Compression Limiting, at the
0.05 level of significance. There was no significant difference between Peak Clipping and WDRC.
iii) Stimuli with vowel environment /u/
a) At 65 dBSPL: The mean CVR values and Standard Deviation for the processed stimuli
for all the 3 conditions are shown in Table 10. The mean CVR scores were higher for
compression limiting.
Table 10: Mean CVR values across strategies for stimuli with vowel /u/ at 65 dBSPL
Strategies Mean SD
Peak Clipping 0.67 0.07
Compression Limiting 0.75 0.07
Wide Dynamic Range Compression 0.56 0.06
A one-way repeated measures ANOVA was done to see the effect of amplification
condition. There was a significant effect of amplification condition on the CVR values [F (2, 30)
= 42.301, p < 0.05]. Since there was a significant difference across strategies, pair-wise differences
among them were tested with Bonferroni's multiple comparison. There was a significant difference
between Peak Clipping and Compression Limiting, between WDRC and Compression Limiting, and between
Peak Clipping and WDRC at the 0.05 level of significance.
b) At 80 dBSPL: The mean CVR values and Standard Deviation for the processed stimuli
for all the 3 conditions are shown in Table 11. The mean CVR values were higher for
compression limiting.
Table 11: Mean CVR values across strategies for stimuli with vowel /u/ at 80 dBSPL
Strategies Mean SD
Peak Clipping 0.70 0.08
Compression Limiting 0.71 0.05
Wide Dynamic Range Compression 0.66 0.06
A one-way repeated measures ANOVA was done to see the effect of amplification
condition. There was a significant effect of amplification condition on the CVR values [F (2, 30)
= 4.194, p < 0.05]. Since there was a significant difference across strategies, pair-wise differences
among them were tested with Bonferroni's multiple comparison. There was a significant difference
between Compression Limiting and WDRC at the 0.05 level of significance. There was no significant
difference between Peak Clipping and WDRC, or between Compression Limiting and Peak
Clipping.
[Graph 2: x-axis "Input and Strategies" (Input, Strategy I, Strategy II, Strategy III); y-axis "95% CI for CVR values at 65 dB" (0.0 to 1.0); separate series for vowels /a/, /i/ and /u/]
Graph 2: Error graph showing 95% confidence interval for Consonant vowel ratio at 65 dBSPL
for input and output processed stimuli across strategies. Strategy I- peak clipping, strategy II-
compression limiting, strategy III- wide dynamic range compression.
[Graph 3: x-axis "Input and Strategies" (Input, Strategy I, Strategy II, Strategy III); y-axis "95% CI for CVR values at 80 dB" (0.0 to 1.0); separate series for vowels /a/, /i/ and /u/]
Graph 3: Error graph showing 95% confidence interval for Consonant vowel ratio at 80 dBSPL
for input and output processed stimuli across strategies. Strategy I- peak clipping, strategy II-
compression limiting, strategy III- wide dynamic range compression.
The range of CVR values was greater for the input stimuli than for the output
stimuli at 65 and 80 dBSPL. The CVR values were higher for the stimuli with /a/ and /u/ vowel
environments than for /i/, both for the input and output stimuli, as seen in Graphs 2 and 3.
2) Effect of strategies on Envelope Difference Index: Table 12 shows the mean Correlation
index values and standard deviation for the stimuli with vowel environment /a/, /i/, /u/ at input 65
and 80 dBSPL for one subject across three strategies. The EDI values were highest for Peak
clipping and lowest for compression limiting at 65 dBSPL whereas at 80 dBSPL it was just the
opposite, highest for WDRC and lowest for Peak Clipping.
Table 12: Mean Correlation index values and standard deviation for the stimuli with vowel
environment /a/, /i/, /u/ at input 65 and 80 dBSPL
Stimuli    Peak Clipping     Compression Limiting    Wide Dynamic Range Compression
           Mean    SD        Mean    SD              Mean    SD
/a/ 65     0.29    0.17      0.22    0.18            0.12    0.09
/a/ 80     0.04    0.02      0.74    0.07            0.11    0.10
/i/ 65     0.23    0.13      0.20    0.14            0.19    0.15
/i/ 80     0.05    0.05      0.08    0.13            0.12    0.14
/u/ 65     0.13    0.10      0.09    0.06            0.06    0.06
/u/ 80     0.02    0.06      0.05    0.07            0.14    0.15
Discussion
1. Speech recognition scores: Results of the study demonstrated a significant difference in speech
recognition scores across strategies at 65 dBSPL but no significant difference between strategies
at 80 dBSPL. The scores were better with Compression Limiting than with WDRC at both 65 and 80
dBSPL. The study is in agreement with the previous study by Jenstad and Souza (2005).
Hickson and Thyer (2003) also found that at higher input levels there was no difference between
linear and compression amplification. Variation in amplitude over time provides
critical speech information. Some authors have suggested that severely hearing impaired listeners
depend more heavily on these cues because their broadened auditory filters prevent full access to
spectral detail. In this study data were collected on specific groups of stimuli from a small number
of subjects. To assess the efficacy of the approach it would be desirable to obtain similar
measures from a large number of subjects across a relatively small number of phonetic contrasts.
2. Consonant vowel ratio: The results of the study showed that the consonant vowel ratio was
higher for the Compression Limiting condition than for Wide Dynamic Range Compression for the
stimuli with vowel environment /a/ and /i/ at 80 dBSPL and /u/ at 65 and 80 dBSPL. One
possible reason is that the benefits of recruitment compensation may be nullified by
temporal distortions from the compressor attack and recovery times and their alterations of the
normal intensity cues of speech. As compression limiting is only active at high signal levels, it
may provide better CVR without significantly altering the dynamics of conversational speech
signals compared to the effect of a syllabic compressor, as speculated by Walker and Dillon
(1982) and Dreschler (1988). The findings of the study suggest that substantial changes in a
speech signal can occur as a result of signal processing by hearing aids. In addition to simple
changes due to frequency shaping, temporal changes can occur, such as loss or reduction of the periodicity
associated with voicing and obscuring of the boundary between aperiodic consonant noise and
the onset of voicing. In this study marked changes in consonant vowel ratio occurred
with processing. The magnitude of these changes for a given syllable, however, appears to be
influenced by many factors including system release time, compression parameters, amplitude
and duration of preceding speech sounds, the time delay between the vowel and consonant and
the amplitude of the unprocessed consonant. As such, the changes in the speech signal observed
after processing may not be easily predicted from traditional electroacoustic measures of hearing
aid performance. There is likely to be a complex interaction between the dynamic characteristics
of hearing aid processing and the dynamic characteristics of the speech signal. The results of the
present study indicate a relationship between acoustic changes to the hearing aid processed
speech signal and the speech perception performance of severely hearing impaired individuals. The
consonant vowel ratio was higher for Compression Limiting than for the WDRC strategy,
and the speech recognition scores were also better with Compression Limiting than with
WDRC. This suggests that acoustic analysis of the aided speech signal provides
indicators of perceptual performance and thus has clinical applications.
3. Envelope difference index: The EDI value was greater at 65 dBSPL, and as the level
increased to 80 dBSPL there was a decrease in the EDI value for all the strategies. The change was
more marked for Peak Clipping and Compression Limiting than for Wide Dynamic Range
Compression. Since the results were from only one subject, they cannot be generalized.
Conclusion
The study represents a step towards resolving the clinical issue of how audiologists may choose
the right amplification strategy while prescribing hearing aids for severely hearing impaired
individuals. The acoustic analysis is an initial step in describing and quantifying the effects of
amplification strategies on phonemes and thus quantifying the benefits of amplification. In the
present study speech recognition scores were obtained in quiet, which does not reflect
real life situations. Further study may therefore be done to see the effect of amplification strategies on
speech perception in noise. Because of time restrictions the study was done on a small
group of subjects, so the results cannot be generalized. Replication on a larger number of
subjects may be carried out to validate the results. Studies on acclimatization effects in
long term use of hearing aids and their effect on speech perception can also be carried out. The
present study addressed only the effect of different amplification strategies on speech
perception and speech acoustics. However, there are other compression parameters, such as
compression threshold, attack time/release time and compression bands, that affect speech
perception. Further studies need to be done to see the effect of these parameters on speech
perception and speech acoustics.
References
Byrne, D. (1978). Selecting hearing aids for severely deaf children. British Journal of
Audiology, 12, 9-22.
Byrne, D., Parkinson, A. & Newall, P. (1990). Hearing aid gain and frequency response
requirements for the severely/ profoundly hearing impaired. Ear and Hearing, 11, 40-49.
Dreschler, W. A. (1988). Dynamic range reduction by peak clipping or compression and its
effects on phoneme perception in hearing impaired listeners. Scandinavian Audiology,
28, 49-60.
Erber, N. P. (1972). Speech-envelope cues as an acoustic aid to lip reading for profoundly deaf
children. Journal of Acoustical Society of America, 51, 1224-1227.
Faulkner, A., Rosen, S. & Moore, B. C. J. (1990). Residual frequency selectivity in the
profoundly hearing impaired listener. British Journal of Audiology, 24, 381-392.
Hickson, L. & Byrne, D. (1995). Acoustic analysis of speech through a hearing aid: effects of
linear vs compression amplification. Australian Journal of Audiology, 17, 1-13.
Hickson, L. M. H., Thyer, N. & Bates, D. (1999). Acoustic analysis of speech through a hearing
aid: consonant vowel ratio effects with two channel compression amplification. Journal
of the American Academy of Audiology, 10, 549-556.
Hickson, L. & Thyer, N. (2003). Acoustic analysis of speech through a hearing aid: Perceptual
effects of changes with two channel compression. Journal of the American Academy of
Audiology, 14, 414-426.
Jenstad, L. M. & Souza, P. E. (2005). Quantifying the effect of compression hearing aid release
time on speech acoustics and intelligibility. Journal of Speech Language and Hearing
Research, 48, 651-667.
Lamore, P. J. J., Verweij, C. & Brocaar, M. P. (1990). Residual hearing capacity of severely
hearing impaired subjects. Acta Otolaryngolica Supplement, 469, 7-15.
Marriage, J. E., Moore, B. C. J. & Stone, M. A. (2005). Effect of three amplification strategies
on speech recognition by children with severe and profound hearing loss. Ear and
Hearing, 26, 35-47.
Moore, B. C. J. (1996). Perceptual consequences of cochlear hearing loss and their implications
for the design of hearing aids. Ear and Hearing, 17, 133-16.
Moore, B. C. J. & Glasberg, B. R. (1998). Use of loudness model for hearing aid fitting. I. Linear
hearing aids. British Journal of Audiology, 35, 349-374.
Moore, B. C. J., Peters, R. W. & Stone, M. A. (1999). Benefits of linear amplification and
multichannel compression for speech comprehension in backgrounds with spectral and
temporal dips. Journal of Acoustical Society of America, 105, 400-410.
Moore, B. C. J. & Stone, M. A. (2004). Side effects of fast dynamic range compression that
affect intelligibility in a competing speech task. Journal of Acoustical Society of America,
116, 2311-2323.
Souza, P. E. (2002). Effects of compression on speech acoustics, intelligibility and sound quality.
Trends in amplification, 6, 131- 165.
Souza, P. E. & Turner, C. W. (1996). Effect of single-channel compression on temporal speech
information. Journal of Speech and Hearing Research, 39, 901-911.
Souza, P. E. & Turner, C. W. (1998). Multichannel compression, temporal cues and audibility.
Journal of Speech Language and Hearing Research, 41, 315-326.
Souza, P. E. & Turner, C. W. (1999). Quantifying the contribution of audibility to recognition of
compression-amplified speech. Ear and Hearing, 20, 12-20.
Villchur, E. (1973). Signal processing to improve speech intelligibility in perceptive deafness.
Journal of Acoustical Society of America, 53, 1646-1657.
Van Tasell, D.J. (1993). Hearing loss, speech and hearing aids. Journal of Speech Hearing
Research, 36, 228-244.
Van Tasell, D. J., Soli, S. D., Kirby, V. M. & Widin, G. P. (1987). Speech waveform envelope
cues for consonant recognition. Journal of Acoustical Society of America, 32, 1152-1160.
Walker, G. & Dillon, H. (1982). Compression in hearing aids: an analysis, a review and some
recommendations (NAL Report No. 90). Australian Government Publishing Service.
Some Aspects of Temporal Processing Deficits in Individual with
Learning Disability
Kishan M & Animesh Barman
Abstract
Individuals with learning disability are likely to have an auditory temporal processing deficit.
Auditory temporal processing deficits lead to an inability to process the three main temporal features of
speech sounds, namely envelope, periodicity and fine structure cues. Since the TMTF signal involves
envelope, periodicity and fine structure, TMTF assessment would help in understanding the
ability of the individual to perceive the amplitude variation in continuous speech. To address
this issue, the study examined (a) the TMTF across different modulation rates in individuals with
learning disability and individuals with normal hearing, (b) age related changes in TMTF
perception at different modulation rates in individuals with normal hearing and children with
learning disability, (c) phoneme recognition scores in the presence of noise
in normal hearing individuals and individuals with learning disability and (d) the
correlation between TMTF perception and phoneme recognition scores in the presence of noise
for children with learning disability. Measurements were made on 24 individuals with learning
disability and 20 normal hearing children. TMTF thresholds were obtained at different modulation
frequencies (fm: 4, 16, 32, 64 & 128 Hz). SPIN scores were obtained in noise at 0 dB SNR and without
noise in both the groups. TMTF thresholds were higher for children with learning
disability than for normal hearing subjects, with peak sensitivity at 16 Hz in normal hearing
subjects and at 4 Hz in individuals with learning disability. SPIN scores showed no significant difference
between normal hearing individuals and individuals with learning disability. Results suggest that TMTF is a better
predictor of temporal processing deficit.
Key Words: learning disability, temporal modulation transfer function, speech perception.
Introduction
Learning disability is a disorder in the psychological processes involved in understanding
or using language, spoken or written, which may manifest in an imperfect ability to listen, think,
speak, read, write, spell or do mathematical calculations. The causes of learning disability are
often unknown or poorly defined. Children with learning disability may have auditory processing
disorder, which has been experimentally investigated in many studies. Whether these auditory
processing deficits are merely associated with language disorder or act as a causal factor is yet
to be explored. A majority of studies in the literature report that a subgroup of children with
learning disability has auditory processing disorder. Tallal et al. (1996) described a deficit in
dyslexics involving processing of brief, rapidly changing auditory stimuli. This basic temporal
processing impairment underlies their inability to integrate sensory information that converges in
rapid succession in the central nervous system.
Lecturer in Audiology, All India Institute of Speech and Hearing, Mysore, India
e-mail: [email protected]
Natural speech is a complex signal which varies in frequency and amplitude with
respect to time. There are three main temporal features of speech, namely envelope, periodicity
and fine structure. A number of investigators have demonstrated that nearly perfect consonant
identification and sentence intelligibility can be achieved with speech stimuli processed to retain only
temporal modulation cues as low as 50 Hz (Shannon, 1992; Drullman et al.,
1994). Since the TMTF signal involves envelope, periodicity and fine structure, TMTF assessment
would help in understanding the ability of an individual to perceive the amplitude variation in
continuous speech.
The human auditory system has the capacity to resolve faster and slower changes in
amplitude and frequency with respect to time (separate 'fast' and 'slow' auditory systems). Any
defect in the development of these 'fast' and 'slow' auditory systems may be related to the rapid
processing deficits which are common to specific language impairment and
language learning disability. Individuals with learning disability, specifically dyslexics, are
impaired in processing rapidly varying signals, which may affect their speech perception
ability in noisy situations (Tallal et al., 1996).
TMTF has been extensively researched in various populations. Normal-hearing
listeners' sensitivity to sinusoidal amplitude modulation is relatively independent of
modulation frequency up to 50-60 Hz, and decreases progressively at higher modulation
frequencies (Viemeister, 1979; Bacon & Viemeister, 1985). For low modulation frequencies (up to 16
Hz), detection is limited by the amplitude resolution of the auditory system rather than its
temporal resolution. As the modulation frequency increases beyond 16 Hz, temporal resolution
starts to have an effect and the SAM detection threshold increases.
A severe reduction in sentence intelligibility, resulting from degraded consonant identification,
was observed in normal hearing individuals when the amplitude envelope was low pass filtered
(Drullman et al., 1994; Zeng et al., 1999). However, like other psychophysical measures, the TMTF
is affected by developmental changes: the psychophysical function varies with
development, as is the case for other psychophysical tests. Normative data obtained in one
population may not be suitable for all populations across the world, so an attempt has been made in
the present study to obtain norms for comparison.
Tallal et al. (1996) and Lorenzi et al. (2000) used TMTF as a tool to assess the
temporal processing ability of individuals with learning disability, assessed at two
modulation frequencies, 2 Hz and 128 Hz. Based on their study they reported that dyslexics exhibit
an impaired ability to perceive the faster modulation, which might lead to poor speech
perception in noise. They measured the ability to process temporal envelope cues in dyslexic
children by measuring detection thresholds for sinusoidal amplitude modulation (SAM). Each
threshold was measured at a slow rate and a faster rate, 4 Hz and 128 Hz respectively. Overall,
SAM thresholds were higher in dyslexics than in normal listeners at both rates. These findings are
consistent with Tallal's hypothesis, according to which the speech reading deficits in 35% of
dyslexics may be caused by impaired temporal processing, which plays an important role in
speech perception.
Zeng, Kong, Michalewski and Starr (2005) studied TMTF at different rates and obtained
different patterns in auditory neuropathy in comparison with normal listeners. There is a dearth of
information about the TMTF approach applied to individuals with learning disability in the Indian
context, and it has not been checked at different rates. The purpose of the present study was therefore to
perform a systematic study using different modulation rates in individuals with learning disability
and to see age related changes in TMTF perception. Hence an attempt was made to assess the
temporal processing ability of individuals with learning disability at different modulation frequencies.
Poor speech perception in noise by individuals with normal hearing and cochlear hearing
loss is mainly attributed to degradation caused by noise in processing the low modulation
frequencies of the speech signal (Noordhock et al., 1997). From the literature it can be understood
that poor speech perception may result when processing of the temporal modulation in the
speech signal is impaired. It has been reported in the literature that dyslexics and individuals with
language delay have poor perception in the presence of noise (Sandeep & Vanaja, 2004). Hence the
present study was conducted to investigate the phoneme perception ability of individuals with
learning disability in the presence of noise and to correlate it with TMTF thresholds.
The present study was taken up with the aim of examining: (a) the TMTF across
different modulation rates in children with learning disability having normal hearing
and children with normal hearing without learning disability; (b) age related changes in TMTF
perception at different modulation rates in children with normal hearing without learning
disability and children with learning disability having normal hearing; (c) phoneme
recognition scores in the presence of noise in normal hearing children and children
with learning disability; and (d) the correlation between TMTF perception and phoneme recognition
scores in the presence of noise for children with learning disability.
Method
Subjects consisted of 24 children with learning disability (age ranging from 8 to 15 years)
and 21 subjects with normal hearing (age ranging from 8 to 15 years), who formed the clinical
and control groups respectively. All the subjects from both groups underwent routine clinical
hearing evaluation to rule out hearing loss. Pure tone
audiometry was conducted using the modified Hughson-Westlake procedure (Carhart &
Jerger, 1959) and thresholds were obtained at octave frequencies from 250 Hz to 8 kHz
using a clinical diagnostic audiometer (OB 922) with TDH 39 headphones.
Speech identification scores were obtained by conducting speech audiometry using the clinical
diagnostic audiometer (OB 922) with TDH 39 headphones for each ear independently.
Phonetically balanced words developed by Mayadevi (1978) were presented monaurally at 40
dBSL or at the most comfortable level, and the speech recognition score was expressed as a
percentage.
Normal middle ear function was assessed using a GSI-Tympstar immittance audiometer.
Each ear was tested separately by placing an airtight probe tip with a 226 Hz probe tone, and
responses were recorded. Similarly, stapedial acoustic reflexes were measured at 4 frequencies (500
Hz, 1 kHz, 2 kHz & 4 kHz).
All the subjects in the clinical group had poor scholastic performance in reading, writing
and calculation. These subjects were assessed by an experienced speech-language pathologist and
a psychologist using standardized test materials. Learning disability was diagnosed using
“Early Reading Skills” developed by Rae and Potter in 1981, which assesses ability in terms
of the Alphabet test, Visual and auditory discrimination, Phoneme-Grapheme discrimination,
Structural analysis test and reading skills. The psychologist diagnosed the child as having learning
disability based on general assessment and a detailed case history with reference to
reading, writing, calculation and phoneme-grapheme analysis. All the subjects had undergone APD
tests such as the dichotic digit test, the dichotic consonant vowel test and the speech in noise test.
The majority of them showed poor scores in the APD tests administered. The selection criteria
for both groups were as follows:
Control group
Subject selection criteria:
1. All the subjects had pure tone thresholds within 15 dBHL
2. Speech identification scores were more than 90% in quiet
3. Speech identification scores in noise (0 dB SNR) were better than 80%
4. All of them had 'A' type tympanograms with acoustic reflexes present
5. No history of any other problems such as otological or neurological problems
6. The subjects were native speakers of Kannada and English was the medium of instruction
Clinical group
Subject selection criteria:
1. All the subjects had pure tone thresholds within 15 dBHL
2. Speech identification scores were better than 90% in quiet
3. All of them had 'A' type tympanograms with reflexes present
4. They were native speakers of Kannada and English was the medium of instruction
5. All of them were free of retardation, autism, brain damage or any other psycho-physical
dysfunction, which was ruled out by an experienced psychologist and speech language pathologist
and also by a detailed case history taken from the parents and school teachers
6. All of them were diagnosed as having learning disability by an experienced speech language
pathologist and psychologist
The clinical group and control group were each divided into four subgroups based on
age. Subgroup 1 consisted of 7 normal hearing children without learning disability and 6
children with learning disability in the age range of 8 years to 8 years 11 months. Subgroup 2 consisted
of 3 normal hearing children without learning disability and 9 children with learning disability
in the age range of 9 years to 9 years 11 months. Subgroup 3 consisted of 8 normal hearing
children without learning disability and 4 individuals with learning disability
in the age range of 10 years to 10 years 11 months. Subgroup 4 consisted of 3 normal hearing
children without learning disability and 5 children with learning disability in the age range of 11
years to 11 years 11 months.
The following instruments were used for the experiment:
A computer with speech editing software, used to generate the TMTF signal.
A calibrated 2 channel diagnostic audiometer (Orbiter 922), used to route the TMTF signal for
the assessment of temporal processing ability and to obtain speech identification scores in
quiet and in noise. Speech identification scores in noise were obtained at 0 dB
SNR.
The experiment was conducted in two phases: measurement of TMTF thresholds, followed by
phoneme perception in the presence and absence of noise.
Test stimuli
The stimuli consisted of unmodulated and sinusoidally amplitude modulated (SAM)
white noise of 500 ms duration with a ramp of 10 ms. The modulated signal was derived by multiplying
the white noise by a dc-shifted sine wave. The depth of modulation was controlled by varying
the amplitude of the modulating sine wave. The expression used to generate the modulated
noise is given below:
m(t) = [1 + m sin(2π fm t)] × n(t)
where m(t) is the modulated noise, m is the modulation depth (0<m<1), fm is the modulation
frequency (2, 4, 8, 16, 32, 64, 128, 256, 512 Hz) and n(t) is the white noise. Stimuli were low
pass filtered at 20 kHz. All the stimuli were generated using a 32 bit digital to analog converter
at a sampling frequency of 44.1 kHz.
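The stimulus expression above can be sketched in a few lines of Python. This is only an illustrative sketch, not the speech editing software used in the study: the Gaussian noise source, the linear shape of the 10 ms ramp, and the function and parameter names (sam_noise, depth, fm_hz) are assumptions, and the 20 kHz low pass filter is omitted.

```python
import math
import random

def sam_noise(depth, fm_hz, duration_s=0.5, fs_hz=44100, ramp_s=0.010, seed=0):
    """Return white noise multiplied by a dc-shifted sine wave (SAM noise)."""
    rng = random.Random(seed)
    n_samples = int(round(duration_s * fs_hz))
    n_ramp = int(round(ramp_s * fs_hz))
    samples = []
    for i in range(n_samples):
        t = i / fs_hz
        noise = rng.gauss(0.0, 1.0)  # n(t): Gaussian white noise (assumption)
        modulator = 1.0 + depth * math.sin(2.0 * math.pi * fm_hz * t)
        # Linear onset and offset ramp; the ramp shape is an assumption.
        gain = min(1.0, (i + 1) / n_ramp, (n_samples - i) / n_ramp)
        samples.append(gain * modulator * noise)
    return samples

standard = sam_noise(depth=0.0, fm_hz=16)  # unmodulated noise (depth 0)
target = sam_noise(depth=0.5, fm_hz=16)    # 50% modulation at 16 Hz
```

With the same seed the two calls share the same noise carrier, so the target differs from the standard only by the modulator term.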
Procedure adopted to establish TMTF threshold:
Subjects were instructed to discriminate the presence of SAM applied to a white noise
carrier. On each trial a standard and a target stimulus were successively presented in random
order to the listener. The standard consisted of white noise n(t). In the target, a white noise
carrier was sinusoidally amplitude modulated at a given modulation frequency.
SAM-detection thresholds were obtained using an adaptive two-interval, two-alternative
forced-choice (2I, 2AFC) procedure that estimates the modulation depth 'm'. During one of the
two 500 ms observation intervals, continuous wideband noise was sinusoidally modulated. The
observer was to discriminate the amplitude modulated noise from the unmodulated noise. The step size
and threshold were based on the modulation depth in decibels (Am = 20 log m). The amplitude of
the modulation was varied according to the following rule: 'Am' decreased by 3 dB following a
correct response and increased by 3 dB following an incorrect response. The lowest 'Am' at
which modulation was detected was considered as threshold. The
poorest threshold that can be measured is 0 dB, which corresponds to a modulation depth of 1
(100% modulated noise).
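The tracking rule and the decibel conversion described above can be illustrated with a short sketch. It assumes a starting depth of 0 dB and a fixed number of trials, neither of which is stated in the text, and it uses a hypothetical deterministic listener in place of real responses; all function names are illustrative.

```python
import math

def depth_to_db(m):
    """Modulation depth m (0 < m <= 1) expressed in dB: Am = 20 log10(m)."""
    return 20.0 * math.log10(m)

def db_to_depth(am_db):
    """Inverse conversion, from Am in dB back to modulation depth m."""
    return 10.0 ** (am_db / 20.0)

def run_staircase(respond, start_db=0.0, step_db=3.0, n_trials=20):
    """Track Am: 3 dB down after a correct response, 3 dB up after an error.

    respond(am_db) returns True when the listener picks the modulated interval.
    The start level and trial count are assumptions made for this sketch.
    """
    am_db = start_db
    detected = []
    for _ in range(n_trials):
        if respond(am_db):
            detected.append(am_db)
            am_db -= step_db
        else:
            am_db = min(0.0, am_db + step_db)  # depth cannot exceed 100% (0 dB)
    # Threshold: the lowest Am at which modulation was detected.
    return min(detected) if detected else None

# Hypothetical listener who detects modulation down to -9 dB.
listener = lambda am_db: am_db >= -9.0
threshold = run_staircase(listener)
print(threshold, round(db_to_depth(threshold), 3))
```

More negative values of Am correspond to smaller modulation depths and therefore to better sensitivity.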
The testing was conducted in a sound treated room where the noise level was within
permissible limits (ANSI-1996). All the stimuli were presented at 40 dB SL. The stimuli were
played from a computer, routed through an audiometer (OB-922) and presented through a
loudspeaker placed 1 meter away from the subject at 0 degree azimuth. The
presentation level was varied in all the subjects for at least one modulation frequency, and the
modulation detection threshold was measured again, to ensure that subjects were not relying on
loudness judgments.
The second phase of the study included phoneme perception in the presence and absence
of noise. Subjects were instructed to repeat the phoneme they heard. Speech
material was presented live through the Orbiter 922 clinical audiometer. Stimuli were presented
at 40 dB SL or at the comfortable level through headphones.
An open set phoneme recognition paradigm was used, in which the listener had to listen to each
phoneme token and repeat the tokens in the proper order in the quiet condition. Furthermore, the same
speech stimuli were presented monaurally in the presence of noise at 0 dB SNR. The order of
presentation of the test material was randomized between the conditions for the same subjects to
avoid practice effects. The number of correct responses was then converted to a percentage.
Results and Discussion
The obtained data were statistically analyzed using SPSS (version 15) software, and the results are
discussed separately using statistical values.
A. Detection threshold of sinusoidal amplitude modulation:
The following graph shows the TMTF threshold obtained in both control and clinical
group along with standard deviations.
Figure 1: Mean TMTF thresholds along with standard deviations at different modulation rates for
individuals with normal hearing without learning disability (Normal) and children with learning
disability without hearing loss (LD).
It can be seen in the figure that normal hearing subjects display a typical low-pass
characteristic, i.e. hearing is most sensitive to slow modulation but becomes less sensitive
as the modulation frequency increases, with peak sensitivity at 16 Hz. A similar
low-pass characteristic was also displayed by children with learning disability. However,
they showed a much broader response pattern, with peak sensitivity at 4 Hz.
Table 1: Post-Hoc test results of different modulation frequencies for children with LD and with
normal hearing
Modulation Frequency    Number of subjects    Subset 2    Subset 1
4.00                    21                    -10.57      -
16.00                   21                    -10.14      -
32.00                   21                    -9.57       -
64.00                   21                    -7.92       -7.92
128.00                  21                    -           -5.78
Sig                     -                     .146*       .338*
* p< 0.01
A repeated measures ANOVA was performed to assess whether the mean thresholds differed
significantly between the two groups across the modulation frequencies. The analysis showed a
significant main effect of group [F(1, 100) = 7.65, p < 0.01]. No significant interaction between
group and modulation frequency was observed [F(4, 100) = 1.18, p > 0.01]. Scheffé’s post hoc
analysis was carried out between the groups across the modulation rates. The results
indicate that the TMTF threshold at the 128 Hz modulation frequency differed significantly from
the thresholds at 4, 16 and 32 Hz. However, no significant difference between the 64 Hz and
128 Hz thresholds was observed for either group.
The normal hearing subjects had significantly lower TMTF thresholds. SAM-detection
thresholds were relatively constant up to 16 Hz, but sensitivity declined gradually beyond 16 Hz
as the modulation frequency increased. Bacon and Viemeister (1985) observed similar changes in
their study. This may be because individuals with normal hearing show significantly larger
physiological responses to amplitude modulation (AM), reflecting synchrony of neural firing
to the modulation (McAnally & Stein, 1997). Children with learning disability had
significantly higher amplitude modulation depth thresholds than the normal hearing
subjects. The difference in TMTF performance between the two groups could be because
individuals with learning disability show significantly smaller physiological responses to
amplitude modulated signals than those with normal hearing, since they require more synchronous
firing of neurons (McAnally & Stein, 1997). As the modulation frequency increases, the
amplitude fluctuations become increasingly smoothed, and the observer therefore requires a greater
amplitude change in order to resolve the fluctuations (Viemeister, 1979), resulting in an increase in
the TMTF threshold in both groups.
Poor performance by individuals with learning disability is likely to reflect a true deficit
in AM sensitivity rather than difficulty in performing the task, as shown by previous
authors (McAnally & Stein, 1997). In their electrophysiological study, dyslexic
children also had significantly smaller physiological responses to AM than control group
subjects, which the authors attributed to a loss of synchrony of the neural response
to modulation. Such a loss would result in the higher sinusoidal amplitude modulation thresholds
seen in individuals with learning disability.
The TMTF for individuals with learning disability was similar to that previously
described by McAnally and Stein (1997) and Lorenzi et al. (2000). However, the thresholds obtained
in this study were higher at each modulation frequency. This variation may be attributed to
procedural differences in how the response was elicited, i.e. whether the task was one of
identification (Lorenzi et al., 2000) or discrimination.
B. Age related changes in TMTF perception in normal hearing subjects and children with
learning disability
It is evident from Table 2 that the TMTF threshold decreases as age increases.
However, the difference in TMTF threshold across age is not significant. This pattern
was observed in both the normal hearing and the learning disability groups.
Table 2: Mean (M) TMTF thresholds at each modulation frequency, with standard deviations (SD), for the different age groups
Age                   Normal Subjects TMTF                      Learning Disability Subjects TMTF
                      4 Hz    16 Hz   32 Hz   64 Hz   128 Hz    4 Hz    16 Hz   32 Hz   64 Hz   128 Hz
8 - 8.11         M    -17.5   -17.0   -13.28  -12.0   -9.8      -3.5    -3.5    -6.0    -6.5    -3.5
(N=7, LD=6)      SD   2.07    2.4     1.6     2.4     8.2       6.1     6.1     0       1.2     5.1
9 - 9.11         M    -17.0   -15.0   -14.0   -4.0    -9.0      -8.0    -7.6    -7.3    -6.5    -3.6
(N=3, LD=9)      SD   1.7     3.0     3.4     3.0     8.0       1.5     1.5     1.5     1.2     1.3
10 - 10.11       M    -15.85  -12.85  -11.14  -10.71  -9.42     -8.25   -6.75   -6.0    -4.5    -4.5
(N=8, LD=4)      SD   2.1     2.7     3.1     2.2     5.6       1.5     1.5     2.4     1.7     1.7
11 - 11.11       M    -13.50  -13.50  -12.00  -10.5   -4.50     -7.80   -7.20   -6.6    -4.82   -3.60
(N=3, LD=5)      SD   1.7     1.7     1.7     1.6     5.1       1.6     1.6     1.3     1.6     1.3
N - Number of subjects in normal group; LD - Number of subjects in learning disability group
Table 3: Post hoc test results for age related changes in TMTF thresholds across modulation
frequencies in normal hearing subjects and individuals with learning disability
Age            Learning Disability     Normal Hearing
               N      Subset           N      Subset
8.00-8.11      6      -5.400           6      -11.700
9.00-9.11      9      -6.446           9      -11.866
10.00-10.11    4      -6.000           4      -14.250
11.00-11.11    5      -6.000           2      -12.300
Sig                   0.537+                  0.160+
+ Not significant
In the present study, TMTF perception on the psychophysical test was compared using a mixed
analysis of variance to look for developmental changes in TMTF modulation depth performance.
The analysis showed no significant difference within the subgroups of either the control or
clinical groups at each frequency. A significant difference may not have been seen in this study
because the age range selected was relatively high, and maturation of temporal processing might be
complete by 12 years of age. However, this also depends on the type of temporal processing task
involved (Chermak & Musiek, 1997).
Hall and Grose (1994) interpreted the time constants obtained across all their age groups as
indicating that the peripheral encoding of the temporal envelope is probably adult-like in children
aged 4 years and above. However, young children appear to be
relatively inefficient in processing the information underlying modulation detection. A similar
trend was not seen in this study because the selected age range was much higher. Hence TMTF
maturation might have been completed much earlier, with responses already adult-like.
C. Comparison of speech perception ability in the presence of noise between the normal
hearing subjects and children with learning disability:
The speech perception scores obtained at 0 dB SNR were better in the normal hearing subjects
than in the individuals with learning disability. There was no significant
difference between the SIS scores of the left and right ears in either group. Hence the scores were
combined to compare the performance of the normal hearing subjects and the children with
learning disability. A paired samples t-test was used to test for a significant
difference between the ears, and an independent samples t-test to compare the groups.
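The two comparisons described above can be sketched as follows. This is a minimal illustration, not the SPSS procedure itself: it assumes the pooled-variance (equal variances) form of the independent samples test, which the text does not specify, and the function names and scores in the example are hypothetical.

```python
import math
from statistics import mean, variance


def paired_t(left_scores, right_scores):
    """Paired samples t statistic for left-ear vs right-ear scores of the same subjects."""
    diffs = [l - r for l, r in zip(left_scores, right_scores)]
    n = len(diffs)
    return mean(diffs) / math.sqrt(variance(diffs) / n)


def independent_t(group_a, group_b):
    """Independent samples t statistic, assuming equal variances (pooled estimate)."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))


if __name__ == "__main__":
    # Hypothetical percent-correct scores, for illustration only.
    normal = [88.0, 92.0, 84.0, 96.0, 90.0]
    ld = [72.0, 80.0, 64.0, 88.0, 76.0]
    print("between groups t =", round(independent_t(normal, ld), 3))
```

The resulting t value is compared against the t distribution with n - 1 degrees of freedom for the paired test and n1 + n2 - 2 for the pooled independent test.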
Figure 2: Mean speech identification scores in noise (scale 0 to 100) obtained in children with normal hearing and children with learning disability
Results showed a significant difference in performance between the
groups in the right ear [t = 3.07, p < 0.01] and the left ear [t = 3.2, p < 0.01]. Similar results
were reported by Ferre and Wilber (1986), who found that individuals
with learning disability perform poorly on CAPD tests, including the speech in noise test.
Lorenzi et al. (2000) measured identification of unprocessed speech and of speech-envelope noise,
and observed that individuals with dyslexia were poorer at processing
the speech-envelope noise than normal hearing subjects. However, in the present
study a few children with learning disability performed as well as the normal
hearing subjects. This might be because not all children with learning disability exhibit
an auditory processing problem.
Speech is a complex signal whose amplitude and spectral content vary over time
(temporal envelope). Background noise masks these variations in
frequency and amplitude, so the signal becomes less redundant and harder to process.
Chermak and Musiek (1997) reported that individuals with a normal auditory system are
able to attend selectively to the speech spectrum and ignore the background noise, whereas an
individual with an auditory processing problem will fail to extract the information from the complex
signal. This might have resulted in the poor speech recognition scores in the presence of noise
in the learning disability group.
D. Correlation between TMTF perception and phoneme recognition in the presence of
noise obtained from individuals with learning disability:
In the present study, to examine the correlation between SPIN scores and TMTF thresholds,
peak sensitivity was taken as the lowest threshold across modulation frequencies, and this was
correlated with the SPIN scores of the children with learning disability. Pearson’s product
moment correlation was used to analyze the data.
The analysis showed a significant correlation between these two variables
(r = -0.39, p < 0.01). However, a few subjects had SPIN scores equal to those
of the normal group. This may be because the speech in noise test is a less sensitive and less
reliable tool for assessing auditory processing deficits (Chermak & Musiek, 1997). The results of the
current study suggest that TMTF is a more sensitive test of temporal processing ability than the
speech in noise test, and can help to differentiate processing problems that are auditory based
rather than linguistic based. However, to date no study has reported the correlation between
TMTF and speech perception in the presence of noise.
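The analysis described in this section can be sketched in a few lines. The sketch below takes peak sensitivity as the lowest threshold across modulation frequencies, as stated above, and computes Pearson’s product moment correlation by hand; the function names and the per-child values are hypothetical and serve only to show the shape of the computation.

```python
import math


def peak_sensitivity(thresholds_db):
    """Lowest (most sensitive) TMTF threshold across the tested modulation frequencies."""
    return min(thresholds_db)


def pearson_r(x, y):
    """Pearson product moment correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical data: one row of thresholds (4, 16, 32, 64, 128 Hz) per child, plus a SPIN score.
    tmtf_thresholds = [
        [-8.0, -7.5, -6.0, -5.0, -3.5],
        [-4.0, -5.5, -5.0, -4.5, -3.0],
        [-9.5, -8.0, -7.0, -6.5, -4.0],
        [-6.0, -6.5, -5.5, -4.0, -3.5],
    ]
    spin_scores = [76.0, 60.0, 84.0, 72.0]
    peaks = [peak_sensitivity(row) for row in tmtf_thresholds]
    print("r =", round(pearson_r(peaks, spin_scores), 2))
```

A negative r, as reported in the text, means that lower (better) peak thresholds go with higher SPIN scores.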
Conclusion
From the above discussion it can be concluded that children with learning disability required a
greater modulation depth to perceive the modulation than the normal group. Peak sensitivity is
higher for normal hearing children than for children with learning disability. SPIN scores are
likely to be poorer in the learning disability group than in the normal group, suggesting a temporal
processing deficit in learning disability. TMTF could be a better test of temporal processing than SPIN.
The data obtained from the normal hearing group at different modulation rates can be used as
normative data (as shown in Figure 1).
Clinical implications:
1. TMTF is an effective, non-invasive, quick and sensitive tool which helps to identify temporal
processing disorder, especially in individuals with learning disabilities.
2. TMTF performance in combination with SPIN scores gives a better idea of whether the
processing problem is linguistic based or auditory based.
3. TMTF perception indirectly assesses how well an individual can perceive speech.
4. It can give an early indication of risk of learning disability, aiding diagnosis.
5. It can also be used in rehabilitation.
References
American National Standards Institute. (1996). “American National Standard Maximum
Permissible Ambient Noise Levels for Audiometric Test Rooms”. ANSI S3.1- (1996).
New York: American National Standards Institute.
Baccon, N. F. & Viemester. (1984). Temporal modulation transfer function in normal hearing
and hearing impaired listeners. British Journal of Audiology, 24, 117-134.
Bacon, S.P. & Viemeister, N.F. (1985). Temporal modulation transfer functions in normal-
hearing and hearing–impaired subjects. Audiology, 24, 117-134.
Carhart, R. & Jerger, J. F. (1959). Preferred method for clinical determination of pure-tone
thresholds. Journal of Speech and Hearing Disorders, 24, 330-345.
Chermak, G.D. & Musiek, F. E. (1997). Central Auditory Processing Disorders: New
perspectives. San Diego: Singular Publishing Group
Drullman, R., Festen, J.M. & Plomp, R. (1994). The effect of temporal envelope smearing on
speech reception. Journal of the Acoustical Society of America, 95, 1053-1064.
Drullman, R., Festen, J.M. & Plomp, R. (1994). Effects of reducing slow temporal modulation
in speech reception. Journal of the Acoustical Society of America, 95, 2670-2680.
Kraus, N., McGee, T. J., Carrell, T.D., Zecker, S.G., Nicol, T.G. & Koch, D.B. (1996). Auditory
Neurophysiology Responses and Discrimination Deficits in children with learning
problem. Science, 273, 971-973.
Lorenzi, C., Wable, J., Moroni, C., Derobert, C. & Belin, F. C. (2000). Auditory temporal
envelope processing in a patient with left-hemisphere damage. Neurocase, 6, 231-244.
Mayadevi (1978). The development and standardization of a common speech discrimination test
for Indians. Unpublished master dissertation, University of Mysore, Mysore.
McAnally, K.I. & Stein, J. F. (1997). Scalp potentials evoked by amplitude-modulation tones in
dyslexia. Journal of Speech, Language and Hearing Research, 40, 939-945.
Sandeep, M. & Vanaja, C.S. (2004). Auditory long latency responses in children with learning
disability and normal hearing subjects. Unpublished dissertation, submitted to university
of Mysore.
Shannon, R. V. (1992). Temporal modulation transfer function in cochlear implant patients.
Journal of the Acoustical Society of America, 91, 2156-2164.
Tallal, P., Miller, S.T., Bedi, G., Byma, G., Wang, X., Nagarajan, S., Schreiner, C. & Jenkins,
W. (1996). Language Comprehension in Language Learning Impaired Children
Improved with Acoustically Modified Speech. Science, 271, 81-84.
Viemeister, N. F. (1979). Temporal modulation transfer function based upon modulation
thresholds. Journal of the Acoustical Society of America, 66, 173-178.
Zeng, F. G., Oba, S., Garde, S., Sininger, Y. & Starr, A. (1999). Temporal and speech processing
deficits in auditory neuropathy. Neuroreport, 10, 3429-3435.
Zeng, F.G., Kong, Y. Y., Michalewski, H.J. & Starr, A. (2005). Perceptual consequences of
disrupted auditory nerve activity. Journal of Neurophysiology, 93, 3050-3063.
Speech recognition of spectrums with ‘holes’ by children
Manasa Ranjan Panda & Asha Yathiraj
Abstract
The identification of speech having ‘holes’ in various bands of the spectrum was assessed
in 30 normal hearing children. The speech material was developed to simulate perception
through a twenty-two channel cochlear implant using a noise band simulation technique. The
‘holes’ in the spectrum were created by filtering out specific frequency bands which corresponded to
the frequency bands of adjacent electrodes in a Nucleus cochlear implant. Responses were
scored in terms of words and phonemes as well as consonants and vowels. It was found that
there were fewer consonant errors than vowel errors for both word scoring and phoneme scoring.
This error pattern was more pronounced with larger ‘hole’ sizes. The younger children in
this study also performed significantly poorer than the older children.
Key words: spectral hole, filter, simulation.
Introduction
It is generally accepted that humans rely on cues that exist across several frequency bands
to understand speech. The question of how listeners use and combine information across several
frequency bands when understanding speech has puzzled researchers for many decades. Speech can be
recognized even when the spectral information is reduced to three sinusoids that track the
formant transitions over time (Remez, Rubin, Pisoni & Carrell, 1981). A high degree of speech
discrimination and recognition is observed even under conditions of greatly reduced spectral
information (Van Tasell, Greenfield, Logemann & Nelson, 1992; Shannon, Zeng, Wygonski,
Kamath & Ekelid, 1995; Turner, Souza & Forget, 1995).
Understanding how speech is perceived after being processed through a cochlear implant
is a challenge. In cochlear implants, a relatively small number of electrodes activate tonotopic
patches of neurons with a portion of the speech signal. Even with these abnormalities in the
physiological process, Shannon et al. (1995) found that high levels of phoneme, word and sentence
recognition could be achieved by adults with just four bands of information. This observation
indicates how little is understood about the recognition of speech under conditions of distorted
spectral information.
Cochlear implantation is based on the idea that there are surviving neurons in the vicinity
of the electrodes placed in the cochlea. A lack of hair cells and/or surviving neurons within
areas of the cochlea essentially creates ‘hole(s)’ in the spectrum. The influence of ‘holes’ in
the spectrum on speech understanding is not well understood. It is not known whether spectral
‘holes’ can account for some of the variability in performance among cochlear implant users
(Kasturi, Loizou, Dorman & Spahr, 2002). Hence, it is of interest to find out whether recognition
is affected by the pattern of ‘holes’ in speech.
Professor of Audiology, All India Institute of Speech and Hearing, Mysore, India
e-mail: [email protected]
Shannon, Galvin and Baskent (2001) assessed the impact of the size and location of spectral
‘holes’ in cochlear implantees and normal hearing listeners. Their results showed that holes in the
low frequency region are more damaging to speech recognition than holes in the middle or high
frequency region. Vidya, Rima and Yathiraj (2006) evaluated the perception of seven lists of
words with different band rejections and one list with no modification in thirty normal hearing
adults. They found that perception was not altered despite information being filtered from the
speech signal. Based on their findings, they interpreted that as many as eight adjacent electrodes
could be switched off in a cochlear implant without affecting speech identification. This gives an
insight into how cochlear implant users perceive speech even when specific electrodes are
switched off.
Research findings contradict each other regarding the number of electrodes required
for better speech performance. Holmes, Kemker and Mervin (1987) evaluated the speech perception
of a patient fitted with a multichannel processor under different electrode conditions. Their study
suggested that the subject’s speech perception improved as the number of programmed electrodes
increased. This finding contradicts that obtained by others (Dorman, Loizou &
Rainey, 1997; Shannon et al., 1995; Turner, Souza & Forget, 1995). Dorman et al. (1997)
processed vowels, consonants and sentences through software emulations of cochlear implant
signal processors with 2-9 output channels. They found that high levels of speech understanding
could be obtained using signal processors with a small number of channels. Despite many
investigations, the basic question of “how many electrodes for speech information” is still to be
answered.
Knowing the relation between the number of electrodes activated and speech
recognition is necessary for designing future devices as well as for counselling. Clinically, a
number of conditions may necessitate reducing the number of electrodes in a cochlear implant or
not programming the electrodes. The effect of spectral resolution on speech recognition has
received considerable attention in the last few years. However, this attention has mainly been
concentrated on adults (Dorman et al., 1997; Fu, Zeng, Shannon & Soli, 1998; Holmes et al., 1987;
Shannon et al., 1995, 2001; Vidya et al., 2006). It is theoretically and practically important to
understand whether limited spectral resolution is a key factor for children also. Eisenberg, Shannon,
Martinez, Wygonski and Boothroyd (2000) reported that young children did not have sufficiently
developed speech pattern recognition. Thus children may require more spectral channels than
adults to obtain similar speech recognition skills. Hence there is a need to assess the
effects of spectral ‘hole(s)’ in children using cochlear implants.
Besides being evaluated in individuals using cochlear implants, the effect of spectral ‘holes’
has also been evaluated in normal hearing individuals using simulated material. This has been a
preferred method of study due to the ease with which the research can be conducted. It has also
been found that controlling subject-related variables is a lot easier in a simulated condition.
Thus the present study aims at investigating speech recognition in children using a simulated
cochlear implant condition. Recognition of speech with varying spectral ‘holes’ will be studied.
Method
Participants
Thirty children with normal air conduction and bone conduction thresholds in the
frequency ranges of 250 Hz to 8 kHz and 250 Hz to 4 kHz respectively participated in the study.
The children were in the age range of 7-10 years. They were native speakers of Kannada and
were able to read and write the language. They had no history of any neurological disorders and
had normal middle ear functioning as measured by tympanometry and acoustic reflexes.
Instrumentation
A Pentium IV computer with Matlab software was used for the development of the
material. A CD burner (Nero 7 Ultra Edition) was used to write the material on a CD. To
evaluate as well as present the test items a calibrated two channel audiometer (Madsen OB 922)
with TDH 39 earphone was used. The middle ear status was determined with measurements from
GSI Tympstar immittance meter. The speech material was played through a Philips CD player.
Material development
The speech identification test material “The Kannada Phonemically Balanced words”
developed by Yathiraj and Vijayalakshmi (2005) was used to simulate speech processed through
a cochlear implant. The test consisted of four lists of bisyllabic words, with each list having 25
words. The CD recorded version of the test was used. The procedure described below was used
to simulate speech processed through a twenty-two channel cochlear implant.
The words of each of the original lists were randomized to create eight lists. These speech
signals were band pass filtered into twenty-two frequency bands using 6th order Butterworth
filters. Crossover and center frequencies were calculated using the following equation relating
the position on the basilar membrane to its characteristic frequency and assuming a basilar
membrane length of 35 mm (Greenwood, 1990, cited in Rosen, Faulkner & Wilkinson, 1999):
$$\text{Frequency} = 165.4\,\left(10^{0.06x} - 1\right)$$
$$x = \frac{1}{0.06}\,\log_{10}\!\left(\frac{\text{Frequency}}{165.4} + 1\right)$$
The envelope was detected at the output of each filter by full wave rectification followed by a
second order Butterworth low pass filter at 400 Hz. Forward and backward filtering were used to
cancel the phase delays. These envelopes were then multiplied by signal correlated noise. Before
being summed, the signal was passed through an output filter similar to the analysis filter (Figure
1). Two conditions were created: the no ‘hole’ condition and the “dropped” condition. In the dropped
condition, seven lists were created in which the output noise bands were simply omitted (band
pass filtered with a 6th order Butterworth filter) from the processed signal. The ‘hole’ size was
calculated for each of the dropped conditions. The frequency bands of the band stop
filters and the ‘hole’ sizes are shown in Table 1. These band rejections were created at
frequencies which corresponded to the frequency bands of groups of adjacent electrodes in a
Nucleus cochlear implant. The above procedure was carried out using Matlab software. In the no
‘hole’ condition, no band pass filtering was carried out. Prior to each set of test stimuli, a 1 kHz
calibration tone was also recorded. The altered stimuli were recorded on a compact disc (CD).
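The ‘hole’ sizes can be approximated from the place-frequency equation given earlier in this section. The following minimal sketch assumes that a ‘hole’ size is the distance in mm between the basilar membrane positions of the lower and upper edges of the rejected band, which the text implies but does not spell out; the function names are illustrative, and the constant 165.4 is used in both directions. The values it prints come out close to, though not always identical with, those listed in Table 1, which may reflect rounding or slightly different constants in the original calculation.

```python
import math

A_HZ = 165.4          # scaling constant from the place-frequency equation in the text
ALPHA_PER_MM = 0.06   # exponent constant from the same equation


def place_from_frequency(f_hz):
    """Position x (mm) on the basilar membrane for a characteristic frequency in Hz."""
    return (1.0 / ALPHA_PER_MM) * math.log10(f_hz / A_HZ + 1.0)


def frequency_from_place(x_mm):
    """Characteristic frequency (Hz) at position x (mm); inverse of place_from_frequency."""
    return A_HZ * (10 ** (ALPHA_PER_MM * x_mm) - 1.0)


def hole_size_mm(low_hz, high_hz):
    """Assumed definition: distance in mm spanned by the rejected frequency band."""
    return place_from_frequency(high_hz) - place_from_frequency(low_hz)


if __name__ == "__main__":
    # Rejected bands as listed in Table 1 of this article.
    bands = {
        "LIST 1": (438, 813),
        "LIST 2": (938, 1313),
        "LIST 3": (1563, 2313),
        "LIST 4": (2688, 4063),
        "LIST 1 A": (4688, 6938),
        "LIST 2 A": (438, 1063),
        "LIST 3 A": (438, 1313),
    }
    for name, (low, high) in bands.items():
        print(f"{name}: {hole_size_mm(low, high):.2f} mm")
```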
Figure 1: Block diagram of the processing used for transforming the speech signal
[Block diagram: speech input, analysis filters, full wave rectifiers, 400 Hz low pass filters, multipliers (fed by a white noise source), output filters, sum of noise bands]
Test Environment
The test was carried out in an acoustically treated two-room audiometric set-up.
The noise level was within the permissible limits recommended by ANSI (1991).
Table 1: Band rejection done for specific lists
Lists       Frequency bands   Hole size
LIST 1      438-813 Hz        3.4 mm
LIST 2      938-1313 Hz       2.0 mm
LIST 3      1563-2313 Hz      2.6 mm
LIST 4      2688-4063 Hz      2.8 mm
LIST 1 A    4688-6938 Hz      2.8 mm
LIST 2 A    438-1063 Hz       5.1 mm
LIST 3 A    438-1313 Hz       6.4 mm
LIST 4 A    No filter         -
Procedure
The hearing sensitivity of the participants was assessed using a calibrated two channel
audiometer (Madsen OB 922) with TDH 39 earphone. A GSI Tympstar immittance meter was
used to evaluate the status of the middle ear and to obtain acoustic reflexes. Thirty children who
met the subject selection criteria participated in the study.
The developed material was played using a Philips CD player. The output from the player
was routed to the tape input of the two-channel audiometer. Before the speech signals were
presented, the recorded 1 kHz calibration tone was played and used to adjust the VU meter
deflection to zero. The signal was presented at 40 dB HL. The participants heard the output from
the audiometer through circumaural headphones. Half the participants heard the signal
in the right ear and the other half in the left ear. The order in which the lists were presented was
randomized to avoid any list order bias. The participants were instructed to write down the words
they heard.
Scoring
The written responses of the participants were scored in terms of phonemes as well as
words that were correctly identified. For word scoring, each correct response was given a score
of one and a wrong response a score of zero. The maximum possible word score for each list was
25. For phoneme scoring, each correctly identified phoneme was scored as one and each wrong
phoneme as zero. The maximum possible phoneme scores for lists 1/1A, 2/2A, 3/3A and 4/4A were
103, 103, 104 and 106 respectively.
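The scoring scheme above amounts to simple counting followed by conversion to a percentage of the list maximum. A minimal sketch is given below; the function names are illustrative, exact string matching is assumed for deciding whether a written word is correct, and the maximum scores are those stated in the text.

```python
MAX_WORD_SCORE = 25
# Maximum phoneme score per list pair (1/1A, 2/2A, 3/3A, 4/4A), as stated in the text.
MAX_PHONEME_SCORE = {"1": 103, "2": 103, "3": 104, "4": 106}


def score_words(targets, responses):
    """One point per written response that exactly matches its target word (assumed criterion)."""
    return sum(1 for target, response in zip(targets, responses) if target == response)


def word_percent(raw_score):
    """Raw word score expressed as a percentage of the 25-word maximum."""
    return 100.0 * raw_score / MAX_WORD_SCORE


def phoneme_percent(raw_score, list_number):
    """Raw phoneme score as a percentage of that list's maximum; list_number is '1' to '4'."""
    return 100.0 * raw_score / MAX_PHONEME_SCORE[list_number]


if __name__ == "__main__":
    # A mean raw word score of 12.27 out of 25 corresponds to 49.08%.
    print(f"{word_percent(12.27):.2f}%")
```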
Results and Discussion
Statistical analysis was done using the SPSS software (version 10.0). Descriptive
statistics and a repeated measures ANOVA were carried out on the data obtained from the thirty
children to check the significance of differences in speech identification scores across the
filter conditions. This was done across all the filter conditions as well as the non-filter condition.
Both word scores and phoneme scores were analyzed for each filter condition.
A) Effect of different band stop filters on speech identification scores
The means and standard deviations of the word scores and phoneme scores are shown in
Table 2. This information is provided for all eight lists that were evaluated. Details of the word
and phoneme scores are discussed below.
Table 2: Mean and SD for the word and phoneme scores for different filter conditions
Lists       Band rejections (Hz)   Word score                           Phoneme score
                                   Mean scores      SD of raw scores    Mean scores      SD of raw scores
List I      438-813                49.08% (12.27)   5.56                80.30% (82.73)   9.01
List II     938-1313               52.52% (13.13)   6.13                84.36% (86.90)   8.28
List III    1563-2313              56.28% (14.07)   6.02                83.46% (86.80)   10.00
List IV     2688-4063              54.00% (13.50)   7.16                81.00% (85.86)   11.51
List IA     4688-6938              57.48% (14.37)   5.85                85.95% (88.53)   8.53
List IIA    438-1063               58.28% (14.57)   5.67                87.01% (89.63)   7.62
List IIIA   438-1313               52.00% (13.00)   6.56                83.71% (87.06)   9.93
List IVA    No Filter              57.20% (14.30)   6.34                85.08% (90.13)   9.03
Note: Maximum word score = 25; Maximum phoneme scores vary from 103 to 106
Values given in brackets refer to the raw scores
The word scores were relatively low across filter conditions, including the unfiltered
condition. This indicates that the material simulating speech processed through a twenty-two
channel cochlear implant resulted in reduced word scores both with and without band stop filters.
This highlights that the simulation introduced distortion, reducing the overall
intelligibility of the developed material.
A repeated measures ANOVA revealed a significant difference between the
word scores across the different lists [F(7, 203) = 3.957, p < .05]. Bonferroni multiple
comparison tests further indicated that the word scores differed significantly
between List I and List IIIA, and between List IV and List IVA, at the 0.05 level of significance.
Surprisingly, though the ‘hole’ was larger for List IIIA, performance was poorer for List I.
Though both lists were equal in terms of phonemic balance and difficulty, variations in the
test items could have led to the difference in performance. The participants may have been able
to utilize coarticulatory information to a greater extent in List IIIA than in List I.
A significant difference was also found between List IV and List IVA. List IVA was
without any ‘hole’ whereas List IV had a ‘hole’ of 2.8 mm. The cutoff frequencies used to create
List IV were in the frequency region of the second formant (F2) for several speech sounds, and
F2 is a major cue for differentiating vowels (Carlson, Fant & Granström, 1975). Hence removal of
F2 information in List IV might have led to a significant difference compared to the no
‘hole’ condition. However, no significant differences were found for word scores across the other
lists. Based on this finding it can be inferred that the listeners were generally able to combine
information across various frequency bands to perceive a “whole” signal.
The phoneme scores were also found to differ significantly across the eight lists on the
ANOVA [F(7, 203) = 7.863, p < .05]. The significant differences were
between List I and Lists IA, III, IIIA, IV and IVA. In addition, List IV differed significantly
from List IVA. The difference in phoneme scores can be attributed to the increase in ‘hole’
size. The ‘hole’ size for List I was larger (3.44 mm) than that for List IA (2.8 mm), List III
(2.6 mm) and List IV (2.8 mm).
The finding of the present study is in agreement with the findings of Shannon et al.
(2001). They too reported that speech recognition decreased as the ‘hole’ size increased in
normal hearing adults. They reported that a 4.5 mm ‘hole’ caused performance to decrease
significantly for consonant and vowel recognition. However, in the present study, it was found
that a ‘hole’ size of more than 2.6 mm was able to reduce speech recognition. This difference
between the findings of the two studies might be due to the test procedure and the age of the
participants. Shannon et al. (2001) carried out the experiment on adults in a free field condition,
while the present study was carried out on children under headphones. The task in the study by
Shannon et al. (2001) was to identify medial vowels and medial consonants. However, in the present
study, consonants and vowels in the initial, medial and final positions had to be identified. This
might also have contributed to the difference in findings.
Shannon et al. (2001) also found that the decrease in speech recognition was larger for apical
‘holes’ than for basal ‘holes’. In the present study too, the ‘hole’ representing the apical region of
the cochlea (List I) resulted in poorer scores than the ‘holes’ representing the more
basal regions. However, this was not seen for all ‘holes’ representing the basal region. No
significant difference was found among the other lists. This shows that the listeners were able to
effectively combine information from different frequency regions and perceive the speech signal
despite the removal of certain frequency components.
In general, phoneme scores were higher than word
scores. Similar results were found by Olsen, Van Tasell and Speaks (1997) in a group of
normal hearing adults. In their study, phoneme scoring yielded scores that were on the order of
20% higher than scores for whole words. Barick (2006) also found a significant difference
between word and phoneme scores. He recommended that word scores be calculated rather than
phoneme scores, since this scoring procedure depicts the perceptual problem better. However, he
suggested that if the client was to be referred for auditory listening training, the phoneme scoring
procedure should be used.
Phoneme scores as a function of age were evaluated across filter conditions. All the
participants were classified into four age groups: 7-8, 8-9, 9-10 and 10-11 years. The Duncan
post hoc test revealed that the youngest group (7-8 years old) performed significantly poorer than
the three older groups.
The finding regarding the performance of different age groups is similar to that reported
by Eisenberg et al. (2000). They too noted that the youngest group in their study (5-7 year olds)
performed significantly poorer than the older children aged 10-12 years on a speech perception
task. They had used age-appropriate test material and found this on a variety of age-matched tasks.
They reported that the younger children were more variable in their performance than adults and
older children, suggesting that cognitive or task-related factors probably played a role.
Thus the findings of the present study substantiate the presence of a developmental trend
in the perception of spectral 'holes'. This indicates that, unlike the older children, younger
children are unable to carry out an auditory closure activity and guess the material that has been
presented to them.
B) Phoneme errors as a function of different band stop filters
In addition to obtaining word and phoneme scores, confusions of vowels and consonants
were observed in each list. Further, an error analysis was carried out for both the vowels and
consonants to assess the error pattern. The analysis was done for three lists (List I, List III and
List IA). These three lists were analyzed as they represented low (438-813 Hz), mid (1563-2313
Hz) and high (4688-6938 Hz) frequency band-stop filters respectively. Overall, fewer consonantal
errors were noticed than vowel errors (Table 3). Probably due to the larger number of redundant
segmental cues present in consonants, the participants were able to guess them despite the
presence of 'holes' in the spectrum. Thus, if one cue is missed due to filtering, listeners can
perceive the other cues and identify the consonants. Although vowels are more robust, the
number of redundant segmental cues present in them is smaller.
Vowel Errors
Among the three lists, the fewest errors were observed in List IA, followed by List III and
List I. In List IA, the cutoff frequencies were 4688-6938 Hz, whereas the cues for vowel
perception lie between 270 Hz and 2160 Hz (Peterson & Barney, 1954). Thus, in List IA, all the
information for the perception of vowels was preserved. Hence the spectral 'hole' in it did not
adversely affect the perception of vowels. In contrast, the errors were maximum in List I, as it
removed the low frequency component in which the first formants of all the vowels and the second
formants of many vowels lie (Peterson & Barney, 1954).
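The comparison between the band-stop regions and the vowel-cue range can be sketched as a simple overlap check. This is only an illustration: the band edges and the 270-2160 Hz range are taken from the text, while the function name `overlap_hz` and the dictionary layout are assumptions made for the example. The check only shows whether a 'hole' touches the vowel-cue range; it does not indicate which formants are lost.

```python
# Band-stop ('hole') edges in Hz for the three analyzed lists, as given in the text.
HOLES_HZ = {
    "List I": (438.0, 813.0),
    "List III": (1563.0, 2313.0),
    "List IA": (4688.0, 6938.0),
}

# Range in which vowel cues lie according to the text (Peterson & Barney, 1954).
VOWEL_CUE_RANGE_HZ = (270.0, 2160.0)


def overlap_hz(band_a, band_b):
    """Width in Hz of the region shared by two (low, high) bands; 0.0 if disjoint."""
    low = max(band_a[0], band_b[0])
    high = min(band_a[1], band_b[1])
    return max(0.0, high - low)


if __name__ == "__main__":
    for name, hole in HOLES_HZ.items():
        shared = overlap_hz(hole, VOWEL_CUE_RANGE_HZ)
        status = "overlaps" if shared > 0 else "does not overlap"
        print(f"{name}: {status} the vowel-cue range ({shared:.0f} Hz shared)")
```

Under these assumptions, only the List IA 'hole' lies entirely outside the vowel-cue range, which agrees with the observation that vowel perception was least affected in that list.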
Further, the error analysis of these three lists revealed that there was confusion between
short and long duration vowels. This was observed across all the three lists. This highlights that
the simulated speech material affected the temporal cues required for the perception of long
versus short vowels, resulting in this confusion. The error analysis for vowels also highlighted
that in List I, which contained the low frequency spectral 'hole', /u/ was confused with /i/ and /o/
was confused with /a/. Elimination of essential formant information probably resulted in this
confusion.
Table 3: Consonant and vowel errors across different filter conditions
List         Consonant errors (in %)    Vowel errors (in %)
List I       40.0                       60.0
List II      44.0                       56.0
List III     41.1                       58.8
List IV      42.1                       58.8
List IA      40.6                       60.0
List IIA     46.6                       53.3
List IIIA    35.3                       64.7
List IVA     46.6                       53.3
Consonant Errors
The consonantal error analysis was carried out on the same three lists as for the
vowels. In all the three lists, voicing, place of articulation and manner were found to be affected.
However, the majority of errors were observed for place of articulation, followed by manner and
voicing errors. Manner and voicing cues have been found to be primarily temporal cues (Van
Tasell, Soli, Kirby & Widin, 1987) and they require minimal spectral cues to be accurately
perceived (Shannon et al., 1995). In contrast, place cues require spectral cues, which are affected
by the spectral 'holes'. Similar results were also obtained by Shannon et al. (2001). They too
reported that information received on place of articulation decreased considerably as the 'hole'
size was increased, particularly when the 'hole' was located apically. In the present study
maximum errors were obtained in List I, which had a 'hole' located more apically. The 'hole'
size in List I was 3.4, which was larger than in the other two lists. This apical location and larger
size might have led to more errors in List I. Also, the alveolar /d/ was confused with labial /b/
in List I and with velar /k/ in List IA. Generally, the major cues for the perception of
place of articulation of stops are the bursts and the second formant transition (Cooper, Delattre,
Liberman, Borst & Gerstman, 1952). The spectral 'holes' in the speech material probably
eliminated some of the major cues, causing confusion in the perception of place of articulation.
Conclusion
From the findings of the present study it can be concluded that with an increase in 'hole'
size there was a deterioration in speech recognition scores. The apical location of 'holes' or
band-stop filters affected speech perception more than the basal 'holes' or band-stop filters.
However, not all the basal 'holes' had a similar adverse effect. The phoneme scores were
higher in comparison to word scores. The error analysis indicated that consonantal perception
was better than vowel perception for all filter conditions. This error pattern was more pronounced
with larger 'hole' size. However, this trend was not followed in all of the lists, suggesting that
listeners were able to combine the information from other frequencies to perceive the whole signal.
Among vowels, maximum confusion was noticed between short and long vowels. For consonants, more
place of articulation confusion than manner and voicing confusion was observed. It was also noticed
that 7-8 year old children showed significantly poorer performance than older children.
The present study adds to the current knowledge of speech pattern recognition in young
cochlear implantees and of how their speech recognition differs from that of adults. Clinically,
the findings of this study may help in predicting speech perception as a function of the
electrodes that are switched on. Information from the study would be useful in counselling
cochlear implantees, or the parents of young cochlear implantees, regarding the speech sounds
that will be affected if specific electrodes are switched off.
References
ANSI. Maximum permissible ambient noise levels for audiometric test rooms, ANSI S3.1. New
York: American National Standards Institute (ANSI), 1991.
Barick, S. K. (2006). High frequency-English speech identification test. Unpublished Master's
dissertation submitted in part fulfillment for the degree of Master of Science to the
University of Mysore, Mysore.
Carlson, R., Fant G. & Grantson B. (1975). Auditory Analysis and Perception of Speech.
London: Academic Press Inc.
Cooper, F. S., Delattre, P. C., Liberman, A. M., Borst, J. M. & Gerstman, L. J. (1952). Some
experiments on the Perception of Synthetic Speech Sounds. Journal of Acoustical Society
of America, 24, 6, 597-606.
Dorman, M. F., Loizou, P. C. & Rainey, D. (1997). Speech intelligibility as a function of the
number of channels of stimulation for signal processors using sine-wave and noise-band
outputs. Journal of Acoustical Society of America, 102, 2403–2411.
Eisenberg, L., Shannon, R. V., Martinez, A. S., Wygonski, J. & Boothroyd, A. (2000). Speech
perception with reduced spectral cues as a function of age. Journal of Acoustical Society
of America, 107, 5, 2704-2710.
Fu, Q. J., Zeng, F. G., Shannon, R. V. & Soli, S. D. (1998). Importance of tonal envelope cues in
Chinese speech recognition. Journal of Acoustical Society of America, 104: 505-510.
Holmes, A. E., Kemker, F. J. & Mervin, G. E. (1987). The effects of varying the number of
cochlear implant electrodes on speech perception. American Journal of Otology, 8: 240-
246.
Kasturi, K., Loizou, P., Dorman, M. & Spahr, T. (2002). The intelligibility of speech with 'holes'
in the spectrum. Journal of Acoustical Society of America, 112 (3), 1102-1111.
Olsen, W. O., Van Tasell, D. J. & Speaks, C. E. (1997). Phoneme and Word Recognition for
Words in Isolation and in Sentences. Ear and Hearing, 18, 3, 175-188.
Peterson, G. E. & Barney H. L. (1954). Control methods used in a study of the identification of
vowels. Journal of Acoustical society of America, 24, 183-
Remez, R., Rubin, P., Pisoni, D. & Carrell, T. (1981). Speech perception without traditional
cues, Science, 212, 947–950.
Shannon, R. V., Zeng, F.-G., Wygonski, J., Kamath, V. & Ekelid, M. (1995). Speech recognition
with primarily temporal cues, Science, 270, 303–304.
Shannon, R. V., Galvin, J. J. & Baskent, D. (2001). Holes in hearing. Journal of the Association
for Research in Otolaryngology, 185-199.
Turner, C. W., Souza, P. E. & Forget, L. N. (1995). Use of temporal envelope cues in speech
recognition by normal and hearing-impaired listeners. Journal of Acoustical society of
America, 97, 2568–2576.
Van Tasell, D. J., Soli, S. D., Kirby, V. M., & Widin, G. P. (1987) Speech waveform envelope
cues for consonant recognition. Journal of Acoustical Society of America, 82, 1152–1161.
Van Tasell, D. J., Greenfield, D. G., Logemann, J. J. & Nelson, D. A. (1992). Temporal cues for
consonant recognition: Training, talker generalization and use in evaluation of cochlear
implants. Journal of Acoustical Society of America, 92, 1247–1257.
Vidya, M, Rima, D. & Yathiraj, A. (2006). Speech Identification of a Spectrum with Holes:
Presented at ISHACON, 2006 held at Ahmedabad.
Yathiraj, A. & Vijayalakshmi (2005) The Kannada Phonemically Balanced Word Test developed
in the department of Audiology, All India Institute of Speech and Hearing.
Importance of Long Latency Potential in Pediatric Hearing
Assessment
Niraj Kumar & Animesh Barman*
Abstract
Numerous researchers caution clinicians that infants with risk factors have
reversible features of auditory dys-synchrony (AD). It is important for an audiologist to be able
to differentiate such a reversible condition, or auditory maturational delay (AMD), from
non-reversible AD. It is difficult to identify the exact nature and type of hearing loss in infants
and toddlers with risk factors using the routine audiological test battery, as they present
inconclusive test results (ABR, BOA, OAE & Immittance) in a single assessment. So an appropriate
test battery becomes essential in order to differentiate these mimicking conditions. Thus, the study
was taken up with the aim of checking the importance of LLR in pediatric hearing assessment,
especially to differentiate AMD from AD and permanent hearing impairment. 55 infants/toddlers
(30 males & 25 females) aged below 2 years at the time of first evaluation were taken for the
study. They were divided into four groups on the basis of ABR and LLR results. The
results revealed that LLR is an important tool in the differential diagnosis of the conditions that
are likely to be encountered in the hearing assessment of infants/toddlers less than 2 years of
age. Large variability in LLR results was observed across the subjects. So, it is recommended that
the interpretation of the LLR waveform be approached cautiously, especially with regard to absolute
latency. It is also recommended that, rather than looking at the latency, it would be better to
look for the presence or absence of LLR for the differential diagnosis of different conditions.
Introduction
Approximately 10% of newborns are at risk for medical problems and developmental
disability. Most infants at risk are detected either at birth, as reflected in low APGAR scores, or
during the complete physical examination within a few hours of birth.
Different aspects of neurogenesis take place somewhat independently yet simultaneously
and interactively in infants. Degenerative and regressive events involving cell death, retraction of
axonal processes and elimination of synapses occur concurrently throughout development
(Berry, cited in Salamy, Eggermont & Elredge, 1994). Literature on developmental outcomes of
infants has shown that at-risk infants display increased susceptibility to a variety of physical and
developmental deficits.
Hearing is critical for normal speech and language development which in turn is vital for
most aspects of normal human development. A significant hearing impairment at birth can
* Lecturer in Audiology, All India Institute of Speech and Hearing, Mysore, India
e-mail: [email protected]
produce major disruption in language learning (Menyuk, 1977) and irreversible deficits
in the development of central auditory pathways (Moore, 1985). Even a mild hearing impairment
has been implicated in delayed development of auditory skills (Quigley, 1978). Early
identification of hearing loss followed by appropriate management minimizes the auditory
deprivation which can interfere with speech and language learning and central nervous system
maturation. So, early identification will avert a number of problems regarding a person's
educational, social and economic development that might arise in future.
Hearing loss in children is a silent and hidden disability. It is hidden because children,
especially infants and toddlers, cannot tell us about their inability to hear well. However, there
are several ways to identify hearing loss in infants. There are behavioral and objective methods to
identify hearing impairment. Such methods usually do not require the subject's active
participation. The most commonly used behavioral method is Behavioral Observation Audiometry
(BOA) and the objective methods are Auditory Brain-stem Response (ABR), Oto-Acoustic
Emissions (OAE) and Immittance audiometry, among which ABR is the most commonly
used tool to identify the presence or absence of hearing loss.
Galambos and Galambos (1975) found a drop in latency of the ABR peaks with age in
the premature infants which they attributed to maturational changes. Roberts, Davis, Phon,
Reichet, Sturtevants and Marshall (1982) concluded that ABR failures in the infants in their
study resulted mainly from immaturity. Raj, Gupta and Anand (1991) found that on re-evaluation
8 out of 13 infants with risk factors, who had failed on BAER earlier, had developed normal
thresholds and by 6 months follow-up, only 3% had hearing loss. Misra, Katiyar, Kapoor,
Shukla, Malik, and Thakur (1996) reported BAER abnormalities and their reversibility in
neonates with birth asphyxia.
It is evident that infants with risk factors are likely to have a variety of abnormalities in
the auditory system, which might vary from permanent severe hearing loss to normal hearing, with
reversible audiological test results in Auditory Maturational Delay (AMD) and Auditory
Dys-synchrony (AD) of different degrees. An appropriate test protocol is essential to differentiate
one pathological condition from another, as it not only aids early and appropriate rehabilitation
but also supports appropriate counseling of the parents.
Need for the study:
Most authors agree that 2-15% of infants with hearing loss may exhibit auditory
neuropathy; that is, one can expect to identify auditory neuropathy in approximately 1-3 infants
per 10,000 births (Rance et al., 2003; Sininger, 2002). The great majority of infants with
auditory neuropathy exhibit one or more high risk factors, including blood transfusion,
hyperbilirubinemia, anoxia, low birth weight, NICU residence or family history (Berlin et al.,
1998).
Numerous researchers (Raj, Gupta & Anand, 1991; Berlin, Morlet & Hood, 2003) have
cautioned clinicians that infants with risk factors have reversible features of auditory
dys-synchrony. Thus it becomes all the more important to be able to differentiate such
conditions (AMD) from AD, which is most often irreversible. But it is difficult to identify the
exact nature and type of hearing loss in infants and toddlers with risk factors using a test protocol
including ABR, BOA, OAE and Immittance, as they present with conflicting test results in a
single assessment. So an appropriate test battery becomes essential in order to differentiate these
mimicking conditions.
It may be recalled that dendritic arborization and synaptogenesis occur during neural
development. Due to auditory deprivation the synapses lose their function or die, which
might lead to the absence of LLR in most adult cases with AD. However, it is likely that if
LLR is administered during synaptogenesis one might observe LLR in most infants. Lee,
McPherson, Yuen and Wong (2001) reported two cases (school going children) with AD, one
with absent MLR and present LLR, while the other had both MLR and LLR (N1/P2) present.
Thus, LLR could be a useful tool to identify and differentiate different conditions in children with
risk factors, if not in all, at least in most of the clients.
Aim of the study: Thus the study was taken up with the aim to check the importance of LLR in
pediatric hearing assessment especially to differentiate AMD from AD and permanent hearing
impairment.
Method
Subjects:
55 infants/toddlers (30 males and 25 females) aged below 2 years at the time of first
evaluation were taken for the study. The subjects included both those with risk factors
associated with hearing impairment and those who did not demonstrate any such risk
factors. A family history of congenital or acquired hearing loss at an early age was also
specifically looked for in these subjects.
Apart from this, all the subjects were distributed into 4 groups based on the results of
ABR and LLR, irrespective of the presence or absence of risk factors, for further analysis. Group I
comprised subjects who had both ABR and LLR present. Group II comprised those with ABR absent
and LLR present. Group III comprised those with ABR present in at least the second or third
evaluation and LLR absent in the first evaluation. The final group, Group IV, comprised
subjects who had neither ABR nor LLR present. The number of subjects in Groups I, II, III and IV
was 18, 14, 10 and 13 respectively.
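The grouping rule described above can be sketched as a small decision function. This is a minimal illustration and not the procedure actually used in the study: the function name `assign_group`, its boolean arguments, and the choice to decide Groups I and II from the first evaluation alone are assumptions made for the example.

```python
# Group sizes reported in the text.
GROUP_SIZES = {"I": 18, "II": 14, "III": 10, "IV": 13}


def assign_group(abr_first, llr_first, abr_on_follow_up=False):
    """Return the group label ('I' to 'IV') for one subject.

    abr_first / llr_first: response present at the first evaluation.
    abr_on_follow_up: ABR present at the second or third evaluation.
    Assumption: when LLR is present at the first evaluation, the group is
    decided by the first-evaluation ABR result alone.
    """
    if llr_first:
        return "I" if abr_first else "II"
    # LLR absent at the first evaluation.
    if abr_first or abr_on_follow_up:
        return "III"
    return "IV"


if __name__ == "__main__":
    print(assign_group(abr_first=False, llr_first=True))  # ABR absent, LLR present
    print(sum(GROUP_SIZES.values()))  # total number of subjects
```

The reported group sizes add up to the 55 subjects who were taken for the study.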
Instrumentation:
1. A calibrated two-channel diagnostic audiometer (OB922) with impedance-matched speakers was
used to obtain behavioral responses (BOA).
2. Transient Evoked Oto-Acoustic Emissions (TEOAE) were acquired using ILO292 (software
version 5) in the full TE menu option in order to examine the status of the outer hair cells and
rule out absent or abnormal Auditory Brainstem Responses (ABR) due to cochlear pathology.
3. Intelligent Hearing Systems (IHS) Smart EP version 3.86USBeZ was used to obtain Auditory
Brainstem Responses (ABR) and Long Latency Potentials (LLR) to check the integrity of the
neural pathway at the levels of the brainstem and cortex respectively.
4. An immittance meter, Grason Stadler Inc. (GSI) Tympstar, was used to rule out the presence of
middle ear pathology causing absence of TEOAE or prolongation of ABR wave latencies.
All the instruments were checked for calibration prior to use on each of the subjects
according to the manufacturer's recommendations.
Test Procedure:
a. Case History:
Detailed information regarding the history of prenatal, natal and postnatal medical
conditions was secured for each of the subjects. Medical records were examined to obtain
information regarding risk factors pertaining to congenital or early onset hearing loss, such as
TORCH infections, neonatal jaundice, birth asphyxia, low APGAR scores, seizures, premature
delivery, low birth weight, swallowing of amniotic fluid at the time of delivery, the mother
contracting chicken pox in the first trimester of pregnancy, and bronchopneumonia. A detailed
report regarding the auditory behavior of the subject at home for various environmental sounds
like call bell, dog bark, voices from TV or radio, pressure cooker whistle etc. was obtained from
the parents or caretakers. Parents were counseled regarding frequent follow-ups and were asked to
look for changes in the auditory behavior and to report those changes during the next
follow-up visit.
b. Test Battery:
1) Behavioral Observation Audiometry (BOA):
Behavioral responses of the subjects were obtained in the sound-field condition using warble
tones or narrow band noise at 500 Hz, 1 kHz, 2 kHz and 4 kHz, and also speech stimuli. Testing was
carried out in a double-room situation. The subjects were seated on the caretaker's lap at a
distance of 1 meter from the speakers and at an azimuth of 45° in the observation room. The stimuli
were presented sequentially and the starting level was decided based on the parental report about
the auditory behavior at home. The lowest presentation level of each stimulus at which the subject
exhibited some sort of auditory behavior was noted down.
2) ABR and LLR:
Single channel ABR and LLR were recorded in the sleep condition using the IHS Smart EP
instrument. The non-inverting (positive) electrode was placed at Fz (high forehead), with A1
(left ear mastoid) and A2 (right ear mastoid) serving as the inverting or ground electrodes.
Neo-prep was used for preparing the skin at these electrode sites in order to obtain allowable
impedance values. The impedance at each site and the inter-electrode impedances were maintained
within 5 kΩ. TDH-39 headphones were placed taking care not to dislodge the electrodes from
their positions, and the electrode impedances were re-measured to make sure that the impedance
stayed within the desired levels at each of the electrodes. The parameters used for ABR and LLR
recording are shown in Tables 2 and 3 respectively.
Table 2: Parameters used to acquire ABR
Acquisition Parameters
  Sensitivity           50 µV
  Band-pass Filters     Low Pass 3 kHz, High Pass 30 Hz
  Notch Filter          Off
  Artifact Rejection    On
  Electrode Montage     A1 - Fz - A2
  Time Window           15 msec
Stimulus Parameters
  Type of stimulus      Click
  Polarity              Rarefaction
  Intensity             Variable
  Number of stimuli     1500
  Repetition rate       11.1/sec (to obtain better waveform morphology)
Table 3: Parameters used to acquire LLR
Amplifier Set-up
  Sensitivity           50 µV
  Band-pass Filters     Low Pass 300 kHz, High Pass 1 Hz
  Notch Filter          Off
  Artifact Rejection    On
  Electrode Montage     A1 - Fz - A2
  Time Window           750 msec
Stimulus Parameters
  Type of stimulus      Click
  Polarity              Rarefaction
  Intensity             70 dB
  Number of stimuli     300
  Repetition rate       1.1/sec
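The timing implied by the parameters in Tables 2 and 3 can be checked with a short calculation. The sketch below is only illustrative: the `Protocol` class and its field names are assumptions made for the example, and the run time is a lower bound that ignores sweeps discarded by artifact rejection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Protocol:
    name: str
    time_window_ms: float  # analysis window per sweep
    rate_per_s: float      # stimulus repetition rate
    n_stimuli: int         # number of stimuli averaged

    @property
    def stimulus_period_ms(self):
        """Time between successive stimulus onsets, in ms."""
        return 1000.0 / self.rate_per_s

    def window_fits(self):
        """True if the analysis window ends before the next stimulus starts."""
        return self.time_window_ms < self.stimulus_period_ms

    def run_time_s(self):
        """Minimum time needed to deliver all stimuli, in seconds."""
        return self.n_stimuli / self.rate_per_s


# Values taken from Tables 2 and 3.
ABR = Protocol("ABR", time_window_ms=15.0, rate_per_s=11.1, n_stimuli=1500)
LLR = Protocol("LLR", time_window_ms=750.0, rate_per_s=1.1, n_stimuli=300)

if __name__ == "__main__":
    for p in (ABR, LLR):
        print(f"{p.name}: period {p.stimulus_period_ms:.1f} ms, "
              f"window fits: {p.window_fits()}, "
              f"minimum run time {p.run_time_s():.0f} s")
```

With these values, both analysis windows fit within one stimulus period, and a single LLR run takes roughly twice as long as a single ABR run even though it averages fewer stimuli.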
The lowest level at which ABR (wave I or V) was present was taken as the threshold and used
for interpretation, as wave I is likely to be more prominent in infants. In the case of LLR, the
latencies of the two positive peaks (P1 and P2) and the two negative peaks (N1 and N2) were noted
whenever these were present. If one or more of these peaks were absent, only the
latencies of the peaks that were present were noted. LLR responses were shown to three
experienced audiologists to identify the peaks.
Infants with ABR present at 30 dBnHL and with LLR present were not followed up.
Only those infants who demonstrated absence of one or both of these potentials in the first
evaluation were asked to follow up, to monitor any changes with development and diagnosis.
3) Oto-Acoustic Emissions (OAE):
TEOAEs were obtained using the ILO292 instrument with a foam tip positioned in the
external auditory canal so as to give a flat frequency spectrum across the frequency range. The
stimuli were clicks filtered with a band-pass filter encompassing 500 to 6000 Hz. The duration of
the rectangular pulses (clicks) was 80 µsec. The level was maintained at 80 dBpkSPL in the
external auditory canal and the inter-stimulus interval was kept constant at 20 msec. A total of
260 averages above the automatic noise rejection level of the instrument were stored for analysis.
The presentation mode included a series of four stimuli: three at the same level and polarity,
and a fourth at three times that level and opposite in polarity. This mode, called
non-linear averaging, is used for artifact reduction during response acquisition. The
responses were considered as emissions based on the reproducibility and the signal-to-noise ratio
(SNR). An overall SNR greater than or equal to +3 dB and a reproducibility greater than
50% (Dijk & Wit, 1987) were required for a response to be considered an echo or
emission.
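The four-click block and the presence criteria described above can be illustrated with a toy calculation. This is a sketch under stated assumptions, not the ILO292 algorithm: the function names are hypothetical, and `math.tanh` merely stands in for any saturating (compressive) response. A purely linear response sums to zero over the block, whereas a saturating response leaves a residue.

```python
import math


def nonlinear_block_sum(response, base_level=1.0):
    """Sum a system's responses to three clicks at base_level and one click
    at three times that level with opposite polarity."""
    clicks = [base_level, base_level, base_level, -3.0 * base_level]
    return sum(response(click) for click in clicks)


def linear_response(x):
    """A purely linear system (stand-in for stimulus artifact or ear-canal echo)."""
    return 0.5 * x


def teoae_present(snr_db, reproducibility_pct):
    """Presence criteria stated in the text: SNR >= +3 dB and reproducibility > 50%."""
    return snr_db >= 3.0 and reproducibility_pct > 50.0


if __name__ == "__main__":
    print(nonlinear_block_sum(linear_response))  # linear part cancels
    print(nonlinear_block_sum(math.tanh))        # saturating part survives
    print(teoae_present(snr_db=6.0, reproducibility_pct=80.0))
```

The cancellation of the linear part is why this presentation mode reduces stimulus artifact while retaining the saturating component of the response.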
4) Immittance:
Tympanometric measurements were done using a 678 Hz probe tone (since infants and
toddlers have a mass-dominant middle ear system) or a 226 Hz probe tone, depending on the age of
the subject at the time of evaluation. This was done to rule out absence of OAE due to middle ear
pathology. Appropriate probe tips were used to obtain a hermetic seal and comfortable pressure for
the subject. The parameters documented were the type of tympanogram along with ear canal volume,
acoustic admittance and tympanometric peak pressure. The results were also correlated with
the ENT findings.
Results and Discussion
The subjects were divided into four groups based on the presence or absence of ABR and
LLR, irrespective of the presence or absence of risk factors, to understand the role of LLR in
pediatric hearing assessment. Group I consisted of 18 subjects, Group II of 14, Group III of 10
and Group IV of 13 subjects. The profile includes the findings of the different tests (BOA, OAE,
Immittance, ABR and LLR) that were administered to each of the subjects who participated in
the study. Table 4 shows the audiological profile of cases who demonstrated presence of both
ABR and LLR (Group I).
It can be seen in Table 4 that subjects 1, 2, 4, 9 and 10 had a positive history of risk factors
that are associated with hearing loss. Subjects 1, 2, 4 and 10 had a history of Neonatal Jaundice
(NJ) that did not require blood transfusion. In addition to NJ, subject 2 also had a history of
birth cry delayed by 1 minute, Neonatal Meningitis (NM) and febrile seizures at the age of 2 months.
Subject 9 had a history of birth cry delayed by 1 minute without other associated complications.
In spite of the presence of one or more risk factors that have been found to be associated with
hearing loss in the literature, these subjects' audiological results fell well within normal limits
for BOA, OAE, ABR, Immittance and LLR, except for subjects 4 and 9, who showed absence of OAE and
slightly elevated BOA values for FM tones and speech stimuli. Occurrence of conductive pathology,
as indicated by a B-type tympanogram and a positive history of ear discharge, might have resulted in
the abnormal OAE and BOA results in these two subjects. This group was considered to have
normal hearing based on the test battery results, except for subjects 4 and 9, as they had a
conductive component.
Table 4: Audiological profile of subjects with ABR and LLR present (Group I)
Sub  Risk     BOA Level (dBHL)  Tympanogram  OAE       ABR Threshold  LLR at 70
     Factors                    Type                   (dBnHL)        dBnHL
              T       S         R     L      R    L    R     L        R    L
1    P        40-55   35        A     A      P    P    30    30       P    P
2    P        45-50   40        A     A      P    P    30    30       P    P
3    Ab       35-50   35        A     A      P    P    30    30       P    P
4    P        50-65   50        B     B      Ab   Ab   50    40       P    P
5    Ab       40-55   40        As    As     Ab   Ab   40    40       P    P
6    Ab       40-55   40        A     A      P    P    30    30       P    P
7    Ab       45-50   45        A     A      P    P    40    40       P    P
8    Ab       35-45   30        As    As     P    P    30    30       P    P
9    P        50-60   45        B     B      Ab   Ab   50    50       P    P
10   P        50-55   40        A     A      P    P    30    30       P    P
11   Ab       35-45   35        A     A      P    P    30    30       P    P
12   Ab       30-45   30        A     A      P    P    40    40       P    P
13   Ab       30-40   35        A     A      P    P    30    30       P    P
14   Ab       45-55   45        A     A      P    P    50    50       P    P
15   Ab       35-40   30        A     A      P    P    30    30       P    P
16   Ab       35-40   30        A     A      P    P    35    40       P    P
17   Ab       35-40   30        A     A      P    P    30    30       P    P
18   Ab       30-40   30        A     A      P    P    30    30       P    P
'P' - present; 'Ab' - absent; 'T' - tone; 'S' - speech; 'R' - right ear; 'L' - left ear
The youngest subject in whom LLR could be recorded was 16 days old, and all the
components of LLR (P1, N1, P2 and N2) were found to be present in this subject. It has also
been reported by McPherson, Tures and Starr (1989) that LLR, or at least a component of LLR,
can be present in normal hearing infants even at birth. Though all the components of LLR could
be recorded in subject 12, not all the LLR components will necessarily be
observable at birth. But there is a high possibility of at least one of the LLR
components being present at high intensity if the infant follows a normal developmental pattern.
This suggests that LLR might be able to substitute for ABR, provided normative data are established
for this population and the relationship between the behavioral and LLR thresholds is also
established.
Sometimes LLR might be a better tool for assessing hearing sensitivity when there is a mild
degree of dys-synchrony, which might lead to noisy ABR morphology but better
LLR responses. This might help to identify hearing sensitivity in such cases.
In the group with ABR absent and LLR present (Group II), 5 subjects had
undergone three evaluations each, with an interval of 3 months or more between follow-ups,
and 2 subjects had undergone two evaluations each at an interval of three months or more. The
remaining 7 subjects underwent only 1 evaluation each. This group showed the interesting finding of
absence of ABR peaks even at 90 dBnHL, in spite of BOA showing normal hearing in most of
the subjects to severe hearing loss in subjects 1 and 3. However, all the subjects showed A-type
tympanogram in the presence of recordable TEOAE and presence of LLR at 70 dBnHL. Considering
that the maturation of the auditory system proceeds from peripheral to central (auditory nerve to
auditory cortex) (Romand, 1983; Montandon, Cao, Engel & Grajew, 1979; Stockard & Stockard,
1981, 1983), ABR should also have been present if LLR was present. This was not the case
in these subjects, which points towards some permanent abnormality at the level
of the auditory nerve and brainstem, which contain the generators of the different ABR peaks. ABR
is essentially a test of neural synchrony and is dependent upon the ability of neurons to maintain
precise timing and respond synchronously to external stimuli (Jewett & Williston, 1971). So the
absence of ABR peaks could be because of dys-synchronous firing of the auditory nerve fibers
(ANFs) or, in other words, Auditory Dys-synchrony (AD). LLR requires a much lesser degree of
synchrony of firing of the ANFs (Kraus et al., 2000), which is probably why it was found to be
present in these subjects. The physiology behind this aspect has been reported by Kraus et al.
(2000). The cortical potentials reflect neural synchrony differently than ABR. The ABR peaks
reflect synchronous spike discharge generated in the nerve tracts, whereas the peaks in cortical
responses reflect the summation of excitatory post-synaptic potentials. In other words, the ABR
reflects action currents in axons while the cortical potentials reflect slow dendritic events.
Because unit contributions to the ABR are biphasic and of short duration, ABR peaks tend to cancel
when discharges are separated by fractions of a millisecond. In contrast, for cortical potentials,
the waves are so slow that contributions separated by several milliseconds still add up in these
later waves. While the ABR reflects highly synchronous discharge with sub-millisecond precision,
the synchrony required for cortical potentials is on the order of several milliseconds.
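The cancellation argument above can be illustrated with a toy simulation. This is not a physiological model: each unit contribution is assumed to be a single sine cycle, onsets are spread evenly over a jitter window, and the durations (1 msec for an ABR-like unit, 100 msec for a cortical-like wave) and the 2 msec jitter are values chosen only for the example.

```python
import math


def unit_waveform(t_ms, duration_ms):
    """One biphasic cycle (a single sine period) lasting duration_ms; zero elsewhere."""
    if 0.0 <= t_ms < duration_ms:
        return math.sin(2.0 * math.pi * t_ms / duration_ms)
    return 0.0


def averaged_peak(duration_ms, jitter_ms, n_units=41):
    """Peak absolute amplitude of the mean of n_units unit waveforms whose
    onsets are spread evenly over jitter_ms (assumed, deterministic jitter)."""
    if n_units < 2 or jitter_ms == 0.0:
        onsets = [0.0] * max(n_units, 1)
    else:
        onsets = [jitter_ms * k / (n_units - 1) for k in range(n_units)]
    dt_ms = duration_ms / 200.0  # time step scaled to the unit duration
    n_samples = int((duration_ms + jitter_ms) / dt_ms) + 1
    peak = 0.0
    for i in range(n_samples):
        t = i * dt_ms
        mean = sum(unit_waveform(t - onset, duration_ms) for onset in onsets) / len(onsets)
        peak = max(peak, abs(mean))
    return peak


if __name__ == "__main__":
    jitter = 2.0  # msec of onset spread, hypothetical
    print("ABR-like unit (1 msec):", round(averaged_peak(1.0, jitter), 2))
    print("Cortical-like unit (100 msec):", round(averaged_peak(100.0, jitter), 2))
```

Under these assumptions, the same 2 msec spread in onsets removes most of the amplitude of the short biphasic units but leaves the slow waves almost unchanged, which is the pattern described for absent ABR with preserved LLR.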
This highlights that after 3 evaluations it was possible to label 5 subjects as
having AD. In the first evaluation itself ABR was found to be absent and LLR present, which
prompted the diagnosis of AD in these subjects. The follow-up testing was done only to confirm
this status. All the other subjects in the group were suspected to have AD based on similar findings
in the first or first two evaluations, but they could not be followed up owing to time constraints
or their inability to come for follow-up. Thus we may be able to arrive at a diagnosis of AD in
infants or toddlers after the first evaluation itself if ABR is absent but LLR is present. This also
receives support from Starr, Picton, Sininger, Hood and Berlin (1996), who reported the presence of
LLR in subjects with Auditory Neuropathy, which is more recently being called AD.
Subjects 3, 5, 6, 8 and 9 had absent TEOAEs, which could be accounted for by As-B, As-B,
B-B, As-As and As-As type tympanograms respectively, suggesting the presence of conductive
pathology in these subjects. Conductive pathology hampers the reverse transmission of OAEs
through the middle ear system and thus prevents them from being recorded in the ear
canal. Had LLR not been done, it would not have been possible to categorize these subjects as having AD,
as most of the available literature on AD treats the presence of OAEs as an integral part of
the test battery for its identification. This shows how important LLR can be in
identifying AD in infants and toddlers at an early age, even when OAEs are absent owing to a
middle ear problem. Early identification allows early and appropriate intervention; measures
such as cochlear implantation can be taken very early in life, as the literature supports the
usefulness of cochlear implantation in cases with AD (Peterson et al., 2003 cited in Kirk, Firszt,
Hood, and Holt, 2006).
Also, in subject 10 ABR was present only at 90 dBnHL, which suggests a milder
degree of dys-synchrony in the firing of the ANFs. LLR was present at 70 dBnHL in this
subject. Hyde (1997) found a 10 dB discrepancy between the LLR threshold in the asleep condition
and the behavioral threshold (LLR threshold > behavioral threshold) and suggested a discrepancy of about 10-15
dB between the ABR threshold and the behavioral threshold (ABR threshold > behavioral
threshold). This implies that LLR and ABR thresholds should roughly coincide, which is not
what was found in subject 10. This subject showed a discrepancy of at least
20 dB between ABR and LLR. Also, BOA responses were observed at 45-55 dB and 45
dB for FM tones and speech respectively, which suggests that the subject was hearing sounds at
much lower intensities than suggested by the ABR finding. All this in conjunction led to this subject
being placed in the AD category (i.e., Group II). The inclusion of this case in this group
also receives support from Sininger (2002), who said that the neural response (ABR) will be poor
or completely absent but will occasionally show a small wave V response (at high stimulus
intensities). However, the subject did not come for follow-up to confirm the diagnosis. Since
LLR requires less synchrony than ABR, even a small amount of demyelinization might have led
to its presence, whereas a small amount of demyelinization of lower structures (auditory nerve
and brainstem) could have produced synchronous firing only at a much higher level (90 dBnHL).
So, this case could even be classified as AMD. But owing to the lack of follow-up
information it can only be a matter of debate whether to call it AD or AMD. So if such cases are
encountered, follow-up evaluations are advisable to confirm the actual condition.
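The threshold argument for subject 10 reduces to simple arithmetic, sketched below. The function names are hypothetical; the offsets are those attributed to Hyde (1997) in the text, and the levels are the ones reported for subject 10. Because LLR was only shown to be present at 70 dBnHL, the observed difference is a lower bound.

```python
def expected_abr_minus_llr(llr_offset_db=10, abr_offset_range_db=(10, 15)):
    """Expected ABR-minus-LLR threshold difference when both are expressed
    as offsets above the same behavioral threshold."""
    low, high = abr_offset_range_db
    return (low - llr_offset_db, high - llr_offset_db)


def observed_abr_minus_llr(abr_threshold_db, llr_present_at_db):
    """Lower bound on the observed ABR-minus-LLR difference."""
    return abr_threshold_db - llr_present_at_db


if __name__ == "__main__":
    print("expected range (dB):", expected_abr_minus_llr())
    print("observed, subject 10 (dB): at least", observed_abr_minus_llr(90, 70))
```

The expected difference is only a few dB, whereas the observed difference is at least 20 dB, which is the discrepancy the paragraph relies on.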
Apart from this, 8 of the 14 subjects in Group II had a history of one or more
risk factors. These were subjects 4, 6, 7, 8, 10, 12, 13 and 14. Subjects 4, 6, 10, 12 and 13
had a history of neonatal jaundice, but only subjects 4 and 13 required treatment (blood transfusion and phototherapy
respectively). Subject 8 had multiple risk factors: a history of the mother having
chicken pox in the first trimester of pregnancy, consumption of amniotic fluid by the subject at
the time of delivery, low blood sugar level at birth and febrile seizures at 1 month of
age. Subject 10 also had a history of birth asphyxia along with neonatal jaundice, and subject 14
had birth asphyxia followed by seizures a few days later. So in all, 55.55% of the subjects grouped
under the AD group (Group II) had a history of severe risk factors pertaining to
hearing loss. This suggests that severe risk factors are associated with a high chance of
auditory abnormality; such infants and toddlers must therefore be considered for detailed
audiological evaluation, and LLR must be part of the test battery to identify AD.
Group III consisted of 10 subjects, 4 of whom had a history of one or more risk factors that
have been suggested to cause hearing loss. Subjects 1 and 3 had a history of neonatal jaundice along
with seizures in early infancy, whereas subject 9 had a history of premature delivery. Subject 8
had multiple risk factors that included birth asphyxia, low APGAR scores,
bronchopneumonia and seizures.
Six (1, 3, 6, 8, 9 and 10) of the 10 subjects in this group showed presence of
OAE in the absence of any recognizable ABR peak even at 90 dBnHL in the first recording. Based on
reports in the literature that absence of ABR in the presence of OAE is the defining feature of AD
(Starr et al., 1996), these subjects could have been diagnosed as having AD. But follow-up test
results in these subjects (second follow-up for subjects 3, 6 and 10, and second and third follow-ups
for subjects 1, 8 and 9) suggested otherwise. These follow-up recordings demonstrated
presence of ABR and gradual progression towards normalcy in terms of peak latencies in those
with two or three evaluations. These recordings also showed absence of LLR in the presence of ABR
in the first recording itself, which is a strong indicator that maturation has not fully occurred at the
central level (at the level of the cortex). Routine audiological testing (which includes BOA,
ABR, OAE and immittance) in such cases gives findings that are similar to those in
cases with AD (i.e., absent ABR, near-normal or slightly elevated BOA, A-type tympanogram
and present TEOAE), at least in the first evaluation. This can lead to a case being misdiagnosed as
AD though it could be a case of delayed maturation (Auditory Maturational Delay or AMD).
Comparison of the profiles of subjects in Tables 2 and 3 can give valuable insight in this
regard. The presence of LLR in the absence of ABR indicates normal cortical
functioning despite sub-normal or abnormal peripheral (auditory nerve and brainstem)
functioning. Based on the earlier discussion with regard to Table 2, the label of AD (based on the
peripheral-to-central course of maturation) can safely be applied to such cases. Absence of
LLR, whether ABR is present (normal or abnormal) or absent, shows a trend towards
either AMD or severe hearing loss. It would be better if cases with such findings are
monitored for changes in auditory behavior at home and also through regular
follow-up (preferably at 3-month intervals) until a clearer picture of the condition emerges
(preferably up to 2 to 3 years of age), more so if OAEs are absent.
In subjects 4 and 6, again, the absence of OAE can be accounted for by the B-type tympanogram
in both ears of both subjects. These have been considered in this group based on the BOA
findings, which suggest near-normal responses to FM tones and speech in subject 4 and show a
maturational course, indicated by improvement in response levels in the second evaluation, in
subject 6. So, if ABR is present at any level and LLR is absent in the first evaluation itself, the
case can be diagnosed as AMD if OAEs are found to be present. This receives support from the
findings of 4 out of 6 subjects in Table 3. These subjects (1 and 8 in the third evaluation and 9 and 10
in the second evaluation itself) showed the presence of LLR and also improvement in the ABR peak
latencies with increase in age, thus supporting the diagnosis of AMD, which was established
based on absence of LLR in the first recording session itself.
The rest of the subjects (2, 4, 5 and 7) had undergone only one evaluation. They had
ABR and OAE present, except subject 4 (absent OAE), but LLR was absent. The absence of
OAE in subject 4 could be accounted for by the tympanometry result, which showed a B-type tympanogram.
Presence of ABR and absence of LLR helped in the diagnosis of AMD in these
subjects. However, follow-up is required to confirm this diagnosis. It has been seen that absence
of ABR in the first evaluation can be misleading, as there can be a delay in maturation
that leads to this phenomenon. This paradox can easily be resolved by the inclusion of LLR in the
test battery for hearing assessment in infants and toddlers. Thus, it is strongly recommended
to include LLR in the protocol for assessing hearing in infants and toddlers.
In Group IV (ABR absent, LLR absent), 3 out of a total of 13 subjects (23.08%) had a
history of one or more risk factors pertaining to hearing loss. Subject 1 had a history of neonatal
jaundice, subject 13 had birth asphyxia and subject 8 had a cluster of risk factors that included
prenatal high blood pressure (at the 7th month), premature delivery and low birth weight. All the
subjects had BOA responses at much higher levels than normal-hearing infants, which correlated
well with the other findings: absence of OAE, A-type tympanogram (except subjects 1, 2
and 9 in the first recording and subjects 1 and 7 in the second recording, in both ears and the left ear respectively)
and no repeatable peaks in ABR and LLR recordings. Absence of LLR along with absence of
OAE and ABR gives a fair indication of permanent hearing loss. A
conductive component in subjects 1, 2, 7 and 9 cannot account for the absence of ABR at 90 dBnHL
and of LLR at 70 dBnHL. Thus, this group of subjects can be diagnosed as having
permanent hearing loss.
AMD and severe hearing loss can be differentiated easily
on the basis of TEOAE findings. If TEOAEs are present, it shows that the course of
maturation may be slightly prolonged or delayed, causing abnormality in ABR and LLR findings,
whereas absence of TEOAEs would indicate abnormality at the level of the cochlea too, and hence
severe hearing loss would be the more appropriate diagnosis.
It can be concluded from the results that LLR is an important tool in the differential
diagnosis of the conditions that are likely to be encountered in the hearing
assessment of the pediatric population below 2 years of age. This is summarized
in Table 5.
Table 5: Different test results and diagnosis based on them
    BOA       Immittance  OAE      ABR                      LLR      Diagnosis
1.  Normal    A-type      Present  Present                  Present  Normal hearing
2.  Normal    A-type      Present  Absent                   Present  AD
3.  Normal    A-type      Present  Absent/Abnormal/Present  Absent   AMD
4.  Abnormal  A-type      Absent   Absent                   Absent   Severe hearing loss
The test results in AMD and AD are identical if the LLR results are taken away. The
audiologist will hence find it very difficult to diagnose the condition or, more realistically,
would have to wait until maturation has fully occurred. This difficulty can be overcome when
LLR results are included. Absence of LLR, with ABR present at any level, can be taken to indicate AMD,
whereas presence of LLR, with ABR absent even at high levels, can be taken to indicate AD (based on the
pattern of maturation, which follows a peripheral-to-central course). Thus LLR can prove to be of immense importance in the differential
diagnosis of AD and AMD. Also, though other routine audiological tests may indicate hearing
sensitivity within normal limits, there may still be a case of delayed maturation at higher centers
(auditory cortex). This can be ruled out by the presence of LLR. Not only that, LLR can also be used
as a substitute for ABR to obtain thresholds, especially if ABR morphology is poor or if ABR is
completely absent as in cases with AD. LLR can also be used as a supporting tool for the diagnosis
of severe hearing loss. Absence of LLR would indicate that minimal or no signal is reaching the
auditory cortex that can evoke a cortical response.
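The pattern in Table 5 can be written out as a small lookup, sketched here for illustration only. The function and argument names are hypothetical, and any combination of findings that Table 5 does not list (for example absent OAEs with a B-type tympanogram) is deliberately returned as not covered rather than guessed at. It is not a clinical decision tool.

```python
def table5_diagnosis(boa_normal, tympanogram, oae_present, abr, llr_present):
    """Return the Table 5 label for one set of findings.

    abr must be 'present', 'abnormal' or 'absent'; tympanogram is the type
    letter (Table 5 lists only A-type rows).
    """
    if abr not in ("present", "abnormal", "absent"):
        raise ValueError("abr must be 'present', 'abnormal' or 'absent'")
    if tympanogram != "A":
        return "not covered by Table 5"
    if boa_normal and oae_present:
        if not llr_present:
            return "AMD"              # row 3: LLR absent, ABR in any state
        if abr == "present":
            return "Normal hearing"   # row 1
        if abr == "absent":
            return "AD"               # row 2
    if (not boa_normal and not oae_present
            and abr == "absent" and not llr_present):
        return "Severe hearing loss"  # row 4
    return "not covered by Table 5"


if __name__ == "__main__":
    print(table5_diagnosis(True, "A", True, "absent", True))
    print(table5_diagnosis(True, "A", True, "absent", False))
```

The two calls differ only in the LLR result, which is the point made in the text: without LLR the AD and AMD rows cannot be told apart.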
Normative data for ABR was also established in the present study, but owing to the small
number of subjects these findings should be used with caution. In the case of LLR
norms, large variability was observed across subjects. So it is recommended that the
interpretation of the LLR wave be approached cautiously, especially with regard to absolute
latency. It is also recommended that, rather than looking at latency, it would be better to look
for the presence or absence of LLR for the differential diagnosis of different conditions.
Implications of the study:
First and foremost, the study highlights the importance of LLR in the differential diagnosis of
AMD from AD and permanent hearing loss. This offers a solution to the paradoxical
nature of hearing assessment in infants and toddlers. The study also suggests the use of LLR in
threshold estimation, especially in cases of AD in which ABR is absent; this supports
an early decision on cochlear implantation in AD and avoids unnecessary psychological trauma to the
parents if the condition is AMD. The study also tried to establish norms for ABR and LLR which could be
used to decide whether an infant or toddler is developing normally or not, though
careful use of the findings of the present study is recommended.
References
Berlin, C. I., Bordelon, J., St. John, P., Wilensky, D., Hurley, A., Kluka, E. & Hood, L. J. (1998).
Reversing click polarity may uncover auditory neuropathy in infants. Ear & Hearing, 19,
37-47.
Berlin, C. I., Morlet, T. & Hood, L. J. (2003). Auditory neuropathy/dys-synchrony : Its diagnosis
and management. Pediatric Clinics of North America, 50, 331-340.
Dijk, P. V. & Wit, J. A. (1987). The occurrence of click evoked Otoacoustic emissions (“Kemp
Echoes”) in normal hearing ears. Scandinavian Audiology, 16, 62-64.
Galambos, C. S. & Galambos, R. (1975). Brainstem auditory evoked responses in premature
infants. Journal of Speech and Hearing Research, 18, 456-465.
Hyde, M. (1997). The N1 response and its applications. Audiology & Neuro-otology, 2(5), 281-
307.
Jewett, D. & Williston, J. S. (1971). Auditory evoked far fields averaged from the scalp of
humans. Brain, 94, 681-696.
Kirk, K. I., Firszt, J. B., Hood, L. J. & Holt, R. F. (2006). New directions in pediatric cochlear
implantation: Effects on candidacy. The ASHA Leader, 11(16), 6-7, 14-15.
Kraus, N., Bradlow, A. R., Cheatham, M. A., Cunningham, J., King, C. D., Koch, D. B., Nicol, T.
G., McGee, T. J., Stein, L. K. & Wright, B. A. (2000). Consequences of Neural
Asynchrony: A Case of Auditory Neuropathy. Journal of the Association for Research in
Otolaryngology, 1, 33-45.
Lee, J. S., McPherson, B., Yuen, K. C. & Wong, L. L. (2001). Screening for auditory neuropathy
in a school for hearing impaired children. International Journal of Pediatric
Otorhinolaryngology, 61(1), 39-46.
McPherson, D. L., Tures, C. & Starr, A. (1989). Binaural interaction of the auditory brainstem
potentials and middle latency auditory evoked potentials in infants and adults.
Electroencephalography and Clinical Neurophysiology, 74, 124-130.
Misra, P. K., Katiyar, C. P., Kapoor, R. K., Shukla, R., Malik, G. K. & Thakur, S. (1996).
Brainstem auditory evoked response in neonates with birth asphyxia. Indian Pediatrics,
34, 199-205.
Montandon, P. B., Cao, M. H., Engel, R. T. & Grajew, T. (1979). Auditory nerve and brainstem
responses in the newborn and in preschool children. Acta Otolaryngologica, 87(3-4), 279-
86.
Moore, D. R. (1985). Post natal development of the mammalian central auditory system and the
neural consequences of auditory deprivation. Acta Otolaryngologica Supplement, 421,
19-30.
Quigley, S. P. (1978). The effects of early hearing loss on normal language development. In F.
N. Martin (Ed.). Pediatric Audiology. Englewood Cliffs NJ: Prentice Hall.
Raj, H., Gupta, A. K. & Anand, N. K. (1991). Hearing assessment by brainstem auditory evoked
responses (BAER) in neonates at risk. Indian Pediatrics, 28, 1175-1183.
Rance, G., Beer, D. E., Wesson, B. C., Shepherd, R. K., Dowell, R. C., King, A. M., Rickards, F.
W. & Clark, G. M. (2003). Clinical findings for a group of infants and young
children with auditory neuropathy. Ear and Hearing, 20, 238-252.
Roberts, J. L., Davis, H., Phon, G. L., Reichet, M. D., Sturtevant, B. S. N. & Marshall, M. D.
(1982). Auditory brainstem response in preterm neonates: Maturation and follow-up.
Journal of Pediatrics, 101, 257-263.
Romand, R. (Ed.). (1983). Development of Auditory and Vestibular Systems. New York:
Academic Press.
Salamy, A., Eggermont, J. & Elredge, L. (1994). Neurodevelopment and auditory function in
preterm infants. In Jacobson, J. T. (Ed.). Principles and Applications in Auditory Evoked
Potentials (1st Ed) (pp: 287-312). Massachusetts: Allyn & Bacon.
Sininger, Y. S. (2002). Identification of Auditory Neuropathy in Infants and Children. Seminars
in Hearing, 23(3), 193-200.
Starr, A., Picton, T. W., Sininger, Y., Hood, L. J. & Berlin, C. I. (1996). Auditory Neuropathy.
Brain, 119, 741-753.
Stockard, J. E. & Stockard, J. J. (1981). Brainstem auditory evoked potentials in normal and
otologically impaired newborns and infants. In C. Henry (Ed.), Current Clinical
Neurophysiology. Amsterdam: Elsevier Science Publishers.
Stockard, J. E. & Stockard, J. J. (1983). Recording and analyzing. In E. J. Moore (Ed.), Bases of
auditory brainstem evoked responses. New York: Grune and Stratton.
Pitch perception in individuals with sensorineural hearing loss with
and without dead regions
Palash Dutta & K Rajalakshmi
Abstract
It has been suspected that a pure tone might be perceived as noise-like when the tone
produces maximum excitation in a region of the cochlea where there is extensive or complete
loss of inner hair cell (IHC) and/or neural function, which is referred to as a dead region (DR).
A DR is defined as a region in the cochlea where IHCs and/or neurons are functioning so poorly that
a tone producing peak vibration in that region is detected by off-place listening (i.e. the tone is
detected at a place where the amount of basilar membrane vibration is lower but IHCs and
neurons are functioning more effectively). In this study a total of 17 individuals with sensorineural
hearing loss, with and without dead regions, were taken. They were divided into two groups:
subjects having sensorineural hearing loss with dead region and without dead region. For
detection of dead regions, psychophysical tuning curves (PTCs) were established using a
procedure which is similar to the physiological determination of a tuning curve on the basilar
membrane or in the auditory nerve. The TEN (Threshold-Equalizing Noise) test, which
is a relatively fast and simple test, was also used. Once it was established whether or not the subjects had a dead region, a pitch-matching
experiment was carried out. For the pitch-matching task, subjects were asked to match the
perceived pitch of a variable pure tone with that of another pure tone that was fixed in
frequency. The results reveal that pitch perception in individuals with sensorineural hearing loss
with dead region is different from that in individuals having sensorineural hearing loss without
dead region. The results also show that if sensorineural hearing loss is accompanied by a dead
region (DR) then there is a broader auditory filter and hence pitch matching is difficult.
Introduction
A sinusoid is usually perceived by people with normal hearing as having a clear tonal
quality and a distinct single pitch; hence sinusoids are often called pure tones. However, some
people with hearing impairment report that pure tones sound highly distorted and noise-like
(Florentine & Houtsma, 1983; Moore et al. 1985; 1977b; Murry & Byrere, 1986; Huss & Moore,
2005). From the audiogram it is difficult to predict whether or not a person will experience such
a percept. It has been suspected that a pure tone might be perceived as noise-like when the tone
produces maximum excitation in a region of the cochlea where there is extensive or complete
loss of inner hair cell (IHC) and/or neural function, which is sometimes referred to as a dead
region (DR) (Florentine & Houtsma, 1983; Moore et al, 1985; Huss & Moore, 2005). A DR can
be defined as a region in the cochlea where IHCs and/or neurons are functioning so poorly that a
Reader in Audiology, All India Institute of Speech and Hearing, Mysore, India
e-mail: [email protected]
tone producing peak vibration in that region is detected by off-place listening (i.e. the tone is
detected at a place where the amount of basilar-membrane vibration is lower but the IHCs and
neurons are functioning more effectively; Moore, 2004). The extent of a DR can be defined in
terms of the characteristic frequencies (CFs) of the IHCs and/or neurons immediately adjacent to
the DR (Moore, 2001). DRs can be diagnosed and localized using psychophysical tuning curves
(PTCs) (Florentine & Houtsma, 1983; Turner et al. 1983; Moore et al. 2000; Moore & Alcantara,
2001; Huss & Moore, 2003; Kluk & Moore, 2005) and using the threshold-equalizing noise
(TEN) test (Moore et al, 2000; 2004; Huss and Moore, 2003).
The pitch of a pure tone may be determined by the auditory system from the distribution
of activity or excitation along the basilar membrane or in the auditory nerve (place theory)
(Helmholtz, 1863; von Bekesy, 1960; Siebert, 1968, 1970) or from temporal information derived
from patterns of phase locking in the auditory nerve (Siebert, 1970; Moore, 1973; Goldstein &
Srulovicz, 1977; Srulovicz & Goldstein, 1983).
In principle, theories of pitch perception can be evaluated by studying pitch perception
for people with hearing impairment (Moore & Carlyon, 2005). Studies investigating pitch
perception in subjects with low-frequency DRs have suggested that a tone with a frequency
falling in a DR is often perceived with a low pitch that is roughly “normal” (Florentine &
Houtsma, 1983; Turner et al., 1983). Pitch shifts were much smaller than would be predicted
based on the place where the tone was assumed to be detected (just outside the boundary of the
DR). The results were interpreted as indicating that the pitch of a low-frequency tone is
predominantly derived from the temporal pattern of neural firing evoked by the tone.
The results obtained by Florentine and Houtsma (1983) and by Turner et al. (1983) were
mostly based on pitch matches using tones at and within about 2 octaves of the edge frequency
of the DR. It is possible that the analysis of information about pitch carried by interspike
intervals is optimized when place and temporal information are consistent (Evans, 1978). For
example, the analysis of temporal information may depend upon differences in the phase of the
response at different points along the basilar membrane (Loeb et al., 1983; Shamma & Klein,
2000; Carney et al., 2002). The processing of temporal information could be disrupted when the
propagation time of the traveling wave along the basilar membrane deviates from normal, as may
happen in hearing-impaired ears (Ruggero et al., 1996). When a tone produces peak vibration in
a DR, perception of the tone depends on the spread of vibration to an adjacent functioning region
of the cochlea. At that region, the traveling wave pattern, and specifically the relative phase of the
response at different places, might differ markedly from the pattern occurring around the place
where peak vibration occurs. This might markedly disrupt the processing of temporal
information.
It remains unclear what determines the pitch when either temporal or place
information becomes weaker and what happens when temporal and place codes give conflicting
information about pitch. The presence or absence of dead regions can have serious implications
for the fitting of hearing aids. Amplification over a frequency range corresponding to a dead
region may not be beneficial because amplified frequency components would be detected and
analyzed in frequency channels that normally respond to other frequencies.
The present study aimed to investigate pitch perception in individuals with sensorineural
hearing loss with and without dead regions, and whether there is a relationship
between dead regions and perceived pitch. The study also investigated whether there is any
relationship between the extent of DRs and pitch shift.
Method
Participants
Participants were divided into 2 groups. The first group comprised 7 subjects having sensorineural
hearing loss without dead region. The second group comprised 10 subjects having
sensorineural hearing loss with dead region. Subjects in the group without dead
region had no significant history of neurological disorders. Subjects had pure tone
thresholds of more than 40 dBHL in the frequency range 250 Hz to 4000 Hz, and immittance screening
revealed no middle ear pathology. Subjects' air-bone gap was less than 10 dB at all
frequencies from 250 Hz to 4000 Hz. Subjects had a pure tone average corresponding to a
moderate or greater degree of hearing loss, and immittance screening revealed no middle ear pathology.
Instrumentation
A calibrated two-channel diagnostic audiometer (MA-53) was used for testing the subjects.
An immittance audiometer (GSI-33) was used for evaluation of middle ear function. A tape recorder
with CD for the TEN test was connected to the two-channel diagnostic audiometer for presenting the
stimulus. Testing was carried out in an air-conditioned, sound-treated double-room set-up with
ambient noise levels within permissible limits (re: ANSI 1991, as cited in Wilber, 1994).
Test materials
TEN test CD was used for the purpose of diagnosing dead region in subjects with sensorineural
hearing loss.
Procedure
1) Pure Tone average: Pure tone thresholds were obtained at octave frequencies between 250
Hz and 8000 Hz for air conduction stimuli and between 250 Hz and 4000 Hz for bone conduction
stimuli using the modified Hughson-Westlake method (Carhart & Jerger, 1959).
2) Tympanometry: Tympanometry and reflexometry were carried out to rule out any middle ear
pathology.
3) Psychophysical tuning curves: Psychophysical tuning curves (PTCs) (Chistovich, 1957;
Small, 1959) were measured using a procedure which is similar to the physiological
determination of a tuning curve on the basilar membrane (Sellick, et al., 1982) or in the auditory
nerve (Kiang, Watanabe, Thomas & Clark, 1965). The signal was a sinusoid which was
presented at a level 10 dB above the absolute threshold. In a given run the signal frequency was
fixed. The masker was an 80 Hz wide band of noise with variable center frequency. The exact
masker frequencies were chosen individually for each subject so as to define the position of the
tip of the tuning curve with reasonable accuracy. Several signal frequencies were used for each
subject; they were chosen to cover a range including any suspected dead region. For each of
several masker center frequencies the level of the masker needed just to mask the signal was
determined.
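Locating the tip of a PTC from such measurements can be sketched as follows. The masker frequencies and levels in the example are hypothetical values chosen for illustration, not data from the present study, and the function names are assumptions. The tip is taken as the masker center frequency that needs the lowest masker level to just mask the signal.

```python
import math


def ptc_tip(masker_levels_db):
    """Masker center frequency (Hz) at which the lowest masker level masks the signal.

    masker_levels_db maps masker center frequency in Hz to masker level in dB.
    """
    return min(masker_levels_db, key=masker_levels_db.get)


def tip_shift_octaves(signal_hz, tip_hz):
    """Tip position relative to the signal frequency, in octaves.

    Negative values mean the tip lies below the signal frequency.
    """
    return math.log2(tip_hz / signal_hz)


if __name__ == "__main__":
    # Hypothetical PTC for a 1000 Hz signal.
    ptc = {500: 70, 700: 62, 850: 55, 1000: 60, 1200: 75}
    tip = ptc_tip(ptc)
    print("tip at", tip, "Hz; shift", round(tip_shift_octaves(1000, tip), 2), "octaves")
```

A tip that lies well away from the signal frequency is the pattern usually read as off-place listening; how large a shift should count as meaningful is not specified in the source and is left to the reader.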
4) TEN (Threshold-Equalizing Noise) test: For detection of dead regions the TEN test, which
is a relatively fast and simple test, was used (Moore et al., 2000). The test makes use of a masking
noise called “threshold-equalizing noise” (TEN), which is spectrally shaped.
Absolute thresholds and masked thresholds in TEN were measured using manual
audiometry with the procedure proposed by Carhart and Jerger (1959). The TEN from the CD
was fed to one of the tape inputs on the OB 922 audiometer and the sinusoidal test signal was fed to
the other. TEN and signal levels were controlled by the use of the level controls on the
audiometer. The noise and sinusoidal signal were mixed using the audiometer, and stimuli were
delivered using TDH-39 earphones supplied with the audiometer. Each ear of each subject was
tested separately.
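The text does not state the criterion applied to the TEN results. The sketch below assumes the commonly cited rule associated with Moore et al. (2000), namely that a dead region is indicated at a test frequency when the masked threshold is at least 10 dB above both the absolute threshold and the nominal TEN level. The function name and the example levels are hypothetical.

```python
def ten_indicates_dead_region(absolute_threshold_db, masked_threshold_db,
                              ten_level_db, criterion_db=10):
    """Apply the assumed TEN criterion at one test frequency.

    All levels must be expressed in the same units as used on the test CD.
    """
    above_noise = masked_threshold_db - ten_level_db >= criterion_db
    above_absolute = masked_threshold_db - absolute_threshold_db >= criterion_db
    return above_noise and above_absolute


if __name__ == "__main__":
    # Hypothetical levels for two test frequencies in one ear.
    print(ten_indicates_dead_region(50, 85, 70))
    print(ten_indicates_dead_region(50, 72, 70))
```

The second condition guards against cases where the noise level is too low relative to the absolute threshold to produce any meaningful masking.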
Pitch-matching Procedure
For the pitch-matching task, subjects were asked to match the perceived pitch of a variable
pure tone with that of another pure tone that was fixed in frequency. The two tones were
presented alternately. Matches were made within the same ear to estimate the reliability of
matching. The subject was instructed to say 'same' or 'different' according to whether the perceived pitch of the
variable tone matched that of the fixed-frequency tone. The procedure was carried out at
250, 500, 1000, 2000 and 4000 Hz. Data were tabulated in terms of the perceived pitch at the above
frequencies. The extent of pitch matching was noted for each frequency in all the subjects.
Results and discussion
There were 10 subjects with sensorineural hearing loss with dead region and 7
subjects with sensorineural hearing loss without dead region in the present study.
A) Group analysis
Table 1a &1b: Cross tabulation with respect to fixed and variable frequency in SNHL without dead region
1(a) Right ear
Fq. In Hz 250 500 1000 2000 4000
250 7 0 0 0 0
500 0 7 0 0 0
1000 0 0 7 0 0
2000 0 0 0 7 0
4000 0 0 0 0 7
# fixed frequency is shown in the first column and variable frequencies in other 5 columns
The number of subjects matching each fixed frequency to each variable frequency is presented
as a cross table for both groups and both ears. Since the data are frequency counts,
tests of significance were not suitable, and one person could match in more
than one way; because of this constraint the analyses were restricted to graphical
representation.
1 (b) Left ear
Fq. In Hz 250 500 1000 2000 4000
250 7 0 0 0 0
500 0 7 0 0 0
1000 0 0 7 0 0
2000 0 0 0 7 0
4000 0 0 0 0 7
# Fixed frequency is shown in the first column and variable frequencies in other 5 columns
Table 2a & 2b: Cross tabulation with respect to fixed & variable fq in SNHL with dead region
2(a) Right ear
Fq. In Hz 250 500 1000 2000 4000
250 10 1 1 - -
500 - 10 1 - -
1000 1 2 9 4 2
2000 1 2 4 10 9
4000 - 1 3 8 9
2(b) Left ear
Fq. In Hz 250 500 1000 2000 4000
250 10 1 1 - -
500 - 10 2 - -
1000 - 1 10 4 3
2000 1 1 3 9 7
4000 - 1 3 7 9
# Fixed frequency is shown in the first column and variable frequencies in other 5 columns
The numbers of subjects were converted to percentages since the group sizes were not equal.
The first group (sensorineural hearing loss without dead region) had 7 subjects
and the second group (sensorineural hearing loss with dead region) had 10 subjects.
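The conversion to percentages can be sketched as below, using the right-ear counts of Table 2(a) with "-" entered as 0. The variable and function names are hypothetical. Because one subject could match a fixed frequency to more than one variable frequency, a row may total more than 100%.

```python
FREQS_HZ = [250, 500, 1000, 2000, 4000]

# Table 2(a), right ear, SNHL with dead region: fixed frequency -> counts per variable frequency.
TABLE_2A_RIGHT = {
    250: [10, 1, 1, 0, 0],
    500: [0, 10, 1, 0, 0],
    1000: [1, 2, 9, 4, 2],
    2000: [1, 2, 4, 10, 9],
    4000: [0, 1, 3, 8, 9],
}


def to_percent(table, n_subjects):
    """Convert subject counts to percentages of the group size."""
    return {
        fixed: [round(100.0 * count / n_subjects, 1) for count in counts]
        for fixed, counts in table.items()
    }


if __name__ == "__main__":
    percentages = to_percent(TABLE_2A_RIGHT, n_subjects=10)
    for fixed in FREQS_HZ:
        print(fixed, percentages[fixed])
```

For the group without dead region the same function would be called with n_subjects=7, so that both groups are expressed on a common 0 to 100 scale.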
1) Sensorineural hearing loss without dead region right ear group and left ear group
It can be observed from the graphs that all 7 subjects with sensorineural hearing loss
without dead region, in both the right ear and the left ear, perceived the fixed frequency as the same as the matching variable frequency at
all frequency levels. None of the subjects matched it to any other frequency.
Figure 1: Sensorineural hearing loss without dead region (Right ear)
Figure 2: Sensorineural hearing loss without dead region (Left ear)
2) Sensorineural hearing loss with dead region right ear group and left ear group
It can be noticed that at each fixed frequency some subjects matched other
variable frequencies, but most of the subjects matched the same frequency.
Figure 3: Sensorineural hearing loss with dead region (Right ear)
Figure 4: Sensorineural hearing loss with dead region (Left ear)
Cochlear hearing loss results in a variety of changes in the way that sounds are
represented in the auditory system. Some of these changes are especially relevant for the perception of
pitch. There may be regions within the cochlea where the inner hair cells (IHCs) and/or neurons
are completely nonfunctional. These are referred to as dead regions. The peak in the neural
excitation pattern may occur at a place very different from that normally associated with that
frequency. The place theory predicts that the perceived pitch of the tone in such a case should be
very different from normal.
The results of the present study indicated that pitch perception in individuals with sensorineural
hearing loss with dead region differs from that in individuals with sensorineural hearing loss
without dead region. Pitch shifts may arise for two reasons. The first applies when the amount of
hearing loss varies with frequency, and especially when the amount of inner hair cell (IHC) damage
varies with characteristic frequency. When IHCs are damaged, transduction efficiency is reduced, so
a given amount of basilar membrane (BM) vibration leads to less neural activity than when the IHCs
are intact. When IHC damage varies with characteristic frequency, the peak in the neural excitation
pattern evoked by a tone will shift away from a region of greater IHC loss. Hence the perceived
pitch is predicted to shift away from that region. Early studies of diplacusis (De Mare, 1948;
Webster & Schubert, 1954) were generally consistent with this prediction, showing that when a
sinusoidal tone is presented in a frequency region of hearing loss, the pitch shifts towards a
frequency region where there is less hearing loss. The second way in which pitch shifts might
occur is by shifts in the position of the peak excitation on the BM.
The results of the present study indicated that the tips of tuning curves shifted towards
lower frequencies in cases of sensorineural hearing loss with dead region. This means that the
maximum excitation at a given place is produced by a lower frequency. The results also showed
a shift of pitch towards higher frequencies (upward), although some cases showed a shift towards
lower frequencies. The peak of the BM response in an impaired cochlea would be shifted towards the
base, i.e., towards places normally responding to higher frequencies. Gaeth and Norris (1965) and
Schoeny and Carhart (1971) reported that pitch shifts were generally upwards regardless of the
configuration of loss. However, it is also clear that individual differences can be substantial and
subjects with similar patterns of hearing loss (absolute thresholds as a function of frequency) can
show quite different pitch shifts. Huss et al. (2001) and Huss and Moore (2005) obtained pitch
matches and octave matches for subjects with an extensive high frequency dead region. For tones
whose frequency fell well within the dead region, the perceived pitch was shifted upwards,
although the pitch was also unclear.
The results of the present study indicated that frequency discrimination is poor for
individuals with sensorineural hearing loss with dead region. The frequency of pure tones may be
represented in terms of phase locking (a temporal representation) for frequencies below about
5000 Hz and purely spectrally (a place representation) for higher frequencies.
The precision of phase locking can be reduced (Wolf et al. 1981; Miller, et al. 1999),
although this has not always been found. According to temporal theory reduced precision of
phase locking should adversely affect frequency discrimination.
The propagation time of the traveling wave along the basilar membrane and the relative
phase of the response at different places may differ from normal because of loss of the active
“mechanism”, structural abnormalities or both (Ruggero, 1994; Ruggero et al. 1996). This could
adversely affect mechanisms for pitch perception based on cross-correlation of the outputs of
different points on the basilar membrane (Loeb et al. 1983; Shamma, 1985; Shamma and Klein,
2000).
It has been proposed that the frequency discrimination of steady pulsed tones by normally
hearing listeners is largely based on temporal information (cues derived from phase locking) for
frequencies up to 4 to 5 kHz (Moore, 1973, 1974, 2003; Goldstein and Srulovicz, 1977; Sek &
Moore, 1995; Micheyl et al. 1998; Heinz et al. 2001). Above 4 to 5 kHz, frequency
discrimination is thought to depend mainly on place based changes in the excitation pattern
(Moore, 1973b; Sek & Moore, 1995), although residual phase locking may play some role
(Heinz et al. 2001).
The results clearly indicated that if sensorineural hearing loss is accompanied by a dead
region (DR), the auditory filter is broader. If there is no dead region, the auditory filter
shape is narrow, which might indicate that the hair cells in that region are functioning. It can be
expected that the perception of pitch is more affected by the relative phase of the
components in a dead region than in a region without a dead region.
In such DR cases frequency selectivity is reduced. Auditory filters are broader than
normal (Pick et al. 1977; Glasberg & Moore, 1986; Moore, 1998). Hence the excitation pattern
evoked by a sinusoid is also broader than normal. According to place theory this should lead to
impaired frequency discrimination of sinusoids. Reduced frequency selectivity also presumably
leads to a reduced ability to resolve partials in complex tones, and this might adversely affect the
perception of the pitch of complex tones and also of pure tones. For subjects with broad auditory
filters, even the lower harmonics would interact at the outputs of the auditory filters, giving a
potential for strong phase effects. Changes in phase locking and in cochlear traveling wave phase
could also lead to less clear pitches and poorer discrimination of pitch.
In the present study subjects with sensorineural hearing loss with dead region reported that
the tones did not have a distinct pitch but sounded like noise. There have been a few studies of
pitch perception in people with hearing losses that increase abruptly at high frequencies, who
probably had dead regions at high frequencies. These subjects often report that high frequency
sinusoids do not have distinct pitch but sound like noises or buzzes (Villchur, 1973; Moore et al.
1985b; Murray & Byrne, 1986). Subjective reports that pure tones sound noise like may be taken
as a hint that a dead region is present, but ratings of the clarity of the tonal percept cannot be
used as a reliable indicator of dead regions. A sensorineural hearing loss involves not only a
reduction of sensitivity but also a set of supra threshold impairments that distort the perception
of sounds: listeners may suffer increased susceptibility to forward and backward masking, making it
more likely that vowels will mask energy in weaker adjacent consonants; auditory filters are often
broader than normal, leading to increased masking by background noises and by echoes in
reverberant rooms; in extreme cases, even in quiet anechoic environments, difficulties may be
experienced in detecting changes in the pitch of a talker's voice and in determining the spectral
shape of speech sounds; the ability to analyze the temporal fine structure of the output of
auditory filters may also be reduced, leading to difficulties in following rapid changes in
amplitude, frequency and pitch and exacerbating the effects of noise.
Conclusions
From the results we can conclude that pitch matches are often erratic and frequency
discrimination is poor for tones with frequencies falling in a dead region. This indicates that such
tones do not evoke a clear pitch sensation. The shifted pitches found for some subjects indicate
that the pitch of low frequency tones is not represented solely by a temporal code. Possibly there
needs to be a correspondence between place and temporal information for a “normal” pitch to be
perceived.
References
De Mare, G. (1948). Investigations into the functions of the auditory apparatus in perception
deafness. Acta Otolaryngology Supplement, 74, 107-116.
Florentine, M. & Houtsma, A. J. M. (1983). Tuning curves and pitch matches in a listener with a
unilateral low frequency hearing loss. Journal of the Acoustical Society of America, 73,
961-965.
Heinz, M. G., Colburn, H. S. & Carney, L. H. (2001). Rate and timing cues associated with the
cochlear amplifier: level discrimination based on monaural cross-frequency coincidence
detection. Journal of the Acoustical Society of America, 110(4), 2065-2084.
Huss, M. & Moore, B. C. J. (2005). Dead regions and pitch perception. Journal of the Acoustical
Society of America, 117, 3841-3852.
Huss, M. & Moore, B. C. J. (2003). Tone decay for hearing impaired listeners with and without
dead regions in the cochlea. Journal of the Acoustical Society of America, 114, 3283-3294.
Kluk, K. & Moore, B. C. J. (2005). Factors affecting psychophysical tuning curves for hearing
impaired subjects. Hearing Research, 200, 115-131.
Moore, B. C. J. (2004). Dead regions in the cochlea: conceptual foundations, diagnosis and clinical
applications. Ear and Hearing, 25, 98-116.
Moore, B. C. J. & Alcantara, J. I. (2001). The use of psychophysical tuning curves to explore
dead regions in the cochlea. Ear and Hearing, 22, 268-278.
Moore, B. C. J., Huss, M., Vickers, D. A., Glasberg, B. R. & Alcantara, J. I. (2000). A test for
the diagnosis of dead regions in the cochlea. British Journal of Audiology, 34, 205-224.
Moore, B. C. J. & Glasberg, B. R. (1998). Use of a loudness model for hearing aid fitting. I.
Linear hearing aids. British Journal of Audiology, 32, 317-335.
Schoeny, Z. & Carhart, R. (1971). Effects of unilateral Meniere's disease on masking level
differences. Journal of the Acoustical Society of America, 50, 1143-1150.
Shamma, S. A. (1985). Speech processing in the auditory system. II: Lateral inhibition and the
central processing of speech evoked activity in the auditory nerve. Journal of the
Acoustical Society of America, 78, 1622-1632.
Shamma, S. & Klein, D. (2000). The case of the missing pitch templates: how harmonic templates
emerge in the early auditory system. Journal of the Acoustical Society of America, 107,
2631-2644.
Turner, C. W., Burns, E. M. & Nelson, D. A. (1983). Pure tone pitch perception and low
frequency hearing loss. Journal of the Acoustical Society of America, 73, 966-975.
Villchur, E. (1973). Signal processing to improve speech intelligibility in perceptive deafness.
Journal of the Acoustical Society of America, 53, 1646-1657.
Webster, J. C. & Schubert, E. D. (1954). Pitch shifts accompanying certain auditory threshold
shifts. Journal of the Acoustical Society of America, 26, 754-760.
Wolf, N. K., Ryan, A. F. & Bone, R. C. (1981). Neural phase locking properties in the absence of
outer hair cells. Hearing Research, 4, 335-346.
Effect of Dichotic Offset Training (DOT) in Children with an
Auditory Processing Disorder
Priya G & Asha Yathiraj
Abstract
Management of children with auditory processing disorders has gained wide importance
in recent years. Various studies in the literature have shown that training children with central
auditory processing problems using deficit specific intervention results in the improvement of
auditory skills. The present study aimed at finding out the effectiveness of Dichotic Offset
Training in children with auditory processing disorder. Twelve children who failed a screening
checklist and the Dichotic CV and/or the Dichotic Digit test were included in the study. Six of
them, in the experimental group, received Dichotic Offset Training using the training material
developed by Yathiraj (2006). The children in the control group did not receive any training. The
results revealed a statistically significant improvement after training in the Dichotic CV
test. In the Dichotic Digit test, statistically significant improvement was seen in the right ear
single correct scores alone and not in the left ear single correct or double correct scores. Thus
training children with binaural integration deficits using Dichotic Offset Training was found to
be effective.
Introduction
Auditory stimulation is so essential to development of humans that any interruption in
this decoding process may have adverse effects on the overall maturation of an individual. The
presence of an auditory processing problem can disrupt the decoding of auditory signals (Hanson
& Ulvestad, 1979). The current definition of (C)APD explicitly recognizes both the auditory
nature of the disorder and the inherent non-modularity of the central auditory nervous system.
ASHA (2005) defined central auditory processing as “the perceptual (i.e., neural) processing of
auditory information in the central nervous system (CNS) and the neurobiologic activity that
gives rise to the electrophysiologic auditory potentials”. It includes neural mechanisms that
underlie a variety of auditory behaviours including localization/lateralization, performance with
degraded or competing acoustic signals, temporal aspects of audition, auditory discrimination
and auditory pattern recognition.
Recent reports suggest that auditory training (AT) can serve as a valuable intervention
tool particularly for individuals with language impairment and central auditory processing
disorder (C)APD (Chermak & Musiek, 2002). Musiek, Shinn and Hare (2002) noted that the use
of AT for treatment of APD is different from the classic use of AT. Most important to this
difference is that AT applied to APD targets the brain as the main site of mediation, and the
brain, unlike the auditory periphery, is plastic.
Professor of Audiology, All India Institute of Speech and Hearing, Mysore, India
e-mail: [email protected]
Training for auditory integration is one such formal training program (Katz, Chertoff &
Sawusch, 1984; English, Martonik & Moir, 2003). It has been shown that providing deficit
specific therapy does result in improvement in auditory processing (Katz et al., 1984; Putter-Katz
et al., 2002; English et al., 2003).
Binaural integration (BI) is the ability of a listener to process information presented to
both ears at the same time. Poor performance in binaural integration has been found to result in
difficulty in hearing in the presence of background noises or difficulty listening to two
conversations at the same time (Bellis, 1996). An individual with deficit in binaural integration
has been reported to have difficulty in integrating or processing information from more than one
source at a time.
Binaural integration and binaural separation tasks are considered warranted when deficits
are identified during dichotic evaluations. Musiek and Schochat (1998) used auditory training
which involved directing the stimuli to the stronger ear at a reduced level. This sound field
condition provided more cross-over between signals and greater demands on the patient than if
the task was conducted under earphones. It was suggested by Musiek et al. (2002) that this
procedure can also be modified using temporal offsets that lag in the poorer ear which improves
the poorer ear performance.
One form of remediation for individuals with binaural integration problems is dichotic
offset training, originally proposed by Rudmin and Katz (1982, cited in Katz et al., 1984). The
main objective of Dichotic Offset Training (DOT) was to train the child to differentially integrate
the two different stimuli presented separately to the two ears. Katz et al. (1984) studied 10
children aged 7-10 years who demonstrated difficulty on a dichotic test (SSW). They were given
DOT for 15 one-hour sessions using different offset conditions (500, 100, 300, 200, 100 and 0
msec). A consistent pattern of improvement was documented for Staggered Dichotic Digit Test
(SDD). However, they found a lack of statistically significant improvement on the SSW and
Speech-in-Noise tests. They suggested that a battery of auditory training tasks is likely to be
more beneficial than training any single skill.
Musiek and Schochat (1998) reported a case study of a 15 year old patient who
demonstrated bilateral mild deficits on dichotic digits test and moderate bilateral deficits on the
frequency pattern test and the compressed speech with reverberation test. A 6-week auditory
training program was given that included three 1-hour sessions per week along with home
training. Post auditory training performance showed higher scores on all central auditory tests.
A study by English et al. (2003) described another form of treatment for children with
deficits in dichotic listening skills. Ten children with reduced left ear Dichotic Digit Test (DDT)
scores (in the age range of 5 years 10 months to 10 years 9 months) were taken as subjects. They
received additional auditory training in conjunction with the left-ear-only stimulation. The
training was given for 1 hour a week for 10 to 13 weeks. It was found that for most subjects
providing auditory stimulation to the left ear only improved left ear dichotic deficits as measured
by the dichotic digit test. From the above studies it is evident that different forms of training can
be provided which would result in an enhancement in dichotic performance. Both dichotic offset
training and stimulation of the deviant ear have been shown to bring about improvement in
auditory integration.
According to Rupp and Stockdell (1978), 15 to 20% of the school age population have some
type of language/learning disorder, and 70% of these have some form of auditory impairment.
Further, Chermak and Musiek (1997) estimated that as many as 2 to 5% of the school age
population exhibit (C)APDs. In India, 3% of children have been found to
have dyslexia (Ramaa, 1985). Since many school going children have this problem, there is
a need to find appropriate treatment procedures to help them develop their auditory skills and
perform better academically. Many intervention procedures have been reported in the literature but
their efficacy has not been studied. Hence there is a need to study the effectiveness of an auditory
training procedure which would enhance auditory perception. The aim of the present study is to
determine the effectiveness of Dichotic Offset Training in children with low scores on the
Dichotic CV and the Dichotic Digit tests.
Method
Participants
Two groups of participants were included in the present study, an experimental group and
a control group. All the participants were in the age range of 7-12 years and had studied in an
English medium school for at least 3 years. They had normal pure tone, immittance and speech
identification findings. Further, they had normal IQ and no speech problems. Only those who
failed the 'Screening Checklist for Auditory Processing' (SCAP) developed by Yathiraj and
Mascarenhas (2002), the Dichotic CV test developed by Yathiraj (1999) and/or the Dichotic
Digit test developed at AIISH were included in the study. The participant selection criteria for
the control group were the same as for the experimental group. While the experimental group
received dichotic offset training, the control group did not.
Instrumentation
A calibrated dual channel audiometer (Orbiter 922) was utilized for pure tone testing and
for presenting the Dichotic CV and Dichotic Digit tests. To rule out any middle ear pathology a
calibrated immittance meter (GSI Tympstar) was used. An audio CD player (Philips) was used to
present test stimuli during evaluation while a portable audio CD player (Sony) with head phones
was used during the training sessions.
Test Environment
All the evaluations were carried out in a two room situation which was acoustically
treated as per ANSI (1991). Training was given in a quiet, distraction free environment.
Material Used
To select the participants the 'Screening Checklist for Auditory Processing' (SCAP)
developed by Yathiraj and Mascarenhas (2002) was used. Further, to determine their binaural
integration abilities they were evaluated utilizing the 'Dichotic CV test' developed by Yathiraj
(1999) using the norms developed by Krishna (2001) and the 'Dichotic Digit test' developed at
AIISH, with the norms obtained by Regishia (2003). The dichotic offset material developed by
Yathiraj (2006) was used for the training. It consisted of 12 dichotic word lists with six lists
having monosyllables without blends and six lists having monosyllables with blends. Each list
had 10 word pairs. The material had 6 offset lags (500 ms, 300 ms, 200 ms, 100 ms, 50 ms and 0
ms). Each offset lag consisted of 4 word lists, two having a right ear lag and two with a left ear
lag. Prior to administering the dichotic material the familiarity of the words was checked on ten
children in the age range of 7 to 7 years 11 months. In addition, the intelligibility of the recorded
material, which had been recorded on a computer by a female speaker at a sampling rate of 16
kHz, was checked on ten adults. The material was found to be familiar to children as well as
intelligible to adults.
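The way an offset lag can be imposed on a word pair may be sketched as follows. This is an illustration only, not the procedure used to prepare the training material: it assumes that the lag is an onset delay applied to the word in the lagging ear, and the function names `lag_in_samples` and `make_dichotic_pair` are hypothetical. Only the lag values and the 16 kHz sampling rate come from the description above.

```python
SAMPLE_RATE_HZ = 16_000  # sampling rate reported for the recorded material
OFFSET_LAGS_MS = [500, 300, 200, 100, 50, 0]  # offset lags of the training material


def lag_in_samples(lag_ms, sample_rate_hz=SAMPLE_RATE_HZ):
    """Number of samples corresponding to an offset lag given in milliseconds."""
    return round(lag_ms * sample_rate_hz / 1000)


def make_dichotic_pair(leading_word, lagging_word, lag_ms, lagging_ear="right"):
    """Return (left_channel, right_channel) with the lagging word delayed by lag_ms.

    Both words are sequences of samples; silence (0.0) pads the channels to equal length.
    """
    if lagging_ear not in ("left", "right"):
        raise ValueError("lagging_ear must be 'left' or 'right'")
    delay = lag_in_samples(lag_ms)
    total = max(len(leading_word), delay + len(lagging_word))
    lead_channel = list(leading_word) + [0.0] * (total - len(leading_word))
    lag_channel = ([0.0] * delay + list(lagging_word)
                   + [0.0] * (total - delay - len(lagging_word)))
    if lagging_ear == "right":
        return lead_channel, lag_channel
    return lag_channel, lead_channel


# Delay in samples for each lag at 16 kHz.
print({lag: lag_in_samples(lag) for lag in OFFSET_LAGS_MS})
# prints {500: 8000, 300: 4800, 200: 3200, 100: 1600, 50: 800, 0: 0}
```

At the 0 ms lag both words start together, which corresponds to the most difficult condition in the training sequence.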
Procedure
Participant Selection Procedure
The initial selection of the participants was done by screening children using the
'Screening Checklist for Auditory Processing' (SCAP), developed by Yathiraj and Mascarenhas
(2002). The checklist was administered by teachers who had a good knowledge of the
abilities of the children. Twelve of those children who had scored less than 50% were taken for
further evaluation. They were evaluated using the Dichotic CV and Dichotic Digit tests. Half of the
participants were administered the Dichotic CV test first while the other half were administered
the Dichotic Digit test first. Only those who failed these two tests were included in the study.
The initial dichotic test scores also served as the baseline evaluation.
Baseline Evaluation (Evaluation I)
The Dichotic CV test which consisted of 30 pairs of CV segments was administered at 50
dB HL. The children had to repeat the phonemes and the responses were written down by the
clinician. The scores obtained were compared with the norms developed by Krishna (2001). Of
the twelve children who were administered the test ten failed the Dichotic CV test.
The Dichotic Digit Test was presented at 40 dB SL. The children were instructed to
repeat all the numbers heard regardless of the order and the responses were written down. The
norms developed by Regishia (2003) were used to decide whether a child passed or failed a test.
Eleven out of the twelve children failed the test.
Dichotic Offset Training:
Six of the children who failed either of the above tests were given training using the
Dichotic Offset Training (DOT) material developed by Yathiraj (2006), presented through an audio CD player
with headphones. The training was started with the easiest offset lag (500 ms), and once a child
obtained approximately 70% double correct scores the next lower lag material was used. If the
double correct scores did not reach the 70% criterion, the lists were presented again in a
randomized order. Gradually the offset lag was reduced and the task was made more difficult.
Each child was trained using all the lag times with monosyllable lists both without and with
blends. Throughout the training the children were provided feedback regarding their performance
(a head nod for every correct response). On completion of the 0 ms lag lists therapy was stopped.
The number of sessions required by the children varied between 10 and 15, depending on
the abilities of the child.
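The progression rule described above can be sketched as a simple loop. This is a hedged illustration and not the authors' protocol: the scoring callback `score_list`, the function `run_training` and the cap on presentations per lag are hypothetical, and the criterion is treated here as exactly 70% although the text says approximately 70%.

```python
OFFSET_LAGS_MS = [500, 300, 200, 100, 50, 0]  # easiest to most difficult
CRITERION_PERCENT = 70  # double correct criterion for moving to the next lag


def run_training(score_list, max_presentations_per_lag=20):
    """Step through the lags, repeating each one until the criterion is met.

    score_list(lag_ms, attempt) must return the percentage of double correct
    responses obtained for that presentation. Returns (history, completed).
    """
    history = []
    for lag_ms in OFFSET_LAGS_MS:
        for attempt in range(1, max_presentations_per_lag + 1):
            score = score_list(lag_ms, attempt)
            history.append((lag_ms, attempt, score))
            if score >= CRITERION_PERCENT:
                break  # criterion reached: move on to the next, shorter lag
        else:
            # Hypothetical safeguard: stop if the criterion is never reached.
            return history, False
    return history, True


# Hypothetical child who scores 60% on the first and 70% on the second presentation.
history, completed = run_training(lambda lag_ms, attempt: 50 + 10 * attempt)
print(completed, len(history))  # prints True 12
```

In practice the score for each presentation would come from the clinician's tally of double correct responses on a list of 10 word pairs.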
Post therapy evaluation (Evaluation II)
After completion of the 0 ms lag therapy, post therapy evaluation was done for the
experimental group. For the control group evaluation II was done 15 days after evaluation I.
These evaluations were done using the dichotic CV and dichotic digit test and the single correct
and double correct scores were obtained. The scores obtained from evaluation I and II were
tabulated and scored.
Results and discussion
A comparison of the scores obtained in evaluations I and II was done separately for the
experimental group and the control group and also across groups. In addition, a comparison of
the dichotic offset scores obtained during therapy by the experimental group was carried out.
I a) Comparison of evaluations I and II in the experimental group
The scores obtained by the experimental group during evaluation I (pre training
evaluation) and evaluation II (post training evaluation) on the dichotic tests were compared using
the Wilcoxon Signed ranks test. The results revealed a statistically significant difference between
the evaluation I and II scores following the dichotic offset training in the experimental group.
The improvement was statistically significant at the 0.05 level for both single correct and double
correct scores in the dichotic CV test. For the dichotic digit test, the improvement was statistically
significant only for the right ear single correct scores at the 0.05 level of significance. The left
single correct scores and double correct scores did not show any statistically significant
improvement (Table 1 & Figure 1).
Table 1: Comparison of pre and post test scores in the experimental group
Test             Score type             Mean pre therapy score   Mean post therapy score   z value
Dichotic CV      Right single correct             8.7                     15.5              -2.201*
                 Left single correct             13.3                     23.2              -2.01*
                 Double correct                   1.8                     10.8              -2.207*
Dichotic digit   Right single correct            14.4                     22.0              -2.201*
                 Left single correct             18.2                     24.3              -1.577
                 Double correct                   1.7                      7.5              -1.826
* Significant at 0.05 level
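How a z value arises in the Wilcoxon signed rank test for a group of six can be sketched as below. This is a minimal illustration using the normal approximation without a tie correction; the scores are hypothetical and the function name `wilcoxon_signed_rank_z` is not from the study. It is offered only to show that z = -2.201 is consistent with all six children improving by different amounts.

```python
import math


def wilcoxon_signed_rank_z(pre, post):
    """Normal-approximation z for the Wilcoxon signed rank test (no tie correction)."""
    # Paired differences; zero differences are dropped, as is conventional.
    diffs = [after - before for before, after in zip(pre, post) if after != before]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        while end + 1 < n and abs(diffs[order[end + 1]]) == abs(diffs[order[start]]):
            end += 1
        average_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = average_rank
        start = end + 1
    rank_sum_positive = sum(r for r, d in zip(ranks, diffs) if d > 0)
    rank_sum_negative = sum(r for r, d in zip(ranks, diffs) if d < 0)
    t_statistic = min(rank_sum_positive, rank_sum_negative)
    mean_t = n * (n + 1) / 4
    sd_t = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (t_statistic - mean_t) / sd_t


# Hypothetical pre and post scores for six children, all improving by different amounts.
pre_scores = [5, 8, 9, 10, 12, 8]
post_scores = [12, 14, 17, 15, 16, 17]
print(round(wilcoxon_signed_rank_z(pre_scores, post_scores), 3))  # prints -2.201
```

With n = 6 the smaller rank sum is 0 when every child improves, the expected rank sum is 10.5 and its standard deviation is about 4.77, which gives z of about -2.201. Statistical packages may apply a tie correction, so values can differ slightly when differences are tied.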
The results revealed that the dichotic offset training given to children who had a deficit in
binaural integration was effective in helping them acquire that particular auditory skill. The
improvement was smaller in the Dichotic Digit test than in the Dichotic
CV test, which may be because the Dichotic Digit test also requires auditory memory skills along
with binaural integration.
Figure 1: Evaluation I and II for Dichotic CV test for the experimental & control Group
(Bar chart of mean scores, 0 to 30, by score type: single right, single left and double correct, for the experimental group and the control group; separate bars for Evaluation I and Evaluation II.)
I b) Comparison of evaluations I and II done in the control group
The scores obtained by the control group during evaluations I and II were compared
using the Wilcoxon Signed rank test for both the Dichotic CV and Dichotic Digit tests. The results
revealed that there was not much improvement in the Dichotic CV and Dichotic Digit test
scores for the control group, who did not receive any training. The z scores obtained show that
the difference in the scores was not statistically significant (Figure 2).
Figure 2: Evaluation I and II for the Dichotic Digit Test for experimental and control group
(Bar chart of mean scores, 0 to 30, by score type: single right, single left and double correct, for the experimental group and the control group; separate bars for Evaluation I and Evaluation II.)
Thus it can be construed that without Dichotic Offset Training, individuals with poor
auditory integration skills do not show any marked variation in their performance. The finding of
the present study is similar to that of Katz et al. (1984), who also reported that children who did
not receive Dichotic Offset Training did not show an improvement in performance. Besides the
improvement seen using Dichotic Offset Training, a study by English et al. (2003) showed that
even training those with poor dichotic scores in one ear resulted in improvement in dichotic
scores. In their study the poorer ear was stimulated and improvement was seen in the left ear alone.
II) Comparison of evaluation I and II across groups
The scores obtained were compared between the experimental and control groups,
separately for evaluations I and II (Table 2). For evaluation I the mean scores for both the groups
did not vary much for the Dichotic CV and the Dichotic Digit test. However, for evaluation II,
there were variations in the mean scores for the Dichotic CV test but not much for the Dichotic
Digit Test.
To compare the mean scores between the experimental and control groups for evaluations
I and II, the non-parametric Mann-Whitney test was carried out. From Table 3 it is evident that there
was no significant difference between the experimental and control group for evaluation I in the
Dichotic CV and the Dichotic Digit Test. However in evaluation II there was a statistically
significant difference across the groups in the Dichotic CV test. The left single correct score
showed a significant difference at the 0.05 level whereas the right single correct score and double
correct score showed a significant difference at 0.1 level.
Table 2: Mean and standard deviation scores for both the groups on I and II evaluations
Evaluation      Test                 Score type   Experimental group     Control group
                                                  Mean      SD           Mean      SD
Evaluation I    Dichotic CV          RE            8.7      4.5           9.6      3.9
                                     LE           13.3      7.7          16.3      6.6
                                     DC            1.9      3.6           2.3      4.8
                Dichotic digit test  RE           14.4      5.0          17.1      4.7
                                     LE           18.2     10.2          23.3      4.9
                                     DC            1.7      2.4           4.8      8.7
Evaluation II   Dichotic CV          RE           15.5      3.3          12.5      2.3
                                     LE           23.2      2.7          18.2      4.5
                                     DC           10.8      5.0           4.7      4.4
                Dichotic digit test  RE           22.0      3.2          18.4      4.5
                                     LE           24.3      4.3          24.1      5.9
                                     DC            7.5      8.4           6.5      7.3
The Dichotic Digit test did not show any significant difference when compared across the
groups. Thus it can be concluded that following training the experimental group showed a
significant difference which was not observed in the control group on a test that purely tapped
auditory integration (dichotic CV). In contrast, the test that tapped both auditory integration and
auditory memory (dichotic digit test) did not show such an improvement.
Table 3: Comparison of mean scores across the groups
Test                 Group          Score type   Evaluation I mean scores   Significance   Evaluation II mean scores   Significance
Dichotic CV          Experimental   RE                  8.666               NS                    15.500               0.124**
                     Control        RE                  9.583                                     12.500
                     Experimental   LE                 13.333               NS                    23.166               0.036*
                     Control        LE                 16.333                                     18.166
                     Experimental   DC                  1.833               NS                    10.833               0.091**
                     Control        DC                  2.333                                      4.666
Dichotic Digit Test  Experimental   RE                 14.416               NS                    22.000               NS
                     Control        RE                 17.083                                     18.416
                     Experimental   LE                 18.166               NS                    24.250               NS
                     Control        LE                 23.250                                     24.083
                     Experimental   DC                  1.666               NS                     7.500               NS
                     Control        DC                  4.833                                      6.500
* Significant at 0.05 level; ** Significant at 0.1 level
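The between-group comparison can be illustrated with a small sketch of the Mann-Whitney U statistic together with an exact two-sided p-value obtained by enumerating every way of splitting the pooled scores into two groups. This is an assumption-laden illustration: the scores are hypothetical, the function names are not from the study, and the p-value routine is a permutation computation rather than the output of any particular statistical package.

```python
from itertools import combinations


def mann_whitney_u(x, y):
    """U statistic for x: pairs (x_i, y_j) with x_i > y_j count 1, ties count 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


def exact_two_sided_p(x, y):
    """Exact two-sided p-value from all regroupings of the pooled scores."""
    pooled = list(x) + list(y)
    n_x, n_total = len(x), len(pooled)
    center = n_x * len(y) / 2  # expected U under the null hypothesis
    observed = abs(mann_whitney_u(x, y) - center)
    total = extreme = 0
    for indices in combinations(range(n_total), n_x):
        chosen = set(indices)
        group_x = [pooled[i] for i in indices]
        group_y = [pooled[i] for i in range(n_total) if i not in chosen]
        total += 1
        if abs(mann_whitney_u(group_x, group_y) - center) >= observed:
            extreme += 1
    return extreme / total


# Hypothetical post training scores: six trained and six untrained children
# whose scores do not overlap at all.
trained = [7, 8, 9, 10, 11, 12]
untrained = [1, 2, 3, 4, 5, 6]
print(mann_whitney_u(trained, untrained))  # prints 36.0
print(round(exact_two_sided_p(trained, untrained), 4))  # prints 0.0022
```

With two groups of six there are 924 possible splits, so complete separation of the groups is the smallest p-value such a design can produce.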
III) Comparison of dichotic offset scores in the experimental group:
The scores obtained by the experimental group during the dichotic offset training were
also analyzed. The scores obtained at each of the lag times for the monosyllables without blends
(Figure 3) and with blends (Figure 4) were analyzed. The double correct scores obtained during
the therapy sessions were compared across various offset lags. This was done separately for the
training material having a right lag and that having a left lag. For each of the conditions the
baseline scores obtained at the start of the training were compared with the scores obtained at the
end of the training for a particular lag time.
Figure 3: Double correct scores for monosyllables without blends, for varying lag times
Figure 4: Double correct scores for monosyllables with blends, for varying lag times
From Figures 3 and 4 it can be observed that for all the lag conditions, material type
(non-blends and blends) and ear of lag, there was an improvement in performance with training.
The improvement seen during therapy was greater for the monosyllables without blends than for
the monosyllables with blends. The Mann-Whitney test was carried out to check for overall
changes between the baseline performance and the post therapy scores for each lag time. A
statistically significant response was observed only for the 100 msec lag time. For other lag
times, though there was an improvement, it was not statistically significant.
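As an illustration of the comparison described above, the following minimal sketch computes the Mann-Whitney U statistic for one lag time. The scores are hypothetical and are not data from the study, and the function name is an assumption made for illustration. A p-value would still have to be obtained from tables or from statistical software.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return the Mann-Whitney U statistic (the smaller of U_a and U_b)."""
    combined = sorted(sample_a + sample_b)
    # Assign each distinct value the average of the ranks it occupies (handles ties).
    rank_of = {}
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j] == combined[i]:
            j += 1
        rank_of[combined[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    rank_sum_a = sum(rank_of[x] for x in sample_a)
    n_a, n_b = len(sample_a), len(sample_b)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return min(u_a, u_b)


# Hypothetical double correct scores at a single lag time (illustration only).
baseline = [4, 5, 6, 5, 7, 6]
post_training = [8, 9, 7, 10, 9, 8]
u = mann_whitney_u(baseline, post_training)
print(u)
```

A small U relative to the product of the two sample sizes indicates little overlap between the baseline and post-training scores.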
Conclusion
Based on the results of the present study it can be concluded that Dichotic Offset
Training (DOT) is effective in helping children with deficits in binaural
integration. No significant improvement was found for the control group in either the Dichotic CV
or the Dichotic Digit test. The experimental group showed significant improvement (p < 0.05) in
both the single and double correct scores in the Dichotic CV test following training. In the
Dichotic Digit test, significant improvement was found only for the right ear single correct score
(p < 0.05) and not for the left ear single correct or double correct scores. It can be concluded that the
improvement is greater for a dichotic test that taps only binaural integration than for a test that taps
both binaural integration and auditory memory.
References
ANSI. Maximum permissible ambient noise levels for audiometric test rooms, ANSI S3.1. New
York: American National Standards Institute (ANSI), 1991.
ASHA (2005). (Central) auditory processing disorders (Technical report). Available at
www.asha.org /members/deskref-journals/deskref/default.
Bellis, T.J., (1996). Assessment and Management of Central Auditory Processing Disorders in
the Educational Setting: From Science to Practice. San Diego, CA: Singular.
Chermak, G.D. & Musiek, F.E., (1997). Central Auditory Processing Disorders: New
Perspectives. San Diego, CA: Singular.
Chermak, G.D. & Musiek, F.E., (2002). Auditory Training: Principles and Approaches for
Remediating and Managing Auditory Processing Disorders. Seminars in Hearing, 23(4),
297-308.
English, K., Martonik, J. & Moir, L. (2003). An auditory training technique to improve
dichotic listening, Hearing Journal, 56(1), 34-38.
Hanson, D.G. & Ulvestad, R.F., (1979). Otitis media and child development. Annals of Otology,
Rhinology and Laryngology, 88 (Suppl.60), 1-111.
Katz, J., Chertoff, M. & Sawusch, J., (1984). Dichotic training. Journal of Auditory Research,
24, 251-264.
Krishna, G. (2001). Dichotic CV tests – Revised: Normative Data on Children, Independent
project done as part fulfillment for the Degree of Master of Science, Submitted to the
University of Mysore, Mysore.
Musiek, F.E. & Schochat, E., (1998). Auditory training and central auditory processing
disorders: A case study. Seminars in Hearing, 19, 357-366.
Musiek, F. E., Shinn, J. & Hare, C., (2002). Plasticity, Auditory Training, and Auditory
Processing Disorders. Seminars in Hearing, 23(4), 263-275. Guest Editor: Chermak,
G.D.
Putter-Katz, H., Said, L.A., Feldman, I., Meran, D., Kushmit, D., Muchnik, C. & Hildesheimer,
M., (2002). Treatment and evaluation indices of Auditory Processing Disorders. Seminars
in Hearing, 23(4): 357-363.
Ramaa, S., (1985). Diagnosis and remediation of dyslexia. Ph.D. Thesis, University of Mysore,
Mysore.
Regishia, A., (2003). Effect of maturation on Dichotic tests: A comparison of Dichotic Digit and
Dichotic Consonant Vowel test, Independent project done as part fulfillment for the
Degree of Master of Science, Submitted to the University of Mysore, Mysore.
Rupp, R.R. & Stockdell, K.G., (1978). Speech protocols in Audiology. New York: Grune &
Stratton.
Yathiraj, A., (1999). The Dichotic CV Test. Unpublished Material developed at the Department
of Audiology, AIISH, Mysore.
Yathiraj, A. & Mascarenhas, K. (2002). Effect of Auditory stimulation of central auditory
processes in children with APD. A project funded by the AIISH research fund.
Yathiraj, A., (2006). The Dichotic Offset Training Material. Unpublished Material developed at
the Department of Audiology, AIISH, Mysore.
Development of High And Low Predictable English
Sentence Test (EHLPS)
V V Rahana Nandan & Asha Yathiraj
Abstract
In an everyday situation there is a combination of high and low predictable sentences.
The present study aimed at developing an English High Predictable-Low Predictable Sentence
test for Non-Native English speakers (EHLPS). The test was administered on twenty normal
hearing and eleven individuals with mild-to-moderate sensorineural hearing loss. The responses
were scored in terms of high and low predictable target words and key words in the sentences.
The statistical analysis of the data revealed that there was a significant difference between the
normal group and individuals with hearing impairment on the EHLPS for both key word and
target word scoring. The developed test has been found to be useful in determining the
perceptual problems in individuals having hearing loss. Hence it may be used as a part of
diagnostic test battery and for pre and post therapy evaluations in individuals having auditory
perceptual problems.
Key words: speech identification, perceptual problems, sensorineural hearing loss
Introduction
Speech is highly redundant because the information in it is conveyed in several ways
simultaneously (Martin, 1994). A hearing loss involving only part of the auditory frequency
range may go undetected in a speech test which is not carefully controlled. It has been noted by
Martin (1994) that it was not possible in a single test to sample all types of speech events that
might occur in practice. This is because everyday speech communication covers a wide range of
spoken material and takes place in a variety of contexts.
Denes and Pinson (1963) reported that basically two kinds of operations were involved in
the understanding of sentences. One was the reception and initial processing of acoustic
information through the auditory system and the other was the utilization of linguistic
information that is stored in memory. A test of a listener's ability to understand everyday speech
therefore must assess both the acoustic-phonetic and the linguistic-situational components of the
process.
The goal of most speech perception tests is to provide a measure of an individual's
performance in everyday listening situations. Although there are many meaningful word and
nonsense syllable tests available that provide analytic information regarding a patient's speech
perception abilities, sentence tests offer additional insight about the individual's performance in
more realistic communication situations. Sentences are considered more valid indicators of
intelligibility and a better representation of spoken communication. The use of single words,
especially single syllable words, imposes severe limitations on the capacity to manipulate certain
patterns like intonation and co-articulation effects on the ongoing speech. Sentences have face
validity as 'natural' and 'meaningful' stimuli for assessing auditory function (Miller, Heise &
Lichten, 1951).
Professor of Audiology, All India Institute of Speech and Hearing, Mysore, India
Different forms of sentence tests have been developed over the years. Certain sentence
tests have been developed with the aim of tapping the perceptual difficulties of those with
hearing loss (Mendel & Danhauer, 1997). Other sentence tests have been constructed to
determine difficulties in the perception of high predictable or low predictable sentences. High
predictable sentences are those in which the target test words can be guessed from the context
whereas in low predictable sentences the final word cannot be guessed from the sentence context
(Kalikow, Stevens & Elliott, 1977). Day-to-day listening situations involve a combination of high
and low predictable sentences.
The present study aimed to develop a test having low and high predictability sentences
and to administer it to a group of normal hearing individuals and a group having mild-to-moderate
hearing loss, in order to determine whether the test can differentiate the perceptual problems of
individuals with a hearing loss.
Method
Participants: The study was done in three stages. The participants in each stage were
different. In stage I, ten normal hearing children in the age range of 12 years to 17 years 11
months were used to check for the familiarity of words used in the sentence test. In addition, ten
normal hearing adults (18 to 30 years) were used to classify the sentences as high predictable and
low predictable. For stage II of the study two groups of normal hearing individuals were used,
each consisting of 20 members. One group was in the age range of 12 years to 17 years 11
months and the other in the age range of 18 years to 30 years. In stage III ten individuals having
mild-to-moderate sensorineural hearing loss were taken to check the utility of the material.
Each participant in stages I and II had English as a medium of instruction for at least 5
years. They had normal hearing (i.e. air conduction and bone conduction thresholds within 15 dB
HL with an air-bone gap of less than 10 dB in the frequency range of 250 Hz to 8 kHz and 250
Hz to 4 kHz respectively). The participants had normal speech and language with no history of
hearing loss or any report of neurological problems. The participant inclusion criteria for stage
III were the same as those for stages I and II except that the participants had a mild-to-moderate
sensorineural hearing loss. Their age ranged from 20 to 55 years (mean age of 40 years).
Instrumentation: A dual channel calibrated diagnostic audiometer (Madsen OB 922) was used
for establishing hearing thresholds and for administering the developed test material. To rule out
middle ear problems an immittance audiometer (GSI-Tymstar) was utilized. A Pentium IV
computer with the WavePad software was used to record the material and normalization of the
speech material was done using the Adobe Audition software.
Test development: Seventy-five sentences were constructed with each sentence containing 5 to
7 words. A pilot study was done to check for the familiarity of the developed material and to
classify them as high and low predictable sentences. Ten normal hearing children aged 12 years
were used to check the familiarity of words. The participants were instructed to classify the
words on a three point scale as 'highly familiar', 'familiar' or 'not familiar'. The familiarity of
words was decided based on their frequency of occurrence in regular communication.
To classify the sentences in terms of predictability ten adults were used. The adults were
instructed to categorize the sentences as high predictable or low predictable sentences based on
their ability to guess the final word. Each participant was given the set of sentences with the
target word not provided and they had to guess it. They were instructed to give as many options
as possible for the target word. The sentences in which only one option was given that matched
the test stimuli were classified as highly predictable sentences. In contrast, sentences with more
than one target word were considered as low predictable sentences. Using the above material five
lists of sentences were developed. Each of them consisted of 10 sentences. Each list
contained an equal number of high and low predictable sentences (Appendix A). The
developed test was titled 'High predictable and low predictable English sentence test for
non-native English speakers' (EHLPS).
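The classification rule described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name is hypothetical, the guessed words are invented for the example, and the rule is read as "high predictable only when every option offered by the raters is the intended target word".

```python
def classify_predictability(guesses_per_rater, target_word):
    """Classify a sentence as 'HP' or 'LP' from the raters' guessed final words.

    guesses_per_rater: one list of guessed words per rater.
    Assumed rule: HP when the raters offered a single distinct option and it
    matches the target word; LP when more than one option was offered.
    """
    options = {guess.strip().lower() for rater in guesses_per_rater for guess in rater}
    if options == {target_word.strip().lower()}:
        return "HP"
    return "LP"


# Hypothetical guesses for "A year has twelve ___" and "We could consider the ___".
print(classify_predictability([["months"], ["months"], ["Months"]], "months"))
print(classify_predictability([["request", "offer"], ["matter"]], "request"))
```

Under this reading, the first sentence is classified as high predictable and the second as low predictable.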
A female speaker was used for recording the material onto a computer. The 'WavePad'
software was used for the recording. The recording was done in a quiet room using a sampling
rate of 16 kHz. Scaling of the signals was done using the 'Adobe Audition' software to ensure that
the intensity of all the sounds was equal. A 1 kHz calibration tone was recorded prior to each
list.
Procedure: In stage II, the developed sentence test was administered to normal hearing
individuals in a sound treated double suite. Prior to the administration of the test the
pure tone thresholds of the participants were obtained. The speech recognition threshold (SRT)
was established using the English paired words developed by Chandrashekara (1972). The
recorded version of the EHLPS test was played on a Pentium IV computer using the WavePad
software. The output of the computer was routed to the tape input of the audiometer. The output
from the audiometer was played at 40 dB SL with reference to the participant's SRT. The
calibration tone was used to adjust the VU meter deflection of the audiometer to zero. The
participants heard the recorded material through headphones. Half the participants were tested in
the left ear and half in the right ear to avoid any ear effect. The participants were asked to write
down as well as verbally repeat what they heard. The verbal responses were noted by the
experimenter. The procedure for stage III was similar to that of stage II, except that instead of
normal hearing individuals the test was administered to ten adults with a mild-to-moderate
hearing loss. They were tested in their better ear.
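The presentation level and the ear assignment used in stage II can be illustrated with a short sketch. The function names, the participant identifiers and the alternating assignment are assumptions made for illustration; the source states only that the level was 40 dB SL re SRT and that half the participants were tested in each ear.

```python
SENSATION_LEVEL_DB = 40  # dB SL above the participant's SRT, as stated in the procedure


def presentation_level_db_hl(srt_db_hl):
    """Audiometer dial level (dB HL) that gives 40 dB SL re the participant's SRT."""
    return srt_db_hl + SENSATION_LEVEL_DB


def assign_test_ears(participant_ids):
    """Alternate ears so that half the participants are tested in each ear."""
    return {
        pid: ("left" if index % 2 == 0 else "right")
        for index, pid in enumerate(participant_ids)
    }


# Hypothetical example: a participant with an SRT of 10 dB HL, and 20 participants.
print(presentation_level_db_hl(10))
ears = assign_test_ears(range(1, 21))
print(list(ears.values()).count("left"), list(ears.values()).count("right"))
```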
The responses from the participants were scored in two different ways. The first
way involved scoring the high predictable or low predictable target words (final words), while the
second way involved scoring the key words in the sentences. Every correct response was awarded a
score of one and every incorrect response got a score of zero. The maximum score for target
words was ten for each list, with five being awarded for the high predictable sentences and the
other five for the low predictable sentences. In contrast, the maximum score for the key words varied
across lists. List 1 had 28 key words while Lists 2, 3, 4 and 5 had 29, 27, 30 and 27 key words
respectively. The raw scores for target words and key words of the participants were statistically
analyzed separately using the computer software SPSS (version 10.0).
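The two scoring methods can be sketched as below. This is only an illustration: the function names are hypothetical, and the key words chosen for the example sentence are an assumption, since the key word marking of the original lists is not reproduced here. The maximum scores per list are taken from the text.

```python
KEY_WORD_MAX = {1: 28, 2: 29, 3: 27, 4: 30, 5: 27}  # key words per list, from the text
TARGET_WORD_MAX = 10  # ten target (final) words per list


def score_words(expected_words, response_words):
    """Award one point for each expected word present in the response, else zero."""
    heard = {word.lower() for word in response_words}
    return sum(1 for word in expected_words if word.lower() in heard)


def percent_score(raw_score, max_score):
    """Convert a raw score to a percentage of the maximum score."""
    return 100.0 * raw_score / max_score


# Sentence "I hit the ball with a bat." with assumed key words and target word "bat".
expected_key_words = ["hit", "ball", "bat"]
response = "I hit the ball with a hat".split()
key_score = score_words(expected_key_words, response)
target_score = score_words(["bat"], response)
print(key_score, target_score)
```

In this example the listener earns two of the three key word points and no point for the target word.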
Results and Discussion
The data obtained on the group having normal hearing and the group having hearing loss
were analysed separately. For each of the groups the analyses were done within lists and across
lists.
Analyses of data from normal hearing individuals
To compare the scores of high and low predictable sentences within each list, descriptive
statistics were initially computed: the mean, standard deviation (SD) and 95% confidence
interval. This was done separately for each of the lists (Table 1). It can be
observed from the table that for both high predictable (HP) and low predictable (LP) sentences
the mean scores were either equal to the maximum scores or were just slightly less than the
maximum scores. The variability in scores was almost nil or minimal as evident from the SD
values.
Table 1: Mean, SD and 95% confidence interval values for High predictable (HP) and Low
predictable (LP) sentence scores
List no   Sentence type   Mean (Max score = 5)   SD     Lower bound   Upper bound   Significance
List 1    HP              4.85                   .37    4.68          5.00          NS
          LP              5.00                   .00    -             -
List 2    HP              5.00                   .00    -             -             NS
          LP              4.85                   .49    4.62          5.00
List 3    HP              4.95                   .22    4.85          5.00          NS
          LP              5.00                   .00    -             -
List 4    HP              5.00                   .00    -             -             NS
          LP              4.85                   .37    4.68          5.00
List 5    HP              5.00                   .00    -             -             NS
          LP              4.90                   .45    4.69          5.00
Note. NS = Not significant
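The confidence bounds in Table 1 can be approximately reproduced from the reported means and SDs. The sketch below assumes 20 participants (as stated in the abstract), a t-based interval with the two-tailed 5% critical value for 19 degrees of freedom (2.093), and an upper bound capped at the maximum score of 5. These assumptions and the function name are not stated in the source.

```python
import math

T_CRIT_DF19 = 2.093  # two-tailed 5% critical value of t for df = 19 (assumed n = 20)


def confidence_interval(mean, sd, n, t_crit, max_score=None):
    """Return (lower, upper) bounds of a t-based confidence interval for a mean."""
    half_width = t_crit * sd / math.sqrt(n)
    lower, upper = mean - half_width, mean + half_width
    if max_score is not None:
        upper = min(upper, max_score)  # a mean score cannot exceed the maximum score
    return lower, upper


# List 1, HP sentences: mean 4.85, SD 0.37.
lower, upper = confidence_interval(4.85, 0.37, 20, T_CRIT_DF19, max_score=5.0)
print(round(lower, 2), round(upper, 2))
```

With these assumptions the List 1 HP interval agrees with the tabled bounds of 4.68 and 5.00.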
Further, to check for the variation between the high and low predictable sentences within
each list, a paired sample t-test was done. The t values obtained showed no significant difference at
the 0.05 level (Table 1). The findings of the present study are not in agreement with those of
Kalikow, Stevens and Elliott (1977), who reported better performance for HP
sentences than for LP sentences. This lack of agreement can be attributed to the
difference in testing procedure. The study by Kalikow et al. was done in the presence of various
signal-to-noise ratios (SNR) whereas the EHLPS was administered in a quiet condition. Individuals with
normal hearing depend more on contextual cues in adverse listening conditions such as
noise than in a quiet condition. Had the present study been conducted in the presence of noise,
a result similar to that of Kalikow et al. would probably have been obtained.
To check the difference between high predictable and low predictable sentence scores
across lists a one-way repeated measures ANOVA was carried out. It showed no significant
difference across the lists for the high predictable sentences [F (4, 76) = 2.259, p > 0.05]
or the low predictable sentences [F (4, 76) = 1.048, p > 0.05]. The above findings indicate that all
five lists are similar in terms of HP and LP sentences. Since normal hearing individuals
performed equally well on all five lists, any one of them can be used while evaluating the speech
identification ability of clients when HP-LP scoring is done. Repeated measures ANOVA was
also calculated for key word scores across lists and it showed a significant difference [F (4,
76) = 3.009, p < 0.05]. This was unlike that seen for the HP-LP target word scores (Table 1)
where there was no significant difference across lists. This highlights that the lists are equal
when they are evaluated in terms of HP and LP scores but are unequal when they are scored on the
basis of key words. The mean values for both the target HP-LP scoring and key word scoring are
given in Table 2. Within each list the scores for HP-LP words and key words are comparable.
Table 2: Mean values for HP-LP target word scores and key-word scores
List no   HP-LP target word score   Key word score
List 1    9.85 (98.5%)              27.6 (97.6%)
List 2    9.85 (98.5%)              28.6 (98.7%)
List 3    9.95 (99.5%)              26.9 (99.6%)
List 4    9.85 (98.5%)              29.7 (99.3%)
List 5    9.9 (99%)                 26.6 (98.7%)
Note: Values given in brackets refer to the percentage score. The maximum HP-LP target word score
was ten and the maximum key word score ranged between 27 and 30.
From Bonferroni's multiple comparison test it was evident that List 1 and List 3
showed a significant difference while the other pairs of lists did not. The participants obtained
significantly lower scores on List 3 when compared to List 1.
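The bookkeeping behind the repeated measures ANOVA and the pairwise comparisons reported above can be sketched as follows. The function names are hypothetical, and dividing alpha by the number of comparisons is the textbook form of the Bonferroni correction; the statistical software used in the study may instead apply the equivalent adjustment to the p-values.

```python
from itertools import combinations


def repeated_measures_df(n_participants, n_conditions):
    """Degrees of freedom (effect, error) for a one-way repeated measures ANOVA."""
    df_effect = n_conditions - 1
    df_error = (n_conditions - 1) * (n_participants - 1)
    return df_effect, df_error


def bonferroni_pairs(n_conditions, alpha=0.05):
    """List all pairs of conditions and the Bonferroni-adjusted alpha per comparison."""
    pairs = list(combinations(range(1, n_conditions + 1), 2))
    return pairs, alpha / len(pairs)


# Five lists and 20 normal hearing participants give the reported F (4, 76).
print(repeated_measures_df(20, 5))
pairs, adjusted_alpha = bonferroni_pairs(5)
print(len(pairs), adjusted_alpha)
```

With five lists there are ten pairs of lists to compare, of which only the List 1 and List 3 pair differed significantly for key word scores.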
A possible reason why List 1 and List 3 are not equal could be the method
used in the construction of the lists. While constructing the test care was taken to equate the
target HP-LP words in each sentence in terms of frequency of occurrence of various phonemes.
This was not done for the key words as the main aim of the study was to develop and evaluate
HP-LP sentences. Further it is possible that the HP-LP target words were easier to predict in the
sentence compared to the other key words in the sentences. Also the words occurring toward the
end of a sentence tend to be more predictable and more likely to be restored and recalled quickly
than the rest of the words in the sentence.
Thus, it is recommended that when key words are being used to score the responses of
participants the combination of List 1 and 3 should not be used for comparing perceptual
outcomes. However, other list combinations can be used for perceptual evaluation of individuals.
These combinations include Lists 1, 2, 4 and 5 or Lists 2, 3, 4 and 5.
Analyses of data collected from the group with hearing impairment
Table 3: Mean, Standard deviation and 95% confidence interval values for HP and LP sentence
scores in individuals with hearing impairment
List no   Sentence type   Mean (Max score = 5)   SD     Lower bound   Upper bound   Level of Sig.
List 1    HP              3.55                   1.21   2.73          4.36          0.831
          LP              3.45                   .93    2.83          4.08
List 2    HP              4.55                   .69    4.08          5.00          0.006**
          LP              3.45                   .93    2.83          4.08
List 3    HP              4.64                   .50    4.30          4.98          0.019*
          LP              3.55                   1.29   2.68          4.41
List 4    HP              4.55                   .52    4.19          4.90          0.000**
          LP              3.18                   .75    2.65          3.69
List 5    HP              4.64                   .50    4.30          4.98          0.038*
          LP              3.91                   1.14   3.15          4.67
Note. * Significant at .05 level; ** Significant at .01 level
High and low predictable sentence scores within each list were compared in the
individuals with hearing impairment. The mean scores varied only minimally depending on
whether the sentence was a high predictable one or a low predictable one. For all five lists, the
scores obtained on the LP sentences were lower. The t-test revealed a significant difference
between the HP and LP sentences for all but List 1 either at the 0.05 level or the 0.01 level
(Table 3). Also, the variability in scores was comparatively more in LP sentences compared to
the HP sentences as seen from the SD values.
The findings of the present study reveal that the individuals with hearing impairment
depended more on the contextual cues than on the audibility cues. The contextual cues were
limited in the LP sentences and hence they obtained comparatively lower scores on these sentences.
One-way repeated measures ANOVA was done to compare the high
predictable and low predictable sentence scores across lists. The scores obtained from the
individuals with hearing impairment showed a significant difference between the lists for the
high predictable sentences [F (4, 40) = 4.518, p < 0.05]. Bonferroni's multiple comparisons
test revealed that List 1 and List 2 had a significant difference and the other pairs of lists did not
show a significant difference. The results were not similar for the low predictable sentences
across lists. Here there was no significant difference seen [F (4, 40) = 0.974, p > 0.05], indicating
that individuals with hearing impairment performed similarly on the LP sentences across lists.
Probably with the HP sentences the individuals were able to guess the target word in certain lists
and not so in certain other lists. However, this was not the case with the LP sentences.
The mean scores obtained for HP-LP target words and key words, expressed in terms of
raw scores as well as percentage, are depicted in Table 4. When the key words were scored in
individuals with hearing impairment it showed a significant difference across lists [F (4, 40) =
4.905, p < 0.05]. This is similar to what was observed for the HP sentence scores in the group
with hearing impairment. It was seen from Bonferroni's multiple comparison test that for the
key word scores, List 2 and 5 showed a significant difference. Likewise Lists 3 and 5 had a
significant difference while the other lists did not have a significant difference between them.
Table 4: Mean and SD for HP-LP target word scores and key word scores in individuals with a HI
List no   Target HP-LP word score   Key word score
List 1    7 (70%)                   20.09 (71.7%)
List 2    8 (80%)                   21.81 (75.1%)
List 3    8.19 (81.9%)              20.36 (74.7%)
List 4    7.73 (77.3%)              23.36 (79.3%)
List 5    8.55 (85.5%)              22.81 (85.1%)
Note: Values given in brackets refer to the percentage score. The maximum HP-LP word score was
ten and the maximum key word score ranged between 27 and 30.
Scores were comparable within a list when HP-LP target word and key word scores
were used. The similarity in scores was more pronounced in Lists 1, 4 and 5. Both scoring
procedures seem to detect the perceptual problems of individuals with hearing impairment.
Comparison between Normal Group and the Deviant Group
The high and low predictability sentence scores were compared between groups. The
mean HP-LP scores for the two groups are depicted in Table 5. The mean scores obtained by the
individuals with hearing impairment were lower when compared to the normal hearing group.
An independent t-test was performed and it was found that there was a significant difference
between the two groups for all the lists for both HP and LP sentences. The difference was
significant at the 0.01 level in every case except the HP sentences in List 3, which differed at
the 0.05 level. The findings of the present
study are in agreement with those reported in the literature. Olsen, Noffsinger and Kurdziel (1975)
have documented that speech discrimination scores were comparatively worse in individuals
with hearing impairment in quiet. Similarly, Pekkarinen, Salmivalli and Suonpaa (1990) reported
that word recognition scores were poorer in their participants with hearing impairment compared
to the normal hearing group in a quiet situation. Thus it can be inferred that HP-LP target word
scores are sensitive in assessing perceptual problems in individuals with hearing impairment.
High predictable and low predictable sentences are equally sensitive.
Table 5: Mean and t values for HP-LP target words across normal and individuals with HI
List     Sentence type   Group    Mean (Max score = 5)   t value
List 1   HP              Normal   4.85                   4.50**
                         HI       3.55
         LP              Normal   5.00                   7.50**
                         HI       3.45
List 2   HP              Normal   5.00                   2.99**
                         HI       4.55
         LP              Normal   4.85                   5.49**
                         HI       3.45
List 3   HP              Normal   4.95                   2.40*
                         HI       4.64
         LP              Normal   5.00                   5.10**
                         HI       3.55
List 4   HP              Normal   5.00                   3.94**
                         HI       4.55
         LP              Normal   4.85                   8.36**
                         HI       3.18
List 5   HP              Normal   5.00                   3.27**
                         HI       4.64
         LP              Normal   4.90                   3.47**
                         HI       3.91
Note. * Significant at .05 level; ** Significant at .01 level
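The t values in Table 5 can be approximated from the summary statistics in Tables 1 and 3. The sketch below assumes a pooled-variance independent t-test with 20 normal hearing participants and 11 participants with hearing impairment (the group sizes given in the abstract). The function name is hypothetical and the test variant used in the study is not stated in the source.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t statistic using the pooled variance estimate."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error


# List 1, HP sentences: normal group 4.85 (SD 0.37), HI group 3.55 (SD 1.21).
t_list1_hp = pooled_t(4.85, 0.37, 20, 3.55, 1.21, 11)
print(round(t_list1_hp, 2))
```

Under these assumptions the result is about 4.49, close to the tabled value of 4.50; the small difference is consistent with rounding of the reported means and SDs.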
One-way repeated measures ANOVA was calculated for the key words in the normal and
deviant groups and it showed a significant difference [F (4, 116) = 9.067, p < 0.05]. Along with
ANOVA, an independent t-test was also done to check for differences in key word scores
between the groups. The t values showed a significant difference at the 0.01 level (Table 6).
This shows that key word scoring is also an equally sensitive test procedure to detect perceptual
deficits in the hearing impaired population.
Table 6: Mean and t values for key words across normal and HI group
List no   Group    Mean     t value
List 1    Normal   97.66    9.90**
          HI       71.75
List 2    Normal   98.78    8.52**
          HI       75.19
List 3    Normal   99.63    9.36**
          HI       74.74
List 4    Normal   99.33    6.80**
          HI       79.36
List 5    Normal   98.70    6.41**
          HI       85.10
Note. ** Significant at .01 level
Conclusion
From the present study it can be concluded that there is no significant difference
between the high predictable and low predictable sentence scores in the normal population. All
five lists were equivalent when scored on high predictable and low predictable target words,
but the lists were unequal when key words were scored in normal hearing individuals. In individuals
with hearing impairment the LP sentences yielded significantly lower scores than the HP
sentences for most of the lists. Overall there was a significant difference between the normal
hearing group and individuals with hearing impairment on the EHLPS for both key word and
target word scoring. Thus EHLPS is a sensitive test for the assessment of auditory perceptual
difficulty in individuals having a hearing problem. The test would provide information about the
auditory perceptual problems present in individuals having a hearing loss.
References
Chandrashekara, S. (1972). Development and standardization of speech test material in English
for Indians. Unpublished Master's Dissertation, University of Mysore, Mysore.
Denes, P. B. & Pinson, E. N. (1963). The Speech Chain. Baltimore: Waverly Press Inc.
Kalikow, D. N., Stevens, K. N. & Elliott, L. L. (1977). Development of a test of speech
intelligibility in noise using sentence materials with controlled word predictability.
Journal of the Acoustical Society of America, 61, 1337-1351.
Martin, F. N. (1994). Introduction to Audiology. Englewood Cliffs, New Jersey: Prentice Hall.
Mendel, L. L. & Danhauer, J. L. (1997). Audiological evaluation and management and speech
perception assessment. San Diego, London: Singular Publishing Group Inc.
Miller, G. A., Heise, G. A. & Lichten, W. (1951). The intelligibility of speech as a function of
the context of the test materials. Journal of Experimental Psychology, 41, 329-335.
Olsen, W. O., Noffsinger, D. & Kurdziel, S. (1975). Speech discrimination in quiet and in white
noise by patients with peripheral and central lesions. Acta Otolaryngol, 80, 375-382.
Pekkarinen, E., Salmivalli, A. & Suonpaa, J. (1990). Effect of noise on word discrimination by
participants with impaired hearing, compared with those with normal hearing. Scand
Audiol, 19, 31-36.
Appendix – A
English High Predictable Low Predictable Sentence test for Non-native English speakers
(EHLPS)
List 1
1. A year has twelve months.
2. I hit the ball with a bat.
3. The sport shirt has short sleeves.
4. I was made to lift my bag.
5. The baby slept with closed eyes.
6. She baked his birthday cake.
7. The room is always kept neat.
8. Put a battery in the clock.
9. February has 28 days.
10. He looks different with a beard.
List 2
1. She just heard a loud scream.
2. The peacock is our national bird.
3. He had a bath with hot water.
4. The heavy rains caused a flood.
5. The baby has chubby cheeks.
6. I have got a new dress.
7. He wiped the mirror with a sponge.
8. He eats using his right hand.
9. A day has 24 hours.
10. Give her a few slices of bread.
List 3
1. The dogs were tied to the gate.
2. She has to pay the tuition fees.
3. We got drenched in the rain.
4. I need to fill ink in my pen.
5. He prefers to have tea.
6. I got stuck in the lift.
7. Lotus is our national flower.
8. The bomb exploded with a blast.
9. The barber cut his hair.
10. She opened the room with a key.
List 4
1. The cricket match ended in a draw.
2. The bomb exploded with a blast.
3. He stuck the paper with glue.
4. In autumn, the trees shed their leaves.
5. Sunday is a holiday.
6. Every morning I brush my teeth.
7. There are 7 days in a week.
8. She hit the water with a splash.
9. He was asked to unlock the door.
10. We could consider the request.
List 5
1. A dog has four legs.
2. He was assigned the task.
3. A rainbow has seven colours.
4. I met with a car accident.
5. The sun rises in the east.
6. The door was wide open.
7. I made the call from a booth.
8. Stop playing with your hands.
9. Help me in arranging the books.
10. We should have considered the matter.
Note: Words in bold are the key words and words underlined are the HP-LP target words.
NRT: Comparison of Artefact Cancellation and
Threshold Estimation Techniques
Shibasis Chowdhury & P Manjula
Abstract
Neural response telemetry (NRT) is a process of recording electrically evoked compound
action potential (ECAP) from the auditory nerve in individuals with cochlear implants. Since
ECAP is a very early potential, there is an adverse effect of stimulus artefact on the recording of
the ECAP via NRT. Several approaches have been proposed to reduce the stimulus artefact in
NRT. A comparison of three such techniques for artefact reduction was made. Results revealed
that all the techniques can be used to record NRT. A comparison of techniques available to
determine the threshold NRT (T-NRT) was also done and results revealed that the visual
estimation of T-NRT was a better choice.
Key words: Alternating polarity, forward masking, artefact template, visual estimation, peak
picker.
Introduction
The most direct measure of auditory nerve activity in cochlear implant users is the
electrically evoked compound action potential (Abbas et al., 1999). The first recording of
electrically evoked whole nerve compound action potential (ECAP) from human cochlear
implant users was reported by Brown, Abbas and Gantz in 1990. The method used was an
adaptation of the paradigm described by Sauvage, Cazals, Erre, and Aran in 1983. The term
“Telemetry” describes the measurement of data and transmission of data from a remote source to
a receiving station for recording and analysis (Mens, 2004). The telemetry system used to
measure the ECAP in Nucleus cochlear implant users is referred to as NRT (Abbas et al., 1999).
In humans this response consists primarily of a negative peak often referred to as N1 with a
latency of 0.2 to 0.5 ms; and at high presentation levels, the initial negative peak is often
followed by a less robust positive peak that is referred to as P2 (Brown, 2004).
While recording ECAP the problem faced is that, in addition to the neural response
evoked by the electrical stimulus pulse, a very large stimulus artefact will also be recorded, which
is often large enough to saturate the recording amplifier (Brown, 2004). As a result the ECAP cannot
be visualized in the presence of the artefact. This problem has led to several proposals for
reducing or minimizing the stimulus artefact recorded during ECAP measurement.
For recording NRT there are several techniques available in the Custom Sound EP
software (version 1.3) for artefact reduction, such as 1. Forward masking, 2. Artefact template,
3. Alternating polarity and 4. Masked response extraction. These techniques are referred to as
artefact cancellation techniques in NRT. The present study is concerned with only the first three
artefact cancellation techniques. Once the NRT is recorded at different current levels there is a
need to identify the threshold of NRT (T-NRT), i.e., the minimum current level at which an NRT
response is obtained. Threshold estimation of NRT can be done by visual detection of the NRT
waveform or automatically by the software, based on some predefined rules.
Professor in Audiology, All India Institute of Speech and Hearing, Mysore, India
email: [email protected]
Dissertation Vol.V, Part-A, AIISH, Mysore
The Custom Sound EP software has the option of a 'peak picker', which offers the facility
of automated NRT response identification. Threshold can be defined as the lowest current level
at which the peak picker identifies an NRT response. There is also an option for extrapolated
NRT threshold identification using regression analysis. In the present study, the different artefact
cancellation techniques (forward masking, artefact template and alternating polarity) and the
different threshold estimation techniques (visual T-NRT estimation, peak picker based T-NRT
estimation and regression analysis extrapolated T-NRT) were compared. The different
artefact cancellation and threshold estimation techniques are described briefly below.
Forward Masking
The method uses a non-simultaneous forward masking paradigm with three stimulation
conditions: probe alone, masker plus probe, and masker alone. The probe alone condition
contains both the probe stimulus artefact and the neural response to the probe. In the masker
plus probe condition the masker puts the auditory nerve fibers into their refractory period, so
that only the probe artefact and the masker artefact are recorded. The masker alone condition
records the masker artefact.
Once the recordings have been made in each of these three stimulation conditions, extraction of
the ECAP from the stimulus artefact is accomplished in two steps. First, the averaged response
recorded in the second condition (masker plus probe) is subtracted from the averaged response
recorded in the first condition (probe alone). This subtraction yields a response in which the
masker artefact has been inverted 180 degrees and the probe artefact has been minimized. The
second step is to add the response recorded in the third condition (masker alone) to the result
of the subtraction. This step allows elimination, or at least reduction, of the artefact associated
with the masker. The main assumption is that the masker-probe interval is short enough (<0.5
ms) for all the nerves to be in their absolute refractory state (Brown & Abbas, 1990). If the
masker-probe interval is > 0.5 ms, there may be a relative refractory component at the moment of
the probe stimulus in the masker-probe frame, caused by some of the nerves that have recovered
from their refractory state (Klop, Hartlooper, Briaire & Frijns, 2004). This will result in an
unwanted neural response to this probe, which influences the final response calculated.
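The two-step subtraction described for forward masking can be sketched as follows. This is only an illustration of the arithmetic (probe alone minus masker plus probe, plus masker alone); the function name and the sample values are hypothetical, and the traces are idealised so that the masker fully suppresses the response to the probe.

```python
def forward_masking_ecap(probe_alone, masker_plus_probe, masker_alone):
    """Return probe_alone - masker_plus_probe + masker_alone, sample by sample."""
    return [a - b + c for a, b, c in zip(probe_alone, masker_plus_probe, masker_alone)]


# Hypothetical 5-sample components in arbitrary units (not measured data).
probe_artefact = [50.0, 20.0, 5.0, 1.0, 0.0]
masker_artefact = [30.0, 10.0, 2.0, 0.0, 0.0]
neural_response = [0.0, -8.0, -3.0, 4.0, 1.0]

# Condition 1: probe alone = probe artefact + neural response.
probe_alone = [p + n for p, n in zip(probe_artefact, neural_response)]
# Condition 2: masker plus probe = masker artefact + probe artefact
# (assumes every fiber is refractory, so the probe evokes no response).
masker_plus_probe = [m + p for m, p in zip(masker_artefact, probe_artefact)]
# Condition 3: masker alone = masker artefact.
masker_alone = list(masker_artefact)

ecap = forward_masking_ecap(probe_alone, masker_plus_probe, masker_alone)
```

Under these idealised assumptions both artefacts cancel and `ecap` equals the assumed neural response. With a masker-probe interval that is too long, the second condition would also contain a partial response to the probe and the result would be distorted.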
Artefact Template
A second technique for reducing the effects of stimulus artefact is template subtraction
(Miller, Abbas, Rubinstein, Robinson, Matsuoka & Woodworth, 1998). It relies on the principle
that the tissue impedance and the amplifier are linear, so that a current that is twice as high
produces an artefact that is twice as large. A template of the artefact is recorded at a low,
sub-threshold current level, where the trace contains only artefact and no neural response.
The artefact template is then scaled up to match a higher, supra-threshold stimulation level.
The measured trace minus the scaled-up artefact template should result in a pure neural response.
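A minimal sketch of template subtraction under the linearity assumption is given below. The scale factor is taken here as the ratio of the two stimulus currents expressed in linear units; this is an assumption made for illustration and is not necessarily how the automatic scaling in the software works. All names and values are hypothetical.

```python
def subtract_scaled_template(trace, template, probe_current, template_current):
    """Scale a sub-threshold artefact template and subtract it from the trace."""
    scale = probe_current / template_current  # assumes a perfectly linear system
    return [t - scale * a for t, a in zip(trace, template)]


# Hypothetical values in arbitrary linear units (not measured data).
template = [10.0, 4.0, 1.0, 0.0]    # artefact only, recorded below threshold
template_current = 100.0
probe_current = 200.0               # twice the template current
neural_response = [0.0, -6.0, 3.0, 1.0]

# Supra-threshold trace: an artefact twice as large plus the neural response.
trace = [2.0 * a + n for a, n in zip(template, neural_response)]

residual = subtract_scaled_template(trace, template, probe_current, template_current)
```

If the artefact does not grow exactly in proportion to the current, the scaled template over- or underestimates the true artefact and the residual no longer matches the neural response.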
Alternating Polarity
Another common technique by which the stimulus artefact contamination can be
minimized is by alternating the polarity of the stimulus in successive presentations and then
averaging the responses that are recorded. The artefacts recorded for the anodic-leading and
cathodic-leading biphasic current stimuli are out of phase, so when averaging is done the
out-of-phase stimulus artefacts are averaged out. The neural response evoked by the stimulus
should not reverse in polarity as the stimulus polarity is changed and will therefore be
preserved in the average (Brown, 2004). Brown, Abbas and Gantz (1990) reported successful
recording of NRT using alternating polarity.
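The averaging step in alternating polarity can be sketched as below. The example assumes, as an idealisation, that the artefact inverts exactly with stimulus polarity while the neural response is identical for both polarities; the names and values are hypothetical.

```python
def average_alternating(anodic_leading, cathodic_leading):
    """Average the traces recorded with the two stimulus polarities."""
    return [(a + c) / 2.0 for a, c in zip(anodic_leading, cathodic_leading)]


# Hypothetical components in arbitrary units (not measured data).
artefact = [40.0, 15.0, 4.0, 1.0]
neural_response = [0.0, -8.0, 4.0, 1.0]

# The artefact follows the stimulus polarity; the neural response does not.
cathodic_leading = [art + n for art, n in zip(artefact, neural_response)]
anodic_leading = [-art + n for art, n in zip(artefact, neural_response)]

averaged = average_alternating(anodic_leading, cathodic_leading)
```

Here the out-of-phase artefacts cancel and `averaged` equals the assumed neural response. If the response differs between polarities, the average is a blend of the two responses rather than either one.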
Visual T-NRT estimation
In this approach the NRT recordings are inspected visually, and the lowest current level of
the stimulus that elicits a measurable response is noted as the T-NRT. This can be done using
either an ascending or a descending approach. Ideally, initial responses should be obtained at a
high enough supra-threshold level that the user can be sure that the neural response decreases
in amplitude as the current level is lowered. One drawback to the visual detection method is
that for systems with a relatively high noise floor the true threshold can be obscured by the
noise, yielding a threshold estimate that is likely to be too high (Hughes, 2006).
Peak picker based T-NRT estimation
The second option to determine T-NRT is the peak picker. The peak picker identifies the
N1 and P2 of the NRT waveforms based on a set of rules that are dependent on a set of
parameters such as signal to noise ratio, current level, correlation of the recording with the
previous current level and correlation of the recording with a known response. This set of rules
constitutes the peak picker algorithm. The T-NRT can be defined as the lowest current level at
which the peak picker identifies an NRT response.
Regression analysis extrapolated T-NRT
The third method of threshold estimation in NRT involves applying a regression analysis
to points on an input-output (or amplitude growth) function. Threshold is determined as the level
at which the regression line crosses zero amplitude (i.e., intercept of the x-axis where y = 0). The
advantage to this method is that lower thresholds can be extrapolated for high-noise systems
(Hughes, 2006).
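The extrapolation can be sketched as an ordinary least-squares line fitted to the amplitude growth function, with the threshold taken at the x-intercept. The function name and the data points are hypothetical, and the sketch assumes the selected points lie on the linear part of the growth function.

```python
def extrapolated_threshold(current_levels, amplitudes):
    """Fit amplitude = slope * level + intercept and return the level where amplitude is 0."""
    n = len(current_levels)
    mean_x = sum(current_levels) / n
    mean_y = sum(amplitudes) / n
    sxx = sum((x - mean_x) ** 2 for x in current_levels)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(current_levels, amplitudes))
    slope = sxy / sxx
    if slope <= 0:
        raise ValueError("amplitude must grow with current level to extrapolate a threshold")
    intercept = mean_y - slope * mean_x
    return -intercept / slope  # x-axis intercept, where the fitted amplitude is zero


# Hypothetical amplitude growth points: (current level, N1-P2 amplitude in µV).
levels = [180, 185, 190, 195]
amplitudes_uv = [10.0, 20.0, 30.0, 40.0]

t_nrt = extrapolated_threshold(levels, amplitudes_uv)  # 175.0 for these made-up points
```

Because the line is fitted only to marked responses, any amplitude marked on a tracing that contains no real response pulls the intercept, and hence the extrapolated threshold, downwards.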
Since the three artefact cancellation techniques described above differ in their working
principles, the NRT or ECAP recorded with them might be expected to vary in terms of latency,
threshold, amplitude and morphology. There is therefore a need to study systematically, and in
detail, the NRT recorded using the three artefact cancellation techniques and to compare them.
There is a dearth of literature comparing various techniques to reduce artefact while recording
NRT/ECAP.
Further, the threshold and the amplitude at threshold of the NRT/ECAP recorded using
the different techniques to reduce stimulus artefact also need to be compared. With respect
to the threshold estimation techniques, the efficacy of the peak picker and of regression analysis
in estimating the T-NRT, for NRT recorded using the three different artefact cancellation
techniques, needs to be studied. These methods of estimating T-NRT need to be compared with
the T-NRT obtained by the visual method.
The relationship between T-NRT and behavioral thresholds has been used to program
the speech processors of the cochlear implant (Brown, Hughes, Luk, Abbas, Wolaver & Gervais,
2000; Hughes, Brown, Abbas, Wolaver & Gervais, 2000; Cooper et al., 2003). If T-NRT varies
with the artefact cancellation technique, the same relationship cannot be used. So there is a
need to study the variation, if any, in T-NRT for NRT recorded with different artefact
cancellation techniques.
The objectives of the present study were:-
1. To record NRT using three different artefact cancellation techniques, viz.,
forward masking, artefact template and alternating polarity, at basal, medial and
apical electrode sites in the cochlea.
2. To compare the NRT recorded with the three different artefact cancellation
techniques.
3. To compare the T-NRT estimated using the visual detection, peak picker and
regression analysis techniques.
4. To compare the amplitude of the visually estimated T-NRT for NRT recorded
with the three different artefact cancellation techniques.
Method
The following method was used to study and compare the artefact cancellation and threshold
estimation techniques used in NRT. The method is explained under the following headings.
Participants
A total number of eight children (4 males and 4 females) with pre-lingual hearing loss of
severe to profound degree participated in the study. The children had no contra-indication for
cochlear implant surgery. All the participants were implanted with the Nucleus Freedom Contour
Advanced cochlear implant systems from Cochlear Corporation, Australia. The mean age of the
participants was 6.2 years (age range was from 2.6 to 13.3 years). Out of the eight participants, 7
were implanted in the right ear and one received the implant in the left ear. All the participants
had a post switch-on experience of electrical hearing with the cochlear implant system for at
least 3 months.
Instrumentation
Custom Sound EP (version 1.3) from Cochlear Corporation was the software that was
used to record the NRT from the participants implanted with Nucleus Freedom cochlear implant
systems. A laptop computer was used to run the Custom Sound EP (version 1.3) program. The
programming POD, a hardware interface, established the link between the speech processor of
the Nucleus Freedom cochlear implant system and the Custom Sound EP software installed in a
computer. The POD was connected to the speech processor of the Freedom Cochlear implant and
the other end of the programming POD was connected by a USB cable to the USB 2.0 port of the
laptop.
Procedure
All the measurements were done post-operatively with electrodes 4, 12 and 20 as the probe
active electrodes, which represented positions in the basal, medial and apical parts of the cochlea.
During the recording process the participants were comfortably seated and were allowed to
watch an animation film which held their attention. After the connection between the computer
and the speech processor was established the advanced NRT option was selected in the Custom
Sound EP (version 1.3). First an electrode impedance check was carried out in all participants to
rule out any open circuit or abnormally high electrode impedances in the selected electrode pairs.
NRT was then recorded, for each of the three electrodes, using three different artefact
cancellation techniques. The artefact cancellation techniques were forward masking, alternating
polarity and artefact template.
A test for optimized recording parameters (ORP) was carried out with each of the artefact
cancellation techniques to establish the optimum gain and delay measures for recording NRT. An
internal amplifier gain of 50 dB and a recording delay of 122 µs were found to be optimal at all
the three electrodes, for all the participants and with each of the artefact cancellation techniques.
During NRT recordings the sequence of the use of the three different artefact cancellation
techniques was varied to rule out any sort of order effect.
In three of the eight participants NRT could be recorded only with the forward masking and
alternating polarity methods of artefact cancellation, as the children were awake during the
testing and did not co-operate for longer testing sessions. In the remaining five participants NRT
was recorded with all three artefact cancellation techniques. This resulted in a total
data pool of 63 T-NRT values from 24 different electrode sites.
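The size of the data pool follows from the counts given above, as the short check below shows. It assumes one T-NRT value per participant, electrode and artefact cancellation technique; the variable names are only illustrative.

```python
electrodes_per_participant = 3        # electrodes 4, 12 and 20

participants_all_techniques = 5       # recorded with all three techniques
participants_two_techniques = 3       # forward masking and alternating polarity only

values_all = participants_all_techniques * 3 * electrodes_per_participant   # 5 x 3 x 3
values_two = participants_two_techniques * 2 * electrodes_per_participant   # 3 x 2 x 3

total_t_nrt_values = values_all + values_two
electrode_sites = (participants_all_techniques
                   + participants_two_techniques) * electrodes_per_participant
```

This gives 45 + 18 = 63 values from 8 x 3 = 24 electrode sites, matching the figures reported.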
The NRT waveforms were recorded using the protocol described in Table 1 for the three
artefact cancellation techniques. Once the NRT recordings were made, the T-NRT values given by
the peak picker and regression analysis of the software were recorded. In the present study the
AutoNRT peak picker option was used. Peak-to-peak amplitude of the visually determined N1 and
P2 was recorded for the visually estimated T-NRTs. The visually estimated T-NRT used for the
purpose of the study was the one agreed upon by a panel of three experienced audiologists, so
as to avoid any individual bias. T-NRT was recorded:
for each of the three recording electrodes, i.e., electrode numbers 4, 12 and 20;
with each of the threshold estimation techniques, for NRT recorded with the different
artefact cancellation techniques;
for each of the participants.
Table 1: Stimulating and recording parameters for NRT
Stimulation and recording         Artefact cancellation techniques
parameters                        Forward Masking             Alternating Polarity        Artefact Template
Probe indifferent electrode       MP 1                        MP 1                        MP 1
Probe pulse width                 25 µs/phase                 25 µs/phase                 25 µs/phase
Probe rate                        80 Hz                       80 Hz                       80 Hz
Probe inter phase gap             7 µs                        7 µs                        7 µs
Masker active electrode           Probe active electrode      NA                          NA
Masker indifferent electrode      MP 1                        NA                          NA
Masker current level              Probe current level + 10    NA                          NA
Number of maskers                 1                           NA                          NA
Masker rate                       100 Hz                      NA                          NA
Masker inter phase gap            7 µs                        NA                          NA
Masker probe interval             400 µs                      NA                          NA
Recording active electrode        Probe active electrode + 2  Probe active electrode + 2  Probe active electrode + 2
Recording indifferent electrode   MP 2                        MP 2                        MP 2
Recording Gain and Delay          Based on ORP                Based on ORP                Based on ORP
Number of sweeps                  50                          50                          50
Measurement window                1600 µs                     1600 µs                     1600 µs
Effective sample rate             20 kHz                      20 kHz                      20 kHz
Artefact template current level   NA                          NA                          Probe current level - 15
Scaling factor                    NA                          NA                          Auto
No. of sweeps for template        NA                          NA                          500
Note: NA = Not applicable. MP1, MP2: Monopolar stimulation modes
The artefact cancellation techniques were statistically compared under the following stages:
Stage I: Comparison across different artefact cancellation techniques based on visually
estimated T-NRT at each of the three electrodes.
Stage II: Comparison across different artefact cancellation techniques based on peak picker
estimated T-NRT at each of the three electrodes.
Stage III: Comparison across different artefact cancellation techniques based on regression
analysis estimated T-NRT at each of the three electrodes.
Stage IV: Comparison across different artefact cancellation techniques based on the amplitude
of the visually estimated T-NRT at each of the three electrodes.
The methods of threshold estimation were statistically compared under the following stages:
Stage V: Comparison across different methods of threshold estimation for NRT recorded with
forward masking at each electrode.
Stage VI: Comparison across different methods of threshold estimation for NRT recorded with
artefact template at each electrode.
Stage VII: Comparison across different methods of threshold estimation for NRT recorded with
alternating polarity at each electrode.
Results & Discussion
For statistical comparison, Friedman's test of significance was carried out across the
artefact cancellation techniques and threshold estimation methods in all seven stages
described earlier. Where a statistically significant difference was present, the Wilcoxon signed
rank test was carried out to find out which of the artefact cancellation techniques or threshold
estimation methods differed significantly. The results for the seven stages of
comparison are discussed below.
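The first step of this analysis can be sketched with a small pure-Python version of Friedman's test for three related conditions. This is an illustration only: the data are invented, no tie correction is applied, and the p-value uses the chi-square approximation with 2 degrees of freedom, which is rough for samples as small as those in this study. Statistical packages provide complete implementations (for example `scipy.stats.friedmanchisquare` and `scipy.stats.wilcoxon`).

```python
import math


def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks


def friedman_three_conditions(rows):
    """rows: one [cond1, cond2, cond3] list per participant. Returns (Q, p)."""
    n, k = len(rows), 3
    rank_sums = [0.0] * k
    for row in rows:
        for col, rank in enumerate(average_ranks(row)):
            rank_sums[col] += rank
    q = 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
    p = math.exp(-q / 2.0)  # chi-square survival function for 2 degrees of freedom
    return q, p


# Hypothetical T-NRT values (current levels), one row per participant:
# [forward masking, alternating polarity, artefact template]. Not study data.
hypothetical_tnrt = [
    [170, 180, 176],
    [165, 178, 172],
    [172, 181, 177],
    [160, 175, 170],
    [168, 179, 181],
]

q, p = friedman_three_conditions(hypothetical_tnrt)
significant = p < 0.05
```

When `significant` is true, the pairwise follow-up would compare two techniques at a time with the Wilcoxon signed rank test, as described in the text.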
Stage I
The comparison across the artefact cancellation techniques based on the visually
estimated T-NRT at each of the electrodes revealed no significant difference at the 0.05 level
of significance at any of the electrodes. Table 2 shows mean and standard deviation
values for visually estimated T-NRT values with forward masking, alternating polarity and
artefact template on the 4th, 12th and 20th electrodes.
Table 2: Mean and standard deviation (SD) values of visually estimated T-NRT for different
electrodes, using different artefact cancellation techniques
Recording Electrode   Artefact Cancellation Techniques   Mean     Standard Deviation
Electrode 4           Forward Masking                    178.88   4.73
                      Alternating Polarity               181.63   4.21
                      Artefact Template                  180.60   7.70
Electrode 12          Forward Masking                    179.75   14.37
                      Alternating Polarity               182.38   14.79
                      Artefact Template                  184.00   13.00
Electrode 20          Forward Masking                    164.00   14.37
                      Alternating Polarity               172.13   14.79
                      Artefact Template                  171.80   13.00
Although no significant difference was seen, comparison of the means revealed that the
mean visually detected T-NRT was lowest for NRT recorded with the forward masking paradigm
(Table 2 and Figure 1). In the forward masking paradigm the
stimulus artefact is recorded separately in the absence of any stimulus response, as the nerve
fibers are put into their refractory period. The stimulus artefact present in the probe alone
condition and in the probe-plus-masker condition is expected to be similar, as the probe level is
the same in both cases. Since the stimulus artefact is measured with precision, it can be expected
to be cancelled out and the true neural response to be recorded.
In the alternating polarity method of artefact cancellation it is not always true that the
neural response is identical in response to anodic-leading and cathodic-leading biphasic
current pulses (Van Den Honert & Stypulkowski, 1987; Miller, Abbas, Rubinstein, Robinson,
Matsuoka & Woodworth, 1998; Miller, Robinson, Rubinstein & Matsuoka, 1999). Klop,
Hartlooper, Briaire and Frijns (2004) reported that the N1 and P2 latencies are shorter for
cathodic-first (0.13 and 0.32 ms, respectively) than for anodic-first stimuli (0.16 and 0.38 ms,
respectively). As the N1-P2 peaks vary for anodic-leading and cathodic-leading biphasic current
pulses, this may affect the averaged response and thereby the threshold.
In the artefact template technique the template of the artefact is always measured at a
lower probe level than the probe level used for measuring the NRT, and is then scaled up
when a recording is made at threshold level. The principal
limitation of this method, as reported by Brown (2004), is that the amplifier and tissue
conductance must be perfectly linear for the artefact at the lower current level to have exactly
the same shape, which is generally not the case. As a result the scaled-up template of the
artefact may either overestimate or underestimate the actual artefact at a given probe level.
In either case it will distort the ECAP to a certain extent and hence might be expected to
overestimate the NRT threshold. The method also requires a system with very low levels of
ambient noise.
Stage II
The comparison across the different artefact cancellation techniques based on the peak
picker estimated T-NRT at each of the electrodes revealed a significant difference
between the forward masking and alternating polarity techniques for the 4th electrode (p<0.05)
and the 20th electrode (p<0.05). Table 3 shows mean and standard deviation (SD) values of T-NRT
estimated by peak picker, using the three different artefact cancellation techniques, for the
4th, 12th and 20th electrodes. The mean T-NRT was lowest with the forward masking technique at
all three electrodes.
It should be noted that there are two peak picker options: the AutoNRT peak picker and
the standard peak picker. In the present study the AutoNRT peak picker was used because the
standard peak picker, which can be user defined, was expected to correlate well with the
visually estimated T-NRT.
Table 3: Mean and standard deviation (SD) values of peak picker estimated T-NRT for different
electrodes using different artefact cancellation techniques
Recording Electrode   Artefact Cancellation Techniques   Mean     Standard Deviation
Electrode 4           Forward Masking                    171.75   6.02
                      Alternating Polarity               181.00   4.04
                      Artefact Template                  176.60   9.32
Electrode 12          Forward Masking                    176.25   15.66
                      Alternating Polarity               181.00   14.52
                      Artefact Template                  182.20   13.66
Electrode 20          Forward Masking                    161.50   18.37
                      Alternating Polarity               172.00   23.60
                      Artefact Template                  168.60   12.66
The significant difference between forward masking and alternating polarity in peak
picker estimated T-NRT arose because, with alternating polarity, the peak picker identified NRT
tracings as responses at higher stimulation levels that were very near the actual T-NRT,
whereas with forward masking it identified many NRT tracings as responses at much lower
stimulation levels, which were not NRT responses as there was no visible ECAP. The peak picker
estimated T-NRTs with artefact template were generally near those obtained with alternating
polarity, or lay between those obtained with forward masking and alternating polarity. The mean
T-NRT based on peak picker was always lowest for NRT recorded with the forward masking paradigm,
at all the electrodes, as the peak picker detected many tracings as NRT responses at
sub-threshold levels where no visible ECAP was present.
Stage III
In this stage comparison across the artefact cancellation techniques based on the
regression analysis estimated T-NRT was made in each of the electrodes. Table 4 shows mean
and standard deviation (SD) value for T-NRT estimated by regression analysis using three
different artefact cancellation techniques on three different electrodes.
Table 4: Mean and standard deviation (SD) values of T-NRT estimated from regression analysis for
different electrodes using different artefact cancellation techniques
Recording Electrode   Artefact Cancellation Techniques   Mean     SD
Electrode 4           Forward Masking                    170.82   6.29
                      Alternating Polarity               180.98   4.00
                      Artefact Template                  177.06   9.13
Electrode 12          Forward Masking                    174.15   14.30
                      Alternating Polarity               180.07   12.97
                      Artefact Template                  180.65   13.40
Electrode 20          Forward Masking                    160.13   19.41
                      Alternating Polarity               167.58   21.83
                      Artefact Template                  164.57   13.07
The comparison across the artefact cancellation techniques based on the T-NRT
estimated by regression analysis at each of the electrodes revealed a significant
difference (p<0.05) between forward masking and alternating polarity on the 4th electrode.
Though the mean differences were not significant at the other electrodes or between the other
artefact cancellation techniques, the T-NRT estimated with regression analysis was again the
lowest with the forward masking technique at all the electrodes.
The extrapolated T-NRT given by linear regression analysis is based on the
responses identified by the NRT software at different stimulation levels and involves applying a
regression analysis to points on an input-output (or amplitude growth) function. Threshold is
determined as the level at which the regression line crosses zero amplitude (i.e., intercept of the
x-axis where y = 0). The T-NRT based on regression analysis can be affected if the peak picker
marks amplitude measures on NRT tracings where there are actually no responses. A similar
finding was reported by Hughes (2006): when the amplitude measures were unmarked
on the no-response waveforms, the linear regression based T-NRT became virtually the same as
the visual detection threshold. Since the peak picker's identification of correct NRT responses
differed between alternating polarity and forward masking, a significant difference can, at
times, be expected in the regression analysis based T-NRT estimates for NRT recorded with these
two methods.
The regression analysis estimated T-NRT also rests on the assumption that the amplitude
growth function is linear. Typically the amplitude growth function is linear at higher current
levels and tails off near threshold, but also flattens out at very high current levels, giving an
overall sigmoidal function (Botros, Dijk & Killian, 2006). These authors also reported that
non-linearity near threshold poses a difficulty for automated systems that are based on the
extrapolated threshold method.
Stage IV
Comparison of the amplitude of the visually estimated T-NRT waveform recorded using
the three different artefact cancellation techniques revealed a significant difference
between the techniques (p<0.05). Further analysis revealed significant differences between
visually estimated T-NRT amplitude recorded with alternating polarity and forward masking, and
between visually estimated T-NRT amplitude recorded with artefact template and alternating
polarity, for each of the 4th, 12th and 20th electrodes (p<0.05). The amplitudes of the N1 and
P2 peaks in the T-NRT tracings were recorded, and the peak-to-peak amplitude of the N1-P2
complex was taken as the amplitude of the T-NRT. Table 5 shows mean and standard deviation
(SD) values of the amplitude (µV) of the visually estimated T-NRT recorded with the three
different artefact cancellation techniques on electrode numbers 4, 12 and 20. The lower mean
amplitude of visually estimated T-NRT recorded with alternating polarity, as compared with the
other two artefact cancellation techniques, can be observed in Figure 4. Figure 4 also shows
that the NRT amplitude did not vary much across the electrodes with the alternating polarity
technique. The mean amplitude of the T-NRT was always least with alternating polarity compared
to artefact template and forward masking, at all the electrodes. The amplitude was highest for
forward masking at the 4th and 20th electrodes and for artefact template at the 12th electrode.
Table 5: Mean and standard deviation (SD) of the amplitude (µV) of the visually estimated T-NRT
for different artefact cancellation techniques on different electrodes
Recording Electrode   Artefact Cancellation Techniques   Mean     SD
Electrode 4           Forward Masking                    17.67    6.88
                      Alternating Polarity               7.24     2.40
                      Artefact Template                  12.39    1.23
Electrode 12          Forward Masking                    14.51    1.05
                      Alternating Polarity               7.33     2.38
                      Artefact Template                  16.70    16.70
Electrode 20          Forward Masking                    16.57    1.66
                      Alternating Polarity               7.30     1.94
                      Artefact Template                  15.03    4.86
The lower amplitude for NRT recorded with alternating polarity can be attributed to the
fact that the neural response is not always identical for anodic-leading and
cathodic-leading biphasic current pulses, as reported by Van Den Honert and
Stypulkowski (1987), Miller, Abbas, Rubinstein, Robinson, Matsuoka and Woodworth (1998) and
Miller, Robinson, Rubinstein and Matsuoka (1999). Klop, Hartlooper, Briaire and Frijns (2004)
reported that the N1 and P2 latencies are shorter for cathodic-leading (0.13 and 0.32 ms
respectively) than for anodic-leading stimuli (0.16 and 0.38 ms respectively). Since N1 and
P2 occur at different latencies with anodic-leading and cathodic-leading biphasic current
pulses, the peaks lie at different sampling points for the half of the sweeps recorded with
anodic-leading stimuli and the half recorded with cathodic-leading stimuli.
This leads to a smaller N1-P2 peak-to-peak amplitude after averaging, when compared
to the averaged N1-P2 peak-to-peak amplitude of any other method, where the N1 and P2
peaks fall at similar latencies and hence at similar sampling points for each stimulation. The
findings of this study are consistent with those of Hughes, Abbas, Brown, Behrens and Dunn
(2003), who reported that the amplitude of the NRT recorded with the alternating polarity method
tends to be significantly smaller than that obtained with the subtraction method (r = 0.97, p <
0.0001) yielding higher thresholds with alternating polarity (r = 0.86, p = 0.01).
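The effect of polarity-dependent latencies on the averaged amplitude can be illustrated with a toy model. The sketch below builds two waveforms with the N1 and P2 latencies reported by Klop et al. (2004), samples them at 20 kHz over a 1600 µs window as in Table 1, and compares peak-to-peak amplitudes before and after averaging. The Gaussian peak shape, peak width and peak amplitudes are assumptions chosen only for illustration, so the size of the reduction it produces is not meaningful.

```python
import math

SAMPLE_PERIOD_MS = 0.05   # 20 kHz effective sample rate
WINDOW_MS = 1.6           # 1600 µs measurement window
PEAK_WIDTH_MS = 0.05      # hypothetical peak width (standard deviation)


def ecap_waveform(n1_latency_ms, p2_latency_ms, n1_amp=-1.0, p2_amp=0.5):
    """Toy ECAP: a negative Gaussian N1 followed by a positive Gaussian P2."""
    n_samples = int(round(WINDOW_MS / SAMPLE_PERIOD_MS))
    wave = []
    for i in range(n_samples):
        t = i * SAMPLE_PERIOD_MS
        n1 = n1_amp * math.exp(-((t - n1_latency_ms) ** 2) / (2 * PEAK_WIDTH_MS ** 2))
        p2 = p2_amp * math.exp(-((t - p2_latency_ms) ** 2) / (2 * PEAK_WIDTH_MS ** 2))
        wave.append(n1 + p2)
    return wave


def peak_to_peak(wave):
    """N1-P2 peak-to-peak amplitude of a sampled waveform."""
    return max(wave) - min(wave)


# Latencies from Klop et al. (2004); amplitudes are arbitrary and equal for both.
cathodic = ecap_waveform(0.13, 0.32)
anodic = ecap_waveform(0.16, 0.38)
averaged = [(c + a) / 2.0 for c, a in zip(cathodic, anodic)]

mean_single_p2p = (peak_to_peak(cathodic) + peak_to_peak(anodic)) / 2.0
averaged_p2p = peak_to_peak(averaged)
```

Because the peaks of the two waveforms fall on different samples, `averaged_p2p` is smaller than `mean_single_p2p`, whereas averaging two waveforms with identical latencies would leave the peak-to-peak amplitude unchanged.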
[Figures 1 to 4: bar diagrams. In each figure the x-axis is the electrode number (4, 12, 20) and
the bars are grouped by artefact cancellation technique (AP, AT, FM).]
Fig. 1: Bar diagram of the mean T-NRT estimated visually with the different artefact
cancellation techniques (y-axis: mean visual T-NRT in current levels)
Fig. 2: Bar diagram of the mean T-NRT estimated with peak picker for different artefact
cancellation techniques (y-axis: mean peak picker T-NRT in current levels)
Fig. 3: Bar diagram of mean T-NRT estimated by regression analysis for different artefact
cancellation techniques (y-axis: mean regression T-NRT in current levels)
Fig. 4: Bar diagram of the mean amplitude (µV) of visually estimated T-NRT with different
artefact cancellation techniques
Note: ACT = artefact cancellation techniques; AP = alternating polarity; AT = artefact template;
FM = forward masking
Stage V
The visual, peak picker and regression analysis estimated T-NRT for NRT recorded with
forward masking were compared at each of the 4th, 12th and 20th electrodes. Statistical
comparison revealed that when forward masking was used as the artefact cancellation technique,
there were significant differences between the different methods of estimating T-NRT (p<0.05).
Further analysis revealed that, with forward masking, the visually estimated T-NRTs differed
significantly from the T-NRTs estimated using the regression analysis and the peak picker
methods (p<0.05). No statistical difference was observed at the 0.05 level of significance
between the peak picker estimated and regression analysis estimated T-NRTs. Similar findings
were observed at all the electrodes.
There was no significant difference between peak picker estimated and regression
analysis estimated T-NRT when forward masking was used, because the
regression based extrapolated threshold considers the responses picked up by the peak picker and
involves applying a regression analysis to points on an input-output (or amplitude growth)
function. The same trend was observed in every case, and no significant difference between peak
picker and regression analysis estimated T-NRT was seen at any electrode with any artefact
cancellation technique.
A significant difference between visual and peak picker based T-NRT for NRT/ECAP
recorded with forward masking as the artefact cancellation technique was seen because
the peak picker identified NRT tracings as responses even where no visible ECAP existed.
The present finding, that the peak picker identified NRT tracings with no visible ECAP
as responses, is consistent with the findings of Hughes (2006).
Figure 5 shows that, for NRT waveforms recorded with forward masking as the
artefact cancellation technique, the peak picker identified an NRT response at current levels
where no visible ECAP can be observed. The actual visually detected threshold was at a current
level of 183. Figure 5 also shows improper marking of the P2 latency even when
the NRT tracing is correctly identified as a response. For example, the P2 latency was picked too
late by the peak picker for the NRT tracing at a current level of 186. Also, P2 is expected to
reduce in latency with increase in stimulus level; however, the P2 latency picked for the NRT
tracing at a current level of 183 is less than that picked for the tracing at a current level of 190.
As discussed earlier, the regression analysis based estimation of T-NRT involves applying
a regression analysis to points on an input-output (or amplitude growth) function built from the
responses picked by the peak picker, so an incorrect marking of responses by the peak picker will
also affect the regression based T-NRT. This is why a significant difference was seen between
visual and regression based T-NRT with forward masking as the artefact cancellation technique
for recording NRT. Similar results of incorrect NRT response identification by the peak picker
affecting the regression T-NRT were reported by Hughes (2006).
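The dependence of the regression based T-NRT on the peak picker can be illustrated with a
minimal sketch. It assumes that the regression is an ordinary least-squares line fitted to
(current level, amplitude) points and that the T-NRT is taken as the current level at which the
fitted line reaches zero amplitude. The function name and all data values below are hypothetical
and are not taken from the study or from any clinical software.

```python
def regression_tnrt(current_levels, amplitudes_uv):
    """Fit a least-squares line to an amplitude growth function and
    return the current level at which the line crosses zero amplitude."""
    n = len(current_levels)
    if n < 2 or n != len(amplitudes_uv):
        raise ValueError("need at least two (current level, amplitude) pairs")
    mean_x = sum(current_levels) / n
    mean_y = sum(amplitudes_uv) / n
    sxx = sum((x - mean_x) ** 2 for x in current_levels)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(current_levels, amplitudes_uv))
    if sxx == 0:
        raise ValueError("current levels must not all be identical")
    slope = sxy / sxx
    if slope <= 0:
        raise ValueError("amplitude does not grow with current level")
    intercept = mean_y - slope * mean_x
    return -intercept / slope  # zero-amplitude crossing


# Hypothetical amplitude growth function (amplitudes in microvolts).
levels = [190, 195, 200, 205]
amplitudes = [20.0, 40.0, 60.0, 80.0]
clean_tnrt = regression_tnrt(levels, amplitudes)

# Same data plus one tracing with no true response that was
# (hypothetically) marked as a response at a lower current level.
contaminated_tnrt = regression_tnrt([180] + levels, [15.0] + amplitudes)

print(clean_tnrt, contaminated_tnrt)
```

In this made-up example the single spurious point pulls the extrapolated threshold to a lower
current level, which is the kind of effect the paragraph above describes.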
Stage VI
Similar to NRT recorded with forward masking, NRT recorded with artefact template also
showed significant differences between the T-NRT estimated by visual detection and by the peak
picker, and between the T-NRT estimated by visual detection and by regression analysis. The
findings can be discussed on similar lines as above. For NRT recorded with artefact template as
the artefact cancellation technique, the peak picker marked NRT tracings as responses even when
there was no visible ECAP. This can be seen in Figure 6, where the NRT tracings at current
levels 161 and 164 have been picked as NRT responses. The incorrect placing of the cursors for
the N1 and P2 peaks can also be observed.
Stage VII
Comparison of the visual, peak picker and regression estimated T-NRT for NRT recorded
with alternating polarity as the artefact cancellation technique did not show any significant
difference in any of the electrodes. This can be attributed to the ability of the peak picker to
identify NRT responses correctly when NRT was recorded with alternating polarity as the
artefact cancellation technique. This can be seen in Figure 7. Note that the T-NRT was taken
visually as current level 175, and the peak picker picked current level 174 as the T-NRT.
Fig. 5: Peak picker marked NRT waveforms recorded with forward masking at different current levels
Fig. 6: Peak picker marked NRT waveforms recorded with artefact template at different current levels
Fig. 7: Peak picker marked NRT waveforms recorded with alternating polarity at different current levels
Conclusion
The present study shows that all three artefact cancellation techniques might be used
with confidence for recording NRTs for clinical purposes. If T-NRT is estimated visually, the
choice of technique will not result in significant differences in T-NRT. However, the amplitude
of NRT recorded with alternating polarity will be more consistent than that recorded with the
other two artefact cancellation techniques. The peak picker and regression analysis techniques
for determining T-NRT should be used with caution. Discarding incorrectly identified peak
picker responses will improve the efficacy of the regression analysis T-NRT.
References
Abbas, P. J., Brown, C. J., Shallop, J. K., Firszt, J. B., Hughes, M. L., Hong, S. H. & Staller, S. J.
(1999). Summary of results using the Nucleus CI24M implant to record the electrically
evoked compound action potential. Ear and Hearing, 20, 45-59.
Botros, A., van Dijk, B. & Killian, M. (2006). AutoNRT™: An automated system that
measures ECAP thresholds with the Nucleus Freedom™ cochlear implant via machine
intelligence. Artificial Intelligence in Medicine, doi: 10.1016/j.artmed.2006.06.003.
Brown, C. J. (2004). The electrically evoked whole nerve action potential. In H. Cullington (ed).
Cochlear Implant: objective measures, pp. 96-129, London: Whurr.
Brown, C. J. & Abbas, P. J. (1990). Electrically evoked whole-nerve action potentials. II.
Parametric data from the cat. Journal of the Acoustical Society of America, 88, 2205-2210.
Brown, C. J., Abbas, P. J. & Gantz, B. (1990). Electrically evoked whole-nerve action potentials:
Data from human cochlear implant users. Journal of the Acoustical Society of America,
88 (3), 1385-1391.
Brown, C. J., Hughes, M. L., Luk, B., Abbas, P. J., Wolaver, A. & Gervais, J. (2000). The
relationship between EAP and EABR thresholds and levels used to program the Nucleus
24 speech processor: Data from adults. Ear and Hearing, 21, 151-163.
Cooper, H. L., Vermeire, K. P., Patel, J. M., Cullington, H., Ricaud, R., Brunnel, T., Knight, M.,
Plant, K., Dees, C.D. & Murray, B. (2003). Comparison between NRT-based MAPs and
behaviorally measured MAPs at different stimulation rates – a preliminary study.
Hughes, M. L., Abbas, P. J., Brown, C. J., Etler, C., Behrens, A. & Dunn, S. (2003).
“Comparison of two methods used to measure the ECAP in subjects implanted with the
Clarion CII device.” Paper presented at the Third International Symposium and
Workshops on Objective Measures in Cochlear Implants, Ann Arbor, MI, 2003.
Hughes, M. L., Brown, C. J., Abbas, P. J., Wolaver, A. A. & Gervais, J. P. (2000). Comparison
of EAP thresholds with MAP levels in the Nucleus 24 cochlear implant: data from
children. Ear and Hearing, 21, 164-174.
Hughes, M. L. (2006). Fundamentals of clinical ECAP measures in cochlear implants, Part 2:
Measurement techniques and tips. Audiology Online, June 11, 2006. Retrieved June 11,
2006 from the Articles Archive on http://www.audiologyonline.com/.
Klop, W. C., Hartlooper, A., Briare, J. J. & Frijns, J. H. (2004). A new method for dealing with
the stimulus artefact in electrically evoked compound action potential measurements.
Acta Oto-Laryngologica, 124:2, 137-143.
Mens, H. M. (2004). Telemetry: Features and applications. In Cullington (ed) Cochlear Implant:
objective measures, pp. 23-38, London: Whurr.
Miller, C. A., Abbas, P. J., Rubinstein, J. T., Robinson, B. K., Matsuoka, A. J. & Woodworth, G.
(1998). Electrically evoked compound action potentials of guinea pig and cat: Responses
to monopolar, monophasic stimulation. Hearing Research, 119, 142-154.
Miller, C. A., Robinson, B. K., Rubinstein, J. T. & Matsuoka, A. J. (1999). Electrically evoked
single fiber action potential from cat: response to monopolar, monophasic stimulation.
Hearing Research, 130, 192-218.
Sauvage, S. R., Cazals, Y., Erre, J. P. & Aran, J. M. (1983). Acoustically derived auditory nerve
action potential evoked by electrical stimulation: An estimation of the wave form of
single unit contribution. Journal of the Acoustical Society of America, 73, 616-627.
Van den Honert, C. & Stypulkowski, P. (1987). Characterization of the electrically evoked
auditory brainstem response (EABR) in cats and humans. Hearing Research, 21, 109–26.
Speech-Evoked Auditory Late Latency Response (ALLR) in hearing
aid selection
Shruti Kaul & C S Vanaja
Abstract
Cortical auditory evoked potentials (CAEPs) offer the possibility of evaluating the
effectiveness of hearing instruments in infants and older children who have a limited behavioural
repertoire due to developmental delay or other disabilities. The present study was an attempt to
investigate the usefulness of auditory late latency responses (ALLR) evoked using speech stimuli
in hearing aid selection. The study consisted of two groups of participants. Group I included
children with normal hearing and Group II included children with hearing impairment, both in
the age range of 5-7 years. CAEPs/ALLR were recorded for Group I participants for three
naturally produced speech sounds, /i/, /m/ and /∫/, presented through a loudspeaker at 65 dB SPL.
For Group II, the CAEPs/ALLR were recorded for the same stimuli without a hearing aid and
with two pre-selected hearing aids. For Group II participants, functional gain measurements
were also carried out with the two pre-selected hearing aids. The CAEPs (P1-N1-P2-N2) were
analyzed for peak latencies and amplitude of the N1-P2 complex. Analyses of the data revealed
that all three stimuli elicited waveforms that were distinct from each other. A significant effect
of stimuli was demonstrated for children with normal hearing and for children with hearing
impairment wearing a hearing aid. A significant effect of age was not observed, though a trend of
decreasing latency with age was noted for children with normal hearing. Comparison between
aided and unaided conditions revealed that the responses were absent in the unaided condition
but present in the aided condition. There was no statistically significant difference between the
latencies and amplitude of CAEP/ALLR peaks for the two populations when the children with
hearing impairment were wearing the most appropriate hearing aid. These findings highlight that
P1-N1-P2-N2 responses are efficient in reflecting the benefit of a hearing aid, at least at a gross
level.
Introduction
The increased emphasis on early identification and remediation of hearing loss has
resulted in considerable interest in using the cortical auditory evoked potentials (CAEPs) to
assess clinical populations in whom behavioural measures, speech detection and discrimination
are difficult to obtain (e.g., infants, children and the difficult to test population). The obligatory
CAEPs, also called auditory late latency responses (ALLR), can be recorded in response to a
variety of auditory stimuli ranging from simple tonal stimuli to complex speech stimuli such as
consonant-vowel syllables, words and even full sentences (Naatanen & Picton, 1987). These
responses can be reliably recorded from individuals of all age groups, including well babies as
well as babies with low birth weight (Krutzberg et al. 1984).
Professor of Audiology, School of Audiology and Speech Language Pathology, Bharathiya Vidya Peet University,
Katra-Dhanakawadi, Pune, India. e-mail: [email protected]
Attempts have been made to record these potentials in individuals using hearing aids. In
one of the earliest studies evaluating hearing aid benefit using CAEPs/ALLR, Rapin and
Grazianni (1967) found that in a majority of their 5 to 24 month old infants with severe to
profound sensori-neural hearing loss, aided cortical ERP thresholds for click and tonal stimuli
were 20 dB lower (better) than the unaided thresholds. Most of the later research concentrated on
evaluating hearing aid benefit using speech stimuli. In one of the recent studies, Tremblay,
Billings, Friesen and Souza (2006) obtained CAEPs using amplified speech sounds /si/ and /∫i/
from 7 normal hearing young adults. Participants were retested within a period of 8 days in both
aided and unaided conditions. Results revealed that speech evoked CAEPs can be recorded
reliably in both aided and unaided conditions. It was observed that hearing aids that provide a
mild high frequency gain only subtly enhance peak amplitudes relative to unaided conditions. If
the consonant-vowel (CV) boundary is preserved by the hearing aid, it can also be detected
neurally, resulting in different neural response patterns for /si/ and /∫i/. It was concluded that
speech evoked cortical potentials can be recorded reliably in individuals wearing hearing aids.
In a similar study, Tremblay, Kalstein, Billings and Souza (2006) recorded CAEPs in
adult hearing aid users using the acoustic change complex (ACC). Seven adults (50-76 years)
with mild to severe sensori-neural hearing loss participated in the study. When presented with
two identifiable CV syllables (/si/ and /∫i/), the neural detection of CV transitions, as indicated
by the presence of a P1-N1-P2 response, was different for each speech sound. More specifically,
the latency of the evoked neural response coincided in time with the onset of the vowels.
Hinduja, Kusari and Vanaja (2005) investigated changes in ALLR as a function of hearing aid
gain in 6 hearing impaired children having moderately severe to profound hearing loss. The
aided behavioral thresholds and aided ALLR responses were obtained with two hearing aids in
all the participants. One low gain hearing aid yielding behavioral thresholds outside the speech
spectrum (called the poor hearing aid) and one hearing aid yielding thresholds within the speech
spectrum (called the good hearing aid) were considered for the study. Results showed that the
amplitude was larger and the latency was shorter for ALLR recorded when the good hearing aid
was worn than when the poor hearing aid was worn.
Thus, the limited research carried out so far indicates that CAEPs/ALLRs can be reliably
recorded from individuals wearing hearing aids. However, there is a dearth of literature on the
usefulness of CAEPs/ALLR in evaluating hearing aid benefit using natural speech tokens. Also,
it is not known whether CAEPs can be useful in comparing performance among different
hearing aids. If they can, this measure will be a useful tool in the selection, fitting and validation
of hearing aids in the difficult to test population. It might also be instrumental in monitoring the
performance during
and after auditory training. Therefore the present study was designed with the following
aims:
To compare the CAEP waveform obtained for naturally produced speech tokens, /i/, /m/
and /∫/ in children with normal hearing.
To evaluate the usefulness of CAEPs for naturally produced speech tokens, /i/, /m/ and /∫/
in validation of appropriate hearing aid.
Method
The following method was used to investigate the usefulness of ALLR in the evaluation of
hearing aid benefit and also to develop norms for ALLR for natural speech tokens. All the
measurements were carried out in an acoustically treated double-room set-up where the
ambient noise levels were within permissible limits (ANSI, 1991).
Participants
Two groups of participants were included in the study. Group I included 15 children with
normal hearing in the age range of 5-7 years and Group II included 10 children with hearing
impairment in the age range of 5-7 years. Participants of Group I had hearing sensitivity of less
than 15 dBHL at octave frequencies between 250 Hz and 8000 Hz. Their middle ear functioning
was normal and there was no history of any otologic or neurologic problems. Participants of
Group II had pure tone thresholds in the range of 70 to 100 dBHL. They also had normal middle
ear functioning and no history of any otologic or neurologic problems.
Instrumentation
A calibrated 2-channel diagnostic audiometer with sound field facility was used to carry
out the pure tone audiometry and functional gain measurements. A calibrated immittance meter
was used to examine the middle ear functioning. Intelligent Hearing System (version 2.39) with
matched loudspeakers was used to record and analyze the late latency responses.
Materials used for testing
Stimuli for recording ALLR were the natural speech segments /i/, /m/ and /∫/. These were
spoken by a normal adult female Kannada speaker into a unidirectional microphone connected to
the computer. The recording was done using Praat software with a sampling rate of 16000 Hz.
The duration of the stimuli was kept constant at 250 ms across all the speech sounds. The wave
file was then converted to a stimulus file for ALLR recording using ‘Stim conv’ provided by the
Intelligent Hearing System (version 2.39).
Test procedure for Group I
Pure tone thresholds were obtained in the sound field for octave frequencies between 250
Hz and 8000 Hz for air conduction. Tympanometry and acoustic reflexes were carried out to rule
out any middle ear pathology. ALLR recording was done for the participants who met the
selection criteria.
ALLR recording: Participants were comfortably seated in order to ensure a relaxed posture and
minimum rejection rate. The loudspeaker delivering the stimuli was placed at a distance of one
meter and at 0º azimuth to the test ear. Speech evoked ALLR recording was done when the child
was awake. ‘Mehandi’ was drawn on the children’s hands to ensure that the child was awake and
quiet. A conventional electrode montage was used, with the non-inverting electrode on Fz, the
inverting electrode on the mastoid of the test ear and the common electrode on the mastoid of
the non-test ear. It was ensured that the electrode impedance values were less than 5 kΩ and the
inter-electrode difference was less than 3 kΩ. ALLR were recorded using the test protocol given
in Table 1.
Table 1: Protocol used for ALLR recording
Parameter                 Setting
Stimuli                   /i/, /m/, /∫/
Stimulus level            65 dBSPL
Transducer                Loudspeakers at 0º azimuth
Rate                      1.1/sec
Polarity                  Alternating
Filters                   1-30 Hz
Notch filters             On
Number of channels        Single channel
Recording time window     500 ms
Amplification             50X
Sweeps                    200
Number of repeats         2
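The electrode impedance criteria stated before Table 1 (each impedance below 5 kΩ and an
inter-electrode difference below 3 kΩ) can be written as a simple pre-recording check. This is
only an illustrative sketch: the function name and the example impedance values are
hypothetical and do not come from the recording system used in the study.

```python
MAX_IMPEDANCE_KOHM = 5.0           # each electrode must be below this
MAX_INTER_ELECTRODE_DIFF_KOHM = 3.0  # largest minus smallest must be below this


def impedances_acceptable(impedances_kohm):
    """Return True if every electrode impedance and the largest
    inter-electrode difference meet the stated criteria."""
    values = list(impedances_kohm.values())
    if not values:
        raise ValueError("no electrode impedances supplied")
    all_low = all(v < MAX_IMPEDANCE_KOHM for v in values)
    balanced = (max(values) - min(values)) < MAX_INTER_ELECTRODE_DIFF_KOHM
    return all_low and balanced


# Hypothetical readings for the montage described above (values in kΩ).
readings = {"Fz": 2.1, "test ear mastoid": 3.0, "non-test ear mastoid": 4.2}
print(impedances_acceptable(readings))
```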
Test procedure for Group II
As in Group I, pure tone thresholds, tympanometry and acoustic reflexes were obtained
for the participants of Group II. Two digital hearing aids were pre-selected and programmed
based on the audiological findings. Functional gain measurements as well as unaided and aided
ALLR recordings were carried out with the pre-selected hearing aids. These two test procedures
were used to rate the hearing aids regarding their suitability for the children with hearing
impairment.
Functional gain measurements: Unaided and aided pure tone thresholds were obtained for FM
tones at octave frequencies between 250 Hz and 8000 Hz. The stimuli were presented through
the speakers placed at a distance of one meter and at 45º azimuth. Conditioned responses were
obtained across the frequencies.
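Functional gain is conventionally the difference between the unaided and the aided sound field
threshold at each frequency. A minimal sketch of that arithmetic is given below; the function
name and the threshold values are hypothetical and are not data from the study.

```python
def functional_gain(unaided_db, aided_db):
    """Return unaided minus aided threshold (dB) for every frequency
    that was tested in both conditions."""
    return {freq: unaided_db[freq] - aided_db[freq]
            for freq in unaided_db if freq in aided_db}


# Hypothetical sound field thresholds in dBHL, keyed by frequency in Hz.
unaided = {250: 75, 500: 80, 1000: 85, 2000: 90, 4000: 95}
aided = {250: 45, 500: 45, 1000: 40, 2000: 50, 4000: 60}

for freq, gain in functional_gain(unaided, aided).items():
    print(f"{freq} Hz: {gain} dB")
```

A larger value means that the hearing aid lowered the threshold more at that frequency.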
ALLR recording: ALLRs were recorded separately for the three stimuli, /i/, /m/ and /∫/, without
the hearing aid as well as with the pre-selected hearing aids. The procedure used for recording
ALLR was the same as that used for Group I.
Analysis
The waveforms were analyzed and P1-N1-P2-N2 peaks were identified by two
audiologists who were unaware of the test conditions. The audiologists ranked the unaided and
aided ALLR obtained for Group II as I, II, III. Latency and amplitude of the identified peaks
were noted and suitable statistical analysis was carried out to investigate the aims of the study.
Results and discussion
Robust P1-N1-P2-N2 responses were obtained in all the participants of Group I for all the
three naturally produced speech tokens /i/, /m/, /∫/ and the responses were robust in Group II
when the children were wearing the most appropriate hearing aid. The responses were replicable
in all the participants. The data obtained were analyzed to investigate the aims of the study using
the SPSS software.
P1-N1-P2-N2 responses in children with normal hearing: P1-N1-P2-N2 responses could be
recorded from all the 15 participants for all the three natural speech tokens /i/, /m/ and /∫/.
The data obtained were tabulated and a mixed design ANOVA was carried out to assess the
effect of age and stimuli on the latency and amplitude of P1-N1-P2-N2 responses.
Effect of stimuli
Figure 1: Responses for /i/, /m/ and /∫/ stimuli in children with normal hearing
The mean and the standard deviation of the latencies obtained for the three stimuli across
the three age groups are tabulated in Table 2. It can be observed from the table that the latencies
for stimuli /i/ were shortest while /∫/ had the longest latencies. Figure 1 shows the representative
sample of waveform for the three different stimuli.
Repeated measures ANOVA revealed that there was a main effect of stimuli on the
latency of all the waves. Bonferroni’s multiple comparison test was done to see the pairwise
differences between the stimuli. The test results indicate that the P1 and N2 latencies for all the
three stimuli are significantly different from one another (at the 0.01 level of significance). The
latencies of the N1 and P2 peaks, however, showed a significant difference between /i/ and /m/
as well as between /i/ and /∫/ at the 0.05 level of significance, but no significant difference
between /m/ and /∫/.
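For readers unfamiliar with the Bonferroni procedure, the idea is that with three pairwise
comparisons each raw p-value is multiplied by three (capped at 1) before being compared with
the chosen significance level. The sketch below shows only that arithmetic. The p-values are
invented for illustration and are not results of the study, and the function name is hypothetical.

```python
def bonferroni_adjust(p_values):
    """Multiply each raw p-value by the number of comparisons, capping at 1."""
    m = len(p_values)
    return {pair: min(1.0, p * m) for pair, p in p_values.items()}


# Invented raw p-values for the three pairwise stimulus comparisons.
raw = {"/i/ vs /m/": 0.010, "/i/ vs /sh/": 0.002, "/m/ vs /sh/": 0.200}

for pair, p_adj in bonferroni_adjust(raw).items():
    verdict = "significant" if p_adj < 0.05 else "not significant"
    print(f"{pair}: adjusted p = {p_adj:.3f} ({verdict} at 0.05)")
```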
Table 2: P1-N1-P2-N2 latencies in ms for the three stimuli across the age groups in children with
normal hearing
Peak   Age          /i/               /m/               /∫/
                    Mean     SD       Mean     SD       Mean     SD
P1     5-6 years    101.34   16.84    113.40   20.74    131.42   17.94
       6-7 years     97.35   10.24    102.60    6.53    130.42    9.16
       7-8 years     86.22    3.56     93.96    5.27    118.92   19.71
N1     5-6 years    172.94   18.55    199.88   28.21    227.52   30.50
       6-7 years    169.80   14.33    196.55   18.00    216.57   28.53
       7-8 years    152.08   17.14    176.70   25.64    215.02   47.40
P2     5-6 years    226.74   22.79    266.74   24.96    291.76   30.25
       6-7 years    211.63   12.74    255.50   17.52    289.20   28.06
       7-8 years    201.38   31.16    238.86   29.04    283.14   42.57
N2     5-6 years    302.20   32.20    330.24   23.92    394.14   43.02
       6-7 years    278.30   19.39    339.90   13.16    390.03   16.83
       7-8 years    273.56   34.25    312.74   34.99    339.38   45.15
The N1 and P2 peaks were the most prominent. Hence only the amplitude of the N1-P2
complex was measured. Table 3 shows the mean amplitude and standard deviation of the N1-P2
complex for the three stimuli. It can be observed from the table that the responses for the high
frequency stimulus /∫/ had smaller amplitude when compared to responses for the stimuli /i/ and
/m/, which had dominant spectral energy at low frequencies. Repeated measures ANOVA
revealed a main effect of stimuli on N1-P2 amplitude (F (2, 35) = 8.82, P < 0.05).
Bonferroni’s multiple comparison test revealed that there was a significant difference between /∫/
and /m/ at the 0.05 level of significance, but no significant difference was seen between /i/ and
/m/ or between /i/ and /∫/.
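The N1-P2 amplitude referred to above is a peak-to-peak measure: the voltage at the P2 peak
minus the voltage at the N1 trough. In the study the peaks were identified visually by two
audiologists. The sketch below instead assumes, purely for illustration, that N1 is the most
negative point and P2 the most positive point inside hypothetical latency windows. The function
name, the windows and the waveform samples are all made up.

```python
def n1_p2_amplitude(times_ms, voltages_uv, n1_window, p2_window):
    """Return (peak-to-peak amplitude, N1 latency, P2 latency).

    N1 is taken as the minimum voltage inside n1_window and P2 as the
    maximum voltage inside p2_window (both windows inclusive, in ms).
    """
    def samples_in(window):
        lo, hi = window
        found = [(v, t) for t, v in zip(times_ms, voltages_uv) if lo <= t <= hi]
        if not found:
            raise ValueError(f"no samples inside window {window}")
        return found

    n1_v, n1_t = min(samples_in(n1_window))  # most negative point
    p2_v, p2_t = max(samples_in(p2_window))  # most positive point
    return p2_v - n1_v, n1_t, p2_t


# Made-up, coarsely sampled waveform (time in ms, voltage in microvolts).
times = [100, 150, 200, 250, 300]
volts = [0.5, -1.5, -0.25, 2.0, 0.25]

amplitude, n1_latency, p2_latency = n1_p2_amplitude(
    times, volts, n1_window=(120, 220), p2_window=(220, 320))
print(amplitude, n1_latency, p2_latency)
```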
The results of the present study are in accordance with the findings of Agnug et al. (2006),
wherein /i/ had the shortest latency, followed by /ɔ/, /m/, /a/, /u/ and /s/, and /∫/ had the longest
latency. In the same study latency differences across the different vowels were studied, wherein
the high front vowel /i/ had earlier latencies than the low mid-back vowel /u/.
Table 3: N1-P2 amplitude in µV for the three stimuli across age groups in children with
normal hearing
Amplitude   Age          /i/             /m/             /∫/
                         Mean    SD      Mean    SD      Mean    SD
N1-P2       5-6 years    3.05    0.83    3.72    0.76    2.30    1.24
            6-7 years    3.35    0.69    3.45    1.98    1.62    1.01
            7-8 years    2.25    0.89    2.31    1.23    1.99    0.81
The responses for the mid vowel /a/ occurred between these two latencies. Obleser, Eultiz
and Lahiri (2004) found that the back vowel /o/ resulted in later latency when compared to the
latency of the front vowel /ø/, and they also observed that front vowels activate a more inferior
and anterior source compared to back vowels. Similar results were reported by Makela, Alku,
Makinen and Tiitinen (2005), who concluded that the cortical representation of vowels reflects
the phonological features of speech. However, phonological features alone do not account for
latency differences, as suggested by Agnug et al. (2006), who found that in Australian English
/ɔ/ is further retracted than /u/, yet the low-back vowel /ɔ/ did not have the longest latency but
occurred at a similar latency to the mid vowel /a/. The investigators attributed this to the large
F2-F1 differences in the vowels. This might also be the reason for the differences in latencies for
the /i/, /m/ and /∫/ stimuli in the present study. The F2-F1 difference is approximately 700 Hz to
800 Hz for the /m/ speech sound, which might result in a response that occurs at a different time
compared to the vowel /i/, which has a larger F2-F1 difference. Tremblay et al. (2003) also
observed that when the stimuli had an early onset of the vocalic portion, as in /∫i/, earlier
P1-N1-P2-N2 latencies were obtained when compared to the /si/ stimuli, where the vocalic
portion had a comparatively later onset.
In the present study the natural speech token dominated by high frequency spectral
energy, /∫/, elicited P1-N1-P2-N2 responses with smaller N1-P2 amplitudes than the speech
sounds that had dominant spectral energy in the low frequencies. These findings are consistent
with the results of Agnug et al. (2006), wherein the low frequency dominant stimuli (/m/, /a/,
/u/, /i/) had higher amplitudes compared to the high frequency dominant stimuli (/s/, /∫/). Similar
findings have been reported for tonal stimuli, with low frequency tones eliciting cortical
responses with larger amplitudes than high frequency tones (Jacobson, Lombardi, Gibbens,
Ahmad & Newman, 1992). Shestakova, Brattico, Soloviev, Klucharev and Hotiliainen (2004)
have reported that the amplitude and source locations of N1m differed between the vowel
categories, and that vowels with similar spectral envelopes had closer cortical representations
than those where spectral differences were greatest. Yeltin, Roland, Chriestensen and Purdy
(2004) have reported that cortical areas that respond to low frequency auditory information are
located more superficially (i.e. closer to the surface of the scalp) than cortical regions for high
frequencies. Hence low frequency stimuli may activate more superficial cortical regions and
produce larger amplitude CAEPs than high frequency speech sounds when surface scalp
electrodes are used. However, the responses to complex speech stimuli do not depend solely on
the effect of frequency. Agnug et al. (2006) found that /ɔ/ (dominated by the lowest frequency
spectral energy) produced N1-P2 amplitudes that were not significantly different from the /s/
and /∫/ response amplitudes. The results of the present study and the earlier reports indicate that
the latency and amplitude of the waves depend on the stimulus used for evoking the responses,
and probably on the spectral content of that stimulus.
Effect of Age
A general trend can be observed from Table 2 wherein there is a decrease in latency from
5 to 8 years of age for all the four peaks P1, N1, P2 and N2. However, mixed design ANOVA
revealed that age did not have a main effect on the latency and amplitude measures (P > 0.05).
No age related trend was observed for the N1-P2 amplitude and results of
mixed design ANOVA revealed that there was no main effect of age on the amplitude of N1-P2.
Also no significant interaction between age and stimuli was observed.
Though not statistically significant, the present study also showed a decrease in the
latency of all the peaks with increase in age. Previous studies also report maturational changes
during early childhood. Ceponiene, Rinne and Naatanen (2002) reported that children’s ERPs
are dominated by P1 and N1 peaks and that the N1 emerges between 3 and 4 years of age. In that
study a long preponderance of the N2 potential was noted. Mc Pherson and Starr (1993) also
reported that P1-N1-P2-N2 decrease in latency from birth up to 5 years of age. The N2
demonstrates a greater negativity in this age group than in the older age groups. It was also
reported that peak amplitudes and latencies for P2 and N2 remain relatively constant between 5
and 16 years of age (Mc Pherson & Starr, 1993). This was attributed to increase in myelination
and improvements in synapse efficacy (Kraus et al. 1993).
The present study demonstrates a more pronounced decrease in latency and amplitude for
P1 than for N1, P2 and N2 responses. This is consistent with the investigation by Sharma,
Martin, Roland, Sweeny, Gilley and Dorman (2005), who reported that P1 latency is a biomarker
for the development of auditory pathways in children with hearing impairment who received
intervention through conventional hearing aids and cochlear implants. The absence of a
statistically significant effect of age might be due to the small sample size in each age group.
The effect of age needs to be further investigated on a larger sample.
P1-N1-P2-N2 responses in children with hearing impairment: P1-N1-P2-N2 responses were
recorded for natural speech tokens /i/, /m/ and /∫/, in both unaided and aided condition, at 65
dBSPL for 10 children with hearing impairment.
P1-N1-P2-N2 responses in unaided condition: P1-N1-P2-N2 responses were absent in the
unaided condition for all the participants. This is probably due to severity of hearing loss. The
participants had hearing loss that ranged from severe to profound (71-90 dBHL and >91 dBHL)
but the stimuli were presented at normal conversational level (65 dBSPL) as the aim of the study
was to assess the benefit of the hearing aid at conversational level.
Figure 2: Responses for /i/ stimuli in unaided (uppermost), rank II (middle) and rank I
(lowermost) hearing aids
Polen (1984) found that moderate to severe sensori-neural hearing loss resulted in
prolongation of N1-N2 and P300 latencies and a reduction in N2 amplitude in comparison with
results from normal hearing participants. But the effect of hearing loss needs to be further
investigated.
Effect of amplification:
Speech evoked P1-N1-P2-N2 responses were reliably recorded in all the children while
they were wearing a hearing aid. The effects of amplification were analyzed statistically by
comparing aided and unaided responses for latency and amplitude measures.
Figure 3: Responses for /m/ stimuli in unaided (uppermost), rank II (middle) and rank I
(lowermost) hearing aids
Based on the functional gain measurements the hearing aids were ranked as I and II. The
benefit from amplification was greater with the hearing aid ranked as I when compared to that
from the hearing aid ranked as II. The mean and the standard deviation across the stimuli for
two hearing aids are tabulated in Table 4.
Figure 4: Responses for /∫/ stimuli in unaided (uppermost), rank II (middle) and rank I
(lowermost) hearing aids
From Table 4, it can be observed that, similar to the findings in children with normal
hearing, the responses for /i/ had the shortest latency and the responses for /∫/ had the longest
latency for both the hearing aids. Figures 2, 3 and 4 show the unaided and aided responses for
the three stimuli, /i/, /m/ and /∫/, respectively. Repeated measures ANOVA revealed that there
was no main effect of stimuli on P1, P2 and N2 response latencies, but there was a main effect of
stimuli on the latency of N1. Bonferroni’s multiple comparison test showed that, for the latency
of N1 with the rank I hearing aid, there was a significant difference between /i/ and /m/ and
between /i/ and /∫/ (P < 0.05), while no significant difference was found between /m/ and /∫/. In
contrast, for the rank II hearing aid no effect of stimuli was observed for P1-N1-P2-N2 responses.
The N1-P2 amplitudes obtained with the rank I and rank II hearing aids were also
compared. The mean and standard deviation for N1-P2 amplitude with the two hearing aids are
tabulated in Table 4. It can be observed from the table that /∫/ has the lowest amplitude
compared to /i/ and /m/, and that /i/ has a larger amplitude than /m/, but there was no effect of
stimuli on the amplitude of the N1-P2 complex for either hearing aid, as revealed by repeated
measures ANOVA (P > 0.05). These findings are similar to the findings in the normal hearing
children. For the rank II hearing aid, however, the amplitude of the N1-P2 complex was reduced
and the overall morphology was poorer when compared to the rank I hearing aid.
Table 4: P1-N1-P2-N2 latencies (ms) and N1-P2 amplitude (µV) for the stimuli for children with
hearing impairment wearing hearing aid
Latency/Amplitude   Stimuli   Rank I hearing aid    Rank II hearing aid
                              Mean      SD          Mean      SD
P1 latency          /i/        92.70    20.42       119.92    42.15
                    /m/       104.73    30.34       119.97    39.64
                    /∫/       127.56    36.23       145.76    37.33
N1 latency          /i/       159.17    28.87       214.87    48.67
                    /m/       189.77    18.98       216.10    42.48
                    /∫/       221.55    50.11       233.53    42.17
P2 latency          /i/       237.82    25.87       276.61    23.92
                    /m/       270.0     35.92       284.91    52.31
                    /∫/       284.17    53.18       333.58    40.27
N2 latency          /i/       308.13    48.11       365.10    33.35
                    /m/       339.41    43.42       374.34    44.84
                    /∫/       366.41    45.25       411.50    17.58
N1-P2 amplitude     /i/         5.04     2.44         4.2      2.96
                    /m/         4.76     2.25         3.5      1.25
                    /∫/         2.97     1.32         2.5      1.98
Comparison between unaided and aided performance
In unaided condition, the P1-N1-P2-N2 responses were absent for all the participants
with hearing loss but in aided condition the responses were present for all the three stimuli for all
the participants with both the hearing aids. This suggests that the children benefited from the use
of the hearing aids.
Comparison between rank I and II hearing aids
The two hearing aids were compared for latency and amplitude measures across the
stimuli using the paired t-test. The results revealed a significant difference between
the N1-P2 amplitudes and a significant difference in N1 and P2 latencies for the /i/
stimulus, but no significant difference for the P1 and N2 latencies. The results hence suggest
that N1 and P2 latencies are critical in demonstrating the usefulness of amplification across the
speech stimuli. This is supported by investigators who have used N1 latency
and N1-P2 amplitude for assessing the usefulness of P1-N1-P2-N2 responses in fine discrimination
tasks (Agung et al., 2006).
From the results thus obtained it can be observed that the amplitude measure is more
sensitive in differentiating between the rank I and rank II hearing aids. In the present study the testing
was carried out at a constant SPL for both the hearing aids. The greater the gain offered by the hearing aid,
the higher the output delivered to the ear; the higher level reaching the cochlea probably
activated a larger number of fibers, which in turn resulted in larger amplitude. It has been reported
that the amplitude of auditory evoked potentials depends on the number of fibers responding
(Hall, 1992). Latency, however, depends on the processing time and not on the number of fibers
responding, and the difference in processing time across the participants probably did not result in a
significant difference between the two hearing aids for the latency measure. It has been reported that
the effect of intensity on latencies of ALLR is not significant at moderate levels (Hall, 1992).
Comparison between children with normal hearing and children with hearing impairment
wearing most appropriate hearing aid
The latencies of P1, N1, P2 and N2 and the amplitude of the N1-P2 peaks of the two groups of
participants were statistically compared using the independent samples t-test. The results reveal that
children with hearing impairment wearing the most appropriate hearing aid (rank I hearing aid) had
responses that were not statistically different from those of the normal hearing children. The hearing aids
compensated for the loss of audibility and there was no neural involvement; hence there was no
significant difference between the two groups of participants. Also, Group II participants had been
hearing aid users for 2 years and were receiving auditory training. These results are supported
by those of Ponton et al. (1996), who reported that when deaf children were fitted with a cochlear
implant, P1 latency showed the same rate of maturation in normal hearing children and children who
were fitted with the implant.
Efficacy of CAEPs/ALLR in ranking hearing aid benefit
Two judges who were unaware of the test conditions were requested to rate the hearing
aids based on ALLR responses, and the agreement between the two judges was analyzed. There
was 80% agreement between the two judges for responses to the /i/ stimulus, but
only 50% agreement for the responses to the /∫/ stimulus. The
response for the /m/ stimulus showed 60% agreement between the judges. It was also observed that, as
the waveform morphology became poorer, agreement between the judges decreased. A probable
reason is that one of the judges was more experienced than the other in judging speech evoked
ALLR.
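The percentage agreement reported above can be illustrated with a minimal Python sketch. The function name and the ratings are hypothetical and are not data from the study; the sketch only shows how a figure such as 80% arises when two judges agree on 8 of 10 cases.

```python
def percent_agreement(judge_a, judge_b):
    """Percentage of cases in which two judges chose the same hearing aid."""
    if len(judge_a) != len(judge_b) or not judge_a:
        raise ValueError("both judges must rate the same non-empty set of cases")
    matches = sum(a == b for a, b in zip(judge_a, judge_b))
    return 100.0 * matches / len(judge_a)


# Hypothetical ratings: the hearing aid ("aid 1" or "aid 2") each judge
# considered better for ten children, based on the ALLR waveforms.
judge_1 = ["aid 1", "aid 1", "aid 2", "aid 1", "aid 2",
           "aid 1", "aid 1", "aid 2", "aid 1", "aid 1"]
judge_2 = ["aid 1", "aid 1", "aid 2", "aid 2", "aid 2",
           "aid 1", "aid 1", "aid 2", "aid 2", "aid 1"]

agreement = percent_agreement(judge_1, judge_2)
```

Simple percentage agreement does not correct for agreement expected by chance, which is worth keeping in mind when only two response categories are available.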
To investigate the efficacy of ALLR in hearing aid validation, the ranking of hearing
aids based on ALLR was compared with the ranking based on functional gain
measurement. A third audiologist judged the ALLR whenever there was a discrepancy between
the two judges, and the ranking was carried out based on the decision of the majority of the
judges. Comparison of the ranking based on ALLR responses and functional gain measurements
showed 80% agreement for the /i/ stimulus, followed by /m/ with
70% agreement; the lowest agreement was found for the /∫/ stimulus. The reason for not finding high
agreement for all the stimuli might be that the functional gain was based on thresholds for
tonal stimuli. Behavioural aided and unaided responses for the three stimuli (/i/, /m/ and /∫/)
would have thrown more light on the perception of the stimuli.
Conclusions
From the above results it can be observed that ALLR responses can be recorded for
speech stimuli in children with normal hearing and in children with hearing impairment wearing a
hearing aid. The three stimuli, /i/, /m/ and /∫/, resulted in distinct
responses, indicating that the stimuli are coded differently in the auditory system. Among the
three stimuli, /i/ resulted in better morphology, shorter latency and higher amplitude than /m/ and /∫/,
indicating that vowels are better coded than consonants. A trend of decreasing
latency with increasing age indicates that maturation is probably occurring at this stage. With the
most appropriate hearing aids, responses were present for the /i/, /m/ and /∫/ stimuli and were
not significantly different from those of children with normal hearing, demonstrating the usefulness
of the hearing aid. The hearing aid that was not suitable according to the functional gain
measurements also showed poor responses in the ALLR recording, indicating that
ALLR responses are useful in differentiating between more and less suitable hearing aids.
References
Agung, K., Purdy, S.C., McMahon, C.M. & Newall, P. (2006). The use of cortical auditory
evoked potentials to evaluate neural encoding of speech sounds in adults. Journal of the
American Academy of Audiology, 17, 559-572.
Ceponiene, R., Rinne, T. & Naatanen, R. (2002). Maturation of cortical processing as indexed by
event related potentials. Clinical Neurophysiology, 113, 870-872.
Cunningham, J., Nicol, T., Zecker, S. & Kraus, N. (2000). Speech-evoked neurophysiologic
responses in children with learning problems: development and behavioral correlates of
perception. Ear and Hearing, 21, 554-568.
Hall, J.W. III (1992). Handbook of auditory evoked responses. Massachusetts: Allyn and Bacon.
Hinduja, R., Kusari, M. & Vanaja, C.S. (2005). Paper presented at 38th Annual Conference of
Indian Speech and Hearing Association, Ahmedabad.
Hyde, M. (1997). The N1 response and its applications. Audiology Neuro-Otology, 2, 281-307.
Jenstad, L. M. & Souza, P. E.W. (2005). Using wide dynamic range compression in severely
hearing impaired listeners: effects on speech recognition and quality. Ear and Hearing,
26, 120-131.
Knight, R. T., Scabini, D., Woods, D. L. & Clayworth, C.C. (1988). The effects of lesions of
superior temporal gyrus and inferior parietal lobe on temporal and vertex components of
the human AEP. Electroencephalography and Clinical Neurophysiology, 70 (6), 499-509.
Abstract retrieved from http://www.Enterzpubmed.com/.
Korczak, P. A., Kurtzberg, D. & Stapells, D. R. (2005). Effects of sensori-neural hearing loss
and personal hearing aids on cortical event-related potential and behavioral measures of
speech sound processing. Ear and Hearing, 26, 165-185.
Kraus, N., McGee, T., Carrell, T. D. & Sharma, A. (1995). Neurophysiologic bases of speech
discrimination. Ear and Hearing, 16, 19-37.
Makela, A., Alku, P., May, P., Makinen, V. & Tiitinen, H. (2005). Left-hemispheric brain
activity reflects formant transition in speech sounds. Neuroreport, 16, 549-553. Abstract
retrieved from http://www.Enterzpubmed.com/.
McPherson, D. L. & Starr, A. (1993). Auditory evoked potentials in the clinic. In A. M. Halliday
(Ed.), Evoked Potentials in Clinical Testing. Edinburgh: Churchill Livingstone, 359-381.
Naatanen, R. & Picton, T. (1987). The N1 wave of the human electric and magnetic response to
sound: a review and an analysis of the component structure. Psychophysiology, 24 (4),
375-425. Abstract retrieved from http://www.Enterzpubmed.com/.
Obleser, J., Elbert, T., Lahiri, A. & Eulitz, C. (2003). Cortical representation of vowels reflects
acoustic dissimilarity determined by formant frequencies. Cognitive Brain Research, 15,
207-213. Abstract retrieved from http://www.Enterzpubmed.com/.
Ponton, C. W. (1996). Possible application of functional imaging of the human auditory system
in the study of acclimatization and late onset deprivation. Ear and Hearing, 17, 78S-86S.
Purdy, S. C., Kelly, A. S. & Thorne, P. R. (2001). Auditory evoked potentials as measures of
plasticity in humans. Audiology Neuro-Otology, 6, 211-215.
Rapin, I. & Graziani, L. J. (1967). Auditory-evoked responses in normal, brain-damaged and deaf
infants. Neurology, 17, 881-894. Abstract retrieved from http://www.Enterzpubmed.com/
Ross, M. & Tomassetti, C. (1980). Hearing aid selection in preverbal hearing impaired children.
In Pollack MC, Ed. Amplification for hearing impaired, 2nd Ed. New York: Grune &
Stratton.
Sharma, A., Martin, K., Roland, P., Bauer, P., Sweeney, M. H., Gilley, P. & Dorman, M. (2005).
P1 latency as a biomarker for central auditory development in children with hearing
impairment. Journal of the American Academy of Audiology, 16, 564-573. Speech
recognition in cochlear implant users. Scandinavian Audiology, 30, 31-40.
Shestakova, A., Brattico, E., Soloview, A., Klucharev, V. & Huotilainen, M. (2004). Orderly
cortical representation of vowel categories presented by multiple exemplars. Cognitive
Brain Research, 21, 342-350.
Tremblay, K. L., Friesen, L., Martin, B. A. & Wright, R. (2003). Test-retest reliability of cortical
evoked potentials using naturally produced speech sounds. Ear & Hearing, 24, 225-232.
Brainstem Responses to Speech in Normal Hearing and Cochlear
Hearing Loss Individuals
Sumesh K & Animesh Barman
Abstract
Studying the neural encoding of speech sounds provides insight into some of the auditory
processes involved in normal communication. Auditory brainstem evoked responses to speech
provide direct information about how the sound structure of a speech syllable is encoded in the
auditory system. Individuals with cochlear hearing loss have consistently shown difficulties in
perceiving place and manner cues of consonants. The current study aimed at determining the
effect of cochlear hearing loss and stimulus presentation level (equal SL and equal SPL) on
brainstem responses to speech. The ABR and FFR were recorded for the synthetic speech stimulus
/da/ in 22 individuals with normal hearing and 22 individuals with cochlear hearing loss
(PTA < 55 dB HL) at 80 dB nHL and 40 dB SL. Results revealed that individuals with cochlear
hearing loss showed reduced amplitude and prolonged wave latency even at equal sensation level.
This effect became more pronounced with increasing severity of hearing loss. The coding of
temporal fine structure was adversely affected as hearing loss increased, which is reflected in
the poor coding of F0 and the first formant (F1).
Introduction
The neural encoding of a sound stimulus begins at the auditory nerve and continues to the
cortex via the auditory brainstem. Brainstem responses to simple stimuli (e.g., clicks, tones) are
well defined and widely used in clinical practice in the evaluation of auditory pathway integrity
(Moller, 1999; Starr & Don, 1988). However, the role of the brainstem in processing a complex
signal that varies continuously over time in many acoustic dimensions, such as a speech syllable,
has recently become a subject of great interest, and it can be studied with conventional techniques of
recording evoked potentials.
Studying the neural encoding of speech sounds provides insight into some of the auditory
processes involved in normal communication. Auditory brainstem evoked responses (ABR)
provide more direct information about how the sound structure of a speech syllable is encoded
by the auditory system. A handful of studies have been done along similar lines to understand the
brainstem processing of speech signals (Russo, Nicol, Musacchia & Kraus, 2004; Kraus & Nicol,
2005). Based on these studies, brainstem responses to a speech syllable can be divided into
transient and sustained portions, namely the onset response and the frequency-following
response (FFR). The response functions as a gauge of both spectrum encoding and periodicity
encoding.
Lecturer in Audiology, All India Institute of Speech and Hearing, Mysore, India.
e-mail: [email protected]
Frequency encoding is manifested in speech-evoked auditory responses both in the
latency (Martin et al., 1997; McGee et al., 1996) and the amplitude of transient responses. The
onset responses are transient, akin to the well-documented clinical measure that uses click or
tonal stimuli as a tool for assessing both peripheral hearing and retrocochlear lesions such as
tumors of the auditory nerve or brainstem (Hall, 1992). The sustained frequency-following
response (FFR) is a phase-locked response that 'follows' the waveform of the stimulating sound
up to a frequency of approximately 1000 Hz (Hoormann et al., 1992). It must be noted that although the
FFR is a sustained response, it might be considered a series of repeated transients. Thus, the FFR
can be treated as a measure of both periodicity and spectral processing.
Russo, Nicol, Musacchia and Kraus (2004) have designed a method to evaluate both the
periodicity and spectral encoding in far-field FFR recordings. The markings used in the system
are shown in Figure 1. The response contains a series of peaks: V, A, C, D, E, F and
O. Waves V and A signal the response to the onset of sound. Wave C is thought to be a response to
the onset of the vowel. Peaks D, E and F represent vibrations of the vocal folds. The
fundamental frequency peaks occur at approximately 15 msec, 24 msec and 33 msec in the stimulus,
corresponding to waves D (22 msec), E (31 msec) and F (40 msec) in the response. Neural
conduction accounts for a delay of approximately 7 ms between stimulus and response. Wave O
is a response to the cessation of sound. The small higher-frequency fluctuations between waves
D, E and F correspond in frequency to that of the first formant (F1) of the stimulus which, along
with F2, primarily shapes the vowels.
Figure 1: Wave V followed by the negative peaks A, C, D, E and F. The onset
response is bracketed while the region containing the FFR is indicated with a horizontal line.
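The timing relation between stimulus and response described above can be written out as a short Python sketch. The variable names are illustrative; the stimulus times and the delay of approximately 7 ms are the values quoted in the text.

```python
NEURAL_DELAY_MS = 7  # approximate conduction delay quoted in the text

# Approximate times (ms) of the fundamental-frequency peaks in the /da/ stimulus.
stimulus_peaks_ms = {"D": 15, "E": 24, "F": 33}

# Expected response latency of each wave = stimulus time + conduction delay.
response_peaks_ms = {
    wave: time_ms + NEURAL_DELAY_MS for wave, time_ms in stimulus_peaks_ms.items()
}
```

With these values the expected latencies come out as 22, 31 and 40 ms for waves D, E and F, matching the figures given in the paragraph.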
The significance of these peaks is now well established through their application in clinical
populations. The FFR has been used to study brainstem coding deficits in several communication
disorders, for example in children with learning problems and adults with cochlear hearing loss. Some
children with language-based learning problems exhibit abnormal neural encoding of the spectral
and temporal information crucial for accurate perception of sounds (King, Warrier, Hayes &
Kraus, 2002; Cunningham, Nicol, Zecker, Bradlow & Kraus, 2001). Some also showed
abnormal susceptibility to the demands placed on the auditory system by rapidly presented
temporal information (Wible, Nicol & Kraus, 2005).
Khaladkar, Kartik and Vanaja (2005) suggested that using speech sounds to elicit the
ABR offers an opportunity to isolate normal speech processing from abnormal speech processing
better. The researchers further suggested that it would be useful for evaluating patients with
possible auditory processing disorders. Plyler and Ananthanarayan (2001) reported that the FFR
can encode the second formant transition in normal-hearing listeners. However, FFR encoding
seems to be severely degraded in most of the listeners with hearing loss.
Russo, Nicol, Musacchia and Kraus (2004) found that the addition of background noise
interfered with normal brainstem encoding of the speech stimulus /da/. Most affected were the
onset responses V and A which were severely degraded and completely obscured in more than
40% of the subjects. Peaks C and F however remained present in noise in most subjects. Their
peak amplitudes were also affected.
Individuals with cochlear hearing loss have consistently shown difficulties in perceiving
place (Revoile, Pickett, Holden-Pitt, Talkin & Brandt, 1987) and manner cues (Danhauer, Hiller
& Edgerton, 1984) of consonants. These difficulties increased with the degree of hearing loss.
Moore, Glasberg and Hopkins (2006) reported that subjects with moderate hearing loss
performed much worse in the difference limen for F0 compared to normally hearing subjects at
the same center frequency, suggesting that most of the hearing-impaired subjects had a poor
ability to use temporal fine structure. The temporal fine structures are important for the coding of
F0 and its harmonics.
It is very important to understand whether individuals with cochlear hearing loss
exhibit any encoding deficits at the level of the brainstem as a result of distortion at the cochlea.
Speech-evoked brainstem responses provide a unique opportunity to explore this possibility in a
non-invasive manner. Since there is a dearth of literature on the brainstem processing of speech
stimuli in individuals with hearing loss, there is a need to explore the brainstem bases for
speech perception deficits in these individuals. There is also a need to understand
whether the temporal processing difficulties in cochlear hearing loss are due to the reduction in
audibility only, or whether the temporal processing deficit exists even when the audibility of
stimulation is controlled. The present study was therefore designed with the
following objectives.
Aim of the study
1. To study the effects of cochlear hearing loss on brainstem response to speech.
2. To study the effects of stimulus presentation level (equal SL and equal SPL) on
brainstem responses to speech.
Method
Participants of the present study were divided into two groups, an experimental and a control
group. The control group included 22 ears of normal hearing individuals aged 16-50 years with
hearing sensitivity within 15 dB HL. The experimental group included 22 ears with cochlear
hearing loss of subjects aged 16-50 years with hearing sensitivity within 55 dB HL. Speech
identification scores of all 22 subjects were proportional to their pure tone average of 500, 1000
and 2000 Hz. There was no abnormality indicated on click evoked ABR, and the absent TEOAE
indicated the presence of cochlear pathology. The experimental group was further divided into group I
(PTA > 15 dB HL & ≤ 41 dB HL) and group II (PTA > 41 dB HL & ≤ 55 dB HL).
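The grouping criteria above can be sketched in Python as follows. The function names and the "excluded" label are assumptions made for illustration, and the sketch covers only the pure tone average (PTA) criterion; the other selection criteria (click evoked ABR, TEOAE, speech identification scores) are not modelled.

```python
def pure_tone_average(thresholds_db_hl):
    """Mean of the air-conduction thresholds (dB HL) at 500, 1000 and 2000 Hz."""
    return sum(thresholds_db_hl[freq] for freq in (500, 1000, 2000)) / 3


def assign_group(pta_db_hl):
    """Map a PTA in dB HL to the group boundaries stated in the text."""
    if pta_db_hl <= 15:
        return "control"
    if pta_db_hl <= 41:
        return "group I"
    if pta_db_hl <= 55:
        return "group II"
    return "excluded"  # hypothetical label: outside the range studied


# Hypothetical ear with thresholds of 30, 40 and 50 dB HL.
example_pta = pure_tone_average({500: 30, 1000: 40, 2000: 50})
example_group = assign_group(example_pta)
```

For the hypothetical ear shown, the PTA is 40 dB HL, which falls in group I.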
Instrumentation
A calibrated diagnostic audiometer (GSI-61) was used for estimating the pure tone
thresholds, and a calibrated middle ear analyzer (GSI Tympstar) was used to rule out middle ear pathology.
The brainstem responses to speech and click stimuli were recorded using the Intelligent Hearing
Systems (IHS Smart EP windows USB version 3.91) evoked potential system. The otoacoustic
emissions were recorded using Intelligent Hearing Systems (IHS Smart TrOAE windows USB
version 2.62) to check outer hair cell functioning.
Procedure
Stimulus used
The stimulus /da/, extensively used by Kraus and her colleagues, was used for recording
the speech-evoked ABR. A Klatt formant synthesizer (Klatt, 1980) was used to synthesize a
40-msec speech-like /da/ syllable at a sampling rate of 10 kHz. The F0 changed from 103 to 120
Hz, F1 from 200 to 720 Hz, F2 from 1700 to 1240 Hz and F3 from 2580 to 2500 Hz. F4 and F5
remained constant at 3600 and 4500 Hz respectively. The time-amplitude waveform of the
stimulus is shown in Figure 2.
Figure 2: Waveform of the stimulus /da/. The fundamental frequency (F0) is seen
in the periodicity of the major peaks. The first formant (F1) is seen as periodically occurring smaller
peaks.
Analysis of the ABR/ FFR recordings
The peak latency and the peak-to-trough amplitude of waves V, A, C, D, E, F and O were
measured. A Fast Fourier Transform (FFT) was performed to obtain the spectral characteristics
of the FFR, namely the frequency and amplitude of spectral peaks. The FFT was
performed on all evoked potential recordings for an epoch of 15-54 ms using a custom-made
program run on the MATLAB platform. The peak amplitude corresponding to the F0 and F1 regions was
also calculated using a custom-made program file on the MATLAB platform.
II. Recording of evoked potentials: The brainstem response to speech was recorded
using the test protocol given in Table 1.
Table 1: Stimulus and acquisition parameters used to record ABR and FFR

Stimulus parameters
  Speech stimulus               /da/ (synthesized)
  Duration of the stimulus      40 msec
  Speech stimulus levels        40 dB SL and 80 dB nHL
  Polarity                      Alternate
  Mode of presentation          Ipsilateral (monaural)
  Repetition rate               9.1

Acquisition parameters
  Transducer                    Insert ear phones ER-3A
  Analysis time                 70 msec (includes 10 ms pre-stimulus period)
  Band pass filter              30 to 3000 Hz
  Electrode placement           Cz – Non-inverting (+ve);
                                Both mastoids – Inverting (-ve);
                                Forehead – Ground
  Sweeps                        1500
  Electrode impedance           < 10 kΩ
  Inter-electrode impedance     < 3 kΩ
Figure 3 shows the brainstem response to speech recorded in a normal hearing
individual at 40 dB SL.
Figure 3: The recording of brainstem response to speech in a normal hearing individual at 40 dBSL
Objective Measures for Frequency Following Responses (using MATLAB platform)
The region following the onset responses was defined as the FFR. The spectral measures
performed to analyze the sustained FFR (an epoch of 15-54 ms) were the amplitude of the
spectral component corresponding to the stimulus fundamental frequency (F0 amplitude) and
first formant (F1 amplitude).
The sustained portion of the responses (FFR) was passed through 100 to 120 Hz and 200 to
720 Hz band pass 4th order Butterworth filters in order to obtain the energy at the fundamental
frequency and first formant respectively. The Fourier analysis was performed on the filtered
signal. A subject's responses were required to be above the noise floor in order to be included in the
analysis. This was checked by comparing the spectral magnitude of the pre-stimulus period to that
of the response. If the quotient of the magnitude of the F0 and F1 frequency components of the FFR
divided by that of the pre-stimulus period was greater than one, the response was deemed to be above
the noise floor. The raw amplitude value of the F0 and F1 frequency components of the response
was then measured. This program was validated with recordings with known spectral
characteristics.
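The noise-floor criterion described above can be sketched in Python using only the standard library. This is not the authors' MATLAB program: the function names, the sampling rate argument and the 1 Hz frequency step are assumptions, and the 4th order Butterworth filtering stage is omitted. Instead, the sketch averages the discrete Fourier transform magnitude directly over the band of interest (100 to 120 Hz for F0, 200 to 720 Hz for F1) in the 15-54 ms response epoch and in the 10 ms pre-stimulus period, and compares the two.

```python
import cmath
import math


def band_magnitude(samples, fs, f_lo, f_hi, step_hz=1.0):
    """Mean DFT magnitude (scaled by window length) between f_lo and f_hi in Hz."""
    n = len(samples)
    magnitudes = []
    f = f_lo
    while f <= f_hi:
        acc = sum(x * cmath.exp(-2j * math.pi * f * k / fs)
                  for k, x in enumerate(samples))
        magnitudes.append(abs(acc) / n)
        f += step_hz
    return sum(magnitudes) / len(magnitudes)


def window(recording, fs, t0_ms, t1_ms, pre_ms):
    """Samples from t0_ms to t1_ms; recording[0] lies pre_ms before stimulus onset."""
    i0 = round((t0_ms + pre_ms) * fs / 1000)
    i1 = round((t1_ms + pre_ms) * fs / 1000)
    return recording[i0:i1]


def above_noise_floor(recording, fs, band, pre_ms=10.0, ffr_ms=(15.0, 54.0)):
    """Return (is_above, response_magnitude, quotient) for one frequency band."""
    noise = band_magnitude(window(recording, fs, -pre_ms, 0.0, pre_ms), fs, *band)
    response = band_magnitude(
        window(recording, fs, ffr_ms[0], ffr_ms[1], pre_ms), fs, *band)
    if noise == 0:
        # Guard against division by zero for a perfectly silent baseline.
        quotient = math.inf if response > 0 else 0.0
    else:
        quotient = response / noise
    return quotient > 1.0, response, quotient
```

A call such as `above_noise_floor(recording, fs, (100, 120))` would return whether the F0 band exceeds the pre-stimulus baseline for a 70 ms sweep sampled at `fs` Hz; note that a 10 ms baseline gives only coarse frequency resolution, so the quotient is a rough criterion.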
Results and Discussion
To understand the effect of cochlear hearing loss and the severity of hearing loss on the
brainstem responses to speech, the clinical group was divided into 2 groups, Group I (N = 11
ears) having PTA >15 dB HL & ≤ 41 dB HL and Group II (N = 11 ears) having PTA > 41 dB HL
& ≤ 55 dB HL. The data obtained from the groups were then compared with the control group (N
= 22 ears).
Tables 2, 3 and 4 show the mean amplitudes and latencies of the discrete peaks (waves
V, A, C, D, E, F and O) and the F0 and F1 amplitudes for the normal hearing group, Group I and Group II
respectively. The tables also include the results of the paired sample t-test across the two
presentation levels.
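For readers unfamiliar with the paired sample t-test used across the two presentation levels, the statistic can be sketched in Python as below. The function name and the latency values are hypothetical and are not data from the study; the sketch returns only the t statistic and degrees of freedom, not a p-value.

```python
import math


def paired_t(x, y):
    """t statistic and degrees of freedom for paired samples x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [b - a for a, b in zip(x, y)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    if var_d == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean_d / math.sqrt(var_d / n), n - 1


# Hypothetical wave V latencies (ms) for five ears at the two presentation levels.
latency_80_dbnhl = [8.0, 8.2, 8.1, 8.4, 8.3]
latency_40_dbsl = [9.1, 9.4, 9.2, 9.7, 9.3]

t_value, dof = paired_t(latency_80_dbnhl, latency_40_dbsl)
```

Because each ear is tested at both levels, the test works on the within-ear differences, which removes between-ear variability from the comparison.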
Table 2: Mean, SD and t-values of the various waves latency and amplitude of brainstem
responses to /da/ at 80 dB nHL and 40 dB SL obtained in the Control group

Parameters            80 dB nHL         40 dB SL          t-values
                      Mean     SD       Mean     SD
Latency
  Wave V               8.15    0.29      9.35    0.27     11.94*
  Wave A               9.09    0.31     10.82    0.42     15.07*
  Wave C              19.87    0.33     21.67    0.91      8.86*
  Wave D              26.66    0.58     29.15    0.84     11.60*
  Wave E              37.25    0.54     39.49    0.86     10.52*
  Wave F              47.35    0.48     49.64    0.69     16.07*
  Wave O              56.95    0.69     59.09    0.47     12.41*
Amplitude
  Wave V               0.27    0.09      0.24    0.07      1.02
  Wave C               0.41    0.13      0.30    0.08      3.94**
  Wave D               0.48    0.12      0.33    0.12      4.43*
  Wave E               0.41    0.11      0.25    0.08      4.81*
  Wave F               0.44    0.13      0.34    0.11      2.58***
FFT
  F0 amplitude        30.40    7.86     24.12    7.34      2.69***
  F1 amplitude        15.29    3.33     12.78    4.17      1.40
*p<0.001, **p<0.01, ***p<0.05
Table 3: Mean, SD and t-values of the various wave latency and amplitude of brainstem
responses to /da/ at 80 dB nHL and 40 dB SL in Group I

Parameters            80 dB nHL         40 dB SL          t-values
                      Mean     SD       Mean     SD
Latency
  Wave V               9.23    0.76      9.81    0.73      4.69**
  Wave A              10.35    0.77     10.80    0.66      3.23***
  Wave C              20.61    0.95     22.03    0.95      4.29**
  Wave D              28.32    1.09     29.70    1.04      3.25**
  Wave E              38.93    1.12     40.09    1.19      2.59***
  Wave F              48.77    0.99     49.84    0.87      2.94***
  Wave O              58.12    0.78     58.79    0.53      2.67***
Amplitude
  Wave V               0.32    0.12      0.24    0.13      2.11
  Wave C               0.28    0.13      0.24    0.10      2.34***
  Wave D               0.36    0.15      0.32    0.30      1.25
  Wave E               0.40    0.19      0.30    0.12      1.90
  Wave F               0.30    0.16      0.27    0.15      1.29
FFT
  F0 amplitude        23.73    7.89     23.99    8.49      0.51
  F1 amplitude        15.10    5.19     13.05    3.42      1.64
*p<0.001, **p<0.01, ***p<0.05
Table 4: Mean, SD and t-values of the various waves latency and amplitude of brainstem
response to /da/ at 80 dB nHL and 40 dB SL in Group II

Parameters            80 dB nHL         40 dB SL          t-values
                      Mean     SD       Mean     SD
Latency
  Wave V              10.30    0.73      9.91    0.48      3.94***
  Wave A              11.57    0.98     10.83    0.76      3.80**
  Wave C              22.33    0.79     23.54    1.13      2.25
  Wave D              29.79    1.03     29.95    1.00      0.53
  Wave E              39.86    1.07     40.97    1.26      0.76
  Wave F              50.03    1.22     50.92    1.66      2.09
  Wave O              58.20    0.56     58.96    0.81      2.81***
Amplitude
  Wave V               0.21    0.05      0.21    0.10      1.74
  Wave C               0.26    0.09      0.24    0.05      0.40
  Wave D               0.28    0.05      0.22    0.076     2.3
  Wave E               0.23    0.08      0.22    0.05      0.69
  Wave F               0.21    0.08      0.21    0.05      0.12
FFT
  F0 amplitude        15.25    4.00     14.94    4.59      0.15
  F1 amplitude         8.90    2.28     10.03    2.53      1.4
*p<0.001, **p<0.01, ***p<0.05
The peaks D, E and F, which are considered the sustained brainstem responses, occurred
periodically at intervals of approximately 10 msec. When this time period is converted
into frequency (frequency = 1/time period), it corresponds to the F0 of the speech stimulus
(100 Hz). Russo, Nicol, Musacchia and Kraus (2004) and Kraus and Nicol (2005) reported that the
peaks D, E and F in the sustained FFR represent the vibration of the vocal folds, i.e., the F0 of the
speaker.
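The conversion from inter-peak interval to frequency can be checked with a short Python sketch. It uses the mean latencies of waves D, E and F for the control group at 80 dB nHL from Table 2; the variable names are illustrative.

```python
# Mean latencies (ms) of waves D, E and F, control group at 80 dB nHL (Table 2).
latencies_ms = {"D": 26.66, "E": 37.25, "F": 47.35}

values = list(latencies_ms.values())
intervals_ms = [later - earlier for earlier, later in zip(values, values[1:])]
mean_interval_ms = sum(intervals_ms) / len(intervals_ms)

# Frequency = 1 / period; the period is in ms, so scale by 1000 to obtain Hz.
implied_f0_hz = 1000 / mean_interval_ms
```

The mean interval works out to about 10.3 ms, which corresponds to roughly 97 Hz, in line with the approximate figures of 10 msec and 100 Hz given in the text.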
The results showed that this periodicity was coded effectively in normal hearing
individuals and in group I (minimal to mild hearing loss). However, in group II the high variability
(standard deviation) of the latencies of these peaks and the absence of identifiable responses in certain
individuals could indicate inaccurate coding of F0 and its harmonics.
The comparison across presentation levels revealed a significant increase in the latency
and decrease in the amplitude of the responses when the presentation level was changed from 80
dB nHL to 40 dB SL in the normal hearing group for all wave parameters except wave V amplitude.
Group I showed a significant difference in most of the wave parameters except the amplitudes of waves V, D,
E and F, and Group II showed no significant difference for most of the parameters
except the latencies of waves V, A and O.
Decrease in the latency with an increase in the stimulus intensity is due to a progressively
faster rising generator potential within the cochlea and similarly faster development of excitatory
post synaptic potential (Moller, 1981). Latency of the compound action potential directly
depends on how quickly the generator potential and the excitatory post synaptic potential reach
the threshold for firing leading to reduced wave latency.
The increase in the amplitude parameters with increasing stimulus intensity may be
due to the increase in the audibility of the stimulus. This supports Hall (1992),
who reported that auditory evoked potential amplitude increases with
intensity. The amplitude of an AER is determined by the number of neurons firing at a particular
stimulus intensity. At higher intensities more neurons begin to fire, and the
amplitude of the compound action potential thus generated is higher. This resulted in the
high amplitude evoked responses.
In the clinical group some parameters did not show a significant difference across the two
presentation levels because the two levels differed little, and the difference was negligible
in group II, who had higher thresholds. It could also be due to high variability in most of the
parameters among the participants.
The F0 and F1 amplitudes given in the tables clearly show that the F0 region has the
greatest amount of response energy compared to its harmonics at both the presentation levels,
which is consistent with the study by Russo, Nicol, Musacchia and Kraus (2004). They
reported that the F0 region in the responses showed greater energy compared to its harmonics. This is
due to the high energy level in the F0 region of the stimulus (Ladefoged, 1996) and also to
the better phase locking at lower frequencies (Gelfand, 1998).
Comparison between the presentation levels using the paired sample t-test in normal
hearing individuals showed a significant reduction in the response amplitude in the F0 region
when the presentation level was changed from 80 dB nHL to 40 dB SL, but the reduction was not
significant in the F1 region.
The reduction in amplitude may be due to the reduction in the amount of acoustic energy
reaching the neurons at 40 dB SL compared to 80 dB nHL. The clinical group revealed no
significant decrease in the F0 and F1 amplitude with the change in presentation level. This
could be due to the small difference between the presentation levels. Also, the lack of difference between
the low and high intensity values may be attributed to the disturbed intensity processing in the
hearing loss group (Florentine, et al., 1993).
Comparison across groups at equal Sensation Levels (40 dB SL)
The Kruskal-Wallis test was carried out to check whether there was any significant difference
between the three groups. The Mann-Whitney U test was then carried out for those parameters which
revealed a significant difference on the Kruskal-Wallis test, to check whether Groups I and II
differed significantly from the control group.
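The pairwise follow-up comparison can be illustrated with a minimal Python sketch of the Mann-Whitney U statistic and its normal-approximation z value. The function names are illustrative, the sketch applies no tie or continuity correction (so statistical packages may report slightly different z values), and the Kruskal-Wallis step is not shown.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_z(a, b):
    """U statistic for sample a and its normal-approximation z value."""
    n1, n2 = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    rank_sum_a = sum(ranks[:n1])
    u_a = rank_sum_a - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return u_a, (u_a - mean_u) / sd_u
```

A negative z, as in the table that follows, indicates that the first sample tends to hold the lower ranks; the sign depends only on which group is entered first.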
Table 5: Z-values between the groups at 40 dB SL

Parameters            Control vs Group I   Control vs Group II   Group I vs Group II
Latency
  Wave V              -1.12                -2.98**               -0.28
  Wave F              -0.86                -2.09***              -1.63
Amplitude
  Wave D              -0.13                -2.61**               -2.39***
  Wave F              -1.62                -3.05**               -0.41
FFT
  F0 amplitude        -0.30                -3.25**               -2.81**
  F1 amplitude        -0.30                -2.03***              -2.25***
*p<0.001, **p<0.01, ***p<0.05
Results of Kruskal-Wallis test for the latency and amplitude parameters of discrete peaks
and the F0, F1 amplitude revealed a significant difference between the three groups (Control
group, Group I and Group II) for the latencies of wave V and F; wave D and F amplitude; and
the amplitude of the F0. Table 5 shows the results of the Mann-Whitney U test for the pair wise
comparison of the control group, group I and group II for the latencies of waves V and F;
amplitudes of waves D and F; and amplitudes of F0 and F1.
Results of the Mann-Whitney U test revealed no significant difference between Group I
and the control group for any of the parameters. However, there was a minimal increase in
latency and reduction in amplitude for Group I. This may be due to the lesser degree of
hearing loss, which has minimal or no effect on temporal processing. This is consistent with
the study by Buss, Hall and Grose (2004). They reported that individuals with mild cochlear
impairment are minimally affected in coding temporal fine structure compared to individuals
with moderate cochlear impairment. Also, a few of the individuals with mild hearing loss in their
study had near-normal performance in temporal fine structure coding.
Group II showed a significant amplitude reduction and latency prolongation when
compared with individuals with normal hearing. The F0 and F1 amplitudes also showed a
significant reduction. This could indicate reduced temporal processing with a higher degree of
hearing loss. This supports the study by Lorenzi, Gilbert, Carn, Garnier and Moore (2006),
who reported that both young and elderly subjects with moderate cochlear hearing loss
performed very poorly with temporal fine structure speech, which is very important for the
coding of F0 and its formants. This loss of ability to use temporal fine structure information
was perhaps related to a loss of neural synchrony (Woolf, Ryan & Bone, 1981).
Comparison of Groups I and II showed a reduction in wave D amplitude and in the F0 and
F1 amplitudes with the increase in severity of hearing loss. This indicated that the degree of
hearing loss has an effect on temporal processing and on coding the temporal fine structure of speech
(Lorenzi, Gilbert, Carn, Garnier & Moore, 2006; Moore & Moore, 2003). The effect of the degree of
hearing loss on temporal fine structure coding can be understood from the study by Buss,
Hall and Grose (2004). Their data revealed that individuals with mild cochlear impairment are
minimally affected in coding temporal fine structure compared to individuals with moderate
cochlear impairment. Also, a few of the individuals with mild hearing loss in their study had
near-normal performance in temporal fine structure coding.
Overall, we can conclude that although the audibility of the stimulus was the same across the
three groups, the clinical groups still had some deficit in coding information at the auditory nerve,
which was reflected in the latency and amplitude measures. The minimal to mild hearing loss
group had minimal loss of information and was almost similar to the normal group. The deficit
was more pronounced in the moderate hearing loss group.
Comparison across the groups at equal hearing levels (80 dB nHL)
Results of the Kruskal-Wallis test revealed a significant difference between the three groups
for all parameters except the wave V amplitude. Table 6 shows the results of the Mann-Whitney
U test for the pair-wise comparison of all parameters between the groups. Results of the
Mann-Whitney U test showed a significant difference between Group I and the control group for
most of the parameters, except the wave E amplitude and the F1 amplitude. As expected, the normals
had shorter latencies and higher peak amplitudes than Group I. This is due to
higher audibility in normal hearing individuals compared to Group I. Also, there was a significant
reduction in the F0 amplitude in Group I. Though the F1 amplitude showed a slight reduction
in Group I, the difference was not significant.
In Group II, the wave latencies increased and the amplitudes reduced significantly
compared to the control group. Also, there was a drastic reduction in the F0 and F1 amplitudes. This
suggests that inadequate audibility would affect temporal processing to a great extent in
the moderate hearing loss group.
Table 6: Z-values between the groups at 80 dB nHL
Parameter                  Control Vs Group I   Control Vs Group II   Group I Vs Group II
Latency     Wave V         -3.46**              -3.93*                -1.96
            Wave A         -3.58*               -3.95*                -2.49***
            Wave C         -2.21***             -4.07*                -3.11**
            Wave D         -3.88*               -3.99*                -2.73**
            Wave E         -2.83**              -4.14*                -1.53
            Wave F         -3.54*               -4.04*                -1.98***
            Wave O         -3.63*               -3.76*                0.00
Amplitude   Wave C         -2.58***             -2.95**               -0.20
            Wave D         -1.98***             -3.82*                -1.11
            Wave E         -0.591               -3.36**               -1.87
            Wave F         -2.16***             -3.70*                -1.24
FFT         F0 amplitude   -2.36***             -4.27*                -2.74**
            F1 amplitude   -0.53                -3.57*                -2.60**
*p<0.001, **p<0.01, ***p<0.05
Comparison between Group I and Group II revealed an increase in latency and a decrease
in amplitude for all parameters, though a significant difference was seen only for the latencies of
waves A, C, D and F. A significant reduction in the F0 and F1 amplitudes was also seen in Group II.
This again shows that as the hearing loss increases the audibility reduces, and this would have
affected temporal processing and the coding of F0 and F1.
Conclusion
To conclude, the comparison across the groups at equal hearing levels was done in order
to see the kind of difficulties that hearing impaired individuals face in day-to-day
situations, because in day-to-day situations both normal and hearing impaired individuals
are exposed to sounds at equal hearing levels and not at equal sensation levels. From the results
above it is clear that as the degree of hearing loss increases, temporal processing degrades, either due
to reduced audibility or due to the altered physiology of the inner ear. Thus, in day-to-day
situations hearing impaired individuals might miss many of the temporal cues which
are essential for speech perception. Individuals with cochlear hearing loss will most often have
degraded coding of F0 and its harmonics, and this is more pronounced for a higher degree of
hearing loss.
From this we can conclude that as the degree of hearing loss increases, the ability to
process the temporal fine structure of speech degrades, thus compromising speech intelligibility
in quiet as well as in adverse environments.
Clinical implications
1. Brainstem responses to speech syllables can shed more light on the role of the
brainstem in the processing of speech sounds.
2. FFT analysis of the brainstem responses is a useful tool in detecting deficits in speech
sound processing. Amplitudes of the F0 and F1 peaks have proven to be useful for this type of
evaluation.
3. It can be used as an objective tool to assess temporal processing in difficult-to-test
populations.
4. It can also be used as a tool for hearing aid selection or to check for benefit from hearing
aid or rehabilitation.
References
Buss, E., Hall, J. W. & Grose, J. H. (2004). Temporal fine structure cues to speech and pure tone
modulation in observers with sensorineural hearing loss. Ear & Hearing, 25, 242 – 250.
Cunningham, J., Nicol, T., Zecker, S. J., Bradlow, A. & Kraus, N. (2001). Neurobiologic
responses to speech in noise in children with learning problems: deficits and strategies for
improvement. Clinical Neurophysiology, 112, 758 – 767.
Danhauer, J. L., Hiller, S. M. & Edgerton, B. J. (1984). Performance on a nonsense syllable test
for normal and hearing-impaired subjects. Journal of Auditory Research, 24(3), 165-173.
Florentine, M., Reed, C. M., Rabinowitz, W. M., Braida, L. D., Durlach, N. I. & Buus, S. (1993).
Intensity discrimination in listeners with sensorineural hearing loss. Journal of the
Acoustical Society of America, 94, 5, 2575-86.
Gelfand, S. M. (1998). Hearing: An Introduction to Psychological and Physiological Acoustics.
Taylor & Francis Ltd.
Hall, J.W. (1992). Handbook of Auditory Evoked Responses. Massachusetts: Allyn and Bacon.
Hoormann, J., Falkenstein, M., Hohnsbein, J. & Blanke, L. (1992). The human frequency-
following response (FFR): normal variability and relation to the click-evoked brainstem
response. Hearing Research, 59, 179–188.
Khaladkar, A. A., Kartik, N. & Vanaja, C.S. (2005). Speech Burst and click evoked ABR. Paper
presented at the annual convention of Indian Speech and hearing association, Indore.
King, C., Warrier, C. M., Hayes, E. & Kraus, N. (2002). Deficits in auditory brainstem pathway
encoding of speech sounds in children with learning problems. Neuroscience Letters,
319, 111–115.
Kraus, N. & Nicol, T. (2005). Brainstem origins for cortical 'what' and 'where' pathways in the
auditory system. Trends in Neurosciences, 28, 176 – 181.
Ladefoged, P. (1996). Elements of acoustic phonetics, Chicago: The University of Chicago press.
Lorenzi, C., Gilbert, G., Carn, H., Garnier, S. & Moore, B. C. J. (2006). Speech perception
problems of the hearing impaired reflect inability to use temporal fine structure.
Proceedings of the National Academy of Sciences, 103, 18866-18869.
Martin, B. A., Sigal, A., Kurtzberg, D., & Stapells, D. R. (1997). The effects of decreased
audibility produced by high-pass noise masking on cortical event-related potentials to
speech sounds /ba/ and /da/. Journal of the Acoustical Society of America, 101, 1585–
1599.
McGee, T., Kraus, N., King, C., Nicol, T. & Carrell, T. D. (1996). Acoustic elements of speech
like stimuli are reflected in surface recorded responses over the guinea pig temporal lobe.
Journal of the Acoustical Society of America, 99, 3606–3614.
Moller, A. R. (1981). Latency in the ascending auditory pathway determined using continuous
sounds: comparison between transient and envelope latency. Brain Res, 207, 184-188.
Moore, B. C. J. & Moore, G. A. (2003). Discrimination of the fundamental frequency of
complex tones with fixed and shifting spectral envelopes by normally hearing and
hearing-impaired subjects. Hearing Research, 182, 153-163.
Moore, B. C., Glasberg, B. R. & Hopkins, K. (2006). Frequency discrimination of complex tones
by hearing-impaired subjects: Evidence for loss of ability. Hearing Research, 222, 16-27.
Plyler, P. N. & Ananthanarayan, A. K. (2001). Human frequency following Responses:
Representation of second formant transitions in normal-hearing and hearing-impaired
listeners. Journal of American Academy of Audiology, 12, 423-533.
Revoile, S., Pickett, J., Holden-Pitt, L., Talkin, D. & Brandt, F. (1987). Burst and transition cues
to voicing perception for spoken initial stops by impaired- and normal-hearing listeners.
Journal of Speech and Hearing Research, 30, 3-12.
Russo, N., Nicol, T., Musacchia, G. & Kraus, N. (2004). Brainstem responses to speech syllables.
Clinical Neurophysiology, 115, 2021–2030.
Starr, A. & Don, M. (1988). Brain potentials evoked by acoustic stimuli. In: Picton TW, editor.
Handbook of electroencephalography and clinical neurophysiology. Amsterdam:
Elsevier.
Wible, B., Nicol, T. & Kraus, N. (2005). Correlation between brainstem and cortical auditory
processes in normal and language- impaired children. Brain, 128, 417–423.
Woolf, N. K., Ryan, A. F. & Bone, R. C. (1981). Neural phase-locking properties in the absence
of outer hair cells. Hearing Research, 4, 335-346.
Music Processed by Hearing Aids
Sushmit Mishra & P Manjula
Abstract
The processing of music by hearing aids is a challenge which the hearing aid industry is
facing today. The present study is an attempt to study the hearing aid processed music while
changing different parameters of the hearing aid. Thus through a controlled study design the
parameters in a hearing aid appropriate for good music perception were evaluated. The
evaluations were done using the spectral measurement and subjective perception of the music
samples processed by hearing aids programmed with different parameters.
The recorded music samples were subjected to perceptual analysis of five parameters on
a five-point rating scale and objective analysis was done using spectral slice in Praat software.
The results of objective and subjective analysis implied the following settings of the parameters
in the digital hearing aid to be more appropriate for the music sample studied. All these
conclusions are made with reference to the music samples, hearing aids and settings that were
used in the study. A fifteen-channeled hearing aid was better for music perception than a
six-channeled hearing aid. Further, the knee-point for ideal music perception should be set as high
as possible without being uncomfortable for the subject, and the feedback management and noise
cancellation systems should be turned off.
Key words: spectral measurement, music perception, compression knee-point, Noise
cancellation, feedback management
Introduction
Music is an important and enjoyable part of life for people of all ages. It has been found
to release tension, raise spirits and promote a feeling of well-being. There are just a few people
who do not enjoy listening to music, but there are many people who regard music as one of their
chief pleasures. Music is an important ingredient of every culture throughout the world, both as a
form of entertainment and as an art form.
Following the perception of speech, the appreciation of music is the next most commonly
expressed requirement of the users of cochlear implants (Stainsby, McDermott, McKay & Clark,
1997). This may well be said of individuals who use hearing aids also. When individuals
who enjoy listening to music acquire hearing impairment one might expect a significant effect on
music perception and the pleasure derived from music. Although there may be some restoration
of hearing through the use of hearing aids, it is questionable whether most hearing aids process
music in such a way as to enable the user to hear and enjoy music to the same degree as prior to
acquiring hearing loss.
Professor in Audiology, All India Institute of Speech and Hearing, Mysore, India
e-mail: [email protected]
The reason that individuals with hearing impairment fail to perceive or appreciate the
sound quality of music is because the hearing loss has differential effects on frequency
selectivity, temporal resolution, loudness perception/intensity discrimination and suprathreshold
performance. These contribute to the difficulty of the individuals with hearing impairment to
perceive or appreciate music (Chasin, 1996).
There is a dearth of research in the field of music and hearing impairment. This is partly
because musicians and scientists often use different terminologies in describing the
characteristics of a tone. For example, a researcher may say 440 Hz tone where a western musician
may denote it by 'A'. Musicians are particularly interested in the tones below 'C', i.e. 262
Hz. But most audiologists would ignore the sound energy below 250 Hz because of the poor
signal-to-noise ratio and because of hearing assessment problems. For musicians, however, this low
frequency information can contribute significantly to the quality judgment (Chasin, 2003).
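The correspondence between note names and frequencies mentioned above can be sketched as follows. This assumes twelve-tone equal temperament with A tuned to 440 Hz and takes 'C' to mean middle C; the function name and the offset table are hypothetical, and other tuning systems would give slightly different values.

```python
# Semitone distance of each natural note from A within the same octave
NOTE_OFFSETS = {"C": -9, "D": -7, "E": -5, "F": -4, "G": -2, "A": 0, "B": 2}

def note_frequency(name, octave=4, a4_hz=440.0):
    """Frequency of a natural note in 12-tone equal temperament, with A4 = a4_hz."""
    semitones_from_a4 = NOTE_OFFSETS[name] + 12 * (octave - 4)
    return a4_hz * 2 ** (semitones_from_a4 / 12)

print(round(note_frequency("A"), 1))  # 440.0
print(round(note_frequency("C"), 1))  # 261.6, i.e. about 262 Hz
```

Under these assumptions every note below middle C falls close to or below the 250 Hz region that is often ignored in audiological work.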
Speech tends to have a well-controlled spectrum with well-established and predictable
perceptual characteristics. In contrast, music spectra are highly variable and the perceptual
requirements can vary based upon the musician, the type of music and the type of instrument being
played. There are five major differences between speech and music stimuli as reported by Chasin
(2003, 2006) and Chasin and Russo (2004). They are (i) the long-term average spectrum of
music vs. speech, (ii) differing overall intensities, (iii) crest factors, (iv) phonetic vs. phonemic
perceptual requirements and (v) differences in loudness summation and loudness intensity.
Unlike the long-term average spectrum of speech, that of music is highly variable, and the goal of a
long-term average music spectrum is poorly conceived. The potential intensity range for speech
is quite restricted, approximately 30 to 35 dB, whereas the dynamic range of music is of the order of
100 dB. A typical crest factor for speech is about 12 dB, while crest factors of 18 to 20 dB are not
uncommon for many musical instruments. The perceptual need for speech is quite clear, but for
music the perceptual need is quite varied and depends upon the instrument being played. The
vocal cords function as half wave resonators whereas the musical instrument can be a half wave
resonator or a quarter wave resonator depending upon the instrument which is played. These
differences make it difficult for a hearing aid user to enjoy music with a hearing aid that is
mainly designed for speech.
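The crest factor referred to above is the ratio of the peak level of a signal to its root-mean-square (RMS) level, expressed in dB. A minimal sketch of the calculation is given below; the signals are synthetic and are not speech or music from the study, and the function name is hypothetical.

```python
import math

def crest_factor_db(samples):
    """Crest factor in dB: 20*log10(peak / RMS) of a sequence of samples."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(peak / rms)

# One second of a 440 Hz sine sampled at 32 kHz (a whole number of cycles)
fs = 32000
sine = [math.sin(2 * math.pi * 440 * n / fs) for n in range(fs)]

# The same sine with one brief synthetic peak four times taller
spiky = list(sine)
spiky[100] = 4.0

print(round(crest_factor_db(sine), 2))   # about 3.01 dB for a pure sine
print(round(crest_factor_db(spiky), 2))  # much higher: one peak dominates
```

A signal with brief, tall peaks has a high crest factor, which is why a system designed around the lower crest factor of speech may distort or over-compress the peaks of music.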
Historically, the primary concern in hearing aid design and fitting has been optimization
for speech input (Chasin & Russo, 2004). Musicians with hearing loss have often complained
about the poor sound quality while they are playing or listening to music through their
hearing aids (Chasin, 2003). Not only is the technology for music input in hearing aids in its
infancy, but the research and clinical knowledge of what musicians and those who like to
listen to music need to hear is also at an early stage (Chasin & Russo, 2004).
Presently, digital technology is replacing the analog technology used in the hearing aid
industry. Digital technology has the advantage of employing various algorithms such as
adaptive noise-reduction systems, adaptive directionality, adaptive feedback suppression and
highly flexible control of numerous amplification characteristics including complex forms of
compression. Digital technology is also enabling the processing of sounds in different channels
having different compression settings. Chasin (2003, 2006) recommended a set of parameters
as ideal for the perception of music through hearing aids. Thus, the parameters in a hearing aid
that are ideal for music perception need to be evaluated through a controlled study design.
The present study is an attempt to evaluate the hearing aid processed music while
changing different parameters of the hearing aid. Thus, the objectives were as follows:
1. To compare the processing of music by using a six-channeled and a fifteen-channeled
hearing aid where all the other parameters of signal processing of the hearing aid are kept
the same.
2. To compare the music processed in a six-channeled hearing aid with
a. the compression knee-point being set higher than that optimized for speech for a
particular hearing loss
b. by enabling and disabling the noise reduction system
c. by enabling and disabling the feedback management system
Method
The efficacy of the hearing aids was evaluated using the subjective perception and
objective spectral analysis of the music samples processed by the hearing aids.
Participants
Three groups of participants took part in this study, namely, Group I (Non-musicians
group), Group II (Musicians-Singers group) and Group III (Musicians-Instrumentalists group).
They had hearing sensitivity within normal limits with no significant history of external or
middle ear infection or malformation of the ear.
Group I (Non-musicians group): 15 participants with no formal education in music were
considered. The age of the participants ranged from 18 to 25 years (mean age of 20.75 years). It
was ensured that all the participants selected for the study enjoyed listening to music.
Group II (Musicians-Singers group): 15 participants with at least 10 years of experience in
professional singing were selected. The age range of the participants was from 24 years to 59
years (the mean age of 39.90 years). The average experience with professional singing in this
group was 24.8 years (range being 10 to 39 years). In order to have homogeneity in the group
care was taken to see that all the participants were practicing the Carnatic style of singing.
Group III (Musicians-Instrumentalists Group): 10 participants were selected in this group.
All the participants were practicing the 'Odissi' style of playing the instrument. Three of the
instrumentalists played sitar, two of them played the flute, three played the sarod and the other
two played the veena. The age range of the participants was between 36 and 52 years (mean age
being 48.10 years). The average experience in professional music for this group was 27.2 years
(range being 24 to 32 years).
Instruments used
Hearing Aids: Two commercially available digital Behind-The-Ear (BTE) hearing aids were
used in the study. The first hearing aid (Hearing aid A) was a six-channeled hearing aid and the
second hearing aid (Hearing aid B) was a fifteen-channeled hearing aid. These two hearing aids
were selected because, apart from the number of channels, the signal processing strategy, the
microphone technology and the noise canceller were technically similar. Both the hearing aids
used wide dynamic range compression (WDRC), dynamic noise canceller (dNC), feedback
management system and an omni directional microphone technology. Both these hearing aids
had an option of switching off the noise cancellation system and the feedback management
system.
Computers and Hi-Pro: A personal computer installed with NOAH (version 3.0) software and
connected to a Hi-Pro was used to program the hearing aids. Two laptops running Windows
XP and the Praat software were also used in this study for playing and recording the music sample.
Frontech speakers (340 Watt PMPO) were connected to the laptop that played the music. The volume
was set at a comfortable loudness level. The music samples were recorded on the other laptop using
the Praat software. These samples were later transferred to an audio compact disc. The sampling
rate used in all the recordings of the study was 32 kHz. The music samples were played to the
listeners from a laptop through headphones (Fontopia MDR-EX51LP Consumer
Headphones) using the Praat software.
Coupler: A 2 cc coupler (HA-2) was used for coupling the hearing aid to the microphone while
recording.
Microphone: An omnidirectional microphone was connected to the laptop computer. This
microphone was in turn attached to the coupler using 'fun tak' to record the output from the
hearing aid on the laptop using the Praat software.
Music Sample: A music sample recorded on an audio compact disc was used. Carnatic music
played instrumentally was selected. The music sample had the lead music
played on the violin, and the other instrument played was the mrudhagam. The music sample
was chosen from the music album titled 'lagudi' and the music was based on the raaga 'Mohana
Kalayani'. A 90-second segment of the sample was selected for the purpose of evaluation.
Room Setting: A sound treated air conditioned room was selected for recording the music
sample from the hearing aids. The ambient noise levels inside the room were within permissible
limits.
Procedure: The study was carried out in four stages:
Stage 1: Recording of the hearing aid processed music samples
Stage 2: Programming of the hearing aid
Stage 3: Subjective analysis of the music samples
Stage 4: Measurement of spectra of the music samples
Stage 1: Recording of the Hearing Aid Processed Music
For the comparison of the effect of the number of channels on the processing of music, two digital
hearing aids were taken: one comprising 6 channels (Hearing aid A) and the other comprising 15
channels (Hearing aid B). The knee-point of both the hearing aids was at 54 dB when
programmed for speech in quiet (default program). The knee-point was raised by 18 dB to
make it 72 dB. The music samples were recorded with the noise cancellation and feedback
management systems off. Apart from the difference in the number of channels, the other signal
processing parameters of the hearing aids were similar.
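The effect of raising the compression knee-point from 54 dB to 72 dB can be illustrated with a simple static input/output function. This is only a sketch of a generic single-channel compressor: the gain of 20 dB and the compression ratio of 2:1 are illustrative assumptions and are not the settings of the hearing aids used in the study, and the function name is hypothetical.

```python
def wdrc_output_db(input_db, knee_db, gain_db=20.0, ratio=2.0):
    """Static output level (dB) of a simple compressor for a given input level.

    gain_db and ratio are illustrative assumptions, not values from the study.
    """
    if input_db <= knee_db:
        # Below the knee-point the gain is linear
        return input_db + gain_db
    # Above the knee-point each extra dB of input adds only 1/ratio dB of output
    return knee_db + gain_db + (input_db - knee_db) / ratio

for level in (50, 60, 70, 80, 90):
    default = wdrc_output_db(level, knee_db=54)  # knee-point of the default program
    raised = wdrc_output_db(level, knee_db=72)   # knee-point raised by 18 dB
    print(level, default, raised)
```

With the higher knee-point, inputs up to 72 dB are amplified linearly, so a larger part of the intensity range of the music escapes compression.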
A sample was recorded from the fifteen channeled hearing aid (Hearing aid B) with knee-
point set at 18 dB higher than the default knee-point setting for the speech as recommended by
the first fit of the programming software. This time both the noise cancellation system and the
feedback management system were turned on.
For recording the other music samples, the 6-channeled digital hearing aid (Hearing aid A)
was selected. The music sample was played to the programmed hearing aid and the hearing aid
processed music was recorded in each of the following conditions. Initially, a sample was recorded
with the knee-point set at 72 dB and with both the noise cancellation and feedback management
systems switched off. Then, a music sample was recorded with the high knee-point (i.e.,
72 dB) with the noise cancellation system on and the feedback management system off. Later,
another music sample was recorded with the high knee-point (i.e., 72 dB) with the feedback
management system on and the noise cancellation system off. Finally, a music sample was
recorded with the high knee-point (i.e., 72 dB) with both the noise cancellation system and the
feedback management system switched on.
On the whole there were eight music samples; two from the fifteen channeled hearing
aid, five from the six channeled hearing aid and the original music sample recorded through the
coupler. They are given in the following Table 1.
Thus, the music sample was recorded both without and with the hearing aid.
These eight recorded music samples were later used for the subjective ratings and the spectral
analysis. The music sample was played using the Praat software from a laptop through the speakers.
The hearing aid was placed at an equal distance of 5 cm from each of the speakers and at 90°
azimuth, as shown in the figure. A foam sheet was placed below the hearing aid so that it did
not pick up any noise due to vibration of the table. Care was taken that the microphone of the
hearing aid was at the level of the centre of the speakers. The digital hearing aid was connected to
an HA-2 (2 cc) coupler, which in turn was connected to the recording microphone. The recording
microphone was connected to the laptop computer for recording the music sample using the
Praat software. All the recordings in the Praat software were made as 16-bit mono recordings.
Thus, the music processed by the hearing aid in each of the seven different programmed settings
of the hearing aids was recorded.
Table 1: The eight music samples recorded with different settings of the hearing aids
Music sample   Condition of recording
Sample 1       Original music sample recorded through 2 cc coupler
Sample 2       Hearing aid B with knee-point high, noise cancellation and feedback management off
Sample 3       Hearing aid A with knee-point high, noise cancellation and feedback management off
Sample 4       Hearing aid B with knee-point high, noise cancellation and feedback management on
Sample 5       Hearing aid A with knee-point high, noise cancellation and feedback management on
Sample 6       Hearing aid A with knee-point at default, noise cancellation and feedback management off
Sample 7       Hearing aid A with knee-point high, noise cancellation on and feedback management off
Sample 8       Hearing aid A with knee-point high, noise cancellation off and feedback management on
In order to make all the music samples equivalent, the original music sample was also
played in the same condition and recorded through the coupler, so that the unprocessed music
sample was comparable to the music samples processed through the hearing aids. The music samples
were not normalized. The samples were then transferred to an audio compact disc.
Stage 2: Programming of the Hearing Aid
The hearing aids were programmed for a hypothetical flat sensorineural hearing loss with
air conduction thresholds of 50 dB HL at all the audiometric frequencies. A flat hearing loss
was used so that the compression characteristic, when tested, remained the same across all the
frequencies. The digital hearing aid was connected through a Hi-Pro to the personal computer
(PC) with software for programming. After the hearing thresholds were fed into the software
(NOAH 3.0) the digital hearing aids were programmed based on the NAL-NL1 prescriptive
procedure in the hearing aid programming software. An acclimatization level of 2 was used
while programming.
Stage 3: Subjective analysis of the music samples
Measures of quality judgment of the music samples were obtained using five five-point
perceptual rating scales that were relevant to music. This is a modification of the work of
Gabrielsson and Sjögren (1979) that has been used extensively in the hearing aid industry
(Chasin & Russo, 2004). The participants were asked to rate the music samples on the perceptual
parameters of loudness, fullness, clearness, naturalness and overall fidelity. Participants were
given the following definitions of the five perceptual parameters (Chasin & Russo, 2004).
Loudness was defined as the music being sufficiently loud in contrast to faint, ranging
from 5 to 1 on the rating scale. Fullness was defined as the music being full in contrast to thin,
ranging from 5 to 1 on the rating scale. Clearness was defined as the music being clear and
distinct in contrast to being blurred or diffused, ranging from 5 to 1 on the rating scale.
Naturalness was defined as the music sounding as if there is no hearing aid and
as “I remember it”, ranging from 5 to 1 on the rating scale. Overall fidelity was defined
as the dynamics and range of the music not being constrained or narrowed, ranging from 5 to 1
on the rating scale.
Specifically, the participants were asked to rate from 1 (poorest) to 5 (best) on the following
perceptual scales: loudness, fullness, clearness, naturalness and overall fidelity. Thus, a perfect
perceptual reproduction score was 25 considering all the five parameters on the scale. The scales
for rating on the five parameters were as follows:
1. For loudness: 1 (faint)………………5 (sufficiently loud)
2. For fullness: 1 (thin)………………5 (full)
3. For clearness: 1 (blurred)………………5 (distinct and clear)
4. For naturalness: 1 (unnatural)………………5 (natural)
5. For overall fidelity: 1 (restricted)………………5 (wide and not constrained)
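The scoring described above can be sketched as follows. The function name, the dictionary layout and the example ratings are hypothetical and are given only to show how the five ratings add up to a maximum of 25; they are not ratings from the study.

```python
PARAMETERS = ("loudness", "fullness", "clearness", "naturalness", "overall fidelity")

def total_rating(ratings):
    """Sum one listener's ratings for one sample; each rating must be a whole number from 1 to 5."""
    for name in PARAMETERS:
        value = ratings[name]
        if value not in (1, 2, 3, 4, 5):
            raise ValueError(f"{name} rating must be a whole number from 1 to 5, got {value!r}")
    return sum(ratings[name] for name in PARAMETERS)

# Hypothetical ratings given by one listener to one music sample
example = {"loudness": 4, "fullness": 3, "clearness": 5, "naturalness": 4, "overall fidelity": 3}
perfect = {name: 5 for name in PARAMETERS}

print(total_rating(example))  # 19
print(total_rating(perfect))  # 25
```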
The music samples were played from a computer using the Praat software through the
head phones (Fontopia MDR-EX51LP Consumer Headphones). The participants were instructed
to listen to the samples at their comfortable loudness level.
All the participants in the three groups were made to listen to the music samples in
similar conditions. A relatively quiet room away from traffic noise and other noises was selected.
Each subject was made to listen to the eight different music samples mentioned above.
Instruction: The participants were given an identical set of instructions in a written format, in
English, so that the instructions for all the participants remained essentially the same. Four
participants in Group III asked for instructions written in Oriya; hence, a translation of the
instructions was prepared and was verified by two graduate students in Oriya for the correctness
of the meaning. If required, the instructions were further clarified by the experimenter before the
participants rated the music samples. It was made certain that the participants were
absolutely clear about the terminology and completely certain about the rating scale before they
rated the music samples.
Stage 4: Measurement of spectra of the music samples
The recorded music samples, as processed by the
hearing aids, were subjected to spectrum analysis using the Praat software. For precise
comparison, equivalently paired music slices were taken from the eight music samples. In each of
the recorded samples, the slices for analysis were taken at the intervals of 14 to 24 seconds, 48
to 58 seconds and 74 to 84 seconds. Three ten-second segments were
selected because the Praat software could analyze only music samples of
about 10 seconds duration or less. The spectral analysis of the three ten-second segments
was obtained using the Hamming window. The energy concentration at octave and mid-octave
frequencies (from 200 to 8000 Hz) was measured and tabulated.
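The kind of measurement described above can be sketched in plain Python as follows. This is not the Praat procedure used in the study: it simply applies a Hamming window to a signal and evaluates the level at individual frequencies. The list of analysis frequencies is an assumed octave and mid-octave grid, the function names are hypothetical, and the test signal is a short synthetic tone rather than one of the recorded samples.

```python
import math

FS = 32000  # sampling rate reported in the study (Hz)
SLICES_S = [(14, 24), (48, 58), (74, 84)]  # analysis intervals reported in the study

def slice_indices(start_s, end_s, fs=FS):
    """Convert an interval in seconds to a (start, stop) pair of sample indices."""
    return int(start_s * fs), int(end_s * fs)

def hamming(n_points):
    """Hamming window coefficients (0.54 - 0.46 cos form)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_points - 1)) for n in range(n_points)]

def level_db(samples, freq_hz, fs=FS):
    """Level (dB, arbitrary reference) of the Hamming-windowed signal at one frequency."""
    window = hamming(len(samples))
    re = im = 0.0
    for n, (x, w) in enumerate(zip(samples, window)):
        angle = 2 * math.pi * freq_hz * n / fs
        re += x * w * math.cos(angle)
        im -= x * w * math.sin(angle)
    magnitude = math.hypot(re, im) / sum(window)
    return 20 * math.log10(magnitude + 1e-12)  # small offset avoids log of zero

# Synthetic check: 0.1 s of a 1000 Hz tone (NOT a sample from the study)
tone = [math.sin(2 * math.pi * 1000 * n / FS) for n in range(FS // 10)]
ANALYSIS_FREQS_HZ = [250, 500, 750, 1000, 1500, 2000, 3000, 4000, 6000, 8000]  # assumed grid
levels = {f: level_db(tone, f) for f in ANALYSIS_FREQS_HZ}

print(slice_indices(*SLICES_S[0]))
print(max(levels, key=levels.get))
```

For a recorded sample, each of the three intervals would first be cut out with `slice_indices` and the levels then tabulated per frequency, so that the same slices can be compared across the eight samples.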
Results
The main objective of the study was to compare the processing of music through different
hearing aid settings, subjectively using a rating scale and objectively using the spectral analysis.
Subjective Analysis
For the subjective analysis the samples were first presented to 40 listeners to be rated on a
rating scale. There were a total of eight music samples. Details of these samples are provided in
Table 1. The original music sample and seven hearing aid processed music samples were rated
on a five-point rating scale. These ratings were tabulated. The statistical analysis was carried out
with the help of the Statistical Package for the Social Sciences (SPSS, Version 10). Non-parametric
tests were used in the statistical analysis. The data, in terms of the five parameters, were
analyzed for the comparison of the three groups of participants using the Kruskal-Wallis test (the
non-parametric equivalent of one-way ANOVA). The Mann-Whitney U test was used to examine the
pair-wise differences between the groups, where the comparisons were made taking two groups at a time.
Later, the original music sample was compared with all the other music samples using the
Wilcoxon Signed Rank test. These analyses were repeated for all the five parameters.
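As an illustration of the pair-wise comparison step, the following is a minimal sketch of how the Mann-Whitney U statistic is computed from two groups of ratings. It is not the SPSS procedure used in the study, it does not compute a p-value, and the ratings shown are hypothetical values chosen only to exercise the code.

```python
def midranks(values):
    # Rank the values from 1..n, giving tied values the average of their ranks.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks

def mann_whitney_u(group_a, group_b):
    # U statistic: the smaller of U1 and U2 for the two groups.
    ranks = midranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:len(group_a)])
    u_a = rank_sum_a - len(group_a) * (len(group_a) + 1) / 2.0
    u_b = len(group_a) * len(group_b) - u_a
    return min(u_a, u_b)

# Hypothetical five-point ratings for two groups of listeners.
u_separated = mann_whitney_u([1, 2, 3], [4, 5, 6])
u_with_ties = mann_whitney_u([1, 2, 2], [2, 3, 3])
```

A small U relative to the product of the group sizes indicates that the ratings of one group tend to lie below those of the other.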
The Kruskal-Wallis and Mann-Whitney U tests were applied to all the parameters in the
rating scale. They revealed that there was a significant difference (p<0.05) between the ratings
given by the non-musicians and singers, and by the non-musicians and instrumentalists. However,
the ratings given by the singers and the instrumentalists were not significantly different (p>0.05).
The Wilcoxon signed rank test revealed that the non-musicians rated the music sample recorded from
the fifteen channel hearing aid with the noise reduction system and feedback management
system turned off (sample 2) to be similar to the original music sample (p>0.05). It was interesting
to note that the instrumentalists and the singers rated sample 2 to be similar to the original
music sample in three parameters of the perceptual rating but rated it to be different in two
parameters, namely, naturalness and overall fidelity.
Spectral Analysis
The result of analysis of the objective measures was similar to subjective measures.
Fig. 1: Hearing aid output at different frequencies for different samples (Samples 1 to 8), in the
48 to 58 seconds interval [x-axis: Frequency (Hz), 250 to 6000; y-axis: Intensity (dB SPL), -60 to 40]
Note: All the values in the graph are referenced to 2 × 10⁻⁵ dyne/cm²
Since a sample of at most 10 seconds could be analysed with the Praat software, the
sampling for each music sample was done at intervals of 12 to 22 seconds, 48 to 58 seconds and
74 to 84 seconds. The energy concentration at each octave and mid-octave frequency was
measured and plotted as graphs, as shown in Figure 1. All three figures showed a similar
pattern and Figure 1 represents the sample in the interval 48 to 58 seconds. The descriptions of
the music samples are given in Table 1. From the figures it is evident that music sample 2
(Hearing aid B with knee-point set high, noise cancellation system and the feedback management
system turned off) gave the best representation of the original music sample. Music sample 3
(Hearing aid A with knee-point high, noise cancellation and feedback management off) and
music sample 4 (Hearing aid B with knee-point high, noise cancellation and feedback
management on) gave the second and third best representation of the original music sample
respectively. From the graph, it was evident that activation of the noise cancellation system or
the feedback management system in the hearing aids led to degradation of the sample in terms of
a reduction of the energy level in the low frequencies and an increase of the energy level in the
mid- and high-frequencies.
From the figure, it was evident that the outputs from the hearing aids were lower than the
original in the lower frequencies. From the mid-frequencies, around 1 kHz to 4 kHz, however, the
hearing aids amplified the music. The activation of the feedback system and noise cancellation
system (Samples 4, 5 & 8) led to a reduction of energy at a frequency of about 2 kHz, which is
evident as a dip in the energy output. It is quite evident from the graph that the output through
hearing aid B gave a better representation of the original music.
Discussion
The perception of music as good in quality depends on various factors, such as the
relationship between the lower frequency fundamentals and the higher frequency harmonics, the
concentration of energy in the lower frequencies and the temporal resolution of the music in the
ear. Hence the objective spectral evaluation of the output alone
will not provide a clear picture of the quality of music processed by a hearing aid. The findings
from the objective study should be complemented by a subjective study. A perceptual rating
inherently has many factors which are dependent on the subject. The experimenter has limited or
no control over the subject-dependent factors such as motivation, understanding and bias of the
participants. Hence the study evaluated the music samples processed by hearing aids both
subjectively and objectively.
Integration of the Subjective and Spectral Analysis
All seven music samples recorded through the hearing aids were given a poorer rating
than the original by all the listeners. In earlier studies hearing aid users have always
preferred a lower cut-off frequency in judgments of the quality of music (Punch, 1978; Franks &
Hall, 1985). Chasin (2003) noted that the information in the low frequencies contributed
significantly to the quality judgment of music. The figures depicting the hearing aid output at
different frequencies for different samples revealed that neither hearing aid A nor hearing
aid B represented the original music well in the lower frequency region. The output of the
hearing aids was always poorer than the original music sample in the lower frequency region and
the hearing aids were able to amplify the music only above a frequency of around 1 kHz. Mishra and
Abraham (2007) have also noted that, overall, music processed through hearing aids showed a
poor representation of the waveform in the low frequencies. It can be noted that the samples in which
the hearing aids provided a good representation of the music in the lower frequencies were rated
higher by the listeners. The figures depicting the original music showed a concentration of
energy in the lower frequency region, with the energy falling precipitously above a frequency of
500 Hz. All the subjective/perceptual ratings indicated a much higher rating for the original
music sample, which had relatively higher energy in the low frequency region.
Further, it was seen in this study that the hearing aid B having 15 channels represented
the original music better on perceptual rating with the knee-point being set higher. The
observation of the figures which represented the output from the hearing aid also depicted that
the output from the Hearing aid B having the knee-point raised to 72 dB (Sample 2) gave better
representation of the original music sample (Sample 1) than Hearing aid A with similar settings.
Hearing aid A with the knee-point raised to 72 dB (Sample 3) followed the output from the
Hearing aid B. In the perceptual rating too, the ratings given to hearing aid B, with the knee-
point high and the noise cancellation and feedback management off, were not significantly
different from those given to the original music.
The rationale for having a single channel or double channel hearing aid is to keep the
compression ratio equal across the entire frequency range, so that the ratio of the energy
in the low frequencies to that in the high frequencies remains essentially the same. According to
Chasin and Russo (2004), if there is an imbalance in the amplification of the low frequencies and
the high frequencies the timbre will be affected.
Chasin (2003) noted that for woodwind instruments and quieter music, or for clients with a
precipitously sloping audiometric configuration, a multi-channel hearing aid may be acceptable,
since for the woodwind instruments the perceptual reliance is on the information in the lower
frequencies. However, the music sample selected in the present study was not produced by any
woodwind instrument and the lead instrument was the violin.
Chasin and Russo (2004) noted that some musical instruments like the piano and violin
are more speech-like. Such instruments are half-wave resonators with evenly spaced harmonics.
Since the lead instrument in the music sample in the present study was the violin, this may be a
reason why the fifteen channel hearing aid performed better than the six channel hearing aid. It has
been noted in various studies that multi-channel hearing aids improve speech perception
(Mispagel & Valente, 2006).
An increase in the number of channels leads to different gain and compression settings in
different frequency bands, which disturbs the balance between the low frequency fundamentals and
the high frequency harmonics. In the analysis of the subjective results of Group I (consisting
of individuals not having any experience in professional music) there was no significant
difference between the original music and the output from hearing aid B with the knee-point
raised by 18 dB. However, participants in Group II and Group III (consisting of professional
singers and instrumentalists) gave music sample 2 ratings similar to those given to the original
music sample, except in two parameters of the perceptual rating, namely, naturalness and
overall fidelity. The differences in naturalness and the reduced overall fidelity may be attributed to
the disruption of the ratio or balance of the low frequency fundamental and the high frequency
harmonic structure. Such a difference was not perceived by Group I, who listened to music
only for pleasure. The variation may also be too subtle to be picked up by an
untrained listener. This draws attention to the fact that special care should be taken while
prescribing hearing aids to trained musicians, as the demands of musicians are far greater
than those of non-musicians who just appreciate music.
Chasin (2006) recommended that the knee-point of the hearing aid should be set
around 65 to 75 dB and the compression ratio should be low (e.g. 1.5:1). In the present study the
knee-point of the hearing aid was set at 72 dB. The default knee-point while programming the hearing
aid was 54 dB, which was the knee-point setting for speech as suggested by the first fit of the
software. The subjective rating showed that the listeners had a preference for the hearing aid
output where the knee-point had been set to a higher level for both the hearing aids (Samples 2
& 3). The graph obtained for the outputs of the hearing aids in different settings also showed that
setting the knee-point higher gave a better representation of the music in the lower frequency
region. When the knee-point was set at the default setting for speech, the output of hearing
aid B was much lower in intensity in the low frequency region (Sample 6). On raising the knee-
point the output obtained from hearing aid A was better compared to the default setting.
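The effect of the knee-point can be illustrated with a simple static input-output rule: below the knee-point the gain is linear, and above it each additional decibel of input yields only 1/ratio decibels of output. The sketch below is an idealised model, not the behaviour of the hearing aids used in the study; the function name and the 20 dB gain are hypothetical, while the 54 dB and 72 dB knee-points and the 1.5:1 ratio are taken from the text above.

```python
def compressed_output_db(input_db, knee_db, ratio, gain_db):
    # Linear gain up to the knee-point, compression above it.
    if input_db <= knee_db:
        return input_db + gain_db
    return knee_db + gain_db + (input_db - knee_db) / ratio

# Hypothetical 20 dB linear gain and a 90 dB input peak.
output_knee_72 = compressed_output_db(90, knee_db=72, ratio=1.5, gain_db=20)
output_knee_54 = compressed_output_db(90, knee_db=54, ratio=1.5, gain_db=20)
```

Under these assumptions the higher knee-point leaves more of the level variation of the music uncompressed, so loud passages retain more of their original level.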
All the participants gave poorer ratings to the music samples in which the noise
cancellation system was activated (Samples 4, 5 & 7). In the graphs depicting the output from the
hearing aid in different settings it was noted that whenever the noise cancellation system was
activated it led to suppression of energy in the low frequency region. For time invariant noise,
the noise cancellation systems in present day hearing aids assume that, since most of the noise
power is concentrated in the low frequencies, the speech is masked in this frequency region and
filtering out both speech and noise over this frequency range will have little or no effect on
intelligibility but will reduce the loudness and annoyance of the noise; i.e., overall sound quality
will be improved (Levitt, 2001).
The long-term spectra of noise and speech reveal the difference between speech and
noise in their concentration of energy. Hence, it can be assumed that the hearing aids used in this
study treated the music sample as a time invariant noise, and hence there was a
cancellation of energy in the low frequency region. As discussed previously, the concentration of
energy in the low frequency region is essential for a better judgment of the quality of music.
The findings of this study support those of Chasin (2003, 2006), Chasin and Russo (2004) and
Mishra, Kunnathur and Rajalakshmi (2005). These previous studies noted that there is a high
probability for the hearing aid to mistake music for noise and hence recommended
deactivation of the noise cancellation system.
The activation of the feedback management system led to low energy levels, which in turn
led to poorer ratings by all the participants. In the figures it was noted that whenever the feedback
management system was activated (Samples 4, 5 & 8), it led to a dip in the frequency region
around 2 kHz. In the notch filtering used for feedback management it is possible to invert a
band pass filter and remove part of the peak in the frequency response. It generally removes a
narrow part of the spectrum centered on the frequency of the filter. This type of filtering is used
to counter acoustic feedback, where the notch is tuned to remove a narrow band of frequencies
around the offending frequency (Agnew, 1993). Most probably the activation of the feedback
system led to the employment of such a filter, which nullified the gain in a frequency range centered
around 2 kHz. Suppression of energy at a particular frequency will have a deleterious effect on
music perception since the gain should be equal and balanced over the frequency region for the
optimal perception of music.
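The behaviour of a notch filter can be sketched with a standard second-order (biquad) digital notch. This is a generic textbook design, not the filter implemented in the hearing aids studied; the 2 kHz centre frequency follows the dip described above, while the sampling rate and the Q value are assumptions chosen for illustration.

```python
import cmath
import math

def notch_coefficients(center_hz, q, rate_hz):
    # Second-order notch: zeros on the unit circle at the centre frequency.
    w0 = 2.0 * math.pi * center_hz / rate_hz
    alpha = math.sin(w0) / (2.0 * q)
    b = [1.0, -2.0 * math.cos(w0), 1.0]
    a = [1.0 + alpha, -2.0 * math.cos(w0), 1.0 - alpha]
    return b, a

def gain_at(b, a, freq_hz, rate_hz):
    # Magnitude of the filter's frequency response at freq_hz.
    z1 = cmath.exp(-1j * 2.0 * math.pi * freq_hz / rate_hz)
    numerator = b[0] + b[1] * z1 + b[2] * z1 * z1
    denominator = a[0] + a[1] * z1 + a[2] * z1 * z1
    return abs(numerator / denominator)

# Assumed values: 16 kHz sampling rate, Q of 5, notch centred at 2 kHz.
b, a = notch_coefficients(center_hz=2000.0, q=5.0, rate_hz=16000.0)
gain_at_notch = gain_at(b, a, 2000.0, 16000.0)
gain_low = gain_at(b, a, 500.0, 16000.0)
gain_high = gain_at(b, a, 6000.0, 16000.0)
```

The gain is essentially zero at the centre frequency and close to unity away from it, which corresponds to a narrow dip of the kind seen around 2 kHz in the figures.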
The simultaneous activation of the noise cancellation system and the feedback
management system led to a greater deleterious effect on the perception of music, both objectively
and subjectively (Samples 4 & 5). A comparison of the outputs of sample 4 and sample 5 on the
spectral analysis showed that hearing aid A had a far more deleterious effect than hearing aid B
when both the noise cancellation system and the feedback management system were activated.
Mishra, Kunnathur and Rajalakshmi (2005) noted that a directional system leads to a
significant loss of low frequency sound, which may remove valuable information for music.
Hence, they recommended an omni-directional hearing aid for better perception of music. Even
though the hearing aids (Hearing aid A and Hearing aid B) in the present study had an
omni-directional microphone, the output in the low frequencies was reduced, possibly because of
the activation of the noise cancellation.
Conclusions
The hearing aid processed music samples were subjected to perceptual analysis of the
five parameters on a five-point rating scale. An objective spectral analysis was also done using
spectral slice in Praat software. The results of objective and subjective analysis implied the
following settings of the parameters in the digital hearing aid to be better for music perception.
All these conclusions are made with reference to the music samples, hearing aids, settings that
were used in the study.
1. A fifteen channel hearing aid was better than a six channel hearing aid for music
perception.
2. The knee-point for ideal music perception should be set as high as possible, as long as it is
not uncomfortable for the subject.
3. The feedback management and the noise cancellation systems should be turned off.
In conclusion it can be said that music perception through the hearing aid can be
optimized to a greater extent with appropriate changes in the parameters of the hearing aids. It
was found from the results of the present study that a hearing aid with 15 channels (compared to
6 channels), with the noise cancellation and feedback management systems disabled, would improve
the perception of music appreciably. The experimenters agree with Chasin (2003) who
noted that “a hearing aid ideal for music perception can be programmed to have good speech
intelligibility but the vice-versa is not true”.
References
Agnew, J. (1993). Applications of notch filter to reduce acoustic feedback. The Hearing Journal,
46 (3):37-40, 42-43.
Chasin, M. (1996). Musicians and the Prevention of Hearing Loss. San Diego: Singular
Publishing Group.
Chasin, M. (2003). Music and hearing aids. The Hearing Journal, 56 (7), 36-41.
Chasin, M. (2006). Hearing aids for musicians. The Hearing Review, 59 (3): 7-11.
Chasin, M. & Russo, F.A. (2004). Hearing aids and music. Trends in Amplification, 8 (4), 35-47.
Franks J.R. & Hall T.C. (1985). Hearing aid wearers and music. The Hearing Journal, 38 (5),
14-16.
Gabrielsson, A. & Sjogren, H. (1979). Perceived sound quality of hearing aids. Scand. Audiology,
8, 159-169.
Levitt, H. (2001). Noise cancellation in hearing aids: An overview. Journal of Rehabilitation
Research and Development. 38 (1), 111-121.
Mispagel, K. M. & Valente, M. (2006). Effect of multichannel digital signal processing on
loudness comfort, sentence recognition and sound quality. Journal of American Academy
of Audiology, 17 (10), 681-707.
Mishra, S.K., Kunnathur, A. & Rajalakshmi, K. (2005). Hearing aids and music: Do they mix?
Indian Speech and Hearing Association Conference, Indore.
Mishra, S. & Abraham, A.K. (2007). Processing of music by hearing aids. Frontiers of Research
in Speech and Music, Mysore.
Punch J.L. (1978). Quality judgments of hearing aid- processed speech and music by normal and
otopathologic listener. J. Am. Aud. Soc. 3, 179-188.
Stainsby, T.H., McDermott H. J., McKay C. M. & Clark G. M. (1997). Preliminary results on
spectral shape perception and discrimination of musical sounds by normal hearing
subjects and cochlear implantees. Proceedings from the international conference on
computer and music.
Efficacy of Frequency Transposition Hearing Aid
In Dead Region Subjects
Swapna Raj S & K. Rajalakshmi
Abstract
A hearing aid is an electroacoustic device which enables a hearing impaired individual
to make maximum use of his residual hearing. Decreased audibility, reduced dynamic range,
decreased frequency resolution and temporal resolution are common problems in individuals
with sensorineural hearing loss. Sensorineural hearing loss is also commonly called cochlear
loss, inner ear loss and nerve loss. The sensory mechanism comprises the outer hair cells
(OHCs) and the inner hair cells (IHCs). Damage to OHCs will lead to a reduction in the
compressive mechanism of the cochlea. Damage to IHCs will lead to a reduction in the sound
transduction process. For this reason such regions are referred to as “dead regions”. Steeply
sloping hearing loss is often associated with cochlear dead regions, and the high frequency part
of speech contributes no information in such individuals. Thus, it is essential to transpose high
frequency information to the region of useful hearing at low frequencies in order to make high
frequency information accessible. Hence, the present study aimed at evaluating 10 subjects (15 ears)
with dead regions. Subjects who were diagnosed as having dead regions using the Modified Threshold
Equalising Noise (TEN) test participated in the study. A frequency transposition hearing aid was
fitted for the individuals who had dead regions. Speech identification performance was evaluated
with high frequency sentence and word lists in the Transposition and No Transposition conditions.
The results revealed that frequency transposition gives a statistically significant benefit
over the non-transposed condition for individuals with steeply sloping hearing loss with dead
regions.
Key words: High frequency sensorineural hearing loss, Frequency transposition, Dead region,
hearing aids, TEN.
Introduction
A hearing aid is an electroacoustic device which enables a hearing impaired individual to
make maximum use of his residual hearing. It takes an acoustical signal such as speech and
converts it into an electric signal before the amplification stage. The primary goal is to amplify and
deliver speech and other sounds at levels equivalent to that of normal speech and conversation.
It is the most effective therapeutic approach for the majority of individuals with hearing loss.
Hearing aids differ in design, size, gain, ease of handling, volume control and availability of
special features.
Decreased audibility, reduced dynamic range, decreased frequency resolution and
temporal resolution are common problems in individuals with sensorineural hearing loss.
Sensorineural hearing loss is also commonly called cochlear loss, inner ear loss and nerve loss.
Reader in Audiology, All India Institute of Speech and Hearing, Mysore, India; e-mail: [email protected]
The sensory mechanism comprises the outer hair cells (OHCs) and the inner hair cells
(IHCs). Damage to OHCs will lead to a reduction in the compressive mechanism of the cochlea.
Damage to IHCs will lead to a reduction in the sound transduction process. For this reason such
regions are referred to as “dead regions” (Moore & Glasberg, 1997; Moore, Huss, Vickers,
Glasberg & Alcantara, 2000). In approximately 90% of hearing impaired adults and 75% of
hearing impaired children the degree of impairment worsens from 500 Hz to 4 kHz (Macrae &
Dillon, 1996). Most hearing impaired individuals have a greater loss of hearing sensitivity at
high frequencies than at low frequencies. High frequency sensorineural hearing loss is the most
common configuration and type of hearing loss, and results from destruction of the inner hair cells
(IHCs) within the cochlea (Engstrom, 1983; Borg, Canlon & Engstrom, 1995).
Individuals with moderate to severe hearing loss at high frequencies often do not benefit
from amplification of high frequencies or even perform more poorly when high frequencies are
amplified (Villchur, 1973; Moore, 1986; Murray & Byrne, 1986; Hogan & Turner, 1998; Moore,
2001; Vickers, Moore & Baer, 2001). It has been suggested that subjects who do not benefit
from amplification of high frequencies have reduced function or complete loss of function of
IHCs and/or neurons.
McDermott, Dorkos, Dean and Ching (1999) found that conventional amplifying hearing
aids were of limited use and therefore various attempts had been made to modify them using
filtering, extended frequency response, selective amplification and amplitude compression.
However, the findings on effect of these manipulations can restore missing high frequency
information.
In individuals with steeply sloping loss with high frequency thresholds of about 70 dB HL
or greater, the high frequency parts of speech contribute no information. Thus it is essential to
transpose high frequency information to the region of useful hearing at low frequencies in order
to make high frequency information accessible. Studies using transposition devices (Johansson,
1961; Wedenberg, 1961; Raymond & Proud, 1962; Ling & Druze, 1967; Foust & Gengel, 1973;
Velmans & Marcuson, 1983; Rees & Velmans, 1993; Turner & Hurtig, 1999) revealed a
significant amount of improvement. Hence there is a need for an additional training procedure or an
alternative transposition technique which would result in perceptual benefit for at least some
individuals. Inteo is a new advance in the technology to extend the audibility range into the high
frequencies, and it also provides good audibility in regions where the sensitivity is reduced, as in
high frequency hearing loss.
Considering all these factors the present study was designed to study the benefit of
frequency transposition hearing aids and to assess the speech identification performances for
individuals with dead regions in two conditions:
1. With Transposition
2. Without Transposition
Method
Subjects: 10 subjects (15 ears) with moderate to severe sloping high frequency sensorineural
hearing loss, in the age range of 20 to 75 years, were selected for the study. Subjects who were
diagnosed as having dead regions were taken for the study.
Stimulus: The standardized high frequency word lists and sentence lists developed by Mascarenhas &
Yathiraj (2001) were used. The test consists of three word lists, each with twenty five words, and
three sentence lists, each with nine sentences.
Recording of Stimulus: These word lists and sentence lists were recorded using a unidirectional
microphone and the WavePad software at a sampling rate of 16,000 Hz. Noise reduction and
normalization were done using the Audacity software.
Procedure
Step 1: Calibration of audiometer OB922 (dual channel) was done in free field condition.
Step 2: The TEN (HL) test (Moore, 2004) was carried out for diagnosis of dead regions in the
cochlea.
Step-by-step setting up
1. Feed the left output from the CD player to the left (or A) line-level input on the audiometer
and the right output from the CD player to the right (or B) input.
2. Select the left (or A) input for channel 1 on the audiometer and the right (or B) input for channel 2
on the audiometer.
3. Play track 1 and set the audiometer so that both line inputs are played continuously (press the
interrupt buttons), then adjust the VU meters to read 0 dB. Turn off the two inputs (press the interrupt
buttons).
4. Mix the two channels and direct the mixed channel to the desired ear (left or right).
5. Measure the absolute threshold (traditional pure tone audiogram) for each ear at each frequency
using tracks 2-8 of the CD.
6. Set the desired noise level using the channel 1 control. The level in dB HL/ERB corresponds to
the dial reading on the audiometer.
7. Measure the masked threshold for each ear at each level/frequency using tracks 2-8 of the CD
while playing the noise continuously.
8. Repeat steps 4-6 for the other ear if desired.
9. A dead region at a particular frequency is indicated by a masked threshold that is at least 10
dB above the absolute threshold and 10 dB above the TEN (HL) level per ERB.
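The criterion in step 9 can be written as a short check. The sketch below is only an illustration of that rule; the function name and the example threshold values are hypothetical and are not data from the study.

```python
def is_dead_region(masked_db, absolute_db, ten_level_db, criterion_db=10.0):
    # Both conditions of the TEN (HL) criterion must hold:
    # masked threshold at least 10 dB above the absolute threshold
    # and at least 10 dB above the TEN level per ERB.
    return (masked_db >= absolute_db + criterion_db
            and masked_db >= ten_level_db + criterion_db)

# Hypothetical values in dB HL at one test frequency.
example_dead = is_dead_region(masked_db=85, absolute_db=70, ten_level_db=70)
example_not_dead = is_dead_region(masked_db=75, absolute_db=70, ten_level_db=70)
```

The check would be repeated for each test frequency and each ear.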
Step 3: The subjects who fulfilled the standard criteria were included in the study. Their pure
tone thresholds from 250 Hz to 8 kHz for air conduction and from 250 Hz to 4 kHz for bone
conduction of the test ear were fed into the NOAH software.
The subjects were made to sit comfortably.
The hearing aid was connected to the Hipro, which was in turn connected to a computer with the
programming software.
The hearing aid was detected by the COMPASS VERSION 4 software after switching the hearing
aid “ON”.
A client database was created and the audiometric data were fed into the NOAH software.
Figure I: Frequency region to be transposed (source octave) and where it should be transposed (target octave)
Inteo includes a unique, patent-pending linear frequency transposition algorithm called
the Audibility Extender (AE), which allows people with an unaidable high frequency hearing loss
to hear the missing high frequencies in the lower frequency region. The following description
summarizes the action of the Audibility Extender (AE). First, the Inteo AE receives information
about the wearer's hearing loss from the dynamic integrator to decide which frequency region will be
transposed. The Audibility Extender picks the frequency with the highest intensity, i.e. the peak
frequency, within the area “to be transposed” or “source octave” region and locks it for
transposition.
Figure II: Audibility Extender identifies the frequency at 4 kHz in the source octave region to
have the highest peak
In this example, 4000 Hz has the peak intensity
Figure III: Sound with the peak at 4 kHz transposed down by one octave to 2 kHz
Once the peak is identified, the range of frequencies starting from 2500 Hz will be shifted downward
to the target frequency region. In this case, 4000 Hz will be transposed linearly by one octave to
2000 Hz.
Figure IV: Sounds beyond the one octave bandwidth of the 2000 Hz signal will be filtered out
To limit the masking effect from the transposed signal, frequencies that are outside the one octave
bandwidth of 2000 Hz will be filtered out. This keeps the frequency ratio between the original and
the transposed signal.
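The steps described above can be sketched on a list of spectral components. This is a simplified illustration and not the Audibility Extender algorithm itself. It assumes that the linear shift is a constant number of hertz that moves the peak down by one octave, and that the one octave band kept after filtering is centred geometrically on the target frequency; both are assumptions, as are the function name and the component values.

```python
import math

def transpose_components(components, source_start_hz):
    # components: list of (frequency_hz, level_db) pairs.
    source = [(f, level) for f, level in components if f >= source_start_hz]
    if not source:
        return []
    # Pick the most intense component in the source region as the peak.
    peak_hz, _ = max(source, key=lambda c: c[1])
    target_hz = peak_hz / 2.0          # one octave below the peak
    shift_hz = peak_hz - target_hz     # constant (linear) downward shift
    low_edge = target_hz / math.sqrt(2.0)
    high_edge = target_hz * math.sqrt(2.0)
    shifted = [(f - shift_hz, level) for f, level in source]
    # Keep only components inside the one octave band around the target.
    return [(f, level) for f, level in shifted if low_edge <= f <= high_edge]

# Hypothetical spectrum with a peak at 4000 Hz in the source region.
spectrum = [(1000, 50), (2500, 30), (3000, 35), (4000, 45), (4500, 25), (5000, 20)]
transposed = transpose_components(spectrum, source_start_hz=2500)
```

With these assumed values the 4000 Hz peak lands at 2000 Hz, the 4500 Hz component lands at 2500 Hz, and the components that fall outside the band are discarded.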
Figure V: The filtered and transposed signal is amplified and mixed with original signal
The level of the transposed signal will be automatically set by the Audibility Extender. A
manual gain adjustment of the transposed signal is also possible. The linearly transposed signal
is mixed with the original signal as the final output. The procedures described above were
adopted to transpose high frequency sounds to low frequencies. In order to see whether sufficient
amplification was provided to the subjects, hearing aid evaluation of performance was carried
out using two programmes: 1) INTEO MASTER and 2) AUDIBILITY EXTENDER. The subjects'
task was to repeat back the words and sentences heard. The words and sentences were presented
at 40 dB HL through the speakers of the audiometer. For each subject the level was constant
during the unaided and aided conditions. In subjects with asymmetrical hearing loss and hearing loss
in the non test ear, the non test ear was blocked in order to avoid its participation.
Step 4: In the next step the subjects were presented with the standardized lists. These consisted
of 3 lists of 25 high frequency words and 3 lists of 9 sentences, which were recorded using
a female voice and presented through loudspeakers. Different lists were used in each of the
unaided and aided conditions. All subjects were first evaluated with INTEO MASTER Program
and then with AUDIBILITY EXTENDER program.
Unaided Condition: In quiet, with high frequency words and sentences presented at 40 dB HL
Aided Condition: Two aided conditions were evaluated in quiet, with high frequency words and
sentences presented at 40 dB HL: without transposition (Inteo Master) and with transposition
(Audibility Extender)
The order of testing was the unaided condition, the Inteo Master programme, followed by the
Audibility Extender programme. Hence lists of 25 words and 9 sentences, 3 lists of each in all, were
presented randomly. The subjects were instructed to repeat the words they heard and the
responses were noted down in a response sheet. In speech identification testing each correct
response was given a score of “one” and the total number of correct responses was noted down
for each condition for each subject. The speech identification scores for words and sentences
were taken and appropriate statistical analyses were done.
Step 5: The unaided audiogram and the aided audiogram with the Audibility Extender programme
were obtained from 250 Hz to 8 kHz and appropriate statistical analyses were done.
Results
The data obtained from 10 subjects (15 ears) having high frequency sloping hearing
loss with dead regions were analyzed to investigate the benefits of frequency transposition
(Audibility Extender) over the non-transposed condition (Inteo Master) on speech identification
scores for words and sentences from the high frequency word and sentence lists, using the Statistical
Package for Social Sciences (Version 10) for Windows. The results are presented under the following
stages.
Stage 1: Comparison between Frequencies in Unaided and Aided Conditions: Comparisons
were made between the unaided condition and the aided condition with frequency transposition at
different frequencies from 250 Hz to 8 kHz.
Table I: The mean and standard deviation (SD) for different frequencies from 250 Hz to 8 kHz
in the unaided condition and with frequency transposition (Audibility Extender aided condition)
Condition                    Frequency (Hz)    Mean       Standard deviation
Unaided                      250               27.9167    16.3009
Unaided                      500               37.0833    20.7209
Unaided                      1K                45.4167    23.6891
Unaided                      2K                69.1667    27.8660
Unaided                      4K                94.1667    15.9307
Unaided                      8K                97.5000    13.5680
Aided (with Transposition)   250               8.7500     12.4545
Aided (with Transposition)   500               17.0833    4.5017
Aided (with Transposition)   1K                23.3333    10.7309
Aided (with Transposition)   2K                35.0000    19.5402
Aided (with Transposition)   4K                63.7500    24.7832
Aided (with Transposition)   8K                82.0833    6.8948
Figure VI. The unaided and aided performance with the Audibility Extender (error-bar plot;
x-axis: frequency, 250 Hz to 8 kHz; y-axis: 95% CI for thresholds, 0 to 120; series: aided and
unaided)
The error bars indicate the mean, the 95% confidence interval and (+) and (-) standard
deviation. Variation is smaller in the aided condition than in the unaided condition, and is
greater at 4 kHz than at 500 Hz and 8 kHz. The aided thresholds differ significantly from the
unaided thresholds.
Comparison of the Effects of Unaided and Aided Conditions, Frequencies, and the Interaction
between Unaided and Aided Conditions and Frequencies:
Table II: The results of the two-way repeated measures ANOVA
Factor                                 F(df)                 p
Hearing aid condition                  F(1,11) = 39.33       p < 0.001
Frequencies                            F(5,55) = 129.941     p < 0.001
Hearing aid conditions * frequencies   F(5,55) = 1.886       p > 0.05
From Table II it is evident that there is a significant difference between hearing aid
conditions and a significant difference between frequencies, and that there is no significant
interaction between hearing aid conditions and frequencies.
Since there is a significant difference between frequencies, pair-wise comparisons were
made with Bonferroni's multiple comparison test. Based on this test, no significant difference
was found for the frequency pairs 500 Hz and 1 kHz, and 4 kHz and 8 kHz; all other
pairs of frequencies were significantly different at the 5% level of significance. In order to
clearly understand the effects of frequency and hearing aid condition, a stepwise analysis was
performed by taking each frequency and each hearing aid condition separately.
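As a rough illustration of what a Bonferroni-type pair-wise comparison involves for six audiometric frequencies, the sketch below enumerates the pairs and derives the per-comparison significance threshold. The function name is hypothetical, and statistical packages may instead report adjusted p-values; the source does not state which form was used.

```python
from itertools import combinations

frequencies_hz = [250, 500, 1000, 2000, 4000, 8000]
alpha = 0.05  # family-wise level of significance used in the text

# Every unordered pair of frequencies is one comparison.
pairs = list(combinations(frequencies_hz, 2))
per_comparison_alpha = alpha / len(pairs)


def bonferroni_significant(p_value, n_comparisons, alpha=0.05):
    """True if an unadjusted p-value survives the Bonferroni correction."""
    return p_value < alpha / n_comparisons


print(len(pairs))                      # number of pair-wise comparisons
print(round(per_comparison_alpha, 5))  # threshold applied to each pair
```

With six frequencies there are 15 pairs, so each unadjusted p-value is compared against 0.05/15 rather than 0.05.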
Stage 2: Comparison across Frequencies within Unaided Condition:
A one-way repeated measures ANOVA was performed to see the effect of frequency in the
unaided condition. There was a significant difference between frequencies in the unaided
condition [F(5,55) = 67.913, p < 0.001]. Pair-wise comparisons were therefore made with
Bonferroni's multiple comparison test. Based on this test, no significant difference was found
for the frequency pairs 250 Hz and 500 Hz, 500 Hz and 1 kHz, and 4 kHz and 8 kHz; all other
pairs of frequencies were significantly different at the 5% level of significance.
Table III. The mean, SD for different frequencies from 250 Hz to 8 KHz in unaided conditions
Frequency (Hz) Mean SD
250 27.9167 16.3009
500 37.0833 20.7209
1K 45.4167 23.6891
2K 69.1667 27.8660
4K 94.1667 15.9307
8K 97.5000 13.5680
Stage 3: Comparison across Frequencies within Aided Condition
Table IV: The mean and SD for different frequencies from 250 Hz to 8 kHz in the aided
condition (with transposition)
Frequency (Hz)   Mean      SD
250               8.7500   11.3074
500              17.0833    4.1404
1k               23.3333   10.5560
2k               35.0000   21.8926
4k               63.7500   26.6056
8k               82.0833    6.7612
A one-way repeated measures ANOVA was performed to see the effect of frequency in the
aided condition [F(5,70) = 73.052, p < 0.001]. There was a significant difference between
frequencies in the aided condition. Pair-wise comparisons were made with Bonferroni's multiple
comparison test. Based on this test, no significant difference was found for the frequency
pairs 250 Hz and 500 Hz, 500 Hz and 1 kHz, and 4 kHz and 8 kHz; all other pairs of frequencies
were significantly different at the 5% level of significance.
Stage 4: Comparison between Unaided and Aided Conditions within each Frequency
Table V: The mean and SD for the unaided and aided conditions within each frequency
Frequency (Hz)   Condition   Mean      SD
250              Unaided     30.6667   16.5688
250              Aided        8.0000   11.3074
500              Unaided     40.0000   19.9105
500              Aided       17.0000    4.1404
1k               Unaided     51.0000   24.2899
1k               Aided       24.0000   10.5560
2k               Unaided     75.3333   28.0603
2k               Aided       41.0000   21.8926
4k               Unaided     95.3846   15.8721
4k               Aided       66.5385   25.7702
8k               Unaided     97.5000   13.5680
8k               Aided       82.0833    6.8948
Comparison of Difference between Aided and Unaided Condition within each Frequency:
A paired t-test was used to examine the difference between the aided and unaided conditions
within each frequency.
Table VI. Paired t-test results between aided and unaided conditions within each frequency
Frequency t(df)
250Hz t(14)=4.208*
500Hz t(14)=4.271*
1kHz t(14)=4.876**
2kHz t(14)=6.871**
4kHz t(12)=3.977*
8kHz t(11)=3.890*
* indicates significant at 0.05 level; ** indicates significant at 0.001 level
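For readers who want to see how a paired t statistic of the kind reported in Table VI is formed, the following is a minimal sketch using only the Python standard library. The threshold values are invented for illustration and are not the study's data. Because the degrees of freedom equal the number of pairs minus one, the reported t(14) corresponds to 15 ears, while t(12) and t(11) correspond to 13 and 12 ears.

```python
import math
from statistics import mean, stdev


def paired_t(first, second):
    """Return (t, df) for a paired-samples t-test on two equal-length lists."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    # Standard error of the mean difference (sample SD with n - 1 in the denominator).
    standard_error = stdev(differences) / math.sqrt(n)
    return mean(differences) / standard_error, n - 1


# Hypothetical thresholds (dB HL) for five ears at a single frequency.
unaided = [30, 45, 25, 40, 35]
aided = [10, 20, 15, 10, 15]

t_value, df = paired_t(unaided, aided)
print(f"t({df}) = {t_value:.3f}")
```

The p-value would then be read from the t distribution with the stated degrees of freedom; that step is left out here to keep the sketch free of third-party libraries.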
Stage 5: Comparison of Speech Identification Scores between Unaided, Inteomaster,
Audibility Extender Programmes for Words
Table VII: The mean and SD for the unaided, Inteomaster and Audibility Extender programmes for
high frequency words
Conditions Mean SD
Unaided 7.0000 6.7823
Inteomaster 12.2667 4.5272
Audibility extender 15.0000 4.5356
A one-way repeated measures ANOVA was performed to see the difference between the
unaided, Inteomaster and Audibility Extender programmes for words. There was a significant
difference between the speech identification scores for words across the three conditions
[F(2,28) = 34.161, p < 0.001]. Since there was a significant difference, pair-wise comparisons
were made with Bonferroni's multiple comparison test. Based on this test, it is evident that
all three conditions are significantly different from one another at the 0.001 level.
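The F ratio of a one-way repeated measures ANOVA can be sketched in plain Python as below. The scores are invented, the function name is hypothetical, and no sphericity correction is applied, so this is an outline of the calculation rather than a reproduction of the SPSS analysis. With three conditions and 15 ears the degrees of freedom are 3 - 1 = 2 and (3 - 1)(15 - 1) = 28, which agrees with the reported F(2,28).

```python
def one_way_rm_anova(scores):
    """scores: one list per subject, holding one value per condition.

    Returns (F, df_conditions, df_error).
    """
    n = len(scores)
    k = len(scores[0])
    if any(len(row) != k for row in scores):
        raise ValueError("every subject needs one score per condition")

    grand_mean = sum(sum(row) for row in scores) / (n * k)
    condition_means = [sum(row[j] for row in scores) / n for j in range(k)]
    subject_means = [sum(row) / k for row in scores]

    ss_conditions = n * sum((m - grand_mean) ** 2 for m in condition_means)
    ss_subjects = k * sum((m - grand_mean) ** 2 for m in subject_means)
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    # Removing between-subject variability is what makes the design "repeated measures".
    ss_error = ss_total - ss_conditions - ss_subjects

    df_conditions = k - 1
    df_error = (k - 1) * (n - 1)
    f_value = (ss_conditions / df_conditions) / (ss_error / df_error)
    return f_value, df_conditions, df_error


# Hypothetical word scores per ear: unaided, Inteomaster, Audibility Extender.
scores = [
    [5, 11, 14],
    [2, 10, 13],
    [9, 12, 17],
    [6, 14, 15],
]
f_value, df1, df2 = one_way_rm_anova(scores)
print(f"F({df1},{df2}) = {f_value:.3f}")
```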
Stage 6: Comparison of SIS between Unaided, Inteomaster, Audibility Extender for
Sentences
Table VIII: The mean and SD for the unaided, Inteomaster and Audibility Extender
programmes for high frequency sentences
Conditions Mean SD
Unaided 11.2667 9.7722
Inteomaster 19.0000 6.0710
Audibility extender 21.4000 5.5136
A one-way repeated measures ANOVA was performed to see the difference between the
unaided, Inteomaster and Audibility Extender programmes for sentences. There was a significant
difference between the speech identification scores for sentences [F(2,28) = 32.768, p < 0.001].
Since there was a significant difference, pair-wise comparisons were made with Bonferroni's
multiple comparison test. Based on this test, it is evident that all three conditions are
significantly different from one another at the 0.001 level.
Figure VII. Speech identification scores (SIS) for the Audibility Extender, Inteomaster and
unaided conditions for sentences and words (error-bar plot; x-axis: words and sentences;
y-axis: 95% CI for SIS, 0 to 30)
The error bars depict the mean, the 95% confidence interval and (+) and (-) standard
deviation. For both sentences and words, the variation is smaller for the Audibility Extender
than for the Inteomaster and unaided conditions.
Discussion
As in previous studies evaluating FRED transposition with hearing impaired
individuals (Velmans, 1975; Velmans & Marcuson, 1980; Velmans et al., 1982; Velmans et al.,
1988; Rees & Velmans, 1993), the present study has shown transposition to be
beneficial. In the present study, 10 individuals (15 ears) with moderate to severe, steeply
sloping high frequency hearing loss with dead regions participated. Dead regions were evaluated
using the TEN test. The word and sentence identification task was carried out using a high
frequency word and sentence list recorded in a female voice. The INTEO device was used; it has
an integrated signal processing (ISP) strategy. Currently there are two main approaches to
signal processing: sequential processing and parallel processing.
In sequential processing, the flow of information is always in one direction. Because of
this sequential nature, no two components can work simultaneously. In parallel processing,
different components operate at the same time. The limitation of both strategies is that the
information from the different functional units is not shared among them. As a result, wearer
satisfaction may not be ensured in all situations.
Integrated signal processing (ISP): ISP is the newest approach to hearing aid signal
processing, in which not only is the input signal shared among the different functional units,
but the results of the processing from each functional unit are also shared among the others,
resulting in a highly complex and integrated network of information flow. All the processes
within the hearing aid function as a unit. This improves the quality of the processed sounds.
Because the information is shared among all components, it becomes possible to use more complex
algorithms, which further enhance the performance of the hearing aid and wearer satisfaction.
The results of the present study reveal that transposition helps in improving the
identification of high frequency words and sentences. When unaided and aided performance were
compared, the results also showed a statistically significant difference between the two
conditions.
In contrast to the present study, McDermott and Dean (2000) carried out a study on six adults
with a very steeply sloping high frequency hearing loss. A control group of 5 adults with normal
hearing also participated. The normal hearing individuals listened to the speech material
through a low pass filter with a cut off frequency of 1200 Hz. Frequencies were lowered by a
factor of 0.6. The results revealed that frequency transposition had little overall effect on
the perception of speech.
The results of the above study differ from those of the present study because the study by
McDermott and Dean (2000) was carried out in a simulated condition in normal hearing
individuals, and because a relatively simple form of frequency lowering was used, in which each
frequency component in a sound is shifted by a constant factor. For example, when the factor
equals 0.5, all frequencies are shifted downwards by one octave. The disadvantage of this
method is that the overall pitch of the speech signal is lowered, which may cause a female
speaker to sound like a male speaker. In the present study, by contrast, the device selects the
region with the highest intensity (i.e. the peak frequency) and locks it for transposition.
Once the region is identified, the transposition is done linearly by one octave. The frequency
ratio between the original and transposed signals is also maintained so that the speech does
not get distorted. The linearly transposed signal is mixed with the original signal at the
final output, which helps in better understanding of the missing high frequency information.
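The constant-factor lowering described above can be illustrated with a few lines of arithmetic. The sketch below is not the Audibility Extender algorithm, whose implementation the text does not describe; it only shows how a multiplicative factor maps to a shift in octaves. The function names and component frequencies are assumptions made for illustration.

```python
import math


def proportional_lowering(frequency_hz, factor):
    """Shift one frequency component by a constant multiplicative factor."""
    if not 0 < factor <= 1:
        raise ValueError("factor must be greater than 0 and at most 1")
    return frequency_hz * factor


def shift_in_octaves(factor):
    """Downward shift, in octaves, produced by a lowering factor."""
    return math.log2(1 / factor)


# Hypothetical high frequency components of a speech sound.
components_hz = [2000, 4000, 6000, 8000]
lowered = [proportional_lowering(f, 0.5) for f in components_hz]

print(lowered)                          # each component moved down one octave
print(shift_in_octaves(0.5))            # factor 0.5 is exactly one octave
print(round(shift_in_octaves(0.6), 3))  # factor 0.6 is a little under one octave
```

Because every component is multiplied by the same factor, the ratios between components are preserved, which is why the whole signal, including voice pitch, sounds lower.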
Conclusion
In the present study, 10 subjects (15 ears) with moderate to severe high frequency sloping
hearing loss with dead regions were tested. Speech identification scores were measured using
high frequency word and sentence lists in three conditions: unaided, aided (Inteomaster) and
aided (Audibility Extender), in quiet at 40 dBHL. The unaided and aided thresholds at
frequencies from 250 Hz to 8 kHz were also compared. The data obtained were statistically
analyzed using one-way and two-way repeated measures ANOVA and the paired t-test. Based on the
results, the following conclusions are drawn from the study:
1. There was a significant difference between the unaided and aided (frequency transposition)
conditions from 250 Hz to 8 kHz.
2. There was a statistically significant difference between the speech identification scores
obtained with the Inteomaster (without transposition) and the Audibility Extender (with
transposition), and subjective preference was for frequency transposition. The majority of
individuals who were unable to perceive high frequency information were able to perceive
it after transposition in the aided condition.
References
Borg, E., Canlon, B. & Engstrom, B. (1995). Noise induced hearing loss-literature review and
experiments in rabbits. Scandinavian Audiology, 21, Supplement, 40, 1-147.
Engstrom, B. (1983). Stereocilia of sensory cells in normal and hearing impaired ears.
Scandinavian Audiology. Supplement, 19, 1-34.
Foust, K. O. & Gengel, R. W. (1973). Speech discrimination by sensori neural hearing impaired
persons using a transposer hearing aid. Scandinavian Audiology, 2 (3), 161-170.
Hogan, C. A. & Turner, C. W. (1998). High frequency audibility: Benefits for hearing impaired
listeners. Journal of the Acoustical Society of America, 104(1), 431-441.
Johansson, B. (1961). A new coding amplifier system for the severely hard of hearing. Proc 3rd
Int Congr Acoust, 2, 655-657.
Ling, D. & Druze, W. S. (1967). Transposition of high frequency speech sounds by partial
vocoding of the speech spectrum: Its use by deaf children. Journal of Auditory Research,
7, 133-144.
Macrae, J. H. & Dillon, H. (1996). Gain, frequency response and maximum output requirements
for hearing aids. Journal of Rehabilitation and Research Development, 3, 363-376.
McDermott, H. J., Dorkos, V. P., Dean, M. R. & Ching, Y. C. (1999). Improvements in speech
perception with use of the AVR transonic frequency transposing hearing aid. Journal of
Speech, Language and Hearing Research, 42, 1323- 1335.
Moore, B. C. J. (1986). Parallels between frequency selectivity measured in simultaneous and
forward masking. Scandinavian Audiology Supplement, 25, 139-152.
Moore, B. C. J. & Glasberg, B. R. (1997). Factors affecting thresholds for sinusoidal signals in
narrow band maskers with fluctuating envelopes. Journal of the Acoustical Society of
America, 3, 289-311.
Moore, B. C. J., Huss, M., Vickers, D. A., Glasberg, B. R., & Alcantara, J. I. (2000). A test for the
diagnosis of dead regions in the cochlea. British Journal of Audiology, 34, 205-224.
Murray, N. & Byrne, D. (1986). Performance of hearing impaired and normal hearing listeners
with various high frequency cutoffs in hearing aids. Australian journal of Audiology, 8,
21-28.
Velmans, M. & Marcuson, M. (1980). A speech like frequency transposing hearing aid for the
sensorineural deaf. children. British Journal of Audiology, 17, 221- 230.
Vickers, D. A., Moore, B. C. J. & Baer, T. (2001). Effects of low pass filtering on the intelligibility
of speech in quiet for people with and without dead regions at high frequencies. Journal
of the Acoustical Society of America, 110 (2), 1164-1175.
Villchur, E. (1973). Signal processing to improve speech intelligibility in perceptive deafness.
Journal of Acoustical Society of America, 53, 1646-1657.
Raymond, T. H. & Proud, G.O. (1962). Audiofrequency conversion. Archives of Otolaryngology,
76, 436-446.
Rees, R. & Velmans, M. (1993). The effect of frequency transposition on the untrained auditory
discrimination of congenitally deaf children. British Journal of Audiology, 27, 53-60.
Turner, C. W. & Cummings, K. J. (1999). Speech intelligibility for listeners with high frequency
hearing loss. American Academy of Audiology, 8, 47-56.
Effect of Cochlear Hearing Loss on Tone Burst Evoked Stacked
Auditory Brainstem Response
Yatin Mahajan & Vanaja C S
Abstract
The auditory brainstem response (ABR) is one of the most useful clinical procedures for
the examination of auditory sensitivity and integrity of auditory system. The conventional ABR is
not sensitive in detecting small acoustic tumors and small intracanalicular tumors. Stacked tone
burst ABR is a new method developed to increase the sensitivity of ABR in detecting small
tumours. It has been reported that cochlear hearing loss affects conventional ABR measures.
Hence, it is possible that cochlear hearing loss also affects stacked ABR. In addition, separate
normative data were established for stacked ABRs obtained by adding different frequency
specific ABRs. In the present study tone burst ABRs for 500 Hz, 1000 Hz, 2000 Hz and 4000 Hz
were recorded from 22 ears with cochlear hearing loss and 35 ears with normal hearing.
Stacked ABR was constructed from these tone burst ABRs. The results indicated there is an effect
of number of frequencies used for stacking and amplitude of ABR is largest when ABRs for all
the four frequencies are stacked. The results also revealed that cochlear hearing loss affects the
amplitude of stacked ABR and the reduction in amplitude increases with increase in severity of
hearing loss.
Key words: Frequency Specificity, amplitude, Synchronization.
Introduction
The auditory brainstem response (ABR) is one of the most useful clinical procedures for
the examination of auditory sensitivity and the integrity of the auditory system. It
has been well accepted as a procedure to detect retrocochlear pathology (Selters
& Brackmann, 1977; Chandrasekhar, Brackmann & Devgan, 1995; Selesnick & Jackler, 1992;
Welling, Glasscock, Woods & Jackson, 1990; Jerger, Oliver, Chmiel & Rivera, 1986; Starr et al,
1996). However, the sensitivity of ABR in detection of acoustic neuroma, the most common
space occupying lesion on the auditory nerve, depends on its size and location. There are reports
indicating that conventional ABR is not sensitive in detecting small acoustic tumors and small
intracanalicular tumors. Tumors of sizes less than 10 mm and small intracanalicular tumors are
often missed by standard ABR methodology (Telian, Kileny, Niparko, Kemink & Graham, 1989;
Wilson, Hodgson, Gustafson, Hogue & Mills, 1992; Eggermont, Don & Brackmann, 1980;
Schmidt, Satallof, Newmann, Spiegel & Myers, 2001).
Professor of Audiology, School of Audiology and Speech Language Pathology, Bharathiya Vidya Peet University,
Katra-Dhanakawadi, Pune, India. e-mail: [email protected]
Studies have reported an increase in incidence of small acoustic tumors over the years
(Stangerup et al., 2004). Tos, Charabi and Thomsen (1999) investigated the distribution of
diagnosed vestibular schwannomas (VS) of various sizes in Denmark from 1976 to 1995 and
reported an increased incidence of intra-canalicular tumors (from 0.4 to 7.9 VS/million/year) and
small tumors (from 13.3 to 29.0 VS/million/year). Similar findings have been reported in other
parts of the world also (Nestor, Karol, Nutik & Smith, 1988; Moffat, Hardy, Irving, Beynon &
Baguley, 1995). Therefore it is essential that audiological tests are developed to identify small
acoustic tumors.
To overcome the disadvantage of standard ABR methodology, Don, Masuda, Nelson and
Brackmann (1997) developed a new ABR measure, called the stacked ABR. The stacked ABR is
a measure which reflects the overall neural activity from a wide frequency region of the cochlea
in response to auditory stimulation. This overall neural activity is a result of synchronized
activity from various regions of the auditory nerve and desynchronization resulting from
compression of a small tumor may be evident in reduction of stacked ABR wave V amplitude
(Don, Kwong, Tanaka, Brackmann & Nelson, 2005; Chandrasekhar, Brackmann & Devgan,
1995). Don, Kwong, Tanaka, Brackmann and Nelson (2005) reported that this method has
demonstrated 95% sensitivity and 88% specificity in detecting small acoustic tumors. Philibert,
Durrant, Ferber-Viart, Duclaux, Veuillet and Collet (2003) used tone bursts of different
frequencies instead of the derived band technique, and the waveforms obtained were added after
aligning wave V. They reported an enhancement of wave V amplitude similar to that obtained
using the derived band method. They further reported reduced amplitude of the stacked ABR in
patients with small tumors.
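For reference, sensitivity and specificity figures such as those quoted above are simple proportions. The sketch below shows the arithmetic; the counts are hypothetical and were chosen only so that the ratios come out at 95% and 88%. They are not the sample sizes of the cited study.

```python
def sensitivity(true_positives, false_negatives):
    """Proportion of ears with a tumor that the test identifies as abnormal."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Proportion of ears without a tumor that the test identifies as normal."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical counts (not from the cited study).
tumor_detected, tumor_missed = 19, 1
non_tumor_passed, non_tumor_flagged = 22, 3

print(f"sensitivity = {sensitivity(tumor_detected, tumor_missed):.0%}")
print(f"specificity = {specificity(non_tumor_passed, non_tumor_flagged):.0%}")
```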
There is a dearth of literature on stacked ABR especially tone burst evoked stacked ABR.
Limited research available on stacked ABR indicates that stacked ABR is sensitive in
identification of small acoustic tumors. However, there is a need to standardize this procedure
and also study the factors that can affect the amplitude of stacked ABR. Several investigators
have reported that cochlear hearing loss affects various ABR measures such as absolute
latencies, inter peak latencies, latency intensity function and amplitude measures (Watson, 1996;
Oates & Stapells, 1992; Elberling & Parbo, 1987; Watson, 1999; Coats & Martin, 1977;
Rosenhamer, Lindstrom & Lundborg, 1981; Keith & Greville, 1987). There are very few reports
investigating effect of cochlear hearing loss on amplitude of wave V. The amplitude of the wave
V for click evoked ABR might be smaller in subjects with cochlear hearing loss than in normal
hearing subjects (Xu, Vinck, De Vel & Cauwenberge, 1998; Fowler & Durrant, 1994).
It can be hypothesized that any factor which affects conventional ABR will affect stacked
ABR measure. So it can be hypothesized that cochlear hearing loss has an effect on the
amplitude of stacked ABR. However, there is a dearth of studies in this area. It is essential to
determine the effect of cochlear hearing loss on stacked ABR and consider the effect if any,
while using stacked ABR for neurodiagnostic applications. ABRs for five frequencies have been
used to obtain stacked ABR to assess the neural integrity across different frequency regions
(Don, Kwong, Tanaka, Brackmann & Nelson, 2005; Philibert et al, 2003). However, using a smaller
number of frequencies may reduce the test time. Also in subjects with mild high frequency loss,
ABR for tone bursts of 4000 Hz and/or 2000 Hz might be absent but present for tone bursts of
other frequencies. At such time it will be useful if stacked ABR can be obtained from ABRs of
only two or three frequencies. The amplitude of stacked ABR will depend on the number of
waveforms stacked and the frequency of the stimuli used for recording frequency specific ABR.
Don, Masuda, Nelson and Brackmann (1997) reported a reduction of 33% of amplitude of
derived band stacked ABR when two bands of frequencies were removed in subjects with normal
hearing. Separate normative data therefore need to be established for stacked ABRs obtained by
adding different frequency specific ABRs. The present study was designed with the
following aims:
1. To investigate the effect of cochlear hearing loss on the tone burst evoked stacked ABR.
2. To obtain separate normative data for the amplitude of stacked ABR obtained from
   a. ABR for 500 Hz, 1000 Hz, 2000 Hz & 4000 Hz tone bursts.
   b. ABR for 500 Hz, 1000 Hz & 2000 Hz tone bursts.
   c. ABR for 500 Hz & 1000 Hz tone bursts.
Method
Participants: Participants of the present study were divided into two groups. Group 1 included
thirty five ears of normal hearing individuals aged 15-50 years with hearing sensitivity within
15 dBHL. Group 2 included twenty two ears with cochlear hearing loss, from subjects aged 15-50
years with hearing sensitivity within 55 dBHL. Speech identification scores of all 22 subjects
were proportional to the pure tone average of 500, 1000 and 2000 Hz, and there was no
abnormality indicated on click evoked ABR.
Instrumentation: A calibrated diagnostic audiometer was used for estimating the puretone
thresholds and a calibrated middle ear analyzer to rule out middle ear pathology. Tone burst
evoked stacked ABR was recorded using Intelligent Hearing Systems (Smart EP version 3.86)
evoked potential systems.
Procedure:
Table 1: Test protocol to record tone burst ABR
Parameter            Setting
Type of stimuli      Tone bursts
Transducer           Insert earphones ER-3A
Test frequency       500, 1000, 2000, 4000 Hz
Duration             4 cycles (2-0-2)
Envelope (gating)    Blackman
No. of stimuli       2000
Repetition rate      11.1/s
Test intensity       80 dBnHL
Time window          20 ms
Electrode montage    Single channel
Polarity             Alternating
Sensitivity          50 µV
Filter settings      30 Hz - 3000 Hz
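Some timing values follow directly from the protocol in Table 1. The sketch below derives them, reading "4 cycles (2-0-2)" as two cycles of rise, no plateau and two cycles of fall. The recording-time figure ignores rejected sweeps and set-up time, so it is a lower bound and not a measured test time.

```python
test_frequencies_hz = [500, 1000, 2000, 4000]
cycles_per_burst = 4          # 2 rise + 0 plateau + 2 fall
repetition_rate_per_s = 11.1  # stimuli per second
sweeps_per_average = 2000

# A burst of N cycles at f Hz lasts N / f seconds.
durations_ms = {f: 1000 * cycles_per_burst / f for f in test_frequencies_hz}

inter_stimulus_interval_ms = 1000 / repetition_rate_per_s
seconds_per_average = sweeps_per_average / repetition_rate_per_s
minutes_for_all_frequencies = len(test_frequencies_hz) * seconds_per_average / 60

for f, d in durations_ms.items():
    print(f"{f} Hz tone burst: {d} ms")
print(f"inter-stimulus interval: {inter_stimulus_interval_ms:.1f} ms")
print(f"one average of {sweeps_per_average} sweeps: {seconds_per_average:.0f} s")
print(f"four frequencies: about {minutes_for_all_frequencies:.0f} min")
```

The 8 ms burst at 500 Hz fits inside the 20 ms time window, and each added frequency costs roughly three minutes of averaging, which is the test-time argument for stacking fewer frequencies.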
Pure tone thresholds were obtained at octave frequencies between 250 Hz and 8000 Hz
for air conduction stimuli and between 250 Hz and 4000 Hz for bone conduction stimuli using
modified Hughson-Westlake method (Carhart & Jerger, 1959). ABR was recorded for the tone
bursts using the test protocol given in Table 1. The wave V was identified at all test frequencies.
The wave V recorded at all frequencies was time aligned and these aligned waveforms were
added to obtain stacked ABR. The peak-to-trough amplitude of the added waveform was
measured.
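The stacking step described above (time-align wave V across the tone burst ABRs, add the aligned waveforms, then measure peak-to-trough amplitude) can be sketched as follows. The waveforms are toy sample lists and not recorded data. Aligning to the first waveform's wave V and taking the trough as the lowest point after the peak are assumptions made for illustration; the function names are hypothetical.

```python
def shift(waveform, samples):
    """Shift a waveform by whole samples, padding with zeros.

    Positive values delay the waveform; negative values advance it.
    """
    n = len(waveform)
    if samples >= 0:
        return ([0.0] * samples + waveform)[:n]
    return (waveform + [0.0] * (-samples))[-n:]


def stack_abr(waveforms, wave_v_indices):
    """Align each waveform's wave V to that of the first waveform, then sum."""
    reference = wave_v_indices[0]
    aligned = [
        shift(waveform, reference - index)
        for waveform, index in zip(waveforms, wave_v_indices)
    ]
    return [sum(samples) for samples in zip(*aligned)]


def peak_to_trough(waveform, peak_index):
    """Amplitude from the peak to the lowest point that follows it."""
    return waveform[peak_index] - min(waveform[peak_index:])


# Toy ABRs (µV) for 4000, 2000, 1000 and 500 Hz; wave V occurs later at lower frequencies.
waveforms = [
    [0.0, 0.0, 0.2, -0.1, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.2, -0.1, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.1, -0.1, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.1, -0.05, 0.0],
]
wave_v_indices = [2, 3, 4, 5]

stacked = stack_abr(waveforms, wave_v_indices)
unaligned_sum = [sum(samples) for samples in zip(*waveforms)]

stacked_amplitude = peak_to_trough(stacked, wave_v_indices[0])
unaligned_amplitude = peak_to_trough(unaligned_sum, wave_v_indices[0])
print(round(stacked_amplitude, 2), round(unaligned_amplitude, 2))
```

In this toy example the aligned sum has a much larger peak-to-trough amplitude than the plain sum, because without alignment the peaks and troughs of neighbouring frequencies partly cancel.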
Results
The participants of the cochlear hearing loss group were further divided into two groups.
One group consisted of 12 ears with mild cochlear hearing loss (26 dBHL to 40 dBHL) and other
group included 10 ears with moderate cochlear hearing loss (41 dBHL to 55 dBHL). Separate
stacked ABRs were obtained by stacking ABRs for all four frequencies (hereafter called SA),
stacking ABR for 500 Hz, 1000 Hz and 2000 Hz (hereafter called SA3) and stacking ABR for
500 Hz and 1000 Hz (hereafter called SA2). Table 2 shows the mean amplitude and standard
deviation values of stacked ABR for 35 ears with normal hearing and 22 ears with cochlear
hearing loss. The mean amplitude for stacked wave V was largest for SA followed by SA3 and
SA2 in individuals with normal hearing whereas there was not much difference between mean
values for amplitude for SA, SA3 and SA2 for individuals with cochlear hearing loss.
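The grouping used in this section can be expressed as a small classification rule. The sketch assumes that the group boundaries (within 15 dBHL for normal hearing, 26 to 40 dBHL for mild and 41 to 55 dBHL for moderate loss) are applied to the pure tone average of 500, 1000 and 2000 Hz rounded to whole decibels. The text gives the ranges but does not spell out this step, so the rule and the example thresholds are illustrative only.

```python
def pure_tone_average(thresholds_db_hl):
    """Average of the thresholds at 500, 1000 and 2000 Hz (dB HL)."""
    return sum(thresholds_db_hl[f] for f in (500, 1000, 2000)) / 3


def classify_ear(thresholds_db_hl):
    """Assign an ear to one of the study's groups from its rounded PTA."""
    pta = round(pure_tone_average(thresholds_db_hl))
    if pta <= 15:
        return "normal hearing"
    if 26 <= pta <= 40:
        return "mild cochlear hearing loss"
    if 41 <= pta <= 55:
        return "moderate cochlear hearing loss"
    return "outside the study groups"


# Hypothetical audiograms (dB HL).
examples = [
    {500: 10, 1000: 10, 2000: 15},
    {500: 30, 1000: 35, 2000: 40},
    {500: 45, 1000: 50, 2000: 55},
    {500: 20, 1000: 20, 2000: 20},
]
for thresholds in examples:
    print(thresholds, "->", classify_ear(thresholds))
```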
Table 2: Amplitude of stacked ABR in microvolts (μV) for individuals with normal hearing and
individuals with cochlear hearing loss for different stacked ABRs
              Normal hearing                  Cochlear hearing loss
Stacked ABR   N    Mean   Std. Deviation      N    Mean   Std. Deviation
SA            35   0.54   0.09                19   0.30   0.11
SA3           35   0.53   0.11                21   0.30   0.11
SA2           35   0.50   0.14                22   0.30   0.12
Figure 1: Error bars showing the upper and lower bounds of amplitude at 95% confidence
interval at different stacked ABRs for two groups (x-axis: stacked ABR, SA2, SA3 and SA;
y-axis: 95% CI, 0.1 to 0.7; groups: normals and cochlear HL)
Figure 1 shows error bars for the upper and lower bounds of amplitude at 95%
confidence interval at different stacked ABRs for two groups and it can be observed from figure
that there is no overlap between the range of 95% confidence interval for individuals with
cochlear hearing loss and those with normal hearing for all stacked ABRs. There is a large gap
between lower bound of normal hearing and upper bound for cochlear hearing loss group.
Table 3 shows the mean amplitude and standard deviation values of stacked ABR for 12
ears with mild cochlear hearing loss and 10 ears with moderate cochlear hearing loss. The mean
amplitude for stacked wave V was larger for SA than for the other two stacked ABRs in
individuals with mild cochlear hearing loss, whereas it was larger for SA2 than for the other
two stacked ABRs (SA and SA3) in individuals with moderate cochlear hearing loss.
Table 3: Amplitude of stacked ABR in microvolts (μV) for individuals with mild cochlear hearing
loss and moderate cochlear hearing loss for different stacked ABRs
              Mild cochlear hearing loss      Moderate cochlear hearing loss
Stacked ABR   N    Mean   Std. Deviation      N    Mean   Std. Deviation
SA            10   0.36   0.11                9    0.24   0.08
SA3           12   0.34   0.09                9    0.25   0.11
SA2           12   0.34   0.14                10   0.26   0.08
Results of Mann Whitney U test revealed that there is a significant difference (p<0.01) in
mean amplitude of stacked wave V for all stacked ABRs between individuals with normal
hearing and individuals with mild cochlear hearing loss and a significant difference was
observed between amplitude of stacked wave V between individuals with normal hearing and
individuals with moderate cochlear hearing loss for all stacked ABRs. Amplitude of stacked
wave V differed significantly (p<0.05) between the individuals with mild hearing loss and
individuals with moderate hearing loss for only SA.
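The Mann Whitney U statistic used for these group comparisons can be computed by counting, over all cross-group pairs, how often a value from one group exceeds a value from the other. The sketch below uses invented amplitudes and is not the study's data; turning U into a p-value needs tables or a statistics package and is not shown.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) using the pair-counting definition; ties count as 0.5."""
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b


# Hypothetical stacked wave V amplitudes (µV).
normal_hearing = [0.52, 0.60, 0.48, 0.55]
mild_loss = [0.36, 0.30, 0.50, 0.28]

u_normal, u_mild = mann_whitney_u(normal_hearing, mild_loss)
print(u_normal, u_mild, min(u_normal, u_mild))
```

The smaller of the two U values is the one usually compared against the critical value; a value near zero means the two groups barely overlap.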
Figure 2: Error bars showing the upper and lower bounds of amplitude at 95% confidence
interval at different stacked ABRs for three groups (x-axis: stacked ABR, SA2, SA3 and SA;
y-axis: 95% CI, 0.1 to 0.7; groups: normals, mild HL and moderate HL)
It can be observed from Figure 2 that the range of the 95% confidence interval for
individuals with normal hearing is clearly separated from the range for individuals with mild
or moderate cochlear hearing loss for all stacked ABRs. The ranges of the 95%
confidence interval for mild hearing loss and moderate hearing loss, however, overlap for all
stacked ABRs.
Discussion
Amplitude of stacked wave V in individuals with normal hearing ranged from 0.50 µV to
0.57 µV for SA, which is lower than the range reported by Philibert et al (2003). This can be
attributed to the differences in the methodology used in the two studies. Philibert et al (2003)
tried to approximate the methodology of Don, Masuda, Nelson and Brackmann (1997) and hence
used five frequencies to obtain frequency specific ABR. In the present study, standard
audiometric frequencies were used due to time constraints. Also, the duration of the stimuli in
the present study was 2-0-2 cycles, as compared to the 2-1-2 cycles used by Philibert et al (2003).
Results of the present study also showed an increase in stacked wave V amplitude with
the increase in the number of frequencies included for stacking in individuals with normal
hearing. This may be due to the increase in number of neural elements that contribute to the
response (Don, Ponton, Eggermont & Masuda, 1994). So it was observed that SA had more
amplitude as it involves four frequencies which results in more synchronization and higher
amplitude in individuals with normal hearing. Don, Masuda, Nelson and Brackmann (1997) also
reported similar results, in which there was a 33% reduction in the amplitude of derived band
evoked stacked ABR when two bands were removed before the waveforms were stacked. The
reduction in amplitude of stacked wave V with reduction in number of frequencies used in
stacking could be because of the smaller number of averages in the final stacked ABR. It has been
reported in literature that the amplitude of wave V increases with increase in number of averages
(Hall, 1992; Hood, 1998). However, studies also indicate that change in amplitude is not
significant when the number of averages is increased beyond 2000 (Hall, 1992). In the present
study at each frequency 2000 sweeps were averaged. Therefore the effect of number of sweeps
on amplitude of ABR would be minimal. So the effect on amplitude of stacked ABR was due to
cochlear hearing loss.
In individuals with cochlear hearing loss there was a significant reduction in stacked
wave V amplitude for all the stacked ABRs when compared to those individuals with normal
hearing. This may be attributed to the fact that cochlear hearing loss results in abnormal
functioning of different neural elements across the cochlea. It is known that stacked ABR is a
result of total synchronized neural activity from different neural elements (Don, Kwong, Tanaka,
Brackmann & Nelson, 2005). So reduction in input to neural fibers due to cochlear hearing loss
will result in a significant reduction in stacked ABR amplitude.
Though the amplitude values of stacked wave V of different stacked ABRs were not
significantly different in individuals with mild and moderate cochlear hearing loss the amplitude
was reduced in individuals with moderate hearing loss. This may be attributed to the fact that
with the increase in hearing loss there will be more damaged regions in the cochlea which
consequently reduces the number of neural fibers stimulated leading to reduced amplitude.
To summarize, the results of the present study indicate that the amplitude of stacked ABR
depends on number of tone bursts evoked ABRs used for stacking. The results also revealed that
cochlear hearing loss affects the amplitude of stacked ABR and the reduction in amplitude
increases with increase in severity of hearing loss.
Conclusions
The results of the present study indicate that amplitude of ABR is largest when ABRs for
all the four frequencies are stacked. There is a significant difference between mean amplitude of
stacked ABR of individuals with normal hearing and individuals with cochlear hearing loss. The
amplitude of stacked ABR for individuals with mild hearing loss as well as moderate hearing
loss is significantly lower than that of individuals with normal hearing. Though the difference
is not statistically significant, the amplitude of stacked ABR reduces with increase in degree
of hearing loss.
References
Carhart, R. & Jerger, J. (1959). Preferred method for clinical determination of pure tone
thresholds. Journal of Speech and Hearing Disorders, 24, 330-345.
Chandrasekhar, S.S., Brackmann, D.E. & Devgan, K.K. (1995). Utility of auditory brainstem
response audiometry in diagnosis of acoustic neuromas. American Journal of Otology,
16, 63-67.
Coats, A.C. & Martin, J.L. (1977). Human auditory nerve action potentials and brainstem evoked
responses: effects of audiogram shape and lesion location. Archives of Otolaryngology,
103, 605-622.
Don, M., Ponton, C.W., Eggermont, J.J. & Masuda, A. (1994). Auditory brainstem response
(ABR) peak amplitude variability reflects individual differences in cochlear response
times. Journal of the Acoustical Society of America, 96, 3476-3491.
Don, M., Masuda, A., Nelson, R.A. & Brackmann, D.E. (1997). Successful detection of small
acoustic tumors using the stacked derived auditory brainstem response method. American
Journal of Otology, 18, 608-621.
Don, M., Kwong, B., Tanaka, C., Brackmann, D.E. & Nelson, R.A. (2005). The Stacked ABR: A
Sensitive and specific screening tool for detecting small acoustic tumors. Audiology
Neurotology, 10, 274-290.
Eggermont, J.J., Don, M. & Brackmann, D.E. (1980). Electrocochleography and auditory
brainstem responses in patients with pontine angle tumors. Annals Otology Rhinology
Laryngology Supplement, 75, 1-19.
Elberling, C. & Parbo, J. (1987). Reference data for auditory brainstem responses in
retrocochlear diagnosis. Scandinavian Audiology, 16, 49-55.
Fowler, C.G. & Durrant, J.D. (1994). Effects of peripheral hearing loss on the ABR. In J.T.
Jacobson (Ed.), Principles and applications in auditory evoked potentials (pp. 237-250).
Massachusetts: Allyn & Bacon.
Hall, J.W. III (1992). Handbook of auditory evoked responses. Massachusetts: Allyn and Bacon.
Hood, L.J. (1998). Clinical applications of auditory brainstem response. San Diego: Singular
Publishing Group, Inc.
Jerger, J.F., Oliver, T.A., Chmiel, R.A. & Rivera, V.M. (1986). Patterns of auditory abnormality
in multiple sclerosis. Audiology, 25, 193-209.
Keith, W.J. & Greville, K.A. (1987). Effects of audiometric configuration on the auditory
brainstem response. Ear and Hearing, 8, 49-55.
Moffat, D.A., Hardy, D.G., Irving, R.M., Viani, L. & Beynon, G.J. (1995). Referral patterns in
vestibular schwannomas. Clinical Otolaryngology, 20, 80-83.
Nestor, J.J., Korol, H.W., Nutik, S.L., & Smith, R. (1998). The incidence of acoustic neuromas.
Archives of Otolaryngology Head and Neck Surgery, 114, 680.
Oates, P. & Stapells, D.R. (1992). Interaction of click intensity and cochlear hearing loss on
auditory brainstem response wave V latency. Ear and Hearing, 13, 28-34.
Philibert, B., Durrant, J.D., Ferber-Viart, C., Duclaux, R., Veuillet, E. & Collet, L. (2003).
Stacked tone burst evoked auditory brainstem responses: preliminary findings.
International Journal of Audiology, 42, 71-81.
Rosenhamer, H.J., Lindstrom, B. & Lundborg, T. (1981). On the use of click evoked electric
brainstem responses in audiological diagnosis. III: Latencies in cochlear hearing loss.
Scandinavian Audiology, 10, 3-11.
Schmidt, R.J., Sataloff, R.T., Newman, J., Spiegel, J.R. & Myers, D.L. (2001). The sensitivity of
auditory brainstem response testing for the diagnosis of acoustic neuromas. Archives of
Otolaryngology Head and Neck Surgery, 127, 19-22.
Selesnick, S.H. & Jackler, R.K. (1992). Atypical hearing loss in acoustic neuroma patients. The
Laryngoscope, 103, 437-441.
Selters, W.A. & Brackmann, D.E. (1977). Acoustic tumor detection with brainstem electric
response audiometry. Archives of Otolaryngology, 103, 181-187.
Stangerup, S.E., Tos, M., Caye-Thomasen, P., Tos, T., Klokker, M. & Thomsen, J. (2004).
Increasing annual incidence of vestibular schwannoma and age at diagnosis. The Journal
of Laryngology and Otology, 118, 622-627.
Starr, A., Picton, T.W., Sininger, Y.S., Hood, L.J. & Berlin, C.I. (1996). Auditory neuropathy.
Brain, 119, 741-753.
Telian, S.A., Kileny, P.R., Niparko, J.K., Kemink, J.L. & Graham, M.D. (1989). Normal
auditory brainstem response in patients with acoustic neuroma. The Laryngoscope, 99,
10-14.
Tos, M., Charabi, S. & Thomsen, J. (1999). Incidence of vestibular schwannomas. The
Laryngoscope, 109, 736-70.
Watson, D.R. (1996). The effects of cochlear hearing loss, age and sex on auditory brainstem
response. Audiology, 35, 246-258.
Watson, D.R. (1999). A study of the effects of cochlear loss on auditory brainstem response,
specificity and false positive rate in retrocochlear assessment. Audiology, 38, 155-164.
Welling, D.B., Glasscock, M.E. III, Woods, C.I. & Jackson, C.G. (1990). Acoustic neuroma: a
cost effective approach. Otolaryngology Head Neck Surgery, 103, 364-370.
Wilson, D.F., Hodgson, R.S., Gustafson, M.F., Hogue, S. & Mills, L. (1992). The sensitivity of
auditory brainstem response testing in small acoustic neuromas. The Laryngoscope, 102,
961-964.
Xu, Z.M., Vinck, B., De Vel, E. & Van Cauwenberge, P. (1998). Mechanisms in noise induced
permanent hearing loss: An evoked otoacoustic emission and auditory brainstem response
study. The Journal of Laryngology and Otology, 112, 1154-1161.