Post on 26-May-2020
Context-dependent plasticity in the subcortical encoding of
linguistic pitch patterns
Joseph CY Lau1, Patrick CM Wong1, 2, and Bharath Chandrasekaran ∗3, 4, 5, 6, 7
1Department of Linguistics and Modern Languages, The Chinese University of Hong Kong,
Shatin, Hong Kong
2Brain and Mind Institute, The Chinese University of Hong Kong, Shatin, Hong Kong
3Department of Communication Sciences and Disorders, Moody College of Communication,
The University of Texas at Austin, Austin, TX, USA
4Department of Psychology, College of Liberal Arts, The University of Texas at Austin,
Austin, TX, USA
5Department of Linguistics, College of Liberal Arts, The University of Texas at Austin,
Austin, TX, USA
6Institute of Mental Health Research, College of Liberal Arts, The University of Texas at
Austin, Austin, TX, USA
7Institute for Neuroscience, The University of Texas at Austin, Austin, TX, USA
∗Corresponding author: Bharath Chandrasekaran, Department of Communication Sciences and Disorders, University of Texas at Austin, Austin, TX, USA, bchandra@utexas.edu
Running Head: Context-dependent Subcortical Plasticity in Lexical Tone FFR
Abstract
We examined the mechanics of online experience-dependent auditory plasticity by assessing the
influence of prior context on the frequency-following responses (FFRs), which reflect phase-locked
responses from neural ensembles within the subcortical auditory system. FFRs were elicited to a
Cantonese falling lexical pitch pattern from twenty-four native speakers of Cantonese in three contexts:
a variable context, wherein the falling pitch pattern occurred randomly among two other linguistic
pitch patterns; a patterned context, wherein the falling pitch pattern was presented in a predictable
sequence along with two other pitch patterns; and a repetitive context, wherein the falling pitch
pattern was presented with 100% probability. We found that neural tracking of the stimulus pitch
contour was most faithful and accurate when listening context was patterned, and least faithful when
the listening context was variable. The patterned context elicited more robust pitch tracking relative
to the repetitive context, suggesting that context-dependent plasticity is most robust when the context
is predictable, but not repetitive. Our study demonstrates a robust influence of prior listening context
that works to enhance online neural encoding of linguistic pitch patterns. We interpret these results
as indicative of an interplay between contextual processes that are responsive to probability as well
as novelty in the presentation context.
New & Noteworthy
Human auditory perception in dynamic listening environments requires fine-tuning of sensory
signal based on behaviorally-relevant regularities in listening context, i.e. online-experience-dependent
plasticity. Our findings suggest that this plasticity partly reflects
interplaying contextual processes in the subcortical auditory system that are responsive to probability
as well as novelty in listening context. These findings add to the literature that looks to establish the
neurophysiological bases of auditory system plasticity, a central issue in auditory neuroscience.
INTRODUCTION
Human auditory perception often occurs in dynamic and complex listening environments. For the
most part, humans demonstrate remarkably successful and consistent perceptual abilities in everyday
communication. Accurate perception requires that the auditory system discovers and extracts behaviorally-relevant
regularities from a non-stationary soundscape, and fine-tunes and reorganizes sensory signal on the fly
(Large and Jones, 1999; Winkler et al., 2009). The ability of the auditory system to perform these
processes online in the ambient auditory environment has been extensively studied as a form of online
experience-dependent auditory plasticity (Chandrasekaran et al., 2014; Skoe et al., 2014). At least
two fundamental neural mechanisms have been hypothesized to underlie online auditory plasticity:
predictive coding, a process that computes and extracts statistical relationships of objects in stimuli,
from which expectancies of future sounds can be built and continuously tested (Lupyan and Clark,
2015), and stimulus-specific adaptation (SSA), a process that attenuates repetitive sensory presentation
in order to enhance the processing of novel stimuli (Natan et al., 2015).
In humans, the role of the subcortical auditory system in mediating online plasticity has received
attention recently (Chandrasekaran et al., 2014). Prior work on animal models has shown that online
subcortical encoding of auditory signals is dynamically modified by top-down cortical feedback (Suga,
2008). This top-down mechanism is executed via corticofugal pathways, which are feedback loops that
back-project from auditory cortical regions onto subcortical structures like the inferior colliculus (IC)
(Winer et al., 1998). Also, direct neuron recordings from anesthetized rats have indicated that IC neurons
demonstrate SSA to commonly recurring auditory stimuli (Perez-Gonzalez et al., 2005; Malmierca et al.,
2009). Although the exact neuronal mechanism underlying SSA is still under debate, a neuronal cooling
study has shown that SSA at the level of the IC was largely unaffected by the cortex (Anderson and
Malmierca, 2013). An emerging view is that SSA at the level of the IC is largely generated by a
mechanism local to the IC that is sensitive to stimulus novelty [i.e. the difference of a stimulus in
one or more dimensions compared to previously occurring stimuli (Naatanen and Picton, 1987; Wang
et al., 2010)]. Repetitive presentation, in which a stimulus is always identical to its prior stimulus given
any position, may lead to greater synaptic depression, leading to more robust detection of stimuli that
are more novel (i.e. less repetitive) in other presentation contexts.
In humans, a recent functional magnetic resonance imaging (fMRI) study has shown that the processing
of unexpected auditory stimuli in an oddball paradigm yielded activation in the left IC, thus converging
with evidence from animal models on the role of the IC in online subcortical plasticity (Cacciaglia et al.,
2015). However, electrophysiological studies looking at the wave V of auditory brainstem response
(ABR), which is thought to be generated in ensembles within the IC (Picton, 2010), failed to find such
an unexpected deviance-related effect (Slabu et al., 2010; Althen et al., 2011). One interpretation of this
lack of effect is that the wave V may be generated at the ascending lemniscal portions of IC which are
not sensitive to novelty (Escera et al., 2014).
The scalp-recorded frequency-following responses (FFRs), on the other hand, reflect phase-locked
responses from neural ensembles within the auditory brainstem and midbrain following the phasic ABR
(Chandrasekaran and Kraus, 2010). Prior work in humans has extensively characterized the FFR as
an index of long-term and short-term subcortical auditory plasticity. Although FFRs, unlike cortical
responses, cannot be evoked purely top-down based on online auditory regularities (e.g. evoked to a
suddenly omitted stimulus in a highly regular sequence) (Lehmann et al., 2016), previous studies have
shown that the presumably bottom-up FFRs can at least be modulated by online auditory regularities.
Representations of stimulus features in the FFR are enhanced when a stimulus is highly predictable,
such as within a repetitive presentation context (Chandrasekaran et al., 2009; Parbery-Clark et al., 2011;
Strait et al., 2011) or in musical patterns (Skoe et al., 2013). Extrapolating from these studies, subcortical
auditory processing, as indexed by the FFRs, is less robust when the stimulus presentation is not easily
predictable, such as when the stimulus is presented as an oddball (Slabu et al., 2012; Skoe et al., 2014) or in random
order (Chandrasekaran et al., 2009; Parbery-Clark et al., 2011; Strait et al., 2011; Skoe et al., 2013).
These FFR studies have provided critical evidence that stimulus predictability modulates online
subcortical plasticity. Such predictability-related plasticity may be driven by top-down predictive coding
via corticocollicular feedback loops (Chandrasekaran et al., 2014). Yet, the extent to which stimulus
predictability interacts with stimulus novelty in mediating neural plasticity in the same participants, to
our knowledge, has not been examined. In previous study designs, stimulus novelty in the listening
context covaried with transitional probability (i.e. the probability of the target stimulus transitioning
from any stimulus in a single step) (Chandrasekaran et al., 2009; Strait et al., 2011; Parbery-Clark et al.,
2011; Slabu et al., 2012; Skoe et al., 2014). Therefore, disentangling potential local novelty-dependent
plasticity from top-down predictability-dependent plasticity in previous human FFR studies is challenging.
As such, despite evidence of local SSA demonstrated in the IC in animal models, compelling evidence of
the interaction between local novelty-related mechanisms and top-down predictability-related mechanisms
in mediating human subcortical auditory plasticity is still lacking.
To address this research gap, the current study investigated the extent to which stimulus novelty
modulated subcortical auditory encoding in addition to stimulus predictability in the same participants.
Here, we examined the neural encoding of dynamic lexical pitch patterns in Cantonese (Figure 1), a
tone language, in native speakers of Cantonese. Participants listened to the same falling pitch pattern
in three different contexts: 1) a variable context, wherein, the tone occurred randomly with two other
tones at a 33% probability; 2) a repetitive context, wherein, the tone was presented at a 100% probability;
3) a patterned context, wherein, the tone was presented non-repetitively in a predictable pattern with two
other tones (100% transitional probability). Due to prior evidence that has shown both novelty-dependent
(Cacciaglia et al., 2015) and predictability-dependent forms of subcortical plasticity in humans (Chandrasekaran
et al., 2009; Slabu et al., 2012; Strait et al., 2011), we predicted that both stimulus predictability and
stimulus novelty would modulate Cantonese speakers’ subcortical representation of the target falling
lexical pitch pattern. Given that the target pitch pattern was more novel in the variable context (lower
stimulus probability) compared to the repetitive context, if effects of stimulus novelty were more robust
than stimulus predictability, we would expect subcortical representations to be enhanced in the variable
context relative to the repetitive context. However, in line with prior studies examining segmental
speech information (Chandrasekaran et al., 2009; Slabu et al., 2012; Strait et al., 2011), we predicted
enhanced representation of the lexical pitch pattern in the repetitive context, relative to the variable
context. Such results would not only extend the finding of context-dependent subcortical processing
to the domain of suprasegmental information in speech signals, but also suggest top-down processes
like predictive coding may be more robust in mediating plasticity than stimulus novelty-related local
processes. Crucially, on top of this, we predicted most robust encoding in the patterned context, which
would confirm an effect of stimulus novelty-related modulation independent of stimulus predictability-related
effects. The basis for this prediction was that, since transitional probability was 100% in both the patterned and
repetitive contexts (hence identical stimulus predictability), top-down processes would be equally active
in extracting regularities in the signal. However, relative to the repetitive context, local processes such
as synaptic depression may be minimized in the patterned context, in which T4 presentations were more
novel (i.e. less repetitive) because each was preceded by the two other tones in a fixed pattern.
MATERIALS AND METHODS
Participants
Twenty-four participants (12 male; age: M = 23.8 years, SD = 5.1) recruited by advertisements through the
mass email services at the Chinese University of Hong Kong were selected for the current study. All
participants were native speakers of Hong Kong Cantonese who reported no neurological/psychiatric
impairments. Participants self-reported normal hearing in both ears, and demonstrated pure-tone air
conduction thresholds of 25 dB or better at frequencies of 500, 1,000, 2,000, and 4,000 Hz. Informed
consent approved by The Joint Chinese University of Hong Kong - New Territories East Cluster Clinical
Research Ethics Committee was obtained from each participant before any experimental procedure.
Electrophysiological testing took place in the Laboratory for Language, Learning, and the Brain at the
Chinese University of Hong Kong. All participants were compensated monetarily for their participation.
Figure 1: Stimuli Characteristics: (A) Waveforms and (B) Spectrograms of the stimuli: syllables /ji/ with a high-level (Tone 1, T1), a high-rising (Tone 2, T2), and a low-falling (Tone 4, T4) linguistic pitch pattern in Cantonese. The frequency-following response (FFR) elicited by the Tone 4 syllable is the focus of the current study. The start of the vowel portion of the Tone 4 syllable (25-175 ms) is marked in the waveform and spectrogram by red dotted lines.
Electrophysiological testing
Stimuli
Speech stimuli used for electrophysiological testing consisted of three Cantonese lexical tones, namely
Tone 1 (T1, high-level pitch pattern), Tone 2 (T2, high-rising pitch pattern), and Tone 4 (T4, low-falling
pitch pattern). The three tones were carried by the same syllable /ji/, which, in combination with the lexical tones,
yields three different Cantonese words: /ji1/ (T1, ‘doctor’), /ji2/ (T2, ‘chair’), and /ji4/ (T4, ‘son’). The
stimuli were identical to the ones used in a previous study (Liu et al., 2014). The stimuli were produced by a
male native speaker of Cantonese. The stimuli were normalized for duration (175 ms) and intensity (74
dB SPL). As such, the f0 (fundamental frequency) contour is the main acoustic feature that differs across
the stimuli: the f0 contours for T1, T2, and T4 range from 141-143 Hz, 105-132 Hz, and 86-99 Hz,
respectively (Figure 1). Native speakers of Cantonese (the first and second authors) confirmed the stimuli
to be natural exemplars of their respective lexical tone categories.
Figure 2: Experimental Design: (A) Event-matched paradigms: Tone 4 (T4) stimuli were presented in variable (top), repetitive (middle), and patterned (bottom) contexts. In the variable context, T4, Tone 1 (T1) and Tone 2 (T2) syllables were randomly presented. In the repetitive context, the T4 syllable was repeated continuously. In the patterned context, T1, T2, and T4 syllables repeated in a fixed pattern. To control for presentation order, electrophysiological responses to T4 were event-matched between the variable and repetitive contexts (solid black lines), and separately, between the patterned and repetitive contexts (gray dotted lines). (B) Transitional probabilities: in the variable context (top), the probability of a T4 occurrence (instead of a T1 or T2) after any stimulus was 33%; in the repetitive context (middle), a T4 occurred after another T4 with a 100% probability; in the patterned context (bottom), the probability of a T4 occurrence was also 100%, because T4 trials only occurred after a T1+T2 sequence.
These three Cantonese tones were chosen because their phonemic distinctions have been reported
to be stable in the language, i.e. these distinctions do not collapse under diachronic effects like sound
change (Mok et al., 2013), or synchronic processing mechanisms like talker normalization (Wong and
Diehl, 2003) across the population.
Amongst the three tones, a priori, we chose to focus on FFRs elicited by T4 to avoid a ceiling
effect on pitch tracking metrics, because FFR pitch tracking is relatively weaker for the falling tone T4
compared to T1 and T2 (Liu et al., 2014).
Presentation contexts
We presented the T4 syllable in three contexts, namely a variable context, a repetitive context, and a
patterned context (Figure 2). In the variable context condition, 1980 sweeps of T4 were presented
randomly in the context of T1 (1980 sweeps) and T2 (2040 sweeps) at a probability of 33% (Figure 2,
Panel A, top). In the repetitive context condition, 6000 sweeps of T4 were presented with a probability of
100% (Figure 2, Panel A, middle). In the patterned context, we presented T4 non-repetitively along with
T1 and T2 in a fixed sequence. In this patterned context condition, 2000 sweeps of T4 were presented,
and each sweep was preceded by a fixed sequence of one trial of T1 followed by one trial of T2 (Figure
2, Panel A, bottom). The transitional probability of occurrence of a T4 trial was therefore controlled at
100% in both patterned context (T4 after a T2) and repetitive context (T4 after a T4) conditions (Figure
2, Panel B, middle and bottom), but was much lower for the variable context at 33% (Figure 2, Panel B,
top). To control for the relative location of the T4 trials within the stream of all stimuli across conditions,
we conducted separate event-matching (Chandrasekaran et al., 2009) to control for trial order between
the variable and repetitive conditions (Figure 2, Panel A, top and middle), as well as the patterned and
repetitive conditions (Figure 2, Panel A, middle and bottom).
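The transitional probabilities of the three contexts can be checked with a short simulation. The following is a minimal illustrative sketch in Python, not the presentation script used in the study. The sweep counts come from the text above; the function names `make_sequence` and `transitional_probability`, the random seed, and the unconstrained shuffle used for the variable context are assumptions made for illustration.

```python
import random

def make_sequence(context, seed=0):
    """Return a list of tone labels for one context (sweep counts from the text)."""
    if context == "repetitive":
        return ["T4"] * 6000
    if context == "patterned":
        # Each T4 is preceded by one T1 followed by one T2.
        return ["T1", "T2", "T4"] * 2000
    if context == "variable":
        seq = ["T4"] * 1980 + ["T1"] * 1980 + ["T2"] * 2040
        # Assumption: a plain shuffle with no ordering constraints.
        random.Random(seed).shuffle(seq)
        return seq
    raise ValueError(f"unknown context: {context}")

def transitional_probability(seq, target="T4", prev=None):
    """P(next == target | previous == prev).

    With prev=None, the probability is computed after any stimulus.
    """
    pairs = [(a, b) for a, b in zip(seq, seq[1:]) if prev is None or a == prev]
    if not pairs:
        return float("nan")
    return sum(b == target for _, b in pairs) / len(pairs)

p_variable = transitional_probability(make_sequence("variable"))
p_repetitive = transitional_probability(make_sequence("repetitive"), prev="T4")
p_patterned = transitional_probability(make_sequence("patterned"), prev="T2")
print(round(p_variable, 2), p_repetitive, p_patterned)
```

Under these assumptions, the probability of a T4 after any stimulus in the variable sequence is close to 33%, whereas a T4 follows a T4 (repetitive) or a T2 (patterned) on every transition.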
Electrophysiological recording procedures
Electrophysiological recording took place in an acoustically and electromagnetically shielded booth.
During recording, participants were told to ignore the stimuli and to rest or sleep in a reclining chair,
consistent with prior FFR recording protocols (Krishnan et al., 2004; Skoe and Kraus, 2010). Stimuli
were presented in a single polarity to the participant’s right ear through electromagnetically shielded
insert earphones (ER-3A, Etymotic Research, Elk Grove Village, IL, USA) at 80 dB SPL. A 10-minute
six-talker babble noise in Cantonese was looped continuously in the background at a signal-to-noise
ratio level of 0 dB to avoid potential ceiling effects on FFR metrics observed in pilot experiments which
were conducted without noise. Stimuli in all conditions were presented with a 250 ms stimulus-onset
asynchrony (SOA). Stimuli were presented via the presentation software Neuroscan Stim2 (Compumedics,
El Paso, TX, USA). For twenty participants (out of twenty-four) who chose to complete the whole
set of electrophysiological recording on a single day, the order of the three context conditions was
counterbalanced across participants. For the other four participants, electrophysiological recordings
for the repetitive and variable context conditions occurred on a single day, and the order of these two
conditions was counterbalanced. 1 Their recording session for the patterned context condition took place
on a later, separate date. The recording of each condition lasted roughly 25 minutes. The total duration
of the testing, including preparation time, was approximately 90 minutes for each participant.
Electrophysiological responses were recorded using a SynAmps2 Neuroscan system (Compumedics,
El Paso, TX, USA) with Ag-AgCl scalp electrodes, and digitized at a sampling rate of 20,000 Hz using
CURRY Scan 7 Neuroimaging Suite (Compumedics, El Paso, TX, USA). We used a vertical electrode
montage (Skoe and Kraus, 2010) that differentially recorded electrophysiological responses from the
vertex (Cz, active) to bilateral linked mastoids (M1+M2, references), with the forehead as ground.
Contact impedance was less than 2 kΩ for all electrodes.
Preprocessing Procedures
Filtering, artifact rejection, and averaging were performed offline using CURRY 7 (Compumedics,
El Paso, TX, USA). Responses were bandpass filtered from 80 to 2500 Hz (12 dB/octave) to isolate
subcortical activity from cortical contamination and mimic the phase-locking limit of the subcortical
auditory system (Skoe and Kraus, 2010). Trials with activities greater than ±35 µV were considered
artifacts and rejected. Responses to the T4 stimulus were averaged with a 275 ms epoching window
encompassing 50 ms before stimulus onset, the 175 ms of the stimulus, and 50 ms after stimulus offset.
Responses in the repetitive context condition were averaged according to their occurrence relative to the
order of presentation in the variable context condition, and separately, to the order of presentation in the
patterned context condition. The average number of trials was approximately 1800 trials per condition.
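The artifact rejection and event-matched averaging steps can be illustrated with a minimal Python sketch. This is not the CURRY 7 pipeline used in the study: filtering is omitted, epochs are represented as plain lists of amplitudes in µV, and the function names as well as the choice to apply rejection after trial selection are assumptions.

```python
def reject_artifacts(epochs, threshold_uv=35.0):
    """Keep only epochs whose absolute amplitude never exceeds the threshold (µV)."""
    return [ep for ep in epochs if max(abs(v) for v in ep) <= threshold_uv]

def average_epochs(epochs):
    """Sample-by-sample mean across epochs of equal length."""
    if not epochs:
        raise ValueError("no epochs left to average")
    n = len(epochs)
    return [sum(samples) / n for samples in zip(*epochs)]

def event_matched_average(rep_epochs, reference_seq, target="T4", threshold_uv=35.0):
    """Average repetitive-context epochs at the trial positions where the
    target occurred in a reference sequence (variable or patterned context)."""
    positions = [i for i, label in enumerate(reference_seq) if label == target]
    selected = [rep_epochs[i] for i in positions if i < len(rep_epochs)]
    return average_epochs(reject_artifacts(selected, threshold_uv))

# Toy example: four repetitive-context epochs of three samples each.
reference_seq = ["T1", "T4", "T2", "T4"]   # T4 occurs at positions 1 and 3
rep_epochs = [[1.0, 1.0, 1.0], [2.0, 2.0, 2.0], [3.0, 3.0, 3.0], [40.0, 0.0, 0.0]]
matched = event_matched_average(rep_epochs, reference_seq)
print(matched)
```

In the toy example, positions 1 and 3 are selected, the epoch at position 3 exceeds ±35 µV and is rejected, and the average is therefore the epoch at position 1.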
Data Analysis
FFR data were analyzed using customized Matlab (The Mathworks, Natick, MA, USA) scripts adapted
from the Brainstem Toolbox (Skoe and Kraus, 2010). Before analysis, the stimulus was down-sampled
to 20,000 Hz to match the sampling rate of the response. For each FFR, we first calculated its estimated
onset delay relative to the stimulus presentation time (neural lag) due to neural conduction of the
auditory pathway. This neural lag value was computed using a cross-correlation technique that slid
the response waveform (the portion of FFR wave from 0-175 ms) and the stimulus waveform in time
with respect to one another (Liu et al., 2014). The neural lag value (in ms) was taken as the time
point at which the maximum positive correlation was achieved between 6 and 12 ms, the expected latency
of the onset component of the auditory brainstem response, with the transmission delay of the ear
inserts also taken into account (Bidelman et al., 2011; Strait et al., 2012). Then, the f0 (fundamental
frequency) contour from the averaged FFR was derived using a sliding window autocorrelation-based
procedure (Krishnan et al., 2004; Wong et al., 2007). To estimate how f0 values changed through the
waveform, the portion of the waveform that corresponds to the vowel portion of the stimulus (25-175
ms of stimulus, cf. Figure 1, shifted by neural lag in the FFR) was divided into 100 bins, each 50 ms long
(49 ms overlap between adjacent time bins). Each of the 100 time bins was time-shifted in 1 ms steps
with a delayed version of itself, and a Pearson’s r was calculated at each 1 ms interval. The time-lag to
achieve maximum correlation within each bin was recorded. The reciprocal of this time-lag represented
an estimate of f0 of that bin. The resulting f0 values formed a 100-point f0 contour. The f0 contour of
the stimulus was also derived separately using the same procedure, but the 25-175 ms analysis window
of the waveform was not shifted by the neural lag. Subsequent analyses focused on whether and how
neural pitch tracking varied as a function of online stimulus predictability and novelty.
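The neural lag estimate and the sliding-window autocorrelation procedure can be sketched as follows. This is a minimal pure-Python illustration, not the Matlab code adapted from the Brainstem Toolbox. Several details are assumptions: the function names, the 70-200 Hz search range for f0, the use of one-sample lag steps within each bin, and the 4,000 Hz sampling rate of the synthetic demo signals (chosen only to keep the example fast).

```python
import math
import random

def pearson_r(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0

def neural_lag_ms(stimulus, response, fs, lo_ms=6.0, hi_ms=12.0):
    """Lag (ms) within [lo_ms, hi_ms] that maximizes the stimulus-response correlation.

    `response` is assumed to start at stimulus onset.
    """
    n = len(stimulus)
    lag_lo, lag_hi = round(lo_ms * fs / 1000), round(hi_ms * fs / 1000)
    if len(response) < n + lag_hi:
        raise ValueError("response too short for the requested lag range")
    best = max(range(lag_lo, lag_hi + 1),
               key=lambda lag: pearson_r(stimulus, response[lag:lag + n]))
    return 1000.0 * best / fs

def f0_contour(signal, fs, start_ms=25.0, n_bins=100, win_ms=50.0, step_ms=1.0,
               fmin=70.0, fmax=200.0):
    """f0 (Hz) per bin: reciprocal of the lag with maximum autocorrelation."""
    win = round(win_ms * fs / 1000)
    lag_lo, lag_hi = math.ceil(fs / fmax), math.floor(fs / fmin)
    contour = []
    for k in range(n_bins):
        start = round((start_ms + k * step_ms) * fs / 1000)
        seg = signal[start:start + win]
        if len(seg) < win:
            raise ValueError("signal too short for the requested bins")
        # Correlate the bin with a delayed copy of itself at each candidate lag.
        best = max(range(lag_lo, lag_hi + 1),
                   key=lambda lag: pearson_r(seg[:-lag], seg[lag:]))
        contour.append(fs / best)
    return contour

fs = 4000  # Hz, demo value only (the study's responses were sampled at 20,000 Hz)
rng = random.Random(1)
stim = [rng.uniform(-1.0, 1.0) for _ in range(700)]   # 175 ms noise stand-in stimulus
resp = [0.0] * 32 + stim + [0.0] * 100                # same signal delayed by 8 ms
lag = neural_lag_ms(stim, resp, fs)
tone = [math.sin(2 * math.pi * 100 * i / fs) for i in range(700)]  # steady 100 Hz tone
contour = f0_contour(tone, fs)
print(lag, len(contour), contour[0])
```

For a response, the `start_ms` argument would be increased by the estimated neural lag, whereas for the stimulus it would be left at 25 ms, mirroring the description above.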
We derived two main metrics previously used to define the fidelity of the neural responses to
linguistic pitch patterns (Wong et al., 2007; Song et al., 2008; Skoe et al., 2014; Liu et al., 2014):
1) Stimulus-to-response correlation, and 2) f0 error. Stimulus-to-response correlation (values between
-1 and 1) is the Pearson’s correlation coefficient (r) between the stimulus and response f0 contours.
It indicates the similarity between the stimulus and response f0 contours in terms of the strength and
direction of their linear relationship (Wong et al., 2007; Liu et al., 2014). F0 error (in Hz) is the mean
absolute Euclidean distance between the stimulus and response f0 contours across the 100 bins in the
autocorrelation analysis. This metric represents the pitch encoding accuracy of the FFR by reflecting
how many Hz the FFR f0 contour deviates from the stimulus f0 contour on average (Song et al., 2008;
Skoe et al., 2014).
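Given a stimulus f0 contour and a response f0 contour of equal length, the two metrics reduce to a few lines. The sketch below is illustrative Python with hypothetical function names and made-up contour values; it is not the analysis code used in the study.

```python
import math

def stimulus_to_response_r(stim_f0, resp_f0):
    """Pearson's r between the stimulus and response f0 contours."""
    n = len(stim_f0)
    ms, mr = sum(stim_f0) / n, sum(resp_f0) / n
    cov = sum((s - ms) * (r - mr) for s, r in zip(stim_f0, resp_f0))
    vs = sum((s - ms) ** 2 for s in stim_f0)
    vr = sum((r - mr) ** 2 for r in resp_f0)
    return cov / math.sqrt(vs * vr)

def f0_error_hz(stim_f0, resp_f0):
    """Mean absolute difference (Hz) between the two contours across bins."""
    return sum(abs(s - r) for s, r in zip(stim_f0, resp_f0)) / len(stim_f0)

# Hypothetical four-bin contours: the response follows the falling shape of
# the stimulus but sits 1 Hz higher in every bin.
stim_f0 = [99.0, 95.0, 90.0, 86.0]
resp_f0 = [100.0, 96.0, 91.0, 87.0]
print(stimulus_to_response_r(stim_f0, resp_f0), f0_error_hz(stim_f0, resp_f0))
```

The example shows why the two metrics are complementary: a constant offset leaves the correlation at its maximum while still producing a non-zero f0 error.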
In addition, the signal-to-noise ratio (SNR) of each FFR was also derived to assess whether the overall
magnitude of neural activation over the entire FFR period (relative to pre-stimulus baseline) (Russo et al.,
2004) varied as a function of stimulus context. To derive the SNR of each FFR, the root mean square
(RMS) amplitudes (the mean absolute value of all sample points of the waveform within the respective
time windows, in µV) of the FFR period (neural lag to neural lag+175 ms) and the pre-stimulus baseline
period (-50 to neural lag) of the waveform were first recorded. The quotient of the FFR RMS amplitude
and the pre-stimulus RMS amplitude was taken as the SNR value (Russo et al., 2004).
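The SNR computation can be sketched in Python as follows. The sketch assumes an epoch that begins 50 ms before stimulus onset, uses the conventional root-mean-square definition of amplitude, and uses a hypothetical function name `ffr_snr`; the toy epoch and the 1,000 Hz sampling rate are chosen only for readability.

```python
import math

def rms(x):
    """Root mean square of a sequence of amplitudes."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def ffr_snr(epoch, fs, neural_lag_ms, prestim_ms=50.0, stim_ms=175.0):
    """Ratio of response-period RMS to pre-stimulus-baseline RMS.

    Baseline: epoch start (-50 ms) up to the neural lag.
    Response: neural lag up to neural lag + 175 ms.
    """
    onset = round(prestim_ms * fs / 1000)      # sample index of time 0
    lag = round(neural_lag_ms * fs / 1000)
    n_resp = round(stim_ms * fs / 1000)
    baseline = epoch[:onset + lag]
    response = epoch[onset + lag:onset + lag + n_resp]
    return rms(response) / rms(baseline)

# Toy 275 ms epoch at 1,000 Hz: amplitude 1 in the baseline, 2 in the response.
fs = 1000
epoch = [1.0] * 58 + [2.0] * 175 + [0.0] * 42
snr = ffr_snr(epoch, fs, neural_lag_ms=8.0)
print(snr)
```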
Statistical Analyses
Before subsequent parametric statistical analyses, stimulus-to-response correlation values were first
converted into Fisher’s z’ scores (Wong et al., 2007), as Pearson’s correlation coefficients are not
normally distributed. To directly examine the extent to which FFR pitch encoding and phase-locking
varied as a function of the three types of stimulus context (variable, repetitive, and patterned contexts)
overall, we conducted one-way repeated measures ANOVAs on the FFR metrics (stimulus-to-response
correlation, f0 error, and SNR). We note that directly comparing encoding across the three context
conditions (comparing the event-matched variable and patterned conditions with the non-event-matched
patterned condition) is limited by a potential confound of presentation order. To account for this
confound, we conducted a second analysis that compared the T4 FFRs in the variable condition to
the event-matched T4 FFRs in the repetitive condition. A third analysis was conducted to compare the
T4 FFRs in the patterned condition to the event-matched T4 FFRs in the repetitive condition. Two sets
of separate paired sample t-tests were used to compare the mean stimulus-to-response correlation, f0
error, and SNR, between variable context and repetitive context conditions, and separately, between
patterned context and its separately event-matched repetitive context conditions.
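The Fisher transformation and the paired-sample t statistic can be illustrated with a minimal Python sketch. The function names and the toy values are assumptions; p-values are not computed here because the Python standard library provides no t-distribution, and the repeated measures ANOVAs are outside the scope of this sketch.

```python
import math
import statistics

def fisher_z(r):
    """Fisher's r-to-z transformation: z' = atanh(r) = 0.5 * ln((1 + r) / (1 - r))."""
    return math.atanh(r)

def paired_t(x, y):
    """Paired-sample t statistic and degrees of freedom for x minus y."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)   # standard error of the mean difference
    return statistics.mean(diffs) / se, n - 1

# Toy example with made-up correlations for four participants in two conditions.
r_cond_a = [0.20, 0.35, 0.50, 0.65]
r_cond_b = [0.40, 0.50, 0.70, 0.80]
z_a = [fisher_z(r) for r in r_cond_a]
z_b = [fisher_z(r) for r in r_cond_b]
t_value, df = paired_t(z_a, z_b)
print(t_value, df)
```

A negative t value here indicates higher transformed correlations in the second condition, which matches the sign convention of a first-minus-second difference.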
RESULTS
Direct comparison between variable, repetitive, and patterned context conditions
One-way repeated measures ANOVA on stimulus-to-response correlation with the Greenhouse-Geisser
correction revealed significant differences between the three context conditions [F(1.519, 34.947) =
10.607, p=0.001]. Planned comparisons revealed that stimulus-to-response correlation of the repetitive
condition was significantly higher than that of the variable condition [F(1, 23) = 5.007, p=0.035], and
that stimulus-to-response correlation of the patterned condition was significantly higher than that of the
repetitive condition [F(1, 23) = 9.684, p=0.005].
The one-way repeated measures ANOVA on f0 error with the Greenhouse-Geisser correction also
revealed significant differences between the three context conditions [F(1.482, 34.095) = 8.973, p=0.002].
Planned comparisons revealed that f0 error of the repetitive condition was significantly lower than that
of the variable condition [F(1, 23) = 8.338, p=0.008], and that f0 error of the patterned condition was
marginally lower than that of the repetitive condition [F(1, 23) = 3.713, p=0.066].
The one-way repeated measures ANOVA on SNR with the Greenhouse-Geisser correction was not
significant [F(1.894, 43.569) = 1.001, p=0.372]. Planned comparisons on the SNR metric were not
significant, either between the repetitive and variable conditions [F(1, 23) = 0.389, p=0.539] or between
the repetitive and patterned conditions [F(1, 23) = 0.753, p=0.395].
Event-matched comparison between variable and repetitive context conditions
Figure 3 (A) and (B) show the grand averaged waveforms of event-matched FFRs to Cantonese Tone 4
/ji4/ syllable in variable context and repetitive context conditions, and the corresponding spectrograms.
We observed context-dependent effects, specifically effects of online
Figure 3: Results: Event-matched Frequency-following responses in variable vs. repetitivecontexts: (A) Waveforms and (B) Spectrograms of grand-averaged event-matched Tone 4 (T4)frequency-following responses (FFRs) from the variable and repetitive contexts. (C) Meanstimulus-to-response correlations, and (D) mean f0 errors of event-matched T4 FFRs from the variable(left bars) and repetitive (right bars) contexts. Error bars denote ± one standard error from the mean.Note that pitch tracking of the same stimulus (T4) was more robust (higher stimulus-to-responsecorrelations and lower f0 errors) in the repetitive context in which its transitional probability ofoccurrence was higher (100% vs. 33% in the variable condition). *p < 0.05, **p < 0.01
stimulus predictability, on both of our pitch tracking metrics [Figure 3 (C) and (D)]. We found a higher
stimulus-to-response correlation [t(23) = -2.697, p=0.013] ,and lower f0 error [t(23) = 3.005, p=0.006] in
the repetitive context condition relative to the variable context condition, indicating that the encoding of
a dynamic pitch pattern was more faithful when online stimulus context was predictable. No significant
context-dependent effect was found for the SNR measure [t(23) = 0.615, p=0.545].
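The sign of a paired t statistic depends only on the order of subtraction. The signs reported for this comparison are consistent with subtracting the second-named condition from the first (variable minus repetitive), although that order is an assumption here rather than something stated in the text. A minimal Python sketch with made-up values (not data from this study) illustrates that higher values in the repetitive condition yield a negative t under that convention:

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(first, second):
    """Return (t, df) for a paired t-test on the differences first - second."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))  # mean difference / its standard error
    return t, n - 1

# Hypothetical stimulus-to-response correlations for five listeners.
variable = [0.5, 0.6, 0.4, 0.7, 0.5]
repetitive = [0.6, 0.8, 0.5, 0.8, 0.8]

t, df = paired_t(variable, repetitive)
print(f"t({df}) = {t:.3f}")  # negative because the repetitive values are higher
```

Swapping the argument order flips the sign but leaves the magnitude and the p-value unchanged, so the direction of an effect should be read from the condition means rather than from the sign alone.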
Event-matched comparison between patterned and repetitive context conditions
Figure 4 (A) and (B) show the grand-averaged waveforms of separately event-matched FFRs to the Cantonese
Tone 4 /ji4/ syllable in the patterned context and repetitive context conditions, and the corresponding
spectrograms.
From our metrics, a context-dependent online stimulus novelty effect on pitch encoding was observed
[Figure 4 (C) and (D)]. We found that pitch encoding was more faithful when stimulus context was
patterned rather than repetitive, indicated by a higher stimulus-to-response correlation [t(23) = 2.876,
p=0.009] and lower f0 error [t(23) = -3.057, p=0.006] in the patterned context condition relative to the
repetitive context condition. No significant context-dependent effect was found for SNR [t(23) = -0.788,
p=0.439].
Figure 4: Results: Event-matched frequency-following responses in patterned vs. repetitive contexts:
(A) Waveforms and (B) spectrograms of grand-averaged event-matched Tone 4 (T4) frequency-following
responses (FFRs) from the patterned and repetitive contexts. (C) Mean stimulus-to-response correlations,
and (D) mean f0 errors of event-matched Tone 4 FFRs from the patterned (left bars) and repetitive (right
bars) contexts. Error bars denote ± one standard error from the mean. Note that despite the controlled
transitional probability of a T4 occurrence in both conditions (both 100%), pitch tracking of the same
stimulus (T4) was more robust (higher stimulus-to-response correlations and lower f0 errors) in the
patterned context when the stimulus was not repetitive. **p < 0.01
[Panel axes: (A) amplitude (µV) vs. time (ms); (B) frequency (Hz) vs. post-stimulus time (ms);
(C) stimulus-to-response correlation; (D) f0 error (Hz).]
DISCUSSION
Our results demonstrate a clear influence of context on the representation of lexical pitch patterns in
human FFRs. We found that the pitch tracking in FFRs to a falling lexical pitch pattern (T4) was more
faithful (higher stimulus-to-response correlation) and accurate (lower f0 error) when the pitch pattern
was presented in a repetitive context relative to a variable context. This is consistent with prior work
using segmental speech stimuli (Chandrasekaran et al., 2009; Strait et al., 2011). Interestingly, we found
that when the transitional probability of occurrence (hence stimulus predictability) was controlled, FFR
pitch tracking was more faithful and accurate when the stimulus was patterned but not repetitive.
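The two pitch-tracking metrics referred to above can be sketched in Python. The sketch assumes the definitions commonly used in the FFR literature: stimulus-to-response correlation as Pearson's r between the stimulus and response f0 contours, and f0 error as the mean absolute difference between them in Hz. The contour values are invented for illustration and are not data from this study.

```python
from math import sqrt

def stimulus_to_response_correlation(stim_f0, resp_f0):
    """Pearson's r between two equal-length f0 contours (assumed definition)."""
    n = len(stim_f0)
    ms, mr = sum(stim_f0) / n, sum(resp_f0) / n
    sxy = sum((s - ms) * (r - mr) for s, r in zip(stim_f0, resp_f0))
    sxx = sum((s - ms) ** 2 for s in stim_f0)
    syy = sum((r - mr) ** 2 for r in resp_f0)
    return sxy / sqrt(sxx * syy)

def f0_error(stim_f0, resp_f0):
    """Mean absolute f0 difference in Hz (assumed definition)."""
    return sum(abs(s - r) for s, r in zip(stim_f0, resp_f0)) / len(stim_f0)

# Hypothetical falling contour (Hz) and two hypothetical response contours.
stimulus = [100, 96, 92, 88, 84]
faithful_response = [101, 97, 93, 89, 85]   # same shape, constant 1 Hz offset
noisy_response = [100, 99, 90, 91, 82]      # shape distorted by tracking errors

for name, resp in [("faithful", faithful_response), ("noisy", noisy_response)]:
    r = stimulus_to_response_correlation(stimulus, resp)
    err = f0_error(stimulus, resp)
    print(f"{name}: r = {r:.3f}, f0 error = {err:.1f} Hz")
```

The two metrics capture different things. A constant offset leaves the correlation at its maximum while still producing a nonzero f0 error, which is why faithfulness (shape) and accuracy (distance) are reported separately.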
Prior research has extensively characterized the FFR as an index to subcortical auditory plasticity
(Chandrasekaran and Kraus, 2010). However, the question of whether cortical contributions to FFRs
can be ruled out has recently attracted attention. For example, a recent magnetoencephalography (MEG)
study has shown a right-asymmetric contribution of the auditory cortex to FFRs (Coffey et al., 2016).
Nevertheless, external evidence has at least suggested that the dominant source of FFRs is subcortical.
Human FFRs have an upper limit of about 1,000 Hz (Chandrasekaran and Kraus, 2010). Animal models
in other mammalian species have demonstrated a similar range of phase-locking abilities in neuronal
populations within the inferior colliculus (IC) (Liu et al., 2006), while units in the auditory cortex
only demonstrate phase-locking up to about 250 Hz (Wallace et al., 2005). Also, FFRs, compared
to cortical auditory evoked potentials, are smaller in amplitude; FFRs also demonstrate much lower
latency variability and earlier maturation (Chandrasekaran and Kraus, 2010). A recent study using
source dipole modeling and 3-channel Lissajous analysis on high-density multi-channel-recorded FFRs
has also suggested the midbrain to be the putative generator of speech FFRs (Bidelman, 2015). Since
all of this external evidence suggests that the dominant source of FFRs is subcortical, despite the
fact that contributions from the cortex to FFRs cannot be completely ruled out, we interpret our
current findings as a demonstration of the subcortical auditory system being sensitive to both stimulus
novelty and predictability. As such, there may be multiple neural mechanisms that interactively influence
online subcortical encoding.
Top-down and Local Processes, and their Interplay
Per animal models, at least two mechanisms are likely to be involved in context-dependent online
subcortical plasticity interactively: a top-down corticofugal mechanism that automatically fine tunes
the representation of stimulus features that matches top-down expectation, and a local mechanism that
enhances representation of novel information (Chandrasekaran et al., 2014). We posit that more faithful
and accurate neural pitch tracking in the repetitive context relative to the variable context may be driven
by the higher transitional probability in the repetitive condition relative to the variable condition, despite
the target stimulus being more novel in the latter condition. The high transitional probability in the
repetitive condition may thus result in greater top-down predictive coding, the effects of which may
have overridden that of local novelty enhancement, thereby enhancing subcortical pitch representation.
In this experiment, we collected FFRs using a passive listening paradigm wherein participants did not
pay overt attention to the stimulus stream. Our findings of context-related effect are therefore likely
to be a fundamental process of auditory processing wherein highly automatic processes are operative
even without overt attention or explicit goal-directed behavior. Future studies can systematically test
the extent to which overt attention to the stimulus pattern may modulate the magnitude of top-down
modulation. For example, testing context-dependent subcortical encoding in different states of arousal
(e.g., awake vs. asleep), or while requiring participants to explicitly track the stimulus pitch patterns,
may be informative.
The poorer encoding in the repetitive condition relative to the patterned condition, wherein the
transitional probability is controlled (hence equally robust predictive coding), is likely a result of reduced
local responsivity to the repetitive stimulus in the subcortical auditory system due to SSA.
Prior studies have discussed the possibility of SSA in modulating subcortical encoding online (Slabu
et al., 2012; Skoe et al., 2014). These studies have used auditory oddball paradigms to elicit FFRs
in the standard (high-probability) and deviant (low-probability) conditions respectively. FFR studies
using passive oddball paradigms have found that FFRs are more robust for the highly repetitive standard
stimulus, which may either suggest that SSA is not reflected in the FFR, which is sensitive to neural
phase-locking (Skoe et al., 2014), or that effects of SSA cannot be disambiguated from predictive coding
with an oddball paradigm in which stimulus novelty and probability co-vary (Slabu et al., 2012). Here,
we controlled for transitional probability while manipulating stimulus novelty by employing
an event-matched comparison between patterned and repetitive contexts. Our results demonstrate a
novelty-related enhancement effect on FFRs (patterned > repetitive), suggesting that predictability and
novelty both drive context-dependent auditory plasticity.
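The way transitional probability is dissociated from novelty across the three contexts can be illustrated with a short Python sketch. It estimates the probability of a T4 given the immediately preceding stimulus in three simulated sequences. The tone labels T1 and T2, the T1-T2-T4 ordering of the patterned sequence, and the sequence length are placeholders chosen for illustration and are not taken from the experimental design.

```python
import random
from collections import Counter

def t4_transitional_probability(sequence, target="T4"):
    """Mean P(target | preceding stimulus), averaged over target occurrences."""
    pair_counts = Counter(zip(sequence, sequence[1:]))  # (previous, next) pairs
    prev_counts = Counter(sequence[:-1])                # stimuli that have a successor
    probs = [
        pair_counts[(prev, target)] / prev_counts[prev]
        for prev, nxt in zip(sequence, sequence[1:])
        if nxt == target
    ]
    return sum(probs) / len(probs)

rng = random.Random(0)  # fixed seed so the simulated variable context is reproducible
n = 30000

repetitive = ["T4"] * n                                   # T4 always follows T4
patterned = ["T1", "T2", "T4"] * (n // 3)                 # assumed fixed order
variable = [rng.choice(["T1", "T2", "T4"]) for _ in range(n)]

for name, seq in [("repetitive", repetitive), ("patterned", patterned), ("variable", variable)]:
    print(f"{name}: P(T4 | previous) ~ {t4_transitional_probability(seq):.2f}")
```

Under these assumptions the repetitive and patterned sequences both make T4 fully predictable from its predecessor, while only the repetitive sequence presents the same stimulus back to back. The variable sequence leaves T4 at roughly chance level among three tones.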
An intriguing possibility is that the context-dependent effects found in this study represent an interaction
between online auditory plasticity and speech processing in the presence of background noise. The use
of babble noise in our stimulus presentation was intended to avoid a ceiling effect on pitch tracking
observed during pilot experiments where speech was presented in quiet. However, it is possible that
the presentation of speech-in-noise could invoke a greater interaction between top-down modulation and
local adaptation mechanisms than quiet presentation. For example, voice pitch is an important cue in
tagging auditory streams (Snyder and Alain, 2007). Hence, it is possible that a constant repetitive pitch in
the repetitive condition was tagged more easily as a separate stream from the background noise relative
to the variable condition. In other words, the randomly presented stimuli in the variable condition
may have resulted in greater neural inhibition because the stimuli were tagged as noise. This neural
habituation may have disproportionately impacted subcortical encoding in the variable condition in
addition to the low transitional probability. This interpretation would imply an intricate interaction not
just between predictive coding and SSA, but also a generalized noise-related neural habituation process
that has resulted in context-dependent modulation. Future studies could introduce different SNR levels
of background noise, and/or a change in pitch cues that results in a change in talker but not tone identity,
as a factor in their experimental design to test this possibility.
Animal studies have shown that subcortical auditory encoding can be modulated by online stimulus
statistics (Dean et al., 2005; Perez-Gonzalez et al., 2005; Malmierca et al., 2009) and behaviorally
relevant auditory experience (Suga, 2008) due to an interaction of local adaptation and top-down modulation
through corticofugal pathways (Malmierca et al., 2009). Human studies have shown that subcortical
encoding of speech and music sounds is modulated by prior listening contexts which vary in statistical
features of the auditory input (Chandrasekaran et al., 2009; Strait et al., 2011; Parbery-Clark et al.,
2011; Slabu et al., 2012; Skoe et al., 2013, 2014). A growing body of literature has also demonstrated
that subcortical auditory encoding is enhanced if the signals are behaviorally relevant, such as when
they serve linguistic purposes (Krishnan and Gandour, 2009) and are ecologically valid (Xu et al.,
2006; Krishnan et al., 2009). Together, these findings are shaping an emerging view that subcortical
structures are active processors that can be modulated by online listening contexts, among other factors
such as long-term and short-term auditory experience (Chandrasekaran and Kraus, 2010), to achieve
subcortical auditory plasticity (Chandrasekaran et al., 2014). However, how online listening context
interacts with these other types of auditory experiences to shape subcortical auditory plasticity is still
an open question. Prior studies, using repetitive stimuli presentation, have demonstrated that native
tone-language speakers exhibited superior subcortical pitch encoding ability to lexical tones, presumably
because of their life-long native tone language experience (Krishnan et al., 2005, 2009, 2010; Bidelman
et al., 2011; Krishnan et al., 2016).
Skoe and colleagues (2014), on the other hand, have investigated the extent to which subcortical
encoding of linguistic pitch patterns (i.e. lexical tones) was modulated by listening context in native
English speakers (with no prior tone language experience), and how this context-dependent encoding
changed after participants underwent an extensive sound-to-meaning auditory training program that
rendered the tones behaviorally relevant. Before training, they found that subcortical pitch tracking
of lexical tones was enhanced when the tones were presented with a higher probability in the context,
relative to when presented with a lower probability. Interestingly, post-training, there was no probability-dependent
enhancement effect on tone encoding. They argued that this loss of probability-dependent enhancement
was due to the stimuli becoming less novel after the tones’ linguistic relevance was acquired. However,
in the current study we found context-dependent enhancement in native tone-language speakers, demonstrating
that linguistic relevance is unlikely to result in a loss of context-dependent plasticity. Instead, we
propose that the lack of probability-dependent enhancement after training, seen in the study by Skoe
and colleagues (2014), may reflect a less efficient top-down modulation for the non-native learners, who
only had very limited exposure to the lexical tones.
CONCLUSIONS
In summary, the current study shows a robust influence of prior listening context that enhances online
subcortical encoding of a dynamic, time-varying linguistic pitch pattern. Encoding is more robust
when a sound is more predictable and novel in a listening context. These findings demonstrate a
complex interplay between top-down predictive coding and local SSA processes at the subcortical level
that tunes the sensory signal online based on stimulus history. These two processes are likely driven
by at least two general neurobiological mechanisms: predictive coding, which enhances predictable
sensory input, and SSA, which reduces responsivity to repetitive sensory input. Together, we interpret
this context-dependent encoding as indicative of an interaction between online and long-term auditory
experience that shapes neural plasticity in the subcortical auditory system.
ACKNOWLEDGMENTS
We thank Zilong Xie for comments on drafts of this manuscript, and Hilda Chan, Jason Ho, Yinyin
Liang, Christine Liu, and Grace Pan for their assistance with data collection. We also thank Oliver
Bones and Fang Liu for their help with MATLAB coding.
GRANTS
This work was supported by the U.S. National Institutes of Health grant (1R01DC013315) to B. Chandrasekaran,
and by U.S. National Institutes of Health grants (R01DC008333 and R01DC013315), the Research Grants
Council of Hong Kong grants (477513 and 14117514), the Health and Medical Research Fund of Hong
Kong grant (01120616), and the Dr. Stanley Ho Medical Development Foundation to P.C.M. Wong.
DISCLOSURES
No conflicts of interest, financial or otherwise, are declared by the authors.
AUTHOR CONTRIBUTIONS
P.C.M.W. and B.C. designed the study; J.C.Y.L. and P.C.M.W. collected the data; J.C.Y.L. analyzed the
data with input from P.C.M.W. and B.C.; J.C.Y.L., P.C.M.W. and B.C. interpreted results of experiments;
J.C.Y.L., P.C.M.W. and B.C. wrote the paper.
FOOTNOTES
1For the four participants whose electrophysiological recording for the patterned context condition took place on a separate
day, a Wilcoxon signed-rank test confirmed that their FFR signal-to-noise ratio (SNR) of the patterned context (M = 1.64, SD
= 0.32) was not statistically different (Z = -0.365, p = 0.715) from the rest of the participants who completed all recordings
on a single day (M = 1.66, SD = 0.47). A separate Wilcoxon signed-rank test confirmed that the four participants’ FFR
SNRs between the event-matched patterned (M = 1.64, SD = 0.32) and repetitive (M = 1.62, SD = 0.11) conditions were not
statistically different either (Z = 0.00, p = 1). These results suggest that the four participants’ electrophysiological recordings were
consistent across the two days of the experiment, and also with those of participants who completed the experiments on a single day. (On
how the SNR metric was derived, readers are referred to the Methods section).
REFERENCES
Althen H, Grimm S, and Escera C. Fast detection of unexpected sound intensity decrements as
revealed by human evoked potentials. PLoS One, 6(12):e28522, 2011.
Anderson L and Malmierca M. The effect of auditory cortex deactivation on stimulus-specific
adaptation in the inferior colliculus of the rat. Eur J Neurosci, 37(1):52–62, 2013.
Bidelman GM. Multichannel recordings of the human brainstem frequency-following response: scalp
topography, source generators, and distinctions from the transient ABR. Hear Res, 323:68–80, 2015.
Bidelman GM, Gandour JT, and Krishnan A. Musicians and tone-language speakers share enhanced
brainstem encoding but not perceptual benefits for musical pitch. Brain Cogn, 77(1):1–10, 2011.
Cacciaglia R, Escera C, Slabu L, Grimm S, Sanjuan A, Ventura-Campos N, and Avila
C. Involvement of the human midbrain and thalamus in auditory deviance detection.
Neuropsychologia, 68:51–58, 2015.
Chandrasekaran B, Hornickel J, Skoe E, Nicol T, and Kraus N. Context-dependent encoding in
the human auditory brainstem relates to hearing speech in noise: implications for developmental
dyslexia. Neuron, 64(3):311–319, 2009.
Chandrasekaran B and Kraus N. The scalp-recorded brainstem response to speech: Neural origins
and plasticity. Psychophysiology, 47(2):236–246, 2010.
Chandrasekaran B, Skoe E, and Kraus N. An integrative model of subcortical auditory plasticity.
Brain Topogr, 27(4):539–552, 2014.
Coffey EB, Herholz SC, Chepesiuk AM, Baillet S, and Zatorre RJ. Cortical contributions to the
auditory frequency-following response revealed by MEG. Nat Commun, 7, 2016.
Dean I, Harper NS, and McAlpine D. Neural population coding of sound level adapts to stimulus
statistics. Nat Neurosci, 8(12):1684–1689, 2005.
Escera C, Leung S, and Grimm S. Deviance detection based on regularity encoding along the auditory
hierarchy: electrophysiological evidence in humans. Brain Topogr, 27(4):527–538, 2014.
Krishnan A and Gandour JT. The role of the auditory brainstem in processing linguistically-relevant
pitch patterns. Brain Lang, 110(3):135–148, 2009.
Krishnan A, Gandour JT, and Bidelman GM. The effects of tone language experience on pitch
processing in the brainstem. J Neurolinguistics, 23(1):81–95, 2010.
Krishnan A, Gandour JT, Bidelman GM, and Swaminathan J. Experience dependent neural
representation of dynamic pitch in the brainstem. Neuroreport, 20(4):408, 2009.
Krishnan A, Gandour JT, and Suresh CH. Language-experience plasticity in neural representation
of changes in pitch salience. Brain Res, 1637:102–117, 2016.
Krishnan A, Xu Y, Gandour J, and Cariani P. Encoding of pitch in the human brainstem is sensitive
to language experience. Brain Res Cogn Brain Res, 25(1):161–168, 2005.
Krishnan A, Xu Y, Gandour JT, and Cariani PA. Human frequency-following response:
representation of pitch contours in Chinese tones. Hear Res, 189(1):1–12, 2004.
Large EW and Jones MR. The dynamics of attending: How people track time-varying events. Psychol
Rev, 106(1):119, 1999.
Lehmann A, Arias DJ, and Schonwiesner M. Tracing the neural basis of auditory entrainment.
Neuroscience, 337:306–314, 2016.
Liu F, Maggu AR, Lau JC, and Wong PC. Brainstem encoding of speech and musical stimuli in
congenital amusia: evidence from Cantonese speakers. Front Hum Neurosci, 8, 2014.
Liu LF, Palmer AR, and Wallace MN. Phase-locked responses to pure tones in the inferior colliculus.
J Neurophysiol, 95(3):1926–1935, 2006.
Lupyan G and Clark A. Words and the world: predictive coding and the language-perception-cognition
interface. Curr Dir Psychol Sci, 24(4):279–284, 2015.
Malmierca MS, Cristaudo S, Perez-Gonzalez D, and Covey E. Stimulus-specific adaptation in the
inferior colliculus of the anesthetized rat. J Neurosci, 29(17):5483–5493, 2009.
Mok PP, Zuo D, and Wong PW. Production and perception of a sound change in progress: Tone
merging in Hong Kong Cantonese. Lang Var Change, 25(03):341–370, 2013.
Naatanen R and Picton T. The N1 wave of the human electric and magnetic response to sound: a
review and an analysis of the component structure. Psychophysiology, 24(4):375–425, 1987.
Natan RG, Briguglio JJ, Mwilambwe-Tshilobo L, Jones SI, Aizenberg M, Goldberg EM, and
Geffen MN. Complementary control of sensory adaptation by two types of cortical interneurons.
Elife, 4:e09868, 2015.
Parbery-Clark A, Strait D, and Kraus N. Context-dependent encoding in the auditory
brainstem subserves enhanced speech-in-noise perception in musicians. Neuropsychologia,
49(12):3338–3345, 2011.
Perez-Gonzalez D, Malmierca MS, and Covey E. Novelty detector neurons in the mammalian auditory
midbrain. Eur J Neurosci, 22(11):2879–2885, 2005.
Picton T. Human auditory evoked potentials. Plural Publishing, San Diego, 2010.
Russo N, Nicol T, Musacchia G, and Kraus N. Brainstem responses to speech syllables. Clin
Neurophysiol, 115(9):2021–2030, 2004.
Skoe E, Chandrasekaran B, Spitzer ER, Wong PC, and Kraus N. Human brainstem plasticity: the
interaction of stimulus probability and auditory learning. Neurobiol Learn Mem, 109:82–93, 2014.
Skoe E and Kraus N. Auditory brainstem response to complex sounds: a tutorial. Ear Hear, 31(3):302,
2010.
Skoe E, Krizman J, Spitzer E, and Kraus N. The auditory brainstem is a barometer of rapid auditory
learning. Neuroscience, 243:104–114, 2013.
Slabu L, Escera C, Grimm S, and Costa-Faidella J. Early change detection in humans as revealed by
auditory brainstem and middle-latency evoked potentials. Eur J Neurosci, 32(5):859–865, 2010.
Slabu L, Grimm S, and Escera C. Novelty detection in the human auditory brainstem. J Neurosci,
32(4):1447–1452, 2012.
Snyder JS and Alain C. Toward a neurophysiological theory of auditory stream segregation. Psychol
Bull, 133(5):780, 2007.
Song JH, Skoe E, Wong PC, and Kraus N. Plasticity in the adult human auditory brainstem following
short-term linguistic training. J Cogn Neurosci, 20(10):1892–1902, 2008.
Strait DL, Hornickel J, and Kraus N. Subcortical processing of speech regularities underlies reading
and music aptitude in children. Behav Brain Funct, 7(1):1, 2011.
Strait DL, Parbery-Clark A, Hittner E, and Kraus N. Musical training during early childhood
enhances the neural encoding of speech in noise. Brain Lang, 123(3):191–201, 2012.
Suga N. Role of corticofugal feedback in hearing. J Comp Physiol A, 194(2):169–183, 2008.
Wallace MN, Shackleton TM, Anderson LA, and Palmer AR. Representation of the purr call in the
guinea pig primary auditory cortex. Hear Res, 204(1):115–126, 2005.
Wang AL, Mouraux A, Liang M, and Iannetti GD. Stimulus novelty, and not neural refractoriness,
explains the repetition suppression of laser-evoked potentials. J Neurophysiol, 104(4):2116–2124,
2010.
Winer JA, Larue DT, Diehl JJ, and Hefti BJ. Auditory cortical projections to the cat inferior
colliculus. J Comp Neurol, 400(2):147–174, 1998.
Winkler I, Denham SL, and Nelken I. Modeling the auditory scene: predictive regularity
representations and perceptual objects. Trends Cogn Sci, 13(12):532–540, 2009.
Wong PC and Diehl RL. Perceptual normalization for inter-and intratalker variation in Cantonese level
tones. J Speech Lang Hear Res, 46(2):413–421, 2003.
Wong PC, Skoe E, Russo NM, Dees T, and Kraus N. Musical experience shapes human brainstem
encoding of linguistic pitch patterns. Nat Neurosci, 10(4):420–422, 2007.
Xu Y, Krishnan A, and Gandour JT. Specificity of experience-dependent pitch representation in the
brainstem. Neuroreport, 17(15):1601–1605, 2006.