
Context-dependent plasticity in the subcortical encoding of

linguistic pitch patterns

Joseph CY Lau1, Patrick CM Wong1, 2, and Bharath Chandrasekaran ∗3, 4, 5, 6, 7

1Department of Linguistics and Modern Languages, The Chinese University of Hong Kong,

Shatin, Hong Kong

2Brain and Mind Institute, The Chinese University of Hong Kong, Shatin, Hong Kong

3Department of Communication Sciences and Disorders, Moody College of Communication,

The University of Texas at Austin, Austin, TX, USA

4Department of Psychology, College of Liberal Arts, The University of Texas at Austin,

Austin, TX, USA

5Department of Linguistics, College of Liberal Arts, The University of Texas at Austin,

Austin, TX, USA

6Institute of Mental Health Research, College of Liberal Arts, The University of Texas at

Austin, Austin, TX, USA

7Institute for Neuroscience, The University of Texas at Austin, Austin, TX, USA

∗Corresponding author: Bharath Chandrasekaran, Department of Communication Sciences and Disorders, University of Texas at Austin, Austin, TX, USA [email protected]


Running Head: Context-dependent Subcortical Plasticity in Lexical Tone FFR

Abstract

We examined the mechanics of online experience-dependent auditory plasticity by assessing the

influence of prior context on the frequency-following responses (FFRs), which reflect phase-locked

responses from neural ensembles within the subcortical auditory system. FFRs were elicited to a

Cantonese falling lexical pitch pattern from twenty-four native speakers of Cantonese in a variable

context, wherein the falling pitch pattern randomly occurred in the context of two other linguistic

pitch patterns; in a patterned context, wherein the falling pitch pattern was presented in a predictable

sequence along with two other pitch patterns; and in a repetitive context, wherein the falling pitch

pattern was presented with 100% probability. We found that neural tracking of the stimulus pitch

contour was most faithful and accurate when listening context was patterned, and least faithful when

the listening context was variable. The patterned context elicited more robust pitch tracking relative

to the repetitive context, suggesting that context-dependent plasticity is most robust when the context

is predictable, but not repetitive. Our study demonstrates a robust influence of prior listening context

that works to enhance online neural encoding of linguistic pitch patterns. We interpret these results

as indicative of an interplay between contextual processes that are responsive to probability as well

as novelty in the presentation context.

New & Noteworthy

Human auditory perception in dynamic listening environments requires fine-tuning of sensory

signal based on behaviorally-relevant regularities in listening context, i.e. online-experience-dependent

plasticity. Our findings suggest that online-experience-dependent plasticity is partly underpinned by

interplaying contextual processes in the subcortical auditory system that are responsive to probability

as well as novelty in listening context. These findings add to the literature that looks to establish the

neurophysiological bases of auditory system plasticity, a central issue in auditory neuroscience.

INTRODUCTION

Human auditory perception often occurs in dynamic and complex listening environments. For the

most part, humans demonstrate remarkably successful and consistent perceptual abilities in everyday

communication. Accurate perception requires that the auditory system discovers and extracts behaviorally-relevant

regularities from a non-stationary soundscape, and fine-tunes and reorganizes sensory signals on the fly

(Large and Jones, 1999; Winkler et al., 2009). The ability of the auditory system to perform these

processes online in the ambient auditory environment has been extensively studied as a form of online

experience-dependent auditory plasticity (Chandrasekaran et al., 2014; Skoe et al., 2014). At least

two fundamental neural mechanisms have been hypothesized to underlie online auditory plasticity:


predictive coding, a process that computes and extracts statistical relationships of objects in stimuli,

from which expectancies of future sounds can be built and continuously tested (Lupyan and Clark,

2015), and stimulus-specific adaptation (SSA), a process that attenuates repetitive sensory presentation

in order to enhance the processing of novel stimuli (Natan et al., 2015).

In humans, the role of the subcortical auditory system in mediating online plasticity has received

attention recently (Chandrasekaran et al., 2014). Prior work on animal models has shown that online

subcortical encoding of auditory signals is dynamically modified by top-down cortical feedback (Suga,

2008). This top-down mechanism is executed via corticofugal pathways, which are feedback loops that

back-project from auditory cortical regions onto subcortical structures like the inferior colliculus (IC)

(Winer et al., 1998). Also, direct neuron recordings from anesthetized rats have indicated that IC neurons

demonstrate SSA to commonly recurring auditory stimuli (Perez-Gonzalez et al., 2005; Malmierca et al.,

2009). Although the exact neuronal mechanism underlying SSA is still under debate, a neuronal cooling

study has shown that SSA at the level of the IC was largely unaffected by the cortex (Anderson and

Malmierca, 2013). An emerging view is that SSA at the level of the IC is largely generated by a

mechanism local to the IC that is sensitive to stimulus novelty [i.e. the difference of a stimulus in

one or more dimensions compared to previously occurring stimuli (Naatanen and Picton, 1987; Wang

et al., 2010)]. Repetitive presentation, in which a stimulus is always identical to its prior stimulus given

any position, may lead to greater synaptic depression, leading to more robust detection of stimuli that

are more novel (i.e. less repetitive) in other presentation contexts.

In humans, a recent functional magnetic resonance imaging (fMRI) study has shown that the processing

of unexpected auditory stimuli in an oddball paradigm yielded activation in the left IC, thus converging

with evidence from animal models on the role of the IC in online subcortical plasticity (Cacciaglia et al.,

2015). However, electrophysiological studies looking at the wave V of auditory brainstem response

(ABR), which is thought to be generated in ensembles within the IC (Picton, 2010), failed to find such

an unexpected deviance-related effect (Slabu et al., 2010; Althen et al., 2011). One interpretation of this

lack of effect is that the wave V may be generated at the ascending lemniscal portions of IC which are

not sensitive to novelty (Escera et al., 2014).

The scalp-recorded frequency-following responses (FFRs), on the other hand, reflect phase-locked

responses from neural ensembles within the auditory brainstem and midbrain following the phasic ABR

(Chandrasekaran and Kraus, 2010). Prior work in humans has extensively characterized the FFR as

an index of long-term and short-term subcortical auditory plasticity. Although FFRs, unlike cortical

responses, cannot be evoked purely top-down based on online auditory regularities (e.g. evoked to a


suddenly omitted stimulus in a highly regular sequence) (Lehmann et al., 2016), previous studies have

shown that the presumably bottom-up FFRs can at least be modulated by online auditory regularities.

Representations of stimulus features in the FFR are enhanced when a stimulus is highly predictable,

such as within a repetitive presentation context (Chandrasekaran et al., 2009; Parbery-Clark et al., 2011;

Strait et al., 2011) or in musical patterns (Skoe et al., 2013). Extrapolating from these studies, subcortical

auditory processing, as indexed by the FFR, is less robust when the stimulus presentation is not easily

predictable, such as when presented as an oddball (Slabu et al., 2012; Skoe et al., 2014) or in random order

(Chandrasekaran et al., 2009; Parbery-Clark et al., 2011; Strait et al., 2011; Skoe et al., 2013).

These FFR studies have provided critical evidence that stimulus predictability modulates online

subcortical plasticity. Such predictability-related plasticity may be driven by top-down predictive coding

via corticocollicular feedback loops (Chandrasekaran et al., 2014). Yet, the extent to which stimulus

predictability interacts with stimulus novelty in mediating neural plasticity in the same participants, to

our knowledge, has not been examined. In previous study designs, stimulus novelty in the listening

context covaried with transitional probability (i.e. the probability of the target stimulus transitioning

from any stimulus in a single step) (Chandrasekaran et al., 2009; Strait et al., 2011; Parbery-Clark et al.,

2011; Slabu et al., 2012; Skoe et al., 2014). Therefore, disentangling potential local novelty-dependent

plasticity from top-down predictability-dependent plasticity in previous human FFR studies is challenging.

As such, despite evidence of local SSA demonstrated in the IC in animal models, compelling evidence of

the interaction between local novelty-related mechanisms and top-down predictability-related mechanisms

in mediating human subcortical auditory plasticity is still lacking.

To address this research gap, the current study investigated the extent to which stimulus novelty

modulated subcortical auditory encoding in addition to stimulus predictability in the same participants.

Here, we examined the neural encoding of dynamic lexical pitch patterns in Cantonese (Figure 1), a

tone language, in native speakers of Cantonese. Participants listened to the same falling pitch pattern

in three different contexts: 1) a variable context, wherein, the tone occurred randomly with two other

tones at a 33% probability; 2) a repetitive context, wherein, the tone was presented at a 100% probability;

3) a patterned context, wherein, the tone was presented non-repetitively in a predictable pattern with two

other tones (100% transitional probability). Due to prior evidence that has shown both novelty-dependent

(Cacciaglia et al., 2015) and predictability-dependent forms of subcortical plasticity in humans (Chandrasekaran

et al., 2009; Slabu et al., 2012; Strait et al., 2011), we predicted that both stimulus predictability and

stimulus novelty would modulate Cantonese speakers' subcortical representation of the target falling

lexical pitch pattern. Given that the target pitch pattern was more novel in the variable context (lower


stimulus probability) compared to the repetitive context, if effects of stimulus novelty were more robust

than stimulus predictability, we would expect subcortical representations to be enhanced in the variable

context relative to the repetitive context. However, in line with prior studies examining segmental

speech information (Chandrasekaran et al., 2009; Slabu et al., 2012; Strait et al., 2011), we predicted

enhanced representation of the lexical pitch pattern in the repetitive context, relative to the variable

context. Such results would not only extend the finding of context-dependent subcortical processing

to the domain of suprasegmental information in speech signals, but also suggest top-down processes

like predictive coding may be more robust in mediating plasticity than stimulus novelty-related local

processes. Crucially, on top of this, we predicted most robust encoding in the patterned context, which

would confirm an effect of stimulus novelty-related modulation independent of stimulus predictability-related

effects. Our basis for prediction was that since transitional probability was 100% in both patterned and

repetitive contexts (hence identical stimulus predictability), top-down processes would be equally active

in extracting regularities in the signal. However, relative to the repetitive context, local processes such

as synaptic depression may be minimized in the patterned context in which T4 presentations were more

novel (i.e. less repetitive) due to their prior patterned presentation with the two other tones.

MATERIALS AND METHODS

Participants

Twenty-four participants (12 male; Age: M: 23.8 yrs, S.D. 5.1) recruited by advertisements through the

mass email services at the Chinese University of Hong Kong were selected for the current study. All

participants were native speakers of Hong Kong Cantonese who reported no neurological/psychiatric

impairments. Participants self-reported normal hearing in both ears, and demonstrated pure-tone air

conduction thresholds of 25 dB or better at frequencies of 500, 1,000, 2,000, and 4,000 Hz. Informed

consent approved by The Joint Chinese University of Hong Kong - New Territories East Cluster Clinical

Research Ethics Committee was obtained from each participant before any experimental procedure.

Electrophysiological testing took place in the Laboratory for Language, Learning, and the Brain at the

Chinese University of Hong Kong. All participants were compensated monetarily for their participation.


[Figure 1 image. Panel A: Waveforms (x-axis: Time (ms)). Panel B: Spectrograms (x-axis: Time (ms); y-axis: Frequency (Hz); color scale in µV). Traces shown for T1, T2, and T4 (*T4 marks the target tone).]

Figure 1: Stimuli Characteristics: (A) Waveforms and (B) Spectrograms of the stimuli: syllables /ji/ with a high-level (Tone 1, T1), a high-rising (Tone 2, T2), and a low-falling (Tone 4, T4) linguistic pitch pattern in Cantonese. The frequency-following response (FFR) elicited by the Tone 4 syllable is the focus of the current study. The start of the vowel portion of the Tone 4 syllable (25-175 ms) is marked in the waveform and spectrogram by red dotted lines.

Electrophysiological testing

Stimuli

Speech stimuli used for electrophysiological testing consisted of three Cantonese lexical tones, namely

Tone 1 (T1, high-level pitch pattern), Tone 2 (T2, high-rising pitch pattern), and Tone 4 (T4, low-falling

pitch pattern). The three tones were carried by the same syllable /ji/, which, in combination with the lexical tones,

yields three different Cantonese words: /ji1/ (T1, ‘doctor’), /ji2/ (T2, ‘chair’), and /ji4/ (T4, ‘son’). The

stimuli were identical to those used in a previous study (Liu et al., 2014). The stimuli were produced by a

male native speaker of Cantonese. The stimuli were normalized for duration (175 ms) and intensity (74

dB SPL). As such, f0 (fundamental frequency) contour is the main acoustic feature that differs across

the stimuli: the f0 contours for T1, T2, and T4 range from 141-143 Hz, 105-132 Hz, and 86-99 Hz,

respectively (Figure 1). Native speakers of Cantonese (the first and second author) confirmed the stimuli

to be natural exemplars of their respective lexical tone categories.



Figure 2: Experimental Design: (A) Event-matched paradigms: Tone 4 (T4) stimuli were presented in variable (top), repetitive (middle), and patterned (bottom) contexts. In the variable context, T4, Tone 1 (T1) and Tone 2 (T2) syllables were randomly presented. In the repetitive context, the T4 syllable was repeated continuously. In the patterned context, T1, T2, and T4 syllables repeated in a fixed pattern. To control for presentation order, electrophysiological responses to T4 were event-matched between the variable and repetitive contexts (solid black lines), and separately, between the patterned and repetitive contexts (gray dotted lines). (B) Transitional probabilities: in the variable context (top), the probability of a T4 occurrence (instead of a T1 or T2) after any stimulus was 33%; in the repetitive context (middle), a T4 occurred after another T4 with a 100% probability; in the patterned context (bottom), the probability of a T4 occurrence was also 100%, because T4 trials only occurred after a T1+T2 sequence.

These three Cantonese tones were chosen because their phonemic distinctions have been reported

to be stable in the language, i.e. these distinctions do not collapse under diachronic effects like sound

change (Mok et al., 2013), or synchronic processing mechanisms like talker normalization (Wong and

Diehl, 2003) across the population.

Amongst the three tones, a priori, we chose to focus on FFRs elicited by T4 to avoid a ceiling

effect on pitch tracking metrics, because FFR pitch tracking is relatively weaker for the falling tone T4

compared to T1 and T2 (Liu et al., 2014).

Presentation contexts

We presented the T4 syllable in three contexts, namely a variable context, a repetitive context, and a

patterned context (Figure 2). In the variable context condition, 1980 sweeps of T4 were presented


randomly in the context of T1 (1980 sweeps) and T2 (2040 sweeps) at a probability of 33% (Figure 2,

Panel A, top). In the repetitive context condition, 6000 sweeps of T4 were presented with a probability of

100% (Figure 2, Panel A, middle). In the patterned context, we presented T4 non-repetitively along with

T1 and T2 in a fixed sequence. In this patterned context condition, 2000 sweeps of T4 were presented,

and each sweep was preceded by a fixed sequence of one trial of T1 followed by one trial of T2 (Figure

2, Panel A, bottom). The transitional probability of occurrence of a T4 trial was therefore controlled at

100% in both patterned context (T4 after a T2) and repetitive context (T4 after a T4) conditions (Figure

2, Panel B, middle and bottom), but was much lower for the variable context at 33% (Figure 2, Panel B,

top). To control for the relative location of the T4 trials within the stream of all stimuli across conditions,

we conducted separate event-matching (Chandrasekaran et al., 2009) to control for trial order between

the variable and repetitive conditions (Figure 2, Panel A, top and middle), as well as the patterned and

repetitive conditions (Figure 2, Panel A, middle and bottom).
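The three presentation contexts and their transitional probabilities can be illustrated with a minimal sketch. The sweep counts are taken from the text; the function names, the use of an unconstrained shuffle for the variable context, and the random seed are assumptions made for illustration and are not the presentation scripts used in the study.

```python
import random
from collections import Counter

def variable_context(n_t1=1980, n_t2=2040, n_t4=1980, seed=0):
    """Randomly interleave the three tones (sweep counts from the text).
    Assumption: a plain shuffle with no ordering constraints."""
    seq = ["T1"] * n_t1 + ["T2"] * n_t2 + ["T4"] * n_t4
    random.Random(seed).shuffle(seq)
    return seq

def repetitive_context(n_t4=6000):
    """T4 only: every trial follows another T4."""
    return ["T4"] * n_t4

def patterned_context(n_t4=2000):
    """Fixed T1 -> T2 -> T4 cycle, so T4 always follows T2."""
    return ["T1", "T2", "T4"] * n_t4

def transitional_probability(seq, prev, target="T4"):
    """Estimate P(next == target | current == prev) from adjacent pairs."""
    followers = Counter(b for a, b in zip(seq, seq[1:]) if a == prev)
    total = sum(followers.values())
    return followers[target] / total if total else float("nan")

if __name__ == "__main__":
    print(transitional_probability(patterned_context(), "T2"))   # T4 after T2
    print(transitional_probability(repetitive_context(), "T4"))  # T4 after T4
    print(transitional_probability(variable_context(), "T1"))    # T4 after T1
```

In this sketch the patterned and repetitive sequences both give a transitional probability of 1 for T4, while the shuffled sequence gives a value near one third, which is the contrast the design relies on.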

Electrophysiological recording procedures

Electrophysiological recording took place in an acoustically and electromagnetically shielded booth.

During recording, participants were told to ignore the stimuli and to rest or sleep in a reclining chair,

consistent with prior FFR recording protocols (Krishnan et al., 2004; Skoe and Kraus, 2010). Stimuli

were presented in a single polarity to the participant’s right ear through electromagnetically shielded

insert earphones (ER-3A, Etymotic Research, Elk Grove Village, IL, USA) at 80 dB SPL. A 10-minute

six-talker babble noise in Cantonese was looped in the background at a signal-to-noise

ratio level of 0 dB to avoid potential ceiling effects on FFR metrics observed in pilot experiments which

were conducted without noise. Stimuli in all conditions were presented with a 250 ms stimulus-onset

asynchrony (SOA). Stimuli were presented via the presentation software Neuroscan Stim2 (Compumedics,

El Paso, TX, USA). For twenty participants (out of twenty-four) who chose to complete the whole

set of electrophysiological recording on a single day, the order of the three context conditions was

counterbalanced across participants. For the other four participants, electrophysiological recordings

for the repetitive and variable context conditions occurred on a single day, and the order of these two

conditions was counterbalanced. 1 Their recording session for the patterned context condition took place

on a later separate date. The recording of each condition lasted roughly 25 minutes. The total duration

of testing, including preparation time, was approximately 90 minutes for each

participant.

Electrophysiological responses were recorded using a SynAmps2 Neuroscan system (Compumedics,


El Paso, TX, USA) with Ag-AgCl scalp electrodes, and digitized at a sampling rate of 20,000 Hz using

CURRY Scan 7 Neuroimaging Suite (Compumedics, El Paso, TX, USA). We used a vertical electrode

montage (Skoe and Kraus, 2010) that differentially recorded electrophysiological responses from the

vertex (Cz, active) to bilateral linked mastoids (M1+M2, references), with the forehead as ground.

Contact impedance was less than 2 kΩ for all electrodes.

Preprocessing Procedures

Filtering, artifact rejection, and averaging were performed offline using CURRY 7 (Compumedics,

El Paso, TX, USA). Responses were bandpass filtered from 80 to 2500 Hz (12 dB/octave) to isolate

subcortical activity from cortical contamination and mimic the phase-locking limit of the subcortical

auditory system (Skoe and Kraus, 2010). Trials with activities greater than ±35 µV were considered

artifacts and rejected. Responses to the T4 stimulus were averaged with a 275 ms epoching window

encompassing 50 ms before stimulus onset, the 175 ms of the stimulus, and 50 ms after stimulus offset.

Responses in the repetitive context condition were averaged according to their occurrence relative to the

order of presentation in the variable context condition, and separately, to the order of presentation in the

patterned context condition. The average number of trials was approximately 1800 per condition.
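The artifact rejection, averaging, and event-matching steps can be sketched as follows. This is an illustration only: processing in the study was done in CURRY 7, the bandpass filter is omitted here, and the function names and the reading of event-matching as "select the repetitive-context trials at the positions where T4 occurred in the reference sequence" are assumptions.

```python
def epoch_length_samples(fs_hz=20000, pre_ms=50, stim_ms=175, post_ms=50):
    """Samples in one epoch: 50 ms baseline + 175 ms stimulus + 50 ms post."""
    return fs_hz * (pre_ms + stim_ms + post_ms) // 1000

def reject_and_average(epochs, threshold_uv=35.0):
    """Drop epochs whose absolute amplitude exceeds the threshold (in µV),
    then average the remaining epochs sample by sample."""
    kept = [ep for ep in epochs if max(abs(v) for v in ep) <= threshold_uv]
    if not kept:
        raise ValueError("all epochs were rejected")
    n = len(kept)
    average = [sum(samples) / n for samples in zip(*kept)]
    return average, n

def event_matched_indices(reference_seq, target="T4"):
    """Trial positions of the target tone in a reference sequence
    (assumed interpretation of event-matching)."""
    return [i for i, tone in enumerate(reference_seq) if tone == target]

if __name__ == "__main__":
    demo = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0], [0.0, 40.0, 0.0]]
    print(reject_and_average(demo))          # third epoch exceeds 35 µV
    print(event_matched_indices(["T1", "T2", "T4", "T1", "T2", "T4"]))
    print(epoch_length_samples())
```

Under this reading, the indices returned for the variable (or patterned) sequence would be used to pick the trials of the repetitive stream that enter the event-matched average.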

Data Analysis

FFR data were analyzed using customized Matlab (The Mathworks, Natick, MA, USA) scripts adapted

from the Brainstem Toolbox (Skoe and Kraus, 2010). Before analysis, the stimulus was down-sampled

to 20,000 Hz to match the sampling rate of the response. For each FFR, we first calculated its estimated

onset delay relative to the stimulus presentation time (neural lag) due to neural conduction of the

auditory pathway. This neural lag value was computed using a cross-correlation technique that slid

the response waveform (the portion of FFR wave from 0-175 ms) and the stimulus waveform in time

with respect to one another (Liu et al., 2014). The neural lag value (in ms) was taken as the time

point in which maximum positive correlation was achieved between 6 and 12 ms, the expected latency

of the onset component of the auditory brainstem response, with the transmission delay of the ear

inserts also taken into account (Bidelman et al., 2011; Strait et al., 2012). Then, the f0 (fundamental

frequency) contour from the averaged FFR was derived using a sliding window autocorrelation-based

procedure (Krishnan et al., 2004; Wong et al., 2007). To estimate how f0 values changed through the

waveform, the portion of the waveform that corresponds to the vowel portion of the stimulus (25-175

ms of stimulus, cf. Figure 1, shifted by neural lag in the FFR) was divided into 100 bins, each 50 ms


(49 ms overlap between adjacent time bins). Each of the 100 time bins was time-shifted in 1 ms steps

with a delayed version of itself, and a Pearson’s r was calculated at each 1 ms interval. The time-lag to

achieve maximum correlation within each bin was recorded. The reciprocal of this time-lag represented

an estimate of f0 of that bin. The resulting f0 values formed a 100-point f0 contour. The f0 contour of

the stimulus was also derived separately using the same procedure, but the 25-175 ms analysis window

of the waveform was not shifted by the neural lag. Subsequent analyses focused on whether and how

neural pitch tracking varied as a function of online stimulus predictability and novelty.
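The neural lag estimate and the sliding-window autocorrelation can be sketched in a few lines. This is not the Brainstem Toolbox code: the function names, the per-sample lag search, and the f0 search range (`f_min`, `f_max`) are assumptions introduced for illustration, and the response passed to `neural_lag_ms` is assumed to begin at stimulus onset.

```python
import math

def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0

def neural_lag_ms(stimulus, response, fs_hz, min_ms=6.0, max_ms=12.0):
    """Lag within [min_ms, max_ms] that maximizes the stimulus-response correlation."""
    lo = round(min_ms * fs_hz / 1000)
    hi = round(max_ms * fs_hz / 1000)

    def r_at(lag):
        n = min(len(stimulus), len(response) - lag)
        return pearson_r(stimulus[:n], response[lag:lag + n])

    best = max(range(lo, hi + 1), key=r_at)
    return best * 1000 / fs_hz

def f0_contour(wave, fs_hz, n_bins=100, bin_ms=50, step_ms=1,
               f_min=70.0, f_max=200.0):
    """Estimate f0 per bin as the reciprocal of the best autocorrelation lag.
    Bins are bin_ms long and advance by step_ms (49 ms overlap by default).
    f_min and f_max bound the lag search and are assumed values."""
    bin_len = bin_ms * fs_hz // 1000
    step = step_ms * fs_hz // 1000
    lag_lo, lag_hi = int(fs_hz / f_max), int(fs_hz / f_min)
    contour = []
    for b in range(n_bins):
        seg = wave[b * step: b * step + bin_len]
        if len(seg) < bin_len:
            break  # not enough samples left for a full bin
        best = max(range(lag_lo, lag_hi + 1),
                   key=lambda lag: pearson_r(seg[:-lag], seg[lag:]))
        contour.append(fs_hz / best)
    return contour

if __name__ == "__main__":
    fs = 4000  # reduced rate to keep the pure-Python demo fast
    tone = [math.sin(2 * math.pi * 100 * i / fs) for i in range(800)]
    print(f0_contour(tone, fs)[:3])
```

The demo uses a steady 100 Hz tone at a reduced sampling rate so that the expected contour is easy to check; applying the same function to the stimulus and to the lag-shifted response would give the two contours compared in the subsequent analyses.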

We derived two main metrics previously used to define the fidelity of the neural responses to

linguistic pitch patterns (Wong et al., 2007; Song et al., 2008; Skoe et al., 2014; Liu et al., 2014):

1) Stimulus-to-response correlation, and 2) f0 error. Stimulus-to-response correlation (values between

-1 and 1) is the Pearson’s correlation coefficient (r) between the stimulus and response f0 contours.

It indicates the similarity between the stimulus and response f0 contours in terms of the strength and

direction of their linear relationship (Wong et al., 2007; Liu et al., 2014). F0 error (in Hz) is the mean

absolute Euclidean distance between the stimulus and response f0 contours across the 100 bins in the

autocorrelation analysis. This metric represents the pitch encoding accuracy of the FFR by reflecting

how many Hz the FFR f0 contour deviates from the stimulus f0 contour on average (Song et al., 2008;

Skoe et al., 2014).
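Both pitch-tracking metrics reduce to short computations on the two f0 contours. The sketch below follows the definitions in the text (Pearson's r between contours, and the mean absolute difference in Hz); the function names and the toy contours are illustrative assumptions.

```python
import math

def stimulus_to_response_correlation(stim_f0, resp_f0):
    """Pearson's r between the stimulus and response f0 contours."""
    n = len(stim_f0)
    ms, mr = sum(stim_f0) / n, sum(resp_f0) / n
    cov = sum((s - ms) * (r - mr) for s, r in zip(stim_f0, resp_f0))
    var_s = sum((s - ms) ** 2 for s in stim_f0)
    var_r = sum((r - mr) ** 2 for r in resp_f0)
    return cov / math.sqrt(var_s * var_r)

def f0_error(stim_f0, resp_f0):
    """Mean absolute difference (Hz) between the two contours across bins."""
    return sum(abs(s - r) for s, r in zip(stim_f0, resp_f0)) / len(stim_f0)

if __name__ == "__main__":
    stim = [99.0, 95.0, 90.0, 86.0]   # toy falling contour (Hz)
    resp = [101.0, 97.0, 92.0, 88.0]  # same shape, offset by 2 Hz
    print(stimulus_to_response_correlation(stim, resp))
    print(f0_error(stim, resp))
```

The toy example shows why the two metrics are complementary: a response contour that is uniformly offset from the stimulus has a near-perfect correlation yet a non-zero f0 error.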

In addition, the signal-to-noise ratio (SNR) of each FFR was also derived to assess whether the overall

magnitude of neural activation over the entire FFR period (relative to pre-stimulus baseline) (Russo et al.,

2004) varied as a function of stimulus context. To derive the SNR of each FFR, the root mean square

(RMS) amplitudes (the mean absolute value of all sample points of the waveform within the respective

time windows, in µV) of the FFR period (neural lag to neural lag+175 ms) and the pre-stimulus baseline

period (-50 to neural lag) of the waveform were first recorded. The quotient of the FFR RMS amplitude

and the pre-stimulus RMS amplitude was taken as the SNR value (Russo et al., 2004).
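The SNR computation can be sketched as below. The sketch assumes an epoch that starts 50 ms before stimulus onset and uses the standard root-mean-square definition (square root of the mean squared amplitude); the function names and that choice of RMS formula are assumptions rather than a description of the original scripts.

```python
import math

def rms(samples):
    """Root mean square amplitude: sqrt of the mean squared value."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))

def ffr_snr(epoch, fs_hz, neural_lag_ms, pre_ms=50, stim_ms=175):
    """Ratio of RMS in the FFR period (neural lag to neural lag + 175 ms)
    to RMS in the pre-stimulus period (-50 ms to neural lag).
    Assumes epoch[0] corresponds to -pre_ms relative to stimulus onset."""
    onset = round((pre_ms + neural_lag_ms) * fs_hz / 1000)
    end = onset + round(stim_ms * fs_hz / 1000)
    return rms(epoch[onset:end]) / rms(epoch[:onset])

if __name__ == "__main__":
    # Toy epoch at 1 kHz: flat baseline of 1 µV, response period of 3 µV
    epoch = [1.0] * 58 + [3.0] * 175 + [1.0] * 42
    print(ffr_snr(epoch, fs_hz=1000, neural_lag_ms=8))
```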

Statistical Analyses

Before subsequent parametric statistical analyses, stimulus-to-response correlation values were first

converted into Fisher’s z’ scores (Wong et al., 2007), as Pearson’s correlation coefficients are not

normally distributed. To directly examine the extent to which FFR pitch encoding and phase-locking

varied as a function of the three types of stimulus context (variable, repetitive, and patterned contexts)

overall, we conducted one-way repeated measures ANOVAs on the FFR metrics (stimulus-to-response

correlation, f0 error, and SNR). We note that directly comparing encoding across the three context


conditions (comparing the event-matched variable and repetitive conditions with the non-event-matched

patterned condition) is limited by a potential confound of presentation order. To account for this

confound, we conducted a second analysis that compared the T4 FFRs in the variable condition to

the event-matched T4 FFRs in the repetitive condition. A third analysis was conducted to compare the

T4 FFRs in the patterned condition to the event-matched T4 FFRs in the repetitive condition. Two sets

of separate paired-samples t-tests were used to compare the mean stimulus-to-response correlation, f0

error, and SNR, between variable context and repetitive context conditions, and separately, between

patterned context and its separately event-matched repetitive context conditions.
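
The two steps of this analysis, the Fisher transform and the paired comparison, can be sketched as below. The helper names `fisher_z` and `paired_t` and the per-participant correlation values are hypothetical and serve only to illustrate the computation. The sketch returns the t statistic and its degrees of freedom; obtaining a p-value would additionally require the t distribution, which is not part of the Python standard library.

```python
import math
from statistics import fmean, stdev


def fisher_z(r):
    """Fisher's r-to-z transform: z = atanh(r), defined for -1 < r < 1."""
    return math.atanh(r)


def paired_t(sample_a, sample_b):
    """Paired-samples t statistic and degrees of freedom for a - b."""
    if len(sample_a) != len(sample_b) or len(sample_a) < 2:
        raise ValueError("need two equal-length samples with at least two pairs")
    diffs = [a - b for a, b in zip(sample_a, sample_b)]
    n = len(diffs)
    # t = mean difference divided by the standard error of the differences.
    t = fmean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Hypothetical per-participant correlations (not data from this study).
variable_r = [0.50, 0.60, 0.40, 0.70]
repetitive_r = [0.60, 0.80, 0.50, 0.75]
t_stat, df = paired_t([fisher_z(r) for r in variable_r],
                      [fisher_z(r) for r in repetitive_r])
```

With this ordering of the arguments (variable minus repetitive), a negative t statistic means that the transformed correlations are lower in the variable context.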

RESULTS

Direct comparison between variable, repetitive, and patterned context conditions

One-way repeated measures ANOVA on stimulus-to-response correlation with the Greenhouse-Geisser

correction revealed significant differences between the three context conditions [F(1.19, 34.947) =

10.607, p=0.001]. Planned comparisons revealed that stimulus-to-response correlation of the repetitive

condition was significantly higher than that of the variable condition [F(1, 23) = 5.007, p=0.035], and

that stimulus-to-response correlation of the patterned condition was significantly higher than that of the

repetitive condition [F(1, 23) = 9.684, p=0.005].

The one-way repeated measures ANOVA on f0 error with the Greenhouse-Geisser correction also

revealed significant differences between the three context conditions [F(1.482, 34.095) = 8.973, p=0.002].

Planned comparisons revealed that f0 error of the repetitive condition was significantly lower than that

of the variable condition [F(1, 23) = 8.338, p=0.008], and that f0 error of the patterned condition was

marginally lower than that of the repetitive condition [F(1, 23) = 3.713, p=0.066].

The one-way repeated measures ANOVA on SNR with the Greenhouse-Geisser correction was not

significant [F(1.894, 43.569) = 1.001, p=0.372]. Planned comparisons on the SNR metric, between

repetitive and variable conditions [F(1, 23) = 0.389, p=0.539], and between repetitive and patterned

conditions were not significant [F(1, 23) = 0.753, p=0.395].
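
The fractional degrees of freedom reported above arise because the Greenhouse-Geisser correction multiplies both the effect and the error degrees of freedom by an estimated sphericity parameter, epsilon. The sketch below shows only this arithmetic. The function name and the epsilon value are hypothetical, and the sample size of 24 is inferred from the F(1, 23) planned comparisons rather than stated in this section.

```python
def greenhouse_geisser_df(epsilon, n_conditions, n_participants):
    """Corrected (effect, error) df for a one-way repeated measures ANOVA."""
    if not 0.0 < epsilon <= 1.0:
        raise ValueError("epsilon must lie in (0, 1]")
    df_effect = n_conditions - 1
    df_error = (n_conditions - 1) * (n_participants - 1)
    # Both degrees of freedom are scaled by the same epsilon.
    return epsilon * df_effect, epsilon * df_error


# Three context conditions and 24 participants; epsilon = 0.75 is hypothetical.
df1, df2 = greenhouse_geisser_df(0.75, 3, 24)
```

With epsilon equal to 1 (sphericity satisfied), the function returns the uncorrected degrees of freedom, here 2 and 46.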

Event-matched comparison between variable and repetitive context conditions

Figure 3 (A) and (B) show the grand-averaged waveforms of event-matched FFRs to the Cantonese Tone 4

/ji4/ syllable in variable context and repetitive context conditions, and the corresponding spectrograms.

We observed context-dependent effects on both of our pitch tracking metrics, specifically effects of online

stimulus predictability [Figure 3 (C) and (D)]. We found a higher

stimulus-to-response correlation [t(23) = -2.697, p=0.013], and lower f0 error [t(23) = 3.005, p=0.006] in

the repetitive context condition relative to the variable context condition, indicating that the encoding of

a dynamic pitch pattern was more faithful when online stimulus context was predictable. No significant

context-dependent effect was found for the SNR measure [t(23) = 0.615, p=0.545].

[Figure 3 image not reproduced. Panels A and B: waveforms, amplitude (µV) over time (ms), and spectrograms, frequency (Hz) over post-stimulus time (ms). Panels C and D: stimulus-to-response correlation and f0 error (Hz) for the variable and repetitive contexts.]

Figure 3: Results: Event-matched frequency-following responses in variable vs. repetitive contexts: (A) Waveforms and (B) spectrograms of grand-averaged event-matched Tone 4 (T4) frequency-following responses (FFRs) from the variable and repetitive contexts. (C) Mean stimulus-to-response correlations, and (D) mean f0 errors of event-matched T4 FFRs from the variable (left bars) and repetitive (right bars) contexts. Error bars denote ± one standard error of the mean. Note that pitch tracking of the same stimulus (T4) was more robust (higher stimulus-to-response correlations and lower f0 errors) in the repetitive context, in which its transitional probability of occurrence was higher (100% vs. 33% in the variable condition). *p < 0.05, **p < 0.01

Event-matched comparison between patterned and repetitive context conditions

Figure 4 (A) and (B) show the grand-averaged waveforms of separately event-matched FFRs to the Cantonese

Tone 4 /ji4/ syllable in patterned context and repetitive context conditions, and the corresponding

spectrograms.

[Figure 4 image not reproduced. Panels A and B: waveforms, amplitude (µV) over time (ms), and spectrograms, frequency (Hz) over post-stimulus time (ms). Panels C and D: stimulus-to-response correlation and f0 error (Hz) for the patterned and repetitive contexts.]

Figure 4: Results: Event-matched frequency-following responses in patterned vs. repetitive contexts: (A) Waveforms and (B) spectrograms of grand-averaged event-matched Tone 4 (T4) frequency-following responses (FFRs) from the patterned and repetitive contexts. (C) Mean stimulus-to-response correlations, and (D) mean f0 errors of event-matched Tone 4 FFRs from the patterned (left bars) and repetitive (right bars) contexts. Error bars denote ± one standard error of the mean. Note that despite the controlled transitional probability of a T4 occurrence in both conditions (both 100%), pitch tracking of the same stimulus (T4) was more robust (higher stimulus-to-response correlations and lower f0 errors) in the patterned context when the stimulus was not repetitive. **p < 0.01

From our metrics, a context-dependent online stimulus novelty effect on pitch encoding was observed

[Figure 4 (C) and (D)]. We found that pitch encoding was more faithful when stimulus context was

patterned rather than repetitive, indicated by a higher stimulus-to-response correlation [t(23) = 2.876,

p=0.009], and lower f0 error [t(23) = -3.057, p=0.006] in the patterned context condition relative to the

repetitive context condition. No significant context-dependent effect was found for SNR [t(23) = -0.788,

p=0.439].

DISCUSSION

Our results demonstrate a clear influence of context on the representation of lexical pitch patterns in

human FFRs. We found that the pitch tracking in FFRs to a falling lexical pitch pattern (T4) was more

faithful (higher stimulus-to-response correlation) and accurate (lower f0 error) when the pitch pattern

was presented in a repetitive context relative to a variable context. This is consistent with prior work

using segmental speech stimuli (Chandrasekaran et al., 2009; Strait et al., 2011). Interestingly, we found

that when the transitional probability of occurrence (hence stimulus predictability) was controlled, FFR

pitch tracking was more faithful and accurate when the stimulus was patterned but not repetitive.

Prior research has extensively characterized the FFR as an index of subcortical auditory plasticity

(Chandrasekaran and Kraus, 2010). However, the question of whether cortical contributions to FFRs

can be ruled out has recently attracted attention. For example, a recent magnetoencephalography (MEG)

study has shown a right-asymmetric contribution of the auditory cortex to FFRs (Coffey et al., 2016).

Nevertheless, external evidence has at least suggested that the dominant source of FFRs is subcortical.

Human FFRs have an upper limit of about 1,000 Hz (Chandrasekaran and Kraus, 2010). Animal models

in other mammalian species have demonstrated a similar range of phase-locking abilities in neuronal

populations within the inferior colliculus (IC) (Liu et al., 2006), while units in the auditory cortex

only demonstrate phase-locking up to about 250 Hz (Wallace et al., 2005). Also, FFRs, compared

to cortical auditory evoked potentials, are smaller in amplitude; FFRs also demonstrate much lower

latency variability and earlier maturation (Chandrasekaran and Kraus, 2010). A recent study using

source dipole modeling and 3-channel Lissajous analysis on high-density multi-channel-recorded FFRs

has also suggested the midbrain to be the putative generator of speech FFRs (Bidelman, 2015). Since

all this external evidence suggests that the dominant source of FFRs is subcortical, despite the

fact that contributions from the cortex to FFRs cannot be completely ruled out, we interpret our

current findings as a demonstration of the subcortical auditory system being sensitive to both stimulus

novelty and predictability. As such, there may be multiple neural mechanisms that interactively influence

online subcortical encoding.

Top-down and Local Processes, and their Interplay

Per animal models, at least two mechanisms are likely to be involved in context-dependent online

subcortical plasticity interactively: a top-down corticofugal mechanism that automatically fine-tunes

the representation of stimulus features that match top-down expectation, and a local mechanism that

enhances representation of novel information (Chandrasekaran et al., 2014). We posit that more faithful

and accurate neural pitch tracking in the repetitive context relative to the variable context may be driven

by the higher transitional probability in the repetitive condition relative to the variable condition, despite

the target stimulus being more novel in the latter condition. The high transitional probability in the

repetitive condition may thus result in greater top-down predictive coding, the effects of which may

have overridden that of local novelty enhancement, thereby enhancing subcortical pitch representation.

In this experiment, we collected FFRs using a passive listening paradigm wherein participants did not

pay overt attention to the stimulus stream. The context-related effects we found are therefore likely

to reflect a fundamental aspect of auditory processing wherein highly automatic processes are operative

even without overt attention or explicit goal-directed behavior. Future studies can systematically test

the extent to which overt attention to the stimulus pattern may modulate the magnitude of top-down

modulation. For example, testing context-dependent subcortical encoding in different states of arousal

(e.g., awake vs. asleep), or while requiring participants to explicitly track the stimulus pitch patterns

may be informative.

The poorer encoding in the repetitive condition relative to the patterned condition, wherein the

transitional probability is controlled (hence equally robust predictive coding), likely reflects reduced

local responsivity to the repetitive stimulus in the subcortical auditory system, as a result of SSA.

Prior studies have discussed the possibility that SSA modulates subcortical encoding online (Slabu

et al., 2012; Skoe et al., 2014). These studies have used auditory oddball paradigms to elicit FFRs

in the standard (high-probability) and deviant (low-probability) conditions respectively. FFR studies

using passive oddball paradigms have found that FFRs are more robust for the highly repetitive standard

stimulus, which may either suggest that SSA is not reflected in the FFR, which is sensitive to neural

phase-locking (Skoe et al., 2014), or that effects of SSA cannot be disambiguated from predictive coding

with an oddball paradigm in which stimulus novelty and probability co-vary (Slabu et al., 2012). Here,

we controlled for transitional probability while manipulating stimulus novelty in our study by employing

an event-matched comparison between patterned and repetitive contexts. Our results demonstrate a

novelty-related enhancement effect on FFRs (patterned > repetitive), suggesting that predictability and

novelty both drive context-dependent auditory plasticity.
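
The logic of this design, in which transitional probability is matched across contexts while repetition differs, can be illustrated with a short sketch. The function name, the tone labels other than T4, and the three toy sequences are assumptions for illustration; they are not the stimulus sequences used in the experiment.

```python
from collections import Counter


def transitional_probabilities(sequence, target):
    """Return P(next == target | previous) for every preceding item."""
    pair_counts = Counter(zip(sequence, sequence[1:]))
    context_counts = Counter(sequence[:-1])
    return {prev: pair_counts[(prev, target)] / total
            for prev, total in context_counts.items()}


# Hypothetical tone sequences; the actual stimulus orders are not reproduced here.
repetitive = ["T4"] * 6
patterned = ["T1", "T2", "T4"] * 3
# Every ordered pair of the three tones occurs exactly once in this sequence.
variable = ["T1", "T1", "T2", "T1", "T4", "T2", "T2", "T4", "T4", "T1"]

p_repetitive = transitional_probabilities(repetitive, "T4")
p_patterned = transitional_probabilities(patterned, "T4")
p_variable = transitional_probabilities(variable, "T4")
```

In the toy repetitive and patterned sequences, T4 is fully predictable from the preceding tone (T4 and T2, respectively), yet only the repetitive sequence repeats the same stimulus. In the toy variable sequence, T4 follows each tone one third of the time.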

An intriguing possibility is that the context-dependent effects found in this study represent an interaction

between online auditory plasticity and speech processing in the presence of background noise. The use

of babble noise in our stimulus presentation was intended to avoid a ceiling effect on pitch tracking

observed during pilot experiments where speech was presented in quiet. However, it is possible that

the presentation of speech-in-noise could invoke a greater interaction between top-down modulation and

local adaptation mechanisms than quiet presentation. For example, voice pitch is an important cue in

tagging auditory streams (Snyder and Alain, 2007). Hence, it is possible that a constant repetitive pitch in

the repetitive condition was tagged more easily as a separate stream from the background noise relative

to the variable condition. In other words, the randomly presented stimuli in the variable condition

may have resulted in greater neural inhibition because the stimuli were tagged as noise. This neural

habituation may have disproportionately impacted subcortical encoding in the variable condition in

addition to the low transitional probability. This interpretation would imply an intricate interaction not

just between predictive coding and SSA, but also a generalized noise-related neural habituation process

that has resulted in context-dependent modulation. Future studies could introduce different SNR levels

of background noise, and/or a change in pitch cues that results in a change in talker, but not tone identity,

as a factor in their experimental design to test this possibility.

Animal studies have shown that subcortical auditory encoding can be modulated by online stimulus

statistics (Dean et al., 2005; Perez-Gonzalez et al., 2005; Malmierca et al., 2009) and behaviorally

relevant auditory experience (Suga, 2008) due to an interaction of local adaptation and top-down modulation

through corticofugal pathways (Malmierca et al., 2009). Human studies have shown that subcortical

encoding of speech and music sounds is modulated by prior listening contexts which vary in statistical

features of the auditory input (Chandrasekaran et al., 2009; Strait et al., 2011; Parbery-Clark et al.,

2011; Slabu et al., 2012; Skoe et al., 2013, 2014). A growing body of literature has also demonstrated

that subcortical auditory encoding is enhanced if the signals are behaviorally relevant, such as when

they serve linguistic purposes (Krishnan and Gandour, 2009) and are ecologically valid (Xu et al.,

2006; Krishnan et al., 2009). Together, these findings are shaping an emerging view that subcortical

structures are active processors that can be modulated by online listening contexts, among other factors

such as long-term and short-term auditory experience (Chandrasekaran and Kraus, 2010), to achieve

subcortical auditory plasticity (Chandrasekaran et al., 2014). However, how online listening context

interacts with these other types of auditory experiences to shape subcortical auditory plasticity is still

an open question. Prior studies, using repetitive stimulus presentation, have demonstrated that native

tone-language speakers exhibit superior subcortical pitch encoding of lexical tones, presumably

because of their life-long native tone language experience (Krishnan et al., 2005, 2009, 2010; Bidelman

et al., 2011; Krishnan et al., 2016).

Skoe and colleagues (2014), on the other hand, have investigated the extent to which subcortical

encoding of linguistic pitch patterns (i.e. lexical tones) was modulated by listening context in native

English speakers (with no prior tone language experience), and how this context-dependent encoding

changed after participants underwent an extensive sound-to-meaning auditory training program that

rendered the tones behaviorally relevant. Before training, they found that subcortical pitch tracking

of lexical tones was enhanced when the tones were presented with a higher probability in the context,

relative to when presented with a lower probability. Interestingly, post-training, there was no probability-dependent

enhancement effect on tone encoding. They argued that this loss of probability-dependent enhancement

was due to the stimuli becoming less novel after the tones’ linguistic relevance was acquired. However,

in the current study we found context-dependent enhancement in native tone language speakers, demonstrating

that linguistic relevance is unlikely to result in a loss of context-dependent plasticity. Instead, we

propose that the lack of probability-dependent enhancement after training, seen in the study by Skoe

and colleagues (2014), may reflect a less efficient top-down modulation for the non-native learners, who

only had very limited exposure to the lexical tones.

CONCLUSIONS

In summary, the current study shows a robust influence of prior listening context that enhances online

subcortical encoding of a dynamic, time-varying linguistic pitch pattern. Encoding is more robust

when a sound is more predictable and novel in a listening context. These findings demonstrate a

complex interplay between top-down predictive coding and local SSA processes at the subcortical level

that tunes the sensory signal online based on stimulus history. These two processes are likely driven

by at least two general neurobiological mechanisms: predictive coding which enhances predictable

sensory input, and SSA which reduces responsivity to repetitive sensory input. Together, we interpret

this context-dependent encoding as indicative of an interaction between online and long-term auditory

experience that shapes neural plasticity in the subcortical auditory system.

ACKNOWLEDGMENTS

We thank Zilong Xie for comments on drafts of this manuscript, and Hilda Chan, Jason Ho, Yinyin

Liang, Christine Liu, and Grace Pan for their assistance with data collection. We also thank Oliver

Bones and Fang Liu for their help on Matlab coding.

GRANTS

This work was supported by a U.S. National Institutes of Health grant (1R01DC013315) to B. Chandrasekaran,

and by U.S. National Institutes of Health grants (R01DC008333 and R01DC013315), the Research Grants

Council of Hong Kong grants (477513 and 14117514), the Health and Medical Research Fund of Hong

Kong grant (01120616), and the Dr. Stanley Ho Medical Development Foundation to P.C.M. Wong.

DISCLOSURES

No conflicts of interest, financial or otherwise, are declared by the authors.

AUTHOR CONTRIBUTIONS

P.C.M.W. and B.C. designed the study; J.C.Y.L. and P.C.M.W. collected the data; J.C.Y.L. analyzed the

data with input from P.C.M.W. and B.C.; J.C.Y.L., P.C.M.W. and B.C. interpreted results of experiments;

J.C.Y.L., P.C.M.W. and B.C. wrote the paper.

FOOTNOTES

1. For the four participants whose electrophysiological recording for the patterned context condition took place on a separate

day, a Wilcoxon signed-rank test confirmed that their FFR signal-to-noise ratio (SNR) of the patterned context (M = 1.64, SD

= 0.32) was not statistically different (Z = -0.365, p = 0.715) from the rest of the participants who completed all recordings

on a single day (M = 1.66, SD = 0.47). A separate Wilcoxon signed-rank test confirmed that the four participants’ FFR

SNRs between the event-matched patterned (M = 1.64, SD = 0.32) and repetitive (M = 1.62, SD = 0.11) conditions were not

statistically different either (Z = 0.00, p = 1). These results suggest that the four participants’ electrophysiological recordings were

consistent across the two days of experiment, and also with participants who completed the experiments on a single day. (On

how the SNR metric was derived, readers are referred to the Methods section).

REFERENCES

Althen H, Grimm S, and Escera C. Fast detection of unexpected sound intensity decrements as

revealed by human evoked potentials. PLoS One, 6(12):e28522, 2011.

Anderson L and Malmierca M. The effect of auditory cortex deactivation on stimulus-specific

adaptation in the inferior colliculus of the rat. Eur J Neurosci, 37(1):52–62, 2013.

Bidelman GM. Multichannel recordings of the human brainstem frequency-following response: scalp

topography, source generators, and distinctions from the transient ABR. Hear Res, 323:68–80, 2015.

Bidelman GM, Gandour JT, and Krishnan A. Musicians and tone-language speakers share enhanced

brainstem encoding but not perceptual benefits for musical pitch. Brain Cogn, 77(1):1–10, 2011.

Cacciaglia R, Escera C, Slabu L, Grimm S, Sanjuan A, Ventura-Campos N, and Avila

C. Involvement of the human midbrain and thalamus in auditory deviance detection.

Neuropsychologia, 68:51–58, 2015.

Chandrasekaran B, Hornickel J, Skoe E, Nicol T, and Kraus N. Context-dependent encoding in

the human auditory brainstem relates to hearing speech in noise: implications for developmental

dyslexia. Neuron, 64(3):311–319, 2009.

Chandrasekaran B and Kraus N. The scalp-recorded brainstem response to speech: Neural origins

and plasticity. Psychophysiology, 47(2):236–246, 2010.

Chandrasekaran B, Skoe E, and Kraus N. An integrative model of subcortical auditory plasticity.

Brain Topogr, 27(4):539–552, 2014.

Coffey EB, Herholz SC, Chepesiuk AM, Baillet S, and Zatorre RJ. Cortical contributions to the

auditory frequency-following response revealed by MEG. Nat Commun, 7, 2016.

Dean I, Harper NS, and McAlpine D. Neural population coding of sound level adapts to stimulus

statistics. Nat Neurosci, 8(12):1684–1689, 2005.

Escera C, Leung S, and Grimm S. Deviance detection based on regularity encoding along the auditory

hierarchy: electrophysiological evidence in humans. Brain Topogr, 27(4):527–538, 2014.

Krishnan A and Gandour JT. The role of the auditory brainstem in processing linguistically-relevant

pitch patterns. Brain Lang, 110(3):135–148, 2009.

Krishnan A, Gandour JT, and Bidelman GM. The effects of tone language experience on pitch

processing in the brainstem. J Neurolinguistics, 23(1):81–95, 2010.

Krishnan A, Gandour JT, Bidelman GM, and Swaminathan J. Experience dependent neural

representation of dynamic pitch in the brainstem. Neuroreport, 20(4):408, 2009.

Krishnan A, Gandour JT, and Suresh CH. Language-experience plasticity in neural representation

of changes in pitch salience. Brain Res, 1637:102–117, 2016.

Krishnan A, Xu Y, Gandour J, and Cariani P. Encoding of pitch in the human brainstem is sensitive

to language experience. Brain Res Cogn Brain Res, 25(1):161–168, 2005.

Krishnan A, Xu Y, Gandour JT, and Cariani PA. Human frequency-following response:

representation of pitch contours in Chinese tones. Hear Res, 189(1):1–12, 2004.

Large EW and Jones MR. The dynamics of attending: How people track time-varying events. Psychol

Rev, 106(1):119, 1999.

Lehmann A, Arias DJ, and Schonwiesner M. Tracing the neural basis of auditory entrainment.

Neuroscience, 337:306–314, 2016.

Liu F, Maggu AR, Lau JC, and Wong PC. Brainstem encoding of speech and musical stimuli in

congenital amusia: evidence from Cantonese speakers. Front Hum Neurosci, 8, 2014.

Liu LF, Palmer AR, and Wallace MN. Phase-locked responses to pure tones in the inferior colliculus.

J Neurophysiol, 95(3):1926–1935, 2006.

Lupyan G and Clark A. Words and the world: predictive coding and the language-perception-cognition

interface. Curr Dir Psychol Sci, 24(4):279–284, 2015.

Malmierca MS, Cristaudo S, Perez-Gonzalez D, and Covey E. Stimulus-specific adaptation in the

inferior colliculus of the anesthetized rat. J Neurosci, 29(17):5483–5493, 2009.

Mok PP, Zuo D, and Wong PW. Production and perception of a sound change in progress: Tone

merging in Hong Kong Cantonese. Lang Var Change, 25(03):341–370, 2013.

Naatanen R and Picton T. The N1 wave of the human electric and magnetic response to sound: a

review and an analysis of the component structure. Psychophysiology, 24(4):375–425, 1987.

Natan RG, Briguglio JJ, Mwilambwe-Tshilobo L, Jones SI, Aizenberg M, Goldberg EM, and

Geffen MN. Complementary control of sensory adaptation by two types of cortical interneurons.

Elife, 4:e09868, 2015.

Parbery-Clark A, Strait D, and Kraus N. Context-dependent encoding in the auditory

brainstem subserves enhanced speech-in-noise perception in musicians. Neuropsychologia,

49(12):3338–3345, 2011.

Perez-Gonzalez D, Malmierca MS, and Covey E. Novelty detector neurons in the mammalian auditory

midbrain. Eur J Neurosci, 22(11):2879–2885, 2005.

Picton T. Human auditory evoked potentials. Plural Publishing, San Diego, 2010.

Russo N, Nicol T, Musacchia G, and Kraus N. Brainstem responses to speech syllables. Clin

Neurophysiol, 115(9):2021–2030, 2004.

Skoe E, Chandrasekaran B, Spitzer ER, Wong PC, and Kraus N. Human brainstem plasticity: the

interaction of stimulus probability and auditory learning. Neurobiol Learn Mem, 109:82–93, 2014.

Skoe E and Kraus N. Auditory brainstem response to complex sounds: a tutorial. Ear Hear, 31(3):302,

2010.

Skoe E, Krizman J, Spitzer E, and Kraus N. The auditory brainstem is a barometer of rapid auditory

learning. Neuroscience, 243:104–114, 2013.

Slabu L, Escera C, Grimm S, and Costa-Faidella J. Early change detection in humans as revealed by

auditory brainstem and middle-latency evoked potentials. Eur J Neurosci, 32(5):859–865, 2010.

Slabu L, Grimm S, and Escera C. Novelty detection in the human auditory brainstem. J Neurosci,

32(4):1447–1452, 2012.

Snyder JS and Alain C. Toward a neurophysiological theory of auditory stream segregation. Psychol

Bull, 133(5):780, 2007.

Song JH, Skoe E, Wong PC, and Kraus N. Plasticity in the adult human auditory brainstem following

short-term linguistic training. J Cogn Neurosci, 20(10):1892–1902, 2008.

Strait DL, Hornickel J, and Kraus N. Subcortical processing of speech regularities underlies reading

and music aptitude in children. Behav Brain Funct, 7(1):1, 2011.

Strait DL, Parbery-Clark A, Hittner E, and Kraus N. Musical training during early childhood

enhances the neural encoding of speech in noise. Brain Lang, 123(3):191–201, 2012.

Suga N. Role of corticofugal feedback in hearing. J Comp Physiol A, 194(2):169–183, 2008.

Wallace MN, Shackleton TM, Anderson LA, and Palmer AR. Representation of the purr call in the

guinea pig primary auditory cortex. Hear Res, 204(1):115–126, 2005.

Wang AL, Mouraux A, Liang M, and Iannetti GD. Stimulus novelty, and not neural refractoriness,

explains the repetition suppression of laser-evoked potentials. J Neurophysiol, 104(4):2116–2124,

2010.

Winer JA, Larue DT, Diehl JJ, and Hefti BJ. Auditory cortical projections to the cat inferior

colliculus. J Comp Neurol, 400(2):147–174, 1998.

Winkler I, Denham SL, and Nelken I. Modeling the auditory scene: predictive regularity

representations and perceptual objects. Trends Cogn Sci, 13(12):532–540, 2009.

Wong PC and Diehl RL. Perceptual normalization for inter-and intratalker variation in Cantonese level

tones. J Speech Lang Hear Res, 46(2):413–421, 2003.

Wong PC, Skoe E, Russo NM, Dees T, and Kraus N. Musical experience shapes human brainstem

encoding of linguistic pitch patterns. Nat Neurosci, 10(4):420–422, 2007.

Xu Y, Krishnan A, and Gandour JT. Specificity of experience-dependent pitch representation in the

brainstem. Neuroreport, 17(15):1601–1605, 2006.
