3rd tone sandhi in Standard Chinese: A corpus approach
Jiahong Yuan¹ and Yiya Chen²
¹University of Pennsylvania, ²Leiden University
Abstract
In Standard Chinese, a low tone (Tone3) is often realized with a rising F0 contour
before another low tone; this tone change is known as the 3rd tone sandhi. This study
investigated the acoustic characteristics of the 3rd tone sandhi in Standard Chinese in
telephone conversations and broadcast news speech. The sandhi rising tone was found to
be different from the lexical rising tone (Tone2) in disyllabic words in two measures: the
magnitude of the F0 rise and the time span of the F0 rise. We also found that word
frequency affected the realization of the sandhi rising tone. Specifically, the sandhi rising
tone in highly frequent words exhibited a smaller F0 rise (i.e., a greater difference from
the lexical rising tone) than that observed in less frequent words. This result suggests that
different processes may be involved in producing high- vs. low-frequency words in
Chinese.
Key Words: Tone, Tone sandhi, Conversation, Radio news, Corpus
1. Introduction
In lexical tone languages, in which fundamental frequency (F0) changes differentiate
word meanings, tones frequently undergo changes in connected speech, and surface with
F0 contours that differ from the canonical tonal shapes produced in isolation. This tonal
change process is commonly referred to as tone sandhi. During the last two decades, a
significant amount of research has been conducted regarding tone sandhi in various
Chinese dialects; this research culminated in the work by M. Chen (2000). Although
previous studies have greatly improved our understanding of the tone sandhi phenomena
in general, the weakness in most (if not all) studies is that the generalizations are
primarily based on introspective judgments or laboratory speech of a few speakers. Thus,
it is desirable to complement the existing literature by examining the realization of tone
sandhi in large data corpora with naturally occurring speech. The specific sandhi
phenomenon on which we focus in this paper is the 3rd (low) tone sandhi in Standard
Chinese, in which the first tone in a sequence of two low tones surfaces with a rising F0,
which is comparable to or neutralized with the 2nd lexical tone (rising) in the language.
Previous linguistic studies on the 3rd tone sandhi have been mainly concerned with two
aspects of the phenomenon. The first aspect concerns the formation of the tone sandhi
domain (e.g., Shih, 1986; Zhang, 1988; Chen, 2000; Duanmu, 2000). The general
consensus in the literature is that disyllabic words with two low tones form a 3rd tone
sandhi domain, in which the first low tone changes to a sandhi rising (SR) tone. The
application of the 3rd tone sandhi across linguistic boundaries above the word level is
known to be determined by a number of factors such as syntactic structure, information
structure, speech prosody, and speaking rate (Speer et al., 1989; Shen, 1994; Shih, 1997;
Chen, 2003; Kuo et al., 2007).
The second aspect of 3rd tone sandhi concerns the exact phonetic nature of the derived
SR tone as compared with the lexical rising (LR) tone. The first well-known report
pertaining to the 3rd tone sandhi is Chao (1948), who described the change as the
replacement of the low tone with an LR tone. This view was challenged by two reports
that were published during the same period (Hockett, 1947; Martin, 1957); both
researchers described the SR tone in a stressed position as a new category that is similar
but not identical to the LR tone. In recent decades, the debate has been whether there is
indeed complete neutralization between the SR and LR tones, and if complete
neutralization is not present, then what are the acoustic parameters that differentiate these
tones? Zee (1980), who conducted the first instrumental investigation of the 3rd tone
sandhi to our knowledge, demonstrated that derived SR tones are pronounced with a
lower dip as well as a lower ending F0 than LR tones on the basis of two native speakers
of Beijing Mandarin. This subtle difference between SR and LR tones was supported by
later acoustic studies (Kratochvil, 1984; Shen, 1990; Xu, 1993, 1997; Peng, 2000; Kuo et
al., 2007), although varying magnitudes of the difference between the two tones have been
reported. Based on a review of the literature and an acoustic study of the 3rd tone sandhi
in Taiwan Mandarin, Myers and Tsay (2003) proposed that the 3rd tone sandhi is
processed differently by different groups of Mandarin speakers: native speakers of
Beijing Mandarin apply the 3rd tone sandhi by phonetically modifying Tone3 so that it
sounds more similar to Tone2 whereas speakers of other varieties of Mandarin
categorically replace Tone3 with Tone2.
Despite the consistent trend of differences reported between SR and LR tones, it has
remained unclear whether listeners can hear the difference. Wang and Li (1967)
conducted the first perceptual experiment to test the ability of listeners to differentiate
between SR and LR tones. In the experiment, the subjects were asked to identify whether
a prerecorded word was an SR-Tone3 word (e.g., qi3ma3, ‘at least’) or an LR-Tone3 word
(e.g., qi2ma3, ‘to ride on a horse’). Their results demonstrated that the overall percentage
of accuracy ranged from 49.2% to 54.2% for the 14 listeners who did not participate in
the recording of the stimuli, suggesting that listeners cannot differentiate SR and LR
tones in word identification experiments. However, for the two subjects who recorded the
stimuli, the overall percentages of accuracy were above the chance level at 56.9% and
67.3%, respectively. Peng (2000) conducted a similar word identification experiment and
analyzed the identification results based on the signal detection theory (Macmillan and
Creelman, 2005). In her results, the mean sensitivity index A’ of the 15 listeners was
0.50, which suggested a random guess. However, there were two problems with her
conclusion. First, the standard deviation of A’ was very high (0.17); thus, there was a
significant variability in performance among the listeners. The second problem is that she
calculated the ratios of true and false positives in a manner that differs from that typically
applied in signal detection theory¹. Speer and Xu (2008) examined the time-course of
the resolution of lexical ambiguity from the 3rd tone sandhi by tracking the eye-movements
of listeners during a word-monitoring task. Surprisingly, they found that
when listeners heard an LR-Tone3 sequence, they made early glances at the character for
an SR tone, and when they heard an SR-Tone3 sequence, they made early glances at the
character for an LR tone. Their result suggested that the listeners were sensitive to the
fine-grained phonetic differences between LR and SR tones.
¹ In the study by Peng (2000), both the identification of Tone3 for underlying Tone2 and the identification of Tone2 for underlying Tone3 were considered false alarms. In standard signal detection theory, however, only one of them should be treated as a false alarm, depending on which tone is treated as ‘positive’ or ‘alarm’.
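The scoring issue raised for the word identification experiments can be made concrete with a minimal sketch. It treats one tone (here the SR tone, an arbitrary choice) as the "signal" category, derives hit and false-alarm rates from a response table, and computes the nonparametric sensitivity index A' with the commonly used computing formula. The function names and all response counts are hypothetical and are not taken from any of the studies cited above.

```python
def rates(n_hit, n_miss, n_fa, n_cr):
    """Return (hit rate, false-alarm rate) with SR treated as the signal.

    n_hit:  SR stimuli answered "SR"      n_miss: SR stimuli answered "LR"
    n_fa:   LR stimuli answered "SR"      n_cr:   LR stimuli answered "LR"
    """
    return n_hit / (n_hit + n_miss), n_fa / (n_fa + n_cr)


def a_prime(hit_rate, fa_rate):
    """Nonparametric sensitivity index A' (0.5 = chance, 1.0 = perfect)."""
    h, f = hit_rate, fa_rate
    if h == f:
        return 0.5
    if h > f:
        return 0.5 + ((h - f) * (1 + h - f)) / (4 * h * (1 - f))
    # Below-chance performance mirrors the formula around 0.5.
    return 0.5 - ((f - h) * (1 + f - h)) / (4 * f * (1 - h))


# Hypothetical listener: 50 SR stimuli and 50 LR stimuli.
h, f = rates(n_hit=30, n_miss=20, n_fa=28, n_cr=22)
print(f"hit rate = {h:.2f}, false-alarm rate = {f:.2f}, A' = {a_prime(h, f):.3f}")
```

Only the "SR" responses to LR stimuli count as false alarms here. The mirror-image error ("LR" responses to SR stimuli) is a miss, which is the distinction the footnote draws.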
The studies reviewed above were all based on laboratory speech, excluding the work
of Kratochvil (1984), which analyzed only one speaker. While we in general agree with
the importance and validity of laboratory speech in uncovering phonological patterns and
phonetic realizations (Xu, 2010), the small acoustic differences between the SR and LR
tones that were found in the previous studies must be examined using more naturally
occurring speech. The same argument has been offered regarding the nature of the
“incomplete neutralization” of the voicing contrast in a number of languages, such as
Dutch, German, and Catalan, in which underlying voiced word-final obstruents are
devoiced as a phonological process; however, phonetic studies have found small
differences between underlying voiced and voiceless word-final obstruents. There has
been an extensive debate in the literature regarding whether the incomplete neutralization
of final voicing was an experimental artifact of orthography or laboratory speech
(Fourakis and Iverson, 1984; Jassem and Richter, 1989; Port and Crawford, 1989;
Ernestus and Baayen, 2006; Warner et al., 2006; Kleber et al., 2010).
The goal of this study was to examine the acoustic difference between SR and LR
tones in large corpora of natural speech by expanding our preliminary work reported in
Chen and Yuan (2007). The use of large corpora also provides an opportunity to examine
possible word frequency effects on the acoustic realization of SR and LR tones. The
effect of frequency on speech production has been repeatedly reported in corpus studies
(e.g., Bybee 2002, on word-final /t/ and /d/ deletion rates; Patterson and Connine 2001,
on flap production; and Aylett and Turk 2006, on syllable duration). Zhao and Jurafsky
(2009) found that low-frequency words with mid-range tones in Cantonese are produced
with higher F0 than high-frequency words and that the F0 trajectories of less frequent
words are more dispersed than those of their more frequent counterparts. Zhang and Lai
(2010) demonstrated that “wug” words (i.e., pseudowords) are more resistant to the
application of the 3rd tone sandhi than real words for Mandarin speakers. For the purpose
of this paper, we examined the possible effect of word frequency on the acoustic
realization of SR tones as compared with that for LR tones.
2. Method
2.1. Data
Two large speech corpora were utilized in this study: the HKUST Mandarin Telephone
Speech (LDC2005S15) and the HUB4 Mandarin Broadcast News Speech (LDC98S73).
Broadcast news speech is formal read speech that is produced by well-trained
professional speakers of Standard Chinese; telephone conversation speech is produced by
typical speakers of Standard Chinese who may have different dialectal accents. Syllable
boundaries were automatically obtained through forced alignment using the Penn
Phonetics Lab Forced Aligner (Yuan and Liberman 2008). The CALLHOME Mandarin
Chinese Lexicon (LDC96L15) was used to identify words and tonal sequences from the
corpora.
We analyzed disyllabic words with four tonal sequences: low-low (T3+T3), low-rising
(T3+T2), rising-low (T2+T3), and rising-rising (T2+T2). The main comparison in this
paper is the realization of Tone3 and Tone2 when both tones are followed by Tone3. As a
control, we compared T3+T2 and T2+T2 sequences. Table 1 lists the total number of
tonal sequences used in the study.
Table 1: Total number of tokens for different tonal sequences.
Tonal sequence    HKUST (tel. conversations)    HUB4 (radio news)
(T2+T3)word       8,113                         2,592
(T3+T3)word       3,938                         3,090
(T2+T2)word       6,515                         4,685
(T3+T2)word       8,112                         4,852
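Selecting the four tonal sequences from a pronunciation lexicon can be sketched as follows. The sketch assumes that each lexicon entry supplies tone-numbered pinyin syllables (such as qi3 and ma3); the real lexicon format may differ, and the toy entries and the function name are illustrative only.

```python
from collections import Counter

# The four disyllabic tonal sequences analyzed in the study.
TARGET_SEQUENCES = {
    ("3", "3"): "T3+T3",
    ("3", "2"): "T3+T2",
    ("2", "3"): "T2+T3",
    ("2", "2"): "T2+T2",
}


def tonal_sequence(syllables):
    """Return the tonal-sequence label of a disyllabic word, or None.

    `syllables` is a list of tone-numbered pinyin syllables, e.g. ["qi3", "ma3"].
    Words that are not disyllabic or that carry other tones yield None.
    """
    if len(syllables) != 2 or not all(syllables):
        return None
    tones = tuple(s[-1] for s in syllables)
    return TARGET_SEQUENCES.get(tones)


# Toy lexicon (hypothetical entries in an assumed format).
toy_lexicon = {
    "qi3ma3": ["qi3", "ma3"],
    "qi2ma3": ["qi2", "ma3"],
    "mei3guo2": ["mei3", "guo2"],
    "xue2xi2": ["xue2", "xi2"],
    "lao3shi1": ["lao3", "shi1"],
}

counts = Counter(
    label for label in map(tonal_sequence, toy_lexicon.values()) if label
)
print(counts)
```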
2.2. Acoustic Measurements
We first extracted the F0 contour of the target tone, located its minimum F0 and the F0
at the offset of the tone-bearing syllable, and then calculated two measurements. One
measurement is the LogRange of the F0 rise, which is the log of the ratio between the F0
at the syllable offset and the minimum F0. The other measurement is the percentage of the
F0 rise duration derived by calculating the percentage of the duration between the
minimum F0 and the syllable offset over the duration of the tone-bearing syllable. All
measurements were automatically extracted using esps/get_f0 and Python scripts.
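The two measurements can be illustrated with a minimal sketch. It is not the authors' script. It assumes that unvoiced frames have already been removed, that the F0 at the syllable offset is the last voiced F0 sample in the syllable, and that the logarithm is the natural logarithm (the text does not state the base). The function name and the sample contour are hypothetical.

```python
import math


def rise_measures(times, f0, syl_start, syl_end):
    """Compute (LogRange of the F0 rise, percentage of F0 rise duration).

    times, f0: time stamps (s) and F0 values (Hz) of the voiced frames
               inside the tone-bearing syllable, in chronological order.
    syl_start, syl_end: syllable boundaries (s) from forced alignment.
    """
    i_min = min(range(len(f0)), key=f0.__getitem__)  # index of the F0 minimum
    f0_min = f0[i_min]
    f0_offset = f0[-1]  # assumption: last voiced frame stands for the offset
    log_range = math.log(f0_offset / f0_min)
    rise_pct = 100.0 * (syl_end - times[i_min]) / (syl_end - syl_start)
    return log_range, rise_pct


# Hypothetical contour: F0 dips to 100 Hz mid-syllable, then rises to 130 Hz.
times = [0.00, 0.05, 0.10, 0.15, 0.20]
f0 = [120.0, 110.0, 100.0, 115.0, 130.0]
log_range, rise_pct = rise_measures(times, f0, syl_start=0.0, syl_end=0.2)
print(f"LogRange = {log_range:.3f}, rise duration = {rise_pct:.1f}%")
```

Under these definitions a larger LogRange means a larger F0 rise, and a larger percentage means that the rise starts earlier in the syllable.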
3. Results
3.1. Acoustic Realizations
We first examined the acoustic realization of the first syllable in the disyllabic words.
Figure 1 shows that in both telephone conversations and broadcast news, when the
following tone was rising (i.e. in X+T2), X differed significantly in the magnitude of the
F0 rise between the rising and low tones. When the following tone was low (i.e. in
X+T3), the low tone exhibited a greater F0 rise (SR) than in the X+T2
context, but X remained significantly different in the magnitude of the F0 rise between the
LR and SR tones (telephone conversations: t(7865.5) = 3.45, p = 0.001; broadcast news:
t(5439.0) = 7.1, p < 0.001). The F0 peak of the SR tone was lower than that of the LR
tone. This result is similar to what Peng (2000) observed and compatible with the general
impression in the literature that the rise in the SR tone is slightly less steep than that of
the LR tone.
Figure 1: Means (and ± two standard errors) of the LogRange of the F0 rise within rising vs. low tones when
the tone-bearing syllable either precedes a low tone or a rising tone.
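The fractional degrees of freedom in the reported statistics (e.g., t(7865.5)) are consistent with an unequal-variance (Welch) t-test, although the text does not name the test. Under that assumption, the sketch below computes the t statistic, the Welch-Satterthwaite degrees of freedom, and the mean ± two standard errors plotted in the figures. The sample values are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's unequal-variance t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    va, vb = variance(a) / na, variance(b) / nb  # squared standard errors
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


def mean_pm_2se(x):
    """Return (mean - 2 SE, mean, mean + 2 SE) for a sample."""
    m = mean(x)
    se = math.sqrt(variance(x) / len(x))
    return m - 2 * se, m, m + 2 * se


# Hypothetical LogRange values for LR and SR tokens.
lr_tokens = [0.31, 0.27, 0.35, 0.29, 0.33, 0.30]
sr_tokens = [0.26, 0.30, 0.24, 0.28, 0.25, 0.27]
t, df = welch_t(lr_tokens, sr_tokens)
print(f"t({df:.1f}) = {t:.2f}")
print("LR mean ± 2 SE:", mean_pm_2se(lr_tokens))
```

A p-value can then be read from the t distribution with `df` degrees of freedom. `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same test directly.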
In both telephone conversations and broadcast news, we further observed a significant
difference between the SR and LR tones regarding the percentage of the F0 rise duration
(i.e., the distance from the F0 minimum to the end of the tone-bearing syllable as a
percentage of the total duration of the syllable). Figure 2 shows that when the following
tone was a low tone (i.e. X+T3), the underlying low tone became more like a rising tone
(i.e., an SR tone). However, the percentage of the rise duration of the LR tone was greater
than that of the SR tone (telephone conversations: t(8113.5) = 13.8, p < 0.001; broadcast
news: t(5557.0) = 7.1, p < 0.001). This result indicates that the rise onset of an SR tone is
slightly later than that of an LR tone.
Figure 2: Means (and ± two standard errors) of the percentage of the F0 rise duration over the tone-bearing unit
within rising vs. low tones when the tone-bearing syllable either precedes a low tone or a rising tone.
In summary, both the LogRange of the F0 rise and the percentage of the F0 rise
duration suggest that despite the great similarity between the SR and LR tones, they are
indeed different in the contexts of both broadcast news and telephone conversations.
Thus, the results from both laboratory speech and corpus data conjointly suggest that a
fine phonetic difference exists between the SR and LR tones.
3.2. Frequency Effects
We further examined whether word frequency affects the realization of the SR tone vs.
the LR tone. We focused on two tonal sequences: low-low (i.e., T3+T3) and rising-low
(i.e., T2+T3). For each tonal sequence, we separated the disyllabic words into four
frequency bins (0-10, 10-100, 100-1000, and more than 1000), based on the frequency
counts of 3,431,707 words in the Xinhua newswire. The frequency counts were provided
by the CALLHOME Mandarin Chinese Lexicon. Figure 3 shows that for the low-low
tonal sequence, there was a significant decrease in the LogRange F0 rise of the first low
tone (i.e., the SR tone) for words with high frequency (i.e., more than 1000). For the
rising-low tonal sequence (i.e., T2 preceding T3), such a word frequency effect does not
hold for the LR tone.
Figure 3: Means (and ± two standard errors) of the LogRange of the F0 rise of the lexical rising tone vs. the
sandhi rising tone within different word frequency ranges.
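The frequency binning can be sketched as below. The bin edges follow the four ranges given in the text. Counts that fall exactly on a boundary (10, 100, or 1000) are assigned to the lower bin, which is an assumption. The token list is hypothetical.

```python
from collections import defaultdict
from statistics import mean


def frequency_bin(count):
    """Map a word's corpus frequency count to one of four bins."""
    if count <= 10:
        return "<=10"
    if count <= 100:
        return "10-100"
    if count <= 1000:
        return "100-1000"
    return ">1000"


def mean_by_bin(tokens):
    """Average a measurement within each frequency bin.

    tokens: iterable of (word_frequency_count, measurement) pairs,
            e.g. the LogRange of the F0 rise for each word token.
    """
    grouped = defaultdict(list)
    for count, value in tokens:
        grouped[frequency_bin(count)].append(value)
    return {label: mean(values) for label, values in grouped.items()}


# Hypothetical (frequency count, LogRange) pairs for T3+T3 tokens.
toy_tokens = [(3, 0.30), (8, 0.32), (45, 0.29), (600, 0.28), (2500, 0.21), (4000, 0.23)]
print(mean_by_bin(toy_tokens))
```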
Figure 4 compares the LR and SR tones for different word frequency bins separately.
The SR tone clearly had a smaller F0 rise than the LR tone for high-frequency
words in both telephone conversations and broadcast news. For low-frequency words,
however, the difference between the two tones was not statistically significant. The
results of t-tests comparing the LR and SR tones for different word frequency bins are
presented in Table 2.
Figure 4: Means (and ± two standard errors) of the LogRange of the F0 rise of the lexical rising tone vs. the
sandhi rising tone within different word frequency ranges.
Table 2: The results of t-tests comparing T2 preceding T3 and T3 preceding T3 on the LogRange of the F0 rise.
Corpus                     Word freq.   Number of tokens              t-test
Telephone conversations    <= 10        T2+T3: 2175; T3+T3: 842       t = -0.52, p = 0.6
Telephone conversations    10-100       T2+T3: 2376; T3+T3: 958       t = -1.2, p = 0.22
Telephone conversations    100-1000     T2+T3: 3393; T3+T3: 920       t = 0.82, p = 0.41
Telephone conversations    > 1000       T2+T3: 169; T3+T3: 1218       t = 3.53, p < 0.001
Broadcast news             <= 10        T2+T3: 385; T3+T3: 335        t = -0.85, p = 0.40
Broadcast news             10-100       T2+T3: 274; T3+T3: 339        t = 1.17, p = 0.24
Broadcast news             100-1000     T2+T3: 1447; T3+T3: 1171      t = 2.01, p = 0.04
Broadcast news             > 1000       T2+T3: 486; T3+T3: 1245       t = 8.50, p < 0.001
Regarding the percentage of the F0 rise duration, the effect of word frequency is less
clear. Nonetheless, as shown in Figure 5, there was a greater difference between the SR
and LR tones for high-frequency words, especially in broadcast news speech.
Figure 5: Means (and ± two standard errors) of the percentage of the F0 rise duration of the lexical rising tone
and the sandhi rising tone within different word frequency ranges.
4. Discussion
The SR and LR tones are acoustically different in both spontaneous telephone
conversations and formal broadcast news. This result is consistent with those of previous
studies that used laboratory speech and demonstrates that the difference is not an artifact
of orthography or laboratory speech. Our study also shows that word frequency affects
the acoustic realization of the SR tone. The SR tone differs more from the LR tone in
high-frequency words, especially with respect to the magnitude of the F0 rise.
Although they appear to be sensitive to the fine acoustic difference between the SR
and LR tones at the subconscious or unconscious level (Speer and Xu, 2008), native
listeners frequently fail to distinguish between the two tones at the conscious level (Wang
and Li, 1967; Peng, 2000). This type of mismatch between speech production and
perception is not a rare phenomenon. Many studies have reported a class of situations in
sound change called “near-mergers” (Labov et al., 1972, 1991; Yu 2007). In these
situations, “speakers consistently reported that two classes of sounds were ‘the same,’ yet
consistently differentiated them in production” (Labov et al., 1991: p. 33). Studies on
“incomplete neutralization” also found that listeners often failed to identify the small
acoustic distinction between the voicing contrasts that are not completely neutralized. For
example, Port and O’Dell (1985) reported that German listeners could distinguish the
syllable-final voiced and voiceless pairs, a well-known example of “incomplete
neutralization”, with only about 60% accuracy (although this number was interpreted as
significantly better than chance in the paper).
Why do listeners fail to identify the difference between the SR and LR tones? From
the perspective of the traditional categorical perception theory (Liberman et al., 1957)
and the feature-based model of lexical access (Stevens, 2002), the SR and LR tones are
perceived as belonging to the same category, the mental representation of which may
consist of a set of tonal features (Wang, 1967; Yip, 1980; Bao, 1999) but contains no
detailed phonetic information. In this framework, phonological encoding precedes mental
lexical access. The phonetic details are the input to the phonological encoding process,
and they are not available in the output of the process. From the perspective of an
exemplar-based model (Johnson, 1997; Pierrehumbert, 2001, 2002), however, the mental
lexicon stores rich and detailed acoustic information. In this framework, “each category is
represented in memory by a large cloud of remembered tokens of that category”
(Pierrehumbert, 2001: p. 140). Although the SR and LR tones are slightly different in
terms of their “means” (i.e., the centers of distribution), their probability distributions
greatly overlap. Both frameworks may explain why listeners often fail to differentiate
between the two tones.
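The overlap argument can be illustrated numerically. The sketch below models a single acoustic cue for each tone as a normal distribution with equal variance. This is a simplifying assumption made only for illustration, and the means and standard deviation are hypothetical values, not estimates from the corpora. For such distributions, even an ideal observer who places the category boundary midway between the two means identifies the tones with accuracy Φ(d'/2), where d' is the difference between the means in standard-deviation units.

```python
from statistics import NormalDist


def ideal_identification_accuracy(mean_a, mean_b, sd):
    """Best achievable two-way identification accuracy for equal-variance
    normal distributions with equally likely categories."""
    d_prime = abs(mean_a - mean_b) / sd
    return NormalDist().cdf(d_prime / 2)


# Hypothetical LogRange distributions for the SR and LR tones.
sr = NormalDist(mu=0.28, sigma=0.15)
lr = NormalDist(mu=0.31, sigma=0.15)

overlap = sr.overlap(lr)  # shared area under the two density curves
accuracy = ideal_identification_accuracy(sr.mean, lr.mean, sr.stdev)
print(f"overlap = {overlap:.2f}, ideal accuracy = {accuracy:.2f}")
```

With a mean difference this small relative to the spread, the two distributions share most of their area and the ideal accuracy stays only slightly above 50%. Near-chance word identification is therefore compatible with a reliable difference in production means.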
How does a native speaker, without consciously perceiving the difference, maintain
the subtle difference between the SR and LR tones? How does a child acquire the two
tones? Much research needs to be done to answer these questions. Our study indicates
that word frequency affects the acoustic realization of the SR tone. This result suggests
that the production of the 3rd tone sandhi may involve word-dependent processes. There
is no doubt that the 3rd tone sandhi is applied on-line in speech production because it
appears across word boundaries, and its application is determined by factors such as
syntactic and information structure. It is, however, unclear whether the production of the
3rd tone sandhi involves only a postlexical process. It is possible that the two syllables in
high-frequency disyllabic words are stored in long-term memory together as one unit,
whereas less frequent ones may be stored as two independent syllables and assembled on-
line in speech production (this hypothesis is logical considering that words are not well-
defined in Chinese). Following this hypothesis, high-frequency words with an underlying
low-low tonal sequence are stored in the mental lexicon as a rising-low sequence, the
rising tone of which is slightly different from the LR tone. The postlexical process of the
3rd tone sandhi is, however, a categorical shift from a low tone to an LR tone. This
hypothesis may explain our result that the SR tone differs more from the LR tone in high-
frequency words, but appears to contradict the result of Zhang and Lai (2010), who found
that “wug” words are more resistant to the application of the 3rd tone sandhi than real
words. Another hypothesis, proposed by Loui et al. (2008) in their study on tone-
deafness, is that multiple neural pathways have evolved to combine consciously and
unconsciously obtained information for sound perception and production. Their study
found that tone-deaf individuals, who could not consciously perceive pitch differences,
could produce pitch intervals in target directions. Additional studies are necessary to test
and refine these hypotheses.
5. Conclusions
This paper examined the 3rd tone sandhi phenomenon in large corpora of natural
speech and analyzed both telephone conversations and formal broadcasts. Our results
confirm previous reports and findings that there are indeed low-level acoustic differences
between the sandhi rising and lexical rising tones both in terms of the magnitude of the F0
rise and the rise duration. Our study demonstrates that the difference is not an artifact of
orthography or laboratory speech. We also found that given a disyllabic word, which is a
3rd tone sandhi domain, word frequency affected the realization of the sandhi rising tone.
Specifically, the sandhi rising tone in highly frequent words exhibited a smaller F0 rise
(i.e., it differs more from the lexical rising tone) than in less frequent words. This result
suggests that different processes may be involved in producing high- vs. low-frequency
words in Chinese.
6. Acknowledgements
Both authors contributed equally to the paper. The work is supported by the U.S.
National Science Foundation (IIS-0964556), the Netherlands Organization for Scientific
Research (NWO-VIDI 016084338), and the European Research Council (ERC-Starting
Grant 206198).
7. References
Aylett, M. and Turk, A. 2006. Language redundancy predicts syllabic duration and the
spectral characteristics of vocalic syllable nuclei. Journal of the Acoustical Society of
America, 119: 3048–3058.
Bao, Z. 1999. The Structure of Tone. Oxford: Oxford University Press.
Bybee, J. 2002. Word frequency and context of use in the lexical diffusion of
phonetically conditioned sound change. Language Variation and Change 14: 261–
290.
Chao, Y. R. 1948. Mandarin Primer. Cambridge: Harvard University Press.
Chen, M. 2000. Tone Sandhi. Cambridge: Cambridge University Press.
Chen, Y. 2003. The phonetics and phonology of contrastive focus in Standard Chinese.
PhD dissertation. Stony Brook University.
Chen, Y. and Yuan, J. 2007. A Corpus Study of the 3rd Tone Sandhi in Standard Chinese.
Proceedings of Interspeech 2007. pp. 2749-2752.
Duanmu, S. 2000. The Phonology of Standard Chinese. Oxford: Oxford University Press.
Ernestus, M. and Baayen, H. 2006. The functionality of incomplete neutralization in
Dutch: The case of past-tense formation. In Goldstein, L., Whalen, D. and Best C.
(eds.), Laboratory phonology 8. pp. 27–49. Berlin: Mouton de Gruyter.
Fourakis, M. and Iverson, G. 1984. On the incomplete neutralization of German final
obstruents. Phonetica 41: 140–149.
Hockett, C. F. 1947. Peiping phonology. Journal of American Oriental Society 67:253-
267. Reprinted 1964 in M. Joos (ed.) Readings in Linguistics I, fourth edition. pp.
217-228. University of Chicago Press.
Jassem, W. and Richter, L. 1989. Neutralization of voicing in Polish obstruents. Journal
of Phonetics 17: 317–325.
Johnson, K. 1997. The auditory/perceptual basis for speech segmentation. Ohio State
University Working Papers in Linguistics 50: 101-113.
Kleber, F., John, T. and Harrington, J. 2010. The implications for speech perception of
incomplete neutralization of final devoicing in German. Journal of Phonetics 38: 185-
196.
Kratochvil, P. 1984. Phonetic tone sandhi in Beijing dialect stage speech. Cahiers de
Linguistique - Asie Orientale 13:135-174.
Kuo, Y., Xu, Y. and Yip, M. 2007. The phonetics and phonology of apparent cases of
iterative tonal change in Standard Chinese. In Gussenhoven, C. and Riad, T. (eds.)
Experimental Studies in Word and Sentence Prosody. pp. 211-237. Berlin: Mouton de
Gruyter.
Labov, W., Karen, M. and Miller, C. 1991. Near-mergers and the suspension of
phonemic contrast. Language Variation and Change 3: 33–74.
Labov, W., Yaeger M. and Steiner R. 1972. A quantitative study of sound change in
progress. Philadelphia: U.S. Regional Survey.
Liberman, A. M., Harris, K. S., and Hoffman, H. S. 1957. The discrimination of speech
sounds within and across phoneme boundaries. Journal of Experimental Psychology
54: 358-368.
Loui, P., Guenther, F. H., Mathys, C., and Schlaug, G. 2008. Action-perception mismatch
in tone-deafness. Current Biology 18: R331-332.
Macmillan, N. A. and Creelman, C. D. 2005. Detection Theory: A User's Guide (2nd
edition), Lawrence Erlbaum Associates, Inc.
Martin, S. E. 1957. Problems of hierarchy and indeterminacy in Mandarin phonology.
Bulletin of the Institute of History and Philology 29:209-230. Taipei.
Myers, J., and Tsay, J. 2003. Investigating the phonetics of tone sandhi. Taiwan Journal
of Linguistics 1: 29-68.
Patterson, D., and Connine, C. 2001. Variant frequency in flap production. Phonetica 58:
254–275.
Peng, S. 2000. Lexical versus 'phonological' representations of Mandarin Sandhi Tones.
In Broe M. and Pierrehumbert J. (eds.), Papers in laboratory phonology 5:
acquisition and the lexicon. pp. 152-167. Cambridge: Cambridge University Press.
Pierrehumbert, J. 2001. Exemplar dynamics: Word frequency, lenition, and contrast. In
Bybee, J. and Hopper, P. (eds.) Frequency effects and the emergence of lexical
structure. pp. 137-157. John Benjamins, Amsterdam.
Pierrehumbert, J. 2002. Word-specific phonology. Laboratory Phonology 7. pp. 101-139.
Mouton de Gruyter, Berlin.
Port, R. and Crawford, P. 1989. Incomplete neutralization and pragmatics in German.
Journal of Phonetics 17: 257–282.
Port, R. and O’Dell, M. 1985. Neutralization of syllable-final voicing in German. Journal
of Phonetics 13: 455-471.
Shen, J. 1994. Beijinghua shangshen liandu de diaoxing zuhe he jiezou xingshi [F0 and
rhythm of the 3rd tone Sandhi in Beijing Mandarin], Zhongguo Yuwen 4: 274-281.
Shen, X. S. 1990. Tonal coarticulation in Mandarin, Journal of Phonetics 18: 281-295.
Shih, C. 1986. The Prosodic Domain of Tone Sandhi in Chinese. PhD dissertation.
University of California at San Diego.
Shih, C. 1997. Mandarin third tone sandhi and prosodic structure. In J. Wang & N. Smith
(eds.), Studies in Chinese Phonology. pp. 81-124. Dordrecht: Foris.
Speer, S. R., Shih, C.-L., and Slowiaczek, M. L. 1989. Prosodic structure in language
comprehension: Evidence from tone sandhi in Mandarin. Language and Speech 32:
337-354.
Speer, S. R. and Xu, L. 2008. Processing lexical tone in third-tone sandhi, Labphon 11
abstracts. pp. 131-132.
Stevens, K. N. 2002. Toward a model for lexical access based on acoustic landmarks and
distinctive features. Journal of the Acoustical Society of America 111: 1872-1891.
Wang, W. S-Y. and Li, K. P. 1967. Tone 3 in Pekinese. Journal of Speech and Hearing
Research 10: 629-636.
Wang, W. S-Y. 1967. Phonological features of tone. International Journal of American
Linguistics 33:93-105.
Warner, N., Good, E., Jongman, A., and Sereno, J. 2006. Orthographic versus
morphological incomplete neutralization effects. Journal of Phonetics 34: 285-293.
Xu, Y. 1993. Contextual tonal variation in Mandarin Chinese. PhD dissertation.
University of Connecticut.
Xu, Y. 1997. Contextual tonal variations in Mandarin. Journal of Phonetics 25: 61-83.
Xu, Y. 2010. In defense of lab speech. Journal of Phonetics 38: 329-336.
Yip, M. 1980. The Tonal Phonology of Chinese. Ph.D. dissertation. MIT.
Yu, A. 2007. Understanding near mergers: The case of morphological tone in Cantonese.
Phonology 24: 187-214.
Yuan, J. and Liberman, M. 2008. Speaker identification on the SCOTUS corpus.
Proceedings of Acoustics ’08. pp. 5687-5690.
Zhang, J. and Lai, Y. 2010. Testing the role of phonetic knowledge in Mandarin tone
sandhi. Phonology 27: 153-201.
Zhang, Z. 1988. Tone and Tone Sandhi in Chinese. PhD dissertation. Ohio State
University.
Zhao, Y. and Jurafsky, D. 2009. The effect of lexical frequency and Lombard reflex on
tone hyperarticulation. Journal of Phonetics 37: 231-247.
Zee, E. 1980. A spectrographic investigation of Mandarin tone sandhi. UCLA Working
Papers 49:98-116.