Laboratory Phonology 11, 30 June - 2 July 2008, Wellington, New Zealand
The Gradient Phonotactics of English CVC Syllables
Olga Dmitrieva & Arto Anttila
Department of Linguistics, Stanford University

Introduction
Factors affecting the well-formedness of English CVC syllables:
• OCP-place: gradient prohibition against homorganic consonants in C1 and C2 (e.g. gag vs. gap).
HYPOTHESIS: Syllables with C1 and C2 of the same place of articulation are underrepresented.
• Prominence alignment between stress, vowel height, and consonant place:
HYPOTHESIS: Syllables that violate prominence alignment are underrepresented.
Methods
Material:
• CMU pronunciation dictionary and CELEX lemma lexicon.
• Stress: primary stress vs. no stress.
• Consonants: coronal, dorsal, labial.
• Vowels: high (= high or reduced) and low (= low or mid).
Effect size evaluation:
• Observed frequency/Expected frequency ratio (O/E ratio):
P(dorsal-V-dorsal) = P(onset=dorsal) * P(coda=dorsal)
E(dorsal-V-dorsal) = P(dorsal-V-dorsal) * Total
• Multiple regression.
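The O/E computation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `oe_ratios` and the toy counts are hypothetical and are not taken from CMU or CELEX. The expected count follows the formula given above, E = P(onset) * P(coda) * Total.

```python
from collections import Counter

def oe_ratios(syllables):
    """Return {(onset, coda): O/E} for a list of (onset_place, coda_place) pairs."""
    total = len(syllables)
    onset_counts = Counter(onset for onset, _ in syllables)
    coda_counts = Counter(coda for _, coda in syllables)
    observed = Counter(syllables)
    ratios = {}
    for onset, n_onset in onset_counts.items():
        for coda, n_coda in coda_counts.items():
            # E = P(onset) * P(coda) * Total
            expected = (n_onset / total) * (n_coda / total) * total
            ratios[(onset, coda)] = observed[(onset, coda)] / expected
    return ratios

# Hypothetical toy counts (invented for illustration, not corpus data).
toy = (
    [("dorsal", "dorsal")] * 1
    + [("dorsal", "labial")] * 4
    + [("labial", "dorsal")] * 4
    + [("labial", "labial")] * 1
)
ratios = oe_ratios(toy)
for pair, value in sorted(ratios.items()):
    print(pair, round(value, 2))
```

In this toy set the homorganic pairs come out below 1.00 and the heterorganic pairs above 1.00, which is the pattern the OCP-place hypothesis predicts.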
Results
O/E values by onset place and coda place (axes: Onset, Coda; one table per corpus, CMU and CELEX):

            Coronal   Dorsal   Labial
Coronal       0.94     1.21     1.17
Dorsal        1.03     0.51     1.63
Labial        1.09     0.84     0.31

            Coronal   Dorsal   Labial
Coronal       0.86     1.20     1.29
Dorsal        1.25     0.42     0.78
Labial        1.23     0.77     0.40
[Figure: two stacked bar charts (0%-100%). Axis label: Coda; categories: Labial, Dorsal, Coronal. Legend: Coronal, Dorsal, Labial.]
Regression
OT Analysis
Conclusions
O/E values by consonant place, syllable position, and stress (one table per corpus, CMU and CELEX):

                    Coda                      Onset
            Unstressed   Stressed     Unstressed   Stressed
Coronal        1.04        0.95          1.14        0.80
Dorsal         1.03        0.96          0.79        1.30
Labial         0.65        1.68          0.87        1.27

                    Coda                      Onset
            Unstressed   Stressed     Unstressed   Stressed
Coronal        1.03        0.88          1.12        0.82
Dorsal         1.01        0.99          0.86        1.21
Labial         0.88        1.18          0.83        1.15
[Figure: stacked bar chart (0%-100%). Categories: Stressed, Unstressed. Legend: Coronal, Dorsal, Labial.]
O/E values by consonant place, syllable position, and vowel height (one table per corpus, CMU and CELEX):

                 Coda              Onset
             High     Low      High     Low
Coronal      1.04     0.92     1.11     0.78
Dorsal       0.90     1.06     0.70     1.42
Labial       0.68     1.87     0.94     1.23

                 Coda              Onset
             High     Low      High     Low
Coronal      1.03     0.96     1.15     0.76
Dorsal       0.87     1.03     0.64     1.41
Labial       0.93     1.12     0.83     1.28
[Figure: stacked bar chart (0%-100%). Categories: Low, High. Legend: Coronal, Dorsal, Labial.]
O/E values by vowel quality and syllable type (one table per corpus, CMU and CELEX):

Vowel quality     Unstressed   Stressed
High                 1.29        0.46
Low                  0.24        2.42

Vowel quality     Unstressed   Stressed
High                 1.37        0.45
Low                  0.41        1.89
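[Figure: stacked bar chart (0%-100%). Categories: Stressed, Unstressed. Legend: Coronal, Dorsal, Labial.]
[Figure: stacked bar chart (0%-100%). Categories: Low, High. Legend: Coronal, Dorsal, Labial.]
[Figure: stacked bar chart (0%-100%). Categories: Stressed, Unstressed. Legend: Low, High.]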
1. Syllables that violate OCP-place are underrepresented (CMU, CELEX):
Onset-coda co-occurrences (O/E values).
2. Syllables that violate consonant-stress alignment are underrepresented (CMU, CELEX):
a. Labials and dorsals in unstressed syllables.
b. Coronals in stressed syllables.
3. Syllables that violate consonant-vowel alignment are underrepresented (CMU, CELEX).
4. Syllables that violate vowel-stress alignment are underrepresented (CMU, CELEX):
a. Low vowels in unstressed syllables.
b. High vowels in stressed syllables.
R = 0.943 (F(6, 35) = 38.689, p < 0.001)
R = 0.945 (F(6, 35) = 13.515, p < 0.001)
83,798 CVC syllables from CMU
25,888 CVC syllables from CELEX
e.g. repeat: [rI] [pit]
O/E ratio > 1.00: overrepresentation
O/E ratio < 1.00: underrepresentation
Cases: 36 syllable types:
3 onset place * 3 coda place * 2 stress * 2 vowel height
e.g. LLHS - labial-labial, high vowel, stressed
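The 36 cases can be enumerated mechanically as the cross product of the four factors. The sketch below assumes one-letter codes in the order onset, coda, vowel height, stress, matching the LLHS example; only "L", "H", and "S" are attested in that example, so the codes "C" (coronal), "D" (dorsal), and "U" (unstressed), along with the function names, are assumptions made for illustration.

```python
from itertools import product

# One-letter codes; "C", "D", and "U" are assumed, only LLHS is given in the text.
PLACES = {"C": "coronal", "D": "dorsal", "L": "labial"}
HEIGHTS = {"H": "high", "L": "low"}
STRESS = {"S": "stressed", "U": "unstressed"}

def syllable_types():
    """All codes: onset place, coda place, vowel height, stress."""
    return ["".join(code) for code in product(PLACES, PLACES, HEIGHTS, STRESS)]

def describe(code):
    """Spell out a four-letter code such as 'LLHS'."""
    onset, coda, height, stress = code
    return f"{PLACES[onset]}-{PLACES[coda]}, {HEIGHTS[height]} vowel, {STRESS[stress]}"

types = syllable_types()
print(len(types))        # 3 * 3 * 2 * 2 cases
print(describe("LLHS"))
```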
Significant factors in CMU:
• Vowel-stress alignment
• OCP
• No labial/dorsal in unstressed syllables
Significant factors in CELEX:
• Vowel-stress alignment
• OCP
• No labial/dorsal with high vowels
• A set of unranked OT constraints generates implicational universals that reflect relative phonotactic markedness:
More marked forms entail less marked forms.
More marked forms surface less frequently.
• Sample universal:
If a language allows gag (violates OCP) it also allows gap.
Gap is always more frequent than gag.
• The implicational universals can be described graphically as a partial order.
• Precision (how many of the predicted relationships are correct): CMU 0.85, CELEX 0.86.
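Precision in this sense can be illustrated with a short sketch: each predicted relationship is a pair (more marked form, less marked form), and a prediction counts as correct when the less marked form is the more frequent one. The function name, the placeholder forms other than gag/gap, and all frequency values are invented for illustration; the sketch does not reproduce the figures reported above.

```python
def precision(predictions, frequency):
    """Share of (more_marked, less_marked) pairs where the less marked form is more frequent."""
    correct = sum(
        1 for marked, unmarked in predictions
        if frequency[unmarked] > frequency[marked]
    )
    return correct / len(predictions)

# Hypothetical predictions and frequency values (not corpus data).
predictions = [("gag", "gap"), ("x1", "y1"), ("x2", "y2"), ("x3", "y3")]
frequency = {
    "gag": 0.5, "gap": 1.6,
    "x1": 0.8, "y1": 1.1,
    "x2": 0.9, "y2": 1.2,
    "x3": 1.3, "y3": 1.0,  # this prediction fails
}
score = precision(predictions, frequency)
print(score)
```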
Gradient OCP-place is active in all CVC syllables (not just monosyllabic words, cf. Berkley 1994).
Prominence alignment in CVC syllables:
• The best stressed syllable has low or mid vowels.
• The best unstressed syllable has high or reduced vowels and coronal consonants.
Positional neutralization and augmentation for vowels.
Only positional neutralization for consonants.
References:
Anttila, A. (2008). Gradient phonotactics and the Complexity Hypothesis. To appear in Natural Language and Linguistic Theory.
Anttila, A. & Andrus, C. (2006). T-Orders. Ms., Stanford University.
Baayen, R. H., Piepenbrock, R., & Gulikers, L. (1995). The CELEX Lexical Database (Release 2). Philadelphia, PA: Linguistic Data Consortium, University of Pennsylvania [Distributor].
Berkley, D. (1994). Variability in Obligatory Contour Principle effects. CLS 30, pp. 1-12.
Coetzee, A., & Pater, J. (2008). Weighted constraints and gradient restrictions on place co-occurrence in Muna and Arabic. To appear in Natural Language and Linguistic Theory.
Weide, R. (1998). The CMU pronunciation dictionary (Release 0.6). Carnegie Mellon University. Available online at http://www.speech.cs.cmu.edu/cgi-bin/cmudict.
Constraints (significant regression factors):
OCP      Avoid homorganic C1 and C2
*x/a     Avoid unstressed low vowels
*X/I     Avoid stressed high vowels
*x/p_    Avoid labial/dorsal C1 in unstressed syllables
*x/_p    Avoid labial/dorsal C2 in unstressed syllables
*p_/I    Avoid labial/dorsal C1 + high vowel
*i/_p/   Avoid high vowel + labial/dorsal C2
Faith    Do not change input segments
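To make the constraint definitions concrete, the sketch below assigns each syllable type its set of violated markedness constraints and then checks a simple entailment condition: if every constraint violated by form B is also violated by form A, any ranking in which Faith protects A also protects B. This subset test is a deliberately simplified sufficient condition assumed here for illustration; it is not the T-order computation of Anttila & Andrus (2006), and the function names are hypothetical.

```python
MARKED_PLACE = {"labial", "dorsal"}

def violations(onset, coda, height, stress):
    """Markedness constraints violated by a CVC type, per the definitions above."""
    v = set()
    if onset == coda:
        v.add("OCP")
    if stress == "unstressed" and height == "low":
        v.add("*x/a")
    if stress == "stressed" and height == "high":
        v.add("*X/I")
    if stress == "unstressed" and onset in MARKED_PLACE:
        v.add("*x/p_")
    if stress == "unstressed" and coda in MARKED_PLACE:
        v.add("*x/_p")
    if height == "high" and onset in MARKED_PLACE:
        v.add("*p_/I")
    if height == "high" and coda in MARKED_PLACE:
        v.add("*i/_p/")
    return frozenset(v)

def entails(a, b):
    """Assumed condition: allowing a implies allowing b if b's violations are a subset of a's."""
    return violations(*b) <= violations(*a)

# gag and gap as stressed syllables with a low vowel.
gag = ("dorsal", "dorsal", "low", "stressed")
gap = ("dorsal", "labial", "low", "stressed")
print(sorted(violations(*gag)), sorted(violations(*gap)))
print(entails(gag, gap), entails(gap, gag))
```

Under these assumptions gag violates only OCP and gap violates nothing, so gag entails gap but not the reverse, in line with the sample universal.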
Graphic representation of implicational relationships in CELEX data.
[Figure: stacked bar chart (0%-100%). Categories: Stressed, Unstressed. Legend: Low, High.]
Dependent variable:
• Log of the observed frequency.
Independent variables:
• Log of the expected frequency.
• Binary coded phonotactic factors: 1 = violates, 0 = does not violate.

CMU:
Factors                      Coefficient   Std. Error      t         p
Expected                        0.857        0.077       11.120    <0.001
OCP                            -0.961        0.204       -4.709    <0.001
Stressed/High                  -1.708        0.254       -6.723    <0.001
Unstressed/Low                 -1.169        0.257       -4.557    <0.001
Lab&Dor onset/Unstressed       -0.820        0.251       -3.268    <0.01
Lab&Dor coda/Unstressed        -0.737        0.250       -2.947    <0.01

CELEX:
Factors                      Coefficient   Std. Error      t         p
Expected                        0.856        0.088        9.689    <0.001
OCP                            -0.783        0.16        -4.599    <0.001
Stressed/High                  -1.830        0.2         -4.151    <0.001
Unstressed/Low                 -1.513        0.199       -7.591    <0.001
Lab&Dor onset/High             -0.660        0.196       -3.372    <0.01
Lab&Dor coda/High              -0.477        0.196       -2.43     <0.05
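The regression setup (log observed frequency predicted from log expected frequency plus binary violation factors) can be sketched with ordinary least squares. The code below is a minimal pure-Python illustration, not the analysis behind the reported coefficients: the solver `ols`, the intercept column, the single OCP factor, and the toy rows are all assumptions. The toy observed counts are constructed so that log O = log E - 1 * OCP holds exactly, so the fit should recover those values.

```python
import math

def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Hypothetical rows: (expected count, observed count, OCP violation 0/1).
rows = [
    (100.0, 100.0, 0),
    (200.0, 200.0, 0),
    (100.0, 100.0 * math.exp(-1), 1),
    (400.0, 400.0 * math.exp(-1), 1),
]
X = [[1.0, math.log(e), float(v)] for e, _, v in rows]  # intercept column is an assumption
y = [math.log(o) for _, o, _ in rows]
intercept, expected_coef, ocp_coef = ols(X, y)
print(round(expected_coef, 3), round(ocp_coef, 3))
```

A negative coefficient on a violation factor means that syllable types violating it are observed less often than their expected frequency alone would predict.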
Syllables violating consonant-vowel alignment:
a. Low vowels with coronals.
b. High vowels with labials or dorsals.
Prominence scales:
stressed > unstressed
low vowel > high vowel
labial/dorsal > coronal