The English comparative: Phonology and Usage

Martin Hilpert, ICSI Berkeley / Rice University, hilpert@icsi.berkeley.edu

prouder and more proud

Many English adjectives form the comparative in two ways. Some alternating adjectives have a clear preference (?more easy), while others alternate quite freely:

easy      99.2% morphological
cheesy    69.5% morphological
queasy     4.9% morphological

selected references

Kytö, Merja and Suzanne Romaine. 1997. Competing forms of adjective comparison in Modern English: what could be more quicker and easier and more effective? In T. Nevalainen and L. Kahlas-Tarkka (eds), To explain the present: Studies in the changing English language in honour of Matti Rissanen. Helsinki: Mémoires de la Société Néophilologique de Helsinki, 329-52.

Leech, Geoffrey N. and Jonathan Culpeper. 1997. The comparison of Adjectives in Recent British English. In T. Nevalainen and L. Kahlas-Tarkka (eds), 353-74.

Lindquist, Hans. 2000. Livelier or more lively? Syntactic and contextual factors influencing the comparison of disyllabic adjectives. In J. M. Kirk (ed.), Corpora galore. Amsterdam: Rodopi, 125–32.

Mondorf, Britta. 2003. Support for more-support. In G. Rohdenburg and B. Mondorf (eds), Determinants of grammatical variation in English. Berlin: Mouton de Gruyter, 251-304.

Quirk, Randolph, Greenbaum, Sidney, Leech, Geoffrey and Jan Svartvik. 1985. A comprehensive grammar of the English language. New York: Longman.

factors on the word-level

length, characteristics of the final segment

VARIABLE           SOURCE                       TENDENCY
# of syllables     Quirk et al. (1985)          periphrastic
# of morphemes     Mondorf (2003)               periphrastic
final /i/          Kytö and Romaine (1997)      morphological
final /li/         Lindquist (1998)             periphrastic
final /r/          Mondorf (2003)               periphrastic
final /l/          Kytö and Romaine (1997)      morphological
final C-cluster    Mondorf (2003)               periphrastic
final stress       Leech and Culpeper (1997)    periphrastic

a usage-based approach

It is assumed that usage (quantitative patterns in large amounts of naturally produced language) reflects grammar and vice versa.

A corpus analysis can establish the morphological/periphrastic-ratio of alternating adjectives and determine which of the above factors best predict this ratio.
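The ratio can be related to the percentage shares quoted in the introduction. Assuming those percentages give the morphological share of all comparative tokens (an assumption; the poster does not define them), and writing m and p for the morphological and periphrastic token counts (symbols introduced here for illustration), the conversion would be:

```latex
s = \frac{m}{m + p}, \qquad r = \frac{m}{p} = \frac{s}{1 - s}
```

Under that reading, the 99.2% share for easy corresponds to r = 0.992 / 0.008 = 124, or about 2.09 on a base-10 log scale.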

what’s been said

Previous analyses (e.g. Leech and Culpeper 1997, Mondorf 2003) hold that factors of phonology, morphology, syntax, and semantics govern the comparative alternation. However, an integrated account is missing; the relative importance of these factors has not been determined. Also, the role of frequency has not been sufficiently explored.

When and why do speakers choose one variant over the other?

factors beyond the word level

VARIABLE              SOURCE                       TENDENCY
to-inf complement     Mondorf (2003)               periphrastic
attributive use       Leech and Culpeper (1997)    morphological
predicative use       Leech and Culpeper (1997)    periphrastic
premodification       Lindquist (2000)             periphrastic
weak gradability      Mondorf (2003)               periphrastic
positive frequency    Braun (1982)                 morphological

How can we determine the relative strength of each factor?

analysis

Using the word-level characteristics, the ratio of comparatives to positives (CP-ratio), and the frequency of the positive form (posfr) as variables, the analysis yields an adjusted R² of .341:

INCLUDED VARIABLES
               Beta       t        Sig
# syl          .876     12.738    .000
final /i/     -.437     -6.844    .000
CP-ratio      -.176     -5.801    .000
final /li/     .119      3.836    .000
fin stress     .314      3.663    .000
fin clust      .113      3.017    .003
posfr         -.094     -2.97     .003

EXCLUDED VARIABLES
               Beta       t        Sig
# mor         -.015      -.378    .706
final /r/      .022       .715    .475
final /l/      .056      1.663    .097
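How a multiple linear regression produces coefficients and an adjusted R² can be sketched as follows. This is a minimal illustration, not the procedure used for the poster: it fits all predictors at once (no selection of included and excluded variables), returns unstandardized coefficients rather than standardized Betas, and runs on six invented data points. The function names and the toy values are assumptions.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def fit_ols(rows, y):
    """Least-squares fit with intercept; returns (coefficients, adjusted R^2)."""
    X = [[1.0] + list(r) for r in rows]          # prepend intercept column
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    coef = solve(xtx, xty)                        # normal equations
    pred = [sum(c * v for c, v in zip(coef, x)) for x in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot
    n, p = len(y), k - 1                          # observations, predictors
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
    return coef, adj_r2


# INVENTED toy data, for illustration only (not corpus values):
# predictors are (number of syllables, final /i/ coded 0 or 1).
rows = [(1, 0), (1, 1), (2, 0), (2, 1), (3, 0), (3, 1)]
y = [1.1, 1.4, 0.1, 0.4, -1.1, -0.4]             # invented dependent values
coef, adj_r2 = fit_ols(rows, y)
print([round(c, 3) for c in coef], round(adj_r2, 3))
```

The adjusted R² penalizes the plain R² for the number of predictors, which matters when comparing models with different numbers of variables.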

a first pass: using the n-gram corpus

All bigrams of the form ‘-er than’ are retrieved. The corresponding uninflected adjectives followed by than are also retrieved, yielding 730 types. The LOG of the observed morphological/periphrastic-ratio serves as the dependent variable for a multiple linear regression.

ADJ        MORPH         PERIPHR    RATIO      LOG (10)
able       1,199         27,414     .0437      -1.36
bright     273,698       7,198      38.024      1.58
correct    54            28,864     .0018      -2.73
great      10,100,443    10,099     1000.14     3.00
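The LOG column can be recomputed from the two count columns. The sketch below uses the four rows of the table; the function and variable names are illustrative, and it assumes both counts are nonzero, since the log ratio is undefined otherwise.

```python
import math

# (morphological, periphrastic) token counts copied from the table.
counts = {
    "able": (1_199, 27_414),
    "bright": (273_698, 7_198),
    "correct": (54, 28_864),
    "great": (10_100_443, 10_099),
}


def log_ratio(morph: int, periphr: int) -> float:
    """Base-10 log of the morphological/periphrastic ratio."""
    return math.log10(morph / periphr)


log_ratios = {adj: log_ratio(m, p) for adj, (m, p) in counts.items()}
for adj, value in log_ratios.items():
    print(f"{adj:8s} {value:6.2f}")
```

Taking the log makes the scale symmetric around zero: positive values indicate a preference for the morphological form, negative values a preference for the periphrastic form.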

But what about those syntactic factors?

results

Word length measured in syllables, but not morphemes, strongly affects the alternation.

The measured effects of final /i/ and /li/, final stress, and final clusters corroborate earlier work. Final /r/, /l/, and sibilants have no significant effect.

Factors of usage, such as the ratio of comparatives to positives and the frequency of the positive form, affect the alternation.

second try: using the BNC

All comparative adjective forms are retrieved, yielding 272 types. The four syntactic variables are encoded as subcategorization probabilities for each adjective:

ADJECTIVE    ATTR    PRED    TO      PREMOD
ready        0.15    0.01    0.68    0.09
spicy        0.78    0.11    0.00    0.14
untidy       0.00    0.25    0.00    0.18
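One way such probabilities could be derived is as the share of an adjective's corpus occurrences that fall into each context. The poster does not spell out the procedure, so the sketch below is an assumption: the annotation format, the function name, and the ten example occurrences are all hypothetical. The rows in the table do not sum to 1, so the sketch does not treat the four contexts as mutually exclusive or exhaustive.

```python
from collections import Counter

CONTEXTS = ("ATTR", "PRED", "TO", "PREMOD")


def subcat_probabilities(tokens):
    """Share of an adjective's tokens that occur in each syntactic context.

    `tokens` is a list of sets; each set holds the context labels that
    apply to one occurrence (it may be empty or hold several labels).
    """
    total = len(tokens)
    if total == 0:
        raise ValueError("need at least one token")
    hits = Counter(label for labels in tokens for label in labels)
    return {ctx: round(hits[ctx] / total, 2) for ctx in CONTEXTS}


# HYPOTHETICAL annotations for ten occurrences of one adjective.
example = (
    [{"PRED", "TO"}] * 6      # predicative with a to-infinitive complement
    + [{"ATTR"}] * 2          # attributive
    + [{"PRED", "PREMOD"}]    # predicative and premodified
    + [set()]                 # none of the four contexts
)
probs = subcat_probabilities(example)
print(probs)
```

Encoding each context as a per-adjective proportion turns the syntactic factors into numeric predictors that can enter the regression alongside the word-level variables.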

analysis

Again, the LOG of the observed morphological/periphrastic-ratio serves as the dependent variable. Using all previously used variables and the subcategorization probabilities, the analysis yields an adjusted R² of .509:

INCLUDED VARIABLES
               Beta       t        Sig
# syl          .969     10.572    .000
CP-ratio      -.392     -9.122    .000
fin stress     .550      5.942    .000
to-inf         .113      2.626    .009
fin clust      .112      2.511    .013
final /l/      .109      2.480    .014

EXCLUDED VARIABLES
               Beta       t        Sig
fin /i/       -.003      -.029    .977
premod         .010       .237    .813
pred           .021       .497    .620
attr          -.055     -1.212    .226
# mor          .082      1.387    .167
fin /li/       .066      1.454    .147
fin /r/        .085      1.937    .054

results

As in the first analysis, length in syllables, the comparative-positive ratio, final stress, and final consonant clusters influence the alternation.

Final /l/, which was excluded in the first analysis, is found to be significant.

Of the syntactic variables, only to-infinitive complements show a significant effect.

Final /i/ and /li/ do not show an independent effect in this analysis; neither does the frequency of the positive form.

conclusion

Both structural phonological factors and factors of language use govern the alternation, but much variance still needs to be explained.