A Study of the Recognition of American English Intonation ...

Louisiana State UniversityLSU Digital Commons

LSU Historical Dissertations and Theses Graduate School

1968

A Study of the Recognition of American EnglishIntonation by Native Speakers.Clarence Wilton MccordLouisiana State University and Agricultural & Mechanical College

Follow this and additional works at: https://digitalcommons.lsu.edu/gradschool_disstheses

This Dissertation is brought to you for free and open access by the Graduate School at LSU Digital Commons. It has been accepted for inclusion inLSU Historical Dissertations and Theses by an authorized administrator of LSU Digital Commons. For more information, please [email protected].

Recommended CitationMccord, Clarence Wilton, "A Study of the Recognition of American English Intonation by Native Speakers." (1968). LSU HistoricalDissertations and Theses. 1506.https://digitalcommons.lsu.edu/gradschool_disstheses/1506

https://digitalcommons.lsu.edu?utm_source=digitalcommons.lsu.edu%2Fgradschool_disstheses%2F1506&utm_medium=PDF&utm_campaign=PDFCoverPages

https://digitalcommons.lsu.edu/gradschool_disstheses?utm_source=digitalcommons.lsu.edu%2Fgradschool_disstheses%2F1506&utm_medium=PDF&utm_campaign=PDFCoverPages

https://digitalcommons.lsu.edu/gradschool?utm_source=digitalcommons.lsu.edu%2Fgradschool_disstheses%2F1506&utm_medium=PDF&utm_campaign=PDFCoverPages

https://digitalcommons.lsu.edu/gradschool_disstheses?utm_source=digitalcommons.lsu.edu%2Fgradschool_disstheses%2F1506&utm_medium=PDF&utm_campaign=PDFCoverPages

https://digitalcommons.lsu.edu/gradschool_disstheses/1506?utm_source=digitalcommons.lsu.edu%2Fgradschool_disstheses%2F1506&utm_medium=PDF&utm_campaign=PDFCoverPages

mailto:[email protected]

This dissertation has been microfilmed exactly as received 69-4489

McCORD, Clarence Wilton, 1934- A STUDY OF THE RECOGNITION OF AMERICAN ENGLISH INTONATION BY NATIVE SPEAKERS.

Louisiana State University and Agricultural and Mechanical College, Ph.D., 1968 Language and Literature, linguistics

University Microfilms, Inc., Ann Arbor, Michigan

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

A Study of ths Recognition of

American English Intonation by Native Speakers

A Dissertation

Submitted to the Graduate Faculty of the Louisiana State University and

Agricultural and Mechanical College in partial fulfillment of the requirements for the degree of

Doctor of Philosophy

in

The Department of Speech

byClarence Wilton McCord

B.A,, Louisiana College, 1956 15.D., Golden Gate Baptist Theological Seminary, 1960

M.A., Louisiana State University, 1962 August, 1968


PLEASE NOTE:Not original copy. Several pages have indistinct print. Filmed as received.

University Microfilms.


ACKNOWLEDGMENT

The writer wishes to acknowledge with sincere appreciation the

able guidance of Dr. J. Donald Ragsdale, the helpful suggestions of

Dr. George H. Gunn, the encouragement of Dr. Claude L. Shaver, and

the assistance of Dr. William W. Evans, Dr. Waldo W. Braden, and ,

Dr. John H. Pennybacker. Further acknowledgment is due to the

patient endurance of Charles Dunham and Harold Overton, whithout

whose cooperation this study would not have been possible.

ii


TABLE OF CONTENTS

Page

CHAPTER I. INTRODUCTION . 1Definition of terms 1The problem 4

CHAPTER .II. INTONATION IN LINGUISTIC STUDIES 6Grammatical 8Semantic 14Experimental 26

CHAPTER III. PURPOSE AND DESIGN OF THE EXPERIMENT 36Purpose 36Experimental hypotheses 36Methodology 37

CHAPTER IV. RESULTS AND DISCUSSION . . 52Listener responses 52Naive and non-naive listeners 59Actor and linguist 67Intelligible and nonsense sentences 76Intelligible and filtered sentences 79

CHAPTER V. CONCLUSIONS 82Summary of conclusions 83Suggested studies 84General conclusions 85

BIBLIOGRAPHY 86

APPENDIX A. RECORDING 93

APPENDIX B. COPYING 99

APPENDIX C. DISTORTION COPYING 101

APPENDIX D. PLAYBACK 103iii


LIST OF TABLES

Page

TABLE I. Distribution of Listeners by Groups 41

TABLE II. Confusion Matrix for Listeners to Linguist 53

TABLE III. Confusion Matrix for Listeners to Actor 56

TABLE IV. Confusion Matrix for All Responses 60

TABLE V. Comparison of Naive and Non-naive Listenersto Linguist's Intelligible Sentences 61

TABLE VI. Comparison of Naive and Non-naive Listenersto Linguist's Nonsense Sentences 62

TABLE VII. Comparison of Naive and Non-naive Listenersto Linguist's Filtered Sentences 63

TABLE VIII. Comparison of Naive and Non-naive Listenersto Actor's Intelligible Sentences 64

TABLE IX. Comparison of Naive and Non-naive Listenersto Actor's Nonsense Sentences 65

TABLE X. Comparison of Naive and Non-naive Listenersto Actor's Filtered Sentences 66

(continued on page v)

iv


LIST OF TABLES (CONTINUED)

TABLE XI.

TABLE XII.

TABLE XIII.

TABLE XIV.

TABLE XV.

TABLE XVI.

TABLE XVII.

Comparison of Listener Responses to Intelligible Sentences of Actor with Listener Responses to Linguist’s Sentences of the Same Type and Meaning

Comparison of Listener Responses to Nonsense Sentences of Actor with Listener Responses to Linguist's Sentences of the Same Type and Meaning

Comparison of Listener Responses to Filtered Sentences of Actor with Listener Responses to Linguist's Sentences of the Same Type and Meaning

Comparison of Listener Responses to Linguist's Intelligible Sentences with Listener Responses to Linguist's Nonsense Sentences

Comparison of Listener Responses to Actor's Intelligible Sentences with Listener Responses to Actor's Nonsense Sentences

Comparison of Listener Responses to Linguist's Intelligible Sentences with Listener Responses to Linguist's Filtered Sentences

Comparison of Listener Responses to Actor's Intelligible Sentences with Listener Responses to Actor's Filtered Sentences


LIST OF FIGURES

Page

Figure 1. Listener Instruction sheet 42

Figure 2. Listener response form 43

Figure 3. Linear 8000 Hz (spectrographtc) display 46

Figure 4. Linear 4800 Hz (spectrographic) display 47-

Figure 5. Linear 8000 Hz (spectrographic) display ofharmonic 500 Hz calibration tone 48

Figure 6. Linear 4800 Hz (spectrographic) display ofharmonic 500 Hz calibration tone 49

Figure 7. Percentages of listeners accepting theIntended or defined meanings 58

Figure 8. Ranges of fundamental frequencies andassigned pitch levels in the sentencesof the linguist 69

Figure 9. Intonation contours used by linguistand actor 71


ABSTRACT

The purpose of this study was to test the theory that there

are conventional patterns of intonation which have meaning; that is,

that there are intonation morphemes. The major experimental

hypothesis wa3 the null correlate of the positive statement:

"Native speakers of American English can recognize the defined

meanings of specified contours." There were four sub-hypotheses:

(1) There is no difference between the meanings assigned

by listeners to full and intelligible speech and those assigned

by listeners to nonsense sentences.


by listeners to full intelligible speech and those assigned

by listeners to speech distorted by a low-pass filter.


by listeners to sentences with Pike's contours produced by

a linguist and those assigned by listeners to sentences

produced by an actor with the same intended meanings.


by naive listeners and those assigned by non-naive listeners

to the same sentences.

The design of the study required a relatively large number of

listeners to hear three types of sentences, intelligible, nonsense,

and filtered. The listeners were of two kinds, naive and non-naive.

The equipment used to conduct the experiment was tested for

frequency response, distortion, and proper functioning.


The sentences used for stimulus items were recorded by two

speakers, an actor and a linguist. The sentences were subjected

to spectrographic analysis after validation by qualfied judges.

The data collected from the listeners were tabulated and

treated by a Chi-square analysis to determine the significance

of variations of distribution among the groups of listener

responses.

No significant difference was found between the responses of

naive and non-naive listeners. The null hypothesis was accepted.

Any studies which succeed this present study may well ignore the

amount of training the listeners may have so long as they are

native American English speakers who are not bilingual.

Sub-hypotheses 1, 2, and 3 could not be accepted, nor could

they be rejected at .the .01 level of confidence. Their rejection

was tentative, but it was strongly suggestive of doubts about the

existence of intonation morphemes.

The theory that there are intonation morphemes or specific

meanings for conventional patterns of intonation is subject to

question. This study has neither proven nor disproven semantic

theory of intonational meaning, but it has rejected all the null

hypotheses whose acceptance would have supported the theory. The

overall rejection of the hypotheses was not at the .01 level of

confidence, but the tendency was far stronger to reject than to

accept the hypotheses.

Therefore, this study has not supported the theory that there

are intonation morphemes in American English. It is probable


that emotional meaning is a product of features other than intona

tion itself, those features being rate, pause, voice quality,

lexical context, grammatical context, and social context. To what

extent emotional meaning is actually conveyed by intonation is

still a matter of conjecture for further study and speculation.


CHAPTER I

INTRODUCTION

Pitch variation is a universal phenomenon in language. It is

impossible to produce voice without a characteristic pitch, and it

seems to be impossible to produce a truly monotone utterance. The

range and patterns of pitch change are as different from one

language to another as the sounds and words. The usefulness of such

pitch changes varies widely. For tone languages, such as Chinese,

pitch becomes an integral part of every word and is as necessary

for a correct semantic interpretation as the sound structure of the

word. _.n languages which do not make such use of pitch the role of

pitch remains questionable, but the uncertain status of pitch is not

due to any lack of study and speculation.

DEFINITION OF TERMS

Prosody and Phonemes

Language is composed of a stream of vocal sounds. The stream

of sound may be analyzed in at least two dimensions: the sequential

elements and the simultaneous elements. The sequential elements are

those sounds of language which have been characterized by the alpha

betic writing systems called the segmental phonemes, or significant

sound features. The simultaneous elements are those features which

occur at the same moment in time as the segmental phonemes but are

not generally analyzed as a significant part of the segmental


phonemes. The simultaneous features, considered as a whole, are

called prosodic, or suprasegmental, features. The prosody of

English includes stress, pitch, and juncture, which are the rhythmic

and melodic parts of the language. To what extent these prosodic

features of English may be considered either phonemic or semantically

significant is open to question.

Stress

Stress in language most often has been considered a function of

the force of breath with which' a word or syllable is uttered (15).

It is probably safe to define stress as the relative level of loud

ness with which a syllable is perceived, but stress cannot be

equated with any one physiological or physical function. A number

of studies of the physical and physiological parameters of stress

indicate that the listener's perception of stress depends upon the

intensity, frequency, and duration of the syllable (9, 28, 50, 51,

56, 58, 60, 61, 65, 84). Some studies conclude that intensity is

the most important factor in the perception of stress; others, that

frequency is more influential.

Intonation

Pitch phenomena in non-tone languages are called intonation.

The perception of pitch is most often said to depend upon the rate

of vibration of the vocal folds, but pitch perception appears to

involve other factors (24, 27, 50). The interdependence of

functions in perception noted above pertains to pitch as well as


3

to stress. Strictly speaking, pitch is a perceptual event, and the

fact that its physical parameters are complex does not effect the

reference to pitch as the primary element in intonation.

Even tone languages may have uses of pitch which must be called

intonation. According to Chang (14) the Chengtu Chinese dialect

makes some use of pitch which is not related to the intrinsic tones

but rather to the attitude or emotional state of the speaker. The

relationship of intonation to the entire utterance in English is the

subject of the present st.udy0

The physical and physiological parameters of pitch in language

have been studied and described many times (27, 28, 50, 51, 54, 56,

58, 59, 84). What causes and what is perceived by listeners as pitch

is not dependent upon any single factor. In the vocal apparatus,

fundamental frequency is a function of vocal fold tension and sub-

glottal pressure. The perception of pitch is dependent upon funda

mental frequency, intensity, and to a certain extent duration.

Bloomfield (6) and others have defined intonation in terms of the

rate of vibration of the vocal folds, but this kind of definition

appears now to be an oversimplification.

Because it is impossible, on any kind of physical and probably

on any psychological basis, to completely separate pitch and stress

elements in English, this study will use the term intonation to refer

to a kind of combination of both these features. Such a practice is

not without precedent. Pike (73) and others (7, 24, 28, 33, 53, 56,

58, 59, 64, 83) point out the necessary interdependence of the

features. To encompass these prosodic features as a kind of whole


the term prosodeme has been used (33, 71). Thus, this writer- uses

the term intonation to refer not specifically to the changes in

fundamental frequency alone, but to the perceptually distinct pitch

and stress patterns, making no attempt to eliminate their mutual

influence upon each other in perception.

THE PROBLEM

The function of intonation in English has been interpreted as

grammatical and semantic. The descriptions of grammatical functions

of intonation in English have remained basically the same since the

first such description in 1569 (34). Only terminology and a few

superficial details are different in the latest book on the subject

nearly four hundred years later (53). To what extent the semantic

function of intonation is operative has been studied before and will

undoubtedly be studied further. Pike concluded that the meanings

associated with intonation contours "could not be correlated with

the grammar, nor with their usage specifically with questions, state

ments, or the like, but rather had to be analyzed as implying

speakers' attitudes more or less independent of the grammar" (73,

p . 1 ) .

To what extent is a speaker capable of conveying, or how

accurately can a listener recognize, the kind of meaning described

by Pike, Wells (91), and others (1, 14, 22, 41, 42, 47, 48, 78)? If

pitch and stress phonemes are combined into morphemes or semantic

units, then the semantic interpretation of these units should be

— — relatively straightforward. But conflicting reports of experimental


data (25, 26, 38, 58, 89) indicate that further study is necessary.

For the purposes of this study, the problem may be stated a bit

more specifically. The nature of the problem is centered here in the

descriptions of intonational meanings found in Pike's Intonation of

American English' (73). Several questions will serve to define the

problem:

(1) How accurately can audiences recognize the meanings

of the intonation contours described and assigned specific

meanings by Pike?

(2) Does the meaning assigned by listeners reside in the

intonation alone, or might there be other factors at work?

(3) Will listeners assign the same meaning to different

contours?

(4) Will an actor who is asked to portray a certain atti

tude produce the same intonation contour as that described

by Pike?

(5) Will listeners assign the same meaning to an actor's

sentence which is intended to carry a specific emotional

meaning and to a sentence using the intonation contour

said by Pike to have that meaning?

(6) Does training or instruction in the recognition of

prosodic features effect a listener's recognition of

intonational meaning?


CHAPTER II

INTONATION IN LINGUISTIC STUDIES

Interest in the melodic patterns of language is at least as old

as writing systems. Some orthographic systems, such as those of

Greek and Hebrew, attempted to mark rhythmic and melodic patterns

by stress, punctuation, or other markings. Greek and'Latin rheto

ricians and poets discussed the prosody of their respective languages.

For some writers rhythmic patterns of spoken prose became extremely

important. For example, Aristotle (c. 325 B.C.) described the three

kinds of thythm used in speech (3, pp. 249-2.50), and Demetrius

(c. 200 B.C.) (31) discussed the proper rhythmic patterns appropri

ate to each of the four styles: the plain, the grand, the elegant,

and the forceful.

A systematic attempt to describe the prosodic features of

English did not appear until John Hart's The Opening of the Unreason

able VJriting of Our Inglish Toung in 1551 and especially his An

Orthographie in 1569. Hart declared in the latter work that prosody

performs two functions, "distinction and pointing" (34, p. 199).

Properly used, he said, punctuation could tell -one "how to vnderstande

what is written . . . and what sentence is asking: and what is

wondring . . ." (34, p. 200). The features of time and tune he

equated with the orthographic punctuation marks, inventing a few

additional diacritics and phonetic symbols to better serve his pur

pose. Bor example, the comma "is in reading the shortest rest . . .

6


7

alwayes signifying the sentence vnfinished . . ."(34, p. 200). The

period was "to signifie the ende of a full and perfite sentence

. . ." (34, p. 200). Of two special classes of sentences, the

interrogative and the exclamatory, Hart would have made a special

case. "And for -the marke of the interrogatiue and adm.iratiue, I

woulde thinke it more reasonable to vse them before then after,

bicause their tunes doe differ from our other maner of pronunciation

at the beginning of the sentence" (34, p. 200). Hart saw intonation,

or changes in pitch, as a means of distinguishing three classes of

sentences, the "full and perfite sentence," the "interrogatiue," and

the "admiratiue."

Probably the first significant description of English intonation

was that of Joshua Steele (1779) (82). In his attempt to counter

the statements of Lord Monboddo's The Origin and Progress of Language,

Steele developed a descriptive analysis of the prosodic features of

English. He invented a large, number of symbols resembling musical

notes to indicate upward and downward movements of pitch roughly

equal to a quarter, a half, and three-quarters of a musical tone.

His notation system included a method for noting duration, stress,

and pause in speech with several degrees of each factor. He observed

that pitch changes are so small as to be almost imperceptible, and

are not susceptible to description by using a musical scale. He was

very conscious of the minute gradations of pitch, differing not only

in different speakers but in the same speaker at different times.

His description was confined primarily to pitch changes within a

syllable, although he recognized that pitch changes in words and


sentences also occur significantly. Steele described broad types of

general styles in prosodic features; such as, forte, piano, adagio,

etc. Each general style had a relationship to the overall mood of

the speaker. Other British and American elocutionists followed

Steele, the most, notable among whom was James Rush, who will be

discussed below.

From these early beginnings, the investigation of English

intonation has followed at least three significant directions which

might be called, first, the grammatical; second, the semantic; and

third, the experimental. The first treats intonation as a part of

grammar only; that is, it is the result of some structural aspect

of sentences or causes certain structural or grammatical interpre

tations on the part of the hearer. The second may admit in part

that intonation performs a grammatical function, but goes further

to maintain that the speaker may intend and the listener may derive

some semantic interpretation which is quite apart from or different

from the grammatical function of intonation. The third direction of

investigation, the experimental, may use the theories of either or

both of the other two as hypotheses to be tested.

GRAMMATICAL

In his Good Speech, Walter Ripman (1925) attempted an analysis

only of the meaning of grammatical intonations because for "good and

bad intonations . . . there can be no rules that would govern all

cases. . . . A standard interpretation would mean that the readers

had standardized souls, -- all feeling the same emotions on reading


9

the same words" (76, p. 60).s'”K',ipman described only the rising pitch

of incomplete utterances and "yes-or-no" questions and the falling

pitch of definite statements and questions containing a question

word. In other words, though he recognized other uses for intonation,

Ripman described- only the grammatical functions. Interestingly, he

further noted that emphasis is generally achieved not by greater

force of utterance as commonly believed, "but by difference of

pitch" (76, p. 62) .

Lilias Armstrong and Ida Ward (1926) (4) defining intonation

also as the rise and fall of pitch in speech, confined themselves

to a description of the grammatical functions of intonation. Like

Ripman, they recognized other uses of pitch which they called emphatic

usage. While they defined stress in terms of breath force, they

recognized the close relationship between stress and intonation.

Armstrong and Ward were the first to use the term "tune" to describe

a basic intonation pattern. Their Tune I was the falling final pitch

of ordinary statements, questions other than "yes-or-no" questions,

commands, and exclamations. Tune II was the rising final pitch of

indefinite or non-final utterances and of "yes-or-no" questions. To

Armstrong and Ward emphatic usage differs from non-emphatic only in

having heavier stress or a wider range of tones in the Tune. Their

description is not significantly different from that of Ripman except

in the invention of the terminology Tune I and Tune II.

In English Phonetics (1931) Ripman (75) combined the pitch and

stress functions in what he called descending stress, level stress,

and ascending stress. Though he implied a kind of secondary stress,


10

he made no clear definition of it. The uses of rising, falling, and

level tones described here are about the same as in Ripman's earlier

work; but he added two uses. First, he described prominence with a

"considerable fall in pitch" (75, p. 159). And second, he turned

to connotative meaning with a falling-rising intonation, which, he

said, "implies that the statement made is subject, to some qualifica

tions" (75, p. 159). This last statement of classification is the

only real instance of Ripman's attempt to describe and classify any

connotative meaning.

Bloomfield's (1933) classification of phonemes (6) included a

category called secondary phonemes, including pitch and stress. His

description of pitch phonemes was based altogether on grammatical

functions of intonation, and he chose to mark the five pitch phonemes

by standard punctuation marks. Like all others who observe the

grammatical function of intonation, Bloomfield was interested only

in the phrase-terminal contours. Functions of intonation ether than

grammatical functions Bloomfield said were "gesture-like variations,

non-distinctive, but socially effective border[ing] upon genuine

linguistics" (6, p. 114) . Even though his analysis was grammatical,

the fact that Bloomfield gave pitch and stress phonemic status laid

the groundwork for detailed "semantic" analyses. The reference to

prosodic features as phonemes implied that they could be combined

into morphemes. It is of interest to note that not all morphemes

may be meaningful when isolated, though this latter notion is the

basis for many later "semantic" analyses.


Aside from the grammatical categories described by Bloomfield,

Maria Schubiger (1935) (79) admitted emotional "meanings" of intona

tion, but quickly dismissed the possibility of describing such

meanings since they are characterized only by their deviation from

normal grammatical patterns. Schubiger was perhaps the first writer

to clearly state the relationship between intonation patterns and

the grammatical constituents of a sentence. Her basic description

of intonation was the same as that of Armstrong and Ward's Tune I

and Tune II, but she applied the descriptions to constituent struc

tures of a sentence as well as to the sentence final pitch patterns.

So that, the ends of phrases and clauses may be said to have tunes

very much like those found at'the end of sentences.

The primary contribution of R. H. Stetson (1951) (83) to this

line of analysis was.a concept of the breath-group. A breath-group

is a phrase or group of words uttered between pauses or potential

pauses. Thus, a breath-group need not be bound on either side by a

pause for breath. He declared that pitch variation was simply a

by-product of the more important stress patterns within an utterance

but intonation is singularly important at the ends of phrases or

breath-groups. It remained only to identify Stetson's breath-group

with Schubiger's constituent structures to complete the grammatical

analysis of intonation.

Charles Hockett's A Manual of Phonology (1955) (36) provided a

link between Stetson and Schubiger by describing a macrosegment as

whatever occurs between pauses. This macrosegment, he said, is com

posed of two immediate constituents, the intonation and what is left


12

when intonation is removed. But for one whose description is.

primarily grammatical, Hockett's desctiptive apparatus is quite

detailed; he included three pitch levels, three terminal contours,

and one emphatic feature by which he meant primarily extra-high

pitch.

Harold King (1961) (46) provoded no new contribution, though

his description was slightly different. He indicated that the

starting point of the terminal contour was about as important as the

direction of pitch change, making a distinction between high-rising

and low-rising contours. King described these contours as "separate,

meaningful, grammatical elements" (46, p. 24).

Noam Chomsky's pronouncements on intonation are rare. In

attempting to refute current phonemic theory, which would include

a phonemic description of intonation, Chomsky (1964) said, "It has

. . . been studied in relative or complete isolation from the syn

tactic setting within which phonological processes operate" (16).

Thus, intonation is to be considered in the light of syntactic or

grammatical functioning or not at all.

Hans Kurath (1964) (49) added another necessary link between

Stetson and Schubiger by stating that "prosody and syntax are comple

mentary aspects of sentence structure. Conjointly they constitute

the grammar of the sentence" (49, p. 126). He implied that intona

tional patterns are somehow equated with constituent phrases in a

sentence. His insistence that intonation functions exclusively in

sentences and their constituent parts made Kurath conclude that

single words must be called sentences if they are spoken in isolation.


To Tune I (falling) and Tune II (rising), Kurath added a third

(level), which he said signals the end of a non-final sentence

constituent. Precontours, or that part of the sentence which conies

before the final contour, and pitch level per se must be considered

grammatically irrelevant, though they may have social significance

for conveying connotative or emotive meanings or conforming to

dialect patterns. When he came to a consideration of "emotive-

directive" uses of intonation, Kurath threw up his hands with the

comment that they are "innumerable . . . Hence, the precise nota

tion of emotive-directive intonations is as complicated as the

identification of the semantic range of words and morphemes" (49,

p. 132).

Philip Lieberman (1967) (53) completed the identification of the

breath-group with the constituent structures of a sentence, using

Chomsky's terminology, "the underlying phrase marker" (53, p. 109).

Lieberman returned to Armstrong and Ward's two-tune analysis but

gave them different names. The falling Tune I he called the unmarked

breath-group; and the rising Tune II, the marked breath-group.

Lieberman added information about the manner of perception of into

nation, using the gesture theory of perception applied by others to

the perception of segmental phonological features (phones). The

gesture theory of perception says basically that a person must

reproduce what he hears in order to properly perceive it. The

reproduction need not go as far as actually uttering sounds, but the

process does require some neural impulse and feedback in the speech

production system. Once the perception of the breath-group is


14

identified as marked or unmarked, the listener can assign a semantic

interpretation only when he knows the grammatical relationships of

the constituent phrase markers. In other words, a listener must

know what was necessary physiologically for the production of a

contour and the -grammatical relationships of the constituent elements

of the sentence before he can assign a meaning to the sentence.

The last work on English intonation, like the earliest, inter

prets intonation as a primarily grammatical element of language.

With more than four centuries of development, the essential differ

ences between Hart's description and that of Lieberman are a matter

of terminology and detail.

SEMANTIC

The semantic interpretation of intonation depends on the ability

of a speaker to intend and listeners to perceive a specific connota

tive meaning which is superimposed upon, but unrelated to, the

grammatical usage of intonation. The description of the types and

numbers of connotative meanings which can be conveyed by intonation

contours differs with each writer. The first writers as a group to

be interested in conveying meaning in expression were the rhetoricians

and later the elocutionists. The review here will begin with a

writer of primary importance in the latter group and proceed to the

most recent works.

James Rush's The Philosophy of the Human Voice (1826) (77) is

the most outstanding work on voice from the elocutionary movement.

His treatment of intonation included all the aspects of what he


15

called speech melody, which were pitch, force, time, and pause. He

suggested the notion of steps and glides of pitch, but restricted

his description to a musical tone system. Rush generalized that

certain kinds of pitch changes were necessary at pauses to convey

the normal sense- of the utterance, but he further attempted to "show

that some . . . phrases of melody may be employed as the appropriate

signs of certain sentiments" (77, p. 143). And the latter seems to

have been his primary purpose with regard to intonation. He empha

sized concrete (glide) and discrete (step) movements of pitch with

whole chapters devoted to glides of different directions and degrees,

describing the kinds of meanings conveyed by each type of pitch

movement.

Much later Otto Jespersen (1924), whose primary linguistic

interest was diachronic or historical linguistics, made one important

statement to indicate his understanding of the kinds of meanings which

can be conveyed by intonation. "Even a baby shows by his expression

that he can distinguish clearly between what is said to him lovingly

and what sharply, a long time before he understands a single word of

what is said" (40, p. 111).

In his English Intonation with Systematic Exercises (1924) (69)

Harold E. Palmer applied the term tonetics to the study of tone-

curves without regard to their meanings and intonation only to the

"science" which assigns meanings to such tone-curves. In the latter

portion of his book he listed a number of very specific patterns with

equally specific meanings assigned to them. On page 86 the author

offers a "synoptic summary of the semantic functions of the tone-


groups." There are four basic tone groups with slight variations

applied to different types of utterances to produce shades of

meaning; such as, protesting statements, categorical statements,

animated statements, implicative statements, and others. Similar

lists of meaning variations are given for Special Questions, Command

General Questions, and Isolated Words with a total of 31 meaning

categories in all.

Even though his analysis was primarily grammatical, Leonard

Bloomfield (1933) (6) was perhaps the first to suggest that there

were pitch phonemes, which led ultimately to the conclusion that

there must also be pitch morphemes. The works listed below in this

section are, for the most part, the logical extensions and logical

consequences of Bloomfield's assertion.

Bernard Bloch and George Trager (1942) (5) clearly defined the

stress phonemes and described juncture as a part of the prosodic

features of English. Their discussion of tones was brief and not

quite so specific since they were trying to outline descriptive

techniques and not to describe English in detail. They grouped

stress and tone together as features of accent, suggesting that high

and low pitch levels and rising, falling, or level tone contours be

indicated by accent marks or numerals over the letters. They indi

cated some indecision concerning how to determine the phonemic tones

but did not doubt that such a thing existed.

Zellig Harris (1944) (33) followed Bloomfield's lead and tried

to describe the segmentation method necessary for discovering the

intonation morphemes -- or prosodemes. He concluded that intonation


patterns cannot be analysed except with reference to the contexts

in which they occur. But the reverse is also true; segmental

phonemes are classes of sound minus their pitch and stress features

and do not exist as such. Thus, Harris pointed out an inter

dependence of the simultaneous components of phonology.

Rulon Wells (1945) (92) used the conclusions of Bloomfield,

Bloch and Trager, and Harris, and yet differed from all of them,

to develop a new approach, one which Kenneth Pike used that same

year. Wells set out to analyze pitch in the same way in which seg

mental phonemes were analyzed, by a contrastive analysis. He disa

greed with Bloomfield's conclusion that pitch phonemes carried

meanings. He said that pitch phonemes must be organized into pitch

morphemes, "which are the strict analogues of segmental morphemes

composed of segmental phonemes" (92). Wells differed from Harris

also, maintaining that prosodic features are not parts of phones,

but features or qualities of phones and therefore separable from

them. So Wells concluded that intonation morphemes are capable of

being analyzed separately from the segmental phonemes and carry

a meaning of their own.

Kenneth Pike's (1945) (73) conclusion that many intonation

contours are explicit in meaning is the very assumption which this

study proposes to examine. Pike used the discovery technique deve

loped by him and others for application to segmental phonemes to

study the so-called supra segmental phonemes of stress, pitch, and

juncture. He described in detail a number of intonation contours

to which he ascribed specific meanings. For the purpose of this


18

study, we may say that the semantic interpretation of intonation

reached its height in Pike's work; but several more works will

follow, most of which agree with Pike's general assumption if not

with his specific descriptions.

Rulon Wells (1947) (91) made an attempt to combine the immediate

constituent analysis of the grammatical interpretation of intonation

with the semantic interpretation. He said that an intonational

pattern marks the limits of a constituent and is equivalent with a

pause-group; each pause-group contains one intonation pattern which

he called a pitch morpheme. This approach did not alter his basic

assumption concerning the meaningfulness of intonation.

Dwight Bolinger (1949) (11) agreed in principle with Pike and

Wells that intonational patterns must be considered morphemes, but

he questioned the validity of the application of segmental phonemic

techniques to these quite different linguistic phenomena. His

objections were that (1) commonsense knowledge of intonation could

not be demonstrated to exist since the alphabetic writing did not

take it into account, (2) intonational patterns are superimposed

upon semantic units all shorter in length than the contour itself,

(3) intonation is not an arbitrary system like segmental phonemics,

and (4) intonation is composed of only one variable (pitch) while

segmental phonemes have many variables. Bolinger concluded that

there were many more questions than answers.

In his study of British radio dramas, Wiktor Jassem (1952) (38)

apparently used the same assumptions as Wells and Pike. His analysis

was based upon the specific meanings carried by intonations. In a


number of experiments Jassem found 84% correct response by listeners

in identifying situations under which a sentence might be uttered.

He defined twelve nuclear tunes which could have meanings alone or

in combinations.

W. F. Twaddell (1953) (88) made no real contribution to already

existing suprasegmental phonemics. He did attempt to correlate

Stetson's work with the suprasegmentals. Like Pike and Wells and

others, Twaddell maintained that the pitch levels (four in number)

are autonomous phonemes and that there are further pitch-contour

phonemes as opposed to pitch-level phonemes.

In a comparison of intonation in English with that of Japanese,

Isamu Abe (1955) (1) concluded that the function of superimposing

the speaker's emotions upon his utterance was the same for both

languages. He stated, perhaps more clearly than anyone before him,

that "it may be safely assumed that intonation . . . has a value of

its own as a psychological pitch curve. This seems to be especially

true of the English language" (1).

Daniel Jones (1956) said that "intonations in language have

meanings which are superimposed on the dictionary meanings of the

words uttered" (40, p. 277). He equated, more or less, Stetson's

breath-group with what he called a sense-group; but this was essen

tially the same notion expressed earlier by Wells' use of the term

pause-group. Jones concluded, strangely, that the listener must be

conscious not of what the speaker says, but of what he intends to say

so far as intonation is concerned; that is, the objective realization

of the subjective intent frequently fails to conform with the latter.


20

Pitch contrasts between syllables and within syllables were made

the important considerations in Roman Jakobson and Morris Halle's

Fundamentals of Language (1956) (37). In fact, they maintained that

contrasts in any prosodic features "are fully recognizable only when

both of them are present in the given sequence, so that the speaker

effects and the listener perceives their contrast" (37, p. 26).

This is saying, perhaps more specifically, what Pike meant by using

the term relative for the prosodic features of pitch and stress.

Relative relationships are evident only when they can be compared.

Noam Chomsky, with Morris Halle and Fred Lukoff (1956) (17),

began an attack primarily on the four phonemic levels of stress

proposed by Pike and many others. They chose to argue that two

levels of accent are necessary. They carefully distinguished accent

as a transcription notation and stress as the degree of loudness in

an utterance; they argued that the many degrees of stress are predic

table with the use of a two-accent notation. The authors then

argued, though it did not seem to be their original intent nor even

the logical conclusion of their direction in the paper, that accent

must be considered a distinctive feature, not a phoneme, since a

phoneme is to them a bundle of distinctive features. It does not

seem reasonable to this writer that the authors could maintain on

the one hand that distinctions other than accented-unaccented are

allophonic and argue on the other hand that stress is not phonemic.

It is not clear how they would handle intonation, which clearly is

composed of a sequence (or bundle) of tones. They concluded only

that intonation is an utterance-long or phrase-long component, never


21.

affirming or denying its phonemic or morphemic status or function.

Dwight Bolinger's questioning attitude in an earlier article

gave way to a positive position in his "Intonation and Grammar" (1957)

(12). Against those people who maintained that intonation served

only to demarcate the grammatical segments of a sentence, Bolinger

countered with a firm no. "The encounters between intonation and

grammar are casual, not causal . . . intonation is not grammatical"

(12) .

Robert Stockwell (1957) (86) continuing to analyze pitch

morphemically, argued that only two pitch phonemes and only one

juncture phoneme are necessary. Thus he believed he had a simpler

descriptive tool for intonation morphemes.

According to Nien-Chuang Chang (1958) (14), the Chengtu Chinese

dialect, aside from its tonal system, has an intonation system which,

like that of English, brings out different shades of meaning. Those

meanings are not related to the lexical content of the words, but to

the emotional state of the speaker, precisely the function which

Pike claimed for English intonation.

In his The Groundwork of Eng1ish Intonation (1958) (48) Roger

Kingdon distinguished between what he called static tones, or pitch

levels, and kinetic tones, or slides. He found four static tones

and five kinetic tones, which, when used in combination, form a

large number of tune forms. Kingdon said that American intonation

depends strongly on the static tones, but British English is more

dependent upon the kinetic tones. The author classified six types

of utterances (three of which might be considered grammatical classes;


and the other three, attitudes) each with a number of fine shades of

meaning; the total .number of his types and sub-types is thirty-seven.

Kingdon1s static tones and kinetic tones correspond, more or less,

to the tones and terminal contours of several other linguists. In

another work, Kingdon (1958) (47) briefly explained his analysis of

British English intonation found in the work above.

Charles Ilockett's description (1958) (35) of American English

was very nearly the same as that of Kenneth Pike (73). He used four

pitch levels and three terminal contours. The differences were his

recognition of only two phonemic stresses and the fact that he

reversed Pike's notation system for the four pitch phonemes. From

the highest to the lowest, the pitches /l, 2, 3/ and /4/ of Pike

became /4, 3, 2/ and /l/ for Hockett. This work represented a slight

change in Hockett's earlier description (36). What is significant

here is Hockett's assignment of phonemic status to all these

"distinctively different features" (35, p. 33).

Dwight Bolinger (1958) (7) once more added his findings and

opinions to the sometimes controversial descriptions of intonation,

this time denying the separability of stress and pitch as indepen

dently variable phonemic systems. He maintained that pitch is

merely a function of shifting sentence stress, and serves as a cue

for stress. The two do not shift independently, according to

Bolinger.

Maria Schubiger (1958) greatly changed her interpretation of the

role of intonation from her earlier view in 1935. The earlier work

(79) was discussed among those which interpret intonation's function


23

as a primarily grammatical one. In her English Intonation (78)

Schubiger still referred to sense-groups and tone groups as the

relationship between intonation and the grammatical units of a

sentence, but her description was now more detailed, and she main

tained (quite in opposition to her earlier view) that the main

function of intonation "is to give voice to the speaker's attitude"

(78, p. 38). She went on to list more than thirty specific intona

tion patterns with specific connotative meanings.

The article of FrantiSek Danes (I960) (22) argued that intona

tion is neither syntactic nor phonemic. Though it certainly is a

part of phonology, it composes a special system quite apart from

phonemics. Patterns of pitch and stress, he maintained, function

as wholes to make words communicative units. Danes concluded that

although intonation works most effectively in its ability to commu

nicate emotional attitudes, such a function does not supercede nor

interfere with the basic intonations of communication. Apparently

no one has attempted to follow his lead and develop a third kind of

analysis uniquely different from the two types of analysis here

discussed.

In Generality, Gradience, and the All-or-None (1961) (10) Dwight

Bolinger's questioning continued, though again colored by his appa

rent opinion that intonation communicates "meaning." He recognised

the common tendency of speakers and listeners to generalize. For

example, in ambiguous statements, listeners always choose one or

another meaning, not neither or something between. Gradience in

pitch is treated by listeners with an all-or-none response. Thus,


24

linguists and others tend to hear levels of pitch rather than the

infinitely gradient reality.

Gordon Peterson (1961) (71) seems to have been the first person

to use the term prosodeme to refer to the prosodic content of an

utterance as a kind of whole and not as a set of independently

variable factors, such as pitch, stress, and juncture.

Daniel Jones' discussion of The Phoneme (1962) (42) included

chapters on length, pitch, stress, and prominence. He maintained

that intonation can express emotional states and implications which

are incapable of expression in words. He agreed that there are many

special meanings which can be assigned to intonation patterns and

referred the reader to other works for a discussion of these patterns.

Prominence may be achieved by stress, quality, duration, and intona

tion. Perhaps Jones should have added that prominence may be the

result not only of any one of the factors cited, but also of any

combination of the factors, which is probably more common than the

former. Changing the place of prominence may subtly change the

meaning of the sentence, according to Jones.

In a somewhat different work, Clarence L. Header and John H.

Muyskens (1962) (64) were concerned with the physiological aspects

of speech and language development. They said that the child first

develops the prosodic features of language and then the sounds of

language. They discussed the effects of emotional states upon the

pitch and intensity and even the quality of the voice. Their primary

concern was the physiological effect of such emotional states. Added

tension raises the pitch of the voice and has other effects on


25

quality and intensity. The writers did not say to what extent those

effects might be perceived by listeners or might be recognized to con

vey any specific meaning. They concluded with Bolinger (7) that

pitch is a function of stress or emphasis and is not an independent

variable; the tune is an integrated whole.

Albert H. Marckwardt's (1962) (62) review of Chomsky, Halle,

and Lukoff's article, cited above, was highly critical. He argued

that they used the term "economy" in many different ways, that they

manipulated the juncture feature to suit their fancies, and that

their two-tone system of transcription was entirely useless for

adequate description of English. He concluded that the use of

binary oppositions does not.advance the knowledge of prosodic

features and certainly does not lead to a useful transcription; thus,

the Chomsky, Halle, and Lukoff approach yielded neither theoretical

nor practical results.

Following Bloomfield's lead, Otto Jesperson (1964) indicated in

Essentials of English Grammar (39) that the meaning of a sentence is

strongly dependent upon its intonation, which is tied closely with

stress. Stress and intonation may even reverse the meanings of the

words of a sentence. But Jesperson's only descriptions of intona

tional patterns were related to certain grammatical forms, not to

specific meanings. Perhaps he, like Kurath (49), recognized the

"semantic" function, but, seeing the relative ease of description of

the grammatical function, abandoned any attempt to describe the

former.


Although there have been many writers who have recognized the

function of intonation in changing the meanings of sentences, the

relatively few attempts at description are not in agreement on the

number and kinds of meanings, the best descriptive devices, nor even

the terminology to be applied to the descriptions. Perhaps there is

no way of knowing which description is best or most useful; but that

of Pike seems to be the most widely adopted in America.

EXPERIMENTAL

Experimental approaches to the study of intonation have used

the theories and■impressions of linguists to test the prosodic

features of language or to reconstruct them instrumentally. The

following section might be subdivided into (1) those studies which

have tested the validity or one of the theoretical approaches to

intonation or which have contributed materially to either the

grammatical or semantic theories of intonational meaning, (2) those

studies which have tried to discover and describe the physical or

physiological correlates of perceptual prosodic features such as pitch

and stress, and (3) those studies which have tried to develop instru

mentation for more accurate measurement or reproduction of prosodic

features.

Studies of Theory

In one of the earliest works of this type, Grant Fairbanks' (1940)

study of pitch changes in the voice as related to different emotions

(26) required a validation procedure which is of special interest


27

here. He asked six amateur actors to simulate five emotions. To be

sure he had utterances which carried the emotions he wanted, he asked

64 trained observers to judge the emotions of the speakers in the

30 sentences obtained. The judges selected the "right" emotion from

a list of 12 possible choices 66% to 88% of the time, the average

correct choice being about 80%. Fairbanks and LeMar Hoaglin (1941)

(25) later used the same validation procedure in a study of duration

and pause with the same results.

In his book on intonation Wiktor Jassem (1952) mentioned a

number of experiments (38, p. 47 ff.) similar to the validation

procedure of Fairbanks in which he found 847» correct response by

listeners in identifying situations under which a sentence might be

uttered. He was using taped radio dramas with professional actors

from the British Broadcasting Company.

Turning to another aspect of prosody, Dwight Bolinger and Louis

Gerstman (1957) (9) tested the relevance of the prosodic feature

called juncture (or disjuncture). They found that the utterances

"light housekeeper" and "lighthouse keeper," often thought to differ

only in stress, could be easily interchanged by changing the juncture

alone.

Although Denes (1959) (24) accepted the theory that tones exist

as linguistic features to convey information about the speaker's

emotional attitudes "in the same way as phonemes or words" (24), he

found that the tones are not functions strictly of fundamental fre

quency. Listeners could perceive "tones" clearly though by means

of the vocoder the speech was altered to a whisper. Using


synthesized speech, Denes produced tone changes on a single vowel

and asked listeners to identify the meaning from six possible cate

gories. Some tones were correctly identified by 80% of the listeners

some by only 217,.

James Flanagan (1958) (27) discovered that tones of language

apparently are not judged by differential discrimination, but more

nearly on the basis of an absolute judgment. He found that the

difference limen for fundamental frequency discrimination of synthe

tic vowels varied between + .28 and + .48 Hz, or a difference of

+ .23% to + .457» of the fundamental frequency. This conclusion does

not-deny Joshua Steele's notion that pitches in speech are infinitely

variable rather than discrete like the tones of musical notation.

It does mean that perceptible pitch changes in speech require more

change in fundamental frequency than might be expected if the judg

ment were strictly a differential discrimination. This study appears

to support the notion that there can be phonemic pitch levels which

are contrastive and can be differentiated by the listener.

In an attempt to test the semantic theory of intonational

meaning Elizabeth Uldall (1960) (89) used synthetic speech to produce

four sentences, varying only what she called the intonation, by which

she apparently meant the fundamental frequency. She presented the

sentences to twelve American listeners and asked them to respond to

the emotional meanings of the sentences by means of a list of ten

bi-polar adjective scales based on Osgood's semantic differential

research to find whether or not listeners would agree on the meanings

of intonations. Uldall concluded from the widely scattered scores


that meaning could not properly be assigned to intonation. One

possible fault wit! her study lay in the fact that she used synthe

tic speech and a very artificial smoothed contour. A smoothed

contour is an artificially produced glide in pitch which makes use

of a constant rate of change in a specific constant duration. In

other words, all glides in pitch were basically the same. The

speech which the listeners heard could not be said to be natural.

There is also the possibility that her testing instrument was

deficient.

In a study of more immediate concern, Philip Lieberman and

Sheldon B. Michaels (1962) (58) conducted an experiment to find the

relevant physical correlates of intonation patterns. Three speakers

were asked to protray eight emotions using neutral sentences. Both

naive and trained observers were asked to select the sentences which

best characterized each category of meaning. Then Lieberman and

Michaels synthesized utterances using the information obtained from

the sentences selected. Five different tapes were made, one varying

only pitch info'rmation and removing phonetic and amplitude informa

tion, the second including amplitude information, the third modula

ting the amplitude with smoothed pitch of 40 msec, time constant,

the fourth with a smoothing time constant of 100 msec., and the

fifth tape using a constant pitch with modulated amplitude. Ten

naive listeners heard each tape. Lieberman concluded that changes

in fundamental frequency alone are not sufficient to transmit emo

tional meanings. Listening to unprocessed speech, listeners

correctly identified the emotion 85% of the time. TTith-both pitch


and amplitude information listeners correctly identified the emotion

about 50% of the time; amplitude information alone yielded 447,

correct identification; and those with smoothed pitch gave 38% and

257. respectively. The monotone tape was correctly identified only

14% of the time; but even that was found to be significant at the

.006 level of confidence. So apparently both pitch and stress

elements are necessary for recognition of emotional meanings, so far

as synthetic speech is concerned.

A question of the validity of many earlier works was raised by

K. N. Stevens, T. T. Sandel, and A. S. Howse (1962) (84), who con

cluded that using non-speech stimuli to test speech perception is

probably questionable. They used bursts of noise to test perceptions

of loudness and duration. The responses were found to be unpredic

table on the basis of the acoustic stimulus. On the basis of their

data, they maintained that the perception of speech events depends

strongly on the context of the event.

In contrast to some studies which found that listeners could not

identify emotional meanings of intonation, K. Hadding-Koch and M.

Studdert-Kennedy (1964) (32) concluded that listeners judge meaning

not only by the terminal glide but by the entire contour. They

synthesized sentences using set fundamental frequency levels for four

phonemic pitches. When listeners judged the preferred question

intonation, their judgments were found to be functions of three

variables of the contours; peak, turning point, and end point. Thus,

any study based primarily on the terminal contours for recognition

of meaning is necessarily deficient.


31

In a test of phonemic pitch notation systems Philip Lieberman

(1965) (54) argued that the linguist only filled in the appropriate

notation because of his knowledge of the language structure, not

because of the existence of any free phonemic pitch levels. Quali

fied linguists were found to vary videly in their transcriptions of

the' same event with reference to relative fundamental frequency.

Lieberman concluded that "the phonemic pitch levels and terminal

symbols of the Trager-Smith system often have no distinct physical

basis" (54). Lieberman then concluded that the factors necessary

for predicting intonation are the physiological structures of speech,

the emotional state of the speaker, and the recoverability of the

deep, or underlying, structure of the sentence. But like most other

generative transformational grammarians, Lieberman only asserts that

the intonation is thus predictable. He did not illustrate the

application of such a solution to practical linguistic problems.

In other words, he did not test his own hypothesis.

Physical Correlates

In some of the earliest experimental investigations on prosodic

features of language Samuel Lifshitz (1933) (61) found that the

apparent duration (perceived length) of a sound impulse, a click,

depends on its equivalent loudness. He repeated the experiment using

pure tones (60), and the result was the same. The apparent applica

tion is to stress in language, since stress is generally defined in

terms of loudness alone.


32

Because many linguists, including Pike, Bloomfield, and others,

generally have defined the prosodic feature of stress as the result

of breath-force or relative loudness, H. Mol and E. M. Uhlenbeck

(1956) (65) decided to study the effect of intensity changes on

perception. By means of an amplifier, the word per'mit, recorded on

tape, was altered so that the greater intensity was on the first

syllable and the intensity of the second syllable was greatly

reduced. But no listener confused the recorded verb with the noun

form 1per-mit. The opposite procedure yielded similar results. But

Mol and Uhlenbeck were dealing only with intensity, and did not

change pitch or duration; which means that they may not have changed

the listeners' perception of loudness, which is not dependent in

language strictly upon intensity.

In some experiments in the perception of stress, D. B. Fry

(1958) (28) determined that judgments of stress in natural speech

are always based on a combination or inter-action of cues, among

them intensity, frequency, and duration. The one factor which seemed

to him most important in the perception of stress was pitch. Second

in importance was duration; and intensity was last.

Philip Lieberman's study (1960) of acoustic correlates of word

stress (56) tended to support the earlier findings of D. B. Fry

mentioned above (28). Lieberman concluded with Fry that fundamental

frequency seems more important than amplitude for the perception of

stress, and no stressed syllable can have both lower frequency and

lower amplitude than the unstressed syllable of the same word, even

if the duration of the stressed syllable might be increased.


33

In 1958 Lieberman (59) tested the theory that intonation is

produced by changes in the fundamental frequency alone. He con

cluded that the listeners can perceive the changes in fundamental

frequency as changes in pitch; they are further able to use informa

tion on the direction, rate, and magnitude of fundamental frequency

change; so that any or all of these might be cues in identifying an

intonation contour.

Accent in Serbo-Croation has been described in terms of both

pitch and intensity, but never quantity since long and short vowels

occur in contrast in both accented syllables and unaccented syllab

les. But Use Lehiste (1961) (51) concluded on the basis of spec-

trographic analysis that fundamental frequency and amplitude did not

always function as predicted from the impressionistic description.

But she probably should not have expected fundamental frequency and

amplitude to correspond closely to the perception of pitch and loud

ness because these physical and perceptual categories rarely corre

spond linearly though they are certainly related. Further, the

perceptual unit (pitch or stress) is often found to depend on a kind

of interrelationship of two or more physical parameters.

In a relatively rare study of physiological elements of speech

production Peter Ladefoged and Norris P. McKinney (1963) (50) found

that pitch changes in voice may be affected by changes in subglottal

pressure rather than by changes in vocal fold tension. If such

changes, occur during speech, it may be that intonation is a function

of stress, not of conscious pitch changes.


34

I n s t r umcn t a t i on

One study of importance because its findings made other research

in intonation possible V7as that of A. R. Adolph (1957) (2). He

developed a method for filtering the intelligible elements from a

speech sample without distorting the fluctuations in fundamental

frequency.

JohnM. Borst and Franklin S. Cooper (1957) (13) reported another

device which would allow for the manipulation of pitch and loudness

by any experimenter using a vocoder device. By this approach, real

speech could be changed with respect to pitch and loudness in any

way the experimenter might choose. The independent variables could

then be studied.

According to L. G. Kersta, P. D. Bricker, and E. E. David (1960)

(45) rapid fluctuations in voice pitch are necessary to the percep

tion of human speech as opposed to the less natural speech of the

vocoder. One might conclude, although Kersta did not, that experi

mentation performed with synthetic speech may be suspect when

listener judgments are involved.

Carrying Kersta's study a step further, Philip Lieberman (1961)

(55) analyzed and described the perturbations in pitch which make

human speech have a natural quality as opposed to the mechanical

quality of speech synthesizers. Apparently his purpose was to enable

researchers to construct synthesizers with a more nearly human voice

quality. The human voice is composed of nearly periodic tones, while

speech synthesizers produce a true periodic tone. If random fluctu

ations in the period of fundamental vibration could be produced


electronically, the machine would sound more nearly human.

Perhaps ignoring the theoretical arguments between the two

schools of thought, Ignatius G. Mattingly (1966) developed a compu

ter program for synthesizing prosodic features in artificial speech

using both grammatical and "meaning" approaches. He said, "The

tunes of natural speech distinguish between different syntactic

structures and relate successive sense-groups; they also give the

listener information about the personality of the speaker, his emo

tional state, or his attitude toward his utterance. . . . The

acoustic correlate of a tune is the modulation of the fundamental

frequency of the voiced portion of the sense-group" (63) . With

Mattingly's program listeners were said to be able to perceive

special meanings and resolve ambiguities. The only apparent defect

in the program is that it made use only of pitch variations and.

ignored intensity.

It thus appears that the empirical branch of intonational

studies is as divided as the theoretical descriptions. There are

those who "proved" that intonation carries only syntactic information

others, who "proved" that prosodic features convey emotional meanings

Some studies have tended to support the existence of pitch phonemes;

others, to destroy the notion. And finally, Mattingly was able to

construct a computer program making use of both kinds of information

to improve the production of synthetic speech.


CHAPTER I I I

PURPOSE AND DESIGN OF THE EXPERIMENT

The problem defined in Chapter I explains the necessity for

testing the semantic function of intonation in American English.

The question is whether semantic units or morphemes in American

English may be composed of prosodic or suprasegmental phonemes.

PURPOSE

The purpose of this study is to test the hypothesis of some

linguists that there are intonation morphemes. The exponent of this

hypothesis whose specific descriptions of such morphemes are to be

tested is Kenneth L. Pike, whose descriptions appear in his Intona

tion of American English (73) . The eleven contours chosen for this

study are those whose meanings are defined as INSISTENCE, SIMPLE

IMPLICATION, STRONG IMPLICATION, SIMPLE QUESTION, POLITE QUESTION,

DELIBERATION, SURPRISE, DISAPPOINTMENT, LIGHTNESS, REPUDIATION, and

HESITANCY. These contours were selected from Pike's work by omitting

those contours which were said to be generic of a class or a combina

tion of contours.

EXPERIMENTAL HYPOTHESES

The primary experimental hypothesis is the null correlate of the

positive statement: "Native speakers of American English can recog

nize the defined meanings of specific contours." Some sub-hyoptheses

36


directly related to the primary hypothesis are:


by listeners to full and intelligible speech and those

assigned by listeners to nonsense sentences.





by listeners to sentences with Pike's contours produced

by a linguist and those assigned by listeners to sentences

produced by an actor with the same intended meanings.


by naive listeners and those assigned by non-naive

listeners to the same sentences.

Obtaining the Stimulus Items

The writer and his faculty adviser selected two sentences of

more Or less neutral emotional content:

(1) It's on the table.

(2) I want to go home.

The writer constructed two nonsense sentences with similar

phonetic elements, similar types of transitions, and the same number

of syllables as the sentences above:

METHODOLOGY

(3) [sskj im vs

(4) [oi jssmp pa "bei


Two linguists on the researcher's committee approved the nonsense

sentences as phonetically similar to the first sentences.

The researcher asked a graduate student with considerable

training in linguistics to speak the four sentences in fifteen

different ways, using the eleven contours specified by Pike and four

others to be used as distractors. The eleven contours of interest

are the following:

1. INSISTENCE 2- '2 - 4

2. SIMPLE IMPLICATION 3- '2 - 3

3. STRONG IMPLICATION 2- '2 - 2

4. SIMPLE QUESTION 4- '3 - 2

5. POLITE QUESTION 2- '2 - 1

6. DELIBERATION 3- '4 - 3

7. SURPRISE 3- '1 - 4

8. disappointment 3- '3 - 4

9. LIGHTNESS 3- '1 - 2

10. REPUDIATION 3- '4 - 3

11. HESITANCY 3- '2 - 3'

Using a Briiel and Kjaer model 4132 condenser microphone, a Briiel

and Kjaer model 2603 microphone amplifier, and a Sony model TC 800

tape recorder, the researcher recorded the sentences produced by the

graduate student, hereafter called the linguist (see appendices for

justification of instrumentation).

The recorded sentences were randomized and presented for vali

dation to a panel of three trained linguists on the researcher's

committee. Given a description of the intended contours, the


39

linguists judged whether the sentences did or did not have the

specified contours. Only those sentences accepted by all three

members of the panel were used. Others were re-recorded until there

was unanimous acceptance by the three members of the panel. The

panel members' agreement with one another whether to accept or reject

a given item was 97% on the first listening.

Once all the sentences were accepted by the panel of linguists,

the meaningful sentences were distorted by passing them through an

Allison model 2-BR variable filter with the high frequency cut-off

adjusted to 225 Hz. The distorted sentences became the third set of

recorded sentences to be used as stimulus items for listeners, along

with meaningful sentences and nonsense sentences.

A similar procedure was used to collect and validate sentences

from a trained amateur actor, a graduate student in dramatics. First,

three actors were asked to portray the same emotions or meanings as

those in the list above as well as the distractors, in producing the

same four sentences. The sentences of all three actors were presented

to a panel of five trained dramatists and directors, all members of

the faculty at Louisiana State University. The panelists were to

judge whether the actor had or had not conveyed the meaning intended

for each given sentence. The one actor whose sentences were most

often accepted by the panel was selected as the speaker for the

experiment. Acceptance by four of the five judges constituted vali

dation of a given item. The agreement of this panel was about 60% on

the first listening. Any unacceptable items were discarded and re

recorded until the panel validated them. For two items, both with


40

the intended meaning of REPUDIATION, the researcher finally resorted

to a ranking procedure to select the one utterance of three which

best conveyed the meaning. Three of the actor's attempts to produce

sentences with the meaning of REPUDIATION were presented to the panel

which was asked to rank the sentences in the order of their ability

to convey that meaning. One sentence received the first ranking by

four of the five judges. Those same items had been the last to be

finally approved by a panel of linguists as well. The meaningful

sentences were distorted by the same process mentioned above to

produce a third set of stimulus items produced by the actor.

Testing Procedure

The three kinds of recorded material for both the actor and

the linguist were presented to three randomly selected groups of

naive listeners and three randomly selected groups of non-naive

listeners with not less than 25 in each of the twelve groups (see

Table I below). Non-naive listeners were those who were assumed to

have consciously developed an awareness of the prosodic features of

language under whatever name or for whatever reason; specifically,

those students who have successfully completed such courses as Speech

2 Voice and Diction, Speech 127 and 128 Applied Phonetics, Speech 103

Introduction to Descriptive Linguistics, Speech 205 Linguistic Geogra

phy, English 170 Introduction to Linguistics, English 172 The Contem

porary English Language, English 205 and 206 Language, Anthropology

160 Field Methods in Linguistic Research, or Romance Philology 225

Language Analysis. Naive listeners were those who had had none of


these courses,, All listeners were native speakers of American

English who were not bilingual either by home environment, as a

result of school training, or by any other means.

TABLE I

Distribution of listeners by groups

Naive Non-naive totals

Linguist meaningful sentences 32 35 67

>

f 194nonsense sentences 29 32 61

filtered sentences 31 35 66

Acto

r

meaningful sentences 32 35 67

)

* 194nonsense sentences 35 30 65

filtered sentences 25 37 62

184 204 388

The sentences were played on a Sony model TC 800 tape recorder,

a Sherwood model S-1000 II amplifier, and reproduced by an Acoustic

Research Laboratories AR-2ax speaker system at a constant average

level of 80 db measured six feet directly in front of the speaker.

Each group received instructions on the test procedure (see

Figure 1) and a sheet on which to mark responses to the sentences

heard (see Figure 2). Two items were presented to which the listen

ers were asked to respond for practice in listening and in using the

response form. Ample time was allowed between items to avoid rushing.

Repetition of any given item was made upon request.


INSTRUCTIONS

The sentences you V7ill hear are intended to carry some connotative meanings associated with the speaker's attitude. Please listen carefully to each sentence and check which of the meanings you believe the sentence to have. Please check one and only one meaning for each sentence; and check a meaning for every sentence. The list of possible meanings from which you are to choose is the following:

INSISTENCE - (I've said this before; now this is the last time.)

SIMPLE IMPLICATION - (. . .or else something might happen.)

STRONG IMPLICATION - ( . . . or else there will be trouble. A

threat.)

SIMPLE QUESTION

POLITE QUESTION - ( like . . . "Won't you sit down?")

DELIBERATION - (implies . . . But I want to think further on the

matter.)

SURPRISE - (Really! I don't understand that reaction.)

DISAPPOINTMENT - (I really don't feel like discussing it any

more.)

LIGHTNESS - (like talking to a baby -- or affecting femininity)

REPUDIATION - (like . . . "I don't believe a word of it.")

HESITANCY - (like . . . "I really don't want to say this, but...)

ANGER

CHIDING - (teasing)

LOVINGLY OR ADMIRINGLY

INCOMPLETE - (something else follows the last word of the sen

tence here.)

Figure 1. Listener instruction sheet


43

i

*3a.

aa.(S

ij32n5ti

5Er

J

¥

f;a

2

3

TfM.Uar.

w

iv»

•73

. j

2

.'l

i ;*4

n:c

it

an*6

*•f>•-i2

3

Ji*5

pa i

?3 {12

}<•

7i7?9

1°P

K»!*•

\7V*7

W

9>a

#

nt*a2627

26

i?

-JL

A c ii '. -M c f 1 . ; n f i c t j r.y-'i .jic.k xnj 1 v. -j . c* -rr t r.w in • 11 r.n?

If so, ! l3 y ' j l«*'m tr.#at hooo u ■» ; 111 a a ■ tcraponry r«si Im t in *fo r i; lty. country. ^hm :__in ict»ooloth*r (specify )

Figure 2. Listener response form (photographically reduced)


44

Statistical Procedure

The responses of the listeners V7ere recorded by groups as indi

cated in Table I. The responses of listening groups were compared

for significant variation using a Chi-square analysis as follows:

(1) For each of the three types of stimulus materials produced

by the two speakers, a two-by-two contingency table was constructed.

The one meaning most often selected was used for one of the frequency

counts arbitrarily called "Right," and the sum of all of the other

responses became the second frequency count called "Wrong." If the.

number of responses for the most frequently selected meaning was not

at least 50% of the total number of responses, the second most fre

quently selected meaning was added to the first for a second compari

son. This method of analysis was applied to (1) comparisons between

naive and non-naive listener responses and (2)comparisons between

listener responses to the linguist and to the actor on sentences

supposedly having the same meaning. The Chi-square analysis of the

distribution of responses in the groups being compared determined

significance.

(2) For each sentence of each of the two speakers a comparison

was made between responses of listeners to full intelligible speech

and responses of listeners to the nonsense sentence of the same type

and also the same sentence after filtering. The most frequently

selected meaning for the intelligible speech became the expected

frequency for the other two types; all other selected meanings were

grouped together for the expected frequency of "wrong" responses.

If the most frequently selected meaning did not represent at least


45

50% of the total number of responses, the second highest choice was

grouped with the first to form the expected frequency of the "right"

responses and all others grouped as "wrong." A comparison of the

expected frequencies established by the responses to intelligible

speech with the observed responses of listeners to nonsense and

filtered speech determined the significance of any variation by the

application of Chi-square analysis.

Spectrographic Analysis

A total of eighty-eight spectrograms were made of the sentences

of both speakers. The spectrograms were all produced on a Kay Sona-

Graph model 6061A with a narrow band filter and amplitude display

for intensity information. The frequency display used either the

full range up to 8000 Hz (see Figure 3) or, by use of the model 6076A

scale magnifier, a magnified scale up to 4800 Hz (see Figure 4), or

both. If a very clear display was available from one spectrogram, no

second one was produced.

Using an overlay produced from a spectrogram of the 500 Hz

harmonic calibration tone for both linear 8000 Hz display (see Figure

5) and linear 4800 Hz display (see Figure 6), the writer was able to

— determine the fundamental frequency of each syllable of each sentence.

Without knowledge of the pei'ceived tone levels assigned to the

sentences produced by the linguist, the writer assigned phonemic

pitches to the first syllable, the last stressed syllable, and the

final pitch of the sentences on the basis on the fundamental fre

quency. Obviously, the absolute frequency level was not the only


47


Figure 4.

Linear 4800 Hz

display

Figure 5. Linear 8000 Hz display of harmonic 500 Hz calibration tone

•p>oo

49


Figure

6. Linear 4800 Hz

display

of

harmonic

500 Hz

calibration

tone

consideration in the assignment of pitch. The relative relation

ships of the fundamental frequencies in the sentence in question

and in other sentences were considered as well. While the assign

ment of fundamental frequency was objective altogether, the assign

ment of pitch, even using objective data, became necessarily a

subjective task as much in looking at fundamental frequency infor

mation as in listening. Linguists assign perceived pitch phonemes

subjectively by listening to speech with no reference to the

physical parameters of that perception. But no amount of specific

information on the physical parameters removes the necessity for

assigning pitch phonemes subjectively or impressionistically. The

impression simply changes from auditory in the normal procedure

to visual in the procedure used by the researcher.

Once the tones were assigned to the sentences produced by the

linguist and compared with the pitches validated by the panel of

linguists, the writer assigned pitches to the sentences produced by

the actor as a basis for comparison. Again, the writer assigned

pitches to the syllables which Fike considered strategic without

reference to the intended meaning of the sentences. The tones

assigned to the actor's sentences for a specific meaning were com

pared with the pitches of the linguist's sentences declared by Pike

to have the same meaning. The accuracy of such comparison is depen

dent on the ability of the researcher to read the spectrograms and

to assign phonemic pitches by such a procedure. No statistical

statement of the significance of such comparisons is available.

Other kinds of individual comparisons were possible simply by


reference to the absolute ranges and frequencies used and variations

of fundamental frequency in places other than those which Pike

considered important.


CHAPTER IV

RESULTS AND DISCUSSION

The responses of all listeners were tabulated by groups (see

Table I on page 41 for the distribution of listeners in the twelve

groups) in response to each item. The total number of responses on

the 22 items of interest is equal to the number of listeners multi

plied by 22, or 8,536. For each item the total number of responses

is equal to the number of listeners, 388. For each intended meaning

being used, the total number of responses is equal to the number of

listeners multiplied by 2 since there are two different sentences

for each intended meaning, or a total of 776.

LISTENER RESPONSES

In the tables below are displays of responses in confusion

matrices. Along both axes are listed meaning categories, the

intended meaning on the vertical axis and the possible meaning

choices on the horizontal axis. The numbers within the matrices

represent the total number of listeners selecting a given meaning

for the sentences with the intended meaning in question.

Responses of Listeners to the Linguist

Of a possible 388, the highest number of listeners choosing the

meaning assigned by Pike to the contour of a given sentence was 124,

or 31.9% (see Table II). The total number of responses to all the

52


Reproduced

with perm

ission of the

copyright ow

ner. Further

reproduction prohibited

without

permission.

Table II

Confusion matrix for listeners to linguist

•oo uo a a.h a US Ui US Q) erf

Pike's meaning

MoHiEhCOHCOV-*M

onm dIHa,n Vi CO M

oi—1c-< o o>-r, m O CO M

/-■'io ro mE-* 0-« rov i roCO O' PO

LITE

QUESTION

OEHEh*HUSto«h-troQ

MCOhfH£CO

EHpiEh I Jh fb M -H oCO funQ

COCOroEhOn-J

oi-i£hMnsroPi

J>HO3h-i

%It;

Kmo

or-~*i—IaHit_>

>•<r-4oMqV-H

MEhm►4OhOOi-H1

INSISTENCE 100 43 40 8 12 30 12 31 6 37 12 42 10 21

3

SIMPLE IMPLICATION 43 37 17 35 19 36 11 29 17 26 57 7 9 8 37

STRONG IMPLICATION 45 25 30 21 28 13 8 14 5 18 14 77 15 3 69

SIMPLE QU33TI0N 6 16 24 123 37 8 58 4 24 5 7 8 5 13

POLITE QUESTION 5 11 14 99 89 12 60 6 25 12 10 2 14 5 24

DELIBERATION 29 38 10 39 29 45 9 48 9 26 28 15 4 IS 40

suit PRISE 70 31 IS 26 19 15 67 9 23 33 15 17 25 14 c;

DISAPPOINTMENT 79 45 31 7 10 44 8 41 3 32 24 42 2 12 8

LIGHTNESS 11 61 18 5 4 5 7 4 124 2 7 2 99 33 6

REPUDIATION 53 15 56 E 11 12 29 23 44 16 29 63 28 4

HESITANCY 29 109 17 4 6 9 2 10 35 12 35 2 63 43 12

LnU>

sentences was 4, 268, of which 735, or 17.2%, selected Pike's

meaning.

On the basis of the distribution in Table II, the researcher

concludes that listeners do not assign to a given intonation contour

the meaning which Pike attributes to that contour. It is obvious

at a glance that the distribution of choices is not a chance distri

bution; that is, some factor was operating to make the listener

accept certain kinds of meanings and reject others. But the distri

bution itself does not indicate what factors were operating. The

figures here include all three types of stimulus items, intelligible,

nonsense, and filtered; but the distribution is not necessarily a

reflection of that fact. Listener recognition of Pike's assigned

meaning to only the intelligible sentences of the linguist was 245

of a possible 1,474, or 16.6%,. This table does not allow a conclu

sion that there is no such thing as intonation morphemes, but it

does admit suspicion of the concept. One would expect that, even if

Pike's descriptions were wrong, listeners would choose the same

meaning for a given contour nearly all the time if intonation

morphemes had some real existence in language. Such does not appear

to be the case.

The possible implications of the distribution in Table II are

(1) Pike's descriptions are inaccurate or (2) emotional meaning

does not reside in a specific intonation contour. The first of

these implications will not be tested further; the second will be

discussed below in consideration of comparisons to be made between

listening groups.


55

Responses of Listeners to the Actor

For each intended meaning there V7as a total of 388 responses

(see Table III). The highest number of listeners selecting a given

intended meaning was 206, or 53% of the total number of listeners.

The total number of listener responses to all sentences was 4,268.

Of that number 1,125, or 26.4%, selected the meanings intended for

the sentences.

The researcher is unable to explain why the figures here are

as low as indicated. Previous research demonstrated listeners'

ability to recognize the intended meaning in intelligible speech

60% to 85% of the time (26, 38, 58). The low figures in Table III

are not a function necessarily of the fact that nonsense sentences

and filtered sentences are included in them. For the actor's

intelligible sentences only the total number of listeners selecting

the intended meaning was 409 of a possible 1,474, or 27.7%. What is

even more difficult to explain is the fact that at least four of

the five dramatics judges had validated the sentences of the actor,

yet less than 30% of the listeners agreed with the judgments of

these highly trained observers. The validation procedure used here

was different from that used in any previous study, but the listen

ing of the naive and non-naive listeners corresponds almost exactly

with the validation procedure used by Fairbanks (26). The only

real difference was in the number of listeners involved. Why Fair

banks found an average recognition of the intended meaning 80% of

the time using a twelve-choice form and this researcher found only

27.7%, recognition using a fifteen-choice form is inexplicable. It


Confusion

ir.ntrix

for listeners

to nctcr

56

£ i c - n . - j ; : o o 'c i CO

! i

28

I

c>rH

j 82

!

!

< *CM

rH

1 n

i

0

65

* 1 0 :1 1 ACT! to OpH

to to-

CO torH CO

uOCO

CMfH

O K IG IR O to COrH

CM to LOtorH

O1to

IQ

r e o : :yrHCO 3

rH <U 'X)rH

to CO CO 'OCM

C-»rH

OrH S W

r b

'

rH CMCM

L.0 M1CM

to rHCM

'O C-

u o n y i c n d E E*.0to

CMCM

OrH

.::R «,*i ! rH

CO CO

I

CMc— O

COOCM

S F F lM R .O r i pH e' toc -

0«

C> C- to0rH

S.HrH

CM

I K F i ' I I K I O l

- R 7 S I G

OC4

ento

CMrH

to O c.]0 to «HOrH

L ____1

•OCO

M«CM

E S I T O l S lO rH tO C i O

.

c .

to [> CM

K 0 I i l 7 R E : { I ' I £ C

pH0)

COCO

to00 rHrH C.J

r lto

rH iHCM

CM pHCM

OIO

‘ T r-. r * r r- n... J .L -1 c: c . ». .

7 .1 I T R .

O c^rH

*0CM

C iO

C-L-J

IO C!<M

CO t> UO

K O II .C F : ' ! . ’

R ' l . - r . ' i i s

OrH

corH to

to'orH

" o JG> h" ~

00rHrH

CO CM CM

i r D L i y o n - r : :

o ; : o k i s

toIO

c- CO U5 -a»

c - CM t>rH

Or-t

: : o u v o i i e i m

i s

O'iO

*;4 • 0 to

0pH

cmCO

CM»

:o O COv;t

OCM

O it'O

C -'O K F .tO I £ " ; t -c v

OUO

c>'.0

to torH

tOCO

CMCM

OrH

*“ *' to

c -co

1

i t u x u n s i y

p 3 u : .? x s s n

•r-ir .&c:S

t ;<D

f i0

•pa

•H

q

iV)JH:oM

JH

OJ~i

O>-{v-1->

H

:h►41*2

MCO

btH

*"40PH*4

JH

O

O

CO

bMc-«COr.o[0O*

COw 4»>-'«FHCO

0r ifHr.O

C f

rr?CHM•=?

pH

t -0t \

1

1

r i iCO

00

*H

■vi3rH

Mq

J L| <tr.oy *. n

fOCO;:•] J * «n-«a i0IHP

b !— 1

!Ha

t-'Ur-i

0/•X

e-nMCOCO


could be a function of the relatively fine distinctions which the

listeners were asked to make in the present research, although the

distinctions are justified on the basis of Pike's descriptions. A

more likely explanation is that Fairbanks did not try to use emo

tionally empty sentences; the contours were consistent with the

lexical content of the sentences. Such an explanation would also

account for the much higher figure, 84%, of Jassem's study (38,

p. 47 ff.). It does not -however account for the 85% recognition in

Lieberman and Michaels' study (58). In the latter study only eight

emotional meanings were involved and there is no iiidication of the

kind of choices available to the listeners. As in the present study

the sentences were supposedly empty of emotional content. The

researcher has no explanation for the difference between the find

ings of Lieberman and Michaels and those in Table III unless it has

to do with the testing instrument, which was not clearly explained

in their article.

It is obvious that more listeners assigned the intended mean

ings to the actor's sentences than assigned Pike's meanings to the

linguist's sentences, 26.4% for the former as opposed to 17.2% for

the latter (see also Figure 7). Whether that difference is

statistically significant will not be explored here. Some indica

tion of the significance of differences in distribution will be

discussed below in a comparison of responses to sentences by the

actor with those of the linguist.


58

J 19.3\ v s\ \ \v v s. v vVs

14.2

11.3 20.6 > x x. S vN_N-NA.

31.9

29.8

48.910.6N. N. v. N . S . v \

29.8V. Q vs *• v w s 1 TV,V»<A—

17.3 \ ■ -» W35.2

11.7

22.9 14.7

18.8

31.7 42

36.9WMt^Tr19.6

10.3

i _

HESITANCY

REPUDIATION

X!<Da•r-i

LIGHTNESS fo

QUESTION

STRONGIMPLICATION

SIMPLEIMPLICATION

INSISTENCE

W•rl3£2

DISAPPOINTMENT

SURPRISE

DELIBERATION

POLITEQUESTION

OSIMPLE UcS


Figure

7. percentages

of listeners

accepting

the

intended

or defined

meanings

Combined Responses

Of the 776 total responses for each intended or assigned

meaning, the highest number for any meaning was only 286, or 36.8%

of the total number of listeners (see Table IV). The average

recognition of the intended meaning overall was only 21.8% of the

total number of responses, or 1,860 of the possible 8,536. These

figures include all of the responses of listeners to all three

kinds of sentences, intelligible, nonsense, and filtered.

NAIVE AND NON-NAIVE LISTENERS

Every sentence used as a stimulus item was heard by a group of

naive listeners and a group of non-naive listeners. The purpose of

the comparison of the responses of these two types of listeners is

to determine whether an assumed development of awareness toward the

prosodic features of language has any effect on the recognition of

intonational meaning. The statistical comparison is designed to

test sub-hypothesis number four: There is no difference between

the meanings assigned by naive listeners and those assigned by non-

naive listeners to the same sentences.

The responses of naive and non-naive listeners to the same

sentences differed significantly (o( = .01) for only five sentences

(see Tables V, VI, VII, VIII, IX, and X). When the second most

frequently selected meaning is added, there remains only one of the

total of 220 sentences on which naive and non-naive listeners

differed significantly. If a more lenient confidence level were

allowed, there would be a total of 14 of the 220 sentences on which


Confusion

matrix

for

ail responses

60

EiS'ici'ioo;-: i

i i65

J

to

! ;

! 20

i r i

to to 0204 0rH COrH

X19SIAQI LO COrH to 0rH CD c-02 COrH tH04 rHt' toLO inLO

GKICIIIIO torH CD onrH 04rH CO04 to CD02 in COCvl CD COCD

h e g'tv toorHrHC4 in04rH

CO rHCO C402 3 in LOIO CDrH

-W.T/JISEH 0302 0404rHC"CO 04rH rH04 OLO O04 CO torH C"CO rHrHrH

sonvicndsra CO CO CO02 in COto Oin to toin CD SJtCMrH02CO

SSHtuLHOn C- 04 CO torH toto torH 04to 0rH O'0204OCO fc-CO

i’TS'AuIMIOd-dVSIC

rHIO rHCO tO04 LO tosrH

02rH 04CO04NT* inrHrH CO

ESiHd HfiS L*-rH 04rH torH 0-OrHCDCO torH toO-03

rHrH rH torH :

::oiJJy®i3nG:a rHin 04O- toto CDrH COCO tot- torH IOto t' 03CO CDin

MOIJSFn'Os.inoj

COrH <0CO LO COtorHtorH to cn CDrH rHrH LO CD

iioii-szri'Gsun n s

CDrH toin CDLD COCO04COCDrH

rH toOrHCOrH CO O' to

OMOHJ.SCOo rHto O'OrH

0-04 CDrH to to04 COCO IDCO toto rHin

rioijvoricua t"cd rHCO COLO inCO to OCD 'si*to . inCO O-OrHinto CO

rHEOKSISISKI c-gdrH

COa> COrHrHrHrH O04 04C4rH

Oo- rHOiHrH02 vMCO toto

-j'JTU'3Sl3pouStssg

bZc•HCfeS©£OXJa©•P•H

CtlO6hCOMonJ2SM

oMEH<;o362MCO

MCO

Oh-.c-»CDWsMtoOEHC/D

OWC-HCOCOSoO’raHI02MCO

£5OMEhCOCOPO’r0EHHi>-5OpH

OrHE<,erfwCDMJfOO

T4coM£Bcn

Ehf-A

ftHH£S*tnIHa

r/jE36hOJHh4

0MEhHCDPhCOPi

>H0£3EHMCOnoCd


61______________________ Table Y __________________Comparison of Naive (N)"'and’ non-naive (Nil) Listeners

to Linguist's Intelligible Sentences

Single Item Responses X" ...

Grouped Item Responses . X 1-

INSISTENCE 1 '>TU

right ’.vronfr

.0801

right wrong19 13

•Lit 23 12

INSISTENCE 2N 5 27

1.51012 20

1.85811 24 20 15SIMPLEIMPLICATION 1

?T 8 241.093

16 16.355u* 14 21 16 19

SIMPLEIMPLICATION 2

R 4 231.730

10 22.531* 10 25 15 20

STRONGIMPLICATION 1

N 10 221.103

17 151.178i:n 8 27 15 20

STRONGIMPLICATION 2

a 11 21.863

18 141.080vr 17 18 25 10

SIMPLE QUESTION 1

II 12 20.808

21 Il.24118 17 22 ls

SIMPLE QUESTION 2

% •i'i 4 202.443

9 231.52311 24 16 19

POLITE QUESTION 1

yj 5 27.0264

11 21.484ITM 6 29 16 19

POLITE QUESTION 2

V 10 19.0776

19 101.796i\ 11 21 16 16

DELIBERATION 1i'i 7 25

.23916 16

.355I'i 7 26 16 19

DELIBERATION 2K 11 21

10.870**15 17

5.391*Nil 2 33 8 27

SURPRISE 1i'i 9 23

4.613*17 15

.640r:'7 20 15 23 12K 8 24 17 15

SURPRISE 2 IT;' 7 28 .605 14 21 .657DISAP N 6 26 20 12POINTMENT 1 >*%ri«L» 12 23 1.339 22 13 .0496

DISAP N 11 21 IS 14POINTMENT 2 ' * v 10 25 .601 17 16 .763

N 11 21 21 11LIGHTNESS 1 15 20 .212 26 0 .262

N 7 25 15 17LIGHTNESS 2 i-g; 11 24 .366 18 17 .0163

N 8 24 16 16RERJDIATION 1 UN 11 24 .0972 19 16 .0112

N 8 24 12 20REPUDIATION 2 it.'i 13 23 .541 18 17 .809

N 5 27 11 21HESITANCY 1 NN 11 24 .151 17 18 .863

N 11 21 16 16HESITANCY 2 NN 13 22 .000363 19 16 .0112

In this and succeeding tables "right"* most frequently selected meaning; "wrong"rsum of all other responses.

In this and succeeding tables * c c - .05; ** .01


Table VI

Comparison of Naive (N) and Non-naive (NN) Listeners to Linguist's Nonsense Sentences

Single Item Responses X 1-

Grouped Item Responses

INSISTENCE 1N

right wrong

.141

right vrr eng

1.4546 23 10 19

NN QV 23 17 15

INSISTENCE 2N 0 23

.26312 17

,617UN 6 26 18 14SIMPLEIMPLICATION 1

N 3 262.022

13 16.117;.7M 9 23 14 18

SIMPLEIMPLICATION 2

N Ru 244,792*

10 191.454in; 15 17 17 15

STRONGIMPLICATION 1

N 11 181,866NN 8 24

STRONGIMPLICATION 2

N 17 121.416

18 11.512I j V 15 17 18 14

SIMPLS QUESTION 3.

K 12 17.103

18 11♦ 659V>ri'.r. 13 19 24 S

SIMPLE QUESTION 2

> j14 14. 171.966

20 11.399T * \ -

L\h 11 24 21 14POLITE QUESTION 1

N 12 171.844

21 81.740i;i; 9 23 20 12

POLITE QUESTION 2

N e 23.920

16 15.453i'ii't 14 21 22 13

DELIBERATION 1N 5 24

.49715 14

.1179 23 19 13


1.84418 11

.521KI'j 9 23 18 14

SURPRISE 1N 4 25

1.1079 20

.252> T» *nr. 9 23 13 19H 9 20 12 17

SURPRISE 2 Eli 3 29 5.991* 11 21 .686DISAP N 7 22 11 15POINTMENT 1 NN o 23 .00386 13 19 .00224

DISAP N 7 22 11 18POINTMENT 2 v-v.t 6 26 .895 11 21 .309

N 11 18 17 12LIGHTNESS 1 NN 13 19 .00224 21 11 .0895

N 6 23 13 16LIGHTNESS 2 NN 10 22 .416 17 15 .153

N 3 26 9 20REPUDIATION 1 id; 11 21 3.702 16 16 1.546

N g 23 12 17REPUDIATION 2 Ni; 7 26 .0400 12 20 .327

N 6 C'7, 12 17HESITANCY 1 >TV 20 12 9,231** 23 o ■ 4.605*

>.T 1G -T- 20 9HESITANCY 2 KN S r 7Co 5.788* 20 12 . 641


Table VII

Comparison or naive (;l) and Non-naive (ITU) Listeners to Linguist's Filtered Sentences

Single Item Responses TC

Grouped Item Responses . . . T -

INSISTENCE 1

right wrong

.0000760

right vrrong

.5257 24 16 15is: 9 26 16 19

INSISTENCE 2tv B 23

.084414 17

. 243V*: 0 26 19 16SIMPLEIMPLICATION 1

K 7 243.718

8 23.000634v V 3 32 3.0 25

SIMPLEIMPLICATION 2

N 8 232,204...

14 17.910i;:: 5 30 15 20

STRONGIMPLICATION 1

2J 12 19.106

21 10.0032416 ID 25 10

STRONGIMPLICATION 2

r 15 168.900**

15 133.790*V\* c 29 13 22

3IMPLS QUESTION 1

>*it 10 211.283

25 02.244i.*. 18 27 25 10

SIMPLE QUESTION 2

*»; 14 171.966

20 n.39911 24 21 14

POLITE > rJ’t 8 23 13 18QUESTION 1 . 10 25 .000634 19 16 .00201POLITE > * • i 8 23 16 15question 2 17/. 14 21 .920 22 13 .453

11 5 2.5 6 25DELIBERATION 1 6 29 .04 87 15 19 i. qp;;*

>T 6 25 12 19DELIBERATION 2 y V 1 •. . Eu 30 .779 14 21 .0211

]■; S 22 12 19SURPRISE 1 rr 2 33 8.220** 8 27 2.779

Vit 15 16 16 15SURPRISE 2 NN 9 23 4 .597* 16 19 .526DISAP N 4 27 13 18POINTMENT 1 w 12 23 3.011 16 19 .00363

DISAP l’i 6 25 14 17POINTMENT 2 T*"*’it-. 11 2.4 .701 17 18 .000897N 12 19 19 12

LIGHTNESS 1 I ;]I 15 20 .00832 22 13 .015217 13 15

LIGHTNESS 2 rriii* 17 18 .243N 5 26REPUD I AT I Cl.' 1 8 27 .141

H 8 23 11 20REPUDIATION 2 y:: r?/ 28 .733 17 18 .679

rl 12 19HESITANCY 1 yr 12 23 .396IT 12 19 15 16HESITANCY 2 !;1! 7 28 3.794* 17 18 .0537


Tablo VIII

Comparison of Naive ( i f ) and Non-naive (ITi’I) Listeners to Actor's Intelligible Sentences

Single Item Responses X *

Grouped Item Response", X*-

INSISTENCE 1N

right wrong right wrong

.58015 17

.690522 10

NN 21 14 27 8

INSISTENCE 2N 13 19

.090024 8

1.203l.-t 14 . 21 23 12SIMPLEUNPLICATION 1

it 10 22.531

26 62.920*'* 15 20 23 12

SIMPLEIMPLICATION 2

T.r 12 202.432

18 147 O Q Ai.ii 8 27 13 22

STRCifGIMPLICATION 1

N 13 192.429

16 16.4009 26 22 13

STRONGIMPLICATION 2

i\ 10 22.972

17 151.230i *-v* fill 16 19 23 12

SIMPLE QUESTION 1

VH 16 16.133V'- * • i I • 17 IS

SIMPLE QUESTION 2

if 9 23.0780

16 16.0163v T V 12 23 18 17

POLITE QUESTION 1

M A 28.515

11 21.968I-']’ 10 25 16 19

POLITE QUESTION 2

>• 12 20.187

1 n.1 o 13.9112IS i'i 16 19 19 16

DELIBERATION 1\ f.is 19 13

2.54515 20


.600817 15

.00591*■%;ill* 10 25 19 IS

SURPRISE 1N IS 16

.133V V 17 13

SURPRISE'2N 18 14

1.078NN 25 10DISAPPOINTMENT 1

N 14 185.271*XX 26 9

DISAPPOINTMENT 2

N 23 q

.316i * is 24 11

LIGHTNESS 1K 12 20

.038925 7

.429J-TT-.T 15 20 26 9

LIGHTNESS 2N 13 19

.018217 15

.103*«*» * » it 15 20 21 14

REPUDIATION 1N 7 25

.038813 19

.168■*. n.'M l 8 27 17 18

REPUDIATION 2If 12 20

.00169718 14

1.005NN 14 21 24 11

HESITANCY 1r

i i 6 26.0122

11 211.347NN 8 27 18 17

HESITANCY 2N 12 20

3.4,5516 16

.0112Nil 7 28 19 16


65

Table IX

Comparison of Waive (l'l) and Kon-naive (liit) Listeners to Actor's Nonsense Sentences

Single Item Responses r

Grouped Item. Responses X "

INSISTENCE 111

right wrong

2.924

right vrrong

.00408 27 20 15

V” 3 27 17 13

INSISTENCE 2i'i 12 23

6.822**19 16

7.802**YYil 3 27 7 23SIMPLEEiPU CATION 1

Pi 8 271.709

15 201.087K.) 4 26 20 20.

SIMPLEIMPLICATION 2

> * 7 28.0967

11 24.03316 24 11 19

STRONGIMPLICATION 1

\7H 5 30.226

17 IS4.266*\T>Ti'» i > 4 20 8 22

STRONGIMPLICATION 2

N 13 231.397

25 10.706X:.: IS 14 25 5

SIMPLE QUESTION 1

y 15 20.312

24 11. 202i'i i*. 14 IS 23 7

SIMPLE QUESTION 2

N 9 23.0780

16 16. .0163*\T% * 12 ot;L w 24 16

POLITE QUESTION 1

H 15 201.765

22 131.943ML 9 21 14 16

POLITE QUESTION 2

H 11 24.155

22 131.059i'i ji 9 21 16 14

DELIBERATION 1u 11 24

1.76417 18

.889•‘ill 6 24 12 13

DELIBERATION 2K 14 21

.084022 13

.00360ni: 14 16 20 10

SURPRISE 1N 25 10

.0114KN 22 6

SURPRISE 2N 30 5

2.418rvi'4 i'i 22 8DISAPPOINTMENT 1

N 16 191.371III] 19 11

DISAPPOINTMENT 2

N 17 18.162

32 12.538>.:* • ii > . 14 16 13 12

LIGHTNESS 1N 19 16

.35315 15IJ 12 23 18 17

LIGHTNESS 2 UK 8 22 .871 16 14 .000918N 14 21 20 15

REPUDIATION 1 illi 51 25 5.454* 14 16 1.193N 7 28 10 25

REPUDIATION 2 if 1 . 7 23 .000524 13 17 .962N 8 27 19 16

HESITANCY 1 NH 6 24 .339 10 2.0 3.780*N 12 23 17 13

HESITANCY 2 KN 10 20 • V—» V_l CO 1 22 8 3.159


66

Table X

Comparison of Halve (K) and lion-naive (ill’) Listeners to Actor's Filtered Sentences

Single Item Responses r

Grouped Item Responses r

INSISTENCE 1N

right v.To.ng

1.375

r ight wrong

.02810 25 6 19

7."'T A 33 8 29N 9 15 11 14

INSISTENCE 2 y< i < 6 29 2.357 16 21 .102SIMPLE K 4 21 13 12IMPLICATION 1 l i l l 6 31 .108 23 14 .284SIMPLE K 9 16 13 12IMPLICATION 2 h N Eo 32 5 «597* 10 27 5 .129*3 IP ON G K 6 19 13 9IMPLICATION 1 21 16 7.912** 34 3 .574STRCNG N 5 20 • 12 13IMPLICATION 2 V’’ 6 29 .0269 13 24 1.330SIMPLE li 8 17 15 10QUESTION 1 Li< Q 28 .913 24 i % .0014 6SIMPLE •> •

i'i0 16 19 6

QUESTION 2 l-,y 17 20 .266 31 6 7 QC • iyyPOLITE N 4 21 7 180UE3TION 1 r»/ 30 .00191 13 24 .0977POLITE i'i 8 17 14 31QUESTION 2 17 20 .696 24 13 .191

N 5 20 8 17DELIBSPATION 1 >•>* 3 2 25 .513 20 17 2.107

N E 20 10 15DELIBERATION 2 > '\T 7 30 .25 6 15 24 .432

11 g 17 15 10SURPRISE 1 •»">! 9 23 .913 24 13 .00146

N r*0 19 20 5SURPRISE 2 I.'l\ 10 27 .000820 27 10 .876DIS AP 1\ 7 18POINTMENT 1 :Cli 14 23 .280

DISAP li 8 17 13 12POINTMENT 2 Vi I* o 28 .912 14 23 1.861

H 14 11 23 2LIGHTNESS 1 YX 17 20 1.089 29 8 3.177

ii 4 21 13 12LIGHTNESS 2 %r>*in* g 28 .223 14 23 2.956

ViIS 3 22 8 17REPUDIATION 1 Kl‘» 10 27 1.227 14 23 .0403

N 5 20 10 15REPUDIATION 2 'l*TT 13 24 1.005 24 13 2.788

H 5 20 7 18HESITANCY 1 y > 7 8 29 .0264 15 22 .550

N 4 21 6 19HESITANCY 2 ?rv. 7 30 .00191 13 24 .425


67

naive and non-naive listeners differed .significantly (oC= .05),

with 8 remaining after combining the two.most frequently selected

meanings. On the basis of these results the researcher accepted

the null hypothesis. That is, the fourth sub-hypothesis appears

to be true. Because of the acceptance of this null hypothesis there

is no need to mention naive and non-naive listeners in other com

parisons. The number of groups to be dealt with is thus reduced

from twelve to six based on the kind of stimulus items and speaker

involved.

ACTOR AND LINGUIST

Two kinds of comparisons must be made between the actor and the

linguist. First, the researcher will attempt to determine how

nearly alike the intonations of the actor and the linguist were on

the basis of spectrographic analysis. Second, the listener respon

ses to sentences of both speakers will be compared to determine

whether the meaning recognized was significantly different.

Spectrographic Analysis

The researcher used an overlay produced from a spectrogram of

the harmonic 500 Hz calibration tone to determine the fundamental

frequency of each syllable (see the discussion of this procedure of

spectrographic analysis in Chapter III). The researcher's accuracy

in assigning phonemic pitch levels to the sentences of the linguist

was 72.8%. Of a total of 140 pitches validated by the panel of

linguists, the researcher properly assigned 102 pitches using the


procedure described in Chapter III. Nearly half of the errors

(47.4%) were due to the weakness of the final syllable or part of

the last syllable; in such cases the weak signal was insufficient

to produce an adequate spectrographic display. A few other errors

were apparently the result of the speaker's habit of beginning with

a low frequency but moving very rapidly to a higher frequency. The

panel of linguists perceived only the higher pitch even though the

lower frequency was present. The lower frequency apparently was not

of sufficient duration to be assigned a phonemic pitch. Another

large number of errors, 31.6% of the total, was on only four sen

tences to which the researcher assigned pitches one tone higher or

one tone lower for the entire contour. These three types of errors

account for all the errors made.

One finding of interest tends to confirm what Lieberman found

in his study (54; concerning assignment of pitch phonemes by lin

guists. He found that the same pitch was often assigned to differ

ent fundamental frequencies; and conversely, the same fundamental

frequency was assigned different pitches. The pitches validated by

the present panel of linguists showed the same tendency (see Figure

8 below). From the standpoint of the fundamental frequency the

pitches are unstable categories which overlap considerably. There

do appear to be some few frequencies which were used for one and

only one pitch lievel: /l/ 250, 222, 220; /2/1S5, 180; /3/ 115,

100; and /4/ 83, 77, 75. In light of this finding, the 97% agree

ment of the panel of linguists on the first hearing seems an

unlikely event. Perhaps the perceptual units are not so overlapping


69

Figure 8. range of fundamental frequency and assigned pitch levels

in the sentences of the linguist

Fund amental Frequencies

250

222

220

200

185

180

167

150

143

135

130

125

120

115

100• 95


70

as their physical correlates though the writer knows of no way to

test such a conjecture.

Using a procedure similar to that used in assigning pitches to

the sentences of the linguist, the researcher assigned pitches to

the sentences of the actor admitting the possibility of error in

this procedure. The assigned pitches were used as a basis for

comparing the linguist's sentences with those of the actor (see

Figure 9) .

It should be noted that the actor occasionally used two or even

three different contours for the same intended meaning. If the

phonemic-morphemic theory of intonational meaning is true, differ

ences in recognized meaning should be noted where there are

differences in the intonation contours.

Listener Responses

On the basis of the kinds of responses found it is difficult to

draw any generalizations. The statistics indicate a tendency which

requires the rejection of sub-hypothesis number three: There is no

difference between the meanings assigned by listeners to sentences

produced by an actor and those assigned by listeners to sentences

produced by a linguist with the same intended meanings. There are

numerous instances in which the differences are significant (c< = .01)

(see Tables XI, XII, and XIII). Of twenty-two intelligible sentences

the differences in recognition are significant thirteen times, or

59% of the time. The differences are less on nonsense and far less

on filtered speech. It is apparent that the distribution of re-

/


Reproduced

with perm

ission of the

copyright ow

ner. Further

reproduction prohibited

without

permission.

Figure 9. Intonation contours used by linguist and actor

Meanings

Intonation

Linguist

Contours

Actor

INSISTENCE 2- ' 2 - 4 3- '2-4

SIMPLH IlviPLICATION 3- * 2 - 3 - 2 3- *3-4, 3- ‘3-2-3, 3- ‘4-3

STRONG IMPLICATION 2- ‘2 - 2 3- '2-3, 3- '2-2, 3- '3-2

SIMPLE QUESTION 4- ' 3 - 2 • 4- ' 3 - 2

POLITE QUESTION 2- * 2 - 1 3- '3-2, 3- '3-4, 3- *2-3

DEL!EELATION 3- ' 4 - 3 3- '3 - 4

SURPRISE 3- *1 - 4 3- '2-1, 3- '3-2

DISAPPOINTMENT 3- ' 3 - 4 3- ' 3 - 4

LIGHTNESS 3- ' 1 - 2 3- '1-2, 2- *3-4

REPUDIATION 3- ' 4 - 3 - 4 1- '4-4, 4- '3-4

HESITANCY 3- ' 2 - 3 3- * 2 - 3

I

72

Table XI

Comparison of Listener Responses to Intelligible Sentences of .Actor (-'*)with Listener Responses to Linguist's (L) sentences of the Same

Type and Supposed Leaning

Single Item X1- Grouped Item -N/*-Responses Re soor:ses Xright wrong right wrong

A 36 31 49 18INSISTENCE 1 L 13 54 18.530** 54 13 .671

A 27 40 47 20INSISTENCE 2 L 16 51 4.931* 25 41 14.565 **SIMPLE A 25 4 2 41 26IMPLICATION 1 T4J 6 61 16.787** 23 59 5 «S56*SIMPLE A 20 47 29 38IMPLICATION 2 L 11 56 4.197* 20 47 3.217STRONG A 38 29 43 21IMPLICATION 1 L 22 45 9.568** 40 27 1.745STRONC- A. 13 54 33 34IMPLICATION 2 L 28 39 6.588** 39 28 .750SIMPLE A 53 34 45 22QUEST I Oii 1 L 30 37 .479 4 3 24 .298S1U PL3 . A 21 46 34 33QUEST 10ii 2 T, 8 59 8.625** 33 34 .119 .POLITE A 13 54 18 49QUESTION 1 L 5 62 5.198* 16 51 .355POLITE A 28 39 36 31QUESTION 2 L 22 45 1.563 35 32 .119

A 34 33 43 24DELIBERATION 1 L 14 53 14.315** 25 44 13.167**

A 21 46 .31 36DELIBERAE IC-N 2 L 3 64 18.323** 16 51 8.359

A. 33 34 33 34SURPRISE 1 L 2 65 39 .555** 32 35 .119

A 43 24? 58 9SURPRISE 2 L 14 53 27.478** 29 38 29.494 **DISAP A 40 27 45 21POINTMENT 1 L 13 54 24.471** 31 36 7.816**

DISAP A 47 20 54 13POINTMENT 2 TXJ 4 63 61.129 ** 25 42 27.756**

KJ i 29 38 51 16

LIGHTNESS 1 L 21 46 2.584 47 20 .949A 28 39 36 31

LIC-KTNESS 2 L S c p 14 .935** 27 40 2.. 99 6A 15 5 2 2 7 40

RSPUDI ATION 1 L 19 48 .355 35 32 1.483A 26 41 26 41

REPUDIATION 2 L 7 60 16.082** 23 39 .0310A 15 52 2.9 38|

HESITANCY 1 L 12 55 .742 19 48 3.928*A 11 56 30 37

HESITANCY 2 L r> /C-X. 43 5.569* 24 43 1.519


73

Table XII

Comparison of Listener Responses to Nonsense Sentences of Actor (A) with Listener Responses to Linguist's (L) Sentences of the Same


Single Item Responses

Grouped It err ResDor.ses X .

INSISTENCE 126

wrong right39 4346 14.992** 26

wrong2235 8.015**

INSISTENCE 215 2612 4 0 .466 17

3944 2.635

SIMPLEIMPLICATION 1

12 3412 49 .00292 22

SIMPLEIMPLICATION 2

60 2220 41 10.931** 18

39 4 .052*43

.510STRONGIMPLICATION 1

2119 42 9.758** 25

4436 .485

STRONGIMPLICATION 2

21 44 5032 4 .446* oo

SIMPLEQuestion i

36 4625 36 .350 h.C

1125 5.520*1919 .184

SIMPLE QUESTION 2

37 40L 26 1.176 47

_I614 .0003993

POLITE QUESTION 1

12 23T. 20 41 2.697 3342

5.737*POLITE QUESTION 2

18 4 1

21 .389 .698

DELIBERATION 14 P 29

14 4? 205641 2.384

DELIBERATION 228 37 28

25.801*’ 243737 .368

47 IESURPRISE 1 56.041** 18 32.449**

SURPRISE 213

11 50.441**l o38 25.153**

DISAPPOINTMENT 1

30 4345 11.141** 22

2239 12.643**

DISAPPOINTMENT 2

31 34 3953 19.259** 21

25 4 0 9.308**

34 31 56LIOUTNESS 1 12 4 0 15.902** 26 23.189**

20LIGHTNESS 2 16 c; vc 31

36 2.10719 4 r 34

REPUDIATION' 1 11 50 2.8363136 2.107

14 51 27REPUDIATION 2 13 48 .0616 30

5831 .465

HESITANCY 121

26 35 13.179** 304431 3.051

HESITANCY 222 45 27

59 21.110** 273834 .0158


74

Table XIII

Comperison of Listener Responses to Nonsense Sentences of .Actor (A) with Listener Responses to Linguist’s (L) Sentences of the Same


Single Item Resnor.se s r

Grouped Item Responses X"

INSISTENCE 1A

right v,T on g right vrr ong

2.186IS 50

5 .843*24 42

L 6 56 16 46

INSISTENCE 2A 17 45 19 43

.456L 6 5 5 5.784 * 25 41SIMPLEIMPLICATION 1

JL IB 44.175

26 36.421L 18 48 n ~<lv 41

SIMPLEIMPLICATION 2

A 14 4 88.651**

18 442.746L 21 45 12 54

STRONGIMPLICATION 1 .

A. 23 39ICQ• J ■

50 122.670L 28 38 46 20

STRONGIMPLICATION 2

rpi 556.727**

20 42.691L 21 45 27 39

SIMPLE QUESTION 1

j\ 401.924

39 231.688L 17 49 35 31

simpleQUIDS TION 2

A 24 35.0776

50 126.277**L 25 41 41 25

POLITE QUESTION 1

A c 53 12 5010.769**L 18 4 8 2.406 32 34

POLITE QUESTION 2

_<■, 13 491.877

38 2415.276**1. 22 44 31 'Z cU

DELIBERATION 1* 17 ' r.A •-/

19.647**20 42

2.201L 1 OC 12 44

DELIBERATION 2j\ 12 50

1,87720 42

.657L p c c 18 45

SURPRISE 1A 16 46

2.20035 27

10.188**L 11 55 20 45

SURPRISE 2A 15 47

1.69731 31

.270L 24 42 31 35DISAPPOINTMENT 1

21 4129 .269**

26 365.377*L 0 66 16 50

DISAP-•POINTRENT 2

A 17 45.171L 17 49

LIGHTNESS 1i: 21 41

.40952 10

S.746**L 27 39 41 25

LIGHTNESS 214 46

9.054**27 35

13.891**L 53 33 51 15

REPUDIATION 1A 13 49

7.530**21 41

7.166**L 62 10 56

REPUDIATION 2A ie 44

8.128**18 44

.111L 7 59 22 44A 0 53 14 48

HESITANCY 1 L 12 54 .103 24 42 2.287A 14 48

HESITANCY 2 L 23 38 4 .845


75

sponses for nonsense and filtered speech more nearly approximate a

chance distribution for both actor and linguist and would thus not

be significantly different as often. In some cases this is obvious

ly not true however, since the distribution of responses is less

like a chance distribution in nonsense and filtered speech for those

cases. It does appear that there are enough significant differences

to make it impossible to accept the null hypothesis.

It is interesting to note that there are fewer differences in

listener responses on those sentences whose contours were described

as the same for both the actor and the linguist in intelligible

speech (compare Table XI and Figure 9). There is one exception;

those sentences meaning DISAPPOINTMENT were much more readily under

stood to have that meaning by listeners to the actor. The writer

suspects that vocal quality is responsible for that difference but

has no way of testing that guess.

The rejection of the third sub-hypothesis adds weight to the

suspicion based earlier on the confusion matrices. Not only have

listeners not been able to assign the intended meaning to sentences

either of the actor or of the linguist more than 30% of the time,

but they have also assigned to two sentences with the same intended

meaning significantly different meanings. But because the findings

are thus far inconclusive, one may not conclude necessarily that

there are no intonation morphemes. It does seem to be in order to

question that notion and put it to further testing.


76

INTELLIGIBLE AND NONSENSE SENTENCES

From Tables XIV and XV it seems apparent once more that no

conclusive statement can be made. There is a strong tendency toward

significant differences which makes it impossible to accept the

first sub-hypothesis: There is no difference between the meanings

assigned by listeners to full intelligible speech and those assigned

by listeners to nonsense sentences. Twenty-eight of the 44 senten

ces differ significantly (c£ = .01) across the comparison with

intelligible speech. The rejection of the null hypothesis can not

be accomplished with any degree of certainty. Only a strong ten

dency is noted.

The writer noted on several occasions that the listeners to

nonsense speech assigned the intended meaning more often than those

who heard intelligible speech, but the design of this study made no

allowances for such an event. The responses to intelligible speech

were used as a control to establish expected frequency of response

to nonsense and filtered speech. The rationale was that the only

way to get at the meaning of the intonation was to see what meaning

was assigned by listeners to intelligible speech as the norm. If

the meaning found there remained in nonsense and filtered speech,

the contour could be said to convey this meaning. No allowance was

made for the fact that listeners to intelligible speech may reject

the intended meaning in favor of some other meaning. The phenomenon

noted may be a function of the attempt to superimpose an intonational

meaning upon a lexical content to which it would not optimally

apply. When the lexicon is removed the intonational meaning becomes


Table XIV

Comparison of Listener Responses to Linguist's Intelligible Sentences v.lth Listener Responses to Linguist's Nonsense Sentences

Single Item Responses X

Grouped Item Responses xx

I1J3IST3NCS 1FI

right wrong

31.395**

right vvr ong

54.216**37.326 23.674 46 19

F2 15 46 19 4 2

IN3I3T3NCE 2FI 14.57 46.43

.59529.13 31.87

5.477*F2 12 49 20 41SIMPLEIMPLICATION 1

FI 26.13 38.8713.367**F2 12 49

SIMPLEIMPLICATION 2

FI 12.45 48.257.592* * .

22.76 38.2411.442**F2 4 57 10 51

STRONGIMPLICATION 1

FI 16.39 44 .6112.807**

29.13 34.8724.046**F2 4 57 10 51

STRONGIMPLICATION 2

FI 25.49 35.51.424

33.ee 27.32.116F2 2S 33 35 26

SIMPLE QUESTION 1

FI 27.316 33.684.355

39.15 21.8510.527**F2 r «c.0 36 27 34

SIMPLE QUESTION 2

FI 13. co 47.3412.824**

22.76 38.2422.106**F2 4 57 £ 57

POLITE QUESTION 1

FI 10.02 50.984.328*

19.12 41.8817.416**F2 4 57 4 57

POLITE QUESTION 2

FI 20.03 40.97.0698

30.96 30.041.070F2 21 40 35 25

DELIBERATION 1FI 12.749 38.251

4.906*18.2 4 2.8

.151F2 5 56 19 42


10.133**20.94 40.06

23.407**F2 2 59 3 0 c>

SURPRISE 1FI 26.4 34.6

19.389**29.13 31.87

3.34 0F2 9 52 22 39


.26026.4 24 .6

.772F2 12 / C 23 38DISAPPOINTMENT 1

FI 16.33 44 .6210.808**

28.225 32.7751,800F2 5 56 23 42

DISAPPOINTMENT 2

FI 19.1 41.97.638**

24.58 36.4214.485**F2 9 52 10 51

LIGHTNESS 1FI 23.672 37.328

6.456*42.79 18.21

22.012**F2 14 47 26 35

LIGHTNESS 2FI 13.655 4 7,34 2

2.04730.043 30.952

1.669F2 9 52 25 36

REPUDIATION 1FI 17.3 43.7

6.976*+31.87 29.13

23.622**F2 g 53 11 50


9.391**27.3 33.7

17.617**F2 8 53 11 50

HESITANCY 1FI 14 .57 46.43

12,074**27.58 33.4 2

39.986**F2 3 58 3 58

HESITANCY 2FI 23.64 42.36

7.389*+34.478 31.522

.373F2 13 49 32 34'N0T£: in this and succeeding Tables Fl = expected frequency

F2= observed frequency


78

Table XV

Comparison of Listener Responses to Actor's Intelli*g-ible Sentences with Listener Responses to Actor’s Nonsense Sentences

Single Item Responses

Grouped Item Responses x 1

INSISTENCE 1FI

right vrr o tvs

23.249**

right wrong

8.621**30.388 34.612 47.5 1V. 5

F2 11 54 37 28

INSISTENCE 2FI 26.2 32 .8

8 .021**45.6 IS .4

23.227**F2 15 50 26 39SIMPLEIMPLICATION 1

FI 24 .245 4 0.7559.863**

39.774 25.2266.277**F2 12 53 30 35

SIMPLEIMPLICATION 2

FI 19.403 45.59722.251**

30.03 34 .9238.923**F2 2 63 5 50

strongIMPLICATION 1

FI 14.84 6 50.1542.983

36.655 28.14588.092**F2 9 56 25 40

STRONGIMPLICATION 2

FI 25.22 39.7621,506**

38.805 26.195.472F2 7 58 33 29

SIMPLE QUESTION 1

FI 32.01 32.99.558P2 29 36

SIMPLE QUESTION 2

FI 20.371 44.62?19.770**F2 37 28

POLITE QUESTION 1

FI 13.585 51.41514 .739**

26.195 38.80512.885**F2 1 64 12 53

POLITE QUESTION 2

FI 27,164 37.8367.425**

33.956 31.0441.009F? 18 47 38 27


44 .838**?2 n 59


19.15930.095 34.905

41.525**?2 A 61 A 61


13.730**50.477 14 .5 53

1.619F2 47 18 55 10

SURPRISE 2FI 41.717 23.263

7.07756.27 6.73

.2135F2 52 13 55 10DISAP-' P0INTM3NT 1

FI 36 .805 26.195.926F2 35 30

DISAPPOINTMENT 2

FI 49.S 15.429.435**F2 31 34

LIGHTNESS 1FI 26.132 3 o.666

2.1584 9.478 15.522

3.600*F2 34 31 56 9

LIGHTNESS 2FI 27.164 37.836

40.047**3 /.Gee 27.164

76.749**F 2 2 63 3 62


3.140*F2 22 43


17.046**40.75 24,25

20.724**F2 9 56 23 42

HESITANCY 1FI 14.552 50.448

8,079**25,132 36 .868

5.226*F2 U 60. 19 46

HESITANCY 2FI 16.2 48.2 33,96 31.04

7.407**'•'? i r?.1 / 48 .00321 23 42


79

more apparent.- Cases such as these may indicate that the rejection

of the theory of intonation morphemes is unwarranted.

INTELLIGIBLE AND FILTERED SENTENCES

As with nonsense sentences the responses to 28 of the 44

filtered sentences differ significantly (CC = .01) from responses

to intelligible sentences (see Tables XVI and XVII ); Grouping

together the two most frequently selected meanings has little effect

on the general picture. The second sub-hypothesis can not be

accepted. Its rejection is not with any high degree of confidence

however. Again, the rejection of this hypothesis only suggests a

tendency; it can not prove or disprove the existence of intonation

morphemes.

Though it is not as strong, some of the same phenomena occurred

here as in responses to nonsense sentences. Occasionally listeners

recognized the intended meaning more readily in filtered speech than

in intelligible speech. The suspicion remains but with very strong

reasons for suspicion, even though the ability of listeners to hear

the intended meaning more readily in nonsense and filtered speech

supports the theory rather than the suspicion that the theory is

wrong.


Table XVI

Comparison of Listener Responses- to Linguist’s Intelligible Sentences with Listener Responses to Linguist's Filtered Sentences

Single ItemX 1

Grouped Item- v / 2.Resnon ses Responses T

right wr on g right wrongFI 37.94 24.05 49.77 16.23

INSISTENCE 1 F2 8 58 60.8 84 * * 24 42 54.261**FI 15.76 45.24 31.52 34.4 8

INSISTENCE 2 F2 8 58 5.150* 10 56 27.267**SIMPLE FI 28.274 37.726IMPLICATION 1 F2 10 56 20.663*SIMPLE FI 13.79 52.2. 24.62 41.38IMPLICATION 2 F2 4 57 9.098** 7 54 18.159**STRONG FI 17.73 48.27 31.5 2 29.48IMPLICATION 1 F2 3 63 16.058** 4 62 49.716**STRONG FI 27.555 3 b .44 5 36.445 27.555IMPLICATION 2 F2 10 5 S 20.398** 31 35 2.028SIMPLE FI 29.555 36.445 42.359 23.641QUESTION 1 F2 18 48 8.182** 34 32 4 .606*SIMPLE FI 14.78 46.22 2.4.625 36.3 75QUESTION 2 ?2 8 58 4.105* e 58 18.022**POLITE FI 10,84 o 5 .1 c 20.68 45.32QUESTION 1 Ci_‘ 61 3 .764 13 53 4.153*POLITE FI 21.67 44.33 33.495 32.505QUESTION 2 F2 9 57 11. 249** 31 35 .377

FI 13.794 52.206 19.7 4 6.3DELIBERATION 1 ?2 1 65 15.002 9 57 8.283

FI 12.8 53.2 22.6 4 o, 34DELIBERATION 2 F2 8 58 2.233 11 55 9.137**

FI 28.648 37.352 31.522 34.478SURFRI3S 1 F2 8 58 26.296** 11 cr. 25 .575**

FI 14.78 51.22 28.565 37.435SURPRISE 2 F2 0 66 19.045** 24 42 1,290DISAP Fi 17.73 48.27 30.54 35.46POINTMENT 1 F2 16 50' .213 16 50 12,884**

DISAP FI 20.68 45.52 26.598 39.402POINTMENT 2 F2 10 55 8.032** 27 39 .0102

FI 25.612 40,388 46.299 19.701LIC-HTNESS 1 F2 27 39 .123 4-1 25 2.031

FI 14.778 51.222 32.505 35.495LIGHTNESS 2 F'2 33 35 28.951** 51 14 23.037**

FI 18.7 47.3 34.48 31.52REPUDIATION 1 ti r\r w 6 60 12,036** p. 58 42,582**

FI 20.68 45.32 29.55 36.45REPUDIATION 2 F2 1 65 27.274** IT 65 49.946**

FI 15,76 50.24 24.62 41.38HESITANCY. 1 F2 7 59 6.396* 19 47 2., 04 6

V I 21.65 39.15 31.86 28.225HESITANCY 2 F2 25 3 c ,707 40 21 3.442


Table XVII

Comparison of Listener Responses to -Actor's Intelligible Sentences with Listener Responses to Actor's Filtered Sentences

SingleResponr

Itemes

Grouped Item Responses X"

INSISTENCE 1FI

right v.t eng

40.445**

right wrong

80.471**20.985 33,015 45.322 16.678

F 2 4 58 14 48

INSISTENCE 2FI 24.99 37.01

24.173**43.49 18.51

58.203**F2 6 55 16 4 6SIMPLE VI 23.123 38.874 37.938 24.062IMPLICATION 1 F2 10 52 11.882** 18 44 26.999**SIMPLE PI IB.507 43.493 28.687 33.313IMPLICATION 2 F2 s 53 6,965** 23 39 2.098STRONG FI 14.16 47.84 35.154 26.84 6IMPLICATION 1 F2 0 65 18.351** 0 62 81.167**STRONG FI 24.07 37.93 37.01 24.99IMPLICATION 2 1**2 3 c QV.' k/ 30,014** 6 56 64 .463SIMPLE FI 30.535 31.4 65 31.465 30.535QUESTION 1 F2 17 45 11.821** 39 23 17 V» <-» rj6 * couSIMPLE FI 19.43 42.57QUESTION 2 F2 26 33 3.236POLITE FI 12.956 49.042 24.586 37.014QUESTION 1 F2 11 51 .373 11 51 13.114**POLITE FI 25.91 3 6 .09 32.39 29.61QUESTION 2 F2 25 37 .055 3S 24 2.035

FI 51.465 30.535DELIBERATION 1 F2 17 45 13.507**

FI 19.43 42.57 28.71 33.29DELIBERATION 2 ?2 11 51 5.244* 15 47 12.193**

FI 30* 5 35 31.4 65 48.115 13.882SURPRISE 1 F2 16 4 S 6.919** 35 27 15 .972**

FI 7,0 7C1*.• e ( v 22.21 53.67 8.33SURPRISE 2 F2 15 47 43.115** 31 31 71 .4 53**DISAP- FI 37.01 24.99FOIKTUSNf 1 F2 21 41 17.182**

DISAP FI 43.493 13.507POINTMENT 2 F2 17 45 54 .063**

FI 26.835 35.166 47.194 14.806LIGHTNESS 1 F2 31 31 1.141 52 10 2.049

FI 25.91 •36.09 36.09 25.91LIGHTNESS 2 F2 10 52 16.784*-" 13 49 35.350**

FI 27.76 %& . 9/1V’ — 4 kv —REPUDIATION 1 F2 13 49 14.211**

FI 24,056 37.944 38.87 23.13REPUDIATION 2 72 18 44 2.492 26 36 11.422**

«--.nr i 13.50 48.12 26.834 35.166HESITANCY 1 F2 EV 57 7,320** 14 48 10.822**

FI 14,02 45.98 0 2 e 0 5 r\r\cc »clHESITANCY 2 12 r 57 10,222** 13 <:9 24 .306**


CHAPTER V

CONCLUSIONS

The purpose of this study was to test the theory that there are

conventional patterns of intonation which have meaning; that is,

that there are intonation morphemes. The major experimental hypo

thesis was the null correlate of the positive statement: "Native

speakers of American English can recognize the defined meanings of

specified contours." There were four sub-hypotheses:


by listeners to full and intelligible speech and those assigned

by listeners to nonsense sentences.





by listeners to sentences with Pike's contours produced by

a linguist and those assigned by listeners to sentences pro

duced by an actor with the same intended meanings.


by naive listeners and those assigned by non-naive listeners

to the same sentences.

The design of the study required a relatively large number of

listeners to hear three types of sentences, intelligible, nonsense,

and filtered. The listeners were of tvo kinds, naive and non-naive.

The equipment used to conduct the experiment was tested' for frequency


83

response, distortion, and proper functioning.

The sentences used for stimulus items were recorded by two

speakers, an actor and a linguist. The sentences were subjected

to spectrographic analysis after validation by qualified judges.

The data collected from the listeners were tabulated and treat

ed by a Chi-square analysis to determine the significance of

variations of distribution among the groups of listener responses.

SUMMARY OF CONCLUSIONS

There is no significant difference between the responses of

naive and non-naive listeners. The null hypothesis was accepted.

Any studies which succeed this present study may well ignore the

amount of training the listeners may have had so long as they are

native American English speakers who are not bilingual.

Sub-hypotheses 1, 2, and 3 could not be accepted, nor could

they be rejected at the ,01 level of confidence. Their rejection

was tentative, but it was strongly suggestive of doubts about the

existence of intonation morphemes.

It is significant that this researcher never achieved a level

of recognition that some previous research had found. It is probably

because of two factors: (1) that the listeners were asked to make

distinctions in rather fine shades of meaning, and (2) much previous

research had used utterances whose content was consistent with the

intended meaning of the intonation contour. These factors are not

offered as apologies for the findings however.


The theory that there are intonation morphemes or specific

meanings for conventional patterns of intonation is subject to

question. This study has neither proven nor disproven the theory,

but it has rejected all the null hypotheses whose acceptance would

have supported the theory. The overall rejection of the hypotheses

was not at the .01. level of confidence, but the tendency was far

stronger to reject than to accept the hypotheses.

SUGGESTED STUDIES

Because the conclusions here are largely tentative, further

studies of this type are indicated. The writer believes that

recognition is the factor of primary importance, not the physical

correlates of perception nor even the ways in which perception takes

place. Units of language must be recognized in order to be meaning

ful. If a study can establish the fact that people can or cannot

recognize conventional intonation morphemes, the necessity for

further study may cease.

Studies relevant to the present one may include:

(a) a study of the influence of voice quality and other

paralinguistic features in the recognition of emotion.

(b) a study of the relationship between recognized emotional

meanings in semantically context-bound and context-free

utterances.

(c) a study of the role of emotional meaning in conversa-

v tional speech. Do speakers convey emotional meaning?

Would such meaning be recognized outside the conversational


situation in which it was uttered?

GENERAL CONCLUSIONS

This study has not supported the theory that there are intona

tion morphemes in American English. It is probable that emotional

meaning is a product of features other than intonation perhaps more

than of intonation itself, those features being rate, pause, voice

quality, lexical context, grammatical context, and social context.

To what extent emotional meaning is actually conveyed by intonation

is still a matter of conjecture for further study and speculation.


BIBLIOGRAPHY

1. Abe, Isamu. "Intonational Patterns of English and Japanese," Word. 11 (1955), 386-398.

2. Adolph, A. R. "Studies of Pitch Periodicity," Quarterly Progress Report, Cambridge, Mass.: Research Laboratory of Electronics, Massachusetts Institute of Technology, April 15, 1957, 127-131.

3. Aristotle. The Rhetoric of Aristotle, translated by J. E. C. Welldon, London: Macmillan and Company, 1886.

4. Armstrong, Lilias E. and Ida C. Ward. Handbook of English Intonation, Cambridge: Heffer and Sons, 1926.

5. Bloch, Bernard and George Trager. Outline of Linguistic Analysis, Baltimore, Md.: Linguistic Society of America, 1942-,

6. Bloomfield, Leonard. Language, New York: Henry Holt and Company, 1933.

7. Bolinger, Dwight L. "A Theory of Pitch Accent in English," Word, 14 (1958), 109-149.

8. __________. "Contrastive Accent and Contrastive Stress,"Language, 37 (1961), 83-96.

9. __________, and Louis J. Gerstman. "Disjunction as a Cue toConstructs," Journal of the Acoustical Society of America,29 (1957), 778.

10. __________. Generality, Gradience. and the All-or-None, TheHague: Mouton.and Company, 1961.

11.____________. "Intonation and Analysis," Word, 5 (1949), 248-254.

12. __________. "Intonation and Grammar," Language Learning,8 (1957), 31-37.

13. Borst, John M. and Franklin S. Cooper. "Speech Research Devices Based on a Channel Vocoder," Journa1 of The Acoustical Society of America, 29 (1957), 777.

14. Chang, Nien-Chuang T. "Tones and Lntonation in the Chengtu Dialect," Phonetlca, 2 (1958), 59-85.

36


87

15. Chomsky, Noam. Aspects of the Theory of Syntax, Cambridge,Mass.: The M. I, T. Press, 1965.

16. _ • "Current Issues in Linguistic Theory," in TheStructure of Language, edited by Jerry A. Fodor and JerroldJ. Katz, Englewood Cliffs, New Jersey: Prentice-Hall, Inc.,1964, 50-118.

17 . , Morris Halle, and Fred Lukoff. "On Accent andJuncture in English," in For Roman Jakobson, The Hague:Mouton and Company, 1956, 65-80.

18 . . Syntactic Structures, The Hague: Mouton, 1957.

19. Coleman, H. 0. "Intonation Emphasis," Miscellanea Phonetica,1 (1914), 6-26.

20. Cooper, Franklin S., P. C. Delattre, A. M. Liberman, J. M.Borst, and L. J. Gerstmah. "Some Experiments on the Perceptionof Synthetic Speech Sounds," Journal of the Acoustical Society of America, 24 (1952), 597-606.

21. Creelman, C. Douglas. "Detection, Discrimination, and the Loudness of Short Tones," Journal of the Acoustical Society of America, 35 (1963), 1201-1205.

22. Danes, Frantisek. "Sentence Intonation from a Functional Pointof View," Word, 16 (1960), 34-54.

23. DeGroot, A. W. "Phonetics in Its Relation to Aesthetics," in Manual of Phonetics, edited by Louise Kaiser, Amsterdam: North-Holland Publishing Company, 1957, 390-396.

24. Denes, P. "A Preliminary Investigation of Certain Aspects of Intonation," Language and Speech, 2 (1959), 106-122.

25. Fairbanks, Grant and LeMar W. Hoaglin, "An Experimental Study of the Durational Characteristics of the Voice During the Expression of Emotion," Speech Monographs, 8 (1941), 85-90.

26. . "Recent Experimental Investigations of Vocal Pitchin Speech," Journal of the Acoustical Society of America, 11 (1940), 457-466.

27. Flanagan, James L. "Pitch Discrimination for Synthetic Vowels," Journal of the Acoustical Society of America, 30 (1958), 435-442.

28. Fry, D. B. "Experiments in the Perception of Stress," Language and Speech, 1 (1958), 126-152.


88j -

29. Gleason, Henry A., Jr. An Introduction to Descriptive Linguistics, revised edition, New York: Holt, Rinehart, and Winston, 1961.

30. Gray, Giles M. and Claude M. Wise. ’’The Bases of Speech, 3d ed., New York: Harper, 1959.

31. Grube, G. M. A. A Greek Critic: Demetrius on Style, Toronto: University of Toronto Press, 1961.

32. Hadding-Koch, K.and M. Studdert-Kennedy. "An Experimental Study of Some Intonation Contours," Phonetica, 11 (1964),

• 175-185.

33. Harris, Zellig S. "Simultaneous Components in Phonology," Language, 20 (1944), 181-205.

34. Hart, John. An Orthographie, Conteyning the Due Order andReason, Howe to Write or Paint T’nimage of Mannes Voice, Most Like to the Life or Nature, 1569, reprinted with biographical, and bibliographical introductions by Bror Danielsson in Stockholm Studies in English, Stockholm: Almquist andWiksell, 1955, 165-227.

35. Hockett, Charles L. A Course in Modern Linguistics, New York: The Macmillan Company, 1958.

36 . __________ . A Manual of Phonology, Baltimore: Waverly Press,Inc., 1955.

37. Jakobson, Roman and Morris Halle. Fundamentals of Language,The Hague: Mouton and Company, 1956.

38. Jassem, Wiktcr. Intonation of Conversational English (Educated Southern British), Wroclaw, Poland: Travaux de la Societe desScientes et des Lettres de Wroclaw, Seria A, Nr. 45, 1952.

39. Jespersen, Otto. Essentials of English Grammar, University,Alabama: University of Alabama Press, 1964.

40. . Language: Its Nature, Development, and Origin,New York: Henry Holt and Company, 1924.

41. Jones, Daniel. An Outline of English Phonetics, Cambridge:W. Heffer and Sons, Ltd., 1956.

42. ___. The Phoneme: Its Nature and Use, 2nd ed.,Cambridge: W. Heffer and Sons, Ltd., 1962.

43. and Paul Passey. The Principles of the InternationalPhonetic Association, London: University College, 1949.


89

44. Joos, Martin. Acoustic Phonetics, Supplement to Language.Vol. 24, Baltimore: Waverly Press, 1948.

45. Kersta, L. G., P. D. Brick'er, and E, E, David. "Human or Machine? - A Study of Voice Naturalness," Journal of the Acoustical Society of America, 32 (I960), 1502.

46. King, Harold V. English Phonology, Ann Arbor, Michigan:Ann Arbor Publishers, 1961.

47. Kingdon, Roger. English Intonation Practice, London:Longmans, Green and Company, 1958, .

48. __________. The Groundwork of English Intonation, New York:Longmans, Green and Company, 1958.

49. Kurath, Hans. A Phonology and Prosody of Modern English,Ann Arbor: The University of Michigan Press, 1964.

50. Ladefoged, Peter and Norris P. McKinney. "Loudness, Sound Pressure, and Subglottal Pressure in Speech," Journal of the Acoustical Society of America, 35 (1963), 454-460.

51. Lehiste, Use. "Some Acoustic Correlates of Accent in Serbo- Croation," Phonetlca, 7 (1961), 114-147.

52. Liberman, A. M., F. S. Cooper, K. S. Harris, and P. F. Mac Neilage. "A Motor Theory of Speech Perception," Proceedings of the Speech Communication Seminar, Stockholm: Speech Transmission Laboratory, Royal Institute of Technology, 1963.

53. Lieberman, Philip. Intonation, Perception, and Language, Research Monograph No. 38, Cambridge, Mass.: The M. I. T.Press, 1967.

54. __________. "On the Acoustic Basis of the Perception ofIntonation by Linguists," Word, 21 (1965), 40-54.

55. __________. "Perturbations in Vocal Pitch," Journal of theAcoustical Society of America, 33 (1961), 597-603.

56. __________. "Some Acoustic Correlates of Word Stress inAmerican English," Journal of the Acoustical Society of America, 32 (1960), 451-455.

57. __________. "Some Acoustic Measurements of the FundamentalPeriodicity of Normal and Pathological Larynges," Journal ofthe Acoustical Society of America, 35 (1963), 344-353.

58. __________ and Sheldon B. Michaels. "Some Aspect's of Fundamental Frequency and Envelope Amplitude as Related to the Emotional Content of Speech," Journal of the Acoustical Society of America,


90

34 (1962), 922-927.

59. _________ . "Vowel Intonation Contours," Quarterly ProgressReport, Cambridge, Mass.: Research Laboratory of Electronics,Massachusetts Institute of Technology, July 15, 1958, 156-161.

60. Lifshitz, Samuel. "Apparent Duration of Sound Perception and Musical Optimum Reverberation," Journal of the Acoustical Society of America, 8 (1936), 213-221.

61. __________. "Two’ Integral Laws of Sound Perception Relatingto Loudness and Apparent Duration of Sound Impulses," Journal of the Acoustical Society of America, 5 (1933), 31-33.

62. Marckwardt, Albert H. ,,r0n Accent and Juncture in English' -A Critique," in Second Texas Conference on Problems of Linguistic Analysis of English (1957), Austin: University of TexasPress, 1962, 87-93.

63. Mattingly, Ignatius G. "Synthesis by Rule of Prosodic Features," Language and Speech, 9 (1966), 1-13.

64. Meader, Clarence L. and John H. Muyskens. Handbook of Bio-linguistlcs, Part One, revised edition, Toledo, Ohio: ToledoSpeech Clinic, Inc., 1962.

65. Mol, H. and E. M. Uhlenbeck, "The Linguistic Relevance of Intensity in Stress," Lingua, 5 (1956), 205-213.

66. Morton, John and Wiktor Jassem. "Acoustic Correlates of Stress,"Language and Speech, 8 (1965), 159-181.

67. Ostwald, Peter F. Soundmaklng, Springfield, 111.: Charles C.Thomas Publishers, 1963.

68. Palmer, Harold E. and W. G. Blandford. A Grammar of SpokenEnglish on £ Strictly Phonetic Basis, Cambridge: W. Hefferand Sons, 1924.

69. __________ . English Intonation with Systematic Exercises,Cambridge: V. Heffer and Sons, 1924.

70. __________ . The Scientific Study and Teaching of Languages,Yonkers-on-Hudson: World Book Company, 1917.

71. Peterson, Gordon E, "Automatic Speech Recognition Procedure," Language and Speech, 4 (1961), 267-281.

72. Pike, Kenneth L. Phonemics: £ Technique for Reducing Languages to Writing, Ann Arbor: The University of Michigan Press, 1945.


91

73. Pike, Kenneth L. The Intonation of American English, Ann Arbor: The University of Michigan Press, 1945.

74. Potter, Ralph K., George A. Kopp, and Harriet C. Green,Visible Speech, New York: D.'Van Nostrand, 1947.

75. Ripman, Walter. English Phonetics, London: J. M. Dent andSons, Ltd., 1931.

76. . Good Speech: An Introduction to English Phonetics,London: J. M. Dent and Sons, Ltd., 1925.

77. Rush, James. The Philosophy of the Human Voice: Embracing itsPhysiological History; together with a System of Principles, bywhich Criticism in the Art of Elocution may be Rendered Intelligible, and Instruction, Definite and Comprehensive, 2nd ed., Philadelphia: Grigg and Elliott, 1833.'

78. Schubiger, Maria, English Intonation, Tubingen: Max NiemeyerVerlag, 1958.

79 . __________ . The Role of Intonation in Spoken English, Boston:Expression Company, 1935.

80. Shaffer, Harry L. "Pitch Information Rate for Connected Speech," Journal of the Acoustical Society of America, 35 (1963), 1876.

81. Sledd, James H. "Notes on English Stress," in First TexasConference on Problems of Linguistic Analysis of English (1956), Austin: University of Texas Press, 1962, 33-44. •

82. Steele, Joshua. Prosodia RatIona11s: or An Essay TowardsEstablishing the Melody and Measure of Speech, to be Expressed and Perpetuated by Peculiar Symbols, 2nd. ed-., London:J. Nichols, 1779.

83. Stetson, R. H. Motor Phonetics, 2nd ed., Amsterdam: North-Holland Publishing Company, 1951.

84. Stevens, K. N., T. T. Sandel, and A. S. Howse. "Perception of Two-Component Noise Bursts," Journal of the Acoustical Society of America, 34 (1962), 1876-1878.

85. __________ . "Towards a Model for Speech Recognition," Journa1of the Acoustic Society of America 32 (1960), 47-55.

86. Stockwell, Robert P. "On the Analysis of English Intonation,"in Second Texas Conference on Problems of Linguistic Analysis of English (1957), Austin: University of Texas Press, 1962,39-55.


87. Stockwell, Robert P. "The Place of Intonation in a Generative Grammar of English," Language, 36 (1960), 360.

88. Twaddell, W. F. "Stetson's Model and the 'SuprasegmentalPhonemes'," Language, 29 (1953), 415-453.

89. Uldall, Elizabeth. "Attitudinal Meanings Conveyed by Intonation Contours," Language and Speech, 3 (1960), 223-234.

90. Vygotsky, L. S. Thought and Language, translated by EugeniaHaufmann and Gertrude Vakar, Cambridge, Mass.: The M. I. T.Press, 1962.

91. Wells, Rulon S. "Immediate Constituents," Language, 23 (1947),81-117.

92. .. "The Pitch Phonemes of English," Language, 21 (1945), 27-40.


V

\ppendix A

RECORDING

c<

u o cu

O <0 O n3 CO U

OO O E-« © U

>>G Oo cuC O Oj

• P

Of i

<x §o o o +> '.£> +> oj

ocl t oo o

to* K j CS3o

Uo CD•«-i •H

•Hisci r H

o ,a

£ 3 ©uP *

(DGOXP jo

o•H C\J£2 CO

(— 1

°yCQ

93


Instrument

Plot

94

Frequency rospcr.se of Eriiel and Kjaar model 4132 condenser microphono. fhe~curve "belovr probably dees not represent the true response characteristics of the microphone because of the inadequacy of the speaker in the test chamber and because of the lack of a true ^nnchoic chamber.

.1 I .<5-r e> o*u 5ii


95

Frequency response of Briiel and Kjesr model 2603 microphone preamplifier.

tI.<5•t

O8 $3.n•o


2o0

5 JO

1CC0

iCOO

5000

1000

0

96

Frequency response of Sony model TC 800 monaural tape rccorde* •

A . The curve below represents the frequency response of the amplifier only of the Sony TC 800 tape recorder.


•wHarmonic distortion in the Sony model TC 800 tape recorder was no

"more than the "distortion of the Bruel and Kjeer beat frequency oscillator model 1022, less than 4.5J6.'

Rotation time of the motors in the Sony TC 800 was measured by a frequency counter in .1 second intervals and found to be true- within .00001^.


\x"

98

B. The curve below represents the record and playback males of the Sony TC 800 tape recorder taken from the ’’monitor” output of the recorder’s amplifier.

n i i 11 i i i T i T n w i n T L w n w n T Qh i i i !; M ! I i i i : I * ! ! i i ! I i i HI i ! I i i i i ! 11 ik_U LL? i U JJj 1 U UJJLi ■ H n , > If ! i n i l ] ! 1 1 : ! | i 1 1 ! | ! i • 1 1 !!....... I ! ! I ! ! ! I i i ! I ! !t ' ! i t ; j i ! i ! ! I ! ! ! i t ! ! !L ‘ j i i i in ! h i

! I ! ! ; I I iin:! i I j : i i t i l l ! !

£

L ; i

f“ .uUMlD T

1;

i i1 i1

! I M ! 11 i 11 ; 11 i M II I

I ! i IM M

! ! i ! M II !

! I !I L L U

! i

M j j T l4

J 0 $ l I

- f e 11r n - I T T ;i i : ! ! !

_!_m— LJ_'— !--- 1— ; y: i . 1 ; ■ ■ i ■ ; ; i i ; IsI . • ! ■ I ! ! ! I i

1 II ' I

i : M I ■ ! ! ! I ! i II I ! 1 ! I i ! i ! ! ! I i I! : ! I ! i M 1; 1 i i I!

i I i i i U ! i"!M i l i i h i iI ! i

M i l l IJJ iji j j j j IliTT

i l ! ii';11111! 11! ! I ! I M I > ! I I 111! i 11! 11 i! !

I • ! I I !

! i

! ! ! i j! i ! | i i

I ! ■ I ' !______________'J_M______________• i"l i | ; ! M i I : Ijjjjjjj M_ _

i I i i i 1 1 i ! 111 i i [ i ! i i j j j j j 8I Ii i i i j I j i j N j j j j j M I j i i ! 1 j I j ! j i i I ! i ! i i M

• i i : n "i f | 1 i I I i I I I ! f| Mill!'I i l i f ; ! ! h l l i i ! I I !

Mil!I I | i 1 1 ! ! i ! { ! i ! i J M i l l ! ! WJJJ 1 i i ! i M ! i ! i 1 ! ! 1 ! ! i l i T i f i l T i ! M M i M ! i ! ! ‘ M ;} I j j j j | : j I i I j j j I j J J H I 11

iHTnVi’iTfi! i i j I ! ( j I I ■ II I I

.! I I

I I I_ — :_ L

n m i i f

!_LU UJJ LLi r i T f

I j : > - i : j 1 ; ! i !U 11 M i M M ! i1 i/'/l

j i ! ! i I I I ! I . i : I ! ! II I J J J -L

LI I I i i H j L I U 1 1 i ! i | i i ! i j / L l j j j 11 Ll,i i j T i T | T T l T i i [ i T r ' ’ 11L!I i ! ! M I ! ! i I ! i I I i ! !Dtrmn

' ! : 1 i ! i IJ J jJ jJ LL i! 1111 iijjjjj n_j_i n i ! i T i fT T i H i i ,

! ! I! i M 1

11 M M M i


Appendix B

COPYING

o•H•H

99


Wollensak

model

Sony

TC 800

tape recorder

tape recorder

100

See Appendix A for information on Sony TC 800 taps recorder

Frequency response of Vfollensek model 1500 tape recorder at preamplifier output

O o.o


.Appendix C

DISTORTION COPYING

O o o o o

101


V/ollensak

model

lioath

600/1

Sony

TC 800

tape recorder

decade

attenuator

tape recorder

resister

102

Ses Appendix A for information on the Sony TC 800 tape recorder.

See Appendix B for information on the YiollensAh 1500 taps recorder.

Filtering characteristics of the Allison 2-BR variable filter. The curve^efoiv-'indTcates effective filtering at 20 db per octave attenuation with flat response within the desired frequency range.

:

i ■: s! U ! i!!!i s! s!H l 1 !J H :J ’j * ISli! I........ .

: : ! i ' ; M ; U ! i ‘ j I

: i I ! i i

y i j j j j y l l l l u j

■a s


Appendix D

PLAYBACK

<D•ri

-P ® C3 +>

es n.

c

MMo u o ®<—f *r4

© -H

O © C 03 CO P-,op o 6-t ©

103


Instrument

Plot

104

See A ppend ix A f o r in fo r m a t io n on th e Sony TC 800 ta p e re c o rd e r

F re q u e n cy re sp o n se o f th e Sherwood m odel S-1000 I I a u d io a m o l i f i e r .

r ' i ! ! { 1 i I j I

n i i i i w i T i i T n p n n i i i T i i i i i i i i i i! . I ' I L L

I ■ !* ~ ' ; i M /f: ! j i i : i i ! i I | I i I i I i i

> : ; ; i ! i ! ! ! ! i ' ! i ! : 11 i

r-~—

qjz:

M ^ ; H ; : i i { ! ! M ; ! M M ! I ! ! ! S ! H ! i ! j i* I ’ • : ! : * s ! * ! ; i i f : i ! f i i ! ! ! i I ' ! i * I I I i I !

hi-:-I N

I L L■ i i

i ! !DU : ' < i l li ! i

- I I

• : M !H i^ j j ._' . j J !_

! : j : : : ! I j I | ! i: ' i 1 : ; ; : ! I j : i ; i

, 1

; I ; H V !i M Vi r iill1 I i

j : ii’TV i I i

i ! ! ! I Mi

: m ; i . i h i i j i i i i i i h Hi i i i M: _ j. ._ i... _ ■_____ • J.i j j _ L

& r - v - \

I 1 ;D l

i f !

! ’ 1 :

! i

nn i ! i i I ! I jI i

! I l I I

I ; :

l i t

i !i l l ,

I

i i ' ; v | ! i ! I

i ! i i I * ! ' I ■ Ii i l i i i i i : ' i

H i ii I!

! ! ! IIJ !j ijITiTHM i l ' l l

! I M V i I :

: i •~ T

' r ' - - T r h r '1 : 1 I ; I

Vi T! ! I I : ! I ! ! :

& TI : \

‘ | i M • |! i 1 ! I ! Ii i * ! i : < 1 / 1. i i !

i ■

' U JIL 1U f il i l i

" i l j j t . ....

l i

i n ' 1 ! I ! •

i ! ! i

i j I j"I p * r

J_L.L "i r r

1 i M 1 1 1 1 ! I i l i ! I

I |/; i I ; I I N i ! i I I ji j;! i! i i j 11 i!; 11S! i i I i I ! N 1 1 1 1 1 1

i l l ! i i l ' j i l l M l i

i i

j :Ql1 iT

! i


105

Frequency response of Acoustic Research Laboratories model AR-2sx speaker system tested in an Industrial Acoustic Company sound suite model 405A. The peaks in the curve below may represent the resonance characteristics of the chamber in which the speaker system was tested. See the next response curve for the curve based on the manufacturer's frequency response specifications, which is probably more accurate.


106

rrequency response of Acoustic Research Laboratories model AR-2ax sp'sake'r "systern according to manufacturer’s specifications.

o <«


107

Harmonic distortion of Acoustic Research Laboratories model AR~2ax VpeckeF'systern according to manufacturer’s specifications

t t

a a

cowo«*-**A•5ocoEaJS

ooo

o » n n <o 10 ^

QZ00UJ(AocUia.(0ujo>•u

>•ozuDaIdKU.

<oco

£bfl

Nouaoisia iN3Daad


108

Frequency response of the entire playback system (Sony TC 800 tape rec"orde r ~ Sh er wood S-1000 II amplifier, and AH-2ax speaker) tested ir. an Industrial Ac ousting, Company sound suite model 405A. The peaks on the curve below probably represent in partthe resonance characteristics of the chamber in which thesystem was tested.

! J M i ! ! _ [ ] ' M iC ~ i •; i 11 j i i i i D I M M ' M i

! i M !i

1 <

! i • ! ! ! i! I !

_[ I Ij I _H i : ! 1 !! ! M ! !

r

I ' o

i I i U ! : i

i !: U ! ! I M1 i ! 1 i i ’ !

K ! ■ 1 1 i : M ! : M i i i : ! ! V: : i ; i ,M i : i ; 1 ' | ■ ■ ! i ! ■ : ; ii | ' M m

i i! ! IimIT

! I i m M I !i ! I !! i

T C m! I

2 I I T> 1 <V

I M j I i 1; i : 1 i 1 ' i 1 I i • ! 1 ! !! 1 ; I i ! ■ i 1 ! ! 1 } i t ! ! i I j g

! : i i !

i 1 i 1V": rrrrf■iMTTiTiT

I : ! ij i ! ij :i j ! j i M i i ;U i i U i J j ! u_‘i i i ! N ! !j i i !j.D11 h h i j {11 i I j :

L 1 i i I M i ' i i ' M i i If H i n H ! I ! j j i l i i M : ■ ! ! : ■ I i ■ I ' J ! I I0 1 7 1 : 1 M r i l i l 7 0 7 1 7 1 ; ! i 7 i j i i j i l i i j i l i i l lh i ! i I l i i i i i I : I H i ! i \ l ; ; i ! i ! i I ! I i ! i I I I I i i i H ii ! I i •

I ■ ^ : I M !! i ! i j I !S'. • ' i i ! ! i ' i ' ! ! h i j j | i j : M .j t _-■****■• • ” 1 ' w

! ! I i i . I ! I i !

Mil!i I ' i V i I I !i i\!

y ; : F|'~:~!"7 ’ ' | i 1 ' ! i : ! ' i ' i / ; • ! ! : ‘ ! i i ' ! ! i t i j t I i I f | | M i j

i n iT m n i; I'M n T i w m n T T j T i ^ n i n h u t

I : I! ! I ! I | !M 1 1 ! I ! I iU i ! i ! ! ! I i I s j j Sj I ! I j i i i i ! M M i ! j M I i ! 111“ ‘T T T T t n T m !T I T i T l T l T i i T i T r n T T r i Ti M M I i i i M i !! M ! M i 11

m i


AUTOBIOGRAPHY

Clarence Wilton McCord was b o m September 14, 1934, in Lawton,

Oklahoma„ He attended, first grade in Giddings, Texas, and completed

his elementary and secondary education in the public schools of New

Orleans, Louisiana, graduating from Alcee Fortier High School in

1952. He received the B.A. degree with honors from Louisiana

' College in 1956, the B.D. degree from Golden Gate Baptist Theologi-

-cal Seminary in I960, and the M.A.. degree from Louisiana State

University in 1962. He was Assistant Professor of Speech at

Howard Payne College from 1961 to 1963 and at Georgia Southern

College from 1963 to 1966. While in residency at Louisiana State

University, he served as Graduate Assistant in 1966-67 and

Special Lecturer in Speech in 1967-68.

He has returned to Georgia Southern College as Assistant

Professor of Speech for the summer session, 1968.

109


EXAMINATION AND THESIS REPORT

Candidate: Clarence Wilton McCord

M ajor Field: Speech

Title of Thesis: a Study of the Recognition of American English Intonation byNative Speakers

Approved:

Major Professor and CW fman

Dean of the Graduate School

EXAMINING COMMITTEE:

■<r , ( ^ v o

U J C u

D ate of Examination:

May P2, 1968


A Study of the Recognition of American English Intonation ...

Documents

Transcript of A Study of the Recognition of American English Intonation ...