PetersonBarney_JAcoustSocAM_1952


  • 8/8/2019 PetersonBarney_JAcoustSocAM_1952


    THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, VOLUME 24, NUMBER 2, MARCH, 1952

    Control Methods Used in a Study of the Vowels

    GORDON E. PETERSON AND HAROLD L. BARNEY
    Bell Telephone Laboratories, Inc., Murray Hill, New Jersey
    (Received December , 1951)

    Relationships between a listener's identification of a spoken vowel and its properties as revealed from acoustic measurement of its sound wave have been a subject of study by many investigators. Both the utterance and the identification of a vowel depend upon the language and dialectal backgrounds and the vocal and auditory characteristics of the individuals concerned. The purpose of this paper is to discuss some of the control methods that have been used in the evaluation of these effects in a vowel study program at Bell Telephone Laboratories. The plan of the study, calibration of recording and measuring equipment, and methods for checking the performance of both speakers and listeners are described. The methods are illustrated from results of tests involving some 76 speakers and 70 listeners.

    INTRODUCTION

    CONSIDERABLE variation is to be found in the processes of speech production because of their complexity and because they depend upon the past experience of the individual. As in much of human behavior there is a self-correcting, or servomechanism type of feedback involved as the speaker hears his own voice and adjusts his articulatory mechanisms.

    In the elementary case of a word containing a consonant-vowel-consonant phoneme²,³ structure, a speaker's pronunciation of the vowel within the word will be influenced by his particular dialectal background; and his pronunciation of the vowel may differ both in phonetic quality and in measurable characteristics from that produced in the word by speakers with other backgrounds. A listener, likewise, is influenced in his identification of a sound by his past experience.

    Variations are observed when a given individual makes repeated utterances of the same phoneme. A very significant property of these variations is that they are not random in a statistical sense, but show trends and sudden breaks or shifts in level, and other types of nonrandom fluctuations. Variations likewise appear in the successive identifications by a listener of the same utterance. It is probable that the identification of repeated sounds is also nonrandom but there is little direct evidence in this work to support such a conclusion.

    A study of sustained vowels was undertaken to investigate in a general way the relation between the vowel phoneme intended by a speaker and that identified by a listener, and to relate these in turn to acoustical measurements of the formant or energy concentration positions in the speech waves.

    In the plan of the study certain methods and techniques were employed which aided greatly in the collection of significant data. These methods included randomization of test material and repetitions to obtain sequences of observations for the purpose of checking the measurement procedures and the speaker and listener consistency. The acoustic measurements were made with the sound spectrograph; to minimize measurement errors, a method was used for rapid calibration of the recording and analyzing apparatus by means of a complex test tone. Statistical techniques were applied to the results of measurements, both of the calibrating signals and of the vowel sounds.

    These methods of measurement and analysis have been found to be precise enough to resolve the effects of different dialectal backgrounds and of the nonrandom trends in speakers' utterances. Some aspects of the vowel study will be presented in the following paragraphs to illustrate the usefulness of the methods employed.

    ¹ Bernard S. Lee, J. Acoust. Soc. Am. 22, 824 (1950).
    ² B. Bloch, Language 24, 3 (1948).
    ³ B. Bloch, Language 26, 88 (1950).
    ⁴ R. K. Potter and J. C. Steinberg, J. Acoust. Soc. Am. 22, 807 (1950).

    EXPERIMENTAL PROCEDURES

    The plan of the study is illustrated in Fig. 1. A list of words (List 1) was presented to the speaker and his utterances of the words were recorded with a magnetic tape recorder. The list contained ten monosyllabic words each beginning with [h] and ending with [d] and differing only in the vowel. The words used were heed, hid, head, had, hod, hawed, hood, who'd, hud, and heard. The order of the words was randomized in each list, and each speaker was asked to pronounce two different lists. The purpose of randomizing the words in the list was to avoid practice effects which would be associated with an unvarying order.

    If a given List 1, recorded by a speaker, were played back to a listener and the listener were asked to write down what he heard on a second list (List 2), a comparison of List 1 and List 2 would reveal occasional differences, or disagreements, between speaker and listener. Instead of being played back to a listener, List 1 might be played into an acoustic measuring device and the outputs classified according to the measured properties of the sounds into a List 3. The three lists will differ in some words depending upon the characteristics of the speaker, the listener, and the measuring device.

    A total of 76 speakers, including 33 men, 28 women and 15 children, each recorded two lists of 10 words, making a total of 1520 recorded words. Two of the speakers were born outside the United States and a few others spoke a foreign language before learning English. Most of the women and children grew up in the Middle Atlantic speech area. The male speakers represented a much broader regional sampling of the United States; the majority of them spoke General American.⁵

    The words were randomized and were presented to a group of 70 listeners in a series of eight sessions. The listening group contained only men and women, and represented much the same dialectal distribution as did the group of speakers, with the exception that a few observers were included who had spoken a foreign language throughout their youth. Thirty-two of the 76 speakers were also among the 70 observers.

    The 1520 words were also analyzed by means of the sound spectrograph.⁶,⁷ Representative spectrograms and sections of these words by a male speaker are shown in Fig. 3 of the paper by R. K. Potter and J. C. Steinberg;⁴ a similar list by a female speaker is shown here as Fig. 2.⁸ In the spectrograms, we see the initial [h] followed by the vowel, and then by the final [d]. There is generally a part of the vowel following the influence of the [h] and preceding the influence of the [d] during which a practically steady state is reached. In this interval, a section is made, as shown to the right of the spectrograms. The sections, portraying frequency on a horizontal scale, and amplitude of the voiced harmonics on the vertical scale, have been measured with calibrated Plexiglass templates to provide data about the fundamental and formant frequencies and relative formant amplitudes of each of the 1520 recorded sounds.

    Fig. 1. Recording and measuring arrangements for vowel study.

    Fig. 2. Broadband spectrograms and amplitude sections of the word list by a female speaker.
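As a rough illustration of the recording plan described above, the following sketch builds two differently ordered ten-word lists for each speaker and checks the size of the resulting corpus. The function names, the speaker labels, and the use of Python's `random` module are assumptions made for illustration; the paper does not say how the word orders were drawn.

```python
import random

# The ten test words from the paper's word list.
WORDS = ["heed", "hid", "head", "had", "hod",
         "hawed", "hood", "who'd", "hud", "heard"]

# Group sizes reported in the text.
GROUPS = {"men": 33, "women": 28, "children": 15}


def make_lists(rng, n_lists=2):
    """Return n_lists shuffled copies of WORDS, no two in the same order."""
    lists = []
    while len(lists) < n_lists:
        order = WORDS[:]
        rng.shuffle(order)
        if order not in lists:  # each speaker read two *different* lists
            lists.append(order)
    return lists


def build_corpus(seed=0):
    """Map a hypothetical speaker label to that speaker's shuffled lists."""
    rng = random.Random(seed)
    corpus = {}
    for group, size in GROUPS.items():
        for k in range(size):
            corpus[f"{group}-{k + 1}"] = make_lists(rng)
    return corpus


corpus = build_corpus()
total_words = sum(len(lst) for lists in corpus.values() for lst in lists)
print(len(corpus), total_words)  # 76 speakers, 1520 recorded words
```

Seeding the generator keeps the drawn orders reproducible, which would matter if the same lists had to be regenerated later for scoring.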

    LISTENING TESTS

    The 1520 recorded words were presented to the group of 70 adult observers over a high quality loud speaker system in Arnold Auditorium at the Murray Hill Laboratories. The general purpose of these tests was to obtain an aural classification of each vowel to supplement the speaker's classification. In presenting the words to the observers, the procedure was to reproduce at each of seven sessions, 200 words recorded by 10 speakers. At the eighth session, there remained five men's and one child's recordings to be presented; to these were added three women's and one child's recordings which had been given in previous sessions, making again a total of 200 words. The sound level at the observers' positions was approximately 70 db re 0.0002 dyne/cm², and varied over a range of about 3 db at the different positions.

    In selecting the speakers for each of the first seven sessions, 4 men, 4 women, and 2 children were chosen at random from the respective groups of 33, 28, and 15. The order of occurrence of the 200 words spoken by the 10 speakers for each session was randomized for presentation to the observers.

    ⁵ C. K. Thomas, Phonetics of American English (The Ronald Press Company, New York, 1947).
    ⁶ Koenig, Dunn, and Lacy, J. Acoust. Soc. Am. 17, 19 (1946).
    ⁷ L. G. Kersta, J. Acoust. Soc. Am. 20, 796 (1948).
    ⁸ Key words for the vowel symbols are as follows: [i] heed, [ɪ] hid, [ɛ] head, [æ] had, [ɑ] father, [ɔ] ball, [ʊ] hood, [u] who'd, [ʌ] hud, [ɝ] heard.

    Fig. 3. Vowel loop with numbers of sounds unanimously classified by listeners; each sound was presented 152 times. [Horizontal axis: frequency of second formant in cycles per second.]
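The stated listening level can be turned into a pressure with the usual decibel relation p = p_ref × 10^(L/20). The sketch below is a reader's convenience and not part of the paper; the function name is hypothetical, and the reference pressure is the 0.0002 dyne/cm² quoted in the text.

```python
REF_PRESSURE = 0.0002  # dyne/cm^2, the reference quoted in the text


def level_to_pressure(level_db, ref=REF_PRESSURE):
    """Convert a sound pressure level in db to a pressure in dyne/cm^2."""
    return ref * 10 ** (level_db / 20)


p70 = level_to_pressure(70)
# Pressure ratio corresponding to a 3 db spread in level.
ratio_3db = level_to_pressure(3) / level_to_pressure(0)
print(round(p70, 3), round(ratio_3db, 2))
```

On this relation a level of 70 db corresponds to roughly 0.63 dyne/cm², and a 3 db spread across seats corresponds to a pressure ratio of about 1.4.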

    Each observer was given a pad containing 200 lines having the 10 words on each line. He was asked to draw a line through the one word in each line that he heard. The observers' seating positions in the auditorium were chosen by a randomizing procedure, and each observer took the same position for each of the eight sessions, which were given on eight different days.

    The randomizing of the speakers in the listening sessions was designed to facilitate checks of learning effects from one session to another. The randomizing of words in each group of 200 was designed to minimize successful guessing and the learning of a particular speaker's dialect. The seating positions of the listeners were randomized so that it would be possible to determine whether position in the auditorium had an effect on the identification of the sounds.
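The session arithmetic can be checked with a short sketch: drawing 4 men, 4 women, and 2 children without replacement for seven sessions should leave exactly the five men and one child that the text assigns to the eighth session. The speaker labels and the drawing procedure below are assumptions; the paper says only that the choice was made at random.

```python
import random

GROUPS = {"men": 33, "women": 28, "children": 15}       # speakers per group
PER_SESSION = {"men": 4, "women": 4, "children": 2}     # drawn per session


def plan_sessions(seed=0, n_sessions=7):
    """Draw speakers without replacement for the first n_sessions sessions."""
    rng = random.Random(seed)
    # Hypothetical speaker labels such as "men-1", shuffled once per group.
    pools = {}
    for group, size in GROUPS.items():
        pool = [f"{group}-{k + 1}" for k in range(size)]
        rng.shuffle(pool)
        pools[group] = pool
    sessions = []
    for _ in range(n_sessions):
        chosen = []
        for group, count in PER_SESSION.items():
            chosen += [pools[group].pop() for _ in range(count)]
        sessions.append(chosen)
    leftover = {group: len(pool) for group, pool in pools.items()}
    return sessions, leftover


sessions, leftover = plan_sessions()
print(leftover)  # {'men': 5, 'women': 0, 'children': 1}
```

The six speakers left over, plus four recordings repeated from earlier sessions, give the ten speakers and 200 words of the eighth session.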

    DISCUSSION OF LISTENING TEST RESULTS

    The total of 1520 sounds heard by the observers consisted of the 10 vowels, each presented 152 times. The ease with which the observers classified the various vowels varied greatly. Of the 152 [i] sounds, for instance, 143 were unanimously classified by all observers as [i]. Of the 152 sounds which the speakers intended for [ɑ], on the other hand, only 9 were unanimously classified as [ɑ] by the whole jury.

    These data are summarized in Fig. 3. This figure shows the positions of the 10 vowels in a vowel loop in which the frequency of the first formant is plotted against the frequency of the second formant on mel scales.⁹,¹⁰ In this plot the origin is at the upper right. The numbers beside each of the phonetic symbols are the numbers of sounds, out of 152, which were unanimously classified as that particular vowel by the jury. It is of interest in passing that in no case did the jury agree unanimously that a sound was something other than what the speaker intended. Figure 3 shows that [i], [ɝ], [æ], and [u] are generally quite well understood.

    ⁹ R. K. Potter and G. E. Peterson, J. Acoust. Soc. Am. 20, 528 (1948).
    ¹⁰ S. S. Stevens and J. Volkman, Am. J. Psychol. 329 (July, 1940).
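Fig. 3 is plotted on mel scales, which the cited source gives as an empirically measured curve. For readers who want to reproduce a similar plot, the sketch below uses a later closed-form approximation in common use, m = 2595 log10(1 + f/700). This formula is an assumption introduced here and is not the curve the authors used; the example frequencies are arbitrary and are not measurements from the paper.

```python
import math


def hz_to_mel(f_hz):
    """Approximate mel value for a frequency in cycles per second."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


# Arbitrary example frequencies, not data from the study.
for f in (500, 1000, 2000):
    print(f, round(hz_to_mel(f)))
```

The approximation maps 1000 cycles per second to about 1000 mels and compresses higher frequencies, which is why the second-formant axis of such a plot is less stretched than a linear frequency axis would be.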

    To obtain the locations of the small areas shown in Fig. 3, the vowels were repeated by a single speaker on twelve different days. A line enclosing all twelve points was drawn for each vowel; the differences in the shapes of these areas probably have little significance.

    When the vowels are plotted in the manner shown in Fig. 3, they appear in essentially the same positions as those shown in the tongue hump position diagrams which phoneticians have employed for many years.¹¹ The terms "high, front, low back" refer to the tongue positions in the mouth. The [i], for instance, is made with the tongue hump high and forward, the [u] with the hump high and back, and the [ɑ] and [æ] with the tongue hump low.

    It is of interest that when observers disagreed with speakers on the classification of a vowel, the two classifications were nearly always in adjacent positions of the vowel loop of Fig. 3. This is illustrated by the data shown on Table I. This table shows how the observers classified the vowels, as compared with the vowels intended by the speakers. For instance, on all the 152 sounds intended as [i] by the speakers, there were 10,267 total votes by all observers that they were [i], 4 votes for [ɪ], 6 votes for [ɛ], and 3 votes for [ɔ]. Of the 152 [ɑ] sounds, there was a large fraction of the sounds on which some of the observers voted for [ɔ]. [ɪ] was taken for [ɛ] a sizable percentage of the time, and [ɛ] was called either [ɪ] or [æ] (adjacent sounds on the vowel loop shown in the preceding Fig. 3) quite a large number of times. [ɑ] and [ɔ], and [ʌ] and [ɑ] were also confused to a certain extent. Here again, as in Fig. 3, the [i], [ɝ], [æ], and [u] show high intelligibility scores.

    It is of considerable interest that the substitutions shown conform to present dialectal trends in American speech rather well, and in part, to the prevailing vowel shifts observable over long periods of time in most languages.¹³ The common tendency is continually to shift toward higher vowels in speech, which correspond to smaller mouth openings.

    The listener, on the other hand, would tend to make the opposite substitution. This effect is most simply described in terms of the front vowels. If a speaker produces [ɪ] for [ɛ], for example [mɪn] for [mɛn] as currently heard in some American dialects, then such an individual when serving as a listener will be inclined to write men when he hears [mɪn]. Thus it is that in the substitutions shown in Table I, [ɪ] most frequently became [ɛ], and [ɛ] most frequently became [æ]. The explanation of the high intelligibility of [æ] is probably based on this same pattern. It will be noted along the vowel loop that a wide gap appears between [æ] and [ɑ]. The [a] of the Romance languages appears in this region. Since that vowel was present in neither the lists nor the dialects of most of the speakers and observers the [æ] was usually correctly identified.

    The [i] and the [u] are the terminal or end positions in the mouth and on the vowel loop toward which the vowels are normally directed in the prevailing process of pronunciation change. In the formation of [i] the tongue is humped higher and farther forward than for any other vowel; in [u] the tongue hump takes the highest posterior position in the mouth and the lips are more rounded than for any other vowel. The vowels [u] and [i] are thus much more difficult to displace, and a greater stability in the organic formation of these sounds would probably be expected, which in turn should mean that these sounds are recognized more consistently by a listener.

    ¹¹ D. Jones, An Outline of English Phonetics (W. Heffer and Sons, Ltd., Cambridge, England, 1947).
    ¹² G. W. Gray and C. M. Wise, The Bases of Speech (Harper Brothers, New York, 1946), pp. 217-302.
    ¹³ L. Bloomfield, Language (Henry Holt and Company, New York, 1933), pp. 369-391.
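The vote counts quoted in the text for the sounds intended as [i] can be turned into an agreement share with a few lines of arithmetic. The sketch below uses only numbers that appear in the text (10,267 agreeing votes; 4, 6, and 3 disagreeing votes; 143 and 9 unanimous classifications out of 152); the function name is hypothetical.

```python
def agreement_share(agree_votes, disagree_votes):
    """Fraction of all votes that matched the speaker's intended vowel."""
    total = agree_votes + sum(disagree_votes)
    return agree_votes / total


# Votes on the 152 sounds intended as [i], as quoted in the text.
share_i = agreement_share(10267, [4, 6, 3])

# Sounds classified unanimously, out of 152 presentations per vowel.
unanimous_high = 143 / 152  # the [i] sounds
unanimous_low = 9 / 152     # the vowel with the fewest unanimous results

print(f"{share_i:.4f} {unanimous_high:.3f} {unanimous_low:.3f}")
# 0.9987 0.941 0.059
```

The comparison shows why unanimity is a much stricter criterion than vote share: a single dissenting vote among all the observers removes a sound from the unanimous count while barely changing the share.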

    The high intelligibility of [ɝ] probably results from the retroflexion which is present to a marked degree only in the formation of this vowel; that is, in addition to the regular humping of the tongue, the edges of the tongue are turned up against the gum ridge or the hard palate. In the acoustical pattern the third formant is markedly lower than for any other vowel. Thus in both physiological and acoustical phonetics the [ɝ] occupies a singular position among the American vowels.

    The very low scores on [ɑ] and [ɔ] in Fig. 3 undoubtedly result primarily from the fact that some members of the speaking group and many members of the listening group speak one of the forms of American dialects in which [ɑ] and [ɔ] are not differentiated.

    When the individuals' votes on the sounds are analyzed, marked differences are seen in the way they classified the sounds. Not only did the total numbers of agreements with the speakers vary, but the proportions of agreements for the various vowels were significantly different. Figure 4 will be used to illustrate this point. If we plot total numbers of disagreements for all tests, rather than agreements, the result is shown by the upper chart. This shows that [ɪ], [ɛ], [ɑ], [ɔ], and [ʌ] had the most disagreements. An "average" observer would be expected to have a distribution of disagreements similar in proportions to this graph. The middle graph illustrates the distribution of disagreements given by observer number 06. His chief difficulty was in distinguishing between [ɑ] and [ɔ]. This type of distribution is characteristic of several observers. Observer 13, whose distribution of disagreements is plotted on the bottom graph, shows a tendency to confuse [ɪ] and [ɛ] more than the average.

    The distributions of disagreements of all 70 observers differ from each other, depending on their language experience, but the differences are generally less extreme than the two examples shown on Fig. 4. Thirty-two of the 70 observers were also speakers. In cases where an observer such as 06 was also a speaker, the remainder of the jury generally had more disagreements with his [ɑ] and [ɔ] sounds than with the other sounds he spoke. Thus it appears that if a speaker does not differentiate clearly between a pair of sounds in speaking them, he is unlikely to classify them properly when he hears others speak them. His language experience, as would be expected, influences both his speaking and his hearing of sounds.

    Since the listening group was not given a series of training sessions for these tests, learning would be expected in the results of the tests.¹⁴ Several pieces of evidence indicate a certain amount of practice effect, but the data are not such as to provide anything more than a very approximate measure of its magnitude.

    For one check on practice effect, a ninth test was given the jury, in which all the words having more than 10 disagreements in any of the preceding eight tests were repeated. There was a total of about 175 such words; to these were added 25 words which had no disagreements, picked at random from the first eight tests. On the ninth test, 67 words had more disagreements, 109 had fewer disagreements, and 24 had the same number of disagreements as in the preceding tests. The probability of getting this result had there been no practice or other effect, but only a random variation of observers' votes, would be about 0.01. When these data are broken down into three groups for the men, women and children speakers, the largest differences in numbers of disagreements for the original and repeated tests were on the children's words, indicating a larger practice or learning effect on their sounds. The indicated learning effect on men's and women's speech was nearly the same. When the data are classified according to the vowel sound, the learning effect indicated by the repetitions was least on [i], [ɝ], and [u], and greatest on [ɑ] and [ɔ].

    Another indication that there was a practice effect lies in the sequence of total numbers of disagreements by tests. From the second to the seventh test, the total number of disagreements by all observers diminished consistently from test to test, and the first test had considerably more disagreements than the eighth, thus strongly indicating a downward trend. With the speakers randomized in their order of appearance in the eight tests, each test would be expected to have approximately the same number of disagreements. The probability of getting the sequence of numbers of total disagreements which was obtained would be somewhat less than 0.05 if there were no learning trend or other nonrandom effect.
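The ninth-test comparison (67 words with more disagreements, 109 with fewer, 24 unchanged) lends itself to a simple significance check. The paper does not say which test produced the quoted probability of about 0.01, so the exact two-sided sign test below is only one plausible choice, introduced here as an assumption; it need not reproduce the authors' figure, particularly since the 25 words with no earlier disagreements could not improve.

```python
from math import comb


def sign_test_p(n_more, n_less):
    """Exact two-sided sign test p-value, with ties excluded."""
    n = n_more + n_less
    k = min(n_more, n_less)
    # Probability of k or fewer "successes" in n fair coin tosses.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


more, fewer, same = 67, 109, 24
p_value = sign_test_p(more, fewer)
print(more + fewer + same, p_value < 0.05)  # 200 True
```

Under this test the imbalance between 67 and 109 is unlikely to arise from random variation alone, which agrees in direction with the practice effect the authors describe.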

    It was also found that the listening position had an effect upon the scores obtained. The observers were arranged in 9 rows in the auditorium, and the listeners in the back 4 rows had a significantly greater number of disagreements with the speakers than did the listeners in the first 5 rows. The effect of a listener's position within an auditorium upon intelligibility has been observed previously and is reported in the literature.¹⁵

    ¹⁴ H. Fletcher and R. H. Galt, J. Acoust. Soc. Am. 22, 93 (1950).
    ¹⁵ V. O. Knudsen and C. M. Harris, Acoustical Designing in Architecture (John Wiley and Sons, New York, 1950), pp. 180-181.

    Fig. 4. Observer disagreements in listening tests. [Panels: all observers, observer 06, observer 013.]

    ACOUSTIC MEASUREMENTS

    Calibrations of Equipment

    A rapid calibrating technique was developed for checking the over-all performance of the recording and analyzing systems. This depended on the use of a test tone which had an envelope spectrum that was essentially flat with frequency over the voice band. The circuit used to generate this test tone is shown schematically in Fig. 5. It consists essentially of an overloading amplifier and pulse sharpening circuit. The wave shapes which may be observed at several different points in the test tone generator are indicated in Fig. 5.

    The test tone generator may be driven by an input sine wave signal of any frequency between 50 and 2000 cycles. Figure 6(a) shows a section of the test tone with a 100 cycle repetition frequency, which had been recorded on magnetic tape in place of the word lists by the speaker, and then played back into the sound spectrograph. The departure from uniform frequency response of the over-all systems is indicated by the shape of the envelope enclosing the peaks of the 100
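The reason a train of sharpened pulses serves as a calibrating tone can be seen in a small numerical sketch: an idealized narrow pulse repeated 100 times per second has harmonics of equal magnitude, so any unevenness in the analyzed spectrum reflects the recording and analyzing system rather than the source. The sample rate, the one-sample pulse width, and the function name below are assumptions for illustration and do not describe the actual circuit.

```python
import cmath


def dft_magnitudes(x):
    """Magnitudes of the discrete Fourier transform of a real sequence."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]


SAMPLE_RATE = 8000      # samples per second (assumed for this sketch)
REPETITION_RATE = 100   # pulses per second, as in the 100 cycle example
period = SAMPLE_RATE // REPETITION_RATE  # 80 samples per pulse period

# One period of an idealized test tone: a single one-sample pulse.
one_period = [1.0] + [0.0] * (period - 1)

# Bin k of a one-period transform is the harmonic at k * REPETITION_RATE.
mags = dft_magnitudes(one_period)
harmonics = mags[1 : period // 2]  # 100 to 3900 cycles per second
print(len(harmonics), round(min(harmonics), 6), round(max(harmonics), 6))
# 39 1.0 1.0
```

A real pulse has finite width, so its envelope falls off at high frequencies; the text's phrase "essentially flat with frequency over the voice band" is consistent with a pulse narrow enough that this fall-off lies above the band of interest.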