University of Groningen An Acoustic Analysis of Vowel ...

University of Groningen

An Acoustic Analysis of Vowel Pronunciation in Swedish Dialects
Leinonen, Therese

IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please check the document version below.

Document Version: Publisher's PDF, also known as Version of record

Publication date: 2010

Link to publication in University of Groningen/UMCG research database

Citation for published version (APA): Leinonen, T. (2010). An Acoustic Analysis of Vowel Pronunciation in Swedish Dialects. s.n.

Copyright: Other than for strictly personal use, it is not permitted to download or to forward/distribute the text or part of it without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license (like Creative Commons).

The publication may also be distributed here under the terms of Article 25fa of the Dutch Copyright Act, indicated by the “Taverne” license. More information can be found on the University of Groningen website: https://www.rug.nl/library/open-access/self-archiving-pure/taverne-amendment.

Take-down policy: If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately and investigate your claim.

Downloaded from the University of Groningen/UMCG research database (Pure): http://www.rug.nl/research/portal. For technical reasons the number of authors shown on this cover page is limited to 10 maximum.

Download date: 14-11-2021

https://research.rug.nl/en/publications/an-acoustic-analysis-of-vowel-pronunciation-in-swedish-dialects(51096135-2965-4d55-882e-4f078cd52057).html

Chapter 4

Data

In this chapter the data set analyzed in this thesis is described. General information about the SweDia database where the data come from is given in § 4.1. The specific data set and the vowels that were chosen for the analyses are described in § 4.2, and in § 4.3 the speakers are described in more detail.

4.1 The SweDia Corpus

SweDia 2000 (Eriksson, 2004a,b) was a project carried out as a joint effort between the Swedish universities of Lund, Stockholm and Umeå. The aim was to document, analyze and describe the dialectal variation in the Swedish language area, with a special focus on phonetic and phonological descriptions. The project was financed by The Bank of Sweden Tercentenary Foundation and was carried out between 1998 and 2003. Dialect data were recorded at 107 sites in Sweden and the Swedish-language parts of Finland. At each site recordings were made with approximately twelve speakers representing two generations. The older speakers were in the approximate age range of 55–75 years and the younger speakers of 20–35 years. An equal number of male and female speakers were recorded in each age group. Hence, each recording site was represented by three older women, three older men, three younger women and three younger men, with a few exceptions (see § 4.3).

The sites for the recordings were chosen to represent the rural dialects, so that no speakers from cities or larger towns were included in the database. The motivation for this was that traditional rural dialects are disappearing in large parts of the Swedish language area and needed to be recorded before being completely lost (Eriksson, 2004b, see also § 2.1.2). Moreover, the language varieties in the cities have developed under different premises and have to be studied with different methods from those used for rural dialects. The language varieties in the cities have been influenced by large-scale immigration from the countryside, and in the cities there is significantly more social and linguistic stratification than in rural areas (Nordberg, 2005). The locations for the database were chosen to represent the dialectal situation in the Swedish language area by being balanced geographically and with respect to population density. To enable diachronic comparison, locations that had been subject to previous studies were favored when the geographic distribution allowed a choice between nearby locations.

The database consists of two types of data: a) spontaneous speech, and b) a controlled part for which specific phonetic and linguistic features were elicited. The spontaneous speech comprises free interviews with dialect speakers or dialogs between two dialect speakers. The controlled data focused on three specific areas of the dialects: 1) the sound system, 2) intonation and tone accents, and 3) the quantity system.

The recordings were made with a lapel microphone and a portable DAT recorder. The recordings were done at 48 kHz sample rate and 16-bit amplitude resolution. Before analysis the data were downsampled to 16 kHz/16 bit.
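The downsampling step can be illustrated with a minimal sketch. The source does not state which tool or filter was used for the conversion, so everything below is an assumption made for illustration: the Hamming-windowed sinc filter, its 63 taps and 7 kHz cutoff, and the function names. Going from 48 kHz to 16 kHz means keeping every third sample, after first removing energy above the new Nyquist frequency of 8 kHz so that it does not alias into the retained band.

```python
import math

def lowpass_taps(num_taps=63, cutoff_hz=7000.0, rate_hz=48000.0):
    """Hamming-windowed sinc low-pass filter, normalized to unit gain at 0 Hz.

    The tap count and cutoff are illustrative assumptions, not SweDia settings.
    """
    mid = (num_taps - 1) / 2
    fc = cutoff_hz / rate_hz  # cutoff as a fraction of the sample rate
    taps = []
    for n in range(num_taps):
        x = n - mid
        sinc = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(sinc * window)
    total = sum(taps)
    return [t / total for t in taps]

def downsample_by_3(samples, taps=None):
    """Low-pass filter, then keep every third sample (48 kHz -> 16 kHz)."""
    taps = taps or lowpass_taps()
    half = len(taps) // 2
    out = []
    for center in range(0, len(samples), 3):
        acc = 0.0
        for k, tap in enumerate(taps):
            i = center + k - half
            if 0 <= i < len(samples):  # samples outside the signal count as zero
                acc += tap * samples[i]
        out.append(acc)
    return out

def tone(freq_hz, n=4800, rate_hz=48000.0):
    """A synthetic sine tone, used here only as test input."""
    return [math.sin(2 * math.pi * freq_hz * i / rate_hz) for i in range(n)]

def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))

low = downsample_by_3(tone(1000.0))    # well below the new Nyquist frequency
high = downsample_by_3(tone(12000.0))  # above it; would alias without the filter
```

In this sketch the 1 kHz tone passes through essentially unchanged, while the 12 kHz tone is strongly attenuated before decimation.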

The recordings were made in the speakers' homes or other familiar places in order to make the participants feel comfortable and to make the use of the local vernacular feel natural. A quiet room without much reverberation was chosen to ensure good recording quality. Living rooms with many soft surfaces were preferred over kitchens, which contain many hard surfaces that produce reverb. Two interviewers were generally present at the recording sessions: one carried out the actual interview and the other was responsible for the technical equipment. In many cases the interviewers were not speakers of the local vernacular, but spoke a regional dialect or the regional standard language representing the larger geographical region of the recording site.

The fact that the interviewers are not speakers of the local vernacular might be a problem when recording dialect data. Especially in the Swedish language situation, where most speakers use code-mixing when varying between local, informal speech and more formal speech, it is difficult to say to what extent the speech of the interviewers has influenced the speakers. In a study of the local vernacular of Burträsk in the province of Västerbotten in the north of Sweden, Thelander (1979) recorded subjects in different speech situations and measured the use of dialectal versus standard forms of a number of morphological and morpho-phonological features. First, the local subjects participated in a free discussion with four locals. After one hour a “stranger” entered the discussion. He or she originated from the north of Sweden but not from Burträsk County and was introduced as a member of the research staff. There was no evidence of a complete code-switch, but the use of standard variants of the variables increased significantly among the local speakers after the “stranger” had entered the discussion. However, the increase of standard forms was strongest in sentences following immediately after the “stranger” had been talking. A number of the speakers who participated in the group discussions were also interviewed in Thelander's study. In the interview situation the use of standard forms was considerably higher than in the group discussions, and the difference between the discussion and the interview situation was larger than that between group discussions with and without the “stranger”. When the speakers were explicitly told that the purpose was to record the local colloquial language there was less difference in speech style in the different situations.

To avoid influencing the speaking style of the participants in the SweDia project, the interviewers tried to talk as little as possible and to give feedback with facial expressions and body language rather than verbally (Aasa et al., 2000). The participants were also told explicitly that the purpose of the interview was to record the local dialect. For some speakers it was more difficult than for others to keep speaking the local dialect when the interviewers spoke another variety. When eliciting the controlled data, the participants sometimes had to be reminded to give the local forms (Sw. bondska) instead of using standard variants.

4.2 Vowel data

The vowel data for this thesis come from the controlled part of the SweDia database focusing on the sound system. The purpose of this part of the database was to make phonetic and phonological analyses of the vowel systems of the dialects possible. As described in § 2.3.3 the Swedish dialects vary considerably with respect to the vowel systems, and the phoneme systems of many present-day dialects are unknown. Therefore, the word list for eliciting vowels was put together so that it would not only cover the Standard Swedish vowel system, but also reflect some Proto-Nordic features that are known to be preserved in some dialects.

A list of approximately 30 different words was put together for eliciting the vowel data. The list was, however, not constant across all sites but varied to some extent. For some of the most divergent dialects the word list turned out to be unsuitable, because many of the words in the list were not used in the local vernacular. This was the case for the sites Munsala, Orsa and Älvdalen, for which separate word lists were created. For the same reason, a few words in the original list had to be replaced for all or some of the speakers at some other sites. In order to keep the phonetic context of the vowels as stable as possible, words were chosen where the target vowels are surrounded by coronal consonants. Only words with an /r/ following the vowel were an exception from this rule, since some varieties of Swedish have a dorsal /r/ and not an apical /r/ like Standard Swedish. Including vowels in pre-/r/ context was still important because some of the Swedish vowels have allophonic variants that occur only before /r/.

The interviewers had prepared short questions¹ that would make the dialect speakers come up with a certain word. Once a speaker had guessed the right word, the word was repeated 3–5 times in isolation.

¹ For example, “Vad använder fiskaren för att fånga fisk?” 'What does the fisherman use for catching fish?', answer: “nät” 'net'.

The word list data were manually segmented and transcribed within the SweDia project and partly in the follow-up project SweDat. The segmentation and transcription were mainly done by student assistants working within the project. For each word, only the target phoneme was segmented and transcribed. The transcription is a rough phonetic transcription. The aim of the SweDia project was to gather data and provide researchers with the data for experimental and quantitative research purposes. The transcriptions were mainly intended to offer an overview of the data as well as to serve as a search tool. By not transcribing the data in fine phonetic detail, variation due to different transcribers was reduced. One example of the roughness of the transcriptions is that diphthongs are usually not transcribed, but indicated with the symbol of the nearest monophthong and an additional label “dift”. For the present thesis, vowel segments were analyzed acoustically. The transcriptions in the database were only used for identifying point vowels (see § 5.1.3).

4.2.1 Selected vowels

For this thesis a selection of the vowels in the SweDia word lists was made. Table 4.1 displays the 19 vowels selected: twelve long vowels and seven short vowels. Only words that were used for eliciting vowel phonemes at most of the locations in the database were chosen. Moreover, it turned out that some of the words in the original word lists were problematic for eliciting dialect data. The selected words include all the Standard Swedish long vowel phonemes and the allophones of /ɛː/ and /øː/. Of the Swedish short vowel phonemes four are missing: /ɛ̝/, /ɵ/, /ʊ/ and /ø̞/. However, the pre-/r/ allophones of /ɛ̝/ and /ø̞/ (that is, [æ] and [œ]) are represented.

For some reason, the /oː/ vowel was elicited with the word låt in the southern parts of the language area (administered by the University of Lund) and with lås in the central and northern parts (administered by the universities of Stockholm and Umeå). Even though, as a rule, only vowels elicited with the same word all over the language area were used for this study, an exception was made for /oː/, so that the complete set of Swedish long vowels would be represented.

Standard Swedish /øː/ is represented by two different words, lös and söt, in the selected data set. The reason for this is that the vowel in lös represents the Proto-Nordic diphthong /au/, while the vowel in söt was originally a monophthong. Some dialects have preserved two different phonemes in these two words.

4.2.2 Missing vowels

The reason that some of the Swedish short vowels are missing in the selected data set is that they had not been consistently elicited at all sites for the database. Sometimes different words were used at different sites for eliciting the vowels. However, two words that have the same vowel phoneme in Standard Swedish do not always have the same phoneme in all dialects. For this reason a decision was made to only use vowels elicited with the same word for comparison across dialects. The only exception to this rule was Standard Swedish /oː/, elicited with lås and låt (see previous section).
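The selection rule can be sketched as a simple coverage check. This is only an illustration under stated assumptions: the site names and word lists below are invented toy data, the function name is hypothetical, and the 0.9 threshold merely stands in for “most of the locations”, for which the source gives no exact figure.

```python
def words_with_coverage(words_by_site, min_share=0.9):
    """Return the words elicited at no less than min_share of the sites, sorted.

    min_share is an assumed threshold; the source only says "most" locations.
    """
    n_sites = len(words_by_site)
    counts = {}
    for words in words_by_site.values():
        for word in set(words):  # count each word once per site
            counts[word] = counts.get(word, 0) + 1
    return sorted(w for w, c in counts.items() if c / n_sites >= min_share)

# Toy data: two words used everywhere, and two competing words for one vowel.
words_by_site = {
    "site_a": ["dis", "nät", "lätt"],
    "site_b": ["dis", "nät", "läsk"],
    "site_c": ["dis", "nät", "lätt"],
    "site_d": ["dis", "nät", "läsk"],
}
selected = words_with_coverage(words_by_site, min_share=0.9)
```

In the toy data, a vowel elicited with lätt at some sites and läsk at others drops out, because neither word reaches the required coverage on its own.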

For eliciting the Standard Swedish phoneme /ø̞/ the word blött 'wet' was used. This turned out to be problematic, because some of the fieldworkers asked for the adjective blött (neuter form) while others asked for the verb form blött (supine), which are homophones in Standard Swedish. The vowels in the adjective and the verb, however, do not have the same historical origin. The vowel in the adjective blött originates from the Proto-Nordic diphthong /au/, while the vowel in the verb form originates from Proto-Nordic /eu/. A number of Swedish dialects have preserved the Proto-Nordic diphthongs, and thus also the reflexes of /au/ and /eu/, as separate phonemes. In the database the words are not tagged for word class; only the Standard Swedish orthographic form is given in addition to the phonetic transcription of the vowel segment. Comparing the reflex of /au/ in one dialect with the reflex of /eu/ in another dialect would show differences that do not have a linguistic basis. Therefore, blött was not included in the analysis, so that the phoneme /ø̞/ is only represented by its allophone [œ] in the word dörr.

Table 4.1. The words used for eliciting the vowels that comprise the data set for the current study.

Swedish word   Standard Swedish vowel   Proto-Nordic diphthong?   Word class, form   English translation
dis            /iː/                     no                        noun sing.         haze
disk           /ɪ/                      no                        noun sing.         counter, dishes, disk
typ            /yː/                     no                        noun sing.         type, jerk
flytta         /ʏ/                      no                        verb inf.          move
leta           /eː/                     yes                       verb inf.          seek, look for
lett           /e̞/                      yes                       verb sup.          lead
lus            /ʉː/                     no                        noun sing.         louse
nät            /ɛː/                     no                        noun sing.         net
lär            [æː]                     no                        verb pres.         teach
särk           [æ]                      no                        noun sing.         nightgown
söt            /øː/                     no                        adj. sing.         sweet
lös            /øː/                     yes                       adj. sing.         loose
dör            [œː]                     yes                       verb pres.         die
dörr           [œ]                      no                        noun sing.         door
lat            /ɑː/                     no                        adj. sing.         lazy
lass           /a/                      no                        noun sing.         load
lås/låt        /oː/                     no                        noun sing.         lock/tune
lott           /ɔ/                      no                        noun sing.         lot, share
sot            /uː/                     no                        noun sing.         soot

For eliciting /ɵ/ the word ludd 'fluff, fuzz' was used in the SweDia project. This was an unfortunate choice, since this word turned out to be unknown to many of the dialect speakers. In many dialects the words lo and lugg are used instead of ludd. In lugg the vowel phoneme is the same as in ludd; only the consonant context is different. The word lo, however, has a different phoneme (a long vowel). Therefore, the data concerning /ɵ/ are incomplete in the database. This is a shortcoming, since /ɵ/ is an important dialect marker: in the spoken language around Stockholm and Uppsala many speakers show a merger of /ɵ/ and /ø̞/, while, for example, in parts of the Finland-Swedish dialect area /ɵ/ is lacking as a separate phoneme, with the pronunciation being identical to /ʊ/.

For the phoneme /ɛ̝/ the word lätt 'easy' was used at some sites and läsk 'soft drink' at other sites. While lätt originates from Proto-Nordic, läsk is a relatively recent formation from the verb läska, which is a Low German loanword in Swedish. The ä in the two words does not have the same origin and the two words might have different phonemes in some dialects. Therefore the vowel was not included in the analysis.

The phoneme /ʊ/ was not included in the word lists for all of the locations, and could therefore not be included in this study.

It is regrettable that a few important vowel phonemes have not been consistently elicited for the SweDia database. Still, the SweDia data is a unique collection of systematically elicited data of modern Swedish dialects from the whole Swedish language area. Even though a few vowel phonemes are missing in the analysis in the present thesis, the data should be able to give a good picture of how the Swedish dialects relate to each other with respect to vowel pronunciation. All Standard Swedish long vowel phonemes are included in the data set, and as mentioned in § 2.3 the geographic variation is more prominent in Swedish long vowels than in Swedish short vowels.

4.3 Speakers

The total number of dialect speakers analyzed in this thesis is 1,170, recorded at 98 different sites. The sites are displayed in Figure 4.1. In addition, twelve speakers of Standard Swedish were included. These speakers were recorded in the SweDia project, too, and had been perceived as good representatives of Standard Swedish pronunciation. The speakers of Standard Swedish grew up in the greater Stockholm area and were all either professional linguists working at a Swedish university or students of a linguistic subject at the time of the recording.

In the SweDia project the aim was to record twelve speakers at each site: three older women, three older men, three younger women and three younger men. However, at some sites more than twelve speakers were recorded and at some sites the fieldworkers did not manage to find three speakers of each speaker group for a recording. Therefore, the number of speakers varies somewhat across sites and across speaker groups. In addition, not all words in Table 4.1 were recorded by all speakers at all sites. A decision was made to include only speakers who had recorded at least 13 out of the 19 vowels. The average number of speakers per site is twelve, but the number varies between eight and fourteen. The number of speakers per site included in this thesis is shown in Appendix A.
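The inclusion criterion for speakers amounts to a simple count. The following is a minimal sketch, not the procedure actually used for the thesis: the speaker identifiers and the toy recordings are invented, the function name is hypothetical, and lås/låt is treated as a single label for the /oː/ slot.

```python
# The 19 elicitation words of Table 4.1; "lås/låt" is one slot for /oː/.
SELECTED_VOWELS = [
    "dis", "disk", "typ", "flytta", "leta", "lett", "lus", "nät", "lär", "särk",
    "söt", "lös", "dör", "dörr", "lat", "lass", "lås/låt", "lott", "sot",
]
MIN_VOWELS = 13  # threshold stated in the text: at least 13 of the 19 vowels

def eligible_speakers(recorded_by_speaker, selected=SELECTED_VOWELS, minimum=MIN_VOWELS):
    """Return the speakers who recorded at least `minimum` of the selected vowels."""
    wanted = set(selected)
    return sorted(
        speaker
        for speaker, words in recorded_by_speaker.items()
        if len(wanted & set(words)) >= minimum
    )

# Hypothetical speakers with 19, 13 and 12 recorded vowels respectively.
recorded_by_speaker = {
    "speaker_1": SELECTED_VOWELS,
    "speaker_2": SELECTED_VOWELS[:13],
    "speaker_3": SELECTED_VOWELS[:12],
}
kept = eligible_speakers(recorded_by_speaker)
```

Only the first two hypothetical speakers meet the criterion; the third falls one vowel short.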

For the various analyses in this thesis average values per vowel were computed for groups of speakers. Three different groupings were made:

• one group per site

• two groups per site: older and younger speakers

• four groups per site: older women, older men, younger women, younger men

When the number of vowels recorded by a group was fewer than fifteen, the group was not included in any analyses. The total number of speakers in each group and the number of older and younger speakers and men and women in each group are listed in Appendix A.

Figure 4.1. The 98 sites in Sweden and Finland where the dialect data were recorded. The four biggest cities in the area are included as reference points in the map.
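The three groupings and the exclusion threshold can be sketched as follows. This is an illustration under assumptions rather than the thesis's own code: each record carries a single number as a stand-in for whatever acoustic measure is averaged, and the record layout, the site labels and the function names are hypothetical.

```python
from collections import defaultdict

# The three groupings, expressed as functions from a record to a group key.
GROUPINGS = {
    "site": lambda r: (r["site"],),
    "site_age": lambda r: (r["site"], r["age"]),
    "site_age_gender": lambda r: (r["site"], r["age"], r["gender"]),
}

def group_means(records, grouping, min_vowels=15):
    """Average the value per vowel within each group; drop sparse groups."""
    key_of = GROUPINGS[grouping]
    values = defaultdict(lambda: defaultdict(list))
    for r in records:
        values[key_of(r)][r["word"]].append(r["value"])
    means = {}
    for key, by_word in values.items():
        if len(by_word) < min_vowels:  # fewer than fifteen vowels: excluded
            continue
        means[key] = {w: sum(v) / len(v) for w, v in by_word.items()}
    return means

# Toy data: site "A" has all 19 vowels per speaker, site "B" only 10.
WORDS = [f"w{i}" for i in range(19)]  # stand-ins for the 19 elicitation words
records = []
for age, gender, base in [("old", "f", 1.0), ("old", "m", 2.0),
                          ("young", "f", 3.0), ("young", "m", 4.0)]:
    for w in WORDS:
        records.append({"site": "A", "age": age, "gender": gender, "word": w, "value": base})
    for w in WORDS[:10]:
        records.append({"site": "B", "age": age, "gender": gender, "word": w, "value": base})

by_site = group_means(records, "site")
by_age = group_means(records, "site_age")
by_four = group_means(records, "site_age_gender")
```

Site "B" disappears from every grouping because its groups cover only ten vowels, while site "A" yields one, two and four groups respectively.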

Some of the analyses presented in Chapters 6 and 7 can handle missing data in the data matrix, while others cannot. Factor analysis (§ 6.3) is sensitive to missing data, so only objects with data for all 19 vowels were included. Multidimensional scaling (Chapter 7), on the other hand, is based on average vowel distances, which can be calculated for a smaller number of vowels without biasing the results. Therefore, a larger number of speakers are included in the multidimensional scaling than in the factor analysis. Groups without the full number of vowels are indicated by footnotes in Appendix A.
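The difference between the two ways of handling missing data can be made concrete with a small sketch. The per-vowel feature vectors, the Euclidean distance and the function names are assumptions chosen for illustration; the acoustic representation and distance measure actually used are described in later chapters.

```python
import math

def complete_cases(groups, vowels):
    """Keep only the groups that have data for every vowel (as factor analysis requires)."""
    return {name: data for name, data in groups.items() if all(v in data for v in vowels)}

def average_vowel_distance(a, b):
    """Mean per-vowel distance over the vowels that both groups have."""
    shared = sorted(set(a) & set(b))
    if not shared:
        raise ValueError("the two groups have no vowels in common")
    return sum(math.dist(a[v], b[v]) for v in shared) / len(shared)

# Toy groups with hypothetical two-dimensional vowel features.
vowels = ["dis", "lat", "sot"]
groups = {
    "group_1": {"dis": (0.0, 0.0), "lat": (3.0, 4.0), "sot": (1.0, 1.0)},
    "group_2": {"dis": (0.0, 0.0), "lat": (0.0, 0.0)},  # "sot" is missing
}

usable_for_factor_analysis = complete_cases(groups, vowels)
distance = average_vowel_distance(groups["group_1"], groups["group_2"])
```

The group with a missing vowel is dropped by the complete-case filter but still receives a distance, averaged over the two vowels it shares with the other group.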

The average birth year for the older speakers is 1933 and for the younger speakers 1973. When the recordings were made, the average age of the older speakers was 66 and the average age of the younger speakers 26. As the histograms in Figure 4.2 show, the age range is larger for the older speakers than for the younger speakers. The older speakers were born between 1911 and 1957, while the younger speakers were born between 1959 and 1982.

One purpose of including speakers from two age categories in the SweDia database was to make studies of language change in apparent time possible. The age range of the younger speakers was deliberately chosen so that the group would not include teenagers, but somewhat older speakers. While the language of teenagers can include features that are dropped when the speakers get older, speakers in their 20s or early 30s who still lived in the municipality where they grew up were considered to be representative of the local dialect (Eriksson, 2004b).

Figure 4.2. Histograms of the birth years of the speakers. [Two panels, men and women; x-axis: birth year, 1910–1990; y-axis: frequency, 0–60; older and younger speakers marked separately.]

The speakers were all born in the area where they were recorded and had lived there most of their lives. An additional requirement for the younger speakers was that their parents were speakers of the local vernacular. For the older speakers this was not a requirement, but the parents of most of the older speakers were at least from the same larger region.
