Analyses of first names in The Netherlands: full population studies
Gerrit Bloothooft
Institute of Linguistics OTS
Utrecht University
CTL colloquium, June 2006
Dutch studies on first names
Limited scientific work so far
– Dictionary (20,000 entries)
– Few socio-linguistic studies
• Limited scope, small samples
Topic is extremely popular in the media
Research dimensions in onomastics
– Name
– Form and spelling
– Origin
– Motives
– Time
– Place
require a lot of data
Full population
Gemeentelijke Basis Administratie (GBA), Civil Administration
Electronically from 1994
Legal right to use data for scientific research
16+ million people
Connected!
UiL-OTS and the Meertens Institute have been connected to the GBA since June 1, 2006
The right to make a rich data extraction for the full population (all persons with Dutch nationality): planned July 1, 2006
Research proposal NWO
The first name revolution in the 20th century in The Netherlands – the first name as a measure of social and linguistic change
Milestones
Traditional naming (after relatives) decreased enormously during the 20th century, especially in the second half
Full freedom for parents through name law of 1970
-> Naming of children became a very personal linguistic and social expression during the last 50 years
Major topics
Changes in naming after relatives
Relations between names and social classes (sets and spelling)
Regional spread of names, dialectal influences
Life cycles of names
What do we get (per person)
All first names
Date, place, postal code and country of birth, gender, date of death (after 1994)
Parents: first names, date & place of birth
Children: first names, date & place of birth
Administrative number of all persons with their own record
This is unprecedented (also internationally)!
Looking for mechanisms
All research topics can be described as the search for large-scale mechanisms and relations
Away from the individual name, towards much higher aggregation levels
Towards name sets
From 16+ million names with over 200,000 different first names to a much lower number of name sets that have homogeneous properties
A previous study (2000-2004)
First names from the National Social Security Bank (SVB)
All children born since 1983
– first name (official, no nickname, but..)
– year of birth
– family code (separate table)
– postal code (four digits)
A very rich source
4.2 million children (1983-2002)
– 200,000 per year
1.9 million families
176,800 different first names
– 108,500 unique names
– 3,120 names with frequency > 100 represent 85% of the children
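The figures on this slide are simple summaries of a name-frequency table. A minimal sketch of how such summaries could be computed, assuming one list entry per child; the helper `summarise_names` and the toy data are invented for illustration and are not the SVB data or the author's code:

```python
from collections import Counter


def summarise_names(names, threshold):
    """Summarise a list of first names (one entry per child).

    Hypothetical helper: `threshold` plays the role of the
    'frequency > 100' cut-off mentioned on the slide.
    """
    counts = Counter(names)
    # Names that occur more often than the threshold.
    frequent = {name: c for name, c in counts.items() if c > threshold}
    return {
        "children": len(names),
        "different_names": len(counts),
        "unique_names": sum(1 for c in counts.values() if c == 1),
        "frequent_names": len(frequent),
        # Share of all children carrying one of the frequent names.
        "coverage": sum(frequent.values()) / len(names),
    }


# Toy data, invented for the example.
toy = ["Mark"] * 5 + ["Linda"] * 3 + ["Peter"] * 2 + ["Aafke", "Teun"]
summary = summarise_names(toy, threshold=2)
```

With the real data the same kind of summary would yield the number of different names, the number of names that occur only once, and the share of children covered by the frequent names.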
Data reduction needed
Far too many names to describe one by one
Names with common properties
– Not from an etymological point of view
– Not from a linguistic point of view
– Based on the choices of parents: name use!
Naming and social classes
Hypothesis:
There are social classes with their own naming preferences
These classes/subcultures may relate to
– culture/language (Frisian, Arabic, Turkish, Surinamese, Antillean,..)
– religion (Catholic, Protestant, Islam,..)
– sociological status (education, income,..)
– geography (urban, rural, regional,..)
Research aims:
Identification of social classes (and their naming preferences) on the basis of the first names of children per family
Study of the relation between these subcultures (first names) and socio-cultural and geographic factors
Method (a chain of names)
Parents choose first names, with higher probability, from a set that is popular in their subculture (relatives, friends, neighbours,..) [social group size is about 150]
This is informative only if there is more than one child (more than one name) in a family
Pairs of first names (from a family) as unit for analysis
Method (a chain of names)
Children in one family: Mark, Peter, Linda
If Mark is popular in a subculture, then Peter and Linda may be popular as well
Name pairs: Mark - Peter, Peter - Mark, Mark - Linda, Linda - Mark, Peter - Linda, Linda - Peter
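The six pairs listed here are all ordered pairs of children within one family. A minimal sketch of that step, with `name_pairs` as a hypothetical helper name:

```python
from itertools import permutations


def name_pairs(children):
    """Return all ordered pairs of children's first names in one family.

    Pairs are formed by position, so two children who happen to share a
    first name would also produce a pair of identical names.
    """
    return list(permutations(children, 2))


pairs = name_pairs(["Mark", "Peter", "Linda"])
```

A family with n children yields n × (n − 1) ordered pairs, which is why a family with a single child contributes nothing.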
Method (a chain of names)
Select all families with two or more children (1.17 million families, 2.81 million children)
Derive all pairs of first names (from a single family) (in all, 2.12 million different pairs)
Compute the frequency of each pair
The higher the frequency of a pair, the more likely the first names in the pair belong to the same set
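The selection and counting steps on this slide could look roughly as follows. This is a sketch under assumptions: families are given as lists of first names, and the function name `pair_frequencies` and the toy families are invented for the example.

```python
from collections import Counter
from itertools import permutations


def pair_frequencies(families):
    """Count how often each ordered name pair occurs across families."""
    counts = Counter()
    for children in families:
        # Only families with two or more children are informative.
        if len(children) < 2:
            continue
        counts.update(permutations(children, 2))
    return counts


# Toy families, invented for the example.
families = [
    ["Mark", "Peter", "Linda"],
    ["Mark", "Peter"],
    ["Iris"],
    ["Peter", "Linda"],
]
freq = pair_frequencies(families)
```

In this toy input the pair Mark - Peter occurs in two families and Mark - Linda in one, so the first pair is the stronger candidate for belonging to one name set.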
Most frequent name pairs
Frequency Pair of first names
1091 Johannes Maria
790 Johannes Johanna
754 Jeroen Martijn
727 Johanna Maria
….
572 Mohamed Fatima
459 Lars Niels
Clustering of first names
Define a measure that reflects the relationship between two names
Combine names that mutually have a strong relationship into a set
– Johannes, Maria, Johanna, …
Name relationship measure
Esther
– 7,967 girls
– 12,973 brothers and sisters
– 276 times sister Judith (= 2.1 %)
Judith
– 4,828 girls
– 8,033 brothers and sisters
– 276 times sister Esther (= 3.4 %)
Geometric average (2.7 %)
– A symmetric measure of relationship between the two names
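The arithmetic behind the 2.7 % can be checked with a few lines. The sketch below takes the counts from this slide; the function name `relation_measure` is an assumption made for illustration.

```python
from math import sqrt


def relation_measure(co_occurrences, siblings_a, siblings_b):
    """Geometric mean of the two sibling shares, as a percentage.

    co_occurrences: how often the two names occur as siblings
    siblings_a / siblings_b: total number of brothers and sisters of
    the children with the first / second name
    """
    share_a = co_occurrences / siblings_a
    share_b = co_occurrences / siblings_b
    return 100 * sqrt(share_a * share_b)


# Esther and Judith, with the counts from the slide.
esther_judith = relation_measure(276, 12973, 8033)
print(round(esther_judith, 1))  # 2.7
```

Because the geometric mean treats both shares alike, swapping the two names gives the same value, which is what makes the measure symmetric.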
Clustering of first names
Name pairs from a (subculture-related) set have the highest relation measure
Esther:
Judith 2.7
Mirjam 2.4
Ruben 1.2
David 1.1
Judith:
Esther 2.7
Mirjam 1.6
Ruben 1.0
Miriam 0.8
Clustering
Start with strongly related name-pairs
Add new name-pair to existing cluster or start a new cluster
Iterative procedure
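The slide gives only the outline of the procedure, so the following is one possible greedy reading of it, not the method actually used in the study. The function `cluster_names`, the `min_strength` cut-off, the choice not to merge existing clusters, and the Lars - Niels value are all assumptions; the other values are the relation measures shown for Esther and Judith.

```python
def cluster_names(pair_strengths, min_strength):
    """Greedy clustering sketch (hypothetical, for illustration only).

    pair_strengths: dict mapping (name_a, name_b) -> relation measure
    Returns a list of sets of names.
    """
    cluster_of = {}  # name -> index of its cluster
    clusters = []    # list of name sets
    # Visit the most strongly related pairs first.
    ranked = sorted(pair_strengths.items(), key=lambda kv: kv[1], reverse=True)
    for (a, b), strength in ranked:
        if strength < min_strength:
            break
        has_a, has_b = a in cluster_of, b in cluster_of
        if not has_a and not has_b:
            # Neither name is placed yet: start a new cluster.
            clusters.append({a, b})
            cluster_of[a] = cluster_of[b] = len(clusters) - 1
        elif has_a and not has_b:
            clusters[cluster_of[a]].add(b)
            cluster_of[b] = cluster_of[a]
        elif has_b and not has_a:
            clusters[cluster_of[b]].add(a)
            cluster_of[a] = cluster_of[b]
        # Both names already placed: left unchanged in this sketch.
    return clusters


strengths = {
    ("Esther", "Judith"): 2.7,
    ("Esther", "Mirjam"): 2.4,
    ("Judith", "Mirjam"): 1.6,
    ("Esther", "Ruben"): 1.2,
    ("Esther", "David"): 1.1,
    ("Judith", "Ruben"): 1.0,
    ("Lars", "Niels"): 3.0,  # invented value, not from the slides
}
name_sets = cluster_names(strengths, min_strength=1.0)
```

A real implementation would also need a rule for pairs whose names already sit in different clusters, for example merging the clusters when the link between them is strong enough.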
Clustering results
4,013 first names
– Frequency of a pair > 4
Result: 340 name sets
– Limited number of large sets
– High number of small sets
Top-25 of sets is most illustrative
– 2,887 first names
– 2.64 million children (75%)
Features of name sets
Period of maximum popularity (refine!)
– Traditional, Pre-modern (1950-1980), Modern
Language
– Dutch, Frisian, English, American, French, Spanish, Italian, [Arabic, Turkish]
– Common Western
Topic area
– Nature, History & Culture, Old Testament
Length
– Short (one syllable), long
A map of name sets
Presentation of a map of name sets
– Based on mutual relations between name sets
The closer two name sets are on the map, the more related they are
[Figure: map of name sets. Labels on the map:]
– Spanish & Italian
– Long American & English
– Short American & English
– Pre-modern English & French
– Long names from the Old Testament
– Names from nature
– Long names from history and culture
– Short modern Common Western
– Pre-modern Common Western
– Long French, Scandinavian
– Pre-modern Dutch
– Short modern Dutch
– Traditional Dutch (Latin | Dutch)
– Short traditional Dutch
– Frisian
Dimensions
[Figure: dimensions of the map of name sets]
– Length: Long, Short
– Period: Modern, Pre-modern, Traditional
– Language: Foreign, Common Western, Dutch, Frisian
[Figure: map of name sets, with an example name per set]
– Spanish & Italian: RICARDO
– Long American & English: MICHAEL
– Short American & English: KIM
– Pre-modern English & French: DENNIS
– Names from the Old Testament: DANIËL
– Names from nature: IRIS
– Names from history and culture: LAURENS
– Short modern Common Western: TIM
– Pre-modern Common Western: MARK
– French, Scandinavian: CHARLOTTE, NIELS
– Pre-modern Dutch: JEROEN
– Short modern Dutch: BART
– Traditional Dutch: JOHANNES (Latin) | JAN (Dutch)
– Short traditional Dutch: TEUN
– Frisian: JELLE
Geographical distribution
Four-digit postal code (pc) area level [3584]
– Big differences between pc areas
• city quarters
• villages (religion)
– Enough children for characterisation
• On average 1200 births per pc in 20 years
• Some further name grouping needed
Further grouping
Name group                                          %
Traditional names (Latin form)                      8
Traditional names (Dutch)                           5
Frisian names                                       3
Pre-modern names (Dutch, Western)                  12
Foreign names (English)                            24
Short modern names (Dutch, Western, Scandinavian)  13
Names from OT, history, culture, nature             7
Arabic & Turkish names [unrelated group]            5
Other [low frequency]                              23
[Figure: map of name sets with the further grouping marked. Group labels: Short; Pre-modern; Foreign; Traditional (Latin, Dutch); Frisian; History & Culture]
Traditional (Dutch)
Aaltje, Barend, Dirkje, Evert, Geertje, Harm, Jantje, Klaas, Margje, Teunis
Traditional (Latin form)
Adriana, Bernardus, Christina, Eduard, Elisabeth, Franciscus, Geertruida, Hubertus, Johanna, Krijn, Maria
Frisian names
Aafke, Bauke, Douwe, Froukje, Joppe, Jitske, Jelle, Menno, Sietske, Onno, Wietske, Wiebe
Pre-modern names (Dutch, Western)
Anniek, Anita, Carla, Frank, Jochem, Jeroen, Linda, Mark, Marloes, Paul, Suzanne
Foreign names (English)
Amanda, Dennis, Danny, Chantal, Henry, Isabella, Kim, Kevin, Melissa, Ricardo, Samantha, Stephen
Short names (modern, Dutch, Western, Scandinavian)
Anne, Bart, Eva, Gijs, Lisa, Kaj, Niels, Sanne, Sofie, Tim
Old Testament, history, culture, nature
Daniël, Esther, Judith, Naomi, Willemijn, Diederik, Frederieke, Maurits, Iris, Fleur, Jasmijn
Arabic and Turkish names
Fatima, Mohamed, Noura, Hamza, Sara, Yassin, Fatma, Mustafa, Hatice, Mehmet
Further geographical analysis
Per pc area: percentage of children per name group (8 values)
These percentages reflect social composition of the pc area
Factor analysis on data from 3584 pc areas: 10 typical profiles
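The input to the factor analysis is one row of percentages per postal code area. A minimal sketch of how such rows could be derived, assuming each child is given as a (postal code, first name) pair and a lookup table assigns names to groups; the function `area_profiles`, the postal codes, and the tiny lookup table are invented for the example, and the factor analysis itself is not shown:

```python
from collections import Counter, defaultdict


def area_profiles(children, group_of):
    """Percentage of children per name group, for each postal code area.

    children: iterable of (postal_code, first_name)
    group_of: dict mapping first name -> name group; names missing
              from the table are counted as "Other"
    """
    counts = defaultdict(Counter)
    for postal_code, name in children:
        counts[postal_code][group_of.get(name, "Other")] += 1
    profiles = {}
    for postal_code, group_counts in counts.items():
        total = sum(group_counts.values())
        profiles[postal_code] = {
            group: 100 * n / total for group, n in group_counts.items()
        }
    return profiles


# Hypothetical lookup table and records.
group_of = {
    "Maria": "Traditional (Latin form)",
    "Jelle": "Frisian",
    "Tim": "Short",
    "Kevin": "Foreign",
}
children = [
    ("1234", "Maria"), ("1234", "Maria"), ("1234", "Tim"), ("1234", "Kevin"),
    ("5678", "Jelle"), ("5678", "Xyz"),
]
profiles = area_profiles(children, group_of)
```

Each resulting profile is a small vector of percentages, and areas with similar vectors would end up in the same typical profile.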
10 profiles
Traditional – Latin form
Traditional – Dutch
Transitional, Traditional Dutch to pre-modern
Transitional, Traditional Latin form to foreign
Pre-modern
Foreign
Short
Elite
Arabic-Turkish
Frisian
Example profile
Traditional – Latin form
Traditional – Latin form
Traditional – Dutch
Frisian names
Pre-modern names
Foreign names
Short names
Names from OT, history, culture, nature
Arabic and Turkish names
Other
%3718
18
12660
12
Naming map of the Netherlands
[Figure: map. Legend: short; foreign; Frisian; traditional Latin; elite; Arabic-Turkish; pre-modern; traditional Dutch; > foreign]
Conclusions
Successful data reduction
Name groups & subcultures
– language, income, education, religion
Geographic representation
– four-digit postal code area just right
The factor time should be included
The Wegener connection
Direct marketing company
Organises a national consumer questionnaire twice a year
200,000 families per year
– Wide range of information
• Income, education level
– Includes first names and year of birth of all family members
Correlation at family level (instead of postal code level)
Name set &
– Income of parents
– Educational level (of both parents)
– Preferences of parents (newspapers, underwear, cars, insurance, holidays,…..)
Mathematical studies
Life cycle of a name
Zipf’s behavior
– A few names with high frequency, a lot of names that are unique
Information function of a name in communication
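Zipf-like behaviour means that the frequency of a name falls off roughly as a power of its rank, so a rank-frequency plot on log-log axes is close to a straight line. A minimal sketch of how this could be inspected; the helpers `rank_frequency` and `loglog_slope` and the toy data are invented for illustration, and the least-squares fit is only one simple way to estimate the slope:

```python
from collections import Counter
from math import log


def rank_frequency(names):
    """Return (rank, frequency) pairs, most frequent name first."""
    counts = sorted(Counter(names).values(), reverse=True)
    return list(enumerate(counts, start=1))


def loglog_slope(rank_freq):
    """Least-squares slope of log(frequency) against log(rank).

    Needs at least two ranks.
    """
    xs = [log(rank) for rank, _ in rank_freq]
    ys = [log(freq) for _, freq in rank_freq]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Toy data whose frequencies are exactly 12 / rank.
toy_names = ["A"] * 12 + ["B"] * 6 + ["C"] * 4 + ["D"] * 3
table = rank_frequency(toy_names)
slope = loglog_slope(table)
```

For the toy data the slope is -1 up to rounding; how closely real first-name data follow such a line is the empirical question raised on this slide.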
Research dimensions in onomastics
– Name
– Form and spelling
– Origin
– Motives
– Time
– Place
YES, we can do great research on this with the full population data!
Contact
Book: Over voornamen, Het Spectrum (2004)
E-mail: [email protected]
Homepage: www.let.uu.nl/~Gerrit.Bloothooft/personal
Mail: Trans 10, 3512 JK Utrecht, The Netherlands