African language families and their structural properties
Sonja Bosch
Department of African Languages
University of South Africa
Workshop: Language Technologies for African Languages, EACL, Athens, Greece. March 31 2009.
Goal of the workshop
• to provide a forum to meet and share the latest developments in the field of language technologies for African languages;
• to attract linguists who specialise in African languages and who would like to leverage the tools and approaches of computational linguistics;
• to attract computational linguists who are interested in learning about the particular linguistic challenges posed by African languages.
Overview of tutorial
• Introduction
  - Video: The History of Mankind
• Four African Phyla
  - History of division
  - Main characteristics
• Africanisms
• Discussion/Interaction
Introduction
Aim of tutorial:
• to give an overview of the complex language situation on the African continent, not only regarding the vast variety of languages spoken, but also regarding the classifications of languages;
• to create an awareness of a few of the characteristic structural properties of some of the African languages;
• to hopefully inspire as many researchers as possible to get involved with language technology research in African languages.
Video: The History of Mankind
Journey of Mankind - The Peopling of the World
• Who were our ancestors? From where did we originate?
• Interaction of migration and climate over the last 160 000 years.
• “We are the descendants of a few small groups of tropical Africans who united in the face of adversity, not only to the point of survival but to the development of a sophisticated social interaction and culture expressed through many forms.”
• Stephen Oppenheimer - tracked routes and timing of migration, placing it in context with ancient rock art around the world.
http://www.bradshawfoundation.com/journey
Distribution of languages by area of origin
http://www.ethnologue.com/ethno_docs/distribution.asp?by=area#1
Area        Living languages    Number of speakers
            Count      %        Count              %        Mean         Median
Africa      2,092     30.3        675,887,158     11.8        323,082     25,391
Americas    1,002     14.5         47,559,381      0.8         47,464      2,000
Asia        2,269     32.8      3,489,897,147     61.0      1,538,077     10,171
Europe        239      3.5      1,504,393,183     26.3      6,294,532    220,000
Pacific     1,310     19.0          6,124,341      0.1          4,675        800
Totals      6,912    100.0      5,723,861,210    100.0        828,105      7,000
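The Mean column in the distribution table can be checked with simple arithmetic. The sketch below is illustrative only: it assumes that the mean is the number of speakers divided by the number of living languages, rounded to the nearest whole number, and the names AREAS and mean_speakers are invented for this example. The figures are those from the Ethnologue table above.

```python
# Figures from the distribution table: area -> (living languages, speakers).
AREAS = {
    "Africa": (2092, 675_887_158),
    "Americas": (1002, 47_559_381),
    "Asia": (2269, 3_489_897_147),
    "Europe": (239, 1_504_393_183),
    "Pacific": (1310, 6_124_341),
}


def mean_speakers(languages, speakers):
    """Mean speakers per living language, rounded to a whole number (assumed definition)."""
    return round(speakers / languages)


# Per-area means, plus the world totals recomputed from the five rows.
means = {area: mean_speakers(*counts) for area, counts in AREAS.items()}
total_languages = sum(languages for languages, _ in AREAS.values())
total_speakers = sum(speakers for _, speakers in AREAS.values())

for area, value in means.items():
    print(f"{area:<10}{value:>12,}")
print(f"{'Totals':<10}{mean_speakers(total_languages, total_speakers):>12,}")
```

The Median column cannot be recomputed this way, because it depends on the speaker count of every individual language rather than on the area totals.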
African Language Families
http://encarta.msn.com/media_461520382_761565449_-1_1/African_Language_Families.html
Main research constraints shared by all African languages
• Limited number of researchers
• Large number of languages involved
• Poor documentation for most languages
• Long-standing interaction between adjacent languages
• Disappearance of some languages in second half of 20th century
(cf. Heine & Nurse, 2000:5)
Documentation for African languages
• Quality and quantity – fairly high to nil
• Reasonably accurate and comprehensive reference grammar – fewer than 100 African languages
• Majority – inadequate grammar, analysis of part of language, article or two
• Some – word list, or even less
(cf. Heine & Nurse, 2000:5)
Four African Phyla
• Greenberg in The Languages of Africa (1963) - traced the historical origin and development of African languages, and classified them into four major groups:
• Niger-Congo - 300 million to 400 million speakers (1,436 languages)
• Afro-Asiatic - 200 million to 300 million speakers (371 languages)
• Nilo-Saharan - 30 million speakers (approx.) (196 languages)
• Khoisan - 200,000 to 300,000 speakers (35 languages)
Classification of four African phyla
• Khoisan – language phylum or collection of languages?
• Afro-Asiatic – most widely recognised phylum, longest history of research, largest number of researchers
• Nilo-Saharan – proposed by Greenberg 50 years ago
• Niger-Congo – recognised approx. in same format since 19th century
Niger-Congo
• Kordofanian languages: southern Sudan (Nuba Hills).
• Mande: West Africa; incl. Bambara (Mali), Soninke (Mali, Senegal and Mauritania).
• Atlantic-Congo:
  – Atlantic: incl. Wolof (Senegal) and Fulfulde (across West Central Africa). NB: the validity of Atlantic as a genetic grouping is controversial.
  – Ijoid (Nigeria), incl. Ijo and Defaka.
  – Dogon (Mali).
  – Volta-Congo:
    • Senufo (Côte d'Ivoire and Mali), incl. Senari and Supyire.
    • Gur (Côte d'Ivoire, Togo, Burkina Faso and Mali), incl. Dagbani (Northern Ghana).
    • Adamawa-Ubangi: incl. Sango (Central African Republic).
    • Kru (West Africa), incl. Bété, Nyabwa, and Dida.
    • Kwa: includes Akan (Ghana) and the Gbe languages (Ghana, Togo, Benin, and Nigeria), of which Ewe is best known.
    • Benue-Congo, incl. among others:
      – Bantu: a very large group, incl. Swahili (Kiswahili) and Zulu (isiZulu).
      – Yoruba and Igbo (Nigeria).
Niger-Congo structures
• Tonal languages
  e.g. Zulu: ínyàngá (HLH) “moon/month”
             ínyàngà (HLL) “medicine man”
• Noun class system
  singular/plural by means of affixes
  e.g. Zulu: umuntu/abantu “person/persons”
  concordial agreement
  e.g. Zulu: abantu abaningi bayasebenza “Many people work”
• Verb suffixes
  modification of meaning of the verb
  e.g. Zulu: -pheka “cook”, -phekela “cook for”, -phekwa “cooked by”, -phekisa “let cook”
• Word order
  SVO widespread, but SOV found in Mande, Ijoid and Dogon.
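For readers coming from computational linguistics, the verb suffix examples above can be sketched as a very small generator. This is a toy illustration, not a morphological analyser: it assumes that an extension is simply inserted between the verb root and the final vowel -a, as in the four -pheka forms, and it ignores the sound changes that other roots may undergo. The names EXTENSIONS and extend are invented for this sketch; the labels follow the "pass, caus, appl." abbreviations used in the Africanisms table.

```python
# Extension suffixes read off the -pheka examples on the slide (toy data only).
EXTENSIONS = {
    "applicative": "el",  # -phekela "cook for"
    "passive": "w",       # -phekwa "cooked by"
    "causative": "is",    # -phekisa "let cook"
}


def extend(stem, extension):
    """Insert an extension between the verb root and the final vowel -a."""
    if not stem.endswith("a"):
        raise ValueError("expected a verb stem ending in -a")
    root = stem[:-1]  # "pheka" -> "phek"
    return root + EXTENSIONS[extension] + "a"


for name in EXTENSIONS:
    print(f"-pheka + {name}: -{extend('pheka', name)}")
```

A real system would need per-root phonological rules and a lexicon, which is exactly the kind of language resource the tutorial argues is still scarce.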
Nilo-Saharan
• Komuz languages.
• Saharan languages (incl. Kanuri (Niger, Nigeria)).
• Songhay languages (Mali, Niger).
• Fur languages (incl. Fur).
• Maban languages.
• (Chari-Nile languages - later rejected, placing the four branches below on equal footing with those above).
  – Central Sudanic languages (CAR, Chad, DRC).
  – Kunama language.
  – Berta language.
  – Eastern Sudanic languages, incl. Nubian (Sudan, S Egypt) and Nilotic languages (incl. Dinka (Sudan), Luo & Maasai (Kenya, Tanzania)).
Nilo-Saharan structures
• Tonal languages
• Verb prefixation and suffixation – no class agreement
• Case-marking on nouns, e.g. dative and locative (to indicate grammatical relations and semantic functions)
• Simplified noun class systems
• Word order – SOV most common
Afro-Asiatic
• *Chadic (Nigeria, Chad, Cameroon, Central African Republic and Niger. Hausa, its principal language).
• *Berber (dominant language Tamarshak / Tamasheq).
• *Semitic (incl. Amharic and Tigrinya).
• Cushitic (principal languages incl. Beja (Sudan and Eritrea) and Oromo (Ethiopia)).
• *Egyptian (4,500 years of written records, not spoken for 600 years. Its final phase, Coptic, is the liturgical language of the Coptic Church).
• Omotic (Omo plateau in Ethiopia. North and South Omotic subfamilies).
* General agreement that these major branches are clear-cut entities.
Afro-Asiatic structures
• Tonal languages - appear in the Omotic, Chadic, and South and East Cushitic branches of Afro-Asiatic.
• Two-gender system in singular, with the feminine marked by –(a)t,
  e.g. Amharic: sew “man”, set “woman”; ligu “boy”, ligitu “girl”.
• Emphatic consonants, variously realised as glottalized, pharyngealised or implosive – changes meaning of word.
• Word order - VSO with SVO tendencies.
Khoisan
(1) Non-Khoe
  – Ju (Northern): (!O)!Xũũ, ||X’au||’e, Ju|’hoan (DC)
  – !Ui-Taa (Southern)
    • (1.2.1) !Ui
    • (1.2.2) Taa
  – ‡Hõã (200 speakers, Botswana. Moribund.)
(2) Khoe (Central)
  – Khoekhoe
    • (2.1.1) North: Nama/Damara, Hai//om, ‡Aakhoe (DC)
    • (2.1.2) South: †!Ora; Cape Khoekhoe varieties
  – Kalahari Khoe
    • (2.2.1) West: Kxoe, Buga, ||Ani (DC), Naro (DC), G||ana, G|ui, ‡Haba (DC)
    • (2.2.2) East: Shua, Ts’ixa, Danisi, |Xaise, †Deti, Kua-Tsua (DC)
(3) Sandawe: 40,000 speakers in Tanzania (some indication that Sandawe may be related to the Khoe-Kwadi family, but the relationship remains speculative).
(4) Kwadi: †Kwadi (extinct, Angola)
(5) Hadza: Hadza (200-800 speakers in Tanzania), isolate
(DC = dialect cluster; † = presumably extinct)
Khoisan structures
• Sound system – unique and complex, click sounds
Non-Khoe
• SVO – different from other Khoisan languages
• Syntactic context determines meaning of a stem
Khoe
• Rich morphology – inflection, derivation (suffixes)
• Nouns – person, gender, number suffix
• Grammatical agreement (adj, poss, dem, numerals, interrogatives)
Sandawe
• Two genders (masc & fem), and two numbers (sing. & plural)
• SOV
Kwadi (hardly any linguistic information)
• SOV, frequent stem reduplication
Hadza
• VSO, two genders (masc & fem)
Africanisms: special features of African languages
*Heine, B. & Nurse, D. 2008. A Linguistic Geography of Africa. Cambridge [England]; New York: Cambridge University Press.
• Quantitative survey of African languages of all major genetic groupings (99 languages: 55 Niger-Congo, 23 Afro-Asiatic, 15 Nilo-Saharan, 6 Khoisan) and major regions.
• Properties chosen that are claimed by researchers to be wide spread in Africa but not elsewhere.
• Catalogue drawn up of phonological, morphosyntactic and semantic properties that can help to define African languages.
• Africa has average of 6.8 of 11 properties.
• Outside Africa no language has more than 5 of the properties.
• Sub-Saharan Africa stands out typologically with an average of 7.2 properties.
*RELATIVE FREQUENCY OF OCCURRENCE OF 11 TYPOLOGICAL PROPERTIES IN AFRICAN LANGUAGES
Property used as criterion                                              No. of languages with that property (from total of 99)
1. Labial-velar stops                                                   39
2. Implosive stops                                                      36
3. Lexical or grammatical tones                                         80
4. ATR-based vowel harmony                                              39
5. Verbal derivational suffixes (pass, caus, appl. etc.)                76
6. Nominal modifiers follow noun                                        89
7. Semantic polysemy ‘drink/pull/smoke’                                 74
8. Semantic polysemy ‘hear/see/understand’                              72
9. Semantic polysemy ‘animal/meat’                                      40
10. Comparative construction [X is big/defeats/surpasses/passes Y]      82
11. Noun “child” used productively to express diminutive meaning        50
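The per-property counts in this table can be tied back to the average quoted on the previous slide. If each of the 99 languages is counted once for every property it has, the sum of the eleven counts divided by 99 gives the mean number of properties per language. A minimal sketch, with short illustrative labels and variable names that are not part of the source:

```python
# Counts from the table above: property -> number of languages (out of 99).
PROPERTY_COUNTS = {
    "labial-velar stops": 39,
    "implosive stops": 36,
    "lexical or grammatical tones": 80,
    "ATR-based vowel harmony": 39,
    "verbal derivational suffixes": 76,
    "nominal modifiers follow noun": 89,
    "polysemy drink/pull/smoke": 74,
    "polysemy hear/see/understand": 72,
    "polysemy animal/meat": 40,
    "comparative construction": 82,
    "'child' as diminutive": 50,
}
SAMPLE_SIZE = 99

# Share of the sample showing each property, as a percentage.
shares = {name: 100 * count / SAMPLE_SIZE for name, count in PROPERTY_COUNTS.items()}

# Mean number of properties per language across the whole sample.
average = sum(PROPERTY_COUNTS.values()) / SAMPLE_SIZE

for name, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{name:<32}{share:5.1f}%")
print(f"average properties per language: {average:.1f}")
```

The counts sum to 677, so the average comes out at roughly 6.8, in line with the figure given for Africa as a whole.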
Conclusion
• Language situation in Africa – language technology has important role to play.
• Africa lagging far behind.
• Pockets of expertise emerging.
• Language resources – crucial building blocks for developing language technologies.
Recommended Reading
Greenberg, J. H. The languages of Africa. Bloomington: Indiana University, 1963.
Heine, B. & Nurse, D. 2000. African languages : an introduction. Cambridge [England]; New York: Cambridge University Press.
Heine, B. & Nurse, D. 2008. A Linguistic Geography of Africa. Cambridge [England]; New York: Cambridge University Press.
• Siyabonga!
• Ke a leboga!
• Nkosi!
• Ndi a livhuwa!
• Thank you!
References
• African language families. 2008. [O] Available: http://encarta.msn.com/media_461520382_761565449_-1_1/African_Language_Families.html. Accessed on 30 March 2009.
• Aikhenvald, Alexandra Y. / R. M. W. Dixon. 2001. Areal Diffusion and Genetic Inheritance: Problems in Comparative Linguistics. Oxford: Oxford University Press.
• Alexandre, P. 1972. An Introduction to languages and Language in Africa. London, Ibadan, Nairobi: Heinemann.
• Ethnologue. 2005. [O] Available: http://www.ethnologue.com/ethno_docs/distribution.asp?by=area#1 Accessed on 30 March 2009.
• Gourt the home of all knowledge. Sa. [O] Available: http://articles.gourt.com/en/language%20family Accessed on 30 March 2009.
• Greenberg, J. H. The languages of Africa. Bloomington: Indiana University, 1963.
• Heine, B. & Nurse, D. 2000. African languages : an introduction. Cambridge [England]; New York: Cambridge University Press.
• Heine, B. & Nurse, D. 2008. A Linguistic Geography of Africa. Cambridge [England]; New York: Cambridge University Press.
• Oppenheimer, S. 2003. Journey of Mankind. [O] Available: http://www.bradshawfoundation.com/journey. Accessed on 30 March 2009.
Language endangerment in Africa
• Brenzinger, M. Language death: factual and theoretical explorations with special reference to East Africa. Berlin: Mouton de Gruyter, 1992.
• Brenzinger, M. Endangered languages in Africa. Köln: Köppe, 1998.
• Mous, M. “Loss of linguistic diversity in Africa”. In: Janse, M. and S. Tol (eds.). Language Death and Language Maintenance. Amsterdam: John Benjamins, 2003.
• Sommer, G. “A survey on language death in Africa”. In: M. Brenzinger (ed.). Language death: factual and theoretical explorations with special reference to East Africa. Berlin/New York: Mouton de Gruyter, 1992, 301-407.
• Wurm, S. (ed.). Atlas of the World’s Languages in Danger of Disappearing. Paris: UNESCO Publishing, 2001.