Lecture 2: Morphology (staff.um.edu.mt/mros1/csa3202/pdf/morphology.pdf)
What is Morphology
Morphological Function
Morphological Processes
Morphotactics: the order of morphemes
Orthography versus Phonology
Lecture 2: Morphology
CSA3202 Human Language Technology
Mike Rosner, Dept ICS
October 2011
CSA3202 Human Language Technology Lecture 2: Morphology 1/ 26
Acknowledgements
Richard Sproat, Morphology and Computation, MIT Press, ISBN 0-262-19314-0 (1992)
Jurafsky and Martin, Ch. 5 (new edition), Ch. 8 (old edition)
Outline
1 What is Morphology
2 Morphological Function
3 Morphological Processes
4 Morphotactics: the order of morphemes
5 Orthography versus Phonology
What is Morphology?
There are several areas of linguistics which study the structure of words:
    Phonology
    Orthography
    Morphology
These vary according to your definition of a word:
    A word is a sequence of phonemes (phonology)
    A word is a sequence of graphemes (orthography)
    A word is a sequence of morphemes (morphology)
Rules govern the structures under each of these definitions.
What is a Morpheme?
Definition: smallest linguistic unit that has an independent meaning or grammatical function
    free morpheme: morphemes that can stand alone as words, e.g. clock, sick
    bound morpheme: morphemes that always attach to other morphemes, never existing as words themselves, e.g. -ly, non-
Words: Parts of Speech
Words are traditionally classified into categories, known as parts of speech (POS) or word classes.
The major parts of speech are noun, verb, adjective, adverb, pronoun, preposition, conjunction, article etc.
Subcategorization of some major POS, e.g. nouns:
    proper noun
    common noun
Open classes and closed classes
Content words vs. function words
POS can shed light on the context in which a word can occur, its neighbours and even its pronunciation:
    “òbject” (noun) vs “objèct” (verb)
Many words are ambiguous: bank
Trouble ahead for some tasks:
    POS tagging
    Word sense disambiguation
Two Different Kinds of Morphology
Morphology can be broadly divided into two different classes:
Inflectional Morphology
    Input: a word
    Output: a different form of the same word
    Example: wasal → waslu
Derivational Morphology
    Input: a word
    Output: a different word that is derived from the input word
    Example: important → unimportant
Issue: what is meant by a different word?
Differences between Inflectional and Derivational Morphology
Inflectional Morphology
    Does not change part of speech.
    Particular inflections may be required in particular syntactic contexts such as subject and object positions: “I like her” versus *“me like she”.
    Productive: tends to apply across the board, especially to new words, e.g. “I was googling all day”.
    Predictable semantics: +s applies to most nouns and almost always means plural.
Derivational Morphology
    Can change part of speech.
    Syntactic context never requires a particular derivation.
    Not very productive: “dislike” but *“dishate”.
    Unpredictable semantic effect: “business” versus “happiness”.
Morphological Function
What forms?
What information?
The kind of information depends to some extent on the part of speech concerned:
    Verb
    Noun
    Adjective
and also on whether we are talking about derivational or inflectional morphology.
Verb Forms (Inflectional)
English
    talk
    talks
    talking
    talked
Italian
    parlare
    parlo
    parli
    parla
    parliamo
    parlate
    parlano
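The Italian forms of parlare follow a regular pattern: strip the infinitive ending -are and add a person/number ending. A minimal Python sketch, assuming a regular first-conjugation verb in the present tense; the function name `conjugate_are` and the layout of the ending table are illustrative choices, not part of the lecture.

```python
# Present-tense endings for regular Italian -are verbs.
# Keys are (person, number); the table layout is an illustrative assumption.
ARE_ENDINGS = {
    ("1", "sg"): "o",
    ("2", "sg"): "i",
    ("3", "sg"): "a",
    ("1", "pl"): "iamo",
    ("2", "pl"): "ate",
    ("3", "pl"): "ano",
}


def conjugate_are(infinitive):
    """Return the present-tense paradigm of a regular -are verb."""
    if not infinitive.endswith("are"):
        raise ValueError("expected an infinitive ending in -are")
    stem = infinitive[:-3]  # parlare -> parl
    return {key: stem + ending for key, ending in ARE_ENDINGS.items()}


for (person, number), form in conjugate_are("parlare").items():
    print(person, number, form)
```

Irregular verbs and stems that need spelling adjustments are outside the scope of this sketch.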
Verb Information
person
number
gender
In many languages the verb must agree with one or more of its arguments on such information,
e.g. subject-verb agreement in French:
    Les filles sont arrivées (the girls arrived).
Example
    arriv + é + e + s
    arrive + past + fem + plur
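The two-line analysis of arrivées can be reproduced mechanically: one line is the segmented surface form, the other maps each morpheme to its gloss. A small Python sketch, assuming a hand-written gloss table that covers only this example (`GLOSSES`, `gloss` and `surface` are hypothetical names):

```python
# Gloss table for the single French example; not a general lexicon.
GLOSSES = {"arriv": "arrive", "é": "past", "e": "fem", "s": "plur"}


def morphemes(segmented):
    """Split a string such as 'arriv + é + e + s' into its morphemes."""
    return [m.strip() for m in segmented.split("+")]


def gloss(segmented):
    """Map each morpheme to its gloss, keeping the '+' notation."""
    return " + ".join(GLOSSES[m] for m in morphemes(segmented))


def surface(segmented):
    """Concatenate the morphemes to recover the written word."""
    return "".join(morphemes(segmented))


analysis = "arriv + é + e + s"
print(surface(analysis))
print(gloss(analysis))
```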
Verb Information: Tense and Aspect (Inflectional)
Tense: the time at which the situation denoted by the verb occurred
    past
    present
    future
Aspect: the state of completion of the action denoted. This is the distinction we observe in
    I waited for three hours (perfective aspect)
    I am waiting for the bus (imperfective aspect)
Languages Differ
Tense and aspect are marked in different ways by different languages:
    Italian: parlerò
    English: I will speak
    Maltese: se nkellem
    Chinese: ta shui+le = he sleep + PERFECTIVE
Verb Information: Voice (Inflectional and Derivational)
Voice: Active vs. Passive
    Active: John brings the book down
    Passive: The book is brought down by John
In English active/passive is marked syntactically, i.e. at sentence level.
In many other languages it is marked morphologically, i.e. at word level.
Maltese example derived from niżel (to descend):
    niżel (he descended) → niżżel (he brought down)
    niżżel (he brought down) → tniżżel (be brought down)
Noun and Pronoun Information
Inflectional categories for nouns and pronouns include
Number (singular, plural, dual)
    fortizz+a, fortizz+i, għajn+ejn
Case (marks different relationships to the verb)
    Nominative (subject), accusative (object), as seen in
        English pronouns: he, him
        Latin nouns: tabul+a, tabul+a+m
    German has four cases: nominative, genitive, dative, accusative, but for the most part only genitive is marked on the noun.
    Latin has six
    Finnish has fourteen!
Gender (feminine, masculine, neuter)
    In some languages, such as the Bantu languages, more detailed gender classes exist.
Adjectives
Some languages express number, gender, case morphologically.
    French: bon, bons, bonne, bonnes (good).
Many languages express comparison of adjectives morphologically.
English
    Hard (unmarked stem)
    Harder (comparative, +er)
    Hardest (superlative, +est)
    Sometimes the e is omitted.
In English comparison is sometimes expressed with syntax:
    “more different” not “differenter”
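The +er / +est pattern, including the case where the e is omitted, can be written as a tiny rule. The sketch below assumes that the e is dropped exactly when the stem already ends in e (large → larger); it deliberately ignores other spelling changes such as consonant doubling, and it knows nothing about adjectives that take the syntactic form with "more". The function name `comparison_forms` is hypothetical.

```python
def comparison_forms(adjective):
    """Return (comparative, superlative) for a regular English adjective.

    Assumption: the suffix e is omitted only when the stem ends in e.
    """
    if adjective.endswith("e"):
        return adjective + "r", adjective + "st"
    return adjective + "er", adjective + "est"


print(comparison_forms("hard"))
print(comparison_forms("large"))
```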
Derivational Morphology
Recall that derivations are typically category-changing.
Typical examples involving nouns, adjectives and verbs:
Nominalization (verb → noun)
    destroy → destruc+tion
    catch → catch+er
Deverbal adjectives (verb → adjective)
    The English suffix -able attaches to transitive verbs x and means able to be x’ed
    Example: drink → drink+able
Denominal adjectives (noun → adjective)
    The English suffix -less attaches to nouns x and means something not possessing x
    Example: brain → brain+less
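Each derivational suffix here can be viewed as a rule with an input category and an output category. A minimal Python sketch of that idea, assuming a toy lexicon and suffix table invented for illustration (`LEXICON`, `SUFFIXES` and `derive` are hypothetical names); transitivity, which -able also requires, is not modelled.

```python
# Toy lexicon: word -> part of speech (illustrative entries only).
LEXICON = {"drink": "verb", "catch": "verb", "brain": "noun"}

# suffix -> (part of speech it attaches to, part of speech it yields)
SUFFIXES = {
    "able": ("verb", "adjective"),
    "er": ("verb", "noun"),
    "less": ("noun", "adjective"),
}


def derive(word, suffix):
    """Attach a derivational suffix if the category constraint is met."""
    required, result = SUFFIXES[suffix]
    if LEXICON.get(word) != required:
        raise ValueError(f"-{suffix} attaches to a {required}, not to '{word}'")
    return word + suffix, result


print(derive("drink", "able"))
print(derive("catch", "er"))
print(derive("brain", "less"))
```

Calling `derive("brain", "able")` raises a `ValueError`, because -able is listed as attaching to verbs only.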
Morphological Processes
Linear Concatenation, where a morphologically complex word can be analyzed as a series of morphemes concatenated together, as with prefixes and suffixes: en + large + ment.
Non-Linear Concatenation
    infix:
        Bontoc (Philippines)
        fikas strong; f+um+ikas be strong
        kilad red; k+um+ilad be red
    circumfix:
        German ge + stem + t, e.g. sagen, gesagt
        Maltese negative: ma + naf + x
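Infixation and circumfixation differ from plain prefixing and suffixing in where the affix material lands. The Python sketch below assumes that the Bontoc infix -um- is inserted after the initial consonant(s) of the stem, giving fumikas and kumilad, and that a regular German participle is ge + stem + t with the stem obtained by removing -en from the infinitive. The function names and the vowel inventory are illustrative assumptions.

```python
VOWELS = "aeiou"  # simplified vowel inventory, assumed for illustration


def infix_um(stem):
    """Insert -um- before the first vowel, i.e. after the initial consonant(s)."""
    for i, ch in enumerate(stem):
        if ch in VOWELS:
            return stem[:i] + "um" + stem[i:]
    raise ValueError("stem contains no vowel")


def german_participle(infinitive):
    """Circumfix ge- ... -t around the stem of a regular verb in -en."""
    if not infinitive.endswith("en"):
        raise ValueError("expected an infinitive ending in -en")
    stem = infinitive[:-2]  # sagen -> sag
    return "ge" + stem + "t"


print(infix_um("fikas"))
print(infix_um("kilad"))
print(german_participle("sagen"))
```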
Morphological Processes
Reduplication, e.g. in Indonesian
    orang (man) → orang+orang (men)
Note that although reduplication is concatenative, it is context-dependent: what is inserted depends on what comes before.
Vowel change
    swim/swam
Consonant change
    send/sent
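The context-dependence of reduplication is easy to see in code: unlike a fixed suffix, the added material is computed from the stem itself. A minimal sketch of full reduplication, keeping the "+" boundary notation used for morphemes here (the `boundary` parameter is an illustrative addition):

```python
def reduplicate(stem, boundary="+"):
    """Full reduplication: the added material is a copy of the stem."""
    return stem + boundary + stem


print(reduplicate("orang"))
```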
Morphological Processes
Interdigitation: a basic characteristic of Semitic languages (Maltese, Arabic, Hebrew, Akkadian, Syriac...)
    Input: radicals + vocalism
    Output: stem
    Example: k t b + i e → kiteb
Interdigitation is an example of a non-concatenative operation.
The output stem is then used as a basis for further morphological operations:
    n + kiteb + u → niktbu
Note that the end result is not nkitebu.
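Interdigitation can be sketched as interleaving two tiers: the consonantal root and the vowel pattern. The Python function below assumes the simple template C V C V C used in the kiteb example, with one fewer vowel than radicals; `interdigitate` is a hypothetical name, and other templates would need a more general mechanism.

```python
def interdigitate(radicals, vocalism):
    """Interleave root consonants with vowels: C1 V1 C2 V2 C3."""
    if len(radicals) != len(vocalism) + 1:
        raise ValueError("expected one more radical than vowels")
    pieces = []
    for i, consonant in enumerate(radicals):
        pieces.append(consonant)
        if i < len(vocalism):
            pieces.append(vocalism[i])
    return "".join(pieces)


stem = interdigitate("ktb", "ie")
print(stem)

# Naive concatenation of further affixes gives the unattested form,
# which is why later operations cannot be pure string concatenation.
print("n" + stem + "u")
```

The second printed form is the one ruled out above: the attested word is niktbu, so the stem itself is reshaped when affixes are added.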
Compounding
In contrast to derivations and inflections, where affixes are attached to a stem, in compounding two or more lexemes are joined together.
Both lexemes might undergo modification in the process.
In German, the concatenation is expressed in the orthography:
Example
    lebensversicherungsgesellschaftsangestellter (life insurance company employee)
    leben s versicherung s gesellschaft s angestellter
    life insurance company employee
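Because German writes compounds as a single string, an analyser has to recover the lexeme boundaries. The sketch below is a simple recursive splitter, assuming a four-word lexicon taken from the example and a single optional linking element "s" between lexemes; `LEXEMES`, `LINKER` and `split_compound` are hypothetical names, and a realistic splitter would need a full lexicon and more linking elements.

```python
# Tiny lexicon containing only the lexemes of the example compound.
LEXEMES = {"leben", "versicherung", "gesellschaft", "angestellter"}
LINKER = "s"  # linking element that may appear between lexemes


def split_compound(word):
    """Return the list of lexemes in word, or None if it cannot be split."""
    if word in LEXEMES:
        return [word]
    for i in range(1, len(word)):
        head, rest = word[:i], word[i:]
        if head not in LEXEMES:
            continue
        # Try skipping a linking element first, then try without one.
        candidates = [rest]
        if rest.startswith(LINKER):
            candidates.insert(0, rest[len(LINKER):])
        for candidate in candidates:
            tail = split_compound(candidate)
            if tail is not None:
                return [head] + tail
    return None


print(split_compound("lebensversicherungsgesellschaftsangestellter"))
```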
Morphotactics
Morphotactics investigates the constraints imposed on the order in which morphemes are combined.
Various kinds of such constraints are known.
Constraints on the type of affix
    “un” is a prefix
    “tion” is a suffix
Syntactic constraints
    the suffix -able applies to verbs to yield an adjective
Other constraints:
    in English, “Latin” affixes are attached before “native” ones:
        non+im+partial, non+il+legible
        *in+non+partial, *in+non+legible
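An ordering constraint of this kind can be checked by assigning each prefix to a class and scanning the prefix sequence. The Python sketch below follows the classification used in the examples (in-, im-, il- as "Latin", non- as "native"); the table `PREFIX_CLASS` and the function `order_ok` are illustrative assumptions, and prefixes are listed from the outermost inwards.

```python
# Prefix classes following the examples; not a full inventory.
PREFIX_CLASS = {"in": "latin", "im": "latin", "il": "latin", "non": "native"}


def order_ok(prefixes):
    """Check that Latin prefixes attach before (inside) native ones.

    prefixes is ordered from the outermost prefix to the one next to the stem.
    """
    seen_latin = False
    for prefix in prefixes:
        if PREFIX_CLASS[prefix] == "latin":
            seen_latin = True
        elif seen_latin:
            # A native prefix sits closer to the stem than a Latin one.
            return False
    return True


print(order_ok(["non", "im"]))  # non+im+partial
print(order_ok(["in", "non"]))  # in+non+partial
```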
Why is Morphology Useful?
Almost all natural language applications require some processing of words:
    Dictionary tools
    Information Retrieval
    Spellchecking
    Machine Translation
When words are written, there is often a mismatch between what appears on the page and what appears in the dictionary.
The severity of this problem depends on the language.
Morphological processing helps to bridge the gap.
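The mismatch between page and dictionary can be illustrated with a very small lookup routine: if the surface form is not a dictionary entry, strip a known inflectional suffix and try again. This is only a sketch under strong assumptions: a three-word dictionary, three English suffixes, and no handling of spelling changes or of non-concatenative forms such as swam (`DICTIONARY`, `SUFFIXES` and `lookup` are hypothetical names).

```python
# Toy dictionary of citation forms (illustrative entries only).
DICTIONARY = {"talk", "clock", "wait"}

# Inflectional suffixes to try, in order.
SUFFIXES = ["ing", "ed", "s"]


def lookup(surface):
    """Return the dictionary entry for a surface form, or None if not found."""
    if surface in DICTIONARY:
        return surface
    for suffix in SUFFIXES:
        if surface.endswith(suffix):
            stem = surface[: -len(suffix)]
            if stem in DICTIONARY:
                return stem
    return None


for form in ["talk", "talks", "talking", "talked", "swam"]:
    print(form, "->", lookup(form))
```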
Summary
Morphological structure conveys important information of different types.
Morphological structure is manifest in different ways.
Morphological structure is governed by rules.
Morphological analysis attempts to discover that structure.