Finnish OT Prosody

Lauri Karttunen, CLS-41, April 7, 2005


Page 1: Finnish OT Prosody

Finnish OT Prosody

Lauri Karttunen
CLS-41
April 7, 2005

Page 2: Finnish OT Prosody

Overview

Success of Finite-State Morphology
  Lexical transducers
  Two ways of describing morphological alternations
    Sequential (Chomsky & Halle 1968)
    Parallel (Koskenniemi 1983)
Finnish OT Prosody
  Basic Facts
  Finite-state implementation of Kiparsky’s 2003 analysis with the FST tool (Beesley & Karttunen 2003)
  Conclusion
Final thoughts

Page 3: Finnish OT Prosody

Computational morphology

Analysis
  leaves --> leaf N Pl, leave N Pl, leave V Sg3

Generation
  hang V Past --> hanged, hung

Page 4: Finnish OT Prosody

Two challenges

Morphotactics
  Words are composed of smaller elements that must be combined in a certain order:
    piti-less-ness is English
    piti-ness-less is not English

Phonological alternations
  The shape of an element may vary depending on the context:
    pity is realized as piti in pitilessness
    die becomes dy in dying

Page 5: Finnish OT Prosody

Morphology is regular (=rational)

The relation between the surface forms of a language and the corresponding lexical forms can be described as a regular relation.

A regular relation consists of ordered pairs of strings.
  leaf+N+Pl : leaves
  hang+V+Past : hung

Any finite collection of such pairs is a regular relation.

Regular relations are closed under operations such as concatenation, iteration, union, and composition.

Complex regular relations can be derived from simple relations.
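The closure properties can be illustrated with a minimal Python sketch that models finite relations as sets of (lexical, surface) pairs. This is only a toy model of the finite case, not how a transducer is implemented, and the `upper_case` relation is a hypothetical second relation introduced just to show composition.

```python
# Finite relations modeled as sets of (upper, lower) string pairs.

def union(r1, r2):
    """All pairs that belong to either relation."""
    return r1 | r2

def concat(r1, r2):
    """Pairwise concatenation of upper sides and of lower sides."""
    return {(u1 + u2, l1 + l2) for (u1, l1) in r1 for (u2, l2) in r2}

def compose(r1, r2):
    """Pairs (a, c) such that (a, b) is in r1 and (b, c) is in r2 for some b."""
    return {(a, c) for (a, b) in r1 for (b2, c) in r2 if b == b2}

nouns = {("leaf+N+Pl", "leaves")}
verbs = {("hang+V+Past", "hung")}
lexicon = union(nouns, verbs)

# Hypothetical second relation: map each surface form to its upper-case version.
upper_case = {(surface, surface.upper()) for (_, surface) in lexicon}
shouted = compose(lexicon, upper_case)
```

Each operation takes simple relations and returns another set of pairs, which is the sense in which complex relations are derived from simple ones.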

Page 6: Finnish OT Prosody

Morphology is finite-state

A regular relation can be defined using the metalanguage of regular expressions.

[{talk}|{walk}|{work}] [%+Base:0 | %+3rdSg:s | %+Progr:{ing} | %+Past:{ed}];

A regular expression can be compiled into a finite-state transducer that implements the relation computationally.
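As a rough illustration of what the compiled transducer computes, the following Python sketch enumerates the same finite relation as a set of pairs and looks it up in both directions. It is a toy model under stated assumptions: the tag names follow the network diagram (`+3rdSg`), and the empty string stands in for the `0` (epsilon) of the regular expression.

```python
from itertools import product

STEMS = ["talk", "walk", "work"]
# (lexical tag, surface suffix); "" plays the role of 0 (epsilon).
SUFFIXES = [("+Base", ""), ("+3rdSg", "s"), ("+Progr", "ing"), ("+Past", "ed")]

# Concatenate every stem with every tag:suffix pair.
RELATION = {(stem + tag, stem + suffix)
            for stem, (tag, suffix) in product(STEMS, SUFFIXES)}

def generate(lexical):
    """Lexical form -> surface forms (applying the relation downward)."""
    return sorted(s for (l, s) in RELATION if l == lexical)

def analyze(surface):
    """Surface form -> lexical forms (applying the relation upward)."""
    return sorted(l for (l, s) in RELATION if s == surface)
```

The same set of pairs serves both directions, which is the point of describing morphology as a relation rather than as a one-way procedure.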

Page 7: Finnish OT Prosody

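Generation

work+3rdSg --> works

[Figure: the transducer network for the expression above. Stem arcs carry identity pairs (t:t, a:a, l:l, k:k, w:w, o:o, r:r). Suffix arcs pair each tag with the first suffix letter (+3rdSg:s, +Progr:i, +Past:e), followed by arcs for the remaining letters n, g, d with an empty upper side; +Base has an empty lower side.]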

Page 8: Finnish OT Prosody

Analysis

talked --> talk+Past

[Figure: the same transducer network, applied in the analysis direction.]

Page 9: Finnish OT Prosody

Lexical transducer

[Figure: a finite-state transducer relating the inflected form "veut" to the citation form with inflection codes, "vouloir +IndP +SG +P3". The path pairs v o u l o i r +IndP +SG +P3 on the upper side with v e u t on the lower side.]

Bidirectional: generation or analysis
Compact and fast
Comprehensive systems have been built for over 40 languages:
  English, German, Dutch, French, Italian, Spanish, Portuguese, Finnish, Russian, Turkish, Japanese, Korean, Basque, Greek, Arabic, Hebrew, Bulgarian, …

Page 10: Finnish OT Prosody

How lexical transducers are made

[Figure: a Lexicon regular expression (morphotactics) and Rules regular expressions (alternations) pass through the Compiler to yield a Lexicon FST and Rule FSTs; composition merges them into the Lexical Transducer (a single FST). Example path: f a t +Adj +Comp on the upper side, f a t t e r on the lower side.]

Page 11: Finnish OT Prosody

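Two-level rules vs. rewrite rules

[Figure: two architectures side by side. Chomsky & Halle 1968: rewrite rules 1 ... n apply in sequence, leading from the lexical form through intermediate forms to the surface form; the rule transducers are combined into a single FST by composition. Koskenniemi 1983: two-level rules 1 ... n apply in parallel between the lexical form and the surface form; the rule transducers are combined into a single FST by intersection.]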

Page 12: Finnish OT Prosody

Rewrite rules

Yawelmani Vowel Harmony (Kisseberth 1969)

? u: t y ? A s
    Epenthesis
? u: t I y ? A s
    Harmony
? u: t u y ? a s
    Lowering
? o: t u y ? a s

Page 13: Finnish OT Prosody

Two-level constraints

? u: t y ? A s

? o: t u y ? a s

Underlying representation controls all three alternations.

Epenthesis: Insert u or i (underspecification).
Harmony: Rounding next to a round V of the same height.
Lowering: Long u always realized as long o.

Page 14: Finnish OT Prosody

Rewrite Rules vs. Constraints

• Two different ways of decomposing the complex relation between lexical and surface forms into a set of simpler relations that can be more easily understood and manipulated.

• One approach may be more convenient than the other for particular applications.

Page 15: Finnish OT Prosody

Two-level model vs. OT

In some respects, the two-level model of Koskenniemi (1983) was ten years ahead of its time:
  Symbol-to-symbol constraints, not string relations like rewrite rules.
  Rules can refer to both input and output contexts.
  Constraints on the output can be expressed directly.
  Concepts such as FAITHFULNESS can be expressed straightforwardly.

But two-level constraints were not violable and not ranked. All the constraints have to be satisfied to get any output.

Page 16: Finnish OT Prosody

Overview

Success of Finite-State Morphology
  Two strains
    Sequential (Chomsky & Halle 1968)
    Parallel (Koskenniemi 1983)
Finnish OT Prosody
  Basic Facts
  Finite-state implementation of Kiparsky’s 2003 analysis with the FST tool (Beesley & Karttunen 2003)
  Conclusion
Final thoughts

Page 17: Finnish OT Prosody

Finnish Prosody: basic facts

• The nucleus of a Finnish syllable must consist of a short vowel, a long vowel, or a diphthong.

• Main stress is always on the first syllable; secondary stress occurs on non-initial syllables.

• Adjacent syllables are never stressed.

• The stressed syllable is initial in the foot.

ilmoittautuminen ‘registering’ (Nom Sg)
(íl.moit).(tàu.tu).(mì.nen)

Page 18: Finnish OT Prosody

Ternary feet in Finnish

Stress that would fall on a light syllable shifts to the following heavy syllable, creating a ternary foot.
  (ká.las).te.(lèm.me) ‘we are fishing’
  (íl.moit).(tàu.tu).mi.(sès.ta) ‘registering’ (Ela Sg)
  (rá.kas).ta.(jàt.ta).ri.(àn.sa) ‘his mistresses’ (Par Pl)

Can we get these facts to come out “for free”, from the interaction of independently motivated principles?

Yes!
  Paul Kiparsky. “Finnish Noun Inflection”. Generative Approaches to Finnic and Saami Linguistics, Diane Nelson and Satu Manninen (eds.), pp. 109-161. CSLI Publications, 2003.
  Nine Elenbaas and René Kager. “Ternary rhythm and the lapse constraint”. Phonology 16, 273-329. 1999.

Page 19: Finnish OT Prosody

Non-OT and OT solutions

It is possible to define a cascade of replace rules that produce the desired result.

http://www.stanford.edu/~laurik/fsmbook/examples/FinnishProsody.html

But, following Kiparsky, we are going to do OT today, and in a more elegant way than is shown at

http://www.stanford.edu/~laurik/fsmbook/examples/FinnishOTProsody.html

Page 20: Finnish OT Prosody

General Strategy

[Figure: a cascade: Input language .o. GEN, followed by Constraint 1, Constraint 2, ...]

Compose the input language with GEN to produce a mapping from each input form to all of its output candidates.

Eliminate suboptimal candidates by applying constraints in the ranked order. At least one output candidate always survives.

By what finite-state operation?

Page 21: Finnish OT Prosody

Lenient Composition .O.

Let R be a relation that maps each input string to one or more outputs.

Let C be a constraint that eliminates some outputs.

R .O. C is the relation that maps each input string that can meet the constraint C to the outputs that meet C and leaves the rest of the relation R unchanged. (Karttunen 1998)

Is constraint ranking rule ordering in disguise?
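The definition of lenient composition can be paraphrased with a small Python sketch over finite candidate sets. This is a toy model, not the transducer construction of Karttunen 1998: the relation is assumed to be a dictionary from inputs to sets of outputs, and the constraint an ordinary predicate; the example strings are made up.

```python
def lenient_compose(relation, constraint):
    """Keep the outputs that satisfy the constraint, but only for inputs
    that have at least one such output; leave other inputs untouched."""
    result = {}
    for inp, outputs in relation.items():
        survivors = {o for o in outputs if constraint(o)}
        result[inp] = survivors if survivors else set(outputs)
    return result

# Hypothetical candidates; '*' stands for a violation mark.
R = {"a": {"x", "x*"}, "b": {"y*", "y**"}}
no_violation = lambda candidate: "*" not in candidate
pruned = lenient_compose(R, no_violation)
```

Input "a" keeps only its unmarked candidate, while input "b" keeps both candidates because none of them can meet the constraint. Ordinary composition would have discarded "b" altogether.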

Page 22: Finnish OT Prosody

Need a prolific GEN

kala ‘fish’ (Nom Sg): 33 candidates

ka.la  ka.lá  ka.là  ka.(là)  ka.(lá)  ká.la  ká.lá  ká.là  ká.(là)  ká.(lá)  kà.la
(kà.la)  (ká).la  (ká).lá  (ká).là  (ká).(là)  (ká).(lá)  (ká.là)  (ká.lá)  (ká.la) ☜  (ka.là)  (ka.lá)
kà.lá  kà.là  kà.(là)  kà.(lá)  (kà).la  (kà).lá  (kà).là  (kà).(là)  (kà).(lá)  (kà.là)  (kà.lá)
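The count of 33 can be reproduced with a brute-force Python sketch. It is written under assumptions that are a reading of the slides rather than the xfst network itself: every syllable has one vowel that may be unstressed, primary-stressed, or secondary-stressed, and syllables may optionally be grouped into feet of one to three syllables that contain at least one stressed syllable. The helper names are hypothetical.

```python
from itertools import product

ACUTE = {"a": "á", "e": "é", "i": "í", "o": "ó", "u": "ú"}
GRAVE = {"a": "à", "e": "è", "i": "ì", "o": "ò", "u": "ù"}

def stress(syllable, marks):
    """Replace the first vowel of the syllable with its stressed form."""
    for i, ch in enumerate(syllable):
        if ch in marks:
            return syllable[:i] + marks[ch] + syllable[i + 1:]
    return syllable

def stress_variants(syllable):
    return [syllable, stress(syllable, ACUTE), stress(syllable, GRAVE)]

def candidates(syllables):
    """All parses: each syllable is unfooted or sits in a foot of 1-3
    syllables that contains at least one stressed syllable."""
    if not syllables:
        return [[]]
    results = []
    for variant in stress_variants(syllables[0]):       # unfooted syllable
        for tail in candidates(syllables[1:]):
            results.append([variant] + tail)
    for size in (1, 2, 3):                               # footed group
        if size > len(syllables):
            break
        group = syllables[:size]
        for combo in product(*(stress_variants(s) for s in group)):
            if list(combo) == list(group):
                continue                                 # no stress: not a foot
            foot = "(" + ".".join(combo) + ")"
            for tail in candidates(syllables[size:]):
                results.append([foot] + tail)
    return results

kala = [".".join(parse) for parse in candidates(["ka", "la"])]
```

Under these assumptions `len(kala)` is 33, and the list contains the winning candidate (ká.la) along with every other form shown on the slide.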

Page 23: Finnish OT Prosody

Basic definitions 1

Using Parc/XRCE regular expression syntax:

define C [b | c | d | f | g | h | j | k | l | m | n | p | q | r | s | t | v | w | x | z]; # Consonant

define HighV [u | y | i];              # High vowel
define MidV  [e | o | ö];              # Mid vowel
define LowV  [a | ä];                  # Low vowel
define USV   [HighV | MidV | LowV];    # Unstressed vowel

define MSV [á | é | í | ó | ú | "y´" | "ä´" | "ö´"];   # Vowel with main stress
define SSV [à | è | ì | ò | ù | "y`" | "ä`" | "ö`"];   # Vowel with secondary stress
define SV  [MSV | SSV];                # Stressed vowel
define V   [USV | SV];                 # Vowel

Page 24: Finnish OT Prosody

Basic definitions 2

define P [V | C];             # Phone
define B [[\P+] | .#.];       # Boundary
define E .#. | ".";           # Edge

define Light [C* V];          # Light syllable
define Heavy [Light P+];      # Heavy syllable

define S  [Heavy | Light];    # Syllable
define SS [S & $SV];          # Stressed syllable
define US [S & ~$SV];         # Unstressed syllable

define MSS [S & $MSV];        # Syllable with main stress
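The Light and Heavy definitions translate almost directly into ordinary regular expressions. The following Python sketch is an approximation for illustration: it covers only the unstressed vowels, and the function name `weight` is hypothetical.

```python
import re

C = "bcdfghjklmnpqrstvwxz"      # consonants, as in the definition of C
V = "aeiouyäö"                  # unstressed vowels only (an assumption)

LIGHT = re.compile(f"[{C}]*[{V}]")             # C* V
HEAVY = re.compile(f"[{C}]*[{V}][{C}{V}]+")    # Light P+

def weight(syllable):
    """Classify a syllable as 'light' or 'heavy'; None if it is neither."""
    if LIGHT.fullmatch(syllable):
        return "light"
    if HEAVY.fullmatch(syllable):
        return "heavy"
    return None
```

With these patterns, `ka` and `te` come out light, while `las` (closed syllable) and `tau` (diphthong) come out heavy.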

Page 25: Finnish OT Prosody

GEN 1

define MarkNonDiphthongs [ [. .] -> "." || [HighV|MidV] _ LowV, LowV _ MidV, i _ [MidV - e], u _ [MidV - o], y _ [MidV - ö] ];

Insert a syllable boundary between vowels that cannot form a diphthong: i.a, e.a, a.e, i.o, u.e, y.e, etc.

define Syllabify C* V+ C* @-> ... "." || _ C V ;

Insert a syllable boundary after a maximal C* V+ C* pattern that is followed by C V. For example, strukturalismi -> struk.tu.ra.lis.mi.
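The two syllabification steps can be approximated with Python regular expressions. This is a sketch under assumptions, not a translation of the xfst semantics in general: the lookbehind/lookahead alternation mimics the five contexts of the diphthong rule, and a greedy match with a lookahead stands in for the left-to-right longest-match operator `@->`, which gives the same result for this particular pattern.

```python
import re

C = "bcdfghjklmnpqrstvwxz"
HIGH, MID, LOW = "uyi", "eoö", "aä"
V = HIGH + MID + LOW

# Positions between two vowels that cannot form a diphthong.
NON_DIPHTHONG = re.compile(
    f"(?<=[{HIGH}{MID}])(?=[{LOW}])"    # [HighV|MidV] _ LowV
    f"|(?<=[{LOW}])(?=[{MID}])"         # LowV _ MidV
    "|(?<=i)(?=[oö])"                   # i _ [MidV - e]
    "|(?<=u)(?=[eö])"                   # u _ [MidV - o]
    "|(?<=y)(?=[eo])"                   # y _ [MidV - ö]
)

# Maximal C* V+ C* that is followed by C V.
SYLLABLE = re.compile(f"([{C}]*[{V}]+[{C}]*)(?=[{C}][{V}])")

def syllabify(word):
    """Mark non-diphthongs first, then insert the remaining boundaries."""
    marked = NON_DIPHTHONG.sub(".", word)
    return SYLLABLE.sub(r"\1.", marked)
```

For example, `syllabify("strukturalismi")` returns `struk.tu.ra.lis.mi`, matching the slide, and `syllabify("ergonomia")` returns `er.go.no.mi.a`, where the final boundary comes from the diphthong rule.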

Page 26: Finnish OT Prosody

GEN 2

define Stress a (->) á|à, e (->) é|è, i (->) í|ì, o (->) ó|ò, u (->) ú|ù, y (->) "y´"|"y`", ä (->) "ä´"|"ä`", ö (->) "ö´"|"ö`";

Optionally stress any vowel with a primary or secondary stress.

define Scan [[S ("." S ("." S)) & $SS] (->) "(" ... ")" || E _ E] ;

Optionally group syllables into unary, binary, or ternary feet when there is at least one stressed syllable.

define Gen [MarkNonDiphthongs .o. Syllabify .o. Stress .o. Scan];

Page 27: Finnish OT Prosody

Demo!

regex {kala} .o. Gen ;     # compose
print lower-words          # show output candidates
print size                 # count them

Page 28: Finnish OT Prosody

Kiparsky's nine constraints

Clash
AlignLeft
MainStress
FootBin
Lapse
NonFinal
StressToWeight
Parse
AllFeetFirst

Page 29: Finnish OT Prosody

Counting constraint violations

We use asterisks to mark constraint violations. We need a way to prefer the candidates with the fewest violation marks.

define Viol ${*};

define Viol0 ~Viol;       # No violations
define Viol1 ~[Viol^2];   # At most one violation
define Viol2 ~[Viol^3];   # At most two violations
define Viol3 ~[Viol^4];   # At most three violations

This eliminates the violation marks after the candidate set has been pruned by a constraint.

define Pardon {*} -> 0;

Page 30: Finnish OT Prosody

Defining OT Constraints

Three types:
  Unviolable constraints
    Primary stress in Finnish
  Ordinary violable constraints
    Lapse
  Gradient alignment constraints
    All-Feet-First

Strategy:

We define an evaluation template for each of the three types and then define the individual constraints with the help of the templates.

Page 31: Finnish OT Prosody

Evaluation Template for Unviolable Constraints

define Unviolable(Candidates, Constraint) [ Candidates .o. Constraint ];

Example:

define MainStress(X) Unviolable(X, B MSS ~$MSS);

# B is the left edge of the word or "(".
# MSS is a syllable with a primary stress.

Page 32: Finnish OT Prosody

Evaluation Template for Ordinary Constraints

define Eval(Candidates, Violation, Left, Right) [
  Candidates
  .o.
  Violation -> ... {*} || Left _ Right
  .O.
  Viol3 .O. Viol2 .O. Viol1 .O. Viol0
  .o.
  Pardon ];

where Viol0 is ~${*}, Viol1 is ~[[${*}]^2], etc., and Pardon is {*} -> 0, deleting all violation marks.
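The behaviour of this template can be sketched in Python over a finite candidate set. This is a toy model under assumptions: `mark` is a hypothetical function that inserts one `*` per violation, and the constraint used in the example (penalizing every secondary-stressed à) is made up for illustration and is not one of Kiparsky's.

```python
def eval_constraint(candidates, mark):
    """Mark violations, keep the least-marked candidates, erase the marks."""
    marked = {mark(c) for c in candidates}
    for limit in (3, 2, 1, 0):            # Viol3 .O. Viol2 .O. Viol1 .O. Viol0
        survivors = {c for c in marked if c.count("*") <= limit}
        if survivors:                     # lenient: skip a filter nobody passes
            marked = survivors
    return {c.replace("*", "") for c in marked}     # Pardon

# Hypothetical toy constraint: one violation mark after every 'à'.
toy_mark = lambda candidate: candidate.replace("à", "à*")

best = eval_constraint({"ka.la", "ka.là", "kà.là"}, toy_mark)
fallback = eval_constraint({"ka.là", "kà.là"}, toy_mark)
too_many = eval_constraint({"à.à.à.à", "à.à.à.à.à"}, toy_mark)
```

The first call keeps only the violation-free candidate. The second keeps the candidate with one violation, because no candidate is clean. The third keeps both candidates: with four and five violations neither passes even Viol3, so a template that stops at three marks cannot tell them apart.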

Page 33: Finnish OT Prosody

Evaluation Template for Left-Oriented Gradient Alignment

define EvalGradientLeft(Candidates, Violation, Left, Right) [
  Candidates
  .o. Violation -> {*} ...   || .#. Left   _ Right
  .o. Violation -> {*}^2 ... || .#. Left^2 _ Right
  .o. Violation -> {*}^3 ... || .#. Left^3 _ Right
  .o. Violation -> {*}^4 ... || .#. Left^4 _ Right
  .o. Violation -> {*}^5 ... || .#. Left^5 _ Right
  .o. Violation -> {*}^6 ... || .#. Left^6 _ Right
  .o. Violation -> {*}^7 ... || .#. Left^7 _ Right
  .o. Violation -> {*}^8 ... || .#. Left^8 _ Right
  .O. Viol12 .O. Viol11 .O. Viol10 .O. Viol9 .O. Viol8 .O. Viol7 .O. Viol6
  .O. Viol5 .O. Viol4 .O. Viol3 .O. Viol2 .O. Viol1 .O. Viol0
  .o. Pardon ];
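For a constraint such as All-Feet-First, where the violation is a foot bracket and each Left unit spans one syllable boundary, the marks amount to counting the syllable boundaries that precede each foot. The Python sketch below models only that counting step, under this reading of the template; the function name is hypothetical, and the second candidate in the usage note is made up for comparison.

```python
def all_feet_first_violations(candidate):
    """One violation per syllable boundary '.' to the left of each '('."""
    return sum(candidate[:i].count(".")
               for i, ch in enumerate(candidate) if ch == "(")
```

For instance, `(ká.las).te.(lèm.me)` incurs 3 violations, since its second foot is preceded by three boundaries, while `(ká.las).(tè.lem).me` incurs 2. The template itself only writes marks for distances of one to eight boundaries, whereas this sketch counts without a ceiling.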

Page 34: Finnish OT Prosody

Clash, AlignLeft, MainStress

Clash
No stress on adjacent syllables.

define Clash(X) Eval(X, SS, SS B, ?*);

Align-Left
The stressed syllable is initial in the foot.

define AlignLeft(X) Eval(X, SV, .#. ~[?* "(" C*], ?*);

Main Stress
The primary stress in Finnish is on the first syllable.

define MainStress(X) Unviolable(X, B MSS ~$MSS);

Page 35: Finnish OT Prosody

FootBin, Lapse, NonFinal

Foot-Bin
Feet are minimally bimoraic and maximally bisyllabic.

define FootBin(X) Eval(X, "(" Light ")" | "(" S ["." S]^>1, ?*, ?*);

Lapse
Every unstressed syllable must be adjacent to a stressed syllable or to the word edge.

define Lapse(X) Eval(X, US, [B US B], [B US B]);

Non-Final
The final syllable is not stressed.

define NonFinal(X) Eval(X, SS, ?*, ~$S .#.);

Page 36: Finnish OT Prosody

StressToWeight, Parse, AllFeetFirst

Stress-To-Weight
Stressed syllables are heavy.

define StressToWeight(X) Eval(X, SS & Light, ?*, ")" | E);

License-σ
Syllables are parsed into feet.

define Parse(X) Eval(X, S, E, E);

All-Ft-Left
The left edge of every foot coincides with the left edge of some prosodic word.

define AllFeetFirst(X) [ EvalGradientLeft(X, "(", ~$"." "." ~$".", ?*) ];

Page 37: Finnish OT Prosody

Finnish Prosody

Kiparsky 2003:

define FinnishProsody(Input) [ AllFeetFirst( Parse( StressToWeight( NonFinal( Lapse( FootBin( MainStress( AlignLeft( Clash( Input .o. Gen)))))))))];

Page 38: Finnish OT Prosody

FinnWords

regex FinnishProsody( {kalastelet} | {kalasteleminen} | {ilmoittautuminen} | {järjestelmättömyydestänsä} | {kalastelemme} | {ilmoittautumisesta} | {järjestelmällisyydelläni} | {järjestelmällistämätöntä} | {voimisteluttelemasta} | {opiskelija} | {opettamassa} | {kalastelet} | {strukturalismi} | {onnittelemanikin} | {mäki} | {perijä} | {repeämä} | {ergonomia} | {puhelimellani} | {matematiikka} | {puhelimistani} | {rakastajattariansa} | {kuningas} | {kainostelijat} | {ravintolat} | {merkonomin} ) ;

Demo!

Page 39: Finnish OT Prosody

Result

(ér.go).(nò.mi).a
(íl.moit).(tàu.tu).mi.(sès.ta)
(íl.moit).(tàu.tu).(mì.nen)
(ón.nit).(tè.le).(mà.ni).kin
(ó.pis).(kè.li).ja
(ó.pet).ta.(màs.sa)
(vói.mis).te.(lùt.te).le.(màs.ta)
(strúk.tu).ra.(lìs.mi)
(rá.vin).(tò.lat)
(rá.kas).ta.(jàt.ta).ri.(àn.sa)
(ré.pe).(ä`.mä)
(pé.ri).jä
(pú.he).li.(mèl.la).ni
(pú.he).li.(mìs.ta).ni
(mä´.ki)
(má.te).ma.(tìik.ka)
(mér.ko).(nò.min)
(kái.nos).(tè.li).jat
(ká.las).te.(lèm.me)
(ká.las).te.(lè.mi).nen
(ká.las).(tè.let)
(kú.nin).gas
(jä´r.jes).tel.(mä`l.li).syy.(dèl.lä).ni
(jä´r.jes).(tèl.mät).tö.(my`y.des).(tä`n.sä)
(jä´r.jes).(tèl.mäl).(lìs.tä).mä.(tö`n.tä)

Page 40: Finnish OT Prosody

Two Errors

(ká.las).te.(lè.mi).nen
(jä´r.jes).tel.(mä`l.li).syy.(dèl.lä).ni

The interaction of Lapse and StressToWeight does not produce the desired result in these cases.

Page 41: Finnish OT Prosody

What is wrong?

define Debug(Input) [ DebugStressToWeight( NonFinal( Lapse( FootBin( MainStress( AlignLeft( Clash( Input .o. Gen))))))) ];

regex Debug({kalasteleminen});
(ká*.las).te.(lè*.mi).nen       <-- actual winner
(ká*.las).(tè*.le).(mì*.nen)    <-- desired output

(jä´r.jes).tel.(mä`l.li).syy.(dèl.lä).ni       <-- actual winner
(jä´r.jes).(tèl.mäl).li.(sy`y.del).(lä`*.ni)   <-- desired output

The StressToWeight constraint eliminates some of the desired winning candidates.

Page 42: Finnish OT Prosody

Conclusion

Can we get ternary feet in Finnish “for free”, from the interaction of independently motivated principles?
  We don’t know.

Optimality Prosody is computationally very difficult.
  The number of initial candidates is huge:
    kalasteleminen 70653
    järjestelmällisyydelläni 21767579
  Simple tableau methods do not work.

Finite-state implementation guards against errors made by a human GEN and EVAL.

But even when an error can be pinpointed, the fix is not obvious.

Debugging OT constraints is as hard as debugging two-level rules, in practice more difficult than rewrite systems.
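The two candidate counts can be cross-checked with a short recurrence in Python. The sketch rests on an assumption about GEN rather than on the xfst network itself: every syllable has three stress variants, and each syllable is either unfooted or belongs to a foot of one to three syllables containing at least one stressed syllable. The function name is hypothetical.

```python
def candidate_count(n_syllables):
    """Number of stress/foot parses of a word with n_syllables syllables."""
    # A foot of k syllables has 3**k stress patterns, minus the unstressed one.
    foot_choices = {k: 3 ** k - 1 for k in (1, 2, 3)}
    counts = [1]                                  # the empty word has one parse
    for n in range(1, n_syllables + 1):
        total = 3 * counts[n - 1]                 # last syllable is unfooted
        for k, ways in foot_choices.items():
            if k <= n:
                total += ways * counts[n - k]     # word ends in a k-syllable foot
        counts.append(total)
    return counts[n_syllables]
```

Under this assumption the function returns 33 for two syllables (kala), 70653 for six (kalasteleminen), and 21767579 for nine (järjestelmällisyydelläni), in agreement with the figures on the slides. The count grows roughly sevenfold with every added syllable.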

Page 43: Finnish OT Prosody

Final Thoughts

Morphology is a regular relation.
  The composition of words (morphosyntax), morphological alternations, and prosody can be described in finite-state terms.

A complex relation can be decomposed in different ways.
  There are many flavors of finite-state morphology: Item-and-Arrangement, Rewrite rules, Two-level rules, Realizational Morphology, Classical optimality constraints.

Computing with finite-state tools is fun and easy.
  We have a sophisticated formalism for describing regular relations, efficient compilers and runtime software.

‘Paper-and-pencil’ morphology badly needs computational support.
  It is difficult to get globally correct results relying on a handful of interesting words, rules, and constraints.

Page 44: Finnish OT Prosody

References

Kenneth R. Beesley & Lauri Karttunen, Finite State Morphology, CSLI Publications. March 2003. (Software included).

http://www.fsmbook.com/

Lauri Karttunen, "Computing with Realizational Morphology" in CICLing-2003, A. Gelbukh (ed.), Lecture Notes in Computer Science 2588, pages 205-216. Springer Verlag. 2003.