Finite-state Machines: Theory and Applications
Unweighted Finite-state Automata
Thomas Hanneforth
Institut für Linguistik, Universität Potsdam
December 10, 2008
Thomas Hanneforth (Universitat Potsdam) Finite-state Machines: Theory and Applications December 10, 2008 1 / 99
Overview
1 Applications of finite-state machines
2 Finite state acceptors and transducers: formal characterization
3 Closure properties and algebra of finite state acceptors
4 Closure properties and algebra of finite state transducers
5 Equivalence transformations on finite state acceptors
6 Equivalence transformations on finite state transducers
7 Decidability properties of unweighted finite-state acceptors and transducers
Outline
1 Applications of Finite-State Machines
Some Applications of Finite-State Machines in Computational Linguistics
Morphological analysis: lemmatization, word segmentation, segmentation disambiguation
Spelling correction
Lexicon representation
Part-of-Speech-Tagging
Shallow parsing, Chunking
Speech recognition, speech synthesis
Optimality theory
Language modeling & Statistical language processing
Statistical machine translation
Outline
2 Finite state acceptors and transducers: formal characterization
Finite-state acceptors
Finite-state transducers
Finite-state acceptors
Non-deterministic finite-state acceptor
Definition (Non-deterministic finite-state acceptor (NFA))
A non-deterministic finite-state acceptor A is a 5-tuple 〈Q, Σ, q0, F, δ〉 where
Q is a non-empty set of states
Σ is a non-empty set and called the alphabet of A
q0 ∈ Q, the start state
F ⊆ Q, the set of final states
δ : Q × (Σ ∪ {ε}) ↦ 2^Q, the transition function.
Notes
δ may be a partial function (and usually is).
Nondeterminism: the transition function δ maps a state q and an alphabet symbol a (or ε) to a set of successor states.
A transition may be labeled with ε, the neutral element of concatenation.
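The definition above can be turned into a small membership test. The following is a minimal sketch, assuming δ is stored as a dictionary from (state, symbol) pairs to sets of states and that the empty string stands for ε; the toy acceptor and all names are illustrative and not taken from the slides.

```python
EPS = ""  # assumption: the empty string stands for ε

def eps_closure(states, delta):
    """All states reachable from `states` via ε-transitions only."""
    stack, closure = list(states), set(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, EPS), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure

def nfa_accepts(word, q0, finals, delta):
    """True iff some path labelled `word` leads from q0 to a final state."""
    current = eps_closure({q0}, delta)
    for a in word:
        step = set()
        for q in current:
            step |= delta.get((q, a), set())  # δ is partial: missing key = no move
        current = eps_closure(step, delta)
    return bool(current & finals)

# Hypothetical toy acceptor for {"cat", "cow"} with one ε-transition.
delta = {
    (0, "c"): {1, 4},
    (1, "a"): {2}, (2, "t"): {3},
    (4, "o"): {5}, (5, "w"): {6},
    (6, EPS): {3},
}
finals = {3}
```

Tracking the whole set of currently possible states avoids backtracking over the nondeterministic choices.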
Finite-state acceptors
Example (Nondeterministic FSA A_lex accepting some animal names)
[Figure: state diagram of the nondeterministic acceptor A_lex]
Finite-state acceptors
Deterministic finite-state acceptor
Definition (Deterministic finite-state acceptor (DFA))
A deterministic finite-state acceptor A is a 5-tuple 〈Q, Σ, q0, F, δ〉, where
Q is a non-empty set of states
Σ is a non-empty set and called the alphabet of A
q0 ∈ Q, the start state
F ⊆ Q, the set of final states
δ : Q × Σ ↦ Q, the transition function.
Notes
Again, δ may be a partial function.
DFAs are by definition ε-free, that is, they contain no ε-transitions.
DFAs and NFAs have the same generative power, that is, both concepts are equivalent (cf. subset construction).
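The subset construction mentioned in the last note can be sketched as follows. This is a minimal illustration restricted to ε-free NFAs; the dictionary encoding of δ, the function name, and the example acceptor are assumptions made for this sketch.

```python
from collections import deque

def determinize(q0, finals, delta, alphabet):
    """Subset construction for an ε-free NFA.

    delta maps (state, symbol) -> set of states; a missing key means
    'no transition'. DFA states are frozensets of NFA states.
    Returns (start, dfa_finals, dfa_delta).
    """
    start = frozenset({q0})
    dfa_delta, dfa_finals = {}, set()
    seen, queue = {start}, deque([start])
    while queue:
        S = queue.popleft()
        if S & finals:                      # S contains a final NFA state
            dfa_finals.add(S)
        for a in alphabet:
            T = frozenset(r for q in S for r in delta.get((q, a), ()))
            if not T:
                continue                    # keep the DFA's δ partial
            dfa_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                queue.append(T)
    return start, dfa_finals, dfa_delta

# Hypothetical NFA over {a, b} accepting the words that end in "ab".
nfa_delta = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}}
start, dfa_finals, dfa_delta = determinize(0, {2}, nfa_delta, "ab")
```

Only subsets reachable from the start set are created, so the result is often much smaller than the full power set.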
Finite-state acceptors
Example (Deterministic version of A_lex)
[Figure: state diagram of the deterministic acceptor equivalent to A_lex]
Deterministic acyclic FSAs are also called tries. Tries are useful for lexicon representation.
Finite-state acceptors
Examples
Example (DFA A_NP accepting English noun phrase patterns)
[Figure: state diagram of A_NP]
Example (DFA A_English accepting some English sentences)
[Figure: state diagram of A_English]
Finite-state acceptors
Generalized transition function, language
Definition (Generalized transition function δ*)
δ* is the reflexive and transitive closure of δ:
δ*(q, ε) = q, ∀q ∈ Q
δ*(q, aw) = δ*(δ(q, a), w)
[Figure: state diagram illustrating δ*]
Definition (Language of a DFA A)
L(A) = {w ∈ Σ* | δ*(q0, w) ∈ F}
We also say that L(A) is recognized by A.
Definition (Regular language)
A language is called regular if there exists some DFA which recognizes it.
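The two recursive equations for δ* translate almost literally into code. The sketch below assumes a partial δ stored as a dictionary and uses None for "undefined"; the example DFA is hypothetical.

```python
def delta_star(delta, q, w):
    """δ*(q, ε) = q and δ*(q, aw) = δ*(δ(q, a), w); None where δ is undefined."""
    if q is None or w == "":
        return q
    return delta_star(delta, delta.get((q, w[0])), w[1:])

def in_language(delta, q0, finals, w):
    """w ∈ L(A) iff δ*(q0, w) ∈ F."""
    return delta_star(delta, q0, w) in finals

# Hypothetical DFA for the language (ab)*: state 0 is both start and final.
delta = {(0, "a"): 1, (1, "b"): 0}
```

For long inputs an iterative loop would avoid Python's recursion limit; the recursive form is kept here only because it mirrors the definition.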
Finite-state transducers
Definition
Definition ((Non-deterministic) finite-state transducer (NFST))
A (non-deterministic) finite-state transducer T is a 7-tuple 〈Q, Σ, ∆, q0, F, δ, σ〉, where
Q is a non-empty set of states
Σ is a non-empty set and called the input alphabet of T
∆ is a non-empty set and called the output alphabet of T
q0 ∈ Q, the start state
F ⊆ Q, the set of final states
δ : Q × (Σ ∪ {ε}) ↦ 2^Q, the transition function.
σ : Q × (Σ ∪ {ε}) × Q ↦ ∆*, the output function.
Finite-state transducers
Alternative definition
To simplify some definitions, we combine the transition and output functions into a set of transitions.
In addition, we restrict the output function to single symbols or ε.
Definition (Normalized finite-state transducer)
A normalized finite-state transducer T is a 6-tuple 〈Q, Σ, ∆, q0, F, E〉, where
Q is a non-empty set of states
Σ is a non-empty set, the input alphabet of T
∆ is a non-empty set, the output alphabet of T
q0 ∈ Q, the start state
F ⊆ Q, the set of final states
E ⊆ Q× (Σ ∪ {ε})× (∆ ∪ {ε})×Q, the set of transitions.
Note that every transducer can be transformed into a normalized transducer.
Finite-state transducers
Example (T_lex mapping surface forms to morphological features)
[Figure: state diagram of the transducer T_lex]
Note that fish is nondeterministically mapped to {NOUN sg, NOUN pl}.
Finite-state transducers
Example (Laughter machine T_laugh)
[Figure: state diagram of the transducer T_laugh]
The input string laugh is mapped to the infinite set {ha^n | n ≥ 1}.
Finite-state transducers
Example (Bracketing machine T_bracket)
[Figure: state diagram of the transducer T_bracket]
Every occurrence of ab is enclosed within brackets.
For example, the input string cabbabc is mapped to c{ab}b{ab}c by traversing the state sequence 0 0 2 3 4 0 0 2 3 4 0 0.
Finite-state transducers
Language
Definition (Language of a FST)
The language L(T) of a FST T = 〈Q, Σ, ∆, q0, F, δ, σ〉 is defined in the following way:
L(T) = {〈u, v〉 | δ*(q0, u) ∩ F ≠ ∅ ∧ v ∈ σ*(q0, u)}
δ* is recursively defined:
δ*(q, ε) = {q} and
δ*(q, wa) = ⋃_{q′ ∈ δ*(q, w)} δ(q′, a)
σ* is the generalized output function and defined like this:
σ*(q, ε) = {ε} and
σ*(q, wa) = σ*(q, w) · ⋃_{q′ ∈ δ*(q, w), p ∈ δ(q′, a)} σ(q′, a, p)
Finite-state transducers
Transduction mapping
Definition (Transduction mapping)
The transduction mapping ⟦T⟧ : Σ* ↦ 2^(∆*) of a FST T is defined as:
⟦T⟧(x) = {y | 〈x, y〉 ∈ L(T)}
Definition (Functional transducer)
A transducer T is called functional if |⟦T⟧(x)| ≤ 1 for all x ∈ Σ*.
Example
T_bracket is functional. T_lex is not functional.
Definition (Ambiguous transducer)
A transducer T is called ambiguous if |⟦T⟧(x)| > 1 for some x ∈ Σ*.
Finite-state transducers
Ambiguous transducers
Definition (Finitely ambiguous transducer)
A transducer T is called finitely ambiguous if |⟦T⟧(x)| is finite for all x ∈ Σ*.
Example
T_lex is finitely ambiguous.
Definition (Infinitely ambiguous transducer)
A transducer T is called infinitely ambiguous if |⟦T⟧(x)| is infinite for some x ∈ Σ*.
Example
T_laugh is infinitely ambiguous.
Finite-state transducers
Deterministic finite-state transducer
Definition (Deterministic finite-state transducer (DFST))
A deterministic finite-state transducer T is a 7-tuple 〈Q, Σ, ∆, q0, F, δ, σ〉, where
Q is a non-empty set of states
Σ is a non-empty set and called the input alphabet of T
∆ is a non-empty set and called the output alphabet of T
q0 ∈ Q, the start state
F ⊆ Q, the set of final states
δ : Q × Σ ↦ Q, the (deterministic) transition function.
σ : Q × Σ ↦ ∆*, the (deterministic) output function.
Theorem
Every deterministic transducer is functional.
Outline
3 Closure properties and algebra of finite state acceptors
Union
Concatenation
Star closure
Plus closure
Reversal
Complementation
Intersection
Difference
Substitution
Homomorphism
Closure properties and algebra of finite state acceptors
Definition (Closure of a set)
Let S be a set and let f_k be a k-ary function taking k-tuples over S as arguments. We say that S is closed under f_k if f_k(a1, a2, . . . , ak) ∈ S for all a1, . . . , ak ∈ S.
Note
Closure properties are important for the modularity of systems built on a specific formalism. They allow complex objects to be built out of simpler ones by combining them with a number of operations.
Closure properties and algebra of finite state acceptors
Closure properties of regular languages
The set of languages recognized by finite-state acceptors (the regular languages) is closed under
Union
Concatenation
Plus and star closure
Reversal
Complementation
Intersection
Difference
Homomorphism and substitution
Closure properties and algebra of finite state acceptors
Union
Example (Union of two acceptors)
[Figure: acceptors A1 and A2 and the acceptor for A1 ∪ A2]
Closure properties and algebra of finite state acceptors
Concatenation
Example (Concatenation of two acceptors)
[Figure: acceptors A1 and A2 and the acceptor for A1 · A2]
Closure properties and algebra of finite state acceptors
Star closure
Example (Star (Kleene) closure of an acceptor)
[Figure: acceptor A and the acceptor for A*]
Closure properties and algebra of finite state acceptors
Plus closure
Example (Plus closure of an acceptor)
[Figure: acceptor A and the acceptor for A+]
Closure properties and algebra of finite state acceptors
Reversal
Definition (Reversal of a string)
The reversal of a string w ∈ Σ* – denoted by w^R – is defined as:
ε^R = ε
(a · w)^R = w^R · a, ∀a ∈ Σ ∧ w ∈ Σ*
Example
obama^R = bama^R · o = ama^R · bo = ma^R · abo = a^R · mabo = ε^R · amabo = amabo
Definition (Reversal of a string set)
Let S ⊆ Σ* be a set of strings. The reversal of S – denoted by S^R – is defined as: S^R = {w^R | w ∈ S}.
Theorem
The set of regular languages is closed under reversal.
Closure properties and algebra of finite state acceptors
Reversal
Example (Reversal of an acceptor)
[Figure: acceptor A and its reversal A^R]
Closure properties and algebra of finite state acceptors
Complementation
Given an alphabet Σ and a FSA A, you sometimes need a FSA Ā accepting all strings x ∈ Σ* which are not accepted by A.
Formally: L(Ā) = {x ∈ Σ* | x ∉ L(A)} or L(Ā) = Σ* − L(A)
[Figure: diagram of L(A) and its complement L(Ā) within Σ*]
Theorem
The set of regular languages is closed under complementation.
Closure properties and algebra of finite state acceptors
Complementation: algorithm
Algorithm:
1 Determinize A and obtain A′.
2 Make A′ complete by adding a sink state s and adding, for each state q and each symbol a ∈ Σ not already used at q, a transition δ(q, a) = s.
3 Exchange final and non-final states.
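Steps 2 and 3 of the algorithm can be sketched as follows, assuming step 1 has already been carried out so that δ maps (state, symbol) to a single state. The sink name "sink" and the example DFA are assumptions of this sketch, not part of the slides.

```python
def complement(states, alphabet, q0, finals, delta):
    """Steps 2 and 3 of the algorithm for an already deterministic acceptor."""
    sink = "sink"  # assumption: this name is not already used as a state
    all_states = set(states) | {sink}
    total = dict(delta)
    for q in all_states:
        for a in alphabet:
            total.setdefault((q, a), sink)   # step 2: complete δ via the sink
    new_finals = all_states - set(finals)    # step 3: swap final / non-final
    return all_states, q0, new_finals, total

def accepts(q0, finals, delta, w):
    """Run a complete DFA on w (every symbol of w must be in the alphabet)."""
    q = q0
    for a in w:
        q = delta[(q, a)]
    return q in finals

# Hypothetical DFA accepting only "cat" over Σ = {a, c, r, t}.
delta = {(0, "c"): 1, (1, "a"): 2, (2, "t"): 3}
_, q0, co_finals, co_delta = complement({0, 1, 2, 3}, "acrt", 0, {3}, delta)
```

The sink state becomes final after the swap, which is what lets the complement accept every string that "fell off" the original automaton.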
Closure properties and algebra of finite state acceptors
Complementation
Example (Complementation of a finite-state acceptor)
[Figure: acceptor A (Σ = {a, c, r, t}) and its complement Ā]
Closure properties and algebra of finite state acceptors
Complementation
Example (Why complementation works)
Consider a trie for W = {cat, camel, dog, frog}.
[Figure: trie for W]
Definition (Definition of a trie)
Let W be a finite set of words over Σ. Let Pref(W) be the set of all prefixes of the words in W. Define a DFA A = 〈Pref(W), Σ, ε, W, δ〉 with δ(x, a) = xa for all a ∈ Σ and x, xa ∈ Pref(W).
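The trie definition is constructive and can be written down directly. The sketch below uses the prefixes themselves as state names, as in the definition; the function names are illustrative.

```python
def build_trie(words):
    """DFA <Pref(W), Σ, ε, W, δ> with δ(x, a) = xa for x, xa in Pref(W)."""
    prefixes = {w[:i] for w in words for i in range(len(w) + 1)}
    # Every non-empty prefix x is reached from x[:-1] by reading x[-1].
    delta = {(x[:-1], x[-1]): x for x in prefixes if x}
    return prefixes, "", set(words), delta

def trie_accepts(trie, w):
    """Follow w from the start state ε; reject as soon as δ is undefined."""
    _, q, finals, delta = trie
    for a in w:
        q = delta.get((q, a))
        if q is None:
            return False
    return q in finals

trie = build_trie({"cat", "camel", "dog", "frog"})
```

For the word set of the example this yields 14 states, one per distinct prefix including ε.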
Closure properties and algebra of finite state acceptors
Complementation
Example (A trie for W = {cat, camel, dog, frog})
States are labeled with prefixes of W.
[Figure: trie for W with each state labeled by its prefix]
Closure properties and algebra of finite state acceptors
Complementation
In a trie – a special acyclic DFA – each state corresponds to a single prefix of a word in W.
In a general DFA A, each state q corresponds to a set of prefixes of the words in L(A), the left language of q.
Definition (Left language)
The left language of a state q – symbolically ←L(q) – is defined as:
←L(q) = {w ∈ Σ* | δ*(q0, w) = q}
Example (Left language)
[Figure: state diagram of a small cyclic acceptor]
←L(1) = {a(ba)^n | n ≥ 0}
Closure properties and algebra of finite state acceptors
Complementation: Why can we only complement DFAs?
Example ("Complementation" of an NFA)
[Figure: an NFA accepting ab and the result of exchanging its final and non-final states]
The "complemented" NFA still accepts ab.
[Figure: additional state diagram for this example]
Closure properties and algebra of finite state acceptors
Complementation: Importance
Complementation (negation) is important for the inherent robustness of methods based on finite-state automata.
An NLP system is called robust if there does not exist an input string for which it fails. That means: a robust NLP system accepts Σ*.
This is immediately related to negation: if there is some input string w which is not accepted by some DFA A (say, for example, a DFA representing some NP grammar about stock indices), one could use Ā to accept w: L(A) ∪ L(Ā) = Σ*.
This property does not carry over to context-free languages: they are not closed under complementation.
Context-sensitive languages are – perhaps surprisingly – again closed under complementation. But their recognition problem is notoriously difficult.
Closure properties and algebra of finite state acceptors
Complementation: the sting in the tail: complexity
Theorem
Consider an NFA A with state set Q. The state complexity of an equivalent DFA A′ can in the worst case be in O(|Σ| · 2^|Q|).
Example (Worst case of determinization)
Consider the regular language L = Σ*a(a|b)^k for some k (with Σ = {a, b}). While an NFA for L has k + 2 states, the equivalent DFA has 2^(k+1) states.
[Figure: NFA and equivalent DFA for k = 2]
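The blow-up in the example can be checked experimentally. The sketch below hard-codes the (k + 2)-state NFA for L = Σ*a(a|b)^k and counts the subsets reachable in the subset construction; the state numbering 0..k+1 is an assumption of this sketch.

```python
def count_dfa_states(k):
    """Number of reachable subset states for the NFA of Σ*a(a|b)^k, Σ = {a, b}."""
    # NFA states 0..k+1: state 0 loops on a and b and may guess the marked 'a';
    # states 1..k each read one arbitrary symbol; state k+1 is final.
    def step(S, sym):
        T = set()
        for q in S:
            if q == 0:
                T.add(0)
                if sym == "a":
                    T.add(1)
            elif q <= k:
                T.add(q + 1)
        return frozenset(T)

    start = frozenset({0})
    seen, todo = {start}, [start]
    while todo:
        S = todo.pop()
        for sym in "ab":
            T = step(S, sym)
            if T not in seen:
                seen.add(T)
                todo.append(T)
    return len(seen)
```

Each reachable subset records which of the last k + 1 input symbols were an a, which is why all 2^(k+1) combinations occur.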
Closure properties and algebra of finite state acceptors
Intersection
If we know that FSAs are closed under complementation and union, then we also know that they are closed under intersection.
Why? By De Morgan!
A ∩ B ≡ ¬(Ā ∪ B̄), where X̄ and ¬(X) both denote the complement of X
But this approach is very expensive, since it requires three complementation operations, which in turn require determinization.
There is a more direct method: we let pairs of states of the original FSAs be the states of the intersection FSA.
Closure properties and algebra of finite state acceptors
Intersection: Example
Example (Intersection with the product state construction)
[Figure: acceptors A1 and A2 and the product acceptor A1 ∩ A2, whose states are pairs of states]
Closure properties and algebra of finite state acceptors
Intersection: formal definition
Definition (Intersection of two finite-state acceptors)
Let A1 = 〈Q1, Σ1, q01, F1, δ1〉 and A2 = 〈Q2, Σ2, q02, F2, δ2〉 be two FSAs.
A1 ∩ A2, the intersection of A1 and A2, is the acceptor
A = 〈Q1 × Q2, Σ1 ∩ Σ2, 〈q01, q02〉, F1 × F2, δ〉
where 〈p′, q′〉 ∈ δ(〈p, q〉, a) if p′ ∈ δ1(p, a) and q′ ∈ δ2(q, a), for all a ∈ Σ1 ∩ Σ2.
This construction always generates an FSA with |Q1||Q2| states, in the best case as well as in the worst case.
But a lot of these states may not contribute to the language.
Closure properties and algebra of finite state acceptors
Intersection: useless states
Definition (Inaccessible and non-coaccessible states)
Let A be a finite-state automaton (acceptor or transducer) with start state q0.
A state q in A is called inaccessible if there is no path in A from q0 to q.
A state q in A is called non-coaccessible if there is no path in A from q to a final state of A.
A state is called useless if it is inaccessible or non-coaccessible.
A finite-state automaton A is called trim or connected if it has no useless states.
Closure properties and algebra of finite state acceptors
Intersection: removal of useless states
Algorithm connect(A)
Require: FSM A with start state q0, state set Q and final state set F
Ensure: A without useless states
1: Perform a depth-first search starting at q0 and mark each visited state
2: Delete each unmarked state q and all its ingoing and outgoing transitions
3: Reverse A
4: Unmark all states in Q
5: Perform a depth-first search starting at all states q ∈ F and mark each visited state
6: Delete each unmarked state q and all its ingoing and outgoing transitions
7: Reverse A
Complexity of connect(A)
If A has |Q| states and |E| transitions, the complexity of connect(A) is in O(|Q| + |E|).
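A compact variant of connect(A) is sketched below. Instead of physically reversing A twice, it runs the second search over reversed edges and keeps the states marked by both searches, which yields the same trimmed machine. Transitions are assumed to be (p, a, q) triples; all names are illustrative.

```python
def reachable(sources, edges):
    """States reachable from `sources` along (p, q) edges, by iterative DFS."""
    succ = {}
    for p, q in edges:
        succ.setdefault(p, []).append(q)
    marked, stack = set(sources), list(sources)
    while stack:
        p = stack.pop()
        for q in succ.get(p, []):
            if q not in marked:
                marked.add(q)
                stack.append(q)
    return marked

def connect(states, q0, finals, transitions):
    """Remove inaccessible and non-coaccessible states; transitions are (p, a, q)."""
    forward = [(p, q) for p, _, q in transitions]
    backward = [(q, p) for p, _, q in transitions]   # edges of the reversed machine
    accessible = reachable({q0}, forward)
    coaccessible = reachable(set(finals), backward)
    keep = set(states) & accessible & coaccessible
    trimmed = {(p, a, q) for p, a, q in transitions if p in keep and q in keep}
    return keep, set(finals) & keep, trimmed

# Hypothetical machine: state 3 is a dead end, state 4 is inaccessible.
T = {(0, "a", 1), (1, "b", 2), (1, "c", 3), (4, "a", 2)}
keep, fin, trimmed = connect({0, 1, 2, 3, 4}, 0, {2}, T)
```

Both searches visit every state and transition at most once, matching the O(|Q| + |E|) bound stated above.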
Closure properties and algebra of finite state acceptors
Intersection: algorithm
Require: FSAs A1 = 〈Q1, Σ1, q01, F1, δ1〉 and A2 = 〈Q2, Σ2, q02, F2, δ2〉
Ensure: A = A1 ∩ A2
Q := {〈q01, q02〉}
F := {〈q01, q02〉} if q01 ∈ F1 ∧ q02 ∈ F2, otherwise F := ∅
ENQUEUE(S, 〈q01, q02〉)
while S ≠ ∅ do
    〈q1, q2〉 := DEQUEUE(S)
    for all a ∈ Σ1 ∩ Σ2 do
        for all q1′ ∈ δ1(q1, a) and q2′ ∈ δ2(q2, a) do
            δ(〈q1, q2〉, a) := δ(〈q1, q2〉, a) ∪ {〈q1′, q2′〉}
            if 〈q1′, q2′〉 ∉ Q then
                Q := Q ∪ {〈q1′, q2′〉}
                if q1′ ∈ F1 ∧ q2′ ∈ F2 then
                    F := F ∪ {〈q1′, q2′〉}
                end if
                ENQUEUE(S, 〈q1′, q2′〉)
            end if
        end for
    end for
end while
CONNECT(A)
return A
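The queue-based algorithm can be rendered in Python roughly as follows. This is a sketch under the assumption that both δ functions map (state, symbol) to sets of states; it omits the final CONNECT call, so non-coaccessible pairs may remain, and the example acceptors are hypothetical.

```python
from collections import deque

def intersect(q01, F1, delta1, q02, F2, delta2, alphabet):
    """Product construction restricted to pairs reachable from the start pair.

    delta1 and delta2 map (state, symbol) -> set of states.
    Returns (Q, start, F, delta) of the intersection acceptor.
    """
    start = (q01, q02)
    Q, F, delta = {start}, set(), {}
    queue = deque([start])
    while queue:
        q1, q2 = queue.popleft()
        if q1 in F1 and q2 in F2:           # a pair is final iff both parts are
            F.add((q1, q2))
        for a in alphabet:
            for p1 in delta1.get((q1, a), ()):
                for p2 in delta2.get((q2, a), ()):
                    delta.setdefault(((q1, q2), a), set()).add((p1, p2))
                    if (p1, p2) not in Q:
                        Q.add((p1, p2))
                        queue.append((p1, p2))
    return Q, start, F, delta

# Hypothetical acceptors over {a, b}: A1 accepts words with an even number
# of a's, A2 accepts words ending in b.
d1 = {(0, "a"): {1}, (1, "a"): {0}, (0, "b"): {0}, (1, "b"): {1}}
d2 = {(0, "a"): {0}, (0, "b"): {1}, (1, "a"): {0}, (1, "b"): {1}}
Q, start, F, delta = intersect(0, {0}, d1, 0, {1}, d2, "ab")
```

Building pairs on demand means that inaccessible pairs are never created, unlike the purely mathematical definition with its |Q1||Q2| states.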
Closure properties and algebra of finite state acceptors
Intersection: practical importance
Closure under intersection means that we can develop constraints independently of each other and then enforce their validity simultaneously by intersecting them.
A lot of finite-state based NLP is based on intersection: (two-level) morphology, constraint-based grammar, pattern matching etc.
Closure properties and algebra of finite state acceptors
Difference
Definition (Difference)
Let A1 and A2 be two FSAs. The difference A1 − A2 is defined as:
A1 − A2 ≡ A1 ∩ Ā2, where Ā2 is the complement of A2
Closure properties and algebra of finite state acceptors
Difference
Example (Difference)
[Figure: acceptors A1 and A2 and the acceptor for A1 − A2]
Closure properties and algebra of finite state acceptors
Substitution
Definition (Substitution)
A substitution is a mapping s : Σ ↦ 2^(∆*) for two alphabets Σ and ∆.
s is generalized to s* : Σ* ↦ 2^(∆*) by:
s*(ε) = {ε}
s*(xa) = s*(x)s(a)
Theorem (Closure under substitution)
The set of regular languages is closed under substitution with regular languages.
Note
A lot of finite-state based NLP is based on closure under substitution.
Closure properties and algebra of finite state acceptors
Substitution
Example (Substitution in computational morphology)
[Figure: a morphology rule as a FSA A; acceptors A1 and A2; result of the substitution A{NSTEM = A1, NINFL = A2}]
Closure properties and algebra of finite state acceptors
Homomorphism
Definition (Homomorphism)
A homomorphism is a mapping h : Σ ↦ ∆* for two alphabets Σ and ∆.
Definition (Inverse homomorphism)
Given a homomorphism h, the inverse homomorphic image h^−1 of a language L is defined as: h^−1(L) = {w | h(w) ∈ L}
Theorem (Closure under homomorphism)
The set of regular languages is closed under homomorphism and inverse homomorphism.
Outline
4 Closure properties and algebra of finite state transducers
Projection
Composition
Cross product
Inversion
Intersection
Closure properties and algebra of finite state transducers
The set of finite state transducers is closed under
Union
Concatenation
Closure
Reversal
Projection (note that this leads to FSAs)
Composition
Inversion
Finite state transducers are not closed under
Complementation
Intersection (but acyclic and ε-free transducers are)
Difference
Closure properties and algebra of finite state transducers
Projection
Definition (First and second projection)
Let T = 〈Q, Σ, ∆, q0, F, S〉 be a transducer with transition set S.
The first projection of T – symbolically π1(T) – is the FSA A = 〈Q, Σ, q0, F, δ〉 where
∀a ∈ Σ ∪ {ε}: δ(p, a) = {q | ∃b ∈ ∆ ∪ {ε} : 〈p, a, b, q〉 ∈ S}
The second projection of T – symbolically π2(T) – is the FSA A = 〈Q, ∆, q0, F, δ〉 where
∀b ∈ ∆ ∪ {ε}: δ(p, b) = {q | ∃a ∈ Σ ∪ {ε} : 〈p, a, b, q〉 ∈ S}
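Both projections amount to dropping one label from every transition. The sketch below assumes a normalized transducer given as a set of (p, a, b, q) tuples with the empty string standing for ε; the example transducer is hypothetical.

```python
EPS = ""  # assumption: the empty string stands for ε

def project(transitions, side):
    """First (side=1) or second (side=2) projection of a normalized transducer.

    transitions is a set of (p, a, b, q); the result is the acceptor's
    transition function as a dict (p, label) -> set of successor states.
    """
    delta = {}
    for p, a, b, q in transitions:
        label = a if side == 1 else b
        delta.setdefault((p, label), set()).add(q)
    return delta

# Hypothetical transducer mapping "ab" to "x": a:x followed by b:ε.
S = {(0, "a", "x", 1), (1, "b", EPS, 2)}
pi1 = project(S, 1)
pi2 = project(S, 2)
```

States, start state and final states are carried over unchanged; note that the second projection here contains an ε-transition.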
Closure properties and algebra of finite state transducers
Projection
Example (Projection)
[Figure: transducer T and its projections π1(T) and π2(T)]
Closure properties and algebra of finite state transducers
Composition
Composing a transducer T1 with a transducer T2 (formally T1 ◦ T2) means: take some input u for T1, collect the output v of T1, feed it as input into T2 and collect the output w of T2.
Definition (Composition relation)
Let T1 = 〈Q1, Σ1, ∆1, q01, F1, S1〉 and T2 = 〈Q2, Σ2, ∆2, q02, F2, S2〉 be transducers.
L(T1 ◦ T2) = {〈u, w〉 ∈ Σ1* × ∆2* | ∃v ∈ ∆1* ∩ Σ2* : 〈u, v〉 ∈ L(T1) ∧ 〈v, w〉 ∈ L(T2)}
Closure properties and algebra of finite state transducers
Composition
Definition (ε-free composition)
Let T1 = 〈Q1, Σ1, ∆1, q01, F1, E1〉 and T2 = 〈Q2, Σ2, ∆2, q02, F2, E2〉 be two normalized, ε-free FSTs. T1 ◦ T2, the composition of T1 and T2, is the transducer
T = 〈Q1 × Q2, Σ1, ∆2, 〈q01, q02〉, F1 × F2, E〉 where
E = {〈〈p, q〉, a, b, 〈p′, q′〉〉 | ∃c ∈ ∆1 ∩ Σ2 : 〈p, a, c, p′〉 ∈ E1 ∧ 〈q, c, b, q′〉 ∈ E2}
Properties of composition
The composition operation is not commutative, that is, in general: T1 ◦ T2 ≠ T2 ◦ T1
The composition operation is associative, that is:
T1 ◦ T2 ◦ T3 = (T1 ◦ T2) ◦ T3 = T1 ◦ (T2 ◦ T3)
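The transition set E of the ε-free composition can be computed by matching the output label of each T1 transition with the input label of each T2 transition. The following sketch covers only E (state sets, start and final states follow the definition); the helper transduce and the two one-state transducers are assumptions added for illustration.

```python
def compose(E1, E2):
    """Transitions of T1 ∘ T2 for ε-free transducers given as sets of (p, a, b, p')."""
    by_input = {}
    for q, c, b, q2 in E2:                       # index T2 by input symbol c
        by_input.setdefault(c, []).append((q, b, q2))
    E = set()
    for p, a, c, p2 in E1:
        for q, b, q2 in by_input.get(c, []):     # T1 output c meets T2 input c
            E.add(((p, q), a, b, (p2, q2)))
    return E

def transduce(E, q0, finals, word):
    """All outputs of an ε-free transducer for `word` (a set of strings)."""
    configs = {(q0, "")}
    for a in word:
        configs = {(q, out + b) for (p, out) in configs
                   for (p1, x, b, q) in E if p1 == p and x == a}
    return {out for q, out in configs if q in finals}

# Hypothetical one-state transducers: T1 swaps a and b, T2 maps a->x and b->y.
E1 = {(0, "a", "b", 0), (0, "b", "a", 0)}
E2 = {(0, "a", "x", 0), (0, "b", "y", 0)}
E12 = compose(E1, E2)
E21 = compose(E2, E1)
```

With these inputs T1 ◦ T2 maps ab to yx, while T2 ◦ T1 has no transitions at all because T2 never outputs a symbol that T1 can read, a small illustration of non-commutativity.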
Closure properties and algebra of finite state transducers
Composition
How does composition work?
Whenever T1 contains a transition p –a:c→ p′
and T2 contains a transition q –c:b→ q′,
T will contain a transition 〈p, q〉 –a:b→ 〈p′, q′〉.
Closure properties and algebra of finite state transducers
Composition
Example (Composition)
[Figure: FST repeatedly mapping words to their categories ◦ FST mapping NP-patterns to NP category = composed FST]
Closure properties and algebra of finite state transducers
Application
Definition (Identity transducer)
Let A = 〈Q, Σ, q0, F, δ〉 be a FSA.
The identity transducer ID(A) is defined by 〈Q, Σ, Σ, q0, F, E〉 where
E = {〈p, a, a, q〉 | p, q ∈ Q, a ∈ Σ ∪ {ε}, q ∈ δ(p, a)}
Example (Identity transducer)
[Figure: state diagram of an identity transducer]
Definition (Application)
The application of a FST T to a FSA A – symbolically T[A] – is defined as
T[A] ≡ π2(ID(A) ◦ T)
Closure properties and algebra of finite state transducers
Application
Example
[Figure: FSA A; identity transducer ID(A); FST T; ID(A) ◦ T; T[A] = π2(ID(A) ◦ T)]
Closure properties and algebra of finite state transducers
Composition: relationship to intersection
Composition can be considered as a generalization of intersection. The intersection of two FSAs A1 and A2 can be defined as follows:
A1 ∩ A2 = π1(ID(A1) ◦ ID(A2))
So, intersecting two FSAs is done by composing their identity transducers and afterwards projecting one of the tapes. Composing two transducers X and Y means synchronizing (intersecting) their inner tapes and then combining the outer tapes.
Closure properties and algebra of finite state transducers
Composition: handling ε-transitions
It is possible to generalize the composition definition to transducers with ε-transitions:
Definition (Transducer composition)
Let T1 = 〈Q1, Σ1, ∆1, q01, F1, E1〉 and T2 = 〈Q2, Σ2, ∆2, q02, F2, E2〉 be two normalized FSTs.
T1 ◦ T2, the composition of T1 and T2, is the transducer
T = 〈Q1 × Q2, Σ1, ∆2, 〈q01, q02〉, F1 × F2, E ∪ E_ε ∪ E_{i,ε} ∪ E_{o,ε}〉
where
1 E = {〈〈p, q〉, a, b, 〈p′, q′〉〉 | ∃c ∈ ∆1 ∩ Σ2 : 〈p, a, c, p′〉 ∈ E1 ∧ 〈q, c, b, q′〉 ∈ E2}
2 E_ε = {〈〈p, q〉, a, b, 〈p′, q′〉〉 | 〈p, a, ε, p′〉 ∈ E1 ∧ 〈q, ε, b, q′〉 ∈ E2}
3 E_{i,ε} = {〈〈p, q〉, ε, a, 〈p, q′〉〉 | 〈q, ε, a, q′〉 ∈ E2 ∧ p ∈ Q1}
4 E_{o,ε} = {〈〈p, q〉, a, ε, 〈p′, q〉〉 | 〈p, a, ε, p′〉 ∈ E1 ∧ q ∈ Q2}
Closure properties and algebra of finite state transducers
Composition: handling ε-transitions
There are four different ways in which ε and alphabet symbols on the second tape of T1 and the first tape of T2 can interact:
1 T1 contains an a:c-transition and T2 contains a c:b-transition: this is handled in the same way as in the ε-free case.
2 T1 contains an a:ε-transition and T2 contains an ε:b-transition → T contains an a:b-transition. That is: ε is treated as a regular symbol.
3 T1 "stays" in the same state, T2 moves on: T2 contains q –ε:a→ q′, so T contains 〈p, q〉 –ε:a→ 〈p, q′〉.
4 T1 moves on, T2 "stays" in the same state: T1 contains p –a:ε→ p′, so T contains 〈p, q〉 –a:ε→ 〈p′, q〉.
Closure properties and algebra of WFSA
Composition
Example (Composition of two unweighted FSTs)
[Figure: transducers T1 and T2 and their composition T1 ◦ T2]
Closure properties and algebra of finite state transducers
Composition: Application
Composition is a very important operation for building processing or filtering cascades, for example in robust parsing and morphological analysis.
Since composition is not commutative, the order of a transducer cascade C = T1 ◦ T2 ◦ . . . ◦ Tk matters.
This may lead to problems related to feeding, counter-feeding, bleeding and counter-bleeding.
Since the composition operation is associative, the order in which the compositions in C are computed does not matter. This leaves some degrees of freedom for implementing such cascades.
Note that the state complexity of T1 ◦ T2 ◦ . . . ◦ Tk is |Q1||Q2| . . . |Qk| in the worst case.
Closure properties and algebra of finite state transducers
Cross product
Definition ((Cartesian) Product)
Given two sets S1 and S2, the Cartesian product S1 × S2 is defined as:
S1 × S2 = {〈x, y〉 | x ∈ S1 ∧ y ∈ S2}
Theorem (Product of regular sets)
Let A1 = 〈Q1, Σ1, q01, F1, δ1〉 and A2 = 〈Q2, Σ2, q02, F2, δ2〉 be two finite-state acceptors. Then L(A1) × L(A2) is representable by a finite-state transducer A1 × A2.
Proof.
A1 × A2 ≡ ID(A1) ◦ T_{Σ1* ↦ ε} ◦ T_{ε ↦ Σ2*} ◦ ID(A2)
Note
Cross product is the core of all replacement operations.
Closure properties and algebra of finite state transducers
Cross product
Example (Cross product)
[Figure: the acceptors A1 and A2, the auxiliary transducers T_{Σ1∗ ↦ ε} and T_{ε ↦ Σ2∗}, and the resulting cross product A1 × A2]
Closure properties and algebra of finite state transducers
Inversion
Definition (Inversion)
Let T = 〈Q, Σ, ∆, q0, F, E〉 be a transducer. The inversion of T, symbolically T⁻¹, is the FST T⁻¹ = 〈Q, ∆, Σ, q0, F, E⁻¹〉 where E⁻¹ = {〈p, b, a, q〉 | 〈p, a, b, q〉 ∈ E}.
Note
Thus, inversion simply exchanges the input and output "tapes" of a transducer.
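Sketch (inversion)
As a minimal illustration of the definition, the Python sketch below swaps the two labels of every transition. The dictionary layout, the function name invert and the toy analyser T_MORPH (including its tag +PL) are hypothetical and only serve the example; the alphabets Σ and ∆ are left implicit.

```python
def invert(t):
    """Swap the input and output label of every transition (p, a, b, q)."""
    return {
        "start": t["start"],
        "finals": set(t["finals"]),
        "edges": {(p, b, a, q) for (p, a, b, q) in t["edges"]},
    }

# Hypothetical toy analyser: maps "cats" to "cat" followed by a plural tag.
T_MORPH = {
    "start": 0,
    "finals": {4},
    "edges": {
        (0, "c", "c", 1),
        (1, "a", "a", 2),
        (2, "t", "t", 3),
        (3, "s", "+PL", 4),
    },
}

T_GEN = invert(T_MORPH)            # generation direction: tags to word forms
print((3, "+PL", "s", 4) in T_GEN["edges"])
print(invert(T_GEN) == T_MORPH)    # inverting twice gives back the original
```

States, start state and final states are untouched, which is why inversion costs only one pass over the transitions.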
Closure properties and algebra of finite state transducers
Inversion
Example (Morphological analysis vs. generation)
[Figure: FST T_Morph mapping words to morphological categories]
[Figure: FST T_Morph⁻¹ mapping morphological categories to words]
Closure properties and algebra of finite state transducers
Why are FSTs not closed under intersection?
Example
[Figure: the transducers T_{a^n ↦ b^n c∗} and T_{a^n ↦ b∗ c^n}]
The intersection of T_{a^n ↦ b^n c∗} and T_{a^n ↦ b∗ c^n} would result in the relation R = {〈a^n, b^n c^n〉 | n ≥ 0}, which is not regular and thus not representable by a finite-state transducer.
Note
This has consequences for creating applications based on finite-state transducers. They cannot be based on the intersection of constraints represented as transducers.
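Sketch (the intersection on a finite slice)
The Python lines below do not prove anything about regularity; they only enumerate a finite slice of the two relations (all exponents bounded by an assumed limit max_n) and confirm that the pairs common to both slices have the shape 〈a^n, b^n c^n〉. The function names are hypothetical.

```python
def rel_bn_cstar(max_n):
    """Finite slice of {<a^n, b^n c^m>} with n, m <= max_n."""
    return {("a" * n, "b" * n + "c" * m)
            for n in range(max_n + 1) for m in range(max_n + 1)}

def rel_bstar_cn(max_n):
    """Finite slice of {<a^n, b^m c^n>} with n, m <= max_n."""
    return {("a" * n, "b" * m + "c" * n)
            for n in range(max_n + 1) for m in range(max_n + 1)}

common = rel_bn_cstar(3) & rel_bstar_cn(3)
for pair in sorted(common, key=lambda p: len(p[0])):
    print(pair)
```

Each surviving pair forces the number of b's and the number of c's to equal the number of a's, which is the counting dependency that a finite-state device cannot enforce for unbounded n.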
Closure properties and algebra of finite state transducers
Why are FSTs not closed under intersection?
Intuitively, the existence of ε within loops, which leads to infinite ambiguity, is the reason why FSTs are not closed under intersection.
Thus, ε-free FSTs (also called equal-length transducers) are closed under intersection.
The same is true for acyclic FSTs, where we have some freedom where to realize the ε-transitions.
By De Morgan, and since FSTs are closed under union, non-closure under intersection leads to non-closure under complementation.
Outline
5 Equivalence transformations on finite-state acceptors
ε-Removal
Determinization
Minimization
Equivalence transformations on finite-state acceptors
Equivalence transformations
Equivalence transformations are operations on automata which change the topology of an automaton without changing its language.
They usually serve optimization purposes, that is, they create smaller and/or faster automata.
Finite-state acceptors admit the following equivalence transformations:
ε-Removal
Determinization
Minimization
Equivalence transformations on finite-state acceptors
ε-Removal
Definition (→_w)
Let p and q be states in Q and let w be a string in Σ∗. Let →_w ⊆ Q × Q be the relation such that 〈p, q〉 ∈ →_w if there is a path labeled with w from p to q.
Definition (ε-closure)
Given an NFA A = 〈Q, Σ, q0, F, δ〉, ε-closure(q) = {q} ∪ {p ∈ Q | q →_ε p}. Applied to a set of states, ε-closure is the union of the closures of its members.
Definition (ε-free FSA)
Let A = 〈Q, Σ, q0, F, δ〉 be an FSA. Define A′, the equivalent ε-free FSA with L(A′) = L(A), as A′ = 〈Q, Σ, q0, F′, δ′〉 where:
δ′(q, a) = ε-closure(⋃_{q′ ∈ ε-closure(q)} δ(q′, a)), ∀q ∈ Q, a ∈ Σ
F′ = F ∪ {q0} if ε-closure(q0) ∩ F ≠ ∅, else F′ = F.
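Sketch (ε-removal)
The definitions above can be sketched in Python as follows. This is an illustration under assumptions, not a reference implementation: the transition function is a dict from (state, symbol) to a set of states, ε is written as the empty string, and all names as well as the example automaton (for a∗b∗) are hypothetical.

```python
def eps_closure(states, delta):
    """All states reachable from `states` by ε-transitions (ε is written "")."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for p in delta.get((q, ""), ()):
            if p not in closure:
                closure.add(p)
                stack.append(p)
    return closure

def remove_epsilon(states, alphabet, start, finals, delta):
    """Return (δ', F') following the definition of the ε-free FSA."""
    new_delta = {}
    for q in states:
        for a in alphabet:
            step = set()
            for q1 in eps_closure({q}, delta):
                step |= delta.get((q1, a), set())
            targets = eps_closure(step, delta)
            if targets:
                new_delta[(q, a)] = targets
    new_finals = set(finals)
    if eps_closure({start}, delta) & set(finals):
        new_finals.add(start)
    return new_delta, new_finals

# Hypothetical NFA for a*b*: 0 -ε-> 1, 1 -a-> 1, 1 -ε-> 2, 2 -b-> 2, F = {2}.
STATES, ALPHABET = {0, 1, 2}, {"a", "b"}
DELTA = {(0, ""): {1}, (1, "a"): {1}, (1, ""): {2}, (2, "b"): {2}}

new_delta, new_finals = remove_epsilon(STATES, ALPHABET, 0, {2}, DELTA)
print(new_delta[(0, "a")])  # {1, 2}
print(new_finals)           # {0, 2}
```

The state set is unchanged; only transitions and final states are recomputed, so some states may become inaccessible afterwards.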
Equivalence transformations on finite-state acceptors
Determinization
Definition (Subset construction)
Let A = 〈Q, Σ, q0, F, δ〉 be an ε-free FSA. Define A′, the equivalent DFA with L(A′) = L(A), as A′ = 〈2^Q, Σ, {q0}, F′, δ′〉 with:
F′ = {S ⊆ Q | S ∩ F ≠ ∅}
δ′(S, a) = ⋃_{q ∈ S} δ(q, a), ∀a ∈ Σ, ∀S ⊆ Q
Note
An algorithm which implements this definition naively is exponential in |Q|, since it constructs all 2^|Q| subset states.
In the typical case, most of the subset states in the DFA are not accessible or not coaccessible.
A better algorithm based on a state queue avoids inaccessible states.
But this does not change the complexity in the worst case.
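Sketch (queue-based subset construction)
The Python sketch below is one way to realize the queue-based variant mentioned in the note. It is an illustration under assumptions: the input is ε-free, δ is a dict from (state, symbol) to a set of states, empty target sets are left undefined instead of being added as a sink state, and the names and the example NFA are hypothetical.

```python
from collections import deque

def determinize(alphabet, start, finals, delta):
    """Subset construction that only creates accessible subset states."""
    start_set = frozenset({start})
    dfa_delta, dfa_finals = {}, set()
    seen, queue = {start_set}, deque([start_set])
    while queue:
        subset = queue.popleft()
        if subset & finals:
            dfa_finals.add(subset)
        for a in alphabet:
            target = frozenset(p for q in subset for p in delta.get((q, a), ()))
            if not target:
                continue  # assumption: keep the DFA partial, no sink state
            dfa_delta[(subset, a)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return start_set, dfa_finals, dfa_delta, seen

# Hypothetical NFA: words over {a, b} whose second-to-last symbol is a.
NFA_DELTA = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "a"): {2}, (1, "b"): {2}}
dfa_start, dfa_finals, dfa_delta, dfa_states = determinize(
    ["a", "b"], 0, {2}, NFA_DELTA)

print(len(dfa_states))  # 4 accessible subset states out of 2**3 = 8
```

In this example only half of the possible subsets are ever created, but languages exist for which every accessible-state strategy still yields exponentially many states.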
Equivalence transformations on finite-state acceptors
Minimization
Definition (Minimal DFA)
Let A = 〈Q, Σ, q0, F, δ〉 be a DFA. A is minimal if for every DFA A′ = 〈Q′, Σ′, q′0, F′, δ′〉: L(A′) = L(A) ⇒ |Q| ≤ |Q′|
Definition (Right language)
Let A = 〈Q, Σ, q0, F, δ〉 be a DFA. The right language of a state q ∈ Q, symbolically →L(q), is defined as:
→L(q) = {w ∈ Σ∗ | δ∗(q, w) ∈ F}
Equivalence transformations on finite-state acceptors
Minimization
Definition (Equivalent states)
Two states p and q are called equivalent if →L(p) = →L(q).
This holistic definition based on right languages can be turned into a recursive definition of equivalence of states:
Definition (State equivalence ≡)
Let A = 〈Q, Σ, q0, F, δ〉 be a DFA. Two states p and q are called equivalent, symbolically p ≡ q, if:
(p ∈ F ⇔ q ∈ F) ∧ ∀a ∈ Σ : δ(p, a) ≡ δ(q, a).
Equivalence transformations on finite-state acceptors
Minimization
Based on state equivalence, we arrive at a characterization of a minimal DFA:
Theorem (Minimal DFA I)
Let A = 〈Q, Σ, q0, F, δ〉 be a DFA in which every state is accessible. A is minimal iff
∀p, q ∈ Q : p ≠ q ⇒ →L(p) ≠ →L(q).
By substituting the recursive definition of state equivalence into the last theorem, we arrive at:
Theorem (Minimal DFA II)
Let A = 〈Q, Σ, q0, F, δ〉 be a DFA in which every state is accessible. A is minimal iff
∀p, q ∈ Q : p ≠ q ⇒ (p ∈ F ⇎ q ∈ F) ∨ ∃a ∈ Σ : δ(p, a) ≢ δ(q, a).
Equivalence transformations on finite-state acceptors
Minimization: Myhill-Nerode theorem
Theorem (Myhill-Nerode)
The following propositions are equivalent:
1 L is recognized by a DFA AL = 〈Q, Σ, q0, F, δ〉.
2 L is the union of some equivalence classes of a right-invariant equivalence relation R with finite index.
3 RL (x RL y iff ∀z ∈ Σ∗ : xz ∈ L ⇔ yz ∈ L) is of finite index.
Notes
The Myhill-Nerode theorem links states with subsets of Σ∗. It is central to the theorem that the number of states in a DFA is finite.
The Myhill-Nerode theorem assumes complete DFAs, that is, the corresponding transition function δ is total.
Equivalence transformations on finite-state acceptors
Minimization: Myhill-Nerode theorem
Definition (Equivalence relation, equivalence class)
Let S be a set and E ⊆ S × S a binary relation. E is called an equivalence relation if E is reflexive, symmetric and transitive. If E is an equivalence relation, we call [x]E = {y | x E y} the equivalence class of x wrt E.
Properties of equivalence relations
1 x ∈ [x]E, ∀x ∈ S
2 [x]E = [y]E ∨ [x]E ∩ [y]E = ∅, ∀x, y ∈ S
3 ⋃_{x ∈ S} [x]E = S
Definition (Index of an equivalence relation)
Let E be an equivalence relation. The index IE of E is the number of E's equivalence classes. E is of finite index if IE is finite.
Equivalence transformations on finite-state acceptors
Minimization: Myhill-Nerode theorem
Definition (Right-invariant equivalence relation)
Let R be an equivalence relation over Σ∗. R is called right-invariant (with respect to concatenation) if
∀x, y, z ∈ Σ∗ : x R y ⇒ xz R yz
Definition (Left language)
Let A = 〈Q, Σ, q0, F, δ〉 be a DFA. Define the left language ←L(p) of a state p ∈ Q as ←L(p) = {w ∈ Σ∗ | δ∗(q0, w) = p}.
Equivalence transformations on finite-state acceptors
Minimization: Myhill-Nerode theorem
Myhill-Nerode theorem.
We prove the theorem by chaining 1 ⇒ 2, 2 ⇒ 3 and 3 ⇒ 1.
1 ⇒ 2.
Let A be a DFA recognizing L. Define RA as x RA y if δ∗(q0, x) = δ∗(q0, y), ∀x, y ∈ Σ∗.
Subproof: RA is a right-invariant equivalence relation of finite index (the index of RA is at most |Q|). The equivalence classes of RA are the left languages ←L(p), ∀p ∈ Q, and
⋃_{q ∈ F} ←L(q) = L.
Example
[Figure: DFA with states 1, 2 and 3]
←L(1) = {ε, a(ba)∗b}
←L(2) = {a(ba)∗}
←L(3) = {a(ba)∗c(d)∗}
L = ←L(2) ∪ ←L(3)
Equivalence transformations on finite-state acceptors
Minimization: Myhill-Nerode theorem
Myhill-Nerode theorem (continued).
2 ⇒ 3.
R = RA is a refinement of RL, that is, every equivalence class of RA is contained in some equivalence class of RL.
1 Assume that x RA y.
2 Since RA is right-invariant, xz RA yz, for all z ∈ Σ∗.
3 Since L is a union of equivalence classes of RA, xz ∈ L if and only if yz ∈ L.
4 Thus x RL y, and the equivalence class of x wrt RA is contained in the equivalence class of x wrt RL.
5 Since the index of RA is finite (at most equal to |Q|) and the index of RL is less than or equal to the index of RA, we conclude that RL is of finite index.
Equivalence transformations on finite-state acceptors
Minimization: Myhill-Nerode theorem
Example (RA is a refinement of RL)
[Figure: automata for the example; the tables below list selected states]
State  Corresponding equiv. class of RA
3      e
9      frie
15     dog
17     doll
19     dollar
22     coll
24     collar

State  Corresponding equiv. class of RL
7      dog, collar, dollar, end, friend, frog
8      e, frie
12     doll
16     coll
Equivalence transformations on finite-state acceptors
Minimization: Myhill-Nerode theorem
Myhill-Nerode theorem (continued).
3 ⇒ 1.
Given RL, construct a new FSA A′ = 〈Q′, Σ, q′0, F′, δ′〉 as follows:
1 Q′ = {[x] | [x] is an equivalence class of Σ∗ under RL}
2 q′0 = [ε]
3 δ′ : Q′ × Σ ↦ Q′ : δ′([x], a) = [xa], ∀[x] ∈ Q′ ∧ a ∈ Σ
4 F′ = {[x] | x ∈ L}
Example
δ′([e], n) = [en]      δ′({frie, e}, n) = {frien, en}
δ′([en], d) = [end]    δ′({frien, en}, d) = {friend, end} (a)
(a) Inspired by CAKE: "Friend is a four-letter word"
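Sketch (classes of RL for a finite language)
For a finite language the classes of RL can be computed directly, because x RL y holds exactly when x and y have the same residual {z | xz ∈ L}. The Python sketch below groups the prefixes of a hypothetical two-word language by residual; the function name and the word list are assumptions made for illustration, and all strings that are not prefixes of a word fall into one further class (empty residual) that is not listed.

```python
def nerode_classes(language):
    """Group the prefixes of a finite language by their residual {z | xz in L}."""
    prefixes = {word[:i] for word in language for i in range(len(word) + 1)}
    classes = {}
    for x in prefixes:
        residual = frozenset(w[len(x):] for w in language if w.startswith(x))
        classes.setdefault(residual, set()).add(x)
    return classes

LANGUAGE = {"end", "friend"}  # hypothetical example language
classes = nerode_classes(LANGUAGE)

print(sorted(classes[frozenset({"nd"})]))  # ['e', 'frie']
print(sorted(classes[frozenset({"d"})]))   # ['en', 'frien']
print(len(classes))                        # 7 classes, plus the unlisted sink class
```

Reading a further n from the class {e, frie} leads to the class {en, frien}, which is the transition δ′([x], a) = [xa] from the construction above.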
Equivalence transformations on finite-state acceptors
Minimization: algorithms
Approaches
1 Union-Find-based: Find all states p and q with →L(p) = →L(q) and merge them.
2 Partition-based: Starting from a coarse partition whose sets are pairwise non-equivalent (for example F and Q \ F), partition these sets further until each set contains only equivalent states.
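Sketch (partition-based minimization)
The second approach can be sketched in Python as a simple Moore-style refinement. This is an illustrative variant, not necessarily the algorithm the slides have in mind: it assumes a complete DFA whose states are all accessible, a total transition dict delta[(state, symbol)], and hypothetical names throughout.

```python
def equivalence_blocks(states, alphabet, finals, delta):
    """Refine the partition {F, Q \\ F} until it is stable; return the blocks."""
    symbols = sorted(alphabet)
    block_of = {q: int(q in finals) for q in states}
    while True:
        ids, refined = {}, {}
        for q in states:
            # Two states stay together only if they are in the same block and
            # every symbol leads them into the same block.
            signature = (block_of[q],) + tuple(
                block_of[delta[(q, a)]] for a in symbols)
            refined[q] = ids.setdefault(signature, len(ids))
        if len(ids) == len(set(block_of.values())):
            break  # no block was split: the partition is stable
        block_of = refined
    blocks = {}
    for q in states:
        blocks.setdefault(block_of[q], set()).add(q)
    return sorted(sorted(block) for block in blocks.values())

# Hypothetical DFA over {a, b} for "ends in a", with a redundant final state.
DFA_DELTA = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 2,
             (1, "b"): 0, (2, "a"): 1, (2, "b"): 0}
print(equivalence_blocks({0, 1, 2}, {"a", "b"}, {1, 2}, DFA_DELTA))
# [[0], [1, 2]]
```

Each resulting block becomes one state of the minimal DFA; states 1 and 2 are merged because they have the same right language.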
Outline
6 Equivalence transformations on finite-state transducers
Outline
7 Decidability properties of unweighted finite-state acceptors and transducers
Decidability properties of unweighted finite-state acceptors and transducers
Given two finite-state acceptors A and A′, the following properties are decidable:
L(A) = ∅
L(A) = Σ∗
L(A) = L(A′)
L(A) ⊆ L(A′)
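Sketch (deciding emptiness, inclusion and equivalence)
For acceptors, these properties reduce to reachability questions. The Python sketch below assumes complete DFAs given as triples (start, finals, delta) with a total transition dict; the function names and the three example automata are hypothetical. Universality, L(A) = Σ∗, can be checked in the same way by testing the complement for emptiness.

```python
from collections import deque

def is_empty(start, finals, delta):
    """L(A) = ∅ iff no final state is accessible from the start state."""
    seen, queue = {start}, deque([start])
    while queue:
        q = queue.popleft()
        if q in finals:
            return False
        for (p, _symbol), target in delta.items():
            if p == q and target not in seen:
                seen.add(target)
                queue.append(target)
    return True

def is_subset(alphabet, dfa1, dfa2):
    """L(A1) ⊆ L(A2) iff no accessible pair is final in A1 but not in A2."""
    (s1, f1, d1), (s2, f2, d2) = dfa1, dfa2
    seen, queue = {(s1, s2)}, deque([(s1, s2)])
    while queue:
        p, q = queue.popleft()
        if p in f1 and q not in f2:
            return False
        for a in alphabet:
            pair = (d1[(p, a)], d2[(q, a)])
            if pair not in seen:
                seen.add(pair)
                queue.append(pair)
    return True

def is_equivalent(alphabet, dfa1, dfa2):
    """L(A1) = L(A2) iff each language is included in the other."""
    return is_subset(alphabet, dfa1, dfa2) and is_subset(alphabet, dfa2, dfa1)

# Hypothetical DFAs over {a}: even length (two versions) and length divisible by 4.
MOD4 = {(i, "a"): (i + 1) % 4 for i in range(4)}
EVEN_SMALL = (0, {0}, {(0, "a"): 1, (1, "a"): 0})
EVEN_LARGE = (0, {0, 2}, MOD4)
DIV_BY_4 = (0, {0}, MOD4)

print(is_equivalent(["a"], EVEN_SMALL, EVEN_LARGE))  # True
print(is_subset(["a"], DIV_BY_4, EVEN_SMALL))        # True
print(is_subset(["a"], EVEN_SMALL, DIV_BY_4))        # False
```

The pair search visits at most |Q| · |Q′| pairs of states, which is why these questions are decidable for acceptors.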
Given a finite-state transducer T, the following property is decidable:
T is functional
Given two finite-state transducers T and T′, the following property is undecidable:
L(T) = L(T′)
Version history
16.10.08: version 0.1 (initial version)
20.10.08: version 0.2 (some error corrections, added definition of composition of FSTs with ε-transitions)
09.11.08: version 0.3 (added example for ε-composition, enhanced example for cross product, added subtitles)
07.12.08: version 0.4 (completed minimization section)