
Finite-state Machines: Theory and Applications

Unweighted Finite-state Automata

Thomas Hanneforth

Institut für Linguistik

Universität Potsdam

December 10, 2008


Overview

1 Applications of finite-state machines

2 Finite state acceptors and transducers: formal characterization

3 Closure properties and algebra of finite state acceptors

4 Closure properties and algebra of finite state transducers

5 Equivalence transformations on finite state acceptors

6 Equivalence transformations on finite state transducers

7 Decidability properties of unweighted finite-state acceptors and transducers


Outline

1 Applications of Finite-State Machines


Some Applications of Finite-State Machines in Computational Linguistics

Morphological analysis: lemmatization, word segmentation, segmentation disambiguation

Spelling correction

Lexicon representation

Part-of-Speech-Tagging

Shallow parsing, Chunking

Speech recognition, speech synthesis

Optimality theory

Language modeling & Statistical language processing

Statistical machine translation


Outline

2 Finite state acceptors and transducers: formal characterization

Finite-state acceptors

Finite-state transducers


Finite state acceptors and transducers:

formal characterization

Finite-state acceptors

Non-deterministic finite-state acceptor

Definition (Non-deterministic finite-state acceptor (NFA))

A non-deterministic finite-state acceptor A is a 5-tuple 〈Q, Σ, q0, F, δ〉 where

Q is a non-empty set of states

Σ is a non-empty set, called the alphabet of A

q0 ∈ Q, the start state

F ⊆ Q, the set of final states

δ : Q × (Σ ∪ {ε}) → 2^Q, the transition function.

Notes

δ may be a partial function (and usually is).

Nondeterminism: the transition function δ maps a state q and an alphabet symbol a to a set of successor states.

A transition may be labeled with ε, the neutral element of concatenation.


Finite-state acceptors

Example (Nondeterministic FSA A_lex accepting some animal names)

[Figure: state diagram of A_lex]


Finite-state acceptors

Deterministic finite-state acceptor

Definition (Deterministic finite-state acceptor (DFA))

A deterministic finite-state acceptor A is a 5-tuple 〈Q, Σ, q0, F, δ〉, where

Q, q0 and F are defined as for the NFA

Σ is a non-empty set, called the alphabet of A

δ : Q × Σ → Q, the transition function.

Notes

Again, δ may be a partial function.

DFAs are by definition ε-free, that is, they contain no ε-transitions.

DFAs and NFAs have the same generative power, that is, both concepts are equivalent (cf. subset construction).


Finite-state acceptors

Example (Deterministic version of A_lex)

[Figure: state diagram of the deterministic version of A_lex]

Deterministic acyclic FSAs are also called tries. Tries are useful for lexicon representation.


Finite-state acceptors

Examples

Example (DFSA A_NP accepting English noun phrase patterns)

[Figure: state diagram of A_NP]

Example (DFSA A_English accepting some English sentences)

[Figure: state diagram of A_English]


Finite-state acceptors

Generalized transition function, language

Definition (Generalized transition function δ∗)

δ∗ is the reflexive and transitive closure of δ:

δ∗(q, ε) = q, ∀q ∈ Q

δ∗(q, aw) = δ∗(δ(q, a), w)

[Figure: example state diagram]

Definition (Language of a DFSA A)

L(A) = {w ∈ Σ∗ | δ∗(q0, w) ∈ F}

We also say that L(A) is recognized by A.

Definition (Regular language)

A language is called regular if there exists some DFA which recognizes it.
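The recursive definition of δ∗ carries over almost literally to code. The following is a minimal sketch, assuming a DFA whose partial transition function is stored as a Python dictionary; the names `delta_star`, `accepts` and the example automaton `DELTA` are illustrative assumptions and not taken from the slides.

```python
from typing import Dict, Optional, Set, Tuple

# Partial transition function δ as a dictionary: (state, symbol) -> state.
Delta = Dict[Tuple[str, str], str]


def delta_star(delta: Delta, q: Optional[str], w: str) -> Optional[str]:
    """δ*(q, ε) = q and δ*(q, aw) = δ*(δ(q, a), w); None where δ is undefined."""
    if q is None or w == "":
        return q
    return delta_star(delta, delta.get((q, w[0])), w[1:])


def accepts(delta: Delta, q0: str, finals: Set[str], w: str) -> bool:
    """w ∈ L(A) iff δ*(q0, w) ∈ F."""
    return delta_star(delta, q0, w) in finals


# Hypothetical DFA with states "0" and "1" recognizing a(ba)*.
DELTA: Delta = {("0", "a"): "1", ("1", "b"): "0"}
FINALS = {"1"}

print(accepts(DELTA, "0", FINALS, "aba"))  # True
print(accepts(DELTA, "0", FINALS, "ab"))   # False
```

Because δ is partial, a missing transition is modelled as `None`, which is never a final state.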


Finite-state transducers

Definition

Definition ((Non-deterministic) finite-state transducer (NFST))

A (non-deterministic) finite-state transducer T is a 7-tuple 〈Q, Σ, ∆, q0, F, δ, σ〉, where

Q, q0 and F are defined as for the NFA

Σ is a non-empty set, called the input alphabet of T

∆ is a non-empty set, called the output alphabet of T

δ : Q × (Σ ∪ {ε}) → 2^Q, the transition function.

σ : Q × (Σ ∪ {ε}) × Q → ∆∗, the output function.


Finite-state transducers

Alternative definition

To simplify some definitions, we combine the transition and output functions into a set of transitions. In addition, we restrict the output function to single symbols or ε.

Definition (Normalized finite-state transducer)

A normalized finite-state transducer T is a 6-tuple 〈Q, Σ, ∆, q0, F, E〉, where

Q, q0 and F are defined as for the NFA

Σ is a non-empty set, the input alphabet of T

∆ is a non-empty set, the output alphabet of T

E ⊆ Q × (Σ ∪ {ε}) × (∆ ∪ {ε}) × Q, the set of transitions.

Note that every transducer can be transformed into a normalized transducer.


Finite-state transducers

Example (T_lex mapping surface forms to morph. features)

[Figure: state diagram of T_lex]

Note that fish is nondeterministically mapped to {NOUN sg, NOUN pl}.



Example (Laughter machine T_laugh)

[Figure: state diagram of T_laugh]

The input string laugh is mapped to the infinite set {ha^n | n ≥ 1}.



Example (Bracketing machine T_bracket)

[Figure: state diagram of T_bracket]

Every occurrence of ab is enclosed within brackets.

For example, the input string cabbabc is mapped to c{ab}b{ab}c by traversing the state sequence 0 0 2 3 4 0 0 2 3 4 0 0.


Finite-state transducers

Language

Definition (Language of a FST)

The language L(T) of a FST T = 〈Q, Σ, ∆, q0, F, δ, σ〉 is defined in the following way:

L(T) = {〈u, v〉 | δ∗(q0, u) ∩ F ≠ ∅ ∧ v ∈ σ∗(q0, u)}

δ∗ is recursively defined:

δ∗(q, ε) = {q} and

δ∗(q, wa) = ⋃_{q′ ∈ δ∗(q, w)} δ(q′, a)

σ∗ is the generalized output function, defined as:

σ∗(q, ε) = {ε} and

σ∗(q, wa) = σ∗(q, w) · ⋃_{q′ ∈ δ∗(q, w), p ∈ δ(q′, a)} σ(q′, a, p)


Finite-state transducers

Transduction mapping

Definition (Transduction mapping)

The transduction mapping ⟦T⟧ : Σ∗ → 2^(∆∗) of a FST T is defined as:

⟦T⟧(x) = {y | 〈x, y〉 ∈ L(T)}

Definition (Functional transducer)

A transducer T is called functional if |⟦T⟧(x)| ≤ 1 for all x ∈ Σ∗.

Example

T_bracket is functional. T_lex is not functional.

Definition (Ambiguous transducer)

A transducer T is called ambiguous if |⟦T⟧(x)| > 1 for some x ∈ Σ∗.


Finite-state transducers

Ambiguous transducers

Definition (Finitely ambiguous transducer)

A transducer T is called finitely ambiguous if |⟦T⟧(x)| is finite for all x ∈ Σ∗.

Example

T_lex is finitely ambiguous.

Definition (Infinitely ambiguous transducer)

A transducer T is called infinitely ambiguous if |⟦T⟧(x)| is infinite for some x ∈ Σ∗.

Example

T_laugh is infinitely ambiguous.
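The transduction mapping can be computed directly for small transducers. The sketch below is an illustration under stated assumptions: transitions are tuples 〈p, a, v, q〉 with an output string v ∈ ∆∗, and there is no ε on the input side, so the loop terminates and the result is always finite. The function name `transduce` and the example transitions are hypothetical; they only imitate the behaviour described for T_lex and are not the automaton from the slides.

```python
from typing import Set, Tuple

# A transition 〈p, a, v, q〉: state p, input symbol a, output string v, state q.
Edge = Tuple[int, str, str, int]


def transduce(edges: Set[Edge], q0: int, finals: Set[int], x: str) -> Set[str]:
    """Return ⟦T⟧(x) for a transducer without ε on the input side."""
    configs = {(q0, "")}  # pairs (current state, output produced so far)
    for a in x:
        configs = {
            (q, out + v)
            for (state, out) in configs
            for (p, i, v, q) in edges
            if p == state and i == a
        }
    return {out for (state, out) in configs if state in finals}


# Hypothetical fragment in the spirit of T_lex.
EDGES: Set[Edge] = {
    (0, "f", "fish", 1),
    (1, "i", "", 2),
    (2, "s", "", 3),
    (3, "h", " NOUN sg", 4),
    (3, "h", " NOUN pl", 4),
}

outputs = transduce(EDGES, 0, {4}, "fish")
print(sorted(outputs))    # ['fish NOUN pl', 'fish NOUN sg']
print(len(outputs) <= 1)  # False: this transducer is not functional
```

An infinitely ambiguous transducer such as T_laugh needs ε-input loops and therefore falls outside this sketch.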


Finite-state transducers

Deterministic finite-state transducer

Definition (Deterministic finite-state transducer (DFST))

A deterministic finite-state transducer T is a 7-tuple 〈Q, Σ, ∆, q0, F, δ, σ〉, where

Q, q0 and F are defined as for the NFA

Σ is a non-empty set, called the input alphabet of T

∆ is a non-empty set, called the output alphabet of T

δ : Q × Σ → Q, the (deterministic) transition function.

σ : Q × Σ → ∆∗, the (deterministic) output function.

Theorem

Every deterministic transducer is functional.


Outline

3 Closure properties and algebra of finite state acceptors

Union

Concatenation

Star closure

Plus closure

Reversal

Complementation

Intersection

Difference

Substitution

Homomorphism


Closure properties and algebra

of finite state acceptors

Closure properties and algebra of finite state acceptors

Definition (Closure of a set)

Let S be a set and let f_k be a k-ary function taking k-tuples over S as arguments. We say that S is closed under f_k if f_k(a1, a2, . . . , ak) ∈ S for all a1, . . . , ak ∈ S.

Note

Closure properties are important for modularity within a specific formalism. They allow complex things to be built out of simpler ones by combining them with a number of operations.


Closure properties and algebra of finite state acceptors

Closure properties of regular languages

The set of languages recognized by finite-state acceptors (the regular languages) is closed under

Union

Concatenation

Plus and star closure

Reversal

Complementation

Intersection

Difference

Homomorphism and substitution


Closure properties and algebra of finite state acceptors

Union

Example (Union of two acceptors)

[Figure: acceptors A1, A2 and their union A1 ∪ A2]


Closure properties and algebra of finite state acceptors

Concatenation

Example (Concatenation of two acceptors)

[Figure: acceptors A1, A2 and their concatenation A1 · A2]


Closure properties and algebra of finite state acceptors

Star closure

Example (Star (Kleene) closure of an acceptor)

[Figure: acceptor A and its star closure A∗]


Closure properties and algebra of finite state acceptors

Plus closure

Example (Plus closure of an acceptor)

[Figure: acceptor A and its plus closure A+]


Closure properties and algebra of finite state acceptors

Reversal

Definition (Reversal of a string)

The reversal of a string w ∈ Σ∗, denoted by w^R, is defined as:

ε^R = ε

(a · w)^R = w^R · a, ∀a ∈ Σ ∧ w ∈ Σ∗

Example

obama^R = bama^R · o = ama^R · bo = ma^R · abo = a^R · mabo = ε · amabo = amabo

Definition (Reversal of a string set)

Let S ⊆ Σ∗ be a set of strings. The reversal of S, denoted by S^R, is defined as: S^R = {w^R | w ∈ S}.

Theorem

The set of regular languages is closed under reversal.
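The two reversal definitions can be written down one-to-one. This is a small sketch for illustration; the function names `reverse` and `reverse_set` are assumptions, and the recursion follows the definition rather than Python's built-in slicing idiom.

```python
def reverse(w: str) -> str:
    """ε^R = ε and (a · w)^R = w^R · a."""
    if w == "":
        return ""
    return reverse(w[1:]) + w[0]


def reverse_set(strings):
    """S^R = {w^R | w ∈ S}."""
    return {reverse(w) for w in strings}


print(reverse("obama"))  # amabo
```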


Closure properties and algebra of finite state acceptors

Reversal

Example (Reversal of an acceptor)

[Figure: acceptor A and its reversal A^R]


Closure properties and algebra of finite state acceptors

Complementation

Given an alphabet Σ and a FSA A, you sometimes need a FSA Ā representing all strings x over Σ which are not in L(A).

Formally: L(Ā) = {x ∈ Σ∗ | x ∉ L(A)} or L(Ā) = Σ∗ − L(A)

[Figure: L(A) and its complement L(Ā)]

Theorem

The set of regular languages is closed under complementation.


Closure properties and algebra of finite state acceptors

Complementation: algorithm

Algorithm:

1 Determinize A and obtain A′.

2 Make A′ complete by adding a sink state s and adding, for each state q and each symbol a ∈ Σ not already used at q, a transition δ(q, a) = s.

3 Exchange final and non-final states.
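Steps 2 and 3 of the algorithm can be sketched as follows, assuming step 1 has already been done, i.e. the input is a DFA with a (possibly partial) transition dictionary. The names `complement`, `accepts`, the sink name `"sink"` and the example automaton are illustrative assumptions; the automaton is a hypothetical DFA for the single word cat, not necessarily the one shown on the slides.

```python
def complement(states, alphabet, delta, finals, sink="sink"):
    """Complete the DFA with a sink state, then exchange final and non-final states."""
    all_states = set(states) | {sink}  # assumes `sink` is not already a state name
    total = dict(delta)
    for q in all_states:
        for a in alphabet:
            total.setdefault((q, a), sink)  # only fills in missing transitions
    return all_states, total, all_states - set(finals)


def accepts(delta, q0, finals, w):
    """Run a complete DFA; every symbol of w must belong to the alphabet."""
    q = q0
    for a in w:
        q = delta[(q, a)]
    return q in finals


STATES = {"0", "1", "2", "3"}
SIGMA = {"a", "c", "r", "t"}
DELTA = {("0", "c"): "1", ("1", "a"): "2", ("2", "t"): "3"}

co_states, co_delta, co_finals = complement(STATES, SIGMA, DELTA, {"3"})
print(accepts(co_delta, "0", co_finals, "cat"))  # False
print(accepts(co_delta, "0", co_finals, "car"))  # True
```

The sink state itself becomes final in the complement, which is what lets every string that "fell off" the original automaton be accepted.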



Example (Complementation of a finite-state acceptor)

[Figure: acceptor A (Σ = {a, c, r, t}) and its complement Ā]



Example (Why complementation works)

Consider a trie for W = {cat, camel, dog, frog}.

[Figure: trie for W]

Definition (Definition of a trie)

Let W be a finite set of words over Σ. Let Pref(W) be the set of all prefixes of W. Define a DFA A = 〈Pref(W), Σ, ε, W, δ〉 with ∀a ∈ Σ, x, xa ∈ Pref(W): δ(x, a) = xa.
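The trie definition is constructive, so it can be turned into a few lines of code. This is a minimal sketch in which states are the prefix strings themselves; the function name `build_trie` and the returned tuple layout are assumptions made for the illustration.

```python
def build_trie(words):
    """Return (states, start, finals, delta) with states = Pref(W), start = ε, finals = W."""
    prefixes = {w[:i] for w in words for i in range(len(w) + 1)}
    # δ(x, a) = xa for every non-empty prefix xa.
    delta = {(x[:-1], x[-1]): x for x in prefixes if x}
    return prefixes, "", set(words), delta


states, start, finals, delta = build_trie({"cat", "camel", "dog", "frog"})
print(len(states))          # 14
print(delta[("ca", "m")])   # cam
```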



Example (A trie for W = {cat, camel, dog, frog})

States are labeled with prefixes of W.

[Figure: trie for W with states labeled by prefixes]



In a trie (a special acyclic DFA), each state corresponds to a single prefix of a word in W.

In a general DFA A, each state q corresponds to a set of prefixes of the words in L(A), the left language of q.

Definition (Left language)

The left language of a state q, symbolically L←(q), is defined as:

L←(q) = {w ∈ Σ∗ | δ∗(q0, w) = q}

Example (Left language)

[Figure: example DFA]

L←(1) = {a(ba)^n | n ≥ 0}


Closure properties and algebra of finite state acceptors

Complementation: Why can we only complement DFAs?

Example (“Complementation” of an NFA)

[Figure: an NFA and its “complemented” version]

The “complemented” NFA still accepts ab.


Closure properties and algebra of finite state acceptors

Complementation: Importance

Complementation (negation) is important for the inherent robustness of methods based on finite-state automata.

An NLP system is called robust if there does not exist an input string for which it fails. That means: a robust NLP system accepts Σ∗.

This is immediately related to negation: if there is some input string w which is not accepted by some DFA A (say, for example, a DFA representing some NP grammar about stock indices), one could use Ā to accept w: L(A) ∪ L(Ā) = Σ∗.

This property does not carry over to context-free languages: they are not closed under complementation.

Context-sensitive languages are, perhaps surprisingly, again closed under complementation. But their recognition problem is notoriously difficult.


Closure properties and algebra of finite state acceptors

Complementation: the sting in the tail: complexity

Theorem

Consider an NFA A with state set Q. The state complexity of an equivalent DFA A′ can in the worst case be in O(|Σ| · 2^|Q|).

Example (Worst case of determinization)

Consider the regular language L = Σ∗a(a|b)^k for some k (with Σ = {a, b}). While an NFA for L has k + 2 states, the equivalent DFA has 2^(k+1) states.

[Figure: NFA and equivalent DFA for k = 2]


Closure properties and algebra of finite state acceptors

Intersection

If we know that FSAs are closed under complementation and union, then we also know that they are closed under intersection.

Why? By De Morgan!

A ∩ B ≡ Σ∗ − (Ā ∪ B̄), that is, the complement of Ā ∪ B̄

But this approach is very complex, since it requires three complementation operations, which in turn require determinization.

There is a more direct method: we let pairs of states of the original FSAs be states of the intersection FSA.


Closure properties and algebra of finite state acceptors

Intersection: Example

Example (Intersection with the product state construction)

[Figure: acceptors A1, A2 and their intersection A1 ∩ A2]


Closure properties and algebra of finite state acceptors

Intersection: formal definition

Definition (Intersection of two finite-state acceptors)

Let A1 = 〈Q1, Σ1, q01, F1, δ1〉 and A2 = 〈Q2, Σ2, q02, F2, δ2〉 be two FSAs. A1 ∩ A2, the intersection of A1 and A2, is an acceptor:

A = 〈Q1 × Q2, Σ1 ∩ Σ2, 〈q01, q02〉, F1 × F2, δ〉

where 〈p′, q′〉 ∈ δ(〈p, q〉, a) if p′ ∈ δ1(p, a) and q′ ∈ δ2(q, a) for all a ∈ Σ1 ∩ Σ2.

This mathematical approach generates, in the worst as in the best case, a FSA with |Q1||Q2| states.

But a lot of these states may not contribute to the language.


Closure properties and algebra of finite state acceptors

Intersection: useless states

Definition (Inaccessible and non-coaccessible states)

Let A be a finite-state automaton (acceptor or transducer) with start state q0.

A state q in A is called inaccessible if there is no path in A from q0 to q.

A state q in A is called non-coaccessible if there is no path in A from q to a final state of A.

A state is called useless if it is inaccessible or non-coaccessible.

A finite-state automaton A is called trim or connected if it has no useless states.


Closure properties and algebra of finite state acceptors

Intersection: removal of useless states

Algorithm connect(A)

Require: FSM A with start state q0, state set Q and final state set F
Ensure: A without useless states

1: Perform a depth-first search starting at q0 and mark each visited state
2: Delete each unmarked state q and all its ingoing and outgoing transitions
3: Reverse A
4: Unmark all states in Q
5: Perform a depth-first search starting at all states q ∈ F and mark each visited state
6: Delete each unmarked state q and all its ingoing and outgoing transitions
7: Reverse A

Complexity of connect(A)

If A has |Q| states and |E| transitions, the complexity of connect(A) is in O(|Q| + |E|).
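The connect algorithm amounts to two depth-first searches. The sketch below is one possible rendering, with one deliberate deviation that should be read as an assumption: instead of physically reversing A twice, it searches a backward adjacency map, which yields the same set of states. Transitions are triples (source, label, target); all names are illustrative.

```python
from typing import Dict, Iterable, Set, Tuple

Edge = Tuple[int, str, int]  # (source state, label, target state)


def reachable(starts: Iterable[int], succ: Dict[int, Set[int]]) -> Set[int]:
    """Depth-first search over the successor map `succ`."""
    seen = set(starts)
    stack = list(seen)
    while stack:
        p = stack.pop()
        for q in succ.get(p, ()):
            if q not in seen:
                seen.add(q)
                stack.append(q)
    return seen


def connect(edges: Set[Edge], q0: int, finals: Set[int]):
    """Keep exactly the states that are both accessible and coaccessible."""
    forward: Dict[int, Set[int]] = {}
    backward: Dict[int, Set[int]] = {}
    for p, _, q in edges:
        forward.setdefault(p, set()).add(q)
        backward.setdefault(q, set()).add(p)
    keep = reachable([q0], forward) & reachable(finals, backward)
    trimmed = {(p, a, q) for (p, a, q) in edges if p in keep and q in keep}
    return keep, trimmed, finals & keep


# Hypothetical automaton: state 3 is non-coaccessible, state 4 is inaccessible.
EDGES = {(0, "a", 1), (1, "b", 2), (1, "c", 3), (4, "a", 2)}
states, edges, finals = connect(EDGES, 0, {2})
print(sorted(states))  # [0, 1, 2]
```

Each search visits every state and transition at most once, in line with the O(|Q| + |E|) bound stated above.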


Closure properties and algebra of finite state acceptors

Intersection: algorithm

Require: FSAs A1 = 〈Q1, Σ1, q01, F1, δ1〉 and A2 = 〈Q2, Σ2, q02, F2, δ2〉
Ensure: A = A1 ∩ A2

Q := {〈q01, q02〉}
F := {〈q01, q02〉} if q01 ∈ F1 ∧ q02 ∈ F2, else ∅
ENQUEUE(S, 〈q01, q02〉)
while S ≠ ∅ do
    〈q1, q2〉 := DEQUEUE(S)
    for all a ∈ Σ1 ∩ Σ2 do
        for all q′1 ∈ δ1(q1, a), q′2 ∈ δ2(q2, a) do
            δ(〈q1, q2〉, a) := δ(〈q1, q2〉, a) ∪ {〈q′1, q′2〉}
            if 〈q′1, q′2〉 ∉ Q then
                Q := Q ∪ {〈q′1, q′2〉}
                if q′1 ∈ F1 ∧ q′2 ∈ F2 then
                    F := F ∪ {〈q′1, q′2〉}
                end if
                ENQUEUE(S, 〈q′1, q′2〉)
            end if
        end for
    end for
end while
CONNECT(A)
return A
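A queue-based product construction along these lines can be sketched as follows. The code assumes ε-free acceptors whose transition functions are dictionaries mapping (state, symbol) to a set of states; it builds only the accessible pair states and, unlike the algorithm above, leaves out the final call to connect. The names `intersect`, `accepts` and the two example automata are hypothetical.

```python
from collections import deque


def intersect(d1, q01, f1, d2, q02, f2, alphabet):
    """Product construction restricted to pair states reachable from 〈q01, q02〉."""
    start = (q01, q02)
    states, delta, queue = {start}, {}, deque([start])
    while queue:
        p, q = queue.popleft()
        for a in alphabet:
            for p2 in d1.get((p, a), ()):
                for q2 in d2.get((q, a), ()):
                    delta.setdefault(((p, q), a), set()).add((p2, q2))
                    if (p2, q2) not in states:
                        states.add((p2, q2))
                        queue.append((p2, q2))
    finals = {(p, q) for (p, q) in states if p in f1 and q in f2}
    return states, delta, start, finals


def accepts(delta, q0, finals, w):
    """Simulate an ε-free NFA on w."""
    current = {q0}
    for a in w:
        current = {r for q in current for r in delta.get((q, a), ())}
    return bool(current & finals)


# A1: even number of a's. A2 (nondeterministic): strings ending in b.
D1 = {("e", "a"): {"o"}, ("o", "a"): {"e"}, ("e", "b"): {"e"}, ("o", "b"): {"o"}}
D2 = {("s", "a"): {"s"}, ("s", "b"): {"s", "t"}}

states, delta, start, finals = intersect(D1, "e", {"e"}, D2, "s", {"t"}, {"a", "b"})
print(accepts(delta, start, finals, "aab"))  # True
print(accepts(delta, start, finals, "ab"))   # False
```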


Closure properties and algebra of finite state acceptors

Intersection: practical importance

Closure under intersection means that we can develop constraints independently of each other and then enforce their validity simultaneously by intersecting them.

A lot of finite-state based NLP is based on intersection: (Two-level) Morphology, constraint-based grammar, pattern matching etc.


Closure properties and algebra of finite state acceptors

Difference

Definition (Difference)

Let A1 and A2 be two FSAs. The difference A1 − A2 is defined as:

A1 − A2 ≡ A1 ∩ Ā2


Closure properties and algebra of finite state acceptors

Difference

Example (Difference)

[Figure: acceptors A1, A2 and their difference A1 − A2]


Closure properties and algebra of finite state acceptors

Substitution

Definition (Substitution)

A substitution is a mapping s : Σ → 2^(∆∗) for two alphabets Σ and ∆. s is generalized to s∗ : Σ∗ → 2^(∆∗) by:

s∗(ε) = {ε}

s∗(xa) = s∗(x)s(a)

Theorem (Closure under substitution)

The set of regular languages is closed under substitution with regular languages.

Note

A lot of finite-state based NLP is based on closure under substitution.
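The generalization from s to s∗ can be illustrated on finite languages. The sketch assumes that each symbol is mapped to a finite set of strings (the definition itself allows arbitrary languages) and that the input word is a list of symbols; the symbol names NSTEM and NINFL echo the morphology example, while the concrete stems and endings are made up for the illustration.

```python
def substitute(s, word):
    """s*(ε) = {ε} and s*(xa) = s*(x) s(a), for finite sets s(a)."""
    result = {""}
    for a in word:
        result = {x + y for x in result for y in s[a]}
    return result


# Hypothetical substitution: noun stems followed by an inflection ending.
S = {"NSTEM": {"dog", "cat"}, "NINFL": {"", "s"}}
print(sorted(substitute(S, ["NSTEM", "NINFL"])))  # ['cat', 'cats', 'dog', 'dogs']
```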


Closure properties and algebra of finite state acceptors

Substitution

Example (Substitution in computational morphology)

[Figure: a morphology rule as a FSA A]

[Figure: acceptors A1 and A2]

[Figure: result of the substitution A{NSTEM = A1, NINFL = A2}]


Closure properties and algebra of finite state acceptors

Homomorphism

Definition (Homomorphism)

A homomorphism is a mapping h : Σ → ∆∗ for two alphabets Σ and ∆.

Definition (Inverse homomorphism)

Given a homomorphism h, the inverse homomorphic image h⁻¹ of a language L is defined as: h⁻¹(L) = {w | h(w) ∈ L}

Theorem (Closure under homomorphism)

The set of regular languages is closed under homomorphism and inverse homomorphism.


Outline

4 Closure properties and algebra of finite state transducers

Projection

Composition

Cross product

Inversion

Intersection


Closure properties and algebra

of finite state transducers

Closure properties and algebra of finite state transducers

The set of finite state transducers is closed under

Union

Concatenation

Closure

Reversal

Projection (note that this leads to FSAs)

Composition

Inversion

Finite state transducers are not closed under

Complementation

Intersection (but acyclic and ε-free transducers are)

Difference


Closure properties and algebra of finite state transducers

Projection

Definition (First and second projection)

Let T = 〈Q, Σ, ∆, q0, F, S〉 be a transducer.

The first projection of T, symbolically π1(T), is the FSA A = 〈Q, Σ, q0, F, δ〉 where ∀a ∈ Σ ∪ {ε}: δ(p, a) = {q | ∃b ∈ ∆ ∪ {ε} : 〈p, a, b, q〉 ∈ S}

The second projection of T, symbolically π2(T), is the FSA A = 〈Q, ∆, q0, F, δ〉 where ∀b ∈ ∆ ∪ {ε}: δ(p, b) = {q | ∃a ∈ Σ ∪ {ε} : 〈p, a, b, q〉 ∈ S}
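Projection only relabels transitions; states, start state and final states are kept. A minimal sketch, assuming transitions are 4-tuples 〈p, a, b, q〉 and ε is represented by the empty string (the function name `project` and the example transitions are illustrative):

```python
def project(transitions, tape):
    """tape=1 keeps the input label, tape=2 keeps the output label."""
    delta = {}
    for (p, a, b, q) in transitions:
        label = a if tape == 1 else b
        delta.setdefault((p, label), set()).add(q)
    return delta


# Hypothetical transducer: a is rewritten to x, b is deleted.
S = {(0, "a", "x", 1), (1, "b", "", 2)}
print(project(S, 1))  # {(0, 'a'): {1}, (1, 'b'): {2}} (key order may vary)
print(project(S, 2))  # {(0, 'x'): {1}, (1, ''): {2}} (key order may vary)
```

Note that the second projection of this example contains an ε-transition, so projections are in general NFAs with ε.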


Closure properties and algebra of finite state transducers

Projection

Example (Projection)

[Figure: transducer T and its projections π1(T) and π2(T)]


Closure properties and algebra of finite state transducers

Composition

Composing a transducer T1 with a transducer T2 (formally T1 ◦ T2) means: take some input u for T1, collect the output v of T1, feed it as input into T2 and collect the output w of T2.

Definition (Composition relation)

Let T1 = 〈Q1, Σ1, ∆1, q01, F1, S1〉 and T2 = 〈Q2, Σ2, ∆2, q02, F2, S2〉 be transducers.

L(T1 ◦ T2) = {〈u, w〉 ∈ Σ1∗ × ∆2∗ | ∃v ∈ ∆1∗ ∩ Σ2∗ : 〈u, v〉 ∈ L(T1) ∧ 〈v, w〉 ∈ L(T2)}


Closure properties and algebra of finite state transducers

Composition

Definition (ε-free composition)

Let T1 = 〈Q1, Σ1, ∆1, q01, F1, E1〉 and T2 = 〈Q2, Σ2, ∆2, q02, F2, E2〉 be two normalized, ε-free FSTs. T1 ◦ T2, the composition of T1 and T2, is the transducer

T = 〈Q1 × Q2, Σ1, ∆2, 〈q01, q02〉, F1 × F2, E〉 where

E = {〈〈p, q〉, a, b, 〈p′, q′〉〉 | ∃c ∈ ∆1 ∩ Σ2 : 〈p, a, c, p′〉 ∈ E1 ∧ 〈q, c, b, q′〉 ∈ E2}

Properties of composition

The composition operation is not commutative, that is, in general: T1 ◦ T2 ≠ T2 ◦ T1

The composition operation is associative, that is:

T1 ◦ T2 ◦ T3 = (T1 ◦ T2) ◦ T3 = T1 ◦ (T2 ◦ T3)
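The ε-free composition is a set comprehension over pairs of transitions whose inner labels match. The sketch below follows the definition literally, so it pairs all transitions and does not restrict itself to accessible states. Transition tuples are 〈p, a, b, q〉; the function name `compose` and the two small transducers (a word-to-category mapper and a bracketing step) are hypothetical.

```python
def compose(e1, q01, f1, e2, q02, f2):
    """Return (E, start, finals) of T1 ◦ T2 for normalized, ε-free transducers."""
    edges = {
        ((p, q), a, b, (p2, q2))
        for (p, a, c, p2) in e1
        for (q, c2, b, q2) in e2
        if c == c2  # the output of T1 must equal the input of T2
    }
    finals = {(x, y) for x in f1 for y in f2}
    return edges, (q01, q02), finals


# T1 maps words to categories, T2 maps the category sequence DET N to brackets.
E1 = {(0, "the", "DET", 0), (0, "dog", "N", 0)}
E2 = {(0, "DET", "[", 1), (1, "N", "]", 0)}

edges, start, finals = compose(E1, 0, {0}, E2, 0, {0})
for edge in sorted(edges):
    print(edge)
# ((0, 0), 'the', '[', (0, 1))
# ((0, 1), 'dog', ']', (0, 0))
```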



How does composition work?

Whenever T1 contains a transition 〈p, a, c, p′〉 and T2 contains a transition 〈q, c, b, q′〉, T will contain a transition 〈〈p, q〉, a, b, 〈p′, q′〉〉.



Example (Composition)

[Figure: an FST repeatedly mapping words to their categories, composed (◦) with an FST mapping NP-patterns to the NP category, and the resulting FST]


Closure properties and algebra of finite state transducers

Application

Definition (Identity transducer)

Let A = 〈Q, Σ, q0, F, δ〉 be a FSA. The identity transducer ID(A) is defined by 〈Q, Σ, Σ, q0, F, E〉 where

E = {〈p, a, a, q〉 | p, q ∈ Q, a ∈ Σ ∪ {ε}, q ∈ δ(p, a)}

Example (Identity transducer)

[Figure: a FSA and its identity transducer]

Definition (Application)

The application of a FST T to a FSA A, symbolically T[A], is defined as

T[A] ≡ π2(ID(A) ◦ T)


Closure properties and algebra of finite state transducers

Application

Example

[Figure: FSA A, its identity transducer ID(A), FST T, the composition ID(A) ◦ T, and the result T[A] = π2(ID(A) ◦ T)]


Closure properties and algebra of finite state transducers

Composition: relationship to intersection

Composition can be considered as a generalization of intersection. The intersection of two FSAs A1 and A2 can be defined as follows:

A1 ∩ A2 = π1(ID(A1) ◦ ID(A2))

So, intersecting two FSAs is done by composing their identity transducers and afterwards projecting one of the tapes. Composing two transducers X and Y means synchronizing (intersecting) their inner tapes and then combining the outer tapes.


Closure properties and algebra of finite state transducers

Composition: handling ε-transitions

It is possible to generalize the composition definition to transducers with ε-transitions:

Definition (Transducer composition)

Let T1 = 〈Q1, Σ1, ∆1, q01, F1, E1〉 and T2 = 〈Q2, Σ2, ∆2, q02, F2, E2〉 be two normalized FSTs. T1 ◦ T2, the composition of T1 and T2, is the transducer

T = 〈Q1 × Q2, Σ1, ∆2, 〈q01, q02〉, F1 × F2, E ∪ Eε ∪ Ei,ε ∪ Eo,ε〉 where

1 E = {〈〈p, q〉, a, b, 〈p′, q′〉〉 | ∃c ∈ ∆1 ∩ Σ2 : 〈p, a, c, p′〉 ∈ E1 ∧ 〈q, c, b, q′〉 ∈ E2}

2 Eε = {〈〈p, q〉, a, b, 〈p′, q′〉〉 | 〈p, a, ε, p′〉 ∈ E1 ∧ 〈q, ε, b, q′〉 ∈ E2}

3 Ei,ε = {〈〈p, q〉, ε, a, 〈p, q′〉〉 | 〈q, ε, a, q′〉 ∈ E2 ∧ p ∈ Q1}

4 Eo,ε = {〈〈p, q〉, a, ε, 〈p′, q〉〉 | 〈p, a, ε, p′〉 ∈ E1 ∧ q ∈ Q2}


Closure properties and algebra of finite state transducers

Composition: handling ε-transitions

There are four different ways in which ε and alphabet symbols on the second tape of T1 and the first tape of T2 can interact:

1 T1 contains an a : c-transition and T2 contains a c : b-transition: this is handled in the same way as in the ε-free case.

2 T1 contains an a : ε-transition and T2 contains an ε : b-transition → T contains an a : b-transition. That is: ε is treated as a regular symbol.

3 T1 “stays” in the same state, T2 moves on: an ε : a-transition of T2 from q to q′ yields an ε : a-transition of T from 〈p, q〉 to 〈p, q′〉.

4 T1 moves on, T2 “stays” in the same state: an a : ε-transition of T1 from p to p′ yields an a : ε-transition of T from 〈p, q〉 to 〈p′, q〉.


Closure properties and algebra of finite state transducers

Composition

Example (Composition of two unweighted FSTs)

[Figure: transducers T1, T2 and their composition T1 ◦ T2]


Closure properties and algebra of finite state transducers

Composition: Application

Composition is a very important operation for building processing or filtering cascades, for example in robust parsing and morphological analysis.

Since composition is not commutative, the order of a transducer cascade C = T1 ◦ T2 ◦ . . . ◦ Tk matters.

This may lead to problems related to feeding, counter-feeding, bleeding and counter-bleeding.

Since the composition operation is associative, the order in which the compositions in C are computed does not matter. This leaves some degrees of freedom for implementing such cascades.

Note that the state complexity of T1 ◦ T2 ◦ . . . ◦ Tk is |Q1||Q2| . . . |Qk| in the worst case.


Closure properties and algebra of finite state transducers

Cross product

Definition ((Cartesian) Product)

Given two sets S1 and S2, the Cartesian product S1 × S2 is defined as:

S1 × S2 = {〈x, y〉 | x ∈ S1 ∧ y ∈ S2}

Theorem (Product of regular sets)

Let A1 = 〈Q1, Σ1, q01, F1, δ1〉 and A2 = 〈Q2, Σ2, q02, F2, δ2〉 be two finite-state acceptors. Then L(A1) × L(A2) is representable by a finite-state transducer A1 × A2.

Proof.

A1 × A2 ≡ ID(A1) ◦ T_{Σ1∗ ↦ ε} ◦ T_{ε ↦ Σ2∗} ◦ ID(A2)

Note

Cross product is the core of all replacement operations.


Closure properties and algebra of finite state transducers

Cross product

Example (Cross product)

[Figure: acceptors A1 and A2, the transducers T_{Σ1∗ ↦ ε} and T_{ε ↦ Σ2∗}, and the cross product A1 × A2]


Closure properties and algebra of finite state transducers

Inversion

Definition (Inversion)

Let T = 〈Q, Σ, ∆, q0, F, E〉 be a transducer. The inversion of T, symbolically T⁻¹, is the FST T⁻¹ = 〈Q, ∆, Σ, q0, F, E⁻¹〉 where E⁻¹ = {〈p, b, a, q〉 | 〈p, a, b, q〉 ∈ E}

Note

Thus, inversion simply exchanges the input and output “tapes” of a transducer.
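Since inversion only swaps the two labels of every transition, it is a one-line operation on the transition set. A minimal sketch with an invented example transition (the name `invert` and the labels are assumptions for illustration):

```python
def invert(edges):
    """E⁻¹ = {〈p, b, a, q〉 | 〈p, a, b, q〉 ∈ E}."""
    return {(p, b, a, q) for (p, a, b, q) in edges}


# Hypothetical analysis transition: surface form to morphological category.
T_MORPH = {(0, "dogs", "dog+N+pl", 1)}
print(invert(T_MORPH))  # {(0, 'dog+N+pl', 'dogs', 1)}
```

Inverting twice gives back the original transducer, which is why one transducer can serve both analysis and generation.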


Closure properties and algebra of finite state transducers

Inversion

Example (Morphological analysis vs. generation)

[Figure: FST T_Morph mapping words to morphological categories]

[Figure: FST T_Morph⁻¹ mapping morphological categories to words]


Closure properties and algebra of finite state transducers

Why are FSTs not closed under intersection?

Example

[Figure: transducers T_{a^n ↦ b^n c∗} and T_{a^n ↦ b∗ c^n}]

The intersection of T_{a^n ↦ b^n c∗} and T_{a^n ↦ b∗ c^n} would result in the relation R = {〈a^n, b^n c^n〉 | n ≥ 0}, which is not regular and thus not representable by a finite-state transducer.

Note

This has consequences for creating applications based on finite-state transducers. They cannot be based on the intersection of constraints represented as transducers.


Closure properties and algebra of finite state transducers

Why are FSTs not closed under intersection?

Intuitively, the existence of ε within loops leading to infinite ambiguity is the reason why FSTs are not closed under intersection.

Thus, ε-free FSTs (also called equal-length transducers) are closed under intersection.

The same is true for acyclic FSTs, where we have some freedom where to realize the ε-transitions.

By De Morgan, non-closure under intersection leads to non-closure under complementation.


Outline

5 Equivalence transformations on finite-state acceptors

ε-Removal

Determinization

Minimization


Equivalence transformations on finite-state acceptors

Equivalence transformations on finite-state acceptors

Equivalence transformations

Equivalence transformations are operations on automata which changethe topology of an automaton without changing its language.

They usually serve optimization purposes, that is, they create smallerand/or faster automata.

Finite-state acceptors admit the following equivalence transformations:

ε-Removal

Determinization

Minimization


Equivalence transformations on finite-state acceptors

ε-Removal

Definition (Path relation −w→)

Let p and q be states in Q and let w be a string in Σ∗. Let −w→ be a relation on Q × Q such that 〈p, q〉 ∈ −w→ (written p −w→ q) if there is a path labeled with w from p to q.

Definition (ε-closure)

Given an NFA A = 〈Q, Σ, q0, F, δ〉, ε-closure(q) = {q} ∪ {p ∈ Q | q −ε→ p}.

Definition (ε-free FSA)

Let A = 〈Q, Σ, q0, F, δ〉 be a FSA. Define A′, the equivalent ε-free FSA with L(A′) = L(A), as A′ = 〈Q, Σ, q0, F′, δ′〉 where:

δ′(q, a) = ε-closure(⋃_{q′ ∈ ε-closure(q)} δ(q′, a)), ∀q ∈ Q, a ∈ Σ

F′ = F ∪ {q0} if ε-closure(q0) ∩ F ≠ ∅, else F′ = F.
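The definition of the ε-free FSA can be sketched directly. The code assumes an NFA whose transition function is a dictionary from (state, symbol) to a set of states, with ε written as the empty string; the names `eps_closure`, `remove_eps` and the example automaton (a hypothetical NFA for a∗) are illustrative.

```python
def eps_closure(delta, q):
    """All states reachable from q via ε-transitions only, including q itself."""
    seen, stack = {q}, [q]
    while stack:
        p = stack.pop()
        for r in delta.get((p, ""), ()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return seen


def remove_eps(states, alphabet, delta, q0, finals):
    """Return (δ′, F′) following the definition of the equivalent ε-free FSA."""
    new_delta = {}
    for q in states:
        for a in alphabet:
            targets = set()
            for p in eps_closure(delta, q):
                for r in delta.get((p, a), ()):
                    targets |= eps_closure(delta, r)
            if targets:
                new_delta[(q, a)] = targets
    new_finals = set(finals)
    if eps_closure(delta, q0) & set(finals):
        new_finals.add(q0)
    return new_delta, new_finals


# Hypothetical NFA for a*: 0 -ε-> 1, 1 -a-> 1, final state 1.
DELTA = {(0, ""): {1}, (1, "a"): {1}}
new_delta, new_finals = remove_eps({0, 1}, {"a"}, DELTA, 0, {1})
print(new_delta)   # {(0, 'a'): {1}, (1, 'a'): {1}}
print(new_finals)  # {0, 1}
```

The start state becomes final here because the empty string is accepted via the ε-transition in the original automaton.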


Equivalence transformations on finite-state acceptors

ε-Removal

Example


Equivalence transformations on finite-state acceptors

Determinization

Definition (Subset construction)

Let A = 〈Q, Σ, q0, F, δ〉 be a FSA. Define A′, the equivalent DFA with L(A′) = L(A), as A′ = 〈2^Q, Σ, {q0}, F′, δ′〉 with:

F′ = {S ⊆ Q | S ∩ F ≠ ∅}

δ′(S, a) = ⋃_{q ∈ S} δ(q, a), ∀a ∈ Σ, ∀S ⊆ Q

Note

The complexity of an algorithm which implements this in a naive way is exponential.

In the normal case, most of the subset states in the DFA are not accessible / coaccessible.

A better algorithm based on a state queue avoids inaccessible states.

But this doesn't change the complexity in the worst case.
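A queue-based variant of the subset construction, which creates only accessible subset states, might look as follows. The sketch assumes an ε-free NFA with a dictionary from (state, symbol) to a set of states and omits the empty subset instead of adding it as a sink. The function name `determinize` is an assumption; the example NFA encodes the worst-case language Σ∗a(a|b)^k for k = 2 with states numbered 0 to 3.

```python
from collections import deque


def determinize(alphabet, delta, q0, finals):
    """Subset construction restricted to subsets reachable from {q0}."""
    start = frozenset([q0])
    dstates, ddelta, queue = {start}, {}, deque([start])
    while queue:
        subset = queue.popleft()
        for a in alphabet:
            target = frozenset(r for q in subset for r in delta.get((q, a), ()))
            if not target:
                continue  # leave δ′ partial rather than adding the empty set
            ddelta[(subset, a)] = target
            if target not in dstates:
                dstates.add(target)
                queue.append(target)
    dfinals = {s for s in dstates if s & finals}
    return dstates, ddelta, start, dfinals


# NFA with k + 2 = 4 states for Σ*a(a|b)^2 over Σ = {a, b}.
NFA = {
    (0, "a"): {0, 1}, (0, "b"): {0},
    (1, "a"): {2}, (1, "b"): {2},
    (2, "a"): {3}, (2, "b"): {3},
}
dstates, ddelta, start, dfinals = determinize(["a", "b"], NFA, 0, {3})
print(len(dstates))  # 8, i.e. 2^(k+1) for k = 2
```

Even though only accessible subsets are built, this example still reaches all 2^(k+1) subsets that contain state 0, which illustrates that the queue does not help in the worst case.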


Equivalence transformations on finite-state acceptors

Determinization

Example


Equivalence transformations on finite-state acceptors

Minimization

Definition (Minimal DFA)

Let A = 〈Q, Σ, q0, F, δ〉 be a DFA. A is minimal if for every DFA A′ = 〈Q′, Σ′, q′0, F′, δ′〉 : L(A′) = L(A) ⇒ |Q| ≤ |Q′|

Definition (Right language)

Let A = 〈Q, Σ, q0, F, δ〉 be a DFA. The right language of a state q ∈ Q, symbolically L→(q), is defined as:

L→(q) = {w ∈ Σ∗ | δ∗(q, w) ∈ F}



Definition (Equivalent states)

Two states p and q are called equivalent if L→(p) = L→(q).

This holistic definition based on right languages can be turned into a recursive definition of equivalence of states:

Definition (State equivalence ≡)

Let A = 〈Q, Σ, q0, F, δ〉 be a DFA. Two states p and q are called equivalent, symbolically p ≡ q, if:

p ≡ q if (p ∈ F ⇔ q ∈ F) ∧ ∀a ∈ Σ : δ(p, a) ≡ δ(q, a).



Based on state equivalence, we come up with a definition of a minimal DFA:

Theorem (Minimal DFA I)

Let A = 〈Q, Σ, q0, F, δ〉 be a DFA. A is minimal iff

∀p, q ∈ Q : p ≠ q ⇒ L→(p) ≠ L→(q).

By substituting the recursive definition of state equivalence into the last theorem, we arrive at:

Theorem (Minimal DFA II)

Let A = 〈Q, Σ, q0, F, δ〉 be a DFA. A is minimal iff

∀p, q ∈ Q : p ≠ q ⇒ (p ∈ F ⇎ q ∈ F) ∨ ∃a ∈ Σ : δ(p, a) ≢ δ(q, a).


Equivalence transformations on finite-state acceptors

Minimization: Myhill-Nerode theorem

Theorem (Myhill-Nerode)

The following propositions are equivalent:

1 L is recognized by a DFA A_L = 〈Q, Σ, q0, F, δ〉.

2 L is the union of some equivalence classes of a right-invariant equivalence relation R with finite index.

3 RL (x RL y iff ∀z ∈ Σ∗ : xz ∈ L ⇔ yz ∈ L) is of finite index.

Notes

The Myhill-Nerode theorem links states with subsets of Σ∗. It is central to the theorem that the number of states in a DFA is finite.

The Myhill-Nerode theorem assumes complete DFAs, that is, the corresponding transition function δ is total.



Definition (Equivalence relation, equivalence class)

Let S be a set and E ⊆ S × S a binary relation. E is called an equivalence relation if E is reflexive, symmetric and transitive.

If E is an equivalence relation, we call [x]_E = {y | x E y} the equivalence class of x wrt E.

Properties of equivalence relations

1 x ∈ [x]_E, ∀x ∈ S

2 [x]_E = [y]_E ∨ [x]_E ∩ [y]_E = ∅, ∀x, y ∈ S

3 ⋃_{x ∈ S} [x]_E = S

Definition (Index of an equivalence relation)

Let E be an equivalence relation. The index I_E of E is the number of E's equivalence classes. E is of finite index if I_E is finite.



Definition (Right-invariant equivalence relation)

Let R be an equivalence relation over Σ∗. R is called right-invariant (with respect to concatenation) if

∀x, y, z ∈ Σ∗ : x R y ⇒ xz R yz

Definition (Left language)

Let A = 〈Q, Σ, q0, F, δ〉 be a DFA. Define the left language L←(p) of a state p ∈ Q as L←(p) = {w ∈ Σ∗ | δ∗(q0, w) = p}.



Myhill-Nerode theorem.

We prove the theorem by chaining 1 ⇒ 2, 2 ⇒ 3 and 3 ⇒ 1.

1 ⇒ 2.

Let A be a DFA recognizing L. Define RA as x RA y if δ∗(q0, x) = δ∗(q0, y), ∀x, y ∈ Σ∗.

Subproof: RA is a right-invariant equivalence relation of finite index (the index of RA is |Q|). The equivalence classes of RA are the left languages L←(p), ∀p ∈ Q.

⋃_{q ∈ F} L←(q) = L.

Example

[Figure: example DFA with states 1, 2 and 3]

L←(1) = {ε, a(ba)∗b}, L←(2) = {a(ba)∗}, L←(3) = {a(ba)∗c(d)∗}

L = L←(2) ∪ L←(3)



Myhill-Nerode theorem (continued).

2 ⇒ 3.

R = RA is a refinement of RL, that is, every equivalence class of RA is contained in some equivalence class of RL.

1 Assume that x RA y.

2 Since RA is right-invariant, xz RA yz, for all z ∈ Σ∗.

3 Thus xz ∈ L if and only if yz ∈ L.

4 Thus x RL y, and the equivalence class of x wrt RA is contained in the equivalence class of x wrt RL.

5 Since the index of RA is finite (at most equal to |Q|) and the index of RL is less than or equal to the index of RA, we conclude that RL is of finite index.



Example (RA is a refinement of RL)

[Figure: two automata illustrating the equivalence classes of RA and RL]

State   Corresponding equiv. class of RA
3       e
9       frie
15      dog
17      doll
19      dollar
22      coll
24      collar

State   Corresponding equiv. class of RL
7       dog, collar, dollar, end, friend, frog
8       e, frie
12      doll
16      coll



Myhill-Nerode theorem (continued).

3 ⇒ 1.

Given RL, construct a new FSA A′ = 〈Q′, Σ, q′0, F′, δ′〉 as follows:

1 Q′ = {[x] | [x] is an equivalence class of Σ∗ under RL}

2 q′0 = [ε]

3 δ′ : Q′ × Σ → Q′ : δ′([x], a) = [xa], ∀[x] ∈ Q′ ∧ a ∈ Σ

4 F′ = {[x] | x ∈ L}

Example

δ′([e], n) = [en], that is, δ′({frie, e}, n) = {frien, en}

δ′([en], d) = [end], that is, δ′({frien, en}, d) = {friend, end} (a)

(a) Inspired by CAKE: “Friend is a four-letter word”


Equivalence transformations on finite-state acceptors

Minimization: algorithms

Approaches

1 Union-Find-based: Find all states p and q with L→(p) = L→(q) and merge them.

2 Partition-based: Starting at sets of non-equivalent states, partition these sets further until each set contains only equivalent states.
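The partition-based approach can be sketched with a simple refinement loop in the style of Moore's algorithm; this is one common way to realize it and is not claimed to be the method intended on the slides. The code assumes a complete DFA given as a dictionary from (state, symbol) to state, starts from the partition into final and non-final states, and splits blocks until nothing changes. The name `minimize` and the example DFA (strings over {a, b} ending in a, with redundant states) are hypothetical.

```python
def minimize(states, alphabet, delta, finals):
    """Return the classes of equivalent states of a complete DFA."""
    symbols = sorted(alphabet)
    block = {q: int(q in finals) for q in states}  # initial split: F versus Q − F
    n_blocks = len(set(block.values()))
    while True:
        ids, refined = {}, {}
        for q in states:
            # Two states stay together only if they agree on their own block
            # and on the blocks of all their successors.
            signature = (block[q],) + tuple(block[delta[(q, a)]] for a in symbols)
            refined[q] = ids.setdefault(signature, len(ids))
        block = refined
        if len(ids) == n_blocks:  # no block was split: partition is stable
            break
        n_blocks = len(ids)
    classes = {}
    for q in states:
        classes.setdefault(block[q], set()).add(q)
    return {frozenset(c) for c in classes.values()}


DELTA = {
    (0, "a"): 1, (0, "b"): 2,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 3, (2, "b"): 0,
    (3, "a"): 1, (3, "b"): 2,
}
print(minimize({0, 1, 2, 3}, {"a", "b"}, DELTA, {1, 3}))
# two classes: {0, 2} and {1, 3}
```

Merging each class into a single state then yields the minimal DFA, here with two states.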


Outline

6 Equivalence transformations on finite-state transducers


Equivalence transformations on finite-state transducers

Outline

7 Decidability properties of unweighted finite-state acceptors and transducers


Decidability properties of unweighted

finite-state acceptors and transducers

Decidability properties of unweighted finite-state acceptors and transducers

Given two finite-state acceptors A and A′, the following properties are decidable:

L(A) = ∅

L(A) = Σ∗

L(A) = L(A′)

L(A) ⊆ L(A′)

Given a finite-state transducer T, the following property is decidable:

T is functional

Given two finite-state transducers T and T′, the following property is undecidable:

L(T) = L(T′)


Version history

16.10.08: version 0.1 (initial version)

20.10.08: version 0.2 (some error corrections, added definition of composition of FSTs with ε-transitions)

09.11.08: version 0.3 (added example for ε-composition, enhanced example for cross product, added subtitles)

07.12.08: version 0.4 (completed minimization section)