Finite State Automata
• A very simple and intuitive formalism suitable for certain tasks
• A bit like a flow chart, but can be used for both recognition and generation
• “Transition network”
• Unique start point
• Series of states linked by transitions
• Transitions represent input to be accounted for, or output to be generated
• Legal exit-point(s) explicitly identified
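Example (Jurafsky & Martin, Figure 2.10)
• The loop on q3 means that it can account for strings of unlimited length
• “Deterministic” because in any state, its behaviour is fully predictable: each input symbol allows at most one transition
[Diagram: q0 -b-> q1 -a-> q2 -a-> q3 -!-> q4, with a loop labelled a on q3]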
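The deterministic automaton above can be sketched as a lookup table. This is a minimal illustration in Python rather than the Prolog used later in these slides; the dictionary layout and the names `TRANSITIONS` and `accepts` are assumptions of the sketch, not part of the original material.

```python
# Transition table for the automaton in Figure 2.10:
# (state, input symbol) -> next state. The layout is an assumption of this sketch.
TRANSITIONS = {
    ("q0", "b"): "q1",
    ("q1", "a"): "q2",
    ("q2", "a"): "q3",
    ("q3", "a"): "q3",  # the loop on q3: any number of further a's
    ("q3", "!"): "q4",
}
START, FINAL = "q0", {"q4"}


def accepts(symbols):
    """Return True if the symbols lead from START to a legal exit point."""
    state = START
    for symbol in symbols:
        key = (state, symbol)
        if key not in TRANSITIONS:
            return False  # no arc for this input, so reject
        state = TRANSITIONS[key]
    return state in FINAL
```

Because every (state, symbol) pair has at most one entry, the recognizer never has to choose between arcs, which is what makes it deterministic.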
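Non-deterministic FSA (Jurafsky & Martin, Figure 2.18)
• At state q2 with input “a” there is a choice of transitions
[Diagram, Figure 2.18: q0 -b-> q1 -a-> q2 -a-> q3 -!-> q4, with a loop labelled a on q2]
• We can also have “jump” arcs (or empty transitions), which also introduce non-determinism
[Diagram, Figure 2.19: q0 -b-> q1 -a-> q2 -a-> q3 -!-> q4, with an ε arc from q3 back to q2]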
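One way to recognize with a non-deterministic automaton is to track every state the machine could be in at once. The sketch below does this for the network with the ε arc (Figure 2.19). It is an illustration in Python under assumed names (`ARCS`, `closure`, `accepts`), and it swaps in set-of-states tracking where a Prolog program would more naturally backtrack.

```python
EPSILON = None  # marker for an empty ("jump") transition in this sketch

# (state, symbol) -> set of possible next states, after Figure 2.19
ARCS = {
    ("q0", "b"): {"q1"},
    ("q1", "a"): {"q2"},
    ("q2", "a"): {"q3"},
    ("q3", EPSILON): {"q2"},
    ("q3", "!"): {"q4"},
}
START, FINAL = "q0", {"q4"}


def closure(states):
    """Add every state reachable through empty transitions alone."""
    todo, seen = list(states), set(states)
    while todo:
        state = todo.pop()
        for nxt in ARCS.get((state, EPSILON), ()):
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return seen


def accepts(symbols):
    """Return True if at least one path through the network consumes the input."""
    current = closure({START})
    for symbol in symbols:
        moved = set()
        for state in current:
            moved |= ARCS.get((state, symbol), set())
        current = closure(moved)
    return bool(current & FINAL)
```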
Augmented Transition Networks
• ATNs were used for parsing in the 60s and 70s
• For parsing, you need to pass constraints (e.g. for agreement) as well as account for input: the Transition Networks were “augmented” by having a “register” into/from which such information could be put/taken.
• It’s easy to write recognizers, but computing structure is difficult
• ATNs quickly become very complex; one solution is to have a “cascade” of ATNs, where transitions can call other networks
[Diagram: example ATN with an S network and an NP network. Arc labels recoverable from the figure: push NP (put “num”), det (put “num”), n (put “num”), adj, prep, ε, pop NP, push VP (get “num”)]
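The register idea can be sketched in a few lines: the NP network puts a number value into a “num” register, and the verb is later checked against it. Everything here is a hypothetical simplification in Python: the toy lexicon is invented, the NP network is reduced to det followed by n, and the VP network is reduced to a single verb.

```python
# Hypothetical toy lexicon: word -> (category, number); None means "any number".
LEXICON = {
    "the": ("det", None),
    "this": ("det", "sg"),
    "these": ("det", "pl"),
    "dog": ("n", "sg"),
    "dogs": ("n", "pl"),
    "barks": ("v", "sg"),
    "bark": ("v", "pl"),
}


def put_num(registers, number):
    """Store number in the "num" register; fail if it clashes with what is there."""
    if number is None:
        return True
    if registers.get("num") not in (None, number):
        return False
    registers["num"] = number
    return True


def parse_np(words, registers):
    """NP network (det n). Return the remaining words, or None on failure."""
    for expected in ("det", "n"):
        if not words:
            return None
        category, number = LEXICON.get(words[0], (None, None))
        if category != expected or not put_num(registers, number):
            return None
        words = words[1:]
    return words


def parse_s(words):
    """S network: push NP, then check the verb against the "num" register."""
    registers = {}
    rest = parse_np(list(words), registers)  # push NP, which puts "num"
    if rest is None or len(rest) != 1:
        return False
    category, number = LEXICON.get(rest[0], (None, None))
    return category == "v" and number == registers.get("num")  # get "num"
```

For example, `parse_s("these dogs bark".split())` succeeds, while `parse_s("the dogs barks".split())` fails on the agreement check.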
Exercises
FSA
[Diagram: q0 -b-> q1 -a-> q2 -a-> q3 -!-> q4, with a loop labelled a on q3]
fsa([[0,b,1],[1,a,2],[2,a,3],[3,a,3],[3,!,end]]).
NDFSA
[Diagram: q0 -b-> q1 -a-> q2 -a-> q3 -!-> q4, with an ε arc from q3 back to q2]
fsa([[0,b,1],[1,a,2],[2,a,3],[3,empty,2],[3,!,end]]).
FSA and NDFSA programs
First load (consult) the file, e.g. 219.pl
| ?- help.
Options are as follows
run - a simple recognizer; on prompt type in string with space between each element, ending in . or ! or ?
run(v) - verbose recognizer gives trace of transitions
gen(X) - generate text; will interact at choice points
rec(X,quiet) - to generate text deterministically. Type ; to get other grammatical sequences
| ?- run.
Enter your string: b a a a a !
yes
| ?- run(v).
Enter your string: b a a a a !
0-b-1
1-a-2
2-a-3
3-skip-2
2-a-3
3-skip-2
2-a-3
3-skip-2
3-!-end
yes
| ?- gen(X).
Choice at state 3. Choose state from
(1) [!,end]
(2) [empty,2]
Select choice number: 2.
Choice at state 3. Choose state from
(1) [!,end]
(2) [empty,2]
Select choice number: 2.
Choice at state 3. Choose state from
(1) [!,end]
(2) [empty,2]
Select choice number: 1.
X = [b,a,a,a,a,!] ?
yes
| ?- rec(X,quiet).
X = [b,a,a] ? ;
X = [b,a,a,a] ? ;
X = [b,a,a,a,a] ? ;
X = [b,a,a,a,a,a] ?
yes
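Generation from the same arc list can be sketched as a breadth-first walk over the network, which is similar in spirit to asking for further solutions with ; in the session above. This is an illustrative Python version, not the Prolog program itself; the function name `generate` and the treatment of `empty` as a label that emits nothing are assumptions of the sketch.

```python
from collections import deque
from itertools import islice

# Same arcs as the NDFSA fact, as (from, label, to) triples.
ARCS = [(0, "b", 1), (1, "a", 2), (2, "a", 3), (3, "empty", 2), (3, "!", "end")]


def generate(arcs, start=0, final="end"):
    """Yield accepted symbol lists in order of the number of arcs traversed."""
    queue = deque([(start, [])])
    while queue:
        state, output = queue.popleft()
        if state == final:
            yield output
            continue
        for source, label, target in arcs:
            if source == state:
                step = [] if label == "empty" else [label]
                queue.append((target, output + step))


# The language is infinite, so take only the first few strings.
first_three = list(islice(generate(ARCS), 3))
```

Breadth-first order matters here: a depth-first walk that always tried the `empty` arc first would never reach the exit point.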
FSAs and regular expressions
• FSAs have a close relationship with “regular expressions”, a formalism for expressing strings, mainly used for searching texts, or stipulating patterns of strings
• Regular expressions are defined by combinations of literal characters and special operators
Regular expressions
Character   Meaning                  Examples
[ ]         alternatives             /[aeiou]/, /m[ae]n/
-           range                    /[a-z]/
[^ ]        not                      /[^pbm]/, /[^ox]s/
?           optionality              /Kath?mandu/
*           zero or more             /baa*!/
+           one or more              /ba+!/
.           any character            /cat.[aeiou]/
^, $        start, end of line
\           not special character    \.\?\^
|           alternate strings        /cat|dog/
( )         substring                /cit(y|ies)/
etc.
• A regular expression can be mapped onto an FSA
• Can be a good way of handling morphology
• Especially in connection with Finite State Transducers
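The operators can be tried out with Python's standard `re` module, which is used here purely for illustration. The patterns come from the table of special characters; the test strings and the helper name `matches` are made up for this sketch.

```python
import re


def matches(pattern, text):
    """Return True if the whole of text matches the pattern."""
    return re.fullmatch(pattern, text) is not None


# (pattern, test string, expected result)
EXAMPLES = [
    (r"m[ae]n", "men", True),           # [ ] alternatives
    (r"m[ae]n", "min", False),
    (r"[^ox]s", "as", True),            # [^ ] not
    (r"[^ox]s", "os", False),
    (r"Kath?mandu", "Katmandu", True),  # ? optionality
    (r"baa*!", "ba!", True),            # * zero or more
    (r"ba+!", "b!", False),             # + one or more
    (r"cat.[aeiou]", "cat a", True),    # . any character
    (r"\?", "?", True),                 # \ not special character
    (r"cat|dog", "dog", True),          # | alternate strings
    (r"cit(y|ies)", "cities", True),    # ( ) substring
]

for pattern, text, expected in EXAMPLES:
    assert matches(pattern, text) == expected
```

The pattern `baa+!` (not in the table) describes the same strings as the five-state automaton with a loop on q3, which is one instance of mapping a regular expression onto an FSA.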
Finite State Transducers
• A “transducer” defines a relationship (a mapping) between two things
• Typically used for “two-level morphology”, but can be used for other things
• Like an FSA, but each state transition stipulates a pair of symbols, and thus a mapping
• Three functions:
– Recognizer (verification): takes a pair of strings and verifies whether the FST is able to map them onto each other
– Generator (synthesis): can generate a legal pair of strings
– Translator (transduction): given one string, can generate the corresponding string
Some conventions
• Transitions are marked by “:”
• A non-changing transition “x:x” can be shown simply as “x”
• Wild-cards are shown as “@”
• Empty string shown as “ε”
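The three functions can be illustrated with one small transducer. The network below is a hypothetical toy (two stems, N:ε, then P:s or S:ε), written in Python; it assumes the network has no loops, so that all licensed pairs can simply be enumerated.

```python
EPS = ""  # the empty string ε

# Hypothetical toy transducer: (state, lexical symbol, surface symbol, next state).
ARCS = [
    (0, "c", "c", 1), (1, "a", "a", 2), (2, "t", "t", 3),
    (0, "d", "d", 4), (4, "o", "o", 5), (5, "g", "g", 3),
    (3, "N", EPS, 6),
    (6, "P", "s", 7),
    (6, "S", EPS, 7),
]
START, FINAL = 0, {7}


def pairs(state=START, upper=(), lower=()):
    """Generator: yield every (lexical, surface) pair licensed by the network."""
    if state in FINAL:
        yield upper, lower
    for source, up, low, target in ARCS:
        if source == state:
            yield from pairs(
                target,
                upper + ((up,) if up else ()),
                lower + ((low,) if low else ()),
            )


def recognize(upper, lower):
    """Recognizer: verify that the two strings map onto each other."""
    return (tuple(upper), tuple(lower)) in set(pairs())


def translate(upper):
    """Translator: return the surface strings for one lexical string."""
    return [list(low) for up, low in pairs() if up == tuple(upper)]
```

For instance, `recognize("catNP", "cats")` holds and `translate("dogNP")` gives `[['d', 'o', 'g', 's']]`. A network with loops would need a search driven by the input instead of full enumeration.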
An example (J&M Fig. 3.9, p.74)
[Diagram: lexical:intermediate FST with states q0 to q7
q0 to q1: regular stems f o x, c a t, d o g
q0 to q2: irregular singular forms g o o s e, s h e e p, m o u s e
q0 to q3: irregular plural forms g o:e o:e s e, s h e e p, m o:i u:ε s:c e
q1 to q4, q2 to q5, q3 to q6: N:ε
q4 to q7: P:^ s # or S:#
q5 to q7: S:#
q6 to q7: P:#]
Traces:
[0] f:f o:o x:x [1] N:ε [4] P:^ s:s #:# [7]
[0] f:f o:o x:x [1] N:ε [4] S:# [7]
[0] c:c a:a t:t [1] N:ε [4] P:^ s:s #:# [7]
[0] s:s h:h e:e e:e p:p [2] N:ε [5] S:# [7]
[0] g:g o:e o:e s:s e:e [3] N:ε [6] P:# [7]
Mappings (lexical : intermediate):
f o x N P s # : f o x ^ s #
f o x N S : f o x #
c a t N P s # : c a t ^ s #
s h e e p N S : s h e e p #
g o o s e N P : g e e s e #
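The lexical:intermediate mapping can be sketched by writing each arc in the slide notation and expanding it on the fly. This Python version is an illustration only: the arc list follows the state numbering used in the traces, the assignment of the irregular plural forms to states 3 and 6 is an assumption based on the figure, and `parse_arc` and `translate` are names invented for the sketch.

```python
# Arcs as (from state, label in slide notation, to state).
ARCS = [
    (0, "f o x", 1), (0, "c a t", 1), (0, "d o g", 1),
    (0, "g o o s e", 2), (0, "s h e e p", 2), (0, "m o u s e", 2),
    (0, "g o:e o:e s e", 3), (0, "s h e e p", 3), (0, "m o:i u:ε s:c e", 3),
    (1, "N:ε", 4), (2, "N:ε", 5), (3, "N:ε", 6),
    (4, "P:^ s #", 7), (4, "S:#", 7), (5, "S:#", 7), (6, "P:#", 7),
]
FINAL = 7


def parse_arc(label):
    """Expand slide notation: "x" is short for "x:x"; ε is the empty string."""
    result = []
    for item in label.split():
        upper, colon, lower = item.partition(":")
        if not colon:
            lower = upper
        result.append((upper, "" if lower == "ε" else lower))
    return result


def translate(lexical, state=0):
    """Return every intermediate symbol list for a list of lexical symbols."""
    if not lexical and state == FINAL:
        return [[]]
    results = []
    for source, label, target in ARCS:
        if source != state:
            continue
        arc = parse_arc(label)
        uppers = [upper for upper, _ in arc]
        if lexical[:len(uppers)] == uppers:
            lowers = [lower for _, lower in arc if lower]
            for rest in translate(lexical[len(uppers):], target):
                results.append(lowers + rest)
    return results
```

With these arcs, `translate("g o o s e N P".split())` yields the single result `g e e s e #`, because the path through the singular form dead-ends at the P symbol.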
Lexical:surface mapping (J&M Fig. 3.14, p.78)
ε → e / {x s z} ^ __ s #
f o x N P s # : f o x ^ s #
c a t N P s # : c a t ^ s #
[Diagram: transducer for the rule, states q0 to q5; arc labels include other, #, z, s, x, ^:ε and ε:e]
f o x ^ s # : f o x e s #
c a t ^ s # : c a t s #
[0] f:f [0] o:o [0] x:x [1] ^:ε [2] ε:e [3] s:s [4] #:# [0]
[0] c:c [0] a:a [0] t:t [0] ^:ε [0] s:s [0] #:# [0]
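The effect of the e-insertion rule can be sketched as a direct pass over the intermediate symbols. This Python function applies the rule itself and does not reproduce the six-state transducer of the figure; the function name and the list-of-symbols representation are assumptions of the sketch.

```python
SIBILANTS = {"x", "s", "z"}


def intermediate_to_surface(symbols):
    """Apply ε → e / {x s z} ^ __ s # and delete every ^ boundary."""
    symbols = list(symbols)
    output = []
    for i, symbol in enumerate(symbols):
        if symbol != "^":
            output.append(symbol)
            continue
        before = symbols[i - 1] if i > 0 else None
        after = symbols[i + 1:i + 3]
        if before in SIBILANTS and after == ["s", "#"]:
            output.append("e")  # the rule fires; the ^ itself is still dropped
    return output
```

So `f o x ^ s #` becomes `f o x e s #`, while `c a t ^ s #` only loses its boundary symbol and becomes `c a t s #`.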
FST
• FSTs can be generated automatically by a compiler
• The compiler's output therefore uses a slightly different formalism
FST compiler
http://www.xrce.xerox.com/competencies/content-analysis/fsCompiler/fsinput.html
[d o g N P .x. d o g s ] |
[c a t N P .x. c a t s ] |
[f o x N P .x. f o x e s ] |
[g o o s e N P .x. g e e s e]
s0: c -> s1, d -> s2, f -> s3, g -> s4.
s1: a -> s5.
s2: o -> s6.
s3: o -> s7.
s4: <o:e> -> s8.
s5: t -> s9.
s6: g -> s9.
s7: x -> s10.
s8: <o:e> -> s11.
s9: <N:s> -> s12.
s10: <N:e> -> s13.
s11: s -> s14.
s12: <P:0> -> fs15.
s13: <P:s> -> fs15.
s14: e -> s16.
fs15: (no arcs)
s16: <N:0> -> s12.
[Diagram: s0 with arcs c, d, f, g leading to s1, s2, s3, s4]
fst([[s0,[c,s1], [d,s2], [f,s3], [g,s4]],
     [s1,[a,s5]],
     [s2,[o,s6]],
     [s3,[o,s7]],
     [s4,[[o,e],s8]],
     [s5,[t,s9]],
     [s6,[g,s9]],
     [s7,[x,s10]],
     [s8,[[o,e],s11]],
     [s9,[['N',s],s12]],
     [s10,[['N',e],s13]],
     [s11,[s,s14]],
     [s12,[['P',0],fs15]],
     [s13,[['P',s],fs15]],
     [s14,[e,s16]],
     [fs15, noarcs],
     [s16,[['N',0],s12]]]).
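The compiled network can be run directly once it is stored as a table. The sketch below copies the arcs from the listing into a Python dictionary; it assumes that `0` stands for the empty string and that the `fs` prefix marks the final state, and the name `transduce` is invented for the illustration.

```python
# state -> list of (lexical symbol, surface symbol, next state), from the listing.
NET = {
    "s0": [("c", "c", "s1"), ("d", "d", "s2"), ("f", "f", "s3"), ("g", "g", "s4")],
    "s1": [("a", "a", "s5")],
    "s2": [("o", "o", "s6")],
    "s3": [("o", "o", "s7")],
    "s4": [("o", "e", "s8")],
    "s5": [("t", "t", "s9")],
    "s6": [("g", "g", "s9")],
    "s7": [("x", "x", "s10")],
    "s8": [("o", "e", "s11")],
    "s9": [("N", "s", "s12")],
    "s10": [("N", "e", "s13")],
    "s11": [("s", "s", "s14")],
    "s12": [("P", "0", "fs15")],
    "s13": [("P", "s", "fs15")],
    "s14": [("e", "e", "s16")],
    "fs15": [],
    "s16": [("N", "0", "s12")],
}


def transduce(lexical, start="s0", final="fs15"):
    """Map a lexical string to its surface string, or None if it is rejected."""
    state, output = start, []
    for symbol in lexical:
        for upper, lower, target in NET[state]:
            if upper == symbol:
                if lower != "0":  # assumed: 0 is the compiler's empty string
                    output.append(lower)
                state = target
                break
        else:
            return None  # no arc for this symbol
    return "".join(output) if state == final else None
```

A simple loop is enough here because each state has at most one arc per lexical symbol; `transduce("gooseNP")` follows s0, s4, s8, s11, s14, s16, s12, fs15 and returns `geese`.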
FST 3.9
[Diagram: the FST of J&M Fig. 3.9 with the start state relabelled s0 and the feature arcs written N:ε, PL:^ s #, SG:#, PL:#]
FST 3.9 (portion)
[Diagram: the regular-stem arcs from s0 to q1 spelled out letter by letter: f o x via s1 and s2, c a t via s3 and s4, d o g via s5 and s6]
[s0,[f,s1], [c,s3], [d,s5]],
[s1,[o,s2]],
[s2,[x,q1]],
[s3,[a,s4]],
[s4,[t,q1]],
[s5,[o,s6]],
[s6,[g,q1]],