Chapter 3 Scanning - Theory and Practice. Prof Chung. 10/8/2015.
04/21/23
Outline
  3.1 Overview
  3.2 Regular Expressions
  3.4 Finite Automata and Scanners
  3.5 Using a Scanner Generator
    LEX: introduced in the TA course (LEX introduction)
  3.7 Practical Considerations
  3.8 Translating Regular Expressions into Finite Automata
  3.9 Summary
Modified from http://www.cs.ualberta.ca/~amaral/courses/680/
Overview (1)
Formal notations
  Formal notations are necessary for specifying the precise structure of tokens.
  Example: a quoted string in Pascal
    Can a string be split across a line?
    Is a null string allowed?
    Is .1 or 10. ok?
    The 1..10 problem
Scanner generators
  Tables, programs
What formal notations to use?
Overview (2)
Lexical analyzer (scanner) role
  Produce a sequence of tokens for the parser
  Strip out comments and whitespace
  Associate a line number with each error message
  Expand macros
[Figure: source program -> Lexical Analyzer -> (token) -> Parser -> to semantic analysis; the Parser calls getNextToken on the Lexical Analyzer; both use the Symbol Table]
Regular Expressions (1)
Tokens
  Built from symbols of a finite vocabulary.
  The structure of tokens is defined using regular expressions.
Set definition
  The sets of strings defined by regular expressions are termed regular sets.
  ∅ is a regular expression denoting the empty set.
  λ is a regular expression denoting the set that contains only the empty string.
  A string s is a regular expression denoting a set containing only s.
Regular Expressions (2)
If A and B are regular expressions, so are:
  A | B (alternation)
    A regular expression formed by A or B
    (a)|(b) = {a, b}
  AB or A•B (concatenation)
    A regular expression formed by A followed by B
    (a)(b) = {ab}
  A* (Kleene closure)
    A regular expression formed by zero or more repetitions of A
    a* = {λ, a, aa, aaa, …}
More complex example
  (a|b|c)* = {λ, a, b, c, aa, ab, ac, ba, bb, bc, ca, cb, cc, …}
Regular Expressions (3)
Some notational conveniences
  P+ = PP* (at least one occurrence of P)
  Not(A) = V - A (A is a set of characters)
  Not(S) = V* - S (S is a set of strings)
  A^k = AA…A (k copies)
  A? = optional, zero or one occurrence of A
More complex example
  Let D = (0 | 1 | 2 | 3 | 4 | ... | 9)
  Let L = (A | B | ... | Z | a | b | ... | z)
  comment = -- not(EOL)* EOL
    ex: --hello12_34 \n
  decimal = D+ . D+
    ex: 123.456
  ident = L (L | D | _)*
    ex: A1a5_6
  comments = ((# | λ) not(#))*
    ex: #A435#3
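The ident and decimal definitions above can be checked by hand with a few lines of C. This is only a minimal sketch: the function names is_ident and is_decimal are hypothetical, and L and D are assumed to be the ASCII letters and digits accepted by isalpha and isdigit.

```c
#include <ctype.h>
#include <stdbool.h>

/* ident = L (L | D | _)* : true when the whole string s matches. */
bool is_ident(const char *s)
{
    /* The first character must be a letter (L). */
    if (!isalpha((unsigned char)*s))
        return false;
    /* Every following character is a letter, a digit, or an underscore. */
    for (s++; *s != '\0'; s++) {
        if (!isalnum((unsigned char)*s) && *s != '_')
            return false;
    }
    return true;
}

/* decimal = D+ . D+ : at least one digit on each side of the period. */
bool is_decimal(const char *s)
{
    int before = 0, after = 0;
    while (isdigit((unsigned char)*s)) { s++; before++; }
    if (*s != '.')
        return false;
    s++;
    while (isdigit((unsigned char)*s)) { s++; after++; }
    return before > 0 && after > 0 && *s == '\0';
}
```

With these definitions, A1a5_6 is an ident and 123.456 is a decimal, while .1 and 10. are rejected because D+ requires at least one digit on each side.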
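Regular Expressions (4)
Are regular expressions as powerful as CFGs?
  No. For example, { [^i ]^i | i ≥ 1 } (i left brackets followed by i right brackets) cannot be defined by a regular expression.
Regular grammar
  A → a
  A → aB
  or
  A → a
  A → Ba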
Finite Automata and Scanners (1)
Finite automaton (FA)
  Can be used to recognize the tokens specified by a regular expression
An FA consists of
  A finite set of states S
  A set of input symbols Σ (the input symbol alphabet)
  A set of transitions (or moves) from one state to another, labeled with characters in V
  A special start state s0 (only one)
  A set of final, or accepting, states F
FA = {S, Σ, s0, F, move}
Finite Automata and Scanners (2)
[Figure: legend for transition diagrams, showing the symbols for a transition, a state, a final state, and the start state]
Example at next page.
Finite Automata and Scanners (3)
Example: a transition diagram
[Figure: transition diagram with edges labeled a, b, c, a loop on c, and an edge a leading back]
This machine accepts (abc+)+
Finite Automata and Scanners (4)
Other example
  (0|1)*0(0|1)(0|1)
[Figure: transition diagram with states 1 to 5 for (0|1)*0(0|1)(0|1); edges labeled 0,1 and 0]
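Finite Automata and Scanners (5)
Other example
  ID = L(L|D)*(_(L|D)+)*
  The same structure can be expressed by many REs or FAs.
[Figure: FA for ID with edges labeled L, _, L | D and L | D; one part corresponds to L(L|D)* and the other to (_(L|D)+)*]
  There are final states for the two * symbols.
  What is the difference? Answer: the number of times "_" occurs.
  item 2 = item 3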
Finite Automata and Scanners (6)
Other example
  RealLit = (D+(λ|.))|(D*.D+)
Two kinds of FA
  Deterministic: the next transition is unique
  Non-deterministic: otherwise
Finite Automata and Scanners (7)
[Figure: a state with two outgoing transitions, both labeled a. Which path should we select?]
Finite Automata and Scanners (8)
A transition diagram
[Figure: states 1 to 4; 1 -/-> 2 -/-> 3; state 3 loops on Not(Eol); 3 -Eol-> 4]
A transition table
  State | Character
        |  -   Eol   a    b    …
  1     |  2
  2     |  3
  3     |  3    4    3    3    3
  4     |
Finite Automata and Scanners (9)
Any regular expression can be translated into a DFA that accepts the set of strings denoted by the regular expression.
The translation can be done
  Automatically by a scanner generator: LEX (TA course)
  Manually by a programmer
Coding the DFA takes two forms
  1. Table-driven, commonly produced by a scanner generator
  2. Explicit control, produced automatically or by hand
Finite Automata and Scanners (10)
Scanner Driver Interpreting a Transition Table (table-driven)
/* Note: CurrentChar is already set to the current input character. */
State = StartState;
while (TRUE) {
    /* Look up the next state for the current state and character. */
    NextState = T[State][CurrentChar];
    if (NextState == ERROR)
        break;
    State = NextState;
    CurrentChar = getchar();
}
if (is_final_state(State))
    ; /* Return or process valid token. */
else
    lexical_error(CurrentChar);
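The table-driven driver can be tried on an in-memory string. The following is a minimal sketch, assuming the four-state comment automaton from the transition diagram (two slashes, then anything up to end of line); the names scan_comment, char_class and the character classes are hypothetical and not part of the slides.

```c
#include <stddef.h>

enum { SLASH, EOL, OTHER, NCLASSES };            /* character classes (table columns) */
enum { ERR = 0, S1 = 1, S2, S3, S4, NSTATES };   /* ERR marks "no transition" */

/* Transition table: rows are states, columns are character classes. */
static const int T[NSTATES][NCLASSES] = {
    /*        SLASH  EOL  OTHER */
    [S1] = {  S2,    ERR, ERR },
    [S2] = {  S3,    ERR, ERR },
    [S3] = {  S3,    S4,  S3  },
    [S4] = {  ERR,   ERR, ERR },
};

static int char_class(char c)
{
    if (c == '/')  return SLASH;
    if (c == '\n') return EOL;
    return OTHER;
}

/* Returns the length of the comment at the start of s, or -1 if there is none. */
int scan_comment(const char *s)
{
    int state = S1;
    size_t i = 0;
    while (s[i] != '\0') {
        int next = T[state][char_class(s[i])];
        if (next == ERR)
            break;              /* no move possible: stop scanning */
        state = next;
        i++;
    }
    return state == S4 ? (int)i : -1;   /* S4 is the only final state */
}
```

The driver loop never mentions comments at all; only the table does, which is why a scanner generator can reuse one driver for any token set.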
Finite Automata and Scanners (11)
Scanner with Fixed Token Definition (explicit control)
if (CurrentChar == '/') {
    CurrentChar = getchar();
    if (CurrentChar == '/') {
        /* Skip the rest of the comment; stop at end of line or end of file. */
        do {
            CurrentChar = getchar();
        } while (CurrentChar != '\n' && CurrentChar != EOF);
    } else {
        ungetc(CurrentChar, stdin);
        lexical_error(CurrentChar);
    }
}
else
    lexical_error(CurrentChar);
/* Return or process valid token. */
Finite Automata and Scanners (12)
Transducer
  We may perform some actions during a state transition.
  A scanner can be turned into a transducer by the appropriate insertion of actions based on state transitions.
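As an illustration of an action attached to a transition, the sketch below scans D+ and, on every digit transition, updates the numeric value of the literal. The function name scan_integer and its interface are assumptions made for this example only.

```c
#include <ctype.h>

/* Scans D+ at the start of s.
 * Returns the number of characters scanned (0 means no token) and stores
 * the literal's value in *value. Overflow is not checked in this sketch. */
int scan_integer(const char *s, long *value)
{
    int n = 0;
    *value = 0;
    while (isdigit((unsigned char)s[n])) {
        /* Action performed on the D transition: accumulate the value. */
        *value = *value * 10 + (s[n] - '0');
        n++;
    }
    return n;
}
```

The recognizer and the transducer share the same states; the only difference is the assignment executed on each move.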
Using a Scanner Generator
By TA….
Practical Considerations (1)
Reserved words
  Usually, all keywords are reserved in order to simplify parsing.
  If keywords were not reserved in Pascal, we could even write
    begin begin; end; end; begin; end
    if else then if = else;
  The problem with reserved words is that they can be too numerous.
    COBOL has several hundred reserved words, e.g. ZERO, ZEROS, ZEROES
Practical Considerations (2)
Compiler directives and listing source lines
  Compiler options, e.g. optimization, profiling, etc.
    Handled by the scanner or by semantic routines
    Complex pragmas are treated like other statements.
  Source inclusion, e.g. #include in C
    Handled by the preprocessor or the scanner
  Conditional compilation, e.g. #if, #endif in C
    Useful for creating program versions
Practical Considerations (3)
Entry of identifiers into the symbol table
  Who is responsible for entering symbols into the symbol table?
  The scanner?
  Consider this example: { int abc; … { int abc; } }
Practical Considerations (4)
How to handle end-of-file?
  Create a special EOF token. An EOF token is useful in a CFG.
Multicharacter lookahead
  Blanks are not significant in Fortran
    DO 10 I = 1,100     Beginning of a loop
    DO 10 I = 1.100     An assignment statement: DO10I=1.100
  A Fortran scanner can determine whether the O is the last character of a DO token only after reading as far as the comma.
Practical Considerations (5)
Multicharacter lookahead (cont'd)
  In Ada and Pascal, to scan 10..100
    There are three tokens: 10  ..  100
    Two-character (..) lookahead is needed after the 10
  It is easy to build a scanner that can perform general backup.
    If we reach a situation in which we are not in a final state and cannot scan any more characters, we extract characters from the right end of the buffer and queue them for rescanning,
    until we reach a prefix of the scanned characters flagged as a valid token.
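The backup idea can be sketched in C as follows. This is an illustrative sketch, not the slides' implementation: the state names, the exponent part (e, optional sign, digits) and the function scan_number are assumptions. The scanner reads as far as the automaton allows and remembers the longest prefix that was flagged as a valid token; everything read beyond that prefix is left for rescanning.

```c
#include <ctype.h>

enum token_kind { TOK_NONE, TOK_INT, TOK_REAL };
enum state { S_START, S_INT, S_DOT, S_FRAC, S_EXP, S_SIGN, S_EXPD, S_STUCK };

/* One move of the (assumed) automaton for integer and real literals. */
static enum state step(enum state st, char c)
{
    int d = isdigit((unsigned char)c);
    switch (st) {
    case S_START: return d ? S_INT  : S_STUCK;
    case S_INT:   return d ? S_INT  : (c == '.' ? S_DOT : S_STUCK);
    case S_DOT:   return d ? S_FRAC : S_STUCK;
    case S_FRAC:  return d ? S_FRAC : (c == 'e' ? S_EXP : S_STUCK);
    case S_EXP:   return d ? S_EXPD : ((c == '+' || c == '-') ? S_SIGN : S_STUCK);
    case S_SIGN:  return d ? S_EXPD : S_STUCK;
    case S_EXPD:  return d ? S_EXPD : S_STUCK;
    default:      return S_STUCK;
    }
}

/* Returns the kind of the longest valid token at the start of s and stores
 * its length in *len. Characters read past *len are "backed up": the caller
 * resumes scanning at s + *len. */
enum token_kind scan_number(const char *s, int *len)
{
    enum state st = S_START;
    enum token_kind last = TOK_NONE;   /* flag of the longest valid prefix */
    int i = 0;
    *len = 0;
    while (s[i] != '\0' && (st = step(st, s[i])) != S_STUCK) {
        i++;
        if (st == S_INT) { last = TOK_INT; *len = i; }
        if (st == S_FRAC || st == S_EXPD) { last = TOK_REAL; *len = i; }
    }
    return last;
}
```

On 10..100 the scanner reads "10." before getting stuck, then backs up to the integer 10 so that the two periods can be rescanned as the subrange operator.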
Example at next page
Practical Considerations (6)
An FA That Scans Integer and Real Literals and the Subrange Operator
[Figure: FA with transitions labeled D (digit) and ● (period)]
  Buffered Token | Token Flag
  1              | Integer Literal
  12             | Integer Literal
  12.            | Invalid
  12.3           | Real Literal
  12.3e          | Invalid
  12.3e+         | Invalid
Detailed operation of each case at next page
Practical Considerations (7) to (13)
An FA That Scans Integer and Real Literals and the Subrange Operator
Input string: 12.3e+q
[Figure: the FA is repeated on each slide with the path taken so far highlighted]
  Slide | Input | Buffered Token | Token Flag
  (7)   | 1     | 1              | Integer Literal
  (8)   | 2     | 12             | Integer Literal
  (9)   | .     | 12.            | Invalid
  (10)  | 3     | 12.3           | Real Literal
  (11)  | e     | 12.3e          | Invalid
  (12)  | +     | 12.3e+         | Invalid
  (13)  | q     | cannot scan any more characters, and not in an accept state
Backup is invoked!
Outline
  3.1 Overview
  3.2 Regular Expressions
  3.3 Finite Automata and Scanners
  3.4 Using a Scanner Generator
  3.5 Practical Considerations
  3.6 Translating Regular Expressions into Finite Automata
    Creating Deterministic Automata
    Optimizing Finite Automata
  3.7 Tracing Example
Translating Regular Expressions into Finite Automata (1)
Regular expressions are equivalent to FAs.
The main job of a scanner generator
  To transform a regular expression definition into an equivalent FA
[Figure: a regular expression -> nondeterministic FA -> deterministic FA -> optimized deterministic FA (minimize # of states); the NFA -> DFA step is marked as important]
Translating Regular Expressions into Finite Automata (2)
We can transform any regular expression into an NFA with the following properties:
  There is a unique final state
  The final state has no successors
  Every other state has at least one successor (either one or two)
Example: a nondeterministic finite automaton (NFA)
  Input: babb
  Regular expression: (a|b)*abb
[Figure: NFA with states 0 to 3; state 0 loops on a and b; 0 -a-> 1 -b-> 2 -b-> 3; state 3 is the unique final state and has no successor]
Translating Regular Expressions into Finite Automata (3)
We need to review the definition of a regular expression
  Item 1: λ
    The null string
  Item 2: a
    A character of the vocabulary
  Item 3: |
    The "or" operation. Example: A|B
  Item 4: ●
    The operation of catenation. Example: AB
  Item 5: *
    The operation of repetition. Example: A*
More examples at next page
Translating Regular Expressions into Finite Automata (4)
NFA for λ (the null string)
[Figure: NFA for processing the token λ]
NFA for a (a one-character string; a is a character of the vocabulary)
[Figure: NFA for processing the token a, with a single transition labeled a]
Translating Regular Expressions into Finite Automata (5)
NFA for |
[Figure: NFA for A | B, built from the NFA for A and the NFA for B]
Translating Regular Expressions into Finite Automata (6)
NFA for ●
[Figure: NFA for A ● B, built from the NFA for A followed by the NFA for B]
Translating Regular Expressions into Finite Automata (7)
NFA for *
[Figure: NFA for A*, built from the NFA for A; one path is labeled "= 0 times" and the other "≥ 1 times"]
Translating Regular Expressions into Finite Automata (8)
Construct an NFA for the regular expression 01* | 1, i.e. (0(1*)) | 1
  Processing token: 1*
[Figure: NFA fragment for 1*]
Translating Regular Expressions into Finite Automata (9)
Construct an NFA for the regular expression 01* | 1, i.e. (0(1*)) | 1
  Processing token: 0, connected to 1*
[Figure: NFA fragment for 0 connected to the fragment for 1*]
Translating Regular Expressions into Finite Automata (10)
Construct an NFA for the regular expression 01* | 1, i.e. (0(1*)) | 1
  Processing token: 01* | 1
[Figure: complete NFA for 01* | 1]
Translating Regular Expressions into Finite Automata (11)
What is the problem with an NFA?
  Answer: it may be ambiguous, which makes it difficult to program.
A nondeterministic finite automaton (NFA): (a|b)*abb
  Input: babb
[Figure: NFA with states 0 to 3; after reading b, the next a can either stay in state 0 or move to state 1]
  Ambiguous! Which path should we select?
Translating Regular Expressions into Finite Automata (12)
A deterministic finite automaton (DFA): b*abb
  Input: babb
[Figure: DFA with states 0 to 3; state 0 loops on b only; 0 -a-> 1 -b-> 2 -b-> 3]
  No ambiguity: there is a unique path.
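One way to program an NFA without choosing a path is to follow all paths at once, keeping the set of states the machine could be in. The sketch below does this for (a|b)*abb with a bit mask; the names BIT, nfa_move and nfa_accepts are hypothetical, and the edge list is the usual four-state NFA for this expression, which is assumed to match the figure.

```c
#include <stdbool.h>

#define BIT(k) (1u << (k))

/* NFA for (a|b)*abb: states 0..3, state 3 is final.
 * A set of states is a bit mask; bit k set means "may be in state k". */
static unsigned nfa_move(unsigned states, char c)
{
    unsigned next = 0;
    if (states & BIT(0)) {
        if (c == 'a') next |= BIT(0) | BIT(1);   /* 0 -a-> 0 and 0 -a-> 1 */
        if (c == 'b') next |= BIT(0);            /* 0 -b-> 0 */
    }
    if ((states & BIT(1)) && c == 'b') next |= BIT(2);   /* 1 -b-> 2 */
    if ((states & BIT(2)) && c == 'b') next |= BIT(3);   /* 2 -b-> 3 */
    return next;
}

/* True when some path through the NFA accepts the whole string s. */
bool nfa_accepts(const char *s)
{
    unsigned states = BIT(0);            /* start in state 0 only */
    for (; *s != '\0'; s++)
        states = nfa_move(states, *s);
    return (states & BIT(3)) != 0;
}
```

On the input babb the state sets are {0}, {0}, {0,1}, {0,2}, {0,3}, so the string is accepted. Tracking sets of states in this way is the idea that the subset construction turns into a DFA ahead of time.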
Creating Deterministic Automata (1)
The transformation from an NFA N to an equivalent DFA M works by what is sometimes called the subset construction.
An example for each step follows.
  Initial NFA: 01* | 1, i.e. (0(1*)) | 1
[Figure: NFA for 01* | 1 with states 1 to 10]
More detailed operation at next page…
Creating Deterministic Automata (2)
Step 1: compute the λ-closure of each NFA state
[Figure: the NFA for 01* | 1 with states 1 to 10; the slide sequence highlights, for each state in turn, the states reachable from it by λ-transitions]
  1. λ-closure(1) = {1, 2, 8}
  2. λ-closure(2) = {2}
  3. λ-closure(3) = {3, 4, 5, 7, 10}
  4. λ-closure(4) = {4, 5, 7, 10}
  5. λ-closure(5) = {5}
  6. λ-closure(6) = {5, 6, 7, 10}
  7. λ-closure(7) = {5, 7, 10} (given as {7, 10} on the last slide of the sequence)
  8. λ-closure(8) = {8}
  9. λ-closure(9) = {9, 10}
  10. λ-closure(10) = {10}
Note on λ-closure(6): the dotted line in the figure is not computed.
These are all the closures; closures that are subsets of another closure are then deleted, so that no subsets remain.
Trace for state 3: state 3 -> state 4 -> states 5, 7 -> state 10 -> empty
Creating Deterministic Automata (3)
Step 1:
  The initial state of M is the set of states reachable from the initial state of N by λ-transitions.
  This set is usually called the λ-closure or ε-closure.
The algorithm is illustrated by the example above.
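A closure computation can be sketched in C with bit masks. The λ-edges below are an assumption: they were reconstructed from the closure sets listed for states 1, 3, 6 and 9 of the example NFA for 01* | 1, not read from the original figure, and the names BIT, eps and closure are hypothetical.

```c
#define BIT(k) (1u << (k))
#define NSTATES 11      /* NFA states are numbered 1..10; index 0 is unused */

/* eps[s]: bit mask of the states reachable from s by ONE lambda-transition.
 * Assumed edge list, reconstructed from the closure sets on the slides. */
static const unsigned eps[NSTATES] = {
    [1] = BIT(2) | BIT(8),
    [3] = BIT(4),
    [4] = BIT(5) | BIT(7),
    [6] = BIT(5) | BIT(7),
    [7] = BIT(10),
    [9] = BIT(10),
};

/* Returns the lambda-closure of a set of states: the set itself plus
 * everything reachable through any number of lambda-transitions. */
unsigned closure(unsigned set)
{
    unsigned result = set, old;
    do {
        old = result;
        for (int s = 1; s < NSTATES; s++)
            if (result & BIT(s))
                result |= eps[s];      /* add the one-step successors of s */
    } while (result != old);           /* repeat until nothing new is added */
    return result;
}
```

The loop is a simple fixed-point iteration; a worklist would avoid rescanning states, but for ten states the difference is negligible.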
Creating Deterministic Automata (4)
Step 2: create the successor states
  Take any state S of M and any character c, and compute S's successor under c.
  S is identified with some set of N's states, {n1, n2, …}.
  Find all possible successor states to {n1, n2, …} under c.
    Obtain a set {m1, m2, …}
  T = close({m1, m2, …})
[Figure: S = {n1, n2, …} -c-> T = close({m1, m2, …})]
Creating Deterministic Automata (7)
Step 2:
void make_deterministic(nondeterministic_fa N, deterministic *M)
{
    set_of_fa_states T;
    M->initial_state = SET_OF(N.initial_state);
    close(&M->initial_state);
    Add M->initial_state to M->states;
    while (states or transitions can be added) {
        choose S in M->states and c in Alphabet;
        T = SET_OF(y in N.states SUCH_THAT x->y under c for some x in S);
        close(&T);
        if (T not in M->states)
            add T to M->states;
        Add the transition to M->transitions: S->T under c;
    }
    M->final_states = SET_OF(S in M->states SUCH_THAT N.final_state in S);
}
Example at next page…
Creating Deterministic Automata (5)
Step 2:
First, rename the closures to simplify the work flow
  1. λ-closure(1) = {1, 2, 8}            A = {1, 2, 8}
  3. λ-closure(3) = {3, 4, 5, 7, 10}     B = {3, 4, 5, 7, 10}
  6. λ-closure(6) = {5, 6, 7, 10}        C = {5, 6, 7, 10}
  9. λ-closure(9) = {9, 10}              D = {9, 10}
[Figure: NFA for 01* | 1 with states 1 to 10]
More operations at next page…
Creating Deterministic Automata (6)
  A : {1, 2, 8}
  B : {3, 4, 5, 7, 10}
  C : {5, 6, 7, 10}
  D : {9, 10}
[Figure: the NFA for 01* | 1 with the state sets {1,2,8}, {3,4,5,7,10}, {5,6,7,10} and {9,10} outlined]
[Figure: resulting DFA; Start -> A; A -0-> B; A -1-> D; B -1-> C; C -1-> C; D has no out-degree; B, C and D are final]
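The worked example can be reproduced with a compact C version of the subset construction. This is a sketch under stated assumptions: the NFA edge lists (eps and on_char) are reconstructed from the closure sets and the resulting DFA on the slides, sets of NFA states are stored as bit masks, and all names are hypothetical. DFA states are numbered in order of discovery, so A, B, D, C become 0, 1, 2, 3.

```c
#define BIT(k) (1u << (k))
#define NSTATES 11      /* NFA states 1..10; index 0 unused */
#define MAXDFA 16       /* enough for this example; not checked in this sketch */

/* Assumed NFA for 01* | 1, reconstructed from the slides. */
static const unsigned eps[NSTATES] = {
    [1] = BIT(2) | BIT(8), [3] = BIT(4), [4] = BIT(5) | BIT(7),
    [6] = BIT(5) | BIT(7), [7] = BIT(10), [9] = BIT(10),
};
/* on_char[c][s]: successors of NFA state s under the character '0' + c. */
static const unsigned on_char[2][NSTATES] = {
    { [2] = BIT(3) },                   /* under 0 */
    { [5] = BIT(6), [8] = BIT(9) },     /* under 1 */
};

static unsigned closure(unsigned set)
{
    unsigned result = set, old;
    do {
        old = result;
        for (int s = 1; s < NSTATES; s++)
            if (result & BIT(s))
                result |= eps[s];
    } while (result != old);
    return result;
}

static unsigned move(unsigned set, int c)
{
    unsigned out = 0;
    for (int s = 1; s < NSTATES; s++)
        if (set & BIT(s))
            out |= on_char[c][s];
    return out;
}

unsigned dfa_states[MAXDFA];    /* each DFA state is a set of NFA states */
int dfa_trans[MAXDFA][2];       /* -1 means "no transition" */

/* Builds the DFA and returns its number of states; state 0 is the start. */
int make_deterministic(void)
{
    int n = 0;
    dfa_states[n++] = closure(BIT(1));
    for (int i = 0; i < n; i++) {            /* states i..n-1 are unprocessed */
        for (int c = 0; c < 2; c++) {
            unsigned t = closure(move(dfa_states[i], c));
            dfa_trans[i][c] = -1;
            if (t == 0)
                continue;                     /* no successor under c */
            int j = 0;
            while (j < n && dfa_states[j] != t)
                j++;                          /* is T already a DFA state? */
            if (j == n)
                dfa_states[n++] = t;          /* no: add it */
            dfa_trans[i][c] = j;
        }
    }
    return n;
}
```

A DFA state is final when its set contains the NFA's final state, here state 10, which makes B, C and D final as on the slide.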
Optimizing Finite Automata (1)
Minimize the number of states
  Every DFA has a unique smallest equivalent DFA.
  Given a DFA M, we use a transition table to construct the equivalent minimal DFA.
  Initially, we draw a transition table from the DFA diagram.
[Figure: DFA with Start -> A; A -0-> B; A -1-> D; B -1-> C; C -1-> C]
  Table
  State | Character
        |  0    1
  A     |  B    D
  B     |       C
  C     |       C
  D     |
  A: start state. B, C, D: final states.
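Optimizing Finite Automata (2)
Minimize the number of states
[Figure: DFA with Start -> A; A -0-> B; A -1-> D; B -1-> C; C -1-> C]
  State | Character
        |  0    1
  A     |  B    D
  B     |       C
  C     |       C
  D     |
Optimize: B is equivalent to C
  State   | Character
          |  0         1
  A       |  {B, C}    D
  {B, C}  |            {B, C}
  D       |
[Figure: new DFA with Start -> A; A -0-> {B,C}; A -1-> D; {B,C} loops on 1]
  A: start state. B, C, D: final states.
  Special: B can be merged into C because both are final states and both have the same transitions (no move on 0, and a move to C on 1).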
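The merge can be checked mechanically by partition refinement: start with final and non-final states in separate blocks, then split any block whose members disagree on which block each character leads to. The sketch below applies this to the four-state DFA above. The names are hypothetical, and, like the slide, it treats all final states as one initial block.

```c
#define NSTATES 4       /* A = 0, B = 1, C = 2, D = 3 */
#define NSYMS 2         /* characters 0 and 1 */

/* Transition table of the example DFA; -1 means "no transition". */
static const int trans[NSTATES][NSYMS] = {
    {  1,  3 },     /* A: 0 -> B, 1 -> D */
    { -1,  2 },     /* B: 1 -> C */
    { -1,  2 },     /* C: 1 -> C */
    { -1, -1 },     /* D */
};
static const int is_final[NSTATES] = { 0, 1, 1, 1 };

/* Block reached from state s under character c, or -1 if there is no move. */
static int target_block(const int cls[], int s, int c)
{
    return trans[s][c] < 0 ? -1 : cls[trans[s][c]];
}

/* Fills cls[s] with the block number of state s in the minimal DFA
 * and returns the number of blocks. */
int minimize(int cls[NSTATES])
{
    int nblocks = 0;
    for (int s = 0; s < NSTATES; s++)
        cls[s] = is_final[s];                 /* initial split */
    for (;;) {
        int next[NSTATES], count = 0;
        for (int s = 0; s < NSTATES; s++) {
            int t;
            /* Look for an earlier state in the same block that agrees
             * with s on the target block of every character. */
            for (t = 0; t < s; t++) {
                if (cls[t] != cls[s])
                    continue;
                int same = 1;
                for (int c = 0; c < NSYMS; c++)
                    if (target_block(cls, s, c) != target_block(cls, t, c))
                        same = 0;
                if (same)
                    break;
            }
            next[s] = (t < s) ? next[t] : count++;
        }
        int changed = (count != nblocks);     /* refinement only adds blocks */
        for (int s = 0; s < NSTATES; s++)
            cls[s] = next[s];
        nblocks = count;
        if (!changed)
            return nblocks;
    }
}
```

For this DFA the result has three blocks, A, {B, C} and D: D stays separate from B and C because it has no move on 1.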
Additional simplifying rules (removing parentheses)
  "*" has the highest precedence and is left associative
  Concatenation has the 2nd highest precedence and is left associative
  "|" has the lowest precedence and is left associative
  E.g., (a)|((b)*(c)) == a|b*c
Outline
  3.1 Overview
  3.2 Regular Expressions
  3.3 Finite Automata and Scanners
  3.4 Using a Scanner Generator
  3.5 Practical Considerations
  3.6 Translating Regular Expressions into Finite Automata
  3.7 Tracing Example
Tracing Example (1)
Review: steps of a scanner generator
[Figure: a regular expression -> nondeterministic FA -> deterministic FA -> optimized deterministic FA (minimize # of states); the NFA -> DFA step is marked as important]
Tracing Example (2)
Regular expressions (example inputs: IF and IFA)
  if                  {return IF;}
  [a-z][a-z0-9]*      {return ID;}
  [0-9]+              {return NUM;}
  .                   {error();}
Tracing Example (3)
Translate from RE to NFA
[Figure: regular expression -> nondeterministic FA -> deterministic FA -> optimized deterministic FA; the first step is highlighted]
Tracing Example (4)
  if {return IF;}
  The NFA for a symbol i is: start -> 1 -i-> 2
  The NFA for a symbol f is: start -> 1 -f-> 2
  The NFA for the regular expression if is: start -> 1 -i-> 2 -f-> 3 (IF)
Tracing Example (5)
  [a-z][a-z0-9]* {return ID;}
[Figure: NFA with states 1 to 4; start -> 1 -a-z-> 2, then a loop between 3 and 4 on a-z and 0-9; accepts ID]
Tracing Example (6)
  [0-9]+ {return NUM;}
[Figure: NFA with states 1 to 5; start -> 1 -0-9-> 2, then a loop on 0-9; accepts NUM]
Tracing Example (9)
[Figure: the four NFAs side by side: IF (1 -i-> 2 -f-> 3), ID (states 1 to 4 on a-z and 0-9), NUM (states 1 to 5 on 0-9), and error (1 -> 2 on any character but \n)]
Tracing Example (10)
Translate from NFA to DFA
[Figure: regular expression -> nondeterministic FA -> deterministic FA -> optimized deterministic FA; the NFA -> DFA step is highlighted]
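Tracing Example (11)
Full NFA diagram
[Figure: combined NFA with states 1 to 15; from start state 1, one branch recognizes IF (states 2, 3 on i, f), one recognizes ID (states 4 to 8 on a-z and 0-9), one recognizes NUM (states 9 to 13 on 0-9), and one recognizes error (states 14, 15 on any character)]
Special case: handled in the final states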
Tracing Example (12)
[Figure: the full NFA diagram with states 1 to 15]
  1. λ-closure(1) = {1, 4, 9, 14}
  2. λ-closure(2) = {2}
  3. λ-closure(3) = {3}
  4. λ-closure(5) = {5, 6, 8}
  5. λ-closure(7) = {6, 7, 8}
  6. λ-closure(8) = {6, 8}
  7. λ-closure(10) = {10, 11, 13}
  8. λ-closure(12) = {11, 12, 13}
  9. λ-closure(13) = {11, 13}
  10. λ-closure(15) = {15}
Tracing Example (13)
DFA states = {1-4-9-14}
Now we need to compute:
  move(1-4-9-14, a-h) = {5, 15}
  λ-closure({5, 15}) = {5, 6, 8, 15}
  New transition: 1-4-9-14 -a-h-> 5-6-8-15
[Figure: the full NFA diagram is repeated on each slide of this sequence, next to the DFA built so far]
Tracing Example (16)
  move(1-4-9-14, i) = {2, 5, 15}
  λ-closure({2, 5, 15}) = {2, 5, 6, 8, 15}
  New transition: 1-4-9-14 -i-> 2-5-6-8-15
Tracing Example (21)
  move(1-4-9-14, j-z) = {5, 15}
  λ-closure({5, 15}) = {5, 6, 8, 15}
  New transition: 1-4-9-14 -j-z-> 5-6-8-15
Tracing Example (22)
  move(1-4-9-14, 0-9) = {10, 15}
  λ-closure({10, 15}) = {10, 11, 13, 15}
  New transition: 1-4-9-14 -0-9-> 10-11-13-15
Tracing Example (23)
  move(1-4-9-14, other) = {15}
  λ-closure({15}) = {15}
  New transition: 1-4-9-14 -other-> 15
Tracing Example (24)
DFA states = {1-4-9-14}
  The analysis for 1-4-9-14 is complete. We mark it and pick another state in the DFA to analyze. (Practice)
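Tracing Example (25)
[Figure: the complete DFA]
  1-4-9-14 (start): a-h -> 5-6-8-15; i -> 2-5-6-8-15; j-z -> 5-6-8-15; 0-9 -> 10-11-13-15; other -> 15
  2-5-6-8-15 (ID): f -> 3-6-7-8; a-e, g-z, 0-9 -> 6-7-8
  5-6-8-15 (ID): a-z, 0-9 -> 6-7-8
  3-6-7-8 (IF): a-z, 0-9 -> 6-7-8
  6-7-8 (ID): a-z, 0-9 -> 6-7-8
  10-11-13-15 (NUM): 0-9 -> 11-12-13
  11-12-13 (NUM): 0-9 -> 11-12-13
  15 (error)
See p. 118 of Aho-Sethi-Ullman and p. 29 of Appel.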
Tracing Example (26)
Minimize DFA
[Figure: regular expression -> nondeterministic FA -> deterministic FA -> optimized deterministic FA (minimize # of states); the last step is highlighted]
Tracing Example (27)
[Figure: DFA with states A to H]
Transition table
  State | Character
        | 0-9  a-e  f    g-h  i    j-z  other
  A     | D    C    C    C    B    C    E
  B     | G    G    F    G    G    G    -
  C     | G    G    G    G    G    G    -
  D     | H    -    -    -    -    -    -
  E     | -    -    -    -    -    -    -
  F     | G    G    G    G    G    G    -
  G     | G    G    G    G    G    G    -
  H     | H    -    -    -    -    -    -
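The transition table above is enough to drive a working scanner. The sketch below encodes it directly in C and returns the longest match. The accepting token of each state (B, C, G: ID; F: IF; D, H: NUM; E: error) is taken from the DFA of Tracing Example (25); the identifier names are hypothetical, and the range tests assume an ASCII character set.

```c
enum { ST_A, ST_B, ST_C, ST_D, ST_E, ST_F, ST_G, ST_H, NSTATES, NONE = -1 };
enum { C_DIGIT, C_AE, C_F, C_GH, C_I, C_JZ, C_OTHER, NCLASSES };
enum token { T_NONE, T_IF, T_ID, T_NUM, T_ERROR };

static const int T[NSTATES][NCLASSES] = {
    /*          0-9    a-e    f      g-h    i      j-z    other */
    [ST_A] = {  ST_D,  ST_C,  ST_C,  ST_C,  ST_B,  ST_C,  ST_E },
    [ST_B] = {  ST_G,  ST_G,  ST_F,  ST_G,  ST_G,  ST_G,  NONE },
    [ST_C] = {  ST_G,  ST_G,  ST_G,  ST_G,  ST_G,  ST_G,  NONE },
    [ST_D] = {  ST_H,  NONE,  NONE,  NONE,  NONE,  NONE,  NONE },
    [ST_E] = {  NONE,  NONE,  NONE,  NONE,  NONE,  NONE,  NONE },
    [ST_F] = {  ST_G,  ST_G,  ST_G,  ST_G,  ST_G,  ST_G,  NONE },
    [ST_G] = {  ST_G,  ST_G,  ST_G,  ST_G,  ST_G,  ST_G,  NONE },
    [ST_H] = {  ST_H,  NONE,  NONE,  NONE,  NONE,  NONE,  NONE },
};

/* Token accepted in each state (T_NONE for the non-final start state). */
static const enum token token_of[NSTATES] = {
    [ST_A] = T_NONE, [ST_B] = T_ID,    [ST_C] = T_ID, [ST_D] = T_NUM,
    [ST_E] = T_ERROR, [ST_F] = T_IF,   [ST_G] = T_ID, [ST_H] = T_NUM,
};

static int char_class(char c)       /* assumes ASCII letter ordering */
{
    if (c >= '0' && c <= '9') return C_DIGIT;
    if (c >= 'a' && c <= 'e') return C_AE;
    if (c == 'f')             return C_F;
    if (c == 'g' || c == 'h') return C_GH;
    if (c == 'i')             return C_I;
    if (c >= 'j' && c <= 'z') return C_JZ;
    return C_OTHER;
}

/* Returns the token at the start of s and stores its length in *len. */
enum token next_token(const char *s, int *len)
{
    int state = ST_A, i = 0;
    enum token last = T_NONE;
    *len = 0;
    while (s[i] != '\0') {
        int next = T[state][char_class(s[i])];
        if (next == NONE)
            break;
        state = next;
        i++;
        if (token_of[state] != T_NONE) {     /* remember the longest match */
            last = token_of[state];
            *len = i;
        }
    }
    return last;
}
```

With this table "if" is scanned as IF but "ifa" is scanned as a three-character ID, which is the IF versus IFA distinction of the example.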
Tracing Example (28)
[Figure: DFA with states A to H and its transition table, as on the previous slide]
New Transition Table-1 (F and G merged into C, H merged into D)
  State | Character
        | 0-9  a-e  f    g-h  i    j-z  other
  A     | D    C    C    C    B    C    E
  B     | C    C    C    C    C    C    -
  C     | C    C    C    C    C    C    -
  D     | D    -    -    -    -    -    -
  E     | -    -    -    -    -    -    -
Tracing Example (29)
[Figure: DFA with states A to H, and New Transition Table-1 as on the previous slide]
New Transition Table-2 (C merged into B)
  State | Character
        | 0-9  a-e  f    g-h  i    j-z  other
  A     | D    B    B    B    B    B    E
  B     | B    B    B    B    B    B    -
  D     | D    -    -    -    -    -    -
  E     | -    -    -    -    -    -    -
Tracing Example (30)
[Figure: original DFA with states A to H, and New Transition Table-2 as on the previous slide]
  B = C = F = G, D = H
New DFA
[Figure: new DFA with states A, B, D, E; A -a-z-> B (ID); A -0-9-> D (NUM); A -other-> E (error); B loops on a-z, 0-9; D loops on 0-9; the i and f edges for IF are drawn separately]
IF can be handled by look-ahead programming
Chapter 3 End
Any questions?
In-class quiz (1+)
What is the optimized DFA for 1+?
  1+ = 1 . 1* (this method can be used)
[Figure: NFA with states 1 to 4; the part for 1* is marked]
  1. λ-closure(1) = {1, 2}
  2. λ-closure(2) = {2}
  3. λ-closure(3) = {3, 4, 2}
  4. λ-closure(4) = {4, 2}
Rename:
  A = λ-closure(1) = {1, 2}
  B = λ-closure(3) = {3, 4, 2}
  State | Character
        |  0    1
  A     |       B
  B     |       B
[Figure: DFA with Start -> A ({1,2}); A -1-> B ({3,4,2}); B loops on 1]
It cannot be optimized (merged) further, because A is the start state and B is a final state.