Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16...
Transcript of Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16...
![Page 1: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/1.jpg)
Regular Expressions andFinite State Automata
Gul Agha2104 SChttp://www.cs.uiuc.edu/class/fa05/cs421/current/
Based in part on slides by Mattox Beckman, as updated by Vikram Adve and Elsa Gunter
![Page 2: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/2.jpg)
2/10/2006 CS 421 2
Language Syntax
Syntax is the description of which strings of symbols are meaningful expressions in a languageIt takes more than syntax to understand a language; need meaning (semantics) tooSyntax is the entry point
![Page 3: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/3.jpg)
2/10/2006 CS 421 3
Elements of Syntax
Character set – previously always ASCII, now often 64 character setsKeywords – usually reservedSpecial constants – cannot be assigned toIdentifiers – can be assigned toOperator symbolsDelimiters (parenthesis, braces, brackets,)Blanks (aka white space)
![Page 4: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/4.jpg)
2/10/2006 CS 421 4
Elements of Syntax
ExpressionsType expressionsDeclarations (in functional languages)Statements (in imperative languages)Subprograms
![Page 5: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/5.jpg)
2/10/2006 CS 421 5
Elements of Syntax
ModulesInterfacesClasses (for object-oriented languages)
![Page 6: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/6.jpg)
2/10/2006 CS 421 6
Formal Language Descriptions
Regular expressions, regular grammars, finite state automataContext-free grammars, BNF grammars, syntax diagrams
Whole family more of grammars and automata – covered in automata theory
![Page 7: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/7.jpg)
2/10/2006 CS 421 7
Grammars
Grammars are formal descriptions of which strings over a given character set are in a particular languageLanguage designers write grammarLanguage implementers use grammar to know what programs to acceptLanguage users use grammar to know how to write legitimate programs
![Page 8: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/8.jpg)
2/10/2006 CS 421 8
Regular Expressions
Start with a given character set –a, b, c…
Each character is a regular expressionThe regular expression b represents the set { b }
![Page 9: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/9.jpg)
2/10/2006 CS 421 9
Regular Expressions
If x and y are regular expressions, then xy is a regular expression
xy represents the set of all strings made concatenating a string in x with a string in y
If x and y are regular expressions, then x | yis a regular expression
x | y represents the set of strings in x or in y (or both)
![Page 10: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/10.jpg)
2/10/2006 CS 421 10
Regular Expressions
If x is a regular expression, then so is ( x )( x ) represents the same thing as x
If x is a regular expression, then so is x*x* represents strings made by concatenating zero or more strings from x
εrepresents the empty set
![Page 11: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/11.jpg)
2/10/2006 CS 421 11
Example Regular Expressions
(0|1)*1The set of all strings of 0’s and 1’s ending in 1, {1, 01, 11,…}
a*b(a*)The set of all strings of a’s and b’s with exactly one b
((01) |(10))*You tell me
Regular expressions (equivalently, regular grammars) important for lexing, breaking strings into recognized words
![Page 12: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/12.jpg)
2/10/2006 CS 421 12
Example: Lexing
Regular expressions good for describing lexemes (words) in a programming language
Identifier = (a | b | … | z | A | B | … | Z) (a | b |… | z | A | B | … | Z | 0 | 1 | … | 9)*Digit = (0 | 1 | … | 9)Number = (1 | … | 9)(0 | … | 9) | ~ (1 | … | 9)(0 | … | 9) Keywords: if = if, while = while,…
![Page 13: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/13.jpg)
2/10/2006 CS 421 13
Implementing Regular Expressions
Regular expressions can generate strings in a languageNot so good for recognizing when a string is in languageRegular expressions: which option to choose, how many repetitions to makeAnswer: finite state automata
![Page 14: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/14.jpg)
2/10/2006 CS 421 14
Finite State Automata
A finite state automata over an alphabet is:a directed graphedges are labeled with elements of alphabet,
or empty stringsome nodes (or states), marked as finalone node marked as start state
Syntax of FSA
![Page 15: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/15.jpg)
2/10/2006 CS 421 15
Example FSA
0 11
0Start State
Final State
Final State
1
0
10
![Page 16: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/16.jpg)
2/10/2006 CS 421 16
Deterministic FSA’s
If FSA has for every state exactly one edge for each letter in alphabet then FSA is deterministic
No edge labeled with ε
In general FSA in non-deterministic.NFSA also allows edges labeled by ε
Deterministic FSA special kind of non-deterministic FSA
![Page 17: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/17.jpg)
2/10/2006 CS 421 17
DFSA Language Recognition
Think of a DFSA as a board game; DFSA is boardYou have string as a deck of cards; one letter on each cardStart by placing a disc on the start state
![Page 18: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/18.jpg)
2/10/2006 CS 421 18
DFSA Language Recognition
Move the disc from one state to next along the edge labeled the same as top card in deck; discard top card
When you run out of cards:if you are in final state, you win; string is in languageif you are not in a final state, you lose; string is not in language
![Page 19: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/19.jpg)
2/10/2006 CS 421 19
DFSA Language Recognition -Summary
Given a string (over the alphabet)Start at start stateMove over edge labeled with first letter to new stateRemove first letter from stringRepeat until string goneIf end in final state then string in language
Semantics of FSA
![Page 20: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/20.jpg)
2/10/2006 CS 421 20
Example DFSA
Regular expression: (0 | 1)* 1Deterministic FSA
0 1
1
0
![Page 21: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/21.jpg)
2/10/2006 CS 421 21
Example DFSA
Regular expression: (0 | 1)* 1Accepts string 0 1 1 0 1
0 1
1
0
![Page 22: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/22.jpg)
2/10/2006 CS 421 22
Example DFSA
Regular expression: (0 | 1)* 1Accepts string 0 1 1 0 1
0 1
1
0
![Page 23: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/23.jpg)
2/10/2006 CS 421 23
Example DFSA
Regular expression: (0 | 1)* 1Accepts string 0 1 1 0 1
0 1
1
0
![Page 24: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/24.jpg)
2/10/2006 CS 421 24
Example DFSA
Regular expression: (0 | 1)* 1Accepts string 0 1 1 0 1
0 1
1
0
![Page 25: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/25.jpg)
2/10/2006 CS 421 25
Example DFSA
Regular expression: (0 | 1)* 1Accepts string 0 1 1 0 1
0 1
1
0
![Page 26: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/26.jpg)
2/10/2006 CS 421 26
Example DFSA
Regular expression: (0 | 1)* 1Accepts string 0 1 1 0 1
0 1
1
0
![Page 27: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/27.jpg)
2/10/2006 CS 421 27
Example DFSA
Regular expression: (0 | 1)* 1Accepts string 0 1 1 0 1
0 1
1
0
![Page 28: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/28.jpg)
2/10/2006 CS 421 28
NFSA generalize DFSA in two ways:Include edges labeled by ε
Allows process to non-deterministically change state
Each state can have zero, one or more edges labeled by each letter
Given a letter, non-deterministically choose an edge to use
Non-deterministic FSA’s
![Page 29: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/29.jpg)
2/10/2006 CS 421 29
NFSA Language Recognition
Play the same game as with DFSAFree move: move across an edge with empty string label without discarding cardWhen you run out of letters, if you are in final state, you win; string is in languageYou can take one or more moves back and try againIf have tried all possible paths without success, then you lose; string not in language
![Page 30: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/30.jpg)
2/10/2006 CS 421 30
FSA Language RecognitionMove the disc from one state to next if edge between labeled the same as top card in deck; discard top card Free move: move across an edge with empty string label without discarding cardWhen you run out of cards, if you are in final state, you win; string is in languageYou can take a move back and try anotherIf you have cards left and you have tried all possible edges without success, then you lose; string not in language
![Page 31: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/31.jpg)
2/10/2006 CS 421 31
Example NFSA
Regular expression: (0 | 1)* 1Non-deterministic FSA
0
1
1
![Page 32: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/32.jpg)
2/10/2006 CS 421 32
Example NFSA
Regular expression: (0 | 1)* 1Accepts string 0 1 1 0 1
0
1
1
![Page 33: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/33.jpg)
2/10/2006 CS 421 33
Example NFSA
Regular expression: (0 | 1)* 1Accepts string 0 1 1 0 1
0
1
1
![Page 34: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/34.jpg)
2/10/2006 CS 421 34
Example NFSA
Regular expression: (0 | 1)* 1Accepts string 0 1 1 0 1
0
1
1
![Page 35: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/35.jpg)
2/10/2006 CS 421 35
Example NFSA
Regular expression: (0 | 1)* 1Accepts string 0 1 1 0 1
0
1
1
![Page 36: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/36.jpg)
2/10/2006 CS 421 36
Example NFSA
Regular expression: (0 | 1)* 1Accepts string 0 1 1 0 1Guess
0
1
1
![Page 37: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/37.jpg)
2/10/2006 CS 421 37
Example NFSA
Regular expression: (0 | 1)* 1Accepts string 0 1 1 0 1Backtrack
0
1
1
![Page 38: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/38.jpg)
2/10/2006 CS 421 38
Example NFSA
Regular expression: (0 | 1)* 1Accepts string 0 1 1 0 1Guess again
0
1
1
![Page 39: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/39.jpg)
2/10/2006 CS 421 39
Example NFSA
Regular expression: (0 | 1)* 1Accepts string 0 1 1 0 1Guess
0
1
1
![Page 40: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/40.jpg)
2/10/2006 CS 421 40
Regular expression: (0 | 1)* 1Accepts string 0 1 1 0 1Backtrack
Example NFSA
0
1
1
![Page 41: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/41.jpg)
2/10/2006 CS 421 41
Example NFSA
Regular expression: (0 | 1)* 1Accepts string 0 1 1 0 1Guess again
0
1
1
![Page 42: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/42.jpg)
2/10/2006 CS 421 42
Example NFSA
Regular expression: (0 | 1)* 1Accepts string 0 1 1 0 1
0
1
1
![Page 43: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/43.jpg)
2/10/2006 CS 421 43
Example NFSA
Regular expression: (0 | 1)* 1Accepts string 0 1 1 0 1Guess (Hurray!!)
0
1
1
![Page 44: Regular Expressions and Finite State Automataakantor/teaching/cs421sp2006/...2/10/2006 CS 421 16 Deterministic FSA’s If FSA has for every state exactly one edge for each letter in](https://reader036.fdocuments.us/reader036/viewer/2022071609/614761d4afbe1968d37a0640/html5/thumbnails/44.jpg)
2/10/2006 CS 421 44
Rule Based Execution
SearchWhen stuck backtrack to last point with choices remainingExecuting the NFSA in last example was example of rule based executionFSA’s are rule-based programs; transitions between states (labeled edges) are rules; set of all FSA’s is programming language