CS 44 – Jan. 29
• Expression grammars
  – Associativity √
  – Precedence
• CFG for entire language (handout)
• CYK algorithm
  – General technique for testing for acceptance
    (Note we don’t want PDA since it could be nondeterministic!)
Precedence
• (* /) bind stronger than (+ -)
• (+ -) separate better than (* /)
• Need to break up expression into terms
  – Ex. 9 – 8 * 2 + 4 / 5
  – We want to say that an expression consists of “terms” separated by + and –
  – And each term consists of numbers separated by * and /
  – But which should we define first, expr or term?
Precedence (2)
• Which grammar is right?
expr → expr + term | expr – term | term
term → term * num | term / num | num
Or this one:
expr → expr * term | expr / term | term
term → term + num | term – num | num
Let’s try examples 1 + 2 * 3 and 1 * 2 + 3
Moral
• If a grammar is defining something hierarchical, like an expression, define large groupings first.
• Lower precedence operators appear first in grammar. (They separate better)
  – Ex. * appears lower in parse tree than + because it gets evaluated first.
• In a real programming language, there can be more than 10 levels of precedence. C has ~15!
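The two candidate grammars can be compared mechanically on the example 1 + 2 * 3. The following is a minimal Python sketch and not part of the handout: the names `parse`, `evaluate`, `ADD_SUB`, `MUL_DIV` and the nested-tuple tree format are assumptions made for illustration, the left-recursive rules are implemented as loops (which keeps operators left-associative), and ASCII `-` stands in for the minus sign.

```python
import operator

ADD_SUB = {"+", "-"}
MUL_DIV = {"*", "/"}
APPLY = {"+": operator.add, "-": operator.sub,
         "*": operator.mul, "/": operator.truediv}


def parse(tokens, outer_ops, inner_ops):
    """Build a parse tree for the two-level grammar
         expr -> expr OUTER term | term
         term -> term INNER num  | num
    Left recursion is replaced by loops, so operators stay left-associative."""
    pos = 0

    def term():
        nonlocal pos
        node = int(tokens[pos])
        pos += 1
        while pos < len(tokens) and tokens[pos] in inner_ops:
            op, num = tokens[pos], int(tokens[pos + 1])
            pos += 2
            node = (op, node, num)
        return node

    def expr():
        nonlocal pos
        node = term()
        while pos < len(tokens) and tokens[pos] in outer_ops:
            op = tokens[pos]
            pos += 1
            node = (op, node, term())
        return node

    return expr()


def evaluate(node):
    """Evaluate a tree bottom-up: operators deeper in the tree are applied first."""
    if isinstance(node, int):
        return node
    op, left, right = node
    return APPLY[op](evaluate(left), evaluate(right))


tokens = "1 + 2 * 3".split()
good = parse(tokens, ADD_SUB, MUL_DIV)  # first grammar: + and - at the expr level
bad = parse(tokens, MUL_DIV, ADD_SUB)   # second grammar: * and / at the expr level
print(good, evaluate(good))  # ('+', 1, ('*', 2, 3)) 7
print(bad, evaluate(bad))    # ('*', ('+', 1, 2), 3) 9
```

For 1 * 2 + 3 both trees happen to evaluate to 5, so the grammars are better compared by their trees than by their values: the first groups (1 * 2) + 3, the second groups 1 * (2 + 3).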
C language
• Handout
  – How does the grammar begin?
  – Where are the mathematical expressions?
  – Do you agree with the precedence?
  – Do you see associativity?
  – What else is defined in grammar?
  – Where are the terminals?
Accepting input
• How can we tell if a given source file (input stream of tokens) is a valid program? Language defined by CFG, so …
  – Can see if there is some derivation from grammar?
  – Can convert CFG to PDA?
• Exponential performance not acceptable. (e.g. doubling every time we add token)
• Two improvements:
  – CYK algorithm, runs in O(n^3)
  – Bottom-up parsing, generally linear, but restrictions on grammar.
CYK algorithm
• In 1965-67, discovered independently by Cocke, Younger, Kasami.
• Given any CFG and any string, can tell if grammar generates string.
• The grammar needs to be in CNF first.
  – This ensures that the rules are simple. Rules are of the form X → a or X → YZ
• Consider all substrings of len 1 first. See if these are in language. Next try all len 2, len 3, …. up to length n.
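As a rough illustration of the CNF requirement, the sketch below checks whether every rule has one of the two allowed shapes. The dictionary representation (each variable mapped to a list of right-hand sides), the name `is_cnf`, and both sample grammars are assumptions for this example. The special rule S → ε that some definitions of CNF allow is ignored here.

```python
def is_cnf(grammar):
    """Return True if every rule is X -> a (one terminal) or X -> YZ (two variables).

    `grammar` maps each variable to a list of right-hand sides (tuples of symbols).
    A symbol counts as a variable exactly when it is a key of `grammar`."""
    for rhs_list in grammar.values():
        for rhs in rhs_list:
            if len(rhs) == 1 and rhs[0] not in grammar:
                continue  # shape X -> a
            if len(rhs) == 2 and all(sym in grammar for sym in rhs):
                continue  # shape X -> YZ
            return False
    return True


# Part of an expression grammar: three-symbol rules and the rule expr -> term
# (a single variable) both violate CNF.
expr_grammar = {
    "expr": [("expr", "+", "term"), ("term",)],
    "term": [("term", "*", "num"), ("num",)],
}

# Hypothetical CNF grammar for one or more a's followed by a single b.
cnf_grammar = {"S": [("A", "B")], "A": [("A", "A"), ("a",)], "B": [("b",)]}

print(is_cnf(expr_grammar))  # False
print(is_cnf(cnf_grammar))   # True
```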
continued
• Maintain results in an NxN table. Top right portion not used.
  – Example below is for testing word of length 3.
• Start at bottom; work your way up.
• For length 1, just look for “unit rules” in grammar, e.g. X → a.

1..3   X      X
1..2   2..3   X
1..1   2..2   3..3
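The bottom row of the table can be filled with one pass over the word. This is a minimal sketch under stated assumptions: the name `bottom_row`, the dictionary keyed by (i, j), and the sample rules P → x and Q → x | y are all hypothetical and chosen only for illustration.

```python
def bottom_row(word, terminal_rules):
    """Fill the length-1 cells: cell (i, i) holds every variable X with a rule X -> word[i].

    `terminal_rules` maps a variable to the set of terminals it can produce directly.
    Positions are 1-based to match the i..j labels used in the table."""
    table = {}
    for i, symbol in enumerate(word, start=1):
        table[(i, i)] = {var for var, terminals in terminal_rules.items()
                         if symbol in terminals}
    return table


# Hypothetical terminal rules: P -> x, Q -> x | y
rules = {"P": {"x"}, "Q": {"x", "y"}}
table = bottom_row("xyx", rules)
for i in range(1, 4):
    print(f"{i}..{i}:", sorted(table[(i, i)]))
```

Keying a dictionary by (i, j) stores only the cells with i ≤ j, which corresponds to leaving the unused corner of the NxN table empty.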
continued
• For general case i..j
  – Think of all possible ways this string can be broken into 2 pieces.
  – Ex. 1..3 = 1..2 + 3..3 or 1..1 + 2..3
  – We want to know if both pieces can be generated (by variables B and C). This handles rules of form A → BC.
• Let’s try an example from 3⁺7⁺ (grammar in CNF).
1..3   X      X
1..2   2..3   X
1..1   2..2   3..3
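The general case can be sketched in two small helpers: one lists the ways to cut i..j into two pieces, the other applies rules of the form A → BC to the variable sets of the two pieces. The names `splits` and `combine`, the list-of-pairs rule format, and the single sample rule are assumptions for illustration.

```python
def splits(i, j):
    """List every way to cut substring i..j into two non-empty pieces i..k and k+1..j."""
    return [((i, k), (k + 1, j)) for k in range(i, j)]


def combine(left_vars, right_vars, pair_rules):
    """Variables A having a rule A -> BC with B in left_vars and C in right_vars.

    `pair_rules` is a list of (A, (B, C)) entries."""
    return {head for head, (b, c) in pair_rules
            if b in left_vars and c in right_vars}


print(splits(1, 3))  # [((1, 1), (2, 3)), ((1, 2), (3, 3))]

# Hypothetical rule S -> AB applied to a left piece generated by A or C
# and a right piece generated by B or D.
print(combine({"A", "C"}, {"B", "D"}, [("S", ("A", "B"))]))  # {'S'}
```

A substring of length m has m - 1 split points, so the number of cases grows with the length of the piece being tested.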
337 ∈ 3⁺7⁺ ?
S → AB
A → 3 | AC
B → 7 | BD
C → 3
D → 7
For each len 1 string, which variables generate it?
1..1 is 3. Rules A and C.
2..2 is 3. Rules A and C.
3..3 is 7. Rules B and D.
1..3         X            X
1..2         2..3         X
1..1: A, C   2..2: A, C   3..3: B, D
337 ∈ 3⁺7⁺ ?
S → AB
A → 3 | AC
B → 7 | BD
C → 3
D → 7
Length 2:
1..2 = 1..1 + 2..2 = (A or C)(A or C) = rule A (only AC matches, via A → AC)
2..3 = 2..2 + 3..3 = (A or C)(B or D) = rule S (only AB matches, via S → AB)
1..3         X            X
1..2: A      2..3: S      X
1..1: A, C   2..2: A, C   3..3: B, D
337 ∈ 3⁺7⁺ ?
S → AB
A → 3 | AC
B → 7 | BD
C → 3
D → 7
Length 3: 2 cases for 1..3:
1..2 + 3..3: (A)(B or D) = S
1..1 + 2..3: (A or C)(S) no!
We only need one case to work.
1..3: S      X            X
1..2: A      2..3: S      X
1..1: A, C   2..2: A, C   3..3: B, D
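The whole procedure for 337 can be reproduced with a short program. This is a sketch rather than a reference implementation: the function name `cyk`, the two-dictionary grammar encoding, and the returned (accepted, table) pair are assumptions made for illustration. Only the grammar rules themselves are taken from the example.

```python
from itertools import product


def cyk(word, terminal_rules, pair_rules, start="S"):
    """CYK membership test for a grammar in CNF.

    terminal_rules: terminal -> set of variables X with a rule X -> terminal
    pair_rules:     (Y, Z)   -> set of variables X with a rule X -> YZ
    Returns (accepted, table); table[(i, j)] is the set of variables that
    generate the substring from position i to position j (1-based, inclusive)."""
    n = len(word)
    table = {}
    for i, symbol in enumerate(word, start=1):       # length 1
        table[(i, i)] = set(terminal_rules.get(symbol, ()))
    for length in range(2, n + 1):                   # lengths 2..n
        for i in range(1, n - length + 2):
            j = i + length - 1
            cell = set()
            for k in range(i, j):                    # pieces i..k and k+1..j
                for pair in product(table[(i, k)], table[(k + 1, j)]):
                    cell |= pair_rules.get(pair, set())
            table[(i, j)] = cell
    accepted = n > 0 and start in table[(1, n)]
    return accepted, table


# Grammar from the example: S -> AB, A -> 3 | AC, B -> 7 | BD, C -> 3, D -> 7
terminal_rules = {"3": {"A", "C"}, "7": {"B", "D"}}
pair_rules = {("A", "B"): {"S"}, ("A", "C"): {"A"}, ("B", "D"): {"B"}}

accepted, table = cyk("337", terminal_rules, pair_rules)
print(accepted)               # True
print(sorted(table[(1, 2)]))  # ['A']
print(sorted(table[(2, 3)]))  # ['S']
```

The three nested loops over i, j, and the split point k are where the cubic dependence on the word length comes from; the work per split depends on the grammar, not on the word.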
Example #2
Let’s test the word baab
S → AB | BC
A → BA | a
B → CC | b
C → AB | a
Length 1:
‘a’ generated by A, C
‘b’ generated by B
1..4         X            X            X
1..3         2..4         X            X
1..2         2..3         3..4         X
1..1: B      2..2: A, C   3..3: A, C   4..4: B
baab
S → AB | BC
A → BA | a
B → CC | b
C → AB | a
Length 2:
1..2 = 1..1 + 2..2 = (B)(A, C) = S,A
2..3 = 2..2 + 3..3 = (A,C)(A,C) = B
3..4 = 3..3 + 4..4 = (A,C)(B) = S,C
1..4         X            X            X
1..3         2..4         X            X
1..2: S, A   2..3: B      3..4: S, C   X
1..1: B      2..2: A, C   3..3: A, C   4..4: B
baab
S → AB | BC
A → BA | a
B → CC | b
C → AB | a
Length 3: [ each has 2 chances! ]
1..3 = 1..2 + 3..3 = (S,A)(A,C) = Ø
1..3 = 1..1 + 2..3 = (B)(B) = Ø
2..4 = 2..3 + 4..4 = (B)(B) = Ø
2..4 = 2..2 + 3..4 = (A,C)(S,C) = B
1..4         X            X            X
1..3: Ø      2..4: B      X            X
1..2: S, A   2..3: B      3..4: S, C   X
1..1: B      2..2: A, C   3..3: A, C   4..4: B
Finally…
S → AB | BC
A → BA | a
B → CC | b
C → AB | a
Length 4 [has 3 chances!]
1..4 = 1..3 + 4..4 = (Ø)(B) = Ø
1..4 = 1..2 + 3..4 = (S,A)(S,C) = Ø
1..4 = 1..1 + 2..4 = (B)(B) = Ø
Ø means we lose! baab ∉ L.
However, in general don’t give up if you encounter Ø in the middle of the process.
1..4: Ø      X            X            X
1..3: Ø      2..4: B      X            X
1..2: S, A   2..3: B      3..4: S, C   X
1..1: B      2..2: A, C   3..3: A, C   4..4: B
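The remark about not giving up on an intermediate Ø can be checked with the same grammar on a longer word. The word baaba is not from the slides; it is chosen here purely as an illustration, and the function name `cyk_table` and the string encoding of the rules are likewise assumptions. In this sketch the cells for baa and baab come out empty, yet S still appears in the top cell.

```python
def cyk_table(word, rules):
    """Return the CYK table as a dict (i, j) -> set of variables (1-based, inclusive).

    `rules` maps each variable to its right-hand sides written as strings:
    a single lowercase terminal ("a") or two uppercase variables ("AB")."""
    n = len(word)
    table = {}
    for length in range(1, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            cell = set()
            for head, bodies in rules.items():
                for body in bodies:
                    if length == 1:
                        if body == word[i - 1]:          # rule head -> terminal
                            cell.add(head)
                    elif len(body) == 2 and any(          # rule head -> YZ
                        body[0] in table[(i, k)] and body[1] in table[(k + 1, j)]
                        for k in range(i, j)
                    ):
                        cell.add(head)
            table[(i, j)] = cell
    return table


# Grammar from Example #2
rules = {"S": ["AB", "BC"], "A": ["BA", "a"], "B": ["CC", "b"], "C": ["AB", "a"]}

short = cyk_table("baab", rules)
print("S" in short[(1, 4)])   # False: baab is rejected

longer = cyk_table("baaba", rules)
print(longer[(1, 3)])         # set(): the cell for baa is empty
print("S" in longer[(1, 5)])  # True: baaba is accepted anyway
```

An empty cell only rules out the splits that use that particular substring; other splits of the full word can still succeed.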