CSCI 4325 / 6339 Theory of Computation
Zhixiang Chen
Department of Computer Science
University of Texas-Pan American
Chapter Two: Context-free Languages
A Short Overview
We know that L = {a^i b^i : i ≥ 0} is not regular. Can we design a grammar to generate L?
Answer: S → a S b, S → e
The grammar is G = (V, Σ, R, S), where
V = {a, b, S}, Σ = {a, b}
R: S → a S b, S → e
The above grammar is context-free.
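As a quick sanity check (not part of the slides), one can mechanically expand the two rules S → aSb and S → e and confirm that the strings produced are exactly the strings a^i b^i. A minimal Python sketch, assuming we always rewrite the leftmost S:

```python
# Illustrative sketch: enumerate all terminal strings the grammar
# S -> aSb | e derives, up to a length bound, and compare with {a^i b^i}.

def derive_up_to(max_len):
    """Expand sentential forms; collect terminal strings of length <= max_len."""
    results = set()
    frontier = ["S"]
    while frontier:
        form = frontier.pop()
        if "S" not in form:
            if len(form) <= max_len:
                results.add(form)
            continue
        if len(form) - 1 > max_len:   # even erasing S leaves it too long: prune
            continue
        i = form.index("S")
        # Apply S -> aSb and S -> e to the leftmost S.
        frontier.append(form[:i] + "aSb" + form[i + 1:])
        frontier.append(form[:i] + form[i + 1:])
    return results

generated = derive_up_to(6)
expected = {"a" * i + "b" * i for i in range(4)}   # i = 0..3 fit in length 6
```

Since each sentential form of this grammar contains at most one S, the search terminates once erasing S can no longer produce a short enough string.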
Context-free Grammars
Definition. A context-free grammar G is a quadruple (V, Σ, R, S) where
V is an alphabet
Σ ⊆ V is the set of terminals
V − Σ is the set of non-terminals
S ∈ V − Σ is the start symbol
R ⊆ (V − Σ) × V* is the set of production rules
Production Rules
Let (A, u) ∈ R be a production rule. We can rewrite it as A → u.
A → u means that from the nonterminal symbol A we derive the string u. We also say A implies u, A derives u, or A generates u.
Understand the Derivation Relation
Let G = (V, Σ, R, S) be a context-free grammar.
For x, y ∈ V*, x ⇒ y if and only if there exist u, v, w ∈ V* and A ∈ V − Σ such that x = u A v, y = u w v, and A → w ∈ R.
The relation ⇒ has a reflexive and transitive closure denoted by ⇒*.
Understand ⇒ and ⇒*.
Context-free Languages
The language generated by a CF grammar G = (V, Σ, R, S):
A string w ∈ Σ* is generated by G if S ⇒* w.
The language generated by G is L(G) = {w ∈ Σ* : S ⇒* w}.
A language is CF if it is generated by a CF grammar.
Arithmetic Expressions
Ex: The language of arithmetic expressions is CF.
This language is generated by the CF grammar G = (V, Σ, R, E), where
V = {E, T, F, ( , ), +, *, -, /, id}
Σ = {( , ), +, *, -, /, id}
R: E → E + T | E − T | T
   T → T * F | T / F | F
   F → (E) | id
Ex's of derivation?
CF Language Examples
Ex: L = {w w^R : w ∈ {a, b}*} is context-free. Show this is true in class.
Ex: L = {w ∈ {a, b}* : w = w^R} is context-free. Show this is true in class.
Theorem. Every regular language is context-free.
Proof: Let L = L(M) be a regular language recognized by a FA M = (K, Σ, δ, s, F). Construct a context-free grammar to simulate M.
Idea of construction:
G = (V, Σ, R, S)
V = K ∪ Σ, S = s
R = {q → a p : δ(q, a) = p} ∪ {q → e : q ∈ F}
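The construction above can be written out directly: every transition δ(q, a) = p becomes a right-linear rule q → a p, and every final state gets q → e. A sketch (function and state names are illustrative; the DFA is given as a transition dictionary):

```python
# Hedged sketch of the proof's construction: build the right-linear CFG
# simulating a DFA. The empty tuple () marks the rule q -> e.

def dfa_to_cfg(states, delta, start, finals):
    """Return (rules, start_symbol); rules maps each state (used as a
    nonterminal) to a list of right-hand sides given as tuples."""
    rules = {q: [] for q in states}
    for (q, a), p in delta.items():
        rules[q].append((a, p))          # q -> a p
    for q in finals:
        rules[q].append(())              # q -> e
    return rules, start

def generates(rules, start, w):
    """Membership check: following the right-linear rules is exactly a DFA run."""
    q = start
    for a in w:
        targets = [p for (b, p) in (r for r in rules[q] if len(r) == 2) if b == a]
        if not targets:
            return False
        q = targets[0]
    return () in rules[q]

# Example DFA over {a, b} accepting strings with an even number of a's.
delta = {("even", "a"): "odd", ("even", "b"): "even",
         ("odd", "a"): "even", ("odd", "b"): "odd"}
rules, S = dfa_to_cfg({"even", "odd"}, delta, "even", {"even"})
```

The grammar's derivations mirror the automaton's runs, which is the heart of the proof.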
Example of Construction
Construct the CF grammar for the following FA:
[Transition diagram of a two-state FA over {a, b}; figure not reproducible in this transcript.]
Parse Trees
Given a CF grammar G = (V, Σ, R, S) and a string α ∈ L(G), the derivation S ⇒* α can be described by a tree. We call such a tree the parse tree of α.
Importance of parse trees: analysis of the syntax of α.
Parse Tree Examples
Consider arithmetic expressions generated by G = (V, Σ, R, E), where
V = {E, T, F, ( , ), +, *, -, /, id}
Σ = {( , ), +, *, -, /, id}
R: E → E + T | E − T | T
   T → T * F | T / F | F
   F → (E) | id
Construct a parse tree for id * (id + id).
Rightmost Derivations
Given a context-free grammar G = (V, Σ, R, S), for any α ∈ Σ*, a rightmost derivation of α is a derivation in which, at each step, the rightmost non-terminal is the one rewritten.
Ex's of Rightmost Derivations
Ex: G = (V, Σ, R, E), where
V = {E, T, ( , ), id, +, *, -, /}
Σ = {( , ), id, +, *, -, /}
R: E → E + T | E − T | E * T | E / T
   T → (E) | id
Find the rightmost derivation for ( id + id ) * ( id − id * id ).
Leftmost Derivations
Similar to rightmost derivations, except that at each step the leftmost non-terminal symbol is the one rewritten.
Ex: Find the leftmost derivation for ( id + id ) * ( id − id * id ).
Theorem 3.2.1. Let G = (V, Σ, R, S) be a context-free grammar, let A ∈ V − Σ, and let α ∈ Σ*. Then the following statements are equivalent:
(a) A ⇒* α
(b) There is a parse tree with root A and yield α.
(c) There is a leftmost derivation A ⇒*_L α.
(d) There is a rightmost derivation A ⇒*_R α.
Proof by induction on the length of the derivation. Prove (a) ⇒ (b) ⇒ (c) ⇒ (d) ⇒ (a).
Ambiguity
A context-free grammar G = (V, Σ, R, S) is ambiguous if there is a string α ∈ Σ* such that α has two distinct parse trees. That is, there are different meanings or interpretations for α; the semantics of α is ambiguous.
Ambiguity Examples
Ex. E → E + E | E * E | (E) | id
[Figure: two distinct parse trees for id + id * id — one grouping it as id + (id * id), the other as (id + id) * id; tree drawings not reproducible in this transcript.]
Note. Can you see the different meanings of id + id * id?
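The two parse trees correspond to the two groupings id + (id * id) and (id + id) * id. A small illustrative sketch (the tuple encoding of trees is my own, with concrete numbers standing in for id) shows that the two trees evaluate differently:

```python
# Illustrative only: the two parse trees of id+id*id as nested tuples,
# evaluated bottom-up, to show that the ambiguity changes the meaning.

def ev(t):
    """Evaluate a parse tree given as ('+' or '*', left, right), or a number."""
    if isinstance(t, tuple):
        op, l, r = t
        return ev(l) + ev(r) if op == "+" else ev(l) * ev(r)
    return t

# id + (id * id): the usual precedence, and the only tree of the E/T/F grammar
tree1 = ("+", 2, ("*", 3, 4))
# (id + id) * id: the other tree the ambiguous grammar E -> E+E | E*E allows
tree2 = ("*", ("+", 2, 3), 4)
```

With id = 2, 3, 4 the first tree yields 2 + 3·4 = 14 and the second (2 + 3)·4 = 20: two parse trees, two meanings.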
Ambiguous Languages
A language is inherently ambiguous if any context-free grammar generating it is ambiguous.
Why is ambiguity not good?
Pushdown Automata (PA)
[Figure: a pushdown automaton reading the input tape babaabba; a finite control scans the input and manipulates a pushdown stack (shown holding a b b a a, top first); drawing not reproducible in this transcript.]
Definition of PA
A pushdown automaton is a sextuple M = (K, Σ, Γ, Δ, s, F), where
K is a finite set of states
Σ is the input alphabet
Γ is the stack alphabet
s ∈ K is the initial state
F ⊆ K is the set of final states
Δ is the transition relation:
Δ ⊆ (K × (Σ ∪ {e}) × Γ*) × (K × Γ*)
Understand the Transition Relation
Understand ((p, a, β), (q, γ)) ∈ Δ:
p: the current state
a: the current input symbol (or e)
β: the top string on the current stack
q: the new state
γ: replace the top string β of the stack with γ
Configurations of PA
Configurations of a pushdown automaton are tuples in K × Σ* × Γ*.
Given a configuration (p, w, u), understand it:
p: the current state
w: the remaining input
u: the current stack content
Yield Relations of PA
Yield relation ⊢: Given two configurations (p, x, α) and (q, y, β),
(p, x, α) ⊢ (q, y, β) if x = a y, α = γ η, β = θ η, and ((p, a, γ), (q, θ)) ∈ Δ.
Define ⊢* as the reflexive transitive closure of ⊢.
Understand ⊢ and ⊢*.
The Language Accepted by a PA
Given w ∈ Σ*, a PA M accepts w if and only if (s, w, e) ⊢* (p, e, e) for some p ∈ F.
The language accepted by M is
L(M) = {w ∈ Σ* : (s, w, e) ⊢* (p, e, e) for some p ∈ F}.
PA Examples
EX. Design a pushdown automaton accepting L = {w c w^R : w ∈ {a, b}*}.
M = (K, Σ, Γ, Δ, s, F)
K = {s, f}, Σ = {a, b, c}, Γ = {a, b}, F = {f}
Δ: (s, a, e) → (s, a)
   (s, b, e) → (s, b)
   (s, c, e) → (f, e)
   (f, a, a) → (f, e)
   (f, b, b) → (f, e)
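This machine can be simulated in code. A sketch (names and the configuration search are my own; transitions here pop and push at most one symbol, which is all this machine needs, and nondeterminism is handled by exploring all reachable configurations):

```python
# Illustrative PA simulator over configurations (state, unread input, stack);
# the stack is a string with the top at index 0.

from collections import deque

def accepts(trans, start, finals, w):
    """trans: dict (state, input symbol or '', stack top or '') -> list of
    (new state, pushed string). Accept iff some run empties both the input
    and the stack in a final state."""
    seen = set()
    queue = deque([(start, w, "")])
    while queue:
        q, rest, stack = queue.popleft()
        if (q, rest, stack) in seen:
            continue
        seen.add((q, rest, stack))
        if not rest and not stack and q in finals:
            return True
        for (p, a, top), moves in trans.items():
            if p != q:
                continue
            if a and not rest.startswith(a):      # '' means an e-move
                continue
            if top and not stack.startswith(top):  # '' means "any stack"
                continue
            for (r, push) in moves:
                queue.append((r, rest[len(a):], push + stack[len(top):]))
    return False

# The machine for {w c w^R : w in {a, b}*} from this slide.
trans = {
    ("s", "a", ""): [("s", "a")],
    ("s", "b", ""): [("s", "b")],
    ("s", "c", ""): [("f", "")],
    ("f", "a", "a"): [("f", "")],
    ("f", "b", "b"): [("f", "")],
}
```

Replacing the c-transition with the e-move (s, e, e) → (f, e) turns this into the nondeterministic machine for {w w^R} on the next slide.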
PA Examples
EX. Design a pushdown automaton accepting L = {w w^R : w ∈ {a, b}*}.
Δ: (s, a, e) → (s, a)
   (s, b, e) → (s, b)
   (s, e, e) → (f, e)
   (f, a, a) → (f, e)
   (f, b, b) → (f, e)
PA vs. CF Languages
Theorem 3.4.1: The class of languages accepted by pushdown automata is exactly the class of context-free languages.
Proof. Part 1: Each CF language is accepted by some PA. Let G = (V, Σ, R, S) be a CF grammar. We want to construct a PA M such that L(G) = L(M).
The idea of the construction of M:
Push the start symbol S of the CF grammar G onto the stack.
Simulate derivations on the stack.
Match terminal symbols on the stack top with the current input symbols.
Constructing the PA for a CF Grammar G
M = ({p, q}, Σ, V, Δ, p, {q})
Δ: (p, e, e) → (q, S)
   (q, e, A) → (q, x), if A → x ∈ R
   (q, a, a) → (q, e), for each a ∈ Σ
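This three-part construction translates to code almost line by line. A sketch (the dict-of-rules representation and the function name are my own):

```python
# Hedged sketch: build the transition relation Delta of the two-state PA
# that simulates a CF grammar on its stack.

def grammar_to_pda(rules, terminals, start_symbol):
    """rules: dict nonterminal -> list of right-hand-side strings."""
    delta = [(("p", "", ""), ("q", start_symbol))]      # push S
    for A, rhss in rules.items():
        for x in rhss:
            delta.append((("q", "", A), ("q", x)))      # expand A -> x on the stack
    for a in terminals:
        delta.append((("q", a, a), ("q", "")))          # match a terminal
    return delta

# The grammar S -> aSa | bSb | c used in the next slide's example.
delta = grammar_to_pda({"S": ["aSa", "bSb", "c"]}, {"a", "b", "c"}, "S")
```

The resulting machine guesses a leftmost derivation: rule expansions happen on the stack, and terminals are checked off against the input as they surface.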
Example
EX. Construct a PA M for G = (V, Σ, R, S), where
V = {S, a, b, c}, Σ = {a, b, c}
R: S → a S a, S → b S b, S → c
The PA M is
M = ({p, q}, Σ, V, Δ, p, {q})
Δ: (p, e, e) → (q, S)
   (q, e, S) → (q, a S a)
   (q, e, S) → (q, b S b)
   (q, e, S) → (q, c)
   (q, a, a) → (q, e)
   (q, b, b) → (q, e)
   (q, c, c) → (q, e)
Operation on abbcbba:
State   Unread Input   Stack
p       abbcbba        e
q       abbcbba        S
q       abbcbba        aSa
q       bbcbba         Sa
q       bbcbba         bSba
q       bcbba          Sba
q       bcbba          bSbba
q       cbba           Sbba
q       cbba           cbba
q       bba            bba
q       ba             ba
q       a              a
q       e              e
Now, we need to prove L(M) = L(G).
Claim. Let w ∈ Σ* and α ∈ (V − Σ)V* ∪ {e}. Then S ⇒*_L w α if and only if (q, w, S) ⊢* (q, e, α).
Proof of Claim.
(If part) Suppose S ⇒*_L w α, where w ∈ Σ* and α ∈ (V − Σ)V* ∪ {e}. We prove (q, w, S) ⊢* (q, e, α) by induction on the length of the leftmost derivation.
Basis step. The length is 0, i.e. w = e and α = S.
Induction hypothesis: Assume that if S ⇒*_L w α by a derivation of length n or less, n ≥ 0, then (q, w, S) ⊢* (q, e, α).
Induction step. Let
S = u_0 ⇒_L u_1 ⇒_L ... ⇒_L u_n ⇒_L u_{n+1}
be a leftmost derivation of u_{n+1} from S. Let A be the leftmost nonterminal symbol of u_n. Then
u_n = x A β, u_{n+1} = x γ β
where x ∈ Σ*, β, γ ∈ V*, and A → γ ∈ R.
(Only-if part) Suppose (q, w, S) ⊢* (q, e, α) with w ∈ Σ* and α ∈ (V − Σ)V* ∪ {e}.
We show S ⇒*_L w α, by induction on the number of transitions of type 2 in the computation by M.
Part 2. If a language is accepted by a pushdown automaton, then it is a context-free language. We consider simple pushdown automata:
Whenever ((q, a, β), (p, γ)) ∈ Δ is a transition and q is not the start state, then β ∈ Γ and |γ| ≤ 2.
Note: Any pushdown automaton can be simulated by a simple pushdown automaton.
Construction of context-free grammar
G = (V, Σ, R, S): Σ is the same as M's input alphabet; S is the new start symbol; V consists of Σ, S, and all the triples below:
<q, A, p>, for q, p ∈ K and A ∈ Γ ∪ {e}
Explain <q, A, p>:
< q , A , p > represents any portion of the input string that might be read between a point in time when M is in state q with A on the top of its stack, and a point in time when M removes A from the stack and enters state p.
R:
(1) S → <s, Z, f'>, where s is the start state of the original PA M, Z is its bottom stack symbol, and f' is the new final state.
(2) For each ((q, a, B), (r, C)) ∈ Δ, where q, r ∈ K, a ∈ Σ ∪ {e}, B, C ∈ Γ ∪ {e}, and for each p ∈ K,
    add the rule <q, B, p> → a <r, C, p>.
(3) For each ((q, a, B), (r, C1 C2)) ∈ Δ, where q, r ∈ K, a ∈ Σ ∪ {e}, B ∈ Γ ∪ {e}, C1, C2 ∈ Γ, and for each p, p' ∈ K,
    add the rule <q, B, p> → a <r, C1, p'> <p', C2, p>.
(4) For each q ∈ K, add <q, e, q> → e.
Claim. For all q, p ∈ K, A ∈ Γ ∪ {e}, and x ∈ Σ*: <q, A, p> ⇒* x if and only if (q, x, A) ⊢* (p, e, e).
Closure Properties.
Theorem 3.5.1. CF languages are closed under union, concatenation and Kleene star.
Proof. Given G1 = (V1, Σ1, R1, S1) and G2 = (V2, Σ2, R2, S2), with nonterminals renamed if necessary so that V1 − Σ1 and V2 − Σ2 are disjoint.
Union: Want G = (V, Σ, R, S) such that L(G) = L(G1) ∪ L(G2).
Construction of G:
V = V1 ∪ V2 ∪ {S}, Σ = Σ1 ∪ Σ2
R = R1 ∪ R2 ∪ {S → S1, S → S2}
Closure Properties
Concatenation: Want G = (V, Σ, R, S) such that L(G) = L(G1) L(G2).
Construction of G:
V = V1 ∪ V2 ∪ {S}, Σ = Σ1 ∪ Σ2
R = R1 ∪ R2 ∪ {S → S1 S2}
Closure Properties
Kleene star: Want G = (V, Σ, R, S) such that L(G) = L(G1)*, where G1 = (V1, Σ1, R1, S1).
Construction of G:
V = V1 ∪ {S}, Σ = Σ1
R = R1 ∪ {S → e, S → S S1}
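The three constructions can be sketched uniformly in code (the grammar representation and function names are illustrative; as in the proof, the two grammars' nonterminals are assumed disjoint, renaming first if not):

```python
# Hedged sketches of the closure constructions. A grammar is a tuple
# (nonterminals, terminals, rules, start); rules: dict nonterminal -> list
# of right-hand sides given as strings.

def union(g1, g2, S="S"):
    v1, t1, r1, s1 = g1; v2, t2, r2, s2 = g2
    rules = {**r1, **r2, S: [s1, s2]}                 # S -> S1 | S2
    return v1 | v2 | {S}, t1 | t2, rules, S

def concat(g1, g2, S="S"):
    v1, t1, r1, s1 = g1; v2, t2, r2, s2 = g2
    rules = {**r1, **r2, S: [s1 + s2]}                # S -> S1 S2
    return v1 | v2 | {S}, t1 | t2, rules, S

def star(g1, S="S"):
    v1, t1, r1, s1 = g1
    rules = {**r1, S: ["", S + s1]}                   # S -> e | S S1
    return v1 | {S}, t1, rules, S

g1 = ({"A"}, {"a"}, {"A": ["aA", "a"]}, "A")          # generates a+
g2 = ({"B"}, {"b"}, {"B": ["bB", "b"]}, "B")          # generates b+
gu = union(g1, g2)
```

Each construction only adds one fresh start symbol and one or two rules, which is why closure is immediate.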
Intersection with Regular Languages
Theorem 3.5.2. The intersection of a CF language with a regular language is CF.
Proof: Given L1 = L(M1) and L2 = L(M2), where
M1 is a pushdown automaton M1 = (K1, Σ, Γ1, Δ1, s1, F1), and
M2 is a deterministic finite automaton M2 = (K2, Σ, δ, s2, F2).
Want a pushdown automaton M = (K, Σ, Γ, Δ, s, F) such that L(M) = L(M1) ∩ L(M2).
Proof (continued).
Idea: Use M to do a parallel simulation of M1 and M2.
Construction:
K = K1 × K2, Γ = Γ1, s = (s1, s2), F = F1 × F2
Δ:
If ((q1, a, β), (p1, γ)) ∈ Δ1 with a ∈ Σ, then for each q2 ∈ K2 define (((q1, q2), a, β), ((p1, δ(q2, a)), γ)).
If ((q1, e, β), (p1, γ)) ∈ Δ1, then for each q2 ∈ K2 define (((q1, q2), e, β), ((p1, q2), γ)).
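This product construction can be sketched directly on transition lists (representation and names are my own; the DFA is assumed total, i.e. δ is defined on every (state, symbol) pair the PA can read):

```python
# Hedged sketch of the parallel-simulation construction. PA transitions are
# ((q1, a, beta), (p1, gamma)), with a a symbol or '' for an e-move.

def product(pda_trans, dfa_delta, dfa_states):
    out = []
    for (q1, a, beta), (p1, gamma) in pda_trans:
        for q2 in dfa_states:
            if a == "":
                # e-moves consume no input, so the DFA component stays put
                out.append((((q1, q2), "", beta), ((p1, q2), gamma)))
            else:
                # on a real symbol, both components advance in lockstep
                out.append((((q1, q2), a, beta),
                            ((p1, dfa_delta[(q2, a)]), gamma)))
    return out

# Two PA transitions combined with a two-state DFA over {a}.
pda = [(("s", "a", ""), ("s", "a")), (("s", "", ""), ("f", ""))]
dfa = {("e0", "a"): "e1", ("e1", "a"): "e0"}
combined = product(pda, dfa, {"e0", "e1"})
```

The same product idea fails for two PAs: one stack cannot track two independent stacks, which is consistent with CF languages not being closed under intersection (Theorem 3.5.4 below).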
A Technical Lemma
Let G = (V, Σ, R, S) be a CF grammar. Let φ(G) denote the largest number of symbols on the right-hand side of any rule in R. φ(G) indicates the largest number of children a node in a parse tree of G may have.
Lemma 3.5.1. The yield of any parse tree of G of height h has length at most φ(G)^h.
Proof. Estimate the tree size.
The Pumping Theorem
Theorem 3.5.3. Let G = (V, Σ, R, S) be a CF grammar. Then any string w ∈ L(G) of length greater than φ(G)^|V| can be written as
w = u v x y z
such that either v or y is nonempty and
u v^n x y^n z ∈ L(G) for every n ≥ 0.
Furthermore, v, x, y can be chosen so that |vxy| ≤ φ(G)^(|V|+1).
Proof of the Pumping Theorem
[Proof sketch figure: a long path in the parse tree must repeat some nonterminal A; the subtree rooted at the upper A has yield v x y, the subtree at the lower A has yield x, and w splits as u | v | x | y | z. Replacing one subtree by the other pumps v and y.]
Non-CF Languages: Applications of the Pumping Theorem
Consider
L1 = {a^n b^n c^n : n ≥ 0}
L2 = {a^n : n ≥ 1 is prime}
L3 = {w ∈ {a, b, c}* : w has equal numbers of a's, b's and c's}
Non-CF Languages: Applications of the Pumping Theorem
Proofs for L1 and L2 are direct applications of the pumping theorem. Consider several cases for L1.
For L2, if a^m = u v x y z with |vy| = q ≥ 1 and |uxz| = r, then |u v^n x y^n z| = nq + r; note that we can choose n = q + 1 + r to make nq + r = (q + 1)(q + r), a composite number.
Note: L1 = L3 ∩ a* b* c*.
Proof for L3 is done by contradiction and an indirect application of the pumping theorem: if L3 were CF, then by Theorem 3.5.2 so would L1 be.
Demonstrate proofs in class to show that these three languages are not CF.
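For L1, the case analysis can also be checked mechanically on a small instance. The following brute-force sketch (my own, a check on one string rather than a proof) confirms that no decomposition of a³b³c³ survives both pumping down (n = 0) and pumping up (n = 2):

```python
# Illustrative check: every split a^3 b^3 c^3 = u v x y z with v or y
# nonempty fails the pumping condition for at least one of n = 0 and n = 2.

def in_L1(w):
    n = w.count("a")
    return w == "a" * n + "b" * n + "c" * n

def some_split_pumps(w):
    L = len(w)
    for i in range(L + 1):                      # w = u v x y z
        for j in range(i, L + 1):
            for k in range(j, L + 1):
                for l in range(k, L + 1):
                    u, v, x, y, z = w[:i], w[i:j], w[j:k], w[k:l], w[l:]
                    if not (v or y):
                        continue                # v or y must be nonempty
                    if in_L1(u + x + z) and in_L1(u + v * 2 + x + y * 2 + z):
                        return True
    return False
```

The string a³b³c³ is shorter than the theorem's threshold, so this is only an illustration of the case analysis, not the proof itself; the proof applies the same reasoning to an arbitrarily long a^n b^n c^n.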
A New Picture of Languages
[Diagram: Regular Languages form a proper subset of CF Languages.]
Theorem 3.5.4. CF languages are not closed under intersection or complementation.
Proof. By contradiction. Consider the two CF languages
L1 = {a^n b^n c^m : n, m ≥ 0}
L2 = {a^m b^n c^n : n, m ≥ 0}
If CF languages were closed under intersection, then
L = L1 ∩ L2 = {a^n b^n c^n : n ≥ 0}
would be CF. Unfortunately, this is not true (by the Pumping Theorem).
Closure under complementation would also fail: since CF languages are closed under union, closure under complementation would give closure under intersection via De Morgan's law.
Algorithms for CF Grammars
Theorem 3.6.1.
(a) There is a polynomial algorithm which, given a CF grammar, constructs an equivalent pushdown automaton.
(b) There is a polynomial algorithm which, given a pushdown automaton, constructs an equivalent CF grammar.
(c) There is a polynomial algorithm which, given a CF grammar G and a string x, decides whether x ∈ L(G).
(a) and (b) are straightforward: see the proof of the equivalence of CF grammars and PAs above, where the constructions were given. (c) is not as easy to see.
Proof of (c)
Two major steps:
Convert the CF grammar to an equivalent Chomsky Normal Form CF grammar.
Decide the membership problem for the Chomsky Normal Form grammar, using the dynamic programming technique.
Chomsky Normal Form
A CF grammar G = (V, Σ, R, S) is in Chomsky Normal Form if R ⊆ (V − Σ) × V².
Understand R: every right-hand side has exactly two symbols, so a Chomsky Normal Form grammar cannot generate any string in Σ ∪ {e} (every generated string has length at least 2).
Conversion to Chomsky Normal Form
Convert G = (V, Σ, R, S) to a Chomsky Normal Form CF grammar G' such that L(G') = L(G) − (Σ ∪ {e}).
Step 1. Get rid of longer rules.
Ex: A → B1 B2 B3 B4 B5 will be replaced by
A → B1 A1, A1 → B2 A2, A2 → B3 A3, A3 → B4 B5
where A1, A2 and A3 are new nonterminals. The time complexity of this step is O(n).
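Step 1 can be sketched directly (the rule representation and the fresh-name scheme X1, X2, ... are my own):

```python
# Illustrative sketch of Step 1: replace each rule A -> B1 B2 ... Bk (k > 2)
# by a chain of binary rules through fresh nonterminals.
# Rules are (lhs, rhs-tuple) pairs.

def split_long_rules(rules):
    out, fresh = [], 0
    for lhs, rhs in rules:
        while len(rhs) > 2:
            fresh += 1
            new = f"X{fresh}"                  # fresh nonterminal name
            out.append((lhs, (rhs[0], new)))   # peel off the first symbol
            lhs, rhs = new, rhs[1:]
        out.append((lhs, rhs))
    return out

binary = split_long_rules([("A", ("B1", "B2", "B3", "B4", "B5"))])
```

Each long rule of length k yields k − 2 fresh nonterminals, so the output size is linear in the input size, matching the O(n) bound on the slide.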
Step 2: Get Rid of e-Rules
Find the set of erasable nonterminals E = {A ∈ V − Σ : A ⇒* e}.
Algorithm to find the erasable nonterminals:
E = ∅
While there is a rule A → α with α ∈ E* and A ∉ E, add A to E.
Delete from G the e-rules A → e, and add short rules:
For any rule A → B C or A → C B with B ∈ E and C ∈ V, add A → C.
The time complexity of the above steps is O(n²).
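The while-loop above is a fixpoint computation; a sketch (rule representation is illustrative):

```python
# Illustrative sketch of computing the erasable set E = { A : A =>* e }.
# Rules are (lhs, rhs-tuple) pairs; the empty tuple () encodes A -> e.

def erasable(rules):
    E = set()
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            # A -> e gives all(...) on (), and A -> alpha with alpha in E*
            # gives all(...) once every symbol of alpha is already in E
            if lhs not in E and all(s in E for s in rhs):
                E.add(lhs)
                changed = True
    return E

rules = [("A", ()), ("B", ("A", "A")), ("C", ("B", "D")), ("D", ("d",))]
```

Here A is erasable directly, B via A A, while C is not because D only derives the terminal d. Each pass adds at least one nonterminal or stops, so the loop runs at most |V − Σ| + 1 times.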
Step 3: Get Rid of Shorter Rules
For each A ∈ V compute D(A) = {B ∈ V : A ⇒* B}.
Algorithm:
D(A) = {A}
While there is a rule B → C with B ∈ D(A) and C ∉ D(A), add C to D(A).
Delete the shorter (unit) rules A → B.
For each rule A → B C, replace it by A → B' C' for every B' ∈ D(B) and C' ∈ D(C).
Finally, for every A ∈ D(S) and every rule A → B C, add S → B C.
Time complexity of the above is O(n²).
Decide the Membership Problem for the Chomsky Normal Form
Idea: Dynamic programming.
For any x = x1 x2 ... xn ∈ Σ*, decide x ∈ L(G) by analyzing substrings of x.
For 1 ≤ i ≤ i + s ≤ n, define N[i, i+s] to be the set of all symbols in V that can derive in G the substring x_i ... x_{i+s}.
Use dynamic programming to compute N[i, i+s], and hence N[1, n]: x ∈ L(G) if and only if S ∈ N[1, n].
Find N[i, i+s]
Algorithm:
For i = 1 to n do N[i, i] = {x_i}; let all other N[i, j] = ∅.
For s = 1 to n − 1 do
  for i = 1 to n − s do
    for k = i to i + s − 1 do
      if there is a rule A → B C ∈ R with B ∈ N[i, k] and C ∈ N[k+1, i+s],
        add A to N[i, i+s].
Accept x if S ∈ N[1, n].
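The algorithm above is the CYK dynamic program. A sketch with 0-based indices (the representation is my own; it seeds the diagonal with terminal rules A → a, a common CNF variant, where the slides' version instead puts x_i itself in N[i, i]):

```python
# Illustrative CYK sketch. unit: dict terminal -> set of nonterminals A with
# A -> a; bin_rules: list of (A, B, C) for binary rules A -> B C.

def cyk(unit, bin_rules, start, x):
    n = len(x)
    if n == 0:
        return False                      # a CNF grammar never generates e
    N = [[set() for _ in range(n)] for _ in range(n)]
    for i in range(n):
        N[i][i] = set(unit.get(x[i], set()))   # seed the diagonal
    for s in range(1, n):                 # substring length minus 1
        for i in range(n - s):
            for k in range(i, i + s):     # split point of x_i .. x_{i+s}
                for A, B, C in bin_rules:
                    if B in N[i][k] and C in N[k + 1][i + s]:
                        N[i][i + s].add(A)
    return start in N[0][n - 1]

# An illustrative CNF grammar for {a^n b^n : n >= 1}:
# S -> A T | A B, T -> S B, A -> a, B -> b.
unit = {"a": {"A"}, "b": {"B"}}
bin_rules = [("S", "A", "T"), ("S", "A", "B"), ("T", "S", "B")]
```

The three nested loops over s, i, k give the O(n³) running time (times the grammar size), which is where the polynomial bound in Theorem 3.6.1(c) comes from.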