First and Follow Set


Transcript of First and Follow Set

Page 1: First and Follow Set

Compiler Construction: To find out FIRST & FOLLOW set

LL PARSER

The construction of a predictive parser is aided by two functions associated with a grammar, called FIRST and FOLLOW. These functions allow us to fill in the entries of the predictive parsing table.

FIRST :

FIRST(A) is defined for every grammar symbol A: it is the set of terminals that can begin a string derived from A in zero or more derivation steps. More generally, if α is any string of grammar symbols, FIRST(α) is the set of terminals that begin the strings derived from α. If α =*=> є, then є is also in FIRST(α).

RULES TO COMPUTE FIRST SET

1) If X is a terminal, then FIRST(X) is {X}.
2) If X --> є is a production, then add є to FIRST(X).
3) If X is a nonterminal and X --> Y1 Y2 Y3 ... Yn is a production, then put 'a' in FIRST(X) if for some i, a is in FIRST(Yi) and є is in all of FIRST(Y1), ..., FIRST(Yi-1); in other words, Y1...Yi-1 =*=> є. If є is in FIRST(Yj) for all j = 1, 2, ..., n, then add є to FIRST(X). This is so because all the Yi's produce є, so X definitely produces є.
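The three FIRST rules can be applied repeatedly until no set grows any further. The following is a minimal sketch of that idea, not a definitive implementation. The function name compute_first, the dictionary encoding of the grammar, and the small grammar X --> Y Z c, Y --> a | є, Z --> b | є are assumptions made only for illustration; they do not come from the slides.

```python
EPS = "є"  # marker for the empty string, as in the rules above

def compute_first(grammar):
    """grammar maps each nonterminal to a list of right-hand sides.
    Each right-hand side is a list of symbols; [] stands for an є-production.
    Any symbol that is not a key of grammar is treated as a terminal."""
    first = {nt: set() for nt in grammar}

    def first_of(symbol):
        # Rule 1: FIRST of a terminal is the terminal itself.
        return first[symbol] if symbol in grammar else {symbol}

    changed = True
    while changed:                       # repeat until no FIRST set grows
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                before = len(first[nt])
                all_nullable = True
                for sym in rhs:          # Rule 3: walk Y1 Y2 ... Yn
                    f = first_of(sym)
                    first[nt] |= f - {EPS}
                    if EPS not in f:     # Yi cannot vanish, stop here
                        all_nullable = False
                        break
                if all_nullable:         # Rule 2, and the last part of rule 3
                    first[nt].add(EPS)
                if len(first[nt]) != before:
                    changed = True
    return first

# Hypothetical grammar, chosen so that є-productions matter.
grammar = {
    "X": [["Y", "Z", "c"]],
    "Y": [["a"], []],
    "Z": [["b"], []],
}
first = compute_first(grammar)
print(first)
```

Because Y and Z can both vanish in this assumed grammar, FIRST(X) picks up 'a', 'b' and 'c', but not є, since the final c cannot vanish.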

FOLLOW :

FOLLOW is defined only for the nonterminals of the grammar G. For a nonterminal A, it is the set of terminals of G that can immediately follow A in some sentential form derived from the start symbol.

In other words, if A is a nonterminal, then FOLLOW(A) is the set of terminals 'a' that can appear immediately to the right of A in some sentential form, i.e. the set of terminals 'a' such that there exists a derivation of the form S =*=> αAaβ for some α and β (which can be strings, or empty).

RULES TO COMPUTE FOLLOW SET

1) If S is the start symbol, then add $ (the end-of-input marker) to FOLLOW(S).
2) If there is a production A --> αBβ, then everything in FIRST(β) except for є is placed in FOLLOW(B).
3) If there is a production A --> αB, or a production A --> αBβ where FIRST(β) contains є, then everything in FOLLOW(A) is in FOLLOW(B).

     Prepared by: Paul PrOnabananda; Student of M.S in CSE NSU ID# 073764050               Email:   [email protected]/hotmail.com/yahoo.com                               Cell: 01724843626; 01553466667; 01921144286

Page 2: First and Follow Set

Example : Consider the following grammar

            S --> aABe
            A --> Abc | b
            B --> d

Find the FIRST and FOLLOW for each nonterminal of the grammar.

Solution :

Steps :
1) Find for every nonterminal whether it is nullable. (In this grammar no nonterminal is nullable, because every right-hand side contains at least one terminal.)
2) Find FIRST for every nonterminal using the rules described earlier.
3) Using the information from steps 1 and 2, calculate FOLLOW for every nonterminal using the rules described earlier.

To calculate FIRSTs

a) To calculate FIRST(S) :
     From rule S --> aABe, we get that 'a' belongs to FIRST(S).
     No other rule contributes an element to FIRST(S).
     So, FIRST(S) = {a}

b) To calculate FIRST(A) :
     From rule A --> Abc, we can't add any element: the right-hand side begins with A itself, so it only puts FIRST(A) back into FIRST(A).
     From rule A --> b, we know that 'b' belongs to FIRST(A).
     No other rule contributes an element to FIRST(A).
     So, FIRST(A) = {b}

c) To calculate FIRST(B) :
     From rule B --> d, we add 'd' to FIRST(B).
     No other rule contributes an element to FIRST(B).
     So, FIRST(B) = {d}

To calculate FOLLOWs

a) To calculate FOLLOW(S) :
     Since S is the start symbol, add $ to FOLLOW(S).
     From rule S --> aABe, we don't get any contribution to FOLLOW(S).   /* See rules 2 and 3 for FOLLOW */
     From rule A --> Abc, since the symbol S appears nowhere on the right-hand side, no contribution to FOLLOW(S) is found in this rule.
     From rule A --> b, again no contribution.
     From rule B --> d, again no contribution.
     Hence FOLLOW(S) = {$}

b) To calculate FOLLOW(A) :


Page 3: First and Follow Set

     From rules A --> b and B --> d we get no contribution.
     From rule S --> aABe we expect to get some contribution.
     See rule 2 for computing FOLLOW.
     As per that, we can add everything in FIRST(Be) to FOLLOW(A), except for є.
     FIRST(Be) = FIRST(B) = {d}
     So add 'd' to FOLLOW(A).
     Since Be is not nullable, we can't use rule 3.
     For rule A --> Abc, we do get some contribution straight away.
     From rule 2 (with α empty and β = bc, so FIRST(β) = {b}), 'b' belongs to FOLLOW(A).
     No other rule contributes an element to FOLLOW(A).
     Hence FOLLOW(A) = {d, b}

c) To calculate FOLLOW(B) :
     Only rule S --> aABe contributes.
     By rule 2, 'e' belongs to FOLLOW(B).
     Hence FOLLOW(B) = {e}
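The hand computation of the FOLLOW sets can be cross-checked mechanically. The sketch below applies FOLLOW rules 1 to 3 until nothing changes, using the example grammar and taking the FIRST sets found in the worked example as given input. The function names first_of_string and compute_follow and the dictionary encoding of the grammar are assumptions made for illustration only.

```python
EPS = "є"   # empty string marker
END = "$"   # end-of-input marker

def first_of_string(symbols, first, nonterminals):
    """FIRST of a string of grammar symbols, given FIRST of each nonterminal."""
    result = set()
    for sym in symbols:
        f = first[sym] if sym in nonterminals else {sym}
        result |= f - {EPS}
        if EPS not in f:
            return result
    result.add(EPS)   # every symbol can vanish (also true for the empty string)
    return result

def compute_follow(grammar, first, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)                       # rule 1
    changed = True
    while changed:
        changed = False
        for lhs, alternatives in grammar.items():
            for rhs in alternatives:
                for i, sym in enumerate(rhs):
                    if sym not in grammar:
                        continue                 # FOLLOW is only for nonterminals
                    beta = first_of_string(rhs[i + 1:], first, grammar)
                    new = beta - {EPS}           # rule 2
                    if EPS in beta:              # rule 3
                        new |= follow[lhs]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

# The example grammar: S --> aABe, A --> Abc | b, B --> d
grammar = {
    "S": [["a", "A", "B", "e"]],
    "A": [["A", "b", "c"], ["b"]],
    "B": [["d"]],
}
# FIRST sets as computed by hand in the worked example.
first = {"S": {"a"}, "A": {"b"}, "B": {"d"}}
follow = compute_follow(grammar, first, "S")
print(follow)
```

On this grammar the sketch yields FOLLOW(S) = {$}, FOLLOW(A) = {b, d} and FOLLOW(B) = {e}, in agreement with the worked example.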

(compiler construction)

Top down parsing (continued)

Last lecture we talked about recursive descent parsing, which I said was the method of choice if you don't have an automated parsing tool (such as Yacc) to do the dirty work for you.

Recall that in order for RD to work, our grammar had to be LL(1), which meant that any time we had a choice of several alternatives, we must be able to decide upon one alternative based only on the NEXT input token.

That is, if we have a choice, such as

A -> alpha | beta | gamma | delta

we must be able to tell which choice to take based on only ONE token. Let us today examine some implications of this.

If the correct alternative to take at this point is an alpha, then either
(1) the next token must be something that can begin an alpha, or
(2) it must be legal for alpha to match nothing, and the next token must be something that can follow an A.

Possibility (1) leads us into defining a relation, which we will call FIRST. FIRST(alpha) will be the set of symbols that can begin a string derived


Page 4: First and Follow Set

starting from alpha (remember alpha may contain tokens and/or nonterminals).

Possibility (2) leads us into defining a relation FOLLOW(A), which is the set of symbols that can legally follow an A. Note that in this case, unlike the first, the "argument" is a single nonterminal.

I want to quickly go over the algorithm to compute first and follow. Really, the details are not all that important. What you should know is
(1) what first and follow are, and
(2) how they are used.
The details of how to compute first and follow are given in the book in much more detail, if you really want to see it.

The initial step is to discover all the nonterminals that can derive nothing. Let us call this the EPSILON set. Clearly any nonterminal that has an epsilon production is in the epsilon set, so we first scan the productions and add these symbols. Next, we repeatedly do the following:

    Scan the productions.
    If any production has only nonterminals on the right hand side, and
    if all those nonterminals are in the epsilon set, then
    add the left hand side to the epsilon set.

We do this until we have scanned the productions at least once without adding anything to the epsilon set.
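The scan-until-nothing-changes procedure for the EPSILON set can be sketched as follows. The function name epsilon_set, the dictionary encoding (an empty list stands for an epsilon production) and the sample grammar are all assumptions made for illustration; none of them come from the lecture or the book.

```python
def epsilon_set(grammar):
    """grammar maps each nonterminal to a list of right-hand sides
    (lists of symbols); an empty list [] is an epsilon production."""
    # First scan: nonterminals that have an epsilon production.
    eps = {nt for nt, alternatives in grammar.items() if [] in alternatives}

    # Repeat until one full scan adds nothing to the set.
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            if nt in eps:
                continue
            # Terminals are never in eps, so a right-hand side passes this
            # test only if it is made entirely of nullable nonterminals.
            if any(all(sym in eps for sym in rhs) for rhs in alternatives):
                eps.add(nt)
                changed = True
    return eps

# Hypothetical grammar:
#   S -> A B | c      A -> epsilon | a      B -> A A | b      C -> A c
grammar = {
    "S": [["A", "B"], ["c"]],
    "A": [[], ["a"]],
    "B": [["A", "A"], ["b"]],
    "C": [["A", "c"]],
}
nullable = epsilon_set(grammar)
print(sorted(nullable))
```

In this assumed grammar A enters the set in the first scan, B follows because A A can vanish, and S follows once B is known to be nullable. C never enters, because its only right-hand side contains the terminal c.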

Now to construct FIRST(alpha), there are three cases:

(1) alpha begins with a terminal x: then first(alpha) consists of only x.
(2) alpha begins with a nonterminal X: we go compute first(X), and assign this to first(alpha). Now if X is in the epsilon set, we also add first(remainder) to first(alpha).
(3) alpha is epsilon: add epsilon to the first set.

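To compute the follow set of a nonterminal X, look at all the places X is used in the grammar. Compute what could come next (using the first set information) after we have seen the X. If everything after X in a production can derive nothing, then whatever can follow the left hand side of that production can also follow X. This is the follow set.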

(depending upon time, give railroad diagram form for grammar, and intuition about first and follow in terms of railroad charts).

Once we have first and follow, we can rephrase the LL(1) requirement, as follows:

When we are faced with a choice

A -> alpha | beta | gamma | delta

the first sets of the choices must be disjoint (no token in common), and if any choice can be empty then these must also be disjoint from the follow set for A.
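The rephrased LL(1) requirement is easy to test once the sets are known. The sketch below checks one choice point, given the first set of each alternative and the follow set of A. The function name is_ll1_choice, the "epsilon" marker string and the sample sets are assumptions made for illustration.

```python
from itertools import combinations

EPS = "epsilon"  # marker meaning "this alternative can be empty"

def is_ll1_choice(first_sets, follow_a):
    """first_sets: one first set per alternative of A.
    follow_a: the follow set of A."""
    # The first sets of the alternatives must be pairwise disjoint.
    # (Two alternatives that can both be empty share EPS and fail here.)
    for f1, f2 in combinations(first_sets, 2):
        if f1 & f2:
            return False
    # If some alternative can be empty, no other alternative may begin
    # with a token that can follow A.
    for i, f in enumerate(first_sets):
        if EPS in f:
            for j, g in enumerate(first_sets):
                if j != i and g & follow_a:
                    return False
    return True

# Hypothetical choice point: first(alpha) = {a}, first(beta) = {b, epsilon}
ok = is_ll1_choice([{"a"}, {"b", EPS}], follow_a={"c"})
clash = is_ll1_choice([{"a"}, {"b", EPS}], follow_a={"a"})
print(ok, clash)
```

In the second call the token a could either start alpha or follow an empty beta, so one token of lookahead cannot decide between the two and the check fails.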

We can also rephrase our construction rule for building a procedure for recognizing nonterminals. In this example, our procedure would be


Page 5: First and Follow Set


boolean procedure A()
    if next token in first set for alpha then
        recognize alpha
    else if next token in first set for beta then
        recognize beta
    else if next token in first set for gamma then
        recognize gamma
    else if next token in first set for delta then
        recognize delta
    else
        error

The error is replaced by "return true" if epsilon is a legal choice for this nonterminal.
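As a concrete instance of this procedure, here is a minimal sketch for a made-up nonterminal with three alternatives, A -> ( A ) | id | epsilon. The grammar, the Parser class and the helper names next_token, match and accepts are assumptions made for illustration. Since epsilon is a legal choice here, the final branch returns true instead of reporting an error.

```python
class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]   # "$" marks the end of input
        self.pos = 0

    def next_token(self):
        return self.tokens[self.pos]

    def match(self, expected):
        # Consume the next token only if it is the expected one.
        if self.next_token() == expected:
            self.pos += 1
            return True
        return False

    def A(self):
        tok = self.next_token()
        if tok == "(":                       # first set of "( A )"
            return self.match("(") and self.A() and self.match(")")
        elif tok == "id":                    # first set of "id"
            return self.match("id")
        else:
            # epsilon is a legal choice for A, so "error" becomes
            # "return True"; the caller checks what comes next.
            return True

def accepts(tokens):
    parser = Parser(tokens)
    # The whole input must be used up after recognizing one A.
    return parser.A() and parser.next_token() == "$"

print(accepts(["(", "id", ")"]), accepts(["(", "id"]))
```

Each branch is selected by looking at one token only, which works because the first sets {(} and {id} are disjoint and neither token can follow an A in this assumed grammar.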

This all seems very straightforward. We can encode our choices in a table, indexed by nonterminals and terminals and containing right hand sides.

M[A,x] is "if you are looking for an A, and the next token is an x, thenthis is the production to use."

Procedure is given on page 225.

How do we use this? Build a pushdown automaton with a stack full of GOALS. The algorithm is given on page 226. This is our first example of a table driven parser.
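The book's algorithm is not reproduced here. What follows is only a minimal sketch of the general idea of a stack of goals driven by a table M[A, x], written for a made-up grammar A -> ( A ) | id | epsilon. The function name ll1_parse, the table encoding and the grammar are assumptions made for illustration.

```python
def ll1_parse(tokens, table, start, nonterminals):
    tokens = list(tokens) + ["$"]     # "$" marks the end of input
    stack = ["$", start]              # goals; the top is the end of the list
    pos = 0
    while stack:
        goal = stack.pop()
        tok = tokens[pos]
        if goal in nonterminals:
            rhs = table.get((goal, tok))
            if rhs is None:
                return False          # empty table entry: syntax error
            # Replace the goal by its right hand side, leftmost symbol on top.
            stack.extend(reversed(rhs))
        elif goal == tok:
            pos += 1                  # terminal goal met: consume the token
        else:
            return False              # terminal goal does not match the input
    return pos == len(tokens)

# M[A, x] for the hypothetical grammar; [] is the epsilon right hand side,
# entered under the tokens that can follow A.
table = {
    ("A", "("): ["(", "A", ")"],
    ("A", "id"): ["id"],
    ("A", ")"): [],
    ("A", "$"): [],
}
print(ll1_parse(["(", "id", ")"], table, "A", {"A"}))
```

The parser itself contains no grammar-specific code: changing the language means changing only the table, which is one reason such parsers stay small.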

Advantages of table driven parsers:
(1) small and
(2) fast
Disadvantages:
(1) hard to debug if you are doing it by hand.
