PCD UNIT II new

8/7/2019 PCD UNIT II new


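LR Parsing Algorithm

[Figure: model of an LR parser. The stack holds S0 X1 S1 ... Xm-1 Sm-1 Xm Sm, with Sm on top; the input buffer holds a1 ... ai ... an $; the LR parsing algorithm reads the stack and the input and produces the output.]

Action Table: one row per state, one column per terminal and $; each entry is one of four different actions.

Goto Table: one row per state, one column per non-terminal; each entry is a state number.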

A Configuration of LR Parsing Algorithm

A configuration of an LR parser is:

( S0 X1 S1 ... Xm Sm, ai ai+1 ... an $ )

The first component is the stack and the second is the rest of the input.

Sm and ai decide the parser action by consulting the parsing action table. (The initial stack contains just S0.)

A configuration of an LR parser represents the right-sentential form:

X1 ... Xm ai ai+1 ... an $

Actions of A LR-Parser

1. shift s -- shifts the next input symbol and the state s onto the stack

( S0 X1 S1 ... Xm Sm, ai ai+1 ... an $ ) → ( S0 X1 S1 ... Xm Sm ai s, ai+1 ... an $ )

2. reduce A → β (or rn, where n is a production number)

pop 2|β| items from the stack (r = |β| grammar symbols and their r states); then push A and s, where s = goto[Sm-r, A]

( S0 X1 S1 ... Xm Sm, ai ai+1 ... an $ ) → ( S0 X1 S1 ... Xm-r Sm-r A s, ai ... an $ )

The output is the reducing production A → β.

3. accept -- parsing successfully completed

4. error -- the parser detected an error (an empty entry in the action table)

Reduce Action

pop 2|β| items from the stack (r = |β| symbols and r states); let us assume that β = Y1Y2...Yr

then push A and s, where s = goto[Sm-r, A]

( S0 X1 S1 ... Xm-r Sm-r Y1 Sm-r+1 ... Yr Sm, ai ai+1 ... an $ )

→ ( S0 X1 S1 ... Xm-r Sm-r A s, ai ... an $ )

In fact, Y1Y2...Yr is a handle:

X1 ... Xm-r A ai ... an $ ⇒ X1 ... Xm-r Y1...Yr ai ai+1 ... an $

(SLR) Parsing Tables for Expression Grammar

Grammar:

1) E → E+T
2) E → T
3) T → T*F
4) T → F
5) F → (E)
6) F → id

Columns id through $ form the action table; columns E, T and F form the goto table.

state   id    +     *     (     )     $     E    T    F
0       s5                s4                1    2    3
1             s6                      acc
2             r2    s7          r2    r2
3             r4    r4          r4    r4
4       s5                s4                8    2    3
5             r6    r6          r6    r6
6       s5                s4                     9    3
7       s5                s4                          10
8             s6                s11
9             r1    s7          r1    r1
10            r3    r3          r3    r3
11            r5    r5          r5    r5

Actions of A (S)LR-Parser -- Example

stack             input        action                 output
0                 id*id+id$    shift 5
0 id 5            *id+id$      reduce by F → id       F → id
0 F 3             *id+id$      reduce by T → F        T → F
0 T 2             *id+id$      shift 7
0 T 2 * 7         id+id$       shift 5
0 T 2 * 7 id 5    +id$         reduce by F → id       F → id
0 T 2 * 7 F 10    +id$         reduce by T → T*F      T → T*F
0 T 2             +id$         reduce by E → T        E → T
0 E 1             +id$         shift 6
0 E 1 + 6         id$          shift 5
0 E 1 + 6 id 5    $            reduce by F → id       F → id
0 E 1 + 6 F 3     $            reduce by T → F        T → F
0 E 1 + 6 T 9     $            reduce by E → E+T      E → E+T
0 E 1             $            accept
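The table-driven loop used in this example can be sketched in Python. This is only an illustration: the names ACTION, GOTO, PRODUCTIONS and parse are assumptions, the table entries are copied from the SLR table for the expression grammar, and the stack keeps states only (each state implies its grammar symbol), so a reduce pops |β| states instead of 2|β| items.

```python
# number -> (left-hand side, length of right-hand side)
PRODUCTIONS = {
    1: ("E", 3),  # E -> E + T
    2: ("E", 1),  # E -> T
    3: ("T", 3),  # T -> T * F
    4: ("T", 1),  # T -> F
    5: ("F", 3),  # F -> ( E )
    6: ("F", 1),  # F -> id
}

# ACTION[state][terminal]: "sN" = shift N, "rN" = reduce by N, "acc" = accept.
# A missing entry is an error.
ACTION = {
    0: {"id": "s5", "(": "s4"},
    1: {"+": "s6", "$": "acc"},
    2: {"+": "r2", "*": "s7", ")": "r2", "$": "r2"},
    3: {"+": "r4", "*": "r4", ")": "r4", "$": "r4"},
    4: {"id": "s5", "(": "s4"},
    5: {"+": "r6", "*": "r6", ")": "r6", "$": "r6"},
    6: {"id": "s5", "(": "s4"},
    7: {"id": "s5", "(": "s4"},
    8: {"+": "s6", ")": "s11"},
    9: {"+": "r1", "*": "s7", ")": "r1", "$": "r1"},
    10: {"+": "r3", "*": "r3", ")": "r3", "$": "r3"},
    11: {"+": "r5", "*": "r5", ")": "r5", "$": "r5"},
}

GOTO = {
    0: {"E": 1, "T": 2, "F": 3},
    4: {"E": 8, "T": 2, "F": 3},
    6: {"T": 9, "F": 3},
    7: {"F": 10},
}


def parse(tokens):
    """Return the production numbers in the order they are reduced."""
    tokens = list(tokens) + ["$"]
    stack = [0]            # states only; the initial stack holds state 0
    output = []
    pos = 0
    while True:
        entry = ACTION[stack[-1]].get(tokens[pos])
        if entry is None:                      # empty table entry: error
            raise SyntaxError(f"unexpected {tokens[pos]!r} at position {pos}")
        if entry == "acc":
            return output
        number = int(entry[1:])
        if entry[0] == "s":                    # shift: push state, advance input
            stack.append(number)
            pos += 1
        else:                                  # reduce by production `number`
            lhs, length = PRODUCTIONS[number]
            del stack[-length:]
            stack.append(GOTO[stack[-1]][lhs])
            output.append(number)
```

Calling parse(["id", "*", "id", "+", "id"]) yields the same sequence of reductions as the output column of the trace.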


Kernel items include the initial item, S' → .S, and all items whose dots are not at the left end.

Nonkernel items have their dots at the left end.

Goto Operation

If I is a set of LR(0) items and X is a grammar symbol (terminal or non-terminal), then goto(I,X) is defined as follows:

If A → α.Xβ is in I, then every item in closure({A → αX.β}) will be in goto(I,X).

Example:

I = { E' → .E, E → .E+T, E → .T, T → .T*F, T → .F, F → .(E), F → .id }

goto(I,E) = { E' → E., E → E.+T }

goto(I,T) = { E → T., T → T.*F }

goto(I,F) = { T → F. }

goto(I,() = { F → (.E), E → .E+T, E → .T, T → .T*F, T → .F, F → .(E), F → .id }

goto(I,id) = { F → id. }
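The goto operation can be tried out with a short Python sketch. The item encoding (lhs, rhs, dot) and the names GRAMMAR, closure and goto are assumptions made for this illustration; the grammar is the augmented expression grammar from the example.

```python
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def closure(items):
    """An item is (lhs, rhs, dot); the dot sits just before rhs[dot]."""
    result = set(items)
    work = list(items)
    while work:
        _, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            # Add B -> .gamma for every production of the symbol after the dot.
            for lhs, body in GRAMMAR:
                if lhs == rhs[dot] and (lhs, body, 0) not in result:
                    result.add((lhs, body, 0))
                    work.append((lhs, body, 0))
    return frozenset(result)


def goto(items, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {
        (lhs, rhs, dot + 1)
        for lhs, rhs, dot in items
        if dot < len(rhs) and rhs[dot] == symbol
    }
    return closure(moved)


I0 = closure({("E'", ("E",), 0)})
```

Here I0 is the set I of the example, and goto(I0, "E"), goto(I0, "T") and so on reproduce the sets listed there.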


Conflict Example

Grammar:

S → L=R
S → R
L → *R
L → id
R → L

I0: S' → .S, S → .L=R, S → .R, L → .*R, L → .id, R → .L
I1: S' → S.
I2: S → L.=R, R → L.
I3: S → R.
I4: L → *.R, R → .L, L → .*R, L → .id
I5: L → id.
I6: S → L=.R, R → .L, L → .*R, L → .id
I7: L → *R.
I8: R → L.
I9: S → L=R.

Problem (in I2):

FOLLOW(R) = { =, $ }

On =, the parser can shift 6 or reduce by R → L: a shift/reduce conflict.
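The FOLLOW set behind this conflict can be computed with a small fixed-point iteration. The sketch below is an illustration only; the function names and the EPS marker are assumptions, and the grammar is the augmented version of the one in this example.

```python
EPS = "eps"  # assumed marker for the empty string


def first_of(symbols, first):
    """FIRST of a string of grammar symbols, given FIRST of the non-terminals."""
    out = set()
    for sym in symbols:
        sym_first = first[sym] if sym in first else {sym}  # terminal: itself
        out |= sym_first - {EPS}
        if EPS not in sym_first:
            return out
    out.add(EPS)  # every symbol can vanish, or the string is empty
    return out


def first_sets(grammar, nonterminals):
    first = {n: set() for n in nonterminals}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in grammar:
            before = len(first[lhs])
            first[lhs] |= first_of(rhs, first)
            changed |= len(first[lhs]) != before
    return first


def follow_sets(grammar, start):
    nonterminals = {lhs for lhs, _ in grammar}
    first = first_sets(grammar, nonterminals)
    follow = {n: set() for n in nonterminals}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for lhs, rhs in grammar:
            for i, sym in enumerate(rhs):
                if sym not in nonterminals:
                    continue
                tail = first_of(rhs[i + 1:], first)
                new = tail - {EPS}
                if EPS in tail:          # sym can end the right-hand side
                    new |= follow[lhs]
                if not new <= follow[sym]:
                    follow[sym] |= new
                    changed = True
    return follow


GRAMMAR = [
    ("S'", ("S",)),
    ("S", ("L", "=", "R")),
    ("S", ("R",)),
    ("L", ("*", "R")),
    ("L", ("id",)),
    ("R", ("L",)),
]
FOLLOW = follow_sets(GRAMMAR, "S'")
```

FOLLOW["R"] contains =, which is why the SLR method puts both "shift 6" and "reduce by R → L" in the entry for state 2 and =.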


Conflict Example2

Grammar:

S → AaAb
S → BbBa
A → ε
B → ε

I0: S' → .S, S → .AaAb, S → .BbBa, A → ., B → .

Problem:

FOLLOW(A) = { a, b }

FOLLOW(B) = { a, b }

On a: reduce by A → ε or reduce by B → ε, a reduce/reduce conflict.

On b: reduce by A → ε or reduce by B → ε, a reduce/reduce conflict.


SLR-Parsing Tables for Ambiguous Grammar

FOLLOW(E) = { $, +, *, ) }

State I8 has shift/reduce conflicts for symbols + and *.

I0 --E--> I1 --*--> I5 --E--> I8

When the current token is *:
shift: * is right-associative
reduce: * is left-associative

When the current token is +:
shift: + has higher precedence than *
reduce: * has higher precedence than +


SLR-Parsing Tables for Ambiguous Grammar

Columns id through $ form the action table; column E forms the goto table.

state   id    +     *     (     )     $     E
0       s3                s2                1
1             s4    s5                acc
2       s3                s2                6
3             r4    r4          r4    r4
4       s3                s2                7
5       s3                s2                8
6             s4    s5          s9
7             r1    s5          r1    r1
8             r2    r2          r2    r2
9             r3    r3          r3    r3



Runtime environments

Source Language issues

Storage Organization

Storage allocation strategies


Source Language issues

Procedures:

A procedure definition is a declaration that associates an identifier with a statement.

The identifier is the procedure name, and the statement is the procedure body.

Some of the identifiers appearing in a procedure definition are called formal parameters.

[Figure: quicksort on a[m..n] with m=1 and n=9; the first half runs from m to middle-1 and the second half from middle+1 to n.]

program sort(input, output);
    var a : array [0..10] of integer;

    { definition of procedure readarray }
    procedure readarray;
        var i : integer;
    begin
        for i := 1 to 9 do read(a[i])
    end;

    function partition(y, z : integer) : integer;
        var i, j, x, v : integer;
    begin
        { body omitted in the source }
    end;

    procedure quicksort(m, n : integer);
        var i : integer;
    begin
        if (n > m) then begin
            i := partition(m, n);
            quicksort(m, i - 1);
            quicksort(i + 1, n)
        end
    end;

begin
    a[0] := -9999; a[10] := 9999;
    readarray;          { procedure call }
    quicksort(1, 9)
end.

Activation tree

Flow of control among procedures:

1. Control flows sequentially: the program executes a sequence of steps, with control at a specific point in the program at each step.

2. Each execution of a procedure starts at the beginning of the procedure body and eventually returns control to the point immediately following the call.

Activation of procedures:

Execution begins...
Enter readarray
Leave readarray
Enter quicksort(1,9)
Enter partition(1,9)
Leave partition(1,9)
Enter quicksort(1,3)
    ...
Leave quicksort(1,3)
Enter quicksort(5,9)
    ...
Leave quicksort(5,9)
Leave quicksort(1,9)
Execution terminated.

[Figure: array positions 1 to 9 with the pivot splitting the range into (1,3) and (5,9).]

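The lifetime of an activation of a procedure is the sequence of steps between the first and last steps in the execution of the procedure body.

An activation tree (s is the main program sort, r is readarray, q is quicksort, p is partition):

s
    r
    q(1,9)
        p(1,9)
        q(1,3)
            p(1,3)
            q(1,0)
            q(2,3)
                p(2,3)
                q(2,1)
                q(3,3)
        q(5,9)
            p(5,9)
            q(5,5)
            q(7,9)
                p(7,9)
                q(7,7)
                q(9,9)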

Each node represents an activation of a procedure.

The root represents the activation of the main procedure.

The node for a is the parent of the node for b if and only if control flows from activation a to activation b.

The node for a is to the left of the node for b if and only if the lifetime of a occurs before the lifetime of b.

Control stack

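The flow of control in a program corresponds to a depth-first traversal of the activation tree: start at the root, visit a node before its children, and visit the children from left to right.

A control stack keeps track of live procedure activations.

Push the node for an activation onto the stack as the activation begins.

Pop the node when the activation ends.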
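The push and pop discipline of the control stack can be observed with a small Python simulation of the sort program. This is a sketch under assumptions: the partition body is elided in the source, so a Lomuto-style partition is used here purely for illustration, and the names run_quicksort, enter and leave are hypothetical.

```python
def run_quicksort(a, m, n):
    """Sort a[m..n] in place; return a control-stack snapshot per activation."""
    control_stack = []
    snapshots = []

    def enter(name):
        control_stack.append(name)             # activation begins: push
        snapshots.append(tuple(control_stack))

    def leave():
        control_stack.pop()                    # activation ends: pop

    def partition(y, z):
        enter(f"p({y},{z})")
        pivot, i = a[z], y                     # assumed pivot choice: last element
        for j in range(y, z):
            if a[j] < pivot:
                a[i], a[j] = a[j], a[i]
                i += 1
        a[i], a[z] = a[z], a[i]
        leave()
        return i

    def quicksort(lo, hi):
        enter(f"q({lo},{hi})")
        if hi > lo:
            i = partition(lo, hi)
            quicksort(lo, i - 1)
            quicksort(i + 1, hi)
        leave()

    enter("s")                                 # main program
    quicksort(m, n)
    leave()
    return snapshots
```

Each snapshot is the path from the root of the activation tree to the node that has just been entered, which is exactly the set of live activations at that moment.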

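[Figure: the activation tree at the point where control enters q(2,3). The nodes shown are s, r, q(1,9), p(1,9), q(1,3), p(1,3), q(1,0) and q(2,3); the control stack holds the path s, q(1,9), q(1,3), q(2,3).]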

Scope of a declaration:

A declaration may be implicit or explicit.

The scope rules of a language determine which declaration of a name applies when the name appears in the text of a program.

An occurrence of a name is said to be local if it is within the scope of a declaration in the same procedure; otherwise it is nonlocal.

The difference between local and nonlocal names is determined by the syntactic structure of the program.

Binding of names

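A data object is a storage location that holds values.

The environment is a function that maps a name to a storage location.

The state is a function that maps a storage location to the value held there.

name --environment--> storage --state--> value

Example: the name pi is bound to storage address 100. After pi := 0, address 100 holds 0; after pi := 3.14, address 100 holds 3.14. The assignment changes the state but not the environment.

Name: pi    Storage: 100    Value: 3.14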
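The two mappings can be modeled as two dictionaries. The sketch below is only an illustration of the pi example; the names environment, state, assign and value_of are assumptions.

```python
environment = {"pi": 100}   # name -> storage location
state = {100: 0}            # storage location -> value held there


def assign(name, value):
    """An assignment changes the state, not the environment."""
    state[environment[name]] = value


def value_of(name):
    """Look up a name in two steps: environment first, then state."""
    return state[environment[name]]


assign("pi", 3.14)
```

After the assignment, pi is still bound to location 100; only the value stored there has changed.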


Storage Organization

The compiler obtains a block of memory from the operating system for the compiled program to run in.

Subdivision of run-time memory:

1. Generated target code

2. Data objects (static memory)

3. Control stack

The sizes of the generated target code and of some data objects are fixed at compile time.

When a call occurs, execution of the current activation is interrupted and information about the status of the machine is saved on the stack.

When control returns from the call, the activation is restarted.

A separate area of run-time memory is called the heap.

[Figure: subdivision of run-time memory; TOS marks the top of the stack.]

Activation record:

Information needed by a single execution of a procedure is managed using a contiguous block of storage called an activation record.

Push the activation record when a procedure is called.

Pop the activation record when control returns to the caller.

Fields of an activation record:

Returned value
Actual parameters: used by the calling procedure to supply parameters to the called procedure
Optional control link: points to the activation record of the caller
Optional access link: used to locate data needed by the called procedure
Saved machine status
Local data
Temporaries
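As a rough model, an activation record can be written as a Python dataclass with one attribute per field, and calls and returns as pushes and pops on a list. The class and function names are assumptions for illustration; a real compiler lays these fields out in raw memory.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class ActivationRecord:
    procedure: str
    actual_parameters: tuple
    control_link: Optional["ActivationRecord"] = None   # caller's record
    access_link: Optional["ActivationRecord"] = None    # for nonlocal data
    saved_machine_status: dict = field(default_factory=dict)
    local_data: dict = field(default_factory=dict)
    temporaries: dict = field(default_factory=dict)
    returned_value: Any = None


stack = []  # the control stack of activation records


def call(procedure, *actuals):
    """Push a record for the callee; its control link points to the caller."""
    caller = stack[-1] if stack else None
    record = ActivationRecord(procedure, actuals, control_link=caller)
    stack.append(record)
    return record


def ret(value=None):
    """Pop the record of the finished activation when control returns."""
    record = stack.pop()
    record.returned_value = value
    return record
```

For example, call("sort") followed by call("quicksort", 1, 9) leaves two records on the stack, and the second record's control link refers to the first.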


Compile time layout of local data:

Run-time storage comes in blocks of contiguous bytes.

Multibyte objects are stored in consecutive bytes.

The amount of storage needed for a name is determined by its type.

Both elementary data types and aggregates need storage measured in bytes.
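A compiler can assign each local a relative address while it processes the declarations. The sketch below shows the idea; the byte widths in WIDTH are assumed values (they depend on the target machine), alignment padding is ignored, and the names width and layout are hypothetical.

```python
WIDTH = {"char": 1, "integer": 4, "real": 8}   # assumed sizes in bytes


def width(type_expr):
    """type_expr is a basic type name or ("array", count, element_type)."""
    if isinstance(type_expr, tuple) and type_expr[0] == "array":
        _, count, element = type_expr
        return count * width(element)          # elements are stored contiguously
    return WIDTH[type_expr]


def layout(declarations):
    """Give each name an offset in declaration order; return offsets and total."""
    offsets, offset = {}, 0
    for name, type_expr in declarations:
        offsets[name] = offset
        offset += width(type_expr)
    return offsets, offset
```

With these assumed widths, an integer i, a char c and an array of ten integers a receive offsets 0, 4 and 5, and the block of local data is 45 bytes long.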


Static allocation:

i. Names are bound to storage as the program is compiled.

ii. The bindings do not change at run time.

Limitations:

1. The size of each data object and constraints on its position in memory must be known at compile time.

2. Recursive procedures are restricted.

3. Data structures cannot be created at run time.

Stack allocation:

Run-time storage is organized as a stack.

Locals are bound to fresh storage in each activation.

Calling sequence:

Procedure calls are implemented by generating calling sequences in the target code.

A call sequence allocates an activation record and enters information into its fields.

A return sequence restores the state of the machine.

The code in a calling sequence is divided between the caller and the callee. How the work is divided depends on the source language and the target machine.

There is an advantage to placing the fields for parameters and a potential returned value next to the activation record of the caller.

The caller can then access these fields using offsets from the end of its own activation record, without knowing the complete layout of the callee's record.

The call sequence is:

1. The caller evaluates the actuals.

2. The caller stores the return address and the old value of top_sp.

3. The callee saves register values.

4. The callee initializes its local data.

The return sequence is:

1. The callee places the return value next to the activation record of the caller.

2. The callee restores top_sp and the registers, and branches to the return address in the caller's code.

3. The caller copies the return value and uses it to evaluate an expression.

Variable length data:


Dangling references:

A dangling reference occurs when there is a reference to storage that has been deallocated.

Using a dangling reference is a logical error, because the value of deallocated storage is undefined.

Heap allocation:

Heap allocation parcels out pieces of contiguous storage as needed for activation records. These pieces may be deallocated in any order.

Heap allocation is needed when:

The values of local names must be retained when an activation ends.

A called activation outlives the caller.

For efficiency:

For each size of interest, keep a linked list of free blocks of that size.

If possible, fill a request for size s with a block of size s', where s' is the smallest size greater than or equal to s.

For large blocks of storage, use the heap manager.
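The free-list scheme can be sketched as follows. Everything here is a toy model under assumptions: blocks are represented by integer addresses, BumpHeap stands in for the general heap manager, and the class and method names are hypothetical.

```python
class BumpHeap:
    """Stand-in for the heap manager: hands out fresh addresses in order."""

    def __init__(self):
        self.next_address = 0

    def __call__(self, size):
        address = self.next_address
        self.next_address += size
        return address


class SizeClassAllocator:
    def __init__(self, sizes, heap_manager):
        self.sizes = sorted(sizes)
        self.free = {s: [] for s in self.sizes}   # one free list per size
        self.heap_manager = heap_manager

    def allocate(self, s):
        """Return (block size, address) for a request of s bytes."""
        for size in self.sizes:
            if size >= s:                          # smallest size class >= s
                if self.free[size]:
                    return size, self.free[size].pop()
                return size, self.heap_manager(size)
        return s, self.heap_manager(s)             # large block: heap manager

    def release(self, size, address):
        """Put a block back on its free list (large blocks are not tracked)."""
        if size in self.free:
            self.free[size].append(address)
```

A request for 10 bytes with size classes 16 and 32 is served from the 16-byte list; once released, that block is reused by the next request that fits the same class.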


CS416 Compiler Design 44

Type Checking

A compiler has to do semantic checks in addition to syntactic checks.

1. Type checks: the compiler reports an error if an operator is applied to an incompatible operand.

2. Flow-of-control checks: statements that transfer control out of a construct must have somewhere to transfer to, e.g., a break must be enclosed by a construct such as for or switch.

3. Uniqueness checks: an object must be declared only once, e.g., the elements of a scalar type should not be repeated.

4. Name-related checks: sometimes the same name must appear two or more times, e.g., in Ada a loop may have a name that appears at both the beginning and the end of the construct.


Type systems

A type system is a collection of rules for assigning type expressions to the parts of a program.

A type checker implements a type system.

The design of a type checker for a language is based on information about the syntactic constructs in the language.

E.g., if both operands of the arithmetic operators +, - and * are of type integer, then the result is of type integer.

Each expression has a type associated with it.


Type expressions

The type of a language construct will be denoted by a type expression.

A type expression is either a basic type or is formed by applying an operator called a type constructor to other type expressions.

The set of basic types and constructors depends on the language to be checked.

1. A basic type is a type expression. Among the basic types are boolean, char, integer and real. A special basic type, type_error, signals an error during type checking, and void denotes the absence of a value.

2. Type expressions may be named; a type name is a type expression.


3. A type constructor applied to type expressions is a type expression.

arrays: If T is a type expression, then array(I,T) is a type expression, where I denotes an index range. Ex: array(0..99,int)

products: If T1 and T2 are type expressions, then their Cartesian product T1 x T2 is a type expression. Ex: int x int

pointers: If T is a type expression, then pointer(T) is a type expression. Ex: pointer(int). The declaration var p: ↑row declares variable p to have type pointer(row).


Record:

The difference between a record and a product is that the fields of a record have names. The record type constructor is applied to a tuple formed from field names and field types.

type row = record
    address : integer;
    lexeme : array [1..15] of char
end;

var table : array [1..101] of row;

This declares the type name row to represent the type expression

record((address x integer) x (lexeme x array(1..15,char)))

and the variable table to be an array of records of this type.


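4. Type expressions may contain variables whose values are type expressions.

[Figure: tree representation of the type expression char x char → pointer(integer), with → at the root, x over the two char leaves on the left, and pointer over integer on the right.]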


Type systems

A type system is a collection of rules for assigning type expressions to the various parts of a program. A type checker implements a type system.

Static and dynamic checking of types:

Checking done by a compiler is said to be static, while checking done when the target program runs is termed dynamic.

Any check can be done dynamically if the target code carries the type of an element along with the value of that element.

A sound type system eliminates the need for run-time checking for type errors.

A programming language is strongly typed if every program its compiler accepts will execute without type errors.


Some checks can be done only dynamically. For example, if we first declare

table : array[0..255] of char;
i : integer

and then compute table[i], a compiler cannot in general guarantee that during execution the value of i will lie in the range 0 to 255.


Error recovery:

When the type checker catches an error, it should do something reasonable.

At the very least, the compiler must report the nature and location of the error.


Specification of simple type checker

P → D ; E

(P generates a sequence of declarations D followed by an expression E.)

D → id : T                      { addtype(id.entry, T.type) }
T → char                        { T.type := char }
T → int                         { T.type := int }
T → real                        { T.type := real }
T → ↑T1                         { T.type := pointer(T1.type) }
T → array [ intnum ] of T1      { T.type := array(1..intnum.val, T1.type) }


Type Checking of Expressions

The synthesized attribute type for E gives the type expression assigned by the type system to the expression generated by E.

E → id             { E.type := lookup(id.entry) }
E → charliteral    { E.type := char }
E → intliteral     { E.type := int }
E → realliteral    { E.type := real }
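These rules translate almost directly into code. In the sketch below the symbol table is a plain dictionary, type expressions are strings or tuples, and expressions are (kind, value) pairs; these encodings, and the choice to return type_error for an undeclared name, are assumptions for illustration.

```python
symbol_table = {}


def addtype(name, type_expr):
    """D -> id : T   records the type of a declared name."""
    symbol_table[name] = type_expr


def lookup(name):
    """Return the declared type; undeclared names yield type_error (assumed)."""
    return symbol_table.get(name, "type_error")


def pointer(t):
    return ("pointer", t)


def array(intnum, t):
    return ("array", (1, intnum), t)   # index range 1..intnum


LITERAL_TYPES = {"charliteral": "char", "intliteral": "int", "realliteral": "real"}


def expr_type(expr):
    """expr is (kind, value), e.g. ("id", "p") or ("intliteral", 3)."""
    kind, value = expr
    if kind == "id":
        return lookup(value)
    return LITERAL_TYPES.get(kind, "type_error")
```

For instance, after addtype("p", pointer("int")), the expression ("id", "p") has type ("pointer", "int").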




Type Checking of Statements

Language constructs such as statements do not have values, so the basic type void is assigned to them.

If an error is detected within a statement, then the type assigned to the statement is type_error.

The semantic rules for type checking statements are:

S → id = E           { S.type := if id.type = E.type then void else type_error }
S → if E then S1     { S.type := if E.type = boolean then S1.type else type_error }
S → while E do S1    { S.type := if E.type = boolean then S1.type else type_error }
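The three statement rules can be sketched as one recursive function. The statement encoding (nested tuples with expression types already computed) and the name stmt_type are assumptions made for this illustration.

```python
def stmt_type(stmt, declared):
    """Return "void" for a well-typed statement, else "type_error".

    stmt is one of:
      ("assign", name, expr_type)          S -> id = E
      ("if", cond_type, body_stmt)         S -> if E then S1
      ("while", cond_type, body_stmt)      S -> while E do S1
    declared maps names to their types.
    """
    kind = stmt[0]
    if kind == "assign":
        _, name, e_type = stmt
        return "void" if declared.get(name) == e_type else "type_error"
    if kind in ("if", "while"):
        _, cond_type, body = stmt
        if cond_type == "boolean":
            return stmt_type(body, declared)   # S.type := S1.type
        return "type_error"
    return "type_error"
```

An error inside the body of an if or while propagates outward, because the statement takes the type of its body when the condition is boolean.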




Type Checking of Functions

E → E1 ( E2 )    { E.type := if E2.type = s and E1.type = s → t then t else type_error }

Ex: int f (double x, char y) { ... }

f : double x char → int

Here double x char gives the argument types and int is the return type.

T → T1 '→' T2    { T.type := T1.type → T2.type }
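The rule for function application can be sketched with function types encoded as tuples. The encoding and the helper names function, product and apply_type are assumptions for illustration.

```python
def function(domain, result):
    """The type expression  domain -> result."""
    return ("->", domain, result)


def product(*types):
    """The Cartesian product of the given type expressions."""
    return ("x",) + types


def apply_type(e1_type, e2_type):
    """E -> E1 ( E2 ): the result type t if E1 : s -> t and E2 : s."""
    is_function = isinstance(e1_type, tuple) and e1_type[0] == "->"
    if is_function and e1_type[1] == e2_type:
        return e1_type[2]
    return "type_error"


# int f(double x, char y)  has type  double x char -> int
f_type = function(product("double", "char"), "int")
```

Applying f to an argument of type double x char gives int; any other argument type, or applying a non-function, gives type_error.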




Intermediate Code Generation

Intermediate codes are machine-independent codes, but they are close to machine instructions.

The given program in a source language is converted to an equivalent program in an intermediate language by the intermediate code generator.

The intermediate language can be one of many different languages, and the designer of the compiler decides which one to use.

Syntax trees can be used as an intermediate language.

Postfix notation can be used as an intermediate language.

Three-address code (quadruples) can be used as an intermediate language.


Three-Address Code (Quadruples)

A quadruple is:

x := y op z

where x, y and z are names, constants or compiler-generated temporaries; op is any operator.

But we may also use the following notation for quadruples (a much better notation because it looks like a machine code instruction):

op y,z,x

Apply operator op to y and z, and store the result in x.

We use the term three-address code because each statement usually contains three addresses (two for operands, one for the result).
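To see how an expression becomes a sequence of quadruples, the sketch below walks a nested-tuple expression and emits one (op, y, z, x) tuple per operator. The expression encoding, the function name generate and the opcode names sub and mul are assumptions; add and uminus follow the operator style used in these notes.

```python
import itertools

OPCODES = {"+": "add", "-": "sub", "*": "mul"}   # sub and mul are assumed names


def generate(expr):
    """Return (quadruples, name holding the result).

    expr is a name or constant (str), a unary node (op, operand),
    or a binary node (op, left, right).
    """
    code = []
    counter = itertools.count(1)

    def new_temp():
        return f"t{next(counter)}"               # compiler-generated temporary

    def walk(node):
        if isinstance(node, str):
            return node                          # a name is its own address
        if len(node) == 2:                       # unary: op y,,x
            op, operand = node
            y = walk(operand)
            x = new_temp()
            code.append((op, y, "", x))
            return x
        op, left, right = node                   # binary: op y,z,x
        y = walk(left)
        z = walk(right)
        x = new_temp()
        code.append((OPCODES[op], y, z, x))
        return x

    result = walk(expr)
    return code, result
```

For a + b * (-c), written as ("+", "a", ("*", "b", ("uminus", "c"))), the sketch emits uminus c,,t1 then mul b,t1,t2 then add a,t2,t3.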




Three-Address Statements

Binary Operator: op y,z,result or result := y op z

where op is a binary arithmetic or logical operator. This binary operator is applied to y and z, and the result of the operation is stored in result.

Ex: add a,b,c
    gt a,b,c
    addr a,b,c
    addi a,b,c

Unary Operator: op y,,result or result := op y

where op is a unary arithmetic or logical operator. This unary operator is applied to y, and the result of the operation is stored in result.

Ex: uminus a,,c
    not a,,c
    inttoreal a,,c


Three-Address Statements (cont.)

Move Operator: mov y,,result or result := y

where the content of y is copied into result.

Ex: mov a,,c
    movi a,,c
    movr a,,c

Unconditional Jumps: jmp ,,L or goto L

We will jump to the three-address code with the label L, and the execution continues from that statement.

Ex: jmp ,,L1 // jump to L1
    jmp ,,7 // jump to the statement 7


Three-Address Statements (cont.)

Conditional Jumps: jmprelop y,z,L or if y relop z goto L

We will jump to the three-address code with the label L if the result of y relop z is true, and the execution continues from that statement. If the result is false, the execution continues from the statement following this conditional jump statement.

Ex: jmpgt y,z,L1 // jump to L1 if y>z
    jmpgte y,z,L1 // jump to L1 if y>=z
    jmpe y,z,L1 // jump to L1 if y==z
    jmpne y,z,L1 // jump to L1 if y!=z

Our relational operator can also be a unary operator.

    jmpnz y,,L1 // jump to L1 if y is not zero
    jmpz y,,L1 // jump to L1 if y is zero
    jmpt y,,L1 // jump to L1 if y is true
    jmpf y,,L1 // jump to L1 if y is false


Structural Equivalence of Type Expressions

How do we know that two type expressions are equal?

As long as type expressions are built from basic types (no type names),

we may use structural equivalence between two type expressions

Structural Equivalence Algorithm (sequiv):

if (s and t are the same basic type) then return true
else if (s = array(s1,s2) and t = array(t1,t2)) then return (sequiv(s1,t1) and sequiv(s2,t2))
else if (s = s1 x s2 and t = t1 x t2) then return (sequiv(s1,t1) and sequiv(s2,t2))
else if (s = pointer(s1) and t = pointer(t1)) then return (sequiv(s1,t1))
else if (s = s1 → s2 and t = t1 → t2) then return (sequiv(s1,t1) and sequiv(s2,t2))
else return false
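The same algorithm fits in a few lines of Python when type expressions are encoded as nested tuples. The encoding is an assumption for illustration: a basic type is a string, and a constructed type is a tuple whose first element names the constructor ("array", "x", "pointer" or "->").

```python
def sequiv(s, t):
    """Structural equivalence of two type expressions without type names."""
    if not isinstance(s, tuple) or not isinstance(t, tuple):
        return s == t                      # basic types: equal only if identical
    if s[0] != t[0] or len(s) != len(t):
        return False                       # different constructors
    # Same constructor: all corresponding components must be equivalent.
    return all(sequiv(a, b) for a, b in zip(s[1:], t[1:]))
```

Since every constructor case in the algorithm has the same shape (compare the components pairwise), a single recursive case covers arrays, products, pointers and functions.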




Names for Type Expressions

In some programming languages, we give a name to a type expression, and we use that name as a type expression afterwards.

type link = ↑cell;
var p,q : link;
var r,s : ↑cell;

Do p, q, r and s have the same type?

How do we treat type names?

Get the equivalent type expression for a type name (then use structural equivalence), or

Treat a type name as a basic type.


Cycles in Type Expressions

type link = ↑cell;
type cell = record
    x : int;
    next : link
end;

We cannot use structural equivalence if there are cycles in type expressions.

We have to treat type names as basic types, but this means that the type expression link is different from the type expression ↑cell.


Type Conversions

x + y: what is the type of this expression (int or double)?

What kind of code do we have to produce if the type of x is double and the type of y is int?

inttoreal y,,t1
real+ t1,x,t2
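The conversion can be decided while generating code for the addition. The sketch below is an illustration under assumptions: the function name gen_add and the opcode int+ are hypothetical, the type name real is used for the floating-point type, and the operands are emitted in source order, so the converted temporary appears second rather than first.

```python
import itertools


def gen_add(x, x_type, y, y_type, temps):
    """Return (quadruples, result type) for x + y, widening int to real."""
    code = []

    def widen(name, type_):
        if type_ == "int":
            t = next(temps)
            code.append(("inttoreal", name, "", t))   # convert int operand
            return t
        return name

    if x_type == "int" and y_type == "int":
        op, result_type = "int+", "int"               # no conversion needed
    else:
        op, result_type = "real+", "real"
        x, y = widen(x, x_type), widen(y, y_type)
    code.append((op, x, y, next(temps)))
    return code, result_type


temps = (f"t{i}" for i in itertools.count(1))         # t1, t2, ...
mixed = gen_add("x", "real", "y", "int", temps)
```

Here mixed holds the inttoreal instruction for y followed by a real+ on x and the converted temporary, and the result type is real.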



Type Checking

A compiler has to do semantic checks in addition to syntactic checks.

Semantic checks:

Static: done during compilation

Dynamic: done during run time

Type checking is one of these static checking operations.

We may not be able to do all type checking at compile time.

Some systems also use dynamic type checking.
