PCD UNIT II new
8/7/2019 PCD UNIT II new
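LR Parsing Algorithm
[Figure: model of an LR parser. The stack holds S0 X1 S1 ... Xm-1 Sm-1 Xm Sm, with Sm on top. The input buffer holds a1 ... ai ... an $. The LR parsing algorithm reads the stack and the input, consults the two tables below, and produces the output.]
Action Table: rows are states, columns are terminals and $; each entry is one of four different actions.
Goto Table: rows are states, columns are non-terminals; each entry is a state number.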
A Configuration of LR Parsing Algorithm
A configuration of an LR parser is:
( S0 X1 S1 ... Xm Sm, ai ai+1 ... an $ )
  (stack)              (rest of input)
Sm and ai decide the parser action by consulting the parsing action table. (The initial stack contains just S0.)
A configuration of an LR parser represents the right-sentential form:
X1 ... Xm ai ai+1 ... an $
Actions of an LR Parser
1. shift s: shifts the next input symbol and the state s onto the stack
( S0 X1 S1 ... Xm Sm, ai ai+1 ... an $ ) => ( S0 X1 S1 ... Xm Sm ai s, ai+1 ... an $ )
2. reduce A -> β (or rn, where n is a production number)
pop 2|β| items from the stack (2r items, where r = |β|); then push A and s, where s = goto[Sm-r, A]
( S0 X1 S1 ... Xm Sm, ai ai+1 ... an $ ) => ( S0 X1 S1 ... Xm-r Sm-r A s, ai ... an $ )
The output is the reducing production A -> β.
3. accept: parsing successfully completed
4. error: the parser detected an error (an empty entry in the action table)
Reduce Action
pop 2|β| items from the stack (2r items, where r = |β|); let us assume that β = Y1 Y2 ... Yr
then push A and s, where s = goto[Sm-r, A]
( S0 X1 S1 ... Xm-r Sm-r Y1 Sm-r+1 ... Yr Sm, ai ai+1 ... an $ )
=> ( S0 X1 S1 ... Xm-r Sm-r A s, ai ... an $ )
In fact, Y1 Y2 ... Yr is a handle:
X1 ... Xm-r A ai ... an $ => X1 ... Xm-r Y1 ... Yr ai ai+1 ... an $
(SLR) Parsing Tables for Expression Grammar
Grammar:
1) E -> E+T
2) E -> T
3) T -> T*F
4) T -> F
5) F -> (E)
6) F -> id

        Action Table                        Goto Table
state   id    +     *     (     )     $     E    T    F
0       s5                s4                1    2    3
1             s6                      acc
2             r2    s7          r2    r2
3             r4    r4          r4    r4
4       s5                s4                8    2    3
5             r6    r6          r6    r6
6       s5                s4                     9    3
7       s5                s4                          10
8             s6                s11
9             r1    s7          r1    r1
10            r3    r3          r3    r3
11            r5    r5          r5    r5
Actions of an (S)LR Parser: Example
Input: id*id+id$

stack            input       action                output
0                id*id+id$   shift 5
0 id 5           *id+id$     reduce by F -> id     F -> id
0 F 3            *id+id$     reduce by T -> F      T -> F
0 T 2            *id+id$     shift 7
0 T 2 * 7        id+id$      shift 5
0 T 2 * 7 id 5   +id$        reduce by F -> id     F -> id
0 T 2 * 7 F 10   +id$        reduce by T -> T*F    T -> T*F
0 T 2            +id$        reduce by E -> T      E -> T
0 E 1            +id$        shift 6
0 E 1 + 6        id$         shift 5
0 E 1 + 6 id 5   $           reduce by F -> id     F -> id
0 E 1 + 6 F 3    $           reduce by T -> F      T -> F
0 E 1 + 6 T 9    $           reduce by E -> E+T    E -> E+T
0 E 1            $           accept
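The shift, reduce and accept actions can be tried out with a small table-driven driver. The sketch below is only an illustration in Python, not part of the original notes: the names ACTION, GOTO, PRODUCTIONS and parse are made up for it, and the tables are copied from the SLR table for the expression grammar. One simplification is assumed: the stack holds states only (the grammar symbols are implied by the states), so a reduction pops |β| entries instead of 2|β|.

```python
# Production number -> (left-hand side, length of right-hand side).
PRODUCTIONS = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3),
               4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}

# Action table: state -> {terminal: action}. Missing entries mean "error".
ACTION = {
    0: {"id": "s5", "(": "s4"},
    1: {"+": "s6", "$": "acc"},
    2: {"+": "r2", "*": "s7", ")": "r2", "$": "r2"},
    3: {"+": "r4", "*": "r4", ")": "r4", "$": "r4"},
    4: {"id": "s5", "(": "s4"},
    5: {"+": "r6", "*": "r6", ")": "r6", "$": "r6"},
    6: {"id": "s5", "(": "s4"},
    7: {"id": "s5", "(": "s4"},
    8: {"+": "s6", ")": "s11"},
    9: {"+": "r1", "*": "s7", ")": "r1", "$": "r1"},
    10: {"+": "r3", "*": "r3", ")": "r3", "$": "r3"},
    11: {"+": "r5", "*": "r5", ")": "r5", "$": "r5"},
}

# Goto table: state -> {non-terminal: state}.
GOTO = {0: {"E": 1, "T": 2, "F": 3}, 4: {"E": 8, "T": 2, "F": 3},
        6: {"T": 9, "F": 3}, 7: {"F": 10}}


def parse(tokens):
    """Return the list of production numbers used, in reduction order."""
    stack = [0]                      # states only; state 0 is the start state
    tokens = list(tokens) + ["$"]
    pos = 0
    output = []
    while True:
        action = ACTION[stack[-1]].get(tokens[pos])
        if action is None:           # empty table entry
            raise SyntaxError(f"unexpected {tokens[pos]!r} at position {pos}")
        if action == "acc":
            return output
        if action[0] == "s":         # shift: push the new state, advance input
            stack.append(int(action[1:]))
            pos += 1
        else:                        # reduce by production n
            n = int(action[1:])
            head, length = PRODUCTIONS[n]
            del stack[-length:]      # pop one state per right-hand-side symbol
            stack.append(GOTO[stack[-1]][head])
            output.append(n)
```

Calling parse(["id", "*", "id", "+", "id"]) returns the production numbers in the same order as the output column of the trace.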
Kernel items are the initial item, S' -> .S, and all items whose dots are not at the left end.
Nonkernel items are all the other items, which have their dots at the left end.
Goto Operation
If I is a set of LR(0) items and X is a grammar symbol (terminal or non-terminal), then goto(I,X) is defined as follows:
If A -> α.Xβ is in I, then every item in closure({A -> αX.β}) is in goto(I,X).
Example:
I = { E' -> .E, E -> .E+T, E -> .T, T -> .T*F, T -> .F, F -> .(E), F -> .id }
goto(I,E) = { E' -> E., E -> E.+T }
goto(I,T) = { E -> T., T -> T.*F }
goto(I,F) = { T -> F. }
goto(I,() = { F -> (.E), E -> .E+T, E -> .T, T -> .T*F, T -> .F, F -> .(E), F -> .id }
goto(I,id) = { F -> id. }
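The closure and goto operations are short enough to write out. The following Python sketch is an illustration under assumptions: items are represented as (head, body, dot position) triples, and the names GRAMMAR, closure, goto and show are invented for this example. The grammar is the augmented expression grammar used in the example above.

```python
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]
NONTERMINALS = {head for head, _ in GRAMMAR}


def closure(items):
    """Add B -> .gamma for every non-terminal B that appears right after a dot."""
    result = set(items)
    work = list(items)
    while work:
        head, body, dot = work.pop()
        if dot < len(body) and body[dot] in NONTERMINALS:
            for h, b in GRAMMAR:
                if h == body[dot] and (h, b, 0) not in result:
                    result.add((h, b, 0))
                    work.append((h, b, 0))
    return frozenset(result)


def goto(items, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {(h, b, d + 1) for h, b, d in items
             if d < len(b) and b[d] == symbol}
    return closure(moved)


def show(item):
    """Render an item in the notation of the notes, e.g. 'E -> E.+T'."""
    head, body, dot = item
    return f"{head} -> {''.join(body[:dot])}.{''.join(body[dot:])}"


I0 = closure({("E'", ("E",), 0)})
```

Here I0 is the set I of the example, and {show(i) for i in goto(I0, "E")} gives the two items listed for goto(I,E).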
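Conflict Example
Grammar:
S -> L=R
S -> R
L -> *R
L -> id
R -> L

LR(0) item sets:
I0: S' -> .S, S -> .L=R, S -> .R, L -> .*R, L -> .id, R -> .L
I1: S' -> S.
I2: S -> L.=R, R -> L.
I3: S -> R.
I4: L -> *.R, R -> .L, L -> .*R, L -> .id
I5: L -> id.
I6: S -> L=.R, R -> .L, L -> .*R, L -> .id
I7: L -> *R.
I8: R -> L.
I9: S -> L=R.

Problem (in state I2):
FOLLOW(R) = { =, $ }
On the symbol =, the table entry gets both
  shift 6
  reduce by R -> L
which is a shift/reduce conflict.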
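The FOLLOW sets quoted in the conflict examples can be checked mechanically. The sketch below is a standard fixed-point computation of FIRST and FOLLOW written in Python for illustration; the function name first_follow and the dictionary representation of a grammar are assumptions of this example, and an empty tuple stands for an ε right-hand side.

```python
def first_follow(grammar, start):
    """grammar maps each non-terminal to a list of right-hand sides (tuples)."""
    nonterms = set(grammar)
    nullable = set()
    first = {a: set() for a in nonterms}
    follow = {a: set() for a in nonterms}
    follow[start].add("$")           # end marker follows the start symbol

    def first_of(seq):
        """FIRST of a symbol string, and whether the string can derive epsilon."""
        out = set()
        for sym in seq:
            if sym not in nonterms:  # terminal: it starts the string
                out.add(sym)
                return out, False
            out |= first[sym]
            if sym not in nullable:
                return out, False
        return out, True

    changed = True
    while changed:                   # iterate until nothing grows any more
        changed = False
        for a, bodies in grammar.items():
            for body in bodies:
                f, eps = first_of(body)
                if not f <= first[a]:
                    first[a] |= f
                    changed = True
                if eps and a not in nullable:
                    nullable.add(a)
                    changed = True
                for i, sym in enumerate(body):
                    if sym in nonterms:
                        f, eps = first_of(body[i + 1:])
                        new = f | (follow[a] if eps else set())
                        if not new <= follow[sym]:
                            follow[sym] |= new
                            changed = True
    return first, follow


assignment_grammar = {
    "S": [("L", "=", "R"), ("R",)],
    "L": [("*", "R"), ("id",)],
    "R": [("L",)],
}
_, follow = first_follow(assignment_grammar, "S")
```

After this runs, follow["R"] contains = and $, which is why an SLR table for this grammar puts both a shift and a reduce in state I2 under =.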
Conflict Example 2
Grammar:
S -> AaAb
S -> BbBa
A -> ε
B -> ε

I0: S' -> .S, S -> .AaAb, S -> .BbBa, A -> ., B -> .

Problem (in state I0):
FOLLOW(A) = { a, b }
FOLLOW(B) = { a, b }
On a: reduce by A -> ε and reduce by B -> ε, a reduce/reduce conflict.
On b: reduce by A -> ε and reduce by B -> ε, a reduce/reduce conflict.
SLR-Parsing Tables for Ambiguous Grammar
FOLLOW(E) = { $, +, *, ) }
State I8 has shift/reduce conflicts for the symbols + and *.
I8 is reached from I0 on the prefix E*E: I0 -E-> I1 -*-> I5 -E-> I8
When the current token is *:
  shift: * is right-associative
  reduce: * is left-associative
When the current token is +:
  shift: + has higher precedence than *
  reduce: * has higher precedence than +
SLR-Parsing Tables for Ambiguous Grammar

        Action                              Goto
state   id    +     *     (     )     $     E
0       s3                s2                1
1             s4    s5                acc
2       s3                s2                6
3             r4    r4          r4    r4
4       s3                s2                7
5       s3                s2                8
6             s4    s5          s9
7             r1    s5          r1    r1
8             r2    r2          r2    r2
9             r3    r3          r3    r3
Runtime environments
Source Language issues
Storage Organization
Storage allocation strategies
Source Language issues
Procedures:
A procedure definition is a declaration that associates an identifier with a statement.
The identifier is the procedure name, and the statement is the procedure body.
Identifiers declared as parameters in a procedure definition are called formal parameters.
Example call quicksort(m,n) with m=1, n=9: the first half is positions m to middle-1, and the second half is positions middle+1 to n.
program sort(input, output);
  var a : array[0..10] of integer;

  procedure readarray;                 { definition of a procedure }
    var i : integer;
  begin                                { body of the procedure }
    for i := 1 to 9 do read(a[i])
  end;

  function partition(y, z : integer) : integer;
    var i, j, x, v : integer;
  begin
    { body not shown in the notes }
  end;

  procedure quicksort(m, n : integer);
    var i : integer;
  begin
    if (n > m) then begin
      i := partition(m, n);
      quicksort(m, i-1);
      quicksort(i+1, n)
    end
  end;

begin
  a[0] := -9999; a[10] := 9999;
  readarray;                           { procedure call }
  quicksort(1, 9)
end.
Activation tree
Assumptions about the flow of control among procedures:
1. Control flows sequentially: the program executes as a sequence of steps, and control is at one specific point at any time.
2. Each execution of a procedure starts at the beginning of the procedure body and eventually returns control to the point immediately after the call.
Activation of procedures:
Execution begins...
Enter readarray
Leave readarray
Enter quicksort(1,9)
Enter partition(1,9)
Leave partition(1,9)
Enter quicksort(1,3)
...
Leave quicksort(1,3)
Enter quicksort(5,9)
...
Leave quicksort(5,9)
Leave quicksort(1,9)
Execution terminated.
[Figure: array positions 1 to 9 with the pivot at position 4, which splits the array into the subranges (1,3) and (5,9).]
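The lifetime of an activation of a procedure is the sequence of steps between the first and last steps in the execution of the procedure body, including the time spent in procedures it calls.
An activation tree (s = sort, r = readarray, q = quicksort, p = partition):
s
  r
  q(1,9)
    p(1,9)
    q(1,3)
      p(1,3)
      q(1,0)
      q(2,3)
        p(2,3)
        q(2,1)
        q(3,3)
    q(5,9)
      p(5,9)
      q(5,5)
      q(7,9)
        p(7,9)
        q(7,7)
        q(9,9)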
Each node represents an activation of a procedure.
The root represents the activation of the main program.
The node for a is the parent of the node for b if and only if control flows from activation a to b.
The node for a is to the left of the node for b if and only if the lifetime of a occurs before the lifetime of b.
Control stack
The flow of control in a program corresponds to a depth-first traversal of the activation tree: start at the root, visit a node before its children, and visit the children from left to right.
A control stack keeps track of the live procedure activations.
Push the node for an activation when the activation begins.
Pop the node when the activation ends.
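[Figure: the activation tree at the moment control enters q(2,3). The control stack contains s, q(1,9), q(1,3), q(2,3); the activations r, p(1,9), p(1,3) and q(1,0) have already ended.]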
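The push-on-entry, pop-on-exit discipline can be simulated directly. The Python sketch below is illustrative only. It does not sort anything: the pivot positions in PIVOT are hypothetical values chosen so that the calls reproduce the activation tree of these notes, and the names activate, snapshots and trace are invented for the example.

```python
# Hypothetical pivot positions, picked to match the activation tree in the
# notes; a real partition function would compute them from the array.
PIVOT = {(1, 9): 4, (1, 3): 1, (2, 3): 2, (5, 9): 6, (7, 9): 8}

control_stack = []   # live activations, innermost last
snapshots = {}       # activation name -> copy of the stack when it began
trace = []           # "enter ..." / "leave ..." events in order


def activate(name, body=lambda: None):
    control_stack.append(name)              # push when the activation begins
    snapshots[name] = list(control_stack)
    trace.append("enter " + name)
    body()
    trace.append("leave " + name)
    control_stack.pop()                     # pop when the activation ends


def quicksort(m, n):
    def body():
        if n > m:
            i = PIVOT[(m, n)]
            activate(f"p({m},{n})")         # partition(m, n)
            quicksort(m, i - 1)
            quicksort(i + 1, n)
    activate(f"q({m},{n})", body)


def main():
    activate("r")                           # readarray
    quicksort(1, 9)


activate("s", main)                         # the main program, sort
```

After the run, snapshots["q(2,3)"] holds s, q(1,9), q(1,3), q(2,3), and trace lists the enter and leave events in depth-first order.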
Scope of a declaration:
A declaration may be explicit or implicit.
The scope rules of a language determine which declaration of a name applies when the name appears in the text of a program.
An occurrence of a name in a procedure is local if it is within the scope of a declaration inside that procedure; otherwise it is nonlocal.
The distinction between local and nonlocal follows from the syntactic structure of the program.
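Binding of names
A data object is a storage location that holds values.
The environment is a function that maps a name to a storage location.
The state is a function that maps a storage location to the value held there.
Example: the environment binds the name pi to storage location 100. The state maps location 100 to 0 at first and to 3.14 after the assignment pi := 3.14; the environment does not change.
[Figure: name (pi) --environment--> storage (100) --state--> value (3.14)]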
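The two mappings can be modelled with two dictionaries. This is a minimal Python sketch for illustration, using the pi example from the notes; the helper names assign and value_of are assumptions of the example.

```python
environment = {"pi": 100}   # name -> storage location
state = {100: 0}            # storage location -> value held there


def assign(name, value):
    """An assignment changes the state but leaves the environment alone."""
    state[environment[name]] = value


def value_of(name):
    """Look a name up through the environment, then through the state."""
    return state[environment[name]]
```

After assign("pi", 3.14), value_of("pi") is 3.14 while environment["pi"] is still 100.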
Storage Organization
The compiler obtains a block of memory from the operating system for the compiled program to run in.
Subdivision of run-time memory:
1. Generated target code
2. Data objects (static memory)
3. Control stack
The sizes of the generated target code and of the static data objects are fixed at compile time.
When a call occurs, execution of the current activation is interrupted and status information is saved on the stack.
When control returns from the call, the activation is restarted.
A separate area of run-time memory is called the heap.
[Figure: layout of run-time memory; the label TOS marks the top of the stack.]
Activation record:
The information needed by a single execution of a procedure is managed using a contiguous block of storage called an activation record.
Push the activation record when a procedure is called.
Pop the activation record when control returns to the caller.
Fields of an activation record:
Returned value
Actual parameters       (used by the calling procedure to supply parameters to the called procedure)
Optional control link   (points to the activation record of the caller)
Optional access link    (used to locate data needed by the called procedure)
Saved machine status
Local data
Temporaries
Compile-time layout of local data:
Run-time storage comes in blocks of contiguous bytes.
Multibyte objects are stored in consecutive bytes.
The amount of storage needed for a name depends on its type.
Both elementary data types and aggregates are allocated storage in units of bytes.
Static allocation:
i. Names are bound to storage as the program is compiled.
ii. The bindings do not change at run time.
Limitations:
1. The size of a data object and the constraints on its position in memory must be known at compile time.
2. Recursive procedures are restricted.
3. Data structures cannot be created dynamically at run time.
Stack allocation:
Run-time storage is organized as a stack.
Locals are bound to fresh storage in each activation.
Calling sequence:
Procedure calls are implemented by generating calling sequences in the target code.
A call sequence allocates an activation record and enters information into its fields.
A return sequence restores the state of the machine so that the calling procedure can continue.
The code in a calling sequence is divided between the calling procedure (the caller) and the procedure it calls (the callee). How it is divided depends on the source language and the target machine.
There is an advantage to placing the fields for parameters and a potential returned value next to the activation record of the caller.
The caller can then access these fields using offsets from the end of its own activation record, without knowing the complete layout of the callee's record.
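The call sequence is:
1. The caller evaluates the actual parameters.
2. The caller stores the return address and the old value of top_sp in the callee's activation record.
3. The callee saves register values and other status information.
4. The callee initializes its local data and begins execution.
The return sequence is:
1. The callee places the return value next to the activation record of the caller.
2. The callee restores top_sp and the registers, then branches to the return address in the caller's code.
3. The caller copies the return value and uses it to evaluate an expression.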
Variable length data:
Dangling references:
A dangling reference occurs when there is a reference to storage that has been deallocated.
Using a dangling reference is a logical error, because the value of deallocated storage is undefined.
Heap allocation:
Heap allocation parcels out pieces of contiguous storage as needed for activation records or other objects. These pieces may be deallocated in any order.
It is needed when:
The values of local names must be retained when an activation ends.
A called activation outlives the caller.
For efficiency:
For each size of interest, keep a linked list of free blocks of that size.
Fill a request for size s with a block of size s', where s' is the smallest size greater than or equal to s.
For large blocks of storage use the heap manager.
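The free-list idea can be sketched in a few lines. The Python code below is a toy model under stated assumptions: addresses are plain integers, the class name SizeClassAllocator and its methods are invented for this example, and make_bump_allocator is a stand-in for the general heap manager, which the notes do not describe.

```python
def make_bump_allocator(start=0):
    """Stand-in for the heap manager: hands out ever-increasing addresses."""
    next_addr = [start]

    def alloc(n):
        addr = next_addr[0]
        next_addr[0] += n
        return addr

    return alloc


class SizeClassAllocator:
    """Keeps one list of free blocks for each size of interest."""

    def __init__(self, sizes, heap_manager):
        self.sizes = sorted(sizes)
        self.free = {s: [] for s in self.sizes}   # size -> free block addresses
        self.heap_manager = heap_manager

    def allocate(self, s):
        """Return (block size, address) for a request of size s."""
        for size in self.sizes:
            if size >= s:                         # smallest size class that fits
                if self.free[size]:
                    return size, self.free[size].pop()
                return size, self.heap_manager(size)
        return s, self.heap_manager(s)            # large block: heap manager

    def release(self, size, addr):
        """Return a block to the free list of its size class, if it has one."""
        if size in self.free:
            self.free[size].append(addr)
```

With size classes 8, 16 and 32, a request for 10 bytes is filled with a 16-byte block, and releasing that block makes it available to the next request that fits the same class.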
CS416 Compiler Design 44
Type Checking
A compiler has to do semantic checks in addition to syntactic checks.
1. Type checks: report an error if an operator is applied to an incompatible operand.
2. Flow-of-control checks: a statement that transfers control out of a construct must have an enclosing construct to leave, e.g., a for loop or a switch.
3. Uniqueness checks: an object must be declared only once, e.g., the elements of a scalar type must not be repeated.
4. Name-related checks: sometimes the same name must appear two or more times, e.g., in Ada a loop may have a name that appears at both the beginning and the end of the construct.
Type systems
A type system is a collection of rules for assigning type expressions to the parts of a program.
A type checker implements a type system.
The design of a type checker for a language is based on information about the syntactic constructs in the language.
E.g., if both operands of the arithmetic operators +, -, * are of type integer, then the result is of type integer.
Each expression has a type associated with it.
Type expressions
The type of a language construct is denoted by a type expression.
A type expression is either a basic type or is formed by applying an operator called a type constructor to other type expressions.
The sets of basic types and constructors depend on the language to be checked.
1. A basic type is a type expression. Among the basic types are boolean, char, integer and real. The special basic type type_error signals an error during type checking. The basic type void denotes the absence of a value.
2. Type expressions may be named, so a type name is a type expression.
3. A type constructor applied to type expressions is a type expression.
arrays: If T is a type expression, then array(I,T) is a type expression, where I denotes an index range. Ex: array(0..99,int)
products: If T1 and T2 are type expressions, then their Cartesian product T1 x T2 is a type expression. Ex: int x int
pointers: If T is a type expression, then pointer(T) is a type expression. Ex: pointer(int)
The declaration
var p: ^row
declares the variable p to have type pointer(row).
Record:
The difference between a record and a product is that the fields of a record have names.
The record type constructor is applied to a tuple formed from field names and field types.
type row = record
  address: integer;
  lexeme: array [1..15] of char
end;
var table: array [1..101] of row;
This declares the type name row to represent the type expression
record((address x integer) x (lexeme x array(1..15,char)))
and the variable table to be an array of records of this type.
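4. Type expressions may contain variables whose values are type expressions.
[Figure: tree representation of the type expression char x char -> pointer(integer). The root -> has two children: x, with leaves char and char, and pointer, with leaf integer.]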
Type systems
A type system is a collection of rules for assigning type expressions to the various parts of a program. A type checker implements a type system.
Static and dynamic checking of types:
Checking done by a compiler is said to be static, while checking done when the target program runs is termed dynamic.
Any check can be done dynamically if the target code carries the type of an element along with the value of that element.
A sound type system eliminates the need for run-time checking for type errors.
A programming language is strongly typed if every program its compiler accepts will execute without type errors.
For example, if we first declare
table: array[0..255] of char;
i: integer
and then compute table[i], a compiler cannot in general guarantee that during execution the value of i will lie in the range 0 to 255.
Error recovery:
A type checker should do something reasonable when an error is discovered.
At the very least, the compiler must report the nature and location of the error.
Specification of a simple type checker
P -> D ; E
A program P is a sequence of declarations D followed by an expression E.
D -> id : T                    { addtype(id.entry, T.type) }
T -> char                      { T.type = char }
T -> int                       { T.type = int }
T -> real                      { T.type = real }
T -> ^T1                       { T.type = pointer(T1.type) }
T -> array[intnum] of T1       { T.type = array(1..intnum.val, T1.type) }
Type Checking of Expressions
The synthesized attribute type for E gives the type expression assigned by the type system to the expression generated by E.
E -> id            { E.type = lookup(id.entry) }
E -> charliteral   { E.type = char }
E -> intliteral    { E.type = int }
E -> realliteral   { E.type = real }
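The declaration and expression rules amount to filling a symbol table and reading it back. The Python sketch below illustrates this under assumptions: type expressions are represented as strings and tuples, the symbol table is a plain dictionary, and returning type_error for an undeclared identifier is a choice made for this example rather than something the notes specify.

```python
symtab = {}   # hypothetical symbol table: identifier -> type expression


def addtype(name, type_expr):
    """D -> id : T   records the declared type of an identifier."""
    symtab[name] = type_expr


def lookup(name):
    """Return the declared type; type_error for undeclared names (assumed)."""
    return symtab.get(name, "type_error")


def pointer(t):
    """T -> ^T1"""
    return ("pointer", t)


def array(n, t):
    """T -> array[intnum] of T1, with index range 1..n"""
    return ("array", (1, n), t)


LITERAL_TYPES = {"charliteral": "char", "intliteral": "int",
                 "realliteral": "real"}


def expr_type(kind, lexeme=None):
    """Type of an expression made of a single identifier or literal token."""
    if kind == "id":
        return lookup(lexeme)
    return LITERAL_TYPES.get(kind, "type_error")
```

For instance, after addtype("p", pointer("int")), the expression p has type ("pointer", "int"), and an integer literal has type int.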
Type Checking of Statements
Language constructs like statements do not have values, so the basic type void is assigned to them.
If an error is detected within a statement, then the type assigned to the statement is type_error.
The semantic rules for type checking statements are:
S -> id = E          { S.type := if id.type = E.type then void else type_error }
S -> if E then S1    { S.type := if E.type = boolean then S1.type else type_error }
S -> while E do S1   { S.type := if E.type = boolean then S1.type else type_error }
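The three statement rules translate almost line for line into a recursive function. In this Python sketch, offered only as an illustration, a statement is a tuple whose shape is an assumption of the example: ("assign", id_type, expr_type), or ("if" or "while", condition_type, body_statement).

```python
def check_stmt(stmt):
    """Return "void" for a well-typed statement, "type_error" otherwise."""
    kind = stmt[0]
    if kind == "assign":                 # S -> id = E
        _, id_type, e_type = stmt
        return "void" if id_type == e_type else "type_error"
    if kind in ("if", "while"):          # S -> if E then S1 | while E do S1
        _, cond_type, body = stmt
        return check_stmt(body) if cond_type == "boolean" else "type_error"
    return "type_error"                  # unknown statement form
```

An error inside a nested statement propagates outward, because the type of the enclosing if or while is the type of its body.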
Type Checking of Functions
E -> E1 ( E2 )   { E.type := if E2.type = s and E1.type = s -> t then t else type_error }
Ex: int f (double x, char y) { ... }
f: double x char -> int
(argument types: double x char; return type: int)
T -> T1 '->' T2   { T.type = T1.type -> T2.type }
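The rule for a function application can be sketched in the same style. In this illustrative Python fragment the tuple representation of types and the names func, product and apply_type are assumptions made for the example.

```python
def func(arg_type, result_type):
    """The type expression arg_type -> result_type."""
    return ("->", arg_type, result_type)


def product(t1, t2):
    """The type expression t1 x t2."""
    return ("x", t1, t2)


def apply_type(e1_type, e2_type):
    """E -> E1 ( E2 ): t if E1 has type s -> t and E2 has type s."""
    if (isinstance(e1_type, tuple) and e1_type[0] == "->"
            and e1_type[1] == e2_type):
        return e1_type[2]
    return "type_error"


# f: double x char -> int, as in the example above
f_type = func(product("double", "char"), "int")
```

Applying f to an argument of type double x char yields int; any other argument type yields type_error.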
Intermediate Code Generation
Intermediate codes are machine-independent codes, but they are close to machine instructions.
The given program in a source language is converted to an equivalent program in an intermediate language by the intermediate code generator.
The intermediate language can be one of many different languages; the designer of the compiler decides on this intermediate language.
Syntax trees can be used as an intermediate language.
Postfix notation can be used as an intermediate language.
Three-address code (quadruples) can be used as an intermediate language.
Three-Address Code (Quadruples)
A quadruple is:
x := y op z
where x, y and z are names, constants or compiler-generated temporaries; op is any operator.
But we may also use the following notation for quadruples (a much better notation because it looks like a machine code instruction):
op y,z,x
Apply operator op to y and z, and store the result in x.
We use the term three-address code because each statement usually contains three addresses (two for operands, one for the result).
Three-Address Statements
Binary Operator: op y,z,result or result := y op z
where op is a binary arithmetic or logical operator. This binary operator is applied to y and z, and the result of the operation is stored in result.
Ex: add a,b,c
gt a,b,c
addr a,b,c
addi a,b,c
Unary Operator: op y,,result or result := op y
where op is a unary arithmetic or logical operator. This unary operator is applied to y, and the result of the operation is stored in result.
Ex: uminus a,,c
not a,,c
inttoreal a,,c
Three-Address Statements (cont.)
Move Operator: mov y,,result or result := y
where the content of y is copied into result.
Ex: mov a,,c
movi a,,c
movr a,,c
Unconditional Jumps: jmp ,,L or goto L
We will jump to the three-address code with the label L, and the execution
continues from that statement.
Ex: jmp ,,L1 // jump to L1
jmp ,,7 // jump to the statement 7
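The binary, unary and move statements are enough to translate an assignment whose right-hand side is an expression tree. The Python sketch below is illustrative: the nested-tuple form of expressions, the function names gen and translate, and the temporary names t1, t2, ... are assumptions of the example. Each quadruple is stored as (op, y, z, result), with an empty string for an unused operand.

```python
import itertools


def gen(expr, code, temps):
    """Append quadruples for expr to code; return the name holding its value."""
    if isinstance(expr, str):            # a name or constant needs no code
        return expr
    op, *args = expr
    operands = [gen(a, code, temps) for a in args]
    result = f"t{next(temps)}"           # compiler-generated temporary
    y = operands[0]
    z = operands[1] if len(operands) == 2 else ""
    code.append((op, y, z, result))
    return result


def translate(target, expr):
    """Three-address code for the assignment target := expr."""
    code = []
    temps = itertools.count(1)
    value = gen(expr, code, temps)
    code.append(("mov", value, "", target))
    return code


# x := a + (-b)
quads = translate("x", ("add", "a", ("uminus", "b")))
```

For this input the generated statements are uminus b,,t1 followed by add a,t1,t2 and mov t2,,x.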
Three-Address Statements (cont.)
Conditional Jumps: jmprelop y,z,L or if y relop z goto L
We will jump to the three-address code with the label L if the result of y relop z is true, and the execution continues from that statement. If the result is false, the execution continues from the statement following this conditional jump statement.
Ex: jmpgt y,z,L1 // jump to L1 if y>z
jmpgte y,z,L1 // jump to L1 if y>=z
jmpe y,z,L1 // jump to L1 if y==z
jmpne y,z,L1 // jump to L1 if y!=z
Our relational operator can also be a unary operator.
jmpnz y,,L1 // jump to L1 if y is not zero
jmpz y,,L1 // jump to L1 if y is zero
jmpt y,,L1 // jump to L1 if y is true
jmpf y,,L1 // jump to L1 if y is false
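A tiny interpreter makes the meaning of the jump statements concrete. The Python sketch below is only an illustration and covers four operators (mov, add, jmp, jmpgt); the function name run, the use of statement indices as jump targets, and the sample program are assumptions of the example. Operands that are strings are looked up as names, anything else is treated as a constant.

```python
def run(quads, env):
    """Execute (op, y, z, x) quadruples; jump targets are statement indices."""
    def val(operand):
        return env[operand] if isinstance(operand, str) else operand

    pc = 0
    while pc < len(quads):
        op, y, z, x = quads[pc]
        pc += 1                          # default: fall through to next statement
        if op == "add":
            env[x] = val(y) + val(z)
        elif op == "mov":
            env[x] = val(y)
        elif op == "jmp":
            pc = x
        elif op == "jmpgt":
            if val(y) > val(z):
                pc = x
        else:
            raise ValueError("unknown operator: " + op)
    return env


# sum := 0; i := 1; while i <= n do begin sum := sum + i; i := i + 1 end
program = [
    ("mov", 0, None, "sum"),         # 0
    ("mov", 1, None, "i"),           # 1
    ("jmpgt", "i", "n", 6),          # 2: leave the loop when i > n
    ("add", "sum", "i", "sum"),      # 3
    ("add", "i", 1, "i"),            # 4
    ("jmp", None, None, 2),          # 5: back to the loop test
]
result = run(program, {"n": 4})
```

With n = 4 the loop body runs four times and sum ends up as 10.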
Structural Equivalence of Type Expressions
How do we know that two type expressions are equal?
As long as type expressions are built from basic types (no type names),
we may use structural equivalence between two type expressions
Structural Equivalence Algorithm (sequiv):
if (s and t are same basic types) then return true
else if (s=array(s1,s2) and t=array(t1,t2)) then return (sequiv(s1,t1) and sequiv(s2,t2))
else if (s = s1 x s2 and t = t1 x t2) then return (sequiv(s1,t1) and sequiv(s2,t2))
else if (s=pointer(s1) and t=pointer(t1)) then return (sequiv(s1,t1))
else if (s = s1 -> s2 and t = t1 -> t2) then return (sequiv(s1,t1) and sequiv(s2,t2))
else return false
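The algorithm carries over directly to a recursive function. In this Python sketch, written for illustration, basic types and index ranges are strings and constructed types are tuples such as ("pointer", "integer"); that representation and the table CONSTRUCTORS are assumptions of the example.

```python
# Constructor name -> number of component type expressions.
CONSTRUCTORS = {"array": 2, "x": 2, "pointer": 1, "->": 2}


def sequiv(s, t):
    """True if the type expressions s and t are structurally equivalent."""
    if isinstance(s, str) or isinstance(t, str):
        return s == t                    # same basic type (or same index range)
    if s[0] != t[0] or s[0] not in CONSTRUCTORS or len(s) != len(t):
        return False                     # different constructors
    # Same constructor: compare the components pairwise.
    return all(sequiv(a, b) for a, b in zip(s[1:], t[1:]))
```

As the notes point out, this only works for type expressions without cycles; on a cyclic structure the recursion would not terminate.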
Names for Type Expressions
In some programming languages, we give a name to a type expression, and we use that name as a type expression afterwards.
type link = ^cell;
var p,q : link;
var r,s : ^cell;
Do p, q, r and s have the same type?
How do we treat type names?
Get the equivalent type expression for a type name (then use structural equivalence), or
Treat a type name as a basic type.
Cycles in Type Expressions
type link = ^cell;
type cell = record
x : int;
next : link
end;
We cannot use structural equivalence if there are cycles in type expressions.
We have to treat type names as basic types, but this means that the type expression link is different from the type expression ^cell.
Type Conversions
x + y: what is the type of this expression (int or double)?
What kind of code do we have to produce if the type of x is double and the type of y is int?
inttoreal y,,t1
real+ t1,x,t2
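The conversion can be inserted by the code generator when it sees operands of different types. The Python sketch below is illustrative only: the function name gen_add is invented, and int+ is a hypothetical operator name chosen by analogy with real+. The operands of real+ are emitted in source order, so they may appear in a different order than in the two statements above.

```python
import itertools


def gen_add(x, x_type, y, y_type, code, temps):
    """Emit code for x + y; return (name of the result, type of the result)."""
    def to_real(name, typ):
        if typ != "int":                 # already a real (double) operand
            return name
        t = f"t{next(temps)}"
        code.append(("inttoreal", name, "", t))
        return t

    if x_type == "int" and y_type == "int":
        op, result_type = "int+", "int"  # "int+" is a hypothetical name
    else:
        x, y = to_real(x, x_type), to_real(y, y_type)
        op, result_type = "real+", "double"
    result = f"t{next(temps)}"
    code.append((op, x, y, result))
    return result, result_type


code = []
name, typ = gen_add("x", "double", "y", "int", code, itertools.count(1))
```

For x of type double and y of type int, the generated code converts y into t1 and then adds with real+, leaving the double result in t2.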
Type Checking
A compiler has to do semantic checks in addition to syntactic checks.
Semantic checks:
Static: done during compilation
Dynamic: done during run time
Type checking is one of these static checking operations.
We may not be able to do all type checking at compile time.
Some systems also use dynamic type checking.