Syntax Directed Translation Professor Yihjia Tsai Tamkang University.

Date posted: 20-Dec-2015

Syntax Directed Translation

Professor Yihjia Tsai

Tamkang University


Phases of a Compiler

• 1. Lexical Analyzer (Scanner)

• Takes the source program and converts it into tokens.

• 2. Syntax Analyzer (Parser)

• Takes tokens and constructs a parse tree.

• 3. Semantic Analyzer

• Takes a parse tree and constructs an abstract syntax tree with attributes.


Phases of a Compiler (Contd.)

• 4. Takes an abstract syntax tree and produces interpreter code (translation output).

• 5. Intermediate-code Generator

• Takes an abstract syntax tree and produces unoptimized intermediate code.

Syntax Directed Translation


Motivation: Parser as Translator

Syntax-directed translation

Stream of tokens → Parser → ASTs, byte code, assembly code, etc.

The parser applies syntax + translation rules (often hardcoded in the parser).


Important

• Syntax directed translation: attaching actions to the grammar rules (productions).

• The actions are executed during compilation (not during the generation of the compiler, and not at run time of the program!): either when replacing a nonterminal with its right-hand side (LL, top-down) or when replacing a handle with a nonterminal (LR, bottom-up).

• The compiler-compiler generates a parser that knows how to parse the program (LR, LL). The actions are “implanted” in the parser and are executed according to the parsing mechanism.


Example: Expressions

• E → E + T
• E → T
• T → T * F
• T → F
• F → ( E )
• F → num


Synthesized Attributes

• The attribute value of the nonterminal on the left-hand side of a grammar rule depends on the values of the attributes on the right-hand side.

• Typical for LR (bottom-up) parsing.

• Example: T → T * F {$$.val:=$1.val*$3.val}

Figure: the parent T.val is computed from the children T.val and F.val.


Example: Expressions in YACC notation

• E → E + T {$$.val:=$1.val+$3.val;}
• E → T {$$.val:=$1.val;}
• T → T * F {$$.val:=$1.val*$3.val;}
• T → F {$$.val:=$1.val;}
• F → ( E ) {$$.val:=$2.val;}
• F → num {$$.val:=$1.val;}


Example 2: Type definitions

• D → T L
• T → int
• T → real
• L → id , L
• L → id


Inherited attributes

• The value of an attribute of one of the symbols on the right-hand side of a grammar rule depends on the attributes of the other symbols in the rule (left or right).

• Typical for LL parsing (top-down).

• D → T {$2.type:=$1.type} L
• L → id , {$3.type:=$$.type} L

Figure: T.type flows to L.type in D → T L, and each L.type flows to the L.type of the nested list in L → id , L.


Type definitions

• D → T {$2.type:=$1.type} L
• T → int {$$.type:=int;}
• T → real {$$.type:=real;}
• L → id , L {gen(id.name,$$.type); $3.type:=$$.type;}
• L → id {gen(id.name,$$.type);}

Figure: T.type = int.


Type definitions: LL(1)

• D → T {$2.type:=$1.type} L
• T → int {$$.type:=int;}
• T → real {$$.type:=real;}
• L → id {gen(id.name,$$.type); $2.type:=$$.type;} R
• R → , id {gen(id.name,$$.type);}
• R → ε

Figure: T.type = int.
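A minimal sketch of how the inherited type attribute could be passed down in a hand-written top-down parser. The function names, the token list format, and the use of a Python list in place of gen are assumptions for illustration; the sketch also lets R recurse so that more than two identifiers are accepted, which goes beyond the rule as written on the slide.

```python
def parse_declaration(tokens):
    """Parse `T id (, id)*` and return the (name, type) pairs handed to gen."""
    generated = []            # stands in for the slides' gen(id.name, type)
    pos = 0

    def expect_id():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] in ("int", "real", ","):
            raise SyntaxError(f"identifier expected at position {pos}")
        pos += 1
        return tokens[pos - 1]

    def parse_T():            # T -> int | real ; synthesizes T.type
        nonlocal pos
        if pos < len(tokens) and tokens[pos] in ("int", "real"):
            pos += 1
            return tokens[pos - 1]
        raise SyntaxError("type name expected")

    def parse_L(inherited_type):      # L -> id {gen} R
        generated.append((expect_id(), inherited_type))
        parse_R(inherited_type)       # the inherited type is handed on to R

    def parse_R(inherited_type):      # R -> , id {gen} R | epsilon (recursive tail is an assumption)
        nonlocal pos
        if pos < len(tokens) and tokens[pos] == ",":
            pos += 1
            generated.append((expect_id(), inherited_type))
            parse_R(inherited_type)

    parse_L(parse_T())        # D -> T {L.type := T.type} L
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return generated
```

For example, `parse_declaration(["int", "a", ",", "b"])` records both `a` and `b` with type `int`, because the type synthesized by T is passed as an argument (the inherited attribute) to the procedures for L and R.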


How to arrange things for LL(1) on stack?

• In addition to the grammar symbols, the stack also holds the actions and a shadow copy of each nonterminal.

• Each time one sees an action on the stack, execute it.

• Shadow copies are used to get synthesized values and pass them further to the right of the rule.


LR parser

Figure: an LR(k) parser driven by the action and goto tables, reading input a + b $, with a stack of states s0 … sn-1, sn and grammar symbols X0 … Xn-1.

• Given the current state on top of the stack and the current token, consult the action table.

• Either shift, i.e., read a new token, put it on the stack, and push the new state, or

• reduce, i.e., remove some elements from the stack and, given the newly exposed top of the stack and the nonterminal to be put on top of the stack, consult the goto table for the new state on top of the stack.


LR parser adapted.

Figure: the same LR(k) parser with action and goto tables, reading input a + b $; the stack entries (states s0 … sn-1, sn and symbols X0 … Xn-1) now also carry attributes.

Same as before, plus:

• Whenever a reduce step is performed, execute the action associated with the grammar rule. If left-to-right inherited attributes exist, actions in the middle of a rule can also be executed.

• A record of attributes, associated with a grammar symbol, can be put on the stack.


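LL parser

Figure: an LL(k) parser driven by a parsing table, reading input a + b $, with a stack holding the symbols X, Y, Z and the end marker $.

• If the top symbol X is a terminal, it must match the current token m.

• If X is a nonterminal, pop it from the top of the stack. Then look at table entry T[X, m] and push the right-hand side of the grammar rule found there in reverse order.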


Stack       Input     Action    Attribute
$           2+3*4$    shift     num.val:=2
$num        +3*4$     F→num     F.val:=2
$F          +3*4$     T→F       T.val:=2
$T          +3*4$     E→T       E.val:=2
$E          +3*4$     shift
$E+         3*4$      shift     num.val:=3
$E+num      *4$       F→num     F.val:=3
$E+F        *4$       T→F       T.val:=3
$E+T        *4$       shift
$E+T*       4$        shift     num.val:=4
$E+T*num    $         F→num     F.val:=4
$E+T*F      $         T→T*F     T.val:=12
$E+T        $         E→E+T     E.val:=14
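The trace above can be reproduced with a small shift-reduce evaluator that keeps a value stack in parallel with the symbol stack. This is only a sketch: it replaces the LR action and goto tables with hand-written decisions that happen to suffice for this expression grammar, and the function name and token format (Python ints for num) are assumptions.

```python
def shift_reduce_eval(tokens):
    """Evaluate tokens such as [2, "+", 3, "*", 4]; return (value, list of steps)."""
    tokens = list(tokens) + ["$"]
    syms, vals = [], []      # parallel stacks: grammar symbols and their val attributes
    trace = []
    pos = 0

    def reduce(count, lhs, value, rule):
        # Pop the handle (count symbols) and push the left-hand side with its attribute.
        del syms[len(syms) - count:]
        del vals[len(vals) - count:]
        syms.append(lhs)
        vals.append(value)
        trace.append(rule)

    while True:
        look = tokens[pos]
        top = syms[-1] if syms else None
        if top == "num":
            reduce(1, "F", vals[-1], "F->num")
        elif syms[-3:] == ["(", "E", ")"]:
            reduce(3, "F", vals[-2], "F->(E)")
        elif top == "F" and syms[-3:-1] == ["T", "*"]:
            reduce(3, "T", vals[-3] * vals[-1], "T->T*F")
        elif top == "F":
            reduce(1, "T", vals[-1], "T->F")
        elif top == "T" and look != "*" and syms[-3:-1] == ["E", "+"]:
            reduce(3, "E", vals[-3] + vals[-1], "E->E+T")
        elif top == "T" and look != "*":
            reduce(1, "E", vals[-1], "E->T")
        elif syms == ["E"] and look == "$":
            return vals[0], trace            # accept
        elif look == "$":
            raise SyntaxError("unexpected end of input")
        else:                                # shift the lookahead token
            is_num = isinstance(look, int)
            syms.append("num" if is_num else look)
            vals.append(look if is_num else None)
            pos += 1
            trace.append("shift")
```

Each reduction computes the attribute of the left-hand side from the attributes of the popped handle, which is exactly the synthesized-attribute pattern: the action runs at the moment of the reduce step.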


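LL parser Adapted

Figure: the LL(k) parser and parsing table as before, reading input a + b $; the stack now also holds attributes and actions alongside the symbols X, Y, Z and the end marker $.

• If the top symbol X is a terminal, it must match the current token m.

• Put actions onto the stack as part of the rules. Hold for each nonterminal a record with its attributes.

• If X is a nonterminal, replace the top of the stack with its shadow copy. Then look at table entry T[X, m] and push the right-hand side of the grammar rule found there in reverse order.

• If the top is a shadow copy, remove it. This way a nonterminal can deliver values both down and up.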


On stack            To be read   Rule       Action
$D                  int a,b$
$(D)L{}T            int a,b$     D→T{}L
$(D)L{}(T){}int     int a,b$     T→int{}
$(D)L{}(T)          a,b$                    T.type:=int
$(D)L               a,b$                    L.type:=int
$(D)(L)R{}id        a,b$         L→id{}R
$(D)(L)R            ,b$                     Gen(a,int), R.type:=int
$(D)(L)(R){}id,     ,b$          R→, id
$(D)(L)(R){}id      b$
$(D)(L)(R)          $                       Gen(b,int), R.type:=int


Expressions in LL: Eliminating left recursion

Original grammar:

• E → E + T
• E → T
• T → T * F
• T → F
• F → ( E )
• F → num

Grammar without left recursion:

• E → T E’
• E’ → + T E’
• E’ → ε
• T → F T’
• T’ → * F T’
• T’ → ε
• F → ( E )
• F → num


(2+3)*4

Figure: parse tree for (2+3)*4 using the grammar without left recursion (E → T E’, E’ → + T E’ | ε, T → F T’, T’ → * F T’ | ε, F → ( E ) | num).


Actions in LL

• E → T {$2.down:=$1.up;} E’ {$$.up:=$2.up;}
• E’ → + T {$3.down:=$$.down+$2.up;} E’ {$$.up:=$3.up;}
• E’ → ε {$$.up:=$$.down;}
• T → F {$2.down:=$1.up;} T’ {$$.up:=$2.up;}
• T’ → * F {$3.down:=$$.down*$2.up;} T’ {$$.up:=$3.up;}
• T’ → ε {$$.up:=$$.down;}
• F → ( E ) {$$.up:=$2.up;}
• F → num {$$.up:=$1.up;}

Figure: the parse tree for (2+3)*4, annotated with these actions.
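The down/up attributes map naturally onto a recursive-descent parser: down becomes a parameter of the procedure for E’ or T’, and up becomes its return value. The following is a minimal sketch under that assumption; the function names and the token format (Python ints for num) are illustrative only.

```python
def evaluate(tokens):
    """Evaluate an expression such as ["(", 2, "+", 3, ")", "*", 4] top-down."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else "$"

    def parse_E():                    # E -> T E'   (E'.down := T.up ; E.up := E'.up)
        return parse_E_tail(parse_T())

    def parse_E_tail(down):           # E' -> + T E' | epsilon
        nonlocal pos
        if peek() == "+":
            pos += 1
            return parse_E_tail(down + parse_T())
        return down                   # epsilon: up := down

    def parse_T():                    # T -> F T'
        return parse_T_tail(parse_F())

    def parse_T_tail(down):           # T' -> * F T' | epsilon
        nonlocal pos
        if peek() == "*":
            pos += 1
            return parse_T_tail(down * parse_F())
        return down                   # epsilon: up := down

    def parse_F():                    # F -> ( E ) | num
        nonlocal pos
        tok = peek()
        if tok == "(":
            pos += 1
            value = parse_E()
            if peek() != ")":
                raise SyntaxError("')' expected")
            pos += 1
            return value
        if isinstance(tok, int):
            pos += 1
            return tok
        raise SyntaxError(f"unexpected token {tok!r}")

    result = parse_E()
    if peek() != "$":
        raise SyntaxError(f"unexpected token {peek()!r}")
    return result
```

Passing the partial result as `down` keeps `+` and `*` left-associative even though the grammar itself is right-recursive after left recursion has been eliminated.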


Syntax Directed Translation Scheme

• A syntax directed translation scheme is a syntax directed definition in which the net effect of semantic actions is to print out a translation of the input to a desired output form.

• This is accomplished by including “emit” statements in semantic actions that write out text fragments of the output, as well as string-valued attributes that compute text fragments to be fed into emit statements.


Syntax-Directed Translation

1. Values of the attributes are evaluated by the semantic rules associated with the production rules.

2. Evaluation of these semantic rules:
– may generate intermediate code
– may put information into the symbol table
– may perform type checking
– may issue error messages
– may perform some other activities
– in fact, may perform almost any activity.

3. An attribute may hold almost anything:
– a string, a number, a memory location, a complex record.

4. Grammar symbols are associated with attributes to associate information with the programming language constructs that they represent.


Syntax-Directed Definitions and Translation Schemes

1. When we associate semantic rules with productions, we use two notations:
– Syntax-Directed Definitions
– Translation Schemes


Schemes

A. Syntax-Directed Definitions:
– give high-level specifications for translations
– hide many implementation details such as the order of evaluation of semantic actions.
– We associate a production rule with a set of semantic actions, and we do not say when they will be evaluated.

B. Translation Schemes:
– indicate the order of evaluation of semantic actions associated with a production rule.
– In other words, translation schemes give some information about implementation details.


Syntax-Directed Definitions

1. A syntax-directed definition is a generalization of a context-free grammar in which:
– Each grammar symbol is associated with a set of attributes.
– This set of attributes for a grammar symbol is partitioned into two subsets, called the synthesized and the inherited attributes of that grammar symbol.
– Each production rule is associated with a set of semantic rules.

2. Semantic rules set up dependencies between attributes, which can be represented by a dependency graph.

3. This dependency graph determines the evaluation order of these semantic rules.

4. Evaluation of a semantic rule defines the value of an attribute. But a semantic rule may also have some side effects such as printing a value.


Annotated Parse Tree

1. A parse tree showing the values of attributes at each node is called an annotated parse tree.

2. The process of computing the attribute values at the nodes is called annotating (or decorating) the parse tree.

3. Of course, the order of these computations depends on the dependency graph induced by the semantic rules.


Syntax-Directed Definition

In a syntax-directed definition, each production A→α is associated with a set of semantic rules of the form:

b=f(c1,c2,…,cn)

where f is a function and b is one of the following:

• b is a synthesized attribute of A and c1,c2,…,cn are attributes of the grammar symbols in the production ( A→α ).

OR

• b is an inherited attribute of one of the grammar symbols in α (on the right side of the production), and c1,c2,…,cn are attributes of the grammar symbols in the production ( A→α ).


Attribute Grammar

• So, a semantic rule b=f(c1,c2,…,cn) indicates that the attribute b depends on attributes c1,c2,…,cn.

• In a syntax-directed definition, a semantic rule may just evaluate a value of an attribute or it may have some side effects such as printing values.

• An attribute grammar is a syntax-directed definition in which the functions in the semantic rules cannot have side effects (they can only evaluate values of attributes).


Syntax-Directed Definition -- Example

Production        Semantic Rules
L → E return      print(E.val)
E → E1 + T        E.val = E1.val + T.val
E → T             E.val = T.val
T → T1 * F        T.val = T1.val * F.val
T → F             T.val = F.val
F → ( E )         F.val = E.val
F → digit         F.val = digit.lexval

1. Symbols E, T, and F are associated with a synthesized attribute val.

2. The token digit has a synthesized attribute lexval (it is assumed that it is evaluated by the lexical analyzer).


Annotated Parse Tree -- Example

Input: 5+3*4

L
  E.val=17
    E.val=5
      T.val=5
        F.val=5
          digit.lexval=5
    +
    T.val=12
      T.val=3
        F.val=3
          digit.lexval=3
      *
      F.val=4
        digit.lexval=4
  return
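Annotating a parse tree with a purely synthesized attribute amounts to a post-order walk. The sketch below builds the E subtree for 5+3*4 by hand and fills in val at every node; the Node class, the helper digit, and the way productions are recognized from the children are assumptions made for illustration.

```python
class Node:
    def __init__(self, symbol, children=(), lexval=None):
        self.symbol = symbol
        self.children = list(children)
        self.lexval = lexval      # supplied by the lexical analyzer for digit leaves
        self.val = None           # synthesized attribute, filled in by annotate()

def annotate(node):
    """Compute val bottom-up: children first, then the node itself."""
    for child in node.children:
        annotate(child)
    kids = node.children
    if not kids:                              # leaf: digit, operator or parenthesis
        node.val = node.lexval
    elif len(kids) == 1:                      # E -> T, T -> F, F -> digit
        node.val = kids[0].val
    elif kids[1].symbol == "+":               # E -> E1 + T
        node.val = kids[0].val + kids[2].val
    elif kids[1].symbol == "*":               # T -> T1 * F
        node.val = kids[0].val * kids[2].val
    elif kids[0].symbol == "(":               # F -> ( E )
        node.val = kids[1].val
    return node.val

def digit(n):
    """Build the subtree F -> digit with digit.lexval = n."""
    return Node("F", [Node("digit", lexval=n)])

tree = Node("E", [
    Node("E", [Node("T", [digit(5)])]),
    Node("+"),
    Node("T", [Node("T", [digit(3)]), Node("*"), digit(4)]),
])
result = annotate(tree)
```

After the call, `tree.val` holds the value at the root and `tree.children[2].val` holds the value of the T node for 3*4, matching the annotated tree.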


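Dependency Graph

Input: 5+3*4

Figure: the dependency graph follows the annotated parse tree, with every edge pointing from a child attribute to the parent attribute computed from it:

digit.lexval=5 → F.val=5 → T.val=5 → E.val=5
digit.lexval=3 → F.val=3 → T.val=3
digit.lexval=4 → F.val=4
T.val=3, F.val=4 → T.val=12
E.val=5, T.val=12 → E.val=17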
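Since the dependency graph determines the evaluation order, any topological order of the graph is a valid order in which to apply the semantic rules. Here is a minimal sketch using Python's standard `graphlib` module (Python 3.9+); the attribute instance names such as `T3.val` are made up to tell the occurrences apart.

```python
from graphlib import TopologicalSorter

# Each attribute instance maps to the set of attribute instances it depends on.
deps = {
    "F1.val": {"digit1.lexval"},
    "T1.val": {"F1.val"},
    "E1.val": {"T1.val"},
    "F2.val": {"digit2.lexval"},
    "T2.val": {"F2.val"},
    "F3.val": {"digit3.lexval"},
    "T3.val": {"T2.val", "F3.val"},
    "E.val": {"E1.val", "T3.val"},
}

# Semantic rules, written as functions of the already computed values.
rules = {
    "F1.val": lambda v: v["digit1.lexval"],
    "T1.val": lambda v: v["F1.val"],
    "E1.val": lambda v: v["T1.val"],
    "F2.val": lambda v: v["digit2.lexval"],
    "T2.val": lambda v: v["F2.val"],
    "F3.val": lambda v: v["digit3.lexval"],
    "T3.val": lambda v: v["T2.val"] * v["F3.val"],
    "E.val": lambda v: v["E1.val"] + v["T3.val"],
}

# lexval is assumed to come from the lexical analyzer (input 5+3*4).
values = {"digit1.lexval": 5, "digit2.lexval": 3, "digit3.lexval": 4}

order = list(TopologicalSorter(deps).static_order())
for attribute in order:
    if attribute in rules:            # lexval entries are already known
        values[attribute] = rules[attribute](values)
```

`static_order()` lists every attribute after all attributes it depends on, so each rule finds its arguments already computed. A cyclic dependency graph would make `TopologicalSorter` raise `CycleError`, which corresponds to a definition that has no valid evaluation order.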


Syntax-Directed Definition – Example 2

Production        Semantic Rules
E → E1 + T        E.loc = newtemp(), E.code = E1.code || T.code || add E1.loc,T.loc,E.loc
E → T             E.loc = T.loc, E.code = T.code
T → T1 * F        T.loc = newtemp(), T.code = T1.code || F.code || mult T1.loc,F.loc,T.loc
T → F             T.loc = F.loc, T.code = F.code
F → ( E )         F.loc = E.loc, F.code = E.code
F → id            F.loc = id.name, F.code = “”

1. Symbols E, T, and F are associated with synthesized attributes loc and code.

2. The token id has a synthesized attribute name (it is assumed that it is evaluated by the lexical analyzer).

3. It is assumed that || is the string concatenation operator.
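The loc and code attributes can be sketched as a function that returns both for each expression node. In this illustration the expression is given as a nested tuple rather than a parse tree, code is a list of instruction strings instead of one concatenated string, and the names make_generator and gen are assumptions.

```python
import itertools

def make_generator():
    """Return a gen(node) function with its own temporary counter."""
    counter = itertools.count(1)

    def newtemp():
        return f"t{next(counter)}"

    def gen(node):
        """Return (loc, code) for an identifier or a tuple (op, left, right)."""
        if isinstance(node, str):             # F -> id: loc = id.name, code = ""
            return node, []
        op, left, right = node
        left_loc, left_code = gen(left)
        right_loc, right_code = gen(right)
        loc = newtemp()                       # E.loc / T.loc = newtemp()
        mnemonic = {"+": "add", "*": "mult"}[op]
        instruction = f"{mnemonic} {left_loc},{right_loc},{loc}"
        return loc, left_code + right_code + [instruction]

    return gen

gen = make_generator()
loc, code = gen(("+", "a", ("*", "b", "c")))   # a + b * c
```

For a + b * c this yields the multiplication into the first temporary, followed by the addition of `a` and that temporary into the second one; the code of the operands always precedes the instruction that combines them.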


Syntax-Directed Definition – Inherited Attributes

Production        Semantic Rules
D → T L           L.in = T.type
T → int           T.type = integer
T → real          T.type = real
L → L1 id         L1.in = L.in, addtype(id.entry,L.in)
L → id            addtype(id.entry,L.in)

1. Symbol T is associated with a synthesized attribute type.

2. Symbol L is associated with an inherited attribute in.


A Dependency Graph – Inherited Attributes

Input: real p q

Parse tree:

D
  T
    real
  L
    L
      id (id.entry=p)
    id (id.entry=q)

Dependency graph:

T.type=real → L.in=real → L1.in=real
L.in=real, id.entry=q → addtype(q,real)
L1.in=real, id.entry=p → addtype(p,real)


Syntax Trees

1. Decoupling translation from parsing: trees.
2. Syntax tree: an intermediate representation of the compiler’s input.
3. Example procedures: mknode, mkleaf
4. Employment of the synthesized attribute nptr (pointer)

PRODUCTION        SEMANTIC RULE
E → E1 + T        E.nptr = mknode(“+”, E1.nptr, T.nptr)
E → E1 - T        E.nptr = mknode(“-”, E1.nptr, T.nptr)
E → T             E.nptr = T.nptr
T → ( E )         T.nptr = E.nptr
T → id            T.nptr = mkleaf(id, id.lexval)
T → num           T.nptr = mkleaf(num, num.val)


Draw the Syntax Tree

a-4+c

+
  -
    id (to entry for a)
    num 4
  id (to entry for c)


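Directed Acyclic Graphs for Expressions

a + a * ( b – c ) + ( b – c ) * d

DAG (each shared node appears once):

+ (root): children are the inner + and the right *
inner +: children are a and the left *
left *: children are a and -
right *: children are - and d
-: children are b and c

The leaf a and the subexpression b – c are shared.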
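One common way to obtain such a DAG is to have mknode and mkleaf look up an existing node with the same operator and children before creating a new one. The sketch below does this with a dictionary keyed by node structure; the class name DagBuilder and the tuple representation of nodes are assumptions for illustration.

```python
class DagBuilder:
    def __init__(self):
        self.nodes = {}                   # structural key -> the single shared node

    def mkleaf(self, name):
        key = ("id", name)
        return self.nodes.setdefault(key, key)

    def mknode(self, op, left, right):
        key = (op, left, right)
        # Reuse the node if an identical (op, left, right) was built before.
        return self.nodes.setdefault(key, key)

dag = DagBuilder()
a, b, c, d = (dag.mkleaf(name) for name in "abcd")

# a + a * (b - c) + (b - c) * d
left = dag.mknode("+", a, dag.mknode("*", a, dag.mknode("-", b, c)))
right = dag.mknode("*", dag.mknode("-", b, c), d)
root = dag.mknode("+", left, right)
```

The second request for `b - c` returns the node created by the first one, so the DAG has 9 nodes where the plain syntax tree for the same expression would have 13.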


Examples

• 1. Postfix and prefix notations: we have already seen how to generate them.
• Let us generate Java byte code.
• E -> E ‘+’ E { emit(“iadd”); }
• E -> E ‘*’ E { emit(“imul”); }
• E -> T
• T -> ICONST { emit(“sipush ” || ICONST.string); }
• T -> ‘(’ E ‘)’


Abstract Stack Machine

• The abstract machine code for an expression simulates a stack evaluation of the postfix representation for the expression. Expression evaluation proceeds by processing the postfix representation from left to right.


Evaluation

• 1. Pushing each operand onto the stack when encountered.

• 2. Evaluating a k-ary operator by using the value located k-1 positions below the top of the stack as the leftmost operand, and so on, till the value on the top of the stack is used as the rightmost operand.

• 3. After the evaluation, all k operands are popped from the stack, and the result is pushed onto the stack (or there could be a side effect).
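The three evaluation steps can be sketched as a tiny interpreter for a handful of stack instructions. This is a simplification, not the JVM: variables are addressed by name as in the slides (the real JVM uses numbered local variable slots), there is no 32-bit overflow, and the function name run is an assumption.

```python
import operator

BINARY = {"iadd": operator.add, "isub": operator.sub, "imul": operator.mul}

def run(program, variables):
    """Execute a list of instruction strings; return the final variables."""
    stack = []
    for line in program:
        op, *args = line.split()
        if op in ("bipush", "sipush"):        # step 1: push operands
            stack.append(int(args[0]))
        elif op == "iload":
            stack.append(variables[args[0]])
        elif op in BINARY:                    # steps 2 and 3 for a 2-ary operator
            right = stack.pop()               # top of stack is the rightmost operand
            left = stack.pop()                # one position below is the leftmost
            stack.append(BINARY[op](left, right))
        elif op == "istore":                  # side effect instead of a pushed result
            variables[args[0]] = stack.pop()
        else:
            raise ValueError(f"unsupported instruction {op!r}")
    return variables

program = ["bipush 3", "iload b", "imul", "iload c", "isub", "istore a"]
final = run(program, {"b": 5, "c": 2})
```

With b = 5 and c = 2, the program computes 3*b - c and stores it in `a`, leaving the operand stack empty.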


Example

Stmt -> ID ‘=’ expr { stmt.t = expr.t || ‘istore a’ }

Applied to a = 3*b – c:

bipush 3
iload b
imul
iload c
isub
istore a
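The translation attribute t can be sketched as a function that concatenates the code of the operands followed by the operator, which is just postfix order. The tuple representation of expressions and the function names are assumptions; variables are named symbolically, as in the slides.

```python
OPCODES = {"+": "iadd", "-": "isub", "*": "imul"}

def gen_expr(node):
    """Return the instruction list (expr.t) for an int, a variable name or (op, left, right)."""
    if isinstance(node, int):
        # bipush takes a one-byte constant; larger constants are assumed to fit sipush.
        push = "bipush" if -128 <= node <= 127 else "sipush"
        return [f"{push} {node}"]
    if isinstance(node, str):
        return [f"iload {node}"]
    op, left, right = node
    return gen_expr(left) + gen_expr(right) + [OPCODES[op]]

def gen_assign(target, expr):
    """Stmt -> ID '=' expr : stmt.t = expr.t || 'istore ID'."""
    return gen_expr(expr) + [f"istore {target}"]

code = gen_assign("a", ("-", ("*", 3, "b"), "c"))   # a = 3*b - c
```

The result is the same six-instruction sequence as listed above for a = 3*b – c.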


Java Virtual Machine

• Analogous to the abstract stack machine, the Java Virtual machine is an abstract processor architecture that defines the behavior of Java Bytecode programs.

• The stack (in JVM) is referred to as the operand stack or value stack. Operands are fetched from the stack and the result is pushed back on to the stack.

• Advantages: VM code is compact as the operands need not be explicitly named.


Data Types

• The int data type can hold 32-bit signed integers in the range -2^31 to 2^31 - 1.

• The long data type can hold 64 bit signed integers.

• Integer instructions in the Java VM are also used to operate on Boolean values.

• Other data types that Java VM supports are byte, short, float, double. (Your project should handle at least three data types).


Selected Java VM Instructions

Java VM instructions are typed, i.e., the operator explicitly specifies what operand types it expects.

Expression Evaluation

• sipush n    push a 2-byte signed int onto the stack
• iload v     load/push a local variable v
• istore v    store top of stack into local variable v
• iadd        pop two elements and push their sum
• isub        pop two elements and push their difference


Selected Java VM Instructions

• imul    pop two elements and push their product
• iand    pop two elements and push their bitwise and
• ior     pop two elements and push their bitwise or
• ineg    pop top element and push its negation
• lcmp    pop two elements (64-bit integers), push the comparison result: 1 if vs[0]<vs[1], 0 if vs[0]=vs[1], otherwise -1.
• i2l     convert integer to long
• l2i     convert long to integer


Selected Java VM Instructions

• Branches:
• goto L    unconditional transfer to label L
• ifeq L    transfer to label L if top of stack is 0
• ifne L    transfer to label L if top of stack != 0
• Call/Return: Each method/procedure has memory space allocated to hold local variables (vars register), an operand stack (optop register) and an execution environment (frame register).


Selected Java VM Instructions

• invokestatic p    invoke method p. Pop args from the stack as initial values of formal parameters (actual parameters are pushed before calling).
• return     return from current procedure
• ireturn    return from current procedure with integer value on top of stack.
• areturn    return from current procedure with object reference return value on top of stack.


Selected Java VM Instructions

Array Manipulation: the Java VM has a data type for references to arrays and objects.

• newarray int    create a new array of integers using the top of the stack as the size. Pop the stack and push a reference to the newly created array.

• iaload    pop the array subscript expression on top of the stack and the array pointer (next stack element). Push the value contained in this array element.

• iastore


Selected Java VM Instructions

Object Manipulation

• new c         create a new instance of class c (using the heap) and push the reference onto the stack.
• getfield f    push the value from field f of the object pointed to by the object reference at the top of the stack.
• putfield f    store the value from vs[1] into field f of the object pointed to by the object reference vs[0].


Selected Java VM Instructions

Simplifying Instructions:

• ldc c    a macro which will generate either bipush or sipush depending on the constant c.


Byte Code (JVM Instructions)

• No-arg operand: instructions needing no arguments, which hence take only one byte.
• Examples: aaload, aastore, aconst_null, aload_0, aload_1, areturn, arraylength, astore_0, athrow, baload, iaload, imul, etc.
• One-arg operand: bipush, sipush, ldc, etc.
• methodref op: invokestatic, invokenonvirtual, invokevirtual


Byte Code (JVM Instructions)

• Fieldref_arg_op: getfield, getstatic, putfield, putstatic
• Class_arg_op: checkcast, instanceof, new
• labelarg_op (instructions that use labels): goto, ifeq, ifne, jsr, jsr_w, etc.
• Localvar_arg_op: iload, fload, aload, istore


Translating an If statement

Stmt -> if expr then stmt1 { out = newlabel();
  stmt.t = expr.t || ‘ifnonnull’ || out || stmt1.t || ‘label’ out: ‘nop’ }

example: if ( a + 90 == 7 ) { x = x+1; x = x+3; }


Translating a while statement

Stmt -> WHILE (expr) stmt1 { in = newlabel(); out = newlabel();
  emit( stmt.t = ‘label’ || in || ‘nop’ || expr.t || ‘ifnonnull’ || out || stmt1.t || ‘goto’ || in || ‘label’ || out ) }
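The label handling for if and while can be sketched as follows. Two choices here are assumptions rather than the slides' exact scheme: labels are written as `L1:` lines, and the conditional jump is `ifeq` (listed earlier as "transfer to label L if top of stack is 0"), on the assumption that expr.t leaves an integer truth value on the stack.

```python
import itertools

class ControlFlowGen:
    def __init__(self):
        self._counter = itertools.count(1)

    def newlabel(self):
        return f"L{next(self._counter)}"

    def gen_if(self, cond_code, body_code):
        """if expr then stmt1: skip the body when the condition is 0."""
        out = self.newlabel()
        return cond_code + [f"ifeq {out}"] + body_code + [f"{out}:", "nop"]

    def gen_while(self, cond_code, body_code):
        """while (expr) stmt1: test at the top, jump back after the body."""
        top, out = self.newlabel(), self.newlabel()
        return ([f"{top}:", "nop"] + cond_code + [f"ifeq {out}"]
                + body_code + [f"goto {top}", f"{out}:", "nop"])

cg = ControlFlowGen()
# while (x) x = x - 1;
loop = cg.gen_while(["iload x"], ["iload x", "bipush 1", "isub", "istore x"])
```

Each statement asks for fresh labels, so nested or consecutive loops and conditionals never reuse a label.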


References

• Compilers Principles, Techniques and Tools, Aho, Sethi, and Ullman , Chapter 5

• Appel, Chapter 4 and 5