Cse322, Programming Languages and Compilers 1 7/15/2015 Lecture #7, April 24, 2007 More about IR1...


Cse322, Programming Languages and Compilers


Lecture #7, April 24, 2007
• More about IR1
• Library Functions
• Canonicalization
• OO runtime issues
• Object size
• Initialization


Assignments

• Project #1 due Thursday, May 3, 2007

• Recall Midterm Exam on Tuesday May 1, 2007. In class, 1.5 hours, two days before Project 1 is due.


IR1 simplifications

• As we move into the backend of the compiler we are making some simplifications.

• We have only integer and Boolean values (no more floating point values)

• All values require 32 bits to store (including booleans)

• Values and pointers (addresses) take up the same amount of space (32 bits).

• Every value takes up exactly wdSize bytes (where wdSize = 4)


Semantics of Exp

datatype EXP
  = BINOP of ProgramTypes.BINOP * EXP * EXP
  | RELOP of ProgramTypes.RELOP * EXP * EXP
  | CALL of EXP * EXP list
  | MEM of EXP
  | NAME of string              (* method names *)
  | TEMP of int                 (* registers *)
  | PARAM of int                (* method parameters *)
  | MEMBER of EXP * int         (* instance variables *)
  | VAR of int                  (* local vars of methods *)
  | CONST of string
  | STRING of string
  | ESEQ of STMT list * EXP

Expressions represent values, but some values (mostly new arrays and new objects) require actions to complete. ESEQ allows us to embed actions in expressions.

We will discuss some of the highlights next.


BINOP and RELOP

• These are straightforward translations of their ProgramTypes.sml counterparts.

• Binops translate directly.

• Relops (LT, GT, etc.) translate as if they were ordinary operators, just like ADD, TIMES, etc.

• The Binops AND and OR generally appear in the tests of the While and If statements. We use short-circuit translations for these.

• If AND or OR appears in an expression (rather than a statement) we can still use short-circuit evaluation: wrap the code in an ESEQ expression, use a local temp, and generate statements (inside the ESEQ) that move either true or false into the local temp. (This is already done in the template code handed out.)
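
The ESEQ trick for AND in an expression position can be sketched as follows. This is only an illustration, not the template code: the cut-down datatypes, the CJUMP and LABEL statement forms, the fresh-name helper, and the encoding of false/true as CONST "0"/CONST "1" are all assumptions made for the example.

```sml
(* Hypothetical cut-down IR: only the constructors this sketch needs. *)
datatype EXP
  = CONST of string
  | TEMP of int
  | VAR of int
  | ESEQ of STMT list * EXP
and STMT
  = MOVE of EXP * EXP
  | CJUMP of EXP * string   (* assumed form: jump to the label when EXP is false *)
  | LABEL of string

(* Assumed supply of fresh numbers for temps and labels. *)
val counter = ref 100
fun fresh () = (counter := !counter + 1; !counter)

(* (a AND b) as an expression: b is tested only when a is true. *)
fun shortAnd (a, b) =
  let val t    = TEMP (fresh ())
      val exit = "L" ^ Int.toString (fresh ())
  in ESEQ ([ MOVE (t, CONST "0")   (* assume the answer is false *)
           , CJUMP (a, exit)       (* a false: skip b entirely   *)
           , CJUMP (b, exit)       (* b false: answer stays false *)
           , MOVE (t, CONST "1")   (* both true                  *)
           , LABEL exit ]
          , t)
  end

val example = shortAnd (VAR 1, VAR 2)
```

OR would follow the same shape with the roles of true and false swapped.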


Library functions

• Several missing operations can be translated into library functions.

• A library function is a function supplied by the runtime environment. Possible library functions include Boolean negation, unary minus, malloc, coerce, etc.

• A library function translates to a call.

CALL (NAME "unary_minus", [VAR 1])

CALL (NAME "negate", [VAR 3])

• We use the NAME expression to name library functions as well as the functions we generate to implement methods.

• We will develop other library functions as we go along.


MEM

• MEM has no direct counterpart in ProgramTypes.sml

• Its meaning is to fetch the contents of a memory location.

• Its value is the value of that memory location.

• It is always a 32 bit value.

• Several other expression constructors have as their value memory locations. These constructors are appropriate arguments to MEM.
  – They include TEMP, PARAM, MEMBER, and VAR


Addresses

• Variables, methods, members, and parameters all have their values stored in memory locations.

• Most of these addresses are fixed offsets from some known address, such as the current object pointer or the activation record pointer.

• The runtime system will define these known addresses.


(PARAM n)

• This means the address of the nth parameter.

• There will be some fixed location for parameters

• We will need to add the correct offset for the nth parameter to this address.

• Under the assumption that all values take up wdSize bytes, the offset = n * wdSize , but we leave this abstract at this stage.


(VAR n)

• This means the address of the nth local variable of the current method.

• There will be some fixed location for local variables.

• We will need to add the correct offset for the nth local to this address.

• Under the assumption that all values take up wdSize bytes, the offset = n * wdSize , but we again leave this abstract at this stage.


(MEMBER(X,n))

• This means the value of the nth member (instance variable) of the object stored at address "X".

• We will need to add the correct offset for the nth member to this address.

Note that X is itself an address.

MEMBER(x,n) = MEM(x + wdSize * n)

under the assumption that all instance variables take up wdSize bytes.
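
A minimal sketch of how the abstract forms PARAM, VAR and MEMBER could later be turned into explicit address arithmetic, assuming every slot is wdSize bytes. The base registers paramBase and localBase, the string operator name "ADD", and the function names are hypothetical; the real runtime system decides where parameters and locals live (and offsets may well be negative there).

```sml
val wdSize = 4

(* Hypothetical cut-down IR; the operator is kept as a plain string here. *)
datatype EXP
  = BINOP of string * EXP * EXP
  | MEM of EXP
  | TEMP of int
  | CONST of string
  | PARAM of int
  | VAR of int
  | MEMBER of EXP * int

(* Assumed known addresses, held in two reserved temps. *)
val paramBase = TEMP 0   (* start of the parameter area      *)
val localBase = TEMP 1   (* start of the local-variable area *)

(* base + n * wdSize *)
fun at (base, n) = BINOP ("ADD", base, CONST (Int.toString (n * wdSize)))

(* Replace the abstract forms by base-plus-offset expressions. *)
fun lower (PARAM n)         = at (paramBase, n)
  | lower (VAR n)           = at (localBase, n)
  | lower (MEMBER (x, n))   = MEM (at (lower x, n))   (* MEM(x + wdSize * n) *)
  | lower (MEM e)           = MEM (lower e)
  | lower (BINOP (m, a, b)) = BINOP (m, lower a, lower b)
  | lower e                 = e

val example = lower (MEMBER (TEMP 7, 2))
```

Here example is MEM applied to TEMP 7 plus 8, i.e. the third word of the object whose address is in TEMP 7.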


this.x

• Recall that instance variables without an object prefix are really: this.x

• What is the address of this?

• "this" refers to the current object: the object whose method is being executed.

• x.f(1,3)

The object of a method call is an implicit 0th parameter.


Printing

• Printing is handled by library functions.

• We will need one library function for each kind of object we can print.

• In general we'd need a Basic type tag in the ProgramTypes PrintE constructor to support this.

• Let's assume we print only Integer values. Then we need only two library functions.

• One for printing literal strings (PrintT), called (NAME "prStr"), and one for integer values (PrintE), called (NAME "prInt").


Translating Methods

• Each method is translated into a FUNC.

• To translate you need to
  – Create a proper name: Classname_methodname
    » You may need to track the current class so the classname is available
  – Translate the method's variable declarations into a (possibly empty) STMT list
  – Translate the method's body into a STMT list

• Merge the two STMT lists. Put the variable one first.

• As you do this you will need to prepare the correct environment that tracks the Vkind of variables.

• Return a FUNC object. Be sure to get the (ProgramTypes.Type list) right in the FUNC node, as these will be needed in the second phase.


Translating Methods

fun pass1M className env (MetDecl(loc,rng,name,params,vars,body)) =
  let fun paramTypes (Formal(typ,nm)) = typ
      fun paramEnv count [] = []
        | paramEnv count ((Formal(typ,nm))::xs) =
            (nm,Vparam count)::(paramEnv (count+1) xs)
      fun varTypes (VarDecl(loc,typ,nm,init)) = typ
      fun varEnv count [] = []
        | varEnv count ((VarDecl(loc,typ,nm,init))::xs) =
            (nm,Vlocal count)::(varEnv (count+1) xs)
      val initEnv = (paramEnv 1 params) @ env
      fun initIR count [] = []
        | initIR count (VarDecl(loc,typ,nm,SOME init) :: vs) =
            (MOVE(VAR count,pass1E initEnv init)) :: initIR (count+1) vs
        | initIR count (VarDecl(loc,typ,nm,NONE)::xs) = initIR (count+1) xs
      val varIR = initIR 1 vars
      val bodyEnv = (varEnv 1 vars) @ initEnv
      val bodyIR = pass1S bodyEnv (Block body)
  in (FUNC(className^"_"^name
          ,map paramTypes params
          ,map varTypes vars
          ,varIR @ bodyIR))
  end
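
The environment handling in pass1M can be tried out in isolation. The sketch below is a simplified stand-in, not the project code: number plays the role of both paramEnv and varEnv, plain strings replace Formal and VarDecl, and lookup is an assumed helper. Numbering starts at 1 as in pass1M, which fits with slot 0 of the parameters being taken by the implicit object.

```sml
datatype Vkind = Vparam of int | Vlocal of int

(* Pair each name with its numbered kind, counting upwards from count. *)
fun number mk count [] = []
  | number mk count (nm :: rest) = (nm, mk count) :: number mk (count + 1) rest

(* The first match wins, so entries nearer the front shadow later ones. *)
fun lookup nm [] = NONE
  | lookup nm ((n, k) :: rest) = if n = nm then SOME k else lookup nm rest

val params = ["x", "y"]
val locals = ["x", "tmp"]

(* Same order as bodyEnv in pass1M: locals are put in front of parameters. *)
val bodyEnv = number Vlocal 1 locals @ number Vparam 1 params
```

Because the locals are prepended, a local named x hides a parameter named x, while y still resolves to the second parameter.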


Canonicalization

• Usually we like to think of expressions as being side-effect free.

• Because of ESEQ, this is clearly not true.

• We need to evaluate expressions in a canonical order to make sure we always get effects in the same order.

• Consider: x.f(new person, new int [3])

• Both arguments translate to side-effecting code. Which effects should happen first? Maybe in this case it doesn't matter, but we should have a fixed evaluation order.


• Consider f(g(5),7)

• Translation into X86 may require the use of specific registers.

• When calls are nested inside calls, if we're not careful the register usage can get mixed up.


General fix part 1

• Always use a new temporary register to name the value of a method call.

• Load this value immediately after the method returns.

• Use only this new register name in subsequent code.

• CALL(NAME "f",[CONST "1"]) becomes

ESEQ([MOVE(TEMP 100, CALL(NAME "f",[CONST "1"]))]
    ,TEMP 100)


General fix part 2

• Translate every expression into a pair.

• The first part of the pair is the statements in the expression; the second part of the pair is a pure expression (with no embedded ESEQ).

2 + f(1) + g(3)

first becomes

2 +
ESEQ([temp100 := f(1)],temp100) +
ESEQ([temp101 := g(3)],temp101)

and then the pair

( [temp100 := f(1), temp101 := g(3)] , 2+temp100+temp101 )


Canonicalization of Statements

• Since statements can have expressions we need to canonicalize statements as well.

MOVE(f(3), g(5))

becomes

[t1 := f(3), t2 := g(5), MOVE(t1,t2)]

The general case is to translate any statement into a list of statements. Statements lifted out of embedded expressions are incorporated into the resulting list of statements.


ML code

fun canonicalE x =
  case x of
    BINOP(m,a,b) =>
      let val (sa,ea) = canonicalE a
          val (sb,eb) = canonicalE b
      in (sa@sb,BINOP(m,ea,eb)) end
  | CALL(f,xs) =>
      let val temp = newTemp()
          val (sf,ef) = canonicalE f
          val (xsStmt,xsL) = canonicalL xs
      in (sf @ xsStmt @ [ MOVE(temp,CALL(ef,xsL)) ],temp) end

(see next slide for canonicalL)


canonicalL

• How do we canonicalize a list of expressions?

and canonicalL [] = ([],[])
  | canonicalL (x :: xs) =
      let val (xStmt,xL) = canonicalE x
          val (xsStmt,xsL) = canonicalL xs
      in (xStmt @ xsStmt, xL :: xsL) end


Statements

and canonicalS x =
  case x of
    MOVE(a,b) =>
      let val (sa,ea) = canonicalE a
          val (sb,eb) = canonicalE b
      in sa @ sb @ [MOVE(ea,eb)] end
  | STMTlist xs =>
      List.concat (map canonicalS xs)

List.concat has type ('a list) list -> 'a list
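
Putting the three functions together gives something that can actually be run. The version below is a sketch under assumptions: the datatypes are cut down, operators are plain strings, newTemp is a simple counter starting at 100, and the MEM, ESEQ and catch-all cases are guesses at how the remaining constructors might be handled. Like the slide code, it simply concatenates the lifted statements and does not check whether moving a statement past a neighbouring expression changes that expression's value.

```sml
datatype EXP
  = BINOP of string * EXP * EXP
  | CALL of EXP * EXP list
  | MEM of EXP
  | NAME of string
  | TEMP of int
  | CONST of string
  | ESEQ of STMT list * EXP
and STMT
  = MOVE of EXP * EXP
  | STMTlist of STMT list

(* Assumed fresh-temp supply: TEMP 100, TEMP 101, ... *)
val next = ref 99
fun newTemp () = (next := !next + 1; TEMP (!next))

(* Returns (statements to run first, pure expression). *)
fun canonicalE x =
  case x of
    BINOP (m, a, b) =>
      let val (sa, ea) = canonicalE a
          val (sb, eb) = canonicalE b
      in (sa @ sb, BINOP (m, ea, eb)) end
  | CALL (f, xs) =>
      let val temp = newTemp ()
          val (sf, ef) = canonicalE f
          val (xsStmt, xsL) = canonicalL xs
      in (sf @ xsStmt @ [MOVE (temp, CALL (ef, xsL))], temp) end
  | MEM a =>
      let val (sa, ea) = canonicalE a in (sa, MEM ea) end
  | ESEQ (ss, e) =>                      (* lift the embedded statements *)
      let val (se, ee) = canonicalE e
      in (List.concat (map canonicalS ss) @ se, ee) end
  | _ => ([], x)                         (* NAME, TEMP, CONST: already pure *)

and canonicalL [] = ([], [])
  | canonicalL (x :: xs) =
      let val (xStmt, xL) = canonicalE x
          val (xsStmt, xsL) = canonicalL xs
      in (xStmt @ xsStmt, xL :: xsL) end

and canonicalS x =
  case x of
    MOVE (a, b) =>
      let val (sa, ea) = canonicalE a
          val (sb, eb) = canonicalE b
      in sa @ sb @ [MOVE (ea, eb)] end
  | STMTlist xs => List.concat (map canonicalS xs)

fun call (f, n) = CALL (NAME f, [CONST n])

(* 2 + f(1) + g(3) *)
val example1 = canonicalE (BINOP ("ADD", BINOP ("ADD", CONST "2", call ("f", "1")),
                                  call ("g", "3")))
(* MOVE(f(3), g(5)) *)
val example2 = canonicalS (MOVE (call ("f", "3"), call ("g", "5")))
```

example1 is the pair of two MOVEs into TEMP 100 and TEMP 101 together with the pure sum over those temps; example2 is the three-statement list from the statement example, using TEMP 102 and TEMP 103.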


Fair Warning

• Project #2, to be assigned on May 3rd, includes, in part, the completion of canonicalization.

• It will also include the finalization of offsets.

• And it will include some simple optimization.


Run-time issues

• So far, in IR1, we have glossed over all the important run-time issues of an OO language.
  – thanks to Jenke Li for these notes

• Classes and objects
  – storage allocation
  – static class variables
  – dynamic class variables

• Method Invocations
  – static methods
  – static binding methods
  – dynamic binding methods
  – mini Java's method invocation

• Others
  – local variables and parameters
  – non-local (class) variables
  – Activation record size
  – this pointer


Storage for Class objects

• Observations
  – Static class variables are established per class. They should be allocated once in a single static place.
  – Dynamic class variables are cloned once for every new object. They are allocated space inside the object.

• General Strategies
  – A class descriptor for each class
    » pointer to parent class
    » pointers to (local) methods
    » storage for static variables
  – An object record for each class
    » pointer to class descriptor
    » storage for local class (dynamic) variables
    » storage for inherited variables


Object Record Layout

• Objects contain space not only for the variables that belong to this class, but also for those of all ancestor classes.

• How should we lay out variables so that the offsets of all of them can be determined statically by the compiler?

• For single inheritance, we can use the prefixing method.


Prefix Method

• When a class B extends a class A
  – those variables of B that are inherited from A are laid out in a record implementing B, in the same order they appear in the record implementing A.

• The compiler can assign a fixed offset for every variable in the object record.

• The offset will be the same for a class and for all its subclasses.

• Compiled methods can access variables by their offset, and not their name.


An Example

class A { int i=1, j=2; }
class B extends A { int m=3, n=4; }
class C extends A { int k=5; }
class D extends C { int l=6; }

class Test { A a = new A; B b = new B; C c = new C; D d = new D; … }

A's rec   B's rec   C's rec   D's rec
i         i         i         i
j         j         j         j
          m         k         k
          n                   l


Deciding Object size

• To decide an object's size
  – inherited class information must be known

class A { int i=1, j=2; }               2*wdSize
class B extends A { int m=3, n=4; }     A's size + 2*wdSize
class C extends A { int k=5; }          A's size + 1*wdSize
class D extends C { int l=6; }          C's size + 1*wdSize

class Test {
  A a = new A;        // allocate 2*wdSize
  B b = new B;        // allocate 4*wdSize
  C c = new C;        // allocate 3*wdSize
  D d = new D; … }    // allocate 4*wdSize
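
The size rule (parent's size plus one word per locally declared variable) is easy to check mechanically. The sketch below assumes a made-up representation of class declarations as (name, parent, field names) triples and a hypothetical objSize function; it is not the project's symbol table.

```sml
val wdSize = 4

(* Assumed representation: (class name, optional parent, declared fields). *)
val decls =
  [ ("A", NONE,     ["i", "j"])
  , ("B", SOME "A", ["m", "n"])
  , ("C", SOME "A", ["k"])
  , ("D", SOME "C", ["l"]) ]

fun find name =
  case List.find (fn (n, _, _) => n = name) decls of
    SOME d => d
  | NONE   => raise Fail ("unknown class " ^ name)

(* size = size of the parent (0 if none) + one word per local field *)
fun objSize name =
  let val (_, parent, fields) = find name
      val inherited = case parent of NONE => 0 | SOME p => objSize p
  in inherited + length fields * wdSize end

val sizes = map objSize ["A", "B", "C", "D"]
```

With wdSize = 4 this gives 8, 16, 12 and 16 bytes, matching the 2, 4, 3 and 4 words in the example. Because the function looks parents up by name, it does not care about declaration order, but it would loop forever on a cyclic inheritance chain.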


What if class declarations appear out of order?

class D extends C { int l = 6; }

class C extends A {int k = 5; }

class A { int i=1, j=2; }

class B extends A {int m =3, n = 4; }

Solution

Perform a topological sort on class decls based upon inheritance relationship, then collect the size information.
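
One simple way to do the sort, shown as a sketch: repeatedly emit every class whose parent is absent or has already been emitted. The (name, parent) pair representation and the name topoSort are assumptions for the example; the approach is quadratic, which is fine for the handful of classes in a mini-Java program.

```sml
type classDecl = string * string option   (* (class name, parent) *)

fun topoSort (decls : classDecl list) =
  let (* a class is ready once its parent (if any) has been emitted *)
      fun ready emitted (_, NONE)   = true
        | ready emitted (_, SOME p) = List.exists (fn n => n = p) emitted
      (* emitted is kept in reverse order and flipped at the end *)
      fun loop emitted [] = rev emitted
        | loop emitted pending =
            let val (now, later) = List.partition (ready emitted) pending
                val names = map (fn (n, _) => n) now
            in if null now
               then raise Fail "cyclic inheritance or missing parent class"
               else loop (List.revAppend (names, emitted)) later
            end
  in loop [] decls end

(* The out-of-order declarations from the slide. *)
val sorted = topoSort [ ("D", SOME "C"), ("C", SOME "A")
                      , ("A", NONE),     ("B", SOME "A") ]
```

For the declarations above the result is A, C, B, D: every class appears after its parent, so sizes can be collected in a single left-to-right pass.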


Deciding Class Variable Offsets

• Once object sizes are known (computed), variable offsets can be computed easily.

• Rule.
  – The offset of a subclass's first instance variable is the parent class's object size.

• Example

class A { int i=1               // 0
        , j=2;                  // 1*wdSize
        }
class B extends A { int m=3     // 2*wdSize
                  , n=4;        // 3*wdSize
                  }
class C extends A { int k=5; }  // 2*wdSize
class D extends C { int l=6; }  // 3*wdSize
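
The offsets in the example fall out of the prefix layout directly: a class's record is its parent's record followed by its own variables, and a variable's offset is its position times wdSize. A minimal sketch, assuming the same made-up (name, parent, fields) triples as a stand-in for the symbol table; layout and offset are hypothetical names.

```sml
val wdSize = 4

val decls =
  [ ("A", NONE,     ["i", "j"])
  , ("B", SOME "A", ["m", "n"])
  , ("C", SOME "A", ["k"])
  , ("D", SOME "C", ["l"]) ]

(* Prefix layout: inherited variables first, in the parent's order. *)
fun layout name =
  case List.find (fn (n, _, _) => n = name) decls of
    NONE => raise Fail ("unknown class " ^ name)
  | SOME (_, parent, fields) =>
      (case parent of NONE => [] | SOME p => layout p) @ fields

(* Offset of a variable = its index in the layout * wdSize. *)
fun offset (cls, field) =
  let fun go _ [] = raise Fail ("no variable " ^ field ^ " in class " ^ cls)
        | go i (f :: fs) = if f = field then i * wdSize else go (i + 1) fs
  in go 0 (layout cls) end

val layoutD = layout "D"          (* i, j, k, l *)
val offL    = offset ("D", "l")   (* 3 * wdSize *)
```

Note that i and j get the same offsets in A, B, C and D, which is exactly what lets compiled methods of A work unchanged on objects of its subclasses.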


Class Variable Initialization

• Class variables must always be initialized
  – either by the compiler
  – or by the user

• Initialization happens at object creation time.

• User-provided initialization code is in the class declaration.

• Solution: collect initialization info while processing class decls. Propagate the initialization expressions downwards. Use this info when generating code for new expressions.


Example

                                //  offset      initial value
class A { int i=1               //  0           1
        , j=2;                  //  1*wdSize    2
        }

class B extends A { int m=3     //  2*wdSize    3
                  , n=4;        //  3*wdSize    4
                  }

class C extends A { int k=5;    //  2*wdSize    5
                  }

class D extends C { int l=6;    //  3*wdSize    6
                  }


New Objects

new A

• Object size
• Variable offsets
• Variable initialization

• Canonicalize the initialization statements (stored in the symbol table) to get a list of statements and expressions: (initS,initExps)

• ESEQ(initS @
       [t1 := #vars * wdSize,
        t2 := malloc(t1),
        t3 := 0,
        t2[t3] := (get 0 initExps),
        t3 := t3 + wdSize,
        t2[t3] := (get 1 initExps),
        . . .],
       t2)
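
The same idea written as a function that builds the ESEQ. This is a sketch with assumptions: a cut-down IR, string operator names, a counter-based newTemp, and a library function called "malloc". It also takes a small liberty with the pseudo-code above: instead of a running offset in t3 it stores each initializer at a constant offset from the object pointer, which produces the same stores.

```sml
val wdSize = 4

datatype EXP
  = BINOP of string * EXP * EXP
  | CALL of EXP * EXP list
  | MEM of EXP
  | NAME of string
  | TEMP of int
  | CONST of string
  | ESEQ of STMT list * EXP
and STMT = MOVE of EXP * EXP

(* Assumed fresh-temp supply. *)
val next = ref 0
fun newTemp () = (next := !next + 1; TEMP (!next))
fun const n = CONST (Int.toString n)

(* initS, initExps: the canonicalized initializers, one expression per variable. *)
fun newObject (initS, initExps) =
  let val obj  = newTemp ()                      (* will hold the object address *)
      val size = length initExps * wdSize        (* #vars * wdSize *)
      fun store (i, e) =                         (* obj[i * wdSize] := e *)
        MOVE (MEM (BINOP ("ADD", obj, const (i * wdSize))), e)
      fun stores _ [] = []
        | stores i (e :: es) = store (i, e) :: stores (i + 1) es
  in ESEQ (initS
           @ [MOVE (obj, CALL (NAME "malloc", [const size]))]
           @ stores 0 initExps
          , obj)
  end

(* new A, where A has i = 1 and j = 2 *)
val example = newObject ([], [CONST "1", CONST "2"])
```

As in the pseudo-code, the lifted initializer statements run before the allocation, and the value of the whole expression is the temp holding the new object's address.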


Accessing an Object's instance variables

• O.x
• Obtain the object's address by translating O
• Obtain the variable's offset
• Add the two together.

• Recall that any value, parameter, etc. which is an object can be treated as an address.


Next time

• Next time we will make more concrete assumptions about translating mini-Java

• We will begin to define a “semantics” for IR1 by making an interpreter for it.