CSE 131 – Compiler Construction 04/08/2013 04/05/2013...

Post on 23-Sep-2020

2 views 0 download

Transcript of CSE 131 – Compiler Construction 04/08/2013 04/05/2013...

CSE 131 – Compiler Construction

Discussion 1: Getting Started04/05/201304/08/2013

INTRODUCTION

● Contact Info:See Lab Hours link for lab hours and TAs/Tutors scheduled times in B230

● Availability: We are available at our designated lab hour times, or by appointment. Feel free to e-mail us.

OVERVIEW

● Purpose of our discussions

● Starter Code

● STOs and Types

● Error Reporting

● Functions

WHAT YOU SHOULD ALREADY KNOW!

● Object Oriented Design

● Inheritance

● Polymorphism

● Some C

● Some assembly (for project 2)

PHASES OF COMPILATION

● Lexical Analysis (Lexer.java)Parsing for tokens – not a major point of this course

● Syntax & Semantic Analysis (rc.cup/MyParser.java)Project 1 will focus on the Semantics. In other words, does some syntactically correct piece of code make sense. (If it compiles, ship it)

● Code Generation (Project 2)If it runs, ship it

STARTER CODE

● The Starter Code is located in: /home/solaris/ieng9/cs131s/public/starterCode/

● Look through the files and get familiar with each of them.

● Also the files GETTING_STARTED and CUP_Overview provided with the Starter Code are helpful.

IMPORTANT FILES

● rc.cupContains the Parser’s rules and actions (defines the grammar)

Example:

Designator2 ::= Designator2:_1 T_DOT T_ID:_3 {: RESULT = ((MyParser) parser).DoDesignator2_Dot (_1, _3); :}

| Designator2:_1 T_LBRACKET ExprList T_RBRACKET {: RESULT = ((MyParser) parser).DoDesignator2_Array (_1); :}

;

Unfamiliar Syntax

Designator 2 :: =| ..| Designator3:_1 {:RESULT = _1; :};

Designator3 ::= ...| ..| T_INT_LITERAL:_1 {: RESULT = new ConstSTO(_1) :}

Action Codes

RESULT is the default name of the left-hand side value of the production rule.

_1 and _2 etc. are labels/names to refer to the values of the symbols they are associated with.

IMPORTANT FILES

● MyParser.javaContains methods for semantic analysis.

Example: STO DoDesignator3_ID (String strID) {

STO sto; if ((sto = m_symtab.access (strID)) == null){

m_nNumErrors++;m_errors.print (Formatter.toString(ErrorMsg.undeclared_id, strID));sto = new ErrorSTO (strID);

}return (sto);

}

IMPORTANT FILES

● SymbolTable.javaContains the functions that support Scopes.insert(STO), access(String), accessGlobal(String), accessLocal(String), openScope(), closeScope(), etc.

● This file contains one of the issues that will be resolved in Phase 0.

Parse Tree Example

Code:function: void main() {

auto x = 2;}

Program

OptGlobalDecls

GlobalDecls

VarDecl

OpStatic T_AUTO T_ID T_ASSIGN Expr T_SEMI

T_STATIC Expr0

Expr1 ... Expr8

Designator

Designator1 ... Designator3

T_INT_LITERAL

STO HIERARCHY

STO

FuncSTO TypedefSTO

ExprSTO ErrorSTOVarSTO ConstSTO

● You can change this around!● Look at the methods in each and how they are overloaded

(i.e., isVar(), isConst())

STO Hierarchy

● All STOs have internal fields to indicate if the STO is: modifiable and/or addressable. Whenever a new STO is returned, these fields should be set appropriately.

● Additionally, ConstSTO includes a value field, since only constants will be folded at compile time (more on this coming soon).

TYPES

Type ::= SubType OptModifierList OptArrayDef | T_FUNCPTR T_COLON ReturnType T_LPAREN OptParamList:_3 T_RPAREN ;

SubType ::= QualIdent | BasicType ; BasicType ::= T_INT | T_FLOAT | T_BOOL | T_CHAR ;

TYPES

● Need to create objects for:○ Basic Types○ Array Types○ Struct Types (only done in structdef), ○ Pointer Types○ Function Pointer Types (will behave differently than normal pointers)

● How can these objects be organized to make our lives easier?

● What methods and fields should we provide within each?

ONE POSSIBLE TYPE HIERARCHY

Type

BasicType

ArrayType StructType PtrGrpTypeNumericType

BoolType

IntType FloatType

CompositeType

FuncPtrTypePointerType

VoidType

NullPtrType

WHAT METHODS ARE USEFUL

● Look at how the STO.java and *STO.java files are written.● Consider making methods like:

○ isNumeric(), ○ isFloat(), ○ isInt(), ○ isChar(), ○ isArray(), ○ etc...

available in your Type Hierarchy.● Alternatively, use Java’s instanceof operator

e.g. obj instanceof NumericType

WHAT ELSE IS USEFUL?

● All Types would benefit from methods like:○ isAssignable(Type t) – coercible type (ie, int → float)○ isEquivalent(Type t) – same type (e.g. this.getClass()

== t.getClass())

● Some types will need to store more information:○ ArrayType may need to store dimensions○ StructType may need to store a Vector of fields (or

better yet, an entire Scope)○ You will need the size of these types also, for sizeof

SETTING TYPES

● Must ensure that STO’s all have some Type field within them and that this Type field is set when the type becomes known.

● What changes need to be made to the CUP and MyParser files?

EXAMPLE OF SETTING TYPE

● CUP rule currently states:

VarDecl ::= OptStatic UndecoratedType IdentListWOptInit:_3 T_SEMI{:

((MyParser) parser).DoVarDecl (_3);:}

● We now want to incorporate the Type, so we pass the Type to the MyParser method as well.

EXAMPLE OF SETTING TYPE

● Now, in MyParser.java, want to add Type: void DoVarDecl (Vector lstIDs, Type t) { for (int i = 0; i < lstIDs.size (); i++) { String id = (String) lstIDs.elementAt (i); if (m_symtab.accessLocal (id) != null) { m_nNumErrors++; m_errors.print (Formatter.toString(ErrorMsg.redeclared_id, id)); } VarSTO sto = new VarSTO (id);

// Add code here to set sto’s type field!!! m_symtab.insert (sto); } }

TYPE CHECKING

● Now that we have associated types with our STOs, how do we check them?

TYPE CHECKING EXAMPLE

● Consider the following code:int x; // VarSTO x gets created and Type set to int, insert into symtabfloat y; // VarSTO y gets created and Type set to float, insert into symtabfunction : void main() { x = 5; // OK y = x + 12.5; // OK x = y; // Error}

● Let’s focus on the statement y = x + 12.5

TYPE CHECKING EXAMPLE

y = x + 12.5;Currently CUP has:

Expr7 ::= Expr7:_1 AddOp:_2 Expr8:_3 {: RESULT = _1; :}

● What needs to be done?Based on AddOp (+, -), we need to check the types of _1 and _3Based on the types of _1 and _3, we need to create a new STO (an ExprSTO) to return as the result.

TYPE CHECKING EXAMPLE

● Getting a Type out of the STO

● You have some STO variable “a” and you want to check if it is Equivalent to STO variable “b”:

a.getType().isEquivalent(b.getType())

The isEquivalent method should return a boolean

ORGANIZATION IS YOUR FRIEND!

● Don’t just try to throw code into the files.● Think about the current problem at hand.● Also think about upcoming tasks.● Try to make your code as general as possible.● Remember that in Project 2 we will be generating assembly

code, so the more forethought that occurs in Project 1, the easier Project 2 will be!

AN IDEA

● Define a new function inside MyParser to check expressions● Define Operator classes to assist type checking

In MyParser:STO DoBinaryExpr(STO a, Operator o, STO b) {

STO sto = o.checkOperands(a, b); if (sto.isError()) {

… } return sto;

}

OPERATOR HIERARCHY

● Arguably the most important class hierarchy for this projectOne possible way to set it up:

Operator

BinaryOp UnaryOp

BitwiseOp

AddOp, MinusOp,etc.

ArithmeticOp

BooleanOp

AndOp, OrOp

AndBwOp, OrBwOp,

XorOp

NotOp,IncOp,DecOp

ComparisonOp

EqualToOp, NEqualToOp,

LessThanOp, etc.

AN IDEA

● In each Operator class, we can do something like this:STO checkOperands(STO a, STO b) { if (!a.getType().isNumeric() || !b.getType().isNumeric()) {

// Errorreturn (new ErrorSTO(…));

} else if (a.getType().isInteger() && b.getType().isInteger()) {

// return ExprSTO of int type } else

// return ExprSTO of float type }

ERROR REPORTING

● Now that we can check types, we will find errors!

● Once we find them, we want to print them out.

● Use only the provided error messages in ErrorMsg.java – these correspond nicely with each check.

ERROR REPORTING

● Only report the FIRST error found in each statement.

Once an error is found in a statement, suppress all further errors that may occur in that statement.

● If you want to see the line number where the error occurred (for debugging purposes), do a “make debug”

ERRORSTO

● ErrorSTO is made such that it will appear to be any other STO. Once you find an error, you want to make your result be an ErrorSTO so the error does not propagate throughout the program.

FUNCTIONS!

● First, some fundamental points:

○ We are writing a static translator, not an interpreter.○ Once we finish the function declaration, including the

body, we’re done. We don’t need to remember the code in the body, since we’ll just be spitting out assembly code (Project 2).

○ A function call will basically boil down to an assembly “call foo” type instruction, once all type checking is complete.

FUNCTIONS

● For Project 1, we will need to:○ Check the function call against the function declaration

to ensure argument symmetry.

○ Check the body of the function, type checking the same way we check statements in main.

○ Check the return logic of the function, including return by reference.

○ Allow function overloading (Extra Credit)

FuncSTO

● In order to do most of these checks, we will need to store some information about the function. FuncSTO is designed for this!

● Things that should be stored:1. The Return Type

a. This is separate from the full type of the function itself, which is actually a funcptr type.

2. All parameter information:a. Total number of parametersb. Type of each parameter, including whether pass-by-

reference or notc. Flag for return-by-reference

FUNCTION DEFINITION

FuncDef ::= T_FUNCTION T_COLON ReturnType OptRef T_ID:_2

{: ((MyParser) parser).DoFuncDecl_1(_2); :}

T_LPAREN OptParamList:_3 T_RPAREN

{: ((MyParser) parser).DoFormalParams(_3); :}

T_LBRACE OptStmtList T_RBRACE

{: ((MyParser) parser).DoFuncDecl_2(); :} ;

FUNCTION DEFINITION

void DoFuncDecl_1 (String id) { if (m_symtab.accessLocal (id) != null) { m_nNumErrors++; m_errors.print (Formatter.toString(ErrorMsg.redeclared_id, id)); } FuncSTO sto = new FuncSTO (id); m_symtab.insert (sto); // Inserted into current scope m_symtab.openScope ();// New scope opened m_symtab.setFunc (sto); // Current Func we’re in is set}

FUNCTION DEFINITION

void DoFuncDecl_2 () { m_symtab.closeScope (); // Close scope (pops top scope off) m_symtab.setFunc (null); // Says we’re back in outer scope }

FUNCTION DEFINITION

function : bool foo (float a, float b, float &c) { bool x; // local variable x = a > b; x = (a + c) <= 2; return x;}

● In this example, we want to create a FuncSTO with the name “foo”, set its return type to boolean, set its param count to 3, and remember the parameters are: value float, value float, and reference float).

● Furthermore, we want to insert the VarSTO for a, b, and c into the SymTable so they are available for the body of the procedure.

FUNCTION CALL

● Now that we have a FuncSTO in the SymTable, and type-checked the function declaration, we are ready to call it.

foo(1, 2, 3.3);

● Given the above call, there would be an error since ‘3.3’ is not addressable and cannot be sent to a reference parameter.

FUNCTION CALL

● So, when we call a function, we want to get the FuncSTO from the SymTable and check its parameters with the arguments.

● Consider making a Vector of some new object (you create it how you want) that holds the parameters. When you call the function, compare the two vectors.

● Important design choices!!!● Remember function overloading for extra credit – how to

uniquely encode the name with the parameters and know which one to get from the Symbol Table.

FUNCTION RETURN TYPES

● The type of the return expression needs to be checked against the type declared as the function’s return type

● When a function call is used in an expression, it behaves like any other ExprSTO. You need to use the function’s return type for semantic checking

● Functions are also special in that they are the only object that can have a “void” type

○ The “void” type is not equivalent or assignable to anything (including itself), so any use of an object of “void” type in an expression should result in an error

FUNCTION RETURN TYPES

● Return-by-value○ At a return site, the return expression just has to have

the correct type○ At the call site, the function call expression results in an

rval

● Return-by-reference○ At a return site, the return expression has to have the

correct type and must be a modifiable lval.○ At the call site, the function call expression results in an

modifiable lval.

WHAT TO DO NEXT!

1. Find a partner and setup a UnixGroup (see broadcast message for details)!

2. Use tools like CVS, Subversion, and Eclipse to make your life easier (see “Useful Links” on website).

3. Understand the Starter Code. Look through all the files!4. Complete Phase 0 & 1.5. Come to lab hours and ask questions (We’re here for you!)6. Check the Piazza discussion board – FAQ & Often people

have the same question as you and it may already have been answered.