Post on 21-Dec-2015
Compiler Construction
Semantic Analysis II
Rina Zviel-Girshin and Ohad Shacham
School of Computer Science
Tel-Aviv University
Administration
TA1 is up: LR parsing. Submission deadline 20/12/2009
PA3 is up. Submission deadline 16/12/2009
IC compiler
[Figure: IC program (ic) → Compiler: Lexical Analysis → Syntax Analysis (Parsing) → AST, Symbol Table, etc. → Inter. Rep. (IR) → Code Generation → x86 executable (exe)]
We saw: Scope, Symbol tables
Today: Type checking, Recap
Semantic analysis motivation
Syntax analysis is not enough
int a; a = “hello”;   (assigning wrong type)
int a; b = 1;   (assigning undeclared variable)
int a; int a; a = 1;   (variable redeclaration)
Symbol table
An environment that stores information about identifiers
A data structure that captures scope information
Symbol     Kind     Type          Properties
value      field    int           …
test       method   -> int        private
setValue   method   int -> void   public
Examples of type errors
int a; a = true;   (assigned type doesn’t match declared type)
void foo(int x) { int x; foo(5,7); }   (argument list doesn’t match formal parameters)
1 < true   (relational operator applied to non-int type)
class A {…} class B extends A { void foo() { A a; B b; b = a; } }   (the type of a is not a subtype of the type of b)
Types
Type: set of values computed during program execution
  boolean = {true, false}
  int = {-2^31 .. 2^31-1}
  void = {}
Type safety: type usage adheres to formally defined typing rules
Type judgments
⊢ e : T
Formal notation for type judgments
e is a well-typed expression of type T
  ⊢ 2 : int
  ⊢ 2 * (3 + 4) : int
  ⊢ true : bool
  ⊢ “Hello” : string
Type judgments
E ⊢ e : T
Formal notation for type judgments
In the context E, e is a well-typed expression of type T
  b:bool, x:int ⊢ b : bool
  x:int ⊢ 1 + x < 4 : bool
  foo:int->string, x:int ⊢ foo(x) : string
Type context: set of type bindings id : T (symbol table)
Typing rules for expressions
AST leaves (rules without premises):
  E ⊢ true : bool            E ⊢ false : bool
  E ⊢ int-literal : int      E ⊢ string-literal : string
  E ⊢ null : null            E ⊢ new T() : T

Rule [+]:
  E ⊢ e1 : int    E ⊢ e2 : int
  ----------------------------
  E ⊢ e1+e2 : int
Some IC expression rules 1
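  E ⊢ true : bool            E ⊢ false : bool
  E ⊢ int-literal : int      E ⊢ string-literal : string

  E ⊢ e1 : int    E ⊢ e2 : int
  ----------------------------    op ∈ { +, -, /, *, % }
  E ⊢ e1 op e2 : int

  E ⊢ e1 : int    E ⊢ e2 : int
  ----------------------------    rop ∈ { <=, <, >, >= }
  E ⊢ e1 rop e2 : bool

  E ⊢ e1 : T    E ⊢ e2 : T
  ------------------------    rop ∈ { ==, != }
  E ⊢ e1 rop e2 : bool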
Some IC expression rules 2
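  E ⊢ e1 : bool    E ⊢ e2 : bool
  ------------------------------    lop ∈ { &&, || }
  E ⊢ e1 lop e2 : bool

  E ⊢ e1 : int                E ⊢ e1 : bool
  ---------------             ----------------
  E ⊢ - e1 : int              E ⊢ ! e1 : bool

  E ⊢ e1 : T[]                E ⊢ e1 : T[]    E ⊢ e2 : int
  --------------------        ----------------------------
  E ⊢ e1.length : int         E ⊢ e1[e2] : T

  E ⊢ e1 : int
  ----------------------      E ⊢ new T() : T
  E ⊢ new T[e1] : T[]

  E ⊢ e : C    (id : T) ∈ C
  -------------------------
  E ⊢ e.id : T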
Type-checking algorithm
1. Construct types
   1. Add basic types to type table
   2. Traverse AST looking for user-defined types (classes, methods, arrays) and store in table
   3. Bind all symbols to types
2. Traverse AST bottom-up (using visitor)
   1. For each AST node find corresponding rule (there is only one for each kind of node)
   2. Check if rule holds
      1. Yes: assign type to node according to consequent
      2. No: report error
Algorithm example
45 > 32 && !false
AST with the type assigned to each node:
  …
  BinopExpr (op=AND) : bool
    BinopExpr (op=GT) : bool
      intLiteral (val=45) : int
      intLiteral (val=32) : int
    UnopExpr (op=NEG) : bool
      boolLiteral (val=false) : bool
Rules used:
  E ⊢ false : bool            E ⊢ int-literal : int

  E ⊢ e1 : int    E ⊢ e2 : int
  ----------------------------    rop ∈ { <=, <, >, >= }
  E ⊢ e1 rop e2 : bool

  E ⊢ e1 : bool    E ⊢ e2 : bool
  ------------------------------    lop ∈ { &&, || }
  E ⊢ e1 lop e2 : bool

  E ⊢ e1 : bool
  ---------------
  E ⊢ !e1 : bool
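The bottom-up traversal in this example can be sketched in Java. This is a minimal sketch, not the course's implementation: Ty, TypeError, Expr, IntLit, BoolLit, Not, Binop and TypeCheckDemo are hypothetical names, and a recursive check() method stands in for the visitor mentioned in the algorithm.

```java
// Hypothetical, simplified AST: all class names here are illustrative only.
enum Ty { INT, BOOL }

class TypeError extends RuntimeException {
    TypeError(String msg) { super(msg); }
}

abstract class Expr {
    // Returns the type of this node, or throws TypeError if no rule applies.
    abstract Ty check();
}

class IntLit extends Expr {
    final int val;
    IntLit(int val) { this.val = val; }
    Ty check() { return Ty.INT; }    // E |- int-literal : int
}

class BoolLit extends Expr {
    final boolean val;
    BoolLit(boolean val) { this.val = val; }
    Ty check() { return Ty.BOOL; }   // E |- true : bool, E |- false : bool
}

class Not extends Expr {
    final Expr operand;
    Not(Expr operand) { this.operand = operand; }
    Ty check() {
        // E |- e1 : bool  implies  E |- !e1 : bool
        if (operand.check() != Ty.BOOL) throw new TypeError("! expects bool");
        return Ty.BOOL;
    }
}

class Binop extends Expr {
    final String op;
    final Expr left, right;
    Binop(String op, Expr left, Expr right) {
        this.op = op; this.left = left; this.right = right;
    }
    Ty check() {
        Ty l = left.check();   // children are typed first (bottom-up)
        Ty r = right.check();
        switch (op) {
            case "<": case "<=": case ">": case ">=":
                if (l != Ty.INT || r != Ty.INT) throw new TypeError(op + " expects int operands");
                return Ty.BOOL;
            case "&&": case "||":
                if (l != Ty.BOOL || r != Ty.BOOL) throw new TypeError(op + " expects bool operands");
                return Ty.BOOL;
            default:
                throw new TypeError("no rule for operator " + op);
        }
    }
}

class TypeCheckDemo {
    public static void main(String[] args) {
        // 45 > 32 && !false
        Expr e = new Binop("&&",
                new Binop(">", new IntLit(45), new IntLit(32)),
                new Not(new BoolLit(false)));
        System.out.println(e.check());   // prints BOOL
    }
}
```

An ill-typed expression such as 1 < true makes check() throw TypeError, which corresponds to the "No: report error" branch of the algorithm.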
Statement rules
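Statements have type void
Judgments of the form E ⊢ S: in environment E, S is well-typed

  E ⊢ e : bool    E ⊢ S           E ⊢ e : bool    E ⊢ S
  ---------------------           ---------------------
  E ⊢ while (e) S                 E ⊢ if (e) S

  E ⊢ e : bool    E ⊢ S1    E ⊢ S2
  --------------------------------
  E ⊢ if (e) S1 else S2

  E ⊢ break            E ⊢ continue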
Checking return statements
Special entry {ret:Tr} represents return value
  Add to symbol table when entering method
  Look up the entry when a return statement is reached

  ret:void ∈ E
  ------------
  E ⊢ return;

  ret:T’ ∈ E    E ⊢ e : T    T ≤ T’
  ---------------------------------    (T subtype of T’)
  E ⊢ return e;
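The two return rules can be sketched as a lookup of the special ret entry. This is a sketch under stated assumptions: ReturnChecker and checkReturn are hypothetical names, types are represented as plain strings, the environment is a plain map, and the subtype test is passed in by the caller.

```java
import java.util.Map;
import java.util.function.BiPredicate;

// Hypothetical helper: not part of the IC compiler skeleton.
class ReturnChecker {
    static final String RET = "ret";   // name of the special entry

    // exprType is null for a bare "return;".
    // subtype.test(t, declared) must answer whether t <= declared.
    static boolean checkReturn(Map<String, String> env, String exprType,
                               BiPredicate<String, String> subtype) {
        String declared = env.get(RET);
        if (declared == null) return false;            // no ret entry: not inside a method
        if (exprType == null) return declared.equals("void");   // rule for "return;"
        return !declared.equals("void") && subtype.test(exprType, declared);  // rule for "return e;"
    }
}
```

Passing String::equals as the subtype test gives exact type matching; a real checker would pass the subtyping relation instead.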
Subtyping
Inheritance induces subtyping relation
Type hierarchy is a tree
Subtyping rules:

  A extends B {…}
  ---------------          A ≤ A
  A ≤ B

  A ≤ B    B ≤ C
  --------------           null ≤ A
  A ≤ C

Subtyping does not extend to array types: if A is a subtype of B, A[] is still not a subtype of B[]
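One way to decide A ≤ B under these rules is to walk up the superclass chain. The following is a minimal sketch: Hierarchy, declare, isSubtype and isArraySubtype are hypothetical names, class types are identified by name, and the hierarchy is assumed to be acyclic.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: maps each class name to its direct superclass (null if none).
class Hierarchy {
    static final String NULL_TYPE = "null";
    private final Map<String, String> superOf = new HashMap<>();

    void declare(String name, String superName) { superOf.put(name, superName); }

    // Returns true if a <= b. Assumes the hierarchy has no cycles.
    boolean isSubtype(String a, String b) {
        if (a.equals(NULL_TYPE)) return superOf.containsKey(b);   // null <= A for every class A
        for (String c = a; c != null; c = superOf.get(c)) {
            if (c.equals(b)) return true;   // covers A <= A, "extends" and transitivity
        }
        return false;
    }

    // Array types are related only when their element types are identical.
    boolean isArraySubtype(String elemA, String elemB) { return elemA.equals(elemB); }
}
```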
Type checking with subtyping
S ≤ T: S may be used whenever T is expected
An expression e of type S also has type T

  E ⊢ e : S    S ≤ T
  ------------------
  E ⊢ e : T
Semantic analysis flow
Parsing and AST construction
  Combine library AST with IC program AST
  Construct and initialize global type table
Phase 1: Symbol table construction
  Construct class hierarchy and check that hierarchy is a tree
  Construct remaining symbol table hierarchy
  Assign enclosing-scope for each AST node
Phase 2: Scope checking
  Resolve names
  Check scope rules using symbol table
Phase 3: Type checking
  Assign type for each AST node
Phase 4: Remaining semantic checks
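The Phase 1 check that the class hierarchy is well formed can be sketched as follows. This is a sketch under assumptions: HierarchyCheck and isWellFormed are hypothetical names, and the input is a map from class name to superclass name (null for classes without a superclass). Since classes may have no common root, what is actually verified is that every superclass is declared and that no class is its own ancestor.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: not part of the IC compiler skeleton.
class HierarchyCheck {
    // superOf maps each declared class to its superclass name, or null if it has none.
    static boolean isWellFormed(Map<String, String> superOf) {
        for (String start : superOf.keySet()) {
            Set<String> seen = new HashSet<>();
            for (String c = start; c != null; c = superOf.get(c)) {
                if (!superOf.containsKey(c)) return false;   // superclass was never declared
                if (!seen.add(c)) return false;              // reached a class twice: cycle
            }
        }
        return true;
    }
}
```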
Class hierarchy for types
  abstract class Type {...}
  class IntType extends Type {...}
  class BoolType extends Type {...}
  class ArrayType extends Type {
    Type elemType;
  }
  class MethodType extends Type {
    Type[] paramTypes;
    Type returnType;
    ...
  }
  class ClassType extends Type {
    ICClass classAST;
    ...
  }
  ...
Type comparison
Use a unique object for each distinct type
  Resolve each type expression to the same object
  Use reference equality for comparison (==)
Type table implementation
  import java.util.HashMap;
  import java.util.Map;

  class TypeTable {
    // Maps element types to array types.
    // The maps are static because the static methods below use them.
    private static Map<Type,ArrayType> uniqueArrayTypes = new HashMap<>();
    private static Map<String,ClassType> uniqueClassTypes = new HashMap<>();

    public static Type boolType = new BoolType();
    public static Type intType = new IntType();
    ...

    // Returns unique array type object
    public static ArrayType arrayType(Type elemType) {
      if (uniqueArrayTypes.containsKey(elemType)) {
        // array type object already created, return it
        return uniqueArrayTypes.get(elemType);
      } else {
        // object doesn’t exist, create and return it
        ArrayType arrt = new ArrayType(elemType);
        uniqueArrayTypes.put(elemType, arrt);  // store the new object, not the class name
        return arrt;
      }
    }
    ...
  }
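The effect of keeping one object per distinct type can be shown with a small self-contained sketch. The names SketchType, SketchIntType, SketchArrayType and SketchTypeTable are hypothetical stand-ins for the classes above, and Map.computeIfAbsent is used here as a shorter equivalent of the containsKey/get/put sequence.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for Type, IntType, ArrayType and TypeTable.
abstract class SketchType { }

class SketchIntType extends SketchType { }

class SketchArrayType extends SketchType {
    final SketchType elemType;
    SketchArrayType(SketchType elemType) { this.elemType = elemType; }
}

class SketchTypeTable {
    static final SketchType INT = new SketchIntType();

    // One array type object per element type object.
    private static final Map<SketchType, SketchArrayType> arrays = new HashMap<>();

    static SketchArrayType arrayType(SketchType elemType) {
        // Creates the array type on first request, returns the stored one afterwards.
        return arrays.computeIfAbsent(elemType, SketchArrayType::new);
    }
}
```

With this table, SketchTypeTable.arrayType(SketchTypeTable.INT) returns the same object on every call, so two occurrences of int[] in a program can be compared with ==.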
Semantic analysis flow example
  class A {
    int x;
    int f(int x) { boolean y; ... }
  }
  class B extends A {
    boolean y;
    int t;
  }
  class C {
    A o;
    int z;
  }
Parsing and AST construction
parser.parse() builds the AST for classes A, B and C of the example:
  ASTProgram (file = …)
    classes[0]: ICClass (name = A)
      fields[0]: Field (name = x, type = IntType)
      methods[0]: Method (name = f)
        parameters[0]: Param (name = x, type = IntType)
        body: Declaration (varName = y, initExpr = null, type = BoolType)
    classes[1]: ICClass (name = B, super = A) …
    classes[2]: ICClass (name = C) …
TypeTable: IntType, BoolType, A, B, C, f : int->int, …
Table populated with user-defined types during parsing (or special AST pass)
Defined types and type table
  class TypeTable {
    public static Type boolType = new BoolType();
    public static Type intType = new IntType();
    ...
    public static ArrayType arrayType(Type elemType) {…}
    // "super" is a reserved word in Java, so the parameter is named superName
    public static ClassType classType(String name, String superName, ICClass ast) {…}
    public static MethodType methodType(String name, Type retType, Type[] paramTypes) {…}
  }

  abstract class Type {
    String name;
    boolean subtypeof(Type t) {...}
  }
  class IntType extends Type {...}
  class BoolType extends Type {...}
  class ArrayType extends Type {
    Type elemType;
  }
  class MethodType extends Type {
    Type[] paramTypes;
    Type returnType;
  }
  class ClassType extends Type {
    ICClass classAST;
  }

TypeTable: IntType, BoolType, A, B, C, f : int->int, …
Assigning types by declarations
All type bindings available during parsing time
[Figure: the type field of each AST node (Field x, Param x, Declaration y, Method f) points to an entry in the TypeTable]
TypeTable: IntType, BoolType, ..., ClassType (name = A), ClassType (name = B, super → A), ClassType (name = C), MethodType (name = f, retType, paramTypes)
Symbol tables
Symbol tables for the example AST:
  Global symtab
    A      CLASS
    B      CLASS
    C      CLASS
  A symtab
    x      FIELD     IntType
    f      METHOD    int->int
  B symtab
    t      FIELD     IntType
    y      FIELD     BoolType
  C symtab
    o      CLASS     A
    z      FIELD     IntType
  f symtab
    x      PARAM     IntType
    y      VAR       BoolType
    this   VAR       A
    ret    RET_VAR   IntType

  abstract class SymbolTable {
    private SymbolTable parent;
  }
  class ClassSymbolTable extends SymbolTable {
    Map<String,Symbol> methodEntries;
    Map<String,Symbol> fieldEntries;
  }
  class MethodSymbolTable extends SymbolTable {
    Map<String,Symbol> variableEntries;
  }

  abstract class Symbol {
    String name;
  }
  class VarSymbol extends Symbol {…}
  class LocalVarSymbol extends Symbol {…}
  class ParamSymbol extends Symbol {…}
  ...
Scope nesting in IC
Each scope has its own table with columns Symbol, Kind, Type, Properties:
  Global: names of all classes
  Class: fields and methods
  Method: formals + locals
  Block: variables defined in block

  class GlobalSymbolTable extends SymbolTable {}
  class ClassSymbolTable extends SymbolTable {}
  class MethodSymbolTable extends SymbolTable {}
  class BlockSymbolTable extends SymbolTable {}
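Lookup through nested scopes can be sketched with a single table class that follows parent links. This is a simplified sketch, not the hierarchy above: Scope, define and lookup are hypothetical names, and a symbol is reduced to a name mapped to a type string.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified symbol table: one class for all scope kinds.
class Scope {
    private final Scope parent;                                  // enclosing scope, null for global
    private final Map<String, String> entries = new HashMap<>(); // name -> type

    Scope(Scope parent) { this.parent = parent; }

    // Adds a definition; returns false if the name is already defined in this scope.
    boolean define(String name, String type) {
        return entries.putIfAbsent(name, type) == null;
    }

    // Searches this scope first, then the enclosing scopes; null if undefined.
    String lookup(String name) {
        for (Scope s = this; s != null; s = s.parent) {
            String type = s.entries.get(name);
            if (type != null) return type;
        }
        return null;
    }
}
```

Because the innermost scope is searched first, a formal parameter x in a method scope hides a field x in the enclosing class scope, while names defined only in the class scope remain visible from the method.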
Symbol tables
[Figure: the AST and symbol tables of classes A, B and C, with an added AST node Location (name = x, type = ?)]
this belongs to method scope
ret can be used later for type-checking return statements
Sym. tables phase 1 : construction
[Figure: the AST and symbol tables of classes A, B and C; the node Location (name = x, type = ?) has its enclosingScope link set, while its symbol link is still unknown (?)]
  class TableBuildingVisitor implements Visitor { ... }
Build tables, link each AST node to enclosing table
  abstract class ASTNode {
    SymbolTable enclosingScope;
  }
Sym. tables phase 1 : construct
[Figure: the AST and symbol tables of classes A, B and C; the symbol link of the node Location (name = x, type = ?) is still unknown (?)]
  class TableBuildingVisitor implements Visitor { ... }
During this phase, add symbols from definitions, not uses, e.g., assignment to variable x
Sym. tables phase 2 : resolve
[Figure: the AST and symbol tables of classes A, B and C; the node Location (name = x, type = ?) is linked to its symbol through its enclosingScope]
Resolve each id to a symbol, e.g., in x=5 in f, x is the formal parameter of f
Check scope rules: illegal symbol re-definitions, illegal shadowing, illegal use of undefined symbols, ...
  class SymResolvingVisitor implements Visitor { ... }
Type-check AST
[Figure: the AST of classes A, B and C next to the TypeTable (IntType, BoolType, ...); the node Location (name = x) now has type = IntType]
  class TypeCheckingVisitor implements Visitor { ... }
Use type-rules to infer types for all AST expression nodes
Check type rules for statements
Miscellaneous semantic checks
  class SemanticChecker { ... }
Perform the remaining semantic checks: single main method, break/continue inside loops, etc.
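The break/continue check, for instance, only needs to track how many loops enclose the current statement. The following is a minimal sketch on a hypothetical statement AST: Stmt, BreakStmt, ContinueStmt, WhileStmt, BlockStmt and LoopChecker are illustrative names, not the course's classes.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified statement AST.
abstract class Stmt { }

class BreakStmt extends Stmt { }

class ContinueStmt extends Stmt { }

class WhileStmt extends Stmt {
    final Stmt body;
    WhileStmt(Stmt body) { this.body = body; }
}

class BlockStmt extends Stmt {
    final List<Stmt> stmts;
    BlockStmt(Stmt... stmts) { this.stmts = Arrays.asList(stmts); }
}

class LoopChecker {
    // Counts break/continue statements that are not enclosed by any loop.
    // loopDepth is the number of loops around s; callers start with 0.
    static int countErrors(Stmt s, int loopDepth) {
        if (s instanceof BreakStmt || s instanceof ContinueStmt) {
            return loopDepth == 0 ? 1 : 0;
        }
        if (s instanceof WhileStmt) {
            return countErrors(((WhileStmt) s).body, loopDepth + 1);
        }
        if (s instanceof BlockStmt) {
            int errors = 0;
            for (Stmt child : ((BlockStmt) s).stmts) {
                errors += countErrors(child, loopDepth);
            }
            return errors;
        }
        return 0;   // other statement kinds cannot contain an error in this sketch
    }
}
```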