Topic 3 - Binding Time and Symbol Tables
Dr. William A. Maniatty, Assistant Prof.
Dept. of Computer Science, University at Albany
CSI 511 Programming Languages and Systems Concepts
Fall 2002
Monday Wednesday 2:30-3:50, LI 99
Introduction to Binding
Binding refers to associating an entity with a value, such as:
  Variable name with address
  Result of expression with ephemeral storage
  Constant with its value
  Separately compiled function with address
Design Binding Times
There are extra binding times available to programming language designers.
Language Design Time - Choose fundamental primitives, reserved words, etc.
Compiler/Interpreter Implementation Time - How to internally represent language constructs.
Programming Time - Language users pick the algorithms and data structures.
Object - What does it mean?
The word Object has many meanings in programming languages.
Object Module - A compiled (but not linked) module of a program.
Object (OOP sense) - An instance of a class in Object Oriented Programming.
Object (Programming Language sense) - The entities which are bound to values.
We use the programming language sense for now.
Binding Time Design Issues
Late binding of objects is typical of:
  Interpreters
  Dynamic Type Systems
Care needs to be taken to avoid ambiguity when binding.
  Name Space Collisions
  Polymorphism (Overloading)
Object Attributes
Objects have many attributes:
  Lifetime (Persistence)
  Type
  Scope
  Value/Address
A language should:
  Precisely specify attributes
  Be Orthogonal - Separate Controls
Object Persistence vs. Lifetime
Persistence - Persistent objects outlive the process that created them.
  Examples - Files, databases.
  Memory for nonpersistent objects is called volatile (you lose data if powered down).
Lifetime - When is the storage allocated to an object available?
Events Impacting Object Lifetime
Lifetime has several aspects:
  Creation of objects
  Creation of bindings
  References to variables/subroutines/types/etc.
  (Re)activation and deactivation of bindings
  Destruction of bindings
  Destruction of objects
Allocation and Object Lifetime
How can objects be allocated?
  Statically - Exist during the program's lifetime
  Stack - Used for ephemeral objects
  Heap Objects - Have controlled lifetimes
Deallocation: How is it indicated?
  Explicitly - Destructors/free/delete
  Implicitly - Garbage collection
Initialization - Separate (Constructors)
Static Allocation
Done at compile time
  Literals (and constants) bound to values
  Variables bound to addresses
Compiler notes undefined symbols
  Library functions
  Global Variables and System Constants
The linker (and the loader, if DLLs are used) resolves undefined references.
Stack Based Allocation
Stack layout determined at compile time
  Variables bound to offsets from top of stack.
  Layout called stack frame or activation record
  Compilers use registers
Function parameters and results need consistent treatment across modules
  C/C++ use prototypes
  Eiffel/Java/Oberon use single definition
Parameter Passing Conventions
Actual Parameters - at the call site
Formal Parameters - at the subroutine declaration
Address - a memory location. Data objects containing addresses can be called:
  Pointer - uses an explicit dereferencing operation.
  Reference - uses implicit dereferencing.
Parameter Passing Conventions
Call by value - Copy to the function
Call by reference - Pass reference
Call by address - Pass address to function
Call by result - Pass result back to caller
Call by value result - Copy inputs to the function and copy results to caller.
Parameters can be on stack or in registers.
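
The conventions differ most visibly when the same variable is passed twice. The following is a minimal Python sketch of that case, not a description of how any particular compiler implements them. It assumes a one-element list models a variable's storage cell; the helper names (bump_both, by_value, by_reference, by_value_result) are made up for illustration.

```python
# Illustrative model only: a one-element list stands in for a variable's storage cell.

def bump_both(a, b):
    """Callee: increments both of its formal parameters."""
    a[0] += 1
    b[0] += 1

def by_value(fn, *cells):
    # Callee receives private copies; the caller's cells are never touched.
    fn(*[[c[0]] for c in cells])

def by_reference(fn, *cells):
    # Callee receives the caller's cells themselves.
    fn(*cells)

def by_value_result(fn, *cells):
    # Copy in, run the callee, then copy each result back to the caller.
    temps = [[c[0]] for c in cells]
    fn(*temps)
    for cell, temp in zip(cells, temps):
        cell[0] = temp[0]

def demo():
    results = {}
    for name, convention in (("value", by_value),
                             ("reference", by_reference),
                             ("value-result", by_value_result)):
        x = [10]
        convention(bump_both, x, x)   # the same variable is passed twice (aliasing)
        results[name] = x[0]
    return results

print(demo())  # {'value': 10, 'reference': 12, 'value-result': 11}
```

With aliased arguments, call by reference sees both increments, while call by value result copies back two independent results of 11.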
Call Site Code Generation for Stack Allocation
Call Setup
  Push register values on stack (if caller saves)
  Push parameters on stack (or load into registers)
Call Function
  Push return address on stack
  Go to function's start address
Call Cleanup (if caller saves)
Subroutine Code Generation for Stack Allocation
Prologue - Push registers that will be overwritten on stack (if callee saves)
Body of function
Call Cleanup (if caller saves)
  Copy results (if any)
  Pop parameters off stack.
  Pop registers
  Return
Heap Allocation
Heap provides dynamic memory management.
  Not to be confused with binary heap or binomial heap data structures.
Under the hood, may periodically need to request additional memory from the O/S.
  Large regions are requested at a time (requests are expensive).
Done using a library (e.g. C)
Or as part of the language (C++, Java, Lisp).
Memory Management
Holes can form where memory is freed.
  Coalesce adjacent holes
  Small holes fragment the memory.
Suppose you allocate a smaller chunk: which hole do we take it from?
  First fit - The first hole found that it fits into
  Best fit - The smallest hole it fits into
  Worst fit - The largest hole it fits into
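
The three policies and hole coalescing can be sketched in a few lines of Python. This is an illustration under simplifying assumptions: the free list is a plain list of (start, length) pairs, and the names choose_hole and coalesce are hypothetical.

```python
def choose_hole(holes, size, policy):
    """Return the index of the hole chosen for a request of `size`, or None.

    holes is a list of (start, length) pairs in free-list order.
    """
    candidates = [i for i, (_, length) in enumerate(holes) if length >= size]
    if not candidates:
        return None
    if policy == "first":
        return candidates[0]                                   # first hole that fits
    if policy == "best":
        return min(candidates, key=lambda i: holes[i][1])      # smallest hole that fits
    if policy == "worst":
        return max(candidates, key=lambda i: holes[i][1])      # largest hole
    raise ValueError("unknown policy: " + policy)

def coalesce(holes):
    """Merge holes that are adjacent in memory into single larger holes."""
    merged = []
    for start, length in sorted(holes):
        if merged and merged[-1][0] + merged[-1][1] == start:
            merged[-1] = (merged[-1][0], merged[-1][1] + length)
        else:
            merged.append((start, length))
    return merged

holes = [(0, 50), (100, 20), (200, 80)]
print([choose_hole(holes, 15, p) for p in ("first", "best", "worst")])  # [0, 1, 2]
print(coalesce([(30, 10), (0, 10), (10, 20)]))                          # [(0, 40)]
```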
When to Free Memory
Depends on language.
  Explicit deallocation - needed for library approaches (e.g. C).
  Implicit deallocation - aka garbage collection
    Garbage is unreferenced memory.
    Compaction moves allocated memory to contiguous addresses (coalescing all holes).
    Can cause timing variations (care is needed in real time systems).
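
One way a collector can identify garbage is to trace from a set of roots and treat everything it cannot reach as garbage. The sketch below makes two assumptions not stated on the slide: "unreferenced" is read as "unreachable from the roots", and the heap is modelled as a dictionary from object name to the names it points to. find_garbage is a hypothetical name.

```python
def find_garbage(heap, roots):
    """Return the set of objects in `heap` that cannot be reached from `roots`.

    heap maps each object name to the list of object names it references.
    """
    reachable = set()
    worklist = list(roots)
    while worklist:
        obj = worklist.pop()
        if obj in reachable:
            continue
        reachable.add(obj)            # mark
        worklist.extend(heap[obj])    # follow outgoing references
    return set(heap) - reachable

heap = {"a": ["b"], "b": [], "c": ["d"], "d": ["c"]}
print(sorted(find_garbage(heap, roots=["a"])))  # ['c', 'd']
```

Note that c and d reference each other yet are still collected, because neither is reachable from a root.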
Speeding Up Searching for a Free Block
Recall that all fitting schemes require finding sufficiently large blocks.
Idea: Organize Free List according to block size.
Fibonacci Heap - Use Fibonacci numbers for block sizes.
Buddy System - Use block sizes of 2^k
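
A rough Python sketch of a buddy allocator follows, assuming a heap of 2^total_order units addressed by offset from the heap base and one free list per power-of-two size. The class and method names are hypothetical, and a real allocator would keep the free lists in the free blocks themselves rather than in Python lists.

```python
class BuddyAllocator:
    """Toy buddy system: block sizes are 2**k units, one free list per k."""

    def __init__(self, total_order):
        self.total_order = total_order
        self.free = {k: [] for k in range(total_order + 1)}
        self.free[total_order].append(0)      # initially one big free block

    def alloc(self, request):
        """Return (offset, order) for a block of at least `request` units, or None."""
        order = 0
        while (1 << order) < request:         # round the request up to a power of two
            order += 1
        k = order
        while k <= self.total_order and not self.free[k]:
            k += 1                            # find the smallest free block that is big enough
        if k > self.total_order:
            return None
        offset = self.free[k].pop()
        while k > order:                      # split, keeping the lower half
            k -= 1
            self.free[k].append(offset + (1 << k))
        return offset, order

    def release(self, offset, order):
        """Free a block and merge it with its buddy as long as the buddy is free."""
        while order < self.total_order:
            buddy = offset ^ (1 << order)     # buddy address differs in exactly one bit
            if buddy not in self.free[order]:
                break
            self.free[order].remove(buddy)
            offset = min(offset, buddy)
            order += 1
        self.free[order].append(offset)

heap = BuddyAllocator(total_order=4)          # 16 units
a = heap.alloc(3)                             # rounded up to 4 units
b = heap.alloc(4)
print(a, b)                                   # (0, 2) (4, 2)
```

Because sizes are powers of two, finding a block of the right size is a lookup by order, and a block's buddy is found with a single XOR.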
Introduction to Scope
Scope refers to the region of a program during which a binding is active.
Consider the following code segment, what should the output be?
program(output) {
    const int i = 1;
    procedure b() {
        write(output, i); // What value is output here?
    }
    procedure a() {
        const int i = 2;
        b();
    }
    a(); // invoke a
}
Scope Rules
Two popular answers to the problem.
  Static (lexical) scope - Use compile time analysis. Normally in block structured languages, the closest containing scope is preferred; output is 1 in this case.
  Dynamic scope - Value found at run time by resolving to the nearest stack frame in which the name is defined; output is 2 in this case.
Lexical scope is more popular.
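
The two rules can be contrasted with a small Python model of the example program. It is only a sketch: each activation is a Frame (a hypothetical class) holding its local bindings, a link to the lexically enclosing frame, and a link to the caller's frame.

```python
class Frame:
    """One activation: local bindings plus links used by the two scope rules."""

    def __init__(self, bindings, static_parent=None, dynamic_parent=None):
        self.bindings = bindings
        self.static_parent = static_parent      # frame of the lexically enclosing block
        self.dynamic_parent = dynamic_parent    # frame of the caller

def lookup(frame, name, link):
    """Resolve `name` by following either 'static_parent' or 'dynamic_parent' links."""
    while frame is not None:
        if name in frame.bindings:
            return frame.bindings[name]
        frame = getattr(frame, link)
    raise NameError(name)

def run():
    program = Frame({"i": 1})
    # a is declared in program and called from program; it declares its own i.
    a_frame = Frame({"i": 2}, static_parent=program, dynamic_parent=program)
    # b is declared in program but called from a; it has no local i.
    b_frame = Frame({}, static_parent=program, dynamic_parent=a_frame)
    return (lookup(b_frame, "i", "static_parent"),
            lookup(b_frame, "i", "dynamic_parent"))

print(run())  # (1, 2)
```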
Variants of Static Scope
Single Global Scope (BASIC) - simplest
Global and Local (Fortran)
  Fortran Common Blocks
    Support separate compilation
    Give base address of region
    Each program unit specifies (possibly different) layout
Block Structured (Pascal)
Modules and Separate Compilation
Modules support encapsulation (much like classes).
Found in Modula 2, Euclid, Oberon and Ada.
For separate compilation, define interfaces (data and subroutines)
  Export statements - publish interfaces
  Import statements - use published interfaces
Classes are extensions of modules
More Notation
Fundamental question: Does the scope need to be explicitly imported to be visible?
  Yes - Referred to as closed scope.
  No - Referred to as open scope.
Aliasing - having more than one way to refer to the same object.
Classes and Scope
Classes provide encapsulation in object oriented programming (OOP).
Supports aggregating heterogeneous data and operations together.
Interfaces are published
  C++ public section in classes
  Internals can be hidden (a la private section in C++)
  Constructors and destructors supported.
OOP Features
I think of OOP as providing:
  Encapsulation - groups data with operations
  Inheritance - permits extension of more general base classes (and overriding behaviors)
  Polymorphism (overloading) - allows operators/subroutines to have behaviors dependent on the types of arguments and results expected.
Dynamic Scope
Dynamic scoping prefers the instance defined in the most recently invoked function.
  Not very popular currently (hard to debug)
  Found in interpreted languages (APL, older Lisp dialects, e.g. Emacs Lisp).
Fans claim that it makes customizing subroutines easier.
Symbol Table Design Criteria
Symbol tables require:
  Fast insertion
  Fast lookup
  Occasional deletion (should be fast).
This motivates the use of hash tables.
But ordinary hash tables are not good with nesting (a la classes/records/subroutines).
Operations on Symbol Tables (Static Scope)
A Symbol Table should support:
  Entering Scope
  Leaving Scope
  Inserting a symbol (with scope information)
  Looking up a symbol (with scope information)
It is often useful to store the symbol table in objects/executables,
e.g. for debugging or source level analysis.
LeBlanc-Cook Symbol Table Lookup 1/5
Each scope is assigned a serial number
Elements are never deleted from the table
A Scope Counter is maintained
  The first scope is 0
  Every new scope encountered increments the counter
To track nesting, a scope stack is maintained.
  Push to enter scope, pop when leaving scope
LeBlanc-Cook Symbol Table Lookup 2/5
Put all symbols in a single hash table.
  Keywords not inserted (can use another hash).
  Entries indexed using both name and scope.
To look up a name
  Look in the hash table for the (name, scope) pair.
  If not found:
    Parent scope is found using the stack
    Test if parent scope is open or exports the symbol
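
The scheme on these two slides can be approximated in Python as below. This is a simplified sketch, not the published LeBlanc-Cook algorithm: it assumes every scope is open (so the export test is omitted), uses a Python dict as the hash table, and the class name ScopedSymbolTable and its method names are invented for illustration.

```python
class ScopedSymbolTable:
    """Single table keyed by (name, scope serial number), plus a scope stack."""

    def __init__(self):
        self.table = {}         # (name, serial) -> info; entries are never deleted
        self.next_serial = 0    # scope counter
        self.scope_stack = []   # serial numbers of the currently open scopes
        self.enter_scope()      # the first scope gets serial 0

    def enter_scope(self):
        serial = self.next_serial
        self.next_serial += 1
        self.scope_stack.append(serial)
        return serial

    def leave_scope(self):
        self.scope_stack.pop()  # entries stay in the table but become unreachable

    def insert(self, name, info):
        self.table[(name, self.scope_stack[-1])] = info

    def lookup(self, name):
        # Try the innermost scope first, then each parent found via the stack.
        for serial in reversed(self.scope_stack):
            if (name, serial) in self.table:
                return self.table[(name, serial)]
        return None

symbols = ScopedSymbolTable()
symbols.insert("i", "const 1")
symbols.enter_scope()
symbols.insert("i", "const 2")
print(symbols.lookup("i"))   # const 2
symbols.leave_scope()
print(symbols.lookup("i"))   # const 1
```

Because entries are never deleted, the table can still be written out after compilation, for example for a debugger.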
LeBlanc-Cook Symbol Table Lookup 3/5
About Hashing and Hash Functions: Is the universe of keys known in advance?
  Yes - minimal perfect hashing may be possible.
  No - must handle collisions, e.g. Quadratic Rehash or Chaining
The symbol table algorithm has to handle collisions if hashing is used.
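
Chaining is the simpler of the two collision strategies to show. The sketch below is a toy table written for illustration (ChainedTable is a hypothetical name); it relies on Python's built-in hash() and keeps colliding keys in a per-bucket list.

```python
class ChainedTable:
    """Hash table that resolves collisions by chaining within each bucket."""

    def __init__(self, buckets=8):
        self.buckets = [[] for _ in range(buckets)]

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (existing, _) in enumerate(bucket):
            if existing == key:            # key already present: overwrite
                bucket[i] = (key, value)
                return
        bucket.append((key, value))        # new key: extend the chain

    def get(self, key, default=None):
        for existing, value in self._bucket(key):
            if existing == key:
                return value
        return default

table = ChainedTable(buckets=1)            # one bucket forces every key to collide
table.put(("i", 0), "outer i")
table.put(("i", 1), "inner i")
print(table.get(("i", 1)))                 # inner i
```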
Dynamic Scope and Symbol Table Management
Dynamic scope has different symbol table management needs than static scope.
  Needs insert, lookup, enter scope, leave scope, just like static scope.
Competing Approaches: simplicity vs. speed
  Association Lists - Simple, fast scope entry/exit.
  Central Reference Table - Like LeBlanc-Cook sans reference stack. Faster lookup (common case?), slower scope entry/exit.
Association Lists
Association Lists (A-Lists) combine list and stack treatment.
When a new scope is entered
  Push its symbols on the stack
  Use a unidirectional linked list to implement the stack.
To find an item
  Scan the stack starting at the top.
When leaving a scope
  Pop all symbols in the scope from the stack.
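
A minimal Python sketch of an A-list follows. It assumes each node is a tuple (name, value, scope depth, next) forming a singly linked list, and that the scope depth is enough to tell which symbols belong to the scope being left. The class name AssociationList is hypothetical.

```python
class AssociationList:
    """Stack of bindings kept as a singly linked list; lookup scans from the top."""

    def __init__(self):
        self.top = None     # node = (name, value, depth, next_node)
        self.depth = 0

    def enter_scope(self):
        self.depth += 1

    def insert(self, name, value):
        self.top = (name, value, self.depth, self.top)    # push

    def lookup(self, name):
        node = self.top
        while node is not None:                           # newest binding wins
            if node[0] == name:
                return node[1]
            node = node[3]
        raise KeyError(name)

    def leave_scope(self):
        # Pop every binding that was pushed by the scope being left.
        while self.top is not None and self.top[2] == self.depth:
            self.top = self.top[3]
        self.depth -= 1

alist = AssociationList()
alist.insert("i", 1)
alist.enter_scope()
alist.insert("i", 2)
print(alist.lookup("i"))   # 2
alist.leave_scope()
print(alist.lookup("i"))   # 1
```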
Central Reference Tables (1)
Central Reference Tables use hashing
  Elements are keyed by symbol
  Each element is a stack
So we have one stack per symbol
  Newest scope is on top
Use a unidirectional linked list to implement each stack.
Central Reference Tables (2)
To insert a symbol/scope
  Hash on symbol, push symbol/scope on stack.
To find a symbol in a scope
  Hash to symbol's stack
  Use scope at top of stack.
When leaving scope
  Pop all symbols in that scope from top of their respective stacks.
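
The same operations for a central reference table might look like the sketch below. It departs from the slides in two labelled ways: Python lists stand in for the linked-list stacks, and a per-scope list of declared names (an assumption of this sketch) records which stacks to pop on scope exit. CentralReferenceTable is a hypothetical name.

```python
class CentralReferenceTable:
    """Hash table keyed by symbol; each entry is a stack with the newest scope on top."""

    def __init__(self):
        self.stacks = {}        # name -> list of (scope depth, value)
        self.depth = 0
        self.declared = [[]]    # names declared in each open scope (sketch assumption)

    def enter_scope(self):
        self.depth += 1
        self.declared.append([])

    def insert(self, name, value):
        self.stacks.setdefault(name, []).append((self.depth, value))
        self.declared[-1].append(name)

    def lookup(self, name):
        stack = self.stacks.get(name)
        if not stack:
            raise KeyError(name)
        return stack[-1][1]     # top of the symbol's stack, no scanning needed

    def leave_scope(self):
        # Slower than lookup: every symbol declared in this scope must be popped.
        for name in self.declared.pop():
            self.stacks[name].pop()
        self.depth -= 1

crt = CentralReferenceTable()
crt.insert("i", 1)
crt.enter_scope()
crt.insert("i", 2)
print(crt.lookup("i"))   # 2
crt.leave_scope()
print(crt.lookup("i"))   # 1
```

Lookup is a single hash plus a peek at the top of one stack, while leaving a scope costs one pop per symbol declared in it.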
Resolving Static Scope at Run Time
Consider a function F containing G,
  i.e. F and G are nested functions.
Suppose G uses an identifier in F's scope.
  How can G find F's frame pointer at run time?
If G is always invoked by F, just do base + offset
  Called static chaining - offset computed at compile time.
But what if G is separated from F by recursive invocations?
  Use pointer jumping (exploit transitivity and associativity)
  Called dynamic chaining - requires run time support
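
One form the run time support can take is a link stored in every frame that points at the frame of the lexically enclosing function. The Python sketch below assumes exactly that; the names Frame, enclosing, caller and load are hypothetical, and a variable reference is assumed to be compiled to a pair (hops, offset).

```python
class Frame:
    """Activation record with a link to the enclosing function's frame and to the caller."""

    def __init__(self, slots, enclosing=None, caller=None):
        self.slots = slots            # local variables, addressed by offset
        self.enclosing = enclosing    # frame of the lexically enclosing function
        self.caller = caller          # frame of whoever called us

def load(frame, hops, offset):
    """Follow `hops` enclosing links, then read the slot at `offset`."""
    for _ in range(hops):
        frame = frame.enclosing
    return frame.slots[offset]

def demo(depth):
    f = Frame(slots=[42])                       # F's frame; its variable lives at offset 0
    g = Frame(slots=[0], enclosing=f, caller=f)
    for level in range(1, depth + 1):           # G calls itself `depth` times
        g = Frame(slots=[level], enclosing=f, caller=g)
    # One hop along the enclosing link reaches F regardless of recursion depth;
    # one hop along the caller link only reaches the previous activation of G.
    return load(g, hops=1, offset=0), g.caller.slots[0]

print(demo(3))  # (42, 2)
```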
An Example Requiring Dynamic Chaining
program DynChain(input, output);
var
  basex, basey, TimesCalled : integer;

function Ackermann(x, y : integer) : integer;
begin
  TimesCalled := TimesCalled + 1;
  if (x = 0) then
  begin
    writeln('Returning ', y + 1, ', bx = ', basex, ', by = ', basey,
            ', TimesCalled = ', TimesCalled);
    Ackermann := y + 1;
  end
  else if (y = 0) then
    Ackermann := Ackermann(x - 1, 1)
  else
    Ackermann := Ackermann(x - 1, Ackermann(x, y - 1));
end; { Ackermann }

begin
  TimesCalled := 0;
  writeln('Enter basex and basey');
  readln(basex, basey);
  writeln('Ackermann(basex = ', basex, ', basey = ', basey, ') = ',
          Ackermann(basex, basey), ', TimesCalled = ', TimesCalled);
end.
Subroutine Closures
Consider when a function, F, is passed as an argument to another function, G,
  e.g. comparison operators for sorting.
When G invokes F, how can we determine the scope?
Subroutine closures describe a function's scope and instruction space address.
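
Python functions are closures, so the idea can be observed directly. In this small sketch (make_key and sort_rows are illustrative names), the function passed to the sort carries the binding of column from the scope where it was created, even though it is invoked from inside sort_rows.

```python
def make_key(column):
    """Return a function F that remembers `column` from its defining scope."""
    def key(row):
        return row[column]
    return key

def sort_rows(rows, key_fn):
    """G: receives F as an argument and invokes it; knows nothing about `column`."""
    return sorted(rows, key=key_fn)

by_second = make_key(1)
print(sort_rows([(2, "b"), (1, "c"), (3, "a")], by_second))
# [(3, 'a'), (2, 'b'), (1, 'c')]

# The closure object pairs the code with the captured scope:
print(by_second.__closure__[0].cell_contents)   # 1
```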
Overloading Defined
An overloaded function or operator selects its semantics based on the types of its parameters and result.
Implicit overloading - provided by language
  e.g. addition in Pascal can handle reals or integers
  Write and Writeln in Pascal
Explicit overloading - programmers resolve actions
  e.g. Overloaded operators and methods in C++
Some thoughts on Overloading
Should user defined overloading of operators be permitted?
Pro: Permits consistent interface
  e.g. A = B * C; good for integer, real, complex ...
Con: You may need to read the entire program to understand a single line of code.
  e.g. A = B * C; What if B and C are objects? Inheritance?
What to do with ephemeral objects? e.g. A * B * C
More Thoughts
Meyer's Eiffel overloads A(i)
  Single parameter function
  Single index array
  Because functions and arrays are often interchangeable!
Operator vs. function overloading
  Operator - Syntactic Sugar
  Function - Programmers know to read code
Challenges of Overloading
Compiler needs to be smart about types
Separate compilation hard
  e.g. Unix Linker - Predates C++
Name Space Mangling
  Can break system tools (profilers/debuggers)
  Compiler creates a unique name based on operator/function name and parameter/result types.
  No standard defined
  Hard to link code compiled by different compilers
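
To make the mangling idea concrete, here is a toy scheme in Python. It is invented for this illustration and does not match any real compiler: each piece of the name is prefixed with its length and the pieces are joined under a made-up "_M" prefix. The function name mangle and the type spellings are likewise assumptions.

```python
def mangle(name, param_types):
    """Encode a function name and its parameter types into one linker-visible symbol.

    Toy scheme, invented for this sketch; real compilers each use their own encoding.
    """
    pieces = [name] + list(param_types)
    return "_M" + "_".join(str(len(piece)) + piece for piece in pieces)

# Two overloads of `area` become two distinct symbols for the linker.
print(mangle("area", ["int", "int"]))   # _M4area_3int_3int
print(mangle("area", ["double"]))       # _M4area_6double
```

A linker that predates overloading only sees two unrelated symbol names, which is why the scheme works with old tools but also why tools that print symbols show the encoded form.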
Templates
Templates in C++ are used for container classes.
The base type describes elements in the container.
The base type is a parameter to the template passed when instantiated (or in a typedef).
Makes separate compilation hard
  Typically the interface needs to be compiled by both publisher and user (header files)
Templates Pros and Cons
Templates promote code reuse
  But also promote compiled code bloat
Recovering from syntax errors is hard!
  Make a small STL error, get pages of errors
  And the error messages are not helpful!
Vandevoorde's Xroma - Have the template developer give the compiler hints (also for code generation).