Symbol Table Management

  • 7/28/2019 Symbol Table Management

    1/7

    7/10/13 Symbol Table Management

    jcsites.juniata.edu/faculty/rhodes/lt/sytbmgmt.htm 1/7

    Symbol Table Management, last updated 5/11/07

    Central part of compiler

    The symbol table is accessed by most phases of a compiler, from lexical analysis through optimization.

    The symbol table carries the collected information about each named object in the program to other phases of the compiler. As soon as a named token is found, a call to the symbol table routines must be made; which routine is called depends on the token type.

    Tokens:

    Keywords of the language, special characters/punctuation, user-defined identifiers used for different meanings, constants

    Identifiers: Identifiers are user defined

    - variables, memory location labels
    - parameter names
    - names of subroutines (functions or procedures), which are associated with the entry points of the routines' code
    - array or structure names
    - field names; file names; enumerated type values
    - constant names; type names

    Need a global structure to store identifiers and their associated meanings (attributes) for future reference. This structure provides more information than the grammar and syntax-directed translation of the language would normally be able to carry.

    Reserved words

    These may be stored here too. Reserved words have the same syntax as identifiers, and every identifier has to be looked up anyway, so one lookup can serve both.
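One way to store reserved words in the table is to pre-load them, so the scanner needs a single lookup to tell a keyword from a user identifier. The sketch below assumes a tiny, hypothetical keyword set and a plain Python dict as the table; `KEYWORDS`, `make_table`, and `classify` are illustrative names, not taken from the notes.

```python
# Hypothetical keyword set for a small Pascal-like language (an assumption).
KEYWORDS = ("begin", "end", "if", "then", "var")

def make_table():
    """Return a symbol table pre-loaded with the reserved words."""
    return {word: {"kind": "keyword"} for word in KEYWORDS}

def classify(table, lexeme):
    """Look the lexeme up once; unknown names are entered as identifiers."""
    entry = table.setdefault(lexeme, {"kind": "identifier"})
    return entry["kind"]

table = make_table()
print(classify(table, "begin"))   # keyword
print(classify(table, "total"))   # identifier
```

The design choice here is that the scanner never needs a separate keyword list: whatever the lookup returns already says which kind of token was read.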

    Scoping rules

    The symbol table must be able to handle the scoping rules of the language.

    Symbol Table Requirements

    Fast lookup. The parser of a compiler runs in time linear in the size of its input, so the table lookup needs to be as fast as possible.

    Flexible in structure. The entries must contain all necessary information, depending on the usage of the identifier.

    Efficient use of space. This will require dynamic (run-time) allocation of the space.

    Handle characteristics of the language (e.g., scoping, implicit declaration).

    Scoping requires handling the entry and exit of a local block. Block exit requires removal or hiding of the entries of the block.
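Block entry and exit can be sketched with a stack of per-block tables. This is a minimal illustration that assumes one Python dict per block; `ScopedTable` and its method names are hypothetical, and exit here removes the block's entries rather than hiding them.

```python
class ScopedTable:
    """A stack of dicts: the last dict holds the current block's entries."""

    def __init__(self):
        self.scopes = [{}]              # index 0 is the global block

    def enter_block(self):
        self.scopes.append({})          # new, empty local block

    def exit_block(self):
        if len(self.scopes) == 1:
            raise RuntimeError("cannot exit the global block")
        return self.scopes.pop()        # removes the block's entries

    def declare(self, name, **attrs):
        current = self.scopes[-1]
        if name in current:
            raise KeyError(f"duplicate declaration of {name!r}")
        current[name] = attrs

    def lookup(self, name):
        # Search innermost block first so local names hide outer ones.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None

table = ScopedTable()
table.declare("x", type="int")
table.enter_block()
table.declare("x", type="real")         # legal: a different block
print(table.lookup("x")["type"])        # real
table.exit_block()
print(table.lookup("x")["type"])        # int
```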

    Program components

    Declarative statements - define identifiers and their attributes

    Imperative statements - use identifiers, assuming their attributes

    Languages and their explicit declarative statements

    Pascal, Ada            VAR statement
    C, C++, Java, PL/I     statements beginning with a type
    FORTRAN                TYPE, DIMENSION statement
    COBOL                  DATA DIVISION

    Implicit Declarations

    Some languages have implicit declarations

    FORTRAN or PL/I: SUM = SUM + I (the first encounter assumes a declaration with default attributes)

    APL, BASIC, LISP (most interpreted languages)
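Implicit declaration can be sketched as a lookup that enters the name with default attributes when it is missing. The example uses FORTRAN's default typing rule (names starting with I through N are INTEGER, all others REAL); the dict layout and the function names `default_type` and `reference` are assumptions made for illustration.

```python
def default_type(name):
    """FORTRAN's implicit rule: first letter I..N means INTEGER, else REAL."""
    return "INTEGER" if name[0].upper() in "IJKLMN" else "REAL"

def reference(table, name):
    """Return the entry for name, declaring it implicitly on first use."""
    if name not in table:
        table[name] = {"type": default_type(name), "implicit": True}
    return table[name]

table = {}
for name in ("SUM", "SUM", "I"):        # the names in SUM = SUM + I
    reference(table, name)
print(table["SUM"]["type"], table["I"]["type"])   # REAL INTEGER
```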

    Actions on declarations

    Declarative statements generally do not translate to, and are not directly associated with, any executable code. They may allocate space at designated times, however.

    Symbols can be monitored, especially in languages that require declaration statements:

    FIRST ENCOUNTER                       SUBSEQUENT ENCOUNTER                  CONCLUSION
    Declaration                           Reference (in imperative statement)   Continue; this is the expected case
    Declaration                           Declaration                           Multiple declarations error, unless appropriate within scoping rules
    Declaration                           None                                  Unused warning message
    Reference (in imperative statement)   NA                                    Undefined error
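The four cases in the table above can be sketched as a small checker over the declarations and references of one block. The event format, the message wording, and the name `check_block` are assumptions for illustration; scoping is ignored, so every repeated declaration is reported.

```python
def check_block(events):
    """events: ("decl" | "ref", name) pairs in source order for one block."""
    declared, used, messages = set(), set(), []
    for kind, name in events:
        if kind == "decl":
            if name in declared:
                messages.append(f"error: multiple declarations of {name}")
            declared.add(name)
        elif name in declared:
            used.add(name)                      # the expected case: continue
        else:
            messages.append(f"error: {name} is undefined")
    for name in sorted(declared - used):        # declared, never referenced
        messages.append(f"warning: {name} is declared but unused")
    return messages

events = [("decl", "a"), ("decl", "b"), ("decl", "a"), ("ref", "a"), ("ref", "c")]
for message in check_block(events):
    print(message)
# error: multiple declarations of a
# error: c is undefined
# warning: b is declared but unused
```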

    String Management

    How to efficiently store names of identifiers

    1. Restrict length of identifier

    FORTRAN 77 and earlier limited variables to 1-6 characters (4 bytes packed encoding)
    Pascal had 1-10 characters in initial design
    Advantages: structure consists of a simple array of fixed-size strings; easy lookup
    Disadvantages: limits the programmer's ability to choose effective names; may waste space

    2. Separate String Space

    Advantages: allows identifier names of unlimited length; doesn't waste space
    Disadvantages: extra memory reference for access

    3. Dynamic allocation of strings

    Advantages: don't need predefined space for strings
    Disadvantages: complexity

    Extend to other parts of the symbol table
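The separate string space (option 2) can be sketched as one shared character pool, with each symbol table entry holding only an (offset, length) pair. `StringSpace`, `intern`, and `text` are hypothetical names, and the dict used to detect names already stored is a convenience of this sketch rather than part of the technique.

```python
class StringSpace:
    """All identifier characters live in one pool; entries keep (offset, length)."""

    def __init__(self):
        self.chars = []     # the shared pool of characters
        self.index = {}     # name -> (offset, length), avoids storing a name twice

    def intern(self, name):
        """Store name in the pool if new and return its (offset, length) handle."""
        if name not in self.index:
            self.index[name] = (len(self.chars), len(name))
            self.chars.extend(name)
        return self.index[name]

    def text(self, handle):
        """Recover the name: this is the extra memory reference on access."""
        offset, length = handle
        return "".join(self.chars[offset:offset + length])

pool = StringSpace()
print(pool.intern("count"))             # (0, 5)
print(pool.intern("total"))             # (5, 5)
print(pool.intern("count"))             # (0, 5)
print(pool.text((5, 5)))                # total
```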

    Name Searching

    Functions:

    Declaration of name
    - enter new name into table
    - return error if already there
    - add new attributes as found

    Use
    - expect name to be found
    - return position of entry in table

    Entry of new scope - allow new declarations or redefined declarations

    Exit of scope - delete entries

    Goals:

    1. efficiency of declaration entry, insertion of new name
    2. efficiency of retrieval, lookup
    3. sorted list of symbol table dump

    Most important would be #2

    Approaches:

    n is the number of entries in the symbol table for the analysis below

    n' is the number of entries in the current [local] block

    1. Linear access

    Assuming an error-free source file:
    Time(declaration) = k, or O(1) [k is some constant]
    Time(reference) ~= kn/2, or O(n)

    To check for errors:
    Time(declaration) ~= kn'/2
    Time(reference) ~= kn/2

    Time(block entry) = O(1)
    Time(block exit) = O(n'), or O(1) if a display is used
    Time(sort) = O(n log n)
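A linear table with error checking can be sketched as one list of entries plus a stack of block start positions. `LinearTable` and its methods are hypothetical names; block exit here truncates the list, which costs O(n') in Python, and no display is modeled.

```python
class LinearTable:
    def __init__(self):
        self.entries = []       # (name, attrs) in order of declaration
        self.marks = [0]        # start position of each open block

    def enter_block(self):      # O(1): remember where the block starts
        self.marks.append(len(self.entries))

    def exit_block(self):       # drop the n' entries of the current block
        if len(self.marks) == 1:
            raise RuntimeError("cannot exit the global block")
        del self.entries[self.marks.pop():]

    def declare(self, name, attrs=None):
        # Error check: scan only the current block for a duplicate.
        for index in range(self.marks[-1], len(self.entries)):
            if self.entries[index][0] == name:
                raise KeyError(f"duplicate declaration of {name!r}")
        self.entries.append((name, attrs))
        return len(self.entries) - 1

    def lookup(self, name):
        # Scan newest to oldest so inner declarations hide outer ones.
        for index in range(len(self.entries) - 1, -1, -1):
            if self.entries[index][0] == name:
                return index    # position of the entry in the table
        return None

table = LinearTable()
table.declare("x")
table.declare("y")
table.enter_block()
table.declare("x")
print(table.lookup("x"))        # 2
table.exit_block()
print(table.lookup("x"))        # 0
```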

    2. Binary search access

    - data structure is the same as the linear table
    - as a new name is entered, insert it into the block to maintain sorted order
    - on lookup, use binary search

    Time(declaration) = O(n')
    Time(ref) = O(log n')
    Time(sort) = 0 (already in order)
    Time(entry) = O(1)
    Time(exit) = O(1)
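For a single block, sorted insertion and binary search lookup can be sketched with Python's standard `bisect` module. `SortedBlock` is a hypothetical name, and the two parallel lists are one possible layout among several.

```python
import bisect

class SortedBlock:
    """One block's entries, kept sorted by name."""

    def __init__(self):
        self.names = []     # always in sorted order
        self.attrs = []     # attrs[i] belongs to names[i]

    def declare(self, name, attrs=None):
        i = bisect.bisect_left(self.names, name)    # O(log n') to find the slot
        if i < len(self.names) and self.names[i] == name:
            raise KeyError(f"duplicate declaration of {name!r}")
        self.names.insert(i, name)                  # O(n') shift to make room
        self.attrs.insert(i, attrs)

    def lookup(self, name):
        i = bisect.bisect_left(self.names, name)    # binary search
        if i < len(self.names) and self.names[i] == name:
            return i
        return None

block = SortedBlock()
for name in ("q", "b", "m"):
    block.declare(name)
print(block.names)          # ['b', 'm', 'q']
print(block.lookup("m"))    # 1
```

Because the names stay in order, a sorted dump of the block needs no extra sorting step.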

    3. Tree access

    - assumes a more random encounter of names: B A C P C D B Q C D E
    - entries of the symbol table are organized into a tree structure
    - have a forest of trees; each tree constitutes a block

    Average Time(decl) = O(log n)
    Average Time(ref) = O(log n)
    Worst case Time(decl) = Time(ref) = O(n)
    Time(sort) = 0
    Time(entry) = O(1)
    Time(exit) = O(n')
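The tree for a single block can be sketched as an unbalanced binary search tree. The example feeds it the name sequence from the notes as the order of encounter, which is an assumption about what that sequence represents; `Node`, `insert`, `find`, and `in_order` are hypothetical names.

```python
class Node:
    def __init__(self, name):
        self.name, self.left, self.right = name, None, None

def insert(root, name):
    """Insert name if it is absent; return the root of the tree."""
    if root is None:
        return Node(name)
    node = root
    while True:
        if name == node.name:
            return root                     # already declared: nothing to add
        side = "left" if name < node.name else "right"
        child = getattr(node, side)
        if child is None:
            setattr(node, side, Node(name))
            return root
        node = child

def find(root, name):
    """Return the node for name, or None if it is not in the tree."""
    node = root
    while node is not None and node.name != name:
        node = node.left if name < node.name else node.right
    return node

def in_order(node):
    """Names in sorted order, so no separate sort is needed for the dump."""
    if node is None:
        return []
    return in_order(node.left) + [node.name] + in_order(node.right)

root = None
for name in "B A C P C D B Q C D E".split():
    root = insert(root, name)
print(in_order(root))       # ['A', 'B', 'C', 'D', 'E', 'P', 'Q']
```

Names that arrive in sorted order would degrade this tree to a linked list, which is the O(n) worst case listed above.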

    4. Hash tables

    - use a hash function on the name to find its location in the symbol table
    - the symbol table is organized like a table of stacks, since some identifiers in different blocks will collide

    Time(declaration) = O(1) or O(n')
    Time(reference) = O(b), where b is the chain depth
    Time(entry) = O(1)
    Time(exit) = O(n), or O(n') if a second link connects the block's ids
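The table of stacks can be sketched as a list of buckets, each used as a stack, plus a per-block record of which buckets were touched; that record plays the role of the second link that makes block exit O(n'). `HashSymbolTable`, its bucket count, and the use of Python's built-in `hash` are assumptions of this sketch.

```python
class HashSymbolTable:
    def __init__(self, size=64):
        self.buckets = [[] for _ in range(size)]    # each bucket is a stack
        self.blocks = [[]]      # per open block: bucket slots it pushed onto

    def _slot(self, name):
        return hash(name) % len(self.buckets)

    def enter_block(self):                          # O(1)
        self.blocks.append([])

    def declare(self, name, attrs=None):
        level = len(self.blocks) - 1
        slot = self._slot(name)
        bucket = self.buckets[slot]
        for other, other_level, _ in reversed(bucket):
            if other_level < level:
                break           # outer-block entries lie deeper in the stack
            if other == name:
                raise KeyError(f"duplicate declaration of {name!r}")
        bucket.append((name, level, attrs))
        self.blocks[-1].append(slot)

    def lookup(self, name):
        # Walk the chain from the top so the innermost declaration wins.
        for other, _, attrs in reversed(self.buckets[self._slot(name)]):
            if other == name:
                return attrs
        return None

    def exit_block(self):                           # O(n'): one pop per entry
        if len(self.blocks) == 1:
            raise RuntimeError("cannot exit the global block")
        for slot in self.blocks.pop():
            self.buckets[slot].pop()

table = HashSymbolTable()
table.declare("x", "int")
table.enter_block()
table.declare("x", "real")
print(table.lookup("x"))        # real
table.exit_block()
print(table.lookup("x"))        # int
```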

    Single pass vs multipass compiler:

    A single-pass compiler can discard the entries, while a multipass compiler will need to retain temporarily inaccessible entries.