http://csiweb.ucd.ie/staff/acater/comp30330.html Compiler Construction 1
COMP30330 2009-2010
Compiler Construction
Lecturer: Dr. Arthur Cater
Teaching Assistant: Santiago Villalba
Demonstrator: Zeeshan Ahmed
Admin issues
• 24 lectures, 3 assignments, 11 practical sessions
• 1st assignment set on Thursday 21 Jan, due on Friday 12 Feb, worth 10%
• 2nd assignment set on Thursday 4 Feb, due on Friday 26 March, worth 20%
• 3rd assignment set on Thursday 4 March, due on Friday 23 April, worth 20%
• 2 hour exam will occur after semester end.
• Book “Compilers: Principles, Techniques and Tools”
by Aho, Lam, Sethi & Ullman: 2nd edition
• Each student should register for Monday practicals or Tuesday practicals.
• Attendance records will be kept.
• A Module Moodle exists at http://csimoodle.ucd.ie/moodle/course/view.php?id=98
What does a compiler do?
• Compilers translate programs written in a “high-level language” into some other form
• That other form may be:
• machine code that can be directly executed by computer hardware
• relocatable binary, needing more work on address references
• assembly code, needing assembling &c
• code for a virtual machine, such as the JVM for Java
• equivalent code in another HLL, such as C
Front-end and Back-end
• The job of translating a program from one form to another is often broken down into two major stages:
1) Front-end: Analysing the source program, determining
• how its characters form words,
• how its words form statements, procedures, class definitions, etc
• how its statements etc conform to language rules, such as
• using only declared variables,
• using operands of proper type for operators
• … and reporting statically detectable errors in the program if they exist
2) Back-end: Generating an equivalent program in the target language
• Multiple implementations of a source language for different computers may share a front end and match it with different back ends.
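The front-end / back-end split can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source language is restricted to integer expressions with + and *, Python's standard `ast` module stands in for the lexer and parser, and the names `front_end`, `lower`, `back_end_stack` and `back_end_c`, as well as the PUSH/LOAD/ADD/MUL target notation, are hypothetical and not part of the course material.

```python
import ast

def front_end(source):
    """Analyse the source text and return a nested-tuple intermediate form.

    Python's own parser is borrowed here in place of a hand-built lexer
    and parser; malformed input raises SyntaxError.
    """
    return lower(ast.parse(source, mode="eval").body)

def lower(node):
    """Check language rules and convert a Python AST node to tuples."""
    if isinstance(node, ast.Constant) and isinstance(node.value, int):
        return ("num", node.value)
    if isinstance(node, ast.Name):
        return ("var", node.id)
    if isinstance(node, ast.BinOp) and isinstance(node.op, (ast.Add, ast.Mult)):
        op = "+" if isinstance(node.op, ast.Add) else "*"
        return (op, lower(node.left), lower(node.right))
    # A statically detectable error: a construct outside the toy language.
    raise SyntaxError("unsupported construct")

def back_end_stack(ir):
    """Back end 1: emit code for a hypothetical stack machine."""
    kind = ir[0]
    if kind == "num":
        return [f"PUSH {ir[1]}"]
    if kind == "var":
        return [f"LOAD {ir[1]}"]
    opcode = "ADD" if kind == "+" else "MUL"
    return back_end_stack(ir[1]) + back_end_stack(ir[2]) + [opcode]

def back_end_c(ir):
    """Back end 2: emit an equivalent, fully parenthesised C-style expression."""
    kind = ir[0]
    if kind in ("num", "var"):
        return str(ir[1])
    return f"({back_end_c(ir[1])} {kind} {back_end_c(ir[2])})"

ir = front_end("a + 2 * b")      # analysed once by the shared front end
print(back_end_stack(ir))        # target 1: stack-machine instructions
print(back_end_c(ir))            # target 2: C-style text
```

Both back ends consume the same intermediate form, so supporting another target would mean writing only another back-end function.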
Compilers vs Interpreters
Interpreters do not translate programs; rather, they simulate them.
[Figure: Compiler (&c) vs Interpreter. Compiler: source program → target program; the target program then takes input and produces output. Interpreter: source program and input → output.]
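The contrast can be made concrete with a small sketch. It assumes a program is already available as a nested tuple such as ("+", ("var", "x"), ("num", 1)); the tuple layout, the function names `interpret`, `compile_expr` and `run`, and the four-instruction stack machine are all hypothetical choices made for illustration.

```python
def interpret(expr, env):
    """Interpreter: simulate the program directly on its input `env`."""
    kind = expr[0]
    if kind == "num":
        return expr[1]
    if kind == "var":
        return env[expr[1]]
    left, right = interpret(expr[1], env), interpret(expr[2], env)
    return left + right if kind == "+" else left * right

def compile_expr(expr):
    """Compiler: translate the program into a target program (no input needed)."""
    kind = expr[0]
    if kind == "num":
        return [("PUSH", expr[1])]
    if kind == "var":
        return [("LOAD", expr[1])]
    opcode = "ADD" if kind == "+" else "MUL"
    return compile_expr(expr[1]) + compile_expr(expr[2]) + [(opcode, None)]

def run(code, env):
    """The 'hardware': execute a target program on its input `env`."""
    stack = []
    for opcode, arg in code:
        if opcode == "PUSH":
            stack.append(arg)
        elif opcode == "LOAD":
            stack.append(env[arg])
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(left + right if opcode == "ADD" else left * right)
    return stack.pop()

program = ("+", ("var", "x"), ("*", ("num", 2), ("var", "y")))
target = compile_expr(program)                  # translation happens once
print(interpret(program, {"x": 1, "y": 3}))     # source + input -> output
print(run(target, {"x": 1, "y": 3}))            # target + input -> output
```

The interpreter needs the source program every time it runs, whereas the compiled target program can be run repeatedly on different inputs without the source.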
Some special varieties of compiler
• Cross compiler
• Debugging compiler
• Optimizing compiler
• Batch compiler
• Load-and-go compiler
It is quite common for a compiler for a language to be written in that same language.
Phases of a typical compiler
[Figure: phases of a typical compiler]
source program
→ lexical analyzer → token stream
→ syntax analyzer → syntax tree
→ semantic analyzer → syntax tree
→ intermediate code generator → intermediate representation
→ machine-independent code optimizer → intermediate representation
→ code generator → target-machine code
→ machine-dependent code optimizer → target-machine code
Also shown: symbol table
Software relatives
Various other software tools perform analysis functions similar to a compiler’s:
• Syntax-directed editors: automatically insert text fragments near reserved words
• Prettyprinters and colorizers
• Static checkers: look for e.g. unreachable code, undeclared / unused variables, datatype mismatches
• html / xml browsers
Metalanguages
Chomsky hierarchy of types of language, distinguished by what restrictions may be placed on “productions” in an adequately descriptive “generative grammar”:
Type 0 (unrestricted): any LHS may be replaced by any RHS
    aXbYc → Pqr
Type 1 (context-sensitive): a single nonterminal in the context of a LHS may be replaced by anything else in the same context
    aXbYc → aZwbYc
Type 2 (context-free): a LHS may mention only a single nonterminal
    X → pQr
Type 3 (regular): a LHS may mention only a single nonterminal and a RHS may mention at most one terminal followed by at most one nonterminal
    X → pY
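The four restrictions can be checked mechanically for a single production. The sketch below assumes the convention of the examples above (upper-case letters are nonterminals, lower-case letters are terminals) and, for Type 1, the usual extra requirement that the replacement is non-empty. The function name `classify` is hypothetical; it returns the most restrictive type that a production satisfies.

```python
import re

def classify(lhs, rhs):
    """Return the highest Chomsky type (3, 2, 1 or 0) that lhs -> rhs fits."""
    single_nonterminal = len(lhs) == 1 and lhs.isupper()

    # Type 3: single nonterminal on the left; on the right, at most one
    # terminal followed by at most one nonterminal.
    if single_nonterminal and re.fullmatch(r"[a-z]?[A-Z]?", rhs):
        return 3

    # Type 2: single nonterminal on the left, anything on the right.
    if single_nonterminal:
        return 2

    # Type 1: lhs = alpha A beta and rhs = alpha gamma beta, gamma non-empty
    # (the non-emptiness is an assumption beyond the wording above).
    for i, symbol in enumerate(lhs):
        if symbol.isupper():
            alpha, beta = lhs[:i], lhs[i + 1:]
            if (rhs.startswith(alpha) and rhs.endswith(beta)
                    and len(rhs) > len(alpha) + len(beta)):
                return 1

    # Type 0: no restriction applies.
    return 0

for lhs, rhs in [("aXbYc", "Pqr"), ("aXbYc", "aZwbYc"), ("X", "pQr"), ("X", "pY")]:
    print(f"{lhs} -> {rhs}: type {classify(lhs, rhs)}")
```

A grammar as a whole has the type of its least restricted production, so one Type 0 rule is enough to make the whole grammar unrestricted.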
Relevance of types of language
• Regular languages are often used to describe the word-level syntax of a HLL
• rules for valid identifiers, numbers, strings, reserved words, etc
• finite-state automata can recognise regular languages
• tools such as ‘lex’, ‘Flex’, ‘ANTLR’, ‘JavaCC’ can build a lexical analyzer program (lexer, scanner) when supplied with a regular grammar describing the desired regular language; or hand coding can be used
• Context-free languages are used to describe the phrase-level syntax of a HLL
• rules for expressions, statements, compound statements, conditionals, etc
• push-down automata can recognise context-free languages
• tools such as ‘Yacc’, ‘Bison’, ‘ANTLR’, ‘JavaCC’ can build a syntax analyzer program (parser) when supplied with a context-free grammar describing the desired context-free language; or hand coding can be used
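As an illustration of the "hand coding" option at both levels, the sketch below pairs a finite-state automaton that recognises identifiers and unsigned integers with a stack-based (push-down) recogniser for nested brackets. The state names, the exact identifier rule (letter or underscore first, then letters, digits or underscores) and the function names `classify_word` and `balanced` are assumptions made for the example.

```python
import string

LETTERS = string.ascii_letters + "_"
DIGITS = string.digits

def classify_word(text):
    """Finite-state automaton with states start, id, num and error."""
    state = "start"
    for ch in text:
        if state == "start":
            state = "id" if ch in LETTERS else "num" if ch in DIGITS else "error"
        elif state == "id":
            state = "id" if ch in LETTERS or ch in DIGITS else "error"
        elif state == "num":
            state = "num" if ch in DIGITS else "error"
        else:
            break                      # error is a dead state
    # Only 'id' and 'num' are accepting states.
    return state if state in ("id", "num") else None

def balanced(text):
    """Push-down recogniser: the stack remembers which brackets are open."""
    closing = {")": "(", "]": "["}
    stack = []
    for ch in text:
        if ch in "([":
            stack.append(ch)
        elif ch in closing:
            if not stack or stack.pop() != closing[ch]:
                return False
    return not stack                   # accept only if nothing is left open

print(classify_word("count1"), classify_word("42"), classify_word("4x"))
print(balanced("(a[b])"), balanced("(]"), balanced("(("))
```

The first recogniser needs only a fixed, finite set of states, while the second needs unbounded memory to match arbitrarily deep nesting, which is why nesting belongs to the context-free level rather than the regular one.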
Beyond Recognition
• A Finite State Automaton can classify character sequences as numbers, ids, etc.
• A parser can operate simply at the level of token sequences.
• But mere yes/no judgements are not what is required of lexers, parsers.
• Associating “semantic actions” with grammar productions allows lexers, parsers to:
• maintain a symbol table, distinguishing different identifiers
• build the values of numeric expressions
• generate simple code as a by-product of parsing
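A minimal sketch of semantic actions that build the values of numeric expressions: a hand-coded recursive-descent parser in which each grammar rule returns a value instead of a bare yes/no. The grammar (expr → term { + term }, term → factor { * factor }, factor → number | ( expr )) and the function name `evaluate` are assumptions chosen for the example, and the regular-expression tokeniser simply ignores characters it does not recognise.

```python
import re

def evaluate(source):
    """Parse an expression over +, *, parentheses and integers; return its value."""
    tokens = re.findall(r"\d+|[+*()]", source)   # crude lexer, skips other characters
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def expr():                      # expr -> term { '+' term }
        value = term()
        while peek() == "+":
            advance()
            value += term()          # semantic action: add the operand values
        return value

    def term():                      # term -> factor { '*' factor }
        value = factor()
        while peek() == "*":
            advance()
            value *= factor()        # semantic action: multiply the operand values
        return value

    def factor():                    # factor -> number | '(' expr ')'
        token = peek()
        if token == "(":
            advance()
            value = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return value
        if token is not None and token.isdigit():
            advance()
            return int(token)        # semantic action: convert the lexeme to a number
        raise SyntaxError(f"unexpected token {token!r}")

    result = expr()
    if peek() is not None:
        raise SyntaxError("unexpected trailing input")
    return result

print(evaluate("2 + 3 * (4 + 1)"))
```

Replacing the arithmetic in each action with code emission would make the same parser generate simple code as a by-product of parsing.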
Symbol table
• An unsophisticated symbol table may have the following form:
[Figure: symbol table with slots 0 to 6; extracted contents: 0: 0, 1: 4, 2: 9, 3: \\, 4: \\, 5: \\, 6: \\]
Semantic actions associated with state transitions in a finite state automaton can accumulate characters in a buffer, then at an accepting state look up in symbol table, and insert new entry if no match is found.
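The buffer-then-lookup behaviour can be sketched as follows. The class `SymbolTable`, its method `lookup_or_insert` and the function `scan_identifiers` are hypothetical names, and the table layout (a list of spellings plus a dictionary index) is an assumption for illustration, not the layout in the figure. The automaton has two states, signalled by whether the buffer is empty; numbers and other characters are simply skipped.

```python
class SymbolTable:
    """Unsophisticated table: each distinct identifier gets the next free index."""

    def __init__(self):
        self.names = []       # index -> identifier spelling
        self.index_of = {}    # identifier spelling -> index

    def lookup_or_insert(self, name):
        if name not in self.index_of:            # no match found: insert new entry
            self.index_of[name] = len(self.names)
            self.names.append(name)
        return self.index_of[name]

def scan_identifiers(text, table):
    """Two-state automaton: outside an identifier (empty buffer) or inside one."""
    tokens, buffer = [], []
    for ch in text + " ":                        # trailing space flushes the last word
        if ch.isalpha() or (buffer and ch.isdigit()):
            buffer.append(ch)                    # action on transition: accumulate
        elif buffer:
            # Action at the accepting state: look up, inserting if necessary.
            tokens.append(("id", table.lookup_or_insert("".join(buffer))))
            buffer = []
    return tokens

table = SymbolTable()
print(scan_identifiers("count = count + step1", table))
print(table.names)
```

Because both occurrences of the same spelling map to the same index, later phases can distinguish different identifiers by comparing indices rather than strings.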