7/30/2019 Compiler Design Lectures
CS 434
Compilers Design
Dr. Ayman Hamarsheh
Lecture 1
Introduction
Programs, Interpreters and Translators
Programming languages are notations for describing computations to people and to machines.
All the software running on all the computers was written in some programming language.
Before a program can be run, it must first be translated into a form in which it can be executed by a computer.
The software systems that do this translation are called compilers.
A compiler is a program that can read a program in one language (the source language) and translate it into an equivalent program in another language (the target language).
An important role of the compiler is to report any errors in the source program that it detects during the translation process.
If the target program is an executable machine-language program, it can then be called by the user to process inputs and produce outputs.
An interpreter is another common kind of language processor.
Instead of producing a target program as a translation, an interpreter appears to directly execute the operations specified in the source program on inputs supplied by the user.
The machine-language target program produced by a compiler is usually much faster than an interpreter at mapping inputs to outputs.
An interpreter, however, can usually give better error diagnostics than a compiler, because it executes the source program statement by statement.
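The contrast between the two kinds of language processor can be sketched in Python. This is a minimal illustration, not taken from the lecture: the tuple-based expression format and the names `interpret` and `compile_expr` are assumptions made for the example. `interpret` walks the source expression every time it runs, while `compile_expr` translates it once into a target program (a Python function stands in for machine code here) that can then be called repeatedly on new inputs.

```python
import operator

# Hypothetical source representation: a number, a variable name,
# or a tuple (op, left, right) where op is "+" or "*".
OPS = {"+": operator.add, "*": operator.mul}


def interpret(expr, env):
    """Execute the source expression directly on the inputs in env."""
    if isinstance(expr, (int, float)):
        return expr
    if isinstance(expr, str):
        return env[expr]
    op, left, right = expr
    return OPS[op](interpret(left, env), interpret(right, env))


def compile_expr(expr):
    """Translate the expression once into a callable target program."""
    if isinstance(expr, (int, float)):
        return lambda env: expr
    if isinstance(expr, str):
        return lambda env: env[expr]
    op, left, right = expr
    fn = OPS[op]
    left_code, right_code = compile_expr(left), compile_expr(right)
    return lambda env: fn(left_code(env), right_code(env))


source = ("+", "x", ("*", "y", 2))   # x + y * 2
target = compile_expr(source)        # translation happens once

print(interpret(source, {"x": 1, "y": 3}))   # 7
print(target({"x": 1, "y": 3}))              # 7
print(target({"x": 10, "y": 0}))             # 10
```

Both routes map the same inputs to the same outputs; the difference is whether the analysis of the source is repeated on every run or done once ahead of time.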
The main advantages of compilers
They produce programs which run quickly.
They can spot syntax errors while the program is being compiled (i.e. you are informed of any grammatical errors before you try to run the program).
The main advantages of interpreters
There is no lengthy "compile time", i.e. you do not have to wait for a program to compile between writing it and running it.
They tend to be more "portable", which means that interpreted programs will run on a greater variety of machines.
In addition to a compiler, several other programs may be required to create an executable target program.
A source program may be divided into modules stored in separate files.
The task of collecting the source program is sometimes entrusted to a separate program, called a preprocessor.
The preprocessor may also expand shorthands, called macros, into source language statements.
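Macro expansion can be illustrated with a small sketch. The `#define NAME replacement` directive syntax and the function name `preprocess` are assumptions borrowed from C-style preprocessors for the sake of the example; the sketch handles only simple name-for-text macros and does not collect modules from separate files.

```python
import re


def preprocess(text):
    """Expand macros declared with '#define NAME replacement'."""
    macros = {}
    output = []
    for line in text.splitlines():
        match = re.match(r"\s*#define\s+(\w+)\s+(.*)", line)
        if match:
            # Record the shorthand; the directive itself is not emitted.
            macros[match.group(1)] = match.group(2).strip()
            continue
        # Replace whole identifiers only, so MAX does not match MAXIMUM.
        expanded = re.sub(
            r"\b\w+\b",
            lambda m: macros.get(m.group(0), m.group(0)),
            line,
        )
        output.append(expanded)
    return "\n".join(output)


program = """#define MAX 100
int limit = MAX;
int MAXIMUM = MAX + 1;"""

print(preprocess(program))
# int limit = 100;
# int MAXIMUM = 100 + 1;
```

The output contains only source language statements, which is what the compiler proper then receives.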
The compiler may produce an assembly language program as its output, because assembly language is easier to produce as output and is easier to debug.
The assembly language is then processed by a program called an assembler that produces relocatable machine code as its output.
Large programs are often compiled in pieces, so the relocatable machine code may have to be linked together with other relocatable object files and library files into the code that actually runs on the machine.
The linker resolves external memory addresses, where the code in one file may refer to a location in another file.
The loader then puts together all of the executable object files into memory for execution.
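The linker's job of resolving external addresses can be sketched with a toy object-file format. Everything about the format is hypothetical and chosen only for illustration: each module is a dictionary with a `defs` table (symbol name to offset within the module) and a `code` list of `(opcode, operand)` pairs, where a string operand is a symbolic reference still to be resolved.

```python
def link(modules):
    """Combine relocatable modules and resolve symbolic references."""
    # Pass 1: place each module after the previous one and record
    # the absolute address of every symbol it defines.
    symbols = {}
    next_address = 0
    for module in modules:
        for name, offset in module["defs"].items():
            if name in symbols:
                raise ValueError(f"duplicate symbol: {name}")
            symbols[name] = next_address + offset
        next_address += len(module["code"])

    # Pass 2: copy the code, replacing each symbolic operand with
    # the address found in pass 1.
    image = []
    for module in modules:
        for opcode, operand in module["code"]:
            if isinstance(operand, str):
                if operand not in symbols:
                    raise ValueError(f"undefined symbol: {operand}")
                operand = symbols[operand]
            image.append((opcode, operand))
    return image, symbols


main_o = {"defs": {"main": 0}, "code": [("CALL", "square"), ("HALT", None)]}
math_o = {"defs": {"square": 0}, "code": [("MUL", None), ("RET", None)]}

image, symbols = link([main_o, math_o])
print(symbols)   # {'main': 0, 'square': 2}
print(image)     # [('CALL', 2), ('HALT', None), ('MUL', None), ('RET', None)]
```

The reference to `square` in the first module becomes address 2 because the second module is placed directly after the two instructions of the first. A loader would then copy the resulting image into memory.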
[Figure: Source Program -> Translators (Compilers, Interpreters) -> Target Program]
Lecture 2
The Structure of a Compiler
Analysis-Synthesis Model of
Translation (Compilation)
There are two parts of compilation:
Analysis part
Synthesis part
[Figure: Source code -> Front End -> Intermediate Representation -> Back End -> Machine code, with Errors reported]
The Analysis Part:
It is often called the front end of the compiler.
Breaks up the source program into constituent pieces and imposes a grammatical structure on these pieces.
Creates an intermediate representation of the source program.
If the analysis part detects that the source program is either syntactically ill formed or semantically unsound, then it must provide informative messages, so the user can take corrective action.
Collects information about the source program and stores it in a data structure called a symbol table, which is passed along with the intermediate representation to the synthesis part.
During analysis, the operations implied by the source program are determined and recorded in a hierarchical structure called a tree.
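The two data structures produced by the analysis part can be sketched as follows. The class names, the attributes stored per symbol, and the sample statement `position = initial + rate * 60` are assumptions made for this illustration; a real compiler records considerably more per entry.

```python
from dataclasses import dataclass, field


@dataclass
class Symbol:
    name: str
    type_name: str
    line: int  # line of the declaration, useful for error messages


@dataclass
class SymbolTable:
    entries: dict = field(default_factory=dict)

    def declare(self, name, type_name, line):
        if name in self.entries:
            raise KeyError(f"line {line}: '{name}' is already declared")
        self.entries[name] = Symbol(name, type_name, line)

    def lookup(self, name):
        # Returns None when the name was never declared.
        return self.entries.get(name)


@dataclass
class Node:
    """One node of the tree: an operation and its operand subtrees."""
    label: str
    children: list = field(default_factory=list)


table = SymbolTable()
for name in ("position", "initial", "rate"):
    table.declare(name, "float", line=1)

# Tree for: position = initial + rate * 60
tree = Node("=", [
    Node("position"),
    Node("+", [
        Node("initial"),
        Node("*", [Node("rate"), Node("60")]),
    ]),
])

print(table.lookup("rate"))
print(tree.children[1].label)   # +
```

The multiplication sits below the addition in the tree, which records that it is to be performed first.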
The Synthesis Part:
It is often called the back end of the compiler.
Constructs the desired target program from the intermediate representation and the information in the symbol table.
Phases of compilation process:
A compiler operates as a sequence of phases, each of which transforms one representation of the source program to another.
In practice, several phases may be grouped together, and the intermediate representations between the grouped phases need not be constructed explicitly.
The symbol table, which stores information about the entire source program, is used by all phases of the compiler.
Phases of compilation process:
Lexical Analysis
Syntax Analysis
Semantic Analysis
Intermediate Code Generation
Machine-Independent Code Optimization
Code Generation
Machine-Dependent Code Optimization
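The idea that each phase transforms one representation into another can be sketched for the first four phases. This is a simplified illustration under several assumptions: Python's standard `ast` module stands in for the lexical and syntax analysis phases, the semantic phase performs a single check, the intermediate representation is three-address code, and the function names are invented for the example. Optimization and code generation are left out.

```python
import ast

OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}


def front_end(source):
    """Lexical and syntax analysis, delegated to Python's own parser."""
    return ast.parse(source)


def check_names(tree, symbol_table):
    """Semantic analysis (one check only): names that are read must be declared."""
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load):
            if node.id not in symbol_table:
                raise NameError(f"undeclared identifier: {node.id}")


def generate_ir(tree):
    """Intermediate code generation: emit three-address instructions."""
    code = []
    temp_count = 0

    def visit(node):
        nonlocal temp_count
        if isinstance(node, ast.Constant):
            return str(node.value)
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            left, right = visit(node.left), visit(node.right)
            temp_count += 1
            temp = f"t{temp_count}"
            code.append(f"{temp} = {left} {OPS[type(node.op)]} {right}")
            return temp
        raise SyntaxError(f"unsupported construct: {type(node).__name__}")

    for stmt in tree.body:
        if not isinstance(stmt, ast.Assign) or not isinstance(stmt.targets[0], ast.Name):
            raise SyntaxError("only simple assignments are supported")
        code.append(f"{stmt.targets[0].id} = {visit(stmt.value)}")
    return code


symbol_table = {"position": "float", "initial": "float", "rate": "float"}
tree = front_end("position = initial + rate * 60")
check_names(tree, symbol_table)
for instruction in generate_ir(tree):
    print(instruction)
# t1 = rate * 60
# t2 = initial + t1
# position = t2
```

Each function consumes the output of the previous one, and the symbol table is consulted along the way, mirroring the phase sequence listed above.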
Issues in compiler design
The compiler deals with many big-picture issues.
Compiler construction brings together techniques from disparate parts of Computer Science.
Compilers are engineered objects: software systems built with distinct goals in mind.
In building a compiler, the compiler writer makes myriad design decisions; each decision has an impact on the resulting compiler.
The first principle that a well-designed compiler must observe is inviolable: the compiler must preserve the meaning of the program being compiled.
Lecture 3
Programming Language
Specifications
Definition of Syntax
In computer science, the syntax of a programming language is the set of rules that define the combinations of symbols that are considered to be correctly structured programs in that language.
The syntax of a language defines its surface form.
Text-based programming languages are based on sequences of characters.
Visual programming languages are based on the spatial layout and connections between symbols (which may be textual or graphical).
Definition of Syntax
The syntax of a programming language describes the proper form of its programs.
The syntax of textual programming languages is usually defined using a combination of regular expressions (for lexical structure) and Backus-Naur Form (for grammatical structure) to inductively specify syntactic categories (nonterminals) and terminal symbols.
The syntax of a language describes the form of a valid program, but does not provide any information about the meaning of the program or the results of executing that program.
The syntax of most programming languages can be specified using a Type-2 grammar, i.e., a context-free grammar.
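A standard illustration of why a context-free grammar is needed is nested brackets: nesting to arbitrary depth cannot be described by a regular expression, but it is captured by the two productions `S -> ( S ) S` and `S -> empty`. The sketch below is a recognizer written directly from that grammar; the grammar and the function name `balanced` are chosen for this example and do not come from the lecture.

```python
def balanced(text):
    """Recognize the language of S -> '(' S ')' S | empty."""
    pos = 0

    def parse_s():
        nonlocal pos
        # Production S -> '(' S ')' S
        if pos < len(text) and text[pos] == "(":
            pos += 1
            if not parse_s():
                return False
            if pos >= len(text) or text[pos] != ")":
                return False
            pos += 1
            return parse_s()
        # Production S -> empty
        return True

    # The whole input must be consumed for the string to be accepted.
    return parse_s() and pos == len(text)


print(balanced("(()())"))   # True
print(balanced("(()"))      # False
print(balanced(")("))       # False
```

Each call to `parse_s` corresponds to one use of the nonterminal `S`, so the recursion depth follows the nesting depth of the input.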
Semantics and Pragmatics
The two stages of analysis, semantics and pragmatics, are concerned with getting at the meaning of a sentence.
In the first stage (semantics), a partial representation of the meaning is obtained based on the possible syntactic structure(s) of the sentence and on the meanings of the words in that sentence.
In the second stage (pragmatics), the meaning is elaborated based on contextual and world knowledge.
Semantics
In general, the input to the semantic stage of analysis may be viewed as being a set of possible parses of the sentence, and information about the possible word meanings.
Lecture 4
In-depth Study of Syntactic Specifications
Syntactic
The syntactic analysis of source code usually entails the transformation of the linear sequence of tokens into a hierarchical syntax tree (abstract syntax trees are one convenient form of syntax tree).
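The transformation from a linear token sequence to a tree can be sketched with a small recursive-descent parser. The expression grammar in the docstring, the tuple representation of the tree, and the function name `parse` are assumptions made for this illustration; tokens are given as plain strings and are assumed to have been produced already by a scanner.

```python
def parse(tokens):
    """Build a tree of nested tuples from a flat list of tokens.

    Grammar assumed for this sketch:
        expr   -> term   ('+' term)*
        term   -> factor ('*' factor)*
        factor -> NUMBER | IDENTIFIER | '(' expr ')'
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def expr():
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    def term():
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def factor():
        token = peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token == "(":
            advance()
            node = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        if token in ("+", "*", ")"):
            raise SyntaxError(f"unexpected token {token!r}")
        return advance()

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


print(parse(["a", "+", "b", "*", "2"]))            # ('+', 'a', ('*', 'b', '2'))
print(parse(["(", "a", "+", "b", ")", "*", "2"]))  # ('*', ('+', 'a', 'b'), '2')
```

The parentheses do not appear in the second tree because the nesting of the tuples already records the grouping; dropping such detail is what makes the result an abstract syntax tree.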
Syntax definition
The syntax of textual programming languages is usually defined using a combination of regular expressions (for lexical structure) and Backus-Naur Form (for grammatical structure) to inductively specify syntactic categories (nonterminals) and terminal symbols.
Syntax definition
The syntax of a language describes the form of a valid program, but does not provide any information about the meaning of the program or the results of executing that program.
The meaning given to a combination of symbols is handled by semantics.
Not all syntactically correct programs are semantically correct.
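The gap between syntactic and semantic correctness can be demonstrated with a short sketch. Python is used as the source language only because its parser is available in the standard library; the helper names are invented for the example. Note that Python detects this particular error when the statement is executed, whereas a compiler for a statically typed language would typically report it during semantic analysis.

```python
import ast


def is_syntactically_valid(source):
    """True when the text has the form of a valid program."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


def runs_cleanly(source):
    """True when the program can also be given a meaning and executed."""
    try:
        exec(compile(source, "<example>", "exec"), {})
        return True
    except Exception:
        return False


well_formed = 'total = 1 + "one"'   # valid form, but int + str has no meaning
ill_formed = "total = 1 +"          # not even valid form

print(is_syntactically_valid(well_formed), runs_cleanly(well_formed))   # True False
print(is_syntactically_valid(ill_formed))                               # False
```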
Using natural language as an example, it may not be possible to assign a meaning to a grammatically correct sentence or the sentence may be false:
"John is a married bachelor." is grammatically well-formed but has no generally accepted meaning.
No ambiguity is allowed in programming languages, in form (syntax) or in meaning (semantics).
Distinction between syntax and semantics:
many programming languages have features that mean the same (shared semantics) but are expressed differently
identifying which is which eases the learning curve
Syntax Specification
Formalism: set of production rules
Microsyntax rules: concatenation, alternation (choice among finite alternatives), Kleene closure
- The set of strings produced by these three rules is a regular set or regular language
- The rules are specified by regular expressions; they generate the regular language
- Strings in the regular language are recognized by scanners
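A scanner driven by regular expressions can be sketched with Python's standard `re` module. The token names and patterns below are assumptions chosen for illustration. Each pattern is built from the three rules above: character classes such as `[0-9]` abbreviate alternation, adjacent parts are concatenated, and `*` is Kleene closure.

```python
import re

# Token classes for a hypothetical toy language.
TOKEN_SPEC = [
    ("NUMBER", r"[0-9][0-9]*"),              # a digit, then any number of digits
    ("IDENT",  r"[A-Za-z_][A-Za-z0-9_]*"),   # a letter, then letters or digits
    ("OP",     r"\+|-|\*|/|="),              # alternation among operators
    ("SKIP",   r"[ \t][ \t]*"),              # white space, discarded below
]
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def scan(text):
    """Split text into (token class, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"illegal character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(scan("rate = rate * 60"))
# [('IDENT', 'rate'), ('OP', '='), ('IDENT', 'rate'), ('OP', '*'), ('NUMBER', '60')]
```

The scanner accepts exactly the strings in the regular language defined by the patterns and rejects any character that fits none of them.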