G53CMP: Lecture 1 - cs.nott.ac.uk
G53CMP: Lecture 1
Administrative Details 2016
and Introduction to Compiler Construction
Henrik Nilsson
University of Nottingham, UK
G53CMP: Lecture 1 – p.1/39
Finding People and Information (1)
• Henrik Nilsson, Room A08, Computer Science Building, e-mail: [email protected]
• Teaching Assistant (TA): Jonathan Fowler, e-mail: [email protected]
Finding People and Information (2)
• Main module web page: www.cs.nott.ac.uk/~psznhn/G53CMP
• Moodle: moodle.nottingham.ac.uk/course/view.php?id=45541
• Direct questions concerning lectures and coursework to the Moodle G53CMP Forum.
Anyone can ask and answer questions, but you must not post exact solutions to the coursework.
Aims and Motivation (1)
Why study Compiler Construction?
• Why did you opt to take this module?
• More generally, what do you think are good reasons to take this module?
Aims and Motivation (2)
Aims: Deepened understanding of:
• how compilers (and interpreters) work and are constructed
• programming language design and semantics
The former is a great, “hands on”, “learning-by-doing” way to learn the latter.
Aims and Motivation (3)
Why?
The ACM/IEEE 2013 CS Curriculum Guidelines:
“Graduates should realize that the computing field advances at a rapid pace, and graduates must possess a solid foundation that allows and encourages them to maintain relevant skills as the field evolves. Specific languages and technology platforms change over time. . . .
. . . Therefore, graduates need to realize that they must continue to learn and adapt their skills throughout their careers. To develop this ability, students should be exposed to multiple programming languages, tools, paradigms, and technologies as well as the fundamental underlying principles throughout their education.”
Aims and Motivation (4)
Moreover: Compilers: “a microcosm of computer science” [CT04]
• Formal Languages and Automata Theory
• Data structures and algorithms
• Computer architecture
• Programming language semantics
• Formal reasoning about programs
• Software engineering aspects
Thus, “capstone” module tying everything together.
Aims and Motivation (5)
Or, in terms of modules, G53CMP directly draws from/informs:
• G52MAL: formal language theory, grammars, (D)FAs
• G51MCS, G52IFR: formal reasoning, structural induction
• G51PRG, G51FUN, G51OOP: programming, understanding programming languages
• G51CSA: how computers work
• G54FOP/FPP: programming language theory
Aims and Motivation (6)
Jobs? There are plenty of companies out there with in-house languages or that critically rely on compiler/interpreter expertise for other reasons. Some possibly surprising examples:
• Standard Chartered Bank
• Jane Street
Learning Outcomes
• Knowledge of language and compiler design, semantics, key ideas and techniques.
• Experience of compiler construction tools.
• Experience of working with a medium-sized program.
• Programming in various paradigms.
• Capturing design through formal specifications and deriving implementations from those.
Literature (1)
David A. Watt and Deryck F. Brown. Programming Language Processors in Java, Prentice-Hall, 1999.
• Used to be the main book. The lectures partly follow the structure of this book.
• The coursework was originally based on it.
• Hands-on approach to compiler construction. Particularly good if you like Java.
• Considers software engineering aspects.
• A bit weak on linking theory with practice.
Literature (2)
An alternative: Keith D. Cooper and Linda Torczon. Engineering a Compiler, Elsevier, 2004.
• Covers more ground in greater depth than this module.
• Gradually becoming the new main reference for the module.
For each lecture, there are references to the relevant chapter(s) of both books (see lecture overview on the G53CMP web page).
Literature (3)
Great supplement: Alfred V. Aho, Ravi Sethi, Jeffrey D. Ullman. Compilers: Principles, Techniques, and Tools, Addison-Wesley, 1986. (The “Dragon Book”.)
• Classic reference in the field.
• Covers much more ground in greater depth than this module.
• A book that will last for years.
• There is a new(-ish) 2007 edition!
• But nice that core principles have lasting value!
Literature (4)
Other useful references:
• Benjamin C. Pierce. Types and Programming Languages.
• Graham Hutton. Programming in Haskell.
Lectures and Handouts
• Come prepared to take notes. There will be some handouts, but for the most part not.
• All electronic slides, program code, and other supporting material in electronic form used during the lectures will be made available on the course web page.
• However! The electronic record of the lectures is neither guaranteed to be complete nor self-contained!
Assessment (1)
First sit:
• The exam counts for 75 % of the total mark.
• The coursework counts for the remaining 25 %.
• 2 h exam, 3 questions, each worth 25 %.
Bonus! There will be (sub)question(s) on the exam closely related to the coursework!
Effectively, the weight of the coursework is thus more like 50 %, except partly examined later.
Assessment (2)
Resit:
• The exam counts for 100 % of the total mark.
• 2 h exam, 4 questions, each worth 25 %
• The coursework does not count.
Assessment (3)
Why this structure?
• Compiler construction is best learnt by doing.
• Thus, if you do and understand the coursework, you will be handsomely rewarded.
• Past experience shows that students who don’t engage with the coursework struggle to pass.
• In 2013/14, half of the G53CMP students got 1st or II-1 marks, but a quarter failed. Clear correlation with coursework engagement.
• Numbers differ, but similar pattern most years.
Coursework (1)
Learning goals:
• Getting a practical understanding of key concepts and techniques in the fields of compiler construction and language theory.
• Getting hands-on experience of working with language processors.
• Getting some experience of using compiler construction tools.
• Getting experience of the issues involved in working with medium-sized programs.
Coursework (2)
You will be given partial implementations of a compiler for a small language called MiniTriangle.
You will be asked to:
• answer theoretical questions related to the code
• extend the code with new features.
Detailed instructions for the coursework available (soon) from the module web page (Part I: 6 Oct.). Study these instructions very carefully!
Haskell and Coursework Support (1)
The functional language Haskell is used throughout the module as:
• An ideal language for illustrating and discussing all aspects of compiler construction (and similar applications).
• Functional language notation is closely aligned with mathematical notation and formalisms (such as attribute grammars) commonly used in textbooks on compilers.
• A really good choice for implementing compilers in practice (and much else besides).
Haskell and Coursework Support (2)
To help you get up to speed using Haskell and provide some extra help with the coursework:
• Some of the lectures closely aligned with the coursework.
• Haskell revision coursework (unassessed).
• Slides from a Haskell revision lecture via module web page.
Laboratory Sessions
Laboratory sessions:
• Fridays, 9–11, A32
• TAs will be present during the laboratory sessions from 7 October (except possibly 21 October).
Coursework Assessment (1)
• Two parts to the coursework: I and II
• Each part to be solved individually
• Submission for each part:
- Brief written report (hard copy & PDF)
- All source code (electronically)
• For part II, compulsory 10 minute oral examination in assigned slot during one of the lab sessions after the submission deadline.
• Catch-up slots only if missed slot with good cause; personal tutor to request on your behalf.
Coursework Assessment (2)
• A number of weighted questions for each part.
• Written answer to each question assessed on
- Correctness (0, 1, or 2 marks)
- Style (0, 1, or 2 marks)
• In the oral examination (part II only), you explain your answers.
• Your explanations are assessed as follows:
- 2: 100 % of mark for written answer
- 1: 65 % of mark for written answer
- 0: 0 % of mark for written answer
Coursework Deadlines
Coursework deadlines:
• Part I: Monday 31 October, 15:00.
• Part II: Monday 28 November, 15:00.
Oral examinations during the lab sessions the following two Fridays; i.e. 2 and 9 December.
Start early! It is not possible to do this coursework at the last minute.
What is a Compiler? (1)
Compilers are program translators:
source program → compiler → target program (and error diagnostics)
Typical example:
• Source language: C
• Target language: x86 assembler
Why? To make it easier to program computers!
What is a Compiler? (2)
GCC translates this C program
    #include <stdio.h>
    int main(int argc, char *argv[]) {
        printf("%d\n", argc - 1);  /* number of command-line arguments */
        return 0;
    }
into this x86 assembly code (excerpt):
    movl 8(%ebp), %eax
    decl %eax
    subl $8, %esp
    pushl %eax
    pushl $.LC0
    call printf
    addl $16, %esp
Source and Target Languages
Large spectrum of possibilities, for example:
• Source languages:
- (High-level) programming languages
- Modelling languages
- Document description languages
- Database query languages
• Target languages:
- High-level programming language
- Low-level programming language (assembler or machine code, byte code)
Where are Compilers Used? (1)
• Implementation of programming languages: C, C++, Java, C#, Haskell, Lisp, Prolog, Ada, Fortran, Cobol, . . .
• Document processing: LaTeX → DVI, DVI → PostScript/PDF/. . .
• Databases: optimization of database queries expressed in query languages like SQL.
Where are Compilers Used? (2)
• Hardware design: modelling and simulation/verification, compilation to silicon. E.g. Spice, VHDL.
• Modelling and simulation of physical systems (cars, trains, aircraft, nuclear power plants, . . . ): Simulink, Modelica.
• In Web browsers to speed up execution of code embedded in web pages, applets, Rich Internet Applications (RIA), etc.
• . . .
Compilers vs. Interpreters
Interpreters are another class of translators:
• Compiler: translates a program once and for all into target language.
• Interpreter: effectively translates (the used parts of) a source program every time it is run.
• Techniques like Just-In-Time Compilation (JIT) blur this distinction.
• Compilers and interpreters sometimes used together, e.g. Java: Java compiled into Java byte code, byte code interpreted by a Java Virtual Machine (JVM), JVM might use JIT.
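The difference between the two kinds of translator can be sketched on a toy arithmetic language. Everything here is an assumption made for illustration: the `Expr` type, the stack-machine `Instr` set and the function names are hypothetical and are not taken from the module's coursework. `eval` walks the source tree each time it is called, while `compile` translates once into target code that `run` can then execute without the source.

```haskell
-- A tiny expression language (hypothetical, for illustration only).
data Expr
  = Lit Int
  | Add Expr Expr
  | Mul Expr Expr
  deriving (Show, Eq)

-- Interpreter: walks the source tree every time the program is run.
eval :: Expr -> Int
eval (Lit n)   = n
eval (Add a b) = eval a + eval b
eval (Mul a b) = eval a * eval b

-- Target language: instructions for a simple (hypothetical) stack machine.
data Instr = Push Int | AddI | MulI
  deriving (Show, Eq)

-- Compiler: translates the tree once into target code.
compile :: Expr -> [Instr]
compile (Lit n)   = [Push n]
compile (Add a b) = compile a ++ compile b ++ [AddI]
compile (Mul a b) = compile a ++ compile b ++ [MulI]

-- The "machine": executes target code; Nothing signals malformed code.
run :: [Instr] -> Maybe Int
run = go []
  where
    go [x] []                   = Just x
    go st (Push n : is)         = go (n : st) is
    go (y : x : st) (AddI : is) = go (x + y : st) is
    go (y : x : st) (MulI : is) = go (x * y : st) is
    go _ _                      = Nothing
```

For any expression `e`, `run (compile e)` should agree with `Just (eval e)`; stating and proving that property is one simple form of compiler correctness.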
Inside the Compiler (1)
Traditionally, a compiler is broken down into several phases:
• Scanner: lexical analysis
• Parser: syntactic analysis
• Checker: contextual analysis (e.g. type checking)
• Optimizer: code improvement
• Code generator
Inside the Compiler (2)
• Lexical Analysis:
- Verify that input character sequence is lexically valid.
- Group characters into sequence of lexical symbols, tokens.
- Discard white space and comments (typically).
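A minimal hand-written scanner illustrates the three tasks above. The token set, the `//` comment syntax and the set of accepted symbols are assumptions chosen for this sketch; they are not the lexical syntax of MiniTriangle.

```haskell
import Data.Char (isAlpha, isAlphaNum, isDigit, isSpace)

-- Tokens of a hypothetical toy language.
data Token
  = TInt Int
  | TIdent String
  | TSym Char
  deriving (Show, Eq)

-- Scan a character sequence into tokens, or report the first lexical error.
scan :: String -> Either String [Token]
scan [] = Right []
scan ('/' : '/' : cs) = scan (dropWhile (/= '\n') cs)  -- discard line comment
scan s@(c : cs)
  | isSpace c = scan cs                                 -- discard white space
  | isDigit c = let (ds, rest) = span isDigit s         -- group digits
                in (TInt (read ds) :) <$> scan rest
  | isAlpha c = let (name, rest) = span isAlphaNum s    -- group identifier
                in (TIdent name :) <$> scan rest
  | c `elem` "+-*/()=;" = (TSym c :) <$> scan cs
  | otherwise = Left ("lexical error: unexpected character " ++ show c)
```

The `Left` case is the scanner's way of verifying lexical validity: a character that cannot start any token stops the scan with a diagnostic.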
Inside the Compiler (3)
• Syntactic Analysis/Parsing:
- Verify that the input program is syntactically valid, i.e. conforms to the Context Free Grammar of the language.
- Determine the program structure.
- Construct a representation of the program reflecting that structure without unnecessary details, usually an Abstract Syntax Tree (AST).
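As a sketch of these steps, here is a small recursive-descent parser for arithmetic expressions. The grammar (given in the comments), the `Token` and `Expr` types and all function names are assumptions made for illustration. Note how parentheses guide the parse but leave no trace in the resulting AST: they are exactly the kind of unnecessary detail that is dropped.

```haskell
-- Tokens and AST for a hypothetical toy expression language.
data Token = TInt Int | TIdent String | TSym Char
  deriving (Show, Eq)

data Expr
  = Lit Int
  | Var String
  | BinOp Char Expr Expr
  deriving (Show, Eq)

-- A parser consumes a prefix of the tokens and returns the rest.
type Parser a = [Token] -> Either String (a, [Token])

-- expr ::= term { ('+' | '-') term }
expr :: Parser Expr
expr ts = do
  (t, rest) <- term ts
  loop "+-" term t rest

-- term ::= factor { ('*' | '/') factor }
term :: Parser Expr
term ts = do
  (f, rest) <- factor ts
  loop "*/" factor f rest

-- Left-associative repetition: fold operands into the tree as they arrive.
loop :: String -> Parser Expr -> Expr -> Parser Expr
loop ops operand lhs (TSym o : ts)
  | o `elem` ops = do
      (rhs, rest) <- operand ts
      loop ops operand (BinOp o lhs rhs) rest
loop _ _ lhs ts = Right (lhs, ts)

-- factor ::= int | ident | '(' expr ')'
factor :: Parser Expr
factor (TInt n : ts)   = Right (Lit n, ts)
factor (TIdent x : ts) = Right (Var x, ts)
factor (TSym '(' : ts) = do
  (e, rest) <- expr ts
  case rest of
    TSym ')' : rest' -> Right (e, rest')
    _                -> Left "syntax error: expected ')'"
factor ts = Left ("syntax error: unexpected " ++ show (take 1 ts))

-- Parse a whole token sequence; left-over tokens are an error.
parse :: [Token] -> Either String Expr
parse ts = do
  (e, rest) <- expr ts
  if null rest
    then Right e
    else Left ("syntax error: unexpected " ++ show (take 1 rest))
```

Each grammar rule becomes one function, which is what makes recursive descent a natural first technique; the two-level `expr`/`term` split is what gives `*` higher precedence than `+`.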
Inside the Compiler (4)
• Contextual Analysis/Checking Static Semantics:
- Resolve meaning of symbols.
- Report undefined symbols.
- Type checking.
- . . .
• Optimization:
- Code improvements aiming at making it run faster and/or use less space, energy, etc.
- Almost always heuristics: cannot guarantee optimal result.
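The contextual analysis tasks can be sketched as a small type checker. The language below (integers, booleans, `let` and `if`) and its two types are hypothetical, chosen only to show symbols being resolved against an environment, undefined symbols being reported, and types being checked.

```haskell
-- AST and types of a hypothetical toy language.
data Type = TInt | TBool
  deriving (Show, Eq)

data Expr
  = IntLit Int
  | BoolLit Bool
  | Var String
  | Add Expr Expr
  | Less Expr Expr
  | If Expr Expr Expr
  | Let String Expr Expr        -- let x = e1 in e2
  deriving (Show, Eq)

-- The environment records the type of every symbol in scope.
-- Newer bindings are consed onto the front, so they shadow older ones.
type Env = [(String, Type)]

-- Compute the type of an expression, or report the first contextual error.
check :: Env -> Expr -> Either String Type
check _   (IntLit _)  = Right TInt
check _   (BoolLit _) = Right TBool
check env (Var x)     =
  maybe (Left ("undefined symbol: " ++ x)) Right (lookup x env)
check env (Add a b)   = do
  expect env TInt a
  expect env TInt b
  Right TInt
check env (Less a b)  = do
  expect env TInt a
  expect env TInt b
  Right TBool
check env (If c t e)  = do
  expect env TBool c
  tt <- check env t
  expect env tt e               -- both branches must have the same type
  Right tt
check env (Let x e body) = do
  te <- check env e
  check ((x, te) : env) body    -- x is in scope only inside the body

-- Check that an expression has the expected type.
expect :: Env -> Type -> Expr -> Either String ()
expect env want e = do
  got <- check env e
  if got == want
    then Right ()
    else Left ("type error: expected " ++ show want ++ ", got " ++ show got)
```

A syntactically valid program such as `Add (IntLit 1) (BoolLit True)` passes the parser but is rejected here, which is why this phase is separate from parsing.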
Inside the Compiler (5)
• Code Generation:
- Output the appropriate sequence of target language instructions.
- Might involve further low-level (target-specific) optimization.
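A sketch of code generation for commands rather than expressions shows one issue that expressions alone hide: control flow needs jump targets, so the generator has to invent fresh labels. The command language, the instruction set and the label-numbering scheme are all assumptions made for this example; no real machine or the module's own target is being described.

```haskell
-- Source: a hypothetical toy command language (assumed already checked).
data Expr = Lit Int | Var String | Sub Expr Expr
  deriving (Show, Eq)

data Cmd
  = Assign String Expr
  | While Expr Cmd          -- loop while the expression is non-zero
  | Seq [Cmd]
  deriving (Show, Eq)

-- Target: instructions for a hypothetical stack machine.
data Instr
  = PUSH Int
  | LOAD String
  | STORE String
  | SUB
  | LABEL Int
  | JUMP Int
  | JUMPIFZ Int             -- pop; jump if the popped value is zero
  deriving (Show, Eq)

-- Expressions leave their value on top of the stack.
genExpr :: Expr -> [Instr]
genExpr (Lit n)   = [PUSH n]
genExpr (Var x)   = [LOAD x]
genExpr (Sub a b) = genExpr a ++ genExpr b ++ [SUB]

-- Commands thread a counter through to generate fresh label numbers:
-- the Int argument is the next unused label, and the next unused label
-- after generation is returned alongside the code.
genCmd :: Int -> Cmd -> (Int, [Instr])
genCmd l (Assign x e) = (l, genExpr e ++ [STORE x])
genCmd l (While c body) =
  let start      = l
      end        = l + 1
      (l', code) = genCmd (l + 2) body
  in ( l'
     , [LABEL start] ++ genExpr c ++ [JUMPIFZ end]
       ++ code ++ [JUMP start, LABEL end] )
genCmd l (Seq cs) = foldl step (l, []) cs
  where
    step (l0, acc) c = let (l1, code) = genCmd l0 c in (l1, acc ++ code)

-- Generate code for a whole program, starting labels at 0.
generate :: Cmd -> [Instr]
generate = snd . genCmd 0
```

Threading the counter by hand keeps the sketch free of extra machinery; a state monad would be the more idiomatic way to hide it in a larger code generator.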
Inside the Compiler (6)
sequence of characters
→ scanner (Lexical Analysis) →
sequence of tokens
→ parser (Syntactic Analysis/Parsing) →
Abstract Syntax Tree (AST)
→ checker (Contextual Analysis/checking Static Semantics, e.g. Type Checking) →
Intermediate Representation (IR), e.g. verified/annotated AST
→ optimizer/code generator (Optimization and Code Generation, possibly many steps involving a number of intermediary representations) →
target code
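The pipeline can be read as function composition in which any phase may stop with an error diagnostic. The sketch below captures only that shape: `Phase` and `compiler` are hypothetical names, the intermediate types are left as type parameters, and the `toy...` phases are a deliberately trivial instantiation (a "program" is a list of numbers to be summed) included only so that the composition can be run.

```haskell
import Control.Monad ((>=>))

-- Each phase either fails with an error diagnostic or hands its
-- result to the next phase.
type Phase a b = a -> Either String b

-- Compose the four phases into one compiler. The intermediate types
-- are parameters: this fixes the shape of the pipeline only.
compiler
  :: Phase String tokens   -- scanner
  -> Phase tokens ast      -- parser
  -> Phase ast ir          -- checker
  -> Phase ir target       -- optimizer / code generator
  -> Phase String target
compiler scanner parser checker codegen =
  scanner >=> parser >=> checker >=> codegen

-- A deliberately trivial instantiation (illustration only).
toyScanner :: Phase String [String]
toyScanner = Right . words

toyParser :: Phase [String] [Int]
toyParser = mapM parseNum
  where
    parseNum w
      | not (null w) && all (`elem` "0123456789") w = Right (read w)
      | otherwise = Left ("syntax error: not a number: " ++ w)

toyChecker :: Phase [Int] [Int]
toyChecker ns
  | null ns   = Left "contextual error: empty program"
  | otherwise = Right ns

toyCodegen :: Phase [Int] [String]
toyCodegen ns =
  Right (map (("PUSH " ++) . show) ns ++ replicate (length ns - 1) "ADD")

toyCompiler :: Phase String [String]
toyCompiler = compiler toyScanner toyParser toyChecker toyCodegen
```

Because `>=>` short-circuits on the first `Left`, a lexical or syntax error means the later phases never run, matching the single error-diagnostics output in the diagram.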
Inside the Compiler (7)
• Front end: Scanner, Parser, Contextual checker. Depends more heavily on the source language.
• Middle section (“Middle end”): Optimizer. Operates on an Intermediary Representation (IR) that could be fairly independent of source and target language. Hence potentially reusable!
• Back end: Code Generator. Depends heavily on the target language.
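To make the middle end concrete, here is a sketch of one classic optimization, constant folding with a few algebraic simplifications, written against a small expression IR. The `IR` type and the function names are hypothetical. Nothing in the code refers to source syntax or target instructions, which is what makes a pass like this reusable across front ends and back ends.

```haskell
-- A hypothetical expression IR, independent of any particular
-- source syntax or target instruction set. Expressions are assumed
-- to be free of side effects.
data IR
  = Const Int
  | Var String
  | Add IR IR
  | Mul IR IR
  deriving (Show, Eq)

-- Constant folding plus a few algebraic simplifications.
-- Sub-expressions are simplified first, then the node itself.
simplify :: IR -> IR
simplify (Add a b) = add (simplify a) (simplify b)
simplify (Mul a b) = mul (simplify a) (simplify b)
simplify e         = e

-- "Smart constructor" for addition: fold constants, drop zeros.
add :: IR -> IR -> IR
add (Const x) (Const y) = Const (x + y)
add (Const 0) e         = e
add e         (Const 0) = e
add a         b         = Add a b

-- "Smart constructor" for multiplication: fold constants,
-- drop ones, collapse products with zero.
mul :: IR -> IR -> IR
mul (Const x) (Const y) = Const (x * y)
mul (Const 1) e         = e
mul e         (Const 1) = e
mul (Const 0) _         = Const 0
mul _         (Const 0) = Const 0
mul a         b         = Mul a b
```

The pass is a heuristic in the sense used earlier in the lecture: it improves many expressions but does not find every simplification. For example, `Add (Add (Var "x") (Const 1)) (Const 2)` is left unchanged because the two constants are not siblings in the tree.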