CPSC 388 – Compiler Design and Construction Lecture: MWF 11:00am-12:20pm, Room 106 Colton.

27
CPSC 388 – Compiler Design and Construction Lecture: MWF 11:00am-12:20pm, Room 106 Colton

Transcript of CPSC 388 – Compiler Design and Construction Lecture: MWF 11:00am-12:20pm, Room 106 Colton.

Page 1: CPSC 388 – Compiler Design and Construction Lecture: MWF 11:00am-12:20pm, Room 106 Colton.

CPSC 388 – Compiler Design and Construction

Lecture: MWF 11:00am-12:20pm, Room 106 Colton

Page 2: CPSC 388 – Compiler Design and Construction Lecture: MWF 11:00am-12:20pm, Room 106 Colton.

Instructor

Louis OliphantOffice: 111 ColtonE-mail: [email protected] Hours:

MWF 2:35pm-4:35pmTH 4pm-5pm(Open Door Policy)

Page 3: CPSC 388 – Compiler Design and Construction Lecture: MWF 11:00am-12:20pm, Room 106 Colton.

Course Description

An intense treatment of the theoretical and practical considerations involved in implementing translators for high-level programming languages. Students will design and implement parts of a compiler for a high level language.

Prerequisites: CPSC 171 and CPSC 172 or permission.

Page 4: CPSC 388 – Compiler Design and Construction Lecture: MWF 11:00am-12:20pm, Room 106 Colton.

Course Web Page

www.cs.hiram.edu/~oliphantLT/cpsc388

Page 5: CPSC 388 – Compiler Design and Construction Lecture: MWF 11:00am-12:20pm, Room 106 Colton.

TextbookCompilers: Principles, Techniques, & Tools

Second Edition Alfred V. Aho, Monica S. Lam, Ravi Sethi, & Jeffrey D. Ullman

Page 6: CPSC 388 – Compiler Design and Construction Lecture: MWF 11:00am-12:20pm, Room 106 Colton.

Grading

Midterm: 15% Final: 15% Homeworks: 30%

Approximately weekly written assignments Programs: 40%

5 assignments building a compiler

Page 7: CPSC 388 – Compiler Design and Construction Lecture: MWF 11:00am-12:20pm, Room 106 Colton.

Reading Assignment

Chapter 1

Homework AssignmentRead paper on FORTRAN and write 1 page report. (see

webpage for details)Due: Sept 4th at beginning of class

Page 8: CPSC 388 – Compiler Design and Construction Lecture: MWF 11:00am-12:20pm, Room 106 Colton.

What is a Compiler?

CompilerSource Language

LTarget Language

L’

A compiler translates text from a source language, L, toA target language, L’.

JavaC++FORTRAN

JVMIntel PentiumMIPSIBM’s CELL

Page 9: CPSC 388 – Compiler Design and Construction Lecture: MWF 11:00am-12:20pm, Room 106 Colton.

What is a Compiler?Source Language

L(sequence of characters)

Lexical Analyzer(Scanner)

Sequence of Tokens

Syntax Analyzer(Parser)

Abstract Syntax Tree

Symantic AnalyzerAugmented, AnnotatedAbstract Syntax Tree

Intermediate CodeGenerator

Intermediate Code

OptimizerOptimized

Intermediate Code

Target LanguageL’

Assembly CodeMachine Code

Code Generator

Page 10: CPSC 388 – Compiler Design and Construction Lecture: MWF 11:00am-12:20pm, Room 106 Colton.

The Scanner Reads characters from the source program. Groups characters into lexemes

sequences of characters that "go together" Lexemes corresponds to a tokens; scanner

returns next token (plus maybe some additional information) to the parser.

Scanner also discovers lexical errors (e.g., erroneous characters such as # in java).

Page 11: CPSC 388 – Compiler Design and Construction Lecture: MWF 11:00am-12:20pm, Room 106 Colton.

Example Lexemes and Tokens

Lexeme: ;

SEMI-COLON

=

ASSIGN

index

IDENT

tmp

IDENT

21

INT-LIT

64.32

INT-FLOATToken:

Source Code:

position = initial + rate * 60 ;

Corresponding Tokens:

IDENT ASSIGN IDENT PLUS IDENT TIMES INT-LIT SEMI-COLON

Page 12: CPSC 388 – Compiler Design and Construction Lecture: MWF 11:00am-12:20pm, Room 106 Colton.

The Parser Groups tokens into "grammatical phrases", discovering the

underlying structure of the source program.

Finds syntax errors.For example, in Java the source code position = * 5 ;corresponds to the sequence of tokens: IDENT ASSIGN TIMES INT-LIT SEMI-COLONAll are legal tokens, but that sequence of tokens is erroneous.

Might find some "static semantic" errors, e.g., a use of an undeclared variable, or variables that are multiply declared.

Might generate code, or build some intermediate representation of the program such as an abstract-syntax tree.

Page 13: CPSC 388 – Compiler Design and Construction Lecture: MWF 11:00am-12:20pm, Room 106 Colton.

Example ParseSource Code:

position = initial + rate * 60 ;

Abstract-Syntax Tree: =

position +

initial *

rate 60

•Interior nodes are operators. •A node's children are operands. •Each subtree forms "logical unit"

e.g., the subtree with * at its root shows thatbecause multiplication has higher precedencethan addition, this operation must be performedas a unit (not initial+rate).

Page 14: CPSC 388 – Compiler Design and Construction Lecture: MWF 11:00am-12:20pm, Room 106 Colton.

Semantic Analyzer

Checks for (more) "static semantic" errors

Annotate and/or change the abstract syntax tree

Page 15: CPSC 388 – Compiler Design and Construction Lecture: MWF 11:00am-12:20pm, Room 106 Colton.

Example Symantic Analysis

Abstract-Syntax Tree:

=

position +

initial *

rate 60

Annotated Abstract-Syntax Tree:

= (float)

position + (float)

initial * (float)

rate

60

intToFloat (float)

Page 16: CPSC 388 – Compiler Design and Construction Lecture: MWF 11:00am-12:20pm, Room 106 Colton.

Intermediate Code Generator Translates from abstract-syntax tree to

intermediate code One possibility is 3-address code

each instruction involves at most 3 operandsExample:temp1 = inttofloat(60)temp2 = rate * temp1temp3 = initial + temp2position = temp3

Page 17: CPSC 388 – Compiler Design and Construction Lecture: MWF 11:00am-12:20pm, Room 106 Colton.

Optimizer

Tries to improve code to Run faster Be smaller Consume less energy

Page 18: CPSC 388 – Compiler Design and Construction Lecture: MWF 11:00am-12:20pm, Room 106 Colton.

Try Optimizing This (for speed)int sumcalc(int a, int b, int N){

int i;int x, y;x = 0;y = 0;for(i=0; i<=N; i++) {

x=x+(4*a/b)*i+(i+1)*(i+1);x=x+b*y;

}return x;

}

Page 19: CPSC 388 – Compiler Design and Construction Lecture: MWF 11:00am-12:20pm, Room 106 Colton.

Some Types of Optimization

Constant Propagation Algebraic Simplification Copy Propagation Common Sub-expression Elimination Dead Code Elimination Loop Invariant Removal Strength Reduction

Page 20: CPSC 388 – Compiler Design and Construction Lecture: MWF 11:00am-12:20pm, Room 106 Colton.

Code GeneratorGenerate object code from (optimized) intermediate codeExample:.datac1:

.float 60.0.text

l.s $f0,ratemul.s $f0,c1l.s $f2,initialadd.s $f0,$f0,$f2s.s $f0,position

Page 21: CPSC 388 – Compiler Design and Construction Lecture: MWF 11:00am-12:20pm, Room 106 Colton.

Symbol Tables

Lexical Analyzer(Scanner)

Syntax Analyzer(Parser)

Symantic Analyzer

Intermediate CodeGenerator

Optimizer

Code Generator

Symbol Table

Page 22: CPSC 388 – Compiler Design and Construction Lecture: MWF 11:00am-12:20pm, Room 106 Colton.

Symbol Tables Keep track of names declared in the

program Separate level for each scope Used to analyze static symantics:

Variables should not be declared more than once in a scope

Variables should not be used before being declared

Parameter types for methods should match method declaration

Page 23: CPSC 388 – Compiler Design and Construction Lecture: MWF 11:00am-12:20pm, Room 106 Colton.

Compiler Modularity

Lexical Analyzer(Scanner)

Syntax Analyzer(Parser)

Symantic Analyzer

Intermediate CodeGenerator

Optimizer

Code Generator

Front End

Back End

Page 24: CPSC 388 – Compiler Design and Construction Lecture: MWF 11:00am-12:20pm, Room 106 Colton.

Many Compilers

Java

C

.Net

FORTRAN

JVM

Intel Pentium

IBM Cell

Motorola Processor

Page 25: CPSC 388 – Compiler Design and Construction Lecture: MWF 11:00am-12:20pm, Room 106 Colton.

Many Compilers

Java

C

.Net

FORTRAN

JVM

Intel Pentium

IBM Cell

Motorola Processor

IntermediateCode

Optimization

Page 26: CPSC 388 – Compiler Design and Construction Lecture: MWF 11:00am-12:20pm, Room 106 Colton.

Summary Compilers Translate Source Language to Target

Language Compilers have several steps

Scanner Parser Semantic Analyzer Intermediate Code Generator Optimizer Code Generator

Symbol Table Used To Keep Track of Names Used in Program

Front End and Back End Simplify Compiler Design Introduction of new languages Introduction of new hardware

Page 27: CPSC 388 – Compiler Design and Construction Lecture: MWF 11:00am-12:20pm, Room 106 Colton.

Reading Assignment

Chapter 1

Homework AssignmentRead paper on FORTRAN and write 1 page report. (see

webpage for details)Due: Sept 4th at beginning of class