CH1.1 CSE244 Chapter 1: Introduction to Compiling Prof. Steven A. Demurjian, Sr. Computer Science &...

Post on 18-Jan-2018


CH1.1

CSE244

Chapter 1: Introduction to Compiling

Prof. Steven A. Demurjian, Sr.
Computer Science & Engineering Department
The University of Connecticut
191 Auditorium Road, Box U-155
Storrs, CT 06269-3155
steve@engr.uconn.edu
http://www.engr.uconn.edu/~steve
(860) 486 - 4818

Dr. Robert LaBarre
United Technologies Research Center
411 Silver Lane
E. Hartford, CT 06018
LaBarrRE@utrc.utc.com
labarre_math@hotmail.com

CH1.2

Introduction to Compilers

As a Discipline, Involves Multiple CSE Areas
- Programming Languages and Algorithms
- Software Engineering & Theory / Foundations
- Computer Architecture & Operating Systems

But, Has Surprisingly Simplistic Intent:

Source program --> Compiler --> Target Program
                      |
                      v
                Error messages

Diverse & Varied

CH1.3

Classifications of Compilers

Compilers Viewed from Many Perspectives
- Construction: Single Pass, Multiple Pass, Load & Go
- Functional: Debugging, Optimizing

However, All utilize same basic tasks to accomplish their actions

CH1.4

Classifications of Compilers

Also, Broadly Categorized as:
- Analysis: Decompose Source into an intermediate representation
- Synthesis: Target program generation from representation

We Will Discuss Each Category in This Class

CH1.5

Important Notes

In Today’s Technology, Analysis Is Often Performed by Software Tools - This Wasn’t the Case in Early CSE Days
- Structure / Syntax directed editors: Force “syntactically” correct code to be entered
- Pretty Printers: Standardized version for program structure (i.e., blank space, indenting, etc.)
- Static Checkers: A “quick” compilation to detect rudimentary errors
- Interpreters: “real” time execution of code a “line-at-a-time”

CH1.6

Important Notes

Compilation Is Not Limited to Programming Language Applications
- Text Formatters
    LATEX & TROFF Are Languages Whose Commands Format Text
- Silicon Compilers
    Textual / Graphical: Take Input and Generate Circuit Design
- Database Query Processors
    Database Query Languages Are Also a Programming Language
    Input Is “compiled” Into a Set of Operations for Accessing the Database

CH1.7

The Many Phases of a Compiler

Source Program
  1. Lexical Analyzer
  2. Syntax Analyzer
  3. Semantic Analyzer
  4. Intermediate Code Generator
  5. Code Optimizer
  6. Code Generator
Target Program

Alongside the phases: Symbol-table Manager, Error Handler

1, 2, 3 : Analysis - Our Focus
4, 5, 6 : Synthesis

CH1.8

The Analysis Task For Compilation

Three Phases:
- Linear / Lexical Analysis: Left-to-right Scan to Identify Tokens
- Hierarchical Analysis: Grouping of Tokens Into Meaningful Collections
- Semantic Analysis: Checking to Ensure Correctness of Components

CH1.9

Phase 1. Lexical Analysis

Easiest Analysis - Identify tokens which are building blocks

For Example:

Position := initial + rate * 60 ;
________ __ _______ _ ____ _ __ _

All are tokens

Blanks, Line breaks, etc. are scanned out
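The scan described on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions: the token class names (NUMBER, ID, ASSIGN, OP, SEMI) and the regular expressions are invented here and do not come from the slides.

```python
import re

# Token classes for the slide's example; the names are illustrative assumptions.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("ID", r"[A-Za-z_]\w*"),
    ("ASSIGN", r":="),
    ("OP", r"[+*]"),
    ("SEMI", r";"),
    ("SKIP", r"\s+"),  # blanks and line breaks are scanned out
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(text):
    """Left-to-right scan that returns (kind, lexeme) pairs."""
    tokens, pos = [], 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

tokens = tokenize("Position := initial + rate * 60 ;")
```

For the example statement the scan yields eight tokens, one per underlined group, and the blanks leave no trace in the result.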

CH1.10

Phase 2. Hierarchical Analysis
aka Parsing or Syntax Analysis

For previous example, we would have Parse Tree:

assignment statement
    identifier: position
    :=
    expression
        expression
            identifier: initial
        +
        expression
            expression
                identifier: rate
            *
            expression
                number: 60

Nodes of tree are constructed using a grammar for the language

CH1.11

What is a Grammar?

Grammar is a Set of Rules Which Govern the Interdependencies & Structure Among the Tokens

statement is an assignment statement, or while statement, or if statement, or ...

assignment statement is an identifier := expression ;

expression is an (expression), or expression + expression, or expression * expression, or number, or identifier, or ...
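One way to turn such rules into code is a recursive-descent parser, sketched below. Two assumptions are made that the slide does not state: the ambiguous expression rule is split into expression / term / factor levels so that * binds tighter than +, and the tree is returned as nested tuples (a compact form that omits the chain of single-child expression nodes). The token kinds are the hypothetical ones used for illustration, including LPAREN and RPAREN.

```python
def parse_assignment(tokens):
    """assignment statement -> identifier := expression ;"""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("EOF", "")

    def expect(kind):
        nonlocal pos
        tok = peek()
        if tok[0] != kind:
            raise SyntaxError(f"expected {kind}, found {tok[1]!r}")
        pos += 1
        return tok[1]

    def factor():
        # factor -> number | identifier | ( expression )
        kind, _ = peek()
        if kind == "NUMBER":
            return ("number", expect("NUMBER"))
        if kind == "ID":
            return ("identifier", expect("ID"))
        expect("LPAREN")
        node = expression()  # recursion handles arbitrary nesting
        expect("RPAREN")
        return node

    def term():
        # term -> factor { * factor }
        node = factor()
        while peek() == ("OP", "*"):
            expect("OP")
            node = ("*", node, factor())
        return node

    def expression():
        # expression -> term { + term }
        node = term()
        while peek() == ("OP", "+"):
            expect("OP")
            node = ("+", node, term())
        return node

    target = expect("ID")
    expect("ASSIGN")
    value = expression()
    expect("SEMI")
    return (":=", ("identifier", target), value)

tokens = [("ID", "position"), ("ASSIGN", ":="), ("ID", "initial"), ("OP", "+"),
          ("ID", "rate"), ("OP", "*"), ("NUMBER", "60"), ("SEMI", ";")]
tree = parse_assignment(tokens)
```

The resulting tree places rate * 60 below the + node, which is the grouping the parse tree for the example shows.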

CH1.12

Why Have We Divided Analysis in This Manner?

- Lexical Analysis - Scans Input & Its Linear Actions Are Not Recursive
    Identify Only Individual “words” that are the Tokens of the Language
- Recursion Is Required to Identify Structure of an Expression, As Indicated in Parse Tree
    Verify that the “words” are Correctly Assembled into “sentences”
- What is Third Phase?
    Determine Whether the Sentences have One and Only One Unambiguous Interpretation
    “John Took Picture of Mary Out on the Patio”

CH1.13

Phase 3. Semantic Analysis

Find More Complicated Semantic Errors and Support Code Generation

Parse Tree Is Augmented With Semantic Actions

Compressed Tree:

:=
    position
    +
        initial
        *
            rate
            60

Conversion Action:

:=
    position
    +
        initial
        *
            rate
            inttoreal
                60
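The conversion action can be sketched as a tree walk that compares operand types and wraps the integer side in an inttoreal node. The tuple tree layout and the TYPES table are assumptions made for this illustration; the slides do not say which types the three identifiers have, so all three are taken to be real here.

```python
# Assumed declared types; the slides leave the symbol-table contents elided.
TYPES = {"position": "real", "initial": "real", "rate": "real"}

def annotate(node):
    """Return (typed_node, type), wrapping integer operands of real operations."""
    tag = node[0]
    if tag == "number":
        return node, ("real" if "." in node[1] else "int")
    if tag == "identifier":
        return node, TYPES[node[1]]
    op, left, right = node
    left, ltype = annotate(left)
    right, rtype = annotate(right)
    if ltype != rtype:
        if op == ":=" and ltype == "int":
            raise TypeError("cannot assign a real value to an integer variable")
        # Mixed int/real operands: convert the integer side to real.
        if ltype == "int":
            left = ("inttoreal", left)
        else:
            right = ("inttoreal", right)
        return (op, left, right), "real"
    return (op, left, right), ltype

tree = (":=", ("identifier", "position"),
        ("+", ("identifier", "initial"),
         ("*", ("identifier", "rate"), ("number", "60"))))
typed_tree, result_type = annotate(tree)
```

Only the literal 60 is wrapped, because it is the single integer operand that meets a real one.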

CH1.14

Phase 3. Semantic Analysis

Most Important Activity in This Phase: Type Checking - Legality of Operands

Many Different Situations:

Real := int + char ;

A[int] := A[real] + int ;

while char <> int do

…. Etc.
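The three situations can be checked with a few small rules. The rules below are assumptions in the spirit of a Pascal-like language (arithmetic needs numeric operands, subscripts must be integers, comparisons need compatible operands); the slides do not define them, and the function names are hypothetical.

```python
NUMERIC = {"int", "real"}

def arithmetic(left, right):
    """Result type of left + right, or TypeError if an operand is not numeric."""
    if left not in NUMERIC or right not in NUMERIC:
        raise TypeError(f"cannot add {left} and {right}")
    return "real" if "real" in (left, right) else "int"

def index(subscript):
    """Array subscripts must be integers."""
    if subscript != "int":
        raise TypeError(f"array index has type {subscript}, expected int")

def compare(left, right):
    """Relational operators need operands of the same or of numeric types."""
    if left != right and not {left, right} <= NUMERIC:
        raise TypeError(f"cannot compare {left} with {right}")
    return "boolean"

def error_in(check, *args):
    """Run one check and return its error message, or None if it passes."""
    try:
        check(*args)
        return None
    except TypeError as exc:
        return str(exc)

report = [
    error_in(arithmetic, "int", "char"),  # Real := int + char ;
    error_in(index, "real"),              # A[int] := A[real] + int ;
    error_in(compare, "char", "int"),     # while char <> int do
    error_in(arithmetic, "int", "real"),  # legal mixed arithmetic
]
```

Under these assumed rules the first three entries of report hold error messages and the last is None.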

CH1.15

Analysis in Text Formatting

Simple Commands : LATEX

\begin{single}

\end{single}

\noindent

\section{Introduction}

$A_i$

$A_{i_j}$

Embedded in a stream of text, i.e., a FILE

\ and $ serve as signals to LATEX

Language Commands: begin, single, noindent, section

What are tokens?

What is hierarchical structure?

What kind of semantic analysis is required?
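One possible answer to the token question, sketched under assumptions: treat a backslash followed by letters as a command token, braces and $ as delimiter tokens, and everything else as plain text. The four token names are invented for this sketch, and real LATEX tokenization has more cases (for example, a backslash followed by a non-letter, which this pattern silently skips).

```python
import re

# \name is a command, { } delimit arguments, $ toggles math, the rest is text.
LATEX_TOKEN = re.compile(
    r"\\(?P<command>[A-Za-z]+)|(?P<brace>[{}])|(?P<math>\$)|(?P<text>[^\\{}$]+)"
)

def latex_tokens(stream):
    """Split a stream of text into (kind, lexeme) pairs."""
    return [(m.lastgroup, m.group(m.lastgroup)) for m in LATEX_TOKEN.finditer(stream)]

tokens = latex_tokens(r"\section{Introduction} Let $A_i$ be")
```

The hierarchical structure then comes from pairing each opening brace, \begin, or $ with its matching close, which needs the same kind of recursion as expression parsing.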

CH1.16

Supporting Phases/ Activities for Analysis

Symbol Table Creation / Maintenance
- Contains Info on Each “Meaningful” Token, Typically Identifiers
- Data Structure Created / Initialized During Lexical Analysis
- Utilized / Updated During Later Analysis & Synthesis

Error Handling
- Detection of Different Errors Which Correspond to All Phases
- What Kinds of Errors Are Found During the Analysis Phase?
- What Happens When an Error Is Found?
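A symbol table of this kind can be sketched as a dictionary keyed by identifier. The class, its method names, and the record fields ("id", "type") are assumptions for illustration; the slides only say that the table is created during lexical analysis and updated later.

```python
class SymbolTable:
    """Maps each identifier to a record that later phases can fill in."""

    def __init__(self):
        self.entries = {}

    def insert(self, name):
        # Called by the lexical analyzer; repeats return the existing entry.
        if name not in self.entries:
            self.entries[name] = {"id": f"id{len(self.entries) + 1}", "type": None}
        return self.entries[name]

    def set_type(self, name, type_name):
        # Called during later analysis, e.g. when a declaration is processed.
        self.entries[name]["type"] = type_name

    def lookup(self, name):
        if name not in self.entries:
            raise KeyError(f"undeclared identifier {name!r}")
        return self.entries[name]

table = SymbolTable()
for lexeme in ["position", "initial", "rate", "rate"]:
    table.insert(lexeme)
table.set_type("rate", "real")
```

A failed lookup is one example of an error detected during analysis; here it is reported by raising an exception, though a real compiler would usually record the error and keep going.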


CH1.18

The Synthesis Task For Compilation

Intermediate Code Generation
- Abstract Machine Version of Code - Independent of Architecture
- Easy to Produce and Do Final, Machine Dependent Code Generation

Code Optimization
- Find More Efficient Ways to Execute Code
- Replace Code With More Optimal Statements
- 2-approaches: High-level Language & “Peephole” Optimization

Final Code Generation
- Generate Relocatable Machine Dependent Code
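A peephole-style pass can be sketched on a small list of instructions. The tuple form (destination, operator, argument, argument) is an assumption made here, and the second rewrite is only safe if each temporary is used once, which is assumed rather than checked. The pass keeps the original temporary names instead of renumbering them.

```python
def optimize(code):
    """Fold inttoreal of constants, then merge 'x := temp' into temp's definition."""
    constants, folded = {}, []
    for dest, op, a, b in code:
        a, b = constants.get(a, a), constants.get(b, b)
        if op == "inttoreal" and a.isdigit():
            constants[dest] = a + ".0"  # conversion done at compile time
        else:
            folded.append((dest, op, a, b))
    result = []
    for dest, op, a, b in folded:
        # 'x := temp' right after the instruction defining temp: write x directly.
        if op == "copy" and result and result[-1][0] == a and a.startswith("temp"):
            prev = result.pop()
            result.append((dest,) + prev[1:])
        else:
            result.append((dest, op, a, b))
    return result

code = [
    ("temp1", "inttoreal", "60", None),
    ("temp2", "*", "id3", "temp1"),
    ("temp3", "+", "id2", "temp2"),
    ("id1", "copy", "temp3", None),
]
optimized = optimize(code)
```

Four instructions shrink to two: a multiplication by the constant 60.0 and an addition that stores directly into id1.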

CH1.19

Reviewing the Entire Process

position := initial + rate * 60

    [lexical analyzer]

id1 := id2 + id3 * 60

    [syntax analyzer]

:=
    id1
    +
        id2
        *
            id3
            60

    [semantic analyzer]

:=
    id1
    +
        id2
        *
            id3
            inttoreal
                60

    [intermediate code generator]

Errors

Symbol Table

position ....

initial ….

rate….

CH1.20

Reviewing the Entire Process

    [intermediate code generator]

temp1 := inttoreal(60)
temp2 := id3 * temp1
temp3 := id2 + temp2
id1 := temp3

    [code optimizer]

temp1 := id3 * 60.0
id1 := id2 + temp1

    [final code generator]

movf id3, r2
mulf #60.0, r2
movf id2, r1
addf r2, r1
movf r1, id1

Errors

Symbol Table

position ....

initial ….

rate….
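The intermediate code generator step can be sketched as a post-order walk over the tree produced by semantic analysis, emitting one statement per operator. The tuple tree layout and the ids mapping (identifier name to symbol-table label) are assumptions made for this sketch.

```python
import itertools

def generate(tree, ids):
    """Emit three-address statements for an assignment tree, one operator each."""
    code, counter = [], itertools.count(1)

    def walk(node):
        tag = node[0]
        if tag == "identifier":
            return ids[node[1]]
        if tag == "number":
            return node[1]
        # Operands first (post-order), then a fresh temporary for this operator.
        args = [walk(child) for child in node[1:]]
        temp = f"temp{next(counter)}"
        if tag == "inttoreal":
            code.append(f"{temp} := inttoreal({args[0]})")
        else:
            code.append(f"{temp} := {args[0]} {tag} {args[1]}")
        return temp

    _, target, value = tree
    dest = walk(target)
    result = walk(value)
    code.append(f"{dest} := {result}")
    return code

ids = {"position": "id1", "initial": "id2", "rate": "id3"}
tree = (":=", ("identifier", "position"),
        ("+", ("identifier", "initial"),
         ("*", ("identifier", "rate"), ("inttoreal", ("number", "60")))))
code = generate(tree, ids)
```

For the example tree this produces the same four statements as the intermediate code shown on this slide, in the same order.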

CH1.21

Compiler Cousins: Preprocessors

Provide Input to Compilers

1. Macro Processing

#define in C: does text substitution before compiling

#define X 3

#define Y A*B+C

#define Z getchar()
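The text substitution can be imitated in a few lines. This is a simplified sketch, not the real C preprocessor: it handles only object-like macros, does not rescan replaced text, and does not protect string literals or comments. The function name and the sample program are hypothetical.

```python
import re

def expand_macros(source):
    """Apply object-like '#define NAME body' lines to the rest of the text."""
    macros, output = {}, []
    for line in source.splitlines():
        parts = line.split(None, 2)
        if parts and parts[0] == "#define":
            macros[parts[1]] = parts[2] if len(parts) > 2 else ""
            continue
        # Replace whole identifiers only, so X does not match inside MAX.
        output.append(re.sub(r"[A-Za-z_]\w*",
                             lambda m: macros.get(m.group(), m.group()), line))
    return "\n".join(output)

program = ("#define X 3\n"
           "#define Y A*B+C\n"
           "#define Z getchar()\n"
           "int n = X; int m = 2*Y; c = Z; MAX = X;")
expanded = expand_macros(program)
```

Because the substitution is purely textual, 2*Y becomes 2*A*B+C rather than 2*(A*B+C), which is why macro bodies are usually written with parentheses.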

CH1.22

2. File Inclusion

#include in C - bring in another file before compiling

defs.h:

//////////////////

main.c:

#include "defs.h"

…---…---…---…---…---…---…---…---…---

main.c after preprocessing:

//////////////////

…---…---…---…---…---…---…---…---…---
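File inclusion can be sketched as a recursive text splice. To keep the example self-contained, the files are held in a dictionary instead of being read from disk; that choice, the function name, and the sample contents are assumptions, and only the quoted form of #include is handled.

```python
def include_files(name, files, seen=()):
    """Return files[name] with each '#include "x"' line replaced by x's text."""
    if name in seen:
        raise ValueError(f"circular include of {name}")
    lines = []
    for line in files[name].splitlines():
        stripped = line.strip()
        if stripped.startswith("#include"):
            target = stripped[len("#include"):].strip().strip('"')
            # Included files may themselves contain #include lines.
            lines.append(include_files(target, files, seen + (name,)))
        else:
            lines.append(line)
    return "\n".join(lines)

files = {
    "defs.h": "#define X 3",
    "main.c": '#include "defs.h"\nint n = X;',
}
merged = include_files("main.c", files)
```

The compiler proper then sees a single stream of text in which the contents of defs.h come first.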

CH1.23

3. Rational Preprocessors

Augment “Old” Languages With Modern Constructs

Add Macros for If - Then, While, Etc.

#define Can Make C Code More Pascal-like

#define begin {

#define end }

#define then

CH1.24


4. Language Extensions for a Database System

EQUEL - Database query language embedded in C

## Retrieve (DN=Department.Dnum) where

## Department.Dname = ‘Research’

is Preprocessed into:

ingres_system(“Retr…..Research’”,____,____);

a procedure call in a programming language.

CH1.25

The Grouping of Phases

Front End : Analysis + Intermediate Code Generation

vs.

Back End : Code Generation + Optimization

Number of Passes:
- Single - Preferred
- Multiple - Easier, but less efficient

Tradeoffs ……..

CH1.26


Compiler Construction Tools

Parser Generators : Produce Syntax Analyzers

Scanner Generators : Produce Lexical Analyzers

Syntax-directed Translation Engines : Generate Intermediate Code

Automatic Code Generators : Generate Actual Code

Data-Flow Engines : Support Optimization