Introduction to Compilers Professor Yihjia Tsai 2006 Spring Tamkang University.

Introduction to Compilers

Professor Yihjia Tsai

2006 Spring

Tamkang University

What is a compiler?

• Translates source code to target code
  – Source code is typically a high-level programming language (Java, C++, etc.), but does not have to be
  – Target code is often a low-level language like assembly or machine code, but does not have to be
• Can you think of other compilers that you have used, according to this definition?

Before we begin

• A-Z, a-z, 0-9
• " double quote
• # hash
• $ dollar sign
• % percent
• & ampersand
• ' single quote
• ( left parenthesis
• ) right parenthesis
• * star
• + plus
• , comma
• - hyphen, minus
• / slash
• : colon
• ; semicolon
• < less than
• = equal

Symbols

• > greater than
• ? question mark
• @ at sign
• [ left (open) square bracket
• \ backslash
• ] right (close) square bracket
• ^ caret, power
• _ underscore
• ` back quote
• { open brace
• | or
• } close brace
• ~ tilde
• . period, dot
• • bullet

Greek symbols

α alpha, β beta, γ gamma, δ delta, ε epsilon, φ phi, ζ zeta, θ theta, ι iota, κ kappa, λ lambda

μ mu, ν nu, ξ xi, π pi, ρ rho, σ sigma, τ tau, χ chi, ψ psi, η eta, ω omega

Other Compilers

• Javadoc -> HTML
• XML -> HTML
• SQL query output -> Table
• PostScript -> PDF
• High-level description of a circuit -> machine instructions to fabricate the circuit

The Compilation Process

The Analysis Stage

• Broken up into four phases
  – Lexical analysis (also called scanning or tokenization)
  – Parsing
  – Semantic analysis
  – Intermediate code generation

Lexing Example

double d1;
double d2;
d2 = d1 * 2.0;

Lexeme    Token              Note
double    TOK_DOUBLE         reserved word
d1        TOK_ID             variable name
;         TOK_PUNCT          has value of ";"
double    TOK_DOUBLE         reserved word
d2        TOK_ID             variable name
;         TOK_PUNCT          has value of ";"
d2        TOK_ID             variable name
=         TOK_OPER           has value of "="
d1        TOK_ID             variable name
*         TOK_OPER           has value of "*"
2.0       TOK_FLOAT_CONST    has value of 2.0
;         TOK_PUNCT          has value of ";"
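A minimal sketch of how a scanner could produce the token stream above, written in Python for brevity. The token names come from the slide; the regular expressions, the `TOKEN_SPEC` table, and the `lex` function are assumptions made for illustration and are not the course's implementation.

```python
import re

# Token names follow the slide; the patterns are assumptions of this sketch.
# Order matters: the reserved word must be tried before the identifier rule.
TOKEN_SPEC = [
    ("TOK_FLOAT_CONST", r"\d+\.\d+"),
    ("TOK_DOUBLE", r"\bdouble\b"),
    ("TOK_ID", r"[A-Za-z_]\w*"),
    ("TOK_OPER", r"[=*+\-/]"),
    ("TOK_PUNCT", r";"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def lex(source):
    """Return a list of (token, lexeme) pairs for the given source text."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":  # whitespace separates lexemes but is not a token
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


for token, lexeme in lex("double d1;double d2;d2 = d1 * 2.0;"):
    print(f"{lexeme:8} {token}")
```

Each pair holds the token (the category) and the lexeme (the exact characters matched), which is the distinction the table above draws.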

Syntax and Semantics

• Syntax: the form or structure of expressions, that is, whether an expression is well formed

• Semantics: the meaning of an expression

Syntactic Structure

• Syntax is almost always expressed using some variant of a notation called a context-free grammar (CFG), or simply a grammar
  – BNF
  – EBNF

A CFG has 4 parts

• A set of tokens (lexemes), known as terminal symbols
• A set of non-terminals
• A set of rules (productions), where each production consists of a left-hand side (LHS) and a right-hand side (RHS). The LHS is a non-terminal and the RHS is a sequence of terminals and/or non-terminal symbols.
• A special non-terminal symbol designated as the start symbol
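The four parts can be written down directly as a data structure. The sketch below is one possible encoding, not a standard API: the `Grammar` class, its field names, the `check` method, and the toy grammar are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    terminals: frozenset      # part 1: the tokens
    nonterminals: frozenset   # part 2
    productions: tuple        # part 3: pairs (lhs, rhs), rhs is a tuple of symbols
    start: str                # part 4: the start symbol

    def check(self):
        """Return a list of problems; an empty list means the parts are consistent."""
        problems = []
        if self.terminals & self.nonterminals:
            problems.append("terminals and non-terminals overlap")
        if self.start not in self.nonterminals:
            problems.append("start symbol is not a non-terminal")
        symbols = self.terminals | self.nonterminals
        for lhs, rhs in self.productions:
            if lhs not in self.nonterminals:
                problems.append(f"LHS {lhs!r} is not a non-terminal")
            for symbol in rhs:
                if symbol not in symbols:
                    problems.append(f"unknown symbol {symbol!r} in RHS of {lhs!r}")
        return problems


# Hypothetical toy grammar: a list of one or more 'x' tokens.
toy = Grammar(
    terminals=frozenset({"x"}),
    nonterminals=frozenset({"List"}),
    productions=(("List", ("x",)), ("List", ("x", "List"))),
    start="List",
)
print(toy.check())
```

The check encodes the rule stated above: every LHS must be a non-terminal, and every RHS symbol must be a known terminal or non-terminal.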

An example of BNF syntax for real numbers

<r>  ::= <ds> . <ds>
<ds> ::= <d> | <d> <ds>
<d>  ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

< >  encloses non-terminal symbols
::=  'is' or 'is made up of' or 'derives' (sometimes denoted with an arrow ->)
|    or
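As a rough illustration, each non-terminal of this grammar can be turned into one function that either consumes input or fails. The function names (`parse_d`, `parse_ds`, `parse_r`, `is_real`) and the convention of returning the next position or `None` are assumptions of this sketch.

```python
DIGITS = set("0123456789")


def parse_d(text, pos):
    """<d> ::= 0 | 1 | ... | 9. Return the position after the digit, or None."""
    if pos < len(text) and text[pos] in DIGITS:
        return pos + 1
    return None


def parse_ds(text, pos):
    """<ds> ::= <d> | <d> <ds>. Consume one digit, then as many more as follow."""
    after_digit = parse_d(text, pos)
    if after_digit is None:
        return None
    rest = parse_ds(text, after_digit)  # try the recursive alternative first
    return after_digit if rest is None else rest


def parse_r(text, pos=0):
    """<r> ::= <ds> . <ds>"""
    pos = parse_ds(text, pos)
    if pos is None or pos >= len(text) or text[pos] != ".":
        return None
    return parse_ds(text, pos + 1)


def is_real(text):
    """True when the whole string derives from <r>."""
    return parse_r(text) == len(text)


for candidate in ["3.14", "42", ".5", "1."]:
    print(candidate, is_real(candidate))
```

Note that the grammar requires digits on both sides of the point, so `42`, `.5`, and `1.` are all rejected.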

Example

• On the example from the previous slide:
  – What are the tokens?
  – What are the lexemes?
  – What are the non-terminals?
  – What are the productions?

Token vs. lexeme

• to·ken One that represents a group, as an employee whose presence is used to deflect from the employer criticism or accusations of discrimination.

• to·ken A basic, grammatically indivisible unit of a language such as a keyword, operator or identifier.

• lexeme A minimal unit (as a word or stem) in the lexicon of a language; `go' and `went' and `gone' and `going' are all members of the English lexeme `go'

• lexeme A minimal lexical unit of a language. Lexical analysis converts strings in a language into a list of lexemes. For a programming language, these word-like pieces would include keywords, identifiers, literals, and punctuation. The lexemes are then passed to the parser for syntactic analysis.

BNF Points

• A non-terminal can have more than one RHS, or an OR (|) can be used

• Lists or sequences are expressed via recursion

• A derivation is just a repeated application of productions (rules)

• Examples

Example Grammar

<program> -> <stmts>
<stmts>   -> <stmt> | <stmt> ; <stmts>
<stmt>    -> <var> = <expr>
<var>     -> a | b | c | d
<expr>    -> <term> + <term> | <term> - <term>
<term>    -> <var> | const

Example Derivation

<program> => <stmts>
          => <stmt>
          => <var> = <expr>
          => a = <expr>
          => a = <term> + <term>
          => a = <var> + <term>
          => a = b + <term>
          => a = b + const
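The derivation above always rewrites the leftmost non-terminal. A small sketch can replay it mechanically: the `GRAMMAR` dictionary encodes the example grammar, while the `derive` function and its list of alternative indices are hypothetical helpers introduced only for this illustration.

```python
# The example grammar: each non-terminal maps to its list of alternatives.
GRAMMAR = {
    "<program>": [["<stmts>"]],
    "<stmts>": [["<stmt>"], ["<stmt>", ";", "<stmts>"]],
    "<stmt>": [["<var>", "=", "<expr>"]],
    "<var>": [["a"], ["b"], ["c"], ["d"]],
    "<expr>": [["<term>", "+", "<term>"], ["<term>", "-", "<term>"]],
    "<term>": [["<var>"], ["const"]],
}


def derive(choices, start="<program>"):
    """Apply one production per step to the leftmost non-terminal.

    `choices` gives, for each step, the index of the alternative to use.
    Returns every sentential form, beginning with the start symbol.
    """
    form = [start]
    forms = [" ".join(form)]
    for choice in choices:
        # Locate the leftmost non-terminal in the current sentential form.
        index = next(i for i, symbol in enumerate(form) if symbol in GRAMMAR)
        form[index:index + 1] = GRAMMAR[form[index]][choice]
        forms.append(" ".join(form))
    return forms


steps = derive([0, 0, 0, 0, 0, 0, 1, 1])
print("\n=> ".join(steps))
```

The eight choices correspond one-to-one to the eight `=>` steps of the derivation of `a = b + const`.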

Parse Trees

• Alternative representation for a derivation
• Example parse tree for the previous example

              stmts
                |
              stmt
           /    |    \
        var     =     expr
         |          /  |  \
         a      term   +   term
                  |          |
                 var       const
                  |
                  b

Another Example

Expression -> Expression + Expression
            | Expression - Expression
            | ...
            | Variable
            | Constant
            | ...
Variable   -> T_IDENTIFIER
Constant   -> T_INTCONSTANT | T_DOUBLECONSTANT

The Parse

Expression -> Expression + Expression
           -> Variable + Expression
           -> T_IDENTIFIER + Expression
           -> T_IDENTIFIER + Constant
           -> T_IDENTIFIER + T_INTCONSTANT

a + 2

Parse Trees

PS -> P | P PS

P -> ε | '(' PS ')' | '<' PS '>' | '[' PS ']'

What's the parse tree for this statement? < [ ] [ < > ] >
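One way to explore the question is a small recursive-descent parser that builds the tree as nested tuples. Because P has an empty alternative, this grammar admits more than one tree for the same input; the sketch below assumes a policy of choosing the empty alternative only when no opening bracket follows, and the function names and tuple layout are hypothetical.

```python
PAIRS = {"(": ")", "<": ">", "[": "]"}


def parse_ps(tokens, pos):
    """PS -> P | P PS. Returns (tree, next_position)."""
    first, pos = parse_p(tokens, pos)
    children = [first]
    # Take the "P PS" alternative only when another bracketed P starts here.
    if pos < len(tokens) and tokens[pos] in PAIRS:
        rest, pos = parse_ps(tokens, pos)
        children.append(rest)
    return ("PS", *children), pos


def parse_p(tokens, pos):
    """P -> (empty) | '(' PS ')' | '<' PS '>' | '[' PS ']'."""
    if pos < len(tokens) and tokens[pos] in PAIRS:
        opener = tokens[pos]
        inner, pos = parse_ps(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != PAIRS[opener]:
            raise SyntaxError(f"expected {PAIRS[opener]!r} at token {pos}")
        return ("P", opener, inner, PAIRS[opener]), pos + 1
    return ("P",), pos  # the empty alternative


def parse(text):
    """Parse whitespace-separated brackets and return the tree."""
    tokens = text.split()
    tree, pos = parse_ps(tokens, 0)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r} at {pos}")
    return tree


print(parse("< [ ] [ < > ] >"))
```

In the printed tuples, `("P",)` marks a P that derived the empty string, for example inside `[ ]`.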

EBNF - Extended BNF

• Like BNF except that
  – Non-terminals start with an uppercase letter
  – Parentheses are used for grouping terminals
  – Braces {} represent zero or more occurrences (iteration)
  – Brackets [] represent an optional construct, that is, a construct that appears either once or not at all

EBNF example

Exp    -> Term { ('+' | '-') Term }
Term   -> Factor { ('*' | '/') Factor }
Factor -> '(' Exp ')' | variable | constant
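The EBNF braces map naturally onto loops in a hand-written parser. The following sketch evaluates expressions while parsing them, with one method per rule; the `Parser` class, the `tokenize` and `evaluate` helpers, and the choice to evaluate rather than build a tree are assumptions made for illustration.

```python
import re


def tokenize(text):
    # Numbers, identifiers, and single-character operators or parentheses.
    # Characters that match nothing are silently skipped in this sketch.
    return re.findall(r"\d+\.\d+|\d+|[A-Za-z_]\w*|[-+*/()]", text)


class Parser:
    def __init__(self, tokens, variables):
        self.tokens = tokens
        self.pos = 0
        self.variables = variables

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def exp(self):
        # Exp -> Term { ('+' | '-') Term }
        value = self.term()
        while self.peek() in ("+", "-"):  # the braces become a loop
            if self.advance() == "+":
                value += self.term()
            else:
                value -= self.term()
        return value

    def term(self):
        # Term -> Factor { ('*' | '/') Factor }
        value = self.factor()
        while self.peek() in ("*", "/"):
            if self.advance() == "*":
                value *= self.factor()
            else:
                value /= self.factor()
        return value

    def factor(self):
        # Factor -> '(' Exp ')' | variable | constant
        token = self.advance()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token == "(":
            value = self.exp()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if token[0].isdigit():
            return float(token)
        if token[0].isalpha() or token[0] == "_":
            return self.variables[token]
        raise SyntaxError(f"unexpected token {token!r}")


def evaluate(text, variables=None):
    parser = Parser(tokenize(text), variables or {})
    value = parser.exp()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value


print(evaluate("2 * (3 + 4) - 6 / 3"))
```

Because `Term` sits below `Exp` in the grammar, multiplication and division bind tighter than addition and subtraction without any explicit precedence table.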

EBNF/BNF

• EBNF and BNF are equivalent
• How can {} be expressed in BNF?
• How can ( ) be expressed?
• How can [ ] be expressed?

Semantic Analysis

• The syntactically correct parse tree (or derivation) is checked for semantic errors
• Checks for constructs that, while syntactically valid, do not obey the semantic rules of the source language
• Examples:
  – Use of an undeclared or uninitialized variable
  – Function called with improper arguments
  – Incompatible operands and type mismatches

Examples

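int i;
int j;
i = i + 2;        // i is read before it has been initialized

int arr[2], c;
c = arr * 10;     // an array cannot be multiplied by an integer

Most semantic analysis pertains to the checking of types.

void fun1(int i);
double d;
d = fun1(2.1);    // fun1 returns void, so there is no value to assign to d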
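A minimal sketch of one such check, covering only the first example (use of an undeclared or uninitialized variable). The tuple-based statement format and the `check` function are hypothetical; a real compiler would walk the parse tree and consult a symbol table that also records types.

```python
def check(statements):
    """Report uses of undeclared or uninitialized variables.

    Each statement is a tuple (a hypothetical mini-AST for this sketch):
      ("decl", type_name, name)
      ("assign", target, names_read_by_the_right_hand_side)
    """
    declared = {}        # name -> declared type
    initialized = set()  # names that have been assigned a value
    errors = []
    for number, statement in enumerate(statements, start=1):
        if statement[0] == "decl":
            _, type_name, name = statement
            declared[name] = type_name
        elif statement[0] == "assign":
            _, target, reads = statement
            for name in reads:
                if name not in declared:
                    errors.append(f"statement {number}: '{name}' is not declared")
                elif name not in initialized:
                    errors.append(f"statement {number}: '{name}' is used before initialization")
            if target not in declared:
                errors.append(f"statement {number}: '{target}' is not declared")
            else:
                initialized.add(target)
    return errors


# int i; int j; i = i + 2;
program = [("decl", "int", "i"), ("decl", "int", "j"), ("assign", "i", ["i"])]
errors = check(program)
print(errors)
```

The other two examples need type information (array versus integer, void versus double), which this sketch records in `declared` but does not yet use.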

Intermediate Code Generation

• Where the intermediate representation of the source program is created.

• The representation can have a variety of forms, but a common one is called three-address code (TAC)

• Like assembly, TAC is a sequence of simple instructions, each of which can have at most three operands.

Example

a = b * c + b * d

_t1 = b * c
_t2 = b * d
_t3 = _t1 + _t2
a = _t3

Note: _t1, _t2, and _t3 are compiler-generated temporaries (temps)
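The translation above can be sketched as a post-order walk over an expression tree that allocates a fresh temporary for every operator. The tuple-based tree and the `generate_tac` function are assumptions of this sketch; only the `_tN` naming follows the slide.

```python
import itertools


def generate_tac(target, expression):
    """Translate `target = expression` into three-address code.

    `expression` is either a variable name (str) or a tuple
    (operator, left, right); this tuple AST is an assumption of the sketch.
    """
    counter = itertools.count(1)
    code = []

    def emit(node):
        if isinstance(node, str):
            return node                  # a plain variable needs no instruction
        operator, left, right = node
        left_name = emit(left)           # operands are translated first
        right_name = emit(right)
        temp = f"_t{next(counter)}"      # fresh compiler-generated temporary
        code.append(f"{temp} = {left_name} {operator} {right_name}")
        return temp

    result = emit(expression)
    code.append(f"{target} = {result}")
    return code


# a = b * c + b * d
tac = generate_tac("a", ("+", ("*", "b", "c"), ("*", "b", "d")))
print("\n".join(tac))
```

Each emitted instruction has at most three operands: one destination and up to two sources.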

Another Example

if (a <= b)
    a = a - c;
c = b * c;

    _t1 = a > b
    if _t1 goto L0
    _t2 = a - c
    a = _t2
L0: _t3 = b * c
    c = _t3

Note: temps and symbolic addresses (labels such as L0)
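To see how the temporaries and the label interact, the TAC can be executed by a tiny interpreter. The tuple encoding of instructions (`binop`, `copy`, `if_goto`, `label`) and the `run` function are hypothetical; they only illustrate how a symbolic address is resolved to a position in the instruction list.

```python
import operator

OPERATORS = {"+": operator.add, "-": operator.sub, "*": operator.mul,
             ">": operator.gt, "<=": operator.le}


def run(program, variables):
    """Execute TAC instructions and return the updated variables."""
    env = dict(variables)
    # Resolve symbolic addresses: label name -> index in the program.
    labels = {instr[1]: index for index, instr in enumerate(program)
              if instr[0] == "label"}
    pc = 0
    while pc < len(program):
        instr = program[pc]
        kind = instr[0]
        if kind == "binop":
            _, dest, left, op, right = instr
            env[dest] = OPERATORS[op](env[left], env[right])
        elif kind == "copy":
            _, dest, source = instr
            env[dest] = env[source]
        elif kind == "if_goto":
            _, condition, label = instr
            if env[condition]:
                pc = labels[label]
                continue
        # A "label" instruction does nothing when control reaches it.
        pc += 1
    return env


PROGRAM = [
    ("binop", "_t1", "a", ">", "b"),   # _t1 = a > b
    ("if_goto", "_t1", "L0"),          # if _t1 goto L0
    ("binop", "_t2", "a", "-", "c"),   # _t2 = a - c
    ("copy", "a", "_t2"),              # a = _t2
    ("label", "L0"),                   # L0:
    ("binop", "_t3", "b", "*", "c"),   # _t3 = b * c
    ("copy", "c", "_t3"),              # c = _t3
]

print(run(PROGRAM, {"a": 1, "b": 2, "c": 3}))
```

The TAC tests the negated condition (`a > b`) and jumps over the body, which is a common way to lower an `if` without an `else`.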

Next Time

• Finish introduction to compilation stages

• Read Appel, Chapters 1 and 2, if you have not already done so.

• What is a splay tree?

Selected References

• Appel, A., Modern Compiler Implementation in Java (2nd Ed.), Cambridge University Press, 2002. ISBN 052182060X.

• Aho, A.V., R. Sethi, and J.D. Ullman, Compilers: Principles, Techniques, and Tools, Addison-Wesley, 1988. ISBN 0-201-10088-6.

• Muchnick, S., Advanced Compiler Design and Implementation, Morgan Kaufmann, 1998. ISBN 1-55860-320-4.