Unit-II

Syntax and Semantics

TEXT BOOKS :

1. CONCEPTS OF PROGRAMMING LANGUAGES BY ROBERT W. SEBESTA

2. PROGRAMMING LANGUAGES BY LOUDEN

REFERENCES :

1. PROGRAMMING LANGUAGES BY GHEZZI

2. PROGRAMMING LANGUAGES BY WATT, W.DREAMTECH

SUB-TOPIC NUMBER   SUB-TOPIC NAME   LECTURE NUMBER

1 Introduction L1
2 The General Problem of Describing Syntax: Terminology L1
3 Formal Definition of Languages L1
4 Formal Methods of Describing Syntax L2
5 BNF and Context-Free Grammars L2
6 Backus-Naur Form (BNF) L2
7 BNF Fundamentals L3
8 Specific Rule for Describing Lists L3
9 Grammars and Derivations L3
10 Another Example L4
11 Extended BNF L4
12 BNF and EBNF L4
13 EBNF variations L5
14 Parse trees L5
15 Ambiguity in Grammars L5
16 Unambiguous grammar L6
17 Attribute grammars L6
18 Denotational semantics L6
19 Axiomatic semantics L7
20 Assertions L7

Introduction

• Syntax and semantics provide a language’s definition
• Syntax: the form or structure of the expressions, statements, and program units
• Semantics: the meaning of the expressions, statements, and program units
• E.g.,

while (x > 20) {
    sum = sum + x;
    x = x + 1;
}

The General Problem of Describing Syntax: Terminology

• A language is a set of sentences
• A sentence/statement is a string of characters over some alphabet
• A token is a category of lexemes (e.g., identifier)
• A lexeme is the lowest level syntactic unit of a language (e.g., *, sum, begin)

The General Problem of Describing Syntax: Terminology

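• E.g.

language:

{
    int index, count;
    …
    index = 2 * count + 17;
}

statement: index = 2 * count + 17;
lexeme: each of index, =, 2, *, count, +, 17, ;
token: identifier (lexemes index, count)
token: int literal (lexemes 2, 17)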
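The lexeme/token pairing in this example can be sketched with a small tokenizer. This is only an illustration under stated assumptions, not a real compiler's lexer: the function name `tokenize` and the token names other than identifier and int literal (`equal_sign`, `mult_op`, `plus_op`, `semicolon`) are chosen for the example.

```python
import re

# Each pair is (token name, regular expression for its lexemes).
# Only "identifier" and "int_literal" come from the slide; the other
# token names are assumptions made for this sketch.
TOKEN_SPEC = [
    ("int_literal", r"\d+"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("equal_sign", r"="),
    ("mult_op", r"\*"),
    ("plus_op", r"\+"),
    ("semicolon", r";"),
    ("skip", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (lexeme, token) pairs for the given statement."""
    pairs = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if match.lastgroup != "skip":      # whitespace is not a lexeme
            pairs.append((match.group(), match.lastgroup))
        pos = match.end()
    return pairs


for lexeme, token in tokenize("index = 2 * count + 17;"):
    print(f"{lexeme:<6} {token}")
```

Each printed line pairs one lexeme with the token (category) it belongs to, so index and count both appear as identifier while 2 and 17 both appear as int_literal.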

Formal Definition of Languages

• Recognizers
– A recognition device reads input strings over the alphabet of the language and decides whether they belong to the language
– Example: syntax analysis part of a compiler
• Generators
– A device that generates sentences of a language
– One can determine if the syntax of a particular sentence is correct by comparing it to the structure of the generator

LECTURE-2

Formal Methods of Describing Syntax

• Backus-Naur Form (BNF) and Context-Free Grammars
– Most widely known method for describing programming language syntax
• Extended BNF
– Improves readability and writability of BNF

BNF and Context-Free Grammars

• Context-Free Grammars
– Developed by Noam Chomsky, a natural-language linguist, in the mid-1950s
– Described four classes of grammars that define four classes of languages
– Two classes (context-free and regular) turned out to be useful for describing the syntax of programming languages

Backus-Naur Form (BNF)

• Backus-Naur Form (1959)
– Invented by John Backus to describe Algol 58, later modified by Peter Naur for Algol 60
– BNF is equivalent to context-free grammars
– BNF is a metalanguage used to describe another language
– In BNF, abstractions are used to represent classes of syntactic structures; they act like syntactic variables (also called nonterminal symbols)
• E.g., <assign> -> <var> = <expression> describes a statement such as a = 2 * b; where a is the <var> and 2 * b is the <expression>

LECTURE-3

BNF Fundamentals

• Non-terminals: BNF abstractions
• Terminals: lexemes and tokens
• Grammar: a collection of rules

– Examples of BNF rules:

<ident_list> → identifier | identifier, <ident_list>

<if_stmt> → if <logic_expr> then <stmt>

Specific Rule for Describing Lists

• Syntactic lists are described using recursion

<ident_list> → ident
| ident, <ident_list>
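The recursive rule above maps directly onto a recursive function. The following is a minimal sketch, assuming the input has already been split into tokens and that Python's `str.isidentifier` is an acceptable stand-in for "ident"; the name `is_ident_list` is made up for the example.

```python
def is_ident_list(tokens):
    """Recognize <ident_list> -> ident | ident , <ident_list>.

    `tokens` is a list of strings. What counts as an ident is decided
    by str.isidentifier (an assumption of this sketch).
    """
    if not tokens or not tokens[0].isidentifier():
        return False
    if len(tokens) == 1:
        # First alternative: a single ident.
        return True
    # Second alternative: ident , <ident_list> (the recursive case).
    return tokens[1] == "," and is_ident_list(tokens[2:])


print(is_ident_list(["sum", ",", "index", ",", "count"]))
print(is_ident_list(["sum", ","]))
```

The first call succeeds because every comma is followed by another list; the second fails because the recursion reaches an empty list, which neither alternative of the rule can produce.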

Grammars and Derivations

• A derivation is a repeated application of rules, starting with the start symbol and ending with a sentence (all terminal symbols)

LECTURE-4

Another Example

Grammar:

<assign> → <id> = <expr>
<id> → A | B | C
<expr> → <id> + <expr>
| <id> * <expr>
| ( <expr> )
| <id>

Derivation of “A = B * (A + C)”:

<assign> => <id> = <expr>
=> A = <expr>
=> A = <id> * <expr>
=> A = B * <expr>
=> A = B * ( <expr> )
=> A = B * ( <id> + <expr> )
=> A = B * ( A + <expr> )
=> A = B * ( A + <id> )
=> A = B * ( A + C )
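A derivation like this one can be replayed mechanically: at every step the leftmost nonterminal is replaced by one of its right-hand sides. The sketch below assumes the grammar is stored as a Python dictionary and that the caller supplies which alternative to use at each step; the names `GRAMMAR`, `CHOICES`, and `leftmost_derivation` are illustrative only.

```python
# Grammar from the slide; each nonterminal maps to its alternative RHSs.
GRAMMAR = {
    "<assign>": [["<id>", "=", "<expr>"]],
    "<id>": [["A"], ["B"], ["C"]],
    "<expr>": [
        ["<id>", "+", "<expr>"],
        ["<id>", "*", "<expr>"],
        ["(", "<expr>", ")"],
        ["<id>"],
    ],
}


def leftmost_derivation(start, choices):
    """Apply one rule per step to the leftmost nonterminal.

    `choices` gives, for each step, the index of the alternative to use.
    Returns every sentential form, starting with the start symbol.
    """
    form = [start]
    steps = [form]
    for choice in choices:
        # Position of the leftmost nonterminal in the current form.
        i = next(k for k, sym in enumerate(form) if sym in GRAMMAR)
        form = form[:i] + GRAMMAR[form[i]][choice] + form[i + 1:]
        steps.append(form)
    return steps


# Alternative indices that reproduce the derivation of A = B * (A + C).
CHOICES = [0, 0, 1, 1, 2, 0, 0, 3, 2]
forms = leftmost_derivation("<assign>", CHOICES)
print(" ".join(forms[0]))
for form in forms[1:]:
    print("=>", " ".join(form))
```

The last form contains only terminal symbols, which is what makes it a sentence of the language rather than just a sentential form.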

Extended BNF

• Optional parts are placed in brackets [ ]

<selection> -> if (<expression>) <statement> [else <statement>]

• Repetitions (0 or more) are placed inside braces { }

<ident_list> -> <identifier> {, <identifier>}

• When a single element must be chosen from a group, the options are placed in parentheses and separated by the OR operator, |.

<term> -> <term> (* | / | %) <factor>

BNF and EBNF

• BNF

<expr> → <expr> + <term>
| <expr> - <term>
| <term>
<term> → <term> * <factor>
| <term> / <factor>
| <factor>
<factor> → <exp> ** <factor>
| <exp>

• EBNF

<expr> → <term> {(+ | -) <term>}
<term> → <factor> {(* | /) <factor>}
<factor> → <exp> {** <exp>}
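One practical reason to prefer the EBNF form is that each pair of braces becomes a simple loop in a hand-written parser. The sketch below evaluates expressions with one function per EBNF rule. It assumes that <exp> is an unsigned integer literal (the slide does not define <exp>), and the name `evaluate` is made up for the example. Because the EBNF rule for <factor> no longer records that ** associates to the right, the code restores that by folding the operands from the right.

```python
import re


def evaluate(text):
    """Evaluate an expression using one function per EBNF rule.

    <expr>   -> <term> {(+ | -) <term>}
    <term>   -> <factor> {(* | /) <factor>}
    <factor> -> <exp> {** <exp>}
    Assumption of this sketch: <exp> is an unsigned integer literal.
    Characters that are not part of a token are ignored (a simplification).
    """
    tokens = re.findall(r"\d+|\*\*|[-+*/]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def exp():
        return int(advance())

    def factor():
        operands = [exp()]
        while peek() == "**":          # the braces { } become a loop
            advance()
            operands.append(exp())
        result = operands.pop()
        while operands:                # fold from the right: ** is right-associative
            result = operands.pop() ** result
        return result

    def term():
        value = factor()
        while peek() in ("*", "/"):
            if advance() == "*":
                value = value * factor()
            else:
                value = value / factor()
        return value

    def expr():
        value = term()
        while peek() in ("+", "-"):
            if advance() == "+":
                value = value + term()
            else:
                value = value - term()
        return value

    result = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {peek()!r}")
    return result


print(evaluate("2 + 3 * 4"))      # 14
print(evaluate("2 ** 3 ** 2"))    # 512
print(evaluate("10 - 4 - 3"))     # 3
```

The loops in `expr` and `term` accumulate from the left, which gives + - * / their usual left associativity without the left-recursive rules of the BNF version.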

LECTURE-5

EBNF variations

• In place of the arrow, a colon is used and the RHS is placed on the next line
• Instead of a vertical bar to separate alternative RHSs, they are simply placed on separate lines
• In place of square brackets to indicate something being optional, the subscript opt is used. E.g.
– ConstructorDeclarator -> SimpleName(FormalParameterList_opt)
• Rather than using the | symbol in a parenthesized list of elements to indicate a choice, the words “one of” are used. E.g.
– AssignmentOperator -> one of = *= /= %= += -= <<= >>= &= |=

Parse trees

• One of the most attractive features of grammars is that they naturally describe the hierarchical syntactic structure of the sentences of the languages they define. These hierarchical structures are called parse trees.

• A parse tree is a hierarchical representation of a derivation

Parse Tree

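Parse tree for A = B * (A + C), drawn as an indented outline (the children of a node are indented beneath it):

<assign>
  <id>
    A
  =
  <expr>
    <id>
      B
    *
    <expr>
      (
      <expr>
        <id>
          A
        +
        <expr>
          <id>
            C
      )

A hierarchical representation of a derivation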
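A parse tree for this sentence can also be built by a small recursive-descent parser. The sketch below assumes the grammar <assign> → <id> = <expr>, <expr> → <id> + <expr> | <id> * <expr> | ( <expr> ) | <id>, <id> → A | B | C, with one-character tokens; representing nodes as tuples and the names `parse_assign` and `show` are choices made for the example.

```python
def parse_assign(tokens):
    """Build a parse tree for <assign> -> <id> = <expr>.

    A tree node is a tuple (label, child, child, ...); a leaf is a plain
    string holding a terminal. `tokens` is a list of one-character strings.
    """
    pos = 0

    def take(expected=None):
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        token = tokens[pos]
        if expected is not None and token != expected:
            raise SyntaxError(f"expected {expected!r}, got {token!r}")
        pos += 1
        return token

    def parse_id():
        token = take()
        if token not in ("A", "B", "C"):
            raise SyntaxError(f"expected A, B or C, got {token!r}")
        return ("<id>", token)

    def parse_expr():
        if pos < len(tokens) and tokens[pos] == "(":
            # Tuple items are evaluated left to right, so the tokens
            # are consumed in source order.
            return ("<expr>", take("("), parse_expr(), take(")"))
        left = parse_id()
        if pos < len(tokens) and tokens[pos] in ("+", "*"):
            return ("<expr>", left, take(), parse_expr())
        return ("<expr>", left)

    tree = ("<assign>", parse_id(), take("="), parse_expr())
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree


def show(node, depth=0):
    """Print the tree as an indented outline, one node per line."""
    if isinstance(node, str):
        print("  " * depth + node)
        return
    print("  " * depth + node[0])
    for child in node[1:]:
        show(child, depth + 1)


show(parse_assign(list("A=B*(A+C)")))
```

Every internal node of the printed outline is a nonterminal and every leaf is a terminal; reading the leaves from top to bottom gives back the original sentence.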

Ambiguity in Grammars

• A grammar is ambiguous if and only if it generates a sentential form that has two or more distinct parse trees

An Ambiguous Expression Grammar

<assign> -> <id> = <expr>
<id> -> A | B | C
<expr> -> <expr> + <expr>
| <expr> * <expr>
| ( <expr> )
| <id>

Two distinct parse trees for A = B + C * A (indented outlines; the children of a node are indented beneath it):

a. <assign>
  <id>
    A
  =
  <expr>
    <expr>
      <id>
        B
    +
    <expr>
      <expr>
        <id>
          C
      *
      <expr>
        <id>
          A

b. <assign>
  <id>
    A
  =
  <expr>
    <expr>
      <expr>
        <id>
          B
      +
      <expr>
        <id>
          C
    *
    <expr>
      <id>
        A
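The ambiguity can be checked by counting parse trees instead of drawing them. The sketch below counts how many distinct trees the ambiguous <expr> rules allow for a token sequence; the left-hand side "A =" of the assignment is left out because it offers no choice. The function name `count_expr_trees` and the one-character tokens are assumptions of the example.

```python
from functools import lru_cache


def count_expr_trees(tokens):
    """Count the parse trees of `tokens` as an <expr> in the grammar
    <expr> -> <expr> + <expr> | <expr> * <expr> | ( <expr> ) | <id>
    <id>   -> A | B | C
    """
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(lo, hi):
        """Number of trees for the slice tokens[lo:hi]."""
        if hi <= lo:
            return 0
        if hi - lo == 1:
            return 1 if tokens[lo] in ("A", "B", "C") else 0
        total = 0
        if tokens[lo] == "(" and tokens[hi - 1] == ")":
            total += count(lo + 1, hi - 1)
        # Try every + or * as the operator at the root of the tree.
        for mid in range(lo + 1, hi - 1):
            if tokens[mid] in ("+", "*"):
                total += count(lo, mid) * count(mid + 1, hi)
        return total

    return count(0, len(tokens))


print(count_expr_trees("B+C*A"))      # 2
print(count_expr_trees("(B+C)*A"))    # 1
```

A count above 1 for any sentence is enough to show that the grammar is ambiguous; adding parentheses to the sentence removes the choice but does not repair the grammar itself.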

LECTURE-6

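Unambiguous grammar

An unambiguous grammar for if-then-else

The BNF rule given:

<if_stmt> -> if <logic_expr> then <stmt>
| if <logic_expr> then <stmt> else <stmt>

This rule is ambiguous: in a nested if, it does not say which then an else belongs to. The idea behind the fix is that each else is matched with the nearest previous unmatched then.

The unambiguous grammar based on these ideas:

<stmt> -> <matched> | <unmatched>

<matched> -> if <logic_expr> then <matched> else <matched>
| any non-if statement

<unmatched> -> if <logic_expr> then <stmt>
| if <logic_expr> then <matched> else <unmatched>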

Attribute grammars

• An attribute grammar is a grammar to which attributes, attribute computation functions, and predicate functions have been added.

• Attribute computation functions, sometimes called semantic functions, are associated with grammar rules.

• Predicate functions, which state some of the syntax and static semantic rules of the language, are associated with grammar rules.
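The three ingredients can be illustrated with a type check on a simple assignment. The sketch below is not taken from the slides; it assumes the rules <assign> → <var> = <expr> and <expr> → <var> + <var> | <var>, the attributes actual_type and expected_type, and a hypothetical symbol table of declared types.

```python
# Assumed declarations for this sketch: variable name -> declared type.
SYMBOL_TABLE = {"A": "int", "B": "int", "C": "real"}


def lookup(name):
    """Semantic function: the actual_type attribute of a <var>."""
    return SYMBOL_TABLE[name]


def expr_actual_type(left, right=None):
    """Semantic function for <expr> -> <var> + <var> | <var>.

    The sum is int only when both operands are int; otherwise it is real.
    """
    if right is None:
        return lookup(left)
    if lookup(left) == "int" and lookup(right) == "int":
        return "int"
    return "real"


def check_assign(target, left, right=None):
    """Predicate function for <assign> -> <var> = <expr>.

    expected_type is inherited from the target variable; actual_type is
    synthesized from the operands. The rule holds when the two agree.
    """
    expected_type = lookup(target)
    actual_type = expr_actual_type(left, right)
    return actual_type == expected_type


print(check_assign("A", "A", "B"))   # int = int + int
print(check_assign("A", "A", "C"))   # int = int + real
```

Here `lookup` and `expr_actual_type` play the role of attribute computation functions, while `check_assign` plays the role of a predicate function: a statement that parses correctly can still be rejected when the predicate is false.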

Denotational semantics

• Denotational semantics is the most widely known formal method for describing the meaning of programs.

• The fundamental concept of denotational semantics is to define for each language entity both a mathematical object and a function that maps instances of that entity onto instances of the mathematical object.

Denotational semantics (contd.)

• The state of a program
• Expressions
• Assignment statements
• Logical pretest loops
• Evaluation
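The first four topics can be sketched as mapping functions over a program state. This is a rough illustration under stated assumptions, not the formal definitions: the state is a Python dictionary from variable names to values, expressions use a made-up abstract syntax of nested tuples, and the function names `m_expr`, `m_assign`, and `m_loop` are chosen for the example. The formal treatment defines the loop mapping recursively; a Python `while` stands in for that here.

```python
ERROR = "error"   # stands for the special error value of the semantics


def m_expr(expr, state):
    """Map an expression and a state to a value (or ERROR).

    Assumed abstract syntax for this sketch: an int literal, a variable
    name (str), or a tuple (op, left, right) with op in "+", "*", ">".
    """
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        return state.get(expr, ERROR)      # undefined variable -> ERROR
    op, left, right = expr
    a, b = m_expr(left, state), m_expr(right, state)
    if ERROR in (a, b):
        return ERROR
    if op == "+":
        return a + b
    if op == "*":
        return a * b
    if op == ">":
        return a > b
    return ERROR


def m_assign(var, expr, state):
    """Map an assignment and a state to a new state (or ERROR)."""
    value = m_expr(expr, state)
    if value == ERROR:
        return ERROR
    return {**state, var: value}           # the old state is left unchanged


def m_loop(cond, body, state):
    """Map a logical pretest loop and a state to the final state.

    `body` is a list of (var, expr) assignments.
    """
    while state != ERROR:
        test = m_expr(cond, state)
        if test == ERROR:
            return ERROR
        if not test:
            break
        for var, expr in body:
            state = m_assign(var, expr, state)
            if state == ERROR:
                break
    return state


start = {"x": 0, "sum": 0}
body = [("sum", ("+", "sum", "x")), ("x", ("+", "x", 1))]
print(m_loop((">", 3, "x"), body, start))
```

The meaning of the loop is the state it maps the start state to; nothing about how a machine would execute it step by step appears in the result.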

LECTURE-7

Axiomatic semantics

• Axiomatic semantics was defined in conjunction with the development of a method to prove the correctness of programs.

• In a proof, each statement of a program is both preceded and followed by a logical expression that specifies constraints on program variables.

• The notation used to describe constraints, indeed the language of axiomatic semantics, is predicate calculus.

Assertions

• Axiomatic semantics is based on mathematical logic. The logical expressions are called predicates, or assertions.

• An assertion immediately preceding a program statement describes the constraints on the program variables at that point in the program; it is called a precondition.

• An assertion immediately following a statement describes the new constraints on those variables after execution of the statement; it is called a postcondition.

Weakest precondition: the least restrictive precondition that will guarantee the validity of the associated postcondition.

In some cases the weakest precondition can be specified by an axiom; in most cases it can be computed only by an inference rule. An axiom is a logical statement that is assumed to be true.

• Assignment statements
• Sequences
• Selection
• Logical pretest loops
• Evaluation
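For an assignment statement, the weakest precondition is obtained by substituting the right-hand side for the variable in the postcondition. As a made-up example, for a = 2 * b + 1 with postcondition {a > 1}, substitution gives 2 * b + 1 > 1, that is, {b > 0}. The sketch below checks that result by brute force over a small range of start states. The search is only an illustration, since axiomatic semantics derives the precondition symbolically, and all names in it are assumptions of the example.

```python
def holds_after(statement, postcondition, state):
    """Run `statement` on a copy of `state` and test the postcondition."""
    new_state = statement(dict(state))
    return postcondition(new_state)


def weakest_precondition_states(statement, postcondition, states):
    """Return exactly those start states that guarantee the postcondition.

    Brute force over a finite list of states, for illustration only.
    """
    return [s for s in states if holds_after(statement, postcondition, s)]


def assign_a(state):
    """The statement  a = 2 * b + 1."""
    state["a"] = 2 * state["b"] + 1
    return state


def post(state):
    """The postcondition  {a > 1}."""
    return state["a"] > 1


candidates = [{"a": 0, "b": b} for b in range(-5, 6)]
good = weakest_precondition_states(assign_a, post, candidates)
print(sorted(s["b"] for s in good))
```

The start states that survive are exactly those with b > 0; any stronger precondition, such as b > 3, would also guarantee the postcondition but would exclude states it does not need to exclude.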
