8/18/2019 Part 2 - Syntax
• The study of programming languages can be divided into the examination of syntax and semantics
– Syntax is the form of expressions, statements, and program units
– Semantics is the meaning of those expressions, statements, and program units
• That is, syntax is the form (structure, grammar) of a language, and semantics is its meaning
• In a well-designed programming language, semantics should follow directly from syntax
• Describing syntax is easier than describing semantics
Example:
if (a > b) a = a + 1; else b = b + 1;
syntax: if-else is an operator that takes three operands: a condition and two statements
semantics: if the value of a is greater than the value of b, then increment a. Otherwise, increment b.
• Syntax is what the grammar allows; semantics is what it means.
int x = "five";
The syntax is okay (type identifier = value), but the semantics is wrong ("five" is not an int).
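The same split between form and meaning can be observed in a running interpreter. The following is a minimal sketch in Python rather than the Java-style declaration above, and the expression `1 + "five"` is an assumed stand-in for `int x = "five";`: the parser accepts it because it fits the grammar, but evaluating it fails because adding an integer to a string has no meaning.

```python
import ast

source = '1 + "five"'

# Syntax check: the parser accepts the string because it matches the grammar
# (expression, operator, expression).
tree = ast.parse(source, mode="eval")
syntax_ok = isinstance(tree, ast.Expression)

# Semantic check: evaluation fails because int + str has no defined meaning.
try:
    eval(compile(tree, "<example>", "eval"))
    semantics_ok = True
except TypeError:
    semantics_ok = False

print(syntax_ok, semantics_ok)  # prints: True False
```

Note that Python reports the problem at run time, whereas a Java compiler would reject the declaration during static type checking; the division of labor between syntax and semantics is the same in both cases.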
Both the syntax and semantics of a programming language must be carefully defined so that:
• language implementors can implement the language (correctly), so that programs developed with one implementation run correctly under another (portability)
• programmers can use the language (correctly)
Describing Syntax
A language is a set of strings of characters from some alphabet.
Examples:
• English (using the standard alphabet)
• binary numbers (using the alphabet {0, 1})
The syntax rules of a language determine whether or not arbitrary strings belong to the language. The first step in specifying syntax is describing the basic units or "words" of the language, called lexemes.
For example, some typical Java lexemes include:
• if
• ++
• +
THE GENERAL PROBLEM OF DESCRIBING SYNTAX
• Lexemes are the lowest-level syntactic units
• The lexemes of a programming language include its identifiers, literals, operators, and special words
• A token of a language is a category of its lexemes
• Lexemes are grouped into categories called tokens. Each token has one or more lexemes.
• Tokens are specified using regular expressions or finite automata.
• The scanner (lexical analyzer) of a compiler processes the character strings in the source program and determines the tokens that they represent.
• Once the tokens of a language are defined, the next step is to determine which sequences of tokens are in the language.
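The scanner described above can be sketched with regular expressions. This is a minimal illustration, not a production lexer: the token names (`KEYWORD`, `ID`, `NUM`, `OP`, `PUNCT`) and their patterns are assumptions chosen for this example, and each pattern defines the set of lexemes that belong to one token.

```python
import re

# Hypothetical token categories for this sketch; each regex defines the
# lexemes of one token. Order matters: keywords are tried before identifiers.
TOKEN_SPEC = [
    ("KEYWORD", r"\b(?:if|else)\b"),
    ("ID", r"[A-Za-z_]\w*"),
    ("NUM", r"\d+"),
    ("OP", r"\+\+|[+>=]"),
    ("PUNCT", r"[();{}]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def scan(source):
    """Return (token, lexeme) pairs; raise if no token matches a character."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":  # whitespace separates lexemes but is not a token
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

tokens = scan("if (a > b) a = a + 1; else b = b + 1;")
print(tokens[:4])
# prints: [('KEYWORD', 'if'), ('PUNCT', '('), ('ID', 'a'), ('OP', '>')]
```

Listing `\+\+` before the single-character operators makes the scanner prefer the longer lexeme `++` over two `+` lexemes.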
FORMAL METHODS OF DESCRIBING SYNTAX
• John Backus and Noam Chomsky independently developed the notation that is now most widely used for describing programming language syntax
CONTEXT-FREE GRAMMARS
• Chomsky described classes of grammars that define classes of languages. Two of these grammar classes, context-free and regular, turned out to be useful for describing the syntax of programming languages
• The tokens of programming languages can be described by regular grammars
ORIGINS OF BACKUS-NAUR FORM (BNF)
• BNF is a very natural notation for describing syntax
• BNF is nearly identical to Chomsky's context-free grammars, the generative devices for context-free languages
• A meta-language is a language that is used to describe another language
• BNF is a meta-language for programming languages
• The abstractions in a BNF description, or grammar, are called non-terminals
• The lexemes and tokens of the rules are called terminals
• A BNF description, or grammar, is simply a collection of rules
BNF - Backus-Naur Form
BNF is:
• a metalanguage - a language used to describe other languages
• the standard way to describe programming language syntax
• often used in language reference manuals
The class (set) of languages that can be described using BNF is called the context-free languages, and BNF descriptions are also called context-free grammars or just grammars.
BNF Notation
Parse Tree
• A parse tree is a graphical way of representing a derivation.
• the root of the parse tree is always the start symbol
• each interior node is a nonterminal
• each leaf node is a token
• the children of a nonterminal (interior node) are the RHS of some rule whose LHS is the nonterminal
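The properties listed above can be checked on a small tree held as nested data. This is only a sketch: the two rules in the comment are an assumed toy grammar, not the grammar from the slides, and the helper names `node`, `frontier`, and `is_well_formed` are hypothetical.

```python
# A node is (symbol, children); a leaf has no children and holds a token.
def node(symbol, *children):
    return (symbol, list(children))

# Parse tree for the token string "id + num" under the assumed rules
#   <expr> -> <expr> + <term> | <term>
#   <term> -> id | num
tree = node("<expr>",
            node("<expr>", node("<term>", node("id"))),
            node("+"),
            node("<term>", node("num")))

def frontier(tree):
    """Collect the leaves left to right; they spell out the parsed token string."""
    symbol, children = tree
    if not children:
        return [symbol]
    return [leaf for child in children for leaf in frontier(child)]

def is_well_formed(tree):
    """Check that interior nodes are nonterminals and leaves are tokens."""
    symbol, children = tree
    if not children:
        return not symbol.startswith("<")
    return symbol.startswith("<") and all(is_well_formed(c) for c in children)

print(frontier(tree))        # prints: ['id', '+', 'num']
print(is_well_formed(tree))  # prints: True
```

Reading the leaves from left to right always recovers the string that was derived, which is why a parse tree and a derivation carry the same information.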
For example, a parse tree for:
if (id > num) id = num; else { id = id + num; id = id; }
using the previous grammar is:
A grammar is ambiguous if there are 2 or more distinct parse trees (or equivalently, leftmost derivations) for the same string.
Consider the grammar:
<expr> → id | num | (<expr>) | <expr> + <expr> | <expr> * <expr>
and the string:
id + num * id
The following parse trees show that this grammar is ambiguous:
Which parse tree would we prefer?
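The ambiguity can also be demonstrated mechanically by enumerating every parse tree the grammar allows for a token string. The sketch below is a brute-force enumerator written for this one grammar (the function name `parse_trees` and the tuple representation of trees are assumptions of this example); it is exponential in the worst case and is meant only for short strings.

```python
def parse_trees(tokens):
    """Return every parse tree for tokens under the ambiguous grammar
    <expr> -> id | num | (<expr>) | <expr> + <expr> | <expr> * <expr>
    """
    tokens = tuple(tokens)

    def trees(i, j):
        # All ways to derive tokens[i:j] from <expr>.
        found = []
        if j - i == 1 and tokens[i] in ("id", "num"):
            found.append(tokens[i])
        if j - i >= 3 and tokens[i] == "(" and tokens[j - 1] == ")":
            found.extend(("(", inner, ")") for inner in trees(i + 1, j - 1))
        for k in range(i + 1, j - 1):  # try every operator as the top-level split
            if tokens[k] in ("+", "*"):
                for left in trees(i, k):
                    for right in trees(k + 1, j):
                        found.append((left, tokens[k], right))
        return found

    return trees(0, len(tokens))

for tree in parse_trees(["id", "+", "num", "*", "id"]):
    print(tree)
# prints:
# ('id', '+', ('num', '*', 'id'))
# (('id', '+', 'num'), '*', 'id')
```

The first tree groups the multiplication below the addition, which matches the usual precedence of `*` over `+`; the second tree does not, so it is the first one we would normally prefer.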
This grammar modification gives rise to three proof obligations:
• the two grammars define the same language
• the second grammar always gives correct associativity and precedence
• the second grammar is not ambiguous
These proofs are omitted.
SYNTAX GRAPHS
• A graph is a collection of nodes, some of which are connected by lines, called edges
• A directed graph is one in which the edges are directional; they have arrowheads on one end to indicate a direction
• The information in BNF rules can be represented in a directed graph; such graphs are called syntax graphs. These graphs use rectangles for non-terminals and circles for terminals
GRAMMARS AND RECOGNIZERS
• One of the most widely used of the syntax analyzer generators is named yacc - yet another compiler compiler
• Syntax analyzers for programming languages, which are often called parsers, construct parse trees for given programs
• The 2 broad classes of parsers are top-down, in which the tree is built from the root downward to the leaves, and bottom-up, in which the parse tree is built from the leaves upward to the root.
RECURSIVE DESCENT PARSING
• A context-free grammar can serve as the basis for the syntax analyzer, or parser, of a compiler
• A simple kind of grammar-based top-down parser is named recursive descent
• Parsing is the process of tracing a parse tree for a given input string
• The basic idea of a recursive descent parser is that there is a subprogram for each non-terminal in the grammar
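The "one subprogram per non-terminal" idea can be sketched as follows. The grammar in the comment is an assumption of this example: it is the common unambiguous expression grammar with separate levels for `+` and `*`, not necessarily the modified grammar from the slides, and the class and method names are hypothetical.

```python
# Assumed grammar (repetition written with braces):
#   <expr>   -> <term> { + <term> }
#   <term>   -> <factor> { * <factor> }
#   <factor> -> id | num | ( <expr> )
class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, token):
        if self.peek() != token:
            raise SyntaxError(f"expected {token!r}, found {self.peek()!r}")
        self.pos += 1

    def expr(self):  # one method per non-terminal
        tree = self.term()
        while self.peek() == "+":
            self.pos += 1
            tree = (tree, "+", self.term())  # loop makes + left-associative
        return tree

    def term(self):
        tree = self.factor()
        while self.peek() == "*":
            self.pos += 1
            tree = (tree, "*", self.factor())
        return tree

    def factor(self):
        token = self.peek()
        if token in ("id", "num"):
            self.pos += 1
            return token
        self.expect("(")
        tree = self.expr()
        self.expect(")")
        return tree

def parse(tokens):
    parser = Parser(tokens)
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree

print(parse(["id", "+", "num", "*", "id"]))
# prints: ('id', '+', ('num', '*', 'id'))
```

Because `term` is called from inside `expr`, multiplication is always grouped below addition, so this parser yields exactly one tree per string.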
ATTRIBUTE GRAMMARS
• An attribute grammar is a device used to describe more of the structure of a programming language than is possible with a context-free grammar
• An attribute grammar is an extension to a context-free grammar
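As a rough illustration of what such an extension adds, the sketch below attaches a type attribute and a checking rule to a single assumed grammar rule, `<decl> -> type id = literal ;`. The rule, the token kinds `num` and `str`, and the function names are all assumptions made for this example; a full attribute grammar would define attributes and semantic rules for every production.

```python
# Hypothetical literal tokens for this sketch: ("num", "5") or ("str", "five").
def literal_type(literal):
    """Synthesized attribute: the type of a literal is determined by its token."""
    kind, _lexeme = literal
    return {"num": "int", "str": "String"}[kind]

def check_declaration(declared_type, name, literal):
    """Semantic rule attached to the assumed rule <decl> -> type id = literal ;"""
    actual_type = literal_type(literal)
    if actual_type != declared_type:
        return f"type error: {name} is {declared_type} but value is {actual_type}"
    return "ok"

print(check_declaration("int", "x", ("num", "5")))     # prints: ok
print(check_declaration("int", "x", ("str", "five")))  # prints: type error: x is int but value is String
```

Both declarations fit the context-free rule; only the attribute check separates the valid one from `int x = "five";`, which is the kind of constraint a context-free grammar alone cannot express.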