1 Note As usual, these notes are based on the Sebesta text. The tree diagrams in these slides are...
-
date post
21-Dec-2015 -
Category
Documents
-
view
214 -
download
0
Transcript of 1 Note As usual, these notes are based on the Sebesta text. The tree diagrams in these slides are...
1
Note
• As usual, these notes are based on the Sebesta text.
• The tree diagrams in these slides are from the lecture slides provided in the instructor resources for the text, and were made by David Garrett.
2
Context-free (CF) grammar
• A CF grammar is formally presented as a 4-tuple G=(T,NT,P,S), where:– T is a set of terminal symbols (the alphabet)– NT is a set of non-terminal symbols– P is a set of productions (or rules), where
PNT(TNT)*– SNT
4
Example 2L2 = { the dog chased the dog, the dog chased a
dog, a dog chased the dog, a dog chased a dog, the dog chased the cat, … }
G2 = ( { a, the, dog, cat, chased }, { S, NP, VP, Det, N, V }, { S NP VP, NP Det N, Det a | the, N dog | cat, VP V | VP NP, V chased }, S )
Notes: S = Sentence, NP = Noun Phrase , N = Noun
VP = Verb Phrase, V = Verb, Det = Determiner
5
Examples of lexemes and tokens
Lexemes Tokens
foo identifier
i identifier
sum identifier
-3 integer_literal
10 integer_literal
1 integer_literal
; statement_separator
= assignment_operator
6
BNF Fundamentals• Sample rules [p. 128]
<assign> → <var> = <expression><if_stmt> → if <logic_expr> then <stmt><if_stmt> → if <logic_expr> then <stmt> else <stmt>
• non-terminals/tokens surrounded by < and >• lexemes are not surrounded by < and >• keywords in language are in bold• → separates LHS from RHS• | expresses alternative expansions for LHS
<if_stmt> → if <logic_expr> then <stmt> | if <logic_expr> then <stmt> else
<stmt>• = is in this example a lexeme
7
BNF Rules• A rule has a left-hand side (LHS) and a right-
hand side (RHS), and consists of terminal and nonterminal symbols
• A grammar is often given simply as a set of rules (terminal and non-terminal sets are implicit in rules, as is start symbol)
8
Describing Lists
• There are many situations in which a programming language allows a list of items (e.g. parameter list, argument list).
• Such a list can typically be as short as empty or consisting of one item.
• Such lists are typically not bounded.
• How is their structure described?
9
Describing lists
• The are described using recursive rules.
• Here is a pair of rules describing a list of identifiers, whose minimum length is one:
<ident_list> ident | ident, <ident_list>
• Notice that ‘,’ is part of the object language
10
Example 3
L3 = { 0, 1, 00, 11, 000, 111, 0000, 1111, … }
G3 = ( {0,1}, {S,ZeroList,OneList},
{ S ZeroList | OneList,
ZeroList 0 | 0 ZeroList,
OneList 1 | 1 OneList }, S )
11
Derivation of sentences from a grammar
• A derivation is a repeated application of rules, starting with the start symbol and ending with a sentence (all terminal symbols).
12
Example: derivation from G2
• Example: derivation of the dog chased a catS NP VP
Det N VP
the N VP
the dog VP
the dog V NP
the dog chased NP
the dog chased Det N
the dog chased a N
the dog chased a cat
13
Example: derivations from G3
• Example: derivation of 0 0 0 0S ZeroList
0 ZeroList 0 0 ZeroList 0 0 0 ZeroList 0 0 0 0
• Example: derivation of 1 1 1S OneList
1 OneList 1 1 OneList 1 1 1
14
Observations about derivations
• Every string of symbols in the derivation is a sentential form.
• A sentence is a sentential form that has only terminal symbols.
• A leftmost derivation is one in which the leftmost nonterminal in each sentential form is the one that is expanded.
• A derivation can be leftmost, rightmost, or neither.
15
An example grammar
<program> <stmt-list><stmt-list> <stmt>
| <stmt> ; <stmt-list> <stmt> <var> = <expr> <var> a
| b | c | d
<expr> <term> + <term> | <term> - <term>
<term> <var> | const
16
A leftmost derivation
<program> => <stmts>
=> <stmt>
=> <var> = <expr>
=> a = <expr>
=> a = <term> + <term>
=> a = <var> + <term>
=> a = b + <term>
=> a = b + const
17
Parse tree
• A hierarchical representation of a derivation
<program>
<stmts>
<stmt>
const
a
<var> = <expr>
<var>
b
<term> + <term>
18
Parse trees and compilation
• A compiler builds a parse tree for a program (or for different parts of a program).
• If the compiler cannot build a well-formed parse tree from a given input, it reports a compilation error.
• The parse tree serves as the basis for semantic interpretation/translation of the program.
19
Extended BNF
• Optional parts are placed in brackets [ ]<proc_call> -> ident [(<expr_list>)]
• Alternative parts of RHSs are placed inside parentheses and separated via vertical bars <term> → <term> (+|-) const
• Repetitions (0 or more) are placed inside braces { }<ident> → letter {letter|digit}
20
Comparison of BNF and EBNF• sample grammar fragment expressed in BNF
<expr> <expr> + <term>| <expr> - <term>| <term>
<term> <term> * <factor>| <term> / <factor>| <factor>
• same grammar fragment expressed in EBNF<expr> <term> {(+ | -) <term>}<term> <factor> {(* | /) <factor>}
21
Ambiguity in grammars
• A grammar is ambiguous if and only if it generates a sentential form that has two or more distinct parse trees
22
An ambiguous grammar for arithmetic expressions
<expr> <expr> <op> <expr> | const<op> / | -
<expr>
<expr> <expr>
<expr> <expr>
<expr>
<expr> <expr>
<expr> <expr>
<op>
<op>
<op>
<op>
const const const const const const- -/ /
<op>
23
Disambiguating the grammar
• If we use the parse tree to indicate precedence levels of the operators, we can remove the ambiguity.
• The following rules give / a higher precedence than -
<expr> <expr> - <term> | <term><term> <term> / const| const
<expr>
<expr> <term>
<term> <term>
const const
const/
-
24
Links to BNF-style grammars for actual programming languages
Below are some links to grammars for real programming languages. Look at how the grammars are expressed.
– http://www.schemers.org/Documents/Standards/R5RS/– http://www.sics.se/isl/sicstuswww/site/documentation.html
In the ones listed below, find the parts of the grammar that deal with operator precedence.
– http://java.sun.com/docs/books/jls/index.html– http://www.lykkenborg.no/java/grammar/JLS3.html– http://www.enseignement.polytechnique.fr/profs/informatique/Jean-Jacques.L
evy/poly/mainB/node23.html– http://www.lrz-muenchen.de/~bernhard/Pascal-EBNF.html
25
Associativity of operators• When multiple operators appear in an expression, we
need to know how to interpret the expression.
• Some operators (e.g. +) are associative, meaning that the meaning of an expression with multiple instances of the operator is the same no matter how it is interpreted:
(a+b)+c = a+(b+c)
• Some operators (e.g. -) are not associative:(a-b)-c a-(b-c) e.g. try a=10, b=8, c=6
(10-8)-6 = -4 but 10-(8-6)=8
26
Associativity of Operators• Operator associativity can also be indicated by a
grammar
<expr> -> <expr> + <expr> | const (ambiguous)<expr> -> <expr> + const | const (unambiguous)
<expr><expr>
<expr>
<expr> const
const
const
+
+
27
Links to BNF-style grammars for actual programming languages
Below are some links to grammars for real programming languages. Look at how the grammars are expressed.
– http://www.schemers.org/Documents/Standards/R5RS/– http://www.sics.se/isl/sicstuswww/site/documentation.html
In the ones listed below, find the parts of the grammar that deal with operator associativity.
– http://java.sun.com/docs/books/jls/index.html– http://www.lykkenborg.no/java/grammar/JLS3.html– http://www.enseignement.polytechnique.fr/profs/informatique/Jean-
Jacques.Levy/poly/mainB/node23.html– http://www.lrz-muenchen.de/~bernhard/Pascal-EBNF.html