Chapter 3
Uploaded by rosywalia71
04/08/23 1
Programming Languages
Language Translation Issues
Programming Language Syntax
Stages in Translation
Formal Translation Models
Programming Language Syntax
• Syntax is defined as the arrangement of words as elements in a sentence to show their relationship.
X=Y + Z
2+3x4
X=2.45 + 3.67
Semantics covers Declarations, Operations, Sequence Control & Referencing Environments.
General Syntactic Criteria
o The main purpose of syntax is to provide a notation for communication between the Programmer & the Programming Language Processor.
o Different Representation Criteria for a Data Type
Syntax Design should provide the following
Readability
Writeability
Ease of Verifiability
Ease of Translation
Lack Of Ambiguity
• Readability
– The underlying structure of the algorithm and data represented by the program is apparent from an inspection of the program text.
– Should be Self-Documenting.
– Support Natural Statement Formats, Structured Statements, Liberal use of Keywords & noise words, embedded comments, identifiers etc
– Good language should be supported by good Programming.
– Syntactic differences should reflect underlying semantic differences, so that program constructs that do similar things look similar and program constructs that do radically different things look different.
• Writeability
– Features that make a program easy to read normally conflict with features that make it easy to write.
– Implicit syntactic conventions that allow declarations and operations to be left unspecified make programs shorter to write but more difficult to read.
– Support Natural Statement Formats, Structured Statements, Liberal use of Keywords & noise words, embedded comments, identifiers etc
– When is a syntax redundant?
• When does redundancy help, and where does it degrade performance?
• Ease of Verification
• Ease of Translation
– Regularity of structure / complexity
– Should the syntax be easy on the translator or on the user?
• Lack of Ambiguity
– An ambiguous statement allows more than one interpretation.
Example:
1) if Boolean expression then statement1 else statement2
2) if Boolean expression then statement1
Combining the two forms is ambiguous, because the else can be paired with either if:
if Boolean expression1 then if Boolean expression2 then statement1 else statement2
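The two readings of the nested conditional can be written out explicitly. The following is a minimal sketch in Python, whose indentation forces one reading or the other; the function names are hypothetical and the returned strings stand in for the statements on the slide.

```python
def else_binds_to_inner_if(b1, b2):
    # Reading 1: if b1 then (if b2 then statement1 else statement2)
    if b1:
        if b2:
            return "statement1"
        else:
            return "statement2"
    return None  # nothing is executed


def else_binds_to_outer_if(b1, b2):
    # Reading 2: if b1 then (if b2 then statement1) else statement2
    if b1:
        if b2:
            return "statement1"
    else:
        return "statement2"
    return None  # nothing is executed


# The two readings disagree whenever b1 and b2 are not both true.
for b1 in (True, False):
    for b2 in (True, False):
        print(b1, b2, else_binds_to_inner_if(b1, b2), else_binds_to_outer_if(b1, b2))
```

Many languages remove the ambiguity by rule, pairing each else with the nearest unmatched if (reading 1).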
Syntactic Elements of a Language
Character Set
Identifiers
Operator Symbol
Keywords & Reserved Words-Difference between them.
Noise Words – e.g. the optional "to" in "go to"
Comments / Blanks (Spaces)
Delimiters and Brackets – e.g. of delimiters: begin, end etc.
Free & Fixed Field Formats
Expression
Statements (Structured /Simple)
Overall Program-Subprogram Structure
Separate Subprogram Definitions – Each subprogram definition is treated as a separate syntactic unit. Each subprogram is compiled separately and linked at load time.
Separate Data Definitions – Group together all operations that manipulate a given data object, e.g. classes in C++.
Nested Subprogram Definitions – Helps in a modular approach, but the concept is disappearing with the advent of Object-Oriented Programming.
Separate Interface Definitions – To pass data between two separately compiled components, additional data is needed. This is handled by a program specification component.
e.g. In C, the “.h” files form the specification component and the “.c” source files form the implementation component.
Data description separated from executable statements
Unseparated subprogram definitions
Stages in Translation
What is Translation ?
Logically, we may divide translation into two major parts – Analysis of the input source program & synthesis of the executable object program.
Translators are generally grouped according to the number of passes they make over the source program.
For fast compilation use a single pass; otherwise multiple passes can be used.
Analysis of a Source Program
Lexical Analysis (Scanning) – Groups the sequence of characters into lexemes and attaches a type tag to each.
Model used is Finite-State Automata.
Time consuming.
Lexeme – Number / Identifier / Delimiter / Operator
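The scanning step can be sketched as follows. This is a minimal illustration rather than a real compiler's scanner: the token classes follow the lexeme kinds listed above, while the exact character patterns and the names TOKEN_SPEC and scan are assumptions made for the example.

```python
import re

# Each lexeme class is described by a regular expression (assumed patterns).
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("IDENTIFIER", r"[A-Za-z][A-Za-z0-9]*"),
    ("OPERATOR", r"[+\-*/=]"),
    ("DELIMITER", r"[();,]"),
    ("SKIP", r"\s+"),  # blanks separate lexemes but produce no token
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def scan(source):
    """Group characters into lexemes and attach a type tag to each one."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"illegal character {source[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(scan("X=2.45 + 3.67"))
```

Each pair in the result is a type tag plus the lexeme it was attached to, which is the form the syntactic analyzer consumes.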
Structure of a Compiler
[Figure: Source Program -> Lexical Analysis -> (Lexical Tokens) -> Syntactic Analysis -> (Parse Tree) -> Semantic Analysis -> (Intermediate Code) -> Optimization -> (Optimized Intermediate Code) -> Code Generation -> Linking -> Exe code. The Symbol Table and other tables are shared by the phases; the analysis stages are labeled as the source program recognition phases.]
Stages in Translation
Syntactic Analysis (Parsing).
Identifies Statements, Declarations, Expressions etc.
Semantic Analysis
It is the central phase of translation
The structure of the executable object code begins to take shape.
Output is some internal form of the final executable program which is then manipulated by the optimization stage of the translator before executable code is actually generated.
Some common functions of Semantic Analyzer :
Symbol-Table Maintenance
Insertion of Implicit Information
Error Detection
Macro Processing
Stages in Translation
Synthesis of the Object Program
Optimization: Works on the intermediate code received from the semantic analyzer. This code contains a string of operators and operands, or a table of operator-operand sequences. From this, the code generator may generate the properly formatted output object code.
A=B+C+D
- may generate intermediate code such as:
- (a) Temp1 = B + C
- (b) Temp2 = Temp1 + D
- (c) A = Temp2
which translates, statement by statement, into:
1) Load register with B (from (a))
2) Add C to register
3) Store register in Temp1
4) Load register with Temp1 (from (b))
5) Add D to register
6) Store register in Temp2
7) Load register with Temp2 (from (c))
8) Store register in A
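The translation of A=B+C+D into temporaries and then into register instructions can be sketched as below. This is an illustrative sketch only: the function names three_address and naive_codegen are hypothetical, and only chains of additions are handled.

```python
def three_address(target, operands):
    """Split target = op1 + op2 + ... into one addition per temporary."""
    additions = []
    left = operands[0]
    for index, right in enumerate(operands[1:], start=1):
        temp = f"Temp{index}"
        additions.append((temp, left, right))  # temp = left + right
        left = temp
    return additions, (target, left)  # final copy: target = last temporary


def naive_codegen(additions, final_copy):
    """Translate each intermediate statement separately, with no optimization."""
    code = []
    for temp, left, right in additions:
        code.append(f"Load register with {left}")
        code.append(f"Add {right} to register")
        code.append(f"Store register in {temp}")
    target, source = final_copy
    code.append(f"Load register with {source}")
    code.append(f"Store register in {target}")
    return code


additions, final_copy = three_address("A", ["B", "C", "D"])
for number, instruction in enumerate(naive_codegen(additions, final_copy), start=1):
    print(f"{number}) {instruction}")
```

The output reproduces the eight instructions listed above. Storing a value and immediately reloading it (instructions 3 and 4, and again 6 and 7) is redundant, which is the kind of inefficiency the optimization stage is meant to remove.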
Stages in Translation
Code Generation
From the optimized code we must form the assembly language statements, machine code or other object program form that is to be the output of the translation.
Linking & Loading
Pieces of code from Separate Translations of subprograms are coalesced into the final executable program.
Bootstrapping
Often the translator for a new language is written in that language.
Diagnostic Compilers
Especially designed for Rapid Turnaround & Compilation time.
Closing Quiz
Main Purpose of Syntax is to provide a notation for communication between the _________ & Programming Language Processor
Write 1 point to illustrate the conflict between writeability & readability.
What are the syntactic elements of a Language?
Lexical Analysis produces ______________.
Write the various stages in Translation.
Logically, we may divide translation into two major parts – _________ of the input source program & synthesis of the __________ object program.
Formal Translation Models
The syntactic recognition parts of compiler theory are generally based on the theory of context-free languages.
The formal definition of the syntax of a programming language is usually called a grammar.
A grammar consists of a set of rules that specify the sequences of characters that form allowable programs in the language being defined.
A formal grammar is just a grammar specified using a strictly defined notation.
The two classes of grammars useful in compiler technology are
BNF Grammar
Regular Grammar
Formal Translation Models
BNF Grammars ( Backus- Naur Form)
Comparison with English
The girl / ran / home
It is a Context free Grammar.
A syntactically correct program only has to make sense syntactically; it need not make sense semantically.
A language is any set of (finite-length) character strings with characters chosen from some fixed set of symbols.
<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
The term <digit> is called a syntactic category or a nonterminal.
<conditional statement> ::= if <Boolean expression> then <statement> else <statement> | if <Boolean expression> then <statement>
BNF Grammar (Cont.)
<unsigned integer> ::= <digit> | <unsigned integer> <digit>
Examples, assuming that <identifier> and <number> have already been defined:
<assignment statement> ::= <variable> = <arithmetic expression>
<arithmetic expression> ::= <term> | <arithmetic expression> + <term> | <arithmetic expression> - <term>
<term> ::= <primary> | <term> x <primary> | <term> / <primary>
<primary> ::= <variable> | <number> | (<arithmetic expression>)
<variable> ::= <identifier> | <identifier>[<subscript list>]
<subscript list> ::= <arithmetic expression> | <subscript list>, <arithmetic expression>
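A grammar of this shape maps fairly directly onto a recursive-descent parser, with one method per syntactic category. The sketch below makes several assumptions for the sake of a short example: it writes multiplication as * (the letter x would clash with identifiers), it omits subscripted variables, the left-recursive rules are implemented as loops, and all names (tokenize, Parser, parse) are hypothetical.

```python
import re

TOKEN = re.compile(r"\s*(\d+(?:\.\d+)?|[A-Za-z][A-Za-z0-9]*|[-+*/()=])")


def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"illegal character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected=None):
        token = self.peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if expected is not None and token != expected:
            raise SyntaxError(f"expected {expected!r}, found {token!r}")
        self.pos += 1
        return token

    def assignment(self):  # <variable> = <arithmetic expression>
        name = self.take()
        if not name[0].isalpha():
            raise SyntaxError(f"expected an identifier, found {name!r}")
        self.take("=")
        tree = ("=", name, self.expression())
        if self.peek() is not None:
            raise SyntaxError(f"unexpected {self.peek()!r}")
        return tree

    def expression(self):  # <term> followed by any number of + <term> or - <term>
        tree = self.term()
        while self.peek() in ("+", "-"):
            tree = (self.take(), tree, self.term())
        return tree

    def term(self):  # <primary> followed by any number of * <primary> or / <primary>
        tree = self.primary()
        while self.peek() in ("*", "/"):
            tree = (self.take(), tree, self.primary())
        return tree

    def primary(self):  # identifier, number, or parenthesized expression
        token = self.take()
        if token == "(":
            tree = self.expression()
            self.take(")")
            return tree
        if token[0].isalnum():
            return token
        raise SyntaxError(f"unexpected {token!r}")


def parse(text):
    return Parser(tokenize(text)).assignment()


print(parse("A = B + C * D"))
```

Because <term> sits below <arithmetic expression> in the grammar, multiplication binds more tightly than addition in the resulting tree, and the loops build left-associative trees just as the left-recursive rules would.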
Parse Trees
We can use a single-replacement rule to generate strings in our language.
S -> SS | (S) | ( )
S => (S) => (SS) => (( )S) => (( )( ))
Each term in the derivation is called a sentential form.
The use of a Formal Grammar to define the syntax of a programming language is important both for the Language User & Language Implementer.
To determine if a given string represents a syntactically valid program in the Language, we must use the grammar rules to construct a syntactic analysis or parse of the String. If the String can be successfully parsed, then it is in the language.
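The derivation on this slide can be replayed mechanically by always replacing the leftmost S. The sketch below assumes the grammar S -> SS | (S) | ( ), written without the space inside the parentheses; the name leftmost_derivation is hypothetical.

```python
RULES = ("SS", "(S)", "()")  # the three right-hand sides for S


def leftmost_derivation(choices, start="S"):
    """Apply the chosen right-hand sides to the leftmost S, one per step.

    Returns the list of sentential forms, beginning with the start symbol.
    """
    forms = [start]
    current = start
    for rhs in choices:
        if rhs not in RULES:
            raise ValueError(f"{rhs!r} is not a right-hand side of the grammar")
        if "S" not in current:
            raise ValueError("no nonterminal left to replace")
        current = current.replace("S", rhs, 1)  # replace only the leftmost S
        forms.append(current)
    return forms


print(" => ".join(leftmost_derivation(["(S)", "SS", "()", "()"])))
```

The last sentential form contains no S, so it is a string of the language; parsing is the reverse problem of finding such a sequence of rule choices for a given string.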
BNF Grammar (Cont.)
Restrictions that cannot be expressed in a BNF grammar:
The same identifier may not be declared twice in the same block.
Every identifier must be declared in some block enclosing the point of its use.
An array declared to have two dimensions cannot be referenced with three subscripts.
Ambiguity
They / are / flying planes.
They / are flying / planes.
BNF Grammar (Cont.)
Ambiguity
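G1 : S -> SS | 0 | 1 (ambiguous)
G2 : T -> 0T | 1T | 0 | 1 (unambiguous)
Both grammars generate the same language, but G1 allows more than one parse tree for some strings.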
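One way to see the difference between the two grammars is to count the parse trees each one admits for a given string. The sketch below is an illustration under that framing; the function names count_g1 and count_g2 are hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_g1(text):
    """Number of parse trees for text under G1: S -> SS | 0 | 1."""
    if len(text) == 1:
        return 1 if text in "01" else 0
    # S -> SS: try every way of splitting the string into two non-empty parts.
    return sum(count_g1(text[:i]) * count_g1(text[i:]) for i in range(1, len(text)))


def count_g2(text):
    """Number of parse trees for text under G2: T -> 0T | 1T | 0 | 1."""
    if not text or text[0] not in "01":
        return 0
    if len(text) == 1:
        return 1
    # T -> 0T | 1T: the first symbol fixes the rule, so there is no choice.
    return count_g2(text[1:])


for sample in ("0", "01", "010", "0101"):
    print(sample, count_g1(sample), count_g2(sample))
```

Any string with three or more symbols has several parse trees under G1 but exactly one under G2.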
Extension of BNF Notation
The primary reason for extending BNF is that it forces a rather unnatural representation of three common syntactic constructs within a grammar rule: optional elements, alternative elements & repeated elements.
Example: A signed integer is a sequence of digits preceded by an optional plus or minus.
<signed integer> ::= + <integer> | - <integer> | <integer>
<integer> ::= <digit> | <integer> <digit>
In Extended BNF it would be written as:
Signed integer: <signed integer> ::= [+|-] <digit> {<digit>}*
Identifier: <identifier> ::= <letter> {<letter> | <digit>}*
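The optional and repeated elements of Extended BNF correspond closely to the ? and * operators of regular expressions. The sketch below restates the two rules that way; it assumes <letter> means an ASCII letter and <digit> a decimal digit, and the helper names are hypothetical.

```python
import re

# [+|-] <digit> {<digit>}*   ->  optional sign, one digit, then zero or more digits
SIGNED_INTEGER = re.compile(r"[+-]?[0-9][0-9]*")

# <letter> {<letter> | <digit>}*  ->  one letter, then zero or more letters or digits
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9]*")


def is_signed_integer(text):
    return SIGNED_INTEGER.fullmatch(text) is not None


def is_identifier(text):
    return IDENTIFIER.fullmatch(text) is not None


print(is_signed_integer("-42"), is_signed_integer("4-2"))
print(is_identifier("Temp1"), is_identifier("1Temp"))
```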
Extension of BNF Notation
Syntax Charts (also called a Railroad Diagram)
[Figure: syntax charts for Assignment Statement (Variable = Arithmetic expression) and Arithmetic Expression (Term, repeated with + or -)]
Finite State Automata
Tokens for a Programming Language have simple structures.
An identifier begins with a letter; as long as successive characters are letters or digits, they become part of the identifier's name.
The reserved word “if” is just the letter i followed by the letter f.
This simple model is called a Finite State Automaton (FSA) or a State Machine.
[Figure: a two-state automaton with states A and B]
Any string that takes the machine from the initial state to a final state through a series of transitions is accepted by the machine.
Input     Current State   Accept String?
Null      A               No
1         B               Yes
10        B               Yes
100       B               Yes
1001      A               No
10010     A               No
100101    B               Yes
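The table can be reproduced by simulating the machine. In the sketch below the four transitions are read off the rows of the table (a 0 leaves the state unchanged, a 1 switches between A and B), with A as the initial state and B as the only final state; the names TRANSITIONS, run and accepts are hypothetical.

```python
TRANSITIONS = {
    ("A", "0"): "A",
    ("A", "1"): "B",
    ("B", "0"): "B",
    ("B", "1"): "A",
}
START_STATE = "A"
FINAL_STATES = {"B"}


def run(text):
    """Return the state reached after reading every symbol of text."""
    state = START_STATE
    for symbol in text:
        state = TRANSITIONS[(state, symbol)]
    return state


def accepts(text):
    return run(text) in FINAL_STATES


for sample in ("", "1", "10", "100", "1001", "10010", "100101"):
    print(repr(sample), run(sample), "Yes" if accepts(sample) else "No")
```

With these transitions the machine accepts exactly the strings that contain an odd number of 1s.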
Deterministic & Non-Deterministic Finite Automata
Deterministic – For each state of the FSA and each input symbol, there is a unique transition to the same or a different state. If there are “n” states and “k” symbols, then the FSA will have n x k transitions.
Non-Deterministic – It is an FSA with:
A set of states.
A start state.
A set of final states.
An input alphabet.
A set of arcs from nodes to nodes, each labeled by an element of the input alphabet. Unlike the deterministic case, more than one arc leaving a node may carry the same label.
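A non-deterministic FSA can be simulated by tracking the set of all states it could currently be in. The sketch below uses a made-up three-state machine (states q0, q1, q2, accepting binary strings that end in 01) purely as an example; the machine and the name nfa_accepts are assumptions, not taken from the slides.

```python
# Arcs of a hypothetical NFA: state q0 has two arcs labeled "0".
ARCS = {
    ("q0", "0"): {"q0", "q1"},
    ("q0", "1"): {"q0"},
    ("q1", "1"): {"q2"},
}
START_STATE = "q0"
FINAL_STATES = {"q2"}


def nfa_accepts(text, arcs=ARCS, start=START_STATE, finals=FINAL_STATES):
    """Accept if at least one path through the arcs ends in a final state."""
    current = {start}
    for symbol in text:
        # Follow every arc labeled with this symbol from every possible state.
        current = {
            target
            for state in current
            for target in arcs.get((state, symbol), ())
        }
    return bool(current & finals)


for sample in ("01", "1101", "10", ""):
    print(repr(sample), nfa_accepts(sample))
```

Since the number of possible state sets is finite, this simulation is itself a deterministic machine, which is why non-determinism does not add recognizing power to an FSA.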
Computational Power of an FSA
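They have a finite set of states.
The language a^n b^n (n a's followed by n b's) will not be recognized by an FSA, because recognizing it requires counting an unbounded n.
Only when the information to remember is finite, for example under a bound n <= k, can an FSA recognize it.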
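The effect of the bound can be illustrated with a recognizer whose state is drawn from a fixed, finite set once k is chosen. The sketch below assumes n >= 1 and represents each state as a (phase, count) pair with count never exceeding k; the name accepts_bounded_anbn is hypothetical.

```python
def accepts_bounded_anbn(text, k):
    """Recognize a^n b^n for 1 <= n <= k using at most 2k + 1 distinct states.

    A state is ("a", i) after reading i a's, or ("b", j) when j b's are still owed.
    """
    state = ("a", 0)
    for symbol in text:
        phase, count = state
        if phase == "a" and symbol == "a" and count < k:
            state = ("a", count + 1)
        elif symbol == "b" and count >= 1:
            state = ("b", count - 1)  # works from either phase
        else:
            return False  # no transition: the string is rejected
    return state == ("b", 0)


for sample in ("ab", "aabb", "aaabbb", "aab", "abb"):
    print(sample, accepts_bounded_anbn(sample, k=2))
```

Raising k adds more states, so no single finite machine covers every n; that is the limitation described above.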
Closing Quiz
<term>::=<______> |<term> x <_______> | <term> / <________>
Ambiguity occurs when ______________________.
Syntax charts are also called ________________.
What is the meaning of this symbol in terms of FSA
FSA stands for ______________.
Difference between Non-deterministic FSA & Deterministic FSA