Language processor implementation using python

Post on 19-May-2015


Language Processor Implementation using Python

Implement parser, syntax analyzer, semantic analyzer for Pascal language

Show main approaches to implementation of semantic analysis as well as intermediary code generation

Parameterize language processor

Objective

I investigated the strong and weak points of several programming languages: Java, Perl, Python, C++, Delphi. As a result, I chose Python for several reasons.

Choosing Programming Language

High-level programming language

Supports the object-oriented paradigm

Convenient data types

Relatively fast due to C-based libraries

Easy-to-read syntax

Cross-platform

Convenient tools for parsing YAML

Supports regular expressions out of the box

Python Benefits

Entry -> Parser -> Syntax -> Semantics -> Code gen -> Exit

In-memory tables

Language Processor Work Scheme

Identifier: ^[A-Za-z][A-Za-z_0-9]{0,255}$
Integer Const: ^[+-]?\d{1,10}$
Float Const: ^([+-]?((\d+\.\d+)|(\d+\.\d+e[+-]\d+)))$
String Literal: ^'.{0,65535}'$

Parser. Tokens
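The token patterns above can be tried out in isolation. The following is a minimal sketch, not the processor's actual code: the class names and the classify helper are assumptions made for illustration, while the regular expressions are copied from the slide.

```python
import re

# Patterns copied from the slide; the class names on the left are illustrative.
TOKEN_PATTERNS = [
    ("identifier", re.compile(r"^[A-Za-z][A-Za-z_0-9]{0,255}$")),
    ("integer_const", re.compile(r"^[+-]?\d{1,10}$")),
    ("float_const", re.compile(r"^([+-]?((\d+\.\d+)|(\d+\.\d+e[+-]\d+)))$")),
    ("string_literal", re.compile(r"^'.{0,65535}'$")),
]


def classify(word):
    """Return the name of the first pattern that matches word, or None."""
    for name, pattern in TOKEN_PATTERNS:
        if pattern.match(word):
            return name
    return None
```

For example, classify("counter_1") gives "identifier" and classify("1.5e+10") gives "float_const", while a word such as "1abc" matches no pattern and yields None.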

def getClass(self, word):
    c = None
    # Keywords, delimiters and operators are their own token class.
    if ((word in self.KeyWords) or (word in metadata["delimiters"])
            or (word in metadata["double"])
            or (word in metadata["conditional_delimiters"])
            or (word in metadata["multiplicative"])
            or (word in metadata["additive"])
            or (word in metadata["Relation"])):
        c = word
    else:
        # Otherwise try every token regular expression.
        for r in self.RegExp.keys():
            if re.compile(r).match(word):
                c = self.RegExp[r]
    return c

Algorithm for Analyzing Lexemes

Separate class

Design pattern Singleton

Uses hash table as internal structure

Fast access

Convenient format

Attribute Table
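A singleton class backed by a hash table, as listed above, could look roughly like this. It is a sketch under assumptions: the class name AttributeTable and the methods declare and lookup are hypothetical and are not taken from the original implementation.

```python
class AttributeTable:
    """Singleton wrapper around a dict keyed by identifier name (hypothetical API)."""

    _instance = None

    def __new__(cls):
        # Create the single shared instance on first use, then keep returning it.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._rows = {}
        return cls._instance

    def declare(self, name, **attributes):
        """Store the attributes of a newly declared identifier."""
        if name in self._rows:
            raise KeyError(f"identifier {name!r} already declared")
        self._rows[name] = dict(attributes)

    def lookup(self, name):
        """Return the attributes of name, or None if it was never declared."""
        return self._rows.get(name)


table = AttributeTable()
table.declare("a", type="integer")
same_table = AttributeTable()  # the same object, so it sees the entry for "a"
```

Because every stage obtains the same instance, the parser, the semantic analyzer and the code generator can share declarations without passing the table around explicitly.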

Attribute Table Look

Context free grammar

Left associated grammar

EBNF

Configuration format is YAML

Language grammar is easily changed without source code modification

Syntax Analyzer

Short and readable

YAML uses data structures that are native to programming languages like Perl and Python

YAML Configuration Format Benefits

EBNF rule:
program ::= Program ID ; block .

EBNF rule in configuration:
programme:
  - [program, id, ;, Block, "."]

Configuration Format of Language Grammar
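To illustrate why a grammar kept in configuration can change without touching the source, here is a minimal sketch of a matcher driven entirely by grammar data. It assumes the YAML file has already been loaded into a plain dict (for instance with PyYAML's yaml.safe_load); the dict literal, the rule names and the match function are illustrative and not the original code. The sketch does not handle left-recursive rules.

```python
# Assumed result of loading the YAML grammar: rule name -> list of alternatives.
GRAMMAR = {
    "programme": [["program", "id", ";", "block", "."]],
    "block": [["begin", "end"]],  # hypothetical, heavily simplified rule
}


def match(symbol, tokens, pos, grammar):
    """Return the position after matching symbol at tokens[pos], or None."""
    if symbol not in grammar:
        # Terminal: must equal the current token class.
        if pos < len(tokens) and tokens[pos] == symbol:
            return pos + 1
        return None
    for alternative in grammar[symbol]:
        current = pos
        for part in alternative:
            current = match(part, tokens, current, grammar)
            if current is None:
                break  # this alternative failed, try the next one
        else:
            return current
    return None


tokens = ["program", "id", ";", "begin", "end", "."]
end_position = match("programme", tokens, 0, GRAMMAR)
```

Changing the language then means editing the grammar data only; the match function stays the same.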

Rule without semantic actions:
complex_action:
  - [begin, action_list, end]

Rule with semantic actions:
complex_action:
  - [begin, "#200", action_list, end, "#220"]

Semantic Analyzer. Semantic Actions
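One way to read the rule with semantic actions is that every symbol starting with "#" names a routine to call when the parser reaches that point in the rule. The sketch below works under that assumption. What actions #200 and #220 really do is not stated in the slides, so the handlers here are placeholders, and the action_list rule and the parse function are hypothetical.

```python
def parse(symbol, tokens, pos, grammar, actions):
    """Walk a rule, consuming tokens and firing '#NNN' actions; return the new position."""
    if symbol.startswith("#"):
        actions[symbol]()  # semantic action: consumes no input
        return pos
    if symbol in grammar:
        # Sketch only: always takes the first alternative, no backtracking.
        for part in grammar[symbol][0]:
            pos = parse(part, tokens, pos, grammar, actions)
        return pos
    if pos < len(tokens) and tokens[pos] == symbol:
        return pos + 1
    raise SyntaxError(f"expected {symbol!r} at position {pos}")


log = []
ACTIONS = {
    "#200": lambda: log.append("enter block"),  # placeholder meaning
    "#220": lambda: log.append("leave block"),  # placeholder meaning
}
GRAMMAR = {
    "complex_action": [["begin", "#200", "action_list", "end", "#220"]],
    "action_list": [["action", ";"]],  # hypothetical, simplified rule
}
end_position = parse("complex_action", ["begin", "action", ";", "end"], 0, GRAMMAR, ACTIONS)
```

Keeping action codes inside the rule lets the syntax analyzer stay generic, while the semantic analyzer only supplies the table of handlers.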

To make code generation easier, I created a set of attribute classes such as AttrFor.

class AttrFor(AttrObject):
    def __init__(self):
        self.parameter = None
        self.first = None
        self.last = None
        self.step = None
        self.body = None

Attribute classes

Abstract Parse Tree

We use a tetrad language to generate intermediary code.

How the tetrad language looks:
Z := X op Y
Z := op X
Z := Y
Z := Y[X]
Z:
GOTO Z
If condition GOTO Z

Intermediary code generation
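The first tetrad form, Z := X op Y, can be produced from a nested expression by a short recursive walk. This is a sketch and not the original generator: the tuple-based tree, the gen_expr function and the @t temporary names are assumptions made for illustration.

```python
import itertools


def gen_expr(node, code, temps):
    """Emit tetrads for a nested (op, left, right) tuple; return the name holding the result."""
    if isinstance(node, str):
        return node  # a variable or constant needs no code
    op, left, right = node
    x = gen_expr(left, code, temps)
    y = gen_expr(right, code, temps)
    z = f"@t{next(temps)}"  # fresh temporary, hypothetical naming scheme
    code.append(f"{z} := {x} {op} {y}")
    return z


code = []
# (2 + 2 * 2) * 2 written as a nested tuple
result = gen_expr(("*", ("+", "2", ("*", "2", "2")), "2"), code, itertools.count(1))
```

Here code ends up holding three tetrads, "@t1 := 2 * 2", "@t2 := 2 + @t1" and "@t3 := @t2 * 2", and result is "@t3".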

program q;
var
  a, b: integer;
  i: integer;
  d: integer;
begin
  d := 4;
  for i := 1 to (2+2*2)*2 do
  begin
    b := b + 1;
    a := a * 2;
  end;
  d := a;
end.

Example of input file

d:=4
i:=1
@Lid1:
if i > 12 goto @Lid2
b:=b + 1
a:=a * 2
i := i + 1
goto @Lid1
@Lid2:
d:=a

Output of language processor using intermediary code
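The loop part of this output follows a fixed pattern: initialize the loop variable, test it against the upper bound, run the body, increment, and jump back. A minimal sketch of that lowering is shown here, assuming a step of 1 and an upper bound that has already been folded to a constant; the lower_for function is hypothetical, and only the @Lid label format is taken from the output above.

```python
import itertools


def lower_for(parameter, first, last, body, label_ids):
    """Return tetrads for 'for parameter := first to last do body' with step 1."""
    start = f"@Lid{next(label_ids)}"
    end = f"@Lid{next(label_ids)}"
    return [
        f"{parameter}:={first}",
        f"{start}:",
        f"if {parameter} > {last} goto {end}",  # leave the loop once past the bound
        *body,
        f"{parameter} := {parameter} + 1",
        f"goto {start}",
        f"{end}:",
    ]


loop_code = lower_for("i", "1", "12", ["b:=b + 1", "a:=a * 2"], itertools.count(1))
```

The arguments mirror the fields of AttrFor (parameter, first, last, body), which suggests how an attribute object can carry everything the generator needs for one construct.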

I implemented a parser, syntax analyzer, semantic analyzer, and intermediary code generation for the Pascal programming language

I showed main concepts of semantic analysis as well as intermediary code generation

Language processor has been parameterized

Bottom line