Introduction Lex
-
Upload
popescu-adina -
Category
Documents
-
view
223 -
download
0
Transcript of Introduction Lex
-
8/13/2019 Introduction Lex
1/21
Prin
ciplesofCompilers
-2/03/2006
PabloR.
Fillottrani
FREEUNIVERSITY OF
BOZENBOLZANOFaculty of
Computer Science
Introduction to Lex
General Description Input file Output file How matching is done Regular expressions Local names Using Lex
-
8/13/2019 Introduction Lex
2/21
Prin
ciplesofCompilers
-2/03/2006
PabloR.
Fillottrani
FREEUNIVERSITY OF
BOZENBOLZANOFaculty of
Computer Science
General Description
Lex is a program that automaticallygenerates code for scanners.
Input: a description of the tokens in theform of regular expressions, together
with the actions to be taken when eachexpression is matched.
Output: a text file with C source codedefining a procedure yylex()that is a tableimplementing the DFA for the regular
expressions.
-
8/13/2019 Introduction Lex
3/21
Prin
ciplesofCompilers
-2/03/2006
PabloR.
Fillottrani
FREEUNIVERSITY OF
BOZENBOLZANOFaculty of
Computer Science
Input File
Lex input file is divided in threeparts
/* declarations*/
%%
/* rules */
%%
/* auxiliary functions*/
-
8/13/2019 Introduction Lex
4/21
Prin
ciplesofCompilers
-2/03/2006
PabloR.
Fillottrani
FREEUNIVERSITY OF
BOZENBOLZANOFaculty of
Computer Science
Input File Declaration
The declaration part includes theassignment of names to regularexpressions in the form:
It can also include C code external to thedefinition of yylex()within %{and %}inthe first column.
Also, it is possible to specify someoptions with the sintax %option
-
8/13/2019 Introduction Lex
5/21
Prin
ciplesofCompilers
-2/03/2006
PabloR.
Fillottrani
FREEUNIVERSITY OF
BOZENBOLZANOFaculty of
Computer Science
Input File - Rules
The rules part specifies what to dowhen a regular expression ismatched
Actions are normal C sentences(can be a complex C sentencebetween {}).
%{ %} enclosed text appearingbefore the first rule may be used todeclare local variables.
-
8/13/2019 Introduction Lex
6/21
Prin
ciplesofCompilers
-2/03/2006
Pab
loR.
Fillottrani
FREEUNIVERSITY OF
BOZENBOLZANOFaculty of
Computer Science
Input File Aux Functions
The auxiliary functions part is onlyC code.
It includes function definitions forevery function needed in the rulepart
It can also contain the main()function if the scanner is going tobe used as a standalone program.
The main()function must call thefunction yylex()
-
8/13/2019 Introduction Lex
7/21
Prin
ciplesofCompilers
-2/03/2006
Pab
loR.
Fillottrani
FREEUNIVERSITY OF
BOZENBOLZANOFaculty of
Computer Science
Code Generated by Lex
The output of Lex is a file calledlex.yy.c
It is a C function that returns aninteger, i.e., a code for a Token
If it contains the main()functiondefinition, it must be compiled torun.
Otherwise, the code can be anexternal function declaration for thefunctionint yylex()
-
8/13/2019 Introduction Lex
8/21
Prin
ciplesofCompilers
-2/03/2006
Pab
loR.
Fillottrani
FREEUNIVERSITY OF
BOZENBOLZANOFaculty of
Computer Science
How matching is done
By running the generated scanner, itanalyses its input looking for strings thatmatch any of its patterns, and thenexecutes the action.
If more than one match is found, it selectsthe regular expression matching thelongest string.
If it finds two or more matches of the samelength, the one listed first is selected.
If no match is found, then the default ruleis executed: i.e., the next character in theinput is copied to the output.
-
8/13/2019 Introduction Lex
9/21
Prin
ciplesofCompilers-2/03/2006
Pab
loR.
Fillottrani
FREEUNIVERSITY OF
BOZENBOLZANOFaculty of
Computer Science
Regular expressions ::= a the character a ::= sthe string s, even if s contains metacharacters
::=\a the character a when a is a metacharacter (e.g., *) ::= . any character except newline ::=[+] any of the character ::= [-]any character from to
::= [^+] any character except those ::= * zero or more repetitions of ::= + one or more repetitions of ::= ? zero or one repetitions of ::= | or ::= followed by
::= ()same as ::= {} the named regular expression in the
definitions part
-
8/13/2019 Introduction Lex
10/21
Prin
ciplesofCompilers-2/03/2006
Pab
loR.
Fillottrani
FREEUNIVERSITY OF
BOZENBOLZANOFaculty of
Computer Science
Internal names
The rules, inside the actiondefinition, can refer to the followingvariables: yytext, the string being matched
(lexeme)
yyin, the input file yyout, the output file ECHO, the default rule action yyval, the global variable for
communicating the Attribute for aToken to the Parser
-
8/13/2019 Introduction Lex
11/21
Prin
ciplesofCompilers-2/03/2006
Pab
loR.
Fillottrani
FREEUNIVERSITY OF
BOZENBOLZANOFaculty of
Computer Science
Using Lex
There are several lex versions.We're going to use flex.
In order to maximize compatibility,use -loption when compiling, and%option noyywrapin the definitionpart of the Lex input file.
-
8/13/2019 Introduction Lex
12/21
PrinciplesofCompilers-2/03/2006
Pab
loR.
Fillottrani
FREEUNIVERSITY OF
BOZENBOLZANOFaculty of
Computer Science
Example 1
%{#include %}
%%[0-9]+ { printf("%s\n", yytext); }.|\n ;
%%
main(){yylex();
}
-
8/13/2019 Introduction Lex
13/21
PrinciplesofCompilers-2/03/2006
Pab
loR.
Fillottrani
FREEUNIVERSITY OF
BOZENBOLZANOFaculty of
Computer Science
Example 2
%{int c=0, w=0, l=0;
%}word [^ \t\n]+
eol \n
%%{word} {w++; c+=yyleng;};
{eol} {c++; l++;}. {c++;}
%%
main(){yylex();printf("%d %d %d\n", l, w, c);
}
-
8/13/2019 Introduction Lex
14/21
PrinciplesofCompilers-2/03/2006
Pab
loR.
Fillottrani
FREEUNIVERSITY OF
BOZENBOLZANOFaculty of
Computer Science
Example 3
%{int tokenCount=0;
%}
%%[a-zA-Z]+ { printf("%d WORD \"%s\"\n",
++tokenCount, yytext); }[0-9]+ { printf("%d NUMBER \"%s\"\n",
++tokenCount, yytext); }[^a-zA-Z0-9]+ { printf("%d OTHER \"%s\"\n",
++tokenCount, yytext); }
%%main() { yylex(); }
-
8/13/2019 Introduction Lex
15/21
PrinciplesofCompilers-2/03/2006
Pab
loR.
Fillottrani
FREEUNIVERSITY OF
BOZENBOLZANOFaculty of
Computer Science
Example 4
%{#include
int lineno =1;
%}
line .*\n
%%
{line} {printf("%5d %s", lineno++, yytext); }
%%int main() {
yylex(); return 0; }
-
8/13/2019 Introduction Lex
16/21
PrinciplesofCompilers-2/03/2006
Pab
loR.
Fillottrani
FREEUNIVERSITY OF
BOZENBOLZANOFaculty of
Computer Science
Example 5
%{#include
%}
comment_line \/\/.*\n
%%{comment_line} { printf("%s\n", yytext); }
.*\n ;%%
int main() {
yylex(); return 0;
}
-
8/13/2019 Introduction Lex
17/21
PrinciplesofCompilers-2/03/2006
Pab
loR.
Fillottrani
FREEUNIVERSITY OF
BOZENBOLZANOFaculty of
Computer Science
Example 7
%{
#include
%}digit [0-9]
number {digit}+%%
{number} { int n = atoi(yytext);printf("%x", n); }
. {;}%%
int main() {
yylex(); return 0;}
-
8/13/2019 Introduction Lex
18/21
PrinciplesofCompilers-2/03/2006
Pab
loR.
Fillottrani
FREEUNIVERSITY OF
BOZENBOLZANOFaculty of
Computer Science
Example 8 (1/4)%{
#include "globals.h"
#include "util.h"
#include "scan.h"int lineno=0;FILE *listing; // used to output source code listing
FILE *code; // used to output assembly codeFILE *source; // used to input tiny program source
code
int TraceScan = 1;int EchoSource = 1;
/* lexeme of identifier or reserved word */
char tokenString[MAXTOKENLEN+1];
%}
digit [0-9]number {digit}+
letter [a-zA-Z]
identifier {letter}+
newline \nwhitespace [ \t]+
-
8/13/2019 Introduction Lex
19/21
PrinciplesofCompilers-2/03/2006
Pab
loR.
Fillottrani
FREEUNIVERSITY OF
BOZENBOLZANOFaculty of
Computer Science
Example 8 (2/4)%%"if" {return IF;}
"then" {return THEN;}"else" {return ELSE;}"end" {return END;}
"repeat" {return REPEAT;}"until" {return UNTIL;}
"read" {return READ;}"write" {return WRITE;}
":=" {return ASSIGN;}"=" {return EQ;}"
-
8/13/2019 Introduction Lex
20/21
PrinciplesofCompilers-2/03/2006
Pab
loR.
Fillottrani
FREEUNIVERSITY OF
BOZENBOLZANOFaculty of
Computer Science
Example 8 (3/4)
{number} {return NUM;}{identifier} {return ID;}
{newline} {lineno++;}
{whitespace} {/* skip whitespace */}
"{" { char c;do{ c = input();
if (c == '\n') lineno++;
} while (c != '}');}
-
8/13/2019 Introduction Lex
21/21
PrinciplesofCompilers-2/03/2006
Pab
loR.
Fillottrani
FREEUNIVERSITY OF
BOZENBOLZANOFaculty of
Computer Science
Example 8 (4/4)%%TokenType getToken(void)
{ static int firstTime = TRUE;TokenType currentToken;
if (firstTime)
{ firstTime = FALSE;lineno++;
yyin = source=stdin;
yyout = listing=stdout; }
currentToken = (TokenType)yylex();strncpy(tokenString,yytext,MAXTOKENLEN);if (TraceScan) {
fprintf(listing,"\t%d: ",lineno);
printToken(currentToken,tokenString);
}
return currentToken; }
int main(){
TraceScan = TRUE;
while( getToken() != ENDFILE);return 0; }