Programming in Perl regular expressions and m,s operators
Peter Verhás, January 2002
Pattern Matching Operator
expression =~ m/regexp/options;
$a = "apple";
print "yes!" if $a =~ m/pp/;
The result is true (1) or false (the empty string, which counts as 0 in numeric context).
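A minimal sketch of what the match returns in scalar context. The variable names are illustrative and not taken from the slides.

```perl
use strict;
use warnings;

my $fruit = "apple";

# In scalar context a successful match yields 1 ...
my $hit = ($fruit =~ m/pp/);

# ... and a failed match yields the empty string, which is 0 as a number.
my $miss = ($fruit =~ m/xx/);

print "hit='$hit' miss='$miss' miss as number=", $miss + 0, "\n";

# !~ is the negated form of =~
print "no xx in $fruit\n" if $fruit !~ m/xx/;
```

Because the false value prints as nothing at all, wrapping it in quotes as above makes it visible when debugging.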
M operator options
• g global search
• i case insensitive search
• m multi-line string
• s single line string
• o evaluate once only
• x extended regular expression
Now let's see what a regular expression is; then we will return to the fine points of the m operator.
Regular Expressions
• A regular expression is a string pattern made of ordinary characters plus joker (wildcard) characters and joker expressions.
• We will look at examples to explain it.
Regular Expression to Verify Email (1)
@mail = ( '[email protected]', 'hab.akukk%mikkamakka@jeno', );
for( @mail ){
    if( /^.*\@\w+\..+$/ ){
        print "$_ seems to be a good eMail\n";
    }else{
        print "$_ bad address\n";
    }
}
OUTPUT:
[email protected] seems to be a good eMail
hab.akukk%mikkamakka@jeno bad address
NOTES:
• $_ is used as the default
• m/ is the default when / is used: $_ =~ m/^.*@\w+\..+$/
• @ would also work instead of \@, but \@ is safe
Regular Expression to Verify Email (2)
/^.*\@\w+\..+$/
• ^ at the start of the string
• .* zero or more of any character
– * means zero or more of what stands before it
• \@ a single @ character
• \w+ one or more word characters (letters, digits, underscore)
– + means one or more of what stands before it
• \. one . (dot) character
– a special regexp character is escaped with \
• .+ one or more of any character
• $ until the end of the string
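The pattern explained above can be tried on a few inputs to see how loose it is. This is only a sketch: the sample addresses are hypothetical, not the ones from the slide.

```perl
use strict;
use warnings;

# Hypothetical sample addresses (not from the slides).
my @samples = ('user@example.com', 'user@localhost', '@x.y');

my %verdict;
for my $addr (@samples) {
    # Same pattern as on the slide:
    # anything, '@', word characters, a dot, then anything.
    $verdict{$addr} = ($addr =~ /^.*\@\w+\..+$/) ? 'accepted' : 'rejected';
    print "$addr is $verdict{$addr}\n";
}
```

The last sample is accepted even though nothing stands before the @, so the pattern is best read as a rough plausibility check rather than a real address validator.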
Search and Replace Example of Regular Expressions
$text = 'JavaScript is not used on island Java.';
$text =~ s/Java(?!Script)/Borneo/;
print $text;
OUTPUT:
JavaScript is not used on island Borneo.
NOTES:
• Operator s will be discussed later in detail
• (?! ) is a zero-length negative look forward, detailed later
Meta (joker) Characters
• . any character but new line
• ^ start of string
• $ end of string
• \ escaping the next character
• \w any word character (letter, digit or underscore)
• \W any non-word character
• \s any white space
• \S any non-white space
These are only examples; there are other meta characters, see the Perl manual.
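A short sketch of the meta characters listed above at work. The input line and variable names are made up for illustration.

```perl
use strict;
use warnings;

my $line = "width = 42\tpx";

# \w+ picks up the words, \s* skips optional white space around '='.
my ($key, $value) = $line =~ /^(\w+)\s*=\s*(\w+)/;

# \W+ matches each run of non-word characters, so it works as a separator.
my @words = split /\W+/, $line;

# Count the white space characters (two spaces and one tab).
my $blanks = () = $line =~ /\s/g;

print "key=$key value=$value words=@words blanks=$blanks\n";
```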
Parentheses (1)
$text = 'Hook is not used on island Java.';
$text =~ /(Ho(ok))\s(is?).*\3((l|s)(a|l))/;
print "$1 $2 $3 $4 $5 $6\n";
#
$text = 'Hook i not used on island Java.';
$text =~ /(Ho(ok))\s(is?).*\3((l|s)(a|l))/;
print "$1 $2 $3 $4 $5 $6\n";
OUTPUT:
Hook ok is la l a
Hook ok i sl s l
NOTES:
• Numbering is in the order of the opening parentheses
• \3 inside the pattern matches again whatever the third pair of parentheses matched
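The numbering rule can be seen on a smaller pattern. This is a sketch with invented sample strings; only the numbering rule and the \1 back reference come from the slides.

```perl
use strict;
use warnings;

my $stamp = '2002-01-15';
my @parts;
if ($stamp =~ /((\d{4})-(\d\d))-(\d\d)/) {
    # Groups are numbered by their opening parenthesis, left to right,
    # so the outer group is $1 and the two groups nested in it are $2 and $3.
    @parts = ($1, $2, $3, $4);
    print "1=$1 2=$2 3=$3 4=$4\n";
}

# \1 inside the pattern repeats whatever group 1 matched:
# here it finds a word that is written twice in a row.
my ($doubled) = 'this is is a test' =~ /\b(\w+)\s+\1\b/;
print "doubled word: $doubled\n";
```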
Parentheses without $n
$text = 'Hook is not used on island Java.';
$text =~ /(Ho(ok))\s(is?).*\3((?:l|s)(a|l))/;
print "$1 $2 $3 $4 $5 .$6.\n";
$text = 'Hook i not used on island Java.';
$text =~ /(Ho(ok))\s(is?).*\3((?:l|s)(a|l))/;
print "$1 $2 $3 $4 $5 .$6.\n";
OUTPUT:
Hook ok is la a ..
Hook ok i sl l ..
NOTES:
• (?: ) groups a sub-expression without creating a reference
• $6 is undefined because only five numbered groups remain, so it prints as an empty string
Character classes
• List of characters between [ and ]
• Interval, e.g. [a-f]
• Negative character set, e.g. [^a-f]
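A brief sketch of the three forms of character class listed above. The input string and variable names are assumptions made for the example.

```perl
use strict;
use warnings;

my $input = 'Order #A7-f3, ref 0xBEEF';

# Intervals: one or more hexadecimal digits after a literal "0x".
my @hex = $input =~ /\b0x([0-9A-Fa-f]+)/g;

# Negative character set: delete everything that is not a letter, digit or space.
(my $clean = $input) =~ s/[^A-Za-z0-9 ]//g;

print "hex=@hex\n";
print "clean=$clean\n";
```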
Repetitions
• * zero or more times
• + one or more times
• ? zero or one time
• {n} exactly n times
• {n,} at least n times
• {n,m} at least n times, at most m times
NOTES:
There is {n,} but there is not {,m} (Perl 5.34 and later do accept {,m}).
Why? (hint: {0,m} works, but {n,???}??)
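The counted repetitions can be compared side by side. In this sketch the test strings and the %accepted hash are illustrative only.

```perl
use strict;
use warnings;

# For each repetition, collect the strings of 'a' that it accepts.
my %accepted;
for my $s ('a', 'aa', 'aaa', 'aaaa') {
    push @{ $accepted{'{2}'}   }, $s if $s =~ /^a{2}$/;
    push @{ $accepted{'{2,}'}  }, $s if $s =~ /^a{2,}$/;
    push @{ $accepted{'{2,3}'} }, $s if $s =~ /^a{2,3}$/;
    push @{ $accepted{'{0,3}'} }, $s if $s =~ /^a{0,3}$/;
}

for my $q (sort keys %accepted) {
    print "a$q accepts: @{ $accepted{$q} }\n";
}
```

The ^ and $ anchors matter here: without them a{2} would also succeed on 'aaaa', because two of the four characters are enough for a match.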
Greedy repetition
• Repetitions are greedy: they eat as many characters as possible
• A ? after the repetition (e.g. .*?) makes it non-greedy
$text = 'Hook is not used on island Java.';
$text =~ /(.*)is/; #1
print "$1.\n";
$text =~ /(.*?)is/; #2
print "$1.\n";
$text =~ /(.*?)is.*n/; #3
print "$1.\n";
OUTPUT:
Hook is not used on .
Hook .
Hook .
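Greediness shows up most often when matching delimited text. A small sketch, with an invented sample line, comparing the greedy form, the non-greedy form and a negative character set:

```perl
use strict;
use warnings;

my $line = 'say "one" and "two"';

# Greedy: .* runs to the last quote in the string.
my ($greedy) = $line =~ /"(.*)"/;

# Non-greedy: .*? stops at the first closing quote.
my ($lazy) = $line =~ /"(.*?)"/;

# A negative character set gives the same result without relying on non-greedy matching.
my ($negated) = $line =~ /"([^"]*)"/;

print "greedy:  $greedy\n";
print "lazy:    $lazy\n";
print "negated: $negated\n";
```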
Other extensions
• Other UNIX tools also use simpler, similar regular expressions
• Perl regular expressions are more powerful
List of some extensions on the next slides
Regular expression comment
(?# comment comes here)
• Use comments! Use comments! Use comments! Use comments! Use comments! Use comments! Use comments! Use comments! Use comments! Use comments! Use comments! Use comments!
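A sketch of (?# ) in use, assuming an invented input string. Note that the comment text itself must not contain a closing parenthesis, since that ends the comment.

```perl
use strict;
use warnings;

my $date = 'released 2002-01';

# Each (?# ) comment documents the piece of the pattern just before it.
my ($year, $month) =
    $date =~ /(\d{4})(?# four-digit year)-(\d\d)(?# two-digit month)/;

print "year=$year month=$month\n";
```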
Regular Expression Parentheses
• (?: sub-expression without $n)
(?: we discussed it earlier when it came up in an example, but this is the proper place to discuss this construct.)
Positive look forward
(?= subregexp)
$t = 'jamaica rum rum kingston rum';
$t =~ s/([aeoui])(?=\w)/uc($1)/ge;
print $t;
OUTPUT:
jAmAIca rUm rUm kIngstOn rUm
Example: convert to upper case every vowel that is followed by another word character.
Negative look forward
(?! subregexp)
$t = 'jamaica rum rum kingston rum';
$t =~ s/([aeoui])(?!\w)/uc($1)/ge;
print $t;
OUTPUT:
jamaicA rum rum kingston rum
Example: convert to upper case every vowel that is not followed by a word character, i.e. that stands at the end of a word.
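The two look forward constructs can be contrasted on one string. This is a sketch with a made-up list of file names; it is meant to show that the looked-at text is checked but not consumed.

```perl
use strict;
use warnings;

my $files = 'file1.txt file2.bak file3.txt';

# Positive look forward: names that are followed by ".txt".
my @txt = $files =~ /(\w+)(?=\.txt)/g;

# Negative look forward: names whose dot is NOT followed by "txt".
my @other = $files =~ /(\w+)\.(?!txt)/g;

# Because (?= ) is zero-length, the extension is untouched by the replacement.
(my $shout = $files) =~ s/(\w+)(?=\.txt)/uc($1)/ge;

print "txt:   @txt\n";
print "other: @other\n";
print "shout: $shout\n";
```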
Option change inside the regular expression
(?imsx)
• This can be used inside the m/ or s/ operator.
• The o and g options cannot be set this way.
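A sketch of an option switched on in the middle of a pattern. The sample string is invented; the point is that (?i) affects only the part of the pattern that follows it.

```perl
use strict;
use warnings;

my $greeting = 'Hello WORLD';

# (?i) comes after "hello", so "hello" is still matched case sensitively.
my $late_wrong = ($greeting =~ /hello (?i)world/) ? 'match' : 'no match';

# "Hello" matches exactly; "world" is matched case insensitively.
my $late_right = ($greeting =~ /Hello (?i)world/) ? 'match' : 'no match';

# (?i) at the very start behaves like the i option on the whole pattern.
my $early = ($greeting =~ /(?i)hello world/) ? 'match' : 'no match';

print "$late_wrong, $late_right, $early\n";
```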
Now we go back to operator m/ and discuss some details.
M operator array result
@k = "abbabaa" =~ m/(bb).+(a.)/;
print $#k; print ' ',$k[0],' ',$k[1],"\n";
OUTPUT:
1 bb aa
NOTES:
• The parts of the expression to be returned are enclosed in ( )
• $1, $2 ... are the default variables where the substrings are put
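The list returned by the match can be assigned straight to named variables. A minimal sketch with hypothetical inputs:

```perl
use strict;
use warnings;

# The ( ) substrings are returned as a list in list context.
my ($hours, $minutes) = '09:45' =~ /^(\d\d):(\d\d)$/;

# A failed match returns the empty list.
my @nothing = 'nonsense' =~ /^(\d\d):(\d\d)$/;

# A successful match without any ( ) returns the one-element list (1).
my @plain = 'abc' =~ /b/;

print "hours=$hours minutes=$minutes\n";
print "failed match returned ", scalar(@nothing), " elements\n";
print "match without parentheses returned (@plain)\n";
```

Assigning to named variables avoids depending on $1 and $2, which keep their old values when a later match fails.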
M operator option g (1)
@k = "abbabaa" =~ m/(b)(a)/g;
print $#k,' ',$k[0],' ',$k[1],' ',$k[2],' ',$k[3],"\n";
OUTPUT:
3 b a b a
NOTES:
• With option g, a match in list context returns the ( ) substrings of every match
M operator option g (2)
$t = "abbabaa";
while( $t =~ m/(ab)(b|a)/g ){
print pos($t)," $1 $2\n";
}
OUTPUT:
3 ab b
6 ab a
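The same loop shape is handy for walking through repeated fields. In this sketch the configuration string and the %pair and @ends variables are illustrative.

```perl
use strict;
use warnings;

my $cfg = 'a=1;b=22;c=333';
my (%pair, @ends);

# In scalar context each /g match continues where the previous one stopped;
# pos() reports the offset just after the current match.
while ($cfg =~ /(\w+)=(\d+)/g) {
    $pair{$1} = $2;
    push @ends, pos($cfg);
}

print "$_=$pair{$_}\n" for sort keys %pair;
print "match ends: @ends\n";
```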
M operator option i
• Case insensitive match
print '.',"apple" =~ /AppLe/,".\n";
print '.',"apple" =~ /AppLe/i,".\n";
• prints
..
.1.
M operator options m and s
$t = "mah\na\nb";
while( $t =~ /(.?.)$/mg ){ print '.',$1; }
print ".\n";
while( $t =~ /(.?.)$/sg ){ print '.',$1; }
print ".\n";
while( $t =~ /(.?.)$/g ){ print '.',$1; }
print ".\n";
• OUTPUT:
.ah.a.b.
.
b.
.b.
• m makes $ match before every \n in the string
• s makes . match \n as well (otherwise . is any character but \n)
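A sketch of the two options on a two-line string. The text is made up, and it also shows ^, which the m option affects in the same way as $.

```perl
use strict;
use warnings;

my $text = "first line\nsecond line\n";

# With m, ^ matches at the start of every line.
my @starts_m = $text =~ /^(\w+)/mg;

# Without m, ^ matches only at the very start of the string.
my @starts_plain = $text =~ /^(\w+)/g;

# With s, . also matches the newline, so .* can cross the line boundary.
my ($span) = $text =~ /first(.*)second/s;

# Without s, .* stops at the newline and the match fails.
my $crosses = ($text =~ /first(.*)second/) ? 'yes' : 'no';

print "with m: @starts_m\n";
print "without m: @starts_plain\n";
print "crosses lines without s: $crosses\n";
```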
M operator option o
• Evaluate the regular expression only once to save processor time
$t = "al brab";
$a = 'al'; $b = 'rab';
&q;&p;
$b = 'fe';
&q;&p;
sub q { print ' q',$t =~ /$a\sb$b/o }
sub p { print ' p',$t =~ /$a\sb$b/ }
• prints
 q1 p1 q1 p
• q still matches after $b has changed, because its regular expression was built only once
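The same effect is easy to trigger by accident in a loop. A minimal sketch, with invented words, of how the o option freezes the first value of an interpolated variable:

```perl
use strict;
use warnings;

my (@with_o, @without_o);
for my $word ('cat', 'dog') {
    # With o the pattern is built from the first value of $word ('cat')
    # and is never rebuilt, so 'dog' is never searched for.
    push @with_o, ('hotdog' =~ /$word/o ? 1 : 0);

    # Without o the pattern follows the current value of $word.
    push @without_o, ('hotdog' =~ /$word/ ? 1 : 0);
}

print "with o:    @with_o\n";
print "without o: @without_o\n";
```

For that reason o is only safe when the interpolated variables are known not to change during the life of the program.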
M operator option x
@k = "abbabaa" =~ m/(bb) #exactly two 'b' characters go into $1
.+ #one or more any-character
(a.) #a letter 'a' and exactly one any-character
/x; #space and comment allowed
print $#k;
print ' ',$k[0],' ',$k[1],"\n";
OUTPUT:
1 bb aa
This option allows spaces and comments inside the regular expression to ease readability (a literal space is written as \ followed by a space).
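A sketch of how literal spaces are kept under the x option. The input and the variable names are assumptions; the comments inside the pattern avoid the / delimiter and variable sigils, since both would still be interpreted.

```perl
use strict;
use warnings;

my $record = 'John Smith 42';

my ($first, $last, $age) = $record =~ /
    ^ (\w+)       # first name
    \  (\w+)      # an escaped space, then the last name
    [ ] (\d+)     # a space inside a character class also stays literal
    $             # end of string
/x;

print "first=$first last=$last age=$age\n";
```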
Operator s
$text =~ s/regexp/replace/egimosx
• Options:
– e replace is interpreted as an expression
– g global search and replace
– i case insensitive search
– m string is treated as multi-line
– o regular expression is evaluated only once
– s string is treated as single-line
– x extended syntax for the regexp
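A sketch of two of these options, e and i, on an invented price list. The copies are made first so that the original string stays intact.

```perl
use strict;
use warnings;

my $prices = 'tea 3, RUM 12';

# e: the replacement is Perl code, evaluated for every match (g).
(my $doubled = $prices) =~ s/(\d+)/$1 * 2/ge;

# i: the search ignores case, so 'rum' finds 'RUM'.
(my $renamed = $prices) =~ s/rum/gin/i;

print "$doubled\n";
print "$renamed\n";
```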
Global Search and Replace
$t = "abbab" ;
$t =~ s/ab/aa/g;
print $t;
OUTPUT:
aabaa
Option g replaces all occurrences of the search regular expression with the replacement string.
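The s operator also returns a value, which this sketch captures in two illustrative variables: the number of replacements made, or a false value when nothing matched.

```perl
use strict;
use warnings;

my $t = 'abbab';

# The return value is the number of substitutions performed.
my $changed = ($t =~ s/ab/aa/g);

# When nothing matches, the string is left alone and the result is false.
my $unchanged = ($t =~ s/zz/y/g);

print "t=$t changed=$changed\n";
print "second substitution did nothing\n" unless $unchanged;
```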
m and s operators with different delimiters
• / is the default, but you can use
• ' to have a non-interpolated string
• Other non-alphanumeric characters
• () {} [] with matching character pairs
– In this case s{search}{replace}
m and s operators with different delimiters example
$text = 'a@bba@bbabb';
@b = ('bba');
$text =~ s{@b}{q}g;
print "$text\n";
$text = 'a@bba@bbabb';
$text =~ s'@b'q'g;
print "$text\n";
OUTPUT:
a@q@qbb
aqbaqbabb
@b is evaluated (interpolated) in the first search but not in the second.
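Other delimiters are most useful when the pattern itself contains slashes. A sketch with a hypothetical path:

```perl
use strict;
use warnings;

my $path = '/usr/local/bin';

# With { } as delimiters the slash needs no escaping.
(my $backslashed = $path) =~ s{/}{\\}g;

# Any other non-alphanumeric character works too, here # with the m operator.
my $depth = () = $path =~ m#/#g;

print "$backslashed\n";
print "depth=$depth\n";
```

Written with the default delimiter, the first substitution would need an escaped slash in the pattern, which is harder to read.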
Thank you for your kind attention.