Perl Day 4

Fuzzy Matches

We know about eq and ne, but they only match things exactly
– Sometimes you want to match things more vaguely, like
    I don’t care what number it is, but I care it’s a number
    Or I need to make sure the phone number entered has 10 digits.

Regular Expressions make it possible
– They are arguably the most powerful part of perl.

Match

Let’s first deal with matching things
– Imagine you ask the user to type in a phone number. You should next check you got a valid phone number (10 digits).

    print("Enter a phone number\n");
    $PhoneNum=<STDIN>;
    chomp($PhoneNum);
    # Note: /\d+/ only checks that the input contains at least one digit;
    # a strict 10-digit check would be /^\d{10}$/.
    if($PhoneNum=~/\d+/)
    { print("good job\n"); }
    else
    { print("That wasn't a phone number\n"); }

=~

Up until now we’ve dealt with ==, !=, <, >, >=, <=, eq and ne in tests.

=~ means you are doing a regular expression match. Note you are not looking for the string '/\d+/', you are looking for what that means.
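
To make the difference between an exact test and a pattern test concrete, here is a minimal sketch. The sample string and the variable names are made up for illustration; the !~ operator, which is the negated form of =~, is not mentioned on the slide but is standard Perl.

```perl
use strict;
use warnings;

my $input = "call 555 now";   # hypothetical sample input

# eq compares the whole string exactly, so this test is false.
my $exact = ($input eq "555") ? 1 : 0;

# =~ asks whether the pattern matches anywhere in the string, so this is true.
my $fuzzy = ($input =~ /\d+/) ? 1 : 0;

# !~ is the negated form: true only when the pattern does NOT match.
my $no_digits = ($input !~ /\d+/) ? 1 : 0;

print "exact=$exact fuzzy=$fuzzy no_digits=$no_digits\n";
```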

\d, \w, \s

In a regular expression, you’ll often see \[something]. They each have different meanings:
– \d – A digit (0-9)
– \w – A word character (a letter, a digit or _)
– \s – A whitespace character (such as a space, tab or newline)
– . Matches any single character except a newline
– \. Matches only a dot.
– Any words will match exactly (e.g. /enda/ would match only the letters enda, in that order, wherever they appear in the string).
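
The tokens above can be tried out one at a time. This is only a sketch: the sample strings and the %checks hash are invented for the demonstration, and each entry is 1 when the pattern matches and 0 when it does not.

```perl
use strict;
use warnings;

# Hypothetical sample strings, chosen only to illustrate each token.
my %checks = (
    digit    => ("room 7"   =~ /\d/)   ? 1 : 0,  # contains a digit
    word     => ("!!!"      =~ /\w/)   ? 1 : 0,  # no word characters at all
    space    => ("a\tb"     =~ /\s/)   ? 1 : 0,  # a tab counts as whitespace
    any_char => ("axb"      =~ /a.b/)  ? 1 : 0,  # . stands in for the x
    real_dot => ("axb"      =~ /a\.b/) ? 1 : 0,  # \. needs an actual dot
    literal  => ("calendar" =~ /enda/) ? 1 : 0,  # enda appears inside calendar
);

for my $name (sort keys %checks) {
    print "$name: $checks{$name}\n";
}
```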

+ * {}

Any of the previous tokens can be followed by
– + Means there must be 1 or more
– * Means there can be any number (including 0)
– {7} Means there must be exactly 7
– {1,4} Means there must be between 1 and 4 (inclusive)
– {0,10} Means there can be at most 10 (the shorter {,10} is only accepted from Perl 5.34 on)

e.g: =~/\d{7}/ means there must be 7 digits in a row somewhere in the string
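
A short sketch of the quantifiers in action. The sample strings are made up, and each pattern is wrapped in ^ and $ (described under Matching Specific Places) so that the count applies to the whole string rather than to some part of it.

```perl
use strict;
use warnings;

my $seven   = ("2602694" =~ /^\d{7}$/)   ? 1 : 0;  # exactly 7 digits: matches
my $too_few = ("260269"  =~ /^\d{7}$/)   ? 1 : 0;  # only 6 digits: no match
my $plus    = (""        =~ /^\d+$/)     ? 1 : 0;  # + needs at least one digit
my $star    = (""        =~ /^\d*$/)     ? 1 : 0;  # * accepts zero digits
my $range   = ("12345"   =~ /^\d{1,4}$/) ? 1 : 0;  # 5 digits is outside 1 to 4

print "seven=$seven too_few=$too_few plus=$plus star=$star range=$range\n";
```

Without the anchors, /\d{7}/ would also accept a longer run of digits, because any 7 of them in a row satisfy the pattern.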

Endings

After the last /, you can put additional things:
– i This makes the match case insensitive
– g Allow it to match more than once (globally)
– m Treat the string as multiple lines, so ^ and $ match at the start and end of each line
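
The three endings can be compared side by side. This is a sketch with an invented three-line string. It relies on the fact that a match with g in list context returns every matched piece of text.

```perl
use strict;
use warnings;

my $text = "Perl\nperl\nPERL";   # hypothetical three-line sample

# i: ignore case, so "PERL" matches /perl/.
my $case_blind = ("PERL" =~ /perl/i) ? 1 : 0;

# g in list context: return every match instead of stopping at the first.
my @all = ($text =~ /perl/gi);

# m: ^ matches at the start of each line, not just the start of the string.
my @line_starts  = ($text =~ /^p/gim);
my @string_start = ($text =~ /^p/gi);

print "case_blind=$case_blind\n";
print "all=", scalar(@all), "\n";
print "with m=", scalar(@line_starts), " without m=", scalar(@string_start), "\n";
```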

Search and Replace

Uses the same language as matching
– However after the =~ you put an s
– When you were doing matching there was secretly an m there, it’s optional

    $Text='abc123 def456';
    $Text=~s/\d/x/g;

– This will search for a digit and replace it with x.
    The g indicates it’ll do it everywhere it finds a digit
    The result: $Text='abcxxx defxxx';

More Examples

    $Text='abc123 def456';

    $Text=~s/\d+//g;      # remove every run of digits
    $Text=~s/c/b/g;       # replace every c with b
    $Text=~s/\d{2}/a/;    # replace the first two digits in a row with a
    $Text=~s/\s*//g;      # remove all whitespace
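
The slide does not say whether the four substitutions are meant to run one after another or each on a fresh copy of the string. The sketch below assumes the second reading: every substitution starts again from the original text, and the variable names are invented for the illustration.

```perl
use strict;
use warnings;

my $original = 'abc123 def456';

# Assumption: each substitution is applied to its own copy of the original.
(my $no_digits  = $original) =~ s/\d+//g;    # 'abc def'
(my $c_to_b     = $original) =~ s/c/b/g;     # 'abb123 def456'
(my $first_pair = $original) =~ s/\d{2}/a/;  # 'abca3 def456'
(my $no_spaces  = $original) =~ s/\s*//g;    # 'abc123def456'

print "$no_digits\n$c_to_b\n$first_pair\n$no_spaces\n";
```

Run in sequence on a single variable instead, the third line would find no digits left to replace, since the first line has already removed them all.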

Matching Specific Places

2 additional special characters:
– ^ Means match at start of string only
– $ Means match at end of string only

Sometimes you only want to change the first 3 digits:
    $Phone=4042602694;
– This would remove the area code:
    $Phone=~s/^\d{3}//g;

Sometimes you only want to change the last 4:
    $Phone=~s/\d{4}$//g;
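
Using both anchors together gives the strict 10-digit phone check described earlier. The sketch below uses a hypothetical helper, is_ten_digits, and works on copies of the number so the original stays intact.

```perl
use strict;
use warnings;

# Hypothetical helper: 1 if the string is exactly 10 digits, otherwise 0.
# Note: $ also tolerates one trailing newline, so chomp user input first.
sub is_ten_digits {
    my ($candidate) = @_;
    return $candidate =~ /^\d{10}$/ ? 1 : 0;
}

my $phone = '4042602694';

(my $local  = $phone) =~ s/^\d{3}//;   # drop the first 3 digits
(my $prefix = $phone) =~ s/\d{4}$//;   # drop the last 4 digits

print is_ten_digits($phone), " $local $prefix\n";
```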

Capturing

Anything you wrap in ()’s will be captured:
– The first ()’s are $1, the second are $2 etc.

    $Phone=4042602694;
    $Phone=~/(\d{3})(\d{3})(\d{4})/;
    $AreaCode=$1;
    $Exchange=$2;
    $Extension=$3;
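
One caveat worth sketching: $1, $2 and $3 are only set by a successful match, so it is safer to test the match before reading them. The lower-case variable names and the output format here are illustrative choices, not part of the slide.

```perl
use strict;
use warnings;

my $phone = '4042602694';
my ($area_code, $exchange, $extension);

# Only read $1, $2, $3 after confirming that the match succeeded.
if ($phone =~ /^(\d{3})(\d{3})(\d{4})$/) {
    ($area_code, $exchange, $extension) = ($1, $2, $3);
    print "($area_code) $exchange-$extension\n";
}
else {
    print "Not a 10-digit number\n";
}
```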

Translation

Changing strings to upper case is easy
– The command is translate (tr); it is written like match and search and replace.

    $Text="this is lower case";
    $Text=~tr/a-z/A-Z/;

tr takes ranges such as a-z directly, without square brackets. In a regular expression, square brackets create a “character class”
– a-z means all letters between a and z.
– [c-k] would be all letters from c to k
– [asdf] would be a, s, d and f
– [ab]+ would be any combinations of a and b, like:
    a
    ab
    aaaaa
    bbbbb
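
The sketch below puts tr and a regular-expression character class next to each other. The sample strings are invented. The counting form of tr, with an empty replacement list, is standard Perl but goes slightly beyond what the slide covers.

```perl
use strict;
use warnings;

my $text = "this is lower case";

# tr works on plain character ranges; square brackets are not needed.
(my $upper = $text) =~ tr/a-z/A-Z/;

# With an empty replacement list, tr only counts the listed characters.
my $vowels = ($text =~ tr/aeiou//);

# Square brackets belong to regular expressions, where they form a class.
my $only_ab = ("abba" =~ /^[ab]+$/) ? 1 : 0;
my $not_ab  = ("abc"  =~ /^[ab]+$/) ? 1 : 0;

print "$upper\nvowels=$vowels only_ab=$only_ab not_ab=$not_ab\n";
```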

Is Tomato a Fruit or Veg?

grep can help. It looks through an array and picks out the elements that match a pattern; in an if test it is true when at least one element matches.
– The pattern can be any regular expression like what you just learned.

    @Fruits=('apple','banana','orange');
    @Veg=('potato','carrot','tomato');
    if(grep(/tomato/i,@Fruits))
    {
        print("It's a fruit\n");
    }
    elsif(grep(/tomato/i,@Veg))
    {
        print("It's a veg\n");
    }
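
grep behaves differently depending on where its result goes, which this sketch tries to show. The array contents are sample data, with 'pineapple' added on purpose so that an unanchored pattern matches more than one element.

```perl
use strict;
use warnings;

my @fruits = ('apple', 'banana', 'orange', 'pineapple');  # sample data

# In list context grep returns the matching elements themselves.
my @apples = grep(/apple/i, @fruits);

# In scalar context grep returns how many elements matched.
my $count = grep(/apple/i, @fruits);

# Anchoring with ^ and $ restricts the test to whole-element matches.
my $exact = grep(/^apple$/i, @fruits);

print "@apples\ncount=$count exact=$exact\n";
```

The anchored form is the safer choice when the question is whether an exact word is in the list, since /apple/ alone also accepts pineapple.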

Split

If you have a string, and you want to make it into an array, split can help.

    $Text="This is a string";
    @Words=split(/\s/,$Text);
    print("$Words[2]\n");    # prints a
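
One thing to watch with /\s/ is input that has several spaces in a row. The sketch below, using an invented string with repeated spaces, compares /\s/ with /\s+/ and shows join as the reverse operation. join is not covered on the slide but is a standard Perl function.

```perl
use strict;
use warnings;

my $text = "This  is a   string";   # note the repeated spaces

# /\s/ splits at every single whitespace character, leaving empty fields.
my @single = split(/\s/, $text);

# /\s+/ treats each run of whitespace as one separator.
my @words = split(/\s+/, $text);

# join is the reverse: glue the pieces back together with a separator.
my $rejoined = join('-', @words);

print scalar(@single), " fields vs ", scalar(@words), " words\n";
print "$rejoined\n";
```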