Pattern Matching II

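Greedy Matching

• When dealing with quantifiers, Perl’s pattern matcher is by default greedy: each quantifier matches as much text as it can while still letting the whole pattern succeed.

• For example,
– $_ = "Bob sat next to the Bobcat and listened to the Bobolink";
  /.*Bob/ (matches up to and including the last "Bob", the one in "Bobolink")
– $_ = "Freddie's hot dogs";
  /Fred+/ (matches "Fredd")
– $_ = "Freddie's hot dogs are really hot!";
  /.*hot/ (matches up to and including the last "hot")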
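The greedy examples can be checked with a short script. This is a minimal sketch: the variable names are illustrative, and each pattern is wrapped in parentheses only so that the matched text can be inspected.

```perl
use strict;
use warnings;

# A match in list context returns the captured groups.
my ($bob)  = "Bob sat next to the Bobcat and listened to the Bobolink" =~ /(.*Bob)/;
my ($fred) = "Freddie's hot dogs" =~ /(Fred+)/;
my ($hot)  = "Freddie's hot dogs are really hot!" =~ /(.*hot)/;

print "$bob\n";   # Bob sat next to the Bobcat and listened to the Bob
print "$fred\n";  # Fredd
print "$hot\n";   # Freddie's hot dogs are really hot
```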

Minimal Matching

• The minimal mode is specified by a ? placed after the quantifier, as in +? or *?.

• For example,
– $_ = "Freddie's hot dogs";
  /Fred+?/ (matches "Fred")
– $_ = "Freddie's hot dogs are really hot!";
  /.*?hot/ (matches "Freddie's hot", stopping at the first "hot")
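A minimal sketch of the same two strings matched in minimal mode. The variable names are illustrative and the capturing parentheses are added only to expose the matched text.

```perl
use strict;
use warnings;

# The ? after each quantifier makes it take as little text as possible.
my ($fred_min) = "Freddie's hot dogs" =~ /(Fred+?)/;
my ($hot_min)  = "Freddie's hot dogs are really hot!" =~ /(.*?hot)/;

print "$fred_min\n";  # Fred
print "$hot_min\n";   # Freddie's hot
```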

Multiple Quantifiers

• When a pattern has several quantifiers, the leftmost one is the greediest.

• For example,
– $_ = "Bob sat next to the Bobcat and listened to the Bobolink";
  /Bob.*Bob.*link/

• The first .* matches:
– " sat next to the Bobcat and listened to the "

• The second .* is left with only "o".
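To see how the text is divided between the two quantifiers, each .* can be captured. This is a sketch with illustrative variable names; the parentheses are an addition to the pattern above.

```perl
use strict;
use warnings;

my $sentence = "Bob sat next to the Bobcat and listened to the Bobolink";

# Parentheses around each .* capture what that quantifier matched.
my ($first, $second) = $sentence =~ /Bob(.*)Bob(.*)link/;

print "first:  '$first'\n";   # ' sat next to the Bobcat and listened to the '
print "second: '$second'\n";  # 'o'
```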

Anchors

• More complicated patterns can be created with anchors.

• An anchor requires a pattern to match at specific places in a string.

• Allows a particular position in a pattern to align with a particular position in the string.

(^) Anchor

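• (^) requires the pattern to match at the beginning of the string.

• For example,
– /^Shelley/
  "Shelley has red hair" (matches)
  "What color is Shelley's hair?" (does not match)
– /^[^!]\^/

• The meaning of (^) depends on the context: at the start of a pattern it is an anchor, as the first character of a character class it negates the class, and escaped as \^ it is a literal caret.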
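A small sketch of the two most common meanings of ^. The strings "?what" and "!what" are made up for illustration.

```perl
use strict;
use warnings;

my @lines = ("Shelley has red hair", "What color is Shelley's hair?");

# Anchored: only a string that starts with "Shelley" matches.
my @anchored = grep { /^Shelley/ } @lines;

# Unanchored: "Shelley" may appear anywhere in the string.
my @unanchored = grep { /Shelley/ } @lines;

# Inside a character class, a leading ^ negates the class instead.
my $not_bang = ("?what" =~ /^[^!]/) ? 1 : 0;  # first character is not "!"
my $is_bang  = ("!what" =~ /^[^!]/) ? 1 : 0;  # first character is "!"

print scalar(@anchored), " ", scalar(@unanchored), " $not_bang $is_bang\n";  # 1 2 1 0
```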

($) Anchor

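• ($) requires the pattern to match at the end of the string (or just before a trailing newline).

• For example,
– /hair$/
  "Shelley has red hair" (matches)
  "What color is Shelley's hair?" (does not match, because the string ends with "?")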
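The following sketch runs /hair$/ against the two strings above, plus a third, made-up string with a trailing newline to show that $ still matches there.

```perl
use strict;
use warnings;

my @inputs = (
    "Shelley has red hair",
    "What color is Shelley's hair?",
    "Shelley has red hair\n",          # illustrative: trailing newline
);

# 1 where the string ends in "hair", 0 otherwise.
my @ends_in_hair = map { /hair$/ ? 1 : 0 } @inputs;

print "@ends_in_hair\n";  # 1 0 1
```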

(\b) Anchor

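• (\b) matches the boundary position between a word character and a non-word character, including the start or end of the string next to a word character.

• For example,
– /\bwear\b/
  "I wear shoes" (matches)
  "Swimwear for sale." (does not match)
  "Molly wears green sweaters." (does not match)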
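As a rough check, the sketch below compares the bounded pattern with a plain substring match on the same three sentences. The variable names are illustrative.

```perl
use strict;
use warnings;

my @sentences = (
    "I wear shoes",
    "Swimwear for sale.",
    "Molly wears green sweaters.",
);

# With \b on both sides, "wear" must stand alone as a word.
my @whole_word = grep { /\bwear\b/ } @sentences;

# Without the anchors, "wear" matches inside longer words too.
my @substring = grep { /wear/ } @sentences;

print scalar(@whole_word), " ", scalar(@substring), "\n";  # 1 3
```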

Binding Operators

• A pattern can be matched against any string with the binding operators (=~) and (!~).

• The left operand must evaluate to a string and the return value is a Boolean.

• For example,
– $string =~ /[,;:]/
– $string !~ /[,;:]/
– if (<STDIN> =~ /^[Yy]/) { … }
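A minimal sketch of both binding operators. The contents of $string are made up, and $answer is a hypothetical stand-in for a line read from <STDIN> so that the script runs without input.

```perl
use strict;
use warnings;

my $string = "red, green; blue";

# =~ is true when the pattern matches; !~ is true when it does not.
my $has_punct = ($string =~ /[,;:]/) ? "yes" : "no";
my $no_punct  = ($string !~ /[,;:]/) ? "yes" : "no";

# Hypothetical user reply, in place of <STDIN>.
my $answer = "Yes please\n";
my $agreed = ($answer =~ /^[Yy]/) ? "yes" : "no";

print "$has_punct $no_punct $agreed\n";  # yes no yes
```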

Pattern Modifiers

• A pattern can be followed by a modifier.

• The modifier changes how:
– The pattern is interpreted.
– The pattern matcher works while using the pattern.

• The most common modifiers are:
– i, m, s, o, x

(i) Modifier

• (i) modifier tells the pattern matcher to ignore case.

• For example, /apples/i matches
– "apples"
– "Apples"
– "APPLES"
– "ApPlEs"
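The effect of (i) can be seen by filtering the same four strings with and without the modifier. This is a small sketch with illustrative variable names.

```perl
use strict;
use warnings;

my @words = ("apples", "Apples", "APPLES", "ApPlEs");

my @insensitive = grep { /apples/i } @words;  # case ignored
my @sensitive   = grep { /apples/ }  @words;  # exact case only

print scalar(@insensitive), " ", scalar(@sensitive), "\n";  # 4 1
```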

(m) And (s) Modifier

• (m) treats a string as multiple lines:
– (^) matches at the start of the string and just after any newline.
– ($) matches at the end of the string and just before any newline.

• (s) treats a string as a single line:
– (.) will also match newline characters.

• If both (m) and (s) are specified:
– (.) matches any character.
– (^) and ($) match positions after and before a newline.
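A sketch of both modifiers on a made-up two-line string. It also uses the g modifier so that a list-context match returns every capture, not only the first.

```perl
use strict;
use warnings;

my $text = "first line\nsecond line\n";

# Without (m), ^ only matches at the very start of the string.
my @default_starts = $text =~ /^(\w+)/g;
# With (m), ^ also matches just after each embedded newline.
my @line_starts    = $text =~ /^(\w+)/mg;

# Without (s), . stops at a newline; with (s), it crosses it.
my $dot_default = ($text =~ /first.*second/)  ? 1 : 0;
my $dot_single  = ($text =~ /first.*second/s) ? 1 : 0;

print "@default_starts | @line_starts | $dot_default $dot_single\n";
# first | first second | 0 1
```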

(o) Modifier

• Patterns can include scalar variables:
– The variables are interpolated.

• Patterns containing variables may be recompiled every time they are used.

• Provides dynamic patterns, but can be expensive.

• Include the (o) modifier only if the variable never changes.
– Tells Perl to compile the pattern once and not to recompile it.
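The sketch below shows why (o) is only safe when the variable never changes. The helper names contains and contains_once are hypothetical. With (o), the pattern compiled on the first call is reused even after the variable holds a different value.

```perl
use strict;
use warnings;

# Hypothetical helper: the pattern follows the current value of $pattern.
sub contains {
    my ($pattern, $text) = @_;
    return ($text =~ /$pattern/) ? 1 : 0;
}

# Same helper with (o): the pattern is compiled on first use, then frozen.
sub contains_once {
    my ($pattern, $text) = @_;
    return ($text =~ /$pattern/o) ? 1 : 0;
}

my @dynamic = (contains('cat', 'cat'), contains('dog', 'dog'));
my @frozen  = (contains_once('cat', 'cat'), contains_once('dog', 'dog'));

print "@dynamic | @frozen\n";  # 1 1 | 1 0
```

The second call to contains_once still tests against "cat", so "dog" does not match.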

(x) Modifier

• (x) tells the pattern matcher to ignore whitespace in the pattern.

• For example, /\d+ \. \d+/x is equivalent to /\d+\.\d+/

• Allows comments to be included in patterns.
  /\d+   # digits before the decimal.
   \.    # The decimal point.
   \d+   # digits after the point.
  /x
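A runnable sketch of a commented pattern. The capturing parentheses, the use of qr// to store the pattern, and the test strings are additions for illustration.

```perl
use strict;
use warnings;

# qr// stores a compiled pattern; (x) lets it span lines with comments.
my $number = qr/
    (\d+)   # digits before the decimal point
    \.      # the decimal point itself
    (\d+)   # digits after the decimal point
/x;

my ($whole, $fraction) = "pi is about 3.14" =~ $number;

# (x) ignores whitespace in the pattern, not in the string being matched.
my $spaced = ("3 . 14" =~ /\d+ \. \d+/x) ? 1 : 0;

print "$whole $fraction $spaced\n";  # 3 14 0
```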

Remembering Matches

• Sometimes a pattern needs to reference a part of a string it matched earlier.

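• Done by parenthesizing parts of interest.

• Referenced within the same pattern by implicitly defined backreferences
– e.g. \1, \2, \3, …

• For example,
– /(\w+).*\1/ matches "jo likes joanne."
– /(.)\1/ matches any doubled character.
– /(['"])(.*?)\1/ matches text enclosed in matching single or double quotes.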
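The three patterns can be tried as follows. Only "jo likes joanne." comes from the example above. The other two input strings are made up for illustration.

```perl
use strict;
use warnings;

# \1 must match the same text that group 1 captured.
my ($repeat) = "jo likes joanne." =~ /(\w+).*\1/;

# A character immediately followed by itself.
my ($double) = "bookkeeper" =~ /(.)\1/;

# The closing quote must be the same kind as the opening quote.
my ($quote, $inside) = q{She said "hi" to me} =~ /(['"])(.*?)\1/;

print "$repeat $double $inside\n";  # jo o hi
```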

References Outside a Pattern

• Parts of a match are sometimes needed outside the pattern.

• Can be referenced by implicit variables:
– e.g. $1, $2, $3, …

• For example,
  "VY ran for 267 yards Saturday" =~ /(\d+) (\w+) (\w+)/;
  print "$1 $2 $3 \n";

Nested Parentheses

• Patterns can have nested parentheses.

• Each group relates to a variable by counting opening parentheses from the left.

• For example:
  $_ = "31 Oct 2005";
  /((\d+) (\w+) (\d+))/;
  print "$1 \n $2 $3 $4 \n";
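A slightly more defensive sketch of the same example: the numbered variables are read only after the match is known to have succeeded. The names $date and @parts are illustrative.

```perl
use strict;
use warnings;

my $date = "31 Oct 2005";
my @parts;

if ($date =~ /((\d+) (\w+) (\d+))/) {
    # $1 is the outermost group; $2 to $4 are the inner groups, left to right.
    @parts = ($1, $2, $3, $4);
}

print join("|", @parts), "\n";  # 31 Oct 2005|31|Oct|2005
```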

Backreferences

• \n and $n (where n is a group number) are called backreferences.
– They refer to the text captured by the most recent successful match.

• Perl also includes 3 implicit variables.
– $` – part before the match.
– $& – part that matched.
– $' – part after the match.

• Costly for the matcher to save these for every match.
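A minimal sketch of the three implicit variables on a made-up string. They are copied into ordinary variables immediately after the match, since the next successful match would overwrite them.

```perl
use strict;
use warnings;

my $text = "No more apples!";
my ($before, $matched, $after);

if ($text =~ /more/) {
    # Copy right away: the next successful match resets these variables.
    ($before, $matched, $after) = ($`, $&, $');
}

print "[$before][$matched][$after]\n";  # [No ][more][ apples!]
```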

RegEx Extensions

• Perl includes several extensions to previous versions of its regular expression syntax.

• The general form is: (?xPattern)

• x is a one or two character code.

Look Ahead

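• Want a pattern to match only if it is (or is not) followed by a subpattern, but do not want the subpattern as part of the match.

• (?=) and (?!) provide this look ahead behavior.

• For example,
– /\d+(?=\.)/ matches digits that are followed by a decimal point.
– /\d+(?!\.)/ matches digits that are not followed by a decimal point.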
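The sketch below applies both patterns to the made-up string "12.5". The capturing parentheses are added to expose the match. The negative case shows a subtlety: the matcher backtracks to "1", because the character after "1" is "2" and not a decimal point.

```perl
use strict;
use warnings;

my $s = "12.5";

# Digits followed by a ".", which itself is not part of the match.
my ($followed) = $s =~ /(\d+)(?=\.)/;

# Digits not followed by a ".": "12" fails, so the matcher settles for "1".
my ($not_followed) = $s =~ /(\d+)(?!\.)/;

print "$followed $not_followed\n";  # 12 1
```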

Look Behind

• Perl also allows look behinds.

• (?<=) and (?<!) provide this behavior.

• For example,
– /(?<=\.)\d+/ matches digits preceded by a decimal point.
– /(?<!\.)\d+/ matches digits not preceded by a decimal point.
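A matching sketch for look behind, again on the made-up string "12.5" and with capturing parentheses added for inspection.

```perl
use strict;
use warnings;

my $s = "12.5";

# Digits that come right after a ".", which is not part of the match.
my ($after_dot) = $s =~ /(?<=\.)(\d+)/;

# Digits starting at a position that is not right after a ".".
my ($not_after_dot) = $s =~ /(?<!\.)(\d+)/;

print "$after_dot $not_after_dot\n";  # 5 12
```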

Substitution

• Often need to find a substring and replace it with another.

• Perl has a substitution operator for this.

• The general form is:
– s dl Pattern dl New_string dl Modifiers
  (dl is a delimiter character)

• The common form is:
– s/Pattern/New_string/

• The return value is the number of substitutions made.
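A short sketch of the return value and of an alternative delimiter. The strings are made up, and the g modifier is used in the first substitution so that more than one replacement can be counted.

```perl
use strict;
use warnings;

my $text  = "one fish, two fish";
my $count = ($text =~ s/fish/cat/g);   # g replaces every occurrence

# Braces as delimiters avoid escaping the slashes in a path.
my $path    = "/usr/local/bin";
my $changed = ($path =~ s{/usr/local}{/opt});

# When nothing is replaced, the return value is false.
my $none = ($path =~ s/missing/x/) ? 1 : 0;

print "$count $text | $changed $path | $none\n";
# 2 one cat, two cat | 1 /opt/bin | 0
```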

Examples

• Example 1:
  $_ = "No more apples!";
  s/apples/applets/;
  # $_ is now "No more applets!"

• Example 2:
  $_ = "Who are Jack and Jill?";
  s/(\w+) and (\w+)/$2 & $1/;
  # $_ is now "Who are Jill & Jack?"

Substitution with Modifiers

• Modifiers can be used with the substitution operator.

• i, o, m, s, and x have the same effect.

• There are two common modifiers for substitutions:
– g: perform substitution everywhere it applies.
– e: substitution part treated as a Perl expression.

(g) and (e) Examples

• Example 1:
  $_ = "12034005";
  s/0//g;

• Example 2:
  $_ = "Molly and Mary were cold.";
  s/(\w+)/"$1"/g;

• Example 3:
  $_ = "Is it Sum, SUM, sum, or suM?";
  s/sum/sum/ig;

• Example 4:
  s/(\w+)/uc($1)/e;
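The four examples can be run together as a sketch. Named variables replace $_ so that the results stay separate, and the input "hello world" for Example 4 is an assumption, since the example does not give one.

```perl
use strict;
use warnings;

my $digits  = "12034005";
my $removed = ($digits =~ s/0//g);          # delete every zero

my $names = "Molly and Mary were cold.";
$names =~ s/(\w+)/"$1"/g;                   # put quotes around each word

my $sums = "Is it Sum, SUM, sum, or suM?";
$sums =~ s/sum/sum/ig;                      # normalize the case of "sum"

my $shout = "hello world";                  # hypothetical input
(my $first_only = $shout) =~ s/(\w+)/uc($1)/e;   # first word only
(my $all_words  = $shout) =~ s/(\w+)/uc($1)/ge;  # every word

print "$removed $digits\n";   # 3 12345
print "$names\n";             # "Molly" "and" "Mary" "were" "cold".
print "$sums\n";              # Is it sum, sum, sum, or sum?
print "$first_only\n";        # HELLO world
print "$all_words\n";         # HELLO WORLD
```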