Mechanics
Copyright © Software Carpentry 2010
This work is licensed under the Creative Commons Attribution License
See http://software-carpentry.org/license.html for more information.
Regular Expressions
Regular Expressions Mechanics
Notebook #1
Site    Date        Evil (millivaders)
----    ----        ------------------
Baker 1 2009-11-17  1223.0
Baker 1 2010-06-24  1122.7
Baker 2 2009-07-24  2819.0
Baker 2 2010-08-25  2971.6
Baker 1 2011-01-05  1410.0
Baker 2 2010-09-04  4671.6
⋮       ⋮           ⋮
Notebook #2
Site/Date/Evil
Davison/May 22, 2010/1721.3
Davison/May 23, 2010/1724.7
Pertwee/May 24, 2010/2103.8
Davison/June 19, 2010/1731.9
Davison/July 6, 2010/2010.7
Pertwee/Aug 4, 2010/1731.3
Pertwee/Sept 3, 2010/4981.0
⋮ ⋮ ⋮
'(.+)/([A-Z][a-z]+) ([0-9]{1,2}),? ([0-9]{4})/(.+)'
This pattern matches:
- one or more characters
- a slash
- a single upper-case letter
- one or more lower-case letters
- a space
- one or two digits
- a comma if one is there
- a space
- exactly four digits
- a slash
- one or more characters
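As a quick check, the pattern can be tried on a couple of Notebook #2 records with Python's standard re module. This is only a minimal sketch: the names PATTERN, records, and parsed are illustrative and do not come from the slides.

```python
import re

# Pattern from the slide: site / month day[,] year / reading
PATTERN = r'(.+)/([A-Z][a-z]+) ([0-9]{1,2}),? ([0-9]{4})/(.+)'

# Two records copied from Notebook #2
records = [
    'Davison/May 22, 2010/1721.3',
    'Pertwee/Sept 3, 2010/4981.0',
]

parsed = []
for record in records:
    match = re.match(PATTERN, record)
    if match:
        # groups() returns the five parenthesised pieces as strings
        parsed.append(match.groups())

for fields in parsed:
    print(fields)
```

The header line 'Site/Date/Evil' does not match, because there is no space, day, or year after the first slash.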
How?
Using finite state machines
Match a single 'a'
[Diagram: finite state machine with one arc labelled 'a']
- start here
- match this character
- must be here at the end
Patterns so far: a
Match one or more 'a'
[Diagram: finite state machine with two arcs, each labelled 'a']
- match this as before
- match again and again
- don't have to stop here the first time, just have to be here at the end
Patterns so far: a, a+
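Adding a loop on the final state gives a machine for 'a+'. The following sketch assumes the same hypothetical table encoding (integer states, a dictionary of arcs) and compares the hand-built machine with Python's re.fullmatch on a few inputs.

```python
import re

START, END = 0, 1
ARCS = {
    (START, 'a'): END,  # match this as before
    (END, 'a'): END,    # match again and again
}

def accepts(text):
    """Follow one arc per character; succeed only if we finish in END."""
    state = START
    for ch in text:
        state = ARCS.get((state, ch))
        if state is None:
            return False  # no arc for this character
    return state == END

for text in ['', 'a', 'aaa', 'ab']:
    # The table-driven machine and the regular expression should agree
    assert accepts(text) == bool(re.fullmatch('a+', text))
    print(repr(text), accepts(text))
```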
Match 'a' or nothing
[Diagram: finite state machine with one arc labelled 'a']
- transition is "free"
So this is '(a|)'
Which is 'a?'
Patterns so far: a, a+, a?
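The claim that '(a|)' and 'a?' describe the same matches can be checked directly. This short sketch uses Python's re.fullmatch; the sample strings and variable names are illustrative choices.

```python
import re

samples = ['', 'a', 'aa', 'b']
results = {}
for text in samples:
    explicit = bool(re.fullmatch('(a|)', text))  # 'a' or nothing
    shorthand = bool(re.fullmatch('a?', text))   # optional 'a'
    results[text] = (explicit, shorthand)
    print(repr(text), explicit, shorthand)
```

Both forms accept the empty string and a single 'a', and reject everything else in the sample list.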
Match zero or more 'a'
[Diagram: finite state machine with two arcs, each labelled 'a']
Combine ideas
This is 'a*'
Patterns so far: a, a+, a?, a*
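Combining the loop from 'a+' with the "free" transition from 'a?' gives one possible machine for 'a*'. The slide's exact diagram is not recoverable here, so the sketch below is an assumed layout: two states, two arcs labelled 'a', and one free transition. Because a free transition consumes no input, the simulation tracks a set of possible states.

```python
START, END = 0, 1
ARCS = {(START, 'a'): {END}, (END, 'a'): {END}}  # arcs that consume 'a'
FREE = {START: {END}}                            # arc that consumes nothing

def closure(states):
    """Add every state reachable through free transitions."""
    result = set(states)
    todo = list(states)
    while todo:
        state = todo.pop()
        for nxt in FREE.get(state, ()):
            if nxt not in result:
                result.add(nxt)
                todo.append(nxt)
    return result

def accepts(text):
    """Track all states the machine could be in after each character."""
    current = closure({START})
    for ch in text:
        moved = set()
        for state in current:
            moved |= ARCS.get((state, ch), set())
        current = closure(moved)
    return END in current

print(accepts(''), accepts('a'), accepts('aaa'), accepts('ab'))
```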
What regular expression is this?
[Diagram: finite state machine with arcs labelled a, a, b, c, d]
a+|(b(c|d))
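The answer can be cross-checked by writing the machine out as a table. The state numbering below is an assumption (the original diagram is not available); it is one layout that uses the five arc labels a, a, b, c, d.

```python
import re

# Hypothetical numbering: 0 = start, 1 = seen one or more 'a',
# 2 = seen 'b', 3 = seen 'b' followed by 'c' or 'd'.
ARCS = {
    (0, 'a'): 1,
    (1, 'a'): 1,
    (0, 'b'): 2,
    (2, 'c'): 3,
    (2, 'd'): 3,
}
ACCEPTING = {1, 3}

def accepts(text):
    """Follow one arc per character; succeed only in an accepting state."""
    state = 0
    for ch in text:
        state = ARCS.get((state, ch))
        if state is None:
            return False  # no arc for this character
    return state in ACCEPTING

for text in ['a', 'aaa', 'bc', 'bd', 'b', 'abc', '', 'bcd']:
    # The table and the regular expression should give the same verdict
    assert accepts(text) == bool(re.fullmatch(r'a+|(b(c|d))', text))
    print(repr(text), accepts(text))
```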
Action at a node depends only on:
- arcs out of that node
- characters in target data
Finite state machines have no memory
This means it is impossible to write a regular expression to check if arbitrarily nested parentheses match
"(((....)))" requires memory (or at least a counter)
Similarly, the only way to check if a word contains each vowel once is to write 5! = 120 clauses
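Both limits can be illustrated in a few lines of Python. The sketch below is not from the slides: balanced uses the counter that a finite state machine lacks, and the vowel check builds the 120-clause pattern mechanically, assuming "each vowel once" means exactly once in a lower-case word. All function and variable names are hypothetical.

```python
import re
from collections import Counter
from itertools import permutations

def balanced(text):
    """Check nesting with a counter, which a finite state machine lacks."""
    depth = 0
    for ch in text:
        if ch == '(':
            depth += 1
        elif ch == ')':
            depth -= 1
            if depth < 0:
                return False  # more closed than opened so far
    return depth == 0

VOWELS = 'aeiou'
OTHER = '[^aeiou]*'  # any run of non-vowels

# One clause per ordering of the five vowels: 5! = 120 alternatives.
CLAUSES = [OTHER + OTHER.join(order) + OTHER for order in permutations(VOWELS)]
EACH_VOWEL_ONCE = re.compile('|'.join(CLAUSES))

def each_vowel_once(word):
    """Procedural equivalent: count the vowels directly."""
    counts = Counter(ch for ch in word if ch in VOWELS)
    return all(counts[v] == 1 for v in VOWELS)

print(len(CLAUSES), balanced('(((....)))'), balanced('(()'))
for word in ['education', 'facetious', 'banana']:
    print(word, bool(EACH_VOWEL_ONCE.fullmatch(word)), each_vowel_once(word))
```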
Why use a tool with limits?
They're fast
- After some pre-calculation, a regular expression only has to look at each character in the input data once
They're readable
- More readable than the procedural equivalent
And regular expressions can do a lot more than what we've seen so far
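To make the two points concrete, here is a hedged sketch in Python. re.compile does the pre-calculation once so the pattern can be reused; parse_by_hand is a hypothetical procedural equivalent written for comparison, and it is looser than the pattern. Note that Python's re module is a backtracking engine, so the "each character once" guarantee applies to true finite-state implementations rather than to this particular library.

```python
import re

# Pre-calculation: compile the Notebook #2 pattern once, then reuse it.
RECORD = re.compile(r'(.+)/([A-Z][a-z]+) ([0-9]{1,2}),? ([0-9]{4})/(.+)')

def parse_with_regex(line):
    """Return the five fields, or None if the line does not match."""
    match = RECORD.fullmatch(line)
    return match.groups() if match else None

def parse_by_hand(line):
    """Rough procedural equivalent (hypothetical, less strict than the pattern)."""
    parts = line.split('/')
    if len(parts) != 3:
        return None
    site, date, reading = parts
    pieces = date.split(' ')
    if len(pieces) != 3:
        return None
    month, day, year = pieces
    day = day.rstrip(',')  # the comma is optional
    if not (month.isalpha() and day.isdigit() and year.isdigit()):
        return None
    return (site, month, day, year, reading)

for line in ['Davison/May 22, 2010/1721.3', 'Site/Date/Evil']:
    print(parse_with_regex(line), parse_by_hand(line))
```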
June 2010
created by
Greg Wilson