101 3.7 search text files using regular expressions
acacio-oliveira -
Core Linux for Red Hat and Fedora learning under GNU Free Documentation License - Copyleft (c) Acácio Oliveira 2012
Everyone is permitted to copy and distribute verbatim copies of this license document, changing is allowed
Linux Essentials and System Administration
Key Knowledge Areas
Create simple regular expressions containing several notational elements. Use regular expression tools to perform searches through a filesystem or file content.
Unix Commands
Search text files using regular expressions
Terms and Utilities
grep egrep fgrep sed regex(7)
Search text files using regular expressions
Using the stream editor
The sed utility is a stream editor that takes input either from a file or from data that is piped into the utility.
sed works on every line of its input unless addressing symbols are used to limit the scope.
The sed command can make simple substitutions as well as more powerful changes to a file.
Simple substitutions throughout a file use the following syntax:
    sed -option 's/REGEXP/replacement/flag' filename
sed works on text from standard input as well as on text from the files named on the command line. The original file is left intact: the edited text is written to standard output, which can be redirected to a new file.
REGEXP stands for regular expression, a pattern that describes the characters to search for.
The s command instructs sed to locate text that matches REGEXP and put the replacement in its place.
Ex: search the myfriends file and replace lisa and nikki with Lisa and Nikki, respectively. The two forms below are equivalent.
    sed -e 's/lisa/Lisa/' -e 's/nikki/Nikki/' myfriends
    sed -e 's/lisa/Lisa/; s/nikki/Nikki/' myfriends
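The substitution example can be tried end to end with a small sketch like the one below. The contents of myfriends are invented sample data, and the scratch directory from mktemp is only there so that nothing else on the system is touched.

```shell
# Work in a scratch directory (assumption: mktemp -d is available).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data for the myfriends file.
printf '%s\n' 'lisa lives in Lisbon' 'nikki knows lisa' > myfriends

# Two -e expressions: every line is run through both substitutions in order.
result=$(sed -e 's/lisa/Lisa/' -e 's/nikki/Nikki/' myfriends)
printf '%s\n' "$result"
# prints:
#   Lisa lives in Lisbon
#   Nikki knows Lisa

# The same two commands in one expression, separated by a semicolon.
result_semicolon=$(sed 's/lisa/Lisa/; s/nikki/Nikki/' myfriends)

# sed only wrote to standard output; the file on disk is unchanged.
original=$(cat myfriends)
```

To keep the edited text, redirect the output to a new file, for example `sed 's/lisa/Lisa/' myfriends > myfriends.new`.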
Options used with sed
Option       Use
-V           Displays version information and then exits (GNU sed documents this option as --version).
-h           Displays help information and then exits (GNU sed documents this option as --help).
-n           Suppresses the automatic printing of each line after it has been processed.
-e command   Adds the command to those being processed.
-f file      Adds the commands in the specified script file to those being processed.
Ex: run a script on a file by using the -f option. This lets you store frequently used commands and simplifies larger command lines.
    sed -f scriptname filename
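A minimal sketch of the -f and -n options follows. The script name fixnames.sed and the sample data are hypothetical; only the options themselves come from the table above.

```shell
# Work in a scratch directory (assumption: mktemp -d is available).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data.
printf '%s\n' 'lisa' 'bob' 'nikki' > myfriends

# fixnames.sed is a hypothetical script file: one sed command per line.
cat > fixnames.sed <<'EOF'
s/lisa/Lisa/
s/nikki/Nikki/
EOF

# -f reads the commands from the script file instead of the command line.
all_lines=$(sed -f fixnames.sed myfriends)
printf '%s\n' "$all_lines"
# prints Lisa, bob, Nikki (one per line)

# -n suppresses the automatic printing of every line; the p flag then
# prints only the lines on which a substitution was made.
changed_only=$(sed -n -e 's/lisa/Lisa/p' -e 's/nikki/Nikki/p' myfriends)
printf '%s\n' "$changed_only"
# prints Lisa, Nikki (one per line)
```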
Flags used with s: flags allow further configuration of the command
Flag         Use
g            Applies the change to every match on a line, not just the first.
p            Prints each line on which a substitution was made. (Normally used with the -n option.)
NUMBER       Replaces only the NUMBERth match on each line.
w filename   Writes each line on which a substitution was made to the specified file.
I            Ignores case when matching REGEXP (a GNU sed extension).

Addressing used with sed
Address         Use
number          Matches the line with that line number.
number,number   Matches the lines from the first number through the second, inclusive.
$               Matches the last line.
!               Placed after an address, applies the command to every line except those the address matches.
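The flags and addresses above can be compared side by side with a sketch like this one. The file sample.txt and its contents are invented for illustration.

```shell
# Work in a scratch directory (assumption: mktemp -d is available).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data: four lines, each with two hyphens.
printf '%s\n' 'a-a-a' 'b-b-b' 'c-c-c' 'd-d-d' > sample.txt

# No flag: only the first match on each line is replaced.
first_only=$(sed 's/-/+/' sample.txt)        # a+a-a b+b-b c+c-c d+d-d

# g flag: every match on each line is replaced.
every_match=$(sed 's/-/+/g' sample.txt)      # a+a+a b+b+b c+c+c d+d+d

# NUMBER flag: only the 2nd match on each line is replaced.
second_only=$(sed 's/-/+/2' sample.txt)      # a-a+a b-b+b c-c+c d-d+d

# Address range: the command applies to lines 2 through 3 only.
range=$(sed '2,3s/-/+/g' sample.txt)         # a-a-a b+b+b c+c+c d-d-d

# $ address: the command applies to the last line only.
last=$(sed '$s/-/+/g' sample.txt)            # a-a-a b-b-b c-c-c d+d+d

# ! after an address: every line except line 1.
not_first=$(sed '1!s/-/+/g' sample.txt)      # a-a-a b+b+b c+c+c d+d+d

printf '%s\n' "$range"
```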
Using grep
The grep utility searches files for the specified pattern.
• By default it prints each line that contains a match.
• It can search the files named on the command line or data from standard input.
Syntax: grep -options [-e searchpattern] [-f patternfile] [filename ...]
grep has three variants, selected by option:
Option   Use
-G       The default behavior, which interprets the pattern as a basic regular expression.
-E       Interprets the pattern as an extended regular expression (the same as egrep). With GNU grep, basic and extended expressions are different notations for the same matching features; they differ in which metacharacters need a backslash.
-F       Interprets the pattern as a list of fixed strings (the same as fgrep).
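The difference between the three variants shows up as soon as a pattern contains a metacharacter. The sketch below uses invented files, words.txt and patterns.txt, to illustrate this.

```shell
# Work in a scratch directory (assumption: mktemp -d is available).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data.
printf '%s\n' 'cat' 'dog' 'a|b' 'c.t' > words.txt

# Basic (the default, same as -G): an unescaped | is an ordinary character.
basic=$(grep 'a|b' words.txt)               # matches the line a|b

# Extended (-E): | means "either the left or the right alternative".
extended=$(grep -E 'cat|dog' words.txt)     # matches cat and dog

# Fixed strings (-F): no character is special, so the dot is literal.
fixed=$(grep -F 'c.t' words.txt)            # matches only c.t

# The same pattern as a regular expression: the dot matches any character.
regex_dot=$(grep 'c.t' words.txt)           # matches cat and c.t

# -f reads the patterns, one per line, from a file.
printf '%s\n' 'dog' 'cat' > patterns.txt
from_file=$(grep -F -f patterns.txt words.txt)   # matches cat and dog

printf '%s\n' "$extended"
```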
Regular expressions
Regular expressions are patterns of characters, some with special meaning, that are useful when using text filters.
Some characters are special: they represent other characters or groups of characters. These special characters are known as metacharacters.
Although metacharacters may look the same as the wildcard characters used at a shell prompt, they function differently.
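The asterisk is a convenient way to see the difference. The sketch below creates three empty files with made-up names and matches them once with a shell wildcard and once with a regular expression.

```shell
# Work in an empty scratch directory (assumption: mktemp -d is available).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical file names.
touch ar are stare

# Shell wildcard: * stands for any string of characters, so *are expands
# to the file names that end in "are".
glob_matches=$(printf '%s\n' *are)
printf '%s\n' "$glob_matches"
# prints are, stare (one per line)

# Regular expression: * repeats the preceding item zero or more times, so
# are* means "ar" followed by zero or more "e" characters, anywhere in the line.
regex_matches=$(printf '%s\n' * | grep 'are*')
printf '%s\n' "$regex_matches"
# prints ar, are, stare (one per line)
```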
Metacharacters
Metacharacter   Use
\               The backslash changes the meaning of the character that follows it. It makes a metacharacter literal (\. matches an actual dot) and gives some ordinary characters a special meaning (as in \{ and \<).
*               The asterisk matches zero or more occurrences of the preceding regular expression. Matching zero or more occurrences is useful when this character is combined with others. For example, a search for .*are returns matches for both are and stare.
.               A dot matches any single character; this character is used as a wildcard.
^               The caret locates the start of a line; it is often followed by another character to locate lines starting with that character. Searching for ^A locates all lines beginning with A.
$               The dollar sign locates the end of a line; when preceded by another character, it locates lines ending with that character. So a$ locates all lines ending with a.
[ ]             Brackets match any one of the characters they contain; a range of characters can also be specified. The range [1-5] matches any one of the digits 1, 2, 3, 4, and 5.
[^ ]            Brackets with a caret as the first character match any single character except those listed. So [^1-9] matches every character except the digits one to nine.
\{ \}           These symbols specify how many times the preceding item must occur. The expression a\{3\} searches for aaa, while a\{1,3\} locates a, aa, and aaa.
\< \>           The backslash and less-than symbols mark the start of a word, and the backslash and greater-than symbols mark its end. The characters between them are matched only at word boundaries, which lets you locate complete words regardless of where they appear in a sentence. For example, \<the\> matches the word the but not other.
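Each metacharacter can be checked with grep against a small file. In the sketch below, lines.txt and its contents are invented, and \< \> is assumed to be available as it is in GNU grep.

```shell
# Work in a scratch directory (assumption: mktemp -d is available).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data.
printf '%s\n' 'Apple pie' 'banana' 'the end' 'other' 'aaa 42' '3.14' > lines.txt

starts_with_A=$(grep '^A' lines.txt)       # Apple pie
ends_with_a=$(grep 'a$' lines.txt)         # banana
has_digit=$(grep '[1-5]' lines.txt)        # aaa 42 and 3.14
three_a=$(grep 'a\{3\}' lines.txt)         # aaa 42
literal_dot=$(grep '3\.1' lines.txt)       # 3.14
any_char=$(grep '^.....$' lines.txt)       # other (the only 5-character line)

# Word boundaries (assumption: GNU grep): "the" as a complete word only.
whole_word=$(grep '\<the\>' lines.txt)     # the end, but not other

printf '%s\n' "$has_digit"
```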