Revision Lecture

Revision Lecture Mauro Jaskelioff


Page 1: Revision Lecture

Revision Lecture

Mauro Jaskelioff

Page 2: Revision Lecture

AWK Program Structure

• AWK programs consist of patterns and procedures

Pattern_1 { Procedure_1 }
Pattern_2 { Procedure_2 }
Pattern_3 { Procedure_3 }
…
Pattern_n { Procedure_n }

• Additionally, a program can contain function definitions (but we don’t need to worry about them now)

Page 3: Revision Lecture

Example program

• Don’t mind details! Try to recognize the general structure described on the previous slide.

BEGIN {
    FS = ":"
    print "Example v0.1"
}

$7 ~ /bash/ {
    print $1 " uses bash"
}

$4 == 0 {
    print "user " $1 " belongs to the root group"
}

{
    print "--------------------------------"
}
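To see the structure in action, the program above can be run on a few lines in /etc/passwd format. This is only a sketch: the three accounts below are made-up sample data, and the separator line is shortened.

```shell
# Hypothetical sample data in /etc/passwd format (not real accounts).
sample='root:x:0:0:root:/root:/bin/bash
alice:x:1000:1000:Alice:/home/alice:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin'

out=$(printf '%s\n' "$sample" | awk '
BEGIN {
    FS = ":"
    print "Example v0.1"
}
$7 ~ /bash/ { print $1 " uses bash" }
$4 == 0     { print "user " $1 " belongs to the root group" }
            { print "----" }
')
printf '%s\n' "$out"
```

The BEGIN procedure runs once, then every record is tried against the three remaining patterns, so the separator is printed once per input line.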

Page 4: Revision Lecture

AWK Input

• AWK input consists of records and fields
• Records are separated by a record separator RS
• By default the RS is a newline, so each record is a line of input
• Each record consists of zero or more fields, separated by a field separator FS
• By default the FS is blank space: any run of spaces or tabs separates fields
• The current record is $0. Each of its fields is $1, $2, …

Page 5: Revision Lecture

Example of inputs

Consider the following input file:

Red,255 0 0
Green,0 255 0
Blue,0 0 255

• Default RS and default FS
  if $0="Red,255 0 0" then $1="Red,255", $2="0" and $3="0"

• With FS=","
  if $0="Red,255 0 0" then $1="Red" and $2="255 0 0"
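The two cases on this slide can be checked with a one-line input. A minimal sketch, in which the "|" between fields is only there to make the field boundaries visible:

```shell
line='Red,255 0 0'

# Default FS: fields are separated by blank space.
default_fs=$(printf '%s\n' "$line" | awk '{ print $1 "|" $2 "|" $3 }')

# FS set to a comma with the -F option.
comma_fs=$(printf '%s\n' "$line" | awk -F, '{ print $1 "|" $2 }')

echo "$default_fs"   # Red,255|0|0
echo "$comma_fs"     # Red|255 0 0
```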

Page 6: Revision Lecture

AWK’s Main loop (simplified)

for each input record r do
    parse r
    for each pattern pat_i do
        if r matches pat_i then
            execute proc_i
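One consequence of this loop is that a record is tried against every pattern, in program order, so several procedures may run for the same record. A small sketch with two made-up numeric records:

```shell
out=$(printf '5\n12\n' | awk '
$1 > 3  { print $1 " is greater than 3" }
$1 > 10 { print $1 " is greater than 10" }
')
printf '%s\n' "$out"
# 5 is greater than 3
# 12 is greater than 3
# 12 is greater than 10
```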

Page 7: Revision Lecture

Patterns

A pattern can be:
• Relational expression
  – Use relational operators, e.g. $1 > $2
    awk -F: '$1 > $2 {print $0}' /etc/passwd
  – Can do numeric or string comparisons
    awk -F: '$1=="gdm" {print $0}' /etc/passwd

• An empty pattern
    awk -F: '{print $0}' /etc/passwd
  – Always true
  – Equivalent to a true expression. For example, the command above is the same as:
    awk -F: '1 < 2 {print $0}' /etc/passwd
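The difference between numeric and string comparison is easiest to see on one pair of values. In this sketch the variable names numeric and textual are illustrative; concatenating "" to a field is a common way to force it to be treated as a string.

```shell
# Numeric-looking fields are compared as numbers; forcing them to strings
# (by concatenating "") compares them character by character instead.
out=$(echo '10 9' | awk '{
    numeric = ($1 > $2)            # 10 > 9 as numbers
    textual = (($1 "") > ($2 ""))  # "10" > "9" as strings
    print numeric, textual
}')
echo "$out"   # 1 0
```

As strings, "10" sorts before "9" because the comparison starts with the characters "1" and "9".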

Page 8: Revision Lecture

Patterns (2)

• Pattern-matching expression
  – E.g. quoted strings, numbers, operators, defined variables…
  – ~ means match, !~ means don't match
    awk -F: '$1 ~ /.dm.*/ {print $0}' /etc/passwd
    awk -F: '$0 ~ /^...:/ {print $0}' /etc/passwd
    awk -F: '$1 !~ /^g/ {print $0}' /etc/passwd

• /regular expression/
  – Equivalent to $0 ~ /regular expression/
    awk -F: '/^...:/ {print $1}' /etc/passwd
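The same patterns can be tried without touching the real /etc/passwd. The sketch below assumes three made-up entries (gdm, bin, games) and prints only the login name of each matching record.

```shell
# Hypothetical sample data in /etc/passwd format.
sample='gdm:x:42:42
bin:x:2:2
games:x:5:60'

# Any character followed by "dm" somewhere in the first field.
a=$(printf '%s\n' "$sample" | awk -F: '$1 ~ /.dm.*/ {print $1}')

# Records that start with exactly three characters and then a colon.
b=$(printf '%s\n' "$sample" | awk -F: '/^...:/ {print $1}')

# First field does not start with g.
c=$(printf '%s\n' "$sample" | awk -F: '$1 !~ /^g/ {print $1}')

echo "$a"   # gdm
echo "$b"   # gdm, then bin
echo "$c"   # bin
```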

Page 9: Revision Lecture

Special patterns

• Two special patterns:
  – BEGIN
    • Specifies procedures that take place before the first input line is processed
      awk 'BEGIN {print "Version 1.0"}' dataFile
  – END
    • Specifies procedures that take place after the last input record is read
      awk 'END {print "end of data"}' dataFile

• This means we need to refine the description of the main loop (see next slide)

Page 10: Revision Lecture

AWK’s refined Main loop

for each BEGIN pattern do
    execute corresponding procedure

for each input record r do
    parse r
    for each pattern pat_i do
        if r matches pat_i then
            execute proc_i

for each END pattern do
    execute corresponding procedure

(The middle loop is the previous version of the main loop.)
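The refined loop implies that BEGIN and END procedures run first and last regardless of where they appear in the program text. A small sketch that deliberately writes them in the "wrong" order:

```shell
out=$(printf 'a\nb\nc\n' | awk '
END   { print "end of data: " NR " records" }
      { print "record " NR ": " $0 }
BEGIN { print "Version 1.0" }
')
printf '%s\n' "$out"
# Version 1.0
# record 1: a
# record 2: b
# record 3: c
# end of data: 3 records
```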

Page 11: Revision Lecture

Procedures

• Procedures consist of the usual assignment, conditional, and looping statements found in most languages.

• These are separated by newlines or semi-colons and are contained within curly brackets { }

• A procedure can be omitted. A pattern with no procedure prints $0 (an empty pair of brackets { } does nothing).
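As a short illustration of these points, the sketch below puts two statements in one procedure, separated by a semicolon, and then uses a pattern written with no procedure after it. The input lines are made up.

```shell
# Two statements in one procedure, separated by a semicolon.
two=$(echo 'a b' | awk '{ n = NF; print "fields: " n }')

# A pattern with no procedure at all: the matching record is printed.
bare=$(printf 'keep me\nskip me\n' | awk '/keep/')

echo "$two"    # fields: 2
echo "$bare"   # keep me
```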

Page 12: Revision Lecture

awk Built-in Variables

• awk has a number of built-in variables:
  – FILENAME - current filename
  – FS - Field separator
  – NF - Number of fields in current record
  – NR - Number of current record
  – RS - Record separator
  – $0 - Entire input record
  – $n - nth field in current record
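A quick way to get a feel for these variables is to print them for each record of a small file. This sketch assumes the mktemp utility is available and uses a throwaway two-line file; $NF is the field whose number is NF, that is, the last field.

```shell
tmp=$(mktemp)
printf 'one two three\nfour five\n' > "$tmp"

# NR counts records, NF counts fields in the current record.
out=$(awk '{ print NR ": " NF " fields, last is " $NF }' "$tmp")

# FILENAME holds the name of the file being read.
name=$(awk 'NR == 1 { print FILENAME }' "$tmp")

printf '%s\n' "$out"
# 1: 3 fields, last is three
# 2: 2 fields, last is five
[ "$name" = "$tmp" ] && echo "FILENAME matches the file argument"
rm -f "$tmp"
```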

Page 13: Revision Lecture

Control Structures

• if (condition) statement
• if (condition) statement else statement
• for (expr1; expr2; expr3) statement
• for (index in array) statement
  – More about this when we review arrays.
• while (condition) statement
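A minimal sketch combining two of these forms: a for loop walks over the fields of a record, and an if/else classifies each one. The input numbers are arbitrary.

```shell
out=$(echo '3 8 5' | awk '{
    for (i = 1; i <= NF; i++) {
        if ($i % 2 == 0) {
            print $i " is even"
        } else {
            print $i " is odd"
        }
    }
}')
printf '%s\n' "$out"
# 3 is odd
# 8 is even
# 5 is odd
```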

Page 14: Revision Lecture

For-While equivalence

for (expr1; expr2; expr3) statement

is equivalent to:

expr1;
while (expr2) {
    statement;
    expr3
}
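The equivalence can be checked by computing the same sum both ways. In this sketch the variable names a, b, i and j are illustrative; uninitialised awk variables start out as zero.

```shell
out=$(awk 'BEGIN {
    # for version: sum of 1..5
    for (i = 1; i <= 5; i++) a += i

    # while version of the same loop
    j = 1
    while (j <= 5) { b += j; j++ }

    print a, b
}')
echo "$out"   # 15 15
```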

Page 15: Revision Lecture

awk Operators

Symbol               Meaning
$                    Field reference
++ --                Increment, decrement
+ - !                Addition, subtraction, negation
* / %                Multiplication, division, modulus
< <= > >= != ==      Relational operators
~ !~                 Match regular expression and negation
in                   Array membership
&& ||                Logical and, logical or
?:                   If-then-else for expressions, e.g. x == y ? "Equal" : "Not equal"
= += -= *= /= %=     Assignment
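A few of these operators in one small sketch, with arbitrary values chosen so the arithmetic is easy to follow:

```shell
out=$(awk 'BEGIN {
    x = 7; y = 7
    msg = (x == y ? "Equal" : "Not equal")
    print msg

    n = 10
    n += 5      # n is now 15
    n %= 4      # 15 modulo 4 leaves 3
    n++         # n is now 4
    print n
}')
printf '%s\n' "$out"
# Equal
# 4
```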

Page 16: Revision Lecture

Arrays in awk

• awk has arrays with elements subscripted with strings (associative arrays)

• Assign arrays in one of two ways:
  – Name them in an assignment statement
    • myArray[i]=n++
    • myArray["Red"]="255 0 0"
  – Use the split(str,arr,fs) function, which splits the string str into elements of array arr, using field separator fs. It returns the number of fields used.
    • n=split(input, words, " ")

Page 17: Revision Lecture

Example of split

m=split("Blue 0 0 255",colors," ")

results in:
m ← 4
colors[1] ← "Blue"
colors["2"] ← "0"
colors[3] ← "0"
colors["4"] ← "255"

• Since indexes are really strings it's legal to write them enclosed in quotes
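The split call from this slide can be run inside a BEGIN procedure, which needs no input file. The sketch reads one element with a numeric subscript and one with a quoted subscript to show that both refer to the same kind of index.

```shell
out=$(awk 'BEGIN {
    m = split("Blue 0 0 255", colors, " ")
    print m, colors[1], colors["4"]
}')
echo "$out"   # 4 Blue 255
```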

Page 18: Revision Lecture

Reading elements in an array

• Using a for loop:

    for (idx in array)
        print array[idx]

  – Since indexes are arbitrary strings, this is the general way to loop through all elements of an array. The order in which the indexes are visited is not specified.

• Using the operator in:

    if (idx in array)
        ...

  – We use this to test if an index exists.
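Both uses of in can be tried on a small associative array. In this sketch the array rgb and the names name, count, has_red and has_pink are illustrative; because the loop order is not specified, the example counts the elements rather than printing them.

```shell
out=$(awk 'BEGIN {
    rgb["Red"]   = "255 0 0"
    rgb["Green"] = "0 255 0"
    rgb["Blue"]  = "0 0 255"

    # Loop over every index of the array.
    for (name in rgb) count++

    # Test whether an index exists.
    has_red  = ("Red" in rgb)  ? "yes" : "no"
    has_pink = ("Pink" in rgb) ? "yes" : "no"

    print count, has_red, has_pink
}')
echo "$out"   # 3 yes no
```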