Introduction to Unix (CA263) File Processing. Guide to UNIX Using Linux, Third Edition 2 Objectives...

27
Introduction to Unix (CA263) File Processing

Transcript of Introduction to Unix (CA263) File Processing. Guide to UNIX Using Linux, Third Edition 2 Objectives...

Page 1: Introduction to Unix (CA263) File Processing. Guide to UNIX Using Linux, Third Edition 2 Objectives Explain UNIX and Linux file processing Use basic file.

Introduction to Unix (CA263)

File Processing

Page 2: Introduction to Unix (CA263) File Processing. Guide to UNIX Using Linux, Third Edition 2 Objectives Explain UNIX and Linux file processing Use basic file.

Guide to UNIX Using Linux, Third Edition 2

Objectives

• Explain UNIX and Linux file processing

• Use basic file manipulation commands to create, delete, copy, and move files and directories

• Employ commands to combine, cut, paste, rearrange, and sort information in files

• Create a script file

• Use the join command to link files using a common field

• Use the awk command to create a professional-looking report

Page 3: Introduction to Unix (CA263) File Processing. Guide to UNIX Using Linux, Third Edition 2 Objectives Explain UNIX and Linux file processing Use basic file.

Guide to UNIX Using Linux, Third Edition 3

Understanding File Structures

• Files can be structured in many ways depending on the kind of data they store– Employee information can be stored on

separate line separated by delimiters such as colon. This type of record is known as variable-length record.

– Another way to create a records is to have fixed number of column for each column. This type of record is known as fixed-length record.

Page 4: Introduction to Unix (CA263) File Processing. Guide to UNIX Using Linux, Third Edition 2 Objectives Explain UNIX and Linux file processing Use basic file.

Guide to UNIX Using Linux, Third Edition 4

Understanding File Structures (continued)

• UNIX/Linux store data, such as letters and product records, as flat ASCII files

• Three kinds of regular files are– Unstructured ASCII character, you can store any

kind of data in any order. You can’t retrieve a particular column, you have to print everything to get what you need.

– Unstructured ASCII records, it store data as a sequence of fixed-length record, contains similar information of different persons on different rows.

– Unstructured ASCII trees, it is structured as a tree of record, that can be organized as fixed-length or variable length record. Each record contains key that helps in searching record quickly.

Page 5: Introduction to Unix (CA263) File Processing. Guide to UNIX Using Linux, Third Edition 2 Objectives Explain UNIX and Linux file processing Use basic file.

Guide to UNIX Using Linux, Third Edition 5

Understanding File Structures (continued)

Page 6: Introduction to Unix (CA263) File Processing. Guide to UNIX Using Linux, Third Edition 2 Objectives Explain UNIX and Linux file processing Use basic file.

Guide to UNIX Using Linux, Third Edition 6

Manipulating Files

• Creating files• Delete files when no longer needed• Finding a file• Combining files using paste command and output

redirection• Separating files suing cut command• Sorting the contents of a file

Page 7: Introduction to Unix (CA263) File Processing. Guide to UNIX Using Linux, Third Edition 2 Objectives Explain UNIX and Linux file processing Use basic file.

Guide to UNIX Using Linux, Third Edition 7

Create Files

• Redirection sign & touch command will create an empty file.

$ > accountfiles$ touch accountsfile2

Syntax touch [-options] [filename(s)]Useful Options include:

-a update the access time only

-m update the last time the file was modified

-c prevent creating file, if it does not exist

Page 8: Introduction to Unix (CA263) File Processing. Guide to UNIX Using Linux, Third Edition 2 Objectives Explain UNIX and Linux file processing Use basic file.

Guide to UNIX Using Linux, Third Edition 8

Delete Files

• Delete files or directory permanently when no longer needed

• $ rm –i phonebook.bak

Syntax rm [-options] [filename / directory]Useful Options include:-i display warning before deleting a file-r will remove directory and everything it contains

Page 9: Introduction to Unix (CA263) File Processing. Guide to UNIX Using Linux, Third Edition 2 Objectives Explain UNIX and Linux file processing Use basic file.

Guide to UNIX Using Linux, Third Edition 9

Removing Directories

• Remove directory permanently when no longer needed

• $ rmdir documentsSyntax rmdir [-options] [directory]Useful Options include:

-i display warning before deleting a file ???-r will remove directory and everything it contains

• Note: Directory must be empty to delete with the rmdir command

Page 10: Introduction to Unix (CA263) File Processing. Guide to UNIX Using Linux, Third Edition 2 Objectives Explain UNIX and Linux file processing Use basic file.

Guide to UNIX Using Linux, Third Edition 10

Finding a File

• Finding a file helps you locate it in the directory structure

• $ find –i phonebook.bak

Syntax: find [pathname] [-name filename]Useful Options include:

Pathname: . is for current directory

-name indicates that you are searching for file

with specific name. You can use wild card

Page 11: Introduction to Unix (CA263) File Processing. Guide to UNIX Using Linux, Third Edition 2 Objectives Explain UNIX and Linux file processing Use basic file.

Guide to UNIX Using Linux, Third Edition 11

Combining Files

• Combining files using output redirection

– cat command - concatenate text of two different files via output redirection

– paste command - joins text of different files in side by side fashion

Page 12: Introduction to Unix (CA263) File Processing. Guide to UNIX Using Linux, Third Edition 2 Objectives Explain UNIX and Linux file processing Use basic file.

Guide to UNIX Using Linux, Third Edition 12

Sorting

• Sorting a file’s contents alphabetically or numerically.

Syntax sort [-option] [filename]

Useful Options include:-k n sort on key field specified by n-t indicates a specified char that separate fields-m merges input file that have been previously sorted-o redirect output to the specified file-d sorts in alphanumeric or dictionary order-g sorts by numeric order-r sorts in reverse order

Page 13: Introduction to Unix (CA263) File Processing. Guide to UNIX Using Linux, Third Edition 2 Objectives Explain UNIX and Linux file processing Use basic file.

Guide to UNIX Using Linux, Third Edition 13

Sorting Example

$ sort –k 3 food >sortedfood$ Cat sortedfoodLettuce Sourdough BeefSpinach White bread ChickenBeans Pumpernickel MuttonCarrots Whole Wheat Turkey

Page 14: Introduction to Unix (CA263) File Processing. Guide to UNIX Using Linux, Third Edition 2 Objectives Explain UNIX and Linux file processing Use basic file.

Guide to UNIX Using Linux, Third Edition 14

Sorting –n Example

• The –n option to sort specifies that the first field on the line is to be considered a number.

$ cat data5 272 123 3323 2-5 1115 614 -9

$ sort data-5 1114 -915 62 1223 23 335 27

$ sort –n data-5 112 123 335 2714 -915 623 2

Page 15: Introduction to Unix (CA263) File Processing. Guide to UNIX Using Linux, Third Edition 2 Objectives Explain UNIX and Linux file processing Use basic file.

Guide to UNIX Using Linux, Third Edition 15

Sorting +1n Example

• The +1 say to skip the first field. Similarly, +5n would mean to skip the first five field.

$ cat data5 272 123 3323 2-5 1115 614 -9

• Skip the first field in the sort

$ sort +1n data14 -923 215 6-5 112 125 273 33

Page 16: Introduction to Unix (CA263) File Processing. Guide to UNIX Using Linux, Third Edition 2 Objectives Explain UNIX and Linux file processing Use basic file.

Guide to UNIX Using Linux, Third Edition 16

Creating Script Files

• UNIX/Linux users create shell script files to contain commands that can be run sequentially as a set – this helps with the issues of command automation and re-use of command actions

• UNIX/Linux users use the vi editor to create script files, then make the script executable using the chmod command with the x argument

Page 17: Introduction to Unix (CA263) File Processing. Guide to UNIX Using Linux, Third Edition 2 Objectives Explain UNIX and Linux file processing Use basic file.

Guide to UNIX Using Linux, Third Edition 17

Creating Script Files (continued)

Page 18: Introduction to Unix (CA263) File Processing. Guide to UNIX Using Linux, Third Edition 2 Objectives Explain UNIX and Linux file processing Use basic file.

Guide to UNIX Using Linux, Third Edition 18

Using the join Command on Two Files

• Sometimes you want to link the information in two files

• The join command is often used in relational database processing

• The join command associates information in two different files on the basis of a common field or key in those files

Page 19: Introduction to Unix (CA263) File Processing. Guide to UNIX Using Linux, Third Edition 2 Objectives Explain UNIX and Linux file processing Use basic file.

Guide to UNIX Using Linux, Third Edition 19

The join Command

• Syntax: join [-option] [file1 file2]Useful Options include:• File1 and file2 are two input files that must be

sorted on join field.-1 fieldnum specifies the common join field in file 1-2 fieldnum specifies the common join field in file 2-o specifies a list of field to output

-t specifies the field separator character, space, tab or new line

-a filenum produce a file for each unpairable line

-e str replaces empty fields for the unpairable line in the string specified by str.

Page 20: Introduction to Unix (CA263) File Processing. Guide to UNIX Using Linux, Third Edition 2 Objectives Explain UNIX and Linux file processing Use basic file.

Guide to UNIX Using Linux, Third Edition 20

sed

• sed is a program used for editing data. It stands for stream editor.

• Example: To change first occurrences of “Unix” to “UNIX” on every line of intro

$ cat introThe Unix operating system was pioneered by Ken Thompson. Main goal of Unix was to create Unix environment for efficient program development

$ sed 's/Unix/UNIX/' introThe UNIX operating system was pioneered by Ken Thompson. Main goal of UNIX was to create Unix environment for efficient program development

Page 21: Introduction to Unix (CA263) File Processing. Guide to UNIX Using Linux, Third Edition 2 Objectives Explain UNIX and Linux file processing Use basic file.

Guide to UNIX Using Linux, Third Edition 21

sed with g option

• Example: To change all occurrences of “Unix” to “UNIX” on every line of intro

$ cat introThe Unix operating system was pioneered by Ken Thompson. Main goal of Unix was to create Unix environment for efficient program development

$ sed 's/Unix/UNIX/g' intro The UNIX operating system was pioneered by Ken Thompson. Main goal of UNIX was to create UNIX environment for efficient program development

Page 22: Introduction to Unix (CA263) File Processing. Guide to UNIX Using Linux, Third Edition 2 Objectives Explain UNIX and Linux file processing Use basic file.

Guide to UNIX Using Linux, Third Edition 22

sed with –n option

• Just print first 2 lines

$ cat intro The Unix operating system was pioneered by Ken Thompson. Main goal of Unix was to create Unix environment for efficient program development

$ sed –n '1,2p' intro The Unix operating system was pioneered by Ken Thompson. Main goal of Unix was to create Unix

Page 23: Introduction to Unix (CA263) File Processing. Guide to UNIX Using Linux, Third Edition 2 Objectives Explain UNIX and Linux file processing Use basic file.

Guide to UNIX Using Linux, Third Edition 23

sed with –n option

• Just print first lines containing Unix

$ cat intro The Unix operating system was pioneered by Ken Thompson. Main goal of Unix was to create Unix environment for efficient program development

$ sed –n '/Unix/p' intro The Unix operating system was pioneered by Ken Thompson. Main goal of Unix was to create Unix

Page 24: Introduction to Unix (CA263) File Processing. Guide to UNIX Using Linux, Third Edition 2 Objectives Explain UNIX and Linux file processing Use basic file.

Guide to UNIX Using Linux, Third Edition 24

sed with d option

• Delete lines 1 and 2

$ cat intro The Unix operating system was pioneered by Ken Thompson. Main goal of Unix was to create Unix environment for efficient program development

$ sed '1,2d' intro environment for efficient program development

• Delete all lines containing Unix

$ sed '/Unix/d' intro environment for efficient program development

Page 25: Introduction to Unix (CA263) File Processing. Guide to UNIX Using Linux, Third Edition 2 Objectives Explain UNIX and Linux file processing Use basic file.

Guide to UNIX Using Linux, Third Edition 25

A Brief Introduction to theAwk Program

• Awk, a pattern-scanning and processing language helps to produce professional-looking reports

• Awk provides a powerful programming environment that can perform actions on files that are difficult to duplicate with a combination of other commands

Page 26: Introduction to Unix (CA263) File Processing. Guide to UNIX Using Linux, Third Edition 2 Objectives Explain UNIX and Linux file processing Use basic file.

Guide to UNIX Using Linux, Third Edition 26

A Brief Introduction to theAwk Program (continued)

• Awk checks to see if the input records in specified files satisfy a pattern

• If so, awk executes a specified action

• If no pattern is provided, awk applies the action to every record

Page 27: Introduction to Unix (CA263) File Processing. Guide to UNIX Using Linux, Third Edition 2 Objectives Explain UNIX and Linux file processing Use basic file.

Guide to UNIX Using Linux, Third Edition 27

awk command

• Syntax:

awk [-Fsep] [‘pattern {action}…’] [filename]Useful Options include:-F: means the field separator is colon

awk ‘BEGIN { print "This is an awk print line." }’This is an awk print line.

awk –F: ‘{printf "%s\t %s\n", $1, $2}’ datafile