The Linux Filesystem Part Workbook...

77
Part Workbook 5. The Linux Filesystem

Transcript of The Linux Filesystem Part Workbook...

Part Workbook 5. The Linux Filesystem

2

Table of Contents1. File Details ................................................................................................................... 5

Discussion .............................................................................................................. 5How Linux Save Files ...................................................................................... 5What's in an inode? .......................................................................................... 6Viewing inode information with the stat command ................................................. 9Viewing inode information with the ls command .................................................. 10Identifying Files by Inode ................................................................................ 10

Examples .............................................................................................................. 11Example 1. Comparing file sizes with ls -s and ls -l ............................................... 11Example 2. Listing files, sorted by modify time ..................................................... 11Example 3. Decorating listings with ls -F ............................................................. 12

Online Exercises .................................................................................................... 13Online Exercise 1. Viewing file metadata ............................................................. 13

Specification .......................................................................................... 13Deliverables ........................................................................................... 13Suggestions ........................................................................................... 13

Questions .............................................................................................................. 132. Hard and Soft Links ..................................................................................................... 17

Discussion ............................................................................................................. 17The Case for Hard Links ................................................................................. 17Hard Link Details ........................................................................................... 17The Case for Soft Links .................................................................................. 19Soft Link Details ............................................................................................ 20Creating Links with the ln Command ................................................................. 20Issues with Soft Links ..................................................................................... 21

Dangling Links ...................................................................................... 21Recursive links ....................................................................................... 22Absolute vs. Relative Soft Links ............................................................... 23

Comparing Hard and Soft Links ........................................................................ 23Examples .............................................................................................................. 23

Example 1. Working with hard links ................................................................... 23Example 2. Working with soft links .................................................................... 24Example 3. Working with soft links and directories ................................................ 25

Online Exercises .................................................................................................... 25Online Exercise 1. Creating and Managing Links ................................................... 25

Specification .......................................................................................... 25Deliverables ........................................................................................... 26

Online Exercise 2. Sharing a Hard Linked File ...................................................... 26Specification .......................................................................................... 26Deliverables ........................................................................................... 27

Questions .............................................................................................................. 273. Directories and Device Nodes ........................................................................................ 30

Discussion ............................................................................................................. 30Directories ..................................................................................................... 30

Directory Structure ................................................................................. 30Directory Links ...................................................................................... 31

Device Nodes ................................................................................................ 33Block and Character Device Nodes ............................................................ 33Terminals as Devices .............................................................................. 34Device Permissions, Security, and the Console User ...................................... 35

Examples .............................................................................................................. 36

The Linux Filesystem

3

Example 1. Interpreting Directory Links Counts .................................................... 36Questions .............................................................................................................. 37

4. Disks, Filesystems, and Mounting ................................................................................... 40Discussion ............................................................................................................. 40

Disk Devices ................................................................................................. 40Low Level Access to Drives ............................................................................. 41Filesystems .................................................................................................... 42Mounting Filesystems ...................................................................................... 43

Viewing Mount Points ............................................................................. 44Why Bother? ......................................................................................... 44Mounting Temporary Media: The /media directory. ....................................... 45Mounting Issues ..................................................................................... 46

Examples .............................................................................................................. 47Example 1. Using an Unformatted Floppy ............................................................ 47Example 2. Using a DOS Formatted Floppy ......................................................... 48Example 3. Floppy Images ................................................................................ 49

Online Exercises .................................................................................................... 49Online Exercise 1. Using Floppies ...................................................................... 49

Setup .................................................................................................... 49Specification .......................................................................................... 50Deliverables ........................................................................................... 50Possible Solution .................................................................................... 50Cleaning Up .......................................................................................... 50

Online Exercise 2. Imaging a Floppy ................................................................... 50Setup .................................................................................................... 51Specification .......................................................................................... 51Deliverables ........................................................................................... 51Possible Solution .................................................................................... 51Cleaning Up .......................................................................................... 51

Questions .............................................................................................................. 515. Locating Files with locate and find ................................................................................. 55

Discussion ............................................................................................................. 55Locating Files ................................................................................................ 55Using Locate ................................................................................................. 55Using find ..................................................................................................... 56

Find Criteria .......................................................................................... 57Find Actions .......................................................................................... 58

Examples .............................................................................................................. 59Example 1. Using locate ................................................................................... 59Example 2. Using find ...................................................................................... 60Example 3. Using find to Execute Commands on Files ........................................... 60

Online Exercises .................................................................................................... 61Online Exercise 1. Locating files ........................................................................ 61

Specification .......................................................................................... 61Deliverables ........................................................................................... 61

Questions .............................................................................................................. 626. Compressing Files: gzip and bzip2 .................................................................................. 64

Discussion ............................................................................................................. 64Why Compress Files? ...................................................................................... 64Standard Linux Compression Utilities ................................................................. 64Other Compression Utilities .............................................................................. 65

Examples .............................................................................................................. 65Example 1. Working with gzip ........................................................................... 65Example 2. Using gzip Recursively ..................................................................... 65

The Linux Filesystem

4

Example 3. Working with bzip2 ......................................................................... 66Online Exercises .................................................................................................... 66

Online Exercise 1. Working with compression Utilities ........................................... 66Specification .......................................................................................... 66Deliverables ........................................................................................... 67

Questions .............................................................................................................. 677. Archiving Files with tar ................................................................................................ 69

Discussion ............................................................................................................. 69Archive Files ................................................................................................. 69Tar Command Basics ...................................................................................... 69More About tar .............................................................................................. 70

Absolute References ................................................................................ 71Establishing Context ............................................................................... 72Compressing archives .............................................................................. 73

Examples .............................................................................................................. 73Example 1. Creating a tar Archive ...................................................................... 73Example 2. Tarring Directly to a Floppy .............................................................. 74Example 3. Oops. ............................................................................................ 75

Online Exercises .................................................................................................... 75Online Exercise 1. Archiving Directories ............................................................. 75

Specification .......................................................................................... 75Deliverables ........................................................................................... 75

Questions .............................................................................................................. 76

5

Chapter 1. File DetailsKey Concepts

• The term file refers to regular files, directories, symbolic links, device nodes, and others.

• All files have common attributes: user owner, group owner, permissions, and timing information. Thisinformation is stored in a structure called an inode.

• File names are contained in data structures called dentries (directory entries).

• A file's inode information can be examined with the ls -l and stat commands.

• Within the Linux kernel, files are generally identified by inode number. The ls -i command can be usedto examine inode numbers.

Discussion

How Linux Save FilesSuppose elvis opens a text editor, and composes the following shopping list.

eggsbaconmilk

When he is finished, and closes the editor, he is asked what he would like to name the file. He choosesshopping.txt. Once he is done, he lists the contents of the directory to make sure it is there.

[elvis@station elvis]$ ls -ltotal 4-rw-rw-r-- 1 elvis elvis 16 Jul 11 07:54 shopping.txt

This short example illustrates the three components Linux associates with a file.

data The data is the content of the file, in this case, the 16 bytes that compose elvis'sshopping list (13 visible characters, and 3 invisible "return" characters that indicatethe end of a line). In Linux, as in Unix, every file's content is stored as a series of bytes.

metadata In addition to its content, every file in Linux has extra information associated with it.The entire last Workbook focused on some of this information, namely the file's userowner, group owner, and permissions. Other information, such as the last time the filewas modified or read, is stored as well. Much of this metadata is reported when yourun the ls -l command. In Linux (and Unix), all of the extra information associatedwith a file (with the important exception discussed next) is stored in a structure calledan inode.

filename The filename is the exception to the rule. Although the filename could be consideredmetadata associated with the file, it is not stored in the inode directly. Instead, thefilename is stored in a structure called a dentry. (The term dentry is a shortening ofdirectory entry, and, as we will see in a later Lesson, the structure is closely associatedwith directories.) In essence, the filename associates a name with an inode.

In summary, there are three structures associated with every file: a dentry which contains a filenameand refers to an inode, which contains the file's metadata and refers to the file's data. Understanding the

File Details

6

relationships between these structures helps in understanding later concepts, such as links and directories.These structures are summarized in the figure below.

Figure 1.1. File Structures

What's in an inode?In Linux (and Unix), every file that exists in the filesystem has an associated inode which stores all of thefile's information, except for the filename. What can be found in an inode?

File Type In Linux (and Unix), the term file has a verygeneral meaning: anything that exists in thefilesystem (and thus has an inode associatedwith it) is a file. This includes regularfiles and directories, which we have alreadyencountered, symbolic links and device nodeswhich we will soon encounter, and a coupleof more obscure creatures which are related tointerprocess communication, and beyond thescope of the course. The possible file typesare listed in the table below.

Table 1.1. Linux (Unix) File Types

File Type lsabbr.

Use

Regular File - Storing data

Directories d Organizing files

SymbolicLinks

l Referring to otherfiles

CharacterDeviceNodes

c Accessing devices

Block DeviceNodes

b Accessing devices

File Details

7

File Type lsabbr.

Use

Named Pipes p Interprocesscommunication

Sockets s Interprocesscommunication

Each of the seven file types mentioned aboveuses the same inode structure, so they eachhave the same types of attributes: owners,permissions, modify times, etc. When listingfiles with ls -l, the file type of a file isidentified by the first character, using theabbreviations found in the second columnabove.

Note

The term file is overloaded in Linux(and Unix), and has two meanings.When used in sentences such as"every file has an inode", the termrefers to any of the file typeslisted in the table above. Whenused in sentences such as "Thehead command only works on files,not directories", the term file isreferring only to the specific typeof file that holds data. Usually,the meaning is clear from context.When a distinction has to be made,the term regular file is used, asin "The ls -l command identifiesregular files with a hyphen (-)".

Ownerships and Permissions As discussed in the previous Workbook,every (regular) file and directory has a groupowner, a user owner, and a collection of threesets of read, write, and execute permissions.Because this information is stored in a file'sinode, and the inode structure is the samefor all files, all seven file types use the samemechanisms for controlling who has access tothem, namely chmod, chgrp, and chown.

When listing files with ls -l, the first columndisplays the permissions (as well as file type),the third the user owner, and the fourth thegroup owner.

Timing Information Each inode stores three times relevant tothe file, conventionally called the atime,ctime, and mtime. These times record the last

File Details

8

time a file was accessed (read), changed, ormodified, respectively.

Table 1.2. File times

AbbreviationName Purpose

atime AccessTime

Updates whenever thefile's data is read

ctime ChangeTime

Updates whenever thefile's inode informationchanges

mtime ModifyTime

Updates whenever thefile's data changes

What's the difference between change andmodify? When a file's data changes, the fileis said to be modified, and the mtime isupdated. When a file's inode informationchanges, the file is said to be changed, and thefile's ctime is updated. Modifying a file (andthus changing the mtime) causes the ctime toupdate as well, while merely reading a file(and thus changing the atime) does not causethe ctime to udpate.

What about createtime?

Often, people mistake Unix's ctimefor a "creation time". Oddly,traditional Unix (and Linux) doesnot record the fixed time a file wascreated, a fact which many considera weakness in the design of the Unixfilesystem.

File length and size The inode records two measures of how largea file is: The file's length (which is the actualnumber of bytes of data), and the file's size(which is the amount of disk space the fileconsumes). Because of the low level details ofhow files are stored on a disk, the two differ.Generally, the file's size increments in chunks(usually 4 kilobytes) at a time, while thelength increases byte by byte as informationis added to the file. When listing files withthe ls -l command, the file's length (in bytes)is reported in the 5th column. When listingfiles with the ls -s command, the file's size (inkilobytes) is reported instead.

Link Count Lastly, the inode records a file's link count, orthe number of dentries (filenames) that refer

File Details

9

to the file. Usually, regular files only haveone name, and the link count as one. As wewill find out, however, this is not always thecase. When listing files with ls -l, the secondcolumn gives the link count.

Viewing inode information with the stat command

Red Hat Enterprise Linux includes the stat command for examining a file's inode information in detail. InUnix programming, a file's collection of inode information is referred to as the status of the file. The statcommand can be thought of as reporting the status of a file.

Note

The stat command is often not installed by default in Red Hat Enterprise Linux. If you find thatyour machine does not have the stat command, have your instructor install the stat RPM packagefile.

stat [OPTION] FILE...

Display file (or filesystem) status information.

Switch Effect

-c, --format=FORMAT Print only the requested information using the specified format. See the stat(1)man page for more information.

-f, --filesystem Show information about the filesystem the file belongs to, instead of the file.

-t, --terse Print output in terse (single line) form

In the following example, madonna examines the inode information on the file /usr/games/fortune.

[madonna@station madonna]$ stat /usr/games/fortune

File: `/usr/games/fortune'

Size: 17795 Blocks: 40 IO Block: 4096 Regular File

Device: 303h/771d Inode: 540564 Links: 1

Access: (0755/-rwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)

Access: 2003-07-09 02:36:41.000000000 -0400 Modify: 2002-08-22 04:14:02.000000000 -0400Change: 2002-09-11 11:38:09.000000000 -0400

The name of the file. This is information is not really stored in the inode, but in the dentry, asexplained above.Inconveniently for the terminology introduced above, the stat command labels the length of the file"Size".The number of filesystem blocks the file consumes. Apparently, the stat command is using ablocksize of 2 kilobytes. 1

The file type, in this case, a regular file.The link count, or number of filenames that link to this inode. (Don't worry if you don't understandthis yet.)The file's user owner, group owner, and permissions.The atime, mtime, and ctime for the file.

File Details

10

Viewing inode information with the ls commandWhile the stat command is convenient for listing the inode information of individual files, the ls commandoften does a better job summarizing the information for several files. We reintroduce the ls command, thistime discussing some of its many command line switches relevant for displaying inode information.

ls [OPTION...] FILE...

List the files FILE..., or if a directory, list the contents of the directory.

Switch Effect

-a, --all Include files that start with .

-d, --directory If FILE is a directory, list information about the directory itself, not thedirectory's contents.

-F, --classify Decorate filenames with one of *, /, =, @, or | to indicate file type.

-h, --human-readable Use "human readable" abbreviations when reporting file lengths.

-i, --inode List index number of each file's inode.

-l Use long listing format

-n, --numeric-uid-gid Use numeric UIDs and GIDs, rather then usernames and groupnames.

-r, --reverse Reverse sorting order.

-R, --recursive List subdirectories recursively.

--time=WORD Report (or sort by) time specified by WORD instead of mtime. WORD maybe one of "atime", "access", "ctime", or "status".

-t Sort by modification time.

In the following example, madonna takes a long listing of all of the files in the directory /usr/games.The different elements reported by ls -l are discussed in detail.

[madonna@station madonna]$ ls -l /usr/games/

total 28

drwxr-xr-x 3 root root 4096 Jan 29 09:40 chromium-rwxr-xr-x 1 root root 17795 Aug 22 2002 fortunedr-xrwxr-x 3 root games 4096 Apr 1 11:49 Maelstrom

The total number of blocks used by files in the directory. (Note that this does not includesubdirectories).The file type and permissions of the file.The file's link count, or the total number of dentries (filenames) that refer to this file. (Note that, fordirectories, this is always greater than 1!. hmmm...)The file's owner.The file's group owner.The length of the file, in bytes (Note that directories have a length as well, and the length seems toincrement in blocks. hmmm....)The file's mtime, or last time the file was modified.

Identifying Files by InodeWhile people tend to use filenames to identify files, the Linux kernel usually uses the inode directly. Withina filesystem, every inode is assigned a unique inode number. The inode number of a file can be listed withthe -i command line switch to the ls command.

File Details

11

[madonna@station madonna]$ ls -iF /usr/games/ 540838 chromium/ 540564 fortune* 312180 Maelstrom/

In this example, the directory chromium has an inode number of 540838. A file can be uniquely identifiedby knowing its filesystem and inode number.

Examples

Comparing file sizes with ls -s and ls -lThe user elvis is examining the sizes of executable files in the /bin directory. He first runs the ls -scommand.

[elvis@station elvis]$ ls -s /bintotal 4860 4 arch 0 domainname 20 login 92 sed 96 ash 56 dumpkeys 72 ls 32 setfont 488 ash.static 12 echo 72 mail 20 setserial 12 aumix-minimal 44 ed 20 mkdir 0 sh 0 awk 4 egrep 20 mknod 16 sleep...

Next to each file name, the ls command reports the size of the file in kilobytes. For example, the file ashtakes up 96 Kbytes of disk space. In the (abbreviated) output, that all of the sizes are divisible by four.Apparently, when storing a file on the disk, disk space gets allocated to files in chunks of 4 Kbytes. (Thisis referred to as the "blocksize" of the filesystem). Note that the file awk seems to be taking up no space.

Next, elvis examines the directory information using the ls -l command.

[elvis@station elvis]$ ls -l /bintotal 4860-rwxr-xr-x 1 root root 2644 Feb 24 19:11 arch-rwxr-xr-x 1 root root 92444 Feb 6 10:20 ash-rwxr-xr-x 1 root root 492968 Feb 6 10:20 ash.static-rwxr-xr-x 1 root root 10456 Jan 24 16:47 aumix-minimallrwxrwxrwx 1 root root 4 Apr 1 11:11 awk -> gawk...

This time, the lengths of the files are reported in bytes. Looking again at the file ash, the length is reportedas 92444 bytes. This is reasonable, because rounding up to the next 4 Kilobytes, we get the 96 Kbytesreported by the ls -s command. Note again the file awk. The file is not a regular file, but a Symbolic Link,which explains why it was consuming no space. Symbolic Links will be discussed in more detail shortly.

Lastly, elvis is curious about the permissions on the /bin directory. When he runs ls -l /bin, however,he get a listing of the contents of the /bin directory, no the directory itself. He solves his problem byadding the -d command line switch.

[elvis@station elvis]$ ls -ld /bindrwxr-xr-x 2 root root 4096 Jul 8 09:29 /bin

Listing files, sorted by modify timeThe user prince is exploring the system log files found in the /var/log directory. He is curious aboutrecent activity on the system, so he would like to know which files have been accessed most recently. Hefirst takes a long listing of the directory, and begins examining the reported modify times for the files.

[prince@station prince]$ ls -l /var/logtotal 16296-rw------- 1 root root 20847 Jul 12 2002 boot.log-rw------- 1 root root 45034 Jul 6 2002 boot.log.1

File Details

12

-rw------- 1 root root 29116 Jun 29 2002 boot.log.2-rw------- 1 root root 18785 Jun 22 2002 boot.log.3-rw------- 1 root root 15171 Jun 15 2002 boot.log.4drwxr-xr-x 2 servlet servlet 4096 Jan 20 2002 ccm-core-cms-rw------- 1 root root 57443 Jul 12 2002 cron-rw------- 1 root root 62023 Jul 6 2002 cron.1-rw------- 1 root root 74850 Jun 29 2002 cron.2...

With 74 files to look at, prince quickly tires of skimming for recent files. Instead, he decides to let thels command do the hard work for him, specifying that the output should be sorted by mtime with the -t command line switch.

[prince@station prince]$ ls -lt /var/log total 16296-rw------- 1 root root 57443 Jul 12 2002 cron-rw------- 1 root root 2536558 Jul 12 2002 maillog-rw------- 1 root root 956853 Jul 12 2002 messages-rw-rw-r-- 1 root utmp 622464 Jul 12 2002 wtmp-rw-r--r-- 1 root root 22000 Jul 12 2002 rpmpkgs-rw-r--r-- 1 root root 38037 Jul 12 2002 xorg.0.log....

Now elvis easily reads the cron, maillog, and messages files as the most recently modified files.Curious which log files are not being used, prince repeats the command, adding the -r command lineswitch.

[prince@station prince]$ ls -ltr /var/logtotal 16296-rw-r--r-- 1 root root 32589 Oct 23 2001 xorg.1.log.olddrwxr-xr-x 2 servlet servlet 4096 Jan 20 2002 ccm-core-cmsdrwxr-xr-x 2 root root 4096 Feb 3 2002 vbox-rwx------ 1 postgres postgres 0 Apr 1 2002 pgsqldrwx------ 2 root root 4096 Apr 5 2002 samba-rw-r--r-- 1 root root 42053 May 7 2002 xorg.1.log-rw-r--r-- 1 root root 1371 May 9 2002 xorg.setup.log-rw------- 1 root root 0 Jun 9 2002 vsftpd.log.4...

Decorating listings with ls -FThe user blondie is exploring the /etc/X11 directory.

[blondie@station blondie]$ ls /etc/X11/applnk prefdm sysconfig xorg.conf.backup xkbdesktop-menus proxymngr twm xorg.conf.wbx Xmodmapfs rstart X xorg.conf.works Xresourcesgdm serverconfig xdm XftConfig.README-OBSOLETE xserverlbxproxy starthere xorg.conf xinit xsm

Because she does not have a color terminal, she is having a hard time distinguishing what is a regular fileand what is a directory. She adds the -F command line switch to decorate the output.

[blondie@station blondie]$ ls -F /etc/X11/applnk/ rstart/ xorg.conf Xmodmapdesktop-menus/ serverconfig/ xorg.conf.backup Xresourcesfs/ starthere/ xorg.conf.wbx xserver/gdm/ sysconfig/ xorg.conf.works xsm/lbxproxy/ twm/ XftConfig.README-OBSOLETEprefdm* X@ xinit/proxymngr/ xdm/ xkb@

Now, the various files are decorated by type. Directories end in a /, symbolic links with a @, and regularfiles with executable permissions (implying that they are commands to be run) end in a *.

File Details

13

Online Exercises

Viewing file metadata

Lab Exercise

Objective: List files by modify time

Estimated Time: 5 mins.

Specification

1. Create a file in your home directory called etc.bytime. The file should contain a long listing ofthe /etc directory, sorted by modify time. The most recently modified file should be on the first lineof the file.

2. Create a file in your home directory called etc.bytime.reversed. The file should contain a longlisting of the /etc directory, reverse sorted by modify time. The most recently modified file shouldbe on the last line of the file.

3. Create a file called etc.inum, which contains the inode number of the /etc directory as its onlytoken. (Note that this is asking for the inode of the directory itself).

Deliverables

1.1. A file called etc.bytime, which contains a long listing of all files in the /etc directory,

sorted by modify time, with the most recently modified file first.

2. A file called etc.bytime.reversed, which contains a long listing of all files in the /etcdirectory, sorted by modify time, with the most recently modified file last.

3. A file called etc.inum, which contains the inode number of the file /etc directory as itsonly token.

Suggestions

The first few lines of the file etc.bytime should look like the following, although the details (such asmodify time) may differ.

total 2716-rw-r--r-- 1 root root 258 May 21 09:27 mtab-rw-r--r-- 1 root root 699 May 21 09:13 printcap-rw-r----- 1 root smmsp 12288 May 21 09:10 aliases.db-rw-rw-r-- 2 root root 107 May 21 09:10 resolv.conf-rw-r--r-- 1 root root 28 May 21 09:10 yp.conf-rw------- 1 root root 60 May 21 09:10 ioctl.save-rw-r----- 1 root root 1 May 21 09:10 lvmtabdrwxr-xr-x 2 root root 4096 May 21 09:10 lvmtab.d-rw-r--r-- 1 root root 46 May 21 08:55 adjtime

Questions1. Which of the following is not a data structure associated with a file?

File Details

14

a. dentry

b. superblock

c. inode

d. data (blocks)

e. All of the above are data structures associated with files.

2. Which of the following file types does not use a data structure called an inode?

a. regular file

b. directory

c. symbolic link

d. character device node

e. All of the above file types use the inode data structure.

3. Which of the following information is not stored in a file's inode?

a. The file's modify time

b. The file's permissions

c. The file's user owner

d. The file's name

e. All of the above information is stored in the inode.

Use the output from the following commands to answer the next 2 questions.

[student@station student]$ stat /bin File: "/bin" Size: 2048 Blocks: 4 IO Block: 4096 DirectoryDevice: 309h/777d Inode: 44177 Links: 2 Access: (0755/drwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)Access: Wed Mar 19 09:38:51 2003Modify: Wed Jan 22 16:36:06 2003Change: Wed Jan 22 16:36:06 2003[student@station student]$ ls -l /usr/bin/tree-rwxr-xr-x 1 root root 18546 Jun 23 2002 /usr/bin/tree

4. How many blocks are in use by the directory /bin as shown above?

a. 2

b. 4

c. 2048

d. 4096

5. What are the permissions for the file /usr/bin/tree as shown above?

a. 640

File Details

15

b. 644

c. 755

d. 775

6. Which command(s) will show the size and permissions of the file /etc/passwd?

a. stat /etc/passwd

b. df -h

c. cat /etc/passwd

d. ls -l /etc/passwd

7. Which command syntax will show the owner and group of the directory /etc?

a. ls /etc

b. ls -l /etc

c. ls -d /etc

d. ls -ld /etc

8. Which time for a file is shown by the ls -l command?

a. The file's modify time

b. The file's change time

c. The file's access time

d. The file's creation time

e. None of the above.

9. Which time is updated when a file is read?

a. The file's modify time

b. The file's change time

c. The file's access time

d. A and C

e. All of the above.

10. Which time is updated when data is appended to a file?

a. The file's modify time

b. The file's change time

c. The file's access time

d. A and B

File Details

16

e. All of the above.

17

Chapter 2. Hard and Soft LinksKey Concepts

• The ln command creates two distinct types of links.

• Hard links assign multiple dentries (filenames) to a single inode.

• Soft links are distinct inodes that reference other filenames.

Discussion

The Case for Hard Links

Occasionally, Linux users want the same file to exist two separate places, or have two different names.One approach is by creating a hard link.

Suppose elvis and blondie are collaborating on a duet. They would both like to be able to work on thelyrics as time allows, and be able to benefit from one another's work. Rather than copying an updated fileto one another every time they make a change, and keeping their individual copies synchronized, theydecide to create a hard link.

Blondie has set up a collaboration directory called ~/music, which is group owned and writable bymembers of the group music. She has elvis do the same. She then creates the file ~/music/duet.txt,chgrps it to the group music, and uses the ln command to link the file into elvis's directory.

[blondie@station blondie]$ ls -ld music/drwxrwxr-x 2 blondie music 4096 Jul 13 05:45 music/[blondie@station blondie]$ echo "Knock knock" > music/duet.txt[blondie@station blondie]$ chgrp music music/duet.txt[blondie@station blondie]$ ln music/duet.txt /home/elvis/music/duet.txt

Because the file was linked, and not copied, it is the same file under two names. When elvis edits /home/elvis/music/duet.txt, he is editing /home/blondie/music/duet.txt as well.

[elvis@station elvis]$ echo "who's there?" >> music/duet.txt

[blondie@station blondie]$ cat music/duet.txtKnock knockwho's there?

Hard Link Details

How are hard links implemented? When created, the file /home/blondie/music/duet.txtconsists of a dentry, an inode, and data, as illustrated below.

Hard and Soft Links

18

Figure 2.1. Regular File

After using the lncommand to create the link, the file is still a single inode, but there are now two dentriesreferring to the file.

Figure 2.2. Hard Link

When blondie runs the ls -l command, look closely at the second column, which reports the link countfor the file.

[blondie@station blondie]$ ls -l music/duet.txt-rw-rw-r-- 2 blondie music 25 Jul 13 06:08 music/duet.txt

Until now, we have not been paying much attention to the link count, and it has almost always been 1 forregular files (implying one dentry referencing one inode). Now, however, two dentries are referencing theinode, and the file has a link count of 2. If blondie changes permissions on the file /home/blondie/music/duet.txt, what happens to the file /home/elvis/music/duet.txt?

[blondie@station blondie]$ chmod o-r music/duet.txt

[elvis@station elvis]$ ls -l music/duet.txt-rw-rw---- 2 blondie music 25 Jul 13 06:08 music/duet.txt

Because both halves of the link share the same inode, elvis sees the changed permissions as well.

What happens if blondie removes /home/blondie/music/duet.txt? The inode /home/elvis/music/duet.txt still exists, but with one less dentry referencing it.

Hard and Soft Links

19

[blondie@station blondie]$ rm music/duet.txt

Figure 2.3. Hard link after half is removed

What would you expect the link count of the file /home/elvis/music/duet.txt to be now?

[elvis@station elvis]$ ls -l music/duet.txt-rw-rw---- 1 blondie music 25 Jul 13 06:08 music/duet.txt

A little awkwardly, elvis is left with a file owned by blondie, but it is still a valid file, nonetheless. At alow level, the rm command is not said to delete a file, but "unlink" it. A file (meaning the file's inode anddata) are automatically deleted from the system when its link count goes to 0 (implying that there are nolonger any dentries (filenames) referencing the file).

The Case for Soft LinksThe other approach to assigning a single file two names is called a soft link. While superficially similar,soft links are implemented very differently from hard links.

The user madonna is obsessively organized, and has collected her regular errands into seven todo lists,one for every day of the week.

[madonna@station madonna]$ ls todo/friday.todo saturday.todo thursday.todo wednesday.todomonday.todo sunday.todo tuesday.todo

She consults her todo lists multiple times a day, but finds that she has trouble remembering what day ofweek it is. She would rather have a single file called today.todo, which she updates each morning. Shedecides to use a soft link instead. Because today is Tuesday, she uses the same ln command that is usedfor creating hard links, but adds the -s command line switch to specify a soft link.

[madonna@station todo]$ lsfriday.todo saturday.todo thursday.todo wednesday.todomonday.todo sunday.todo tuesday.todo[madonna@station todo]$ ln -s tuesday.todo today.todo[madonna@station todo]$ ls -ltotal 32-rw-rw-r-- 1 madonna madonna 138 Jul 14 09:54 friday.todo-rw-rw-r-- 1 madonna madonna 29 Jul 14 09:54 monday.todo-rw-rw-r-- 1 madonna madonna 579 Jul 14 09:54 saturday.todo-rw-rw-r-- 1 madonna madonna 252 Jul 14 09:54 sunday.todo-rw-rw-r-- 1 madonna madonna 519 Jul 14 09:54 thursday.todo

Hard and Soft Links

20

lrwxrwxrwx 1 madonna madonna 12 Jul 14 09:55 today.todo -> tuesday.todo-rw-rw-r-- 1 madonna madonna 37 Jul 14 09:54 tuesday.todo-rw-rw-r-- 1 madonna madonna 6587 Jul 14 09:55 wednesday.todo

Examine closely file type (the first character of each line in the ls -l command) of the newly created filetoday.todo. It is not a regular file ("-"), or a directory ("d"), but a "l", indicating a symbolic link. Asymbolic link, also referred to as a "soft" link, is a file which references another file by filename. Softlinks are similar to aliases found in other operating systems. Helpfully, the ls -l command also displayswhat file the soft link refers to, where today.todo -> tuesday.todo implies the soft link titledtoday.todo references the file tuesday.todo.

Now, whenever madonna references the file today.todo, she is really examining the filetuesday.todo.

[madonna@station todo]$ cat today.todofeed cattake out trashwater plants[madonna@station todo]$ cat tuesday.todofeed cattake out trashwater plants

Soft Link DetailsHow are soft links implemented? When created, the file tuesday.txt, like most files, consists of adentry, an inode, and data, as illustrated below. When the soft link today.txt was created, the soft link(unlike a hard link) really is a new file, with a newly created inode. The link is not a regular file, however,but a symbolic link. Symbolic links, rather than storing actual data, store the name of another file. Whenthe Linux kernel is asked to refer to the symbolic link, the kernel automatically resolves the link by lookingup the new filename. The user (or really, the process on behalf of the user) that referred to the symboliclink doesn't know the difference.

Figure 2.4. Soft Links

Creating Links with the ln CommandAs illustrated above, both hard links and soft links are created with the ln command.

Hard and Soft Links

21

ln [OPTION...] TARGET [LINK]

Create the link LINK referencing the file TARGET.

ln [OPTION...] TARGET... [DIRECTORY]

Create link(s) to the file(s) TARGET in the directory DIRECTORY.

Switch Effect

-f, --force clobber existing destination files

-s, --symbolic make symbolic (soft) link instead of hard link

The ln command behaves very similarly to the cp command: if the last argument is a directory, thecommand creates links in the specified directory which refer to (and are identically named) to the precedingarguments. Unlike the cp command, if only one argument is given, the ln command will effectively assumea last argument of ".". When specifying links, the ln command expects the name of the original file(s)first, and the name of the link last. Reversing the order doesn't produce the desired results. Again, whenin doubt, recall the behavior of the cp command.

In the following short example, madonna creates the file orig.txt. She then tries to make a hard linkto it called newlnk.txt, but gets the order of the arguments wrong. Realizing her mistake, she thencorrects the problem.

[madonna@station madonna]$ date > orig.txt[madonna@station madonna]$ ln newlnk.txt orig.txtln: accessing `newlnk.txt': No such file or directory[madonna@station madonna]$ ln orig.txt newlnk.txt

Issues with Soft Links

Dangling Links

Soft links are susceptible to a couple of problems that hard links are not. The first is called dangling links.What happens if madonna renames or removes the file tuesday.todo?

[madonna@station todo]$ mv tuesday.todo tuesday.hide[madonna@station todo]$ ls -ltotal 32-rw-rw-r-- 1 madonna madonna 138 May 14 09:54 friday.todo-rw-rw-r-- 1 madonna madonna 29 May 14 09:54 monday.todo-rw-rw-r-- 1 madonna madonna 579 May 14 09:54 saturday.todo-rw-rw-r-- 1 madonna madonna 252 May 14 09:54 sunday.todo-rw-rw-r-- 1 madonna madonna 519 May 14 09:54 thursday.todolrwxrwxrwx 1 madonna madonna 12 May 14 09:55 today.todo -> tuesday.todo-rw-rw-r-- 1 madonna madonna 37 May 14 10:22 tuesday.hide-rw-rw-r-- 1 madonna madonna 6587 May 14 09:55 wednesday.todo[madonna@station todo]$ cat today.todocat: today.todo: No such file or directory

The symbolic link today.todo still references the file tuesday.todo, but the file tuesday.todono longer exists! When madonna tries to read the contents of today.todo, she is met with an error.

Hard and Soft Links

22

Figure 2.5. Dangling Links

Recursive links

The second problem that symbolic links are susceptible to is recursive links. In day to day use, recursivelinks are not nearly as common as dangling links, and someone almost has to be looking for them to createthem. What if madonna created two symbolic links, link_a, which referenced a file named link_b,and link_b, which references the file link_a, as illustrated below?

[madonna@station madonna]$ ln -s link_a link_b[madonna@station madonna]$ ln -s link_b link_a[madonna@station madonna]$ ls -ltotal 0lrwxrwxrwx 1 madonna madonna 6 Jul 14 10:41 link_a -> link_blrwxrwxrwx 1 madonna madonna 6 Jul 14 10:41 link_b -> link_a

Figure 2.6. Recursive Links

When madonna tries to read link_a, the kernel resolves link_a to link_b, which it then resolves back tolink_a, and so on. Fortunately, the kernel will only resolve a link so many times before it suspects that itis caught in a recursive link, and gives up.

[madonna@station madonna]$ cat link_acat: link_a: Too many levels of symbolic links

Hard and Soft Links

23

Absolute vs. Relative Soft Links

When creating soft links, users can choose between specifying the link's target using a relative or absolutereference. If the soft link, and its target, are never going to be relocated, the choice doesn't matter. Often,however, users can't anticipate how the files they create will be used in the future. Usually, relative linksare more resilient to unexpected changes.

Comparing Hard and Soft LinksWhen should a soft link be used, and when should a hard link be used? Generally, hard links are moreappropriate if both instances of the link have a reasonable use, even if the other instance didn't exist. Inthe example above, even if blondie decided not to work on the duet and removed her file, elvis couldreasonably continue to work. Soft links are generally more appropriate when one file cannot reasonablyexist without the other file. In the example above, madonna could not have tasks for "today" if she did nothave tasks for "tuesday". These are general guidelines, however, not hard and fast rules.

Sometimes, more practical restrictions make the choice between hard links and soft links easier. Thefollowing outlines some of the differences between hard and soft links. Do not be concerned if you do notunderstand the last two points, they are included for completeness.

Table 2.1. Comparing Hard and Soft Links

Hard Links Soft Links

Directories may not be hard linked. Soft links may refer to directories.

Hard links have no concept of "original" and "copy".Once a hard link has been created, all instances aretreated equally.

Soft links have a concept of "referrer" and"referred". Removing the "referred" file results in adangling referrer.

Hard links must refer to files in the same filesystem. Soft links may span filesystems (partitions).

Hard links may be shared between "chroot"eddirectories.

Soft links may not refer to files outside of a"chroot"ed directory.

Examples

Working with hard linksIn her home directory, blondie has a file called rhyme and a directory called stuff. She takes a longlisting with ls -li, where the -i command line switch causes the ls command to print the inode number ofeach file as the first column of output.

Because each inode in a filesystem has a unique inode number, the inode number can be used to identifya file. In fact, when keeping track of files internally, the kernel usually refers to a file by inode numberrather than filename.

[blondie@station blondie]$ ls -iltotal 8 246085 -rw-rw-r-- 1 blondie blondie 51 Jul 18 15:29 rhyme 542526 drwxrwxr-x 2 blondie blondie 4096 Jul 18 15:34 stuff

She creates a hard link to the rhyme file, and views the directory contents again.

[blondie@station blondie]$ ln rhyme hard_link[blondie@station blondie]$ ls -li 246085 -rw-rw-r-- 2 blondie blondie 51 Jul 18 15:29 hard_link

Hard and Soft Links

24

246085 -rw-rw-r-- 2 blondie blondie 51 Jul 18 15:29 rhyme 542526 drwxrwxr-x 2 blondie blondie 4096 Jul 18 15:34 stuff

The link count for rhyme is now 2. Additionally notice that the inode number for both rhyme andhard_link is 246085, implying that although there are two names for the file (two dentries), there isonly one inode.

If we change the permissions on rhyme, the permissions on hard_link will change as well. Why? thetwo filenames refer to the same inode. Because the inode references a file's content, they also share thesame data.

[blondie@station blondie]$ chmod 660 rhyme[blondie@station blondie]$ ls -li 246085 -rw-rw---- 2 blondie blondie 51 Jul 18 15:29 hard_link 246085 -rw-rw---- 2 blondie blondie 51 Jul 18 15:29 rhyme 542526 drwxrwxr-x 2 blondie blondie 4096 Jul 18 15:34 stuff[blondie@station blondie]$ echo "Hickory, Dickory, Dock," > rhyme[blondie@station blondie]$ echo "Three mice ran up a clock." >> hard_link[blondie@station blondie]$ cat rhymeHickory, Dickory, Dock,Three mice ran up a clock.

Moving or even removing the original file has no effect on the link file.

[blondie@station blondie]$ mv rhyme stuff[blondie@station blondie]$ ls -Rli.:total 8 246085 -rw-rw---- 2 blondie blondie 51 Jul 18 15:29 hard_link 542526 drwxrwxr-x 2 blondie blondie 4096 Jul 18 15:38 stuff ./stuff:total 4 246085 -rw-rw---- 2 blondie blondie 51 Jul 18 15:29 rhyme

Working with soft linksThe user blondie now repeats the exact same exercise, but uses a soft link instead of a hard link. She startswith an identical setup as the example above.

[blondie@station blondie]$ ls -litotal 8 246085 -rw-rw-r-- 1 blondie blondie 29 Jul 18 15:25 rhyme 542526 drwxrwxr-x 2 blondie blondie 4096 Jul 18 15:25 stuff

She now creates a soft link to the rhyme file, and views the directory contents again.

[blondie@station blondie]$ ln -s rhyme soft_link[blondie@station blondie]$ ls -litotal 8 246085 -rw-rw-r-- 1 blondie blondie 29 Jul 18 15:25 rhyme 250186 lrwxrwxrwx 1 blondie blondie 5 Jul 18 15:26 soft_link -> rhyme 542526 drwxrwxr-x 2 blondie blondie 4096 Jul 18 15:25 stuff

In contrast to the hard link above, the soft link exists as a distinct inode (with a distinct inode number),and the link counts of each of the files remains 1. This implies that there are now two dentries and twoinodes. When referenced, however, the files behave identically as in the case of hard links.

[blondie@station blondie]$ echo "Hickory, Dickory, Dock," > rhyme[blondie@station blondie]$ echo "Three mice ran up a clock." >> soft_link[blondie@station blondie]$ cat rhymeHickory, Dickory, Dock,Three mice ran up a clock.

Hard and Soft Links

25

Unlike the hardlink, the softlink cannot survive if the original file is moved or removed. Instead, blondieis left with a dangling link.

[blondie@station blondie]$ ls -liR.:total 4 250186 lrwxrwxrwx 1 blondie blondie 5 Jul 18 15:26 soft_link -> rhyme 542526 drwxrwxr-x 2 blondie blondie 4096 Jul 18 15:31 stuff ./stuff:total 4 246085 -rw-rw-r-- 1 blondie blondie 51 Jul 18 15:29 rhyme[blondie@station blondie]$ cat soft_linkcat: soft_link: No such file or directory

Working with soft links and directoriesSoft links are also useful as pointers to directories. Hard links can only be used with ordinary files.

[einstein@station einstein]$ ln -s /usr/share/doc docs[einstein@station einstein]$ ls -il 10513 lrwxrwxrwx 1 einstein einstein 14 Mar 18 20:31 docs -> /usr/share/doc 10512 -rw-rw---- 2 einstein einstein 949 Mar 18 20:10 hard_link 55326 drwxrwxr-x 2 einstein einstein 1024 Mar 18 20:28 stuff

The user einstein can now easily change to the docs directory without having to remember or type thelong absolute path.

Online Exercises

Creating and Managing Links

Lab Exercise

Objective: Create and Manage hard and soft links

Estimated Time: 10 mins.

Specification

All files should be created in your home directory.

1. Create a file called cal.orig in your home directory, which contains a text calendar of the currentmonth (as produced by the cal command).

2. Create a hard link to the file cal.orig, called cal.harda

3. Create a hard link to the file cal.orig, called cal.hardb

4. Create a soft link to the file cal.orig, called cal.softa

5. Remove the file cal.orig, so that the soft link you just created is now a dangling link.

6. Create a soft link to the /usr/share/doc directory, called docabs, using an absolute reference.

7. Create a soft link to the ../../usr/share/doc directory, called docrel, using a relativereference. (Note: depending on the location of your home directory, you may need to add or remove

Hard and Soft Links

26

some .. references from the proceeding filename. Include enough so that the the soft link is a truerelative reference to the /usr/share/doc directory.

If you have finished the exercise correctly, you should be able to reproduce output similar to the following.

[student@station student]$ ls -ltotal 12-rw-rw-r-- 2 student student 138 Jul 21 10:03 cal.harda-rw-rw-r-- 2 student student 138 Jul 21 10:03 cal.hardblrwxrwxrwx 1 student student 8 Jul 21 10:03 cal.softa -> cal.origlrwxrwxrwx 1 student student 14 Jul 21 10:03 docabs -> /usr/share/doclrwxrwxrwx 1 student student 19 Jul 21 10:03 docrel -> ../../usr/share/doc

Deliverables

1.1. A file called cal.harda.

2. A file called cal.hardb, which is a hard link to the proceeding file.

3. A file called cal.softa, which is a dangling soft link to the nonexistent file cal.orig.

4. A file called docabs, which is a soft link to the /usr/share/doc directory, using anabsolute reference.

5. A file called docrel, which is a soft link to the /usr/share/doc directory, using a relativereference.

Sharing a Hard Linked File

Lab Exercise

Objective: Share a hard linked file between two users.

Estimated Time: 15 mins.

Specification

You would like to create a hard linked file that you will share with another user.

1. As your primary user, create a subdirectory of /tmp named after your account name, such as /tmp/student, where student is replaced with your username.

2. Still as your primary user, create a file called /tmp/student/novel.txt, which contains the text"Once upon a time."

[student@station student]$ mkdir /tmp/student[student@station student]$ echo "Once Upon a Time," > /tmp/student/novel.txt[student@station student]$ ls -al /tmp/student/total 12drwxrwxr-x 2 student student 4096 Jul 21 10:13 .drwxrwxrwt 28 root root 4096 Jul 21 10:12 ..-rw-rw-r-- 1 student student 18 Jul 21 10:13 novel.txt

3. Now log in as (or su - to) your first alternate account. Create a directory in /tmp which is named afteryour alternate account, such as /tmp/student_a.

4. As your first alternate user, in your newly created directory, create a hard link to the file /tmp/student/novel.txt, called /tmp/student_a/novel.lnk. Try to edit the file, changing the

Hard and Soft Links

27

line from "Once upon a time,", to "It was a dark and stormy night.". Why did you have difficulties? Areyou able to modify the ownerships or permissions of the file novel.lnk? Why or Why not?

[student@station student]$ su - student_aPassword:[student_a@station student_a]$ mkdir /tmp/student_a[student_a@station student_a]$ ln /tmp/student/novel.txt /tmp/student_a/novel.lnk[student_a@station student_a]$ echo "It was a dark and stormy night." >> /tmp/student_a/novel.lnk-bash: /tmp/student_a/novel.lnk: Permission denied

5. As your primary user, adjust the permissions and/or ownerships on the file /tmp/student/novel.txt, so that your first alternate user is able to modify the file.

6. As your first alternate user, apply the edit mentioned above. When you are finished, the file /tmp/student_a/novel.lnk should contain only the text "It was a dark and stormy night.".

Deliverables

1.1. A file called /tmp/student/novel.txt, where student is replaced with the name of

your primary user, owned by your primary user. The file should have appropriate ownershipsand permissions so that it can be modified by your first alternate account. The file should containonly the text "It was a dark and stormy night.".

2. A file called /tmp/student_a/novel.lnk, where student_a is replaced with thename of your first alternate account. The file should be a hard link to the file /tmp/student/novel.txt.

QuestionsUse the output from the following command to help answer the next 5 questions.

[student@station student]$ ls -li /usr/bin/ 342997 lrwxrwxrwx 1 root root 5 Apr 1 11:18 ./bunzip2 -> bzip2 342998 lrwxrwxrwx 1 root root 5 Apr 1 11:18 ./bzcat -> bzip2 342999 lrwxrwxrwx 1 root root 6 Apr 1 11:18 ./bzcmp -> bzdiff 343004 lrwxrwxrwx 1 root root 6 Apr 1 11:18 ./bzless -> bzmore 343066 lrwxrwxrwx 1 root root 16 Apr 1 11:12 ./gunzip -> ../../bin/gunzip 343112 lrwxrwxrwx 1 root root 14 Apr 1 11:12 ./gzip -> ../../bin/gzip 343136 lrwxrwxrwx 1 root root 2 Apr 1 11:21 ./lz -> uz 343123 -rwxr-xr-x 3 root root 57468 Jan 24 23:42 ./rx 343123 -rwxr-xr-x 3 root root 57468 Jan 24 23:42 ./rz 343065 -rwxr-xr-x 3 root root 61372 Jan 24 23:42 ./sb 343065 -rwxr-xr-x 3 root root 61372 Jan 24 23:42 ./sx 343065 -rwxr-xr-x 3 root root 61372 Jan 24 23:42 ./sz 347486 lrwxrwxrwx 1 root root 8 Jul 21 16:43 ./uncompress -> compress 343117 -rwxr-xr-x 3 root root 3029 Jan 31 11:08 ./zegrep 343117 -rwxr-xr-x 3 root root 3029 Jan 31 11:08 ./zfgrep 343117 -rwxr-xr-x 3 root root 3029 Jan 31 11:08 ./zgrep

Note that many lines have been omitted from the previous command's output, leaving only a few interestingone.

1. Which of the following files share the same inode?

a. lz

b. uz

Hard and Soft Links

28

c. rx

d. sb

e. sx

f. sz

2. Removing which of the following files would create a dangling link?

a. bzip2

b. lz

c. uz

d. sb

e. compress

f. zgrep

3. How many files (listed or not) share inode number 343123?

a. 1

b. 2

c. 3

d. None of the above.

e. It cannot be determined from the information provided.

4. Examine the lengths of the symbolic links such as bzcat, lz, and uncompress, as reportedin the 6th column of the output above. Which of the following best explains what the length of asymbolic link represents?

a. The length represents the length of the filename that the symbolic link resolves to.

b. The length represents the number of files which share the soft link.

c. The length is the length of the file that the symbolic link resolves to.

d. The length is arbitrary, and serves no purpose.

e. None of the above.

Suppose the system administrator moved the /usr/bin directory, as shown.

[root@station root]# mv /usr/bin /usr/lib/bin

5. Which files in the new /usr/lib/bin directory would be dangling symbolic links?

a. bzcat

b. gunzip

c. gzip

Hard and Soft Links

29

d. lz

e. uncompress

f. zgrep

6. What is the correct command for creating a shortcut from your home directory that points to a /data/project directory?

a. ln /data/project /home/student/project

b. ln /home/student/project /data/project

c. ln -s /data/project /home/student/project

d. ln -s /home/student/project /data/project

7. Projects A, B, and C all use the file /data/script. All teams want to have a copy in their ownproject directory but they also want to be sure that any changes to the original file are reflected intheir copies. Using project_A as an example, which commands would accomplish this goal?

a. ln /data/script /data/project_A/script

b. cp /data/script /data/project_A/script

c. ln -s /data/script /data/project_A/script

d. ln -s /data/project_A/script /data/script

e. A and C

8. The team leader of project_D wants to use the script as a starting point, but intends to modify it ina way that the other teams will not want to use. What is the best way for to get the original script?

a. ln /data/script /data/project_D/script

b. cp /data/script /data/project_D/script

c. ln -s /data/script /data/project_D/script

d. ln -s /data/project_D/script /data/script

30

Chapter 3. Directories and DeviceNodes

Key Concepts

• The term file refers to regular files, directories, symbolic links, device nodes, and others.

• All files have common attributes: user owner, group owner, permissions, and timing information.

• File meta-information is contained in a data structure called inodes.

• File names are contained in data structures called directory entries (dentries).

• File meta-information can be examined with the ls -l and stat commands.

Discussion

Directories

Directory Structure

Earlier, we introduced two structures associated with files, namely dentries, which associate filenameswith inodes, and inodes, which associate all of a file's attributes with its content. We now see how thesestructures relate to directories.

The user prince is using a directory called report to manage files for a report he is writing. He recursivelylists the report directory, including -a (which specifies to list "all" entries, including those beginningwith a ".") and -i (which specifies to list the inode number of a file as well as filename). What results isthe following listing of the directories and files, along with their inode number.

[prince@station prince]$ ls -iaR reportreport: 592253 . 249482 .. 592255 html 592254 text report/html: 592255 . 592253 .. 592261 chap1.html 592262 chap2.html 592263 figures report/html/figures: 592263 . 592255 .. 592264 image1.png report/text: 592254 . 592253 .. 592257 chap1.txt 592258 chap2.txt

Notice the files (directories) "." and ".." are included in the output. As mentioned in a previous Workbook,the directory "." refers to a directory itself, and the directory ".." refers to the directory's parent. Everydirectory actually contains entries called . and .., though they are treated as hidden files (because they"begin with ."), and not displayed unless -a is specified.

The same directories, files, and inode numbers are reproduced below, in an easier format.

path | inode--------------------------------------------

Directories and Device Nodes

31

report/ | 592253|-- html | 592255| |-- chap1.html | 592261| |-- chap2.html | 592262| `-- figures | 592263| `-- image1.png | 592264`-- text | 592254 |-- chap1.txt | 592257 `-- chap2.txt | 592258

As seen in the following figure of the report directory, directories have the same internal structureas regular files: a dentry, an inode, and data. The data that directories store, however, are the dentriesassociated with the files the directory is said to contain. A directory is little more than a table of dentries,mapping filenames to the underlying inodes that represent files. When introduced, the name dentry wassaid to be derived from directory entry. We now see that directories are no more complicated than that:a directory is a collection of dentries.

Figure 3.1. The Internal Structure of Directories

Directory Links

Earlier, we observed that the link counts of directories, as reflected in the second column of the ls -lcommand, was always two or greater. This follows naturally from the fact that every directory is referencedat least twice, once by itself (as the directory "."), and once by its parent (with an actual directory name, suchas report). The following diagram of the dentries contained by the report directory, its subdirectoryhtml, and its subdirectory figures, helps to illustrate.

Directories and Device Nodes

32

Figure 3.2. Dentry Tables for the report, report/html, and report/html/figures directories.

When prince takes a long listing of the report directory, he sees the four files in the first table.

[prince@station prince]$ ls -ial reporttotal 16 592253 drwxrwxr-x 4 prince prince 4096 Jul 14 13:27 . 249482 drwx-----x 6 prince prince 4096 Jul 14 13:27 .. 592255 drwxrwxr-x 3 prince prince 4096 Jul 14 13:49 html 592254 drwxrwxr-x 2 prince prince 4096 Jul 14 13:49 text

Every file in the listing is a directory, and the link count (here the third column, since the inode numberhas been prepended as the first column) of each is greater than or equal to two. Can we account for each ofthe links? Let's start by listing the references to inode number 592253 (the report directory, or above,simply ".").

1. The entry ., found in the directory itself.

2. The parent directory (not pictured) contains an entry called report, which references the same inode.

3. The subdirectory html contains an entry called .., which references inode 592253 as its parentdirectory.

4. Likewise, the subdirectory text (not pictured) contains an entry called .., which references the sameinode.

Accounting for itself (which calls it "."), it's parent (which calls it "report"), and its two subdirectories(which call it ".."), we have found all four links to the inode 592253 reported by the ls -l command.

In the following listing, the report/html directory has a link count of 3. Can you find all threereferences to inode number 592255 in the figure above?

[prince@station prince]$ ls -ial report/html

Directories and Device Nodes

33

total 20 592255 drwxrwxr-x 3 prince prince 4096 Jul 14 13:49 . 592253 drwxrwxr-x 4 prince prince 4096 Jul 14 13:27 .. 592261 -rw-rw-r-- 1 prince prince 2012 Jul 14 13:28 chap1.html 592262 -rw-rw-r-- 1 prince prince 2012 Jul 14 13:28 chap2.html 592263 drwxrwxr-x 2 prince prince 4096 Jul 14 13:28 figures

"html" in the directory report, "." in the directory report/html, and ".." in the directory report/html/figures.

In summary, directories are simply collections of dentries for the files the directory is said to contain,which map filenames to inodes. Every directory contains at least two links, one from its own directoryentry ".", and one from its parent's entry with the directory's conventional name. Directories are referencedby an additional link for every subdirectory, which refer to the directory as "..".

Device Nodes

We have now discussed three types of "creatures" which can exist in the Linux filesystem, namely regularfiles, directories, and symbolic links. In this section, we shift gears, and discuss the last two types offilesystem entries (that will be covered in this course), block and character device nodes.

Block and Character Device Nodes

Device nodes exist in the filesystem, but do not contain data in the same way that regular files, or evendirectories and symbolic links, contain data. Instead, the job of a device node is to act as a conduit to aparticular device driver within the kernel. When a user writes to a device node, the device node transfersthe information to the appropriate device driver in the kernel. When a user would like to collect informationfrom a particular device, they read from that device's associated device node, just as reading from a file.

By convention, device nodes live within a dedicated directory called /dev. In the following, the user elvistakes a long listing of files in the /dev directory.

[elvis@station elvis]$ ls -l /devtotal 228crw------- 1 root root 10, 10 Jan 30 05:24 adbmousecrw-r--r-- 1 root root 10, 175 Jan 30 05:24 agpgartcrw------- 1 root root 10, 4 Jan 30 05:24 amigamouse...crw------- 1 elvis root 14, 7 Jan 30 05:24 audioctlbrw-rw---- 1 root disk 29, 0 Jan 30 05:24 aztcdcrw------- 1 elvis root 10, 128 Jan 30 05:24 beepbrw-rw---- 1 root disk 41, 0 Jan 30 05:24 bpcdcrw------- 1 root root 68, 0 Jan 30 05:24 capi20...

As there are over 7000 entries in the /dev directory, the output has been truncated to only the first severalfiles. Focusing on the first character of each line, most of the files within /dev are not regular files ordirectories, but instead either character device nodes ("c"), or block device nodes ("b"). The two typesof device nodes reflect the fact that device drivers in Linux fall into one of two major classes, characterdevices and block devices.

Block Devices Block devices are devices that read and write information achunk ("block") at a time. Block devices customarily allowrandom access, meaning that a block of data could be readfrom anywhere on the device, in any order. Examples ofblock devices include hard drives, floppy drives, and CD/ROMdrives.

Directories and Device Nodes

34

Character Devices Character devices are often devices that read and writeinformation as a stream of bytes ("characters"), and there is anatural concept of what it means to read or write the "next"character. Examples of character devices include keyboards,mice, sound cards, and printers. Some character device driverssupport memory buffers as well.

Under the Hood

The real distinction between character and block device drivers relates to how the device driverinteracts with the Linux kernel. block devices ("disks") interact with the the unified I/O cache,while character devices bypass the cache and interact with processes directly.

Terminals as Devices

In the following, elvis has logged onto a Linux machine on both the first and second virtual consoles. Inthe first workbook, we learned how to identify terminals by name, and found that the name of the firstvirtual console was tty1, and the second virtual console was tty2. Now, we can see that the "name"of a terminal is really the name of the device node which maps to that terminal. In the following listing,the device nodes /dev/tty1 through /dev/tty6 are the device nodes for the first 6 virtual consoles,respectively.

[elvis@station elvis]$ ls -l /dev/tty[1-6]crw--w---- 1 elvis tty 4, 1 May 14 16:06 /dev/tty1crw--w---- 1 elvis tty 4, 2 May 14 16:06 /dev/tty2crw------- 1 root root 4, 3 May 14 08:50 /dev/tty3crw------- 1 root root 4, 4 May 14 08:50 /dev/tty4crw------- 1 root root 4, 5 May 14 08:50 /dev/tty5crw------- 1 root root 4, 6 May 14 08:50 /dev/tty6

In the following, elvis, working from virtual console number 1, will redirect the output of the cal commandthree times; first, to a file called /tmp/cal, secondly, to the /dev/tty1 device node, and lastly, tothe /dev/tty2 device node.

[elvis@station elvis]$ cal > /tmp/cal

[elvis@station elvis]$ cal > /dev/tty1 May 2003Su Mo Tu We Th Fr Sa 1 2 3 4 5 6 7 8 9 10 11 1213 14 15 16 17 18 1920 21 22 23 24 25 2627 28 29 30 31

[elvis@station elvis]$ cal > /dev/tty2

This case should be familiar; the output was merely redirected to a newly created file called cal.From appearances, the redirection didn't happen, but it did. The output of the command was redirectedto the device node for the first virtual console, which did what it was "supposed to do", namely,display all information written to it on the screen.Where did the output of the cal command go this time? The information was redirected to the devicenode for the second virtual console, which did what it was "supposed to do", namely displayed iton the second virtual console.

The second redirection merits a little more attention. When elvis redirected the output to the device nodecontrolling his current virtual console, /dev/tty1, the effect was as if he had performed no redirectionat all. Why? When elvis runs interactive commands without redirection, they write to the controllingterminal's device node by default. Redirecting the command's output to /dev/tty1 is akin to saying "butinstead of writing your output to my terminal, write your output to my terminal."

Directories and Device Nodes

35

Upon switching to the second virtual console, using the CTRL+ALT+F2 key sequence, elvis finds thefollowing characters on the screen.

Red Hat Enterprise Linux Server release 5 (Tikanga)Kernel 2.6.18-8.el5 on an i686

station login: elvisPassword:Last login: Mon May 14 16:55:22 on tty1You have new mail.

[elvis@station elvis]$ May 2003Su Mo Tu We Th Fr Sa 1 2 3 4 5 6 7 8 9 10 11 1213 14 15 16 17 18 1920 21 22 23 24 25 26

27 28 29 30 31

This is where elvis's cursor was sitting after logging in on the second virtual console, and switchingto virtual console number 1.Here is where the output of the cal command was written to the terminal. Note the lack of a linefeedseparating the output. This is not a natural, well formatted occurrence, but something odd that elvisasked the device driver to do.Lastly, the output of the cal command tailed off, but notice that the bash shell did not offer a freshprompt. In fact, the bash shell didn't even realize that the characters were written to the terminal. It'sstill waiting for elvis to enter a command.

Device Permissions, Security, and the Console User

Continuing the train of thought from above, the user elvis (who has logged onto the first two virtualconsoles, tty1 and tty2) next tries to redirect the output of the cal command to virtual console number3, but runs into problems.

[elvis@station elvis]$ cal > /dev/tty3-bash: /dev/tty3: Permission denied

Why was elvis not able to perform the same trick on the third virtual console? because elvis has not loggedin on the third virtual console, and therefore does not own the device. Examine again the long listing ofthe ls -l virtual console device nodes:

[elvis@station elvis]$ ls -l /dev/tty[1-6]crw--w---- 1 elvis tty 4, 1 May 16 13:38 /dev/tty1crw--w---- 1 elvis tty 4, 2 May 16 13:38 /dev/tty2crw------- 1 root root 4, 3 May 16 10:02 /dev/tty3crw------- 1 root root 4, 4 May 16 10:02 /dev/tty4crw------- 1 root root 4, 5 May 16 10:02 /dev/tty5crw------- 1 root root 4, 5 May 16 10:02 /dev/tty6

Because device nodes are considered files, they also have user owners, group owners, and a collection ofpermissions. When reading or writing from device nodes, permissions apply just as if reading or writingto a regular file. This allows a system administrator (or the software on the system) to control who hasaccess to particular devices using a familiar technique; namely, by managing file owners and permissions.

What happens when prince logs in on the third virtual console?

[elvis@station elvis]$ ls -l /dev/tty[1-6]crw--w---- 1 elvis tty 4, 1 May 16 13:38 /dev/tty1crw--w---- 1 elvis tty 4, 2 May 16 13:38 /dev/tty2crw--w---- 1 prince tty 4, 3 May 16 13:46 /dev/tty3crw------- 1 root root 4, 4 May 16 10:02 /dev/tty4

Directories and Device Nodes

36

crw------- 1 root root 4, 5 May 16 10:02 /dev/tty5crw------- 1 root root 4, 6 May 16 10:02 /dev/tty6

When a user logs in, they take ownership of the device node that controls their terminal. Processes thatthey run are then able to read input or write output to the terminal. In general, the permissions on devicenodes do not allow standard users to access devices directly. Two categories of exceptions occur.

Terminals Because users need to be able to communicate with the system, they(or, more exactly, the processes that they run) must be able to readfrom and write to the terminal they are using. Usually, part of theprocess of logging into a system involves transferring ownership ofthe terminal's device node to the user.

"Console Users" In Red Hat Enterprise Linux, users can be granted specialpermissions not because of who they are, but because of where theylogged in from. Namely, when a user logs into a virtual console, orthe graphical X server, they are considered a "console user". Consoleusers are granted access to hardware devices associated with theconsole, such as the floppy drive and sound card. When the consoleuser logs out of the system, ownerships for these devices are restoredto system defaults. None of this happens if a user logs in over thenetwork, for example. (If the user is not sitting at the machine, wouldit be reasonable for them to use the floppy drive?)

In summary, Linux (and Unix) uses device nodes to allow users access to devices on the system. Themanaging of devices on a Linux system can be a large and complicated topic. As an introduction, we haveexamined enough about how terminal device nodes are used to introduce the concept, and identify a coupleof advantages to the device node approach.

• When writing programs, programmers do not need to deal with device details. They can treat all inputand output as if it they were simply reading or writing to a file.

• Access to devices can be controlled through the same techniques of file ownerships and permissionsthat are used for regular files.

Examples

Interpreting Directory Links CountsThe user elvis takes a long listing of the /var/spool directory. He is interested in interpreting thesubdirectory's link counts, as listed in the second column.

[elvis@station elvis]$ ls -l /var/spooltotal 64drwxr-xr-x 2 root root 4096 Jan 24 16:26 anacrondrwx------ 3 daemon daemon 4096 Jun 18 02:00 atdrwxrwx--- 2 smmsp smmsp 4096 Jul 21 10:42 clientmqueuedrwx------ 2 root root 4096 Jun 18 16:12 crondrwx------ 3 lp sys 8192 Jul 18 17:38 cupsdrwxr-xr-x 23 root root 4096 Jan 24 18:52 lpddrwxrwxr-x 2 root mail 4096 Jul 21 10:11 maildrwx------ 2 root mail 8192 Jul 21 10:43 mqueuedrwxr-xr-x 17 root root 4096 Feb 24 19:41 postfixdrwxr-xr-x 2 rpm rpm 4096 Apr 11 06:18 repackagedrwxrwxrwt 2 root root 4096 Apr 5 23:46 sambadrwxr-xr-x 2 root root 8192 Jul 16 17:53 up2datedrwxrwxrwt 2 root root 4096 Feb 3 19:13 vbox

Directories and Device Nodes

37

Noticing that it has a link count of 17, elvis concludes that the postfix directory contains 15subdirectories. (1 link (shown above) for the postfix entry, 1 link for the entry . found within postfix(not shown), and 15 for the entries .. within each of 15 subdirectories.)

Examining a long listing of the /var/spool/postfix directory, he concludes that he was right (thereare 15 subdirectories).

[elvis@station elvis]$ ls -l /var/spool/postfix/total 60drwx------ 2 postfix root 4096 Feb 24 19:41 activedrwx------ 2 postfix root 4096 Feb 24 19:41 bouncedrwx------ 2 postfix root 4096 Feb 24 19:41 corruptdrwx------ 2 postfix root 4096 Feb 24 19:41 deferdrwx------ 2 postfix root 4096 Feb 24 19:41 deferreddrwxr-xr-x 2 root root 4096 Apr 1 12:22 etcdrwx------ 2 postfix root 4096 Feb 24 19:41 flushdrwx------ 2 postfix root 4096 Feb 24 19:41 incomingdrwxr-xr-x 2 root root 4096 Apr 11 05:54 libdrwx-wx--- 2 postfix postdrop 4096 Feb 24 19:41 maildropdrwxr-xr-x 2 root root 4096 Feb 24 19:41 piddrwx------ 2 postfix root 4096 Feb 24 19:41 privatedrwx--x--- 2 postfix postdrop 4096 Feb 24 19:41 publicdrwx------ 2 postfix root 4096 Feb 24 19:41 saveddrwxr-xr-x 3 root root 4096 Feb 24 19:41 usr

QuestionsUse the output from the following two commands to answer the next 3 questions.

[student@station student]$ tree /etc/sysconfig/networking/ /etc/sysconfig/networking/|-- devices| `-- ifcfg-eth0|-- ifcfg-lo`-- profiles |-- default | |-- hosts | |-- ifcfg-eth0 | |-- network | `-- resolv.conf `-- netup |-- hosts |-- ifcfg-eth0 |-- network `-- resolv.conf 4 directories, 10 files[student@station student]$ ls -iaR /etc/sysconfig/networking//etc/sysconfig/networking/: 49180 . 244801 .. 65497 devices 49019 ifcfg-lo 65498 profiles /etc/sysconfig/networking/devices: 65497 . 49180 .. 73383 ifcfg-eth0 /etc/sysconfig/networking/profiles: 65498 . 49180 .. 65499 default 558071 netup /etc/sysconfig/networking/profiles/default: 65499 . 73386 hosts 73384 network 65498 .. 73383 ifcfg-eth0 73385 resolv.conf /etc/sysconfig/networking/profiles/netup: 558071 . 558076 hosts 558072 network 65498 .. 558077 ifcfg-eth0 558075 resolv.conf

Directories and Device Nodes

38

1. What would you expect to be the link count of inode number 65498?

a. 2

b. 3

c. 4

d. 5

e. None of the above

2. What would you expect to be the link count of inode number 49180?

a. 2

b. 3

c. 4

d. 5

e. None of the above

3. What would you expect to be the link count of inode number 65499?

a. 2

b. 3

c. 4

d. 5

e. None of the above

Use the output from the following command to answer the next 4 questions.

[elvis@station 030_section_questions]$ ls -l /dev/tty[1-6] /dev/fd0 /dev/audiocrw--w---- 1 elvis tty 4, 1 Jul 22 15:30 /dev/tty1crw--w---- 1 prince tty 4, 2 Jul 22 15:30 /dev/tty2crw--w---- 1 elvis tty 4, 3 Jul 22 15:30 /dev/tty3crw--w---- 1 blondie tty 4, 4 Jul 22 15:30 /dev/tty4crw------- 1 root root 4, 5 Jul 22 09:29 /dev/tty5crw------- 1 root root 4, 6 Jul 22 09:29 /dev/tty6brw-rw---- 1 prince floppy 2, 0 Jan 30 05:24 /dev/fd0crw------- 1 prince root 14, 4 Jan 30 05:24 /dev/audio

4. The user elvis is logged in to which virtual console(s)?

a. Virtual Console Number 1

b. Virtual Console Number 2

c. Virtual Console Number 3

d. Virtual Console Number 4

e. Virtual Console Number 5

f. Virtual Console Number 6

Directories and Device Nodes

39

5. Which of the following are block device nodes?

a. /dev/tty1

b. /dev/tty2

c. /dev/tty3

d. /dev/tty6

e. /dev/fd0

f. /dev/audio

6. Which user is currently considered the "Console User"?

a. elvis

b. prince

c. blondie

d. None of the above

e. It cannot be determined from the information provided.

7. The user elvis has also logged on using the X graphical environment. He tries to play an audio CDusing the gnome-cd player. Which of the following best explains why this will not work?

a. Only users who have logged on using a virtual console may access the audio devices.

b. The user elvis is not considered the "Console User", and does not have write permissions tothe device nodes that connect to the audio device drivers.

c. The user elvis is not a member of the group "audio", and does not have write permissions tothe device nodes that connect to the audio device drivers.

d. None of the above

40

Chapter 4. Disks, Filesystems, andMounting

Key Concepts

• Linux allows low level access to disk drives through device nodes in the /dev directory.

• Usually, disks are formatted with a filesystem, and mounted to a directory instead.

• Filesystems are created with some variant of the mkfs command.

• The default filesystem of Red Hat Enterprise Linux is the ext3 filesystem.

• The mount command is used to map the root directory of a disk's (or a disk partition's) filesystem to analready existing directory. That directory is then referred to as a mount point.

• The umount command is used to unmount a filesystem from a mount point.

• The df command is used to report filesystem usage, and tables currently mounted devices.

Discussion

Disk DevicesLinux (and Unix) allows users direct, low level access to disk drives through device nodes in the /devdirectory. The following table lists the filenames of common disk device nodes, and the disks with whichthey are associated.

Table 4.1. Linux Disk Device Nodes

Device Node Disk

/dev/fd0 Floppy Disk

/dev/hda IDE Primary Master

/dev/hdb IDE Primary Slave

/dev/hdc IDE Secondary Master

/dev/hdd IDE Secondary Slave

/dev/sd[a-z] SCSI Disks

/dev/cdrom Symbolic Link to CD/ROM

Although device nodes exist for disk drives, usually standard users do not have permissions to access themdirectly. In the following, elvis has performed a long listing of the various device nodes tabled above.

[elvis@station elvis]$ ls -l /dev/fd0 /dev/hd[abcd] /dev/sda /dev/cdromlrwxrwxrwx 1 root root 8 Oct 1 2002 /dev/cdrom -> /dev/hdcbrw-rw---- 1 elvis floppy 2, 0 Jan 30 05:24 /dev/fd0brw-rw---- 1 root disk 3, 0 Jan 30 05:24 /dev/hdabrw-rw---- 1 root disk 3, 64 Jan 30 05:24 /dev/hdbbrw------- 1 elvis disk 22, 0 Jan 30 05:24 /dev/hdcbrw-rw---- 1 root disk 22, 64 Jan 30 05:24 /dev/hdd

Disks, Filesystems, and Mounting

41

brw-rw---- 1 root disk 8, 0 Jan 30 05:24 /dev/sda

By default, elvis does not have permissions to access the machine's fixed drives. Because he is (apparently)logged on at the console, he is considered the "console user", and has gained permissions to access thefloppy and CD/ROM drives. We will take advantage of this fact during this lesson.

Interestingly, the file /dev/cdrom is not a device node, but a symbolic link, which resolves to the blockdevice node /dev/hdc. Most modern CD/ROM drives physically attach to the machine using either anIDE or SCSI interface, and so appear to the kernel as just another SCSI or IDE drive. Some applications,however, such as the gnome-cd audio CD player, want to access "the CD/ROM". For them, /dev/cdromprovides access to the CD/ROM, no matter how its attached to the system.

More often than not, hard disks are further divided into partitions. Partitions are regions of the hard diskthat can each be used as if it were a separate disk. Just as there are device nodes for every disk, there arealso device nodes for every disk partition. The name of a partition's device node is simply the partitionnumber appended to the name of the disk's device node. For example, the device node for the third partitionof the primary slave IDE drive is called /dev/hdb3.

The following diagram illustrates a hard disk that has been divided into four partitions, and the devicenodes which address each of the partitions.

Figure 4.1. Hard Disk Partitions

Low Level Access to DrivesIn the following, elvis is exploring low level access to his floppy drive. He starts by ensuring he hasread and write permissions to the floppy's device node, catting the /etc/resolv.conf file, and thenredirecting the output of the command to the floppy drive (/dev/fd0).

[elvis@station elvis]$ ls -l /dev/fd0brw-rw---- 1 elvis floppy 2, 0 Jan 30 05:24 /dev/fd0[elvis@station elvis]$ cat /etc/resolv.confsearch example.comnameserver 192.168.0.254[elvis@station elvis]$ cat /etc/resolv.conf > /dev/fd0-bash: /dev/fd0: No such device or address

At first perplexed by the error message, elvis realizes that there is no floppy disk in his floppy disk drive.He places an old, unused floppy (that he doesn't care about the contents of) into the drive.

[elvis@station elvis]$ cat /etc/resolv.conf > /dev/fd0-bash: /dev/fd0: Read-only file system

Perplexed again, elvis removes the floppy from the drive, examines it, slides the write protection tab onthe floppy disk to the writable position, and reinserts the floppy.

[elvis@station elvis]$ cat /etc/resolv.conf > /dev/fd0

Finally, the floppy drive's light comes on, and elvis hears the disk spin as information is written to thefloppy. Curious to see what he has written to the floppy disk, elvis next tries to read the floppy with theless pager.

Disks, Filesystems, and Mounting

42

[elvis@station elvis]$ less /dev/fd0/dev/fd0 is not a regular file (use -f to see it)

The less pager seems to be telling elvis, "Sane people don't read from device nodes directly, but if youreally want to do it, I'll let you." Because elvis really wants to do it, he adds the -f command line switch.

[elvis@station elvis]$ less -f /dev/fd0

On the first page of the pager, elvis recognizes the first few characters as the contents of /etc/resolv.conf. After that, however, the pager shows unintelligible data, with occasional human readabletext interspersed.

search example.comnameserver 192.168.0.254B^RA^^@^@V^^|F^@^@^@AdDBP^A^^|^C^F^B|^@DVP|Q^^^R ABAoot failed^@^@^@LDLINUX SYS...

The user elvis continues to page through the "file" using the less pager, seeing apparently random snippetsof text and binary characters. When he feels like he's gotten the point, he quits the pager.

What is the point? By accessing disk drives through their device nodes, users may see (and write) thecontents of the drive byte for byte. To the user, the drive looks like a (very big) file. When elvis cats thecontents of a file to the drive's device node, the information is transferred, byte for byte, to the drive.

On the first few bytes of the floppy, elvis sees a copy of the contents of the /etc/resolv.conf file. Onthe floppy, what is the filename associated with the information? Trick question. It doesn't have one. Whois the user owner? What are the permissions? There are none. It's just data. When elvis goes back to readthe contents of the drive, he sees the data he wrote there, namely the contents of the /etc/resolv.conffile. After that, he sees whatever happened to be on the floppy before he started.

FilesystemsThe previous section demonstrated how to access drives at a low level. Obviously, people do not like tostore their information on drives as one stream of data. They like to store their information in files, and givethe files filenames. They like to organize their files into directories, and say who can access the directoryand who cannot. All of this structuring of information is the responsibility of what is called a filesystem.

A filesystem provides order to disk drives by organizing the drive into fixed sized chunks called blocks.The filesystem then organizes these blocks, effectively saying "this block is going to contain only inodes","this block is going to contain only dentries", "these 3 block over here, and that one over there, are goingto contain the contents of the file /etc/services", or "this first block is going to store informationwhich keeps track of what all the other blocks are being used for". Filesystems provide all of this structurethat is usually taken for granted.

Before a disk can be used to store files in a conventional sense, it must be initialized with this type oflow level structure. In Linux, this is usually referred to as "creating a filesystem". In other operatingsystems, it is usually referred to as "formatting the disk". Linux supports a large number of different typesof filesystems (the fs(5) man page lists just a few). While Linux's native filesystem is the ext2 (or in RedHat Enterprise Linux, the ext3) filesystem, it also supports the native filesystems of many other operatingsystems, such as the DOS FAT filesystem, or the OS/2 High Performance File System.

In Linux, filesystems are created with some variant of the mkfs command. Because these commandsare usually used only by the administrative user, they do not live in the standard /bin or /usr/bin directories, and therefore cannot be invoked as simple commands. Instead, they live in the /sbindirectory, which is reserved for administrative commands. In the following, elvis lists all commands thatbegin mkfs in the /sbin directory.

Disks, Filesystems, and Mounting

43

[elvis@station elvis]$ ls /sbin/mkfs*/sbin/mkfs /sbin/mkfs.ext2 /sbin/mkfs.msdos/sbin/mkfs.cramfs /sbin/mkfs.ext3 /sbin/mkfs.vfat

Apparently, there is one copy of the mkfs command for each type of filesystem that can be constructed,including the ext2 and msdos filesystems. The user elvis next formats the same floppy he used above withthe ext2 filesystem. Because the mkfs.ext2 command did not live in one of the "standard" directories, elvisneeds to refer to the command using an absolute reference, /sbin/mkfs.ext2.

[elvis@station elvis]$ /sbin/mkfs.ext2 /dev/fd0mke2fs 1.39 (29-May-2006)Filesystem label=OS type: LinuxBlock size=4096 (log=2)Fragment size=4096 (log=2)4096 inodes, 4096 blocks204 blocks (4.98%) reserved for the super userFirst data block=01 block group32768 blocks per group, 32768 fragments per group4096 inodes per group

Writing inode tables: done Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 20 mounts or180 days, whichever comes first. Use tune2fs -c or -i to override.

The mkfs.ext2 command displays information about the filesystem as it creates it on the device /dev/fd0. When the command completes, the filesystem has been initialized, and the floppy is ready to be used.

The mkfs command, and its variants, can be configured with a large collection of command line switcheswhich specify low level details about the filesystem. These details are beyond the scope of this course,however. Fortunately, the various options default to very reasonable general purpose defaults. For thecurious, more information can be found in the mkfs.ext2(8) and similar man pages.

Mounting Filesystems

Once a disk or a disk partition has been formatted with a filesystem, users need some way to access thedirectories and files that the filesystem provides. In other operating systems, users are usually very awareof disk partitions, because they have to refer to them using labels such as C: or D:. In Unix, users are oftenunaware of partitions, because different disk partitions are organized into a single directory structure.

How is this done? Every filesystem provides a root directory that serves as the base of that filesystem. Uponbooting the system, one of your disk's partitions acts as the root partition, and its root directory becomesthe system's root directory /. Sometimes, that's the end of the story. The / directory has subdirectories,and those subdirectories have subdirectories, all of which reside in the root partition's filesystem.

If a system has multiple disks, however, or if a disk has multiple partitions, the story gets more complicated.In order to access the filesystems on the other partitions, the root directories of those filesystems aremapped to an already existing directory through a standard Unix technique called mounting.

In the example diagrammed below, the filesystem on the partition /dev/hda2 is being used as theroot partition, and contains the /etc, /tmp, /home, and other expected directories. The /dev/hda4partition has also been formatted with a filesystem, and its root directory contains the directories /blondie, /prince, and others. In order to make use of this filesystem, it is mounted to the /homedirectory, which already existed in the root partition's filesystem. This mount usually happens as a normalpart of the system's boot up process.

Disks, Filesystems, and Mounting

44

Figure 4.2. Mounting a Filesystem

Now, all references to the /home directory are transparently mapped to the root directory of the filesystemon /dev/hda4, giving the appearance that the /home directory contains the subdirectories blondie,elvis, etc., as seen below. When a filesystem is mounted over a directory in this manner, that directoryis referred to as a mount point.

[elvis@station elvis]$ ls /homeblondie elvis madonna prince

How can elvis tell which partition contains a given file? Just using the ls command, he can't! To the lscommand, and usually in a user's mind, all of the different partitions are gracefully combined into a single,seamless directory structure.

Viewing Mount Points

How can a user determine which directories are being used as mount points? One approach is to run themount command without arguments.

[elvis@station elvis]$ mount/dev/hda3 on / type ext3 (rw)none on /proc type proc (rw)usbdevfs on /proc/bus/usb type usbdevfs (rw)/dev/hda1 on /boot type ext3 (rw)none on /dev/pts type devpts (rw,gid=5,mode=620)none on /dev/shm type tmpfs (rw)

Without arguments, the mount command returns a list of current mount points, the device that is mountedto it, the type of filesystem that device has been formatted with, and any mount options associated withthe mount. In the above example, the /dev/hda3 partition is being used as the root partition, and theext3 filesystem on partition /dev/hda1 has been mounted to the directory /boot. Note that several ofthe filesystems listed above are said to be on the device "none". These are virtual filesystems, which areimplemented by the kernel directly, and do not exist on any physical device.

Why Bother?

If you seldom know which directories are being used as mount points, and which files exist in whichpartitions, why bother even talking about it? For now, we will address two reasons. The first reason is thatthere can be subtle issues that creep up which are related to the underlying filesystem. Partitions can runout of space. If a filesystem mounted on /home runs out of space, no more files can be created underneaththe /home directory. This has no effect on the /tmp directory, however, because it belongs to anotherfilesystem. In Unix, when a partition fills up, it only effects the part of the directory structure underneathits mount point, not the entire directory tree.

Disks, Filesystems, and Mounting

45

Users can determine how much space is available on a partition with the df command, which stands for"disk free".

df [OPTION...] [FILE...]

Show information about all partitions, or partition on which FILE resides.

Switch Effect

-a, --all Show all filesystems, including those of size 0

-h, --human-readable Print sizes in human readable format

-i, --inodes List inode usage instead of block usage

-T, --print-type Include filesystem type

Not only will the df command show how much space is left on particular partitions, but it also gives avery readable table of which devices are mounted to which directories. In the following, the filesystem on/dev/hda2 is being used as the root partition, the filesystem on /dev/hda1 is mounted to the /bootdirectory, and /dev/hda4 is mounted to /home. A partition on a second disk drive, namely the /dev/hdb2 partition, is mounted to a non standard directory named /data.

[elvis@station elvis]$ dfFilesystem 1K-blocks Used Available Use% Mounted on/dev/hda2 8259708 6708536 1131592 86% //dev/hda1 102454 24227 72937 25% /boot/dev/hda4 5491668 348768 4863936 7% /home/dev/hdb2 4226564 1417112 2594748 36% /datanone 127592 0 127592 0% /dev/shm

Mounting Temporary Media: The /media directory.

The second reason users need to be aware of filesystems and mount points involves temporary media suchas floppies and CD/ROM drives. Like any block device, floppy disks and CD/ROM disks are formattedwith filesystems. In order to access these filesystems, they must be mounted into the directory structure,using a (already existing) directory as a mount point. Which directory should be used?

The /media directory contains subdirectories such as /media/floppy and /media/cdrom, or even/media/camera, which are intended for just this purpose; they serve as mount points for temporarymedia. (Thus the name of the /media directory.) If you would like to make use of a temporary disk,such as a floppy disk, you must first mount the filesystem into your directory structure, using the mountcommand.

[elvis@station elvis]$ mount /media/floppy[elvis@station elvis]$ dfFilesystem 1K-blocks Used Available Use% Mounted on/dev/hda2 8259708 6708536 1131592 86% //dev/hda1 102454 24227 72937 25% /boot/dev/hda4 5491668 348768 4863936 7% /home/dev/hdb2 4226564 1417112 2594748 36% /datanone 127592 0 127592 0% /dev/shm/dev/fd0 1412 13 1327 1% /media/floppy

On the last line, the df command now reports the newly mounted floppy drive, and elvis can copy filesonto the floppy.

Disks, Filesystems, and Mounting

46

Figure 4.3. Mounting a Formatted Floppy

[elvis@station elvis]$ cp /etc/services /media/floppy/[elvis@station elvis]$ ls /media/floppy/lost+found services

Where did the directory lost+found come from? This directory was created when the filesystem wascreated, and always exists in the root directory of an ext2 or ext3 filesystem. It is used occasionally whenrepairing damaged filesystems.

When elvis has finished using the floppy, he detaches it from the filesystem using the umount command.

[elvis@station elvis]$ umount /media/floppy/[elvis@station elvis]$ ls /media/floppy/

Once the floppy disk's filesystem has been detached from the /media/floppy directory, the directoryis just an empty directory.

Mounting Issues

Mounting devices is one of the more awkward and problematic issues for new Linux (and Unix) users.The following issues can also occur, which serve to complicate the matter.

Permissions By default, only the root user can mount and unmount devices.Temporary media are handled differently, however. The "ConsoleUser" (someone who has logged in from a virtual consoleor the X login screen) gains ownership of devices associatedwith the physical machine, such as a floppy drive, and specialpermissions to mount these devices to predefined mount points,such as /media/floppy. If a user has logged in by some othertechnique, such as over the network, or via the su command,they will not be considered the "Console User", and will not havepermissions to mount these devices.

Busy Filesystems A filesystem can only be unmounted if it is considered "non-busy". What can keep a filesystem "busy"? Any open file, or anyprocess that has a current working directory in the filesystem,"busy"s the filesystem. The only way for the filesystem to beunmounted is to track down any processes that might be keepingthe filesystem "busy", and kill them.

Automounters The GNOME graphical environment runs an automounter, whichkeeps an eye on the CD/ROM drive, and will automatically mount

Disks, Filesystems, and Mounting

47

the filesystem of any newly inserted disk. The automounter is partof the graphical environment, and does not exist if a user loggedin through a virtual console. Also, the automounter only works forthe CD/ROM drive. The floppy drive, or other devices, must bemounted "manually".

Kernel Buffering In order to improve performance, the kernel buffers all blockdevice (harddrive) interactions. For example, when you copya file to a floppy, the file might seem to have been copiedalmost immediately. Later, when you unmount the floppy withthe umount command, the command takes a while to return whilewrites are being committed to the floppy. When unmounting thedevice, the kernel is forced to commit all pending transactions tothe disk.

What would happen if you removed the floppy from the drivebefore buffered writes were committed to disk? At best, the filesthat you thought you had copied to the floppy would not be there.At worst, you may have a corrupted floppy, and a confused Linuxkernel the next time someone tries to mount a floppy.

The upshot: Not only must you mount temporary media (such asfloppies) before you can use them, you must also unmount themedia when you are done.

Examples

Using an Unformatted FloppyThe user madonna has a collection of songs which she would like to copy to an unformatted floppy toshare with friends. Because she knows that all of her friends use Red Hat Enterprise Linux, she decidesto format the floppy with the ext2 filesystem.

[madonna@station madonna]$ /sbin/mkfs.ext2 /dev/fd0mke2fs 1.32 (09-Nov-2002)Filesystem label=OS type: LinuxBlock size=1024 (log=0)Fragment size=1024 (log=0)184 inodes, 1440 blocks72 blocks (5.00%) reserved for the super userFirst data block=11 block group8192 blocks per group, 8192 fragments per group184 inodes per group Writing inode tables: doneWriting superblocks and filesystem accounting information: done This filesystem will be automatically checked every 21 mounts or180 days, whichever comes first. Use tune2fs -c or -i to override.

Next, she mounts the floppy, and copies her files over to it.

[madonna@station madonna]$ mount /media/floppy/[madonna@station madonna]$ cp song* /media/floppy/[madonna@station madonna]$ cd /media/floppy/[madonna@station floppy]$ ls

Disks, Filesystems, and Mounting

48

lost+found song02.ogg song04.ogg song06.oggsong01.ogg song03.ogg song05.ogg song07.ogg

She then unmounts her floppy.

[madonna@station madonna]$ umount /media/floppy/umount: /media/floppy: device is busy

Why would the floppy not unmount? Some process either has an open file, or a current working directorywithin the floppy's filesystem. The offending process is madonna's bash shell, whose current workingdirectory is /media/floppy. In order to unmount the floppy, madonna must cd to somewhere else,such as her home directory.

[madonna@station floppy]$ cd[madonna@station madonna]$ umount /media/floppy/

She can now remove the floppy from the drive.

Using a DOS Formatted FloppyEventually, madonna comes across a friend that insists he can only use DOS formatted floppies. Madonnaformats another floppy, this time with the MS-DOS filesystem, mounts the floppy, and copies her filesover to it.

[madonna@station madonna]$ /sbin/mkfs.msdos /dev/fd0mkfs.msdos 2.8 (28 Feb 2001)[madonna@station madonna]$ mount /media/floppy/[madonna@station madonna]$ cp song0* /media/floppy/

After she has copied the files to the floppy, she decides that she would like to create a soft link to identifyher favorite song for her friend.

[madonna@station madonna]$ cd /media/floppy/[madonna@station floppy]$ lssong01.ogg song03.ogg song05.ogg song07.oggsong02.ogg song04.ogg song06.ogg[madonna@station floppy]$ ln -s song06.ogg my_favorite_song.oggln: creating symbolic link `my_favorite_song.ogg' to `song06.ogg': Operation not permitted

Why could madonna not create the link? Although Linux supports the MS-DOS filesystem, the MS-DOS filesystem is much simpler than traditional Linux filesystems, and Linux needs to make somecompromises. One of these compromises is that the MS-DOS filesystem does not support soft (or hard)links. Neither does the MS-DOS filesystem support file owners and file permissions. How does Linuxhandle this? It treats all files as owned by the same user, and all permissions as 755. What happens whenmadonna tries to change permissions on one of the files?

[madonna@station floppy]$ chmod 664 song05.oggchmod: changing permissions of `song05.ogg' (requested: 0664, actual: 0644): Operation not permitted

Again, the operation not permitted. Different filesystems provide different capabilities, and the MS-DOSfilesystem is not as featured as the ext2 filesystem.

Now that she is finished, madonna cd's to her home directory, and unmounts the floppy.

[madonna@station floppy]$ cd[madonna@station madonna]$ umount /media/floppy/

Why did she cd to her home directory first?

Disks, Filesystems, and Mounting

49

Her bash shell's current working directory was inside the floppy's filesystem, which would have preventedher from unmounting the floppy.

Floppy ImagesSeveral friends have asked madonna for copies of the same floppy. After a while, she grows tired offormatting floppies, mounting them, copying the same 8 songs over to it, and unmounting. She decides tospeed up the process by creating an image file for her floppy.

She first prepares a floppy with the files that she wants to distribute.

[madonna@station madonna]$ /sbin/mkfs.ext2 /dev/fd0 > /dev/null mke2fs 1.32 (09-Nov-2002)[madonna@station madonna]$ mount /media/floppy/[madonna@station madonna]$ cp song* /media/floppy/[madonna@station madonna]$ ls /media/floppy/lost+found song02.ogg song04.ogg song06.oggsong01.ogg song03.ogg song05.ogg song07.ogg[madonna@station madonna]$ umount /media/floppy/

Next, she copies the contents of the unmounted floppy, byte for byte, into a file called songs.img.

[madonna@station madonna]$ cat /dev/fd0 > songs.img

The cat command, that we have been using on to view simple text files, works just as well on binary files,or, in this case, binary disk data. After a few seconds of the floppy drive spinning, she has a new file.How big is the file?

[madonna@station madonna]$ ls -s songs.img1444 songs.img

1.4 MBytes, exactly what you would expect from a floppy. She stores this image file, and wheneversomeone new wants a copy of her floppy, she reverses the process by cating the image back down to anunformatted floppy.

[madonna@station madonna]$ cat songs.img > /dev/fd0

Because the image file is transferred, byte for byte, to the new floppy, this command has the effect offormatting the floppy with an ext2 filesystem, and copying the files to it, all in one step. This is a powerfultechnique known as imaging drives, and works on any type of disk, not just floppies.

When accessing devices at a low level, it is important that the device is unmounted. If madonna hadperformed to last command on a mounted floppy, the kernel would have almost certainly been confused,and corrupted the floppy.

Online Exercises

Using Floppies

Lab Exercise

Objective: Format, mount, and umount a floppy drive

Estimated Time: 15 mins.

Setup

You will need a floppy for this exercise. The contents of the floppy will be destroyed.

Disks, Filesystems, and Mounting

50

Specification

In this lab exercise, you will format a floppy disk, mount it, copy files to it, and then unmount the floppy.

1. Make sure that the floppy disk is not write protected, and place it in the floppy drive.

2. Format the floppy with the ext2 filesystem, using the /sbin/mkfs.ext2 command.

3. Mount the floppy onto the /media/floppy directory, using the mount command.

4. Recursively copy the contents of the /etc/sysconfig directory onto your floppy. (Ignore any errorsthat you do not have permissions to read some files.)

5. Unmount your floppy. Swap floppies with a neighbor at this point, if possible.

6. Mount your neighbor's floppy (or remount your own), again to the /media/floppy directory.

7. With the floppy still mounted, capture the output of the df command into the file df.floppy in yourhome directory.

Deliverables

1.1. A floppy formatted with the ext2 filesystem mounted to the /media/floppy directory.

2. A directory /media/floppy/sysconfig, which is a recursive copy of the /etc/sysconfig directory (perhaps with a few inaccessible files omitted).

3. A file in your home directory called df.floppy, which contains the output of the dfcommand.

Possible Solution

The following sequence of commands provides one possible solution to this exercise.

[student@station student]$ /sbin/mkfs.ext2 /dev/fd0mke2fs 1.32 (09-Nov-2002)Filesystem label=OS type: Linux...[student@station student]$ mount /media/floppy/[student@station student]$ cp -r /etc/sysconfig /media/floppy/cp: cannot open `/etc/sysconfig/rhn/up2date' for reading: Permission deniedcp: cannot open `/etc/sysconfig/rhn/systemid' for reading: Permission denied...[student@station student]$ df > df.floppy[student@station student]$ umount /media/floppy/

Cleaning Up

You may unmount your floppy after you have been graded. If you are proceeding to the next exercise,save the contents of your floppy.

Imaging a Floppy

Lab Exercise

Objective: Create an image of a floppy disk drive

Disks, Filesystems, and Mounting

51

Estimated Time: 15 mins.

Setup

You will need the floppy you created in the previous exercise, namely, an ext2 formatted floppy containinga recursive copy of the /etc/sysconfig directory.

Specification

In this lab exercise, you will create an image file of a floppy, and then restore the floppy using the imagefile.

1. Make sure that the floppy disk is not write protected, and place it in the floppy drive. Make sure thatthe floppy is not mounted. Unmount the floppy with the umount command, if necessary.

2. Using the cat command, make an image file of the floppy in your home directory, called floppy.img.

3. Using the ls -s command, confirm that the size of your image file is 1444 Kbytes.

4. Using the file command, make sure the file is being identified as an ext2 filesystem.

5. With your floppy still in the drive, and still unmounted, reformat the floppy with the msdos filesystem(using the /sbin/mkfs.msdos command).

6. Mount your floppy to the /media/floppy directory, and, using the mount command withoutarguments, confirm that the floppy has been formatted with the msdos (vfat) filesystem.

7. Unmount the floppy. Using the cat command, restore your floppy from the image file.

8. Mount your floppy, again to the /media/floppy directory. Using the mount command, withoutarguments, confirm that the original ext2 filesystem has been restored.

Deliverables

1.1. A floppy formatted with the ext2 filesystem mounted to the /media/floppy directory.

2. An image of the same floppy, stored in the file floppy.img in your home directory.

Possible Solution

The following commands provide one solution to the first four steps of this exercise.

[student@station student]$ cat /dev/fd0 > floppy.img[student@station student]$ file floppy.imgfloppy.img: Linux rev 1.0 ext2 filesystem data[student@station student]$ ls -stotal 1448 4 df.floppy 1444 floppy.img

Cleaning Up

Because the image file you created from this exercise is fairly large, you may want to remove it after yourexercise has been graded.

QuestionsSome questions may use the following information:

Disks, Filesystems, and Mounting

52

[student@station student]$ mount/dev/hda9 on / type ext3 (rw)proc on /proc type proc (rw)sysfs on /sys type sysfs (rw)devpts on /dev/pts type devpts (rw,gid=5,mode=620)/dev/hda2 on /boot type ext3 (rw)/dev/hda7 on /home type ext3 (rw)none on /dev/shm type tmpfs (rw)/dev/hda8 on /tmp type ext3 (rw)/dev/hda5 on /usr type ext3 (rw)/dev/hda6 on /var type ext3 (rw)tmpfs on /dev/shm type tmpfs (rw)none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)

1. Which command or commands will show what filesystems are currently mounted?

a. chown

b. df

c. mount

d. ls

e. mkdir

2. According to the screen output above, which device contains the /home filesystem?

a. /dev/pts

b. /dev/hda1

c. /dev/hda5

d. /dev/hda7

e. None of the above

3. According to the screen output above, where is the device /dev/hda6 mounted?

a. /dev/pts

b. /home

c. /var

d. /usr

e. None of the above

4. When unmounting a device, the error message "umount: /media/floppy: device is busy" can meanwhat?

a. An application was started from the directory /media/floppy and is still running.

b. The current working directory for some shell is /media/floppy

c. The floppy is read only.

d. You do not have permission to use the umount command.

Disks, Filesystems, and Mounting

53

e. An application has the file /media/floppy/make.log open for writing.

5. Why is it important to unmount removable media before physically removing it?

a. If you do not, the system will shut down.

b. So that the next disk inserted can be read and mounted without error.

c. So that the disk can be read and mounted on another system without error.

d. Most systems cache data written to a device. The umount ensures that this data gets writtenfrom the cache to the device so that it is not lost.

6. Which of the following commands can be used to format an unformatted floppy?

a. /sbin/mkfs.ext2

b. /sbin/mkfsys.msdos

c. format

d. floppyfd

e. mount

The user elvis tries to unmount the floppy, and receives the following error message.

[elvis@station elvis]$ umount /media/floppy/umount: only student can unmount /dev/fd0 from /media/floppy

7. What is the most reasonable explanation for the message?

a. The user student write protected the floppy.

b. The user student mounted the floppy, and therefore only that user may umount the floppy.

c. The user elvis does not have permissions to run the umount command.

d. The user student formatted the floppy, so only the user student may mount and unmount it.

8. Which of the following commands would create an ext2 filesystem on the third partition of theprimary master IDE drive?

a. /sbin/mkfs.ext2 /dev/hda

b. /sbin/mkfs.ext2 /dev/fd0

c. /sbin/mkfs.msdos /dev/hda

d. /sbin/mkfs.msdos /dev/hda3

e. None of the above

9. Which of the following commands would create an ext2 filesystem on the second partition of aSCSI disk?

a. /sbin/mkfs.ext2 /dev/sda2

b. /sbin/mkfs.ext2 /dev/hda2

Disks, Filesystems, and Mounting

54

c. /sbin/mkfs.msdos /dev/sda2

d. /sbin/mkfs.msdos /dev/hda2

e. None of the above

When logged into the X graphical environment, using GNOME, elvis tries to mount a newly inserted CD/ROM.

[elvis@station elvis]$ mount /media/cdrommount: according to mtab, /dev/cdrom is already mounted on /media/cdrommount failed

10. What is the most reasonable explanation for the message?

a. Another user, logged in over the network, mounted the CD/ROM without elvis realizing it.

b. The GNOME automounter automatically mounted the CD/ROM.

c. The /etc/mtab file is out of sync, and the CD/ROM is really not currently mounted.

d. None of the above.

55

Chapter 5. Locating Files with locateand find

Key Concepts

• The locate command uses a database to quickly locate files on the system by filename.

• The find command performs a real time, recursive search of the filesystem.

• The find command can search for files based on inode information.

• The find command can perform arbitrary commands on files.

Discussion

Locating FilesIt is common to find config files in /etc or executables in a bin directory, however, sometimes it is necessaryto search the system for a specific file. Two of the common tools for this are locate and find.

The command locate prints the names of files and directories that match a supplied pattern. It is the fasterof the two commands because it relies on a database (updated daily by default) instead of searching realtime. The downside of this is that it will not find files created today or it will find files that have beendeleted since the last update of the database.

The command find can find files by name but can also search for files by owner, group, type, modificationdate, and many other criteria. With its real time search through the directory tree, it is slower than locatebut it is also more flexible.

Using LocateThe locate command quickly reports all files on the disk whose filename contains the specified text.The search relies on a database, which is updated nightly, so recently created files will probably not bereported. The database does remember file permissions, however, so you will only see files which youwould normally have permissions to see.

Earlier we used the command umount to unlink a filesystem from the directory tree. Lets see what fileson the system include the string "umount" in their names.

[blondie@station blondie]$ locate umount/bin/umount/sbin/umount.cifs/sbin/umount.nfs/sbin/umount.nfs4/usr/bin/gnome-umount/usr/share/doc/samba-3.0.23c/htmldocs/manpages/smbumount.8.html/usr/share/doc/samba-3.0.23c/htmldocs/manpages/umount.cifs.8.html/usr/share/man/man2/umount.2.gz/usr/share/man/man2/umount2.2.gz/usr/share/man/man8/umount.8.gz/usr/share/man/man8/umount.cifs.8.gz/usr/share/man/man8/umount.nfs.8.gz

Locating Files with locate and find

56

Notice that in addition to /bin/umount we also locate variants for special network related filesystems,several man page files.

The locate command also supports "file globs", or, more formally, pathname expansion, using the same*, ?, and [...] expressions as the bash shell. For example, if you knew that there was a PNG image of afish somewhere on the system, you might try the following locate command.

For reasons discussed in a later workbook, it's a better idea to wrap any "globs" in quotes, though you canoften get away without them.

[blondie@station ~]$ locate "*fish*.png"/usr/share/backgrounds/tiles/fish.png/usr/share/gnome/help/fish/C/figures/fish_applet.png/usr/share/gnome/help/fish/es/figures/fish_applet.png.../usr/share/gnome/help/fish-applet-2/de/figures/fish_applet.png/usr/share/gnome/help/fish-applet-2/de/figures/fish_settings.png/usr/share/gnome/help/fish-applet-2/ja/figures/fish_applet.png.../usr/share/gnome/panel/pixmaps/fishanim.png/usr/share/icons/hicolor/16x16/apps/gnome-panel-fish.png...

Using findThe find command is used to search the filesystem for files that meet a specified criteria. Almost anyaspect of a file can be specified, such as its name, its size, the last time it was modified, even its link count.(The only exception is the file's content. For that, we need to wait for a command called grep, whichcomplements find nicely.)

The find command's syntax takes a little getting accustomed to, but once learned, is very usable. A findcommand essentially consists of three parts: a root directory (or directories), a search criteria, and an action.

find (root directory) (criteria) (action)

The default directory is ".", the default criteria is "every file", and the default action is "print" (thefilename), so running the find command without arguments will simply descend the current directory,printing every filename. If given a directory name as a single argument, the same would be done for thatdirectory.

[madonna@station madonna]$ find /etc/sysconfig/networking//etc/sysconfig/networking//etc/sysconfig/networking/devices/etc/sysconfig/networking/devices/ifcfg-eth0/etc/sysconfig/networking/profiles/etc/sysconfig/networking/profiles/default/etc/sysconfig/networking/profiles/default/network/etc/sysconfig/networking/profiles/default/resolv.conf/etc/sysconfig/networking/profiles/default/hosts/etc/sysconfig/networking/profiles/default/ifcfg-eth0/etc/sysconfig/networking/profiles/netup/etc/sysconfig/networking/profiles/netup/network/etc/sysconfig/networking/profiles/netup/resolv.conf/etc/sysconfig/networking/profiles/netup/hosts/etc/sysconfig/networking/profiles/netup/ifcfg-eth0/etc/sysconfig/networking/ifcfg-lo

Usually, however, the find command is given criteria to refine its search, in the form of (non standard)command line switches. For example, the -name command line switch is used to find files with a givenname. As with locate, globs are supported, but should be quoted.

[madonna@station madonna]$ find /etc -name "*.conf"

Locating Files with locate and find

57

/etc/gdm/securitytokens.conf/etc/gdm/custom.conf/etc/gssapi_mech.conf.../etc/updatedb.conf/etc/reader.conf/etc/selinux/restorecond.conffind: /etc/selinux/targeted/modules/active: Permission deniedfind: /etc/selinux/targeted/modules/previous: Permission denied/etc/selinux/targeted/setrans.conf/etc/selinux/semanage.conf...

While superficially similar to the locate command, find functions by performing a search in real time.This can take a lot longer, but avoids the issue of an "out of sync" database. Note that if the proper orderingis not followed, find becomes quickly confused.

[madonna@station madonna]$ find -name "*.conf" /etcfind: paths must precede expressionUsage: find [path...] [expression]

Find Criteria

If you browse the find(1) man page, you will discover that an overwhelming selection of criteria can bespecified for your search. Almost any aspect of the file that can be reported by the stat command or the lscommand is fair game. The following table summarizes some of the more common search criteria.

Table 5.1. Search Criteria for the find Command

switch specification

-empty The file is a directory or regular file, and is empty.

-group gname The file is group owned by gname.

-inum n The file has an inode number n.

-links n The file has n links.

-mmin n The file was last modified n minutes ago.

-mtime n The file was last modified n days ago.

-name pattern The file's name matches the file glob pattern.

-newer filename The file was modified more recently than filename.

-perm mode The file's permissions are exactly mode.

-perm -mode All of the permission bits mode are set for the file.

-perm +mode Any of the permission bits mode are set for the file.

-size n The file has a size of n.

-type c The file is of type c, where c is "f" (regular file), "d" (directory), or "l" (symboliclink). See the man page for more details.

-user uname File is owned by the user uname.

More options are available, but these should give you an idea of the flexibility of the find command.Any criteria that takes a numeric argument, such as -size or -mtime, recognizes arguments of the form +3(meaning more than 3), -3 (meaning less than 3), or 3 (meaning exactly 3).

If multiple criteria are specified, by default, all criteria must be met. If multiple criteria are separated by -or, however, either condition may be met. Criteria can be inverted by preceding the criteria with -not.

Locating Files with locate and find

58

As an example, the following command finds all files under /var which are not group writable.

[elvis@station elvis]$ find /var -not -perm +20 /var/var/lib/var/lib/rpm/var/lib/rpm/Packages/var/lib/rpm/Basenames/var/lib/rpm/Name/var/lib/rpm/Group...

Find Actions

You can also specify what you would like done to files that meet the specified criteria. By default, if nocriteria is specified, the file name is printed to standard out, one file per line. Other options are summarizedin the following table.

Table 5.2. Action Specifications for the find Command

Switch Action

-exec command ; Execute command on matching files. Use {} to indicate where filename should besubstituted.

-ok command ; Like -exec, but prompt for each file

-ls Print file in ls -dils format.

Again, others exist. Consult the find(1) man page.

Perhaps the most useful, and definitely the most awkward, of these is -exec, and its close cousin -ok.The -exec mechanism is powerful: rather than printing the names of matching files, arbitrary commandscan be run. The -exec mechanism is awkward, because the syntax for specifying the command to run istricky. The command should be written after the -exec switch, using a literal {} as a placeholder for thefile name. The command should be terminated with a literal ;, but as will be seen a later Workbook, the; has special significance to the shell, and so must be "escaped" by prepending a \. An example will helpclarify the syntax.

Suppose madonna wanted to make a copy of every file greater than 200 Kbytes in size out of the /etcdirectory. First, she finds which files meet the criteria.

[madonna@station madonna]$ find /etc -size +200k 2>/dev/null/etc/selinux/targeted/policy/policy.21/etc/firmware/microcode.dat/etc/prelink.cache/etc/termcap/etc/pki/tls/certs/ca-bundle.crt/etc/gconf/gconf.xml.defaults/%[email protected]/etc/gconf/gconf.xml.defaults/%gconf-tree-cs.xml...

(The 2>/dev/null serves to "throw away" complaints about directories madonna does not have permissionsto access.)

To confirm the sizes of the files, she reruns the command, specifying the "action" of -ls.

[madonna@station madonna]$ find /etc -size +200k -ls 2>/dev/null132462 1112 -rw-r--r-- 1 root root 1130305 Aug 22 15:41 /etc/selinux/targeted/policy/policy.21132045 788 -rw-r--r-- 1 root root 798555 Dec 4 2006 /etc/firmware/microcode.dat132705 320 -rw-r--r-- 1 root root 319141 Aug 24 13:20 /etc/prelink.cache130862 800 -rw-r--r-- 1 root root 807103 Jul 12 2006 /etc/termcap131090 440 -rw-r--r-- 1 root root 441017 Nov 30 2006 /etc/pki/tls/certs/ca-bundle.crt

Locating Files with locate and find

59

132279 460 -rw-r--r-- 1 root root 459789 Aug 22 15:43 /etc/gconf/gconf.xml.defaults/%[email protected] 468 -rw-r--r-- 1 root root 467812 Aug 22 15:43 /etc/gconf/gconf.xml.defaults/%gconf-tree-cs.xml...

Now, she makes a directory called /tmp/big, and composes a cp command on the find command line,remembering the following.

• Place a {} as a placeholder for matching file names

• Terminate the command with a \;.

[madonna@station madonna]$ mkdir /tmp/big[madonna@station madonna]$ find /etc -size +200k -exec cp {} /tmp/big \; 2>/dev/null[madonna@station madonna]$ ls /tmp/big/apps_gnome_settings_daemon_keybindings.schemas %gconf-tree-nn.xmlapps_nautilus_preferences.schemas %gconf-tree-or.xmlca-bundle.crt %gconf-tree-pa.xmlclock.schemas %gconf-tree-pl.xml...

Rather than printing the file name, the find command copied the files to the /tmp/big directory.

Examples

Using locateThere are several ways to find a specific file.

[blondie@station blondie]$ locate rmdir/bin/rmdir/usr/lib/perl5/5.8.8/i386-linux-thread-multi/auto/POSIX/rmdir.al/usr/share/doc/bash-3.1/loadables/rmdir.c/usr/share/man/man1/rmdir.1.gz/usr/share/man/man1p/rmdir.1p.gz/usr/share/man/man2/rmdir.2.gz/usr/share/man/man3p/rmdir.3p.gz[blondie@station blondie]$ find /bin -name "*dir*"/bin/mkdir/bin/rmdir[blondie@station blondie]$ which rmdir/bin/rmdir

In the above examples, locate shows everything in the database with the string "rmdir" including thecommand and man pages. find shows all files under /bin that include "dir" in the name. Finally, whichshows the absolute path for a known command.

You can also include filename expansion characters in your search:

[blondie@station blondie]$ locate "*theme*png".../home/elvis/gdm/themes/RHEL/background.png/home/elvis/gdm/themes/RHEL/distribution.png/home/elvis/gdm/themes/RHEL/icon-language.png/home/elvis/gdm/themes/RHEL/icon-reboot.png/home/elvis/gdm/themes/RHEL/icon-session.png/home/elvis/gdm/themes/RHEL/icon-shutdown.png/home/elvis/gdm/themes/RHEL/logo.png...

Recall that locate uses a database and will not locate files that have been created since the database waslast updated. The example below should not show any output from the locate command.

Locating Files with locate and find

60

[blondie@station blondie]$ touch ~/locate_example_file[blondie@station blondie]$ locate locate_example_file

Because the locate database does not yet know about the locate_example_file file, no files arereported.

Using findThe find searches the actual directory tree from the specified beginning point.

[elvis@station elvis]$ find /home/elvis/home/elvis/home/elvis/.metacity/home/elvis/.metacity/sessions/home/elvis/.metacity/sessions/1187895446-1983-3849805419.ms/home/elvis/.metacity/sessions/1187972526-2245-2949308818.ms...

Multiple starting directories can be specified for the find command.

[elvis@station elvis]$ find /bin /usr/bin -name "*dir*"/bin/rmdir/bin/mkdir/usr/bin/dir/usr/bin/lndir/usr/bin/dirname...

The find command can be used to discover symbolic links.

[elvis@station ~]$ find /usr/bin -type l/usr/bin/lastb/usr/bin/spam/usr/bin/gnome-character-map/usr/bin/rdistd/usr/bin/ipmish/usr/bin/python2...[elvis@station ~]$ ls -l /usr/bin/lastblrwxrwxrwx 1 root root 4 Aug 22 15:36 /usr/bin/lastb -> last

As a complicated example, find can produce a "ls -l style" listing of everything on the system not ownedby the users root, bin or elvis. As there may be directories where search access is denied, redirecting errorsto /dev/null reduces screen clutter.

[elvis@station elvis]$ find / -not -user root -not -user bin -not -user elvis -ls 2> /dev/null 198506 96 -rwxr-xr-x 1 rpm rpm 89344 Jan 4 2007 /bin/rpm1013889 8 drwx------ 3 madonna madonna 4096 Aug 25 08:43 /home/madonna1013833 8 drwx------ 3 pataki pataki 4096 Aug 22 15:43 /home/pataki1013986 8 drwx------ 4 rhauser_a rhauser_a 4096 Aug 23 20:33 /home/rhauser_a1013903 8 drwx------ 3 prince prince 4096 Aug 22 15:43 /home/prince1014000 8 drwx------ 3 rhauser_c rhauser_c 4096 Aug 23 11:02 /home/rhauser_c1013896 8 drwx------ 3 bob bob 4096 Aug 22 15:43 /home/bob1013854 8 drwx------ 3 blondie blondie 4096 Aug 25 08:59 /home/blondie1013875 8 drwx------ 3 nero nero 4096 Aug 22 15:43 /home/nero1013882 8 drwx------ 3 einstein einstein 4096 Aug 22 15:43 /home/einstein...

Using find to Execute Commands on FilesFind all the files under /tmp with a link count greater than 1 and make a copy of each in a directorycalled /tmp/links.

Locating Files with locate and find

61

[blondie@station blondie]$ ls -l /tmp/*file-rw-rw-r-- 2 blondie blondie 0 Mar 17 22:33 /tmp/linkfile-rw-rw-r-- 2 blondie blondie 0 Mar 17 22:33 /tmp/newfile[blondie@station blondie]$ mkdir /tmp/links[blondie@station blondie]$ find /tmp -type f -links +1 -exec cp {} /tmp/links \;[blondie@station blondie]$ ls /tmp/linkslinkfilenewfile

Online Exercises

Locating files

Lab Exercise

Objective: Devise and execute a find command that produces the result described in each of thefollowing.

Estimated Time: 20 mins.

Specification

Use the find command to find files which match the following criteria, and redirect the output to thespecified files in your home directory. When listing filenames, make sure every filename is an absolutereference.

You will encounter a number of "Permission denied" messages when find tries to recurse into directoriesfor which you do not have permissions to access. Do not be concerned with these errors. You can suppressthese error messages by appending 2> /dev/nullto your find command.

You may need to consult the find(1) man page to find the answer for some of the problems.

Deliverables

The following command lines will produce the appropriate answers.

find /var/lib -user webalizer

find /var -user root -group mail

find /usr/bin -size +1000000c -ls

find /etc/sysconfig -exec file {} \;

find /usr/bin -type f -links +3

1.1. The file varlib.webalizer, which contains a list of all files under the /var/lib

directory which are owned by the user "webalizer".

2. The file var.rootmail, which contains a list of all files under the /var directory which areowned by the user "root" and group owned by the group "mail".

3. The file bin.big which contains a ls -dils style listing of all files under the /usr/bindirectory that are greater than 1000000 characters in size.

Locating Files with locate and find

62

4. Execute the file command on every file under /etc/sysconfig, and record the output inthe file sysconfig.find.

5. The file big.links, which contains a list of the filenames of regular files underneath the /usr/bin directory which have a link count of greater than 3.

Questions1. Which find option will locate files of exactly 100 blocks?

a. -size +100

b. -size 100

c. -inum 100

d. -inum +100

2. Which find option will locate files with inode number 100?

a. -type f

b. -size 100

c. -inum 100

d. -perm 100

3. Which find option or options will locate only ordinary files which have a link count of 2 or more?

a. -type f -links +2

b. -links +1

c. -type o -links +1

d. -type f -links +1

4. Which find option will print the output in a ls -l style format?

a. -type

b. -size

c. -inum

d. -ls

5. Which find option or options will locate files owned by user root and group sys?

a. -user root -and -group sys

b. -user root -group sys

c. -user root

d. -group sys

Locating Files with locate and find

63

6. Which command or commands will list files that were recently created and include the string"coffee" in their names?

a. slocate coffee

b. find . -name coffee

c. find . -name "*coffee*"

d. ls -R *coffee*

7. Which command or commands will list files that include the string "coffee" in their names?

a. slocate coffee

b. find . -name coffee

c. find . -name "*coffee*"

d. ls -R *coffee*

8. Which find command will locate ordinary files under /home or /tmp with world writablepermissions?

a. find /home /tmp -type f -perm -2

b. find /home -or /tmp -type f -perm 002

c. find /home /tmp -type o -perm 2

d. find /home /tmp -perm -2

9. Which find option can be added to the previous answer so that each file found will have the "other"write permission removed?

a. -exec chmod o-w \;

b. -exec chmod o-w

c. -exec chmod o-w {} \;

d. -exec chmod -2

10. Which find option can be used such that each file found will have permissions removed and havefind prompt interactively whether or not to change the permissions?

a. -ok

b. -ask

c. -exec

d. -exec -ok

64

Chapter 6. Compressing Files: gzipand bzip2

Key Concepts

• Compressing seldom used files saves disk space.

• The most commonly used compression command is gzip.

• The bzip2 command is newer, and provides the most efficient compression.

Discussion

Why Compress Files?

Files that are not used very often are often compressed. Large files are also compressed before transferringto other systems or users. The advantages of saved space and bandwidth usually outweighs the added timeit takes to compress and uncompress files.

Text files often have patterns that can be compressed up to 75% but binary files rarely compress more than25%. In fact, it is even possible for a compressed binary file to be larger than the original file!

Standard Linux Compression Utilities

As better and better compression techniques have been developed, new compression utilities have gainedfavor. For backwards compatibility, however, older compression utilities are still retained. Often, there isa trade off between compression efficiency and CPU activity. Sometimes, older compression utilities do"good enough" in a much shorter time.

The following list discusses the two most common compression utilities used in Linux and Unix.

gzip (.gz) The gzip command is the most versatile and most commonly useddecompression utility. Files compressed with gzip are uncompressed withgunzip. Additionally, the gzip command supports the following command lineswitches.

Switch Effect

-c Redirect Output to stdout

-d Decompress instead of compress file

-r Recurse through subdirectories, compressing individual files.

-1 ... -9 Specify trade off between CPU intensity and compressionefficiency.

bzip2 (.bz) The bzip2 command is a relative newcomer, which tends to produce the mostcompact compressed files, but is the most CPU intensive. Files compressedwith bzip2 are uncompressed with bunzip2. The bzip2 command supports thefollowing command line switches.

Compressing Files: gzip and bzip2

65

Switch Effect

-c Redirect Output to stdout

-d Decompress instead of compress file

The following examples illustrate the use and relative efficiency of the compression commands.

[elvis@station elvis]$ ls -sh termcap725K termcap[elvis@station elvis]$ gzip termcap[elvis@station elvis]$ ls -sh termcap*234K termcap.gz[elvis@station elvis]$ gzip -d termcap[elvis@station elvis]$ ls -sh termcap*725K termcap

[elvis@station elvis]$ ls -sh termcap725K termcap[elvis@station elvis]$ bzip2 termcap[elvis@station elvis]$ ls -sh termcap*185K termcap.bz2[elvis@station elvis]$ bunzip2 termcap.bz2[elvis@station elvis]$ ls -sh termcap*725K termcap

Other Compression UtilitiesAnother compression utility available in Red Hat Enterprise Linux is zip. This utility is compatible with theDOS/Windows PKzip/Winzip utilities and can compress more than one file into a single file, somethingthat gzip and bzip2 cannot do.

Linux and Unix users often prefer instead to use tar and gzip together in preference to zip. The commandtar is discussed in the next lesson.

Examples

Working with gzipMadonna also has a copy of the same bigfile but prefers to use gzip compression.

[madonna@station madonna]$ gzip bigfile[madonna@station madonna]$ ls -l bigfile*-rw-r--r-- 1 madonna madonna 131069 Mar 18 15:29 bigfile.gz[madonna@station madonna]$ gunzip bigfile.gz[madonna@station madonna]$ ls -l bigfile*-rw-r--r-- 1 madonna madonna 409305 Mar 18 15:29 bigfile

Notice the better compression algorithm from this utility.

Using gzip RecursivelyThe gzip command includes a -r command line switch, which will recurse through subdirectories,compressing individual files. In the following example, madonna will create a local copy of the /etc/sysconfig/networking directory, and then recursively compress the copy.

[madonna@station madonna]$ cp -r /etc/sysconfig/networking .[madonna@station madonna]$ gzip -r networking

Compressing Files: gzip and bzip2

66

[madonna@station madonna]$ tree networking/networking/|-- devices| `-- ifcfg-eth0.gz|-- ifcfg-lo.gz`-- profiles |-- default | |-- hosts.gz | |-- ifcfg-eth0.gz | |-- network.gz | `-- resolv.conf.gz `-- netup |-- hosts.gz |-- ifcfg-eth0.gz |-- network.gz `-- resolv.conf.gz 4 directories, 10 files

Working with bzip2Elvis realizes that the compress utility that he first used is old and decides to try a much newer compressionutility.

[elvis@station elvis]$ bzip2 bigfile[elvis@station elvis]$ ls -l bigfile*-rw-r--r-- 1 elvis elvis 154563 Mar 18 15:29 bigfile.bz2[elvis@station elvis]$ bunzip2 bigfile.bz2[elvis@station elvis]$ ls -l bigfile*-rw-r--r-- 1 elvis elvis 409305 Mar 18 15:29 bigfile

Notice that to uncompress this archive, Elvis must give the filename with the bz2 extension. In the otherexamples, the utility used could find a file of the given base name with a known extension.

Online Exercises

Working with compression Utilities

Lab Exercise

Objective: Compress large files

Estimated Time: 10 mins.

Specification

1. Copy the files /etc/gconf/schemas/gnome-terminal.schemas and /usr/bin/gimpinto your home directory, preserving their original filenames. (The first is an example of a large textfile, the second is an example of a large binary file.) Use the gzip command to compress each of thenewly created files.

2. Again, copy the files /etc/gconf/schemas/gnome-terminal.schemas and /usr/bin/gimp into your home directory. This time, use the bzip2 command to compress the two files.

3. One last time, copy the /etc/gconf/schemas/gnome-terminal.schemas and /usr/bin/gimp files into your home directory. Use the ls -s command to compare the sizes of the variouscompression techniques.

Compressing Files: gzip and bzip2

67

Deliverables

1.1. The file gnome-terminal.schemas in your home directory, which is a copy of /etc/gconf/schemas/gnome-terminal.schemas.

2. The file gnome-terminal.schemas.gz, the gziped version of gnome-terminal.schemas.

3. The file gnome-terminal.schemas.bz2, the bzip2ed version of gnome-terminal.schemas.

4. The file gimp in your home directory, which is a copy of /usr/bin/gimp.

5. The file gimp.gz, the gziped version of gimp.

6. The file gimp.bz2, the bzip2ed version of gimp.

Questions1. What filename extension is generally associated with files compressed using the bzip2 utility?

a. .Z

b. .gz

c. .bz2

d. .tar

2. What filename extension is generally associated with files compressed using the gzip utility?

a. .Z

b. .gz

c. .bz2

d. .tar

3. Which commands can uncompress a .gz file?

a. uncompress

b. gunzip

c. gzip -d

d. bunzip2

4. Why is compression most useful for text files?

a. Binary files may get corrupted when compressed.

b. Binary files are always larger after being compressed

c. Utilities cannot compress a binary file

Compressing Files: gzip and bzip2

68

d. Binary files are often already efficiently using space and little is gained by compressing them.

5. Assuming that the text files exist in the current directory, what will the command gzip report.txtdraft.txt schedule.txt produce?

a. A single file with three text files in it.

b. A single compressed file with three text files in it.

c. Three compressed text files.

d. An error message.

69

Chapter 7. Archiving Files with tarKey Concepts

• Archiving files allows an entire directory structure to be stored as a single file.

• Archives are created, listed, and extracted with the tar command.

• Archive files are often compressed as well.

• The fileroller application provides a GUI interface to archiving files.

Discussion

Archive FilesOften, if a directory and its underlying files are not going to be used for a while, or if the entire directorytree is going to be transferred from one place or another, people convert the directory tree into an archivefile. The archive contains the directory and its underlying files and subdirectories, packaged as a single file.In Linux (and Unix), the most common command for creating and extracting archives is the tar command.

Originally, archive files provided a solution to backing up disks to tape. When backing up a filesystem,the entire directory structure would be converted into a single file, which was written directly to a tapedrive. The tar command derived its name from "t"ape "ar"chive.

Today, the tar is seldom used to write to tapes directly, but instead creates archive files which are oftenreferred to as "tar files", "tar archives", or sometimes informally as "tarballs". These archive files areconventionally given the .tar filename extension.

Tar Command BasicsWhen running the tar command, the first command line must be selected from the following choices.

Switch Effect

-c, --create Create an archive file

-x, --extract Extract an archive file

-t, --list List the contents of an archive file

There are others, but almost always one of these three will suffice. See the tar(1) man page for more details.

Next, almost every invocation of the tar command must include the -f command line switch and itsargument, which specifies which archive file is being created, extracted, or listed.

As an example, the user prince has been working on a report, which involves several subdirectories andfiles.

report/|-- html/| |-- chap1.html| |-- chap2.html| `-- figures/| `-- image1.png

Archiving Files with tar

70

`-- text/ |-- chap1.txt `-- chap2.txt 3 directories, 5 files

He would like to email a copy of the report to a friend. Rather than attach each individual file to an emailmessage, he decides to create an archive of the report directory. He uses the tar command, specifying -cto "c"reate an archive, and using the -f command line switch to specify the archive file to create.

[prince@station prince]$ tar -c -f report.tar report[prince@station prince]$ ls -stotal 24 4 report 20 report.tar

The newly created archive file report.tar now contains the entire contents of the report directory,and its subdirectories. In order to confirm that the archive was created correctly, prince lists the contentsof the archive file with the tar -t command (again using -f to specify which archive file).

[prince@station prince]$ tar -t -f report.tarreport/report/text/report/text/chap1.txtreport/text/chap2.txtreport/html/report/html/figures/report/html/figures/image1.pngreport/html/chap1.htmlreport/html/chap2.html

As further confirmation, prince extracts the archive file in the /tmp directory, using tar -x.

[prince@station prince]$ cd /tmp[prince@station tmp]$ tar -x -f /home/prince/report.tar [prince@station tmp]$ ls -R report/report/:html text report/html:chap1.html chap2.html figures report/html/figures:image1.png report/text:chap1.txt chap2.txt

Now convinced that the archive file contains the report, and that his friend should be able to extract it, hecleans up the test copy, and uses the mutt command to email the archive as an attachment.

[prince@station tmp]$ rm -fr report/[prince@station tmp]$ cd[prince@station prince]$ mutt -a report.tar -s "My Report" [email protected]

Don't be concerned if you aren't familiar with mutt. This just serves as an example of why someone mightwant to create a tar archive.

More About tarThe first command line switch to the tar command must be one of the special switches discussed above.Because the first switch is always one of a few choices, the tar command allows a shortcut; you do notneed to include the leading hyphen. Often, experienced users of tar will use shortened command lineslike the following.

Archiving Files with tar

71

[prince@station prince]$ tar cf report.tar report[prince@station prince]$ tar tf report.tarreport/report/text/report/text/chap1.txtreport/text/chap2.txtreport/html/report/html/figures/report/html/figures/image1.pngreport/html/chap1.htmlreport/html/chap2.html

Creating archives introduces a lot of complicated questions, such as some of the following.

• When creating archives, how should links be handled? Do I archive the link, or what the link refers to?

• When extracting archives as root, do I want all of the files to be owned by root, or by the original owner?What if the original owner doesn't exist on the system I'm unpacking the tar on?

• What happens if the tape drive I'm archiving to runs out of room in the middle of the archive?

The answers to these, and many other questions as well, can be decided with an overwhelming numberof command line switches to the tar command, as tar --help or a quick look at the tar(1) man page willdemonstrate. The following table lists some of the more commonly used switches, and there use will bediscussed below.

Switch Effect

-C, --directory=DIR Change to directory DIR

-P, --absolute-reference don't strip leading / from filenames

-v, --verbose list files processed

-z, --gzip internally gzip archive

-j, --bzip2 internally bzip2 archive

Absolute References

Suppose prince wanted to archive a snapshot of the current networking configuration of his machine. Hemight run a command like the following. (Note the inclusion of the -v command line switch, which listseach file as it is processed.)

[prince@station prince]$ tar cvf net.tar /etc/sysconfig/networkingtar: Removing leading `/' from member namesetc/sysconfig/networking/etc/sysconfig/networking/devices/etc/sysconfig/networking/devices/ifcfg-eth0etc/sysconfig/networking/profiles/etc/sysconfig/networking/profiles/default/etc/sysconfig/networking/profiles/default/network...

As the leading message implies, what was an absolute reference to /etc/sysconfig/networkingis converted to relative references inside the archive: None of the entries have leading slashes. Why is thisdone? What happens if prince turns right around and extracts the archive?

[prince@station prince]$ tar xvf net.taretc/sysconfig/networking/etc/sysconfig/networking/devices/etc/sysconfig/networking/devices/ifcfg-eth0

Archiving Files with tar

72

etc/sysconfig/networking/profiles/etc/sysconfig/networking/profiles/default/etc/sysconfig/networking/profiles/default/networketc/sysconfig/networking/profiles/default/resolv.conf...[prince@station prince]$ ls -R etc/etc/:sysconfig etc/sysconfig:networking etc/sysconfig/networking:devices ifcfg-lo profiles etc/sysconfig/networking/devices:ifcfg-eth0 etc/sysconfig/networking/profiles:default netup etc/sysconfig/networking/profiles/default:hosts ifcfg-eth0 network resolv.conf etc/sysconfig/networking/profiles/netup:hosts ifcfg-eth0 network resolv.conf

Because the file entries were relative, the archive unpacked into the local directory. As a rule, archivefiles will always unpack locally, reducing the chance that you will unintentionally clobber files in yourfilesystem by unpacking an archive on top of them. When constructing the archive, this behavior can beoverridden with the -P command line switch.

Establishing Context

When extracting the archive above, the first "interesting" directory is the networking directory, becauseit contains the relevant subdirectories and files. When extracting the archive, however, and "extra" etcand etc/sysconfig are created. In order to get to the interesting directory, someone has to work hisway down to it.

When constructing an archive, the -C command line switch can be used to help establish context bychanging directory before the archive is constructed. Compare the following two tar commands.

[prince@station prince]$ tar cvf net.tar /etc/sysconfig/networkingtar: Removing leading `/' from member namesetc/sysconfig/networking/etc/sysconfig/networking/devices/etc/sysconfig/networking/devices/ifcfg-eth0etc/sysconfig/networking/profiles/etc/sysconfig/networking/profiles/default/etc/sysconfig/networking/profiles/default/network...[prince@station prince]$ tar cvf net.tar -C /etc/sysconfig networkingnetworking/networking/devices/networking/devices/ifcfg-eth0networking/profiles/networking/profiles/default/networking/profiles/default/network...

In the second case, the tar command first changes to the /etc/sysconfig directory, and then createsa copy of the networking directory found there. When the resulting archive file is extracted, the"interesting" directory is immediately available.

Archiving Files with tar

73

Of course, prince could have used the cd command before running the tar command to the same effect,but the -C command line switch is often more efficient.

Compressing archives

Often, the tar command is used to archive files that will not be used anytime soon. Because the resultingarchive files will not be used soon, they are compressed as well. In the following, prince is able to save asignificant amount of disk space by gziping his archive of his home directory.

[prince@station prince]$ tar cf /tmp/prince.tar -C /home/prince .[prince@station prince]$ ls -s /tmp/prince.tar 224 /tmp/prince.tar[prince@station prince]$ gzip /tmp/prince.tar[prince@station prince]$ ls -s /tmp/prince.tar.gz 28 /tmp/prince.tar.gz

Because users are often creating and then compressing archives, or dealing with archives that have beencompressed, the tar command provides three command line switches for internally compressing (ordecompressing) archive files. Above, prince could have obtained the same result by adding a -z commandline switch.

[prince@station prince]$ tar czf /tmp/prince.tar.gz -C /home/prince .[prince@station prince]$ ls -s /tmp/prince.tar.gz 28 /tmp/prince.tar.gz

The combination of tar and gzip is found so often, that often the .tar.gz filename extension will beabbreviated .tgz.

[prince@station prince]$ tar czf /tmp/prince.tgz -C /home/prince .[prince@station prince]$ tar tzf /tmp/prince.tgz././.bash_profile./Desktop/./Desktop/redhat_academy.desktop./.bashrc./.bash_logout./.zshrc

With older versions of the tar command, when expanding or listing a tar archive, you would have torespecify the appropriate compression ( with the -z or -j command line switches). In recent version of RedHat Enterprise Linux, however, the tar command will now automatically recognize a compressed archive,and decompress it appropriately.

[prince@station prince]$ tar tf /tmp/prince.tgz././.bash_profile./Desktop/./Desktop/redhat_academy.desktop./.bashrc./.bash_logout./.zshrc

Examples

Creating a tar ArchiveThe user einstein wants to make a copy of the bash documentation that he can take along with him. Hequickly tars up the /usr/share/doc/bash-2.05b directory.

[einstein@station einstein]$ tar cvzf bashdoc.tgz -C /usr/share/doc bash-3.1bash-3.1/

Archiving Files with tar

74

bash-3.1/bashref.psbash-3.1/bash.0bash-3.1/bash.htmlbash-3.1/article.psbash-3.1/complete/bash-3.1/complete/complete2.ianmac...[einstein@station einstein]$ ls -s bashdoc.tgz 1240 bashdoc.tgz

Once he gets the file to its new location, he extracts the archive.

[einstein@station einstein]$ tar xvf bashdoc.tgzbash-3.1/bash-3.1/bashref.psbash-3.1/bash.0bash-3.1/bash.htmlbash-3.1/article.psbash-3.1/complete/bash-3.1/complete/complete2.ianmac...

Tarring Directly to a FloppyThe user maxwell wants to quickly compare the LDAP configuration on two different machines. Themachines are not connected to a network, but both have a floppy drive. Rather than creating an archive,formatting a floppy, mounting the floppy, copying the archive, and unmounting the floppy, maxwelldecides to save a few steps. With an unmounted floppy in the drive, maxwell runs the following command.

[maxwell@station maxwell]$ tar cvzf /dev/fd0 -C /etc openldapopenldap/openldap/ldapfilter.confopenldap/ldap.confopenldap/ldapsearchprefs.confopenldap/ldaptemplates.conf

He then ejects the floppy and carries it to the second machine. The following command extracts the archiveinto his local directory.

[maxwell@station maxwell]$ tar xvzf /dev/fd0openldap/openldap/ldapfilter.confopenldap/ldap.confopenldap/ldapsearchprefs.confopenldap/ldaptemplates.confopenldap/ldaptemplates.conf gzip: stdin: decompression OK, trailing garbage ignoredtar: Child died with signal 13tar: Error exit delayed from previous errors

Although the tar command (or, more accurately, the gzip command) complained about "trailing garbage",the archive was successfully extracted.

What happened here? The tar command wrote directly to the floppy's device node, so the archive filewas written byte for byte onto the floppy as raw data. Upon extracting the archive, the file was read bytefor byte, until the file was entirely read. The gzip command kept going, however, trying to decompresswhatever was sitting on the floppy before the archive was written. This is the "trailing garbage" that thegzip command complained about. What was the filename of the archive as it was sitting on the floppy?(Trick question!)

The file doesn't have a name, because the floppy doesn't have a filesystem. (What ever filesystem mighthave existed on the floppy was destroyed by the archive).

Archiving Files with tar

75

Oops.The user einstein wants to create an archive of his home directory. He tries the following command.

[einstein@station einstein]$ tar cvzf ~/einstein.tgz ~tar: Removing leading `/' from member nameshome/einstein/home/einstein/.kde/home/einstein/.kde/Autostart/...home/einstein/.bash_historyhome/einstein/einstein.tgztar: /home/einstein/einstein.tgz: file changed as we read ittar: Error exit delayed from previous errors

Why did the tar command error out? The archive was being written to the file /home/einstein/einstein.tgz. The archive included every file in the /home/einstein directory. Eventually, thetar command tried to append the file /home/einstein/einstein.tgz to the archive /home/einstein/einstein.tgz. This obviously causes problems.

Fortunately, the tar command is now smart enough to detect circular references. In the (not too distant)"old days", the first clue that something was wrong in situations like this was the long time it took the tarcommand to run, and the second clue was the error message saying that the disk was out of space.

What's the solution? Make sure that the archive file you're creating does not exist in the directory yourarchiving. The /tmp directory comes in handy.

[einstein@station einstein]$ tar czf /tmp/einstein.tgz ~tar: Removing leading `/' from member names[einstein@station einstein]$ mv /tmp/einstein.tgz .

Online Exercises

Archiving Directories

Lab Exercise

Objective: Create an archive using the tar command.

Estimated Time: 15 mins.

Specification

1. In your home directory, create the file zip_docs.tar which is an archive of the documentation forthe zip package located in the /usr/share/doc/zip* directory.

2. Create the file /tmp/student.tgz, which is a gziped archive of your home directory. Replacestudent with your username.

Deliverables

Solutions

tar cvf ~/zip_docs.tar /usr/share/doc/zip*

tar cvzf /tmp/student.tgz ~

Archiving Files with tar

76

1.1. The file zip_docs.tar in your home directory, which is an archive of the /usr/share/doc/zip* directory.

2. The file /tmp/student.tgz, with student replaced with your username, which is agziped archive of your home directory.

Questions1. Which of the following commands would create an archive called archive.tar?

a. tar -c -f archive.tar

b. tar -x -f archive.tar /usr/games

c. tar -t -f archive.tar /usr/games

d. tar -c -f archive.tar /usr/games

e. None of the above.

2. Which of the following commands would list the contents of an archive file calledarchive.tar?

a. tar tf archive.tar

b. tar -xf archive.tar

c. tar -c -f archive.tar

d. tar --list archive.tar

e. None of the above.

3. Which of the following commands would extract the contents of an archive file calledarchive.tar?

a. tar tf archive.tar

b. tar -xf archive.tar

c. tar -c -f archive.tar

d. tar --list archive.tar

e. None of the above.

4. You have downloaded a file titled linux-2.5.34.tar.gz. Which of the following commandscan you run to extract the contents of the file?

a. tar xvzf linux-2.5.34.tar.gz

b. tar -x -f linux-2.5.34.tar.gz

c. tar -x -z -f linux-2.5.34.tar.gz

d. tar -xZf linux-2.5.34.tar.gz

Archiving Files with tar

77

e. tar fz linux-2.5.34.tar.gz

f. tar -x -f linux-2.5.34.tar.gz -z

5. You would like to make a bzip2 compressed archive of the /usr/share/sounds directory, sothat when someone extracts the archive, it extracts starting with the directory sounds. Which ofthe following commands will create the archive?

a. tar -c -f /tmp/sounds.tar.bz2 /usr/share/sounds

b. tar cvjf /tmp/sounds.tar.bz2 -C /usr/share sounds

c. tar -c -f /tmp/sounds.tar.bz2 -C /usr/share/sounds -j

d. tar -cj /tmp/sounds.tar.bz2 -f /usr/share/sounds

e. None of the above

6. What filename extension does the tar command add automatically when creating an archive?

a. .tar

b. .tgz

c. .tar.gz

d. .zip

e. No extension is added by tar.

7. Usually, when a tar archive is extracted with the command tar xzf archive.tgz, whereare the files placed?

a. Relative to the root of the current filesystem

b. Relative to the current working directory

c. Relative to the directory specified on the command line

d. Relative to the root of the root filesystem

e. Relative to the /tmp directory

8. A file has been downloaded called archive.tgz. How can you view the contents of this archive?

a. tar tzvf archive.tgz

b. tar jtvf archive.tgz

c. tar tvf archive.tgz

d. tar cvzf archive.tgz

e. tar tf archive.tgz