CSNB334 Advanced Operating Systems 7. File Management Lecturer: Asma Shakil.


Page 1

CSNB334 Advanced Operating Systems
7. File Management

Lecturer: Asma Shakil

Page 2

Introduction

The file system part of the operating system provides the resource abstractions typically associated with secondary storage.

A file is a collection of data with the following properties:
• Long-term existence.
• Sharable between processes.
• Structure: hierarchical.

Typical operations on a file: create, delete, open, close, read, write.

Attributes of a file: owner, creation time, time last modified, access privileges, etc.

Page 3

Linux File Structure

Linux views a file as a named stream of bytes for writing to and reading from storage devices, without distinction into physical fields, records and so on.

A simple description of the UNIX system, also applicable to Linux, is this: "On a UNIX system, everything is a file; if something is not a file, it is a process."

Files include:
• Programs, services, texts, images, and so forth.
• Named pipes.
• Sockets.
• Input and output devices.

Page 4

Linux File Manager

Gives a set of functions (system calls) to manipulate files:

    int open(char *pathname, int oflag [, int mode]);
    int creat(char *pathname, int mode);
    int read(int filedes, char *buf, unsigned int nbytes);
    int close(int filedes);
    int write(int filedes, char *buf, unsigned int nbytes);
    long lseek(int filedes, long offset, int where);   // to position a file

For lseek():
• where = 0: offset from the beginning of the file.
• where = 1: offset from the current position in the file.
• where = 2: offset from the end of the file (offset + size of file).

    int ioctl(int filedes, unsigned long request, char *arg);

ioctl() is used to change the behaviour of an open file.

Page 5

The Linux (Virtual) File System

Linux includes a versatile and powerful file handling facility, the VFS, designed to support a wide variety of file management systems and file structures.

Basically, the VFS is a kernel software layer that handles all system calls related to a standard Unix filesystem.

Its main strength is providing a common interface to several kinds of filesystems to user processes, regardless of the target file system or the underlying processor hardware. This allows Linux to access files from disks in other OS formats such as Windows, MINIX etc.

Page 6

The Role of the Virtual Filesystem (VFS)

Let's assume that a user issues the shell command:

$ cp /floppy/TEST /tmp/test

where /floppy is the mount point of an MS-DOS diskette and /tmp is a normal Second Extended Filesystem (Ext2) directory.

The VFS is an abstraction layer between the application program and the filesystem implementations.

Therefore, the cp program is not required to know the filesystem types of /floppy/TEST and /tmp/test.

Instead, cp interacts with the VFS by means of generic system calls.

Page 7

[Figure: VFS role in a simple file copy operation]

Page 8

Filesystems supported by the VFS

• Filesystems for Linux: Second Extended Filesystem (Ext2), the recent Third Extended Filesystem (Ext3), and the Reiser Filesystem (ReiserFS).
• Filesystems for Unix variants: sysv filesystem (System V, Coherent, Xenix), UFS (BSD, Solaris, NEXTSTEP), MINIX filesystem, and VERITAS VxFS (SCO UnixWare).
• Microsoft filesystems: MS-DOS, VFAT (Windows 95 and later releases), and NTFS (Windows NT 4 and later releases).
• ISO9660 CD-ROM filesystem and Universal Disk Format (UDF) DVD filesystem.
• Other proprietary filesystems: IBM's OS/2 (HPFS), Apple's Macintosh (HFS), Amiga's Fast Filesystem (AFFS), and Acorn Disk Filing System (ADFS).
• Additional filesystems originating in systems other than Linux, such as IBM's JFS and SGI's XFS.

You can see which file systems are registered by looking at /proc/filesystems.

Page 9

The Common File Model

The key idea behind the VFS consists of introducing a common file model capable of representing all supported filesystems.
• This model strictly mirrors the file model provided by the traditional Unix filesystem.
• However, each specific filesystem implementation must translate its physical organization into the VFS's common file model.

For instance, in the common file model, each directory is regarded as a file, which contains a list of files and other directories.
• However, several non-Unix disk-based filesystems use a File Allocation Table (FAT), which stores the position of each file in the directory tree. In these filesystems, directories are not files.
• To stick to the VFS's common file model, the Linux implementations of such FAT-based filesystems must be able to construct on the fly, when needed, the files corresponding to the directories. Such files exist only as objects in kernel memory.

Page 10

The Linux (Virtual) File System

When a process initiates a file-oriented system call, the kernel calls a function in the VFS.

This function handles the file-system-independent manipulations, such as:
• Checking access rights.
• Closing an open file.
• Modifying the file pointer (with lseek()).

The file-system-dependent manipulations, such as:
• Determining where blocks are located on the disk.
• Instructing the device driver to read/write blocks.
are handled by a translator (mapping function) that converts the call from the VFS into a call to the target file system.

Page 11

An example

From the previous example (the cp command), consider the read() call. This would be translated by the kernel into a call specific to the MS-DOS filesystem.
• The application's call to read() makes the kernel invoke the corresponding sys_read() service routine.
• Each file is represented by a file data structure in kernel memory.
• This data structure contains a field called f_op that contains pointers to functions specific to MS-DOS files, including a function that reads a file.
• sys_read() finds the pointer to this function and invokes it.
• Thus, the application's read() is turned into the rather indirect call: file->f_op->read(...);

Page 12

The Linux (Virtual) File System

VFS is an OO scheme.
• There is a base class, named filesystem, with a set of virtual methods that are overridden by every custom file system present in the kernel.

Since it is written in C rather than an OO language, VFS objects are implemented simply as C data structures. Each object contains:
• Data.
• Pointers to file-system-dependent functions that operate on that data.

Page 13

The Linux (Virtual) File System

The four primary object types in VFS are:
• Superblock object.
• Inode object.
• Dentry object.
• File object.

Page 14

The superblock object

The superblock object holds information about each mounted file system.

It owes its name to historical heritage: the first block of a disk partition (called the superblock) was used to hold the meta-information about the partition itself.

The actual data structure in Linux is struct super_block.

It holds information such as:
• Device that this filesystem is mounted on.
• Basic block size of the file system.
• Flags, such as a read-only flag.
• Mount time.
• File system type.
• Dirty flag, to indicate that the superblock has been changed but not written back to disk.
• Semaphore for controlling access to the file system.
• List of superblock operations.

Page 15

Superblock

    struct super_block {
        kdev_t s_dev;                       /* device */
        unsigned long s_blocksize;          /* block size */
        unsigned char s_blocksize_bits;     /* ld(block size) */
        unsigned char s_lock;               /* superblock lock */
        unsigned char s_rd_only;
        unsigned char s_dirt;
        struct file_system_type *s_type;
        struct super_operations *s_op;
        unsigned long s_flags;
        unsigned long s_magic;
        unsigned long s_time;
        struct inode *s_covered;            /* mount point */
        struct inode *s_mounted;            /* root inode */
        struct wait_queue *s_wait;          /* s_lock wait queue */
        union {
            struct minix_sb_info minix_sb;
            struct ext2_sb_info ext2_sb;
            ….
            void *generic_sb;
        } u;
    };

Page 16

Superblock operations

s_op points to a vector of functions for accessing the file system.

    struct super_operations {
        // Reads a specified inode from a mounted file system.
        void (*read_inode)(struct inode *);
        // Called when inode attributes are changed.
        int (*notify_change)(struct inode *, struct iattr *);
        // Writes the given inode to disk.
        void (*write_inode)(struct inode *);
        // Called if the inode is no longer required, e.g. when deleting a file and releasing its blocks.
        void (*put_inode)(struct inode *);
        void (*put_super)(struct super_block *);
        // Called when the VFS decides that the superblock needs to be written to disk.
        void (*write_super)(struct super_block *);
        void (*statfs)(struct super_block *, struct statfs *);
        void (*remount_fs)(struct super_block *, int *, char *);
    };

These functions serve to translate the specific representation of the superblock and inode on data media to their general form in memory and vice-versa.

Page 17

The Inode Object

An inode (Index Node) is associated with each file.

The inode object holds all information (metadata) about a named file, except its name and the actual data contents:
• Owner
• Group
• Permissions
• Access time
• Size of data it holds
• Number of links

To obtain the inode number for a file: ls -i

An inode is both a physical object located on the disk of a filesystem and a conceptual object described in the kernel by a struct inode.

Each inode object is associated with an inode number that uniquely identifies the file within the filesystem.

Page 18

INODE

    struct inode {
        kdev_t i_dev;                   // ID of device containing the file, or 0
        unsigned long i_ino;            // file's inode number
        umode_t i_mode;                 // permissions
        nlink_t i_nlink;                // number of hard links
        // uid, gid etc.
        dev_t i_rdev;                   /* only if device special file */
        // size, times of modification, access, creation etc.
        struct inode_operations *i_op;
        …….
    };

System calls related to obtaining the metadata of a file:

    int stat(const char *path, struct stat *buf);
    int fstat(int fd, struct stat *buf);

Page 19

Inode Operations

    struct inode_operations {
        struct file_operations *default_file_ops;
        int (*create)(struct inode *, const char *, int, int, struct inode **);
        int (*lookup)(struct inode *, const char *, int, struct inode **);
        int (*link)(struct inode *, struct inode *, const char *, int);
        int (*unlink)(struct inode *, const char *, int);
        int (*symlink)(struct inode *, const char *, int);
        int (*mkdir)(struct inode *, const char *, int);
    };

NOTE: All these functions are directly called from the implementation of the corresponding system call.

Page 20

The Dentry Object

A directory is a file that associates inodes with filenames; each such association is a directory entry (dentry).

The directory structure is very simple: each directory is an array of links.

Page 21

A link is a structure which associates a name (string) with an inode number.

Directory and Link Structure

• Each file has to have at least one link in one directory.
• This is true for directories too, except for the root directory.
• All files can be identified by their path, which is the list of links which have to be traversed to reach the file (either starting at the root directory, or at the current directory).
• A file can have links in many directories.
• A directory has to have a single link towards itself (not counting "." and ".."), from its parent directory.

Page 22

File Object

Every file that is opened by a process has a corresponding file object.

An "open file" is described in the Linux kernel by a struct file item; the structure encloses a pointer to the inode representing the file.

The file object by itself has no corresponding image on the disk.

The main information stored in a file object is the file pointer: the current position in the file from which the next operation will take place. It can be different for different processes.

File structures are created by system calls like open, pipe and socket, and are shared by parent and child across fork.

Page 23

File Structure

    struct file {
        // Links files together into one of a number of lists. There is one list
        // for each active file system, starting at the s_files pointer in the superblock.
        struct list_head f_list;
        // Records the directory entry that points to the inode for this file.
        struct dentry *f_dentry;
        // Points to the methods to use on this file.
        struct file_operations *f_op;
        // The number of references to this file. One for each different
        // user-process file descriptor.
        atomic_t f_count;
        // Stores the flags for this file, such as access type (read/write),
        // nonblocking, append-only etc.
        unsigned int f_flags;
        mode_t f_mode;
        // Records the current file position, which will be the address used for
        // the next read request, and for the next write request if the file
        // does NOT have the O_APPEND flag.
        loff_t f_pos;
        unsigned long f_reada, f_ramax, f_raend, f_ralen, f_rawin;
        struct fown_struct f_owner;
        // Set to the owner and group of the process which opened the file.
        unsigned int f_uid, f_gid;
        int f_error;
        unsigned long f_version;    /* needed for tty driver, and maybe others */
        void *private_data;
    };

Page 24

File Methods

    struct file_operations {
        loff_t (*llseek) (struct file *, loff_t, int);
        ssize_t (*read) (struct file *, char *, size_t, loff_t *);
        ssize_t (*write) (struct file *, const char *, size_t, loff_t *);
        int (*readdir) (struct file *, void *, filldir_t);
        unsigned int (*poll) (struct file *, struct poll_table_struct *);
        int (*ioctl) (struct inode *, struct file *, unsigned int, unsigned long);
        int (*mmap) (struct file *, struct vm_area_struct *);
        int (*open) (struct inode *, struct file *);
        int (*flush) (struct file *);
        int (*release) (struct inode *, struct file *);
        int (*fsync) (struct file *, struct dentry *);
        int (*fasync) (int, struct file *, int);
        int (*check_media_change) (kdev_t dev);
        int (*revalidate) (kdev_t dev);
        int (*lock) (struct file *, int, struct file_lock *);
    };

Page 25

[Figure: An example of how processes interact with files]

Page 26

Filesystem Mounting

Before using a filesystem, there are two basic operations that must be performed:
• Registration: done when you build the kernel.
• Mounting:
  For the root filesystem, done at system initialization.
  For other filesystems, done at any time.

Page 27

Mounting

All files accessible in a UNIX system are arranged in one big tree, the file hierarchy, rooted at /.

These files can be spread out over several devices.

The mount command serves to attach the file system found on some device to the big file tree.

Page 28

Mount command

For example, you must "mount" the DVD-ROM drive before you can access it:

mount -t iso9660 /dev/hdc /cdrom

• mount makes a device part of the file system.
• -t iso9660 specifies the format of the file system being mounted. (iso9660 is the standard format for data CDs and most DVDs; it would be msdos if we were mounting a floppy drive with a DOS-formatted floppy in it.)
• /dev/hdc is the path to the DVD-ROM drive's device file.
• /cdrom is the directory to "map" the device to in the file system so it can be accessed. This is called the "mount point" and can be any user-defined directory.

Page 29

Mounting (Contd..)

The "mount_root()" function takes care of mounting the first file system.

Every mounted file system is then represented by a super_block structure.

The function read_super() of the virtual file system is used to initialize the superblock.

Page 30

Registering the File Systems

When you build the Linux kernel you are asked if you want each of the supported file systems. You can see which file systems are registered by looking at /proc/filesystems. For example:

      ext2
nodev proc
      iso9660

When the kernel is built, the file system startup code contains calls to the initialization routines of all of the built in file systems.

Each file system's initialization routine registers itself with the Virtual File System and is represented by a file_system_type data structure which contains the name of the file system and a pointer to its VFS superblock read routine.

Page 31

file_system_type data structures

Each file_system_type data structure contains the following information:
• Superblock read routine: this routine is called by the VFS when an instance of the file system is mounted.
• File system name: the name of this file system, for example ext2.
• Device needed: does this file system need a device to support it? Not all file systems need a device to hold them. The /proc file system, for example, does not require a block device.

Page 32

Opening a file

To open a file, the file manager searches the storage systems for the specified pathname.
• This involves opening each directory in the pathname and searching it for the next file or directory in the pathname.
• If the search encounters a mount point, then it moves from one file system to the other and continues the search.

Once the file is found, VFS checks file and user permissions for that file. If the process has the correct permissions, VFS sets up various table entries to manage I/O.
• An entry in the file descriptor table (each process has one), in addition to stdin (0), stdout (1) and stderr (2).
  This entry is identified by the integer value returned by the open() system call, which is used for all subsequent references to the file.
• The entry in the file descriptor table points to an entry in the open file table, which is of type struct file.
  The file structure entry holds the status information specific to the process that opened the file, e.g. the value of the file position for this process's use.
• The file structure entry references the VFS inode after it has been created in primary memory.

Page 33

[Figure: Opening file]

Page 34

Mounting a File System: How is it done?

$ mount -t iso9660 /dev/cdrom /mnt/cdrom

This mount command will pass the kernel three pieces of information: the name of the file system, the physical block device that contains the file system and, thirdly, where in the existing file system topology the new file system is to be mounted.

The first thing that the Virtual File System must do is to find the file system.
• To do this it searches through the list of known file systems by looking at each file_system_type data structure in the list pointed at by file_systems.
• If it finds a matching name, it now knows that this file system type is supported by this kernel, and it has the address of the file-system-specific routine for reading this file system's superblock.

Page 35

Mounting a File System: How is it done?

Next, if the physical device passed by mount is not already mounted, the VFS must find the VFS inode of the directory that is to be the new file system's mount point.
• Once the inode has been found, it is checked to see that it is a directory and that there is not already some other file system mounted there. The same directory cannot be used as a mount point for more than one file system.

Next, the VFS mount code must allocate a VFS superblock and pass the mount information to the superblock read routine for this file system.
• The superblock read routine must fill out the VFS superblock fields based on information that it reads from the physical device.
• For the EXT2 file system this mapping or translation of information is quite easy: it simply reads the EXT2 superblock and fills out the VFS superblock from there.
• For other file systems, such as the MS-DOS file system, it is not quite such an easy task.

If the block device cannot be read from, or if it does not contain this type of file system, then the mount command will fail.

Page 36

A Mounted File System

Each mounted file system is described by a vfsmount data structure; these structures are queued on a list pointed at by vfsmntlist.

Each vfsmount structure contains:
• the device number of the block device holding the file system,
• the directory where this file system is mounted, and
• a pointer to the VFS superblock allocated when this file system was mounted.

In turn, the VFS superblock points at:
• the file_system_type data structure for this sort of file system, and
• the root inode for this file system. This inode is kept resident in the VFS inode cache all of the time that this file system is loaded.

Page 37

The Second Extended File system (EXT2)

The Second Extended File system was devised (by Rémy Card) as an extensible and powerful file system for Linux.

The EXT2 file system, like a lot of the file systems, is built on the premise that the data held in files is kept in data blocks.
• These data blocks are all of the same length.
• Every file's size is rounded up to an integral number of blocks. If the block size is 1024 bytes, then a file of 1025 bytes will occupy two 1024 byte blocks.
• Unfortunately this means that on average you waste half a block per file.

In this case Linux, along with most operating systems, trades off a relatively inefficient disk usage in order to reduce the workload on the CPU.

Page 38

The Second Extended File system (EXT2)

The EXT2 file system divides the logical partition that it occupies into Block Groups.

• Each group duplicates information critical to the integrity of the file system, as well as holding real files and directories as blocks of information and data.
• This duplication is necessary should a disaster occur and the file system need recovering.

Page 39

The EXT2 Inode

EXT2 defines the file system topology by describing each file in the system with an inode data structure. The inodes for the file system are all kept together in inode tables.

Page 40

[Figure: The EXT2 Inode]

Page 41

How large can a file be?

Depends on:
• Block size.
• Size of the disk addresses used.

Example: assuming maximum block size = 4 KB:
• Direct blocks: the 12 direct blocks (0-11) address files of up to 48 KB.
• Indirect blocks: assuming each level of indirection can access 1000 blocks:
  Level 1: 1000 = 1 K blocks
  Level 2: 1000 x 1000 = 1 M blocks
  Level 3: 1000 x 1000 x 1000 = 1 G blocks
• Total = 12 + 1,000 + 1,000,000 + 1,000,000,000 = 1,001,001,012 blocks = 4,004,004,048 KB (roughly 4 TB).

Page 42

File size

• Very large files are possible.
• Can try with other block sizes.
• However, file access time will be slow due to the multiple levels of indexes.
• Most Unix variants do not use more than a 2-level index, due to incompatibility with storage hardware.