File-System Interface
CS 355 Operating Systems
Dr. Matthew Wright
Operating System Concepts, chapter 10



File Concept
• A file is a named collection of related information that is recorded on secondary storage.
• A file has a defined structure, which we must know in order to interpret its contents. Examples: text, image, executable, etc.
• Files have attributes, usually including the following:
  – Name: human-readable file name
  – Identifier: numeric identifier within the file system
  – Type: some systems formally support different file types
  – Location: address of the file in a storage device
  – Size: number of bytes (or words, or blocks) in the file
  – Protection: access-control information
  – Time, date, and user identification: may be useful for protection, security, and resource monitoring
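
Example (a sketch, not from the slides): most of the attributes listed above can be inspected from a program. The helper name `describe` and the throwaway file are assumptions for illustration; the calls are Python's standard `os.stat` and `stat` module. The storage location is normally not exposed through this interface.

```python
import os
import stat
import tempfile
import time


def describe(path):
    """Return a dict of attributes the OS records for `path`."""
    info = os.stat(path)
    if stat.S_ISDIR(info.st_mode):
        kind = "directory"
    elif stat.S_ISREG(info.st_mode):
        kind = "regular file"
    else:
        kind = "other"
    return {
        "name": os.path.basename(path),
        "identifier": info.st_ino,                  # numeric id within the file system
        "type": kind,
        "size_bytes": info.st_size,
        "protection": stat.filemode(info.st_mode),  # permission string
        "owner_uid": info.st_uid,
        "modified": time.ctime(info.st_mtime),
    }


# Demo on a throwaway file with hypothetical contents.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
attrs = describe(tmp.name)
os.remove(tmp.name)
print(attrs)
```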


File Operations
• Since a file is an abstract data type, we should define the operations that can be performed on files.
• The operating system provides system calls to perform these operations:
  – Create: the OS must find space in the file system and add an entry to the directory
  – Write: the OS must find the location of the file and usually keeps a write pointer that indicates where the next write will occur
  – Read: the OS must find the file and usually keeps a read pointer that indicates where the next read will occur
  – Reposition within a file: change the file position pointer to a given value (i.e., seek to a given location)
  – Delete: release the space allocated to a file and update the directory
  – Truncate: erase the contents of a file, but keep its attributes
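
Example (a sketch, not from the slides): the six operations map closely onto the low-level calls in Python's `os` module, which wrap the underlying system calls. The file name `demo.txt` and its contents are assumptions for illustration.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "demo.txt")   # hypothetical file name

# Create: the OS adds an entry to the directory.
fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)

# Write: advances the file position pointer by the bytes written.
os.write(fd, b"hello world")

# Reposition: move the pointer to byte offset 6.
os.lseek(fd, 6, os.SEEK_SET)

# Read: starts at the current pointer.
tail = os.read(fd, 5)

# Truncate: discard the contents, keep the file and its attributes.
os.ftruncate(fd, 0)
size_after_truncate = os.fstat(fd).st_size

os.close(fd)

# Delete: remove the directory entry and release the space.
os.unlink(path)
exists_after_delete = os.path.exists(path)
os.rmdir(workdir)
```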


File Operations
• To reduce directory searching, many systems require an open() system call before a file is first used.
• The OS maintains an open-file table with information about all open files.
• When a process finishes using a file, it issues a close() system call.
• The open-file table contains the following information for each file:
  – File pointer: stores the current read/write location within a file; unique to each process accessing the file
  – File-open count: the number of processes accessing the file
  – Disk location of the file: to improve access speed, the location of the file on disk is stored in memory
  – Access rights: indicates what operations a process is allowed to perform on a file
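
Example (a toy model, not from the slides): one way to picture the bookkeeping is a system-wide table holding the open count and disk location, plus a per-process record holding the file pointer and access rights. The class names, the file name `notes.txt`, and the disk location 4096 are all hypothetical.

```python
class OpenFileTable:
    """Toy system-wide table: one entry per open file."""

    def __init__(self):
        self.entries = {}   # name -> {"open_count": int, "disk_location": int}

    def open(self, name, disk_location):
        entry = self.entries.setdefault(
            name, {"open_count": 0, "disk_location": disk_location})
        entry["open_count"] += 1

    def close(self, name):
        entry = self.entries[name]
        entry["open_count"] -= 1
        if entry["open_count"] == 0:
            del self.entries[name]   # last close removes the entry


class Process:
    """Toy per-process view: its own file pointer and access rights."""

    def __init__(self, table):
        self.table = table
        self.files = {}   # name -> {"pointer": int, "mode": str}

    def open(self, name, mode, disk_location=0):
        self.table.open(name, disk_location)
        self.files[name] = {"pointer": 0, "mode": mode}

    def close(self, name):
        del self.files[name]
        self.table.close(name)


table = OpenFileTable()
p1, p2 = Process(table), Process(table)
p1.open("notes.txt", "r", disk_location=4096)
p2.open("notes.txt", "rw", disk_location=4096)

p1.files["notes.txt"]["pointer"] = 100           # p1 reads ahead
p2_pointer = p2.files["notes.txt"]["pointer"]    # p2 is unaffected
count_while_shared = table.entries["notes.txt"]["open_count"]

p1.close("notes.txt")
p2.close("notes.txt")
```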


File Locking
• The operating system may provide processes the ability to lock an open file to prevent other processes from gaining access to it.
• Locks may be shared by several processes or exclusive to one process.
• Locks may be mandatory or advisory:
  – Mandatory locks are enforced by the operating system
  – Advisory locks are not enforced by the OS; it is up to application programmers to ensure that locks are properly acquired and released
• Windows systems generally use mandatory locking, while UNIX systems generally use advisory locking.
• File locking in Java is accomplished via the lock() method of the FileChannel object associated with a file.
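
Example (a sketch, not from the slides; UNIX-like systems only): advisory locking can be observed with Python's `fcntl.flock`. Two independent opens of the same file stand in for two cooperating programs; the file name `shared.dat` is hypothetical. The second open is refused the lock, yet it can still write, because nothing forces it to ask.

```python
import fcntl
import os
import tempfile

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "shared.dat")   # hypothetical file name

holder = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
other = os.open(path, os.O_RDWR)   # a second, independent open of the file

# Take an exclusive lock through the first descriptor.
fcntl.flock(holder, fcntl.LOCK_EX)

# A cooperating program asks for the lock first and is refused.
try:
    fcntl.flock(other, fcntl.LOCK_EX | fcntl.LOCK_NB)
    second_lock_granted = True
except OSError:
    second_lock_granted = False

# Advisory: a program that skips the lock request can still write.
bytes_written = os.write(other, b"unlocked write")

fcntl.flock(holder, fcntl.LOCK_UN)
os.close(holder)
os.close(other)
os.unlink(path)
os.rmdir(workdir)
```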


File Types
• An operating system may be designed to recognize and support various file types.
• File types are often stored in the file name, as a file extension (the part of the file name following a period).
• The system may use the file extension to indicate the type of operations that can be done to a file.
• File extensions are often just hints, not guarantees that a certain file is of a given type.
• Mac OS X stores a creator attribute with each file: the name of the program that created the file, so that it can open files with the correct application.
• UNIX uses a magic number stored at the beginning of some files to indicate the general type; users may add file extension hints, but the OS does not use them.
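
Example (a sketch, not from the slides): checking a magic number amounts to comparing the first few bytes of a file against known signatures. The table below lists a handful of well-known signatures; the function name `guess_type` and the labels are assumptions for illustration.

```python
# A few well-known leading byte sequences.
MAGIC_NUMBERS = [
    (b"\x7fELF", "ELF executable"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"%PDF-", "PDF document"),
    (b"#!", "script with interpreter line"),
]


def guess_type(first_bytes):
    """Return a label for the first signature that matches, else 'unknown'."""
    for magic, label in MAGIC_NUMBERS:
        if first_bytes.startswith(magic):
            return label
    return "unknown"


print(guess_type(b"#!/bin/sh\necho hi\n"))
print(guess_type(b"plain text with no signature"))
```

Note that the file name plays no part in the decision, which matches the point that extensions are only hints.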


File Structure
• Some operating systems impose certain structure on files or require that files conform to predetermined file types.
• For example, UNIX considers each file to be a sequence of 8-bit bytes, though it does not impose an interpretation of the bytes.
• All operating systems must support some sort of executable file so that users can run programs.
• Mac OS requires that files contain two parts: a resource fork (containing user-specific information) and a data fork (containing program code or data).
• Since disk I/O is performed in blocks of set size (e.g. 512 bytes), the OS must pack logical records into physical blocks to be stored on the disk.
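
Example (a worked calculation, not from the slides): assuming fixed-length records that are never split across blocks, the packing arithmetic is short. The 100-byte record size and the count of 23 records are hypothetical; the 512-byte block is the slide's example.

```python
import math


def pack(block_size, record_size, record_count):
    """Unspanned packing of fixed-length records into fixed-size blocks.

    Returns (records per block, blocks needed, bytes wasted per full block).
    """
    records_per_block = block_size // record_size
    blocks_needed = math.ceil(record_count / records_per_block)
    wasted_per_block = block_size - records_per_block * record_size
    return records_per_block, blocks_needed, wasted_per_block


result = pack(512, 100, 23)
print(result)
```

The wasted bytes at the end of each block are internal fragmentation: larger blocks or records that divide the block size evenly reduce it.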


Practice (10.2)

Why do some systems keep track of the type of a file, while others leave it to the user and others simply do not implement multiple file types?

Which system is “better”?


Practice (10.13)

What are the advantages and disadvantages of recording the name of the creating program with the file’s attributes (as is done in the Macintosh Operating System)?


File Access
• Sequential Access: information is processed in order, one record after another.
• Direct Access: views a file as a numbered sequence of blocks that may be accessed in any order.
• Direct access can be extended to use an index to help find locations within a file, which can reduce access time for finding information in a large file.
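
Example (a sketch, not from the slides): the three access styles differ only in how the next position is chosen. An in-memory file stands in for a disk file; the 16-byte block size, the block contents, and the index keys are hypothetical.

```python
import io

BLOCK_SIZE = 16   # hypothetical, tiny for demonstration


def read_block(f, n):
    """Direct access: jump straight to block n."""
    f.seek(n * BLOCK_SIZE)
    return f.read(BLOCK_SIZE)


# Four blocks filled with 'A', 'B', 'C', 'D'.
data = b"".join(bytes([65 + i]) * BLOCK_SIZE for i in range(4))
f = io.BytesIO(data)

# Sequential access: read one block after another from the start.
sequential = []
while True:
    block = f.read(BLOCK_SIZE)
    if not block:
        break
    sequential.append(block[:1])

# Direct access: fetch the third block without reading the first two.
third = read_block(f, 2)[:1]

# Indexed access: a small index maps a key to a block number.
index = {"alice": 0, "carol": 2}
carol_block = read_block(f, index["carol"])[:1]
```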


Storage Structure
• A disk can be subdivided into partitions.
  – Partitions are also known as minidisks or slices.
  – Disks or partitions can be RAID protected against failure.
  – A disk or partition can be used raw (without a file system) or formatted with a file system.
• An entity containing a file system is known as a volume.
  – Each volume containing a file system also tracks that file system’s info in a device directory or volume table of contents.
• In addition to general-purpose file systems, there are many special-purpose file systems, frequently all within the same operating system or computer.


Directory Overview
• A directory can be viewed as a symbol table that translates file names into directory entries.
• The following operations are performed on directories:
  – Search for a file
  – Create a file
  – Delete a file
  – List the contents of a directory
  – Rename a file
  – Traverse the file system: access every directory and file within a directory structure (e.g. for backup)
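
Example (a toy model, not from the slides): the symbol-table view can be sketched with a dictionary that maps names to entries. The class name, method names, and file names are hypothetical; a real directory entry would hold the attributes and location of the file.

```python
class Directory:
    """Toy directory: maps file names to directory entries."""

    def __init__(self):
        self.entries = {}   # name -> entry (here, a dict of attributes)

    def search(self, name):
        return self.entries.get(name)

    def create(self, name, **attributes):
        if name in self.entries:
            raise FileExistsError(name)
        self.entries[name] = attributes

    def delete(self, name):
        del self.entries[name]

    def list_names(self):
        return sorted(self.entries)

    def rename(self, old, new):
        if new in self.entries:
            raise FileExistsError(new)
        self.entries[new] = self.entries.pop(old)


d = Directory()
d.create("a.txt", size=10)
d.create("b.txt", size=20)
d.rename("a.txt", "c.txt")
d.delete("b.txt")
listing = d.list_names()
```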


Directory Structure: Single-Level Directory
• Simplest structure
• Requires files to have unique names
• Provides no facility for grouping files
• Not suitable for organizing a large number of files or for multiuser systems


Directory Structure: Two-Level Directory
• Provides a master file directory (MFD) for the system and a user file directory (UFD) for each user
• Isolates one user’s files from other users (good for protection, bad for collaboration)


Directory Structure: Tree-Structured Directory
• Allows users to create their own directories to organize files
• Each program must keep track of its current directory.
• Path names can be absolute or relative.
• A user could be allowed access to files of another user.
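
Example (a sketch, not from the slides): a relative path name is interpreted against the current directory, while an absolute one starts from the root. The helper `resolve` and the paths are hypothetical; `posixpath` is Python's standard module for UNIX-style path names.

```python
import posixpath


def resolve(current_directory, path):
    """Turn an absolute or relative path name into a normalized absolute one."""
    # join() discards current_directory when `path` is already absolute.
    return posixpath.normpath(posixpath.join(current_directory, path))


cwd = "/home/alice"
relative = resolve(cwd, "notes/todo.txt")
absolute = resolve(cwd, "/etc/passwd")
other_user = resolve(cwd, "../bob/shared.txt")
```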


Directory Structure: Acyclic-Graph Directory
• Allows subdirectories to be shared, existing in the file system in two (or more) places at once.
• This could be implemented by duplicating file information in different directories, but such an implementation may be hard to keep consistent.
• UNIX: implements shared files and directories via links, which are pointers to other files or directories.
• If a file is deleted, what happens to any links pointing to it?
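
Example (a sketch, not from the slides; UNIX-like systems): the two kinds of UNIX link behave differently when the original name is deleted. A hard link keeps the data alive through a link count, while a symbolic link is left dangling. The file names are hypothetical.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
original = os.path.join(workdir, "report.txt")   # hypothetical names
hard = os.path.join(workdir, "hard_link.txt")
soft = os.path.join(workdir, "soft_link.txt")

with open(original, "w") as f:
    f.write("data")

os.link(original, hard)      # hard link: second name for the same file
os.symlink(original, soft)   # symbolic link: stores the path name
links_before = os.stat(original).st_nlink

os.remove(original)          # delete the original name

# The hard link still reaches the contents.
with open(hard) as f:
    hard_contents = f.read()

# The symbolic link still exists but points at nothing.
soft_dangling = os.path.islink(soft) and not os.path.exists(soft)

os.remove(hard)
os.remove(soft)
os.rmdir(workdir)
```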


Directory Structure: General Graph Directory
• Allowing links could produce cycles in the directory graph.
• How could we guarantee no cycles?
• If we allow cycles, then we must design search algorithms so that no part of the file system is searched repeatedly.
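
Example (a sketch, not from the slides): one common way to avoid searching any part of the structure twice is to remember which directories have already been visited. The directory graph below is hypothetical, with a link from `alice` back to `home` forming a cycle.

```python
def traverse(graph, root):
    """Depth-first traversal that visits each node once, even with cycles."""
    visited = set()
    order = []
    stack = [root]
    while stack:
        node = stack.pop()
        if node in visited:
            continue            # already searched: skip to break the cycle
        visited.add(node)
        order.append(node)
        # Push children in reverse so they are visited in listed order.
        stack.extend(reversed(graph.get(node, [])))
    return order


# Hypothetical directory graph; "alice" links back to "home".
graph = {
    "/": ["home", "etc"],
    "home": ["alice"],
    "alice": ["home"],
}
order = traverse(graph, "/")
```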


Practice (10.4)

Could you simulate a multilevel directory structure with a single-level directory structure in which arbitrarily long names can be used?

If your answer is yes, explain how you can do so, and contrast this scheme with the multilevel directory scheme. If your answer is no, explain what prevents your simulation’s success.

How would your answer change if file names were limited to seven characters?


File-System Mounting
• Each volume must be mounted before it can be available to processes on the system.
• To mount a volume, the OS must be given the device name and the mount point (the location within the file structure where the new file system will be attached).
• What happens if a file system is mounted over a directory that contains files?
• Example: Macintosh searches new devices for a file system; if it finds a file system, it automatically mounts it at root level.
• Example: Windows maintains a two-level directory structure, with devices and volumes assigned drive letters, though recent versions allow a file system to be mounted at any point in the directory tree.
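
Example (a toy model, not from the slides): once volumes are mounted, the OS can decide which one holds a given path by finding the longest mount point that is a prefix of the path. The mount table, device names, and function name are hypothetical.

```python
import posixpath

# Hypothetical mount table: mount point -> device name.
mount_table = {
    "/": "disk0",
    "/home": "disk1",
    "/mnt/usb": "usb0",
}


def device_for(path):
    """Return the device whose mount point is the longest prefix of `path`."""
    path = posixpath.normpath(path)
    best = "/"
    for mount_point in mount_table:
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            if len(mount_point) > len(best):
                best = mount_point
    return mount_table[best]


print(device_for("/home/alice/notes.txt"))
print(device_for("/homework/essay.txt"))
```

Matching on whole path components matters: `/homework` is not under the `/home` mount point even though the strings share a prefix.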


File Sharing
• If a system supports multiple users, how does it provide protection and sharing of files?
• Many systems associate an owner and group with each file and directory.
  – The owner is the user who can change attributes and has the most control over the file.
  – The group is a set of users who share access to the file.
  – If a user requests an operation on a file, the OS determines whether the user is the owner of the file or part of the group, and thus whether the requested operation is permitted.


Remote File Systems
• In a distributed file system (DFS), remote directories are visible from a local machine.
• The client-server model allows clients to mount remote file systems from servers (e.g. H:\ drive at Huntington University).
• Standard operating system file calls are translated into remote calls.
• Distributed Information Systems (distributed naming services) such as LDAP, DNS, and Active Directory implement unified access to information needed for remote computing.
• Remote file systems add new failure modes, particularly due to network failure or server failure.
• Recovery from failure can involve state information about the status of each remote request.


Consistency Semantics
• Consistency semantics specify how multiple users are to access a shared file simultaneously.
• Similar to process synchronization algorithms from Chapter 6, but less complex due to the slow speed of disk I/O and network latency (for remote file systems).
• Unix file system (UFS) implements:
  – Writes to an open file are immediately visible to other users of the same open file.
  – Sharing of the file pointer to allow multiple users to read and write concurrently.
• Andrew File System (AFS) implemented complex remote file-sharing semantics:
  – Writes to an open file by one user are not immediately visible to other users who have the same file open.
  – Once a file is closed, the changes to it are visible only in sessions starting later; already open instances of the file do not change.
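
Example (a sketch, not from the slides; UNIX-like systems): both UNIX behaviours can be observed from a single program. Two independent opens show that a write is visible to another reader at once, and a duplicated descriptor (`os.dup`) shows a shared file pointer. The file name is hypothetical. AFS session semantics are not modelled here.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "shared.txt")   # hypothetical file name

writer = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
reader = os.open(path, os.O_RDONLY)          # separate open, own pointer

# A write through one open is visible at once through the other.
os.write(writer, b"abc")
seen = os.read(reader, 3)

# A duplicated descriptor shares the file pointer with the original.
shared = os.dup(writer)
os.write(shared, b"def")                     # continues at offset 3
offset = os.lseek(writer, 0, os.SEEK_CUR)    # pointer as seen by `writer`

for fd in (writer, reader, shared):
    os.close(fd)
os.unlink(path)
os.rmdir(workdir)
```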


Protection
• Protection involves keeping files safe from improper access.
• Protection mechanisms limit the types of access that can be made.
• Operations that might be controlled include: read, write, execute, append, delete, list attributes
• Access control is usually dependent on the identity of the user:
  – An access-control list (ACL) specifies the types of access allowed for each user
  – An ACL can be very long and difficult to manage
  – A simpler solution is to grant permissions to categories of users, such as owner, group, and universe.
  – ACLs may be combined with owner/group/universe permissions.
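
Example (a toy model, not from the slides): one way to combine the two schemes is to consult a per-user ACL entry first and fall back to the owner/group/universe categories. That precedence is an assumption made for this sketch, and real systems differ; the user names, group name, and data layout are hypothetical.

```python
def allowed(user, user_groups, op, file):
    """Decide whether `user` may perform `op` on `file`."""
    acl = file.get("acl", {})
    if user in acl:                       # assumption: a specific ACL entry wins
        return op in acl[user]
    if user == file["owner"]:
        return op in file["perms"]["owner"]
    if file["group"] in user_groups:
        return op in file["perms"]["group"]
    return op in file["perms"]["universe"]


report = {
    "owner": "alice",
    "group": "staff",
    "perms": {
        "owner": {"read", "write"},
        "group": {"read"},
        "universe": set(),
    },
    # Per-user exceptions: bob gains write access, mallory is shut out.
    "acl": {"bob": {"read", "write"}, "mallory": set()},
}

print(allowed("carol", {"staff"}, "read", report))
print(allowed("mallory", {"staff"}, "read", report))
```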


Protection
• UNIX allows read, write, and execute privileges to be granted or denied to each of owner, group, and universe.
• Windows provides protection options in the Security tab of the File Properties dialog box.
• Protection could also be achieved by requiring passwords in order to access files.
• Directories must be protected as well.
• If files may have numerous path names (when links exist), then a given user may have different access rights to a particular file, depending on the path name used.
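
Example (a sketch, not from the slides): the UNIX scheme stores nine permission bits, three for each category. Python's `stat` module names the bits and can render them in the familiar `ls -l` form. The mode 0o640 and the helper `may` are chosen for illustration.

```python
import stat

# One bit per (category, operation) pair.
BITS = {
    ("owner", "r"): stat.S_IRUSR,
    ("owner", "w"): stat.S_IWUSR,
    ("owner", "x"): stat.S_IXUSR,
    ("group", "r"): stat.S_IRGRP,
    ("group", "w"): stat.S_IWGRP,
    ("group", "x"): stat.S_IXGRP,
    ("universe", "r"): stat.S_IROTH,
    ("universe", "w"): stat.S_IWOTH,
    ("universe", "x"): stat.S_IXOTH,
}


def may(mode, category, op):
    """True if the permission bit for (category, op) is set in `mode`."""
    return bool(mode & BITS[(category, op)])


mode = 0o640   # owner: read/write, group: read, universe: nothing
text = stat.filemode(stat.S_IFREG | mode)
print(text)
```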


Practice (10.8)

Consider a system that supports 5,000 users. Suppose that you want to allow 4,990 of these users to be able to access one file.

a) How would you specify this protection scheme in UNIX?

b) Can you suggest another protection scheme that can be used more effectively for this purpose than the scheme provided by UNIX?


Practice (10.19)

What are some advantages and disadvantages of associating with remote file systems (stored on file servers) a different set of failure semantics from that associated with local file systems?