File Systems and Disk Management - UNI Department of ... · Disk management organizes disk blocks...
Transcript of File Systems and Disk Management - UNI Department of ... · Disk management organizes disk blocks...
File Systems and Disk Management
Sarah Diesburg Operating Systems
CS 3430
Design Goals of File Systems
Physical reality File system abstraction
Block-oriented Byte-oriented
Physical sectors Named files
No protection Users protected from one another
Data might be corrupted if machine crashes
Robust to machine failures
File System Components
Disk management organizes disk blocks into files
Naming provides file names and directories to users, instead of tracks and sector numbers (e.g. Diesburg)
Protection keeps information secure from other users
Reliability protects information loss due to system crashes
User vs. System View of a File
User level: individual files System call level: collection of bytes Operating system level:
A block is a logical transfer unit Even for getc() and putc() 4 Kbytes under UNIX
A sector is a physical transfer unit 512-byte sectors on disks
File: a named collection of blocks
User vs. System View of a File
A process Read bytes 2 to 12
OS Fetch the block containing those bytes Return those bytes to the process
User vs. System View of a File
A process Write bytes 2 to 12
OS Fetch the block containing those bytes Modify those bytes Write out the block
Ways to Access a File
People use file systems Design of file systems involves understanding
how people use file systems Sequential access—bytes are accessed in
order Random access (direct access)—bytes are
accessed in any order Content-based access—bytes are accessed
according to constraints on bye contents e.g., return 100 bytes starting with “aye carumba”
File Usage Patterns
Most files are small, and most references are to small files e.g., .login and .c files
Large files use up most of the disk space e.g., mp3 files
Large files account for most of the bytes transferred between memory and disk
Bad news for file system designers
File System Design Constraints
High performance Efficient access of small files
Many small files Used frequently
Efficient access of large files Consume most disk space Account for most of the data movement
Some Definitions
A file contains a file header, which associates the file with its disk sectors
File header
data block location
data block location
name
Some Definitions
A file system needs a disk allocation bitmap to represent free space on the disk, one bit per block
Disk Allocation Policies
Contiguous allocation Link-list allocation Segment-based allocation Indexed allocation Multi-level indexed allocation Hashed allocation
Contiguous Allocation
File blocks are stored contiguously on disk To allocate a file,
Specify the file size Search the disk allocation bitmap for
consecutive free blocks data block location
number of blocks
data block location
File header
Pros and Cons of Contiguous Allocation + Fast sequential access + Ease of computing random file locations
Adding an offset to the first disk block location - External fragmentation - Difficulty in growing files
Linked-List Allocation
Each file block on a disk is associated with a pointer to the next block A special marker to indicate the end of the file
e.g., MS-DOS file system File attribute table (FAT)
data block location
next block entry
data block location
next block entry
File header
Pros and Cons of Linked-List Allocation + Files can grow dynamically with incremental
allocation of blocks - Sequential access may suffer
Blocks may not be contiguous - Horrible random accesses
May involve multiple sequential searches - Unreliable
A corrupted pointer can lead to loss of the remaining file
Indexed Allocation
Uses a preallocated index to directly track the file block locations
data block location
data block location
data block location
data block location
File header
Pros and Cons of Indexed Allocation
+ Fast lookups and random accesses - File blocks may be scattered all over the disk
Poor sequential access Needs defragmenter
- Needs to reallocate index as the file size increases
Segment-Based Allocation
Needs a segment table to allocate multiple, contiguous regions of blocks
begin, end blocks
begin, end blocks
begin, end blocks
begin, end blocks
File header
Pros and Cons of Segment-Based Allocation + Relax the requirements for large contiguous
disk regions - Fragmentation 100%
Segment-based allocation Indexed allocation
- Random accesses not as fast as pure contiguous allocation
Multilevel Indexed Allocation
Certain index entries point to index blocks, as opposed to data blocks (e.g., Linux ext2)
data block location
index block location
index block location
index block location
data block location
index block location
index block location
data block location
data block location
File header
12 data block location
data block location data block location
data block location
index block location
Multilevel Indexed Allocation
A single indirect block contains pointers to data blocks
A double indirect block contains pointers to single indirect blocks
A triple indirect block contains pointers to double indirect blocks
Pros and Cons of Multilevel Indexed Allocation + Optimized for small and large files
Small files accessed through the first 12 pointers
Large files can grow incrementally - Multiple disk accesses to fetch a data block
under triple indirect block - Largest file size capped by the number of
pointers - Arbitrary file size boundaries among levels
Hashed Allocation
Allocates a disk block by hashing the block content to a disk location
data block location
data block location
data block location
data block location
Old file header
data block location
data block location
data block location
data block location
New file header
Pros and Cons of Hashed Allocation
+ File blocks of the same content can share the same disk block to save storage e.g., empty blocks
+ Good for backups and archival Small modifications to a large file result in only
additional storage of the changes - Poor disk performance