2. File System - Keio Universityweb.sfc.keio.ac.jp/~hagino/sa16/02.pdf · What is a File? • A...
Transcript of 2. File System - Keio Universityweb.sfc.keio.ac.jp/~hagino/sa16/02.pdf · What is a File? • A...
SOFTWARE ARCHITECTURE
2. FILE SYSTEM Tatsuya Hagino
1
https://vu5.sfc.keio.ac.jp/sa/login.php
lecture URL
Operating System Structure
2
Hardware
Operating System
Application
Bootstrap Device Management
Scheduler Memory Management
File System Process Management
System call processing
Network Management
What is a File? • A unit to store information in an external storage
• sometimes called dataset
• Characteristics of a file • Information in a file is non-volatile.
• Its content does not disappear when power is off.
• Information in a file is persistent. • It exists forever.
• You can start with what you have left.
• File System • Manage files on an external storage
• Windows → NTFS
• MacOS → HFS
• Unix → UFS
• File related conventions: • file name
• file structure
• file type
• file access method
3
File
File
External Storage
(SSD, HDD, USB)
File Naming Convention • Each file has a name.
• File name
• Letters used for file name • case insensitive: lower and upper letters are not distinguished.
• case sensitive: lower and upper letters are distinguished.
• Encoding of non alphabet characters (e.g. kanji file name)
• Length of file name • MS-DOS limits 8+3 characters.
• UNIX limits 255 characters.
• File extension • It may express its file type:
• Windows: use it to find associated program.
• Mac: uses to hold file type internally.
• UNIX: depends on applications • Compilers may use it to select programming language.
4
document.docx
file name
file extension
File
file name
File Structure
• Sequence of bytes • No structure in a file
• User can create arbitrary fields.
• A text file uses LF or CR to separate lines.
• May not be efficient when reading or writing.
• Sequence of records • Consist of fixed length
records.
• For a punch card, each record consists of 80 characters. (i.e. one record = one line)
• Efficiently read and write records.
5
char char char LF
char char char char char LF
char char char char LF
char char LF
char char char char char char
char char char char char char
char char char char char char
char char char char char char
line 1
line 2
line 3
line 4
record 1
record 2
record 3
record 4
vs
File Type
• Regular File:
• text file or binary file
• Directory:
• Folder
• Manage a set of files
• Character Special File:
• Input/output device
• Serial devices like terminal, printer, network, etc.
• Block Special File:
• Devices with block access (i.e. read/write blocks)
• HDD, SSD, etc.
• File system can be created on it.
6
File Access Method
• Sequential access • Read/write a file one byte
by one byte sequentially.
• Cannot skip or change the order of reading/writing.
• May rewind it and start from the beginning again.
• Random access • Read/write a file randomly
in any order.
• In UNIX, the next position can be specified using seek system call.
7
one by one
from beginning
to end in any
order
vs
Hierarchical File System • Implemented by allowing a directory to
have other directories inside.
• Implementation of a directory: • Each directory consists of a list of entries.
• Each entry consists of a file name and a pointer to a file. • A file can be pointed from more than one
directory.
• File sharing (hard link)
• Specifying a file • Use path name
• Concatenate file names with separation characters • UNIX → /
• Windows → ¥
• Old Mac OS → :
8
parent directory
regular
file
child directory
regular
file
directory
a.docx
bb.c
sa.tex
xyz
size
owner
access
blocks
file information
file
block
regular
file
File
Path Name • Absolute path name
• Starting from the root directory, sub directory names are listed with / separators.
• Relative path name • Starting from the current working directory
9
/
bin sbin usr etc var home tmp
bin sbin local root hagino
bin sa16
01.pptx 02.pptx
ls
grep
passwd
/home/hagino/sa16/02.pptx
sa16/02.pptx
root directory
current working directory
File Read/Write Mechanism
• UNIX as an example
• Directly use system calls to read/write a file
• Use standard input/output libraries
10
Application
Standard Input/Output Library
File System related System Call
Buffer Management
Device Driver Hardware HDD
SSD
OS
File System related System Calls
11
open
read
write
seek
ioctl
close
file name
(path name)
File
Descriptor
read data from the file
write data to the file
data
data
change the position of
next read/write
position
attributes
other operations
end access
start access
File Descriptor
• A small number returned by open system call • specifies an opened file
• used by read/write
• remembers a position of read/write
• Managed by each process • The meaning of file descriptors is local to each process.
• Even in a single process, if the same file is opened more than onece, different file descriptors are assigned.
• Special file descriptors: • standard input → 0
• standard output → 1
• standard error output → 2
12
Example program of reading a file using system calls
13
#include <fcntl.h>
int main(argc, char *argv[])
{
int fd, n;
char buf[512];
fd = open(argv[1], O_RDONLY);
while ((n = read(fd, buf, 512)) > 0) {
write(1, buf, n);
}
close(fd);
}
Standard Input/Output Library • System call
• Not easy to use • e.g. There is no system call to read one line.
• Inefficient for small usage • System call needs to go though OS protection boundary (user mode to supervise mode).
• Costs much more than calling a library.
• Standard input/output library • Useful interface
• read one line
• read one character
• Optimize system call access • Use buffer
14
Application
Standard input/output library
System call
fopen, fclose, fgetc, fputc, fgets, fputs
open, close, read, write
buffer
How to Use Standard Input/Output Libraries
15
fopen
fgetc
gets
fputc
fputs
fclose
file name
(path name)
FILE
structure
read one character
read one line
one character
one line
write one character
one character
one line
write one line
end access
start access
fscanf
convert string to value
value
fprintf
value
convert value to string
Example program of reading a file using standard
input/output libraries
16
#include <stdio.h>
int main(argc, char *argv[])
{
FILE fp;
int ch;
fp = fopen(argv[1], "r");
while ((ch = fgetc(fp)) >= 0) {
fputc(ch, stdout);
}
fclose(fp);
}
Implementation of Standard Input/Ouput Library
17
struct FILE {
int fd;
char buf[BUFSIZE];
int size;
int counter;
};
typedef struct FILE FILE;
FILE structure
FILE *fopen(char *path, char *mode) {
FILE *fp;
fp = (FILE *)malloc(sizeof(struct FILE));
fp->fd = open(path, ...);
fp->size = 0;
fp->counter = 0;
return fp;
}
fopen
counter buf
holds
temporal
data
next
read/write
potition
size
System call
Application
FILE structure
OS
Implementation of fgetc
18
int fgetc(FILE *fp) {
if (fp->counter >= fp->size) {
fp->size = read(fp->fd, fp->buf, BUFSIZE);
if (fp->size <= 0) return -1;
fp->counter = 0;
}
return fp->buf[fp->counter++];
}
fgetc
fgetc buf
(1) checks buf
OS
file
(2) read 512 bytes
assume
BUFSIZE = 512
(3) fill buf with
512 bytes (4) returns one
character
If there is data in buf,
no read
Implementation of System Call
19
Process Management Structure 0 1 2 3 4
File Descriptor Table
File
Descriptor
File
Descriptor
File
Descriptor
inode
block
numbers
inode
block
numbers
Block Device
inode (index node)
• For each file
• Manages file information
• owner
• size
• file blocks
File Descriptor
• For each opened file
• Holds read/write position
Implementation of open
• Returns a file descriptor for a given file
• Calls namei to find its inode.
• Finds an empty file descriptor slot and sets a new file structure.
20
int open(char *path, int flags, ...) {
struct inode *ip;
int fd;
struct file *fp;
ip = namei(path);
if (ip) {
fp = create a new file structure;
fp->inode = ip;
fp->offset = 0;
fp->refcount = 1;
fd = unused file descriptor of proc;
proc->file[fd] = fp;
return fd;
}
else return -1;
}
struct file {
struct inode *inode;
long offset;
int refcount;
};
file structure
namei
• Following the path, checks each directory to fined the
named file.
21
struct inode *namei(char *path) {
struct inode *dp;
if (*path == ’/’) {
dp = proc->root_directory;
path++;
} else dp = proc->current_working_directory;
while (*path) {
name = select next name from path;
dp = lookup(dp, name);
}
return dp;
}
Implementation of File • Each file consists of a list of blocks on a disk (or a block device).
• UNIX uses inode to manage block numbers • FAT file system uses FAT to manage
22
struct inode {
short mode;
int uid
int gid;
long size;
...
int atime;
int mtime;
int ctime;
...
int direct[12];
int indirect[3];
};
inode structure
owner
size
modification time
direct reference block numbers (12 blocks)
indirect block numbers (3 blocks)
inode inode
data
block data block
data
block
Filesystem on
a block device
Direct and Indirect Blocks
• Each inode points 12 direct blocks.
• When one block is 512 bytes, it can hold a file less than 512×12=6KB.
• An indirect block contains a list of blocks.
23
infomation
inode
direct 1
direct 2
direct 12
indirect 1
indirect 2 indirect 3
data
data
data
indirect block data
data
data
data
indirect block
indirect block
data
data
indirect block
Calculation of Block Number • From file position to calculate corresponding block number
• A file descriptor holds a position of read/write and the next read/write access the corresponding block.
24
int balloc(struct inode *ip, long offset) {
struct buf *bp;
int i;
blk = (offset + BLKSIZE - 1) / BLKSIZE;
if (blk < 12) return ip->direct[blk];
blk -= 12;
blocks = BLKSIZE/sizeof(int);
for (i = 0; i < 3; i++) {
if (blk < blocks) break;
blk -= blocks;
blocks *= BLKSIZE/sizeof(int);
}
bp = getblock(ip->indirect[i])
while (i-- > 0) {
blocks /= BLKSIZE/sizeof(int);
bp = getblock(bp->buf[blk/blocks]);
blk %= blocks;
}
return bp->buf[blk];
}
file descriptor file
position
read
write
Buffering • Disk blocks are cached in OS memory.
• Read: • First time: reads data from the disk and put it in the cache.
• Second time: do not access disk but use data in the cache.
• Write: • Do not write data to the disk immediately.
• Write them out all later.
25
Hash table
for buffers
buffer buffer
buffer
recently
used
buffer buffer
struct buf {
struct buf *next, *prev;
struct buf *queue_next, *queue_prev;
struct inode *inode;
int dirty;
int blkno;
char buf[BLKSIZE];
};
struct buf buffers[HASH];
buf structure
Disk Format
• A new disk needs to be formatted.
26
Superblock
File system on a disk
inode
list of unused blocks
data blocks
disk information
inode information
Unused blocks are managed by a list
or a bitmap.
actual data blocks
Summary
• File System
• UNIX file system as an example.
• System call vs Standard Input/Output Library
• Implementation of the library
• File Descriptor
• Implementation of File System
• inode
• Direct and indirect blocks
27