File System - web.cse.ohio-state.eduweb.cse.ohio-state.edu/~babic.1/B.2s.pdf · Unix/Linux File...


01/13/2020

File System
Presentation B

CSE 2431/5421: Introduction to Operating Systems

Gojko Babić
Study: 13.1–13.4, 2.1–2.4, 12.3.4

g. babic Presentation B 2

Name – this information is kept in human-readable form.

Type – needed for systems that support different file types.

Location – pointers to the file's location on the device.

Size – current file size.

Protection – controls who can read, write, and execute the file.

Time and date – data for protection, security, and usage monitoring.

Information about files (i.e. file attributes) is kept in the directory structure, which is maintained on the disk.

File Attributes
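On Unix/Linux, most of the attributes listed above can be read with the stat system call. The sketch below is only an illustration: the helper name print_attributes is hypothetical, and the on-disk location of the file is not shown because stat does not expose it.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical helper: print a few attributes of the file at `path`.
   Returns 0 on success and -1 if stat() fails. */
int print_attributes(const char *path)
{
    struct stat st;

    assert(path != NULL);
    if (stat(path, &st) < 0) {
        perror("stat");
        return -1;
    }
    printf("name:       %s\n", path);                           /* Name */
    printf("type:       %s\n",                                  /* Type */
           S_ISDIR(st.st_mode) ? "directory" :
           S_ISREG(st.st_mode) ? "regular file" : "other");
    printf("size:       %lld bytes\n", (long long)st.st_size);  /* Size */
    printf("protection: %o\n", (unsigned)(st.st_mode & 0777));  /* Protection */
    printf("modified:   %s", ctime(&st.st_mtime));              /* Time and date */
    return 0;
}
```

Calling print_attributes(".") prints the attributes of the current directory; a path that does not exist makes the helper return -1.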


A collection of nodes containing information (i.e. file attributes) about a set of files.

[Figure: a directory whose nodes point to the files F1, F2, F3, F4, …, Fn]

• Both the directory structure and the files reside on disk.

Directory Structure


A Typical File-System Organization


File Operations:
• Create
• Write
• Read
• Reposition within file
• Delete
• Open(file_name) – search the directory structure on disk for entry file_name, and move the content of the entry to memory.
• Close

Access Methods:

Sequential Access (based on a tape model of a file):
– read next
– write next
– reset

Direct (or Relative) Access:
– position to n (n = relative block number)
– read next
– write next

File Operations & Access Methods


Modes of file access: read, write, execute.

Three classes of users:

                                            R W X
─ owner access (3 bits used)   e.g. 7       1 1 1
─ group access (3 bits used)   e.g. 5       1 0 1
─ public access (3 bits used)  e.g. 1       0 0 1

The file owner is able to control what can be done and by whom.

Unix command to set the above access rights (i.e. owner: read, write & execute; group: read & execute; world: execute) for the file game:

    chmod 751 game

The system administrator creates a group (with a system-unique name), say G, and adds some users to the group.

Unix command to attach the group G to the file game:

    chgrp G game

Unix/Linux: File Protection
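To make the bit layout concrete, the following sketch expands a 9-bit mode such as octal 751 into the familiar rwx notation. The function name mode_to_string is hypothetical and is not part of any Unix API.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: expand a 9-bit mode such as 0751 into "rwxr-x--x".
   `out` must have room for at least 10 characters. */
void mode_to_string(unsigned mode, char out[10])
{
    const char symbols[] = "rwxrwxrwx";

    assert(out != NULL);
    for (int i = 0; i < 9; i++) {
        /* Bit 8 is owner-read, bit 0 is public-execute. */
        out[i] = (mode & (1u << (8 - i))) ? symbols[i] : '-';
    }
    out[9] = '\0';
}
```

For the mode used in the chmod example, mode_to_string(0751, s) leaves "rwxr-x--x" in s: owner 7 = 111, group 5 = 101, public 1 = 001.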


In order to use a file, you first need to ask for it by name. This is called opening the file. The open system call creates an operating system object called an open file. The open file is logically connected to the file you named in the open system call. An open file has a file location associated with it, which is the offset in the file where the next read or write will start.

After you open a file, you can use the read or write system calls to read or write a number of characters. Each read or write system call advances the file location by the number of characters read or written.

Unix/Linux File System


Thus, the file is read (or written) sequentially by default.

The lseek system call is used to achieve random access into the file, since it changes the file location that will be used for the next read or write.

To create a new file, you use the creat system call.

You close the open file using the close system call when you are done using it.

You delete the file from a directory using the unlink system call.

Unix/Linux File System (cont.)
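As an illustration of random access, the sketch below opens a file, moves the file location with lseek, and reads from that position. The helper name read_at and its parameters are assumptions made for this example only.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read up to `count` bytes starting at byte `offset`
   of the file `path`. Returns the number of bytes read, or -1 on error. */
ssize_t read_at(const char *path, off_t offset, char *buffer, size_t count)
{
    assert(path != NULL && buffer != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    ssize_t n = -1;
    /* Move the file location; the next read starts at `offset`. */
    if (lseek(fd, offset, SEEK_SET) != (off_t)-1)
        n = read(fd, buffer, count);

    close(fd);
    return n;
}
```

Without the lseek call the read would start at offset 0, which is the sequential default described above.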


System Call   Parameters            Returns   Notes
open          name, flags           fid       Connects to open file
creat         name, mode            fid       Creates file and connects to open file
read          fid, buffer, count    count     Reads bytes from open file
write         fid, buffer, count    count     Writes bytes to open file
lseek         fid, offset, mode     offset    Moves to position of next read or write
close         fid                   code      Disconnects open file
unlink        name                  code      Deletes named file

Unix/Linux File System Calls


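/* This program reads the first 10 characters from the existing file ABC into the array buffer. */
#include <fcntl.h>   /* open, O_RDONLY */
#include <stdio.h>   /* printf */
#include <unistd.h>  /* read, close */

int main(void)
{
    int y, z;
    ssize_t x;
    char buffer[100];

    y = open("ABC", O_RDONLY);   /* the slide passes 0, the value of O_RDONLY on Linux */
    if (y < 0) { printf("error with open\n"); return 1; }

    x = read(y, buffer, 10);     /* returns the number of bytes actually read */
    if (x < 0) { printf("error with read\n"); return 1; }

    z = close(y);
    if (z < 0) { printf("error with close\n"); return 1; }

    printf("done\n");
    return 0;
}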

A Simple Program with File System Calls


Note: Processing in kernel mode may initiate some disk I/O

Processing of Open System Call


C program invoking printf() library call, which calls write() system call

Standard C Library Example


#include <stdio.h>

int main()
{
    printf("hello World");
    return 0;
}

Compilation command: gcc -O1 -S hello.c
generated this x86-64 assembly code (with minor changes in red):

    .file "hello.c"
.LC0:
    .string "hello World"
    .text
    .globl main
    .type main, @function
main:
    subq $8, %rsp
    movq $.LC0, %rdi
    movq $0, %rax
    call printf
    movq $0, %rax
    addq $8, %rsp
    ret

Source: Bryant & O'Hallaron: "Computer Systems, 2nd edition"

Compiling hello.c

System calls are provided on IA32 via an exception-causing instruction int n, where n can be 0–255, although historically system calls have been provided through exception 128 (0x80).

By convention, register %eax contains the system call number, and registers %ebx, %ecx, %edx, %esi, %edi and %ebp contain up to 6 arguments.

Examples of system call numbers:

exit: 1      close: 6      getpid: 20
fork: 2      wait: 7       kill: 37
read: 3      creat: 8      pipe: 42
write: 4     unlink: 10    umask: 60
open: 5      execve: 11    dup2: 63
             lseek: 19     gettimeofday: 78

Linux/IA32 System Calls


#include <unistd.h>

int main()
{
    write(1, "hello world", 11);
    return 0;
}
/* This is a version of the familiar hello program */

Below is an implementation of the hello program directly with Linux system calls:

.section .data
string:
    .ascii "hello world"
string_end:
    .equ len, string_end - string

.section .text
.globl main
main:
    # system call: write(1, "hello world", 11)
    movl $4, %eax          # System call number 4
    movl $1, %ebx          # stdout has descriptor 1
    movl $string, %ecx     # "hello world" string
    movl $len, %edx        # String length
    int $0x80              # System call code
    movl $0, %eax
    ret

Linux/IA32 System Calls

• Implementing system calls requires a control transfer which involves some sort of architecture-specific feature. A typical way to implement this is to use a software interrupt or trap. Interrupts transfer control to the operating system kernel so software simply needs to set up some register with the system call number needed, and execute the software interrupt.

• For many RISC processors this is the only technique provided, but CISC architectures such as x86 support additional techniques. Examples are SYSCALL/SYSRET and SYSENTER/SYSEXIT. These are "fast" control transfer instructions that are designed to quickly transfer control to the OS for a system call without the overhead of an interrupt.

• Linux 2.5 began using this on the x86, where available; formerly it used the INT instruction, where the system call number was placed in the EAX register before interrupt 0x80 was executed.

Wikipedia: System Call Typical Implementation
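From C, the same "number in a register, then trap" convention can be exercised without writing assembly. On Linux with glibc, the syscall() library function takes the system call number as its first argument and performs whichever control transfer the platform uses. The sketch below is Linux-specific, and the helper name raw_write_stdout is hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: write `msg` to standard output by system call number,
   bypassing the write() wrapper. Returns the number of bytes written, or -1. */
long raw_write_stdout(const char *msg)
{
    assert(msg != NULL);
    /* SYS_write expands to the number for the target platform:
       4 on IA32, 1 on x86-64. */
    return syscall(SYS_write, 1, msg, strlen(msg));
}
```

Using the SYS_write constant rather than a literal 4 keeps the example correct on both IA32 and x86-64, where the numbering differs.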


For each process, a set of open file identifiers numbered 0, 1, 2, and so on can be issued for I/O transactions between the process and the Unix/Linux operating system. The first three open file identifiers are automatically assigned I/O channels when a new process is created:

– Open file identifier 0 (called the standard input) is connected to the keyboard for input,

– Open file identifier 1 (called the standard output) is connected to the terminal for output,

– Open file identifier 2 (called the standard error) is connected to the terminal for output.

Most Unix/Linux commands receive input from the standard input, produce output to the standard output, and send error messages through the standard error. However, the standard I/O channels of a command can be redirected.

Unix/Linux Standard Input and Output
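Redirection can be sketched with the dup and dup2 system calls, which is roughly what a shell does for `command > file`. This is a minimal illustration under that assumption; the helper name redirect_stdout is hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: make open file identifier 1 (standard output) refer
   to the file `path`. Returns a duplicate of the old standard output so the
   caller can restore it later, or -1 on error. */
int redirect_stdout(const char *path)
{
    assert(path != NULL);

    int saved = dup(STDOUT_FILENO);          /* remember the terminal */
    if (saved < 0)
        return -1;

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        close(saved);
        return -1;
    }
    if (dup2(fd, STDOUT_FILENO) < 0) {       /* identifier 1 now refers to the file */
        close(fd);
        close(saved);
        return -1;
    }
    close(fd);                               /* identifier 1 keeps the file open */
    return saved;
}
```

After the call, anything written to identifier 1 goes to the file; calling dup2(saved, 1) reconnects standard output to where it pointed before.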

Disks hold an enormous amount of data – on the order of hundreds to thousands of gigabytes, compared to hundreds to thousands of megabytes in memory.

Disks are slower than RAM-based memory – it takes on the order of milliseconds to read information on a disk.

[Figure: disk drive with labeled spindle, arm, actuator, platters, SCSI connector, and electronics (including a processor and memory!)]

Disc Storage


• Disks consist of platters, each with two surfaces.

• Each surface consists of concentric rings called tracks.

• Each track consists of sectors separated by gaps.

[Figure: one disk surface around the spindle, showing tracks, track k, sectors, and gaps]

Disc Geometry

Capacity is defined as the maximum number of bits that can be recorded on a disk. It is determined by the following factors:

• Recording density (bits/in): the number of bits on a 1-inch segment of a track.

• Track density (tracks/in): the number of tracks on a 1-inch segment of the radius extending from the center of the platter.

• Areal density (bits/in²): the product of recording density and track density.

Determination of areal density:

• Original disks partitioned every track into the same number of sectors, which was determined by the innermost track. This resulted in sectors being spaced further apart on outer tracks.

• Modern disks partition the tracks into disjoint subsets called recording zones.

• Each track within a zone has the same number of sectors, determined by the innermost track of the zone.

• Each zone has a different number of sectors/track.

Disc Capacity


Capacity = (#bytes/sector) x (avg #sectors/track) x (#tracks/surface) x (#surfaces/platter) x (#platters/disk)

Example:

• 512 bytes/sector

• Average of 300 sectors/track

• 20,000 tracks/surface

• 2 surfaces/platter

• 5 platters/disk

Capacity = 512 x 300 x 20,000 x 2 x 5 = 30,720,000,000 bytes = 30.72 GB.


Computing Disc Capacity
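The capacity formula is a plain product of five factors, so it can be checked with a few lines of C. The function name disk_capacity_bytes and its parameter names are assumptions made for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: capacity in bytes as the product of the five factors
   from the formula. 64-bit arithmetic is used because the example result
   does not fit in 32 bits. */
uint64_t disk_capacity_bytes(uint64_t bytes_per_sector,
                             uint64_t avg_sectors_per_track,
                             uint64_t tracks_per_surface,
                             uint64_t surfaces_per_platter,
                             uint64_t platters_per_disk)
{
    assert(bytes_per_sector > 0);
    return bytes_per_sector * avg_sectors_per_track * tracks_per_surface *
           surfaces_per_platter * platters_per_disk;
}
```

With the example values, disk_capacity_bytes(512, 300, 20000, 2, 5) returns 30,720,000,000, i.e. 30.72 GB when 1 GB is taken as 10^9 bytes.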

The disk surface spins at a fixed rotational rate. Rotation is counter-clockwise.

By moving radially, the arm can position the read/write head over any track.

The read/write head is attached to the end of the arm and flies over the disk surface on a thin cushion of air.

Disc Operations (Single-Platter View)


[Figure: four snapshots of the disk (after BLUE read, seek for RED, rotational latency, after RED read) showing the components of a complete read of the red sector: seek, rotational latency, data transfer]

Disc Access – Service Time Components

Average access time for a sector:

Taccess = Tavg seek + Tavg rotation + Tavg transfer

Seek time (Tavg seek):

• Time to position the heads over the cylinder.

• Typical Tavg seek is 3–9 ms; the maximum can be as high as 20 ms.

Rotational latency (Tavg rotation):

• Once the head is positioned over the track, the time it takes for the first bit of the sector to pass under the head.

• In the worst case, the head just misses the sector and waits for the disk to make a full rotation:

Tmax rotation = (1/RPM) x (60 secs/1 min)

• The average case is ½ of the worst case:

Tavg rotation = (1/2) x (1/RPM) x (60 secs/1 min)

• Typical rotation speed = 7200 RPM.


Calculating Access Time


Transfer time (Tavg transfer):

• Time to read the bits in the sector.

• Time depends on the rotational speed and the number of sectors per track.

• Estimate of the average transfer time:

Tavg transfer = (1/RPM) x (1/(avg #sectors/track)) x (60 secs/1 min)

Example:

• Rotational rate = 7200 RPM

• Average seek time = 9 ms

• Avg #sectors/track = 400

Tavg rotation = 1/2 x (60 secs/7200 RPM) x (1000 ms/sec) ≈ 4 ms

Tavg transfer = (60 secs/7200 RPM) x (1/400 sectors/track) x (1000 ms/sec) ≈ 0.02 ms

Taccess = 9 ms + 4 ms + 0.02 ms ≈ 13.02 ms


Calculating Access Time (cont.)
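The three terms of the access time can be computed directly from the rotational rate, the seek time, and the number of sectors per track. The function names below are hypothetical, and results are in milliseconds.

```c
#include <assert.h>

/* Average rotational latency: half of one full rotation, in ms. */
double avg_rotation_ms(double rpm)
{
    assert(rpm > 0.0);
    return 0.5 * (60.0 / rpm) * 1000.0;
}

/* Average transfer time: one rotation divided by the sectors per track, in ms. */
double avg_transfer_ms(double rpm, double avg_sectors_per_track)
{
    assert(rpm > 0.0 && avg_sectors_per_track > 0.0);
    return (60.0 / rpm) * (1.0 / avg_sectors_per_track) * 1000.0;
}

/* Average access time: seek + rotational latency + transfer, in ms. */
double avg_access_ms(double avg_seek_ms, double rpm, double avg_sectors_per_track)
{
    return avg_seek_ms + avg_rotation_ms(rpm) +
           avg_transfer_ms(rpm, avg_sectors_per_track);
}
```

For 7200 RPM, a 9 ms seek, and 400 sectors/track, the unrounded terms are about 4.17 ms and 0.021 ms, so avg_access_ms gives roughly 13.19 ms. The example's 13.02 ms comes from rounding the rotational latency down to 4 ms first.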

The time to access the 512 bytes in a disk sector is dominated by the seek time (9 ms) and the rotational latency (4 ms).

Accessing the sector takes a long time, but transferring the bits is basically free.

Since seek time and rotational latency are roughly the same, or at least of the same order of magnitude, doubling the seek time is a reasonable estimate for the access time.

Comparison of access times of various storage devices when reading a comparable 512-byte sector-sized block:

• SRAM 256 ns

• DRAM 5000 ns

• Disk 10 ms

• With these figures, disk is about 40,000 times slower than SRAM and about 2,000 times slower than DRAM.


Access Time


After receiving a read system call from a given process (program) to read, e.g., 50 characters from an already open file, the operating system maps the read request to the appropriate block number.

Here are the steps in performing a disc read operation:

1. The operating system provides the disc I/O controller with: the memory buffer address, the block number, and the type of operation: read (or write).

2. While the CPU executes the code of some other process (and while the read-issuing process is blocked, i.e. it cannot run), the disc I/O controller maps the block number to a sector address (a surface #, a track # and a sector #) and:

a. positions a head over the appropriate track (seek time),

b. waits for the desired sector to rotate to the head (rotational latency),

Processing of Read File System Call


c. transfers 512 bytes of data (i.e. one sector) from the disc to the local controller memory, and then those characters are copied by DMA into the appropriate buffer in main memory.

3. When done, the disc controller sends a hardware interrupt to the CPU.

The typical time to perform a disc I/O operation is 10–20 milliseconds.

The CPU is interrupted from running some process, and the operating system takes the 50 characters from the memory buffer and copies them into the address space of the read-issuing process. The read-issuing process is now eligible to run.

Processing of Typical File System Call


Assume that it takes 20 milliseconds and 25 milliseconds to perform a read and a write disc operation, respectively, and that it takes 1 millisecond and 0.5 millisecond for the operating system to process a system call and a hardware interrupt, respectively. Also, assume that the given file is open, that the given process is the only active process in the system, and that no error happens during the execution of system calls or I/O operations.

Then, try to estimate the duration of the time period which starts when the process issues the following system call:

x = read (fd, ch, 20)

and ends when the process starts the execution of the instruction that follows. Since there are no other processes in the system, this is the period during which the issuing process is blocked.

Estimating Time Process Is Blocked


The usual (normal) time = 1 + 20 + 0.5 = 21.5 milliseconds

The maximum time = 1 + 20 + 0.5 + 20 + 0.5 = 42 milliseconds

The minimum time = 1 millisecond

Estimating Time Process Is Blocked (cont.)

• The period we are calculating includes times for:

– operating system processing of hardware interrupts and the system call

– I/O controller performing I/O operation(s)

– hardware processing of interrupts (exceptions) and execution of the interrupt-causing instruction (both assumed to be negligible for this calculation)
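The three estimates follow one pattern: one system call, plus one disc read and one interrupt for every block that has to be fetched. The sketch below encodes that pattern. Reading the minimum as "zero disc reads" (the data is already in memory) and the maximum as "two disc reads" (the 20 characters span two blocks) is an interpretation, not something the slides state, and the names are hypothetical.

```c
#include <assert.h>

#define SYSCALL_MS   1.0   /* O.S. processing of the system call */
#define INTERRUPT_MS 0.5   /* O.S. processing of one hardware interrupt */
#define DISK_READ_MS 20.0  /* one read disc operation */

/* Hypothetical helper: time in ms that the process is blocked when the
   read system call needs `disk_reads` disc operations. */
double read_blocked_ms(int disk_reads)
{
    assert(disk_reads >= 0);
    return SYSCALL_MS + disk_reads * (DISK_READ_MS + INTERRUPT_MS);
}
```

read_blocked_ms(0), read_blocked_ms(1), and read_blocked_ms(2) give 1, 21.5, and 42 milliseconds, matching the minimum, usual, and maximum times.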


• Let us try to estimate the duration of the time period needed for the O.S. to process this system call:

x = write (fd, ch, 20)

The usual (normal) time = 1 + 20 + 0.5 + 25 + 0.5 = 47 milliseconds

The maximum time = 1 + 20 + 0.5 + 25 + 0.5 + 20 + 0.5 + 25 + 0.5 = 93 milliseconds

• How long will the issuing process be blocked?

• Note: This time is much more difficult to estimate than the time for a read system call, since the O.S. may not wait for the write disc operation to finish before it unblocks the issuing process.

Estimating Time For Processing Write Sys Call


Search for a file

Delete a file

List a directory

Rename a file

Traverse the file system

Organize (logically) directory for:

– Efficiency: locating a file quickly.

– Naming: to be convenient to users.

1. Two users can have the same name for different files.

2. The same file can have several different names.

– Grouping: logical grouping of files by properties (e.g., all Java programs, all games, …)

Operations Performed on Directory


A single-level directory for all users.

• Naming problem

• Grouping problem

Single-Level Directory


Separate directory for each user.

• Path name

• Can have the same file name for different users

• Efficient searching

• But no grouping capability

Two-Level Directory


Tree-Structured Directories

Efficient searching

Grouping Capability

Concept of current directory (working directory)


Have shared subdirectories and files.

Acyclic-Graph Directories

Two different names (aliasing) for the same file.

Unlink file = Delete file entry from a directory

File is deleted when a reference count reaches zero
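On Unix/Linux, aliasing is done with the link system call, and the reference count is visible as the st_nlink field returned by stat. The sketch below creates a file, gives it a second name, and removes the names one at a time. The helper names link_count and alias_demo are hypothetical, and the file names are supplied by the caller.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: reference (hard link) count of `path`, or -1 on error. */
long link_count(const char *path)
{
    struct stat st;

    assert(path != NULL);
    if (stat(path, &st) < 0)
        return -1;
    return (long)st.st_nlink;
}

/* Creates `name`, aliases it as `alias`, then unlinks both names.
   counts[0..2] receive the reference count seen after each step.
   Returns 0 on success, -1 on error. */
int alias_demo(const char *name, const char *alias, long counts[3])
{
    int fd = open(name, O_CREAT | O_WRONLY | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    close(fd);
    counts[0] = link_count(name);     /* one name */

    if (link(name, alias) < 0) {      /* second name for the same file */
        unlink(name);
        return -1;
    }
    counts[1] = link_count(name);     /* two names */

    unlink(name);                     /* removes one directory entry only */
    counts[2] = link_count(alias);    /* the file still exists under `alias` */

    unlink(alias);                    /* count reaches zero: the file is deleted */
    return 0;
}
```

For a freshly created file the three counts are 1, 2, and 1; after the final unlink, stat on either name fails because no directory entry refers to the file any more.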


General Graph Directories

• Unix avoids cycles by prohibiting multiple references to directories.