SECONDARY STORAGE MANAGEMENT

Transcript of Secondary Storage Management 1

Page 1: Secondary Storage Management 1

1

SECONDARY STORAGE MANAGEMENT

Page 2: Secondary Storage Management 1

2

Overview

• The file management system maintains the file system and its directories and also keeps track of free secondary storage space.

• The I/O system provides device drivers that actually control the transfer of data between memory and the secondary storage devices.

• The secondary storage management system optimizes the completion of I/O tasks by employing algorithms to facilitate more efficient disk usage.

Page 3: Secondary Storage Management 1

3

Disk Structure

• Disk drives are addressed as large 1-dimensional arrays of logical blocks, where the logical block is the smallest unit of transfer (usually 512 bytes).

• The 1-dimensional array of logical blocks is mapped into the sectors of the disk sequentially.
  – Sector 0 is the first sector of the first track on the outermost cylinder.
  – Mapping proceeds in order through that track, then through the rest of the tracks in that cylinder, and then through the rest of the cylinders from outermost to innermost.

Page 4: Secondary Storage Management 1

4

• In theory, we can convert a logical block number into a disk address (i.e. cylinder #, track #, sector #).

• In practice, it is difficult to perform this translation. Disks may have defective sectors, and the number of sectors per track is not constant on some drives.
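As a sketch of the "in theory" translation above, the Python below assumes an idealized drive with a fixed geometry: a constant number of sectors per track, a fixed number of tracks (surfaces) per cylinder, and no defective sectors. The geometry constants are illustrative, not taken from the slides.

```python
# Hypothetical, idealized geometry (assumed for illustration only).
SECTORS_PER_TRACK = 63
TRACKS_PER_CYLINDER = 16   # i.e. the number of recording surfaces / heads

def block_to_chs(block: int):
    """Map a logical block number to (cylinder, track, sector)."""
    sector = block % SECTORS_PER_TRACK
    track = (block // SECTORS_PER_TRACK) % TRACKS_PER_CYLINDER
    cylinder = block // (SECTORS_PER_TRACK * TRACKS_PER_CYLINDER)
    return cylinder, track, sector

print(block_to_chs(0))    # (0, 0, 0): sector 0 of the first track of the outermost cylinder
print(block_to_chs(100))  # (0, 1, 37)
```

On real drives, defective sectors and a varying number of sectors per track break this simple arithmetic, which is exactly the difficulty noted above.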

Page 5: Secondary Storage Management 1

5

Disk Scheduling

• The operating system is responsible for using hardware efficiently — for the disk drives, this means having a fast access time and disk bandwidth.

• Access time has two major components:
  – Seek time is the time for the disk arm to move the heads to the cylinder (track) containing the desired sector.
  – Rotational latency is the additional time spent waiting for the disk to rotate the desired sector under the disk head.

• We want to minimize seek time.

• Seek time ≈ seek distance.

Page 6: Secondary Storage Management 1

6

• The transfer time T to/from the disk depends on the rotation speed of the disk:

  T = (no. of bytes to be transferred) / (no. of bytes on a track × rotation speed)

• Disk bandwidth is the total number of bytes transferred, divided by the total time between the first request for service and the completion of the last transfer.

• We can improve the access time and bandwidth by scheduling the servicing of disk I/O requests in a good order.
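A small numeric sketch of the transfer-time formula and the bandwidth definition above. The disk parameters (7200 RPM, 400 sectors of 512 bytes per track, 10 ms of seek plus rotational delay per request) are assumed values for illustration, not figures from the slides.

```python
bytes_to_transfer = 4096            # one 4 KB request
bytes_per_track   = 512 * 400       # assumed: 400 sectors of 512 bytes per track
rotation_speed    = 7200 / 60       # 7200 RPM expressed in revolutions per second

# T = bytes to transfer / (bytes per track * rotation speed)
transfer_time = bytes_to_transfer / (bytes_per_track * rotation_speed)
print(f"transfer time ~ {transfer_time * 1000:.3f} ms")        # ~0.167 ms

# Disk bandwidth: total bytes moved divided by the total time between the
# first request and the completion of the last transfer (10 requests here,
# each paying an assumed 10 ms of seek + rotational delay).
total_bytes  = 10 * bytes_to_transfer
total_time_s = 10 * (0.010 + transfer_time)
print(f"bandwidth ~ {total_bytes / total_time_s / 1e6:.2f} MB/s")
```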

Page 7: Secondary Storage Management 1

7

• Read section 14.2 on pages 488 - 489 of the text for an example comparing the transfer times for sequential and random access.

Page 8: Secondary Storage Management 1

8

Whenever a process needs I/O to or from the disk, it issues a system call to the OS. The request specifies the following pieces of info:
  – Whether this operation is input or output
  – What the disk address for the transfer is
  – What the memory address for the transfer is
  – What the number of bytes to be transferred is

If the desired disk drive and controller are available, the request can be serviced immediately. If the drive or controller is busy, any new requests will be placed in the queue of pending requests for that drive. The OS chooses which pending request to service next.
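A minimal sketch of the information carried by one disk I/O request and of the per-drive queue of pending requests described above. The class and field names are purely illustrative; they are not an actual OS interface.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class DiskRequest:
    is_input: bool       # input (read) or output (write)
    disk_address: int    # disk address (e.g. logical block) for the transfer
    memory_address: int  # memory buffer address for the transfer
    num_bytes: int       # number of bytes to be transferred

pending = deque()        # queue of pending requests for one busy drive
pending.append(DiskRequest(True, 98, 0x7F00, 512))
pending.append(DiskRequest(False, 183, 0x8000, 1024))

next_request = pending.popleft()   # the OS's scheduling policy decides this order
```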

Page 9: Secondary Storage Management 1

9

Disk Scheduling (Cont.)

• Several algorithms exist to schedule the servicing of disk I/O requests. These include:

1. First-Come-First-Serve

2. Shortest Seek Time First

3. SCAN

4. Circular SCAN (C-SCAN)

5. LOOK

Page 10: Secondary Storage Management 1

10

FCFS

Illustration shows total head movement of 640 cylinders.
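As a rough cross-check of that illustration, here is a minimal Python sketch of FCFS head movement, assuming the classic textbook request queue (cylinders 98, 183, 37, 122, 14, 124, 65, 67) with the head starting at cylinder 53; those request values are an assumption, since the figure itself is not reproduced here.

```python
def fcfs_movement(requests, head):
    """Total head movement when requests are serviced strictly in arrival order."""
    total = 0
    for r in requests:
        total += abs(r - head)
        head = r
    return total

queue = [98, 183, 37, 122, 14, 124, 65, 67]
print(fcfs_movement(queue, 53))   # 640 cylinders
```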

Page 11: Secondary Storage Management 1

11

Shortest Service Time First

• Selects the request with the minimum seek time from the current head position.

• SSTF scheduling is a form of SJF scheduling; may cause starvation of some requests.

• Illustration shows total head movement of 236 cylinders.
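A matching sketch of SSTF, again assuming the same illustrative queue and a starting head position of 53: at each step the pending request closest to the current head position is chosen.

```python
def sstf_movement(requests, head):
    """Total head movement when the nearest pending request is always served next."""
    pending, total = list(requests), 0
    while pending:
        nearest = min(pending, key=lambda r: abs(r - head))
        total += abs(nearest - head)
        head = nearest
        pending.remove(nearest)
    return total

print(sstf_movement([98, 183, 37, 122, 14, 124, 65, 67], 53))   # 236 cylinders
```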

Page 12: Secondary Storage Management 1

12

SSTF (Cont.)

Page 13: Secondary Storage Management 1

13

SCAN

• The disk arm starts at one end of the disk, and moves toward the other end, servicing requests until it gets to the other end of the disk, where the head movement is reversed and servicing continues.

• Sometimes called the elevator algorithm.

• Illustration shows total head movement of 208 cylinders.

Page 14: Secondary Storage Management 1

14

SCAN (Cont.)

Page 15: Secondary Storage Management 1

15

C-SCAN

• Provides a more uniform wait time than SCAN.

• The head moves from one end of the disk to the other, servicing requests as it goes. When it reaches the other end, however, it immediately returns to the beginning of the disk, without servicing any requests on the return trip.

• Treats the cylinders as a circular list that wraps around from the last cylinder to the first one.
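A minimal sketch of the C-SCAN service order, assuming the head is currently sweeping toward higher-numbered cylinders and using the same illustrative request queue as above; it returns only the order of service, not the head-movement total.

```python
def cscan_order(requests, head):
    """Requests at or above the head are served on the outward sweep; the head
    then jumps back to the beginning of the disk and serves the rest."""
    outward = sorted(r for r in requests if r >= head)
    wrapped = sorted(r for r in requests if r < head)
    return outward + wrapped

print(cscan_order([98, 183, 37, 122, 14, 124, 65, 67], 53))
# [65, 67, 98, 122, 124, 183, 14, 37]
```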

Page 16: Secondary Storage Management 1

16

C-SCAN (Cont.)

Page 17: Secondary Storage Management 1

17

C-LOOK

• Version of C-SCAN

• Arm only goes as far as the last request in each direction, then reverses direction immediately, without first going all the way to the end of the disk.

• That is, it looks for a request before continuing to move in a given direction.

Page 18: Secondary Storage Management 1

18

C-LOOK (Cont.)

Page 19: Secondary Storage Management 1

19

Page 20: Secondary Storage Management 1

20

Page 21: Secondary Storage Management 1

21

Disk Management

The OS is responsible for several other aspects of disk management. These include:

• Disk formatting

• Booting from disk

• Bad-block recovery

Page 22: Secondary Storage Management 1

22

Disk Formatting

• Low-level formatting, or physical formatting: dividing a disk into sectors that the disk controller can read and write.

• The disk is filled with a special data structure for each sector.

• The data structure has a header, data area and trailer.

• To use a disk to store files, the operating system still needs to record its own data structures on the disk. It does so by:
  – Partitioning the disk into one or more groups of cylinders. Each partition can be treated as a separate disk.
  – Logical formatting, or “making a file system”.

Page 23: Secondary Storage Management 1

23

Bad Block Handling

• On simple disks (e.g. with IDE controllers), bad blocks are handled manually. For example, in MS-DOS the format command does a logical format of the disk and scans the disk for bad blocks. If a bad block is found, an entry is made in the FAT structure to mark that block as unusable.

• If a block goes bad during normal operation, chkdsk can be run to manually search for and record bad blocks.

Page 24: Secondary Storage Management 1

24

MS-DOS Disk Layout

Page 25: Secondary Storage Management 1

25

Bad Blocks (Cont.)

• More sophisticated disks, such as SCSI disks, are smarter about bad block recovery. The bad block list is initialized during low-level formatting at the factory and is updated over the life of the disk.

• The controller can be configured to replace each bad sector logically with a spare sector. A typical bad sector recovery might proceed as follows:
  – The OS tries to read logical block 87.
  – The controller calculates the error-correcting code (ECC) and finds that the sector is bad. It reports this to the OS.
  – When the system is rebooted, a special command tells the SCSI controller to replace the bad sector with a spare.
  – The next time logical block 87 is requested, the request is translated into the spare sector’s address by the controller.
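A minimal sketch of the sector-sparing idea described above: a controller-side remap table from bad logical blocks to spare sectors, consulted on every translation. The block 87 example follows the slide; the spare-sector numbers are assumed for illustration.

```python
SPARE_SECTORS = [5000, 5001, 5002]   # spare sectors set aside at low-level format (assumed numbers)
remap_table = {}                     # bad logical block -> spare sector

def mark_bad(block):
    """Record that a block is bad and assign it the next free spare sector."""
    remap_table[block] = SPARE_SECTORS[len(remap_table)]

def translate(block):
    """Translate a logical block, transparently redirecting remapped ones."""
    return remap_table.get(block, block)

mark_bad(87)
print(translate(87))   # a request for logical block 87 now goes to spare sector 5000
print(translate(88))   # unaffected blocks translate to themselves
```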

Page 26: Secondary Storage Management 1

26

Swap-Space Management

• Swap-space: virtual memory uses disk space as an extension of main memory.

• Swap-space can be carved out of the normal file system, or, more commonly, it can be in a separate disk partition.

• Swap-space management:
  – 4.3BSD allocates swap space when a process starts; it holds the text segment (the program) and the data segment.
  – The kernel uses swap maps to track swap-space use.
  – Solaris 2 allocates swap space only when a page is forced out of physical memory, not when the virtual memory page is first created.

Page 27: Secondary Storage Management 1

27

RAID: Redundant Array of Independent (or Inexpensive) Disks

• RAID is a set of physical disk drives viewed by the OS as a single logical drive.

• A large number of disks operating in parallel can improve the data transfer rate.

• Redundant information is stored on multiple disks, thereby increasing reliability.

• Data is distributed across multiple disks.

• RAID is arranged into seven different levels (0 through 6).

Page 28: Secondary Storage Management 1

28

RAID (Cont.)

• Several improvements in disk-use techniques involve the use of multiple disks working cooperatively.

• Redundancy can be implemented either by:
  – Duplicating each disk (i.e. mirroring or shadowing).
  – Striping: a technique which uses a group of disks as one storage unit.

Page 29: Secondary Storage Management 1

29

Mirroring

• A mirrored array consists of two or more disk drives. Each disk stores exactly the same data. During read, alternate blocks of data are read from different drives, then combined to reassemble the original data. The access time for a multi-block read is reduced by a factor equal to the no. of disk drives in the array.

Page 30: Secondary Storage Management 1

30

Striping

• A striped array requires a minimum of three disk drives. One disk is reserved for error checking. A file segment to be stored is divided into blocks, which are then written simultaneously to different disks. As the write operation is taking place, the system creates a block of parity words from each group of blocks and stores that on the reserved disk. During read operations, the parity word is used to check the original data for errors.

• Can have bit-level or block-level striping.

Page 31: Secondary Storage Management 1

31

RAID Levels

Page 32: Secondary Storage Management 1

32

Levels of RAID

• RAID is divided into major levels. Some levels are based on mirroring, which improves reliability but is expensive.

• Some levels are based on striping, which provides a high I/O rate but is not reliable.

• Some levels use striping with parity bits: a disk is used to store extra bits, known as parity bits, which are used to recover lost data by comparing parity.

• Parity is a mechanism for checking errors in transmitted data using a parity bit.
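A tiny sketch of a parity check over one byte, assuming even parity: the parity bit is chosen so that the total number of 1 bits is even, and a single flipped bit is detected when the recomputed parity no longer matches.

```python
def even_parity_bit(byte: int) -> int:
    """Parity bit that makes the total number of 1 bits (data + parity) even."""
    return bin(byte).count("1") % 2

data = 0b1011_0010                    # four 1 bits, so the parity bit is 0
stored_parity = even_parity_bit(data)

received = data ^ 0b0000_0100         # one bit flipped in transit
print(even_parity_bit(received) == stored_parity)   # False: the error is detected
```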

Page 33: Secondary Storage Management 1

33

RAID Level 0

• Block-level striping is used for storing data without any duplication of the data.

• This is used when a request requires a huge block of data.

• Data is divided into strips that can be accessed in parallel, which increases the performance of I/O. If two I/O requests arrive at the same time for two different blocks, they can run simultaneously, accessing the blocks from different disks; this reduces the I/O transfer time.
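A minimal sketch of block-level striping across four disks, matching the strip layout in the figure on the next slide: strip i goes to disk i mod 4 at offset i div 4 (disks numbered 0 to 3 here).

```python
NUM_DISKS = 4

def locate(strip: int):
    """Return (disk index, offset within that disk) for a given strip number."""
    return strip % NUM_DISKS, strip // NUM_DISKS

for s in (0, 1, 5, 14):
    disk, offset = locate(s)
    print(f"strip {s} -> disk {disk}, offset {offset}")
# e.g. strip 14 -> disk 2, offset 3: the third column, fourth row of the figure
```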

Page 34: Secondary Storage Management 1

34

RAID 0 (no redundancy)

Disk 1: strip 0 | strip 4 | strip 8  | strip 12
Disk 2: strip 1 | strip 5 | strip 9  | strip 13
Disk 3: strip 2 | strip 6 | strip 10 | strip 14
Disk 4: strip 3 | strip 7 | strip 11 | strip 15

Page 35: Secondary Storage Management 1

35

RAID Level 1

• Based on mirroring. Redundancy is achieved by duplicating all the data. If there are four disks, the computer also has four duplicate disks. Each strip is mapped onto two disks.

• A read request in this level can be performed on either of the two disks, whichever involves the smaller seek time. A write request is performed on both copies of the strip. Recovery is easy: if one disk fails, the data stored on that disk can be retrieved from its duplicate disk, so the data remains accessible. The cost incurred in this level is very high because it takes double the space for data storage.

Page 36: Secondary Storage Management 1

36

RAID 1 (mirrored)

Primary set:
  Disk 1: strip 0 | strip 4 | strip 8  | strip 12
  Disk 2: strip 1 | strip 5 | strip 9  | strip 13
  Disk 3: strip 2 | strip 6 | strip 10 | strip 14
  Disk 4: strip 3 | strip 7 | strip 11 | strip 15

Mirror set (identical copy):
  Disk 5: strip 0 | strip 4 | strip 8  | strip 12
  Disk 6: strip 1 | strip 5 | strip 9  | strip 13
  Disk 7: strip 2 | strip 6 | strip 10 | strip 14
  Disk 8: strip 3 | strip 7 | strip 11 | strip 15

Page 37: Secondary Storage Management 1

37

RAID Level 2 (Memory-Style Error-Correcting-Code Organization)

• This level has a parity-checking mechanism: special bits, known as parity bits, are used to recover any lost bit in the data. The eight bits of a byte are stored on eight disks, and some additional disks store the error-correction bits, or parity bits.

• Each byte of data is assigned a parity, either odd (equal to 1) or even (equal to 0). If a bit is lost or damaged, the whole bit pattern is checked against the stored parity and reconstructed by computing it again. The error-checking scheme stores two or more extra bits as parity bits. This is more reliable than the previous two levels and needs fewer disks than the previous levels. This level is used where disk errors occur frequently.

Page 38: Secondary Storage Management 1

38

RAID 2 (memory-style error-correcting codes)

Figure: data-bit disks (b0, b1, b2, ...) together with dedicated check disks f0(b), f1(b), f2(b) holding the error-correcting code.

Page 39: Secondary Storage Management 1

39

RAID Level 3 (Bit-Interleaved Parity Organization)

• This level needs only one parity bit instead of a full error-correcting code and works similarly to RAID level 2. A single disk is required for storing the parity bit, which is used for error detection and correction.

• This parity is computed from all the bits of a byte.

• If a damaged bit is found, the parity over all the bits in the byte is computed again. If it does not match the previously computed parity, an error is detected.

• This is less expensive than level 2.

Page 40: Secondary Storage Management 1

40

• The drawback is extra overhead in terms of the time taken to compute the parity of every byte. Write operations are slow, since the parity must be recomputed whenever a block is updated.

Page 41: Secondary Storage Management 1

41

RAID 3 (bit-interleaved parity)

Figure: data-bit disks (b0, b1, b2, ...) together with a single parity disk P(b).

Page 42: Secondary Storage Management 1

42

RAID Level 4 (Block-Interleaved Parity Organization)

• Separate blocks of data are stored on separate disks, similar to level 0.

• This level also has a separate disk used for storing the parity block, which is computed from the data blocks.

• If a block fails, it can be reconstructed from the remaining data blocks together with the stored parity block.

Page 43: Secondary Storage Management 1

43

• This level provides a high I/O transfer rate when the data blocks are large, because read and write requests are performed on different disks simultaneously, in parallel.

• Every write operation needs four disk accesses: two reads and two writes. Whenever a write request is issued, the data block is updated and the parity block must be updated as well, since the parity block has to be kept consistent every time a block is written.

• The old data and parity blocks are read to compute the new parity, and then the data block and the parity block are rewritten.
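A minimal sketch of block-interleaved parity as described above: the parity block is the byte-wise XOR of the data blocks in a stripe, so any single lost block can be rebuilt from the surviving blocks and the parity. The block contents are made-up illustrative values.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]   # one stripe of four data blocks
parity = xor_blocks(data)                     # stored on the dedicated parity disk

lost = data[2]                                # pretend the third disk fails
rebuilt = xor_blocks([data[0], data[1], data[3], parity])
print(rebuilt == lost)                        # True: the block is reconstructed
```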


Page 44: Secondary Storage Management 1

44

RAID 4 (block-level parity)

Disk 1: block 0 | block 4 | block 8  | block 12
Disk 2: block 1 | block 5 | block 9  | block 13
Disk 3: block 2 | block 6 | block 10 | block 14
Disk 4: block 3 | block 7 | block 11 | block 15
Parity: P(0-3)  | P(4-7)  | P(8-11)  | P(12-15)

Page 45: Secondary Storage Management 1

45

RAID Level 5 (Block-Interleaved Distributed Parity)

• This level is similar to level 4. The difference lies in using all of the available disks for storing parity along with the data. For example, with an array of five disks, the parity for the nth block is stored on disk (n mod 5) + 1, and the nth blocks of the other four disks store the actual data.

• Storing data and parity in different blocks on different disks means less risk of losing both the parity and the data if a disk fails. Data can be recovered either from the parity block or by recomputing it from the data blocks.
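A minimal sketch of the distributed-parity placement quoted above for a five-disk array, with disks numbered 1 to 5 as in the (n mod 5) + 1 rule:

```python
NUM_DISKS = 5

def parity_disk(stripe: int) -> int:
    """1-based disk number holding the parity for the given stripe (block group)."""
    return (stripe % NUM_DISKS) + 1

for n in range(5):
    data_disks = [d for d in range(1, NUM_DISKS + 1) if d != parity_disk(n)]
    print(f"stripe {n}: parity on disk {parity_disk(n)}, data on disks {data_disks}")
# The parity rotates across the disks rather than living on one dedicated disk.
```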

Page 46: Secondary Storage Management 1

46

RAID 5 (block-level distributed parity)

Disk 1: block 0 | block 4 | block 8  | block 12 | P(16-19)
Disk 2: block 1 | block 5 | block 9  | P(12-15) | block 16
Disk 3: block 2 | block 6 | P(8-11)  | block 13 | block 17
Disk 4: block 3 | P(4-7)  | block 10 | block 14 | block 18
Disk 5: P(0-3)  | block 7 | block 11 | block 15 | block 19

Page 47: Secondary Storage Management 1

47

RAID Level 6 (P + Q Redundancy Scheme)

• In this level, instead of a single parity computation, two parity computations are performed and the two computed parities are stored on different disks. Two extra disks are required for this purpose. The extra disks and parity computation guard against the failure of multiple disks. This ensures high reliability, but extra overhead is incurred in the form of the extra disks and additional disk accesses for each write command.

Page 48: Secondary Storage Management 1

48

RAID 6

Figure: blocks 0-19 distributed across five disks as in the RAID 5 layout on the previous slide, together with an additional set of parity blocks P(0-3) through P(16-19).

Page 49: Secondary Storage Management 1

49

Selecting a RAID Level

• If a disk fails, the time to rebuild its data can be significant and will vary with the RAID level used.

• Rebuilding is easiest for RAID level 1. Simply copy the data from another disk.

• RAID level 0 is used in high-performance applications where data loss is not critical.

• RAID level 1 is popular for applications that require high reliability with fast recovery.

Page 50: Secondary Storage Management 1

50

• The combination of RAID levels 0 and 1 (RAID 0 + 1) is used for applications where performance and reliability are both very important, for example databases.

• Due to RAID 1’s high space overhead, RAID level 5 is often preferred for storing large volumes of data.

• RAID level 6 is not supported currently by many implementations, but should offer better reliability than level 5.

Page 51: Secondary Storage Management 1

51

Cost of Storage

• Main memory is much more expensive than disk storage.

• Removable media could lower the overall storage cost.

• The cost per megabyte of hard disk storage is competitive with magnetic tape if only one tape is used per drive.

• The cheapest tape drives and the cheapest disk drives have had about the same storage capacity over the years.

• The following figures show the cost trend per megabyte of DRAM, magnetic disks and tape drives.

Page 52: Secondary Storage Management 1

52

Price per Megabyte of DRAM, From 1981 to 2000

Page 53: Secondary Storage Management 1

53

Price per Megabyte of Magnetic Hard Disk, From 1981 to 2000

Page 54: Secondary Storage Management 1

54

Price per Megabyte of a Tape Drive, From 1984-2000