SECONDARY STORAGE MANAGEMENT

Transcript of Secondary Storage Management 1

Page 1: Secondary Storage Management 1

1

SECONDARY STORAGE MANAGEMENT

Page 2: Secondary Storage Management 1

2

Overview

• The file management system maintains the file system and its directories and also keeps track of free secondary storage space.

• The I/O system provides device drivers that actually control the transfer of data between memory and the secondary storage devices.

• The secondary storage management system optimizes the completion of I/O tasks by employing algorithms to facilitate more efficient disk usage.

Page 3: Secondary Storage Management 1

3

Disk Structure

• Disk drives are addressed as large 1-dimensional arrays of logical blocks, where the logical block is the smallest unit of transfer (usually 512 bytes).

• The 1-dimensional array of logical blocks is mapped into the sectors of the disk sequentially.
  – Sector 0 is the first sector of the first track on the outermost cylinder.
  – Mapping proceeds in order through that track, then through the rest of the tracks in that cylinder, and then through the rest of the cylinders from outermost to innermost.

Page 4: Secondary Storage Management 1

4

• In theory, we can convert a logical block number into a disk address (i.e. cylinder #, track #, sector #).

• In practice, it is difficult to perform this translation. Disks may have defective sectors, and the number of sectors per track is not constant on some drives.
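As a sketch of the "in theory" translation above, the Python below assumes an idealized drive with a fixed geometry: a constant number of sectors per track, a fixed number of tracks (surfaces) per cylinder, and no defective sectors. The geometry constants are illustrative, not taken from the slides.

```python
# Hypothetical, idealized geometry (assumed for illustration only).
SECTORS_PER_TRACK = 63
TRACKS_PER_CYLINDER = 16   # i.e. the number of recording surfaces / heads

def block_to_chs(block: int):
    """Map a logical block number to (cylinder, track, sector)."""
    sector = block % SECTORS_PER_TRACK
    track = (block // SECTORS_PER_TRACK) % TRACKS_PER_CYLINDER
    cylinder = block // (SECTORS_PER_TRACK * TRACKS_PER_CYLINDER)
    return cylinder, track, sector

print(block_to_chs(0))    # (0, 0, 0): sector 0 of the first track of the outermost cylinder
print(block_to_chs(100))  # (0, 1, 37)
```

On real drives, defective sectors and a varying number of sectors per track break this simple arithmetic, which is exactly the difficulty noted above.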

Page 5: Secondary Storage Management 1

5

Disk Scheduling

• The operating system is responsible for using hardware efficiently — for the disk drives, this means having a fast access time and disk bandwidth.

• Access time has two major components:
  – Seek time is the time for the disk arm to move the heads to the cylinder (track) containing the desired sector.
  – Rotational latency is the additional time spent waiting for the disk to rotate the desired sector under the disk head.

• We want to minimize seek time.

• Seek time ≈ seek distance.

Page 6: Secondary Storage Management 1

6

• The transfer time T to/from the disk depends on the rotation speed of the disk:

  T = (no. of bytes to be transferred) / (no. of bytes on a track × rotation speed)

• Disk bandwidth is the total number of bytes transferred, divided by the total time between the first request for service and the completion of the last transfer.

• We can improve the access time and bandwidth by scheduling the servicing of disk I/O requests in a good order.
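A small numeric sketch of the transfer-time formula and the bandwidth definition above. The disk parameters (7200 RPM, 400 sectors of 512 bytes per track, 10 ms of seek plus rotational delay per request) are assumed values for illustration, not figures from the slides.

```python
bytes_to_transfer = 4096            # one 4 KB request
bytes_per_track   = 512 * 400       # assumed: 400 sectors of 512 bytes per track
rotation_speed    = 7200 / 60       # 7200 RPM expressed in revolutions per second

# T = bytes to transfer / (bytes per track * rotation speed)
transfer_time = bytes_to_transfer / (bytes_per_track * rotation_speed)
print(f"transfer time ~ {transfer_time * 1000:.3f} ms")        # ~0.167 ms

# Disk bandwidth: total bytes moved divided by the total time between the
# first request and the completion of the last transfer (10 requests here,
# each paying an assumed 10 ms of seek + rotational delay).
total_bytes  = 10 * bytes_to_transfer
total_time_s = 10 * (0.010 + transfer_time)
print(f"bandwidth ~ {total_bytes / total_time_s / 1e6:.2f} MB/s")
```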

Page 7: Secondary Storage Management 1

7

• Read section 14.2 on pages 488 - 489 of the text for an example comparing the transfer times for sequential and random access.

Page 8: Secondary Storage Management 1

8

Whenever a process needs I/O to or from the disk, it issues a system call to the OS. The request specifies the following pieces of info:
  – Whether this operation is input or output
  – What the disk address for the transfer is
  – What the memory address for the transfer is
  – What the number of bytes to be transferred is

If the desired disk drive and controller are available, the request can be serviced immediately. If the drive or controller is busy, any new requests will be placed in the queue of pending requests for that drive. The OS chooses which pending request to service next.
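A minimal sketch of the information carried by one disk I/O request and of the per-drive queue of pending requests described above. The class and field names are purely illustrative; they are not an actual OS interface.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class DiskRequest:
    is_input: bool       # input (read) or output (write)
    disk_address: int    # disk address (e.g. logical block) for the transfer
    memory_address: int  # memory buffer address for the transfer
    num_bytes: int       # number of bytes to be transferred

pending = deque()        # queue of pending requests for one busy drive
pending.append(DiskRequest(True, 98, 0x7F00, 512))
pending.append(DiskRequest(False, 183, 0x8000, 1024))

next_request = pending.popleft()   # the OS's scheduling policy decides this order
```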

Page 9: Secondary Storage Management 1

9

Disk Scheduling (Cont.)

• Several algorithms exist to schedule the servicing of disk I/O requests. These include:

1. First-Come-First-Serve

2. Shortest Seek Time First

3. SCAN

4. Circular SCAN (C-SCAN)

5. LOOK

Page 10: Secondary Storage Management 1

10

FCFS

Illustration shows total head movement of 640 cylinders.
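As a rough cross-check of that illustration, here is a minimal Python sketch of FCFS head movement, assuming the classic textbook request queue (cylinders 98, 183, 37, 122, 14, 124, 65, 67) with the head starting at cylinder 53; those request values are an assumption, since the figure itself is not reproduced here.

```python
def fcfs_movement(requests, head):
    """Total head movement when requests are serviced strictly in arrival order."""
    total = 0
    for r in requests:
        total += abs(r - head)
        head = r
    return total

queue = [98, 183, 37, 122, 14, 124, 65, 67]
print(fcfs_movement(queue, 53))   # 640 cylinders
```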

Page 11: Secondary Storage Management 1

11

Shortest Service Time First

• Selects the request with the minimum seek time from the current head position.

• SSTF scheduling is a form of SJF scheduling; may cause starvation of some requests.

• Illustration shows total head movement of 236 cylinders.
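A matching sketch of SSTF, again assuming the same illustrative queue and a starting head position of 53: at each step the pending request closest to the current head position is chosen.

```python
def sstf_movement(requests, head):
    """Total head movement when the nearest pending request is always served next."""
    pending, total = list(requests), 0
    while pending:
        nearest = min(pending, key=lambda r: abs(r - head))
        total += abs(nearest - head)
        head = nearest
        pending.remove(nearest)
    return total

print(sstf_movement([98, 183, 37, 122, 14, 124, 65, 67], 53))   # 236 cylinders
```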

Page 12: Secondary Storage Management 1

12

SSTF (Cont.)

Page 13: Secondary Storage Management 1

13

SCAN

• The disk arm starts at one end of the disk, and moves toward the other end, servicing requests until it gets to the other end of the disk, where the head movement is reversed and servicing continues.

• Sometimes called the elevator algorithm.

• Illustration shows total head movement of 208 cylinders.

Page 14: Secondary Storage Management 1

14

SCAN (Cont.)

Page 15: Secondary Storage Management 1

15

C-SCAN

• Provides a more uniform wait time than SCAN.

• The head moves from one end of the disk to the other, servicing requests as it goes. When it reaches the other end, however, it immediately returns to the beginning of the disk, without servicing any requests on the return trip.

• Treats the cylinders as a circular list that wraps around from the last cylinder to the first one.
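A minimal sketch of the C-SCAN service order, assuming the head is currently sweeping toward higher-numbered cylinders and using the same illustrative request queue as above; it returns only the order of service, not the head-movement total.

```python
def cscan_order(requests, head):
    """Requests at or above the head are served on the outward sweep; the head
    then jumps back to the beginning of the disk and serves the rest."""
    outward = sorted(r for r in requests if r >= head)
    wrapped = sorted(r for r in requests if r < head)
    return outward + wrapped

print(cscan_order([98, 183, 37, 122, 14, 124, 65, 67], 53))
# [65, 67, 98, 122, 124, 183, 14, 37]
```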

Page 16: Secondary Storage Management 1

16

C-SCAN (Cont.)

Page 17: Secondary Storage Management 1

17

C-LOOK

• Version of C-SCAN

• Arm only goes as far as the last request in each direction, then reverses direction immediately, without first going all the way to the end of the disk.

• That is, it looks for a request before continuing to move in a given direction.

Page 18: Secondary Storage Management 1

18

C-LOOK (Cont.)

Page 19: Secondary Storage Management 1

19

Page 20: Secondary Storage Management 1

20

Page 21: Secondary Storage Management 1

21

Disk Management

The OS is responsible for several other aspects of disk management. These include:

• Disk formatting

• Booting from disk

• Bad-block recovery

Page 22: Secondary Storage Management 1

22

Disk Formatting

• Low-level formatting, or physical formatting: dividing a disk into sectors that the disk controller can read and write.

• The disk is filled with a special data structure for each sector.

• The data structure has a header, data area and trailer.

• To use a disk to store files, the operating system still needs to record its own data structures on the disk. It does so by:
  – Partitioning the disk into one or more groups of cylinders. Each partition can be treated as a separate disk.
  – Logical formatting, or “making a file system”.

Page 23: Secondary Storage Management 1

23

Bad Block Handling

• On simple disks (e.g. with IDE controllers), bad blocks are handled manually. For example, in MS-DOS the format command does a logical format of the disk and scans the disk for bad blocks. If a bad block is found, an entry is made in the FAT structure to mark that block as unusable.

• If a block goes bad during normal operation, chkdsk can be run to manually search for and record bad blocks.

Page 24: Secondary Storage Management 1

24

MS-DOS Disk Layout

Page 25: Secondary Storage Management 1

25

Bad Blocks (Cont.)

• More sophisticated disks, such as SCSI disks, are smarter about bad block recovery. The bad block list is initialized during low-level formatting at the factory and is updated over the life of the disk.

• The controller can be configured to replace each bad sector logically with a spare sector. A typical bad sector recovery might proceed as follows:
  – The OS tries to read logical block 87.
  – The controller calculates the error-correcting code (ECC) and finds that the sector is bad. It reports this to the OS.
  – When the system is rebooted, a special command tells the SCSI controller to replace the bad sector with a spare.
  – The next time logical block 87 is requested, the request is translated into the spare sector’s address by the controller.
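A minimal sketch of the sector-sparing idea described above: a controller-side remap table from bad logical blocks to spare sectors, consulted on every translation. The block 87 example follows the slide; the spare-sector numbers are assumed for illustration.

```python
SPARE_SECTORS = [5000, 5001, 5002]   # spare sectors set aside at low-level format (assumed numbers)
remap_table = {}                     # bad logical block -> spare sector

def mark_bad(block):
    """Record that a block is bad and assign it the next free spare sector."""
    remap_table[block] = SPARE_SECTORS[len(remap_table)]

def translate(block):
    """Translate a logical block, transparently redirecting remapped ones."""
    return remap_table.get(block, block)

mark_bad(87)
print(translate(87))   # a request for logical block 87 now goes to spare sector 5000
print(translate(88))   # unaffected blocks translate to themselves
```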

Page 26: Secondary Storage Management 1

26

Swap-Space Management

• Swap-space: virtual memory uses disk space as an extension of main memory.

• Swap-space can be carved out of the normal file system, or, more commonly, it can be in a separate disk partition.

• Swap-space management:
  – 4.3BSD allocates swap space when a process starts; it holds the text segment (the program) and the data segment.
  – The kernel uses swap maps to track swap-space use.
  – Solaris 2 allocates swap space only when a page is forced out of physical memory, not when the virtual memory page is first created.

Page 27: Secondary Storage Management 1

27

RAID: Redundant Array of Independent (or Inexpensive) Disks

• RAID is a set of physical disk drives viewed by the OS as a single logical drive.

• A large number of disks operating in parallel can improve the data transfer rate.

• Redundant information is stored on multiple disks, thereby increasing reliability.

• Data is distributed across multiple disks.

• RAID is arranged into seven different levels (0 through 6).

Page 28: Secondary Storage Management 1

28

RAID (Cont.)

• Several improvements in disk-use techniques involve the use of multiple disks working cooperatively.

• Redundancy can be implemented either by:
  – Duplicating each disk (i.e. mirroring or shadowing).
  – Striping: a technique which uses a group of disks as one storage unit.

Page 29: Secondary Storage Management 1

29

Mirroring

• A mirrored array consists of two or more disk drives. Each disk stores exactly the same data. During read, alternate blocks of data are read from different drives, then combined to reassemble the original data. The access time for a multi-block read is reduced by a factor equal to the no. of disk drives in the array.

Page 30: Secondary Storage Management 1

30

Striping

• A striped array requires a minimum of three disk drives. One disk is reserved for error checking. A file segment to be stored is divided into blocks, which are then written simultaneously to different disks. As the write operation is taking place, the system creates a block of parity words from each group of blocks and stores that on the reserved disk. During read operations, the parity word is used to check the original data for errors.

• Can have bit-level or block-level striping.

Page 31: Secondary Storage Management 1

31

RAID Levels

Page 32: Secondary Storage Management 1

32

Levels of RAID

• RAID is divided into major levels. Some levels are based on mirroring, which improves reliability but is expensive.

• Some levels are based on striping, which provides a high I/O rate but is not reliable.

• Some levels use striping with parity bits: a disk is used to store extra bits, known as parity bits, which are used to recover lost data by comparing parity.

• Parity is a mechanism for checking errors in transmitted data using a parity bit.
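A tiny sketch of a parity check over one byte, assuming even parity: the parity bit is chosen so that the total number of 1 bits is even, and a single flipped bit is detected when the recomputed parity no longer matches.

```python
def even_parity_bit(byte: int) -> int:
    """Parity bit that makes the total number of 1 bits (data + parity) even."""
    return bin(byte).count("1") % 2

data = 0b1011_0010                    # four 1 bits, so the parity bit is 0
stored_parity = even_parity_bit(data)

received = data ^ 0b0000_0100         # one bit flipped in transit
print(even_parity_bit(received) == stored_parity)   # False: the error is detected
```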

Page 33: Secondary Storage Management 1

33

RAID Level 0

• Block-level striping is used for storing data without any duplication of the data.

• This is used when a request requires a huge block of data.

• Data is divided into strips that can be accessed in parallel, which increases the performance of I/O. If two I/O requests arrive at the same time for two different blocks, they can run simultaneously, accessing the blocks from different disks; this reduces the I/O transfer time.
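A minimal sketch of block-level striping across four disks, matching the strip layout in the figure on the next slide: strip i goes to disk i mod 4 at offset i div 4 (disks numbered 0 to 3 here).

```python
NUM_DISKS = 4

def locate(strip: int):
    """Return (disk index, offset within that disk) for a given strip number."""
    return strip % NUM_DISKS, strip // NUM_DISKS

for s in (0, 1, 5, 14):
    disk, offset = locate(s)
    print(f"strip {s} -> disk {disk}, offset {offset}")
# e.g. strip 14 -> disk 2, offset 3: the third column, fourth row of the figure
```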

Page 34: Secondary Storage Management 1

34

RAID 0 (no redundancy)

Disk 1: strip 0 | strip 4 | strip 8  | strip 12
Disk 2: strip 1 | strip 5 | strip 9  | strip 13
Disk 3: strip 2 | strip 6 | strip 10 | strip 14
Disk 4: strip 3 | strip 7 | strip 11 | strip 15

Page 35: Secondary Storage Management 1

35

RAID Level 1

• Based on mirroring. Redundancy is achieved by duplicating all the data. If there are four disks, the computer also has four duplicate disks. Each strip is mapped onto two disks.

• A read request in this level can be performed on either of the two disks, whichever involves the smaller seek time. A write request is performed on both copies of the strip. Recovery is easy: if one disk fails, the data stored on that disk can be retrieved from its duplicate disk, so the data remains accessible. The cost incurred in this level is very high because it takes double the space for data storage.

Page 36: Secondary Storage Management 1

36

RAID 1 (mirrored)

Primary set:
  Disk 1: strip 0 | strip 4 | strip 8  | strip 12
  Disk 2: strip 1 | strip 5 | strip 9  | strip 13
  Disk 3: strip 2 | strip 6 | strip 10 | strip 14
  Disk 4: strip 3 | strip 7 | strip 11 | strip 15

Mirror set (identical copy):
  Disk 5: strip 0 | strip 4 | strip 8  | strip 12
  Disk 6: strip 1 | strip 5 | strip 9  | strip 13
  Disk 7: strip 2 | strip 6 | strip 10 | strip 14
  Disk 8: strip 3 | strip 7 | strip 11 | strip 15

Page 37: Secondary Storage Management 1

37

RAID Level 2 (Memory-Style Error-Correcting-Code Organization)

• This level has a parity-checking mechanism: special bits, known as parity bits, are used to recover any lost bit in the data. The eight bits of a byte are stored on eight disks, and some additional disks store the error-correction bits, or parity bits.

• Each byte of data is assigned a parity, either odd (equal to 1) or even (equal to 0). If a bit is lost or damaged, the whole bit pattern is checked against the stored parity and reconstructed by computing it again. The error-checking scheme stores two or more extra bits as parity bits. This is more reliable than the previous two levels and needs fewer disks than the previous levels. This level is used where disk errors occur frequently.

Page 38: Secondary Storage Management 1

38

RAID 2 (memory-style error-correcting codes)

Figure: data-bit disks (b0, b1, b2, ...) together with dedicated check disks f0(b), f1(b), f2(b) holding the error-correcting code.

Page 39: Secondary Storage Management 1

39

RAID Level 3 (Bit-Interleaved Parity Organization)

• This level needs only one parity bit instead of a full error-correcting code and works similarly to RAID level 2. A single disk is required for storing the parity bit, which is used for error detection and correction.

• This parity is computed from all the bits of a byte.

• If a damaged bit is found, the parity over all the bits in the byte is computed again. If it does not match the previously computed parity, an error is detected.

• This is less expensive than level 2.

Page 40: Secondary Storage Management 1

40

• The drawback is extra overhead in terms of the time taken to compute the parity of every byte. Write operations are slow, since the parity must be recomputed whenever a block is updated.

Page 41: Secondary Storage Management 1

41

RAID 3 (bit-interleaved parity)

Figure: data-bit disks (b0, b1, b2, ...) together with a single parity disk P(b).

Page 42: Secondary Storage Management 1

42

RAID Level 4 (Block-Interleaved Parity Organization)

• Separate blocks of data are stored on separate disks, similar to level 0.

• This level also has a separate disk used for storing the parity block, which is computed from the data blocks.

• If a block fails, it can be reconstructed from the remaining data blocks together with the stored parity block.

Page 43: Secondary Storage Management 1

43

• This level provides a high I/O transfer rate when the data blocks are large, because read and write requests are performed on different disks simultaneously, in parallel.

• Every write operation needs four disk accesses: two reads and two writes. Whenever a write request is issued, the data block is updated and the parity block must be updated as well, since the parity block has to be kept consistent every time a block is written.

• The old data and parity blocks are read to compute the new parity, and then the data block and the parity block are rewritten.
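A minimal sketch of block-interleaved parity as described above: the parity block is the byte-wise XOR of the data blocks in a stripe, so any single lost block can be rebuilt from the surviving blocks and the parity. The block contents are made-up illustrative values.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]   # one stripe of four data blocks
parity = xor_blocks(data)                     # stored on the dedicated parity disk

lost = data[2]                                # pretend the third disk fails
rebuilt = xor_blocks([data[0], data[1], data[3], parity])
print(rebuilt == lost)                        # True: the block is reconstructed
```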


Page 44: Secondary Storage Management 1

44

RAID 4 (block-level parity)

Disk 1: block 0 | block 4 | block 8  | block 12
Disk 2: block 1 | block 5 | block 9  | block 13
Disk 3: block 2 | block 6 | block 10 | block 14
Disk 4: block 3 | block 7 | block 11 | block 15
Parity: P(0-3)  | P(4-7)  | P(8-11)  | P(12-15)

Page 45: Secondary Storage Management 1

45

RAID Level 5 (Block-Interleaved Distributed Parity)

• This level is similar to level 4. The difference lies in using all of the available disks for storing parity along with the data. For example, with an array of five disks, the parity for the nth block is stored on disk (n mod 5) + 1, and the nth blocks of the other four disks store the actual data.

• Storing data and parity in different blocks on different disks means less risk of losing both the parity and the data if a disk fails. Data can be recovered either from the parity block or by recomputing it from the data blocks.
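A minimal sketch of the distributed-parity placement quoted above for a five-disk array, with disks numbered 1 to 5 as in the (n mod 5) + 1 rule:

```python
NUM_DISKS = 5

def parity_disk(stripe: int) -> int:
    """1-based disk number holding the parity for the given stripe (block group)."""
    return (stripe % NUM_DISKS) + 1

for n in range(5):
    data_disks = [d for d in range(1, NUM_DISKS + 1) if d != parity_disk(n)]
    print(f"stripe {n}: parity on disk {parity_disk(n)}, data on disks {data_disks}")
# The parity rotates across the disks rather than living on one dedicated disk.
```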

Page 46: Secondary Storage Management 1

46

RAID 5 (block-level distributed parity)

Disk 1: block 0 | block 4 | block 8  | block 12 | P(16-19)
Disk 2: block 1 | block 5 | block 9  | P(12-15) | block 16
Disk 3: block 2 | block 6 | P(8-11)  | block 13 | block 17
Disk 4: block 3 | P(4-7)  | block 10 | block 14 | block 18
Disk 5: P(0-3)  | block 7 | block 11 | block 15 | block 19

Page 47: Secondary Storage Management 1

47

RAID Level 6 (P + Q Redundancy Scheme)

• In this level, instead of a single parity computation, two parity computations are performed and the two computed parities are stored on different disks. Two extra disks are required for this purpose. The extra disks and parity computation guard against the failure of multiple disks. This ensures high reliability, but extra overhead is incurred in the form of the extra disks and additional disk accesses for each write command.

Page 48: Secondary Storage Management 1

48

RAID 6

Figure: blocks 0-19 distributed across five disks as in the RAID 5 layout on the previous slide, together with an additional set of parity blocks P(0-3) through P(16-19).

Page 49: Secondary Storage Management 1

49

Selecting a RAID Level

• If a disk fails, the time to rebuild its data can be significant and will vary with the RAID level used.

• Rebuilding is easiest for RAID level 1. Simply copy the data from another disk.

• RAID level 0 is used in high-performance applications where data loss is not critical.

• RAID level 1 is popular for applications that require high reliability with fast recovery.

Page 50: Secondary Storage Management 1

50

• The combination of RAID levels 0 and 1 (RAID 0 + 1) is used for applications where performance and reliability are both very important, for example databases.

• Due to RAID 1’s high space overhead, RAID level 5 is often preferred for storing large volumes of data.

• RAID level 6 is not supported currently by many implementations, but should offer better reliability than level 5.

Page 51: Secondary Storage Management 1

51

Cost of Storage

• Main memory is much more expensive than disk storage.

• Removable media could lower the overall storage cost.

• The cost per megabyte of hard disk storage is competitive with magnetic tape if only one tape is used per drive.

• The cheapest tape drives and the cheapest disk drives have had about the same storage capacity over the years.

• The following figures show the cost trend per megabyte of DRAM, magnetic disks and tape drives.

Page 52: Secondary Storage Management 1

52

Price per Megabyte of DRAM, From 1981 to 2000

Page 53: Secondary Storage Management 1

53

Price per Megabyte of Magnetic Hard Disk, From 1981 to 2000

Page 54: Secondary Storage Management 1

54

Price per Megabyte of a Tape Drive, From 1984-2000