CSCI1412 Lecture 11

Operating Systems 3: File Systems
Dr John Cowell
phones off (please)


Overview

- Low-level file systems
  - partitions, sectors, clusters
  - formatting
  - FATs: what a FAT is and how a file is stored
  - fragmentation and defragmentation
  - how a file is located
- High-level file systems
  - drives, directories, files
  - file attributes and permissions
  - sharing information

Low-level File Systems

Low-level File Management

- Physically, a hard disk consists of platters, surfaces, cylinders, heads and sectors
- Logically, a hard disk consists of partitions or drives (C:, D:, E:, etc.), directories and files
- It is the operating system that carries out the translation between the underlying physical organisation and representation of data and the logical layer visible to the user
  - different partitions (drives) may even have different translations that are entirely transparent to the user

File System Layers

- In order to provide a convenient and efficient translation, an operating system generally implements a number of intermediate layers
  - to insulate logical structure from device dependencies

[Figure: the layers, top to bottom: application programs; logical file system (files, directories, etc.); BIOS (Basic Input Output System); IDE controller (integrated drive electronics hard disk drive controller); hard disk]

Partitions

- A partition is a collection of adjacent cylinders on the hard drive that are grouped together
  - each partition can have its own operating system and so its own logical file system
  - the partitions are defined (position, size) in a table
  - there are only four entries allowed in the partition table
- MS-DOS/Win uses a logical structure referred to as FAT
  - this is specific to MS-DOS/Win and is not generic
  - used for floppy drives and flash memory
- Other operating systems have their own file systems
  - XP, Vista: NTFS; Linux: ext2
  - these aren't compatible: a partition may be unreadable

Formatting

- An analogy
  - imagine a large area of tarmac
  - it would be chaos if drivers tried to park anywhere!
  - now imagine that spaces are painted with white lines
  - the area of tarmac has now become a car park
- Formatting is the process of laying down the file system structures on the partition
  - each partition needs to be formatted separately
  - two file allocation tables are written
  - one root directory entry is written
  - all sectors are tested to see if they are bad, and marked if so

Surface Organisation

- The very first sector on a hard drive is the primary boot sector, which also contains the partition table
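The four-entry partition table in the primary boot sector can be read with a few lines of code. The sketch below assumes the classic PC boot sector layout (table at byte offset 446, four 16-byte entries, signature bytes 0x55 0xAA at the end of the 512-byte sector), which the slides do not spell out. The function names and the sample sector are made up for illustration.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446   # assumed: start of the partition table in the boot sector
ENTRY_SIZE = 16      # assumed: bytes per partition table entry
NUM_ENTRIES = 4      # only four entries are allowed


def parse_partition_table(sector: bytes):
    """Return (type, first_sector, sector_count) for each used table entry."""
    if len(sector) != SECTOR_SIZE or sector[510:512] != b"\x55\xaa":
        raise ValueError("not a valid boot sector")
    partitions = []
    for i in range(NUM_ENTRIES):
        start = TABLE_OFFSET + i * ENTRY_SIZE
        entry = sector[start:start + ENTRY_SIZE]
        # status byte, 3 skipped bytes, type byte, 3 skipped bytes,
        # first sector (32-bit little-endian), sector count (32-bit little-endian)
        _status, ptype, first, count = struct.unpack("<B3xB3xII", entry)
        if ptype != 0:   # type 0 marks an unused entry
            partitions.append((ptype, first, count))
    return partitions


def make_sample_sector() -> bytes:
    """Build a hypothetical boot sector holding a single partition."""
    sector = bytearray(SECTOR_SIZE)
    sector[446:462] = struct.pack("<B3xB3xII", 0x80, 0x06, 63, 204800)
    sector[510:512] = b"\x55\xaa"
    return bytes(sector)


print(parse_partition_table(make_sample_sector()))  # [(6, 63, 204800)]
```

The sample entry describes one partition of 204800 sectors (100 MB at 512 bytes per sector); the other three entries are left empty.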

Sectors and Clusters

- Remember that the surface of the disk is divided into blocks of bytes called sectors
  - on current PCs 1 sector = 512 bytes
- Since hard disks have become so much larger, there are so many sectors that it is inefficient to deal with sectors individually
  - sectors are grouped into blocks, known as clusters
- A file is stored in a whole number of clusters
  - any space left over in the final cluster is wasted
  - this is sometimes called slack space

Directory Entries

- After the two FATs on a partition, the operating system puts the root directory (\)
  - a directory consists of a number of directory entries
- Each directory entry consists of
  - the file name, extension, other information and the cluster number of the initial data stored in the file
- The directory entry holds the cluster number of the first cluster in which the file is stored

  Field            Size (bytes)
  file name        8
  ext              3
  attribute flags  1
  reserved         10
  time             2
  date             2
  start            2
  size             4

File Allocation Tables

- The file allocation table (FAT) is a database that holds the status of every cluster on the partition
  - because it is so important, two copies of the FAT are stored
  - if it is corrupted, all data on the partition can be lost!
- The directory entry holds the start cluster of a file
  - the FAT entry indexed by that cluster number contains the number of the next cluster storing the file's data
  - there is a special value to signify end of file (0xFFF8)
- Special values signify free blocks and bad blocks
  - free blocks are signified by 0x0000
  - bad blocks are signified by 0xFFF7

FAT Illustration

- Suppose the clusters are 4 KB in size and we want to store a file, \FRED.DOC, which is 6 KB long

[Figure: FAT entries 0 to 31 shown beside the root directory]

  Root directory
  name          start
  command.com   2
  autoexec.bat  3
  config.sys    4
  fred.doc      8
  bill.doc      9
  jack.doc      10

Cluster Contents

- More details on the cluster contents

[Figure: the same FAT and root directory, with the cluster contents labelled in order: FAT-1, FAT-2, root directory (\), unused, unused, unused, FRED.DOC (0..4095), free, free, bad, BILL.DOC (0..4095), JACK.DOC (0..4095), FRED.DOC (4096..6143), free]

Windows 9X File System

- Windows 9X uses a minor variation of the MS-DOS FAT file system structure, called VFAT (which developed into FAT32)
  - it is still a linked-list allocation using a FAT index
- Directory entries have been expanded to allow file names of up to 255 characters
- The FAT can be selected as 16-bit or 32-bit
  - 16-bit FAT gives 65536 entries
    - the file system is compatible with MS-DOS
    - 64K entries × 32 KB clusters = 2 GB maximum partition size
  - 32-bit FAT gives about 4 × 10^9 entries
    - extremely large maximum partition size!

NTFS

- The (V)FAT structure has serious shortcomings
  - large disk sizes are only possible using large clusters
  - the FAT offers poor performance and is non-robust (key file data such as time, date, size are kept in the directory entry)
  - there is no file access protection mechanism
- Windows XP and Vista use an entirely different system
  - full access protection security
  - large capacity allocation scheme
  - internationalisation using long Unicode file names
  - robust fault-tolerance with transaction logging

NTFS

- NT File System was developed for Windows NT
- Implemented as the file system used by Windows 2000 and XP
- Uses 64-bit addresses
  - theoretically supporting disk partitions of up to 2^64 bytes capacity
- Supports 255-character case-sensitive file names in Unicode
  - but the case sensitivity is not supported by the Win32 API!
- Paths are limited to 32767 bytes

File System Structure

- An NTFS partition is organised as a linear sequence of blocks
  - block size can be 512 bytes to 64 KB
  - actual block size is a compromise between large blocks (more efficient data transfer) and small blocks (less file fragmentation)
- The most important part of the file structure is the Master File Table (MFT)
  - each row (record) is 1 KB long
- Each record contains a description of a single file or directory, including
  - the file's attributes
  - the addresses of the data blocks actually containing the file's data
- The data block address list may require more than one record in the MFT if the file is especially large
  - in that case, the base record contains the addresses of the overflow MFT records
- The size of the MFT is also flexible
  - it may grow to a maximum of 2^48 records
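The linked-list allocation from the FAT slides can be sketched in code: start at the cluster named in the directory entry, then keep looking up the next cluster in the FAT until the end-of-file value appears. The marker values and the 4 KB cluster size are the ones given in the slides. The 16-entry FAT below is a made-up example, not the table from the slide figure, and the function names are illustrative.

```python
END_OF_FILE = 0xFFF8   # end-of-file marker from the slides
FREE = 0x0000          # free cluster
BAD = 0xFFF7           # bad cluster
CLUSTER_SIZE = 4096    # 4 KB clusters, as in the FRED.DOC example


def cluster_chain(fat, start):
    """Follow the FAT from the start cluster and return the clusters of a file."""
    chain = []
    cluster = start
    while True:
        if cluster in chain:
            raise ValueError("loop in FAT chain")
        chain.append(cluster)
        nxt = fat[cluster]
        if nxt >= END_OF_FILE:        # end of file reached
            return chain
        if nxt in (FREE, BAD):
            raise ValueError("chain runs into a free or bad cluster")
        cluster = nxt


def slack_space(file_size, cluster_size=CLUSTER_SIZE):
    """Bytes wasted in the final cluster of a file."""
    clusters = -(-file_size // cluster_size)   # round up to whole clusters
    return clusters * cluster_size - file_size


# Hypothetical FAT: fred.doc occupies clusters 8 and 12,
# bill.doc cluster 9, jack.doc cluster 10, and cluster 11 is bad.
fat = [FREE] * 16
fat[8] = 12
fat[12] = END_OF_FILE
fat[9] = END_OF_FILE
fat[10] = END_OF_FILE
fat[11] = BAD

print(cluster_chain(fat, 8))      # [8, 12]
print(slack_space(6 * 1024))      # 2048
```

A 6 KB file in 4 KB clusters needs two clusters, so 2 KB of the second cluster is slack space.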

Master File Table

- Any file system needs to store information about itself, such as blocks used, volume name, etc.
- This metadata is stored in a series of files
  - file names start with a $
  - NTFS uses the first 16 records for this
- The address of the MFT itself is stored in the boot block of the disk partition when the partition is created
  - this allows the MFT to be placed anywhere on the disk
  - to avoid bad blocks

  Record  Name      Description
  16                First user file
  15                (Reserved for future use)
  14                (Reserved for future use)
  13                (Reserved for future use)
  12                (Reserved for future use)
  11      $Extend   Extensions: quotas, etc.
  10      $Upcase   Case conversion table
  9       $Secure   Security descriptors for all files
  8       $BadClus  List of bad blocks
  7       $Boot     Bootstrap loader
  6       $Bitmap   Bitmap of blocks used
  5       $         Root directory
  4       $AttrDef  Attribute definitions
  3       $Volume   Volume file
  2       $LogFile  Log file for recovery
  1       $MftMirr  Mirror copy of MFT
  0       $Mft      Master File Table

MFT Records

- Each MFT record consists of a series of paired values
- Each pair of values comprises attribute header, value
  - there are several different types of these, such as file name, attribute list, and data
- The value(s) associated with the data attribute contain either the addresses of data blocks or, for a small file, the data itself (an immediate file)
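One compact way to hold the block addresses in a data attribute is as runs, that is, pairs of (first block, number of contiguous blocks). The sketch below is only an illustration of that idea: the block numbers are invented, and the function names are not part of any NTFS API.

```python
def expand_runs(runs):
    """Turn a list of (first_block, block_count) runs into a flat block list."""
    blocks = []
    for first, count in runs:
        blocks.extend(range(first, first + count))
    return blocks


def locate(runs, file_block):
    """Map a block index within the file to a block number on disk."""
    offset = file_block
    for first, count in runs:
        if offset < count:
            return first + offset
        offset -= count
    raise IndexError("block index is past the end of the file")


# Hypothetical nine-block file stored as three runs.
runs = [(20, 4), (64, 2), (80, 3)]

print(expand_runs(runs))   # [20, 21, 22, 23, 64, 65, 80, 81, 82]
print(locate(runs, 5))     # 65
```

Storing three pairs instead of nine separate block numbers is what lets the addresses of a reasonably contiguous file fit inside a single 1 KB record.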

How NTFS Works

- The OS will always attempt to place data into contiguous blocks
  - but this is not always possible
- The diagram details a file of nine blocks in size
- This example shows the data blocks grouped into three runs of blocks
- This is a short file, i.e. the data blocks will not fit into the MFT record, but all the information about the file will
- The header shows that the offset of the first block is 0, i.e. how far from the start of the file the first data block is
- The second header value gives the number of the first block that is not part of the file: effectively, the length of the file in blocks
- After the header come the pairs of values pointing directly to the used blocks
  - the first value gives the number of the first block in the run
  - the second value gives the number of contiguous blocks

Integrity Checking

- If the computer crashes or is switched off in the middle of updating the file system(s), it can be left in a state of confusion or corruption
  - the two FATs may not contain the same information
  - the file clusters may not reflect the FAT list
  - the initial directory entry may not point to the file
- Some operating systems (e.g. UNIX) automatically check the file system integrity on start-up
- MS-DOS/Win requires you to run SCANDISK
- NTFS uses journaling
  - logs file system updates that are about to happen
  - checks afterwards to see that they did happen
  - undoes any partially completed updates

Fragmentation

- Initially all the free space is in one big block
  - clusters are allocated in sequence to files as needed

  [Figure: FRED.DOC | BILL.DOC | JACK.DOC | FREE]

- Suppose BILL.DOC is deleted and the clusters freed

  [Figure: FRED.DOC | FREE | JACK.DOC | FREE]

- Then a longer file, JOHN.DOC, is stored on disk

  [Figure: FRED.DOC | JOHN.DOC | JACK.DOC | JOHN.DOC]

  - the file JOHN.DOC is said to be fragmented
  - after lots of deletions / additions a disk can be very fragmented
  - a badly fragmented disk will be much slower to read and write

Defragmentation

- Defragmentation is the name given to the process of rearranging the clusters so that all files are held in adjacent clusters and all free space is at the end

  [Figure: clusters 0 to 31 before and after defragmentation]

  - the whole disk is read and the clusters are moved
  - a defragmented disk will improve performance
  - this process can be very time consuming
  - a defragmenting tool is included in later versions of Windows

Finding Files

- When the system requests a file from disk
  - the directory is read to see if the file exists
  - the directory entry has the address of the first cluster of the file
  - the disk drive heads are moved to the correct track
  - the OS waits for the correct sector to arrive under the disk head
  - the sector is read and its data transferred to the disk cache buffer
  - when the buffer is full, its contents are transferred to RAM
  - the next sector is read, until all sectors in the cluster have been accessed
  - the FAT is re-accessed to find the next cluster
  - the process is repeated until end-of-file is found in the FAT

High-level File Systems

High Level File Systems

- In addition to dealing with low-level issues such as partition tables, FATs, clusters and sectors, the operating system deals with higher-level issues of file management
  - directory structures
  - file attributes
  - security
- Most operating systems also have some concept of file types or special files
  - directories are identified by an attribute in the directory entry
  - other file types in Windows are identified by extension
    - e.g. EXE, COM, BAT, DOC, XLS, TTF, DLL

Directories and Folders

- Directories, or folders as they are also known, are used to organise collections of files
  - in a typical modern computer, even a single-user PC, there can be 100,000s of files
  - if they were kept in a single, flat file structure it would be extremely time consuming to find any particular file
  - using directories, users can group together files
    - applications, system files, user files
    - documents, spreadsheets, presentations
- A directory is a special file type in which the OS stores the names of other files or other directories
  - directories within directories are sub-directories

Typical Single User File System

- Note: the separation of
  - user files
  - applications
  - operating system
- Allows
  - backups of work
  - easy installation, reinstallation and removal of applications
  - operating system upgrades
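The scenario from the Fragmentation and Defragmentation slides (FRED.DOC, BILL.DOC and JACK.DOC stored in sequence, BILL.DOC deleted, then the longer JOHN.DOC added) can be replayed with a toy disk model. This is a minimal sketch: the eight-cluster disk, the file sizes in clusters and the "take the first free clusters" policy are assumptions chosen to reproduce the picture, not a description of any real allocator.

```python
def allocate(disk, name, n_clusters):
    """Give the file the first n free clusters and return their numbers."""
    free = [i for i, owner in enumerate(disk) if owner is None]
    if len(free) < n_clusters:
        raise OSError("disk full")
    used = free[:n_clusters]
    for i in used:
        disk[i] = name
    return used


def delete(disk, name):
    """Free every cluster belonging to the named file."""
    for i, owner in enumerate(disk):
        if owner == name:
            disk[i] = None


def is_fragmented(disk, name):
    """True if the file's clusters are not all adjacent."""
    idx = [i for i, owner in enumerate(disk) if owner == name]
    return bool(idx) and idx[-1] - idx[0] + 1 != len(idx)


def defragment(disk):
    """Return a new layout: each file contiguous, all free space at the end."""
    order = []
    for owner in disk:
        if owner is not None and owner not in order:
            order.append(owner)
    packed = [name for name in order for _ in range(disk.count(name))]
    return packed + [None] * (len(disk) - len(packed))


disk = [None] * 8                      # hypothetical eight-cluster disk
allocate(disk, "FRED.DOC", 2)
allocate(disk, "BILL.DOC", 2)
allocate(disk, "JACK.DOC", 2)
delete(disk, "BILL.DOC")
print(allocate(disk, "JOHN.DOC", 3))   # [2, 3, 6]
print(is_fragmented(disk, "JOHN.DOC")) # True
disk = defragment(disk)
print(is_fragmented(disk, "JOHN.DOC")) # False
```

JOHN.DOC fills the two-cluster hole left by BILL.DOC and spills into the free space after JACK.DOC, which is exactly the split the slide illustrates.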

Drives

- In Windows systems different physical devices and file systems are assigned different drive labels
  - A: Floppy disk
  - C: First hard disk (partition)
  - D: Second hard disk (partition)
  - E: CD-ROM
  - this organisation is visible to the user and has big drawbacks if the physical organisation needs alteration
- In Unix there is one logical file system structure which may consist of many physical devices
  - the devices may even be distributed across a network

File Attributes

- The operating system keeps track of certain file properties or attributes
  - date / time of creation & last modification
  - in Windows
    - R: read-only, H: hidden, S: system
    - A: archive (has the file changed since last backup)
  - in Unix
    - file owner, group; file permissions
- In MS-DOS/Win the file extension also has meaning to the operating system
  - EXE: executable, BAT: batch command script

File Permissions

- In both Windows XP/Vista and UNIX the concept of MS-DOS/Win file attributes has been extended to a more comprehensive implementation of file permissions
  - the directory entry for each file stores the user name of its owner (the creator) and their default group
  - there are three sets of file permission flags
    - user (owner): read (r), write (w) and execute (x) permissions
    - group: read (r), write (w) and execute (x) permissions
    - other: read (r), write (w) and execute (x) permissions

Personal & Shared Information - 1

- Using these file permissions it is possible to restrict other users' access to your files
  - the user flags apply to the owner of the file
    - e.g. if there is no user write permission, not even the owner can write to the file (i.e. the file is read-only)
  - the group flags apply to other users who belong to the same group as the group of the file
    - e.g. other members of group staff may be granted permission to read documents belonging to jcowell (staff)
    - jcowell could write to a file owned by group cci with group write permissions (as jcowell is also a member of group cci)

Personal & Shared Information - 2

- Using these file permissions it is possible to restrict other users' access to your files, cont.
  - the other flags apply to all other users who are not even in the same group
    - if others have read & write access, then any other user can read or write to the files
- In UNIX the (x) flags indicate either
  - an ordinary file is executable (like .EXE in MS-DOS)
  - a directory is searchable by the appropriate users
    - e.g. if jcowell makes a directory rwx,r-x,--- then group staff members can read it and enter into it but other users cannot

Summary

- Low-level file systems
  - partitions, sectors, clusters
  - formatting
  - FATs: what a FAT is and how a file is stored
  - fragmentation and defragmentation
  - how a file is located
- High-level file systems
  - drives, directories, files
  - file attributes and permissions
  - sharing information
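The user / group / other permission check from the File Permissions slides can be sketched as follows, using the rwx,r-x,--- directory example. The function names and the users asmith and guest are hypothetical; only jcowell and the groups staff and cci come from the slides. The sketch models the rule that exactly one set of flags applies to a given user, so an owner without user write permission cannot write even if the group could.

```python
def parse_mode(text):
    """Turn 'rwx,r-x,---' into a dict of flag sets for user, group and other."""
    classes = ("user", "group", "other")
    parts = text.split(",")
    return {cls: {flag for flag in part if flag != "-"}
            for cls, part in zip(classes, parts)}


def may(user, user_groups, owner, group, mode, flag):
    """Check one flag ('r', 'w' or 'x') for one user against a file's mode."""
    if user == owner:
        return flag in mode["user"]     # the user flags apply to the owner
    if group in user_groups:
        return flag in mode["group"]    # the group flags apply to group members
    return flag in mode["other"]        # everyone else gets the other flags


mode = parse_mode("rwx,r-x,---")        # directory owned by jcowell, group staff

print(may("jcowell", {"staff", "cci"}, "jcowell", "staff", mode, "w"))  # True
print(may("asmith", {"staff"}, "jcowell", "staff", mode, "x"))          # True
print(may("asmith", {"staff"}, "jcowell", "staff", mode, "w"))          # False
print(may("guest", {"students"}, "jcowell", "staff", mode, "r"))        # False
```

Members of staff can read and enter the directory but not write to it, and users outside the group have no access at all.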