File Backup Terminology

8/8/2019 File Backup Terminology

http://slidepdf.com/reader/full/file-backup-terminology 1/7

File Backup Terminology: What do terms like “Differential” and “Incremental” mean, and how will they help me?
Updated Tue, 10/19/2010 - 02:11 by Ritho

Introduction

Over the years various backup technologies have been developed in an attempt to minimize the amount of space required to store backup files, and to reduce the bandwidth required to transfer those files to remote locations. When faced with the different backup methods that many programs offer, it is easy to become confused, since the terminology used is often not very clear, and it is hard to know the benefits or drawbacks of any one technology. This article is meant to be a simple guide to help cut down on the frustration that many experience when they don’t know what certain terms mean, and how different options are best used.

Note: This is by no means an exhaustive glossary of backup terms. If you have questions about any terms that are not covered below, please feel welcome to ask in the comment section, and we will attempt to answer them for you.

Index

Common Backup Methods

Full Backups 

Differential Backups 

Incremental Backups 

Delta or Block-Level Backups

Mirror Backups (Simple Copy)

Other Backup Methods and Techniques

Binary Patch 

Synthetic 

Hard Linked Backups 

Discussion

Full Backups 

This is just what it sounds like: a complete backup of all the data that a user selects when configuring a backup job. The copied files are usually placed into a single archive file and compressed to help save space. Every time another full backup is made, all the files in the source are once again copied to an archive. The problem is that often only a few files are new or changed, so continuously making full backups ends up copying a lot of files that don’t really need to be backed up again. This uses a lot of extra storage and wastes time. You can of course delete older backups to free up space, but the time is still lost. The extra wear on hard disks and the amount of bandwidth used to make frequent full backups must be considered too.
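
As a rough illustration of the full backup described above, the following Python sketch copies every file under a source folder into one compressed archive. The function name full_backup and the choice of a gzip-compressed tar archive are assumptions made for this example; real backup programs use their own archive formats.

```python
import tarfile
from pathlib import Path


def full_backup(source_dir, archive_path):
    """Copy every file under source_dir into one gzip-compressed tar archive.

    Returns the number of files written. The archive should live outside
    source_dir, otherwise it would try to include itself.
    """
    source = Path(source_dir)
    count = 0
    with tarfile.open(archive_path, "w:gz") as archive:
        for path in sorted(source.rglob("*")):
            if path.is_file():
                # Store paths relative to the source folder.
                archive.add(path, arcname=path.relative_to(source).as_posix())
                count += 1
    return count
```

Note that every call copies every file again, whether or not it changed, which is exactly the cost the paragraph above describes.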

It is a much better idea to make a full backup once in a while, and then figure out a way to only copy the new or changed files on a more frequent basis. Several different methods, described below, have been created to implement this very thing.

Benefits and Disadvantages of Full Backups 

Faster restore of all files -- When a full restore is necessary, full backups are quick because you are only dealing with one archive file.

Full backups are large and time consuming to make -- They are not well suited for regular backups such as those performed hourly or daily.

Differential Backups 

After creating a full backup archive, this backup method helps to reduce the size of subsequent backups by doing a “differential” comparison between the current source files and the last full backup. All new and modified files are copied to an archive alongside the full backup. The important thing to understand is that differential backups are cumulative. Each differential backup backs up everything that is different since the last full backup, even if those files are already included in a previous differential. Since differentials back up only new or changed files, they are a faster backup method than creating a full backup each time. Differential backups are well suited for daily or less frequent backup strategies.
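
The cumulative behavior can be sketched in a few lines of Python. This is only a simplified model: each snapshot is a dictionary mapping a file path to its modification time, and the function name differential_backup and the sample data are assumptions made for the example, not part of any particular backup program.

```python
def differential_backup(current, last_full):
    """Select files that are new or modified compared with the last FULL backup.

    Both arguments map a file path to its modification time (a simplified,
    hypothetical stand-in for a real file-system scan).
    """
    return {path: mtime for path, mtime in current.items()
            if last_full.get(path) != mtime}


# Snapshot taken by the full backup, then the source on two later days.
full = {"a.txt": 100, "b.txt": 100, "c.txt": 100}
day1 = {"a.txt": 110, "b.txt": 100, "c.txt": 100}
day2 = {"a.txt": 110, "b.txt": 120, "c.txt": 100, "d.txt": 120}

diff1 = differential_backup(day1, full)  # a.txt only
diff2 = differential_backup(day2, full)  # a.txt again, plus b.txt and d.txt
```

Because the comparison is always made against the full backup, the second differential repeats a.txt even though it has not changed since the first differential.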

Benefits and Disadvantages of Differential Backups

Faster to restore than some other methods -- To do a full restore of all backed-up files, you only need the full backup and the last differential backup.

Differential backups are more demanding on storage than some of the other backup methods, because of data redundancy.

Each subsequent differential tends to be larger than the one before it, until it becomes necessary to create a new full backup. Then the process starts over.

Incremental Backups 

This backup method works similarly to differential backups, but with one important difference that deals with the high level of data redundancy in differentials. Each incremental contains only the files that were created or modified since the last full backup or the last incremental, whichever is more recent. Incrementals contain far less redundant data than differentials, but they are not entirely free of it: a file that has already been backed up and is then modified again will be copied in full into the next incremental. Incremental backups are a good solution for more frequent backups such as those performed on an hourly basis.
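
A minimal Python sketch of this idea follows, again modeling each snapshot as a dictionary of file path to modification time. The function name incremental_chain and the sample data are assumptions made for illustration only.

```python
def incremental_chain(snapshots):
    """Build the incrementals for a series of source snapshots.

    snapshots is a list in time order whose first entry is the state captured
    by the full backup. Each incremental holds only what changed since the
    snapshot immediately before it.
    """
    incrementals = []
    for previous, current in zip(snapshots, snapshots[1:]):
        incrementals.append({path: mtime for path, mtime in current.items()
                             if previous.get(path) != mtime})
    return incrementals


full = {"a.txt": 100, "b.txt": 100, "c.txt": 100}
day1 = {"a.txt": 110, "b.txt": 100, "c.txt": 100}
day2 = {"a.txt": 110, "b.txt": 120, "c.txt": 100, "d.txt": 120}
day3 = {"a.txt": 130, "b.txt": 120, "c.txt": 100, "d.txt": 120}

inc1, inc2, inc3 = incremental_chain([full, day1, day2, day3])
```

Here the second incremental leaves out a.txt because it did not change between day 1 and day 2. The file shows up again in the third incremental only because it was modified once more, which is the remaining redundancy mentioned above.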

Benefits and Disadvantages of Incremental Backups 

Incremental backups can be completed more quickly than differential backups because there is less redundant data being copied.

Incremental backups are smaller than differential backups.

The number of successive incrementals that can be made between full backups, while still remaining manageable, is much greater than with differentials.

Incremental backups may take considerably longer to do complete restores than differential backups because all the individual archives must be merged together one by one with the full backup.

Delta or Block-Level Backups 

The term “delta” is often used rather flexibly in reference to different backup technologies, but when paired with other terms as in “Delta Backup,” “Delta Block Backup,” and “Delta-Style Backup” they generally refer to the same basic backup method. Deltas are best described as block-level technology, whereas incrementals and differentials are file-level technologies. It is important to note that delta block techniques are only applied to modified files, not new files. New files are of course just backed up in a normal fashion.

File-level backups will back up a changed file in its entirety, even if it has only changed slightly. While this may not be much of a problem for small text documents, it can quickly become a problem with very large files like databases. Take for example email clients like Outlook, which save all received email and attachments in single-file databases. Even if only one email has been received, the entire database file has changed, and is backed up again. Since these databases can easily grow to be several hundred megabytes in size, you once again end up with a lot of data redundancy.

Delta backups deal with this problem by backing up only the parts of files which have changed instead of the whole file. Each changed file is broken down into fixed-size blocks, and those blocks are compared with the corresponding blocks of the original file. (The block size depends on the particular program, or perhaps on a user-chosen setting. Block sizes generally range between 1 and 32 kilobytes.) Only those blocks that contain differences are extracted and backed up. Deltas can be confusing because they can be applied in a couple of different ways. There are differential deltas and incremental deltas. These work on the same principle as the differential and incremental file backups explained above, but at a much more granular level. Similarly, each type of delta inherits the same kinds of advantages and disadvantages.
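
The block comparison can be sketched in Python as follows. This is a simplified illustration, not the method of any particular product: the function names changed_blocks and apply_delta are made up for the example, the 4 KB block size is just one value within the range mentioned above, and the original file is compared directly, where a real program would more likely compare stored checksums of each block.

```python
def changed_blocks(original, modified, block_size=4096):
    """Return {block_index: block_bytes} for every fixed-size block of
    `modified` that differs from the same block of `original`."""
    delta = {}
    for offset in range(0, len(modified), block_size):
        block = modified[offset:offset + block_size]
        if block != original[offset:offset + block_size]:
            delta[offset // block_size] = block
    return delta


def apply_delta(original, delta, new_length, block_size=4096):
    """Rebuild the modified file from the original plus the stored blocks."""
    # Start from the original, cut or padded to the new file length.
    data = bytearray(original[:new_length].ljust(new_length, b"\0"))
    for index, block in delta.items():
        start = index * block_size
        data[start:start + len(block)] = block
    return bytes(data)


original = bytes(range(256)) * 64          # a 16 KB "file"
edited = bytearray(original)
edited[5000] ^= 0xFF                       # change a single byte
edited = bytes(edited)

delta = changed_blocks(original, edited)   # only the block holding byte 5000
restored = apply_delta(original, delta, len(edited))
```

A one-byte change costs one 4 KB block here instead of the whole 16 KB file. The restore step also shows why the original file and the program's own logic are both needed to get the data back.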

Deltas are especially advantageous for use in technologies where files are backed up immediately after files are created or modified. This is known as real-time backup or continuous data protection. Deltas are also very beneficial when used to back up files over networks with limited bandwidth or to remote servers such as online storage.

Benefits and Disadvantages of Delta Style Backups 

Delta backups are extremely fast because of the small amount of data being transferred.

Deltas produce much less redundancy, and backups are a fraction of the size of those produced by incremental or differential backups. This dramatically reduces the demands on storage and bandwidth.

Deltas of modified files do not produce whole files in the backup, and thus restores absolutely depend on the program that created them to do the restoration.

Deltas are slower to restore because the individual files must be reconstructed from their various parts.

Binary Patch Backups (FastBit) 

Binary patch technology was originally developed as a way for software developers to easily update their programs on customers’ computers over the Internet by sending “patches” that would replace the parts of files that needed modification. Recently it has started to be adapted into backup technologies as well. The most relevant example is a backup technology called FastBit(TM), which is employed by a number of online storage vendors.

Binary patch backups work very similarly to deltas, the primary difference being that they are even more granular. Deltas work at the block level, while binary patches work at the, well, binary level. Because deltas back up only the modified parts of files in fixed-size blocks, part of each block may contain some unchanged data. Binary patches avoid this by copying only the actual bytes that have changed.
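
To make the byte-level idea concrete, here is a small Python sketch. It is not the FastBit algorithm, whose details are not covered here. It uses difflib.SequenceMatcher from the Python standard library as a stand-in for a real binary diff engine, and the names make_patch and apply_patch are made up for the example. It is only practical for small inputs.

```python
from difflib import SequenceMatcher


def make_patch(old, new):
    """Describe `new` as byte ranges copied from `old` plus literal new bytes."""
    ops = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))       # reuse bytes old[i1:i2]
        elif j2 > j1:                          # "replace" or "insert"
            ops.append(("data", new[j1:j2]))   # only the changed bytes
        # "delete" needs no entry: the old bytes are simply not copied
    return ops


def apply_patch(old, ops):
    """Rebuild the new file from the old file and the patch."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += old[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


old = b"The quick brown fox jumps over the lazy dog"
new = b"The quick red fox jumps over the lazy dog!"
patch = make_patch(old, new)
literal_bytes = sum(len(op[1]) for op in patch if op[0] == "data")
```

The patch stores only the bytes counted in literal_bytes as new data; everything else is a reference back into the old file, so no unchanged data rides along as it can inside a fixed-size block.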

Benefits and Disadvantages of Binary Patch Backups 

Note: Due to the very limited application of binary patching technology in actual backup software, as well as very sparse information on the subject, the author is very uncertain about the benefits and/or limitations that may be inherent to the technique.

Virtually eliminates all data redundancy, and produces the smallest backups possible with current technologies.

It is even less bandwidth intensive than deltas.

The production of the actual patch may be more demanding on system resources and more time consuming than deltas, although the loss may be regained in bandwidth and transfer costs.

No information is available about how file reconstruction is handled or how efficient it is.

Mirror Backups 

Most backup programs will list mirror backups as an alternative to full, differential, or incremental backups, etc. Some programs use an alternate term for mirrors, such as “simple copy.” Mirror backups are basically the simplest type of backup. There are no real backup technologies being employed when making a mirror style backup, only copy technology. If you copy and paste a folder from one drive to another you have created a mirror backup of that folder. The mirrored files generally exist in the same state they did in the source, not compressed into archives like with a full backup. (Although some programs support compressing each file individually and adding encryption.)

When to Use Mirrored Backups

Mirror style backups without compression are good to use when you are backing up a lot of files that already have compression applied to them. For example, music files in mp3 or wma format, images in jpg or png format, videos in DivX, mov, or flv format, and most program install or setup files are already compressed. If you include these files in a normal backup that applies compression, you will often notice it will be very slow, and you will gain very little extra compression by doing so. It is best to set up separate backup jobs for compressed files and non-compressed files. If your backup program supports include and exclude filters, they can be used to automatically select or deselect the compressed files, respectively.
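
The include/exclude idea can be sketched in Python as a simple filter on file extensions. The function name split_by_compression is made up for this example, and the extension list only covers the formats named above; a real backup job would use whatever filter syntax the backup program provides.

```python
from pathlib import PurePath

# Extensions taken from the examples in the text; extend as needed.
COMPRESSED_SUFFIXES = {".mp3", ".wma", ".jpg", ".png", ".divx", ".mov", ".flv"}


def split_by_compression(paths, compressed_suffixes=COMPRESSED_SUFFIXES):
    """Return (already_compressed, compressible) lists of paths."""
    already_compressed, compressible = [], []
    for path in paths:
        if PurePath(path).suffix.lower() in compressed_suffixes:
            already_compressed.append(path)   # candidate for a mirror job
        else:
            compressible.append(path)         # candidate for a compressed job
    return already_compressed, compressible


mirror_job, archive_job = split_by_compression(
    ["music/song.MP3", "docs/report.txt", "photos/cat.jpg", "notes.md"])
```

The first list would go to an uncompressed mirror job and the second to a job that applies compression.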

Benefits and Disadvantages of Mirror Backups 

Mirror backups are much faster when working with compressed files.

Because mirrored files are not placed in a single archive file, there is less concern about corruption: damage to one file does not affect the others.

Since mirror backups generally don’t use compression, they can require large amounts of storage space, unless other techniques such as hard linking are also employed.

Synthetic Full Backups 

Synthetic Full Backup is a term you will see from time to time, and it should be understood that it is not a backup method like those above, but rather a technology that may be applied to one of the above methods to make full restores more efficient and require less down time.

Synthetics are generally only applied in server-client type backup systems. A client computer may perform a backup by any method (incremental, delta, etc.) and then transfer that backup to a server. At some point the server combines several of the individual backup archives to form a synthetic full backup. Because of this, after the initial full backup, the client machine only needs to perform backups of new or modified files; another full backup will never be necessary.
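
The server-side step can be sketched in Python as a simple merge. This is a toy model under stated assumptions: each backup is a dictionary mapping a file path to that file's contents, the function name synthesize_full is made up for the example, and deleted files are not tracked.

```python
def synthesize_full(full, incrementals):
    """Fold incrementals (oldest first) into a copy of the full backup."""
    synthetic = dict(full)
    for incremental in incrementals:
        synthetic.update(incremental)   # newer versions replace older ones
    return synthetic


full = {"a.txt": "v1", "b.txt": "v1"}
uploads = [{"a.txt": "v2"}, {"c.txt": "v1"}, {"a.txt": "v3"}]

synthetic = synthesize_full(full, uploads)
```

The result holds the latest version of every file, so the server can treat it as the new full backup and the client never has to send one again.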

The benefits of this approach are twofold. First, the backup speed of technologies like differentials won’t degrade over time because of the growing size of cumulative archives, since a synthetic will be made on a regular basis. Secondly, when a full restore needs to be made on a client machine, no reconstruction of files or file parts needs to be done. The reconstruction has already been performed by the server, allowing the client machine the fastest possible recovery time.

Hard Linked Backups (also Hardlink) 

Some backup software has the ability to employ multiple hard links to preserve space when you wish to save multiple full mirror style backups of the same set of files.

To understand what a hard link is, consider how files are stored on a hard drive. When you save a document file, the physical data can be written anywhere on the disk. Then the file system makes a reference or hard link to that physical data with the file name you specify. With some file systems it is possible to create more than one reference to that physical data. Using multiple hard links it is possible to assign any number of file names in different folders to the same physical data.

When using backup programs that support creating hard links to make several backups of the same files, the program will build hard links for all the files that have not changed. For example, if you create two copies of a folder that contains 100MB of data, they normally would end up using 200MB of space. With hard links they would only use 100MB of space. If you changed one 2MB file before you make the second copy using hard links, the two folders would consume 102MB of space [1]. The first folder would contain the original 2MB file while the second would contain the modified one.
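
A hard-linked backup along these lines can be sketched in Python with os.link from the standard library. The function name hardlink_backup and the folder layout are assumptions made for this example, and it only works when the previous and new backup folders are on the same volume of a file system that supports hard links.

```python
import filecmp
import os
import shutil
from pathlib import Path


def hardlink_backup(source, previous, target):
    """Mirror `source` into `target`, hard-linking every file that is
    unchanged since the `previous` backup and copying the rest.

    Returns (number of files linked, number of files copied).
    """
    source, previous, target = Path(source), Path(previous), Path(target)
    linked = copied = 0
    for path in sorted(source.rglob("*")):
        if not path.is_file():
            continue
        relative = path.relative_to(source)
        old = previous / relative
        new = target / relative
        new.parent.mkdir(parents=True, exist_ok=True)
        if old.is_file() and filecmp.cmp(path, old, shallow=False):
            os.link(old, new)         # second name for the same data on disk
            linked += 1
        else:
            shutil.copy2(path, new)   # new or modified file: store a real copy
            copied += 1
    return linked, copied
```

Only the copied files take up additional space; each linked file is just another name for data that the previous backup already holds.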

It should be mentioned that if you decide you want to delete one of the backups containing hard links, it is not a problem, as all the other hard links will be unaffected. The physical file on the disk is only deleted when all the hard links to it are removed. Also, hard links can only exist within the same volume (i.e., they cannot span different partitions or drives). On Windows based file systems, NTFS supports hard links, while FAT does not.

1. Windows Explorer does not report file space as one would expect when using hard links. If a 100MB file has two hard links, both links will be reported as consuming 100MB of space for a total of 200MB used. However, the space saved by the hard links is reflected in the amount of free space on the drive: only 100MB will have been consumed.