File Backup Terminology
8/8/2019 File Backup Terminology
http://slidepdf.com/reader/full/file-backup-terminology 1/7
File Backup Terminology: What do terms like “Differential” and “Incremental” mean, and how will they help me? Updated Tue, 10/19/2010 - 02:11 by Ritho
Introduction
Over the years, various backup technologies have been developed in an attempt to minimize the amount of space required to store backup files, and to reduce the bandwidth required to transfer those files to
remote locations. When faced with the different backup methods that many programs offer, it is easy to
become confused, since the terminology used is often not very clear, and it is hard to know the benefits or drawbacks of any one technology. This article is meant to be a simple guide to help cut down on the
frustration that many experience when they don’t know what certain terms mean, and how different options are best used.
Note: This is not, by far, an exhaustive glossary of backup terms. If you have questions about any terms
that are not covered below, please feel welcome to ask in the comment section, and we will attempt to
answer them for you.
Index
Common Backup Methods
Full Backups
Differential Backups
Incremental Backups
Delta Block-level Backups
Mirror Backups (Simple Copy)
Other Backup Methods and Techniques
Binary Patch
Synthetic
Hard Linked Backups
Discussion
Full Backups
This is just what it sounds like. This is a complete backup of all the data that a user selects when
configuring a backup job. The copied files are usually placed into a single archive file and compressed to help save space. Every time another full backup is made, all the files in the source are once again copied into an archive. The problem is that often there are only a few new or changed files, and continuously making
full backups will end up copying a lot of extra files that don’t really need to be backed up again. This
ends up using a lot of extra storage and wastes time. You can of course delete older backups to free up
space, but the time is still lost. The extra wear on hard disks or the amount of bandwidth that is used to make frequent full backups must be considered too.
It is a much better idea to make a full backup once in a while, and then figure out a way to only copy the
new or changed files on a more frequent basis. Several different methods, described below, have been created to implement this very thing.
Benefits and Disadvantages of Full Backups
Faster restore of all files -- When a full restore is necessary full backups are quick because you are only dealing with one
archive file.
Full backups are large and time consuming to make -- They are not well suited for regular backups such as those
performed hourly or daily.
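As a rough illustration of the full backup described in this section, the sketch below copies every file in a source folder into one compressed archive. It uses Python's standard tarfile module; the function name full_backup and the paths passed to it are hypothetical, and real backup programs use their own archive formats.

```python
import tarfile
from pathlib import Path


def full_backup(source_dir: str, archive_path: str) -> int:
    """Copy every file under source_dir into one gzip-compressed tar archive.

    Returns the number of regular files stored. The archive should live
    outside source_dir, otherwise it would try to back up itself.
    """
    source = Path(source_dir)
    count = 0
    with tarfile.open(archive_path, "w:gz") as archive:
        for path in sorted(source.rglob("*")):
            if path.is_file():
                # Store paths relative to the source folder so the archive
                # can be restored to any location.
                archive.add(path, arcname=str(path.relative_to(source)))
                count += 1
    return count
```

Every run copies everything again, whether or not anything changed, which is exactly the cost the methods below try to avoid.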
Differential Backups
After creating a full backup archive, this backup method helps to reduce the size of subsequent backups by doing a “differential” comparison between the current source files and the last full backup. All new and modified files are
copied to an archive alongside the full backup. The important thing to understand is that differential backups are cumulative. Each differential backup backs up everything that is different since the last full backup, even if those files are already included in a previous differential. Since differentials back up only
new or changed files, they are a faster backup method than creating a full backup each time. Differential backups are well suited for daily or less frequent backup strategies.
Benefits and Disadvantages of Differential Backups
Faster to restore than some other methods -- To do a full restore of all backup files, you only need the full backup and the last differential backup.
Differential Backups are more demanding on storage than some of the other backup methods, because of data
redundancy.
Each subsequent differential grows significantly until it becomes necessary to create a new full backup. Then the process
starts over.
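The cumulative behaviour of differentials can be sketched in a few lines. This is only an illustration: the file names, the timestamps, and the helper changed_since are made up, and modification time stands in for whatever change detection a real program uses.

```python
def changed_since(mtimes: dict, reference_time: float) -> list:
    """Return the files created or modified after reference_time."""
    return sorted(name for name, mtime in mtimes.items() if mtime > reference_time)


# Hypothetical timeline: the full backup ran at t=100.
last_full = 100.0

# State of the source on day 1 and day 2 (file name -> modification time).
day1 = {"a.txt": 50.0, "b.txt": 120.0, "c.txt": 60.0}
day2 = {"a.txt": 50.0, "b.txt": 120.0, "c.txt": 210.0, "d.txt": 220.0}

# A differential always compares against the last FULL backup, so b.txt is
# copied again on day 2 even though the day 1 differential already holds it.
diff_day1 = changed_since(day1, last_full)
diff_day2 = changed_since(day2, last_full)
print(diff_day1)  # ['b.txt']
print(diff_day2)  # ['b.txt', 'c.txt', 'd.txt']
```

Because the reference point never moves until the next full backup, each differential can only stay the same size or grow.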
Incremental Backups
This backup method works similarly to differential backups, but with one important difference that deals
with the high level of data redundancy in differentials. Each incremental contains only the files that were
created or modified since the most recent backup, whether that was the full backup or a previous incremental. Incrementals, while not containing as much redundant data as differentials, still carry some redundancy, since successive backups will again contain any
files that were already backed up but have since been modified in some way. Incremental backups are a good solution for more frequent backups such as those performed on an hourly basis.
Benefits and Disadvantages of Incremental Backups
Incremental backups can be completed more quickly than differential backups because there is less redundant data being
copied.
Incremental backups are smaller than differential backups.
The number of successive incrementals that can be made between full backups, while still remaining manageable, is
much greater than with differentials.
Incremental backups may take considerably longer to do complete restores than differential backups because all the
individual archives must be merged together one by one with the full backup.
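The trade-off listed above can be sketched as follows. The first part shows the moving reference point of an incremental; the second shows which archives a full restore needs under each method. All names, labels, and timestamps are hypothetical.

```python
def changed_since(mtimes: dict, reference_time: float) -> list:
    """Return the files created or modified after reference_time."""
    return sorted(name for name, mtime in mtimes.items() if mtime > reference_time)


def archives_for_restore(archives: list, method: str) -> list:
    """List the archives a complete restore must read.

    archives holds labels in creation order; archives[0] is the full backup.
    """
    full, later = archives[0], archives[1:]
    if method == "differential":
        return [full] + later[-1:]   # the full backup plus the newest differential
    if method == "incremental":
        return [full] + later        # the full backup plus every incremental, in order
    raise ValueError("unknown method: " + method)


# An incremental compares against the most recent backup of any kind.
day2 = {"a.txt": 50.0, "b.txt": 120.0, "c.txt": 210.0, "d.txt": 220.0}
last_backup = 150.0  # assumed time of the day 1 incremental
inc_day2 = changed_since(day2, last_backup)
print(inc_day2)  # ['c.txt', 'd.txt']

week = ["full", "mon", "tue", "wed"]
print(archives_for_restore(week, "differential"))  # ['full', 'wed']
print(archives_for_restore(week, "incremental"))   # ['full', 'mon', 'tue', 'wed']
```

The incremental run is smaller, but the restore has to walk the whole chain, and a single missing or damaged link breaks everything after it.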
Delta or Block-Level Backups
The term “delta” is often used rather flexibly in reference to different backup technologies, but when
paired with other terms as in “Delta Backup,” “Delta Block Backup,” and “Delta-Style Backup,” they
generally refer to the same basic backup method. Deltas are best described as block-level technology,
whereas incrementals and differentials are file-level technologies. It is important to note that delta block techniques are only applied to modified files, not new files. New files are of course just backed up in the normal fashion.
File-level backups will back up a changed file in its entirety, even if it has only changed slightly. While this
may not be much of a problem for small text documents, it can quickly become a problem with very large
files like databases. Take for example email clients like Outlook, which save all received email and
attachments in single-file databases. Even if only one email has been received, the entire database file has changed and is backed up again. Since these databases can easily grow to several hundred
megabytes in size, you once again end up with a lot of data redundancy.
Delta backups deal with this problem by backing up only the parts of files which have changed instead of the
whole file. Each changed file is broken down into fixed-size blocks, and those blocks are compared with the original file. (The block size depends on the particular program, or perhaps on a user-chosen setting. Block sizes generally range between 1 and 32 kilobytes.) Only those blocks that
contain differences are extracted and backed up. Deltas can be confusing because they can be applied in
a couple of different ways. There are differential deltas and incremental deltas. These work on the same principle as the differential and incremental file backups explained above, but at a much more granular level. Similarly, each type of delta inherits the same advantages and disadvantages.
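The block comparison described above can be sketched as follows. This is a simplified illustration, not the method of any particular product: the 4 KB block size, the SHA-256 hashes, and the names make_delta and apply_delta are all assumptions, and a real program would normally compare against stored block hashes rather than keep the old file at hand.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed; the article cites 1 to 32 KB as the usual range


def _digest(block: bytes) -> bytes:
    return hashlib.sha256(block).digest()


def make_delta(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> dict:
    """Keep only the blocks of new that differ from the same block of old."""
    old_hashes = [_digest(old[i:i + block_size]) for i in range(0, len(old), block_size)]
    changed = {}
    for index, start in enumerate(range(0, len(new), block_size)):
        block = new[start:start + block_size]
        if index >= len(old_hashes) or _digest(block) != old_hashes[index]:
            changed[index] = block
    return {"length": len(new), "block_size": block_size, "blocks": changed}


def apply_delta(old: bytes, delta: dict) -> bytes:
    """Rebuild the new file from the old file plus the stored blocks."""
    size = delta["block_size"]
    # Start from the old content, cut or padded to the new length.
    out = bytearray(old[:delta["length"]].ljust(delta["length"], b"\0"))
    for index, block in delta["blocks"].items():
        out[index * size:index * size + len(block)] = block
    return bytes(out)


# A 10,000 byte file in which a single byte changes.
old = bytes(10000)
new = bytearray(old)
new[5000] = 1
new = bytes(new)

delta = make_delta(old, new)
print(sorted(delta["blocks"]))         # [1]
print(apply_delta(old, delta) == new)  # True
```

Only one 4 KB block is stored instead of the whole file. Note that fixed-size blocks cope poorly with data inserted in the middle of a file, since every later block shifts and appears changed.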
Deltas are especially advantageous for use in technologies where files are backed up immediately after they are created or modified. This is known as real-time backup or continuous data protection. Deltas are
also very beneficial when used to back up files over networks with limited bandwidth or to remote servers such as online storage.
Benefits and Disadvantages of Delta Style Backups
Delta Backups are extremely fast because of the small amount of data being transferred.
Deltas produce much less redundancy, and the backups are only a fraction of the size of those produced by incremental or
differential backups. This dramatically reduces the demands on storage and bandwidth.
Deltas of modified files do not produce whole files in the backup, and thus restores absolutely depend on the program
that created them to do the restoration.
Deltas are slower to restore because the individual files must be reconstructed from their various parts.
Binary Patch Backups (FastBit)
Binary patch technology was originally developed as a way for software developers to easily update their
programs on customers’ computers over the Internet by sending “patches” that would replace the parts of files that
needed modification. Recently it has started to be adapted into backup technologies as well. The most
relevant example is a backup technology called FastBit™, which is employed by a number of online storage vendors.
Binary Patch Backups work very similarly to Deltas, the primary difference being that they are even more granular. Deltas work on a block level, while binary patches work on the, well, binary level. Because
Deltas back up only the modified parts of files in fixed-size blocks, part of a block may contain some
unchanged data. Binary patches avoid this by copying only the actual bytes of the binary code that have changed.
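To make the byte-level idea concrete, here is a minimal sketch that records only the bytes that differ, plus instructions to copy the unchanged ranges from the old file. It is not how FastBit or any other product works (no details of those are given here); it borrows Python's general-purpose difflib.SequenceMatcher as a stand-in, which is far too slow for large files, and the names make_patch and apply_patch are hypothetical.

```python
from difflib import SequenceMatcher


def make_patch(old: bytes, new: bytes) -> list:
    """Describe new as copies from old plus the literal bytes that changed."""
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    patch = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            patch.append(("copy", i1, i2))      # reuse old[i1:i2]
        elif tag in ("replace", "insert"):
            patch.append(("data", new[j1:j2]))  # store only the new bytes
        # "delete" needs no entry: those old bytes are simply not copied.
    return patch


def apply_patch(old: bytes, patch: list) -> bytes:
    """Rebuild the new file from the old file and the patch."""
    out = bytearray()
    for op in patch:
        if op[0] == "copy":
            out += old[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


old = b"The quick brown fox"
new = b"The quick red fox jumps"
patch = make_patch(old, new)
stored = sum(len(op[1]) for op in patch if op[0] == "data")
print(apply_patch(old, patch) == new)  # True
print(stored < len(new))               # True
```

Unlike the fixed-size blocks of a delta, the copy ranges here can start and end at any byte, so no unchanged data rides along with a change.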
Benefits and Disadvantages of Binary Patch Backups
Note: Due to the very limited application of binary patching technology in actual backup software, as well as
very sparse information on the subject, the author is very uncertain about the benefits and/or limitations that may be inherent to the technique.
Virtually eliminates all data redundancy, and produces the smallest backups possible with current technologies.
It is even less bandwidth intensive than deltas.
The production of the actual patch may be more demanding on system resources and more time consuming than deltas,
although the loss may be regained in bandwidth and transfer costs.
No information about how file reconstruction is handled and how efficient it is.
Mirror Backups
Most backup programs will list mirror backups as an alternative to full, differential, or incremental
backups, etc. Some programs use an alternate term for mirrors, such as “simple copy.” Mirror backups
are basically the simplest type of backup. There are no real backup technologies being employed when
making a mirror style backup, only copy technology. If you copy and paste a folder from one drive to another you have created a mirror backup of that folder. The mirrored files generally exist in the same
state they did in the source, not compressed into archives like with a full backup. (Although some programs support compressing each file individually and adding encryption.)
When to Use Mirrored Backups
Mirror style backups without compression are good to use when you are backing up a lot of files that already have compression applied to them. For example, music files in mp3 or wma format, images in jpg or png
format, videos in divx, mov, or flv format, and most program install or setup files are already
compressed. If you include these files in a normal backup that applies compression, you will often notice that it is very slow and that you gain very little extra compression by doing so. It is best to set up separate backup jobs for compressed files and non-compressed files. If your backup program supports include and
exclude filters, they can be used to automatically select or deselect the compressed files, respectively.
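The include and exclude filters mentioned above usually boil down to matching file names against patterns. The sketch below splits a file list into an "already compressed" job and an "everything else" job; the pattern list is only an example built from the formats named in this section, and split_by_compression is a hypothetical helper, not a feature of any specific program.

```python
from fnmatch import fnmatch

# Example patterns only; extend to suit the files you actually have.
COMPRESSED_PATTERNS = ["*.mp3", "*.wma", "*.jpg", "*.png", "*.mov", "*.flv"]


def split_by_compression(names, patterns=COMPRESSED_PATTERNS):
    """Return (compressed, other) lists of file names."""
    compressed, other = [], []
    for name in names:
        # Lower-case the name so that SONG.MP3 matches *.mp3 on every platform.
        if any(fnmatch(name.lower(), pattern) for pattern in patterns):
            compressed.append(name)
        else:
            other.append(name)
    return compressed, other


mirror_job, archive_job = split_by_compression(
    ["song.mp3", "Holiday.JPG", "report.doc", "notes.txt", "clip.flv"]
)
print(mirror_job)   # ['song.mp3', 'Holiday.JPG', 'clip.flv']
print(archive_job)  # ['report.doc', 'notes.txt']
```

The first list would go to a mirror job without compression, the second to a normal compressed backup.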
Benefits and Disadvantages of Mirror Backups
Mirror backups are much faster when working with compressed files.
Because mirrored files are not placed in single archive files there is less concern about corruption.
Since mirror backups generally don’t use compression they can require large amounts of storage space, unless other
techniques such as hard linking are also employed.
Synthetic Full Backups
Synthetic Full Backup is a term you will see from time to time, and it should be understood that it is not a
backup method like those above, but rather a technology that may be applied to one of the above methods to make full restores more efficient and require less down time.
Synthetics are generally only applied in client-server type backup systems. A client computer may
perform a backup by any method (incremental, delta, etc.) and then transfer that backup to a server. At some point the server combines several of the individual backup archives to form a synthetic full backup.
Because of this, after the initial full backup, the client machine only needs to back up new or modified files; another full backup will never be necessary.
The benefits of this approach are twofold. First, the backup speed of technologies like differentials won’t
degrade over time because of the growing size of cumulative archives, since a synthetic will be made on a regular basis. Secondly, when a full restore needs to be made on a client machine, no reconstruction of
files or file parts needs to be done. The reconstruction has already been performed by the server, allowing the client machine the fastest possible recovery time.
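The server-side merge can be pictured with a small sketch. Here each archive is modelled as a dictionary mapping a file path to its content, which is purely an assumption for illustration; real systems merge archive files or block stores, and also have to track deleted files, which this sketch ignores.

```python
def synthesize_full(full: dict, incrementals: list) -> dict:
    """Merge a full backup with later incrementals into a new full backup.

    incrementals must be ordered oldest first, so that the newest version
    of each file wins.
    """
    synthetic = dict(full)  # copy, leaving the original full backup untouched
    for incremental in incrementals:
        synthetic.update(incremental)
    return synthetic


full = {"a.txt": "v1", "b.txt": "v1"}
incrementals = [
    {"b.txt": "v2"},                 # first incremental: b.txt modified
    {"b.txt": "v3", "c.txt": "v1"},  # second: b.txt modified again, c.txt new
]
print(synthesize_full(full, incrementals))
# {'a.txt': 'v1', 'b.txt': 'v3', 'c.txt': 'v1'}
```

The result is what a fresh full backup would have contained, but the client never had to send the unchanged a.txt a second time.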
Hard Linked Backups (also Hardlink)
Some backup software has the ability to employ multiple hard links to preserve space when you wish to
save multiple full mirror style backups of the same set of files.
To understand what a hard link is, consider how files are stored on a hard drive. When you save a document file, the physical data can be written anywhere on the disk. Then the file system makes a
reference, or hard link, to that physical data with the file name you specify. With some file systems it is
possible to create more than one reference to that physical data. Using multiple hard links it is possible to assign any number of file names in different folders to the same physical data.
When using backup programs that support creating hard links to make several backups of the same files, the program will build hard links for all the files that have not changed. For example, if you create two
copies of a folder that contains 100MB of data, they normally would end up using 200MB of space. With
hard links they would only use 100MB of space. If you changed one 2MB file before making the second copy using hard links, the two folders would consume 102MB of space (see footnote 1). The first folder would contain the original 2MB file while the second would contain the modified one.
It should be mentioned that if you decide to delete one of the backups containing hard links, it is
not a problem, as all the other hard links will be unaffected. The physical file on the disk is only deleted when all the hard links to it are removed. Also, hard links can only exist within a single volume (i.e. they cannot span different partitions or drives). On Windows-based file systems, NTFS supports hard links, while FAT does not.
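A hard linked backup of the kind described in this section can be sketched with Python's standard library. The function name snapshot and the folder arguments are hypothetical, the comparison reads both files in full for simplicity, and the code assumes that the previous and new backup folders sit on the same volume of a file system that supports hard links.

```python
import filecmp
import os
import shutil


def snapshot(source: str, previous: str, target: str) -> None:
    """Create target as a mirror of source.

    Files that are identical to the copy in the previous backup are hard
    linked to it instead of being copied, so they use no extra space.
    """
    for dirpath, _dirnames, filenames in os.walk(source):
        rel = os.path.relpath(dirpath, source)
        os.makedirs(os.path.normpath(os.path.join(target, rel)), exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            old = os.path.normpath(os.path.join(previous, rel, name))
            new = os.path.normpath(os.path.join(target, rel, name))
            if os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                os.link(old, new)       # unchanged: a second name for the same data
            else:
                shutil.copy2(src, new)  # new or modified: a real copy
```

After a run, os.path.samefile() returns True for an unchanged file and its counterpart in the previous backup, and os.stat(path).st_nlink reports how many names point at the data. Deleting the older backup folder simply lowers that count; the data stays on disk until the last name is removed.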
1. Windows Explorer does not report file space as one would expect when using hard links. If a 100MB file has two hard links, both links will be reported as
consuming 100MB of space, for a total of 200MB used. However, the space saved by the hard links is reflected in the amount of free space on the drive: only
100MB will have been consumed.