Introduction to Storage Deduplication for the SQL Server DBA


Transcript of Introduction to Storage Deduplication for the SQL Server DBA

Page 1: Introduction to Storage Deduplication for the SQL Server DBA

Global Marketing

Introduction to Storage Deduplication for the SQL Server DBA

SQLDBApros

SQL Server DBA Professionals

Page 2: Introduction to Storage Deduplication for the SQL Server DBA

Introduction to deduplication

SQL Server DBAs across the industry are increasingly facing requests to place database backups on deduplication storage, and they are also weighing whether compressing backups that are written to deduplication storage is a good idea.

Page 3: Introduction to Storage Deduplication for the SQL Server DBA

Let us explain…

Deduplication is not a new term; it has circulated widely over the past few years as major companies began releasing deduplicating storage devices. Deduplication simply means not having to save the same data repeatedly.

Page 4: Introduction to Storage Deduplication for the SQL Server DBA

Imagine this…

Imagine the original data as a set of separate files. Those same files multiply as multiple users save copies to their home directories, creating an excess of duplicate files that all contain the same information. The object of deduplication is to store unique information once, rather than repeatedly.

In this example, when a copy of a file is already saved, subsequent copies simply point to the original data, rather than saving the same information again and again.
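
As a rough sketch of that idea (in Python, purely for illustration and not any vendor's actual implementation), a file-level deduplication store could look like this: every copy after the first is just a pointer to data that is already saved.

import hashlib

store = {}      # content hash -> file data (each unique file body kept once)
catalog = {}    # path -> content hash (every copy is just a pointer)

def save(path, data):
    digest = hashlib.sha256(data).hexdigest()
    if digest not in store:          # first copy: actually keep the data
        store[digest] = data
    catalog[path] = digest           # later copies: only a pointer is recorded

save("/home/ann/report.docx", b"quarterly numbers for review")
save("/home/bob/report.docx", b"quarterly numbers for review")   # nothing new is stored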

Download Your Free Trial of our Backup Compression Tool

Page 5: Introduction to Storage Deduplication for the SQL Server DBA

Chunking

Files enter the process intact, just as they would on any other storage. They are then deduplicated and compressed.

On many appliances, data is processed in real time. Unlike the simple file example above, deduplication appliances work at a finer level than whole files.

Most deduplication appliances offer an architecture wherein incoming data is deduplicated inline (before it hits the disk) and then the unique data is compressed and stored.

Sophisticated algorithms break files up into smaller pieces; this is called “chunking.” Most chunking algorithms offer “variable block” processing and “sliding windows,” which allow for changes within files without much loss of deduplication.
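
To make “variable block” chunking a bit more concrete, here is a toy content-defined chunker in Python. It is only a sketch: real appliances use far stronger rolling hashes (such as Rabin fingerprints), but the principle is the same, in that chunk boundaries are decided by the content inside a sliding window rather than by fixed offsets.

import os

WINDOW = 16     # bytes in the sliding window
MASK = 0x3F     # cut when the low 6 bits of the window hash are zero (~64-byte chunks)

def window_hash(win):
    h = 0
    for b in win:                        # simple polynomial hash of the window
        h = (h * 31 + b) & 0xFFFFFFFF
    return h

def chunk(data):
    chunks, start = [], 0
    for i in range(WINDOW - 1, len(data)):
        if window_hash(data[i - WINDOW + 1:i + 1]) & MASK == 0:
            chunks.append(data[start:i + 1])   # the content, not the position, sets the cut
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])            # trailing remainder
    return chunks

original = os.urandom(4000)
edited = original[:2000] + b"ONE-LINE CHANGE" + original[2000:]
shared = set(chunk(original)) & set(chunk(edited))
print(len(shared), "chunks are unchanged despite the edit")

Because the boundaries depend on the data itself, an insertion early in the stream only disturbs the chunks around the change; the chunker re-synchronizes and the remaining chunks still match.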

Download Your Free Trial of our Backup Compression Tool

Page 6: Introduction to Storage Deduplication for the SQL Server DBA

Fingerprints

For instance, if a one-line change is made in one of several nearly identical files, sliding windows and variable block sizes isolate the changed region into its own small chunks and store just that changed information, rather than treating the one-line change as if the entire file were new.

Each chunk of data gets hashed, and you can think of that hash as a fingerprint. If the system encounters a piece of data bearing a fingerprint it recognizes, it merely updates the file map and reference count without having to save that data again.

Unique data is saved and compressed.
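
Here is a minimal sketch, again in Python and again only illustrative, of how fingerprints, reference counts, and a file map might fit together. The names and layout are assumptions, not a real appliance’s internals.

import hashlib
import zlib

store = {}       # fingerprint -> compressed unique chunk
refcount = {}    # fingerprint -> number of references to that chunk
file_map = {}    # file name -> ordered list of fingerprints

def ingest(name, chunks):
    fps = []
    for c in chunks:
        fp = hashlib.sha256(c).hexdigest()   # the chunk's "fingerprint"
        if fp in store:
            refcount[fp] += 1                # seen before: just bump the count
        else:
            store[fp] = zlib.compress(c)     # unique data: compress and keep it
            refcount[fp] = 1
        fps.append(fp)
    file_map[name] = fps                     # the file is now just a map of fingerprints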

Download Your Free Trial of our Backup Compression Tool

Page 7: Introduction to Storage Deduplication for the SQL Server DBA

Reduce storage needs

Both target-side and source-side (at the server itself) deduplication help reduce the demand for storage by eliminating redundancies in the data; source-side deduplication also reduces the amount of data being sent across the network.

In source-side deduplication, vendor APIs can quickly query a storage device to see if the chunks of data already reside on the storage rather than sending all of the bits across the network for the storage device to process.
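
As a hedged illustration of that flow (the class and method names below are hypothetical stand-ins, not an actual vendor API), source-side deduplication boils down to asking the target which fingerprints it already has and sending only the rest.

import hashlib

class FakeTarget:
    """In-memory stand-in for a deduplicating appliance (illustrative only)."""
    def __init__(self):
        self.chunks, self.file_maps = {}, {}
    def query_missing(self, fps):
        return {fp for fp in fps if fp not in self.chunks}   # which fingerprints are new?
    def put_chunk(self, fp, data):
        self.chunks[fp] = data
    def put_file_map(self, name, fps):
        self.file_maps[name] = fps

def send_backup(name, chunks, target):
    fps = [hashlib.sha256(c).hexdigest() for c in chunks]
    missing = target.query_missing(fps)      # one small query instead of sending everything
    for fp, c in zip(fps, chunks):
        if fp in missing:
            target.put_chunk(fp, c)          # only unseen chunks cross the network
    target.put_file_map(name, fps)           # the file map itself is tiny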

Download Your Free Trial of our Backup Compression Tool

Page 8: Introduction to Storage Deduplication for the SQL Server DBA

Replication

Replication is another key feature.

The replication features found in today’s deduplication appliances are a boon for DBAs: they make it easy to replicate backup data from one data center to another by moving only the needed deduplicated and compressed chunks.
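
Conceptually (and only conceptually; this is a sketch, not how any particular appliance is built), replication reduces to shipping the set difference of fingerprints between the two sites.

def replicate(local_store, remote_store):
    # Both stores map fingerprint -> compressed chunk, as in the earlier sketch.
    missing = set(local_store) - set(remote_store)   # chunks the other data center lacks
    for fp in missing:
        remote_store[fp] = local_store[fp]           # only these cross the WAN
    return len(missing)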

Download Your Free Trial of our Backup Compression Tool

Page 9: Introduction to Storage Deduplication for the SQL Server DBA

Rehydration

Deduplication does mean that the file will eventually have to be “rehydrated” (put back together) if it’s needed.

Bits are read, decompressed, and reassembled.

Read speed may be slower than on non-deduplicated storage because of the decompression, reassembly, and transmission over the network that rehydration requires.

Additionally, fragmentation on the storage device can cause additional rehydration overhead.
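
Reusing the illustrative file_map and store from the fingerprint sketch above, rehydration is simply the reverse walk: look up each fingerprint, decompress the chunk, and stitch the pieces back together.

import zlib

def rehydrate(name, file_map, store):
    # Walk the file's ordered chunk map, decompress each unique chunk,
    # and concatenate the pieces back into the original byte stream.
    return b"".join(zlib.decompress(store[fp]) for fp in file_map[name])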

Download Your Free Trial of our Backup Compression Tool

Page 10: Introduction to Storage Deduplication for the SQL Server DBA

Learn More

Backup Compression Tool – Free Trial
Download Compression Whitepaper
Follow us on Twitter @SQLDBApros