Characterization of Incremental Data Changes for Efficient Data Protection
![Page 1: Characterization of Incremental Data Changes for Efficient Data Protection](https://reader035.fdocuments.us/reader035/viewer/2022062310/56816674550346895dda0d00/html5/thumbnails/1.jpg)
© Copyright 2013 EMC Corporation. All rights reserved.
Characterization of Incremental Data Changes for Efficient Data Protection
Hyong Shim, Philip Shilane, & Windsor Hsu
Backup Recovery Systems Division, EMC Corporation
![Page 2: Characterization of Incremental Data Changes for Efficient Data Protection](https://reader035.fdocuments.us/reader035/viewer/2022062310/56816674550346895dda0d00/html5/thumbnails/2.jpg)
Data Protection Environment
[Diagram: application servers and virtual machines connect over a SAN or LAN to primary storage (high I/O per sec., medium capacity), which connects over a WAN to data protection storage (large capacity, medium I/O per sec.).]
![Page 3: Characterization of Incremental Data Changes for Efficient Data Protection](https://reader035.fdocuments.us/reader035/viewer/2022062310/56816674550346895dda0d00/html5/thumbnails/3.jpg)
Contributions
- Detailed analysis of data change characteristics from enterprise customers
- Design for replication snapshots to lower overheads on primary storage
- Evaluation of overheads on data protection storage
- Rules-of-thumb for storage engineers and administrators
![Page 4: Characterization of Incremental Data Changes for Efficient Data Protection](https://reader035.fdocuments.us/reader035/viewer/2022062310/56816674550346895dda0d00/html5/thumbnails/4.jpg)
EMC Symmetrix VMAX Traces
| Trace Set | # Volumes | # Storage Systems | Duration (hrs) | Estimated Capacity (GB) |
|---|---|---|---|---|
| 1hr_1Wrt | 109,263 | 125 | 30.4 [78.3] | 71 [203] |
| 1hr_1GBWrt | 16,100 | 120 | 7.7 [6.7] | 132 [262] |
| 24hr_1GBWrt | 508 | 13 | 24.4 [1.2] | 318 [439] |
Collected from enterprise customer sites
![Page 5: Characterization of Incremental Data Changes for Efficient Data Protection](https://reader035.fdocuments.us/reader035/viewer/2022062310/56816674550346895dda0d00/html5/thumbnails/5.jpg)
Capacity and Write Footprint
- Analysis for 1hr_1GBWrt
- Not collected: applications using each volume
![Page 6: Characterization of Incremental Data Changes for Efficient Data Protection](https://reader035.fdocuments.us/reader035/viewer/2022062310/56816674550346895dda0d00/html5/thumbnails/6.jpg)
I/O Properties

| Trace Set | # Write reqs (1000s) | Write size (GB) | # Read reqs (1000s) | Read size (GB) |
|---|---|---|---|---|
| 1hr_1Wrt | 72 [510] | 2 [31] | 167 [1963] | 5 [66] |
| 1hr_1GBWrt | 429 [1270] | 11 [80] | 796 [4987] | 25 [166] |
| 24hr_1GBWrt | 1803 [4839] | 51 [338] | 7824 [23875] | 242 [763] |

- 1.9-4.3X more read I/Os than write I/Os
- 2.3-4.7X more GB read than written
- High variability
- More analysis in the paper
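The read-to-write ratios quoted on this slide can be checked against the table. The sketch below assumes the ratios are taken between the first (unbracketed) value in each cell; the dictionary layout and function name are illustrative, not from the paper.

```python
# First (unbracketed) value from each cell of the I/O Properties table:
# (write reqs in 1000s, write GB, read reqs in 1000s, read GB)
TRACES = {
    "1hr_1Wrt": (72, 2, 167, 5),
    "1hr_1GBWrt": (429, 11, 796, 25),
    "24hr_1GBWrt": (1803, 51, 7824, 242),
}


def read_write_ratios(traces):
    """Return per-trace (request ratio, byte ratio) of reads to writes."""
    ratios = {}
    for name, (w_reqs, w_gb, r_reqs, r_gb) in traces.items():
        ratios[name] = (r_reqs / w_reqs, r_gb / w_gb)
    return ratios


ratios = read_write_ratios(TRACES)
for name, (req_ratio, gb_ratio) in ratios.items():
    print(f"{name}: {req_ratio:.1f}X more read I/Os, {gb_ratio:.1f}X more GB read")
```

Under that assumption the smallest and largest ratios across the three trace sets line up with the 1.9-4.3X and 2.3-4.7X ranges stated above.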
![Page 7: Characterization of Incremental Data Changes for Efficient Data Protection](https://reader035.fdocuments.us/reader035/viewer/2022062310/56816674550346895dda0d00/html5/thumbnails/7.jpg)
Sequential vs. Random Write I/O
We measure how much data are written, on average, after seeking to a non-consecutive sector.
Selected most sequential and most random for analysis
[Diagram: trace timeline (w = write I/O, r = read I/O) mapped onto the storage volume.]
Sequential write I/O example: (5 + 1 + 3) / 3 = 3
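One way to compute this metric from a write trace is sketched below. It assumes each write is a (start sector, sector count) pair in trace order, that reads are ignored, and that a write is "consecutive" when it starts exactly where the previous write ended; the function name and input format are hypothetical.

```python
def mean_sequential_run(writes):
    """Average sectors written per run of consecutive writes.

    writes: list of (start_sector, sector_count) tuples in trace order.
    A new run starts whenever a write does not begin at the sector
    immediately after the previous write (that is, after a seek).
    """
    runs = []
    next_sector = None
    for start, count in writes:
        if start == next_sector:
            runs[-1] += count      # continues the current sequential run
        else:
            runs.append(count)     # seek to a non-consecutive sector
        next_sector = start + count
    return sum(runs) / len(runs) if runs else 0.0


# Three runs of 5, 1 and 3 sectors, mirroring the slide's arithmetic.
example = [(0, 2), (2, 3), (100, 1), (50, 1), (51, 2)]
print(mean_sequential_run(example))
```

Volumes with a high value are mostly sequential writers, while a value near the typical request size indicates mostly random writes.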
![Page 8: Characterization of Incremental Data Changes for Efficient Data Protection](https://reader035.fdocuments.us/reader035/viewer/2022062310/56816674550346895dda0d00/html5/thumbnails/8.jpg)
Trace Analysis Methodology
[Diagram: trace timeline (w = write I/O, r = read I/O) over storage volume sectors, which are grouped into blocks. The timeline is divided into Replication Interval 1 and Replication Interval 2, each followed by a transfer period.]
- Create a snapshot to protect block data
- The transfer period may require snapshot storage and I/O
![Page 9: Characterization of Incremental Data Changes for Efficient Data Protection](https://reader035.fdocuments.us/reader035/viewer/2022062310/56816674550346895dda0d00/html5/thumbnails/9.jpg)
Replication Snapshot
[Diagram: storage volume state before the transfer takes place, showing blocks 0 to 4 on a trace timeline (w = write I/O), with a legend marking modified blocks to be transferred.]
- Goal: create a snapshot technique that is integrated with replication and decreases overheads on primary storage.
- Change block tracking records modified blocks for the next replication interval, possibly with a bit vector.
- A snapshot has to preserve block values against overwrites.
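A bit-vector change tracker of the kind mentioned above might look like the following sketch. The class name, method names and the 4 KB default block size are assumptions made for illustration; the slides do not describe an implementation.

```python
class ChangeTracker:
    """Tracks which fixed-size blocks were written, one bit per block."""

    def __init__(self, num_blocks, block_size=4096):
        # block_size of 4 KB is an assumption for this sketch.
        self.num_blocks = num_blocks
        self.block_size = block_size
        self.bits = bytearray((num_blocks + 7) // 8)

    def record_write(self, offset, length):
        """Mark every block touched by a write of `length` bytes at `offset`."""
        if length <= 0:
            return
        first = offset // self.block_size
        last = (offset + length - 1) // self.block_size
        for blk in range(first, last + 1):
            self.bits[blk // 8] |= 1 << (blk % 8)

    def drain(self):
        """Return modified block numbers and reset for the next interval."""
        modified = [
            blk for blk in range(self.num_blocks)
            if self.bits[blk // 8] & (1 << (blk % 8))
        ]
        self.bits = bytearray(len(self.bits))
        return modified


tracker = ChangeTracker(num_blocks=16)
tracker.record_write(offset=2 * 4096 + 100, length=5000)  # spans blocks 2 and 3
tracker.record_write(offset=0, length=1)                  # touches block 0
pending = tracker.drain()
print(pending)
```

Calling `drain` at the end of a replication interval yields the set of blocks to transfer, and the cleared vector then accumulates changes for the following interval.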
![Page 10: Characterization of Incremental Data Changes for Efficient Data Protection](https://reader035.fdocuments.us/reader035/viewer/2022062310/56816674550346895dda0d00/html5/thumbnails/10.jpg)
Replication Snapshot
Baseline Snapshot: All writes cause copy-on-write
[Diagram: storage volume state before the transfer takes place (blocks 0 to 4, with modified blocks to be transferred marked), a trace timeline with three writes (w) arriving while the transfer is in progress, and the resulting snapshot area under Baseline.]
![Page 11: Characterization of Incremental Data Changes for Efficient Data Protection](https://reader035.fdocuments.us/reader035/viewer/2022062310/56816674550346895dda0d00/html5/thumbnails/11.jpg)
Replication Snapshot
Changed Block Replication Snapshot (CB): Only writes to tracked blocks cause copy-on-write
[Diagram: blocks 0 to 4, three writes (w) arriving while the transfer is in progress, and the resulting snapshot area under Baseline and CB.]
![Page 12: Characterization of Incremental Data Changes for Efficient Data Protection](https://reader035.fdocuments.us/reader035/viewer/2022062310/56816674550346895dda0d00/html5/thumbnails/12.jpg)
Replication Snapshot
Changed Block with Early Release Replication Snapshot (CBER): Only writes to tracked blocks cause copy-on-write, and blocks are released once transferred
[Diagram: blocks 0 to 4, three writes (w) arriving while the transfer is in progress, and the resulting snapshot area under Baseline, CB and CBER.]
![Page 13: Characterization of Incremental Data Changes for Efficient Data Protection](https://reader035.fdocuments.us/reader035/viewer/2022062310/56816674550346895dda0d00/html5/thumbnails/13.jpg)
Replication Snapshot
[Diagram: blocks 0 to 4 with modified blocks to be transferred marked, three writes (w), and the resulting snapshot area under Baseline, CB and CBER.]
- Baseline Snapshot: All writes cause copy-on-write
- Changed Block Replication Snapshot (CB): Only writes to tracked blocks cause copy-on-write
- Changed Block with Early Release Replication Snapshot (CBER): Only writes to tracked blocks cause copy-on-write, and blocks are released once transferred
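The difference between the three policies can be illustrated with a toy simulation that counts copy-on-write operations during one transfer period. This is a sketch under several assumptions that are not stated on the slides: the transfer sends one tracked block per tick in ascending block order, a block is copied at most once per snapshot, and the function name and input format are hypothetical.

```python
def count_copies(tracked, writes, policy):
    """Count copy-on-write operations during one transfer period.

    tracked: set of blocks modified in the previous interval (pending transfer).
    writes:  list of (tick, block) writes arriving while the transfer runs.
    policy:  "baseline", "cb" or "cber".
    Assumption: the i-th tracked block (ascending order) finishes
    transferring at the end of tick i.
    """
    if policy not in ("baseline", "cb", "cber"):
        raise ValueError(f"unknown policy: {policy}")
    sent_at = {blk: tick for tick, blk in enumerate(sorted(tracked))}
    copied = set()
    for tick, blk in writes:
        if blk in copied:
            continue  # original value is already preserved
        if policy == "baseline":
            needs_copy = True
        elif policy == "cb":
            needs_copy = blk in tracked
        else:  # cber: tracked blocks are released once transferred
            needs_copy = blk in tracked and tick <= sent_at[blk]
        if needs_copy:
            copied.add(blk)
    return len(copied)


tracked_blocks = {0, 2, 3}
incoming = [(1, 0), (1, 1), (1, 3)]  # block 0 already sent, 1 untracked, 3 pending
results = {p: count_copies(tracked_blocks, incoming, p)
           for p in ("baseline", "cb", "cber")}
print(results)
```

The sketch only counts copies. Early release would also free snapshot space as blocks are transferred, which this model does not track.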
![Page 14: Characterization of Incremental Data Changes for Efficient Data Protection](https://reader035.fdocuments.us/reader035/viewer/2022062310/56816674550346895dda0d00/html5/thumbnails/14.jpg)
Snapshot Storage Overheads
Rule-of-thumb: Over-provision primary capacity by 8% for snapshots
![Page 15: Characterization of Incremental Data Changes for Efficient Data Protection](https://reader035.fdocuments.us/reader035/viewer/2022062310/56816674550346895dda0d00/html5/thumbnails/15.jpg)
Snapshot I/O Overheads
Rule-of-thumb: Over-provision primary I/O by 100% to support copy-on-write related write-amplification
![Page 16: Characterization of Incremental Data Changes for Efficient Data Protection](https://reader035.fdocuments.us/reader035/viewer/2022062310/56816674550346895dda0d00/html5/thumbnails/16.jpg)
Snapshot I/O Overheads
Rule-of-thumb: Over-provision primary I/O by 100% to support copy-on-write related write-amplification
![Page 17: Characterization of Incremental Data Changes for Efficient Data Protection](https://reader035.fdocuments.us/reader035/viewer/2022062310/56816674550346895dda0d00/html5/thumbnails/17.jpg)
Transfer Size to Protection Storage
Rule-of-thumb: 40% of written bytes are transferred to protection storage
![Page 18: Characterization of Incremental Data Changes for Efficient Data Protection](https://reader035.fdocuments.us/reader035/viewer/2022062310/56816674550346895dda0d00/html5/thumbnails/18.jpg)
IOPS Requirements for Protection Storage
Rule-of-thumb: Protection storage must support 20% of the I/O per second capabilities of primary storage
![Page 19: Characterization of Incremental Data Changes for Efficient Data Protection](https://reader035.fdocuments.us/reader035/viewer/2022062310/56816674550346895dda0d00/html5/thumbnails/19.jpg)
Related Work
- Trace analysis
  - Numerous publications
  - Most closely related is Patterson [2002]
- Snapshots
  - Common paradigm for storage but rarely integrated with incremental transfer techniques
  - Storage overheads: Azagury [2002] and Shah [2006]
- Synchronous Mirroring
  - Effective when change rates are low and geographic distance is small
  - We are focused on periodic, asynchronous replication
![Page 20: Characterization of Incremental Data Changes for Efficient Data Protection](https://reader035.fdocuments.us/reader035/viewer/2022062310/56816674550346895dda0d00/html5/thumbnails/20.jpg)
Conclusion
[Diagram: application servers connect over a SAN or LAN to primary storage (high I/O per sec., medium capacity), which connects over a WAN to data protection storage (large capacity, medium I/O per sec.).]
![Page 21: Characterization of Incremental Data Changes for Efficient Data Protection](https://reader035.fdocuments.us/reader035/viewer/2022062310/56816674550346895dda0d00/html5/thumbnails/21.jpg)
Conclusion
- Trace analysis shows diversity of storage characteristics
- Snapshot overheads on primary storage can be decreased by improved integration with network transfer
- Sequential versus random access patterns affect incremental change patterns on both primary and protection storage
![Page 22: Characterization of Incremental Data Changes for Efficient Data Protection](https://reader035.fdocuments.us/reader035/viewer/2022062310/56816674550346895dda0d00/html5/thumbnails/22.jpg)
Rules-of-Thumb
- Over-provision primary capacity by 8% for snapshots
- Over-provision primary I/O by 100% to support copy-on-write related write-amplification
- A write buffer decreases snapshot I/O overheads but has little impact on storage overheads
- 40% of written bytes are transferred to protection storage
- Schedule at least 6 hours between transfers to minimize clean data in transferred blocks
- Schedule at least 12 hours between transfers to minimize peak network bandwidth requirements
- Protection storage must support 20% of the I/O per second capabilities of primary storage
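The numeric rules above can be folded into a small sizing helper. This is an illustrative sketch only: the function and field names are hypothetical, and it assumes the 20% protection-storage figure applies to the over-provisioned primary IOPS, which the slides do not spell out.

```python
from dataclasses import dataclass


@dataclass
class Sizing:
    primary_capacity_gb: float  # capacity including snapshot headroom
    primary_iops: float         # IOPS including copy-on-write headroom
    transfer_gb: float          # data sent to protection storage
    protection_iops: float      # IOPS required of protection storage


def size_from_rules(capacity_gb, workload_iops, written_gb):
    """Apply the deck's rules-of-thumb to a workload estimate."""
    primary_iops = workload_iops * 2.0          # +100% for write amplification
    return Sizing(
        primary_capacity_gb=capacity_gb * 1.08,  # +8% for snapshots
        primary_iops=primary_iops,
        transfer_gb=written_gb * 0.40,           # 40% of written bytes
        protection_iops=primary_iops * 0.20,     # 20% of primary capability
    )


plan = size_from_rules(capacity_gb=1000, workload_iops=5000, written_gb=50)
print(plan)
```

The scheduling rules (at least 6 or 12 hours between transfers) and the write-buffer observation are qualitative and are left out of the helper.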
![Page 23: Characterization of Incremental Data Changes for Efficient Data Protection](https://reader035.fdocuments.us/reader035/viewer/2022062310/56816674550346895dda0d00/html5/thumbnails/23.jpg)
Questions?
![Page 24: Characterization of Incremental Data Changes for Efficient Data Protection](https://reader035.fdocuments.us/reader035/viewer/2022062310/56816674550346895dda0d00/html5/thumbnails/24.jpg)
![Page 25: Characterization of Incremental Data Changes for Efficient Data Protection](https://reader035.fdocuments.us/reader035/viewer/2022062310/56816674550346895dda0d00/html5/thumbnails/25.jpg)
Trace Analysis: Replication of Snapshots
The amount of data to replicate drops in half with 12 hours between snapshots. 4KB results are compared to Patterson 2002.
![Page 26: Characterization of Incremental Data Changes for Efficient Data Protection](https://reader035.fdocuments.us/reader035/viewer/2022062310/56816674550346895dda0d00/html5/thumbnails/26.jpg)
I/O Per Second (IOPS) Request Rate

| Trace Set | Average Write Rate | Peak Write Rate (10 ms) | Average Read Rate | Peak Read Rate (10 ms) |
|---|---|---|---|---|
| 1hr_1Wrt | 0.7 [8] | 1762 [2602] | 2 [25] | 1693 [2457] |
| 1hr_1GBWrt | 15 [37] | 4360 [4379] | 29 [118] | 3603 [4135] |
| 24hr_1GBWrt | 20 [55] | 9004 [8165] | 89 [269] | 5647 [7012] |

Peak values: IOPS are calculated for every 10 ms period, and the peaks for each volume are averaged. More analysis in the paper.
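The peak calculation described above could be sketched as follows. It assumes request timestamps are integers in microseconds and that windows are fixed, non-overlapping 10 ms buckets; the function names and input layout are hypothetical.

```python
from collections import Counter

WINDOW_US = 10_000  # 10 ms window, in microseconds


def peak_iops(timestamps_us):
    """Highest request rate (per second) seen in any 10 ms window."""
    if not timestamps_us:
        return 0.0
    per_window = Counter(ts // WINDOW_US for ts in timestamps_us)
    return max(per_window.values()) * 1_000_000 / WINDOW_US


def mean_peak_iops(volumes):
    """Average the per-volume peaks, as described on the slide."""
    peaks = [peak_iops(ts) for ts in volumes.values()]
    return sum(peaks) / len(peaks) if peaks else 0.0


volumes = {
    "vol_a": [0, 1_000, 2_000, 15_000],  # three requests in the first window
    "vol_b": [5_000, 25_000],            # one request per window
}
print(mean_peak_iops(volumes))
```

Because a single busy 10 ms window is scaled up to a per-second rate, peaks computed this way can be far larger than the average rate for the same volume.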
![Page 27: Characterization of Incremental Data Changes for Efficient Data Protection](https://reader035.fdocuments.us/reader035/viewer/2022062310/56816674550346895dda0d00/html5/thumbnails/27.jpg)
Snapshot I/O Overheads
Rule-of-thumb: Over-provision primary I/O by 100% to support copy-on-write related write-amplification