Practical Web-based Delta Sync for Cloud Storage Services · Web Apps with local storage or log...
![Page 1: Practical Web-based Delta Sync for Cloud Storage Services · Web Apps with local storage or log files need web -based Delta Sync ... overhead affordable at the web client side. •](https://reader034.fdocuments.us/reader034/viewer/2022050513/5f9d26ec95806b37ba6c499d/html5/thumbnails/1.jpg)
He Xiao, Zhenhua Li, Ennan Zhai, Tianyin Xu
Practical Web-based Delta Sync for Cloud Storage Services
[email protected], 2017, HotStorage ’17
![Page 2: Practical Web-based Delta Sync for Cloud Storage Services · Web Apps with local storage or log files need web -based Delta Sync ... overhead affordable at the web client side. •](https://reader034.fdocuments.us/reader034/viewer/2022050513/5f9d26ec95806b37ba6c499d/html5/thumbnails/2.jpg)
Network Traffic Is Overwhelming in Cloud Storage
Cloud traffic has a 30% CAGR (compound annual growth rate).
[Figure: file sync between client and server. Labels: File Sync, Server, Client, Network Traffic, Users, Vendors.]
![Page 3: Practical Web-based Delta Sync for Cloud Storage Services · Web Apps with local storage or log files need web -based Delta Sync ... overhead affordable at the web client side. •](https://reader034.fdocuments.us/reader034/viewer/2022050513/5f9d26ec95806b37ba6c499d/html5/thumbnails/3.jpg)
Delta Sync Improves Network Efficiency
Delta sync is crucial for reducing cloud storage network traffic.
[Figure: a 1 B edit to a 10 MB file. Delta sync: only the delta data is sent from the new file to the old file. Full sync: the full 10 MB file is sent.]
[Table: delta sync support in nine state-of-the-art cloud storage services]
![Page 4: Practical Web-based Delta Sync for Cloud Storage Services · Web Apps with local storage or log files need web -based Delta Sync ... overhead affordable at the web client side. •](https://reader034.fdocuments.us/reader034/viewer/2022050513/5f9d26ec95806b37ba6c499d/html5/thumbnails/4.jpg)
No Web-based Delta Sync
Why is web-based delta sync not supported by today’s cloud storage services?
Web apps with local storage or log files need web-based delta sync.
The web is the most pervasive and OS-independent cloud storage access method.
Web-based delta sync is essential for cloud storage web clients and web apps.
![Page 5: Practical Web-based Delta Sync for Cloud Storage Services · Web Apps with local storage or log files need web -based Delta Sync ... overhead affordable at the web client side. •](https://reader034.fdocuments.us/reader034/viewer/2022050513/5f9d26ec95806b37ba6c499d/html5/thumbnails/5.jpg)
Contribution
• We quantitatively study why web-based delta sync is not offered by today’s cloud storage services.
• We build a practical web-based delta sync solution for cloud storage services.
  • By reversing the traditional delta sync process, we make the overhead affordable at the web client side.
  • By exploiting the locality of users’ edits and trading off hash algorithms, we make the computation overhead affordable at the server side.
![Page 6: Practical Web-based Delta Sync for Cloud Storage Services · Web Apps with local storage or log files need web -based Delta Sync ... overhead affordable at the web client side. •](https://reader034.fdocuments.us/reader034/viewer/2022050513/5f9d26ec95806b37ba6c499d/html5/thumbnails/6.jpg)
WebRsync: Implement Delta Sync on Web
• Implement rsync on real cloud storage with native web tech: JavaScript + HTML5 + WebSocket
  • rsync is the de facto solution for delta sync in cloud storage
[Figure: architecture. The web browser (JavaScript implementation of rsync, HTML5 File API, local file system) connects over WebSocket to the web server (C implementation of rsync), which reaches the storage backend (Aliyun OSS / OpenStack Swift) over a high-speed internal network.]
![Page 7: Practical Web-based Delta Sync for Cloud Storage Services · Web Apps with local storage or log files need web -based Delta Sync ... overhead affordable at the web client side. •](https://reader034.fdocuments.us/reader034/viewer/2022050513/5f9d26ec95806b37ba6c499d/html5/thumbnails/7.jpg)
WebRsync vs. rsync
[Figures: sync time of WebRsync vs. rsync; average client CPU utilization]
![Page 8: Practical Web-based Delta Sync for Cloud Storage Services · Web Apps with local storage or log files need web -based Delta Sync ... overhead affordable at the web client side. •](https://reader034.fdocuments.us/reader034/viewer/2022050513/5f9d26ec95806b37ba6c499d/html5/thumbnails/8.jpg)
Stagnation due to JavaScript’s Single-thread Event Loop Model
StagMeter
// print a timestamp every 100 ms
setInterval(() => console.log(Date.now()), 100);
// print the timestamp of every keystone (the start or end of a task)
function on_start(task)  { console.log(task.id, Date.now()); }
function on_finish(task) { console.log(task.id, Date.now()); }
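A minimal sketch of how timestamps logged in the StagMeter style could be turned into stagnation intervals, assuming the periodic timestamps have already been collected into an array. The helper name `findStagnations` and its `toleranceMs` parameter are hypothetical and do not come from the slides.

```javascript
// Hypothetical helper: given timestamps (in ms) logged by a periodic timer,
// report the gaps where the timer was starved by long-running tasks.
function findStagnations(timestamps, intervalMs = 100, toleranceMs = 50) {
  const gaps = [];
  for (let i = 1; i < timestamps.length; i++) {
    const delta = timestamps[i] - timestamps[i - 1];
    // A responsive event loop fires the timer roughly every intervalMs.
    if (delta > intervalMs + toleranceMs) {
      gaps.push({
        from: timestamps[i - 1],
        to: timestamps[i],
        stalledMs: delta - intervalMs,
      });
    }
  }
  return gaps;
}

// Made-up log for illustration: the timer is silent between 300 ms and 1900 ms.
const log = [0, 100, 200, 300, 1900, 2000];
const stagnations = findStagnations(log);
console.log(stagnations);
```

A gap much longer than the timer interval means the single thread was busy with another task and could not run the timer callback, which is the stagnation state the slide refers to.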
![Page 9: Practical Web-based Delta Sync for Cloud Storage Services · Web Apps with local storage or log files need web -based Delta Sync ... overhead affordable at the web client side. •](https://reader034.fdocuments.us/reader034/viewer/2022050513/5f9d26ec95806b37ba6c499d/html5/thumbnails/9.jpg)
1. Send metadata, wait for the server
2. Checksum search and comparison
3. Send tokens and literal bytes, wait for the server
High CPU utilization when computing
Timestamp printing is suspended: the web page is in a stagnation state
[Figure: StagMeter on WebRsync; x-axis: sync process (seconds)]
![Page 10: Practical Web-based Delta Sync for Cloud Storage Services · Web Apps with local storage or log files need web -based Delta Sync ... overhead affordable at the web client side. •](https://reader034.fdocuments.us/reader034/viewer/2022050513/5f9d26ec95806b37ba6c499d/html5/thumbnails/10.jpg)
WebR2sync: Client-side Optimization, Reverse Computation Process
[Figure: the WebRsync process between client and server. Labels: request for syncing file f’; checksum list of f; segmentation, fingerprinting; searching, comparing; generate tokens and literal bytes; construct new file f; ACK.]
![Page 11: Practical Web-based Delta Sync for Cloud Storage Services · Web Apps with local storage or log files need web -based Delta Sync ... overhead affordable at the web client side. •](https://reader034.fdocuments.us/reader034/viewer/2022050513/5f9d26ec95806b37ba6c499d/html5/thumbnails/11.jpg)
WebR2sync: Client-side Optimization, Reverse Computation Process
• Web Reverse Rsync: move the complicated computation from the client to the server.
[Figure: the WebR2sync process between client and server. Labels: request for syncing file f’; segmentation, fingerprinting; generate tokens and literal bytes; construct new file f; ACK; searching, comparing; checksum list of f.]
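The reversed process can be sketched as follows. This is one plausible reading of the diagram, not the authors' implementation: the client only segments and fingerprints its new file, the server does the searching and comparing against its old file, and the client then supplies literal bytes for unmatched blocks. The function names, the tiny block size, and the single FNV-1a checksum (standing in for the weak and strong checksum pair) are all assumptions made for illustration; collision handling is omitted.

```javascript
const BLOCK = 4; // tiny block size, for illustration only

// Stand-in checksum (32-bit FNV-1a). Assumption: it replaces the
// weak + strong checksum pair so the sketch has no dependencies.
function checksum(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

// Client: segmentation and fingerprinting of the new file.
function clientChecksumList(newFile) {
  const list = [];
  for (let off = 0; off < newFile.length; off += BLOCK) {
    list.push(checksum(newFile.slice(off, off + BLOCK)));
  }
  return list;
}

// Server: searching and comparing. For each client block index, record the
// offset in the old file where the same content was found, or -1.
function serverSearch(oldFile, list) {
  const index = new Map();
  list.forEach((sum, i) => {
    if (!index.has(sum)) index.set(sum, []);
    index.get(sum).push(i);
  });
  const tokens = new Array(list.length).fill(-1);
  for (let off = 0; off + BLOCK <= oldFile.length; off++) {
    const hits = index.get(checksum(oldFile.slice(off, off + BLOCK)));
    if (!hits) continue;
    for (const i of hits) {
      if (tokens[i] === -1) tokens[i] = off;
    }
  }
  return tokens;
}

// Client: literal bytes only for the blocks the server could not find.
function clientLiterals(newFile, tokens) {
  const literals = {};
  tokens.forEach((off, i) => {
    if (off === -1) literals[i] = newFile.slice(i * BLOCK, (i + 1) * BLOCK);
  });
  return literals;
}

// Server: construct the new file from its own blocks plus the literals.
function serverConstruct(oldFile, tokens, literals) {
  return tokens
    .map((off, i) => (off === -1 ? literals[i] : oldFile.slice(off, off + BLOCK)))
    .join("");
}

const oldFile = "the quick brown fox jumps";
const newFile = "the quick red fox jumps!";
const list = clientChecksumList(newFile);
const tokens = serverSearch(oldFile, list);
const literals = clientLiterals(newFile, tokens);
const rebuilt = serverConstruct(oldFile, tokens, literals);
console.log(rebuilt === newFile, Object.keys(literals).length, "literal blocks of", list.length);
```

In this split the byte-by-byte scan in `serverSearch` is the expensive step, and it runs on the server; the client's work is a single linear pass over its file.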
![Page 12: Practical Web-based Delta Sync for Cloud Storage Services · Web Apps with local storage or log files need web -based Delta Sync ... overhead affordable at the web client side. •](https://reader034.fdocuments.us/reader034/viewer/2022050513/5f9d26ec95806b37ba6c499d/html5/thumbnails/12.jpg)
Performance of WebR2sync
[Figures: two plots of sync time (seconds) vs. edit size (bytes)]
Issue: the server takes on severely heavy overhead.
![Page 13: Practical Web-based Delta Sync for Cloud Storage Services · Web Apps with local storage or log files need web -based Delta Sync ... overhead affordable at the web client side. •](https://reader034.fdocuments.us/reader034/viewer/2022050513/5f9d26ec95806b37ba6c499d/html5/thumbnails/13.jpg)
Server-side Overhead Profiling
Checksum searching and block comparison occupy 80% of the computing time.
[Figure: server-side computing time breakdown. Labels: MD5 computing, checksum search.]
• Use faster hash functions to replace MD5
• Reduce checksum searching overhead
![Page 14: Practical Web-based Delta Sync for Cloud Storage Services · Web Apps with local storage or log files need web -based Delta Sync ... overhead affordable at the web client side. •](https://reader034.fdocuments.us/reader034/viewer/2022050513/5f9d26ec95806b37ba6c499d/html5/thumbnails/14.jpg)
Replacing MD5 with SipHash in Chunk Comparison
A comparison of pseudorandom hash functions:

| Hash Function | Collision Probability | Cycles per Byte |
|---|---|---|
| MD5 | Low | 5.58 |
| Murmur3 | High | 0.33 |
| Spooky | High | 0.14 |
| SipHash | Low | 1.13 |

SipHash keeps a low collision probability at a much higher speed than MD5.
![Page 15: Practical Web-based Delta Sync for Cloud Storage Services · Web Apps with local storage or log files need web -based Delta Sync ... overhead affordable at the web client side. •](https://reader034.fdocuments.us/reader034/viewer/2022050513/5f9d26ec95806b37ba6c499d/html5/thumbnails/15.jpg)
Solve Possible Hash Collision
• Replacing MD5 with SipHash may cause collisions (with probability p), but so may MD5.
• Our solution: also use Spooky (the fastest method, with collision probability p’).
  • The probability of collisions is p*p’.
• Alternative: use MD5 or another strong hash function as a global verification.
  • Computing MD5 over the whole file is expensive.
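The product on this slide follows if the two hash functions collide independently on a given pair of different chunks. That independence is an assumption made here, not something the slides state. Writing C_Sip and C_Spooky (names chosen for this note) for the two collision events:

```latex
\Pr[C_{\mathrm{Sip}} \cap C_{\mathrm{Spooky}}]
  = \Pr[C_{\mathrm{Sip}}] \cdot \Pr[C_{\mathrm{Spooky}}]
  = p \cdot p'
```

Since p’ is at most 1, a chunk pair that passes both checks is never more likely to be a false match than one that passes the SipHash check alone.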
![Page 16: Practical Web-based Delta Sync for Cloud Storage Services · Web Apps with local storage or log files need web -based Delta Sync ... overhead affordable at the web client side. •](https://reader034.fdocuments.us/reader034/viewer/2022050513/5f9d26ec95806b37ba6c499d/html5/thumbnails/16.jpg)
Reduce Chunk Searching by Exploiting Locality of File Edits
[Figure: hash table keyed by weak checksums Adler32-1 to Adler32-4, with strong checksums MD5-1 to MD5-4, for Block 1 to Block 4. Steps: checksum search, compare.]
95% of synchronized files have fewer than 10 edits.
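A minimal sketch of how edit locality might cut down checksum searching, under the assumption that the idea is: once one block matches, the following block is compared directly against the next entry in the checksum list before falling back to a hash-table lookup. The function name `search`, the tiny block size, and the FNV-1a stand-in checksum are hypothetical choices for this note.

```javascript
const BLOCK = 4; // tiny block size, for illustration only

// Stand-in checksum (32-bit FNV-1a), used instead of Adler32 + MD5.
function checksum(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

// Scan oldFile for blocks listed in `list` (checksums of the other file's
// blocks). With useLocality, the block right after a match is compared
// directly against the next list entry, skipping the hash-table lookup.
function search(oldFile, list, useLocality) {
  const index = new Map(list.map((sum, i) => [sum, i]));
  let lookups = 0;
  let matches = 0;
  let expected = -1; // list index predicted for the current window
  let off = 0;
  while (off + BLOCK <= oldFile.length) {
    const sum = checksum(oldFile.slice(off, off + BLOCK));
    let hit;
    if (useLocality && expected >= 0 && list[expected] === sum) {
      hit = expected; // direct comparison, no table lookup
    } else {
      lookups++;
      hit = index.get(sum);
    }
    if (hit !== undefined) {
      matches++;
      expected = hit + 1;
      off += BLOCK; // advance by one block
    } else {
      expected = -1;
      off += 1; // advance by one byte
    }
  }
  return { lookups, matches };
}

// One in-place edit: a single character changed in the third block.
const oldFile = "aaaabbbbccccddddeeeeffff";
const newFile = "aaaabbbbccXcddddeeeeffff";
const list = [];
for (let off = 0; off < newFile.length; off += BLOCK) {
  list.push(checksum(newFile.slice(off, off + BLOCK)));
}
const withLocality = search(oldFile, list, true);
const withoutLocality = search(oldFile, list, false);
console.log(withLocality, withoutLocality);
```

With few edits per file, most windows follow a previous match, so most table lookups can be replaced by a single comparison.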
![Page 17: Practical Web-based Delta Sync for Cloud Storage Services · Web Apps with local storage or log files need web -based Delta Sync ... overhead affordable at the web client side. •](https://reader034.fdocuments.us/reader034/viewer/2022050513/5f9d26ec95806b37ba6c499d/html5/thumbnails/17.jpg)
Evaluation Setup
[Figure: basic experiment setup visualized on a map of China]
![Page 18: Practical Web-based Delta Sync for Cloud Storage Services · Web Apps with local storage or log files need web -based Delta Sync ... overhead affordable at the web client side. •](https://reader034.fdocuments.us/reader034/viewer/2022050513/5f9d26ec95806b37ba6c499d/html5/thumbnails/18.jpg)
Sync Time
[Figure: sync time (seconds, log scale from 10^-1 to 10^1) vs. edit size (1 B to 100 KB) for WebRsync, WebR2sync, WebR2sync+, and rsync]
WebR2sync+ is 2-3 times faster than WebR2sync and 15-20 times faster than WebRsync.
![Page 19: Practical Web-based Delta Sync for Cloud Storage Services · Web Apps with local storage or log files need web -based Delta Sync ... overhead affordable at the web client side. •](https://reader034.fdocuments.us/reader034/viewer/2022050513/5f9d26ec95806b37ba6c499d/html5/thumbnails/19.jpg)
Throughput
[Figure: number of concurrent users (0 to 8000) for NoWebRsync, WebRsync, WebR2sync, WebR2sync+, and rsync]
The throughput of WebR2sync+ is 4 times that of WebR2sync/rsync and 9 times that of NoWebRsync.
![Page 20: Practical Web-based Delta Sync for Cloud Storage Services · Web Apps with local storage or log files need web -based Delta Sync ... overhead affordable at the web client side. •](https://reader034.fdocuments.us/reader034/viewer/2022050513/5f9d26ec95806b37ba6c499d/html5/thumbnails/20.jpg)
Future Work
• Evaluate our approach under different edit modes
  • delete, insert, append
• Evaluate traffic efficiency
  • all the methods should have similar traffic efficiency
• Understand the effects of the three optimizations
  • evaluate them separately
![Page 21: Practical Web-based Delta Sync for Cloud Storage Services · Web Apps with local storage or log files need web -based Delta Sync ... overhead affordable at the web client side. •](https://reader034.fdocuments.us/reader034/viewer/2022050513/5f9d26ec95806b37ba6c499d/html5/thumbnails/21.jpg)
Discussion
• Probability of collisions of file checksums
• Characteristics of file operations in real-world scenarios from the perspective of sync
• Locality measure for deciding whether to apply locality-based optimization
![Page 22: Practical Web-based Delta Sync for Cloud Storage Services · Web Apps with local storage or log files need web -based Delta Sync ... overhead affordable at the web client side. •](https://reader034.fdocuments.us/reader034/viewer/2022050513/5f9d26ec95806b37ba6c499d/html5/thumbnails/22.jpg)
Conclusion
• WebR2sync+ is a practical solution for web-based delta sync
  • lightweight computation at the client side
  • optimized overhead at the server side
  • the server-side optimizations can be adopted in the traditional cloud storage architecture
![Page 23: Practical Web-based Delta Sync for Cloud Storage Services · Web Apps with local storage or log files need web -based Delta Sync ... overhead affordable at the web client side. •](https://reader034.fdocuments.us/reader034/viewer/2022050513/5f9d26ec95806b37ba6c499d/html5/thumbnails/23.jpg)
Thanks! Discussion
![Page 24: Practical Web-based Delta Sync for Cloud Storage Services · Web Apps with local storage or log files need web -based Delta Sync ... overhead affordable at the web client side. •](https://reader034.fdocuments.us/reader034/viewer/2022050513/5f9d26ec95806b37ba6c499d/html5/thumbnails/24.jpg)
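WebRsync Detailed Description
[Figure: flowchart between client and server. Each block (Block 1, Block 2, Block 3, …) has an Adler32 checksum and an MD5 checksum. Steps: weak checksum search, then strong checksum compare, each with YES/NO branches; the branches lead to matched tokens (advance by 1 block offset) or literal bytes (advance by 1 byte offset); matched tokens and literal bytes are used to construct the new file.]
Rolling Adler32 is O(1): Adler(i) => Adler(i+1)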
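The O(1) rolling step can be sketched as below, following the standard Adler-32 definition (modulus 65521). The checksum in any particular rsync implementation may differ in details, and the function names `adler32`, `roll`, and `pack` are chosen for this note.

```javascript
const MOD = 65521; // largest prime below 2^16, as in standard Adler-32

// Direct Adler-32 over bytes[start, start + len): O(len).
function adler32(bytes, start, len) {
  let a = 1;
  let b = 0;
  for (let i = start; i < start + len; i++) {
    a = (a + bytes[i]) % MOD;
    b = (b + a) % MOD;
  }
  return { a, b };
}

// Rolling update: slide a window of length len one byte to the right
// by removing outByte and appending inByte. O(1).
function roll({ a, b }, outByte, inByte, len) {
  const a2 = (((a - outByte + inByte) % MOD) + MOD) % MOD;
  const b2 = (((b - len * outByte + a2 - 1) % MOD) + MOD) % MOD;
  return { a: a2, b: b2 };
}

// Pack the two 16-bit halves into one unsigned 32-bit value.
const pack = ({ a, b }) => ((b << 16) | a) >>> 0;

// Check that rolling agrees with recomputing from scratch at every offset.
const data = Array.from("Practical web-based delta sync", (c) => c.charCodeAt(0));
const len = 8;
let sum = adler32(data, 0, len);
let allMatch = true;
for (let i = 1; i + len <= data.length; i++) {
  sum = roll(sum, data[i - 1], data[i + len - 1], len);
  if (pack(sum) !== pack(adler32(data, i, len))) allMatch = false;
}
console.log(allMatch);
```

The rolling update is what makes the 1-byte-offset scan affordable: each new window costs a few arithmetic operations instead of a pass over the whole block.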
![Page 25: Practical Web-based Delta Sync for Cloud Storage Services · Web Apps with local storage or log files need web -based Delta Sync ... overhead affordable at the web client side. •](https://reader034.fdocuments.us/reader034/viewer/2022050513/5f9d26ec95806b37ba6c499d/html5/thumbnails/25.jpg)
WebR2sync: Flowchart and Data Structure
[Figure: flowchart between client and server. Steps: weak checksum search, then strong checksum compare, each with YES/NO branches; labels on the branches: 1 byte offset, no further operation; construct new files. Two arrays of Block 1 to Block 4 are shown.]
When a match is found, record the associated index.
![Page 26: Practical Web-based Delta Sync for Cloud Storage Services · Web Apps with local storage or log files need web -based Delta Sync ... overhead affordable at the web client side. •](https://reader034.fdocuments.us/reader034/viewer/2022050513/5f9d26ec95806b37ba6c499d/html5/thumbnails/26.jpg)
Sync Time Decomposed
[Figure: sync time (seconds, 0 to 0.2) vs. edit size (1 B to 100 KB), decomposed into server, network, and client time]
The WebR2sync+ client takes a stable and shorter time. Because of the server-side optimization, computing time is much shorter on both the client and the server.