MetaSyncFileSynchronizationAcrossMultipleUntrustedStorageServices
Seungyeop Han,HaichenShen,Taesoo Kim*,ArvindKrishnamurthy,ThomasAnderson,and
DavidWetherall
UniversityofWashington *Georgia InstituteofTechnology1
Filesyncservicesarepopular
400M ofDropbox usersreachedinJune20152
Baidu(2TB)
Manysyncserviceproviders
Dropbox (2GB) GoogleDrive(15GB)
MSOneDrive (15GB) Box.net (10GB)
3
Canwerelyonanysingleservice?
4
ExistingApproaches
• Encryptfilestopreventmodification– Boxcryptor
• Rewritefilesyncservicetoreducetrust– SUNDR(Lietal.,04),DEPOT(Mahajan etal.,10)
5
MetaSync:Canwebuildabetter filesynchronizationsystemacrossmultipleexistingservices?
MetaSync
6
Higheravailability,greatercapacity,higherperformanceStrongerconfidentiality &integrity
Goals
• Higheravailability• Strongerconfidentiality&integrity• Greatercapacityandhigherperformance
• Noservice-service,client-clientcommunication
• Noadditionalserver• Opensourcesoftware
7
Overview
• Motivation&Goals• MetaSync Design• Implementation• Evaluation• Conclusion
8
KeyChallenges
• Maintainagloballyconsistentview ofthesynchronizedfilesacrossmultipleclients
• Usingonlytheserviceproviders’unmodifiedAPIs withoutanycentralizedserver
• Eveninthepresenceofservicefailure
9
OverviewoftheDesign
Synchronization ReplicationObjectStore
MetaSync
BackendabstractionsLocalStorage
Dropbox GoogleDrive OneDrive Remote
Services
10
1.FileManagement
ObjectStore
• Similardatastructurewithversioncontrolsystems(e.g.,git)
• Content-basedaddressing– Filename=hashofthecontents– De-duplication– Simpleintegritychecks
• Directoriesformahashtree– Independent&concurrentupdates
11
ObjectStore
12
head =f12…
Dir1
abc…
Dir2
4c0…
Large.bin
20e…
blob blobblob
small1 small2
• Filesarechunkedorgroupedintoblobs• Theroothash=f12…uniquely identifiesasnapshot
ObjectStore
13
old=f12…
Dir1
abc…
Dir2
4c0…
Large.bin
20e…
blob blobblob
small1 small2
• Filesarechunkedorgroupedintoblobs• Theroothash=f12…uniquely identifiesasnapshot
1ae…
blob
head =07c…
Large.bin
OverviewoftheDesign
Synchronization ReplicationObjectStore
MetaSync
BackendabstractionsLocalStorage
Dropbox GoogleDrive OneDrive Remote
Services
14
2.Consistentupdate
UpdatingGlobalView
GlobalView
v0ab1…
Client1 Prev
Head
Prev
Head
Client2
master
15
Previouslysynchronizedpoint
Currentroothash
UpdatingGlobalView
GlobalView
v0ab1…
Client1 Prev Head
PrevClient2
v1c10…
master
16
Head
UpdatingGlobalView
GlobalView
v0ab1…
Client1 Prev
Head
PrevClient2
v1c10…
master
17
Head
UpdatingGlobalView
GlobalView
v0ab1…
Client1 Prev
Head
PrevClient2
v1c10…
master
18
Head
UpdatingGlobalView
GlobalView
v0ab1…
Client1 Prev
Prev HeadClient2
v1c10…
v27b3…
master
19
Head
v2f13…
UpdatingGlobalView
GlobalView
v0ab1…
Client1 Prev
Prev
Head
Client2
v1c10… v27b3…
master
20
Head
v2f13…
UpdatingGlobalView
GlobalView
v0ab1…
Client1 Prev
Prev
Head
Client2
v1c10… v27b3…
master
21
Head
v3a31…
ConsistentUpdateofGlobalView
• Needtohandleconcurrentupdates,unavailableservicesbasedonexistingAPIs
MetaSync
Dropbox
MetaSync
GoogleDrive OneDrive
root=f12… root=b05…
22
Paxos
• Multi-roundnon-blockingconsensusalgorithm– Saferegardlessoffailures– Progressifmajorityisalive
Proposer Acceptor23
Metasync:SimulatePaxos• Useanappend-only list tologPaxos messages– ClientsendsnormalPaxos messages– Uponarrivalofmessage,serviceappendsitintoalist– Clientcanfetchalistoftheorderedmessages
• EachserviceproviderhasAPIstobuildappend-onlylist– GoogleDrive,OneDrive,Box:Commentsonafile– Dropbox:Revisionlistofafile– Baidu:Filesinadirectory
24
Metasync:PassivePaxos (pPaxos)
• Backendservicesworkaspassiveacceptor• Acceptordecisionsaredelegatedtoclients
Clients Passive StorageServices
S2
S1
S3
25
P1
P2
propose(3)
Metasync:PassivePaxos (pPaxos)
• Backendservicesworkaspassiveacceptor• Acceptordecisionsaredelegatedtoclients
Clients Passive StorageServices
S2
S1
S3
26
P1
P2
propose(2)
Metasync:PassivePaxos (pPaxos)
• Backendservicesworkaspassiveacceptor• Acceptordecisionsaredelegatedtoclients
Clients Passive StorageServices
S2
S1
S3
27
P1
P2
fetch(S1)
fetch(S2)
fetch(S3)
Metasync:PassivePaxos (pPaxos)
• Backendservicesworkaspassiveacceptor• Acceptordecisionsaredelegatedtoclients
Clients Passive StorageServices
S2
S1
S3
28
P1
P2
accept(3,v1)
fetch
DiskPaxos
29
Disk1 Disk2 Disk3
P1 P2 P3
Propose
DiskPaxos
30
Disk1 Disk2 Disk3
P1 P2 P3
Fetch
Paxos vs.DiskPaxos vs.pPaxos
• DiskPaxos:maintainsablockperclient
Proposer
Acceptor
computation
Propose Accept
Paxos
Proposer
Acceptor
diskblocks
Propose Check
DiskPaxos
…
Proposer
Acceptor
append-only
Propose Check
pPaxos
Gafni &Lamport ’02
RequiresacceptorAPIO(clientsxacceptors)O(acceptors) O(acceptors)
31
require#msgs
OverviewoftheDesign
Synchronization ReplicationObjectStore
MetaSync
BackendabstractionsLocalStorage
Dropbox GoogleDrive OneDrive Remote
Services
32
3.Replicateobjects
StableDeterministicMapping
• MetaSync replicatesobjectsRtimesacrossSstorageproviders(R<S)
• Requirements– Shareminimalinformationamongservices/clients– Supportvariationinstoragesize–Minimizerealignmentuponconfigurationchanges
• Deterministicmapping
– E.g.,map(7a1…)=Dropbox,Google
33
DeterministicMappingExample
• Service={A(1),B(2),C(2),D(1)}• N={A1,B1,B2,C1,C2,D1}(normalized)• Map(i)=Sorted(N,key=md5(i,serviceID,vID))
Capacity
map[0]=[A1,C2,D1,B1,B2,C1]=[A,C]map[1]=[B2,B1,C1,C2,A1,D1]=[B,C]…map[19]=[C2,B1,D1,A1,B2,C1]=[C,B]
H=20
R=2
bc1…mod20=1=>ReplicateontoBandC
34
DeterministicMappingExample
• WhenCisremovedmap[0]=[A1,C2,D1,B1,B2,C1]=[A,C]map[1]=[B2,B1,C1,C2,A1,D1]=[B,C]…map[19]=[C2,B1,D1,A1,B2,C1]=[C,B]
H=20
R=2
map[0]=[A1,D1,B1,B2]=[A,D]map[1]=[B2,B1,A1,D1]=[B,A]…map[19]=[B1,D1,A1,B2]=[B,D]
H=20
Thesortedorderismaintained=>Minimizerealignments 35
Implementation
• PrototypedwithPython– ~8klinesofcode
• Currentlysupports5backendservices– Dropbox,GoogleDrive,OneDrive,Box.net,Baidu
• Twofront-endclients– Commandlineclient– Syncdaemon
36
Evaluation
• Howistheend-to-endperformance?
• What’stheperformancecharacteristicsofpPaxos?
• HowquicklydoesMetaSync reconfiguremappings?
37
Evaluation
• Howistheend-to-endperformance?
• What’stheperformancecharacteristicsofpPaxos?
• HowquicklydoesMetaSync reconfiguremappings?
38
End-to-EndPerformance
Dropbox Google MetaSyncLinuxKernel920 directories15kfiles,166MB
2h45m > 3hrs 12m18s
Pictures50 files,193MB
415s 143s 112s
Synchronizethetargetbetweentwocomputers
39
Performancegainsarefrom:• Parallelupload/downloadwithmultipleproviders• Combinedsmallfilesintoablob
(S=4,R=2)
LatencyofpPaxos
Latencyisnotdegradedwithincreasingconcurrentproposersoraddingslowbackendstorageservice
40
0
5
10
15
20
25
30
35
1 2 3 4 5
Latency(s)
#ofProposers
Dropbox
OneDrive
Box
Baidu
LatencyofpPaxos
Latencyisnotdegradedwithincreasingconcurrentproposersoraddingslowbackendstorageservice
41
0
5
10
15
20
25
30
35
1 2 3 4 5
Latency(s)
#ofProposers
Dropbox
OneDrive
Box
Baidu
All
Conclusion
• MetaSync providesasecure,reliable,andperformant filessyncserviceontopofpopularcloudproviders– Toachieveaconsistentupdate,wedeviseanewclient-basedPaxos
– Tominimizeredistribution,wepresentastabledeterministicmapping
• Sourcecodeisavailable:– http://uwnetworkslab.github.io/metasync/
42
Top Related