An Architecture for Internet Data Transfer
Niraj Tolia
Michael Kaminsky*, David G. Andersen, and Swapnil Patil
Carnegie Mellon University and *Intel Research Pittsburgh
Niraj Tolia © May 2006
Innovation in Data Transfer is Hard
• Imagine: You have a novel data transfer technique
• How do you deploy?
1. Update HTTP. Talk to IETF. Modify Apache, IIS, Firefox, Netscape, Opera, IE, Lynx, Wget, …
2. Update SMTP. Talk to IETF. Modify Sendmail, Postfix, Outlook…
3. Give up in frustration
[Figure: timeline from 2001 to 2006 of data transfer systems, built up one entry at a time]
2001: Bittorrent, Chord, Pastry, CAN, LBFS
2002: IBP
2003: CASPER, Value-Caching, Bullet, DataStaging, Riverbed
2004: OpenDHT, USPS, Segank, Lookaside Caching, BlueFS
2005: Avalanche, CoBlitz, ALM-FastReplica
Barriers to Innovation in Data Transfer
• Applications bundle:
  • Content Negotiation: What data to send
    – Naming (URLs, directories, …)
    – Languages
    – Identification
    – …
  • Data Transfer: Getting the bits across
• Both are tightly coupled (e.g., HTTP, SMTP)
• Hinders innovation and evolution of new services
Solution: A Data Transfer Service
• Decouple content negotiation from data transfer
• Applications perform negotiation as before
• But hand data objects to the Transfer Service
• The Transfer Service is shared by applications
[Figure: Sender and Receiver, each with an Xfer Service. Labels: "Application Protocol and Data" and "Data"]
Extensible Transfer Architecture
[Figure: Sender and Receiver speak the Application Protocol; each side has an Xfer Service with plugins (USB Keychain, Local Cache, Bittorrent). Callouts: application-independent cache; non-networked transfers; new network features]
Transfer Service Benefits
• Apps. can reuse available transfer techniques
  • No reimplementation needed
• Easier deployment of new technologies
  • Applications need no modification
• Provides for cross-application sharing
  • Can interpose on all data transfers
• Handles transient disconnections
Outline
• Motivation
• Data Oriented Transfer (DOT) service
• Evaluation
• Open Issues and Future Work
• Conclusion
10,000 Foot View of Transfers using DOT
[Figure: Sender and Receiver, each with an Xfer Service. Labels: Request File X; put(X); read(); data. "??" marks the two open questions below]
• How does the transfer service name data?
• How does the transfer service locate data?
DOT: Object Naming
• Application defined names are not portable
• Use content-naming for globally unique names
• Objects represented by an OID
• Objects are further sub-divided into “chunks”
• Each OID corresponds to a list of descriptors
• Descriptor lists allow for partial transfers
[Figure: the file Foo.txt is passed through a cryptographic hash to give its OID; the file is divided into chunks described by Desc1, Desc2, Desc3]
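The naming scheme on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not DOT's implementation: the use of SHA-1, the fixed 16 KB chunk size, and the names `Descriptor`, `name_object`, and `missing` are all chosen for the example.

```python
import hashlib
from dataclasses import dataclass

CHUNK_SIZE = 16 * 1024  # assumed chunk size, chosen only for illustration


@dataclass(frozen=True)
class Descriptor:
    """Names one chunk by the hash of its contents (hypothetical layout)."""
    chunk_hash: str
    offset: int
    length: int


def name_object(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Return (OID, descriptor list) for an object.

    The OID is a hash of the whole object, so equal content gets the
    same name no matter what the application calls the file.
    """
    oid = hashlib.sha1(data).hexdigest()
    descriptors = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        descriptors.append(
            Descriptor(hashlib.sha1(chunk).hexdigest(), offset, len(chunk)))
    return oid, descriptors


def missing(descriptors, cached_hashes):
    """Partial transfer: keep only descriptors whose chunks are not cached."""
    return [d for d in descriptors if d.chunk_hash not in cached_hashes]
```

A receiver that already holds some chunks would pass their hashes to `missing` and request only the remaining descriptors.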
DOT: Object Location
• Data transfers in DOT are receiver driven
  • Receiver has better idea of available resources
• Senders specify ‘hints’ - potential data locations
  – dot://sender.example.com:12000/
  – dht://opendht.org/
  – …
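As a rough sketch of how a receiver might act on hints, the Python below parses the two hint URLs from the slide and orders them by the receiver's own preference. The function names `parse_hint` and `plan_fetch` and the idea of a preference-ordered scheme list are assumptions for illustration; the slides do not specify how DOT selects among hints.

```python
from urllib.parse import urlsplit


def parse_hint(hint: str):
    """Split a hint URL into (scheme, host, port); port may be None."""
    parts = urlsplit(hint)
    return parts.scheme, parts.hostname, parts.port


def plan_fetch(hints, preferred_schemes):
    """Receiver-driven choice (assumed policy): keep only hints whose scheme
    the receiver supports, ordered by the receiver's preference list."""
    parsed = [parse_hint(h) for h in hints]
    return [p for scheme in preferred_schemes for p in parsed if p[0] == scheme]
```

Because the receiver makes the choice, the same hint list can lead to different fetch plans on hosts with different plugins or resources.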
A Transfer using DOT
[Figure: Sender and Receiver, each with an Xfer Service; the Xfer Services are connected through Transfer Plugins. Labels in order: Request File X; put(X); OID, Hints (returned to the sender, then passed to the receiver); get(OID, Hints); read(); data]
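The put / get / read sequence in this figure can be mimicked with a toy in-memory model. Everything here is an assumption made for illustration: the class `TransferService`, the shared `network` dictionary standing in for real transfer plugins, and the use of SHA-1 for the OID.

```python
import hashlib


class TransferService:
    """Toy in-memory stand-in for one host's transfer service."""

    def __init__(self, address, network):
        self.address = address      # hint under which peers can reach us
        self.network = network      # shared dict: hint -> TransferService
        self.store = {}             # OID -> object bytes
        network[address] = self

    def put(self, data: bytes):
        """Sender side: store the object and return (OID, hints)."""
        oid = hashlib.sha1(data).hexdigest()
        self.store[oid] = data
        return oid, [self.address]

    def get(self, oid, hints):
        """Receiver side: fetch the object from the first hint that has it."""
        if oid in self.store:       # already cached locally
            return
        for hint in hints:
            peer = self.network.get(hint)
            data = peer.store.get(oid) if peer else None
            # Content naming lets the receiver verify what it fetched.
            if data is not None and hashlib.sha1(data).hexdigest() == oid:
                self.store[oid] = data
                return
        raise LookupError(f"no hint could supply {oid}")

    def read(self, oid) -> bytes:
        """Hand the fetched object back to the application."""
        return self.store[oid]
```

In this model the sender application calls `put`, ships the returned OID and hints over its own protocol, and the receiver application calls `get` followed by `read`.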
DOT’s Modular Architecture
[Figure: the Application sits above DOT and uses (1) the Application API. DOT reaches the Network through a Transfer Plugin via (2) the Transfer Plugin API, and Local Storage through a Storage Plugin via (3) the Storage Plugin API]
Transfer Plugin API
• Simple API
  • get_descriptor_list( OID, hints )
  • get_chunks( descriptor_list, hints )
  • cancel_chunks( chunk_list )
• Transfer plugin chaining is easy
  • e.g., multipath plugin
[Figure: DOT, a MultiPath Plugin, a Network Transfer Plugin, and further Transfer Plugins chained together]
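The three calls and the chaining idea can be sketched as a small Python interface. The method names follow the slide; the class names, the in-memory `DictPlugin`, and the round-robin policy in `MultipathPlugin` are assumptions for illustration, since the slides do not describe how the real multipath plugin divides work.

```python
from abc import ABC, abstractmethod
from itertools import cycle


class TransferPlugin(ABC):
    """Interface with the three calls named on the slide."""

    @abstractmethod
    def get_descriptor_list(self, oid, hints): ...

    @abstractmethod
    def get_chunks(self, descriptor_list, hints): ...

    @abstractmethod
    def cancel_chunks(self, chunk_list): ...


class DictPlugin(TransferPlugin):
    """Hypothetical leaf plugin backed by in-memory dicts."""

    def __init__(self, objects, chunks):
        self.objects = objects   # OID -> descriptor list
        self.chunks = chunks     # descriptor -> chunk bytes
        self.served = []         # descriptors this plugin was asked for

    def get_descriptor_list(self, oid, hints):
        return self.objects[oid]

    def get_chunks(self, descriptor_list, hints):
        self.served.extend(descriptor_list)
        return {d: self.chunks[d] for d in descriptor_list}

    def cancel_chunks(self, chunk_list):
        pass  # nothing is ever in flight in this toy plugin


class MultipathPlugin(TransferPlugin):
    """Chains to sub-plugins, spreading chunk requests round-robin (assumed)."""

    def __init__(self, sub_plugins):
        self.sub_plugins = list(sub_plugins)

    def get_descriptor_list(self, oid, hints):
        return self.sub_plugins[0].get_descriptor_list(oid, hints)

    def get_chunks(self, descriptor_list, hints):
        result = {}
        for descriptor, plugin in zip(descriptor_list, cycle(self.sub_plugins)):
            result.update(plugin.get_chunks([descriptor], hints))
        return result

    def cancel_chunks(self, chunk_list):
        for plugin in self.sub_plugins:
            plugin.cancel_chunks(chunk_list)
```

Because `MultipathPlugin` implements the same interface it consumes, it can itself be placed inside another chain.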
Implementation
• In C++ using libasync event-driven library
• One storage plugin:
  – In-memory hash tables, disk backed.
• Three transfer plugins:
  • Default Xfer-Xfer plugin
  • Portable Storage plugin
  • Multipath plugin
• Applications
  • gcp, an scp-like tool for file transfers
  • A DOT-enabled Postfix email server
    – Included a socket-like adapter library
Current DOT Prototype
[Figure: SENDER and RECEIVER Xfer services joined across the Internet through Xfer Plugins: NET (DSL), NET (wireless), NET multi-path, NET mirror, USB, and cache. Callouts: application-independent cache; non-networked transfers; multipath and mirror support]
Outline
• Motivation
• Data Oriented Transfer (DOT) service
• Evaluation
• Open Issues and Future Work
• Conclusion
Evaluation
• Standard file transfer
• Portable Storage
• Multi-Path
• Case Study: Postfix Email Server
  • Capture and analysis of email trace
  • Evaluation of DOT-enabled SMTP server
  • Integration effort
Standard File Transfer Setup
• Two DOT-enabled machines
• Network Emulator
  • Evaluate various b/w + delay combinations
• Use gcp for the file transfers
  • Used 40MB, 4MB, 400KB, 40KB, 4KB files
  • Presenting 40MB here
[Figure: the two machines connected through the Network Emulator]
Standard File Transfer
• Overhead: hashing, extra RTT
• No noticeable overheads with latency
[Figure: bar chart of Transfer Time (sec) for a 40 MB file at 1000 Mb/s and 100 Mb/s, with series wget, scp, scp w/o encr., and gcp. Y axis from 0.0 to 5.0. Bar labels in extraction order: 0.5, 3.6, 1.6, 3.7, 0.7, 3.7, 1.2, 4.1]
Portable Storage Experiment
• 255 MB transfer over emulated DSL
  • Based on Virtual Machine transfers at Carnegie Mellon
• DOT preemptively copies data onto Flash drive
  • Wait 5 minutes, plug flash drive into receiver
• Two drive speeds
  • 8MB/s - 1GB
  • 20MB/s - 2GB
[Figure: sender and receiver joined by a 2 Mbit/s link]
Portable Storage Results
[Figure: transfer progress over time. Annotations: "Device Inserted" and "1126s (~19 min)"]
Multipath Plugin: Load Balancing
• Varied capacity + delay of experimental links
• Compare fastest link alone with multipath plugin on both links; what speedup?
• Transferred 40MB file
• 128 KB socket buffer sizes
[Figure: two machines joined by a Gigabit link and by two experimental links that pass through the Network Emulator]
Multipath Plugin is Effective
Link 1    Link 2    Single    Multipath    Savings
100/0     100/0     3.59      1.90         47%
100/0     10/0      3.59      3.54         1.4%
100/66    100/66    43.33     23.20        46%
100/66    10/66     43.33     22.97        47%
100/66    1/66      43.33     38.25        12%
(Links are given as capacity/delay; Single and Multipath are transfer times in seconds.)
Zero-delay links:
- 40 MB @ 100Mbit/s ideal: 3.2 seconds
- Multipath plugin nearly doubles throughput
Links with delay:
- TCP effects dominate. Pipe not full.
- Multipath plugin doubles by adding second stream. Actual capacity irrelevant.
[Figure: two machines joined by a Gigabit link plus experimental Link 1 and Link 2]
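The figures on this slide can be cross-checked with a little arithmetic. The sketch below assumes that the timing values pair up as (single link, multipath) in the order 3.59/1.90, 3.59/3.54, 43.33/23.20, 43.33/22.97, 43.33/38.25, and that 1 MB means 10^6 bytes; the function names are made up for the example.

```python
def ideal_seconds(size_mb: float, rate_mbit_s: float) -> float:
    """Time to move size_mb megabytes at rate_mbit_s with no overhead."""
    return size_mb * 8 / rate_mbit_s


def savings_percent(single_s: float, multipath_s: float) -> float:
    """Reduction in transfer time from using both links instead of one."""
    return (1 - multipath_s / single_s) * 100


# (single-link seconds, multipath seconds), read off the slide as assumed above
ROWS = [(3.59, 1.90), (3.59, 3.54), (43.33, 23.20), (43.33, 22.97), (43.33, 38.25)]

if __name__ == "__main__":
    print(f"ideal 40 MB at 100 Mbit/s: {ideal_seconds(40, 100):.1f} s")
    for single, multi in ROWS:
        print(f"{single:6.2f} {multi:6.2f} {savings_percent(single, multi):5.1f}%")
```

Under that pairing the computed savings round to the percentages shown on the slide, which supports reading the five values as three multipath rows sharing one single-link baseline per delay setting.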
Postfix Email Trace Replay
• Generated 10,000 email messages from trace
  • Random data matched to chunk hash data
  • Preserves some similarity between messages
• Replayed through Postfix to a single local server
• Postfix disk bound… DOT CPU overhead negligible
• Savings due to duplication within emails
Program          Seconds    Bytes Sent
Postfix          468        172 MB
Postfix + DOT    468        117 MB (68%)
Postfix Integration
• Integrated DOT with the Postfix mail server
• 1 part-time week, 1 student new to Postfix
• Includes time to write generic adapter library
Program    LoC       Added LoC    %
GTC Lib    --        421
Postfix    70,824    184          0.3%
smtpd      6,413     107          1.7%
smtp       3,378     71           2.1%
Discussion on Deployment
• Application Resilience
  • DOT is a service - it’s outside the control of the application.
  • Our Postfix falls back to normal SMTP if
    – No Transfer Service contact
    – Transfer keeps failing
• In the short term, a simple fallback is encouraged. However, this could interfere with some functions
  – DOT-based virus scanner…
• In the long term, DOT would be a part of a system’s core infrastructure
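The fallback rule described above can be sketched as follows. This is an illustrative pattern rather than the Postfix patch itself: the exception classes, the `deliver` function, the callables passed in, and the retry limit of three are all assumptions.

```python
class ServiceUnavailable(Exception):
    """Hypothetical: the local transfer service cannot be contacted."""


class TransferFailed(Exception):
    """Hypothetical: the transfer service was reached but the transfer failed."""


def deliver(message: bytes, send_via_dot, send_via_smtp, max_attempts: int = 3) -> str:
    """Try the transfer service first; fall back to plain SMTP.

    Returns "dot" or "smtp" to say which path carried the message.
    """
    for _ in range(max_attempts):
        try:
            send_via_dot(message)
            return "dot"
        except ServiceUnavailable:
            break       # no transfer service contact: do not keep retrying
        except TransferFailed:
            continue    # transient failure: try again up to the limit
    send_via_smtp(message)
    return "smtp"
```

Keeping the fallback inside one function means the rest of the mail path does not need to know which transport was used.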
Future Work
• Security
  • Application encrypts before DOT
    - No block-based caching, reuse, mirroring, …
  • No encryption
    - Resembles the status quo
  • In progress: Convergent encryption
    – Requires integration with DOT chunking
• Application Preferences
  • Encryption, QoS, priorities, …
    – DOT might benefit from application input
  • Need an extensible way to express these
Conclusion
• DOT separates app. logic from data transfer
  • Makes it easier to extend both
• Architecture works well
  • Overhead low (especially in wide-area)
• Major benefits
  • Caching
  • Flexibility to implement new transfer techniques
Backup Slides
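Normal SMTP
[Figure: message sequence between SMTP Client and Server]
Client → Server: EHLO
Server → Client: 250 Hello
Client → Server: MAIL FROM: user…
Client → Server: DATA
Server → Client: 250 OK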
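DOT-Enabled SMTP
[Figure: message sequence between SMTP Client and Server, with the Xfer Service shown alongside]
Client → Server: EHLO
Server → Client: 250 Hello
Client → Server: MAIL FROM: user…
Client → Server: X-DOT-DATA (OID+Hints)
Server → Client: 250 OK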
Convergent Encryption
[Figure: a File divided into chunks, with Hash1, Hash2, and Hash3 computed over the chunks]
• Chunk_i is encrypted using Hash_i
• All identical cleartext blocks will map to the same encrypted block
• Hash_i is further encrypted using a private key
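The first two bullets can be demonstrated with a small Python sketch. It is a toy under loud assumptions: the keystream built from SHA-256 plus a counter is a stand-in for a real cipher and must not be used for actual security, and the function names are invented. The third bullet (protecting Hash_i with a private key) is left out.

```python
import hashlib


def _keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key plus a block counter. Not a vetted cipher."""
    blocks = (length + 31) // 32
    stream = b"".join(
        hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        for counter in range(blocks))
    return stream[:length]


def convergent_encrypt(chunk: bytes):
    """Encrypt a chunk under a key derived from its own contents (Hash_i)."""
    key = hashlib.sha256(chunk).digest()
    stream = _keystream(key, len(chunk))
    ciphertext = bytes(a ^ b for a, b in zip(chunk, stream))
    return key, ciphertext


def convergent_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """Invert convergent_encrypt given the chunk's key."""
    stream = _keystream(key, len(ciphertext))
    return bytes(a ^ b for a, b in zip(ciphertext, stream))
```

Since the key depends only on the cleartext, two parties that encrypt the same chunk independently produce the same ciphertext, so caches can still recognise duplicate blocks.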
Standard File Transfer
Mail Server Evaluation
• Trace: 159 days at low volume academic mail server
  • 458,861 messages
  • hash, size of: message, headers, body
• Message chunks
  • hash and size of each chunk
  • Static chunking and Rabin fingerprinting (Content-based block division)
DOT chunk caching benefits email
Method          Total Bytes    Percent Bytes
SMTP Default    6800 MB        -
DOT body        5876 MB        86.41%
Rabin body      5056 MB        74.35%
Rabin whole     5496 MB        80.81%
Default GTC-GTC Transfer Protocol
[Figure: message sequence between Receiver and Sender]
Receiver → Sender: GET_DESCRIPTORS(OID)
Sender → Receiver: Desc list 1,2,…
Receiver → Sender: GET_CHUNKS(…)
Sender → Receiver: Chunk 1, Chunk 2
Receiver → Sender: GET_CHUNKS(…)
• GTC-GTC protocol mirrors transfer plugins
• Implemented as RPC calls
• (Fetches are actually pipelined)
Rabin Fingerprinting
[Figure: Rabin fingerprints (4, 7, 8, 2, 8) computed along the File Data. With a given value of 8, each position whose fingerprint equals 8 is a natural boundary, and the data between boundaries is hashed (Hash 1, Hash 2)]
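Content-based chunking of this kind can be sketched in Python. Note the substitution: the code uses a simple polynomial rolling hash over a sliding window as a stand-in for true Rabin fingerprints, and the window size, mask, and function names are assumptions picked for the example. The boundary rule (cut wherever the fingerprint matches a given value, here 8) follows the slide.

```python
import hashlib

WINDOW = 16            # bytes covered by the rolling hash (assumed)
BASE = 257
MOD = 2**61 - 1
MASK = 2**6 - 1        # compare the low 6 bits: about one boundary per 64 bytes
GIVEN_VALUE = 8        # cut whenever the masked fingerprint equals this


def split_chunks(data: bytes) -> list:
    """Split data at natural boundaries chosen by the last WINDOW bytes."""
    chunks, start, h = [], 0, 0
    top = pow(BASE, WINDOW - 1, MOD)
    for i, byte in enumerate(data):
        if i >= WINDOW:
            # Drop the byte that just left the window.
            h = (h - data[i - WINDOW] * top) % MOD
        h = (h * BASE + byte) % MOD
        if i + 1 >= WINDOW and (h & MASK) == GIVEN_VALUE:
            chunks.append(data[start:i + 1])
            start = i + 1
    if data[start:]:
        chunks.append(data[start:])   # trailing bytes after the last boundary
    return chunks


def chunk_hashes(data: bytes) -> list:
    """Hash each content-defined chunk."""
    return [hashlib.sha1(c).hexdigest() for c in split_chunks(data)]
```

Because each boundary depends only on the bytes inside the window, inserting data near the start of a file shifts later boundaries along with the content, so every chunk after the first boundary keeps its hash.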
Rabin Fingerprinting: Examples of Edits
1. Original File
2. Addition in chunk
   – Changes only one hash
3. Addition creating a new breakpoint
4. Deletion changing size of chunk
Figure from “A Low-bandwidth Network File System”
DOT Object Naming
• Objects represented by an OID
  • Divided into “chunks”, each with a descriptor
• Each OID corresponds to a list of descriptors
  • Data is fetched using descriptor lists
  • Supports partial transfers
Innovation in Data Transfer is Hard
• Imagine: You have a novel data transfer technique
  • Say… Bittorrent, a P2P protocol for sharing large files
• How do you deploy?
  1. Update HTTP. Talk to IETF. Modify Apache, IIS, Firefox, Netscape, Opera, IE, Lynx, Wget, …
  2. Update SMTP. Talk to IETF. Modify Sendmail, Postfix, Exchange, Mail.app, Eudora, …
  3. Give up in frustration
DOT’s Modular Architecture
[Figure: the Application sits above DOT and uses (1) the Application API. DOT reaches the Network through three Transfer Plugins via (2) the Transfer Plugin API, and Local Storage through a Storage Plugin via (3) the Storage Plugin API]
DOT Plugins
• Multipath plugin
  • List of sub-plugins
  • Balances load
• Portable storage plugin
  • Sender: Copies new data onto USB flash device
  • Receiver: Scans USB flash device for blocks
    – naïve filesystem layout, unoptimized
  • but effective
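The portable storage behaviour can be sketched with an ordinary directory standing in for the USB flash device. The layout used here (one file per chunk, named by its SHA-1 hash) and the function names are assumptions for illustration; the slide only says the real layout is naïve and unoptimized.

```python
import hashlib
from pathlib import Path


def copy_new_chunks(chunks, device_dir: Path) -> int:
    """Sender side: write chunks not already on the device; return count written."""
    written = 0
    for chunk in chunks:
        path = device_dir / hashlib.sha1(chunk).hexdigest()
        if not path.exists():
            path.write_bytes(chunk)
            written += 1
    return written


def scan_for_chunks(wanted_hashes, device_dir: Path) -> dict:
    """Receiver side: return {hash: bytes} for wanted chunks found on the device."""
    found = {}
    for path in device_dir.iterdir():
        if path.name in wanted_hashes:
            data = path.read_bytes()
            # Content naming lets the receiver check the block before using it.
            if hashlib.sha1(data).hexdigest() == path.name:
                found[path.name] = data
    return found
```

Any chunk the receiver does not find on the device would still be fetched over the network in the usual way.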
DOT email chunk caching evaluation
• Step 1: Caching analysis (infinite cache)
  • SMTP default: What was really sent
  • DOT body: Whole-body only caching, headers sent separately
  • Rabin body: Headers sent separately, Rabin fingerprint chunking of body
  • Rabin whole: Headers+body chunked together
    • Easiest to implement for application. Just send data…
• Step 2: Trace replay through Postfix
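The infinite-cache analysis in Step 1 amounts to counting each distinct chunk once. The sketch below assumes the trace is available as a list of (chunk hash, size) pairs in send order; that representation and the function names are illustrative, not the format of the actual trace.

```python
def bytes_sent_with_cache(trace) -> int:
    """Bytes that must cross the network with an infinite chunk cache.

    trace: list of (chunk_hash, size) in send order. A chunk costs bytes
    only the first time its hash appears.
    """
    seen = set()
    total = 0
    for chunk_hash, size in trace:
        if chunk_hash not in seen:
            seen.add(chunk_hash)
            total += size
    return total


def percent_of_default(trace) -> float:
    """Cached bytes as a percentage of what would be sent with no caching."""
    default = sum(size for _, size in trace)
    return 100 * bytes_sent_with_cache(trace) / default
```

Running the same function over whole bodies, body chunks, or header-plus-body chunks gives the different rows of the comparison; only the way the trace is chunked changes.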
Related Work
• BEEP
• Proxy-based data interposition approaches
  – RON, X-Bone, OCALA
• Other Content-Addressable Systems
  – Bittorrent, DHTs, DTNs, EMC’s Centera
• Using Content-Addressability to save on data transfers
  – CASPER, LBFS, Rhea et al., Spring et al.
• Portable Storage
  – Lookaside Caching, BlueFS
• Other transfer protocols
  – GridFTP, IBP, HTTP, etc.
Transfer plugin API
• get_descriptors( OID, hints )
• get_chunks( descriptor, hints )
• cancel_chunks( chunk_list )
• Hints specify a plugin + data
  • gtc://sender.example.com:12000/
  • dht://opendht.org/
  • …
• Transfer plugin chaining is easy
  • e.g., multipath plugin