Geo-Distribution, 唐宇, November 2013. Outline: Geo-distribution, SMFS, RACS.
![Page 1: Geo-Distribution 唐宇 2013 年 11 月. Outline Geo-distribution SMFS RACS.](https://reader036.fdocuments.us/reader036/viewer/2022081418/56649cfe5503460f949cf2a9/html5/thumbnails/1.jpg)
SDP-MARCH-Talk
Geo-Distribution
唐宇, November 2013
![Page 2: Geo-Distribution 唐宇 2013 年 11 月. Outline Geo-distribution SMFS RACS.](https://reader036.fdocuments.us/reader036/viewer/2022081418/56649cfe5503460f949cf2a9/html5/thumbnails/2.jpg)
Outline
• Geo-distribution
• SMFS
• RACS
![Page 3: Geo-Distribution 唐宇 2013 年 11 月. Outline Geo-distribution SMFS RACS.](https://reader036.fdocuments.us/reader036/viewer/2022081418/56649cfe5503460f949cf2a9/html5/thumbnails/3.jpg)
Why geo-distribution?
• Securing data from large-scale disasters is important.
  – 40% of enterprises that experience a disaster (e.g. loss of a site) go out of business within five years.
  – A data-loss failure at a large bank can have much greater consequences, with potentially global implications.
![Page 4: Geo-Distribution 唐宇 2013 年 11 月. Outline Geo-distribution SMFS RACS.](https://reader036.fdocuments.us/reader036/viewer/2022081418/56649cfe5503460f949cf2a9/html5/thumbnails/4.jpg)
Open questions
• The trade-off involves balancing safety against performance
  – Synchronous
    • Sensitive to link latency
  – Semi-synchronous
    • Data can still be lost if disaster strikes
  – Fully asynchronous
    • Weakest safety guarantees
![Page 5: Geo-Distribution 唐宇 2013 年 11 月. Outline Geo-distribution SMFS RACS.](https://reader036.fdocuments.us/reader036/viewer/2022081418/56649cfe5503460f949cf2a9/html5/thumbnails/5.jpg)
Outline
• Geo-distribution
• SMFS
• RACS
![Page 6: Geo-Distribution 唐宇 2013 年 11 月. Outline Geo-distribution SMFS RACS.](https://reader036.fdocuments.us/reader036/viewer/2022081418/56649cfe5503460f949cf2a9/html5/thumbnails/6.jpg)
SMFS (Smoke and Mirrors File System)
• Smoke and Mirrors: Reflecting Files at a Geographically Remote Location Without Loss of Performance
  – FAST’09
  – Hakim Weatherspoon, Lakshmi Ganesh, Tudor Marian, Mahesh Balakrishnan, and Ken Birman
  – Cornell University & Microsoft Research (Silicon Valley)
![Page 7: Geo-Distribution 唐宇 2013 年 11 月. Outline Geo-distribution SMFS RACS.](https://reader036.fdocuments.us/reader036/viewer/2022081418/56649cfe5503460f949cf2a9/html5/thumbnails/7.jpg)
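Contributions of the paper
• Beyond the three common approaches to geo-distribution, it proposes a new approach, network-sync:
  – Offers stronger guarantees on data reliability than semi-synchronous and asynchronous solutions while retaining their performance
• Supports atomic updates across multiple files (omitted here)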
![Page 8: Geo-Distribution 唐宇 2013 年 11 月. Outline Geo-distribution SMFS RACS.](https://reader036.fdocuments.us/reader036/viewer/2022081418/56649cfe5503460f949cf2a9/html5/thumbnails/8.jpg)
Data Loss Model
• We consider data to be lost if an update has been acknowledged to the client, but the corresponding data no longer exists in the system.
  – Synchronous
    • When both the primary and mirror sites fail.
  – Semi-synchronous
    • When the primary site fails and sent packets do not make it to the mirror.
  – Asynchronous
    • When the primary site fails.
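
The three loss conditions in this model can be written as one predicate. The following is a minimal sketch: the `Failure` record and the `acked_update_lost` function are hypothetical names introduced here, and the logic encodes only the conditions listed on the slide, not the behavior of SMFS itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Failure:
    """Hypothetical description of one disaster scenario."""
    primary_failed: bool
    mirror_failed: bool
    in_flight_lost: bool  # packets already sent by the primary never reach the mirror


def acked_update_lost(protocol: str, f: Failure) -> bool:
    """Return True if an acknowledged update may no longer exist anywhere."""
    if protocol == "synchronous":
        return f.primary_failed and f.mirror_failed
    if protocol == "semi-synchronous":
        return f.primary_failed and f.in_flight_lost
    if protocol == "asynchronous":
        return f.primary_failed
    raise ValueError(f"unknown protocol: {protocol}")


# A rolling disaster: the primary site goes down and its in-flight packets are dropped.
rolling = Failure(primary_failed=True, mirror_failed=False, in_flight_lost=True)
for name in ("synchronous", "semi-synchronous", "asynchronous"):
    print(name, acked_update_lost(name, rolling))
```

In this scenario only the synchronous protocol keeps every acknowledged update, which is the gap that network-sync tries to narrow without paying the synchronous latency.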
![Page 9: Geo-Distribution 唐宇 2013 年 11 月. Outline Geo-distribution SMFS RACS.](https://reader036.fdocuments.us/reader036/viewer/2022081418/56649cfe5503460f949cf2a9/html5/thumbnails/9.jpg)
Failure Model and Assumptions
• Failures can occur at any level
  – Storage devices, storage area network, network links, switches, hubs, wide-area network, and/or an entire site
• Failures can occur simultaneously or in sequence
• Sites may have redundant network paths connecting them
  – This allows us to focus on tolerating failures that disable an entire site, and on combinations of failures such as the loss of both an entire site and the network connecting it to the backup (a rolling disaster)
![Page 10: Geo-Distribution 唐宇 2013 年 11 月. Outline Geo-distribution SMFS RACS.](https://reader036.fdocuments.us/reader036/viewer/2022081418/56649cfe5503460f949cf2a9/html5/thumbnails/10.jpg)
How network-sync works (overview)
1. It proactively adds redundancy at the network level to transmitted data.
2. It exposes the level of in-network redundancy added for any sent data via feedback notifications.
![Page 11: Geo-Distribution 唐宇 2013 年 11 月. Outline Geo-distribution SMFS RACS.](https://reader036.fdocuments.us/reader036/viewer/2022081418/56649cfe5503460f949cf2a9/html5/thumbnails/11.jpg)
Network-sync implementation details
![Page 12: Geo-Distribution 唐宇 2013 年 11 月. Outline Geo-distribution SMFS RACS.](https://reader036.fdocuments.us/reader036/viewer/2022081418/56649cfe5503460f949cf2a9/html5/thumbnails/12.jpg)
Maelstrom
• Forward Error Correction (FEC)
  – A generic term for a broad collection of techniques aimed at proactively recovering from packet loss or corruption.
  – FEC implementations for data generated in real-time are typically parameterized by a rate (r, c): for every r data packets, c error correction packets are introduced into the stream.
• Maelstrom is one implementation of FEC
  – Its performance is tolerant to random and bursty loss
  – Based on the TCP protocol
• If Maelstrom cannot repair the damaged data either, the packets have to be retransmitted
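
To make the rate (r, c) concrete, here is a minimal sketch of XOR-based FEC with c = 1. It also uses interleaving, one common way to make such a code survive short bursts of loss. The function names and the `interleave` parameter are assumptions made for illustration; the slide does not describe Maelstrom's actual encoding, so this should not be read as its implementation.

```python
from functools import reduce
from typing import List, Optional


def xor_all(blocks: List[bytes]) -> bytes:
    """XOR equal-length byte strings together."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)


def fec_encode(window: List[bytes], r: int, interleave: int = 1) -> List[bytes]:
    """Build repair packets for a window of r * interleave data packets.

    Repair packet j is the XOR of data packets j, j + interleave, j + 2 * interleave, ...
    so the stream still carries c = 1 repair packet per r data packets.
    """
    if len(window) != r * interleave:
        raise ValueError("window must hold exactly r * interleave packets")
    return [xor_all(window[j::interleave]) for j in range(interleave)]


def fec_recover(received: List[Optional[bytes]], repairs: List[bytes]) -> List[Optional[bytes]]:
    """Fill in lost packets (None) wherever a repair group misses exactly one packet."""
    interleave = len(repairs)
    out = list(received)
    for j, repair in enumerate(repairs):
        group = range(j, len(out), interleave)
        missing = [i for i in group if out[i] is None]
        if len(missing) == 1:
            present = [out[i] for i in group if out[i] is not None]
            out[missing[0]] = xor_all([repair] + present)
    return out


data = [bytes([i]) * 4 for i in range(1, 7)]       # six 4-byte data packets
repairs = fec_encode(data, r=3, interleave=2)      # rate (3, 1) with interleave 2
damaged = data[:2] + [None, None] + data[4:]       # burst loss of packets 2 and 3
recovered = fec_recover(damaged, repairs)
print(recovered == data)
```

With `interleave=1` the same two-packet burst would hit a single repair group twice and could not be repaired, which is the case where retransmission is the only option left.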
![Page 13: Geo-Distribution 唐宇 2013 年 11 月. Outline Geo-distribution SMFS RACS.](https://reader036.fdocuments.us/reader036/viewer/2022081418/56649cfe5503460f949cf2a9/html5/thumbnails/13.jpg)
Comparison of Mirroring Protocols
• Network-sync can be understood as an enhancement of the semi-synchronous style of mirroring
• It offers performance similar to semi-synchronous solutions, but with increased reliability
![Page 14: Geo-Distribution 唐宇 2013 年 11 月. Outline Geo-distribution SMFS RACS.](https://reader036.fdocuments.us/reader036/viewer/2022081418/56649cfe5503460f949cf2a9/html5/thumbnails/14.jpg)
Evaluation Configuration
• Local-sync (semi-synchronous)
• Remote-sync (synchronous)
• Network-sync
• Local-sync+FEC
• Remote-sync+FEC
![Page 15: Geo-Distribution 唐宇 2013 年 11 月. Outline Geo-distribution SMFS RACS.](https://reader036.fdocuments.us/reader036/viewer/2022081418/56649cfe5503460f949cf2a9/html5/thumbnails/15.jpg)
Reliability During Disaster (1)
The remote-sync and remote-sync+FEC solutions do not lose data in this situation
The y-axis shows both the total number of messages sent and total number of messages lost
![Page 16: Geo-Distribution 唐宇 2013 年 11 月. Outline Geo-distribution SMFS RACS.](https://reader036.fdocuments.us/reader036/viewer/2022081418/56649cfe5503460f949cf2a9/html5/thumbnails/16.jpg)
Reliability During Disaster (2)
Latency is the time between a local storage server sending a request and a remote storage server receiving the request.
![Page 17: Geo-Distribution 唐宇 2013 年 11 月. Outline Geo-distribution SMFS RACS.](https://reader036.fdocuments.us/reader036/viewer/2022081418/56649cfe5503460f949cf2a9/html5/thumbnails/17.jpg)
Performance evaluation (1)
The x-axis shows loss probability on the wide-area link being increased from 0% to 1%, while the y-axis shows the throughput achieved by each of these mirroring solutions.
![Page 18: Geo-Distribution 唐宇 2013 年 11 月. Outline Geo-distribution SMFS RACS.](https://reader036.fdocuments.us/reader036/viewer/2022081418/56649cfe5503460f949cf2a9/html5/thumbnails/18.jpg)
Performance evaluation (2)
![Page 19: Geo-Distribution 唐宇 2013 年 11 月. Outline Geo-distribution SMFS RACS.](https://reader036.fdocuments.us/reader036/viewer/2022081418/56649cfe5503460f949cf2a9/html5/thumbnails/19.jpg)
Outline
• Geo-distribution
• SMFS
• RACS
![Page 20: Geo-Distribution 唐宇 2013 年 11 月. Outline Geo-distribution SMFS RACS.](https://reader036.fdocuments.us/reader036/viewer/2022081418/56649cfe5503460f949cf2a9/html5/thumbnails/20.jpg)
RACS
• RACS: A Case for Cloud Storage Diversity
  – SoCC’10
  – Hussam Abu-Libdeh, Lonnie Princehouse, Hakim Weatherspoon
  – Cornell University
![Page 21: Geo-Distribution 唐宇 2013 年 11 月. Outline Geo-distribution SMFS RACS.](https://reader036.fdocuments.us/reader036/viewer/2022081418/56649cfe5503460f949cf2a9/html5/thumbnails/21.jpg)
Application scenario
• The increasing popularity of cloud storage is leading organizations to consider moving data out of their own data centers and into the cloud
• It becomes very expensive for organizations to switch storage providers.
• We argue that striping user data across multiple providers can allow customers to avoid vendor lock-in, reduce the cost of switching providers, and better tolerate provider outages or failures.
![Page 22: Geo-Distribution 唐宇 2013 年 11 月. Outline Geo-distribution SMFS RACS.](https://reader036.fdocuments.us/reader036/viewer/2022081418/56649cfe5503460f949cf2a9/html5/thumbnails/22.jpg)
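What the paper does
• RACS (Redundant Array of Cloud Storage)
  – A cloud storage proxy that transparently stripes data across multiple cloud storage providers
• Contributions
  – Shows through simulation that RACS can tolerate partial data unavailability at an acceptable extra cost and reduce dependence on any single storage provider
  – Shows through simulation the cost of switching cloud storage providers
  – Demonstrates that RACS is compatible with Amazon S3 clients and can use multiple storage providers as back ends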
![Page 23: Geo-Distribution 唐宇 2013 年 11 月. Outline Geo-distribution SMFS RACS.](https://reader036.fdocuments.us/reader036/viewer/2022081418/56649cfe5503460f949cf2a9/html5/thumbnails/23.jpg)
Solution
• How can dependence on a storage provider be reduced?
• Data mirroring / replication?
  – The extra overhead is too high
• Solution
  – Similar to RAID 5
  – Erasure coding
![Page 24: Geo-Distribution 唐宇 2013 年 11 月. Outline Geo-distribution SMFS RACS.](https://reader036.fdocuments.us/reader036/viewer/2022081418/56649cfe5503460f949cf2a9/html5/thumbnails/24.jpg)
RAID 5
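• Given n disks, split the data to be written evenly into (n-1) blocks and store them on (n-1) disks
• Compute bitwise parity over the data in those (n-1) blocks and store it on the n-th disk (real RAID 5 rotates the parity block across the disks from stripe to stripe)
• When one disk fails, the lost data can be recovered from the other (n-1) disks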
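
The parity scheme on this slide can be sketched in a few lines. This is an illustrative example only: `stripe_with_parity` and `rebuild` are hypothetical helper names, each "disk" is just a byte string, and the parity block always sits on the last disk as the slide describes.

```python
from typing import List, Optional


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def stripe_with_parity(data: bytes, n: int) -> List[bytes]:
    """Split data into n - 1 equal blocks and append one parity block."""
    k = n - 1
    block_len = -(-len(data) // k)                 # ceiling division
    padded = data.ljust(block_len * k, b"\0")      # zero-pad the last block
    blocks = [padded[i * block_len:(i + 1) * block_len] for i in range(k)]
    parity = bytes(block_len)
    for block in blocks:
        parity = xor_blocks(parity, block)
    return blocks + [parity]


def rebuild(disks: List[Optional[bytes]]) -> List[bytes]:
    """Recompute the single missing disk (None) as the XOR of the survivors."""
    missing = [i for i, d in enumerate(disks) if d is None]
    if len(missing) != 1:
        raise ValueError("single parity can repair exactly one failed disk")
    survivors = [d for d in disks if d is not None]
    lost = bytes(len(survivors[0]))
    for block in survivors:
        lost = xor_blocks(lost, block)
    repaired = list(disks)
    repaired[missing[0]] = lost
    return repaired


disks = stripe_with_parity(b"geo-distribution", n=5)   # 4 data disks + 1 parity disk
failed = disks[:2] + [None] + disks[3:]                # disk 2 fails
print(b"".join(rebuild(failed)[:4]))
```

Because XOR is its own inverse, the same `rebuild` step works whether the failed disk held data or parity.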
![Page 25: Geo-Distribution 唐宇 2013 年 11 月. Outline Geo-distribution SMFS RACS.](https://reader036.fdocuments.us/reader036/viewer/2022081418/56649cfe5503460f949cf2a9/html5/thumbnails/25.jpg)
Erasure coding
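• Erasure coding
  – It transforms a message of k symbols into a longer message with n symbols such that the original message can be recovered from a subset of the n symbols
• Implementation in RACS
  – Given n disks, split the data to be written evenly into m blocks and store them on m disks
  – Compute specially encoded redundancy over the data on those m disks and store it on the remaining (n-m) disks
  – When the data on a disk is lost, it can be recovered from any m of the other disks (so up to n-m disks can be lost)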
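
The "any m of n" property can be demonstrated with a small polynomial code. The slide does not say which erasure code RACS uses, so the following is only a sketch under stated assumptions: it uses a Reed-Solomon-style construction over the prime field GF(257), works on one symbol per share, and the names `rs_encode` and `rs_decode` are hypothetical. Redundant symbols may take the value 256, so shares are kept as integers rather than bytes.

```python
from itertools import combinations
from typing import List, Tuple

P = 257  # smallest prime above the byte range; all arithmetic is mod P

Share = Tuple[int, int]  # (x coordinate, value)


def lagrange_eval(points: List[Share], x: int) -> int:
    """Evaluate at x the unique polynomial of degree < len(points) through the points."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P   # modular inverse, Python 3.8+
    return total


def rs_encode(data: List[int], n: int) -> List[Share]:
    """Keep the m data symbols as shares 0..m-1 and add n - m redundant shares."""
    m = len(data)
    points = list(enumerate(data))
    return [(x, data[x] if x < m else lagrange_eval(points, x)) for x in range(n)]


def rs_decode(shares: List[Share], m: int) -> List[int]:
    """Recover the m data symbols from any m distinct shares."""
    if len(shares) < m:
        raise ValueError("need at least m shares")
    return [lagrange_eval(shares[:m], x) for x in range(m)]


data = list(b"RACS")                      # m = 4 data symbols
shares = rs_encode(data, n=6)             # 6 shares, one per provider
ok = all(rs_decode(list(subset), m=4) == data for subset in combinations(shares, 4))
print(ok)
```

With m = n - 1 the scheme tolerates one lost share, like the RAID 5 layout; a smaller m buys tolerance to more simultaneous losses at the price of more storage.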
![Page 26: Geo-Distribution 唐宇 2013 年 11 月. Outline Geo-distribution SMFS RACS.](https://reader036.fdocuments.us/reader036/viewer/2022081418/56649cfe5503460f949cf2a9/html5/thumbnails/26.jpg)
Advantages of erasure coding
• Tolerating Outages
• Tolerating Data Loss
• Adapting to Price Changes
• Adapting to New Providers
• Control Monetary Spending
• Choice in Data Recovery
![Page 27: Geo-Distribution 唐宇 2013 年 11 月. Outline Geo-distribution SMFS RACS.](https://reader036.fdocuments.us/reader036/viewer/2022081418/56649cfe5503460f949cf2a9/html5/thumbnails/27.jpg)
Distributed RACS
• To avoid a bottleneck
• ZooKeeper – an open-source coordination service similar to Chubby
![Page 28: Geo-Distribution 唐宇 2013 年 11 月. Outline Geo-distribution SMFS RACS.](https://reader036.fdocuments.us/reader036/viewer/2022081418/56649cfe5503460f949cf2a9/html5/thumbnails/28.jpg)
Performance overhead
• Storage
  – RACS uses a factor of n/m more storage, plus some additional overhead for metadata associated with each share
• Number of requests
  – n for put, create, and delete operations
  – m for get operations
• Bandwidth
  – The bandwidth used by put operations increases by a factor of n/m, due to the redundant shares
• Latency
  – Put operations must wait for the slowest of the repositories
  – Get latency could be better than the average of all repositories
  – Coordination with ZooKeeper is another source of latency
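
As a worked example of these factors, the sketch below computes them for a given (n, m). The function names and the latency figures are made up for illustration. Metadata and ZooKeeper coordination costs are left out, and the get-latency model assumes the proxy reads from the m fastest repositories in parallel, which the slide does not state.

```python
from fractions import Fraction
from typing import Dict, List


def racs_overhead(n: int, m: int) -> Dict[str, object]:
    """Overhead factors for n shares of which any m reconstruct the object."""
    if not 0 < m <= n:
        raise ValueError("require 0 < m <= n")
    return {
        "storage_factor": Fraction(n, m),        # excludes per-share metadata
        "put_bandwidth_factor": Fraction(n, m),
        "put_requests": n,                       # also create and delete
        "get_requests": m,
    }


def put_latency(latencies_ms: List[float]) -> float:
    """A put completes only when the slowest repository has answered."""
    return max(latencies_ms)


def get_latency(latencies_ms: List[float], m: int) -> float:
    """Assumed model: read from the m fastest repositories in parallel."""
    return sorted(latencies_ms)[m - 1]


latencies = [60.0, 80.0, 120.0, 200.0, 300.0]    # hypothetical per-provider latencies (ms)
print(racs_overhead(n=5, m=3))
print(put_latency(latencies), get_latency(latencies, m=3), sum(latencies) / len(latencies))
```

For n = 5 and m = 3 the storage and put bandwidth grow by 5/3, and under the assumed read model a get finishes faster than the mean latency across the five repositories.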
![Page 29: Geo-Distribution 唐宇 2013 年 11 月. Outline Geo-distribution SMFS RACS.](https://reader036.fdocuments.us/reader036/viewer/2022081418/56649cfe5503460f949cf2a9/html5/thumbnails/29.jpg)
About the experimental data set
• Data covers 18 months of activity on the Internet Archive (IA, http://www.archive.org) servers.
• The trace represents HTTP and FTP interactions to read and write various documents and media files (images, sounds, videos) stored at the Internet Archive and served to users.
![Page 30: Geo-Distribution 唐宇 2013 年 11 月. Outline Geo-distribution SMFS RACS.](https://reader036.fdocuments.us/reader036/viewer/2022081418/56649cfe5503460f949cf2a9/html5/thumbnails/30.jpg)
Experimental data set
![Page 31: Geo-Distribution 唐宇 2013 年 11 月. Outline Geo-distribution SMFS RACS.](https://reader036.fdocuments.us/reader036/viewer/2022081418/56649cfe5503460f949cf2a9/html5/thumbnails/31.jpg)
Money cost
![Page 32: Geo-Distribution 唐宇 2013 年 11 月. Outline Geo-distribution SMFS RACS.](https://reader036.fdocuments.us/reader036/viewer/2022081418/56649cfe5503460f949cf2a9/html5/thumbnails/32.jpg)
Monthly costs breakdown
![Page 33: Geo-Distribution 唐宇 2013 年 11 月. Outline Geo-distribution SMFS RACS.](https://reader036.fdocuments.us/reader036/viewer/2022081418/56649cfe5503460f949cf2a9/html5/thumbnails/33.jpg)
Other results
Tolerating a vendor price hike
The cost of switching the Internet Archive’s storage provider
All response times averaged over four runs
![Page 34: Geo-Distribution 唐宇 2013 年 11 月. Outline Geo-distribution SMFS RACS.](https://reader036.fdocuments.us/reader036/viewer/2022081418/56649cfe5503460f949cf2a9/html5/thumbnails/34.jpg)
FUTURE WORK
1. Relationships between storage providers are not yet taken into account
  – The virtual compute nodes of Amazon EC2 can read from and write to Amazon S3 storage with low latency and no bandwidth charges
2. Heterogeneity of storage providers
  – Cloud providers
  – Cluster
  – Desktop PC
![Page 35: Geo-Distribution 唐宇 2013 年 11 月. Outline Geo-distribution SMFS RACS.](https://reader036.fdocuments.us/reader036/viewer/2022081418/56649cfe5503460f949cf2a9/html5/thumbnails/35.jpg)
Thanks! Q&A