FAWN: A Fast Array of Wimpy Nodes - SCHOOL OF COMPUTER SCIENCE
![Page 1: FAWN: Fast Array of Wimpy Nodes](https://reader035.fdocuments.us/reader035/viewer/2022081420/56815f3e550346895dce12fb/html5/thumbnails/1.jpg)
FAWN: Fast Array of Wimpy Nodes
A technical paper presentation in fulfillment of the requirements of CIS 570 – Advanced Computer Systems – Fall 2013
Scott R. [email protected] 14-Nov-2013
![Page 2: FAWN: Fast Array of Wimpy Nodes](https://reader035.fdocuments.us/reader035/viewer/2022081420/56815f3e550346895dce12fb/html5/thumbnails/2.jpg)
![Page 3: FAWN: Fast Array of Wimpy Nodes](https://reader035.fdocuments.us/reader035/viewer/2022081420/56815f3e550346895dce12fb/html5/thumbnails/3.jpg)
Overview
• Identify the problem space
• FAWN as a solution
  – Architecture principles
  – Unique key-value storage
• Evaluate and benchmark a 21-node FAWN cluster
• Identify when FAWN makes sense
![Page 4: FAWN: Fast Array of Wimpy Nodes](https://reader035.fdocuments.us/reader035/viewer/2022081420/56815f3e550346895dce12fb/html5/thumbnails/4.jpg)
Theoretical Problem Space
• CPU-I/O gap
  – Modern processors are so fast that they spend much of their time idle, waiting on I/O
• CPU power consumption grows super-linearly with speed
  – Larger caches to keep the superscalar pipelines fed are one driver
• Dynamic Voltage and Frequency Scaling (DVFS) is inefficient
  – Intel SpeedStep technology
  – A lightly loaded CPU still generally draws about 50% of its peak power
![Page 5: FAWN: Fast Array of Wimpy Nodes](https://reader035.fdocuments.us/reader035/viewer/2022081420/56815f3e550346895dce12fb/html5/thumbnails/5.jpg)
What’s the real problem?
• Electricity is expensive!
  – Home usage is measured in kW, data center usage in MW
• Facebook spends up to $1 million a month on electricity
  – Only three data centers!
    • Oregon, USA
    • Virginia, USA
    • Sweden
![Page 6: FAWN: Fast Array of Wimpy Nodes](https://reader035.fdocuments.us/reader035/viewer/2022081420/56815f3e550346895dce12fb/html5/thumbnails/6.jpg)
Facebook’s Not Playing Around
• Fourth data center to be powered by renewable wind energy
  – Iowa, USA
http://goo.gl/sFmmxz dtd 14-Nov-2013
![Page 7: FAWN: Fast Array of Wimpy Nodes](https://reader035.fdocuments.us/reader035/viewer/2022081420/56815f3e550346895dce12fb/html5/thumbnails/7.jpg)
Proposed Solution
• Fast Array of Wimpy Nodes (FAWN)
  – Bridge the I/O gap
    • Use slower CPUs and faster flash storage
  – Reduce power consumption per node
    • Embedded CPUs consume significantly less power
  – Address distributed storage for the new architecture
    • New key-value storage system (FAWN-KV)
  – Complementary per-node data store (FAWN-DS)
![Page 8: FAWN: Fast Array of Wimpy Nodes](https://reader035.fdocuments.us/reader035/viewer/2022081420/56815f3e550346895dce12fb/html5/thumbnails/8.jpg)
System Architecture
![Page 9: FAWN: Fast Array of Wimpy Nodes](https://reader035.fdocuments.us/reader035/viewer/2022081420/56815f3e550346895dce12fb/html5/thumbnails/9.jpg)
Basic Functions
![Page 10: FAWN: Fast Array of Wimpy Nodes](https://reader035.fdocuments.us/reader035/viewer/2022081420/56815f3e550346895dce12fb/html5/thumbnails/10.jpg)
Replication & Consistency
![Page 11: FAWN: Fast Array of Wimpy Nodes](https://reader035.fdocuments.us/reader035/viewer/2022081420/56815f3e550346895dce12fb/html5/thumbnails/11.jpg)
Understanding Flash Storage
• Fast random reads
  – Up to 175x faster than HDDs
  – Performance varies widely between makes and models
• Efficient I/O
  – Very low power
  – High queries-per-Joule rate vs. HDDs
• Slow random writes
  – Expensive erase/write cycle
  – Motivation for log-structured (i.e. sequential) data storage
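The log-structured idea can be sketched in a few lines of Python. This is a minimal illustration, not FAWN-DS itself: the class name `LogStore`, the record layout (two 4-byte lengths, then key and value), and the use of an in-memory buffer in place of a flash file are all assumptions made for the example. Every write is appended at the tail, and an in-memory index maps each key to the offset of its newest record, so a lookup costs one random read.

```python
import io


class LogStore:
    """Append-only log with an in-memory index (hypothetical sketch)."""

    def __init__(self):
        self.log = io.BytesIO()   # stands in for a file on flash (assumption)
        self.index = {}           # key -> offset of the newest record

    def put(self, key: bytes, value: bytes) -> None:
        # Always write at the tail: no in-place (random) writes.
        offset = self.log.seek(0, io.SEEK_END)
        header = len(key).to_bytes(4, "big") + len(value).to_bytes(4, "big")
        self.log.write(header + key + value)
        # Older records for this key stay in the log as garbage.
        self.index[key] = offset

    def get(self, key: bytes):
        offset = self.index.get(key)
        if offset is None:
            return None
        self.log.seek(offset)                      # one random read
        key_len = int.from_bytes(self.log.read(4), "big")
        value_len = int.from_bytes(self.log.read(4), "big")
        self.log.seek(key_len, io.SEEK_CUR)        # skip the stored key
        return self.log.read(value_len)
```

Overwriting a key leaves the old record behind in the log, which is why a clean-up pass over the log is needed later.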
![Page 12: FAWN: Fast Array of Wimpy Nodes](https://reader035.fdocuments.us/reader035/viewer/2022081420/56815f3e550346895dce12fb/html5/thumbnails/12.jpg)
Optimized Maintenance Functions
• Split
  – Used when adding a node to the cluster
  – Scan the log sequentially and write each entry whose key is in range to a new data store
• Merge
  – Used when removing a node from the cluster
  – Key ranges are mutually exclusive, so one data store is simply appended to the other
• Compact
  – Cleans up entries in a data store
  – Skips orphaned, out-of-range, and deleted entries; writes the rest to a new data store
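The three maintenance functions can be sketched over a toy log, shown here as a Python list of `(key, value)` pairs. Everything in this block is an illustrative assumption rather than the FAWN-DS code: keys are integers, a key range is the half-open interval `[lo, hi)`, a value of `None` marks a delete, an "orphan" is any entry superseded by a later entry for the same key, and `split` copies in-range entries into a single new log.

```python
def in_range(key, lo, hi):
    """True if key falls in the half-open range [lo, hi)."""
    return lo <= key < hi


def split(log, lo, hi):
    """One sequential scan; in-range entries go to a new log."""
    return [(k, v) for k, v in log if in_range(k, lo, hi)]


def merge(dst, src):
    """Key ranges are disjoint, so merging is a plain append."""
    dst.extend(src)
    return dst


def compact(log, lo, hi):
    """Rewrite the log, keeping only live, in-range, newest entries."""
    newest = {}
    for pos, (k, _) in enumerate(log):
        newest[k] = pos                 # a later entry supersedes earlier ones
    out = []
    for pos, (k, v) in enumerate(log):
        if not in_range(k, lo, hi):
            continue                    # out of range (e.g. left over from a split)
        if newest[k] != pos:
            continue                    # orphan: a newer entry exists
        if v is None:
            continue                    # deleted
        out.append((k, v))
    return out
```

All three functions read their input front to back and only ever append to their output, which matches the sequential access pattern that flash favors.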
![Page 13: FAWN: Fast Array of Wimpy Nodes](https://reader035.fdocuments.us/reader035/viewer/2022081420/56815f3e550346895dce12fb/html5/thumbnails/13.jpg)
Optimized Sequential Reads & Writes
![Page 14: FAWN: Fast Array of Wimpy Nodes](https://reader035.fdocuments.us/reader035/viewer/2022081420/56815f3e550346895dce12fb/html5/thumbnails/14.jpg)
Front-end Consistent Hashing
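As a rough illustration of consistent hashing in general (the slide's diagram is not reproduced in this transcript, so this is not the presenter's figure), nodes and keys are hashed onto the same ring and a key belongs to the first node at or after its hash. The `Ring` class, the use of SHA-1, and the node names are assumptions made for this sketch.

```python
import bisect
import hashlib


def ring_hash(name: str) -> int:
    """Map a string to a point on the ring (SHA-1 is an arbitrary choice)."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")


class Ring:
    """Hypothetical consistent-hashing ring: a key is owned by its successor node."""

    def __init__(self, nodes):
        self.points = sorted((ring_hash(n), n) for n in nodes)

    def add(self, node: str) -> None:
        bisect.insort(self.points, (ring_hash(node), node))

    def owner(self, key: str) -> str:
        hashes = [h for h, _ in self.points]
        # First node at or after the key's hash, wrapping around the ring.
        i = bisect.bisect_left(hashes, ring_hash(key)) % len(self.points)
        return self.points[i][1]
```

The useful property is that adding a node only moves the keys that now fall to the new node; every other key keeps its owner.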
![Page 15: FAWN: Fast Array of Wimpy Nodes](https://reader035.fdocuments.us/reader035/viewer/2022081420/56815f3e550346895dce12fb/html5/thumbnails/15.jpg)
Node Join
![Page 16: FAWN: Fast Array of Wimpy Nodes](https://reader035.fdocuments.us/reader035/viewer/2022081420/56815f3e550346895dce12fb/html5/thumbnails/16.jpg)
Node Leave
• Rather than split the data stores, nodes merge them
• In reality, this means…
  – Add a new replica to each chain the departing node belonged to
  – So the processing is the same as for a join event
![Page 17: FAWN: Fast Array of Wimpy Nodes](https://reader035.fdocuments.us/reader035/viewer/2022081420/56815f3e550346895dce12fb/html5/thumbnails/17.jpg)
Failure Detection
• Nodes are assumed to be fail-stop
  – Front-end and back-end nodes gossip at a known rate
    • On timeout, the front-end initiates a leave operation for the failed node
• Current design only copes with node failures
  – Coping with network failures requires future work
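The timeout rule above can be sketched as a small bookkeeping class. This is a minimal sketch under assumptions, not the FAWN implementation: the name `FailureDetector`, the explicit `now` argument, and the timeout value are hypothetical, and the caller is expected to start a leave operation for each node returned.

```python
class FailureDetector:
    """Hypothetical timeout-based detector kept by a front-end."""

    def __init__(self, timeout: float):
        self.timeout = timeout
        self.last_seen = {}   # node -> time of its most recent message

    def heartbeat(self, node: str, now: float) -> None:
        """Record that `node` was heard from at time `now`."""
        self.last_seen[node] = now

    def expired(self, now: float) -> list:
        """Return nodes silent for longer than the timeout and forget them."""
        failed = sorted(
            n for n, t in self.last_seen.items() if now - t > self.timeout
        )
        for n in failed:
            del self.last_seen[n]   # fail-stop: the node is treated as gone
        return failed
```

Because the detector only sees silence, it cannot tell a crashed node from an unreachable one, which is the limitation the last bullet points at.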
![Page 18: FAWN: Fast Array of Wimpy Nodes](https://reader035.fdocuments.us/reader035/viewer/2022081420/56815f3e550346895dce12fb/html5/thumbnails/18.jpg)
Single Node Evaluation
• Performance almost entirely dependent on flash media
![Page 19: FAWN: Fast Array of Wimpy Nodes](https://reader035.fdocuments.us/reader035/viewer/2022081420/56815f3e550346895dce12fb/html5/thumbnails/19.jpg)
21-Node Evaluation
• In general, the back-ends prove to be well-matched
![Page 20: FAWN: Fast Array of Wimpy Nodes](https://reader035.fdocuments.us/reader035/viewer/2022081420/56815f3e550346895dce12fb/html5/thumbnails/20.jpg)
21-Node Evaluation
• Relatively responsive through maintenance operations
![Page 21: FAWN: Fast Array of Wimpy Nodes](https://reader035.fdocuments.us/reader035/viewer/2022081420/56815f3e550346895dce12fb/html5/thumbnails/21.jpg)
21-Node Evaluation
• Slightly slower than production key-value systems
  – Worst-case response times are on par
![Page 22: FAWN: Fast Array of Wimpy Nodes](https://reader035.fdocuments.us/reader035/viewer/2022081420/56815f3e550346895dce12fb/html5/thumbnails/22.jpg)
21-Node Evaluation
• Power draw is low and consistent across operations
![Page 23: FAWN: Fast Array of Wimpy Nodes](https://reader035.fdocuments.us/reader035/viewer/2022081420/56815f3e550346895dce12fb/html5/thumbnails/23.jpg)
21-Node Evaluation
• Power draw is low and consistent across operations
  – Queries per Joule is an order of magnitude higher than traditional production distributed systems
    • 1 billion instructions per Joule
    • 1/3 the frequency
    • 1/10 (or less) the power
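The last two bullets can be related with simple arithmetic. The sketch below rests on a simplifying assumption that is not in the slides, namely that work per second scales with clock frequency; the function name is hypothetical.

```python
def relative_work_per_joule(freq_ratio: float, power_ratio: float) -> float:
    """Work per Joule relative to a baseline node.

    Assumes (hypothetically) that work per second scales with clock
    frequency, so efficiency is the frequency ratio over the power ratio.
    """
    return freq_ratio / power_ratio


# 1/3 the frequency at 1/10 the power of the baseline node.
print(round(relative_work_per_joule(1 / 3, 1 / 10), 2))
```

Under that assumption a wimpy node does roughly 3.3 times the work per Joule; drawing less than 1/10 the power pushes the ratio higher. The order-of-magnitude queries-per-Joule figure is a separate claim from the evaluation and is not derived from this ratio.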
![Page 24: FAWN: Fast Array of Wimpy Nodes](https://reader035.fdocuments.us/reader035/viewer/2022081420/56815f3e550346895dce12fb/html5/thumbnails/24.jpg)
When does FAWN matter?
• It depends on the workload…
![Page 25: FAWN: Fast Array of Wimpy Nodes](https://reader035.fdocuments.us/reader035/viewer/2022081420/56815f3e550346895dce12fb/html5/thumbnails/25.jpg)
QUESTIONS?
Thanks very much!