FAWN: Fast Array of Wimpy Nodes
A technical paper presentation in fulfillment of the requirements of CIS 570 – Advanced Computer Systems – Fall 2013
Scott R. Sideleau (ssideleau@umassd.edu), 14-Nov-2013
Overview
• Identify the problem space
• FAWN as a solution
– Architecture principles
– Unique key-value storage
• Evaluate and benchmark a 21-node FAWN cluster
• Identify when FAWN makes sense
Theoretical Problem Space
• CPU–I/O gap
– Modern processors are so fast that much of their time is spent idle, waiting on I/O
• CPU power consumption scales superlinearly with speed
– The larger caches needed to keep superscalar pipelines fed are one driver
• Dynamic Voltage and Frequency Scaling (DVFS) is inefficient
– e.g., Intel SpeedStep technology
– Even scaled down, the CPU still generally operates at ~50% of peak power consumption
What's the real problem?
• Electricity is expensive!
– Home usage is measured in kW; data center usage in MW
• Facebook spends up to $1 million a month on electricity
– With only three data centers:
• Oregon, USA
• Virginia, USA
• Sweden
Facebook's Not Playing Around
• Fourth data center to be powered by renewable wind energy
– Iowa, USA
Source: http://goo.gl/sFmmxz (dated 14-Nov-2013)
Proposed Solution
• Fast Array of Wimpy Nodes (FAWN)
– Bridge the I/O gap
• Use slower CPUs paired with fast flash storage
– Reduce power consumption per node
• Embedded CPUs consume significantly less power
– Address distributed storage for the new architecture
• New key-value storage system (FAWN-KV)
– Complementary per-node data store (FAWN-DS)
System Architecture
Basic Functions
Replication & Consistency
Understanding Flash Storage
• Fast random reads
– Up to 175x faster than HDDs
– Varies widely between makes/models
• Efficient I/O
– Very low power
– High query-per-Joule rate vs. HDDs
• Slow random writes
– Expensive erase/write cycle
– Motivation for log-structured (i.e., sequential) data storage
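The log-structured layout that flash motivates can be sketched as an append-only log plus an in-memory index pointing at each key's newest record, which is the general shape of FAWN-DS. This is an illustrative sketch, not the paper's API; a real store would append to a flash-backed file rather than an in-memory byte buffer.

```python
class LogStore:
    """Minimal log-structured key-value store sketch (names illustrative).

    Writes are sequential appends (cheap on flash); reads are one random
    read at the offset recorded in the in-memory index.
    """

    def __init__(self):
        self.log = bytearray()  # append-only, strictly sequential storage
        self.index = {}         # key -> offset of the newest record for it

    def put(self, key: bytes, value: bytes) -> None:
        offset = len(self.log)
        # Record format: 4-byte key length, 4-byte value length, key, value.
        self.log += len(key).to_bytes(4, "big")
        self.log += len(value).to_bytes(4, "big")
        self.log += key + value
        # Older records for this key become garbage, reclaimed by compaction.
        self.index[key] = offset

    def get(self, key: bytes):
        offset = self.index.get(key)
        if offset is None:
            return None
        klen = int.from_bytes(self.log[offset:offset + 4], "big")
        vlen = int.from_bytes(self.log[offset + 4:offset + 8], "big")
        start = offset + 8 + klen  # skip header and key bytes
        return bytes(self.log[start:start + vlen])
```

Note how an overwrite never touches the old record in place; it simply appends a new one and repoints the index, which is exactly why a cleanup (compaction) pass is needed later.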
Optimized Maintenance Functions
• Split
– Used when adding a node to the cluster
– Read sequentially, then write each key to one of two new data stores depending on whether it is in range
• Merge
– Used when removing a node from the cluster
– The stores cover mutually exclusive ranges, so append one data store to the other
• Compact
– Cleans up entries in a data store
– Skip orphaned, out-of-range, and deleted entries; write the rest to a new data store
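The compact operation above can be sketched as a single sequential pass over the old store, keeping only live, in-range, current entries. The entry fields and helper names below are illustrative assumptions, not the paper's data layout:

```python
def compact(entries, in_range, latest_offset):
    """Sketch of a FAWN-DS-style compaction pass (names illustrative).

    entries: iterable of (key, value, offset, deleted) in log order.
    in_range(key): True if this node still owns the key after splits.
    latest_offset: dict mapping key -> offset of its newest record.
    """
    new_store = []
    for key, value, offset, deleted in entries:
        if deleted:
            continue  # tombstoned entries are dropped
        if not in_range(key):
            continue  # out-of-range entries left behind by a split
        if latest_offset[key] != offset:
            continue  # orphaned record, superseded by a later write
        new_store.append((key, value))  # sequential write to the new store
    return new_store
```

Because the pass reads the old log front to back and only appends to the new one, both sides of the operation stay sequential, which matches the flash-friendly access pattern the previous slide motivates.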
Optimized Sequential Reads & Writes
Front-end Consistent Hashing
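The front-end's consistent hashing can be sketched as a hash ring: each back-end node owns a position, and a key is served by the first node clockwise from the key's hash. The node names, hash choice, and single point per node below are illustrative simplifications (the real system uses virtual nodes and replication chains):

```python
import bisect
import hashlib


def _hash(s: str) -> int:
    # Any uniform hash works for the sketch; SHA-1 keeps it deterministic.
    return int(hashlib.sha1(s.encode()).hexdigest(), 16)


class Ring:
    """Minimal consistent-hash ring sketch (names illustrative)."""

    def __init__(self, nodes):
        self.points = sorted((_hash(n), n) for n in nodes)

    def owner(self, key: str) -> str:
        # First node clockwise from the key's position, wrapping around.
        h = _hash(key)
        hashes = [p for p, _ in self.points]
        i = bisect.bisect_right(hashes, h) % len(self.points)
        return self.points[i][1]

    def add(self, node: str) -> None:
        # A join inserts one point; only keys in the arc just before it move.
        bisect.insort(self.points, (_hash(node), node))
```

The payoff is visible on a join: only the keys in the arc preceding the new node's position change owner, and they all move to the new node, so the rest of the cluster is undisturbed.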
Node Join
Node Leave
• Rather than split the data stores, nodes merge them
• In practice, this means:
– Adding a new replica into each chain the departing node belonged to
– So the processing is the same as for a join event
Failure Detection
• Nodes are assumed to be fail-stop
– Front-end and back-end nodes gossip at a known rate
– On timeout, the front-end initiates a leave operation for the failed node
• The current design copes only with node failures
– Coping with network failures requires future work
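The timeout-based detection above can be sketched as a table of last-heard-from times per back-end; once a node's silence exceeds the timeout, the front-end would trigger the leave operation for it. The timeout value and all names here are assumptions for illustration, not values from the paper:

```python
class FailureDetector:
    """Sketch of fail-stop detection via heartbeat timeouts (illustrative)."""

    def __init__(self, timeout_s: float):
        self.timeout_s = timeout_s
        self.last_seen = {}  # node -> timestamp of its last heartbeat

    def heartbeat(self, node: str, now: float) -> None:
        # Called whenever a gossip/heartbeat message arrives from a back-end.
        self.last_seen[node] = now

    def failed_nodes(self, now: float):
        # Nodes silent for longer than the timeout are presumed crashed;
        # the front-end would initiate a leave operation for each.
        return [n for n, t in self.last_seen.items()
                if now - t > self.timeout_s]
```

Because nodes are assumed fail-stop, a timeout is treated as a crash; distinguishing a slow network from a dead node is exactly the future work the slide mentions.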
Single-Node Evaluation
• Performance is almost entirely dependent on the flash media
21-Node Evaluation
• In general, the back-ends prove to be well matched
21-Node Evaluation
• Remains relatively responsive throughout maintenance operations
21-Node Evaluation
• Slightly slower than production key-value systems
– Worst-case response times are on par
21-Node Evaluation
• Power draw is low and consistent across operations
– Queries per Joule are an order of magnitude higher than in traditional production distributed systems
• ~1 billion instructions per Joule
• 1/3 the frequency
• 1/10 (or less) the power
When does FAWN matter?
• It depends on the workload…
Questions? Thanks very much!