Dynamo: Amazon’s Highly Available Key-Value Store
Farley Lai
University of Iowa
February 21, 2014
Farley Lai (UIOWA) Amazon Dynamo (Big Data) February 21, 2014 1 / 14
Motivation
MapReduce processes big data in a parallel and distributed fashion.
Dynamo provides the foundation of big data, namely, the storage.
Shopping Cart
Clients tend to insert and update items frequently but review the cart to check out only at the end. Is it fun for the system to always ask you to retry in a few minutes whenever an item is inserted or updated in the shopping cart?
SOA of Amazon’s Platform
Roles
Service Provider: Amazon
Service: Dynamo, the storage service
Customer: application/service vendors
Client: applications/services
User: human and/or bots
Service Level Agreements (SLA)
SLAs are contracts signed by service providers and customers, specifying the quality of service guaranteed for a given client access distribution.
Example: a service guarantees a response within 300 ms for 99.9% of its requests at a peak client load of 500 requests per second.
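The example SLA can be checked against measured latencies. The following is a minimal sketch, not part of Dynamo itself; the function name `meets_sla` and its parameters are hypothetical, and the 99.9% target is expressed in per-mille so the comparison stays in integer arithmetic.

```python
def meets_sla(latencies_ms, limit_ms=300, per_mille=999):
    """Return True if at least per_mille/1000 of the requests
    completed within limit_ms (defaults: 99.9% within 300 ms)."""
    if not latencies_ms:
        raise ValueError("no latency samples")
    within = sum(1 for latency in latencies_ms if latency <= limit_ms)
    # Integer comparison avoids floating-point rounding at the 99.9% boundary.
    return within * 1000 >= per_mille * len(latencies_ms)


# 1 slow request out of 1000 still meets the SLA; 2 slow requests do not.
ok_sample = [100] * 999 + [900]
bad_sample = [100] * 998 + [900] * 2
print(meets_sla(ok_sample), meets_sla(bad_sample))
```

Note that the mean latency of both samples is close to 100 ms, which is why the SLA is stated on the tail of the distribution rather than on the average.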
What is Dynamo?
A distributed key-value storage service built on a ring topology with
high availability for writes
eventual consistency
Requirements and Assumptions
Requirements
Simple read/write to data items identified by unique keys
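ACID (atomicity, consistency, isolation and durability): Dynamo accepts weaker consistency in exchange for high availability and provides no isolation guarantees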
SLA: latency constraints on the 99.9th percentile of the distribution
Assumptions
Trusted environment and machines without security concerns
Problems, Techniques and Advantages
Problems | Techniques | Advantages
Partitioning | Consistent hashing | Incremental scalability
High write availability | Vector clocks with conflict resolution | Version size is decoupled from update rates
Temporary failures | Sloppy quorum, hinted handoff | High availability and durability guarantee despite some unavailable replicas
Permanent failures | Merkle trees | Fast replica synchronization
Membership | Gossip protocol | Decentralized registry for storing membership and liveness info
Partitioning
Consistent hashing
1 key space
2 token assignment
3 replication
4 load distribution
5 node availability
6 node capacity
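The points above can be illustrated with a small consistent-hashing ring. This is a minimal sketch under stated assumptions, not Dynamo's implementation: the class `HashRing` and its methods are hypothetical names, token positions are derived by hashing `node#i` labels (an assumption made for repeatability), and MD5 is used only to spread keys over a 128-bit key space. Giving a node more tokens models higher capacity.

```python
import bisect
import hashlib


def ring_position(label):
    """Map a string to a position in a 128-bit circular key space."""
    return int.from_bytes(hashlib.md5(label.encode("utf-8")).digest(), "big")


class HashRing:
    def __init__(self):
        self._tokens = []  # sorted list of (position, node) pairs

    def add_node(self, node, tokens=8):
        """Place `tokens` virtual nodes on the ring; more tokens = more load."""
        for i in range(tokens):
            bisect.insort(self._tokens, (ring_position(f"{node}#{i}"), node))

    def remove_node(self, node):
        """Drop all tokens of a node, e.g. when it becomes unavailable."""
        self._tokens = [t for t in self._tokens if t[1] != node]

    def preference_list(self, key, n=3):
        """Walk clockwise from the key's position and collect the first
        n distinct physical nodes; these hold the key's replicas."""
        if not self._tokens:
            return []
        start = bisect.bisect_right(self._tokens, (ring_position(key), ""))
        nodes = []
        for offset in range(len(self._tokens)):
            _, node = self._tokens[(start + offset) % len(self._tokens)]
            if node not in nodes:
                nodes.append(node)
                if len(nodes) == n:
                    break
        return nodes


ring = HashRing()
for name in ["A", "B", "C"]:
    ring.add_node(name)
ring.add_node("D", tokens=16)  # assumed to have twice the capacity
print(ring.preference_list("cart:42"))
```

Removing a node only changes the preference lists of keys that were stored on it, which is the incremental scalability property named in the table of techniques.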
Data Versioning
Operations
1 read()⇒get()
2 write()⇒put()
3 conflict resolution
4 vector clock
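Vector clocks let a reader decide whether one version supersedes another or whether the two are concurrent and need conflict resolution. The sketch below is illustrative only: clocks are modeled as plain dictionaries from node name to counter, and the function names are hypothetical. The node names Sx, Sy and Sz follow the version-evolution example in the Dynamo paper.

```python
def increment(clock, node):
    """Return a copy of `clock` with `node`'s counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def descends(a, b):
    """True if clock `a` has seen every update recorded in clock `b`."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a, b):
    """Classify two versions by their vector clocks."""
    a_ge, b_ge = descends(a, b), descends(b, a)
    if a_ge and b_ge:
        return "equal"
    if a_ge:
        return "a supersedes b"
    if b_ge:
        return "b supersedes a"
    return "concurrent"  # siblings: the client must reconcile them


def merge(a, b):
    """Element-wise maximum, used after the client reconciles siblings."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}


d2 = increment(increment({}, "Sx"), "Sx")  # two writes handled by Sx
d3 = increment(d2, "Sy")                   # a later write handled by Sy
d4 = increment(d2, "Sz")                   # another write handled by Sz
print(compare(d3, d2))                     # a supersedes b
print(compare(d3, d4))                     # concurrent
d5 = increment(merge(d3, d4), "Sx")        # reconciled write handled by Sx
print(sorted(d5.items()))                  # [('Sx', 3), ('Sy', 1), ('Sz', 1)]
```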
Sloppy Quorum
1 R + W > N, with R = 2, W = 2, N = 3
2 latency
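The quorum condition and the "sloppy" part can be sketched as follows. This is an illustration under assumptions, not Dynamo's code: the function names are hypothetical, `ring_order` stands for the nodes met when walking the ring clockwise from the key, and a hint is modeled as a mapping from the substitute node to the node the write was intended for.

```python
from itertools import combinations


def always_intersect(n, r, w):
    """Brute-force check that every read set of size r shares at least
    one replica with every write set of size w among n replicas."""
    replicas = range(n)
    return all(
        set(read_set) & set(write_set)
        for read_set in combinations(replicas, r)
        for write_set in combinations(replicas, w)
    )


def sloppy_write_targets(ring_order, healthy, n):
    """Choose n healthy nodes for a write. Unreachable nodes among the
    first n are replaced by the next healthy nodes on the ring, and each
    substitute keeps a hint naming the node it is standing in for."""
    intended = ring_order[:n]
    targets = [node for node in intended if node in healthy]
    down = [node for node in intended if node not in healthy]
    hints = {}
    for substitute in ring_order[n:]:
        if not down:
            break
        if substitute in healthy:
            hints[substitute] = down.pop(0)  # hand back when the owner recovers
            targets.append(substitute)
    return targets, hints


print(always_intersect(3, 2, 2))  # True:  R + W > N
print(always_intersect(3, 1, 2))  # False: R + W = N
# Node A is down, so D accepts the write with a hint for A.
print(sloppy_write_targets(["A", "B", "C", "D", "E"], {"B", "C", "D", "E"}, 3))
# (['B', 'C', 'D'], {'D': 'A'})
```

Smaller R and W lower latency, because a request waits only for the slowest of the R (or W) replicas it needs, at the cost of a weaker overlap between reads and writes.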
Replica Synchronization
Figure: Merkle hash tree (two example figures)
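A Merkle tree lets two replicas compare a whole key range by exchanging only hashes: equal roots mean the ranges are in sync, and differing roots are followed downward to the leaves that differ. The sketch below is a simplified illustration with hypothetical function names; it assumes both replicas hold the same number of leaves and uses SHA-256, which is a choice made here, not one taken from the slides.

```python
import hashlib


def _hash(data):
    return hashlib.sha256(data).digest()


def build_tree(leaves):
    """Return the tree as a list of levels: leaf hashes first, root last.
    An odd-sized level pairs its last hash with itself."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [_hash(item) for item in leaves]
    levels = [level]
    while len(level) > 1:
        padded = level + [level[-1]] if len(level) % 2 else level
        level = [_hash(padded[i] + padded[i + 1]) for i in range(0, len(padded), 2)]
        levels.append(level)
    return levels


def differing_leaves(a, b):
    """Indices of leaves that differ, visiting only subtrees whose
    hashes disagree. Both trees must cover the same number of leaves."""
    if len(a[0]) != len(b[0]):
        raise ValueError("trees must have the same number of leaves")
    top = len(a) - 1
    suspects = [0] if a[top][0] != b[top][0] else []
    for depth in range(top - 1, -1, -1):
        next_suspects = []
        for parent in suspects:
            for child in (2 * parent, 2 * parent + 1):
                if child < len(a[depth]) and a[depth][child] != b[depth][child]:
                    next_suspects.append(child)
        suspects = next_suspects
    return suspects


replica_1 = [b"k1=v1", b"k2=v2", b"k3=v3", b"k4=v4"]
replica_2 = [b"k1=v1", b"k2=v2", b"k3=STALE", b"k4=v4"]
print(differing_leaves(build_tree(replica_1), build_tree(replica_2)))  # [2]
```

Only the keys under the reported leaves need to be transferred, which is what makes synchronization after a permanent failure fast.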
Evaluation: latency
Evaluation: load balance
Evaluation: write buffer