Epidemic Techniques Chiu Wah So (Kelvin). Database Replication Why do we replicate database? – Low...

Date posted: 21-Dec-2015

Epidemic Techniques

Chiu Wah So (Kelvin)

Database Replication

Why do we replicate a database?

– Low latency
– High availability

Achieving strong (sequential) consistency on a replicated database is not scalable. Two common approaches:

– One primary database
– Quorum system (contact more than half of the replicas)

Database Replication

To scale and have high availability, we need a weaker consistency model.

– Eventual consistency: if all updating stops, then eventually all replicas will converge to identical values.

The two papers discuss how to use epidemic techniques to achieve eventual consistency at scale.

Epidemic Techniques for replicated database

Epidemic Algorithms for Replicated Database Maintenance

– Looks at different epidemic algorithms to reduce the bandwidth consumed in maintaining a replicated database

Astrolabe

– Scalable and robust information management system

Motivation for the first paper

The Clearinghouse service maintains translations from names to machine addresses (like DNS).

Problem: with direct mail and anti-entropy, keeping highly replicated servers consistent generates too much traffic. Some key links are overloaded.

The paper looks at two techniques to reduce bandwidth: rumor spreading and spatial distributions.

Direct mail

Direct mail: each server sends every update to all other servers.

Advantages:

– Easy to implement
– Good enough for a small, static set of servers

Disadvantages:

– Does not scale (O(n) messages for each update)
– Updates may get lost

Anti-entropy

Each server periodically picks a random server and resolves differences with it.

There are 3 ways to resolve differences: push, pull, and push-pull.
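The three ways of resolving differences can be sketched in Python as follows. This is a minimal illustration, assuming each database is a dict mapping a key to a (timestamp, value) pair and that the entry with the newer timestamp wins; the names `resolve` and `anti_entropy_round` are hypothetical and not taken from the paper.

```python
import random


def resolve(a, b, mode):
    """Resolve differences between databases a and b in place.

    Assumption: each database maps key -> (timestamp, value), newer wins.
    mode is "push" (a updates b), "pull" (b updates a), or "push-pull".
    """
    if mode in ("push", "push-pull"):
        for key, entry in a.items():
            if key not in b or b[key][0] < entry[0]:
                b[key] = entry
    if mode in ("pull", "push-pull"):
        for key, entry in b.items():
            if key not in a or a[key][0] < entry[0]:
                a[key] = entry


def anti_entropy_round(replicas, mode, rng):
    """Every replica picks one other replica uniformly at random."""
    for i, db in enumerate(replicas):
        j = rng.randrange(len(replicas) - 1)
        if j >= i:  # skip over the replica itself
            j += 1
        resolve(db, replicas[j], mode)


if __name__ == "__main__":
    replicas = [{"name": (1, "10.0.0.1")}, {}, {}]
    anti_entropy_round(replicas, "push-pull", random.Random(0))
    print(replicas)
```

With push, only the initiator's newer entries travel; with pull, only the partner's; push-pull does both in one exchange.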

Anti-entropy Example

Push

Pull

Push-Pull

Anti-entropy Average time

On average, anti-entropy converges in O(log(n)) steps.

Pull, push-pull

Push
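For reference, the analysis in the first paper (Demers et al.) can be summarized by the recurrences below, where p_i is the probability that a site is still susceptible after the i-th anti-entropy cycle and n is the number of sites. The notation here is an assumption, since the slide's own formulas did not survive in this transcript.

```latex
\begin{align*}
\text{Pull:}\quad & p_{i+1} = p_i^{2} \\
\text{Push:}\quad & p_{i+1} = p_i \left(1 - \frac{1}{n}\right)^{n(1 - p_i)}
                    \approx p_i \, e^{-1} \quad \text{for small } p_i
\end{align*}
```

Once most sites already have the update (small p_i), pull squares the remaining probability each cycle, while push only shrinks it by a constant factor, so pull converges much faster in the final phase.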

Anti-entropy (2)

Sending the whole database across the network to compare is very expensive.

Some techniques for reducing the bandwidth used by comparison:

– Compute checksums
– Exchange lists of recent updates, then apply the updates and compute checksums
– Exchange updates in reverse chronological order until the checksums agree

Still too much bandwidth...
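The "recent updates plus checksum" optimization might look like the sketch below. It assumes a database is a dict of key -> (timestamp, value), uses SHA-256 as the checksum, and introduces a hypothetical `window` parameter for how far back "recent" reaches; none of these details come from the slides.

```python
import hashlib


def checksum(db):
    """Checksum over the whole database, independent of insertion order."""
    h = hashlib.sha256()
    for key in sorted(db):
        ts, value = db[key]
        h.update(repr((key, ts, value)).encode("utf-8"))
    return h.hexdigest()


def recent_updates(db, now, window):
    """Entries whose timestamp is at most `window` time units old."""
    return [(k, ts, v) for k, (ts, v) in db.items() if now - ts <= window]


def apply_updates(db, updates):
    """Apply updates, keeping the newer timestamp for each key."""
    for key, ts, value in updates:
        if key not in db or db[key][0] < ts:
            db[key] = (ts, value)


def cheap_sync(a, b, now, window):
    """Return True if checksums agree after exchanging only recent updates.

    False means the replicas still differ and a fuller comparison is needed.
    """
    if checksum(a) == checksum(b):
        return True
    from_a = recent_updates(a, now, window)
    from_b = recent_updates(b, now, window)
    apply_updates(b, from_a)
    apply_updates(a, from_b)
    return checksum(a) == checksum(b)


if __name__ == "__main__":
    a = {"x": (10, "a")}
    b = {"x": (10, "a"), "y": (95, "b")}
    print(cheap_sync(a, b, now=100, window=10))
```

The cheap path only works when every difference is newer than the window; older differences fall through to the expensive comparison.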

Rumor spreading

Main idea: send out updates randomly instead of comparing the whole database.

Three states: susceptible, infective, and removed.

Initially all servers are susceptible.

Once a server has a rumor (infective), it picks a random server and sends it the rumor.

With probability 1/k, the server loses interest in spreading the rumor (removed).
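A small simulation can make the three states concrete. The sketch below assumes one particular variant (push, with feedback and a coin): an infective server loses interest with probability 1/k only after contacting a server that already knows the rumor. The function name `spread_rumor` and the synchronous-round structure are illustrative assumptions.

```python
import random


def spread_rumor(n, k, seed=0):
    """Simulate one rumor among n >= 2 servers.

    Returns (residue, messages): the fraction of servers still susceptible
    when no infective server remains, and the number of rumor messages sent.
    """
    rng = random.Random(seed)
    state = ["susceptible"] * n
    state[0] = "infective"  # the update is injected at server 0
    messages = 0
    while True:
        infective = [i for i, s in enumerate(state) if s == "infective"]
        if not infective:
            break
        for site in infective:
            peer = rng.randrange(n - 1)
            if peer >= site:  # never pick yourself
                peer += 1
            messages += 1
            if state[peer] == "susceptible":
                state[peer] = "infective"
            elif rng.random() < 1.0 / k:
                # Assumed feedback rule: the peer already knew the rumor,
                # so lose interest with probability 1/k.
                state[site] = "removed"
    return state.count("susceptible") / n, messages


if __name__ == "__main__":
    for k in (1, 2, 4):
        residue, messages = spread_rumor(1000, k, seed=1)
        print(k, residue, messages / 1000)
```

Larger k lowers the residue at the cost of more messages per server, which is the trade-off the later slides discuss.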

Rumor spreading (2)

But some servers may never receive the rumor.

– Probability of remaining susceptible after the epidemic finishes:

Run anti-entropy infrequently to make sure every server gets the update.
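For the feedback-and-coin variant analyzed in the first paper, the residue s satisfies s = e^(-(k+1)(1-s)). The sketch below solves that equation numerically by fixed-point iteration; the function name and the iteration count are assumptions made for illustration.

```python
import math


def residue(k, iterations=200):
    """Solve s = exp(-(k + 1) * (1 - s)) for the non-trivial root.

    Starting from s = 0 avoids the trivial solution s = 1.
    """
    s = 0.0
    for _ in range(iterations):
        s = math.exp(-(k + 1) * (1.0 - s))
    return s


if __name__ == "__main__":
    for k in (1, 2, 3, 4):
        print(k, round(residue(k), 4))
```

For k = 1 this gives a residue of about 20%, and for k = 2 about 6%, which is why a backup such as infrequent anti-entropy is still needed.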

Three goals in rumor spreading

Low residue: the probability of remaining susceptible when the epidemic finishes.

Low traffic: the total traffic sent per site.

Low delay: the average time, and the time for the last site, between the injection of an update and its arrival.

Variations in rumor spreading

Many variations in rumor spreading:

– Blind with Coin vs Feedback with Counter
– Push vs Pull
– Increase the smaller counter of the two
– Connection limit
– Hunting

Feedback Counter vs Blind Prob.

Feedback and Counter

Blind and Probability

Deletion and Death Certificates

Simple solution: replace each deleted item with a death certificate and store it for a fixed threshold of time.

2nd solution: dormant death certificates. Use two time thresholds, with some servers keeping the certificate longer. Each certificate carries 2 different timestamps: the original timestamp and the reactivation timestamp.
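One way to picture the two thresholds and two timestamps is the sketch below. It is only an illustration under stated assumptions: the threshold names `tau1` and `tau2`, the idea of a "retention site" flag, and measuring both thresholds from the activation timestamp are choices made here, not details given on the slide.

```python
from dataclasses import dataclass


@dataclass
class DeathCertificate:
    key: str
    original_ts: float    # when the item was deleted
    activation_ts: float  # reset when a dormant certificate is reactivated


def is_obsolete(item_ts, cert):
    """An item older than the deletion is cancelled by the certificate."""
    return item_ts < cert.original_ts


def certificate_status(cert, now, tau1, tau2, is_retention_site):
    """Assumed policy: all sites keep the certificate for tau1; only
    retention sites keep a dormant copy for a further tau2."""
    age = now - cert.activation_ts
    if age <= tau1:
        return "active"
    if is_retention_site and age <= tau1 + tau2:
        return "dormant"
    return "discarded"


def reactivate(cert, now):
    """Wake a dormant certificate: keep the original timestamp so newer
    legitimate updates are not cancelled, but restart the clock."""
    return DeathCertificate(cert.key, cert.original_ts, now)


if __name__ == "__main__":
    cert = DeathCertificate("printer-7", 100.0, 100.0)
    print(certificate_status(cert, 150.0, 10.0, 100.0, True))
```

Keeping the original timestamp separate is what lets a reactivated certificate cancel only the obsolete copies of the item and not a later re-creation of it.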

Motivation of Spatial Distributions

The network is not uniform. Certain key links in the network are overloaded.

– Transatlantic links carry about 80 conversations, while the average is about 6 conversations per link.

Therefore, we should favor nearby neighbors.

Spatial Distributions

Each server s sorts the list of sites by distance from s.

Select anti-entropy exchange partners from the sorted list according to a function f(i), i = index on the sorted list.

We can use f(i) = i^(-a), where a is the parameter for tuning spatial distribution.
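Partner selection with f(i) = i^(-a) can be sketched as a weighted random choice. The helper names below are hypothetical, and the list is assumed to be already sorted nearest-first with index i starting at 1.

```python
import random


def partner_weights(num_sites, a):
    """Unnormalized weights f(i) = i ** (-a) for i = 1 .. num_sites."""
    return [i ** (-a) for i in range(1, num_sites + 1)]


def choose_partner(sites_by_distance, a, rng):
    """Pick an exchange partner; nearer sites are favored as a grows."""
    weights = partner_weights(len(sites_by_distance), a)
    return rng.choices(sites_by_distance, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    sites = ["near", "mid", "far", "farthest"]
    picks = [choose_partner(sites, 2, rng) for _ in range(1000)]
    print({s: picks.count(s) for s in sites})
```

Setting a = 0 recovers the uniform choice, while larger values of a concentrate traffic on nearby sites and relieve long-distance links.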

Spatial Distribution

Next Paper: Astrolabe

The first paper talks about how to use rumor spreading and spatial distribution to reduce bandwidth.

But storage grows as O(n), and the total bandwidth taken up by gossip grows as O(n^2).

We need a more scalable solution.

Astrolabe

Scalable and Robust information management system.

Monitors the dynamically changing state of a collection of distributed resources.

Reports summaries of this information to users.

Four design goals

Scalability: scales through its zone hierarchy. Information is summarized before it is exchanged.

Flexibility: easy to install new aggregation functions in the form of SQL aggregation queries.

Robustness: randomized peer-to-peer approach to exchange information.

Security: use signed certificates.

Structure of Astrolabe

The structure of Astrolabe’s zones can be viewed as a tree. The leaves of this tree are hosts.

Each host runs an Astrolabe agent.

Astrolabe Detail

Each agent is a virtual database.

Each agent has a path name (for example, /USA/Cornell/pc3).

Each agent contains information, called a MIB, for each of its ancestor zones (in this example, /, /USA, and /USA/Cornell).

Each ancestor MIB is generated using aggregation for scalability, instead of having O(n) entries.
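The set of ancestor zones an agent holds MIBs for follows directly from its path name. A minimal sketch, with `ancestor_zones` as a hypothetical helper name:

```python
def ancestor_zones(agent_path):
    """List the ancestor zones of an agent, from the root downwards."""
    parts = [p for p in agent_path.split("/") if p]
    zones = ["/"]
    # The last path component is the agent's own host, not an ancestor.
    for depth in range(1, len(parts)):
        zones.append("/" + "/".join(parts[:depth]))
    return zones


if __name__ == "__main__":
    print(ancestor_zones("/USA/Cornell/pc3"))
```

An agent therefore stores one MIB table per level of the tree, so its storage grows with the depth of the hierarchy rather than with the number of hosts.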

Astrolabe Detail (2)

Each zone can be viewed as a relational table of the attributes of its child zones.

How do we gather or generate the information in the zone relational table?

Two ways: If the agent is in the zone, use aggregation to construct the MIB for the zone. Otherwise, gossip for the information.

Therefore, MIB for internal zones has to be small in order to scale.

Aggregation

Aggregation Function Certificates (AFCs) contain information on how to collect and aggregate attributes of child zone MIBs into entries for internal zone MIBs.

– Programmed in an SQL-like language
– Propagate in two ways: copying to the parent (propagating like other normal attributes), and looking for new AFCs in ancestor zones

Aggregation (2)

Here are the SQL aggregation functions that are provided by Astrolabe.

Gossip

Each zone has a small set of addresses for representative agents.

Representative agents are computed using an aggregation function, such as using load and longevity.

An agent gossips on behalf of those zones for which it is a representative.

Gossip (2)

Periodically, the agent picks one of the child zones, and talks to one of the contact agents. (anti-entropy)

Then, it sends all the child zones at that level, and does the same thing for the higher levels in the tree up until the root level.

Then the two agents can compare which entries are newer and keep them.
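The "keep the newer entry" step can be sketched as a per-row merge on the Time column. The rows below are taken from the gossip example in these slides; the dict layout and the name `merge_rows` are assumptions. Note that this sketch merges every row, whereas the slide example only shows the swift and cardinal rows being exchanged.

```python
def merge_rows(mine, theirs):
    """Return a table that keeps, for each name, the row with the newer time."""
    merged = dict(mine)
    for name, row in theirs.items():
        if name not in merged or row["time"] > merged[name]["time"]:
            merged[name] = row
    return merged


table_a = {
    "swift": {"time": 2003, "load": 0.67},
    "falcon": {"time": 1976, "load": 2.7},
    "cardinal": {"time": 2201, "load": 3.5},
}
table_b = {
    "swift": {"time": 2011, "load": 2.0},
    "falcon": {"time": 1971, "load": 1.5},
    "cardinal": {"time": 2004, "load": 4.5},
}

if __name__ == "__main__":
    for name, row in merge_rows(table_a, table_b).items():
        print(name, row["time"], row["load"])
```

Because the merge only depends on timestamps, both agents end up with the same table regardless of which one initiated the exchange.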

Example of gossip (taken from Ken's slides)

Name      Time  Load  Weblogic?  SMTP?  Word Version
swift     2003  .67   0          1      6.2
falcon    1976  2.7   1          0      4.1
cardinal  2201  3.5   1          1      6.0

Name      Time  Load  Weblogic?  SMTP?  Word Version
swift     2011  2.0   0          1      6.2
falcon    1971  1.5   1          0      4.1
cardinal  2004  4.5   1          0      6.0

swift.cs.cornell.edu

cardinal.cs.cornell.edu

Example of gossip (2)

Name      Time  Load  Weblogic?  SMTP?  Word Version
swift     2003  .67   0          1      6.2
falcon    1976  2.7   1          0      4.1
cardinal  2201  3.5   1          1      6.0

Name      Time  Load  Weblogic?  SMTP?  Word Version
swift     2011  2.0   0          1      6.2
falcon    1971  1.5   1          0      4.1
cardinal  2004  4.5   1          0      6.0

swift.cs.cornell.edu

cardinal.cs.cornell.edu

Rows exchanged:

swift     2011  2.0
cardinal  2201  3.5

Example of gossip (3)

Name      Time  Load  Weblogic?  SMTP?  Word Version
swift     2011  2.0   0          1      6.2
falcon    1976  2.7   1          0      4.1
cardinal  2201  3.5   1          1      6.0

Name      Time  Load  Weblogic?  SMTP?  Word Version
swift     2011  2.0   0          1      6.2
falcon    1971  1.5   1          0      4.1
cardinal  2201  3.5   1          0      6.0

swift.cs.cornell.edu

cardinal.cs.cornell.edu

Name      Load  Weblogic?  SMTP?  Word Version
swift     2.0   0          1      6.2
falcon    1.5   1          0      4.1
cardinal  4.5   1          0      6.0

Name      Load  Weblogic?  SMTP?  Word Version
gazelle   1.7   0          0      4.5
zebra     3.2   0          1      6.2
gnu       .5    1          0      6.2

Name   Avg Load  WL contact   SMTP contact
SF     2.6       123.45.61.3  123.45.61.17
NJ     1.8       127.16.77.6  127.16.77.11
Paris  3.1       14.66.71.8   14.66.71.12

San Francisco

New Jersey

SQL query “summarizes” data

Dynamically changing query output is visible system-wide
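The summarizing step for the Avg Load column can be reproduced with a short sketch. It assumes the first host table belongs to the San Francisco zone and the second to New Jersey, which is consistent with the averages shown; the function names are hypothetical, and the contact columns are not modeled because the host tables do not list addresses.

```python
def average_load(rows):
    """Aggregate a child table into one value, like AVG(load) in SQL."""
    loads = [load for _name, load in rows]
    return sum(loads) / len(loads)


def build_parent_table(zones):
    """Produce one summary row per child zone."""
    return {zone: {"avg_load": average_load(rows)} for zone, rows in zones.items()}


# Host rows (name, load) copied from the two leaf tables in the slide.
zones = {
    "SF": [("swift", 2.0), ("falcon", 1.5), ("cardinal", 4.5)],
    "NJ": [("gazelle", 1.7), ("zebra", 3.2), ("gnu", 0.5)],
}

if __name__ == "__main__":
    for zone, row in build_parent_table(zones).items():
        print(zone, round(row["avg_load"], 2))
```

The New Jersey average comes out at 1.8, matching the slide. The San Francisco hosts average about 2.67, which the slide shows as 2.6.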

Membership

When an agent has not seen an update for a zone from a particular representative for some time Tfail, it removes the corresponding MIB.
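The Tfail rule amounts to a timeout on the last update seen from each representative. A minimal sketch, assuming the agent keeps a dict from representative to the time of its last update (the names `expire_mibs` and `last_seen` are illustrative):

```python
def expire_mibs(last_seen, now, t_fail):
    """Keep only the MIBs whose representative was heard from within t_fail."""
    return {rep: seen for rep, seen in last_seen.items() if now - seen <= t_fail}


if __name__ == "__main__":
    last_seen = {"/USA/Cornell/pc3": 95.0, "/USA/Cornell/pc7": 40.0}
    print(expire_mibs(last_seen, now=100.0, t_fail=30.0))
```

No explicit "leave" message is needed: a crashed or disconnected agent simply stops refreshing its entries and ages out.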

Ways to connect different pieces of the tree and add new machines:

– IP multicast
– Broadcast
– Relatives

Administrators are responsible for configuring the system by assigning zone names.

Communication

Over HTTP and UDP (messages must be fragmented into more than one UDP packet).

If there is a firewall:

– Use an ALG in the core Internet, or an Astrolabe agent in the core Internet.

Security

Each zone is a management unit.

Children have a way to override policy enforced by parents.

Each zone has 2 pairs of keys: CA keys and zone keys.

– Zone certificate
– MIB certificate
– Aggregation function certificate
– Client certificate

Related work

Directory Services (Clearinghouse, Bayou, Globe)

Network Monitoring

Event Notification

Sensor Networks

Peer-to-peer routing

Measurement on expected # rounds

Measurement on expected # rounds (2)

Measurement on Latency

Real Simulation

Conclusion ??