Synchronous Multi-Master Clusters in WAN

Building Synchronous MySQL clusters in Cloud and WAN. Alexey Yurchenko, Codership Oy.

Description

Presented by Alexey Yurchenko, co-founder and developer of Galera at Codership, at the MariaDB Roadshow in London, 18 September 2014.

Transcript of Synchronous Multi-Master Clusters in WAN

Page 1: Synchronous Multi-Master Clusters in WAN

Building Synchronous MySQL clusters in Cloud and WAN

Alexey Yurchenko, Codership Oy

Page 2: Synchronous Multi-Master Clusters in WAN

www.codership.com

A Very Dirrrty Word

Sssssssssss...

Page 3: Synchronous Multi-Master Clusters in WAN

A Very Dirrrty Word

Synchronous.

Page 4: Synchronous Multi-Master Clusters in WAN

A Very Dirrrty Word

Synchronous. What is it good for???

Page 5: Synchronous Multi-Master Clusters in WAN

Data Safety

[Diagram: Client sends COMMIT to Master; Master replies OK to the Client right away and replicates to the Slave afterwards, where the commit happens later.]

Asynchronous Replication: potential data loss.

Page 6: Synchronous Multi-Master Clusters in WAN

Data Safety

[Diagram: Client sends COMMIT to Master; Master replicates to the Slave and waits for its ACK before replying OK to the Client.]

Synchronous Replication: additional latency.

Page 7: Synchronous Multi-Master Clusters in WAN

Data Safety

Disaster Recovery:

[Diagram: DC1 replicating to DC2.]

#1

Page 8: Synchronous Multi-Master Clusters in WAN

Multi-Master

[Diagram: Client1 and Client2 issue conflicting COMMITs to Master1 and Master2. Both masters run conflict detection and conflict resolution on the replicated writesets; one transaction gets OK, the other is rolled back with a DEADLOCK error.]

Page 9: Synchronous Multi-Master Clusters in WAN

Access Latency Elimination

Page 10: Synchronous Multi-Master Clusters in WAN

Access Latency Elimination

#2

Page 11: Synchronous Multi-Master Clusters in WAN

Benchmark Setup (Amazon EC2):

[Diagram: us-east and eu-west regions, ~6000 km apart, ~90 ms RTT.]

Page 12: Synchronous Multi-Master Clusters in WAN

Access Latency Elimination

client location   us-east server   US-EU cluster   change
us-east           28.03 ms         119.80 ms       ~4.3x
eu-west           1953.89 ms       122.92 ms       ~0.06x

Page 13: Synchronous Multi-Master Clusters in WAN

What Happened?

[Diagram: SQL traffic (reads, writes, etc.) stays local to each node; only replication traffic (commits only) crosses the ~6000 km, ~90 ms RTT link.]

Page 14: Synchronous Multi-Master Clusters in WAN

To Sync or Semi-sync?

Page 15: Synchronous Multi-Master Clusters in WAN

Look, Ma! No 2-phase commit!

[Diagram: Client sends COMMIT to Master; Master replicates, receives the ACK, replies OK, and commits. But the Slave didn't commit!]

Page 16: Synchronous Multi-Master Clusters in WAN

To Sync or Semi-sync?

Synchronous (master rolls back and stops):
● Data redundancy preserved (sort of: the slave is dead)
● Availability compromised (!!!)

Semi-synchronous (master continues):
● Data redundancy compromised
● Availability preserved

[Diagram: the replication link between Master and Slave fails.]

Page 17: Synchronous Multi-Master Clusters in WAN

To Sync or Semi-sync?

For all practical (production) purposes, replication is supposed to protect against master loss, not slave loss (slave loss is mitigated by adding more slaves), in order to increase the availability of the service.

Ironically, fully synchronous replication is not only impractically slow, it is detrimental to the availability goal.

Page 18: Synchronous Multi-Master Clusters in WAN

Synchronous Replication in WAN

The Latency And How To Deal With It.

Page 19: Synchronous Multi-Master Clusters in WAN

The Latency And How to Deal With It

Latency: 1 RTT – 1.5 RTT (100 – 500 ms)

(<200 ms should be practically possible)

Trx rate <= 1/Latency

(10 – 2 transactions per second? Blast!)

Page 20: Synchronous Multi-Master Clusters in WAN

The Latency And How to Deal With It

The usual ways to deal with any latency:

1) Buffering:

AUTOCOMMIT UPDATEs → multi-statement transactions

2) Parallelization:

1 client session → 10 client sessions
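As a sketch of the buffering point, three autocommitted UPDATEs pay three synchronous commits, while the same work in one multi-statement transaction pays the WAN round trip only once (the `accounts` table and values here are hypothetical):

```sql
-- Before: each autocommitted UPDATE is its own transaction,
-- so each one waits a full WAN round trip at commit.
-- After: the same statements grouped into one transaction wait only once.
START TRANSACTION;
UPDATE accounts SET balance = balance - 10 WHERE id = 1;
UPDATE accounts SET balance = balance + 10 WHERE id = 2;
COMMIT;  -- the only step that waits for cross-DC replication
```

Parallelization works the same way from the other direction: ten concurrent sessions each pay the same per-commit latency, but their round trips overlap, so aggregate throughput rises roughly tenfold.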

Page 21: Synchronous Multi-Master Clusters in WAN

Synchronous Replication in WAN

Galera Cluster for MySQL variants

Page 22: Synchronous Multi-Master Clusters in WAN

Galera Cluster for MySQL variants

[Diagram: mysqld is MySQL with the wsrep patch, exposing the wsrep API; Galera is a dynamic library that implements that API and handles synchronous communication with the cluster (other nodes).]

Page 23: Synchronous Multi-Master Clusters in WAN

Galera Cluster for MySQL variants

Page 24: Synchronous Multi-Master Clusters in WAN

Galera Cluster for MySQL variants

[Diagram: the three variants, each running on the Galera library: MySQL-wsrep, MariaDB Galera Cluster, and Percona XtraDB Cluster.]

Page 25: Synchronous Multi-Master Clusters in WAN

Galera Cluster and CAP Theorem

[Diagram: CAP triangle of Consistency, Availability, and Partition Tolerance. Consistency is fixed; the balance between Availability and Partition Tolerance is tuned with timeouts.]

Page 26: Synchronous Multi-Master Clusters in WAN

Synchronous Replication in WAN

Goals:
● Disaster Recovery
● Performance
● Service Availability

DO's and DONT's

Page 27: Synchronous Multi-Master Clusters in WAN

Synchronous Replication in WAN: DO's

Invest in a good WAN link.
(You invest in nodes. The link is as much a part of the cluster as the nodes are.)

Page 28: Synchronous Multi-Master Clusters in WAN

Synchronous Replication in WAN: DO's

Categorize your data:

1) Rare, small writes, frequent reads, global data – good.

2) Heavy writes, few reads, local data – bad.

Page 29: Synchronous Multi-Master Clusters in WAN

Synchronous Replication in WAN: DO's

Categorize your data (OpenStack):

1) Keystone identity data, Glance image metadata:

mostly reads, small writes, data of global interest.

2) Ceilometer monitoring data:

almost write-only, no need to share globally – store in MongoDB.

Jay Pipes, “Tales from the Field: Backend Data Storage in OpenStack Clouds”

Page 30: Synchronous Multi-Master Clusters in WAN

Synchronous Replication in WAN: DO's

Configure timeouts:
● All Galera timeouts and periods should be no less than WAN round trip times.
● Defaults should be suitable for networks with up to 500 ms RTTs.
● The higher the timeouts, the more partition-tolerant and the less available the cluster is (CAP theorem).
● Timeouts relation: RTT <= evs.suspect_timeout <= evs.inactive_timeout <= evs.install_timeout
● evs.suspect_timeout is the timeout to detect a single-node partition/failure.
● Further info: http://galeracluster.com/documentation-webpages/configurationtips.html#wan-replication
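As an illustrative sketch (the values below are assumptions for a link with RTTs up to a few hundred ms, not recommendations), the evs.* timeouts are set through wsrep_provider_options using Galera's ISO-8601 duration syntax:

```ini
# my.cnf fragment; keeps the relation RTT <= suspect <= inactive <= install
[mysqld]
wsrep_provider_options = "evs.keepalive_period=PT3S; evs.suspect_timeout=PT30S; evs.inactive_timeout=PT1M; evs.install_timeout=PT1M"
```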

Page 31: Synchronous Multi-Master Clusters in WAN

Synchronous Replication in WAN: DO's

Configure cluster segments:

[Diagram: DC1 nodes in segment 1, DC2 nodes in segment 2, DC3 nodes in segment 3; all nodes in one datacenter share a segment.]
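Segments are assigned per node with gmcast.segment (default 0); nodes in the same datacenter share a segment value, so replication traffic crosses the WAN link only once per segment. A sketch for a DC2 node:

```ini
# my.cnf fragment on every node in DC2
# (DC1 nodes would set gmcast.segment=1, DC3 nodes gmcast.segment=3)
[mysqld]
wsrep_provider_options = "gmcast.segment=2"
```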

Page 32: Synchronous Multi-Master Clusters in WAN

Synchronous Replication in WAN: DO's

Choose an odd number of nodes and an odd number of datacenters:
● Most popular choice: 3x3
● Also observed in the field: 5x3 and 3x5

Page 33: Synchronous Multi-Master Clusters in WAN

Synchronous Replication in WAN: DO's

3 is better than 2!

[Diagram: three datacenters, DC1, DC2, and DC3.]

Page 34: Synchronous Multi-Master Clusters in WAN

Synchronous Replication in WAN: DONT's

1) Hot Spots

Page 35: Synchronous Multi-Master Clusters in WAN

Synchronous Replication in WAN: DONT's

[Diagram: a hot-spot row updated from several nodes; conflicts limit its update rate to about 1/RTT.]

Page 36: Synchronous Multi-Master Clusters in WAN

Synchronous Replication in WAN: DONT's

1) Hot Spots

2) Poor Links

Page 37: Synchronous Multi-Master Clusters in WAN

Synchronous Replication in WAN: DONT's

Full packet loss → the node is not with us.

No packet loss → the node is with us???

Synchronous – with whom?

Page 38: Synchronous Multi-Master Clusters in WAN

Synchronous Replication in WAN: DONT's

1) Hot Spots

2) Poor Links

3) Huge Transactions

Page 39: Synchronous Multi-Master Clusters in WAN

Synchronous Replication in WAN: DONT's

Huge transactions kill concurrency:

a) Long to replicate

b) Long to certify

c) Long to apply on slave

→ SLAVE LAG

Page 40: Synchronous Multi-Master Clusters in WAN

Synchronous Replication in WAN: DONT's

1) Hot Spots

2) Poor Links

3) Huge Transactions

4) No Primary Keys

Page 41: Synchronous Multi-Master Clusters in WAN

Synchronous Replication in WAN: DONT's

No PRIMARY KEY:

mysql> DELETE FROM `10M_rows_no_PK_table`;

=> 50 000 000 000 000 rows scanned.
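The remedy is to give every replicated table an explicit PRIMARY KEY, so row-based events can locate rows by index lookup instead of a table scan (the table and column names below are hypothetical):

```sql
-- Without a PK, applying the DELETE on a slave scans the table for
-- every deleted row: ~10M rows x ~10M-row scans ≈ 5 x 10^13 row visits.
-- With a surrogate PK, each row is found by a single index lookup.
ALTER TABLE big_table
  ADD COLUMN id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY;
```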

Page 42: Synchronous Multi-Master Clusters in WAN

If Synchronous Doesn't Work Out

[Diagram: Galera1 (nodes 1, 2, 3) and Galera2 (nodes A, B, C); one Galera1 node acts as async master replicating to a Galera2 node acting as async slave.]

Native MySQL Asynchronous Replication Between Galera Clusters (log_slave_updates = ON)

Page 43: Synchronous Multi-Master Clusters in WAN

If Synchronous Doesn't Work Out

[Diagram: the same two clusters linked by an async master-to-slave channel.]

Native MySQL Asynchronous Replication Between Galera Clusters

Page 44: Synchronous Multi-Master Clusters in WAN

If Synchronous Doesn't Work Out

[Diagram: the same setup, but node 2 is gone from Galera1 while the async channel remains.]

Native MySQL Asynchronous Replication Between Galera Clusters

Page 45: Synchronous Multi-Master Clusters in WAN

If Synchronous Doesn't Work Out

[Diagram: Galera1 (nodes 1, 2, 3) and Galera2 (nodes A, B, C) linked by an async master-to-slave channel.]

Native MySQL Asynchronous Replication Between MariaDB Galera Clusters (log_slave_updates = OFF)
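A sketch of the binlog settings on the Galera1 node that acts as the async master (server ID and log names here are illustrative). With MySQL-wsrep, log_slave_updates must be ON so that writesets applied from the cluster reach the binary log; the MariaDB slide above shows it can stay OFF there:

```ini
# my.cnf fragment on the async master node in Galera1
[mysqld]
server_id         = 11
log_bin           = mysql-bin
log_slave_updates = ON    # binlog cluster-applied writesets for the async slave
```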

Page 46: Synchronous Multi-Master Clusters in WAN

Synchronous Replication in WAN

Q & A