PhxSQL: A High-Availability & Strong-Consistency MySQL Cluster Ming CHEN@WeChat


Why PhxSQL

Highly expected features for a MySQL cluster


Availability and consistency in MySQL cluster

• Master-slave replication
• Replication solutions
  • Fully synchronous
  • Semi-synchronous
  • Asynchronous
• Third-party aided failover
  • Zookeeper/Admin/…

[Figure: MySQL master-slave replication. A client and three servers: MySQL (master) on SVR A, MySQL (slave) on SVR B and on SVR C. Every server holds a binlog (1: x, 2: y, 3: z, …) and InnoDB storage. Two arrows are labeled "pull".]


Call for both high-availability and strong-consistency

• Critical applications demand both high-availability and strong-consistency
  • accounts/financial transactions/…
• Support for both high-availability and strong-consistency
  • greatly simplifies system design
  • makes correctness-reasoning easier


PhxSQL

A new MySQL cluster that supports the same high-availability and strong-consistency as Zookeeper does


What is PhxSQL

High-availability & strong-consistency MySQL cluster


Key PhxSQL features

• Support full compatibility with MySQL

• Support high availability and linearizable consistency

• Support deployment in wide-area network

• Support online reconfiguration of cluster membership


Support full compatibility with MySQL

• Transparent to MySQL clients
• Support full features of MySQL
  • Even the serializable level of transaction isolation
• Minimally intrusive change to MySQL source code
  • modify after_flush
  • add before_recovery


Support high availability

• Fully automatic failover within a configurable number of seconds
• A cluster keeps working as long as more than half of its servers still function
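The "more than half" rule is plain majority arithmetic. A minimal sketch follows; the function names are illustrative and not part of PhxSQL.

```cpp
#include <cassert>

// Smallest number of servers that forms a majority of the cluster.
int MajoritySize(int cluster_size) {
    assert(cluster_size > 0);
    return cluster_size / 2 + 1;
}

// Largest number of simultaneous server failures the cluster can survive.
int TolerableFailures(int cluster_size) {
    return cluster_size - MajoritySize(cluster_size);
}

// True when `alive` working servers are enough for the cluster to make progress.
bool HasQuorum(int cluster_size, int alive) {
    return alive >= MajoritySize(cluster_size);
}
```

Under this rule a 3-server cluster survives 1 failure and a 5-server cluster survives 2, while a 4-server cluster still survives only 1.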


Support linearizable consistency

• A PhxSQL cluster appears as a single MySQL server to the MySQL clients concurrently accessing it
• PhxSQL supports a consistency level as strong as Zookeeper's!


Support online reconfiguration of cluster membership

• Add a new server, remove an old server, or replace an old server with a new one in an atomic fashion while still serving read/write requests
• Easier to maintain and friendlier to clients


How PhxSQL Works

MySQL cluster powered by Paxos


Paxos maintains consistent state across servers by enforcing the same operations, in the same order, on all servers, as long as more than half of the servers work

[Figure: replicated state machine. Client R issues Foo(1, 2) and then Bar(3, 30); Client S issues Bar(1, 10); Client T issues Foo(3, 4). SVR A, SVR B, and SVR C each run Paxos and hold the same state machine:]

int X=0;
int Y=0;
void Foo(int a, int b);
bool Bar(int a, int b);

[Paxos fixes one global order of the requests (the slide labels this view "God"):]

1: R1: Foo(1, 2)
2: T1: Foo(3, 4)
3: S1: Bar(1, 10)
4: R2: Bar(3, 30)

[Every server then runs invoke: R1, T1, S1, R2, …]
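The replicated state machine on this slide can be sketched as below. The slide gives only the signatures of Foo and Bar, so their bodies here are assumptions chosen to make the order of operations matter; LogEntry, Replay, and SlideLog are illustrative names as well.

```cpp
#include <string>
#include <vector>

// One replica's state, mirroring X and Y from the slide.
struct StateMachine {
    int X = 0;
    int Y = 0;

    // Hypothetical body: the slide shows only the signature.
    void Foo(int a, int b) { X += a; Y += b; }

    // Hypothetical body: set X to b only if X currently equals a.
    bool Bar(int a, int b) {
        if (X != a) return false;
        X = b;
        return true;
    }
};

enum class Op { kFoo, kBar };

struct LogEntry {
    std::string request_id;  // e.g. "R1"
    Op op;
    int a;
    int b;
};

// Apply every entry of the agreed log, in order, to one replica.
void Replay(const std::vector<LogEntry>& log, StateMachine& sm) {
    for (const LogEntry& e : log) {
        if (e.op == Op::kFoo) {
            sm.Foo(e.a, e.b);
        } else {
            sm.Bar(e.a, e.b);
        }
    }
}

// The order shown on the slide: R1, T1, S1, R2.
std::vector<LogEntry> SlideLog() {
    return {
        {"R1", Op::kFoo, 1, 2},
        {"T1", Op::kFoo, 3, 4},
        {"S1", Op::kBar, 1, 10},
        {"R2", Op::kBar, 3, 30},
    };
}
```

Replaying SlideLog() on any number of fresh replicas leaves them all in the same state. With these assumed bodies, swapping T1 and S1 would give a different final X, which is why the servers must agree on one order and not just on the set of requests.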


PhxSQL enforces a simple but effective constraint by Paxos

• Constraint: two MySQL servers have the SAME state if and only if they have the SAME binlog
• Enforcement
  • PhxSQL maintains a GLOBAL consistent binlog by Paxos
  • Every MySQL server aligns its LOCAL binlog to the GLOBAL one
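One way to picture "aligning" a local binlog to the global one: keep the longest prefix on which both agree, drop whatever diverges, and copy the missing tail. The slides do not say how PhxSQL does this internally, so the sketch below is an assumption for illustration only, with binlog entries modeled as plain strings.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

using Binlog = std::vector<std::string>;

// Number of leading entries on which the two binlogs agree.
std::size_t CommonPrefix(const Binlog& local, const Binlog& global) {
    const std::size_t n = std::min(local.size(), global.size());
    std::size_t i = 0;
    while (i < n && local[i] == global[i]) ++i;
    return i;
}

// Make `local` identical to `global`: drop the divergent suffix, then
// append the entries that are missing. Returns how many local entries
// were discarded.
std::size_t AlignToGlobal(Binlog& local, const Binlog& global) {
    const std::size_t keep = CommonPrefix(local, global);
    const std::size_t discarded = local.size() - keep;
    local.resize(keep);
    local.insert(local.end(),
                 global.begin() + static_cast<std::ptrdiff_t>(keep),
                 global.end());
    return discarded;
}
```

A server that is merely behind discards nothing and only appends. A server holding entries that never reached the global binlog loses exactly those entries.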


MySQL cluster

[Figure: a client and three servers: MySQL (master) on SVR A, MySQL (slave) on SVR B and on SVR C. Every server holds a binlog (1: x, 2: y, 3: z, …) and InnoDB storage. Two arrows are labeled "pull".]


PhxSQL

[Figure: PhxSQL architecture. SVR A runs MySQL (master); SVR B and SVR C run MySQL (slave). Each server contains PhxProxy, MySQL (binlog 1: x, 2: y, 3: z, … and InnoDB) with PhxPlugin, and PhxBinlogSvr, which holds the PhxBinlog (1: x, 2: y, 3: z, …) and runs Paxos. Arrow labels: "forward", "pull", "pull".]

1. Paxos detects failure and elects a new master by leasing and periodic heartbeat

2. Clients access the master MySQL transparently via PhxProxy, which forwards requests to the current master

3. The MySQL master appends its local binlog to the globally consistent PhxBinlog maintained by Paxos

4. MySQL slaves pull the binlog from the globally consistent PhxBinlog
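Step 1 relies on a lease that is renewed by heartbeat. A minimal sketch of the idea follows, assuming a single lease record and a shared millisecond clock. In PhxSQL the agreement on the master goes through Paxos, which this sketch leaves out, and all names here are hypothetical.

```cpp
#include <cstdint>
#include <string>

// Lease record. All times are milliseconds on one hypothetical clock.
struct MasterLease {
    std::string holder;            // empty means no master yet
    std::int64_t expires_at_ms = 0;
};

// A lease with no holder, or one whose deadline has passed, is free.
bool IsExpired(const MasterLease& lease, std::int64_t now_ms) {
    return lease.holder.empty() || now_ms >= lease.expires_at_ms;
}

// Heartbeat from `server`: renew the lease if it already holds it, or
// take the lease over if it has expired. Returns true if `server` is
// the master afterwards.
bool Heartbeat(MasterLease& lease, const std::string& server,
               std::int64_t now_ms, std::int64_t lease_ms) {
    if (lease.holder == server || IsExpired(lease, now_ms)) {
        lease.holder = server;
        lease.expires_at_ms = now_ms + lease_ms;
        return true;
    }
    return false;
}
```

While the master keeps sending heartbeats, no other server can take over. Once heartbeats stop, another server becomes master after at most one lease period, which is where a configurable failover time would come from in this model.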


Paxos maintains consistent states in the cluster

[Figure: the state machine (SM) replicated by Paxos. PhxPlugin instances act as Client R, Client S, and Client T. PhxBinlogSvr A, PhxBinlogSvr B, and PhxBinlogSvr C each run Paxos and hold the same state machine:]

master_ver
master_id
binlog_queue
C1: bool change_master(ver, new_master);
C2: int app_binlog(binlog_id, binlog);

[Requests: R issues C1(1, R) and then C2("R_xxx", "insert foo=2 into table_1"); S issues C1(1, S); T issues C1(1, T). Paxos fixes one global order (the slide labels this view "God"):]

1: R1: C1(1, R)
2: T1: C1(1, T)
3: S1: C1(1, S)
4: R2: C2(…)
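The slide names the state (master_ver, master_id, binlog_queue) and the two commands, but not their semantics. The sketch below fills them in with assumptions: change_master succeeds only when the caller's ver matches the current master_ver, the initial version is 1, and app_binlog accepts an entry only when binlog_id starts with the current master's id followed by "_" (suggested by "R_xxx"). The return convention of app_binlog is also made up.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

class BinlogStateMachine {
public:
    // C1 (assumed semantics): succeed only if the caller saw the latest
    // master version, so concurrent candidates cannot both win.
    bool change_master(std::uint64_t ver, const std::string& new_master) {
        if (ver != master_ver_) return false;
        master_id_ = new_master;
        ++master_ver_;
        return true;
    }

    // C2 (assumed semantics): append only entries whose id carries the
    // current master's prefix. Returns the new queue length, or -1 if
    // the entry is rejected.
    int app_binlog(const std::string& binlog_id, const std::string& binlog) {
        const std::string prefix = master_id_ + "_";
        if (master_id_.empty() ||
            binlog_id.compare(0, prefix.size(), prefix) != 0) {
            return -1;
        }
        binlog_queue_.emplace_back(binlog_id, binlog);
        return static_cast<int>(binlog_queue_.size());
    }

    const std::string& master_id() const { return master_id_; }
    std::uint64_t master_ver() const { return master_ver_; }

private:
    std::uint64_t master_ver_ = 1;  // assumed initial version
    std::string master_id_;
    std::vector<std::pair<std::string, std::string>> binlog_queue_;
};
```

Applied in the order from the slide, C1(1, R) wins, C1(1, T) and C1(1, S) fail because the version has already moved on, and R's C2 is appended. Every PhxBinlogSvr replays the same order, so all of them end up with R as master and the same binlog_queue.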


How to Use PhxSQL

Easy to integrate into existing system


Case 1: non-intrusive change to existing system

• Assume the existing system has some kind of naming service that tells clients the current master
  • Zookeeper/DNS/configuration file/...
• Develop a new daemon program that learns of master changes in the PhxSQL cluster and updates the information in the naming service accordingly
  • Safety: PhxSQL ensures consistency even if the master information in the naming service is stale or a MySQL client connects to a slave

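[Figure: a daemon (I) learns the new master from the PhxSQL cluster (master 10.1.1.1, slaves 10.1.1.2 and 10.1.1.3) and (II) updates the naming service entry "master ip: 10.1.1.1". The client (1) gets the IP of the master from the naming service and (2) sends its request to PhxSQL.]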
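One polling round of such a daemon might look like the sketch below. Everything in it is a stand-in: the naming service is modeled as an in-memory map, and query_master is a caller-supplied function, because the slides do not say how the daemon learns the current master from PhxSQL.

```cpp
#include <functional>
#include <map>
#include <string>

// Hypothetical stand-in for Zookeeper/DNS/a configuration file.
using NamingService = std::map<std::string, std::string>;

// One polling round: ask for the current master and update the naming
// service only if the answer changed. Returns true if an update was made.
bool SyncMasterOnce(const std::function<std::string()>& query_master,
                    NamingService& naming,
                    const std::string& key = "master_ip") {
    const std::string current = query_master();
    if (current.empty()) return false;  // no master known right now
    const auto it = naming.find(key);
    if (it != naming.end() && it->second == current) return false;
    naming[key] = current;
    return true;
}
```

A real daemon would call this in a loop with a sleep in between. Because PhxSQL stays consistent even when the naming service is stale, a delayed update costs only an extra forwarding hop, not correctness.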


Case 2: minimal intrusive change to existing system

• Traditional invocation scenario
  1. Read the IP of master from configuration file
  2. Get MySQL handler by calling mysql_real_connect(MYSQL *mysql, const char * IP, …) with the IP
  3. Invoke other MySQL client API with the handler
• PhxSQL invocation scenario
  1. Read the IP list of PhxSQL cluster servers from configuration file
  2. Get MySQL handler by calling PhxSQLClientBase::Connect() with the IP list and then PhxSQLClientBase::GetMySQLFD()
  3. Invoke other MySQL client API with the handler


Performance

On par with MySQL semi-sync replication


Performance test settings: PhxSQL vs. MySQL Semi-sync

• 3 servers: Intel(R) Xeon(R) CPU E5-2420 @ 1.90GHz * 2, 32G memory, SSD Raid10, 1000M NIC
• Network ping latency: master->slave: 3ms; client->master: 4ms
• Percona 5.6.31-77.0
• Master of MySQL semi-sync replication waits for only ONE ack
• Test tool and parameters: sysbench --oltp-tables-count=10 --oltp-table-size=1000000 --num-threads=500 --max-requests=100000 --report-interval=1 --max-time=200


QPS on par

200 concurrent client threads
                           PhxSQL    MySQL
insert.lua (100% write)      5076     4055
select.lua (0% write)       46334    47528
OLTP.lua (20% write)        25657    20391

500 concurrent client threads
                           PhxSQL    MySQL
insert.lua (100% write)      8260     7072
select.lua (0% write)      105928   121535
OLTP.lua (20% write)        46543    33229


Response time (ms) on par

200 concurrent client threads
                           PhxSQL    MySQL
insert.lua (100% write)     39.34    49.27
select.lua (0% write)        4.21     4.1
OLTP.lua (20% write)       140.16   176.39

500 concurrent client threads
                           PhxSQL    MySQL
insert.lua (100% write)     60.41    70.6
select.lua (0% write)        4.58     4.17
OLTP.lua (20% write)       192.93   270.38


Current progress

• Deployed in WeChat account system
  • WeChat account: 889M monthly active users

• Support for Percona 5.7 in progress

• Open source: https://github.com/tencent-wechat/phxsql