MySQL 5.6 Global Transaction IDs - Use case: (session) consistency

Ulf Wendel, Oracle

MySQL 5.6, PECL/mysqlnd_ms 1.3


MySQL 5.6
Global Transaction Ids
Use case: Consistency

The speaker says...

We need new database cluster-aware APIs in PHP. Many of us use clusters for availability and performance reasons. No matter what data storage solution, there is hardly a one-fits-all solution.

Cluster-wide consistency varies by solution. Some solutions, such as MySQL Replication, provide different levels, depending on usage pattern. Our APIs shall allow requesting any level needed.

Simplified consistency level definitions come first, before it is explained how Global Transaction Ids (GTIDs) fit in.

All nodes have all the latest copies

All clients see the latest (committed) copies

Synchronous

Strong consistency

[Diagram: three MySQL nodes. Operations shown: SET X = 1; a subsequent GET X returns X = 1.]

The speaker says...

In this presentation a cluster is considered to deliver strong consistency if all nodes immediately serve the latest committed copies. Clients see each other's changes immediately. The nodes in the cluster are synchronous. Use it if transactions don't allow for any fuzziness, for example, for financial or booking transactions.

As will be shown, strong consistency is possible even in lazy primary copy systems. It does not always require an eager read-one, update-any design.

Some nodes have the latest copies

A client reads his latest (committed) copies

Other clients may read old copies

Session consistency

[Diagram: three MySQL nodes serving Client A and Client B. Operations shown: SET X = 1; GET X, X = 1; GET X, X = 0.]

The speaker says...

Session consistency ensures that one client will be able to read all his updates for the duration of a session. Clients may or may not see each other's committed transactions immediately. A load balancer must pick nodes appropriately.

This relaxed consistency level is good enough to please a discussion forum user. The user will see his latest contributions immediately. It becomes unlikely that users resubmit their posts as they are displayed immediately. However, other readers may not see the very latest posts.

Nodes may or may not serve the latest copies

Nodes may serve stale copies

Nodes may not have copies at all

Eventual consistency

[Diagram: three MySQL nodes. Operations shown: SET X = 1; GET X, X = 0; GET X, NULL.]

The speaker says...

An eventually consistent node may or may not serve the latest copy. In fact, there is no promise that a particular copy is available from every node. Many systems that default to eventual consistency reach strong consistency over time. Eventually all nodes get synchronized. This model is similar to that of a cache.

Eventual consistency is good enough for browsing product catalogs or other infrequently changing content where stale data is acceptable. It is the default consistency level with MySQL Replication. But why bother at all? Can't vendors provide proper solutions?

There is no one-fits-all replication solution

The Dangers of Replication and a Solution (Jim Gray, Pat Helland et al., 1996): a ten-fold increase in nodes and traffic gives a thousand-fold increase in deadlocks or reconciliations

CAP theorem (Eric Brewer, 2000): consistency, availability and partition tolerance can't all be achieved together

vendors are forced to offer compromises

vendors are faced with a highly complex problem

you are forced to know about consistency :-(

Are vendors fooling you?

The speaker says...

I would love to stop talking about consistency. But replication has limits. Keep on pushing but know the limits. Every synchronous, update-anywhere solution has a scalability limit. Partitioning (sharding) only shifts the limits and creates other issues (rebuilding aggregates). Until it's solved...

Applications shall state their QoS/consistency needs:

To allow vendors to relax consistency for performance reasons, but only if feasible and allowed by the app

To allow load balancers to make an optimal node choice

= PECL/mysqlnd_ms 1.2+, built around GTIDs...

Tell the plugin/load balancer what you need!

/* Yes, works with PDO and mysql as well. */
$mysqli = new mysqli("myapp", "username", "password", "database");
/* read-write splitting: master used */
$mysqli->query("INSERT INTO orders(item) VALUES ('spring flowers')");
/* Request session consistency: read your writes */
mysqlnd_ms_set_qos($mysqli, MYSQLND_MS_QOS_CONSISTENCY_SESSION);
/* Plugin picks a node which has the changes, here: master */
$res = $mysqli->query("SELECT item FROM orders WHERE order_id = 1");
var_dump($res->fetch_assoc());
/* Back to eventual consistency: stale data allowed */
mysqlnd_ms_set_qos($mysqli, MYSQLND_MS_QOS_CONSISTENCY_EVENTUAL);
/* Plugin picks any slave, stale data is allowed */
$res = $mysqli->query("SELECT item, price FROM specials");

PECL/mysqlnd_ms 1.2+ API

The speaker says...

Because PECL/mysqlnd_ms is a load balancer at the driver level, it can be controlled not only through SQL hints but also through API calls. mysqlnd_ms_set_qos() defines the quality of service (QoS) that shall be provided. It instructs the load balancer to select only MySQL database cluster nodes that can deliver the requested quality of service, for example, with regards to consistency.

Without GTIDs the rules for a MySQL Replication cluster are simple: for eventual consistency, any slave may be used; for session and strong consistency, only the master.
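The rule without GTIDs can be sketched in a few lines of plain PHP. This is only an illustration of the selection rule stated above, not the plugin's implementation: the constants LEVEL_EVENTUAL, LEVEL_SESSION and LEVEL_STRONG, the function candidate_nodes() and the node names are all made up for this example.

```php
<?php
// Hypothetical stand-ins for the consistency levels (not the plugin's constants).
const LEVEL_EVENTUAL = 'eventual';
const LEVEL_SESSION  = 'session';
const LEVEL_STRONG   = 'strong';

/**
 * Return the names of the nodes that may serve a read at the given
 * consistency level when no GTIDs are available.
 */
function candidate_nodes(string $level, string $master, array $slaves): array
{
    if ($level === LEVEL_EVENTUAL) {
        // Stale reads are acceptable, so any slave qualifies.
        return $slaves;
    }
    // Session and strong consistency: only the master is known to be current.
    return [$master];
}

$slaves = ['slave1', 'slave2'];
print_r(candidate_nodes(LEVEL_EVENTUAL, 'master', $slaves));
print_r(candidate_nodes(LEVEL_SESSION, 'master', $slaves));
```

The first call lists both slaves, the second only the master.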

Combination of server id and sequence number

Emulation: PECL/mysqlnd_ms 1.2, MySQL Proxy

Built-in: MySQL 5.6

Global transaction identifier

[Diagram: the MySQL master's log holds Log 7, Pos 34, GTID M:1: UPDATE x=1 and Log 7, Pos 35, GTID M:2: UPDATE x=9. One of the two MySQL slaves has replicated only GTID M:1: UPDATE x=1; the other has replicated GTID M:1: UPDATE x=1 and GTID M:2: UPDATE x=9.]

The speaker says...

A global transaction identifier is a cluster-wide unique transaction identifier. MySQL 5.6 can generate it automatically. MySQL Proxy and PECL/mysqlnd_ms 1.2 feature client-side emulations for use with any MySQL version. SQL can be used to access the GTIDs.

GTIDs have been created to make MySQL Replication failover easier. However, they are useful for load balancing in a primary copy system as well.
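To make the "server id plus sequence number" idea concrete, here is a small sketch that checks whether a set of replicated transactions contains one particular GTID. It assumes the textual set notation of MySQL 5.6, in which a source id is followed by colon-separated sequence numbers or ranges and several sources are separated by commas. The helper gtid_set_contains() is hypothetical, and the short source ids M and N follow the slides rather than real server UUIDs.

```php
<?php
/**
 * Check whether a GTID set such as "M:1-5:8,N:1-3" contains a single
 * GTID such as "M:4". Illustrative helper, not part of any extension.
 */
function gtid_set_contains(string $gtidSet, string $gtid): bool
{
    // Split "source:sequence" at the last colon.
    $pos = strrpos($gtid, ':');
    if ($pos === false) {
        return false;
    }
    $source = substr($gtid, 0, $pos);
    $seq = (int) substr($gtid, $pos + 1);

    foreach (explode(',', $gtidSet) as $entry) {
        // Each entry looks like "source:interval[:interval...]".
        $parts = explode(':', trim($entry));
        $entrySource = array_shift($parts);
        if ($entrySource !== $source) {
            continue;
        }
        foreach ($parts as $interval) {
            // An interval is either "n" or "low-high".
            $bounds = explode('-', $interval);
            $low = (int) $bounds[0];
            $high = isset($bounds[1]) ? (int) $bounds[1] : $low;
            if ($seq >= $low && $seq <= $high) {
                return true;
            }
        }
    }
    return false;
}

var_dump(gtid_set_contains('M:1', 'M:2'));   // a slave that has only M:1
var_dump(gtid_set_contains('M:1-2', 'M:2')); // a slave that has M:1 and M:2
```

A load balancer that knows the GTID of a client's last write can use such a check to decide whether a slave has already replicated that write.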

With or without GTID: primary only

Only the primary has all committed transactions

MySQL Replication: strong consistency

[Diagram: a MySQL master and two MySQL slaves, with Client A and Client B. Operations shown: SET X = 1; GET X, X = 1.]

The speaker says...

Configuring PECL/mysqlnd_ms for use with a MySQL Replication cluster and calling mysqlnd_ms_set_qos($conn, MYSQLND_MS_QOS_CONSISTENCY_STRONG) instructs PECL/mysqlnd_ms to use only the MySQL Replication master server for requests.

In a lazy primary copy system there is only one node that is guaranteed to have all committed updates: the primary. Note that it is possible to achieve higher consistency levels than eventual consistency in a lazy primary copy system by choosing nodes appropriately.

Use GTID to find synchronous slave

Check slave status using SQL

Reduce read load on master

MySQL Replication: session consistency

[Diagram: the MySQL master has logged GTID M:1: UPDATE x=1 and GTID M:2: UPDATE x=9. One of the two MySQL slaves has replicated only GTID M:1; the other has replicated GTID M:1 and GTID M:2. Operations shown: SET X = 9; GET X, X = 9.]

The speaker says...

Global transaction identifiers help to find up-to-date slaves that have already replicated the latest updates of a client. Thus, session consistency can now be achieved by reading from the master and selected up-to-date slaves.

This works with the GTID emulation of PECL/mysqlnd_ms 1.2 and any MySQL server version as well as with PECL/mysqlnd_ms 1.3 (not yet released) and MySQL 5.6 with its built-in GTIDs.

Remember: only one API call for your PHP application...
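What happens behind that one call can be pictured with a simplified model. The sketch below is not the plugin's code: nodes_for_session() is a hypothetical function, each slave is reduced to the highest sequence number it has replicated from the master, and the assignment of those numbers to slave1 and slave2 is an assumption made for the example.

```php
<?php
/**
 * Pick the nodes that can serve a read under session consistency.
 * $slaves maps a slave name to the highest sequence number it has
 * replicated from the master (a simplification of a real GTID set).
 */
function nodes_for_session(string $master, array $slaves, int $lastWriteSeq): array
{
    // The master always has the client's own writes.
    $nodes = [$master];
    foreach ($slaves as $name => $appliedSeq) {
        // A slave qualifies once it has replicated the client's last write.
        if ($appliedSeq >= $lastWriteSeq) {
            $nodes[] = $name;
        }
    }
    return $nodes;
}

// The client's last write is M:2; slave1 is assumed to be at M:1, slave2 at M:2.
$slaves = ['slave1' => 1, 'slave2' => 2];
print_r(nodes_for_session('master', $slaves, 2));
```

With these numbers the read may go to the master or to slave2, which takes read load off the master while the client still sees its own write.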

With or without GTID: all slaves

Optional QoS level: upper slave lag limit

MySQL estimates slave lag!

MySQL Replication: eventual consistency

[Diagram: a MySQL master and two MySQL slaves, one with slave lag = 1 second and one with slave lag = 7 seconds. Operations shown: SET X = 9; GET X, X = 8.]

The speaker says...

A MySQL Replication slave is eventually consistent: it may or may not have the latest updates. There is no need to filter nodes with regards to consistency.

However, slaves can be filtered by replication lag: mysqlnd_ms_set_qos($conn, MYSQLND_MS_QOS_CONSISTENCY_EVENTUAL, MYSQLND_MS_QOS_OPTION_AGE, 5) filters out all slaves that report an estimated lag of more than five seconds.
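The lag filter itself is simple to picture. The following sketch assumes the lag estimates have already been collected; filter_by_lag() is a hypothetical helper, the slave names and lag values are example data, and treating a missing estimate (null) as unusable is an assumption of this sketch, not documented plugin behaviour.

```php
<?php
/**
 * Keep only slaves whose reported lag is at most $maxAge seconds.
 * $lagBySlave maps a slave name to its estimated lag in seconds,
 * or null when the slave reports no estimate.
 */
function filter_by_lag(array $lagBySlave, int $maxAge): array
{
    $usable = [];
    foreach ($lagBySlave as $name => $lag) {
        // Assumption: a slave without an estimate is not used.
        if ($lag !== null && $lag <= $maxAge) {
            $usable[] = $name;
        }
    }
    return $usable;
}

// Example data: one slave lags 1 second, the other 7 seconds.
$lag = ['slave1' => 1, 'slave2' => 7];
print_r(filter_by_lag($lag, 5));
```

With an upper limit of five seconds only slave1 remains a candidate.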

Same logic whenever slaves are to be filtered

applied for: session consistency + GTID

applied for: eventual consistency + Lag

Stage 1: send SQL to check status to all slaves

Stage 2: fetch replies in order

Stage 3: apply filter logic

SQL is documented in the manual

Slave selection logic

The speaker says...

Whenever PECL/mysqlnd_ms has to check the slave status for filtering slaves, the check is done in two stages. First, all slaves are queried. Then, replies are fetched from the slaves in order. Usually, many requests can be sent out in stage one before the servers reply. The filter logic is applied once the replies are in.

The SQL details are documented at php.net/mysqlnd_ms.
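The benefit of sending first and fetching afterwards shows up in the order of operations. The sketch below simulates it with a made-up FakeSlave class that only records calls; check_slaves(), the lag values and the lag-based filter are assumptions for illustration and say nothing about how the plugin talks to real servers.

```php
<?php
/** Hypothetical stand-in for a slave connection; it only records call order. */
class FakeSlave
{
    public $name;
    private $lag;

    public function __construct(string $name, int $lag)
    {
        $this->name = $name;
        $this->lag = $lag;
    }

    /** Stage 1: send the status query without waiting for the reply. */
    public function send(array &$log): void
    {
        $log[] = "send {$this->name}";
    }

    /** Stage 2: wait for this slave's reply and return the reported lag. */
    public function fetch(array &$log): int
    {
        $log[] = "fetch {$this->name}";
        return $this->lag;
    }
}

function check_slaves(array $slaves, int $maxLag, array &$log): array
{
    // Stage 1: all requests go out before any reply is read.
    foreach ($slaves as $slave) {
        $slave->send($log);
    }
    // Stage 2: replies are fetched in the same order.
    $replies = [];
    foreach ($slaves as $slave) {
        $replies[$slave->name] = $slave->fetch($log);
    }
    // Stage 3: apply the filter logic to the collected replies.
    $usable = array_filter($replies, function ($lag) use ($maxLag) {
        return $lag <= $maxLag;
    });
    return array_keys($usable);
}

$log = [];
$usable = check_slaves([new FakeSlave('slave1', 1), new FakeSlave('slave2', 7)], 5, $log);
print_r($log);
print_r($usable);
```

Because every request is sent before the first reply is read, the slaves can work on their status queries in parallel, and the total wait is closer to the slowest reply than to the sum of all replies.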

PECL/mysqlnd_ms is a PHP solution

Stateless: decisions are not remembered

Shared-nothing: instances don't communicate

Optional: user hooks to make stateful decisions

Stateless and shared-nothing

The speaker says...

Please recall that we are talking about a PHP integrated solution. By default PHP is stateless and promotes a shared-nothing architecture.

PHP and PECL/mysqlnd_ms lose their state at the end of a web request. State is neither persisted nor shared between different processes. Thus, there is no single point of failure.

If you want PECL/mysqlnd_ms to remember decisions, install user hooks and persist their decisions.

THE END

Contact: [email protected]