1 / 191 - DataOps Barcelona | Databases · GR is a plugin for MySQL, made by MySQL and packaged...


Safe Harbor Statement

The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.

Copyright @ 2019 Oracle and/or its affiliates. All rights reserved.


about.me/lefred

Who am I ?


Frédéric Descamps
@lefred
MySQL Evangelist
Managing MySQL since 3.23
devops believer
living in Belgium 🇧🇪
https://lefred.be

 


Group Replication: heart of MySQL InnoDB Cluster


MySQL Group Replication

but what is it ?!?


GR is a plugin for MySQL, made by MySQL and packaged with MySQL
GR is an implementation of Replicated Database State Machine theory
Paxos based protocol
GR allows writing on all Group Members (cluster nodes) simultaneously while retaining consistency
GR implements conflict detection and resolution
GR allows automatic distributed recovery

Automatic failover
Automatic membership configuration

Adding/removing members
Network partitions, failures

Prevents data loss
Supported on all MySQL platforms !!

Linux, Windows, Solaris, OSX, FreeBSD
Completely Open Source - GPL ! No license required to have High Availability


Group Replication is nice, but how does it work ?


it's just ...


... no, in fact the writeset replication is synchronous, and then certification and apply of the changes are local to each node and asynchronous.


Not that easy to understand... right? As a picture is worth a thousand words, let's illustrate this...


MySQL Group Replication


MySQL Group Replication (full transaction)



MySQL Group Communication System (GCS)

MySQL Xcom protocol
Replicated Database State Machine
Paxos based protocol (a variant of Mencius)
its task: deliver messages across the distributed system:

    atomically
    in Total Order (Writes)

MySQL Group Replication receives the Ordered 'tickets' from this GCS subsystem.
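A toy illustration of why atomic, totally ordered delivery matters: if every member applies the same messages in the same order, all members end up in the same state. The `Member` class and the `broadcast_in_total_order` helper are hypothetical names invented for this sketch. They are not part of XCom or the GR plugin, and the single loop below stands in for the Paxos rounds a real GCS would use to agree on the order.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    """A group member that applies delivered messages to a local key/value state."""
    name: str
    state: dict = field(default_factory=dict)
    delivered: list = field(default_factory=list)

    def deliver(self, ticket: int, key: str, value: str) -> None:
        # Record the ticket and apply the change in the order received.
        self.delivered.append(ticket)
        self.state[key] = value


def broadcast_in_total_order(members, messages):
    """Stand-in for the GCS (assumption): give each message one global
    ticket and deliver it to every member before moving to the next."""
    for ticket, (key, value) in enumerate(messages, start=1):
        for member in members:
            member.deliver(ticket, key, value)


members = [Member("node1"), Member("node2"), Member("node3")]
# Two writes to the same key, originating from different nodes.
broadcast_in_total_order(members, [("k", "from-node1"), ("k", "from-node2")])

states = [m.state for m in members]
```

Because all three members see ticket 1 and then ticket 2, they agree on the final value of `k` without any further coordination.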


Architecture, Stack, Core, ...

MySQL Group Replication


MySQL Group Replication - Architecture

Node Types:

R: Traffic routers/proxies: MySQL Router, HAProxy, ProxySQL...
M: mysqld nodes participating in MySQL Group Replication


MySQL Group Replication - Stack


MySQL Group Replication - Core

Server calls into the plugin through a generic interface

Most server internals are hidden from the plugin

Plugin interacts with the server through a generic interface

Replication plugin determines the fate of the commit operation through a well defined server interface
The plugin makes use of the relay log infrastructure to inject changes in the receiving server


MySQL Group Replication - GR Plugin

Maintaining distributed execution context
Detecting and Resolving conflicts
Handling distributed recovery

Detect membership changes
Donate state if needed
Collect state if needed

Proposing transactions to other members
Receiving and handling transactions from other members
Deciding the ultimate fate of transactions

commit or rollback
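The "commit or rollback" decision can be pictured with a heavily simplified certification sketch. Everything here is an assumption made for illustration: the `Certifier` class, the integer tickets, and the idea of a snapshot expressed as "highest ticket seen" are stand-ins, while the real plugin works with writeset hashes and GTID sets. The point is only that each member can run the same deterministic check on the same ordered input and reach the same verdict locally.

```python
class Certifier:
    """Toy certifier: remembers, per row key, the ticket of the last
    transaction that was certified as modifying that row."""

    def __init__(self):
        self.last_writer = {}   # row key -> ticket of last certified writer
        self.next_ticket = 1    # position in the total order

    def certify(self, writeset, snapshot):
        """Return True (commit) or False (rollback).

        writeset: set of row keys the transaction modified.
        snapshot: highest ticket the transaction had seen when it executed.
        """
        ticket = self.next_ticket
        self.next_ticket += 1
        # Conflict: a row was changed by a transaction this one never saw.
        if any(self.last_writer.get(row, 0) > snapshot for row in writeset):
            return False
        for row in writeset:
            self.last_writer[row] = ticket
        return True


certifier = Certifier()
# T1 and T2 both executed against snapshot 0 and touch the same row.
t1 = certifier.certify({"t1:pk=1"}, snapshot=0)  # ordered first
t2 = certifier.certify({"t1:pk=1"}, snapshot=0)  # concurrent, ordered second
t3 = certifier.certify({"t1:pk=2"}, snapshot=0)  # different row
```

In this model the transaction ordered first wins, the concurrent one on the same row is rolled back, and the unrelated one commits.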


MySQL Group Replication - GCS

The communication API (and binding) is responsible for:

Abstracting the underlying group communication system implementation from the plugin itself
Mapping the interface to a specific group communication system implementation

The Group Communication System engine:

Paxos implementation (Similar to Paxos Mencius)
Building block to provide distributed agreement between servers


Total Order

GTID generation


MySQL Group Replication uses MySQL replication framework by design:

binary logs

relay logs

GTIDs: Global Transaction IDs


How does Group Replication handle GTIDs ?

There are two ways of generating GTIDs:

AUTOMATIC: the transaction is assigned an automatically generated id during commit. Where regular replication uses the source server UUID, Group Replication uses the group name.

ASSIGNED: the user manually assigns a GTID to the transaction through SET GTID_NEXT. This is common to any replication format, and the id is assigned before the transaction starts.
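As a small sketch of the two modes, the helpers below build the GTID string and the `SET GTID_NEXT` statement text. The function names and both UUIDs are made up for this example. The only things taken from the slides are that a GTID pairs a source identifier with a sequence number and that Group Replication uses the group name where regular replication uses the server UUID.

```python
from typing import Optional


def automatic_gtid(seqno: int, server_uuid: str,
                   group_name: Optional[str] = None) -> str:
    """AUTOMATIC mode: the id is generated at commit time.

    Regular replication uses the source server UUID as the source part;
    Group Replication uses the group name (itself a UUID).
    """
    source = group_name if group_name is not None else server_uuid
    return f"{source}:{seqno}"


def assigned_gtid_statement(gtid: str) -> str:
    """ASSIGNED mode: statement text the user runs before the transaction."""
    return f"SET GTID_NEXT = '{gtid}'"


# Placeholder UUIDs, invented for this example.
SERVER_UUID = "11111111-1111-1111-1111-111111111111"
GROUP_NAME = "22222222-2222-2222-2222-222222222222"

regular = automatic_gtid(42, SERVER_UUID)
grouped = automatic_gtid(42, SERVER_UUID, group_name=GROUP_NAME)
statement = assigned_gtid_statement(f"{GROUP_NAME}:43")
```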


Group Replication : Total Order Delivery - GTID


Group Replication : GTID

The previous example was not totally in sync with reality. In fact, a writer allocates a block of GTIDs, and when we have multiple writers (multi-primary mode), each writer uses GTID sequence numbers from its own allocated block.

The size of the block is defined by group_replication_gtid_assignment_block_size (defaults to 1M).


Group Replication : GTID

Example:

Executed_Gtid_Set: 0b5c746d-d552-11e8-bef0-08002718d305:1-355

New write on another node:

Executed_Gtid_Set: 0b5c746d-d552-11e8-bef0-08002718d305:1-355:1000354

Let's write on the third node:

Executed_Gtid_Set: 0b5c746d-d552-11e8-bef0-08002718d305:1-355:1000354:2000354

And writing back on the first one:

Executed_Gtid_Set: 0b5c746d-d552-11e8-bef0-08002718d305:1-356:1000354:2000354
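The numbers in this example can be reproduced with a small model of block allocation. This is a sketch under stated assumptions, not the plugin's code: `BlockAllocator`, `Writer` and `format_gtid_set` are hypothetical names, the shared allocator object stands in for whatever coordination the plugin really uses, and the starting offset of 354 is inferred from the example values rather than stated in the slides. Only the block size of 1M comes from the deck.

```python
class BlockAllocator:
    """Hands out consecutive, non-overlapping blocks of sequence numbers."""

    def __init__(self, first_free, block_size):
        self.next_block_start = first_free
        self.block_size = block_size

    def reserve(self):
        start = self.next_block_start
        self.next_block_start += self.block_size
        return iter(range(start, start + self.block_size))


class Writer:
    """A group member that takes sequence numbers from its own block."""

    def __init__(self, allocator):
        self.allocator = allocator
        self.block = None

    def next_seqno(self):
        # Reserve a block lazily on the first write, and a fresh one
        # when the current block is exhausted.
        if self.block is None:
            self.block = self.allocator.reserve()
        try:
            return next(self.block)
        except StopIteration:
            self.block = self.allocator.reserve()
            return next(self.block)


def format_gtid_set(uuid, seqnos):
    """Collapse sequence numbers into the uuid:a-b:c interval notation."""
    intervals = []
    for n in sorted(seqnos):
        if intervals and n == intervals[-1][1] + 1:
            intervals[-1][1] = n
        else:
            intervals.append([n, n])
    parts = [str(a) if a == b else f"{a}-{b}" for a, b in intervals]
    return uuid + ":" + ":".join(parts)


GROUP = "0b5c746d-d552-11e8-bef0-08002718d305"
executed = set(range(1, 354))  # assumption: 1-353 were executed earlier
alloc = BlockAllocator(first_free=354, block_size=1_000_000)
node1, node2, node3 = Writer(alloc), Writer(alloc), Writer(alloc)

executed.add(node1.next_seqno())  # first node, start of its block
executed.add(node1.next_seqno())  # first node again
executed.add(node2.next_seqno())  # second node, start of the next block
executed.add(node3.next_seqno())  # third node, start of the block after
executed.add(node1.next_seqno())  # back on the first node
gtid_set = format_gtid_set(GROUP, executed)
```

Each writer only ever touches numbers inside its own block, which is why the executed set shows three separate intervals instead of one contiguous range.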


done !

Return from Commit


Group Replication: return from commit

Asynchronous Replication:


Group Replication: return from commit (2)

Semi-Sync Replication:


Group Replication: return from commit (3)

Group Replication (*):

(*): eventual


Group Replication: return from commit (4)

Group Replication (*):

(*): before


Group Replication: return from commit (5)

Group Replication (*):

(*): after


Does this mean we can have a distant node and always let it ack later?

NO!

Because the system has to wait for the noop (single skip message) from the “distant” node, where latency is higher.

The size of the GCS consensus message window can be read and set with the UDFs group_replication_get_write_concurrency() and group_replication_set_write_concurrency():

mysql> select group_replication_get_write_concurrency();
+-------------------------------------------+
| group_replication_get_write_concurrency() |
+-------------------------------------------+
|                                        10 |
+-------------------------------------------+


Event Horizon

GCS Write Consensus Concurrency

group replication write concurrency


conflict

Optimistic Locking


Group Replication: Optimistic Locking

Group Replication uses optimistic locking:

during a transaction, local (InnoDB) locking happens

it optimistically assumes there will be no conflicts across nodes (no communication between nodes is necessary)

cluster-wide conflict resolution happens only at COMMIT, during certification

Let's first have a look at traditional locking to compare.


Traditional locking


Optimistic Locking


The system returns error 149 as certification failed:

ERROR 1180 (HY000): Got error 149 during COMMIT


Optimistic Locking (2)

Conflict detection can happen at two different stages:

Only one of the two transactions was sent to the Group and the other one is still running (local).

Both transactions were sent to the Group at almost the same time and both reached GCS/XCOM.


Optimistic Locking (3)

Only one transaction was sent to GCS and the other conflicting one is still local:

In this case, it's the high priority transaction mechanism of InnoDB that kills the local one:

If you try any statement in the transaction's session you will see:

ERROR: 1213: Deadlock found when trying to get lock; try restarting transaction

If you try to commit the transaction you will see:

ERROR: 1180: Got error 149 - 'Lock deadlock; Retry transaction' during COMMIT


Optimistic Locking (4)

Both transactions were sent to GCS:

The second transaction (the conflicting one) will return:

ERROR 3101 (40000): Plugin instructed the server to rollback the current transaction.
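In both cases the losing transaction ends up rolled back, so the application usually has to retry it. Below is a minimal Python sketch of such a retry loop. The `DbError` class and the `run_transaction` callable are hypothetical stand-ins (a real driver exposes its own exception type); only the error numbers 1213, 1180 and 3101 come from the slides.

```python
import time

# Error numbers shown in the slides for conflict / certification failures.
RETRYABLE_ERRNOS = {1213, 1180, 3101}


class DbError(Exception):
    """Hypothetical stand-in for a driver exception carrying a MySQL errno."""

    def __init__(self, errno, msg=""):
        super().__init__(f"ERROR {errno}: {msg}")
        self.errno = errno


def with_retry(run_transaction, max_attempts=3, backoff_s=0.0):
    """Run a transaction callable, retrying when a conflict rolled it back."""
    for attempt in range(1, max_attempts + 1):
        try:
            return run_transaction()
        except DbError as exc:
            if exc.errno not in RETRYABLE_ERRNOS or attempt == max_attempts:
                raise                          # not a conflict, or out of attempts
            time.sleep(backoff_s * attempt)    # simple linear backoff


# Usage with a fake transaction that loses certification once (error 3101).
calls = {"n": 0}


def fake_transaction():
    calls["n"] += 1
    if calls["n"] == 1:
        raise DbError(3101, "Plugin instructed the server to rollback "
                            "the current transaction.")
    return "committed"


result = with_retry(fake_transaction)
```

The whole transaction is re-run from the start, because after a rollback none of its statements have taken effect.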


Such conflicts happen only when using a multi-primary group!

(Not totally true in MySQL < 8.0.13 when a failover happens.)


Drawbacks of optimistic locking

Having a first-committer-wins system means conflicts will more likely happen when writing on multiple members with:

large transactions

long running transactions

hotspot records


Configurable Consistency Guarantees

Consistency Levels


Consistency: EVENTUAL (default)

By default, there is no synchronization point for transactions: when you perform a write on a node and immediately read the same data on another node, it is eventually there.

mysql> show variables like 'group_replication_consistency';
+-------------------------------+----------+
| Variable_name                 | Value    |
+-------------------------------+----------+
| group_replication_consistency | EVENTUAL |
+-------------------------------+----------+

Since MySQL 8.0.14, we have the possibility to set the synchronization point at read, at write, or both (globally or for a session).


Consistency: BEFORE (READ)


Consistency: AFTER (WRITE)


Consistency: BEFORE_AND_AFTER
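The difference between the levels can be illustrated with a toy model in Python. This is only a sketch under simplifying assumptions, not how the plugin is implemented: `ToyMember`, `write` and `read` are hypothetical names, and the "wait until the backlog is applied" behaviour of a synchronization point is modelled by applying the pending queue immediately.

```python
class ToyMember:
    """Toy group member: applied data plus a queue of not-yet-applied writes."""

    def __init__(self):
        self.data = {}
        self.apply_queue = []

    def apply_pending(self):
        for key, value in self.apply_queue:
            self.data[key] = value
        self.apply_queue.clear()


def write(origin, others, key, value, consistency="EVENTUAL"):
    origin.data[key] = value
    for member in others:
        member.apply_queue.append((key, value))   # delivered, not yet applied
        if consistency in ("AFTER", "BEFORE_AND_AFTER"):
            member.apply_pending()                # synchronization point at write


def read(member, key, consistency="EVENTUAL"):
    if consistency in ("BEFORE", "BEFORE_AND_AFTER"):
        member.apply_pending()                    # synchronization point at read
    return member.data.get(key)


a, b = ToyMember(), ToyMember()

write(a, [b], "x", 1)                  # EVENTUAL write on member a
stale = read(b, "x")                   # EVENTUAL read on b: not applied yet
fresh = read(b, "x", "BEFORE")         # BEFORE read on b: backlog applied first

write(a, [b], "y", 2, "AFTER")         # AFTER write: applied everywhere on return
after = read(b, "y")                   # a plain read on b already sees it
```

In this model BEFORE puts the cost on the reader and AFTER puts it on the writer, which matches the READ and WRITE labels on the slides.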


can the transaction be committed ?

Certification


Certification

Certification is the process that only needs to answer one question:

can the write (transaction) be committed?

based on yet-to-be-applied transactions

such conflicts must come from other members/nodes

happens on every member/node and is deterministic

results are not reported to the group (does not require a new communication step)

pass: commit/queue to apply

fail: rollback/drop the transaction

serialized by the total order in GCS/XCOM + GTID

cost is based on trx size (# rows & # keys)
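The decision described above can be sketched as a toy first-committer-wins certifier in Python. This is a simplified model, not the plugin's implementation: the class name, the integer sequence numbers and the string "hashes" are assumptions made for illustration.

```python
class ToyCertifier:
    """Toy first-committer-wins certification over writeset hashes."""

    def __init__(self):
        self.seq = 0            # position in the total order of certified trx
        self.last_writer = {}   # writeset hash -> seq that last changed it

    def certify(self, snapshot_seq, writeset):
        """Return True (commit / queue to apply) or False (rollback / drop).

        snapshot_seq: last certified seq the transaction had seen when it ran.
        writeset: hashes identifying the rows the transaction changed.
        """
        for h in writeset:
            if self.last_writer.get(h, 0) > snapshot_seq:
                return False    # another transaction changed this row first
        self.seq += 1
        for h in writeset:
            self.last_writer[h] = self.seq
        return True


# Every member sees the same ordered stream and runs the same code, so all
# members reach the same verdicts without exchanging the results.
stream = [(0, {"t2:1"}), (0, {"t2:1"}), (0, {"t2:2"}), (2, {"t2:1"})]
member_a, member_b = ToyCertifier(), ToyCertifier()
verdicts_a = [member_a.certify(s, ws) for s, ws in stream]
verdicts_b = [member_b.certify(s, ws) for s, ws in stream]
```

The second transaction fails because it touches the same row as the first without having seen it, while the fourth passes because its snapshot already includes that change. The loop over the writeset is also why the cost grows with the number of rows and keys.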


terminology

Write vs Writeset


Let's illustrate a table:


Now let's make a change


and at commit time:


Writesets

Contain the hashes of the PKs of the changed rows and, in some cases, the hashes of foreign keys or other dependencies that need to be captured (e.g. non-NULL UKs). Writesets are gathered during transaction execution.

Writes

Also called write values, these refer to the actual changes. Write values are also gathered during transaction execution.


Writeset - examples

+-------+-----------+------+-----+---------+-------+
| Field | Type      | Null | Key | Default | Extra |
+-------+-----------+------+-----+---------+-------+
| id    | binary(1) | NO   | PRI | NULL    |       |
| name  | binary(2) | YES  |     | NULL    |       |
+-------+-----------+------+-----+---------+-------+

mysql> insert into t2 values (1,2);

pke: PRIMARY | test | t2 | 1 | 1 hash: 11853456929268668462

mysql> update t2 set name=3 where id=1;

pke: PRIMARY | test | t2 | 1 | 1 hash: 10002085147685770725
pke: PRIMARY | test | t2 | 1 | 1 hash: 10002085147685770725


Writeset - examples (2)+-------+-----------+------+-----+---------+-------+| Field | Type | Null | Key | Default | Extra |+-------+-----------+------+-----+---------+-------+| id | binary(1) | NO | PRI | NULL | || name | binary(2) | YES | UNI | NULL | || name2 | binary(1) | YES | | NULL | |+-------+-----------+------+-----+---------+-------+

mysql> insert into t3 values (1,2,3);

pke: PRIMARY | test |t3 | 1 | 1 hash: 79134815725924853pke: name | test |t3 | 2 hash: 11034644986657565827

Copyright @ 2019 Oracle and/or its affiliates. All rights reserved.

160 / 191

Page 161: 1 / 191 - DataOps Barcelona | Databases · GR is a plugin for MySQL, made by MySQL and packaged with MySQL GR is an implementation of Replicated Database State Machine theory Paxos

Writeset - examples (2)+-------+-----------+------+-----+---------+-------+| Field | Type | Null | Key | Default | Extra |+-------+-----------+------+-----+---------+-------+| id | binary(1) | NO | PRI | NULL | || name | binary(2) | YES | UNI | NULL | || name2 | binary(1) | YES | | NULL | |+-------+-----------+------+-----+---------+-------+

mysql> insert into t3 values (1,2,3);

pke: PRIMARY | test |t3 | 1 | 1 hash: 79134815725924853pke: name | test |t3 | 2 hash: 11034644986657565827

mysql> update t3 set name=3 where id=1;

Copyright @ 2019 Oracle and/or its affiliates. All rights reserved.

161 / 191

Page 162: 1 / 191 - DataOps Barcelona | Databases · GR is a plugin for MySQL, made by MySQL and packaged with MySQL GR is an implementation of Replicated Database State Machine theory Paxos

Writeset - examples (2)+-------+-----------+------+-----+---------+-------+| Field | Type | Null | Key | Default | Extra |+-------+-----------+------+-----+---------+-------+| id | binary(1) | NO | PRI | NULL | || name | binary(2) | YES | UNI | NULL | || name2 | binary(1) | YES | | NULL | |+-------+-----------+------+-----+---------+-------+

mysql> insert into t3 values (1,2,3);

pke: PRIMARY | test |t3 | 1 | 1 hash: 79134815725924853pke: name | test |t3 | 2 hash: 11034644986657565827

mysql> update t3 set name=3 where id=1;

pke: PRIMARY | test | t3 | 1 | 1 hash: 79134815725924853pke: name | test | t3 | 3 hash: 18082071075512932388pke: PRIMARY | test | t3 | 1 | 1 hash: 79134815725924853pke: name | test | t3 | 2 hash: 11034644986657565827

Copyright @ 2019 Oracle and/or its affiliates. All rights reserved.

162 / 191

Page 163: 1 / 191 - DataOps Barcelona | Databases · GR is a plugin for MySQL, made by MySQL and packaged with MySQL GR is an implementation of Replicated Database State Machine theory Paxos

Writeset - examples (2)

+-------+-----------+------+-----+---------+-------+
| Field | Type      | Null | Key | Default | Extra |
+-------+-----------+------+-----+---------+-------+
| id    | binary(1) | NO   | PRI | NULL    |       |
| name  | binary(2) | YES  | UNI | NULL    |       |
| name2 | binary(1) | YES  |     | NULL    |       |
+-------+-----------+------+-----+---------+-------+

mysql> insert into t3 values (1,2,3);

pke: PRIMARY | test | t3 | 1 | 1 hash: 79134815725924853
pke: name | test | t3 | 2 hash: 11034644986657565827

mysql> update t3 set name=3 where id=1;

[after image]
pke: PRIMARY | test | t3 | 1 | 1 hash: 79134815725924853
pke: name | test | t3 | 3 hash: 18082071075512932388

[before image]
pke: PRIMARY | test | t3 | 1 | 1 hash: 79134815725924853
pke: name | test | t3 | 2 hash: 11034644986657565827
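
The t3 update above can be modelled with a small Python sketch. It only illustrates the idea that an UPDATE contributes one key per PRIMARY/UNIQUE index for both the after image and the before image. The helper names (stand_in_hash, row_keys, update_writeset) are hypothetical, the key string is simplified compared to the pke lines on the slide, and the hash is a stand-in, so the numbers it prints will not match the hashes shown above.

```python
import hashlib


def stand_in_hash(key: str) -> int:
    """64-bit stand-in hash; NOT the hash function MySQL uses."""
    digest = hashlib.blake2b(key.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def row_keys(schema, table, row, unique_indexes):
    """Build one key string per PRIMARY/UNIQUE index of a row image."""
    keys = []
    for index_name, columns in unique_indexes.items():
        values = [str(row[column]) for column in columns]
        keys.append(" | ".join([index_name, schema, table, *values]))
    return keys


def update_writeset(schema, table, before, after, unique_indexes):
    """Keys of the after image followed by keys of the before image."""
    keys = row_keys(schema, table, after, unique_indexes)
    keys += row_keys(schema, table, before, unique_indexes)
    return [(key, stand_in_hash(key)) for key in keys]


# Table t3 from the slide: PRIMARY KEY (id), UNIQUE KEY name (name).
T3_INDEXES = {"PRIMARY": ["id"], "name": ["name"]}
before_image = {"id": 1, "name": 2, "name2": 3}
after_image = {"id": 1, "name": 3, "name2": 3}

writeset = update_writeset("test", "t3", before_image, after_image, T3_INDEXES)
for key, key_hash in writeset:
    print(f"pke: {key} hash: {key_hash}")
```

As on the slide, the PRIMARY entry is identical in both images because id did not change, while the name entry differs between the after image (3) and the before image (2).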


Certification


Enhanced support for large transactions

Message Fragmentation


Fragmenting an outgoing message

(1). If the message size exceeds the maximum size that the user allows (group_replication_communication_max_message_size), the member fragments the message into chunks that do not exceed the maximum size.

(2). The member broadcasts each chunk to the group, i.e. forwards each chunk individually to XCom.
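
A minimal Python sketch of the fragmenting step, assuming a hypothetical Fragment record that carries a message id, a chunk index and the total chunk count. The real wire format used by Group Replication and XCom is not described on these slides, so every field name here is an assumption. The sketch treats a maximum size of 0 as "fragmentation disabled".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    message_id: int  # hypothetical: identifies the original message
    index: int       # hypothetical: position of this chunk
    total: int       # hypothetical: number of chunks in the message
    payload: bytes


def fragment(message_id: int, message: bytes, max_size: int) -> list:
    """Split a message into chunks that do not exceed max_size bytes."""
    if max_size < 0:
        raise ValueError("max_size must be >= 0")
    if max_size == 0 or len(message) <= max_size:
        # Small enough (or fragmentation disabled): send as a single piece.
        return [Fragment(message_id, 0, 1, message)]
    chunks = [message[i:i + max_size] for i in range(0, len(message), max_size)]
    return [
        Fragment(message_id, index, len(chunks), chunk)
        for index, chunk in enumerate(chunks)
    ]


# Each fragment would then be handed to the group communication layer
# individually, as in step (2).
fragments = fragment(7, b"abcdefghij", max_size=4)
for piece in fragments:
    print(piece.index, piece.total, piece.payload)
```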


Reassembling an incoming message - first fragment

(2). The members conclude that the incoming message is actually a fragment of a bigger message.

(3). The members buffer the incoming fragment because they conclude the fragment is a chunk of a still-incomplete message. (Fragments contain the necessary metadata to reach this conclusion.)


Reassembling an incoming message - second fragment

(5). Same as step 3.

(6). Same as step 4.


Reassembling an incoming message - last fragment

1. The members conclude that the incoming message is actually a fragment of a bigger message.

2. The members conclude that the incoming fragment is the last chunk missing, reassemble the original, whole message, and process it.
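
The receiving side can be sketched in Python as a buffer keyed by sender and message. This is an illustration under assumptions, not the plugin's implementation: the Reassembler class and the metadata passed to receive (sender, message_id, index, total) are hypothetical stand-ins for "the necessary metadata" that the slides mention.

```python
class Reassembler:
    """Buffers fragments until the last missing chunk of a message arrives."""

    def __init__(self):
        # (sender, message_id) -> {chunk index: payload}
        self._pending = {}

    def receive(self, sender, message_id, index, total, payload):
        """Return the whole message once complete, otherwise None."""
        if total == 1:
            # Not a fragment of a bigger message: process it directly.
            return payload
        parts = self._pending.setdefault((sender, message_id), {})
        parts[index] = payload
        if len(parts) < total:
            # Still incomplete: keep buffering.
            return None
        # Last missing chunk arrived: reassemble in order and release the buffer.
        del self._pending[(sender, message_id)]
        return b"".join(parts[i] for i in range(total))


reassembler = Reassembler()
first = reassembler.receive("member-1", 7, 0, 3, b"abcd")
second = reassembler.receive("member-1", 7, 1, 3, b"efgh")
last = reassembler.receive("member-1", 7, 2, 3, b"ij")
print(first, second, last)
```

The first two calls return None because the message is still incomplete; only the call that delivers the last missing chunk returns the reassembled message.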


Houston, we have a problem!

Flow Control


Flow Control

In Group Replication, every member sends statistics about its queues (applier queue and certification queue) to the other members. Each node then decides whether to slow down when it sees that a node has reached the threshold for one of the queues.

When group_replication_flow_control_mode is set to QUOTA on a node that sees another member of the cluster lagging behind (threshold reached), that node throttles its write operations to a quota. The quota is calculated from the number of transactions applied in the last second, and is then reduced below that by subtracting the "over the quota" messages from the last period.

This means that the threshold is NOT decided on the slow node: the node writing a transaction checks its own flow control threshold values and compares them to the statistics from the other nodes to decide whether to throttle.
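
The decision described above can be sketched in Python. This is a simplified reading of the slide text, not the formula the server actually uses: the function names, the shape of the statistics dictionary, the ">=" comparison and the floor of 1 are all assumptions. The 25000 defaults are the applier and certifier threshold values listed for MySQL 8.0.16 in this deck.

```python
def someone_is_lagging(stats_by_member, applier_threshold=25000,
                       certifier_threshold=25000):
    """The writer compares every member's reported queues to ITS OWN thresholds."""
    return any(
        stats["applier_queue"] >= applier_threshold
        or stats["certifier_queue"] >= certifier_threshold
        for stats in stats_by_member.values()
    )


def next_quota(applied_last_period, over_quota_last_period):
    """Start from what was applied last period, minus the over-quota messages."""
    # Assumption: never go below 1 so the writer is not blocked completely.
    return max(1, applied_last_period - over_quota_last_period)


def write_quota(stats_by_member, applied_last_period, over_quota_last_period):
    """Return a per-period write quota, or None when no throttling is needed."""
    if not someone_is_lagging(stats_by_member):
        return None
    return next_quota(applied_last_period, over_quota_last_period)


# Hypothetical statistics received from the other members.
stats = {
    "node2": {"applier_queue": 120, "certifier_queue": 0},
    "node3": {"applier_queue": 31000, "certifier_queue": 10},
}
print(write_quota(stats, applied_last_period=1000, over_quota_last_period=200))
```

Because node3 reports an applier queue above the writer's threshold, the writer limits itself to 800 transactions for the next period in this example.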


Flow Control - on writer

>quota


Flow Control - on all members


Flow Control - configuration variables

As of MySQL 8.0.16:

+-----------------------------------------------------+-------+
| Variable_name                                       | Value |
+-----------------------------------------------------+-------+
| group_replication_flow_control_applier_threshold    | 25000 |
| group_replication_flow_control_certifier_threshold  | 25000 |
| group_replication_flow_control_hold_percent         | 10    |
| group_replication_flow_control_max_quota            | 0     |
| group_replication_flow_control_member_quota_percent | 0     |
| group_replication_flow_control_min_quota            | 0     |
| group_replication_flow_control_min_recovery_quota   | 0     |
| group_replication_flow_control_mode                 | QUOTA |
| group_replication_flow_control_period               | 1     |
| group_replication_flow_control_release_percent      | 50    |
+-----------------------------------------------------+-------+


transaction's lifecycle in Group Replication

Summary


begin;
update table1 set c = 999 where id = 2;
update table1 set b = "eee" where id = 3;
commit;

Steps shown in the diagram, in order:

1. client blocks on commit...
2. writesets + gtid_event + write values are sent to the group
3. certify (on each of the three members)
4. writer: + GTID, binlog, commit finalized
5. other two members: + GTID, relaylog, then binlog


Thank you !

Any Questions ?
