Geographically Dispersed Percona XtraDB Cluster Deployment

Marco (the Grinch) Tusa
September 2017, Dublin


About me

Marco “The Grinch”
• Open source enthusiast
• Percona consulting Team Leader


Agenda

• What is PXC
• When nodes interact
• Let us clarify, geo dispersed - what to keep in mind then
• How to measure latency correctly
• Use the right way (sync/async)
• Use help like replication manager


What is PXC/Galera?

(Virtually) synchronous replication:
• True multi-master
• No slave lag
• No master-slave failover or VIP
• Multi-threaded appliers
• Automatic node provisioning
• Elastic scale (in – out)
• Geographically distributed (with segments)
• Mix with async replication

(Diagram labels: Web traffic, Balancer, Galera)


What PXC/Galera is NOT

• Not a write-scalable solution
• Not great for a high amount of parallel, small requests
• Not great for working with foreign keys
• Not good for sharding data (each node has the entire dataset)

(Diagram labels: Web traffic, Balancer, Galera)


What is a Node

• Standard MySQL replication: Master, Slave, Slave
• Galera MySQL replication: Node, Node, Node

(Diagram label: 9cba28fa-a8be-11e4-8f41-9f963e1dbf4f)


Segments

• A segment is a logical grouping of nodes.
• Replication between segments is optimized (writeset - some level of communication).
• Traffic and messaging are reduced.
• In case of SST, the donor is chosen by proximity.


More nodes, more problems

Use a two-phase commit, or distributed locking, with capacity formula:

m = n x o x t

(where m is messages/sec, n is the number of nodes, o is the number of operations to process, and t is the transaction throughput)
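The capacity formula above can be worked through with a few numbers. This is only a sketch: the function name and the workload figures (2 operations per transaction, 500 transactions per second) are made up for illustration and do not come from the talk.

```python
def messages_per_second(nodes: int, operations: int, throughput: float) -> float:
    """Return m = n x o x t, the slide's capacity formula."""
    return nodes * operations * throughput


# Hypothetical workload: 2 operations per transaction, 500 transactions/sec.
OPERATIONS = 2
THROUGHPUT = 500

for n in (3, 5, 7):
    m = messages_per_second(n, OPERATIONS, THROUGHPUT)
    print(f"{n} nodes -> {m:.0f} messages/sec")
```

With the workload held constant, the message rate grows linearly with the node count, which is the point of the slide title.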


When nodes interact

• Keepalive and checks for cluster health
• Writeset on writer commit
• Certification results
• Ack on local apply
• Flow Control
• IST/SST


Let us clarify, geo dispersed 1

Geo dispersed, or multi-site, cluster is a cluster configuration used to help ensure high system and application availability in the event of site disaster. In this configuration, servers are separated geographically and the physical storage (quorum disk or) DATA is synchronously replicated between sites. (http://www.expertglossary.com/storage/definition/geo-dispersed-cluster)


Let us clarify, geo dispersed 2

For some environments, latency is the sole focus of performance.

As an example of latency, the figure shows a network transfer, such as an HTTP GET request, with the time split into latency and data transfer components.
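The split between latency and data transfer can be illustrated with simple arithmetic. The sketch below is not from the talk: the function name, the 16 KiB payload, the 1 Gbit/s link and the two latency values are all assumptions chosen to show how the fixed latency term dominates for small messages.

```python
def transfer_time_ms(latency_ms: float, payload_bytes: int, bandwidth_mbit_s: float) -> float:
    """Total time = latency component + data transfer component (both in ms)."""
    data_transfer_ms = payload_bytes * 8 / (bandwidth_mbit_s * 1_000_000) * 1000
    return latency_ms + data_transfer_ms


# Hypothetical figures: a 16 KiB message on a 1 Gbit/s link.
PAYLOAD = 16 * 1024
BANDWIDTH = 1000  # Mbit/s

for latency in (1.0, 80.0):  # e.g. same site vs. remote site, illustrative only
    total = transfer_time_ms(latency, PAYLOAD, BANDWIDTH)
    print(f"latency {latency:5.1f} ms -> total {total:.3f} ms")
```

For a message of this size the transfer component stays around a tenth of a millisecond, so almost all of the elapsed time is latency.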


Geo dispersed

Geo dispersed is determined by the latency existing between nodes, NOT by the geographic location itself.


How to measure latency correctly 1

• wsrep_evs_repl_latency (It measures latency from the time point when a message is sent out to the time point when a message is received.)
• wsrep_replicated/wsrep_replicated
• netperf


How to measure latency correctly 2


PING


Why?


Brief digression

Ref: https://goo.gl/kDTYnW


Brief digression

• PING uses ICMP (Internet Control Message Protocol), NOT TCP over IP
• Default data size is 56 bytes plus header (8 bytes)


ping -M do -s 1473 -c 3 192.168.0.34


Not good enough!
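The payload sizes on this slide can be checked with a little arithmetic. The sketch below assumes IPv4 with a 20-byte header (no options), the 8-byte ICMP header mentioned above, and a standard 1500-byte Ethernet MTU; the function names are illustrative. Under those assumptions the 1473-byte payload used in the ping command is one byte too large to pass unfragmented.

```python
IPV4_HEADER = 20  # bytes, assuming an IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo header (the "header (8 bytes)" on the slide)


def packet_size(payload: int) -> int:
    """Size of the IP packet that carries an ICMP echo with this payload."""
    return payload + ICMP_HEADER + IPV4_HEADER


def max_ping_payload(mtu: int) -> int:
    """Largest `ping -s` value that fits in one packet for the given MTU."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def fits_without_fragmentation(payload: int, mtu: int = 1500) -> bool:
    """True if the packet can cross a link with this MTU when fragmentation is prohibited."""
    return packet_size(payload) <= mtu


print(packet_size(56))                   # default ping payload -> 84
print(max_ping_payload(1500))            # -> 1472
print(packet_size(1473))                 # -> 1501
print(fits_without_fragmentation(1473))  # -> False
```

A path with a smaller MTU somewhere in the middle (a tunnel or VPN, for example) lowers the largest payload that still gets through accordingly.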


Brief digression

TCP means Transmission Control Protocol and, as the name says, it is designed to control the data transmission happening between source and destination.

TCP implementations use the IP protocol encapsulation for the transmission of the data:

Looks the same as before, right?

WRONG!


Brief digression

A TCP implementation has several characteristics that make sense to summarize:
• Is stream oriented
• Establish a connection
• Monitor the data transfer
• Buffered transmission
• Unstructured stream
• Full-duplex connection
• Stream as a sequence of octets split in segments


Brief digression

TCP dispatcher uses a Dynamic Sliding Window

The dispatcher manages three pointers associated with each connection:
• The first pointer indicates the start of the sliding window
• The second pointer indicates the highest octet that can be dispatched
• The third pointer indicates the window limit
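The three pointers can be sketched as a small sender-side model. This is an interpretation, not a TCP implementation: the class and attribute names are hypothetical, the second pointer is modelled as the next octet to dispatch, and the window limit is derived as start plus window size. There is no retransmission, timing or congestion control here.

```python
from dataclasses import dataclass


@dataclass
class SlidingWindowSender:
    """Toy model of a sender tracking three positions in an octet stream."""

    window_size: int       # may change over time, hence "dynamic"
    start: int = 0         # first pointer: oldest octet not yet acknowledged
    next_to_send: int = 0  # second pointer: next octet to dispatch

    @property
    def limit(self) -> int:
        """Third pointer: octets at or beyond this position may not be sent yet."""
        return self.start + self.window_size

    def send(self, octets: int) -> int:
        """Dispatch up to `octets`, never past the window limit; return how many went out."""
        sent = max(0, min(octets, self.limit - self.next_to_send))
        self.next_to_send += sent
        return sent

    def acknowledge(self, up_to: int) -> None:
        """Cumulative acknowledgement: slide the window start forward."""
        if self.start < up_to <= self.next_to_send:
            self.start = up_to

    def resize(self, window_size: int) -> None:
        """Apply a new window size, for instance one advertised by the receiver."""
        self.window_size = window_size


sender = SlidingWindowSender(window_size=1000)
print(sender.send(1500))   # only the first 1000 octets fit in the window
print(sender.send(100))    # window is full, nothing more can be dispatched
sender.acknowledge(400)    # receiver confirms 400 octets, window slides
print(sender.send(1500))   # 400 more octets can now go out
```

The relevance to latency is that a full window stalls the sender until acknowledgements arrive, so every round trip on a slow link directly limits how much data can be in flight.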


How to measure latency correctly 3

Back to us

• Check for gateway MTU cut
  • ping -M do -s 1473 -c 3 192.168.0.34
• Consider sent and received messages (e.g. wsrep_replicated_bytes and wsrep_received_bytes)
• Check kernel settings for:
  • Buffering
  • Congestion control
  • Frame utilization
• Test with netperf (e.g.)
  • netperf -H 192.168.1.51 -t TCP_RR -v 2 -l 60 -- -b 2 -r 250K -R 1M -s 250K,10M -S 10K,256K
• Check the wsrep_evs_repl_latency value in SHOW GLOBAL STATUS LIKE 'wsrep%';
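Two of the checks above lend themselves to a short helper. The sketch assumes that wsrep_evs_repl_latency is reported as five slash-separated fields (minimum/average/maximum/standard deviation/sample size, in seconds), which should be confirmed against the Galera documentation for the version in use. The function names and every sample value below are invented for illustration.

```python
from typing import NamedTuple


class ReplLatency(NamedTuple):
    minimum: float
    average: float
    maximum: float
    stddev: float
    samples: int


def parse_evs_repl_latency(value: str) -> ReplLatency:
    """Split a 'min/avg/max/stddev/samples' status string into typed fields."""
    minimum, average, maximum, stddev, samples = value.strip().split("/")
    return ReplLatency(float(minimum), float(average), float(maximum),
                       float(stddev), int(samples))


def average_message_bytes(total_bytes: int, messages: int) -> float:
    """Average size per message, e.g. wsrep_replicated_bytes / wsrep_replicated."""
    return total_bytes / messages if messages else 0.0


# Hypothetical values, as they might be copied from SHOW GLOBAL STATUS output.
latency = parse_evs_repl_latency("0.000561/0.00187/0.0132/0.00211/42")
print(f"avg {latency.average * 1000:.2f} ms, max {latency.maximum * 1000:.2f} ms")
print(average_message_bytes(1_000_000, 400), "bytes per replicated message")
```

Knowing the average message size helps pick realistic request and response sizes for the netperf test instead of relying on small default probes.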


What is the right limit?

Depends on the usage

Balance: incoming writes/s vs. consistent reads


Last chance for (virtually) Synchronous

WAN settings:

evs.inactive_check_period = PT30S; evs.inactive_timeout = PT1M; evs.suspect_timeout = PT40S; evs.stats_report_period = PT3M;
evs.join_retrans_period = PT0.5S

! don't use PING

Master_Slave, not a very good option though:

wsrep_provider_options = "gcs.fc_limit = 256; gcs.fc_factor = 0.99; gcs.fc_master_slave = YES"
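The evs.* values above are written as ISO 8601 periods (PT30S, PT1M and so on). When comparing them against measured latency it can help to convert them to seconds. The parser below is a minimal sketch that handles only the hours/minutes/seconds form used on this slide; the function name is hypothetical and this is not how Galera itself parses the options.

```python
import re

# Matches periods such as PT30S, PT1M, PT0.5S or PT1H30M (time part only).
_PERIOD = re.compile(
    r"^PT(?:(\d+(?:\.\d+)?)H)?(?:(\d+(?:\.\d+)?)M)?(?:(\d+(?:\.\d+)?)S)?$"
)


def period_to_seconds(text: str) -> float:
    """Convert an ISO 8601 time period like 'PT1M' into seconds."""
    match = _PERIOD.match(text.strip())
    if not match or not any(match.groups()):
        raise ValueError(f"unsupported period: {text!r}")
    hours, minutes, seconds = (float(g) if g else 0.0 for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds


# Settings copied from the slide.
wan_settings = {
    "evs.inactive_check_period": "PT30S",
    "evs.inactive_timeout": "PT1M",
    "evs.suspect_timeout": "PT40S",
    "evs.stats_report_period": "PT3M",
    "evs.join_retrans_period": "PT0.5S",
}

for name, period in wan_settings.items():
    print(f"{name:28s} {period:7s} = {period_to_seconds(period):6.1f} s")
```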


Async replication kicks in

We can use almost the same models we used

The challenge is to shift from one Master-Node to a new one, or from a slave to another


Async replication ways

Standard binlog position using XID and wsrep_last_committed

+----------------------+---------+
| Variable_name        | Value   |
+----------------------+---------+
| wsrep_last_committed | 3282552 |
+----------------------+---------+

Binlog:
# at 544
#170920 19:26:56 server id 3306 end_log_pos 575 CRC32 0x3ae1edcd Xid = 3282552

• Simple to install/setup
• Nightmare to maintain
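The matching step shown above (finding the binlog event whose Xid equals wsrep_last_committed) can be sketched as follows. The helper name and regular expression are illustrative; the sample line and the status value are the ones from the slide. A real procedure would scan mysqlbinlog output on the new master and then use the corresponding end_log_pos as the replication start position.

```python
import re
from typing import Optional

# Captures the number in "Xid = <number>" as printed in mysqlbinlog event headers.
_XID = re.compile(r"\bXid = (\d+)")


def xid_from_binlog_event(line: str) -> Optional[int]:
    """Return the Xid found in a mysqlbinlog event header line, if any."""
    match = _XID.search(line)
    return int(match.group(1)) if match else None


# Values taken from the slide.
event = "#170920 19:26:56 server id 3306 end_log_pos 575 CRC32 0x3ae1edcd Xid = 3282552"
wsrep_last_committed = 3282552

xid = xid_from_binlog_event(event)
print("Xid:", xid)
print("matches wsrep_last_committed:", xid == wsrep_last_committed)
```

Doing this lookup by hand on every failover is a large part of why the slide calls the approach a nightmare to maintain.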


Async replication ways

Using GTID
• All nodes in a cluster will have the same GTID
• Moving a slave from one node to another can be automated

Existing: Yves Trudeau's solution: https://github.com/y-trudeau/Mysql-tools/tree/master/PXC
• Single link: DC1 -> DC2
• Multiple link: DC1 -> DC2 -> DC3


Conclusions

• Plan your network and DC-DC connectivity carefully
• Keep the number of nodes inside a PXC cluster to a minimum
• Test the latency on the network properly (not with ping)
• Use PXC/Galera replication between geo-distributed sites only if it is safe
• Do not hesitate to shift to async replication
• Use existing solutions to help you manage async replication between PXCs


Q&A


Contacts

To contact Me

[email protected]

[email protected]

To follow me

http://www.tusacentral.net/

http://www.percona.com/blog/

https://www.facebook.com/marco.tusa.94

@marcotusa

http://it.linkedin.com/in/marcotusa/

“Consulting = No mission refused!”