Street Signs to Watch for When Running Galera Clusters
© Copyright 2017 Pivotal Software, Inc. All rights Reserved. Version 1.0
September 2017
Highway Safety: Street Signs to Watch for When Running Galera Clusters
Thanks to: http://www.safetysign.com/
Get Comfortable!
Demo Starts in 3 ... 2 ... 1 ... Now
WHAT JUST HAPPENED?
Same app, different results
Marco Nicosia
● Principal Product Manager
● Pivotal Cloud Foundry (PCF)
● 2+ years on MySQL for PCF, all Galera
● We have another project working on Leader/Follower
Agenda
Galera Highway Safety
☑ Demo 1: Deadlocks
❏ Purpose of this Talk
❏ Demo 2: DDL under TOI
❏ Demo 3: Large DMLs
❏ Demo 4: Crashing Galera
❏ Mitigations
Why Intermediate?
Not a “How to install & use”
Not an “Advanced tuning”
This is “Make an informed decision”
Share our experiences with Galera
Not an advertisement
GOAL
Convince you that Galera is Great
… when used sparingly
Marco’s Rules
#1 People Don’t Read
#2 People Don’t Listen
Why Show Us This?
Any app that ignores a deadlock is an idiot
… or just a new developer
True! But not precisely “MySQL”
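A minimal sketch of what "not ignoring a deadlock" can look like in application code. Galera reports a conflict between writers on different nodes as MySQL error 1213 (deadlock), often at COMMIT time, so the whole transaction has to be retried. `DeadlockError` and `run_with_retry` are hypothetical names for illustration; a real driver raises its own exception type carrying that error number.

```python
import random
import time

# MySQL error code for ER_LOCK_DEADLOCK; Galera also uses it when a
# transaction loses certification against a write from another node.
ER_LOCK_DEADLOCK = 1213


class DeadlockError(Exception):
    """Hypothetical stand-in for a driver error carrying errno 1213."""
    errno = ER_LOCK_DEADLOCK


def run_with_retry(txn, max_attempts=5, base_delay=0.05, sleep=time.sleep):
    """Run txn() and retry it from the top when it reports a deadlock.

    txn must contain the whole transaction (BEGIN through COMMIT),
    because a deadlock rolls all of it back.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return txn()
        except DeadlockError:
            if attempt == max_attempts:
                raise
            # Exponential backoff with jitter so colliding writers spread out.
            delay = base_delay * (2 ** (attempt - 1)) * random.uniform(0.5, 1.5)
            sleep(delay)
```

The retry only helps if `txn` is safe to run more than once; anything with side effects outside the database needs its own care.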
Galera Limitations
● InnoDB only, no MyISAM
○ System tables are MyISAM
○ Must use CREATE USER/GRANT
● Doesn’t distribute locks
Learn more: http://galeracluster.com/documentation-webpages/limitations.html
Galera is Complex
Advertised as Multi-Master MySQL
Can be a great HA solution
… at the cost of complexity.
Demo #2: Large DDL
Concept: DDL modes
TOI vs. RSU
Learn more: http://galeracluster.com/videos/galera-cluster-best-practices-ddls-and-schema-upgrades/
WHAT JUST HAPPENED?
● One writer issued a large (~4min) DDL
● All other writers fully blocked
● Total Order Isolation serializes all DDLs
● … AND blocks all commits
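For contrast with TOI, here is a rough sketch of a Rolling Schema Upgrade plan. Under RSU the DDL is applied only on the node where it is issued, so it has to be repeated node by node, and the change needs to stay compatible with the old schema while the roll is in progress. `wsrep_OSU_method` is the Galera variable that selects the mode; the function names, node names, and the ALTER statement are assumptions for illustration.

```python
def rsu_statements(ddl):
    """Statements to run, in a single session, on one node."""
    return [
        "SET SESSION wsrep_OSU_method = 'RSU'",  # stop replicating DDL from this session
        ddl,                                     # applied locally only
        "SET SESSION wsrep_OSU_method = 'TOI'",  # restore the default mode
    ]


def rolling_plan(nodes, ddl):
    """Pair each node with its statements; nodes are handled one at a time."""
    return [(node, rsu_statements(ddl)) for node in nodes]


if __name__ == "__main__":
    # Hypothetical node names and DDL.
    plan = rolling_plan(
        ["node0", "node1", "node2"],
        "ALTER TABLE orders ADD COLUMN note VARCHAR(255) NULL",
    )
    for node, statements in plan:
        print(node)
        for statement in statements:
            print("   ", statement)
```

The plan is only a list of strings; actually executing it, and waiting for each node to catch up before moving to the next, is left to whatever tooling you already use.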
Demo #3: Large DML
Concept: Flow Control
Learn more: http://galeracluster.com/documentation-webpages/nodestates.html
WHAT JUST HAPPENED?
● One active writer committed a large DML
● All other writers were blocked
● Two nodes fell significantly behind
● Cluster snaps back to normal
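One way to avoid a single huge writeset is to split the work into many small transactions. The sketch below assumes an integer primary key named `id` and a hypothetical table name; it only builds the statements, each of which is meant to be committed on its own. Galera also has hard limits on writeset size (`wsrep_max_ws_rows`, `wsrep_max_ws_size`), but staying well under them is the point here.

```python
def id_ranges(min_id, max_id, batch_size):
    """Yield inclusive (low, high) primary-key ranges covering min_id..max_id."""
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    low = min_id
    while low <= max_id:
        high = min(low + batch_size - 1, max_id)
        yield low, high
        low = high + 1


def batched_deletes(table, min_id, max_id, batch_size):
    """Build one small DELETE per range instead of one giant statement."""
    return [
        f"DELETE FROM {table} WHERE id BETWEEN {low} AND {high}"
        for low, high in id_ranges(min_id, max_id, batch_size)
    ]


if __name__ == "__main__":
    # Hypothetical table and key range.
    for statement in batched_deletes("events", 1, 10, 4):
        print(statement)
```

Smaller batches give the other nodes a chance to apply each writeset before the next one arrives, which is what keeps flow control from kicking in.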
Distributed Reads
Myth: Fully synchronous replication
Fact: Certification-based replication is eventually consistent
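For the reads that really must see the latest committed write, Galera offers `wsrep_sync_wait`: with the value 1, a SELECT waits until the node has applied everything already replicated to it. A minimal sketch, assuming a DB-API style cursor and assuming the session default is 0; `sync_wait` and `RecordingCursor` are hypothetical names, and the recording cursor exists only so the example runs without a database.

```python
from contextlib import contextmanager


@contextmanager
def sync_wait(cursor, mask=1):
    """Turn on causality checks for the statements inside the with-block."""
    cursor.execute(f"SET SESSION wsrep_sync_wait = {int(mask)}")
    try:
        yield cursor
    finally:
        # Assumes the session default is 0 (no waiting).
        cursor.execute("SET SESSION wsrep_sync_wait = 0")


class RecordingCursor:
    """Stand-in for a real cursor; it only records what it is asked to run."""

    def __init__(self):
        self.executed = []

    def execute(self, statement):
        self.executed.append(statement)


if __name__ == "__main__":
    cursor = RecordingCursor()
    with sync_wait(cursor) as c:
        c.execute("SELECT balance FROM accounts WHERE id = 42")
    for statement in cursor.executed:
        print(statement)
```

Each critical read pays for the wait, so this is something to apply to the few queries that need it, not to every session.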
Synch vs. Certification
Myth: Multiple Copies of Data
Guarantees a writeset will commit
Applied subsequent to certification
Demo #4: Divergent Data
WHAT JUST HAPPENED?
Two nodes had nearly full disks
Single writer issued large DDL
Immediately failed on other two nodes
Leaving incompatible schema
Single writer eventually issued incompatible DML
Two nodes immediately crash and SST
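The failure above started with disk space that differed between nodes. A rough pre-flight check, under the assumption that a table rebuild needs about as much free space as the table currently occupies, plus a margin. How the per-node free space is collected is left out; the node names, sizes, and `safety_factor` are hypothetical.

```python
def nodes_short_on_space(free_bytes_by_node, table_bytes, safety_factor=1.5):
    """Return the nodes that may not have room for the DDL, sorted by name.

    Assumes the rebuild needs roughly table_bytes * safety_factor of free
    space on every node, not just on the node that issues the DDL.
    """
    required = int(table_bytes * safety_factor)
    return sorted(
        node for node, free in free_bytes_by_node.items() if free < required
    )


if __name__ == "__main__":
    gib = 1024 ** 3
    free = {"node0": 50 * gib, "node1": 2 * gib, "node2": 1 * gib}
    short = nodes_short_on_space(free, table_bytes=4 * gib)
    if short:
        print("Do not run the DDL; low on space:", ", ".join(short))
    else:
        print("All nodes have headroom")
```

The check has to cover every node, because the writer's own disk being fine is exactly the situation in the demo.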
Mitigations
Conclusion
Mitigations
Technology
● Restrict traffic to one node!
● Dedicated clusters
● Force InnoDB
● Disable table locks
● Force primary keys
● Tune Galera
● Maintain headroom
Education
● TOI vs. RSU
● Max txn size
● Deadlocks
● Primary keys
● Critical Reads
● Flow control
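Two of the mitigations above (InnoDB everywhere, a primary key on every table) are easy to audit. A minimal sketch, assuming the table metadata has already been fetched, for example from information_schema; `TableInfo`, `audit`, and the sample rows are hypothetical.

```python
from typing import NamedTuple


class TableInfo(NamedTuple):
    """One row of table metadata; field names are an assumption."""
    schema: str
    name: str
    engine: str
    has_primary_key: bool


def audit(tables):
    """List the tables that break the InnoDB-only or primary-key rules."""
    findings = []
    for t in tables:
        if t.engine.lower() != "innodb":
            findings.append(f"{t.schema}.{t.name}: engine {t.engine} is not InnoDB")
        if not t.has_primary_key:
            findings.append(f"{t.schema}.{t.name}: no primary key")
    return findings


if __name__ == "__main__":
    sample = [
        TableInfo("shop", "orders", "InnoDB", True),
        TableInfo("shop", "audit_log", "MyISAM", False),
        TableInfo("shop", "sessions", "InnoDB", False),
    ]
    for finding in audit(sample):
        print(finding)
```

Running something like this before an app is pointed at a cluster is cheaper than finding the offending table during an incident.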
Conclusions
Galera is powerful
Especially for those who cannot tolerate downtime
Not standard MySQL
Developers must be Galera-aware
Special Thanks
● Our Customers
● Laine Campbell
● The Percona support team
● Shatarupa Nandi
● Ben Laplanche
● Andrew Garner
● Joseph Palermo
● Samuel Serrano
● Morgan Fine
● Urvashi Reddy
● Difan Zhao
● Rob Dimsdale
● Christopher Hendrix
● Andrew Crump
… and the Barley cluster!
Marco & Pivotal Software
● Marco Nicosia, Twitter: @menicosia
● Pivotal Labs, Data and Cloud Foundry
● Over 2,000 employees in more than 20 locations globally
● Key customers include Comcast, Allstate, Ford, Citi, GE, Southwest, Verizon.
● Investors include GE, Dell, Ford, and Microsoft.
We’re Hiring:
San Francisco, New York, Toronto, Dublin, and London