MariaDB MaxScale
© MariaDB Corporation Ab
MariaDB MaxScale - The intelligent database proxy
Anders Karlsson
MariaDB Principal Sales Engineer
Agenda
●About Anders Karlsson
●Why do we need MaxScale
●MaxScale architecture
○Plugin types
○Use cases
●MaxScale Roadmap and future
●Questions? Answers!
About Anders Karlsson
• Has been in the database business for
more than 30 years
• Has worked as a support engineer, porting
engineer, performance consultant, trainer
but mainly as a Sales Engineer
• Has worked for Oracle, Informix, MySQL etc.
• Joined MySQL in 2004 and
MariaDB in 2012
• Outside work, is also a beer aficionado, has
an interest in old cars and computers and
tries to spend as much time as possible
with his twins
The need for traffic control
• Servers, even low end ones, scale up –
More CPU, more RAM, faster Disks etc.
• Requirements change after an application
is already created
• We need to monitor, filter and audit
database requests
Server architecture challenges
●Cloud architecture
○Need to spread data and traffic across regions, seasons and servers
○Need to optimize for reads and writes
●More use cases
○Reporting servers
○Sharding
○Backup server
●Virtualization
○With virtualization, servers often get added, removed and changed more frequently
Server architecture challenges
Diagram: the application stack (Application Business Logic, Application Database Connection Management, MySQL/MariaDB connector) connects through Database Firewalls to MariaDB Replication, Data Sharding (ranges 1:100, 101:200, 201:300, 301:...) and MariaDB Galera Cluster.
● Load balance queries
● Send queries to node per load balancing
● Detect node failure and failover
● Detect node addition or removal
● Determine which shard to send the query to
● If the sharding schema changes, or a new shard is added,
○ update the application code
● Complex application
● Reduced Development
Agility and DBA Productivity
● Longer time to market
● For security requirements, an
additional database firewall is
needed
Scalability with MaxScale for MariaDB replication
Diagram: application stack (Application Business Logic, Application Database Connection Management, MySQL/MariaDB connector), MaxScale, MariaDB Galera Cluster, MariaDB Replication.
MaxScale: Connection and Statement based load balancing
● Simplifies applications
○ Improved Developer productivity
● Cluster configuration transparent to applications
○ Improved DBA productivity
● Monitors backend database high availability
○ Improved Business Continuity
● Load Balance queries
● Route queries to node per load balancing algorithm
● Detect node failure and failover
● Detect node addition or removal
● Send queries over a single server connection to MaxScale
MariaDB MaxScale: Lightweight,
Modular, Intelligent, Transparent
Diagram: MaxScale Core, Routing.
● MaxScale core is a high
performance message switch.
● Module APIs allow for multiple
modules of each type.
● Modules can be
client-facing, back-end
facing, internal, or connect
to other DBs.
● Eases customer-driven
development.
Flexible architecture, no
change to clients.
MaxScale Pluggable Architecture
●MaxScale is defined by the plugins / modules
○Without plugins, MaxScale does nothing
●Transparent to applications
●Uses the Linux epoll facility
○Non-blocking I/O notification
○Linux only
○Kernel 2.6 and up only
○Very scalable!
●Written in C and some C++
●Completely Open Source!
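To illustrate the non-blocking I/O notification model named above, here is a minimal Python sketch using the standard `selectors` module, which uses epoll where the platform provides it. It is not MaxScale code (MaxScale is written in C and C++); the socket pair and the `on_readable` callback are assumptions made for the example.

```python
import selectors
import socket

# DefaultSelector picks the most efficient mechanism available (epoll on Linux).
sel = selectors.DefaultSelector()

# A connected socket pair stands in for one client connection (assumption).
client_side, proxy_side = socket.socketpair()
proxy_side.setblocking(False)

received = []

def on_readable(sock):
    # Runs only after the kernel reported the socket as readable,
    # so recv() returns immediately instead of blocking.
    received.append(sock.recv(1024))

# Register interest in read events; the callback travels as user data.
sel.register(proxy_side, selectors.EVENT_READ, data=on_readable)

client_side.sendall(b"SELECT 1")

# One turn of the event loop: wait for notifications, then dispatch.
for key, _events in sel.select(timeout=1.0):
    key.data(key.fileobj)

sel.unregister(proxy_side)
client_side.close()
proxy_side.close()
sel.close()
```

A single thread can watch many sockets this way, which is the property the slide refers to as scalability.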
MaxScale Internals
Client and Back-End Protocols
● Client protocols let applications connect to MaxScale with different database languages and connections.
○ Initial support for MySQL and MariaDB connectors.
○ Future support planned for REST and others.
● Back-end protocols allow MaxScale to connect to different database and cluster technologies.
○ Initial support for MySQL Replication, MariaDB Replication, Galera Cluster, and MySQL Server with NDB storage engine.
Routing and Load Balancing
● Routing modules implement
fundamental proxy algorithms.
○ Connection based routing: establishes route upon
connection, doesn’t examine requests.
○ Statement based routing: examines requests to
dynamically route requests to the appropriate back-end.
● Modular design allows for load balancing, HA
failover, sharding, and other routing policies.
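The difference between the two routing styles can be sketched in a few lines of Python. This is an illustration only, not MaxScale's module API: the class names, the back-end names and the rule format are all hypothetical.

```python
from itertools import cycle

BACKENDS = ["server1", "server2"]  # hypothetical back-end names


class ConnectionRouter:
    """Chooses a back-end once, when the client connects."""

    def __init__(self, backends):
        self._next = cycle(backends)

    def open_session(self):
        # Requests on this session are never examined; they all go here.
        return next(self._next)


class StatementRouter:
    """Examines every request and chooses a back-end per statement."""

    def __init__(self, rules, default):
        self._rules = rules      # list of (predicate, backend) pairs
        self._default = default

    def route(self, sql):
        for predicate, backend in self._rules:
            if predicate(sql):
                return backend
        return self._default


conn_router = ConnectionRouter(BACKENDS)
session_a = conn_router.open_session()
session_b = conn_router.open_session()
session_c = conn_router.open_session()

is_select = lambda sql: sql.lstrip().upper().startswith("SELECT")
stmt_router = StatementRouter([(is_select, "server2")], default="server1")
```

The connection router does its work once per session, so it is cheap; the statement router pays a parsing cost on every request in exchange for finer control.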
Filtering, Logging, and Auditing
● Can log requests that meet filter criteria such as time-out, access to a specific row, etc.
● Configurable firewall blocks or allows requests based on policy.
● External filtering and transformation pipeline processes requests and results as they pass through MaxScale.
○ Allows for creation of parallel streams.
○ Transforms can alter both requests and returned data.
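A filter pipeline of this kind can be pictured as a chain of small functions, each of which may record, block, or pass on a request. The Python sketch below is a simplified illustration under that assumption; the function names and the "return None to block" convention are hypothetical and are not MaxScale's filter API.

```python
import re


def logging_filter(log):
    """Records every request, then passes it through unchanged."""
    def apply(query):
        log.append(query)
        return query
    return apply


def firewall_filter(deny_pattern):
    """Blocks any request that matches the deny pattern."""
    rule = re.compile(deny_pattern, re.IGNORECASE)

    def apply(query):
        if rule.search(query):
            return None  # None means "blocked" in this sketch
        return query
    return apply


def run_pipeline(filters, query):
    """Passes the query through each stage; stops as soon as one blocks it."""
    for stage in filters:
        query = stage(query)
        if query is None:
            return None
    return query


log = []
pipeline = [logging_filter(log), firewall_filter(r"\bDROP\s+TABLE\b")]
allowed = run_pipeline(pipeline, "SELECT * FROM t")
blocked = run_pipeline(pipeline, "drop table t")
```

Because the logging stage runs first here, blocked requests are still recorded; swapping the order would log only what the firewall lets through.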
Authentication and Monitoring
● Authentication modules let
MaxScale proxy user credentials.
○ In the future, they may deliver alternative
authentication such as Kerberos, OpenStack
Keystone, or PAM.
● Monitoring modules provide the
detailed back-end state
information MaxScale needs to make the
right routing decisions.
MariaDB MaxScale: Implementing
Advanced Database Services
● MaxScale services interface
to clients and servers
through protocols.
● Router modules implement
policy to determine which
servers are best able to
handle requests.
● Filters and logging modules
implement a pipeline that
can block, split, or transform
requests.
Simple clients, complex
architectures.
Diagram: Client Protocol, Routing, Filter/Log (.log), Server Protocol, Message Core & State Machine.
Read Scalability: MySQL Replication +
Connection Load Balancing
● Applications are “cluster aware”, split reads and writes on separate connections.
● Using connection load balancing, MaxScale routes the read/write connection to the master, and uses round-robin load balancing to route read-only connections to the slaves.
● MaxScale monitors the cluster, adjusting resource pools should a slave fail, or should MHA failover promote a slave to master.
MaxScale Load Balancing –
Concrete example
For applications that need read scalability and load balancing and can use separate connections for reads and writes.
Each client establishes one connection for writes and one for reads.
Diagram: three clients, MaxScale, one master and several slaves.
[write service]
type=service
router=readconnroute
router_options=master
servers=s1,s2,s3,..,sn
[read service]
type=service
router=readconnroute
router_options=slave
servers=s1,s2,s3,..,sn
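The behaviour of the two services above can be sketched in Python as follows. This is a simplified model of what the slides describe (one server chosen per connection, round-robin among servers with the requested role, monitor-driven status changes), not the actual implementation of the `readconnroute` module; the class name and the status dictionary are hypothetical.

```python
from itertools import count


class RoleBasedConnectionRouter:
    """Picks one server per new connection among those with the wanted role."""

    def __init__(self, servers, role):
        self.servers = servers  # shared dict: name -> {"role": ..., "up": ...}
        self.role = role
        self._counter = count()

    def candidates(self):
        return sorted(
            name for name, state in self.servers.items()
            if state["role"] == self.role and state["up"]
        )

    def connect(self):
        pool = self.candidates()
        if not pool:
            raise RuntimeError(f"no {self.role} available")
        # Round-robin over whatever the monitor currently reports as usable.
        return pool[next(self._counter) % len(pool)]


# Status table that a monitor would keep up to date (assumed layout).
servers = {
    "s1": {"role": "master", "up": True},
    "s2": {"role": "slave", "up": True},
    "s3": {"role": "slave", "up": True},
}
write_service = RoleBasedConnectionRouter(servers, "master")
read_service = RoleBasedConnectionRouter(servers, "slave")

write_target = write_service.connect()
read_targets = [read_service.connect() for _ in range(4)]

servers["s2"]["up"] = False  # the monitor notices that a slave failed
after_failure = [read_service.connect() for _ in range(2)]
```

Both services share one status table, so a role change made by the monitor (for example after a failover) is visible to the write and read services at the same time.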
RW Scalability, HA Failover: Galera
Cluster + Connection Load Balancing
● Applications can be simple, use one connection for reads and updates.
● Using connection load balancing, MaxScale load balances connections to each node in the Galera Cluster, round-robin.
● MaxScale monitors the cluster, routing requests only to fully synchronized nodes.
RW Splitting, Read Scalability: MySQL
Replication + Statement Load Balancing
● Applications can be simple, use one connection for reads and updates.
● Using statement load balancing, MaxScale dynamically routes writes to the master, and round-robin load balances read requests to the slaves.
● MaxScale monitors the cluster, routing requests only to fully operational nodes.
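As a rough picture of statement load balancing, the Python sketch below classifies each statement and sends reads to the slaves in round-robin order and everything else to the master. The classification rule is a deliberate simplification and an assumption of this example; a real read/write splitter also has to consider transactions, session state and other cases.

```python
import re
from itertools import cycle

# Simplified classification (assumption): only SELECT and SHOW count as reads.
READ_ONLY = re.compile(r"^\s*(SELECT|SHOW)\b", re.IGNORECASE)
# Locking reads must see the master's data, so treat them as writes.
LOCKING_READ = re.compile(r"\bFOR\s+UPDATE\b", re.IGNORECASE)


class ReadWriteSplitter:
    def __init__(self, master, slaves):
        self.master = master
        self._slaves = cycle(slaves)

    def route(self, sql):
        if READ_ONLY.match(sql) and not LOCKING_READ.search(sql):
            return next(self._slaves)  # round-robin over the slaves
        return self.master


splitter = ReadWriteSplitter("master", ["slave1", "slave2"])
statements = [
    "SELECT 1",
    "INSERT INTO t VALUES (1)",
    "select 2",
    "SELECT * FROM t FOR UPDATE",
    "SHOW TABLES",
]
routes = [splitter.route(sql) for sql in statements]
```

The application keeps a single connection; the split happens per statement inside the proxy.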
Query Logging For Performance
Diagnostics
DBAs or DevOps engineers can
capture performance problems:
1. MaxScale accepts a query from a client application.
2. Forwards the query to the back-end, logs it into the “All Queries” and “Long Running Queries” log files.
3. Receives the result from the back-end.
4. Forwards the result to the client, logs the result into the “All Queries” log, and if it is one of the N longest-running queries, logs the result into the “Long Running Queries” log file.
MaxScale is flexible. Rather than
logging, it could “tee” queries to
another service such as an
analytics warehouse using the
same technique.
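The "N longest-running queries" bookkeeping described in the steps above can be sketched with a small min-heap. This Python example is an illustration only: the `QueryLogger` class is hypothetical, in-memory lists stand in for the two log files, and the timing values are made up for the demonstration.

```python
import heapq


class QueryLogger:
    def __init__(self, n_longest):
        self.n = n_longest
        self.all_queries = []  # stands in for the "All Queries" log
        self._longest = []     # min-heap of (duration, query), at most n entries

    def record(self, query, duration_s):
        """Called once the result has come back and the duration is known."""
        self.all_queries.append((query, duration_s))
        entry = (duration_s, query)
        if len(self._longest) < self.n:
            heapq.heappush(self._longest, entry)
        elif entry > self._longest[0]:
            # Slower than the fastest query currently kept: replace it.
            heapq.heapreplace(self._longest, entry)

    def long_running(self):
        """Contents of the "Long Running Queries" log, slowest first."""
        return sorted(self._longest, reverse=True)


logger = QueryLogger(n_longest=2)
for query, seconds in [("q1", 0.1), ("q2", 2.5), ("q3", 0.7), ("q4", 1.2)]:
    logger.record(query, seconds)
slowest = logger.long_running()
```

Keeping only N entries in the heap means the cost per query stays small no matter how many queries pass through.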
Diagram: MaxScale with two log files, “All Queries” and “Long-Running Queries”; the numbers 1 to 4 mark where each step above takes place.
Query Transformation For Legacy
Application Compatibility
Modify queries from legacy
applications on the fly - for
example a MySQL 5.1 app:
1. MaxScale accepts a query from a MySQL 5.1 compatible client.
2. If the query matches the regular expression “/CREATE TABLE/” then MaxScale substitutes “ENGINE” for “TYPE” in that statement, else it passes the statement through the filter unchanged.
3. Forwards the transformed statement to MariaDB 10 or MySQL 5.6.
4. Receives the result from the back-end.
5. Forwards the result to the client.
Complex pipelines with multiple
stages that split, filter, and
transform queries are possible.
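Step 2 above amounts to a guarded regular-expression substitution. A minimal Python sketch follows; it uses Python's `re` module rather than MaxScale's own filter, and it narrows the substitution to `TYPE=` (an assumption added here so that a column that happens to be named `type` is left alone).

```python
import re

# Guard: only statements that contain CREATE TABLE are rewritten.
MATCH = re.compile(r"CREATE\s+TABLE", re.IGNORECASE)
# Rewrite TYPE only when it is used as a table option (TYPE=...).
TYPE_OPTION = re.compile(r"\bTYPE(\s*=)", re.IGNORECASE)


def transform(sql):
    if MATCH.search(sql):
        return TYPE_OPTION.sub(r"ENGINE\1", sql)
    return sql  # everything else passes through unchanged


legacy = "CREATE TABLE t (id INT, type INT) TYPE=MyISAM"
rewritten = transform(legacy)
untouched = transform("SELECT type FROM t")
```

The plain `s/TYPE/ENGINE/` shown on the slide would also rewrite the column name in this example, which is why the sketch anchors the pattern on the `=` sign.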
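Diagram: MaxScale applies the filter /CREATE TABLE/ with the substitution s/TYPE/ENGINE/ at step 2; steps 1 and 5 are on the client side, steps 3 and 4 on the back-end side.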
Query Duplication with MaxScale
Database A: MariaDB Replication Cluster
1. MaxScale accepts a query from a client application
2. Forwards the query to the back-end
○ Sends the query to Database A
○ Sends the query to Database B via Tee-filter
3. Receives the result from Database A
4. Receives the result from Database B
5. Forwards the result to the client
● Cross-DB solutions: Duplicate queries to MariaDB
Replication Cluster and Galera Cluster
● ETL: Use Tee-filter to insert into transactional
database and insert into analytic database via
queuing to ETL process
● Faster and smoother rollout of new services
○ Using different DB technology and different
workloads
Database B: MariaDB Galera Cluster
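The duplication flow can be sketched in Python as below. The `Backend` class and `tee_execute` function are hypothetical stand-ins, and the sketch assumes that the result from Database A is the one returned to the client while the result from Database B is discarded; the slides do not spell this out.

```python
class Backend:
    """Stand-in for a database cluster; records what it was asked to run."""

    def __init__(self, name):
        self.name = name
        self.received = []

    def execute(self, sql):
        self.received.append(sql)
        return f"{self.name}: ok"


def tee_execute(primary, secondary, sql):
    result = primary.execute(sql)  # this result goes back to the client
    secondary.execute(sql)         # duplicate; its result is dropped here
    return result


database_a = Backend("replication-cluster")
database_b = Backend("galera-cluster")
reply = tee_execute(database_a, database_b, "INSERT INTO orders VALUES (1)")
```

The client sees a single reply, so the second database can be added or removed without any change on the application side.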
MaxScale in Action – BinLog Router
Transparent MySQL replication relay
Horizontally Scale Slaves without Master Overload
Crash Safe Disaster Recovery
Better Parallel Replication
User facing application reads from slaves are most up to date
MaxScale in Action – Schema sharding
Diagram: the MaxScale Sharding Router routes to the sharded databases, Shard 1 to Shard 5.
Use Case: A business hosts databases for multiple customers on sharded servers. As the business grows, it adds new customers while continuing existing services.
Each customer has its own database schema.
Each server hosts up to 600 customers.
Client applications are aware of the database schema, but not of the hosting server.
● MaxScale Schema Shard Router Solution:
○ All applications connect to a single MaxScale
○ MaxScale routes to the shard server based on the query from the client
○ No impact on existing client applications when
■ A new client or shard server is added
■ Database schemas are moved between shard servers
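The idea behind schema-based routing is a lookup from schema name to hosting server that lives in the proxy instead of in every application. The Python sketch below is an illustration under that assumption; the `SchemaShardRouter` class, the schema and server names, and the simple regular expression used to find a schema-qualified table are all hypothetical.

```python
import re

# Finds a schema-qualified table such as customer_a.orders (simplified).
QUALIFIED_TABLE = re.compile(
    r"\b(?:FROM|INTO|UPDATE|JOIN)\s+(\w+)\.\w+", re.IGNORECASE
)


class SchemaShardRouter:
    def __init__(self, schema_to_server):
        self.schema_to_server = dict(schema_to_server)

    def route(self, sql, default_schema=None):
        match = QUALIFIED_TABLE.search(sql)
        schema = match.group(1) if match else default_schema
        if schema not in self.schema_to_server:
            raise LookupError(f"no shard known for schema {schema!r}")
        return self.schema_to_server[schema]

    def move_schema(self, schema, server):
        # Only the router's map changes; client queries stay the same.
        self.schema_to_server[schema] = server


router = SchemaShardRouter({"customer_a": "shard1", "customer_b": "shard2"})
before = router.route("SELECT * FROM customer_a.orders")
by_default = router.route("SELECT * FROM orders", default_schema="customer_b")
router.move_schema("customer_a", "shard3")
after = router.route("SELECT * FROM customer_a.orders")
```

The same query is routed to a different server after the move, which is the "no impact on existing client applications" property the slide claims.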
MariaDB MaxScale Today
GA since Jan 2015:
● Platforms supported - all 64-bit only:
○ CentOS 5,6,7, RHEL 5,6, Fedora 19,20
○ Debian 6,7, Ubuntu 12.04 LTS, 13.10, 14.04
○ OpenSUSE 13.1
● Open source (GPL v2) on GitHub.
● CLI and text-based administration, tooling.
● Supports Pacemaker and Heartbeat HA.
● Protocols, Authentication: MySQL/MariaDB connectors, authentication.
● Monitors: MySQL/MariaDB replication + MHA, Galera cluster.
● Routers: Connection and statement based load balancing, R/W splitting:
○ Complex replication hierarchies.
○ Server and service weighting.
○ Slave fault tolerance, slave consistency (lag) as routing decision parameter.
○ Server maintenance mode.
● Logging/Filters: “Tee” splitter, everything and time-based logging, regex-based filtering/firewall, regex query transformation.
MaxScale 1.1
● Scalability
○ Binlog relay: improved HA for master/slave replication
○ Schema based sharding
○ MySQL Cluster (NDB) connection load balancing
○ Named Server Routing
● Ease of Use
○ Integration with MDBE notification service
○ Nagios Plugin for monitoring
○ Hint based statement load balancing
○ Canonical Query Logging integrated with RabbitMQ
● Security
○ Database Firewall filter: Block SQL injection
Questions? Answers!
The question is not “What is
the answer?”, the question is
“What is the question?”.
Henri Poincaré