optimizer gets your data, fast How a database · How a database optimizer gets your data, fast...

52
How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe 2019 30 Sept - 2 Oct Amsterdam, Netherlands

Transcript of optimizer gets your data, fast How a database · How a database optimizer gets your data, fast...

Page 1: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

How a database optimizer gets your data,fast

Vicențiu CiorbaruSoftware Developer Team Lead @ MariaDB Foundation

Percona Live Europe 201930 Sept - 2 Oct Amsterdam, Netherlands

Page 2: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

2

whoami

● Vicențiu Ciorbaru

● MariaDB Foundation,Software Developer Team Lead

● MariaDB developer since 2013-…

● Implemented Roles, Window Functions and others

MariaDB Foundationhttps://mariadb.org

Page 3: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

3

Goal of a query optimizer● Produce a query plan that executes your query in the fastest time possible.

● Optimizer has many tools at its disposal:○ It can choose to pre-read tables○ Cache results (such as uncorrelated subqueries)○ Use indexes to look up values○ Use indexes to access data in-order and avoid sorting○ Rewrite a query (more on this later)○ And more...

● Number of possible plans grows exponentially with # tables

MariaDB Foundationhttps://mariadb.org

Page 4: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

4

Goal of a query optimizer● Not enough time to try out every possible plan

● In a "perfect world" any query should be performing as fast as possible.

● Many queries do!

● But sometimes…

MariaDB Foundationhttps://mariadb.org

Page 5: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

Statistics based optimizations

5MariaDB Foundationhttps://mariadb.org

Page 6: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

6

What optimizer can ask the engine

● “how many rows does the table havе?”

● “what would it cost to scan the whole table?”

● “what would it cost to read that many rows from this index?”

● “how many distinct key values are in this index?”

● “how many keys lie in that range of values in this index?”

MariaDB Foundationhttps://mariadb.org

Page 7: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

7

A problem: instability

● DBT-3 Q8: National Market Share Query — 8 InnoDB tables○ 4 different plans○ from 7 minutes to 1.2 hours

● DBT-3 Q7: Volume Shipping Query — 6 InnoDB tables○ 7 different plans○ from 12 minutes to many hours (and timeout)

MariaDB Foundationhttps://mariadb.org

Page 8: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

8

A problem: instability

● DBT-3 Q8: National Market Share Query — 8 InnoDB tables○ 4 different plans○ from 7 minutes to 1.2 hours

● DBT-3 Q7: Volume Shipping Query — 6 InnoDB tables○ 7 different plans○ from 12 minutes to many hours (and timeout)

● What about InnoDB persistent statistics?

MariaDB Foundationhttps://mariadb.org

Page 9: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

9

A problem: All engines are liars

MariaDB> CREATE TABLE t1 (a INT, b INT, c INT, KEY(a,b)) ENGINE=MyISAM;MariaDB> INSERT t1 VALUES (RAND()*100000,RAND()*100000,RAND()*100000);MariaDB> ... 400 000 rows …

MariaDB> SELECT COUNT(DISTINCT a) AS cardinality FROM t1;

+-------------+| cardinality |+-------------+| 97794 |+-------------+

MariaDB Foundationhttps://mariadb.org

Page 10: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

10

A problem: All engines are liars

MariaDB> SELECT cardinality FROM information_schema.statistics --> WHERE table_name='t1' AND column_name='a';

+-------------+| cardinality |+-------------+| 98304 |+-------------+

MariaDB> ALTER TABLE t1 ENGINE=InnoDB;MariaDB> SELECT cardinality FROM information_schema.statistics --> WHERE table_name='t1' AND column_name='a';+-------------+| cardinality |+-------------+| 196914 |+-------------+

MariaDB Foundationhttps://mariadb.org

Page 11: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

11

A problem: An index is required

● Takes storage space

● Needs to be updated for every INSERT / UPDATE / DELETE

● More indexes make INSERT / UPDATE / DELETE slow

MariaDB Foundationhttps://mariadb.org

Page 12: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

12

A problem: An index is required

● Takes storage space

● Needs to be updated for every INSERT / UPDATE / DELETE

● More indexes make INSERT / UPDATE / DELETE slow

● Too expensive if you only need statistics!

MariaDB Foundationhttps://mariadb.org

Page 13: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

13

Solution: Engine Independent Table Statistics

● Stable - Solves Instability

● Precise - Solves Storage Engine lying

● Detailed - Stores more information that most storage engines

● Identical for all engines

● Comparable - Format is "humanly readable"

MariaDB Foundationhttps://mariadb.org

Page 14: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

14

How does it work?

● new tables in the mysql schema

● @@use_stat_tables = [ never | complementary | preferably ]

● @@optimizer_use_condition_selectivity = 1 ... 5

● ANALYZE TABLE ... [ PERSISTENT FOR ... ]

MariaDB Foundationhttps://mariadb.org

Page 15: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

15

An example is worth a thousand words...

● https://dev.mysql.com/doc/employee/en/● https://github.com/datacharmer/test_db● 300,000 employees, 2.8 millions salary payments, 167 MB of data

MariaDB> source employees.sql

MariaDB> SET USE_STAT_TABLES=PREFERABLY;Query OK, 0 rows affected (0.00 sec)

MariaDB> ANALYZE TABLE departments, dept_emp, dept_manager, -> employees, salaries, titles;12 rows in set (18.49 sec)

MariaDB Foundationhttps://mariadb.org

Page 16: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

16

Example: per-column statisticsMariaDB> SET OPTIMIZER_USE_CONDITION_SELECTIVITY=1;Query OK, 0 rows affected (0.01 sec)

MariaDB> SELECT * FROM departments JOIN dept_emp USING (dept_no) -> JOIN employees USING (emp_no) -> JOIN titles USING (emp_no) -> WHERE title='Manager';24 rows in set (15.13 sec)

MariaDB Foundationhttps://mariadb.org

Page 17: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

17

Example: per-column statisticsMariaDB> SET OPTIMIZER_USE_CONDITION_SELECTIVITY=1;Query OK, 0 rows affected (0.01 sec)

MariaDB> SELECT * FROM departments JOIN dept_emp USING (dept_no) -> JOIN employees USING (emp_no) -> JOIN titles USING (emp_no) -> WHERE title='Manager';24 rows in set (15.13 sec)

MariaDB> SET OPTIMIZER_USE_CONDITION_SELECTIVITY=3;Query OK, 0 rows affected (0.01 sec)

MariaDB> SELECT * FROM departments JOIN dept_emp USING (dept_no) -> JOIN employees USING (emp_no) -> JOIN titles USING (emp_no) -> WHERE title='Manager';24 rows in set (0.53 sec)

MariaDB Foundationhttps://mariadb.org

Page 18: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

18

Example: per-column statistics+-------------+--------+---------+-------+----------+-----------------------+| table | type | key | rows | filtered | Extra |+-------------+--------+---------+-------+----------+-----------------------+| departments | ALL | NULL | 9 | 100.00 | || dept_emp | ref | dept_no | 36844 | 100.00 | || employees | eq_ref | PRIMARY | 1 | 100.00 | || titles | ref | PRIMARY | 1 | 100.00 | Using index condition |+-------------+--------+---------+-------+----------+-----------------------+

MariaDB> SET OPTIMIZER_USE_CONDITION_SELECTIVITY=3;+-------------+--------+---------+--------+----------+----------------------..| table | type | key | rows | filtered | Extra+-------------+--------+---------+--------+----------+----------------------.. | titles | ALL | NULL | 443308 | 14.29 | Using where| employees | eq_ref | PRIMARY | 1 | 100.00 || dept_emp | ref | PRIMARY | 1 | 100.00 || departments | ALL | NULL | 9 | 77.78 | Using where; Using join+-------------+--------+---------+--------+----------+----------------------..

MariaDB Foundationhttps://mariadb.org

Page 19: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

19

Example: per-column statistics+-------------+--------+---------+-------+----------+-----------------------+| table | type | key | rows | filtered | Extra |+-------------+--------+---------+-------+----------+-----------------------+| departments | ALL | NULL | 9 | 100.00 | || dept_emp | ref | dept_no | 36844 | 100.00 | || employees | eq_ref | PRIMARY | 1 | 100.00 | || titles | ref | PRIMARY | 1 | 100.00 | Using index condition |+-------------+--------+---------+-------+----------+-----------------------+

MariaDB> SET OPTIMIZER_USE_CONDITION_SELECTIVITY=3;+-------------+--------+---------+--------+----------+----------------------..| table | type | key | rows | filtered | Extra+-------------+--------+---------+--------+----------+----------------------.. | titles | ALL | NULL | 443308 | 14.29 | Using where| employees | eq_ref | PRIMARY | 1 | 100.00 || dept_emp | ref | PRIMARY | 1 | 100.00 || departments | ALL | NULL | 9 | 77.78 | Using where; Using join+-------------+--------+---------+--------+----------+----------------------..

MariaDB Foundationhttps://mariadb.org

Page 20: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

20

Example: per-column statistics+-------------+--------+---------+-------+----------+-----------------------+| table | type | key | rows | filtered | Extra |+-------------+--------+---------+-------+----------+-----------------------+| departments | ALL | NULL | 9 | 100.00 | || dept_emp | ref | dept_no | 36844 | 100.00 | || employees | eq_ref | PRIMARY | 1 | 100.00 | || titles | ref | PRIMARY | 1 | 100.00 | Using index condition |+-------------+--------+---------+-------+----------+-----------------------+

MariaDB> SET OPTIMIZER_USE_CONDITION_SELECTIVITY=3;+-------------+--------+---------+--------+----------+----------------------..| table | type | key | rows | filtered | Extra+-------------+--------+---------+--------+----------+----------------------.. | titles | ALL | NULL | 443308 | 14.29 | Using where| employees | eq_ref | PRIMARY | 1 | 100.00 || dept_emp | ref | PRIMARY | 1 | 100.00 || departments | ALL | NULL | 9 | 77.78 | Using where; Using join+-------------+--------+---------+--------+----------+----------------------..

MariaDB Foundationhttps://mariadb.org

Page 21: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

21

Equi-height Histograms

MariaDB Foundationhttps://mariadb.org

Page 22: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

22

Equi-height Histograms

MariaDB Foundationhttps://mariadb.org

Page 23: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

23

Example: non-uniform distributionMariaDB> SELECT (MIN(salary)+MAX(salary))/2 FROM salaries;+-----------------------------+| 98421.5000 |+-----------------------------+

MariaDB> SELECT * FROM employees JOIN salaries USING (emp_no) -> JOIN titles USING (emp_no) -> WHERE salary > 100000;166014 rows in set (18.79 sec)

MariaDB> SET OPTIMIZER_USE_CONDITION_SELECTIVITY=4;Query OK, 0 rows affected (0.00 sec)

MariaDB> SELECT * FROM employees JOIN salaries USING (emp_no) -> JOIN titles USING (emp_no) -> WHERE salary > 100000;166014 rows in set (6.64 sec)

MariaDB Foundationhttps://mariadb.org

Page 24: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

24

Example: per-column statisticsMariaDB> SET OPTIMIZER_USE_CONDITION_SELECTIVITY=3;+------+-----------+--------+---------+--------+----------+-------------+| id | table | type | key | rows | filtered | Extra |+------+-----------+--------+---------+--------+----------+-------------+| 1 | titles | ALL | NULL | 443308 | 100.00 | || 1 | employees | eq_ref | PRIMARY | 1 | 100.00 | || 1 | salaries | ref | PRIMARY | 9 | 48.68 | Using where |+------+-----------+--------+---------+--------+----------+-------------+

MariaDB> SET OPTIMIZER_USE_CONDITION_SELECTIVITY=4;+------+-----------+--------+---------+---------+----------+-------------+| id | table | type | key | rows | filtered | Extra |+------+-----------+--------+---------+---------+----------+-------------+| 1 | salaries | ALL | NULL | 2844047 | 4.76 | Using where || 1 | employees | eq_ref | PRIMARY | 1 | 100.00 | || 1 | titles | ref | PRIMARY | 1 | 100.00 | |+------+-----------+--------+---------+---------+----------+-------------+

MariaDB Foundationhttps://mariadb.org

Page 25: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

25

Example: per-column statisticsMariaDB> SET OPTIMIZER_USE_CONDITION_SELECTIVITY=3;+------+-----------+--------+---------+--------+----------+-------------+| id | table | type | key | rows | filtered | Extra |+------+-----------+--------+---------+--------+----------+-------------+| 1 | titles | ALL | NULL | 443308 | 100.00 | || 1 | employees | eq_ref | PRIMARY | 1 | 100.00 | || 1 | salaries | ref | PRIMARY | 9 | 48.68 | Using where |+------+-----------+--------+---------+--------+----------+-------------+

MariaDB> SET OPTIMIZER_USE_CONDITION_SELECTIVITY=4;+------+-----------+--------+---------+---------+----------+-------------+| id | table | type | key | rows | filtered | Extra |+------+-----------+--------+---------+---------+----------+-------------+| 1 | salaries | ALL | NULL | 2844047 | 4.76 | Using where || 1 | employees | eq_ref | PRIMARY | 1 | 100.00 | || 1 | titles | ref | PRIMARY | 1 | 100.00 | |+------+-----------+--------+---------+---------+----------+-------------+

MariaDB Foundationhttps://mariadb.org

Page 26: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

26

Example: per-column statisticsMariaDB> SET OPTIMIZER_USE_CONDITION_SELECTIVITY=3;+------+-----------+--------+---------+--------+----------+-------------+| id | table | type | key | rows | filtered | Extra |+------+-----------+--------+---------+--------+----------+-------------+| 1 | titles | ALL | NULL | 443308 | 100.00 | || 1 | employees | eq_ref | PRIMARY | 1 | 100.00 | || 1 | salaries | ref | PRIMARY | 9 | 48.68 | Using where |+------+-----------+--------+---------+--------+----------+-------------+

MariaDB> SET OPTIMIZER_USE_CONDITION_SELECTIVITY=4;+------+-----------+--------+---------+---------+----------+-------------+| id | table | type | key | rows | filtered | Extra |+------+-----------+--------+---------+---------+----------+-------------+| 1 | salaries | ALL | NULL | 2844047 | 4.76 | Using where || 1 | employees | eq_ref | PRIMARY | 1 | 100.00 | || 1 | titles | ref | PRIMARY | 1 | 100.00 | |+------+-----------+--------+---------+---------+----------+-------------+

MariaDB Foundationhttps://mariadb.org

Page 27: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

27

Example: samplingMariaDB> SET OPTIMIZER_USE_CONDITION_SELECTIVITY=4;

MariaDB> SELECT dept_name, AVG(salary) --> FROM departments JOIN dept_emp USING (dept_no) --> JOIN employees USING (emp_no) --> JOIN salaries USING (emp_no) --> WHERE last_name LIKE '%off' GROUP BY dept_name;9 rows in set (5.06 sec)

MariaDB> SET OPTIMIZER_USE_CONDITION_SELECTIVITY=5;

MariaDB> SELECT dept_name, AVG(salary) --> FROM departments JOIN dept_emp USING (dept_no) --> JOIN employees USING (emp_no) --> JOIN salaries USING (emp_no) --> WHERE last_name LIKE '%off' GROUP BY dept_name;9 rows in set (0.40 sec)

MariaDB Foundationhttps://mariadb.org

Page 28: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

28

Example: per-column statistics+-------------+--------+---------+-------+----------+---------------------+| table | type | key | rows | filtered | Extra |+-------------+--------+---------+-------+----------+---------------------+| departments | ALL | NULL | 9 | 100.00 | temporary; filesort || dept_emp | ref | dept_no | 36844 | 100.00 | || employees | eq_ref | PRIMARY | 1 | 100.00 | where || salaries | ref | PRIMARY | 9 | 100.00 | |+-------------+--------+---------+-------+----------+---------------------+

MariaDB> SET OPTIMIZER_USE_CONDITION_SELECTIVITY=5;+-------------+-------+---------+--------+----------+--------------------...| table | type | key | rows | filtered | Extra ...+-------------+-------+---------+--------+----------+--------------------...| employees | ALL | NULL | 300024 | 1.00 | where; temporary; filesort| dept_emp | ref | PRIMARY | 1 | 100.00 | index | departments | ALL | NULL | 9 | 77.78 | where; join buffer (flat,| salaries | ref | PRIMARY | 9 | 100.00 | +-------------+-------+---------+--------+----------+--------------------...

MariaDB Foundationhttps://mariadb.org

Page 29: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

29

Example: per-column statistics+-------------+--------+---------+-------+----------+---------------------+| table | type | key | rows | filtered | Extra |+-------------+--------+---------+-------+----------+---------------------+| departments | ALL | NULL | 9 | 100.00 | temporary; filesort || dept_emp | ref | dept_no | 36844 | 100.00 | || employees | eq_ref | PRIMARY | 1 | 100.00 | where || salaries | ref | PRIMARY | 9 | 100.00 | |+-------------+--------+---------+-------+----------+---------------------+

MariaDB> SET OPTIMIZER_USE_CONDITION_SELECTIVITY=5;+-------------+-------+---------+--------+----------+--------------------...| table | type | key | rows | filtered | Extra ...+-------------+-------+---------+--------+----------+--------------------...| employees | ALL | NULL | 300024 | 1.00 | where; temporary; filesort| dept_emp | ref | PRIMARY | 1 | 100.00 | index | departments | ALL | NULL | 9 | 77.78 | where; join buffer (flat,| salaries | ref | PRIMARY | 9 | 100.00 | +-------------+-------+---------+--------+----------+--------------------...

MariaDB Foundationhttps://mariadb.org

Page 30: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

30

Example: per-column statistics+-------------+--------+---------+-------+----------+---------------------+| table | type | key | rows | filtered | Extra |+-------------+--------+---------+-------+----------+---------------------+| departments | ALL | NULL | 9 | 100.00 | temporary; filesort || dept_emp | ref | dept_no | 36844 | 100.00 | || employees | eq_ref | PRIMARY | 1 | 100.00 | where || salaries | ref | PRIMARY | 9 | 100.00 | |+-------------+--------+---------+-------+----------+---------------------+

MariaDB> SET OPTIMIZER_USE_CONDITION_SELECTIVITY=5;+-------------+-------+---------+--------+----------+--------------------...| table | type | key | rows | filtered | Extra ...+-------------+-------+---------+--------+----------+--------------------...| employees | ALL | NULL | 300024 | 1.00 | where; temporary; filesort| dept_emp | ref | PRIMARY | 1 | 100.00 | index | departments | ALL | NULL | 9 | 77.78 | where; join buffer (flat,| salaries | ref | PRIMARY | 9 | 100.00 | +-------------+-------+---------+--------+----------+--------------------...

MariaDB Foundationhttps://mariadb.org

Page 31: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

31

MariaDB 10.0 (GA 2014)

● Per table: number of rows

● Per index: number of distinct values

● Per column:○ number of distinct values○ min and max values, amount of NULLs, average value length

● Equi-height histogram

● Sampling for LIKE predicates

MariaDB Foundationhttps://mariadb.org

Page 32: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

32

MySQL 8.0 (GA 2018)

● Optimizer Statistics

● Per column:○ Amount of NULLs○ Equi-height and “singleton” histograms○ Faster than MariaDB 10.0 on large tables○ Skips values to keep the data set small

MariaDB Foundationhttps://mariadb.org

Page 33: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

33

MariaDB 10.4 (GA 2019)

● Statistics Tables are Enabled by default

● Can collect histograms through sampling○ Uses Bernoulli Sampling○ Server variable analyze_sample_percentage○ Set to 0 to let MariaDB decide how many rows to sample.

● Uses the unsmoothed first-order jackknife estimator to estimate total cardinality.

● At least as fast as MySQL 8.0 for large tables

MariaDB Foundationhttps://mariadb.org

Page 34: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

Query rewriting

34MariaDB Foundationhttps://mariadb.org

Page 35: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

35

Background on optimizations

● A derived table is a table in the FROM clause, defined as a subquery.

SELECT * FROM (SELECT a FROM t1) der_t1;

MariaDB Foundationhttps://mariadb.org

Page 36: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

36

VIP Customers and their ordersselect *from vip_customers, (select * from orders where order_date between '2017-10-01' and '2017-10-31') as OCT_ORDERSwhere OCT_ORDERS.amount > 1000000 and OCT_ORDERS.customer_id = vip_customers.customer_id;

MariaDB Foundationhttps://mariadb.org

Page 37: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

37

Naive executionselect * from vip_customers, (select * from orders where order_datebetween '2017-10-01' and '2017-10-31') as OCT_ORDERSwhere OCT_ORDERS.amount > 1000000 and OCT_ORDERS.customer_id = vip_customers.customer_id;

orders

OCT_ORDERS

DATE FILTER

amount > 1000000 JOIN vip_customers

RESULTMariaDB Foundation

https://mariadb.org

Page 38: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

38

Derived Table Merge

select *from vip_customers vc, (select * from orders where order_date between '2017-10-01' and '2017-10-31' ) as OCT_ORDERSwhere OCT_ORDERS.amount > 1M and OCT_ORDERS.customer_id = vc.customer_id;

select *from vip_customers vc, orders owhere o.amount > 1M and o.customer_id = vc.customer_id and o.order_date between '2017-10-01' and '2017-10-31';

MariaDB Foundationhttps://mariadb.org

Page 39: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

39

Explain shows table is merged explain select *from vip_customers, (select * from orders where order_date between '2017-10-01' and '2017-10-31') as OCT_ORDERSwhere OCT_ORDERS.amount > 1000000 and OCT_ORDERS.customer_id = vip_customers.customer_id;

+----+-------------+---------------+------+..+---------+-------------+|id | select_type | table | type |..| rows | Extra |+----+-------------+---------------+------+..+---------+-------------+| 1 | SIMPLE | vip_customers | ALL |..| 101 | || 1 | SIMPLE | orders | ALL |..| 1000000 | Using where;|+----+-------------+---------------+------+..+---------+-------------+

MariaDB Foundationhttps://mariadb.org

Page 40: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

40

Execution after merge

ordersOCT_ORDERS

amount > 1MJOIN vip_customers

RESULT

select *from vip_customers vc, orders owhere o.amount > 1M and o.customer_id = vc.customer_id and o.order_date between '2017-10-01' and '2017-10-31';

MariaDB Foundationhttps://mariadb.org

Page 41: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

41

Execution after merge

ordersOCT_ORDERS

amount > 1MJOIN vip_customers

RESULT

select *from vip_customers vc, orders owhere o.amount > 1M and o.customer_id = vc.customer_id and o.order_date between '2017-10-01' and '2017-10-31';

Merging is good!It simplifies the

query!

MariaDB Foundationhttps://mariadb.org

Page 42: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

42

Execution after merge

ordersOCT_ORDERS

amount > 1MJOIN vip_customers

RESULT

select *from vip_customers vc, orders owhere o.amount > 1M and o.customer_id = vc.customer_id and o.order_date between '2017-10-01' and '2017-10-31';

Merging is good!It simplifies the

query!

Works in all stable MariaDB & MySQL

versions.

MariaDB Foundationhttps://mariadb.org

Page 43: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

43

Execution after merge

ordersOCT_ORDERS

amount > 1MJOIN vip_customers

RESULT

select *from vip_customers vc, orders owhere o.amount > 1M and o.customer_id = vc.customer_id and o.order_date between '2017-10-01' and '2017-10-31';

Merging is good!It simplifies the

query!

Works in all stable MariaDB & MySQL

versions.

Can not be used when aggregation

is present. :(

MariaDB Foundationhttps://mariadb.org

Page 44: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

create view OCT_TOTALS asselect customer_id, SUM(amount) as TOTAL_AMTfrom orderswhere order_date between '2017-10-01' and '2017-10-31'group by customer_id

select *from OCT_TOTALSwhere customer_id=1

Condition pushdown

44MariaDB Foundationhttps://mariadb.org

Page 45: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

create view OCT_TOTALS asselect customer_id, SUM(amount) as TOTAL_AMTfrom orderswhere order_date between '2017-10-01' and '2017-10-31'group by customer_id

select *from OCT_TOTALSwhere customer_id=1

Condition pushdown

45

There are a lot of customers and we only

want the data for 1.

MariaDB Foundationhttps://mariadb.org

Page 46: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

create view OCT_TOTALS asselect customer_id, SUM(amount) as TOTAL_AMTfrom orderswhere order_date between '2017-10-01' and '2017-10-31'group by customer_id

select *from OCT_TOTALSwhere customer_id=1

Condition pushdown

46

We can push the condition to the where

clause!

MariaDB Foundationhttps://mariadb.org

Page 47: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

create view OCT_TOTALS asselect customer_id, SUM(amount) as TOTAL_AMTfrom orderswhere order_date between '2017-10-01' and '2017-10-31'group by customer_id

select *from OCT_TOTALSwhere customer_id=1

Condition pushdown

47

This tactic works for Window Functions too!

MariaDB Foundationhttps://mariadb.org

Page 48: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

create view top_three_orders asselect * from ( select customer_id, amount, rank() over (partition by customer_id order by amount desc) as order_rank from orders) as ranked_orderswhere order_rank < 3

+-------------+--------+------------+| customer_id | amount | order_rank |+-------------+--------+------------+| 1 | 10000 | 1 || 1 | 9500 | 2 || 1 | 400 | 3 || 2 | 3200 | 1 || 2 | 1000 | 2 |.....................................

select * from top_three_orders where customer_id = 1

Condition pushdown through PARTITION BY

48MariaDB Foundationhttps://mariadb.org

Page 49: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

create view top_three_orders asselect * from ( select customer_id, amount, rank() over (partition by customer_id order by amount desc) as order_rank from orders) as ranked_orderswhere order_rank < 3

+-------------+--------+------------+| customer_id | amount | order_rank |+-------------+--------+------------+| 1 | 10000 | 1 || 1 | 9500 | 2 || 1 | 400 | 3 || 2 | 3200 | 1 || 2 | 1000 | 2 |.....................................

select * from top_three_orders where customer_id = 1

Condition pushdown through PARTITION BY

49MariaDB Foundationhttps://mariadb.org

Page 50: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

MariaDB 10.2, MySQL 8.0

■ Compute top_three_orders for all customers

■ Select rows with customer_id = 1

MariaDB 10.3 and onwards (and e.g. PostgreSQL)

■ Only compute top_three_orders for customer_id=1

■ This can be much faster!

■ Can make use of index(customer_id)

50

MariaDB - MySQL Comparison

MariaDB Foundationhttps://mariadb.org

Page 51: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

● This talk would not have been possible if it not for the MariaDB Foundation's sponsors.

51

Sponsors of MariaDB Foundation

MariaDB Foundationhttps://mariadb.org

Page 52: optimizer gets your data, fast How a database · How a database optimizer gets your data, fast Vicențiu Ciorbaru Software Developer Team Lead @ MariaDB Foundation Percona Live Europe

Thank you!

Contact details:[email protected]

About:https://mariadb.org/vicentiu

52MariaDB Foundationhttps://mariadb.org