Query Optimizer MySQL vs. PostgreSQL...13 2018-04-24 Query Optimizer –MySQL vs. PostgreSQL A cost...

41
BASEL BERN BRUGG DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. GENEVA HAMBURG COPENHAGEN LAUSANNE MUNICH STUTTGART VIENNA ZURICH @ ChrisAntognini antognini.ch/ blog Christian Antognini Query Optimizer MySQL vs. PostgreSQL Percona Live, Santa Clara (USA), 24 April 2018

Transcript of Query Optimizer MySQL vs. PostgreSQL...13 2018-04-24 Query Optimizer –MySQL vs. PostgreSQL A cost...

BASEL BERN BRUGG DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. GENEVA

HAMBURG COPENHAGEN LAUSANNE MUNICH STUTTGART VIENNA ZURICH

@ChrisAntognini antognini.ch/blog

Christian Antognini

Query Optimizer – MySQL vs. PostgreSQL

Percona Live, Santa Clara (USA), 24 April 2018

@ChrisAntognini

Query Optimizer – MySQL vs. PostgreSQL2 2018-04-24

Senior principal consultant, trainer and partner at Trivadis

[email protected]

http://antognini.ch

Focus: get the most out of database engines

Logical and physical database design

Query optimizer

Application performance management

Author of Troubleshooting Oracle Performance (Apress, 2008/14)

OakTable Network, Oracle ACE Director

Agenda

Query Optimizer – MySQL vs. PostgreSQL3 2018-04-24

1. Introduction

2. Configuration

3. Statistics about Data

4. Data Dictionary Metadata

5. Single-Table Access Paths

6. Joins and Sub-queries

7. Conclusion

8. References

Query Optimizer – MySQL vs. PostgreSQL4 2018-04-24

Introduction

Compared Products

Query Optimizer – MySQL vs. PostgreSQL5 2018-04-24

MySQL Community Server 8.0.11

Release date: 19 April 2018

Only the InnoDB engine is covered

PostgreSQL 10.3

Release date: 1 March 2018

Terminology

Query Optimizer – MySQL vs. PostgreSQL6 2018-04-24

In PostgreSQL the query optimizer is called planner

Disclaimer

Query Optimizer – MySQL vs. PostgreSQL7 2018-04-24

No performance tests were performed

No comparison between the execution plans generated by the two query

optimizers were performed

To compare and to evaluate the two query optimizers only the availability of

key features and the ability of the query optimizer to correctly recognize

common data patterns was considered

Aim of the Query Optimizer

Query Optimizer – MySQL vs. PostgreSQL8 2018-04-24

SQL is a declarative language

SQL doesn’t specify how to access

data, only which data to process

The strategy used to access data is

delegated to the query optimizer

The aim of the query optimizer is to

find out the optimal way to execute a

given SQL statement A

B

Inputs Considered to Produce an Execution Plan

Query Optimizer – MySQL vs. PostgreSQL9 2018-04-24

Query

Optimizer

SQL statement

Execution plan

Configuration

Metadata

Object statistics

Others?

How Is Data in Tables Stored?

Query Optimizer – MySQL vs. PostgreSQL10 2018-04-24

InnoDB uses a B-tree index Heap table

Query Optimizer – MySQL vs. PostgreSQL11 2018-04-24

Controlling the Query Optimizer

Configuration

Query Optimizer – MySQL vs. PostgreSQL12 2018-04-24

Three system variables control the

behavior of the query optimizer

Limit the number of evaluated

plans (2)

Control specific features

(1 parameter for 20 features)

The default (system) values can be

overwritten at session level and at

the statement level

Two dozen of parameters control the

behavior of the query optimizer

Limit the number of evaluated

plans (2)

Control specific features (15)

Control the genetic optimizer (7)

The default (system) values can be

overwritten at session level

Configuration – Cost Model

Query Optimizer – MySQL vs. PostgreSQL13 2018-04-24

A cost model database contains cost

estimate information for a number of

operations (8)

The default values can be changed

at the system level only

A number of parameters provide cost

estimate information for a number of

operations (11)

The default (system) values can be

overwritten at session level

Hints

Query Optimizer – MySQL vs. PostgreSQL14 2018-04-24

SELECT statement modifiers (4)

Impact statement syntax

Index hints (3)

Impact statement syntax

Cause error when index missing

Optimizer hints (21)

Similar to Oracle Database hints

Global, query block and object-level

Cause warning when syntax is wrong

(Available in EDB Advanced Server)

hints_modifiers.sql

hints_index.sql

hints_optimizer.sql

Query Optimizer – MySQL vs. PostgreSQL15 2018-04-24

Statistics about Data

Gathering Statistics

Query Optimizer – MySQL vs. PostgreSQL16 2018-04-24

The ANALYZE statement gathers

and, by default, stores statistics in

the data dictionary

By default, an asynchronous

automatic statistics recalculation

takes place

Persistent (default) as well as non-

persistent statistics exist

The ANALYZE statement gathers

and stores statistics in the data

dictionary

By default, the autovacuum daemon

recalculate statistics of modified

tables

Table Statistics

Query Optimizer – MySQL vs. PostgreSQL17 2018-04-24

Clustered index size (pages)

Number of rows

Table size (pages)

Number of rows

Number of pages marked all-visible

statistics.sql

Column Statistics

Query Optimizer – MySQL vs. PostgreSQL18 2018-04-24

Data distribution (optional)

Including fraction of entries that

are null

Fraction of values that are null

Average column width (bytes)

Number of distinct values

Statistical correlation between

physical and logical row ordering

Data distribution (optional)

Most common values and their

frequency (optional)

statistics.sql

Cross-Column Statistics

Query Optimizer – MySQL vs. PostgreSQL19 2018-04-24

Functional dependencies (optional)

Number of distinct values (optional)

statistics.sql

Index Statistics

Query Optimizer – MySQL vs. PostgreSQL20 2018-04-24

Index size (pages)

Number of leaf pages

Number of distinct keys

Several values are stored

E.g. for index “a,b,c”

“a”, “a,b”, “a,b,c”, “a,b,c,PK”

Index size (pages)

Number of indexed rows

statistics.sql

Query Optimizer – MySQL vs. PostgreSQL21 2018-04-24

Data Dictionary Metadata

Constraints – Primary Key and Unique Key

Query Optimizer – MySQL vs. PostgreSQL22 2018-04-24

Because of the clustered index, PK

has precedence over other indexes

Predicates based on UK take

precedence over other indexes

Equality predicates based on PK/UK

are probed

No particular precedence is given to

predicates based on PK/UK

Statistical correlation between

physical and logical row ordering

determines which index is used

constraints_pk_uk.sql

Constraints – Foreign Key

Query Optimizer – MySQL vs. PostgreSQL23 2018-04-24

No evidence about the usage of FK

to avoid unnecessary loss-less joins

has been observed

No evidence about the usage of FK

to avoid unnecessary loss-less joins

has been observed

constraints_fk.sql

Constraints – NOT NULL

Query Optimizer – MySQL vs. PostgreSQL24 2018-04-24

NOT NULL constraints are used to

verify the validity of predicates

By default the usage of NOT NULL

constraints to verify the validity of

predicates is enabled for specific

cases only

constraint_exclusion = partition

Statistics are used instead

constraints_not_null.sql

Constraints – CHECK

Query Optimizer – MySQL vs. PostgreSQL25 2018-04-24

No evidence about the usage of

CHECK constraints to verify the

validity of predicates has been

observed

Statistics are used instead

By default the usage of CHECK

constraints to verify the validity of

predicates is enabled for specific

cases only

constraint_exclusion = partition

Statistics are used instead

constraints_check.sql

Query Optimizer – MySQL vs. PostgreSQL26 2018-04-24

Single-Table Access Paths

Available Index Types

Query Optimizer – MySQL vs. PostgreSQL27 2018-04-24

Supported index types

B-tree (default)

R-tree (for spatial indexes)

For string columns, indexes can be

created on the leading part of

column values

B-tree indexes store NULL values

Support for invisible indexes

Supported index types

B-tree (default)

Hash, GiST, SP-GiST, GIN, BRIN

Indexes can be created on

expressions as well as on a subset

of the rows

B-tree indexes store NULL values

indexes_expression.sql

indexes_partial.sql

indexes_invisible.sql

indexes_prefix.sql

indexes_nulls.sql

Optimization of ORDER BY, MIN and MAX

Query Optimizer – MySQL vs. PostgreSQL28 2018-04-24

B-tree indexes can be used to

optimize ORDER BY, MIN and MAX

Index scans can be performed in

both directions

Keys are stored according to the

specified order

No NULLS FIRST/LAST support

B-tree indexes can be used to

optimize ORDER BY, MIN and MAX

Index scans can be performed in

both directions

Keys are stored according to the

specified order

NULLS FIRST/LAST supported

indexes_order_by.sql

indexes_min_max.sql

Merging Indexes

Query Optimizer – MySQL vs. PostgreSQL29 2018-04-24

Two or more B-tree indexes can be

merged at runtime to evaluate

multiple predicates combined with

AND or OR

When appropriate, B-tree indexes

are dynamically converted to

bitmaps in memory

One utilization of this feature is to

merge indexes to evaluate multiple

predicates combined with AND or

OR

indexes_merge.sql

(Declarative) Partitioning

Query Optimizer – MySQL vs. PostgreSQL30 2018-04-24

Available methods:

Multi-column range and list

Single-column hash

Sub-partitioning by hash

Only local indexes (PK/UK must

contain partition key)

FK not supported

Partition pruning

Available methods:

Multi-column range

Single-column list

Sub-partitioning

Only local indexes (no PK/UK)

FK not supported

Partition exclusion (pruning)

partitioning_range.sql

partitioning_list.sql

partitioning_hash.sql

Query Optimizer – MySQL vs. PostgreSQL31 2018-04-24

Joins and Sub-queries

Available Kind of Joins

Query Optimizer – MySQL vs. PostgreSQL32 2018-04-24

Available join methods

Nested loops join

(Hash join available in MariaDB)

Bushy plans are considered only

when no other possibility exists

Full outer joins are not supported

Available join methods

Nested loops join

Hash join

Merge join

Bushy plans are considered

Full outer joins are supported and optimized with hash/merge joins

joins_methods.sql

joins_syntax.sql

joins_bushy.sql

Sub-queries in WHERE Clause

Query Optimizer – MySQL vs. PostgreSQL33 2018-04-24

Simple sub-queries that are not

correctly optimized were observed

For optimal performance a rewrite

might be necessary

Problematic cases observed

Correlated NOT IN

Correlated (NOT) EXISTS

Simple sub-queries that are not

correctly optimized were observed

For optimal performance a rewrite

might be necessary

Problematic case observed

Correlated (NOT) IN

Uncorrelated NOT IN

joins_subqueries.sql

Query Optimizer – MySQL vs. PostgreSQL34 2018-04-24

Conclusion

Summary

Query Optimizer – MySQL vs. PostgreSQL35 2018-04-24

Good configuration capabilities

Hints available

Fairly good object statistics

Metadata only partially used

Good indexing capabilities, average

partition capabilities

Limited join capabilities

Good configuration capabilities

Hints missing

Very good object statistics

Metadata only partially used

Good indexing capabilities, average

partition capabilities

Good join capabilities

Query Optimizer – MySQL vs. PostgreSQL36 2018-04-24

Core Messages

The query optimizer of PostgreSQL is more advanced than the one of MySQL

In general, the query optimizer of MySQL can only do a good job with

transactional loads; the one of PostgreSQL is also suitable for analytical loads

Rate My Session

Query Optimizer – MySQL vs. PostgreSQL37 2018-04-24

Query Optimizer – MySQL vs. PostgreSQL38 2018-04-24

References

References (1)

Query Optimizer – MySQL vs. PostgreSQL39 2018-04-24

MySQL 8.0 Reference Manual

The Unofficial MySQL 8.0 Optimizer

Guide

MySQL Internals

MySQL Server Blog

PostgreSQL 10.1 Documentation

Planner source code “readme”

PostgreSQL Wiki

References (2)

Query Optimizer – MySQL vs. PostgreSQL40 2018-04-24

The verification scripts I wrote are available on GitHub

How Well a Query Optimizer Handles Subqueries?

Q&A

2018-04-24 Query Optimizer – MySQL vs. PostgreSQL41