Webinar slides: MySQL Query Tuning Trilogy: Indexing and EXPLAIN - deep dive


Transcript of Webinar slides: MySQL Query Tuning Trilogy: Indexing and EXPLAIN - deep dive

Copyright 2016 Severalnines AB


Your host & some logistics

I'm Jean-Jérôme from the Severalnines Team and I'm your host for today's webinar!

Feel free to ask any questions in the Questions section of this application or via the Chat box.

You can also contact me directly via the chat box or via email: [email protected] during or after the webinar.


About Severalnines and ClusterControl


What we do

Manage Scale

Monitor Deploy


ClusterControl Automation & Management

! Provisioning

! Deploy a cluster in minutes

! On-premises or in the cloud (AWS)

! Monitoring

! Systems view

! 1sec resolution

! DB / OS stats & performance advisors

! Configurable dashboards

! Query Analyzer

! Real-time / historical

! Management

! Multi cluster/data-center

! Automate repair/recovery

! Database upgrades

! Backups

! Configuration management

! Cloning

! One-click scaling


Supported Databases


Customers


MySQL Query Tuning - Indexing & EXPLAIN

September 27, 2016

Krzysztof Książek

Severalnines

[email protected]


Agenda

! Indexes

! How does a B-Tree actually work?

! MyISAM vs. InnoDB

! Different types of indexes

! EXPLAIN

! Overview

! EXPLAIN PARTITIONS

! EXPLAIN EXTENDED

! EXPLAIN JSON


Indexes - theory


! Let’s start with some theory

! MySQL uses multiple index types

! B-Tree and B+Tree indexes (MyISAM, InnoDB and Memory)

! Fulltext index (MyISAM and InnoDB)

! Spatial index (MyISAM and, since 5.7, InnoDB)

! Hash index (Memory)


Indexes - B-Tree index

! Let’s check how a B-Tree index is designed:


Indexes - B-Tree index

! We have a root node and four leaf pages

! Each entry in a leaf page has a pointer to a row related to this index entry

! Each leaf page has a pointer to the beginning of another leaf page

! This allows for a quick range scan of the index

! Index lookup is also pretty efficient
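The structure described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: a single root level, four leaf pages, unique keys, and made-up keys and row pointers (the slide's diagram is not part of this transcript). The names `LeafPage`, `build_index`, `lookup` and `range_scan` are hypothetical and exist only for this illustration.

```python
from bisect import bisect_right


class LeafPage:
    def __init__(self, entries):
        self.entries = entries  # sorted list of (key, row_pointer)
        self.next = None        # pointer to the beginning of the next leaf page


def build_index(pages):
    """Link the leaf pages and build a one-level root above them."""
    leaves = [LeafPage(p) for p in pages]
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    # Separator i is the smallest key stored in leaf i + 1.
    separators = [leaf.entries[0][0] for leaf in leaves[1:]]
    return separators, leaves


def find_leaf(index, key):
    separators, leaves = index
    return leaves[bisect_right(separators, key)]


def lookup(index, key):
    """Index lookup: root -> one leaf page -> row pointer."""
    leaf = find_leaf(index, key)
    return [ptr for k, ptr in leaf.entries if k == key]


def range_scan(index, low, high):
    """Range scan: find the first leaf, then follow the next-page pointers."""
    leaf = find_leaf(index, low)
    result = []
    while leaf is not None:
        for k, ptr in leaf.entries:
            if k > high:
                return result
            if k >= low:
                result.append(ptr)
        leaf = leaf.next
    return result


index = build_index([
    [(1, "row1"), (5, "row5")],
    [(10, "row10"), (15, "row15")],
    [(21, "row21"), (30, "row30")],
    [(40, "row40"), (45, "row45")],
])
print(lookup(index, 15))
print(range_scan(index, 5, 21))
```

The range scan touches the root only once; after that it walks from leaf to leaf, which is why the leaf-to-leaf pointers make range queries cheap.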


Indexes - B-Tree index

! Let’s insert a row with indexed value of 20

! It is added to the third leaf page as there’s still some free space there


Indexes - B-Tree index

! Let’s insert a row with indexed value of 50 - it won’t fit in the leaf page, so the page has to be split:


Indexes - B-Tree index

! Before and after the split


Indexes - B-Tree index

! Splitting a leaf node requires several writes to happen

! Splitting a root node requires even more writes as another level of nodes has to be created

! As you can see, adding data to the index is more expensive than just inserting a row - management of the index adds significant overhead

! You have to keep in mind that indexes are, on the one hand, great as they speed up reads

! On the other hand, they slow down write operations

! More indexes are not always better
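The insert and split steps from the previous slides can be modelled roughly as follows. The page capacity of four entries, the key values other than 20 and 50, and the write counts are assumptions made for the illustration; a real storage engine does more work per split (sibling pointers, free space management) than this sketch counts.

```python
from bisect import insort

PAGE_CAPACITY = 4  # assumed number of entries per leaf page, for illustration only


def insert(leaves, key):
    """Insert key into a list of sorted leaf pages; return the pages written."""
    # Pick the last page whose first key is <= key (or the first page).
    target = 0
    for i, page in enumerate(leaves):
        if page and page[0] <= key:
            target = i
    page = leaves[target]
    insort(page, key)
    if len(page) <= PAGE_CAPACITY:
        return 1  # only the leaf page is rewritten
    # Page overflow: move the upper half of the entries into a new page.
    mid = len(page) // 2
    new_page = page[mid:]
    del page[mid:]
    leaves.insert(target + 1, new_page)
    return 3  # old leaf + new leaf + parent node (new separator key)


leaves = [[1, 4, 7, 9], [10, 13, 15, 17], [18, 25, 30], [40, 42, 45, 48]]
print(insert(leaves, 20), leaves[2])   # fits into the third page
print(insert(leaves, 50), leaves[3:])  # last page is full and gets split
```

Even in this simplified model a split triples the number of page writes for a single insert, and every additional index on the table repeats that work.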


Query tuning - indexing - MyISAM vs. InnoDB

! Two ‘types’ of index

! Primary Key - always unique

! Secondary index - may or may not be unique

! In MyISAM primary and secondary indexes are structured in the same way

! They contain pointers to data

! InnoDB is different

! Primary key - it’s a clustered index

! Instead of pointer to the row, it contains full row

! Speeds up PK lookups significantly

! Secondary indexes contain PK values

! PK values are used to retrieve data via PK lookup

! Long PK may significantly affect performance
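The difference between the two layouts can be sketched with plain dictionaries. Dictionaries are not B-Trees; they are used here only to show what each index entry contains. The table, the column names and the byte sizes are hypothetical, and the size estimate ignores page and record overhead.

```python
# InnoDB-style layout: the clustered index (primary key) stores the full rows.
clustered_index = {
    1: {"id": 1, "email": "ann@example.com", "name": "Ann"},
    2: {"id": 2, "email": "bob@example.com", "name": "Bob"},
}
# A secondary index entry stores the indexed value plus the PK value.
secondary_index_email = {row["email"]: pk for pk, row in clustered_index.items()}


def innodb_lookup_by_email(email):
    pk = secondary_index_email[email]  # step 1: secondary index gives the PK value
    return clustered_index[pk]         # step 2: PK lookup returns the full row


# MyISAM-style layout: rows live in a separate data file and every index,
# primary or secondary, stores a pointer (here: a list position) to the row.
data_file = [
    {"id": 1, "email": "ann@example.com", "name": "Ann"},
    {"id": 2, "email": "bob@example.com", "name": "Bob"},
]
myisam_primary = {row["id"]: pos for pos, row in enumerate(data_file)}
myisam_secondary_email = {row["email"]: pos for pos, row in enumerate(data_file)}


def myisam_lookup_by_email(email):
    return data_file[myisam_secondary_email[email]]


def secondary_index_size(entries, key_bytes, pk_bytes):
    """Rough size: every secondary index entry carries a copy of the PK value."""
    return entries * (key_bytes + pk_bytes)


print(innodb_lookup_by_email("bob@example.com"))
print(secondary_index_size(1_000_000, 20, 4))   # e.g. a 4-byte integer PK
print(secondary_index_size(1_000_000, 20, 36))  # e.g. a 36-character string PK
```

Because the PK value is repeated in every secondary index entry, a long primary key inflates every secondary index on the table.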


Indexes - different types of indexes


Indexes - B-Tree

! B-Tree indexes can speed up queries which match a full value

! Index can cover multiple columns - a composite index (col1, col2, col3)

! It can be used by queries with matching leftmost prefix

! SELECT * FROM table WHERE col1 = '1234';

! Indexes can also work with leftmost part of the column:

! SELECT * FROM table WHERE col1 LIKE '1%';

! B-Tree indexes can be used for range queries

! SELECT * FROM table WHERE col1 > '1' AND col1 < '4';

! Leaf nodes in a B-Tree index contain pointers to the rows (or PK values) but they also contain the data for the indexed column itself

! It can be used to create covering indexes

! KEY covering (col1, col2)

! SELECT col1, col2 FROM table1 WHERE col1 = 123
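The leftmost-prefix rule follows from the way entries of a composite index are ordered. Below is a minimal sketch that models an index on (col1, col2, col3) as a sorted list of tuples; the data and the function names are invented for the example.

```python
from bisect import bisect_left

rows = [(2, "c", 40), (1, "b", 20), (3, "b", 50), (1, "a", 10), (2, "a", 30)]

# Composite index on (col1, col2, col3): entries are kept sorted by the whole tuple.
index = sorted(rows)


def seek_col1(value):
    """col1 = value is a leftmost prefix, so a binary search finds the start."""
    start = bisect_left(index, (value,))
    matches = []
    for entry in index[start:]:
        if entry[0] != value:
            break  # entries are sorted, nothing further can match
        matches.append(entry)
    return matches


def scan_col2(value):
    """col2 = value is not a leftmost prefix: matching entries are scattered,
    so every index entry has to be checked."""
    return [entry for entry in index if entry[1] == value]


print(seek_col1(2))
print(scan_col2("b"))

# Covering index: col1 and col2 are stored in the index entries themselves,
# so "SELECT col1, col2 ... WHERE col1 = 2" needs no row lookup at all.
print([(col1, col2) for col1, col2, _ in seek_col1(2)])
```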


Indexes - Full text indexes

! This is a type of index that can be used for text lookups

! They do not work by comparing values; it’s more like keyword searching

! Available in both MyISAM and InnoDB (since 5.6)

! Can’t be used for regular WHERE comparisons (=, LIKE) - queries have to use the MATCH (columns) AGAINST (expression) syntax

! You may use both FULLTEXT and B-Tree index on the same column


Indexes - Hash indexes

! Hash indexes (user controlled) are used only in MEMORY engine and can be used for exact lookups

! InnoDB uses HASH indexes as a part of Adaptive Hash Index - it’s not manageable by user

! For each row a hash is calculated on a column and then it’s used for lookups

! Let’s say hash value for a column is ‘123456’

! Data will be stored in index as a hash value -> pointer to the row

! Hash index most likely will be smaller than B-Tree, making it faster

! As long as there won’t be too many collisions

! Index can be used only for direct lookups

! It can’t be used for range queries

! Index on (col1, col2) can’t be used to locate data based only on ‘col1’ value
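These limitations are a direct consequence of how a hash index stores data. Here is a small sketch, assuming Python's built-in `hash()` as a stand-in for whatever hash function the engine uses, and a made-up two-column table.

```python
from collections import defaultdict

rows = [("a", 1), ("a", 2), ("b", 1)]  # (col1, col2)

# Hash index on (col1, col2): hash value -> pointers (list positions) to rows.
hash_index = defaultdict(list)
for position, (col1, col2) in enumerate(rows):
    hash_index[hash((col1, col2))].append(position)


def exact_lookup(col1, col2):
    """Direct lookup: hash the full key, then re-check the rows (collisions)."""
    candidates = hash_index.get(hash((col1, col2)), [])
    return [rows[p] for p in candidates if rows[p] == (col1, col2)]


def lookup_col1_only(col1):
    """hash((col1,)) says nothing about hash((col1, col2)), and hash values
    are not ordered, so the index cannot help: all rows are scanned."""
    return [row for row in rows if row[0] == col1]


print(exact_lookup("a", 2))
print(lookup_col1_only("a"))
```

The same argument rules out range queries: neighbouring key values end up at unrelated hash values, so there is nothing to scan in order.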


Indexes - gotchas

! Make sure you do not wrap the indexed column in a function in the WHERE clause:

! SELECT column1 FROM table WHERE LOWER(column2) = 'some value';

! This won’t use index on (column2)

! SELECT column1 FROM table WHERE column2 = UPPER('some value');

! This will use index on (column2)
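A rough way to see why the two queries differ, using a sorted Python list as a stand-in for the index on (column2). The values are invented and the comparison is case-sensitive, which is a simplification compared to MySQL's default collations.

```python
from bisect import bisect_left

# Index on (column2): the stored values, in sorted order.
column2_index = sorted(["ALPHA", "BETA", "SOME VALUE", "ZETA"])


def constant_side_function(value):
    """column2 = UPPER(value): the function runs once on the constant and
    the result is located with a binary search in the index."""
    needle = value.upper()
    i = bisect_left(column2_index, needle)
    return i < len(column2_index) and column2_index[i] == needle


def column_side_function(value):
    """LOWER(column2) = value: the index is sorted by column2, not by
    LOWER(column2), so the function has to run against every entry."""
    evaluated = 0
    found = False
    for entry in column2_index:
        evaluated += 1
        if entry.lower() == value:
            found = True
    return found, evaluated


print(constant_side_function("some value"))
print(column_side_function("some value"))
```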


Indexes - gotchas

! You should not mix data types

! SELECT * FROM table WHERE varcharcolumn = 12345;

! It won’t use index on (varcharcolumn)

! You can explicitly convert integer to a string

! SELECT * FROM table WHERE varcharcolumn = '12345';

! This one will use index on (varcharcolumn)
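One way to picture the problem: when a string column is compared to a number, each string is converted to a number, and several different strings convert to the same value. The sketch below uses Python's `float()` as a simplified stand-in for that conversion, with invented column values.

```python
# Index on (varcharcolumn): strings in string sort order.
index = sorted(["2", "12345", "012345", "12345.0", "12344"])
print(index)

# varcharcolumn = 12345: every string is converted to a number first.
numeric_matches = [i for i, s in enumerate(index) if float(s) == 12345]

# varcharcolumn = '12345': plain string comparison, a single index position.
string_matches = [i for i, s in enumerate(index) if s == "12345"]

print(numeric_matches)
print(string_matches)
```

The numeric matches are not adjacent in the string-ordered index, so there is no single position to seek to; the string comparison maps to exactly one place in the index.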


EXPLAIN


! EXPLAIN gives you an option to check the query execution plan


EXPLAIN

! ‘Type’ column - type of a join

! eq_ref - rows will be accessed using an index, one row is read from the table for each combination of rows from the previous tables.

! ref - multiple rows can be accessed for a given value so we are using standard, non-unique index to retrieve them

! index_merge - rows are accessed through the index merge algorithm. Multiple indexes are used to locate matching rows.


EXPLAIN

! ‘Type’ column - type of a join

! range - only rows from a given range are being accessed, index is used to select them

! index - full index scan is performed. It can be either a result of the use of a covering index or an index can be used to retrieve rows in a sorted order

! ALL - full table scan is performed


EXPLAIN

! ‘possible_keys’ and ‘key’ tell us about indexes which could be used for the particular query and the index chosen by the optimizer as the most efficient one

! ‘key_len’ tells about the length of the index (or a prefix of an index) that was chosen to be used in the query

! ‘ref’ column tells us which columns or constants are compared to the index picked by the optimizer. In our case this is ‘const’ as we are using a constant in the WHERE clause. This can be another column, for example in the case of a join. You can also see ‘func’ when we are comparing a result of some function
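As a rough aid for reading ‘key_len’, the value can be estimated from the column definitions. The helper below is a sketch covering only fixed-size and VARCHAR columns; the function names are hypothetical, and the 3 bytes per character assume the utf8 character set (utf8mb4 would need 4).

```python
def fixed_part(size_bytes, nullable=False):
    """Fixed-size column (e.g. SMALLINT = 2 bytes, INT = 4 bytes);
    a nullable column needs one extra byte."""
    return size_bytes + (1 if nullable else 0)


def varchar_part(chars, bytes_per_char=3, nullable=False):
    """VARCHAR(chars): maximum character bytes plus 2 bytes for the length."""
    return chars * bytes_per_char + 2 + (1 if nullable else 0)


# Index on a NOT NULL VARCHAR(45) column in utf8
print(varchar_part(45))

# Composite index on two NOT NULL SMALLINT columns:
print(fixed_part(2))                  # only the first column is used
print(fixed_part(2) + fixed_part(2))  # both columns are used
```

Comparing the reported ‘key_len’ with these numbers shows how many leading columns of a composite index the optimizer actually uses.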


EXPLAIN

! ‘rows’ gives us an estimate of how many rows the query will scan. This estimate is based on the index statistics from the storage engine, therefore it may not be precise

! ‘Extra’, prints additional information relevant to how the query is going to be executed. You can see here information about different optimizations that will be applied to the query

! Using Index - only index will be used to retrieve rows (covering index)

! Using filesort - additional pass is required to retrieve rows in a sorted order


EXPLAIN

! JOINs - EXPLAIN output looks more complex

! Three tables are involved in the query

! Start with the ‘actor’ table using ‘idx_actor_last_name’ index

! Then ‘film_actor’ table was joined using PK

! Finally, ‘film’ table was joined using PK

! Plan looks optimal

! 1x1x1 rows will be scanned
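The "1x1x1" figure is the product of the per-table ‘rows’ estimates, since for each row from one table the next table in the plan is accessed. A tiny sketch of that arithmetic; the second plan uses invented numbers purely for contrast.

```python
from math import prod


def estimated_rows(plan):
    """Multiply the per-table 'rows' estimates of a join plan."""
    return prod(rows for _table, rows in plan)


slide_plan = [("actor", 1), ("film_actor", 1), ("film", 1)]
print(estimated_rows(slide_plan))

# Hypothetical worse plan, numbers invented for illustration only.
worse_plan = [("actor", 200), ("film_actor", 27), ("film", 1)]
print(estimated_rows(worse_plan))
```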


EXPLAIN PARTITIONS

! One new column (‘partitions’) was added to EXPLAIN output

! It contains values ‘p10,p11,p12’

! Optimizer decided, based on WHERE clause, that only those partitions have to be accessed

! type: index - a PK scan will be used and then WHERE clause condition will be checked
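Partition pruning can be pictured as an interval overlap check. The sketch below assumes a hypothetical table that is RANGE-partitioned by month number into p01..p12; the actual partitioning scheme behind the slide is not part of this transcript.

```python
# Hypothetical partitions: name -> [low, high) range of the month number.
partitions = {f"p{month:02d}": (month, month + 1) for month in range(1, 13)}


def prune(low, high):
    """Partitions that can contain rows for: WHERE month BETWEEN low AND high."""
    return [
        name
        for name, (part_low, part_high) in partitions.items()
        if part_low <= high and part_high > low
    ]


print(prune(10, 12))
```

Only the partitions returned by the check have to be opened; within each of them the query still runs with whatever access type EXPLAIN reports.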


EXPLAIN EXTENDED

! EXPLAIN EXTENDED followed by SHOW WARNINGS shows the query as rewritten by the optimizer, which reveals more detail about the execution plan

! Please note <in_optimizer> in the SHOW WARNINGS output

! Subquery is not dependent therefore it can be materialized

! ( <materialize> (/* select#2 */ select 'M' from `employees`.`employees` `e1` where 1 )


EXPLAIN JSON

! EXPLAIN can print the output in JSON format

! Not only does the format change, but new content is added as well

! Information about used parts of the composite index

! Additional information about subquery

! Is it dependent?

! Is it materialized?


Thank You!

! Blog posts covering query tuning process:

! http://severalnines.com/blog/become-mysql-dba-blog-series-database-indexing

! http://severalnines.com/blog/become-mysql-dba-blog-series-using-explain-improve-sql-queries

! Register for the next part of our trilogy:

! http://severalnines.com/upcoming-webinars

! Install ClusterControl:

! http://severalnines.com/getting-started

! Contact: [email protected]