How MySQL Chooses the Execution Plan
![Page 1: How mysql choose the execution plan](https://reader035.fdocuments.us/reader035/viewer/2022070523/58ed6d841a28abbb518b4733/html5/thumbnails/1.jpg)
MySQL Index: How MySQL Chooses the Execution Plan
Li Xinhe @2016 July
![Page 2: How mysql choose the execution plan](https://reader035.fdocuments.us/reader035/viewer/2022070523/58ed6d841a28abbb518b4733/html5/thumbnails/2.jpg)
You’ve Made a Great Choice!
Understanding indexing is crucial for both developers and DBAs
Poor index choices are responsible for a large portion of production problems
Indexing is not rocket science
Though for the optimizer it may be:
Source code: almost 2M lines
Code shipped per year: “5 lines”
![Page 3: How mysql choose the execution plan](https://reader035.fdocuments.us/reader035/viewer/2022070523/58ed6d841a28abbb518b4733/html5/thumbnails/3.jpg)
Agenda
1. Quiz
2. Intro MySQL & Index
3. Tools for monitoring, analyzing and tuning queries
4. MySQL cost-based optimizer
5. ICP
6. Quiz Discussion
![Page 4: How mysql choose the execution plan](https://reader035.fdocuments.us/reader035/viewer/2022070523/58ed6d841a28abbb518b4733/html5/thumbnails/4.jpg)
Quiz: From the Garena Test for Software Developers
Which of the following queries can fully utilize the composite index "INDEX(a, b)" on the columns "a" and "b" in the "user" table? ______
A. SELECT * FROM user WHERE a=0 AND b=0;
B. SELECT * FROM user WHERE a=0 OR b=0;
C. SELECT * FROM user WHERE a>0 AND b=0;
D. SELECT * FROM user WHERE a=0 AND b>0;
![Page 5: How mysql choose the execution plan](https://reader035.fdocuments.us/reader035/viewer/2022070523/58ed6d841a28abbb518b4733/html5/thumbnails/5.jpg)
Quiz: Your Answer
A. a=0 AND b=0;
B. a=0 OR b=0;
C. a>0 AND b=0;
D. a=0 AND b>0;
![Page 6: How mysql choose the execution plan](https://reader035.fdocuments.us/reader035/viewer/2022070523/58ed6d841a28abbb518b4733/html5/thumbnails/6.jpg)
Quiz: My Answer
A. a=0 AND b=0;
B. a=0 OR b=0;
C. a>0 AND b=0;
D. a=0 AND b>0;
Official answer: AD
My answer: A, AD, ACD, or ABCD, depending on the scenario
![Page 8: How mysql choose the execution plan](https://reader035.fdocuments.us/reader035/viewer/2022070523/58ed6d841a28abbb518b4733/html5/thumbnails/8.jpg)
MySQL & Index
What are indexes for?
Speed up data access in the database
Help enforce constraints (UNIQUE, FOREIGN KEY)
Types of indexes:
B-Tree: the majority of the indexes we deal with in MySQL
R-Tree
HASH
FULLTEXT
![Page 9: How mysql choose the execution plan](https://reader035.fdocuments.us/reader035/viewer/2022070523/58ed6d841a28abbb518b4733/html5/thumbnails/9.jpg)
B+Tree Example
![Page 10: How mysql choose the execution plan](https://reader035.fdocuments.us/reader035/viewer/2022070523/58ed6d841a28abbb518b4733/html5/thumbnails/10.jpg)
Indexes in MyISAM vs InnoDB
MyISAM:
Index entries point to a physical offset in the data file
All indexes are equivalent
InnoDB:
Clustered index (primary key): stores the row data in the leaf pages, not a pointer
Secondary indexes: store the primary key value as the row pointer
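The two lookup paths can be sketched with a toy model in Python. Plain dictionaries stand in for B+Trees, and the table contents and the names `clustered`, `secondary_by_name`, and `lookup_by_name` are hypothetical. It only illustrates that an InnoDB secondary index lookup yields primary key values, which then need a second lookup in the clustered index.

```python
# Toy model: dicts stand in for B+Trees; all data is made up.

# Clustered index: primary key -> full row (the row lives in the leaf).
clustered = {
    1: {"id": 1, "name": "alice", "age": 30},
    2: {"id": 2, "name": "bob", "age": 25},
    3: {"id": 3, "name": "alice", "age": 41},
}

# Secondary index on name: key value -> list of primary keys (no row data).
secondary_by_name = {}
for pk, row in clustered.items():
    secondary_by_name.setdefault(row["name"], []).append(pk)


def lookup_by_name(name):
    """Two steps: the secondary index gives PKs, the clustered index gives rows."""
    pks = secondary_by_name.get(name, [])
    return [clustered[pk] for pk in pks]


rows = lookup_by_name("alice")
print([r["id"] for r in rows])
```

Because every secondary index entry carries a copy of the primary key, a longer primary key makes every secondary index larger in this model as well.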
![Page 11: How mysql choose the execution plan](https://reader035.fdocuments.us/reader035/viewer/2022070523/58ed6d841a28abbb518b4733/html5/thumbnails/11.jpg)
![Page 12: How mysql choose the execution plan](https://reader035.fdocuments.us/reader035/viewer/2022070523/58ed6d841a28abbb518b4733/html5/thumbnails/12.jpg)
Indexes
Multiple-column indexes, or composite indexes
KEY `index1` (`a`,`b`)
Still one B+Tree index
Index query vs post-filter
The storage engine (InnoDB) uses the index to find rows, then the MySQL server filters them further if needed
Overhead of indexing
Indexes must be updated on every write
They need more space on disk and in memory
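As a rough illustration of the "index query vs post-filter" split, a composite index can be modelled as a sorted list of `(a, b)` tuples. This is a toy sketch in Python, not how InnoDB is implemented; the data and the helper `range_scan` are assumptions made for the example.

```python
from bisect import bisect_left, bisect_right

# Toy composite index on (a, b): entries kept sorted by a, then by b.
index = sorted((a, b) for a in range(3) for b in range(4))
INF = float("inf")


def range_scan(lo, hi):
    """Return the index entries with lo <= entry <= hi (one contiguous slice)."""
    return index[bisect_left(index, lo):bisect_right(index, hi)]


# a = 0 AND b = 0: both columns bound the range, one entry is read.
exact = range_scan((0, 0), (0, 0))

# a = 0 AND b > 0: still one contiguous slice bounded by both columns
# (b holds integers here, so b > 0 starts at b = 1).
a_eq_b_gt = range_scan((0, 1), (0, INF))

# a > 0 AND b = 0: only a bounds the range; b = 0 is a post-filter.
scanned = range_scan((1, -INF), (INF, INF))
a_gt_b_eq = [entry for entry in scanned if entry[1] == 0]

print(len(exact), len(a_eq_b_gt), len(scanned), len(a_gt_b_eq))
```

In this model the last query walks 8 index entries to return 2, which is the kind of extra work a post-filter implies.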
![Page 13: How mysql choose the execution plan](https://reader035.fdocuments.us/reader035/viewer/2022070523/58ed6d841a28abbb518b4733/html5/thumbnails/13.jpg)
Impact on Cost of Indexing for InnoDB
Long PK: makes all secondary keys longer and slower
Random PK: insertions cause many page splits, which also reduces SSD lifetime
Low-selectivity index: e.g. an index on gender
Random read vs sequential read
Prefetching:
InnoDB read-ahead (innodb_read_ahead_threshold)
Oracle multiblock read
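To make the low-selectivity item in this list concrete, selectivity can be estimated as the number of distinct values divided by the number of rows. The sketch below uses made-up column data; `selectivity` is a hypothetical helper, not a MySQL function.

```python
def selectivity(values):
    """Distinct values divided by total rows; closer to 1.0 is more selective."""
    return len(set(values)) / len(values)


# Hypothetical column data for a 1,000-row table.
gender = ["M", "F"] * 500
user_id = list(range(1000))

print(selectivity(gender))   # 0.002
print(selectivity(user_id))  # 1.0
```

With two distinct values over 1,000 rows, an equality lookup on `gender` still matches about half the table, so the index saves little work.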
![Page 15: How mysql choose the execution plan](https://reader035.fdocuments.us/reader035/viewer/2022070523/58ed6d841a28abbb518b4733/html5/thumbnails/15.jpg)
Explain the “EXPLAIN”
id
select_type
SIMPLE, PRIMARY, SUBQUERY, DERIVED, UNION, UNION RESULT
type (best --> worst)
const, system > eq_ref, ref > range > index >> ALL
possible_keys, key, and rows
key_len: shows how much of a composite index is used
Extra
Using index: covering index
Using where: post-filter
Using temporary: a temporary table is used, e.g. for sorting or GROUP BY
Using filesort: the sort cannot be done using an index
Using index condition: ICP
![Page 16: How mysql choose the execution plan](https://reader035.fdocuments.us/reader035/viewer/2022070523/58ed6d841a28abbb518b4733/html5/thumbnails/16.jpg)
EXPLAIN
More data in MySQL 5.7
Try EXPLAIN FORMAT=JSON (MySQL 5.6)
![Page 17: How mysql choose the execution plan](https://reader035.fdocuments.us/reader035/viewer/2022070523/58ed6d841a28abbb518b4733/html5/thumbnails/17.jpg)
TRACE
EXPLAIN shows the selected plan
TRACE shows why the plan was selected:
Alternative plans
Estimated costs
Decisions made
JSON format
How to use (MySQL 5.6):
SET optimizer_trace="enabled=on";
Run the query
SELECT trace INTO DUMPFILE "/var/lib/mysql-files/trace1.log" FROM information_schema.optimizer_trace;
SET optimizer_trace="enabled=off";
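Since the trace is JSON, the cost estimates in it can be pulled out with a few lines of Python. The sketch below walks any nested JSON and collects every `cost` key. The `sample_trace` fragment is hand-written and heavily trimmed for illustration; it is not real server output, and real traces contain many more keys whose layout can differ between versions.

```python
import json


def collect_costs(node, path=""):
    """Recursively yield (path, cost) for every numeric 'cost' key in nested JSON."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "cost" and isinstance(value, (int, float)):
                yield path, value
            else:
                child = f"{path}.{key}" if path else key
                yield from collect_costs(value, child)
    elif isinstance(node, list):
        for i, item in enumerate(node):
            yield from collect_costs(item, f"{path}[{i}]")


# Hand-written illustrative fragment, NOT real optimizer output.
sample_trace = json.loads("""
{"range_analysis": {
   "table_scan": {"rows": 1000, "cost": 206.1},
   "range_scan_alternatives": [
     {"index": "a", "rows": 10, "cost": 13.01, "chosen": true}
   ]}}
""")

costs = dict(collect_costs(sample_trace))
for where, cost in costs.items():
    print(where, cost)
```

A dumped trace file could be passed through the same function after `json.load`, assuming the file holds one complete, untruncated JSON document.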
![Page 19: How mysql choose the execution plan](https://reader035.fdocuments.us/reader035/viewer/2022070523/58ed6d841a28abbb518b4733/html5/thumbnails/19.jpg)
MySQL Optimizer
![Page 20: How mysql choose the execution plan](https://reader035.fdocuments.us/reader035/viewer/2022070523/58ed6d841a28abbb518b4733/html5/thumbnails/20.jpg)
Cost-based Query Optimization: General idea
Assign cost to operations
Compute cost of partial or alternative plans
Search for plan with lowest cost
Cost-based optimizations:
Access method
Join order
Subquery strategy
Total Cost = IO cost + CPU cost
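The "assign costs, compare alternatives, pick the cheapest" loop can be sketched for join order with a deliberately simplified cost function. Everything here is an assumption made for illustration: the table names, the row counts, the single `JOIN_FACTOR`, and the rule that a plan costs the rows of its first table plus the size of each intermediate result. The real optimizer uses a far richer model and prunes the search instead of enumerating every order.

```python
from itertools import permutations

# Hypothetical row counts per table.
rows = {"orders": 100_000, "customers": 5_000, "countries": 200}
JOIN_FACTOR = 0.001  # assumed fraction of row pairs that survive each join


def plan_cost(order):
    """Toy cost: rows of the first table plus every intermediate result size."""
    current = rows[order[0]]
    cost = current
    for table in order[1:]:
        current = current * rows[table] * JOIN_FACTOR
        cost += current
    return cost


# Enumerate all join orders and keep the one with the lowest cost.
best = min(permutations(rows), key=plan_cost)
print(best, round(plan_cost(best)))
```

Under these made-up numbers, starting from the smallest table keeps the intermediate results small, which is why that order wins.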
![Page 21: How mysql choose the execution plan](https://reader035.fdocuments.us/reader035/viewer/2022070523/58ed6d841a28abbb518b4733/html5/thumbnails/21.jpg)
Input to Cost Model
IO-cost:
Estimates from storage engine based on number of pages to read
Both index and data pages
Schema:
Length of records and keys
Uniqueness for indexes
Nullability
![Page 22: How mysql choose the execution plan](https://reader035.fdocuments.us/reader035/viewer/2022070523/58ed6d841a28abbb518b4733/html5/thumbnails/22.jpg)
Input to Cost Model
Statistics:
Number of rows in table
Key distribution/Cardinality:
Average number of records per key value
Only for indexed columns
Maintained by storage engine
Number of records in an index range
![Page 23: How mysql choose the execution plan](https://reader035.fdocuments.us/reader035/viewer/2022070523/58ed6d841a28abbb518b4733/html5/thumbnails/23.jpg)
More on Cost Model
Not just minimizing the number of scanned rows
Lots of other heuristics and hacks:
Primary key is special for InnoDB
Covering index benefits
A full table scan can be faster than an index scan
An index can also be used for sorting
Where the data lives matters: in memory, on disk, or on SSD
Note: the chosen plan can change dynamically depending on the constants in the query and on the data

| | Memory | Disk | SSD |
|---|---|---|---|
| Table scan | 6.8 s | 36 s | 15 s |
| Index scan | 5.2 s | 2.5 hours | 30 min |
![Page 24: How mysql choose the execution plan](https://reader035.fdocuments.us/reader035/viewer/2022070523/58ed6d841a28abbb518b4733/html5/thumbnails/24.jpg)
Cost Model Example
SELECT * FROM t2 WHERE a BETWEEN x AND y;
Table scan:
IO cost : #pages in table
CPU cost : #rows * ROW_EVALUATE_COST
Range scan:
IO cost : #pages to read from index + #rows_in_range
CPU cost: #rows_in_range * ROW_EVALUATE_COST
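The two formulas above can be turned into a small calculator to see when the plan flips. This is a sketch of the slide's simplified model only. The value of `ROW_EVALUATE_COST` and all page and row counts are assumptions chosen for illustration, and the function names are hypothetical.

```python
ROW_EVALUATE_COST = 0.2  # assumed value for illustration; may differ by server version


def table_scan_cost(pages_in_table, rows_in_table):
    """IO: every page of the table. CPU: every row is evaluated."""
    return pages_in_table + rows_in_table * ROW_EVALUATE_COST


def range_scan_cost(index_pages, rows_in_range):
    """IO: index pages plus one row fetch per matching row. CPU: matching rows."""
    return (index_pages + rows_in_range) + rows_in_range * ROW_EVALUATE_COST


def cheaper_plan(pages_in_table, rows_in_table, index_pages, rows_in_range):
    """Return the name of the access method with the lower estimated cost."""
    table = table_scan_cost(pages_in_table, rows_in_table)
    rng = range_scan_cost(index_pages, rows_in_range)
    return "range scan" if rng < table else "table scan"


# Hypothetical table: 100 pages, 10,000 rows.
narrow = cheaper_plan(100, 10_000, index_pages=3, rows_in_range=1_000)
wide = cheaper_plan(100, 10_000, index_pages=5, rows_in_range=2_000)
print(narrow, "|", wide)
```

With these numbers the narrower range costs about 1203 against 2100 for the table scan, while the wider range costs about 2405, so widening a BETWEEN range can be enough to switch the chosen access method.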
![Page 25: How mysql choose the execution plan](https://reader035.fdocuments.us/reader035/viewer/2022070523/58ed6d841a28abbb518b4733/html5/thumbnails/25.jpg)
Cost Model Example EXPLAIN
EXPLAIN SELECT * FROM t2 WHERE a BETWEEN 50 AND 60;
EXPLAIN SELECT * FROM t2 WHERE a BETWEEN 50 AND 70;
![Page 26: How mysql choose the execution plan](https://reader035.fdocuments.us/reader035/viewer/2022070523/58ed6d841a28abbb518b4733/html5/thumbnails/26.jpg)
Cost Model Example TRACE
![Page 28: How mysql choose the execution plan](https://reader035.fdocuments.us/reader035/viewer/2022070523/58ed6d841a28abbb518b4733/html5/thumbnails/28.jpg)
ICP: Index Condition Pushdown
Main idea:
Use index data to evaluate the WHERE clause
Push WHERE clause conditions down to the storage engine for filtering
Example: SELECT A WHERE B = 2 AND C LIKE '%lee%'
No ICP:
Index(B) -- traditional, using the index for the range only
Index(B,C,A) -- covering, all involved columns included
Using ICP:
Index(B,C) -- range access by B, filter on C in the index, read the full row only on a match
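The effect of pushing the condition down can be sketched by counting full-row reads in a toy model. The table contents, the `index_bc` list, and the `query` function are all hypothetical; a real engine seeks into the B+Tree rather than looping over every entry, but the difference in row reads is the point.

```python
# Hypothetical table keyed by primary key; secondary index on (B, C).
table = {
    1: {"A": "x", "B": 2, "C": "bruce lee"},
    2: {"A": "y", "B": 2, "C": "john smith"},
    3: {"A": "z", "B": 2, "C": "lee ann"},
    4: {"A": "w", "B": 3, "C": "lee"},
}
index_bc = sorted((row["B"], row["C"], pk) for pk, row in table.items())


def query(use_icp):
    """SELECT A WHERE B = 2 AND C LIKE '%lee%'; returns (result, full_row_reads)."""
    result, full_row_reads = [], 0
    for b, c, pk in index_bc:
        if b != 2:
            continue  # range access on B (a real engine would seek, not loop)
        if use_icp and "lee" not in c:
            continue  # rejected using index data only, no row read
        row = table[pk]  # full row read
        full_row_reads += 1
        if "lee" in row["C"]:  # server-side WHERE check
            result.append(row["A"])
    return result, full_row_reads


print(query(use_icp=False))
print(query(use_icp=True))
```

Both calls return the same rows; only the number of full-row reads differs, dropping from 3 to 2 in this tiny data set.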
![Page 29: How mysql choose the execution plan](https://reader035.fdocuments.us/reader035/viewer/2022070523/58ed6d841a28abbb518b4733/html5/thumbnails/29.jpg)
ICP: Index Condition Pushdown
No ICP vs using ICP
WHERE B = 2 AND C LIKE '%lee%' with Index(B, C)
![Page 30: How mysql choose the execution plan](https://reader035.fdocuments.us/reader035/viewer/2022070523/58ed6d841a28abbb518b4733/html5/thumbnails/30.jpg)
ICP: Index Condition Pushdown
MySQL 5.6+ (5.7 adds support for partitioned tables)
Used for the range, ref, eq_ref, and ref_or_null access methods
On by default
SELECT @@optimizer_switch;
SET @@optimizer_switch = "index_condition_pushdown=off";
![Page 31: How mysql choose the execution plan](https://reader035.fdocuments.us/reader035/viewer/2022070523/58ed6d841a28abbb518b4733/html5/thumbnails/31.jpg)
ICP demo
Table & data:
create table icp(id int, age int, name varchar(30), memo varchar(600));
alter table icp add index aind(age, name, memo);
while (100K) { --eval insert into icp values($i, 1, 'a$i', repeat('a$i', 100)) }
SQL: select * from icp where age = 1 and memo like '%9999%';
show session status like '%handler%';
Handler_read_next: 100000 --> 10+ (ICP off vs on)
Use EXPLAIN to confirm ICP is used: Extra shows “Using index condition”
![Page 33: How mysql choose the execution plan](https://reader035.fdocuments.us/reader035/viewer/2022070523/58ed6d841a28abbb518b4733/html5/thumbnails/33.jpg)
Quiz: Explain
Scenario 1: most cases, with a typical live DB configuration and data distribution: AD
Scenario 2: Index Condition Pushdown enabled: ACD
Scenario 3: special data distribution: A
Scenario 4: special table structure (covering index): ABCD
Scenario 5: special storage engine (index using a hash table): A
How can the question be modified to make the answer unique?
A. a=0 AND b=0;
B. a=0 OR b=0;
C. a>0 AND b=0;
D. a=0 AND b>0;