Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf...

52
Percona Live, Santa Clara 2017 Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros

Transcript of Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf...

Page 1: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

Optimizing MySQL without touching SQL or my.cnf

Maxim Bublis, Peter Boros

Page 2: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

Page 3: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

We want to be more efficient• Less hardware • Faster responses • Power savings

Page 4: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

OS-level optimizationsSet cpufreq scaling governor to performance

$ echo performance | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

Page 5: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

What’s going onstrace and perf are our friends

Page 6: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

TCP Wrapper libraryWritten by Wietse Venema in 1990

Host-based networking ACL

It is NOT the same as MySQL ACL – doesn’t verify credentials

Page 7: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

How it worksHow libwrap works in general – it reads /etc/hosts.allow and /etc/hosts.deny

Verifies that client host can connect to server

Page 8: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

Let’s run strace$ sudo strace -c -p 28894 % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 94.39 56.109363 678 82816 poll 0.82 0.489786 1 328312 read 0.66 0.391561 2 164156 munmap 0.55 0.328845 1 248451 fcntl 0.50 0.298584 2 157468 865 futex 0.49 0.291448 2 164156 open 0.44 0.259529 2 164156 mmap 0.39 0.232314 1 164894 738 setsockopt 0.38 0.228411 1 164156 close 0.35 0.209210 1 164156 fstat 0.32 0.192175 2 82079 rt_sigaction 0.26 0.156935 2 82817 accept 0.18 0.108958 1 82079 getsockname 0.17 0.100658 1 82079 getpeername 0.01 0.003137 3 1045 clone ------ ----------- ----------- --------- --------- ---------------- 100.00 59.446237 2132821 1603 total

Page 9: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

Let’s run strace$ sudo strace -c -p 28894 % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 94.39 56.109363 678 82816 poll 0.82 0.489786 1 328312 read 0.66 0.391561 2 164156 munmap 0.55 0.328845 1 248451 fcntl 0.50 0.298584 2 157468 865 futex 0.49 0.291448 2 164156 open 0.44 0.259529 2 164156 mmap 0.39 0.232314 1 164894 738 setsockopt 0.38 0.228411 1 164156 close 0.35 0.209210 1 164156 fstat 0.32 0.192175 2 82079 rt_sigaction 0.26 0.156935 2 82817 accept 0.18 0.108958 1 82079 getsockname 0.17 0.100658 1 82079 getpeername 0.01 0.003137 3 1045 clone ------ ----------- ----------- --------- --------- ---------------- 100.00 59.446237 2132821 1603 total

Page 10: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

Better to use iptablesMakes tens of thousands context switches per second

• open() • fstat() • read() • close() • etc

We spend 2.5% of system time there

Page 11: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

Fast Mutexes

Page 12: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

Mutex implementation On Linux it is spin lock + futex.

Futex is user-space mutex implementation provided by kernel.

Page 13: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

Fast Mutexes“Fast” mutex

volatile ulong j;

j = 0;

for (i = 0; i < delayloops * 50; i++) j += i;

return(j);

NPTL

do { if (cnt++ >= max_cnt) { LLL_MUTEX_LOCK (mutex); break; } atomic_spin_nop (); } while (LLL_MUTEX_TRYLOCK (mutex) != 0);

Page 14: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

Fast Mutexes“Fast” mutex

Does load+store+add per loop iteration

Number of spins is produced by PRNG

PRNG uses floating-point arithmetics

Utilizes CPU execution port

NPTL

Emits nop / pause instruction

Adaptive number of spins

Page 15: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

Fixed in 5.7.8https://dev.mysql.com/worklog/task/?id=4601

“Pretending that we can implement faster mutexes with such a simple code is justnaive and labeling then as "fast mutex" without any evidence is misleading toour users.”

Page 16: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

sysbenchhttps://github.com/akopytov/sysbench

Scriptable multi-threaded benchmark tool based on LuaJIT

Provides a collection of OLTP-like database benchmarks

Page 17: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

sysbench example$ sysbench \ --num-threads=$num_threads \ --max-time=300 \ --max-requests=0 \ --test=/usr/local/share/sysbench/oltp_read_write.lua \ --mysql-user=sbtest \ --mysql-password=sbtest \ --mysql-host=127.0.0.1 \ --mysql-port=3306 \ --table-size=1000000 \ --tables=64 \ --report-interval=1 \ --auto-inc=off \ run

Page 18: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

CompilersUbuntu 16.04

• GCC 4.9 • GCC 5.4 • Clang 3.8

Page 19: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

Page 20: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

Page 21: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

Profile-guided optimizations• Requires “training data” • Usually requires to rebuild binary • Common mistake: collecting profile data from unit tests

To build “instrumented” binary: -fprofile-generate

To use profiling data: -fprofile-use (GCC additionally requires -fprofile-correction)

Page 22: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

Page 23: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

Page 24: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

AutoFDOhttps://github.com/google/autofdo

Helps to convert perf data into profiling data

Works with both GCC and Clang

Page 25: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

AutoFDOAlmost as efficient as building instrumented binaries

Requires LBR support (last branch record) from both CPU and kernel

Regular perf works well, but works better with perf -b

Page 26: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

Link-time optimizationsAlso known as Full Program Optimization)

Supported by both modern GCC and Clang with -flto option

Page 27: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

Page 28: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

Page 29: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

Page 30: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

Native instruction setsBy default compilers generate code using k8 instruction set for x86_64

It was created and released by AMD in 2000.

-march=native and -mtune=native

Page 31: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

Page 32: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

Page 33: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

PGO + LTOWe can combine it!

Page 34: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

Page 35: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

Page 36: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

PGO + LTO + NativeEven better!

Page 37: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

Page 38: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

Page 39: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

Real world scenarios• Artificial profiling data won’t help much in real world scenarios • Probably will hurt in reality – like slower replication, etc.

Page 40: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

Percona Query Playbackhttps://github.com/Percona-Lab/query-playback

Workload reproduction based on slow log

Page 41: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

Options for capturingslow_query_log_timestamp_precision = microsecond slow_query_log_timestamp_always = 1 log_slow_verbosity = full long_query_time = 0

log_slow_rate_limit = 1

Page 42: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

Slow log event example# Time: 170330 6:25:14.040355 # User@Host: edgestore_user[edgestore_user] @ [127.0.0.1] Id: 10671469 # Schema: stage_edgestore_shard232 Last_errno: 0 Killed: 0 # Query_time: 0.000274 Lock_time: 0.000037 Rows_sent: 0 Rows_examined: 0 Rows_affected: 1 # Bytes_sent: 11 Tmp_tables: 0 Tmp_disk_tables: 0 Tmp_table_sizes: 0 # InnoDB_trx_id: 125F7CBE2A # QC_Hit: No Full_scan: No Full_join: No Tmp_table: No Tmp_table_on_disk: No # Filesort: No Filesort_on_disk: No Merge_passes: 0 # InnoDB_IO_r_ops: 0 InnoDB_IO_r_bytes: 0 InnoDB_IO_r_wait: 0.000000 # InnoDB_rec_lock_wait: 0.000000 InnoDB_queue_wait: 0.000000 # InnoDB_pages_distinct: 5 SET timestamp=1490855114; ACTUAL QUERY SQLTEXT;

Page 43: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

Learn more about Edgestore

26 April - 3:30 PM - 4:20 PM @ Ballroom A

Page 44: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

Query Playback• For every thread id in slow log it opens new connection • For each client connection it enqueues queue_depth number of statements to

replay • Replays those on those connections

Page 45: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

Query Playback issues• Leaks connections • Often runs into lock waits, and seems to do nothing

(Slow log ordering issue) • Output is unreadable, reasonable debugging is next to impossible

Page 46: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

Query Playback enhancements• No more connection leak (preprocessing) • Correct ordering (preprocessing) • Accurate playback mode

Page 47: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

Kudos to Marius Wachtlerhttps://github.com/undingen

Pyston JIT compiler team member

Joined us for a couple of planning cycles

Bicycling from Austria to Australia now

Page 48: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

Example$ percona-playback \

--mysql-user=testuser \

--mysql-host=127.0.0.1 \

--mysql-password=secret \

--query-log-file my.slow.log \

--queue-depth 1 \

--mysql-schema=stage_edgestore_shard232 \

--disable-reporting-plugin error_report \

--ignore-row-result-diffs \

--mysql-max-retries=0

Page 49: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

Accurate mode• Controlled by query-log-accurate-mode option • Use with high queue depth

• In theory ordering issues are still possible, but we didn’t experience any, restoring the timing solves it.

• Keeps the gap between queries if there were any • Query 1 starts at T, query 2 at T + 5 sec

• If Query 1 was 1 second originally, and was played back in 0.8 seconds, playback will wait 4.2 seconds before Query 2

• If Query 1 was 1 second originally, and was played back in 2 seconds, playback will wait 2 seconds before Query 2

• If Query 1 was 1 seconds originally, and was played back in 6 seconds, playback will start Query 2 immediately

Page 50: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

Tips for larger queue depth• “Faster than possible” playback, while ordering is guaranteed

to have issues • The session_init_query option can help here, for example for

reducing innodb_lock_wait_timeout • New mysql_filter_error option to filter out known issues

to a specific workload

Page 51: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017

We are hiring!

Page 52: Optimizing MySQL without touching SQL or my · Optimizing MySQL without touching SQL or my.cnf Maxim Bublis, Peter Boros. Percona Live, Santa Clara 2017. Percona Live, Santa Clara

Percona Live, Santa Clara 2017