Fast Incremental Backups with Percona Server and Percona XtraBackup / PLMCE 2014

Fast Incremental Backups with Percona Server and Percona XtraBackup Laurynas Biveinis

description

Percona Live 2014 presentation https://www.percona.com/live/mysql-conference-2014/sessions/fast-incremental-backups-percona-server-and-percona-xtrabackup

Transcript of Fast Incremental Backups with Percona Server and Percona XtraBackup / PLMCE 2014

Page 1: Fast Incremental Backups with Percona Server and Percona XtraBackup / PLMCE 2014

Fast Incremental Backups with Percona Server and Percona XtraBackup

Laurynas Biveinis

Page 2: Fast Incremental Backups with Percona Server and Percona XtraBackup / PLMCE 2014

Agenda

• Incremental XtraBackup: performance

• Incremental XtraBackup with bitmaps: performance

• The cost of the feature

• INFORMATION_SCHEMA.INNODB_CHANGED_PAGES

• Implementation

– Bitmap file format

– New server thread

Page 3: Fast Incremental Backups with Percona Server and Percona XtraBackup / PLMCE 2014

Incremental XtraBackup: Performance

[Figure: backup time (0% to 100%) plotted against delta size (0.10% to 100.00%)]

• Does time to backup depend on the % of changed data?

Page 4: Fast Incremental Backups with Percona Server and Percona XtraBackup / PLMCE 2014

Incremental XtraBackup: How Data Page Copying Works

[Diagram: MySQL holds table.ibd, whose pages have LSN = 950, 960, 960, 1002, 1003, 940, 1010. The base backup was taken at LSN = 1000. XtraBackup reads every page, checks whether its LSN > 1000, and writes the three matching pages to table.ibd.delta.]

Can we avoid reading the old pages?
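The full-scan copy shown in the diagram can be sketched in a few lines. This is a minimal Python illustration, not XtraBackup's actual code: the `Page` type and the `full_scan_delta` function are hypothetical names, and pages are modeled as in-memory objects instead of blocks read from table.ibd.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Page:
    page_no: int  # position of the page inside the tablespace file
    lsn: int      # LSN of the last modification, stored on the page itself


def full_scan_delta(pages: List[Page], base_lsn: int) -> Tuple[List[Page], int]:
    """Return (delta pages, number of page reads) for a full scan."""
    delta = []
    reads = 0
    for page in pages:
        reads += 1                # every page must be read to see its LSN
        if page.lsn > base_lsn:   # changed after the base backup
            delta.append(page)    # would be written to the .delta file
    return delta, reads


# Page LSNs from the slide; the base backup was taken at LSN 1000.
lsns = [950, 960, 960, 1002, 1003, 940, 1010]
table = [Page(page_no=i, lsn=lsn) for i, lsn in enumerate(lsns)]
delta, reads = full_scan_delta(table, base_lsn=1000)
print(reads, [p.lsn for p in delta])  # 7 [1002, 1003, 1010]
```

The read count equals the table size no matter how few pages changed, which is why backup time barely depends on the delta size.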

Page 5: Fast Incremental Backups with Percona Server and Percona XtraBackup / PLMCE 2014

Incremental XtraBackup: Can We Avoid Reading the Old Pages?

• http://bit.ly/FBIncBackup


Page 6: Fast Incremental Backups with Percona Server and Percona XtraBackup / PLMCE 2014

Incremental XtraBackup: Can We Avoid Reading the Old Pages?

• How do we know which pages to read then?

• Two ways to get the modification LSN of a page:

– It is written on the page, - or -

– We can figure it out from the redo log

• The log is circular, so the server must save this information before it is overwritten

Page 7: Fast Incremental Backups with Percona Server and Percona XtraBackup / PLMCE 2014

Changed Page Tracking

• Server:

– --innodb-track-changed-pages=TRUE

– Documentation at http://bit.ly/psbmpdoc

– Available in Percona Server 5.1/5.5/5.6

• XtraBackup:

– Zero configuration!


Page 8: Fast Incremental Backups with Percona Server and Percona XtraBackup / PLMCE 2014

Incremental XtraBackup with Changed Page Tracking

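[Diagram: Percona Server holds table.ibd, whose pages have LSN = 950, 960, 960, 1002, 1003, 940, 1010. The base backup was taken at LSN = 1000. The server reports the changed pages between LSNs 980 and 1020: the pages with LSN = 1002, 1003, 1010, ... XtraBackup reads only those three pages, checks whether LSN > 1000, and writes them to table.ibd.delta.]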
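For comparison with the full scan, here is a minimal Python sketch of a copy guided by the changed page list. It is an illustration under assumptions, not XtraBackup's code: `bitmap_guided_delta` is a hypothetical name, the tablespace is modeled as a page number to LSN mapping, and the sketch assumes the on-page LSN is still checked because the tracked LSN range (980 to 1020 on the slide) is wider than the backup's own range.

```python
from typing import Dict, Iterable, List, Tuple


def bitmap_guided_delta(
    page_lsns: Dict[int, int], changed_pages: Iterable[int], base_lsn: int
) -> Tuple[List[int], int]:
    """Return (page numbers to copy, number of page reads)."""
    to_copy = []
    reads = 0
    for page_no in sorted(set(changed_pages)):
        reads += 1                         # only candidate pages are read
        if page_lsns[page_no] > base_lsn:  # assumed: on-page LSN is still verified
            to_copy.append(page_no)
    return to_copy, reads


# Page number -> LSN, using the LSNs from the slide; base backup at LSN 1000.
page_lsns = dict(enumerate([950, 960, 960, 1002, 1003, 940, 1010]))
# Hypothetical tracking output: the pages holding LSN 1002, 1003 and 1010.
changed = [3, 4, 6]
to_copy, reads = bitmap_guided_delta(page_lsns, changed, base_lsn=1000)
print(reads, to_copy)  # 3 [3, 4, 6]
```

Three reads replace seven, and the saving grows with the share of unchanged pages.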

Page 9: Fast Incremental Backups with Percona Server and Percona XtraBackup / PLMCE 2014

Incremental XtraBackup with Changed Page Tracking: Performance

[Figure: backup time (0% to 100%) plotted against delta size (0.00% to 100.00%), with one series for Full Scan and one for Bitmap]

Page 10: Fast Incremental Backups with Percona Server and Percona XtraBackup / PLMCE 2014

Percona Server with Changed Page Tracking: Server Overhead

• Nothing is ever free!

– But the price may well be acceptable

• Potential overhead #1: extra disk space requirements

• Potential overhead #2: extra code running in the server


Page 11: Fast Incremental Backups with Percona Server and Percona XtraBackup / PLMCE 2014

Percona Server with Changed Page Tracking: Server Overhead

[Figure: Log and bitmap file size comparison. X axis: bitmap file # (1 to 8). Y axis: log bytes / bitmap byte (0 to 800).]

• A good case: more than 100 log bytes per bitmap byte

Page 12: Fast Incremental Backups with Percona Server and Percona XtraBackup / PLMCE 2014

Percona Server with Changed Page Tracking: Server Overhead

• A bad case: 3-15 log bytes per bitmap byte

• https://bugs.launchpad.net/bugs/1269547

– We are considering fix options
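To put the good-case and bad-case ratios into disk-space terms, the following Python sketch converts a redo log volume into an estimated bitmap volume. The 10 GiB log volume is an arbitrary figure chosen for illustration, and `bitmap_bytes` is a hypothetical helper; only the ratios (100, and 3 to 15) come from the slides.

```python
def bitmap_bytes(log_bytes: int, log_bytes_per_bitmap_byte: float) -> float:
    """Estimate bitmap volume from redo log volume and a log/bitmap ratio."""
    return log_bytes / log_bytes_per_bitmap_byte


GIB = 1024 ** 3
MIB = 1024 ** 2
log_written = 10 * GIB  # assumed redo log volume, for illustration only

good = bitmap_bytes(log_written, 100) / MIB   # good case: 100 log bytes per bitmap byte
bad_lo = bitmap_bytes(log_written, 15) / MIB  # bad case: 15 log bytes per bitmap byte
bad_hi = bitmap_bytes(log_written, 3) / MIB   # bad case: 3 log bytes per bitmap byte
print(f"good: {good:.0f} MiB, bad: {bad_lo:.0f}-{bad_hi:.0f} MiB")
# good: 102 MiB, bad: 683-3413 MiB
```

Since the good case is "more than 100", the 102 MiB figure is an upper bound for that case.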

Page 13: Fast Incremental Backups with Percona Server and Percona XtraBackup / PLMCE 2014

Percona Server with Changed Page Tracking: Server Overhead

• Impact on TPS and response time:

– Couldn't find it

– If you ever do find it, report it to us and try --innodb_log_checksum_algorithm=crc32

● http://bit.ly/pslogcrc32


Page 14: Fast Incremental Backups with Percona Server and Percona XtraBackup / PLMCE 2014

Bitmap File Naming & Sizing

• ib_modified_log_<seq>_<LSN>.xdb

– <seq>: 1, 2, 3, ...

– <LSN>: the server LSN at file creation time

• Rotated on

– Server start

– Reaching innodb_max_bitmap_file_size
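The naming scheme makes it easy for a script to list bitmap files in order. Below is a minimal Python sketch that parses the ib_modified_log_<seq>_<LSN>.xdb pattern; `BitmapFile` and `parse_bitmap_name` are hypothetical names, and the sample file names are made up for illustration.

```python
import re
from typing import NamedTuple, Optional


class BitmapFile(NamedTuple):
    seq: int        # <seq>: 1, 2, 3, ...
    start_lsn: int  # <LSN>: server LSN when the file was created


_NAME_RE = re.compile(r"^ib_modified_log_(\d+)_(\d+)\.xdb$")


def parse_bitmap_name(name: str) -> Optional[BitmapFile]:
    """Parse a bitmap file name; return None if it does not match the pattern."""
    m = _NAME_RE.match(name)
    if m is None:
        return None
    return BitmapFile(seq=int(m.group(1)), start_lsn=int(m.group(2)))


# Hypothetical directory listing, in arbitrary order.
names = ["ib_modified_log_2_10000.xdb", "ib_modified_log_1_8192.xdb", "ibdata1"]
files = sorted(f for f in map(parse_bitmap_name, names) if f is not None)
print(files)
```

Sorting the named tuples orders the files by sequence number first, which matches creation order.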


Page 15: Fast Incremental Backups with Percona Server and Percona XtraBackup / PLMCE 2014

Bitmap File Management

• PURGE CHANGED_PAGE_BITMAPS BEFORE <lsn>

– ib_1_8192.xdb

– ib_2_10000.xdb

– ib_3_20000.xdb

– Full backup taken, LSN = 22000

– PURGE CHANGED_PAGE_BITMAPS BEFORE 22000;

– ib_4_30000.xdb

– Incremental backup taken, LSN = 33000

– PURGE CHANGED_PAGE_BITMAPS BEFORE 33000;

Page 16: Fast Incremental Backups with Percona Server and Percona XtraBackup / PLMCE 2014

INFORMATION_SCHEMA.INNODB_CHANGED_PAGES

• Percona Server can read the bitmaps too

SHOW CREATE TABLE INFORMATION_SCHEMA.INNODB_CHANGED_PAGES;

CREATE TABLE `INNODB_CHANGED_PAGES` (
  `space_id` int(11) unsigned NOT NULL DEFAULT '0',
  `page_id` int(11) unsigned NOT NULL DEFAULT '0',
  `start_lsn` bigint(21) unsigned NOT NULL DEFAULT '0',
  `end_lsn` bigint(21) unsigned NOT NULL DEFAULT '0'
)

• start_lsn and end_lsn are always at the checkpoint boundary

• Does not show the exact LSN of a change

• Does not show the number of changes for one page

• Does show the number of flushes for a page over the workload

Page 17: Fast Incremental Backups with Percona Server and Percona XtraBackup / PLMCE 2014

INFORMATION_SCHEMA.INNODB_CHANGED_PAGES

SELECT * FROM INFORMATION_SCHEMA.INNODB_CHANGED_PAGES;

space_id  page_id  start_lsn  end_lsn
0         0        8204       38470
0         1        8204       38470
5         0        8204       38470
5         3        8204       38470
0         1        38471      50000
5         3        38471      50000
5         3        50001      60000

• Don't query like that in production!

– It will read all the bitmaps you have. Gigabytes, terabytes, ...

– Add WHERE start_lsn > X AND end_lsn < Y (index condition pushdown implemented for this case)

Page 18: Fast Incremental Backups with Percona Server and Percona XtraBackup / PLMCE 2014

INFORMATION_SCHEMA.INNODB_CHANGED_PAGES

• Which tables are written to?

SELECT DISTINCT space_id
  FROM INFORMATION_SCHEMA.INNODB_CHANGED_PAGES
  WHERE ...;

space_id
0
10

SELECT DISTINCT t1.space_id AS space_id, t2.schema AS db, t2.name AS tname
  FROM INFORMATION_SCHEMA.INNODB_CHANGED_PAGES AS t1,
       INFORMATION_SCHEMA.INNODB_SYS_TABLES AS t2
  WHERE t1.space_id = t2.space AND t1.start_lsn >...

space_id  db    tname
0               SYS_FOREIGN
0               SYS_FOREIGN_COLS
10        test  foo

Page 19: Fast Incremental Backups with Percona Server and Percona XtraBackup / PLMCE 2014

INFORMATION_SCHEMA.INNODB_CHANGED_PAGES

• What are the hottest tables?

SELECT space_id, COUNT(space_id) AS number_of_flushes
  FROM INFORMATION_SCHEMA.INNODB_CHANGED_PAGES
  GROUP BY space_id
  ORDER BY number_of_flushes DESC;

space_id  number_of_flushes
0         65
10        5
11        4

Page 20: Fast Incremental Backups with Percona Server and Percona XtraBackup / PLMCE 2014

INFORMATION_SCHEMA.INNODB_CHANGED_PAGES

• What are the hottest pages?

SELECT space_id, page_id, COUNT(page_id) AS number_of_flushes
  FROM INFORMATION_SCHEMA.INNODB_CHANGED_PAGES
  GROUP BY space_id, page_id
  HAVING number_of_flushes >= 2
  ORDER BY number_of_flushes DESC
  LIMIT 8;

space_id  page_id  number_of_flushes
0         5        3
0         7        3
0         0        2
0         11       2
10        3        2
0         1        2
0         12       2
0         2        2

Page 21: Fast Incremental Backups with Percona Server and Percona XtraBackup / PLMCE 2014

INFORMATION_SCHEMA.INNODB_CHANGED_PAGES

• For complex queries, copy data first

CREATE TEMPORARY TABLE icp (
  space_id INT(11) NOT NULL,
  page_id INT(11) NOT NULL,
  start_lsn BIGINT(21) NOT NULL,
  end_lsn BIGINT(21) NOT NULL,
  INDEX page_id(space_id, page_id),
  INDEX start_lsn(start_lsn),
  INDEX end_lsn(end_lsn)
) ENGINE=InnoDB;

INSERT INTO icp
  SELECT * FROM INFORMATION_SCHEMA.INNODB_CHANGED_PAGES
  WHERE start_lsn > 8000;

Page 22: Fast Incremental Backups with Percona Server and Percona XtraBackup / PLMCE 2014

INFORMATION_SCHEMA.INNODB_CHANGED_PAGES

• For complex queries, copy data first

EXPLAIN SELECT DISTINCT space_id FROM INFORMATION_SCHEMA.INNODB_CHANGED_PAGES;

id  select_type  table                 type  possible_keys  key   key_len  ref   rows  Extra
1   SIMPLE       INNODB_CHANGED_PAGES  ALL   NULL           NULL  NULL     NULL  NULL  Using temporary

EXPLAIN SELECT DISTINCT space_id FROM icp;

id  select_type  table  type   possible_keys  key      key_len  ref   rows  Extra
1   SIMPLE       icp    index  NULL           page_id  8        NULL  74    Using index

Page 23: Fast Incremental Backups with Percona Server and Percona XtraBackup / PLMCE 2014

Implementation: File Format

[Diagram: the bitmap file is a sequence of 4KB data pages, grouped by checkpoint. The number of data pages varies per checkpoint. The groups shown are the data for the checkpoints at LSN 9000, LSN 10000, and LSN 10500.]

For each checkpoint: a run of data pages, each labeled with its (space, start page).

Each page contains a bitmap for the next 32480 pages in the space, starting from the start page.
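The numbers on this slide fit together: 32480 bits occupy 4060 bytes, which leaves 36 bytes of each 4KB page for other fields such as the space and start page. The Python sketch below shows how a page id could be mapped to a bit. It rests on two assumptions that the slide does not state: that bitmap pages start at multiples of 32480, and that bits are numbered from the least significant bit of each byte. `locate` and `is_changed` are hypothetical helpers.

```python
PAGES_PER_BITMAP_PAGE = 32480  # from the slide
BITMAP_PAGE_SIZE = 4096        # 4KB, from the slide


def locate(page_id: int):
    """Return (start_page, byte_offset, bit_in_byte) for a page id.

    Assumption: start pages are multiples of 32480 and bits are numbered
    from the least significant bit of each byte.
    """
    start_page = page_id - page_id % PAGES_PER_BITMAP_PAGE
    bit = page_id - start_page
    return start_page, bit // 8, bit % 8


def is_changed(bitmap: bytes, page_id: int) -> bool:
    """Test the bit for page_id in the bitmap that covers it."""
    _, byte_offset, bit = locate(page_id)
    return bool((bitmap[byte_offset] >> bit) & 1)


bitmap_len = PAGES_PER_BITMAP_PAGE // 8  # 4060 bytes of bits
spare = BITMAP_PAGE_SIZE - bitmap_len    # 36 bytes left for other fields

bitmap = bytearray(bitmap_len)
start, byte_offset, bit = locate(40000)
bitmap[byte_offset] |= 1 << bit          # mark page 40000 as changed
print(start, byte_offset, bit, spare)    # 32480 940 0 36
```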

Page 24: Fast Incremental Backups with Percona Server and Percona XtraBackup / PLMCE 2014

Implementation: Server Side

• A new XtraDB thread

– 1. Wait for log checkpoint completed event

– 2. Read the log up to the checkpoint, write the bitmap

– 3. goto 1

• Little data sharing with the rest of XtraDB

– log_sys->mutex for:

● setting and getting LSNs;

● calculating log read offset from LSN.

• Little extra code for the query threads

– Unread log overwrite check

– Firing of the log checkpoint completed event
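The wait, read, write loop of the tracking thread can be sketched as follows. This is a simplified Python model, not the XtraDB implementation: the redo log is a list of (lsn, space_id, page_id) tuples, a `queue.Queue` stands in for the log checkpoint completed event, a `None` item is a made-up shutdown signal, and each "bitmap" is a plain set. Starting each range at the previous end LSN plus one is also an assumption.

```python
import queue
import threading
from typing import List, Set, Tuple

# Simplified redo log: (lsn, space_id, page_id) records in LSN order.
LogRecord = Tuple[int, int, int]
Bitmap = Tuple[int, int, Set[Tuple[int, int]]]  # (start_lsn, end_lsn, changed pages)


def tracker(log: List[LogRecord], checkpoints: queue.Queue,
            bitmaps: List[Bitmap], start_lsn: int) -> None:
    tracked_lsn = start_lsn
    while True:
        checkpoint_lsn = checkpoints.get()  # 1. wait for a checkpoint event
        if checkpoint_lsn is None:          # sentinel: shut down
            return
        # 2. read the log up to the checkpoint and collect the changed pages
        changed = {(space, page) for lsn, space, page in log
                   if tracked_lsn < lsn <= checkpoint_lsn}
        bitmaps.append((tracked_lsn + 1, checkpoint_lsn, changed))
        tracked_lsn = checkpoint_lsn        # 3. go back to waiting


log = [(8300, 0, 1), (9100, 5, 3), (9500, 5, 3), (10200, 0, 1)]
events: queue.Queue = queue.Queue()
bitmaps: List[Bitmap] = []
worker = threading.Thread(target=tracker, args=(log, events, bitmaps, 8203))
worker.start()
for lsn in (9000, 10000, 10500):  # hypothetical checkpoint LSNs
    events.put(lsn)
events.put(None)
worker.join()
for start, end, pages in bitmaps:
    print(start, end, sorted(pages))
```

Page (5, 3) is modified twice between the first two checkpoints but appears once in that range, which mirrors the point that the table shows flushes per checkpoint interval and not individual changes.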


Page 25: Fast Incremental Backups with Percona Server and Percona XtraBackup / PLMCE 2014

Implementation: Things We Had to Account For

• Maximum checkpoint age violation

– Destroys untracked log data

– We make an effort to avoid this, but in the end we allow the log to be overwritten

– A responsive server matters more than fast backups

• Crash recovery

– Re-read the log if available


Page 26: Fast Incremental Backups with Percona Server and Percona XtraBackup / PLMCE 2014

Conclusions

• Percona Server together with Percona XtraBackup:

– Enable faster incremental backups

– Enable more frequent incremental backups

– Do not hurt server operation, but the bitmaps now have to be managed

• New INFORMATION_SCHEMA table for gaining insight into data change patterns

• The feature is actually being used: http://bit.ly/psbmpbugs

• Thank you! Questions?
