pg rewind - Heikki Linnakangas' blog | Mostly PostgreSQL...
![Page 1: pg rewind - Heikki Linnakangas' blog | Mostly PostgreSQL ...hlinnaka.iki.fi/presentations/NordicPGDay2015-pg_rewind.pdf · In PostgreSQL 9.5 Changed WAL record format in 9.5 – to](https://reader030.fdocuments.us/reader030/viewer/2022020214/5a78f0387f8b9adb5a8b7f24/html5/thumbnails/1.jpg)
pg_rewind
Heikki Linnakangas
![Page 2: pg rewind - Heikki Linnakangas' blog | Mostly PostgreSQL ...hlinnaka.iki.fi/presentations/NordicPGDay2015-pg_rewind.pdf · In PostgreSQL 9.5 Changed WAL record format in 9.5 – to](https://reader030.fdocuments.us/reader030/viewer/2022020214/5a78f0387f8b9adb5a8b7f24/html5/thumbnails/2.jpg)
Your typical setup
[Diagram: master and standby, connected by streaming replication]
![Page 3: pg rewind - Heikki Linnakangas' blog | Mostly PostgreSQL ...hlinnaka.iki.fi/presentations/NordicPGDay2015-pg_rewind.pdf · In PostgreSQL 9.5 Changed WAL record format in 9.5 – to](https://reader030.fdocuments.us/reader030/viewer/2022020214/5a78f0387f8b9adb5a8b7f24/html5/thumbnails/3.jpg)
Your typical catastrophe
[Diagram: master and standby, connected by streaming replication]
![Page 4: pg rewind - Heikki Linnakangas' blog | Mostly PostgreSQL ...hlinnaka.iki.fi/presentations/NordicPGDay2015-pg_rewind.pdf · In PostgreSQL 9.5 Changed WAL record format in 9.5 – to](https://reader030.fdocuments.us/reader030/viewer/2022020214/5a78f0387f8b9adb5a8b7f24/html5/thumbnails/4.jpg)
Standby takes over
[Diagram: the standby relabeled as master; the old master]
![Page 5: pg rewind - Heikki Linnakangas' blog | Mostly PostgreSQL ...hlinnaka.iki.fi/presentations/NordicPGDay2015-pg_rewind.pdf · In PostgreSQL 9.5 Changed WAL record format in 9.5 – to](https://reader030.fdocuments.us/reader030/viewer/2022020214/5a78f0387f8b9adb5a8b7f24/html5/thumbnails/5.jpg)
Wait, the old master survived after all!
[Diagram: the old master next to the standby that is now master]
![Page 6: pg rewind - Heikki Linnakangas' blog | Mostly PostgreSQL ...hlinnaka.iki.fi/presentations/NordicPGDay2015-pg_rewind.pdf · In PostgreSQL 9.5 Changed WAL record format in 9.5 – to](https://reader030.fdocuments.us/reader030/viewer/2022020214/5a78f0387f8b9adb5a8b7f24/html5/thumbnails/6.jpg)
How do you turn the old master into standby?
[Diagram: old master relabeled "standby??"; the former standby is now master; streaming replication between them marked "???"]
![Page 7: pg rewind - Heikki Linnakangas' blog | Mostly PostgreSQL ...hlinnaka.iki.fi/presentations/NordicPGDay2015-pg_rewind.pdf · In PostgreSQL 9.5 Changed WAL record format in 9.5 – to](https://reader030.fdocuments.us/reader030/viewer/2022020214/5a78f0387f8b9adb5a8b7f24/html5/thumbnails/7.jpg)
WAL Timelines
TLI 1
![Page 8: pg rewind - Heikki Linnakangas' blog | Mostly PostgreSQL ...hlinnaka.iki.fi/presentations/NordicPGDay2015-pg_rewind.pdf · In PostgreSQL 9.5 Changed WAL record format in 9.5 – to](https://reader030.fdocuments.us/reader030/viewer/2022020214/5a78f0387f8b9adb5a8b7f24/html5/thumbnails/8.jpg)
WAL Timelines
[Diagram: TLI 1 with INSERT #1 and INSERT #2]
![Page 9: pg rewind - Heikki Linnakangas' blog | Mostly PostgreSQL ...hlinnaka.iki.fi/presentations/NordicPGDay2015-pg_rewind.pdf · In PostgreSQL 9.5 Changed WAL record format in 9.5 – to](https://reader030.fdocuments.us/reader030/viewer/2022020214/5a78f0387f8b9adb5a8b7f24/html5/thumbnails/9.jpg)
Promotion
[Diagram: timelines TLI 1 and TLI 2 (on standby) with INSERT #1, INSERT #2 and INSERT #3; labels: Promotion, Meteor]
![Page 10: pg rewind - Heikki Linnakangas' blog | Mostly PostgreSQL ...hlinnaka.iki.fi/presentations/NordicPGDay2015-pg_rewind.pdf · In PostgreSQL 9.5 Changed WAL record format in 9.5 – to](https://reader030.fdocuments.us/reader030/viewer/2022020214/5a78f0387f8b9adb5a8b7f24/html5/thumbnails/10.jpg)
Lost transactions
[Diagram: timelines TLI 1 and TLI 2 (on standby) with INSERT #1, INSERT #2 and INSERT #3]
Lost transactions, not streamed to standby
![Page 11: pg rewind - Heikki Linnakangas' blog | Mostly PostgreSQL ...hlinnaka.iki.fi/presentations/NordicPGDay2015-pg_rewind.pdf · In PostgreSQL 9.5 Changed WAL record format in 9.5 – to](https://reader030.fdocuments.us/reader030/viewer/2022020214/5a78f0387f8b9adb5a8b7f24/html5/thumbnails/11.jpg)
What about synchronous replication?
Nope:
● only commits are synchronized
● records may hit the disk in master before they're replicated anyway
[Diagram: timelines TLI 1 and TLI 2 (on standby) with INSERT #1, INSERT #2 and INSERT #3]
Lost transactions, not streamed to standby
![Page 12: pg rewind - Heikki Linnakangas' blog | Mostly PostgreSQL ...hlinnaka.iki.fi/presentations/NordicPGDay2015-pg_rewind.pdf · In PostgreSQL 9.5 Changed WAL record format in 9.5 – to](https://reader030.fdocuments.us/reader030/viewer/2022020214/5a78f0387f8b9adb5a8b7f24/html5/thumbnails/12.jpg)
Even controlled failover is tricky
● How do you verify that the standby got all the WAL?
![Page 13: pg rewind - Heikki Linnakangas' blog | Mostly PostgreSQL ...hlinnaka.iki.fi/presentations/NordicPGDay2015-pg_rewind.pdf · In PostgreSQL 9.5 Changed WAL record format in 9.5 – to](https://reader030.fdocuments.us/reader030/viewer/2022020214/5a78f0387f8b9adb5a8b7f24/html5/thumbnails/13.jpg)
How to resynchronize?
[Diagram: timelines TLI 1 and TLI 2 with INSERT #1, INSERT #2 and INSERT #3; the way back from TLI 1 to TLI 2 is marked "???"]
![Page 14: pg rewind - Heikki Linnakangas' blog | Mostly PostgreSQL ...hlinnaka.iki.fi/presentations/NordicPGDay2015-pg_rewind.pdf · In PostgreSQL 9.5 Changed WAL record format in 9.5 – to](https://reader030.fdocuments.us/reader030/viewer/2022020214/5a78f0387f8b9adb5a8b7f24/html5/thumbnails/14.jpg)
Naive approach
● Just create a recovery.conf file on old master to point to new master
● Will not work:
LOG: database system was shut down at 2015-03-05 15:26:37 EET
LOG: entering standby mode
LOG: consistent recovery state reached at 0/4000098
LOG: invalid record length at 0/4000098
LOG: fetching timeline history file for timeline 2 from primary server
FATAL: could not start WAL streaming: ERROR: requested starting point 0/4000000 on timeline 1 is not in this server's history
DETAIL: This server's history forked from timeline 1 at 0/3010758.
● Might appear to work, but may silently corrupt your database!
![Page 15: pg rewind - Heikki Linnakangas' blog | Mostly PostgreSQL ...hlinnaka.iki.fi/presentations/NordicPGDay2015-pg_rewind.pdf · In PostgreSQL 9.5 Changed WAL record format in 9.5 – to](https://reader030.fdocuments.us/reader030/viewer/2022020214/5a78f0387f8b9adb5a8b7f24/html5/thumbnails/15.jpg)
Wrong approach
[Diagram: timelines TLI 1 and TLI 2 with INSERT #1, INSERT #2 and INSERT #3; the attempted path is marked "WRONG!"]
![Page 16: pg rewind - Heikki Linnakangas' blog | Mostly PostgreSQL ...hlinnaka.iki.fi/presentations/NordicPGDay2015-pg_rewind.pdf · In PostgreSQL 9.5 Changed WAL record format in 9.5 – to](https://reader030.fdocuments.us/reader030/viewer/2022020214/5a78f0387f8b9adb5a8b7f24/html5/thumbnails/16.jpg)
Solution 1: Rebuild from scratch
● Erase old master, take new base backup from new master, and copy it over.
● Is slow
– Reads all data from disk
– Sends all data through the network
– Writes all data to disk
![Page 17: pg rewind - Heikki Linnakangas' blog | Mostly PostgreSQL ...hlinnaka.iki.fi/presentations/NordicPGDay2015-pg_rewind.pdf · In PostgreSQL 9.5 Changed WAL record format in 9.5 – to](https://reader030.fdocuments.us/reader030/viewer/2022020214/5a78f0387f8b9adb5a8b7f24/html5/thumbnails/17.jpg)
Solution 2: rsync
● Call pg_start_backup() in new master
● Use rsync to resynchronize the data dir
● Be careful which options you use
● Still slow
– Reads all data from disk
![Page 18: pg rewind - Heikki Linnakangas' blog | Mostly PostgreSQL ...hlinnaka.iki.fi/presentations/NordicPGDay2015-pg_rewind.pdf · In PostgreSQL 9.5 Changed WAL record format in 9.5 – to](https://reader030.fdocuments.us/reader030/viewer/2022020214/5a78f0387f8b9adb5a8b7f24/html5/thumbnails/18.jpg)
Solution 3: pg_rewind
● Fast
– Only reads and copies data that was changed
![Page 19: pg rewind - Heikki Linnakangas' blog | Mostly PostgreSQL ...hlinnaka.iki.fi/presentations/NordicPGDay2015-pg_rewind.pdf · In PostgreSQL 9.5 Changed WAL record format in 9.5 – to](https://reader030.fdocuments.us/reader030/viewer/2022020214/5a78f0387f8b9adb5a8b7f24/html5/thumbnails/19.jpg)
Terminology
[Diagram: timelines TLI 1 and TLI 2 with INSERT #1, INSERT #2 and INSERT #3; the two sides are labeled Target and Source]
Source: New master. Not modified.
Target: Old master. Overwritten with data from source.
![Page 20: pg rewind - Heikki Linnakangas' blog | Mostly PostgreSQL ...hlinnaka.iki.fi/presentations/NordicPGDay2015-pg_rewind.pdf · In PostgreSQL 9.5 Changed WAL record format in 9.5 – to](https://reader030.fdocuments.us/reader030/viewer/2022020214/5a78f0387f8b9adb5a8b7f24/html5/thumbnails/20.jpg)
How it works
● Find out what blocks the lost transactions modified
● Copy those blocks from source to target
~ rsync on steroids
![Page 21: pg rewind - Heikki Linnakangas' blog | Mostly PostgreSQL ...hlinnaka.iki.fi/presentations/NordicPGDay2015-pg_rewind.pdf · In PostgreSQL 9.5 Changed WAL record format in 9.5 – to](https://reader030.fdocuments.us/reader030/viewer/2022020214/5a78f0387f8b9adb5a8b7f24/html5/thumbnails/21.jpg)
How it works? 1. Determine point of divergence
[Diagram: timelines TLI 1 and TLI 2 with INSERT #1, INSERT #2 and INSERT #3]
● Looks at the pg_control file on both systems
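As a rough illustration of this step, the sketch below finds the point of divergence from two timeline histories. It is a toy model under stated assumptions, not pg_rewind's code: the `History` type and the function `find_divergence` are hypothetical, and LSNs are plain integers. The values mirror the example later in these slides (divergence at 0/3000060 on timeline 1).

```python
from typing import List, Optional, Tuple

# Hypothetical representation: one (timeline id, LSN where that timeline ended)
# entry per timeline, oldest first. None means the timeline is still open.
History = List[Tuple[int, Optional[int]]]


def find_divergence(source: History, target: History) -> Tuple[int, int]:
    """Return (timeline, lsn) of the last point that both histories share."""
    common = None
    for (s_tli, s_end), (t_tli, t_end) in zip(source, target):
        if s_tli != t_tli:
            break  # different timelines from here on
        common = (s_tli, s_end, t_end)
        if s_end != t_end:
            break  # same timeline, but the two sides left it at different points
    if common is None:
        raise ValueError("no common ancestor timeline")
    tli, s_end, t_end = common
    ends = [end for end in (s_end, t_end) if end is not None]
    if not ends:
        raise ValueError("histories are identical; nothing diverged")
    # The shared part stops where the first side left the common timeline.
    return tli, min(ends)


# New master: left TLI 1 at 0/3000060 and is now on TLI 2.
source_history = [(1, 0x3000060), (2, None)]
# Old master: still on TLI 1.
target_history = [(1, None)]
divergence = find_divergence(source_history, target_history)
```

Here `divergence` is timeline 1 at integer LSN `0x3000060`, which is the position where the new master forked off.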
![Page 22: pg rewind - Heikki Linnakangas' blog | Mostly PostgreSQL ...hlinnaka.iki.fi/presentations/NordicPGDay2015-pg_rewind.pdf · In PostgreSQL 9.5 Changed WAL record format in 9.5 – to](https://reader030.fdocuments.us/reader030/viewer/2022020214/5a78f0387f8b9adb5a8b7f24/html5/thumbnails/22.jpg)
How it works? 2. Scan the old WAL
[Diagram: timelines TLI 1 and TLI 2 with INSERT #1, INSERT #2 and INSERT #3]
● Build a list of blocks that were changed on TLI 1
– lost transactions
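The scan can be pictured with a small sketch. It assumes a made-up `WalRecord` tuple carrying one relation file and one block number per record; real WAL records are binary and can reference several blocks, so this only shows the idea of collecting the touched blocks into a set.

```python
from typing import Iterable, NamedTuple, Set, Tuple


class WalRecord(NamedTuple):
    """Hypothetical, simplified WAL record: one modified block per record."""
    lsn: int       # position of the record in the WAL
    relfile: str   # relation file path, relative to the data directory
    block: int     # block number within that file


def changed_blocks(wal: Iterable[WalRecord], start_lsn: int) -> Set[Tuple[str, int]]:
    """Collect every (file, block) touched at or after start_lsn."""
    return {(rec.relfile, rec.block) for rec in wal if rec.lsn >= start_lsn}


old_master_wal = [
    WalRecord(0x2000100, "base/1/100", 3),  # before the scan start, ignored
    WalRecord(0x3000100, "base/1/100", 7),
    WalRecord(0x3000200, "base/1/100", 7),  # same block again, listed once
]
blocks = changed_blocks(old_master_wal, start_lsn=0x3000060)
```

Using a set means a block that was modified many times is still copied only once.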
![Page 23: pg rewind - Heikki Linnakangas' blog | Mostly PostgreSQL ...hlinnaka.iki.fi/presentations/NordicPGDay2015-pg_rewind.pdf · In PostgreSQL 9.5 Changed WAL record format in 9.5 – to](https://reader030.fdocuments.us/reader030/viewer/2022020214/5a78f0387f8b9adb5a8b7f24/html5/thumbnails/23.jpg)
How it works? 3. Copy over all changed blocks
● Copies everything except those blocks of relation files that were not modified
– pg_clog, etc.
– Configuration files
– FSM and VM files
![Page 24: pg rewind - Heikki Linnakangas' blog | Mostly PostgreSQL ...hlinnaka.iki.fi/presentations/NordicPGDay2015-pg_rewind.pdf · In PostgreSQL 9.5 Changed WAL record format in 9.5 – to](https://reader030.fdocuments.us/reader030/viewer/2022020214/5a78f0387f8b9adb5a8b7f24/html5/thumbnails/24.jpg)
File map
backup_label.old (COPY)
base/1/12454_fsm (COPY)
base/1/12454_vm (COPY)
base/1/12456_fsm (COPY)
...
pg_xlog/archive_status/000000010000000000000003.done (COPY)
pg_xlog/archive_status/00000002.history.done (COPY)
postgresql.auto.conf (COPY)
postgresql.conf (COPY)
recovery.done (COPY)
base/12726/12475 (COPY_TAIL)
pg_xlog/archive_status/000000010000000000000003.ready (REMOVE)
pg_xlog/archive_status/000000010000000000000002.00000028.backup.done (REMOVE)
pg_xlog/archive_status/000000010000000000000001.done (REMOVE)
pg_xlog/000000010000000000000004 (REMOVE)
pg_xlog/000000010000000000000002.00000028.backup (REMOVE)
pg_xlog/000000010000000000000001 (REMOVE)
pg_stat/global.stat (REMOVE)
pg_stat/db_12726.stat (REMOVE)
pg_stat/db_0.stat (REMOVE)
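One way to read the listing above is as the result of comparing the two directory trees. The sketch below is a simplified guess at such a decision table, not pg_rewind's actual logic: `plan_actions`, the path pattern for relation files, and the `NONE` label are all assumptions made for illustration, and files that shrank on the source are not handled.

```python
import re
from typing import Dict

# Assumed heuristic: main relation files look like base/<db oid>/<digits>.
# Fork files such as ..._fsm and ..._vm do not match and are copied whole.
_RELATION_RE = re.compile(r"^base/\d+/\d+$")


def plan_actions(source: Dict[str, int], target: Dict[str, int]) -> Dict[str, str]:
    """Map each path to an action, given {path: size in bytes} for both sides."""
    actions: Dict[str, str] = {}
    for path, src_size in source.items():
        if path not in target or not _RELATION_RE.match(path):
            actions[path] = "COPY"       # new file or non-relation file: copy it whole
        elif src_size > target[path]:
            actions[path] = "COPY_TAIL"  # relation grew on the source: copy the tail
        else:
            actions[path] = "NONE"       # changed blocks come from the WAL scan instead
    for path in target:
        if path not in source:
            actions[path] = "REMOVE"     # exists only on the old master
    return actions


source_files = {"postgresql.conf": 100, "base/1/12454_fsm": 10,
                "base/12726/12475": 16384, "base/1/999": 8192}
target_files = {"postgresql.conf": 90, "base/12726/12475": 8192,
                "base/1/999": 8192, "pg_stat/global.stat": 5}
file_map = plan_actions(source_files, target_files)
```

With these made-up sizes, `base/12726/12475` comes out as `COPY_TAIL` and `pg_stat/global.stat` as `REMOVE`, matching the shape of the slide's listing.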
![Page 25: pg rewind - Heikki Linnakangas' blog | Mostly PostgreSQL ...hlinnaka.iki.fi/presentations/NordicPGDay2015-pg_rewind.pdf · In PostgreSQL 9.5 Changed WAL record format in 9.5 – to](https://reader030.fdocuments.us/reader030/viewer/2022020214/5a78f0387f8b9adb5a8b7f24/html5/thumbnails/25.jpg)
How it works? 4. Reset the control file
● Start recovery from the point of divergence, not some later checkpoint.
[Diagram: timelines TLI 1 and TLI 2 with INSERT #1, INSERT #2 and INSERT #3; a CHECKPOINT is marked]
![Page 26: pg rewind - Heikki Linnakangas' blog | Mostly PostgreSQL ...hlinnaka.iki.fi/presentations/NordicPGDay2015-pg_rewind.pdf · In PostgreSQL 9.5 Changed WAL record format in 9.5 – to](https://reader030.fdocuments.us/reader030/viewer/2022020214/5a78f0387f8b9adb5a8b7f24/html5/thumbnails/26.jpg)
How it works? 5. Replay new WAL
● On first startup (not by pg_rewind)
[Diagram: timelines TLI 1 and TLI 2 with INSERT #1, INSERT #2 and INSERT #3; the REDO point is marked]
![Page 27: pg rewind - Heikki Linnakangas' blog | Mostly PostgreSQL ...hlinnaka.iki.fi/presentations/NordicPGDay2015-pg_rewind.pdf · In PostgreSQL 9.5 Changed WAL record format in 9.5 – to](https://reader030.fdocuments.us/reader030/viewer/2022020214/5a78f0387f8b9adb5a8b7f24/html5/thumbnails/27.jpg)
Usage
Usage: pg_rewind [OPTION]...
Options:
  -D, --target-pgdata=DIRECTORY  existing data directory to modify
      --source-pgdata=DIRECTORY  source data directory to sync with
      --source-server=CONNSTR    source server to sync with
  -P, --progress                 write progress messages
  -n, --dry-run                  stop before modifying anything
      --debug                    write a lot of debug messages
  -V, --version                  output version information, then exit
  -?, --help                     show this help, then exit
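For scripted failover handling, the options above can be assembled programmatically. The helper below is a hypothetical wrapper (the name `build_pg_rewind_cmd` and its defaults are assumptions); it only builds the argument list and defaults to a dry run so that nothing is modified by accident.

```python
from typing import List


def build_pg_rewind_cmd(target_pgdata: str, source_server: str,
                        dry_run: bool = True, progress: bool = False) -> List[str]:
    """Build an argv list for pg_rewind using the options from the usage text."""
    cmd = [
        "pg_rewind",
        f"--target-pgdata={target_pgdata}",
        f"--source-server={source_server}",
    ]
    if progress:
        cmd.append("--progress")
    if dry_run:
        cmd.append("--dry-run")  # stop before modifying anything
    return cmd


cmd = build_pg_rewind_cmd(
    target_pgdata="datamaster",
    source_server="host=localhost port=5433 dbname=postgres",
)
```

The resulting list could be handed to `subprocess.run(cmd, check=True)`; once the dry run looks right, the same call with `dry_run=False` would perform the rewind.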
![Page 28: pg rewind - Heikki Linnakangas' blog | Mostly PostgreSQL ...hlinnaka.iki.fi/presentations/NordicPGDay2015-pg_rewind.pdf · In PostgreSQL 9.5 Changed WAL record format in 9.5 – to](https://reader030.fdocuments.us/reader030/viewer/2022020214/5a78f0387f8b9adb5a8b7f24/html5/thumbnails/28.jpg)
Example
$ pg_rewind --source-server="host=localhost port=5433 dbname=postgres" --target-pgdata=datamaster
The servers diverged at WAL position 0/3000060 on timeline 1.
Rewinding from last common checkpoint at 0/2000060 on timeline 1
Done!
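The WAL positions in this output are written as two hexadecimal numbers, the high and low 32 bits of a 64-bit byte position. A short sketch (the helper name `parse_lsn` is made up here) shows how far apart the last common checkpoint and the point of divergence are in this example:

```python
def parse_lsn(text: str) -> int:
    """Convert an LSN written as 'HIGH/LOW' (both hex) to a byte position."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


diverged = parse_lsn("0/3000060")
checkpoint = parse_lsn("0/2000060")
gap_bytes = diverged - checkpoint
print(gap_bytes // (1024 * 1024), "MiB")  # prints "16 MiB"
```

So in this run the checkpoint lies 16 MiB of WAL before the point where the servers diverged.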
![Page 29: pg rewind - Heikki Linnakangas' blog | Mostly PostgreSQL ...hlinnaka.iki.fi/presentations/NordicPGDay2015-pg_rewind.pdf · In PostgreSQL 9.5 Changed WAL record format in 9.5 – to](https://reader030.fdocuments.us/reader030/viewer/2022020214/5a78f0387f8b9adb5a8b7f24/html5/thumbnails/29.jpg)
Example: --progress
$ pg_rewind --progress --source-server="host=localhost port=5433 dbname=postgres" --target-pgdata=datamaster
connected to remote server
The servers diverged at WAL position 0/3000060 on timeline 1.
Rewinding from last common checkpoint at 0/2000060 on timeline 1
reading source file list
reading target file list
reading WAL in target
Need to copy 51 MB (total source directory size is 67 MB)
53071/53071 kB (100%) copied
creating backup label and updating control file
Done!
![Page 30: pg rewind - Heikki Linnakangas' blog | Mostly PostgreSQL ...hlinnaka.iki.fi/presentations/NordicPGDay2015-pg_rewind.pdf · In PostgreSQL 9.5 Changed WAL record format in 9.5 – to](https://reader030.fdocuments.us/reader030/viewer/2022020214/5a78f0387f8b9adb5a8b7f24/html5/thumbnails/30.jpg)
Example: clean failover
$ pg_rewind --source-server="host=localhost port=5433 dbname=postgres" --target-pgdata=datamaster
The servers diverged at WAL position 0/4000098 on timeline 1.
No rewind required.
![Page 31: pg rewind - Heikki Linnakangas' blog | Mostly PostgreSQL ...hlinnaka.iki.fi/presentations/NordicPGDay2015-pg_rewind.pdf · In PostgreSQL 9.5 Changed WAL record format in 9.5 – to](https://reader030.fdocuments.us/reader030/viewer/2022020214/5a78f0387f8b9adb5a8b7f24/html5/thumbnails/31.jpg)
Caveats
● Must set wal_log_hints=on in postgresql.conf
– before the meteor strikes
– or use checksums (initdb -k)
● Create/drop tablespaces or databases
● All WAL needs to be available in the pg_xlog directories
![Page 32: pg rewind - Heikki Linnakangas' blog | Mostly PostgreSQL ...hlinnaka.iki.fi/presentations/NordicPGDay2015-pg_rewind.pdf · In PostgreSQL 9.5 Changed WAL record format in 9.5 – to](https://reader030.fdocuments.us/reader030/viewer/2022020214/5a78f0387f8b9adb5a8b7f24/html5/thumbnails/32.jpg)
More use cases
● Synchronize new master to old master, instead of the other way 'round
● Synchronize a second standby after failing over
● Rewind back to an earlier base backup
(haven't tested those, might not work currently)
![Page 33: pg rewind - Heikki Linnakangas' blog | Mostly PostgreSQL ...hlinnaka.iki.fi/presentations/NordicPGDay2015-pg_rewind.pdf · In PostgreSQL 9.5 Changed WAL record format in 9.5 – to](https://reader030.fdocuments.us/reader030/viewer/2022020214/5a78f0387f8b9adb5a8b7f24/html5/thumbnails/33.jpg)
Design goals
● Safety
– exit gracefully without modifying target if rewind is not possible
– dry-run mode
– unrecognized files are copied in toto
● Ease of use
● Speed
– Faster than reading through all data
![Page 34: pg rewind - Heikki Linnakangas' blog | Mostly PostgreSQL ...hlinnaka.iki.fi/presentations/NordicPGDay2015-pg_rewind.pdf · In PostgreSQL 9.5 Changed WAL record format in 9.5 – to](https://reader030.fdocuments.us/reader030/viewer/2022020214/5a78f0387f8b9adb5a8b7f24/html5/thumbnails/34.jpg)
pg_rewind – for 9.3 and 9.4
Stand-alone versions available for 9.3 and 9.4
● https://github.com/vmware/pg_rewind
● PostgreSQL-licensed
![Page 35: pg rewind - Heikki Linnakangas' blog | Mostly PostgreSQL ...hlinnaka.iki.fi/presentations/NordicPGDay2015-pg_rewind.pdf · In PostgreSQL 9.5 Changed WAL record format in 9.5 – to](https://reader030.fdocuments.us/reader030/viewer/2022020214/5a78f0387f8b9adb5a8b7f24/html5/thumbnails/35.jpg)
In PostgreSQL 9.5
● Changed WAL record format in 9.5
– to support pg_rewind among other things
![Page 36: pg rewind - Heikki Linnakangas' blog | Mostly PostgreSQL ...hlinnaka.iki.fi/presentations/NordicPGDay2015-pg_rewind.pdf · In PostgreSQL 9.5 Changed WAL record format in 9.5 – to](https://reader030.fdocuments.us/reader030/viewer/2022020214/5a78f0387f8b9adb5a8b7f24/html5/thumbnails/36.jpg)
pg_rewind – current status
Patch submitted for 9.5
● http://www.postgresql.org/message-id/54FDA80[email protected]
● In current commitfest
● Will go to src/bin/pg_rewind (not contrib)
![Page 37: pg rewind - Heikki Linnakangas' blog | Mostly PostgreSQL ...hlinnaka.iki.fi/presentations/NordicPGDay2015-pg_rewind.pdf · In PostgreSQL 9.5 Changed WAL record format in 9.5 – to](https://reader030.fdocuments.us/reader030/viewer/2022020214/5a78f0387f8b9adb5a8b7f24/html5/thumbnails/37.jpg)
Thank you!
● Are you hiring?
● Questions?