PostgreSQL Write-Ahead Log (Heikki Linnakangas)

Post on 15-Jan-2015

2.102 views 0 download

Tags:

description

 

Transcript of PostgreSQL Write-Ahead Log (Heikki Linnakangas)

Copyright 2009 EnterpriseDB Corporation. All rights Reserved.<presentation title goes here, edit in the slide master > Slide: 1

PostgreSQL Write-Ahead LogHeikki Linnakangas

Copyright 2009 EnterpriseDB Corporation. All rights Reserved.<presentation title goes here, edit in the slide master > Slide: 2

What is Write-Ahead Log?

• Also known as the transaction log or redo log• Practically every database management system

has one• Also used by journaling file systems, transaction

managers etc.

Copyright 2009 EnterpriseDB Corporation. All rights Reserved.<presentation title goes here, edit in the slide master > Slide: 3

Write-Ahead Log concepts

• A transaction log is a sequence of log records• One log record for every change• Each WAL record is assigned an LSN (Log

Sequence Number)

Copyright 2009 EnterpriseDB Corporation. All rights Reserved.<presentation title goes here, edit in the slide master > Slide: 4

PostgreSQL Write-Ahead-Log

• Introduced in version 7.1 by Vadim B. Mikheev– Released in 2001

• REDO log only– No UNDO log– Instantaneous rollbacks– No limit on transaction size

Copyright 2009 EnterpriseDB Corporation. All rights Reserved.<presentation title goes here, edit in the slide master > Slide: 5

PostgreSQL Example

INSERT INTO tbl (id, data) VALUES (123, 'foo')

@ 0/D023940: xid 734: Heap - insert: rel 1663/12040/16402; tid 0/2@ 0/D023988: xid 734: Btree - insert: rel 1663/12040/16404; tid 1/2@ 0/D0239D0: xid 734: Transaction - commit: 2012-03-30 19:08:10.262228+03

LOG: xlog flush request 0/D023A00

Copyright 2009 EnterpriseDB Corporation. All rights Reserved.<presentation title goes here, edit in the slide master > Slide: 6

PostgreSQL WAL structure

• WAL is divided into WAL segments– Each segment is a file in pg_xlog directory

$ ls ­l pgsql/data/pg_xlog/

­rw­­­­­­­ 1 heikki heikki      239 2009­11­03 10:39 000000010000000000000034.00000020.backup

­rw­­­­­­­ 1 heikki heikki 16777216 2009­11­03 15:26 00000001000000000000003A

­rw­­­­­­­ 1 heikki heikki 16777216 2009­11­03 15:26 00000001000000000000003B

­rw­­­­­­­ 1 heikki heikki 16777216 2009­11­03 10:39 00000001000000000000003C

­rw­­­­­­­ 1 heikki heikki 16777216 2009­11­03 13:49 00000001000000000000003D

­rw­­­­­­­ 1 heikki heikki 16777216 2009­11­03 15:14 00000001000000000000003E

drwx­­­­­­ 2 heikki heikki      160 2009­11­03 15:26 archive_status

Copyright 2009 EnterpriseDB Corporation. All rights Reserved.<presentation title goes here, edit in the slide master > Slide: 7

Tools

• Developer tools– WAL_DEBUG compile option– Xlogdump http://xlogviewer.projects.postgresql.org/

Copyright 2009 EnterpriseDB Corporation. All rights Reserved.<presentation title goes here, edit in the slide master > Slide: 8

WAL-first rule

• The log record for an operation is always written to disk before the affected data pages– WAL first rule!

• In case of a crash, the WAL is replayed to reconstruct the unsaved changes

Copyright 2009 EnterpriseDB Corporation. All rights Reserved.<presentation title goes here, edit in the slide master > Slide: 9

Checkpoints

• WAL grows indefinitely• Checkpoints allow us to truncate it

1. Flush all data pages to disk2. Write a checkpoint record3. Truncate away old WAL

Copyright 2009 EnterpriseDB Corporation. All rights Reserved.<presentation title goes here, edit in the slide master > Slide: 10

WAL filenames

00000001000000000000003E

• Timeline ID is used to distinguish WAL generated before and after a PITR recovery

Log id Seg no

LSNTLI

Copyright 2009 EnterpriseDB Corporation. All rights Reserved.<presentation title goes here, edit in the slide master > Slide: 11

Crash-safety

• Many database actions involve modifying more than one page.– Example: Splitting an index page.

• Writing two data pages A and B is not atomic.• One write must happen before the other.

– We need to give the operating system and disk controller freedom to reorder the writes; we don't even know which one will happen first.

• We could crash in between.

Copyright 2009 EnterpriseDB Corporation. All rights Reserved.<presentation title goes here, edit in the slide master > Slide: 12

WAL Summary

Write-Ahead Logging is critical for:

• Durability– Once you commit, your data is safe

• Consistency– No corruption on crash

Copyright 2009 EnterpriseDB Corporation. All rights Reserved.<presentation title goes here, edit in the slide master > Slide: 13

Advanced features

But there's more!

The Write-Ahead Log also allows:

• Online backup• Point-in-time Recovery• Replication

Copyright 2009 EnterpriseDB Corporation. All rights Reserved.<presentation title goes here, edit in the slide master > Slide: 14

WAL Archiving

• Normally, old WAL is deleted at a checkpoint.• But it can also be archived

• Archive can be anything– Directory on another server– Tape drive

Copyright 2009 EnterpriseDB Corporation. All rights Reserved.<presentation title goes here, edit in the slide master > Slide: 15

WAL Archiving

In your postgresql.conf file:

archive_mode=onwal_level=archive

archive_command = 'cp %p /mnt/walarchive/%f'

Restart server, and it will copy all WAL files to /mnt/walarchive as they're generated

Copyright 2009 EnterpriseDB Corporation. All rights Reserved.<presentation title goes here, edit in the slide master > Slide: 16

WAL archive

~$ ls -l /mnt/walarchive/

yhteensä 114688

-rw------- 1 heikki heikki 16777216 30.3. 18:11 000000010000000000000001

-rw------- 1 heikki heikki 16777216 30.3. 18:11 000000010000000000000002

-rw------- 1 heikki heikki 16777216 30.3. 18:11 000000010000000000000003

-rw------- 1 heikki heikki 16777216 30.3. 18:11 000000010000000000000004

-rw------- 1 heikki heikki 16777216 30.3. 18:11 000000010000000000000005

-rw------- 1 heikki heikki 16777216 30.3. 18:11 000000010000000000000006

-rw------- 1 heikki heikki 16777216 30.3. 18:11 000000010000000000000007

Copyright 2009 EnterpriseDB Corporation. All rights Reserved.<presentation title goes here, edit in the slide master > Slide: 17

Online Backup

• Normally, you need to shut down the database while you take a backup– or use a filesystem snapshot (e.g ZFS or a SAN)

• WAL archiving makes that unnecessary

Copyright 2009 EnterpriseDB Corporation. All rights Reserved.<presentation title goes here, edit in the slide master > Slide: 18

Online Backup

1. SELECT pg_start_backup()2. rsync / tar / whatever3. SELECT pg_stop_backup();4. Include all archived WAL files from archive

directory in the backup

Copyright 2009 EnterpriseDB Corporation. All rights Reserved.<presentation title goes here, edit in the slide master > Slide: 19

Online backup: Step 1

/mnt$ ls ­l

yhteensä 8

drwxr­xr­x  2 heikki heikki 4096 30.3. 18:11 archivedir

drwx­­­­­­ 14 heikki heikki 4096 30.3. 18:10 data

/mnt$ psql ­c "SELECT pg_start_backup('my first backup')"

 pg_start_backup 

­­­­­­­­­­­­­­­­­

 0/C000020

(1 row)

Copyright 2009 EnterpriseDB Corporation. All rights Reserved.<presentation title goes here, edit in the slide master > Slide: 20

Online backup: Step 2

/mnt$ tar czf ~/backup.tar.gz data/mnt$ ls -l ~/backup.tar.gz-rw-r--r-- 1 heikki heikki 39670143 30.3. 18:33 /home/heikki/backup.tar.gz

Copyright 2009 EnterpriseDB Corporation. All rights Reserved.<presentation title goes here, edit in the slide master > Slide: 21

Online backup: Step 3

/mnt$ psql -c "SELECT pg_stop_backup()"NOTICE: pg_stop_backup complete, all required WAL segments have been archived pg_stop_backup ---------------- 0/C0000A8(1 row)

Copyright 2009 EnterpriseDB Corporation. All rights Reserved.<presentation title goes here, edit in the slide master > Slide: 22

Online backup: Step 4

/mnt$ tar czf ~/backup-xlog.tar.gz archivedir/*

/mnt$ ls -l /home/heikki/backup*.tar.gz-rw-r--r-- 1 heikki heikki 39670143 30.3. 18:33 /home/heikki/backup.tar.gz-rw-r--r-- 1 heikki heikki 41937910 30.3. 18:38 /home/heikki/backup-xlog.tar.gz

Copyright 2009 EnterpriseDB Corporation. All rights Reserved.<presentation title goes here, edit in the slide master > Slide: 23

Point-in-Time Recovery

postgres=# DROP TABLE customers;DROP TABLE

Oops!

• Once you've enabled WAL archiving and taken a backup, you can use the archived WAL to restore to any point in time

Copyright 2009 EnterpriseDB Corporation. All rights Reserved.<presentation title goes here, edit in the slide master > Slide: 24

Point-in-Time Recovery

Recovery.conf:

# By default, recovery will rollforward to the end of the WAL log.# If you want to stop rollforward at a specific point, you# must set a recovery target....#recovery_target_name = '' # e.g. 'daily backup 2011-01-26'##recovery_target_time = '' # e.g. '2004-07-14 22:39:00 EST'##recovery_target_xid = ''##recovery_target_inclusive = true

Copyright 2009 EnterpriseDB Corporation. All rights Reserved.<presentation title goes here, edit in the slide master > Slide: 25

Replication

• You can transmit WAL as it is generated to a standby server– Replication!

• This can be done using the WAL archive, or by streaming directly from the server over a TCP connection

• Standby server can be used for read-only queries (Hot Standby)

Copyright 2009 EnterpriseDB Corporation. All rights Reserved.<presentation title goes here, edit in the slide master > Slide: 26

Thank you

Write-Ahead Logging makes it possible for a database to maintain consistency. It also allows many advanced features: online backup, PITR and replication

Questions?