PostgreSQL Replication and Some Test Scenarios

Post on 15-Jan-2017

POSTGRESQL REPLICATION
NUR AGUS SURYOKO

SIMPLISTIC WAY TO REPLICATE POSTGRES

Primary DB -> Standby DB

1. Backup
2. Rsync
3. Restore

THIS METHOD IS…
• It works, yes
• But:
  • As the database grows, replication gets slower
  • When the duration of backup + rsync + restore exceeds the replication interval: disaster
  • It relies heavily on scripts
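The failure condition above is plain arithmetic and can be sketched as a shell check. The function name cycle_fits and the example durations are hypothetical, chosen only for illustration; they are not measurements from a real system.

```shell
# cycle_fits BACKUP RSYNC RESTORE INTERVAL   (all values in seconds)
# Hypothetical helper: succeeds when one full backup + rsync + restore
# cycle finishes within the replication interval.
cycle_fits() {
    total=$(( $1 + $2 + $3 ))
    [ "$total" -le "$4" ]
}

# Made-up figures: 20 min backup, 25 min rsync, 20 min restore, hourly schedule
if cycle_fits 1200 1500 1200 3600; then
    echo "cycle fits in the interval"
else
    echo "cycle overruns the interval: the next run starts before this one ends"
fi
```

With these figures the cycle takes 3900 seconds against a 3600 second interval, which is exactly the situation the slide warns about.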

PROPOSAL

Postgres Replication: Asynchronous, Hot-Standby

IN ONE SENTENCE

The same idea as Oracle Data Guard and DB2-HADR

AT A GLANCE

Primary DB (Archive-Log) -> Standby DB (Archive-Log)

1. SQL Connection Open

2. Restore Database + Copy Archive Log

3. Replication in Sync

REMEMBER DB2-HADR? JUST THE SAME

The asynchronous one

YEAH, BABY!

READ-ONLY QUERY ON STANDBY

The same capability as Oracle Active Data Guard and DB2-HADR RoS (Read on Standby)

LET’S GET TO WORK
2 nodes
Installation: download the EnterpriseDB PostgreSQL binary distribution; it saves a lot of work
Conventions:
  Node1 hostname: userver
  Node2 hostname: userver2

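Create sample table:
  -- uuid_generate_v4() is provided by the uuid-ossp extension
  CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
  CREATE TABLE test (
    id uuid DEFAULT uuid_generate_v4() NOT NULL,
    waktu timestamp with time zone DEFAULT now() NOT NULL,  -- "waktu" is Indonesian for "time"
    text character varying(255)
  );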

CONTINUOUS
Create this script:
  #!/bin/bash
  # Insert one row per second so there is always fresh data to replicate
  while :
  do
    psql -d test -c "insert into test(text) values('X')"
    sleep 1
  done

Run this script on node1 (primary).
You see what we are trying to do here…

NETWORKS
• Maintain /etc/hosts on each node so that the nodes can ping each other
• Configure an SSH key for user postgres on each node to allow passwordless ssh

PRIMARY SETUP
  psql# create user replicator replication login encrypted password 'Initial1';

Keep the name. We don’t want any trouble. Believe me, I tried

PRIMARY SETUP
edit postgresql.conf:
  wal_level = hot_standby
  fsync = on
  synchronous_commit = local
  wal_sync_method = fsync
  wal_compression = on
  archive_mode = on
  archive_command = 'test ! -f /opt/postgresql/9.5/backup/archive/%f && cp %p /opt/postgresql/9.5/backup/archive/%f'
  max_wal_senders = 8
  wal_keep_segments = 24
  wal_sender_timeout = 10s
  hot_standby = on

Be careful with the settings highlighted in red on the original slide; adjust them to your environment
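The archive_command above copies a finished WAL segment into the archive directory only when no file of that name is already there. PostgreSQL substitutes %p with the path of the segment and %f with its file name, and treats a non-zero exit status as a failure to be retried. A minimal sketch of that behaviour, run in a scratch directory instead of the real archive (the function name archive_one and the sample file are assumptions for illustration):

```shell
# archive_one SRC_PATH ARCHIVE_DIR
# Mirrors the archive_command: SRC_PATH plays the role of %p and its
# base name plays the role of %f. Fails if the file is already archived.
archive_one() {
    src=$1
    dir=$2
    name=$(basename "$src")
    test ! -f "$dir/$name" && cp "$src" "$dir/$name"
}

# Demo in a temporary directory; the segment name is only an example
work=$(mktemp -d)
mkdir "$work/archive"
echo "wal-data" > "$work/000000010000000000000001"

archive_one "$work/000000010000000000000001" "$work/archive" \
    && echo "first attempt: archived"
archive_one "$work/000000010000000000000001" "$work/archive" \
    || echo "second attempt: refused, file already in the archive"
rm -rf "$work"
```

Refusing to overwrite protects an existing archive from being clobbered, for example when two servers are accidentally pointed at the same directory.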

PRIMARY SETUP
  as root: mkdir -p /opt/postgresql/9.5/backup/archive
  as root: mkdir -p /opt/postgresql/9.5/backup
  as root: chown -R postgres:postgres /opt/postgresql/9.5/backup
restart postgres primary
ensure no error in $PGDATA/pg_log

PRIMARY SETUP
edit pg_hba.conf:
  host all replicator 192.168.56.105/32 md5
  host replication replicator 192.168.56.105/32 md5
restart postgres primary
ensure no error in $PGDATA/pg_log

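Why two entries? Because "replication" in the database column is not a database name. It is a special keyword that matches replication connections, which the "all" entry does not cover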

STANDBY SETUP
test connection to primary, using the replicator user created earlier:
  psql -h userver -d test -U replicator -W

STANDBY SETUP
edit postgresql.conf:
  wal_level = hot_standby
  fsync = on
  synchronous_commit = local
  wal_sync_method = fsync
  wal_compression = on
  archive_mode = on
  archive_command = 'test ! -f /opt/postgresql/9.5/backup/archive/%f && cp %p /opt/postgresql/9.5/backup/archive/%f'
  max_wal_senders = 8
  wal_keep_segments = 24
  wal_sender_timeout = 10s
  hot_standby = on

STANDBY SETUP
destroy db on standby
  root: rm -rf /opt/postgresql/9.5/data
  root: mkdir -p /opt/postgresql/9.5/data
  root: chown -R postgres:postgres /opt/postgresql/9.5/data
  root: chmod -R 700 /opt/postgresql/9.5/data

Destroy db, not the engine

backup and direct restore from primary:
  pg_basebackup -h userver -D /opt/postgresql/9.5/data -U replicator -v -P

STANDBY SETUP
prepare file /opt/postgresql/9.5/data/recovery.conf:
  standby_mode = 'on'
  primary_conninfo = 'host=userver port=5432 user=replicator password=Initial1'
  trigger_file = '/home/postgres/trigger.file'
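If the standby is rebuilt often, the three recovery.conf lines above can be written by a small helper instead of by hand. This is only a sketch: write_recovery_conf is a hypothetical name, and the chmod 600 is an added precaution (the file holds a password), not a step from the slides. The demo writes into a temporary directory, not the real data directory.

```shell
# write_recovery_conf DATA_DIR PRIMARY_HOST REPL_USER REPL_PASSWORD TRIGGER_FILE
# Hypothetical helper: writes DATA_DIR/recovery.conf with the same three
# settings shown on the slide, then restricts its permissions.
write_recovery_conf() {
    cat > "$1/recovery.conf" <<EOF
standby_mode = 'on'
primary_conninfo = 'host=$2 port=5432 user=$3 password=$4'
trigger_file = '$5'
EOF
    chmod 600 "$1/recovery.conf"
}

# Demo with the values used in this deck
demo=$(mktemp -d)
write_recovery_conf "$demo" userver replicator Initial1 /home/postgres/trigger.file
cat "$demo/recovery.conf"
rm -rf "$demo"
```

On the real standby the first argument would be /opt/postgresql/9.5/data, and the helper would be run as user postgres after pg_basebackup finishes.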

start standby
check the test table: its data should be updated periodically

STANDBY SETUP
check replication status (run the query on the primary):
  test=# select * from pg_stat_replication;
sync_state must be 'sync'
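The same check can be scripted for monitoring. The sketch below assumes the query is narrowed to two columns and run with psql -At, which prints rows as "client_addr|sync_state". The function name standby_sync_state_is and the sample row are hypothetical; 192.168.56.105 is the standby address used in pg_hba.conf.

```shell
# standby_sync_state_is STANDBY_IP EXPECTED
# Reads "client_addr|sync_state" rows on stdin and succeeds only if the
# given standby is listed with the expected sync_state.
standby_sync_state_is() {
    awk -F'|' -v ip="$1" -v want="$2" \
        '$1 == ip && $2 == want { ok = 1 } END { exit !ok }'
}

# On the primary this would be fed by psql (not run here):
#   psql -At -d test -c "select client_addr, sync_state from pg_stat_replication" \
#       | standby_sync_state_is 192.168.56.105 sync

# Made-up sample row standing in for the psql output
printf '192.168.56.105|sync\n' | standby_sync_state_is 192.168.56.105 sync \
    && echo "standby is in sync"
```

An empty result set, as seen when the standby is down, makes the function fail, so the same helper also detects a missing standby.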

SYNC

TEST1: NODE2 SHUTDOWN OS
node1: primary
node2: standby
shutdown node2
transactions on the primary keep going
replication status:
  test=# select * from pg_stat_replication;
  <empty>

Test Result: OK

TEST2: START STANDBY
node1: primary
node2: standby
startup node2
postgres auto-starts after the OS boots, and sync resumes
check replication status:
  test=# select * from pg_stat_replication;
  sync_state = 'sync'

Test Result: OK

TEST3: NORMAL FAILOVER
node1: primary
node2: standby
stop periodic insert
mark last inserted value (primary):
  2668fadd-59dc-468a-ae85-65f9750bc336 | 2015-11-05 22:04:15.658038+07 | X
shutdown node1

TEST3: NORMAL FAILOVER
the standby log reports the lost connection; this is normal:
  2015-11-05 22:14:18 WIB FATAL: could not connect to the primary server: could not connect to server: Connection refused
  Is the server running on host "userver" (192.168.56.104) and accepting
  TCP/IP connections on port 5432?

TEST3: NORMAL FAILOVER
node2 as primary
  touch /home/postgres/trigger.file
check log, this is a successful failover message:
  2015-11-05 22:15:53 WIB LOG: trigger file found: /home/postgres/trigger.file
  2015-11-05 22:15:53 WIB LOG: redo done at 0/10000028
  2015-11-05 22:15:53 WIB LOG: last completed transaction was at log time 2015-11-05 22:04:15.658499+07
  2015-11-05 22:15:53 WIB LOG: selected new timeline ID: 2
  2015-11-05 22:15:54 WIB LOG: archive recovery complete
  2015-11-05 22:15:54 WIB LOG: MultiXact member wraparound protections are now enabled
  2015-11-05 22:15:54 WIB LOG: database system is ready to accept connections
  2015-11-05 22:15:54 WIB LOG: autovacuum launcher started
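Promotion can also be confirmed from a script by scanning the standby log for the two key messages shown above. A rough sketch, assuming the log is a plain text file; the function name promoted is hypothetical, and the sample log reuses lines from this test.

```shell
# promoted LOG_FILE
# Succeeds once the log shows the trigger file being noticed and, after
# that, the server reporting that it accepts connections.
promoted() {
    awk '/LOG: +trigger file found/ { seen = 1 }
         seen && /database system is ready to accept connections/ { ok = 1 }
         END { exit !ok }' "$1"
}

# Demo against a temporary copy of the messages from the slide
log=$(mktemp)
cat > "$log" <<'EOF'
2015-11-05 22:15:53 WIB LOG: trigger file found: /home/postgres/trigger.file
2015-11-05 22:15:53 WIB LOG: selected new timeline ID: 2
2015-11-05 22:15:54 WIB LOG: database system is ready to accept connections
EOF
if promoted "$log"; then
    echo "failover complete"
else
    echo "not promoted yet"
fi
rm -f "$log"
```

Requiring the trigger message first avoids a false positive from an earlier "ready to accept connections" line written when the standby originally started.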

TEST3: NORMAL FAILOVER
check latest inserted data, and compare to marked data:

2668fadd-59dc-468a-ae85-65f9750bc336 | 2015-11-05 22:04:15.658038+07 | X

data match!

ensure /home/postgres/trigger.file is deleted automatically by postgres

Test Result: OK

TEST4: NETWORK DISCONNECT

node1: primary

node2: standby

ensure the database is in sync

cut network connection

node1: primary: not in sync

node2: cannot connect to primary

re-connect network

node1: sync

node2: log report started streaming WAL

Test Result: OK

TEST5: NODE1 SHUTDOWN

node1: primary

node2: standby

shutdown node1: primary

startup node1: primary

after startup, automatically sync

check test database, updated periodically

Test Result: OK

THANK YOU