Fail over, Fail back
Josh Berkus, PostgreSQL Experts Inc., NYC PgDay 2014
mozilla logo is a trademark of the Mozilla corporation. Used here under fair use.
2 servers
1 command
[Flowchart: admin executes failover; connect to master? (yes / no); what error? (no response / other error); shutdown master (fail to shutdown); BRING UP standby; did standby come up? (yes / no); standby is standing by? (yes / no); success]
Automated
Failover
image from libarynth.com. used under creative commons share-alike
www.handyrep.org
Failover
Goals
1. Minimize Downtime
2. Minimize data loss
3. Don't make it worse!
?
Planned
vs.
Emergency
Failover once a quarter
Postgres updates
Kernel updates
Disaster Recovery drills
Just for fun!
Automated or Not?
< 1hr
false failover
testing testing
testing
complex SW
>= 1hr
2am call
training
simple script
sysadmin > software
failover in 3 parts
(1) Detecting Failure
(2) Failing Over DB
(3) Failing Over App
1. Detecting Failure
can't connect to master
could not connect to server: Connection refused
Is the server running on host "192.168.0.1" and accepting TCP/IP
connections on port 5432?
can't connect to master
down?
too busy?
network problem?
configuration error?
can't connect to master
down? failover
too busy? don't fail over
pg_isready
pg_isready -h 192.168.1.150 -p 6433 -t 15
192.168.1.150:6433 - accepting connections
pg_isready
0 == running and accepting connections (even if too busy)
1 == running but rejecting connections (e.g. still starting up)
2 == not responding (down?)
3 == no attempt made (e.g. invalid parameters)
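As a rough sketch of how a failover script might act on these exit codes, the Python below maps a pg_isready return code to a coarse status label. The function names and the status labels are illustrative assumptions, not part of pg_isready or HandyRep; only the -h, -p and -t flags come from the slide.

```python
import subprocess

# Illustrative status labels (assumed names, not defined by pg_isready).
STATUS_BY_EXIT_CODE = {
    0: "accepting",    # running and accepting connections
    1: "rejecting",    # running but rejecting connections
    2: "no_response",  # not responding (down?)
}


def interpret_pg_isready(exit_code):
    """Map a pg_isready exit code to a coarse status label."""
    return STATUS_BY_EXIT_CODE.get(exit_code, "unknown")


def poll_master(host, port, timeout_s=15):
    """Run pg_isready against host:port and return a status label.

    Requires the pg_isready binary to be on PATH.
    """
    result = subprocess.run(
        ["pg_isready", "-h", host, "-p", str(port), "-t", str(timeout_s)],
        capture_output=True,
        text=True,
    )
    return interpret_pg_isready(result.returncode)
```

Only "no_response" is a candidate for failover here, and even then it is a reason to run more checks rather than to fail over immediately.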
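more checks
[Flowchart: can ssh to master? no: master is down; failover. yes: postgres processes on master? yes: exit with error. no: attempt restart. restart succeeds: master is OK; no failover. restart fails: check replica.]
check replica
[Flowchart: replica pg_isready? no: exit with error. yes: is replica? yes: OK to failover. no: exit with error.]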
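The two flowcharts can be read as a small decision procedure. The sketch below is one plausible reading, not a definitive one: the branch where a failed restart leads on to failover, and the function and return-value names, are assumptions made for illustration. The inputs are plain booleans so the logic can be followed without a live server.

```python
def verify_master_failure(can_ssh, postgres_running, restart_succeeded):
    """Decide what to do after the master fails its connection poll.

    can_ssh: True if we can ssh to the master host.
    postgres_running: True if postgres processes exist on the master
        (only meaningful when can_ssh is True).
    restart_succeeded: True if an attempted restart worked
        (only meaningful when postgres is not running).
    Returns "failover", "error", or "no_failover".
    """
    if not can_ssh:
        return "failover"      # master is down
    if postgres_running:
        return "error"         # processes exist but we can't connect
    if restart_succeeded:
        return "no_failover"   # master is OK again
    return "failover"          # assumed branch: restart failed


def replica_ok_for_failover(replica_is_ready, is_replica):
    """The replica must answer pg_isready and actually be a replica."""
    return replica_is_ready and is_replica
```

Returning "error" when postgres processes are present reflects the rule that a misconfiguration should stop the script rather than trigger a failover.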
some rules
don't just ping 5432
misconfiguration > downtime
tradeoff:confidence
time to failover
failover time
master poll fail: 1 - 10
ssh master: 1 - 10
attempt restart: 3 - 15
verify replica: 1 - 5
failover: 3 - 20
total: 9 - 60
[Diagram: AppServerOne and AppServerTwo, PARTITION]
[Diagram: AppServerOne and AppServerTwo, Broker]
[Diagram: AppServerOne and AppServerTwo, Proxy]
2. Failing Over the DB
choose a replica target
shutdown the master
promote the replica
verify the replica
remaster other replicas
Choosing a replica
One replica
Designated replica
Furthest ahead replica
One Replica
fail over to it
or don't
well, that's easy
Designated Replica
load-free replica, or
cascade master, or
synchronous replica
Furthest Ahead
Pool of replicas
Least data loss
Least downtime
Other replicas can remaster
but what's furthest ahead?
receive vs. replay
receive == data it has (furthest ahead)
replay == data it applied (most caught up)
receive vs. replay
get the furthest ahead, but not more than 2 hours behind on replay
receive vs. replay
get the furthest ahead, but not more than 1GB behind on replay
timestamp?
pg_last_xact_replay_timestamp()
last transaction commit
not last data
same timestamp, different receive positions
Position?
pg_xlog_location_diff()
compare two XLOG locations
byte position
comparable granularly
Position?
select pg_xlog_location_diff(
    pg_current_xlog_location(),
    '0/0000000');
---------------
701505732608
Position?
rep1: 701505732608
rep2: 701505737072
rep3: 701312124416
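Once every replica reports a byte position, picking the furthest-ahead one is a simple maximum. The sketch below uses the three positions from the slide. The helper names are illustrative, and lsn_to_bytes assumes the PostgreSQL 9.3+ reading of an XLOG location, where the two hex fields are the high and low 32 bits of a 64-bit position.

```python
def lsn_to_bytes(lsn):
    """Convert an XLOG location such as 'A3/55000000' to a byte position.

    Assumes the two hex fields are the high and low 32 bits of a
    64-bit position (the PostgreSQL 9.3+ interpretation).
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) + int(low, 16)


def furthest_ahead(receive_positions):
    """Return the replica name with the largest receive position."""
    return max(receive_positions, key=receive_positions.get)


# Byte positions taken from the slide.
positions = {
    "rep1": 701505732608,
    "rep2": 701505737072,
    "rep3": 701312124416,
}
print(furthest_ahead(positions))  # rep2
```

Under that assumption, the location 'A3/55000000' converts to 701505732608, the position shown for rep1.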
Replay?
more replay == slower promotion
figure out max. acceptable
sacrifice the delayed replica
Replay?
SELECT pg_xlog_location_diff(
    pg_last_xlog_receive_location(),
    pg_last_xlog_replay_location());
---------------
1232132
Replay?
SELECT pg_xlog_location_diff(
    pg_last_xlog_receive_location(),
    pg_last_xlog_replay_location());
---------------
4294967296
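The "furthest ahead, but not more than 1GB behind on replay" rule can be sketched as a filter followed by a maximum. The function name, the input shape and the pairing of positions with lags in the usage note are hypothetical; only the individual numbers come from the slides.

```python
ONE_GB = 1024 ** 3


def choose_failover_target(replicas, max_replay_lag_bytes=ONE_GB):
    """Pick the replica with the most received data among those whose
    replay lag (receive minus replay, in bytes) is within the limit.

    replicas: dict of name -> (receive_bytes, replay_bytes).
    Returns the chosen name, or None if no replica qualifies.
    """
    eligible = {
        name: receive
        for name, (receive, replay) in replicas.items()
        if receive - replay <= max_replay_lag_bytes
    }
    if not eligible:
        return None
    return max(eligible, key=eligible.get)
```

For example, a replica that has received 701505737072 bytes but is 4294967296 bytes behind on replay is skipped in favor of one at 701505732608 that is only 1232132 bytes behind. This is the "sacrifice the delayed replica" case.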
master shutdown
STONITH
make sure master can't restart
or can't be reached
Terminate or Isolate
promotion
pg_ctl promote
make sure it worked
may have to wait
how long?
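One way to "make sure it worked" is to poll the promoted server until it reports that it has left recovery, giving up after a deadline. In the sketch below, is_in_recovery is a caller-supplied function, which would typically run SELECT pg_is_in_recovery() against the replica. The function name and the 60 second default are assumptions, since the slide leaves "how long?" open.

```python
import time


def wait_for_promotion(is_in_recovery, timeout_s=60.0, interval_s=1.0,
                       clock=time.monotonic, sleep=time.sleep):
    """Poll until the server leaves recovery or the timeout expires.

    is_in_recovery: callable returning True while the server is still
        a replica (for example, the result of pg_is_in_recovery()).
    Returns True if promotion was observed, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if not is_in_recovery():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

The clock and sleep parameters exist only so the loop can be exercised without real waiting. A False result means promotion is unconfirmed, so the script should stop and report rather than redirect applications.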
Remastering
remastering pre-9.3
all replicas are set to:
recovery_target_timeline = 'latest'
change primary_conninfo to new master
all must pull from common archive
restart replicas
remastering post-9.3
all replicas are set to:
recovery_target_timeline = 'latest'
change primary_conninfo to new master
restart replicas
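The settings above live in each replica's recovery.conf. A minimal sketch of generating that file's contents for a new master follows. The function name, the example host and the "replicator" user are hypothetical; the sketch only builds the text, leaving the file write and the replica restart to the caller.

```python
def render_recovery_conf(master_host, master_port=5432, user="replicator"):
    """Build recovery.conf text pointing a replica at a new master.

    The user name is a placeholder; use whatever replication role
    the cluster actually has.
    """
    conninfo = f"host={master_host} port={master_port} user={user}"
    lines = [
        "standby_mode = 'on'",
        f"primary_conninfo = '{conninfo}'",
        "recovery_target_timeline = 'latest'",
    ]
    return "\n".join(lines) + "\n"


# Example with a hypothetical new-master address.
print(render_recovery_conf("192.168.1.151"))
```

Generating the file from one function keeps every replica's configuration identical apart from the master address, which is the only value that changes on remastering.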
restart problem
must restart to remaster
not likely to change soon
break connections
vs.
fall behind
3. Application Failover
old master -> new master
for read-write
old replicas -> new replicas
for load balancing
fast: prevent split-brain
CMS method
update Configuration Management System
push change to all application servers
CMS method
slow
asynchronous
hard to confirm 100% complete
network split?
zookeeper method
write new connection config to zookeeper
application servers pull connection info from zookeeper
zookeeper method
asynchronous
or poor response time
delay to verify
network split?
Pacemaker method
master has virtual IP
applications connect to VIP
Pacemaker reassigns VIP on fail
Pacemaker advantages
2-node solution (mostly)
synchronous
fast
absolute isolation
Pacemaker drawbacks
really hard to configure
poor integration with load-balancing
automated failure detection too simple
can't be disabled
proxy method
application servers connect to db via proxies
change proxy config
restart/reload proxies
[Diagram: AppServerOne and AppServerTwo, Proxy]
proxies
pgBouncer
pgPool
HAProxy
Zeus, BigIP, Cisco
FEMEBE
Failback
what?
after failover, make the old master the master again
why?
old master is better machine?
some server locations hardcoded?
doing maintenance on both servers?
why not?
bad infrastructure design?
takes a while?
need to verify old master?
just spin up a new instance?
pg_basebackup
rsync
reduce time/data for old master recopy
doesn't work as well as you'd expect
hint bits
pg_rewind ++
use XLOG + data files for rsync
super fast master resync
pg_rewind --
not yet stable
need to have all XLOGs
doesn't yet support archives
need checksums
or 9.4's wal_log_hints
Automated
Failback
www.handyrep.org
Fork it on Github!
Questions?
github.com/pgexperts/HandyRep
fork it!
Josh Berkus: [email protected]: www.pgexperts.com
Blog: www.databasesoup.com
Copyright 2014 PostgreSQL Experts Inc. Released under the Creative Commons Share-Alike 3.0 License. All images, logos and trademarks are the property of their respective owners and are used under principles of fair use unless otherwise noted.