Open Source Backups For MongoDB - Percona · Open Source Backups For MongoDB David Murphy MongoDB...


Open Source Backups For MongoDB

David Murphy MongoDB Practice Manager


About me

▪ Former NoSQL/MySQL Architect for Electronic Arts (and yes, I probably worked on that game you’re thinking about!)

▪ Original and Lead DBA for ObjectRocket, the high-performance Mongo-as-a-Service offering

▪ Mongo Master Alumni, and one of the early Mongo Masters

▪ Practice Manager for MongoDB @ Percona

▪ 15+ years in MySQL and other RDBMSs


Agenda

•  Today’s typical backup types:

▪  Logical

▪  Snapshot (iSCSI/LVM)

▪  Ops Manager backups

•  Complications when it comes to sharding

•  How to get consistent sharded backups in v3.2+

•  New tool from Percona Labs


Today’s Backup Types

Looking at today’s single-node or replica set backups, and the good and the bad in each


Today’s Tools: Logical Backups

mongodump is almost always the tool used, and it comes with some particular considerations:

▪ You must determine which secondary you want to talk to

- The --host (-h) option points to a single host or to a replica set (using secondary reads)

- It does not protect against lagging secondaries

▪ A single node cannot be backed up consistently!

- Because MongoDB reads are read-uncommitted and a single node has no oplog to replay, the backup is not safe

▪ Restores take a huge amount of time, but the space used is tiny
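
The lagging-secondary concern can be checked before a dump starts. The following is a minimal sketch, not part of the talk: it assumes a status document shaped like the output of the `replSetGetStatus` command, and the host names, times, and the `max_lag_seconds` threshold are made up for illustration. It picks the freshest secondary and builds a `mongodump --oplog` command line for it.

```python
from datetime import datetime, timedelta


def pick_secondary(status, max_lag_seconds=30):
    """Return the name of the freshest SECONDARY, or None if it lags too much.

    `status` is shaped like replSetGetStatus output: a dict with a "members"
    list whose items carry "name", "stateStr" and "optimeDate".
    """
    members = status["members"]
    primary = next(m for m in members if m["stateStr"] == "PRIMARY")
    secondaries = [m for m in members if m["stateStr"] == "SECONDARY"]
    if not secondaries:
        return None
    best = max(secondaries, key=lambda m: m["optimeDate"])
    lag = (primary["optimeDate"] - best["optimeDate"]).total_seconds()
    return best["name"] if lag <= max_lag_seconds else None


def mongodump_command(host, out_dir):
    """Build a mongodump command that also captures the oplog during the dump."""
    return ["mongodump", "--host", host, "--oplog", "--out", out_dir]


# Illustrative status document (hypothetical hosts and times).
now = datetime(2016, 6, 1, 12, 0, 0)
status = {"members": [
    {"name": "db1:27017", "stateStr": "PRIMARY", "optimeDate": now},
    {"name": "db2:27017", "stateStr": "SECONDARY",
     "optimeDate": now - timedelta(seconds=2)},
    {"name": "db3:27017", "stateStr": "SECONDARY",
     "optimeDate": now - timedelta(seconds=600)},
]}
host = pick_secondary(status)
print(mongodump_command(host, "/backups/rs0"))
```

The selection logic is kept as a pure function so it can be exercised without a live replica set; in practice the status document would come from running `replSetGetStatus` against the set.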


Today’s Tools: Snapshot Backups (LVM)

Assuming you are using LVM, there are some considerations. However, backups will always be 100% of the data size and will restore quickly, with no need to “re-hydrate”.

▪ Snapshots can be made instantly

- You must choose which node to take the backup on (people usually create a hidden node for this)

- Must use 100% of the normal space; compression slows down restores

- Needs spare space in the VG for a snapshot volume

- The snapshot COW table will grow until it runs the VG out of space, and then the snapshot will stop

- Serious performance issues will occur while the snapshot is active. You will want to delete the snapshot ASAP, after rsyncing its contents somewhere else

▪ Restores are fast and consistent

- They only take the time needed to copy the files back into place
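
The snapshot cycle described on this slide can be sketched as an ordered list of commands. This is an illustration under assumptions, not a procedure from the talk: the volume group, logical volume, snapshot size, mount point, and destination are hypothetical, and some file systems need extra mount options for snapshot volumes.

```python
def lvm_snapshot_plan(vg, lv, snap_name, snap_size, mount_point, dest):
    """Return the commands, in order, for one LVM snapshot backup cycle."""
    origin = "/dev/{}/{}".format(vg, lv)
    snap = "/dev/{}/{}".format(vg, snap_name)
    return [
        # The snapshot needs spare extents in the VG to hold its COW table.
        ["lvcreate", "--snapshot", "--size", snap_size,
         "--name", snap_name, origin],
        ["mount", "-o", "ro", snap, mount_point],
        # Trailing slash: copy the directory's contents, not the directory.
        ["rsync", "-a", mount_point.rstrip("/") + "/", dest],
        ["umount", mount_point],
        # Drop the snapshot right away to end the write penalty on the origin.
        ["lvremove", "-f", snap],
    ]


# Hypothetical names throughout; print the plan instead of running it.
plan = lvm_snapshot_plan("vg0", "mongodb", "mongodb-snap", "10G",
                         "/mnt/mongodb-snap", "backup-host:/backups/rs0/")
for cmd in plan:
    print(" ".join(cmd))
```

Each entry could be passed to `subprocess.check_call` as root. Keeping the plan as data makes the ordering explicit: the snapshot is removed immediately after the copy, which limits both COW growth and the performance impact.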


Today’s Tools: Snapshot Backups (iSCSI/NFS)

Everything said about LVM regarding instant snapshots and fast, consistent restores applies here. However:

▪ The COW table might no longer be a cost, depending on the NFS/SAN used

▪ Deduplication can be used to help save space

▪ Incremental hourly snapshots might be possible

▪ MongoDB performs poorly on iSCSI by default, and might need tuning

▪ Due to the nature of NFS, MongoDB (especially MMAP) should typically not be run on it in production


Today’s Tools: MongoDB Ops Manager (MOM)

By using this tool, you are choosing NOT to be open source, and you are locked into a vendor!

▪ Initial backups

- Sends docs to the MOM server in 10MB chunks, then sends all oplog changes

- Builds a copy of the DB and applies the oplogs

- Marks this as the first completed backup

▪ Oplog streaming

- Can now simply stream oplog entries and apply them to the backup, like replication does

▪ Snapshots

- At regular points, the current version of the DB copy is cloned

- New oplog entries are applied to only one side

- Gives you snapshots you can return to, perhaps daily or hourly


Complications when it comes to sharding

How do we back up when we are using shards? How do we time things well?


Sharding Complications: Consistency

Different shards will finish different backup types at different times

▪ Logical backups

- Each shard is a different size, so backups will finish at different times

- MongoDB-based dumps will not use --oplog and therefore won’t be consistent at each shard

- As different dumps finish at different times, three questions come up:

- Is the balancer off?

- Are there any migrations running?

- What about new DBs and manual moves?

▪ Binary/snapshot backups

- These worked great in a single replica set, but how do I make them all run at the same time?

- How do I make sure the above questions are answered?
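
The first two questions (is the balancer off, are migrations running) can be sketched as a pre-flight check. This is a hedged illustration, not something shown in the talk: it assumes the 3.x-era layout in which the `config.settings` document with `_id: "balancer"` carries a `stopped` flag and `config.locks` documents carry a `state` greater than 0 while a lock is being taken or held. The function and sample documents are hypothetical, and the third question (new databases and manual moves) is not covered.

```python
def backup_blockers(balancer_setting, locks):
    """List reasons a sharded backup should not start yet.

    balancer_setting: the {_id: "balancer"} document from config.settings,
                      or None if it does not exist.
    locks:            documents read from config.locks.
    """
    reasons = []
    # A missing document, or stopped != True, means the balancer is enabled.
    if not (balancer_setting and balancer_setting.get("stopped") is True):
        reasons.append("balancer is enabled")
    for lock in locks:
        if lock.get("state", 0) > 0:  # lock is being taken or is held
            if lock["_id"] == "balancer":
                reasons.append("balancer round in progress")
            else:
                reasons.append("lock held on " + lock["_id"])
    return reasons


# Hypothetical sample: balancer never disabled, one collection lock held.
print(backup_blockers(None, [{"_id": "shop.orders", "state": 2}]))
```

An empty list means neither condition was detected at the moment the documents were read; it says nothing about changes that start afterwards, which is why the timing questions on this slide remain.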


Sharding Complications: Tools’ Sharding Support

▪ Logical backups

- No native support for consistent sharded backups

▪ Binary/snapshot backups

- No native support for consistent sharded backups

▪ MongoDB Ops Manager

- Only snapshot support, no point-in-time recovery (PITR) support


Consistent Sharded Backups and 3.2+

The new design, with the config servers being a replica set, has solved a very complicated backup issue: point-in-time recovery of a sharded cluster.


What does 3.2 help fix?

3.2 is a HUGE leap forward for operations groups backing up MongoDB. Having the config servers be a replica set allows all parts of the system to be handled as one:

▪  If someone were able to run a snapshot at the same time on all shards and a config server, this wouldn’t be an issue. However, micro time variations could result in missing a change and therefore failing recovery tests.

▪  Previously there was no good way to update each shard to Backup + 1 hour and then update all the config metadata. Now we can say “restore everything to Backup + 1 hour” and know it’s safe and exactly what the system was at that time.

▪  Some more tooling is still needed to constantly capture the oplog for that case, but it’s at least possible to do now.
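
The “Backup + 1 hour” idea can be sketched with mongorestore’s `--oplogReplay` and `--oplogLimit` options: every shard and the config replica set are rolled forward to the same timestamp. This is a sketch under assumptions, not a procedure from the talk. The host strings and directories are hypothetical, and it assumes each dump directory already holds an `oplog.bson` that covers the period up to the target time.

```python
def restore_commands(members, backup_epoch, offset_seconds):
    """Build one mongorestore command per replica set, all with the same limit.

    members maps a host string (shard or config replica set) to the dump
    directory to restore into it.
    """
    limit = backup_epoch + offset_seconds
    return {
        host: ["mongorestore", "--host", host, "--oplogReplay",
               # Oplog entries at or after this timestamp are not applied.
               "--oplogLimit", "{}:0".format(limit), dump_dir]
        for host, dump_dir in members.items()
    }


# Hypothetical cluster: one config replica set and two shards.
cluster = {
    "cfg/cfg1:27019": "/backups/cfg",
    "shard1/s1a:27018": "/backups/shard1",
    "shard2/s2a:27018": "/backups/shard2",
}
for host, cmd in sorted(restore_commands(cluster, 1464782400, 3600).items()):
    print(" ".join(cmd))
```

The point of the sketch is that a single limit is computed once and applied everywhere, including the config servers, which is what the replica-set config servers in 3.2 make possible.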


Percona-Lab’s new backup tool

What if there was a tool that let you point to a replica set or cluster, and it would worry about the backing up of shards, aligning the recovery point, and compressing them into a central logical place? What if that was only the first step, with binary/snapshot backups, incremental backups, and more coming?


IT’S ALL TRUE: mongodb-consistent-backup

What is this new tool?

•  Not officially supported by Percona until it makes it into Percona Tools directly

•  A single-binary Python tool that only needs Python 2.7; it builds its own virtual environment so as not to have complicated dependencies

•  Intelligent enough to detect mongos, become self-recursive and back up all your shards automatically, while being flexible enough to back up just one replica set if you point it at that. Single mongods won’t work with this tool, because mongodump can’t consistently back them up!

•  Ensures all shards’ dump times are consistent with each other by opening oplog tailers to all shards until the last dump finishes

•  At this point it forks:

▪  If 3.2+ - It has also been dumping/tailing the config servers, so everything is consistent :)

▪  If 3.0 or before - It fsync-locks a config server and dumps it at the last moment, with the balancer off for the whole backup
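
One plausible way to align the shards once the last dump finishes is to cut every tailed oplog at the latest timestamp that all tailers have reached. The sketch below illustrates that idea only; it is not taken from the tool’s source. Timestamps are modeled as `(seconds, ordinal)` tuples and the shard names and entries are made up.

```python
def consistent_end(last_tailed):
    """Latest timestamp that every tailed oplog is known to cover."""
    return min(last_tailed.values())


def trim_oplog(entries, end_ts):
    """Keep only the oplog entries at or before the common end point."""
    return [e for e in entries if e["ts"] <= end_ts]


# Hypothetical state when the last dump finishes: each tailer has read
# its shard's oplog up to a slightly different point.
last_tailed = {
    "shard1": (1464782460, 3),
    "shard2": (1464782459, 1),
    "config": (1464782461, 0),
}
end = consistent_end(last_tailed)

shard1_oplog = [
    {"ts": (1464782458, 1), "op": "i"},
    {"ts": (1464782459, 1), "op": "u"},
    {"ts": (1464782460, 3), "op": "d"},
]
print(end, len(trim_oplog(shard1_oplog, end)))
```

Taking the minimum rather than the maximum matters: no shard can be rolled forward past what its tailer actually captured, so the slowest tailer sets the common recovery point.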


Where is it going?

The vision

•  Maturity, to eventually move it into a Percona-supported tool like XtraBackup and Percona Toolkit

▪  Remain 100% open source and free to the community

▪  Community involvement in what features you need, contributing improvements, and reporting bugs

•  Have a daemon process constantly getting oplogs for each shard and storing them as one file per hour per shard

▪  Allow incremental backups and granular-to-the-second recovery, while letting you control the retention based on your budget

•  Uploading to S3, Google Cloud Storage, Azure ZRS, Rackspace Cloud Files and similar services

•  Restore tools to make more automated backups

•  Modular backup methods like LVM, mongodump, iSCSI backups, MongoDB admin commands and more

•  Encryption support

•  Ability to filter some collections/databases out of the backups and restores

•  Offline backup querying
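
The “one file per hour per shard” idea can be sketched as a bucketing function. This is only an illustration of the planned layout: the file naming scheme is hypothetical, and oplog timestamps are modeled as `(seconds, ordinal)` tuples.

```python
import time
from collections import defaultdict


def bucket_name(shard, ts_seconds):
    """File that holds a shard's oplog entries for the UTC hour of ts_seconds."""
    hour = time.strftime("%Y%m%d%H", time.gmtime(ts_seconds))
    return "{}/oplog-{}.bson".format(shard, hour)


def group_by_bucket(shard, entries):
    """Group oplog entries by the hourly file they would be written to."""
    buckets = defaultdict(list)
    for entry in entries:
        buckets[bucket_name(shard, entry["ts"][0])].append(entry)
    return dict(buckets)


# Hypothetical entries straddling an hour boundary.
entries = [{"ts": (3599, 1)}, {"ts": (3600, 1)}, {"ts": (3601, 1)}]
for name, items in sorted(group_by_bucket("shard1", entries).items()):
    print(name, len(items))
```

With hourly files, retention becomes a matter of deleting whole files older than the budget allows, while recovery to the second only needs the files between the base backup and the target time.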


Where do I find it?

https://github.com/Percona-Lab/mongodb_consistent_backup

•  GPL license

•  Community participation is encouraged

•  Very actively developed for use across all our services, but not yet moved into Percona tools (not yet officially supported)

•  All issues go to me and the escalations team for MongoDB @ Percona


Questions? What more would you like to see? Other tools the community needs?

Twitter: @dmurphy_data @percona

GitHub: dbmurphy

mongodb_consistent_backup: http://bit.ly/28InDuI


DATABASE PERFORMANCE MATTERS