Letters from the Trenches: Lessons Learned Taking MongoDB to Production
October 17, 2013
Rick Warren
Traditional Internet Dating Service
Unidirectional User-Defined Criteria
eHarmony Matching
Bidirectional User-Defined Criteria
eHarmony Matching: 3 Parts
1. Bidirectional User-Defined Criteria
2. Research-Based Compatibility Models
3. Machine-Learned Affinity Models
Photo Credits
Magnifying glass: andercismo @ http://www.flickr.com/photos/andercismo/
Machine learning: University of Maryland Press Releases @ http://www.flickr.com/photos/umdnews/
Application: Find Potential Matches
As fast as possible:
1. Find people who meet each other’s preferences
2. Discard combos that violate Compatibility Models
1. Bidirectional User-Defined Criteria
Application: Find Potential Matches
• User attributes in MongoDB
– Replicated
– Sharded
• Data access pattern:
– Read-heavy
– Complex queries
• Java application
Application: Find Potential Matches
• In full production > 6 months
– Following several months of limited production
– Following several months of intensive dev + testing
• No production outages
• MongoDB no longer the thing we worry about most
Lesson: Provision for Success
Fit all data & indexes in memory
– MongoDB storage implemented using memory-mapped files
– Beware under-provisioned VMs
Minimize field names to keep data as small as possible
– “Schema-less records” == “schema repeated millions of times”
– Morphia Java library can help with mapping
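To make the "schema repeated millions of times" point concrete, here is a rough sketch of how much space field names alone can take. It relies on the fact that BSON stores each field name as a NUL-terminated string inside every document. The field names and the document count are hypothetical, chosen only for illustration.

```python
def field_name_bytes(field_names, doc_count):
    """Bytes spent on field names alone across doc_count documents."""
    # BSON writes each field name as a NUL-terminated UTF-8 string
    # in every document, so the cost repeats per record.
    per_doc = sum(len(name.encode("utf-8")) + 1 for name in field_names)
    return per_doc * doc_count


# Hypothetical field names and collection size, for illustration only.
LONG_NAMES = ["minimumPreferredAge", "maximumPreferredAge", "maximumDistanceMiles"]
SHORT_NAMES = ["mna", "mxa", "mxd"]
DOC_COUNT = 10_000_000

saved = field_name_bytes(LONG_NAMES, DOC_COUNT) - field_name_bytes(SHORT_NAMES, DOC_COUNT)
print(f"Bytes saved by shortening names: {saved:,}")
```

A mapping library such as Morphia lets the application keep readable property names in code while storing the short names in the database.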
[Diagram: a cluster of shards, each a replica set (RS) with one primary and two secondaries]
Scale write ops & data volume by adding shards
Scale read ops by adding secondaries
Lesson: Be Ready to Tinker
• Many processes:
– mongod on each node, primary or secondary
– 2 MMS agents
– Plus, if sharding:
• mongos for each app instance
• 3 config servers
• …Each configured separately & differently
– Configuration file
– Manual commands to set up
• Less likely to have DBA support
– …and relational Best Practices may not transfer
Use Puppet, Chef, or similar
– Helps with config files, command-line arguments
– Insufficient for adding secondaries, configuring indexes, etc.
If scripting, use real client driver, not mongo shell
– The mongo shell doesn’t handle output or errors consistently
– Can’t wait in JavaScript
Train your DB/Ops team(s)
– And expect to do more yourself
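One reason to script with a real client driver is that a general-purpose language can wait for a condition and fail loudly when a step goes wrong. The sketch below shows only that polling pattern. `wait_until` and `secondary_caught_up` are hypothetical names, and the check is a stand-in: with a real driver it might inspect the result of the `replSetGetStatus` command, which is not shown here.

```python
import time


def wait_until(check, timeout_s=60.0, interval_s=1.0):
    """Poll check() until it returns True or timeout_s elapses.

    Exceptions raised by check() propagate, so a failed step stops the script.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Stand-in for a real driver call (assumption): succeeds on the third poll.
attempts = {"count": 0}


def secondary_caught_up():
    attempts["count"] += 1
    return attempts["count"] >= 3


ready = wait_until(secondary_caught_up, timeout_s=5.0, interval_s=0.01)
print("secondary ready:", ready)
```

A setup script can then chain steps such as "add secondary, wait until it is ready, build indexes" and stop at the first step that times out.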
Lesson: Shadow Mode Is Your Friend
Test with real production data, conditions, and queries
Measure everything (MMS is a good start, but insufficient)
Kill mongod instances to verify resiliency
Primary school enrollment, Armenia: http://data.worldbank.org/country/armenia
[Diagram: Real Events & Requests feed both the Real Application and the “Shadow” Application]
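A minimal sketch of the shadow-mode idea, assuming a single process for simplicity: every real request is also replayed against the shadow application, the user only ever sees the real result, and shadow timings and failures are recorded. All names here (`handle_with_shadow`, the handlers, the metrics list) are hypothetical. A production setup would more likely replay requests asynchronously so the shadow path cannot slow down real traffic.

```python
import time


def handle_with_shadow(request, real_handler, shadow_handler, metrics):
    """Serve request from real_handler and replay it against shadow_handler."""
    start = time.perf_counter()
    result = real_handler(request)
    metrics.append(("real", time.perf_counter() - start, None))

    start = time.perf_counter()
    try:
        shadow_result = shadow_handler(request)
        error = None if shadow_result == result else "mismatch"
    except Exception as exc:
        # Shadow failures are recorded but must never reach the user.
        error = repr(exc)
    metrics.append(("shadow", time.perf_counter() - start, error))
    return result


def real(request):
    return request.upper()


def broken_shadow(request):
    # Simulates the shadow stack failing, e.g. after killing a mongod.
    raise RuntimeError("mongod killed")


metrics = []
answer = handle_with_shadow("match-request", real, broken_shadow, metrics)
print(answer, metrics[1][2])
```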
Lesson: Be Ready to Restore Your Data
• Schemas will change
• Shard key(s) will change
– More on this later…
• You’ll experience MongoDB bugs
Maintain 2nd copy in another format
– Backing source of truth?
– Backup in standard format?
– Second cluster with different version of MongoDB?
Increment DB name with each reload
Automate reload process, and use it
Image credit: http://tutorialphotoshopcs-putradom.blogspot.com/2012/11/create-dramatic-meteor-and-burning-city.html
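One way to implement "increment DB name with each reload" is to keep a version suffix on the database name, so a reload fills a fresh database while the old one keeps serving. The `_vN` naming scheme and the `next_db_name` helper below are assumptions for illustration, not something the slides specify.

```python
import re


def next_db_name(current):
    """Return the database name to use for the next reload."""
    match = re.fullmatch(r"(.+)_v(\d+)", current)
    if match is None:
        # First reload of a name that has no version suffix yet.
        return f"{current}_v1"
    base, version = match.group(1), int(match.group(2))
    return f"{base}_v{version + 1}"


print(next_db_name("users_v7"))
```

An automated reload would load into the new name, verify it, then point the application at it, which also gives a cheap rollback path to the previous database.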
Lesson: Pick a Good Shard Key
1. Distribute Data Volume Evenly
– This is what auto-balancing does for you.
2. Multiply Query Performance
– Isolate queries to 1 shard to multiply read capacity by # of shards.
3. Distribute Workload Evenly
– Conflicts with above!
Image credits:
Jessica Rabbit: http://disney.wikia.com/wiki/Jessica_Rabbit
Steve Urkel: http://celebratingtvandfilmgeeks.wordpress.com/2010/04/25/steve-urkel-the-ultimate-90s-nerd-and-life-lessons/4-steve-urkel/
[Diagram: mongos routing requests to Shard 1 and Shard 2]
DO These Things
Use fields appearing in every query
Choose combo that finely partitions data
Measure relative load across shards
– Consider adding secondaries to loaded shard(s) ONLY
BEWARE These Things
• Including serial numbers (or similar)
• Hashing fields when reads might be a problem
• Mutable fields in shard key: to change a value, remove the document and add it again
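The two warnings about serial numbers and hashing can be illustrated with a toy simulation. This is a simplified model, not MongoDB's actual chunk placement: range sharding is modelled with fixed upper bounds, hashed sharding with an MD5 value taken modulo the shard count, and the shard count and key ranges are made up.

```python
import bisect
import hashlib
from collections import Counter


def range_shard(key, upper_bounds):
    """Range sharding: keys below upper_bounds[i] go to shard i; the rest go to the last shard."""
    return bisect.bisect_right(upper_bounds, key)


def hash_shard(key, shard_count):
    """Hash sharding (simplified): place the key by a stable hash of its value."""
    digest = hashlib.md5(str(key).encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count


SHARDS = 4
UPPER_BOUNDS = [1000, 2000, 3000]  # shard 3 holds every key >= 3000
new_serials = range(5000, 6000)    # freshly issued serial numbers

# Writes: where do 1000 new documents land under each scheme?
range_writes = Counter(range_shard(k, UPPER_BOUNDS) for k in new_serials)
hash_writes = Counter(hash_shard(k, SHARDS) for k in new_serials)

# Reads: how many shards must answer a query over 20 adjacent keys?
query_keys = range(5000, 5020)
range_query_shards = {range_shard(k, UPPER_BOUNDS) for k in query_keys}
hash_query_shards = {hash_shard(k, SHARDS) for k in query_keys}

print("range writes per shard:", dict(range_writes))
print("hash writes per shard:", dict(hash_writes))
print("shards hit by range query:", len(range_query_shards), "vs hashed:", len(hash_query_shards))
```

In this model the serial-number key sends every new write to one shard, while hashing spreads writes but forces a query over adjacent keys to touch several shards, which is the trade-off between goals 2 and 3.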
Summary
1. Provision for Success
2. Be Ready to Tinker
3. Shadow Mode Is Your Friend
4. Be Ready to Restore Your Data
5. Pick a Good Shard Key