Richmond MUG – May 2014

Page 1: Richmond MUG – May 2014

Richmond MUG – May 2014

MongoDB 2.6

Jason Ford – Principal Engineer, Snagajob

Page 2: Richmond MUG – May 2014

MongoDB World!

First National MongoDB Conference

June 23 – 25 in New York City

Use discount code mug_25 to get 25% off registration

Page 3: Richmond MUG – May 2014

Meetup Calendar

- Today: MongoDB 2.6

- July 8: MongoDB World Post-Mortem

- September 9: TBD

- November 4: TBD (Second Anniversary)

Page 4: Richmond MUG – May 2014

Richmond MUG – May 2014

MongoDB 2.6

Jason Ford – Principal Engineer, Snagajob

Page 5: Richmond MUG – May 2014

Overview

- In development for a full year (longer than any prior release)
- First major rewrite of the codebase
  - Including full rewrite of the query engine
- Some significant new features, but the primary goal of the release is a foundation for future development

Page 6: Richmond MUG – May 2014

Read Operations

- Largely transparent
  - New framework is highly extensible
- .maxTimeMS() operator
  - Allows for timeouts on a per-operation basis
  - Great for ad hoc queries
  - Available in all drivers
- Indexes

Page 7: Richmond MUG – May 2014

Indexes

- Background index builds now also run in the background on secondary nodes
- Index builds can resume if interrupted
- Index intersection
  - Great for ad hoc queries
  - Still want dedicated compound indexes for oft-used queries
- dropDups option deprecated

Page 8: Richmond MUG – May 2014

Indexes

- Consider a collection with these indexes:
  - { qty: 1 }
  - { item: 1 }
- Index intersection may be used to support the following query:
  db.orders.find({ item: "abc123", qty: { $gt: 15 } })
  - Emphasis on MAY
  - Single-index queries may be more efficient
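
Conceptually, index intersection means each single-field index yields a set of candidate records and the planner keeps only the records found in both sets. The plain JavaScript sketch below illustrates that idea only. The sample data, the `lookup` helper, and the use of array positions as record IDs are all hypothetical; this is not how MongoDB stores or scans indexes.

```javascript
// Toy collection: the array position stands in for a record ID.
const orders = [
  { item: "abc123", qty: 10 },
  { item: "abc123", qty: 20 },
  { item: "xyz789", qty: 30 },
  { item: "abc123", qty: 16 },
];

// Hypothetical "index scan": IDs of records whose field passes the predicate.
function lookup(field, predicate) {
  const ids = new Set();
  orders.forEach((doc, id) => {
    if (predicate(doc[field])) ids.add(id);
  });
  return ids;
}

// Keep only the IDs present in both candidate sets.
function intersect(a, b) {
  return [...a].filter((id) => b.has(id));
}

const byItem = lookup("item", (v) => v === "abc123"); // candidates from { item: 1 }
const byQty = lookup("qty", (v) => v > 15);           // candidates from { qty: 1 }
const matchingIds = intersect(byItem, byQty);

console.log(matchingIds.map((id) => orders[id]));
```

Both candidate sets have to be produced before they can be intersected, which is one way to see why a single dedicated compound index may still be cheaper for a frequent query.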

Page 9: Richmond MUG – May 2014

Read Operations

- Text Search
  - Beta feature in 2.4, now enabled by default
  - Probably only practical for small collections
  - Indexes are very large
- Query execution framework completely rewritten
  - Query parser, optimizer, cache, etc.
  - Find queries are noticeably faster

Page 10: Richmond MUG – May 2014

Cached Query Plan Interface

- New insight into and control over MongoDB's query execution
- The MongoDB query optimizer has long tried to figure out the most efficient use of indexes on a per-query basis, and to cache the resulting plans
- db.collection.getPlanCache() provides an interface to view and clear stored query strategies by query shape

Page 11: Richmond MUG – May 2014

Cached Query Plan Interface

- db.jobseeker.getPlanCache().help()

Page 12: Richmond MUG – May 2014

Aggregation Framework

- Introduced in 2.2
- Finally seems fully baked in 2.6
- Queries return a cursor
  - Used to return a single document (16MB limit)
- Results can be output to a new collection
  - $out operator

Page 13: Richmond MUG – May 2014

Aggregation Framework

db.jobseeker.aggregate(
  { $project: { _id: 0, alert: '$p.n' } },
  { $unwind: "$alert" },
  { $group: { _id: "$alert", count: { $sum: 1 } } },
  { $out: "alertsummary" }
)

Page 14: Richmond MUG – May 2014

Write Operations

- Insert, Update, Delete completely rewritten to use commands
- Write operations always return a WriteResult object
- Forget about “fire and forget”
  - Even a {w:0} specification sends back a yes/no response

Page 15: Richmond MUG – May 2014

Write Operations

Sample Update Command (db.runCommand):

{
  update: 'collection name',
  updates: [
    { q: { a: 1 }, u: { $inc: { x: 1 } }, multi: true/false, upsert: true/false },
    ...
  ],
  writeConcern: { w: 1, j: true, wtimeout: 1000 },
  ordered: true/false
}

Page 16: Richmond MUG – May 2014

WriteResult Structure

{
  "ok": 1,
  "n": 0,
  "nModified": 1,   // applies only to updates
  "nRemoved": 1,    // applies only to removes
  "writeErrors": [
    {
      "index": 0,
      "code": 11000,
      "errmsg": "insertDocument :: caused by :: 11000 E11000 duplicate key error index: t1.t.$a_1 dup key: { : 1.0 }"
    }
  ],
  writeConcernError: {
    code: 22,
    errInfo: { wtimeout: true },
    errmsg: "Could not replicate operation within requested timeout"
  }
}
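
Because every write now returns a structure like this, client code can branch on it instead of issuing a separate error check. The sketch below is one hedged way to do that in plain JavaScript. The `summarizeWrite` helper and the shape of its return value are hypothetical; the field names and error code 11000 (duplicate key) are taken from the structure on the slide.

```javascript
// Hypothetical helper: condense a write response into a few booleans and indexes.
function summarizeWrite(result) {
  const writeErrors = result.writeErrors || [];
  const wcError = result.writeConcernError;
  return {
    acknowledged: result.ok === 1,
    // Positions (within the submitted batch) of the operations that failed.
    failedOps: writeErrors.map((e) => e.index),
    duplicateKey: writeErrors.some((e) => e.code === 11000),
    // True when the write concern could not be satisfied before wtimeout.
    timedOut: Boolean(wcError && wcError.errInfo && wcError.errInfo.wtimeout),
  };
}

// Sample response shaped like the slide's example (errmsg shortened).
const sampleResult = {
  ok: 1,
  n: 0,
  writeErrors: [{ index: 0, code: 11000, errmsg: "E11000 duplicate key error" }],
  writeConcernError: {
    code: 22,
    errInfo: { wtimeout: true },
    errmsg: "Could not replicate operation within requested timeout",
  },
};

const summary = summarizeWrite(sampleResult);
console.log(summary);
```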

Page 17: Richmond MUG – May 2014

Write Operations

- WriteConcern can be specified on a per-operation basis

db.products.insert(
  { item: "envelopes", qty: 100, type: "Clasp" },
  { writeConcern: { w: "majority", wtimeout: 5000 } }
)

- Field Order
  - _id field will ALWAYS be first
  - Field order will be preserved (unless a field is renamed)

Page 18: Richmond MUG – May 2014

Bulk Write Operations

- All write operations can now happen in bulk
- Super cool fluent API
- Significant performance increase

Page 19: Richmond MUG – May 2014

Bulk Write Operations

OLD WAY

// get cursor
var cursor = db.myCollection.find({}, { _id: 1 }); // returns 100,000 documents

// iterate through and update each document
while (cursor.hasNext()) {
  var doc = cursor.next();
  // the slide used an undefined x here; doc._id is the value the bulk version writes
  db.myCollection.update({ _id: doc._id }, { $set: { up: doc._id } });
}

TIME: 67.4 Seconds

Page 20: Richmond MUG – May 2014

Bulk Write Operations

NEW WAY

// create bulk object
var bulk = db.myCollection.initializeUnorderedBulkOp();

// add update operations to BulkOp
for (var x = 0; x < 100000; x++) {
  bulk.find({ _id: x }).update({ $set: { up: x } });
}

// send update operations to the database
bulk.execute();

TIME: 5.5 Seconds (62 seconds faster)

Page 21: Richmond MUG – May 2014

Storage

- Power of 2 Allocation (introduced in 2.2) now set as the default allocation strategy
  - Each record has a size in bytes that is a power of 2 (e.g. 32, 64, 128, 256, 512 ... 16777216)
  - Smallest allocation size is 32 bytes
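
The rounding rule described above can be sketched in a few lines of plain JavaScript. This is only an illustration of the arithmetic stated on the slide (round up to the next power of 2, with a 32-byte floor). The `powerOf2Allocation` name is hypothetical, and the sketch ignores record header overhead and any upper bound the storage engine applies.

```javascript
const MIN_ALLOCATION = 32; // smallest record size, per the slide

// Round a requested size up to the next power of 2, never below the minimum.
function powerOf2Allocation(bytesNeeded) {
  let size = MIN_ALLOCATION;
  while (size < bytesNeeded) {
    size *= 2;
  }
  return size;
}

for (const n of [20, 32, 33, 100, 1000]) {
  console.log(n, "->", powerOf2Allocation(n));
}
```

A 33-byte record lands in a 64-byte slot, so it can grow by up to 31 bytes before it has to move.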

Page 22: Richmond MUG – May 2014

Storage

- Two advantages/goals:
  1. The limited number of record allocation sizes makes it easier for mongo to reuse existing allocations, reducing fragmentation.
  2. The space allocated for each document is usually larger than the data it holds. This allows documents to grow while minimizing the chance that mongo will need to allocate new space as data is added to a document.

Page 23: Richmond MUG – May 2014

Storage

- Power of 2 sizes replaces the previous “Exact Fit” allocation strategy
  - Exact Fit allocated the exact size needed plus a small (configurable) padding factor
  - Was inefficient for heavy write operations and inefficient for reallocating space

Page 24: Richmond MUG – May 2014

Sharding & Replication

- Ability to merge chunks
  - Chunks must be contiguous
  - Chunks must be on same shard
  - One chunk must be empty
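
The three merge preconditions above can be written as a single check. The sketch below is plain JavaScript with a hypothetical chunk shape (`min`, `max`, `shard`, `docCount`) and a hypothetical `canMergeChunks` helper; it models a single numeric shard key and is not the server's actual validation logic.

```javascript
// Hypothetical chunk description: [min, max) range on the shard key,
// the shard that owns it, and how many documents it holds.
function canMergeChunks(a, b) {
  const contiguous = a.max === b.min || b.max === a.min;
  const sameShard = a.shard === b.shard;
  const oneEmpty = a.docCount === 0 || b.docCount === 0;
  return contiguous && sameShard && oneEmpty;
}

const left = { min: 0, max: 100, shard: "shard0000", docCount: 0 };
const right = { min: 100, max: 200, shard: "shard0000", docCount: 42 };
const elsewhere = { min: 200, max: 300, shard: "shard0001", docCount: 0 };

console.log(canMergeChunks(left, right));      // true
console.log(canMergeChunks(right, elsewhere)); // false: different shards
```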

Page 25: Richmond MUG – May 2014

Sharding & Replication

Ability to remove orphaned documents

- Orphaned documents: documents on a shard that also exist in chunks on other shards as a result of failed migrations or incomplete migration cleanup due to abnormal shutdown
- Delete orphaned documents using cleanupOrphaned to reclaim disk space and reduce confusion.

Page 26: Richmond MUG – May 2014

Sharding & Replication

Ability to remove orphaned documents

- Must be run on the admin db of the primary member of a replica set (NOT mongos)

db.runCommand( {
  "cleanupOrphaned": "test.info",
  "startingFromKey": { x: 10 },
  "secondaryThrottle": true
} )

Page 27: Richmond MUG – May 2014

Security

- Integration (Enterprise Edition Only)
  - Kerberos introduced in 2.4
  - 2.6 adds LDAP and x.509 protocols
- There’s also a Windows Enterprise Edition now
  - Linux Enterprise introduced in 2.4

Page 28: Richmond MUG – May 2014

Security

- User-Defined Roles & Collection-Level Access
  - Before: read-only and full admin were the only options (per database)
- 2.6 adds Role-Based Access Control
  - Separate upgrade
  - Users are granted Roles
  - Roles have Privileges
  - Privileges are an action and a resource
    - ex: Update (action) on product db (resource)
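
The users, roles, and privileges relationship above can be modeled in a few lines. The following plain JavaScript sketch is a hypothetical in-memory model for illustration only: the role names, user names, and the `isAllowed` helper are invented, and a resource is reduced to a database name rather than MongoDB's full resource document.

```javascript
// Roles hold privileges; each privilege pairs an action with a resource.
const roles = {
  productEditor: [{ action: "update", resource: "product" }],
  productReader: [{ action: "find", resource: "product" }],
};

// Users are granted roles by name.
const users = {
  alice: ["productEditor", "productReader"],
  bob: ["productReader"],
};

// A user may perform an action if any granted role has a matching privilege.
function isAllowed(user, action, resource) {
  return (users[user] || []).some((roleName) =>
    (roles[roleName] || []).some(
      (p) => p.action === action && p.resource === resource
    )
  );
}

console.log(isAllowed("alice", "update", "product")); // true
console.log(isAllowed("bob", "update", "product"));   // false
```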

Page 29: Richmond MUG – May 2014

Security

Built-in Database Roles:

- read (read-only access)
- readWrite (CRUD; create, rename, and drop collections; create and drop indexes)
- dbAdmin (read access to system.profile collection – weirdly specific, but ok)
- userAdmin (create and modify roles and users)
- dbOwner (readWrite + dbAdmin + userAdmin)

Page 30: Richmond MUG – May 2014

Security

Built-in Cluster Roles (create on admin DB):

- clusterManager (add/remove shards, change replset and cluster config, manage chunks, etc.)
- clusterMonitor (read access to cluster admin info)
- hostManager (misc admin commands: killOp/shutdown/repairDatabase)
- clusterAdmin (all of the above + dropDatabase)

Page 31: Richmond MUG – May 2014

Security

Other Roles (admin DB):

- backup, restore (mongodump/mongorestore)
- readAnyDatabase, readWriteAnyDatabase, userAdminAnyDatabase, dbAdminAnyDatabase
- root (readWriteAnyDatabase, dbAdminAnyDatabase, userAdminAnyDatabase, clusterAdmin)

Page 32: Richmond MUG – May 2014

Security

Custom Roles:

db.runCommand({
  createRole: "myClusterwideAdmin",
  privileges: [
    { resource: { cluster: true }, actions: [ "addShard" ] },
    { resource: { db: "config", collection: "" }, actions: [ "find", "update", "insert", "remove" ] },
    { resource: { db: "users", collection: "usersCollection" }, actions: [ "update", "insert", "remove" ] },
    { resource: { db: "", collection: "" }, actions: [ "find" ] }
  ],
  roles: [ { role: "read", db: "admin" } ]
})

LOTS of new stuff here – check out documentation

Page 33: Richmond MUG – May 2014

Security

User Creation Example:

use products
db.createUser( {
  "user": "accountAdmin01",
  "pwd": "cleartext password",
  "customData": { employeeId: 12345 },
  "roles": [
    { role: "myClusterwideAdmin", db: "admin" },
    { role: "readAnyDatabase", db: "admin" },
    "readWrite"
  ]
} )

This user has readWrite permissions on the products DB, read permissions on all DBs, and the permissions of the role we created earlier.

Page 34: Richmond MUG – May 2014

Miscellaneous

- $min & $max conditional updates
  - Ex: db.scores.update( { _id: 1 }, { $min: { lowScore: 150 } } )
- Enhancements to 2dsphere indexes
- rs.printReplicationInfo() and rs.printSlaveReplicationInfo(): human-readable helper methods
- mongoexport supports --skip, --limit, --sort
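
To make the "conditional" part of $min and $max concrete: $min writes the new value only when it is lower than the stored one, $max only when it is higher, and either one sets the field if it does not exist yet. The plain JavaScript sketch below imitates that behavior for numbers. The `applyMin` and `applyMax` helpers and the `highScore` field are hypothetical, and the sketch does not model BSON comparison order across mixed types.

```javascript
// Set the field only if it is missing or the new value is lower.
function applyMin(doc, field, value) {
  if (!(field in doc) || value < doc[field]) {
    doc[field] = value;
  }
  return doc;
}

// Set the field only if it is missing or the new value is higher.
function applyMax(doc, field, value) {
  if (!(field in doc) || value > doc[field]) {
    doc[field] = value;
  }
  return doc;
}

const score = { _id: 1, lowScore: 200, highScore: 800 };
applyMin(score, "lowScore", 150);  // 150 < 200, so lowScore becomes 150
applyMin(score, "lowScore", 250);  // 250 > 150, so lowScore is unchanged
applyMax(score, "highScore", 950); // 950 > 800, so highScore becomes 950
console.log(score);
```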

Page 35: Richmond MUG – May 2014

The Future

“You’ll see the benefits in better performance and new innovations. We re-wrote the entire query execution engine to improve scalability, and took our first step in building a sophisticated query planner by introducing index intersection. We’ve made the codebase easier to maintain, and made it easier to implement new features. Finally, MongoDB 2.6 lays the foundation for massive improvements to concurrency in MongoDB 2.8, including document-level locking.”

- Eliot Horowitz, CTO and Co-Founder, MongoDB

Page 36: Richmond MUG – May 2014

Richmond MUG – May 2014

MongoDB 2.6

Jason Ford – Principal Engineer, Snagajob