Indexing and Query Optimizer (Mongo Austin)



Page 1: Indexing and Query Optimizer (Mongo Austin)

Indexing, Query Optimization, the Query Optimizer — MongoAustin

Mathias Stearn

10gen Inc.

[email protected]

@mathias_mongo

February 15, 2011

MongoDB – Indexing and Query Optimiz(ation—er) — MongoAustin

Page 2: Indexing and Query Optimizer (Mongo Austin)

Indexing Basics

Indexes are tree-structured sets of references to your documents.

The query planner can employ indexes to efficiently enumerate and sort matching documents.
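As a rough illustration of "sorted references to documents", the sketch below models an index as a sorted array of [key, position] pairs and uses a binary search to jump to the first match. It is a simplified in-memory model with hypothetical helper names (buildIndex, lowerBound, findByKey), not MongoDB's actual B-tree.

```javascript
// Hypothetical in-memory model of an index; not MongoDB's actual B-tree.
const docs = [
  { _id: 0, user: "carol" },
  { _id: 1, user: "alice" },
  { _id: 2, user: "bob" },
  { _id: 3, user: "alice" },
];

// Build a sorted list of [key, position-in-collection] pairs for one field.
function buildIndex(collection, field) {
  return collection
    .map((doc, pos) => [doc[field], pos])
    .sort((a, b) => (a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : a[1] - b[1]));
}

// Binary search for the first entry whose key is >= the target key.
function lowerBound(index, key) {
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (index[mid][0] < key) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Equality lookup: seek to the first match, then walk while the key still matches.
function findByKey(collection, index, key) {
  const out = [];
  for (let i = lowerBound(index, key); i < index.length && index[i][0] === key; i++) {
    out.push(collection[index[i][1]]);
  }
  return out;
}

const userIndex = buildIndex(docs, "user");
console.log(findByKey(docs, userIndex, "alice").map((d) => d._id));
console.log(userIndex.map(([key]) => key)); // keys come out in sorted order
```

Because the entries are kept in key order, the same structure serves both lookups and sorted enumeration.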


Page 3: Indexing and Query Optimizer (Mongo Austin)

However, indexing strikes people as a gray art

As is the case with relational systems, schema design and indexing go hand in hand...

...but you also need to know about your actual (not just predicted) query patterns.


Page 4: Indexing and Query Optimizer (Mongo Austin)

Some indexing generalities

A collection may have at most 64 indexes.

A query may use only one index (except for the disjuncts of $or queries).

Indexes entail additional work on inserts, updates, deletes.


Page 5: Indexing and Query Optimizer (Mongo Austin)

Creating Indexes

The _id attribute is always indexed. Additional indexes can be created with ensureIndex():

// Create an index on the user attribute
db.collection.ensureIndex({ user : 1 })

// Create a compound index on
// the user and email attributes
db.collection.ensureIndex({ user : 1, email : 1 })

// Create an index on the tags attribute;
// this will index all values in the list
db.collection.ensureIndex({ tags : 1 })

// Create a unique index on the user attribute
db.collection.ensureIndex({ user : 1 }, { unique : true })

// Create an index in the background
db.collection.ensureIndex({ user : 1 }, { background : true })


Page 6: Indexing and Query Optimizer (Mongo Austin)

Index maintenance

// Drops an index on x
db.collection.dropIndex({ x : 1 })

// Drops all indexes except _id
db.collection.dropIndexes()

// Rebuild and compact indexes
db.collection.reIndex()


Page 7: Indexing and Query Optimizer (Mongo Austin)

Indexes are smart about data types and structures

Indexes on attributes whose values are of different types in different documents can speed up queries by skipping documents where the relevant attribute isn’t of the appropriate type.

Indexes on attributes whose values are lists will index each element, speeding up queries that look into these attributes. (You really want to do this for querying on tags.)
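To make "will index each element" concrete, here is a small sketch that produces one index entry per array element (and one per scalar value). The function name multikeyEntries and the sample documents are hypothetical; this is a simplified model of the idea, not the server's implementation.

```javascript
// Simplified model: one index entry per distinct array element, or one per scalar value.
function multikeyEntries(collection, field) {
  const entries = [];
  for (const doc of collection) {
    const value = doc[field];
    const keys = Array.isArray(value) ? value : [value];
    for (const key of new Set(keys)) entries.push({ key, id: doc._id });
  }
  // Keep entries in key order, as an index would.
  return entries.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
}

const posts = [
  { _id: 1, tags: ["mongodb", "indexing"] },
  { _id: 2, tags: ["indexing"] },
  { _id: 3, tags: "misc" },
];

const tagEntries = multikeyEntries(posts, "tags");

// Looking up a single tag only touches the entries for that key.
const idsTagged = (tag) => tagEntries.filter((e) => e.key === tag).map((e) => e.id);

console.log(tagEntries.length);
console.log(idsTagged("indexing"));
```

A document with many tags contributes many entries, which is why list-valued indexes can grow faster than the collection itself.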


Page 8: Indexing and Query Optimizer (Mongo Austin)

When can indexes be used?

In short, if you can envision how the index might get used, it probably is used. These will all use an index on x:

db.collection.find( { x : 1 } )
db.collection.find( { x : { $in : [1,2,3] } } )
db.collection.find( { x : { $gt : 1 } } )
db.collection.find( { x : /^a/ } )
db.collection.count( { x : 2 } )
// distinct() takes the key name first, then an optional query
db.collection.distinct( "x", { x : { $gt : 1 } } )
db.collection.find().sort( { x : 1 } )
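One way to see why an anchored regular expression such as /^a/ can use an index: a fixed prefix corresponds to a contiguous key range in sorted order. The sketch below assumes a non-empty plain-ASCII prefix; prefixRange and scanPrefix are hypothetical helpers, and a real engine also has to deal with escapes and other regex features.

```javascript
// Assumption: non-empty, plain-ASCII prefix. Returns the half-open key range [gte, lt).
function prefixRange(prefix) {
  const last = prefix.charCodeAt(prefix.length - 1);
  const upper = prefix.slice(0, -1) + String.fromCharCode(last + 1);
  return { gte: prefix, lt: upper };
}

// Walk a sorted key list over that range only.
function scanPrefix(sortedKeys, prefix) {
  const { gte, lt } = prefixRange(prefix);
  const matches = [];
  let i = sortedKeys.findIndex((k) => k >= gte); // stands in for a seek into the tree
  if (i === -1) return matches;
  for (; i < sortedKeys.length && sortedKeys[i] < lt; i++) matches.push(sortedKeys[i]);
  return matches;
}

const sortedKeys = ["aardvark", "apple", "banana", "cab", "zebra"];

console.log(prefixRange("a"));
console.log(scanPrefix(sortedKeys, "a"));

// An unanchored pattern has no such range: every key has to be tested.
console.log(sortedKeys.filter((k) => /a/.test(k)).length);
```

In this sample every key contains an "a" somewhere, so the unanchored pattern matches all five keys while the prefix scan stops after two.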


Page 9: Indexing and Query Optimizer (Mongo Austin)

Trickier cases where indexes can be used

db.collection.find({ x : 1 }).sort({ y : 1 })

will use an index on y for sorting, if there’s no index on x. (For this sort of case, use a compound index on both x and y, in that order.)

db.collection.update( { x : 2 } , { x : 3 } )

will use and update an index on x.


Page 10: Indexing and Query Optimizer (Mongo Austin)

Some array examples

The following queries will use an index on x, and will match documents whose x attribute is the array [2,10]:

db.collection.find({ x : 2 })
db.collection.find({ x : 10 })
db.collection.find({ x : { $gt : 5 } })
db.collection.find({ x : [2,10] })
db.collection.find({ x : { $in : [2,5] } })


Page 11: Indexing and Query Optimizer (Mongo Austin)

Geospatial indexes

Geospatial indexes are a sort of special case; the operators that can take advantage of them can only be used if the relevant indexes have been created. Some examples:

db.collection.find({ a : [50, 50] })

finds a document with this point for a.

db.collection.find({ a : { $near : [50, 50] } })

sorts results by distance.

db.collection.find({ a : { $within : { $box : [[40,40],[60,60]] } } })
db.collection.find({ a : { $within : { $center : [[50,50],10] } } })
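The shapes passed to $box and $center above can be read as simple flat-plane tests. The sketch below only illustrates that geometry; withinBox, withinCenter and sortByNear are hypothetical helpers, and treating points on the boundary as inside is an assumption, not a statement about the server's behavior.

```javascript
// Flat (x, y) geometry only; hypothetical helpers, not MongoDB APIs.
// Assumption: points exactly on the boundary count as inside.
function withinBox(point, [[minX, minY], [maxX, maxY]]) {
  const [x, y] = point;
  return x >= minX && x <= maxX && y >= minY && y <= maxY;
}

function withinCenter(point, [[cx, cy], radius]) {
  const [x, y] = point;
  return Math.hypot(x - cx, y - cy) <= radius;
}

// Order points by distance from a center, nearest first (the idea behind $near).
function sortByNear(points, [cx, cy]) {
  const dist = ([x, y]) => Math.hypot(x - cx, y - cy);
  return [...points].sort((p, q) => dist(p) - dist(q));
}

const pts = [[50, 50], [58, 58], [45, 52], [70, 70]];

console.log(pts.filter((p) => withinBox(p, [[40, 40], [60, 60]])));
console.log(pts.filter((p) => withinCenter(p, [[50, 50], 10])));
console.log(sortByNear(pts, [50, 50]));
```

Note that [58, 58] falls inside the box but outside the circle of radius 10, since its distance from the center is a little over 11.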


Page 12: Indexing and Query Optimizer (Mongo Austin)

When indexes cannot be used

Many sorts of negations, e.g., $ne, $not.

Tricky arithmetic, e.g., $mod.

Most regular expressions (e.g., /a/).

Expressions in $where clauses don’t take advantage of indexes. Of course, $where clauses are mostly for complex queries that often can’t be indexed anyway, e.g., “where a > b”. (If these cases matter to you and you can precompute the match, store the result as an additional attribute, index it, and skip the $where clause entirely.)

map/reduce can’t take advantage of indexes (the mapping function is opaque to the query optimizer).

As a rule, if you can’t imagine how an index might be used, it probably can’t!
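The precomputation suggested above for a “where a > b” style query might look like the following sketch. The field name aGtB is hypothetical (the slides only say to store the precomputed match), and the in-memory filter stands in for an ordinary indexed query.

```javascript
// Hypothetical field name `aGtB`: the stored result of the comparison a > b.
function withPrecomputedFlag(doc) {
  return { ...doc, aGtB: doc.a > doc.b };
}

const rows = [{ a: 5, b: 3 }, { a: 1, b: 9 }].map(withPrecomputedFlag);

// An ordinary equality selector now replaces the $where expression.
const selector = { aGtB: true };
const flagged = rows.filter((d) => d.aGtB === selector.aGtB);

console.log(flagged);
```

The trade-off is that the flag has to be recomputed whenever a or b changes, in exchange for a plain equality match that an index on the flag can serve.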


Page 13: Indexing and Query Optimizer (Mongo Austin)

Never forget about compound indexes

Whenever you’re querying on multiple attributes, whether as part of the selector document or in a sort(), compound indexes can be used.


Page 14: Indexing and Query Optimizer (Mongo Austin)

Schema/index relationships

Sometimes the question isn’t “given the shape of these documents, how do I index them?”, but “how might I shape the data so I can take advantage of indexing?”

// Consider a schema that uses a list of
// attribute/value pairs:
db.c.insert({ product : "SuperDooHickey",
              manufacturer : "Foo Enterprises",
              catalog : [ { stock : 50,
                            modtime : '2010-09-02' },
                          { price : 29.95,
                            modtime : '2010-06-14' } ] });

db.c.ensureIndex({ catalog : 1 });

// All attribute queries can use one index.
db.c.find( { catalog : { stock : { $gt : 0 } } } )


Page 15: Indexing and Query Optimizer (Mongo Austin)

Index sizes

Of course, indexes take up space. For many interesting databases, real query performance will depend on index sizes, so it’s useful to see these numbers.

db.collection.stats() shows indexSizes, the size of each index in the collection.

db.stats() includes the total size of all indexes in the database.


Page 16: Indexing and Query Optimizer (Mongo Austin)

explain()

It’s useful to be able to ensure that your query is doing what you want it to do. For this, we have explain(). Query plans that use an index have cursor type BtreeCursor.

db.collection.find({x:{$gt:5}}).explain()
{
  "cursor" : "BtreeCursor x_1",
  ...
  "nscanned" : 100,
  ...
  "n" : 100,
  "millis" : 0,
  ...
}


Page 17: Indexing and Query Optimizer (Mongo Austin)

explain(), continued

If the query plan doesn’t use the index, the cursor type will be BasicCursor.

db.collection.find({x:{$gt:5}}).explain()
{
  "cursor" : "BasicCursor",
  ...
  "nscanned" : 12345,
  ...
  "n" : 100,
  "millis" : 4,
  ...
}
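A quick way to read the two explain() results is to compare nscanned with n. The helper below is a hypothetical sketch that works on plain objects holding only the fields shown in the slides (cursor, nscanned, n); it does not call the database.

```javascript
// Hypothetical helper: summarize an explain()-style object using only the
// fields shown above (cursor, nscanned, n).
function summarizeExplain(plan) {
  const usesIndex = plan.cursor.startsWith("BtreeCursor");
  // Documents scanned per document returned; null when nothing was returned.
  const scannedPerResult = plan.n === 0 ? null : plan.nscanned / plan.n;
  return { usesIndex, scannedPerResult };
}

const indexed = { cursor: "BtreeCursor x_1", nscanned: 100, n: 100, millis: 0 };
const unindexed = { cursor: "BasicCursor", nscanned: 12345, n: 100, millis: 4 };

console.log(summarizeExplain(indexed));
console.log(summarizeExplain(unindexed));
```

A ratio near 1 means the plan looked at little beyond what it returned; the unindexed plan scans the whole collection of 12345 documents to return the same 100.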


Page 18: Indexing and Query Optimizer (Mongo Austin)

Really, compound indexes are important

Try this at home:

1. Create a collection with a few tens of thousands of documents having two attributes (let’s call them a and b).

2. Create a compound index on {a : 1, b : 1}.

3. Do a db.collection.find({a : constant}).sort({b : 1}).explain().

4. Note the explain result’s millis.

5. Drop the compound index.

6. Create another compound index with the attributes reversed. (This will be a suboptimal compound index.)

7. Explain the above query again.

8. The suboptimal index should produce a slower explain result.
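The reason the reversed index does worse in the experiment above can be modeled without a server. In this sketch an index is just an array sorted by its key fields, and the count of entries scanned plays the role of nscanned; buildCompound and findASortB are hypothetical names, and the model ignores everything else a real query plan does.

```javascript
// Simplified model: an index is the collection sorted by its key fields, in order.
function buildCompound(collection, fields) {
  return [...collection].sort((p, q) => {
    for (const f of fields) {
      if (p[f] !== q[f]) return p[f] < q[f] ? -1 : 1;
    }
    return 0;
  });
}

// Answer find({a: value}).sort({b: 1}) from an index, counting entries scanned.
function findASortB(index, fields, value) {
  let scanned = 0;
  const results = [];
  if (fields[0] === "a") {
    // a-leading index: matches are contiguous and already ordered by b.
    let i = index.findIndex((d) => d.a >= value); // stands in for a tree seek
    for (; i !== -1 && i < index.length && index[i].a === value; i++) {
      scanned++;
      results.push(index[i]);
    }
  } else {
    // b-leading index: b order comes for free, but every entry is checked for a.
    for (const d of index) {
      scanned++;
      if (d.a === value) results.push(d);
    }
  }
  return { scanned, results };
}

// 20000 documents: a cycles through 0..99, b is a shuffled unique value.
const data = [];
for (let i = 0; i < 20000; i++) data.push({ a: i % 100, b: (i * 7919) % 20000 });

const good = findASortB(buildCompound(data, ["a", "b"]), ["a", "b"], 42);
const bad = findASortB(buildCompound(data, ["b", "a"]), ["b", "a"], 42);

console.log(good.scanned, bad.scanned);
```

Both orderings return the same 200 documents in b order, but the {a, b} index touches only those 200 entries while the {b, a} index walks all 20000.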


Page 19: Indexing and Query Optimizer (Mongo Austin)

The DB Profiler

MongoDB includes a database profiler that, when enabled, records the timing measurements and result counts in a collection within the database.

// Enable the profiler on this database.
> db.setProfilingLevel(1, 100)
{ "was" : 0, "slowms" : 100, "ok" : 1 }
> db.foo.find({a: { $mod : [3, 0] } });
...
// See the profiler info.
> db.system.profile.find()
{ "ts" : "Thu Nov 18 2010 06:46:16 GMT-0500 (EST)",
  "info" : "query test.$cmd ntoreturn:1
            command: { count: \"foo\",
            query: { a: { $mod: [ 3.0, 0.0 ] } },
            fields: {} } reslen:64 406ms",
  "millis" : 406 }

Page 20: Indexing and Query Optimizer (Mongo Austin)

Query Optimizer

MongoDB’s query optimizer is empirical, not cost-based.

To test query plans, it tries several in parallel and records the plan that finishes fastest.

If a plan’s performance changes over time (e.g., as data changes), the database will reoptimize (i.e., retry all possible plans).
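The "try several plans, remember the winner" idea can be sketched as a race between generators that are stepped in turn. This is a toy model under stated assumptions: scanPlan, racePlans, the work-unit counts, and the planCache keyed by a query-shape string are all hypothetical and do not reflect the server's internals.

```javascript
// Each candidate plan yields once per unit of work and returns its results when done.
function* scanPlan(workUnits, results) {
  for (let i = 0; i < workUnits; i++) yield;
  return results;
}

// Step every candidate in round-robin order; the first plan to finish wins.
function racePlans(plans) {
  const running = Object.entries(plans).map(([name, gen]) => ({ name, gen }));
  if (running.length === 0) throw new Error("no candidate plans");
  for (;;) {
    for (const { name, gen } of running) {
      const step = gen.next();
      if (step.done) return { winner: name, results: step.value };
    }
  }
}

// Hypothetical cache: query shape -> name of the plan that won the race.
const planCache = new Map();

const outcome = racePlans({
  "BtreeCursor x_1": scanPlan(100, ["doc"]),
  "BasicCursor": scanPlan(12345, ["doc"]),
});
planCache.set("{x:{$gt:?}}", outcome.winner);

console.log(outcome.winner);
```

Reoptimizing, in this model, would amount to deleting the cached entry and running the race again once the remembered plan starts performing worse than expected.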


Page 21: Indexing and Query Optimizer (Mongo Austin)

Hinting the query plan

Sometimes, you might want to force the query plan. For this, we have hint().

// Force the use of an index on attribute x:
db.collection.find({x: 1, ...}).hint({x:1})

// Force indexes to be avoided!
db.collection.find({x: 1, ...}).hint({$natural:1})


Page 22: Indexing and Query Optimizer (Mongo Austin)

Going forward

www.mongodb.org — downloads, docs, community

[email protected] — mailing list

#mongodb on irc.freenode.net

try.mongodb.org — web-based shell

10gen is hiring. Email [email protected].

10gen offers support, training, and advising services for MongoDB.
