Optimize drupal using mongo db

Post on 29-Jan-2018

Transcript of Optimize drupal using mongo db

optimize Drupal using mongoDB

agenda - let’s get started

what is the problem that we face (or what we need/want)

sql - relational databases

shift of technologies - noSQL databases

mongoDB - what is mongo

how to install mongo

finally - drupal & mongo

mongoDB module - what we can do with it

benefits

q&a

about me

name: Vladimir Ilic
employed at: OWFG
email: v_ilic@hotmail.com
twitter: burgerboydaddy

Who should be here instead of me

what is the problem that we face, or what we want to achieve with Drupal-based solutions

speed

scale

simplicity

how can we do that? or how to speed up Drupal

server side
- install Varnish for caching pages for anonymous users.
- install a persistent cache system (Memcached, APC, Memcache).
- use a CDN such as Akamai to serve static files (JavaScript, CSS, images).

how to speed up Drupal - code side

- use Pressflow; it allows Varnish to serve cached pages to anonymous users.
- clean Drupal's watchdog table. Every time a watchdog error gets logged, it consumes CPU resources on the web server and database server. It also increases load time significantly.
- implement static and persistent cache strategies until the slow query log comes up clean.
- avoid PHP errors that occur within nested foreach loops at all costs.
- uninstall unused modules.
- turn on caching for Drupal core blocks and Views.

how to speed up Drupal - database side

Drupal sites live or die by their database.
- make sure the tables are properly indexed for faster searching.
- do not store unnecessary records; a 100-node database will always be accessed faster than a 3-million-node database.
- change how/where you store your data.

speed - shift of technologies
- from: Pentium 100MHz, 16MB RAM, 200MB HDD
- to: my cell phone: dual core 1GHz, 1GB RAM, 32GB storage
- or maybe some cloud server with 32+ CPUs, a few TB of RAM, ...

our database technology

1974 - The relational database is created

sql - relational dbs

1979 1982-1996 1995

simplicity?

* joins
* joins
* joins

relational database model for eCommerce app

real life example - sell groceries

Product {
  id:
  UPC:
  brand:
  description:
  MSRP:
  price:
  in-stock:
  ...
  PLU: 4011
  unit of measure: lb
  origin: BC
  seasonal: yes
  ...
}

General Product attributes (id through in-stock)

Item Specific attributes (PLU through seasonal)

real life example - but we also sell books

Product {
  id:
  UPC:
  brand:
  description:
  MSRP:
  price:
  in-stock:
  ...
  author: Isaac Asimov
  title: Foundation’s Edge
  binding: Paperback
  publication date: 1982
  publisher name: Spectra
  number of pages: 480
  ISBN: 0553293389
}

General Product attributes stay the same

Book Specific attributes are different

real life example - oops, we want more

Product {
  id:
  UPC:
  brand:
  description:
  MSRP:
  price:
  in-stock:
  ...
  brand: Lee
  gender: Mens
  make: Vintage
  style: Straight Cut
  length: 34
  width: 42
  color: Black
  material: Cotton
  ...
}

General Product attributes stay the same

Clothing specific attributes are totally different ... and not consistent across brands & make

Now we’re screwed

shift of technologies - db solutions

- from well-established relational databases to NoSQL technologies.
- NoSQL was first introduced as a concept in 1998, but it wasn’t until 2009 that it emerged as a real trend.
- NoSQL solutions aren't replacements for traditional solutions; rather, they address a specific need in addition to what one might get from traditional offerings. 90% of the time you’ll probably implement a hybrid system.
- put simply, NoSQL is about being open to and aware of alternative, existing and additional patterns and tools for managing your data.

shift is happening now
a few years ago (and still now) MySQL was the undisputed king of the open-source database hill. It is still growing with great speed (40% compound annual growth rate),

but there are some competitors that are emerging and growing even faster.

shift is happening now

451 Research notes:

"NoSQL database technologies are largely being adopted for new projects that require additional scalability, performance, relaxed consistency and agility."

in other words: web :-)

the NoSQL ecosystem

what is Not Only SQL?
• Non-Relational
• Distributed
• Open-Source
• Horizontally Scalable
• Schema-Free
• Replication Support
• Simple API
• Eventually Consistent

mongoDB - agile and scalable

DB designed for today
MongoDB (from "humongous") is a scalable, high-performance, open source NoSQL database.

what is mongo
- Horizontally Scalable
- Document Oriented
  { author: "vladimir", date: new Date(), text: "drupal-MongoDB...", tags: ["tech", "database"] }
- High Performance
- Fully Consistent

other features of MongoDB
- Document-based queries: flexible document queries expressed in JSON/JavaScript.
- Map Reduce: flexible aggregation and data processing. Queries run in parallel on all shards.
- GridFS: store files of any size easily.
- Geospatial Indexing: find objects based on location (e.g. find the closest n items to x).
- Many Production Deployments

what is mongo
- Document oriented storage: JSON-style documents with dynamic schemas offer simplicity and power.
- Full index support (+geo): index on any attribute, just like you're used to.
- Replication and High Availability: mirror across LANs and WANs for scale and peace of mind.
- Querying: rich, document-based queries.
- Fast In-Place Updates
- Map/Reduce
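Dynamic schemas are what get us out of the grocery / book / clothing corner from the earlier slides. Here is a minimal sketch in plain JavaScript (no MongoDB server involved) of three differently shaped product documents living side by side. The `products` array stands in for a collection, and the `id` and `price` values are made up for illustration; the other attributes come from the product slides.

```javascript
// One "collection" holding three differently shaped product documents.
// The id and price values are hypothetical; the rest comes from the slides.
const products = [
  { id: 1, price: 0.79, PLU: 4011, unitOfMeasure: 'lb', origin: 'BC', seasonal: true },
  { id: 2, price: 8.99, author: 'Isaac Asimov', title: "Foundation's Edge",
    binding: 'Paperback', pages: 480, ISBN: '0553293389' },
  { id: 3, price: 59.0, brand: 'Lee', gender: 'Mens', style: 'Straight Cut',
    length: 34, width: 42, color: 'Black', material: 'Cotton' },
];

// A shared attribute can be queried across every kind of product...
const underTen = products.filter((p) => p.price < 10).map((p) => p.id);

// ...while a type-specific attribute simply selects the documents that have it.
const bookTitles = products.filter((p) => 'ISBN' in p).map((p) => p.title);

console.log(underTen, bookTitles);
```

No table had to be altered to add the book or the jeans; each document just carries the attributes that make sense for it.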

mongoDB philosophy
- Keep functionality when we can (key/value stores are great, but we need more)
- Non-relational (no joins) makes scaling horizontally practical
- Document data models are good
- Database technology should run anywhere: VMs, cloud, etc.

6 mongoDB concepts
- MongoDB has the same concept of a 'database' with which you are likely already familiar (or a schema for you Oracle folks). Within a MongoDB instance you can have zero or more databases, each acting as high-level containers for everything else.
- A database can have zero or more 'collections'. A collection shares enough in common with a traditional 'table' that you can safely think of the two as the same thing.
- Collections are made up of zero or more 'documents'. Again, a document can safely be thought of as a 'row'.
- A document is made up of one or more 'fields', which you can probably guess are a lot like 'columns'.
- 'Indexes' in MongoDB function much like their RDBMS counterparts.
- 'Cursors' are different than the other five concepts but they are important enough, and often overlooked, that I think they are worthy of their own discussion. The important thing to understand about cursors is that when you ask MongoDB for data, it returns a cursor, which we can do things to, such as counting or skipping ahead, without actually pulling down data.
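The cursor idea is easier to see in a toy model. The sketch below is plain JavaScript and is not how MongoDB implements cursors; `ToyCursor` and its `reads` counter are hypothetical names used only to show that skipping, limiting and counting can be set up before a single document is read.

```javascript
// Hypothetical stand-in for a collection: a plain array of documents.
const starwars = [
  { name: 'C-3PO', homeworld: 'Tatooine' },
  { name: 'R2-D2', homeworld: 'Naboo' },
  { name: 'Chewbacca', homeworld: 'Kashyyyk' },
];

// A toy cursor: it records skip/limit but reads nothing until iterated.
class ToyCursor {
  constructor(docs) {
    this.docs = docs;
    this.skipCount = 0;
    this.limitCount = Infinity;
    this.reads = 0; // how many documents were actually pulled
  }
  skip(n) { this.skipCount = n; return this; }
  limit(n) { this.limitCount = n; return this; }
  count() { return this.docs.length; } // answered without reading documents
  *[Symbol.iterator]() {
    const end = Math.min(this.docs.length, this.skipCount + this.limitCount);
    for (let i = this.skipCount; i < end; i++) {
      this.reads++;
      yield this.docs[i];
    }
  }
}

const cursor = new ToyCursor(starwars).skip(1).limit(1);
const total = cursor.count();       // no documents read yet
const readsBefore = cursor.reads;   // still zero at this point
const names = [...cursor].map((doc) => doc.name); // iteration pulls the data

console.log(total, readsBefore, names, cursor.reads);
```

Only the final spread actually walks the data, which is the point of the sixth concept: building the request and pulling the results are separate steps.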

our eCommerce problem again

{
  "_id": ObjectId("4fc19d5e0ddc54e49b84928c"),
  "customer_id": ObjectId("4fc19d5e5eec78a34f24653d"),
  "state": "cart",
  "line_items": [
    { "UPC": "885909377275", "name": "Tide HE", "quantity": 2, "tax": "HST", "retail_price": 10.99 },
    { "UPC": "2348751987", "name": "bananas", "weight": 3.5, "UOM": "kg", "retail_price": 2.75 }
  ],
  "shipping_address": { "street": "1245 76 Ave.", "city": "Surrey", "province": "BC", "postal code": "V1Q 1K8" },
  "subtotal": 13.74
}

Document
- Analogous to a row in RDBMS
- Represented as JSON (BSON)

Embedding
- Analogous to a foreign key relationship, but the related data is stored inline
- Can be
  - sub objects
  - collections

References
- Analogous to a foreign key
- Think “relationship”

and some basic operations

{
  "_id": ObjectId("4fc19d5e0ddc54e49b84928c"),
  "customer_id": ObjectId("4fc19d5e5eec78a34f24653d"),
  "state": "cart",
  "line_items": [
    { "UPC": "885909377275", "name": "Tide HE", "quantity": 2, "tax": "HST", "retail_price": 10.99 },
    { "UPC": "2348751987", "name": "bananas", "weight": 3.5, "UOM": "kg", "retail_price": 2.75 }
  ],
  "shipping_address": { "street": "12345 76 Ave.", "city": "Surrey", "province": "BC", "postal code": "V1Q 4K0" },
  "subtotal": 13.74
}

Querying
All of the following queries will find the document.

By property value:
db.orders.find({"state": "cart"})

By embedded object property value:
db.orders.find({"shipping_address.province": "BC"})

With comparison operators:
db.orders.find({"subtotal": {$gt: 10}})

Values in collections (implicit “in”):
db.orders.find({"line_items.UPC": "885909377275"})
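To see why all four filters hit the same order, here is a rough sketch of dot-notation matching in plain JavaScript. It is a toy and not MongoDB's query engine: `valuesAt` and `matches` are hypothetical helpers that support only equality and `$gt`, and the `order` object is a trimmed copy of the document on the slide.

```javascript
// Trimmed copy of the order document from the slide.
const order = {
  state: 'cart',
  line_items: [
    { UPC: '885909377275', name: 'Tide HE', quantity: 2, retail_price: 10.99 },
    { UPC: '2348751987', name: 'bananas', weight: 3.5, retail_price: 2.75 },
  ],
  shipping_address: { city: 'Surrey', province: 'BC' },
  subtotal: 13.74,
};

// Collect every value reachable through a dotted path, fanning out over arrays.
function valuesAt(doc, path) {
  let current = [doc];
  for (const key of path.split('.')) {
    current = current
      .flatMap((value) => (Array.isArray(value) ? value : [value]))
      .filter((value) => value !== null && typeof value === 'object' && key in value)
      .map((value) => value[key]);
  }
  return current.flatMap((value) => (Array.isArray(value) ? value : [value]));
}

// Toy matcher: supports plain equality and the $gt operator only.
function matches(doc, filter) {
  return Object.entries(filter).every(([path, condition]) => {
    const values = valuesAt(doc, path);
    if (condition !== null && typeof condition === 'object' && '$gt' in condition) {
      return values.some((value) => value > condition.$gt);
    }
    return values.some((value) => value === condition);
  });
}

const results = [
  matches(order, { state: 'cart' }),
  matches(order, { 'shipping_address.province': 'BC' }),
  matches(order, { subtotal: { $gt: 10 } }),
  matches(order, { 'line_items.UPC': '885909377275' }),
];

console.log(results);
```

The last filter is the implicit "in": the path fans out over every element of `line_items`, and one matching UPC is enough.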

how to install mongo
you can have mongo on virtually any platform (Windows, OS X, Linux, Solaris, FreeBSD)

mongo on Ubuntu in less than 5 min

- add the 10gen GPG key
  sudo apt-key adv --keyserver keyserver.ubuntu.com --recv 7F0CEB10

- open /etc/apt/sources.list (sudo vim /etc/apt/sources.list) and add at the bottom
  deb http://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen

- update the source list and install the package
  sudo apt-get update
  sudo apt-get install mongodb-10gen

basic mongo commands
- db.help() -- also db.help (no parentheses)
- show dbs -- show databases
- use <database> -- for selecting database
- db.getCollectionNames()

Since collections are schema-less, we don't explicitly need to create them. We can simply insert a document into a new collection. To do so, use the insert command, supplying it with the document to insert.

basic mongo commands

> use learn
switched to db learn

> db.getCollectionNames()
[ ]

> db.starwars.insert({name: 'C-3PO', gender: 'robot', position: 'Protocol droid', homeworld: 'Tatooine'})

> db.getCollectionNames()
[ "starwars", "system.indexes" ]

basic mongo commands

> db.starwars.find()
{ "_id" : ObjectId("4fc19d5e0ddc54e49b84928c"), "name" : "C-3PO", "gender" : "robot", "position" : "Protocol droid", "homeworld" : "Tatooine" }

> db.system.indexes.find()
{ "v" : 1, "key" : { "_id" : 1 }, "ns" : "learn.starwars", "name" : "_id_" }

What you're seeing is the name of the index, the database and collection it was created against, and the fields included in the index.

Drupal & mongo
- In Drupal 7, if you use the Field API you want your fields inside MongoDB. If you store them in SQL you cannot query them efficiently.
- Storing data in SQL will create cases where you run into denormalization issues; with mongo that is solved.
- In Drupal 7 everything is an entity: nodes are entities, users are entities, comments are entities.
- We only need a few system tables in MySQL that are hard-wired into Drupal core, and those are cached (memcache); everything else can go to mongo. Both mongo and memcache are easy to scale.

mongoDB module
http://drupal.org/project/mongodb

- mongodb: support library for the other modules.
- mongodb_cache: Store cache items in mongodb.
- mongodb_field_storage: Store the fields in mongodb.
- mongodb_session: Store sessions in mongodb.
- mongodb_block: Store block information in mongodb. Very close to the core block API.
- mongodb_watchdog: Store watchdog messages in mongodb.

EntityFieldQuery Views Backend

http://drupal.org/project/efq_views
This module enables Views to use EntityFieldQuery as the query backend, allowing you to query all defined entity types and their fields, even the ones stored in non-SQL storage such as mongodb.

Load nodes into mongo

<?php
// Connect
$mongo = new Mongo();

// Get the database (it is created automatically)
$db = $mongo->testDatabase;

// Get the collection for nodes (it is created automatically)
$collection = $db->nodes;

// Get a listing of all of the node IDs
$r = db_query('SELECT nid FROM {node}');

// Loop through all of the nodes...
foreach ($r as $row) {
  print "Writing node $row->nid\n";

  // Load each node and convert it to an array.
  $node = (array) node_load($row->nid);

  // Store the node in MongoDB
  $collection->save($node);
}
?>

code sample

# drush php-script mongoimport.php

> use testDatabase;
> db.nodes.find( {title: /Distineo/i} , {title: true}).limit(4);

// how to use mongo in php code
<?php
// Connect
$mongo = new Mongo();

// Write our search filter (same as shell example above)
$filter = array(
  'title' => new MongoRegex('/Distineo/i'),
);

// Run the query, getting only 5 results.
$res = $mongo->testDatabase->nodes->find($filter)->limit(5);

// Loop through and print the title of each article.
foreach ($res as $row) {
  print $row['title'] . PHP_EOL;
}
?>

Where mongo won’t work

Joining across entities
e.g. return the birthday from the profile belonging to the author of the current node
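Without joins, that lookup turns into two separate reads stitched together in application code. A minimal sketch of the pattern in plain JavaScript follows; the `nodes` and `profiles` arrays, their field names and the sample values are all hypothetical stand-ins for two collections.

```javascript
// Hypothetical data: on a real site these would be two separate collections.
const nodes = [{ nid: 1, title: 'Hello mongo', uid: 7 }];
const profiles = [{ uid: 7, birthday: '1970-01-01' }];

// Stand-in for a single-document lookup by one field.
function findOne(collection, field, value) {
  return collection.find((doc) => doc[field] === value) || null;
}

// The "join" is done by hand: node -> author uid -> profile.
function authorBirthday(nid) {
  const node = findOne(nodes, 'nid', nid); // first lookup
  if (node === null) return null;
  const profile = findOne(profiles, 'uid', node.uid); // second lookup
  return profile === null ? null : profile.birthday;
}

const birthday = authorBirthday(1);
console.log(birthday);
```

Every extra hop between entities costs another round trip, which is why this kind of query is the weak spot.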

DEMO

Q & A

Thanks for your patience, enjoy the rest of the day!

:-)