MongoDB and the MEAN Stack

Ger Hartnett & Alan Spencer, MongoDB Dublin

• Fictional story of a startup using MongoDB & the MEAN stack to build an IoT application

• We’ll take a devops perspective - show you what to watch out for with a framework like MEAN

• Tips you can use to help the development team focus on the right things when close to production

• Questions
• How many from operations?
• How many from development?

Overview

Capacity planning/prototyping is a good idea but performance is sensitive to sample test data

The MEAN stack rocks - fast to get started - profiler can help you understand what’s under the hood

Realtime/incremental aggregation works well with IoT workloads - the “MMS approach”

With NodeJS/Express, the number of app servers becomes the bottleneck before MongoDB

Performance tuning patterns apply - “bottleneck whack-a-mole” & “slam-dunk-optimization”

5 Things we Learned

Context: IoT & MEAN

Internet of Things

“The rise of device oriented development … new architectural and workflow challenges … distinctly different from … web and mobile development so far.” - Morten Bagai

Big Data => Humongous Data

Internet of Things

• Bosch: “IoT brings root and branch changes to the world of business”

• Richard Kreuter's Webinar May 2013

• Earlier bootcamp looked at sharding IoT

Photo by jurvetson - Creative Commons Attribution License - http://www.flickr.com/photos/jurvetson/916142

Express - web app framework/router

Angular - browser HTML/JS MVC

Node - javascript application server

MongoDB - the database

MEAN stack

Photo by benmizen - Creative Commons ShareAlike License - http://www.flickr.com/photos/benmizen/9456440635

Valeri Karpov - MongoDB Kernel Tools Team
http://thecodebarbarian.wordpress.com/2013/07/22/introduction-to-the-mean-stack-part-one-setting-up-your-tools/

MEAN.io
http://mean.io

Learn more about MEAN

We invest in technical new hires

Everyone does “bootcamp”

NYC for 2 weeks - product internals

Then work on a longer project 3-4 weeks

In our case: we wanted to do a bit of everything - capacity planning, iterating on user stories, with MongoDB as one component

About MongoDB Bootcamp

The Application


• IoT example 3 from Richard’s Webinar

Location based advertising - IoMT

[Diagram labels: Customer; Advertiser (x3)]

US1 - customer looks for advertisers nearby
US2 - advertiser wants to see how many customers saw offer
US3 - find hot spots where many customers but few advertisers

User Stories - for the application

Photo by consumerist - Creative Commons Attribution License - http://www.flickr.com/photos/consumerist/2158190589

Document / Model / Controller

Document:

    { name: 'Long Hall', pos: [-6.265535, 53.3418364], kind: 'pub' }

Model (advertiser.js):

    AdvertiserSchema = new Schema({
        name: { type: String, default: '' },
        pos: [Number],
        kind: { type: String, default: 'place' }
    });

Controller (advertisers.js):

    exports.all = function(req, res) {
        // Build the geo query from the REST parameters: [lng, lat] and a max distance
        var findQuery = {
            near: [ Number(req.query.lng), Number(req.query.lat) ],
            maxDistance: Number(req.query.dist)
        };
        Advertiser.geoSearch({kind: "pub"}, findQuery,
            function (err, advertisers) {
                // error handling
                res.jsonp(advertisers);
            });
    }

Haystack examples sent us in the wrong direction initially

CRUD interface & Mongoose

CRUD interface
Raised & fixed a bug in Mongoose, pull request merged

5 Things we Learned

MongoDB shell scripts
9 advertisers, small area, distance 10km
MongoDB has 5 kinds of geo query, 3 kinds of geo index
geoSearch (haystack) looked much better than others (our 1st mistake)
TIP: performance is sensitive to test data & query

US1 Initial Measurements

5 Things we Learned

The good thing about frameworks is…
they do lots of things for developers

…and the bad thing about frameworks?
they do lots of things for developers

To find out what’s happening - debug

Console

Mongoose: clients.findOne({ _id: ObjectId("…") })
Mongoose: advertisers.geoHaystack({…[-6.267765, 53.34087]})

We used Express passport-http to add Basic-Digest auth (client id lookup)
It can be hard to figure out what a framework like express/mongoose really does
Tip: mongoose.set('debug', true) - detailed logging

Find out what’s happening - profiler

db.system.profile.find()
{"op":"query", "ns":"tings.clients",...
{"op":"command", "command":{"geoSearch"...
{"op":"update", "ns":"tings.sessions"...

Tip: the MongoDB profiler shows the operations that really happen on the DB; check them with the developers
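A minimal sketch of how profiler output like the above might be collected. It assumes `db` is a database handle as found in the mongo shell; the helper names `startProfiling` and `recentOps` are made up for illustration. Level 2 profiles every operation, so this is meant for development and test systems rather than production.

```javascript
// Hypothetical helpers; `db` is a mongo shell database handle.

// Level 2 records every operation in the capped collection system.profile.
function startProfiling(db) {
  return db.setProfilingLevel(2);
}

// Return the most recent profiled operations, newest first.
function recentOps(db, limit) {
  return db.system.profile
    .find({}, { op: 1, ns: 1, millis: 1, ts: 1 })
    .sort({ ts: -1 })
    .limit(limit || 10)
    .toArray();
}
```

In the shell: call `startProfiling(db)`, hit the REST endpoint once, then call `recentOps(db, 5)`. An entry nobody expected, such as the update on tings.sessions, is worth raising with the developers.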

exports.all = function(req, res) {
    . . .
    req.session = null;
    res.jsonp(advertisers);
}

10% performance improvement

Where did that come from?

Fixing it is not obvious

Back to the application

US1 - customer looks for advertisers nearby
• Need to store customer location

US2 - advertiser wants to see how many customers are nearby

US2 means we built on US1

Photo by consumerist - Creative Commons Attribution License - http://www.flickr.com/photos/consumerist/2158190589

Being a startup we decided to take a naive, pragmatic approach:
• Store all samples
• US2 aggregates on-demand

5 Things we Learned

1 hour of raw samples @ 2k RPS = 7.2M documents
Aggregation on 7.2M raw samples took 1 second on our instances
Significant impact
• Run every 2 seconds

RPS dropped by factor of 4! (single instance)
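The arithmetic on this slide, written out as a small sketch. The function names are illustrative; the inputs (2k RPS, a 1 second aggregation run every 2 seconds) are the figures quoted above.

```javascript
// Documents accumulated when every request stores one raw sample.
function rawSamples(requestsPerSecond, seconds) {
  return requestsPerSecond * seconds;
}

// Fraction of wall-clock time the database spends running the aggregation.
function aggregationDutyCycle(aggregationSeconds, intervalSeconds) {
  return aggregationSeconds / intervalSeconds;
}

const docsPerHour = rawSamples(2000, 60 * 60);   // 7,200,000 documents
const busy = aggregationDutyCycle(1, 2);         // 0.5: aggregating half the time
console.log(docsPerHour, busy);
```

The duty cycle alone does not predict the measured factor-of-4 drop; it only shows that an on-demand aggregation of this size competes with the inserts for a large share of the time.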

US2 - Aggregation of Raw Samples

[Diagram labels: Raw Insert, Samples; Query, Aggregate]

US2 - Pre aggregation

[Diagram labels, on-demand side: Raw Insert, Samples; Query, Aggregate]
[Diagram labels, pre-aggregation side: Pre Aggregate, Update, Samples, Aggregate; Query, Aggregate]

An MMS type approach
Document for advertiser-customer-month
Using update with multi: true (more on this later)
Query now only needs to aggregate unique customers
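One possible shape for the pre-aggregated document, as a hedged sketch. The field names (`advertiser`, `customer`, `month`, `total`, `days`), the per-day counters and the use of an upsert are assumptions for illustration; the slides only state that there is one document per advertiser-customer-month and that the update used multi: true. The function only builds the arguments for an update call, so it runs without a database.

```javascript
// Hypothetical helper: month bucket such as "2014-03" (UTC).
function monthKey(date) {
  const month = String(date.getUTCMonth() + 1).padStart(2, '0');
  return date.getUTCFullYear() + '-' + month;
}

// Build filter/update/options for one sighting of a customer near an advertiser.
function buildPreAggUpdate(advertiserId, customerId, when) {
  const update = { $inc: { total: 1 } };
  update.$inc['days.' + when.getUTCDate()] = 1;   // per-day counter inside the month
  return {
    filter: { advertiser: advertiserId, customer: customerId, month: monthKey(when) },
    update: update,
    options: { upsert: true }   // assumption: create the document on first sighting
  };
}

const spec = buildPreAggUpdate('adv1', 'cust9', new Date(Date.UTC(2014, 2, 15)));
console.log(JSON.stringify(spec));
```

With the Node.js driver this could be passed as `collection.updateOne(spec.filter, spec.update, spec.options)`. Counting customers for US2 then becomes a count of documents for one advertiser and month, rather than a scan over raw samples.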

MongoDB shell scripts
More realistic data - the old measurements repeated locations
110k advertisers with clusters in DUB and NYC
Performance best for near and nearSphere (2x better than Haystack)

US1 measurements revisited
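A sketch of how clustered test data of this kind could be generated. Everything here is an assumption for illustration: the city-centre coordinates, the uniform square spread around each centre, the seeded generator and the function names. The document shape (name, pos as [lng, lat], kind) follows the model shown earlier.

```javascript
// Small deterministic PRNG (mulberry32) so that test runs are repeatable.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6D2B79F5) | 0;
    let t = Math.imul(a ^ (a >>> 15), a | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Approximate city centres as [lng, lat]; assumed values.
const CENTRES = [
  { name: 'DUB', pos: [-6.2603, 53.3498] },
  { name: 'NYC', pos: [-74.0060, 40.7128] }
];
const KINDS = ['pub', 'place'];

// Generate `count` advertisers spread over a square of `spreadDeg` degrees
// around the centres, alternating between them.
function generateAdvertisers(count, spreadDeg, seed) {
  const rand = mulberry32(seed);
  const docs = [];
  for (let i = 0; i < count; i++) {
    const centre = CENTRES[i % CENTRES.length];
    docs.push({
      name: 'advertiser-' + i,
      pos: [
        centre.pos[0] + (rand() - 0.5) * spreadDeg,
        centre.pos[1] + (rand() - 0.5) * spreadDeg
      ],
      kind: KINDS[Math.floor(rand() * KINDS.length)]
    });
  }
  return docs;
}
```

Calling `generateAdvertisers(110000, 0.5, 1)` would give a data set of the size mentioned above; unlike the first 9-advertiser sample, each query point now has many candidates at varying distances.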

• Express/Mongoose/Node
• Customer Lookup
• Find ($near)
• Save Sample DB
• Save Sample File
• Preagg=multiple docs (6)
• Preagg=multi-update 1 doc

Where does the time go?

5 Things we Learned

Deployment

[Diagram components: Chrome Postman, NodeLoad, HAproxy, NodeJS (x3), MongoD (x2)]

Scaling

5 Things we Learned

1 - number of Node.JS

2 - HAproxy

3 - load gen threads/BW

Pattern: “slam dunk optimization”

[Diagram components: Chrome Postman, NodeLoad, HAproxy, NodeJS (x3), MongoD*]

1. Increase number of Node.JS
2. Increase perf of proxy/balancer instance
   HAproxy more balanced than Amazon ELB
3. Tweak Nodeload (generates/measures REST)
   Nodeload concurrency 3x Node servers
   Run Nodeload on same machine as HAproxy

Development recommendation: Postman chrome ext - generates REST / Basic Auth

Performance tips

Back to the application

US3 Overview

What are the top 10 hot sales areas?
• What is an “area”…?

Requirements
• Little impact, easy to calculate
• Approx. regular size
• Optimal approx. distance - “bounding areas”
• Plays nice with sharding

Internals of haystack, 2dsphere? Polygon? MGRS?

US3 - Hot box - Sales, go sell!

• 4QFJ123678 precision level 100m

MGRS - Military Grid Reference System

Image by Mikael Rittri - Creative Commons ShareAlike License http://en.wikipedia.org/wiki/File:MGRSgridHawaiiSchemeAARealigned.png

MGRS - But at the poles…

Image by Mikael Rittri - Creative Commons ShareAlike License http://en.wikipedia.org/wiki/File:MGRSgridNorthPole.png

Introducing the ‘box’

• Reinvented the sphere
• Long/lat -> box number
• Tailored to specific distance
• Boxes are at least 1km
• Search in current and 8 neighbouring boxes
• Filter outside circle in JS
• Performed relatively well
• Can be used to shard

The “box” - the poor-man’s MGRS
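The idea can be sketched roughly as follows. This is not the implementation from the talk: the grid layout, the constants, the 60 degree latitude limit and all function names are assumptions, and wrap-around at the antimeridian and the poles is ignored. Boxes are sized so that each edge is at least 1 km inside the assumed latitude range, which is what makes a search of the current box plus its 8 neighbours sufficient for distances up to 1 km.

```javascript
const EARTH_RADIUS_KM = 6371;
const KM_PER_DEG = Math.PI * EARTH_RADIUS_KM / 180;  // length of one degree of latitude
const MAX_LAT = 60;   // assumed operating range: |lat| <= 60 degrees
const BOX_KM = 1;     // minimum box edge, tailored to the search distance

const LAT_STEP = BOX_KM / KM_PER_DEG;
// A degree of longitude shrinks with cos(lat); sizing the step at MAX_LAT keeps
// every box at least BOX_KM wide anywhere inside the operating range.
const LNG_STEP = BOX_KM / (KM_PER_DEG * Math.cos(MAX_LAT * Math.PI / 180));

// Long/lat -> box number, expressed here as a "row:col" key.
function boxFor(lng, lat) {
  return {
    row: Math.floor((lat + 90) / LAT_STEP),
    col: Math.floor((lng + 180) / LNG_STEP)
  };
}

function boxKey(box) {
  return box.row + ':' + box.col;
}

// Keys of the current box and its 8 neighbours.
function neighbourKeys(lng, lat) {
  const b = boxFor(lng, lat);
  const keys = [];
  for (let dr = -1; dr <= 1; dr++) {
    for (let dc = -1; dc <= 1; dc++) {
      keys.push(boxKey({ row: b.row + dr, col: b.col + dc }));
    }
  }
  return keys;
}

// Great-circle distance between two [lng, lat] points.
function haversineKm(a, b) {
  const rad = function (d) { return d * Math.PI / 180; };
  const dLat = rad(b[1] - a[1]);
  const dLng = rad(b[0] - a[0]);
  const h = Math.pow(Math.sin(dLat / 2), 2) +
    Math.cos(rad(a[1])) * Math.cos(rad(b[1])) * Math.pow(Math.sin(dLng / 2), 2);
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(h));
}

// Select by box first, then filter out everything outside the circle.
function near(advertisers, lng, lat, distKm) {
  if (distKm > BOX_KM) {
    throw new RangeError('3x3 boxes only cover distances up to ' + BOX_KM + ' km');
  }
  const keys = new Set(neighbourKeys(lng, lat));
  return advertisers
    .filter(function (a) { return keys.has(boxKey(boxFor(a.pos[0], a.pos[1]))); })
    .filter(function (a) { return haversineKm([lng, lat], a.pos) <= distKm; });
}
```

In a database-backed version the first filter would presumably become a query on a stored box key, for example with `$in` over the nine keys, leaving only the circle filter in JavaScript; that stored field is also what could serve as a shard key.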

Replication

Impact of Replication

Secondary reads
Worked for this app
Beware - don’t try this at home!

Apply the production notes

Change from default readahead
Disable NUMA & THP
ext4 or XFS, mounted with noatime
Load test workload on different configurations
• Instance Store / EBS (PIOPS)
• SSDs / spinning rust
• AWS instance types

The MEAN stack rocks - fast to get started but profiler can help you understand what’s under the hood

5 Things we Learned

Next Steps

Plan to publish as blog post series and github project
Check blog.mongodb.org
Continue to explore…

Next Steps

Hadoop/YARN for aggregations
Use “box” to geo-shard
Try 2.6 bulk updates
Dynamic angular-google-maps with socket-io
Implement in another framework (Go/Clojure) to load MongoDB with less hardware
Find balance between batch and pre-aggregation (see next slide)

Next Steps - continuation

Introduction to MEAN - Valeri Karpov http://thecodebarbarian.wordpress.com/2013/07/22/introduction-to-the-mean-stack-part-one-setting-up-your-tools/

MEAN.io http://mean.io

Richard Kreuter's webinar - M2M http://www.mongodb.com/presentations/webinar-realizing-promise-machine-machine-m2m-mongodb

Building MongoDB Into Your Internet of Things http://blog.mongohq.com/building-mongodb-into-your-internet-of-things-a-tutorial/

Schema design for time series data (MMS) http://blog.mongodb.org/post/65517193370/schema-design-for-time-series-data-in-mongodb

Learn More & Thank You
