Mongo and Redis

NoSQL MongoDB and Redis as alternatives to traditional RDBMS

description

presentation for Bucharest BigData Meetup, short overview of MongoDB and Redis

Transcript of Mongo and Redis

Page 1: Mongo and Redis

NoSQL: MongoDB and Redis as alternatives to traditional RDBMS

Page 2: Mongo and Redis

Then...

Page 3: Mongo and Redis

...and now

*This thing weighs less than 50g

Page 4: Mongo and Redis

Meaning of NoSQL

1970 = We have no SQL
1980 = Know SQL
2000 = No SQL!
2005 = Not only SQL
2014 = No, SQL

(slide adapted from @markmadsen)

Page 5: Mongo and Redis

MongoDB

Page 6: Mongo and Redis

MongoDB

● It is the “new MySQL”
● Project started in 2007 by 10gen (now MongoDB Inc)
● Cross-platform, open-source
● 5th most used DBMS & most used Document Store* (next Document Store, CouchDB - 21st)

* According to db-engines.com as of Oct 2014

Page 7: Mongo and Redis

Characteristics

● “It's really a hybrid database with features from a few different places.” (Gaetan Voyer-Perrault on Quora)
● Document Oriented but NO SCHEMA!
● Documents grouped in Collections
● Binary JSON (BSON) format
● Load Balancing (automated sharding, sharding key can be user defined)
● Replication (Replica Sets)
● Automated failover
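To make the idea of a user-defined sharding key more concrete, here is a minimal sketch that routes documents to shards by hashing one field. It is only an illustration, not MongoDB's actual chunk-based balancing; the shard count, the `user_id` field and the `shard_for` helper are assumptions made up for this example.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical cluster size


def shard_for(document, shard_key, num_shards=NUM_SHARDS):
    """Pick a shard index from the hashed value of the document's shard key."""
    value = str(document[shard_key]).encode("utf-8")
    digest = hashlib.md5(value).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap to a shard.
    return int.from_bytes(digest[:8], "big") % num_shards


# Hypothetical documents, sharded on the "user_id" field.
docs = [{"user_id": i, "name": f"user{i}"} for i in range(1000)]

placement = {}
for doc in docs:
    placement.setdefault(shard_for(doc, "user_id"), []).append(doc)

for shard, members in sorted(placement.items()):
    print(f"shard {shard}: {len(members)} documents")
```

The same key value always lands on the same shard, which is why the choice of key decides how evenly the load spreads.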

Page 8: Mongo and Redis

Characteristics - continued

● Primary and Secondary Indexes
● JavaScript for UDF
● MapReduce
● Capped Collections
● Aggregation Framework since 2.2
● Ad-hoc Query Support
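Of the features above, capped collections are probably the least self-explanatory: a fixed-size collection where the oldest documents are overwritten once the limit is reached. The sketch below imitates that behaviour in plain Python. It is a simplified model that bounds only the number of documents (real capped collections are bounded by size in bytes), and the `CappedCollection` class is a made-up name for this example.

```python
from collections import deque


class CappedCollection:
    """In-memory stand-in: keeps insertion order, evicts the oldest entries."""

    def __init__(self, max_docs):
        # deque(maxlen=...) silently drops items from the left when full.
        self._docs = deque(maxlen=max_docs)

    def insert(self, doc):
        self._docs.append(doc)

    def find(self):
        """Return documents in insertion order, oldest first."""
        return list(self._docs)


log = CappedCollection(max_docs=3)
for i in range(5):
    log.insert({"seq": i, "msg": f"event {i}"})

recent = log.find()
print([d["seq"] for d in recent])
```

This is why capped collections suit logs: writes stay cheap and the collection never grows past its bound.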

Page 9: Mongo and Redis

Caveats

Page 10: Mongo and Redis

Generic performance tips

● Use 64-bit OS
● Lots of RAM, fast disks (was anyone expecting something else?)
● Ensure that at least indexes + working set fit in RAM (db.stats(), db.<coll>.stats()) - if not, you might want to try TokuMX
● Design for de-normalized data models
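The "indexes + working set fit in RAM" check is simple arithmetic once you have the numbers from db.stats(). A rough sketch follows; `indexSize` and `dataSize` are fields that db.stats() reports in bytes, while the sample values, the working-set fraction and the `fits_in_ram` helper are assumptions for illustration only (MongoDB does not report the working set directly, so it has to be estimated).

```python
GIB = 1024 ** 3


def fits_in_ram(stats, ram_bytes, working_set_fraction):
    """Estimate whether indexes plus the hot part of the data fit in RAM.

    working_set_fraction is a guess at the share of data touched regularly.
    """
    needed = stats["indexSize"] + stats["dataSize"] * working_set_fraction
    return needed <= ram_bytes, needed


# Hypothetical output of db.stats(), reduced to the two fields we need.
stats = {"dataSize": 78 * GIB, "indexSize": 6 * GIB}

ok, needed = fits_in_ram(stats, ram_bytes=8 * GIB, working_set_fraction=0.1)
print(f"need about {needed / GIB:.1f} GiB, fits in 8 GiB: {ok}")
```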

Page 11: Mongo and Redis

Generic performance tips

● Write-Concerns
● Shard early
● Fixed (or at least bounded) record size => better write performance
● Use short attribute names (reduces index & data size, OFC!)
● EXT4 or XFS
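The short-attribute-names tip follows from the fact that a schemaless store keeps the field names inside every single document. The sketch below estimates the saving using compact JSON as a stand-in for BSON (an assumption: BSON has different overhead per value, but field names cost their length in both). The field names and the document count are invented for the example.

```python
import json


def encoded_size(doc):
    """Size in bytes of the document as compact JSON (proxy for BSON)."""
    return len(json.dumps(doc, separators=(",", ":")).encode("utf-8"))


# Same values, long versus short (hypothetical) attribute names.
long_doc = {"customer_first_name": "Ana", "customer_last_name": "Pop", "account_balance": 125}
short_doc = {"fn": "Ana", "ln": "Pop", "bal": 125}

saved_per_doc = encoded_size(long_doc) - encoded_size(short_doc)

doc_count = 130_000_000  # hypothetical collection size
saved_total_gib = saved_per_doc * doc_count / 1024 ** 3

print(f"{saved_per_doc} bytes saved per document")
print(f"about {saved_total_gib:.1f} GiB saved over {doc_count} documents")
```

The trade-off is readability, so a mapping layer in the application is usually needed to keep the code legible.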

Page 12: Mongo and Redis

IRL

● virtualized server 8G RAM, 4 vCPU - no sharding, no replica sets

● 100 inserts/s, 130M doc collection WITH secondary index (avg doc size 0.6k)

● 20 inserts/s 3M doc collection WITH 18 secondary indexes (avg doc size 10k)

Page 13: Mongo and Redis

Use Cases

● Logs
● Location Data (Mongo has built in Geospatial ops)
● Account and User Profiles
● Messaging
● (complex) Config Data
● http://www.mongodb.com/who-uses-mongodb (hint: Expedia, Business Insider, The Weather Channel, Foursquare, eBay)

Page 14: Mongo and Redis

Redis

Page 15: Mongo and Redis

Redis

● Salvatore Sanfilippo (@antirez)
● Started in 2009
● Key-Value Store
● 11th most used DBMS & most used KV Store* (next KV Store, memcached - 19th)
● Sponsored by Pivotal (EMC/VMware spinoff)

* According to db-engines.com as of Oct 2014

Page 16: Mongo and Redis

Characteristics

● Holds all data in memory, persists on disk
● Data Models
  ○ Strings/Blobs/Bit-Maps (not really Bitmaps)
  ○ Hashtables
  ○ Linked Lists
  ○ Sets
  ○ Sorted Sets
● HyperLogLog (2.8.9+ - trade accuracy for memory)
● Master Slave Replication
● High Availability (through Sentinel)
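The "trade accuracy for memory" remark about HyperLogLog is easier to appreciate with a toy implementation. The sketch below is a minimal textbook-style HyperLogLog in Python, not Redis's code: the SHA-1 hashing, the class name and the choice of 2^14 registers are assumptions for this example. It counts distinct items using a fixed array of small registers instead of remembering every item.

```python
import hashlib
import math


class HyperLogLog:
    """Approximate distinct counter using 2**p small registers."""

    def __init__(self, p=14):
        self.p = p
        self.m = 1 << p               # number of registers
        self.registers = [0] * self.m

    def add(self, item):
        # 64-bit hash of the item.
        h = int.from_bytes(hashlib.sha1(item.encode("utf-8")).digest()[:8], "big")
        index = h >> (64 - self.p)                   # first p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)        # remaining 64 - p bits
        # Rank: position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[index]:
            self.registers[index] = rank

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting).
            return self.m * math.log(self.m / zeros)
        return raw


hll = HyperLogLog()
for i in range(10000):
    hll.add(f"visitor-{i}")
    hll.add(f"visitor-{i}")  # duplicates do not change the estimate

estimate = hll.count()
print(f"estimated distinct visitors: {estimate:.0f} (true value: 10000)")
```

The memory used depends only on the number of registers, not on how many items are added; the price is an estimate that is close to, but not exactly, the true count.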

Page 17: Mongo and Redis

Characteristics - continued

● Redis Cluster in the works (not production ready yet) - sharding
  ○ asynchronous replication
  ○ does not guarantee strong consistency (may ‘forget’ writes)
● AOF sync - default once per second (appendfsync everysec)
● Does not support secondary indexes
● Pub/Sub mode since 2.0
● Key expiry
● Server scripting with Lua
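Key expiry is what makes Redis convenient as a cache or session store. The sketch below shows the lazy side of the idea in plain Python: a key with a time-to-live is dropped the next time someone asks for it after its deadline. It is an in-memory model, not Redis itself (Redis also expires keys actively in the background); the `ExpiringStore` class and its injectable clock are assumptions made for this example.

```python
import time


class ExpiringStore:
    """Key-value store where keys may carry a time-to-live in seconds."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._deadlines = {}
        self._clock = clock  # injectable so the behaviour can be demonstrated

    def set(self, key, value, ttl=None):
        self._data[key] = value
        if ttl is None:
            self._deadlines.pop(key, None)   # no TTL: key lives forever
        else:
            self._deadlines[key] = self._clock() + ttl

    def get(self, key):
        deadline = self._deadlines.get(key)
        if deadline is not None and self._clock() >= deadline:
            # Lazy expiry: remove the key when it is read after its deadline.
            self._data.pop(key, None)
            self._deadlines.pop(key, None)
            return None
        return self._data.get(key)


# Demonstration with a fake clock instead of real waiting.
now = [0.0]
store = ExpiringStore(clock=lambda: now[0])
store.set("session:42", "logged-in", ttl=30)

before = store.get("session:42")
now[0] = 31.0
after = store.get("session:42")
print(before, after)
```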

Page 18: Mongo and Redis

IRL

● virtualized server 4G RAM, 1 vCPU
● +50k get/set per second (redis-benchmark)
● only 128 queries out of 1165550375 over 10ms (0.00001%)
  ○ uptime_in_days:439
  ○ used_memory_human:424.09M
  ○ used_memory_peak_human:834.94M
  ○ total_connections_received:1352935
  ○ db0:keys=610884,expires=355397
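The numbers above can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures quoted on this slide; reading the query total as spread evenly over the whole uptime is an assumption, so the per-second average is only indicative.

```python
# Figures quoted on the slide.
slow_queries = 128
total_queries = 1165550375
uptime_days = 439
keys = 610884
keys_with_expiry = 355397

# Share of queries slower than 10 ms, as a percentage.
slow_pct = slow_queries / total_queries * 100

# Average load, assuming the total covers the whole uptime.
avg_queries_per_second = total_queries / (uptime_days * 86400)

# Share of keys that carry an expiry.
expiring_pct = keys_with_expiry / keys * 100

print(f"slow queries: {slow_pct:.6f}%")
print(f"average load: {avg_queries_per_second:.1f} queries/s")
print(f"keys with expiry: {expiring_pct:.1f}%")
```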

Page 19: Mongo and Redis

Generic performance tips

● Use short key names (reduces data size, OFC!)
● You can create secondary indexes (but you have to maintain them, e.g. using SET)
● You can have ad-hoc queries (actually, one kind of query): using SORT
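Maintaining a secondary index by hand means updating two structures on every write. The sketch below models that pattern with plain Python dicts and sets standing in for Redis hashes and sets (in Redis the index updates would be SADD and SREM calls). The `UserStore` class, the `city` field and the sample data are all made up for illustration.

```python
class UserStore:
    """Primary records plus a manually maintained index on the 'city' field."""

    def __init__(self):
        self.users = {}    # primary data: user id -> dict of fields
        self.by_city = {}  # secondary index: city -> set of user ids

    def save(self, user_id, fields):
        old = self.users.get(user_id)
        if old is not None:
            # Remove the stale index entry first (SREM in Redis terms).
            self.by_city[old["city"]].discard(user_id)
        self.users[user_id] = dict(fields)
        # Add the new index entry (SADD in Redis terms).
        self.by_city.setdefault(fields["city"], set()).add(user_id)

    def find_by_city(self, city):
        return sorted(self.by_city.get(city, set()))


store = UserStore()
store.save(1, {"name": "Ana", "city": "Bucharest"})
store.save(2, {"name": "Dan", "city": "Cluj"})
store.save(3, {"name": "Ion", "city": "Bucharest"})
store.save(3, {"name": "Ion", "city": "Cluj"})  # user 3 moves: index must follow

print(store.find_by_city("Bucharest"), store.find_by_city("Cluj"))
```

Forgetting the removal step on update is the typical bug with this approach, which is what "you have to maintain them" means in practice.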

Page 20: Mongo and Redis

Use Cases

● Cache
● IPSS/IPC
● Queue mechanisms (see e.g. Resque)
● Log/Task buffers
● Statistics and aggregation datastore
● (anywhere you use memcached)
● http://redis.io/topics/whos-using-redis (hint: Twitter, GitHub, Snapchat, StackOverflow, among others)

Page 21: Mongo and Redis

Recap

One size does NOT fit all!

Page 22: Mongo and Redis

Further reading

● Must read: http://blog.andreamostosi.name/big-data/ (almost exhaustive list of all things NoSQL and BigData)