Social Gaming & Gambling Summit - London Chris Anderson
Transcript of Social Gaming & Gambling Summit - London Chris Anderson
Chris Anderson (@jchris)
NoSQL Landscape & Grid Computing
Saturday, October 6, 12
Growth is the New Reality
• 2.2 billion internet users
• 50% of Americans use smartphones
• Your app can grow overnight
• Are you ready?
Example: Instagrowth (Android Launch)
• Instagram gained nearly 1 million users overnight when it expanded to Android
• 1 Instagram = 7.5M MAU*
Draw Something - Social Game
• 35 million monthly active users in 1 month: about 5 Instagrams
(Instagram today is waaaay more than 1 Instagram)
Goes Viral 3 Weeks after Launch
[Chart: Draw Something by OMGPOP, daily active users (millions); y-axis 2 to 16, x-axis dates from 2/6 to 3/21]
35+M MAU at peak
By Contrast, at 1/2 an Instagram
[Chart: The Simpsons: Tapped Out, daily active users (millions)]
GETTING IT RIGHT
Scalable Data Layer
● On-demand cluster sizing: grow or shrink with workload
● Easy node provisioning: all nodes are the same
● Multi-master Cross-Datacenter Replication: for a fast and reliable user experience worldwide
● Effective auto-sharding: should avoid cluster hot spots
Old School Hits a Scale Wall
• Application scales out: just add more commodity web servers
• Database scales up: get a bigger, more complex server
• Expensive & disruptive sharding, doesn’t perform at web scale
Traditional MySQL + Memcached Architecture
● Run as many MySQL machines as you need
● Data sharded evenly across the machines using client code
● Memcached used to provide faster response time for users and reduce load on the database
[Diagram: www.example.com, App Servers, Memcached Tier, MySQL Tier]
Limitations of MySQL + Memcached
● To scale you need to start using MySQL more simply
● Scale by hand
● Replication / sharding is a black art
● Code overhead to keep memcached and MySQL in sync
● Lots of components to deploy
Learn from others: this scenario costs time and money. Scaling SQL is potentially disastrous when going viral. It is a very risky time for major code changes and migrations, and you have no time when skyrocketing up.
NoSQL Architectural Promise
• High-performance data access
• Scale up/down horizontally
• 24x7x365 always-on availability
• Flexible schema document model
[Diagram: www.example.com, App Servers, Couchbase Database Servers]
NOSQL TAXONOMY
The Key-Value Store – the foundation of NoSQL
[Diagram: Key → Opaque Binary Value]
Memcached – the NoSQL precursor
[Diagram: Key → Opaque Binary Value]
memcached: “Simple and fast.”
• In-memory only
• Limited set of operations
  • Blob storage: Set, Add, Replace, CAS
  • Retrieval: Get
  • Structured data: Append, Increment
• Challenges: cold cache, disruptive elasticity
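The memcached operation list (Set, Add, Replace, CAS, Get, Append, Increment) can be sketched as a tiny in-memory store. This only illustrates the semantics; it is not memcached's implementation or any client library's API. The class name `TinyCache` and its method names are hypothetical.

```python
class TinyCache:
    """Toy in-memory key-value store mimicking memcached-style operations."""

    def __init__(self):
        self._data = {}      # key -> (value, cas_version)
        self._next_cas = 1   # bumped on every successful write

    def _store(self, key, value):
        self._data[key] = (value, self._next_cas)
        self._next_cas += 1
        return True

    def set(self, key, value):
        """Store unconditionally."""
        return self._store(key, value)

    def add(self, key, value):
        """Store only if the key does not exist yet."""
        if key in self._data:
            return False
        return self._store(key, value)

    def replace(self, key, value):
        """Store only if the key already exists."""
        if key not in self._data:
            return False
        return self._store(key, value)

    def gets(self, key):
        """Return (value, cas_version), or (None, None) when missing."""
        return self._data.get(key, (None, None))

    def get(self, key):
        return self.gets(key)[0]

    def cas(self, key, value, cas_version):
        """Compare-and-swap: store only if nobody wrote since cas_version."""
        current = self._data.get(key)
        if current is None or current[1] != cas_version:
            return False
        return self._store(key, value)

    def append(self, key, suffix):
        """Concatenate onto an existing value."""
        if key not in self._data:
            return False
        return self._store(key, self._data[key][0] + suffix)

    def incr(self, key, delta=1):
        """Add delta to an existing numeric value and return the result."""
        if key not in self._data:
            return None
        new_value = int(self._data[key][0]) + delta
        self._store(key, new_value)
        return new_value
```

Everything lives in one process's memory, which is also why the slide's two challenges arise: a restart starts from a cold cache, and adding or removing a node changes which keys that node holds.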
Redis – More “Structured Data” commands
[Diagram: Key → “Data Structures”: Blob, List, Set, Hash, …]
redis
• In-memory only
• Vast set of operations
  • Blob storage: Set, Add, Replace, CAS
  • Retrieval: Get, Pub-Sub
  • Structured data: Strings, Hashes, Lists, Sets, Sorted sets
• Example operations for a Set: add, count, subtract sets, intersection, is member?, atomic move from one set to another
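The example Set operations map closely onto ordinary set algebra. The sketch below uses plain Python sets as a stand-in, so it shows the semantics only and makes no Redis calls. The helper `move` and the sample member names are hypothetical, and the move is "atomic" here only because a single-threaded script cannot be interrupted between its two steps.

```python
def move(source, destination, member):
    """Move member from source to destination; report whether it was present."""
    if member not in source:
        return False
    source.remove(member)
    destination.add(member)
    return True


online = {"ann", "bob"}
playing = {"bob", "cat"}

online.add("dan")                       # add
count = len(online)                     # count
only_online = online - playing          # subtract sets
online_and_playing = online & playing   # intersection
ann_is_online = "ann" in online         # is member?
moved = move(online, playing, "ann")    # move from one set to another
```

In a data-structure server these operations run on the server side against a named key, so several app servers can share the same set without shipping it back and forth.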
NoSQL catalog
                        Key-Value   Data Structure   Document             Column      Graph
Cache (memory only)     memcached   redis
Membase – From key-value cache to database
[Diagram: Key → Opaque Binary Value]
membase: “Simple, fast, elastic.”
• Disk-based with built-in memcached cache
• Cache refill on restart
• Memcached compatible (drop-in replacement)
• Highly available (data replication)
• Add or remove capacity to live cluster
NoSQL catalog
                        Key-Value   Data Structure   Document             Column      Graph
Cache (memory only)     memcached   redis
Database (memory/disk)  membase
Couchbase – document-oriented database
[Diagram: Key → JSON object (“document”)]
{ "string" : "string", "string" : value, "string" : { "string" : "string", "string" : value }, "string" : [ array ] }
• Auto-sharding
• Disk-based with built-in memcached cache
• Cache refill on restart
• Memcached compatible (drop-in replacement)
• Highly available (data replication)
• Add or remove capacity to live cluster
• When values are JSON objects (“documents”): create indices, views and query against the views
NoSQL catalog
                        Key-Value   Data Structure   Document             Column      Graph
Cache (memory only)     memcached   redis
Database (memory/disk)  membase                      couchbase
MongoDB – Document-oriented database
[Diagram: Key → BSON object (“document”)]
{ "string" : "string", "string" : value, "string" : { "string" : "string", "string" : value }, "string" : [ array ] }
• Disk-based with in-memory “caching”
• BSON (“binary JSON”) format and wire protocol
• Master-slave replication
• Auto-sharding
• Values are BSON objects
• Supports ad hoc queries – best when indexed
NoSQL catalog
                        Key-Value   Data Structure   Document             Column      Graph
Cache (memory only)     memcached   redis
Database (memory/disk)  membase                      mongoDB, couchbase
Cassandra – Column overlays
• Disk-based system
• Clustered
• External caching required for low-latency reads
• “Columns” are overlaid on the data
• Not all rows must have all columns
• Supports efficient queries on columns
• Restart required when adding columns
• Good cross-datacenter support
[Diagram: Cassandra row with Column 1, Column 2, Column 3 (not present)]
NoSQL catalog
                        Key-Value   Data Structure   Document             Column      Graph
Cache (memory only)     memcached   redis
Database (memory/disk)  membase                      mongoDB, couchbase   cassandra
Neo4j – Graph database
• Disk-based system
• External caching required for low-latency reads
• Nodes, relationships and paths
• Properties on nodes
• Delete, Insert, Traverse, etc.
NoSQL catalog
                        Key-Value   Data Structure   Document             Column      Graph
Cache (memory only)     memcached   redis
Database (memory/disk)  membase                      mongoDB, couchbase   cassandra   Neo4j
The Landscape
[Chart: databases plotted by Speed and Scale: Couchbase, Redis, S3, Cassandra, MongoDB, Riak, HBase, CouchDB, Neo4j, SimpleDB, memcached, RDBMS, Datomic]
Datomic - immutable functional data
Hello Couchbase Server 2.0
Couchbase Server 2.0 beta
Couchbase handles real world scale
(Really) High Performance
• Latency: less than 1/2 ms
• Throughput: grows linearly with cluster size
• 5 nodes: 1.75M operations per second
Source: Cisco and Solarflare benchmark of Couchbase Server
How fast?
http://www.slideshare.net/renatko/couchbase-performance-benchmarking
[Charts: Latency]
COMPLEXITY IS THE ENEMY
Couchbase Server Basic Operation
§ Docs distributed evenly across servers in the cluster
§ Each server stores both active & replica docs
  § Only one server active at a time
§ Client library provides app with simple interface to database
§ Cluster map provides map to which server doc is on
  § App never needs to know
§ App reads, writes, updates docs
§ Multiple app servers can access same document at same time
[Diagram: COUCHBASE SERVER CLUSTER with SERVER 1, SERVER 2 and SERVER 3, each holding active docs and replica docs; APP SERVER 1 and APP SERVER 2 each use the COUCHBASE CLIENT LIBRARY and a CLUSTER MAP to read/write/update docs]
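One way to picture the cluster map is as a two-step lookup: hash the document key to a fixed partition, then look up which server holds the active copy of that partition. The sketch below is a simplified illustration under assumptions that are not on the slide: the partition count, the CRC32 hash, the round-robin layout and all function names are hypothetical.

```python
import zlib

NUM_PARTITIONS = 64  # assumption for illustration only


def partition_for(key):
    """Hash a document key to a fixed partition number."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def build_cluster_map(servers):
    """Assign partitions round-robin; the replica lives on the next server."""
    cluster_map = {}
    for p in range(NUM_PARTITIONS):
        active = servers[p % len(servers)]
        replica = servers[(p + 1) % len(servers)]
        cluster_map[p] = {"active": active, "replica": replica}
    return cluster_map


def server_for(cluster_map, key):
    """Return the server holding the active copy of the doc stored under key."""
    return cluster_map[partition_for(key)]["active"]


cluster_map = build_cluster_map(["server1", "server2", "server3"])
```

Because the key-to-partition step never changes, only the small partition-to-server table has to be redistributed when the cluster changes, which is what lets the app stay unaware of where a doc lives.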
Add Nodes to the Cluster
§ Two servers added to cluster
  § One-click operation
§ Docs automatically rebalanced across cluster
  § Even distribution of docs
  § Minimum doc movement
§ Cluster map updated
§ App database calls now distributed over larger # of servers
[Diagram: SERVER 4 and SERVER 5 join SERVER 1, SERVER 2 and SERVER 3 in the COUCHBASE SERVER CLUSTER; active and replica docs are redistributed across all five servers, and both app servers read/write/update through the updated CLUSTER MAP]
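"Even distribution" with "minimum doc movement" can be illustrated with a small rebalancing routine: work out how many partitions each server should own, then move only the surplus from over-full servers to under-full ones. This is a sketch of the idea, not Couchbase's rebalance algorithm; the function name, the partition-to-server dictionary and the server names are hypothetical.

```python
def rebalance(assignment, servers):
    """Return a new partition -> server mapping spread evenly over servers,
    moving as few partitions as possible."""
    total = len(assignment)
    base, extra = divmod(total, len(servers))

    # Count what each remaining server currently owns.
    counts = {s: 0 for s in servers}
    for owner in assignment.values():
        if owner in counts:
            counts[owner] += 1

    # Fullest servers keep the "+1" slots so fewer partitions have to move.
    ranked = sorted(servers, key=lambda s: counts[s], reverse=True)
    target = {s: base + (1 if i < extra else 0) for i, s in enumerate(ranked)}

    new_assignment = dict(assignment)
    surplus = []
    for partition in sorted(assignment):
        owner = assignment[partition]
        if owner not in counts:
            surplus.append(partition)          # owner left the cluster
        elif counts[owner] > target[owner]:
            counts[owner] -= 1
            surplus.append(partition)          # owner holds too many

    # Hand the surplus to servers that are below their target.
    for server in servers:
        while counts[server] < target[server]:
            new_assignment[surplus.pop()] = server
            counts[server] += 1
    return new_assignment


before = {p: ["server1", "server2", "server3"][p % 3] for p in range(60)}
after = rebalance(before, ["server1", "server2", "server3", "server4", "server5"])
moved = sum(1 for p in before if before[p] != after[p])
```

Going from three to five servers with 60 partitions, each server ends up with 12, and only the 24 partitions that the two new servers need are moved.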
Fail Over Node
§ App servers happily accessing docs on Server 3
§ Server fails
§ App server requests to server 3 fail
§ Cluster detects server has failed
  § Promotes replicas of docs to active
  § Updates cluster map
§ App server requests for docs now go to appropriate server
§ Typically rebalance would follow
[Diagram: SERVER 3 drops out of the five-server COUCHBASE SERVER CLUSTER; replica docs for its active docs on the remaining servers are promoted to active, and both app servers follow the updated CLUSTER MAP]
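The failover steps can be sketched as a pure function over a cluster map: every partition whose active copy lived on the failed server gets its replica promoted, and the updated map is what clients then follow. The map layout (`active` / `replica` per partition) and the function name are hypothetical, chosen only to illustrate the sequence on the slide.

```python
def fail_over(cluster_map, failed_server):
    """Promote replicas for partitions whose active copy was on failed_server."""
    new_map = {}
    for partition, entry in cluster_map.items():
        active, replica = entry["active"], entry["replica"]
        if active == failed_server:
            # Promote the replica; there is no replica again until a rebalance.
            active, replica = replica, None
        elif replica == failed_server:
            # The active copy survives, but its replica is gone.
            replica = None
        new_map[partition] = {"active": active, "replica": replica}
    return new_map


cluster_map = {
    0: {"active": "server1", "replica": "server2"},
    1: {"active": "server2", "replica": "server3"},
    2: {"active": "server3", "replica": "server1"},
}
after_failure = fail_over(cluster_map, "server3")
```

After the promotion some partitions have no replica left, which is one reason a rebalance would typically follow.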
War Story: EBS Outage
● Suddenly, disk writes all began to time out
● Many services experienced outages:
  ● FourSquare, Reddit, Quora, among others
● With memory-buffered writes, a scalable data layer keeps working
● When EBS came back online, Couchbase wrote all the updated data to disk without missing a beat.
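The idea of memory-buffered writes can be sketched as a write-behind queue: writes are acknowledged from memory, queued for persistence, and the queue simply waits while the disk is unavailable. This is a minimal illustration and not a description of Couchbase internals; `WriteBehindStore` and `FlakyDisk` are hypothetical names.

```python
from collections import OrderedDict


class FlakyDisk:
    """Stand-in for a disk that can become temporarily unavailable."""

    def __init__(self):
        self.available = True
        self.stored = {}

    def write(self, key, value):
        if not self.available:
            raise OSError("disk write timed out")
        self.stored[key] = value


class WriteBehindStore:
    """Serve reads and writes from memory; persist to disk in the background."""

    def __init__(self, disk):
        self.memory = {}
        self.dirty = OrderedDict()   # keys waiting to be persisted, in order
        self.disk = disk

    def set(self, key, value):
        self.memory[key] = value     # acknowledged from memory
        self.dirty[key] = True       # queued for the next flush

    def get(self, key):
        return self.memory.get(key)

    def flush(self):
        """Try to persist queued writes; return how many are still waiting."""
        for key in list(self.dirty):
            try:
                self.disk.write(key, self.memory[key])
            except OSError:
                break                # disk is down: keep the rest queued
            del self.dirty[key]
        return len(self.dirty)
```

The trade-off is that queued data exists only in memory until the flush succeeds, so replication to another node matters for durability during such an outage.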
Cross Data Center Replication
§ Data close to users
§ Multiple locations for disaster recovery
§ Independently managed clusters serving local data
[Diagram: US DATA CENTER, EUROPE DATA CENTER and ASIA DATA CENTER, with replication between each pair]
Built for Production
JSON DOCUMENT DATABASE
Document Database as Aggregate Database
http://martinfowler.com/bliki/AggregateOrientedDatabase.html
Document Database
This synergy between the programming model and the distribution model is very valuable. It allows the database to use its knowledge of how the application programmer clusters the data to help performance across the cluster.
http://martinfowler.com/bliki/AggregateOrientedDatabase.html
o::1001
{
  uid: ji22jd,
  customer: Ann,
  line_items: [
    { sku: 0321293533, quan: 3, unit_price: 48.0 },
    { sku: 0321601912, quan: 1, unit_price: 39.0 },
    { sku: 0131495054, quan: 1, unit_price: 51.0 }
  ],
  payment: { type: Amex, expiry: 04/2001, last5: 12345 }
}
Developers <3 JSON
LET’S GET POST-RELATIONAL!
JSON Documents
• Maps more closely to external API
• CRUD operations, lightweight schema
• Stored under an identifier key
{ "fields" : ["with basic types", 3.14159, true], "like" : "your favorite language" }
client.set("mydocumentid", myDocument);
mySavedDocument = client.get("mydocumentid");
Meta + Document Body
Metadata (identifier, expiration, etc.), including the unique ID:
{ "id" : "beer_Enlightened_Black_Ale", ... }
Document (user data, can be anything):
{
  "brewery": "New Belgium Brewing",
  "name": "1554 Enlightened Black Ale",
  "abv": 5.5,
  "description": "Born of a flood...",
  "category": "Belgian and French Ale",
  "style": "Other Belgian-Style Ales",
  "updated": "2010-07-22 20:00:20"
}
The "updated" value is a “vintage” date format from an SQL dump >_<
Add comments to the beer
Link to comments (the beer's "comment_ids" array):
{
  "brewery": "New Belgium Brewing",
  "name": "1554 Enlightened Black Ale",
  "abv": 5.5,
  "description": "Born of a flood...",
  "category": "Belgian and French Ale",
  "style": "Other Belgian-Style Ales",
  "updated": "2010-07-22 20:00:20",
  "ratings" : { "525" : 5, "30" : 4, "1044" : 2 },
  "comment_ids" : [ "f1e62", "6ad8c" ]
}
Link to beer (the comment's "about_id" field), stored under { "id": "f1e62" }:
{
  "type": "comment",
  "about_id": "beer_Enlightened_Black_Ale",
  "user_id": 525,
  "text": "tastes like college!",
  "updated": "2010-07-22 20:00:20"
}
How to: look up comments from a beer
Figure: http://www.ibm.com/developerworks/webservices/library/ws-sdoarch/
• SERIALIZED LOOP
beer = client.get("beer:A_cold_one");
comments = [];
beer.comment_ids.each { |id|
  comments.push(client.get(id));
}
• FAST MULTI-KEY LOOKUP
beer = client.get("beer:A_cold_one");
comments = client.multiGet(beer.comment_ids)
• ASYNC VIEW QUERY
comments = client.query("myapp", "by_comment_on", {:key => "beer:A_cold_one"});
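The difference between the serialized loop and the multi-key lookup is mostly the number of network round trips. The sketch below makes that visible with a dictionary-backed stand-in that counts calls. `FakeClient`, `multi_get` and the placeholder comment text are hypothetical and are not the API of any Couchbase client library.

```python
class FakeClient:
    """Dictionary-backed stand-in that counts simulated network round trips."""

    def __init__(self, docs):
        self.docs = docs
        self.round_trips = 0

    def get(self, key):
        self.round_trips += 1
        return self.docs[key]

    def multi_get(self, keys):
        self.round_trips += 1          # one request carries all the keys
        return [self.docs[k] for k in keys]


def comments_serial(client, beer_id):
    """Serialized loop: one round trip for the beer plus one per comment."""
    beer = client.get(beer_id)
    return [client.get(comment_id) for comment_id in beer["comment_ids"]]


def comments_multi(client, beer_id):
    """Multi-key lookup: one round trip for the beer, one for all comments."""
    beer = client.get(beer_id)
    return client.multi_get(beer["comment_ids"])


docs = {
    "beer:A_cold_one": {"comment_ids": ["f1e62", "6ad8c"]},
    "f1e62": {"type": "comment", "text": "tastes like college!"},
    "6ad8c": {"type": "comment", "text": "(placeholder comment)"},
}
```

With N comments the loop costs N + 1 round trips while the multi-key version always costs 2, which is why the gap widens as a beer collects more comments.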
Emergent Schema
"Capture the user's intent"
• The database can handle it
• Your app controls the schema
[Examples shown: JSON.org, GitHub API, Twitter API]
Audience participation! *that means you
npm install twitterfight
npm start twitterfight
Thank You!
Chris Anderson
@jchris
http://www.couchbase.com/
INCREMENTAL MAP-REDUCE FOR REALTIME ANALYTICS
What do you mean “Incremental?”
like: CREATE INDEX city ON brewery (city);
QUERY PATTERN: FIND BY ATTRIBUTE
Find documents by a specific attribute
• Let's find beers by brewery_id!
The index definition
The result set: beers keyed by brewery_id
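The slides with the index definition and result set were images, so the sketch below only illustrates the general shape of a find-by-attribute index: a map function emits one row per beer keyed by its brewery_id, the rows are kept sorted by key, and a query is a range scan over that sorted list. It is a simulation written for this transcript, not the view code from the talk; the `type` field, the document IDs and all function names are hypothetical.

```python
import bisect


def map_beer_by_brewery(doc_id, doc):
    """Yield (key, value) index rows for one document."""
    if doc.get("type") == "beer" and "brewery_id" in doc:
        yield doc["brewery_id"], None


def build_index(docs, map_fn):
    """Run the map function over every doc and sort rows by (key, doc_id)."""
    rows = []
    for doc_id, doc in docs.items():
        for key, value in map_fn(doc_id, doc):
            rows.append((key, doc_id, value))
    rows.sort(key=lambda row: (row[0], row[1]))
    return rows


def query(index, key):
    """Return the IDs of all docs whose emitted key equals key."""
    keys = [row[0] for row in index]
    lo = bisect.bisect_left(keys, key)
    hi = bisect.bisect_right(keys, key)
    return [doc_id for _, doc_id, _ in index[lo:hi]]


# Made-up sample documents, for illustration only.
docs = {
    "beer_a": {"type": "beer", "brewery_id": "brewery_x", "abv": 5.5},
    "beer_b": {"type": "beer", "brewery_id": "brewery_x", "abv": 4.5},
    "beer_c": {"type": "beer", "brewery_id": "brewery_y", "abv": 6.0},
    "brewery_x": {"type": "brewery", "city": "(placeholder)"},
}
index = build_index(docs, map_beer_by_brewery)
```

An incremental index would rerun the map function only for documents that changed since the last update, rather than rebuilding all rows as `build_index` does here.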
QUERY PATTERN: BASIC AGGREGATIONS
Use a built-in reduce function with a group query
• Let's find average abv for each brewery!
We are reducing doc.abv with _stats
Group reduce (reduce by unique key)
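A group reduce can be pictured as "collect the emitted values per unique key, then reduce each group". The sketch below simulates that in plain Python for rows of (brewery_id, abv). The fields returned by `stats` (sum, count, min, max, sumsqr) reflect my understanding of what a `_stats`-style reduce reports and should be treated as an assumption; the sample rows are made up.

```python
from collections import defaultdict


def stats(values):
    """Summary statistics for one group of numeric values."""
    return {
        "sum": sum(values),
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "sumsqr": sum(v * v for v in values),
    }


def group_reduce(rows, reduce_fn):
    """Group (key, value) rows by key and reduce each group separately."""
    grouped = defaultdict(list)
    for key, value in rows:
        grouped[key].append(value)
    return {key: reduce_fn(values) for key, values in sorted(grouped.items())}


# Rows as a map function might emit them: (brewery_id, abv).
rows = [("brewery_x", 5.5), ("brewery_x", 4.5), ("brewery_y", 6.0)]
by_brewery = group_reduce(rows, stats)

# The average is derived on the client from sum and count.
average_abv = {key: s["sum"] / s["count"] for key, s in by_brewery.items()}
```

Keeping sum and count rather than a precomputed mean is what makes the result mergeable: partial results from different servers or index nodes can be added together before dividing.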
QUERY PATTERN: TIME-BASED ROLLUPS WITH KEY GROUPING
group_level=3 - daily results - great for graphing
• Daily, hourly, minute or second rollup all possible with the same index.
• http://crate.im/posts/couchbase-views-reddit-data/
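Key grouping works when the index key is an array of date parts: a group level of N keeps only the first N elements of each key before reducing. The sketch below simulates this with a count-style reduce. The key layout [year, month, day, hour, minute] is an assumption made for illustration, as are the function name and the sample rows.

```python
from collections import Counter


def rollup(rows, group_level):
    """Sum values after truncating each array key to group_level elements."""
    totals = Counter()
    for key, value in rows:
        totals[tuple(key[:group_level])] += value
    return dict(sorted(totals.items()))


# Rows as a map function might emit them: ([year, month, day, hour, minute], 1).
rows = [
    ([2012, 10, 6, 9, 15], 1),
    ([2012, 10, 6, 9, 40], 1),
    ([2012, 10, 6, 17, 5], 1),
    ([2012, 10, 7, 8, 0], 1),
]

daily = rollup(rows, 3)    # group_level=3: one result per [year, month, day]
hourly = rollup(rows, 4)   # group_level=4: one result per hour
```

The same rows answer both questions, which is the point of the slide: one index, with the rollup granularity chosen at query time.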
GEO INDEX & FULL TEXT INTEGRATION
GeoCouch R-Tree Index
• Optimized for bulk loading of large data sets
• Simplified query model (bounding box, nearest neighbor)
Elastic Search Adapter
• Elastic Search is good for ad-hoc queries and faceted browsing
• Our adapter is aware of changing Couchbase topology
• Indexed by Elastic Search after stored to disk in Couchbase
Thank You!
Chris Anderson
@jchris
http://www.couchbase.com/