How-To NoSQL 3.0 Webinar Series: Couchbase 101
Couchbase 101
Todd Greenstein | Engineering, Couchbase
Core Technology
Product Overview
Evolution from Memcached
©2014 Couchbase, Inc. 3
• Founders were key contributors to memcached
• Evolved into Membase, a distributed and persisted key-value store
• Evolved into Couchbase Document Store with JSON support, Map-Reduce indexes, Elasticsearch integration, and Cross Datacenter Replication
Couchbase Server
• General purpose
• Elastic scalability
• Consistent high performance
• Always available
• Flexible, global deployment
• Enterprise grade administration
• Real time big data
• Data mobility
• Developer focused
The most complete, scalable & highest performing NoSQL database
Couchbase Server
Couchbase offers a full range of Data Management solutions
• High Availability Cache
• Key Value
• Document
• Mobile device
Key Capabilities
• Developer focused
- JSON Support
- Indexing/Querying
- Incremental Map-Reduce
• Elastic Scalability
- Shared-nothing architecture with single node type
- Cross-data center replication (XDCR)
- Push button scale out
• Consistent High Performance
- Built-in Object level cache
- Fine grained locking
- Hash Partitioning
• Always available
- Zero downtime administration and upgrades
- Streaming and rack aware replication
- Comprehensive cluster-wide monitoring
Key Concepts
Key Value
• Couchbase operates like a Key-Value Document Store
• Key is a UTF-8 string up to 256 Bytes
• Values can be:
- Simple Datatypes: strings, numbers, datetime, boolean, and binary data can be stored; they are stored as Base64 encoded strings
- Complex Datatypes: dictionaries/hashes and arrays/lists can be stored in JSON format (simple lists can be string based with a delimiter)
- JSON is a special class of string with a specific format for encoding simple and complex data structures
• Schema is unenforced and implicit; schema changes are programmatic, done online, and can vary from Document to Document
Couchbase can act as a Key-Value Store or a Document Store
Key-Value Store:
2014-06-23-10:15am : 75F
2014-06-23-11:30am : 77F
2014-06-23-02:00pm : 82F
Document Store:
0001:
{"firstname": "Dipti",
"lastname": "Borkar",
"language": "English",
"time_zone": "PST",
"zip": 94403
}
Key - UTF-8 string up to 256 bytes
Value - can be 0 bytes – 20 MB (best practice < 1 MB)
Benefits of JSON
• Can represent complex objects and data structures
• Very simple notation, lightweight, compact, readable
• The most common API return type for integrations
- Facebook, Twitter, you name it, return JSON
• Native to JavaScript (can be useful)
• Can be inserted straight into Couchbase (faster development)
• Serialization and deserialization are very fast
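As a minimal sketch of the serialization point, the snippet below round-trips a nested structure through JSON using only Python's standard `json` module. The `tags` and `active` fields are made up for illustration; the other fields come from the sample document shown earlier in the deck. No Couchbase client is involved here.

```python
import json

# A nested structure: dictionaries, lists, strings, numbers, and booleans.
# "tags" and "active" are hypothetical fields added for illustration.
profile = {
    "firstname": "Dipti",
    "lastname": "Borkar",
    "language": "English",
    "time_zone": "PST",
    "zip": 94403,
    "tags": ["nosql", "json"],
    "active": True,
}

# Serialize to a compact JSON string, ready to be stored as a document value.
encoded = json.dumps(profile, separators=(",", ":"))

# Deserialize back into native Python objects.
decoded = json.loads(encoded)

print(encoded)
print(decoded == profile)
```

The encoded string is what would be handed to a store as the document value; its size is far below the 1 MB best-practice limit mentioned above.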
Storing and retrieving documents
[Diagram: Clients read from and write to Documents (user/application data). Documents live in Data Buckets, which live on Server Nodes (servers) that form a Couchbase Cluster. The cluster is dynamically scalable, based on hash partitioning.]
Objects Serialized to JSON and Back
User Object:
string uid
string firstname
string lastname
int age
array favorite_colors
string email
add() serializes the User Object to a JSON document stored under its key; get() reads the document and deserializes it back into a User Object:
u::[email protected]
{ "uid": 123456,
"firstname": "John",
"lastname": "Smith",
"age": 22,
"favorite_colors": ["blue", "black"],
"email": "[email protected]"
}
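The add()/get() round trip can be sketched in plain Python. This is not the Couchbase SDK: `InMemoryBucket` is a hypothetical stand-in backed by a dictionary, and the email address `john@example.com` is a placeholder because the address is elided in the slide. The sketch assumes that add() refuses to overwrite an existing key, which is the usual memcached-style meaning of "add".

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class User:
    uid: int
    firstname: str
    lastname: str
    age: int
    favorite_colors: List[str] = field(default_factory=list)
    email: str = ""


class InMemoryBucket:
    """Hypothetical stand-in for a data bucket: a dict of key -> JSON string."""

    def __init__(self) -> None:
        self._docs: Dict[str, str] = {}

    def add(self, key: str, user: User) -> None:
        # add() only succeeds when the key does not exist yet.
        if key in self._docs:
            raise KeyError(f"key already exists: {key}")
        self._docs[key] = json.dumps(asdict(user))

    def get(self, key: str) -> User:
        # Deserialize the stored JSON back into a User object.
        return User(**json.loads(self._docs[key]))


bucket = InMemoryBucket()
john = User(123456, "John", "Smith", 22, ["blue", "black"], "john@example.com")
bucket.add("u::john@example.com", john)
restored = bucket.get("u::john@example.com")
print(restored == john)
```

The key prefix `u::` follows the slide's convention of namespacing user documents by key.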
Core Architecture
Single Node Type
Within each server – Single Node Type
[Diagram: components within each server node]
Cluster Manager (Erlang/OTP)
• On each node: Heartbeat, Process monitor, Global singleton supervisor, Configuration manager
• One per cluster: Rebalance orchestrator, Node health monitor, vBucket state and replication manager
• HTTP REST management API / Web UI
• Ports: HTTP 8091, Erlang port mapper 4369, Distributed Erlang 21100 - 21199
Data Manager
• Query API (8092), Query Engine
• Memcapable 2.0 (11210)
• Moxi, Memcapable 1.0 (11211)
• Memcached, Couchbase EP Engine (storage interface), Persistence Layer
Single Node Operations - Write
[Diagram: the App Server writes a Doc to the node's Managed Cache. From there the Doc is placed on the Disk Queue, to be persisted to Disk, and on the Replication Queue, for memory-to-memory replication to another node.]
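The write path in the diagram can be modelled with a small toy class. This is a sketch under stated assumptions, not how the server is implemented: `ToyNode` and its method names are hypothetical, and the "disk" is just a dictionary. It only illustrates that a write lands in memory first and that persistence and replication are drained from queues afterwards.

```python
from collections import deque
from typing import Deque, Dict, Tuple


class ToyNode:
    """Toy model of one node: managed cache, disk queue, replication queue."""

    def __init__(self) -> None:
        self.cache: Dict[str, str] = {}  # managed cache (RAM)
        self.disk: Dict[str, str] = {}   # stand-in for persisted documents
        self.disk_queue: Deque[Tuple[str, str]] = deque()
        self.replication_queue: Deque[Tuple[str, str]] = deque()

    def write(self, key: str, value: str) -> None:
        # The write lands in the managed cache first ...
        self.cache[key] = value
        # ... and is then queued for persistence and for replication.
        self.disk_queue.append((key, value))
        self.replication_queue.append((key, value))

    def flush_disk_queue(self) -> None:
        # Drain the disk queue into the simulated disk.
        while self.disk_queue:
            key, value = self.disk_queue.popleft()
            self.disk[key] = value

    def flush_replication_queue(self, other: "ToyNode") -> None:
        # Memory-to-memory replication: copy into the other node's cache.
        while self.replication_queue:
            key, value = self.replication_queue.popleft()
            other.cache[key] = value


node_a, node_b = ToyNode(), ToyNode()
node_a.write("doc1", '{"n": 1}')
pending = (len(node_a.disk_queue), len(node_a.replication_queue))
node_a.flush_disk_queue()
node_a.flush_replication_queue(node_b)
print(pending, node_a.disk, node_b.cache)
```

In this model the application's write returns as soon as the cache is updated; the two flush calls stand in for background work.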
Single Node Operations - Read
[Diagram: the App Server issues Get Doc 1. Doc 1 is in the Managed Cache and is returned from memory, without a read from Disk.]
Single Node Operations – Cache Ejection
[Diagram: the App Server writes Doc 2 through Doc 6, which pass through the Disk Queue and the Replication Queue as before. Doc 1 is ejected from the Managed Cache; Doc 2 through Doc 6 remain cached.]
Single Node Operations – Cache Miss
[Diagram: the App Server issues Get Doc 1. Doc 1 is not in the Managed Cache, which holds Doc 2 through Doc 6, so Doc 1 is fetched from Disk, loaded back into the Managed Cache, and returned.]
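The read, ejection, and cache-miss slides can be tied together in one toy model. The class below is hypothetical and makes two simplifying assumptions that are not in the slides: writes are persisted immediately, and the ejection policy is least-recently-used. It is meant only to show the hit/miss behaviour, not the server's actual policy.

```python
from collections import OrderedDict
from typing import Dict


class ToyManagedCache:
    """Toy managed cache in front of a dict that stands in for disk."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.cache: "OrderedDict[str, str]" = OrderedDict()
        self.disk: Dict[str, str] = {}
        self.misses = 0

    def _eject_if_full(self) -> None:
        # Eject least recently used docs (assumed policy); they stay on disk.
        while len(self.cache) > self.capacity:
            self.cache.popitem(last=False)

    def write(self, key: str, value: str) -> None:
        self.disk[key] = value  # persisted immediately, for simplicity
        self.cache[key] = value
        self.cache.move_to_end(key)
        self._eject_if_full()

    def get(self, key: str) -> str:
        if key in self.cache:  # cache hit: served from memory
            self.cache.move_to_end(key)
            return self.cache[key]
        self.misses += 1       # cache miss: fetch from disk
        value = self.disk[key]
        self.cache[key] = value  # reload into the managed cache
        self._eject_if_full()
        return value


node = ToyManagedCache(capacity=5)
for i in range(1, 7):  # Doc 1 .. Doc 6
    node.write(f"doc{i}", f"value {i}")

ejected = "doc1" not in node.cache  # Doc 1 was ejected when Doc 6 arrived
value = node.get("doc1")            # cache miss, reloaded from disk
print(ejected, value, node.misses, "doc1" in node.cache)
```

With a capacity of five, writing six docs pushes Doc 1 out of memory; the later read costs one disk fetch and brings it back.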
Cluster-wide Operations
Auto sharding – Bucket and vBuckets
• Each bucket has active and replica data sets
• Each data set has 1024 virtual buckets (vBuckets)
• Documents get logically mapped to vBuckets
• A document ID always gets hashed to the same virtual bucket
• Virtual buckets do not have a fixed physical server location
• The mapping between virtual buckets and physical servers is called the cluster map
• Each virtual bucket contains a 1/1024th portion of the data set
[Diagram: data buckets divided into virtual buckets (vB) 1 ... 1024]
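The two-step lookup described above (document ID to vBucket, then vBucket to server) can be sketched as follows. The slides only say that a hash function is applied to the key; using CRC32 modulo 1024 here is an assumption for illustration, as are the function names and the round-robin assignment of vBuckets to servers.

```python
import zlib
from typing import List

NUM_VBUCKETS = 1024


def vbucket_for(key: str) -> int:
    """Map a document ID to a vBucket (CRC32 is an illustrative choice)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_VBUCKETS


def build_cluster_map(servers: List[str]) -> List[str]:
    """Assign vBuckets to servers round-robin: index = vBucket, value = server."""
    return [servers[vb % len(servers)] for vb in range(NUM_VBUCKETS)]


def server_for(key: str, cluster_map: List[str]) -> str:
    # Two steps: key -> vBucket (fixed), vBucket -> server (via cluster map).
    return cluster_map[vbucket_for(key)]


cluster_map = build_cluster_map(["A", "B", "C"])
key = "u::123456"
print(vbucket_for(key), server_for(key, cluster_map))
```

Because the first step never changes for a given key, moving data between servers only requires updating the cluster map, not rehashing documents.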
Cluster Map
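[Diagram: Hash function (KEY) maps each key to a logical partition (vB1 ... vB6). The Cluster Map assigns the logical partitions to physical servers A, B, and C. When more scalability is required, a node is added and a New Cluster Map assigns the partitions across the larger set of servers.]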
Multi-Node Operations
[Diagram: APP SERVER 1 and APP SERVER 2, each with a Couchbase Client Library and Cluster Map, read/write/update against SERVER 1, SERVER 2, and SERVER 3. Active shards: Server 1 holds 5, 2, 9; Server 2 holds 4, 7, 8; Server 3 holds 1, 3, 6. Replica shards: Server 1 holds 4, 1, 8; Server 2 holds 6, 3, 2; Server 3 holds 7, 9, 5.]
• Docs distributed evenly across servers
• Each server stores both active and replica docs
- Only one server is active for a given doc at a time
• Client library provides app with simple interface to database
• Cluster map provides map to which server a doc is on
- App never needs to know
• App reads, writes, updates docs
• Multiple app servers can access the same document at the same time
Adding Nodes
[Diagram: SERVER 4 and SERVER 5 join SERVER 1, SERVER 2, and SERVER 3. Some active and replica shards move from the existing servers to the new ones, and the Cluster Map held by each app server's Couchbase Client Library is updated.]
• Two servers added with one-click operation
• Docs automatically rebalance across cluster
- Even distribution of docs
- Minimum doc movement
• Cluster map updated
• App database calls now distributed over larger number of servers
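The "even distribution, minimum movement" goal can be illustrated with a small rebalancing sketch. This is not the server's rebalance algorithm: `rebalance` is a hypothetical function that moves only the vBuckets an over-full server holds beyond its fair share, and it ignores replicas entirely.

```python
from collections import Counter
from typing import List


def rebalance(cluster_map: List[str], new_servers: List[str]) -> List[str]:
    """Return a new cluster map that moves only the surplus vBuckets."""
    servers = sorted(set(cluster_map)) + new_servers
    base, extra = divmod(len(cluster_map), len(servers))
    # Target share per server: the first `extra` servers get one more.
    target = {s: base + (1 if i < extra else 0) for i, s in enumerate(servers)}

    counts = Counter(cluster_map)
    new_map = list(cluster_map)
    # Servers below their target, repeated once per vBucket they still need.
    needy = [s for s in servers for _ in range(target[s] - counts.get(s, 0))]
    for vb, owner in enumerate(cluster_map):
        if counts[owner] > target[owner] and needy:
            new_owner = needy.pop()
            new_map[vb] = new_owner
            counts[owner] -= 1
            counts[new_owner] += 1
    return new_map


old_map = [["A", "B", "C"][vb % 3] for vb in range(1024)]
new_map = rebalance(old_map, ["D", "E"])
moved = sum(1 for old, new in zip(old_map, new_map) if old != new)
print(Counter(new_map), moved)
```

Going from three to five servers, only the vBuckets destined for the two new servers change owner; every other entry in the map stays where it was.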
Failover
[Diagram: App Server 1 and App Server 2, each with a Couchbase Client Library and Cluster Map, access SERVER 1 through SERVER 5, each holding active and replica shards. SERVER 3, whose active shards are 1, 3, and 6, fails; replicas of its shards on the remaining servers are promoted to active.]
• App servers accessing Shards
• Requests to Server 3 fail
• Cluster detects server failed
- Promotes replicas of Shards to active
- Updates cluster map
• Requests for docs now go to appropriate server
• Typically rebalance would follow
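The failover steps can be sketched with plain dictionaries. The shard numbers below are the ones shown for Servers 1 to 3 in the diagram; the function names and the dictionary layout are hypothetical, and failure detection itself is not modelled.

```python
from typing import Dict, List

# Shard placement for Servers 1 to 3 as shown in the diagram.
active: Dict[str, List[int]] = {
    "server1": [5, 2, 9],
    "server2": [4, 7, 8],
    "server3": [1, 3, 6],
}
replica: Dict[str, List[int]] = {
    "server1": [4, 1, 8],
    "server2": [6, 3, 2],
    "server3": [7, 9, 5],
}


def build_cluster_map(active_sets: Dict[str, List[int]]) -> Dict[int, str]:
    """Cluster map: shard number -> server holding the active copy."""
    return {
        shard: server
        for server, shards in active_sets.items()
        for shard in shards
    }


def fail_over(failed: str) -> Dict[int, str]:
    """Promote replicas of the failed server's shards and rebuild the map."""
    lost = active.pop(failed)
    replica.pop(failed, None)
    for shard in lost:
        for server, shards in replica.items():
            if shard in shards:
                shards.remove(shard)          # the replica copy ...
                active[server].append(shard)  # ... becomes the active copy
                break
    return build_cluster_map(active)


cluster_map = fail_over("server3")
print(cluster_map)
```

After the promotion the cluster has fewer replica copies than before, which is one reason a rebalance typically follows.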
Cross Datacenter Replication (XDCR)
• Replicates data continuously FROM source cluster TO remote clusters
• Supports unidirectional and bidirectional operation
• Application can read and write from both clusters (active – active replication)
• Replication throughput scales out linearly
• Simplified Administration via console, REST, and CLI
Unidirectional Replication
• Hot spare / Disaster Recovery
• Development/Testing copies
• Replicate to indexing cluster
• Integrate to Connector, e.g. Solr, Elasticsearch
• Integrate to custom consumer
Bidirectional Replication
• Multiple Active Masters
• Data locality
• Disaster Recovery
XDCR after Write
[Diagram: within a Couchbase Server Node, the App Server writes Doc 1 to the Managed Cache. Doc 1 is placed on the Disk Queue, to be persisted to Disk; on the Replication Queue, for memory-to-memory replication to another node; and on the XDCR Queue, for memory-to-memory replication to the remote cluster (new in 3.0).]
Tunable Memory - Optimization for Massive Databases
Optimized Memory Usage with Metadata Ejection Policy
• Better optimization of memory for massive databases
• Enable efficient management of rarely accessed data sets
• Cache only keys and data for the working set & eject all historic data
100sx Reduction in Metadata Memory Consumption from 2.5 to 3.0
[Chart: metadata consumed in RAM]
• 2.5.1 or earlier – Real-time Latency: 3 GB consumed for metadata in RAM
• v.3.0 – Large DB with Hot Working-set: 80 MB consumed for metadata in RAM
Note: The graph represents characteristics under data mutations. ~50M docs with value size ~0.5KB
Developing with Couchbase
Live Demo
Example of a bar chart
[Bar chart: Couchbase, Mongo DB, Datastax; axis from 0 to 400]
Example of a standard table
Vendor Feature 1 Feature 2 Feature 3
Couchbase 1.71M
Mongo DB 227K
Datastax 99K
Example of a standard line chart
[Line chart: Couchbase, Mongo DB, Datastax; axis from 0 to 400]