Transcript of Couchbase_John_Bryce_Israel_Training_couchbase_overview
Introduction to Couchbase Server
Perry Krug
Sr. Solutions Architect
Couchbase Server 2.0 is a high-performance, easy-to-scale, and flexible document "NoSQL" database.
Easy Scalability: grow the cluster without application changes and without downtime, with a single click.
Consistent High Performance: consistent sub-millisecond read and write response times with consistent high throughput.
Always On 24x365: no downtime for software upgrades, hardware maintenance, etc.
Couchbase Server
Flexible Data Model
JSON document model with no fixed schema.
The NoSQL Promise
Couchbase Feature Set
• Flexible Data Model: JSON support, indexing/querying, incremental map-reduce
• Easy Scalability: "clone to grow" with auto-sharding, cross-data-center replication
• Consistent High Performance: built-in object-level cache
• Always On 24x365: zero-downtime maintenance, built-in data replication with auto-failover, management and monitoring UI, reliable persistence architecture
Couchbase Server Architecture
Each node runs two components:
• Data Manager: object-managed cache, storage engine, and query engine; data access on ports 11210/11211, query API over HTTP on port 8092
• Cluster Manager (Erlang/OTP): replication, rebalance, and shard state manager; REST management API/Web UI (admin console) on port 8091
Couchbase Operations
Web applications talk to the cluster through the Couchbase client library, which handles client interaction, data flow, and cluster management across all Couchbase Server nodes.
Replication Flow
Write ('set') operation: the app server sends the document to the managed cache on a Couchbase Server node. From the cache it enters two queues: the replication queue, which copies it to other nodes, and the disk queue, which persists it to disk.
View processing and XDCR: once a document reaches disk, the view engine processes it for indexing, and the XDCR queue sends it to other clusters.
Disk Compaction
• Disk writes to data files and index are ‘append-only’
• On-disk size increases compared to actual stored data
• Compaction defragments data and index information
• Operates on a live bucket (no downtime)
• Both automatic and manual compaction available
• Compaction operates per-shard on each node
Compaction
Initial file layout: Doc A | Doc B | Doc C
Update some data (writes append): Doc A | Doc B | Doc C | Doc A' | Doc B' | Doc A'' | Doc D
After compaction: Doc C | Doc B' | Doc A'' | Doc D
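The append-only write pattern and compaction step above can be sketched with a toy in-memory model (illustrative only; this is not Couchbase's actual storage format):

```python
# Toy sketch of an append-only data file with compaction.

def append_write(log, key, value):
    """Writes always append; older versions of a key stay in the file."""
    log.append((key, value))

def compact(log):
    """Keep only the latest version of each key, dropping stale entries."""
    latest = {}
    for key, value in log:
        latest[key] = value          # later entries overwrite earlier ones
    return list(latest.items())

log = []
# Initial writes A, B, C, then updates A', B', A'' and a new doc D:
for key, value in [("A", 1), ("B", 1), ("C", 1),
                   ("A", 2), ("B", 2), ("A", 3), ("D", 1)]:
    append_write(log, key, value)

print(len(log))          # 7 entries on "disk" before compaction
compacted = compact(log)
print(len(compacted))    # 4 entries after compaction
```

Because writes only append, the on-disk size grows past the live data set; compaction reclaims the space occupied by superseded versions.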
Read ('get') operation: the app server issues a GET for a document; because recently used documents live in the managed cache, the read is served directly from RAM.
Cache ejection: when RAM fills, documents that have already been persisted to disk are ejected from the managed cache to make room for newer data.
Cache miss: if a requested document is not in the managed cache, it is read from disk, placed back into the cache, and returned to the app server.
COUCHBASE SERVER CLUSTER
Cluster wide - Basic Operation
• Docs distributed evenly across servers
• Each server stores both active and replica docs; only one copy is active at a time
• Client library provides the app with a simple interface to the database
• Cluster map tells the client which server a doc is on; the app never needs to know
• App reads, writes, and updates docs
• Multiple app servers can access the same document at the same time
User Configured Replica Count = 1
[Diagram: vBuckets 1-18 distributed as active and replica copies across Servers 1-3; App Servers 1 and 2 read/write/update through the Couchbase client library and its cluster map.]
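The cluster-map lookup described above can be sketched as follows. This is a simplified stand-in: the real client library uses a CRC32-based hash of the key to pick one of 1,024 vBuckets and then consults the server-supplied cluster map; the modulo mapping and server names here are illustrative assumptions.

```python
import zlib

NUM_VBUCKETS = 1024  # Couchbase partitions each bucket into 1,024 vBuckets

def vbucket_for_key(key: bytes) -> int:
    # Hash the key and map it to a vBucket (simplified modulo mapping).
    return zlib.crc32(key) % NUM_VBUCKETS

# A hypothetical cluster map: vBucket id -> server currently active for it.
cluster_map = {vb: f"server-{vb % 3 + 1}" for vb in range(NUM_VBUCKETS)}

vb = vbucket_for_key(b"beer_Hoptimus_Prime")
server = cluster_map[vb]
print(vb, server)
```

Because the key-to-vBucket mapping is fixed, rebalancing only moves vBuckets between servers and updates the cluster map; the application never needs to know where a document lives.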
Cluster wide - Add Nodes to Cluster
• Two servers added with a one-click operation
• Docs automatically rebalanced across the cluster: even distribution of docs, minimum doc movement
• Cluster map updated
• App database calls now distributed over a larger number of servers
[Diagram: after adding Servers 4 and 5, active and replica docs are rebalanced across all five servers; App Servers 1 and 2 continue to read/write/update through the client library's cluster map. User Configured Replica Count = 1.]
Cluster wide - Fail Over Node
• App servers accessing docs
• Requests to Server 3 fail
• Cluster detects the server has failed, promotes replicas of its docs to active, and updates the cluster map
• Requests for those docs now go to the appropriate server
• Typically a rebalance would follow
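The failover promotion above can be sketched with a toy model (illustrative only, assuming the slide's replica count of 1; this is not Couchbase's real failover algorithm):

```python
# Each vBucket has one active and one replica copy on different servers.
active  = {"vb1": "server-1", "vb2": "server-2", "vb3": "server-3"}
replica = {"vb1": "server-2", "vb2": "server-3", "vb3": "server-1"}

def fail_over(failed_server):
    """Promote replicas of the failed server's active vBuckets."""
    for vb, server in list(active.items()):
        if server == failed_server:
            active[vb] = replica[vb]   # replica becomes the new active copy
            del replica[vb]            # unprotected until a rebalance runs

fail_over("server-3")
print(active)   # vb3 is now served by server-1
```

After the promotion, the cluster map is updated so clients route requests for the promoted vBuckets to their new servers; a rebalance then recreates the missing replicas.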
[Diagram: Server 3 has failed; its replica vBuckets on the surviving servers are promoted to active and the cluster map is updated. User Configured Replica Count = 1.]
Indexing and Querying
User Configured Replica Count = 1
• Indexing work is distributed amongst the nodes
• Large data sets are possible
• The effort is parallelized
• Each node has an index for the data stored on it
• Queries combine the results from the required nodes
[Diagram: each of Servers 1-3 holds active and replica docs and indexes its own data; a query is scattered to all nodes and their results are combined. User Configured Replica Count = 1.]
XDCR: Cross Data Center Replication
• Application can access both clusters (active-active replication)
• Scales out linearly
• Different from intra-cluster replication ("CP" versus "AP")
Full Text Search
Documents
Store & Retrieve Operations
• get(key) - retrieve a document
• set(key, value) - store a document, overwrites if it exists
• add(key, value) - store a document, error/exception if it exists
• replace(key, value) - store a document, error/exception if it doesn't exist
• cas(key, value, cas) - compare and swap: mutate a document only if it hasn't changed while executing this operation
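The semantics of these operations can be sketched with a toy in-memory store (illustrative only; a real application would call them through a Couchbase client SDK):

```python
# Toy in-memory model of get/set/add/replace semantics.
store = {}

def get(key):
    return store[key]                      # raises KeyError if missing

def set_(key, value):                      # named set_ to avoid the builtin
    store[key] = value                     # overwrites if the key exists

def add(key, value):
    if key in store:
        raise KeyError(f"{key} already exists")
    store[key] = value

def replace(key, value):
    if key not in store:
        raise KeyError(f"{key} does not exist")
    store[key] = value

set_("doc1", {"n": 1})
add("doc2", {"n": 2})
replace("doc1", {"n": 3})
print(get("doc1"))   # {'n': 3}
```

Calling `add` on an existing key, or `replace` on a missing one, raises an error, which matches the distinction the slide draws between the two.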
Check and Set / Compare and Swap (CAS)
• Compares the supplied CAS value to validate a change: the client gets a key and checksum (cas_token), then updates using the key and checksum; if the checksum doesn't match, the update fails
• A client can only update if the key + CAS match
• Used when multiple clients access the same data
• The first client with the correct CAS wins
• Subsequent client updates receive a CAS mismatch
[Diagram: Actor 1 and Actor 2 both update the same document on Couchbase Server; the first update succeeds, the second receives a CAS mismatch.]
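The two-actor scenario above can be sketched with a toy in-memory CAS (illustrative only; in real Couchbase the server generates and returns the CAS token with each get):

```python
import itertools

_cas_counter = itertools.count(1)
store = {}  # key -> (value, cas_token)

def get(key):
    value, cas_token = store[key]
    return value, cas_token

def set_(key, value):
    store[key] = (value, next(_cas_counter))

def cas(key, value, cas_token):
    _, current = store[key]
    if cas_token != current:
        raise ValueError("CAS mismatch")   # someone else changed it first
    store[key] = (value, next(_cas_counter))

set_("counter", 0)
v1, t1 = get("counter")          # Actor 1 reads
v2, t2 = get("counter")          # Actor 2 reads the same version
cas("counter", v1 + 1, t1)       # Actor 1 wins: token still matches
try:
    cas("counter", v2 + 1, t2)   # Actor 2 loses: token is now stale
except ValueError as e:
    print(e)                     # CAS mismatch
```

On a mismatch the losing client typically re-reads the document (getting a fresh token) and retries its change.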
Document Driven
• Use JSON to store documents: replaces serialized objects and custom structures
• Documents define a "record" of data
• Store/update/retrieve using the same protocol
• JSON is parsed by the server for the view system
JSON Document Structure
Meta information, including the key (all keys are unique and kept in RAM):
{
  "id": "u::[email protected]",
  "rev": "1-0002bce0000000000",
  "flags": 0,
  "expiration": 0,
  "type": "json"
}
Document value (the most recent copy is in RAM and persisted to disk):
{
  "uid": 123456,
  "firstname": "jasdeep",
  "lastname": "Jaitla",
  "age": 22,
  "favorite_colors": ["blue", "black"],
  "email": "[email protected]"
}
A JSON Document
{
  "id": "beer_Hoptimus_Prime",
  "type": "beer",
  "abv": 10.0,
  "brewery": "Legacy Brewing Co.",
  "category": "North American Ale",
  "name": "Hoptimus Prime",
  "style": "Imperial or Double India Pale Ale"
}
Here "id" is the primary key, "type" is the type information, and "abv" is a float.
Other Documents and Document Relationships
{
  "id": "beer_Hoptimus_Prime",
  "type": "beer",
  "abv": 10.0,
  "brewery": "brewery_Legacy_Brewing_Co",
  "category": "North American Ale",
  "name": "Hoptimus Prime",
  "style": "Double India Pale Ale"
}
{
  "id": "brewery_Legacy_Brewing_Co",
  "type": "brewery",
  "name": "Legacy Brewing Co.",
  "address": "525 Canal Street Reading, Pennsylvania, 19601 United States",
  "updated": "2010-07-22 20:00:20",
  "latitude": 40.325725,
  "longitude": -75.928469
}
Afterthought
Simplicity of Document Oriented Datastore
• Schema is optional: technically each document has an implicit schema, and you can extend the schema at any time
• Need a new field? Add it. Define a default for similar objects which may not have this field yet
• Data is self-contained: documents more naturally support the world and data structures around you
• Model data for your app/code instead of for the database
• Try to keep documents as small as possible (less than 1 MB)
• Group data together that fits together, but split out portions that may have high levels of contention or are constantly growing
Views/Indexes/Queries
• Views create perspectives on a collection of documents: primary/secondary/tertiary/composite indexing, aggregations
• Use incremental map/reduce: map defines the relationship between fields in documents and the output table; reduce provides a method for collating/summarizing
• Views materialize indexes: data writes are fast (no inline index update); an index update processes all changes since the last update; documents are eventually indexed; views must be pre-materialized (ad-hoc querying is available via full-text indexing)
• Applications query the index: queries are eventually consistent with respect to documents
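The map/reduce view model above can be sketched in Python (illustrative only; real Couchbase views are JavaScript map/reduce functions run by the server, and the documents here are made up):

```python
# Toy map/reduce "view": index beers by brewery and average their ABV.
docs = [
    {"type": "beer", "name": "Hoptimus Prime",
     "brewery": "Legacy Brewing Co.", "abv": 10.0},
    {"type": "beer", "name": "Pale Ale",
     "brewery": "Legacy Brewing Co.", "abv": 5.5},
    {"type": "brewery", "name": "Legacy Brewing Co."},
]

def map_fn(doc):
    # Map: emit (key, value) rows; one row per beer, keyed by brewery.
    if doc.get("type") == "beer":
        yield (doc["brewery"], doc["abv"])

def reduce_fn(values):
    # Reduce: collate/summarize the mapped values (here: average ABV).
    return sum(values) / len(values)

index = {}
for doc in docs:
    for key, value in map_fn(doc):
        index.setdefault(key, []).append(value)

result = {key: reduce_fn(values) for key, values in index.items()}
print(result)   # {'Legacy Brewing Co.': 7.75}
```

The "incremental" part means the server only runs the map function over documents changed since the last index update, rather than over the whole data set.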
Cluster Administration
Web Console
Backup
1) "cbbackup" copies the data files from each server in the cluster over the network
Restore
2) "cbrestore" (with -a) used to restore data into a live/different cluster
Upgrading
2 Methods to upgrade Couchbase Server cluster:
In-place (offline) and Rolling (online)
Sizing a Cluster
Sizing == performance:
• Serve reads out of RAM
• Enough IO for writes and disk operations
• Mitigate inevitable failures
Reading Data: the application server asks the server "give me document A" and receives "here is document A".
Writing Data: the application server asks "please store document A" and receives "OK, I stored document A".
How many nodes?
5 Key Factors determine number of nodes needed:
1) RAM
2) Disk
3) CPU
4) Network
5) Data Distribution/Safety
Couchbase Servers
Web application server
Application user
DEMO
Couchbase is the Complete Solution
Thank you
Couchbase NoSQL Document Database