Transcript of Couchbase_John_Bryce_Israel_Training_couchbase_overview
Introduction to Couchbase Server
Perry Krug
Sr. Solutions Architect
Couchbase Server 2.0 is a high-performance, easy-to-scale, and flexible document "NoSQL" database.
Easy Scalability: grow the cluster without application changes and without downtime, with a single click.
Consistent High Performance: consistent sub-millisecond read and write response times with consistent high throughput.
Always On 24x365: no downtime for software upgrades, hardware maintenance, etc.
Couchbase Server
Flexible Data Model
JSON document model with no fixed schema.
The NoSQL Promise
Couchbase Feature Set
• Flexible Data Model: JSON support, indexing/querying, incremental map-reduce
• Easy Scalability: "clone to grow" with auto-sharding, cross-data-center replication
• Consistent High Performance: built-in object-level cache
• Always On 24x365: zero-downtime maintenance, built-in data replication with auto-failover, management and monitoring UI, reliable persistence architecture
Couchbase Server Architecture
Each node runs two components:
• Data Manager: object-managed cache, storage engine, and query engine; data access on ports 11210/11211, query API over HTTP on port 8092
• Cluster Manager (Erlang/OTP): replication, rebalance, and shard state manager; REST management API/Web UI (admin console) on port 8091
Couchbase Operations
Web applications talk to the cluster through the Couchbase client library, which handles client interaction, data flow, and cluster management across all Couchbase Server nodes.
Replication Flow
Write ('set') operation: the app server sends the document to the managed cache on a Couchbase Server node. From the cache it enters two queues: the replication queue, which copies it to other nodes, and the disk queue, which persists it to disk.
View processing and XDCR: once a document reaches disk, the view engine processes it for indexing, and the XDCR queue sends it to other clusters.
Disk Compaction
• Disk writes to data files and index are ‘append-only’
• On-disk size increases compared to actual stored data
• Compaction defragments data and index information
• Operates on a live bucket (no downtime)
• Both automatic and manual compaction available
• Compaction operates per-shard on each node
Compaction
Initial file layout: Doc A | Doc B | Doc C
Update some data (writes append): Doc A | Doc B | Doc C | Doc A' | Doc B' | Doc A'' | Doc D
After compaction: Doc C | Doc B' | Doc A'' | Doc D
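The append-only write pattern and compaction step above can be sketched with a toy in-memory model (illustrative only; this is not Couchbase's actual storage format):

```python
# Toy sketch of an append-only data file with compaction.

def append_write(log, key, value):
    """Writes always append; older versions of a key stay in the file."""
    log.append((key, value))

def compact(log):
    """Keep only the latest version of each key, dropping stale entries."""
    latest = {}
    for key, value in log:
        latest[key] = value          # later entries overwrite earlier ones
    return list(latest.items())

log = []
# Initial writes A, B, C, then updates A', B', A'' and a new doc D:
for key, value in [("A", 1), ("B", 1), ("C", 1),
                   ("A", 2), ("B", 2), ("A", 3), ("D", 1)]:
    append_write(log, key, value)

print(len(log))          # 7 entries on "disk" before compaction
compacted = compact(log)
print(len(compacted))    # 4 entries after compaction
```

Because writes only append, the on-disk size grows past the live data set; compaction reclaims the space occupied by superseded versions.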
Read ('get') operation: the app server issues a GET for a document; because recently used documents live in the managed cache, the read is served directly from RAM.
Cache ejection: when RAM fills, documents that have already been persisted to disk are ejected from the managed cache to make room for newer data.
Cache miss: if a requested document is not in the managed cache, it is read from disk, placed back into the cache, and returned to the app server.
COUCHBASE SERVER CLUSTER
Cluster wide - Basic Operation
• Docs distributed evenly across servers
• Each server stores both active and replica docs; only one copy is active at a time
• Client library provides the app with a simple interface to the database
• Cluster map tells the client which server a doc is on; the app never needs to know
• App reads, writes, and updates docs
• Multiple app servers can access the same document at the same time
User Configured Replica Count = 1
[Diagram: vBuckets 1-18 distributed as active and replica copies across Servers 1-3; App Servers 1 and 2 read/write/update through the Couchbase client library and its cluster map.]
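The cluster-map lookup described above can be sketched as follows. This is a simplified stand-in: the real client library uses a CRC32-based hash of the key to pick one of 1,024 vBuckets and then consults the server-supplied cluster map; the modulo mapping and server names here are illustrative assumptions.

```python
import zlib

NUM_VBUCKETS = 1024  # Couchbase partitions each bucket into 1,024 vBuckets

def vbucket_for_key(key: bytes) -> int:
    # Hash the key and map it to a vBucket (simplified modulo mapping).
    return zlib.crc32(key) % NUM_VBUCKETS

# A hypothetical cluster map: vBucket id -> server currently active for it.
cluster_map = {vb: f"server-{vb % 3 + 1}" for vb in range(NUM_VBUCKETS)}

vb = vbucket_for_key(b"beer_Hoptimus_Prime")
server = cluster_map[vb]
print(vb, server)
```

Because the key-to-vBucket mapping is fixed, rebalancing only moves vBuckets between servers and updates the cluster map; the application never needs to know where a document lives.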
Cluster wide - Add Nodes to Cluster
• Two servers added with a one-click operation
• Docs automatically rebalanced across the cluster: even distribution of docs, minimum doc movement
• Cluster map updated
• App database calls now distributed over a larger number of servers
[Diagram: after adding Servers 4 and 5, active and replica docs are rebalanced across all five servers; App Servers 1 and 2 continue to read/write/update through the client library's cluster map. User Configured Replica Count = 1.]
Cluster wide - Fail Over Node
• App servers accessing docs
• Requests to Server 3 fail
• Cluster detects the server has failed, promotes replicas of its docs to active, and updates the cluster map
• Requests for those docs now go to the appropriate server
• Typically a rebalance would follow
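The failover promotion above can be sketched with a toy model (illustrative only, assuming the slide's replica count of 1; this is not Couchbase's real failover algorithm):

```python
# Each vBucket has one active and one replica copy on different servers.
active  = {"vb1": "server-1", "vb2": "server-2", "vb3": "server-3"}
replica = {"vb1": "server-2", "vb2": "server-3", "vb3": "server-1"}

def fail_over(failed_server):
    """Promote replicas of the failed server's active vBuckets."""
    for vb, server in list(active.items()):
        if server == failed_server:
            active[vb] = replica[vb]   # replica becomes the new active copy
            del replica[vb]            # unprotected until a rebalance runs

fail_over("server-3")
print(active)   # vb3 is now served by server-1
```

After the promotion, the cluster map is updated so clients route requests for the promoted vBuckets to their new servers; a rebalance then recreates the missing replicas.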
[Diagram: Server 3 has failed; its replica vBuckets on the surviving servers are promoted to active and the cluster map is updated. User Configured Replica Count = 1.]
Indexing and Querying
User Configured Replica Count = 1
• Indexing work is distributed amongst the nodes
• Large data sets are possible
• The effort is parallelized
• Each node has an index for the data stored on it
• Queries combine the results from the required nodes
[Diagram: each of Servers 1-3 holds active and replica docs and indexes its own data; a query is scattered to all nodes and their results are combined. User Configured Replica Count = 1.]
XDCR: Cross Data Center Replication
• Application can access both clusters (active-active replication)
• Scales out linearly
• Different from intra-cluster replication ("CP" versus "AP")
Full Text Search
Documents
Store & Retrieve Operations
• get(key) - retrieve a document
• set(key, value) - store a document, overwrites if it exists
• add(key, value) - store a document, error/exception if it exists
• replace(key, value) - store a document, error/exception if it doesn't exist
• cas(key, value, cas) - compare and swap: mutate a document only if it hasn't changed while executing this operation
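The semantics of these operations can be sketched with a toy in-memory store (illustrative only; a real application would call them through a Couchbase client SDK):

```python
# Toy in-memory model of get/set/add/replace semantics.
store = {}

def get(key):
    return store[key]                      # raises KeyError if missing

def set_(key, value):                      # named set_ to avoid the builtin
    store[key] = value                     # overwrites if the key exists

def add(key, value):
    if key in store:
        raise KeyError(f"{key} already exists")
    store[key] = value

def replace(key, value):
    if key not in store:
        raise KeyError(f"{key} does not exist")
    store[key] = value

set_("doc1", {"n": 1})
add("doc2", {"n": 2})
replace("doc1", {"n": 3})
print(get("doc1"))   # {'n': 3}
```

Calling `add` on an existing key, or `replace` on a missing one, raises an error, which matches the distinction the slide draws between the two.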
Check and Set / Compare and Swap (CAS)
• Compares the supplied CAS value to validate a change: the client gets a key and checksum (cas_token), then updates using the key and checksum; if the checksum doesn't match, the update fails
• A client can only update if the key + CAS match
• Used when multiple clients access the same data
• The first client with the correct CAS wins
• Subsequent client updates receive a CAS mismatch
[Diagram: Actor 1 and Actor 2 both update the same document on Couchbase Server; the first update succeeds, the second receives a CAS mismatch.]
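The two-actor scenario above can be sketched with a toy in-memory CAS (illustrative only; in real Couchbase the server generates and returns the CAS token with each get):

```python
import itertools

_cas_counter = itertools.count(1)
store = {}  # key -> (value, cas_token)

def get(key):
    value, cas_token = store[key]
    return value, cas_token

def set_(key, value):
    store[key] = (value, next(_cas_counter))

def cas(key, value, cas_token):
    _, current = store[key]
    if cas_token != current:
        raise ValueError("CAS mismatch")   # someone else changed it first
    store[key] = (value, next(_cas_counter))

set_("counter", 0)
v1, t1 = get("counter")          # Actor 1 reads
v2, t2 = get("counter")          # Actor 2 reads the same version
cas("counter", v1 + 1, t1)       # Actor 1 wins: token still matches
try:
    cas("counter", v2 + 1, t2)   # Actor 2 loses: token is now stale
except ValueError as e:
    print(e)                     # CAS mismatch
```

On a mismatch the losing client typically re-reads the document (getting a fresh token) and retries its change.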
Document Driven
• Use JSON to store documents: replaces serialized objects and custom structures
• Documents define a "record" of data
• Store/update/retrieve using the same protocol
• JSON is parsed by the server for the view system
JSON Document Structure
Meta information, including the key (all keys are unique and kept in RAM):
{
  "id": "u::[email protected]",
  "rev": "1-0002bce0000000000",
  "flags": 0,
  "expiration": 0,
  "type": "json"
}
Document value (the most recent copy is in RAM and persisted to disk):
{
  "uid": 123456,
  "firstname": "jasdeep",
  "lastname": "Jaitla",
  "age": 22,
  "favorite_colors": ["blue", "black"],
  "email": "[email protected]"
}
A JSON Document
{
  "id": "beer_Hoptimus_Prime",
  "type": "beer",
  "abv": 10.0,
  "brewery": "Legacy Brewing Co.",
  "category": "North American Ale",
  "name": "Hoptimus Prime",
  "style": "Imperial or Double India Pale Ale"
}
Here "id" is the primary key, "type" is the type information, and "abv" is a float.
Other Documents and Document Relationships
{
  "id": "beer_Hoptimus_Prime",
  "type": "beer",
  "abv": 10.0,
  "brewery": "brewery_Legacy_Brewing_Co",
  "category": "North American Ale",
  "name": "Hoptimus Prime",
  "style": "Double India Pale Ale"
}
{
  "id": "brewery_Legacy_Brewing_Co",
  "type": "brewery",
  "name": "Legacy Brewing Co.",
  "address": "525 Canal Street Reading, Pennsylvania, 19601 United States",
  "updated": "2010-07-22 20:00:20",
  "latitude": 40.325725,
  "longitude": -75.928469
}
Afterthought
Simplicity of Document Oriented Datastore
• Schema is optional: technically each document has an implicit schema, and you can extend the schema at any time
• Need a new field? Add it. Define a default for similar objects which may not have this field yet
• Data is self-contained: documents more naturally support the world and data structures around you
• Model data for your app/code instead of for the database
• Try to keep documents as small as possible (less than 1 MB)
• Group data together that fits together, but split out portions that may have high levels of contention or are constantly growing
Views/Indexes/Queries
• Views create perspectives on a collection of documents: primary/secondary/tertiary/composite indexing, aggregations
• Use incremental map/reduce: map defines the relationship between fields in documents and the output table; reduce provides a method for collating/summarizing
• Views materialize indexes: data writes are fast (no inline index update); an index update processes all changes since the last update; documents are eventually indexed; views must be pre-materialized (ad-hoc querying is available via full-text indexing)
• Applications query the index: queries are eventually consistent with respect to documents
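The map/reduce view model above can be sketched in Python (illustrative only; real Couchbase views are JavaScript map/reduce functions run by the server, and the documents here are made up):

```python
# Toy map/reduce "view": index beers by brewery and average their ABV.
docs = [
    {"type": "beer", "name": "Hoptimus Prime",
     "brewery": "Legacy Brewing Co.", "abv": 10.0},
    {"type": "beer", "name": "Pale Ale",
     "brewery": "Legacy Brewing Co.", "abv": 5.5},
    {"type": "brewery", "name": "Legacy Brewing Co."},
]

def map_fn(doc):
    # Map: emit (key, value) rows; one row per beer, keyed by brewery.
    if doc.get("type") == "beer":
        yield (doc["brewery"], doc["abv"])

def reduce_fn(values):
    # Reduce: collate/summarize the mapped values (here: average ABV).
    return sum(values) / len(values)

index = {}
for doc in docs:
    for key, value in map_fn(doc):
        index.setdefault(key, []).append(value)

result = {key: reduce_fn(values) for key, values in index.items()}
print(result)   # {'Legacy Brewing Co.': 7.75}
```

The "incremental" part means the server only runs the map function over documents changed since the last index update, rather than over the whole data set.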
Cluster Administration
Web Console
Backup
1) "cbbackup" copies the data files from each server in the cluster over the network
Restore
2) "cbrestore" (with -a) used to restore data into a live/different cluster
Upgrading
2 Methods to upgrade Couchbase Server cluster:
In-place (offline) and Rolling (online)
Sizing a Cluster
Sizing == performance:
• Serve reads out of RAM
• Enough IO for writes and disk operations
• Mitigate inevitable failures
Reading Data: the application server asks the server "give me document A" and receives "here is document A".
Writing Data: the application server asks "please store document A" and receives "OK, I stored document A".
How many nodes?
5 Key Factors determine number of nodes needed:
1) RAM
2) Disk
3) CPU
4) Network
5) Data Distribution/Safety
Couchbase Servers
Web application server
Application user
DEMO
Couchbase is the Complete Solution
Thank you
Couchbase NoSQL Document Database