Introducing Multi-Dimensional Scaling: Independent Scalability for Query and Indexing Performance...

Multi Dimensional Scaling with Couchbase Server 4.0 Cihan Biyikoglu | Dir. Product Management, Couchbase

Page 1: Introducing Multi-Dimensional Scaling: Independent Scalability for Query and Indexing Performance with Couchbase Server 4.0 – Couchbase Live New York 2015

Multi Dimensional Scaling with Couchbase Server 4.0

Cihan Biyikoglu | Dir. Product Management, Couchbase

Page 2

©2015 Couchbase Inc. 2

Agenda

Brief History of Scaling
– Scaling up and out

NoSQL Workloads and Scalability Model
– Core Data operations, Indexing and Query

Introducing Multi Dimensional Scalability
– Services and Independent Scalability

Demo

Q&A

Page 3

History of Scaling

Page 4

Question

A few million people are looking for a setup in which to live and interact efficiently. What is the most efficient way to build this infrastructure?
A) Build one giant high-rise?
B) Build some mid-rises?
C) Build many single-family homes?

Page 5

Scaling Up
Build one big high-rise

Vertical Scaling
– Cluster processors – hyper-threading to cores
– Locally partition workload among processors
– Communicate over memory

Great for fast processing but limited in scalability and elasticity

Page 6

Scaling Out
Build a large community of single-family houses

Horizontal Scaling
– Cluster commodity HW
– Partition workload among nodes
– Communicate over network

Great for scaling and elasticity but slower communication

Page 7

So what is the right model?

Page 8

NoSQL Workloads & Scalability Model

Page 9

NoSQL Workloads
One Database Many Workloads
– Core Data Processing: GETs & SETs for given key
– Indexing: Index maintenance and lookups
– Querying: Combine index and data with complex just-in-time data re-shaping, ordering, grouping, aggregations and more

Varying resource requirements - CPU, RAM, I/O, Network

Varying methods to optimize latency & throughput for each

Page 10

Scalability Model Today
Homogeneous Scaling
– Each node gets a slice of the workload
– Simple to do…

But...
• Workloads compete and interfere with each other
• Can't fine-tune each workload
- Core Data operations are partitionable, so they do well with wider fan-out
- Indexing and Query are not always partitionable, so they do worse with wider fan-out

[Diagram labels: Couchbase Cluster; node1, node8; Data Service; Index Service; Query Service]
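To make the fan-out point concrete, here is a minimal sketch. It is not Couchbase code: the eight-node list (read from the node1/node8 labels), the `node_for_key` helper and the CRC32 hash are illustrative assumptions. A key-based operation touches one node however wide the cluster is, while a scan over per-node index partitions has to ask every node.

```python
import zlib

# Assumption: eight nodes, node1 .. node8, based on the diagram labels.
NODES = [f"node{i}" for i in range(1, 9)]


def node_for_key(key, nodes=NODES):
    """Hash-partition a key onto exactly one node (illustrative hash choice)."""
    return nodes[zlib.crc32(key.encode("utf-8")) % len(nodes)]


def nodes_touched_by_get(key, nodes=NODES):
    """A GET/SET is routed by key, so it touches a single node."""
    return len({node_for_key(key, nodes)})


def nodes_touched_by_scatter_gather(nodes=NODES):
    """A scan over index partitions that live on every node must ask them all."""
    return len(nodes)


if __name__ == "__main__":
    print(nodes_touched_by_get("user::1001"))   # one node, whatever the cluster size
    print(nodes_touched_by_scatter_gather())    # grows with the cluster
```

Adding nodes leaves the first number unchanged and increases the second, which is why a wider fan-out helps key-value work and can hurt queries.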

Page 11

Introducing Multi Dimensional Scalability

Page 12

Modern Architecture
What is Multi-Dimensional Scalability?

MDS is the architecture that enables independent scaling of data, query and indexing workloads.

[Diagram labels: Couchbase Cluster; node1, node8; Data Service; Index Service; Query Service]

Page 13

Modern Architecture
Isolated Service for minimized interference
– Independent “zones” for Query, Index and Data Services

Minimize indexing and query overhead on core KV operations

[Diagram labels: Couchbase Cluster; node1, node8; Data Service: Views and Geo Views; Index Service: Global Secondary Indexes; Query Service]

Page 14

Modern Architecture
Independent Scalability for Best Computational Capacity per Service

Heavier indexing (index more fields): scale up index service nodes
More RAM for query processing: scale up query service nodes

[Diagram labels: Couchbase Cluster; node1, node8; Data Service; Index Service; Query Service]
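A minimal sketch of what "independent" means here, assuming a hypothetical in-memory description of the cluster (the `layout` dictionary, node names and RAM sizes are made up for illustration, not taken from the talk). Scaling up the one index node changes the capacity behind the index service and nothing else.

```python
# Hypothetical layout: services and RAM per node (illustrative values only).
layout = {
    "node1": {"services": {"data"}, "ram_gb": 32},
    "node2": {"services": {"data"}, "ram_gb": 32},
    "node3": {"services": {"index"}, "ram_gb": 32},
    "node4": {"services": {"query"}, "ram_gb": 32},
}


def ram_per_service(layout):
    """Total RAM standing behind each service across the cluster."""
    totals = {}
    for spec in layout.values():
        for service in spec["services"]:
            totals[service] = totals.get(service, 0) + spec["ram_gb"]
    return totals


def scale_up(layout, node, ram_gb):
    """Return a new layout in which one node has been given more RAM."""
    grown = {name: dict(spec) for name, spec in layout.items()}
    grown[node]["ram_gb"] = ram_gb
    return grown


if __name__ == "__main__":
    # Heavier indexing: give only the index node more RAM.
    print(ram_per_service(scale_up(layout, "node3", 128)))
```

With homogeneous scaling every node would need the larger hardware; here only the node in the index "zone" changes.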

Page 15

Under the Hood
Services Architecture

Data, Index & Query

Page 16

Couchbase Server 4.0 - Cluster Architecture

[Diagram: six nodes, Couchbase Server 1 to Couchbase Server 6. Labels on each node: Cluster Manager, Data Service, Index Service, Query Service, Managed Cache, Storage, and shards (SHARD5, SHARD7 and SHARD9 are named).]

Page 17

Couchbase Server 4.0 - Cluster Architecture

[Diagram: six nodes, Couchbase Server 1 to Couchbase Server 6. Labels on each node: Cluster Manager, Data Service, Index Service, Query Service, Managed Cache, Storage, and shards (SHARD5, SHARD7 and SHARD9 are named).]

Page 18

DEMO

Page 19

Connectivity

Page 20

Connectivity and Client Libraries

Type   | Port         | Endpoint
REST   | 8091, 18091  | Admin Connections: pointed at any node in the cluster
REST   | 8092, 18092  | Query with View: load balanced across the nodes of the cluster that run the data service
REST   | 8093, 18093  | Query with N1QL: load balanced across the nodes of the cluster that run the query service
ONLINE | 11210, 11207 | Core Data Operations: stateful connections from the client app to the nodes of the cluster that run the data service
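The port table can be read as a simple lookup. The sketch below assumes that the second port in each pair is the TLS variant, and the service keys (`admin`, `views`, `n1ql`, `data`) and the `endpoint` helper are illustrative names rather than anything from a client library. The view port is written as 8092, which is the port Couchbase Server uses for the view REST API.

```python
# (plain port, TLS port) per connection type; keys are illustrative names.
PORTS = {
    "admin": (8091, 18091),   # REST, any node in the cluster
    "views": (8092, 18092),   # REST, nodes running the data service
    "n1ql": (8093, 18093),    # REST, nodes running the query service
    "data": (11210, 11207),   # stateful connections, nodes running the data service
}


def endpoint(host, service, tls=False):
    """Return host:port for a connection type, choosing the TLS port on request."""
    plain, secure = PORTS[service]
    return f"{host}:{secure if tls else plain}"


if __name__ == "__main__":
    print(endpoint("node1", "n1ql"))
    print(endpoint("node1", "data", tls=True))
```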

Page 21

Connectivity and Client Libraries
Connectivity Phases
1. Auth
2. Discovery
• Get cluster map
3. Service Connection
• Auth to Service
• Run operation
• If (topology_change)
• Rerun #2

[Diagram labels: 1,2; 3]
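The three phases can be sketched as a small loop. Everything here is a stand-in: `FakeCluster`, `Client`, the credentials and the revision counter used to detect a topology change are assumptions for illustration, and the per-service auth step is omitted for brevity.

```python
class FakeCluster:
    """Stand-in for a cluster: a versioned map of service -> nodes."""

    def __init__(self):
        self.rev = 1
        self.services = {"data": ["node1", "node2"], "query": ["node3"]}

    def authenticate(self, user, password):
        return user == "app" and password == "secret"

    def cluster_map(self):
        nodes = {name: list(hosts) for name, hosts in self.services.items()}
        return {"rev": self.rev, "services": nodes}

    def add_node(self, service, node):
        self.services[service].append(node)
        self.rev += 1  # every topology change bumps the revision


class Client:
    def __init__(self, cluster, user, password):
        # Phase 1: auth
        if not cluster.authenticate(user, password):
            raise PermissionError("authentication failed")
        self.cluster = cluster
        # Phase 2: discovery (get cluster map)
        self.map = cluster.cluster_map()

    def nodes_for(self, service):
        if self.cluster.rev != self.map["rev"]:      # topology change
            self.map = self.cluster.cluster_map()    # rerun phase 2
        return self.map["services"][service]

    def run(self, service, operation):
        # Phase 3: service connection, then run the operation
        node = self.nodes_for(service)[0]
        return f"{operation} -> {node}"


if __name__ == "__main__":
    cluster = FakeCluster()
    client = Client(cluster, "app", "secret")
    print(client.run("query", "SELECT 1"))
    cluster.add_node("query", "node4")
    print(client.nodes_for("query"))
```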

Page 22

Discovery and Cluster Map

Page 23

Discovery and Cluster Map

Page 24

Discovery and Cluster Map – 2 New Nodes

Page 25

Replication

Page 26

Database Change Protocol (DCP)
Fast Streaming Replication

DCP - An open streaming protocol that conveys the consistent database state to all consumers
– Ordering (vbucket based seq. number)
– Re-startable, Resumable (version histories and rollbacks)
– Consistent (snapshots)
– High Performance (memory based with dedup)

[Diagram labels: Source Cluster; Master; Local Replica; Index; Map/Reduce; Cross Data Center Cluster; Remote Replica; Index; Map/Reduce; Hadoop; Client/Application; Notification; In future; Integration; Backup/Export; Tooling]
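The ordering, resumability and dedup properties can be illustrated with a toy model. This is not the DCP wire protocol: `VBucketStream` and `Consumer` are hypothetical classes, and rollbacks and version histories are left out.

```python
class VBucketStream:
    """Toy per-vbucket change log ordered by sequence number."""

    def __init__(self):
        self.seqno = 0
        self.log = []  # (seqno, key, value) in arrival order

    def mutate(self, key, value):
        self.seqno += 1
        self.log.append((self.seqno, key, value))
        return self.seqno

    def snapshot_since(self, last_seqno):
        """Changes after last_seqno, de-duplicated to the newest per key."""
        latest = {}
        for seqno, key, value in self.log:
            if seqno > last_seqno:
                latest[key] = (seqno, key, value)
        return sorted(latest.values())  # seqnos are unique, so this orders by seqno


class Consumer:
    """A replica, indexer or other consumer that remembers where it stopped."""

    def __init__(self):
        self.last_seqno = 0
        self.state = {}

    def catch_up(self, stream):
        for seqno, key, value in stream.snapshot_since(self.last_seqno):
            self.state[key] = value
            self.last_seqno = seqno


if __name__ == "__main__":
    stream, replica = VBucketStream(), Consumer()
    stream.mutate("a", 1)
    stream.mutate("a", 2)   # supersedes the first write in the snapshot
    stream.mutate("b", 3)
    replica.catch_up(stream)
    stream.mutate("a", 4)
    replica.catch_up(stream)  # resumes after seqno 3, no full resend
    print(replica.state, replica.last_seqno)
```

Because the consumer keeps only its last sequence number, it can disconnect and resume without the producer tracking per-consumer state.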

Page 27

Cluster Manager

Page 28

Cluster Manager
Cluster Manager = Governor of the Cluster

Manages cluster level operations and coordination among nodes
– Cluster Membership & Service Layout
– Node Status & Failover
– Data Placement & Rebalance
– Auth

Page 29

Cluster Manager
Inside Cluster Manager

[Diagram labels:
– Admin Portal – REST API
– Auth
– Master Services: cluster level operations, data placement, rebalancer, auto-failover
– per-node-&-bucket services
– Per-node Services: Heartbeats, Babysitter
– Bucket services: dcp init and teardown, stats collectors
– generic distributed facilities
– distributed node discovery
– Global Config (gossip replication)
– generic local facilities
– Local Config Store
– Logging and Other Services]

Page 30

Adding Nodes to Cluster Online

[Diagram: Couchbase Server 1, Couchbase Server 2 and Couchbase Server 3 each hold ACTIVE and REPLICA shards; Couchbase Server 4 and Couchbase Server 5 also appear with ACTIVE and REPLICA shards. Shards are labeled SHARD1 to SHARD9. Client traffic is labeled READ/WRITE/UPDATE.]

Cluster Manager receives the new nodes
- Nodes inherit cluster settings
- Move active and replica vbuckets using DCP
- As vbuckets catch up, initiate online handoff from “existing node” to “new node”

Clients Receive Topology Change Notification
- Trap not_my_vbucket errors
- Refresh cluster map and retry operation
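The client side of an online handoff can be sketched as a trap-refresh-retry loop. This is a toy model under stated assumptions: the `Node`, `Cluster` and `Client` classes, the eight-vbucket map and the CRC32 hash are illustrative, and a real client library handles this internally.

```python
import zlib

NUM_VBUCKETS = 8  # tiny on purpose; a real bucket has many more


def vbucket_of(key):
    """Illustrative key -> vbucket hash (CRC32 modulo the vbucket count)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_VBUCKETS


class NotMyVBucket(Exception):
    """Raised by a node that does not own the requested vbucket."""


class Node:
    def __init__(self, name):
        self.name, self.owned, self.data = name, set(), {}

    def get(self, vbucket, key):
        if vbucket not in self.owned:
            raise NotMyVBucket(f"{self.name} does not own vbucket {vbucket}")
        return self.data.get(key)


class Cluster:
    def __init__(self):
        first = Node("server1")
        first.owned = set(range(NUM_VBUCKETS))
        self.nodes = {"server1": first}
        self.vbmap = {vb: "server1" for vb in range(NUM_VBUCKETS)}

    def set(self, key, value):
        self.nodes[self.vbmap[vbucket_of(key)]].data[key] = value

    def add_node(self, name):
        self.nodes[name] = Node(name)

    def move_vbucket(self, vbucket, target):
        """Copy one vbucket to the target node, then hand off ownership."""
        source, dest = self.nodes[self.vbmap[vbucket]], self.nodes[target]
        for key in [k for k in source.data if vbucket_of(k) == vbucket]:
            dest.data[key] = source.data.pop(key)
        source.owned.discard(vbucket)
        dest.owned.add(vbucket)
        self.vbmap[vbucket] = target


class Client:
    def __init__(self, cluster):
        self.cluster = cluster
        self.vbmap = dict(cluster.vbmap)  # cached cluster map
        self.refreshes = 0

    def get(self, key):
        vb = vbucket_of(key)
        for _ in range(2):  # original attempt plus one retry
            node = self.cluster.nodes[self.vbmap[vb]]
            try:
                return node.get(vb, key)
            except NotMyVBucket:
                self.vbmap = dict(self.cluster.vbmap)  # refresh cluster map
                self.refreshes += 1
        raise RuntimeError("vbucket owner not found after refreshing the map")


if __name__ == "__main__":
    cluster = Cluster()
    cluster.set("user::1", "alice")
    client = Client(cluster)
    cluster.add_node("server4")
    cluster.move_vbucket(vbucket_of("user::1"), "server4")
    print(client.get("user::1"), client.refreshes)
```

The read succeeds without the application noticing the move; the only cost is one extra round trip for the stale map.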

Page 31

Data Service

Page 32

Data Service
Data Service = GET/SET + Map-Reduce Views*

Tackles fast core data operations with efficient caching and disk persistence

Core Database Operations
– Core GET/SET operations
– Couchstore Based Storage

Terms:
Bucket = a database that resides within a cluster
vBucket = a hash partition of the database that resides within a node
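The Bucket and vBucket terms describe a two-level lookup: key to vBucket, then vBucket to node. The sketch below assumes 1024 vBuckets per bucket (a commonly cited default), a plain CRC32-modulo hash and round-robin placement; the exact hash and placement the server uses may differ, and all function names are illustrative.

```python
import zlib

NUM_VBUCKETS = 1024  # assumed vBucket count per bucket


def vbucket_id(key):
    """Level 1: hash the document key to a vBucket (illustrative hash)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_VBUCKETS


def build_vbucket_map(nodes):
    """Level 2: assign each vBucket to a data node, round-robin for simplicity."""
    return [nodes[vb % len(nodes)] for vb in range(NUM_VBUCKETS)]


def node_for(key, vbmap):
    """Resolve a key to the node that holds its vBucket."""
    return vbmap[vbucket_id(key)]


if __name__ == "__main__":
    vbmap = build_vbucket_map(["node1", "node2", "node3"])
    print(vbucket_id("user::1001"), node_for("user::1001", vbmap))
```

Keeping the key-to-vBucket step fixed means a rebalance only has to change the second table, not rehash every document.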

Page 33

Data Manager Architecture

[Diagram labels: Database Engine (ep-engine); Listener; Auth; Network IO; vBucket Manager; Item Pager; Expiry Pager; Checkpoint Manager; Cache; Partition Hash Tables (Active and Replica); Flusher; Scheduler; Reader IO; Writer IO; Non IO; Batch Reader]

Page 34

Query Service

Page 35

Query Service
Query Service = N1QL

Tackles N1QL query execution
– Query Execution
– N1QL Parser & Optimizer: tokenizes the N1QL statement and generates an execution plan that utilizes indexes
– Query Execution Engine: assigns resources to the query and coordinates query execution
– Data Sources: pluggable “data source driver” layer for accessing data sources in Couchbase Server (data and index service) and other external data providers
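From a client's point of view, handing a statement to the query service is an HTTP POST to port 8093. A minimal sketch using only the standard library follows; the host name `node3` and the sample statement are placeholders, authentication is left out, and `run_query` needs a live cluster, so only the request construction is exercised here.

```python
import json
import urllib.parse
import urllib.request


def build_n1ql_request(host, statement, port=8093):
    """Build (but do not send) a POST to the query service REST endpoint."""
    body = urllib.parse.urlencode({"statement": statement}).encode("ascii")
    return urllib.request.Request(
        url=f"http://{host}:{port}/query/service",
        data=body,
        method="POST",
    )


def run_query(request):
    """Send the request and decode the JSON response (requires a reachable cluster)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    request = build_n1ql_request("node3", "SELECT 1")
    print(request.get_method(), request.full_url, request.data)
```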

Page 36

Query Service
N1QL Query Processing

[Diagram labels: Listeners (8093/18093); Query Engine; Query Processor; Parser; Optimizer; Execution Engine; Data Stores; Couchbase Server; File system; Others…; Auth; Data; Indexers; GSI; Views; Data Service (Bucket#2, Bucket#2); Index Service (Index#1, Index#2); Cluster Manager]

Page 37

Index Service

Page 38

Index Service
Global Secondary Indexes (NEW in 4.0)

Tackles indexing for fast query execution, with efficient index maintenance for N1QL queries
– High Performance Indexing
– Projector and Router: coordinate and communicate efficient index change notifications between data service and index service
– Supervisor – Indexer and Scanner
  Indexer: maintains a large number of indexes as change notifications arrive
  Scanner: responds to Query Service index-scan requests with a rich set of consistency dials
– Index Storage & Caching
  ForestDB: brand new storage engine for high performance index caching and storage
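The "consistency dials" are easiest to see with an index that lags behind the data. In this toy model the mode names `not_bounded` and `request_plus` are borrowed from N1QL's `scan_consistency` request parameter, while `ToyIndex`, `ToyScanner` and the pending-change queue are hypothetical and say nothing about how the real indexer is built.

```python
class ToyIndex:
    """Index on one field, fed by change notifications."""

    def __init__(self):
        self.entries = {}        # doc_id -> indexed value
        self.indexed_seqno = 0   # last change applied

    def apply(self, seqno, doc_id, value):
        self.entries[doc_id] = value
        self.indexed_seqno = seqno


class ToyScanner:
    def __init__(self, index, pending):
        self.index = index
        self.pending = pending   # change notifications not yet applied, oldest first

    def scan(self, value, consistency="not_bounded", request_seqno=0):
        if consistency == "request_plus":
            # Wait (here: apply pending changes) until the index has caught up
            # with the data service's sequence number at request time.
            while self.index.indexed_seqno < request_seqno and self.pending:
                self.index.apply(*self.pending.pop(0))
        # "not_bounded" answers from whatever the index holds right now.
        return sorted(d for d, v in self.index.entries.items() if v == value)


if __name__ == "__main__":
    index = ToyIndex()
    index.apply(1, "doc1", "NY")
    scanner = ToyScanner(index, pending=[(2, "doc2", "NY")])
    print(scanner.scan("NY"))                                   # fast, may be stale
    print(scanner.scan("NY", "request_plus", request_seqno=2))  # sees doc2
```

The trade-off is latency against freshness: the stricter dial makes the scan wait for index maintenance to catch up.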

Page 39

[Diagram labels: Data Service (Bucket#1, Bucket#2; Projector & Router); DCP Stream; Indexing Service; Index Service (Supervisor: index maintenance & scan coordinator; Index#1, Index#2, Index#3, Index#4; ForestDB Storage Engine); Query Service (Query Processor, cbq-engine)]

Projector and Router
– 1 Projector and Router per node
– 1 stream of changes per bucket per supervisor

Supervisor
– 1 Supervisor per node
– Many indexes per Supervisor

Page 40

Recap

Page 41

Recap

MDS enables unprecedented control over scalability with Couchbase Server
– Separate out competing workloads to independent services
– Independently scale each service “zone” within the cluster

Couchbase Server with MDS maximizes scalability and performance
– Improves scale and performance to degrees not possible with other NoSQL or Big Data engines on premise or in the cloud
– Improves price/performance and squeezes more performance and throughput out of mission critical systems

Page 42

Get Started with Couchbase Server 4.0 - Couchbase.com/Downloads

Q&A
Cihan Biyikoglu | [email protected] | @cihangirb