Best Practices for Scaling and Deploying Couchbase Mobile in the Cloud: Couchbase Connect 2015

Traun Leyden, Couchbase

©2015 Couchbase Inc.

Couchbase Mobile Overview

[Diagram, built up over four slides: Couchbase Lite clients (iOS, Android, PhoneGap, etc.) sync with a tier of Sync Gateway nodes, which is backed by a cluster of Couchbase Server nodes. Callouts: "Share Nothing" and "Scale Horizontally", with one more Sync Gateway node and then one more Couchbase Server node added to the picture.]

Scaling Challenges

Clients Will Keep Long-Running Sockets Open

[Diagram: Couchbase Lite clients (iOS, Android, PhoneGap, etc.) connected to Sync Gateway over WebSocket and HTTP long polling]

WebSocket and HTTP Long-Polling Clients

Android allows background apps to keep sockets open longer

TCP keepalive considerations:
- Clients tend to leave half-open connections
- Dead peers can live up to 2 hours before being cleaned up by the TCP/IP stack
- It's possible to tune this; see our Sync Gateway docs under "OS Level Tuning"
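The "2 hours" figure lines up with the stock Linux keepalive settings. The sketch below works through that arithmetic. It assumes the usual Linux defaults (tcp_keepalive_time=7200, tcp_keepalive_intvl=75, tcp_keepalive_probes=9) and treats the detection time as approximately idle time plus one interval per unanswered probe. The tuned values are hypothetical, not a recommendation from the Sync Gateway docs.

```python
def dead_peer_detection_seconds(idle_time: int, probe_interval: int, probe_count: int) -> int:
    """Approximate worst-case seconds before the kernel drops a silent peer.

    The kernel waits idle_time before the first keepalive probe, then sends
    probe_count probes spaced probe_interval apart before giving up.
    """
    return idle_time + probe_interval * probe_count


# Stock Linux defaults: tcp_keepalive_time, tcp_keepalive_intvl, tcp_keepalive_probes.
LINUX_DEFAULTS = {"idle_time": 7200, "probe_interval": 75, "probe_count": 9}

# Hypothetical tuned values, for illustration only.
TUNED_EXAMPLE = {"idle_time": 600, "probe_interval": 60, "probe_count": 5}

if __name__ == "__main__":
    for label, settings in (("defaults", LINUX_DEFAULTS), ("tuned", TUNED_EXAMPLE)):
        seconds = dead_peer_detection_seconds(**settings)
        print(f"{label}: {seconds} s (about {seconds / 60:.0f} min)")
```

Keepalive probes only apply to sockets that have SO_KEEPALIVE enabled, so lowering the kernel values helps only for those connections.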

Couchbase Views

Couchbase views can pose scalability challenges

Sync Gateway uses 8 Couchbase views

Couchbase views are relatively expensive operations

The Couchbase Server documentation provides specific guidelines regarding view performance

Scaling Best Practices

Sizing Considerations

Recommended number of users per Sync Gateway

5K connected users per Sync Gateway on a quad-core machine

That number depends on many factors:
- How often users connect
- How powerful your hardware is
- Whether custom views are used (note: requires more cores)
- How many view queries are triggered

These numbers are rough approximations; do your own testing
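As a first-pass estimate, the 5K figure can be turned into a node count. This is a minimal sketch: the `spare_nodes` headroom parameter is an assumption added for illustration, and the result should be checked against your own load tests.

```python
import math

# The talk's rough figure for a quad-core machine; measure your own.
USERS_PER_GATEWAY = 5000


def gateways_needed(connected_users: int,
                    users_per_gateway: int = USERS_PER_GATEWAY,
                    spare_nodes: int = 1) -> int:
    """Estimate how many Sync Gateway nodes to run for a given user count.

    spare_nodes is a hypothetical headroom allowance (for example, to
    survive the loss of one node); it is not a figure from the talk.
    """
    if connected_users < 0 or users_per_gateway <= 0 or spare_nodes < 0:
        raise ValueError("counts must be non-negative and capacity positive")
    return math.ceil(connected_users / users_per_gateway) + spare_nodes


if __name__ == "__main__":
    print(gateways_needed(18000))
```

With 18,000 connected users, this gives four nodes for capacity plus one spare.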

Use a Load Balancer to Distribute Load

[Diagram: Couchbase Lite clients (iOS, Android, PhoneGap, etc.) connect through a load balancer to several Sync Gateway nodes, which are backed by a cluster of Couchbase Server nodes]

Load Balancer Considerations

The load balancer can pick any Sync Gateway; there are no “sticky sessions” to worry about

Nginx is known to work with Sync Gateway

Amazon Elastic Load Balancer (ELB) is known to have WebSocket issues
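Because no sticky sessions are needed, any stateless selection policy will do. The sketch below illustrates plain round-robin selection in Python. The class name and backend addresses are hypothetical, and in a real deployment this policy lives in the load balancer's own configuration (for example Nginx), not in application code.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out backends in rotation, with no per-client affinity."""

    def __init__(self, backends):
        backends = list(backends)
        if not backends:
            raise ValueError("at least one backend is required")
        self._rotation = cycle(backends)

    def pick(self) -> str:
        """Return the next backend; any client may land on any node."""
        return next(self._rotation)


if __name__ == "__main__":
    # Hypothetical Sync Gateway addresses on the public port (4984).
    balancer = RoundRobinBalancer([
        "sg1.example.internal:4984",
        "sg2.example.internal:4984",
        "sg3.example.internal:4984",
    ])
    for _ in range(4):
        print(balancer.pick())
```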

Multiple Datacenter Considerations

With Sync Gateway, Couchbase XDCR (Cross Datacenter Replication) only works in a master-slave scenario

In general, it's not safe to mutate a Couchbase bucket that is updated by Sync Gateway

In a master-master scenario, the Couchbase bucket owned by Sync Gateway would be updated in an unsafe manner, so master-master XDCR is not a valid usage scenario

sg-replicate (experimental – alpha) can work in a master-master scenario, because it speaks the Sync Gateway replication protocol and all changes go through Sync Gateway

A Tour of Our New Performance Testing Suite

The Problem

Testers need to spin up large, ephemeral clusters for performance testing

Testing teams need their own cluster without interfering with other teams

With dedicated clusters, many of the servers are frequently idle, which has cost implications

The performance test suite should work on:
- Various cloud providers (AWS, GCE, etc.)
- Bare-metal clusters

No common performance testing suite has been available for everyone to share and contribute to, resulting in duplicated effort

We want you to use this tool!

Save time/effort by doing your own internal performance testing using this tool instead of building your own

Make it easy for us to reproduce your issue under identical conditions

Contribute back any scenarios specific to your use cases, for possible inclusion into the common performance testing suite

Components of the Performance Test Suite

AWS CloudFormation – Machine Provisioning (AWS Specific)

Ansible – Software Provisioning

Gateload or Gatling – Load Generators

Splunk – Monitoring and Analysis (optional)

AWS CloudFormation

Features

Allows you to compactly specify a group of EC2 instances and their settings via a single JSON file

You can launch or destroy a CloudFormation cluster in a single operation

You can easily automate operations via the AWS command line interface (CLI) tool

CloudFormation is available in the AWS Management Console

Ansible

Features

Ansible is an automated provisioning tool, like Chef and Puppet

No agents are required, so there are no "bootstrapping" conundrums

Uses a declarative approach via easy-to-read YAML files; there is no "code" to maintain

Easy for people to pick up and make changes, even non-programmers

100% open source; it has 11K stars and 3K forks on GitHub

It can run on either cloud or on-premises clusters

Spin Up an AWS CloudFormation Cluster

Determine the custom parameters that you want:
- Number of Couchbase Server instances
- Number of Sync Gateway instances
- Number of Gateload or Gatling instances
- Instance type for all of the above
- AMI for all of the above (defaults to stock CentOS 7 AMI)

Generate a CloudFormation JSON file via the troposphere Python script, using the parameters given above

Launch CloudFormation via the AWS CLI tool or AWS Management Console
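The generation step can be pictured as follows. This is a minimal sketch rather than the suite's actual script: it uses only the standard `json` module instead of troposphere so that it runs without extra packages, and the function name, logical resource names, tag key, instance type, and AMI ID are all hypothetical placeholders.

```python
import json


def build_template(num_couchbase_servers: int, num_sync_gateways: int,
                   num_load_generators: int, instance_type: str,
                   ami_id: str) -> dict:
    """Build a CloudFormation template with one EC2 instance per node."""
    groups = {
        "CouchbaseServer": num_couchbase_servers,
        "SyncGateway": num_sync_gateways,
        "LoadGenerator": num_load_generators,
    }
    resources = {}
    for prefix, count in groups.items():
        for index in range(count):
            # Logical IDs must be alphanumeric, e.g. "SyncGateway0".
            resources[f"{prefix}{index}"] = {
                "Type": "AWS::EC2::Instance",
                "Properties": {
                    "ImageId": ami_id,
                    "InstanceType": instance_type,
                    # Hypothetical tag so provisioning can group hosts by role.
                    "Tags": [{"Key": "Type", "Value": prefix.lower()}],
                },
            }
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Performance test cluster (illustrative sketch)",
        "Resources": resources,
    }


if __name__ == "__main__":
    # Placeholder AMI ID and instance type; substitute your own.
    template = build_template(3, 4, 4, "m3.large", "ami-00000000")
    print(json.dumps(template, indent=2))
```

The printed JSON is the kind of file that would then be launched via the AWS CLI tool or the AWS Management Console.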

Provision the Cluster via Ansible

ansible-playbook install-go.yml && \
ansible-playbook install-couchbase-server-3.0.3.yml && \
ansible-playbook build-sync-gateway.yml && \
ansible-playbook build-gateload.yml && \
ansible-playbook install-sync-gateway-service.yml && \
ansible-playbook install-splunkforwarder.yml

Sample Ansible Playbook (YAML)

- name: Stop Couchbase Service
  service: name=couchbase-server.service state=stopped
  ignore_errors: yes

- name: Uninstall couchbase server
  shell: rpm -e couchbase-server
  ignore_errors: yes

- name: Remove all couchbase server residue
  shell: rm -rf /opt/couchbase

- name: Download couchbase server
  get_url: url={{ couchbase_server_centos_ee_url }} dest=/tmp

- name: Install Couchbase Server
  yum: name=/tmp/{{ couchbase_server_centos_ee_package }} state=present

- name: Restart Couchbase Service
  service: name=couchbase-server.service state=restarted

Result of the Ansible Provisioning

[Diagram: the provisioned cluster, showing three Couchbase Server nodes, four Sync Gateway nodes, and four Gatling load generators]

Gathering Performance Metrics

Gather performance metrics from several places

Load generator output:
- Gatling built-in HTML reporting
- Gateload JSON report

Sync Gateway statistics endpoint (JSON)

Couchbase Server REST API for statistics (JSON)

Splunk Management Console (optional):
- We are working on a re-usable Splunk dashboard

Contributions Wanted

We would like your help to make the Ansible scripts work on:
- Google Compute Engine
- Joyent Cloud
- Bare-metal clusters (the easiest)

If you develop custom Gatling scenarios for your usage, please contribute them back

Monitoring Sync Gateway

Monitoring Checklist

Response time of root database endpoints, check for errors:
- localhost:4985/_dbname

Memory usage via top

GC times via expvars (or logs if GODEBUG=gctrace=1)

Number of goroutines via expvars

Number of open socket file descriptors:
- netstat

Keep your eye on our Splunk configuration scripts for new additions
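The memory and GC items on the checklist can be read from the `memstats` section that Go's expvar package publishes by default. The sketch below summarizes such a payload. It assumes you have already fetched the expvar JSON from Sync Gateway (believed to be served at `/_expvar` on the admin port, which is an assumption to verify against your version). The function name and sample values are hypothetical.

```python
import json


def summarize_memstats(expvar_payload: str) -> dict:
    """Pull heap size and GC pause figures out of a Go expvar JSON document.

    Field names (HeapAlloc, NumGC, PauseTotalNs) come from Go's
    runtime.MemStats, which expvar exports under the "memstats" key.
    """
    memstats = json.loads(expvar_payload)["memstats"]
    num_gc = memstats["NumGC"]
    pause_total_ms = memstats["PauseTotalNs"] / 1e6
    return {
        "heap_alloc_mb": memstats["HeapAlloc"] / (1024 * 1024),
        "num_gc": num_gc,
        # Average pause per collection; zero if no GC has run yet.
        "avg_gc_pause_ms": pause_total_ms / num_gc if num_gc else 0.0,
    }


if __name__ == "__main__":
    # Made-up sample payload, trimmed to the fields used above.
    sample = json.dumps({
        "memstats": {"HeapAlloc": 52428800, "NumGC": 4, "PauseTotalNs": 8000000}
    })
    print(summarize_memstats(sample))
```

Sampling this periodically and watching for steady growth in heap size or average pause is one way to turn the checklist into an alert.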

Cutting Edge Approaches to Cloud Deployment

Kubernetes

Open-source container orchestration platform from Google

Supports Docker and rkt (Rocket) containers

Runs on-cloud (GKE, AWS, etc.) or on-premises

Commercial support available from:
- CoreOS Tectonic (come see the talk at 1:45 pm!)
- Kismatic

Developer momentum is off the charts

Still under active development

Joyent Triton

Container-centric infrastructure-as-a-service (IaaS)

Supports Docker containers

Hosts are abstracted away

Runs on the Joyent Public Cloud or on-premises

Commercial support available from Joyent

Allows you to leverage the ZFS filesystem and the DTrace profiling tools

Cloud Foundry from Pivotal

Developed to deploy the Cloud Foundry platform-as-a-service (PaaS)

Can provision and deploy software over hundreds of VMs

Runs on infrastructure-as-a-service (IaaS) providers such as: AWS | OpenStack | VMware vSphere | Google Compute Engine | Apache CloudStack

Unifies release engineering, deployment, and lifecycle management

Allows you to easily version, package, and deploy software in a reproducible manner

Links

Performance Test Suite: https://github.com/couchbaselabs/perfcluster-aws

Couchbase on Kubernetes: https://github.com/couchbase/kubernetes

Couchbase on Joyent Triton: http://tleyden.github.io/blog/2015/05/05/running-couchbase-server-under-docker-on-joyent/

Thank you.