LinkedIn: Going all in: from a single use case to many – Couchbase Connect 2016
Going all in: From single use-case to many
Michael Kehoe, Staff Site Reliability Engineer
Overview
• The LinkedIn Story
• Couchbase Use-Cases
• Development & Operations
• Conclusions
• Questions
$ whoami
Michael Kehoe
• Staff Site Reliability Engineer (SRE)
• Production-SRE team
• Funny accent = Australian
• Contact
  • linkedin.com/in/michaelkkehoe
  • @matrixtek
$ whatis SRE
Michael Kehoe
• Site Reliability Engineering
• Operations for the production application environment
• Responsibilities include
  • Architecture design
  • Capacity planning
  • Operations
  • Tooling
$ whatis CBVT
Michael Kehoe
• Couchbase Virtual Team
  • ~10 SREs
  • 2 Software Engineers
  • Sponsored by SRE Director
  • Members spend 5-90% of their time supporting Couchbase
  • Encourage as many people as possible to contribute
• What do we do?
  • Operational work on Couchbase clusters
  • Evangelize the use of Couchbase within LinkedIn
  • Develop tools for the Couchbase Ecosystem
The LinkedIn Story
• Founded in 2002, LinkedIn has grown into the world’s largest professional social media network
• 30 offices in 24 countries
• Available in 24 languages
• More than 450 million members worldwide
The LinkedIn Story
• Growth in Products
  • Profiles
  • Groups
  • Recruiter
  • Sales Navigator
• Growth in Internet Traffic
  • Billions of page-hits per day
  • 100k+ QPS to production services
In-Memory Storage Needs
The LinkedIn Story
• LinkedIn started as an Oracle shop
• Hyper-growth = Scaling challenges
  • Read-Scaling becomes important
• Applicable use-cases
  • Simple cache store
    • Pre-warmed
    • Read through
  • Potential for Source of Truth (SoT) store
Enter Couchbase
The LinkedIn Story
• Until 2012, we used only Memcached as a non-SoT in-memory store
• Drawbacks
  • Difficult to pre-warm
  • No partitioning/sharding (had to write our own)
  • Cold-cache restarts
  • Difficult to move data across hosts, clusters, and data centers
Enter Couchbase
The LinkedIn Story
• Evaluated replacement systems for Memcached: Mongo, Redis, and others
• Couchbase had distinct advantages:
  • Simple replacement for Memcached
  • Built-in replication and cluster expansion
  • Automatic partitioning
  • Low latency
  • Async writes to disk
  • Building tooling is simple
Enter Couchbase
The LinkedIn Story
• Today we run Couchbase in our Corporate, Staging and Production environments
• Production/Staging statistics:
  • 148 buckets
  • 2821 hosts
  • 10M+ QPS
• Largest Clusters:
  • By Hosts: 72 Hosts
  • By Documents: 1.4B Documents
  • By QPS: 2.5M QPS
Summary
Use-Cases
Today’s use-cases:
• Simple read-through cache
• Ephemeral Counter Store
• Temporary de-duping store
• SoT data-store for internal tooling
Simple read-through cache
Use-Cases
• Drop-in replacement for Memcached
• Read-scaling
• Protecting backend database from large amounts of traffic
• E.g. 3rd party ingestion credential cache
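The read-through pattern named on this slide can be sketched as follows. This is a minimal illustration, not LinkedIn's implementation: `ReadThroughCache`, its `loader` callback, and the TTL value are all hypothetical, and a plain dictionary stands in for the Couchbase bucket so the example runs on its own.

```python
import time
from typing import Callable, Dict, Tuple


class ReadThroughCache:
    """Dict-backed stand-in for a cache bucket (assumption: not a Couchbase client)."""

    def __init__(self, loader: Callable[[str], str], ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader      # hypothetical backend lookup, e.g. a database query
        self._ttl = ttl_seconds    # how long a cached value is considered fresh
        self._clock = clock        # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[str, float]] = {}
        self.backend_reads = 0     # counts how often the backend was actually hit

    def get(self, key: str) -> str:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]        # cache hit: the backend is not touched
        value = self._loader(key)  # miss or expired: read through to the backend
        self.backend_reads += 1
        self._entries[key] = (value, now + self._ttl)
        return value
```

Repeated reads of the same key within the TTL are served from the cache, which is what shields the backend database from read traffic.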
Counter Store
Use-Cases
• In certain places, we simply need to increment counters from multiple systems and store them
• E.g. Anti-abuse/Anti-scraping systems (Fuse)
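As a rough sketch of the counter-store idea, the class below keeps per-key counters that reset after a time window. It is an in-memory stand-in written for illustration only: `EphemeralCounters`, `is_over_limit`, the window length, and the limit are assumptions, and nothing here reflects how Fuse works internally. In a shared store, the increment would be an atomic counter operation on the server rather than a local lock.

```python
import threading
import time
from typing import Callable, Dict, Tuple


class EphemeralCounters:
    """In-memory stand-in for a shared counter store (hypothetical, for illustration)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._lock = threading.Lock()  # emulates the atomicity a server-side counter gives
        self._counters: Dict[str, Tuple[int, float]] = {}

    def increment(self, key: str, delta: int = 1) -> int:
        """Add delta to the counter for key and return the new value."""
        now = self._clock()
        with self._lock:
            count, expires_at = self._counters.get(key, (0, 0.0))
            if expires_at <= now:
                # Missing or expired counter: start a fresh window.
                count, expires_at = 0, now + self._ttl
            count += delta
            self._counters[key] = (count, expires_at)
            return count


def is_over_limit(counters: EphemeralCounters, client_id: str, limit: int) -> bool:
    """Count one request for client_id and report whether it exceeded the limit."""
    return counters.increment(client_id) > limit
```

Because every caller increments the same key, several application hosts can contribute to one count without coordinating with each other.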
Temporary De-duping store
Use-Cases
• Need to de-dup data over a large application cluster
• E.g. Email systems – Ensure we don’t send the same email twice
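One way to picture the de-duping store is as an "insert if absent, with expiry" check made before each send. The sketch below is an assumption-laden illustration: `DedupStore`, `send_once`, and the key format are hypothetical names, and a local dictionary replaces the shared store that a real application cluster would need.

```python
import threading
import time
from typing import Callable, Dict


class DedupStore:
    """In-memory stand-in for a shared de-duping store (hypothetical)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._seen: Dict[str, float] = {}  # key -> expiry time

    def claim(self, key: str) -> bool:
        """Return True only for the first caller to claim key within the TTL."""
        now = self._clock()
        with self._lock:
            expires_at = self._seen.get(key)
            if expires_at is not None and expires_at > now:
                return False               # someone already claimed this key
            self._seen[key] = now + self._ttl
            return True


def send_once(store: DedupStore, recipient: str, template_id: str,
              send: Callable[[str, str], None]) -> bool:
    """Send the email only if this (recipient, template) pair was not claimed yet."""
    if not store.claim(f"{recipient}:{template_id}"):
        return False
    send(recipient, template_id)
    return True
```

The expiry keeps the store temporary: once the window passes, the marker disappears and the same key can be claimed again.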
SoT Store for Internal Tools
Use-Cases
• For non-member-facing tools, we use Couchbase as a SoT store.
• Benefits:
  • Schema-less
  • Short setup time
  • Couchbase Python Client works easily in our environment
  • Use views for simple map-reduce
• Example Uses:
  • Nurse – Autoremediation system
  • TrafficshiftIn – Global traffic automation system
  • Availability – Storing and tracking LinkedIn availability data
Couchbase Ecosystem
The LinkedIn Story
Developing around Couchbase
• Java – li-couchbase-client
  • Wrapper around standard Java Couchbase Client
  • Custom metrics emission
  • Using Spring interface
  • Storing data as Java serialized objects
• Python – couchbase-python-client
Operational Tooling
To use Couchbase efficiently as SREs, we need the following:
• Provisioning
• Installation
• Monitoring & Alerting
• Infrastructure Visibility
Provisioning
Operational Tooling
• Provisioning Flow
  • Seek estimated usage statistics for cluster
    • Size of data to be stored
    • QPS
    • Redundancy Needs
  • Calculate cluster sizing
    • Currently done with a template
    • Couchbase has a simple calculator available online: http://docs.couchbase.com/prebuilt/calculators/sizing-calc.html
  • Request hardware for cluster(s)
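The "Calculate cluster sizing" step can be illustrated with a simplified RAM-based estimate. This is a sketch under stated assumptions, not LinkedIn's template or the online calculator: the per-document metadata overhead, headroom, and high-water-mark defaults are assumed values to be replaced with figures for the Couchbase version in use, and the estimate ignores QPS, disk, and CPU entirely.

```python
import math
from dataclasses import dataclass


@dataclass
class SizingInput:
    documents: int                    # expected number of documents
    avg_key_bytes: int                # average document key size
    avg_value_bytes: int              # average document value size
    replicas: int                     # redundancy need: replica copies per document
    working_set_fraction: float       # share of the data that should stay in RAM
    per_node_ram_quota_bytes: int     # RAM given to the bucket on each host
    metadata_bytes_per_doc: int = 56  # assumed per-document overhead
    headroom: float = 0.25            # assumed extra room for spikes and rebalance
    high_water_mark: float = 0.85     # assumed usable fraction of the quota


def estimate_nodes(s: SizingInput) -> int:
    """Estimate the number of hosts needed to hold the working set in RAM."""
    copies = 1 + s.replicas
    metadata = s.documents * (s.metadata_bytes_per_doc + s.avg_key_bytes) * copies
    dataset = s.documents * s.avg_value_bytes * copies
    working_set = dataset * s.working_set_fraction
    required_ram = (metadata + working_set) * (1 + s.headroom) / s.high_water_mark
    nodes = math.ceil(required_ram / s.per_node_ram_quota_bytes)
    # Each replica must live on a different host than the active copy.
    return max(copies, nodes)
```

The inputs map directly onto the usage statistics gathered in the first step (data size and redundancy needs); a QPS-based check would have to be layered on top.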
Installation
Operational Tooling
• Process
  • Enter cluster metadata into our management system (Range)
  • Use Salt States to install and configure cluster
  • See Issa Fattah’s post for more information:
    • https://engineering.linkedin.com/blog/2016/04/leveraging-saltstack-to-scale-couchbase
• Benefits
  • Ability to perform ‘state enforcement’
  • Using Salt Pillars to encrypt cluster/bucket passwords end-to-end
Monitoring & Alerting
Operational Tooling
• We run a daemon on each Couchbase Server that collects metrics every minute via Couchbase APIs
• Use cluster metadata from Range to build dashboards with our own system, InGraphs
• See: ‘Monitoring production deployments’: 4pm - Great America 1
Management
Operational Tooling
• We want to see a world-view of all the clusters we run
• Having bucket-, cluster-, and server-level statistics is useful
• Having a global view of who owns and operates each cluster/bucket is useful
Conclusions
• Couchbase was a natural fit for our existing infrastructure
• Building an ecosystem around Couchbase was important to us and has helped Couchbase be successful at LinkedIn
• Expanding use of Couchbase
  • In the past year we’ve grown the number of buckets by over 50%
  • Starting to use Views in production
  • Moving Couchbase into LinkedIn standard deployment infrastructure
Thank You
Questions?
©2014 LinkedIn Corporation. All Rights Reserved.