Netflix at-disney-09-26-2014
![Page 1: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/1.jpg)
Cloud Data Persistence @
Monal Daxini Senior Software Engineer
Cloud Database Engineering
@monaldax
50m+ Subscribers
![Page 2: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/2.jpg)
Summary
Netflix OSS
Microservices
@Netflix Season 1, 2
Cassandra @ Netflix
Cassandra Best Practices
Coming Soon…
![Page 3: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/3.jpg)
Start with Zero To Cloud With @NetflixOSS
https://github.com/Netflix-Skunkworks/zerotocloud
![Page 4: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/4.jpg)
Karyon/Governator
Hystrix
Ribbon/Eureka
Curator
EVCache
Astyanax
Turbine
Servo
Blitz4J
Function OSS Library
RxJava
Archaius
![Page 5: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/5.jpg)
Building Apps and AMIs
ASG /Cluster
WAR
ASG/Cluster
App AMI
Deploy
Launch Instances
@stonse
![Page 6: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/6.jpg)
![Page 7: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/7.jpg)
NetflixOSS
Suro Data Pipeline
Eureka
Zuul
Edda
![Page 8: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/8.jpg)
Micro Services
Micro services DOES NOT mean better Availability
Need Fault Tolerant Architecture
Service Dependency View
Distributed Tracing (Dapper inspired)
![Page 9: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/9.jpg)
Micro Services
1 response - 1 monolithic service 99.99% uptime
1 response - 30 micro services each 99.99% uptime
overall ~99.7% uptime (0.9999^30; roughly 2 hrs downtime per month)
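The compounding behind this slide can be checked with a few lines of Java. This is only a sketch: it assumes the 30 services fail independently and that a response needs all of them, and the class and method names are made up for illustration. Under those assumptions the product 0.9999^30 comes out at about 99.7%.

```java
// Sketch: availability of one response that needs every one of N services.
// Assumes independent failures; names here are illustrative only.
final class CompoundUptime {

    /** Availability of a call chain where all n dependencies must be up. */
    static double compound(double perServiceUptime, int n) {
        return Math.pow(perServiceUptime, n);
    }

    /** Expected downtime in hours over a period of the given length. */
    static double downtimeHours(double uptime, double periodHours) {
        return (1.0 - uptime) * periodHours;
    }

    public static void main(String[] args) {
        double overall = compound(0.9999, 30);
        System.out.printf("overall uptime: %.2f%%%n", overall * 100);
        System.out.printf("downtime per 30-day month: %.1f h%n",
                downtimeHours(overall, 30 * 24));
    }
}
```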
![Page 10: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/10.jpg)
Micro Services
Actual Scale
~2 Billion Edge Requests per day
Results in ~20 Billion Fan out requests to
~100 different MicroServices
![Page 11: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/11.jpg)
Fault Tolerant Arch
Dependency Isolation
Aggressive timeouts
Circuit breakers
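A minimal circuit-breaker sketch follows to make the idea concrete. It is not the Hystrix API: `SimpleCircuitBreaker` and its parameters are hypothetical, timeouts and thread-pool isolation are left out, and the clock is passed in explicitly so the behaviour is easy to follow. After a number of consecutive failures the breaker opens and serves the fallback without touching the dependency; once the open window has passed, the next call is allowed through as a trial.

```java
import java.util.function.Supplier;

// Sketch of a circuit breaker (hypothetical class, not Hystrix).
// Holding the lock during the call serializes callers; a real
// implementation would avoid that.
final class SimpleCircuitBreaker {
    private final int failureThreshold;
    private final long openMillis;
    private int consecutiveFailures = 0;
    private long openedAt = -1; // -1 means the circuit is closed

    SimpleCircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    synchronized boolean isOpen(long nowMillis) {
        return openedAt >= 0 && nowMillis - openedAt < openMillis;
    }

    synchronized <T> T call(Supplier<T> primary, Supplier<T> fallback, long nowMillis) {
        if (isOpen(nowMillis)) {
            return fallback.get(); // short-circuit: do not touch the dependency
        }
        try {
            T result = primary.get();
            consecutiveFailures = 0; // success closes the circuit again
            openedAt = -1;
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                openedAt = nowMillis; // trip (or re-trip) the breaker
            }
            return fallback.get();
        }
    }
}
```

The fallback keeps the caller responsive while the dependency is unhealthy, which is the point of isolating dependencies in a fan-out architecture.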
![Page 12: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/12.jpg)
MicroServices Container
Synchronous: Tomcat, ThreadPool (1 thread per request)
Asynchronous: RxNetty (UDP, TCP, WebSockets, SSE), EventLoops
![Page 13: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/13.jpg)
MicroServices Container
Rx
ease async programming
avoid callback hell
Netty to leverage EventLoop
Rx + Netty = RxNetty
![Page 14: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/14.jpg)
* Courtesy Brendan Gregg
![Page 15: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/15.jpg)
AWS Maint
![Page 16: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/16.jpg)
@Netflix Season-1
Media Cloud Engineering
![Page 17: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/17.jpg)
Encoding PaaS
Master - Worker Pattern
Decoupled by Priority Queues with message lease
State in Cassandra
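The "priority queue with message lease" idea can be sketched in memory as below. This is an illustration under assumptions, not the encoding system's implementation: `LeasedPriorityQueue` is a hypothetical class, state lives in a map rather than in Cassandra or SQS, and time is passed in explicitly. A worker leases the highest-priority available message; if it does not acknowledge before the lease expires, the message becomes visible to other workers again.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Sketch: priority queue where a message is leased to a worker for a
// limited time and reappears if the worker never acknowledges it.
final class LeasedPriorityQueue {
    private static final class Message {
        final String id;
        final int priority;
        long leaseExpiresAt = 0L; // 0 means "not leased"

        Message(String id, int priority) {
            this.id = id;
            this.priority = priority;
        }
    }

    // LinkedHashMap keeps insertion order, so ties go to the oldest message.
    private final Map<String, Message> messages = new LinkedHashMap<>();

    synchronized void enqueue(String id, int priority) {
        messages.put(id, new Message(id, priority));
    }

    /** Leases the highest-priority message that has no active lease. */
    synchronized Optional<String> lease(long nowMillis, long leaseMillis) {
        Message best = null;
        for (Message m : messages.values()) {
            boolean leased = m.leaseExpiresAt > nowMillis;
            if (!leased && (best == null || m.priority > best.priority)) {
                best = m;
            }
        }
        if (best == null) {
            return Optional.empty();
        }
        best.leaseExpiresAt = nowMillis + leaseMillis;
        return Optional.of(best.id);
    }

    /** Called by the worker when the task is done; removes the message. */
    synchronized boolean ack(String id) {
        return messages.remove(id) != null;
    }
}
```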
![Page 18: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/18.jpg)
Oracle >> Cassandra
Data Model & Lack of ACID
Client Cluster Symbiosis
Embrace Eventual Consistency
Data Migration
Shadow Write / Reads
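One way to picture shadow writes and reads during a migration is a wrapper that keeps the old store authoritative while exercising the new one on every call. The sketch below is an assumption about the general technique, not the actual migration code: `KeyValueStore`, `InMemoryStore`, and `ShadowingStore` are hypothetical names, and plain maps stand in for Oracle and Cassandra.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical minimal store interface used only for this sketch.
interface KeyValueStore {
    void put(String key, String value);
    String get(String key);
}

final class InMemoryStore implements KeyValueStore {
    private final Map<String, String> data = new HashMap<>();
    public void put(String key, String value) { data.put(key, value); }
    public String get(String key) { return data.get(key); }
}

// Primary (legacy) store stays authoritative; the shadow (new) store
// receives the same writes and its reads are only compared, never returned.
final class ShadowingStore implements KeyValueStore {
    private final KeyValueStore primary;
    private final KeyValueStore shadow;
    private int mismatches = 0;
    private int shadowErrors = 0;

    ShadowingStore(KeyValueStore primary, KeyValueStore shadow) {
        this.primary = primary;
        this.shadow = shadow;
    }

    public void put(String key, String value) {
        primary.put(key, value);
        try {
            shadow.put(key, value); // shadow write
        } catch (RuntimeException e) {
            shadowErrors++; // a shadow failure must not fail the caller
        }
    }

    public String get(String key) {
        String expected = primary.get(key);
        try {
            if (!Objects.equals(expected, shadow.get(key))) {
                mismatches++; // shadow read disagrees with the primary
            }
        } catch (RuntimeException e) {
            shadowErrors++;
        }
        return expected;
    }

    int mismatches() { return mismatches; }
    int shadowErrors() { return shadowErrors; }
}
```

Tracking mismatches and shadow errors over time gives a signal for when the new store is ready to take over reads.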
![Page 19: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/19.jpg)
Object To Cassandra Mapping

    /**
     * @author mdaxini
     */
    @CColumnFamily(name = "Sequence", shared = true)
    @Audited(columnFamily = "sequence_audit")
    public class SequenceBean {

        @CId(name = "id")
        private String sequenceName;

        @CColumn(name = "sequenceValue")
        private Long sequenceValue;

        @CColumn(name = "updated")
        @TemporalAutoUpdate
        @JsonProperty("updated")
        private Date updated;
![Page 20: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/20.jpg)
Object To Cassandra Mapping

    @JsonAutoDetect(JsonMethod.NONE)
    @JsonIgnoreProperties(ignoreUnknown = true)
    @CColumnFamily(name = "task")
    public class Job {

        @CId
        private JobKey jobKey;

    public final class TaskKey {

        @CId(order = 0)
        private Long packageId;

        @CId(order = 1)
        private UUID taskId;
![Page 21: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/21.jpg)
Priority-Scheduling Queue
Evolution:
One SQS Queue per priority range
Store and forward (rate-adaptive) to SQS Queue
Rule based priority, leases, RDBMS based with prefetch
![Page 22: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/22.jpg)
Encoding PaaS Farm
One command deployment and upgrade
Self Serve
Homogeneous View of Windows and Linux
Pioneered Ubuntu - production since 2011
![Page 23: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/23.jpg)
Innovate Fast Build for Pragmatic Scale
Innovate for Business Standardize Later*
![Page 24: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/24.jpg)
@Netflix Season-2
Cloud Database Engineering
[CDE]
![Page 25: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/25.jpg)
Platform Big Data/Caching & Services
Cassandra Astyanax Priam
CassJMeter Hadoop Platform As a Service
Genie
Lipstick
Adapted from a slide by @stonse
Caching
Inviso*
![Page 26: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/26.jpg)
CDE Charter
Spark*
Solr*
* Under Construction
Dynomite*
Redis
ElasticSearch
Cassandra (1.2.x >> 2.0.x)
Priam
Astyanax
Skynet*
![Page 27: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/27.jpg)
All OLTP Data in Cassandra
Almost!
![Page 28: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/28.jpg)
Cassandra Prod Footprint
90+ Clusters
2700+ Nodes
4 Datacenters (Amazon Regions)
>1 Trillion operations per day
![Page 29: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/29.jpg)
Cassandra Best Practices* Usage
*Practices I have found useful, YMMV
![Page 30: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/30.jpg)
Use RandomPartitioner
Have at least 3 replicas (quorum)
Same number of replicas - simpler operations
create keyspace oracle with placement_strategy = 'NetworkTopologyStrategy' and strategy_options = {us-west-2 : 3, us-east : 3}
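The "at least 3 replicas (quorum)" advice follows from simple arithmetic, sketched below with illustrative names. A quorum is a majority of replicas, so with 3 replicas a quorum is 2 and one replica can be down. A read is guaranteed to overlap the latest write when the replicas read plus the replicas written exceed the replication factor. Note that with the keyspace above (3 replicas in each of two regions), QUORUM counts all 6 replicas, while LOCAL_QUORUM counts only the 3 in the local region.

```java
// Sketch: quorum arithmetic for a given replication factor (RF).
final class QuorumMath {

    /** Majority of replicas: floor(RF / 2) + 1. */
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    /** Replicas that can be down while a quorum is still reachable. */
    static int tolerableFailures(int replicationFactor) {
        return replicationFactor - quorum(replicationFactor);
    }

    /** True when every read set must intersect every write set. */
    static boolean readsSeeLatestWrite(int readReplicas, int writeReplicas, int rf) {
        return readReplicas + writeReplicas > rf;
    }

    public static void main(String[] args) {
        System.out.println("LOCAL_QUORUM with RF 3: " + quorum(3));
        System.out.println("QUORUM across 3 + 3 replicas: " + quorum(6));
        System.out.println("quorum/quorum overlaps: "
                + readsSeeLatestWrite(quorum(3), quorum(3), 3));
        System.out.println("ONE/ONE overlaps: " + readsSeeLatestWrite(1, 1, 3));
    }
}
```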
![Page 31: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/31.jpg)
Move to CQL3 from thrift
Codifies best practices
Leverage Collections (albeit restricted cardinality)
Use Key Caching
As a default turn off Row Caching
Rename all composite columns in one ALTER TABLE statement.
![Page 32: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/32.jpg)
Watch length of column names
Use “COMPACT STORAGE” wisely
Cannot use collections - depends on CompositeType
Non-compact storage uses 2 extra bytes per internal cell, but is preferred.
* Image courtesy Datastax blog
![Page 33: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/33.jpg)
* Courtesy Datastax blog

    CREATE TABLE events (
        key text,
        column1 int,
        column2 int,
        value text,
        PRIMARY KEY (key, column1, column2)
    ) WITH COMPACT STORAGE;

    cqlsh:test> SELECT * FROM events;

     key    | column1 | column2 | value
    --------+---------+---------+---------
     tbomba |       4 |     120 | event 1
     tbomba |       4 |    2500 | event 2
     tbomba |       9 |     521 | event 3
     tbomba |      10 |    3525 | event 4
![Page 34: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/34.jpg)
Prefer CL_ONE
data replication within 500ms across the region
If using quorum reads and writes, set read_repair_chance to 0.0 or a very low value.
Make sure repairs are run often
Eventual Consistency does not mean hopeful consistency
![Page 35: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/35.jpg)
Avoid secondary indexes for high cardinality values
In most cases we set gc_grace_seconds to 10 days
Avoid hot rows
detect using node level latency metrics
![Page 36: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/36.jpg)
Avoid heavy rows
Avoid overly wide rows (keep under 100K columns, assuming small columns)
Don’t use C* as a Queue
Tombstones will bite you
![Page 37: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/37.jpg)
SizeTieredCompactionStrategy
write heavy workload
non-predictable I/O, 2x disk space
LeveledCompactionStrategy
read-heavy workloads
predictable I/O, roughly 2x the I/O of STCS
![Page 38: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/38.jpg)
LeveledCompactionStrategy
SizeTieredCompactionStrategy
* Image courtesy Datastax blog
![Page 39: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/39.jpg)
Guesstimate and then validate sstable_size_in_mb
Hint: based on write rate and size
160 MB for LeveledCompactionStrategy
SizeTieredCompactionStrategy - C* default 50 MB
![Page 40: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/40.jpg)
Atomic batches
no isolation, only atomic for row within partition key
no automatic rollback
Lightweight transactions
![Page 41: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/41.jpg)
Cassandra Best Practices Operations
*Practices we have found useful, YMMV
![Page 42: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/42.jpg)
If your C* cluster footprint is significant
must have good automation
at least a C* semi-expert
Use cstar_perf to validate your initial clusters
We don’t use vnodes
On each node, size the disk for 2x the expected data; use ephemeral SSDs, not EBS
![Page 43: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/43.jpg)
Monitoring and alerting
read write latency - co-ordinator & node level
Compaction stats
Heap Usage
Network
Max & Min Row sizes
![Page 44: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/44.jpg)
Fixed tokens, double the cluster to expand
Important to size the cluster for app needs initially
benefits of fixed tokens outweigh those of vnodes
Take backups of all the nodes
to allow for eventual consistency on restores
Note: by default the commitlog fsyncs only every 10 seconds
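Why doubling works well with fixed tokens can be seen from the token arithmetic. The sketch below assumes RandomPartitioner's token space of 0 to 2^127 and evenly spaced tokens; `FixedTokens` is an illustrative helper, not part of Priam or Cassandra. When the node count doubles, every existing node keeps its token and each new node lands exactly in the middle of an existing range, so no existing node has to move.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Sketch: evenly spaced fixed tokens for RandomPartitioner.
final class FixedTokens {
    // Size of the RandomPartitioner token space (tokens run from 0 up to 2^127).
    static final BigInteger RING = BigInteger.ONE.shiftLeft(127);

    /** Token for node i of n: i * 2^127 / n. */
    static List<BigInteger> evenTokens(int nodes) {
        List<BigInteger> tokens = new ArrayList<>();
        for (int i = 0; i < nodes; i++) {
            tokens.add(RING.multiply(BigInteger.valueOf(i))
                    .divide(BigInteger.valueOf(nodes)));
        }
        return tokens;
    }

    public static void main(String[] args) {
        List<BigInteger> before = evenTokens(4);
        List<BigInteger> after = evenTokens(8);
        for (int i = 0; i < before.size(); i++) {
            // Old node i keeps its token; it is node 2*i in the doubled ring.
            System.out.println(before.get(i) + " kept: "
                    + before.get(i).equals(after.get(2 * i)));
        }
    }
}
```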
![Page 45: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/45.jpg)
Run repairs before GCGraceSeconds expires
Throttle compactions and repairs
Repairs can take a long time
repair one primary range and one keyspace at a time to limit performance impact.
![Page 46: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/46.jpg)
Schema disagreements - pick the nodes with the older date and restart them one at a time.
nodetool resetlocalschema is not persistent on 1.2
Recycle nodes in AWS to prevent staleness
Expanding to new region
Launch nodes in new region without bootstrapping
Change Keyspace replication
Run nodetool rebuild on nodes in new region.
![Page 47: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/47.jpg)
More Info
http://techblog.netflix.com/
http://netflix.github.io/
http://slideshare.net/netflix
https://www.youtube.com/user/NetflixOpenSource
https://www.youtube.com/user/NetflixIR $$$
![Page 48: Netflix at-disney-09-26-2014](https://reader033.fdocuments.us/reader033/viewer/2022061205/54815667b4af9faa158b5f9c/html5/thumbnails/48.jpg)
??