Cassandra on EPAM Cloud
Database deployed in multiple locations
AGENDA
• Typical issues with RDBMS
• Solutions with Cassandra
• Cassandra on EPAM Cloud
ABOUT ME
Oresztész Margaritisz
• Java CC member since 2015
• Distributed / Cloud Computing
• NoSQL
• Agile
• DevOps
@gitaroktato gitaroktato https://www.linkedin.com/in/oreszteszgitaroktato
TYPICAL ISSUES WITH RDBMS
• EPAM needs global delivery of services
• 25 countries
• 4 continents
• 19,600 employees
• Data storage with traditional RDBMS can be cumbersome
• Configuration issues
• Migrating data between locations can be hard
• A master-slave configuration at the local site trades off performance
ARE WE SOLVING THE SAME PROBLEM?
LATENCY AROUND THE GLOBE
LATENCY FOR 100 REQUESTS
[Chart: 1 request takes 10 ms; 100 requests take 1 second]
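The arithmetic behind this slide can be reproduced in a few lines. This is a minimal sketch, assuming the 100 requests are issued one after another and each costs a single 10 ms round trip; the function name `total_latency_ms` is hypothetical and not part of the original deck.

```python
def total_latency_ms(requests: int, round_trip_ms: int) -> int:
    """Wall-clock time for requests issued sequentially (assumption: no pipelining)."""
    return requests * round_trip_ms


print(total_latency_ms(1, 10))    # 10
print(total_latency_ms(100, 10))  # 1000, i.e. 1 second
```

Under this assumption the total grows linearly with the number of round trips, so a higher round-trip time between regions multiplies directly into user-visible delay.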
BANDWIDTH VS. LATENCY
TYPICAL MULTI-MASTER MYSQL DEPLOYMENT
[Diagram: Master 1 with Slaves 1.1, 1.2, 1.3; Master 2 with Slaves 2.1, 2.2, 2.3; Master-Master Replication between Master 1 and Master 2]
WE NEED NOSQL?
Dude, you need NoSQL!
ACID vs. BASE
WHY CASSANDRA?
CLIENT CONNECTIVITY
[Diagram: two clients, each with R/W (read/write) access]
MULTI-REGION DEPLOYMENT
[Diagram: Tokyo, Minsk, Client]
TUNABLE CONSISTENCY
Client
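The trade-off behind tunable consistency can be illustrated with the usual quorum arithmetic. This is a sketch, not code from the deck: the helper names are hypothetical, and the replication factor of 2 is taken from the per-datacenter setting used in the keyspace later in this deck.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that make up a quorum: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def read_overlaps_write(read_replicas: int, write_replicas: int,
                        replication_factor: int) -> bool:
    """True when every read set must share at least one replica with every write set."""
    return read_replicas + write_replicas > replication_factor


rf = 2  # replicas per datacenter, as in the deck's keyspace
print(quorum(rf))                                       # 2
print(read_overlaps_write(quorum(rf), quorum(rf), rf))  # True
print(read_overlaps_write(1, 1, rf))                    # False
```

With a replication factor of 2, a quorum needs both replicas, so quorum reads and writes always overlap, while reading and writing a single replica each gives no such guarantee.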
RAPID READ PROTECTION
Client
CASSANDRA ON EPAM CLOUD
INITIATIVE BY RND TEAM @ JAVACC
WE NEED YOU!
CONFIGURATION GUIDE
AWS-AP-NORTHEAST, EPAM-BY1
cassandra-rackdc.properties (AWS-AP-NORTHEAST)
    dc=AWS-AP-NORTHEAST
    rack=rack1
    prefer_local=true
Public IP
cassandra.yaml
    endpoint_snitch: GossipingPropertyFileSnitch
cassandra-rackdc.properties (EPAM-BY1)
    dc=EPAM-BY1
    rack=rack1
    prefer_local=true
cassandra.yaml
    broadcast_address: <PUBLIC_IP>
cassandra.yaml
    seed_provider:
        - class_name: org.apache.cassandra.locator.SimpleSeedProvider
          parameters:
              - seeds: <AWS_SEED_PUBLIC_IP>
cassandra.yaml
    ...
              - seeds: <BY1_SEED_PUBLIC_IP>
BOOT SEQUENCE
AWS-AP-NORTHEAST, EPAM-BY1
CREATE KEYSPACE replicated WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'EPAM-BY1' : 2, 'AWS-AP-NORTHEAST' : 2 };
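The statement above lists one replica count per datacenter name. As a hedged illustration, the same statement can be generated from a mapping of datacenter names to replica counts; the helper `create_keyspace_cql` is hypothetical, does no escaping or validation, and only builds the string without talking to a cluster.

```python
from typing import Mapping


def create_keyspace_cql(name: str, replicas_per_dc: Mapping[str, int]) -> str:
    """Build a CREATE KEYSPACE statement using NetworkTopologyStrategy.

    Assumption: names are trusted; no quoting or validation is performed.
    """
    options = ", ".join(f"'{dc}' : {rf}" for dc, rf in replicas_per_dc.items())
    return (
        f"CREATE KEYSPACE {name} WITH REPLICATION = "
        f"{{ 'class' : 'NetworkTopologyStrategy', {options} }};"
    )


print(create_keyspace_cql("replicated", {"EPAM-BY1": 2, "AWS-AP-NORTHEAST": 2}))
```

The datacenter names in the mapping have to match the `dc=` values in each node's cassandra-rackdc.properties, otherwise the replicas are not placed where intended.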
CAPACITY PLANNING
• Replication latency between regions
• Transactions per second for the whole cluster
• 3 MEDIUM instances in EPAM-BY1
• 3 MEDIUM instances in AWS-AP-NORTHEAST
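For a rough sizing estimate, the share of the data set each node stores follows from the node count and the replication factor. This is a back-of-the-envelope sketch, assuming tokens are distributed evenly and using the replication factor of 2 per datacenter from the keyspace in this deck; the function name is hypothetical.

```python
from fractions import Fraction


def data_share_per_node(nodes_in_dc: int, replication_factor: int) -> Fraction:
    """Fraction of the full data set held by one node in a datacenter.

    Assumption: even token distribution; a datacenter cannot hold more
    copies than it has nodes, so the replica count is capped.
    """
    copies = min(replication_factor, nodes_in_dc)
    return Fraction(copies, nodes_in_dc)


print(data_share_per_node(3, 2))  # 2/3
```

With 3 nodes and 2 replicas in each datacenter, every node carries about two thirds of the data, and the cluster as a whole stores 4 copies of each row.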
REPLICATION LATENCY
[Diagram: one client WRITE, one client READ, NTP]
[Chart: replication latency, axis 0 to 400; rows: TCP Ping, DC1 -> DC2, Single Client; series: Average, 99%, Max]
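The three series in the chart (average, 99%, max) can be derived from raw latency samples as sketched below. The sample values are made up for illustration and are not measurements from the deck; the unit is assumed to be milliseconds, and the 99% figure uses the nearest-rank percentile method, which may differ from what the authors used.

```python
import math
from statistics import mean


def percentile(samples, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def summarize(samples):
    """Average, 99th percentile and maximum of a list of latency samples."""
    return {
        "average": mean(samples),
        "p99": percentile(samples, 99),
        "max": max(samples),
    }


# Hypothetical samples (assumed milliseconds), not data from the presentation.
samples_ms = [120, 130, 125, 140, 390, 128, 133, 127, 135, 131]
print(summarize(samples_ms))
```

Reporting the 99th percentile and the maximum alongside the average matters here because a few slow cross-region replications can hide behind a healthy-looking mean.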
CLUSTER THROUGHPUT
[Chart: throughput, axis 0 to 30000; rows: LOCAL_QUORUM with Replication: 2, LOCAL_ONE with Replication: 2, LOCAL_ONE with Replication: 1; series: node #1, node #2, node #3, SUM]
SUMMARY
• Configuration is easy
• Migrating data between locations is built-in
• Load is spread evenly
• Network failures are handled by default
UP NEXT
• Real migration use-case
• Performance tuning
LOOKING FOR A REAL MIGRATION USE-CASE
KB Page
https://kb.epam.com/display/EJAVACC/Multi+datacenter+setup+with+NoSQL
Dzmitry Skaredau
Oresztesz Margaritisz
References
EPAM Project Space
https://kb.epam.com/display/EJAVACC/Multi-Region+Cassandra+set-up+in+EPAM+Cloud
Latency: The New Web Performance Bottleneck
https://www.igvita.com/2012/07/19/latency-the-new-web-performance-bottleneck/
More Bandwidth Doesn’t Matter
https://docs.google.com/a/chromium.org/viewer?a=v&pid=sites&srcid=Y2hyb21pdW0ub3JnfGRldnxneDoxMzcyOWI1N2I4YzI3NzE2
References pt. 2
Cassandra’s Rapid Read Protection
http://www.datastax.com/dev/blog/rapid-read-protection-in-cassandra-2-0-2