Use Your MySQL Knowledge to Become an Instant Cassandra Guru
Migration from MySQL to Cassandra for millions of active users
![Page 1: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/1.jpg)
How did we migrate data for millions of live users from MySQL to Cassandra
Andrey Panasyuk, @defascat
![Page 2: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/2.jpg)
Plan
![Page 3: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/3.jpg)
Use case
Challenges
   a. Individual
   b. Corporate
![Page 4: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/4.jpg)
Servers
● Thousands of servers in prod
● Java 8
● Tomcat 7
● Spring 3
● Hibernate 3
![Page 5: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/5.jpg)
![Page 6: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/6.jpg)
Sharded MySQL. Current state
![Page 7: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/7.jpg)
Sharded MySQL. Environment
1. MySQL (Percona Server)
2. Hardware configuration:
   a. two Intel E2620v2 CPU
   b. 128GB RAM
   c. 12x800GB Intel SSD, RAID 10
   d. two 2Gb network interfaces (bonded)
![Page 8: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/8.jpg)
MemcacheD
1. Hibernate
   a. Query Cache
   b. Entity Cache
2. Hundreds of nodes
3. ~100MBps per Memcache node
![Page 9: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/9.jpg)
Sharded MySQL. Failover
1. master
2. co-master
3. flogger
4. archive
X4
![Page 10: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/10.jpg)
Sharded MySQL. Approach
1. Hibernate changes:
   a. Patching 2nd level caching:
      i. +environment
      ii. -class version
   b. More info to debug problems
   c. Fixing bugs
2. Own implementation:
   a. FitbitTransactional
   b. ManagedHibernateSession
3. Dynamic sharding concept (somewhat similar to C*)
![Page 11: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/11.jpg)
Sharded MySQL. Data migration
Solution: vBucket
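The slide names vBuckets without showing how they are used. As a rough illustration of the general vBucket idea (not the code from the talk), the sketch below hashes each key to a fixed virtual bucket and keeps a separate table from buckets to shards, so that moving data means reassigning buckets rather than rehashing keys. `VBucketRouter`, the CRC32 hash, and the bucket count used in any call are assumptions made for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical sketch: names, hash function and sizes are illustrative only.
final class VBucketRouter {
    private final int[] shardOfBucket; // index = vBucket id, value = shard id

    VBucketRouter(int bucketCount, int shardCount) {
        shardOfBucket = new int[bucketCount];
        for (int b = 0; b < bucketCount; b++) {
            shardOfBucket[b] = b % shardCount; // initial even spread over shards
        }
    }

    // A key always hashes to the same vBucket, however many shards exist.
    int bucketFor(String key) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % shardOfBucket.length);
    }

    int shardFor(String key) {
        return shardOfBucket[bucketFor(key)];
    }

    // Called once a bucket's rows have been copied to the new shard.
    void reassign(int bucket, int newShard) {
        shardOfBucket[bucket] = newShard;
    }
}
```

Because only the bucket-to-shard table changes, adding shards touches just the buckets that are moved.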
![Page 12: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/12.jpg)
Sharded MySQL. Data migration
Migration (96 -> 152 shards):
● vBuckets to move: 96579
● 1 bucket migration time: 8 min
● 10 bucketmover * 3 processes - 12 days
![Page 13: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/13.jpg)
Sharded MySQL. Data migration
Job
● Setup
   a. Ensures vbuckets in read-only mode
   b. Waits for servers to reach consensus
● Execute
   a. Triggers actions (dump, insert, etc.) on Bucketmover
   b. Waits for actions to complete
● Wrap-up
   a. Updates shards for vbuckets, re-opens them for writes
   b. Advances jobs to next action
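The ordering of the three phases matters: writes are frozen before any data is copied, and re-opened only after the bucket points at its new shard. A minimal sketch of that ordering follows. It simplifies the slide by running all actions inside one setup/wrap-up pair, and `BucketMoveJob` and the `Cluster` interface are hypothetical stand-ins for the real infrastructure.

```java
import java.util.List;

// Hypothetical sketch: illustrative names, not the code from the talk.
final class BucketMoveJob {
    /** Stand-in for the real infrastructure calls. */
    interface Cluster {
        void setReadOnly(int vbucket, boolean readOnly);
        boolean allServersAgree(int vbucket);              // consensus on read-only state
        void runOnBucketmover(String action, int vbucket); // blocks until the action completes
        void assignShard(int vbucket, int shard);
    }

    private final Cluster cluster;

    BucketMoveJob(Cluster cluster) {
        this.cluster = cluster;
    }

    void move(int vbucket, int targetShard, List<String> actions) {
        // Setup: stop writes, then wait until every server has seen the change.
        cluster.setReadOnly(vbucket, true);
        while (!cluster.allServersAgree(vbucket)) {
            Thread.yield();
        }
        // Execute: run each action (dump, insert, ...) and wait for it to finish.
        for (String action : actions) {
            cluster.runOnBucketmover(action, vbucket);
        }
        // Wrap-up: point the vBucket at its new shard and re-open it for writes.
        cluster.assignShard(vbucket, targetShard);
        cluster.setReadOnly(vbucket, false);
    }
}
```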
![Page 14: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/14.jpg)
Sharded MySQL. Schema migration
1. Locks during schema update

Solution: pt-online-schema-change + protobuf

Drawbacks:
1. Split between DML/DDL scripts
2. Binary format (additional data)
3. Additional platform specific tool
message Meta {
    optional string name = 1;
    optional string intro = 2;
    ...
    repeated string requiredFeatures = 32;
}

message Challenge {
    optional Meta meta = 1;
    ...
    optional CWRace cw_race = 6;
}
![Page 15: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/15.jpg)
Sharded MySQL. Development
1. Job system across shards
2. Use unsharded databases for lookup tables
3. Do not forget about custom annotation
@PrimaryEntity(entityType = EntityType.SHARDED_ENTITY)
![Page 16: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/16.jpg)
Query patterns
1. Create challenge
2. List challenges by user
3. Get challenge team leaderboard by user
4. Post a message
5. List challenge messages
6. Cheer a message
![Page 17: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/17.jpg)
MySQL. Not a problem
1. Knowledge Base
2. Response Time
![Page 18: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/18.jpg)
Our problems
1. MySQL
   a. Scalability
   b. Fault tolerance
   c. Schema migration
   d. Locks
2. Infrastructure cost
   a. MemcacheD
   b. Redis
![Page 19: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/19.jpg)
C* expectations
1. Scalability
2. Quicker fault recovery
3. Easier schema migration
4. Lack of locks
![Page 20: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/20.jpg)
Migration specifics
1. Millions of real users in prod
2. No downtime
![Page 21: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/21.jpg)
![Page 22: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/22.jpg)
Apache Cassandra
Apache Cassandra is a free and open-source distributed database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure. Cassandra offers robust support for clusters spanning multiple datacenters, with asynchronous masterless replication allowing low latency operations for all clients.
![Page 23: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/23.jpg)
Setting cluster up
1. Performance testing
2. Monitoring
3. Alerting
4. Incremental repairs (CASSANDRA-9935)
![Page 24: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/24.jpg)
C* tweaks
1. ParNew+CMS -> G1
2. MaxGCPauseMillis = 200ms
3. ParallelGCThreads and ConcGCThreads = 4
4. Compaction
5. gc_grace_seconds = 0 (already big TTL for our data)
![Page 25: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/25.jpg)
Create keyspaces/tables
1. Almost the same schema with Cassandra adjustments
2. Data denormalization was required in several places
![Page 26: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/26.jpg)
ID migration
1. Create pseudo-random migration UUID based on BIGINT
2. Thank API designers for using strings as object IDs
3. Make sure clients are ready for the new length of the ID
4. Migrate API to UUID all over the place
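The talk does not say how the migration UUID was derived from the BIGINT key. One possible approach, shown here purely as an assumption, is a name-based UUID: it looks random but is a pure function of the legacy ID, so the same MySQL row always maps to the same Cassandra key when a migration is re-run. `MigrationIds` and the entity prefix are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Hypothetical sketch: one way to derive a stable UUID from a legacy BIGINT id.
final class MigrationIds {
    // The entity prefix keeps ids from different tables from colliding.
    static UUID fromLegacyId(String entity, long legacyId) {
        byte[] name = (entity + ":" + legacyId).getBytes(StandardCharsets.UTF_8);
        return UUID.nameUUIDFromBytes(name); // name-based (version 3, MD5) UUID
    }
}
```

The string form of the result is 36 characters long, which is the length change clients have to be ready for.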
![Page 27: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/27.jpg)
DAO (Data Access Object)
1. Create CassandraDAO with the same interface as HibernateDAO
2. Create ProxyAdapterDAO to control which implementation to select
3. Create adapter implementation for each DAO with the same interface as HibernateDAO
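The pattern described above can be sketched as follows. This is a simplified illustration, not the production code: `ChallengeDao`, `MapBackedDao` (an in-memory stand-in for both the Hibernate and Cassandra implementations) and the `Backend` switch are invented for the example; only the `ProxyAdapterDAO` idea comes from the slide.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Shared interface: callers never know which store is behind it.
interface ChallengeDao {
    void save(String id, String name);
    Optional<String> findName(String id);
}

// In-memory stand-in for HibernateDAO / CassandraDAO (hypothetical).
final class MapBackedDao implements ChallengeDao {
    private final Map<String, String> rows = new ConcurrentHashMap<>();
    public void save(String id, String name) { rows.put(id, name); }
    public Optional<String> findName(String id) { return Optional.ofNullable(rows.get(id)); }
}

// Proxy that decides at runtime which implementation serves a call.
final class ProxyAdapterDao implements ChallengeDao {
    enum Backend { MYSQL, CASSANDRA }

    private final ChallengeDao hibernateDao;
    private final ChallengeDao cassandraDao;
    private volatile Backend active = Backend.MYSQL;

    ProxyAdapterDao(ChallengeDao hibernateDao, ChallengeDao cassandraDao) {
        this.hibernateDao = hibernateDao;
        this.cassandraDao = cassandraDao;
    }

    void switchTo(Backend backend) { active = backend; }

    private ChallengeDao current() {
        return active == Backend.CASSANDRA ? cassandraDao : hibernateDao;
    }

    public void save(String id, String name) { current().save(id, name); }
    public Optional<String> findName(String id) { return current().findName(id); }
}
```

Keeping the interface identical means the calling code does not change at any stage of the migration.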
![Page 28: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/28.jpg)
Enable shadow writes (percentage)
1. Introduce environment specific settings for shadow writes
2. Adjust ProxyAdapterDAO code to enable shadow writes by percentage. Various implementations.
3. Analyze performance (StatsD metrics for our code + Cassandra metrics)
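The slide mentions several implementations of the percentage switch without describing them. One simple variant is a random draw per write, sketched below. `ShadowWriter` and its fields are hypothetical, and swallowing shadow failures is a design assumption of this sketch: the MySQL write still decides whether the request succeeds.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntSupplier;

// Hypothetical sketch: class and field names are illustrative.
final class ShadowWriter {
    private final int percentage;   // 0..100, from environment-specific settings
    private final IntSupplier dice; // returns a value in 0..99
    final AtomicInteger shadowFailures = new AtomicInteger(); // a StatsD counter in practice

    ShadowWriter(int percentage) {
        this(percentage, () -> ThreadLocalRandom.current().nextInt(100));
    }

    // The dice is injectable so the gate can be tested deterministically.
    ShadowWriter(int percentage, IntSupplier dice) {
        this.percentage = percentage;
        this.dice = dice;
    }

    void write(Runnable mysqlWrite, Runnable cassandraWrite) {
        mysqlWrite.run(); // MySQL stays the source of truth at this stage
        if (dice.getAsInt() < percentage) {
            try {
                cassandraWrite.run();
            } catch (RuntimeException e) {
                shadowFailures.incrementAndGet(); // assumption: shadow errors never fail the request
            }
        }
    }
}
```

With the percentage at 0 nothing reaches Cassandra, and at 100 every write is mirrored, so the load on the new cluster can be raised gradually.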
![Page 29: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/29.jpg)
Migrate legacy data
1. Create a new job to read/migrate data
2. Process data in batches
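A batch loop of this kind can be sketched as below, assuming (this is not stated in the talk) that rows are read in ascending ID order and each batch resumes after the last ID seen. `LegacyMigrator` and its `Source` interface are hypothetical.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch of a batched copy job using keyset pagination.
final class LegacyMigrator {
    /** Returns up to {@code limit} ids greater than {@code lastId}, ascending. */
    interface Source {
        List<Long> idsAfter(long lastId, int limit);
    }

    // Copies everything from source to sink and returns the number of rows moved.
    static int migrate(Source source, int batchSize, Consumer<List<Long>> sink) {
        long lastId = Long.MIN_VALUE;
        int total = 0;
        while (true) {
            List<Long> batch = source.idsAfter(lastId, batchSize);
            if (batch.isEmpty()) {
                return total;
            }
            sink.accept(batch);                   // write this batch to Cassandra
            total += batch.size();
            lastId = batch.get(batch.size() - 1); // resume point for the next batch
        }
    }
}
```

Remembering only the last ID keeps the job restartable without rescanning rows that were already copied.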
![Page 30: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/30.jpg)
Start shadow C* reads with validation
1. Environment specific settings for data validation
2. Adjust ProxyAdapterDAO code to enable simultaneous read from MySQL and Cassandra
3. Adjust ProxyAdapterDAO to be able to compare objects
4. Log and investigate data discrepancies
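The validation step can be sketched as a reader that queries both stores, compares the results, records any mismatch, and still returns the MySQL value to the caller. `ValidatingReader` is a hypothetical name, and the in-memory discrepancy list stands in for real logging.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.function.Function;

// Hypothetical sketch: compares a shadow Cassandra read with the MySQL read.
final class ValidatingReader<K, V> {
    private final Function<K, V> mysqlRead;
    private final Function<K, V> cassandraRead;
    final List<String> discrepancies = new ArrayList<>(); // would go to a log in practice

    ValidatingReader(Function<K, V> mysqlRead, Function<K, V> cassandraRead) {
        this.mysqlRead = mysqlRead;
        this.cassandraRead = cassandraRead;
    }

    V read(K key) {
        V expected = mysqlRead.apply(key);
        try {
            V actual = cassandraRead.apply(key);
            if (!Objects.equals(expected, actual)) {
                discrepancies.add("key=" + key + " mysql=" + expected + " cassandra=" + actual);
            }
        } catch (RuntimeException e) {
            discrepancies.add("key=" + key + " cassandra read failed: " + e);
        }
        return expected; // callers still get the MySQL answer at this stage
    }
}
```

The comparison relies on `equals`, so the compared objects need a meaningful `equals` implementation.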
![Page 31: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/31.jpg)
Check validation issues
1. Path
   a. Fix code problems
   b. Migrate affected challenges again
   c. Go to step 1
2. Duration: 1.5 months
![Page 32: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/32.jpg)
Turn on read from C*
1. Introduce a config setting for the percentage of reads returned from C*
2. Still do shadow MySQL reads and validations
3. Increase percentage over time
![Page 33: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/33.jpg)
Turn off writes to MySQL
![Page 34: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/34.jpg)
Clean-up
1. Adjust places which are not suitable for C* patterns, such as looking through all of the shards
2. Adjust adapters to get rid of Hibernate DAOs. Adapter hierarchy is still present
3. Remove obsolete code
4. Clean up MySQL database
![Page 35: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/35.jpg)
Challenge Events Migration Example
1. Previous attempts:
   a. SWF + SQS
   b. MySQL + Job across all shards
2. Now
   a. Complication: performance of C* when used as a queue
   b. 16 threads across 1024 buckets
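The slide gives only the numbers, so the following is one reading of them rather than the documented design: events are spread over 1024 buckets, and each of 16 worker threads polls its own fixed slice of 64 buckets. `EventBuckets` and the use of `hashCode` are assumptions made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: spreading queue-like data over buckets and workers.
final class EventBuckets {
    static final int BUCKETS = 1024;
    static final int WORKERS = 16;

    // Which bucket (partition) an event for this challenge is written to.
    static int bucketOf(String challengeId) {
        return Math.floorMod(challengeId.hashCode(), BUCKETS);
    }

    // The buckets a given worker thread is responsible for polling.
    static List<Integer> bucketsOwnedBy(int worker) {
        List<Integer> owned = new ArrayList<>();
        for (int b = worker; b < BUCKETS; b += WORKERS) {
            owned.add(b);
        }
        return owned;
    }
}
```

Every bucket has exactly one owner, so no two threads compete for the same partition.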
![Page 36: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/36.jpg)
Code Redesign. Message cheer example
1. Originally
   a. Read
   b. Update BLOB
   c. Persist
2. Approach
   a. Update C* set as a single operation
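In CQL, adding an element to a set column is a plain write that needs no prior read, which is what lets the three original steps collapse into one. A sketch of such a statement, held in a Java constant, is shown below. The table and column names are hypothetical, since the real schema is not shown in the slides.

```java
// Hypothetical sketch: schema names are invented for illustration.
final class CheerStatements {
    // Assumed table: challenge_message(challenge_id, message_id, cheered_by set<uuid>, ...)
    static final String ADD_CHEER =
        "UPDATE challenge_message "
      + "SET cheered_by = cheered_by + ? "   // bind a one-element set holding the user id
      + "WHERE challenge_id = ? AND message_id = ?";
}
```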
![Page 37: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/37.jpg)
Code Redesign. Life without transactions
1. BATCH
2. Some object as a single source of truth
![Page 38: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/38.jpg)
![Page 39: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/39.jpg)
Challenges C*. Current State
1. Two datacenters
2. 18 nodes
3. Hardware
   a. 24-core CPU
   b. 64 GB RAM
4. RF: 3
![Page 40: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/40.jpg)
Results of migration
1. Significant improvement in persistence storage scalability & management (compared to MySQL RDBMS)
2. Minimizing number of external points of failure
3. Squashing Technical Debt
4. Created a reusable migration module
![Page 41: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/41.jpg)
Cassandra Inconveniences
1. Lack of ACID transactions
2. MultiDC scenarios require conscious decisions for QUORUM/LOCAL_QUORUM
3. Data denormalization
4. CQL vs SQL limitations
5. Less readable IDs
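The QUORUM versus LOCAL_QUORUM decision comes down to how many replicas must acknowledge an operation. A quorum is a majority of the replicas counted: QUORUM counts replicas in all datacenters, LOCAL_QUORUM only those in the coordinator's datacenter. The sketch below works through the numbers, assuming the RF of 3 from the earlier slide applies to each of the two datacenters (the slides do not say so explicitly). `QuorumMath` is a hypothetical helper.

```java
// Hypothetical helper for quorum arithmetic.
final class QuorumMath {
    // Majority of the given replica count.
    static int quorum(int replicas) {
        return replicas / 2 + 1;
    }

    // LOCAL_QUORUM: majority of the replicas in the local datacenter only.
    static int localQuorum(int localReplicationFactor) {
        return quorum(localReplicationFactor);
    }

    // QUORUM: majority of the replicas summed over all datacenters.
    static int globalQuorum(int... replicationFactorPerDatacenter) {
        int sum = 0;
        for (int rf : replicationFactorPerDatacenter) {
            sum += rf;
        }
        return quorum(sum);
    }
}
```

Under that assumption, LOCAL_QUORUM waits for 2 of the 3 local replicas, while QUORUM waits for 4 of 6 and therefore always needs at least one replica in the remote datacenter.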
![Page 42: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/42.jpg)
Surprisingly not a big deal
1. Lack of JOINs due to the model
2. Lack of aggregation functions due to the model (we’re on 2.1 now)
3. Eventual consistency
4. IDs format change
![Page 43: Migration from MySQL to Cassandra for millions of active users](https://reader033.fdocuments.us/reader033/viewer/2022052514/58e4a6c31a28abbb038b48f1/html5/thumbnails/43.jpg)