Growing the Tech team at ridesharing startup Blablacar by N.Tricot and C.Jennewein
Outgrowing an internet startup: database administration in a fast growing company.
-
Upload
spil-engineering -
Category
Technology
-
view
127 -
download
2
description
Transcript of Outgrowing an internet startup: database administration in a fast growing company.
Spil Games: outgrowing an internet startup Art van Scheppingen Head of Database Engineering
2
1. Who is Spil Games? 2. How to professionalize? 3. Spil Storage Pla<orm 4. Ques@ons?
Overview
Who are we? Who is Spil Games?
4
• Company founded in 2001 • 350+ employees world wide • 170M unique visitors per month • 100K unique visitors per month on spilgames.com
Facts
5
Geographic Reach 170 Million Monthly Ac@ve Users(*)
Source: (*) Google Analy3cs, December 2011
• Over 40 localized portals in 19 languages • Focus on casual and social games • 170M MAU per month (30M YoY growth) • Over 40M registered users
6
Girls, Teens and Family Brands
7
Games
8
Games
9
Games
10
Games
11
Games
12
Games
• Inhouse game studios • Partnerships with Social Gaming studios • Over 1500 licensed games
13
DB Servers MAU Employees
2006 2007 2008 2009 2010 2011
DB Servers
MAU
Employees
Spil Games is growing fast!
Database Engineering How to professionalize your department
15
• Databases maintained by Systems Engineering • No focus on performance, structure or backups • Looking only one or two weeks into the future
Startup
16
• Mul@ple migra@ons to new hardware • Ping-‐pong on Master-‐Master setups • Lack of insight into performance issues
Lessons
Read+write
17
• Plan ahead up to three months • Improve database pla<orm • Reduce number of repe@@ve tasks
• Write them down step by step (wiki) • Automate where possible
• Improve monitoring • A single monitoring system is not enough!
• Forecast growth • Week / Month / Year • Look back and evaluate!
• Extend department
Professionalize
18
• Scaling the LDAP pla<orm • LDAP replaced by MySQL based solu@on (with help from Percona)
LDAP isn’t suitable for the web
MMMDMM
19
• LVM snapshot method • took 4 hours on average with manual interven@on
• Innobackupex + netcat + tar + script = quick cloning • Takes about 1 hour per 100GB • Foolproof • Can be run on ac@ve masters (if necessary)
Cloning
20
• Different monitoring systems give different insights • Different angles/metrics/purposes • Early problem detec@on • Signal abnormal use which could cause outage
Improve monitoring
21
• Uneven growth: • Ac@ve master handling all write requests • Ver@cal scaling
• Write only • More writes than reads
• SOA problems • Connec@on spawning • Open file descriptors
Growing pains
22
• An@cipate more than one year in advance • Acknowledge shortcomings/problems, look for solu@ons or alterna@ves • Don't commit to one single solu@on! • Be flexible!
• Plan for capacity per instance, not for growth alone! • Start thinking globally!
Outgrowing our startup phase
Spil Storage Platform Sharding is inevitable
24
What is this exciting project about?
25
• Natural growth • Grown out of necessity for more func@onality • Adding func@onality means more interac@on • Separa@on of database func@on • Profiles • Highscores • Comments • User Generated Content • etc
Functional sharding
26
• KISS • Problem isola@on
Advantages
Disadvantages • Uneven growth • Difference in query panerns • No data consistency • No clear ownership of data • Capacity planning on total number of reads/writes • Horizontal scaling is difficult
27
Spil Storage Platform
28
• What is the bucket model? • It is an abstrac@on layer between the database and the datamodel
• Each record has one unique owner anribute (GID) • The GID (Global IDen@fier) iden@fies different data types
• Different buckets per func@on • Anributes contain record data • Anributes do not have to correspond to schema
Bucket model
29
• Flexibility • Database backend independent • Seamless schema changes and upgrades • Sharded on both func@onal and GID level
• Even distribu@on of queries possible • Capacity planning on number and type of en@@es
• Asynchronous writes possible • Transparent data migra@on
Advantages
Disadvantages • Harder to find data • At least two lookups needed! • Datawarehousing needs a different approach
30
• Globally sharded on GID • (local) GID Lookup
How do GIDs work?
GID lookup
Shard 1 Shard 2
Persistent storage
31
Pipeline flow
Current functional shards
SPAPI
LEGACY adapter
New Application SSP
Legacy API
New GID based shards
Read only
Read + write
32
Bucket mapping and migration
33
• Each cluster of two masters will contain two shards • Data is wrinen interleaved • HA for both shards • No warmup needed
• Both masters ac@ve and “warmed up” • Slave added for backups and Datawarehouse
Master-Master Sharding
SSP
Shard 1
Shard 2
34
• Erlang cluster with many workers • Every GID has its own worker process • (Inter)cluster communica@on • (Near) linear scalability
How are we implementing this SSP?
35
• Erlang node caching • Mul@ple backend connectors • MySQL library • Handlersockets • Any other connectors if needed
• Connec@on pooling
Advantages
Disadvantages • NOT SEXY? ( hnp://spil.com/notsexy )
36
Do YOU want to be sexy?
Questions?
38
Thank you!