MySQL at Wikipedia - Wikimedia Commons · 2017-05-30 · MySQL at Wikipedia Why MariaDB? WMF,...

MySQL at Wikipedia How we do relational data at the Wikimedia Foundation Jaime Crespo Percona Live Europe 2015 -Amsterdam, 23 Sep 2015-


MySQL at Wikipedia

How we do relational data at the Wikimedia Foundation

Jaime Crespo

Percona Live Europe 2015

-Amsterdam, 23 Sep 2015-


© 2015 Wikimedia Foundation & Jaime Crespo. http://wikimediafoundation.org. License: CC-BY-SA-4.0


Jaime Crespo

● Sr. Database Administrator at Wikimedia Foundation

● Used to work as a trainer for Oracle (MySQL), as a consultant (Percona) and as a freelance administrator (DBAHire.com)


Agenda

1. The Wikimedia Foundation

2. MySQL details

3. Performance & Architecture

4. Reliability

5. Challenges

6. Q&A


THE WIKIMEDIA FOUNDATION


Wikimedia Foundation


Some stats...

● 530-430 Million UVPM (unique visitors per month, not counting mobile devices)

● 17-20 Billion page views per month

● 14-18K new editors per month

● 35 Million Wikipedia Articles

● 8K new Wikipedia articles per day

● 27 Million open/free media files

More stats: reportcard.wmflabs.org


What makes us different

● The Wikimedia Foundation is a non-profit

● Funded exclusively by donations

● These are our principles

– Stewardship

– Shared power

– Internationalism

– Free Speech

– Independence

– Freedom and open source

– Serving every human being

– Transparency

– Accountability

https://wikimediafoundation.org/wiki/Resolution:Wikimedia_Foundation_Guiding_Principles


Openness

● Most companies are built around proprietary technologies

● All the source code we create and use on our infrastructure is free software

– http://git.wikimedia.org/

● All the configuration and provisioning infrastructure is also freely licensed

– http://git.wikimedia.org/tree/operations%2Fpuppet.git


Transparency & Accountability

● All software and infrastructure changes are publicly posted*:

– https://gerrit.wikimedia.org/r/#/q/status:merged+project:operations/puppet,n,z

– https://wikitech.wikimedia.org/wiki/Server_Admin_Log

● Issue tracker is publicly accessible

– https://phabricator.wikimedia.org/

● Most monitoring is publicly accessible

*except security issues (until corrected) and private information


Privacy

● Obliged to respect our users' privacy

● SSL is enforced throughout all services

● We host all our code, data and services ourselves (as far as we can) and do not share them with 3rd parties

– No use of CDNs or public clouds


No dependency

● Even companies using open source try to bind you to their service

● We provide you not only the software, but also the data dumps and the documentation to create your own fork of our projects

– https://dumps.wikipedia.org/

– https://wikitech.wikimedia.org

– Except users' private data


Community Resources

● Many contributors who are not employees with production server access

● We also provide a virtual machine platform (Labs) and a shared hosting platform (Tools) with access to database replicas, open to contributors

– https://wikitech.wikimedia.org/wiki/Help:Contents

– https://wikitech.wikimedia.org/wiki/Help:Tool_Labs


Team

● 11 people in “Technical Operations”, including 1 DBA

– There is also Labs Ops, Datacenter Ops, Fundraising Ops, Analytics Ops, Release Engineering, Services, Devs, Performance & many volunteers supporting us

● We may not be the busiest site, but “there is literally nowhere else serving as many page views per engineer”


MYSQL DETAILS


What do we use MySQL for?

● Core relational data (users, text & file metadata, ...)

– Regular browser requests

– Editing API

● Reliable key-value store:

– Content of each page (revision)

● Disk-based caching:

– Secondary caching level for parsed wikitext, formulas, etc.

● Analytics and events (with difficulty)

● Most internal services with database needs


What do we not use MySQL for? (I)

● RESTful API

– Cassandra

● Crunched analytics

– Hadoop

● Memory caching

– Memcached

● Queueing

– Redis


What do we not use MySQL for? (II)

● Search and logs

– Elasticsearch and Logstash

● Compression

– Pages use application-side compression

● File storage

– We use Swift

http://blog.wikimedia.org/2012/02/09/scaling-media-storage-at-wikimedia-with-swift/
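Application-side compression means the text is compressed before it reaches the database and decompressed after it is read back, so MySQL only ever stores an opaque blob. A minimal sketch of the idea follows. The slide does not name the algorithm, so the use of zlib and the function names are assumptions for illustration only.

```python
import zlib


def pack_text(text: str) -> bytes:
    """Compress page text in the application before it is sent to the database."""
    return zlib.compress(text.encode("utf-8"))


def unpack_text(blob: bytes) -> str:
    """Reverse of pack_text: decompress a stored blob back into text."""
    return zlib.decompress(blob).decode("utf-8")


# Hypothetical, repetitive wikitext so the size reduction is visible.
wikitext = "'''MySQL''' is a relational database. " * 50
blob = pack_text(wikitext)
restored = unpack_text(blob)
```

The database never needs to know the format, which is why this works with a plain key-value layout.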


MySQL versions

● Past: Facebook 5.1 fork

● Currently finishing upgrading MySQL 5.5 to a custom MariaDB 10 package

http://blog.wikimedia.org/2013/04/22/wikipedia-adopts-mariadb/

● Relying on several 3rd party utilities: Percona Xtrabackup and Toolkit, mydumper, etc.


Why MariaDB?

● WMF, “corporate” contributor of the MariaDB Foundation

● In general, avoiding “lock-in” for production, but certain features are great:

– Multi-source replication

– TokuDB

– Index statistics as static tables/histograms

– Open source pool of connections

● Things we patch/would require from upstream/3rd party:

– Query rewriting plugin

– Delayed slave

– Max query running time

– Extended PRIMARY KEY issues

– Replication state in transactional tables


Some MySQL stats

● ~22 Billion queries a day

– Top recorded throughput for enwiki is 145K QPS

● >800 wikis in 280 languages

● 99.99% availability for enwiki in the last 6 months

● ~20TB of non-duplicate live data

● 2.5 Billion article revisions

● 95th percentile of query execution time is 332 µs

– (API) queries running longer than 300 s are killed
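The 300-second limit can be pictured as a periodic sweep over the process list. The sketch below is a hypothetical illustration, not the mechanism WMF actually runs (the slides do not describe it). The dictionaries mirror the ID, COMMAND and TIME columns of information_schema.PROCESSLIST, and the function only builds KILL QUERY statements instead of executing them.

```python
MAX_SECONDS = 300  # limit quoted on the slide for (API) queries


def kill_statements(processlist, max_seconds=MAX_SECONDS):
    """Return KILL QUERY statements for queries running longer than max_seconds."""
    statements = []
    for row in processlist:
        # Idle connections ("Sleep") are left alone; only running queries count.
        if row["COMMAND"] == "Query" and row["TIME"] > max_seconds:
            statements.append(f"KILL QUERY {row['ID']}")
    return statements


# Hypothetical snapshot of the process list.
sample = [
    {"ID": 11, "COMMAND": "Query", "TIME": 2},
    {"ID": 12, "COMMAND": "Query", "TIME": 450},
    {"ID": 13, "COMMAND": "Sleep", "TIME": 900},
]
to_kill = kill_statements(sample)
```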


my.cnf

● https://git.wikimedia.org/blob/operations%2FPuppet/10169911757ada824c11ee4e3dcd214bd229f247/templates%2Fmariadb%2Fproduction.my.cnf.erb

● Particularities

– MariaDB Pool-of-threads (max_connections = 5000)

– charset = BINARY

– rpl_semi_sync*

– userstat=1

– innodb_buffer_pool_dump_at_shutdown / innodb_buffer_pool_load_at_startup
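As a rough picture of what such particularities look like in an option file, the sketch below parses a hypothetical [mysqld] fragment with Python's configparser. The fragment is an assumption assembled from the bullet points, not a copy of the linked production template. The semi-sync options are left out because the slide only gives the rpl_semi_sync* prefix.

```python
import configparser

# Hypothetical fragment; values other than max_connections are illustrative.
MY_CNF_SKETCH = """
[mysqld]
thread_handling = pool-of-threads
max_connections = 5000
character_set_server = binary
userstat = 1
innodb_buffer_pool_dump_at_shutdown = 1
innodb_buffer_pool_load_at_startup = 1
"""


def load_mysqld_options(text):
    """Parse an option-file fragment and return the [mysqld] section as a dict."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return dict(parser["mysqld"])


options = load_mysqld_options(MY_CNF_SKETCH)
```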


PERFORMANCE & ARCHITECTURE


Hardware and operating systems

● Standard x86_64 servers (several providers)

● 64-192GB of RAM

● Mostly on HDs

– Hardware RAID controller (RAID 10)

– Currently integrating SSDs for vertical scalability

● GNU/Linux

– Ubuntu Trusty; some machines still on Precise

– Currently migrating to Debian Jessie


Servers

● 1300 hosts

– ~120 Varnish caches

– ~320 main application servers, scalers, job runners

– 140 active MySQL servers (including support and labs services)

– 31 Elasticsearch servers

– 20 LVS

– 48 media storage frontends and backends

http://ganglia.wikimedia.org


MediaWiki software

● Running on Apache with PHP-HHVM

● MediaWiki implements its own ORM that allows database independence

– MySQL and SQLite are the main maintained engines

● Read-write is split at application side

– Writes and important reads go to the master

– Most reads go to the slaves (chronology is checked at application side)

https://www.mediawiki.org/wiki/MediaWiki
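The read/write split with an application-side chronology check can be sketched as below. This is a toy model under stated assumptions, not MediaWiki's code: the Router class, the session ids and the integer positions (standing in for replication coordinates) are all hypothetical.

```python
import random


class Router:
    """Toy read/write splitter; positions are plain integers standing in
    for replication coordinates."""

    def __init__(self, master, replicas):
        self.master = master
        self.replicas = replicas   # replica name -> last applied position
        self.master_pos = 0
        self.session_pos = {}      # session id -> position of its last write

    def write(self, session):
        """All writes go to the master; remember how far this session has written."""
        self.master_pos += 1
        self.session_pos[session] = self.master_pos
        return self.master

    def read(self, session, important=False):
        """Important reads go to the master, the rest to a caught-up replica."""
        if important:
            return self.master
        needed = self.session_pos.get(session, 0)
        caught_up = [name for name, pos in self.replicas.items() if pos >= needed]
        # Fall back to the master when no replica has seen this session's writes.
        return random.choice(caught_up) if caught_up else self.master


router = Router("master", {"replica1": 0, "replica2": 0})
first = router.read("alice")         # no writes yet: any replica will do
router.write("alice")
after_write = router.read("alice")   # replicas are behind this session's write
router.replicas["replica1"] = 1      # replica1 catches up
later = router.read("alice")
```

The point of the check is that a user never reads a replica that has not yet applied that user's own write.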


Caching

● Caching reads and queuing writes

– HTTP Varnish caching eliminates 9/10 of the traffic

– Table-level caching (templatelinks, externallinks) makes special pages trivial; those tables are calculated asynchronously by Redis jobs on slaves

– HTML and unrendered wikitext are also cached and stored on memcached/parsercache db servers
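The memcached/parsercache layering amounts to a two-level lookup: memory first, then the disk-backed database, then an actual render. A minimal sketch follows, with plain dictionaries standing in for both stores. The class and its names are hypothetical and only illustrate the lookup order.

```python
class TwoLevelCache:
    """Memory cache in front of a disk-backed cache in front of a renderer."""

    def __init__(self, render):
        self.memory = {}    # stands in for memcached
        self.disk = {}      # stands in for the parsercache database
        self.render = render
        self.renders = 0    # how many times the expensive path ran

    def get(self, key):
        if key in self.memory:
            return self.memory[key]
        if key in self.disk:
            # Promote the entry so the next request is served from memory.
            self.memory[key] = self.disk[key]
            return self.memory[key]
        self.renders += 1
        html = self.render(key)
        self.disk[key] = html
        self.memory[key] = html
        return html


cache = TwoLevelCache(render=lambda title: f"<h1>{title}</h1>")
first = cache.get("MySQL")
cache.memory.clear()            # simulate a memcached eviction or restart
second = cache.get("MySQL")     # served from the second level, no re-render
```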


Datacenters

● Servers are distributed among 4 datacenters:

– Ashburn, Virginia (eqiad)

– Dallas, Texas (codfw)

– Amsterdam (esams)

– San Francisco, California (ulsfo)

● Only active for caching (passive for application servers, for now)

http://blog.wikimedia.org/2013/01/19/wikimedia-sites-move-to-primary-data-center-in-ashburn-virginia/


DNS-based CDN

http://blog.wikimedia.org/2014/07/11/making-wikimedia-sites-faster/

http://blog.wikimedia.org/2014/07/09/how-ripe-atlas-helped-wikipedia-users/


MySQL Functional groups

● “Core” Production Servers

● External Storage

● External Clusters

● Miscellaneous internal services

● Parsercache

● Analytics

● Labs


MySQL Shards: Core servers

● Most relational data: users, metadata, etc.

– s1: English Wikipedia

– s2: Large wikis

– s3: Most small wikis (~800)

– s4: Commons

– s5: Wikidata and German Wikipedia

– s6: Large wikis

– s7: Centralauth, metawiki and some large Wikipedias

More details: https://noc.wikimedia.org/db.php
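Sharding here is per wiki, so routing reduces to a lookup from wiki to section. The sketch below is an illustration only: the database names and the fallback to s3 are assumptions derived from the list above, and the authoritative mapping is the one published at the noc.wikimedia.org URL.

```python
# Hypothetical excerpt of a wiki -> section map, following the slide's list.
SECTION_BY_WIKI = {
    "enwiki": "s1",
    "commonswiki": "s4",
    "wikidatawiki": "s5",
    "dewiki": "s5",
    "metawiki": "s7",
}
DEFAULT_SECTION = "s3"  # "most small wikis" per the slide


def section_for(wiki):
    """Return the core section that hosts the given wiki database."""
    return SECTION_BY_WIKI.get(wiki, DEFAULT_SECTION)
```

Because every wiki lives entirely in one section, ordinary queries never cross shards. Only analytics needs cross-shard access.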


MySQL Shards: External Storage and External cluster

● Key-value storage holding the actual revision text

– es1: Read-only clusters

– es2-es3: Read/write cluster

● x1: Very dynamic data / global data (mostly writes)

– Notifications

– Extension data with very different query patterns


MySQL Shards: Misc

● m1-m5: Internal services databases (puppet, phabricator, openstack, wordpress, …)

● Parsercache (pc): secondary cache level for rendered content

● Analytics and research: MySQL replicas and event logging for data analysis and statistics

– Make heavy use of multi-source replication for cross-shard joins
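With MariaDB multi-source replication, one analytics host replicates from several shard masters at once, each under its own named connection, which is what makes cross-shard joins possible locally. The sketch below only builds the statements for such a setup. The host names are hypothetical, and a real CHANGE MASTER also needs credentials and replication coordinates, which are omitted here.

```python
def multi_source_statements(sources):
    """Build MariaDB multi-source replication statements,
    one named connection per shard (sources: connection name -> master host)."""
    statements = []
    for name, host in sources.items():
        statements.append(f"CHANGE MASTER '{name}' TO MASTER_HOST='{host}'")
        statements.append(f"START SLAVE '{name}'")
    return statements


# Hypothetical host names, for illustration only.
stmts = multi_source_statements(
    {"s1": "s1-master.example", "s2": "s2-master.example"}
)
```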


MySQL Shards: LabsDB

● Replicas for virtual machines (Labs) and community contributors (Tools)

● Shared MySQL (and PostgreSQL) servers for tool users

● Requires sanitizing

● Challenging to administer due to the large gap between the number of users and the resources available


RELIABILITY


Shard components

● 1 Master

● 2-14 slaves with traditional replication

– Geographically distributed over 2 datacenters

● Semi-sync replication to avoid data loss


Master Failover

● No automatic failover on the core servers for masters

– Wikis will go to read-only mode if the master fails

– An operator will perform the failover (hopefully) in less than 15 minutes

● HAProxy

– Only used for full automatic failover for misc. services


Slave Automatic Failover

● MediaWiki-controlled

● A slave is not used if:

– It is unresponsive

– Its lag is larger than the configured limit (and there are other available slaves)

● Other errors (or for maintenance) require human intervention for depooling
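The two exclusion rules above can be sketched as a small filter. This is a hypothetical illustration rather than MediaWiki's load balancer: lag is given in seconds, None marks an unresponsive host, and the host names are made up.

```python
def usable_replicas(replicas, max_lag):
    """replicas: name -> lag in seconds, or None when the host is unresponsive."""
    responsive = {name: lag for name, lag in replicas.items() if lag is not None}
    fresh = [name for name, lag in responsive.items() if lag <= max_lag]
    # Lagged replicas are only skipped when other replicas are available.
    return fresh if fresh else list(responsive)


healthy = usable_replicas({"db1": 0, "db2": 40, "db3": None}, max_lag=10)
all_lagged = usable_replicas({"db1": 30, "db2": 40}, max_lag=10)
```

In the second call every replica exceeds the limit, so both stay in rotation instead of leaving the pool empty.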


Load-Balancing

● Also MediaWiki-controlled

● Each slave has a weight (0-N)

● It can also have a role (API, slow, dump, watchlist, recentpages, contributions, logpager)

– This helps avoid disrupting all nodes and helps buffer pool efficiency for certain query patterns

● Datacenters are active-active only for caches; applications and MySQL are still active-passive
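Weights and roles combine naturally: each role has its own weight table, and a replica is drawn at random in proportion to its weight. The sketch below assumes hypothetical host names, weights and group layout. It is not the actual MediaWiki configuration format.

```python
import random


def pick_replica(weights, rng=random):
    """weights: name -> integer weight; a weight of 0 means 'never picked'."""
    names = [name for name, weight in weights.items() if weight > 0]
    if not names:
        raise LookupError("no replica with a positive weight")
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


# Hypothetical groups: a general pool plus a dedicated host for dumps.
GROUPS = {
    "default": {"db1": 100, "db2": 200, "db3": 0},
    "dump": {"db3": 1},
}


def pick_for(role):
    """Use the role's own pool when it has one, else the general pool."""
    return pick_replica(GROUPS.get(role, GROUPS["default"]))
```

Giving heavy query patterns such as dumps their own host keeps them from evicting the hot pages in every other replica's buffer pool.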


Data Recovery

● Weekly logical backups from a spare slave (6 month retention)

– Mostly unused except for issue investigation

– 30-day retention on binary logs

● ~Biweekly public XML dumps

● On node failure, recovery is handled by cloning from another slave (rsync or xtrabackup)

● 24-hour delayed slave with all shards (multi-source, TokuDB)


Maintenance

● No maintenance windows

– Code deployments 24/7

● No integrated system; depending on the change:

– pt-online-schema-change/online schema change

– Always enough redundancy for switchover

– Batched update

https://wikitech.wikimedia.org/wiki/Deployments
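A batched update walks the table in small primary-key ranges and commits after each one, so no single transaction holds locks for long or produces a large replication event. The sketch below runs against an in-memory SQLite database so it is self-contained. The page table and its touched column are hypothetical, and the batch size is illustrative.

```python
import sqlite3


def batched_update(conn, batch_size=2):
    """Update rows in small primary-key ranges, committing after each batch.
    Returns the number of batches that were applied."""
    batches = 0
    last_id = 0
    while True:
        rows = conn.execute(
            "SELECT id FROM page WHERE id > ? AND touched = 0 ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            return batches
        ids = [row[0] for row in rows]
        conn.execute(
            "UPDATE page SET touched = 1 WHERE id BETWEEN ? AND ? AND touched = 0",
            (ids[0], ids[-1]),
        )
        conn.commit()  # short transactions keep locks and replication lag small
        last_id = ids[-1]
        batches += 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE page (id INTEGER PRIMARY KEY, touched INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany("INSERT INTO page (id) VALUES (?)", [(i,) for i in range(1, 6)])
conn.commit()
batches = batched_update(conn)
remaining = conn.execute("SELECT COUNT(*) FROM page WHERE touched = 0").fetchone()[0]
```

In production one would typically also pause between batches and watch replication lag. That part is left out of the sketch.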


Lessons learned about recovery

● Avoid flapping services: STONITH

● Chaos/monkey testing (we call it deployment schedule)

● Backups are useless: have a faster recovery plan

– Data recovery <> service recovery

● Avoid active-passive setups:

– Avoid failover: you won't be ready when needed

– Have redundancy and a 30% resource utilization

● Automate and log everything (even if run manually)


Monitoring

● “Ecosystem” problem: too many tools

– Ganglia: basic parameters

– Icinga: alerts

– Graphite & Grafana: custom graphs

– Logstash: centralization of logs (application DB errors and slow queries)

– Custom DB monitoring system: “Tendril” (graphs, slow queries and reports)

– pt-query-digest (Ishmael web interface, deprecated)


CHALLENGES


Infrastructure and code

● Writes are not an issue for us; reads are

– Logged-in users and POST requests are not cached

● A 15-year-old PHP application means technical debt

– Dependency on statement-based replication

– No real utf-8 support at the time

– No sql_mode set (WIP)


Best things about MySQL

● InnoDB is reliable

● Easy to use

● Fast

● Not trying to be smart

● Wide 3rd party support (utilities)


Worst things about MySQL

● Many manual operations (provisioning, replication, HA, partitioning)

– They have to be automated by us

– Some of them are slowly being implemented

● Lack of proper compression (both reliable and performant)


Future (I)

● SSDs and vertical scaling

● Compression (InnoDB, RocksDB, TokuDB?)

● OLAP/column-based solution for analytics

● Fully active-active over several datacenters

– Multimaster?

● Better maintenance and recovery automation


Future (II)

● Integrated query analysis and debugging (P_S?)

● Better monitoring

– Smoke tests for data integrity, strange states, etc.

● 10.1? 5.7? WebscaleSQL? Galera?

● Better sanitization process (binlog processor)

● Rearchitect connection handling


You can help us!

● Apply for the DBA full time position: http://grnh.se/0y4pxm

● Clone our puppet repo and start sending us patches

– Or create your own wiki-based tool on Tool-Labs

● Join us at #wikimedia-operations and #wikimedia-databases on Freenode


Q&A