20080410 Pf Congrez Presentation E Bay V0 2

Post on 29-Jun-2015

These are the slides I used for a talk I gave at PFCongrez (April 12th 2008, in NL) about some of the internals and background at Marktplaats, the leading Dutch classifieds site!

Transcript of 20080410 Pf Congrez Presentation E Bay V0 2

A sneak preview at Marktplaats.nl
PFCongrez

April 12th 2008
JA. Oldenbeuving
jilles@marktplaats.nl

eBay Inc. Proprietary & Confidential

Who am I?

• Jilles Oldenbeuving
• Working for Marktplaats since early 2003
• Responsible for application development
• Lots of fun:
– Great technical and infrastructural challenges
– Top 3 Dutch website, 59% reach!
– Real business, where product drives success
– World class team!

Content

• eBay’s classifieds portfolio
• Marktplaats statistics
• Marktplaats production environment
• Scaling databases
• Marktplaats and PHP

eBay’s classifieds portfolio

Marktplaats statistics

• At peak:
– Over 71M page views/day
– 10 new listings/s, 6M total listings
– 600 search queries/s
– 900 MB/s uplink traffic
– 120 user-generated emails/s (sent from user to user)
• Collection of 20M user images (2TB)
• Utilizing 600+ servers across 3 datacenters

Marktplaats Production Environment (simplified)

[Architecture diagram. Components shown: LB/Firewall, LB, Application servers (LAMP), NetCaches, Tracker and Mogile storage nodes, Search Engine, Memcache, CS Backend, and MySQL databases (Ads/Users, Hitcounters, AdMarkt, etc.) with read slaves.]

MySQL Replication

• Slaves upon slaves don’t scale well…
– Replication only spreads reads; every server still performs all writes

[Diagram]
w/ 1 server: 500 reads/s, 200 writes/s
w/ 2 servers: 250 reads/s and 200 writes/s on each server
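The arithmetic behind the diagram can be sketched as follows. This is only an illustration using the numbers from the slide; the function name perServerLoad is made up for this example, and it assumes reads are spread evenly while every server replays all writes.

```php
<?php
// Hypothetical helper (not from the talk): load seen by each server in a
// replication set, assuming reads are spread evenly and all writes are
// replicated to every server.
function perServerLoad(int $totalReads, int $totalWrites, int $servers): array
{
    return [
        'reads'  => intdiv($totalReads, $servers), // reads are divided over the servers
        'writes' => $totalWrites,                  // writes are repeated on every server
    ];
}

foreach ([1, 2, 4] as $n) {
    $load = perServerLoad(500, 200, $n);
    printf("%d server(s): %d reads/s, %d writes/s each\n", $n, $load['reads'], $load['writes']);
}
```

Adding servers keeps lowering the reads per server, but the writes per server stay at 200/s no matter how many slaves are added.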

As your site grows…

• Databases are eventually consumed by writes
• This cannot be solved by caching reads

[Diagram: six servers, each handling 3 reads/s and 400 writes/s]

Generic MySQL database pool setup

[Diagram: an Active Master and a Failover Master, accessed through a VIP; slave pools (SlavePool 1, SlavePool 2); a LagSlave; an OffsiteBackup.]

Provides:
• High availability for both writes and reads
• Scales reads
• Writes need to be scaled by partitioning (either functionally, or by modulo)
• Prevents human disasters
• Long term backups
• A way to change database schemas without downtime
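The two ways of partitioning writes mentioned in the list (functionally, or by modulo) could look roughly like this. It is a minimal sketch, not Marktplaats code: the pool names, the function names and the choice of a numeric user id as the partitioning key are all assumptions made for illustration, and ids are assumed to be non-negative.

```php
<?php
// Assumed pool names, for illustration only.
const FUNCTIONAL_POOLS = [
    'ads'         => 'pool_ads',
    'hitcounters' => 'pool_hitcounters',
];

// Functional partitioning: each kind of data lives in its own database pool.
function poolByFunction(string $dataset): string
{
    if (!isset(FUNCTIONAL_POOLS[$dataset])) {
        throw new InvalidArgumentException("Unknown dataset: $dataset");
    }
    return FUNCTIONAL_POOLS[$dataset];
}

// Modulo partitioning: rows of one dataset are spread over $poolCount pools
// based on a numeric key (here a hypothetical, non-negative user id).
function poolByModulo(int $userId, int $poolCount): string
{
    return 'pool_users_' . ($userId % $poolCount);
}

echo poolByFunction('ads'), "\n";   // pool_ads
echo poolByModulo(10, 4), "\n";     // pool_users_2
```

With evenly distributed ids, each of the N modulo pools receives roughly 1/N of the writes, which is what replication alone cannot achieve.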

Slaves can vary:
• Different replication sets (for really high read:write ratios)
• Different indexes
• Different access patterns/impact separation (e.g. cronjobs; for key buffers)

TIP: Abstract this in the code, both the configuration and the physical vs. logical mapping, or look into MySQL Proxy.
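One way to read the tip is to let application code ask for a logical pool and leave the choice of physical host to a single lookup function. The sketch below is an assumption about what such an abstraction might look like, not the actual Marktplaats code; the host names, the config layout and the function hostFor are all hypothetical.

```php
<?php
// Hypothetical configuration: logical pool name => physical hosts.
$config = [
    'ads' => [
        'master' => 'ads-vip.example.internal', // assumed VIP in front of the masters
        'slaves' => ['ads-slave1.example.internal', 'ads-slave2.example.internal'],
    ],
];

// Resolve a logical pool to a physical host.
function hostFor(array $config, string $logicalPool, bool $forWrite): string
{
    if (!isset($config[$logicalPool])) {
        throw new InvalidArgumentException("Unknown pool: $logicalPool");
    }
    $pool = $config[$logicalPool];
    if ($forWrite || empty($pool['slaves'])) {
        return $pool['master']; // writes (and reads without slaves) go to the master
    }
    return $pool['slaves'][array_rand($pool['slaves'])]; // spread reads over the slaves
}

echo hostFor($config, 'ads', true), "\n";
echo hostFor($config, 'ads', false), "\n";
```

Because callers only know the logical name 'ads', slaves can be added, removed or specialised by changing the configuration alone.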

How to manage database schemas?

• The problem:
– Hundreds of database instances across Marktplaats
  • Each development environment has its own database
  • Each QA and staging environment
  • Production environment
  • With 10-20 separate database pools each
  • In total 500+ databases
– Application needs to be consistent with the database version too!

Enter DBC

• In-house developed tool in PHP
• Inspects your current database version and application version and will bring those in sync
• Is aware of our database setup
– Physical
– Logical
• DBC is integrated with the build system

DBC

• Benefits:
– Allows you to “branch” database changes, but share them within the project team until the feature is finished
– Leaves an audit trail of database changes
– Allows review by a DBA before propagating a change into a release
– Consistent and safe rollout of database changes to production
  • Checks target system before and after

Trying to gauge interest in DBC to decide whether to open source it.

If you are interested, let me know.
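The slides do not show how DBC works internally. As a rough illustration of the general idea of bringing a database version in sync with an application version, here is a minimal sketch that assumes numbered schema changes; the function pendingMigrations, the version numbers and the SQL statements are invented for this example and are not taken from DBC.

```php
<?php
// Hypothetical sketch: $migrations maps a schema version number to the SQL
// that upgrades the database to that version.
function pendingMigrations(int $dbVersion, int $appVersion, array $migrations): array
{
    if ($dbVersion > $appVersion) {
        throw new RuntimeException("Database ($dbVersion) is newer than application ($appVersion)");
    }
    ksort($migrations); // apply in ascending version order
    return array_filter(
        $migrations,
        function (int $version) use ($dbVersion, $appVersion): bool {
            return $version > $dbVersion && $version <= $appVersion;
        },
        ARRAY_FILTER_USE_KEY
    );
}

// Made-up example changes.
$migrations = [
    3 => 'ALTER TABLE ads ADD INDEX idx_created (created_at)',
    1 => 'CREATE TABLE ads (id INT PRIMARY KEY)',
    2 => 'ALTER TABLE ads ADD COLUMN created_at DATETIME',
];

foreach (pendingMigrations(1, 3, $migrations) as $version => $sql) {
    echo "would apply v$version: $sql\n";
}
```

A real tool would also record each applied change (the audit trail) and check the target system before and after, which this sketch leaves out.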

Marktplaats and PHP

• Started out as a PHP-only shop in 1999

• PHP worked great, and scaled well up to a certain point

– Usage of Marktplaats keeps on growing
– Application grew immensely in complexity
– Number of developers quadrupled

• Java and SOA architecture gaining more ground

• One example of limits in PHP

APC Autofilter issue

• PHP’s speed can be improved by using an opcode cache such as Zend or APC
• The examples below bypass this when the path is variable
• No compile-time constants in PHP: constants such as MY_INCL are resolved at runtime

include('foo.php');
include($path . '/foo.php');
include(MY_INCL . '/foo.php');
if ($a) include('/path/foo.php');
include('/path/foo.php');

APC Autofilter issue

• This file is cached, but parent.php is not, since includes are only done at runtime
• Child is actually created as an incomplete “mangled” class definition
• What if parent.php was already cached?

// child.php (the parent class is named ParentClass here because "Parent" is a reserved class name in PHP)
include_once "parent.php";
class Child extends ParentClass {}

APC Autofilter issue

• By the time child.php is including parent.php, it is already cached
• Zend changes the opcodes for child.php, removing the include of parent.php; this speeds up execution
• APC cannot use this version of child.php for caching
• APC will stop caching child.php at all (called “Autofilter”)
• …forever!
• One of the reasons why Java is gaining more traction within Marktplaats

include_once "parent.php";
include_once "child.php";
$c = new Child();

Given the thousands of files in Marktplaats’ codebase, this costs ~30% runtime performance!

We’re hiring!

Get in touch: jilles@marktplaats.nl
