20080410 Pf Congrez Presentation E Bay V0 2

A sneak preview at Marktplaats.nl. PFCongrez, April 12th 2008. JA. Oldenbeuving, [email protected]. eBay Inc. Proprietary & Confidential

description

These are the slides I used for a talk I gave at PFCongrez (April 12th 2008, in the Netherlands) about some of the internals and background of Marktplaats, the leading Dutch classifieds site!

Transcript of 20080410 Pf Congrez Presentation E Bay V0 2

Page 1: 20080410 Pf Congrez Presentation E Bay V0 2

A sneak preview at Marktplaats.nl

PFCongrez, April 12th 2008

JA. [email protected]

eBay Inc. Proprietary & Confidential

Page 2: 20080410 Pf Congrez Presentation E Bay V0 2

Who am I?

• Jilles Oldenbeuving
• Working for Marktplaats since early 2003
• Responsible for application development
• Lots of fun:
– Great technical and infrastructural challenges
– Top 3 Dutch website, 59% reach!
– Real business, where product drives success
– World class team!

Page 3: 20080410 Pf Congrez Presentation E Bay V0 2

Content

• eBay’s classifieds portfolio
• Marktplaats statistics
• Marktplaats production environment
• Scaling databases
• Marktplaats and PHP

Page 4: 20080410 Pf Congrez Presentation E Bay V0 2

eBay’s classifieds portfolio

Page 6: 20080410 Pf Congrez Presentation E Bay V0 2

Marktplaats statistics

• At peak:
– Over 71M page views/day
– 10 new listings/s, 6M total listings
– 600 search queries/s
– 900 MB/s uplink traffic
– 120 user generated emails/s (sent from user to user)

• Collection of 20M user images (2TB)

• Utilizing 600+ servers across 3 datacenters

Page 8: 20080410 Pf Congrez Presentation E Bay V0 2

Marktplaats Production Environment

[Diagram, simplified. Labels shown: LB/Firewall, LB, Application, Search Engine, NetCaches, Tracker, Mogile storage nodes, Memcache, CS Backend, LAMP, MySQL (Ads/Users, Hitcounters, AdMarkt, etc.), Readslaves.]

Page 10: 20080410 Pf Congrez Presentation E Bay V0 2

MySQL Replication

• Slaves upon slaves doesn’t scale well…
– Only spreads reads

[Diagram: with 1 server, that server handles 500 reads/s and 200 writes/s. With 2 servers, each handles 250 reads/s and 200 writes/s.]
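The figures on this slide follow from simple arithmetic: adding a replica divides the reads, but every write still has to be replayed on every server. A minimal sketch of that calculation, where the function name `perServerLoad` is made up for illustration and reads are assumed to be spread evenly:

```php
<?php
// Hypothetical helper: per-server load in a single-master replication set.
// Assumption: reads are balanced evenly over all servers.
function perServerLoad(int $totalReads, int $totalWrites, int $servers): array
{
    return [
        // Reads are divided over the available servers.
        'reads'  => (int) ceil($totalReads / $servers),
        // Replication replays every write on every server.
        'writes' => $totalWrites,
    ];
}

echo json_encode(perServerLoad(500, 200, 1)), "\n"; // {"reads":500,"writes":200}
echo json_encode(perServerLoad(500, 200, 2)), "\n"; // {"reads":250,"writes":200}
```

However many servers are added, the `writes` value never goes down, which is why replication alone only spreads reads.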

Page 11: 20080410 Pf Congrez Presentation E Bay V0 2

As your site grows…

• Databases are eventually consumed by writes

• This cannot be solved by caching reads

[Diagram: six servers, each handling 3 reads/s and 400 writes/s.]
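To see why extra slaves stop helping, it can be useful to compute how many servers a given read load needs once the replicated writes are subtracted from each server's capacity. The sketch below is illustrative only: the capacity figures are invented, and it assumes a read costs the same as a write, which the slides do not state.

```php
<?php
// Assumption: every server can do $capacity operations/s in total, and a
// read costs as much as a write. Neither comes from the slides.
function slavesNeeded(int $reads, int $writes, int $capacity): ?int
{
    // Each server first spends capacity on replaying all writes.
    $readHeadroom = $capacity - $writes;
    if ($readHeadroom <= 0) {
        return null; // no number of extra slaves can serve the reads
    }
    return (int) ceil($reads / $readHeadroom);
}

var_dump(slavesNeeded(500, 200, 700)); // int(1)
var_dump(slavesNeeded(500, 400, 403)); // int(167)
var_dump(slavesNeeded(500, 403, 403)); // NULL
```

With 3 reads/s of headroom per server, as in the diagram, serving 500 reads/s would take 167 servers under these assumptions. Once writes reach capacity there is no answer at all, which is the point where partitioning the writes becomes the only option.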

Page 12: 20080410 Pf Congrez Presentation E Bay V0 2

Generic MySQL database pool setup

[Diagram: an Active Master and a Failover Master, accessed through a VIP; slaves grouped into SlavePool 1 and SlavePool 2; a LagSlave; an Offsite Backup.]

Provides:
• High availability for both writes and reads
• Scales reads
• Writes need to be scaled by partitioning (either functionally, or by modulo)
• Prevents human disasters
• Long term backups
• A way to change database schemas without downtime

Slaves can vary:
• Different replication sets (for really high read:write ratios)
• Different indexes
• Different access patterns/impact separation (e.g. cronjobs; for key buffers)

TIP: Abstract this in the code, both the configuration and the physical vs. logical mapping, or look into MySQL Proxy.
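One way to read the tip about abstracting the pool setup is a small lookup layer that turns a logical pool name into a physical host, so application code never hard-codes servers. The sketch below is an assumption about how that could look, not Marktplaats' actual code: the class `DbPools`, the pool names, the slave groups and the host names are all invented.

```php
<?php
// Hypothetical logical-to-physical mapping for database pools.
final class DbPools
{
    /** @var array pool name => ['master_vip' => host, 'slaves' => [group => hosts]] */
    private $config;

    public function __construct(array $config)
    {
        $this->config = $config;
    }

    // Writes always go to the VIP, which points at the active master.
    public function writeHost(string $pool): string
    {
        return $this->config[$pool]['master_vip'];
    }

    // Reads go to a random slave from a group, so cronjobs can be kept
    // away from the slaves that serve web traffic.
    public function readHost(string $pool, string $group = 'web'): string
    {
        $slaves = $this->config[$pool]['slaves'][$group];
        return $slaves[array_rand($slaves)];
    }

    // Modulo partitioning: map a numeric key onto one of N pools.
    public function partitionFor(string $prefix, int $id, int $partitions): string
    {
        return $prefix . '_' . ($id % $partitions);
    }
}

$pools = new DbPools([
    'ads_0' => ['master_vip' => 'ads0-vip.example',
                'slaves' => ['web' => ['ads0-s1.example'], 'cron' => ['ads0-c1.example']]],
    'ads_1' => ['master_vip' => 'ads1-vip.example',
                'slaves' => ['web' => ['ads1-s1.example'], 'cron' => ['ads1-c1.example']]],
]);

$pool = $pools->partitionFor('ads', 1235, 2);  // 1235 % 2 = 1, so "ads_1"
echo $pools->writeHost($pool), "\n";           // ads1-vip.example
echo $pools->readHost($pool, 'cron'), "\n";    // ads1-c1.example
```

Keeping the mapping in configuration means a slave can be added, or a pool moved to other hardware, without touching the code that issues queries.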

Page 13: 20080410 Pf Congrez Presentation E Bay V0 2

How to manage database schemas?

• The problem:
– Hundreds of database instances across Marktplaats
  • Each development environment has its own database
  • Each QA and staging environment
  • Production environment
  • With 10-20 separate database pools each
  • In total 500+ databases
– Application needs to be consistent with the database version too!

Page 14: 20080410 Pf Congrez Presentation E Bay V0 2

Enter DBC

• In-house developed tool in PHP
• Inspects your current database version and application version and will bring those in sync
• Is aware of our database setup
– Physical
– Logical

• DBC is integrated with the build system
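The slides do not describe how DBC works internally, so the following is only a generic sketch of the idea of bringing a database version in sync with an application version: keep numbered changes and apply the ones between the two versions, in order. The function `pendingChanges`, the table name and the SQL statements are invented for illustration.

```php
<?php
// Hypothetical sketch, NOT DBC's real design: select the schema changes
// needed to move a database from $dbVersion up to $appVersion.
function pendingChanges(array $changes, int $dbVersion, int $appVersion): array
{
    ksort($changes); // keys are version numbers; apply in ascending order
    $pending = [];
    foreach ($changes as $version => $sql) {
        if ($version > $dbVersion && $version <= $appVersion) {
            $pending[$version] = $sql;
        }
    }
    return $pending;
}

$changes = [
    1 => 'CREATE TABLE ads (id INT PRIMARY KEY)',
    2 => 'ALTER TABLE ads ADD COLUMN title VARCHAR(255)',
    3 => 'CREATE INDEX idx_title ON ads (title)',
];

// Database is at version 1, the application expects version 3.
foreach (pendingChanges($changes, 1, 3) as $version => $sql) {
    // A real tool would execute the statement and record the new version.
    echo "apply change $version: $sql\n";
}
```

Run against a database that is already at the application's version, the function returns an empty list, so the same step can run safely on every build.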

Page 15: 20080410 Pf Congrez Presentation E Bay V0 2

DBC

• Benefits:
– Allows you to “branch” database changes, but share them within the project team until the feature is finished
– Leaves an audit trail of database changes
– Allows review by a DBA before propagating a change into a release
– Consistent and safe rollout of database changes to production
  • Checks target system before and after

Trying to gauge interest in DBC to decide whether to open source it.

If you are interested, let me know.

Page 17: 20080410 Pf Congrez Presentation E Bay V0 2

Marktplaats and PHP

• Started out as a PHP-only shop in 1999

• PHP worked great, and scaled well up to a certain point

– Usage of Marktplaats keeps on growing
– Application grew immensely in complexity
– Number of developers quadrupled

• Java and SOA architecture gaining more ground

• One example of limits in PHP

Page 18: 20080410 Pf Congrez Presentation E Bay V0 2

APC Autofilter issue

• PHP’s speed can be improved by using an opcode cache like Zend, APC

• Examples to the right bypass this since the path is variable

• No constants in PHP!

include('foo.php');
include($path . '/foo.php');
include(MY_INCL . '/foo.php');
if ($a) include('/path/foo.php');
include('/path/foo.php');

Page 19: 20080410 Pf Congrez Presentation E Bay V0 2

APC Autofilter issue

• This file is cached, but parent.php is not since includes are only done at runtime.

• Child is actually created as an incomplete “mangled” class definition

• What if parent.php was already cached?

include_once "parent.php";

class Child extends ParentClass {} // "Parent" is a reserved word in PHP, so the parent class is named ParentClass here

Page 20: 20080410 Pf Congrez Presentation E Bay V0 2

APC Autofilter issue

• By the time child.php includes parent.php, parent.php is already cached
• Zend changes the opcodes for child.php, removing the include of parent.php; this speeds up execution
• APC cannot use this version of child.php for caching
• APC will stop caching child.php altogether (called “Autofilter”)
• …forever!
• One of the reasons why Java is gaining more traction within Marktplaats

include_once "parent.php";
include_once "child.php";
$c = new Child();

Given the thousands of files in the Marktplaats codebase, this costs ~30% runtime performance!

Page 21: 20080410 Pf Congrez Presentation E Bay V0 2

We’re hiring!

Get in touch: [email protected]

eBay Inc. Proprietary & Confidential
