MySpace.com MegaSite v2


Transcript of MySpace.com MegaSite v2

Page 1: MySpace.com MegaSite v2
Page 2: MySpace.com MegaSite v2

MySpace.com MegaSite v2

Aber Whitcomb – Chief Technology Officer
Jim Benedetto – Vice President of Technology
Allen Hurff – Vice President of Engineering

Page 3: MySpace.com MegaSite v2

Previous MySpace Scaling Landmarks: First Megasite

64+ MM Registered Users
38 MM Unique Users
260,000 New Registered Users Per Day
23 Billion Page Views/Month
50.2% Female / 49.8% Male
Primary Age Demo: 14-34

[Growth chart: registered users scaling from 100K to 1 M, 6 M, 70 M, and 185 M]

Page 4: MySpace.com MegaSite v2

MySpace Company Overview: Today

As of April 2007:
185+ MM Registered Users
90 MM Unique Users

Demographics:
50.2% Female / 49.8% Male
Primary Age Demo: 14-34

Internet Rank / Page Views in '000s:

MySpace   #1   43,723
Yahoo     #2   35,576
MSN       #3   13,672
Google    #4   12,476
Facebook  #5   12,179
AOL       #6   10,609

Source: comScore Media Metrix March - 2007

Page 5: MySpace.com MegaSite v2

Total Pages Viewed - Last 5 Months

Source: comScore Media Metrix April 2007

[Chart: total page views (MM) per month, Nov 2006 – Mar 2007, scale 0–50,000, for MySpace, Yahoo, MSN, Google, eBay, and Facebook]

Page 6: MySpace.com MegaSite v2

Site Trends

350,000 new user registrations/day
1 Billion+ total images
Millions of new images/day
Millions of songs streamed/day
4.5 Million concurrent users
Localized and launched in 14 countries
Launched China and Latin America last week

Page 7: MySpace.com MegaSite v2

Technical Stats

7 Datacenters
6,000 Web Servers
250 Cache Servers with 16 GB RAM each
650 Ad Servers
250 DB Servers
400 Media Processing Servers
7,000 disks in SAN architecture
70,000 Mbit/s bandwidth
35,000 Mbit/s on CDN

Page 8: MySpace.com MegaSite v2

MySpace Cache

Page 9: MySpace.com MegaSite v2

Relay System Deployment

Typically used for caching MySpace user data: online status, hit counters, profiles, mail.
Provides a transparent client API for caching C# objects.

Clustering
Servers are divided into "Groups" of one or more "Clusters".
Clusters keep themselves up to date.
Multiple load-balancing schemes based on expected load.

Heavy write environment
Must scale past 20k redundant writes per second on a 15-server redundant cluster.
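The deck gives no code, but routing a cache key to one of several clusters in a group is commonly done with stable hashing, so every client independently agrees on where an object lives. A minimal sketch in Python for brevity (the real Relay client is C#; the function name and key format here are hypothetical):

```python
import hashlib

def pick_cluster(key: str, clusters: list[str]) -> str:
    # Hash the key so every client routes a given object to the
    # same cluster without any central coordination.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return clusters[int.from_bytes(digest[:4], "big") % len(clusters)]
```

A production scheme would also weight clusters by expected load, as the slide notes.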

Page 10: MySpace.com MegaSite v2

Relay System

Platform for middle-tier messaging.

Up to 100k request messages per second per server in production.
Purely asynchronous, no thread blocking (Concurrency and Coordination Runtime).
Bulk message processing.
Custom unidirectional connection pooling.
Custom wire format.
Gzip compression for larger messages.
Data center aware.
Configurable components.
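"Gzip compression for larger messages" implies a size cutoff below which compression costs more than it saves. A sketch of that idea in Python (the threshold value and function names are assumptions, not from the deck):

```python
import gzip

COMPRESS_THRESHOLD = 1024  # hypothetical cutoff in bytes

def encode_message(payload: bytes) -> tuple[bool, bytes]:
    # Small messages go on the wire as-is; larger ones are gzipped,
    # mirroring the "gzip compression for larger messages" rule.
    if len(payload) < COMPRESS_THRESHOLD:
        return False, payload
    return True, gzip.compress(payload)

def decode_message(compressed: bool, wire: bytes) -> bytes:
    return gzip.decompress(wire) if compressed else wire
```

The compressed flag would travel in the custom wire format's header so the receiver knows how to decode.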

[Diagram: Relay Service hosting IRelayComponents – Berkeley DB, non-locking memory buckets, fixed-alloc shared interlocked-int storage for hit counters, message forwarding (CCR), and message orchestration (CCR) – fronted by a Socket Server and accessed by RelayClients]

Page 11: MySpace.com MegaSite v2

Code Management: Team Foundation Server, Team System, Team Plain, and Team Test Edition

Page 12: MySpace.com MegaSite v2

Code Management

MySpace embraced Team Foundation Server and Team System during Beta 3.
MySpace was also one of the early beta testers of BizDev's Team Plain (now owned by Microsoft).
Team Foundation initially supported 32 MySpace developers, now supports 110, and is on its way to over 230 developers.
MySpace is able to branch and shelve more effectively with TFS and Team System.

Page 13: MySpace.com MegaSite v2

Code Management (continued)

MySpace uses Team Foundation Server as a source repository for its .NET, C++, Flash, and ColdFusion codebases.
MySpace uses Team Plain for Product Managers and other non-development roles.

Page 14: MySpace.com MegaSite v2

Code Management: Team Test Edition

MySpace is a member of the Strategic Design Review committee for the Team System suite.
MySpace chose Team Test Edition, which reduced cost and kept its Quality Assurance staff on the same suite as the development teams.
Using MSSCCI providers and customization of Team Foundation Server (including the upcoming K2 Blackperl), MySpace was able to extend TFS with better workflow and defect tracking based on our specific needs.

Page 15: MySpace.com MegaSite v2

Server Farm ManagementCodespew

Page 16: MySpace.com MegaSite v2

CodeSpew

Maintaining a consistent, always-changing code base and configs across thousands of servers proved very difficult.
Code rolls began to take a very long time.
CodeSpew: a code deployment and maintenance utility.

Two-tier application:
Central management server – C#
Light agent on every production server – C#

Tightly integrated with Windows PowerShell.

Page 17: MySpace.com MegaSite v2

CodeSpew

UDP out, TCP/IP in.
Massively parallel: able to update hundreds of servers at a time.
File modifications are determined on a per-server basis using CRCs.
Security model for code deployment authorization.
Able to execute remote PowerShell scripts across the server farm.
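Determining file modifications "on a per-server basis based on CRCs" means comparing a checksum manifest from the management server against each agent and pushing only the files that differ. A small Python sketch of the comparison step (function names and the manifest shape are assumptions; the real system is C#):

```python
import zlib

def file_crcs(files: dict[str, bytes]) -> dict[str, int]:
    # CRC32 of each file's contents, keyed by path.
    return {path: zlib.crc32(data) for path, data in files.items()}

def files_to_push(master: dict[str, int], agent: dict[str, int]) -> list[str]:
    # Any path whose CRC differs on the agent (or is missing there)
    # gets redeployed; matching files are skipped entirely.
    return sorted(p for p, crc in master.items() if agent.get(p) != crc)
```

This is what makes rolls cheap: unchanged files cost only a checksum comparison, not a transfer.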

Page 18: MySpace.com MegaSite v2

Media Encoding/Delivery

Page 19: MySpace.com MegaSite v2

Media Statistics

Videos
60 TB storage
15,000 concurrent streams
60,000 new videos/day

Music
25 Million songs
142 TB of space
250,000 concurrent streams

Images
1 Billion+ images
80 TB of space
150,000 req/s
8 Gigabits/sec

Page 20: MySpace.com MegaSite v2

4th Generation Media Encoding

Millions of MP3, video, and image uploads every day.
Ability to design custom encoding profiles (bitrate, width, height, letterbox, etc.) for a variety of deployment scenarios.
Job broker engine to maximize encoding resources and provide a level of QoS.
Abandonment of database connectivity in favor of a web service layer.
XML-based workflow definition to provide extensibility to the encoding engine.
Coded entirely in C#.
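A custom encoding profile with bitrate, width, height, and letterbox settings, defined in XML, might look like the following. This is an illustrative sketch in Python (the element names and profile format are invented for the example, not MySpace's actual schema):

```python
import xml.etree.ElementTree as ET

# Hypothetical profile definition, not MySpace's real schema.
PROFILE_XML = """
<encodingProfile name="flash-video-480">
  <bitrate>512</bitrate>
  <width>480</width>
  <height>360</height>
  <letterbox>true</letterbox>
</encodingProfile>
"""

def parse_profile(xml_text: str) -> dict:
    # Pull the encoding parameters out of the profile definition
    # so the encoding engine can be driven entirely by data.
    root = ET.fromstring(xml_text)
    return {
        "name": root.get("name"),
        "bitrate": int(root.findtext("bitrate")),
        "width": int(root.findtext("width")),
        "height": int(root.findtext("height")),
        "letterbox": root.findtext("letterbox") == "true",
    }
```

Keeping profiles in data rather than code is what lets new deployment scenarios be added without recompiling the encoding engine.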

Page 21: MySpace.com MegaSite v2

4th Generation Encoding Workflow

[Diagram: user content uploads from any application pass through a web service communication layer to a Job Broker and MediaProcessor, generating thumbnails for categorization and a filmstrip for image review, with output stored in DFS 2.0 and delivered via CDN/FTP server]

Page 22: MySpace.com MegaSite v2

MySpace Distributed File System

Page 23: MySpace.com MegaSite v2

MySpace Distributed File System

Provides an object-oriented file store.
Scales linearly to near-infinite capacity on commodity hardware.
High-throughput distribution architecture.
Simple cross-platform storage API.
Designed exclusively for long-tail content.

[Chart: long-tail distribution of demand vs. accesses]

Page 24: MySpace.com MegaSite v2

Sledgehammer

Custom high-performance event-driven web server core.
Written in C++ as a shared library.
Integrated content cache engine.
Integrates with storage layer over HTTP.
Capable of more than 1 Gbit/s throughput on a dual-processor host.
Capable of tens of thousands of concurrent streams.
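An "integrated content cache engine" for long-tail media typically keeps hot objects in memory under a byte budget and evicts the least recently used. A toy sketch of that eviction policy in Python (Sledgehammer itself is C++; this class and its budget are illustrative assumptions):

```python
from collections import OrderedDict

class ContentCache:
    # Tiny LRU cache: hot objects stay resident, cold ones are
    # evicted once the byte budget is exceeded.
    def __init__(self, max_bytes: int):
        self.max_bytes = max_bytes
        self.used = 0
        self._items = OrderedDict()

    def get(self, key):
        if key in self._items:
            self._items.move_to_end(key)  # mark as recently used
            return self._items[key]
        return None

    def put(self, key, value: bytes):
        if key in self._items:
            self.used -= len(self._items.pop(key))
        self._items[key] = value
        self.used += len(value)
        while self.used > self.max_bytes:
            _, evicted = self._items.popitem(last=False)  # drop LRU entry
            self.used -= len(evicted)
```

LRU suits long-tail content well: the small hot head stays cached while the cold tail falls through to the storage layer over HTTP.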

Page 25: MySpace.com MegaSite v2

DFS Interesting Facts

DFS uses a generic "file pointer" data type for identifying files, allowing us to change URL formats and distribution mechanisms without altering data.
Compatible with traditional CDNs like Akamai.
Can be scaled at any granularity, from single nodes to complete clusters.
Provides a uniform method for developers to access any media content on MySpace.
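The point of a generic "file pointer" is that stored data holds only a storage-neutral handle, and the URL is produced at serve time, so URL formats and distribution mechanisms can change without a data migration. A minimal sketch in Python (the pointer fields, hostnames, and URL formats are hypothetical):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FilePointer:
    # Storage-neutral handle: no URL or host name is baked into the data.
    volume: int
    object_id: int

# URL formats can change (or a CDN can be swapped in) without
# rewriting any stored pointers -- only this table changes.
URL_FORMATS = {
    "dfs": "http://dfs.example.local/v{volume}/{object_id}",
    "cdn": "http://cdn.example.net/{volume}/{object_id}",
}

def resolve(ptr: FilePointer, scheme: str = "dfs") -> str:
    return URL_FORMATS[scheme].format(volume=ptr.volume, object_id=ptr.object_id)
```

Pointing the same object at a traditional CDN like Akamai then becomes a lookup-table change rather than a rewrite of a billion rows.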

Page 26: MySpace.com MegaSite v2

Appendix

Page 27: MySpace.com MegaSite v2

Operational Wins

[Chart: pages/sec per server, scale 0–300; legend includes "2005 Server"]

Page 28: MySpace.com MegaSite v2

MySpace Disaster Recovery Overview

Distribute MySpace servers over 3 geographically dispersed co-location sites:

Maintain presence in Los Angeles.
Add a Phoenix site for active/active configuration.
Add a Seattle site for active/active/active with site failover capability.

Page 29: MySpace.com MegaSite v2

Distributed File System Architecture

[Diagram: users are served by Sledgehammer (cache engine and server accelerator engine), which sits in front of the DFS cache daemon, business logic, and the storage cluster]