Unseating the Giants Monte Zweben CEO, Splice Machine October 16, 2014.


Unseating the Giants

Monte Zweben, CEO, Splice Machine

October 16, 2014


The Big Squeeze

Data growing much faster than IT budgets

Source: 2013 IBM Briefing Book

Source: Gartner, Worldwide IT Spending Forecast, 3Q13 Update

Traditional RDBMS Giants Overwhelmed…

Scale-up becoming cost-prohibitive

Splice Machine | Proprietary & Confidential


Scale-Out: The Future of Databases

Dramatic improvement in price/performance

Scale Up (increase server size) vs. Scale Out (more small servers)


Unseating the Giants

Scale-Up Giants vs. Scale-Out Challengers


Scale-Out Example #1

New application chooses NoSQL


Advertiser: Rocket Fuel

145 RTB advertising supply partners

21,103,424 websites

19 Bn daily impressions

###MM WW consumers

91,999 devices

Rocket Fuel: New Application

[Diagram labels: Ad Exchange; Rocket Fuel Platform (Auto Optimization, Real-Time Bidding); Publishers; Exchanges; Ad Networks; Advertisers; Data Providers]

[Figure: grid of per-impression bid values (roughly $0.00 to $2.78), with dimensions Site/Page, Geo/Weather, Time of Day, Brand Affinity, User]

[Architecture diagram, steps 1-9. Components: RTB Exchanges, Direct Publishers, Exchange, Publishers, Load Balancers, Bid Servers, Ad Servers, Pixel Servers, User Profile Store, HBase, HDFS (Master, Slaves), ETL, Bidder Logs, Ad Server Logs, Master Database, Campaign Framework, Response Prediction Models, Apollo Ad-hoc Analytics Tools, Apollo Reporting & Campaign Tools. Flows: Bid Call, User Lookup, User Lookup & Update, Evaluate Ads, Eligible Ads, Likelihood Scores & Bid Value, Tag for Selected Ad & Bid, Rocket Fuel Ad Tag, Ad Creative Tag, Ad-Rejection, Hourly Refresh]

HBase: Proven Scale-Out

• Auto-sharding
• Scales with commodity hardware
• Cost-effective from GBs to PBs
• High availability through failover and replication
• LSM-trees
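The LSM-tree bullet above can be illustrated with a toy sketch. This is not HBase code: the class name `TinyLSM`, the `memtable_limit` threshold, and the in-memory "runs" are all assumptions made for illustration. It only shows the general idea of buffering writes in memory, flushing them as immutable sorted runs, and reading newest data first.

```python
import bisect


class TinyLSM:
    """Toy LSM-tree (hypothetical, for illustration only)."""

    def __init__(self, memtable_limit=2):
        self.memtable_limit = memtable_limit  # assumed flush threshold
        self.memtable = {}                    # in-memory write buffer
        self.runs = []                        # sorted (key, value) lists, oldest first

    def put(self, key, value):
        # Writes only touch memory; a full buffer is flushed as a sorted run.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        self.runs.append(sorted(self.memtable.items()))
        self.memtable = {}

    def get(self, key):
        # Newest data wins: check the buffer, then runs from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def compact(self):
        # Merge all runs into one, letting newer values overwrite older ones.
        merged = {}
        for run in self.runs:
            merged.update(run)
        self.runs = [sorted(merged.items())] if merged else []
```

Because writes are appended rather than updated in place, this style of storage tends to favor high write throughput, at the cost of periodic compaction work.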


Rocket Fuel: Results

Requests per day:

• Facebook likes: 5 B
• Searches on Google: 6 B
• Bid requests considered by Rocket Fuel: 45 B

World-class request velocity on over 10 PB of data


Scale-Out Example #2

Web application replaces Oracle


Before Architecture: Oracle

Oracle too expensive, too slow, and too difficult to scale and modify

Metadata Storage

Shutterfly Website

Photo File Storage

Uploader App

Consumers


After Architecture: MongoDB

Flexibility and scalability of NoSQL ideal for simple web app

Metadata Storage

Shutterfly Website

Photo File Storage

Uploader App

Consumers


MongoDB Architecture

Document data model sharded across commodity servers
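A minimal sketch of what "sharded across commodity servers" can mean in practice, assuming hash-based routing on a shard key. This does not use MongoDB or its drivers; `shard_for`, `ShardedCollection`, and the `user_id` shard key are hypothetical names for illustration. A real deployment also manages chunks, balancing, and replication, none of which is modeled here.

```python
import hashlib


def shard_for(shard_key_value, num_shards):
    """Map a shard-key value to a shard index with a stable hash."""
    digest = hashlib.md5(str(shard_key_value).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


class ShardedCollection:
    """Toy document collection split across in-memory 'servers' (lists)."""

    def __init__(self, num_shards, shard_key):
        self.shard_key = shard_key
        self.shards = [[] for _ in range(num_shards)]

    def insert(self, doc):
        # Documents may carry different fields; only the shard key is required.
        idx = shard_for(doc[self.shard_key], len(self.shards))
        self.shards[idx].append(doc)
        return idx

    def find_by_key(self, value):
        # A lookup on the shard key only needs to visit one shard.
        idx = shard_for(value, len(self.shards))
        return [d for d in self.shards[idx] if d[self.shard_key] == value]
```

For example, photo metadata documents with differing fields can be inserted under the same `user_id` and retrieved from a single shard with `find_by_key`.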


MongoDB: Compelling Results vs. Oracle

⅕ cost with commodity scale-out

9x faster through parallelized queries

Increased agility with flexible schema and “shard on demand”


Scale-Out Example #3


Existing OLTP & OLAP Apps Replace Oracle


Before Architecture: Oracle RAC

Oracle RAC too expensive and too slow, with queries up to ½ hour

• Operational Reports for Campaign Performance
• Ad Hoc Audience Segmentation

Social Feeds

Web/eCommerce Clickstreams

ETL

1st Party/CRM Data

3rd Party Data (e.g., Acxiom)

POS Data

Email Marketing

Data Quality


After Architecture: Hadoop RDBMS

RDBMS functionality with proven scale-out from Hadoop

• Operational Reports for Campaign Performance
• Ad Hoc Audience Segmentation

Social Feeds

Web/eCommerce Clickstreams

ETL

1st Party/CRM Data

3rd Party Data (e.g., Acxiom)

POS Data

Email Marketing

Data Quality


Hadoop RDBMS: Best of Both Worlds

Hadoop

• Scale-out on commodity servers
• Proven to 100s of petabytes
• Efficiently handle sparse data
• Extensive ecosystem

RDBMS

• ANSI SQL
• Real-time, concurrent updates
• ACID transactions
• ODBC/JDBC support


Distributed, Parallelized Query Execution

• Parallelized computation across cluster
• Moves computation to the data
• Utilizes HBase co-processors
• No MapReduce

HBase Co-Processor

HBase Server Memory Space
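The idea of moving computation to the data can be sketched as follows. This is not the Splice Machine or HBase co-processor API: `region_partial`, `parallel_avg`, and the row layout are hypothetical. The sketch assumes each region computes a small partial result next to its own rows, and a coordinator merges only those partials instead of pulling every row across the network.

```python
from concurrent.futures import ThreadPoolExecutor


def region_partial(rows, predicate, column):
    """Runs 'next to the data': scan one region, return (sum, count) only."""
    total, count = 0, 0
    for row in rows:
        if predicate(row):
            total += row[column]
            count += 1
    return total, count


def parallel_avg(regions, predicate, column):
    """Coordinator: fan out to all regions in parallel, then merge partials."""
    workers = max(1, len(regions))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(
            pool.map(lambda rows: region_partial(rows, predicate, column), regions)
        )
    total = sum(t for t, _ in partials)
    count = sum(c for _, c in partials)
    return total / count if count else None
```

A query such as "average spend for one campaign" then transfers one `(sum, count)` pair per region rather than the matching rows themselves.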


Hadoop RDBMS: Compelling Results vs. Oracle

¼ cost with commodity scale-out

3-7x faster through parallelized queries

10-20x price/perf with no application, BI or ETL rewrites
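One way to relate the three figures above is to treat price/performance as speedup divided by the fraction of the original cost. That formula is an assumption of this sketch, not something the slide states, and `price_performance_gain` is a hypothetical helper.

```python
def price_performance_gain(cost_fraction, speedup):
    """Price/performance multiple, assuming gain = speedup / cost fraction."""
    if cost_fraction <= 0:
        raise ValueError("cost_fraction must be positive")
    return speedup / cost_fraction


# Slide inputs: 1/4 of the cost, 3x to 7x faster.
low = price_performance_gain(0.25, 3)
high = price_performance_gain(0.25, 7)
```

Under this simple model the inputs give a range of 12x to 28x, the same order of magnitude as the 10-20x quoted, which presumably reflects measured workloads rather than this product of the two headline numbers.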


Scale-Up vs. Scale-Out

Scale-Up: Top Reasons

1. Willing to pay for engineered systems

2. Lots of custom code (e.g., PL/SQL)

3. Proven reliability

4. Avoid risk of newer technologies

5. Less migration required

Scale-Out: Top Reasons

6. Reduce costs by 4x-10x

7. Increase performance by 3x-10x

8. Ease of scalability

9. Support for flexible schemas

10. Huge ecosystem of open source tools


Unseating the Giants: Why is it different this time?

It’s not just technology – user requirements fundamentally changed

Seismic User Shift

• Budgets flat
• Massive increase in data:
  - Volume
  - Velocity
  - Variety
• No longer acceptable to throw data away

Disruptive Tech: Scale-Out

• Leverage commodity H/W
• Reduce costs by 4-5x
• Increase perf by 5-10x
• Increase agility

Questions?

Monte Zweben, CEO, Splice Machine

[email protected]

Visit Booth 246

www.splicemachine.com