DataStax


Page 1: DataStax

©2013 DataStax Confidential. Do not distribute without consent.

Extreme Data Velocity
Continuous Availability
Operational Simplicity

Michael Shaler
Senior Director, Business Development

Page 2: DataStax

What is Big Data’s payoff?

Page 3: DataStax

DataStax: CRN’s “10 Coolest Big Data Startups”

Cassandra: InfoWorld’s Technology of the Year

1,000+ production deployments and 300 customers

$84M in funding from industry-leading investors

Page 4: DataStax

BHAG

We are the first viable alternative to Oracle for modern online applications.

We seek to be the first and best choice in databases.

Page 5: DataStax

No, Seriously…

Page 6: DataStax

Real-world Use Cases

Page 7: DataStax

Internet of Things Database Requirements

• “UTC subject predicate”: Time series data and metadata are the lingua franca of sensors/device data communications

• FAST AND ALWAYS ON: High-velocity ingest rates from geographically dispersed inputs with variable schemas/data models are the norm, and unless you tell them to, sensors never, ever sleep…

• HOT AND COLD: Real-time data and analytics vs. data reservoir/data factory needs vary.

• DHTs: Wide-row column-oriented distributed hash tables are the optimal home for IoT operational datastores

• AND: Other key functionality needed includes indexed search, along with both batch and real-time analytics—with data-in-flight and data-at-rest security an emerging need

• SPOILER ALERT: DataStax Enterprise supports all of the above
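As a rough illustration of the wide-row time-series layout named in the bullets above, the following Python sketch keeps one partition per sensor and stores its readings sorted by UTC timestamp, so a time-range query becomes a contiguous slice. The class `WideRowStore`, its methods, and the sample meter ID are hypothetical in-memory stand-ins chosen for illustration; they are not DataStax or Cassandra APIs.

```python
import bisect
from collections import defaultdict
from datetime import datetime, timezone


class WideRowStore:
    """Toy in-memory model of a wide-row time-series table (illustrative only)."""

    def __init__(self):
        # partition key (sensor id) -> list of (utc_timestamp, predicate, value),
        # kept sorted by timestamp, like ordered columns inside one wide row
        self._rows = defaultdict(list)

    def write(self, sensor_id, ts, predicate, value):
        # Insert in timestamp order, even if readings arrive out of order.
        bisect.insort(self._rows[sensor_id], (ts, predicate, value))

    def read_range(self, sensor_id, start, end):
        """Return readings with start <= timestamp < end for one sensor."""
        row = self._rows[sensor_id]
        lo = bisect.bisect_left(row, (start,))
        hi = bisect.bisect_left(row, (end,))
        return row[lo:hi]


def utc(hour, minute):
    return datetime(2013, 1, 1, hour, minute, tzinfo=timezone.utc)


store = WideRowStore()
store.write("meter-42", utc(0, 30), "kwh", 0.31)  # arrives out of order
store.write("meter-42", utc(0, 0), "kwh", 0.27)
store.write("meter-42", utc(1, 0), "kwh", 0.29)

first_hour = store.read_range("meter-42", utc(0, 0), utc(1, 0))
```

Each reading follows the “UTC subject predicate” shape: a timestamp, the sensor it belongs to, and the quantity measured. Keeping a sensor's readings together and ordered is what makes range reads cheap in this kind of layout.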

Page 8: DataStax

Time Series Analytics: 70B readings

Smart Grid Proof of Concept: Analyze 2 years of Smart Meter data for 1M households

Improvements in demand forecasting could yield EBITDA > $100M per GW saved

• $5M CAPEX
• 10 man-months delivery (Deploy, DevOps, Tuning)
• Ongoing OPEX of > $1M

• $450K OPEX
• 2 DevOps running 15 AWS nodes
• Faster performance in 2 weeks
• …All in the cloud
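The 70B headline figure can be sanity-checked with simple arithmetic. The sketch below assumes one reading per meter every 15 minutes; the slide does not state the metering interval, so that value is an assumption that happens to reproduce the stated total.

```python
HOUSEHOLDS = 1_000_000
DAYS = 2 * 365              # two years of data, ignoring leap days
READINGS_PER_DAY = 24 * 4   # assumption: one reading every 15 minutes

total_readings = HOUSEHOLDS * DAYS * READINGS_PER_DAY
print(f"{total_readings / 1e9:.2f}B readings")  # prints 70.08B readings
```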

Page 9: DataStax

Major Changes: The Evolving Data Center

[Diagram labels, in order: LOB App / Oracle; LOB App / MySQL; LOB App / SQL Server; “What’s Happening?” Hyper Velocity, Transactional; NoSQL; Data Warehouse; Teradata/Exadata; “What Happened?” Massive Volume; Bit Bucket; Hadoop]

Page 10: DataStax

The Application World *HAS* Changed

Page 11: DataStax

Common Use Cases

• Big data OLTP and write-intensive systems

• Time series data management

• High-velocity device data consumption and analysis

• Healthcare systems input and analysis

• Media streaming (music, movies, etc.)

• Online Web retail (shopping carts, user transactions, etc.)

• Online gaming (real-time messaging, etc.)

• Real-time data analytics

• Social media input and analysis

• Web click-stream analysis

• Buyer event and behavior analytics

• Fraud detection and analysis

• Risk analysis and management

• Supply chain analytics

• Web product searches

• Internal document search (law firms, etc.)

• Real estate/property searches

• Social media match ups

• Web & application log management / analysis

Page 12: DataStax

Continuous Availability Commentary

Page 13: DataStax

Cassandra: Architecture as Foundation

[Diagram: nodes A1 to A3, B1 to B3, C1 to C3, and D1 to D3, with data centers in London, Virginia, Santa Clara, and Sydney]
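The diagram on this page shows nodes spread across four data centers. One way such a layout is typically expressed in Cassandra is a keyspace that uses `NetworkTopologyStrategy` with a replica count per data center. The Python sketch below only builds the CQL text; the helper name `create_keyspace_cql`, the keyspace name `app_data`, the data center names, and the replication factor of 3 are all assumptions for illustration (real data center names must match the cluster's snitch configuration).

```python
def create_keyspace_cql(keyspace, replicas_per_dc):
    """Build a CQL CREATE KEYSPACE statement with per-data-center replication."""
    options = ["'class': 'NetworkTopologyStrategy'"]
    # One entry per data center: name -> number of replicas kept there.
    options += [f"'{dc}': {rf}" for dc, rf in replicas_per_dc.items()]
    return f"CREATE KEYSPACE {keyspace} WITH replication = {{{', '.join(options)}}};"


# Hypothetical data center names modeled on the four sites in the diagram.
statement = create_keyspace_cql(
    "app_data",
    {"London": 3, "Virginia": 3, "Santa_Clara": 3, "Sydney": 3},
)
print(statement)
```

With a definition along these lines, every data center holds its own full set of replicas, which is the basis for reading and writing locally at each site.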

Page 14: DataStax

The New DR: Simian Army “Dystopia as a Service”

Page 15: DataStax

Heterogeneous Workloads: Active Everywhere

[Diagram labels: Write, Analyze, Read, Search]

Page 16: DataStax

Our Product Solution

• DataStax Enterprise powers the big data apps that transform business.

• Extreme Data Velocity

• Continuous Availability

• Operational Simplicity

Page 17: DataStax

©2012 DataStax

33M streaming customers

2T API calls/year

~1,200 servers

55 AWS clusters

12 developers

4 operators

0 new data centers

Operational Simplicity

“Our primary operational data store is now Cassandra, not Oracle.”

Page 18: DataStax

Performance: NoSQL Leadership

Source: Solving Big Data Challenges for Enterprise Application Performance Management

Tilmann Rabl, University of Toronto, et al., VLDB 2012 (August 2012, Istanbul)

Cassandra vs. HBase:

• 10x more read throughput

• 100x faster read latency

• 8x more write throughput

• 8x faster scan latency

• 4x more scan throughput

Page 19: DataStax

Performance: NoSQL Leadership

[Charts: YCSB load process; YCSB read-write mix; YCSB read-mostly; YCSB write-mostly]

Page 20: DataStax

From STB to the Scalable Cloud Message Bus

Enabling a richer active consumer experience across multiple devices and platforms

Even in a pre-production environment, prior to tuning, the system achieved near-linear scalability

Page 21: DataStax

Instagram Scales Engaged Networks

• Transitioned from Redis (in-memory cache) to Cassandra in Amazon Web Services EC2

• Doubled cluster—and then doubled again—to support 150MM users on new infrastructure

• Continue to scale in spite of Justin Bieber storms, video formats, new features, new markets

Page 22: DataStax

Our Vision

DataStax is driving Cassandra to be the first viable alternative to the Oracle database for companies who are transforming the way they interact with customers.

Getting ahead of exploding growth
• Sign big, new contracts all the time (ESPN)
• 200M unique users per month
• 40TB of data

Flexible architecture
• “Couldn’t shoehorn RDBMS technology”

Very small operations team
• 3 people
• 20 clusters
• 100s of nodes

Page 23: DataStax

Why We Exist

Today’s applications must be always available and lightning fast as they scale to previously unimaginable levels.

Cassandra delivers both with a beautifully simple and elegant architecture.

“We need a real-time, massively scalable architecture, where no one node is a single point of failure, that can easily span multiple data centers and cloud availability zones, and that’s Cassandra.”

Page 24: DataStax

What We Do Best

Cassandra was designed to do things that are impossible in other databases when it comes to availability and performance. Forget about losing a machine here or there: Cassandra delivers a world where you can lose an entire data center and still perform as your customers expect.

“We have to be ready for disaster recovery all the time. It’s really great that Cassandra allows for active-active multiple data centers where we can read and write anywhere.”

Jay Patel, Technical Architect at eBay
(Describing why they switched from legacy relational architecture)
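To make the “lose an entire data center” claim concrete, the sketch below checks which data centers can still satisfy a local quorum (a majority of the replicas held in that data center, as in Cassandra's `LOCAL_QUORUM` consistency level) after an outage. The function names, data center names, and replica counts are hypothetical; this is a back-of-the-envelope model, not how Cassandra itself tracks node health.

```python
def quorum(replication_factor):
    """Quorum size: a strict majority of the replicas."""
    return replication_factor // 2 + 1


def dcs_serving_local_quorum(live_replicas, replication_factor):
    """Return the data centers that still have enough live replicas for a local quorum."""
    return sorted(
        dc
        for dc, live in live_replicas.items()
        if live >= quorum(replication_factor[dc])
    )


# Hypothetical two-site deployment with three replicas per data center.
rf = {"us_east": 3, "eu_west": 3}

# us_east is completely offline; eu_west has all of its replicas up.
after_outage = dcs_serving_local_quorum({"us_east": 0, "eu_west": 3}, rf)
print(after_outage)  # ['eu_west']
```

Because each site holds its own replicas, clients routed to the surviving data center can keep reading and writing at local-quorum consistency while the failed site is down.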

Page 25: DataStax

The Modern “Application”

Page 26: DataStax

The Modern “Application”

Fraud Detection and Prevention

Page 27: DataStax

What It Means In Real Life

Page 28: DataStax

What It Means In Real Life

Page 29: DataStax

Cassandra Summit SF 2013

Page 30: DataStax

Real Growth In Production

Page 31: DataStax

We are the first viable alternative to Oracle for modern online applications.

Page 32: DataStax

Thank You

We power the big data apps that transform business.

Page 33: DataStax

DataStax OpsCenter 4.0

Page 34: DataStax

DataStax OpsCenter 4.0

Page 35: DataStax

DataStax OpsCenter 4.0

Page 36: DataStax

DataStax OpsCenter 4.0

Page 37: DataStax

DataStax OpsCenter 4.0

Page 38: DataStax

Security in Cassandra

FEATURE: Internal Authentication manages login IDs and passwords inside the database

BENEFITS:
+ Ensures only authorized users can access a database system using internal validation
+ Simple to implement and easy to understand
+ No learning curve from the relational world

FEATURE: Object Permission Management controls who has access to what and who can do what in the database

BENEFITS:
+ Provides granular control over who can add/change/delete/read data
+ Uses familiar GRANT/REVOKE from relational systems
+ No learning curve

FEATURE: Client to Node Encryption protects data in flight to and from a database cluster

BENEFITS:
+ Ensures data cannot be captured/stolen en route to a server
+ Data is safe both in flight from/to a database and on the database; complete coverage is ensured
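As an illustration of the GRANT/REVOKE style mentioned under Object Permission Management, the sketch below builds keyspace-level CQL permission statements as text. The helper names, the keyspace `smart_grid`, and the user `analyst` are hypothetical; the permission names are the standard CQL ones, but exact syntax can vary by Cassandra version, so treat the output as a starting point to check against the documentation for your release.

```python
# Permission names assumed from standard CQL; verify against your version.
PERMISSIONS = {"SELECT", "MODIFY", "CREATE", "ALTER", "DROP", "AUTHORIZE"}


def _check(permission):
    if permission not in PERMISSIONS:
        raise ValueError(f"unknown permission: {permission}")


def grant_cql(permission, keyspace, user):
    """Build a statement granting one permission on a keyspace to a user."""
    _check(permission)
    return f"GRANT {permission} ON KEYSPACE {keyspace} TO {user};"


def revoke_cql(permission, keyspace, user):
    """Build a statement revoking one permission on a keyspace from a user."""
    _check(permission)
    return f"REVOKE {permission} ON KEYSPACE {keyspace} FROM {user};"


# Hypothetical example: a read-only analyst who loses write access.
grant_read = grant_cql("SELECT", "smart_grid", "analyst")
revoke_write = revoke_cql("MODIFY", "smart_grid", "analyst")
print(grant_read)
print(revoke_write)
```

`SELECT` covers reads and `MODIFY` covers inserts, updates, and deletes, which maps onto the add/change/delete/read split described in the benefits list.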

Page 39: DataStax

Advanced Security in DataStax Enterprise

FEATURE: External Authentication uses external security software packages to control security

BENEFITS:
+ Only authorized users have access to a database system using external validation
+ Uses most trusted external security packages (Kerberos, LDAP), mainstays in government and finance
+ Single sign-on to all data domains

FEATURE: Transparent Data Encryption encrypts data at rest

BENEFITS:
+ Protects sensitive data at rest from theft and from being read at the file system level
+ No changes needed at application level
+ Can encrypt both Cassandra and Hadoop data

FEATURE: Data Auditing provides a trail of who did and looked at what, and when

BENEFITS:
+ Supplies admins with an audit trail of all accesses and changes
+ Granular control to audit only what’s needed
+ Uses log4j interface to ensure performance and efficient audit operations