05Nov13 Webinar: Introducing Revolution R Enterprise 7 - The Big Data Big Analytics Platform

Post on 26-Jan-2015


Announcing: Release 7 of Revolution R Enterprise

Michele Chambers, Chief Strategy Officer and VP Product Management

Thomas W. Dinsmore, Director of Product Management

Tuesday, November 5

Agenda

Introduction
– Demystifying R
– Revolution Analytics at a Glance
– Revolution R Enterprise
– Revolution Analytics Partner Ecosystem
– Customer Testimonials

What’s New in RRE 7?

More Information

Questions

Demystifying R

What is R & why is it so darn popular?


R is exploding in popularity & function

Web Site Popularity: number of links to main web site (chart comparing R, SAS, SPSS, S-Plus and Stata)

Scholarly Activity: Google Scholar hits (’05-’09 CAGR)
R 46%
SAS -11%
SPSS -27%
S-Plus 0%
Stata 10%

Internet Discussion: mean monthly traffic on email discussion list (chart comparing R, SAS, Stata, SPSS and S-Plus)

Package Growth: number of R packages listed on CRAN
4,332 as of Feb 2013

Latest survey shows significant growth in R adoption

“A key benefit of R is that it provides near-instant availability of new and experimental methods created by its user base, without waiting for the development/release cycle of commercial software. SAS recognizes the value of R to our customer base…”

Product Marketing Manager, SAS Institute, Inc.

“I’ve been astonished by the rate at which R has been adopted. Four years ago, everyone in my economics department [at the University of Chicago] was using Stata; now, as far as I can tell, R is the standard tool, and students learn it first.”

Deputy Editor for New Products at Forbes

R Usage Growth: Rexer Data Miner Survey, 2007-2013

70% of data miners report using R

24% use R as primary tool

Source: www.rexeranalytics.com

Revolution Analytics at a Glance

Who We Are
Only provider of commercial big data big analytics platform based on open source R statistical computing language

Our Software Delivers
– Scalable Performance: Distributed & parallelized analytics
– Cross Platform: Write once, deploy anywhere
– Productivity: Easily build & deploy with latest modern analytics

Our Services Deliver
– Knowledge: Our experts enable you to be experts
– Time-to-Value: Our Quickstart program gives you a jumpstart
– Guidance: Our customer support team is here to help you

Global Industries Served

Financial Services

Digital Media

Government

Health & Life Sciences

High Tech

Manufacturing

Retail

Telco

Customers: 200+ Global 2000

Global Presence: North America / EMEA / APAC

Revolution R Enterprise

High Performance, Scalable Analytics
Portable Across Enterprise Platforms
Easier to Build & Deploy Analytics

is… the only big data big analytics platform based on open source R, the de facto statistical computing language for modern analytics

R is open source and drives analytic innovation but… has some limitations for Enterprises

Big Data: In-memory bound | Hybrid memory & disk scalability | Operates on bigger volumes & factors
Speed of Analysis: Single threaded | Parallel threading | Shrinks analysis time
Enterprise Readiness: Community support | Commercial support | Delivers full service production support
Analytic Breadth & Depth: 5000+ innovative analytic packages | Leverage open source packages plus Big Data ready packages | Supercharges R
Commercial Viability: Risk of deployment of open source | Commercial license | Eliminate risk with open source

Introducing Revolution R Enterprise (RRE): The Big Data Big Analytics Platform

Platform components: R+CRAN, RevoR, DistributedR, ScaleR, ConnectR, DevelopR, DeployR

Big Data Big Analytics Ready

– Enterprise readiness

– High performance analytics

– Multi-platform architecture

– Data source integration

– Development tools

– Deployment tools


The Platform Step by Step: R Capabilities

R+CRAN
• Open source R interpreter
• Freely-available R algorithms
• Algorithms callable by RevoR
• Embeddable in R scripts
• 100% Compatible with existing R scripts, functions and packages

RevoR
• Performance enhanced R interpreter
• Based on open source R
• Adds high-performance math

The Platform Step by Step: Parallelization & Data Sourcing

ConnectR
• High-speed & direct connectors

ScaleR
• Ready-to-Use high-performance big data big analytics
• Fully-parallelized analytics
• Data prep & data distillation
• Descriptive statistics & statistical tests
• Correlation & covariance matrices
• Predictive Models – linear, logistic, GLM
• Machine learning
• Monte Carlo simulation
• Tools for distributing customized algorithms across nodes

DistributedR
• Distributed computing framework
• Delivers portability across platforms

The Platform Step by Step: Tools & Deployment

DevelopR
• Integrated development environment for R
• Visual ‘step-into’ debugger

DeployR
• Web services software development kit for integrating analytics via Java, JavaScript or .NET APIs
• Integrates R into application infrastructures

Capabilities:
• Invokes R scripts from web services calls
• RESTful interface for easy integration
• Works with web & mobile apps, leading BI & visualization tools and business rules engines

Write Once. Deploy Anywhere.

DESIGNED FOR SCALE, PORTABILITY & PERFORMANCE

In the Cloud: Microsoft Azure Burst, Amazon AWS
Workstations & Servers: Desktop, Server
Clustered Systems: IBM Platform LSF, Microsoft HPC
EDW: IBM Netezza, Teradata
Hadoop: Hortonworks, Cloudera

The Power of Revolution R Enterprise: Performance & Scalability

Capability | Component | Value
R + CRAN | Open Source | Leverage latest innovation
Fast Math Libraries | RevoR | 3-50X faster
Memory Management | DistributedR | Effective memory utilization
Multi-Threaded Execution | DistributedR | Powerful divide & conquer
Grid Processing | DistributedR | Maximizes computation
Parallelized Algorithms | ScaleR | Labor saving power
Parallelized User Code | ScaleR | Leverage CRAN
In-Database Execution | ScaleR | Moves computation to data
In-Hadoop Execution | ScaleR | Moves computation to data

Revolution R Enterprise: Powering Next Generation Analytics

COMBINE INTERMEDIATE RESULTS
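The "combine intermediate results" step can be illustrated with a small divide-and-conquer sketch in Python. This is only an illustration of the general pattern (summarize each chunk independently, then merge the summaries), not Revolution R Enterprise's actual implementation. The names `PartialStats`, `summarize_chunk`, `combine` and `mean_and_variance` are hypothetical, and the chunks are tiny in-memory lists standing in for blocks of a large data set.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class PartialStats:
    """Intermediate result for one chunk: count, sum and sum of squares."""
    n: int = 0
    total: float = 0.0
    total_sq: float = 0.0


def summarize_chunk(chunk: Iterable[float]) -> PartialStats:
    """Process one chunk on its own; this step could run on any worker."""
    stats = PartialStats()
    for x in chunk:
        stats.n += 1
        stats.total += x
        stats.total_sq += x * x
    return stats


def combine(parts: Iterable[PartialStats]) -> PartialStats:
    """Merge per-chunk summaries; only the small summaries are moved."""
    out = PartialStats()
    for p in parts:
        out.n += p.n
        out.total += p.total
        out.total_sq += p.total_sq
    return out


def mean_and_variance(stats: PartialStats) -> Tuple[float, float]:
    """Final step: derive the mean and population variance from the totals."""
    mean = stats.total / stats.n
    variance = stats.total_sq / stats.n - mean * mean
    return mean, variance


# Hypothetical data split into three chunks of unequal size.
chunks = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
partials = [summarize_chunk(c) for c in chunks]
mean, variance = mean_and_variance(combine(partials))
print(mean, variance)
```

The result matches what a single pass over all six values would give, because counts, sums and sums of squares add up across chunks. The sum-of-squares formula is used here for brevity; it can lose precision on large or badly scaled data, where a more stable merge formula would be preferable.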


Revolution R Enterprise RevoR: Performance Enhanced R

Computation (4-core laptop) | Open Source R | Revolution R | Speedup
Linear Algebra (1)
Matrix Multiply | 176 sec | 9.3 sec | 18x
Cholesky Factorization | 25.5 sec | 1.3 sec | 19x
Linear Discriminant Analysis | 189 sec | 74 sec | 3x
General R Benchmarks (2)
R Benchmarks (Matrix Functions) | 22 sec | 3.5 sec | 5x
R Benchmarks (Program Control) | 5.6 sec | 5.4 sec | Not appreciable

1. http://www.revolutionanalytics.com/why-revolution-r/benchmarks.php
2. http://r.research.att.com/benchmarks/

Customers report 3-50x performance improvements compared to Open Source R, without changing any code

RRE ScaleR outperforms SAS HPA – at a fraction of the cost

Logistic Regression:

Measure | SAS HPA* | Revolution R | Comparison
Rows of data | 1 billion | 1 billion |
Parameters | “just a few” | 7 | Double
Time | 80 seconds | 44 seconds | 45%
Data location | In memory | On disk |
Nodes | 32 | 5 | 1/6th
Cores | 384 | 20 | 5%
RAM | 1,536 GB | 80 GB | 5%

Revolution R is faster on the same amount of data, despite using approximately a 20th as many cores, a 20th as much RAM, a 6th as many nodes, and not pre-loading data into RAM.

Bottom Line: Revolution R Enterprise Performance = Greatly Reduced TCO

*As published by SAS in HPC Wire, April 21, 2011
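The comparison figures on this slide can be re-derived from the listed values with a short Python sketch. The dictionary names are hypothetical. Reading the first value of each pair as the SAS HPA figure and the second as Revolution R's is an assumption drawn from the slide's own summary sentence.

```python
# Figures as listed on the slide (first value SAS HPA, second Revolution R).
sas_hpa = {"time_sec": 80, "nodes": 32, "cores": 384, "ram_gb": 1536}
revolution_r = {"time_sec": 44, "nodes": 5, "cores": 20, "ram_gb": 80}

# Ratio of the Revolution R figure to the SAS HPA figure for each measure.
ratios = {key: revolution_r[key] / sas_hpa[key] for key in sas_hpa}

for key, ratio in ratios.items():
    print(f"{key}: {ratio:.1%} of the SAS HPA figure")
```

Under this reading, 44 seconds is 55% of 80 seconds (about 45% less time). Cores and RAM each come to roughly 5%, about a 20th, and nodes to roughly a 6th, which agrees with the slide's rounded figures.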

R + Revolution R Enterprise: Unequaled Big Data Big Analytics

Open Source Analytics
Performance Enhanced R
Big Data Distributed Analytics
Deploy Analytics: Web, Mobile, Data Visualization, BI

Revolution R Enterprise Ecosystem: Power of Integration

Deployment / Consumption

Data / Infrastructure

Advanced Analytics

ETL

SI / Service

Corios

MSP / DSP


Customers Revolutionize their Business

Power

“We’ve combined Revolution R Enterprise and Hadoop to build and deploy customized exploratory data analysis and GAM survival models for our marketing performance management and attribution platform. Given that our data sets are already in the terabytes and are growing rapidly, we depend on Revolution R Enterprise’s scalability and power – we saw about a 4x performance improvement on 50 million records. It works brilliantly.” - CEO, John Wallace, DataSong

4X performance
50M records scored daily

Scalability

“We’ve been able to scale our solution to a problem that’s so big that most companies could not address it. If we had to go with a different solution we wouldn’t be as efficient as we are now.” - SVP Analytics, Kevin Lyons, eXelate

Terabytes of data from 200+ data sources
Tens of thousands of attributes
Hundreds of millions of scores daily

2X data, 2X attributes, no impact on performance

Performance

“We need a high-performance analytics infrastructure because marketing optimization is a lot like financial trading. By watching the market constantly for data or market condition updates, we can now identify opportunities for our clients that would otherwise be lost.”

- Chief Analytics Officer, Leon Zemel, [x+1]

What’s New in RRE 7

The Power of R

Most widely used analytics tool
Preferred by working analysts
More than 6,000 packages
Global footprint

New
• R 3.0.2

Scalable Statistical Modeling

Linear Regression
Stepwise Linear
Logistic Regression
Generalized Linear Models

New
• Stepwise Logistic
• Stepwise GLM

Scalable Machine Learning

Decision Trees

New
• Decision Forests
• Tree Visualization

Data Source Integration

Fixed/delimited text
SAS, SPSS
ODBC
HDFS and HBase
Teradata

Tested
• HP Vertica
• Teradata Aster

New: Model Integration


BI Integration

Custom web reports
QlikView accelerator

New
• Excel Accelerator
• Tableau Integration

New: Business User Interface


Choice of Operating Systems


New: Inside-Hadoop Deployment

Diagram: Hadoop cluster with a Name Node, a Job Tracker, five Data Nodes and five Task Trackers, across the MapReduce and HDFS layers

Multi-Node Package Manager

Diagram: Hadoop cluster with a Name Node, a Job Tracker, five Data Nodes and five Task Trackers, across the MapReduce and HDFS layers

ScaleR in Hadoop


In-Database Deployment


Summary: What’s New in RRE 7.0

R 3.0.2
Stepwise Logistic
Stepwise GLM
Decision Forests
Tree Visualizer
PMML Export

www.revolutionanalytics.com


www.revolutionanalytics.com/contact-us
