IBM Analytics for Apache Spark (Spark as a Service)
files.meetup.com/7770922/Spark as a...

© 2015 IBM Corporation IBM Analytics for Apache Spark (Spark as a Service) Arancha Ocaña IT Specialist, CTP Big Data and Spark. <<For questions about this presentation contact Arancha Ocaña [email protected]>


Page 1: IBM Analytics for Apache Spark (Spark as a Service)
files.meetup.com/7770922/Spark as a Service.pdf


Page 2: Timing is critical…

© 2016 IBM Corporation

Today’s apps must keep up with the speed of the app revolution.

Time to initial deployment:
- Core IT: ~ Weeks
- IaaS: ~ Days
- Platform as a Service (IBM Bluemix): ~ Minutes

Stack layers, each either Customer Managed or Service Provider Managed: Code, Data, Runtime, Middleware, OS, Virtualization, Servers, Storage, Networking

Platform as a Service
- Benefits: Set up environments and deploy apps very quickly. Infrastructure and platform are managed by the service provider (SP).
- Time Commitment: Minutes to set up and deploy. Focus on your apps and their data.

Page 3: How does Bluemix work?

Bluemix embraces Cloud Foundry as an open source Platform as a Service and extends it with IBM, third-party, and community-built services.

Page 4: IBM Analytics for Apache Spark, as a service

Only IBM brings strength in enterprise, scale, and a managed offering to the Spark market

Themes: Performant Architecture, Productive Workflows, Leverages Existing Investments, Continually Improving

- Fully managed and secured Spark environment, accessible on demand or via reserved instances
- In-memory architecture greatly reduces disk I/O
- 20-100x faster for common tasks
- Analytic workflows across a multitude of sources
- Simplified but powerful syntax (~5x less code)
- Integrates with SQL, Python, Scala, etc.
- No lock-in: 100% open source Spark
- Spark v1.6 available
- Continually updated to keep pace with the evolving Spark ecosystem
- Pay-as-you-go or Reserved deployment options

Page 5: IBM Analytics for Apache Spark – Personas & Practitioners

- Data Scientist: Derive insights which are immediately actionable with powerful Spark tools.
- Business Analyst: Self-service, rapid access to understanding of the business, without IT intervention.
- Application Developer: Integrate 100% open-standards Spark with any application, regardless of the platform.
- Data Engineer: Assemble data pipelines with ease to power interactive dashboards and services.

Accessible, Integrated, Powerful:
- >5,000 active users
- Available standalone, within platforms, & within solutions
- 10–100x faster in-memory processing

Page 6: Managed Service Ecosystem

Diagram labels: Client Environment (Apps, Data); IBM hosted, managed, secure environment (as a service); Env; Data; Request; Result; Bluemix Platform; Other managed cloud services + 3rd party tools

Access Points: Notebooks, and others to come (Spark Submit, REST API, Streaming)

Page 7: Analytics for Apache Spark – Blends Multiple Data Types, Sources, & Workloads

Data Sources (IBM Cloud, Public Cloud, Cloud Apps, On-Premises): BigInsights (HDFS), Cloudant (DBaaS), dashDB (Analytics), Swift (Object Storage), SQLDB (Managed DB2)

Components:
- Spark SQL: Execute SQL statements
- Spark Streaming: Streaming analytics via micro-batch
- MLlib (Machine Learning): M.L. and statistical algorithms
- GraphX (Graphing): Distributed graph processing framework
- Spark Core: General compute engine; basic I/O functions; task dispatching; scheduling
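The "streaming analytics via micro-batch" idea listed for Spark Streaming can be illustrated with a small pure-Python analogy. This is only a sketch of the concept, not the Spark Streaming API: the event shape `(timestamp, word)`, the function names, and the 1-second interval are all assumptions made for illustration. Incoming events are grouped into fixed time intervals, and each interval is then processed as an ordinary small batch.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical event shape: (timestamp in seconds, word)
Event = Tuple[float, str]


def micro_batches(events: Iterable[Event], interval: float) -> Dict[int, List[str]]:
    """Group events into consecutive batches, each covering `interval` seconds."""
    batches: Dict[int, List[str]] = defaultdict(list)
    for timestamp, word in events:
        # The batch index is the number of whole intervals elapsed at this timestamp.
        batches[int(timestamp // interval)].append(word)
    return dict(batches)


def count_words(batch: List[str]) -> Dict[str, int]:
    """Batch-style job applied to one micro-batch: count occurrences of each word."""
    counts: Dict[str, int] = defaultdict(int)
    for word in batch:
        counts[word] += 1
    return dict(counts)


# Illustrative input only; these events are made up for the sketch.
events = [(0.2, "spark"), (0.7, "bluemix"), (1.1, "spark"), (1.9, "spark"), (2.5, "notebook")]

# Process each 1-second micro-batch independently, in time order.
results = {
    index: count_words(batch)
    for index, batch in sorted(micro_batches(events, 1.0).items())
}

for index, counts in results.items():
    print(index, counts)
```

The design point is that the "stream" is never processed record by record; the same batch logic (`count_words` here) is simply rerun on each small slice of time.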

Page 8: Spark Application Architecture

- A Spark application is initiated from a driver program
- Spark execution modes:
  – Standalone with the built-in cluster manager
- Spark application execution via spark-submit.sh
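The split between a driver program and the workers it dispatches to can be sketched with Python's standard library. This is an analogy under stated assumptions, not Spark code: a thread pool stands in for the executors, and the names `split`, `task`, and `driver` are hypothetical. The driver partitions the data, hands one task per partition to the pool, and combines the partial results.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def split(data: List[int], partitions: int) -> List[List[int]]:
    """Split data into `partitions` roughly equal slices, one task per slice."""
    return [data[i::partitions] for i in range(partitions)]


def task(partition: List[int]) -> int:
    """Work done on one (simulated) executor: sum of squares of its partition."""
    return sum(x * x for x in partition)


def driver(data: List[int], executors: int = 2) -> int:
    """Driver role: plan the tasks, dispatch them, and combine partial results."""
    with ThreadPoolExecutor(max_workers=executors) as pool:
        partials = list(pool.map(task, split(data, executors)))
    return sum(partials)


total = driver(list(range(1, 11)))
print(total)  # sum of squares of 1..10 is 385
```

In a real Spark application the driver would create a SparkContext and a cluster manager would place the tasks on executors; the thread pool above only mimics that division of labor on a single machine.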

Page 9: Analytics for Apache Spark – Notebooks

- Interactive, unified, and collaborative Spark work environments built with Jupyter (IPython)
- Graphical user interface for executing and visualizing the results of Spark programs
- Accessible through the Web as in-browser documents
- Easily used by both business analysts and deeply technical programmers
- Bridge the gap between “concept” and “production” application, all within a single environment

Pages 10-11: Examples in Spark as a service


Page 14: Lessons learned

Running the Spark service and developing with a notebook

Pages 15-36: Demo

Page 37: IBM Analytics for Apache Spark – Editions

Architecture diagram labels: Spark Application (Driver Program) with SparkContext; Cluster Manager; Worker 1 … Worker n, running Spark Executor #1 … Spark Executor n; Swift (SoftLayer), AWS S3, HDFS, or other storage

- No additional charge for the master node – unlike competitors
- Permanent storage billed separately

Bare metal machine specs per Spark Executor:
  Memory         12.5 GB
  CPU            1 Core
  Temp Storage   20 GB

IBM Analytics for Apache Spark editions:

              Personal                                Reserved Enterprise
  Allowances  Access to interactive Spark notebooks   Access to interactive Spark notebooks
              Spark v1.4.1                            Spark v1.4.1
              2 Spark Executors per instance          30 Spark Executors per instance
  Sold Via    Bluemix                                 CDS Sales (SQO), PPA
  Price       € … per instance-hour                   € … per instance-month

. . .
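The per-executor specs and the executor counts per edition on this page can be combined into rough per-instance totals. The sketch below assumes that resources scale linearly with the number of executors and that the same per-executor specs apply to both editions; neither assumption is stated on the slide, and the names `ExecutorSpec` and `instance_totals` are hypothetical. Prices are left out because the slide elides them.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ExecutorSpec:
    """Per-executor specs as listed on the slide."""
    memory_gb: float = 12.5
    cores: int = 1
    temp_storage_gb: float = 20.0


def instance_totals(executors: int, spec: ExecutorSpec = ExecutorSpec()) -> Dict[str, float]:
    """Aggregate resources for one instance, assuming linear scaling per executor."""
    return {
        "memory_gb": executors * spec.memory_gb,
        "cores": executors * spec.cores,
        "temp_storage_gb": executors * spec.temp_storage_gb,
    }


personal = instance_totals(2)              # Personal: 2 Spark Executors per instance
reserved_enterprise = instance_totals(30)  # Reserved Enterprise: 30 Spark Executors per instance

print(personal)
print(reserved_enterprise)
```

Under these assumptions a Personal instance works out to 25 GB of memory, 2 cores, and 40 GB of temp storage, and a Reserved Enterprise instance to 375 GB, 30 cores, and 600 GB.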