Introduction to WSO2 Analytics Platform


Srinath Perera, VP Research

WSO2 Inc.

Analytics is Growing Up

▪It is no longer about doing your first analytics use case.

▪It is about

- How to do it every day, efficiently?

- How to recover?

- How to make decisions?

- How to do other forms like real-time, interactive, and predictive analytics?

Analytics 2.0 Platform

▪One platform for all four forms of analytics

▪Single consistent programming model

▪One analytics archive format

▪Support for the lifecycle of analytics apps

Integrate well with the rest of the enterprise!

Collect Data

▪One Sensor API to publish events

- REST, Thrift, JMS, Kafka

- Java clients, JavaScript clients*

▪First you define streams (think of a stream as an infinite table in a SQL DB)

▪Then send events via the Sensor API

▪Can send to the batch pipeline, the realtime pipeline, or both via configuration!

Collecting Data: Example

Java example: create and send events. Events are sent asynchronously. See the client given in http://goo.gl/vIJzqc for more info.

// Initialize Agent
Agent agent = new Agent(agentConfiguration);
publisher = new AsyncDataPublisher("tcp://hostname:7612", .. );

// Define Stream
StreamDefinition definition = new StreamDefinition(STREAM_NAME, VERSION);
definition.addPayloadData("sid", STRING);
...
publisher.addStreamDefinition(definition);
...

// Send events
Event event = new Event();
event.setPayloadData(eventData);
publisher.publish(STREAM_NAME, VERSION, event);

Analysis: Batch Analytics

Complex Event Processing

Analytics Logic with SQL-like Queries

▪Both BAM and CEP provide a SQL-like data processing language

▪Since many people understand SQL, these languages make large-scale data processing (Big Data) accessible to many

▪Expressive, short, and sweet.

▪Define core operations that cover 90% of problems

▪Lets experts dig in when they like! (via User Defined functions)
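To make the "core operations" idea concrete, here is a minimal sketch in plain Java of what a typical SQL-like streaming query does: a filter followed by an average over a sliding window of the last few matching events. It does not use any WSO2 API; the class name, method name, and sample readings are hypothetical and exist only for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration only: plain Java, not a WSO2 BAM/CEP API.
class WindowedAverageSketch {

    // Keeps readings above `threshold` (the filter), then emits the average
    // of the last `windowSize` kept readings each time a reading is kept.
    static List<Double> filteredWindowAverage(double[] readings, double threshold, int windowSize) {
        Deque<Double> window = new ArrayDeque<>();
        List<Double> averages = new ArrayList<>();
        double sum = 0.0;
        for (double reading : readings) {
            if (reading <= threshold) {
                continue; // filter step: drop events that do not match
            }
            window.addLast(reading);
            sum += reading;
            if (window.size() > windowSize) {
                sum -= window.removeFirst(); // slide the window forward
            }
            averages.add(sum / window.size());
        }
        return averages;
    }

    public static void main(String[] args) {
        double[] readings = {10, 40, 20, 50, 60};
        // prints [40.0, 45.0, 55.0]
        System.out.println(filteredWindowAverage(readings, 30, 2));
    }
}
```

In a SQL-like language the same logic is usually a single short statement, which is the point the slide makes about expressiveness.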

Scaling CEP Queries on top of Storm

▪Accepts CEP queries with hints about how to partition streams

▪Partition streams, build an Apache Storm topology running CEP nodes as Storm Spouts, and run it. (see http://goo.gl/pP3kdX)
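The partitioning hint can be pictured as follows. This is a minimal sketch in plain Java, assuming a simple hash of the partition key; it does not use the Storm or WSO2 CEP APIs, and the names `workerFor` and `partition` as well as the sample keys are hypothetical. The property that matters is that all events with the same key land on the same worker, so each worker can evaluate its query independently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: not the Storm or WSO2 CEP API.
class StreamPartitionSketch {

    // Maps a partition key to a worker index in [0, workers).
    // floorMod keeps the index non-negative even for negative hash codes.
    static int workerFor(String key, int workers) {
        return Math.floorMod(key.hashCode(), workers);
    }

    // Groups event keys into one bucket per worker.
    static List<List<String>> partition(List<String> keys, int workers) {
        List<List<String>> buckets = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            buckets.add(new ArrayList<>());
        }
        for (String key : keys) {
            buckets.get(workerFor(key, workers)).add(key);
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("house-1", "house-2", "house-1", "house-3");
        System.out.println(partition(keys, 2));
    }
}
```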

Predictive Analytics

▪Predictive Analytics learns a decision function (a model) using examples

- Is this fraud?

- How to drive?

- Handwritten text

▪Build models and use them with WSO2 CEP, BAM and ESB using the WSO2 Machine Learner product (2015 Q3)

▪Build models using R, export them as PMML, and use them within WSO2 CEP
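As a toy illustration of "learning a decision function from examples", the sketch below trains a one-dimensional nearest-mean classifier on labeled transaction amounts. This is not how WSO2 Machine Learner or any PMML model works internally; the algorithm was chosen only because it is short, and the class name, method names, and sample data are all hypothetical.

```java
// Hypothetical illustration only: a tiny nearest-mean classifier,
// not the WSO2 Machine Learner API or a PMML model.
class NearestMeanSketch {
    private final double meanNegative; // mean amount of non-fraud examples
    private final double meanPositive; // mean amount of fraud examples

    private NearestMeanSketch(double meanNegative, double meanPositive) {
        this.meanNegative = meanNegative;
        this.meanPositive = meanPositive;
    }

    // "Training": compute the mean amount of each class from the examples.
    static NearestMeanSketch train(double[] amounts, boolean[] isFraud) {
        if (amounts.length != isFraud.length) {
            throw new IllegalArgumentException("amounts and labels must have the same length");
        }
        double sumPos = 0, sumNeg = 0;
        int nPos = 0, nNeg = 0;
        for (int i = 0; i < amounts.length; i++) {
            if (isFraud[i]) { sumPos += amounts[i]; nPos++; }
            else            { sumNeg += amounts[i]; nNeg++; }
        }
        if (nPos == 0 || nNeg == 0) {
            throw new IllegalArgumentException("need at least one example of each class");
        }
        return new NearestMeanSketch(sumNeg / nNeg, sumPos / nPos);
    }

    // The learned decision function: pick the class whose mean is closer.
    boolean predict(double amount) {
        return Math.abs(amount - meanPositive) < Math.abs(amount - meanNegative);
    }

    public static void main(String[] args) {
        double[] amounts = {10, 20, 30, 900, 1000, 1100};
        boolean[] labels = {false, false, false, true, true, true};
        NearestMeanSketch model = train(amounts, labels);
        System.out.println(model.predict(950)); // prints true
        System.out.println(model.predict(25));  // prints false
    }
}
```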

WSO2 Machine Learner

▪A wizard to sample, explore, and understand data through visualizations

▪A wizard to configure, train machine learning models, and select the best model

▪Find and use those models with WSO2 CEP, BAM and ESB

▪Powered by Apache Spark MLLib

Communicate: Dashboards

▪The idea is to give an “overall idea” at a glance (e.g. a car dashboard)

▪Support for personalization: you can build your own dashboard.

▪Also the entry point for drill down

▪How to build?

- Dashboard via Google Gadget and content via HTML5 + JavaScript

- Use charting libraries like Vega or D3

Communicate: Alerts

▪Detecting conditions can be done via CEP queries

▪Key is the “Last Mile”

- Email

- SMS

- Push notifications to a UI

- Pager

- Trigger a physical alarm

▪How?

- Select the Email sender “Output Adaptor” from CEP, or send from CEP to ESB; ESB has a lot of connectors

Communicate: APIs

▪With mobile apps, most data is exposed and shared as APIs (REST/JSON) to end users.

▪Need to expose analytics results as APIs

▪Following are some challenges

- Security and permissions

- API discovery

- Billing, throttling, quotas & SLA

▪How?

- Write data to a database from CEP event tables

- Build services via WSO2 Data Service

- Expose them as APIs via API Manager

Event Stream Store

▪One-stop place for all event stream definitions

▪Let users

- Publish and consume through multiple protocols like REST, JMS, Thrift, Web Sockets, etc.

- Discover event streams

- Enforce security and authorization

- Per-pay subscriptions

- Effectively an Event Stream Marketplace!

▪This will automate the API creation discussed in the previous slide.

What is it good for?

▪Batch Analytics

▪Realtime Streaming analytics

▪Realtime Interactive analytics

▪Lambda Architecture

▪Train and use a ML model

▪Selective Detailed Analysis

http://tinybuddha.com/blog/a-simple-technique-to-solve-problems-before-they-get-bigger/

Selective Detailed Analysis

• Too expensive to do detailed analysis on all the data

• Instead detect the condition, and dig into related data

• Fraud toolbox

• Other use cases

– Dynamic offers at retail site

– Weather
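The two-step pattern above (detect cheaply, then dig into related data only) can be sketched in plain Java as follows. This is an illustration under assumptions, not WSO2 code: the `Txn` type, the amount limit used as the "condition", and the method names are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only: not a WSO2 API.
class SelectiveAnalysisSketch {

    static final class Txn {
        final String account;
        final double amount;
        Txn(String account, double amount) {
            this.account = account;
            this.amount = amount;
        }
    }

    // Step 1: cheap check on every event; flag accounts that trip the condition.
    static Set<String> detect(List<Txn> txns, double limit) {
        Set<String> flagged = new LinkedHashSet<>();
        for (Txn t : txns) {
            if (t.amount > limit) {
                flagged.add(t.account);
            }
        }
        return flagged;
    }

    // Step 2: collect related data only for flagged accounts, so the
    // expensive detailed analysis runs on a small subset.
    static Map<String, List<Double>> drillDown(List<Txn> txns, Set<String> flagged) {
        Map<String, List<Double>> related = new LinkedHashMap<>();
        for (Txn t : txns) {
            if (flagged.contains(t.account)) {
                related.computeIfAbsent(t.account, k -> new ArrayList<>()).add(t.amount);
            }
        }
        return related;
    }

    public static void main(String[] args) {
        List<Txn> txns = Arrays.asList(
                new Txn("a", 10), new Txn("b", 5000), new Txn("a", 20), new Txn("b", 30));
        Set<String> flagged = detect(txns, 1000);
        System.out.println(drillDown(txns, flagged)); // prints {b=[5000.0, 30.0]}
    }
}
```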

Lambda Architecture

• Same code in both batch and realtime layers

• Idea is to fill the time between two batch runs

• Batch layer writes the data to a DB

• Realtime layer merges with batch data via Event Tables
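The merge step can be pictured with a small plain-Java sketch. It assumes the batch layer has produced per-key counts up to the last batch run and the realtime layer holds counts for events seen since then; a `HashMap` stands in for the DB and the Event Table, and the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only: HashMaps stand in for the DB written by
// the batch layer and for the realtime layer's state.
class LambdaMergeSketch {

    // Combines the batch view with the counts accumulated in realtime since
    // the last batch run, giving an up-to-date answer between batch runs.
    static Map<String, Long> merge(Map<String, Long> batchView, Map<String, Long> realtimeDelta) {
        Map<String, Long> merged = new HashMap<>(batchView);
        realtimeDelta.forEach((key, count) -> merged.merge(key, count, Long::sum));
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Long> batchView = new HashMap<>();
        batchView.put("api-1", 100L);
        batchView.put("api-2", 50L);

        Map<String, Long> realtimeDelta = new HashMap<>();
        realtimeDelta.put("api-1", 3L);
        realtimeDelta.put("api-3", 1L);

        System.out.println(merge(batchView, realtimeDelta));
    }
}
```

When the next batch run completes, the realtime counts would be reset, since the new batch view already covers those events.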

Real Life Use Cases

▪Health, Smart Parking solutions

▪Financial Monitoring

▪Smart City project, Vehicle tracking, Building monitoring

▪Railway monitoring

▪Throttling and Anomaly Detection

▪API Analytics (13+ customers)

▪Connected Car

Case Study: DEBS Grand Challenges

▪DEBS (Distributed Event Based Systems) Grand Challenge is a yearly event processing challenge.

▪2014 Challenge:

- Smart Home electricity data: 2000 sensors, 40 houses, 4 billion events. We posted 400K events/sec, and close to one million distributed throughput with 4 nodes.

- One of the four finalists

▪2015 Challenge:

- Based on taxi activities collected from New York City over the year 2013: 14,144 taxis, 173 million taxi trip records. We posted 300K/sec on a single node and were one of the finalists.

https://www.flickr.com/photos/shedboy/3681317392/

Case Study: Realtime Soccer Analysis

Watch at: https://www.youtube.com/watch?v=nRI6buQ0NOM

Case Study: TFL Traffic Analysis

Built using TFL (Transport for London) open data feeds.

http://goo.gl/04tX6k

http://goo.gl/9xNiCm

Select the Product

Product | Features

WSO2 Data Analytics Server (DAS) | Everything: Batch, Realtime, Interactive, and Predictive Analytics

WSO2 Complex Event Processor (CEP) | Realtime Analytics only

WSO2 Machine Learner | Predictive Analytics only

Questions?

Thank You