WSO2 Big Data Platform and Applications

WSO2 Big Data Platform and Applications Srinath Perera Director, Research, WSO2 Inc. Visiting Faculty, University of Moratuwa Member, Apache Software Foundation Research Scientist, Lanka Software Foundation

Transcript of WSO2 Big Data Platform and Applications

Page 1: WSO2 Big Data Platform and Applications

WSO2 Big Data Platform and Applications

Srinath Perera

Director, Research, WSO2 Inc.

Visiting Faculty, University of Moratuwa

Member, Apache Software Foundation

Research Scientist, Lanka Software Foundation

Page 2: WSO2 Big Data Platform and Applications
Page 3: WSO2 Big Data Platform and Applications

What Can We Do with Big Data?

Optimize (the world is inefficient)
o 30% of food is wasted from farm to plate
o GE 1% initiative (http://goo.gl/eYC0QE)
- A 1% saving in trains can save 2B/year
- 1% in US healthcare is 20B/year
- In contrast, Sri Lanka's total exports are 9B/year

Save lives
o Weather, disease identification, personalized treatment

Technology advancement
o Most high-tech research is done via simulations

Page 4: WSO2 Big Data Platform and Applications

Big Data Architecture

Page 5: WSO2 Big Data Platform and Applications

Big Data Processing Technologies

Page 6: WSO2 Big Data Platform and Applications

WSO2 Analytics Platform

Page 7: WSO2 Big Data Platform and Applications

Big Data Analytics Offering

Page 8: WSO2 Big Data Platform and Applications

Combined Power

Users can send events to both BAM and CEP via the same APIs

CEP can combine output from batch processing and data from various storage (e.g. databases) with real-time processing
o e.g. implementing the Lambda Architecture
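Combining batch output with real-time results, as described above, is the core idea of the Lambda Architecture: a batch view and a real-time view are merged when a query is answered. The following is a minimal sketch in plain Python, not BAM or CEP code; the function names and the event fields are hypothetical and chosen only for illustration.

```python
from collections import Counter

def batch_view(historical_events):
    """Batch layer: recompute per-user counts from all stored events."""
    return Counter(e["userID"] for e in historical_events)

def realtime_view(recent_events):
    """Speed layer: counts for events not yet covered by the batch run."""
    return Counter(e["userID"] for e in recent_events)

def query(user_id, batch, realtime):
    """Serving layer: merge both views to answer a query."""
    return batch[user_id] + realtime[user_id]

# Hypothetical sample data
historical = [{"userID": "alice"}, {"userID": "bob"}, {"userID": "alice"}]
recent = [{"userID": "alice"}]

batch = batch_view(historical)
realtime = realtime_view(recent)
print(query("alice", batch, realtime))
print(query("carol", batch, realtime))
```

In a real deployment the batch view would be recomputed periodically and the real-time view would only cover events that arrived since the last batch run.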

Page 9: WSO2 Big Data Platform and Applications

Highly Pluggable Architecture

Page 10: WSO2 Big Data Platform and Applications

WSO2 CEP

Page 11: WSO2 Big Data Platform and Applications

WSO2 BAM

● Powered by Apache Hadoop with management and queries using Apache Hive

● Parallel, distributed processing based on the MapReduce programming model

● Runs on a local Hadoop node or can be delegated to a cluster of Hadoop nodes

● Scalable script-based analytics written using an easy-to-learn, SQL-like query language.

Analyzer Engine

Hadoop Cluster

Data Store (Cassandra/RDBMS)
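The MapReduce programming model mentioned above splits a job into a map step that emits key/value pairs, a shuffle that groups values by key, and a reduce step that aggregates each group. The sketch below runs those three steps in a single Python process purely as an illustration; it is not Hadoop or Hive code, and the record layout and function names are assumptions.

```python
from collections import defaultdict

def map_phase(record):
    """Map: emit a (key, value) pair for one input record."""
    service_id, response_ms = record
    yield service_id, response_ms

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: aggregate the values of one key (here, the average)."""
    return key, sum(values) / len(values)

def run_job(records):
    """Run map, shuffle, and reduce in sequence over the input records."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

# Hypothetical (serviceID, response time in ms) records
records = [("orders", 120), ("orders", 80), ("billing", 50)]
averages = run_job(records)
print(averages)
```

A SQL-like script expresses the same job as a grouped average, which is why a query language on top of MapReduce can hide these steps from the user.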

Page 12: WSO2 Big Data Platform and Applications

High Level Languages

For both batch and real-time, we provide structured, SQL-like query languages.
o No Java programming is required

Lowers the adoption entry point

BAM
o Relies on Apache Hive

CEP
o Implemented through our own solution, Siddhi.

Page 13: WSO2 Big Data Platform and Applications

CEP Operators with Siddhi

Event table: (Map a database as an event stream)

Filter: (Process a single transaction)

Windows: (Track a window of events)

define stream RequestStream (correlationID string, serviceID string, userID string, tier string, requestTime long, ...);

define table BlacklistedUserTable (userID string, time long, requestCount long);

from RequestStream[tier == 'BRONZE']#window.time(1 min)
select userID, requestTime as time, count(correlationID) as requestCount
group by userID
having requestCount > 5
insert into BlacklistedUserTable;
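To make the behavior of the query above concrete, here is a rough Python sketch of the same three operators: a filter on the BRONZE tier, a sliding one-minute time window, and a per-user count checked against a threshold. It is plain Python rather than Siddhi, it assumes events arrive in timestamp order, and it does not claim to reproduce Siddhi's exact output semantics; the class and method names are hypothetical.

```python
from collections import Counter, deque

WINDOW_MS = 60_000  # 1 min window, as in the query
THRESHOLD = 5       # mirrors "having requestCount > 5"

class BronzeThrottle:
    def __init__(self):
        self.window = deque()    # (requestTime, userID) pairs inside the window
        self.counts = Counter()  # per-user request count inside the window
        self.blacklist = []      # rows of (userID, time, requestCount)

    def on_event(self, user_id, tier, request_time):
        # Filter: only BRONZE requests are considered
        if tier != "BRONZE":
            return
        # Window: expire events that are older than one minute
        while self.window and self.window[0][0] <= request_time - WINDOW_MS:
            _, old_user = self.window.popleft()
            self.counts[old_user] -= 1
        self.window.append((request_time, user_id))
        self.counts[user_id] += 1
        # Having + insert: record the user once the count exceeds the threshold
        if self.counts[user_id] > THRESHOLD:
            self.blacklist.append((user_id, request_time, self.counts[user_id]))

throttle = BronzeThrottle()
for i in range(7):
    throttle.on_event("u1", "BRONZE", 1_000 * i)  # one request per second
throttle.on_event("u2", "GOLD", 7_000)            # ignored by the filter
print(throttle.blacklist)
```

The list stands in for the event table; in the deck's setup that table maps to a database, so other queries can join against it.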

Page 14: WSO2 Big Data Platform and Applications

Smart Home

DEBS (Distributed Event Based Systems) is a premier academic conference, which posts a yearly event processing challenge (http://www.cse.iitb.ac.in/debs2014/?page_id=42)

Smart Home electricity data: 2000 sensors, 40 houses, 4 billion events

We posted the fastest single-node solution measured (400K events/sec) and close to one million events/sec in distributed throughput.

The WSO2 CEP-based solution is one of the four finalists (with Dresden University of Technology, Fraunhofer Institute, and Imperial College London)

Only generic solution to become a finalist

Page 15: WSO2 Big Data Platform and Applications

Healthcare Data Monitoring

Allows users to search, visualize, and analyze healthcare records (HL7) across 20 hospitals in Italy

Used in combination with WSO2 ESB and BAM

Custom toolbox tailored to the customer's requirements (to replace an existing system)

Page 16: WSO2 Big Data Platform and Applications

Cloud IDE Analytics

Custom solution created in partnership with Codenvy to bring analytics to the Codenvy management team and its customers

Developed in less than a month, with a custom plug-in for MongoDB.

Deployed on the codenvy.com platform.

Page 17: WSO2 Big Data Platform and Applications

Case Study: Real-time Soccer Analysis

Watch at: https://www.youtube.com/watch?v=nRI6buQ0NOM

Page 18: WSO2 Big Data Platform and Applications

Additional Customer Use Cases

Used in healthcare and parking monitoring
o See "Solution patterns based approach to rapidly create IoE solutions across industries", http://us14.wso2con.com/videos/#Coumara-Radja

Used by a large-scale IoT system provider for use cases including vehicle tracking, smart city, and building monitoring (CEP)
o See "Internet of Big Things: The Story of Pacific Controls", http://us14.wso2con.com/videos/#Sajaad-Chaudry

Transaction monitoring in a large bank (CEP)

Knowledge mining and tracking prospective customers through natural language data sources (CEP)

CEP embedded in edge devices
o See WSO2Con 2013 Keynote: "Emerging Foundations of Next-Generation Business Systems", https://www.youtube.com/watch?v=7CyG3JKUxWw

Throttling and anomaly detection by a group of telecom companies

Page 19: WSO2 Big Data Platform and Applications

Extensions and Toolboxes

Fraud and Anomaly Detection Toolbox (static rules, statistical outliers, Markov chains)

Time Series Toolbox

Natural Language Processing Plugin (entity extraction, POS tagging, sentiment analysis)

GIS Toolbox (geo fencing, tracking, speed alarms)

Running machine learning models exported as PMML with CEP (e.g. from R)

Video Monitoring with OpenCV

For more info, see http://wso2.com/library/articles/2014/08/wso2-cep-in-action-an-analysis-of-use-in-real-world-applications-of-different-domains/
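Of the techniques listed for the Fraud and Anomaly Detection Toolbox, statistical outliers are the simplest to illustrate. The sketch below flags values that lie more than k sample standard deviations from the mean. It is a generic textbook approach written in plain Python, not the toolbox's implementation; the threshold, function name, and sample amounts are assumptions.

```python
from statistics import mean, stdev

def find_outliers(values, k=3.0):
    """Return the values more than k sample standard deviations from the mean."""
    if len(values) < 2:
        return []  # the sample standard deviation needs at least two points
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mu) / sigma > k]

# Hypothetical transaction amounts with one suspicious value
amounts = [20, 22, 19, 21, 20, 23, 18, 21, 20, 22, 19, 500]
print(find_outliers(amounts, k=2.5))
```

In a streaming setting the mean and deviation would typically be maintained over a window of recent events rather than over a fixed list.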

Page 20: WSO2 Big Data Platform and Applications

Geo Fencing and Tracking Toolbox
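At its simplest, geo fencing means testing whether a reported position falls inside a region. The sketch below uses a circular fence and the haversine great-circle distance. It is plain Python for illustration only, not code from the toolbox; the coordinates, radius, and function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def inside_fence(position, centre, radius_m):
    """True if the position lies within radius_m metres of the fence centre."""
    return haversine_m(*position, *centre) <= radius_m

fence_centre = (6.9271, 79.8612)  # hypothetical fence centre
near = inside_fence((6.9275, 79.8615), fence_centre, 500)  # a few tens of metres away
far = inside_fence((7.2906, 80.6337), fence_centre, 500)   # tens of kilometres away
print(near, far)
```

A tracking or speed alarm could build on the same distance function by dividing the distance between consecutive positions by the elapsed time.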

Page 21: WSO2 Big Data Platform and Applications

IoT Demos and Use Cases

SolidCon Demo, http://wso2.com/library/articles/2014/09/demonstration-on-architecture-of-internet-of-things-an-analysis/

IoT Reference Architecture, http://wso2.com/landing/internet-of-things-uk-2014/

Internet of Big Things: The Story of Pacific Controls, http://us14.wso2con.com/videos/#Sajaad-Chaudry

Federated Identity for IoT with OAuth, http://www.infoq.com/presentations/federated-identity-IoT-OAuth

Page 22: WSO2 Big Data Platform and Applications

Sentiment Analysis Demo

Analyzing sentiment for the FIFA Twitter hashtag

Page 23: WSO2 Big Data Platform and Applications

Work in Progress

Page 24: WSO2 Big Data Platform and Applications

Predictive Analytics

Page 25: WSO2 Big Data Platform and Applications

Leveraging Apache Storm in CEP

Page 26: WSO2 Big Data Platform and Applications

BAM Enhancements

Work is underway to switch to Apache Spark and to support Shark SQL-like queries in BAM
o Faster queries
o Keeping the SQL-like language

Use "Hive on Spark" for migration purposes

Lower the adoption point of BAM by packaging an RDBMS by default instead of Cassandra.
o Architecture already scales from small deployments to Big Data

Page 27: WSO2 Big Data Platform and Applications

Questions?

Page 28: WSO2 Big Data Platform and Applications

Business Model