Transcript of WSO2Con USA 2017: Discover Data That Matters: Deep Dive into WSO2 Analytics

Creating real-time, intelligent, actionable business insights and data products

Collect Data
- Define schema for data
- Receive events

Analyze
- Real-time analytics with Siddhi
- Incremental and batch analytics with Spark SQL
- Intelligence with machine learning

Communicate
- Alerts
- Dashboards
- Interactive queries
- API

Experian delivers a digital marketing platform in which CEP plays a key role, analyzing customer behavior in real time and offering targeted promotions. CEP was chosen after careful analysis, primarily for its openness, its open source nature, the fact that support is driven by engineers, and the availability of a complete middleware platform, integrated with CEP, for additional use cases.

Eurecat is the innovation center of Catalunya, Spain. It uses CEP to analyze data from iBeacons deployed within department stores, offering instant rebates to users or sending them help if they appear to be "stuck" in a shop area. They chose WSO2 for its real-time processing, the variety of IoT connectors available, the extensible framework, and the rich configuration language. They also use WSO2 ESB in conjunction with WSO2 CEP.

Pacific Controls is an innovative company delivering an IoT platform of platforms: Galaxy 2021. The platform manages all kinds of devices within a building and makes automated decisions, such as moving an elevator or starting the air conditioning when certain conditions are met. Within Galaxy 2021, CEP is used for monitoring alarms and specific conditions. Pacific Controls also uses other products from the WSO2 platform, such as WSO2 ESB and WSO2 Identity Server.

A leading airline uses CEP to enhance customer experience by calculating the average time for passengers to reach their boarding gate (going through security, walking, etc.). They also track the time it takes to clean a plane, in order to streamline the boarding process and notify both the airline and customers about potential delays. They evaluated WSO2 CEP first, as they were already using our platform, and adopted it because it addressed all their requirements.

WSO2 won the Data in Motion Hack Week with AWS and Geovation, providing an impressive solution that takes data from many modes of transport, overlays passenger flow/train loading and pollution data, and allows users to plan a route based on how busy their stations and routes are, while also taking air quality into account.

DEBS (Distributed Event Based Systems) Challenge on smart home electricity data: 2,000 sensors, 40 houses, 4 billion events. We posted the fastest single-node solution measured (400K events/sec) and close to one million events/sec in distributed throughput. The WSO2 CEP-based solution was one of the four finalists, and the only generic solution to become a finalist.

A solution was built to search, visualize, and analyze healthcare records (HL7) across 20 hospitals in Italy, in combination with WSO2 ESB.

A food supply company in the USA detects anomalies such as delivery delays, provides personalized notifications, and makes order recommendations based on order history.

+ Pluggable custom receivers

{
  'name': 'TemperatureStream',
  'version': '1.0.0',
  'metaData': [
    {'name': 'sensorID', 'type': 'STRING'}
  ],
  'correlationData': [],
  'payloadData': [
    {'name': 'temperature', 'type': 'DOUBLE'},
    {'name': 'pressure', 'type': 'DOUBLE'}
  ]
}

Event

StreamID: TemperatureStream:1.0
Timestamp: 1487270220419
sensorID: AP234
temperature: 23.5
pressure: 94.2
SourceIP: 168.50.24.2

+ Support for arbitrary key-value pairs

Schema


define stream <event stream> (<attribute> <type>, <attribute> <type>, ...);

from <event stream>
select <attribute>, <attribute>, ...
insert into <event stream>;

define stream SoftDrinkSales (region string, brand string, quantity int, price double);

from SoftDrinkSales
select brand, quantity
insert into OutputStream;

Output Streams are inferred

from SoftDrinkSales
select brand, avg(price*quantity) as avgCost, 'USD' as currency
insert into AvgCostStream;

from AvgCostStream
select brand, toEuro(avgCost) as avgCost, 'EURO' as currency
insert into OutputStream;

Enriching Streams

Using Functions

from SoftDrinkSales[region == 'USA' and quantity > 99]
select brand, price, quantity
insert into WholeSales;

from SoftDrinkSales#window.time(1 hour)
select region, brand, avg(quantity) as avgQuantity
group by region, brand
insert into LastHourSales;

Filtering

Aggregation over 1 hour

Other supported window types: timeBatch(), length(), lengthBatch(), etc.
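As a sketch of one of the other window types, a length-batch window emits its aggregate once per batch of N events rather than over a time period (the output stream name here is illustrative):

```siddhi
from SoftDrinkSales#window.lengthBatch(100)
select region, brand, sum(quantity) as totalQuantity
group by region, brand
insert into Per100EventSales;
```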

define stream Purchase (price double, cardNo long, place string);

from every (a1 = Purchase[price < 10]) -> a2 = Purchase[price > 10000 and a1.cardNo == a2.cardNo]
within 1 day
select a1.cardNo as cardNo, a2.price as price, a2.place as place
insert into PotentialFraud;

define stream StockStream (symbol string, price double, volume int);

partition with (symbol of StockStream)
begin
from t1=StockStream, t2=StockStream[(t2[last] is null and t1.price < price) or (t2[last].price < price)]+
within 5 min
select t1.price as initialPrice, t2[last].price as finalPrice, t1.symbol
insert into IncreasingMyStockPriceStream
end;

define table CardUserTable (name string, cardNum long);

@from(eventtable = 'rdbms', datasource.name = 'CardDataSource', table.name = 'UserTable', caching.algorithm = 'LRU')
define table CardUserTable (name string, cardNum long)


Supported for RDBMS, in-memory, Analytics Table, and in-memory data grid (Hazelcast) backends.

from Purchase#window.length(1) join CardUserTable
on Purchase.cardNo == CardUserTable.cardNum
select Purchase.cardNo as cardNo, CardUserTable.name as name, Purchase.price as price
insert into PurchaseUserStream;

from CardUserStream
select name, cardNo as cardNum
update CardUserTable
on CardUserTable.name == name;

Similarly, insert into and delete are also supported!
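As a sketch, a delete on the same event table could look like this (the CardRemovalStream name is illustrative; it is assumed to carry a cardNum attribute):

```siddhi
from CardRemovalStream
delete CardUserTable
on CardUserTable.cardNum == cardNum;
```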

Join


from SalesStream
select brand, custom:toUSD(price, currency) as priceInUSD
insert into OutputStream;

Custom functions are referred to with namespaces


+ Pluggable custom publishers


from DataStream#ml:predict("/home/user/ml.model", "double")
select *
insert into PredictionStream;


Data Abstraction Layer


[Diagram: incremental processing — per-second (1s) summaries are combined into minute (1m), hour (1h), and day (1d) aggregations, with CEP producing the fine-grained real-time summaries and Spark handling the coarser rollups]


All you need is two nodes. Scale based on your needs!
