WSO2Con EU 2016: An Introduction to the WSO2 Analytics Platform
Introduction to WSO2 Analytics Platform
Srinath Perera (@srinath_perera) VP – Research WSO2
A Day in Your Life
Value Proposition
Collect Data
§ One Sensor API to publish events
  - REST, Thrift, Java, JMS, Kafka
  - Java clients, JavaScript clients*
§ First you define streams (think of a stream as an infinite table in a SQL DB)
§ Then publish events via the Sensor API
“Publish once, process any way you like”
Collecting Data: Example
§ Java example: create and send events
§ Events are sent asynchronously
§ See the client given in http://goo.gl/vIJzqc for more info

Agent agent = new Agent(agentConfiguration);
publisher = new AsyncDataPublisher("tcp://hostname:7612", .. );
StreamDefinition definition = new StreamDefinition(STREAM_NAME, VERSION);
definition.addPayloadData("sid", STRING);
...
publisher.addStreamDefinition(definition);
...
Event event = new Event();
event.setPayloadData(eventData);
publisher.publish(STREAM_NAME, VERSION, event);

Slide callouts: Initialize Stream, Define Stream, Send events
Data Collection Examples
• Collect data from inbuilt agents in WSO2 products, Tomcat, etc.
• Collect your log data via Logstash
• Collect JVM and JMX stats via an agent
• Ingest data from message queues such as JMS or Kafka
• Pull data from an RSS feed, or scrape a web page
• Write a custom agent to collect data from your system and push it to DAS
Analysis: Batch Analytics
• Batch analytics reads data from a disk (or some other storage) and processes it record by record
• “MapReduce” is the most widely used technology for batch analytics
  – Apache Hadoop
  – Apache Spark (30X faster and much more flexible)
• Analytics (min, max, average, correlation, histograms; might join or group data in many ways)
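The record-by-record style of batch analytics described above can be sketched in a few lines of plain Java. This is only an illustration under assumptions: the class name `BatchStats` and the sample readings are hypothetical, and a real DAS deployment would run such aggregations through Spark rather than a local loop.

```java
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;

// Hypothetical batch job: scan stored readings one record at a time
// and keep running min, max and average.
class BatchStats {
    // Returns {min, max, average} for the given readings.
    static double[] summarize(double[] readings) {
        DoubleSummaryStatistics stats = new DoubleSummaryStatistics();
        for (double r : readings) {
            stats.accept(r); // one record at a time, as a batch scan would
        }
        return new double[] { stats.getMin(), stats.getMax(), stats.getAverage() };
    }

    public static void main(String[] args) {
        double[] speeds = { 30.0, 50.0, 40.0 }; // made-up sample data
        System.out.println(Arrays.toString(summarize(speeds)));
    }
}
```

The same single pass extends to histograms or correlation by adding more running accumulators.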
SQL like Queries: Spark SQL
§ Since many people understand SQL, Hive made large-scale data processing (Big Data) accessible to many
§ Expressive, short, and sweet
§ Defines core operations that cover 90% of problems
§ Lets experts dig in when they like (via User Defined Functions)!
insert overwrite table BusSpeed select getHour(ts) as hour, avg(v) as avgV, busID from BusStream group by busID, getHour(ts);
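To make the grouping in the query above concrete, here is a rough plain-Java equivalent that averages `v` per bus and per hour. It is a sketch under assumptions: the `Reading` class is a hypothetical stand-in for a row of `BusStream`, and `getHour` is assumed to return the UTC hour of day from an epoch-millisecond timestamp (the slide does not define the UDF).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class BusSpeedQuery {
    // Hypothetical row of the BusStream table.
    static class Reading {
        final String busID;
        final long ts;   // epoch milliseconds (assumed, non-negative)
        final double v;  // speed
        Reading(String busID, long ts, double v) {
            this.busID = busID;
            this.ts = ts;
            this.v = v;
        }
    }

    // Assumed behaviour of the getHour UDF: hour of day in UTC.
    static int getHour(long tsMillis) {
        return (int) ((tsMillis / 3_600_000L) % 24);
    }

    // Group by (busID, hour) and average v; keys look like "bus1@0".
    static Map<String, Double> averageSpeed(List<Reading> rows) {
        return rows.stream().collect(Collectors.groupingBy(
                (Reading r) -> r.busID + "@" + getHour(r.ts),
                TreeMap::new,
                Collectors.averagingDouble((Reading r) -> r.v)));
    }

    public static void main(String[] args) {
        List<Reading> rows = Arrays.asList(
                new Reading("bus1", 0L, 10.0),
                new Reading("bus1", 1_000L, 20.0),
                new Reading("bus1", 3_600_000L, 30.0),
                new Reading("bus2", 0L, 5.0));
        System.out.println(averageSpeed(rows));
    }
}
```

The SQL form stays one line while the Java form needs a class and a collector, which is the point the slide makes about SQL being "short and sweet".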
Use Case: ESB Analytics
▪ Detailed tracing (message level and mediator level) and data analysis via DAS
▪ Drill down by time, by services, and by mediators, then trace back to “invocation, activity etc”
▪ Visualize execution
▪ Find hot spots
The Value of Some Insights Degrades Fast!
§ For some use cases (e.g. stock markets, traffic, surveillance, patient monitoring) the value of insights degrades very quickly with time.
§ We need technology that can produce outputs fast
  § Static queries, but need very fast output (alerts, realtime control)
  § Dynamic and interactive queries (data exploration)
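The "static query, very fast output" case can be pictured as one fixed rule that is re-evaluated on every incoming event. The sketch below is plain Java, not the WSO2 CEP query language; the class name `SlidingWindowAlert`, the one-second window, and the threshold are all hypothetical choices for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical alert rule: fire when the average reading over a sliding
// time window exceeds a threshold.
class SlidingWindowAlert {
    private static final class Sample {
        final long ts;
        final double value;
        Sample(long ts, double value) { this.ts = ts; this.value = value; }
    }

    private final long windowMillis;
    private final double threshold;
    private final Deque<Sample> window = new ArrayDeque<>();
    private double sum = 0.0; // running sum of values currently in the window

    SlidingWindowAlert(long windowMillis, double threshold) {
        if (windowMillis <= 0) {
            throw new IllegalArgumentException("windowMillis must be positive");
        }
        this.windowMillis = windowMillis;
        this.threshold = threshold;
    }

    // Feed one event (timestamps must be non-decreasing).
    // Returns true if the windowed average is above the threshold.
    boolean onEvent(long ts, double value) {
        window.addLast(new Sample(ts, value));
        sum += value;
        // Evict samples that fell out of the window (ts - windowMillis, ts].
        while (window.peekFirst().ts <= ts - windowMillis) {
            sum -= window.removeFirst().value;
        }
        return sum / window.size() > threshold;
    }

    public static void main(String[] args) {
        SlidingWindowAlert alert = new SlidingWindowAlert(1000, 50.0);
        long[] times = { 0, 500, 1200, 1600 };
        double[] values = { 40, 70, 20, 90 };
        for (int i = 0; i < times.length; i++) {
            System.out.println(times[i] + " ms -> alert=" + alert.onEvent(times[i], values[i]));
        }
    }
}
```

Because the rule only touches the events inside the window, the answer is available as soon as each event arrives, which is what distinguishes this from a batch scan.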
See WSO2Con Experian Talk (Digital Marketing) - see video
Real Time Analytics
People Tracking via BLE
• Track people through BLE via triangulation
• Higher level logic via Complex Event Processing
• Traffic monitoring
• Smart retail
• Airport management
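As a rough illustration of the position-finding step in BLE tracking, the sketch below estimates a 2D position from distances to three beacons (strictly speaking trilateration, which is an assumption about what the slide means by "triangulation"). The class name and coordinates are hypothetical, and the distances are treated as exact, whereas real BLE signal-strength estimates are noisy and would need filtering.

```java
// Hypothetical 2D trilateration: find (x, y) from three beacons at known
// positions and the measured distance to each one.
class Trilateration {
    static double[] locate(double x1, double y1, double r1,
                           double x2, double y2, double r2,
                           double x3, double y3, double r3) {
        // Subtracting the circle equations pairwise gives two linear equations:
        //   a*x + b*y = c   and   d*x + e*y = f
        double a = 2 * (x2 - x1);
        double b = 2 * (y2 - y1);
        double c = r1 * r1 - r2 * r2 - x1 * x1 + x2 * x2 - y1 * y1 + y2 * y2;
        double d = 2 * (x3 - x2);
        double e = 2 * (y3 - y2);
        double f = r2 * r2 - r3 * r3 - x2 * x2 + x3 * x3 - y2 * y2 + y3 * y3;

        double det = a * e - b * d;
        if (Math.abs(det) < 1e-12) {
            throw new IllegalArgumentException("Beacons are collinear");
        }
        // Cramer's rule for the 2x2 system.
        return new double[] { (c * e - b * f) / det, (a * f - c * d) / det };
    }

    public static void main(String[] args) {
        // Beacons at (0,0), (10,0), (0,10); the person stands at (3,4).
        double[] p = locate(0, 0, 5.0,
                            10, 0, Math.sqrt(65.0),
                            0, 10, Math.sqrt(45.0));
        System.out.println(p[0] + ", " + p[1]);
    }
}
```

The resulting position events would then feed the higher-level CEP logic mentioned in the list above.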
Case Study: Realtime Soccer Analysis
Watch at: https://www.youtube.com/watch?v=nRI6buQ0NOM
CEP Queries on top of Storm
▪ Accepts CEP queries with hints about how to partition streams
▪ Partitions the streams, builds an Apache Storm topology running CEP nodes as Storm Spouts, and runs it. See http://goo.gl/pP3kdX for more info.
Interactive Analytics
§ The best way to explore data is by asking ad-hoc questions
§ Interactive Analytics (Search) lets you query the system and receive fast results (<10s)
§ Shows data in context (e.g. by grouping events from the same transaction together)
§ Built using Lucene-based indexes
SparkSQL> SELECT * FROM TWITTER_DATA
Predictive Analytics
§ Can you “write a program to drive a car”?
§ Machine learning
  § Takes in a lot of examples, and builds a program that matches those examples
  § We call that program a “model”
§ Lots of tools
  - R (statistical language)
  - Python (Pandas, sklearn, Theano, TensorFlow)
  - Apache Spark’s MLBase and Apache Mahout (Java)
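The idea of "taking in examples and building a program that matches them" can be shown with about the simplest possible model, a least-squares straight line. This is a toy sketch, not how WSO2 Machine Learner or any of the listed tools work internally; the class name `LinearModel` and the training data are hypothetical.

```java
// Toy "model": a straight line y = slope * x + intercept fitted to examples
// by ordinary least squares.
class LinearModel {
    final double slope;
    final double intercept;

    private LinearModel(double slope, double intercept) {
        this.slope = slope;
        this.intercept = intercept;
    }

    // "Training": derive the model parameters from example (x, y) pairs.
    static LinearModel fit(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("Need at least two (x, y) examples");
        }
        double meanX = 0, meanY = 0;
        for (int i = 0; i < x.length; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= x.length;
        meanY /= y.length;

        double sxy = 0, sxx = 0;
        for (int i = 0; i < x.length; i++) {
            sxy += (x[i] - meanX) * (y[i] - meanY);
            sxx += (x[i] - meanX) * (x[i] - meanX);
        }
        if (sxx == 0) {
            throw new IllegalArgumentException("All x values are identical");
        }
        double slope = sxy / sxx;
        return new LinearModel(slope, meanY - slope * meanX);
    }

    // "Prediction": apply the learned program to an unseen input.
    double predict(double x) {
        return slope * x + intercept;
    }

    public static void main(String[] args) {
        LinearModel model = fit(new double[] { 1, 2, 3, 4 }, new double[] { 3, 5, 7, 9 });
        System.out.println("prediction for x=5: " + model.predict(5));
    }
}
```

Nobody wrote the rule "multiply by two and add one" by hand here; it was recovered from the examples, which is the sense in which the fitted object is a "model".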
Predictive Analytics in DAS
• Building models
  – With the WSO2 Machine Learner product via a Wizard (powered by MLlib)
  – Build models using R and export them as PMML
• Built models can be used with both WSO2 CEP and ESB
Predict Wait Time in the Airport
• Predicting the time to go through the airport
• Real-time updates and events to passengers
• Lets the airport manage by allocating resources
Anomaly Detection
• Have things changed?
  – Anomalies by value through “Clustering”
  – Anomalies through time using Markov Chains
• Detect problems and drill in to find details
• Available as a solution
  1. https://www.youtube.com/watch?v=aLwG4thHOXg
  2. http://wso2.com/analytics/solutions/fraud-and-anomaly-detection-solution/
  3. http://wso2.com/whitepapers/fraud-detection-and-prevention-a-data-analytics-approach/
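One way to picture "anomalies through time using Markov Chains" is to learn how often each state follows another in normal traffic and then flag transitions that were rarely or never seen. The sketch below is a minimal first-order version in plain Java and is not taken from the WSO2 solution; the class name, state names, and probability cut-off are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical first-order Markov chain over named states.
class MarkovAnomaly {
    private final Map<String, Map<String, Integer>> counts = new HashMap<>();
    private final Map<String, Integer> totals = new HashMap<>();

    // Count every observed transition state[i] -> state[i + 1].
    void train(String[] sequence) {
        for (int i = 0; i + 1 < sequence.length; i++) {
            counts.computeIfAbsent(sequence[i], k -> new HashMap<>())
                  .merge(sequence[i + 1], 1, Integer::sum);
            totals.merge(sequence[i], 1, Integer::sum);
        }
    }

    // Estimated P(to | from); 0.0 if "from" was never seen in training.
    double probability(String from, String to) {
        int total = totals.getOrDefault(from, 0);
        if (total == 0) {
            return 0.0;
        }
        return counts.get(from).getOrDefault(to, 0) / (double) total;
    }

    // A transition is flagged when its learned probability is below the cut-off.
    boolean isAnomalous(String from, String to, double minProbability) {
        return probability(from, to) < minProbability;
    }

    public static void main(String[] args) {
        MarkovAnomaly model = new MarkovAnomaly();
        model.train(new String[] { "login", "browse", "pay", "login", "browse", "browse", "pay" });
        System.out.println("login -> pay anomalous? " + model.isAnomalous("login", "pay", 0.05));
    }
}
```

A production detector would also need smoothing for unseen states and a way to score whole sequences, which this sketch leaves out.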
Anomaly Detection: API Analytics
▪ Includes API statistics
▪ Realtime anomaly detection
▪ Alert dashboard
▪ Examples
  ▪ API health monitoring
  ▪ Abnormal response times
  ▪ Request pattern detection
  ▪ Detecting access from new or rarely used host names
  ▪ Abnormal tier usage and tier limit crossings
IoT Analytics ≅ Anomalies
• Visualizing and detecting conditions about moving dots
• Visualizing, finding problems, drilling down, and doing root cause analysis on equipment networks
• Doing anomaly detection on time series data
Detailed discussion in Thinking Deeply about IoT Analytics, https://iwringer.wordpress.com/2015/10/15/thinking-deeply-about-iot-analytics/
Communicate: Dashboards
• Dashboards give an “overall idea” at a glance (e.g. car dashboard)
  – Boring when everything is good!!
• Build your own dashboard
  – WSO2 DAS supports a gadget generation Wizard
  – You can write your own gadgets using D3 and JavaScript
Gadget Generation Wizard
• Starts with data in tabular format
• Map each column to a dimension in your plot, like X, Y, color, point size, etc.
• Create a chart with a few clicks
Powered by the VizGrammar library, which uses Vega underneath (see https://github.com/wso2/VizGrammar)
IoT Analytics
▪ IoT Server collects data from devices and sends it to DAS
▪ Includes detailed device management analytics
▪ Device providers can write their analytics templates
▪ Device developers can also create self-service charts with zero coding through the “Gadget Generation Wizard”
Communicate: Alerts
▪ Done with CEP queries
▪ Last mile (email, SMS, push notifications to a UI, pager, trigger a physical alarm)
▪ E.g. API Manager Alert Dashboard
Analytics Templates
▪ Parameterized analytics queries written as templates (MACRO)
▪ End users can create new instances by filling in a form
▪ API analytics use cases are done this way!
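The template idea above, a query with named holes that an end user fills in through a form, can be sketched with simple placeholder substitution. This is an illustration only: the `${name}` placeholder syntax, the class name `QueryTemplate`, and the query text are hypothetical and are not the actual WSO2 template format.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical macro-style template: replaces ${name} placeholders with
// values supplied by the user (for example, from a form).
class QueryTemplate {
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{(\\w+)\\}");

    static String instantiate(String template, Map<String, String> params) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = params.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("Missing parameter: " + m.group(1));
            }
            // quoteReplacement keeps '$' and '\' in the value literal.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> form = new HashMap<>();
        form.put("stream", "ApiStats");   // made-up stream name
        form.put("limit", "500");         // made-up threshold in ms
        String template = "from ${stream} where responseTime > ${limit}";
        System.out.println(instantiate(template, form));
    }
}
```

A real template engine would also validate the supplied values before splicing them into a query, since raw substitution like this is open to injection.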
Key Differentiators
▪ Open Source, under Apache 2 license
▪ Publish data once, analyze it any way you like
▪ Flexible packaging or as a scalable cluster
▪ Rich, extensible, SQL-like configuration language
▪ Compact, easy-to-learn syntax addressing complex requirements, such as time windows, patterns, and sequences, which would be complex to develop in a programming language such as Java
▪ Rich set of data connectors
Thank You!
#WSO2ConEU
Share your feedback for this session: wso2con.com/app