Lego-like building blocks of Storm and Spark Streaming Pipelines
dataworks-summithadoop-summit -
Lego-Like Building Blocks of Storm and Spark Streaming Pipelines
For Rapid IOT and Streaming Analytics App Development
Speakers: Anand Venugopal, Punit Shah
Approach to this presentation
• Sharing our learnings and best practices from various Streaming Implementations
• Fairly simple concept, certainly not rocket science, but we do hope there are some interesting ideas for you.
• Illustrating with a specific tool, but you are free to implement the same concepts any way you like
IOT and Streaming Analytics is HOT
30-50B Devices
USD 661.74 Billion
Use Cases for Streaming Analytics
VERTICALS
• Store, Warehouse operations – Retail
• Predictive Maintenance – Manufacturing, Oil & Gas
• Clinical Care and Patient Management – Healthcare, Clinical
• Sensor Analytics – IOT, Manufacturing, Others
• Fleet Operations – Transportation, Logistics
• Fraud and Anomaly Detection – IT Security, Financial Services
• Gaming Analytics – Entertainment, Gaming
• Churn Analytics – Telecom, Banking, Retail
• Network Traffic Analysis and Optimization – Telco
• Internet Advertising – Retail, e-commerce
Use Cases for Streaming Analytics
HORIZONTALS
• Customer Experience
• Clickstream Analytics
• Context-sensitive Offers and Recommendations
• IT Log Analytics
• Security
• Business Activity Monitoring
Use Cases for Streaming Analytics
COMBO
• Internet of Things
• Mobile App Analytics
• Call Center Monitoring and Analytics
Adoption Pattern of IOT and Streaming Analytics
Department 1, Department 2, Department 3, Department 4
With Scale – we need a centralized efficient approach
Department 1, Department 2, Department 3, Department 4
CENTRALIZED APPROACH
• Unified multi-tenant visual platform
• Collaborative re-use of components
Three levels of re-use
• Functions: e.g. ETL functions (date/string/object/integer manipulations)
• Operators: e.g. Kafka Channel, Write_to_HDFS, time-based aggregation
• Pipelines: the highest level of abstraction, the lego-like building blocks
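The three levels can be illustrated with a small sketch in plain Python. This is not the tool shown in the talk: the names `normalize_device_id`, `map_field`, `keep_if`, and `pipeline`, and the record fields, are all hypothetical, and ordinary generators stand in for a streaming engine.

```python
from typing import Callable, Dict, Iterable, Iterator

Record = Dict[str, object]
Operator = Callable[[Iterable[Record]], Iterator[Record]]

# Level 1: a function, a small reusable transformation on a single value.
def normalize_device_id(raw: str) -> str:
    return raw.strip().upper()

# Level 2: operators, reusable steps that consume and produce a stream of records.
def map_field(field: str, fn: Callable) -> Operator:
    def run(records):
        for rec in records:
            yield {**rec, field: fn(rec[field])}
    return run

def keep_if(predicate: Callable[[Record], bool]) -> Operator:
    def run(records):
        return (rec for rec in records if predicate(rec))
    return run

# Level 3: a pipeline, a chain of operators that is itself a reusable block.
def pipeline(*operators: Operator) -> Operator:
    def run(records):
        stream = iter(records)
        for op in operators:
            stream = op(stream)
        return stream
    return run

# Hypothetical "ingest and clean" block built from the pieces above.
ingest_and_clean = pipeline(
    map_field("device", normalize_device_id),
    keep_if(lambda rec: rec["temp_c"] is not None),
)

events = [
    {"device": " pump-7 ", "temp_c": 81.5},
    {"device": "pump-9", "temp_c": None},
]
cleaned = list(ingest_and_clean(events))
print(cleaned)
```

Because a pipeline has the same shape as an operator here, a finished pipeline can be passed to `pipeline(...)` again, which is one way to read the lego analogy.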
Re-usable stream processing patterns as pipelines
• Ingest: pre-processing, cleanup, de-duplication, re-sequencing, filters
• Classification/Routing: pass on instantly to different downstream subscribers
• Instant anomaly detection: security breaches, fraud, costly failure scenarios
• Rules-based alerting: customer-defined rules, notifications and triggers
• Enrichment: get key fields from the stream, dip into one or more master DBs, create an aggregate record
• Time window calculations: counters, statistics
• Visualization block for raw and derived data
• Data storage: a) batch up and write data into HDFS/HBase etc.; b) instantly write data into an indexing store
• Specific predictive model blocks
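Two of these patterns, time window calculations and rules-based alerting, can be sketched as follows. This is a minimal illustration under stated assumptions: it processes a finite list rather than a live stream, the `(timestamp, key, value)` event shape and the names `tumbling_window_stats` and `alerts` are hypothetical, and a real engine would also handle late or out-of-order events.

```python
from collections import defaultdict
from statistics import mean

def tumbling_window_stats(events, window_seconds):
    """Group (timestamp, key, value) events into fixed windows and summarize each."""
    buckets = defaultdict(list)
    for ts, key, value in events:
        # Align each event to the start of its fixed-size window.
        window_start = int(ts // window_seconds) * window_seconds
        buckets[(window_start, key)].append(value)
    return {
        k: {"count": len(v), "mean": mean(v), "max": max(v)}
        for k, v in sorted(buckets.items())
    }

def alerts(stats, rules):
    """Apply per-key threshold rules (assumed user-configured) to window statistics."""
    fired = []
    for (window_start, key), summary in stats.items():
        limit = rules.get(key)
        if limit is not None and summary["max"] > limit:
            fired.append((window_start, key, summary["max"]))
    return fired

events = [
    (0, "pump-7", 70.0),
    (12, "pump-7", 95.0),
    (61, "pump-7", 72.0),
    (65, "pump-9", 40.0),
]
stats = tumbling_window_stats(events, window_seconds=60)
fired = alerts(stats, rules={"pump-7": 90.0})
print(stats)
print(fired)
```

Keeping the rules as plain data (`{"pump-7": 90.0}`) rather than code is what would let a customer change thresholds without touching the pipeline itself.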
Connect pre-built pipelines to build an app
• Pipeline blocks: Ingest/Filter/Classify; Anomaly Detection; Alerting; Action Triggers; Index, Visualize; Time Window Statistics; Persist, Visualize
• Engine labels on the blocks: Low Latency Engine (three blocks), Micro-batch engine (two blocks)
• Rapid Development
• Best Engine for the task
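One way to picture connecting pre-built blocks is to have each block read from a named topic and publish to another. The sketch below is an assumption-laden toy: the `Bus` class is a hypothetical in-memory stand-in for a message bus, the topic names and thresholds are made up, and nothing here reflects how the demonstrated tool wires pipelines internally.

```python
from collections import defaultdict, deque

class Bus:
    """Toy in-memory stand-in for a message bus with named topics."""

    def __init__(self):
        self.subscribers = defaultdict(list)
        self.pending = deque()

    def subscribe(self, topic, block):
        self.subscribers[topic].append(block)

    def publish(self, topic, record):
        self.pending.append((topic, record))

    def run(self):
        # Deliver queued records until no block emits anything further.
        while self.pending:
            topic, record = self.pending.popleft()
            for block in self.subscribers[topic]:
                for out_topic, out_record in block(record):
                    self.publish(out_topic, out_record)

# Each block takes one record and returns a list of (topic, record) outputs.
def ingest_filter_classify(rec):
    return [] if rec.get("temp_c") is None else [("clean", rec)]

def anomaly_detection(rec):
    return [("anomalies", rec)] if rec["temp_c"] > 90 else []

alerts_sent, indexed = [], []

def alerting(rec):
    alerts_sent.append(rec["device"])
    return []

def index_visualize(rec):
    indexed.append(rec["device"])
    return []

bus = Bus()
bus.subscribe("raw", ingest_filter_classify)
bus.subscribe("clean", anomaly_detection)
bus.subscribe("clean", index_visualize)
bus.subscribe("anomalies", alerting)

for rec in [
    {"device": "pump-7", "temp_c": 95.0},
    {"device": "pump-9", "temp_c": None},
    {"device": "pump-3", "temp_c": 60.0},
]:
    bus.publish("raw", rec)
bus.run()
print(indexed, alerts_sent)
```

Since blocks only share topic names, each one could in principle run on a different engine (low latency or micro-batch) without the others needing to know.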
Dynamic Routing
A-B Testing, Champion Challenger, Hot Swap
Ingest/Filter/Classify
MODEL 1
MODEL 2
UI CONFIGURABLE DYNAMIC ROUTING RULES
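A minimal sketch of rule-driven routing between two models is shown below. The percentage-based rule format, the two threshold "models", and the functions `choose_model` and `route` are hypothetical illustrations, not the routing mechanism of the demonstrated tool; the rules are kept as plain data to mimic something a UI could edit.

```python
import zlib

# Hypothetical stand-ins for a champion and a challenger model.
def model_1(rec):
    return rec["temp_c"] > 90

def model_2(rec):
    return rec["temp_c"] > 85

MODELS = {"model_1": model_1, "model_2": model_2}

# Routing rules as data: percent of traffic per model (assumed format).
routing_rules = {"model_1": 80, "model_2": 20}

def choose_model(key, rules):
    """Deterministically map a record key to a model according to the percentages."""
    bucket = zlib.crc32(key.encode("utf-8")) % 100
    upper = 0
    for name, percent in rules.items():
        upper += percent
        if bucket < upper:
            return name
    raise ValueError("routing percentages must sum to 100")

def route(rec, rules):
    name = choose_model(rec["device"], rules)
    return name, MODELS[name](rec)

rec = {"device": "pump-7", "temp_c": 88.0}
print(route(rec, routing_rules))  # A-B split: the model depends on the key's hash bucket

# Hot swap: send all traffic to model_2 by changing only the rules.
hot_swap_rules = {"model_1": 0, "model_2": 100}
print(route(rec, hot_swap_rules))
```

Hashing the key keeps a given device on the same model between calls, which makes champion/challenger comparisons easier to interpret than a random split per record.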
DEMO
Thank you