CEP - simplified streaming architecture - Strata Singapore 2016

© 2016 MapR Technologies
CEP - A Simplified Enterprise Architecture for Real-time Stream Processing
Mathieu Dumoulin, Data Engineer ([email protected], @lordxar)

Transcript of CEP - simplified streaming architecture - Strata Singapore 2016

Page 1: CEP - simplified streaming architecture - Strata Singapore 2016

CEP - A Simplified Enterprise Architecture for Real-time Stream Processing
Mathieu Dumoulin, Data Engineer ([email protected], @lordxar)

Page 2: CEP - simplified streaming architecture - Strata Singapore 2016

Mathieu Dumoulin

• Living in Tokyo, Japan for the last 3 years
• Data Engineer for MapR Professional Services
• Other jobs: Data Scientist, Search Engineer
• Connect with me:
  – Read my blog posts: https://www.mapr.com/blog/author/mathieu-dumoulin
  – Twitter: @Lordxar
  – Email: [email protected]

Page 3: CEP - simplified streaming architecture - Strata Singapore 2016

Content Summary

1. Complex Event Processing
2. Streaming Architecture
3. Rules Engines for CEP
4. Simplified Hadoop-based CEP Architecture
5. Live Demo
6. Does it scale?
7. Conclusion

Page 4: CEP - simplified streaming architecture - Strata Singapore 2016

Complex Event Processing (CEP)

Some terminology:

• Event: Data with a timestamp (a log event, a transaction, ...)
• Event processing: Tracking and analyzing streaming event data
• Complex event processing: Identifying meaningful events and responding to them as quickly as possible, usually over a sliding window on the stream of event data

CEP is just a fancy way to do business rules on streaming data.
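
To make the terminology concrete, here is a minimal Python sketch of an event and a time-based sliding window. The `Event` class, the `SlidingWindow` helper, the "login_failed" event kind and the 10-second window are all illustrative assumptions, not part of any particular CEP product.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """An event is just data with a timestamp (seconds, hypothetical unit)."""
    timestamp: float
    kind: str


class SlidingWindow:
    """Keeps only the events seen during the last `size` seconds.

    Assumes events arrive in timestamp order.
    """

    def __init__(self, size: float) -> None:
        self.size = size
        self.events: deque = deque()

    def add(self, event: Event) -> None:
        self.events.append(event)
        # Evict events that have fallen out of the window.
        while self.events and self.events[0].timestamp <= event.timestamp - self.size:
            self.events.popleft()

    def count(self, kind: str) -> int:
        return sum(1 for e in self.events if e.kind == kind)


window = SlidingWindow(size=10.0)
for ts, kind in [(0, "login_failed"), (4, "login_failed"), (12, "login_failed")]:
    window.add(Event(ts, kind))

# By the time the t=12 event arrives, the t=0 event has left the window.
failed_logins = window.count("login_failed")
```

A business rule such as "alert when there are 2 or more failed logins in 10 seconds" then becomes a simple comparison on `failed_logins`.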

Page 5: CEP - simplified streaming architecture - Strata Singapore 2016

IoT: Needs some CEP in There Somewhere

Page 6: CEP - simplified streaming architecture - Strata Singapore 2016

CEP in Action

The power of CEP comes from being able to detect complex situations that could not be detected from any individual event on its own.

[Figure: sensor events: window opened, motion sensor, light turned on, door opened]
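
As a rough illustration of detecting a situation that no single sensor reading reveals, the Python sketch below combines several sensor events. The rule itself (motion within 60 seconds of a window opening, with no door opened in between) and the event names are invented for this example; they are not taken from the talk.

```python
from typing import List, Tuple

SensorEvent = Tuple[float, str]  # (timestamp in seconds, event name)


def suspicious_entries(events: List[SensorEvent], max_gap: float = 60.0) -> List[float]:
    """Return timestamps of motion that follows a window opening within
    `max_gap` seconds when no door was opened in between (hypothetical rule)."""
    alerts = []
    window_opened_at = None
    for ts, name in sorted(events):
        if name == "window_opened":
            window_opened_at = ts
        elif name == "door_opened":
            # Someone came in through the door: reset the pattern.
            window_opened_at = None
        elif name == "motion" and window_opened_at is not None:
            if ts - window_opened_at <= max_gap:
                alerts.append(ts)
            window_opened_at = None
    return alerts


normal_evening = [(0, "door_opened"), (5, "motion"), (8, "light_on")]
break_in = [(100, "window_opened"), (130, "motion")]
```

Each event in `break_in` is harmless on its own; only the sequence, within a time bound, is meaningful.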

Page 7: CEP - simplified streaming architecture - Strata Singapore 2016

Actually, CEP Has Been Around For a While

Taken from the March 2010 issue of the Dutch Java Magazine (source)

Page 8: CEP - simplified streaming architecture - Strata Singapore 2016

Technology Has Been Holding Rule Engines Back

• Rule engines are not new
  – First papers from the 90’s, many implementations in the early 2000’s
• The engine runs in-memory on a single node
  – A few GB of memory (or less) was a severe limitation
  – A single-core CPU can only do so much
• Need modern stream messaging (Kafka, MapR Streams)
  – Need persistence
  – Need speed
• No standard, no dominant sponsor
  – 90’s and early 2000’s dominated by Microsoft
  – OSS had not come of age in enterprise IT

Page 9: CEP - simplified streaming architecture - Strata Singapore 2016

CEP in a Modern Enterprise Data Pipeline

Source: Oracle / Rittman Mead Information Management Reference Architecture

Page 10: CEP - simplified streaming architecture - Strata Singapore 2016

Modern Streaming Architecture

• Build flexible systems
  – More efficient and easier to build
  – Decouples dependencies
• Better model the way business processes take place
• More value now
  – Aggregates data from many sources once
  – Serves data to one or many projects immediately
• More value later
  – Run batch analytics on the data later
  – Reprocess the data with different algorithms later

Page 11: CEP - simplified streaming architecture - Strata Singapore 2016

Kafka-esque Messaging for Rule Engines

• Stream persistence is a key feature
• CEP is only one use case
  – Support batch analytics and ad-hoc analysis from the same data stream
• Compensate for current rule engine limitations
  – Enables hot replacement for fault tolerance
  – Enables simple horizontal scaling by partitioning data and rules
• Convergence
  – Run this use case on your existing, standard, big data technology
  – Use OSS frameworks and open APIs
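
One way to picture why stream persistence enables hot replacement: because messages stay in the stream, a replacement consumer can rebuild the in-memory state of a failed one by replaying from an earlier offset. The Python sketch below uses an in-memory list as a stand-in for a persisted topic; `PersistedStream` and `CountingConsumer` are hypothetical names and do not correspond to the Kafka or MapR Streams client APIs.

```python
from typing import Dict, List


class PersistedStream:
    """Stand-in for a persisted topic: an append-only list addressed by offset."""

    def __init__(self) -> None:
        self._log: List[str] = []

    def append(self, message: str) -> int:
        self._log.append(message)
        return len(self._log) - 1  # offset of the new message

    def read_from(self, offset: int) -> List[str]:
        return self._log[offset:]


class CountingConsumer:
    """Hypothetical rule-engine host whose in-memory state is a per-key count."""

    def __init__(self) -> None:
        self.counts: Dict[str, int] = {}
        self.next_offset = 0

    def poll(self, stream: PersistedStream) -> None:
        for message in stream.read_from(self.next_offset):
            self.counts[message] = self.counts.get(message, 0) + 1
            self.next_offset += 1


stream = PersistedStream()
for msg in ["red", "blue", "red"]:
    stream.append(msg)

primary = CountingConsumer()
primary.poll(stream)

# The primary fails; a replacement rebuilds the same state by replaying
# the stream from offset 0.
replacement = CountingConsumer()
replacement.poll(stream)
```

The same persisted messages can also be read by unrelated consumers, for example a batch analytics job, without affecting the rule engine.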

Page 12: CEP - simplified streaming architecture - Strata Singapore 2016

“Most CEP in IoT [...] is custom coded [...] rather than [using a] general purpose stream platform.”
Roy Schulte, vice president, Gartner

See: Complex Event Processing and The Future Of Business Decisions by David Luckham and W. Roy Schulte

Page 13: CEP - simplified streaming architecture - Strata Singapore 2016

Custom Coded CEP: The Good and The Bad

The Good:
• Made to order with a modern framework
• “No limit” to potential for performance and scalability
• Fit to purpose technology

The Bad:
• Engineers aren’t business domain experts
• Lots of work to build from scratch every time
• Changes to logic are a pain point (from the business side)
• Lack of available talent/organizational capability

Page 14: CEP - simplified streaming architecture - Strata Singapore 2016

Declarative Makes Sense For Business

Manage complex behavior through simple rules working together, executed by a rules engine.

Page 15: CEP - simplified streaming architecture - Strata Singapore 2016

An Open Source Rule Engine: Drools

Drools is a business rule management system (BRMS) with a forward and backward chaining inference based rules engine.

• Project homepage: http://www.drools.org/
• Developer: Red Hat
• Enterprise supported version available
  – JBoss Enterprise BRMS
• Enhanced implementation of the Rete algorithm
  – A state of the art algorithm for rules engines
• Has a GUI Rules Editor: Workbench

Page 16: CEP - simplified streaming architecture - Strata Singapore 2016

An Open Source Rule Engine:

[Figure: rule engine components: Domain Expert, Rules Editor, Production Memory (Rules), Working Memory (Facts), Pattern Matcher, Agenda, Actions]
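
The components named on this slide can be sketched as a toy forward-chaining engine in Python. This is a brute-force illustration only: it does not implement Rete, and `TinyRuleEngine`, `Rule` and the overheating rule are hypothetical names made up for the example. The method name `fire_all_rules` is loosely inspired by the Drools session method `fireAllRules()`, but nothing here is the Drools API.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Fact = Dict[str, Any]


@dataclass
class Rule:
    name: str
    condition: Callable[[Fact], bool]  # the "when" part
    action: Callable[[Fact], str]      # the "then" part


class TinyRuleEngine:
    def __init__(self, rules: List[Rule]) -> None:
        self.production_memory = rules        # rules
        self.working_memory: List[Fact] = []  # facts

    def insert(self, fact: Fact) -> None:
        self.working_memory.append(fact)

    def fire_all_rules(self) -> List[str]:
        # Pattern matching: pair every rule with every fact it matches.
        agenda = [
            (rule, fact)
            for rule in self.production_memory
            for fact in self.working_memory
            if rule.condition(fact)
        ]
        # Execute the activations on the agenda, in order.
        return [rule.action(fact) for rule, fact in agenda]


rules = [
    Rule(
        name="overheating",
        condition=lambda f: f["type"] == "temperature" and f["value"] > 90,
        action=lambda f: f"ALERT: sensor {f['sensor']} is overheating",
    )
]
engine = TinyRuleEngine(rules)
engine.insert({"type": "temperature", "sensor": "s1", "value": 95})
engine.insert({"type": "temperature", "sensor": "s2", "value": 40})
results = engine.fire_all_rules()
```

The rules are plain data passed to the engine, which is what lets a domain expert change them through an editor without touching the engine code.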

Page 17: CEP - simplified streaming architecture - Strata Singapore 2016

CEP in Drools: Stateful Session and Sliding Window

[Figure: two sessions, both labeled STATELESS Session]

Rule: Is the ball red?

Rule: Are there 2+ red balls in the last 4 balls I’ve seen?

Page 18: CEP - simplified streaming architecture - Strata Singapore 2016

CEP in Drools: Stateful Session + Sliding Window

[Figure: two sessions, labeled STATEFUL Session and STATELESS Session]

Rule: Is the ball red?

Rule: Are there 2+ red balls in the last 4 balls I’ve seen?
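
The difference between the two rules can be sketched in plain Python (this is not Drools rule syntax). The first rule needs only the current ball, so it is stateless; the second needs memory of the last 4 balls, which is what a stateful session with a sliding window provides. The names `is_red` and `RedBallWindow` are made up for this illustration.

```python
from collections import deque
from typing import Deque, List


def is_red(ball: str) -> bool:
    """Stateless rule: looks at one ball only."""
    return ball == "red"


class RedBallWindow:
    """Stateful rule: remembers the last 4 balls across calls."""

    def __init__(self, size: int = 4, threshold: int = 2) -> None:
        # A deque with maxlen drops the oldest ball automatically.
        self.last_balls: Deque[str] = deque(maxlen=size)
        self.threshold = threshold

    def observe(self, ball: str) -> bool:
        self.last_balls.append(ball)
        return sum(1 for b in self.last_balls if is_red(b)) >= self.threshold


session = RedBallWindow()
stream = ["red", "blue", "blue", "red", "blue", "blue", "blue"]
fired: List[bool] = [session.observe(ball) for ball in stream]
# fired == [False, False, False, True, False, False, False]
```

The rule fires only on the fourth ball, when both red balls are inside the window; once the first red ball slides out, it stops firing.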

Page 19: CEP - simplified streaming architecture - Strata Singapore 2016

Streaming Architecture for CEP

[Figure: Sensors (real-time data) → Producer → Distributed Cluster (Kafka, MapR) → Consumer Server (edge node, cluster node) → Integrate with other systems]

Page 20: CEP - simplified streaming architecture - Strata Singapore 2016

The Case for CEP on Streaming Architecture

• Decouple rules maintenance from code and infrastructure
  – Manage the cluster separately
  – The application code may need only minimal maintenance
• Rules maintenance in the hands of the business domain experts
  – Easily supports multiple projects & teams
• Data is persisted in the stream (input and output)
  – Open to new use cases
• Send data back to the stream
  – Integrate with other downstream use cases

Page 21: CEP - simplified streaming architecture - Strata Singapore 2016

But Does It Scale? Yes, But Only to a Point

• Drools and other rule engines are in-memory and the memory is not distributed
  – This is only a technical limitation that can be overcome (Ex: Alluxio, Apache Ignite)
• Streams make it easy to provide reasonable fault-tolerance and quick disaster recovery
• Run multiple servers, split rules logically, fan out data into multiple topics
• A single session can handle 100K+ events/sec. How much scale is needed?
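
A minimal sketch of the fan-out idea, assuming events carry a key such as a sensor ID: a stable hash routes every event for a given key to the same partition, so each rule-engine server sees a consistent slice of the data. The partition count, the sensor IDs and the choice of CRC32 are assumptions made for this example; real stream platforms ship their own partitioners.

```python
import zlib
from collections import defaultdict
from typing import Dict, List, Tuple

NUM_PARTITIONS = 3  # hypothetical: one rule-engine server per partition

KeyedEvent = Tuple[str, float]  # (sensor id, reading)


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Stable hash so that the same key always lands on the same partition."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def fan_out(events: List[KeyedEvent]) -> Dict[int, List[KeyedEvent]]:
    """Group events by partition, preserving arrival order within each one."""
    partitions: Dict[int, List[KeyedEvent]] = defaultdict(list)
    for sensor_id, value in events:
        partitions[partition_for(sensor_id)].append((sensor_id, value))
    return partitions


events = [("sensor-a", 1.0), ("sensor-b", 2.0), ("sensor-a", 3.0), ("sensor-c", 4.0)]
partitions = fan_out(events)
```

Because a key never moves between partitions, a sliding-window rule for one sensor can be evaluated on a single server without shared memory.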

Page 22: CEP - simplified streaming architecture - Strata Singapore 2016

Live Demo: Smart City Traffic Management

Page 23: CEP - simplified streaming architecture - Strata Singapore 2016

● Try out integration with Spark Streaming and Flink

● Run serious performance benchmarks

● Deploy into production

Page 24: CEP - simplified streaming architecture - Strata Singapore 2016

Recap

• It’s not Rule Engine vs. Spark and Flink Stream processing
  – It’s Rules + Stream Processing
  – Spark, Flink, and Java are just implementation choices
• Focus on business value from applying rules to data
  – Think of benefits of SQL vs. Java, C++, Scala, …
• Great use case for a Streaming Architecture and microservices

An in-depth blog post on this talk topic will be available on the MapR blog: https://www.mapr.com/blog/author/mathieu-dumoulin

Page 25: CEP - simplified streaming architecture - Strata Singapore 2016

Suggested Reading

● Get Ted & Ellen’s book and many more for free:
  ○ https://www.mapr.com/ebooks/
● More great blog content about CEP and IoT applications
  ○ Eric Bruno on LinkedIn
  ○ Karzel et al. on InfoQ

Page 26: CEP - simplified streaming architecture - Strata Singapore 2016

Q & A

Engage with us!
@mapr
@lordxar
[email protected]
mapr-technologies