Five Early Challenges Of Building Streaming Fast Data Applications


Transcript of Five Early Challenges Of Building Streaming Fast Data Applications

Page 1: Five Early Challenges Of Building Streaming Fast Data Applications

Craig Blitz, Senior Product Director at Lightbend

WEBINAR

Five Early Challenges Of Building Streaming Fast Data Applications

Page 2: Five Early Challenges Of Building Streaming Fast Data Applications

Why does fast matter?

• Recommendation Engines
• Automation
• Competitive Advantage

Page 3: Five Early Challenges Of Building Streaming Fast Data Applications

Fast Data: Opportunity Meets Necessity

[Timeline figure: Apache Hadoop and streaming milestones, with year markers at 2005, 2008, 2011, 2014, and 2017. Milestones shown: early use of MapReduce and Hadoop; Hadoop 1.0; Spark 0.5; Spark 1.0; Spark 2.0 (Structured Streaming, MLlib); Akka Streams; Flink 1.0; Kafka Streams; Apache Beam 0.6; Apache Beam 2.0 (!!); and a closing "?".]

[Chart: Growth in Mobile Data Traffic 2009-2020. Source: Carrier & Public Wi-Fi, July 2015, Mobile Experts LLC]

Page 4: Five Early Challenges Of Building Streaming Fast Data Applications

Growth in Streaming Traffic Coincides with Microservices and Cloud Native Apps

Microservices interest over time (2004-2017)

Page 5: Five Early Challenges Of Building Streaming Fast Data Applications

What is an integrated Fast Data Platform?

• A solution that ties together fast data components, microservices, cluster management, application/service lifecycle management, and support.

Messaging

Microservices

Streaming Services

Persistence

Management

Monitoring

Page 6: Five Early Challenges Of Building Streaming Fast Data Applications

Lots of Innovation, but Maturity Lags

• Innovation within components
• Solution comprises many components
• Components supported by different companies
• Aspects of SDLC remain tricky

Page 7: Five Early Challenges Of Building Streaming Fast Data Applications

Survey Says...

A currently open Lightbend survey looks at Fast Data and related topics. Preliminary results from the first 1,200 respondents:

• 86% said they are dealing with more data than in the past.
• More than half are scrambling to process data more quickly.
• The majority today process data once daily or intraday.
• The majority are in production, or piloting for production, with microservices.
• What's tough about Fast Data: technology choice, implementation, and scale.

Page 8: Five Early Challenges Of Building Streaming Fast Data Applications

Challenge #1

Choosing Among Alternative Frameworks

Page 9: Five Early Challenges Of Building Streaming Fast Data Applications

Hadoop

• Excels at low-cost, scalable batch analytics
• Data Warehouse Replacement
• Less suitable for real-time (streaming)

Page 10: Five Early Challenges Of Building Streaming Fast Data Applications

Streaming Engines – So many to choose from!

Kafka Streams
• Kafka Library
• Consume, Produce Kafka Topics
• Pull model instead of Async + Backpressure
• Useful for stateful stream processing

Akka Streams
• Low-latency Complex Event Processing
• Integration with data sources/sinks
• Iterative, pipelined processing
• Integration with microservices

Spark Streaming
• Mini-batch
• Machine Learning
• Longer running jobs like Training Models
• Supports batch and near-real time
• Run SQL jobs

Apache Flink
• High-Volume, Low Latency
• True streaming
• Iterative, pipelined processing
• Excellent Apache Beam Support
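The "mini-batch" versus "true streaming" distinction on this slide can be illustrated with a small sketch. This is not Spark or Flink code: the function names `per_event` and `mini_batch` are hypothetical, and the batch boundary here is a fixed event count, whereas a real mini-batch engine would typically use a time interval. The only point is when results become visible.

```python
from typing import Iterable, Iterator, List


def per_event(events: Iterable[int]) -> Iterator[int]:
    """Emit an updated running total after every single event."""
    total = 0
    for value in events:
        total += value
        yield total


def mini_batch(events: Iterable[int], batch_size: int) -> Iterator[int]:
    """Emit an updated running total once per batch of `batch_size` events."""
    total = 0
    batch: List[int] = []
    for value in events:
        batch.append(value)
        if len(batch) == batch_size:
            total += sum(batch)
            batch.clear()
            yield total
    if batch:  # flush a final, partially filled batch
        total += sum(batch)
        yield total


events = [1, 2, 3, 4, 5]
print(list(per_event(events)))      # [1, 3, 6, 10, 15]
print(list(mini_batch(events, 2)))  # [3, 10, 15]
```

Both styles reach the same final total; the per-event version exposes every intermediate result, while the mini-batch version trades some latency for processing events in groups.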

Page 11: Five Early Challenges Of Building Streaming Fast Data Applications

So That’s Perfectly Clear

• Choices are not always obvious
• Trade off speed, memory, choice of libraries, …
• Application may require multiple engines

Page 12: Five Early Challenges Of Building Streaming Fast Data Applications

Challenge #2

Integrating Microservices with Streaming Services

Page 13: Five Early Challenges Of Building Streaming Fast Data Applications

reactivemanifesto.org

Reactive: the architecture for modern, scalable applications

Page 14: Five Early Challenges Of Building Streaming Fast Data Applications

Streaming Services are Part of your Application

• A streaming service should appear as just another service in your architecture
• Must be reactive: elastic, resilient, responsive, and message-driven
• Unlike Hadoop systems, which can serve results to a service when ready
• We shouldn’t care how a service is implemented

Service A

Service C

Service B

Page 15: Five Early Challenges Of Building Streaming Fast Data Applications

Challenge #3

Understanding Operational Challenges

Page 16: Five Early Challenges Of Building Streaming Fast Data Applications

Ok, Streaming Services are Just Another Service

Page 17: Five Early Challenges Of Building Streaming Fast Data Applications

Stream Engines Do Not Always Meet Microservices Goals

• How do they manage state?
• How do you scale them?
• How do you version/upgrade them?

In most cases, the operator needs to know too much about the underlying component or service

Page 18: Five Early Challenges Of Building Streaming Fast Data Applications

Challenge #4

Gaining Competitive Advantage through Machine Learning

Page 19: Five Early Challenges Of Building Streaming Fast Data Applications

What Do We Mean By Machine Learning?

• Branch of artificial intelligence
• Recognize patterns in data
• Build models to predict outcomes
• Recommend actions based on predicted outcomes vs stated goals

Page 20: Five Early Challenges Of Building Streaming Fast Data Applications

Why is Machine Learning Hot Now?

Page 21: Five Early Challenges Of Building Streaming Fast Data Applications

How Can Businesses Identify Machine Learning Opportunities?

Example Use Cases:

• Fraud and Anomaly Detection
• Recommendation Engines / Marketing Personalization
• Financial Trading
• Smart Cars
• Natural Language Processing
• Automation

Stop! Ask yourself: where do I have hard-coded models or rules?
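As a rough illustration of the "hard-coded models or rules" question, the sketch below contrasts a hand-picked fraud limit with one derived from historical data. Everything here is hypothetical: the constant `HARD_CODED_LIMIT`, the sample `history`, and the mean-plus-three-standard-deviations rule, which is only the simplest possible stand-in for a learned model.

```python
from statistics import mean, stdev
from typing import List

HARD_CODED_LIMIT = 1000.0  # hypothetical limit someone picked by hand


def flag_hard_coded(amount: float) -> bool:
    """Flag a transaction using the fixed, hand-written rule."""
    return amount > HARD_CODED_LIMIT


def fit_threshold(history: List[float], k: float = 3.0) -> float:
    """Derive a limit from past amounts: mean plus k sample standard deviations."""
    return mean(history) + k * stdev(history)


# Hypothetical past transaction amounts for one customer.
history = [20.0, 25.0, 30.0, 22.0, 28.0]
learned_limit = fit_threshold(history)

print(flag_hard_coded(400.0))  # False: the fixed rule misses it
print(400.0 > learned_limit)   # True: unusual for this customer's history
```

The fixed rule treats every customer the same, while the data-derived limit adapts to what is normal for each history, which is the kind of opportunity the question on the slide is meant to surface.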

Page 22: Five Early Challenges Of Building Streaming Fast Data Applications

Challenge #5

Optimizing Resource Utilization

Page 23: Five Early Challenges Of Building Streaming Fast Data Applications

First Generation Resource Optimization

• Clusters Can Get Quite Large with Many Moving Components
• Interaction Between Components Quite Complex
• First Generation Auto-Scalers Naïve

“Scale when CPU reaches 80%”

“Scale when Queue Length > 10”
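A minimal sketch of what such first-generation rules look like in code, assuming the two quoted thresholds are combined with a simple "or" (the slide does not say how they are combined). The `Metrics` type and `should_scale_out` function are hypothetical names, not part of any product.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    cpu_percent: float  # average CPU utilization of the service
    queue_length: int   # number of messages waiting to be processed


def should_scale_out(m: Metrics) -> bool:
    """Threshold rule: scale when CPU reaches 80% or queue length exceeds 10."""
    return m.cpu_percent >= 80.0 or m.queue_length > 10


print(should_scale_out(Metrics(cpu_percent=85.0, queue_length=3)))   # True
print(should_scale_out(Metrics(cpu_percent=40.0, queue_length=11)))  # True
print(should_scale_out(Metrics(cpu_percent=79.0, queue_length=10)))  # False
```

Note that nothing in the rule refers to what users experience: a service can sit below both thresholds while still responding too slowly, and can exceed them while performing acceptably.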

Page 24: Five Early Challenges Of Building Streaming Fast Data Applications

But Hard To Tie These Rules Back to Business Objectives

• Clusters Can Get Quite Large with Many Moving Components
• Interaction Between Components Quite Complex
• Bottlenecks shift over time as application and infrastructure change

Page 25: Five Early Challenges Of Building Streaming Fast Data Applications

What You Really Want

“Scale what you need to scale to continue to meet service-level objectives”
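One way to picture an objective-driven rule is sketched below. It is a deliberately simple heuristic, not the machine-learning approach the webinar goes on to mention: it assumes per-component latencies simply add up along a pipeline, and the component names and the function `pick_component_to_scale` are hypothetical.

```python
from typing import Dict, Optional


def pick_component_to_scale(latency_ms: Dict[str, float], slo_ms: float) -> Optional[str]:
    """Return the component contributing the most latency if the
    end-to-end objective is missed, or None if the objective is met."""
    total = sum(latency_ms.values())
    if total <= slo_ms:
        return None  # objective met: nothing needs to scale
    return max(latency_ms, key=lambda name: latency_ms[name])


# Hypothetical latency contribution of each stage of a pipeline.
observed = {"ingest": 20.0, "stream-job": 150.0, "store": 40.0}

print(pick_component_to_scale(observed, slo_ms=200.0))  # stream-job
print(pick_component_to_scale(observed, slo_ms=250.0))  # None
```

The decision starts from the service-level objective and works back to a component, rather than starting from a per-component metric such as CPU or queue length.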

Page 26: Five Early Challenges Of Building Streaming Fast Data Applications

Good News

• On-Line Machine Learning Can Help
• Specify Service Level Objectives per service or application
• But Challenges Remain...
• Hard to Build
• Need knowledge of how to scale components
• “Operator Model”

Page 27: Five Early Challenges Of Building Streaming Fast Data Applications
Page 28: Five Early Challenges Of Building Streaming Fast Data Applications

Lightbend Fast Data Platform

• Easy on-ramp for getting started
• Curated choice of components
• Complete monitoring and intelligent management
• Support across entire platform

Page 30: Five Early Challenges Of Building Streaming Fast Data Applications

Questions? Comments? Want to speak with someone at Lightbend?

Get in touch with us! lightbend.com/contact

[email protected]