© 2016 MapR Technologies 1© 2016 MapR Technologies 1MapR Confidential © 2016 MapR Technologies
Architecting a hybrid cloud application using a global publish-subscribe streaming message system
Mathieu Dumoulin (MapR Technologies)
Strata Singapore 2016
Streaming Architecture to Connect Everything (including Hybrid Cloud)
Mathieu Dumoulin (MapR Technologies)
Strata Singapore 2016
Mathieu Dumoulin, Data Engineer
• Master’s degree in text classification on Hadoop at Fujitsu Canada’s Innovation Lab and Laval University
• In Tokyo, I’ve worked as a Data Scientist, Search Engineer and Data Engineer
• Working on streaming, complex event processing and machine learning
The new rule for the future is going to be, “Anything that can be
connected, will be connected.”
Jacob Morgan, Forbes - May 2014
Talk Summary
• Clouds: private vs. public vs. hybrid
• It’s all about that streaming
  – Streaming for IoT
  – Publish-subscribe messaging systems (Kafka)
  – Stream Processing (Apache Spark Streaming, Apache Flink)
  – Microservices
• Streams-based Architecture in the hybrid cloud
  – Design goals
  – Examples
• Recap, Q&A
Weather today for IT:
Public Cloud - Low Upfront Cost and Flexibility
The Good
• Right size instances for application
• Grow with the business
• “Forever” extensible
• Global in a few clicks
The Bad
• New complexity, no magic
• Costs can run away
The Ugly
• Local data is far from processing
• Severe lock-in without huge in-house expertise
Private Clouds - The Benefits of Ownership
The Good
• Direct access to data
• Security, privacy and legal compliance
• Hardware certainty
• Low running cost
The Bad
• Harder to scale vertically & horizontally
• Cost of multiple datacenters
The Ugly
• Pay for spike, wasted resources
• Never right size in a growing organization
Hybrid = Public + Private
[Diagram labels: Private Cloud - Europe, Private Cloud - Tokyo]
Spans at least one public and one private cloud.
• Test new ideas with low up-front capital cost
• Cloudbursting
• High Availability and Disaster Recovery
• Regulatory Requirements
IT infrastructure agility
It’s all about that streaming
Streaming Architecture the Norm for Data Driven Organizations
“Stream-based computing is becoming the norm for data-driven organizations” - Friedman & Dunning, Streaming Architecture
• Build flexible systems
  – More efficient and easier to build
  – Decouples dependencies between data source and processing
• Better model the way business processes take place
• More value now… and later
  – Aggregates data from many sources once
  – Serves data to one or many projects immediately
  – More efficient and high performance
  – Run batch analytics, reprocess data
IoT is a Natural Use Case for Streaming
Connected devices produce data as real-time events that are modelled naturally as event streams.
Some actions have value only if taken immediately:
– Navigation updates from traffic conditions, accident reports, disasters, …
– Slowing down or stopping a factory line in response to quality issues
– Re-routing items mid-way during shipping to increase efficiency
– Continuous engine tuning
IoT is Happening Right Now!
Streams Make the Hybrid Cloud Practical
Streams can serve for inter-cloud communication in the exact same way they support any other scenario.
● Abstract the differences between on-premises and cloud
● Standardize the expected flow of data between modules
● Reuse data many times, break down data silos
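As a rough illustration of the points above, the sketch below copies an on-premises stream to a public-cloud copy and then runs the same consumer logic against both. Everything here is a toy assumption for illustration: the `Stream` class, the `replicate` helper and the stream names are hypothetical and do not correspond to the Kafka or MapR Streams APIs.

```python
class Stream:
    """Toy append-only stream; a stand-in for a topic in a real messaging system."""

    def __init__(self, name):
        self.name = name
        self.messages = []

    def append(self, message):
        self.messages.append(message)


def replicate(source, target, replicated_upto):
    """Copy messages not yet replicated from source to target, preserving order.

    Returns the new replication position (number of source messages copied so far).
    """
    for message in source.messages[replicated_upto:]:
        target.append(message)
    return len(source.messages)


def count_errors(stream):
    """Consumer logic that depends only on the stream interface, not on its location."""
    return sum(1 for m in stream.messages if m["level"] == "ERROR")


# Hypothetical stream names, chosen only for this example.
on_prem = Stream("tokyo-private/logs")
public_cloud = Stream("public-cloud/logs")

on_prem.append({"level": "INFO", "msg": "started"})
on_prem.append({"level": "ERROR", "msg": "disk full"})
position = replicate(on_prem, public_cloud, 0)

on_prem.append({"level": "ERROR", "msg": "timeout"})
position = replicate(on_prem, public_cloud, position)   # only the new message is copied

local_errors = count_errors(on_prem)
cloud_errors = count_errors(public_cloud)
```

The consumer function never needs to know which side of the hybrid cloud it runs on. A real system would also have to deal with network failures, ordering across partitions and conflicting writes, none of which this sketch attempts.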
What Streaming Requires from a Messaging System
● The producer and consumer are fully independent
● Very high throughput: 1,000+/s → 1,000,000+/s
● Persistence
  ○ Fault-tolerance
  ○ Data is kept as a replayable sequence
  ○ Strong ordering of events
● Naming of topics (consumers pick the data they need)
● Geo-distributed replication (for Hybrid Cloud use cases)
It’s very hard to get full isolation of producers and consumers while also keeping very high speed, but we must have both.
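Several of these requirements (named topics, a replayable ordered sequence, independent producers and consumers) can be sketched with an in-memory log. This is a minimal sketch under stated assumptions, not how Kafka or MapR Streams are implemented: `TopicLog`, `Consumer` and the topic name are hypothetical, and nothing here is persisted, replicated or fault-tolerant.

```python
from collections import defaultdict


class TopicLog:
    """Toy in-memory stand-in for a named, ordered, replayable topic log."""

    def __init__(self):
        self._topics = defaultdict(list)   # topic name -> ordered list of messages

    def publish(self, topic, message):
        """Append a message and return its offset (its position in the log)."""
        log = self._topics[topic]
        log.append(message)
        return len(log) - 1

    def read(self, topic, offset, max_messages=100):
        """Return messages starting at `offset`; reading never removes anything."""
        return self._topics[topic][offset:offset + max_messages]


class Consumer:
    """Each consumer tracks its own offset, so consumers never affect each other."""

    def __init__(self, log, topic):
        self.log, self.topic, self.offset = log, topic, 0

    def poll(self):
        batch = self.log.read(self.topic, self.offset)
        self.offset += len(batch)
        return batch


log = TopicLog()
for reading in (21.5, 21.7, 22.1):
    log.publish("sensor.temperature", reading)   # producer knows nothing about consumers

dashboard = Consumer(log, "sensor.temperature")
first_poll = dashboard.poll()     # all three readings, in publish order
second_poll = dashboard.poll()    # nothing new has been published since

archiver = Consumer(log, "sensor.temperature")   # created later, replays from offset 0
replayed = archiver.poll()
```

Because messages stay in the log after being read, a consumer added later can replay the full history without the producer or the other consumers being involved.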
What Streaming Requires from Stream Processing Frameworks
Desirable features for real-time analytics frameworks:
• Open Source, active development and developer community
• Supports “exactly once” guarantee, stream reprocessing
• How much real-time? Microbatch vs. record-at-a-time
• Performance (latency, throughput)
• Other: Easy to use, compatibility, talent availability
To learn more: https://www.mapr.com/blog/stream-processing-everywhere-what-use
Jim Scott - Stream Processing Everywhere - What to Use? Strata San Jose 2015
Also see data Artisans’ blog on Stream Processing Framework Myths
Which Stream Processing Frameworks?
Summing up: Technology to support Streaming
1. Lightweight messaging system
2. Stream Processing Framework
You can get an Introduction to Flink in this Free Book published by O’Reilly
Key Ideas For Effectively Using Streams
Real-time Analysis
Persist to Disk
Geo-distributed Replication
Core part of Architecture
Streaming Architecture: Ideal Platform for Microservices
Microservices are a modern distributed architecture that realizes the promises of SOA (Service Oriented Architecture):
• Scale up from a test use case to a global deployment
• Decouples components, more modular
• Modern, agile development, testing and deployment
• More robust and responsive
See Krystal Valentine’s “The keys to an event-based microservices application” presentation, Strata New York 2016
Monolithic to Microservices Architecture
See Fowler’s blog about microservices: http://www.martinfowler.com/articles/microservices.html
Microservices are Truly Decoupled
When to Use Streaming Architecture
Connect Clouds with Streams: Streams-based Architecture
“Switch from thinking of computer programs as state-oriented to thinking
of them in terms of flows”
Ted Dunning & Ellen Friedman, Streaming Architecture - O’Reilly - 2016
An End-to-End Streaming Architecture
Japan North Data Center
StreamGW
Global Data Center
Stream
Example Architecture: Log Analysis
Example Architecture: The MapR Blueprint
Download the Finserv app from GitHub!
https://github.com/mapr-demos/finserv-application-blueprint
Conclusion
• The hybrid cloud matters for IT agility
• Use streams for communication between elements
• Streaming-based systems can be arbitrarily complex
– Still fast, responsive, reliable and easier to develop!
• In a streaming architecture world, a converged platform (built-in streaming, storage and DB) makes a difference.
Suggested Reading And Video Links
Get Ted & Ellen’s book: Read it Online for Free!
New content presented by Ted Dunning:
1. Big Data in the Cloud (blog): www.mapr.com/big-data-cloud
   a. Direct video link: https://youtu.be/90KrQAb1_Cw
2. Converged Advantages in the Cloud (blog): www.mapr.com/converged-cloud
   a. Direct video link: https://youtu.be/yjfBXNcmAHA
Q & A
@mapr
[email protected]
@lordxar
Engage with us!
mapr-technologies
Key Ideas for Microservices
• Services are opaque - API only
• They communicate with only a few other services using lightweight, flexible protocols
  – HTTP+REST - Synchronous (frontend)
  – Messaging Systems (Kafka, MapR Streams) - Asynchronous (backend)
• Data formats should be future-proofed
  – JSON - Human readable, easy to use, low efficiency
  – Binary (Avro, Protobuf, Thrift) - Efficient but (somewhat) harder to use
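To make the JSON versus binary trade-off concrete, the sketch below encodes one event both ways and compares the sizes. It assumes a hypothetical sensor event, and it uses Python's standard `struct` module as a stand-in for a schema-based binary format; Avro, Protobuf and Thrift have their own schemas, tooling and schema-evolution rules that this does not reproduce.

```python
import json
import struct

# Hypothetical event, invented for this example.
event = {"sensor_id": 42, "timestamp": 1480550400, "temperature": 21.5}

# Self-describing text encoding: the field names travel with every message.
as_json = json.dumps(event).encode("utf-8")

# Fixed binary layout: unsigned 32-bit id, unsigned 64-bit timestamp, 64-bit float.
# "<" selects little-endian with no padding. Reader and writer must agree on this
# layout out of band, which is the role a schema plays in Avro or Protobuf.
LAYOUT = struct.Struct("<IQd")
as_binary = LAYOUT.pack(event["sensor_id"], event["timestamp"], event["temperature"])

# Decoding only works if the consumer knows the layout.
sensor_id, timestamp, temperature = LAYOUT.unpack(as_binary)
decoded = {"sensor_id": sensor_id, "timestamp": timestamp, "temperature": temperature}

print(len(as_json), "bytes as JSON vs", len(as_binary), "bytes as binary")
```

The binary message is 20 bytes and the JSON message is several times larger, but the JSON one can be read and debugged without any schema. A bare fixed layout like this is also not future-proof on its own: adding a field breaks old readers, which is the problem schema-based formats are designed to handle.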
Spark Streaming or Flink: Case by Case
Spark Streaming: Micro-batches. Time-based window. Latency: seconds
Flink: Continuous flow model. Record-based window. Latency: ms
Both provide exactly once guarantee, high throughput and low overhead of fault tolerance. Both streaming and batch supported.
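The difference between the two processing models can be sketched in plain Python without either framework. This is only a conceptual illustration: the function names, the event data and the integer-millisecond timestamps are assumptions made for the example, and neither function reflects the Spark Streaming or Flink APIs.

```python
def record_at_a_time(events, threshold):
    """Continuous model: handle each event as it arrives and emit alerts immediately."""
    for timestamp_ms, value in events:
        if value > threshold:
            yield (timestamp_ms, value)


def micro_batches(events, batch_ms):
    """Micro-batch model: group events into fixed time intervals, then emit each group.

    Intervals with no events produce an empty batch.
    """
    batch, batch_end = [], None
    for timestamp_ms, value in events:
        if batch_end is None:
            # Align the first interval to a multiple of batch_ms.
            batch_end = timestamp_ms - timestamp_ms % batch_ms + batch_ms
        while timestamp_ms >= batch_end:
            yield batch
            batch, batch_end = [], batch_end + batch_ms
        batch.append((timestamp_ms, value))
    if batch:
        yield batch


# (timestamp in ms, sensor value), already ordered by time.
events = [(100, 5), (400, 12), (1200, 7), (2700, 15)]

alerts = list(record_at_a_time(events, threshold=10))
batch_maxima = [max(v for _, v in b) for b in micro_batches(events, batch_ms=1000) if b]
```

With record-at-a-time processing the value 12 can be acted on as soon as it arrives at 400 ms. With one-second micro-batches the same value is only visible once its interval closes at 1000 ms, so the batch interval sets a floor on latency.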
The Hybrid Cloud for IoT Infrastructure
Why do IoT applications call out for the flexibility of Hybrid Cloud?
• IoT is a new use case - Need to Test
• Built-in need for baseload capacity and bursting data spikes
• Global marketplace requires geographically dispersed datacenters
• Increasingly strict compliance requirements
• IoT Security issues need to be taken seriously