What is Kafka & why is it Important? (UKOUG Tech17, Birmingham, UK - December 2017)
What is Apache Kafka & Why is it Important?
The Event Fabric bringing IT together
It would be so nice if I could
publish my ideas and actions,
accessible near instantly for
everyone who is interested
Heck, I do not even know these people
and they may not know me [personally]
– just my pearls of wisdom. And if they
are late to the party, they can also
check out the historic archives of my
eloquence
Without fretting about the numbers of
readers involved and whether they are
in the same time zone as me and online
when I publish my messages – and
which device they use
• Decoupled communication
• 0, 1 or many followers
• Scalable number of messages (and parties)
• Reliable (mostly available, few messages lost)
• Full history
• Open: cross device, cross location
• Not Sub-second, near real-time fast
• Rate limited (#messages/minute)
• Size limited (140-280 characters)
• Format limited (text)
• Not for private interactions
• Not (really) for programmatic use
[Diagram: ORDERS table in an Oracle Database; DVX_ORDERS table in an Oracle Database; Oracle Application Container Cloud; Oracle DBaaS Cloud; locally running Node application]
What does the Twitter for System Driven Event Interaction look like?
[Diagram: the same components, now with Oracle Event Hub]
• Decoupled communication – organized per topic
• 0, 1 or many Consumers per Topic
• Scalable number of messages (and parties)
• Reliable (distributed)
• Full history
• Open: libraries in many technologies & REST APIs
• Near real-time fast
• No Rate Limit
• No enforced size limit
• Anything goes (it’s all byte[])
• On premises or in cloud, private or trusted
• Very much for programmatic use
[Diagram: Events from Producers to Consumers – Robust, Scalable, Fast, History Retention, Containerized/Cloud-enabled, Open]
Messaging as we know it
• JMS, Oracle Advanced Queuing, IBM MQ, MS MQ, RabbitMQ, MQTT,
XMPP, WebSockets, Oracle Coherence, …
• Challenges
• Costs
• Scalability (size and speed)
• (lack of) Distribution (and therefore availability)
• Complexity of infrastructure
• Message delivery guarantees
• Lack of technology openness
• Deal with temporarily offline consumers
• Retain history
Introducing Apache Kafka
• … – 2010: creation at LinkedIn
• Message Bus | Event Broker
• High volume, low latency, highly reliable, cross technology
• Scalable, distributed, strict message ordering (within a partition), …
• 2011/2012 – open sourced: first in the Apache Incubator, then as an Apache Top-Level Project
• Kafka is used by many large corporations:
• Walmart, Cisco, Netflix, PayPal, LinkedIn, eBay, Spotify, Uber, Sift
Science, Zalando, The New York Times, Airbnb, Coursera, ING Bank,…
• And embraced by many software vendors & cloud providers
• Client libraries available for Node, Java, C/C++, Python, Ruby, PHP, Go,
Rust, .NET, Perl, Scala DSL, Clojure, Swift and more
[Diagram: Producers → (tcp) → Topic → (tcp) → Consumers]
KAFKA TERMINOLOGY
• Topic
• Message (== ByteArray)
• Broker
• Producer
• Consumer
[Diagram: Producers → Topic on Broker → Consumers; a Message consists of Key, Value and Time]
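The terminology above can be made concrete with a small model. This is only a sketch in plain Python, not a Kafka client: the `Message` and `Topic` classes and the notion of an offset as a list position are illustrative assumptions, and the sample payloads are made up.

```python
import time
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Message:
    key: Optional[bytes]  # Key: optional, also just bytes
    value: bytes          # Value: an opaque byte array
    timestamp: float      # Time at which the message was produced


class Topic:
    """Toy stand-in for a topic on a broker: an append-only log (hypothetical)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._log: List[Message] = []

    def append(self, key: Optional[bytes], value: bytes,
               timestamp: Optional[float] = None) -> int:
        """Producer side: add a message and return its position in the log."""
        ts = time.time() if timestamp is None else timestamp
        self._log.append(Message(key, value, ts))
        return len(self._log) - 1

    def read(self, offset: int) -> List[Message]:
        """Consumer side: return everything from the given position onwards."""
        return self._log[offset:]


tweets = Topic("tweets")
first = tweets.append(b"ukoug_tech17", b'{"text": "sample tweet one"}')
second = tweets.append(b"ukoug_apps17", b'{"text": "sample tweet two"}')
```

Because the value is just bytes, the producer and the consumer have to agree on the encoding themselves (JSON in this made-up example).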
CONSUMING
• Messages are available to consumers only when they have been committed
• Kafka does not push (unlike JMS)
• Read does not destroy (unlike JMS Topic)
• (Some) history available: offline consumers can catch up
• Consumers can re-consume from the past
• Delivery guarantees: ordering maintained
• At-least-once (per consumer) by default; at-most-once and exactly-once can be implemented
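A toy illustration of these consuming rules, in plain Python rather than against a real broker. The `Log` class, its method names and the consumer names are hypothetical. The consumer-side `commit` here records a consumer's read position and is a different thing from a message being committed to the log.

```python
from typing import Dict, List


class Log:
    """Toy append-only log standing in for one topic (hypothetical helper)."""

    def __init__(self) -> None:
        self.messages: List[str] = []
        self.positions: Dict[str, int] = {}  # consumer name -> next offset to read

    def poll(self, consumer: str, max_messages: int = 10) -> List[str]:
        """Consumers pull; reading removes nothing from the log."""
        start = self.positions.get(consumer, 0)
        return self.messages[start:start + max_messages]

    def commit(self, consumer: str, count: int) -> None:
        """Record progress, to be called only after processing succeeded."""
        self.positions[consumer] = self.positions.get(consumer, 0) + count

    def seek(self, consumer: str, offset: int) -> None:
        """Rewind (or fast-forward) a consumer to an arbitrary position."""
        self.positions[consumer] = offset


log = Log()
log.messages.extend(["tweet-1", "tweet-2", "tweet-3"])

# Read does not destroy: a second consumer still gets the full history.
ui_batch = log.poll("ui")
log.commit("ui", len(ui_batch))
analytics_batch = log.poll("analytics")

# If "analytics" fails before committing, the same batch is delivered again:
# that is the at-least-once behaviour.
redelivered = log.poll("analytics")
log.commit("analytics", len(redelivered))

# Re-consume from the past.
log.seek("ui", 1)
replayed = log.poll("ui")
```

Committing before processing instead of after would turn this into at-most-once: a failure then loses the batch rather than repeating it.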
WHAT’S SO SPECIAL?
• Durable
• Scalable
• High volume
• High speed
• Available
• Distributed
• Open
• Quick start
• Free (no license costs)
• “Self Fulfilling Prophecy” (positive feedback loop)
CONFLUENT == ENTERPRISE KAFKA
• Freemium model
• Support
• Training
• Confluent Cloud
• Platform and Tools
<CTRL F5>
[Diagram: clients pressing F5 and CTRL+F5 against an Application Server]
FAST DATA AND ACTIVE UI
• Handle influx
• Publish findings instantaneously
• Update UI & notify end user immediately
• Analyze in real time
• Decoupled components
• No data loss when a component is temporarily down
• Scalable with the volume of events and the number of clients
THE CASE AT HAND
[Diagram: four Clients; tweets on #ukoug17 #ukoug_tech17 #ukoug_apps17 #ukoug_jde17]
• Show live tweet feed for conferences
• Show live tweet aggregates per conference
• Allow users to like tweets – and show live list of liked tweets
• Show a live list of top 3 liked tweets per conference
DEMO - REAL TIME, CROSS CLOUD, CROSS TECHNOLOGY PUSH
[Diagram: tweets on #ukoug17 #ukoug_tech17 #ukoug_apps17 #ukoug_jde17; four Clients; you]
THE CASE AT HAND – STEP ONE
[Diagram: tweets on #ukoug17 #ukoug_tech17 #ukoug_apps17 #ukoug_jde17; Tweets Topic; four Clients]
• Show live tweet feed for conferences
THE CASE AT HAND – STEP ONE AND TWO
[Diagram: tweets on #ukoug17 #ukoug_tech17 #ukoug_apps17 #ukoug_jde17; Tweets Topic; four Clients]
• Show live tweet feed for conferences
KAFKA CONSUMER IN NODE – GET EVENTS PUSHED INTO APPLICATION
THE CASE AT HAND – SERVER SENT EVENTS FOR PUSH BACK
[Diagram: tweets on #ukoug17 #ukoug_tech17 #ukoug_apps17 #ukoug_jde17; Tweets Topic; Server Sent Event; four Clients]
• Show live tweet feed for conferences
SERVER SENT EVENT – SERVER SIDE
[Diagram: Server Sent Event; four Clients]
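A minimal sketch of what the server side of Server Sent Events has to write to each open connection. It is in Python rather than the Node used in the demo, and it covers only the wire format. The `format_sse` helper, the `tweet` event name and the sample payload are assumptions for illustration. The response must be served with content type `text/event-stream`, and each event is terminated by a blank line.

```python
import json
from typing import Optional


def format_sse(data: dict, event: Optional[str] = None) -> str:
    """Serialize one event in the text/event-stream format."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")        # optional event name
    # json.dumps emits no raw newlines, so one data: line is enough here.
    lines.append(f"data: {json.dumps(data)}")
    return "\n".join(lines) + "\n\n"           # blank line ends the event


frame = format_sse({"conference": "ukoug_tech17", "text": "sample tweet"},
                   event="tweet")
```

On the browser side the standard `EventSource` API consumes such a stream; the server simply keeps the HTTP response open and writes one frame per new message.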
SERVER SENT EVENT – CLIENT SIDE
LIVE TWEET STREAM
[Diagram: Server Sent Event]
THE CASE AT HAND – TWEET LIKES – CLIENT TO SERVER TO ALL CLIENTS
[Diagram: tweets on #ukoug17 #ukoug_tech17 #ukoug_apps17 #ukoug_jde17; Tweets Topic; SSE; four Clients]
• Show live tweet feed for conferences
• Allow users to like tweets – and show live list of liked tweets
THE CASE AT HAND – WEB SOCKETS – FOR BI DIRECTIONAL PUSH
[Diagram: tweets on #ukoug17 #ukoug_tech17 #ukoug_apps17 #ukoug_jde17; Tweets Topic; SSE; WebSockets; four Clients]
• Show live tweet feed for conferences
• Allow users to like tweets – and show live list of liked tweets
TWEET LIKES BROADCASTING
[Diagram: WebSockets]
THE CASE AT HAND – STREAMING ANALYSIS OF TWEET EVENTS
[Diagram: tweets on #ukoug17 #ukoug_tech17 #ukoug_apps17 #ukoug_jde17; Tweets Topic; SSE; WebSockets; four Clients]
• Show live tweet feed for conferences
• Allow users to like tweets – and show live list of liked tweets
• Show live tweet aggregates per conference
THE CASE AT HAND - STREAMING ANALYSIS OF TWEETS
[Diagram: tweets on #ukoug17 #ukoug_tech17 #ukoug_apps17 #ukoug_jde17; Tweets Topic; Streaming Tweets Aggregation (µ); tweetAnalytics Topic; SSE; WebSockets; four Clients]
• Show live tweet feed for conferences
• Allow users to like tweets – and show live list of liked tweets
• Show live tweet aggregates per conference
KAFKA STREAMS
• Real Time Event [Stream] Processing integrated into Kafka
• Aggregations & Top-N
• Time Windows
• Continuous Queries
• Latest State (event sourcing)
• Turn Stream (of changes) into Table (of most recent or current state)
• Part of the state can be quite old
• A Kafka Streams client will have state in memory
• State can always be recreated from the topic partition log files
• Note: Kafka Streams is relatively new
• Only support for Java clients
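The "stream into table" idea can be illustrated without Kafka Streams at all. This is a plain-Python sketch, not the Kafka Streams API: the `to_table` function and the sample changelog are made up. The table is the latest value per key, and it can always be rebuilt by replaying the same log.

```python
from typing import Dict, Iterable, Tuple


def to_table(changes: Iterable[Tuple[str, dict]]) -> Dict[str, dict]:
    """Fold a stream of (key, new value) changes into the current state per key."""
    table: Dict[str, dict] = {}
    for key, value in changes:
        table[key] = value  # a later change for a key overwrites the earlier one
    return table


# Hypothetical changelog, oldest change first.
changelog = [
    ("ukoug_tech17", {"tweets": 1}),
    ("ukoug_apps17", {"tweets": 1}),
    ("ukoug_tech17", {"tweets": 2}),
]

state = to_table(changelog)    # in-memory state of a running client
rebuilt = to_table(changelog)  # a restarted client replays the log, same result
```

The entry for `ukoug_apps17` comes from the second change in the log, which shows why part of the state can be quite old: a key that has not changed since keeps its original entry.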
KAFKA STREAMS
[Diagram: Topic → Filter → Aggregate → Join (with a second Topic) → Map (Xform) → Publish → Topic]
EXAMPLE OF KAFKA STREAMS
[Diagram: Topic → groupBy → Aggregate → Join (with a second Topic) → Map (Xform) → Publish]
• Input: TweetMessage with Conference, Text, Author, Hashtag
• Set Conference as key
• Sum/Avg/Top3 by key (== conference)
• As JSON
• Round aggregate to nearest 100
• Latest Conference Details
• Topic: CountTweetsPerConference, and possibly per time window
KAFKA STREAMS – RUNNING COUNT OF TWEETS PER CONFERENCE
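The logic of a running count per conference can be sketched as follows. This is plain Python, not Kafka Streams (which, as noted, has Java clients only) and not the code used in the demo. The `running_counts` generator and the tweet dictionaries are illustrative assumptions.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple


def running_counts(tweets: Iterable[dict]) -> Iterator[Tuple[str, int]]:
    """For every incoming tweet, emit (conference, count so far)."""
    counts: Counter = Counter()
    for tweet in tweets:
        conference = tweet["conference"]    # the conference acts as the key
        counts[conference] += 1             # aggregate: count per key
        yield conference, counts[conference]


# Hypothetical tweet events, in arrival order.
tweets = [
    {"conference": "ukoug_tech17", "text": "sample one"},
    {"conference": "ukoug_apps17", "text": "sample two"},
    {"conference": "ukoug_tech17", "text": "sample three"},
]

updates = list(running_counts(tweets))
```

Every input event produces one updated count, so the output is itself a stream that can be published to another topic and pushed on to clients.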
STREAMING TWEET ANALYTICS PUSHED TO CLIENTS
[Diagram: Server Sent Event]
THE CASE AT HAND - STREAMING ANALYSIS OF TWEET LIKES
[Diagram: tweets on #ukoug17 #ukoug_tech17 #ukoug_apps17 #ukoug_jde17; Tweets Topic; Streaming Tweets Aggregation (µ); tweetAnalytics Topic; SSE; WebSockets; four Clients]
• Show live tweet feed for conferences
• Allow users to like tweets – and show live list of liked tweets
• Show live tweet aggregates per conference
• Show a live list of top 3 liked tweets per conference
KSQL FOR DECLARATIVE STREAM ANALYTICS THROUGH CONTINUOUS QUERIES
create table tweetAnalytics as
select conference
, count(*)
from tweetsTopic
group by conference;

create stream retweets as
select *
from tweetsTopic
where text like 'RT%';
VISUALIZING KSQL VS KAFKA STREAMS
THE CASE AT HAND - STREAMING ANALYSIS OF TWEET LIKES
[Diagram: tweets on #ukoug17 #ukoug_tech17 #ukoug_apps17 #ukoug_jde17; Tweets Topic; Streaming Tweets Aggregation (µ); tweetAnalytics Topic; tweetLike Topic; Likes Aggregation (µ); Top3TweetLikesPerConference; SSE; WebSockets; four Clients]
• Show live tweet feed for conferences
• Allow users to like tweets – and show live list of liked tweets
• Show live tweet aggregates per conference
• Show a live list of top 3 liked tweets per conference
WEBSOCKETS – SERVER SIDE
RUNNING TOP 3 OF BEST LIKED TWEETS PER CONFERENCE
[Diagram: Server Sent Event]
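A sketch of the Top-N logic behind a top 3 of liked tweets per conference. It is plain Python working on an in-memory list, not the streaming implementation from the demo. The `top3_liked` function, the tweet ids and the tie-break on tweet id are assumptions.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def top3_liked(likes: Iterable[Tuple[str, str]]) -> Dict[str, List[Tuple[str, int]]]:
    """Return, per conference, the three tweets with the most likes."""
    per_conference: Dict[str, Counter] = defaultdict(Counter)
    for conference, tweet_id in likes:
        per_conference[conference][tweet_id] += 1   # one like event = +1
    return {
        conference: sorted(counts.items(),
                           key=lambda kv: (-kv[1], kv[0]))[:3]  # most likes first
        for conference, counts in per_conference.items()
    }


# Hypothetical like events as (conference, tweet id) pairs.
likes = (
    [("ukoug_tech17", "t1")] * 3
    + [("ukoug_tech17", "t2")] * 2
    + [("ukoug_tech17", "t3"), ("ukoug_tech17", "t4")]
    + [("ukoug_apps17", "a1")]
)

top3 = top3_liked(likes)
```

In a streaming setting the same aggregation would be re-evaluated on every like event and only the changed top 3 would be published.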
END TO END FLOW CLOUD ENABLED
[Diagram: Client (Chrome); Client (Firefox); UI (µ); API (µ); Cache; EventHub CS; Tweets; Likes; Tweet Count; Likes Top3; Tweets Aggregation (µ); Likes Aggregation (µ)]
Key aspects of this demo – What Kafka can do for you
• Bridging Cloud(s) and on premises systems
• Providing decoupled interaction between microservices
• Performing Streaming Analysis
• Bridging technologies (Java, Node, …)
• Bridging the availability (no | one | multiple instances)
• Provide semi-push based synchronization
• Open
• Scalable
• Reliable & Available
• Fast
• Complete historical record
Oracle embracing Apache Kafka
• Event Hub Cloud Service = Managed Apache Kafka platform
• Managed Topics have been announced too
• Kafka as source for GoldenGate and ODI
• Data Pipeline with Data Hub (Apache Cassandra) & Event Hub
• Oracle Service Bus Kafka Adapter
• Integration Cloud
• Stream Analytics (aka Stream Explorer fka Oracle Event Processor)
• Oracle Native Container and Microservices Platform
• Fn Serverless Platform
• JET and ADF real time push based on Apache Kafka
• In general – the bridge between on premises and [public] Cloud
Summary
• Apache Kafka is emerging as the platform of choice for message exchange in a world of
• Microservices
• CQRS and Data Source Synchronization
• Clouds
• Fast Data (IoT) and Streaming Analysis
• Real time data integration & distribution
• Oracle is rapidly embracing Apache Kafka on various levels
• Getting started with Apache Kafka is not very hard at all
• The platform is open source – and has broad client support (Java, Node, …)
• Many resources are available – tutorials, blog articles, demonstrations, presentation
slides and recordings of conference sessions, samples on GitHub
Thank you!
• Blog: technology.amis.nl
• Email: [email protected]
• : @lucasjellema
• : lucas-jellema
• : www.amis.nl, [email protected]