Fault Tolerance with Kafka
![Page 1: Fault Tolerance with Kafka](https://reader031.fdocuments.us/reader031/viewer/2022021422/58754f511a28abb8208b811b/html5/thumbnails/1.jpg)
www.edureka.co/apache-kafka
Fault Tolerance with Kafka
![Page 2: Fault Tolerance with Kafka](https://reader031.fdocuments.us/reader031/viewer/2022021422/58754f511a28abb8208b811b/html5/thumbnails/2.jpg)
What will you learn today?
What is Apache Kafka?
Architecture of Kafka
How does Kafka achieve fault tolerance?
Hands-On: Fault Tolerance with Kafka
![Page 3: Fault Tolerance with Kafka](https://reader031.fdocuments.us/reader031/viewer/2022021422/58754f511a28abb8208b811b/html5/thumbnails/3.jpg)
Data: The Ingredient
Data is the main ingredient of Internet applications and typically includes the following:
Page visits and clicks
User activities
Events corresponding to logins
Social networking activities such as likes, shares, and comments
Application-specific metrics (e.g., logs, page load time, performance)
![Page 4: Fault Tolerance with Kafka](https://reader031.fdocuments.us/reader031/viewer/2022021422/58754f511a28abb8208b811b/html5/thumbnails/4.jpg)
Need: Real-Time Analytics
In today's applications, activity data has become part of production data and is used to run analytics in real time. Typical uses include:
Delivering advertisements to the masses
Tracking any abnormal user behavior or application hacking
Search based on relevance
Recommendations based on popularity
![Page 5: Fault Tolerance with Kafka](https://reader031.fdocuments.us/reader031/viewer/2022021422/58754f511a28abb8208b811b/html5/thumbnails/5.jpg)
Messaging Systems
Messaging systems provide seamless integration among distributed applications by means of messages shared between them
Problem: In the present big-data era, the first challenge is to collect the data, because of its huge volume, and the second is to analyze it
Solution: One way to address both challenges is to use a messaging system
![Page 6: Fault Tolerance with Kafka](https://reader031.fdocuments.us/reader031/viewer/2022021422/58754f511a28abb8208b811b/html5/thumbnails/6.jpg)
Apache Kafka
Apache Kafka is a distributed publish-subscribe messaging system
Originally developed at LinkedIn, it later became an Apache project
Kafka is fast, scalable, durable and distributed by design
![Page 7: Fault Tolerance with Kafka](https://reader031.fdocuments.us/reader031/viewer/2022021422/58754f511a28abb8208b811b/html5/thumbnails/7.jpg)
Kafka Architecture
Figure: Kafka Architecture (Producers publish to the Kafka Cluster; Consumers read from it)
A stream of messages of a particular category is called a topic. Producers publish messages to a topic
A Producer can be any application that can publish messages to a topic
Consumers subscribe to topics and consume the messages
A Kafka cluster is a set of servers, each of which is called a broker
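The roles above can be illustrated with a small in-memory sketch. This is not the Kafka client API: `ToyBroker` and `ToyConsumer` are hypothetical names invented for this example, and a single Python object stands in for a whole cluster.

```python
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for a broker (hypothetical; not the Kafka API)."""

    def __init__(self):
        # topic name -> list of messages, in publish order
        self._topics = defaultdict(list)

    def publish(self, topic, message):
        """Append a message to a topic and return its position in that topic."""
        self._topics[topic].append(message)
        return len(self._topics[topic]) - 1

    def fetch(self, topic, start):
        """Return every message in the topic from position `start` onward."""
        return self._topics[topic][start:]


class ToyConsumer:
    """Subscribes to topics and remembers how far it has read in each."""

    def __init__(self, broker, topics):
        self._broker = broker
        self._positions = {topic: 0 for topic in topics}

    def poll(self):
        """Return the (topic, message) pairs published since the last poll."""
        received = []
        for topic, position in list(self._positions.items()):
            batch = self._broker.fetch(topic, position)
            received.extend((topic, message) for message in batch)
            self._positions[topic] = position + len(batch)
        return received
```

Because each consumer keeps its own read position, several consumers can read the same topic independently, which is the publish-subscribe behavior the slide describes.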
![Page 8: Fault Tolerance with Kafka](https://reader031.fdocuments.us/reader031/viewer/2022021422/58754f511a28abb8208b811b/html5/thumbnails/8.jpg)
ZooKeeper and Kafka
Each Kafka broker coordinates with the other Kafka brokers using ZooKeeper
The ZooKeeper service notifies Producers and Consumers when a new broker joins the Kafka system or an existing broker fails
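The notification idea can be sketched as a tiny membership registry with callbacks. This is only an illustration of the pattern, not ZooKeeper's API: `ToyRegistry` and its `register`, `expire`, and `watch` methods are hypothetical names chosen for this example.

```python
class ToyRegistry:
    """Tracks live brokers and notifies watchers (hypothetical; not ZooKeeper)."""

    def __init__(self):
        self._live = set()    # ids of brokers currently considered alive
        self._watchers = []   # callbacks taking (event, broker_id)

    def watch(self, callback):
        """Register a callback that is invoked on every membership change."""
        self._watchers.append(callback)

    def register(self, broker_id):
        """Record a newly started broker and announce it."""
        if broker_id not in self._live:
            self._live.add(broker_id)
            self._notify("joined", broker_id)

    def expire(self, broker_id):
        """Record a broker failure and announce it."""
        if broker_id in self._live:
            self._live.discard(broker_id)
            self._notify("failed", broker_id)

    def live_brokers(self):
        """Return the ids of all live brokers in sorted order."""
        return sorted(self._live)

    def _notify(self, event, broker_id):
        for callback in self._watchers:
            callback(event, broker_id)
```

A producer or consumer would register a callback with `watch` and update its own list of brokers whenever a "joined" or "failed" event arrives.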
![Page 9: Fault Tolerance with Kafka](https://reader031.fdocuments.us/reader031/viewer/2022021422/58754f511a28abb8208b811b/html5/thumbnails/9.jpg)
Kafka Clusters
With Kafka, we can create several types of clusters, such as the following:
Single node single broker cluster
Single node multiple broker cluster
Multiple nodes multiple broker cluster
![Page 10: Fault Tolerance with Kafka](https://reader031.fdocuments.us/reader031/viewer/2022021422/58754f511a28abb8208b811b/html5/thumbnails/10.jpg)
Single Node Single Broker Cluster
Figure: Single Node Single Broker Cluster (three Producers, three Consumers, one Kafka Broker, ZooKeeper)
![Page 11: Fault Tolerance with Kafka](https://reader031.fdocuments.us/reader031/viewer/2022021422/58754f511a28abb8208b811b/html5/thumbnails/11.jpg)
Single Node Multiple Broker Cluster
Figure: Single Node Multiple Broker Cluster (three Producers, three Consumers, ZooKeeper, Broker 1, Broker 2, Broker 3)
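To run several brokers on one node, each broker needs its own id, port, and log directory. The sketch below generates one properties file body per broker. The helper `broker_properties`, the port numbers, and the `/tmp` paths are assumptions made for illustration; the property names follow common Kafka broker configuration, but they can differ between Kafka versions (older releases used `port` rather than `listeners`), so check them against the release in use.

```python
def broker_properties(broker_id, base_port=9092, zookeeper="localhost:2181"):
    """Build the text of a broker properties file (illustrative values only)."""
    settings = {
        # Every broker in a cluster needs a unique id.
        "broker.id": broker_id,
        # Brokers sharing a node must listen on different ports (assumed scheme).
        "listeners": f"PLAINTEXT://:{base_port + broker_id}",
        # Each broker also needs its own log directory (assumed path).
        "log.dirs": f"/tmp/kafka-logs-{broker_id}",
        # All brokers point at the same ZooKeeper ensemble.
        "zookeeper.connect": zookeeper,
    }
    return "\n".join(f"{key}={value}" for key, value in settings.items())


# Three brokers on a single node, with ids 0, 1, and 2.
configs = [broker_properties(i) for i in range(3)]
```

Each string in `configs` could be written to its own file and passed to a separate broker process.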
![Page 12: Fault Tolerance with Kafka](https://reader031.fdocuments.us/reader031/viewer/2022021422/58754f511a28abb8208b811b/html5/thumbnails/12.jpg)
Multiple Node Multiple Broker Cluster
Figure: Multiple Node Multiple Broker Cluster (three Producers, three Consumers, ZooKeeper, and Node 1 and Node 2, each with Broker 1 and Broker 2)
![Page 13: Fault Tolerance with Kafka](https://reader031.fdocuments.us/reader031/viewer/2022021422/58754f511a28abb8208b811b/html5/thumbnails/13.jpg)
How does Kafka achieve fault tolerance?
For each topic, the Kafka cluster maintains a partitioned log
Each partition is an ordered, immutable sequence of messages that is continually appended to, forming a commit log
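A partitioned, append-only log can be sketched in a few lines. `ToyPartition` and `ToyTopic` are hypothetical names, and choosing the partition by a CRC32 hash of the key is an assumption made for this example; real Kafka clients use their own partitioner.

```python
import zlib


class ToyPartition:
    """An ordered, append-only sequence of messages (hypothetical sketch)."""

    def __init__(self):
        self._records = []

    def append(self, message):
        """Add a message at the end of the log and return its offset."""
        self._records.append(message)
        return len(self._records) - 1

    def read(self, offset):
        """Return the message stored at the given offset."""
        return self._records[offset]


class ToyTopic:
    """A topic split into a fixed number of partitions."""

    def __init__(self, num_partitions):
        self.partitions = [ToyPartition() for _ in range(num_partitions)]

    def send(self, key, message):
        """Route a message by key and return (partition index, offset)."""
        # Assumed routing rule: the same key always maps to the same partition.
        index = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        offset = self.partitions[index].append(message)
        return index, offset
```

Messages are never modified once appended, and within a partition their offsets give a strict order; there is no ordering across partitions in this sketch.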
![Page 14: Fault Tolerance with Kafka](https://reader031.fdocuments.us/reader031/viewer/2022021422/58754f511a28abb8208b811b/html5/thumbnails/14.jpg)
How does Kafka achieve fault tolerance?
The partitions of the log are distributed over the servers in the Kafka cluster, with each server handling data and requests for a share of the partitions. Kafka achieves fault tolerance by replicating each partition over a configurable number of servers, so a copy of the data survives when one of them fails
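The effect of replication can be shown with a deliberately simplified model. `ToyReplicatedPartition` is a hypothetical class: it copies every message to all live replicas synchronously and promotes the lowest-numbered surviving broker when the leader fails, which is an assumption for illustration and not Kafka's actual replication or leader-election protocol.

```python
class ToyReplicatedPartition:
    """One partition copied across several brokers (simplified sketch)."""

    def __init__(self, replica_brokers):
        # broker id -> that broker's copy of the partition log
        self.logs = {broker: [] for broker in replica_brokers}
        self.alive = set(replica_brokers)
        self.leader = replica_brokers[0]

    def append(self, message):
        """Write through the leader and copy to every live replica."""
        if self.leader is None:
            raise RuntimeError("no live replica for this partition")
        for broker in self.alive:
            self.logs[broker].append(message)
        return len(self.logs[self.leader]) - 1

    def fail(self, broker):
        """Mark a broker as failed; promote a survivor if it was the leader."""
        self.alive.discard(broker)
        if broker == self.leader:
            # Assumed election rule: lowest surviving broker id wins.
            self.leader = min(self.alive) if self.alive else None

    def read_all(self):
        """Return the messages as seen by the current leader."""
        if self.leader is None:
            raise RuntimeError("no live replica for this partition")
        return list(self.logs[self.leader])
```

With three replicas in this model, the partition stays readable and writable after two broker failures and becomes unavailable only when the last replica is lost.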
![Page 15: Fault Tolerance with Kafka](https://reader031.fdocuments.us/reader031/viewer/2022021422/58754f511a28abb8208b811b/html5/thumbnails/15.jpg)
Hands-On: Fault Tolerance with Kafka
![Page 16: Fault Tolerance with Kafka](https://reader031.fdocuments.us/reader031/viewer/2022021422/58754f511a28abb8208b811b/html5/thumbnails/16.jpg)
Kafka @ LinkedIn
LinkedIn Newsfeed is powered by Kafka
LinkedIn recommendations are powered by Kafka
![Page 17: Fault Tolerance with Kafka](https://reader031.fdocuments.us/reader031/viewer/2022021422/58754f511a28abb8208b811b/html5/thumbnails/17.jpg)
Kafka @ LinkedIn
LinkedIn notifications are powered by Kafka
Apart from this, LinkedIn uses Kafka for many other purposes, such as log monitoring, performance metrics, and search improvement
![Page 18: Fault Tolerance with Kafka](https://reader031.fdocuments.us/reader031/viewer/2022021422/58754f511a28abb8208b811b/html5/thumbnails/18.jpg)
Who else uses Kafka?
DataSift uses Kafka as a collector of monitoring events and to track users' consumption of data streams in real time
Wooga uses Kafka to aggregate and process tracking data from all their Facebook games (hosted at various providers) in a central location
Spongecell uses Kafka to run their entire analytics and monitoring pipeline, driving both real-time and ETL applications
Loggly, a cloud-based log management service, uses Kafka for log collection
A longer list of companies using Kafka can be found here: https://cwiki.apache.org/confluence/display/KAFKA/Powered+By
![Page 19: Fault Tolerance with Kafka](https://reader031.fdocuments.us/reader031/viewer/2022021422/58754f511a28abb8208b811b/html5/thumbnails/19.jpg)
References
Apache Kafka: http://kafka.apache.org/
Kafka Papers: https://cwiki.apache.org/confluence/display/KAFKA/Kafka+papers+and+presentations
Powered by Kafka: https://cwiki.apache.org/confluence/display/KAFKA/Powered+By
LinkedIn Performance Insights: https://engineering.linkedin.com/samza/real-time-insights-linkedins-performance-using-apache-samza