Multi-Datacenter Kafka - Strata San Jose 2017

Post on 11-Apr-2017



When One Data Center Is Not Enough
Building Large-scale Stream Infrastructures Across Multiple Data Centers with Apache Kafka
Gwen Shapira

There’s a book on that!

Actually… a chapter

Outline

• Kafka overview
• Common multi-data-center patterns
• Future stuff

What is Kafka?
▪ It’s like a message queue, right?
- Actually, it’s a “distributed commit log”
- Or a “streaming data platform”

[Diagram: a partition as a numbered log (offsets 0-8); a data source appends at the tail while consumers A and B read from independent positions.]

Topics and Partitions
▪ Messages are organized into topics, and each topic is split into partitions.
- Each partition is an immutable, time-sequenced log of messages on disk.
- Note that time ordering is guaranteed within, but not across, partitions.

[Diagram: one topic written by a data source and split into partitions 0-2, each an independent numbered log (offsets 0-8).]
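Per-key ordering follows from how messages are assigned to partitions. A simplified sketch of that assignment (an assumption for illustration: Kafka’s default partitioner hashes the key with murmur2, stood in here by CRC-32):

```python
import zlib

NUM_PARTITIONS = 3

def partition_for(key: bytes, num_partitions: int = NUM_PARTITIONS) -> int:
    # Deterministic hash of the key, so every message with the same key
    # lands in the same partition and keeps its relative order there.
    return zlib.crc32(key) % num_partitions

# Same key -> same partition; ordering across different keys/partitions
# is not guaranteed.
assert partition_for(b"user-42") == partition_for(b"user-42")
```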

Scalable consumption model

[Diagram: topic T1 with partitions 0-3; Consumer Group 1 shown first with a single consumer reading all four partitions, then with consumers 1-4 each reading one partition.]

Kafka usage

Common use case

Large scale real time data integration

Other use cases

• Scaling databases
• Messaging
• Stream processing
• …

Important things to remember:

1. Consumer offset commits
2. Within a cluster, each partition has replicas
3. Inter-cluster replication, producer, and consumer defaults are all tuned for LAN

Why multiple data centers (DC)

Offload work from the main cluster
Disaster recovery
Geo-localization

• Saving cross-DC bandwidth• Better performance by being closer to users• Some activity is just local• Security / regulations

Cloud
Special case: producers with network issues

Why is this difficult?

1. It isn’t, really – you consume data from one cluster and produce to another
2. The network between two data centers can get tricky
3. Consumers have state (offsets) – syncing this between clusters gets tough

• And this leads to some counterintuitive results

Pattern #1: stretched cluster

Typically done on AWS in a single region
• Deploy Zookeeper and brokers across 3 availability zones

Rely on intra-cluster replication to replicate data across DCs

[Diagram: one Kafka cluster stretched across DC 1, DC 2, and DC 3, with producers and consumers in each DC.]

On DC failure

Producers/consumers fail over to the surviving DCs
• Existing data preserved by intra-cluster replication
• Consumers resume from last committed offsets and will see the same data

[Diagram: DC 2 has failed; producers and consumers in DC 1 and DC 3 continue against the stretched cluster.]

When DC comes back

Intra-cluster replication automatically re-replicates all missing data
When re-replication completes, switch producers/consumers back

[Diagram: DC 2 restored; producers and consumers run in all three DCs again.]

Be careful with replica assignment

Don’t want all replicas in the same AZ
Rack-aware support in 0.10.0

• Configure brokers in the same AZ with the same broker.rack

Manual assignment pre-0.10.0
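The rack-aware setting is per broker. A minimal sketch of the relevant server.properties lines, assuming AWS AZ names as rack labels:

```properties
# server.properties on each broker; brokers in the same AZ share a rack label
broker.id=1
broker.rack=us-east-1a

# With rack awareness, replicas of each partition are spread across racks/AZs
default.replication.factor=3
```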

Stretched cluster NOT recommended across regions

Asymmetric network partitioning

Longer network latency => longer produce/consume times
Cross-region bandwidth: no read affinity in Kafka

[Diagram: Kafka brokers and ZooKeeper nodes stretched across regions 1-3.]

Pattern #2: active/passive

Producers in the active DC
Consumers in either the active or passive DC

[Diagram: active/passive. Producers write to the Kafka cluster in DC 1; replication copies data to DC 2. Critical apps consume from DC 1, nice-to-have reports from DC 2.]

Cross Datacenter Replication

Consumer & producer: read from a source cluster and write to a target cluster
• Per-key ordering preserved
• Asynchronous: target always slightly behind
• Offsets not preserved
• Source and target may not have the same # of partitions
• Retries for failed writes

Options:
• Confluent Multi-Datacenter Replication
• MirrorMaker
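For the MirrorMaker option, a minimal sketch of the two configs it needs (host names and file names are illustrative):

```properties
# mm-consumer.properties: read from the source (DC 1) cluster
bootstrap.servers=kafka-dc1:9092
group.id=mirrormaker
auto.offset.reset=earliest

# mm-producer.properties: write to the target (DC 2) cluster
bootstrap.servers=kafka-dc2:9092
acks=all
```

Launched with something like `kafka-mirror-maker.sh --consumer.config mm-consumer.properties --producer.config mm-producer.properties --whitelist '.*'`.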

On active DC failure

Fail over producers/consumers to the passive cluster
Challenge: which offset to resume consumption from

• Offsets not identical across clusters

[Diagram: DC 1 has failed; producers and consumers point at the formerly passive cluster in DC 2.]

Solutions for switching consumers

Resume from smallest offset
• Duplicates

Resume from largest offset
• May miss some messages (likely acceptable for real-time consumers)

Replicate the offsets topic
• May miss some messages, may get duplicates

Set offsets based on timestamp
• Old API hard to use and not precise
• Better and more precise API in Apache Kafka 0.10.1 (Confluent 3.1)
• Nice tool coming up!

Preserve offsets during replication
• Harder to do
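The timestamp approach can be sketched in pure Python; `offset_for_time` below is an illustrative stand-in for the timestamp lookup added in 0.10.1 (returning the earliest offset whose timestamp is at or after the requested time), not a Kafka API call:

```python
def offset_for_time(log, target_ts):
    """log: list of (offset, timestamp) pairs in offset order.

    Return the earliest offset whose timestamp >= target_ts,
    or None if no such message exists.
    """
    for offset, ts in log:
        if ts >= target_ts:
            return offset
    return None

# Failover: resume in DC 2 from the timestamp of the last message processed
# in DC 1. Offsets differ between clusters, but timestamps travel with messages.
dc2_log = [(100, 1000), (101, 1005), (102, 1010), (103, 1020)]
assert offset_for_time(dc2_log, 1005) == 101
```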

When DC comes back

Need to reverse replication
• Same challenge: determining the offsets

[Diagram: DC 1 restored; replication now runs from DC 2 back to DC 1.]

Limitations

Reconfiguration of replication after failover
Resources in passive DC under-utilized

Pattern #3: active/active

Local and aggregate clusters; replicating local clusters into aggregates avoids cycles
Producers/consumers in both DCs

• Producers only write to local clusters

[Diagram: active/active. Each DC has a local Kafka cluster and an aggregate cluster; both local clusters replicate into both aggregates; producers write to their local cluster, consumers read from local or aggregate clusters.]

On DC failure

Same challenge when moving consumers to the other aggregate cluster
• Offsets in the 2 aggregate clusters are not identical
• Unless the consumers are continuously running in both clusters

[Diagram: one DC has failed; its producers and consumers fail over to the other DC’s local and aggregate clusters.]

[Example: an SF Kafka cluster serving West Coast users and a Houston Kafka cluster serving South Central users, with all apps running in both DCs.]

When DC comes back

No need to reconfigure replication

[Diagram: the failed DC returns; the replication topology is unchanged, and producers and consumers move back.]

Alternative: avoid aggregate clusters

Prefix topic names with a DC tag
Configure replication to replicate remote topics only
Consumers need to subscribe to topics with both DC tags
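A sketch of the naming convention (the helper functions and the `dc1`/`dc2` tags are illustrative, not a Kafka API):

```python
def topics_to_replicate(all_topics, local_dc):
    # Replicate only topics that originated in a remote DC, so a message
    # never cycles back to its source cluster.
    return [t for t in all_topics if not t.startswith(local_dc + ".")]

def subscription_for(topic, dcs):
    # Consumers must subscribe to the topic under every DC tag to see
    # both local and remote data.
    return [f"{dc}.{topic}" for dc in dcs]

topics = ["dc1.orders", "dc2.orders", "dc1.clicks"]
assert topics_to_replicate(topics, "dc1") == ["dc2.orders"]
assert subscription_for("orders", ["dc1", "dc2"]) == ["dc1.orders", "dc2.orders"]
```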

[Diagram: two active clusters, DC 1 and DC 2, each with local producers and consumers, replicating each other’s remote-tagged topics.]

Beyond 2 DCs

More DCs = better resource utilization
• With 2 DCs, each DC needs to provision 100% of traffic
• With 3 DCs, each DC only needs to provision 50% of traffic

Setting up replication with many DCs can be daunting
• Only set up aggregate clusters in 2-3 DCs
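The capacity claim above is simple arithmetic, sketched here under the assumption that traffic is spread evenly and the survivors absorb a failed DC’s share equally:

```python
def provisioned_fraction_per_dc(num_dcs: int) -> float:
    # Fraction of total traffic each DC must be provisioned for so the
    # surviving DCs can absorb one failed DC's load.
    return 1 / (num_dcs - 1)

assert provisioned_fraction_per_dc(2) == 1.0   # 2 DCs: each provisions 100%
assert provisioned_fraction_per_dc(3) == 0.5   # 3 DCs: each provisions 50%
```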

Comparison

Stretched
• Pros: better utilization of resources; easy failover for consumers
• Cons: still need a cross-region story

Active/passive
• Pros: needed for global ordering
• Cons: harder failover for consumers; reconfiguration during failover; resource under-utilization

Active/active
• Pros: better utilization of resources; can be used to avoid consumer failover
• Cons: can be challenging to manage; more replication bandwidth

Multi-DC beyond Kafka

Kafka is often used together with other data stores
Need to make sure the multi-DC strategy is consistent across them

Example application

A consumer reads from Kafka and computes 1-minute counts
Counts need to be stored in a DB and available in every DC

Independent database per DC

Run the same consumer concurrently in both DCs
• No consumer failover needed

[Diagram: active/active setup with an independent DB in each DC; the consumer runs in both DCs, each writing to its local DB.]

Stretched database across DCs

Only run one consumer per DC at any given point in time

[Diagram: active/active setup with one DB stretched across both DCs; a single consumer runs in DC 1, and the DC 2 consumer takes over only on failover.]

Practical tips

• Consume remote, produce local
  • Unless you need encrypted data on the wire
• Monitor!
  • Burrow for replication lag
  • Confluent Control Center for end-to-end monitoring
  • JMX metrics for rates and “busy-ness”
• Tune!
  • Producer/consumer tuning
  • Number of consumers and producers
  • TCP tuning for WAN links
• Don’t forget to replicate configuration
• Separate critical topics from nice-to-have topics
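For the WAN tuning bullets, a sketch of producer settings that trade a little latency for cross-DC throughput (all values are illustrative starting points, not recommendations):

```properties
# Compress and batch aggressively to save WAN bandwidth and amortize latency:
compression.type=lz4
linger.ms=100
batch.size=262144
# Durability over latency, with ordering preserved across retries:
acks=all
max.in.flight.requests.per.connection=1
# Larger TCP buffers for the high-latency link:
send.buffer.bytes=1048576
receive.buffer.bytes=1048576
```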

Future work

Offset reset tool
Offset preservation
“Remote replicas”
2-DC stretch cluster

Other cool Kafka futures:
• Exactly-once
• Transactions
• Headers

THANK YOU!
Gwen Shapira | gwen@confluent.io | @gwenshap

Kafka Training with Confluent University
• Kafka Developer and Operations Courses
• Visit www.confluent.io/training

Want more Kafka?
• Download Confluent Platform Enterprise at http://www.confluent.io/product
• Apache Kafka 0.10.2 upgrade documentation at http://docs.confluent.io/3.2.0/upgrade.html
• Kafka Summit recordings now available at http://kafka-summit.org/schedule/

Discount code: kafstrata
Special Strata attendee discount code = 25% off www.kafka-summit.org
Kafka Summit New York: May 8
Kafka Summit San Francisco: August 28
