From Big to Fast Data. How #kafka and #kafka-connect can redefine your ETL and #stream-processing
Uploaded by: landoop-ltd
Transcript of From Big to Fast Data. How #kafka and #kafka-connect can redefine your ETL and #stream-processing
2010 → 2014
- Error handling as a first-class citizen
Schema Registry flow:
Your App → Producer → Serializer → Kafka topic
The serializer checks whether the record's format is acceptable against the Schema Registry, retrieves the schema ID, and writes Schema ID + Data to the topic; incompatible data produces an error.
producerProps.put("key.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
producerProps.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
Diagram: generate data → shipments topic + sales topic → Spark Streaming → low-inventory topic
Let's see some code
Define the data contract / schema in Avro format
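The deck doesn't show the actual schema, but an Avro data contract for a sales record might look like the following sketch (record name and fields are illustrative assumptions, not from the slides):

```json
{
  "type": "record",
  "name": "Sale",
  "namespace": "com.example.retail",
  "fields": [
    {"name": "item_id",  "type": "string"},
    {"name": "quantity", "type": "int"},
    {"name": "ts",       "type": "long"}
  ]
}
```

Registering this schema with the Schema Registry lets the KafkaAvroSerializer enforce it on every produced record.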
Generate data: 1.9 M msg/sec using 1 thread
https://schema-registry-ui.landoop.com
Schemas registered for us :-)
Defining the typed data format
Initiate the streaming from 2 topics
The business logic
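The slides don't show the business logic itself; a minimal plain-Java sketch of what a low-inventory rule over the shipments and sales streams might look like (class name and threshold are hypothetical assumptions, not from the deck):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical low-inventory tracker: shipments add stock, sales consume it,
// and a sale that drops an item below the threshold would feed the low-inventory topic.
public class InventoryTracker {
    private final Map<String, Integer> stock = new HashMap<>();
    private final int lowThreshold;

    public InventoryTracker(int lowThreshold) {
        this.lowThreshold = lowThreshold;
    }

    // A shipment event increases the stock for an item.
    public void onShipment(String item, int qty) {
        stock.merge(item, qty, Integer::sum);
    }

    // A sale decreases stock; returns true when the item falls below the threshold.
    public boolean onSale(String item, int qty) {
        int remaining = stock.merge(item, -qty, Integer::sum);
        return remaining < lowThreshold;
    }

    public int stockOf(String item) {
        return stock.getOrDefault(item, 0);
    }
}
```

In the deck this logic runs inside Spark Streaming over the two Kafka topics; the sketch above only illustrates the per-item rule.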
Diagram: shipments topic + sales topic → Spark Streaming (re-ordering) → low-inventory topic → elastic-search
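The "re-ordering" step suggests events may arrive out of timestamp order before being indexed into Elasticsearch. A hedged sketch of one common approach, buffering events and releasing them in timestamp order once a lateness bound has passed (all names and the lateness bound are assumptions, not from the deck):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical re-orderer: holds events in a min-heap keyed by timestamp and
// emits them in order once they are older than (max seen timestamp - maxLatenessMs).
public class Reorderer {
    public static class Event {
        public final long ts;
        public final String payload;
        public Event(long ts, String payload) {
            this.ts = ts;
            this.payload = payload;
        }
    }

    private final PriorityQueue<Event> buffer =
        new PriorityQueue<>(Comparator.comparingLong(e -> e.ts));
    private final long maxLatenessMs;
    private long maxSeenTs = Long.MIN_VALUE;

    public Reorderer(long maxLatenessMs) {
        this.maxLatenessMs = maxLatenessMs;
    }

    // Accept a possibly out-of-order event; return any events now safe to emit, in order.
    public List<Event> accept(Event e) {
        maxSeenTs = Math.max(maxSeenTs, e.ts);
        buffer.add(e);
        List<Event> ready = new ArrayList<>();
        while (!buffer.isEmpty() && buffer.peek().ts <= maxSeenTs - maxLatenessMs) {
            ready.add(buffer.poll());
        }
        return ready;
    }
}
```

The trade-off is latency versus correctness: a larger lateness bound tolerates more disorder but delays emission.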
Simple is beautiful