Streaming Data Integration - For Women in Big Data Meetup
Uploaded by gwen-chen-shapira · Category: Software
Transcript of Streaming Data Integration - For Women in Big Data Meetup
![Page 1: Streaming Data Integration - For Women in Big Data Meetup](https://reader035.fdocuments.us/reader035/viewer/2022081521/586f7e4b1a28ab10258b81e9/html5/thumbnails/1.jpg)
1Confidential
Streaming Data Integration with Apache Kafka
![Page 2: Streaming Data Integration - For Women in Big Data Meetup](https://reader035.fdocuments.us/reader035/viewer/2022081521/586f7e4b1a28ab10258b81e9/html5/thumbnails/2.jpg)
About Gwen
Gwen Shapira – System Architect @Confluent
PMC @ Apache Kafka
Moving data around since 2000
Previously:
• Software Engineer @ Cloudera
• Oracle Database Consultant
Find me:
• @gwenshap
![Page 3: Streaming Data Integration - For Women in Big Data Meetup](https://reader035.fdocuments.us/reader035/viewer/2022081521/586f7e4b1a28ab10258b81e9/html5/thumbnails/3.jpg)
The Plan
1. What is Data Integration about?
2. How have things changed?
3. What is difficult and important?
4. How do we solve things in Kafka?
![Page 4: Streaming Data Integration - For Women in Big Data Meetup](https://reader035.fdocuments.us/reader035/viewer/2022081521/586f7e4b1a28ab10258b81e9/html5/thumbnails/4.jpg)
Data Integration
Making sure the right data gets to the right places
![Page 5: Streaming Data Integration - For Women in Big Data Meetup](https://reader035.fdocuments.us/reader035/viewer/2022081521/586f7e4b1a28ab10258b81e9/html5/thumbnails/5.jpg)
10 years ago…
Informatica
DataStage
Manual Optimizations
![Page 6: Streaming Data Integration - For Women in Big Data Meetup](https://reader035.fdocuments.us/reader035/viewer/2022081521/586f7e4b1a28ab10258b81e9/html5/thumbnails/6.jpg)
5 years ago…
![Page 7: Streaming Data Integration - For Women in Big Data Meetup](https://reader035.fdocuments.us/reader035/viewer/2022081521/586f7e4b1a28ab10258b81e9/html5/thumbnails/7.jpg)
![Page 8: Streaming Data Integration - For Women in Big Data Meetup](https://reader035.fdocuments.us/reader035/viewer/2022081521/586f7e4b1a28ab10258b81e9/html5/thumbnails/8.jpg)
![Page 9: Streaming Data Integration - For Women in Big Data Meetup](https://reader035.fdocuments.us/reader035/viewer/2022081521/586f7e4b1a28ab10258b81e9/html5/thumbnails/9.jpg)
Today…
• Everything streaming
• Everything real-time
• Everything in-memory
• Everything containers
• Everything clouds
![Page 10: Streaming Data Integration - For Women in Big Data Meetup](https://reader035.fdocuments.us/reader035/viewer/2022081521/586f7e4b1a28ab10258b81e9/html5/thumbnails/10.jpg)
These Things Matter
• Reliability – Losing data is (usually) not OK.
  • Exactly Once vs At Least Once
• Timeliness
  • Push vs Pull
• High throughput, Varying throughput
  • Compression, Parallelism, Back Pressure
• Data Formats
  • Flexibility, Structure
• Security
• Error Handling
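The gap between "At Least Once" and "Exactly Once" in the Reliability bullet can be illustrated with a small simulation. This is a minimal sketch and not Kafka's own mechanism: the `Record` type, the `deliver_at_least_once` helper, and the `IdempotentSink` class are hypothetical names made up for illustration. It assumes every record carries a unique (partition, offset) pair that the receiving side can remember.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Record:
    """Hypothetical record: identified by its partition and offset."""
    partition: int
    offset: int
    value: str


def deliver_at_least_once(records: Iterable[Record], redeliver_offsets: Set[int]) -> List[Record]:
    """Simulate a transport that resends some records (e.g. after a lost ack)."""
    delivered: List[Record] = []
    for rec in records:
        delivered.append(rec)
        if rec.offset in redeliver_offsets:
            delivered.append(rec)  # duplicate caused by a retry
    return delivered


class IdempotentSink:
    """Applies each (partition, offset) at most once, so retries are harmless."""

    def __init__(self) -> None:
        self.seen: Set[Tuple[int, int]] = set()
        self.applied: List[str] = []

    def write(self, rec: Record) -> None:
        key = (rec.partition, rec.offset)
        if key in self.seen:
            return  # already applied; drop the duplicate
        self.seen.add(key)
        self.applied.append(rec.value)


source = [Record(0, i, f"event-{i}") for i in range(4)]
stream = deliver_at_least_once(source, redeliver_offsets={1, 3})

# A naive consumer applies everything it receives, duplicates included.
naive = [rec.value for rec in stream]

# An idempotent sink turns at-least-once delivery into effectively-once results.
sink = IdempotentSink()
for rec in stream:
    sink.write(rec)
```

In this sketch nothing is ever lost, but the naive consumer sees some events twice, while the idempotent sink ends up with each event applied once. Deduplicating on the receiving side is only one possible approach and it requires the sink to track what it has already written.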
![Page 11: Streaming Data Integration - For Women in Big Data Meetup](https://reader035.fdocuments.us/reader035/viewer/2022081521/586f7e4b1a28ab10258b81e9/html5/thumbnails/11.jpg)
![Page 12: Streaming Data Integration - For Women in Big Data Meetup](https://reader035.fdocuments.us/reader035/viewer/2022081521/586f7e4b1a28ab10258b81e9/html5/thumbnails/12.jpg)
After: Stream Data Platform with Kafka
• Distributed
• Fault Tolerant
• Stores Messages
• Processes Streams
[Diagram labels: Kafka; Search, Security, Fraud Detection, Application, User Tracking, Operational Logs, Operational Metrics, Espresso, Cassandra, Oracle, Hadoop, Log Search, Monitoring, Data Warehouse]
![Page 13: Streaming Data Integration - For Women in Big Data Meetup](https://reader035.fdocuments.us/reader035/viewer/2022081521/586f7e4b1a28ab10258b81e9/html5/thumbnails/13.jpg)
![Page 14: Streaming Data Integration - For Women in Big Data Meetup](https://reader035.fdocuments.us/reader035/viewer/2022081521/586f7e4b1a28ab10258b81e9/html5/thumbnails/14.jpg)
![Page 15: Streaming Data Integration - For Women in Big Data Meetup](https://reader035.fdocuments.us/reader035/viewer/2022081521/586f7e4b1a28ab10258b81e9/html5/thumbnails/15.jpg)
![Page 16: Streaming Data Integration - For Women in Big Data Meetup](https://reader035.fdocuments.us/reader035/viewer/2022081521/586f7e4b1a28ab10258b81e9/html5/thumbnails/16.jpg)
![Page 17: Streaming Data Integration - For Women in Big Data Meetup](https://reader035.fdocuments.us/reader035/viewer/2022081521/586f7e4b1a28ab10258b81e9/html5/thumbnails/17.jpg)
![Page 18: Streaming Data Integration - For Women in Big Data Meetup](https://reader035.fdocuments.us/reader035/viewer/2022081521/586f7e4b1a28ab10258b81e9/html5/thumbnails/18.jpg)
Introducing Kafka Connect
Large-scale streaming data import/export for Kafka
![Page 19: Streaming Data Integration - For Women in Big Data Meetup](https://reader035.fdocuments.us/reader035/viewer/2022081521/586f7e4b1a28ab10258b81e9/html5/thumbnails/19.jpg)
![Page 20: Streaming Data Integration - For Women in Big Data Meetup](https://reader035.fdocuments.us/reader035/viewer/2022081521/586f7e4b1a28ab10258b81e9/html5/thumbnails/20.jpg)
Overview of Connect
1. Install a cluster of Workers
2. Download / Build and install Connector Plugins
3. Use REST API to Start and Configure Connectors
4. Connectors start Tasks. Tasks run inside Workers and copy data.
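Step 3 above can be sketched as a single HTTP request to a Connect worker. This is a rough example under stated assumptions, not something taken from the talk: the worker address `http://localhost:8083` (8083 is the default REST port), the connector name `local-file-source`, the file path, and the topic are placeholders, and `build_create_connector_request` is a hypothetical helper. `FileStreamSourceConnector` is the example file connector from Apache Kafka; whether it is available depends on how the worker's plugins are set up.

```python
import json
import urllib.request
from typing import Dict


def build_create_connector_request(
    base_url: str, name: str, config: Dict[str, str]
) -> urllib.request.Request:
    """Build a POST /connectors request that asks a worker to create a connector."""
    payload = json.dumps({"name": name, "config": config}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/connectors",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_create_connector_request(
    "http://localhost:8083",  # assumed worker address
    "local-file-source",      # placeholder connector name
    {
        "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
        "tasks.max": "1",          # upper bound on Tasks the Connector may start
        "file": "/tmp/input.txt",  # placeholder source file
        "topic": "connect-demo",   # placeholder destination topic
    },
)

# To submit it against a running worker, uncomment:
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.read().decode("utf-8"))
```

The request is built but not sent, so the snippet runs without a cluster. Once a worker accepts the configuration, the connector starts its Tasks (step 4), up to the `tasks.max` limit.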
![Page 21: Streaming Data Integration - For Women in Big Data Meetup](https://reader035.fdocuments.us/reader035/viewer/2022081521/586f7e4b1a28ab10258b81e9/html5/thumbnails/21.jpg)
![Page 22: Streaming Data Integration - For Women in Big Data Meetup](https://reader035.fdocuments.us/reader035/viewer/2022081521/586f7e4b1a28ab10258b81e9/html5/thumbnails/22.jpg)
![Page 23: Streaming Data Integration - For Women in Big Data Meetup](https://reader035.fdocuments.us/reader035/viewer/2022081521/586f7e4b1a28ab10258b81e9/html5/thumbnails/23.jpg)
![Page 24: Streaming Data Integration - For Women in Big Data Meetup](https://reader035.fdocuments.us/reader035/viewer/2022081521/586f7e4b1a28ab10258b81e9/html5/thumbnails/24.jpg)
![Page 25: Streaming Data Integration - For Women in Big Data Meetup](https://reader035.fdocuments.us/reader035/viewer/2022081521/586f7e4b1a28ab10258b81e9/html5/thumbnails/25.jpg)
![Page 26: Streaming Data Integration - For Women in Big Data Meetup](https://reader035.fdocuments.us/reader035/viewer/2022081521/586f7e4b1a28ab10258b81e9/html5/thumbnails/26.jpg)
![Page 27: Streaming Data Integration - For Women in Big Data Meetup](https://reader035.fdocuments.us/reader035/viewer/2022081521/586f7e4b1a28ab10258b81e9/html5/thumbnails/27.jpg)
![Page 28: Streaming Data Integration - For Women in Big Data Meetup](https://reader035.fdocuments.us/reader035/viewer/2022081521/586f7e4b1a28ab10258b81e9/html5/thumbnails/28.jpg)
![Page 29: Streaming Data Integration - For Women in Big Data Meetup](https://reader035.fdocuments.us/reader035/viewer/2022081521/586f7e4b1a28ab10258b81e9/html5/thumbnails/29.jpg)
![Page 30: Streaming Data Integration - For Women in Big Data Meetup](https://reader035.fdocuments.us/reader035/viewer/2022081521/586f7e4b1a28ab10258b81e9/html5/thumbnails/30.jpg)
![Page 31: Streaming Data Integration - For Women in Big Data Meetup](https://reader035.fdocuments.us/reader035/viewer/2022081521/586f7e4b1a28ab10258b81e9/html5/thumbnails/31.jpg)
Questions?