Cowboy Dating with Big Data · AI Ukraine 2019
![Page 1: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/1.jpg)
COWBOY DATING WITH BIG DATA
DATA PLATFORM EVOLUTION IN ACTION
BORIS TROFIMOV
![Page 2: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/2.jpg)
Big Data competence lead @ Sigma Software
Worked with Verizon/Yahoo/AOL, Collective
Cofounder of Odessa JUG
Passionate follower of Scala
Associate professor at ONPU
ABOUT ME
![Page 3: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/3.jpg)
INTRO – PARTNER 1
![Page 4: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/4.jpg)
INTRO – PARTNER 2
![Page 5: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/5.jpg)
AWARDS – PARTNER 2
• Medal for Kotlin
• Full Cavalier of Spring & Spring Boot
• Order for the Capture of Kubernetes
![Page 6: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/6.jpg)
LESSON 1 – SHARED STORAGE
![Page 7: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/7.jpg)
EXPECTATIONS
PRODUCT
![Page 8: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/8.jpg)
UI
MVP
API FACADE
DB
SHARED STORAGE
![Page 9: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/9.jpg)
UI
MVP
API FACADE
DB
WHERE IS THE REPORTING?
SHARED STORAGE
![Page 10: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/10.jpg)
UI
MVP
API FACADE
DB
3rd PROVIDERS
SHARED STORAGE
![Page 11: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/11.jpg)
PROS
• Fast TTM (time to market)
• Relatively cheap in terms of infrastructure and cost
CONS
• Tight coupling between data and code
• Different scaling scenarios
• Performance and availability issues
SHARED STORAGE
![Page 12: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/12.jpg)
SHARED STORAGE
UI
MVP
API FACADE
OLTP
DATA PLATFORM
OLAP
3rd PROVIDERS
![Page 13: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/13.jpg)
DATA PLATFORM
3rd PROVIDERS
SYSTEM EVENTS
API FACADE
DATA PLATFORM
![Page 14: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/14.jpg)
LESSON 2 – WEAK SCHEDULING
![Page 15: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/15.jpg)
SCHEDULING
UI
MVP
API FACADE
OLTP
Data Platform
OLAP
3rd PROVIDERS
![Page 16: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/16.jpg)
SCHEDULING
DATA PLATFORM
DATA PLATFORM
3rd PROVIDERS
SYSTEM EVENTS
DATA PROCESSING SCRIPT
OLAP
DATA PROCESSING SCRIPT
DATA PROCESSING SCRIPT
API FACADE
![Page 17: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/17.jpg)
DATA PROCESSING SCRIPT
SCHEDULING
DATA PLATFORM
3rd PROVIDERS
SYSTEM EVENTS
CROND
QUARTZ
…
OLAP
DATA PROCESSING SCRIPT
API FACADE
DATA PLATFORM
DATA PROCESSING SCRIPT
![Page 18: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/18.jpg)
SCHEDULING
UI
MVP
API FACADE
OLTP
Data Platform
OLAP
3rd PROVIDERS
WHAT CRONTAB?
![Page 19: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/19.jpg)
SCHEDULING
Use prod-ready schedulers
§ Airflow
§ Azkaban
§ Oozie
§ Jenkins?
What we gain
§ Identity control and audit
§ Job lineage, logging and troubleshooting
§ Tools to design workflows/DAGs
§ Fault-tolerance features (rerun etc.)
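The gains listed above (DAG design, reruns, an auditable job log) can be illustrated with a minimal sketch. It uses only the Python standard library and is not the API of Airflow, Azkaban or Oozie; `run_dag`, the task names and the retry count are hypothetical names chosen for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_dag(tasks, deps, max_attempts=3):
    """Run callables in dependency order, retrying each failed task.

    tasks: name -> zero-argument callable
    deps:  name -> set of upstream task names
    Returns a log of (task, attempt, status) tuples, a stand-in for job audit.
    """
    log = []
    for name in TopologicalSorter(deps).static_order():
        for attempt in range(1, max_attempts + 1):
            try:
                tasks[name]()
                log.append((name, attempt, "ok"))
                break
            except Exception as exc:
                log.append((name, attempt, f"failed: {exc}"))
        else:
            # Every attempt failed: stop the run so downstream tasks do not start.
            raise RuntimeError(f"{name} failed after {max_attempts} attempts")
    return log


# Usage: the "process" step fails once and succeeds on the automatic rerun.
calls = {"process": 0}


def flaky_process():
    calls["process"] += 1
    if calls["process"] == 1:
        raise IOError("temporary outage")


tasks = {"ingest": lambda: None, "process": flaky_process, "load": lambda: None}
deps = {"ingest": set(), "process": {"ingest"}, "load": {"process"}}
log = run_dag(tasks, deps)
```

A bare crontab entry gives none of this: no ordering between scripts, no rerun, and no record of which attempt succeeded.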
![Page 20: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/20.jpg)
LESSON 3 – MONOLITHIC DATA PLATFORM
![Page 21: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/21.jpg)
DATA PLATFORM
3rd PROVIDERS
SYSTEM EVENTS
Scheduler
OLAP
API FACADE
DATA PLATFORM
DATA PROCESSING SCRIPT
DATA PLATFORM
![Page 22: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/22.jpg)
DATA PROCESSING SCRIPT
3rd PROVIDERS
SYSTEM EVENTS
Scheduler
OLAP
DATA PLATFORM
WHY A MONOLITH AGAIN?
API FACADE
DATA PLATFORM
![Page 23: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/23.jpg)
DATA PLATFORM
DWH
ANALYTICS
REPORTING
DECOUPLING DATA PLATFORM
![Page 24: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/24.jpg)
LOAD
INGEST
DATA PLATFORM
PROCESS
Scheduler
DATA PLATFORM
3rd PROVIDERS
SYSTEM EVENTS
DWH
DATA LAKE DWH
DECOUPLING DATA PLATFORM
![Page 25: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/25.jpg)
LOAD
INGEST
DATA PLATFORM
PROCESS
Scheduler
3rd PROVIDERS
SYSTEM EVENTS
DWH
DATA LAKE DWH
LOAD INGEST PROCESS
DECOUPLING DATA PLATFORM
![Page 26: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/26.jpg)
ML TRAINING
DATA PLATFORM
ANALYTICS
OLAP
ML RUNNING
Scheduler
3rd PROVIDERS
SYSTEM EVENTS
DWH
DATA LAKE DWH
LOAD INGEST PROCESS
LOAD INGEST PROCESS
DECOUPLING DATA PLATFORM
![Page 27: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/27.jpg)
DATA PLATFORM
ML TRAINING
DATA PLATFORM
REPORTING
REPORTING ENGINE
ML RUNNING
Scheduler
REPORTING CACHE
REPORTING METADATA
SCHEDULED REPORTS
ANALYTICS
API FACADE
3rd PROVIDERS
SYSTEM EVENTS
OLAP
DWH
DATA LAKE DWH
LOAD INGEST PROCESS
LOAD INGEST PROCESS
DECOUPLING DATA PLATFORM
![Page 28: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/28.jpg)
DECOUPLING DATA PLATFORM
ARCHITECTURE DECISIONS
§ Raw data should be stored inside Data Lake
§ Introduce granular reusable and testable steps inside pipelines [ingest, validate, enrich, aggregate etc.]
§ Separate pipeline per vendor/feed
§ Introduce Data Lineage for easy troubleshooting
§ Separate concerns (Scalability, Fault Tolerance)
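The idea of granular, reusable and testable steps can be sketched as plain functions chained into a pipeline. This is a minimal illustration, not the speaker's implementation: the record layout (`user,creative_id` lines), the `campaigns` lookup and all function names are assumptions made for the example.

```python
from functools import partial, reduce


def ingest(lines):
    """Parse raw 'user,creative_id' lines into dict records."""
    return [dict(zip(("user", "creative_id"), line.split(","))) for line in lines]


def validate(records):
    """Keep only records that carry a numeric creative_id."""
    return [r for r in records if r.get("creative_id", "").isdigit()]


def enrich(records, campaigns):
    """Attach the campaign_id looked up from a (hypothetical) reference table."""
    return [{**r, "campaign_id": campaigns.get(r["creative_id"])} for r in records]


def aggregate(records):
    """Count records per (campaign_id, creative_id)."""
    counts = {}
    for r in records:
        key = (r["campaign_id"], r["creative_id"])
        counts[key] = counts.get(key, 0) + 1
    return counts


def run_pipeline(data, steps):
    """Feed the output of each step into the next one."""
    return reduce(lambda acc, step: step(acc), steps, data)


# Usage: one pipeline per vendor/feed is just a different list of steps.
campaigns = {"123": "1", "321": "1"}
steps = [ingest, validate, partial(enrich, campaigns=campaigns), aggregate]
result = run_pipeline(["u1,123", "u2,321", "u3,321", "broken"], steps)
```

Because every step is a pure function over records, each one can be unit-tested on its own and reused across vendor pipelines.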
![Page 29: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/29.jpg)
VENDOR-AGNOSTIC TECHNOLOGY STACK
§ Apache NiFi for data routing and ingestion
§ Apache Spark/Flink/Presto/Beam for processing
§ Kafka/Hive for Data Lake Storage
§ Hive/MemSQL for DWH
§ Vertica/Redshift/MemSQL/ClickHouse for OLAP
DECOUPLING DATA PLATFORM
![Page 30: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/30.jpg)
LESSON 4 – AGGREGATE IT
![Page 31: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/31.jpg)
INGEST
DWH
WHY DO THE REPORTS TAKE SO LONG TO RUN???
UI
MVP
API FACADE
OLTP
Data Platform
OLAP
3rd PROVIDERS
AGGREGATE IT
![Page 32: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/32.jpg)
COMMON PITFALLS
§ Direct access to Data Lake or cold DWH
DATA PLATFORM
ML TRAINING
INGEST
DATA PLATFORM
REPORTING
PROCESS
REPORTING ENGINE
ML RUNNING
Airflow
REPORTING CACHE
REPORTING METADATA
SCHEDULED REPORTS
ANALYTICS
API FACADE
3rd PROVIDERS
SYSTEM EVENTS
LOAD
DWH
DATA LAKE / DWH
![Page 33: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/33.jpg)
COMMON PITFALLS
§ Reports query RAW data
DATA PLATFORM
ML TRAINING
INGEST
DATA PLATFORM
REPORTING
PROCESS
REPORTING ENGINE
ML RUNNING
Airflow
REPORTING CACHE
REPORTING METADATA
SCHEDULED REPORTS
ANALYTICS
API FACADE
3rd PROVIDERS
SYSTEM EVENTS
OLAP
LOAD
DWH
DATA LAKE DWH
![Page 34: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/34.jpg)
INTRODUCE AGGREGATIONS
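impressions

| date | hour | user_cookie | creative_id |
|---|---|---|---|
| 05/21/19 | 03 | 4444444 | 123 |
| 05/21/19 | 03 | 5555555 | 321 |
| 05/21/19 | 03 | 6666666 | 321 |
| 05/21/19 | 04 | 7777777 | 567 |

campaigns

| creative_id | campaign_id |
|---|---|
| 123 | 1 |
| 321 | 1 |
| 567 | 2 |

performance_ad (impressions JOIN campaigns)

| date | hour | campaign_id | creative_id | impressions |
|---|---|---|---|---|
| 05/21/19 | 03 | 1 | 123 | 1 |
| 05/21/19 | 03 | 1 | 321 | 2 |
| 05/21/19 | 04 | 2 | 567 | 1 |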
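The join-then-count step behind `performance_ad` can be reproduced in a few lines. This is a sketch over the slide's sample rows using in-memory Python structures; on a real platform the same logic would typically be a SQL or Spark job, and the function name `build_performance_ad` is an assumption.

```python
from collections import Counter

# Sample rows copied from the slide: (date, hour, user_cookie, creative_id).
impressions = [
    ("05/21/19", "03", "4444444", 123),
    ("05/21/19", "03", "5555555", 321),
    ("05/21/19", "03", "6666666", 321),
    ("05/21/19", "04", "7777777", 567),
]
# creative_id -> campaign_id
campaigns = {123: 1, 321: 1, 567: 2}


def build_performance_ad(impressions, campaigns):
    """Join each impression to its campaign and count impressions per key."""
    counts = Counter()
    for date, hour, _user_cookie, creative_id in impressions:
        counts[(date, hour, campaigns[creative_id], creative_id)] += 1
    # Rows: (date, hour, campaign_id, creative_id, impressions)
    return [key + (n,) for key, n in sorted(counts.items())]


performance_ad = build_performance_ad(impressions, campaigns)
```

The aggregate drops `user_cookie`, so reports read three pre-joined rows instead of scanning and joining four raw ones; the gap grows with real traffic volumes.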
![Page 35: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/35.jpg)
INGESTION
VALIDATION
ENRICHMENT
BATCH 1
HDFS/HIVE/... RAW FACT TABLE
BATCH 2
AGGREGATOR
HDFS/HIVE/… AGGREGATION TABLE
BATCH 1 BATCH 2
INTRODUCE AGGREGATIONS
![Page 36: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/36.jpg)
INGESTION
VALIDATION
ENRICHMENT
BATCH 1
HDFS/HIVE/... RAW FACT TABLE
BATCH 2
AGGREGATOR
HDFS/HIVE/… AGGREGATION TABLE
BATCH 1 BATCH 2
BATCH 3
INTRODUCE AGGREGATIONS
![Page 37: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/37.jpg)
INGESTION
VALIDATION
ENRICHMENT
BATCH 1
HDFS/HIVE/... RAW FACT TABLE
BATCH 2
AGGREGATOR
HDFS/HIVE/… AGGREGATION TABLE
BATCH 1 BATCH 2
BATCH 3
BATCH 3
INTRODUCE AGGREGATIONS
![Page 38: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/38.jpg)
BREAK DOWN DOMAIN
DOMAIN DOMAIN
• Break down the domain into business-concerned areas
• Cover each area with a dedicated aggregation
• Example for a video platform:
  • Ad performance
  • Player performance
  • Video performance
  • Revenue performance
• Build once, reuse across multiple reports
![Page 39: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/39.jpg)
LESSON 5 – AVAILABILITY
![Page 40: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/40.jpg)
AVAILABILITY
UI
MVP
API FACADE
OLTP
Data Platform
OLAP
3rd PROVIDERS
![Page 41: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/41.jpg)
AVAILABILITY
UI
MVP
API FACADE
OLTP
Data Platform
OLAP
3rd PROVIDERS
WHERE ARE MY REPORTS?
![Page 42: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/42.jpg)
THINGS EASY TO MISS
AVAILABILITY
§ If possible, do not share infrastructure between the Data Platform and core services
§ Choose wisely between Kappa and Lambda architectures
§ Introduce effective monitoring
§ Know your data latency and design the solution around it
![Page 43: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/43.jpg)
THINGS EASY TO MISS
FAULT TOLERANCE
§ Every job should be failure-ready and retryable by design
§ Enable multiple attempts on the scheduler side
§ Use idempotent sinks
§ Implement backpressure: prefer pull over push, leverage Blob/S3/HDFS or Kafka
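Retries are only safe when the sink is idempotent. A minimal sketch of the combination, assuming an in-memory stand-in for a partitioned table; `IdempotentSink`, `run_with_retries` and the batch id format are hypothetical names for illustration.

```python
class IdempotentSink:
    """In-memory stand-in for a partitioned table: one partition per batch id."""

    def __init__(self):
        self.partitions = {}

    def write(self, batch_id, rows):
        # Overwrite rather than append, so replaying a batch cannot duplicate rows.
        self.partitions[batch_id] = list(rows)

    def row_count(self):
        return sum(len(rows) for rows in self.partitions.values())


def run_with_retries(job, max_attempts=3):
    """Call job() until it succeeds or the attempts are exhausted."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return job()
        except Exception as exc:
            last_error = exc
    raise RuntimeError("job failed after retries") from last_error


# Usage: the first attempt crashes after it has already written its output.
sink = IdempotentSink()
attempts = []


def job():
    sink.write("2019-05-21T03", [("123", 1), ("321", 2)])
    attempts.append(1)
    if len(attempts) == 1:
        raise ConnectionError("lost connection after write")
    return len(attempts)


attempts_used = run_with_retries(job)
```

With an append-only sink the second attempt would leave four rows; here the partition is simply rewritten and the table still holds two.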
![Page 44: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/44.jpg)
THINGS EASY TO MISS
EFFECTIVE MONITORING
§ Collect system and app-specific metrics
§ Measure data availability [in-rate, out-rate, lag]
  Bandar-Log: https://github.com/VerizonAdPlatforms/bandar-log/
§ Consider Datadog [local agents, dashboards, monitors, notes]
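The three data-availability metrics can be derived from two counter snapshots. The sketch below is a generic illustration and not how Bandar-Log computes them; the `Snapshot` fields and the function name are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    timestamp: float  # seconds
    produced: int     # total events written to the stream so far
    consumed: int     # total events read by the pipeline so far


def availability_metrics(prev, curr):
    """Compute in-rate, out-rate (events/second) and lag (events) between snapshots."""
    elapsed = curr.timestamp - prev.timestamp
    if elapsed <= 0:
        raise ValueError("snapshots must be in chronological order")
    return {
        "in_rate": (curr.produced - prev.produced) / elapsed,
        "out_rate": (curr.consumed - prev.consumed) / elapsed,
        "lag": curr.produced - curr.consumed,
    }


# Usage: over one minute the pipeline reads half as fast as data arrives.
metrics = availability_metrics(
    Snapshot(timestamp=0, produced=1000, consumed=900),
    Snapshot(timestamp=60, produced=1600, consumed=1200),
)
```

An in-rate persistently above the out-rate means the lag keeps growing, which is the signal worth alerting on before reports go stale.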
![Page 45: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/45.jpg)
DASHBOARD EXAMPLE [INGESTION]
![Page 46: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/46.jpg)
DASHBOARD EXAMPLE [AGGREGATIONS]
![Page 47: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/47.jpg)
LESSON 6 – DATA GOVERNANCE
![Page 48: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/48.jpg)
DATA GOVERNANCE
UI
MVP
API FACADE
OLTP
Data Platform
OLAP
3rd PROVIDERS
![Page 49: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/49.jpg)
UI
MVP
API FACADE
OLTP
Data Platform
OLAP
3rd PROVIDERS
I WILL FIND YOU EVEN IN THE AFTERLIFE!!!
DATA GOVERNANCE
![Page 50: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/50.jpg)
DATA GOVERNANCE CHECKLIST
Did I think about Personal Data Protection?
Did I think about Data Access Control?
Did I think about Data Eviction?
Did I think about Data Lineage?
Did I think about Data Quality?
Did I think about Data Inventory?
![Page 51: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/51.jpg)
PERSONAL DATA PROTECTION
§ Learn what Personally Identifiable Data (PID) is
§ Think twice before storing any PID
§ Anonymize data as early as possible in ETL and prefer anonymized data over PID wherever possible
§ Introduce an Anonymized Unique ID (AUID) and store the PID <-> AUID relationship separately
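One way to picture the AUID idea: events are rewritten early in ETL to carry only a random identifier, while the PID <-> AUID mapping lives in a separate store. The sketch keeps that store in memory; `AuidVault`, `anonymize` and the `email` field are hypothetical names, and a real mapping would sit in a restricted, separately secured database.

```python
import uuid


class AuidVault:
    """Holds the PID <-> AUID mapping; meant to live apart from the analytics data."""

    def __init__(self):
        self._pid_to_auid = {}

    def auid_for(self, pid):
        # A random AUID reveals nothing about the PID it replaces.
        if pid not in self._pid_to_auid:
            self._pid_to_auid[pid] = uuid.uuid4().hex
        return self._pid_to_auid[pid]

    def forget(self, pid):
        """Drop the mapping so stored AUIDs can no longer be linked to the person."""
        self._pid_to_auid.pop(pid, None)


def anonymize(event, vault, pid_field="email"):
    """Return a copy of the event with the PID field replaced by an AUID."""
    clean = {k: v for k, v in event.items() if k != pid_field}
    clean["auid"] = vault.auid_for(event[pid_field])
    return clean


# Usage: two events from the same person share an AUID but carry no email.
vault = AuidVault()
first = anonymize({"email": "a@example.com", "creative_id": 123}, vault)
second = anonymize({"email": "a@example.com", "creative_id": 321}, vault)
```

Keeping the mapping in one place also makes erasure requests cheap: deleting a single vault entry unlinks every downstream record.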
![Page 52: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/52.jpg)
DATA ACCESS CONTROL
§ Introduce IAM for components and developers inside the Data Lake and DWH
  Control access to PID and anonymized data
§ Introduce ACL for end users inside OLAP
  Leverage OLAP features to support ACL: per row, table, schema, database
![Page 53: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/53.jpg)
DATA EVICTION
§ Design data and applications with eviction enabled
§ Introduce a data retention policy and schedule cleanup jobs
§ Define separate retention policies for raw and aggregation tables
§ Document the retention policy
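A scheduled cleanup job mostly boils down to deciding which partitions have outlived their policy. A minimal sketch, assuming date-partitioned tables; the retention values and the function name are hypothetical and not taken from the talk.

```python
from datetime import date, timedelta

# Hypothetical policy: raw data is kept for a month, aggregates for a year.
RETENTION_DAYS = {"raw": 30, "aggregation": 365}


def partitions_to_evict(partitions, today):
    """Return the (table_kind, partition_date) pairs older than their retention."""
    expired = []
    for kind, day in partitions:
        if today - day > timedelta(days=RETENTION_DAYS[kind]):
            expired.append((kind, day))
    return expired


# Usage: only the 40-day-old raw partition is past its 30-day retention.
partitions = [
    ("raw", date(2019, 5, 21)),
    ("aggregation", date(2019, 5, 21)),
    ("raw", date(2019, 6, 15)),
]
expired = partitions_to_evict(partitions, today=date(2019, 6, 30))
```

Keeping the policy in one declared mapping doubles as documentation of what is retained and for how long.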
![Page 54: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/54.jpg)
DATA LINEAGE
§ Shit happens
§ Shit will happen, think about it in advance
RECOMMENDATIONS
§ Each ETL step should persist its output with a reasonable retention policy
§ Persist any application logs (Spark/Yarn, CMD apps, ETL, …)
§ Log any significant application decisions
§ Persist any provenance logs (NiFi, …)
![Page 55: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/55.jpg)
DATA QUALITY
§ Introduce data validation [even if it is undefined] and track validation issues
  § Schema errors (wrong type, missing mandatory field)
  § Semantic errors (unknown or poorly formatted IDs)
  § Business errors (certain business constraints, per-event or cross-event)
§ Track any errors and expose metrics
§ Track discrepancies and expose metrics
  § Discrepancy between raw and aggregation data
  § Discrepancy between real-time and batch
  § Discrepancy between vendor data
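The three error classes can be checked in order, with counters standing in for the exposed metrics. This is a sketch under stated assumptions: the event fields, the `KNOWN_CREATIVES` reference set and the hour-range rule are hypothetical examples, not rules from the talk.

```python
from collections import Counter

KNOWN_CREATIVES = {"123", "321", "567"}  # hypothetical reference data


def validate_event(event):
    """Return the first validation issue found, or None if the event is clean."""
    # Schema: mandatory fields must be present and have the expected type.
    if not isinstance(event.get("creative_id"), str) or not isinstance(event.get("hour"), int):
        return "schema_error"
    # Semantic: the ID must refer to something we know about.
    if event["creative_id"] not in KNOWN_CREATIVES:
        return "semantic_error"
    # Business: a hypothetical per-event constraint.
    if not 0 <= event["hour"] <= 23:
        return "business_error"
    return None


def validate_batch(events):
    """Split a batch into valid events and a Counter of outcomes to expose as metrics."""
    metrics = Counter()
    valid = []
    for event in events:
        issue = validate_event(event)
        metrics[issue or "valid"] += 1
        if issue is None:
            valid.append(event)
    return valid, metrics


# Usage: one clean event and one example of each error class.
valid, metrics = validate_batch([
    {"creative_id": "123", "hour": 3},
    {"creative_id": 123, "hour": 3},
    {"creative_id": "999", "hour": 3},
    {"creative_id": "321", "hour": 27},
])
```

Counting rejected events per class, rather than silently dropping them, is what makes later discrepancies between raw and aggregated data explainable.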
![Page 56: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/56.jpg)
DATA INVENTORY
§ Document how data is organized
§ Document where data is stored
§ Document what data is exported and where
§ Document what data is ingested and from where
§ Document as granularly as possible: per vendor, data source, ETL component etc.
![Page 57: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/57.jpg)
LESSON 7 – INTRODUCE DATA ENGINEERING
![Page 58: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/58.jpg)
![Page 59: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/59.jpg)
![Page 60: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/60.jpg)
![Page 61: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/61.jpg)
SHARING RESPONSIBILITY FOR DATA
Distinguish expertise
Involve Data Engineers to make Data Platform better and faster
![Page 62: Cowboy dating with big data-AI-Ukraine-2019 · intro –partner 1. intro –partner 2. НАГРАДЫpartner 2 •Медаль за kotlin •Полный кавалер spring& spring](https://reader036.fdocuments.us/reader036/viewer/2022070919/5fb877f5aa38fe39e13d3bde/html5/thumbnails/62.jpg)
THANK YOU