Challenges of a Multi-Tenant Kafka Service
Thomas Alex
Principal Program Manager
Microsoft
Introduction
Goals
Solution
Tenant model
Deployment architecture
Open Discussion
Siphon: Enterprise Data Bus
Near real-time
Compliant
No data dead-ends
Hyper scale
Reliable
Network effects
8 million events per second peak ingress
800 TB ingress per day (10 GB per second)
1,800 production Kafka brokers
450 topics
15 seconds 99th percentile latency
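The daily and per-second ingress figures can be cross-checked with a quick calculation. This is only a sketch: it assumes decimal units (1 TB = 10^12 bytes), which the slide does not specify, and it treats the per-second figure as a rounded daily average.

```python
# Back-of-the-envelope check of the ingress figures quoted on the slide.
TB = 10**12  # assumption: decimal terabytes; the slide does not say TB vs TiB
GB = 10**9
SECONDS_PER_DAY = 24 * 60 * 60

ingress_per_day_bytes = 800 * TB
avg_gb_per_sec = ingress_per_day_bytes / SECONDS_PER_DAY / GB

print(f"average ingress: {avg_gb_per_sec:.2f} GB/s")  # average ingress: 9.26 GB/s
```

Under these assumptions, 800 TB per day averages about 9.26 GB per second, which the slide rounds to 10 GB per second.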
SDK Collector
Siphon
connector
API
Management UI
Metadata DB
Customer: Major Car Manufacturer
Scenario: Connected Car Telematics
Data producers
Millions of cars
Routed via cloud gateway to Siphon endpoint
Data consumers
Spark streaming applications
Siphon compute forwards data to blob storage
UI
Backend
Source systems
Destination systems
Data producers
• Send data reliably
Customers
• Manage capacity
• Manage tenant/topic/subscription
• Pay for the service
Data consumers
• Consume data in NRT
Service owners
• Manage service with SLA
Managed service
Availability
Reliability
Isolation
Low cost
Self-service
Regulatory Compliance
Data sharing
Multiple instances, single tenant per instance
• Instance 1: Customer A
• Instance 2: Customer B
• Instance 3: Customer C
Single instance, multiple tenants per instance
• Instance: Customer A, Customer B, Customer C
Multiple instances, multiple tenants per instance
• Customer A, Customer B, and Customer C spread across two instances
Siphon Deployment Unit
• Ingress service (Collector)
• Kafka cluster
• Connector (HLC)
• Monitoring
Management Service
• Metadata
• Self-serve API
• Self-serve UI
Collector
HLC
API
Metadata DB
Tenant
• Principals (administrators, users)
• Resources: endpoint, topics, subscriptions
• Quota: storage capacity, throughput, threshold for auto-approval
• Default limits: topic capacity, retention, partitions
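The tenant model above could be represented roughly as follows. This is a minimal sketch, not Siphon's actual schema: all class names, field names, units, and sample values are hypothetical, and the auto-approval rule (small requests that fit within quota skip manual review) is an assumed reading of "threshold for auto-approval".

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Quota:
    storage_capacity_gb: int         # total storage the tenant may use
    throughput_mb_per_sec: int       # ingress throughput ceiling
    auto_approval_threshold_gb: int  # assumed: requests up to this size skip manual review


@dataclass
class TopicDefaults:
    capacity_gb: int
    retention_hours: int
    partitions: int


@dataclass
class Tenant:
    name: str
    administrators: List[str]
    users: List[str]
    endpoint: str
    quota: Quota
    default_limits: TopicDefaults
    topics: List[str] = field(default_factory=list)
    subscriptions: List[str] = field(default_factory=list)


def can_auto_approve(tenant: Tenant, requested_gb: int, allocated_gb: int) -> bool:
    """Assumed rule: approve automatically if the request is small and fits the quota."""
    fits_quota = allocated_gb + requested_gb <= tenant.quota.storage_capacity_gb
    return fits_quota and requested_gb <= tenant.quota.auto_approval_threshold_gb


tenant = Tenant(
    name="connected-car",
    administrators=["admin@example.com"],
    users=["analyst@example.com"],
    endpoint="https://siphon.example.com/connected-car",
    quota=Quota(storage_capacity_gb=1000, throughput_mb_per_sec=50,
                auto_approval_threshold_gb=100),
    default_limits=TopicDefaults(capacity_gb=50, retention_hours=72, partitions=8),
)
print(can_auto_approve(tenant, requested_gb=50, allocated_gb=900))   # True
print(can_auto_approve(tenant, requested_gb=200, allocated_gb=100))  # False
```

Keeping quota and default limits as separate records mirrors the slide's grouping: the quota bounds the tenant as a whole, while the defaults apply to each new topic.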
Tenant 1: Traffic Manager 1
Tenant 2: Traffic Manager 2
Tenant 3: Traffic Manager 3
Siphon DU 1: Collector, HLC
Siphon DU 2: Collector, HLC
Siphon DU 3: Collector, HLC
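One way to picture a per-tenant traffic manager in front of several deployment units (DUs) is the sketch below. It is illustrative only: the slide does not describe the routing policy, so the round-robin choice, the health flags, and the `TenantTrafficManager` class are all assumptions.

```python
from typing import List


class TenantTrafficManager:
    """Hypothetical per-tenant front door that picks a deployment unit (DU)."""

    def __init__(self, tenant: str, deployment_units: List[str]) -> None:
        if not deployment_units:
            raise ValueError("at least one deployment unit is required")
        self.tenant = tenant
        self._units = list(deployment_units)
        self._healthy = set(self._units)
        self._cursor = 0

    def mark_unhealthy(self, du: str) -> None:
        self._healthy.discard(du)

    def mark_healthy(self, du: str) -> None:
        if du in self._units:
            self._healthy.add(du)

    def route(self) -> str:
        """Return the next healthy DU in round-robin order (assumed policy)."""
        for _ in range(len(self._units)):
            du = self._units[self._cursor % len(self._units)]
            self._cursor += 1
            if du in self._healthy:
                return du
        raise RuntimeError(f"no healthy deployment unit for {self.tenant}")


tm = TenantTrafficManager("tenant-1", ["du-1", "du-2", "du-3"])
print([tm.route() for _ in range(3)])  # ['du-1', 'du-2', 'du-3']
tm.mark_unhealthy("du-2")
print([tm.route() for _ in range(2)])  # ['du-1', 'du-3']
```

Giving each tenant its own routing entry point means one tenant's traffic can be moved between DUs without touching the others, which is one plausible motivation for the layout in the diagram.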
Scalability
Underlying infra is IaaS
Isolation
Availability and Latency SLA
Regulatory compliance guarantees
Enterprise cloud depends on data security & privacy
Regulatory framework for certifications, e.g. SOC, FedRAMP, HIPAA
Data sharing
Manageability
Provisioning
Monitoring
Maintainability
Comments / Feedback
https://www.linkedin.com/in/tomalex/
Compliance regions: North America, South America, Europe, Asia Pacific
Go Local: Australia, Canada, India, Japan, United Kingdom
Sovereign: Germany, China, Government
Self-service
• Tenant creation & management
• Topic creation & management
• Topic health & data preview
• Subscription creation & management
AuthN
• Azure AD based for self-service API & UI
• Cert based for data producers and consumers
AuthZ
• Siphon metadata used to authorize provisioning & management (tenants, topics, etc.)
• Kafka ACLs for topic-level access control
Throttling
• EventServer throttles based on quota limit
Monitoring
• Operational metrics in a single system (MDM) for monitoring and alerting
Data quality
• Audit Trail system for e2e latency and completeness monitoring
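Quota-based throttling of the kind listed above is often built on a token bucket. The sketch below shows that general technique and is not the EventServer's actual algorithm, which the slides do not describe; the `TokenBucket` class, its parameters, and the injected clock are hypothetical.

```python
import time
from typing import Callable


class TokenBucket:
    """Token-bucket limiter: refills at `rate_per_sec`, holds at most `burst` tokens."""

    def __init__(self, rate_per_sec: float, burst: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate_per_sec
        self.capacity = burst
        self._tokens = burst
        self._clock = clock
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True and consume tokens if the request fits the quota."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill in proportion to elapsed time, capped at the bucket capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        if cost <= self._tokens:
            self._tokens -= cost
            return True
        return False


# A fake clock keeps the example deterministic.
fake_time = [0.0]
bucket = TokenBucket(rate_per_sec=2, burst=2, clock=lambda: fake_time[0])
print(bucket.allow(), bucket.allow(), bucket.allow())  # True True False
fake_time[0] = 0.5  # half a second later, one token has been refilled
print(bucket.allow())  # True
```

In a multi-tenant setting, one bucket per tenant (sized from that tenant's throughput quota) would keep a noisy producer from consuming capacity reserved for others.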