Kubernetes "Ubernetes" Cluster Federation by Quinton Hoole (Google, Inc) Huawei Tech Talk 2016-05-26...

Federation of Kubernetes Clusters (a.k.a. "Übernetes")
Huawei Kubernetes Meetup, 2016-05-26

Quinton Hoole <[email protected]>
Staff Software Engineer - Google
quinton_hoole@github

Google has beeeg data centers... but you know that already.

Images by Connie Zhou

But we also have rather a lot of them...

Treating these differently can have benefits...

[Diagram: a single Kubernetes cluster: users reach the control plane (UI, CLI, API) which schedules containers onto servers within one cluster / data center / availability zone]

[Diagram: Übernetes federation: users interact with one control plane (UI, CLI, API) that spans multiple Kubernetes clusters, in cloud and on premise; all you really care about is the API and your containers]

Why is this interesting?

Reason 1: High Availability

• Cloud providers have outages, yes, but...
• Has one of your application software upgrades ever gone terribly wrong?
• How about infrastructure upgrades (auth systems? quota? data store?)
• How about a fat-fingered config change?
• There are several interesting variants:
  • Multiple availability zones?
  • Multiple cloud providers?

[Diagram: a cross-cluster load balancer routes your paying customer's traffic to Cluster 1, Cluster 2 or Cluster 3]

Reason 2: Application Migration

• Migrating applications between clusters is tedious and error-prone if done manually.
• Much like software upgrades, you *can* script them, but (K)ubernetes just does it quicker/safer/better.
• Now with rollback too!
• On-premise ↔ Cloud
• Amazon ↔ Google :-)
• ...

[Diagram: Ubernetes migrating an application from an on-premise cluster to an in-cloud cluster ("Migrate: On Premise → Cloud"), or to a different cloud provider]

Reason 3: Policy Enforcement

• Some data must be stored and processed within specified political jurisdictions, by law.

• Some software/data must be on premise and air-gapped, by company policy.

• Some business units get to use the expensive gear, some don't.

• Auditing is also a big deal, so funnelling all operations through a central control point makes this easier.

[Diagram: Ubernetes UI spanning a U.S. cloud cluster, an E.U. cloud cluster, and an on-premise cluster]

Reason 4: Vendor Lock-in Avoidance

• Make it easy to migrate applications between cloud providers.

• Run the same app on multiple cloud providers and choose the best one for your:
  • workload characteristics
  • budget
  • performance requirements
  • availability requirements

[Diagram: Ubernetes UI spanning Kubernetes on GCE, Kubernetes on AWS, and Kubernetes on-premise]

Reason 5: Capacity Overflow

• Make intelligent placement decisions:
  • Utilization
  • Cost
  • Performance

[Diagram: a user tells Ubernetes "run my stuff"; workloads overflow from the preferred cloud provider to another cloud provider or the on-premise cluster]

"OK, I'm sold. Where's the catch?"

Federation comes with some challenges...

[Diagram: clusters spread across Provider 1 (Zone A, Zone B, Zone D) and Provider 2 (Zone C)]

● Different bandwidth charges, latency, throughput and reliability

● Different service discovery (but DNS!)

● Consolidated monitoring & alerting

Cross-cluster load balancing

• Geographically aware DNS ("Geo-DNS") gets external clients to the "closest" healthy cluster

• Between clusters we don't need to rely on geo-DNS. Use CNAMEs to redirect to:

• Same Zone > Same Region > Same Continent

• Standard Kubernetes service load balancing within each cluster.

• Can be extended to divert traffic away from "healthy-but-saturated" clusters.
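The preference ordering above can be sketched as a simple scoring function. This is an illustrative sketch, not the actual Ubernetes implementation; the cluster records, field names and health/saturation flags are all invented for the example.

```python
# Hypothetical sketch: pick a redirect target for inter-cluster traffic,
# preferring Same Zone > Same Region > Same Continent, while skipping
# unhealthy or saturated clusters. All data structures are invented.

def proximity(local, remote):
    """Lower is better: 0 same zone, 1 same region, 2 same continent, 3 other."""
    if remote["zone"] == local["zone"]:
        return 0
    if remote["region"] == local["region"]:
        return 1
    if remote["continent"] == local["continent"]:
        return 2
    return 3

def pick_target(local, clusters):
    # Divert traffic away from unhealthy (and, as noted above, saturated) clusters.
    candidates = [c for c in clusters if c["healthy"] and not c["saturated"]]
    if not candidates:
        return None
    return min(candidates, key=lambda c: proximity(local, c))

local = {"zone": "us-central1-a", "region": "us-central1", "continent": "NA"}
clusters = [
    {"name": "c1", "zone": "us-central1-a", "region": "us-central1",
     "continent": "NA", "healthy": False, "saturated": False},
    {"name": "c2", "zone": "us-central1-b", "region": "us-central1",
     "continent": "NA", "healthy": True, "saturated": False},
    {"name": "c3", "zone": "europe-west1-b", "region": "europe-west1",
     "continent": "EU", "healthy": True, "saturated": False},
]
print(pick_target(local, clusters)["name"])  # same-region c2 wins over c3
```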

Cross-cluster service discovery

• DNS + Kubernetes cluster-local service discovery.

• Can default to cluster-local with failover to remote clusters.

Location affinity

• Strictly coupled pods/applications
  • High bandwidth requirements
  • Low latency requirements
  • High fidelity requirements
  • Cannot easily span clusters
• Loosely coupled
  • Opposite of above
  • Relatively easily distributed across clusters
• Preferentially coupled
  • Strongly coupled, but can be migrated piecemeal

Location affinity continued...

• Also negative affinity
  • Don't run my replicas in the same failure domain (host/rack/zone)
• Lots of interesting points on the continuum:
  • Same host
  • Same rack
  • Same zone
  • Same metro region
  • Same sub-continent
• Absolute location affinity
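The negative-affinity idea above amounts to a simple check at a chosen point on the continuum. A minimal sketch, assuming invented placement records tagged with their host/rack/zone:

```python
# Illustrative sketch of negative (anti-)affinity: reject placements that put
# two replicas of the same app into the same failure domain. The records and
# field names are hypothetical, chosen only to show the check.
from collections import Counter

def violates_anti_affinity(placements, domain="zone"):
    """placements: one dict per replica, tagged with its host/rack/zone.
    Returns True if any failure domain at that level holds more than one replica."""
    counts = Counter(p[domain] for p in placements)
    return any(n > 1 for n in counts.values())

replicas = [
    {"host": "h1", "rack": "r1", "zone": "us-central1-a"},
    {"host": "h2", "rack": "r2", "zone": "us-central1-b"},
    {"host": "h3", "rack": "r2", "zone": "us-central1-c"},
]
print(violates_anti_affinity(replicas, "zone"))  # False: all zones differ
print(violates_anti_affinity(replicas, "rack"))  # True: rack r2 holds two replicas
```

The same function covers any point on the continuum (host, rack, zone, ...) just by changing the `domain` argument.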

Cross-cluster monitoring and auditing...

• "Cluster per tab" might suffice for small numbers of clusters

• Some monitoring solutions provide stronger integration and global summarization

Cluster Federation - The Implementation...

API Compatible with Kubernetes

• Less new stuff to learn
• Can learn incrementally, as you need new functionality
• Analogous argument applies to existing automation systems (PaaS etc.)
  • These can be ported to Ubernetes relatively easily
• All Kubernetes entities are "federatable"

[Diagram: a client says "run my stuff" to the same API, whether it is served by Ubernetes or by a single Kubernetes cluster; applications run underneath either way]

State and control resides in underlying clusters (for the most part)

• Better scalability
  • Kubernetes scales with the number of nodes per cluster (<10,000)
  • Ubernetes scales with the number of clusters (~100)
• Better fault isolation
  • Kubernetes clusters fail independently of Ubernetes

[Diagram: the Ubernetes API layered over multiple Kubernetes clusters, each with its own API server, replication controllers and state]

Similar control loops to Kubernetes

• Drive current state -> desired state
  • But per-cluster state, not per-node or per-pod state
• Observed state is the truth
• Recurring pattern in the system
• Examples:
  • ReplicationController
  • Service

[Diagram: observe -> diff -> act reconciliation loop]
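The observe -> diff -> act pattern can be sketched in a few lines. This is a toy model, not the real controller code: the cluster APIs are faked with dicts, where the real system would make Kubernetes API calls per cluster.

```python
# Minimal sketch of the observe -> diff -> act reconciliation pattern,
# at per-cluster granularity (not per node or per pod).

def observe(clusters):
    # Observed state is the truth: ask each cluster what it is actually running.
    return {name: api["replicas"] for name, api in clusters.items()}

def diff(desired, observed):
    # Compute per-cluster deltas between desired and observed state.
    return {name: desired[name] - observed.get(name, 0) for name in desired}

def act(clusters, deltas):
    # Drive current state toward desired state.
    for name, delta in deltas.items():
        if delta != 0:
            clusters[name]["replicas"] += delta  # scale up or down

desired = {"cluster-1": 2, "cluster-2": 2, "cluster-3": 2}
clusters = {"cluster-1": {"replicas": 2},
            "cluster-2": {"replicas": 0},   # lost its replicas
            "cluster-3": {"replicas": 3}}   # one too many

act(clusters, diff(desired, observe(clusters)))
print(observe(clusters))  # {'cluster-1': 2, 'cluster-2': 2, 'cluster-3': 2}
```

In the real system this loop runs continuously, so clusters that fail and recover are re-reconciled without operator intervention.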

Modularity

Loose coupling is a goal everywhere:
• simpler
• composable
• extensible

Code-level plugins where possible

Multi-process where possible

Isolate risk by interchangeable parts

Examples:
• MigrationController
• Scheduler

Example: Federated RC

$ kubectl create -f my-service-rc.yaml --context="federation-1"

where my-service-rc.yaml contains the following:

kind: ReplicationController
metadata:
  labels:
    run: my-service
  name: my-service
  namespace: my-namespace
spec:
  replicas: 6
  selector:
    run: my-service
  template:
    metadata:
      labels:
        run: my-service
    spec:
      containers: .... [the usual stuff]


Example: Federated RC

$ ./kubectl get -o yaml rc my-service --context="cluster-1"

kind: ReplicationController
metadata:
  labels:
    run: my-service
  name: my-service
  namespace: my-namespace
  selfLink: /api/v1/namespaces/my-namespace/replicationcontrollers/my-service
  uid: 86542109-9948-11e5-a38c-42010af00002
spec:
  replicas: 2
  ...
  template:
    ...
status:
  replicas: 2
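The federated RC asked for 6 replicas, and each of the three member clusters ended up with 2. One way to picture the spreading is an even split with remainders, sketched below; this is an invented illustration of the even-spread case, not the actual federation scheduler, which can also weight clusters by capacity, cost and policy.

```python
# Hypothetical sketch: spread a federated replica count evenly across
# member clusters, handing out remainders one at a time.

def spread(total, clusters):
    base, extra = divmod(total, len(clusters))
    # The first `extra` clusters each receive one additional replica.
    return {c: base + (1 if i < extra else 0) for i, c in enumerate(clusters)}

print(spread(6, ["cluster-1", "cluster-2", "cluster-3"]))
# {'cluster-1': 2, 'cluster-2': 2, 'cluster-3': 2}
print(spread(7, ["cluster-1", "cluster-2", "cluster-3"]))
# {'cluster-1': 3, 'cluster-2': 2, 'cluster-3': 2}
```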


Example: Federated Service

$ kubectl create -f my-service.yaml --context="federation-1"

where my-service.yaml contains the following:

kind: Service
metadata:
  labels:
    run: my-service
  name: my-service
  namespace: my-namespace
spec:
  ports:
  - port: 2379
    protocol: TCP
    targetPort: 2379
    name: client
  - port: 2380
    protocol: TCP
    targetPort: 2380
    name: peer
  selector:
    run: my-service
  type: LoadBalancer


Example: Federated Service

$ kubectl get -o yaml service my-service --context="cluster-1"

apiVersion: v1
kind: Service
metadata:
  labels:
    run: my-service
  name: my-service
  ...
  selfLink: /api/v1/namespaces/my-namespace/services/my-service
  uid: 33bfc927-93cd-11e5-a38c-42010af00002
spec:
  clusterIP: 10.0.153.185
  ports:
  ...
  selector:
    run: my-service
  type: LoadBalancer
status:
  loadBalancer:
    ingress:
    - ip: 104.197.117.10


Example: Federated Service

$ kubectl get -o yaml service my-service --context="federation-1"

apiVersion: v1
kind: Service
metadata:
  creationTimestamp: 2015-11-25T23:35:23Z
  labels:
    ...
  selfLink: /api/v1/namespaces/my-namespace/services/my-service
  uid: 33bfc927-93cd-11e5-a38c-42010af00007
spec:
  clusterIP:
  ports:
    ....
  selector:
    run: my-service
  type: LoadBalancer
status:
  loadBalancer:
    ingress:
    - hostname: my-service.my-namespace.my-federation.my-domain.com


Federated Services

Note that the federated service:

1. Is API-compatible with a vanilla Kubernetes service.
2. Has no clusterIP (as it is cluster-independent).
3. Has a federation-wide load balancer hostname.

What happened?

1. Underlying Kubernetes Services exist (one per cluster).
2. Ubernetes has also created a DNS name (e.g. on Google Cloud DNS or AWS Route 53, depending on configuration) which provides load balancing across all of those services. For example, in a very basic configuration:

$ dig +noall +answer my-service.my-namespace.my-federation.my-domain.com

my-service.my-namespace.my-federation.my-domain.com 180 IN A 104.197.117.10

my-service.my-namespace.my-federation.my-domain.com 180 IN A 104.197.74.77

my-service.my-namespace.my-federation.my-domain.com 180 IN A 104.197.38.157


Federated Services

Ubernetes also configures the local DNS servers (SkyDNS) in each Kubernetes cluster to preferentially return the local clusterIP for the service in that cluster, with other clusters' external service IPs (or a global load-balanced IP) also configured for failover purposes:

$ dig +noall +answer my-service.my-namespace.my-federation.my-domain.com

my-service.my-namespace.my-federation.my-domain.com 180 IN A 10.0.153.185

my-service.my-namespace.my-federation.my-domain.com 180 IN A 104.197.74.77

my-service.my-namespace.my-federation.my-domain.com 180 IN A 104.197.38.157

Ubernetes Global Service Health Checking monitors the health of the service endpoints, and automatically removes unhealthy endpoints from the DNS record.
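The health-checking behavior just described boils down to pruning failed endpoints from the record set. A minimal sketch, with the probe results faked as a dict; a real checker would run HTTP/TCP probes and then update Google Cloud DNS or AWS Route 53 through their APIs.

```python
# Hypothetical sketch: drop unhealthy endpoints from a federated DNS
# record set, as the global service health checker is described doing.

def prune_unhealthy(dns_records, health):
    """dns_records: {dns_name: [ips]}; health: {ip: bool} from probes.
    Returns the record set with unhealthy (or unknown) IPs removed."""
    return {name: [ip for ip in ips if health.get(ip, False)]
            for name, ips in dns_records.items()}

records = {"my-service.my-namespace.my-federation.my-domain.com":
           ["104.197.117.10", "104.197.74.77", "104.197.38.157"]}
health = {"104.197.117.10": True,
          "104.197.74.77": False,   # this cluster's endpoint failed its probe
          "104.197.38.157": True}

print(prune_unhealthy(records, health))
```

After pruning, clients resolving the federated name only ever receive healthy endpoints, which is what makes the failover behavior above work.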


Federation status & plans

Federation Lite (single cluster, multiple zones)
• Released in v1.2, Q1 2016
• Available in GKE in Q2 2016

Federation Proper (multiple clusters, federated)
• Beta in v1.3, Q2 2016 (limited to Services and cross-cluster Service discovery)
• v1.4, Q3 2016: extended to the remaining API objects (Pods, ReplicaSets, Secrets, ResourceQuotas, ConfigMaps, Deployments, DaemonSets, Ingress...)

PaaSes and Distros
• RedHat OpenShift, CoreOS Tectonic, RedHat Atomic...
• ...watch this space...

I want more!

• Draft PRD
• Design:
  • docs/design/federated-services.md
  • docs/design/federation-phase-1.md
• Code:
  • Ubernetes Task List GitHub issue
• Legacy Requirements doc:
  • tinyurl.com/ubernetesv2
• Special interest group:
  • groups.google.com/forum/kubernetes-sig-federation
• Working Group Notes:
  • tinyurl.com/ubernetes-wg-notes

[email protected]
quinton_hoole@github
