Ignacy Kowalczyk

48
Kubernetes: lessons learnt from cluster management at Google Ignacy Kowalczyk Engineering Manager Google

Transcript of Ignacy Kowalczyk

Page 1: Ignacy Kowalczyk

Kubernetes: lessons learnt from cluster management at Google

Ignacy

Kowalczyk

Engineering

Manager

Google

Page 2: Ignacy Kowalczyk

Image by Connie Zhou

For the past 15 years, Google has been building out the world’s fastest, most powerful, highest quality cloud infrastructure on the planet.

Page 3: Ignacy Kowalczyk

job hello_world = {

runtime = { cell = 'ic' } // Cell (data center) to run in

binary = '.../hello_world_webserver' // Program to run

args = { port = '%port%' } // Command line parameters

requirements = { // Resource requirements (optional)

ram = 100M

disk = 100M

cpu = 0.1

}

replicas = 10000 // Number of tasks

}

User view

Page 4: Ignacy Kowalczyk

Google Cloud PlatformImages by Connie

Zhou

Google has been developing and using containers to manage our applications for over 10 years.

Page 5: Ignacy Kowalczyk

Google Cloud Platform

Everything at Google runs in containers:

• Gmail, Web Search, Maps, ...• MapReduce, batch, ...• GFS, Colossus, ...• Even Google’s Cloud Platform:

our VMs run in containers!

We launch over 2 billion containers per week

Page 6: Ignacy Kowalczyk

What justhappened?

web browsers

BorgMaster

link shard

UI shardBorgMaster

link shard

UI shardBorgMaster

link shard

UI shardBorgMaster

link shard

UI shard

Cell

Scheduler

borgcfg web browsers

scheduler

Borglet Borglet Borglet Borglet

BorgMaster

link shard

read/UI shard

Config file

persistent store (Paxos)

Binary

User view

Page 7: Ignacy Kowalczyk

What justhappened?

web browsers

BorgMaster

link shard

UI shardBorgMaster

link shard

UI shardBorgMaster

link shard

UI shardBorgMaster

link shard

UI shard

Cell

Scheduler

borgcfg web browsers

scheduler

Borglet Borglet Borglet Borglet

BorgMaster

link shard

read/UI shard

Config file

persistent store (Paxos)

Binary

User view

Page 8: Ignacy Kowalczyk

What justhappened?

web browsers

BorgMaster

link shard

UI shardBorgMaster

link shard

UI shardBorgMaster

link shard

UI shardBorgMaster

link shard

UI shard

Cell

Scheduler

borgcfg web browsers

scheduler

Borglet Borglet Borglet Borglet

BorgMaster

link shard

read/UI shard

Config file

persistent store (Paxos)

Binary

User view

Page 9: Ignacy Kowalczyk

What justhappened?

web browsers

BorgMaster

link shard

UI shardBorgMaster

link shard

UI shardBorgMaster

link shard

UI shardBorgMaster

link shard

UI shard

Cell

Scheduler

borgcfg web browsers

scheduler

Borglet Borglet Borglet Borglet

BorgMaster

link shard

read/UI shard

Config file

persistent store (Paxos)

Binary

User view

Page 10: Ignacy Kowalczyk

What justhappened?

web browsers

BorgMaster

link shard

UI shardBorgMaster

link shard

UI shardBorgMaster

link shard

UI shardBorgMaster

link shard

UI shard

Cell

Scheduler

borgcfg web browsers

scheduler

Borglet Borglet Borglet Borglet

BorgMaster

link shard

read/UI shard

Config file

persistent store (Paxos)

Binary

User view

Page 11: Ignacy Kowalczyk

What justhappened?

web browsers

BorgMaster

link shard

UI shardBorgMaster

link shard

UI shardBorgMaster

link shard

UI shardBorgMaster

link shard

UI shard

Cell

Scheduler

borgcfg web browsers

scheduler

Borglet Borglet Borglet Borglet

BorgMaster

link shard

read/UI shard

Config file

persistent store (Paxos)

Binary

User view

Page 12: Ignacy Kowalczyk

Hello world!

Hello world!

Hello world!

Hello world!Hello

world! Hello world! Hello

world!

Hello world!

Hello world!

Hello world!

Hello world!

Hello world!

Hello world!

Hello world!

Hello world!

Hello world!

Hello world!Hello world!

Hello world!

Hello world!

Hello world!

Hello world!

Hello world! Hello

world!

Hello world!

Hello world!

Hello world!

Image by Connie Zhou

User view

Hello world!

Hello world!

Hello world! Hello

world!

Hello world! Hello

world!

Hello world!

Hello world!

Hello world!

Hello world!

Hello world! Hello

world!

Hello world! Hello

world!

Hello world!

Hello world!

Hello world!

Hello world!

Hello world! Hello

world!

Hello world! Hello

world!

Hello world!

Hello world!

Page 13: Ignacy Kowalczyk

task-eviction rates and causes

Failures

Page 14: Ignacy Kowalczyk

Images by Connie Zhou

A 2000-machine service will have >10 task exits per dayThis is not a problem: it's normal

Failures

Page 15: Ignacy Kowalczyk

Google Cloud Platform

Pets Cattle

Failures

Page 16: Ignacy Kowalczyk

Advanced bin-packing algorithms

Experimental placement of production VM workload, July 2014

Efficiency

stranded resourcesavailable resourcesone

machine

Page 17: Ignacy Kowalczyk

machine image locked into a

platform

Plain IaaS downsides:Not portable & Opaque

Hypervisor

Guest environment

app code

libraries

guest kernel

Efficiency

Page 18: Ignacy Kowalczyk

Plain IaaS downsides:No Isolation

Hypervisor

Guest environment

app code

libraries

guest kernel

dependency???app code

Efficiency

Page 19: Ignacy Kowalczyk

Plain IaaS downsides:Little Reuse

Hypervisor

Guest environment

app code

libraries

guest kernel

Guest environment

app code

libraries

guest kernel

Guest environment

app code

libraries

guest kernelredundant

Efficiency

Page 20: Ignacy Kowalczyk

Containers create a better abstraction layer

Hypervisor

Guest environment

app code

libraries

guest kernel

cut here

Efficiency

Page 21: Ignacy Kowalczyk

Node environment

Much better: Portable, isolated, static app environments

Hypervisor

node kernel

app code

libraries

app code

libraries

app code

libraries

container 1 container 2 container 3

Efficiency

Page 22: Ignacy Kowalczyk

Key observation from Google’s experience:

Page 23: Ignacy Kowalczyk

A datacenter is not a collection of computers,a datacenter is a computer.

Page 24: Ignacy Kowalczyk

Google Cloud Platform

Kubernetes

Greek for “Helmsman”; also the root of the words “governor” and “cybernetic”

• Runs and manages containers

• Inspired by Borg

• Runs on VMs and bare metal

• 100% Open source, written in Go

Run applications on clustersnot processes on machines

Page 25: Ignacy Kowalczyk

Google Cloud Platform

Kubernetes

Getting startedhttp://kubernetes.io/docs/hellonode/

Page 26: Ignacy Kowalczyk

Google Cloud Platform

Containers

Page 27: Ignacy Kowalczyk

Google confidential │ Do not distribute

Docker containers

● Isolation - OS-level virtualization features:○ cgroups - access to resources (CPU, RAM, I/O○ namespaces - isolated filesystems, networking○ Copy-on-Write file systems

● Packaging○ Repositories of images○ Overlay file systems

Page 28: Ignacy Kowalczyk

Google Cloud Platform

Pods

Page 29: Ignacy Kowalczyk

Google Cloud Platform

Pods

Small group of tightly coupled containers & volumes

The atom of scheduling & placement

Shared namespaces• shared IP address & localhost• shared IPC, etc.• shared volumes

Example: data puller & web server

ConsumersContent Manager

File Puller

Web Server

Volume

Pod

Page 30: Ignacy Kowalczyk

Google Cloud Platform

Labels & Selectors

Page 31: Ignacy Kowalczyk

Google Cloud Platform

Arbitrary metadata

Attached to any API object

Generally represent identity

Queryable by selectors• think SQL ‘select ... where ...’

Labels

Page 32: Ignacy Kowalczyk

Google Cloud Platform

App: MyApp

Phase: prod

Role: FE

App: MyApp

Phase: test

Role: FE

App: MyApp

Phase: prod

Role: BE

App: MyApp

Phase: test

Role: BE

Selectors

Page 33: Ignacy Kowalczyk

Google Cloud Platform

App: MyApp

Phase: prod

Role: FE

App: MyApp

Phase: test

Role: FE

App: MyApp

Phase: prod

Role: BE

App: MyApp

Phase: test

Role: BE

App = MyApp

Selectors

Page 34: Ignacy Kowalczyk

Google Cloud Platform

App: MyApp

Phase: prod

Role: FE

App: MyApp

Phase: test

Role: FE

App: MyApp

Phase: prod

Role: BE

App: MyApp

Phase: test

Role: BE

App = MyApp, Role = FE

Selectors

Page 35: Ignacy Kowalczyk

Google Cloud Platform

App: MyApp

Phase: prod

Role: FE

App: MyApp

Phase: test

Role: FE

App: MyApp

Phase: prod

Role: BE

App: MyApp

Phase: test

Role: BE

App = MyApp, Role = BE

Selectors

Page 36: Ignacy Kowalczyk

Google Cloud Platform

Selectors

App: MyApp

Phase: prod

Role: FE

App: MyApp

Phase: test

Role: FE

App: MyApp

Phase: prod

Role: BE

App: MyApp

Phase: test

Role: BE

App = MyApp, Phase = prod

Page 37: Ignacy Kowalczyk

Google Cloud Platform

App: MyApp

Phase: prod

Role: FE

App: MyApp

Phase: test

Role: FE

App: MyApp

Phase: prod

Role: BE

App: MyApp

Phase: test

Role: BE

App = MyApp, Phase = test

Selectors

Page 38: Ignacy Kowalczyk

Google Cloud Platform

ReplicationControllers

Page 39: Ignacy Kowalczyk

Google Cloud Platform

ReplicationControllers

A simple control loop

Has 1 job: ensure N copies of a pod• if too few, start some• if too many, kill some• grouped by a selector

Cleanly layered on top of the core• all access is by public APIs

Replicated pods are fungible• No implied order or identity

ReplicationController- name = “my-rc”- selector = {“App”: “MyApp”}- podTemplate = { ... }- replicas = 4

API Server

How many?

3

Start 1 more

OK

How many?

4

Page 40: Ignacy Kowalczyk

Google Cloud Platform

Services

Page 41: Ignacy Kowalczyk

Google Cloud Platform

Services

A group of pods that work together• grouped by a selector

Gets a stable virtual IP and port• sometimes called the service portal• also a DNS name

Hides complexity

Client

Virtual IP

Page 42: Ignacy Kowalczyk

Google Cloud Platform

Networking

Page 43: Ignacy Kowalczyk

Google Cloud Platform

Kubernetes networking

A: 172.16.1.1

3306

B: 172.16.1.2

80

9376

11878SNAT

SNATC: 172.16.1.1

8000

Page 44: Ignacy Kowalczyk

Google Cloud Platform

Kubernetes networking

A: 172.16.1.1

3306

B: 172.16.1.2

80

9376

11878SNAT

SNATC: 172.16.1.1

8000REJECTED

Page 45: Ignacy Kowalczyk

Google Cloud Platform

Kubernetes networking

10.1.1.0/24

10.1.1.1

10.1.1.2

10.1.2.0/2410.1.2.1

10.1.3.0/24

10.1.3.1

Page 46: Ignacy Kowalczyk

Design principles

Declarative > imperative: State your desired results, let the system actuate

Control loops: Observe, rectify, repeat

Network-centric: IP addresses are cheap

Cattle > Pets: Manage your workload in bulk

Page 47: Ignacy Kowalczyk

Borg contributors

Core: Abhishek Rai, Abhishek Verma, Andy Zheng, Ashwin Kumar, Ben Smith, Beng-Hong Lim, Bin Zhang, Bolu Szewczyk, Brad Strand, Brian Budge, Brian Grant, Brian Wickman, Chengdu Huang, Chris Colohan, Cliff Stein, Cynthia Wong, Daniel Smith, Dave Bort, David Oppenheimer, David Wall, Divyesh Shah, Dawn Chen, Eric Haugen, Eric Tune, Eric Wilcox, Ethan Solomita, Gaurav Dhiman, Geeta Chaudhry, Greg Roelofs, Grzegorz Czajkowski, James Eady, Jarek Kusmierek, Jaroslaw Przybylowicz, Jason Hickey, Javier Kohen, Jeff Dean, Jeremy Dion, Jeremy Lau, Jerzy Szczepkowski, Joe Hellerstein, John Wilkes, Jonathan Wilson, Joso Eterovic, Jutta Degener, Kai Backman, Kamil Yurtsever, Ken Ashcraft, Kenji Kaneda, Kevan Miller, Kurt Steinkraus, Leo Landa, Liza Fireman, Madhukar Korupolu, Maricia Scott, Mark Logan, Mark Vandevoorde, Markus Gutschke, Matt Sparks, Maya Haridasan, Michael Abd-El-Malek, Michael Kenniston, Ming-Yee Iu, Monika Henzinger, Mukesh Kumar, Nate Calvin, Onufry Wojtaszczyk, Olcan Sercinoglu, Paul Menage, Patrick Johnson, Pavanish Nirula, Pedro Valenzuela, Percy Liang, Piotr Witusowski, Praveen Kallakuri, Rafal Sokolowski, Rajmohan Rajaraman, Richard Gooch, Rishi Gosalia, Rob Radez, Robert Hagmann, Robert Jardine, Robert Kennedy, Rohit Jnagal, Roy Bryant, Rune Dahl, Scott Garriss, Scott Johnson, Sean Howarth, Sheena Madan, Smeeta Jalan, Stan Chesnutt, Temo Arobelidze, Tim Hockin, Todd Wang, Tomasz Blaszczyk, Tomasz Wozniak, Tomek Zielonka, Victor Marmol, Vish Kannan, Vrigo Gokhale, Walfredo Cirne, Walt Drummond, Weiran Liu, Xiaopan Zhang, Xiao Zhang, Ye Zhao, and Zohaib Maya.SRE: Adam Rogoyski, Alex Milivojevic, Anil Das, Cody Smith, Cooper Bethea, Folke Behrens, Matt Liggett, James Sanford, John Millikin, Matt Brown, Miki Habryn, Peter Dahl, Robert van Gent, Seppi Wilhelmi, Seth Hettich, Torsten Marek, and Viraj Alankar.BCL and borgcfg: Marcel van Lohuizen and Robert Griesemer.Reviewers: Christos Kozyrakis, Eric Brewer, Malte Schwarzkopf, and Tom Rodeheffer.

Page 48: Ignacy Kowalczyk

K8s is helping you with ● deployment automation● utilization optimization

Q&A

kubernetes.io/[email protected]

Ignacy Kowalczyk

EngineeringManager

Google