The Architecture of Continuous Innovation - OSCON 2015

Chip Childers | @chipchilders | VP Technology | Cloud Foundry Foundation


Continuous Innovation

Application Patterns Are Changing

Microservices are great. Per Martin Fowler, they lead to specific requirements:

• rapid provisioning

• basic monitoring

• rapid application deployment

• devops culture

• Use declarative formats for setup automation, to minimize time and cost for new developers joining the project;

• Have a clean contract with the underlying OS, offering maximum portability between execution environments;

• Are suitable for deployment on modern cloud platforms, obviating the need for servers and systems administration;

• Minimize divergence between development and production, enabling continuous deployment for maximum agility;

• And can scale up without significant changes to tooling, architecture, or development practices.

But even that’s not enough…

• Role based access to resources: the right people should be able to do things and the wrong people shouldn’t

• Run specified bits on demand: take code, put it together with all the rest of the things it needs, and get it running

• Coordinate cross service configurations: in a service oriented world, services need to be configured to connect with each other

• Route public requests to running bits: the next big thing needs access to the internet

• Read and write persistent data: data has to live somewhere

• Add and remove resources: scaling is a great problem to have, but still

• Isolate resources and failures: without isolation and decoupling, that is one big distributed single point of failure

• Measure performance/health: can’t manage what you don’t measure

• Detect and determine failure: sometimes, things get real… but how do you know

• Recover failures: someone is going to have to clean this mess

• Work tomorrow: when everything you’ve thought to be true has been shown not to

Containers Are Awesome, but Not Enough

Cloud Native Application Platform

Platforms on Platforms

• Better SLAs

• Flexibility

• Speed

• Availability

• Faster Time To Market

• Mobile + Data Services

• Agile and Iterative

• Leverage OSS

• Continuous Delivery

• No Downtime

• Instant scaling

• Consistency & Automation

App Dev | App Ops | IaaS

Understanding Cloud Native Application Platforms

[Diagram: turning this (.war, .jar, .tar.gz, dependencies, libraries, service manifest) into this (App, App, App behind an LB, with a DB, on multi-server run time environment(s))]

Unit of Value

IaaS == Virtual Machine

• Opaque to the system

• Orchestration is post-hoc

• System changes are imperative (“launch” stuff)

App Platform == Application

• Containers are transparent

• Lifecycle is fully managed

• System changes are declarative (manifest.yml)
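The difference between imperative and declarative system changes can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Cloud Foundry's implementation: the `reconcile` function and the `{app name: instance count}` layout are hypothetical stand-ins for what a platform does once it has been handed a declared desired state such as a manifest.

```python
def reconcile(desired: dict[str, int], actual: dict[str, int]) -> list[str]:
    """Return the actions that move `actual` instance counts to `desired`.

    The operator only declares `desired`; the platform derives the
    imperative steps ("start", "stop") on its own.
    """
    actions = []
    for app in sorted(set(desired) | set(actual)):
        want = desired.get(app, 0)
        have = actual.get(app, 0)
        if want > have:
            actions.append(f"start {want - have} x {app}")
        elif want < have:
            actions.append(f"stop {have - want} x {app}")
    return actions


# Hypothetical declared state versus what is currently running.
desired = {"web": 3, "worker": 1}
actual = {"web": 1, "legacy": 2}

plan = reconcile(desired, actual)
for action in plan:
    print(action)
```

Running the same reconciliation again once the system matches the declaration yields no actions, which is the property that makes the declarative style safe to repeat.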

Removing Developer and Operational Constraints

BUILD APPLICATION

• Auto-detect frameworks

• Link to App Platform

PUSH FIRST RELEASE

• Self-service deploy

• Dynamic routing

MAINTAIN APPLICATION

• Elastic scale

• Integrated HA

• Log aggregation

• Policy and Auth

UPDATE APPLICATIONS

• A/B versioning

• Live upgrades

RETIRE APPLICATIONS

• Self-service removal

Prescriptive Assembly

[Diagram: CHRONOS, runC, scheduler.next, container.next]

Prescriptive Assembly

[Diagram: CHRONOS, runC, scheduler.next, and container.next alongside the Cloud Foundry components: gorouter, Cloud Controller, Auth, Loggregator, Staging, Buildpacks, BOSH, Service Broker, Diego (Linux, Windows, Docker), etcd, Core Services]


Let’s talk about Diego

DIEGO is a distributed system that orchestrates containerized workloads

[Diagram: Diego's components: Cells, Brain, BBS (currently etcd). Roles shown in turn: scheduler, health-monitor.]

DIEGO runs

• one-off tasks

• long running processes

Task

• a unit of work

• runs at most once

LRP

• N long-running instances

• distributed across cells for HA

• monitored & restarted

Task and LRP: a generic, platform independent abstraction

• working today

• a successful abstraction
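The two workload types can be modeled as a short Python sketch. It is an illustration of the semantics listed on the slides (a Task runs at most once; an LRP keeps N instances spread across cells and restarts them), not Diego's code: the class layout, the `converge` method, and the round-robin placement are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """A unit of work that runs at most once."""
    command: str
    attempted: bool = False

    def run(self) -> bool:
        # Refuse a second attempt: "at most once", even if the first failed.
        if self.attempted:
            return False
        self.attempted = True
        return True


@dataclass
class LRP:
    """N long-running instances, spread across cells and restarted on failure."""
    command: str
    instances: int
    placement: dict[int, str] = field(default_factory=dict)  # index -> cell

    def converge(self, healthy_cells: list[str]) -> list[int]:
        """(Re)place every instance that is missing or sits on a dead cell."""
        if not healthy_cells:
            return []  # nowhere to place anything right now
        restarted = []
        for index in range(self.instances):
            if self.placement.get(index) not in healthy_cells:
                # Round-robin spreads instances over cells where possible.
                self.placement[index] = healthy_cells[index % len(healthy_cells)]
                restarted.append(index)
        return restarted


task = Task("run-migration")
first, second = task.run(), task.run()

lrp = LRP("web", instances=3)
initial = lrp.converge(["cell-a", "cell-b"])
after_failure = lrp.converge(["cell-b"])  # cell-a has died
print(first, second, initial, after_failure)
```

Only the instances that were on the failed cell are moved; the one already on a healthy cell is left alone.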

…confusion


isolation

[Diagram: processes A to F, belonging to tenant 1, tenant 2, and tenant 3, run on shared resources under one kernel. Two kinds of isolation: resource isolation and namespace isolation.]

resource isolation

[Diagram: the tenants' processes share the CPU; cgroups divide it among them.]

namespace isolation

[Diagram: on the shared kernel, processes A to F have ProcessIDs 2 3 4 5 6 7. With a PID namespace, they are numbered 2 3 4 2 2 3.]

[Diagram: shared resources under the kernel, each with its own namespace: PID, Network, Mount, User.]
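The PID numbering on the slides can be reproduced with a small simulation. This is a sketch in plain Python, not a call into the kernel: the split of processes A to F across the three tenants is inferred from the PID sequence 2 3 4 2 2 3, and both function names are hypothetical.

```python
from itertools import count


def global_pids(processes, first_pid=2):
    """Without PID namespaces: one counter shared by every tenant."""
    pids = count(first_pid)
    return {name: next(pids) for name, _tenant in processes}


def namespaced_pids(processes, first_pid=2):
    """With one PID namespace per tenant: each tenant gets its own counter."""
    counters = {}
    result = {}
    for name, tenant in processes:
        counter = counters.setdefault(tenant, count(first_pid))
        result[name] = next(counter)
    return result


# Assumed tenant assignment, inferred from the slide's PID sequence.
processes = [
    ("A", "tenant 1"), ("B", "tenant 1"), ("C", "tenant 1"),
    ("D", "tenant 2"),
    ("E", "tenant 3"), ("F", "tenant 3"),
]

print(list(global_pids(processes).values()))      # [2, 3, 4, 5, 6, 7]
print(list(namespaced_pids(processes).values()))  # [2, 3, 4, 2, 2, 3]
```

Inside its own namespace a tenant cannot even name another tenant's processes, because the same PID refers to something different in each namespace.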

container = isolation (PID, User, Network, cgroups) + contents + processes

Tasks and LRPs in Garden

Garden

allows Diego to programmatically say

• “make me a container”

• “put this in it”

• “then run this”

via a platform-agnostic API

Garden

allows Diego’s abstractions to be flexible
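The three requests Garden accepts can be sketched as an interface with interchangeable backends. This is a hedged illustration only: `ContainerBackend`, `FakeBackend`, `launch`, and the method names are simplified, hypothetical stand-ins, not Garden's actual API signatures, and the fake backend only records what it was asked to do.

```python
from typing import Protocol


class ContainerBackend(Protocol):
    """The platform-agnostic surface a caller such as Diego would rely on."""

    def create(self) -> str: ...
    def stream_in(self, handle: str, path: str, data: bytes) -> None: ...
    def run(self, handle: str, command: str) -> str: ...


class FakeBackend:
    """Stand-in backend for one platform; it runs nothing for real."""

    def __init__(self, platform: str) -> None:
        self.platform = platform
        self.files: dict[str, dict[str, bytes]] = {}
        self._next = 0

    def create(self) -> str:
        self._next += 1
        handle = f"{self.platform}-{self._next}"
        self.files[handle] = {}
        return handle

    def stream_in(self, handle: str, path: str, data: bytes) -> None:
        self.files[handle][path] = data

    def run(self, handle: str, command: str) -> str:
        return f"[{handle}] ran: {command}"


def launch(backend: ContainerBackend, path: str, data: bytes, command: str) -> str:
    handle = backend.create()               # "make me a container"
    backend.stream_in(handle, path, data)   # "put this in it"
    return backend.run(handle, command)     # "then run this"


linux = FakeBackend("linux")
print(launch(linux, "/app/droplet.tgz", b"bits", "./start"))
print(launch(FakeBackend("windows"), "app.zip", b"bits", "app.exe"))
```

The caller's `launch` function is identical for both platforms; only the backend behind the interface changes, which is what lets the abstractions above it stay flexible.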

cf push

[Diagram: app source code → staging (a Task) → droplet → LRP]

The droplet:

• compiled asset

• app + app-specific dependencies

• assumes a particular execution context (cflinuxfs2)

Running the droplet as an LRP:

• cflinuxfs2 preloaded rootfs

• download droplet

• start command

Droplet LRP

{
  memory: 128mb,
  rootfs: “preloaded:cflinuxfs2”,
  setup: <download-droplet>,
  run: {metadata}.start-command
}

cf push-docker

[Diagram: the docker image and its docker metadata come from the docker registry]

Docker LRP

{
  memory: 128mb,
  rootfs: “docker://docker-image”,
  run: {docker metadata}.start-command
}

[Diagram: (anything), (anything)]

cf push -stack windows

Garden-Windows

• resource isolation: kernel job object, disk quotas

• namespace isolation: user profiles, Hostable Web Core (an isolated IIS instance)

Garden-Linux

• resource isolation: cgroups

• namespace isolation: PID, Network, User, Mount

Garden-Windows

• collaborating with Microsoft

• provides a container experience for Windows 2012 that will only get better with Windows 2016

• allows us to build a cf push experience

[Diagram: Garden-Linux and Garden-Windows behind the Garden API]

.net LRP

{
  memory: 128mb,
  rootfs: “preloaded:windows2012R2”,
  setup: <download-application>,
  run: {metadata}.start-command
}

3 different contexts

1 cluster
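The three LRP definitions differ only in a few fields, which a short Python sketch makes visible. The field values are taken from the slides; the dictionary layout, the function names, and the rule in `backend_for` (Windows rootfs goes to a Windows backend, everything else to a Linux backend) are assumptions for illustration, not Diego's placement logic.

```python
def droplet_lrp(start_command: str) -> dict:
    return {
        "memory_mb": 128,
        "rootfs": "preloaded:cflinuxfs2",
        "setup": "download-droplet",
        "run": start_command,
    }


def docker_lrp(image: str, start_command: str) -> dict:
    # No setup step: the image already contains the app.
    return {
        "memory_mb": 128,
        "rootfs": f"docker://{image}",
        "run": start_command,
    }


def dotnet_lrp(start_command: str) -> dict:
    return {
        "memory_mb": 128,
        "rootfs": "preloaded:windows2012R2",
        "setup": "download-application",
        "run": start_command,
    }


def backend_for(lrp: dict) -> str:
    """Hypothetical rule: choose a backend from the rootfs alone."""
    if lrp["rootfs"].startswith("preloaded:windows"):
        return "garden-windows"
    return "garden-linux"


# One cluster, three contexts: the same LRP shape throughout.
lrps = [
    droplet_lrp("./start"),
    docker_lrp("docker-image", "/bin/app"),
    dotnet_lrp("app.exe"),
]
for lrp in lrps:
    print(backend_for(lrp), lrp["rootfs"])
```

Because every workload reduces to the same generic shape, the scheduler does not need to know whether it is running a buildpack app, a Docker image, or a .NET application.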

Current CF Architecture (Simplified)

[Diagram: cf push → CloudController → stage / run on a pool of DEAs]

[Diagram: cf push → CloudController → stage / run (app-specific) → CC Bridge → Receptor API → Cells, Brain, BBS (generic)]

[Diagram: CloudController with the CC Bridge is a generic consumer of the Receptor API. Other consumers?]

[Diagram: a Task or LRP submitted to the Receptor API (Cells, Brain, BBS) on its own: meh. Added around it: gorouter (http traffic) and loggregator (logs).]

vagrant up

terraform apply

ltc create <app>

lattice.cf

• Local VM

• AWS, Digital Ocean, Google Cloud Platform, OpenStack

Why?

[Diagram: the Cloud Foundry components: CC, UAA, Diego, Loggregator, Gorouter, Buildpacks*, Services, BOSH. Annotations added in turn: single-tenant, docker, BYOS, no rolling upgrades.]

Why?

Lattice…

• …is a useful low-barrier solution to real-world problems

• …makes exploring Diego easy

• …is a softer onramp to the CF tech stack

• …allows us to efficiently prototype new ideas for Diego’s future

WHEN?

Diego’s scope is much more than “rewrite the DEA”

Diego is running in production on PWS, managing ~5% of the load

Diego is in beta while we

• validate performance at O(~100s) of cells

• secure Diego’s internal components

Start using it alongside the DEAs now and give us feedback

Diego should be out of beta within Q3 (probably)

Then what?

• Placement Constraints: top of backlog post-beta

• cf ssh <app/index>: working now, CLI support on the way (shell access, port forwarding, scp)

• TCP Routing

• Private Docker Registry

• Support for persistence (a long term goal)

• Container-Container networking (a long term goal)

• Condenser: lightweight buildpacks for Lattice

A Cloud Foundry (noun) is a place of practice for continuous innovation.

pragmatic cathedral
