Anynines - Cloud Foundry on OpenStack - An Experience Report

Transcript of Anynines - Cloud Foundry on OpenStack - An Experience Report

Cloud Foundry on OpenStack: An Experience Report

Introduction

about.me/fischerjulian

The anynines Stack

Hardware

OpenStack

Cloud Foundry

VMware

We migrated from rented VMware to self-hosted OpenStack.

For more details on this: http://rh.gd/a9vmw2sos

Things we had to think about

OpenStack Upgrades

Before Grizzly, OpenStack was not ready for production

• The upgrade process included a lot of manual work

• No script-driven upgrades

• Manual DB schema migrations

• Manual configuration file changes, etc.

“I scheduled a week of total downtime with all instances offline.” - jon@jonproulx, http://rh.gd/1sNhiiz

Upcoming Upgrade: Havana > Icehouse

• Chef is used to roll out Icehouse (incl. configuration changes)

• The upgrade is well tested on a separate multi-server OpenStack staging system

Goal: <30 min downtime.

Let’s keep our fingers crossed :)

Looking forward to rolling upgrades with OpenStack Icehouse http://rh.gd/1ymhViL

• No need to shut down VMs during upgrade

• No downtime of the entire cloud

VM availability

What kills VMs?

• Random kernel panics (kernel bug) http://rh.gd/1oBUeCc

• Hardware outages (hw & power failures)

• …

Availability Zones

• Build disjoint networks, racks, etc.

• Each disjoint zone = availability zone

• Tell OpenStack about availability zones

• When provisioning, you can choose the AZ

• Build BOSH releases accordingly

Aggregates

• Similar to AZ

• Not about failover

• Select hosts with certain attributes

• E.g. an SSD aggregate

• When provisioning, choose a host with SSD disks
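
As a rough mental model only (the real selection is made by the OpenStack scheduler, not by user code), choosing a host from an aggregate amounts to filtering hosts by metadata. The host names, the `ssd` metadata key and the `hosts_matching` helper below are made up for illustration.

```ruby
# Hypothetical in-memory model: each aggregate groups hosts that share
# some metadata, e.g. an aggregate for hosts with SSD disks.
Aggregate = Struct.new(:name, :metadata, :hosts)

# Returns the hosts of all aggregates whose metadata contains every
# required key/value pair.
def hosts_matching(aggregates, required)
  aggregates
    .select { |agg| required.all? { |key, value| agg.metadata[key] == value } }
    .flat_map(&:hosts)
    .uniq
end

aggregates = [
  Aggregate.new("ssd-aggregate", { "ssd" => "true" }, %w[compute1 compute2]),
  Aggregate.new("default",       {},                  %w[compute3])
]

candidates = hosts_matching(aggregates, { "ssd" => "true" })
puts candidates.inspect
```

Unlike an availability zone, nothing here says anything about failure domains: the filter only narrows the set of hosts by capability.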

Load Balancing

• Not inherently clustered

• LBaaS failover can be realized using

• Pacemaker/corosync

• GlusterFS (share LB configuration)

VM Failover Strategies

Resurrect

• Monitor VM

• Rebuild VMs automatically

• e.g. using Cloud Foundry BOSH

• + Easy

• - Takes a long time (minutes, not seconds)

• - OpenStack doesn’t release persistent disks automatically
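
The resurrect strategy boils down to a monitoring pass that rebuilds whatever is down. The sketch below is not BOSH's own resurrector; `FakeCloud`, `alive?` and `rebuild` are hypothetical stand-ins for a health check and an IaaS call that recreates a VM, and the job names are examples.

```ruby
# Hypothetical stand-in for the IaaS: tracks which VMs are alive and
# records every rebuild request.
class FakeCloud
  attr_reader :rebuilt

  def initialize(alive)
    @alive = alive
    @rebuilt = []
  end

  def alive?(vm)
    @alive.fetch(vm, false)
  end

  def rebuild(vm)
    @rebuilt << vm
    @alive[vm] = true
  end
end

# One monitoring pass: rebuild every VM that fails its health check.
def resurrect(cloud, vms)
  vms.reject { |vm| cloud.alive?(vm) }.each { |vm| cloud.rebuild(vm) }
end

cloud = FakeCloud.new({ "uaa/0" => true, "cc/0" => false })
resurrect(cloud, ["uaa/0", "cc/0"])
puts "rebuilt: #{cloud.rebuilt.join(', ')}"
```

In a real deployment the rebuild step is where the minutes go, and where a persistent disk that is still attached to the dead VM would have to be released first.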

Failover to Standby VM

• Provide a standby VM

• Monitor VM and perform failover

• e.g. using Pacemaker

• + Fast failover (seconds)

• - Pacemaker is not easy to use

• - Increased resource usage by standby VM(s)
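
Pacemaker handles the monitor-and-failover step in practice; the underlying decision can be sketched as a heartbeat timeout. The class name, the 5-second timeout and the node names are assumptions chosen for the example, and real cluster managers also deal with fencing and split-brain, which this sketch ignores.

```ruby
# Hypothetical failover monitor: promotes the standby when the active
# node has not sent a heartbeat within `timeout` seconds.
class FailoverMonitor
  attr_reader :active, :standby

  def initialize(active:, standby:, timeout: 5)
    @active = active
    @standby = standby
    @timeout = timeout
    @last_heartbeat = {}
  end

  # Records a heartbeat from `node` at time `at` (seconds).
  def heartbeat(node, at)
    @last_heartbeat[node] = at
  end

  # Returns true if a failover happened at time `now`.
  def check(now)
    last = @last_heartbeat[@active]
    return false if last && now - last <= @timeout

    # Swap roles and give the new active node a fresh heartbeat so it
    # is not failed over again immediately.
    @active, @standby = @standby, @active
    @last_heartbeat[@active] = now
    true
  end
end

monitor = FailoverMonitor.new(active: "pg-1", standby: "pg-2")
monitor.heartbeat("pg-1", 0)
monitor.check(3)   # heartbeat is recent, pg-1 stays active
monitor.check(10)  # heartbeat is stale, pg-2 takes over
puts "active node: #{monitor.active}"
```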

IP Failover

Three ways to fail over IPs

Load Balancer

• + Fast

• + Easy (use LB weights)

• - LB becomes a bottleneck

• Once OpenStack supports HAProxy (L3), this drawback can be eliminated

Floating-IPs

• + Easy

• + Fast

• - Only for public networking

NIC Re-attachment

• + No network bottleneck

• + No dependencies on other services

• - Slightly higher failover time (several seconds)

Implications for Cloud Foundry

Accept that VMs are ephemeral

Distribute CF components across OpenStack availability zones

• 2 * UAA

• 2 * CC

• 2 * n * DEAs

• 2 * Health Manager

• …
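
The doubling in the list above can be sketched as a simple round-robin placement. This is a minimal illustration, not how BOSH itself places VMs; the zone names `az1`/`az2`, the DEA count and the `spread_across_zones` helper are assumptions made for the example.

```ruby
# Hypothetical helper: assign each instance of each job to an availability
# zone in round-robin order, so that every job has at least one instance
# per zone when its instance count is a multiple of the zone count.
def spread_across_zones(jobs, zones)
  jobs.each_with_object({}) do |(job, count), placement|
    placement[job] = Array.new(count) { |index| zones[index % zones.size] }
  end
end

# Assumed zone names and instance counts (n = 2 DEAs per zone here).
zones = %w[az1 az2]
jobs  = { "uaa" => 2, "cc" => 2, "dea" => 4, "health_manager" => 2 }

placement = spread_across_zones(jobs, zones)
placement.each { |job, azs| puts "#{job}: #{azs.join(', ')}" }
```

With this layout, losing one zone still leaves one instance of every component running in the other.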

UAA & CC DB = SPOF

HA Postgres

• UAA and Cloud Controller database

• Single point of failure for Cloud Foundry

• Postgres is not inherently clusterable > failover with standby VM

• Master/slave replication

• Pacemaker/corosync

• IP failover using NIC re-attachment

That’s halfway towards a PostgreSQL CF Service

• Add a V2 Service Broker

• Add provisioning logic

• Provision a 2-node DB cluster on cf create-service postgres medium-cluster
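
The provisioning logic behind such a broker can be sketched as a lookup from plan to cluster size. Only the `medium-cluster` plan name comes from the slide; the `single` plan, the `PLANS` table, the node naming and the `provision` method are hypothetical. A real broker would expose this behind the V2 broker HTTP endpoints and use BOSH or similar tooling to create the VMs.

```ruby
# Hypothetical plan catalog: maps a plan name to the number of
# PostgreSQL nodes to provision.
PLANS = {
  "single"         => { nodes: 1 },
  "medium-cluster" => { nodes: 2 }
}.freeze

# Returns a description of the cluster to create for a service instance.
# Raises ArgumentError for unknown plans.
def provision(instance_id, plan_name)
  plan = PLANS.fetch(plan_name) do
    raise ArgumentError, "unknown plan: #{plan_name}"
  end

  nodes = Array.new(plan[:nodes]) do |index|
    { name: "#{instance_id}-pg-#{index}", role: index.zero? ? "master" : "slave" }
  end
  { instance_id: instance_id, nodes: nodes }
end

cluster = provision("instance-42", "medium-cluster")
cluster[:nodes].each { |node| puts "#{node[:name]} (#{node[:role]})" }
```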

CF Service Design

• Use clusterable services if possible

• Implement automatic failover if not

• Auto-provisioning using BOSH

• Organize self-healing

• (Semi-)Automatic recovery from degraded mode

Summary

• VMware’s high availability options are nice

• OpenStack helped us cut costs by 50%

• OpenStack is stable enough to run Cloud Foundry on top

• OpenStack hardening is required and feasible

Open source OpenStack and open source Cloud Foundry are an SME’s best friends!

Questions?

Thank you!

Preparing for disaster recovery

• Cinder Volume Snapshots

OpenStack Backups

OpenStack Swift

• Open source Amazon S3 replacement

• Object store with RESTful interface

• Scales horizontally to petabyte dimensions

• Fully redundant, highly available

• CF service > App Asset Storage

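Code

require "fileutils"
require "find"
require "fog"

class Blobstore
  def initialize(connection_config, directory_key, cdn=nil, root_dir=nil)
    @root_dir = root_dir
    @connection_config = connection_config
    @directory_key = directory_key
    @cdn = cdn
  end

  # True when the blobstore is backed by the local filesystem provider.
  def local?
    @connection_config[:provider].downcase == "local"
  end

  # True when a blob is stored under the given key.
  def exists?(key)
    !file(key).nil?
  end

  # The slide ends here; #file and the rest of the class are not shown.
end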