Chef at WebMD

Post on 07-Jul-2015


Talk at DevOpsDC on 11/18/14 discussing how WebMD started using Chef, why it didn't work "out of the box" for us, and how we changed that.

Transcript of Chef at WebMD

Introducing and Extending Chef at WebMD

Adam Leff, Platform Technologies

I hate making slides.

PowerPoint is garbage. Keynote is okay. Deckset rules.

http://www.decksetapp.com

It all started with a (seemingly) simple task...

"Give me a few Linux boxes, I'll take care of it."

-- Adam Leff, before he regretted every choice he made in his life leading up to this point

"Here you go, just use Opsware to manage and automate them."

The supposed process...

• Create packages

• Upload packages

• Create software policies

• Attach software policies to hosts

• Remediate hosts

The actual process...

• Grumble about creating packages while relearning .spec files

• Create packages

• Upload packages

• Upload again because you don't know where they went

• Create software policies and attach

The actual process (continued)...

• Curse openly about how much faster I could've done this with bash

• Remediate

• Don't get any good feedback on remediation progress

• Wait forever for remediation

• End up with inconsistent hosts

• (╯°□°)╯︵ ┻━┻

Enough drama... why did it suck?

• Remediation was slow (5+ minutes sometimes)

• No easy config file management

• No easy way to control the order of software policies

Here, have some java...

A Better Way

Post-Chef Metrics

• Cookbooks written: 1

• Recipes written: 6

• Hosts converged: 7

• Average converge time: 17 sec

• Average happiness level: limitless

So long, Opsware...

"Hey, I noticed your hosts aren't registered with Opsware anymore. Want me to fix it for you?"

But while you're here, want to see something cool?

We're not a startup.

Paralyzing Realizations

• Anyone with knife access can modify my cookbooks

• Nothing (except cookbooks) is versioned in the Chef server

I don't trust my co-workers.

I shouldn't trust my co-workers.

Ops Come-to-$RELIGIOUS_FIGURE

• You're paid to be paranoid.

• You're paid to plan for the worst.

If you build it, they will come.

If you don't, they'll set up their own Chef server.

Barriers to Upload

• Enforce some standards

• Do some basic testing

• Eliminate need for users to have knife to upload cookbooks

"Chefkins"

• Chef + Jenkins

• Sets up a new cookbook for the user

• User clones the git repo

• User commits and pushes

• Jenkins build is triggered

• Jenkins tests the cookbook

• Jenkins uploads the cookbook
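The test-then-upload gate at the end of that list can be sketched in a few lines of Ruby. This is only an illustration, not the actual Chefkins job: the talk does not say which tests ran, so the `foodcritic` lint step, the `cookbooks/NAME` path, and the helper names `chefkins_steps` and `run_chefkins` are all assumptions.

```ruby
# Hypothetical build steps. The commands are assumptions for illustration,
# not WebMD's Jenkins configuration.
def chefkins_steps(cookbook)
  [
    ["lint",   "foodcritic cookbooks/#{cookbook}"],
    ["upload", "knife cookbook upload #{cookbook}"]
  ]
end

# Runs each step in order and stops at the first failure, so the upload
# step never runs when an earlier step fails. The runner is injectable,
# which allows a dry run without shelling out.
def run_chefkins(cookbook, runner: ->(cmd) { system(cmd) })
  completed = []
  chefkins_steps(cookbook).each do |name, command|
    unless runner.call(command)
      return { ok: false, failed: name, completed: completed }
    end
    completed << name
  end
  { ok: true, failed: nil, completed: completed }
end
```

Because only the build server runs the upload step, users never need knife credentials of their own to publish a cookbook.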

"But we want/need knife access, too."

Well, shit.

Throw money at the problem

But "Private Chef" and multi-tenancy didn't solve anything for us.

Rub some code on it

Knife talks to Chef via HTTP, so what if we wrote a proxy?

And so it was written...

Line Cook

• Mimics a Chef server, but still requires one

• Provides AuthN and AuthZ

• Versions everything

• Blocks cookbook uploads

To our users, it just feels like knife against a chef server.
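A rough sketch of the "blocks cookbook uploads" rule, written as a check a knife-facing proxy could run before forwarding a request. This is not Line Cook's code. The helper names are hypothetical, and the sketch assumes that cookbook uploads reach the Chef server API through the `/cookbooks/NAME/VERSION` and `/sandboxes` endpoints, optionally under an `/organizations/ORG` prefix.

```ruby
# Hypothetical policy check for a proxy sitting between knife and the
# real Chef server. Not Line Cook's actual implementation.

# Chef server API paths that create or modify cookbook content.
COOKBOOK_WRITE_PATHS = [
  %r{\A(/organizations/[^/]+)?/cookbooks/[^/]+/[^/]+\z}, # a cookbook version
  %r{\A(/organizations/[^/]+)?/sandboxes(/[^/]+)?\z}     # file upload sandboxes
].freeze

WRITE_METHODS = %w[PUT POST DELETE].freeze

# Returns true when the request would change cookbook content.
def blocked_request?(http_method, path)
  return false unless WRITE_METHODS.include?(http_method.to_s.upcase)
  COOKBOOK_WRITE_PATHS.any? { |pattern| pattern.match?(path) }
end

# Returns :forward or :reject, given whether the caller has already
# passed authentication and authorization.
def proxy_decision(http_method, path, authorized:)
  return :reject unless authorized
  blocked_request?(http_method, path) ? :reject : :forward
end
```

Reads and non-cookbook writes (nodes, roles, environments) pass through untouched, which is why the proxy can feel like a plain Chef server to knife users.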

Live Demo

God, I hope this works.

A (related) tangent...

How do we track all our stuff?

CMDB

Asset and Inventory Management

A Source of Truth

The Saurus of Truth

Saurus Objectives

• Inventory everything

• WebMD has over 300 applications!

• Develop a hierarchy

• Document owners

• Document relationships

• Arbitrary key/value data (a.k.a. "extended attributes")

The Hierarchy

• Product Collection

• Product

• Application

• Component

• Instance

The Hierarchy

• Product Collection (Consumer)

• Product (WebMD)

• Application (Runtime)

• Component (IIS)

• Instance (server1:80)

Other Relationships

• Instance --> Environment

• Instance --> Host

• Host --> Datacenter

Back to Chef...

Attribute Inflexibility

• Environment attributes were getting messy

• Too many people had to have access to make changes

• Too many cookbook wrappers

• Even with debug_value, very difficult to troubleshoot.

Sensitive Data

Chef Vault works, but regenerating the keys file can be painful and slow.

Wait a sec...

What about those Saurus Extended Attributes?

Saurus + Chef Wishlist

• Let me mark an EA as sensitive so it's encrypted at rest and not shown in the UI

• Give a host all its Extended Attributes, inheriting from the hierarchy

• Never allow a host to get EAs that aren't its own

Saurus + Chef = ❤

• Saurus API endpoint for retrieving EA data

• Chef cookbook library helper for retrieving that data

• Authenticates using the Chef client key

• Returns a Chef "Mash"

{
  "myInstance" => {
    "id" => "122136ce-7f84-4c22-b1db-c353abb2aa29",
    "name" => "myInstance",
    "created_at" => "2014-09-22T00:00:00.000+00:00",
    "updated_at" => "2014-09-22T18:00:00.000+00:00",
    "ip_address" => "127.0.0.1",
    "port" => "8080",
    "key_value_data" => {
      "myPassword" => "unbr8kable",
      "myValue" => "local_to_instance"
    }
  }
}
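One way the "inheriting from the hierarchy" behavior could work is a top-down merge in which each more specific level overrides its ancestors. The sketch below is a guess at that idea, not Saurus code: the precedence rule, the `resolve_extended_attributes` helper, and every key except `myValue` are assumptions, and it uses a plain Hash rather than a Chef Mash.

```ruby
# Hypothetical hierarchy data, ordered from least to most specific.
# Keys other than "myValue" are invented for illustration.
HIERARCHY = [
  { level: "Product Collection", attrs: { "logLevel" => "info" } },
  { level: "Product",            attrs: { "logLevel" => "warn", "owner" => "platform" } },
  { level: "Application",        attrs: {} },
  { level: "Component",          attrs: { "timeoutSeconds" => "30" } },
  { level: "Instance",           attrs: { "myValue" => "local_to_instance" } }
].freeze

# Merges attributes top-down so a more specific level overrides its ancestors.
# ASSUMPTION: "most specific wins". The talk does not state the precedence.
def resolve_extended_attributes(hierarchy)
  hierarchy.reduce({}) { |merged, node| merged.merge(node[:attrs]) }
end

resolved = resolve_extended_attributes(HIERARCHY)
puts resolved["logLevel"] # the Product value overrides the Product Collection value
```

Resolving the merge on the server side means a host only ever receives the flattened result for its own instances, which fits the wishlist item about never handing a host attributes that are not its own.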

Alright, what's the point?

• Chef server is not perfect

• Nothing is perfect

• Chef server is an awesome artifact server

• My business logic and requirements are different than yours

• Expect to invest time/resources/etc. to make something right

Lawyers...

How to fix a technical problem

From least expensive to most expensive...

1. Throw hardware at it

2. Buy software for it

3. Write software for it

Try these in order if you can.

But it's OK if #3 is the answer.

Are you using Chef yet?

• Try it! -- getchef.com

• Learn it! -- learn.getchef.com

• If you don't like it, learn something else!

Life is too short to not automate.

Thank you!