Chef at WebMD

Introducing and Extending Chef at WebMD Adam Leff Platform Technologies


Talk at DevOpsDC on 11/18/14 discussing how WebMD started using Chef, why it didn't work "out of the box" for us, and how we changed that.

Transcript of Chef at WebMD

Page 1: Chef at WebMD

Introducing and Extending Chef at WebMD

Adam Leff, Platform Technologies

Page 2: Chef at WebMD

I hate making slides.

PowerPoint is garbage. Keynote is okay. Deckset rules.

http://www.decksetapp.com

Page 3: Chef at WebMD

It all started with a (seemingly) simple task...

Page 4: Chef at WebMD

"Give me a few Linux boxes, I'll take care of it."

-- Adam Leff, before he regretted every choice he made in his life leading up to this point

Page 5: Chef at WebMD

"Here you go, just use Opsware to manage and automate them."

Page 6: Chef at WebMD

The supposed process...

• Create packages

• Upload packages

• Create software policies

• Attach software policies to hosts

• Remediate hosts

Page 7: Chef at WebMD

The actual process...

• Grumble about creating packages while relearning .spec files

• Create packages

• Upload packages

• Upload again because you don't know where they went

• Create software policies and attach

Page 8: Chef at WebMD

The actual process...

• Curse openly about how much faster I could've done this with bash

• Remediate

• Don't get any good feedback on remediation progress

• Wait forever for remediation

• End up with inconsistent hosts

• (╯°□°)╯︵ ┻━┻

Page 9: Chef at WebMD

Enough drama... why did it suck?

• Remediation was slow (5+ minutes sometimes)

• No easy config file management

• No easy way to control the order of software policies

Page 10: Chef at WebMD

Here, have some Java...

Page 11: Chef at WebMD
Page 12: Chef at WebMD

A Better Way

Page 13: Chef at WebMD

Post-Chef Metrics

• Cookbooks written: 1

• Recipes written: 6

• Hosts converged: 7

• Average converge time: 17 sec

• Average happiness level: limitless

Page 14: Chef at WebMD

So long, Opsware...

Page 15: Chef at WebMD

"Hey, I noticed your hosts aren't registered with Opsware anymore. Want me to fix it for you?"

Page 16: Chef at WebMD

But while you're here, want to see something cool?

Page 17: Chef at WebMD

We're not a startup.

Page 18: Chef at WebMD

Paralyzing Realizations

• Anyone with knife access can modify my cookbooks

• Nothing (except cookbooks) is versioned in the Chef server

Page 19: Chef at WebMD

I don't trust my co-workers.

Page 20: Chef at WebMD

I shouldn't trust my co-workers.

Page 21: Chef at WebMD

Ops Come-to-$RELIGIOUS_FIGURE

• You're paid to be paranoid.

• You're paid to plan for the worst.

Page 22: Chef at WebMD

If you build it, they will come.

If you don't, they'll set up their own Chef server.

Page 23: Chef at WebMD

Barriers to Upload

• Enforce some standards

• Do some basic testing

• Eliminate need for users to have knife to upload cookbooks

Page 24: Chef at WebMD

"Chefkins"

• Chef + Jenkins

• Sets up a new cookbook for the user

• User clones the git repo

• User commits and pushes

• Jenkins build is triggered

• Jenkins tests the cookbook

• Jenkins uploads the cookbook
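
The ordering in the Chefkins list matters: the upload only happens if everything before it passes. A minimal sketch of that gating, assuming a hypothetical `run_pipeline` helper and stand-in lambdas in place of real Jenkins jobs (none of these names come from the talk):

```ruby
# Hypothetical sketch of a Chefkins-style build. Every name here is illustrative.
Step = Struct.new(:name, :action)

# Runs the steps in order and stops at the first failing step, so a later
# step (such as the upload) never runs after a failure.
def run_pipeline(steps)
  completed = []
  steps.each do |step|
    ok = step.action.call
    return { success: false, failed: step.name, completed: completed } unless ok
    completed << step.name
  end
  { success: true, failed: nil, completed: completed }
end

uploaded = []

steps = [
  Step.new("lint",   -> { true }),   # stand-in for a standards check
  Step.new("test",   -> { false }),  # stand-in for a failing test run
  Step.new("upload", -> { uploaded << "my_cookbook"; true })
]

result = run_pipeline(steps)
puts result.inspect
puts "uploaded: #{uploaded.inspect}"
```

Because the "test" stand-in fails here, the "upload" step is never called. In a real setup each lambda would shell out to whatever lint, test, and upload commands the build uses.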

Page 25: Chef at WebMD

"But we want/need knife access, too."

Well, shit.

Page 26: Chef at WebMD

Throw money at the problem

But "Private Chef" and multi-tenancy didn't solve anything for us.

Page 27: Chef at WebMD

Rub some code on it

Knife talks to Chef via HTTP, so what if we wrote a proxy?

Page 28: Chef at WebMD

And so it was written...

Page 29: Chef at WebMD

Line Cook

• Mimics a Chef server, but still requires one

• Provides AuthN and AuthZ

• Versions everything

• Blocks cookbook uploads

To our users, it just feels like knife against a chef server.
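
To illustrate the "blocks cookbook uploads" idea, here is a rough sketch of the kind of decision a proxy like this could make per request. It is not Line Cook's actual code. The `forward_request?` helper and the assumption that uploads arrive as POST or PUT requests to paths containing a `cookbooks` or `sandboxes` segment are illustrative; check them against your own Chef server's API layout.

```ruby
# Hypothetical policy check for a Line Cook-style proxy.
# Assumption: uploads are POST or PUT requests.
UPLOAD_METHODS = %w[POST PUT].freeze

# Assumption: cookbook content reaches the server through paths that
# contain a "cookbooks" or "sandboxes" segment.
BLOCKED_SEGMENTS = %w[cookbooks sandboxes].freeze

# Returns true if the proxy should pass the request on to the real
# Chef server, false if it should reject it.
def forward_request?(http_method, path)
  return true unless UPLOAD_METHODS.include?(http_method.upcase)

  segments = path.split("/").reject(&:empty?)
  (segments & BLOCKED_SEGMENTS).empty?
end

puts forward_request?("GET", "/cookbooks/apache/1.0.0")  # reads pass through
puts forward_request?("PUT", "/cookbooks/apache/1.0.0")  # uploads are rejected
puts forward_request?("PUT", "/nodes/web01")             # other writes pass through
```

Matching on path segments is deliberately crude: an object that happens to be named `cookbooks` would also be blocked. A real proxy would also need the AuthN, AuthZ, and versioning pieces from the list above.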

Page 30: Chef at WebMD

Live Demo

God, I hope this works.

Page 31: Chef at WebMD

A (related) tangent...

How do we track all our stuff?

Page 32: Chef at WebMD

CMDB

Page 33: Chef at WebMD

CMDB

Asset and Inventory Management

Page 34: Chef at WebMD

CMDB

Asset and Inventory Management

A Source of Truth

Page 35: Chef at WebMD

The Saurus of Truth

Page 36: Chef at WebMD

Saurus Objectives

• Inventory everything

• WebMD has over 300 applications!

• Develop a hierarchy

• Document owners

• Document relationships

• Arbitrary key/value data (a.k.a. "extended attributes")

Page 37: Chef at WebMD

The Hierarchy

• Product Collection

• Product

• Application

• Component

• Instance

Page 38: Chef at WebMD

The Hierarchy

• Product Collection (Consumer)

• Product (WebMD)

• Application (Runtime)

• Component (IIS)

• Instance (server1:80)

Page 39: Chef at WebMD

Other Relationships

• Instance --> Environment

• Instance --> Host

• Host --> Datacenter

Page 40: Chef at WebMD
Page 41: Chef at WebMD
Page 42: Chef at WebMD
Page 43: Chef at WebMD

Back to Chef...

Page 44: Chef at WebMD

Attribute Inflexibility

• Environment attributes were getting messy

• Too many people had to have access to make changes

• Too many cookbook wrappers

• Even with debug_value, attributes were very difficult to troubleshoot

Page 45: Chef at WebMD

Sensitive Data

Chef Vault works, but regenerating the keys file can be painful and slow.

Page 46: Chef at WebMD

Wait a sec...

What about those Saurus Extended Attributes?

Page 47: Chef at WebMD

Saurus + Chef Wishlist

• Let me mark an EA as sensitive so it's encrypted at rest and not shown in the UI

• Give a host all its Extended Attributes, inheriting from the hierarchy

• Never allow a host to get EAs that aren't its own
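
One way to picture "inheriting from the hierarchy" is a merge that walks from the least specific level down to the instance, with more specific levels overriding inherited values. The sketch below is an assumption about how such a merge could work, not Saurus's implementation. The `inherited_attributes` helper and the sample `log_level` and `owner` keys are made up for illustration.

```ruby
# Hypothetical helper: merges key/value data from each level of the
# hierarchy. Levels must be ordered from least specific (Product
# Collection) to most specific (Instance); later levels win on conflicts.
def inherited_attributes(levels)
  levels.reduce({}) do |merged, level|
    merged.merge(level.fetch(:key_value_data, {}))
  end
end

# Illustrative data only: the level names follow the hierarchy slide,
# but the attribute keys and values are invented.
levels = [
  { type: "Product Collection", name: "Consumer",   key_value_data: { "log_level" => "info" } },
  { type: "Product",            name: "WebMD",      key_value_data: { "owner" => "platform" } },
  { type: "Application",        name: "Runtime",    key_value_data: {} },
  { type: "Component",          name: "IIS",        key_value_data: { "log_level" => "warn" } },
  { type: "Instance",           name: "server1:80", key_value_data: { "myValue" => "local_to_instance" } }
]

puts inherited_attributes(levels).inspect
```

Here the Component-level `log_level` overrides the one set on the Product Collection, while `owner` is inherited unchanged. The third wishlist item (a host never sees EAs that aren't its own) would have to be enforced on the server side, before any merge like this runs.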

Page 48: Chef at WebMD

Saurus + Chef = ❤

• Saurus API endpoint for retrieving EA data

• Chef cookbook library helper for retrieving that data

• Authenticates using the Chef client key

• Returns a Chef "Mash"

Page 49: Chef at WebMD

{
  "myInstance" => {
    "id" => "122136ce-7f84-4c22-b1db-c353abb2aa29",
    "name" => "myInstance",
    "created_at" => "2014-09-22T00:00:00.000+00:00",
    "updated_at" => "2014-09-22T18:00:00.000+00:00",
    "ip_address" => "127.0.0.1",
    "port" => "8080",
    "key_value_data" => {
      "myPassword" => "unbr8kable",
      "myValue" => "local_to_instance"
    }
  }
}
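
As a rough illustration of how a recipe might read a value out of a structure like this, the sketch below uses a plain Ruby Hash as a stand-in for the Mash. The `instance_value` helper is hypothetical and is not part of the Saurus cookbook library.

```ruby
# Hypothetical helper: looks up one extended attribute for an instance.
# Returns nil if the instance or the key is missing.
def instance_value(data, instance, key)
  data.dig(instance, "key_value_data", key)
end

# Plain Hash standing in for the Mash returned by the library helper,
# trimmed to the fields used here.
data = {
  "myInstance" => {
    "name" => "myInstance",
    "port" => "8080",
    "key_value_data" => {
      "myPassword" => "unbr8kable",
      "myValue" => "local_to_instance"
    }
  }
}

puts instance_value(data, "myInstance", "myValue")
```

`Hash#dig` returns `nil` rather than raising when an intermediate key is absent, so a recipe can decide for itself how to handle a missing attribute.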

Page 50: Chef at WebMD

Alright, what's the point?

• Chef server is not perfect

• Nothing is perfect

• Chef server is an awesome artifact server

• My business logic and requirements are different than yours

• Expect to invest time/resources/etc. to make something right

Page 51: Chef at WebMD

Lawyers...

Page 52: Chef at WebMD

How to fix a technical problem

From least expensive to most expensive...

1. Throw hardware at it

2. Buy software for it

3. Write software for it

Try these in order if you can.

But it's OK if #3 is the answer.

Page 53: Chef at WebMD

Are you using Chef yet?

• Try it! -- getchef.com

• Learn it! -- learn.getchef.com

• If you don't like it, learn something else!

Life is too short to not automate.

Page 54: Chef at WebMD

Thank you!