Continuously Deploying Culture: Scaling Culture at Etsy - Velocity Europe 2012

Post on 18-Oct-2014

There was a time not long ago when Etsy was laden with barriers, silos, broken communication, and noncooperation. This talk focuses on the stages of Etsy's cultural development from the early days to the present. We will tell how Etsy overcame numerous challenges and built a strong company culture while continuing to scale.

Transcript of Continuously Deploying Culture: Scaling Culture at Etsy - Velocity Europe 2012

Continuously Deploying Culture

Scaling Culture at Etsy

Michael Rembetsy, Director of Operations Engineering (@mrembetsy)

Patrick McDonnell, Senior Operations Engineer (@mcdonnps)

October 4, 2012

Velocity Europe

About Us

• Operations team at Etsy totals twelve

• Engineering totals 125

• Company totals about 350

• 15 million members in over 150 countries

• Over 800,000 sellers

• 1.4 billion page views monthly

Disclaimer

A + B != Culture

The Beginning

Silos and Barriers

• Etsy employs 30 to 35

• About half are engineers

• Siloed culture, barriers to collaboration in engineering

• “Sprouter”

• Designed to prevent engineers from directly touching databases

Management Shake-up

• Maria Thomas promoted to CEO

• Understands that community is of utmost importance

• Begins to prioritize culture that supports community

• Chad Dickerson brought on as CTO

• Starts to bring focus to engineering team

Bring the Pain

Opening Communications

• fix.etsy.com

• First time exposing engineering to community

• Scheduled maintenance posted ahead of time

• Outage follow-ups

• Culturally, technical debt needs to be paid off

2008 Takeaways

• WTF did I get myself into?

• Low or no impact projects

• Downtime is accepted / needed

• Community is most important

• Lucky to make it through holiday season

2008 End of Year Snapshot

• Gross Merchandise Sales: $87.3 million

• Visits: 163 million

• Unique Visits:  47.7 million

2008 Action Items

• Gain support from the top and bottom to change culture

• Increase transparency both within the organization and to the public

• Pay back technical debt as soon as possible

Sea Change

• Major shift in culture

• People leave or are let go

• Too many remotes

• Teams centralized in Brooklyn

• Need “DevOps” culture before remotes are manageable

Building the Foundation

• DevTools team created

• Develops first iteration of Deployinator

• Stabilizes the deploy chain through dev, QA, and prod environments

• Infrastructural overhaul

• Move from lighttpd to Apache starts

• Added monitoring and graphing

• Better metrics exposure to non-technical divisions

Hiring Push

• Only two operations engineers and too few local developers

• Concentration on sourcing talent to keep pace with growth

• Move from Downtown Brooklyn to larger space in DUMBO

• More accessible to non-Brooklynites

• More mature tech and art community

Internal Improvement

• Stand-up meetings

• Time-consuming, but necessary to improve communication

• Inter-team collaboration

• Ops involved in decision-making for dev projects and vice versa

• Internal stability much stronger, damage control period ends

• Network solidified, infrastructural foundation allows for future growth

Stability Arrives

• Everyone wants to collaborate from top down and bottom up

• Not just upper management pushing down

• Everyone shapes culture, suggestions always welcome

• People happy to come to work, contribute beyond their job description

• No more scheduled downtime, site remains up as much as possible

• Master database purchased as a stopgap for holiday capacity

2009 Takeaways

• Foundation solidified

• Technology

• Human capital

• Beginning of “DevOps” culture

• Berlin Wall falls

2009 End of Year Snapshot

• Gross Merchandise Sales:  $177 million (102.6%)

• Visits: 320 million (96.8%)

• Unique Visits: 92.9 million (94.8%)

• Page Views: 6.45 billion
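The parenthesized percentages in this snapshot appear to be year-over-year growth against the 2008 snapshot. That reading is an assumption, not something the slides state. A minimal sketch of the arithmetic, using the rounded figures from the slides (the function name is illustrative):

```python
def yoy_growth_percent(previous: float, current: float) -> float:
    """Return year-over-year growth as a percentage of the previous value."""
    return (current - previous) / previous * 100.0


# Rounded figures taken from the 2008 and 2009 snapshot slides.
unique_visits_2008 = 47.7  # millions
unique_visits_2009 = 92.9  # millions

growth = yoy_growth_percent(unique_visits_2008, unique_visits_2009)
print(f"Unique visits growth: {growth:.1f}%")
```

Because the slide figures are rounded, recomputing the other rows this way lands within a few tenths of a percentage point of the printed values rather than matching them exactly.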

2009 Action Items

• Stabilize the parts of your organization which create thrashing

• Hire staff that will make a difference

• Think about collaborative processes

• When dealing with staffing constraints, prioritize projects by impact

• Get things done (later “Just Ship”)

Renewed Energy

.etsy.com

[Figure: chart comparing "Very end of 2009" with "Today"]

Standardization Effort

• PHP

• Use it and nothing else

• Everyone should be able to read and rewrite your code

• MySQL becomes database of choice

• Backfill of PostgreSQL tables to MySQL shards begins

If it Moves, Graph it

• Graphing tools

• Ganglia, Graphite, FITB, Weathermap

• Monitoring

• Nagios, Naglite

• Increased focus on work/life balance

• We removed alerts! WHAT?
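As a rough illustration of the "if it moves, graph it" idea, the sketch below emits a counter in the StatsD plain-text format over UDP. StatsD is one of the tools named later in this deck and typically feeds Graphite. The host, port, and metric name are assumptions for illustration, and this is not taken from Etsy's own instrumentation code.

```python
import socket


def format_counter(name: str, value: int = 1) -> bytes:
    """Build a StatsD counter datagram of the form '<name>:<value>|c'."""
    return f"{name}:{value}|c".encode("ascii")


def send_counter(name: str, value: int = 1,
                 host: str = "127.0.0.1", port: int = 8125) -> None:
    """Send one counter increment to a StatsD daemon (assumed host and port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # UDP is fire-and-forget: the caller is not blocked if no daemon listens.
        sock.sendto(format_counter(name, value), (host, port))
    finally:
        sock.close()


if __name__ == "__main__":
    # Hypothetical metric name; call this wherever an event worth graphing happens.
    send_counter("logins.success")
```

Using UDP keeps instrumentation cheap enough that developers can add a metric for nearly anything without worrying about slowing down the request path.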

Management Ideals

• Accept failures but not lower standards

• Доверяй, но проверяй (trust, but verify)

• Blameless post-mortems

• Welcome one-on-ones (http://bit.ly/cCWMqr)

• Career planning

• Happy company = happy community

Engineering Processes

• Developer on call

• A/B Testing

• Prototypes

• Feature flags and ramp-ups

• Schema Change Thursday
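The "feature flags and ramp-ups" item can be sketched as a percentage rollout: each user is hashed into a stable bucket from 0 to 99, and a feature is on for users whose bucket falls below the current rollout percentage. This is a generic illustration under assumed names (`bucket`, `is_enabled`, the feature and user identifiers), not Etsy's actual implementation.

```python
import hashlib


def bucket(feature: str, user_id: str) -> int:
    """Map a (feature, user) pair to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(feature: str, user_id: str, rollout_percent: int) -> bool:
    """Return True if the feature is on for this user at the given ramp-up level."""
    return bucket(feature, user_id) < rollout_percent


if __name__ == "__main__":
    # Hypothetical feature name; ramp from 1% to 100% by changing one number.
    for percent in (1, 10, 50, 100):
        enabled = sum(is_enabled("new_search", f"user{i}", percent)
                      for i in range(1000))
        print(f"{percent}% rollout: {enabled} of 1000 users enabled")
```

Hashing on the feature name together with the user ID keeps each user's experience stable as the percentage rises, and avoids the same users always being first in line for every new feature.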

2010 Takeaways

• Paring down the number of technologies used for development

• Focus on technical visibility throughout the organization

• Developers take responsibility for code release

• Freedom to hire as needed

• New focus on work/life balance

2010 End of Year Snapshot

• Gross Merchandise Sales: $307 million (73.4%)

• Visits: 534 million (66.7%)

• Unique Visits: 147 million (58.2%)

• Page Views: 9.25 billion (43.5%)

2010 Action Items

• Don’t guess at what’s wrong; graph it, monitor it, and find out

• Empower developers with responsibilities

• Have clear, documented standards and practices

• Keep human management a priority

The Reaping

• Death of non-standard technologies

• Mongo, Scala, CoffeeScript, etc. removed from production systems

• Sprouter is eradicated

• No more Python

• Removal of a major barrier to full developer accountability

• Check out Ross Snyder’s talk at Surge 2011 (http://bit.ly/po8zIj)

Organizational Change

• Management becomes more engineering focused

• Chad becomes CEO

• Kellan is promoted to CTO

• John is promoted to SVP of Technical Operations

Technical Contribution

• Big push to contribute to open source and technical community

• Deployinator, Statsd, Logster, many more

• Three annual goals for every engineer

• Speak at a conference

• Write a blog post

• Open source code

Still Scaling

Configuration Management

And Still Scaling Culture

• Engineering culture decides git is better for our working style than svn

• Product development efforts are refocused on high-impact products such as search ads

• Ops works to improve signal-to-noise ratio of alerts and develops internal tools such as Schemanator

• Focus on security

• SCRAM team, Security and Fraud team

• Check out Nick Galbreath’s talk at DevOpsDays Austin (http://slidesha.re/IMaavq)

Happy Holidays

• Game days to test failures before they happen

• Capacity planning is easier

• More dashboards (framework now on GitHub!)

• Improved weekly financial reporting

2011 Takeaways

• Year of the tool

• Statsd, Deployinator, Supergrep v2, Logster, Schemanator, FITB

• DevTools adds three engineers

• Automation reaches maturity, few surprises

• Focus on security allows PCI infrastructure to be completed in 6 weeks

• Engineering matures, both platform and people

2011 End of Year Snapshot

• Gross Merchandise Sales: $526 million (71.4%)

• Visits: 895 million (67.7%)

• Unique Visits: 237 million (61.1%)

• Page Views: 12.9 billion (39.3%)

2011 Action Items

• Senior management at a tech company should be technology-focused

• Implement configuration management and automation as soon as possible (saves headaches later)

• As technical staff increases, continue to focus on projects that matter

• New technical challenges do not dictate cultural shift

Massive Growth

• Explosive growth in hiring

• Major organizational changes around product

• Increased focus on community

• High-impact products (shipping labels, gift cards)

• Ramp-up of internationalization projects

• Unique engineering challenges force non-standard technology

• Redis

• Virtualized CI test slaves

Spreading Information

• Invitation to other teams to stop by the office for a chat

• Ops becomes more involved in external informational exchange

• Code as Craft blog posts

• Rapid scaling of personnel, no one knows everyone

• Remotes come in to Brooklyn more often

Work in Progress

• Developer boredom curbed by allowing transfers between teams

• Even between divisions (such as engineering -> product)

• Developers now read data from prod databases for development

• Removes one of the last anomalies between development and production

• Front End Performance team forms to minimize site load times

2012 Takeaways

• Organizational resilience allows for total focus on community

• Improvement on international product

• Increased transparency to the public

• Focus on informational exchange, both internally and externally

• Open source all the things

2012 Action Items

• Know when not to try something

• Focus on performance early

• Allow dynamic allocation of resources

• Don’t allow size to dictate culture

Future Predictions

• “Wait till you hit 500 people, that’s when everything falls apart”

• Etsy will not fall apart, but will change

• Management supports and responds to cultural shifts

• Freedom of movement between job functionalities

• Trust that change is for the better

Future Predictions

• Will create better and innovative ways of communicating as we scale

• Building tools is the Etsy way

• Continue to influence corporate culture

• B Corp

• Employee happiness

Office Hours

Sponsor Pavilion Table A

2:00 to Last Call at the Bar

Michael Rembetsy (@mrembetsy)

Patrick McDonnell (@mcdonnps)

https://github.com/etsy