WinOps Conf 2016 - Peter Mounce - DoS yourself in production every night to prove you can take it
Transcript of WinOps Conf 2016 - Peter Mounce - DoS yourself in production every night to prove you can take it
1
DoS yourself in production every night to prove you can take it
@petemounce + @justeat_tech
2
Any questions?
Shout them out as we go. That's more fun.
@petemounce + @justeat_tech
3
Who are JUST EAT?
4
Performance?
5
When do you suppose peak time is?
The same time we DoS ourselves, of course!
6
We have cyclic demand
7
The problem with continuous delivery
Everyone wants to change everything, all the time.
8
Traditional approach
Let's make an environment like production and run load through that.
9
Individual tests take too long
10
But of course...
11
So, test all the time
12
So, test in production
#YOLO!
We deploy 10s of small changes a day and we have alerts. I bet we won't break production (without noticing)
#WhatCouldPossiblyGoWrong?
Let's just do it in production with fake traffic at the same time as customers!
13
Reasons why this isn't insane
14
How did we start doing this?
Technology aspects and people aspects
15
Have the idea to start
We didn't invent this
16
Choose scenarios we care about
17
Choose a load agent
18
Gain confidence outside of peak time
(This part is also about reassuring stakeholders that you've got it all under control...)
19
Start adding data variety
20
Make the computer do it every day
This is the most vital part!
21
Get more elaborate later
Fake away external dependencies
x-traffic-flavour: fake
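A minimal sketch of how a service might act on the x-traffic-flavour: fake header shown on this slide. The slide only gives the header; the payment gateway functions and every name below are hypothetical illustrations, not JUST EAT's implementation.

```python
from typing import Callable, Mapping

FAKE_HEADER = "x-traffic-flavour"  # header name taken from the slide
FAKE_VALUE = "fake"                # header value taken from the slide


def is_fake_traffic(headers: Mapping[str, str]) -> bool:
    """Return True when the request is marked as fake load."""
    # HTTP header names are case-insensitive, so normalise before comparing.
    normalised = {name.lower(): value.strip().lower() for name, value in headers.items()}
    return normalised.get(FAKE_HEADER) == FAKE_VALUE


def real_payment_gateway(order_id: str) -> str:
    # Hypothetical stand-in for a call to a real external dependency.
    return f"charged:{order_id}"


def fake_payment_gateway(order_id: str) -> str:
    # Hypothetical stub: makes no external call, so fake load never
    # reaches the third party.
    return f"fake-charged:{order_id}"


def choose_gateway(headers: Mapping[str, str]) -> Callable[[str], str]:
    """Pick the fake dependency for fake traffic, the real one otherwise."""
    return fake_payment_gateway if is_fake_traffic(headers) else real_payment_gateway
```

For example, choose_gateway({"X-Traffic-Flavour": "fake"}) selects the stub, while a request without the header falls through to the real dependency. Defaulting to the real path when the header is absent keeps customer traffic unaffected.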
22
And even more elaborate...
Fake away more complicated things
23
How have we kept doing this?
... and what did we learn to do better
24
Didn't allow tests to be red (for long)
25
Needed to tune levels over time
26
Got smarter about data management
27
Embraced the fact that things break
28
What battle scars did we get... lately?
All of these would have hurt badly if we hadn't had the ability to turn the pain off ourselves
29
Find unbounded result sets before customers
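One common guard against unbounded result sets is to cap the page size on the server side, whatever the caller asks for. The sketch below assumes an in-memory sequence and a cap of 100; both are illustrative choices, and the slide does not say how the problem was actually fixed.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

MAX_PAGE_SIZE = 100  # assumed cap, chosen only for illustration


def clamp_page_size(requested: int, maximum: int = MAX_PAGE_SIZE) -> int:
    """Force the requested page size into the range 1..maximum."""
    return max(1, min(requested, maximum))


def paginate(items: Sequence[T], page_size: int) -> Iterator[List[T]]:
    """Yield the items in pages that never exceed the clamped page size."""
    size = clamp_page_size(page_size)
    for start in range(0, len(items), size):
        yield list(items[start:start + size])
```

With this in place, a request for 10,000 rows over 250 items comes back as pages of 100, 100 and 50 rather than as one response that grows with the data.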
30
Monitoring needs to be solid!
31
Realise AWS account limits are closer than we thought...
Credit: StickerMule.com
32
Realise haproxy should balance, not magnify load...
33
Realise we're not as smart as we think...
Dear WinOps.org, this is why I had to leave early last year...
34
But...
Discovered problems during peacetime, not peak time
35
What did we gain?
36
Peace of mind, #1
Continuous, early, warning about:
Getting slower
Running out of capacity
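A "getting slower" warning can be sketched as a comparison of tonight's latency percentile against a baseline run. The nearest-rank percentile, the 95th percentile and the 20% tolerance are all assumptions made for illustration; the slides do not state which measure or threshold was used.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty sample set."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # Nearest-rank method: the smallest value with at least pct% of
    # samples at or below it.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def is_getting_slower(
    baseline_ms: Sequence[float],
    current_ms: Sequence[float],
    pct: float = 95.0,        # assumed percentile
    tolerance: float = 1.2,   # assumed: warn when 20% slower than baseline
) -> bool:
    """Return True when the current percentile exceeds the tolerated baseline."""
    return percentile(current_ms, pct) > tolerance * percentile(baseline_ms, pct)
```

Running this check after every nightly load run turns a gradual slowdown into a signal that appears well before customers feel it.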
37
Peace of mind, #2
Good, simple, clear operational response to most surprises:
Is fake load running? Stop it.
Scale up
Now, start to think
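The three steps above can be read as a fixed runbook: shed the fake load first, add capacity second, investigate last. The sketch below is one reading of the slide, using a hypothetical Platform record and an assumed scale step of two instances; it does not call any real cloud or load-agent API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Platform:
    """Hypothetical view of the system state an operator would act on."""
    fake_load_running: bool
    instance_count: int
    actions: List[str] = field(default_factory=list)


def respond_to_surprise(platform: Platform, scale_step: int = 2) -> List[str]:
    """Apply the runbook in order and return the actions taken."""
    # Step 1: fake load is the pain we control, so turn it off first.
    if platform.fake_load_running:
        platform.fake_load_running = False
        platform.actions.append("stopped fake load")
    # Step 2: buy headroom before diagnosing anything.
    platform.instance_count += scale_step
    platform.actions.append(f"scaled up to {platform.instance_count} instances")
    # Step 3: only now spend time on the root cause.
    platform.actions.append("start investigating")
    return platform.actions
```

The ordering is the point: the first two steps need no diagnosis, so they can be taken immediately by whoever is on call.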
38
Peace of mind, #3
If we find a problem Thursday night:
1. Turn off fake load for the weekend
2. Enjoy weekend
3. Fix it next week with less pressure
39
Performance & operability == 1st class concern
40
Alerts become automated tests in production
41
git push production is one step closer
Continuous testing in production can be applied to more than just performance & capacity
42
Online takeaway. Harder than you might think
We've got many open spots for talented engineers (London, Bristol, Kiev), if you're interested.
tech.justeat.com/jobs
Get in touch: peter.mounce@justeat.com