Ways to minimise performance risks in continuous delivery
Transcript of Ways to minimise performance risks in continuous delivery
![Page 1: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/1.jpg)
WAYS TO MINIMISE PERFORMANCE RISKS IN CONTINUOUS DELIVERY
Adriaan Thomas, 4 June 2013
![Page 2: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/2.jpg)
INTRODUCTION
![Page 3: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/3.jpg)
OBJECTIVE
Put working software into production as quickly as possible, whilst minimising the risk of load-related problems:
• Bad response times
• Lack of capacity
• Availability too low
• Excessive system resource use
Within the context of websites.
![Page 4: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/4.jpg)
![Page 5: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/5.jpg)
TRADITIONAL APPROACH
Load testing through simulation
http://www.flickr.com/photos/danramarch/4423023837
![Page 6: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/6.jpg)
DECIDE WHAT TO TEST
• Focus on busiest instant
• Model most-hit functionality
• Extrapolate to expected load
• Look at production traffic
• Or attempt an educated guess
![Page 7: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/7.jpg)
DECIDE ON SCOPE
Component test
Chain test
Full environment test
• Test coverage
• Level of certainty
• Number of systems
• Amount of work
![Page 8: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/8.jpg)
SET UP TEST DATA
• Usually starts as a copy from production
• Or an educated guess at what people will enter
• Render anonymous
• Make tests deterministic
• Synchronise between all systems
http://www.flickr.com/photos/22168167@N00/3889737939/
![Page 9: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/9.jpg)
DECIDE ON STRATEGY
One or more of:
• Scalability test
• Stress test
• Endurance test
• Regression test
• Resilience test
http://www.flickr.com/photos/timjoyfamily/5935279962/
![Page 10: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/10.jpg)
DECIDE ON TEST DURATION
(which is tricky)
http://www.flickr.com/photos/wwarby/3297205226
![Page 11: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/11.jpg)
PROVIDE HARDWARE
http://www.flickr.com/photos/s_w_ellis/2681151694/
Copy of production?
Only one copy?
Virtualisation?
Sharing between teams?
![Page 12: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/12.jpg)
INTEGRATE INTO PIPELINE
Unit test: very fast
Functional integration test: fast
Load test: takes longer
![Page 13: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/13.jpg)
INTEGRATE INTO PIPELINE
Unit test
Functional integration test
Load test
Very fast Takes longer
![Page 14: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/14.jpg)
PERMANENT LOAD TESTING
Daytime: constant load, teams inspect impact of changes
Nighttime: Endurance test
Weekends: refresh test data
http://www.flickr.com/photos/renaissancechambara/5106171956/
![Page 15: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/15.jpg)
RESPONSE TIME
• DNS lookup (www.xebia.com)
• SSL handshake
• Time to first byte + loading HTML
• Time to render
• Time to document complete
• Parse times
• Blocking client code
• Browser CPU use
• Bandwidth
• # connections to a single host
http://www.webpagetest.org/result/130522_FG_10SC/1/details/
![Page 17: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/17.jpg)
CLEAR REQUIREMENTS
Response time
Fail: 10 Now: 3.5 Goal: 1
Intention: Users get a response quickly so that they are happy and spend more money.
Stakeholder: Marketing dept.
Scale: 95th percentile of “document complete” response times, in seconds, measured over one minute.
Metric: Page load times as reported by our RUM tool.
Inspired by Tom Gilb, Competitive Engineering
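The requirement above can be checked mechanically. The following is a minimal sketch, assuming a nearest-rank percentile and a made-up one-minute window of "document complete" times; the function names, the sample data and the reading of "Fail" as "at or above 10 seconds" are illustrative assumptions, not part of the original slides.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def assess(samples_s, fail=10.0, goal=1.0):
    """Compare the 95th percentile of a window against the Fail and Goal levels (seconds)."""
    p95 = percentile(samples_s, 95)
    if p95 >= fail:          # assumption: reaching the Fail level counts as failure
        verdict = "fail"
    elif p95 <= goal:
        verdict = "goal met"
    else:
        verdict = "above goal, below fail"
    return p95, verdict


# Hypothetical "document complete" times (seconds) collected over one minute.
window = [0.8, 1.2, 0.9, 3.4, 1.1, 0.7, 2.9, 1.0, 1.3, 3.5]
print(assess(window))
```

In practice the samples would come from the RUM tool named under Metric; the list here only stands in for that feed.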
![Page 18: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/18.jpg)
ADJUST REQUIREMENTS DUE TO LACK OF REAL BROWSERS
• WebPageTest: first view + repeat view (median of 3)
• 95th percentile response times from access logs
![Page 19: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/19.jpg)
Playground to test changes
No impact on real users
Less pressure
More work
Guesswork and extrapolation
Can take a significant amount of time
More hardware
![Page 20: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/20.jpg)
THINGS WILL BREAK...
... in spite of your best efforts
http://www.flickr.com/photos/jmarty/1239950166/
![Page 21: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/21.jpg)
SO INSTEAD WE SHOULD FOCUS ON FAST RECOVERY
http://www.flickr.com/photos/19107136@N02/8386567228/
![Page 22: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/22.jpg)
“MTTR is more important than MTBF*”
John Allspaw
* for most types of F
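One way to read the quote is through the standard steady-state availability formula, availability = MTBF / (MTBF + MTTR). The sketch below uses invented figures (100 hours between failures, 2 hours to repair) purely for illustration; they are not from the talk.

```python
def availability(mtbf_hours, mttr_hours):
    """Steady-state availability: fraction of time the system is up."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical figures, chosen only to show the shape of the trade-off.
baseline = availability(mtbf_hours=100, mttr_hours=2)
double_mtbf = availability(mtbf_hours=200, mttr_hours=2)   # fail half as often
halve_mttr = availability(mtbf_hours=100, mttr_hours=1)    # recover twice as fast

print(f"baseline:    {baseline:.4f}")
print(f"double MTBF: {double_mtbf:.4f}")
print(f"halve MTTR:  {halve_mttr:.4f}")
```

Under this formula, halving MTTR buys the same availability as doubling MTBF. Which of the two is cheaper to achieve depends on the system, but recovery time is the part a team can usually measure and practise directly.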
![Page 23: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/23.jpg)
MTBF LEADS TO FUD
[Chart: 99th percentile response time (s), axis from 0 to 2.0, plotted against test duration]
![Page 24: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/24.jpg)
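[Timeline diagram, time running left to right. TTD and TTR are marked on the timeline.]
Phases: TTD → find cause (RCA) → write & test fix → build → deploy → validate
Sub-steps labelled on the diagram: compile, deploy & test
Factors listed against the phases:
• Monitoring
• Alerts
• Skills
• Organisation
• Culture
• Maintainability
• Simple architecture
• Fast workstations
• Good tooling
• Able to quickly test locally
• Automation
• Fast build server
• Efficient tests
• Monitoring
• Automation
• Flexible architecture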
![Page 25: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/25.jpg)
DEMING FEEDBACK LOOPS
Plan
Do
Study
Act
![Page 26: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/26.jpg)
OODA LOOPS
Observe
Orient
Decide
Act
![Page 27: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/27.jpg)
AVOID TEST-ONLY MEASUREMENTS
![Page 28: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/28.jpg)
SIMPLE ARCHITECTURE
![Page 29: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/29.jpg)
THE ONLY THING THAT MATTERS IS WHAT HAPPENS IN PRODUCTION
Everything else is an assumption.
![Page 30: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/30.jpg)
DEPLOYING CHANGES
http://www.flickr.com/photos/39463459@N08/5083733600
![Page 31: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/31.jpg)
![Page 32: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/32.jpg)
BLUE-GREEN DEPLOYMENTS
Version n+1
Version n
Amazon Route 53
Elastic Load Balancer
Elastic Load Balancer
Instances
Instances
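The diagram suggests two complete stacks, with the switch between them made in front of the load balancers. Below is a minimal in-memory sketch of that idea; `BlueGreenRouter` and its health-check callback are hypothetical names used for illustration and do not represent the Route 53 or Elastic Load Balancer APIs.

```python
class BlueGreenRouter:
    """Keeps two stacks and sends all traffic to exactly one of them."""

    def __init__(self, live, idle):
        self.live = live    # stack currently serving users (version n)
        self.idle = idle    # stack holding the new release (version n+1)

    def route(self):
        """Return the stack that should receive the next request."""
        return self.live

    def switch(self, healthy):
        """Swap stacks only if the idle one passes the supplied health check."""
        if not healthy(self.idle):
            return False
        self.live, self.idle = self.idle, self.live
        return True

    def rollback(self):
        """Swap back; the previous version is still running, so this is fast."""
        self.live, self.idle = self.idle, self.live


router = BlueGreenRouter(live="version-n", idle="version-n+1")
router.switch(healthy=lambda stack: True)   # stand-in for a real health check
print(router.route())
```

Because the old stack stays up after the switch, recovering from a bad release is a second swap rather than a redeploy.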
![Page 33: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/33.jpg)
DARK LAUNCHING
Web page, DB
![Page 34: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/34.jpg)
DARK LAUNCHING
Web page, DB, Weather SP
![Page 35: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/35.jpg)
DARK LAUNCHING
Web page, DB, Weather SP
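A rough sketch of what the three diagrams appear to describe, assuming "Weather SP" is a new weather service provider being exercised with production traffic before users see its output. All names are hypothetical. The page is still built from the DB alone; the new call is timed, its result is thrown away, and its failures are swallowed.

```python
import time


def render_page(load_from_db, call_weather=None, timings=None):
    """Serve the page from the DB; optionally exercise the new service invisibly."""
    page = load_from_db()
    if call_weather is not None:
        started = time.perf_counter()
        try:
            call_weather()        # result deliberately ignored: users never see it
            outcome = "ok"
        except Exception:
            outcome = "error"     # a failing dark call must not break the page
        if timings is not None:
            timings.append((outcome, time.perf_counter() - started))
    return page


timings = []
page = render_page(lambda: "<html>home</html>", call_weather=lambda: "sunny", timings=timings)
print(page, timings[0][0])
```

A real implementation would probably make the dark call asynchronously so that it cannot add latency to the page; it is kept synchronous here to stay short.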
![Page 36: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/36.jpg)
FEATURE TOGGLES
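A minimal sketch of a feature toggle, assuming an in-memory flag store; the class, the flag name `weather_widget` and the page fragments are invented for illustration. The point is that the new code ships switched off and can be turned on or off without a deployment.

```python
class FeatureToggles:
    """In-memory flag store; unknown flags default to off."""

    def __init__(self, flags=None):
        self._flags = dict(flags or {})

    def is_enabled(self, name):
        return self._flags.get(name, False)

    def set(self, name, enabled):
        self._flags[name] = bool(enabled)


def homepage(toggles):
    """Build the page, including the new feature only when its toggle is on."""
    parts = ["search box"]
    if toggles.is_enabled("weather_widget"):   # hypothetical flag name
        parts.append("weather widget")
    return parts


toggles = FeatureToggles()
print(homepage(toggles))            # new feature is deployed but hidden
toggles.set("weather_widget", True)
print(homepage(toggles))            # switched on without a new release
```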
![Page 37: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/37.jpg)
CANARY RELEASING
0% 100%
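The 0% to 100% scale can be sketched as a percentage dial. The example below is one possible approach, not the method used in the talk: it hashes a user ID into one of 100 buckets, so a given user stays on the same side while the percentage is raised. The function names and user IDs are assumptions.

```python
import hashlib


def bucket(user_id):
    """Map a user ID to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def serves_canary(user_id, percent):
    """True if this user should be sent to the new version at the given rollout percentage."""
    return bucket(user_id) < percent


users = [f"user-{n}" for n in range(1000)]
for percent in (0, 5, 50, 100):
    share = sum(serves_canary(u, percent) for u in users)
    print(f"{percent:3d}% dial -> {share} of {len(users)} users on the canary")
```

Raising the dial only ever adds users to the canary group, and setting it back to 0 removes everyone at once, which keeps recovery quick.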
![Page 38: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/38.jpg)
PRODUCTION-IMMUNE SYSTEMS
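A production immune system can be thought of as an automated check that rejects a release when production metrics degrade. The sketch below shows one possible decision rule under assumed thresholds (20% slowdown, one percentage point more errors); the function name, parameters and limits are all hypothetical.

```python
def should_roll_back(baseline_p95_s, canary_p95_s,
                     baseline_error_rate, canary_error_rate,
                     max_slowdown=1.2, max_error_increase=0.01):
    """Decide whether the new version is measurably worse than the current one."""
    too_slow = canary_p95_s > baseline_p95_s * max_slowdown
    too_many_errors = canary_error_rate > baseline_error_rate + max_error_increase
    return too_slow or too_many_errors


# Hypothetical readings taken from monitoring shortly after a deployment.
print(should_roll_back(1.0, 1.1, 0.010, 0.012))   # within both limits
print(should_roll_back(1.0, 1.5, 0.010, 0.010))   # response times degraded
```

Wired to a deployment mechanism such as a canary dial or a blue-green switch, a rule like this turns detection and recovery into one automated step.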
![Page 39: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/39.jpg)
CONTROLLED LOAD TESTING
Amazon Route 53
Elastic Load Balancer
Instance
Instance
Instance
RDS DB Instance
RDS DB Instance Read Replica
![Page 40: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/40.jpg)
MONITORING
http://www.flickr.com/photos/smieyetracking/5609671098/
![Page 41: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/41.jpg)
MONITORING
Technical metrics
• CPU use
• Memory use
• TPS
• Response times
• etc
Process metrics
• # bugs
• MTTR, MTTD
• Time from idea to live on site
• etc
Business metrics
• Revenue
• # unique visitors
• etc
http://www.flickr.com/photos/smieyetracking/5609671098/
![Page 42: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/42.jpg)
MEASURE IMPACT OF CHANGES
![Page 43: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/43.jpg)
tail -f access_log | alstat.pl -i10 -n10 -stt
Hits  Hits%  TPS  AvgTmTk  TTmTk%  AvgRSize  RSize%  2013-06-04 19:37:40 (08)
  14   0.1%  1.4    1.652    5.7%      2691    0.2%  POST 200 /login.do
  14   0.1%  1.4    0.918    3.2%      3739    0.3%  GET 200 /home.do
  14   0.1%  1.4    0.879    3.1%      3185    0.2%  POST 200 /order.do
   7   0.1%  0.7    0.807    1.4%      1974    0.1%  POST 200 /account.do
   4   0.0%  0.4    0.735    0.7%      3228    0.1%  GET 200 /products.do
   5   0.0%  0.5    0.697    0.9%       969    0.0%  POST 200 /settings.do
   9   0.1%  0.9    0.687    1.5%      1827    0.1%  POST 200 /changeorder.do
  27   0.2%  2.7    0.649    4.3%      2997    0.4%  POST 200 /newpasswd.do
  15   0.1%  1.5    0.580    2.2%      2488    0.2%  GET 200 /offer.do
  95   0.9%  9.5    0.520   12.2%      4801    2.3%  GET 200 /search.do
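The kind of per-request summary shown above can be approximated in a few lines. This is a sketch of the idea only, not a reimplementation of alstat.pl: it assumes a simplified log line of the form `METHOD STATUS PATH SECONDS`, which is an invented format, and it reports hits, TPS and average time taken, slowest first. In the output above, TPS equals hits divided by 10, which is consistent with a 10-second interval.

```python
from collections import defaultdict


def summarise(lines, interval_s, top_n=10):
    """Aggregate log lines per (method, status, path) and rank by average time taken."""
    hits = defaultdict(int)
    total_time = defaultdict(float)
    for line in lines:
        method, status, path, seconds = line.split()   # assumed simplified format
        key = (method, status, path)
        hits[key] += 1
        total_time[key] += float(seconds)
    rows = [
        {
            "request": " ".join(key),
            "hits": count,
            "tps": count / interval_s,
            "avg_time_s": total_time[key] / count,
        }
        for key, count in hits.items()
    ]
    rows.sort(key=lambda row: row["avg_time_s"], reverse=True)
    return rows[:top_n]


# Hypothetical lines collected during one 10-second interval.
sample = [
    "POST 200 /login.do 1.5",
    "POST 200 /login.do 2.5",
    "GET 200 /search.do 0.5",
]
for row in summarise(sample, interval_s=10):
    print(row)
```

Running a summary like this before and after a deployment gives a quick view of which requests a change made slower.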
![Page 44: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/44.jpg)
MEASURE LATENCY
Avg. response times front end vs backend
Number of calls
![Page 45: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/45.jpg)
SMALL DEPLOYMENTS
http://www.flickr.com/photos/rbulmahn/4925464931/
![Page 46: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/46.jpg)
GO/NO-GO MEETINGS
• What are the biggest fears?
• How can we measure this?
• What can be done if it does happen?
![Page 47: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/47.jpg)
RETROSPECTIVES
How can we prevent a failure from happening again?
How can we detect it earlier?
Was there only one root cause?
http://www.flickr.com/photos/katerha/8380451137
![Page 48: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/48.jpg)
INTRODUCE OUTAGES
Chaos monkey
Game day exercises
http://www.flickr.com/photos/frostnova/440551442/
![Page 49: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/49.jpg)
CULTURE
• Dev and Ops work together on providing information.
• Assumptions are dangerous; try to eliminate as many as possible.
• Small changes are easier to fix than large ones.
• Deploy during office hours so everyone is available if problems occur.
• All information, including business metrics, should be accessible to everyone.
![Page 50: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/50.jpg)
CLAMS
Culture
Lean
Automation
Measurement
Sharing
![Page 51: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/51.jpg)
SIMPLE, FLEXIBLE ARCHITECTURE
• If the site goes down often, its architecture is probably at fault
• Avoid fragile systems
• Resilience is key
• Scalable (redundancy is not waste)
• Prefer many small systems to a few large ones
• State is a “hot brick”
![Page 52: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/52.jpg)
CHANGES FOR THE BUSINESS
• Accept pushing smaller changes.
• Continuous delivery vs continuous deployment.
• Share data.
![Page 53: Ways to minimise performance risks in continuous delivery](https://reader036.fdocuments.us/reader036/viewer/2022070323/5597db361a28abb35e8b486c/html5/thumbnails/53.jpg)
CONCLUSION
Work on your ability to respond to failure. Trying to prevent failure can slow you down and make you focus on the wrong things.
Keep assumptions clearly separated from facts. Make your decisions based on evidence.
Measure everything, including the impact of changes to the business.
Look for your own compromise: try permanent load testing first and learn from that.