WHAT I WISH I KNEW BEFORE SCALING UBER TO 1,000 SERVICES
MATT RANNEY
Uber as of June 2016:
- Cities Worldwide: 400+
- Countries: 70
- Employees: 7,000-
LIFE LESSONS
WHY MICROSERVICES?
- Move and Release Independently
- Own your Uptime
- Use the “Best” tool for the job
WHAT ARE THE COSTS?
- Now you have a distributed system
- What if it breaks?
- Operational complexity
MICROSERVICES
- Immutable?
- Append Only?
LESS OBVIOUS COSTS
- Everything is a tradeoff
- You can build around problems
- Might trade complexity for politics
- You get to keep your biases
LANGUAGES
- Hard to share code
- Hard to move between teams
- WIWIK: Fragments the culture
RPC
- HTTP/REST gets complicated
- JSON needs a schema
- RPCs are slower than PCs
- WIWIK: servers are not browsers
HOW MANY REPOS
APRIL 2016
MAY 2016
JUNE 2016
OPERATIONAL
- What happens when things break?
- Can other teams release your service?
- Understand a service in the larger context
PERFORMANCE
- Depends on language tools
PERFORMANCE
- Doesn’t matter until it does
- Probably want at least simple perf requirements
- WIWIK: “good” not required, but “known” is
FANOUT
- overall latency ≥ latency of slowest
- 1ms avg, 1000ms p99
- use 1: 1% at least 1000ms
- use 100: 63% at least 1000ms
- 1.0 - 0.99^100 = 0.634 = 63.4%
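
The arithmetic on the fanout slide can be checked directly. This is a minimal sketch, assuming every backend call is slow independently and with the same probability (the slide's calculation implies this, though real backends are often correlated). The name `slow_fraction` is chosen here for illustration and does not come from the talk.

```python
def slow_fraction(per_call_slow: float, fanout: int) -> float:
    """Chance that at least one of `fanout` independent calls is slow.

    per_call_slow is the tail probability of a single call, e.g. 0.01
    when the slow threshold is that backend's p99 latency.
    """
    # P(no call is slow) = (1 - per_call_slow) ** fanout
    return 1.0 - (1.0 - per_call_slow) ** fanout


# The two cases from the slide: one backend, then one hundred backends.
print(f"use 1:   {slow_fraction(0.01, 1):.1%}")
print(f"use 100: {slow_fraction(0.01, 100):.1%}")
```

Under these assumptions a request that touches 100 backends hits at least one p99-slow call about 63% of the time, which matches the figure on the slide.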
[Figure: share of requests that are slow (0% to 100%) versus processes used (1, 2, 4, ..., 1024), with one curve each for p95, p99, and p99.9]
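
The curves in this chart can be approximated from the same formula used on the fanout slide. The sketch below assumes independent calls and treats "slow" as exceeding the given percentile of a single process; the names `FANOUTS`, `SLOW_TAILS`, and `curve_table` are illustrative, and the numbers are computed here rather than read off the original figure.

```python
# Fanout sizes on the chart's x axis: 1, 2, 4, ..., 1024.
FANOUTS = [2 ** k for k in range(11)]

# Probability that a single call is slower than each percentile threshold.
SLOW_TAILS = {"p95": 0.05, "p99": 0.01, "p99.9": 0.001}


def slow_fraction(tail: float, fanout: int) -> float:
    """Chance that at least one of `fanout` independent calls is slow."""
    return 1.0 - (1.0 - tail) ** fanout


def curve_table():
    """Return one row per fanout size: (fanout, p95, p99, p99.9 fractions)."""
    rows = []
    for n in FANOUTS:
        rows.append((n, *[slow_fraction(t, n) for t in SLOW_TAILS.values()]))
    return rows


print("procs  " + "  ".join(f"{name:>6}" for name in SLOW_TAILS))
for n, *fractions in curve_table():
    print(f"{n:>5}  " + "  ".join(f"{f:6.1%}" for f in fractions))
```

Even at the p99.9 threshold, more than half of all requests see at least one slow call once the fanout reaches 1024 processes.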
TRACING
- Lots of ways to get this
- Best way to understand fanout
TRACING
- Probably want sampling
- WIWIK: cross-lang context propagation
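
One way to picture sampling plus cross-language context propagation is to carry the trace context in plain request headers, which any language can read and write. The sketch below is an assumption-laden illustration, not the tracing system described in the talk: the header names, `TraceContext`, `start_trace`, `inject`, and `extract` are all hypothetical. The sampling decision is made once at the edge and then inherited by every downstream call.

```python
import random
import uuid
from dataclasses import dataclass

# Hypothetical header names; a real deployment would follow its tracer's spec.
TRACE_HEADER = "x-trace-id"
PARENT_HEADER = "x-parent-span-id"
SAMPLED_HEADER = "x-sampled"


@dataclass(frozen=True)
class TraceContext:
    trace_id: str
    span_id: str
    sampled: bool


def new_span_id() -> str:
    return uuid.uuid4().hex[:16]


def start_trace(sample_rate: float, rng=random.random) -> TraceContext:
    """Begin a trace at the edge and decide once whether to sample it."""
    return TraceContext(uuid.uuid4().hex, new_span_id(), rng() < sample_rate)


def inject(ctx: TraceContext) -> dict:
    """Serialize the context into headers for an outgoing call."""
    return {
        TRACE_HEADER: ctx.trace_id,
        PARENT_HEADER: ctx.span_id,
        SAMPLED_HEADER: "1" if ctx.sampled else "0",
    }


def extract(headers: dict, sample_rate: float) -> TraceContext:
    """Continue the caller's trace, or start a new one if none was sent."""
    trace_id = headers.get(TRACE_HEADER)
    if trace_id is None:
        return start_trace(sample_rate)
    # Reuse the caller's trace id and sampling decision; only the span is new.
    return TraceContext(trace_id, new_span_id(), headers.get(SAMPLED_HEADER) == "1")


root = start_trace(sample_rate=0.01)
child = extract(inject(root), sample_rate=0.01)
print(child.trace_id == root.trace_id, child.sampled == root.sampled)
```

Because the downstream service trusts the incoming sampled flag instead of rolling its own dice, a sampled trace stays complete across every hop, whatever language each hop is written in.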
LOGGING
- Need consistent, structured logging
- Multiple languages makes this hard
- Logs are not for humans
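
As a rough illustration of "consistent, structured logging", the sketch below emits one JSON object per log line using only Python's standard `logging` module. The field names (`ts`, `level`, `logger`, `msg`), the `fields` convention for extra data, and the `dispatch` and `trip_id` examples are assumptions made for this example, not a format from the talk.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as a single JSON object for machine consumption."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "ts": self.formatTime(record, "%Y-%m-%dT%H:%M:%S"),
            "level": record.levelname,
            "logger": record.name,
            "msg": record.getMessage(),
        }
        # Structured fields arrive via logging's `extra` mechanism.
        entry.update(getattr(record, "fields", {}))
        return json.dumps(entry, sort_keys=True)


def make_logger(name: str, stream) -> logging.Logger:
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    return logger


buffer = io.StringIO()
log = make_logger("dispatch", buffer)
log.info("trip matched", extra={"fields": {"trip_id": "t-123", "latency_ms": 42}})
print(buffer.getvalue(), end="")
```

The same key names would need to be agreed on for every language in use, which is the part the slide calls hard.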
LOGGING
- Logging floods can amplify problems
- WIWIK: Accounting
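
One possible reading of the "Accounting" note is that each service's log volume should be counted and capped, so a flood from one service can be attributed and contained. The sketch below shows that idea with a fixed-window budget per service; `LogBudget` and its fields are hypothetical names, and the talk does not specify any particular mechanism.

```python
from collections import defaultdict


class LogBudget:
    """Fixed-window log budget per service, with emitted/dropped accounting."""

    def __init__(self, max_per_window: int, window_seconds: float):
        self.max_per_window = max_per_window
        self.window_seconds = window_seconds
        self.window_start = defaultdict(float)
        self.in_window = defaultdict(int)
        self.total_emitted = defaultdict(int)
        self.total_dropped = defaultdict(int)

    def allow(self, service: str, now: float) -> bool:
        """Return True if `service` may emit a log line at time `now`."""
        # Open a fresh window once the previous one has expired.
        if now - self.window_start[service] >= self.window_seconds:
            self.window_start[service] = now
            self.in_window[service] = 0
        if self.in_window[service] < self.max_per_window:
            self.in_window[service] += 1
            self.total_emitted[service] += 1
            return True
        # Over budget: drop the line but keep count of what was lost.
        self.total_dropped[service] += 1
        return False


budget = LogBudget(max_per_window=3, window_seconds=1.0)
decisions = [budget.allow("svc-a", now=10.0) for _ in range(5)]
print(decisions, dict(budget.total_dropped))
```

The dropped counters are the accounting part: they show which service was flooding and how much was discarded, without letting the flood reach the logging pipeline.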
LOAD TESTING
- Need to test against production
- Without breaking metrics
- Preferably all the time
- WIWIK: all systems need to handle “test” traffic
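
Handling "test" traffic everywhere could look something like the sketch below: a marker travels with the request, every service forwards it downstream, and metrics are recorded under a separate name so production numbers stay clean. The header name `x-test-traffic`, the metric names, and the functions are hypothetical and only illustrate the idea.

```python
from collections import Counter

# Hypothetical marker header that the load generator sets on its requests.
TEST_HEADER = "x-test-traffic"


def is_test(headers: dict) -> bool:
    return headers.get(TEST_HEADER) == "1"


def outgoing_headers(incoming: dict) -> dict:
    """Forward the marker so downstream services also see test traffic."""
    return {TEST_HEADER: "1"} if is_test(incoming) else {}


def handle_request(headers: dict, metrics: Counter) -> dict:
    """Count the request under a tenant-specific metric and pass the marker on."""
    tenant = "test" if is_test(headers) else "prod"
    metrics[f"requests.{tenant}"] += 1
    return outgoing_headers(headers)


metrics = Counter()
handle_request({}, metrics)
handle_request({TEST_HEADER: "1"}, metrics)
print(dict(metrics))
```

A single service that drops the marker would leak test requests into production metrics downstream, which is why the slide says all systems need to handle it.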
FAILURE TESTING
- WIWIK: people won’t like it
MIGRATIONS
- Old stuff still has to work
- What happened to immutable?
- WIWIK: mandates are bad
OPEN SOURCE
- Build/buy tradeoff is hard
- Commoditization
- WIWIK: this will make people sad
POLITICS
- Services allow people to play politics
- Company > Team > Self
TRADEOFFS
- Everything is a tradeoff
- Try to make them intentionally
THANKS