SLOs for Data-Intensive Services (USENIX)
Yoann Fouquet, Booking.com
● Agenda
1 SLO Refresher
2 Our reservation system
3 SLO definition journey
4 Benefits
● SLIs, SLOs
Service Level Indicator (SLI): quantitative measure
Example: availability
Service Level Objective (SLO): SLI ≥ target
Example: availability over 1 week ≥ 99.99%
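As a minimal sketch of the SLI/SLO relationship above, the snippet below computes an availability SLI over a measurement window and compares it with a target. The `RequestWindow` type and both function names are hypothetical, and measuring availability as a ratio of successful to total requests is an assumption; the slides do not say how the SLI is measured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestWindow:
    """Request counts observed over one measurement window (e.g. one week)."""
    successful: int
    total: int


def availability_sli(window: RequestWindow) -> float:
    """Return the fraction of successful requests in the window.

    Assumption: a window with no traffic counts as fully available.
    """
    if window.total == 0:
        return 1.0
    return window.successful / window.total


def meets_slo(sli: float, target: float) -> bool:
    """An SLO is met when the SLI is at or above its target."""
    return sli >= target


# Example with made-up counts: 50 failures out of 1,000,000 requests.
week = RequestWindow(successful=999_950, total=1_000_000)
slo_met = meets_slo(availability_sli(week), target=0.9999)
```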
● Scale highlights
1,500,000+ experiences booked every 24 hours
23 years since launch (founded in 1996)
50,000+ physical servers across 4 datacenters
● Reservation system
Search Service
Reservation Service
Creation, Modification
... Search queries
Data nodes (×3)
Gateway
Stream (×2)
● First SLOs
Search Service
Reservation Service
Availability, Latency
Data nodes (×3)
Gateway
Availability, Latency
Reservation success rate
Stream (×2)
● Stakeholders' reaction: Reservation service
● Stakeholders' reaction: Search service
● Missing SLOs
Search Service
Reservation Service
Data nodes (×3)
Gateway
Stream (×2)
Freshness?
Accuracy?
Consistency?
Durability?
● Consistency SLO
Search Service
Reservation Service
Data nodes (×3)
Gateway
Probe
Get order IDs, search orders and compare
● Consistency SLO
99.99% of reservations are consistent among all data nodes
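The probe described above (get order IDs, look each one up, compare) could be sketched roughly as follows. This is an illustration under stated assumptions, not the speaker's implementation: each data node is modelled as a hypothetical callable that maps a reservation ID to its stored record (or `None` if missing), and a reservation counts as consistent only when every node returns the same non-missing record.

```python
from typing import Callable, Hashable, Iterable, Optional, Sequence

# Hypothetical node interface: reservation ID -> stored record, or None if absent.
NodeLookup = Callable[[str], Optional[Hashable]]


def consistency_ratio(
    reservation_ids: Iterable[str], nodes: Sequence[NodeLookup]
) -> float:
    """Return the fraction of reservations that are identical on all nodes.

    Assumption: a reservation missing from any node counts as inconsistent,
    because the probe obtained its ID from the source of truth.
    """
    ids = list(reservation_ids)
    if not ids:
        return 1.0  # nothing to check, so nothing is inconsistent
    consistent = 0
    for rid in ids:
        answers = {node(rid) for node in nodes}
        if len(answers) == 1 and None not in answers:
            consistent += 1
    return consistent / len(ids)


# Example with two in-memory "nodes" (plain dicts standing in for real nodes).
node_a = {"r1": "confirmed", "r2": "confirmed"}
node_b = {"r1": "confirmed", "r2": "cancelled"}
ratio = consistency_ratio(["r1", "r2"], [node_a.get, node_b.get])
```

The resulting ratio is the SLI that would be compared against the 99.99% target.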
● Consistency SLO (2nd attempt)
Search Service
Data nodes (×3)
Gateway
compare
● Consistency SLO (2nd attempt)
99.99% of search results are consistent
● Freshness SLO
Reservation Service
Data nodes (×3)
Gateway
Probe
Get recent orders, search orders
● Freshness SLO
99.9% of reservations are available within xx seconds
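A freshness SLI of this shape could be computed along the following lines. This is a hedged sketch: the sample format (creation time, time first seen in search) and the function name are hypothetical, and the delay threshold is left as a parameter because the slide does not give its value.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical probe sample: (created_at, visible_at) in seconds since some epoch.
# visible_at is None when the reservation was never found by the search.
Sample = Tuple[float, Optional[float]]


def freshness_ratio(samples: Iterable[Sample], max_delay_s: float) -> float:
    """Return the fraction of reservations searchable within max_delay_s.

    Assumption: a reservation that never became visible counts as not fresh.
    """
    total = 0
    fresh = 0
    for created_at, visible_at in samples:
        total += 1
        if visible_at is not None and visible_at - created_at <= max_delay_s:
            fresh += 1
    return fresh / total if total else 1.0


# Example with made-up timestamps and an arbitrary 2-second threshold.
samples = [(0.0, 1.0), (0.0, 5.0), (0.0, None), (10.0, 11.5)]
sli = freshness_ratio(samples, max_delay_s=2.0)
```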
● Accuracy/Durability SLO
● Current data SLOs
Search Service
Reservation Service
Data nodes (×3)
Gateway
Stream (×2)
Data freshness, Data consistency
● Reservation SLOs
Reservation Service
Stream (×2)
Hadoop MR Durability
Consumer
Probe
Completeness, Latency
● Availability / Latency SLOs
● Availability / Latency SLOs: Buckets (manual)
Query 1, Query 5, ...: SLO latency 50 ms, SLO availability
Query 8, Query 2, ...: SLO latency 100 ms, SLO availability
Query 3, Query 4, Query 6, Query 7, ...: No objectives
● Availability / Latency SLOs: Buckets (automated)
Score ≤ X AND Timeout ≥ x: SLO latency 50 ms, SLO availability
X ≤ Score ≤ Y AND Timeout ≥ y: SLO latency 100 ms, SLO availability
Score ≥ Y OR low timeout: No objectives
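The automated bucketing rules above can be read as a small classification function. The sketch below is one possible interpretation, not the speaker's code: the slide does not define what "Score" measures or give values for X, Y, x and y, so they are plain parameters here, and anything matching neither of the first two rules (including a "low timeout") is assumed to fall into the bucket without objectives.

```python
from enum import Enum


class Bucket(Enum):
    FAST = "SLO latency 50 ms + SLO availability"
    SLOW = "SLO latency 100 ms + SLO availability"
    NONE = "no objectives"


def classify_query(
    score: float,
    timeout_ms: float,
    *,
    score_x: float,
    score_y: float,
    timeout_x_ms: float,
    timeout_y_ms: float,
) -> Bucket:
    """Assign a query to an SLO bucket from its score and client timeout.

    Assumption: rules are tried in order, so a query sitting exactly on a
    threshold lands in the first bucket whose rule it satisfies.
    """
    if score <= score_x and timeout_ms >= timeout_x_ms:
        return Bucket.FAST
    if score_x <= score <= score_y and timeout_ms >= timeout_y_ms:
        return Bucket.SLOW
    # Score above Y, or timeout too low for the bucket: no objectives apply.
    return Bucket.NONE


# Example thresholds are arbitrary placeholders, not values from the talk.
limits = dict(score_x=10, score_y=100, timeout_x_ms=50, timeout_y_ms=100)
bucket = classify_query(score=5, timeout_ms=200, **limits)
```

Deriving the bucket from query properties rather than from a hand-maintained list means a new query gets its objectives without anyone editing a list.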
Was it worth it?
● Auto. Mitigation
Search Service
Reservation Service
Stream (×2)
Gateway
Data nodes (×3)
Freshness Probe
Stop traffic
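One way to picture this mitigation loop: the freshness probe reports how far each data node lags, and the gateway stops sending traffic to nodes that lag too much. The sketch below assumes a hypothetical per-node lag map and a lag threshold, neither of which appears in the slides; what should happen when every node is stale is deliberately left out.

```python
from typing import Dict, Set, Tuple


def select_serving_nodes(
    lag_s_by_node: Dict[str, float], max_lag_s: float
) -> Tuple[Set[str], Set[str]]:
    """Split nodes into (serving, stopped) based on measured freshness lag.

    Assumption: a node is stopped as soon as its lag exceeds max_lag_s.
    """
    stopped = {node for node, lag in lag_s_by_node.items() if lag > max_lag_s}
    serving = set(lag_s_by_node) - stopped
    return serving, stopped


# Example with made-up node names and lag values (seconds).
serving, stopped = select_serving_nodes(
    {"node-a": 1.0, "node-b": 30.0, "node-c": 2.5}, max_lag_s=5.0
)
```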
● Auto. Repair
Search Service
Reservation Service
Stream (×2)
Gateway
Hadoop MR Dump
Daily snapshot push
Data nodes (×3)
Completeness Probe
Re-process
Fix
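A completeness probe feeding a repair step might look like the following sketch. It assumes, without confirmation from the slides, that the daily snapshot serves as the reference set of reservation IDs and that repair means re-processing whatever a data node is missing; the function names are hypothetical.

```python
from typing import Iterable, Set


def missing_reservations(
    snapshot_ids: Iterable[str], node_ids: Iterable[str]
) -> Set[str]:
    """Return IDs present in the reference snapshot but absent from the node."""
    return set(snapshot_ids) - set(node_ids)


def completeness_ratio(
    snapshot_ids: Iterable[str], node_ids: Iterable[str]
) -> float:
    """Return the fraction of snapshot reservations that the node holds."""
    expected = set(snapshot_ids)
    if not expected:
        return 1.0  # empty reference: the node cannot be missing anything
    return len(expected & set(node_ids)) / len(expected)


# Example with made-up IDs: the node is missing two of four reservations.
snapshot = ["r1", "r2", "r3", "r4"]
on_node = ["r1", "r3"]
to_reprocess = missing_reservations(snapshot, on_node)
completeness = completeness_ratio(snapshot, on_node)
```

The set of missing IDs is what a re-process step would consume, while the ratio is the SLI to track against the completeness objective.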
● Biggest gains
Awareness
Confidence
Thank you!
All references to “Booking.com”, including any mention of “us”, “we” and “our”, refer to Booking.com BV, the company behind Booking.com™
We’re Hiring
careers.booking.com