HAProxy tech talk
Uploaded by icebourg
I ❤ HAProxy
National Airspace System - FAA
Simplified Web Architecture
• Clients: iPhones, Androids, Browsers
• Web Server: Nginx, Apache, lighttpd
• Dynamic: PHP, Ruby, Perl, Python, Node.js
• “Data”: Memcache, PostgreSQL, MySQL, Mongo, CouchDB, Redis, Oracle
ChOP Architecture
• Clients: iPhone, Android, Desktop
• Web Server: Nginx
• Dynamic: PHP5-FPM
• “Data”: Memcache, MySQL, Redis
• Chat
YouVersion Architecture
• Clients: iPhone, Android, Desktop
• Web Server: Nginx
• Dynamic: PHP5-FPM, Ruby (coming soon)
• “Data”: Memcache, PostgreSQL, Mongo, Oracle
HAProxy
• High Availability Proxy
• TCP load balancing proxy with awesome health checking built in
• Fast
• Scalable
• Makes non-HA services HA
How I Love Thee, Let Me Count The Ways…
• Rock solid
• Dead simple to run and configure
• Comprehensive health checking
• Lots of statistics
HAProxy Uses
• Not really a service unto itself
• Fits into the gaps between layers well
• Issue: becomes a single point of failure itself
[Diagram: the layers Clients, Web Server, Dynamic Engine, and “Data”, with an HAProxy box in the gaps between them. Two of the three HAProxy boxes are marked * for potential future use.]
Eliminating SPOFs
• Two types of HAProxy SPOFs:
  • Service Outage (hardware failure or HAProxy service failure)
  • HAProxy Limit Outage / Upstream Outage (hit some arbitrary limit we defined somewhere, or ran out of slots somewhere)
Service Outage
• HAProxy service crashes or dies for some reason (has never happened, knock on wood)
• Hardware / network failure
Service Outage: Solution
• Corosync & Pacemaker
• Hard to configure at first, but you don’t really need to touch it later
• Pretty much magic
• Two Corosync HAProxy clusters: DFW and SAN
• Setup is blogged about here: http://itand.me/41901523
HAProxy Limit Outage / Upstream Outage
• Usually because of an outage further upstream, at the Dynamic or “Data” layer
• Completely hypothetical situation: Mongo slows down, PHP processes back up, the connection count goes through the roof, and the result is a total outage
What it looks like on the graph (Yesterday)
OR: WHY WE MUST MOVE MONGO STAT!
For ChOP (Chat), it’s a little different…
Upstream Outage
• Usually the result of running out of PHP processes
• Normally each PHP process can handle hundreds of req/s
• Something slows them down (Mongo, Postgres, et al.), so each process handles far fewer req/s (or, worse, takes seconds per request)
• Inevitably, the slow requests tie up every PHP process, nothing else can run, HAProxy fails all health checks, and it shows you Binary Jesus
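
To put rough numbers on the bullets above, here is a back-of-the-envelope sketch in Python. The pool size, latencies, and arrival rate are made-up assumptions for illustration, not figures from the talk, and the function names are hypothetical.

```python
def pool_capacity(processes: int, ms_per_request: int) -> float:
    """Maximum sustained requests/second for a pool of identical workers."""
    return processes * 1000 / ms_per_request


def busy_processes(arrival_rate: int, ms_per_request: int) -> float:
    """Little's law: average number of workers occupied at a given load."""
    return arrival_rate * ms_per_request / 1000


PROCESSES = 50        # hypothetical PHP-FPM pool size
ARRIVAL_RATE = 2000   # hypothetical incoming requests per second

# Healthy: 5 ms per request, i.e. 200 req/s per process.
healthy_capacity = pool_capacity(PROCESSES, 5)
healthy_busy = busy_processes(ARRIVAL_RATE, 5)

# Degraded: a slow datastore pushes each request to 500 ms.
degraded_capacity = pool_capacity(PROCESSES, 500)
degraded_busy = busy_processes(ARRIVAL_RATE, 500)

# The pool is saturated once the demand for workers exceeds the pool size.
saturated = degraded_busy > PROCESSES
```

Under these assumed numbers the same traffic that keeps about 10 processes busy when healthy would need about 1000 processes once requests take 500 ms, so a 50-process pool has nothing left over to answer health checks.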
“Solutions”
• Start hashing URLs to avoid upstream failures
  • Want to send all requests for the same URL to the same app server, so if it’s slow only that app server goes down
  • Some benefit to caching as well
  • Challenge: want to hash only part of a URL
  • Challenge: need to separate app servers into “availability groups”
  • Challenge: deployments, monitoring, alerting, all that crap…
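
The "hash only part of a URL" idea can be sketched roughly as follows. This is an illustration in Python, not the configuration used in the talk: the server names, the example URLs, the choice of SHA-256, and the `depth` parameter are all assumptions.

```python
import hashlib
from urllib.parse import urlsplit

APP_SERVERS = ["app1", "app2", "app3", "app4"]  # hypothetical server names


def hash_key(url: str, depth: int = 2) -> str:
    """Keep only the first `depth` path segments; drop the rest and the query string."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    return "/" + "/".join(segments[:depth])


def pick_server(url: str, servers=APP_SERVERS, depth: int = 2) -> str:
    """Map the partial URL to one server with a stable hash and a modulo."""
    digest = hashlib.sha256(hash_key(url, depth).encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[index]


# Requests that share a prefix land on the same app server,
# so one slow endpoint can only tie up that one server.
a = pick_server("http://example.com/3.1/notes/items.json?page=1")
b = pick_server("http://example.com/3.1/notes/view.json?id=42")
same_server = a == b
```

A plain modulo reshuffles most keys whenever the server list changes, which is one reason the deployment challenge above matters. HAProxy's own `balance uri` algorithm accepts `len` and `depth` arguments for a similar purpose; this sketch does not reproduce its hash function.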
HAProxy Limit Outage
• We set limits on all HAProxy backends, frontends, and servers to ensure they don’t get overwhelmed
• Sometimes these limits are too low
• Solution: raise them
• Challenge: raise them too high without regard for the backend, and you could cause more harm than good (Stampeding Herd)
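
The trade-off in these limits can be illustrated with a toy model of a per-server connection cap with a bounded queue. This is a simplified sketch in Python, not HAProxy's implementation; the class name, method names, and numbers are hypothetical.

```python
from collections import deque


class LimitedServer:
    """Toy model: forward up to `maxconn` requests, queue a few more, reject the rest."""

    def __init__(self, maxconn: int, maxqueue: int):
        self.maxconn = maxconn
        self.maxqueue = maxqueue
        self.active = 0        # requests currently in flight on the backend
        self.queue = deque()   # requests waiting for a free slot
        self.rejected = 0

    def offer(self, request_id) -> str:
        if self.active < self.maxconn:
            self.active += 1
            return "forwarded"
        if len(self.queue) < self.maxqueue:
            self.queue.append(request_id)
            return "queued"
        self.rejected += 1
        return "rejected"

    def finish(self) -> None:
        """One in-flight request completed; a queued request takes the slot if any."""
        if self.active == 0:
            return
        if self.queue:
            self.queue.popleft()   # slot is reused, so `active` is unchanged
        else:
            self.active -= 1


server = LimitedServer(maxconn=2, maxqueue=1)
outcomes = [server.offer(i) for i in range(4)]
```

With a cap that is too low, healthy traffic gets queued or rejected even though the backend has room. With a cap far above what the backend can serve, every request is forwarded at once and the backend is the thing that falls over.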
Q&A