Connection Handling & Throttling with HAProxy

Paul Namuag

HAProxy continues to evolve into ever more powerful load balancing and proxying software. It's one of the most popular high availability solutions and can work as a proxy for Layer 4 (TCP) and Layer 7 (HTTP) within the OSI Model.

HAProxy is known as an event-driven, non-blocking engine combining a very fast I/O layer with a priority-based, multi-threaded scheduler. As it is designed with a data forwarding goal in mind, its architecture is designed to operate in lightweight processes which are optimized to move data as fast as possible with the least possible operations. It focuses on optimizing CPU cache efficiency by sticking connections to the same CPU as long as possible. As such it implements a layered model offering bypass mechanisms at each level, ensuring data doesn't reach higher levels unless needed. Most of the processing is performed in the kernel. HAProxy does its best to help the kernel do the work as fast as possible by giving some hints or by avoiding certain operations when it guesses they could be grouped later. As a result, typical figures show 15% of the processing time spent in HAProxy versus 85% in the kernel in TCP or HTTP close mode, and about 30% for HAProxy versus 70% for the kernel in HTTP keep-alive mode.

HAProxy offers more than basic load balancing. For example, its TCP proxying feature allows us to use it for database connections, such as for MySQL or PostgreSQL (or even Redis), using its built-in check support. Even though there is database service support, the built-in checks do not suffice for the desired health check, especially for a replication type of cluster. The best approach is to use an external TCP check via xinetd together with HAProxy.
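For example, an xinetd service on each database node can expose a health-check script over TCP on port 9200, which HAProxy then probes with tcp-check (the port matches the configuration shown later in this post; the script path and response string are assumptions, a minimal sketch rather than a definitive setup):

# /etc/xinetd.d/mysqlchk
service mysqlchk
{
        disable        = no
        flags          = REUSE
        socket_type    = stream
        port           = 9200
        wait           = no
        user           = nobody
        # an external script that checks replication state and prints
        # a response such as "master is running" or "slave is running"
        server         = /usr/local/sbin/mysqlchk.sh
        log_on_failure += USERID
        per_source     = UNLIMITED
}

Depending on the xinetd version, the mysqlchk service may also need an entry in /etc/services (e.g. "mysqlchk 9200/tcp") or a "type = UNLISTED" line for xinetd to bind the port.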

The Features of HAProxy

HAProxy can also maintain stateful operations: using stick-tables, Layer 7 load balancing with session persistence can be achieved. That is just one of its basic features; let's go through the most common ones.

Proxying 

Proxying is the action of transferring data between a client and a server over two independent connections. Some of the proxying and connection-management features HAProxy supports that are relevant to connection handling for databases are:

  • Provide servers with a clean connection to protect them against any client-side defect or attack;
  • Listen to multiple IP addresses and/or ports, even port ranges;
  • Transparent accept: intercept traffic targeting any arbitrary IP address that doesn't even belong to the local system;
  • Provide a reliable return IP address to the servers in multi-site LBs;
  • Offload the servers thanks to buffers and possibly short-lived connections to reduce their concurrent connection count and memory footprint;
  • Support different protocol families on both sides (e.g. IPv4/IPv6/Unix);
  • Timeout enforcement: HAProxy supports multiple levels of timeouts depending on the stage the connection is in, so that a dead client or server, or an attacker, cannot be granted resources for too long;
  • Protocol validation: HTTP, SSL, or payload are inspected and invalid protocol elements are rejected, unless instructed to accept them anyway;
  • Policy enforcement: ensure that only what is allowed may be forwarded;
  • Both incoming and outgoing connections may be limited to certain network namespaces (Linux only), making it easy to build a cross-container, multi-tenant load balancer.
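Several of these features map directly to a few configuration lines. As an illustrative sketch (the addresses, ports, and timeouts below are assumptions):

frontend mysql_front
        # listen on multiple addresses/ports, even a port range
        bind 192.168.10.210:3307
        bind *:3310-3312
        mode tcp
        timeout client 10800s          # timeout enforcement per connection stage
        default_backend mysql_back

backend mysql_back
        mode tcp
        timeout connect 3500ms
        timeout server 10800s
        server db1 192.168.10.60:3306 check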

High Availability

HAProxy cares a lot about availability to ensure the best global service continuity. It only uses valid servers; the other ones are automatically evicted from load balancing farms under certain conditions, though it is still possible to force their use. It also supports a graceful shutdown, so servers can be taken out of a farm without affecting any connection. Backup servers are automatically used when active servers are down, replacing them so that sessions are not lost when possible. This also allows you to build multiple paths to reach the same server (e.g. multiple interfaces). HAProxy can return a global failed status for a farm when too many servers are down. This, combined with its monitoring capabilities, makes it possible for an upstream component to choose a different LB node for a given service.
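As a minimal sketch of the backup behavior (server names and addresses are assumptions), a standby only receives traffic once all active servers are down:

backend mysql_rw
        mode tcp
        balance leastconn
        server master1 192.168.10.60:3306 check
        server standby1 192.168.10.70:3306 check backup   # used only when master1 is down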

Its stateless design makes it easy to build clusters: by design, HAProxy does its best to ensure the highest service continuity without having to store information that could be lost in the event of a failure. This ensures that a takeover is as seamless as possible. HAProxy also integrates well with the standard VRRP daemon keepalived: it easily tells keepalived about its state and copes very well with floating virtual IP addresses. Take note: prefer IP redundancy protocols (VRRP/CARP) over cluster-based solutions (e.g. Heartbeat), as they offer the fastest, most seamless, and most reliable switchover.
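A minimal keepalived sketch for a floating virtual IP in front of HAProxy might look like this (the interface name, router ID, priority, and VIP are assumptions):

vrrp_script chk_haproxy {
    script "killall -0 haproxy"    # exits 0 while an haproxy process is alive
    interval 2
}

vrrp_instance VI_1 {
    interface eth0
    state MASTER
    virtual_router_id 51
    priority 101
    virtual_ipaddress {
        192.168.10.210
    }
    track_script {
        chk_haproxy
    }
}

On the standby node, the same block would use "state BACKUP" and a lower priority, so the VIP moves when the check script starts failing on the master.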

Load Balancing

HAProxy offers a fairly complete set of load balancing features, most of which are unfortunately not available in a number of other load balancing products. No less than 10 load balancing algorithms are supported, some of which apply to input data to offer an infinite list of possibilities. The most common ones are:

  • round-robin: for short connections, pick each server in turn;
  • leastconn: for long connections, pick the least recently used of the servers with the lowest connection count;
  • source: for SSL farms or terminal server farms, the server directly depends on the client's source address;
  • URI: for HTTP caches, the server directly depends on the HTTP URI;
  • hdr: the server directly depends on the contents of a specific HTTP header field;
  • first: for short-lived virtual machines, all connections are packed on the smallest possible subset of servers so that unused ones can be powered down.

All of these algorithms support per-server weights, so it is possible to accommodate different server generations in a farm, or direct a small fraction of the traffic to specific servers (debug mode, running the next version of the software, etc.). Dynamic weights are supported for round-robin, leastconn, and consistent hashing; this allows server weights to be modified on the fly from the CLI or even by an agent running on the server.

Slow-start is supported whenever a dynamic weight is supported; this allows a server to progressively take the traffic. This is an important feature for fragile application servers which need to compile classes at runtime, as well as for cold caches which need to fill up before being run at full throttle. Hashing can apply to various elements such as the client's source address, URL components, query string elements, header field values, POST parameters, or the RDP cookie. Consistent hashing protects server farms against massive redistribution when adding or removing servers in a farm. That's very important in large cache farms, and it allows slow-start to be used to refill cold caches. A number of internal metrics, such as the number of connections per server or per backend, the amount of available connection slots in a backend, etc., make it possible to build very advanced load balancing strategies.
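As an illustrative sketch of changing a weight on the fly through the runtime API (the socket path is an assumption; the backend and server names match the configuration shown later in this post, and the global section would need a "stats socket /var/run/haproxy.sock level admin" line):

echo "set server haproxy_192.168.10.210_3307_rw_rw/192.168.10.60 weight 50" | \
    socat stdio /var/run/haproxy.sock

Combined with "slowstart 60s" on the server line, a server brought back with a non-zero weight ramps up to its full weight progressively instead of instantly taking its full share of traffic.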

These are the most common ones, but HAProxy supports a number of other basic features as well, such as SSL/TLS, monitoring, stickiness, maps, ACLs and conditions, content switching, and many more. Beyond the basics, it also offers advanced features.

HAProxy Advanced Features

HAProxy is designed to remain extremely stable and safe to manage in a regular production environment. Multiple versions of HAProxy can coexist on the same host, as long as they are configured to use different ports. It is provided as a single executable file which doesn't require any installation process.

Some of the advanced features are its management, system-specific capabilities, scripting, and tracing. Management features allow an application administrator to smoothly stop a server, detect when there's no activity on it anymore, then take it offline, stop it, upgrade it, ensure it doesn't take any traffic while being upgraded, then test it again through the normal path without opening it to the public, and all of this without touching HAProxy at all. System-specific capabilities provide extra features and optimizations when deploying HAProxy, including support for network namespaces (also known as "containers"), allowing HAProxy to act as a gateway between containers.
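On Linux, for instance, binds and outgoing server connections can be pinned to separate network namespaces. A hedged sketch (the namespace names and addresses are assumptions):

frontend tenant_a_front
        bind *:3307 namespace ns_tenant_a    # accept in one namespace
        mode tcp
        default_backend tenant_a_back

backend tenant_a_back
        mode tcp
        server db1 192.168.10.60:3306 namespace ns_backend check    # connect out via another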

Connection Handling & Throttling

Now that we have provided an overview of HAProxy's features, this blog focuses on how to handle and manage the connections going to your HAProxy. Throttling allows you to limit the rate of connections going to your HAProxy before they proceed to the listeners. This means a connection is not forwarded until it satisfies the configured conditions.
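Beyond the maxconn-based limiting demonstrated below, HAProxy can also throttle by connection rate using stick-tables. A minimal sketch (the thresholds and table size are assumptions, not recommendations):

frontend mysql_front
        bind *:3307
        mode tcp
        # track each client IP; reject clients opening more than 20 connections per 10s
        stick-table type ip size 100k expire 30s store conn_rate(10s)
        tcp-request connection track-sc0 src
        tcp-request connection reject if { sc0_conn_rate gt 20 }
        default_backend mysql_back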

Take my example setup, which I launched with ClusterControl on a MySQL Replication cluster with one master and two slave nodes. The generated /etc/haproxy/haproxy.cfg looks like this:

global
...
.........

        #* Performance Tuning
        maxconn 8192
        spread-checks 3
        quiet

defaults
...
        maxconn 8192
        timeout check   3500ms
        timeout queue   3500ms
        timeout connect 3500ms
        timeout client  10800s
        timeout server  10800s

...
....
......

listen  haproxy_192.168.10.210_3307_rw_rw
        bind *:3307
        mode tcp
        timeout client  10800s
        timeout server  10800s
        tcp-check connect port 9200
        tcp-check expect string master\ is\ running
        balance leastconn
        option tcp-check
#       option allbackups
        default-server port 9200 inter 2s downinter 5s rise 3 fall 2 slowstart 60s maxconn 64 maxqueue 128 weight 100
        server 192.168.10.60 192.168.10.60:3306 check
        server 192.168.10.70 192.168.10.70:3306 check
        server testnode50 testnode50:3306 check

listen  haproxy_192.168.10.210_3308_ro
        bind *:3308
        mode tcp
        timeout client  10800s
        timeout server  10800s
        tcp-check connect port 9200
        tcp-check expect string is\ running
        balance leastconn
        option tcp-check
#       option allbackups
        default-server port 9200 inter 2s downinter 5s rise 3 fall 2 slowstart 60s maxconn 64 maxqueue 128 weight 100
        server 192.168.10.60 192.168.10.60:3306 check no-backup
        server 192.168.10.70 192.168.10.70:3306 check
        server testnode50 testnode50:3306 check

In the global section, we have a maxconn value of 8192. maxconn sets the maximum per-process number of concurrent connections; it is equivalent to the command-line argument "-n". Proxies will stop accepting connections when this limit is reached.

In the defaults section, I have another maxconn parameter. This value applies to the proxies defined in the listen sections, commonly called listeners, whenever a listener does not define its own maxconn.

Lastly, we have the listen sections, where the listeners are defined. A listener can also define its own maxconn, the maximum number of connections allowed for that listener; it takes precedence over defaults.maxconn if specified. It also doesn't make sense to set this value higher than global.maxconn, since the global limit caps the total.
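These limits can be verified at runtime through the stats socket (the socket path below is an assumption; this requires a "stats socket" line in the global section):

# "show info" reports the configured limit (Maxconn) and the number of
# currently open connections (CurrConns)
echo "show info" | socat stdio /var/run/haproxy.sock | egrep 'Maxconn|CurrConns'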

Let's run a test. Take the listener for the writer, bound on port 3307, and change maxconn to 199. Then run sysbench, which will try to initiate 200 connections, and see what happens.

/etc/haproxy/haproxy.cfg has the following,

listen  haproxy_192.168.10.210_3307_rw_rw
        bind *:3307
        mode tcp
        timeout client  10800s
        timeout server  10800s
        tcp-check connect port 9200
        tcp-check expect string master\ is\ running
        balance leastconn
        option tcp-check
#       option allbackups
        default-server port 9200 inter 2s downinter 5s rise 3 fall 2 slowstart 60s maxconn 199 maxqueue 128 weight 100
        server 192.168.10.60 192.168.10.60:3306 check backup
        server 192.168.10.70 192.168.10.70:3306 check backup
        server testnode50 testnode50:3306 check

Take note of the maxconn 199 set on the default-server line, which caps each server's concurrent connections. The sysbench run fails as it attempts to connect with a total of 200 connections:

# sysbench /usr/share/sysbench/oltp_read_write.lua --db-driver=mysql --events= --threads=200  --max-requests=0 --time=10 --mysql-host=192.168.10.210 --mysql-user=cmon [email protected] --mysql-port=3307 --tables=10 --report-interval=1 --skip-trx=on --table-size=10000 --rate=20 --db-ps-mode=disable --mysql-ignore-errors=all run
sysbench 1.0.20 (using bundled LuaJIT 2.1.0-beta2)

Running the test with following options:
Number of threads: 200
Target transaction rate: 20/sec
Report intermediate results every 1 second(s)
Initializing random number generator from current time

Initializing worker threads...

FATAL: unable to connect to MySQL server on host '192.168.10.210', port 3307, aborting...
FATAL: error 2013: Lost connection to MySQL server at 'reading initial communication packet', system error: 0
FATAL: `thread_init' function failed: /usr/share/sysbench/oltp_common.lua:349: connection creation failed
FATAL: Thread initialization failed!
Error in my_thread_global_end(): 200 threads didn't exit

This clearly illustrates connection handling and throttling with HAProxy. Setting maxconn to 200 allows sysbench to run successfully:

# sysbench /usr/share/sysbench/oltp_read_write.lua --db-driver=mysql --events= --threads=200  --max-requests=0 --time=10 --mysql-host=192.168.10.210 --mysql-user=cmon [email protected] --mysql-port=3307 --tables=10 --report-interval=1 --skip-trx=on --table-size=10000 --rate=20 --db-ps-mode=disable --mysql-ignore-errors=all run
sysbench 1.0.20 (using bundled LuaJIT 2.1.0-beta2)

Running the test with following options:
Number of threads: 200
Target transaction rate: 20/sec
Report intermediate results every 1 second(s)
Initializing random number generator from current time

Initializing worker threads...

Threads started!

[ 1s ] thds: 200 tps: 5.99 qps: 197.70 (r/w/o: 173.74/23.96/0.00) lat (ms,95%): 646.19 err/s: 0.00 reconn/s: 0.00
[ 1s ] queue length: 0, concurrency: 9
[ 2s ] thds: 200 tps: 17.96 qps: 333.19 (r/w/o: 261.37/71.83/0.00) lat (ms,95%): 682.06 err/s: 0.00 reconn/s: 0.00
[ 2s ] queue length: 0, concurrency: 11
[ 3s ] thds: 200 tps: 15.02 qps: 280.31 (r/w/o: 220.24/60.07/0.00) lat (ms,95%): 657.93 err/s: 0.00 reconn/s: 0.00
[ 3s ] queue length: 0, concurrency: 12
[ 4s ] thds: 200 tps: 22.01 qps: 414.17 (r/w/o: 326.13/88.04/0.00) lat (ms,95%): 657.93 err/s: 0.00 reconn/s: 0.00
[ 4s ] queue length: 0, concurrency: 13
[ 5s ] thds: 200 tps: 19.00 qps: 343.97 (r/w/o: 267.98/75.99/0.00) lat (ms,95%): 657.93 err/s: 0.00 reconn/s: 0.00
[ 5s ] queue length: 0, concurrency: 14
[ 6s ] thds: 200 tps: 21.03 qps: 388.50 (r/w/o: 304.39/84.11/0.00) lat (ms,95%): 657.93 err/s: 0.00 reconn/s: 0.00
[ 6s ] queue length: 0, concurrency: 15
[ 7s ] thds: 200 tps: 22.96 qps: 473.20 (r/w/o: 381.35/91.84/0.00) lat (ms,95%): 682.06 err/s: 0.00 reconn/s: 0.00
[ 7s ] queue length: 0, concurrency: 20
[ 8s ] thds: 200 tps: 24.96 qps: 359.44 (r/w/o: 259.60/99.85/0.00) lat (ms,95%): 719.92 err/s: 0.00 reconn/s: 0.00
[ 8s ] queue length: 0, concurrency: 11
[ 9s ] thds: 200 tps: 16.90 qps: 324.16 (r/w/o: 256.55/67.62/0.00) lat (ms,95%): 694.45 err/s: 0.00 reconn/s: 0.00
[ 9s ] queue length: 0, concurrency: 14
[ 10s ] thds: 200 tps: 26.23 qps: 437.76 (r/w/o: 330.84/105.91/1.01) lat (ms,95%): 657.93 err/s: 0.00 reconn/s: 0.00
[ 10s ] queue length: 0, concurrency: 9

SQL statistics:
    queries performed:
        read:                            2814
        write:                           803
        other:                           1
        total:                           3618
    transactions:                        201    (19.35 per sec.)
    queries:                             3618   (348.35 per sec.)
    ignored errors:                      0      (0.00 per sec.)
    reconnects:                          0      (0.00 per sec.)

General statistics:
    total time:                          10.3837s
    total number of events:              201

Latency (ms):
         min:                                   61.34
         avg:                                  622.70
         max:                                  760.05
         95th percentile:                      694.45
         sum:                               125161.88

Threads fairness:
    events (avg/stddev):           1.0050/0.07
    execution time (avg/stddev):   0.6258/0.09

Conclusion

We hope you found this guide to HAProxy connection handling and throttling helpful. How do you use it? Let us know in the comments below.
