Load Balancing With HAProxy
This tutorial presents best practices for using HAProxy with DolphinDB and offers a load-balancing solution for high-availability clusters handling high concurrency and heavy traffic in production environments.
Overview
HAProxy is open-source software written in C that provides high availability, load balancing, and proxying for TCP- and HTTP-based applications.
This tutorial uses HAProxy 2.6, and we recommend deploying a stable release for load balancing in production.

Environment Setup
Hardware Requirements:
Hardware Resource | Configuration
---|---
Host | HostName
IP | xxx.xxx.xxx.122
Operating System | Linux (kernel version 3.10 or higher)
Memory | 64 GB
CPU | x86_64 (12 CPUs)
Software Requirements:
Software Resource | Version
---|---
DolphinDB Server | 2.00.8
HAProxy | 2.6.2
Docker | 3.0 or higher
For installation instructions, see the HAProxy documentation.
Installation, Deployment and Application
Before deploying HAProxy, it is recommended to set up a high-availability cluster with multiple data nodes. See High-availability Cluster Deployment.
Installation
Host Environment
Before installing HAProxy, make sure you have installed the epel-release, gcc, and systemd-devel dependencies. Execute the following command to install:
yum -y install epel-release gcc systemd-devel
Download and extract the package of the HAProxy 2.6.2 source code:
wget https://www.haproxy.org/download/2.6/src/haproxy-2.6.2.tar.gz && tar zxf haproxy-2.6.2.tar.gz
Compile and install the source code. Replace /app/haproxy and /app/haproxy/bin below with your actual installation paths.
cd haproxy-2.6.2
make clean
make -j 8 TARGET=linux-glibc USE_THREAD=1
make PREFIX=/app/haproxy SBINDIR=/app/haproxy/bin install   # PREFIX and SBINDIR set the installation directories
Modify the system profile to include HAProxy in the system path:
echo 'export PATH=/app/haproxy/bin:$PATH' >> /etc/profile
. /etc/profile
Verify that HAProxy is installed successfully:
which haproxy
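You can additionally print the installed version as a sanity check:
haproxy -v   # Prints the HAProxy version and build information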
Docker Environment
To use HAProxy in a Docker environment, pull the official Alpine-based HAProxy image (this tutorial uses haproxy:2.6.2-alpine):
docker pull haproxy:2.6.2-alpine
User and Group Configuration
Before starting HAProxy, ensure that the user and group are properly configured in the configuration file. For example, to specify haproxy as both the user and group, use the following commands:
sudo groupadd haproxy
sudo useradd -g haproxy haproxy
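To verify the account, you can query it with id (an optional check):
id haproxy   # Shows the uid and gid of the newly created haproxy user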
Cluster Monitoring Configuration
HTTP Mode
Create the haproxy.cfg file on the host and set the following configurations:
global                          # Global configuration
    log 127.0.0.1 local2        # Global syslog server (at most two log servers can be defined)
    maxconn 4000
    user haproxy
    group haproxy
defaults
    mode http                   # Set the working mode to HTTP
    log global                  # Inherit log settings from the global section
    option httplog              # Use the HTTP log format
    option dontlognull          # Do not log empty connections
    option http-server-close
    option forwardfor except 127.0.0.0/8   # Add the X-Forwarded-For header with the client IP, except for local addresses
    option redispatch           # Redispatch requests to other healthy servers if a server becomes unavailable
    retries 3                   # Maximum number of connection attempts to upstream servers
    timeout http-request 10s
    timeout queue 1m            # Maximum queuing time for requests
    timeout connect 10s         # Maximum time to establish a connection to a backend server
    timeout client 1h           # Maximum idle connection time with the client
    timeout server 1h           # Maximum idle connection time with the backend server
    timeout http-keep-alive 10s # Keep-alive session duration with the client
    timeout check 10s
    maxconn 3000                # Maximum number of connections allowed between clients and servers
frontend ddb_frontend
    bind *:8080                 # Port on which the frontend receives requests
    mode http
    log global
    default_backend ddb_backend
backend ddb_backend
    balance roundrobin          # Weighted round-robin: supports runtime weight adjustment and slow start, and distributes requests evenly
    # Check each server every 5 seconds: 2 consecutive successes mark it available; 3 consecutive failures mark it unavailable.
    server node1 xxx.xxx.xxx.1:9302 check inter 5s rise 2 fall 3
    server node2 xxx.xxx.xxx.2:9302 check inter 5s rise 2 fall 3
    server node3 xxx.xxx.xxx.3:9302 check inter 5s rise 2 fall 3
    server node4 xxx.xxx.xxx.4:9302 check inter 5s rise 2 fall 3
listen stats
    mode http
    bind 0.0.0.0:1080           # Port for accessing the monitoring page
    stats enable
    stats hide-version
    stats uri /haproxyadmin     # URI of the monitoring page
    stats realm Haproxy         # Authentication realm shown for the monitoring page
    stats auth admin:admin      # Username and password for the monitoring page (both set to "admin")
    stats admin if TRUE
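Before starting the service, you can validate the file with HAProxy's built-in configuration check (assuming the file is saved at /haproxy/haproxy.cfg, the path used in the startup step below):
haproxy -c -f /haproxy/haproxy.cfg   # -c parses and validates the configuration without starting the proxy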
TCP Mode
To proxy DolphinDB connections at the TCP layer instead, configure haproxy.cfg as follows:
global                          # Global configuration
    log 127.0.0.1 local2        # Global syslog server (at most two log servers can be defined)
    maxconn 4000
    user haproxy
    group haproxy
defaults
    mode tcp                    # Set the working mode to TCP
    log global                  # Inherit log settings from the global section
    option tcplog               # Use the TCP log format
    option dontlognull          # Do not log empty connections
    option redispatch           # Redispatch requests to other healthy servers if a server becomes unavailable
    retries 3                   # Maximum number of connection attempts to upstream servers
    timeout http-request 10s
    timeout queue 1m            # Maximum queuing time for requests
    timeout connect 10s         # Maximum time to establish a connection to a backend server
    timeout client 1h           # Maximum idle connection time with the client
    timeout server 1h           # Maximum idle connection time with the backend server
    timeout http-keep-alive 10s # Keep-alive session duration (applies to the HTTP stats page)
    timeout check 10s
    maxconn 3000                # Maximum number of connections allowed between clients and servers
frontend ddb_frontend
    bind *:8080                 # Port on which the frontend receives requests
    mode tcp
    log global
    default_backend ddb_backend
backend ddb_backend
    balance roundrobin          # Weighted round-robin: supports runtime weight adjustment and slow start, and distributes requests evenly
    # Check each server every 5 seconds: 2 consecutive successes mark it available; 3 consecutive failures mark it unavailable.
    server node1 10.0.0.80:8802 check inter 5s rise 2 fall 3 send-proxy
    server node2 10.0.0.81:8802 check inter 5s rise 2 fall 3 send-proxy
    server node3 10.0.0.82:8802 check inter 5s rise 2 fall 3 send-proxy
listen stats
    mode http
    bind 0.0.0.0:1080           # Port for accessing the monitoring page
    stats enable
    stats hide-version
    stats uri /haproxyadmin     # URI of the monitoring page
    stats realm Haproxy         # Authentication realm shown for the monitoring page
    stats auth admin:admin      # Username and password for the monitoring page (both set to "admin")
    stats admin if TRUE
Note:
- The IPs and ports of the backend servers can be customized to your environment. For further examples, refer to the HAProxy Configuration Manual.
- In TCP mode, the listen stats section must still be configured with mode http.
- check inter 5s rise 2 fall 3 defines the frequency of node health checks and failover behavior: inter 5s checks the node status every 5 seconds; rise 2 marks a node available after two consecutive successful checks; fall 3 marks a node unavailable after three consecutive failed checks, triggering a failover.
- To shorten the detection and failover time for failed nodes, reduce the check interval and the number of checks required for determination. For example, check inter 2s rise 2 fall 2 cuts the worst-case detection time from approximately 15 seconds (3 checks × 5 s) to about 4 seconds (2 checks × 2 s).
Service Startup
To start HAProxy in a host environment, execute the following command. The -f
option specifies the path to the configuration file, which defaults to /etc/haproxy/haproxy.cfg. In this case, the configuration file is located at /haproxy/haproxy.cfg:
haproxy -f /haproxy/haproxy.cfg
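After startup, you can confirm that HAProxy is listening on the expected ports (8080 for the frontend and 1080 for the stats page in this configuration):
sudo ss -ltnp | grep haproxy   # Lists the TCP ports the haproxy process is listening on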
To create an HAProxy container in a Docker environment, execute the following command. Ensure that the monitoring and frontend ports are mapped to the host, and the pre-configured haproxy.cfg
file on the host is mapped to the container:
docker run -itd --name ddb_haproxy -p 8080:8080 -p 1080:1080 -v /haproxy/haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg --privileged=true haproxy:2.6.2-alpine
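To confirm the container started cleanly, check its state and logs (standard Docker commands, using the container name from the run command above):
docker ps --filter name=ddb_haproxy   # The container should be listed as "Up"
docker logs ddb_haproxy               # Shows HAProxy startup messages and any configuration warnings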
Once HAProxy is successfully started, users can access the DolphinDB cluster through the frontend port with client tools such as the DolphinDB VS Code extension or the web interface.
Note: When a DolphinDB client connects to the listening proxy port, HAProxy allocates the connection to one of the backend nodes according to the configured load-balancing algorithm.
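As a quick connectivity check in HTTP mode, an HTTP request sent to the frontend port should be answered by one of the backend data nodes (this sketch assumes the example host IP above and that the nodes serve DolphinDB's web interface on their listening port):
curl -i http://xxx.xxx.xxx.122:8080/   # The response is served by whichever node the request was balanced to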
High Availability and Keepalived
Keepalived is a lightweight high-availability solution that dynamically manages Virtual IP Addresses (VIPs) using the VRRP protocol, enabling automatic failover between primary and backup nodes.
In this deployment, when the primary HAProxy node fails, Keepalived detects the service anomaly and automatically switches the VIP to the backup node, ensuring the continuity and stability of HAProxy services.
Keepalived Deployment


(1) Install Dependencies
Run the following command to install the necessary dependencies:
yum -y install gcc openssl-devel pcre2-devel systemd-devel
(2) Install Keepalived
Download Keepalived 2.3.0 from the official site, then compile and install it:
tar zxf keepalived-2.3.0.tar.gz
cd keepalived-2.3.0
./configure
make
make install
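You can confirm the installation with Keepalived's version flag:
keepalived --version   # Should report Keepalived v2.3.0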
(3) Master/Backup Node Configuration
Edit the Keepalived configuration file on both nodes:
vim /etc/keepalived/keepalived.conf
Configure the primary (MASTER) node and the backup (BACKUP) node as follows:
global_defs {
    router_id ha_1                     # On the backup node, change to ha_2
    vrrp_iptables
    vrrp_garp_interval 0
    vrrp_gna_interval 0
}
vrrp_script chk_haproxy {
    script "/usr/sbin/pidof haproxy"   # Checks whether the haproxy process is running
    interval 2                         # Run the check every 2 seconds
    weight -30                         # Reduce priority by 30 if the check fails
}
vrrp_instance VI_1 {
    state MASTER                       # Change to BACKUP on the backup node
    interface eth1                     # Replace with the actual network interface
    virtual_router_id 51
    priority 100                       # Set to 80 on the backup node
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        10.0.0.100                     # Replace with the actual VIP
    }
    track_script {
        chk_haproxy
    }
}
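The chk_haproxy script above only verifies that an haproxy process exists. You can run the same check manually to confirm the behavior Keepalived will see (exit code 0 means healthy):
/usr/sbin/pidof haproxy; echo "exit code: $?"   # 0 while HAProxy is running, non-zero once it stops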
- Set state MASTER on the primary node and state BACKUP on the backup node.
- Set interface to the network card used by HAProxy.
- The VIP must be within the same subnet as the primary and backup nodes; check the current node IP with ip a. In this example, HAProxy is accessible via 10.0.0.81:8080; the primary node is 10.0.0.80, the backup node is 10.0.0.81, and the VIP is 10.0.0.100.
(4) Start Keepalived
Execute the following commands to start Keepalived and enable it at system startup:
systemctl restart keepalived.service # Restart Keepalived
systemctl enable keepalived.service # Enable auto-start on system boot
systemctl status keepalived.service # Check the service status
After Keepalived starts, the primary node binds the VIP, and users can access the HAProxy service via the VIP (e.g., 10.0.0.100:8080). If the primary node's HAProxy service becomes unavailable, Keepalived promotes the backup node to primary and rebinds the VIP there, ensuring continuous availability. To identify the current primary node, check the service status or inspect whether the network interface is bound to the VIP:
ip addr show eth1
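To exercise a failover manually, you can stop HAProxy on the primary node and watch the VIP move to the backup node (a test sketch based on the example addresses and interface above):
# On the primary node: stop HAProxy so the chk_haproxy check starts failing
kill $(pidof haproxy)
# On the backup node: within a few seconds the VIP should be bound here
ip addr show eth1 | grep 10.0.0.100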

Operation and Maintenance
HAProxy Stats Page
To view the HAProxy stats page, enter the host IP, listening port, and configured URI (e.g., xxx.xxx.xxx.122:1080/haproxyadmin) in a browser on any machine that can reach the HAProxy host.
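The stats page can also be fetched from the command line, which is useful for scripted monitoring; appending ;csv to the stats URI returns the per-server status in CSV form (credentials match the stats auth setting above):
curl -u admin:admin "http://xxx.xxx.xxx.122:1080/haproxyadmin;csv"   # Per-server status in CSV form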

Restart or Terminate HAProxy
You need to terminate or restart HAProxy for the configuration changes to take effect.
Run the following command to find the PID of a running HAProxy process on host:
ps -ef | grep haproxy
Then run the kill command to terminate the process:
kill -9 ${haproxy_pid}
To restart HAProxy, run the haproxy -f command again.
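HAProxy also supports a hitless reload: the -sf option starts a new process with the updated configuration and tells the old process to exit once its existing connections finish:
haproxy -f /haproxy/haproxy.cfg -sf $(pidof haproxy)   # New process takes over; the old one drains and exits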
In Docker environment, you can use the following command to restart the service:
docker restart ddb_haproxy
To terminate and delete the container, you can run the following command:
docker stop ddb_haproxy && docker rm ddb_haproxy