Getting Started with HA Proxy [High Availability]
February 17, 2013
HAProxy is a very fast, very awesome and very free TCP/HTTP load balancer and failover program. It's very simple but also quite powerful. The configuration is flexible enough to fit into many high-traffic infrastructures, yet simple enough to fit into any design seeking basic high availability. It can support tens of thousands of connections at high speeds and can even provide a private-to-public address “translation” of sorts by hiding web servers from the internet.
One of the most popular scenarios for HAProxy is load balancing in its simplest form. A web server struggling to keep up with traffic demands would benefit from putting HAProxy in front of it, adding a second server and balancing between the two. With even balancing, each server handles roughly half of the requests (or less, if you add more servers), which should yield a much more stable web site.
The setup that we will be working with consists of one HAProxy box and two web servers, all on a single LAN. In the examples below, the HAProxy box is 192.168.1.4 and the web servers are 192.168.1.5 and 192.168.1.6.
All configuration will be done on the HAProxy machine, and it is assumed that the web servers are already configured and serving pages correctly. The HAProxy box I am using is running CentOS 5. Let's get started!
If you are using CentOS, you will need to enable the EPEL repo and then simply install it via yum:
yum install haproxy
Debian should have this package in the default repo:
sudo apt-get install haproxy
If you are like me and love installing things from source, first make sure you have gcc installed. On CentOS, I just like installing all of the development tools. If you are low on space or resources you may not want to install everything, but it's just easier in my opinion:
yum groupinstall "Development Tools"
Then download and compile the program:
cd /usr/local/src/
wget http://haproxy.1wt.eu/download/1.4/src/haproxy-1.4.22.tar.gz
tar xzf haproxy-1.4.22.tar.gz
cd haproxy-1.4.22
make TARGET=linux26 #Change TARGET accordingly
cp haproxy /usr/sbin/
In the make command above, make sure that you change TARGET to match your Linux kernel version. If you are unsure, run the following and note the first two numbers:
uname -r
You can also append “ARCH=i386” to build the 32-bit binary if you are running on a 64-bit system. HAProxy builds on quite a few platforms; you have the following options when it comes to the TARGET variable:
- linux22 for Linux 2.2
- linux24 for Linux 2.4 and above (default)
- linux24e for Linux 2.4 with support for a working epoll (> 0.21)
- linux26 for Linux 2.6 and above
- linux2628 for Linux 2.6.28 and above (enables splice and tproxy)
- solaris for Solaris 8 or 10 (others untested)
- freebsd for FreeBSD 5 to 8.0 (others untested)
- openbsd for OpenBSD 3.1 to 4.6 (others untested)
- aix52 for AIX 5.2
- cygwin for Cygwin
- generic for any other OS.
- custom to manually adjust every setting
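If you would rather not pick TARGET by eye, the mapping from kernel version to the Linux targets in the list above can be sketched in a few lines of Python. This is only an illustration: pick_target is a hypothetical helper of my own, not something that ships with HAProxy, and it assumes you are on Linux and only care about the Linux entries.

```python
import platform
import re


def pick_target(release: str) -> str:
    """Map a kernel release string (as printed by `uname -r`) to a TARGET value.

    Hypothetical helper: it only knows the Linux targets from the list above
    and falls back to "generic" for anything it cannot parse.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if not match:
        return "generic"
    version = (
        int(match.group(1)),
        int(match.group(2)),
        int(match.group(3) or 0),
    )
    if version >= (2, 6, 28):
        return "linux2628"
    if version >= (2, 6, 0):
        return "linux26"
    if version >= (2, 4, 0):
        return "linux24"
    if version >= (2, 2, 0):
        return "linux22"
    return "generic"


if __name__ == "__main__":
    # On Linux, platform.release() returns the same string as `uname -r`.
    if platform.system() == "Linux":
        print(pick_target(platform.release()))
    else:
        print("generic")
```

A CentOS 5 box reports a 2.6.18 kernel, which is why the build command above uses linux26.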
You can launch HAProxy by running the following command:
haproxy -f /etc/haproxy/haproxy.cfg -sf `pgrep haproxy`
This can be run multiple times without fear of spawning overlapping processes. The “-sf `pgrep haproxy`” part hands the PIDs of any running HAProxy processes to the new one, which tells them to finish up and exit once it has started. The new process uses /etc/haproxy/haproxy.cfg as its config, which we will configure in the next section.
You can also create an init script from the file included:
cp examples/haproxy.init /etc/init.d/haproxy
chmod a+x /etc/init.d/haproxy
Note that this will not run until /etc/haproxy/haproxy.cfg is configured, so let's do that!
Build a Basic Config
We are going to assume that your config is held in /etc/haproxy/haproxy.cfg. Please note that if you installed HAProxy from a prebuilt package such as a deb or RPM, the location may be different. If you built from source, you first need to create the /etc/haproxy folder and then create a file called haproxy.cfg inside it. Open the file with your favorite editor and paste the following:
defaults
    log global
    mode http
    option httplog
    option dontlognull
    retries 3
    option redispatch
    maxconn 20000
    contimeout 5000
    clitimeout 50000
    srvtimeout 50000

listen webservers 192.168.1.4:80
    mode http
    stats enable #Optional: Enables stats interface
    stats uri /haproxy?stats #Optional: Statistics URL
    stats realm Haproxy\ Statistics
    stats auth haproxy:stats #Optional: username:password of site
    balance roundrobin #balance type
    cookie webserver insert indirect nocache
    option httpclose
    option forwardfor #Add X-Forwarded-For in the header
    server server1 192.168.1.5:80 cookie server1 weight 1 check
    server server2 192.168.1.6:80 cookie server2 weight 1 check
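Modify this if needed and save it to /etc/haproxy/haproxy.cfg. (The contimeout, clitimeout and srvtimeout keywords work on the 1.4 series used here, but they are deprecated in favor of timeout connect, timeout client and timeout server.) Once you’re finished, we can check the syntax of this config file by running the following: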
haproxy -f /etc/haproxy/haproxy.cfg -c
If it's good, you should see “Configuration file is valid”. Go ahead and start everything up using the launch command or the init script from the previous section!
Now visit http://192.168.1.4 in your web browser and you should see the page from one of your two web servers. Pretty cool, right?
I added a few comments to some of the lines within the config to help explain what they do. Here is a little more info on some of those directives.
- stats enable - Simply enables the statistics web interface
- stats uri /haproxy?stats - This defines the URL where you can view the stats, e.g. http://192.168.1.4/haproxy?stats
- stats realm Haproxy\ Statistics - The realm text shown in the browser's authentication prompt
- stats auth haproxy:stats - The username and password for the URL above. If this is removed, the stats page will not require authentication
balance roundrobin - The method used to balance between the two servers. Here is a list of the more popular options; the full list can be found in the documentation.
- roundrobin - Each server is used in turns, according to their weights. This is the smoothest and fairest algorithm when the servers' processing time remains equally distributed.
- source - The source IP address is hashed and divided by the total weight of the running servers to designate which server will receive the request. This ensures that the same client IP address will always reach the same server as long as no server goes down or up. If the hash result changes due to the number of running servers changing, many clients will be directed to a different server.
- leastconn - The server with the lowest number of connections receives the connection. Round-robin is performed within groups of servers of the same load to ensure that all servers will be used. Use of this algorithm is recommended where very long sessions are expected, such as LDAP, SQL, TSE, etc., but it is not very well suited for protocols using short sessions such as HTTP.
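To make the first two algorithms a bit more concrete, here is a rough Python sketch of the ideas behind roundrobin and source. It is not HAProxy's actual code: the round-robin part uses the well-known "smooth" weighted round-robin scheme as a stand-in, the hashing part uses CRC32 simply because it ships with Python, and both function names are my own.

```python
import zlib
from collections import Counter


def weighted_round_robin(weights, count):
    """Return `count` picks, spread in proportion to each server's weight.

    Illustrative "smooth" weighted round-robin; HAProxy's internals differ.
    """
    current = {name: 0 for name in weights}
    total = sum(weights.values())
    picks = []
    for _ in range(count):
        # Every server earns credit equal to its weight on each turn.
        for name, weight in weights.items():
            current[name] += weight
        # The server with the most credit is picked and pays the total back.
        chosen = max(current, key=current.get)
        current[chosen] -= total
        picks.append(chosen)
    return picks


def pick_by_source(client_ip, weights):
    """Hash the client IP and map it onto the running servers by weight."""
    total = sum(weights.values())
    if total == 0:
        raise ValueError("no running servers")
    slot = zlib.crc32(client_ip.encode("ascii")) % total
    for name, weight in weights.items():
        if slot < weight:
            return name
        slot -= weight


if __name__ == "__main__":
    servers = {"server1": 1, "server2": 1}
    print(weighted_round_robin(servers, 4))
    print(Counter(weighted_round_robin({"web1": 30, "web2": 75}, 105)))
    # The same IP keeps landing on the same server while the set is unchanged.
    print(pick_by_source("203.0.113.7", servers))
```

With equal weights the picks simply alternate between the two servers. With weights of 30 and 75 the second server ends up with two and a half times as many requests as the first.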
server server1 192.168.1.5:80 cookie server1 weight 1 check - There are a lot of options you can use here. Let's break some down:
- cookie [value] - Sets the value of the persistence cookie that identifies this server to the client. In this case the value is server1; the cookie itself is named webserver by the “cookie webserver insert indirect nocache” line.
- weight [int] - This, to put it simply, puts a specific value on the server's power/importance. In our scenario both are valued at 1, meaning that both are taken as equally powerful and thus load should be equally distributed. The default is 1, the max is 256. The higher the number, the larger the share of the load the server receives relative to the other servers.
- backup - If this is added to the server line, the server is not used unless all non-backup servers go down. This is good if you want to only use it as a failover.
- check - This enables health checks on the server. If this is not added, a server will always be considered available even if it is actually down.
- disabled - When added, the server is marked as down in maintenance mode.
- fall [int] - The number of consecutive failed health checks before the server is considered to be down. Default is 3.
- rise [int] - The number of consecutive successful health checks before the server is considered to be back up. Default is 2.
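The way rise and fall interact is easiest to see as a tiny state machine: both are counts of consecutive health check results, not amounts of time. The Python below is a sketch of that idea using the default values; ServerHealth is a made-up class for illustration and not how HAProxy implements it internally.

```python
class ServerHealth:
    """Track up/down state from a stream of health check results."""

    def __init__(self, rise=2, fall=3):
        self.rise = rise
        self.fall = fall
        self.up = True
        # Consecutive results that disagree with the current state.
        self.streak = 0

    def record(self, check_ok):
        """Feed in one check result and return whether the server is up."""
        if check_ok == self.up:
            # The result agrees with the current state, so the streak resets.
            self.streak = 0
        else:
            self.streak += 1
            threshold = self.fall if self.up else self.rise
            if self.streak >= threshold:
                self.up = not self.up
                self.streak = 0
        return self.up


if __name__ == "__main__":
    health = ServerHealth()
    for result in [False, False, False, True, True]:
        state = "up" if health.record(result) else "down"
        print("check ok" if result else "check failed", "->", state)
```

A single failed check does nothing on its own. It takes three in a row to mark the server down, and then two good checks in a row to bring it back.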
There are many more options and more explanations which can be found in the documentation.
About that documentation: you can find it on the HAProxy website. There are far too many options to detail here, this being a beginner's guide, but here are a couple more directives that are pretty cool and should help you out!
- monitor-uri /haproxy - When this is enabled, visiting http://192.168.1.4/haproxy simply returns a 200 success response if the HAProxy service is working.
- option httpchk /check.html - Essentially this makes the “check” option much more powerful. Instead of only testing that a connection can be opened, HAProxy requests /check.html from each backend server. If the response is anything other than a success status (2xx or 3xx), it marks the server as down. You can take this a step further and create a PHP page that checks Apache, MySQL, server load, etc., and sends an error 500 if any one of those is unhealthy. This will take the server out of rotation until you can correct it.
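The health page idea is not tied to PHP. As a sketch, here is the same logic in Python using only the standard library: a few checks are collected, and the response is 200 only if all of them pass. The ports (80 for Apache, 3306 for MySQL), the load threshold and all of the names here are assumptions for illustration; you would serve CheckHandler with http.server.HTTPServer on each web server and point the health check at it.

```python
import os
import socket
from http.server import BaseHTTPRequestHandler


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def health_status(checks):
    """Return 200 when every named check passed, otherwise 500."""
    return 200 if all(checks.values()) else 500


def run_checks(max_load=8.0):
    """Collect the checks. Ports and load threshold are assumed values."""
    return {
        "apache": port_open("127.0.0.1", 80),
        "mysql": port_open("127.0.0.1", 3306),
        "load": os.getloadavg()[0] < max_load,
    }


class CheckHandler(BaseHTTPRequestHandler):
    """Answer GET and HEAD with 200 or 500 depending on the checks."""

    def _respond(self):
        self.send_response(health_status(run_checks()))
        self.end_headers()

    do_GET = _respond
    do_HEAD = _respond
```

Keeping the status decision in its own small function makes it easy to add another check later without touching the HTTP side.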
ACLs and Frontends
I wanted to wrap this up with another sample configuration. This one incorporates one of my favorite features: frontends and ACLs. Take a look at this:
global
    daemon
    #maxconn 30000
    #ulimit-n 655360
    #chroot /home/haproxy
    #uid 500
    #gid 500
    #nbproc 1
    pidfile /var/run/haproxy.pid
    log 127.0.0.1 local4 notice

defaults
    mode http
    clitimeout 600000 # maximum inactivity time on the client side
    srvtimeout 600000 # maximum inactivity time on the server side
    timeout connect 8000 # maximum time to wait for a connection attempt to a server to succeed
    stats enable
    stats auth admin:password
    stats uri /admin?stats
    stats refresh 5s
    stats hide-version
    stats realm Loadbalancer\ stats
    #option httpchk GET /status
    retries 5
    option redispatch
    option forwardfor
    option httpclose
    monitor-uri /test
    balance roundrobin # each server is used in turns, according to assigned weight
    default_backend default

frontend http
    bind :80
    acl nagios path_reg ^/nagios/?
    acl app path_reg ^/app/?
    acl mysite path_reg ^/blog/?
    use_backend nagios if nagios
    use_backend app if app
    use_backend mysite if mysite

backend default
    server web1 172.16.1.201:80 cookie A check inter 1000 rise 2 fall 5 maxqueue 50 weight 30
    server web2 172.16.1.202:80 cookie B check inter 1000 rise 2 fall 5 maxqueue 50 weight 75
    stats admin if TRUE
    option httpchk HEAD /check.txt HTTP/1.0
    errorfile 503 /var/www/html/503.html

backend nagios
    server nagios 172.16.1.217:80 check inter 1000 rise 2 fall 5 maxqueue 50 weight 30
    errorfile 503 /var/www/html/503.html

backend app
    server clear 172.16.0.39:81 check inter 1000 rise 2 fall 5 maxqueue 50 weight 30

backend mysite
    server mysite 172.16.1.203:80 check inter 1000 rise 2 fall 5 maxqueue 50 weight 30
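If the routing in the frontend section is hard to picture, this Python sketch mimics it: the ACL patterns are tried in the same order as the use_backend lines, the first match wins, and anything that matches nothing goes to the default backend. It only imitates the behaviour with Python's re module; choose_backend is a made-up name and HAProxy evaluates the rules itself.

```python
import re

# (backend, pattern) pairs in the same order as the use_backend lines.
RULES = [
    ("nagios", re.compile(r"^/nagios/?")),
    ("app", re.compile(r"^/app/?")),
    ("mysite", re.compile(r"^/blog/?")),
]
DEFAULT_BACKEND = "default"


def choose_backend(path):
    """Return the backend that the first matching ACL would select."""
    for backend, pattern in RULES:
        if pattern.search(path):
            return backend
    return DEFAULT_BACKEND


if __name__ == "__main__":
    for path in ["/nagios/", "/blog/2013/post", "/", "/application"]:
        print(path, "->", choose_backend(path))
```

One thing the sketch makes obvious: because the trailing slash in each pattern is optional, ^/app/? also matches a path like /application. If that is not what you want, a stricter pattern such as ^/app(/|$) avoids it.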
HAProxy’s config is really easy to use and it has a lot of value in a web stack. I hope this has given you at least the basics of HAProxy and that you have found some value in it. Enjoy!