Transparent mode with HAProxy allows your backend servers to see the IP address of the client's computer while still having a high-availability service using HAProxy.
P.S. If you are only using HTTP, a much easier alternative is to insert the client's IP address in the X-Forwarded-For header. Follow the instructions here to achieve this:
How to implement x-forwarded-for with Microsoft IIS <-> How to implement x-forwarded-for with Apache
This post shows how to set up transparent mode on a fresh, minimal installation of CentOS 6.3 64-bit.
This guide assumes that you have a public-facing IP address of 192.168.10.50 (I know that's not a real public address) and are using an internal network address space of 10.10.10.0/24, with our two web servers on 10.10.10.10 and 10.10.10.15. The load balancer will therefore have two network interfaces: eth0 will be set to our real-world IP of 192.168.10.50 and eth1 will be set to 10.10.10.1.
After installing the basic CentOS 6.3 64-bit OS, it may be worth running 'yum update' first to ensure that the system is fully up to date.
As this is a minimum installation you will also need to install a few other packages. These can be installed with the following command:
yum install make wget gcc pcre-static pcre-devel
I'm using the HAProxy 1.5 dev7 build for this example but at the time of writing dev12 is the latest available build and I'll assume that the following will also work with that Development Release. However, to get all the features that we require we will need to build HAProxy from source and not from the package repository. The following steps enable us to do just that:
wget http://haproxy.1wt.eu/download/1.5/src/devel/haproxy-1.5-dev7.tar.gz
tar -zxf haproxy-1.5-dev7.tar.gz
cd haproxy-1.5-dev7
make TARGET=linux26 USE_STATIC_PCRE=1 USE_LINUX_TPROXY=1
cp haproxy /usr/bin/haproxy
cp examples/haproxy.cfg /etc/haproxy.cfg
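Before going further, it can be worth confirming that the binary really was built with transparent proxy support. The sketch below assumes that 'haproxy -vv' lists the options the binary was compiled with; if your build does not print them, the check will simply fail. has_tproxy is a hypothetical helper name, not part of HAProxy.

```shell
# Hypothetical helper: succeeds if the given haproxy binary reports
# USE_LINUX_TPROXY among its build options.
# Assumption: `haproxy -vv` prints the compile-time options.
has_tproxy() {
    "${1:-/usr/bin/haproxy}" -vv 2>&1 | grep -q 'USE_LINUX_TPROXY'
}

# Usage:
#   has_tproxy /usr/bin/haproxy && echo "TPROXY support compiled in"
```

If the check fails, the most likely cause is that USE_LINUX_TPROXY=1 was left off the make command line.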
The installation is now complete. However, we have only an example configuration file installed at '/etc/haproxy.cfg'. This is the file that will store all of the settings we need to keep our website available to the maximum number of visitors, so we now need to edit it. I'm going to use 'vim', but if you are more familiar with 'nano', 'ee' or another editor, please use that.
Have a quick look through the file if you wish to see its basic structure. We are going to create a VERY basic config to start with, just to make sure that our installation is working.
global
    daemon
    log /dev/log local4
    maxconn 40000
    ulimit-n 81000

defaults
    log global
    contimeout 4000
    clitimeout 42000
    srvtimeout 43000

listen http1
    bind 192.168.10.50:80
    mode http
    balance roundrobin
    server http1_1 10.10.10.10:80 cookie http1_1 check inter 2000 rise 2 fall 3
    server http1_2 10.10.10.15:80 cookie http1_2 check inter 2000 rise 2 fall 3
Save the above configuration file, then start the HAProxy service with the following command (the binary was copied to /usr/bin during the build above):
/usr/bin/haproxy -f /etc/haproxy.cfg
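Before starting the service, and after every later edit to the file, it is worth asking HAProxy to parse the configuration without actually starting. The -c flag does this. check_cfg below is just a hypothetical wrapper around it, and the HAPROXY_BIN variable is my own naming rather than anything HAProxy reads.

```shell
# Hypothetical wrapper: parse a config file without starting the proxy.
# -c = check mode only, -f = configuration file to read.
check_cfg() {
    "${HAPROXY_BIN:-/usr/bin/haproxy}" -c -f "${1:-/etc/haproxy.cfg}"
}

# Usage:
#   check_cfg /etc/haproxy.cfg && /usr/bin/haproxy -f /etc/haproxy.cfg
```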
If everything starts correctly you should be able to browse to your real IP address from a different computer and see your default page. As mine are just two Debian web servers, I get the following:
If you see the above image, or the default page for your servers, congratulations: your two web servers are now in high-availability mode. If you do not see your default page, stop HAProxy with a 'killall haproxy' command and run '/usr/bin/haproxy -d -f /etc/haproxy.cfg'. This will restart HAProxy with debugging output displayed on the console. To stop the debug output and the HAProxy service, simply press Ctrl+C.
Now that basic high availability is working, let's move on to transparent mode.
So, with the HAProxy service stopped, open your /etc/haproxy.cfg file again with your editor of choice and add the following to the 'listen http1' section:
    option http-server-close
    option forwardfor
    source 0.0.0.0 usesrc clientip
This forces HAProxy to use TPROXY mode in the kernel. We also need to ensure that we have the correct network architecture for the TPROXY trick to work. In normal mode, HAProxy lets you have real servers anywhere on the internet, because the source address always points back at the HAProxy unit's IP address. However, if the client's source IP address is going to be used transparently, then the HAProxy server MUST BE IN THE PATH of the return traffic.
The easiest way to do this is to put the backend servers in a different subnet to the front-end clients and make sure that their default gateway points back at the HAProxy load balancer (10.10.10.1 in this example).
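On each backend server, checking and fixing the gateway might look like the sketch below. It assumes the iproute2 'ip' tool is available on the web servers and that 10.10.10.1 (eth1 on the load balancer in this guide) is the gateway you want. default_via is a hypothetical helper name.

```shell
# Hypothetical helper: reads `ip route show default` output on stdin and
# succeeds only if the default route goes via the given gateway.
default_via() {
    awk -v gw="$1" '$1 == "default" && $2 == "via" && $3 == gw { found = 1 }
                    END { exit !found }'
}

# Usage on each backend server (as root):
#   ip route show default | default_via 10.10.10.1 \
#       || ip route replace default via 10.10.10.1
```

Note that 'ip route replace' only changes the running system. To keep the gateway across reboots, set it in the distribution's own network configuration as well.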
NB. With clever routing this should be possible on the same subnet but I haven't tried that yet!
How does the magic of transparent HAProxy work?
I have no idea - but to allow the magic to happen you will now need to edit your iptables rules. I have this as my 'iptables-rules.sh' file:
iptables -t mangle -N DIVERT
iptables -t mangle -A PREROUTING -p tcp -m socket -j DIVERT
iptables -t mangle -A DIVERT -j MARK --set-mark 111
iptables -t mangle -A DIVERT -j ACCEPT
ip rule add fwmark 111 lookup 100
ip route add local 0.0.0.0/0 dev lo table 100
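A rough reading of these rules, offered as a sketch rather than a definitive explanation: the 'socket' match picks out TCP packets that belong to a connection already open on the load balancer, the DIVERT chain marks them with 111, the 'ip rule' sends marked packets to routing table 100, and the single route in that table treats every address as local, so the packets are delivered to HAProxy rather than forwarded. The helper below only checks that the policy routing half is in place. tproxy_routing_ok is a hypothetical name, and the exact text printed by 'ip' (the mark shown in hex as 0x6f, the route shown as 'local default dev lo') is an assumption about iproute2's output format.

```shell
# Hypothetical helper: succeeds if the fwmark rule and the local route
# created by iptables-rules.sh are both present.
# Assumption: iproute2 prints mark 111 as 0x6f and 0.0.0.0/0 as "default".
tproxy_routing_ok() {
    ip rule show | grep -q 'fwmark 0x6f lookup 100' &&
        ip route show table 100 | grep -q '^local default dev lo'
}

# Usage (as root, after running iptables-rules.sh):
#   tproxy_routing_ok && echo "policy routing for mark 111 is in place"
#   iptables -t mangle -L -n -v    # lists the DIVERT chain with packet counters
```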
If you now run this file, start HAProxy with your modified configuration and retest your web server on the real IP address, you should see in the HTTP access logs that the address your site was visited from is the client's own and not that of the load balancer.
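One quick way to check from a backend server is to list the distinct client addresses in the access log. This sketch assumes a log format whose first field is the client address (as in Apache's common and combined formats) and a Debian-style path of /var/log/apache2/access.log. Both are assumptions about your web servers, and client_ips is a hypothetical helper.

```shell
# Hypothetical helper: print each distinct client address found in the
# first field of an access log.
# Assumption: default path is the Debian Apache location.
client_ips() {
    awk '{ print $1 }' "${1:-/var/log/apache2/access.log}" | sort -u
}

# Usage on a backend server:
#   client_ips /var/log/apache2/access.log
```

If 10.10.10.1 (the load balancer's internal address) is the only address listed after your test visits, requests are probably still arriving non-transparently.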