Load balancing Microsoft Office Communications Server (OCS) with HAProxy


Here at Loadbalancer.org we have recently started the certification process of our product with Microsoft Office Communications Server (OCS). We already have several customers doing this with our units in Direct Routing mode but with the new Loadbalancer.org - ENTERPRISE v6.8 you can do it with the Microsoft recommended SNAT mode. So how can you do this yourself for free with the open source load balancer HAProxy? Read on...

Right, a couple of points first:

  1. I'm assuming that you have installed at least version 1.4.1 of HAProxy; there are plenty of blogs around showing you how to do that...
  2. You have already installed at least a pair of Microsoft OCS servers and know roughly what you want to achieve

The following ports need to be load balanced:
(source http://technet.microsoft.com/en-us/library/dd572362(office.13).aspx)

Ports Required

5060 : SIP communication over TCP.
5061 : SIP communication over TLS.
135 : To move users from a pool and other remote DCOM-based operations.
443 : HTTPS traffic to the pool URLs.
444 : Communication between the focus (Office Communications Server 2007 R2 component that manages conference state) and the conferencing servers.
5065 : SIP listening requests for Application Sharing.
5069 : Monitoring Server.
5071 : SIP listening requests for Response Group Service.
5072 : SIP listening requests for Conferencing Attendant.
5073 : SIP listening requests for Conferencing Announcement Server.
5074 : SIP listening requests for Outside Voice Control.
8404 : TLS (remoting over MTLS) listening for inter-server communications for Response Group Service.
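That's a lot of ports to type out by hand. As a sketch, a small shell loop can generate the `bind` lines for the config below (the VIP address 10.10.2.20 is just the example address used throughout this post):

```shell
# Generate one "bind" line per OCS port for the HAProxy frontend.
# VIP is the example virtual IP used in this post; change it for your network.
VIP=10.10.2.20
PORTS="5060 5061 135 443 444 5065 5069 5071 5072 5073 5074 8404"
for p in $PORTS; do
    echo "bind ${VIP}:${p}"
done
```

Paste the output into the `listen` section (HAProxy also accepts several comma-separated addresses per `bind` line, which is the style used below).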

So let's jump straight to the configuration file (I'll explain the important bits in a minute):


# HAProxy configuration file generated by load balancer appliance
global
    #uid 99
    #gid 99
    daemon
    stats socket /var/run/haproxy.stat mode 600 level admin
    log /dev/log local4
    maxconn 40000
    ulimit-n 81000
    pidfile /var/run/haproxy.pid

defaults
    log global
    mode http
    contimeout 4000
    clitimeout 1800000
    srvtimeout 1800000
    balance roundrobin

listen OCS_ALL_SERVICES 10.10.2.20:5061
    bind 10.10.2.20:5060,10.10.2.20:5065
    bind 10.10.2.20:5071,10.10.2.20:5072
    bind 10.10.2.20:5073,10.10.2.20:5074
    bind 10.10.2.20:8404,10.10.2.20:444
    bind 10.10.2.20:443,10.10.2.20:135
    bind 10.10.2.20:5069
    mode tcp
    option persist
    balance leastconn
    stick-table type ip size 10240k expire 30m
    stick on src
    server OCS_Node1 10.10.2.4 weight 1 check port 5061 inter 2000 rise 2 fall 3
    server OCS_Node2 10.10.2.5 weight 1 check port 5061 inter 2000 rise 2 fall 3
    server backup 127.0.0.1:9081 backup
    option redispatch
    option abortonclose
    maxconn 40000
    log global

listen stats :7777
    stats enable
    stats uri /
    option httpclose
    stats auth loadbalancer:loadbalancer
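With this many ports it's easy to miss one, so here's a quick sanity-check sketch: grep the config for every port on the TechNet list. The heredoc below is just a throwaway copy of the `listen` section so the example is self-contained; on a real box point `CFG` at your actual config file (e.g. /etc/haproxy/haproxy.cfg) instead.

```shell
# Check that every required OCS port is bound somewhere in the config.
CFG=$(mktemp)
cat > "$CFG" <<'EOF'
listen OCS_ALL_SERVICES 10.10.2.20:5061
bind 10.10.2.20:5060,10.10.2.20:5065
bind 10.10.2.20:5071,10.10.2.20:5072
bind 10.10.2.20:5073,10.10.2.20:5074
bind 10.10.2.20:8404,10.10.2.20:444
bind 10.10.2.20:443,10.10.2.20:135
bind 10.10.2.20:5069
EOF
missing=""
for p in 5060 5061 135 443 444 5065 5069 5071 5072 5073 5074 8404; do
    # match ":PORT" followed by a non-digit or end of line, so 443 doesn't match 4430
    grep -q ":$p[^0-9]" "$CFG" || grep -q ":$p\$" "$CFG" || missing="$missing $p"
done
echo "missing ports:${missing:- none}"
rm -f "$CFG"
```

You can also run `haproxy -c -f /etc/haproxy/haproxy.cfg` to have HAProxy itself validate the syntax before reloading.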

The important bits are:

The new source IP stick-table functionality (entries expire after 30 minutes of inactivity):

stick-table type ip size 10240k expire 30m
stick on src

Listen on every required port on the same frontend:

listen	OCS_ALL_SERVICES 10.10.2.20:5061
bind 10.10.2.20:5060,10.10.2.20:5065

Send traffic to your backend servers but don't specify a destination port, so HAProxy passes each connection through on its original destination port:

server OCS_Node1 10.10.2.4 weight 1 check port 5061
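For contrast, here's a sketch of what not to do in this setup: adding a port to the server address makes HAProxy rewrite the destination port, which breaks a multi-port frontend like this one.

```
# Wrong here: forces every connection to port 5061, whichever
# frontend port the client actually connected to
server OCS_Node1 10.10.2.4:5061 weight 1 check inter 2000 rise 2 fall 3

# Right: no port on the server address, so the client's original
# destination port is preserved (health checks still probe 5061)
server OCS_Node1 10.10.2.4 weight 1 check port 5061 inter 2000 rise 2 fall 3
```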

Set a 30-minute timeout for long-lived TCP connections (required for SIP):

clitimeout	1800000
srvtimeout	1800000
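These timeouts are in milliseconds, so it's worth double-checking the arithmetic:

```shell
# 1,800,000 ms -> minutes: matches the 30m stick-table expiry above
echo $(( 1800000 / 1000 / 60 ))   # prints 30
```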

And try to balance fairly evenly even though we have 30 minutes of persistence set:

balance leastconn

Simple, isn't it?

Any comments welcome. For instance: can this be split down into more than one cluster?