High Availability

Loadbalancer.org guarantee 99.999% (5 nines) uptime to all of our customers

Some vendors make a fair bit of hype about their products enabling 99.999% availability out of the box.

12 Mar 2009. Updated 09 Dec 2025.

Some lessons are learned the hard way, after all...

None of us are immune to downtime

Our web server crashed again the other day (it last happened about 2 years ago). I was on holiday at the time and got an automated message saying "www.loadbalancer.org is toast!".

I thought, OK, that's annoying but not the end of the world, as it was a Sunday afternoon. However, about an hour later I got a message from one of our support guys saying that they could not get through to the 24/7 support engineers to look into the server failure.

That's when I remembered that, the last time this happened, I had thought about setting up a mirrored dedicated server to save downtime if a rebuild was ever required... oops, didn't do that, did I?

Anyway, we didn't get the web site back up until 11am on Monday (how sloppy is that for a load balancer vendor?). While the site was down, one of the support engineers ordered a new dedicated server from another hosting provider and almost had the new one ready by the time the original was back up.

Lessons learned

So, to cut a long story short, we now have two dedicated servers, and the data on the master is replicated to the slave with rsync. We toyed with the idea of putting the servers in a DNS round robin configuration (i.e. load balanced), but then we thought: why not just replicate once a day and, if we have a hardware failure, manually change the DNS?
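A once-a-day replication job of this kind could look roughly like the sketch below. It is only an illustration, not the actual script used: the directory, the standby host name and the function names are all hypothetical, and it assumes rsync over SSH with key-based login already working between the two servers.

```python
import subprocess

# Hypothetical values for illustration only.
SOURCE_DIR = "/var/www/"  # trailing slash: copy the directory contents
STANDBY = "backup@standby.example.com:/var/www/"


def build_rsync_command(source, destination, dry_run=False):
    """Return the rsync argument list that mirrors source onto destination."""
    command = ["rsync", "--archive", "--compress", "--delete"]
    if dry_run:
        command.append("--dry-run")  # report what would change, copy nothing
    command += [source, destination]
    return command


def replicate(source=SOURCE_DIR, destination=STANDBY, dry_run=False):
    """Run rsync and raise CalledProcessError if it exits non-zero."""
    subprocess.run(build_rsync_command(source, destination, dry_run), check=True)


if __name__ == "__main__":
    # Only print the command here; a daily cron job would call replicate().
    print(" ".join(build_rsync_command(SOURCE_DIR, STANDBY, dry_run=True)))
```

Note that `--delete` removes files on the standby that no longer exist on the master, so a dry run first is a sensible habit. The DNS change on failure stays a manual step, exactly as described above.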

Why not full DNS round robin? And, for that matter, why not a full cluster with some Loadbalancer.org appliances in front? Um, how about:

1. Hassle
2. Cost
3. Complexity
4. Maintenance
5. Increased downtime

Hey, hang on, "Increased downtime"? Surely a high availability load balanced system would increase my availability, not decrease it?

Well, see points 3 & 4: the complexity of maintaining the cluster can easily make your actual availability lower than that of a much cheaper single server. Remember that, before this crash, our server hadn't been down in 2 years; technically that's already better than 99.999% availability (by luck, I know).
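To put rough numbers on this, here is a minimal sketch of the arithmetic. The function names are made up for illustration, the 20-hour outage is only an approximate reading of "Sunday afternoon until 11am on Monday" rather than a measured figure, and the chained calculation assumes the components fail independently and that all of them must be up at once.

```python
HOURS_PER_YEAR = 365 * 24  # ignoring leap years


def allowed_downtime_minutes(availability_percent, years=1):
    """Minutes of downtime permitted over the period at a given availability."""
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * HOURS_PER_YEAR * years * 60


def measured_availability(downtime_hours, years):
    """Availability as a percentage, given total downtime over the period."""
    total_hours = HOURS_PER_YEAR * years
    return 100 * (1 - downtime_hours / total_hours)


def serial_availability(*component_percents):
    """Availability of a chain in which every component must be up at once."""
    result = 1.0
    for percent in component_percents:
        result *= percent / 100
    return 100 * result


if __name__ == "__main__":
    for target in (99.9, 99.99, 99.999):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% allows {minutes:.1f} minutes of downtime per year")
    # Assumed outage length, see the note above.
    print(f"20 hours down in 2 years: {measured_availability(20, 2):.3f}%")
    # Three chained components at 99.9% each (hypothetical figures).
    print(f"Three 99.9% parts in series: {serial_availability(99.9, 99.9, 99.9):.3f}%")
```

Five nines leaves a budget of a little over five minutes a year, so a single outage of that assumed length uses up many years of allowance. The chained figure shows why extra moving parts can work against you: every added component that has to be up lowers the combined number unless it brings real redundancy with it.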

Andrew Hiles has a much better description of all this here: Five nines: chasing the dream?

So am I saying Loadbalancer.org appliances don't provide 99.999% availability? No, I'm saying that they probably won't, but definitely can :-).

Some vendors make a fair bit of hype about their products enabling 99.999% availability out of the box (Kemp for example):

"Say a hosting provider advertises 99.999 percent network availability. Good, the customer needs that. However, network is half the requirement of the customer. The customer needs 99.999 availability of the application that generates his revenue stream. To get the application's availability the hosting provider's customer must purchase his own load balancer or purchase the high availability service - through to the server and application - from the hosting provider. KEMP's load balancers are priced so that the hosting provider can pay less for the value add tool." - Kevin Mahon, Kemp Technologies.

OK Kevin, your load balancers are cheap but you can't just buy 99.999% availability off the shelf. You need to work at it, document it, build it, maintain it, test it, test it again...you get what I mean?

99.999% costs money, lots of money... and does the customer really need it?

But when it comes to marketing load balancing hardware "99.999% uptime guaranteed" does sound better than my alternative: "Loadbalancer.org appliances probably make 99.9% uptime pretty easy and 99.999% uptime theoretically possible?".

OK I won't give up the day job.

Update: I've decided to extend this blog entry with links to various downtime stories, to serve as reminders of how not to do it:

Spiceworks

I was browsing Spiceworks the other day and came across an interesting post about a Cisco CSS-induced Spiceworks outage. This is a classic example of 'health check hell': if your health check is too fast, too strict, etc., then the likely result is that your whole cluster will go down...

Obviously you need to make sure that your health check strikes a balance between the time taken to recognise a failure and the risk of false positives, i.e. the server was just a bit slow...
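One common way to strike that balance is to change a server's state only after several consecutive check results disagree with it. The sketch below illustrates the idea only and is not how any particular appliance implements it; the class name and the `fall` and `rise` parameter names are chosen here for the example.

```python
class HealthTracker:
    """Mark a server down only after `fall` consecutive failed checks,
    and up again only after `rise` consecutive successful ones."""

    def __init__(self, fall=3, rise=2):
        self.fall = fall
        self.rise = rise
        self.is_up = True
        self._streak = 0  # consecutive results that disagree with is_up

    def record(self, check_passed):
        """Feed in one check result and return the current up/down state."""
        if check_passed == self.is_up:
            self._streak = 0  # result agrees with the current state
        else:
            self._streak += 1
            threshold = self.rise if check_passed else self.fall
            if self._streak >= threshold:
                self.is_up = check_passed
                self._streak = 0
        return self.is_up


if __name__ == "__main__":
    tracker = HealthTracker(fall=3, rise=2)
    # One slow response in the middle does not take the server out.
    for result in (True, False, True, False, False, False):
        print(result, "->", "up" if tracker.record(result) else "down")
```

With `fall=1` a single slow response takes the server out of the pool, which is the "too strict" case. A larger `fall` tolerates the odd blip, at the cost of taking roughly `fall` times the check interval to notice a real failure.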

However, one of our recommended options is to use a single fallback server with no health check, i.e. it's always up!
Or, even better, use a pool of fallback servers with far less strict health checking (just in case).
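The fallback idea can be sketched in a few lines. This is only a simplified illustration with hypothetical server names, not appliance configuration: traffic goes to whichever primary servers pass the strict check, and the fallback servers are used only when none of them do.

```python
def choose_servers(primary_health, fallback_servers):
    """Return the servers that should receive traffic.

    primary_health maps server name -> True/False from the strict health check.
    fallback_servers is used only when every primary is marked down; these
    servers are not checked here, so they are always treated as up.
    """
    healthy = [name for name, is_up in primary_health.items() if is_up]
    if healthy:
        return healthy
    return list(fallback_servers)


if __name__ == "__main__":
    # Hypothetical names. Every primary has failed the strict check.
    primaries = {"web1": False, "web2": False}
    print(choose_servers(primaries, ["fallback1"]))
```

The point of the design is that an over-eager health check can empty the primary pool, but it can never leave you with nowhere to send traffic.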

As always, you need to make your own decision on this kind of thing and think about what will happen in each failure case (in advance!).

Netflix

I came across an interesting post on RightScale about the Amazon ELB/Netflix outage. Interestingly, it recommends a loose architecture with something like HAProxy (or Loadbalancer.org, obviously!) as the solution. However, I'm not sure that I agree; on one level I'm inclined to trust large providers like Amazon to get it right 99.9999% of the time...

But yet again you need to make your own decisions in advance about your response to disaster. Start at the DNS level and work your way down to the server level. One of the nice things about AWS is that it kind of forces you to make some high availability decisions right at the start of the process.

We have a lot of customers starting to use our AWS Load Balancer in multiple regions with ELB/Route 53 in front and the Loadbalancer.org appliance handling failover between availability zones (within the regions). Again these customers need to go to a lot of effort to ensure all this complexity doesn't come back to bite them in the backside.
