3 Ways To Send HAProxy Health Check Email Alerts


Update

As of haproxy-1.6-dev1 it is now possible to send email alerts directly from HAProxy, thanks to the excellent work done for us by Simon Horman. For details on how to configure this, see section 3.6 of the HAProxy v1.6 documentation.

As a follow-up to Aaron's blog on HAProxy email alerts using logwatch, I looked into different ways to achieve the same result.
The ideal way to monitor the health of the real servers is to have a dedicated monitoring system in place, such as Nagios (it even has an HAProxy plugin). However, this is not always an option, so some people need the load balancer itself to send an alert. So I investigated some different options.

Logwatch

Logwatch does achieve the desired result but is limited in what it can do. One of the drawbacks is that you do not get real-time alerts when a real server's status changes. Also, because it has to search through the log file, it causes unnecessary load, especially on a busy server. You can reduce the amount of work logwatch needs to do by using strict search criteria or by keeping the log file small with something like logrotate.
Another option is to use option "log-health-checks" and create a log file that only contains entries for when a server changes status. This will drastically reduce the amount of work logwatch needs to do.
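
To illustrate the kind of strict search criteria that keeps the work down, here is a minimal Python sketch that picks out only status-change lines. It assumes the log messages have the same shape as the alert text shown later in this post ("Server <backend>/<server> is UP|DOWN, reason: ..."). The function name status_changes, the syslog-style prefix and the sample lines are made up for illustration and are not part of logwatch or HAProxy.

```python
import re

# Assumed message shape: "Server <backend>/<server> is UP|DOWN, reason: <text>, ..."
STATUS_RE = re.compile(
    r"Server (?P<backend>[^/\s]+)/(?P<server>\S+) is (?P<state>UP|DOWN)\b")


def status_changes(lines):
    """Yield (backend, server, state) for every line that reports a status change."""
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            yield match.group("backend"), match.group("server"), match.group("state")


if __name__ == "__main__":
    # Hypothetical log lines; the prefix before "Server" is invented for the example
    sample = [
        "Oct 25 14:55:03 lb haproxy[1234]: Server servers/server2 is DOWN, "
        "reason: Layer4 timeout, check duration: 2001ms.",
        "Oct 25 14:55:04 lb haproxy[1234]: some unrelated log line",
    ]
    for change in status_changes(sample):
        print(change)
```

Anything that does not match the pattern is skipped, so the filter stays cheap even when the log file is large.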

Polling the stats socket

One of the features of HAProxy is that you can view the stats through a unix socket. This will tell you if a real server is UP or DOWN. So by polling the stats socket at regular intervals we can monitor any changes.
I have done this with a simple Python script that uses the socat command to retrieve the current stats and compares each server's status with its previous status.

#!/usr/local/bin/python3
import csv
import smtplib
import subprocess
import time
from email.mime.text import MIMEText

SOCKET = "/var/run/haproxy.stat"
POLL_INTERVAL = 30  # seconds between polls


def read_status():
    # ask HAProxy for its stats in CSV form via the unix socket
    output = subprocess.check_output(
        "echo show stat | socat unix-connect:" + SOCKET + " stdio", shell=True)
    # the CSV header line starts with "# "; strip it to get clean column names
    lines = output.decode().lstrip("# ").splitlines()
    status = {}
    for row in csv.DictReader(lines):
        # skip the FRONTEND and BACKEND summary rows, keep real servers only
        if row.get("svname") in ("FRONTEND", "BACKEND"):
            continue
        # status can look like "UP", "DOWN" or "UP 1/3"; keep the first word
        state = (row.get("status") or "").split(" ")[0]
        # anything other than UP or DOWN (MAINT, NOLB, ...) is ignored
        if state in ("UP", "DOWN"):
            status[row["pxname"] + "/" + row["svname"]] = state
    return status


def main():
    oldstat = None
    while True:
        currentstat = read_status()
        # ignore the first run as we have no old data to compare to
        if oldstat is not None:
            for server, state in currentstat.items():
                # alert when the status differs from the previous poll
                if oldstat.get(server) != state:
                    mail(server + " has changed status and is now " + state)
        oldstat = currentstat
        time.sleep(POLL_INTERVAL)


def mail(alert):
    msg = MIMEText(alert)
    me = "from@email.com"
    you = "to@email.com"
    msg["Subject"] = "Layer 7 alert"
    msg["From"] = me
    msg["To"] = you

    s = smtplib.SMTP("smtpserver.com")
    s.sendmail(me, [you], msg.as_string())
    s.quit()


if __name__ == "__main__":
    main()

This can also be downloaded here in case the formatting is not correct.
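
If socat is not installed, the stats socket can also be queried with Python's own socket module. The following is only a sketch: the function name show_stat is hypothetical, the default path is the one used in the script above, and it assumes HAProxy closes the connection after answering a single command (its usual non-interactive behaviour).

```python
import socket


def show_stat(path="/var/run/haproxy.stat", command="show stat"):
    """Send one command to the HAProxy stats socket and return the reply as text."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        # commands are terminated by a newline
        sock.sendall(command.encode() + b"\n")
        chunks = []
        # assumed: the server closes the connection when done, so read until EOF
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode()
```

Calling show_stat() returns the same CSV text that the socat pipeline produces, without starting a shell on every poll.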

So the polling script works but is far from perfect: we still do not have real-time alerts, and it does not account for a server's status being MAINT or NOLB. But we no longer need to read a log file and have reduced some IO, which could possibly be classed as an improvement over logwatch. You can change the polling interval to make it check as often or as rarely as needed. This is not the most elegant way and has its drawbacks, so there must be a better way...
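
One way to cover MAINT and NOLB would be to compare the raw status strings instead of only looking for UP and DOWN. The sketch below is a possible approach rather than part of the script above; the function name diff_status and the dictionaries mapping "backend/server" to a status string are assumptions made for the example.

```python
def diff_status(old, new):
    """Return (server, old_state, new_state) for each server whose state changed.

    Both arguments map "backend/server" to the raw status string, so MAINT,
    NOLB and any other value are compared just like UP and DOWN.
    """
    changes = []
    for server, state in sorted(new.items()):
        previous = old.get(server)
        # servers not seen in the previous poll are skipped: nothing to compare to
        if previous is not None and previous != state:
            changes.append((server, previous, state))
    return changes


if __name__ == "__main__":
    # hypothetical status snapshots from two consecutive polls
    before = {"servers/server1": "UP", "servers/server2": "UP"}
    after = {"servers/server1": "MAINT", "servers/server2": "UP"}
    for server, previous, state in diff_status(before, after):
        print(server + " changed from " + previous + " to " + state)
```

Keying on the server name rather than on the position in the output also means that adding or removing a server between polls cannot shift the comparison onto the wrong line.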

Patch HAProxy

So this brings us on to option 3: patch HAProxy to send the alerts. After all, how hard can it be?
As I don't really want to write my own SMTP client or use any other libraries, let's go with the easy option of using mailx from the mailutils package, as we know it works. The following was written for HAProxy dev18. Now, I'm no developer, so treat the code as a proof of concept rather than something to add to your production environment.

Most of the work is already done for us, as HAProxy has functions for setting a server up or down and also has an array containing the server name, server's status etc. So all we need to do is add our own function to send the email and parse the email address from the configuration file.
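
The patch itself is C, but the idea of handing the message to the mail command can be sketched in a few lines of Python. This is an illustration only and not the patch code: the function names build_mail_command and send_alert are hypothetical, and it assumes a mail binary on the PATH that accepts "-s <subject> <recipient>" and reads the body from standard input. Setting the from address is left out because the flag for it differs between mail implementations.

```python
import subprocess


def build_mail_command(subject, to_addr):
    """Build the argument list: mail -s <subject> <recipient>."""
    return ["mail", "-s", subject, to_addr]


def send_alert(body, subject, to_addr):
    """Pipe the alert text to mail on stdin; raises CalledProcessError on failure."""
    subprocess.run(build_mail_command(subject, to_addr),
                   input=body.encode(), check=True)
```

For example, send_alert("Server servers/server2 is DOWN", "Loadbalancer layer7 alert", "to@email.com") would hand the alert to the local mail system. Passing the arguments as a list avoids any shell quoting problems with the subject line.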

This is done in the following patch files:

In the configuration file I have added the option "email_alert" to the global section, taking a to and a from address. So you add:

email_alert to@email.com from@email.com

The to address is required and the from address is optional.
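
The parsing rule for this option (one required address, one optional) can be sketched in Python as follows. This only illustrates the rule and is not the C code from the patch; the function name parse_email_alert and the error messages are made up for the example.

```python
def parse_email_alert(line):
    """Parse 'email_alert <to> [<from>]' and return (to_addr, from_addr or None)."""
    words = line.split()
    if not words or words[0] != "email_alert":
        raise ValueError("not an email_alert line: " + repr(line))
    if len(words) < 2:
        raise ValueError("email_alert requires a to address")
    if len(words) > 3:
        raise ValueError("email_alert takes at most two addresses")
    to_addr = words[1]
    # the from address is optional, so fall back to None when it is absent
    from_addr = words[2] if len(words) == 3 else None
    return to_addr, from_addr
```

With the example line above, parse_email_alert("email_alert to@email.com from@email.com") returns both addresses, while leaving out the second one returns None in its place.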

Now when a server is marked as down or up you will receive an email in real time alerting you to this, which will look something like this:

Subject: Loadbalancer layer7 alert
From: from@mail.com
To: to@mail.com
X-Mailer: mail (GNU Mailutils 2.2)
Message-Id: <1@test>
Date: Fri, 25 Oct 2013 14:55:03 +0100 (BST)

Server servers/server2 is DOWN, reason: Layer4 timeout, check duration: 2001ms. 0 active and 0 back

And that's it: you now have email alerts straight from HAProxy, albeit untested and liable to break things.
When choosing what you need, the important questions are: do you need to be alerted the very second a server goes down? And does it matter if you install an extra service to monitor things? Each method works but has its own benefits and drawbacks, so it is up to you to decide what suits your environment.
Feel free to use the code or even rewrite it; all feedback is welcome!