Monitoring Memcache with Monit

 
Published on 2013-07-09 by John Collins.

Recently I had an issue with Memcache getting into a weird error state: it was still running as a service and a .pid file existed for it in /var/run, but it was refusing connections. The error was compounded by the fact that my Monit installation was telling me that it was working okay, due to the existence of the .pid file and the fact that it was listening on port 11211 (the default port for Memcache). Here was my full Monit configuration for monitoring Memcache:

check process memcached with pidfile /var/run/memcached/memcached.pid
    start program = "/sbin/service memcached start"
    stop program = "/sbin/service memcached stop"
    if failed host 127.0.0.1 port 11211 then restart
    if cpu > 70% for 2 cycles then alert
    if cpu > 98% for 5 cycles then restart
    if 2 restarts within 3 cycles then timeout

This is the crucial line: it should force Memcache to restart if it is not possible to connect to port 11211 on localhost:

if failed host 127.0.0.1 port 11211 then restart

But in my instance, getting the connection wasn't the issue: doing something with the connection was. If you are using a modern version of Monit (anything newer than 5.2 should be good), you will have access to the protocol directive with support for Memcache. By adding protocol MEMCACHE to the connection test, Monit will not only check that it can make a connection to Memcache, but also ensure that it is responding.

Here is the updated configuration:

check process memcached with pidfile /var/run/memcached/memcached.pid
    start program = "/sbin/service memcached start"
    stop program = "/sbin/service memcached stop"
    if failed host 127.0.0.1 port 11211 protocol MEMCACHE then restart
    if cpu > 70% for 2 cycles then alert
    if cpu > 98% for 5 cycles then restart
    if 2 restarts within 3 cycles then timeout
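
To make the difference between the two checks concrete, here is a minimal Python sketch. It is not how Monit implements its test, and the probe Monit sends may well differ. The function names port_is_open and memcache_responds are hypothetical, and the sketch assumes the server speaks the memcached text protocol, whose version command replies with a line starting with "VERSION".

```python
import socket


def port_is_open(host="127.0.0.1", port=11211, timeout=2.0):
    """Plain port check: succeeds as soon as the TCP connection is accepted."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def memcache_responds(host="127.0.0.1", port=11211, timeout=2.0):
    """Protocol-level check: connect, send a command, and require a sane reply.

    Assumption: the memcached text protocol's `version` command is used as
    the probe; this is an illustration, not Monit's actual test.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"version\r\n")
            reply = conn.recv(128)
    except OSError:
        # Covers refused connections, resets, and timeouts while waiting.
        return False
    return reply.startswith(b"VERSION ")
```

A daemon that accepts connections but never answers would pass port_is_open and fail memcache_responds, which is the kind of gap the protocol test is meant to close.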

Updated 2023: note that the above post was originally published in 2013 and may be outdated, but is left here for archival purposes.