A couple of times I have run across a strange issue on F5 Big-IP LTM clusters where one of the nodes marks some resources as down although they are actually up, which can cause quite a lot of confusion and trouble.
At least in the cases I have seen, TMM seems to start interpreting the results of health checks backwards for some hosts: the logs show that the health check reported the host as up, yet that host was marked as down. I have had it happen a couple of times with the 11.x series LTM software, and it has also happened with 12.x versions, even at the latest patch levels. I have not seen it happen with 13.x (yet).
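To spot the mismatch, it can help to pull the monitor status changes out of the log and compare them with what the health check actually returned. The following is only a rough sketch: the helper name `monitor_transitions` is made up for illustration, and both the default log path `/var/log/ltm` and the "monitor status up/down" wording are assumptions about what your device writes, so adjust the pattern to match your own logs.

```shell
# Hypothetical helper: list monitor status transitions found in an LTM log file.
# Assumptions: the log lives at /var/log/ltm and status changes are logged
# with the words "monitor status up" or "monitor status down".
monitor_transitions() {
    log_file="${1:-/var/log/ltm}"
    grep -E 'monitor status (up|down)' "$log_file"
}
```

Running `monitor_transitions` on the affected unit (or `monitor_transitions /path/to/copy` on a copied log) gives a short list to check against the real state of the hosts.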
To get around the issue I have usually just restarted the TMM process on the affected device, and everything has gone back to normal afterwards.
To restart TMM, log in to the device using SSH and issue the following command:
tmsh restart /sys service tmm
Beware that restarting TMM will cause the device to stop processing traffic. So if the affected device is currently processing traffic and you are running a Big-IP cluster, fail over to the other unit first if you haven’t already done so.
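The fail-over step can be sketched as a small shell function to run on the affected unit before restarting TMM. Treat it as a sketch under assumptions rather than a tested procedure: the function name `failover_if_active` and the `TMSH` variable are made up for illustration, and the check assumes that the output of `tmsh show /sys failover` contains the word "active" when the unit is the active one. Verify both commands on your software version first.

```shell
# Hypothetical wrapper: push this unit to standby if it is currently active.
# TMSH can be overridden (for example with a stub) for a dry run.
TMSH="${TMSH:-tmsh}"

failover_if_active() {
    # Assumption: the status output contains "active" on the active unit.
    state=$("$TMSH" show /sys failover)
    case "$state" in
        *active*)
            # Hand traffic over to the peer before touching TMM.
            "$TMSH" run /sys failover standby
            ;;
        *)
            echo "unit is not active, no fail-over needed"
            ;;
    esac
}
```

Call `failover_if_active`, confirm on the peer that it has taken over the traffic, and only then restart TMM.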
As with many other issues, the phrase “have you tried turning it off and on again” comes to mind and saves the day.