[colug-432] unix monitoring

Stephen Potter spp at unixsa.net
Tue Mar 1 23:40:20 EST 2016

On 2/24/2016 6:14 PM, Rick Hornsby wrote:
> Zabbix - I have some past experience with 1.x and 2.0/2.2.  The web UI is a little painful.  Just now I quick-like spun up a Zabbix 3.0 instance --- and things on the dashboard are blinking.  No really, green blocks on the screen are blinking.  Please stop blinking, giant blocks.

I'm using Zabbix 2.4, and while it isn't perfect by any stretch, it 
works pretty consistently.  Although it apparently had been running for 
a few years, it was only installed on about half the environment and 
wasn't really tuned when I first started my new job.  We were getting 
over 100 messages a day!  I've since rolled it out to the rest of the 
environment, tuned the alerts, and gotten our support team to actually 
start to watch the dashboard and react to the emails.  We're now getting 
fewer than 10 a day, and those are generally space issues on application 

Easy stuff is pretty easy, hard stuff is harder.  I've got about 650 AIX 
and Linux boxes being monitored.  I recently found one peculiarity that 
I haven't had time to dig into and really understand.  When I use 
system.uptime and display it as "uptime", the day count resets or 
overruns.  On Linux boxes, it seems to loop at one year, so if a box has 
been up 1 year, 28 days, it will only show 28 days of uptime.  On AIX, 
it seems to loop at around 277 days, and after about 750 days it just 
starts reporting 0.
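
To make the Linux symptom concrete, here is a minimal Python sketch of the arithmetic.  It assumes the raw item value is uptime in seconds, and that the display drops whole years from the day count.  That second part is only a guess at what the symptom looks like, not a confirmed description of what Zabbix does internally.  The function names format_uptime and wrapped_days are made up for this illustration.

```python
SECONDS_PER_DAY = 86400


def format_uptime(seconds):
    """Render a raw uptime in seconds as 'D days, HH:MM:SS' with no wrapping."""
    days, rem = divmod(int(seconds), SECONDS_PER_DAY)
    hours, rem = divmod(rem, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{days} days, {hours:02d}:{minutes:02d}:{secs:02d}"


def wrapped_days(seconds, wrap_days):
    """Day count a display would show IF it wrapped every wrap_days days.

    Hypothetical model of the symptom, not a known Zabbix behavior.
    """
    return (int(seconds) // SECONDS_PER_DAY) % wrap_days


# A box that has been up 1 year, 28 days (taking a year as 365 days).
raw = (365 + 28) * SECONDS_PER_DAY

print(format_uptime(raw))      # 393 days, 00:00:00
print(wrapped_days(raw, 365))  # 28
```

Comparing the raw seconds value stored for the item against the formatted display on one of the affected boxes would show whether the data or only the rendering is wrong.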


