[colug-432] unix monitoring

William Yang wyang at gcfn.net
Fri Feb 26 10:36:37 EST 2016

I've only got experience with two of your suggested solutions: Zabbix and
Nagios, but I've spent a lot of time understanding situational awareness
issues in monitoring over the years.

Generally speaking, I've been pretty happy with Nagios in production
environments. That said, it's got all the good and bad of an old package
like sendmail. Incredibly powerful, pain in the behind to configure. I
don't find it particularly brittle, though -- it does require that you have
some procedural discipline in managing Nagios configurations -- you have to
test your configurations properly, or you will break things.
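
One way to enforce that discipline is to run the Nagios pre-flight check (`nagios -v <main config>`) before every reload and refuse to reload if it fails. The following is a minimal Python sketch, not taken from the original message. It assumes the binary is on the PATH as `nagios` and that a failed check exits non-zero. The function names, the config path, and the reload command are hypothetical placeholders to adapt to the local install.

```python
import subprocess
from typing import Callable, Sequence


def config_is_valid(cfg_path: str, nagios_bin: str = "nagios",
                    run: Callable = subprocess.run) -> bool:
    """Run the pre-flight check (`nagios -v <cfg>`) and report whether it passed.

    Assumption: the check exits with status 0 only when the config is usable.
    """
    result = run([nagios_bin, "-v", cfg_path], capture_output=True, text=True)
    return result.returncode == 0


def safe_reload(cfg_path: str, reload_cmd: Sequence[str],
                nagios_bin: str = "nagios",
                run: Callable = subprocess.run) -> bool:
    """Reload Nagios only if the configuration validates first.

    `reload_cmd` is whatever reloads the daemon on the local system
    (hypothetical here); it is never run when validation fails.
    """
    if not config_is_valid(cfg_path, nagios_bin, run):
        return False
    run(list(reload_cmd), check=True)
    return True
```

The `run` parameter exists only so the check can be exercised without a Nagios install. In normal use the default `subprocess.run` is left in place, and the call looks like `safe_reload("/etc/nagios/nagios.cfg", ["systemctl", "reload", "nagios"])`, with both arguments being examples rather than known paths.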

I'm not really impressed with Zabbix, but the only problem I've used it on
was so big and specific that nothing could really handle it well.  Zabbix
did ultimately get set up in a way that could work for the problem, but I
still think Nagios could have done it the same way at substantially less
than the $2M that was ultimately spent.

Nagios would really benefit from a *free* GUI tool to manage (and validate)
the configuration.  I think that's commercially available.  (I wrote my own
config manager for it a long time ago, but it's essentially just emacs,
make, a couple of perl and shell scripts, and a way I devised to
describe the configuration macros.  I also once wrote a scheduler
replacement for Nagios, based on storing everything in a MySQL database,
but I still think the Nagios engine worked better than mine.)
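
To illustrate the general idea of a script-driven config manager, here is a minimal Python sketch that renders Nagios `define host` blocks from a small host list and catches one common mistake (duplicate host names) before anything is written. It is not the author's tool, which used emacs, make, perl, and shell. The function names are hypothetical, and `generic-host` is assumed to be a host template defined elsewhere in the local configuration.

```python
from typing import Iterable, Tuple


def render_host(name: str, address: str, template: str = "generic-host") -> str:
    """Render one Nagios host object definition.

    Assumption: `template` names a host template defined elsewhere
    in the local configuration.
    """
    return (
        "define host {\n"
        f"    use        {template}\n"
        f"    host_name  {name}\n"
        f"    address    {address}\n"
        "}\n"
    )


def render_hosts(hosts: Iterable[Tuple[str, str]]) -> str:
    """Render all hosts, refusing duplicate host names up front."""
    seen = set()
    blocks = []
    for name, address in hosts:
        if name in seen:
            raise ValueError(f"duplicate host_name: {name}")
        seen.add(name)
        blocks.append(render_host(name, address))
    return "\n".join(blocks)


if __name__ == "__main__":
    # Example inventory; the names and documentation-range addresses are made up.
    print(render_hosts([("web01", "192.0.2.10"), ("db01", "192.0.2.20")]))
```

Generating the files from one source list keeps hand edits out of the object definitions, and the generated output can still be checked with the normal Nagios verification step before a reload.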


On 02/24/2016 06:14 PM, Rick Hornsby wrote:
> We're presently using Microsoft SCOM to monitor our enterprise (mix of Windows and Linux) ... and it's ... horrible.  There's simply no other way to describe the experience for a UNIX admin.  We've given up trying to automate the unix agent installation.  It's broken in ways we cannot fix.
> So we're pondering a better solution for our UNIX environment.  BMC Patrol is out - it's got big stompy feet and an even larger price tag.  We're looking at free/OSS options to cover ~1000 unix hosts, mostly RHEL and SUSE but some Solaris and AIX.
> Start with the basics - CPU load, disk space, ports listening, processes running, etc - and have the ability to grow into application level monitoring.  It would be nice if the OSS version supported LDAP auth.  We plan to integrate the solution into our eventual server provisioning stuff that we're planning to build with Puppet.  It would also be nice if the dashboard was pretty.
> For an idea, some of the ones we're considering are Zabbix, Nagios, Sensu, and PandoraFMS.
> Zabbix - I have some past experience with 1.x and 2.0/2.2.  The web UI is a little painful.  Just now I quick-like spun up a Zabbix 3.0 instance --- and things on the dashboard are blinking.  No really, green blocks on the screen are blinking.  Please stop blinking, giant blocks.
> Nagios - It's been around long enough to have earned a bad rep for basically being old, never very user friendly, and generally brittle by modern standards.
> Sensu - It looks pretty?  Don't know much, but it's weird that the OSS version is Ruby but the enterprise version is Java?
> PandoraFMS - Don't know much about this one.
> What are you guys running?  It feels like there must be more options out there that we're not aware of.
> thanks!

William Yang
wyang at gcfn.net
