snmpd strangeness

Jeremy Chadwick koitsu at FreeBSD.org
Wed Nov 19 09:26:56 PST 2008


On Wed, Nov 19, 2008 at 12:11:36PM -0500, John Almberg wrote:
>
> On Nov 19, 2008, at 11:49 AM, Jeremy Chadwick wrote:
>
>> On Wed, Nov 19, 2008 at 10:57:50AM -0500, John Almberg wrote:
>>> I just noticed something odd and am looking for ideas...
>>>
>>> As you can see from the top snippet below, snmpd is getting hammered by
>>> something. As a comparison, the load averages for this quad-core box are
>>> usually close to zero.
>>>
>>> I'm not even sure I'm using snmpd for anything... not even sure what it
>>> is, precisely.
>>>
>>> I'm digging into docs at the moment, but any ideas much appreciated.
>>
>> I'm greatly concerned by the fact that you have a process on your
>> machine taking up 103% CPU time (possible on a quad-core machine),
>> taking up 2621MBytes of memory (RSS), yet you have no idea what it is,
>> what SNMP is, or why said process is running on your machine.  :-)
>
> That's an easy one to answer... Someone else installed FreeBSD on this 
> machine. I have figured out MOST of what is on this box, but I'm  
> occasionally surprised, like in this case.
>
> However, now that I've read through the installer's notes, I see that he 
> had exotic plans for snmp monitoring. From what I can tell, he never got 
> it working properly.

Interesting.  For "small" installations, e.g. super simple monitoring,
most people prefer to use bsnmpd(1), which comes with FreeBSD.  The
docs are a bit sparse, and the config syntax is weird + touchy, but
I've tinkered with it a bit.

> In the meantime, I killed off the process. I had to take a sledgehammer 
> to it, since a normal stop didn't work:
>
> [identry at on:log]> sudo /usr/local/etc/rc.d/snmpd stop
> Stopping snmpd.
> Waiting for PIDS: 45136t, 45136op, 45136, 45136, 45136, 45136, 45136,  
> 45136, 45136, 45136, 45136, 45136, 45136, 45136, 45136, 45136, 45136,  
> 45136, 45136, 45136, 45136, 45136, 45136, 45136, 45136, 45136, 45136,  
> 45136, 45136, 45136, 45136, 45136, 45136, 45136, 45136, 45136, 45136,  
> 45136, 45136, 45136, 45136, 45136, 45136, 45136, 45136, 45136, 45136,  
> 45136, 45136, 45136, 45136, 45136, 45136, 45136, 45136, 45136, 45136,  
> 45136, 45136, 45136^C
> [identry at on:log]> sudo kill -SIGKILL 45136
>
> This makes me wonder if the process was just hung in some bad way,  
> eating up cpu cycles?

Looks like it was wedged on a single CPU maybe? (If it was spiralling
out of control thread-wise, I'd expect to see it chewing up ~400% CPU,
i.e. 100% per core).

More interesting is the fact that it was taking up 2.6GB of RAM.  That
reeks of a memory leak somewhere.  Maybe the snmpd.conf tried to tie
in some shell scripts or executables?

I've seen this behaviour at work on Solaris, but it's rare.  (More
commonly, we see kernel panics when using old versions of Net-SNMP --
yeah, you read that right, kernel panics.  Seems the Solaris kernel has
some SNMP support in it -- yes, the kernel!)

You would have to work with the Net-SNMP folks to figure out what the
cause was.

> Out of curiosity, I then restarted it. It seemed to run without problem 
> after the restart, but after watching it for awhile, I stopped it again. 
> I don't think it's doing anything useful at the moment.

Then keep it off.  It opens up a listening port, amongst other things.
If you're not using it, don't run it.  :-)

> Now I'm curious about snmp, so perhaps I'll try to figure out how to get 
> it to do something useful. This machine has 8 hard drives, and is located in 
> Manhattan, so I would certainly like to be informed if one of the raid 
> drives went on the blink. That was one of the things he was trying to get 
> working.

Net-SNMP won't give you the status of the RAID.  Neither will bsnmpd(1).
FreeBSD simply does not have the hooks to make this possible.  Someone
needs to write the code.  I do not recommend relying on shell scripts
tied into Net-SNMP to accomplish this either (for a lot of very good
reasons); write the code in native C.

It also greatly depends on what you're using for RAID.  If a hardware
controller, good luck getting the status out of an API natively (sans
Areca, which I believe offers an API) -- you'll resort to shell scripts
and CLI binaries, in which case you're *easily* better off with a
cronjob, periodic(8), or a log monitor daemon.
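
To make the cronjob idea concrete, here is a rough sketch in Python.
Treat everything specific in it as an assumption: the command you pass
to check() stands in for whatever status CLI your controller vendor
ships, and the keywords in BAD_WORDS are guesses that you would need to
match against what your tool really prints for an unhealthy array.

```python
import subprocess

# Assumed keywords; adjust to what your controller's CLI actually prints.
BAD_WORDS = ("degraded", "failed", "rebuilding", "offline")


def find_problems(output, bad_words=BAD_WORDS):
    """Return the output lines containing any bad keyword (case-insensitive)."""
    return [line for line in output.splitlines()
            if any(word in line.lower() for word in bad_words)]


def check(cmd, timeout=60):
    """Run a status command; return 0 if healthy, 1 if not, 2 if it could not run.

    `cmd` is a hypothetical placeholder for your controller's status CLI,
    given as an argument list, e.g. ["/path/to/vendor-cli", "status"].
    """
    try:
        result = subprocess.run(cmd, capture_output=True, text=True,
                                timeout=timeout)
    except (OSError, subprocess.TimeoutExpired) as exc:
        print(f"RAID check could not run: {exc}")
        return 2
    problems = find_problems(result.stdout)
    if result.returncode != 0 or problems:
        # Anything printed here ends up in the mail cron sends to the job owner.
        print("RAID status needs attention:")
        print("\n".join(problems) or result.stdout or result.stderr)
        return 1
    return 0  # silent on success, so cron sends no mail
```

Wrap it in a small script that calls sys.exit(check([...])) and run
that from cron.  It stays silent when all is well and prints only when
something looks wrong, so cron's normal mail-on-output behaviour does
the alerting for you.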

It never ceases to amaze me how people try to shove crazy stuff into
SNMP stacks which should be done elsewhere.  :-)  Even Juniper's JunOS,
which provides an extensive SNMP extension, does not provide everything
desired.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |


