cvs commit: src/etc Makefile sensorsd.conf src/etc/defaults rc.conf src/etc/rc.d Makefile sensorsd src/lib/libc/gen sysctl.3 src/sbin/sysctl sysctl.8 sysctl.c src/share/man/man5 rc.conf.5 src/share/man/man9 Makefile sensor_attach.9 src/sys/conf f

Constantine A. Murenin cnst at FreeBSD.org
Wed Oct 17 08:51:04 PDT 2007


On 17/10/2007 09:07, John Baldwin wrote:
> On Tuesday 16 October 2007 06:14:34 pm Constantine A. Murenin wrote:
> 
>>On 16/10/2007 17:01, John Baldwin wrote:
>>
>>
>>>On Monday 15 October 2007 10:57:48 pm Constantine A. Murenin wrote:
>>>
>>>
>>>>On 15/10/2007, John Baldwin <jhb at freebsd.org> wrote:
>>>>
>>>>
>>>>>On Monday 15 October 2007 09:43:21 am Alexander Leidinger wrote:
>>>>>
>>>>>
>>>>>>Quoting Scott Long <scottl at samsco.org> (from Mon, 15 Oct 2007
>>>>>>01:47:59 -0600):
>>>>>
>>>>>
>>>>>>>Alexander Leidinger wrote:
>>>>>>>
>>>>>>>
>>>>>>>>Quoting Poul-Henning Kamp <phk at phk.freebsd.dk> (from Sun, 14 Oct
>>>>>>>>2007 17:54:21 +0000):
>>>>>>
>>>>>>>>>listen to the various mumblings about putting RAID-controller status
>>>>>>>>>under sensors framework.
>>>>>>>>
>>>>>>>>What's wrong with this? Currently each RAID driver has to come up
>>>>>>>>with his own way of displaying the RAID status. It's like saying
>>>>>>>>that each network driver has to implement/display the stuff you can
>>>>>>>>see with ifconfig in its own way, instead of using the proper
>>>>>>>>network driver interface for this.
>>>>>>>>
>>>>>>>
>>>>>>>For the love of God, please don't use RAID as an example to support your
>>>>>>>argument for the sensord framework.  Representing RAID state is several
>>>>>>>orders of magnitude more involved than representing network state.
>>>>>>>There are also landmines in the OpenBSD bits of RAID support that are
>>>>>>>best left out of FreeBSD, unless you like alienating vendors and risking
>>>>>>>legal action.  Leave it alone.  Please.  I don't care what you do with
>>>>>>>lmsensors or cpu power settings or whatever.  Leave RAID out of it.
>>>>>>
>>>>>>Talking about RAID status is not talking about alienating vendors. I
>>>>>>don't talk about alienating vendors and I don't intend to. You may
>>>>>>not be able to display a full blown RAID status with the sensors
>>>>>>framework, but it allows for a generic "works/works not" or
>>>>>>"OK/degraded" status display in drivers we have the source for. This
>>>>>>is enough for status monitoring (e.g., nagios).
>>>>>
>>>>>As I mentioned in the thread on arch@ where people brought up objections that
>>>>>were apparently completely ignored, this is far from useful for RAID
>>>>>monitoring.  For example, if my RAID is down, which disk do I need to
>>>>>replace?  Again, all this was covered earlier and (apparently) ignored.
>>>>>Also, what strikes me as odd is that I didn't see this patch posted again for
>>>>>review this time around before it was committed.
>>>>
>>>>This has been addressed back in July. You'd use bioctl to see which
>>>>exact disc needs to be replaced. Sensorsd is intended for an initial
>>>>alert about something being wrong.
>>>
>>>
>>>In July you actually said you weren't sure about bioctl(8). :)  But also, this 
>>>model really isn't very sufficient since it doesn't handle things like drives 
>>>going away, etc.  You really need to maintain a decent amount of state to 
>>>keep all that, and this is far easier done in userland rather than in the 
>>>kernel.  However, you can choose to ignore real-world experience if you 
>>>choose.
>>>
>>>Basically, by having so little data in hw.sensors if I had to write a RAID 
>>>monitoring daemon I would just not use hw.sensors since it's easier for me to 
>>>figure out the simple status myself based on the other state I already have 
>>>to track (unless you write an event-driven daemon based on messages posted by 
>>>the firmware in which case again you wouldn't use hw.sensors for that either).
>>
>>There is no other daemon that you'd need, you'd simply use sensorsd for 
>>this.  You could write a script that would be executed by sensorsd if a 
>>certain logical disc drive sensor changes state, and then this script 
>>would call the bio framework and give you additional details on why the 
>>state was changed.
> 
> 
> That's actually not quite good enough as, for example, I want to keep yelling
> about a busted volume on a periodic basis until it's fixed.  Also, having a volume
> change state doesn't tell me if a drive was pulled.  On at least one RAID
> controller firmware I am familiar with, the only way you can figure this out is
> to keep track of which drives are currently present with a generation count and
> use that to determine when a drive goes away.  Even my monitoring daemon for
> ata-raid has to do this since the ata(4) driver just detaches and removes a drive
> when it fails and you have no way to figure out which drive died as the kernel
> thinks that drive no longer exists.
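
For reference, the generation-count bookkeeping described in the quoted
paragraph above might look roughly like the sketch below.  Everything in
it is an assumption made for illustration: drive_table, scan_begin(),
scan_mark_present() and scan_missing() are hypothetical names, not part
of ata(4), the bio framework or the sensors framework, and the fixed
slot count is arbitrary.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DRIVES 16	/* arbitrary limit for this sketch */

/* Hypothetical userland bookkeeping for one controller. */
struct drive_table {
	unsigned int generation;	/* number of the current scan */
	unsigned int seen[MAX_DRIVES];	/* scan in which each slot was
					 * last present; 0 = never seen */
};

/* Start a new scan; the first scan is generation 1. */
void
scan_begin(struct drive_table *t)
{
	assert(t != NULL);
	t->generation++;
}

/* Record that the drive in `slot' was reported present in this scan. */
void
scan_mark_present(struct drive_table *t, size_t slot)
{
	assert(t != NULL && slot < MAX_DRIVES);
	t->seen[slot] = t->generation;
}

/*
 * After a scan, collect the slots that were present in some earlier
 * scan but not in the current one.  Returns the number of slots
 * written to `out' (at most `max').  Generation wraparound is ignored.
 */
size_t
scan_missing(const struct drive_table *t, size_t *out, size_t max)
{
	size_t i, n = 0;

	assert(t != NULL && out != NULL);
	for (i = 0; i < MAX_DRIVES && n < max; i++) {
		if (t->seen[i] != 0 && t->seen[i] != t->generation)
			out[n++] = i;
	}
	return n;
}
```

A daemon would call scan_begin(), mark every drive the controller or
kernel currently reports, and then call scan_missing(); a slot keeps
being reported on every scan until the drive shows up again.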

As I said back in July, I'm not terribly familiar with RAID, but I don't 
see why you can't accomplish this with the sensors framework.

You didn't quote my other part of the reply about the ntpd/sensors.c 
example.  You can use the sensors framework in the same way as ntpd 
does, e.g. you can send repeated warnings as long as one of the logical 
drive sensors is not in an OK state.  In sensorsd.conf, you'd simply say 
"drive:istatus", and sensorsd won't bother you with duplicate warnings, 
since your own application will provide them more appropriately.  Or a 
feature for repeating warnings while something is not in an OK state 
could always be added to sensorsd, too.
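
A minimal sketch of that repeated-warning logic, as an application might
implement it on its own.  All names here (drive_status, drive_monitor,
should_warn) are hypothetical and are not taken from the sensors
framework or from ntpd; how the status is actually read from a sensor
is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical status values; not the framework's own enum. */
enum drive_status { DRIVE_OK, DRIVE_DEGRADED, DRIVE_FAILED };

/* Hypothetical per-drive state kept by the monitoring application. */
struct drive_monitor {
	bool	warned;		/* a warning was sent for this episode */
	time_t	last_warned;	/* when the last warning was sent */
	time_t	interval;	/* seconds between repeated warnings */
};

/*
 * Decide whether to emit a warning now.  Warns as soon as the status
 * is first seen as not OK, repeats every `interval' seconds while it
 * stays that way, and resets once the drive is OK again.
 */
bool
should_warn(struct drive_monitor *m, enum drive_status st, time_t now)
{
	assert(m != NULL && m->interval > 0);

	if (st == DRIVE_OK) {
		m->warned = false;
		return false;
	}
	if (!m->warned || now - m->last_warned >= m->interval) {
		m->warned = true;
		m->last_warned = now;
		return true;
	}
	return false;
}
```

The application would call should_warn() each time it polls the
logical drive sensor and log or mail a message whenever it returns
true.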

C.


More information about the cvs-src mailing list