cvs commit: src/etc Makefile sensorsd.conf src/etc/defaults rc.conf src/etc/rc.d Makefile sensorsd src/lib/libc/gen sysctl.3 src/sbin/sysctl sysctl.8 sysctl.c src/share/man/man5 rc.conf.5 src/share/man/man9 Makefile sensor_attach.9 src/sys/conf f

John Baldwin jhb at
Wed Oct 17 06:42:50 PDT 2007

On Tuesday 16 October 2007 06:14:34 pm Constantine A. Murenin wrote:
> On 16/10/2007 17:01, John Baldwin wrote:
> > On Monday 15 October 2007 10:57:48 pm Constantine A. Murenin wrote:
> > 
> >>On 15/10/2007, John Baldwin <jhb at> wrote:
> >>
> >>>On Monday 15 October 2007 09:43:21 am Alexander Leidinger wrote:
> >>>
> >>>>Quoting Scott Long <scottl at> (from Mon, 15 Oct 2007 01:47:59 -0600):
> >>>
> >>>>>Alexander Leidinger wrote:
> >>>>>
> >>>>>>Quoting Poul-Henning Kamp <phk at> (from Sun, 14 Oct
> >>>>>>2007 17:54:21 +0000):
> >>>>
> >>>>>>>listen to the various mumblings about putting RAID-controller status
> >>>>>>>under sensors framework.
> >>>>>>
> >>>>>>What's wrong with this? Currently each RAID driver has to come up
> >>>>>>with its own way of displaying the RAID status. It's like saying
> >>>>>>that each network driver has to implement/display the stuff you can
> >>>>>>see with ifconfig in its own way, instead of using the proper
> >>>>>>network driver interface for this.
> >>>>>>
> >>>>>
> >>>>>For the love of God, please don't use RAID as an example to support
> >>>>>your argument for the sensorsd framework.  Representing RAID state is
> >>>>>several orders of magnitude more involved than representing network state.
> >>>>>There are also landmines in the OpenBSD bits of RAID support that are
> >>>>>best left out of FreeBSD, unless you like alienating vendors and
> >>>>>risking legal action.  Leave it alone.  Please.  I don't care what you do with
> >>>>>lmsensors or cpu power settings or whatever.  Leave RAID out of it.
> >>>>
> >>>>Talking about RAID status is not talking about alienating vendors. I
> >>>>don't talk about alienating vendors and I don't intend to. You may
> >>>>not be able to display a full-blown RAID status with the sensors
> >>>>framework, but it allows for a generic "works/works not" or
> >>>>"OK/degraded" status display in drivers we have the source for. This
> >>>>is enough for status monitoring (e.g., nagios).
> >>>
> >>>As I mentioned in the thread on arch@ where people brought up objections
> >>>that were apparently completely ignored, this is far from useful for RAID
> >>>monitoring.  For example, if my RAID is down, which disk do I need to
> >>>replace?  Again, all this was covered earlier and (apparently) ignored.
> >>>Also, what strikes me as odd is that I didn't see this patch posted again
> >>>for review this time around before it was committed.
> >>
> >>This was addressed back in July. You'd use bioctl to see which
> >>exact disc needs to be replaced. Sensorsd is intended for an initial
> >>alert that something is wrong.
> > 
> > 
> > In July you actually said you weren't sure about bioctl(8). :)  But also, this 
> > model really isn't sufficient, since it doesn't handle things like drives 
> > going away, etc.  You really need to maintain a decent amount of state to 
> > track all that, and this is far easier done in userland than in the 
> > kernel.  However, you can choose to ignore real-world experience if you 
> > wish.
> > 
> > Basically, with so little data in hw.sensors, if I had to write a RAID 
> > monitoring daemon I would simply not use hw.sensors, since it's easier for me to 
> > figure out the simple status myself based on the other state I already have 
> > to track.  (And if you write an event-driven daemon based on messages posted by 
> > the firmware, you wouldn't use hw.sensors for that either.)
> There is no other daemon that you'd need; you'd simply use sensorsd for 
> this.  You could write a script that would be executed by sensorsd if a 
> certain logical disc drive sensor changes state, and this script 
> would then call the bio framework to give you additional details on why 
> the state changed.
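[For reference, the hook Constantine describes maps onto the `command` keyword in sensorsd.conf(5). A sketch of such an entry; the controller name and script path are made up for illustration:]

```
# Hypothetical sensorsd.conf(5) entry: run a script whenever the first
# logical-drive sensor on an ami(4)-style controller changes state.
# "ami0" and the script path are assumptions, not from the thread.
hw.sensors.ami0.drive0:command=/etc/sensorsd/raid-alert
```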

That's actually not quite good enough as, for example, I want to keep yelling
about a busted volume on a periodic basis until it's fixed.  Also, having a volume
change state doesn't tell me if a drive was pulled.  On at least one RAID
controller firmware I am familiar with, the only way you can figure this out is
to keep track of which drives are currently present with a generation count and
use that to determine when a drive goes away.  Even my monitoring daemon for
ata-raid has to do this, since the ata(4) driver just detaches and removes a drive
when it fails and you have no way to figure out which drive died, as the kernel
thinks that drive no longer exists.
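[The presence-tracking John describes amounts to diffing the set of drives the controller reports between polls. A minimal sketch of that bookkeeping, in Python purely for illustration; the drive names and poll data are made up:]

```python
# Sketch of the userland state a monitoring daemon must keep: remember
# which drives were present at the last poll and diff against the current
# poll, since a failed drive may simply vanish from the kernel's view.
# The drive names below are illustrative, not from any real controller.

def diff_drives(previous, current):
    """Return (gone, new): drives that vanished and drives that appeared."""
    gone = previous - current
    new = current - previous
    return gone, new

# Poll 1: four members present.  Poll 2: da2 failed and was detached,
# so the kernel no longer reports it at all.
poll1 = {"da0", "da1", "da2", "da3"}
poll2 = {"da0", "da1", "da3"}

gone, new = diff_drives(poll1, poll2)
print(sorted(gone))  # -> ['da2']
```

Without this remembered state, the "which drive died?" question is unanswerable after the fact, which is the point John is making.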

John Baldwin

More information about the cvs-src mailing list