cvs commit: src/etc Makefile sensorsd.conf src/etc/defaults rc.conf src/etc/rc.d Makefile sensorsd src/lib/libc/gen sysctl.3 src/sbin/sysctl sysctl.8 sysctl.c src/share/man/man5 rc.conf.5 src/share/man/man9 Makefile sensor_attach.9 src/sys/conf files ...

Tue Oct 16 10:32:43 PDT 2007

In message <20071016183311.lu97hbwzggsk4ow4 at webmail.leidinger.net>,
Alexander Leidinger writes:

>> Yes, that is the abstract argument, but the very same argument can
>> be made for every other single kind of entity which consumes or
>> produces bytes, from fingerprint readers to 9-track tape stations.
>
>Why do we have a common linked list API? It's easy enough to do it  
>again and again and again... We have it because we don't want to do it  
>again and again... And with the sensors API we gain something similar.  

There is a very big difference between <sys/queue.h> and sensors,
in that <sys/queue.h> is not an external API, but a convenience
tool for code to maintain its own data internally, whereas sensors
is an API for exporting data.

>It adds meta-data which can be used in an automated way. This is done  
>with a consistent and documented API. Sure, we can do it with sysctls  
>by hand, but see above.

What exactly do you mean when you say "used in an automated way" ?

Can I run some magic program and tell it "alert me if something is
wrong" or do I have to write a tedious configuration file to explain
what "something is wrong" would look like to the program ?

>It is not supposed to make the monitoring itself easier.
>[...]
>A human being still has to interpret the measurements. No doubts. But  
>with the framework you don't have to hunt down where to read the  
>sensor data, and how to name it. You can write a probe which takes  
>everything in the sensors mib and let it produce names and values  
>for the probed things automatically.

So the only problem sensors solves, is that it defines a single
place in the sysctl tree, where you can find all sorts of non-random
numbers, each of which comes with a piece of ascii text that isn't
formatted in any consistent way ?

I'd say, let's raise the bar several notches right here.

How about we look at what is desirable from such a subsystem, and
see what architecture that mandates ?

Here are some things to think about:

* Input only or input & output ?

  Would it make sense to be able to control the fans or power
  to various subsystems while we are at it ?

* It should be possible to implement a sensor in userland, so that
  interfacing to external sensors is possible without forcing the
  code into the kernel.  Think: Maxim/Dallas 1-Wire temperature
  sensors and similar.

* Metadata information in machine readable format:
    - recommended, min and max poll rate
    - Nominal value, quantization step and alarm limit(s)
    - alarm transgression severity for system integrity
    - sensor failure severity for system integrity
    - physical location of measured quantity
    ...

* Event support ?
    - enumeration, arrival and departure of sensors
    - alarm transgressions
    - sensor failure
    ...

* Interface and integration with IPMI, ACPI and similar.
    Do any of these have a metadata format we can use ?

and probably a lot of other stuff I didn't think of right now...
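
To make the metadata and event items above a bit more concrete, here
is a rough sketch in C of what a machine readable record could look
like.  Every name in it (struct sensor_meta, sensor_classify() and so
on) is made up for illustration; none of it is an existing FreeBSD or
OpenBSD interface, and the field choice is only one possible reading
of the list.

```c
#include <stddef.h>

/* Hypothetical names throughout; this is not an existing API. */

/* How much a condition matters for system integrity. */
enum sensor_severity { SEV_INFO, SEV_WARNING, SEV_CRITICAL };

/* Outcome of looking at one sample. */
enum sensor_state { SENSOR_OK, SENSOR_ALARM, SENSOR_FAILED };

/* Machine readable description of one sensor. */
struct sensor_meta {
	const char *location;	/* physical location of measured quantity */
	double nominal;		/* nominal value */
	double quantum;		/* quantization step */
	double alarm_lo;	/* lower alarm limit */
	double alarm_hi;	/* upper alarm limit */
	unsigned poll_min_ms;	/* shortest sensible poll interval */
	unsigned poll_rec_ms;	/* recommended poll interval */
	unsigned poll_max_ms;	/* longest sensible poll interval */
	enum sensor_severity alarm_sev;	/* severity of alarm transgression */
	enum sensor_severity fail_sev;	/* severity of sensor failure */
};

/*
 * Classify one sample using nothing but the metadata.
 * read_ok is zero when the sensor could not be read at all.
 */
static enum sensor_state
sensor_classify(const struct sensor_meta *m, double value, int read_ok)
{
	if (!read_ok)
		return (SENSOR_FAILED);
	if (value < m->alarm_lo || value > m->alarm_hi)
		return (SENSOR_ALARM);
	return (SENSOR_OK);
}

/* Map a state to the severity the metadata assigns to it. */
static enum sensor_severity
sensor_severity_of(const struct sensor_meta *m, enum sensor_state s)
{
	switch (s) {
	case SENSOR_ALARM:
		return (m->alarm_sev);
	case SENSOR_FAILED:
		return (m->fail_sev);
	default:
		return (SEV_INFO);
	}
}
```

With something along these lines, a monitoring program could decide
"something is wrong" from the record alone, without a hand-written
configuration file per sensor, and the same record could be filled in
by a kernel driver or by a userland process.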

>Now... how much hardware out there supports IPMI, or  
>better... how much in production use doesn't use IPMI?

But don't you think it would be better to have a subsystem that
made it possible to use IPMI and ACPI, than to just say "Naa,
that sucks, it must do, because we don't support it" ?

>> Let me get this straight, you're telling me:
>>
>> 	"I'm worried about this code running as root, so I'm putting
>> 	it in the kernel instead."
>
>You missed the point.

No, I most certainly did not.

By defining the sensor API (on top of sysctl) at the kernel/userland
boundary, you have decided that all sensor implementations must live
in the kernel; there is no room in your architecture for sensors
that live in userland.

Effectively, you have elevated all sensor implementations to root++
privilege, even if they don't need any privilege at all.

I don't care much about who wrote the code or how trustworthy they
are, that's a problem that can be fixed along the way.

But I do care about taking away, by design, the choice of running
at low privilege from people who implement sensors.

>> I repeat: The SoC interface is not the gateway to -current.
>
>It provides an idea in what people are interested in.

Sure, let's list "Peace in the middle-east" on there, I'm sure people
are interested :-)

"People", whoever they are, are interested in anything that sounds
fancy or flashy, but that doesn't mean that they can or will actually
use it for anything if somebody produces it, and it certainly gives
no guarantee that you will not shoot yourself in the foot along the
way if you do so.

People were very interested in UFS soft-updates and snapshots, but
if we had paid more attention to what they did to the buf subsystem,
we would probably have been a lot more cautious.  ZFS generates a
lot of buzz, and similar concerns can be, have been and should be
raised about that.  (My answer to that one is to turn buf into a
library function, so that each filesystem gets to screw only itself up).

>And several
>committers here in the thread also showed interest in this framework  
>(maybe not in the current implementation, but at least in the idea  
>behind it).

Right, but if we hadn't objected, you would have saddled us with this
implementation, without any actual discussion about what exactly the
idea behind it was and whether that was the right idea for us.

>Just because you do not see how such a framework can be  
>useful to you (so far I have the impression from your mails, that you  
>object to the idea of this framework), 

I *can* see why and how such a framework can be useful, that's why
I'm objecting to this half-baked attempt at it.

>> Ten years ago when we didn't have P4 and the _extensive_ infrastructure
>> for making it easy for people to work out of the tree, we had to do
>> stuff like that, but there is no excuse for it today.
>
>Nobody is perfect. There will always be some bugs when something is  
>committed to -current.

Bugs, yes, and we have means to deal with them.

But we should try much harder to avoid half-baked concepts and wrong
architecture, because that is 10 times harder to fix than a plain
bug is.

>You don't talk about obvious problems here.  
>There's no destabilized system, there are no panics. You talk about  
>not using an underdocumented API and not using a generic framework for  
>creating tasks [...]

Yes, it does appear to me that we are not on the same level of
abstraction.

I am indeed not talking about how many compiler warnings or style(9)
infractions this code has.

I'm talking about:
  - if it actually solves a problem for us that we have.
  - if it should solve more problems than it does right now.
  - if it creates even more problems down the road.

I'm talking about architecture, you're talking about code.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.