cvs commit: src/etc Makefile sensorsd.conf src/etc/defaults rc.conf src/etc/rc.d Makefile sensorsd src/lib/libc/gen sysctl.3 src/sbin/sysctl sysctl.8 sysctl.c src/share/man/man5 rc.conf.5 src/share/man/man9 Makefile sensor_attach.9 src/sys/conf files ...

Wed Oct 17 07:06:05 PDT 2007

Quoting Poul-Henning Kamp <phk at phk.freebsd.dk> (from Tue, 16 Oct 2007  
17:32:40 +0000):

> In message <20071016183311.lu97hbwzggsk4ow4 at webmail.leidinger.net>,
> Alexander Leidinger writes:
>
>>> Yes, that is the abstract argument, but the very same argument can
>>> be made for every other single kind of entity which consumes or
>>> produces bytes, from fingerprint readers to 9-track tape stations.
>>
>> Why do we have a common linked list API? It's easy enough to do it
>> again and again and again... We have it because we don't want to do it
>> again and again... And with the sensors API we gain something similar.
>
> There is a very big difference between <sys/queue.h> and sensors,
> in that <sys/queue.h> is not an external API, but a convenience
> tool for code to maintain its own data internally, whereas sensors
> is an API for exporting data.

The idea behind both is the same: don't do a lot by hand which can be
done with less work through an API.

>> It adds meta-data which can be used in an automated way. This is done
>> with a consistent and documented API. Sure, we can do it with sysctls
>> by hand, but see above.
>
> What exactly do you mean when you say "used in an automated way" ?

You can write a probe for a monitoring system which looks at
hw.sensors and, based upon the data it sees there, generates at least
<name> <type> <value> reports, without the probe having to be changed
to handle sensors which weren't seen before.
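
A minimal sketch of such a probe, to illustrate the point. It assumes
the sysctl output consists of lines shaped like
hw.sensors.<device>.<type><index>=<value>; that line shape, the sample
values and the function name are assumptions made for illustration,
not something taken from the committed code.

```python
import re

# Assumed line shape: hw.sensors.<device>.<type><index>=<value text>
LINE_RE = re.compile(
    r"^hw\.sensors\.(?P<device>[^.]+)\.(?P<type>[a-z]+)(?P<index>\d+)"
    r"=(?P<value>.*)$"
)


def parse_sensor_lines(lines):
    """Turn sysctl-style lines into (name, type, value) reports.

    Lines outside the assumed hw.sensors shape are skipped, so the
    probe needs no change when a new kind of sensor shows up.
    """
    reports = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue
        name = "{}.{}{}".format(m["device"], m["type"], m["index"])
        reports.append((name, m["type"], m["value"].strip()))
    return reports


# Hypothetical sample input, not captured from a real system.
SAMPLE = [
    "hw.sensors.cpu0.temp0=45.00 degC",
    "hw.sensors.lm0.fan1=2500 RPM",
    "kern.ostype=FreeBSD",
]

if __name__ == "__main__":
    for name, kind, value in parse_sensor_lines(SAMPLE):
        print(name, kind, value)
```

The probe knows nothing about temperatures or fans; it only relies on
the naming convention, which is the whole point.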

> Can I run some magic program and tell it "alert me if something is
> wrong" or do I have to write a tedious configuration file to explain
> what "something is wrong" would look like to the program ?

It depends on the magic program and the sensor. As Constantine already  
explained and you surely know, there are stupid sensors and there are  
smart sensors.

>> It is not supposed to make the monitoring itself easier.
>> [...]
>> A human being still has to interprete the measurements. No doubts. But
>> with the framework you don't have to hunt down where to read the
>> sensor data, and how to name it. You can write a probe which takes
>> everything in the the sensors mib and let it produce names and values
>> for the probed things automatically.
>
> So the only problem sensors solves, is that it defines a single
> place in the sysctl tree, where you can find all sorts of non-random
> numbers, each of which comes with a piece of ascii text that isn't
> formatted in any consistent way ?
>
> I'd say, lets raise the bar several notches right here.
>
> How about we look at what is desirable from such a subsystem, and
> see what architecture that mandates ?

Hmmm... "desirable" is not the same as "useful" or "necessary". Let's
try not to overengineer this. Note: I also think that as long as we
don't prevent the framework from handling the specific things which
we consider desirable, the framework doesn't need to be able to
handle all of them from day one.

> Here are some things to think about:
>
> * Input only or input & output ?
>
>   Would it make sense to be able to control the fans or power
>   to various subsystems while we are at it ?

Apart from what Constantine said:
Do you want to change the power of various subsystems? Isn't the  
system supposed to do it itself in a sensible and automatic way? I  
would say it depends. Most of the time I don't want to fight with  
something like this in a production system (and AFAIK Intel tries to  
do more and more themselves regarding power control in their CPUs, as
they noticed that often the "messing around" with this is ...  
suboptimal).

> * It should be possible to implement a sensor in userland, so that
>   interface to external sensors is possible without forcing the
>   code into the kernel.  Think: Maxim/Dallas 1-Wire temperature
>   sensors and similar.

I see hw.sensors as an interface to get sensor data which is under the
control of the kernel into the userland. I don't think of it as
something where "sensor" includes status info from userland
applications. I fail to see where it is beneficial to put data there
which isn't measured by something in the kernel (e.g. the fill level
of a database or any other value a userland program produces).
Could you please explain why it should be possible to feed such
userland data into the kernel?

> * Metadata information in machine redable format:
>     - recommended, min and max poll rate

Typically the monitoring programs I know poll at a fixed rate.
The sensor framework already caches data, and it is up to the code
which puts the value into place to decide whether the sensor has to be
queried again or not.
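
To illustrate the division of labour I mean, here is a rough sketch.
It is not the framework's code: the class name, the min_interval
parameter and the injected clock are all made up for this example.
The reader polls whenever it likes, and only the producer side decides
whether the hardware is actually touched.

```python
class CachedSensor:
    """Cache in front of a (possibly slow) hardware read.

    read_hw, min_interval and clock are hypothetical names; the
    producer picks min_interval, the consumer just calls value().
    """

    def __init__(self, read_hw, min_interval, clock):
        self._read_hw = read_hw
        self._min_interval = min_interval
        self._clock = clock
        self._value = None
        self._stamp = None

    def value(self):
        now = self._clock()
        # Re-query the hardware only if the cached value is too old.
        if self._stamp is None or now - self._stamp >= self._min_interval:
            self._value = self._read_hw()
            self._stamp = now
        return self._value


if __name__ == "__main__":
    import time

    # Stand-in for a real hardware read.
    sensor = CachedSensor(lambda: 45.0, min_interval=5.0,
                          clock=time.monotonic)
    print(sensor.value())
    print(sensor.value())  # served from the cache
```

With this split, a recommended poll rate in the metadata buys the
monitoring side very little.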

>     - Nominal value, quantization step and alarm limit(s)

Alarm limits normally are set in the monitoring applications I know  
and depend upon various factors.

>     - alarm transgression severity for system integrity

How can a sensor know this? If it is about something which is in
active use, a violation of a specific value may be critical for the
entire system, but if it is just present in the system and not used at
all, crossing the same value may not be critical. In general this is a
policy decision which cannot be made by the person writing the
code which hands this data over to the sensor framework.

>     - sensorfailure severity for system integrity

Ditto.

>     - physical location of measured quantity

Do you know monitoring programs which allow probes to submit this  
information to the monitoring program? If not, why should the
framework keep this information in the kernel when a file on
the system satisfies the same requirement?

>     ...

So far the things you mentioned are better suited to be kept in the
userland, instead of in the kernel. A simple file with a specific
syntax would be enough to let a probe automatically match a specific
sensor with this metadata and let it transfer this to the monitoring
application (if the monitoring application is suited to accept this
kind of data).
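
A sketch of what I have in mind. The file syntax (one sensor name per
line followed by key=value pairs), the key names and the function
names are invented for this example; nothing like it exists in the
tree.

```python
def parse_metadata(text):
    """Parse lines of '<sensor-name> key=value ...' into a dict.

    '#' starts a comment. Every field after the name is assumed to be
    a well-formed key=value pair (hypothetical syntax).
    """
    meta = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        name, *pairs = line.split()
        meta[name] = dict(p.split("=", 1) for p in pairs)
    return meta


def annotate(readings, meta):
    """Attach userland metadata to (name, value) readings by name."""
    return [(name, value, meta.get(name, {})) for name, value in readings]


# Hypothetical metadata file contents.
SAMPLE_METADATA = """
# sensor-name  key=value ...
cpu0.temp0  location=cpu-socket-0 warn=70 crit=85
lm0.fan1    location=rear-chassis warn=1200
"""

if __name__ == "__main__":
    meta = parse_metadata(SAMPLE_METADATA)
    readings = [("cpu0.temp0", "45.00 degC"), ("acpi0.temp0", "38.00 degC")]
    for row in annotate(readings, meta):
        print(row)
```

Sensors without an entry in the file simply get no metadata, and the
limits can be changed by the admin without touching the kernel.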

> * Event support ?
>     - enumeration, arrival and departure of sensors
>     - alarm transgressions
>     - sensor failure
>     ...
>
> * Interface and integration with IPMI, ACPI and similar.
>     Do any of these have a metadata format we can use ?

Constantine answered those already.

> and probably a lot of other stuff I didn't think of right now...
>
>> Now... how much hardware out there supports IPMI, or
>> better... how much in production use doesn't use IPMI?
>
> But don't you think it would be better to have a subsystem that
> made it possible to use IPMI and ACPI, than to just say "Naa,
> that sucks, it must do, because we don't support it" ?

I haven't said it doesn't support it, I told you already that Nate  
didn't identify something in the sensors framework which prevents the  
use of ACPI and the sensors framework, and Constantine already showed  
how it integrates with IPMI.

>>> Let me get this straight, you're telling me:
>>>
>>> 	"I'm worried about this code running as root, so I'm putting
>>> 	it in the kernel instead."
>>
>> You missed the point.
>
> No, I most certainly did not.
>
> By defining the sensor API (on top of sysctl) at the kernel/userland
> boundary you have decided that all sensor implementations must live
> in the kernel, there is no room in your architecture for sensors
> that live in userland.

No, I didn't. I said (even last time, when you first told us that you
don't like the sensors framework) that the sensors framework is
supposed to export data which lives in the kernel to the userland. I
never said the sensors framework is supposed to be the one and only  
way of getting status data from a running system. Userland status  
belongs to userland programs. It would be nice to have a userland  
framework which collects userland status, so that you don't have to  
run around, and it may not be a bad idea that this userland framework  
may collect also the data from the sensors framework (e.g. one plugin  
to get all hw.sensors data, instead of multiple plugins to get all the  
various states from the various places of a non-uniform status export  
from kernel to userland). But here we talk about exporting data from  
the kernel to the userland via an API, not about a userland framework  
to collect status information (some people may say we already have  
this with SNMP).

> Effectively, you have elevated all sensor implementations to root++
> priviledge, even if they don't need any priviledge at all.

No, I haven't, see above.

> I don't care much about who wrote the code or how trustworthy they
> are, that's a problem that can be fixed along the way.
>
> But I do care about taking away, by design, the choice of running
> at low priviledge from people who implement sensors.

I'm not taking this away.

>>> I repeat: The SoC interface is not the gateway to -current.
>>
>> It provides an idea in what people are interested in.
>
> Sure, lets list "Peace in the middle-east" on there, I'm sure people
> are interested :-)
>
> "People", whoever they are, are interested in anything that sounds
> fancy or flashy, but that doesn't mean that they can or will actually
> use it for anything if somebody produce it, and it certainly gives
> no guarantee that you will not shoot yourself in the foot along the
> way if you do so.

You said you don't like the idea of a unified way of exporting sensor
data which lives in the kernel to the userland. You didn't provide
technical arguments against such an API (I'm not taking lm.4 into
account ATM, as this was not your main objection). When the "People"
voted for this project, they voted for this idea, which seems to work
nicely in OpenBSD. You have failed so far to show that it doesn't work
in FreeBSD, while we've seen several examples where we get benefits
from it. I don't say the implementation is free of bugs, or cannot be
improved, but you are not talking about code, you are talking about
the idea. You are slapping our fellow committers (I don't count myself
here) who voted in favour of this project in the face. You are
doing the same to those people who didn't vote against this
project.

>> And several
>> committers here in the thread also showed interest in this framework
>> (maybe not in the current implementation, but at least in the idea
>> behind it).
>
> Right, but if we didn't object, you had saddled us with this implementation,
> without any actual discussion about what exactly the idea behind it
> was and if that was the right idea for us.

I wouldn't have saddled us with the implementation. I would maybe have
saddled us with the API for the lifetime of one released branch... if
nobody had improved it in the next 18 months (when HEAD is
branched). I don't think your very negative-sounding sentence above
is deserved. The people voting in the GSoC are supposed to reject
ideas which they identify as being outright bad. And for projects they
vote for, they have a look at whether it makes sense. If it doesn't
make sense, they don't vote for a project. The big disconnect between
FreeBSD and the SoC which you are trying to construct here does not
exist. I agree that not everything which is produced in the SoC
deserves its way into CVS. But what got rejected so far is stuff which
proved during the SoC to be not usable: either because the
architecture didn't fit, like with the pluggable disk schedulers, when
you introduced GEOM and the project wasn't usable anymore, or because
the student missed the goal, or because the goal was achieved but the
implementation was cruel, or because we noticed that the project
needed a complete rearchitecture because the initial design didn't fit.

>> Just because you do not see how such a framework can be
>> useful to you (so far I have the impression from your mails, that you
>> object to the idea of this framework),
>
> I *can* see why and how such a framework can be useful, that's why
> I'm objecting to this half-baked attempt at it.

Now you sound different than before. Before, you said you don't like
the idea of such a framework at all. Some of the points you bring to
the table above look overengineered to me (I pointed them out). And so
far I wasn't able to identify a point there which the sensors
framework prevents us from implementing. I also want to point out that
so far the goal was to do what is needed and improve the
architecture/implementation in an evolutionary way, instead of trying
to produce a big thing with bells, whistles, trumpets and whatever (in
Germany we call this an "eierlegende Wollmilchsau"). We all know that
overengineered projects typically fail, and that the evolutionary
approach in open source software produces very good results (e.g., SMP
in 4.x was very good for its time, and now that we have "more SMP" and
raised expectations we have morphed into something better). Some of
the things you want to have for sensors look nice. Some of this nice
stuff doesn't belong in the kernel. And I haven't seen anything which
cannot be done with the sensors framework in a next step. So far the
sensors framework allows us to provide features we don't have in
RELENG_x.

>>> Ten years ago when we didn't have P4 and the _extensive_ infrastructure
>>> for making it easy for people to work out of the tree, we had to do
>>> stuff like that, but there is no excuse for it today.
>>
>> Nobody is perfect. There will always be some bugs when something is
>> committed to -current.
>
> Bugs, yes, and we have means to deal with them.
>
> But we should try much harder to avoid half-baked concepts and wrong
> architecture, because that is 10 times harder to fix than a plain
> bug is.

Feel free to point out wrong architecture. Regarding the half-baked  
part... so far Constantine already showed what is possible to do from  
the list you came up with. I still think you are fighting against the  
framework based mostly upon feelings, and not based upon technical  
facts.

>> You don't talk about obvious problems here.
>> There's no destabilized system, there are no panics. You talk about
>> not using an underdocumented API and not using a generic framework for
>> creating tasks [...]
>
> Yes, it does appear to me that we are not on the same level of
> abstraction.
>
> I am indeed not talking about how many compiler warnings or style(9)
> infractions this code has.

gcc 4 introduced new warnings. Without compiling this with -Werror on  
e.g. RELENG_6 with gcc 3, I don't think you should talk about compiler  
warnings at the moment, as the kernel is on a similar level (in case  
we don't compile with -Werror anymore). Regarding style(9) Constantine  
made several commits in p4 during the soc.

> I'm talking about:
>   - if it actually solves a problem for us that we have.

Yes.

>   - if if should solve more problems than it does right now.

As you have seen it already does more than you think it is able to do.

>   - if it creates even more problems down the road.

Have you identified some problems?

> I'm talking about architecture, you're talking about code.

Wrong. I never said the code cannot be improved or that it is free of
flaws. I'm talking about your behavior of rejecting the idea (not even
the architecture, but the idea) of the sensors framework without
accepting that other people see a benefit in such a framework, and
calling it crap without coming up with technical reasons. I'm also
talking about the idea of the framework and what it is supposed to do,
while you say that you don't like the idea. I may
also have talked about parts of the code, but it is wrong to say that  
I focus on the code. And I'm also the wrong person to talk about the  
code, Constantine is the person to talk with if it is about the code.

Bye,
Alexander.

-- 
http://www.Leidinger.net  Alexander @ Leidinger.net: PGP ID = B0063FE7
http://www.FreeBSD.org     netchild @ FreeBSD.org  : PGP ID = 72077137
No one can feel as helpless as the owner of a sick goldfish.