Re: devctl_notify system is inconsistent

From: Alexander V. Chernikov <melifaro_at_ipfw.ru>
Date: Fri, 09 Dec 2022 15:25:47 UTC

> On 1 Dec 2022, at 17:40, Baptiste Daroussin <bapt@FreeBSD.org> wrote:
> 
> On Thu, Dec 01, 2022 at 08:38:37AM -0700, Warner Losh wrote:
>> On Thu, Dec 1, 2022 at 7:59 AM Baptiste Daroussin <bapt@freebsd.org> wrote:
>> 
>>> On Thu, Dec 01, 2022 at 07:45:32AM -0700, Warner Losh wrote:
>>>> On Thu, Dec 1, 2022 at 1:36 AM Baptiste Daroussin <bapt@freebsd.org>
>>> wrote:
>>>> 
>>>>> Hello,
>>>>> 
>>>>> After the addition of netlink(4) by melifaro@, I started working on a
>>> new
>>>>> genetlink(4) module, to send kernel notification to the userland via
>>>>> netlink.
>>>>> 
>>>>> The goal is to be able to have multiple consumers without the need of
>>> devd
>>>>> to be
>>>>> running.
>>>>> 
>>>>> The goal is also to be able subscribe to the events the consumer is
>>>>> willing to
>>>>> receive.
>>>>> 
>>>>> https://reviews.freebsd.org/D37574
>>>>> 
>>>>> I also added a hook to devctl_notify to make sure all its event got
>>> sent
>>>>> via
>>>>> nlsysevent. (https://reviews.freebsd.org/D37573)
>>>>> 
>>>> 
>>>> You're connecting it at the wrong level: it will miss all device events.
>>>> devctl_notify
>>>> is used by everything except newbus's device attach, detach and nomatch
>>>> events, so none of those events will make it through.
>>> 
>>> Where should I hook to get the device events?
>>> 
>> 
>> Either you need to drop down a level to where the formated events are
>> queued,
>> or you'll need to add something in devaddq() to deal with the events. These
>> are
>> a slightly different format than the devctl_notify() events because the
>> latter was
>> added later and I didn't want to cope with transitioning the old formatted
>> messages
>> to the new at that time (silly me).
>> 
>> 
>>>> 
>>>> 
>>>>> It works great and so far I am happy with it. on thing I figured out
>>> it is:
>>>>> the "system" argment of devctl_notify is inconsistent:
>>>>> Upper case vs lower case
>>>>> "kern" vs "kernel"
>>>>> 
>>>> 
>>>> They are consistent. With one exception that I deprecated in 13.x to be
>>>> removed in 14.x. I documented all of them at the time in devd.conf. I
>>> think
>>>> I'll go ahead and complete the deprecation.
>>>> 
>>>> 
>>>>> I intent to fix the following way:
>>>>> Create a new function similar to devctl_notify but with the first
>>> argument
>>>>> being
>>>>> an enum.
>>>>> 
>>>> 
>>>> I'm a pretty hard no on the enum. I rejected doing it that way when I
>>> wrote
>>>> devd
>>>> years ago. These have always been strings, and need to continue to
>>> always be
>>>> strings, or we're forever modifying devd(8) when we add a new one and
>>>> introducing
>>>> yet another opportunity for mismatch. I don't think this is a good idea
>>> at
>>>> all.
>>>> 
>>>> There's been users of devd in the past that are loadable modules that
>>> pass
>>>> their
>>>> own system name in that devd.conf files would then process. Having an
>>> enum
>>>> with
>>>> enforcement would just get in the way of this.
>>>> 
>>>> I have toyed with the idea of having a centralized list in bus.h that's a
>>>> bunch of
>>>> #defines, ala old-school X11 resources (which this is very very loosely
>>>> based on),
>>>> but it didn't seem worth the effort.
>>> 
>>> That is fine with me and actually I do need the string name to register as
>>> group
>>> name, that I didn't want to trash the strings away.
>>> 
>>> But I need a way to list them all.
>>> 
>> 
>> We have no current mechanism to do that. We could add something that would
>> register / deregister them with a sysinit + call to something in
>> kern_devctl.c which
>> could do the trick (and also deal with the ordering issues), though having
>> netlink(4)
>> as a loadable modules might be an interesting case to get right.
>> 
>> If we did that, we could return a 'token' that you'd use to call a new
>> version of
>> devctl_notify(), perhaps with some glue for legacy users (or perhaps not:
>> we change
>> kernel interfaces all the time, and could just rename it and convert
>> everything in the
>> tree).
>> 
>>> 
>>>> 
>>>>> Make the current devctl_notify convert its first argument into that
>>> enum
>>>>> and
>>>>> yell if an unkwown "system" is passed. (and probably declare
>>> devctl_notify
>>>>> deprecated)
>>>>> 
>>>> 
>>>> I don't think this buys us anything. devctl_notify cannot possibly know
>>>> about all
>>>> the possible subsystems, so this is likely doomed to high maintenance
>>> that
>>>> would
>>>> be largely ineffective.
>>>> 
>>>> Then fix the inconsistencies: all upper case as it seems the most wildly
>>> use
>>>>> case
>>>>> s/kern/kernel/g
>>>>> 
>>>>> WDYT?
>>>>> 
>>>> 
>>>> I wouldn't worry about the upper vs lower case. All the documented ones
>>> are
>>>> upper case, except kernel. There's no harm it being one exception since
>>>> changing
>>>> it will break user's devd.conf setups. I got enough feedback when I
>>> changed
>>>> "kern" to "kernel" last year for power events. And as far as I know, I've
>>>> documented
>>>> all the events that are generated today.
>>>> 
>>>> Warner
>>> 
>>> 
>>> Note that if you think nlsysevent is a bad idea, I will just forget about
>>> it.
>>> 
>> 
>> I love the idea. I got over any resistance I had when melifaro@ moved
>> things into kern_devctl.c and we talked through the issues at that time.
>> I've been hoping that someone would replace the hacky thing I did with
>> a 'routing socket'-like interface (I never could figure out hose to do it so
>> many years ago w/o gross hacks). netlink(4) has finally provided a way
>> to do it that doesn't get the routing code all jumbled up for this.
>> 
>> I just have some specific issues with your proposal. In hindsight, I should
>> have led with that in my first message :).
>> 
>> Warner
> 
> Here is my new proposal:
> 
> What about:
> 
> 1. I add a static list of systems in sys/devctl.h something like
> 
> enum {
> DEVCTL_SYSTEM_ACPI = 0,
> DEVCTL_SYSTEM_AEON = 1,
> ...
> DEVCTL_SYSTEM_ZFS = 19
> };
> 
> static const char *devctl_systems[] = {
>  "ACPI",
>  "AEON",
>  ...
>  "ZFS",
> };
> 
> this way we have a list of official freebsd's systems.
> We don't change the devctl_notify interface
> 
> and in the kernel we change the devctl_notify("ZFS" into
> devctl_notify(devctl_systems[DEVCTL_SYSTEM_ZFS],...
> 
> This is not too intrusive, and breaks none of the existing code
> 
> 2. I also hook devadq using the same interface as I already have done, but
> all the attach,detach,nomatch become an event (only in nlsysevent) in the
> "DEVICE" system,
> the "SUBSYSTEM" is the current what of devaddq
> 
> The type is changed into:
> + -> ATTACH
> - -> DETACH
> anythingelse -> NOMATCH
> and the reste of the current line becomes the data
> 
> so from nlsysvent point of view this is exactly the same kind of events as the
> one hooked in devctl_notify.
> 
> 3. in nlsysevent we have one category one can subscribe per known systems.
> All other "systems" falls into a new category named "extra" "vendor" or "other"
> 
> from the consumer point of view the name of the system is anyway contained in
> the message itself, so the category it is subscribed to can differ.
> 
> This way, I should be able to get ALL the events in a consistent way.
> I should be compatible with people who uses devctl_notify in their custom kernel
> modules.
> 
> Sounds easy enough without the requirement of complexifying kern_devctl.c with a
> registration of extra systems.
> 
> What do you think?
My 2 cents:

I’d look at the groups from the customer (e.g. userland API POV). IMHO, fine-grained groups serves 2 purposes.
The first is limiting the amount of notifications the app has to deal with. The second is moving filtering from the
application to the kernel, so the app doesn’t need to check the event’s subsystem. Note that the second part loses its value if the app subscribes for more than one group.
It is also worth noting that _some_ notifications, like ifnet event, may be high-volume, so it can be desirable to have a way to filter them out.
For example, it can be implemented as a fixed number of broad group categories (“net”, “device”, “fs”, “power”, “vendor”) and each subsystem picks one it belongs to.
It can be potentially extended with an ability to register custom groups if so desired.
I wouldn’t say the existing KPI consisting of a single devctl_notify() is set in stone. I’d actually vote to have a new function/set of functions, as the notifications are no longer limited to the devices, as devctl suggests.

For example (just to illustrate the point):
group_id = unotify_get_group(“power”) # gets existing or registers new group in the netlink subsystem
unotify_broadcast(group_id, “ACPI”, …)
unotify_free_group(group_id)


> 
> Best regards,
> Bapt
>