native inotify implementation

From: Mark Johnston <markj_at_freebsd.org>
Date: Mon, 12 May 2025 20:58:25 UTC

For the past while I've been hacking on a native implementation of
Linux's inotify.  Functionality-wise, this is similar to but not quite
equivalent to the EVFILT_VNODE kqueue filter.  We already have a
userspace implementation of inotify built on top of kqueue (libinotify),
but it shares the limitations of EVFILT_VNODE.  A native implementation
does not, and it can also be used in the Linuxulator.  (Please let me
know if you're interested in working on that and testing it out.)

The WIP implementation is here: https://reviews.freebsd.org/D50315
There are some loose ends to tie up there, but I wanted to solicit
feedback before spending more time on it.  I also wonder how this
feature should be handled in the ports tree where libinotify is used
today: if src starts installing /usr/include/sys/inotify.h, will ports
start using the native implementation automatically?  Do we need to have
some kind of flag day?

This work was largely motivated by a race condition in EVFILT_VNODE: in
order to get events for a particular file, you first have to open it, by
which point you may have missed the event(s) you care about.  For
instance, if some upload service adds files to a directory, and you want
to know when a new file has finished uploading, you'd have to watch the
directory to get new file events, scan the directory to actually find
the new file(s), open them, and then wait for NOTE_CLOSE_WRITE (which
will never arrive if the upload has already finished).  Aside from that, the
need to hold each monitored file open is also a problem for large
directory hierarchies as it's easy to exhaust file descriptor limits.

My initial solution was a new kqueue filter, EVFILT_FSWATCH, which lets
one watch for all file events under a mountpoint.  The consumer would
allocate a ring buffer with space to store paths and event metadata,
register that with the kernel, and the kernel would write entries to the
buffer, using reverse lookups to find a path for each event vnode.  This
prototype worked, but got somewhat hairy and I decided it would be
better to simply implement an existing interface: inotify already exists
and is commonly used, and has a somewhat simpler model, as it merely
watches for events within a particular directory.

Many thanks to Klara for sponsoring this work.