Re: native inotify implementation
- Reply: Vadim Goncharov : "kqueue extensibility (Was: native inotify implementation)"
- In reply to: Vadim Goncharov : "Re: native inotify implementation"
Date: Sat, 05 Jul 2025 16:30:18 UTC
On Sat, Jul 05, 2025 at 03:49:46AM +0300, Vadim Goncharov wrote:
> On Sat, 17 May 2025 11:18:34 -0400
> Mark Johnston <markj@freebsd.org> wrote:
>
> > On Fri, May 16, 2025 at 11:02:33AM -0500, Jake Freeland wrote:
> > > On Mon May 12, 2025 at 3:58 PM CDT, Mark Johnston wrote:
> > > > For the past while I've been hacking on a native implementation of
> > > > Linux's inotify. Functionality-wise, this is similar to but not quite
> > > > equivalent to the EVFILT_VNODE kqueue filter. While we already have a
> > > > userspace implementation of inotify built on top of kqueue, it shares
> > > > the limitations of EVFILT_VNODE, and my version can also be used in the
> > > > Linuxulator. (Please let me know if you're interested in working on
> > > > that and testing it out.)
> [...]
> > > > This work was largely motivated by a race condition in EVFILT_VNODE: in
> > > > order to get events for a particular file, you first have to open it, by
> > > > which point you may have missed the event(s) you care about. For
> > > > instance, if some upload service adds files to a directory, and you want
> > > > to know when a new file has finished uploading, you'd have to watch the
> > > > directory to get new file events, scan the directory to actually find
> > > > the new file(s), open them, and then wait for NOTE_CLOSE (which might
> > > > never arrive if the upload had already finished). Aside from that, the
> > > > need to hold each monitored file open is also a problem for large
> > > > directory hierarchies as it's easy to exhaust file descriptor limits.
> > > >
> > > > My initial solution was a new kqueue filter, EVFILT_FSWATCH, which lets
> > > > one watch for all file events under a mountpoint. The consumer would
> > > > allocate a ring buffer with space to store paths and event metadata,
> > > > register that with the kernel, and the kernel would write entries to the
> > > > buffer, using reverse lookups to find a path for each event vnode. This
> > > > prototype worked, but got somewhat hairy and I decided it would be
> > > > better to simply implement an existing interface: inotify already exists
> > > > and is commonly used, and has a somewhat simpler model, as it merely
> > > > watches for events within a particular directory.
> > >
> > > I've found that more and more developers are blindly using Linux-specific
> > > interfaces these days, so +1 for natively supporting another one.
> > >
> > > The more support we have for these, the easier porting/Linux emulation is.
> > > I think the benefits of this far outweighs the cost of maintaining the
> > > code.
> >
> > I think so too. My perspective is that we should implement widely used
> > Linux interfaces as part of the larger goal of making existing software
> > usable on FreeBSD. This is more important than the purity of the
> > kernel's interfaces or architecture, at least up to a certain point.
> >
> > The whole purpose of an OS is to let users run the programs they want to
> > run, without getting in the way (too much).
>
> Yes, and no. While it's often useful in short-term perspective, such approach
> leaves FreeBSD without unique features so it becomes yet another "Linux, just
> poorer" with obvious then "why choose it?". It's understandable that in some
> cases it is simple to implement compatible API, but an alternative like "have
> more general solution with a compatibility shim layer via which their API is
> implemented" is better, when possible.

Sure, but so far there is no clear description of a more general
solution, and the shortcomings of EVFILT_VNODE have been known for a
long time. There's also nothing precluding this inotify implementation
from being extended or replaced, just so long as a compatible
implementation can be provided in libc.

> It's late in which particular topic as commit was landed, but for future we
> should think how to extend kqueue to be able more.

As I mentioned in my original email, that's what I tried to do first.
It is immediately more complicated than inotify since kevent() doesn't
have a good way to return arbitrary data (particularly file names and
paths) to userspace. It is possible if we make kevent() write to a user
pointer embedded in the knote, but it's not simple. I note that XNU
also does not use kqueue for this purpose, and I'm skeptical that it's
the right substrate for a file monitoring interface.

> [E.g. I'd want to have notifications for my protocol with multiple streams
> inside one socket (think like QUIC), but it does not fit nicely into current
> struct kevent or socket API (multiple socket buffers with separate reading)]