Re: getfsstat(2) MNT_NOWAIT & stale data for zpool
Date: Wed, 08 Oct 2025 17:57:13 UTC
On 2025-Oct-08 15:59:12 +0000, Dave Cottlehuber <dch@skunkwerks.at> wrote:
>When does getfsstat(2) stale info get updated? It seems this can
>be easily more than 15 minutes, based on experiments in prod. It
>can be triggered by running `/bin/df` but not `/bin/df /`, which
>would be preferable.
>
>Background:
>
>I'm using prometheus node_exporter on a zpool to monitor disk usage.
>Right now this is broken, as a zpool can fill up, and node_exporter
>continues to return the old data.
>
>It loops over all filesystems, doing unix.Getfsstat(buf, unix.MNT_NOWAIT),
>which is a wrapper around getfsstat(2) which clearly states:
>
> Normally mode should be specified as MNT_WAIT. If mode is set to MNT_NOWAIT, getfsstat() will return the information it has
> available without requesting an update from each file system. Thus, some of the information will be out of date, but getfsstat()
> will not block waiting for information from a file system that is unable to respond. It will also skip any file system that is in
> the process of being unmounted, even if the unmount would eventually fail.
>
>Is node_exporter just broken and should be fixed to use MNT_WAIT?
>
>Is there some tunable to refresh this stale data more frequently,
>or should I just add `/bin/df` to a regular crontab?

See https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=273094 and
https://github.com/prometheus/node_exporter/issues/1498

-- 
Peter Jeremy