held file reference issue with ZFS and nullfs
mail at florian-schulze.net
Fri Sep 6 10:44:57 UTC 2019
Since upgrading to FreeBSD 12 (from 10.3; I skipped 11.x completely, and
the box started around 9.3) I have the issue that ZFS is not freeing up
space for some deleted files. The filesystems where this happens are
mounted into multiple jails via nullfs. Only one jail has write access;
the others are read-only. When files are deleted, the space for them is
not freed, and I can still see their objects via zdb. When I unmount one
of the read-only nullfs mounts, the space is freed and the objects are
released.
I already used lsof, procstat and fstat to see if any process still has
a reference to the file, but that is not the case. It does seem to
matter which nullfs mount is unmounted: it is always one of the
read-only ones. The processes which access the read-only mounts are
completely different; it only seems to matter that the files are opened
at all. Killing the processes doesn't help, only unmounting the nullfs.
Today I noticed an odd message when I used zfs diff: "Unable to
determine path or stats for object 6 in
... at zfs-diff-15651-00000001d86eb8cb: Stale NFS file handle". I don't
have NFS enabled anywhere (just checked the properties) and it never was.
The zdb output for object 6:
Dataset ... [ZPL], ID 18405, cr_txg 36040566, 10.2G, 43 objects
    Object  lvl   iblk   dblk  dsize  dnsize  lsize   %full  type
         6    1    16K    16K    16K     512    32K  100.00  SA attr
spa_open_common: opening ...
spa_load(tank2, config trusted): LOADING
disk vdev '/dev/diskid/DISK-WD-...': best uberblock found for spa tank2.
spa_load(tank2, config untrusted): using uberblock with txg=40503599
spa_load(tank2, config trusted): spa_load_verify found 0 metadata errors
and 2 data errors
spa_load(tank2, config trusted): LOADED
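For anyone who wants to script this check, the object line can be split into named fields. This is only a minimal sketch in Python: it assumes the header and the row look like the two lines pasted above, and the function name parse_zdb_object_row is made up for illustration, it is not part of any ZFS tool.

```python
# Column header as it appears in the zdb output quoted above.
HEADER = "Object lvl iblk dblk dsize dnsize lsize %full type"


def parse_zdb_object_row(header, row):
    """Map one zdb object row onto the column names of the header.

    Hypothetical helper: it assumes whitespace-separated columns and
    that only the last column (type) may contain spaces, e.g. "SA attr".
    """
    names = header.split()
    # Split at most len(names) - 1 times so the type column stays whole.
    values = row.strip().split(None, len(names) - 1)
    if len(values) != len(names):
        raise ValueError("row does not match header: %r" % row)
    return dict(zip(names, values))


if __name__ == "__main__":
    fields = parse_zdb_object_row(HEADER, "6 1 16K 16K 16K 512 32K 100.00 SA attr")
    print(fields["Object"], fields["type"])
```

For the row above this reports the type of object 6 as "SA attr", which is the field I would look at first when zfs diff complains about an object number.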
The zfs diff output also contained a line "- ...(on_delete_queue)".
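To pick such entries out of a longer zfs diff listing, something like the following could be used. It is a rough sketch in Python which assumes that each line starts with the change type character followed by whitespace and the path, and that delete queue entries end in "(on_delete_queue)" as in the line quoted above. The function name and the sample paths are invented for the example.

```python
def delete_queue_entries(diff_lines):
    """Return the paths of removed entries that sit on the delete queue.

    Hypothetical helper: it only looks at lines whose change type is "-"
    and whose path ends in "(on_delete_queue)".
    """
    marker = "(on_delete_queue)"
    hits = []
    for line in diff_lines:
        line = line.rstrip("\n")
        if line.startswith("-") and line.endswith(marker):
            # Drop the change type character and the separator after it.
            hits.append(line[1:].strip())
    return hits


if __name__ == "__main__":
    # Made-up sample lines, not taken from the affected system.
    sample = [
        "M\t/example/data",
        "-\t/example/data/(on_delete_queue)",
        "+\t/example/data/new-file",
    ]
    for path in delete_queue_entries(sample):
        print(path)
```

The input could be the saved output of a zfs diff run read with readlines(); the sketch does not call zfs itself.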
I have one zfs filesystem where this happens quite often, one where it
happens sometimes and a few others which have a similar setup and where
I never noticed it (though the average file size on them is smaller).
I asked in #freebsd about this and koobs said I should write to this
list and CC kib at freebsd.org and mgj at freebsd.org.
He also took a quick look at the nullfs changes between 10.3 and 12.0 and
spotted the following change, which he said I should mention as well:
Is there a known bug there? Could the stale NFS handle cause the leak?
Where is that NFS handle coming from?