held file reference issue with ZFS and nullfs

Florian Schulze mail at florian-schulze.net
Fri Sep 6 10:44:57 UTC 2019


Hi!

Since upgrading to FreeBSD 12 (from 10.3; I skipped 11.x completely, and
the box started around 9.3), ZFS does not free the space of some deleted
files. The filesystems where this happens are mounted into multiple
jails via nullfs. Only one jail has write access; the others are
read-only. When files are deleted, their space is not freed, and I can
still see their objects via zdb. When I unmount one of the read-only
nullfs mounts, the space is freed and the objects are released.

I already used lsof, procstat and fstat to check whether any process
still holds a reference to the files, but none does. It does seem to
matter which nullfs mount is unmounted: it is always one of the
read-only ones. The processes which access the read-only mounts are
completely different; it only seems to matter that the files are opened
at all. Killing the processes doesn't help, only unmounting the nullfs
mount does.
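
For comparison, the ordinary case that lsof, procstat and fstat would
reveal is a process keeping an unlinked file open. A minimal Python
sketch of that baseline POSIX behaviour follows; it is not specific to
ZFS or nullfs, and the function name is made up for illustration:

```python
import os
import tempfile


def unlink_while_open(payload=b"x" * 4096):
    """Unlink a file that is still open and report what remains visible.

    Returns (link_count, readable_bytes, path_still_exists).
    """
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        os.unlink(path)  # the directory entry is removed here
        link_count = os.fstat(fd).st_nlink
        os.lseek(fd, 0, os.SEEK_SET)
        # The data is still readable through the open descriptor.
        readable = len(os.read(fd, len(payload)))
        return link_count, readable, os.path.exists(path)
    finally:
        # Only after the last reference is dropped can the space be reclaimed.
        os.close(fd)


print(unlink_while_open())
```

In the situation described here no such descriptor shows up in any
process, which suggests the reference is kept somewhere other than an
open file descriptor.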

Today I noticed an odd message when I used zfs diff: "Unable to 
determine path or stats for object 6 in 
... at zfs-diff-15651-00000001d86eb8cb: Stale NFS file handle". I don't 
have NFS enabled anywhere (just checked the properties) and it never was 
enabled!

The zdb output for object 6:

Dataset ... [ZPL], ID 18405, cr_txg 36040566, 10.2G, 43 objects

     Object  lvl   iblk   dblk  dsize  dnsize lsize   %full  type
          6    1    16K    16K    16K     512    32K  100.00  SA attr layouts


ZFS_DBGMSG(zdb):
spa_open_common: opening ...
spa_load(tank2, config trusted): LOADING
disk vdev '/dev/diskid/DISK-WD-...': best uberblock found for spa tank2. txg 40503599
spa_load(tank2, config untrusted): using uberblock with txg=40503599
spa_load(tank2, config trusted): spa_load_verify found 0 metadata errors and 2 data errors
spa_load(tank2, config trusted): LOADED


The zfs diff output also contained the line "-   ...(on_delete_queue)".
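
Entries marked this way can be picked out of saved zfs diff output with
a small filter. The following is a sketch in Python, assuming every line
has the shape of the one quoted above (a change marker, whitespace, then
the path ending in "(on_delete_queue)"). The function name and the
sample paths are hypothetical:

```python
def delete_queue_entries(diff_output):
    """Return the paths from zfs diff output that carry the
    "(on_delete_queue)" suffix, without the leading change marker."""
    suffix = "(on_delete_queue)"
    found = []
    for line in diff_output.splitlines():
        line = line.rstrip()
        if not line.endswith(suffix):
            continue
        # Split off the change marker; the rest of the line is the path.
        parts = line.split(None, 1)
        if len(parts) == 2:
            found.append(parts[1])
    return found


# Hypothetical sample input, not taken from the affected system.
sample = (
    "M\t/pool/example/dir\n"
    "-\t/pool/example/old file.bin(on_delete_queue)\n"
)
print(delete_queue_entries(sample))
```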

I have one zfs filesystem where this happens quite often, one where it
happens sometimes, and a few others which have a similar setup and where
I never noticed it (though the average file size on them is smaller).

I asked in #freebsd about this, and koobs said I should write to this
list and CC kib at freebsd.org and mgj at freebsd.org.
He also took a quick look at the nullfs changes between 10.3 and 12.0 and
spotted the following change, which he said I should mention as well:
https://github.com/freebsd/freebsd/commit/82f9c275c43da09f404546cceeff187a90ecc573#diff-81e7d6520611101890dd6425324dd8f8

Is there a known bug there? Could the stale NFS handle cause the leak? 
Where is that NFS handle coming from?

Regards,
Florian Schulze


More information about the freebsd-fs mailing list