held file reference issue with ZFS and nullfs

Konstantin Belousov kostikbel at gmail.com
Fri Sep 6 12:43:56 UTC 2019


On Fri, Sep 06, 2019 at 12:44:46PM +0200, Florian Schulze wrote:
> Hi!
> 
> Since FreeBSD 12 (updated from 10.3, I skipped 11.x completely, the box 
> started around 9.3) I have the issue that ZFS is not freeing up space 
> for some deleted files. The filesystems where this happens are mounted 
> into multiple jails via nullfs. Only one jail has write access, the 
> others are read only. When files are deleted the space for them is not 
> freed. I can still see their objects via zdb. When I unmount one of the 
> read only nullfs mounts the space is freed and the objects released.
> 
> I already used lsof, procstat and fstat to see if any process still has 
> a reference to the file, but that is not the case. But it seems to 
> matter which nullfs mount is unmounted, it is always one of the read 
> only ones. The processes which access the read only mounts are 
> completely different, it only seems to matter that the files are opened 
> at all. Killing the processes doesn't help, only unmounting the nullfs.
There were some bugs in the past where nullfs referenced a lower vnode but
did not dereference it.

> 
> Today I noticed an odd message when I used zfs diff: "Unable to 
> determine path or stats for object 6 in 
> ... at zfs-diff-15651-00000001d86eb8cb: Stale NFS file handle". I don't 
> have NFS enabled anywhere (just checked the properties) and it never was 
> enabled!
This is probably unrelated.

> 
> The zdb output for object 6:
> 
> Dataset ... [ZPL], ID 18405, cr_txg 36040566, 10.2G, 43 objects
> 
>      Object  lvl   iblk   dblk  dsize  dnsize lsize   %full  type
>           6    1    16K    16K    16K     512    32K  100.00  SA attr layouts
> 
> 
> ZFS_DBGMSG(zdb):
> spa_open_common: opening ...
> spa_load(tank2, config trusted): LOADING
> disk vdev '/dev/diskid/DISK-WD-...': best uberblock found for spa tank2. txg 40503599
> spa_load(tank2, config untrusted): using uberblock with txg=40503599
> spa_load(tank2, config trusted): spa_load_verify found 0 metadata errors and 2 data errors
> spa_load(tank2, config trusted): LOADED
> 
> 
> The zfs diff output also contained a line "-   ...(on_delete_queue)".
> 
> I have one zfs filesystem where this happens quite often, one where it 
> happens sometimes and a few others which have a similar setup and where 
> I never noticed it (though the average file size on them is smaller).
> 
> I asked in #freebsd about this and koobs said I should write to this 
> list and CC kib at freebsd.org and mgj at freebsd.org
> He also did a quick look at the nullfs changes between 10.3 and 12.0 and 
> spotted the following change, which he said I should mention as well:
> https://github.com/freebsd/freebsd/commit/82f9c275c43da09f404546cceeff187a90ecc573#diff-81e7d6520611101890dd6425324dd8f8
> 
> Is there a known bug there? Could the stale NFS handle cause the leak? 
> Where is that NFS handle coming from?

So what is the exact version of your system? If it is 12.0, upgrade the
kernel to the latest stable/12 and see if it helps with the leak.
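
For reference, a minimal sketch of how the version and the candidate
mounts could be collected. It assumes the usual FreeBSD `mount -t nullfs`
output format; the helper name ro_nullfs_mounts and the example path are
hypothetical and not taken from the report above.

```shell
# List the mount points of read-only nullfs mounts.
# Input: lines in the format printed by `mount -t nullfs` on FreeBSD, e.g.
#   /tank2/data on /jails/web/data (nullfs, local, read-only)
# (hypothetical path). Mount points containing spaces are not handled.
ro_nullfs_mounts() {
    awk '/\(nullfs[,)]/ && /read-only/ { print $3 }'
}

# On the affected host:
#   freebsd-version -kru    # installed kernel, running kernel, userland
#   uname -a                # exact revision of the running kernel
#   mount -t nullfs | ro_nullfs_mounts
```

The listed mount points are the ones to try unmounting one at a time when
checking which mount holds the reference.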

