Re: Sparse file support in FreeBSD NFSv4.2 server

From: Rob Norris <robn_at_despairlabs.com>
Date: Tue, 13 May 2025 06:51:16 UTC
On Tue, 13 May 2025, at 12:38 AM, Rick Macklem wrote:
> > > > - NFSv4.2 operation "ALLOCATE", to allocate disk space
> > > Will never happen for ZFS because it is basically impossible. I am not a ZFS
> > > guy, but that is what I have been told. UFS can do it, so it can be enabled if
> > > all your exports are UFS file systems.
> >
> > Solaris has fcntl(F_ALLOCSP,...), so this should work on ZFS.
> Well, I'm not a ZFS guy, but here is what I understood from the ZFS
> folk w.r.t. this:
> - When you write data to a file, new blocks are allocated for the data
>   bytes, even if there is already old data written to those bytes. As
>   such, it is "impossible" to guarantee that a write will not reply
>   ENOSPC/EDQUOT.
>   One responder did think it was possible, but listed several major changes
>   that would be required to make this possible on ZFS. (So "impossible" might
>   really be "too difficult to ever be implemented".)

I _am_ an OpenZFS guy, and can say yes, this is correct. Creating a sparse region is easy; guaranteeing that future changes in that region will never run out of space is the tricky bit.

Without having looked at it, I can see a way to do it by creating some object-specific operation that "writes" but is accounted to the dataset's "reservation" rather than "used". Easy to say, difficult to do. I suspect the hardest part is figuring out the best way to keep a set of reserved ranges on each object.

Incidentally, I think the same machinery is necessary to get a properly compliant implementation of posix_fallocate(2), which has the same guarantee.
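
For anyone who hasn't used it, here is a minimal sketch of what a caller of posix_fallocate(2) looks like. The helper name preallocate() and the file mode are my own choices for illustration, not anything taken from NFS or OpenZFS. The point is the contract: if the call returns 0, later writes inside [0, len) are not supposed to fail for lack of space, and that is exactly the promise a copy-on-write file system struggles to keep.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (name is illustrative only): ask the file system to
 * reserve storage for the first `len` bytes of `path`.
 * Returns 0 on success, or an errno value on failure.
 */
int preallocate(const char *path, off_t len)
{
    /* posix_fallocate() rejects a zero or negative length. */
    assert(path != NULL && len > 0);

    int fd = open(path, O_CREAT | O_WRONLY, 0644);
    if (fd < 0)
        return errno;

    /*
     * Unlike most system calls, posix_fallocate() returns the error
     * number directly and does not set errno.
     */
    int err = posix_fallocate(fd, 0, len);

    close(fd);
    return err;
}
```

A caller would treat any non-zero return as "no guarantee was made" and fall back to ordinary writes, handling out-of-space errors as they come.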

Cheers,
Rob.