Re: NFSv4.2 READ_PLUS support?

From: Dag-Erling_Smørgrav <des_at_FreeBSD.org>
Date: Tue, 26 Aug 2025 23:10:04 UTC

Aurélien Couderc <aurelien.couderc2002@gmail.com> writes:
> Dag-Erling Smørgrav <des@freebsd.org> writes:
> > A hole is just an optimization, and it is 100% up to
> > the file system whether holes are created and where.  Any application
> > that considers a hole to be semantically different from a sequence of
> > zeroes is broken.
> No, this is part of POSIX [...]

POSIX defines a hole as follows:

  A contiguous region of bytes within a file, all having the value of
  zero.  Not all bytes with the value zero need belong to a hole;
  however, all seekable files shall have a virtual hole starting at the
  current size of the file.  A hole is typically created via truncate(),
  or if an lseek() call has been made to position beyond the end of a
  file and data subsequently written at that point, although it is up to
  the implementation to define when sparse files can be created and with
  what granularity for the size of holes.

Note the final clause.

It also defines a sparse file as follows:

  A file that contains more holes than just the virtual hole at the end
  of the file.

That's pretty much all it has to say about holes, although you can glean
a little more from the rationale.  The rationale section for lseek(3)
mostly just rephrases the definition:

  Not all file systems support holes, and even where sparse files are
  supported, not all contiguous blocks of zero bytes are required to be
  recognized as a hole.
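
To make that concrete, here is a minimal Python sketch, assuming a
Unix-like system where Python exposes os.SEEK_DATA, os.SEEK_HOLE and
os.pread.  The helper names (data_extents, pread_full, read_via_extents)
are hypothetical.  It treats whatever extents the file system reports
purely as a hint and falls back to "the whole file is data" when no
hole information is available:

```python
import errno
import os


def data_extents(fd, size):
    """Yield (start, end) ranges that the file system reports as data.

    Where these ranges fall is the file system's choice; bytes outside
    them read back as zeroes either way.
    """
    seek_data = getattr(os, "SEEK_DATA", None)
    seek_hole = getattr(os, "SEEK_HOLE", None)
    if seek_data is None or seek_hole is None:
        # No hole reporting on this platform: one extent covers the file.
        if size:
            yield (0, size)
        return
    offset = 0
    while offset < size:
        try:
            start = os.lseek(fd, offset, seek_data)
            end = os.lseek(fd, start, seek_hole)
        except OSError as e:
            if e.errno == errno.ENXIO:
                return  # nothing but a hole between offset and end of file
            # Not supported here: treat the remainder as data.
            yield (offset, size)
            return
        if end <= start:
            end = size  # defensive: never loop without making progress
        yield (start, end)
        offset = end


def pread_full(fd, length, offset):
    """Read up to `length` bytes at `offset`, retrying on short reads."""
    chunks = []
    while length > 0:
        chunk = os.pread(fd, length, offset)
        if not chunk:
            break
        chunks.append(chunk)
        offset += len(chunk)
        length -= len(chunk)
    return b"".join(chunks)


def read_via_extents(path):
    """Rebuild the file contents from reported data extents only."""
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        out = bytearray(size)  # zero-filled, which is what a hole reads as
        for start, end in data_extents(fd, size):
            data = pread_full(fd, end - start, start)
            out[start:start + len(data)] = data
        return bytes(out)
    finally:
        os.close(fd)
```

Whether the file system reports one extent or many, read_via_extents()
should return the same bytes as a plain read of the file; only the
amount of I/O differs.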

The rationale section for du(1) discusses the difficulty of reporting
how much disk space a file occupies:

  There are two known areas of inaccuracies in historical file systems:
  cases of indirect blocks being used by the file system or sparse files
  yielding incorrectly high values.  [...]  A sparse file is one in
  which an lseek() call has been made to a position beyond the end of
  the file and data has subsequently been written at that point.  A file
  system need not allocate all the intervening zero-filled blocks to
  such a file.

It is entirely up to the file system whether, when, and where to create
holes, and any application that assigns semantic value to holes or
expects a hole to exist anywhere other than at the end of a file is
broken.
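
A small Python sketch of what "no semantic difference" means in
practice.  The function names and the 1 MiB gap are arbitrary choices
for illustration.  One file is written by seeking past a gap, which may
or may not leave a hole; the other has the same gap filled with explicit
zero bytes.  A correct application cannot tell them apart by content:

```python
import filecmp
import os

GAP = 1 << 20  # 1 MiB of zeroes between the two payloads (arbitrary)


def write_with_seek(path, gap=GAP):
    """Write "head", skip `gap` bytes, write "tail".

    The skipped range may become a hole, but that is up to the file system.
    """
    with open(path, "wb") as f:
        f.write(b"head")
        f.seek(gap, os.SEEK_CUR)
        f.write(b"tail")


def write_with_zeroes(path, gap=GAP):
    """Write the same logical contents with the gap as explicit zero bytes."""
    with open(path, "wb") as f:
        f.write(b"head")
        f.write(bytes(gap))
        f.write(b"tail")


def same_contents(a, b):
    """Compare the two files byte for byte, ignoring how they are stored."""
    return filecmp.cmp(a, b, shallow=False)
```

Calling write_with_seek() and write_with_zeroes() on two paths and then
same_contents() on the pair should report that they are equal,
regardless of how either file ended up being stored.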

This is not new, or unique to FreeBSD.  It has always been the case.  At
a minimum, and setting aside compressed file systems, which complicate
matters enormously, holes will generally start and end on disk block
boundaries, except for the virtual hole at the end of each file.  If you
look at the cp test suite (bin/cp/tests/cp_test.sh) you'll see that I
had to use 16 MB holes to get the tests to reliably create sparse files
on all file systems I tested on.
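
For illustration, a minimal Python sketch of that kind of check.  It is
not the code from cp_test.sh, and the helper names are hypothetical.  It
writes a file with a 16 MB gap and then compares allocated space to
apparent size.  It assumes st_blocks is counted in 512-byte units, as on
FreeBSD and Linux; POSIX leaves the unit to the implementation, and the
block count can lag behind recent writes on some file systems, so the
result is only a heuristic:

```python
import os

HOLE = 16 * 1024 * 1024  # 16 MB gap, matching the size mentioned above


def make_candidate(path, hole=HOLE):
    """Write one byte, skip `hole` bytes, write one more byte.

    Whether the skipped range actually becomes a hole is up to the
    file system.
    """
    with open(path, "wb") as f:
        f.write(b"x")
        f.seek(hole, os.SEEK_CUR)
        f.write(b"y")


def allocated_bytes(path):
    """Space reported as allocated, assuming 512-byte st_blocks units."""
    return os.stat(path).st_blocks * 512


def looks_sparse(path):
    """Heuristic: allocated space is smaller than the apparent size."""
    st = os.stat(path)
    return st.st_blocks * 512 < st.st_size
```

A test built on this should treat a False result from looks_sparse() as
"this file system did not create a hole here" and skip, not as a
failure of the program under test.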

I'd be interested in knowing where you heard otherwise, and which
applications you believe depend on predictable hole semantics.

DES
-- 
Dag-Erling Smørgrav - des@FreeBSD.org