Re: SEEK_HOLE at EOF

From: Alan Somers <asomers_at_freebsd.org>
Date: Thu, 04 Apr 2024 20:59:25 UTC

On Thu, Apr 4, 2024 at 2:56 PM Rick Macklem <rick.macklem@gmail.com> wrote:
>
> On Thu, Apr 4, 2024 at 11:15 AM Alan Somers <asomers@freebsd.org> wrote:
> >
> > tldr; there are two problems:
> > 1) tmpfs handles SEEK_HOLE differently than other file systems
> > 2) everything else handles SEEK_HOLE at EOF poorly, IMHO
> >
> > Details:
> >
> > According to lseek(2), SEEK_HOLE should return the start of the next
> > hole greater than or equal to the supplied offset.  Also, each file
> > has a zero-sized virtual hole at the very end of the file.  So I would
> > expect that calling SEEK_HOLE at EOF would return the file's size.
> > However, the man page also says that SEEK_HOLE will return ENXIO when
> > the offset points to EOF.  Those two statements seem contradictory to
> > me.  The first behavior seems more logical.  I would expect SEEK_HOLE
> > to work the same way both at EOF and at any other file offset.
> >
> > What does the spec say?
> >
> > There is no POSIX standard for this.  It was invented by Solaris.
> > Illumos's man page does not clearly say what should happen at EOF.
> > Linux's man page is clear that ENXIO is returned when "whence is
> > SEEK_DATA or SEEK_HOLE, and offset is beyond the end of the file".
> > That would seem to indicate behavior 1: SEEK_HOLE should return the
> > file's size at EOF.  Only beyond EOF should it return ENXIO.
> Well, there is the Austin Group stuff (never ratified by POSIX as I
> understand it).
>
> Here's what it says about SEEK_HOLE and offset:
> If whence is SEEK_HOLE, the file offset shall be set to the smallest
> location of a byte within a hole and not less than offset, except that
> if offset falls within the last hole, then the file offset may be set
> to the file size instead. It shall be an error if offset is greater
> or equal to the size of the file.
>
> I'd suggest we follow this, since it is the closest to a standard that there is.

That sounds like behavior 2: return ENXIO at EOF.  For reference, do
you have a link to that somewhere?
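
For anyone who wants to check which of the two behaviors a given file
system shows, here is a minimal sketch in C.  The helper names
(make_data_file, probe_seek_hole, report_eof_behavior) and the /tmp path
are my own assumptions for illustration, not anything from lseek(2) or
the Austin Group text.  It writes a small file with no holes in it, then
calls lseek(fd, size, SEEK_HOLE) and reports whether the result is the
file size (behavior 1) or ENXIO (behavior 2).

```c
#define _GNU_SOURCE            /* needed for SEEK_HOLE on glibc; assumed harmless elsewhere */
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create an already-unlinked temporary file that
 * holds `len` non-zero bytes, so it contains no real holes.
 * Returns an open file descriptor, or -1 on failure.
 */
int make_data_file(size_t len)
{
    char path[] = "/tmp/seekhole.XXXXXX";   /* assumed location */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                           /* file vanishes when fd is closed */

    char *buf = malloc(len);
    if (buf == NULL) {
        close(fd);
        return -1;
    }
    memset(buf, 'x', len);
    ssize_t n = write(fd, buf, len);
    free(buf);
    if (n != (ssize_t)len) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Hypothetical helper: call lseek(fd, off, SEEK_HOLE).
 * Returns 0 and stores the resulting offset in *out on success,
 * otherwise returns the errno value and leaves *out untouched.
 */
int probe_seek_hole(int fd, off_t off, off_t *out)
{
    assert(out != NULL);
    off_t r = lseek(fd, off, SEEK_HOLE);
    if (r == (off_t)-1)
        return errno;
    *out = r;
    return 0;
}

/* Print which of the two behaviors discussed above SEEK_HOLE shows at EOF. */
void report_eof_behavior(int fd, off_t size)
{
    off_t pos = 0;
    int err = probe_seek_hole(fd, size, &pos);

    if (err == 0 && pos == size)
        puts("behavior 1: SEEK_HOLE at EOF returns the file size");
    else if (err == ENXIO)
        puts("behavior 2: SEEK_HOLE at EOF fails with ENXIO");
    else if (err == 0)
        printf("unexpected offset %lld\n", (long long)pos);
    else
        printf("unexpected error: %s\n", strerror(err));
}
```

To use it, call make_data_file(4096) from main(), pass the descriptor
and 4096 to report_eof_behavior(), and close the descriptor.  The answer
depends on whichever file system backs /tmp, so run it on each file
system you care about.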