Re: SEEK_HOLE at EOF

From: Warner Losh <imp_at_bsdimp.com>
Date: Thu, 04 Apr 2024 22:44:54 UTC

On Thu, Apr 4, 2024 at 3:39 PM Rick Macklem <rick.macklem@gmail.com> wrote:

> On Thu, Apr 4, 2024 at 1:59 PM Alan Somers <asomers@freebsd.org> wrote:
> >
> > On Thu, Apr 4, 2024 at 2:56 PM Rick Macklem <rick.macklem@gmail.com> wrote:
> > >
> > > On Thu, Apr 4, 2024 at 11:15 AM Alan Somers <asomers@freebsd.org> wrote:
> > > >
> > > > tl;dr: there are two problems:
> > > > 1) tmpfs handles SEEK_HOLE differently than other file systems
> > > > 2) everything else handles SEEK_HOLE at EOF poorly, IMHO
> > > >
> > > > Details:
> > > >
> > > > According to lseek(2), SEEK_HOLE should return the start of the next
> > > > hole greater than or equal to the supplied offset.  Also, each file
> > > > has a zero-sized virtual hole at the very end of the file.  So I would
> > > > expect that calling SEEK_HOLE at EOF would return the file's size.
> > > > However, the man page also says that SEEK_HOLE will return ENXIO when
> > > > the offset points to EOF.  Those two statements seem contradictory to
> > > > me.  The first behavior seems more logical.  I would expect SEEK_HOLE
> > > > to work the same way both at EOF and at any other file offset.
> > > >
> > > > What does the spec say?
> > > >
> > > > There is no POSIX standard for this.  It was invented by Solaris, and
> > > > Illumos's man page does not clearly say what should happen at EOF.
> > > > Linux's man page is clear.  It lists ENXIO for the case where "whence
> > > > is SEEK_DATA or SEEK_HOLE, and offset is beyond the end of the file".
> > > > That would seem to indicate behavior 1: SEEK_HOLE should return the
> > > > file's size at EOF.  Only beyond EOF should it return ENXIO.
> > > Well, there is the Austin Group stuff (never ratified by POSIX as I
> > > understand it).
> > >
> > > Here's what it says about SEEK_HOLE and offset:
> > > If whence is SEEK_HOLE, the file offset shall be set to the smallest
> > > location of a byte within a hole and not less than offset, except that
> > > if offset falls within the last hole, then the file offset may be set
> > > to the file size instead. It shall be an error if offset is greater
> > > or equal to the size of the file.
> > >
> > > I'd suggest we follow this, since it is the closest to a standard that
> > > there is.
> >
> > That sounds like behavior 2: return ENXIO at EOF.  For reference, do
> > you have a link to that somewhere?
> 0000415: add SEEK_HOLE, SEEK_DATA to lseek - Austin Group Defect
> Tracker (austingroupbugs.net)
> If this doesn't give you a link (gmail never shows the raw url for me),
> just google "SEEK_HOLE austin group".
>

You have to join the mailing list to have access. It's easy to do. You can
then download the latest draft (which I think is the ballot draft, so it
will be quite close to final; usually only typos and such are corrected
before the standard is published). This will be the next POSIX.1 standard,
likely this year.

So it's kinda hard to give an exact link :(.
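
For anyone who wants to see which of the two behaviors a given file system
shows, here is a rough probe. It is only a sketch: it assumes a Python whose
os module exposes SEEK_HOLE, and the helper names (seek_hole_at, probe_eof)
are made up for illustration. It writes a small fully-populated file in the
default temporary directory and calls SEEK_HOLE at offset 0, exactly at EOF,
and one byte past EOF, reporting either the returned offset or the errno name.

```python
import errno
import os
import tempfile


def seek_hole_at(path, offset):
    """Return lseek(fd, offset, SEEK_HOLE), or the errno name if it fails."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.lseek(fd, offset, os.SEEK_HOLE)
    except OSError as e:
        # Map the numeric errno (e.g. ENXIO) to its symbolic name.
        return errno.errorcode.get(e.errno, str(e.errno))
    finally:
        os.close(fd)


def probe_eof(size=4096):
    """Create a file of `size` data bytes and probe SEEK_HOLE around EOF."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"x" * size)
        path = f.name
    try:
        return {
            "size": size,
            "at_start": seek_hole_at(path, 0),
            "at_eof": seek_hole_at(path, size),        # the disputed case
            "past_eof": seek_hole_at(path, size + 1),
        }
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print(probe_eof())
```

If "at_eof" comes back equal to "size", the file system treats the virtual
hole at EOF like any other hole (behavior 1). If it comes back as "ENXIO",
it follows the Austin Group wording (behavior 2). To test a specific file
system such as tmpfs, point TMPDIR at a directory on it before running.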

Warner