Date: Sat, 6 Apr 2024 03:15:37 +0900
From: Tomoaki AOKI
To: Rick Macklem
Cc: alan somers, Poul-Henning Kamp, Alan Somers, FreeBSD Hackers
Subject: Re: SEEK_HOLE at EOF
Message-Id: <20240406031537.69bc9e9fc6724171fe604fa8@dec.sakura.ne.jp>
Organization: Junchoon corps
List-Id: Technical discussions relating to FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-hackers

On Fri, 5 Apr 2024 07:23:20 -0700
Rick Macklem wrote:

> On Fri, Apr 5, 2024 at 7:13 AM alan somers wrote:
> >
> > On Fri, Apr 5, 2024 at 7:54 AM Poul-Henning Kamp wrote:
> > >
> > > --------
> > > Alan Somers writes:
> > > > On Thu, Apr 4, 2024 at 11:43 PM Poul-Henning Kamp wrote:
> > >
> > > > > Just two minor quibbles:
> > > > >
> > > > > If the file position is EOF, then you /are/ "beyond the end of the file"
> > > > > because a read(2) would not be able to return any data.
> > > >
> > > > Do you distinguish between "at EOF" and "beyond EOF"?
> As a bit of an aside, NFSv4.2 does differentiate between "at EOF"
> and "beyond EOF" for its Seek operation.
> The fun part is that Linux did not implement what is in the RFC and shipped
> to many before the "bug" was noticed (and still do not conform to the RFC
> afaik). As such, there are now two ways to do it, The RFC way or the Linux
> way. Selecting between them is what the sysctl vfs.nfsd.linux42server does.
>
> > > > And does it not
> > > > trouble you that calling SEEK_HOLE from the beginning of the "virtual
> > > > hole at EOF" will return ENXIO, even though calling SEEK_HOLE from the
> > > > beginning of any real hole will return the current offset?
> > >
> > > EOF is where the file ends and there's no "hole" there, because there
> > > no more file on the other side of that "hole".
> > >
> > > When you stand on a cliff, the ocean is not "a hole in the landscape",
> > > it's where the landscape ends.
> >
> > Except there is a hole at EOF, a virtual hole. The draft spec
> > specifically says "all seekable files shall have a virtual hole
> > starting at the
> > current size of the file".
> I think that they used the term "virtual" to indicate this is not a real hole
> and I think it was a good idea, since it allows file systems that do not
> support holes to support SEEK_DATA.
>
> However, I still believe that conforming to the Austin Group draft is
> preferable.
>
> rick

I have not read the specs or the code, so this is just a point of view
from an admin and a user.

I think what admins/users would want is:

 *The size of a file with holes, as shown by `ls -l`, should ALWAYS be
  the size it would have if ALL existing holes were completely filled,
  including the virtual hole at EOF (in the classic sense, End Of File).

 *If the word EOF here means the End Of the Filled part, it SHALL be
  called something else (for example, EOFP?) to make that clear.

 *The virtual hole should be considered an actual hole that ends
  exactly at the classic EOF and has the size
  (file size - position of EOFP).
  So there are no truly virtual holes; the name is just a convenience.

Otherwise, how can admins/users manage the capacity left on the
FS/quota that contains the sparse file?

To be clearer: for text files that have an EOF character code (call it
EOFC here) at the end, if EOF means the position of the EOFC, then yes,
the problem raised can occur. But that has often been seen in the wild,
usually to avoid errors on the next append.
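
To make the capacity question concrete, here is a minimal Python sketch
that compares the size `ls -l` would show with the space actually
allocated for a file. It rests on assumptions: `st_blocks` is taken to
count 512-byte units (common, but not guaranteed everywhere), the helper
name `apparent_and_allocated` and the 1 MiB size are made up for
illustration, and whether the extended tail is really stored as a hole
depends on the file system.

```python
import os
import tempfile

def apparent_and_allocated(path):
    """Return (apparent size, allocated bytes) for path."""
    st = os.stat(path)
    # Assumption: st_blocks is counted in 512-byte units.
    return st.st_size, st.st_blocks * 512

# Write a few bytes, then extend the file without writing the tail.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"data")
    f.truncate(1 << 20)  # apparent size becomes 1 MiB
    path = f.name

size, allocated = apparent_and_allocated(path)
print("apparent:", size, "allocated:", allocated)
os.unlink(path)
```

On a file system that supports sparse files the allocated figure would
usually come out much smaller than the apparent one, which is the gap
quota and free-space accounting has to deal with.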

(As an unintentional case, cluster gaps look similar but are actually
different, and they confuse newbies: "Why is the free disk [partition]
space not equal to the whole disk space minus the sum of the sizes of
the existing files?")

Recently, though, FS-level compression such as lz4 compression on ZFS
datasets causes even more confusion.

Regards

> > > > > And returning ENXIO is more informative than returning the size of the
> > > > > file, since it atomically tells you that there are no more holes.
> > > >
> > > > Ahh, that's a good point. It's the first point I've heard in favor of
> > > > this option. Are you aware of any applications that need to know
> > > > that?
> > >
> > > No, but that should not get in the way of good syscall architecture :-)
> > >
> > > It might be useful for archivers which try to be smart about sparse files.
> >
> > I imagine that most archivers would work like this:
> > ofs = 0
> > loop {
> >     let start = lseek(fd, ofs, SEEK_DATA);
> >     if ENXIO {
> >         // No more data regions
> >         break
> >     }
> >     let end = lseek(fd, ofs, SEEK_HOLE);
> >     assert!(!ENXIO) // thanks to the virtual hole, we should never
> > have ENXIO here
> >     copy(fd, start, end - start, ...)
> >     ofs = end
> > }
> > truncate(output_file, fd.fsize)
> >
> > Since archivers really only care about data regions, not holes, I
> > don't think that they would usually call SEEK_HOLE at EOF.
> >
> > >
> > > --
> > > Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
> > > phk@FreeBSD.ORG | TCP/IP since RFC 956
> > > FreeBSD committer | BSD since 4.3-tahoe
> > > Never attribute to malice what can adequately be explained by incompetence.

-- 
Tomoaki AOKI
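
For reference, the archiver loop quoted above can be written as a
runnable Python sketch. It is only an illustration under assumptions:
`data_extents` is a hypothetical helper name, it looks for the next hole
starting from `start` rather than from `ofs` (which I assume is what the
pseudocode intended), and it relies on `os.SEEK_DATA` and `os.SEEK_HOLE`,
which Python provides only on platforms that support them.

```python
import errno
import os
import tempfile

def data_extents(fd):
    """Yield (start, end) byte offsets of each data region of fd."""
    ofs = 0
    while True:
        try:
            start = os.lseek(fd, ofs, os.SEEK_DATA)
        except OSError as e:
            if e.errno == errno.ENXIO:
                return  # no more data regions at or after ofs
            raise
        # Starting inside a data region, SEEK_HOLE is expected to
        # succeed: at worst it finds the virtual hole at EOF.
        end = os.lseek(fd, start, os.SEEK_HOLE)
        yield start, end
        ofs = end

# Demo on a small, fully written temporary file.
with tempfile.TemporaryFile() as f:
    f.write(b"hello")
    f.flush()
    extents = list(data_extents(f.fileno()))
    print(extents)
```

An archiver built this way would copy each extent and then truncate the
output to the source size, as in the quoted pseudocode, so it never has
to call SEEK_HOLE at EOF itself.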