From: alan somers
Date: Fri, 5 Apr 2024 08:13:18 -0600
Subject: Re: SEEK_HOLE at EOF
To: Poul-Henning Kamp
Cc: Alan Somers, FreeBSD Hackers
List-Id: Technical discussions relating to FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-hackers
References: <202404050543.4355hDcS009860@critter.freebsd.dk> <202404051354.435Ds1KX086243@critter.freebsd.dk>
In-Reply-To: <202404051354.435Ds1KX086243@critter.freebsd.dk>

On Fri, Apr 5, 2024 at 7:54 AM Poul-Henning Kamp wrote:
>
> --------
> Alan Somers writes:
> > On Thu, Apr 4, 2024 at 11:43 PM Poul-Henning Kamp wrote:
> > >
> > > Just two minor quibbles:
> > >
> > > If the file position is EOF, then you /are/ "beyond the end of the file"
> > > because a read(2) would not be able to return any data.
> >
> > Do you distinguish between "at EOF" and "beyond EOF"?  And does it not
> > trouble you that calling SEEK_HOLE from the beginning of the "virtual
> > hole at EOF" will return ENXIO, even though calling SEEK_HOLE from the
> > beginning of any real hole will return the current offset?
>
> EOF is where the file ends and there's no "hole" there, because there's
> no more file on the other side of that "hole".
>
> When you stand on a cliff, the ocean is not "a hole in the landscape",
> it's where the landscape ends.

Except there is a hole at EOF, a virtual hole.  The draft spec
specifically says "all seekable files shall have a virtual hole
starting at the current size of the file".

> > > And returning ENXIO is more informative than returning the size of the
> > > file, since it atomically tells you that there are no more holes.
> >
> > Ahh, that's a good point.  It's the first point I've heard in favor of
> > this option.  Are you aware of any applications that need to know
> > that?
>
> No, but that should not get in the way of good syscall architecture :-)
>
> It might be useful for archivers which try to be smart about sparse files.

I imagine that most archivers would work like this:

ofs = 0
loop {
    let start = lseek(fd, ofs, SEEK_DATA);
    if ENXIO {
        // No more data regions
        break
    }
    // Search from `start`, which is inside a data region; searching from
    // `ofs` could return `ofs` itself when `ofs` is already in a hole.
    let end = lseek(fd, start, SEEK_HOLE);
    assert!(!ENXIO) // thanks to the virtual hole, we should never have ENXIO here
    copy(fd, start, end - start, ...)
    ofs = end
}
truncate(output_file, fd.fsize)

Since archivers really only care about data regions, not holes, I
don't think that they would usually call SEEK_HOLE at EOF.
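
For concreteness, the loop above can be sketched in Python.  This is only
an illustration under stated assumptions, not code from any real archiver:
the name copy_data_regions and the 1 MiB chunk size are made up for the
example, and it assumes a platform where os.SEEK_DATA and os.SEEK_HOLE
are available.

```python
import errno
import os

CHUNK = 1 << 20  # arbitrary copy buffer size, chosen for the example


def copy_data_regions(src_fd: int, dst_fd: int) -> None:
    """Copy only the data regions of src_fd to the same offsets in dst_fd."""
    size = os.fstat(src_fd).st_size
    ofs = 0
    while True:
        try:
            # First offset >= ofs that contains data.
            start = os.lseek(src_fd, ofs, os.SEEK_DATA)
        except OSError as e:
            if e.errno == errno.ENXIO:
                break  # no more data regions
            raise
        # Search from `start`, which is inside a data region, so the result
        # is the end of that region.  If the data runs to the end of the
        # file, the virtual hole there is found and no ENXIO is expected.
        end = os.lseek(src_fd, start, os.SEEK_HOLE)
        pos = start
        while pos < end:
            chunk = os.pread(src_fd, min(end - pos, CHUNK), pos)
            if not chunk:
                break  # the file shrank while we were copying
            written = 0
            while written < len(chunk):
                written += os.pwrite(dst_fd, chunk[written:], pos + written)
            pos += len(chunk)
        ofs = end
    # Extend the output to the source size so a trailing hole is preserved.
    os.ftruncate(dst_fd, size)
```

Note that SEEK_HOLE is only ever called from an offset that SEEK_DATA
just returned, so this loop never asks for a hole while positioned at EOF.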
>
> --
> Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
> phk@FreeBSD.ORG         | TCP/IP since RFC 956
> FreeBSD committer       | BSD since 4.3-tahoe
> Never attribute to malice what can adequately be explained by incompetence.