List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current
From: Rick Macklem
Date: Wed, 6 Aug 2025 09:38:53 -0700
Subject: Re: RFC: Does ZFS block cloning do this?
To: Alan Somers
Cc: FreeBSD CURRENT

On Wed, Aug 6, 2025 at 9:20 AM Alan Somers wrote:
>
> On Wed, Aug 6, 2025 at 9:54 AM Rick Macklem wrote:
>>
>> On Wed, Aug 6, 2025 at 8:32 AM Alan Somers wrote:
>> >
>> > On Wed, Aug 6, 2025 at 9:18 AM Rick Macklem wrote:
>> >>
>> >> Hi,
>> >>
>> >> NFSv4.2 has a CLONE operation. It is described as doing:
>> >>    The CLONE operation is used to clone file content from a source file
>> >>    specified by the SAVED_FH value into a destination file specified by
>> >>    CURRENT_FH without actually copying the data, e.g., by using a
>> >>    copy-on-write mechanism.
>> >> (It takes arguments for 2 files, with byte offsets and a length.)
>> >> The offsets must be aligned to a value returned by the NFSv4.2 server.
>> >> 12.2.1.  Attribute 77: clone_blksize
>> >>
>> >>    The clone_blksize attribute indicates the granularity of a CLONE
>> >>    operation.
>> >>
>> >> Does ZFS block cloning do this?
>> >>
>> >> I am asking now, because although it might be too late,
>> >> if the answer is "yes", I'd like to get VOP calls into 15.0
>> >> for it. (Hopefully with the VOP calls in place, the rest could
>> >> go in sometime later, when I find the time to do it.)
>> >>
>> >> Thanks in advance for any comments, rick
>> >
>> >
>> > Yes, it does that right now, if the feature@block_cloning pool
>> > attribute is enabled. It works with VOP_COPY_FILE_RANGE. Does NFS
>> > really need a new VOP?
>> Either a new VOP or maybe a new flag argument for VOP_COPY_FILE_RANGE().
>> Linux defined a flag argument for their copy_file_range(), but they have never
>> defined any flags. Of course, that doesn't mean there cannot be a
>> "kernel internal" flag.
>>
>> So maybe adding a new VOP can be avoided. That would be nice, given the timing
>> of the 15.0 release and other churn going on.
>>
>> The difference for NFSv4.2 is that CLONE cannot return with partial completion.
>> (It assumes that a CLONE of any size will complete quickly enough for an RPC.
>> Although there is no fixed limit, most assume an RPC reply should happen in
>> 1-2sec at most. For COPY, the server can return with only part of the
>> copy done.)
>> It also includes alignment restrictions for the byte offsets.
>>
>> There is also the alignment restriction on CLONE. There doesn't seem to be
>> an alignment restriction on zfs_clone_range(), but maybe it is buried inside it?
>> I think adding yet another pathconf name to get the alignment requirement and
>> whether or not the file system supports it would work without any VOP change.
>>
>> rick
>
>
> zfs_clone_range doesn't have any alignment restrictions. But if the
> argument isn't aligned to a record boundary, ZFS will actually copy a
> partial record, rather than clone it. Regarding the copy-to-completion
> requirement, could that be implemented within nfs by looping over
> VOP_COPY_FILE_RANGE?
But the reason behind partial completion is the time restriction. The NFSv4.2
server limits the size to vfs.nfsd.maxcopyrange and sets a 1sec time limit
via a flag to vn_copy_file_range().

For CLONE, it needs to either:
- be able to complete the entire "copy" within 1-2sec under normal
  circumstances, irrespective of length.
or
- return not supported, so the client will switch to using COPY.

rick
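
To make the COPY versus CLONE difference concrete, here is a rough userland model in Python. It is only a sketch, not the NFS server code: the callbacks copy_chunk and clone_range, the function names, and the result strings "OK", "REJECT" and "NOTSUPP" are all made up for illustration and are not NFSv4.2 error codes or FreeBSD interfaces. It assumes only what is said above: COPY may stop at a time limit and report a partial count, while CLONE needs aligned offsets and must either cover the whole range or be refused.

```python
import time


def offsets_aligned(src_off, dst_off, blksize):
    """True when both byte offsets are multiples of the clone block size."""
    return blksize > 0 and src_off % blksize == 0 and dst_off % blksize == 0


def copy_with_deadline(copy_chunk, length, max_chunk, deadline_s,
                       clock=time.monotonic):
    """COPY-style loop: may stop early and return a partial byte count.

    copy_chunk(offset, nbytes) is a hypothetical callback that returns the
    number of bytes it copied (0 or less means no further progress).
    """
    start = clock()
    done = 0
    while done < length and clock() - start < deadline_s:
        n = copy_chunk(done, min(max_chunk, length - done))
        if n <= 0:
            break
        done += n
    return done


def clone_all_or_nothing(clone_range, src_off, dst_off, length, blksize):
    """CLONE-style call: the whole range is cloned, or the call is refused.

    clone_range(src_off, dst_off, length) is a hypothetical callback that
    returns True only if the file system cloned the entire range.
    The returned strings are placeholders, not protocol error codes.
    """
    if not offsets_aligned(src_off, dst_off, blksize):
        return "REJECT"
    if clone_range(src_off, dst_off, length):
        return "OK"
    return "NOTSUPP"  # the client would then switch to COPY
```

With a file system whose clone_range callback cannot promise completion, clone_all_or_nothing() answers "NOTSUPP" and the caller falls back to copy_with_deadline(), which is free to return fewer bytes than requested.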