Re: RFC: Does ZFS block cloning do this?

From: Alan Somers <asomers_at_freebsd.org>
Date: Wed, 06 Aug 2025 16:20:27 UTC
On Wed, Aug 6, 2025 at 9:54 AM Rick Macklem <rick.macklem@gmail.com> wrote:

> On Wed, Aug 6, 2025 at 8:32 AM Alan Somers <asomers@freebsd.org> wrote:
> >
> > On Wed, Aug 6, 2025 at 9:18 AM Rick Macklem <rick.macklem@gmail.com> wrote:
> >>
> >> Hi,
> >>
> >> NFSv4.2 has a CLONE operation. It is described as doing:
> >>    The CLONE operation is used to clone file content from a source file
> >>    specified by the SAVED_FH value into a destination file specified by
> >>    CURRENT_FH without actually copying the data, e.g., by using a
> >>    copy-on-write mechanism.
> >> (It takes arguments for 2 files, with byte offsets and a length.)
> >> The offsets must be aligned to a value returned by the NFSv4.2 server.
> >> 12.2.1.  Attribute 77: clone_blksize
> >>
> >>    The clone_blksize attribute indicates the granularity of a CLONE
> >>    operation.
> >>
> >> Does ZFS block cloning do this?
> >>
> >> I am asking now, because although it might be too late,
> >> if the answer is "yes", I'd like to get VOP calls into 15.0
> >> for it. (Hopefully with the VOP calls in place, the rest could
> >> go in sometime later, when I find the time to do it.)
> >>
> >> Thanks in advance for any comments, rick
> >
> >
> > Yes, it does that right now, if the feature@block_cloning pool attribute
> > is enabled.  It works with VOP_COPY_FILE_RANGE.  Does NFS really need a
> > new VOP?
>
> Either a new VOP or maybe a new flag argument for VOP_COPY_FILE_RANGE().
> Linux defined a flag argument for their copy_file_range(), but they have
> never defined any flags. Of course, that doesn't mean there cannot be a
> "kernel internal" flag.
>
> So maybe adding a new VOP can be avoided. That would be nice, given the
> timing of the 15.0 release and other churn going on.
>
> The difference for NFSv4.2 is that CLONE cannot return with partial
> completion. (It assumes that a CLONE of any size will complete quickly
> enough for an RPC. Although there is no fixed limit, most assume an RPC
> reply should happen in 1-2sec at most. For COPY, the server can return
> with only part of the copy done.)
> It also includes alignment restrictions for the byte offsets.
>
> There is also the alignment restriction on CLONE. There doesn't seem to be
> an alignment restriction on zfs_clone_range(), but maybe it is buried
> inside it?
> I think adding yet another pathconf name to get the alignment requirement
> and whether or not the file system supports it would work without any VOP
> change.
>
> rick
>

zfs_clone_range() doesn't have any alignment restrictions.  But if the
argument isn't aligned to a record boundary, ZFS will actually copy the
partial record, rather than clone it.  Regarding the copy-to-completion
requirement, could that be implemented within NFS by looping over
VOP_COPY_FILE_RANGE?
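
To make the looping idea concrete, here is a rough userspace analogue built on
copy_file_range(2) rather than the in-kernel VOP, whose calling convention is
different.  It is only a sketch: copy_range_fully() and is_record_aligned() are
hypothetical names, not existing FreeBSD or NFS server functions, and the record
size passed to the alignment check is assumed to come from somewhere else (for
example, the pathconf name suggested above).

```c
#define _GNU_SOURCE	/* only needed on glibc to expose copy_file_range(2) */
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: repeat copy_file_range(2) until len bytes have
 * been transferred, the source hits EOF, or an error occurs.
 * Returns the number of bytes transferred, or -1 with errno set.
 */
ssize_t
copy_range_fully(int infd, off_t inoff, int outfd, off_t outoff, size_t len)
{
	size_t done = 0;

	while (done < len) {
		/* The call advances inoff and outoff by the bytes it moved. */
		ssize_t n = copy_file_range(infd, &inoff, outfd, &outoff,
		    len - done, 0);
		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, try again */
			return (-1);
		}
		if (n == 0)
			break;			/* EOF on the source file */
		done += (size_t)n;
	}
	return ((ssize_t)done);
}

/*
 * Hypothetical check: true when off is a multiple of recsize.
 * recsize must be greater than zero.
 */
bool
is_record_aligned(off_t off, off_t recsize)
{
	return (off % recsize == 0);
}
```

Note that a loop like this only guarantees that the call does not return early
with a short count.  It does not make the operation atomic, so a failure
partway through still leaves the destination partially written, and it says
nothing about whether the whole thing finishes within an RPC timeout.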