Re: RFC: Does ZFS block cloning do this?
- In reply to: Rick Macklem : "Re: RFC: Does ZFS block cloning do this?"
Date: Thu, 07 Aug 2025 17:22:41 UTC
On Thu, Aug 7, 2025 at 8:32 AM Rick Macklem <rick.macklem@gmail.com> wrote:

> On Wed, Aug 6, 2025 at 9:46 AM Rick Macklem <rick.macklem@gmail.com> wrote:
> >
> > On Wed, Aug 6, 2025 at 9:28 AM Alexander Motin <mav@freebsd.org> wrote:
> > >
> > > Hi Rick,
> > >
> > > On 8/6/25 11:54, Rick Macklem wrote:
> > > > The difference for NFSv4.2 is that CLONE cannot return with partial completion.
> > > > (It assumes that a CLONE of any size will complete quickly enough for an RPC.
> > > > Although there is no fixed limit, most assume an RPC reply should happen in
> > > > 1-2sec at most. For COPY, the server can return with only part of the
> > > > copy done.)
> > > > It also includes alignment restrictions for the byte offsets.
> > > >
> > > > There is also the alignment restriction on CLONE. There doesn't seem to be
> > > > an alignment restriction on zfs_clone_range(), but maybe it is buried inside it?
> > > > I think adding yet another pathconf name to get the alignment requirement and
> > > > whether or not the file system supports it would work without any VOP change.
> > >
> > > The semantics you describe looks similar to Linux FICLONE/FICLONERANGE
> > > calls, that got adopted there before copy_file_range(). IIRC those
> > > effectively mean -- clone the file or its range as requested or fail. I
> > > am not sure why some people prefer those calls, explicitly not allowing
> > > fallback to copy, but there are some, for example Veeam backup fails if
> > > ZFS rejects the cloning request for any reason. For Linux ZFS has a
> > > separate code (see zpl_remap_file_range() and respective VFS calls)
> > > wrapping around block cloning to implement this semantics. FreeBSD does
> > > not have the equivalent at this point, but it would be trivial to add,
> > > if we really need those VOPs.
> > For NFSv4.2 (which I suspect was modelled after what Linux does) the
> > difference is the ability to complete the entire "copy" within 1-2sec under
> > normal circumstances.
> > --> The NFSv4.2 CLONE operation requires this.
> > whereas for the NFSv4.2 COPY
> > --> It is allowed to return after a partial completion to adhere to the 1-2sec
> >      rule. This probably does not affect ZFS, but it is needed for
> >      the "in general" UFS case.
> >
> > There may be no difference needed for zfs_copy_file_range(). So long as it
> > never returns after a partial completion. If it does return after
> > partial completion, a flag would indicate "must complete it".
> >
> > As for FreeBSD syscalls, I don't see a need for a new one.
> > I'll leave that up to others.
> > pathconf(2) could be used to determine if cloning is supported.
> >
> > Thanks for all the comments. It looks like a new "kernel only" flag for
> > VOP_COPY_FILE_RANGE() and a new name for VOP_PATHCONF()
> > should be all that is needed.
> So, this seems almost too easy?
>
> What I am thinking of (and should be easy to do in the next few days
> for 15.0) is:
> - Define a new pathconf variable _PC_CLONE_BLKSIZE which returns
>   the blksize for cloning or 0 if cloning is not supported.
> - Define a new flag for copy_file_range() called COPY_FILE_RANGE_CLONE
>   which, if set, would require that the entire copy be completed via cloning
>   (no partial copy allowed) or return ENOSYS if the file system does not
>   support this.
>   Expose this flag to userland in case any application really needs cloning.
> The code changes outside of NFS are trivial.
>
> So, how does this sound?
>
> ric

Yes, I think that would work.
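
For anyone following along, here is a rough userland sketch of how the two
proposed pieces might be used together. It is only an illustration under
stated assumptions: _PC_CLONE_BLKSIZE and COPY_FILE_RANGE_CLONE are the names
proposed above and may not exist in your headers, so both are guarded with
#ifdef. The helper names (clone_blksize_for, clone_range_aligned,
clone_or_fail) are made up for this example, and the alignment rule is
deliberately simplified: it requires both offsets and the length to be
multiples of the block size, whereas the real rules may allow an unaligned
tail at end of file.

```c
#include <sys/types.h>

#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <unistd.h>

/*
 * Return the clone block size for fd, or 0 if cloning is not supported.
 * _PC_CLONE_BLKSIZE is the name proposed in this thread; when the header
 * does not define it, cloning is treated as unsupported.
 */
static long
clone_blksize_for(int fd)
{
#ifdef _PC_CLONE_BLKSIZE
	long v = fpathconf(fd, _PC_CLONE_BLKSIZE);

	return (v > 0 ? v : 0);
#else
	(void)fd;
	return (0);
#endif
}

/*
 * Simplified alignment check (an assumption, not the final rule):
 * both offsets and the length must be multiples of the block size.
 */
static bool
clone_range_aligned(long blksize, off_t inoff, off_t outoff, off_t len)
{
	if (blksize <= 0 || inoff < 0 || outoff < 0 || len <= 0)
		return (false);
	return (inoff % blksize == 0 && outoff % blksize == 0 &&
	    len % blksize == 0);
}

/*
 * Request an all-or-nothing clone. Without the proposed flag this fails
 * with ENOSYS, mirroring the error suggested for unsupported file systems.
 */
static ssize_t
clone_or_fail(int infd, off_t inoff, int outfd, off_t outoff, size_t len)
{
#ifdef COPY_FILE_RANGE_CLONE
	return (copy_file_range(infd, &inoff, outfd, &outoff, len,
	    COPY_FILE_RANGE_CLONE));
#else
	(void)infd; (void)inoff; (void)outfd; (void)outoff; (void)len;
	errno = ENOSYS;
	return (-1);
#endif
}
```

A caller would first ask clone_blksize_for() for the block size, check the
request with clone_range_aligned(), and only then call clone_or_fail(),
falling back to a plain copy_file_range() with no flag if a byte copy is
acceptable.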