RFC: What should a copy_file_range(2) syscall do by default?

Rick Macklem rmacklem at uoguelph.ca
Sat Jun 22 16:02:01 UTC 2019


Hi,

sef@ made this comment on phabricator. I don't believe phabricator is the correct
place for "big picture" discussions, so I'm posting it here (I'm assuming sef@ doesn't
mind, since the phabricator comments are public).
sef@ wrote:
> This much work in the kernel for what //should// be user-space makes me
> twitchy... but there is lots of precedent for it, so I obviously have to get
> with the times.
>
> I've done a quick review of the code; it seems most of the complexity is in
> the hole-detection. I'm also annoyed that linux used size_t for the amount to
> copy, when off_t would have been more appropriate. But not much to do about
> that now.
>
> Having a default implementation means that user-space can't fall back if it's
> not supported, and do it better (e.g., parallel I/O). Should we also have a
> pathconf for the feature?
>
> WRT your question on -fs, I have no objections to this working
> cross-filesystem, although I think I might ask to have a flag to fail in that
> case.

Well, all I am interested in is a system call/VOP call so the NFSv4.2 client can do
a file copy locally on the NFS server instead of doing Reads/Writes across the wire.
The current code has gotten fairly complex, so my question is: how complex should
this syscall/VOP call be?

The variants I can think of are:
0) - Don't do it at all.
1) - The syscall could just do a VOP_COPY_FILE_RANGE() and return whatever error
        it returns.
        --> This implies an error return for all file systems for now, with support for
              NFSv4.2 mounts being added later (FreeBSD 13 hopefully).
2) - The syscall could fall back on a simple copy loop, but not try to deal with holes.
       --> The Linux man page mentions using copy_file_range(2) in a loop with
             lseek(SEEK_DATA)/lseek(SEEK_HOLE) for sparse files. This suggests that
             the Linux fallback code doesn't try to handle holes.
3) - The current patch which tries to handle holes and copy the entire byte range
       in one call.
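
For reference, here is a rough user-space sketch of the kind of loop the Linux
man page describes for sparse files: walk the source with
lseek(SEEK_DATA)/lseek(SEEK_HOLE) and copy only the data extents. It is not the
patch under review and not the Linux fallback code. The names sparse_copy() and
copy_extent() are made up for illustration, and the data is moved with
pread(2)/pwrite(2) rather than copy_file_range(2) so that the sketch does not
depend on the syscall being available. It assumes the output file is empty and
was not opened with O_APPEND.

```c
#define _GNU_SOURCE		/* exposes SEEK_DATA/SEEK_HOLE on Linux */
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy bytes [off, off + len) at the same offsets. */
int
copy_extent(int infd, int outfd, off_t off, off_t len)
{
	char buf[65536];

	while (len > 0) {
		size_t want = len < (off_t)sizeof(buf) ? (size_t)len : sizeof(buf);
		ssize_t n = pread(infd, buf, want, off);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return (-1);
		}
		if (n == 0)
			break;		/* source shrank underneath us */
		for (ssize_t done = 0; done < n; ) {
			ssize_t w = pwrite(outfd, buf + done,
			    (size_t)(n - done), off + done);

			if (w < 0) {
				if (errno == EINTR)
					continue;
				return (-1);
			}
			done += w;
		}
		off += n;
		len -= n;
	}
	return (0);
}

/* Hypothetical helper: copy a whole file, skipping over holes. */
int
sparse_copy(int infd, int outfd)
{
	off_t end = lseek(infd, 0, SEEK_END);
	off_t data = 0, hole;

	if (end < 0)
		return (-1);
	while (data < end) {
		data = lseek(infd, data, SEEK_DATA);
		if (data < 0) {
			if (errno == ENXIO)
				break;	/* nothing but a hole up to EOF */
			return (-1);
		}
		hole = lseek(infd, data, SEEK_HOLE);
		if (hole < 0)
			return (-1);
		if (copy_extent(infd, outfd, data, hole - data) != 0)
			return (-1);
		data = hole;
	}
	/* Set the output size so that a trailing hole is preserved. */
	return (ftruncate(outfd, end));
}
```

On a file system without real hole support, the whole file is typically
reported as one data extent, so this degenerates into the simple copy loop of
variant 2.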

As sef@ mentions, there is also the question of handling copying across multiple
file systems. I asked about this before and I only got the one response, which was
"do it". I have seen a discussion of adding cross-mount to the syscall for Linux, but
I don't know if/when the Linux one might support that. (They have not created
a "flag" option for this, as far as I've seen.)
Cross-file-system copying works without additional complexity for #2 and #3 above.

Linux discussions have talked about improved performance for local file systems
based on reduced # of system calls, but I have not seen any data to show what,
if any, performance improvement has been observed. (The slow hardware I have
to test on won't be useful for performance evaluation.)

So, what do others think w.r.t. the above? rick



