RFC: What should a copy_file_range(2) syscall do by default?

Rick Macklem rmacklem at uoguelph.ca
Sat Jun 22 23:10:24 UTC 2019


Sean Eric Fagan wrote:
>>Well, all I am interested in is a system call/VOP call so the NFSv4.2 client
>>can do a file copy locally on the NFS server instead of doing Reads/Writes
>>across the wire.
>>The current code has gotten fairly complex, so I'll try and ask "how complex"
>>this syscall/VOP call should be?
>
>In a previous life, I was responsible for one of the file copy libraries, so
>this is something I do have experience with.  (I find the copy-range syscall
>interesting; AFP had a command to copy an entire hierarchy on the server.)
>
>>       --> The Linux man page mentions using copy_file_range(2) in a loop with
>>             lseek(SEEK_DATA)/lseek(SEEK_HOLE) for sparse files. This suggests
>>             that the Linux fallback code doesn't try to handle holes.
>
>As far as I can tell, correct; instead, the copy routine looks for holes in
>user space, and copies the non-holes.
For NFSv4.2, the client can do SEEK_DATA/SEEK_HOLE against the server, although
it does imply extra RPC RTTs.
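
For reference, the userspace loop described above might look roughly like the
following. This is only a sketch under assumptions: it uses the Linux/glibc
prototype of copy_file_range(2) (hence _GNU_SOURCE), the helper name
sparse_copy() is made up for illustration, and it has not been tuned in any
way. On a file system without SEEK_DATA/SEEK_HOLE support, lseek() reports
the whole file as one data region, so the loop degrades to a plain copy.

```c
#define _GNU_SOURCE
#define _FILE_OFFSET_BITS 64
#include <errno.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy only the data regions of in_fd to out_fd,
 * leaving holes as holes. Returns 0 on success, -1 on error (errno set).
 */
static int
sparse_copy(int in_fd, int out_fd)
{
    struct stat st;
    off_t data, hole, in_off, out_off;
    ssize_t n;

    if (fstat(in_fd, &st) == -1)
        return (-1);
    /* Set the output size up front so a trailing hole is preserved. */
    if (ftruncate(out_fd, st.st_size) == -1)
        return (-1);

    for (data = 0; data < st.st_size; data = hole) {
        /* Find the start of the next data region. */
        data = lseek(in_fd, data, SEEK_DATA);
        if (data == -1)
            /* ENXIO means only a hole remains up to EOF. */
            return (errno == ENXIO ? 0 : -1);
        /* Find where that data region ends. */
        hole = lseek(in_fd, data, SEEK_HOLE);
        if (hole == -1)
            return (-1);

        in_off = out_off = data;
        while (in_off < hole) {
            /* The call may copy less than requested, so loop. */
            n = copy_file_range(in_fd, &in_off, out_fd, &out_off,
                (size_t)(hole - in_off), 0);
            if (n == -1)
                return (-1);
            if (n == 0)
                return (0);    /* input shrank underneath us */
        }
    }
    return (0);
}
```

Each data region costs two lseek() calls plus at least one copy_file_range()
call, and over NFSv4.2 each of those can turn into an RPC.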

>>Linux discussions have talked about improved performance for local file systems
>>based on reduced # of system calls, but I have not seen any data to show what,
>>if any, performance improvement has been observed. (The slow hardware I have
>>to test on won't be useful for performance evaluation.)
>
>My experience shows that it's minimal, if all it will be copying is a single
>file.  There would have to be a lot of system calls, and a *lot* of syscall
>overhead, for that to hold sway -- and they're also doing the checks for
>holes, which may end up increasing the number of system calls for them by a
>significant amount.  I'm still skeptical.
Yes, my hunch is the same.
However, I do expect a performance improvement for NFS (at least for large files),
due to savings w.r.t. RPC RTTs and avoiding data going server->client->server.
I suspect avoiding the kernel/userspace transitions may help w.r.t. fuse, too.

>Alan mentioned locking, which does buy you something, but it also means
>*locking the file while it is being copied*.  Which, for large files, is not
>so great.  I also don't think you can call any large copy atomic, unless
>you're using a single transaction for the entire copy.
I tried posting w.r.t. atomicity and didn't get a lot of responses. However, although
kib@ didn't exactly say it should be the case, he did point out that FreeBSD has
traditionally ensured atomicity of file updates for syscalls and felt that was a good
thing. As such, I've done the range locking of both files and created new primitives
to do that while avoiding deadlock.

If others have opinions w.r.t. atomicity of file data updates within this syscall, please
post to either that thread or this one.

>Anyway:  I don't have a big objection to it, other than putting a lot of work
>into a system call, but as I said I'm clearly a couple decades behind on that
>sentiment :).
Thanks for your comments. However, you didn't seem to indicate your preferred
alternative?

I, personally, don't care, but would like to find out what the "collective" thinks, rick.

