Re: RFC: Should copy_file_range(2) work for shared memory objects?

From: John F Carr <jfc_at_mit.edu>
Date: Thu, 21 Sep 2023 15:02:51 UTC
> On Sep 20, 2023, at 20:39, Rick Macklem <rick.macklem@gmail.com> wrote:
> 
> On Wed, Sep 20, 2023 at 4:54 PM John F Carr <jfc@mit.edu> wrote:
>> 
>> On Sep 20, 2023, at 16:47, Rick Macklem <rick.macklem@gmail.com> wrote:
>>> 
>>> Right now (as noted by PR#273962) copy_file_range(2)
>>> fails for shared memory objects because there is no
>>> vnode (f_vnode == NULL) for them and the code uses
>>> vnodes (including a file system specific VOP_COPY_FILE_RANGE(9)).
>>> 
>>> Do you think copy_file_range(2) should work for shared memory objects?
>>> 
>>> This would require specific handling in kern_copy_file_range()
>>> to work.  I do not think the patch would be a lot of work, but
>>> I am not familiar with the f_ops and shared memory code.
>>> 
>>> rick
>>> 
>> 
>> According to a Linux man page, some failure modes are
>> 
>>       EINVAL Either fd_in or fd_out is not a regular file.
>> 
>>       EOPNOTSUPP (since Linux 5.19) The filesystem does not support this operation.
>> 
>>       EXDEV (since Linux 5.19)
>>            The files referred to by fd_in and fd_out are not on the
>>            same filesystem, and the source and target filesystems are
>>            not of the same type, or do not support cross-filesystem copy.
>> 
>> According to the FreeBSD man page
>> 
>>     The copy_file_range() system call is expected to be compatible with the
>>     Linux system call of the same name.
>> 
> So, I guess you are advocating for sticking with "Linux compatible"?
> I'm fine with that, but we'll see what others say.
> 
> Thanks for your comments, rick
> ps; When I go look at the Linux man page, I often get an out-of-date
>     one, so I am never sure what Linux currently does. (It is also
>     confusing because some distros implement copy_file_range()
>     in their libc instead of the kernel. I think more recent Linux kernels
>     do support the syscall.)

I think Linux-compatible, or a bit better, is good enough for the system call.
If we want to support all sane copy operations, the libc function can handle
what the kernel does not.  I don't know if emulation should be the default
behavior.  The Linux man page says "glibc 2.27 provides a user-space
emulation when it is not available."
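
To make the wrapper idea concrete, here is a rough sketch of what such a
user-space fallback could look like. The names copy_with_fallback() and
copy_by_read_write() are made up for illustration, and the set of errno
values that trigger the fallback is my assumption; it is not how glibc or
FreeBSD libc actually does it. It only uses the current file offsets.

```c
#define _GNU_SOURCE /* glibc needs this to declare copy_file_range() */
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy up to len bytes with a read(2)/write(2) loop. */
static ssize_t
copy_by_read_write(int fd_in, int fd_out, size_t len)
{
    char buf[65536];
    size_t total = 0;

    while (total < len) {
        size_t want = len - total < sizeof(buf) ? len - total : sizeof(buf);
        ssize_t n = read(fd_in, buf, want);

        if (n < 0) {
            if (errno == EINTR)
                continue;
            return (-1);
        }
        if (n == 0)
            break; /* EOF on the source */

        /* write(2) may be short, so loop until the chunk is out. */
        for (ssize_t off = 0; off < n; ) {
            ssize_t w = write(fd_out, buf + off, (size_t)(n - off));

            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return (-1);
            }
            off += w;
        }
        total += (size_t)n;
    }
    return ((ssize_t)total);
}

/*
 * Hypothetical wrapper: try the system call first and fall back to
 * read/write only if the kernel refuses before any data has moved.
 * Treating EINVAL as "unsupported object" is an assumption; it can
 * also mean bad arguments.
 */
ssize_t
copy_with_fallback(int fd_in, int fd_out, size_t len)
{
    size_t total = 0;

    while (total < len) {
        ssize_t n = copy_file_range(fd_in, NULL, fd_out, NULL,
            len - total, 0);

        if (n > 0) {
            total += (size_t)n;
            continue;
        }
        if (n == 0)
            break; /* EOF on the source */
        if (errno == EINTR)
            continue;
        if (total == 0 && (errno == EINVAL || errno == EOPNOTSUPP ||
            errno == EXDEV || errno == ENOSYS))
            return (copy_by_read_write(fd_in, fd_out, len));
        return (-1);
    }
    return ((ssize_t)total);
}
```

A caller would use copy_with_fallback(fd_in, fd_out, len) in place of the
raw system call.  Whether the fallback should be silent like this, or
something the caller opts into, is the open question.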

We cannot properly support all cases with or without a library wrapper.
What if the caller specifies a seek offset on a non-seekable object?
(I recall a really obnoxious program that would signal its displeasure
with your environment by calling lseek() on /dev/tty, causing you to
be logged out when the shell could not read input.)
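
For what it is worth, a wrapper could at least probe whether a descriptor
accepts seeks before honoring a caller-supplied offset. The sketch below
uses a made-up helper name, fd_accepts_seek(), and relies only on lseek(2)
failing with ESPIPE for pipes, FIFOs and sockets. Some devices accept
lseek() without the offset meaning anything, so this is a hint and not a
guarantee.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if lseek(2) succeeds on fd, 0 if the
 * object is not seekable (ESPIPE), and -1 on any other error.
 * Seeking by 0 from SEEK_CUR leaves the file offset unchanged.
 */
int
fd_accepts_seek(int fd)
{
    if (lseek(fd, 0, SEEK_CUR) != (off_t)-1)
        return (1);
    return (errno == ESPIPE ? 0 : -1);
}
```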