Re: RFC: Should copy_file_range(2) return after a few seconds?

From: Rick Macklem <rick.macklem_at_gmail.com>
Date: Sun, 09 Nov 2025 02:09:15 UTC

On Sat, Nov 8, 2025 at 4:43 PM Konstantin Belousov <kib@freebsd.org> wrote:
>
> On Sat, Nov 08, 2025 at 03:22:37PM -0800, Rick Macklem wrote:
> > Hi,
> >
> > Peter Much reported a problem on the freebsd-fs@ mailing
> > list on Oct. 21 under the Subject: "Why does rangelock_enqueue()
> > hang for hours?".
> >
> > The problem was that he had a copy_file_range(2) copying
> > between a large NFS file and a local file that was taking 2hrs.
> > While this copy_file_range(2) was in progress, it was holding
> > a rangelock for the entire output file, causing another process
> > trying to read the output file to hang, waiting for the rangelock.
> >
> > Since copy_file_range(2) is not in any standard (just trying to
> > emulate the Linux one), there is no definitive answer w.r.t.
> > should it hold rangelocks.  However, that is how it is currently
> > coded and I, personally, think it is appropriate to do so.
> >
> > Having a copy_file_range(2) syscall take two hours is
> > definitely an unusual case, but two hours does seem
> > excessive.
> >
> > Peter tried a quick patch I gave him that limited the
> > copy_file_range(2) to 1sec and it fixed the problem
> > he was observing.
> >
> > Which brings me to the question...
> > Should copy_file_range(2) be time limited?
> > And, if the answer to this is "yes", how long do
> > you think the time limit should be?
> > (1sec, 2-5sec or ??)
> >
> > Note that the longer you allow copy_file_range(2)
> > to continue, the more efficient it will be.
> >
> > Thanks in advance for any comments, rick
>
> For me, making a syscall limited by runtime is a very strange idea.
> IMO it should not be done.
>
> What can be done, I think, is to add signal interruption points into
> the copying loop.  AFAIR we request a chunk to be copied, for some
> size of the chunk.  After the copy is done, kernel could use
> sig_intr() to check for either interruption or suspend conditions.
> If non-zero is returned, you might finish the loop earlier, reporting
> the partial copy.
>
It already interrupts when a signal like <ctrl>C is posted.

The reporter didn't want the copy (he was actually using "cat")
to terminate. He wanted to read the output file when it was
only partially copied, which doesn't work because of the
rangelock. (It appears it does work on Linux.)

A partial copy_file_range() is normal and an NFSv4.2 server
will return a partial copy after 1sec or so since there is a
general understanding (not wired down in any RFC) that
an RPC should always reply in 1-2sec.
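
Since a partial copy is a normal return, a caller has to loop in any
case. A minimal userland sketch of such a loop follows. The helper name
copy_all() and the 1 GiB per-call cap are assumptions made for
illustration, not anything taken from the patch or from cat(1):

```c
#define _GNU_SOURCE	/* exposes copy_file_range() on Linux/glibc; not needed on FreeBSD */
#include <sys/types.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Assumed per-call cap; the kernel is free to copy less than this. */
#define	COPY_CHUNK	((size_t)1 << 30)

/*
 * Hypothetical helper: copy from infd to outfd, starting at the current
 * file offsets of both descriptors, until end of file on infd.
 * Returns the total number of bytes copied, or -1 with errno set.
 */
off_t
copy_all(int infd, int outfd)
{
	off_t total = 0;
	ssize_t n;

	for (;;) {
		/* NULL offsets: use and advance the descriptors' own offsets. */
		n = copy_file_range(infd, NULL, outfd, NULL, COPY_CHUNK, 0);
		if (n > 0) {
			/* A short count is not an error; just call again. */
			total += n;
			continue;
		}
		if (n == 0)
			return (total);	/* end of input file */
		/* Any error, including EINTR, is left to the caller. */
		return (-1);
	}
}
```

With a loop like this, a time limit inside the syscall would only change
how many iterations the loop makes, not the final result of the copy.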

rick