Re: RFC: Should copy_file_range(2) return after a few seconds?


From: Konstantin Belousov <kib_at_freebsd.org>
Date: Sun, 09 Nov 2025 00:42:35 UTC

On Sat, Nov 08, 2025 at 03:22:37PM -0800, Rick Macklem wrote:
> Hi,
> 
> Peter Much reported a problem on the freebsd-fs@ mailing
> list on Oct. 21 under the Subject: "Why does rangelock_enqueue()
> hang for hours?".
> 
> The problem was that he had a copy_file_range(2) copying
> between a large NFS file and a local file that was taking 2hrs.
> While this copy_file_range(2) was in progress, it was holding
> a rangelock for the entire output file, causing another process
> trying to read the output file to hang, waiting for the rangelock.
> 
> Since copy_file_range(2) is not in any standard (just trying to
> emulate the Linux one), there is no definitive answer as to
> whether it should hold rangelocks.  However, that is how it is
> currently coded and I, personally, think it is appropriate to do so.
> 
> Having a copy_file_range(2) syscall take two hours is
> definitely an unusual case, but it does seem that it is
> excessive?
> 
> Peter tried a quick patch I gave him that limited the
> copy_file_range(2) to 1sec and it fixed the problem
> he was observing.
> 
> Which brings me to the question...
> Should copy_file_range(2) be time limited?
> And, if the answer to this is "yes", how long do
> you think the time limit should be?
> (1sec, 2-5sec or ??)
> 
> Note that the longer you allow copy_file_range(2)
> to continue, the more efficient it will be.
> 
> Thanks in advance for any comments, rick

For me, limiting a syscall by runtime is a very strange idea.
IMO it should not be done.

What can be done, I think, is to add signal interruption points to
the copying loop.  AFAIR we request that a chunk of some size be
copied.  After each chunk is copied, the kernel could use sig_intr()
to check for either interruption or suspend conditions.  If non-zero
is returned, you might finish the loop early, reporting the partial
copy.
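
To make the shape of that loop concrete, here is a rough userland sketch,
not the kernel code.  It assumes a hypothetical check_interrupt() helper
standing in for sig_intr(), a flag set from an ordinary signal handler
standing in for the kernel's pending-signal state, and memcpy() standing
in for the per-chunk copy.  The names copy_chunked, check_interrupt,
on_signal and interrupt_pending are made up for illustration.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* Set by the handler; stands in for the kernel's pending-signal state. */
volatile sig_atomic_t interrupt_pending = 0;

/* Ordinary signal handler: only records that a signal arrived. */
void on_signal(int signo)
{
	(void)signo;
	interrupt_pending = 1;
}

/*
 * Hypothetical stand-in for sig_intr(): returns non-zero when the
 * copy loop should stop early.
 */
int check_interrupt(void)
{
	return (interrupt_pending != 0);
}

/*
 * Copy up to len bytes from src to dst, at most chunk bytes at a time.
 * After each chunk, check for an interruption; if one is pending and
 * data remains, stop and return the number of bytes copied so far
 * (a partial copy).  The caller can resume from that offset.
 */
size_t copy_chunked(char *dst, const char *src, size_t len, size_t chunk)
{
	size_t done = 0;

	assert(chunk > 0);
	while (done < len) {
		size_t n = len - done < chunk ? len - done : chunk;

		memcpy(dst + done, src + done, n);
		done += n;

		/* Interruption point between chunks. */
		if (done < len && check_interrupt())
			break;
	}
	return (done);
}
```

With this shape the call is never cut off by a timer.  It only returns
short when a signal is actually pending, and the caller sees an ordinary
partial result: it can advance its offsets by the returned count and
call again, much as it would after a short write(2).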