Re: Why does rangelock_enqueue() hang for hours?

From: Rick Macklem <rick.macklem_at_gmail.com>
Date: Tue, 21 Oct 2025 13:28:20 UTC
On Tue, Oct 21, 2025 at 6:09 AM Peter 'PMc' Much
<pmc@citylink.dinoex.sub.org> wrote:
>
>
> This is 14.3-RELEASE.
>
> I am copying a file from a NFSv4 share to a local filesystem. This
> takes a couple of hours.
>
> In the meantime I want to read that partially copied file. This is
> not possible. The reading process locks up in rangelock_enqueue(),
> unkillable(!), and only after the first slow copy has completed, it
> will do its job.
>
> Even if I do the first copy to stdout with redirect to file, the
> same problem happens. I.e.:
>
>  $ cat /nfsshare/File > /localfs/File &
>  $ cat /localfs/File  --> HANGS unkillable
This is caused by "cat" using copy_file_range(2), where the
system call is taking a long time.

The piped version below makes "cat" not use copy_file_range(2).
copy_file_range(2) is interruptible, but that stops the file copy.
It also has a "return after 1sec" option.
Maybe that option should be exposed to userland and used by
"cat", "cp" and friends, at least when enabled by a command line
option. (I'll admit looking at a file while it is being copied is a bit odd?)
The whole idea behind range-lock is to prevent a read/write syscall
from seeing a partial write. It just happens that the "write" takes a long
time in this case.
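
For what it's worth, here is a rough sketch of the same workaround without
the extra pipe. It assumes that dd(1) copies with ordinary read(2)/write(2)
calls rather than copy_file_range(2), so the range lock would only be held
for one block-sized write at a time. The function name chunked_copy and the
1 MiB block size are made up for illustration, not taken from any tool:

```sh
#!/bin/sh
# Sketch only: copy SRC to DST one block at a time via dd(1).
# Assumption: dd does not use copy_file_range(2), so each write(2)
# is short and a concurrent reader is not blocked for the whole copy.
chunked_copy() {
    src=$1
    dst=$2
    # Block size is given in bytes (1 MiB) so the same spelling works
    # with both BSD and GNU dd; the value is arbitrary, not tuned.
    dd if="$src" of="$dst" bs=1048576 2>/dev/null
}

# Hypothetical usage with the paths from the report:
#   chunked_copy /nfsshare/File /localfs/File &
#   cat /localfs/File
```

A reader started while this runs would see whatever has been written so
far, which is the same behaviour as the "cat | cat" version.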

Do others have thoughts on this? rick

>
> Only if I introduce another process, the tie is avoided:
>
>  $ cat /nfsshare/File | cat > /localfs/File &
>  $ cat /localfs/File  --> WORKS
>
> I very much doubt that this is how it should be.
>
> Also, if I try to get some information about the supposed operation
> of this "rangelock" feature, search engines point me to a
> "rangelock(9)" manpage on man.freebsd.org, but that page doesn't
> seem to exist. :(
>