Re: Why does rangelock_enqueue() hang for hours?
Date: Thu, 23 Oct 2025 14:21:56 UTC
On Thu, Oct 23, 2025 at 2:54 AM Peter 'PMc' Much <pmc@citylink.dinoex.sub.org> wrote:
>
> On Wed, Oct 22, 2025 at 08:52:00AM -0700, Rick Macklem wrote:
> ! On Tue, Oct 21, 2025 at 7:50 AM Bakul Shah <bakul@iitbombay.org> wrote:
> ! >
> ! > I didn't read this thread before commenting on the forum where Peter
> ! > first raised this issue. Adding the relevant part of my comment here:
> ! > +---
> ! > By git blame cat.c we find it was added on 2023-07-08 in commit 8113cc8276.
> ! > git log 8113cc8276 says
> ! >
> ! >     cat: use copy_file_range(2) with fallback to previous behavior
> ! >
> ! >     This allows to use special filesystem features like server-side
> ! >     copying on NFS 4.2 or block cloning on OpenZFS 2.2.
> ! >
> ! > Maybe it should check that these conditions are met? That is, both files
> ! > should be remote or both files should be local for it to be really worth
> ! > it. In any case, IMHO this should not be the default behavior. Still, it
> ! > should not hang....
> !
> ! Peter, you could try the attached trivial patch (untested).
> !
> ! I'm not sure if this is a reasonable thing to do, but at least you can
> ! report back to let us know if it fixes your problem?
>
> Hi Rick,
>
> I tested the patch. And I did something more, like trying to update my
> Linux installation (which was unpleasant and didn't fully succeed) and
> have a look there. See below.
>
> The patch helps. Things on the writing side now look like this:
>
> ...
> 1.409706711 copy_file_range(0x3,0x0,0x4,0x0,0x7fffffffffffffff,0x0) = 393216 (0x60000)
> 1.216986006 copy_file_range(0x3,0x0,0x4,0x0,0x7fffffffffffffff,0x0) = 393216 (0x60000)
> 1.219576946 copy_file_range(0x3,0x0,0x4,0x0,0x7fffffffffffffff,0x0) = 393216 (0x60000)
> 1.025836739 copy_file_range(0x3,0x0,0x4,0x0,0x7fffffffffffffff,0x0) = 262144 (0x40000)
> ...
>
> More interesting, the read access runs immediately; it does not wait
> for that one second to find a gap.

I suspect that is because you did it after the first copy_file_range() call.
The 2nd and subsequent calls would not start at offset 0, so the
rangelock would not start at offset 0 either.

> But, I am still wondering: why do we do this? And then I found,
> Linux (6.12.38+kali-amd64) does not do it:
>
> $ strace cp XX XY
> ...
> copy_file_range(3, NULL, 4, NULL, 9223372035781033984, 0^Z
> [1]+  Stopped                 strace cp XX XY
> $ cp XY XZ
> $
>
> This does not block. And it does not split the copy_file_range()
> into chunks. FreeBSD 14.3 does block at this point.

As you probably already know, there is no standard for copy_file_range(2).
When I did it, the aim was to be Linux compatible, but I guess it is no
surprise that it isn't 100% compatible.
(The Linux copy_file_range(2) is a moving target. It started out as a libc
function and then its semantics changed significantly at some Linux
version. I've forgotten which version. Prior to that version, a
copy_file_range() with a len argument that went past EOF was not allowed,
if I recall correctly?)

Range locking is required for read/write (I'm fairly sure that is in the
POSIX standard for them). When I did copy_file_range(2) for FreeBSD,
others (I don't recall who) thought that it should do range locking to be
consistent with read/write, which made sense to me.

I will ask on freebsd-current@ (few read freebsd-fs@) to see what the
consensus is w.r.t. this. (I suspect the "return after 1sec" is preferred
over disabling range locking, but we'll see.)

I will also run some tests on the Linux system I have, to confirm what
their semantics are for a recent Linux kernel.
(Don't expect to see the post for a little while.)

rick

> BTW: this is another one of my creepy use-cases: freeze some
> job and forget about it - and if it happens to use cp somewhere,
> then all other reads traversing the concerned file (e.g. backup)
> would also freeze. And then after a week we wonder why we do not
> have backups.
>
> rgds,
> PMc
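[Not part of the thread's patch, but to make the chunking behavior under
discussion concrete: a minimal sketch of a userland copy loop that issues
copy_file_range(2) in bounded chunks, so that on FreeBSD each syscall (and
the rangelock it holds on the file) is short-lived rather than covering the
whole transfer. The function name copy_chunked and the 8 MB chunk size are
my own illustrative choices, not anything from cat(1) or cp(1).]

```c
#define _GNU_SOURCE             /* for copy_file_range() with glibc */
#include <unistd.h>

/* Arbitrary illustrative chunk size; tune for your workload. */
#define CHUNK ((size_t)8 * 1024 * 1024)

/*
 * Copy infd to outfd using bounded copy_file_range(2) calls.
 * Passing NULL offsets means both descriptors' file offsets are
 * advanced by each call, so the loop simply resumes where the
 * previous chunk ended.  Because each syscall copies at most CHUNK
 * bytes, any per-call range lock is released between chunks, giving
 * concurrent readers a chance to get in.
 * Returns 0 on success, -1 on error (errno set by copy_file_range).
 */
int
copy_chunked(int infd, int outfd)
{
	ssize_t n;

	while ((n = copy_file_range(infd, NULL, outfd, NULL, CHUNK, 0)) > 0)
		continue;
	return (n == 0) ? 0 : -1;	/* 0 bytes copied means EOF */
}
```

[This only illustrates the trade-off Rick describes: smaller chunks mean
shorter lock hold times but more syscalls; the in-kernel "return after
1sec" heuristic achieves a similar effect without a fixed chunk size.]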
> ! >
> ! > > On Oct 21, 2025, at 6:28 AM, Rick Macklem <rick.macklem@gmail.com> wrote:
> ! > >
> ! > > On Tue, Oct 21, 2025 at 6:09 AM Peter 'PMc' Much
> ! > > <pmc@citylink.dinoex.sub.org> wrote:
> ! > >>
> ! > >> This is 14.3-RELEASE.
> ! > >>
> ! > >> I am copying a file from an NFSv4 share to a local filesystem. This
> ! > >> takes a couple of hours.
> ! > >>
> ! > >> In the meantime I want to read that partially copied file. This is
> ! > >> not possible. The reading process locks up in rangelock_enqueue(),
> ! > >> unkillable(!), and only after the first slow copy has completed will
> ! > >> it do its job.
> ! > >>
> ! > >> Even if I do the first copy to stdout with a redirect to a file, the
> ! > >> same problem happens. I.e.:
> ! > >>
> ! > >> $ cat /nfsshare/File > /localfs/File &
> ! > >> $ cat /localfs/File            --> HANGS unkillable
> ! > >
> ! > > This is caused by "cat" using copy_file_range(2), where the
> ! > > system call is taking a long time.
> ! > >
> ! > > The patch below makes "cat" not use copy_file_range(2).
> ! > > (copy_file_range(2) is interruptible, but that stops the file copy.)
> ! > > It also has a "return after 1sec" option.
> ! > > Maybe that option should be exposed to userland and used by
> ! > > "cat", "cp" and friends, at least when enabled by a command line
> ! > > option. (I'll admit looking at a file while it is being copied is a
> ! > > bit odd?)
> ! > > The whole idea behind range locking is to prevent a read/write
> ! > > syscall from seeing a partial write. It just happens that the "write"
> ! > > takes a long time in this case.
> ! > >
> ! > > Do others have thoughts on this? rick
> ! > >
> ! > >> Only if I introduce another process is the tie avoided:
> ! > >>
> ! > >> $ cat /nfsshare/File | cat > /localfs/File &
> ! > >> $ cat /localfs/File            --> WORKS
> ! > >>
> ! > >> I very much doubt that this is how it should be.
> ! > >>
> ! > >> Also, if I try to get some information about the supposed operation
> ! > >> of this "rangelock" feature, search engines point me to a
> ! > >> "rangelock(9)" manpage on man.freebsd.org, but that page doesn't
> ! > >> seem to exist. :(
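[For readers unfamiliar with the cat.c change quoted above ("use
copy_file_range(2) with fallback to previous behavior"): a rough sketch of
that general pattern, written from scratch for illustration rather than
taken from cat.c. It tries the kernel-side copy first and drops back to a
plain read/write loop when the kernel rejects the call; the function name
copy_fd, the error set chosen, and the buffer size are my own assumptions.]

```c
#define _GNU_SOURCE             /* for copy_file_range() with glibc */
#include <errno.h>
#include <limits.h>
#include <unistd.h>

/*
 * Copy infd to outfd, preferring copy_file_range(2) so the kernel
 * (or an NFS 4.2 server, or OpenZFS block cloning) can do the work.
 * If the fast path is unsupported -- non-regular files, unsupported
 * filesystem pair, old kernel -- fall back to read/write.
 * Returns the number of bytes copied, or -1 on error.
 */
ssize_t
copy_fd(int infd, int outfd)
{
	ssize_t n, total = 0;

	/* Fast path: NULL offsets advance both fds' file offsets. */
	while ((n = copy_file_range(infd, NULL, outfd, NULL,
	    SSIZE_MAX, 0)) > 0)
		total += n;
	if (n == 0)
		return total;
	if (errno != EINVAL && errno != EXDEV &&
	    errno != ENOSYS && errno != EOPNOTSUPP)
		return -1;	/* real I/O error: do not fall back */

	/* Fallback: the "previous behavior", a plain read/write loop. */
	char buf[64 * 1024];
	while ((n = read(infd, buf, sizeof(buf))) > 0) {
		for (ssize_t off = 0; off < n; ) {
			ssize_t w = write(outfd, buf + off, n - off);
			if (w < 0)
				return -1;
			off += w;
		}
		total += n;
	}
	return (n < 0) ? -1 : total;
}
```

[Note how the fallback is what makes the fast path safe to attempt
everywhere: a pipe or socket source simply fails the first call with
EINVAL and lands in the read/write loop, which is also the path with the
short, per-buffer lock hold times that the original read/write semantics
imply.]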