Re: Why does rangelock_enqueue() hang for hours?
- In reply to: Rick Macklem : "Re: Why does rangelock_enqueue() hang for hours?"
Date: Thu, 23 Oct 2025 14:49:27 UTC
On Thu, Oct 23, 2025 at 07:21:56AM -0700, Rick Macklem wrote:
> On Thu, Oct 23, 2025 at 2:54 AM Peter 'PMc' Much
> <pmc@citylink.dinoex.sub.org> wrote:
> >
> > On Wed, Oct 22, 2025 at 08:52:00AM -0700, Rick Macklem wrote:
> > ! On Tue, Oct 21, 2025 at 7:50 AM Bakul Shah <bakul@iitbombay.org> wrote:
> > ! >
> > ! > I didn't read this thread before commenting on the forum where Peter
> > ! > first raised this issue. Adding the relevant part of my comment here:
> > ! > +---
> > ! > By git blame cat.c we find it was added on 2023-07-08 in commit 8113cc8276.
> > ! > git log 8113cc8276 says
> > ! > cat: use copy_file_range(2) with fallback to previous behavior
> > ! >
> > ! > This allows to use special filesystem features like server-side
> > ! > copying on NFS 4.2 or block cloning on OpenZFS 2.2.
> > ! >
> > ! > Maybe it should check that these conditions are met? That is, both files should be
> > ! > remote or both files should be local for it to be really worth it. In any case IMHO
> > ! > this should not be the default behavior. Still, it should not hang....
> > ! Peter, you could try the attached trivial patch (untested).
> > !
> > ! I'm not sure if this is a reasonable thing to do, but at least you can report
> > ! back to let us know if it fixes your problem?
> >
> >
> > Hi Rick,
> >
> > I tested the patch. And I did something more, like
> > trying to update my linux installation (which was unpleasant
> > and didn't fully succeed) and have a look there. See below.
> >
> > The patch helps. Things on the writing side now look like this:
> >
> > ...
> > 1.409706711 copy_file_range(0x3,0x0,0x4,0x0,0x7fffffffffffffff,0x0) = 393216 (0x60000)
> > 1.216986006 copy_file_range(0x3,0x0,0x4,0x0,0x7fffffffffffffff,0x0) = 393216 (0x60000)
> > 1.219576946 copy_file_range(0x3,0x0,0x4,0x0,0x7fffffffffffffff,0x0) = 393216 (0x60000)
> > 1.025836739 copy_file_range(0x3,0x0,0x4,0x0,0x7fffffffffffffff,0x0) = 262144 (0x40000)
> > ...
> >
> > More interesting, the read access runs immediately, it does not wait
> > for that one second to find a gap.
> I suspect that was because you did it after the first copy_file_range() call.
> The 2nd and subsequent calls would not start at offset 0, so the rangelock
> would not start at offset 0 either.
>
> >
> > But, I am still wondering: why do we do this? And then I found,
> > Linux (6.12.38+kali-amd64) does not do it:
> >
> >
> > $ strace cp XX XY
> > ...
> > copy_file_range(3, NULL, 4, NULL, 9223372035781033984, 0^Z
> > [1]+ Stopped strace cp XX XY
> > $ cp XY XZ
> > $
> >
> > This does not block. And it does not split the copy_file_range()
> > into chunks. FreeBSD 14.3 does block at this point.
> As you probably already know, there is no standard for copy_file_range(2).
> When I did it, the aim was to be Linux compatible, but I guess it is
> no surprise that it isn't 100% compatible. (The Linux copy_file_range(2)
> is a moving target. It started out as a libc function and then its semantics
> changed significantly at some Linux version. I've forgotten which version.
> Prior to that version, a copy_file_range() with a len argument that went
> past EOF was not allowed, if I recall correctly?)
>
> Range locking is required for read/write (I'm fairly sure that is in the POSIX
> standard for them). When I did copy_file_range(2) for FreeBSD others
> (I don't recall who) thought that it should do range locking to be consistent
> with read/write, which made sense to me.

See https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html

  2.9.7 Thread Interactions with Regular File Operations
  [List of functions about file io, including read() and write()]
  If two threads each call one of these functions, each call shall
  either see all of the specified effects of the other call, or
  none of them.

>
> I will ask on freebsd-current@ (few read freebsd-fs@) to see what the
> consensus is w.r.t. this.
> (I suspect the "return after 1sec" is preferred
> over disabling range locking, but we'll see.) I will also run some tests
> on the Linux system I have, to confirm what their semantics are for
> a recent Linux kernel. (Don't expect to see the post for a little while.)
>
> rick
>
> >
> > BTW: this is another one of my creepy use-cases: freeze some
> > job and forget about it - and if it happens to use cp somewhere,
> > then all other reads traversing the concerned file (e.g. backup)
> > would also freeze. And then after a week we wonder why we do not
> > have backups.
> >
> > rgds,
> > PMc
> >
> >
> > ! >
> > ! > > On Oct 21, 2025, at 6:28 AM, Rick Macklem <rick.macklem@gmail.com> wrote:
> > ! > >
> > ! > > On Tue, Oct 21, 2025 at 6:09 AM Peter 'PMc' Much
> > ! > > <pmc@citylink.dinoex.sub.org> wrote:
> > ! > >>
> > ! > >>
> > ! > >> This is 14.3-RELEASE.
> > ! > >>
> > ! > >> I am copying a file from a NFSv4 share to a local filesystem. This
> > ! > >> takes a couple of hours.
> > ! > >>
> > ! > >> In the meantime I want to read that partially copied file. This is
> > ! > >> not possible. The reading process locks up in rangelock_enqueue(),
> > ! > >> unkillable(!), and only after the first slow copy has completed, it
> > ! > >> will do its job.
> > ! > >>
> > ! > >> Even if I do the first copy to stdout with redirect to file, the
> > ! > >> same problem happens. I.e.:
> > ! > >>
> > ! > >> $ cat /nfsshare/File > /localfs/File &
> > ! > >> $ cat /localfs/File --> HANGS unkillable
> > ! > > This is caused by "cat" using copy_file_range(2), where the
> > ! > > system call is taking a long time.
> > ! > >
> > ! > > The version done below makes "cat" not use copy_file_range(2).
> > ! > > (copy_file_range(2) is interruptible, but that stops the file copy.
> > ! > > It also has a "return after 1sec" option.
> > ! > > Maybe that option should be exposed to userland and used by
> > ! > > "cat", "cp" and friends at least when enabled by a command line
> > ! > > option.
> > ! > > (I'll admit looking at a file while it is being copied is a bit odd?)
> > ! > > The whole idea behind range-lock is to prevent a read/write syscall
> > ! > > from seeing a partial write. It just happens that the "write" takes a long
> > ! > > time in this case.
> > ! > >
> > ! > > Do others have thoughts on this? rick
> > ! > >
> > ! > >>
> > ! > >> Only if I introduce another process, the tie is avoided:
> > ! > >>
> > ! > >> $ cat /nfsshare/File | cat > /localfs/File &
> > ! > >> $ cat /localfs/File --> WORKS
> > ! > >>
> > ! > >> I very much doubt that this is how it should be.
> > ! > >>
> > ! > >> Also, if I try to get some information about the supposed operation
> > ! > >> of this "rangelock" feature, search engines point me to a
> > ! > >> "rangelock(9)" manpage on man.freebsd.org, but that page doesn't
> > ! > >> seem to exist. :(
> > ! > >>
> > ! > >
> > ! > >
> >
> >
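
For readers who want to experiment with the behaviour discussed in this
thread, here is a minimal userland sketch of "copy_file_range() with
fallback to read/write". It is not the patch attached earlier in the
thread and not the code in cat.c. It only illustrates two ideas raised
above: asking for a bounded length per call instead of one huge request,
and falling back to a plain read/write loop (what the "cat | cat"
workaround amounts to) when the syscall is unavailable or fails. The
names copy_in_chunks, _write_all and CHUNK are hypothetical, and the
4 MiB cap is an arbitrary illustrative value, not a FreeBSD default.
Whether a smaller per-call length shortens the time a reader waits in
rangelock_enqueue() on a given kernel is an assumption to verify, not
something this sketch demonstrates.

```python
import os

CHUNK = 4 * 1024 * 1024  # hypothetical per-call cap, chosen for illustration


def _write_all(fd, data):
    """Write all of `data` to fd, retrying after short writes."""
    view = memoryview(data)
    while len(view) > 0:
        view = view[os.write(fd, view):]


def copy_in_chunks(src_path, dst_path, chunk=CHUNK):
    """Copy src_path to dst_path, at most `chunk` bytes per call.

    Uses os.copy_file_range() where Python exposes it; on any OSError
    it switches to read()/write() and continues from the current file
    offsets. Returns the number of bytes copied.
    """
    src = os.open(src_path, os.O_RDONLY)
    try:
        dst = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            use_cfr = hasattr(os, "copy_file_range")
            total = 0
            while True:
                if use_cfr:
                    try:
                        # Offsets are left as None, so the kernel uses and
                        # advances the descriptors' own file positions.
                        n = os.copy_file_range(src, dst, chunk)
                    except OSError:
                        use_cfr = False  # fall back to read/write
                        continue
                else:
                    buf = os.read(src, chunk)
                    _write_all(dst, buf)
                    n = len(buf)
                if n == 0:  # end of file reached
                    return total
                total += n
        finally:
            os.close(dst)
    finally:
        os.close(src)
```

Calling copy_in_chunks("/nfsshare/File", "/localfs/File") would mirror
the copy from the original report. Each call covers a bounded byte
range, so a process suspended mid-copy (the ^Z case above) can be stuck
inside at most one chunk-sized request.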