Re: Sorry to mail you directly with a NFS question...

From: Rick Macklem <rick.macklem_at_gmail.com>
Date: Mon, 29 May 2023 15:52:18 UTC
Since the reply ended up at the end of the long email, I'll
post it here as well.

On Sun, May 28, 2023 at 5:55 PM Terry Kennedy <terry-freebsd@glaver.org> wrote:
>
>    [This is the first time I'm trying to use the new FreeBSD
> list server, and it is behaving really bizarrely - it stripped
> out the attachment in my first message, and when I sent the
> attachment in a subsequent email, it REPLACED my prior email
> which has vanished. I'm trying to reconstruct what I said.
> Fortunately I still have the hung terminal windows open so
> I have that data.]
>
>    I can easily reproduce this bug by editing a file on the
> NFS filesystem, making a trivial change and doing "save and
> exit" - instant hang.
>
>    I gathered the data Rick requested which is in my previous
> post.
>
I'm afraid that nothing here indicates what the problem is,
from what I can see at a quick glance.

The only thing I can think of is that your "save and exit"
might use byte-range locking, and rpc.lockd is flaky
at best.
If file locking does not need to be seen by other clients,
you can use the "nolockd" mount option (see mount_nfs(8)) to
do the file locking locally within the client.
If other clients do need to see the locks (files being
concurrently accessed from multiple clients), then
NFSv4 does a much better job of file locking.
(Since your server is quite old, I am not sure if switching
 to NFSv4 would be feasible for you.)
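
To check whether locking is involved at all, something like the
following rough sketch could be run against a scratch file on the
NFS mount. It is only an illustration: the function name and the
scratch file name are made up here, and I am assuming the editor
takes a POSIX byte-range lock, which may not be what it really does.
The script asks for a non-blocking exclusive lock on the first byte
of the file and then releases it. If the call never returns on the
NFS mount but returns at once on a local filesystem, that would
point at rpc.lockd.

```python
import errno
import fcntl
import os
import sys


def try_byte_range_lock(path):
    """Take and release an exclusive lock on byte 0 of path.

    Returns True if the lock was granted and False if another
    process already holds a conflicting lock. Any other error
    is re-raised. (Hypothetical helper, for illustration only.)
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        try:
            # Non-blocking request for an exclusive lock on 1 byte at offset 0.
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, 1, 0, os.SEEK_SET)
        except OSError as exc:
            # A conflicting lock is reported as EACCES or EAGAIN.
            if exc.errno in (errno.EACCES, errno.EAGAIN):
                return False
            raise
        # Release the same byte range again.
        fcntl.lockf(fd, fcntl.LOCK_UN, 1, 0, os.SEEK_SET)
        return True
    finally:
        os.close(fd)


if __name__ == "__main__" and len(sys.argv) > 1:
    granted = try_byte_range_lock(sys.argv[1])
    print("lock granted" if granted else "lock held by another process")
```

Saved as, say, locktest.py (a made-up name), it would be run with
the path of a scratch file in the affected directory as its only
argument. Use a file you do not care about, since the script
creates it if it does not exist.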

Maybe others have some other ideas, rick

>    In another terminal window on the 13.2 system (165h) with
> the hang, both filesystems show up, even after the hang:
>
> (0:19) 165h:/tmp# df -h
> Filesystem             Size    Used   Avail Capacity  Mounted on
> ...
> gate:/usr/local/src    7.7G    3.3G    3.8G    46%    /usr/local/src
> gate:/sysprog           62G     22G     35G    39%    /sysprog
> ...
>
>    In that other terminal window, I can create a file with
> 'touch' (and it is indeed created, looking at the directory
> from other clients running 12.4). But any attempt to list
> the directory results in a hang:
>
> (0:22) 165h:/tmp# touch /usr/local/src/envir/foo
> (0:23) 165h:/tmp# ls /usr/local/src/envir
> load: 0.00  cmd: ls 97107 [nfs] 25.89r 0.00u 0.00s 0% 2864k
> load: 0.00  cmd: ls 97107 [nfs] 52.44r 0.00u 0.00s 0% 2864k
> load: 0.00  cmd: ls 97107 [nfs] 175.41r 0.00u 0.00s 0% 2864k
>
>    In yet another terminal window, a create + write (as opposed
> to just a "touch") hangs:
> (0:2) 165h:/sysprog/terry# echo "Testing 123" > /usr/local/src/envir/bar
> load: 0.00  cmd: tcsh 97128 [nfs] 61.17r 0.02u 0.00s 0% 4248k
> load: 0.01  cmd: tcsh 97128 [nfs] 2736.92r 0.02u 0.00s 0% 4248k
>
>    From another 12.4 client that has the same filesystem mounted,
> things continue to work normally:
>
> (0:634) new-gate:~terry# echo "Testing 123" > /usr/local/src/envir/baz
> (0:635) new-gate:~terry# cat /usr/local/src/envir/baz
> Testing 123
>
>    Based on this, I think it is a client-side problem on the
> 13.2 system.
>