Re: NFS client hang on 13.2-RELEASE-p2 on file locking / wrong interface selected

From: Rick Macklem <rick.macklem_at_gmail.com>
Date: Thu, 24 Aug 2023 14:02:30 UTC

The NLM is a fundamentally broken protocol and has never
been described by an RFC. I strongly recommend against
using it.

If the locks do not need to be visible to other clients, the
"nolockd" mount option should work.
Otherwise consider switching the mounts to NFSv4.1/4.2.
(For either of these cases, the rpc.lockd and rpc.statd daemons
 no longer need to be run.)
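
As a rough sketch of the two alternatives above, the commands below
show how the mounts might look on a FreeBSD client. The server name,
export path and mount point are placeholders, not values from this
thread, and the script only prints the commands rather than running
them. Whether minorversion=1 or 2 is appropriate depends on what the
server supports.

```shell
# Placeholders (assumptions, not from this thread).
server=nfs-server
export_path=/export
mnt=/mnt

# Option 1: stay on NFSv3 but keep locks local to this client.
# They will not be visible to other clients.
v3_cmd="mount -t nfs -o nfsv3,nolockd ${server}:${export_path} ${mnt}"

# Option 2: NFSv4.1, where locking is part of the NFS protocol itself
# and the NLM is not involved.
v4_cmd="mount -t nfs -o nfsv4,minorversion=1 ${server}:${export_path} ${mnt}"

# Print both for review; run the chosen one as root.
printf '%s\n' "$v3_cmd" "$v4_cmd"
```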

I am not familiar with the NLM code and have no interest in
trying to change it.

rick

On Wed, Aug 23, 2023 at 8:56 PM J David <j.david.lists@gmail.com> wrote:
>
> Hello,
>
> We are seeing NFS hangs on FreeBSD 13.2-RELEASE-p2 clients talking to
> a Debian bookworm NFS server using NFSv3.
>
> Whenever a process attempts to lock a file on an NFS mount, for example:
>
> lockf x sleep 3
>
> That process hangs in state "nlmrcv" and goes to 100% CPU.
>
> I found huge numbers of exchanges like this via tcpdump:
>
> 03:05:41.432581 IP 172.17.200.2.998 > 172.17.250.10.50516: UDP, length 172
> 0x0000:  4500 00c8 9afb 0000 4011 c4f9 ac11 c802  E.......@.......
> 0x0010:  ac11 fa0a 03e6 c554 00b4 1af6 04ed ac8b  .......T........
> 0x0020:  0000 0000 0000 0002 0001 86b5 0000 0004  ................
> 0x0030:  0000 0002 0000 0001 0000 001c 64e6 c8e0  ............d...
> 0x0040:  0000 0002 6332 0000 0001 bb87 0001 bb87  ....c2..........
> 0x0050:  0000 0001 0000 61a8 0000 0000 0000 0000  ......a.........
> 0x0060:  0000 0004 3873 0200 0000 0000 0000 0001  ....8s..........
> 0x0070:  0000 0002 6332 0000 0000 0020 0100 0601  ....c2..........
> 0x0080:  a0da 65c4 00d0 6316 0000 0000 0000 0000  ..e...c.........
> 0x0090:  0a00 0a00 0000 0000 c733 0800 0000 0009  .........3......
> 0x00a0:  3130 3030 3031 4063 3200 0000 0001 86a1  100001@c2.......
> 0x00b0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0x00c0:  0000 0000 0000 0025                      .......%
> 03:05:41.432632 IP 172.17.250.10.50516 > 172.17.200.2.998: UDP, length 20
> 0x0000:  4500 0030 4675 4000 4011 da17 ac11 fa0a  E..0Fu@.@.......
> 0x0010:  ac11 c802 c554 03e6 001c 1a5e 04ed ac8b  .....T.....^....
> 0x0020:  0000 0001 0000 0001 0000 0001 0000 0001  ................
> 03:05:41.432647 IP 172.17.200.2.998 > 172.17.250.10.50516: UDP, length 172
> 0x0000:  4500 00c8 9afc 0000 4011 c4f8 ac11 c802  E.......@.......
> 0x0010:  ac11 fa0a 03e6 c554 00b4 1af6 04ed ac8c  .......T........
> 0x0020:  0000 0000 0000 0002 0001 86b5 0000 0004  ................
> 0x0030:  0000 0002 0000 0001 0000 001c 64e6 c8e0  ............d...
> 0x0040:  0000 0002 6332 0000 0001 bb87 0001 bb87  ....c2..........
> 0x0050:  0000 0001 0000 61a8 0000 0000 0000 0000  ......a.........
> 0x0060:  0000 0004 3973 0200 0000 0000 0000 0001  ....9s..........
> 0x0070:  0000 0002 6332 0000 0000 0020 0100 0601  ....c2..........
> 0x0080:  a0da 65c4 00d0 6316 0000 0000 0000 0000  ..e...c.........
> 0x0090:  0a00 0a00 0000 0000 c733 0800 0000 0009  .........3......
> 0x00a0:  3130 3030 3031 4063 3200 0000 0001 86a1  100001@c2.......
> 0x00b0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0x00c0:  0000 0000 0000 0025                      .......%
> 03:05:41.432697 IP 172.17.250.10.50516 > 172.17.200.2.998: UDP, length 20
> 0x0000:  4500 0030 4676 4000 4011 da16 ac11 fa0a  E..0Fv@.@.......
> 0x0010:  ac11 c802 c554 03e6 001c 1a5e 04ed ac8c  .....T.....^....
> 0x0020:  0000 0001 0000 0001 0000 0001 0000 0001  ................
>
> A huge number of these are exchanged. Like, several million over the
> span of a couple of minutes.
>
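
For anyone who wants to read the capture above, here is a minimal
sketch that decodes the leading 32-bit words of the RPC header in each
direction. It assumes the standard ONC RPC layout from RFC 5531 and
that the RPC message starts right after the 20-byte IP header and the
8-byte UDP header. The helper name rpc_word is made up for this
example, and the hex strings are copied from the first call/reply pair
in the dump.

```shell
# Print the Nth (zero-based) 32-bit word of a hex string as decimal.
rpc_word() {
    # $1 = hex digits with no spaces, $2 = word index
    start=$(( $2 * 8 + 1 ))
    word=$(printf '%s' "$1" | cut -c "${start}-$(( start + 7 ))")
    printf '%d\n' "0x${word}"
}

# First six words of the 172-byte call: xid, msg_type, rpcvers, prog, vers, proc
call=$(printf '%s' "04ed ac8b 0000 0000 0000 0002 0001 86b5 0000 0004 0000 0002" | tr -d ' ')
# All five words of the 20-byte reply: xid, msg_type, reply_stat, reject_stat, auth_stat
reply=$(printf '%s' "04ed ac8b 0000 0001 0000 0001 0000 0001 0000 0001" | tr -d ' ')

echo "call:  prog=$(rpc_word "$call" 3) vers=$(rpc_word "$call" 4) proc=$(rpc_word "$call" 5)"
echo "reply: msg_type=$(rpc_word "$reply" 1) reply_stat=$(rpc_word "$reply" 2) reject_stat=$(rpc_word "$reply" 3) auth_stat=$(rpc_word "$reply" 4)"
# prints:
# call:  prog=100021 vers=4 proc=2
# reply: msg_type=1 reply_stat=1 reject_stat=1 auth_stat=1
```

Program 100021 is the NLM, version 4, and procedure 2 is the lock
request. In RFC 5531 numbering the reply words are REPLY, MSG_DENIED,
AUTH_ERROR and AUTH_BADCRED. That would suggest the server is rejecting
the credentials on each lock request and the client is retrying
immediately, although that reading is only an inference from this one
capture.
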
> Then the FreeBSD client system becomes unresponsive with these console messages:
>
> [nl_neigh] rtnl_lle_event: error allocating group writer
> [nl_neigh] rtnl_lle_event: error allocating group writer
> [nl_neigh] rtnl_lle_event: error allocating group writer
> [zone: mbuf] kern.ipc.nmbufs limit reached
> [nl_neigh] rtnl_lle_event: error allocating group writer
> [nl_neigh] rtnl_lle_event: error allocating group writer
> [nl_neigh] rtnl_lle_event: error allocating group writer
>
> Now, while 172.17.200.2 is an IP address on the client and
> 172.17.250.10 is an IP address on the server, that's the wrong subnet.
> The filesystem is mounted over a dedicated VLAN for NFS which has the
> IPs 172.20.200.2 and 172.20.250.10. So whatever this traffic is, it's
> using the wrong interface.
>
> I was able to work around this by removing the 172.17.0.0 network from
> the NFS server.  But that's not exactly an optimal solution.
>
> The main problem may be on the Debian side. I was able to work around
> the issue by swapping the order of the network interfaces on the
> Debian side so the NFS VLAN was the "first" interface. That suggests
> to me that something on the Debian side is choosing the first
> interface instead of the right interface.
>
> So I'll pursue that. But it'd sure be nice if the FreeBSD client
> didn't hang in this situation.
>
> Does anyone know what might be happening here or have any other
> insight that might help me track this down?
>
> Thanks for any advice!
>