[Bug 256280] FreeBSD nfsd serving zfs pool, linux nfsclient, often hangs (not observed in 12-stable)

From: <bugzilla-noreply_at_freebsd.org>
Date: Mon, 31 May 2021 05:20:04 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=256280

            Bug ID: 256280
           Summary: FreeBSD nfsd serving zfs pool, linux nfsclient, often
                    hangs (not observed in 12-stable)
           Product: Base System
           Version: 13.0-STABLE
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: misc
          Assignee: bugs@FreeBSD.org
          Reporter: yuan.mei@gmail.com

I am observing frequent NFS mount hangs after upgrading my NAS to 13-stable.
The FreeBSD NAS exports a ZFS pool through nfsd.  A Gentoo Linux box is
connected to this NAS via a 10GbE fiber link.  Once in a while, perhaps when
the ZFS load gets high (afpd is running), NFS access from the Linux side hangs,
then recovers after a few minutes.  The following message is printed in Linux's
dmesg:

May 30 22:04:34 mayhome kernel: nfs: server 192.168.3.51 not responding, still trying

After a while, a few minutes or so, access recovers:

May 30 22:06:35 mayhome kernel: nfs: server 192.168.3.51 OK
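
As a rough check on how long this particular hang lasted, the two timestamps
quoted above can be subtracted.  A minimal Python sketch; the year is an
assumption (syslog timestamps omit it), the helper names are made up for
illustration, and month parsing assumes an English/C locale:

```python
from datetime import datetime

# The two kernel log lines quoted above (the first one unwrapped).
LOG_DOWN = ("May 30 22:04:34 mayhome kernel: nfs: server 192.168.3.51 "
            "not responding, still trying")
LOG_UP = "May 30 22:06:35 mayhome kernel: nfs: server 192.168.3.51 OK"


def syslog_time(line, year=2021):
    """Parse the leading 'Mon DD HH:MM:SS' stamp; the year is assumed."""
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def outage_seconds(down_line, up_line):
    """Seconds between the 'not responding' and the 'OK' message."""
    delta = syslog_time(up_line) - syslog_time(down_line)
    return int(delta.total_seconds())


print(outage_seconds(LOG_DOWN, LOG_UP))  # 121
```

For this pair of messages the gap is about two minutes.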

This behavior has only been observed since updating the NAS to 13-stable via
the usual buildworld/buildkernel procedure.  The Linux side remains the same,
and no hardware changed on either side.  12-stable did not exhibit any of this.

The NIC on the NAS that serves NFS is:

t5nex0: <Chelsio T520-SO> mem 0xdd300000-0xdd37ffff,0xdc000000-0xdcffffff,0xdd884000-0xdd885fff irq 16 at device 0.4 on pci1
cxl0: <port 0> on t5nex0
cxl0: Ethernet address: 00:07:43:31:9c:80
cxl0: 8 txq, 8 rxq (NIC); 8 txq (TOE), 2 rxq (TOE)
cxl1: <port 1> on t5nex0
cxl1: Ethernet address: 00:07:43:31:9c:88
cxl1: 8 txq, 8 rxq (NIC); 8 txq (TOE), 2 rxq (TOE)
t5nex0: PCIe gen2 x8, 2 ports, 22 MSI-X interrupts, 54 eq, 21 iq

Linux NFS mount flags:

/home from 192.168.3.51:/mnt/nashome
 Flags:
rw,relatime,vers=3,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.3.51,mountvers=3,mountport=855,mountproto=udp,local_lock=all,addr=192.168.3.51
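
To make the relevant options easier to read, the flags string above can be
split into key/value pairs.  A minimal Python sketch; the function name is
hypothetical, and the only outside fact used is that nfs(5) on Linux expresses
timeo in tenths of a second:

```python
# Mount flags exactly as reported above.
FLAGS = ("rw,relatime,vers=3,rsize=131072,wsize=131072,namlen=255,hard,"
         "proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.3.51,"
         "mountvers=3,mountport=855,mountproto=udp,local_lock=all,"
         "addr=192.168.3.51")


def parse_mount_flags(flags):
    """Split 'a,b=1,c=2' into a dict; bare flags such as 'hard' map to True."""
    opts = {}
    for item in flags.split(","):
        key, sep, value = item.partition("=")
        opts[key] = value if sep else True
    return opts


opts = parse_mount_flags(FLAGS)
timeo_seconds = int(opts["timeo"]) / 10  # timeo is in tenths of a second

print(opts["proto"], opts["vers"], timeo_seconds, opts["retrans"])
```

This shows an NFSv3 hard mount over TCP with a 60-second timeo and retrans=2.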
