From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 261291] ESX NFS4.1 client hangs, server never responds to EXCHANGE_ID/CREATE_SESSION
Date: Tue, 18 Jan 2022 15:26:49 +0000
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 13.0-STABLE
X-Bugzilla-Who: rmacklem@FreeBSD.org
X-Bugzilla-Status: New
List-Archive: https://lists.freebsd.org/archives/freebsd-bugs

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=261291

--- Comment #4 from Rick Macklem ---
Hmm. Took a look and it looks like a variant of the TCP bug.

You'll notice that the last NFS reply for the NFSv4 connection (port #805 on
the client) shows up at packet #653. After that, all there is from the FreeBSD
end are ACKs, as you said. The NFSv3 RPCs near the end of the trace are done on
other TCP connections (port #800 and #804) and they work.

Can you roll back to before rscheff@'s fix and then revert r367492?
(See PR#256280.) In other words, get that code back to its pre-r367492 state.
--> The pre-r367492 code has worked ok, literally for decades.

We believe that rscheff@'s fix is ok because otis@ did not observe a hang
during two weeks of testing, but that doesn't guarantee it fixed the problem.

Another possibility is that the nfsd threads are getting hung trying to do
some RPC around packet #653, but I would have expected that to result in all
nfsd threads hung eventually (and the server obviously is not in that state,
since RPCs on other connections are still working).

If it happens again, do these commands on the FreeBSD server:
# ps axHl    <-- to look for "hung" nfsd threads
# netstat -a <-- to look at the TCP connection for the broken client.
   (If it is in ESTABLISHED state with a non-0 Recv-Q, the rscheff@ patch
    has not fixed the problem.)
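
The netstat check described above (an ESTABLISHED connection with a non-0 Recv-Q) could be picked out of the output with a small filter. This is only a sketch and not part of the original comment: the function name stuck_conns is hypothetical, and it assumes the usual FreeBSD netstat column order of Proto, Recv-Q, Send-Q, Local Address, Foreign Address, (state).

```shell
#!/bin/sh
# Sketch: list ESTABLISHED TCP connections whose Recv-Q is non-zero.
# Reads netstat-style lines on stdin. Assumed field layout (hypothetical
# helper, not from the bug report):
#   $1=Proto $2=Recv-Q $3=Send-Q $4=Local $5=Foreign $6=state
stuck_conns() {
    awk '$1 ~ /^tcp/ && $6 == "ESTABLISHED" && ($2 + 0) > 0 {
        # Print local address, foreign address, and the queued byte count.
        print $4, $5, "Recv-Q=" $2
    }'
}
```

On the FreeBSD server this might be used as `netstat -an -p tcp | stuck_conns`, where `-n` (numeric addresses) and `-p tcp` (TCP only) are added here to keep the output easy to parse; the comment itself only asks for `netstat -a`. Running it a few times may help distinguish a momentarily busy socket from one that stays stalled.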