From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 251347] NFS hangs on client side when mounted from outside in Jail Tree (BROKEN NFS SERVER OR MIDDLEWARE)
Date: Wed, 08 Sep 2021 00:56:28 +0000
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 12.1-STABLE
X-Bugzilla-Severity: Affects Only Me
X-Bugzilla-Who: rmacklem@FreeBSD.org
X-Bugzilla-Status: New
X-Bugzilla-Assigned-To: bugs@FreeBSD.org

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=251347

--- Comment #13 from Rick Macklem ---
Ok, let me try to explain what the "...BROKEN MIDDLEWARE OR.." message means.

There are certain file attributes, such as fileno (think i-node#), that should
*never change*. When the NFS client receives file attributes where fileno for a
given file has changed, it knows something is "badly broken".

One cause of this was a middleware box (hardware/software that sits between the
NFS client and NFS server in the network infrastructure) that could fail.
- This "middleware box" cached NFS requests/replies. If it saw a request from
  the NFS client for attributes for the same file, it replied to the Getattr
  with cached attributes.
  --> This reduced NFS server load, since the NFS server never saw the Getattr
      RPC request.
Such a technology existed and would sometimes reply with bogus attributes for a
different file.

What was this device called? I have no idea. The guy who told me about this
gave no details w.r.t. vendor/product/... (I assumed he was under NDA and could
not disclose details beyond this broken device generating the above problem.)
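
To illustrate the fileno check described above, here is a minimal Python sketch.
It is not the FreeBSD client code: the names Attrs, AttrCache and update are
hypothetical, and the choice to reject the reply and keep the old attributes is
an assumption made only for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Attrs:
    """Subset of NFS file attributes; only fileno matters for this sketch."""
    fileno: int
    size: int


class AttrCache:
    """Hypothetical client-side attribute cache keyed by an opaque file handle."""

    def __init__(self) -> None:
        self._by_handle: Dict[bytes, Attrs] = {}

    def update(self, handle: bytes, new: Attrs) -> bool:
        """Store new attributes; return False if fileno changed for this handle."""
        old: Optional[Attrs] = self._by_handle.get(handle)
        if old is not None and old.fileno != new.fileno:
            # fileno should never change for a given file, so the reply is
            # treated as bogus and the cached attributes are left untouched
            # (an assumption of this sketch, not documented client behavior).
            return False
        self._by_handle[handle] = new
        return True
```

Other attributes such as size may change freely between replies; only a fileno
mismatch for the same file handle is treated as a sign that something between
client and server is broken.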
Since it seems that the FreeBSD server is not broken in this regard (I would
see a lot more bug reports about this if it were), what else might cause this
to happen (i.e., fileno mysteriously changing)?

Here are some unlikely, but possible, theories:
- Flaky memory in the NFS server that sometimes flips a bit that happens to be
  used to store the "fileno" attribute.
- Flaky network interface transmit side that flips a bit before calculating
  the network checksum, so that the network checksum succeeds.
  --> It would seem that most garbled network packets would be caught by
      checksum failures, but checksums are not infallible.
You may be able to dream up more, mostly within the network fabric between the
client<-->server.

Given how unlikely these latter possibilities are, you can see why the known
case of the "broken middleware box" gets mentioned in the message.
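
The transmit-side theory can be made concrete with a small Python sketch. It
assumes a 16-bit ones' complement checksum in the style of the Internet
checksum; the functions send, flip_bit and receiver_accepts are hypothetical
and only model the ordering of corruption and checksum calculation, not any
real network interface.

```python
from typing import Tuple


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum over big-endian 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with one bit inverted."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


def send(payload: bytes, corrupt_before_checksum: bool) -> Tuple[bytes, int]:
    """Model a sender; optionally flip a bit before the checksum is computed."""
    if corrupt_before_checksum:
        payload = flip_bit(payload, 0)
    return payload, internet_checksum(payload)


def receiver_accepts(payload: bytes, checksum: int) -> bool:
    """The receiver can only verify that payload and checksum agree."""
    return internet_checksum(payload) == checksum
```

Because the checksum is computed over the already corrupted bytes, the receiver
has nothing to compare against and accepts the packet. A flip that happens
after the checksum is computed is a different case, and a single flipped bit
there is detected by this checksum.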