[Bug 260011] Unresponsive NFS mount on AWS EFS

From: <bugzilla-noreply_at_freebsd.org>
Date: Sun, 22 May 2022 00:39:57 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=260011

--- Comment #13 from Rick Macklem <rmacklem@FreeBSD.org> ---
Created attachment 234101
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=234101&action=edit
mark session slots bad and set session defunct when all slots bad

Ok, I looked at the attachments...
- There were 154 ExchangeIDs. That's about 153 more than there
  should be. These should only occur once when doing the mount,
  plus once after each server reboot. (A way to watch this count
  is sketched just after this list.)
  - Unfortunately, Amazon EFS has a fundamental design flaw: a new
    TCP connection may be routed to a different "cluster" (whatever
    Amazon considers a cluster to be), and that cluster knows none
    of the open/lock state, so it acts like a rebooted NFSv4 server.
    There may be other things that cause this as well.
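
A quick way to watch for these (my sketch, not from the attachments;
the exact nfsstat output layout varies across FreeBSD versions, so
the grep pattern is a guess) is the ExchangeID count in the extended
NFS client statistics:

  # If this count keeps climbing after the initial mount, recovery
  # cycles are happening. nfsstat prints the counts on the line
  # below the row of op names, hence the -A 1.
  while true; do
      date
      nfsstat -E -c | grep -i -A 1 ExchangeID
      sleep 60
  done
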
To use it reliably, you need to avoid these ExchangeIDs (recovery cycles).
If you monitor the TCP connection to the server via repeated "netstat -a"
calls and you see the connection change (a different client port#), then
that reconnect is what triggers the problem (because of the Amazon EFS
design flaw).
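
Something like this would log each connection change (a rough sketch;
198.51.100.7 is just a placeholder for your EFS mount target address):

  # Poll the TCP connection to the EFS server; a new local (client)
  # port means the connection was replaced, which on EFS can mean a
  # different cluster and a server-reboot-style recovery cycle.
  server=198.51.100.7    # placeholder address
  prev=""
  while true; do
      cur=$(netstat -an -p tcp | grep "$server" | awk '{print $4}')
      if [ "$cur" != "$prev" ]; then
          echo "$(date): connection is now: $cur"
          prev="$cur"
      fi
      sleep 10
  done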

Using "soft,intr" is asking for trouble, because an interrupted syscall
leaves the file system in a non-deterministic state. Something called
sessions maintains "exactly once" RPC semantics and this breaks the
session slot. Once all the session slots are broken, the client must
do one of these recoveries.
--> It is much better to use hard mounts and "umount -N" if/when a mount
    point is hung.
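
For example (a sketch; the server name and mount point are
placeholders):

  # Hard mount (the default; just leave "soft" and "intr" off):
  mount -t nfs -o nfsv4,minorversion=1 \
      fs-12345678.efs.us-east-1.amazonaws.com:/ /mnt/efs

  # If the mount point hangs later, forcibly dismount it instead
  # of interrupting the stuck syscalls:
  umount -N /mnt/efs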

In this case, the client is stuck partway through one of these recoveries,
because the "nfscl" thread that does the recovery is itself stuck waiting
for a session slot. I'm not sure how that can happen, since the session
would normally be marked "defunct" so that "nfscl" would not wait for a
slot in that session to become available.
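
If you want to see where "nfscl" is blocked, its kernel stack can be
dumped (a diagnostic sketch using standard FreeBSD tools, not something
from the attachments):

  # "nfscl" runs as a kernel thread; pgrep finds its pid and
  # procstat -kk prints its kernel stack, which should show the
  # sleep it is stuck in (e.g. waiting on a session slot).
  procstat -kk $(pgrep nfscl)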

I have attached a patch (above) that might help.
It does two things differently...
- It scans through all sessions looking for a match to mark defunct,
  instead of just doing the first/current one. I cannot think of how
  a new session could be created without the previous one being marked
  defunct, but since your "ps axHl" output suggests that happens, this
  might fix the problem.
- It keeps track of bad slots (caused by a soft,intr RPC failing without
  completing) and marks the session defunct when all slots are bad.
  This might make "soft,intr" mounts work better.

If you can try the patch and it improves the situation, it could be
considered for a FreeBSD commit. I doubt it will ever be committed
otherwise, because I have no way of reproducing what you are seeing.

-- 
You are receiving this mail because:
You are the assignee for the bug.