amd + NFS reconnect = ICMP storm + unkillable process.

Artem Belevich art at freebsd.org
Thu Aug 25 23:00:44 UTC 2011


On Wed, Jul 6, 2011 at 4:50 AM, Martin Birgmeier <la5lbtyi at aon.at> wrote:
> Hi Artem,
>
> I have exactly the same problem as you are describing below, also with quite
> a number of amd mounts.
>
> In addition to the scenario you describe, another way this happens here
> is when downloading a file via firefox to a directory currently open in
> dolphin (KDE file manager). This will almost surely trigger the symptoms
> you describe.
>
> I've had 7.4 running on the box before; now, with 8.2, this has
> started to happen.
>
> Alas, I don't have a solution.

I may be on to something. Here's what seems to be happening in my case:

* A process that's in the middle of a syscall accessing an amd
mountpoint gets interrupted.
* If the syscall was restartable, msleep() at the beginning of the
get_reply: loop in clnt_dg_call() returns ERESTART.
* ERESTART results in clnt_dg_call() returning with RPC_CANTRECV.
* clnt_reconnect_call() then tries to reconnect, msleep() eventually
fails with ERESTART in clnt_dg_call() again, and the whole cycle
repeats for a while (see the sketch below).
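
To make the feedback loop easier to see, here's a tiny user-space model
of what I think is going on. It is only a sketch -- the names and
constants are made up and this is not the actual sys/rpc code -- but it
shows why mapping ERESTART to RPC_CANTRECV keeps the client retrying,
while mapping it to RPC_INTR lets the call give up:

/*
 * Toy user-space model of the loop described above.  NOT the real
 * sys/rpc code; all names and constants are invented for illustration.
 */
#include <stdio.h>

enum toy_stat { TOY_RPC_SUCCESS, TOY_RPC_INTR, TOY_RPC_CANTRECV };

#define	TOY_EINTR	1
#define	TOY_ERESTART	2

/* Pretend the sleep keeps getting interrupted by a pending signal. */
static int
toy_msleep(void)
{
	return (TOY_ERESTART);
}

/* Stand-in for clnt_dg_call(): turn the sleep error into an RPC status. */
static enum toy_stat
toy_dg_call(int treat_erestart_as_intr)
{
	int error = toy_msleep();

	if (error == TOY_EINTR ||
	    (treat_erestart_as_intr && error == TOY_ERESTART))
		return (TOY_RPC_INTR);		/* caller gives up */
	return (TOY_RPC_CANTRECV);		/* caller reconnects, retries */
}

/*
 * Stand-in for clnt_reconnect_call(): keep retrying while the transport
 * "looks" broken.  In the real client each retry also reconnects, which
 * is where the ICMP port-unreachable storm comes from.
 */
static int
toy_reconnect_call(int treat_erestart_as_intr)
{
	int attempts = 0;
	enum toy_stat stat;

	do {
		stat = toy_dg_call(treat_erestart_as_intr);
		attempts++;
	} while (stat == TOY_RPC_CANTRECV && attempts < 100000);

	return (attempts);
}

int
main(void)
{
	printf("ERESTART -> RPC_CANTRECV: %d attempts (capped)\n",
	    toy_reconnect_call(0));
	printf("ERESTART -> RPC_INTR:     %d attempt(s)\n",
	    toy_reconnect_call(1));
	return (0);
}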

I'm not familiar enough with the RPC code, but looking at clnt_vc.c
and the other RPC transports, it appears that both EINTR and ERESTART
should translate into an RPC_INTR error. In clnt_dg.c, however, that's
not the case, and that is what seems to make accesses to amd mounts
hang.
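
For reference, the corresponding check on the receive path in clnt_vc.c
looks roughly like the fragment below (paraphrased from memory, not a
verbatim quote -- please check the real source). It is this treatment
of ERESTART that clnt_dg.c appears to be missing:

	if (error == EINTR || error == ERESTART)
		errp->re_status = stat = RPC_INTR;
	else
		errp->re_status = stat = RPC_CANTRECV;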

The following patch (against RELENG-8 @ r225118) seems to have fixed
the issue for me. With the patch I no longer see the hangs or ICMP
storms in the test case that could reliably reproduce the issue within
minutes. Let me know if it helps in your case.

--- a/sys/rpc/clnt_dg.c
+++ b/sys/rpc/clnt_dg.c
@@ -636,7 +636,7 @@ get_reply:
 		 */
 		if (error != EWOULDBLOCK) {
 			errp->re_errno = error;
-			if (error == EINTR)
+			if (error == EINTR || error == ERESTART)
 				errp->re_status = stat = RPC_INTR;
 			else
 				errp->re_status = stat = RPC_CANTRECV;

--Artem

>
> We should probably file a PR, but I don't even know where to assign it.
> Amd does not seem to be much maintained; it's probably using some
> old-style mounts (it never mounts anything via IPv6, for example).
>
> Regards,
>
> Martin
>
>> Hi,
>>
>> I wonder if someone else ran into this issue before and, maybe, have a
>> solution.
>>
>> I've been running into a problem where access to filesystems mounted
>> with amd wedges processes in an unkillable state and produces an ICMP
>> storm on the loopback interface. I've managed to narrow it down to the
>> NFS reconnect, but that's where I ran out of ideas.
>>
>> Usually the problem happens when I abort a parallel build job in an
>> i386 jail on FreeBSD-8/amd64 (r223055). When the build job is killed,
>> now and then I end up with one process consuming 100% of CPU time on
>> one of the cores. At the same time I get a lot of messages on the
>> console saying "Limiting icmp unreach response from 49837 to 200
>> packets/sec" and the loopback traffic goes way up.
>>
>> As far as I can tell here's what's happening:
>>
>> * My setup uses a lot of filesystems mounted by amd.
>> * amd itself pretends to be an NFS server running on the localhost and
>> serving requests for amd mounts.
>> * Now and then amd seems to change the ports it uses. Beats me why.
>> * The problem seems to happen when some process is about to access an
>> amd mountpoint just as the amd instance disappears from the port it
>> used to listen on. In my case it does correlate with interrupted
>> builds, but I have no clue why.
>> * The NFS client detects the disconnect and tries to reconnect using
>> the same destination port.
>> * That generates an ICMP port-unreachable response, and the reconnect
>> call returns almost immediately.
>> * We try to reconnect again, and again, and again....
>> * The process in this state is unkillable.
>>
>> Here's what the stack of the 'stuck' process looks like in those rare
>> moments when it gets to sleep:
>> 18779 100511 collect2         -                mi_switch+0x176
>> turnstile_wait+0x1cb _mtx_lock_sleep+0xe1 sleepq_catch_signals+0x386
>> sleepq_timedwait_sig+0x19 _sleep+0x1b1 clnt_dg_call+0x7e6
>> clnt_reconnect_call+0x12e nfs_request+0x212 nfs_getattr+0x2e4
>> VOP_GETATTR_APV+0x44 nfs_bioread+0x42a VOP_READLINK_APV+0x4a
>> namei+0x4f9 kern_statat_vnhook+0x92 kern_statat+0x15
>> freebsd32_stat+0x2e syscallenter+0x23d
>>
>> * Usually some timeout expires in a few minutes, the process dies, the
>> ICMP storm stops, and the system is usable again.
>> * On occasion the process is stuck forever and I have to reboot the box.
>>
>> I'm not sure who's to blame here.
>>
>> Is the automounter at fault for disappearing from the port it was
>> supposed to listen on?
>> Is NFS guilty of blindly trying to reconnect on the same port and not
>> giving up sooner?
>> Should I flog the operator (a.k.a. myself) for misconfiguring
>> something (what?) in amd or NFS?
>>
>> More importantly -- how do I fix it?
>> Any suggestions on fixing/debugging this issue?
>>
>> --Artem
>

