igb(4) watchdog timeout, lagg(4) fails

Harald Schmalzbauer h.schmalzbauer at omnilan.de
Sun Jan 11 15:36:33 UTC 2015


 Regarding Jack Vogel's message of 10.01.2015 19:32 (localtime):
> Did you say this system is a VM under ESX?

Yes, but igb(4) isn't used as a VF (SR-IOV); it is passed through via VT-d.
As mentioned, the setup never showed any unexpected errors and has
reliably withstood single-component failures, as designed, for
more than one year.
But now the watchdog timeout leaves one igb(4) NIC in a non-operational
state, and lagg(4) doesn't notice that there's a problem with that NIC :-(
I think a watchdog reset should bring if_igb down and only bring it up
again once it's confirmed that the reset succeeded and the NIC is
operational again.
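
The policy suggested above can be written down as a tiny decision function.
This is only a toy model in Python, not actual if_igb driver code; the names
PortState, state_after_watchdog_reset, reset_ok and nic_operational are
hypothetical and exist purely for illustration.

```python
from enum import Enum


class PortState(Enum):
    UP = "up"
    DOWN = "down"


def state_after_watchdog_reset(reset_ok: bool, nic_operational: bool) -> PortState:
    """Toy model of the proposed policy (hypothetical, not driver code).

    The port is taken down on a watchdog reset and only returns to UP
    when the reset succeeded AND the NIC was confirmed operational.
    In every other case it stays DOWN, so that lagg(4) can see the
    failure and stop using the port.
    """
    if reset_ok and nic_operational:
        return PortState.UP
    return PortState.DOWN
```

With such a rule, a reset that completes but leaves the NIC dead would keep
the port down instead of reporting a healthy link.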

Or lagg(4) would need to do some line testing. That is done with LACP, but
in that environment I don't have stackable LACP available. So I'm left
with l2hash…
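
What "line testing" without LACP could look like, as a rough Python sketch.
It is an assumption-laden illustration only: lagg(4) offers nothing like
this, and the probe mechanism, the function name ports_to_disable and the
threshold of three failed probes are all hypothetical.

```python
from typing import Dict, List


def ports_to_disable(probe_history: Dict[str, List[bool]],
                     threshold: int = 3) -> List[str]:
    """Return the ports whose last `threshold` probes all failed.

    probe_history maps a port name to its probe results, oldest first
    (True = probe answered, False = probe lost). How a probe is sent
    over one specific member port is deliberately left out here.
    """
    failed = []
    for port, results in probe_history.items():
        recent = results[-threshold:]
        # Require a full window of failures before declaring the port dead.
        if len(recent) == threshold and not any(recent):
            failed.append(port)
    return sorted(failed)
```

For example, a history of {"igb0": [True, False, False, False], "igb1":
[True, True, True]} would flag only igb0 for removal from the lagg.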

Regarding the timeout, I think I found the solution, but it's not
confirmed yet. Besides the FreeBSD update from 9.1 to 10.1 (-stable), I
noticed that the VT-d routing changed. So I spread the PCI slot numbers
differently to make igb0 share its virtual APIC line with some other
cards. No more watchdog timeouts so far…

Thanks,

-Harry
