igb(4) watchdog timeout, lagg(4) fails

Harald Schmalzbauer h.schmalzbauer at omnilan.de
Thu Apr 23 08:21:27 UTC 2015


 Regarding Harald Schmalzbauer's message from 20.02.2015 14:17 (localtime):
(https://lists.freebsd.org/pipermail/freebsd-stable/2015-February/081810.html)
>  Regarding Harald Schmalzbauer's message from 11.02.2015 20:48
> (localtime):
>>  Regarding Jack Vogel's message from 11.02.2015 18:31 (localtime):
>>> tdh and tdt mean the head and tail indices of the ring, and these
>>> values are
>>> obviously severely borked :)

Hello Jack,

could you find some time to have a look at this problem? The reported
values don't bother me; what does is the watchdog timeout, which happens on
NICs that are PCIe-connected via the PCH. Please see my previous findings. I
think the most significant hint is the link_irq value, which becomes garbage
at the first watchdog timeout occurrence, as previously described:

>> For the record: Rebooting the machine (ESXi guest-only!) brought the
>> stalled igb1 back to operation.
>> The guest has 2 igb (kawela) ports, one from a NIC (Intel ET Dual Port
>> 82576) @ CPU-PCIe and the second port from an identical NIC, but connected
>> via PCH-PCIe.
>> The watchdog timeout problem only occurs with the port from the
>> PCH-PCIe-connected NIC (falsified)!
>> After the reboot the suspicious "dev.igb.1.link_irq=848" turned into:
>> dev.igb.0.link_irq: 3
>> dev.igb.1.link_irq: 4
> Jack,
>
> I'd like to let you know that "dev.igb.1.link_irq" again shows garbage
> after the watchdog timeout problem occurred again:
> dev.igb.1.link_irq: 1458
>
> I can imagine that resetting goes wrong and ends in loss of link_irq.
> I now have to reboot the guest to get igb1 back to a working state, then
> the link_irq will show "4" again, but I can't tell you which came first,
> the timeout reset or the "link_irq" jam. I guess the latter can't be the
> case, but I have no idea about the code.
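
To help pin down which comes first, the timeout reset or the link_irq jump, the counter could be logged periodically and compared against the watchdog messages. A minimal sketch, assuming sysctl(8) prints "name: value" lines as quoted above; the function names and the threshold of 100 are hypothetical choices for illustration, not anything taken from the igb(4) driver:

```python
import re
import subprocess

# Matches lines such as "dev.igb.1.link_irq: 1458" (format as quoted above).
LINK_IRQ_RE = re.compile(r"^dev\.igb\.(\d+)\.link_irq:\s*(\d+)$")


def parse_link_irq(text):
    """Return {unit: count} parsed from sysctl output text."""
    counts = {}
    for line in text.splitlines():
        m = LINK_IRQ_RE.match(line.strip())
        if m:
            counts[int(m.group(1))] = int(m.group(2))
    return counts


def suspicious_units(counts, threshold=100):
    """Return units whose link_irq count exceeds an assumed threshold.

    The threshold is arbitrary; on the affected guest the healthy
    values were 3 and 4, the bad ones 848 and 1458.
    """
    return sorted(u for u, c in counts.items() if c > threshold)


def read_link_irq(units=(0, 1)):
    """Query the live counters via sysctl(8); only works on the FreeBSD host."""
    names = [f"dev.igb.{u}.link_irq" for u in units]
    out = subprocess.run(["sysctl", *names], capture_output=True,
                         text=True, check=True).stdout
    return parse_link_irq(out)
```

Calling read_link_irq() from a cron job or a loop and writing timestamped results to a log would show whether the counter jumps before or after the "watchdog timeout" line appears in dmesg.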


Thanks for any help; currently my lagg setup is permanently degraded :-(
It would be nice to have it back in a working state :-)

-Harry
