igb network lockups

Nick Rogers ncrogers at gmail.com
Mon Mar 4 16:41:58 UTC 2013


On Sun, Mar 3, 2013 at 2:14 AM, Sepherosa Ziehau <sepherosa at gmail.com> wrote:
> On Sat, Mar 2, 2013 at 12:18 AM, Nick Rogers <ncrogers at gmail.com> wrote:
>> On Fri, Mar 1, 2013 at 8:04 AM, Nick Rogers <ncrogers at gmail.com> wrote:
>>> FWIW I have been experiencing a similar issue on a number of systems
>>> using the em(4) driver under 9.1-RELEASE. This is after upgrading from
>>> a snapshot of 8.3-STABLE. My systems use PF+ALTQ as well. The symptoms
>>> are: the interface stops passing traffic until the system is rebooted. I
>>> have not yet been able to gain access to the systems to dig around
>>> (after they have crashed), however my kernel/network settings are
>>> properly tuned (high mbuf limit, hw.em.rxd/txd=4096, etc). It seems to
>>> happen about once a day on systems with a sustained ~50Mb/s of
>>> traffic.
>>>
>>> I realize this is not much to go on but perhaps it helps. I am
>>> debating trying the e1000 driver in the latest CURRENT on top of
>>> 9.1-RELEASE. I noticed the Intel shared code was updated about a week
>>> ago. Might this change, or perhaps another change to e1000 since
>>> 9.1-RELEASE, improve stability?
>>>
>>> Thanks.
>>
>> Here's the relevant pciconf output:
>>
>> em0 at pci0:1:0:0: class=0x020000 card=0x10d315d9 chip=0x10d38086 rev=0x00 hdr=0x00
>>     vendor     = 'Intel Corporation'
>>     device     = '82574L Gigabit Network Connection'
>>     class      = network
>>     subclass   = ethernet
>>     cap 01[c8] = powerspec 2  supports D0 D3  current D0
>>     cap 05[d0] = MSI supports 1 message, 64 bit
>>     cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
>>     cap 11[a0] = MSI-X supports 5 messages in map 0x1c enabled
>> ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
>
> For the 82574L, i.e. the chip supported by em(4), MSI-X must _not_ be
> enabled; it is simply broken (you can check the 82574 errata on
> Intel's website to confirm this).

Thanks. So on FreeBSD 9.1-RELEASE it is advisable to set
hw.em.enable_msix=0 for the 82574L? Are there other em(4)-supported
NICs for which this is advisable?
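
Assuming the answer is yes, I take it the workaround on 9.1 is the
following in /boot/loader.conf (a sketch; assuming both tunables are
honored by the 9.1-RELEASE em(4)/igb(4) drivers):

    # Avoid the 82574L/82575 MSI-X errata by not using MSI-X
    hw.em.enable_msix="0"     # em(4) NICs, e.g. 82574L
    hw.igb.enable_msix="0"    # igb(4) NICs, e.g. 82575

After a reboot, "vmstat -i" should then show a single interrupt vector
per interface (e.g. "irq256: em0") instead of the separate per-queue
rx/tx and link vectors that MSI-X sets up.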

>
> For the 82575, i.e. the chip supported by igb(4), MSI-X must _not_ be
> enabled; it is simply broken (you can check the 82575 errata on
> Intel's website to confirm this).
>
> Best Regards,
> sephe
>
> --
> Tomorrow Will Never Die
>
>> Here's the relevant pciconf output:
>>
>> em0 at pci0:1:0:0: class=0x020000 card=0x10d315d9 chip=0x10d38086 rev=0x00 hdr=0x00
>>     vendor     = 'Intel Corporation'
>>     device     = '82574L Gigabit Network Connection'
>>     class      = network
>>     subclass   = ethernet
>>     cap 01[c8] = powerspec 2  supports D0 D3  current D0
>>     cap 05[d0] = MSI supports 1 message, 64 bit
>>     cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
>>     cap 11[a0] = MSI-X supports 5 messages in map 0x1c enabled
>> ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
>> em1 at pci0:2:0:0: class=0x020000 card=0x10d315d9 chip=0x10d38086 rev=0x00 hdr=0x00
>>     vendor     = 'Intel Corporation'
>>     device     = '82574L Gigabit Network Connection'
>>     class      = network
>>     subclass   = ethernet
>>     cap 01[c8] = powerspec 2  supports D0 D3  current D0
>>     cap 05[d0] = MSI supports 1 message, 64 bit
>>     cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
>>     cap 11[a0] = MSI-X supports 5 messages in map 0x1c enabled
>> ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
>> em2 at pci0:7:0:0: class=0x020000 card=0x10d315d9 chip=0x10d38086 rev=0x00 hdr=0x00
>>     vendor     = 'Intel Corporation'
>>     device     = '82574L Gigabit Network Connection'
>>     class      = network
>>     subclass   = ethernet
>>     cap 01[c8] = powerspec 2  supports D0 D3  current D0
>>     cap 05[d0] = MSI supports 1 message, 64 bit
>>     cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
>>     cap 11[a0] = MSI-X supports 5 messages in map 0x1c enabled
>> ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
>> em3 at pci0:8:0:0: class=0x020000 card=0x10d315d9 chip=0x10d38086 rev=0x00 hdr=0x00
>>     vendor     = 'Intel Corporation'
>>     device     = '82574L Gigabit Network Connection'
>>     class      = network
>>     subclass   = ethernet
>>     cap 01[c8] = powerspec 2  supports D0 D3  current D0
>>     cap 05[d0] = MSI supports 1 message, 64 bit
>>     cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
>>     cap 11[a0] = MSI-X supports 5 messages in map 0x1c enabled
>> ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
>>
>>
>>>
>>> On Mon, Feb 25, 2013 at 10:45 AM, Jack Vogel <jfvogel at gmail.com> wrote:
>>>> Have you done any poking around, looking at stats to determine why
>>>> it hangs? For instance, might your mbuf pool be depleted? Some other
>>>> network resource, perhaps?
>>>>
>>>> Jack
>>>>
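
(Regarding Jack's mbuf question above: if I do manage to catch one of
these boxes while it is wedged, I believe the quick first checks are
something like:

    netstat -m     # mbuf/cluster usage; watch the "denied" counters
    vmstat -i      # interrupt rates; a dead RX/TX queue stops counting
    sysctl dev.em.0.debug=1   # dump driver/ring state to the console;
                              # assuming this debug sysctl exists in 9.1

i.e. rule out mbuf exhaustion versus a stuck ring before rebooting.)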
>>>>
>>>> On Mon, Feb 25, 2013 at 10:38 AM, Christopher D. Harrison <
>>>> harrison at biostat.wisc.edu> wrote:
>>>>
>>>>>  Sure,
>>>>> The problem appears both on systems running with ALTQ and on vanilla ones.
>>>>>     -C
>>>>>
>>>>> On 02/25/13 12:29, Jack Vogel wrote:
>>>>>
>>>>> I've not heard of this problem, but I think most users do not use ALTQ,
>>>>> and we (Intel) do not
>>>>> test using it. Can it be eliminated from the equation?
>>>>>
>>>>> Jack
>>>>>
>>>>>
>>>>> On Mon, Feb 25, 2013 at 10:16 AM, Christopher D. Harrison <
>>>>> harrison at biostat.wisc.edu> wrote:
>>>>>
>>>>>> I have recently been experiencing network "freezes" and "lockups" on
>>>>>> our FreeBSD 9.1 systems, which run ZFS and serve NFS. I upgraded from
>>>>>> 9.0 to 9.1 about 2 months ago, and we have been having these issues
>>>>>> almost bi-monthly. The issue manifests as the system becoming
>>>>>> unresponsive to any/all NFS clients. The system is not resource bound:
>>>>>> our disk I/O is low and our network traffic is usually in the
>>>>>> 20Mbit/40Mbit range. We do notice a correlation between temporary I/O
>>>>>> spikes and network freezes, but nothing severe enough to explain the
>>>>>> system going into "lockup" mode for the next 5 minutes. Currently we
>>>>>> have 4 igb NICs in 2 aggregates, with 8 queues per NIC, and our
>>>>>> dev.igb sysctls report:
>>>>>>
>>>>>> dev.igb.3.%desc: Intel(R) PRO/1000 Network Connection version - 2.3.4
>>>>>>
>>>>>> I am almost certain the problem is with the igb driver, as a friend is
>>>>>> also experiencing the same problem with the same Intel igb NIC. He has
>>>>>> addressed the issue by restarting the network using netif on his
>>>>>> systems. According to my friend, once the network interfaces are
>>>>>> cleared, everything comes back and starts working as expected.
>>>>>>
>>>>>> I have noticed a reported issue with the igb driver, and I was looking
>>>>>> for thoughts on how to help address this problem:
>>>>>>
>>>>>> http://freebsd.1045724.n5.nabble.com/em-igb-if-transmit-drbr-and-ALTQ-td5760338.html
>>>>>>
>>>>>> Thoughts/Ideas are greatly appreciated!!!
>>>>>>
>>>>>>     -C
>>>>>>
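
For what it's worth, the netif workaround Christopher describes above
presumably amounts to something like this (a sketch; "igb0" is just an
example name, and with lagg aggregates the lagg interface itself may
need a bounce as well):

    # bounce a wedged interface without rebooting the whole box
    service netif restart igb0
    service routing restart    # restore any routes the restart dropped

which would match his report that clearing the interfaces brings
everything back without a reboot.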

