Problems with BCE network adapter (Dell PE2950)

Tom Judge tom at tomjudge.com
Wed Jul 11 09:19:25 UTC 2007


Tom Judge wrote:
> Tom Judge wrote:
>> Tom Judge wrote:
>>> Tom Judge wrote:
>>>> David Christensen wrote:
>>>>>> Sorry for the top post, please try the following patch:
>>>>>> http://people.freebsd.org/~sephe/if_bce.c.diff
>>>>>>
>>>>>> This is probably the cause; I noticed it when bce(4) was ported to 
>>>>>> DragonFly.
>>>>>>
>>>>>
>>>>> Thanks Sephe, I think you're on to something.  I have some
>>>>> debug code in the driver to simulate mbuf allocation
>>>>> failures and when I enable that I start receiving the same
>>>>> error messages Tom reported (along with various kernel
>>>>> panics), but when I include your change the system seems
>>>>> to keep humming along. I'll certainly add your code into an update 
>>>>> shortly.
>>>>>
>>>>> Dave
>>>>>
>>>>
>>>> I'm not going to have a chance to test this patch until next week 
>>>> but I will let you know what the results are.
>>>>
>>>> Tom
>>>
>>>
>>> So here goes: after 2 days of testing we have come up with the
>>> following data.
>>>
>>> The configuration
>>>
>>> [PE[12]950] ----> [PowerConnect 5324]
>>>
>>> The system is running 8192 byte Jumbo Frames.
>>>
>>> sultan# ifconfig bce0
>>> bce0: flags=8847<UP,BROADCAST,DEBUG,RUNNING,SIMPLEX,MULTICAST> mtu 8192
>>>         options=3b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU>
>>>         inet 172.31.0.28 netmask 0xffffff00 broadcast 172.31.0.255
>>>         inet 172.31.0.163 netmask 0xffffffff broadcast 172.31.0.163
>>>         ether 00:19:b9:e4:4d:cc
>>>         media: Ethernet autoselect (1000baseTX <full-duplex>)
>>>         status: active
>>>
>>>
>>> After applying both David's and Sephe's patches I have yet to get a
>>> system into a state where it is stable with jumbo frames enabled. The
>>> systems crash almost immediately after the switch changes the port
>>> state (Spanning Tree) from LEARNING to FORWARDING.  The output from
>>> this crash can be found attached as crash-1.txt.gz.
>>>
>>> If the frame size is left at 1500 then the interface seems stable.
>>> However, I can't fully test this as the interface is connected to a
>>> GigE-only network with an MTU of 8192.
>>>
>>> If BCE_DEBUG is removed from if_bcereg.h then the system just exhibits
>>> the original problem and may or may not crash.
>>>
>>> The next test was to try the kernel with BCE_DEBUG and with the 
>>> following extra patch (so that the driver does not jump to the 
>>> breakpoint when an unexpected mbuf is found in the rx buffer).
>>>
>>> --- if_bce.c    (revision 62)
>>> +++ if_bce.c    (revision 66)
>>> @@ -4050,7 +4050,8 @@
>>>                         DBRUNIF((!(rxbd->rx_bd_flags & RX_BD_FLAGS_END)),
>>>                                 BCE_PRINTF("%s(%d): Unexpected mbuf found in rx_bd[0x%04X]!\n",
>>>                                 __FILE__, __LINE__, sw_chain_cons);
>>> -                               bce_breakpoint(sc));
>>> +                               bce_dump_mbuf(sc, m));
>>> +//                             bce_breakpoint(sc));
>>>
>>>                         /*
>>>                          * ToDo: If the received packet is small enough
>>>
>>>
>>> With this patch the system boots and does not crash straight away,
>>> but it is almost completely unusable.  The output with this
>>> kernel can be found attached as crash-2.txt.gz.  This kernel also
>>> produces the following new error message:
>>>
>>> fgrep -n leak crash-2.txt
>>> 3194:bce0: /usr/src/sys/dev/bce/if_bce.c(3842): Memory leak! Lost 114 mbufs from rx chain!
>>>
>>> Has no one else come across this problem, or are Jumbo frames not 
>>> widely used?
>>>
>>> Tom
>>>
>> It would seem that the crash can be reproduced just by increasing the
>> MTU above 1500 (tested in single user mode).
>>
> 
> 
> OK, so I think I have fixed the problem with the rx_bd tracking.  I have
> ported rboyer's patch for NetBSD's bnx driver to FreeBSD (patch
> attached).  The patch seems to get rid of two problems:
> 
> 1) Unexpected mbuf in rx_bd
> 2) Too many free rx_bd's
> 
> 
> However, I am still faced with the problem of frames with missing
> Ethernet headers:
> bce0: /usr/src/sys/dev/bce/if_bce.c(4128): Unusual frame size found. 
> Min(60), Actual(0), Max(9022)
> bce0: mbuf: vaddr = 0xFFFFFF00:7B69AC00, m_len = 9216, m_flags = ( M_EXT 
> M_PKTHDR ) m_data = 0xFFFFFFFF:86F76000
> 0x00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 0x10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 0x20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 0x30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 0x40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 0x50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 0x60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 0x70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> bce0: - m_pkthdr: flags = ( ) csum_flags = ( )
> bce0: - m_ext: vaddr = 0xFFFFFFFF:86F76000, ext_size = 9216, type = 
> EXT_JUMBO9
> bce0: discard frame w/o leading ethernet header (len 4294967292 pkt len 
> 4294967292)
> bce0: /usr/src/sys/dev/bce/if_bce.c(4128): Unusual frame size found. 
> Min(60), Actual(0), Max(9022)
> bce0: mbuf: vaddr = 0xFFFFFF00:5EB48B00, m_len = 9216, m_flags = ( M_EXT 
> M_PKTHDR ) m_data = 0xFFFFFFFF:86F73000
> 0x00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 0x10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 0x20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 0x30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 0x40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 0x50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 0x60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 0x70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> bce0: - m_pkthdr: flags = ( ) csum_flags = ( )
> bce0: - m_ext: vaddr = 0xFFFFFFFF:86F73000, ext_size = 9216, type = 
> EXT_JUMBO9
> bce0: discard frame w/o leading ethernet header (len 4294967292 pkt len 
> 4294967292)
> bce0: /usr/src/sys/dev/bce/if_bce.c(4128): Unusual frame size found. 
> Min(60), Actual(27745), Max(9022)
> bce0: mbuf: vaddr = 0xFFFFFF00:5E9DDC00, m_len = 9216, m_flags = ( M_EXT 
> M_PKTHDR ) m_data = 0xFFFFFFFF:86EF8000
> 0x00: 2C 6F 75 3D 50 65 72 73 6F 6E 61 6C 2C 6F 75 3D
> 0x10: 47 72 6F 75 70 73 2C 6F 3D 4D 69 6E 74 65 6C 30
> 0x20: 28 30 11 04 02 63 6E 31 0B 04 09 63 62 75 74 74
> 0x30: 72 6F 73 65 30 13 04 09 67 69 64 4E 75 6D 62 65
> 0x40: 72 31 06 04 04 31 31 30 38 30 5E 02 01 02 64 59
> 0x50: 04 31 63 6E 3D 6D 63 61 68 6D 2C 6F 75 3D 4C 6F
> 0x60: 6E 64 6F 6E 2C 6F 75 3D 50 65 72 73 6F 6E 61 6C
> 0x70: 2C 6F 75 3D 47 72 6F 75 70 73 2C 6F 3D 4D 69 6E
> bce0: - m_pkthl er0ror
> 9 67 69 64 4E 75 6D 62 65 72 31 06 04 04 31 30
> 0x70: 33 37 30 62 02 01 02 64 5D 04 33 63 6E 3D 72 63
> bce0: - m_pkthdr: flags = ( ) csum_flags = ( )
> bce0: - m_ext: vaddr = 0xFFFFFFFF:86E8C000, ext_size = 9216, type = 
> EXT_JUMBO9
> bce0: /usr/src/sys/dev/bce/if_bce.c(4081): Unexpected mbuf found in 
> rx_bd[0x002A]!
> bce0: /usr/src/sys/dev/bce/if_bce.c(4128): Unusual frame size found. 
> Min(60), Actual(28515), Max(9022)
> bce0: mbuf: vaddr = 0xFFFFFF00:5AB4C800, m_len = 9216, m_flags = ( M_EXT 
> M_PKTHDR ) m_data = 0xFFFFFFFF:86F28000
> 0x00: 30 0E 04 02 63 6E 31 08 04 06 63 6F 68 61 72 61
> 0x10: 30 13 04 09 67 69 64 4E 75 6D 62 65 72 31 06 04
> 0x20: 04 31 30 37 32 30 65 02 01 02 64 60 04 35 63 6E
> 0x30: 3D 6A 70 69 65 6B 61 72 73 2C 6F 75 3D 43 68 69
> 0x40: 63 61 67 6F 2C 6F 75 3D 50 65 72 73 6F 6E 61 6C
> 0x50: 2C 6F 75 3D 47 72 6F 75 70 73 2C 6F 3D 4D 69 6E
> 0x60: 74 65 6C 30 27 30 10 04 02 63 6E 31 0A 04 08 6A
> 0x70: 70 69 65 6B 61 72 73 30 13 04 09 67 69 64 4E 75
> bce0: - m_pkthdr: flags = ( ) csum_flags = ( )
> bce0: - m_ext: vaddr = 0xFFFFFFFF:86F28000, ext_size = 9216, type = 
> EXT_JUMBO9
> bce0: /usr/src/sys/dev/bce/if_bce.c(4081): Unexpected mbuf found in 
> rx_bd[0x002E]!
> bce0: /usr/src/sys/dev/bce/if_bce.c(4128): Unusual frame size found. 
> Min(60), Actual(28460), Max(9022)
> bce0: mbuf: vaddr = 0xFFFFFF00:5EB9F200, m_len = 9216, m_flags = ( M_EXT 
> M_PKTHDR ) m_data = 0xFFFFFFFF:86F70000
> 0x00: 04 32 63 6E 3D 69 6E 65 73 73 2C 6F 75 3D 43 68
> 0x10: 69 63 61 67 6F 2C 6F 75 3D 50 65 72 73 6F 6E 61
> 0x20: 6C 2C 6F 75 3D 47 72 6F 75 70 73 2C 6F 3D 4D 69
> 0x30: 6E 74 65 6C 30 24 30 0D 04 02 63 6E 31 07 04 05
> 0x40: 69 6E 65 73 73 30 13 04 09 67 69 64 4E 75 6D 62
> 0x50: 65 72 31 06 04 04 31 31 34 32 30 67 02 01 02 64
> 0x60: 62 04 36 63 6E 3D 70 6D 63 6E 61 6D 61 72 61 2C
> 0x70: 6F 75 3D 43 68 69 63 61 67 6F 2C 6F 75 3D 50 65
> bce0: - m_pkthdr: flags = ( ) csum_flags = ( )
> bce0: - m_ext: vaddr = 0xFFFFFFFF:86F70000, ext_size = 9216, type = 
> EXT_JUMBO9
> bce0: /usr/src/sys/dev/bce/if_bce.c(4081): Unexpected mbuf found in 
> rx_bd[0x0032]!
> bce0: /usr/src/sys/dev/bce/if_bce.c(4128): Unusual frame size found. 
> Min(60), Actual(28787), Max(9022)
> bce0: mbuf: vaddr = 0xFFFFFF00:5AB4CA00, m_len = 9216, m_flags = ( M_EXT 
> M_PKTHDR ) m_data = 0xFFFFFFFF:86F6D000
> 0x00: 02 01 02 64 57 04 30 63 6E 3D 73 70 79 65 2C 6F
> 0x10: 75 3D 4C 6F 6E 64 6F 6E 2C 6F 75 3D 50 65 72 73
> 0x20: 6F 6E 61 6C 2C 6F 75 3D 47 72 6F 75 70 73 2C 6F
> 0x30: 3D 4D 69 6E 74 65 6C 30 23 30 0C 04 02 63 6E 31
> 0x40: 06 04 04 73 70 79 65 30 13 04 09 67 69 64 4E 75
> 0x50: 6D 62 65 72 31 06 04 04 31 32 30 39 30 59 02 01
> 0x60: 02 64 54 04 2F 63 6E 3D 71 61 2C 6F 75 3D 43 68
> 0x70: 69 63 61 67 6F 2C 6F 75 3D 50 65 72 73 6F 6E 61
> bce0: - m_pkthdr: flags = ( ) csum_flags = ( )
> bce0: - m_ext: vaddr = 0xFFFFFFFF:86F6D000, ext_size = 9216, type = 
> EXT_JUMBO9
> bce0: /usr/src/sys/dev/bce/if_bce.c(4128): Unusual frame size found. 
> Min(60), Actual(12855), Max(9022)
> bce0: mbuf: vaddr =0 67 02 01 02 64
> 0x60: 62 04 36 63 6E 3D 70 6D 63 6E 61 6D 61 72 61 2C
> 0x70: 6F 75 3D 43 68 69 63 61 67 6F 2C 6F 75 3D 50 65
> bce0: - m_pkthdr: flags = ( ) csum_flags = ( )
> bce0: - m_ext: vaddr = 0xFFFFFFFF:86F70000, ext_size = 9216, type = 
> EXT_JUMBO9
> 
> 
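
One possible reading of the length fields in the log above, offered as a
sketch rather than a diagnosis. The "len 4294967292" in the discard
messages equals 2^32 - 4, which is what an unsigned 32-bit length of 0
becomes after 4 bytes are subtracted (an Ethernet CRC is 4 bytes; that the
driver strips it at this point is an assumption here). The oversized
Actual(...) values each split into two printable ASCII bytes, which would
be consistent with the length being read out of packet payload instead of
a real frame header. This is suggestive only, since a valid length can
also happen to be two printable bytes. The helper names are made up for
illustration and are not driver functions.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>

/* Hypothetical helper: subtract a trailer from an unsigned 32-bit length.
 * Unsigned arithmetic wraps modulo 2^32, so 0 - 4 gives 4294967292. */
static uint32_t strip_trailer(uint32_t frame_len, uint32_t trailer_len)
{
    return frame_len - trailer_len;
}

/* Hypothetical helper: 1 if both bytes of a 16-bit value are printable
 * ASCII characters, 0 otherwise. */
static int both_bytes_printable(uint16_t v)
{
    unsigned char hi = (unsigned char)(v >> 8);
    unsigned char lo = (unsigned char)(v & 0xff);

    return isprint(hi) && isprint(lo);
}

/* Checks the values that appear in the log excerpt. */
static void check_log_values(void)
{
    assert(strip_trailer(0, 4) == 4294967292u);  /* Actual(0) case */
    assert(both_bytes_printable(27745));         /* 0x6C61 */
    assert(both_bytes_printable(28515));         /* 0x6F63 */
    assert(both_bytes_printable(28460));         /* 0x6F2C */
    assert(both_bytes_printable(28787));         /* 0x7073 */
    assert(both_bytes_printable(12855));         /* 0x3237 */
}
```

Calling check_log_values() passes for every value quoted from the log.
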
> 
> if_bnx.c - 1.4 -> 1.5 LOG:
> 
> RX buffers are malloced memory of 9216 bytes. This can require from 1 to
> 4 DMA memory segments, depending on how the buffer is laid out in memory.
> When receiving a packet, we allocate a new buffer to replace the one we've
> used. It can need more segments than the one it replaces, leading to
> corruption of the RX descriptors, and a panic in bus_dmamap_sync()
> (DIAGNOSTIC kernels) or possibly memory corruption.
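
To make the segment count in the log entry concrete, here is a rough
sketch. It assumes 4096-byte pages and that every page a buffer touches
can become its own DMA segment when the pages are not physically
contiguous (contiguous pages can be merged, which is how the count can
drop as low as 1). pages_spanned() is a hypothetical helper, not
something from the driver.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE_ASSUMED 4096u  /* assumption: 4 KB pages */

/* Number of pages touched by a buffer of len bytes that starts offset
 * bytes into a page. Under the assumption above this is the worst-case
 * number of DMA segments for that buffer. */
static unsigned pages_spanned(size_t offset, size_t len)
{
    assert(offset < PAGE_SIZE_ASSUMED && len > 0);
    return (unsigned)((offset + len + PAGE_SIZE_ASSUMED - 1) /
        PAGE_SIZE_ASSUMED);
}
```

For a 9216-byte buffer this gives 3 when the buffer starts within the
first 3072 bytes of a page and 4 otherwise, so a replacement buffer can
need one more descriptor than the buffer it replaces.
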
> 
> Fix:
> - bce_get_buf() allocates as many buffers as possible, checking the number
>   of free RX descriptors. Because one receive buffer is not guaranteed to
>   be replaced on receive, call bce_get_buf() from bce_tick() too.
>   This also improves error handling from bce_get_buf().
> - use MCLGET() instead of MEXTMALLOC() if we're running with the standard
>   ethernet MTU. This gives us more receive buffers and wastes less memory.
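
The first item of the fix can be pictured with a small model. This is
only a sketch of one conservative way to do the check: post a buffer only
while the ring still has room for a worst-case buffer. The struct, the
MAX_SEGS value and refill() are all hypothetical, and the segs array
stands in for whatever segment count the DMA mapping reports for each
new buffer.

```c
#include <assert.h>

#define MAX_SEGS 4u  /* assumption: worst case for a 9216-byte buffer */

struct rx_ring {
    unsigned free_bd;  /* free RX descriptors */
    unsigned posted;   /* buffers handed to the hardware */
};

/* Hypothetical refill loop: keep posting buffers while the ring can
 * still hold a worst-case buffer. Returns how many buffers were posted;
 * the caller retries later (for example from a periodic tick) for the
 * rest. */
static unsigned refill(struct rx_ring *r, const unsigned *segs,
    unsigned nbufs)
{
    unsigned i = 0;

    while (i < nbufs && r->free_bd >= MAX_SEGS) {
        assert(segs[i] >= 1 && segs[i] <= MAX_SEGS);
        r->free_bd -= segs[i];  /* descriptors used by this buffer */
        r->posted++;
        i++;
    }
    return i;
}
```

With 10 free descriptors and buffers needing 3, 4, 3 and 4 segments, the
loop posts the first two and stops with 3 descriptors left, instead of
overrunning the ring when a buffer needs more segments than expected.
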
> 
> 
> We seem to be moving in the right direction, slowly.
> 
> 


It seems I missed the rx_bd error; it is still present with this patch.

Tom


More information about the freebsd-net mailing list