bce(4) - com_no_buffers (Again)

Tom Judge tom at tomjudge.com
Thu Sep 23 19:34:14 UTC 2010


The throttle command I am using in the tests is the one from here:

http://klicman.org/throttle/
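
For reference, the receiving side of test 2 in the quoted message below (eight throttled nc readers, one per port) could be generated with a small helper like the following. This is only a sketch: the function name receiver_cmds is made up for illustration, while the host name "source", the ports 1111$I and the "throttle -k 1" invocation are copied unchanged from the quoted commands.

```shell
#!/bin/sh
# Hypothetical helper: print one receiver pipeline per listener port.
# Ports are 1111$I for I = 1..8, matching the traffic source loop.
receiver_cmds() {
    host="$1"
    for I in 1 2 3 4 5 6 7 8; do
        echo "nc ${host} 1111${I} | throttle -k 1 > /dev/null"
    done
}
```

Each printed line is meant to be started in the background or in its own terminal on the test system, since every reader blocks until its connection closes.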


On 09/23/2010 02:26 PM, Tom Judge wrote:
> On 09/23/2010 01:21 PM, David Christensen wrote:
>   
>>>>> Under testing I have yet to see a memory fragmentation issue with this
>>>>> driver.  I will follow up if/when I find a problem with this again.
>>>>>
>>> So here we are again.  The system is locking up again because of 9k mbuf
>>> allocation failures.
>>>
>> Failure to allocate a new buffer should cause the driver to
>> drop the received frame and reuse the buffer, not lock up the
>> system.  Are you seeing the lockup come from bce(4) or does
>> it come from somewhere else due to the dropped data?
>>
>>   
>>     
> The lockup is not from the NIC as such; the systems appear to lock up
> because home directories are on NFS and the user information is stored
> in a remote LDAP server.  When the system starts to drop frames due to
> lack of 9k memory regions, the drops tend to last for a few minutes
> (when it is really bad) and stop all traffic into the system.  This
> appears to the average user as a complete system pause.
>
>
>   
>>>>> Is there a way to fix the RX buffer shortage issues (when header
>>>>> splitting is turned on) so that they are guarded by flow control?
>>>>> Maybe change the low watermark for flow control when it's enabled?
>>>>>
>>>> I'm not sure how much it would help, but try changing the RX low
>>>> watermark.  The default value is 32, which seems to be a reasonable
>>>> value.  But it applies only to 5709/5716 controllers, and Linux seems
>>>> to use a different default value.
>>>>
>>>>       
>>>>         
>>> These are: NetXtreme II BCM5709 Gigabit Ethernet
>>>
>>> So my next task is to turn the watermark related defines into sysctls
>>> and turn on header splitting so that I can try to tune them without
>>> having to reboot.
>>>
>>>     
>>>       
>> Do you have flow control enabled?  There are arguments both for
>> and against flow control.  For bce(4), I haven't tested flow control
>> for quite a while and its behavior may have changed since it is
>> controlled by firmware.   Keep an eye on the hardware statistics
>> to see that it's actively generating pause frames.
>>   
>>     
> At the moment I have a number of tests:
>
> 1) With flow control disabled and header splitting on or off, flood the
> server with very small frames (200 bytes).  This causes the firmware
> to drop frames due to BD shortages (incrementing
> dev.bce.X.com_no_buffers).
>
> Traffic source:
>
> route change test-system -mtu 200
> dd if=/dev/zero bs=8000 | nc -l 1111
>
> Test system:
>
> nc source 1111 > /dev/null
>
>
> 2) With flow control enabled and header splitting off, flood the server
> with traffic while userland processing is very slow:
>
> Traffic source:
>
> for I in 1 2 3 4 5 6 7 8; do ( dd if=/dev/zero bs=8000 | nc -l 1111$I &
> ); done
>
> Test system:
>
> Run eight times, once for each I in 1..8:
> nc source 1111$I | throttle -k 1 > /dev/null
>
> On our systems this will reliably trigger denied 9k allocations.
>
> 3) With flow control enabled and header splitting on, flood the server
> with very small frames (200 bytes), using the same test as in case 1.
> My aim is to tune the watermark here so that no frames are dropped
> due to BD shortages.
>
>
>
>
> I am under the impression that the best solution is to tune the RX ring
> so that flow control can be disabled, but I am not sure I could do this.
>
>
>   
>>> My next question is, is it possible to increase the size of the RX ring
>>> without switching to RSS?
>>>
>>>     
>>>       
>> I have a change I've been working on to allow RX/TX ring size
>> to be adjusted through a sysctl.  Let me pretty it up a bit and
>> send it to you for test.  You should be able to adjust the ring
>> size without enabling RSS.
>>
>>   
>>     
> If you can provide a patch I have hardware available to test on.
>
> Thanks
>
> Tom
>
>   
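
To judge whether a given watermark or ring-size setting actually helps in tests 1 and 3, the com_no_buffers counter can be sampled while the flood is running. The following is a minimal sketch, not a tested tool: counter_delta, read_no_buffers and watch_no_buffers are hypothetical helper names, unit 0 is an assumption (substitute the real unit for X in dev.bce.X.com_no_buffers), and counter wraparound is not handled.

```shell
#!/bin/sh
# Hypothetical helper: difference between two counter samples (prev, cur).
counter_delta() {
    echo $(( $2 - $1 ))
}

# Assumption: bce unit 0.  Reads the firmware drop counter named in test 1.
read_no_buffers() {
    sysctl -n dev.bce.0.com_no_buffers
}

# Print the number of new com_no_buffers drops once per interval (seconds).
watch_no_buffers() {
    interval="${1:-1}"
    prev=$(read_no_buffers) || return 1
    while sleep "$interval"; do
        cur=$(read_no_buffers) || return 1
        echo "com_no_buffers +$(counter_delta "$prev" "$cur")"
        prev="$cur"
    done
}
```

Calling watch_no_buffers 5 on the test system would print the increase every five seconds; a steady +0 during the flood would suggest that frames are no longer being dropped for lack of BDs.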


-- 
TJU13-ARIN


