bce com_no_buffers
Adam Schumacher
adam.schumacher at flightaware.com
Fri Feb 28 19:42:44 UTC 2014
We are running 9.2-RELEASE on some Dell PowerEdge servers with Broadcom
NetXtreme II BCM5716 1000Base-T NICs. Occasionally, we will see the
network briefly lock up and prevent new connections from being made to the
system. Usually, within a minute or two, things will clear out and
everything continues to work fine. After much digging, we noticed that
when this issue occurred, there was an increase in the count of
dev.bce.X.com_no_buffers. There are plenty of mbufs according to netstat -m
output, so the issue doesn't seem to be on that end. I can usually
increase the likelihood of errors by sending a ton of traffic to the box
with iperf, though the error rate is not positively correlated with an
increase in the number of incoming packets.
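For anyone trying to reproduce this, here is a minimal sketch of how the
counter could be watched while iperf is running. The function names
counter_delta and watch_no_buffers are hypothetical helpers made up for
illustration, and the unit number and polling interval are assumptions;
only sysctl -n and the dev.bce.X.com_no_buffers OID come from the system
described above.

```sh
#!/bin/sh
# Hypothetical helper: print the increase between two counter samples.
counter_delta() {
    prev=$1
    curr=$2
    echo $((curr - prev))
}

# Hypothetical polling loop: sample dev.bce.<unit>.com_no_buffers every
# <interval> seconds and log a line whenever the counter has grown.
# Defaults (unit 0, 5 seconds) are assumptions, not recommended values.
watch_no_buffers() {
    unit=${1:-0}
    interval=${2:-5}
    prev=$(sysctl -n "dev.bce.${unit}.com_no_buffers") || return 1
    while sleep "$interval"; do
        curr=$(sysctl -n "dev.bce.${unit}.com_no_buffers") || return 1
        delta=$(counter_delta "$prev" "$curr")
        if [ "$delta" -gt 0 ]; then
            echo "$(date -u '+%Y-%m-%dT%H:%M:%SZ') bce${unit} com_no_buffers +${delta} (total ${curr})"
        fi
        prev=$curr
    done
}
```

Sourcing the file and running "watch_no_buffers 0 5" in one terminal while
generating load in another should make it easier to line up counter jumps
with the moments when new connections stall.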
We have tried tuning options available in the bce driver including
increasing the number of interrupts and increasing the number of rx pages
as can be seen from the following sysctl output:
$ sysctl hw.bce
hw.bce.rx_ticks: 6
hw.bce.rx_ticks_int: 18
hw.bce.rx_quick_cons_trip: 3
hw.bce.rx_quick_cons_trip_int: 6
hw.bce.tx_ticks: 80
hw.bce.tx_ticks_int: 80
hw.bce.tx_quick_cons_trip: 20
hw.bce.tx_quick_cons_trip_int: 20
hw.bce.loose_rx_mtu: 0
hw.bce.hdr_split: 1
hw.bce.tx_pages: 2
hw.bce.rx_pages: 3
hw.bce.msi_enable: 1
hw.bce.tso_enable: 1
hw.bce.verbose: 1
One thing that was very odd was that when we increased hw.bce.rx_pages
from 2 to 4, all network traffic stopped, but setting it to 3 worked
(though with no noticeable effect on the issue). According to the
bce(4) man page, 3 isn't even a valid value. Has anyone else encountered
this, or does anyone have an idea of what we could try to resolve it?
--
Adam Schumacher
IT Operations and Security Engineer
FlightAware