re driver crashing under load, can reproduce it.
Robert Sevat
robert.sevat at live.nl
Tue Feb 18 04:08:51 UTC 2014
Hey,
I've got a small server on which the network driver crashes completely the instant I put any network load on it. The only way to fix it is to reboot the machine; the interface is completely unresponsive to ifconfig up or down.
I've seen a bunch of errors already:
re0: watchdog timeout
re0: link state changed to DOWN
re0: link state changed to UP
It starts with these messages, with the link state flapping up and down a few hundred times, before the driver crashes completely and locks up.
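To put a number on the flapping, something like the following could be run against the kernel log. It is only a sketch: the function name count_flaps is made up for illustration, and it assumes the log lines look like the ones quoted above.

```sh
#!/bin/sh
# Sketch only: count_flaps is a made-up helper name, not an existing tool.
# Reads kernel log lines on stdin and prints how many times the given
# interface (default: re0) reported its link going DOWN.
count_flaps() {
    ifname="${1:-re0}"
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so mask the exit status to keep "set -e" scripts alive.
    grep -c "${ifname}: link state changed to DOWN" || true
}
```

For example, `count_flaps re0 < /var/log/messages` or `dmesg | count_flaps` would print the number of DOWN transitions seen so far.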
Feb 18 00:49:33 transmission-video transmission-daemon[1791]: UDP Failed to set receive buffer: No buffer space available (tr-udp.c:59)
Feb 18 00:49:33 transmission-video transmission-daemon[1791]: UDP Failed to set receive buffer: requested 4194304, got 42080 (tr-udp.c:78)
I've also had this, so I've already raised the buffer to 4194304 with: sysctl net.inet.udp.recvspace=4194304.
After I did this, transmission stopped complaining for a bit. An hour later the re driver crashed again. This time, after the reboot, the driver refused to work at all. I had to remove the setting from sysctl.conf and set it back to 42080 before the driver would work again.
netstat -sl re0: http://pastebin.com/NmDWJJ6k
This does show that a lot of UDP packets are dropped due to full socket buffers:
"4880 dropped due to no socket
2708 broadcast/multicast datagrams undelivered
139828 dropped due to full socket buffers"
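To see whether that last counter actually grows while the load is running, the number can be pulled out of the netstat output and compared before and after. This is a rough sketch under assumptions: udp_full_drops is a made-up helper name, and it expects input in the same format as the lines quoted above.

```sh
#!/bin/sh
# Sketch only: udp_full_drops is a made-up helper name.
# Reads "netstat -s -p udp" style output on stdin and prints the counter
# from the "dropped due to full socket buffers" line (0 if the line is absent).
udp_full_drops() {
    awk '/dropped due to full socket buffers/ { print $1; found = 1; exit }
         END { if (!found) print 0 }'
}

# Possible usage (left commented out, since it needs a live system):
#   before=$(netstat -s -p udp | udp_full_drops)
#   ... start the rsync upload or the torrents ...
#   after=$(netstat -s -p udp | udp_full_drops)
#   echo "new drops: $((after - before))"
```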
I have also gotten:
"Feb 15 02:39:00 incognitus kernel: sonewconn: pcb 0xfffff80028d28620: Listen queue overflow: 193 already in queue awaiting acceptance
Feb 15 02:39:03 incognitus last message repeated 207 times"
After googling a bit I have tried multiple things:
I disabled ACPI in the BIOS and enabled ErP to ensure nothing weird happens with power states. I've also disabled powerd in rc.conf.
Because I also got these messages in dmesg: "ip6addrctl: socket(UDP): No buffer space available", I've disabled IPv6 on the machine:
ip6addrctl_enable="NO"
ip6addrctl_policy="ipv4_prefer"
ipv6_network_interfaces="none"
ipv6_active_all_interfaces="NO"
I have also disabled MSI-X and MSI in /boot/loader.conf, because this was suggested by others:
hw.re.msi_disable="1"
hw.re.msix_disable="1"
I have also disabled hardware checksum offloading with ifconfig:
ifconfig re0 -txcsum
ifconfig re0 -rxcsum
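A quick way to double-check that the offload flags are really gone is to look at the options line that ifconfig prints. The helper below is only a sketch: csum_offload_enabled is a made-up name, and it assumes the usual `options=...<FLAG,FLAG,...>` line in the ifconfig output.

```sh
#!/bin/sh
# Sketch only: csum_offload_enabled is a made-up helper name.
# Reads "ifconfig re0" output on stdin; succeeds (exit 0) if the options
# line still lists RXCSUM or TXCSUM, fails (exit 1) otherwise.
csum_offload_enabled() {
    grep -Eq 'options=[0-9a-f]+<[^>]*(RXCSUM|TXCSUM)'
}

# Possible usage (left commented out, since it needs the real interface):
#   if ifconfig re0 | csum_offload_enabled; then
#       echo "checksum offload still on"
#   fi
```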
I've tried forcing the NIC to full-duplex 1000BaseTX, since some people suggested the problem was due to an autonegotiation failure. When I did this, the driver locked up completely and refused to work until I rebooted:
ifconfig re0 media 1000BaseTX mediaopt full-duplex
This is on a machine that runs:
root@incognitus:/ # uname -a
FreeBSD incognitus.indylix.nl 10.0-RELEASE FreeBSD 10.0-RELEASE #0 r261411: Sun Feb 2 21:51:04 CET 2014 robert@Incognitus:/usr/obj/usr/src/sys/Pf amd64
I've only added PF support to the kernel. The problem happens whether PF is enabled or disabled; it makes no difference.
I ran pfSense 2.1 on this machine for about 3-4 months without any of these problems, and that was while putting significant load on it (a 120 Mbit/s internet connection). But now that it runs FreeBSD 10.0, it is highly unstable as soon as I push any traffic. I can trigger the crash manually by starting an rsync upload to another server; the upload runs at roughly 80 Mbit/s and crashes the driver within a few gigabytes of traffic. Adding a few torrents to Transmission that push a fair bit of network traffic does the same. Crashes can be triggered within 10 minutes. It's only the re driver that crashes: the machine itself stays up and responsive, and only the network stops working.
root@incognitus:/ # pciconf -lcv
re0@pci0:1:0:0: class=0x020000 card=0xe0001458 chip=0x816810ec rev=0x06 hdr=0x00
vendor = 'Realtek Semiconductor Co., Ltd.'
device = 'RTL8111/8168B PCI Express Gigabit Ethernet controller'
class = network
subclass = ethernet
cap 01[40] = powerspec 3 supports D0 D1 D2 D3 current D0
cap 05[50] = MSI supports 1 message, 64 bit
cap 10[70] = PCI-Express 2 endpoint IRQ 1 max data 128(128) link x1(x1)
speed 2.5(2.5) ASPM disabled(L0s/L1)
cap 11[b0] = MSI-X supports 4 messages
Table in map 0x20[0x0], PBA in map 0x20[0x800]
cap 03[d0] = VPD
ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
ecap 0002[140] = VC 1 max VC0
ecap 0003[160] = Serial 1 01000000684ce000
This is on a Gigabyte GA-C847N with the Realtek RTL8111F network card.
Is there anything I could try? Any commands to run, or extra info you'd like to have? I'm pretty much out of ideas.
(Except, of course, buying an Intel NIC instead, which I will resort to if I can't get this resolved, since the machine is unworkable as it is. I'd rather help debug a problem in the driver.)
Kind Regards,
Robert Sevat