sync flood
rondzierwa at comcast.net
Fri Jan 30 15:59:05 UTC 2015
I am using FreeBSD 10.1-RELEASE on a Sun Fire X4500 (Thumper) and have run into a couple of odd things that I was hoping someone could shed some light on.
I am using the system to share storage over NFS and Samba. The system has slots for up to 48 drives. I have 15 currently populated (a boot disk and 14 array disks). I created a raidz and a ZFS pool and began creating shares.
When I began actually using the system to serve NFS shares over one of the on-board em Ethernet devices, the system crashed within a couple of minutes. When it rebooted, the BIOS screen stopped with a message indicating that a sync flood condition had caused the reboot. This was easily and quickly repeatable, and it made the server useless.
I have found several threads on the mailing lists where people have encountered this before, and I tried a few things, but what made it stop was disabling all but the boot processor using loader.conf. Once I was running on only one processor, the sync flood stopped happening, and it has been running under load for a day. The server has run reliably under Solaris, but since these machines had reached end-of-life, I was able to re-purpose them economically.
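As a sketch of how all but the boot processor can be disabled from loader.conf: the tunables below are the ones documented in smp(4), and they are an assumption about the mechanism, not a record of the exact lines used on this machine. The APIC ID in the commented-out alternative is hypothetical.

```
# /boot/loader.conf
# Disable SMP entirely, so only the boot processor is started.
kern.smp.disabled=1

# Alternative (assumption): disable individual CPUs by local APIC ID.
# The ID 1 below is a placeholder; real IDs appear in the boot messages.
#hint.lapic.1.disabled=1
```

After a reboot, `sysctl kern.smp.active` and `sysctl hw.ncpu` should show whether the other processors were started.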
What led me to try running on only one processor was part of a thread that talked about changing the way interrupts are routed to the various processors.
http://lists.freebsd.org/pipermail/freebsd-stable/2010-July/057670.html
That thread concerned a Sun X4100, so what they were doing did not seem to apply directly, but eliminating all but the boot processor would certainly sidestep any interrupt routing issues. It was easy enough to try, and it seems to have masked the problem.
For the long term, however, this is not a workable solution. This server will be given more and more things to do, and the other processors will become more of a necessity.
It seems like something in the default assignment of hardware resources has a problem dealing with a system like this that has so much on the bus (6 Marvell SATA controllers, 4 Intel PRO/1000 controllers, 4 USB controllers). It also has an issue where FreeBSD can only allocate bus resources for 2 of the 4 Ethernet devices:
em2: <Intel(R) PRO/1000 Legacy Network Connection 1.0.6> mem 0xfdbe0000-0xfdbfffff irq 61 at device 1.0 on pci8
em2: 0x40 bytes of rid 0x20 res 4 failed (0, 0xffffffffffffffff).
em2: Unable to allocate bus resource: ioport
em2: Allocation of PCI resources failed
device_attach: em2 attach returned 6
em2: <Intel(R) PRO/1000 Legacy Network Connection 1.0.6> mem 0xfdbc0000-0xfdbdffff irq 62 at device 1.1 on pci8
em2: 0x40 bytes of rid 0x20 res 4 failed (0, 0xffffffffffffffff).
em2: Unable to allocate bus resource: ioport
em2: Allocation of PCI resources failed
device_attach: em2 attach returned 6
There are plans to complicate things further by adding two InfiniBand interfaces.
Can anyone offer any ideas as to how to chase this problem?
thanks,
ron.
More information about the freebsd-hackers mailing list