Strange networking behaviour in storage server

Karli Sjöberg karli.sjoberg at slu.se
Mon Jun 1 07:07:21 UTC 2015


Hey!

So we have this ZFS storage server that was upgraded from 9.3-RELEASE to
10.1-STABLE to get around two problems: 1) not being able to use SSD
drives as L2ARC[1] and 2) not being able to hotswap SATA drives[2].

After the upgrade we've noticed a very odd networking behaviour: it
sends/receives at full speed for a while, then there are a couple of
minutes of complete silence where even terminal commands like "ls" just
wait before they are executed, and then it starts sending at full speed
again. I've linked to a screenshot showing this send-and-pause behaviour.
The blue line is the total, green is SMB and turquoise is NFS over jumbo
frames. It behaves this way regardless of the protocol.

http://oi62.tinypic.com/33xvjb6.jpg
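
For anyone who wants to put numbers on the pauses instead of reading them
off a graph, here is a minimal sketch in Python. It assumes you already
have periodic samples of an interface's cumulative byte counter as
(timestamp, bytes) pairs, collected however you like. The function name
find_stalls and the thresholds min_stall_s and idle_bps are made up for
illustration and are not part of any FreeBSD tool.

```python
from typing import List, Tuple


def find_stalls(
    samples: List[Tuple[float, int]],
    min_stall_s: float = 10.0,   # hypothetical: shortest pause worth reporting
    idle_bps: float = 1024.0,    # hypothetical: below this rate counts as "silent"
) -> List[Tuple[float, float]]:
    """Return (start, end) times of periods where throughput stayed near zero.

    samples must be sorted by time; each item is
    (timestamp_in_seconds, cumulative_byte_counter).
    """
    stalls = []
    stall_start = None
    for (t0, b0), (t1, b1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicate or out-of-order samples
        rate = (b1 - b0) / dt
        if rate < idle_bps:
            # Traffic is (almost) silent; remember when the silence began.
            if stall_start is None:
                stall_start = t0
        else:
            # Traffic resumed; close the silent period if it was long enough.
            if stall_start is not None and t0 - stall_start >= min_stall_s:
                stalls.append((stall_start, t0))
            stall_start = None
    # A silent period that is still running at the last sample.
    if stall_start is not None and samples[-1][0] - stall_start >= min_stall_s:
        stalls.append((stall_start, samples[-1][0]))
    return stalls
```

Feeding it samples taken every second or so should give a list of pause
windows that can be lined up against system logs from the same period.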

The problem is that these pauses can sometimes be so long that
connections drop. For example, someone is copying files over SMB or
iSCSI and suddenly gets an error message saying that the transfer
failed, and has to start over with the file(s). That's horrible!

So far NFS has proven to be the most resilient; its stupid-simple
nature just waits and resumes the transfer when the pause is over. Kudos
for that.

The server is driven by a Supermicro X9SRL-F, a Xeon 1620v2 and 64GB ECC
RAM. The hardware has been ruled out: we happened to have an identical MB
and CPU lying around, and swapping them in didn't improve things. We have
also installed an Intel PRO 100/1000 Quad-port ethernet adapter to test
if that would change things, but it hasn't; it still behaves this way.

The two built-in NICs are Intel 82574L and the Quad-port NICs are
Intel 82571EB, so both are em(4) driven. I happen to know that the em
driver was updated between 9.3 and 10.1. Perhaps that is to blame, but I
have no idea.

Is there anyone that can make sense of this?

[1]:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=197164

[2]:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=191348

/K

