Linux NFS client and FreeBSD server strangeness

Rick Macklem rmacklem at uoguelph.ca
Fri Apr 6 00:19:18 UTC 2018


Ben RUBSON wrote:
>On 04 Apr 2018 20:27, Mike Tancsa wrote:
>> Note, doing something like
>>
>> dd if=/dev/zero of=/backup/test.bin bs=4096 count=5000000
>
>Note that this test may not be really relevant if you have ZFS compression
>enabled.
>
>> I too am using 9000 for the MTU.
>
>Did you try using a smaller MTU?
>Some network adapters are known to have bugs when requesting 9K mbufs for
>large MTUs.
>Especially Mellanox; not sure about Chelsio though.
When the system uses a mix of mbuf cluster sizes (which almost always happens
when you use jumbo packets), the kernel memory pool they are allocated from
can become fragmented to the point that jumbo clusters can't be allocated.

That breaks NFS performance badly. I had some patches that used jumbo
mbuf clusters for NFS read replies and write requests.
They worked fine for a while, but would then get hammered by the fragmentation
problem. (As such, they never went into head, etc.)
(The main advantage of using jumbo clusters was that the mbuf chain for an
 RPC had fewer mbufs in it and wouldn't get bitten by bugs in TSO
 implementations for interfaces that could only handle 32 buffers per TSO
 segment. There is code in tcp_output() that avoids this problem, but it
 only works if the net device driver sets the parameters correctly.)

Some have said that 9K jumbo clusters shouldn't exist in FreeBSD because
of the fragmentation problem. Others proposed using separate pools for
each mbuf cluster size, but nothing has happened as far as I know.

rick


More information about the freebsd-fs mailing list