Re: 100Gb performance

From: Rick Macklem <rick.macklem_at_gmail.com>
Date: Fri, 20 Jun 2025 05:25:12 UTC

On Thu, Jun 19, 2025 at 2:34 PM Olivier Cochard-Labbé
<olivier@freebsd.org> wrote:
>
>
>
> On Thu, Jun 19, 2025 at 4:31 PM Rick Macklem <rick.macklem@gmail.com> wrote:
>
>>
>> There is the "nconnect" mount option. It might help here.
>>
>
> Interesting!
>
> Let’s try:
>
> Server side:
> ```
> mkdir /tmp/nfs
> mount -t tmpfs tmpfs /tmp/nfs
> chmod 777 /tmp/nfs/
> cat > /etc/exports <<EOF
> V4: /tmp
> /tmp/nfs -network 1.1.1.0/24
> EOF
> sysrc nfs_server_enable=YES
> sysrc nfsv4_server_enable=YES
> sysrc nfsv4_server_only=YES
> service nfsd start
> ```
>
> Client side:
> ```
> mkdir /tmp/nfs
> sysrc nfs_client_enable=YES
> service nfsclient start
> ```
>
> Now testing standard speed:
> ```
> # mount -t nfs -o noatime,nfsv4 1.1.1.30:/nfs /tmp/nfs/
> # netstat -an -f inet -p tcp | grep 2049 | wc -l
>        1
> # dd if=/dev/zero of=/tmp/nfs/test bs=1G count=10
> 10+0 records in
> 10+0 records out
> 10737418240 bytes transferred in 8.526794 secs (1259256159 bytes/sec)
> # rm /tmp/nfs/test
> # umount /tmp/nfs
> ```
>
> And with nconnect=16:
> ```
> # mount -t nfs -o noatime,nfsv4,nconnect=16 1.1.1.30:/nfs /tmp/nfs/
> # dd if=/dev/zero of=/tmp/nfs/test bs=1G count=10
> 10+0 records in
> 10+0 records out
> 10737418240 bytes transferred in 8.633871 secs (1243638980 bytes/sec)
> # rm /tmp/nfs/test
> # netstat -an -f inet -p tcp | grep 2049 | wc -l
>       16
> ```
>
> => No difference here, but 16 output queues were correctly used with nconnect=16.
> How is load-sharing done with NFS nconnect?
> I’ve tested with benchmarks/fio using parallel jobs and I don’t see any improvement either.
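
For scale, the two dd rates quoted above can be converted to Gbit/s. This is only a rough sketch: to_gbit is a helper name made up for this example, and it assumes awk is available.

```sh
# Convert a dd rate in bytes/sec to Gbit/s (decimal, 1e9 bits).
# to_gbit is a hypothetical helper, not part of any tool in this thread.
to_gbit() {
    awk -v bps="$1" 'BEGIN { printf "%.2f\n", bps * 8 / 1e9 }'
}

to_gbit 1259256159   # single connection -> 10.07
to_gbit 1243638980   # nconnect=16       -> 9.95
```

Both runs land at roughly 10 Gbit/s, so nconnect=16 did not change the result.
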
Here are a few other things you can try.
On the server:
- add nfs_server_maxio=1048576 to /etc/rc.conf.

On the client:
- put vfs.maxbcachebuf=1048576 in /boot/loader.conf
- use "wcommitsize=<some large value>" as an additional mount option.

On both client and server, bump kern.ipc.maxsockbuf up a bunch.

Once you have done the mount, run
# nfsstat -m
on the client. You should see rsize/wsize set to 1048576
and a large value for wcommitsize.
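
If you would rather check those values in a script than by eye, something like the following might do. It is a sketch that assumes the options show up in the "nfsstat -m" output as comma-separated key=value pairs; nfs_opt is a name made up for this example.

```sh
# Print the value of one mount option read from `nfsstat -m` output on stdin.
# Assumption: options appear as comma-separated key=value pairs.
# nfs_opt is a hypothetical helper name.
nfs_opt() {
    tr ',' '\n' | sed -n "s/^[[:space:]]*$1=//p"
}
```

Usage on the client would be along the lines of "nfsstat -m | nfs_opt wsize", and the same for rsize and wcommitsize.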

For reading, you should also use "readahead=8" as a mount option.
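
Putting the mount options together with the mount line from the test above could look like this. It is a sketch only: nfs_opts is a made-up helper, and the wcommitsize value is a placeholder picked for illustration, not a recommendation.

```sh
# Build the -o string for the client mount.
# nfs_opts is a hypothetical helper; both arguments are yours to choose.
nfs_opts() {
    nconnect=$1      # number of TCP connections, e.g. 16 as in the test above
    wcommitsize=$2   # bytes; placeholder for "some large value"
    printf 'noatime,nfsv4,nconnect=%s,wcommitsize=%s,readahead=8' \
        "$nconnect" "$wcommitsize"
}

# Print the command for review instead of running it.
# 16777216 is an illustrative value only.
echo "mount -t nfs -o $(nfs_opts 16 16777216) 1.1.1.30:/nfs /tmp/nfs/"
```

Drop the echo once the option string looks right.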

Also, if you can turn down (or turn off) interrupt moderation on the
NIC driver, try that. (Interrupt moderation is great for data streaming
in one direction but is not so good for NFS, which consists of bidirectional
traffic of mostly small RPC messages. Every Write gets a small reply message
in the server->client direction to complete the Write, and delaying the
processing of these small received messages will slow NFS down.)

rick

>
> Regards,
> Olivier