Re: RFC: NFS trunking (multiple TCP connections for a mount

From: Rick Macklem <rmacklem_at_uoguelph.ca>
Date: Fri, 2 Jul 2021 02:40:49 +0000
Rick Macklem wrote:
>In case anyone is interested in testing and/or reviewing the patch,
>it is at https://reviews.freebsd.org/D30970.
>
>Only lightly tested at this point.
>
>The NFS mount option is "nconnect=<N>", where 2 <= N <= 16,
>same as Linux. (I haven't done a man page patch yet.)
I have updated the patch so that the original TCP connection is
used for RPCs that consist of small messages (therefore not needing
much network bandwidth) and the RPCs (Read/Readdir/Write) that
use larger messages are sent on the N-1 additional TCP connections
in a round robin fashion.
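
A minimal sketch of that selection rule, for readers who want to see it spelled out. This is not code from the patch: the enum values, struct conn_picker and pick_connection() are hypothetical names made up for illustration, and the sketch ignores locking, which a real kernel implementation would need.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical RPC classes; names are illustrative, not from the patch. */
enum rpc_proc {
    RPC_LOOKUP, RPC_GETATTR, RPC_ACCESS,   /* small, metadata-related */
    RPC_READ, RPC_READDIR, RPC_WRITE       /* larger messages */
};

/* Hypothetical per-mount state. */
struct conn_picker {
    unsigned nconnect;  /* total TCP connections for the mount, 2..16 */
    unsigned next;      /* round-robin cursor over the N-1 additional ones */
};

/* True for the RPCs that carry large messages. */
static bool rpc_is_large(enum rpc_proc p)
{
    return p == RPC_READ || p == RPC_READDIR || p == RPC_WRITE;
}

/*
 * Return the connection index to use for an RPC.
 * Index 0 is the original connection (small RPCs only);
 * indices 1..nconnect-1 are the additional connections,
 * handed out in round-robin order to the large RPCs.
 */
static unsigned pick_connection(struct conn_picker *cp, enum rpc_proc p)
{
    assert(cp->nconnect >= 2 && cp->nconnect <= 16);
    if (!rpc_is_large(p))
        return 0;
    unsigned idx = 1 + cp->next;
    cp->next = (cp->next + 1) % (cp->nconnect - 1);
    return idx;
}
```

With nconnect=4, a sequence of Getattr, Read, Write, Lookup, Readdir, Read would go out on connections 0, 1, 2, 0, 3, 1 under this sketch.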

The message below was posted a couple of days ago on linux-nfs_at_vger.kernel.org.
It might be unfair to put it here, out of context, but I think it at least
suggests that separating the larger RPC messages from the small ones
(mostly Lookup/Getattr/Access metadata related RPCs) may be useful
under certain circumstances.
> The original issue described was how a high read/write process on the
> client could slow another process trying to do heavy metadata
> operations (like walking the filesystem). Using a different mount to
> the same multi-homed server seems to help a lot (probably because of
> the independent slot table).
--> For this implementation, there is no separate session/slot table.
      (Note that each I/O RPC only uses one table slot.)

I did not make this separation of small and large RPCs onto different
TCP connections a separate option, since I believe there are already too
many mount options.
If others feel it should be a separate mount option, please speak up.

The phabricator patch has been updated. Please test/review/comment.

Thanks, rick

Thanks everyone, for your input, rick

________________________________________
From: Peter Eriksson <pen_at_lysator.liu.se>
Sent: Tuesday, June 29, 2021 5:11 AM
To: Rick Macklem
Cc: freebsd-net
Subject: Re: RFC: NFS trunking (multiple TCP connections for a mount


> I don't understand how multiple TCP connections to the same
> server IP address will distribute the load across multiple network
> interfaces?
> I thought that lagg would have handled this?


A lagg typically keeps all data in a TCP stream on a specific lagg member, depending on how the lagg is set up (unless you select the “roundrobin” option in FreeBSD; don’t do that unless you like out-of-order packets…).

Network equipment with laggs typically hashes the IP streams over the lagg members based on MAC addresses (source & target), IP addresses (source & target) and port numbers.
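
A rough sketch of why this matters for multiple TCP connections: if the member is chosen by hashing the flow's addresses and ports, a single connection always lands on the same member, while connections that differ only in source port can land on different ones. The hash below (FNV-1a) is only a stand-in; real switches and lagg implementations use their own, often vendor-specific, functions, and struct flow and lagg_member() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one flow; field choice follows the text above. */
struct flow {
    uint8_t  src_mac[6], dst_mac[6];
    uint32_t src_ip, dst_ip;       /* IPv4 addresses */
    uint16_t src_port, dst_port;
};

/* FNV-1a step over a byte buffer (stand-in for whatever the equipment uses). */
static uint32_t fnv1a(uint32_t h, const void *buf, size_t len)
{
    const uint8_t *p = buf;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/*
 * Map a flow to one of nmembers lagg members.
 * Fields are hashed one by one so struct padding never enters the hash.
 */
static unsigned lagg_member(const struct flow *f, unsigned nmembers)
{
    assert(nmembers > 0);
    uint32_t h = 2166136261u;   /* FNV-1a 32-bit offset basis */
    h = fnv1a(h, f->src_mac, sizeof f->src_mac);
    h = fnv1a(h, f->dst_mac, sizeof f->dst_mac);
    h = fnv1a(h, &f->src_ip, sizeof f->src_ip);
    h = fnv1a(h, &f->dst_ip, sizeof f->dst_ip);
    h = fnv1a(h, &f->src_port, sizeof f->src_port);
    h = fnv1a(h, &f->dst_port, sizeof f->dst_port);
    return h % nmembers;
}
```

Calling lagg_member() twice with the same flow always returns the same member; whether two connections with different source ports end up on different members depends entirely on the hash and the port numbers chosen.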

(We have been diagnosing a fun problem locally where we see packet losses/performance drops over our internal backbone network for certain combinations of odd/even IP addresses/port numbers when things pass certain SPB “routers” (which typically hash the streams over many “channels” between routers)… Fun fun. :-)

I think the multiple NFS TCP streams could make for some nice performance improvements in certain cases. And it would be a generalisation of having multiple streams between two hosts: one-or-many over IPv4 and one-or-many over IPv6 at the same time. Windows SMB has a similar feature.

Just avoid the Linux NFS mounting deadlock issue with “down” servers please :-)

- Peter


Received on Fri Jul 02 2021 - 02:40:49 UTC