RFC: NFS trunking (multiple TCP connections for a mount)

From: Rick Macklem <rmacklem_at_uoguelph.ca>
Date: Tue, 29 Jun 2021 00:23:21 UTC
The Linux NFS client now has a mount option "nconnect",
which specifies that multiple TCP connections be created
for an NFS mount, with RPCs sent over the connections in a
round robin fashion. (Alternating between the two TCP
connections for the case of nconnect=2.)
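
To make the round robin assignment concrete, here is a minimal sketch in
Python. It is only an illustration of the idea, not the Linux code or a
proposed FreeBSD implementation; the class name RoundRobinPicker and the
method next_conn are made up for this example.

```python
import itertools


class RoundRobinPicker:
    """Hypothetical helper: hands out connection indices 0..nconnect-1 in turn."""

    def __init__(self, nconnect):
        if nconnect < 1:
            raise ValueError("nconnect must be at least 1")
        # cycle() repeats 0, 1, ..., nconnect-1 forever.
        self._cycle = itertools.cycle(range(nconnect))

    def next_conn(self):
        # Each RPC simply takes the next connection in the cycle,
        # regardless of which procedure it is or how large it is.
        return next(self._cycle)


# nconnect=2: successive RPCs alternate between connection 0 and 1.
picker = RoundRobinPicker(2)
order = [picker.next_conn() for _ in range(6)]
print(order)  # [0, 1, 0, 1, 0, 1]
```

Note that this assignment ignores the RPC size, so a small RPC can still
land on a connection that is busy with large ones.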

The Linux man page says:
              When using a connection oriented protocol such as TCP, it
              may sometimes be advantageous to set up multiple
              connections between the client and server. For instance,
              if your clients and/or servers are equipped with multiple
              network interface cards (NICs), using multiple connections
              to spread the load may improve overall performance.  In
              such cases, the nconnect option allows the user to specify
              the number of connections that should be established
              between the client and server up to a limit of 16.

I don't understand how multiple TCP connections to the same
server IP address will distribute the load across multiple network
interfaces. I thought that lagg would have handled this?

I could easily implement this, but I only have low end hardware
to test on, so I doubt that I will see any performance improvement.

However, I do think there could be a benefit to having two TCP
connections, where the RPCs involving large RPC messages
(Read/Readdir/Write) are sent on one TCP connection and the RPCs that
use small RPC messages (Lookup/Access/Getattr,...) are sent on the other one.
--> This would keep the frequent small RPCs from getting "logjammed"
       behind a bunch of large 1Mbyte Read replies, for example.
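
A rough sketch of the large/small split, again in Python and only as an
illustration. The names LARGE_PROCS, pick_conn and bytes_ahead are
hypothetical, and the 128 byte Getattr reply size is an assumption picked
for the example, not a measured value.

```python
# Procedures treated as "large" in this sketch; everything else is "small".
LARGE_PROCS = {"Read", "Readdir", "Write"}
BULK_CONN = 0   # connection carrying large RPC messages
SMALL_CONN = 1  # connection carrying small RPC messages


def pick_conn(proc):
    """Choose a connection by procedure name (hypothetical policy)."""
    return BULK_CONN if proc in LARGE_PROCS else SMALL_CONN


def bytes_ahead(queue, index):
    """Bytes that must be transferred before the reply at queue[index]."""
    return sum(size for _, size in queue[:index])


# Four 1Mbyte Read replies queued ahead of one small Getattr reply.
replies = [("Read", 1 << 20)] * 4 + [("Getattr", 128)]

# One connection: the Getattr reply sits behind all four Read replies.
single_wait = bytes_ahead(replies, len(replies) - 1)

# Two connections: only the replies on the same connection are ahead of it.
small_queue = [r for r in replies if pick_conn(r[0]) == SMALL_CONN]
split_wait = bytes_ahead(small_queue, len(small_queue) - 1)

print(single_wait, split_wait)  # 4194304 0
```

This only counts bytes queued ahead of the small reply. It says nothing
about how the two connections share the underlying link.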

So, what do you think?
- Implement "nconnect" with round robin RPC assignment.
- Implement two TCP connections where large RPCs are done
  on one and small RPCs on the other.

I will note I see downsides to doing multiple TCP connections/mount.
1 - Uses up more IP port#s.
2 - When an NFS server gets overloaded, it will stop receiving RPC requests.
     This will eventually apply backpressure through TCP to the client to slow
     down RPC requests. Having multiple TCP connections would reduce this
     backpressure effect.
     --> To be honest, I suspect the slowdown in RPC replies caused by an
           overloaded server is more effective feedback to the NFS client
           than TCP backpressure, but I am not sure.

Comments? rick