Kernel modules

Rodney W. Grimes freebsd-rwg at gndrsh.dnsmgr.net
Sun Apr 14 17:55:32 UTC 2019


> On 2019-04-13 13:29, Justin Clift wrote:
> > On 2019-04-13 23:52, Jason Bacon wrote:
> > <snip>
> >> Stability will take a long time to test properly.  I'm going to start
> >> by rerunning some of our most I/O-intensive jobs on it - jobs that
> >> actually broke our CentOS RAID servers until I switched them to NFS
> >> over RDMA.
> >
> > That's got to be the first time anyone's ever mentioned "NFS over
> > RDMA" as increasing a system's stability. :)
> >
> > + Justin
> 
> Believe it or not... ;-)
> 
> After my upgrade from CentOS 6 to CentOS 7, NFS over TCP started falling 
> apart under heavy load; servers and compute nodes becoming unresponsive 
> and requiring a reboot to restore stability.
> 
> If it's due to problems in the CentOS TCP stack, NFS over RDMA would 
> help by eliminating the TCP stack from the pathway.

Any idea what happened in the CentOS TCP stack between 6 and 7?

> On one cluster (old QLogic HCAs), setting net.core.netdev_budget=2000 
> seems to have solved the issue.  On the other (newer Mellanox FDR HCAs), 
> it did not seem to help, so I tried RDMA and it's been stable ever 
> since.  The downside is we can no longer monitor traffic with iftop...
> 
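
For anyone wanting to check whether the softirq receive budget is
actually being exhausted before raising net.core.netdev_budget, here is
a minimal sketch.  It assumes the Linux /proc/net/softnet_stat layout
(one line per CPU, hex fields, where the first three are packets
processed, packets dropped, and time_squeeze).  The function name is
hypothetical, and nothing here is known to be what was done on the
clusters discussed above.

```python
def parse_softnet_stat(text):
    """Parse the contents of /proc/net/softnet_stat.

    Returns one (processed, dropped, time_squeeze) tuple per CPU line.
    Assumption: fields are whitespace-separated hex values and the
    first three columns have the meanings named above.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or truncated lines
        processed, dropped, squeezed = (int(f, 16) for f in fields[:3])
        rows.append((processed, dropped, squeezed))
    return rows
```

Feeding it the file contents (for example open("/proc/net/softnet_stat").read())
and watching whether the time_squeeze column keeps growing under load
gives a rough hint that the receive loop is ending with work still
pending, which is the situation a larger netdev_budget is meant to
relieve.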

-- 
Rod Grimes                                                 rgrimes at freebsd.org


More information about the freebsd-infiniband mailing list