Kernel modules

Jason Bacon bacon4000 at gmail.com
Sun Apr 14 18:47:52 UTC 2019


On 2019-04-14 12:54, Rodney W. Grimes wrote:
>> On 2019-04-13 13:29, Justin Clift wrote:
>>> On 2019-04-13 23:52, Jason Bacon wrote:
>>> <snip>
>>>> Stability will take a long time to test properly.  I'm going to start
>>>> by rerunning some of our most I/O-intensive jobs on it - jobs that
>>>> actually broke our CentOS RAID servers until I switched them to NFS
>>>> over RDMA.
>>> That's got to be the first time anyone's ever mentioned "NFS over
>>> RDMA" as increasing a system's stability. :)
>>>
>>> + Justin
>> Believe it or not... ;-)
>>
>> After my upgrade from CentOS 6 to CentOS 7, NFS over TCP started falling
>> apart under heavy load; servers and compute nodes becoming unresponsive
>> and requiring a reboot to restore stability.
>>
>> If it's due to problems in the CentOS TCP stack, NFS over RDMA would
>> help by eliminating the TCP stack from the pathway.
> Any idea what happened in the CentOS TCP stack between 6 and 7?
>
>
Not really - I don't have time to do a deep dive into such specifics given
the number of hats I have to wear.  I can only say that for our 
particular use case, CentOS 7 is generally more complicated, slower and 
slightly less reliable than 6 (which actually served us well for 
years).  I hit a few pitfalls following the upgrades, but I found my way 
around them and our clusters are pretty stable now.


-- 
Earth is a beta site.



More information about the freebsd-infiniband mailing list