Re: Poor performance with stable/13 and Mellanox ConnectX-6 (mlx5)

From: Dave Cottlehuber <dch_at_skunkwerks.at>
Date: Fri, 17 Jun 2022 15:03:04 UTC
On Fri, 17 Jun 2022, at 02:38, Mike Jakubik wrote:
> Hi,
>
> I believe you hit the nail on the head! I am now getting consistent 
> high speeds, even higher than on Linux! Is this a problem with the 
> scheduler? Should someone in that area of expertise be made aware of 
> this? More importantly i guess, would this affect real world 
> performance, these servers will be running RabbitMQ (it uses quite a 
> bit of bandwidth) and PostgreSQL w/ replication.

Pinning cores for unimpeded access is very common on high-performance systems. Do this both for the NICs and for your apps, and be mindful of the NUMA topology.

You should look into the Erlang scheduler flags for core pinning, and also make sure that your Erlang processes have unimpeded access to their own cores. A reasonable approach is to build a simple Cowboy or Phoenix app and hammer it with wrk or a similar load tool to get a feel for things, then profile and tune your own app based on those specific results.
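
To make the pinning advice above concrete, here is a rough sketch of how you might plan a split of cores between NIC interrupts and the application inside the NIC's own NUMA domain. The topology dict, the domain number and the core counts are all made-up placeholders, and the helper names are mine; the only real interface assumed is FreeBSD's cpuset(1) with its -l (CPU list), -x (irq) and -p (pid) options. On the Erlang side the emulator's +sbt flag controls scheduler binding; check the erl documentation for what your platform supports.

```python
from typing import Dict, List


def cpu_list(cpus: List[int]) -> str:
    """Render CPU ids as a comma-separated list, a form `cpuset -l` accepts."""
    return ",".join(str(c) for c in sorted(cpus))


def plan_pinning(
    topology: Dict[int, List[int]], nic_domain: int, nic_cores: int
) -> Dict[str, List[int]]:
    """Split the cores of the NIC's NUMA domain between NIC and app.

    topology   -- hypothetical map of NUMA domain id -> CPU ids in that domain
    nic_domain -- domain the NIC is attached to (an assumption you must verify)
    nic_cores  -- how many of that domain's cores to reserve for NIC interrupts
    """
    local = sorted(topology[nic_domain])
    if not 0 < nic_cores < len(local):
        raise ValueError("need at least one core each for the NIC and the app")
    return {"nic": local[:nic_cores], "app": local[nic_cores:]}


if __name__ == "__main__":
    # Hypothetical two-domain host; substitute your real topology.
    topology = {0: list(range(0, 8)), 1: list(range(8, 16))}
    plan = plan_pinning(topology, nic_domain=0, nic_cores=4)

    # Print command templates only; <irq> and <pid> are left for you to fill in.
    print(f"cpuset -l {cpu_list(plan['nic'])} -x <irq>")
    print(f"cpuset -l {cpu_list(plan['app'])} -p <pid>")
```

This only prints command templates and changes nothing on the system. The point of the split is that the NIC's interrupts and the app never compete for the same cores, while both stay in the memory domain closest to the card.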

For RabbitMQ there is an excellent load-testing tool from the Pivotal team if you don’t have suitable load generators yourselves.

Tsung is an excellent tool if you put in the work to craft something specific for your use case.

Please post back to the list with your specific findings and NIC/TCP tunables; these are very helpful for the next person!

Dave