review/test: NFS patch to use pagesize mbuf clusters

Marcelo Araujo araujobsdport at gmail.com
Thu Mar 27 11:46:16 UTC 2014


Hello Rick,

We ran a few tests here, and we could see a small improvement for READ!
We are still double-checking it. All our systems have 10G Intel interfaces
with TSO enabled, and they are subject to the 32 transmit segment limit. We
ran the tests several times and didn't see any regression.

All our systems are based on 9.1-RELEASE with some NFS and IXGBE merges
from 10-RELEASE.

Our machine:
NIC - 10G Intel X540, which is based on the 82599 chipset.
RAM - 24G
CPU - Intel Xeon E5-2448L 1.80GHz.
Motherboard - Homemade.

Attached is a small report. Starting on page 18 there are some graphs
that should make the results easier to see. Let me know if you want to
try anything else, such as another patch. I can keep the environment for
one more week and run more tests.

Best Regards,


2014-03-19 8:06 GMT+08:00 Rick Macklem <rmacklem at uoguelph.ca>:

> Marcelo Araujo wrote:
> >
> > Hello Rick,
> >
> >
> > I have a couple of machines with TSO-capable 10G interfaces.
> > What kind of result are you expecting? Is it a speedup in read?
> >
> Well, if NFS is working well on these systems, I would hope you
> don't see any regression.
>
> If your TSO enabled interfaces can handle more than 32 transmit
> segments, you should see very little effect. (There is usually a
> #define constant in the driver with something like TX_SEGMAX in it;
> the value needs to be >= 34.)
>
> Even if your network interface is one of the ones limited to 32
> transmit segments, the driver usually fixes the list via a call
> to m_defrag(). Although this involves a bunch of bcopy()'ng, you
> still might not see any easily measured performance improvement,
> assuming m_defrag() is getting the job done.
> (Network latency and disk latency in the server will predominate,
>  I suspect. A server built entirely using SSDs might be a different
>  story?)
>
> Thanks for doing testing, since a lack of a regression is what I
> care about most. (I am hoping this resolves cases where users have
> had to disable TSO to make NFS work ok for them.)
>
> rick
>
> >
> > I'm going to run some tests today, but against 9.1-RELEASE, which is
> > what my servers are running.
> >
> >
> > Best Regards,
> >
> >
> >
> >
> >
> > 2014-03-18 9:26 GMT+08:00 Rick Macklem < rmacklem at uoguelph.ca > :
> >
> >
> > Hi,
> >
> > Several of the TSO capable network interfaces have a limit of
> > 32 mbufs in the transmit mbuf chain (the drivers call these transmit
> > segments, which I admit I find confusing).
> >
> > For a 64K read/readdir reply or 64K write request, NFS passes
> > a list of 34 mbufs down to TCP. TCP will split the list, since
> > it is slightly more than 64K bytes, but that split will normally
> > be a copy by reference of the last mbuf cluster. As such, normally
> > the network interface will get a list of 34 mbufs.
> >
> > For TSO enabled interfaces that are limited to 32 mbufs in the
> > list, the usual workaround in the driver is to copy { real copy,
> > not copy by reference } the list to 32 mbuf clusters via m_defrag().
> > (A few drivers use m_collapse() which is less likely to succeed.)
> >
> > As a workaround to this problem, the attached patch modifies NFS
> > to use larger pagesize clusters, so that the 64K RPC message is
> > in 18 mbufs (assuming a 4K pagesize).
> >
> > Testing on my slow hardware which does not have TSO capability
> > shows it to be performance neutral, but I believe avoiding the
> > overhead of copying via m_defrag() { and possible failures
> > resulting in the message never being transmitted } makes this
> > patch worth doing.
> >
> > As such, I'd like to request review and/or testing of this patch
> > by anyone who can do so.
> >
> > Thanks in advance for your help, rick
> > ps: If you don't get the attachment, just email and I'll
> > send you a copy.
> >
> > --
> > Marcelo Araujo
> > araujo at FreeBSD.org
>
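For reference, the mbuf counts quoted above can be reproduced with a few lines of arithmetic. This is only a sketch and not code from the patch: the two extra mbufs per message are an assumption inferred from the figures in the thread (34 - 32 with 2K clusters, 18 - 16 with 4K clusters), and the function names are made up for illustration.

```python
DATA_BYTES = 64 * 1024   # 64K read/readdir reply or write request, as in the thread
EXTRA_MBUFS = 2          # assumption: inferred from 34 - 32 and 18 - 16
TX_SEG_LIMIT = 32        # transmit segment limit mentioned in the thread


def mbufs_needed(data_bytes, cluster_bytes, extra=EXTRA_MBUFS):
    """Clusters needed to hold the payload, plus the extra non-payload mbufs."""
    clusters = (data_bytes + cluster_bytes - 1) // cluster_bytes  # ceiling division
    return clusters + extra


def exceeds_limit(nmbufs, limit=TX_SEG_LIMIT):
    """True when the chain is longer than the interface's segment limit."""
    return nmbufs > limit


for cluster_bytes in (2048, 4096):
    n = mbufs_needed(DATA_BYTES, cluster_bytes)
    print(cluster_bytes, n, exceeds_limit(n))
```

With 2K clusters the chain is 34 mbufs and exceeds the 32 segment limit, which is the case where the driver falls back to m_defrag(); with 4K pagesize clusters it is 18 mbufs and stays under the limit.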



-- 
Marcelo Araujo
araujo at FreeBSD.org
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Benchmarkoriginal.pdf
Type: application/pdf
Size: 330461 bytes
Desc: not available
URL: <http://lists.freebsd.org/pipermail/freebsd-fs/attachments/20140327/2c0c6c21/attachment-0001.pdf>

