svn commit: r248196 - head/sys/nfs

Andre Oppermann andre at freebsd.org
Tue Mar 12 15:31:09 UTC 2013


On 12.03.2013 16:00, Gleb Smirnoff wrote:
>    Andre,
>
> On Tue, Mar 12, 2013 at 03:23:16PM +0100, Andre Oppermann wrote:
> A> On 12.03.2013 13:19, Gleb Smirnoff wrote:
> A> > Author: glebius
> A> > Date: Tue Mar 12 12:19:23 2013
> A> > New Revision: 248196
> A> > URL: http://svnweb.freebsd.org/changeset/base/248196
> A> >
> A> > Log:
> A> >    Use m_get2() to get mbuf of appropriate size.
> A>
> A> The problem with m_get2() is that it will attempt to use jumbo mbufs
> A> larger than PAGE_SIZE if the request size is sufficiently large.
> A>
> A> In light of the recent issues with jumbo > PAGE_SIZE allocations on
> A> loaded systems I don't think it is appropriate to extend this practice
> A> further.
> A>
> A> Anything that needs more than PAGE_SIZE (where PAGE_SIZE == 4K) mbuf
> A> space should allocate an mbuf chain with m_getm2().  That is what we
> A> have mbuf chains for.
> A>
> A> Sufficient availability of mbufs > PAGE_SIZE cannot be guaranteed after
> A> some uptime and/or a loaded system due to memory fragmentation.  Allocation
> A> can fail non-deterministically for mbufs > PAGE_SIZE long before smaller
> A> mbufs become unavailable due to overall (kernel) memory exhaustion.
> A>
> A> Mbufs > PAGE_SIZE are not only contiguous in kernel address space but
> A> also in real memory address space.  They are intended jumbo frames on
> A> poor NIC's without scatter-capable DMA engines.
> A>
> A> When you put the m_get2() function proposal for review and comments I
> A> already highlighted these issues and put forward serious concerns about
> A> this.
> A>
> A> Please change m_get2() to limit itself to mbuf allocations <= PAGE_SIZE.
>
> I already got similar patch in my queue. We have a lot of code that could
> benefit from using both m_get2() and m_getm2() that are limited to not
> use jumbos at all.

Indeed.  That's why I applaud your work in converting them.

> If you are concerned about using jumbos that are > PAGE_SIZE, then I can
> extend API in my patch.  ... done.
>
> Patch attached.
>
> The NFS code itself guarantees that it won't request > than MCLBYTES,
> so using bare m_get2() here is safe. I can add flag there later for
> clarity.

Using PAGE_SIZE clusters is perfectly fine and no flag to prevent that
is necessary.  In fact we're doing it for years on socket writes without
complaints (through m_getm2()).

However I think that m_get2() should never ever even try to attempt to
allocate mbuf clusters larger than PAGE_SIZE.  Not even with flags.

All mbufs > PAGE_SIZE should be exclusively and only ever be used by drivers
for NIC's with "challenged" DMA engines.  Possibly only available through a
dedicated API to prevent all other uses of it.

-- 
Andre



More information about the svn-src-all mailing list