kern/144330: [nfs] mbuf leakage in nfsd with zfs

Rick Macklem rmacklem at uoguelph.ca
Mon Mar 22 13:51:53 UTC 2010



On Mon, 22 Mar 2010, Daniel Braniss wrote:

>
> well, it's much better!, but no cookies yet :-)
>

Well, that's good news. I'll try to get dfr to review it and then
commit it. Thanks, Mikolaj, for finding this.

> from comparing graphs in
> 	ftp://ftp.cs.huji.ac.il/users/danny/freebsd/mbuf-leak/
> store-01-e.ps: a production server running newfsd - now up almost 20 days
> 	notice that the average used mbuf is below 1000!
>
> store-02.ps: kernel without last patch, classic nfsd
> 	the leak is huge.
>
> store-02++.ps: with latest patch
> 	the leak is much smaller but I see 2 issues:
> 		- the initial leap to over 2000, then a smaller leak.

The initial leap doesn't worry me. That's just a design constraint.
A slow leak after that is still a problem. (I might have seen the
slow leak in testing here. I'll poke at it and see if I can reproduce
that.)

>
> could someone explain replay_prune() to me?
>
I just looked at it and I think it does the following:
    - when it thinks the cache is too big (either too many entries
      or too much mbuf data), it loops until either the cache is no
      longer too big or it can't free any more entries
      (when an entry is freed, rc_size and rc_count are reduced)
    - the loop runs from the end of the tailq, so it frees the
      least recently used entries first
    - the test for rce_repmsg.rm_xid != 0 avoids freeing entries
      that are in progress, since rce_repmsg is all zeroed until
      the reply has been generated
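
The pruning behaviour described above can be sketched in user-space C.
This is a simplified illustration, not the kernel code: struct entry,
struct cache, cache_init(), cache_add() and cache_prune() are
hypothetical stand-ins, and the limits are plain fields rather than the
kernel's actual tunables. It makes a single pass from the tail, which
is an assumption made for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical, simplified stand-in for a replay cache entry. */
struct entry {
    TAILQ_ENTRY(entry) link;
    uint32_t xid;   /* 0 while the reply is still being generated */
    size_t   size;  /* bytes of reply data held by this entry */
};
TAILQ_HEAD(entry_list, entry);

/* Hypothetical cache: head is most recently used, tail is least. */
struct cache {
    struct entry_list all;
    int    count, max_count;
    size_t size, max_size;
};

static void cache_init(struct cache *c, int max_count, size_t max_size)
{
    TAILQ_INIT(&c->all);
    c->count = 0;
    c->size = 0;
    c->max_count = max_count;
    c->max_size = max_size;
}

/* Add a new entry at the head (most recently used position). */
static struct entry *cache_add(struct cache *c, uint32_t xid, size_t size)
{
    struct entry *e = calloc(1, sizeof(*e));
    if (e == NULL)
        return NULL;
    e->xid = xid;
    e->size = size;
    TAILQ_INSERT_HEAD(&c->all, e, link);
    c->count++;
    c->size += size;
    return e;
}

/*
 * Walk from the tail (least recently used) towards the head, freeing
 * completed entries until the cache is within both limits or there is
 * nothing left to examine. Entries with xid == 0 are in progress and
 * are skipped.
 */
static void cache_prune(struct cache *c)
{
    struct entry *e = TAILQ_LAST(&c->all, entry_list);

    while (e != NULL &&
        (c->count > c->max_count || c->size > c->max_size)) {
        struct entry *prev = TAILQ_PREV(e, entry_list, link);

        if (e->xid != 0) {
            assert(c->size >= e->size);
            TAILQ_REMOVE(&c->all, e, link);
            c->count--;
            c->size -= e->size;
            free(e);
        }
        e = prev;
    }
}
```

With a limit of two entries and four entries added, one of them still
in progress, cache_prune() frees the two oldest completed entries and
leaves the in-progress one alone.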

I did notice that replay_setsize() calls replay_prune() without locking
the mutex first, so it doesn't look SMP safe to me for this case, but I
doubt that would cause a slow leak. (I think replay_setsize() is only
called when the number of mbuf clusters in the kernel changes. The
missing lock might cause a kernel crash if the tailq wasn't in a
consistent state while replay_prune() rattled through the list in the
loop.)
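
One possible shape for closing that hole is to take the cache lock
around both the size update and the prune. The sketch below uses a
pthread mutex in user space as a stand-in for the kernel mutex. All
names here (struct cache, cache_init(), cache_prune_locked(),
cache_setsize()) are hypothetical, the prune body is only a
placeholder, and this is not the committed change.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical cache with a lock protecting its bookkeeping. */
struct cache {
    pthread_mutex_t lock;
    size_t size;      /* bytes currently cached */
    size_t max_size;  /* limit on cached bytes */
};

static void cache_init(struct cache *c, size_t max_size)
{
    pthread_mutex_init(&c->lock, NULL);
    c->size = 0;
    c->max_size = max_size;
}

/*
 * Placeholder for the real pruning loop. The caller must hold c->lock,
 * because a real implementation would walk and modify the entry list.
 */
static void cache_prune_locked(struct cache *c)
{
    if (c->size > c->max_size)
        c->size = c->max_size;
}

/* Change the limit and prune while holding the lock throughout. */
static void cache_setsize(struct cache *c, size_t new_max)
{
    pthread_mutex_lock(&c->lock);
    c->max_size = new_max;
    cache_prune_locked(c);
    pthread_mutex_unlock(&c->lock);
}
```

Holding the lock across both steps means no other thread can see the
new limit with the list half pruned.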

rick


