mbuf leakage with nfs/zfs?

Robert N. M. Watson rwatson at freebsd.org
Sun Feb 28 12:21:32 UTC 2010

On Feb 28, 2010, at 12:11 PM, Daniel Braniss wrote:

>> I'm pulling in Robert Watson, who has some familiarity with the UDP
>> stack/code in FreeBSD.  I'm not sure he'll be a sufficient source of
>> knowledge for this specific issue since it appears (?) to be specific to
>> NFS; Rick Macklem would be a better choice, but as reported, he's MIA.
>> Robert, are you aware of any changes or implementation issues which
>> might cause excessive (read: leaking) mbuf use under UDP-based NFS?  Do
>> you know of a way folks could determine the source of the leak, either
>> via DDB or while the system is live?
> I have been running some tests in a controlled environment.
> Server and client are both 64-bit Xeon X5550 @ 2.67GHz with 16GB of memory,
> FreeBSD/SMP: 2 package(s) x 4 core(s) x 2 SMT threads.
> The client is running latest 8.0-stable.
> The load is created by running 'make -j32 buildworld' and sleeping 150 sec.
> in between runs; this is the straight line you will see in the graphs.
> Both the src and obj directories are NFS-mounted from the server, regular UFS.
> When the server is running 7.2-stable, no leakage is seen:
> see ftp://ftp.cs.huji.ac.il/users/danny/freebsd/mbufs/{tcp,udp}-7.2.ps
> When the server is running 8.0-stable,
> see ftp://ftp.cs.huji.ac.il/users/danny/freebsd/mbufs/{tcp,udp}-8.0.ps
> you can see that udp is leaking!
> cheers,
> 	danny
> ps: I think the subject should be changed again, removing zfs ...

This type of problem (occurring with one client but not another) is almost always the result of a particular client's access pattern triggering a specific (and perhaps single) bug in error handling. For example, we might not be properly freeing the received request mbuf when generating an EPERM reply in some edge case. The hard part is identifying which path it is. If it's reproducible with UDP, the usual process is:
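As a minimal sketch of that error-path pattern (all names and types here are mocks of mine, not the real kernel code; the real struct mbuf and m_freem() live in <sys/mbuf.h>), the shape of the bug is an early error return that skips the free:

```c
#include <assert.h>
#include <stdlib.h>

/* Mock mbuf and allocation counter standing in for the kernel objects;
 * the point is the shape of the bug, not the real NFS server code. */
static int mbufs_in_use = 0;

struct mock_mbuf { int len; };

static struct mock_mbuf *
mock_m_get(void)
{
	mbufs_in_use++;
	return calloc(1, sizeof(struct mock_mbuf));
}

static void
mock_m_freem(struct mock_mbuf *m)
{
	mbufs_in_use--;
	free(m);
}

/* Buggy shape: the early error return skips freeing the request mbuf. */
static int
handle_request_leaky(struct mock_mbuf *m, int error)
{
	if (error != 0)
		return (error);	/* BUG: 'm' is leaked on this path */
	mock_m_freem(m);
	return (0);
}

/* Fixed shape: the request mbuf is released on every exit path. */
static int
handle_request_fixed(struct mock_mbuf *m, int error)
{
	mock_m_freem(m);	/* free the request whether or not we error out */
	return (error);
}
```

With that shape, the success path leaves the counter balanced while each error hit leaks exactly one mbuf, which is why the leak rate tracks how often the client hits the faulty RPC.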

- Build a minimal test case that triggers the problem -- ideally with as little complexity as possible.
- Run netstat -m on the server at the beginning and end of the test to count the number of leaked mbufs.
- Run wireshark throughout the test.
- Walk the wireshark trace looking for some error that occurs about the same number of times as (or slightly fewer than) the number of mbufs leaked.
- Iterate, narrowing the test case until either it's obvious exactly what's going on, or you've identified a relatively constrained code path and can spot the bug by reading the code.
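The netstat -m bookkeeping in the steps above can be sketched as follows. This assumes the first line of netstat -m output has the "NNN/NNN/NNN mbufs in use (current/cache/total)" shape it has on FreeBSD 7.x/8.x (check against your own output), and the sample values in the test are made up for illustration:

```c
#include <assert.h>
#include <stdio.h>

/* Parse the current/cache/total triple from the first line of
 * "netstat -m" output. */
static int
parse_mbuf_line(const char *line, unsigned *cur, unsigned *cache,
    unsigned *total)
{
	return (sscanf(line, "%u/%u/%u mbufs in use", cur, cache, total) == 3);
}

/* The leaked-mbuf count is roughly the growth in the "total" figure
 * between two snapshots taken while the server is otherwise idle.
 * Returns -1 if either line doesn't parse. */
static long
mbuf_delta(const char *before, const char *after)
{
	unsigned c0, k0, t0, c1, k1, t1;

	if (!parse_mbuf_line(before, &c0, &k0, &t0) ||
	    !parse_mbuf_line(after, &c1, &k1, &t1))
		return (-1);
	return ((long)t1 - (long)t0);
}
```

The delta is what you compare against the count of suspicious RPC errors in the wireshark trace.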

It's almost certainly one or a small number of very specific RPCs triggering it -- maybe OpenBSD does an extra lookup, or stat, or something, on a name that no longer exists, or does it sooner than the other clients do. Hard to say, other than to wave hands at the possibilities.

And it may well be that we're looking at two bugs: Danny may be seeing one bug, perhaps triggered by a race condition, that is different from the OpenBSD client-triggered bug (to be clear: it's definitely a FreeBSD bug either way, although we might only see it with an OpenBSD client because OpenBSD also has a bug, or a feature, that exposes it).


More information about the freebsd-stable mailing list