Recovery from mbuf cluster exhaustion

Peter Bozarov peter.bozarov at moniforce.com
Wed Oct 8 05:41:00 PDT 2003


Bruce M Simpson wrote:

> Support for 4.7 is very limited as we transition to 4.9, so please be
> prepared to upgrade the box. Bear in mind we commit fixes for problems
> to HEAD first, except in those cases where RELENG_4 is more appropriate.

I wonder if it has to do with the version I'm running; let's hope so,
since an upgrade would then get rid of this. Unfortunately, an upgrade
to 4.9 is not feasible for me right now (I'd have done it a long time
ago otherwise).

> Did you add the 10.2.1.68 route manually? Note that there is code
> in if_ethersubr.c which should loopback a copy of a packet sent on an
> IFF_SIMPLEX interface automatically, so it shouldn't be required.
> 
> For example, on my laptop:
> 192.168.1.68         00:04:76:5e:ec:7d  UHLW        0        2    lo0
> 
> This route is created automatically by arp_rtrequest(). The RTF_WASCLONED
> (W) flag tells us this. Because ether_output() is calling if_simloop() to
> loopback the packets, the RTF_LLINFO (L) flag gets ignored.

The route is added at boot time via DHCP. In fact, this is all that I
have in my /etc/rc.conf:

ifconfig_rl0="DHCP"                                  # outside, leads to the Internet
ifconfig_xl0="inet 10.0.0.1 netmask 255.255.255.0"   # inside, 10.0.0.0/24
ifconfig_xl1="up"

> 
> Try removing this route and see what happens.
> 
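Done. For the record, I cleared it with something like this (quoting
from memory):

route delete 10.2.1.68     # drop the cloned lo0 host route
arp -d 10.2.1.68           # and flush any matching ARP entry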
These are the routing tables now:
Internet:
Destination Gateway            Flags    Refs      Use  Netif Expire
default     10.2.1.1           UGSc       24        0    rl0
10.2.1/24   link#1             UC          3        0    rl0
10.2.1.1    00:04:76:1f:53:60  UHLW       27      444    rl0   1134
10.2.1.2    00:60:08:10:4a:36  UHLW        0        2    rl0    928
10.2.1.6    00:d0:a8:00:a8:f5  UHLW        0      334    rl0   1178
127.0.0.1   127.0.0.1          UH          1       30    lo0

With the 10.2.1.68 route on lo0 removed, injecting packets from the
dump file still leads to mbuf cluster depletion. netstat -m after a
run shows:

4433/4592/18240 mbufs in use (current/peak/max):
         4433 mbufs allocated to data
4432/4560/4560 mbuf clusters in use (current/peak/max)
10268 Kbytes allocated to network (75% of mb_map in use)
219 requests for memory denied
4 requests for memory delayed
0 calls to protocol drain routines
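In case it helps anyone reproduce this, I've been watching the
counters climb with a crude loop along these lines:

# poll mbuf usage every few seconds while the dump is replaying
while true; do
    netstat -m | grep -E 'mbuf clusters|denied'
    sleep 5
done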

> Are you using the bridging code to do this?
> If so, can you post the bridging configuration?

I'm sorry, I meant gateway, not bridging. I have gateway_enable="yes"
in rc.conf and the following in my ipfw configuration:

00050 divert 8668 ip from any to any via rl0

This way the machine on the 10.0.0.0/24 network can get onto the
10.2.1.0/24 network (which leads to the Internet). Conversely, traffic
for the 10.0.0.0/24 network received on the 10.2.1.0/24 interface
(rl0) is routed appropriately to xl0.
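
For completeness: 8668 is natd's default divert port, and the rest is
the usual natd arrangement. The relevant rc.conf knobs for this kind
of setup look roughly like this:

gateway_enable="yes"      # forward packets between rl0 and xl0
firewall_enable="YES"     # load the ipfw rules, including the divert rule
natd_enable="YES"         # natd attaches to divert port 8668 by default
natd_interface="rl0"      # translate on the outside interface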

> Unfortunately it doesn't quite work that way... you are exercising
> a leak somewhere and it needs to be tracked down. You should collate
> all the information as we track this thread and prepare to submit a PR.

Do you suppose the leak is inside the xl driver code? As I said,
bringing the xl0 interface down stops the mbuf drain; bringing it back
up starts the drain again. I find this very odd.
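
To be concrete, the test goes something like this while the dump is
being replayed:

netstat -m | grep 'mbuf clusters'   # count climbing steadily
ifconfig xl0 down                   # ...the climb stops
netstat -m | grep 'mbuf clusters'
ifconfig xl0 up                     # ...and resumes
netstat -m | grep 'mbuf clusters'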

Peter


