still mbuf leak in 9.0 / 9.1?

Jack Vogel jfvogel at gmail.com
Wed May 15 17:08:28 UTC 2013


So you stop getting 10G transmission, and so you are looking at mbuf leaks?
I don't see anything in your data that makes it look like you've run out of
available mbufs. You said you're running jumbos; what size? You do realize
that if you do this, the clusters come from different pools, and you are
not displaying those? What are all your nmb limits set to?
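
For reference, a minimal sketch of how to show every mbuf-related zone,
jumbo ones included, from `vmstat -z`. The helper name `mbuf_zones` is made
up for illustration. The zone names (mbuf_jumbo_page, mbuf_jumbo_9k,
mbuf_jumbo_16k) and the sysctl names in the usage comment are what one would
expect on a 9.x system; treat them as assumptions and check them against
your own output.

```shell
#!/bin/sh
# mbuf_zones (hypothetical helper): read `vmstat -z` output on stdin and
# print the header line plus every zone whose name starts with "mbuf",
# so the jumbo cluster zones appear next to mbuf_packet.
# Assumes the header line begins with "ITEM" and zone names start in
# column one, as in the mbuf_packet lines quoted below.
mbuf_zones() {
    awk '/^ITEM/ || /^mbuf/ { print }'
}

# Usage on the FreeBSD host (not executed here):
#   vmstat -z | mbuf_zones
#   sysctl kern.ipc.nmbclusters kern.ipc.nmbjumbop \
#          kern.ipc.nmbjumbo9 kern.ipc.nmbjumbo16
```

With a 9000-byte MTU the receive clusters would most likely be accounted
under one of the jumbo zones rather than under mbuf_packet, which is why
grepping for mbuf_packet alone can miss an exhausted pool.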

So, is this 9.1-RELEASE or stable? If you are using the driver from release,
I would first suggest you test the code from HEAD.

What is the 10G device? I see it's using Twinax, and I have been told there
is an occasional problem with those that is corrected in recent shared code;
this is why you should try the latest code.

Cheers,

Jack



On Wed, May 15, 2013 at 2:00 AM, dennis berger <db at nipsi.de> wrote:

> Hi list,
> since we activated 10GbE on ixgbe cards + jumbo frames (9k) on 9.0 and now
> on 9.1, we have noticed that after a random period of time, sometimes a
> week, sometimes only a day, the system stops sending any packets out. The
> symptom is that you can't log in via ssh, and nfs and istgt are not
> operational. Yet you can log in on the console and execute commands.
> A clean shutdown isn't possible, though. It hangs after vnode cleaning;
> normally you would see detaching of usb devices here, or other devices
> maybe?
> I've read the other post on this ML about an mbuf leak in the arp handling
> code in if_ether.c line 558. We don't see any of those notices in dmesg, so
> I don't think that glebius's fix would apply to us.
> I'm collecting system and memory information every hour.
>
>
> Script looks like this.
> less /etc/periodic/hourly/100.report-memory.sh
> #!/bin/sh
> # Hourly snapshot of process, memory, mbuf, and network statistics.
>
> reporttimestamp=$(date +%d-%m-%Y-%H-%M)
> reportname="${reporttimestamp}.txt"
>
> # Stop if the report directory is missing instead of writing elsewhere.
> cd /root/memory-mon || exit 1
>
> top -b > "$reportname"
> echo "" >> "$reportname"
> vmstat -m >> "$reportname"
> echo "" >> "$reportname"
> vmstat -z >> "$reportname"
> echo "" >> "$reportname"
> netstat -Q >> "$reportname"
> echo "" >> "$reportname"
> netstat -n -x >> "$reportname"
> echo "" >> "$reportname"
> netstat -m >> "$reportname"
> /usr/bin/perl /usr/local/bin/zfs-stats -a >> "$reportname"
>
> When you grep for mbuf or mbuf usage you will see this for example:
>
> root@freenas:/root/memory-mon # grep mbuf_packet: *
> 14-05-2013-14-09.txt:mbuf_packet:            256,      0,    9246,    2786,201700429,   0,   0
> 14-05-2013-15-09.txt:mbuf_packet:            256,      0,    9256,    2776,201773122,   0,   0
> 14-05-2013-16-09.txt:mbuf_packet:            256,      0,    9266,    2766,201871553,   0,   0
> 14-05-2013-17-09.txt:mbuf_packet:            256,      0,    9276,    2756,201915405,   0,   0
> 14-05-2013-18-09.txt:mbuf_packet:            256,      0,    9286,    2746,201927956,   0,   0
> 14-05-2013-19-09.txt:mbuf_packet:            256,      0,    9296,    2352,201935681,   0,   0
> 14-05-2013-20-09.txt:mbuf_packet:            256,      0,    9306,    2342,201943754,   0,   0
> 14-05-2013-21-09.txt:mbuf_packet:            256,      0,    9316,    2332,201950961,   0,   0
> 14-05-2013-22-09.txt:mbuf_packet:            256,      0,    9326,    2450,201958150,   0,   0
> 14-05-2013-23-09.txt:mbuf_packet:            256,      0,    9336,    2440,201967178,   0,   0
> 15-05-2013-00-09.txt:mbuf_packet:            256,      0,    9346,    2430,201974561,   0,   0
> 15-05-2013-01-09.txt:mbuf_packet:            256,      0,    9356,    2420,201982105,   0,   0
> 15-05-2013-02-09.txt:mbuf_packet:            256,      0,    9366,    2410,201989463,   0,   0
> 15-05-2013-03-09.txt:mbuf_packet:            256,      0,    9378,    1502,203019168,   0,   0
> 15-05-2013-04-09.txt:mbuf_packet:            256,      0,    9384,    1624,205953601,   0,   0
> 15-05-2013-05-09.txt:mbuf_packet:            256,      0,    9394,    1870,205959258,   0,   0
> 15-05-2013-06-09.txt:mbuf_packet:            256,      0,    9404,    2500,205969396,   0,   0
> 15-05-2013-07-09.txt:mbuf_packet:            256,      0,    9414,    3386,207945161,   0,   0
> 15-05-2013-08-09.txt:mbuf_packet:            256,      0,    9424,    3376,208094689,   0,   0
> 15-05-2013-09-09.txt:mbuf_packet:            256,      0,    9434,    2982,208172465,   0,   0
> 15-05-2013-10-09.txt:mbuf_packet:            256,      0,    9444,    3100,208270369,   0,   0
>
> and
>
> root@freenas:/root/memory-mon # grep "mbufs in use" *
> 14-05-2013-14-09.txt:58444/5816/64260 mbufs in use (current/cache/total)
> 14-05-2013-15-09.txt:58455/5805/64260 mbufs in use (current/cache/total)
> 14-05-2013-16-09.txt:58464/5796/64260 mbufs in use (current/cache/total)
> 14-05-2013-17-09.txt:58475/5785/64260 mbufs in use (current/cache/total)
> 14-05-2013-18-09.txt:58484/5776/64260 mbufs in use (current/cache/total)
> 14-05-2013-19-09.txt:58493/5767/64260 mbufs in use (current/cache/total)
> 14-05-2013-20-09.txt:58503/5757/64260 mbufs in use (current/cache/total)
> 14-05-2013-21-09.txt:58513/5747/64260 mbufs in use (current/cache/total)
> 14-05-2013-22-09.txt:58523/5737/64260 mbufs in use (current/cache/total)
> 14-05-2013-23-09.txt:58533/5727/64260 mbufs in use (current/cache/total)
> 15-05-2013-00-09.txt:58543/5717/64260 mbufs in use (current/cache/total)
> 15-05-2013-01-09.txt:58554/5706/64260 mbufs in use (current/cache/total)
> 15-05-2013-02-09.txt:58563/5697/64260 mbufs in use (current/cache/total)
> 15-05-2013-03-09.txt:58639/5621/64260 mbufs in use (current/cache/total)
> 15-05-2013-04-09.txt:58581/5679/64260 mbufs in use (current/cache/total)
> 15-05-2013-05-09.txt:58591/5669/64260 mbufs in use (current/cache/total)
> 15-05-2013-06-09.txt:58602/5658/64260 mbufs in use (current/cache/total)
> 15-05-2013-07-09.txt:58613/5647/64260 mbufs in use (current/cache/total)
> 15-05-2013-08-09.txt:58623/6027/64650 mbufs in use (current/cache/total)
> 15-05-2013-09-09.txt:58634/6016/64650 mbufs in use (current/cache/total)
> 15-05-2013-10-09.txt:58645/6005/64650 mbufs in use (current/cache/total)
>
>
> This increasing number of used mbuf_packets and mbufs in use makes me
> nervous.
> See the complete reports http://knownhosts.org:/reports-14-15.tgz
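
To put a number on that growth, here is a rough sketch that reads the
`grep mbuf_packet:` output and prints the change in the USED column between
consecutive reports. It assumes each sample sits on a single line as
`vmstat -z` prints it (file:zone: SIZE, LIMIT, USED, FREE, REQ, FAIL, SLEEP)
and that the lines arrive in chronological order; with dd-mm-yyyy file names
the shell glob only sorts correctly within a single month. The helper name
`used_delta` is made up for illustration.

```shell
#!/bin/sh
# used_delta (hypothetical helper): read lines like
#   14-05-2013-14-09.txt:mbuf_packet:  256, 0, 9246, 2786, 201700429, 0, 0
# on stdin and print "<file> <USED> <change since previous sample>".
used_delta() {
    awk -F: '
    {
        # $1 = report file, $2 = zone name, $3 = comma-separated counters
        n = split($3, col, ",")
        if (n < 3) { next }        # skip wrapped or malformed lines
        used = col[3] + 0          # third counter is USED; +0 forces a number
        if (seen) {
            printf "%s %d %+d\n", $1, used, used - prev
        } else {
            printf "%s %d -\n", $1, used
        }
        prev = used
        seen = 1
    }'
}

# Usage in the report directory (not executed here):
#   grep mbuf_packet: *.txt | used_delta
```

In the figures quoted above, USED goes from 9246 to 9444 over 20 hours,
which is roughly 10 per hour.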
>
> Thanks for help,
>
> -dennis
>
>
>
> --------------BEGIN System information---------------
> It's a stock FreeBSD 9.1, yet the hostname is called freenas. Don't be
> confused.
>
>
> igb0: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         options=401bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO>
>         ether 00:25:90:34:c1:12
>         nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
>         media: Ethernet autoselect (1000baseT <full-duplex>)
>         status: active
> igb1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         options=401bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO>
>         ether 00:25:90:34:c1:13
>         inet 172.16.1.6 netmask 0xfffff000 broadcast 172.16.15.255
>         inet6 fe80::225:90ff:fe34:c113%igb1 prefixlen 64 scopeid 0x2
>         nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
>         media: Ethernet autoselect (1000baseT <full-duplex>)
>         status: active
> ix0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
>         options=401bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO>
>         ether 00:1b:21:cc:12:8b
>         inet 10.254.254.242 netmask 0xfffffffc broadcast 10.254.254.243
>         inet6 fe80::21b:21ff:fecc:128b%ix0 prefixlen 64 scopeid 0xb
>         nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
>         media: Ethernet autoselect (10Gbase-Twinax <full-duplex>)
>         status: active
> ix1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
>         options=401bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO>
>         ether 00:1b:21:cc:12:8a
>         inet 10.254.254.254 netmask 0xfffffffc broadcast 10.254.254.255
>         inet6 fe80::21b:21ff:fecc:128a%ix1 prefixlen 64 scopeid 0xc
>         nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
>         media: Ethernet autoselect (10Gbase-Twinax <full-duplex>)
>         status: active
> ix2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
>         options=401bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO>
>         ether 00:1b:21:cc:12:b3
>         inet 10.254.254.246 netmask 0xfffffffc broadcast 10.254.254.247
>         inet6 fe80::21b:21ff:fecc:12b3%ix2 prefixlen 64 scopeid 0xd
>         nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
>         media: Ethernet autoselect
>         status: no carrier
> ix3: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         options=401bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO>
>         ether 00:1b:21:cc:12:b2
>         nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
>         media: Ethernet autoselect
>         status: no carrier
> lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
>         options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
>         inet6 ::1 prefixlen 128
>         inet6 fe80::1%lo0 prefixlen 64 scopeid 0xf
>         inet 127.0.0.1 netmask 0xff000000
>         nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
>
> #dmesg
> …..
> mfi0: 21294 (421879975s/0x0008/info) - Battery started charging
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
>
>
> I should add that the servers that are directly connected to this freebsd
> server reboot every night. This is why you see ix0 UP/DOWN
> messages in dmesg.
>
>
>
>
>
>
> ------------- END System information------------
>
>
>
> _______________________________________________
> freebsd-stable at freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe at freebsd.org"
>

