Network memory allocation failures

Mahlon E. Smith mahlon at martini.nu
Tue Sep 7 21:34:54 UTC 2010


Hi, all.

I picked up a couple of Dell R810 monsters a few months ago: 96G of
RAM, 24 cores each.  With the aid of this list, I got 8.1-RELEASE onto
them, and they're trucking along merrily as VirtualBox hosts.

I'm seeing memory allocation errors when sending data over the network.
The failures look random, but they happen often enough that I can
reproduce them pretty reliably.

Here's an example of sending 100M to a remote machine.  Note that the
second scp attempt worked; most small files make it through unmolested.

    obb# dd if=/dev/random of=100M-test bs=1M count=100
    100+0 records in
    100+0 records out
    104857600 bytes transferred in 2.881689 secs (36387551 bytes/sec)
    obb# rsync -av 100M-test skin:/tmp/
    sending incremental file list
    100M-test
    Write failed: Cannot allocate memory
    rsync: writefd_unbuffered failed to write 4 bytes to socket [sender]: Broken pipe (32)
    rsync: connection unexpectedly closed (28 bytes received so far) [sender]
    rsync error: unexplained error (code 255) at io.c(601) [sender=3.0.7]
    obb# scp 100M-test skin:/tmp/
    100M-test        52%   52MB  52.1MB/s   00:00 ETAWrite failed: Cannot allocate memory
    lost connection
    obb# scp 100M-test skin:/tmp/
    100M-test       100%  100MB  50.0MB/s   00:02    
    obb# scp 100M-test skin:/tmp/
    100M-test         0%    0     0.0KB/s   --:-- ETAWrite failed: Cannot allocate memory
    lost connection
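
If it would be useful, I can also capture a syscall trace of a failing
run to confirm exactly which call returns ENOMEM.  A sketch of what I
have in mind (the output path is just an example):

    obb# truss -f -o /tmp/scp.truss scp 100M-test skin:/tmp/
    obb# grep -n 'Cannot allocate' /tmp/scp.truss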

Fetching a file in the other direction, however, works every time.

    obb# scp skin:/usr/local/tmp/100M-test .
    100M-test    100%  100MB  20.0MB/s   00:05    
    obb# scp skin:/usr/local/tmp/100M-test .
    100M-test    100%  100MB  20.0MB/s   00:05    
    obb# scp skin:/usr/local/tmp/100M-test .
    100M-test    100%  100MB  20.0MB/s   00:05    
    obb# scp skin:/usr/local/tmp/100M-test .
    100M-test    100%  100MB  20.0MB/s   00:05    
    ...


I've ruled out bad hardware, mainly because the behavior is
*identical* on the sister machine in a completely different data
center.  The NIC is a Broadcom (bce).

The mbuf statistics look fine to me:

    obb# netstat -m
    511/6659/7170 mbufs in use (current/cache/total)
    510/3678/4188/25600 mbuf clusters in use (current/cache/total/max)
    510/3202 mbuf+clusters out of packet secondary zone in use (current/cache)
    0/984/984/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
    0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
    0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
    1147K/12956K/14104K bytes allocated to network (current/cache/total)
    0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
    0/0/0 requests for jumbo clusters denied (4k/9k/16k)
    0/0/0 sfbufs in use (current/peak/max)
    0 requests for sfbufs denied
    0 requests for sfbufs delayed
    0 requests for I/O initiated by sendfile
    0 calls to protocol drain routines
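
Since netstat -m reports zero denied requests, I figure the next step
is to watch the UMA zones directly for allocation failures while
reproducing the problem.  A sketch of the check I have in mind
(assuming the failure counter in vmstat -z is the right thing to
watch):

    obb# vmstat -z | head -1
    obb# vmstat -z | grep -Ei 'mbuf|cluster|sock'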

Plenty of available memory (not surprising):

    obb# vmstat -hc 5 -w 5 
     procs      memory      page                    disks     faults         cpu
     r b w     avm    fre   flt  re  pi  po    fr  sr mf0 mf1   in   sy   cs us sy id
     0 0 0    722M    92G   115   0   1   0  1067   0   0   0  429 32637 6520  0  1 99
     0 0 0    722M    92G     1   0   0   0     0   0   0   0    9 31830 3279  0  0 100
     0 0 0    722M    92G     0   0   0   0     3   0   0   0    8 33171 3223  0  0 100
     0 0 0    761M    92G  2593   0   0   0  1712   0   5   4  121 35384 3907  0  0 99
     1 0 0    761M    92G     0   0   0   0     0   0   0   0   10 30237 3156  0  0 100


Last bit of info, and here's where it gets really weird.  Remember how
I said this was a VirtualBox host?  Guest machines running on it
(mostly CentOS) don't exhibit the problem, which is also why it took
me so long to notice it on the host.  They can merrily copy data
around at will, even though they go out through the same host
interface.

I'm not sure what to check or toggle at this point.  I've been mucking
around with all sorts of tunables to no avail, and have since reverted
them to defaults.  I was mostly concentrating on these (see the sysctl
dump after the list):

    hw.intr_storm_threshold
    net.inet.tcp.rfc1323
    kern.ipc.nmbclusters
    kern.ipc.nmbjumbop
    net.inet.tcp.sendspace
    net.inet.tcp.recvspace
    kern.ipc.somaxconn
    kern.ipc.maxsockbuf
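
For anyone who wants to compare values against their own boxes, they
can all be dumped in one shot:

    obb# sysctl hw.intr_storm_threshold net.inet.tcp.rfc1323 \
        kern.ipc.nmbclusters kern.ipc.nmbjumbop \
        net.inet.tcp.sendspace net.inet.tcp.recvspace \
        kern.ipc.somaxconn kern.ipc.maxsockbuf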

It was suggested to me that I try limiting the RAM in loader.conf to
under 32G and see what happens (the loader.conf line is sketched
below).  With the cap in place, the machine does appear to be okay.
I'm not sure whether that's coincidence or directly related.  Is the
large amount of RAM confusing a data structure somewhere?  Or is it
potentially a problem with the bce driver specifically?
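
For reference, the stock knob for capping usable RAM is the hw.physmem
loader tunable (size suffixes are accepted):

    # /boot/loader.conf
    hw.physmem="32G"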

I've kind of reached the limit of what I know to dig for or try next.
What else can I do to help pin down the root problem?  Has anyone had
to deal with, or seen, something like this recently?  (Or hell, not
recently?)

Ideas appreciated!

--
Mahlon E. Smith  
http://www.martini.nu/contact.html

