scp: Write Failed: Cannot allocate memory

Jeremy Chadwick freebsd at jdc.parodius.com
Wed Jul 6 02:33:01 UTC 2011


On Wed, Jul 06, 2011 at 12:23:39PM +1000, Peter Ross wrote:
> Quoting "Jeremy Chadwick" <freebsd at jdc.parodius.com>:
> 
> >On Tue, Jul 05, 2011 at 01:03:20PM -0400, Scott Sipe wrote:
> >>I'm running virtualbox 3.2.12_1 if that has anything to do with it.
> >>
> >>sysctl vfs.zfs.arc_max: 6200000000
> >>
> >>While I'm trying to scp, kstat.zfs.misc.arcstats.size is
> >>hovering right around that value, sometimes above, sometimes
> >>below (that's as it should be, right?). I don't think that it
> >>dies when crossing over arc_max. I can run the same scp 10 times
> >>and it might fail 1-3 times, with no correlation to the
> >>arcstats.size being above/below arc_max that I can see.
> >>
> >>Scott
> >>
> >>On Jul 5, 2011, at 3:00 AM, Peter Ross wrote:
> >>
> >>>Hi all,
> >>>
> >>>just as an addition: an upgrade to last Friday's
> >>>FreeBSD-Stable and to VirtualBox 4.0.8 does not fix the
> >>>problem.
> >>>
> >>>I will experiment a bit more tomorrow after hours and grab some statistics.
> >>>
> >>>Regards
> >>>Peter
> >>>
> >>>Quoting "Peter Ross" <Peter.Ross at bogen.in-berlin.de>:
> >>>
> >>>>Hi all,
> >>>>
> >>>>I noticed a similar problem last week. It is also very
> >>>>similar to one reported last year:
> >>>>
> >>>>http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/058708.html
> >>>>
> >>>>My server is a Dell T410 server with the same bce card (the
> >>>>same pciconf -lvc output as described by Mahlon):
> >>>>
> >>>>http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/058711.html
> >>>>
> >>>>Yours, Scott, is an em(4).
> >>>>
> >>>>Another similarity: In all cases we are using VirtualBox. I
> >>>>just want to mention it, in case it matters. I am still
> >>>>running VirtualBox 3.2.
> >>>>
> >>>>Most of the time kstat.zfs.misc.arcstats.size was reaching
> >>>>vfs.zfs.arc_max then, but I could catch one or two cases
> >>>>where the value was still below it.
> >>>>
> >>>>I added vfs.zfs.prefetch_disable=1 to sysctl.conf but it does not help.
> >>>>
> >>>>BTW: It looks as if ARC only gives back the memory when I
> >>>>destroy the ZFS (a cloned snapshot containing virtual
> >>>>machines). Even if nothing happens for hours, the buffer
> >>>>isn't released.
> >>>>
> >>>>My machine was still running 8.2-PRERELEASE so I am upgrading.
> >>>>
> >>>>I am happy to give information gathered on old/new kernel if it helps.
> >>>>
> >>>>Regards
> >>>>Peter
> >>>>
> >>>>Quoting "Scott Sipe" <cscotts at gmail.com>:
> >>>>
> >>>>>
> >>>>>On Jul 2, 2011, at 12:54 AM, jhell wrote:
> >>>>>
> >>>>>>On Fri, Jul 01, 2011 at 03:22:32PM -0700, Jeremy Chadwick wrote:
> >>>>>>>On Fri, Jul 01, 2011 at 03:13:17PM -0400, Scott Sipe wrote:
> >>>>>>>>I'm running 8.2-RELEASE and am having new problems with scp. When
> >>>>>>>>scping files to a ZFS directory on the FreeBSD server -- most
> >>>>>>>>notably large files -- the transfer frequently dies after just a
> >>>>>>>>few seconds. In my last test, I tried to scp an 800mb file to the
> >>>>>>>>FreeBSD system and the transfer died after 200mb. It completely
> >>>>>>>>copied the next 4 times I tried, and then died again on the next
> >>>>>>>>attempt.
> >>>>>>>>
> >>>>>>>>On the client side:
> >>>>>>>>
> >>>>>>>>"Connection to home closed by remote host.
> >>>>>>>>lost connection"
> >>>>>>>>
> >>>>>>>>In /var/log/auth.log:
> >>>>>>>>
> >>>>>>>>Jul  1 14:54:42 freebsd sshd[18955]: fatal: Write failed: Cannot allocate memory
> >>>>>>>>
> >>>>>>>>I've never seen this before and have used scp before to transfer
> >>>>>>>>large files without problems. This computer has been used in
> >>>>>>>>production for months and has a current uptime of 36 days. I have
> >>>>>>>>not been able to notice any problems copying files to the server
> >>>>>>>>via samba or netatalk, or any problems in apache.
> >>>>>>>>
> >>>>>>>>Uname:
> >>>>>>>>
> >>>>>>>>FreeBSD xeon 8.2-RELEASE FreeBSD 8.2-RELEASE #0: Sat Feb 19 01:02:54 EST 2011     root at xeon:/usr/obj/usr/src/sys/GENERIC  amd64
> >>>>>>>>
> >>>>>>>>I've attached my dmesg and output of vmstat -z.
> >>>>>>>>
> >>>>>>>>I have not restarted the sshd daemon or rebooted the computer.
> >>>>>>>>
> >>>>>>>>Am glad to provide any other information or test anything else.
> >>>>>>>>
> >>>>>>>>{snip vmstat -z and dmesg}
> >>>>>>>
> >>>>>>>You didn't provide details about your networking setup (rc.conf,
> >>>>>>>ifconfig -a, etc.).  netstat -m would be useful too.
> >>>>>>>
> >>>>>>>Next, please see this thread circa September 2010, titled "Network
> >>>>>>>memory allocation failures":
> >>>>>>>
> >>>>>>>http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/thread.html#58708
> >>>>>>>
> >>>>>>>The user in that thread is using rsync, which relies on scp by default.
> >>>>>>>I believe this problem is similar, if not identical, to yours.
> >>>>>>>
> >>>>>>
> >>>>>>Please also provide your output of ( /usr/bin/limits -a ) for the server
> >>>>>>end and the client.
> >>>>>>
> >>>>>>I am not quite sure I agree with the need for ifconfig -a but some
> >>>>>>information about the networking driver you're using for the interface
> >>>>>>would be helpful, as would the uptime of the boxes and the
> >>>>>>configuration of the pool,
> >>>>>>e.g. ( zpool status -a ;zfs get all <poolname> ). You should probably
> >>>>>>put this information up somewhere so you can reference it by URL
> >>>>>>whenever needed.
> >>>>>>
> >>>>>>rsync(1) does not rely on scp(1) whatsoever but rsync(1) can be made to
> >>>>>>use ssh(1) instead of rsh(1) and I believe that is what Jeremy is
> >>>>>>stating here but correct me if I am wrong. It does use ssh(1) by
> >>>>>>default.
> >>>>>>
> >>>>>>It's a possibility as well that, if you are using tmpfs(5) or mdmfs(8)
> >>>>>>for /tmp type filesystems, rsync(1) may be just filling up your temp
> >>>>>>RAM area and causing the connection abort, which would be expected.
> >>>>>>( df -h ) would help here.
> >>>>>
> >>>>>Hello,
> >>>>>
> >>>>>I'm not using tmpfs/mdmfs at all. The clients yesterday
> >>>>>were 3 different OSX computers (over gigabit). The FreeBSD
> >>>>>server has 12gb of ram and no bce adapter. For what it's
> >>>>>worth, the server is backed up remotely every night with
> >>>>>rsync (remote FreeBSD uses rsync to pull) to an offsite
> >>>>>(slow cable connection) FreeBSD computer, and I have not
> >>>>>seen any errors in the nightly rsync.
> >>>>>
> >>>>>Sorry for the omission of networking info, here's the
> >>>>>output of the requested commands and some that popped up
> >>>>>in the other thread:
> >>>>>
> >>>>>http://www.cap-press.com/misc/
> >>>>>
> >>>>>In rc.conf:  ifconfig_em1="inet 10.1.1.1 netmask 255.255.0.0"
> >>>>>
> >>>>>Scott
> >
> >Just to make it crystal clear to everyone:
> >
> >There is no correlation between this problem and use of ZFS.  People are
> >attempting to correlate "cannot allocate memory" messages with "anything
> >on the system that uses memory".  The VM is much more complex than that.
> >
> >Given the nature of this problem, it's much more likely the issue is
> >"somewhere" within a networking layer within FreeBSD, whether it be
> >driver-level or some sort of intermediary layer.
> >
> >Two people who have this issue in this thread are both using VirtualBox.
> >Can one, or both, of you remove VirtualBox from the configuration
> >entirely (kernel, etc. -- not sure what is required) and then see if the
> >issue goes away?
> 
> On the machine in question I can only do it after hours, so I will do
> it tonight.
> 
> I was _successfully_ sending the file over the loopback interface using
> 
> cat /zpool/temp/zimbra_oldroot.vdi | ssh localhost "cat > /dev/null"
> 
> I did it, btw, with the IPv6 localhost address first (accidentally),
> and then using IPv4. Both worked.
> 
> It always fails if I am sending it through the bce(4) interface,
> even if my target is the VirtualBox bridged to the bce card (so it
> does not "leave" the computer physically).
> 
> Below is the uname -a, ifconfig -a, netstat -rn, pciconf -lv and kldstat output.
> 
> I have another box where I do not see that problem. It copies files
> happily over the net using ssh.
> 
> It is an older HP ML 150 with only 3GB RAM but with a bge(4)
> driver instead. It runs the same last week's RELENG_8. I installed
> VirtualBox and enabled vboxnet (so it loads the kernel modules). But
> I do not run VirtualBox on it (because it hasn't enough RAM).
> 
> Regards
> Peter
> 
> DellT410one# uname -a
> FreeBSD DellT410one.vv.fda 8.2-STABLE FreeBSD 8.2-STABLE #1: Thu Jun 30 17:07:18 EST 2011 root at DellT410one.vv.fda:/usr/obj/usr/src/sys/GENERIC  amd64
> DellT410one# ifconfig -a
> bce0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
> 	options=c01bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO,LINKSTATE>
> 	ether 84:2b:2b:68:64:e4
> 	inet 192.168.50.220 netmask 0xffffff00 broadcast 192.168.50.255
> 	inet 192.168.50.221 netmask 0xffffff00 broadcast 192.168.50.255
> 	inet 192.168.50.223 netmask 0xffffff00 broadcast 192.168.50.255
> 	inet 192.168.50.224 netmask 0xffffff00 broadcast 192.168.50.255
> 	inet 192.168.50.225 netmask 0xffffff00 broadcast 192.168.50.255
> 	inet 192.168.50.226 netmask 0xffffff00 broadcast 192.168.50.255
> 	inet 192.168.50.227 netmask 0xffffff00 broadcast 192.168.50.255
> 	inet 192.168.50.219 netmask 0xffffff00 broadcast 192.168.50.255
> 	media: Ethernet autoselect (1000baseT <full-duplex>)
> 	status: active
> bce1: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
> 	options=c01bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO,LINKSTATE>
> 	ether 84:2b:2b:68:64:e5
> 	media: Ethernet autoselect
> lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
> 	options=3<RXCSUM,TXCSUM>
> 	inet6 fe80::1%lo0 prefixlen 64 scopeid 0xb
> 	inet6 ::1 prefixlen 128
> 	inet 127.0.0.1 netmask 0xff000000
> 	nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
> vboxnet0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
> 	ether 0a:00:27:00:00:00
> DellT410one# netstat -rn
> Routing tables
> 
> Internet:
> Destination        Gateway            Flags    Refs      Use  Netif Expire
> default            192.168.50.201     UGS         0    52195   bce0
> 127.0.0.1          link#11            UH          0        6    lo0
> 192.168.50.0/24    link#1             U           0  1118212   bce0
> 192.168.50.219     link#1             UHS         0     9670    lo0
> 192.168.50.220     link#1             UHS         0     8347    lo0
> 192.168.50.221     link#1             UHS         0   103024    lo0
> 192.168.50.223     link#1             UHS         0    43614    lo0
> 192.168.50.224     link#1             UHS         0     8358    lo0
> 192.168.50.225     link#1             UHS         0     8438    lo0
> 192.168.50.226     link#1             UHS         0     8338    lo0
> 192.168.50.227     link#1             UHS         0     8333    lo0
> 192.168.165.0/24   192.168.50.200     UGS         0     3311   bce0
> 192.168.166.0/24   192.168.50.200     UGS         0      699   bce0
> 192.168.167.0/24   192.168.50.200     UGS         0     3012   bce0
> 192.168.168.0/24   192.168.50.200     UGS         0      552   bce0
> 
> Internet6:
> Destination                       Gateway                       Flags      Netif Expire
> ::1                               ::1                           UH         lo0
> fe80::%lo0/64                     link#11                       U          lo0
> fe80::1%lo0                       link#11                       UHS        lo0
> ff01::%lo0/32                     fe80::1%lo0                   U          lo0
> ff02::%lo0/32                     fe80::1%lo0                   U          lo0
> DellT410one# kldstat
> Id Refs Address            Size     Name
>  1   19 0xffffffff80100000 dbf5d0   kernel
>  2    3 0xffffffff80ec0000 4c358    vboxdrv.ko
>  3    1 0xffffffff81012000 131998   zfs.ko
>  4    1 0xffffffff81144000 1ff1     opensolaris.ko
>  5    2 0xffffffff81146000 2940     vboxnetflt.ko
>  6    2 0xffffffff81149000 8e38     netgraph.ko
>  7    1 0xffffffff81152000 153c     ng_ether.ko
>  8    1 0xffffffff81154000 e70      vboxnetadp.ko
> DellT410one# pciconf -lv
> ..
> bce0 at pci0:1:0:0:        class=0x020000 card=0x028d1028 chip=0x163b14e4 rev=0x20 hdr=0x00
>     vendor     = 'Broadcom Corporation'
>     class      = network
>     subclass   = ethernet
> bce1 at pci0:1:0:1:        class=0x020000 card=0x028d1028 chip=0x163b14e4 rev=0x20 hdr=0x00
>     vendor     = 'Broadcom Corporation'
>     class      = network
>     subclass   = ethernet

Could you please provide "pciconf -lvcb" output instead, specific to the
bce chips?  Thanks.
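
If the full listing is long, something like the following might help trim
it to just the bce entries.  This is only a sketch: only_bce is a made-up
helper name, not part of pciconf, and it assumes each device record starts
at column 0 with a line of the form bce0@pci0:1:0:0: (the archive shows the
@ as " at ") followed by indented detail lines.

```sh
#!/bin/sh
# Hypothetical helper (not part of pciconf): keep only the records whose
# header line names a bce device, plus the indented lines under them.
only_bce() {
    awk '
        # A header line starts at column 0 with the device name.
        /^[a-zA-Z]/ { keep = ($0 ~ /^bce[0-9]+@/) }
        keep        # print the line while inside a bce record
    '
}

# Only run the real command where pciconf exists (i.e. on FreeBSD).
if command -v pciconf >/dev/null 2>&1; then
    pciconf -lvcb | only_bce
fi
```

The filter works on any saved output as well, e.g. "only_bce < pciconf.txt",
so the listing can be captured once and trimmed later.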

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, US |
| Making life hard for others since 1977.               PGP 4BD6C0CB |


