Network memory allocation failures
Pyun YongHyeon
pyunyh at gmail.com
Wed Sep 8 16:52:25 UTC 2010
On Wed, Sep 08, 2010 at 07:34:44AM -0700, Mahlon E. Smith wrote:
> On Tue, Sep 07, 2010, Jeremy Chadwick wrote:
> >
> > I figured there might be memory exhaustion of sorts, possibly in the bce(4)
> > driver itself, that could cause the OP's problem. bce(4) might not be
> > the problem at all. But the OP's issue seems to only occur when
> > transmitting data, not receiving:
> >
> > http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/058708.html
>
> More information:
>
> Looks like 100M wasn't enough of a test burst to tickle the problem in
> my original message... 10G is, though. It's definitely happening in
> both directions.
>
> Upgraded to -STABLE on one of the two machines last night, running
> GENERIC.
>
> FreeBSD obb 8.1-STABLE FreeBSD 8.1-STABLE #0: Tue Sep 7 19:48:55 PDT 2010 root@obb:/usr/obj/usr/src/sys/GENERIC amd64
>
>
> Outgoing:
>
> obb# scp testfile root@holp:/usr/local/tmp/
> testfile 8% 856MB 37.6MB/s 04:09 ETA
> Write failed: Cannot allocate memory
> lost connection
> obb# scp testfile root@holp:/usr/local/tmp/
> testfile 0% 72MB 34.3MB/s 04:56 ETA
> Write failed: Cannot allocate memory
> lost connection
>
> Incoming:
>
> obb# scp root@holp:/usr/local/tmp/testfile .
> testfile 6% 670MB 31.9MB/s 04:59 ETA
> Write failed: Cannot allocate memory
> lost connection
> obb# scp root@holp:/usr/local/tmp/testfile .
> testfile 1% 118MB 39.3MB/s 04:17 ETA
> Write failed: Cannot allocate memory
> lost connection
> obb# scp root@holp:/usr/local/tmp/testfile .
> testfile 15% 1613MB 29.0MB/s 04:57 ETA
> Write failed: Cannot allocate memory
> lost connection
>
As far as I know, bce(4) cannot return ENOMEM to a userland
process, so I guess this is not a bce(4) issue. To rule out a
possible driver problem, could you try a different controller
instead of bce(4)?
>
>
> > The 2nd-to-last paragraph there is worth noting, specifically how
> > limiting maximum addressable memory to 32GB via loader.conf seems to
> > work around the issue.
>
> I'd no longer consider this a coincidence, limiting the memory to 16G
> eliminates the issue completely. I'll retest with 32G today.
>
Again, this kind of change has nothing to do with driver operation.
bce(4) may have some issues with PAE, but I don't think that would
trigger problems on amd64 systems.
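
For readers following along, a memory cap like the one described
above is commonly set with the hw.physmem loader tunable. The thread
does not say which tunable was actually used, so treat the fragment
below as an assumption, not the original poster's exact configuration:

```
# /boot/loader.conf
# Assumed tunable: cap usable physical memory at 16 GB.
# The change takes effect after a reboot.
hw.physmem="16G"
```

Removing the line and rebooting restores the full memory size, which
makes it easy to check whether the failures come back.
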
> Incoming:
>
> obb# scp root@holp:/usr/local/tmp/testfile testfile2
> testfile 100% 10GB 17.8MB/s 09:35
> obb# scp root@holp:/usr/local/tmp/testfile testfile2
> testfile 100% 10GB 17.0MB/s 10:02
>
> Outgoing:
>
> obb# scp testfile root@holp:/usr/local/tmp/testfile2
> testfile 100% 10GB 35.7MB/s 04:47
> obb# scp testfile root@holp:/usr/local/tmp/testfile2
> testfile 100% 10GB 35.4MB/s 04:49
>
>
> > There were other problems with the systems in question back in July, it
> > seems. I assume these got hammered out somehow:
> >
> > http://www.mail-archive.com/freebsd-stable@freebsd.org/msg111408.html
>
> To a degree -- the initial install and cpu count problems are all fixed
> up, thanks to help from the list. The Intel 10G panics were stifled
> with a newer driver from Intel's site, but I ran out of time to do
> any serious testing with it, and just ended up using the broadcoms to
> satisfy my time constraint.
>
> --
> Mahlon E. Smith
> http://www.martini.nu/contact.html