UPDATE 5.3-STABLE was Re: Possible problems with Broadcom BCM5704C 10/100/1000 on Tyan Thunder K8S Pro S2882 twin Opteron

Doug White dwhite at gumbysoft.com
Wed Mar 9 19:22:31 PST 2005

On Mon, 7 Mar 2005, Alan Jay wrote:

> Well after upgrading to the latest -STABLE via cvsup and makeworld makekernel
> etc we have been doing some more tests over the weekend.

When did you run this cvsup?

> One of our databases ran fine all weekend so we took the plunge on Sunday to
> try our big heavily accessed database.
> It ran fine until 7.45 Monday morning - when I checked at 7.30am it was using
> around 6 of the 8Gb of RAM the server then logged:
> Mar  7 07:42:47 flappy kernel: bge1: discard frame w/o leading ethernet header
> (len 4294967292 pkt len 4294967292)

Hm, that's a small negative number (-4) printed as unsigned.  That message
is printed by ether_input() if it gets handed a bum mbuf.
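
For reference, here is a quick sketch of the arithmetic behind that number.
To be exact, 4294967292 is 2^32 - 4, so read as a signed 32-bit value it is
-4.  The helper names below are made up for illustration and this is not the
bge(4) or ether_input() code.  The "strip a 4-byte trailer from a zero-length
frame" case is only an assumption about how such a length could arise, not a
diagnosis.

```c
#include <assert.h>
#include <stdint.h>

/* Reinterpret a 32-bit unsigned value as a two's-complement signed one. */
static int32_t as_signed32(uint32_t v)
{
    if (v <= (uint32_t)INT32_MAX)
        return (int32_t)v;
    /* High half: subtract 2^32 in a wider type to avoid overflow. */
    return (int32_t)((int64_t)v - ((int64_t)UINT32_MAX + 1));
}

/*
 * Hypothetical: unsigned length left after removing trailer bytes with no
 * bounds check.  Wraps modulo 2^32 when trailer > frame_len.
 */
static uint32_t len_after_strip(uint32_t frame_len, uint32_t trailer)
{
    return (uint32_t)(frame_len - trailer);
}

/* Check the value from the log line against the helpers above. */
static void check_logged_value(void)
{
    assert(as_signed32(4294967292u) == -4);         /* the logged length */
    assert(as_signed32(4294967295u) == -1);         /* what -1 would look like */
    assert(len_after_strip(0u, 4u) == 4294967292u); /* assumed underflow case */
}
```

Calling check_logged_value() from a small main() should pass silently.  The
point is only that the logged length is a wrapped negative value, so whatever
produced the mbuf handed over a bogus length.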

> Followed by:
> Mar  7 07:42:47 flappy kernel: Fatal trap 12: pag

Unfortunately this is not useful. We need the entire panic message and
ideally a backtrace and crashdump.  Can you connect a serial console to
this system and log the output?

> Subsequently to that it has crashed a number of times and on a couple of
> occasions has reported:
> kernel: fxp0: can't map mbuf (error 12)

Error 12 is ENOMEM, and that's coming from bus_dmamap_load_mbuf().  That can
be returned if you're running out of space for bounce buffers, or kmem in
general.  scottl has been working on busdma issues in HEAD and recently
committed a fix for i386 for bounce page allocation issues.
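
If you want to double-check the errno mapping on a given box, a minimal
userland sketch follows.  It assumes the platform's <errno.h> defines ENOMEM
as 12, which is true on FreeBSD and most Unix-like systems, though POSIX does
not fix the number.  The helper names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* True if the numeric error code is this platform's ENOMEM. */
static int is_enomem(int err)
{
    return err == ENOMEM;
}

/* The C library's human-readable text for an errno value. */
static const char *errno_text(int err)
{
    return strerror(err);
}

/* Check that the "error 12" from the log maps to ENOMEM here. */
static void check_error_12(void)
{
    assert(is_enomem(12));
    assert(errno_text(12) != NULL);
}
```

This only confirms what the number means.  It says nothing about whether the
shortage is in bounce pages or in kmem generally.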

kmem depletion would be more insidious.  Have you been getting other
messages that indicate failure to allocate memory or error 12?

> By the way over the weekend the latest -STABLE which is marked 5.4-PRERELEASE
> 2 seemed much better than 5.3 had and the initial problems took much longer to
> appear.  Though once the problems started to appear, they repeated themselves
> rebooting every 1-2hrs until we removed the tests data.

That behavior sounds a lot like thermal issues.  It takes a while to warm
up to the critical point and once it hits that point it really starts to
malfunction.  Unless the test run starts out slow or something.

Doug White                    |  FreeBSD: The Power to Serve
dwhite at gumbysoft.com          |  www.FreeBSD.org

