debugging frequent kernel panics on 8.2-RELEASE
freebsd at jdc.parodius.com
Wed Aug 10 16:00:21 UTC 2011
On Wed, Aug 10, 2011 at 04:46:17PM +0100, Steven Hartland wrote:
> >On Wed, Aug 10, 2011 at 03:22:52PM +0100, Steven Hartland wrote:
> >>The base stack reported is a double fault with no additional
> >>details, and CTRL+ALT+ESC fails to break to the debugger, as
> >>does an NMI, even though it at least tries, printing the
> >>following many times, some quite jumbled:-
> >>NMI ... going to debugger
> >If you're generating the NMI yourself (possibly via the KVM, etc.) then
> >okay, that's different. I'm trying to discern whether or not *you're*
> >generating the NMI, or if the NMI just happens and causes a panic for
> >you and that's what you're worried about.
> Yes, I'm generating it after the panic in order to try to get to the debugger :)
Understood, thanks for clarifying.
> >Now to discuss the "jumbled console output":
> >The default (assuming your kernel configs are based off of GENERIC
> >within the past 4-5 years) is 128. However, the same developers stated
> >that they have great reservations over increasing this number
> >dramatically (meaning, something like 256 will probably work, but larger
> >"may have repercussions which are unknown at this time").
> Might try that if it will help, but with so many production machines to
> action I'd like to avoid it if possible.
I've used PRINTF_BUFR_SIZE=256 with success on our systems, but since it
doesn't actually *solve* the problem, I just use the default 128 and
just grit my teeth when we experience it. It's larger values (e.g.
512/1024, etc.) which there is concern over.
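
For reference, a minimal sketch of the kernel configuration line in
question, assuming a custom config derived from GENERIC. The value 256 is
just the one mentioned above, not a recommendation, and the trailing
comment is illustrative:

```
options 	PRINTF_BUFR_SIZE=256	# Prevent printf output being interspersed.
```

A changed value only takes effect after rebuilding and installing the
kernel, which is what makes it awkward to roll out across many machines.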
> >In combination with this, we use the following in /etc/rc.conf (the
> >dumpdev line is important, else savecore won't pick up anything):
> I thought this was meant to be the default from back in the 6.x days but
> it didn't seem to work, so I added the gptid device from /etc/fstab.
/etc/defaults/rc.conf has dumpdev="NO", which affects two things: both
/etc/rc.d/dumpon (this script is a little tricky, you really have to
read it slowly/pay close attention to what's going on), and
I've always wondered why dumpdev="NO" is the default, not "auto", since
on a system with no swap devices in /etc/fstab dumpdev="auto" should
behave the same. Possibly the idea of the default is to ensure that
savecore(8) never gets run (e.g. there's no guarantee someone has
/var/crash, or a /var that's big enough to hold a crash dump; possibly
embedded systems or NFS-only systems, for example).
Touchy subject I guess.
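
As a rough sketch of the rc.conf side of this, assuming a swap device is
listed in /etc/fstab and /var/crash has room for a dump. The values are
illustrative only; both variable names are documented in rc.conf(5):

```sh
# /etc/rc.conf fragment (plain sh variable assignments).
# Assumption: a swap device in /etc/fstab is large enough to hold a dump.
dumpdev="AUTO"        # use the first suitable swap device from /etc/fstab
dumpdir="/var/crash"  # directory where savecore(8) saves the dump at boot
```

An explicit device path can be given for dumpdev instead of "AUTO" when
the automatic selection doesn't pick up the right device.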
| Jeremy Chadwick jdc at parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, US |
| Making life hard for others since 1977. PGP 4BD6C0CB |