My server gets kernel panic every 7th day

Greg 'groggy' Lehey grog at FreeBSD.org
Sun Dec 19 14:53:23 PST 2004


On Sunday, 19 December 2004 at 23:42:20 +0100, Daniel Johansson wrote:
> On Mon, 20 Dec 2004 09:08:01 +1030, Greg 'groggy' Lehey
> <grog at freebsd.org> wrote:
>> On Sunday, 19 December 2004 at 23:35:18 +0100, Daniel Johansson wrote:
>>> On Mon, 20 Dec 2004 08:59:19 +1030, Greg 'groggy' Lehey
>>> <grog at freebsd.org> wrote:
>>>> On Saturday, 18 December 2004 at 11:50:02 -0800, Kris Kennaway wrote:
>>>>> On Sat, Dec 18, 2004 at 11:57:35AM +0100, Daniel Johansson wrote:
>>>>>> Hi, I've had my server up for over a year now and it's been rock solid,
>>>>>> but for the last few weeks the server has rebooted every Saturday at
>>>>>> exactly 04:19:57 because of a find command. I have no idea why. I've
>>>>>> checked the cron log and I don't think any crontab is run at that
>>>>>> time, at least not as far as I can see from the cron log. Anyway, find
>>>>>> makes the server panic and reboot. This is the fourth week in a
>>>>>> row it has happened, and I've checked the hardware: no problems at all.
>>>>>
>>>>> How did you "check the hardware"?  Hardware failure is by far the
>>>>> most common cause of "strange panics under abnormal load [such as
>>>>> when the weekly cron job runs]".
>>>>
>>>> If this panic occurs repeatedly under certain circumstances, it's
>>>> probably not hardware.  Anyway, there's not much point standing
>>>> outside and scratching our heads.  We have a facility for analysing
>>>> this kind of problem: the processor dump and kernel debugger.
>>>
>>> Yeah, I want to say thank you for your help. I think I've been able to
>>> reproduce the kernel panic now, finally!
>>>
>>> On my server I run 3 jails, and every night at 04:15, when it runs
>>> periodic weekly, it runs it in the 3 jails + the host environment. This
>>> seems to cause the kernel panic; I don't really know why yet. I can
>>> run periodic weekly separately in every jail + the host without a kernel
>>> panic, but when I run it in all of them at the same time, the kernel
>>> panics.
>>
>> What does the dump backtrace show?
>>
>>> It can still be the PSU, don't have any other atm to try with. I'll
>>> do some more testing and see if I can get any more info.
>>
>> There's no point looking at the hardware until you've looked at the
>> dump.

I'd appreciate it if you didn't require me to move the text of your
messages to where it fits.

> Okay, is this hard to do? I've no idea how to look at the dump or
> how to understand it. You don't have to be a kernel hacker to
> understand that?

It's described in the handbook.  Basically:

- Build a kernel with debug symbols (you should be doing this anyway).
  You need the following line in your configuration file:

    makeoptions	DEBUG=-g		# Build kernel with gdb(1) debug symbols

- Make sure that dumps are enabled.  You should have something like
  this in your /etc/rc.conf:

    dumpdev=/dev/ad0s2b

  The device name should be the name of your swap partition, and it
  must be at least slightly larger than your main memory.

- Ensure you have a directory /var/crash, and that the file system in
  which it resides has enough space for the dump (a little larger than
  main memory).

- When you get a dump, it will be copied to /var/crash automatically
  on reboot.  Go there and get a backtrace.  You don't say which
  version of FreeBSD you're using, but in general this will do it:

  # cd /var/crash
  # gdb -k /usr/obj/src/sys/GENERIC/kernel.debug vmcore.0
  (gdb) bt
  
The name of the kernel (kernel.debug) depends on how you built your
kernel.  If your kernel configuration is not called GENERIC, the name
of the build directory will change accordingly.
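
The size requirements above (swap partition and /var/crash both a
little larger than main memory) can be checked before the next panic
with something like the following sketch.  The helper names dump_fits
and report are made up for illustration, and the FreeBSD part assumes a
single swap device and an existing /var/crash; adjust it to your
system.

```shell
#!/bin/sh
# Pre-flight check for crash dumps (a sketch; helper names are hypothetical).

# dump_fits MEM_KB AVAIL_KB
# Succeeds only if AVAIL_KB is strictly larger than MEM_KB.
dump_fits() {
    [ "$2" -gt "$1" ]
}

# report LABEL MEM_KB AVAIL_KB
# Prints a one-line verdict for the given location.
report() {
    if dump_fits "$2" "$3"; then
        echo "$1: ok ($3 KB available for $2 KB of memory)"
    else
        echo "$1: TOO SMALL ($3 KB available for $2 KB of memory)"
    fi
}

# Gather the real numbers only where swapinfo(8) exists, i.e. on FreeBSD.
# Assumption: the first swap device listed is the dump device.
if command -v swapinfo >/dev/null 2>&1; then
    mem_kb=$(( $(sysctl -n hw.physmem) / 1024 ))
    swap_kb=$(swapinfo -k | awk 'NR == 2 { print $2 }')
    crash_kb=$(df -k /var/crash | awk 'NR == 2 { print $4 }')
    report "swap partition" "$mem_kb" "$swap_kb"
    report "/var/crash" "$mem_kb" "$crash_kb"
fi
```

If either line says TOO SMALL, the dump will probably not be saved, so
it is worth fixing before waiting another week for the panic.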

That's it in a nutshell.  There's much more detail in chapter 6 of my
debug tutorial, which you can find at
http://www.lemis.com/grog/Papers/Debug-tutorial/tutorial.pdf .

Greg
--
When replying to this message, please copy the original recipients.
If you don't, I may ignore the reply or reply to the original recipients.
For more information, see http://www.lemis.com/questions.html
See complete headers for address and phone numbers.