kern/122615: occasional crash/boot while running Xorg
Bob Frazier
bobf at mrp3.com
Thu Apr 17 14:20:03 UTC 2008
The following reply was made to PR kern/122615; it has been noted by GNATS.
From: Bob Frazier <bobf at mrp3.com>
To: bug-followup at FreeBSD.org
Cc:
Subject: Re: kern/122615: occasional crash/boot while running Xorg
Date: Thu, 17 Apr 2008 08:18:20 -0700
It appears that this problem may be related specifically to the SATA
controller.
I had several crashes happen to me this morning, most of them without
Xorg running. Prior to this, Xorg had been running for several days
without incident.
I should point out that I have 2 jails running from directories on the
SATA drive, which is the 2nd drive in my system. So I can expect file
activity on this drive from time to time due to cron, etc. running in
the jails. The SATA drive has a single NFS partition and is 160 GB.
Crash 1: Copying a ~180 MB file from an NFS share on a Linux machine to
a location on the SATA drive. The system froze, then rebooted. No core dump.
Crash 2: From the console (no X running), I copied the same file again
(while background checks were being done), then copied it to a USB
ramdisk and started another process (in a different vconsole) to compare
a number of existing files against what should be identical files on the
same NFS share as before. When I issued the 'umount' command, the system
rebooted. No core dump.
Crash 3: Started the file comparison (again), after manually fsck'ing
the partitions on the IDE drive (/, /tmp, /var, /usr) in single-user and
pressing CTRL+D to resume startup. System rebooted with a crash dump
(#4 in /var/crash).
Crash 4: Started the system, booted to single-user mode, fsck'd the 4
mountpoints on the IDE drive again, pressed CTRL+D to go multi-user, and
then started typing a command. System froze up and rebooted with a crash
dump (#5 in /var/crash).
In each case the crash symptoms are similar to the one I reported here.
I'm lacking time at the moment and will follow up with more backtraces
for the 2 crashdump files on request.
At the moment I'm running an fsck on the SATA drive with the drive
unmounted in multi-user mode (jails not running). Hopefully this won't
crash and I can validate and offload files from this drive. I am
starting to suspect that the SATA controller or the drive itself is at
the root of the problem. The typical symptoms are a message reporting
some kind of error on the 'ad4' (SATA) drive, then a message suggesting
the drive is being removed or is not responding, then several errors
reading/writing LBA locations that seem unusually large for a drive of
that size, and finally the crash/reboot. Unfortunately this information
is lost every time, if I'm even lucky enough to see the messages on the
terminal before the system reboots. The only relevant piece of
information that seems to end up in the info.# file is
"vinvalbuf: dirtybufs" as the cause of the 'panic'.
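
The remark about LBA values looking too large for the drive can be checked
with simple arithmetic. The following is only a rough sketch in Python,
assuming 512-byte sectors and that "160 GB" means decimal gigabytes
(neither is stated above). The function names are hypothetical and no
actual LBA values from the error messages are used, since those were lost.

```python
# Sanity check for LBA values reported in disk error messages.
# Assumptions (not from the report): 512-byte sectors, decimal gigabytes.

SECTOR_BYTES = 512


def max_lba(capacity_bytes, sector_bytes=SECTOR_BYTES):
    """Return the highest valid LBA for a drive of the given capacity."""
    return capacity_bytes // sector_bytes - 1


def lba_in_range(lba, capacity_bytes, sector_bytes=SECTOR_BYTES):
    """Return True if the LBA could address a real sector on the drive."""
    return 0 <= lba <= max_lba(capacity_bytes, sector_bytes)


capacity = 160 * 10**9  # nominal 160 GB drive
print(max_lba(capacity))  # 312499999
```

Under these assumptions, any reported LBA above roughly 312.5 million
cannot be a real sector on the drive, which would be consistent with the
controller or driver issuing bad requests rather than with ordinary bad
sectors, though it does not prove either.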