how to get more logging from GEOM?
rsmith at xs4all.nl
Wed Jul 16 22:14:23 UTC 2008
On Wed, Jul 16, 2008 at 02:41:28PM -0700, Jo Rhett wrote:
> On Jul 11, 2008, at 8:58 AM, Roland Smith wrote:
> >> After about 2 weeks of watching it carefully I've learned almost
> >> nothing. It's not a disk failure (AFAIK), it's not cpu overheat (now
> >> running healthd without complaints), it's not based on any given
> >> network traffic... however it does appear to accompany heavy
> >> cpu/disk activity. It usually dies when indexing my websites at
> >> night (but not always) and it sometimes dies when compiling
> >> programs. Just heavy disk isn't enough to do the job, as backups
> >> proceed without problems. Heavy cpu by itself isn't enough to do it
> >> either. But if I start compiling things and keep going a while, it
> >> will eventually hang.
> >> Is there anything else I should be looking at?
> > Power supply or motherboard would be my first guess.
> If the system went offline, I agree. But it's clearly a kernel
> deadlock, since the system remains pingable, answers TCP connections,
> etc.... but doesn't respond.
Ah. Well, you did say the system 'dies', not 'becomes unresponsive'.
> No TCP negotiation, no response on
> the console, etc. It's higher level activity which isn't working...
Try compiling a kernel with debugging options, e.g. WITNESS(4), MUTEX_DEBUG,
LOCK_PROFILING, DIAGNOSTIC and INVARIANTS. See /usr/src/sys/conf/NOTES for
what each option does. This will create a lot of messages in the dmesg output.
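As a rough sketch of what such a kernel configuration could look like,
the script below writes a config file that pulls in GENERIC and adds
some of the options named above. The file name DEBUG and the
KERNCONF_FILE variable are made up for illustration. INVARIANT_SUPPORT
is added because INVARIANTS depends on it. MUTEX_DEBUG is left out on
purpose: check NOTES in your own source tree to see whether your release
still has it before adding it.

```sh
#!/bin/sh
# Write a hypothetical debugging kernel config named DEBUG into the
# current directory (override the location with KERNCONF_FILE).
conf="${KERNCONF_FILE:-./DEBUG}"

cat > "$conf" <<'EOF'
include GENERIC
ident DEBUG

options INVARIANTS
options INVARIANT_SUPPORT
options WITNESS
options DIAGNOSTIC
options LOCK_PROFILING
EOF

echo "wrote $conf"

# Afterwards, as root (assuming the stock source tree in /usr/src):
#   cp DEBUG /usr/src/sys/<arch>/conf/
#   cd /usr/src
#   make buildkernel KERNCONF=DEBUG && make installkernel KERNCONF=DEBUG
```

Expect such a kernel to be noticeably slower than GENERIC; WITNESS in
particular is expensive.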
If you can hook the system up to another machine via serial console, you
might be able to debug the kernel. Read the kernel debugging chapter in
the Developers' Handbook.
Another tip is to create a cron job that makes log entries every couple
of minutes with logger. This might help you pinpoint the exact time of
the mishap, to correlate it to other system activity.
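A minimal sketch of such a heartbeat job follows. The script name, the
"heartbeat" tag, the five-minute interval and the install path are all
assumptions chosen for illustration; only logger(1) and uptime(1)
themselves are standard. Logging the load averages along with the
timestamp gives a hint about what the machine was doing just before it
stopped responding.

```sh
#!/bin/sh
# heartbeat.sh: hypothetical helper, meant to be run from cron.
# Example crontab entry (every five minutes):
#   */5 * * * * /usr/local/sbin/heartbeat.sh

heartbeat_message() {
    # uptime(1) ends its line with "load averages: a, b, c" on FreeBSD
    # ("load average:" on some other systems); keep only the numbers.
    load=$(uptime 2>/dev/null | sed 's/.*load average[s]*: *//')
    echo "alive load=${load}"
}

# -t sets the tag, -p the facility.level pair (see logger(1)).
# Failures are ignored so a broken syslogd does not also generate cron mail.
logger -t heartbeat -p user.notice "$(heartbeat_message)" 2>/dev/null || true
```

After a hang, the last "heartbeat" line in /var/log/messages brackets the
time of the failure to within one interval.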
Be _really_ sure that it isn't hardware though. Otherwise you'll be led
on a merry goose chase looking for software errors that aren't there. If
you can restore a backup of this machine's software to a similar one, do
so and see if the hangs persist. If they don't, it's hardware.
[plain text _non-HTML_ PGP/GnuPG encrypted/signed email much appreciated]
pgp: 1A2B 477F 9970 BA3C 2914 B7CE 1277 EFB0 C321 A725 (KeyID: C321A725)