It's happening again (panic early in boot)
Ian FREISLICH
if at hetzner.co.za
Mon Jun 7 11:00:19 GMT 2004
John Baldwin wrote:
> On Friday 04 June 2004 11:14 am, Ian FREISLICH wrote:
> > John Baldwin wrote:
> > > On Friday 04 June 2004 06:45 am, Ian FREISLICH wrote:
> > > > Hi
> > > >
> > > > Every month or so after it started working I get this panic.
> > > > The panic then goes away after a month or two, with no
> > > > explanation. During the existence of the panic I try new kernel
> > > > source once a day.
> > > >
> > > > This is an SMP machine. Using the same source UP kernels work
> > > > fine, SMP kernels don't. The last SMP kernel that worked is
> > > > circa May 17.
> > >
> > > grr, I still don't know why this happens. One thing though is
> > > that if we can fix the nested panic we might be able to work on the
> > > first one.
> >
> > If you want access to the box in question, I can arrange that.
> >
> > > > Booting [/boot/kernel/kernel]...
> > > > /boot/kernel/acpi.ko text=0x3a0e4 data=0x19e4+0x11ac
> > > > syms=[0x4+0x6860+0x4+0x8a87 ]
> > > > Copyright (c) 1992-2004 The FreeBSD Project.
> > > > Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993,
> > > > 1994 The Regents of the University of California. All rights reserved.
> > > > FreeBSD 5.2-CURRENT #15: Fri Jun 4 10:23:23 SAST 2004
> > > >
> > > > ianf at brane-dead.freislich.nom.za:/usr/src/sys/i386/compile/BRANE-DEAD
> > > > Preloaded elf kernel "/boot/kernel/kernel" at 0xc0728000.
> > > > Preloaded elf module "/boot/kernel/acpi.ko" at 0xc0728244.
> > > > Timecounter "i8254" frequency 1193182 Hz quality 0
> > > > CPU: Pentium II/Pentium II Xeon/Celeron (267.27-MHz 686-class CPU)
> > > > Origin = "GenuineIntel" Id = 0x634 Stepping = 4
> > > >
> > > > > Features=0x80fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,MMX>
> > > > real memory = 201261056 (191 MB)
> > > > avail memory = 191311872 (182 MB)
> > > > MPTable: <OEM00000 PROD00000000>
> > > > kernel trap 12 with interrupts disabled
> > > >
> > > >
> > > > Fatal trap 12: page fault while in kernel mode
> > > > cpuid = 0; apic id = 00
> > > > fault virtual address = 0x1c
> > > > fault code = supervisor write, page not present
> > > > instruction pointer = 0x8:0xc058d98e
> > >
> > > Can you do a gdb -k on kernel.debug and do 'l *' on this address? That
> > > might let us fix the panic in vm_fault().
> >
> > Is this what you're after?
> >
> > (kgdb) l * 0xc058d98e
> > 0xc058d98e is in vm_fault (machine/atomic.h:154).
> > 149 static __inline int
> > 150 atomic_cmpset_int(volatile u_int *dst, u_int exp, u_int src)
> > 151 {
> > 152 int res = exp;
> > 153
> > 154 __asm __volatile (
> > 155 " " __XSTRING(MPLOCKED) " "
> > 156 " cmpxchgl %1,%2 ; "
> > 157 " setz %%al ; "
> > 158 " movzbl %%al,%0 ; "
> >
> > Ian
>
> Hmm, darn inlines. :) Can you compile the kernel with either
> INVARIANTS or MUTEX_NOINLINE so that mutex ops aren't inlined,
> reproduce the panic and then do the same lookup using the new faulting
> IP?
(kgdb) l * 0xc04b9828
0xc04b9828 is in _mtx_lock_flags (../../../kern/kern_mutex.c:247).
242 void
243 _mtx_lock_flags(struct mtx *m, int opts, const char *file, int line)
244 {
245
246 MPASS(curthread != NULL);
247 KASSERT(m->mtx_object.lo_class == &lock_class_mtx_sleep,
248 ("mtx_lock() of spin mutex %s @ %s:%d", m->mtx_object.lo_name,
249 file, line));
250 WITNESS_CHECKORDER(&m->mtx_object, opts | LOP_NEWORDER | LOP_EXCLUSIVE,
251 file, line);
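For what it's worth, a fault virtual address as small as 0x1c (see the trap output quoted above) is what you would expect if the mutex pointer itself were NULL and the code touched a member 0x1c bytes into the structure. The sketch below only illustrates that arithmetic. The struct layout and the names fake_lock_object, fake_mtx and lock_word_address are made up for the example; I have not checked them against the real struct mtx in this source tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout, NOT the real struct lock_object / struct mtx.
 * Seven 32-bit words give a 28-byte (0x1c) header in front of the
 * lock word, purely so the numbers match the trap output.
 */
struct fake_lock_object {
	uint32_t lo_words[7];
};

struct fake_mtx {
	struct fake_lock_object mtx_object;
	volatile uint32_t mtx_lock;	/* word a cmpxchg would write */
};

/*
 * Address that an access to mtx_lock would hit if the mutex pointer
 * had the value 'base'.  With base == 0 (a NULL mutex) the result is
 * just the member offset.
 */
static uintptr_t
lock_word_address(uintptr_t base)
{
	/* The lock word sits directly after the header in this toy layout. */
	assert(offsetof(struct fake_mtx, mtx_lock) ==
	    sizeof(struct fake_lock_object));
	return (base + offsetof(struct fake_mtx, mtx_lock));
}
```

Calling lock_word_address(0) models locking through a NULL mutex pointer. If the real layout happens to match, that would point at a mutex that has not been set up yet this early in boot, but that is a guess and not something the traces here prove.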
Interestingly, with INVARIANTS the panic is exactly the same except
for this (new) text at the end of the multiple panic:
panic: page fault
at line 815 in file ../../../i386/i386/trap.ccpuid = 0;
Uptime: 1s
panic: _mtx_lock_sleep: recursed on non-recursive mutex system map @ ../../../vm/vm_map.c:2876
at line 437 in file ../../../kern/kern_mutex.ccpuid = 0;
Uptime: 1s
panic: _mtx_lock_sleep: recursed on non-rep
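The repeating "recursed on non-recursive mutex" message is consistent with the thread that already holds the "system map" mutex faulting and then trying to take the same mutex again on the fault or panic path. The toy model below shows only that ownership check. It is not the code in kern_mutex.c, and toy_mtx, toy_result and toy_mtx_lock are hypothetical names invented for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy mutex: 'owner' is NULL when unlocked, otherwise the holder's id. */
struct toy_mtx {
	const void *owner;
	bool recursable;	/* may the owner lock it again? */
	unsigned recursion;	/* extra acquisitions by the owner */
};

enum toy_result {
	TOY_ACQUIRED,		/* lock was free and is now ours */
	TOY_RECURSED,		/* we already held it and recursion is allowed */
	TOY_WOULD_PANIC,	/* we already held it, recursion not allowed */
	TOY_CONTENDED		/* somebody else holds it */
};

/* Try to lock 'm' on behalf of 'self' and report what happened. */
static enum toy_result
toy_mtx_lock(struct toy_mtx *m, const void *self)
{
	assert(m != NULL && self != NULL);

	if (m->owner == NULL) {
		m->owner = self;
		return (TOY_ACQUIRED);
	}
	if (m->owner != self)
		return (TOY_CONTENDED);
	if (!m->recursable)
		return (TOY_WOULD_PANIC);	/* the case in the log above */
	m->recursion++;
	return (TOY_RECURSED);
}
```

In this model a second toy_mtx_lock() by the same owner on a mutex with recursable set to false returns TOY_WOULD_PANIC every time, which would explain why the message keeps repeating once the first fault happens with the lock held.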
Ian
--
Ian Freislich