ULE crash
Jeff Roberson
jroberson at chesapeake.net
Wed Jun 25 10:34:30 PDT 2003
On Wed, 25 Jun 2003, Ian Freislich wrote:
> Hi
>
> About 4.5 minutes after rebooting with a SCHED_ULE kernel (I give
> ULE a go every few months), top started looking really weird (the
> CPU % just kept on accumulating for each process). Before dnetc
> started, httpd showed 17% CPU, but the system was supposedly 100%
> idle at the time according to top. Then dnetc started and things
> got weird.
There is some bug that is preventing sleeping processes from losing old
CPU usage. I'm looking into that. Can you tell me what version of the
sched_ule.c file you have? This looks like an old panic.
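
To illustrate the kind of accounting problem described above, here is a
minimal sketch. It is not the sched_ule.c code: the window length, the
function names and the linear decay are all assumptions made for the
example. The point is only that a usage figure recorded while a process
was running has to be scaled down for the time it has since spent
asleep, or top keeps reporting the stale value.

```python
WINDOW = 100  # hypothetical averaging window, in clock ticks


def decayed_ticks(ticks, last_update, now, window=WINDOW):
    """Scale ticks recorded at last_update down for the idle time until now.

    Hypothetical helper: linear decay over one window, integer arithmetic.
    """
    idle = now - last_update
    if idle >= window:
        # Slept for a whole window or longer: no recent usage is left.
        return 0
    return ticks * (window - idle) // window


def pct_cpu(ticks, window=WINDOW):
    """Express ticks used within the window as a percentage."""
    return 100.0 * ticks / window


# A process that used 18 of the last 100 ticks and then went to sleep.
ticks_at_sleep = 18
stale = pct_cpu(ticks_at_sleep)                         # never decayed
half = pct_cpu(decayed_ticks(ticks_at_sleep, 0, 50))    # 50 ticks asleep
gone = pct_cpu(decayed_ticks(ticks_at_sleep, 0, 100))   # 100 ticks asleep
print(stale, half, gone)
```

If the decay step is skipped for sleeping processes, every reading looks
like the first value, which would be consistent with an idle httpd still
showing a nonzero CPU percentage.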
Thanks,
Jeff
>
> last pid: 607; load averages: 1.83, 0.63, 0.25 up 0+00:04:23 16:00:48
> 35 processes: 3 running, 32 sleeping
> CPU states: 0.0% user, 99.0% nice, 0.6% system, 0.4% interrupt, 0.0% idle
> Mem: 20M Active, 14M Inact, 19M Wired, 20K Cache, 25M Buf, 130M Free
> Swap: 512M Total, 512M Free
>
> PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU CPU COMMAND
> 603 ianf 139 20 1072K 880K RUN 0 0:39 105.47% 105.47% dnetc
> 575 ianf 139 20 1072K 880K CPU1 1 1:15 102.34% 102.34% dnetc
> 505 root 76 0 7208K 5420K select 0 0:01 17.97% 17.97% httpd
> 375 root 4 0 1276K 948K accept 0 0:00 9.38% 9.38% nfsd
> 526 nobody 76 0 9280K 8564K select 1 0:04 5.47% 5.47% squid
> 607 ianf 76 0 2196K 1444K CPU0 0 0:00 2.34% 2.34% top
>
> Then it froze. When I got home I found that it had at least dumped
> vmcore.24. I'll keep it around for a while and perform any inspections
> people want me to. This was with sources updated at 13h30 GMT today.
>
> panic: page fault
> panic messages:
> ---
> Fatal trap 12: page fault while in kernel mode
> cpuid = 1; lapic.id = 01000000
> fault virtual address = 0x38
> fault code = supervisor read, page not present
> instruction pointer = 0x8:0xc01e094d
> stack pointer = 0x10:0xce772be4
> frame pointer = 0x10:0xce772bf4
> code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, def32 1, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = 603 (dnetc)
> trap number = 12
> panic: page fault
> cpuid = 1; lapic.id = 01000000
> Stack backtrace:
> boot() called on cpu#1
>
> syncing disks, buffers remaining... panic: absolutely cannot call smp_ipi_shootdown with interrupts already disabled
> cpuid = 1; lapic.id = 01000000
> boot() called on cpu#1
> Uptime: 4m15s
> Dumping 191 MB
> ata0: resetting devices ..
> done
> 16 32 48 64 80 96 112 128 144 160 176
> ---
>
> (kgdb) bt
> #0 doadump () at ../../../kern/kern_shutdown.c:240
> #1 0xc01cbe7f in boot (howto=260) at ../../../kern/kern_shutdown.c:372
> #2 0xc01cc2b8 in panic () at ../../../kern/kern_shutdown.c:550
> #3 0xc02e8f89 in smp_tlb_shootdown (vector=0, addr1=0, addr2=0)
> at ../../../i386/i386/mp_machdep.c:2356
> #4 0xc02e92a9 in smp_invlpg_range (addr1=0, addr2=0)
> at ../../../i386/i386/mp_machdep.c:2488
> #5 0xc02eb548 in pmap_invalidate_range (pmap=0xc03996e0, sva=3365310464,
> eva=3365314560) at ../../../i386/i386/pmap.c:721
> #6 0xc02eb83d in pmap_qenter (sva=3365310464, m=0xce772884, count=0)
> at ../../../i386/i386/pmap.c:948
> #7 0xc0218a31 in vm_hold_load_pages (bp=0xc76039a0, from=0, to=3365318656)
> at ../../../kern/vfs_bio.c:3574
> #8 0xc0216f5a in allocbuf (bp=0xc76039a0, size=8192)
> at ../../../kern/vfs_bio.c:2752
> #9 0xc0216cee in geteblk (size=8192) at ../../../kern/vfs_bio.c:2634
> #10 0xc0213980 in bwrite (bp=0xc75b65d8) at ../../../kern/vfs_bio.c:818
> #11 0xc02142dc in bawrite (bp=0x0) at ../../../kern/vfs_bio.c:1153
> #12 0xc021d89a in vop_stdfsync (ap=0xce772a14)
> at ../../../kern/vfs_default.c:742
> #13 0xc0193570 in spec_fsync (ap=0xce772a14)
> at ../../../fs/specfs/spec_vnops.c:417
> #14 0xc0192a38 in spec_vnoperate (ap=0x0)
> at ../../../fs/specfs/spec_vnops.c:122
> #15 0xc0294c62 in ffs_sync (mp=0xc3950a00, waitfor=2, cred=0xc0d06e80,
> td=0xc03702a0) at vnode_if.h:624
> #16 0xc022b15b in sync (td=0xc03702a0, uap=0x0)
> at ../../../kern/vfs_syscalls.c:142
> #17 0xc01cb9a1 in boot (howto=256) at ../../../kern/kern_shutdown.c:281
> #18 0xc01cc2b8 in panic () at ../../../kern/kern_shutdown.c:550
> #19 0xc02f0da2 in trap_fatal (frame=0xce772ba4, eva=0)
> at ../../../i386/i386/trap.c:836
> #20 0xc02f0333 in trap (frame=
> {tf_fs = -1060044776, tf_es = -831062000, tf_ds = -1071775728, tf_edi = -1014422336, tf_esi = -1070107520, tf_ebp = -831050764, tf_isp = -831050800, tf_ebx = 0, tf_edx = 0, tf_ecx = -1059988168, tf_eax = 0, tf_trapno = 12, tf_err = 0, tf_eip = -1071773363, tf_cs = 8, tf_eflags = 66194, tf_esp = -1070107520, tf_ss = 0}) at ../../../i386/i386/trap.c:256
> #21 0xc02d8eb8 in calltrap () at {standard input}:97
> #22 0xc01e188b in sched_choose () at ../../../kern/sched_ule.c:1161
> #23 0xc01d25e6 in choosethread () at ../../../kern/kern_switch.c:140
> #24 0xc01d422f in mi_switch () at ../../../kern/kern_synch.c:525
> #25 0xc01c1db6 in _mtx_lock_sleep (m=0xc0374a40, opts=0, file=0x0, line=0)
> at ../../../kern/kern_mutex.c:636
> #26 0xc01ca585 in getrusage (td=0x0, uap=0xce772d10)
> at ../../../kern/kern_resource.c:773
> #27 0xc02f10fc in syscall (frame=
> {tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = 135360172, tf_esi = 135336096, tf_ebp = -1077938416, tf_isp = -831050380, tf_ebx = -1077938416, tf_edx = 0, tf_ecx = 0, tf_eax = 117, tf_trapno = 0, tf_err = 2, tf_eip = 134789976, tf_cs = 31, tf_eflags = 659, tf_esp = -1077938572, tf_ss = 47})
> at ../../../i386/i386/trap.c:1023
> #28 0xc02d8f0d in Xint0x80_syscall () at {standard input}:139
> ---Can't read userspace from dump, or kernel process---
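
A note on reading the trap frame in frame #20: kgdb prints the register
fields as signed 32-bit integers, so they need converting before they
can be compared with the hexadecimal values in the panic message. The
short sketch below does that for tf_eip; the helper name to_u32 is made
up for the example. It also records that a read fault at 0x38 is what
one would see if a NULL structure pointer were dereferenced at member
offset 0x38, which is an inference from the address and not something
the dump confirms.

```python
def to_u32(value):
    """Reinterpret a signed 32-bit integer as unsigned (hypothetical helper)."""
    return value & 0xFFFFFFFF


# tf_eip as printed by kgdb in the trap frame of frame #20.
tf_eip = -1071773363
eip_hex = hex(to_u32(tf_eip))
print(eip_hex)  # same address as "instruction pointer = 0x8:0xc01e094d"

# Fault address from the panic message. If the base pointer was NULL (0),
# the accessed member would sit at this offset within the structure.
fault_va = 0x38
null_base = 0
member_offset = fault_va - null_base
print(hex(member_offset))
```

The converted tf_eip matches the instruction pointer reported in the
panic message, so the frame shown by kgdb is the one that took the
original page fault.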
>