unexpected panic in idle process (RELENG_5 2005/04/25UTC)

Sun May 8 16:44:19 PDT 2005

On Thu, 28 Apr 2005, Adrian Steinmann wrote:

> I've been running RELENG_5 -D2005/04/25UTC on a CF-based system
> (PC Engines WRAP.1C v1.03; 640 KB Base Memory; 130048 KB Extended Memory)
> CF:	01F0 Master 848A Hitachi XX.V.3.4.0.0
> 	Phys C/H/S 978/8/32 Log C/H/S 978/8/32
>
> ...
> CPU: Geode(TM) Integrated Processor by National Semi (266.64-MHz 586-class CPU)
>   Origin = "Geode by NSC"  Id = 0x540  Stepping = 0
>   Features=0x808131<FPU,TSC,MSR,CX8,CMOV,MMX>
> ...
>
> after a few hours (mainly idle and ntpd), I was surprised to note
> on the console:
>
> Fatal trap 9: general protection fault while in kernel mode
> instruction pointer     = 0x8:0xc05e8ca2
> stack pointer           = 0x10:0xc7499d10
> frame pointer           = 0x10:0xc7499d10
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 11 (idle)
> [thread pid 11 tid 100003 ]
> Stopped at      cpu_idle_default+0x5:   leave

This is an odd one: the IA-32 manual does not list a general protection
fault (#GP) among the exceptions a LEAVE instruction can raise in
protected mode. %ebp looks correct, so I'd suspect cooling problems or a
bad processor.
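
As a rough cross-check of the register dump, here is a minimal sketch that
decodes the two segment selectors from the trap report and applies a simple
plausibility test to the frame pointer. The numeric values are copied from
the quoted console output. The helper names and the 0x1000 window are
hypothetical, and KERNBASE = 0xc0000000 is an assumption (the usual i386
default), not something stated in this thread.

```python
# Values copied from the trap report and the ddb "where" output.
CS_SELECTOR = 0x8
SS_SELECTOR = 0x10
ESP = 0xC7499D10
EBP = 0xC7499D10
# Stack addresses that appear in the backtrace lines.
STACK_ADDRS_IN_TRACE = [0xC7499D34, 0xC7499D48, 0xC7499D7C]

# Assumption: default i386 kernel base address; not given in the thread.
KERNBASE = 0xC0000000


def decode_selector(sel):
    """Split an x86 segment selector into index, table indicator and RPL."""
    return {
        "index": sel >> 3,                       # descriptor table entry
        "table": "LDT" if sel & 0x4 else "GDT",  # TI bit
        "rpl": sel & 0x3,                        # requested privilege level
    }


def frame_pointer_looks_sane(ebp, esp, stack_addrs, span=0x1000):
    """Heuristic only: aligned, in kernel space, not below esp, and
    within `span` bytes below the other stack addresses in the trace."""
    aligned = ebp % 4 == 0
    in_kernel = ebp >= KERNBASE
    not_below_sp = ebp >= esp
    near = all(0 <= addr - ebp < span for addr in stack_addrs)
    return aligned and in_kernel and not_below_sp and near


cs = decode_selector(CS_SELECTOR)
ss = decode_selector(SS_SELECTOR)
sane = frame_pointer_looks_sane(EBP, ESP, STACK_ADDRS_IN_TRACE)
print(cs, ss, sane)
```

Both selectors decode to GDT entries with RPL 0, which agrees with "while
in kernel mode", and the frame pointer passes the heuristic. That only
shows the dump is self-consistent; it cannot confirm or rule out a
hardware fault.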

>
> I've got:
>
> db> where
> Tracing pid 11 tid 100003 td 0xc0ac6480
> cpu_idle_default(c0ac5c5c,c7499d34,c049e404,0,c7499d48) at cpu_idle_default+0x5
> idle_proc(0,c7499d48,0,c049e628,0) at idle_proc+0x11
> fork_exit(c049e628,0,c7499d48) at fork_exit+0x6f
> fork_trampoline() at fork_trampoline+0x8
> --- trap 0x1, eip = 0, esp = 0xc7499d7c, ebp = 0 ---
> db> ps
>   pid   proc     uid  ppid  pgrp  flag   stat  wmesg    wchan  cmd
>   931 c109e000 1001   930   931 0004002 [SLPQ ttyin 0xc0b65a10][SLP] bash
>   930 c109e1c4 1001   928   928 0000100 [SLPQ select 0xc0674e24][SLP] sshd
>   928 c109ea98    0   831   928 0000100 [SLPQ sbwait 0xc0f49974][SLP] sshd
>   927 c10a21c4    0     1   927 0004002 [SLPQ ttyin 0xc0abea10][SLP] getty
>   853 c0b50710    0     1   853 0000000 [SLPQ nanslp 0xc067182c][SLP] cron
>   841 c0bf7710   25     1   841 0000100 [SLPQ pause 0xc0bf7748][SLP] sendmail
>   837 c0b508d4    0     1   837 0000100 [SLPQ select 0xc0674e24][SLP] sendmail
>   831 c0b501c4    0     1   831 0000100 [SLPQ select 0xc0674e24][SLP] sshd
>   818 c0b50388    0     1   818 0000000 [SLPQ select 0xc0674e24][SLP] ntpd
>   710 c0bf7000    0     1   710 0000000 [SLPQ select 0xc0674e24][SLP] syslogd
>   101 c0b50c5c    0     0     0 0000204 [SLPQ mdwait 0xc0bf8d00][SLP] md2
>    91 c0bf754c    0     0     0 0000204 [SLPQ mdwait 0xc0c23a00][SLP] md1
>    71 c0bf7388    0     0     0 0000204 [SLPQ mdwait 0xc0c23600][SLP] md0
>    46 c0afaa98    0     0     0 0000204 [SLPQ - 0xc74dbd0c][SLP] schedcpu
>    45 c0afac5c    0     0     0 0000204 [SLPQ - 0xc067db2c][SLP] nfsiod 3
>    44 c0afae20    0     0     0 0000204 [SLPQ - 0xc067db28][SLP] nfsiod 2
>    43 c0b4c000    0     0     0 0000204 [SLPQ - 0xc067db24][SLP] nfsiod 1
>    42 c0b4c1c4    0     0     0 0000204 [SLPQ - 0xc067db20][SLP] nfsiod 0
>    41 c0b4c388    0     0     0 0000204 [SLPQ syncer 0xc06715ac][SLP] syncer
>    40 c0b4c54c    0     0     0 0000204 [SLPQ vlruwt 0xc0b4c54c][SLP] vnlru
>    39 c0b4c710    0     0     0 0000204 [SLPQ psleep 0xc06753ec][SLP] bufdaemon
>    38 c0b4c8d4    0     0     0 0000204 [SLPQ pollid 0xc066f184][SLP] idlepoll
>    37 c0b4ca98    0     0     0 000020c [SLPQ pgzero 0xc0684554][SLP] pagezero
>     9 c0b4cc5c    0     0     0 0000204 [SLPQ psleep 0xc0684564][SLP] pagedaemon
>    36 c0b4ce20    0     0     0 0000204 [IWAIT] swi0: sio
>     8 c0b50000    0     0     0 0000204 [SLPQ - 0xc0b34d80][SLP] thread taskq
>    35 c0ae854c    0     0     0 0000204 [IWAIT] swi6:+
>     7 c0ae8710    0     0     0 0000204 [SLPQ - 0xc0b34e40][SLP] kqueue taskq
>    34 c0ae88d4    0     0     0 0000204 [IWAIT] swi3: cambio
>    33 c0ae8a98    0     0     0 0000204 [IWAIT] swi2: camnet
>    32 c0ae8c5c    0     0     0 0000204 [IWAIT] swi6: task queue
>    31 c0ae8e20    0     0     0 0000204 [IWAIT] swi6:+
>    30 c0afa000    0     0     0 0000204 [SLPQ - 0xc06676c0][SLP] yarrow
>     6 c0afa1c4    0     0     0 0000204 [SLPQ crypto_ret_wait 0xc0683584][SLP] crypto returns
>     5 c0afa388    0     0     0 0000204 [SLPQ crypto_wait 0xc0683544][SLP] crypto
>     4 c0afa54c    0     0     0 0000204 [SLPQ - 0xc066ba68][SLP] g_down
>     3 c0afa710    0     0     0 0000204 [SLPQ - 0xc066ba64][SLP] g_up
>     2 c0afa8d4    0     0     0 0000204 [SLPQ - 0xc066ba5c][SLP] g_event
>    29 c0acc1c4    0     0     0 0000204 [IWAIT] swi1: net
>    28 c0acc388    0     0     0 0000204 [IWAIT] swi4: vm
>    27 c0acc54c    0     0     0 000020c [RUNQ] swi5: clock sio
>    26 c0acc710    0     0     0 0000204 [IWAIT] irq15: ata1
>    25 c0acc8d4    0     0     0 0000204 [IWAIT] irq14: ata0
>    24 c0acca98    0     0     0 0000204 [IWAIT] irq13:
>    23 c0accc5c    0     0     0 0000204 [IWAIT] irq12: hifn0
>    22 c0acce20    0     0     0 0000204 [IWAIT] irq11: sis2
>    21 c0ae8000    0     0     0 0000204 [IWAIT] irq10: sis0
>    20 c0ae81c4    0     0     0 0000204 [IWAIT] irq9: sis1
>    19 c0ae8388    0     0     0 0000204 [IWAIT] irq8: rtc
>    18 c0ac5000    0     0     0 0000204 [IWAIT] irq7:
>    17 c0ac51c4    0     0     0 0000204 [IWAIT] irq6:
>    16 c0ac5388    0     0     0 0000204 [IWAIT] irq5:
>    15 c0ac554c    0     0     0 0000204 [IWAIT] irq4: sio0
>    14 c0ac5710    0     0     0 0000204 [IWAIT] irq3:
>    13 c0ac58d4    0     0     0 0000204 [IWAIT] irq1:
>    12 c0ac5a98    0     0     0 0000204 [IWAIT] irq0: clk
>    11 c0ac5c5c    0     0     0 000020c [CPU 0] idle
>     1 c0ac5e20    0     0     1 0004200 [SLPQ wait 0xc0ac5e20][SLP] init
>    10 c0acc000    0     0     0 0000204 [SLPQ ktrace 0xc066f7d8][SLP] ktrace
>     0 c066bb00    0     0     0 0000200 [SLPQ sched 0xc066bb00][SLP] swapper
> db>
>
> unfortunately, I won't be able to take a dump because the CF has
> no swap.
>
> Anyone have an idea what this could be, and what I could type into
> db> to get further info?
>
> One thing that I changed on this HW yesterday was to add a Mini-PCI
> HiFn card:
>
> hifn0 mem 0x80040000-0x80040fff,0x80000000-0x80000fff irq 12 at device 13.0 on pci0
> hifn0: Hifn 7951, rev 0, 128KB sram
>
> Before that change, earlier RELENG_5 systems never showed
> such an "idle" panic on this hardware.
>
> non-verbose dmesg and kernel config image are available, and I will
> leave the db> there, awaiting suggestions.
>
> Adrian

-- 
Doug White                    |  FreeBSD: The Power to Serve
dwhite at gumbysoft.com          |  www.FreeBSD.org