latest -CURRENT kernel panic
Vincent Poy
vincepoy at gmail.com
Tue Nov 9 16:30:36 PST 2004
On Tue, 9 Nov 2004 13:39:47 +0000 (GMT), Robert Watson
<rwatson at freebsd.org> wrote:
>
> On Mon, 8 Nov 2004, Vincent Poy wrote:
>
> > I updated from a March 6, 2004 -CURRENT which had been running without
> > issues to October 19, 2004 -CURRENT. All -CURRENT since October 19,
> > 2004 to the latest November 5, 2004 -CURRENT have been panicing with the
> > following:
>
> Looks like it's likely a bug in the if_xl device driver. Looks like
> Garance has already pointed you at the debugging guide in the handbook, in
> particular using a serial console via a second box to avoid having to
> type. In addition to that useful stuff, it would be useful to have you
> try the following:
A second box would work except this is running on a notebook sitting
on the floor due to space issues. The kernel crash dump method seems
to be what I'm looking for except my swap device is 256MB while I have
2048MB of RAM. What I did yesterday was unplugged the notebook and
took it next to another computer so I can type the output and
hopefully the notebook wouldn't reboot or shutoff before I finish
typing. Guess the other alternative is to just take a picture of the
screen's output with a digital camera and then type it out. =) Is it
possible to set up the serial console but only plug the serial cable
in after a panic?
> - if debug.mpsafenet=1, please try temporarily setting debug.mpsafenet=0
> in loader.conf and rebooting to see if the problem goes away. This
> might suggest a locking problem in the xl driver, for example.
You're right that it's on debug.mpsafenet=1 since from reading the
-current list, it
seems that the dc or was it de driver that required the
debug.mpsafenet=0 and I didn't hear anyone complain about panics with
the xl driver. I've added debug.mpsafenet=0 to /boot/loader.conf and
will try it on the next reboot. What I noticed was the problem is not
exactly identical each time. The one before the one posted below IIRC
said something about cannot lock free mbuf.
> - Could you use gdb or addr2line to convert xl_intr+0x262 below into a
> source line number?
How exactly do I do the above since I noticed ddb doesn't have a
addr2line command in it?
> - Could you run "show pcpu", "show locks", and "ps" and provide that
> output (serial console highly recommended).
Will do that on the next panic if it occurs after the debug.mpsafenet=1
setting.
> - Could you tell me what sort of load the system is experiencing -- in
> particular, does it have multiple network interfaces, is it performing
> web serving, packet forwarding, bridging, XWindows, etc?
It has a xl0, fwe0, ath0, plip0, lo0, pflog0, pfsync0 interface
according to ifconfig -a but I'm only using the xl0 interface as the
ath0 is not configured. The xl0 is basically doing ipfw/dummynet for
outgoing traffic shaping of 0-80 byte TCP Ack packets as well as
acting as a outgoing router for the LAN of 2 windows machines. It is
performing httpd except during the last panic, my apache wasn't
working as I am in the process of going from 1.2 to 2.0. I do have
x.org as well as KDE installed but I'm only running it as needed which
I haven't ran for the last year until about 2 hours ago after
converting from XFree86 to X.Org and then to kde3.3.
Typical load looks like this:
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
root 27 0.5 0.0 0 12 ?? RL 2:22PM 1:13.28 [swi4: clock sio]
root 0 0.0 0.0 0 4 ?? DLs 2:22PM 0:00.02 [swapper]
root 1 0.0 0.0 720 344 ?? ILs 2:22PM 0:00.01 /sbin/init --
root 2 0.0 0.0 0 12 ?? DL 2:22PM 0:00.61 [g_event]
root 3 0.0 0.0 0 12 ?? DL 2:22PM 0:04.22 [g_up]
root 4 0.0 0.0 0 12 ?? DL 2:22PM 0:03.61 [g_down]
root 5 0.0 0.0 0 12 ?? DL 2:22PM 0:00.00 [thread taskq]
root 6 0.0 0.0 0 12 ?? DL 2:22PM 0:00.00 [kqueue taskq]
root 7 0.0 0.0 0 12 ?? IL 2:22PM 0:00.00 [acpi_task0]
root 8 0.0 0.0 0 12 ?? IL 2:22PM 0:00.00 [acpi_task1]
root 9 0.0 0.0 0 12 ?? IL 2:22PM 0:00.00 [acpi_task2]
root 10 0.0 0.0 0 12 ?? DL 2:22PM 0:00.00 [ktrace]
root 11 0.0 0.0 0 12 ?? RL 2:22PM 4:41.92 [idle: cpu0]
root 12 0.0 0.0 0 12 ?? WL 2:22PM 0:00.00 [irq0: clk]
root 13 0.0 0.0 0 12 ?? WL 2:22PM 0:00.00 [irq1: atkbd0]
root 14 0.0 0.0 0 12 ?? WL 2:22PM 0:00.00 [irq3:]
root 15 0.0 0.0 0 12 ?? WL 2:22PM 0:00.00 [irq4: sio0]
root 16 0.0 0.0 0 12 ?? WL 2:22PM 0:00.00 [irq5: pcm0]
root 17 0.0 0.0 0 12 ?? WL 2:22PM 0:00.00 [irq6: fdc0]
root 18 0.0 0.0 0 12 ?? WL 2:22PM 0:00.00 [irq7: ppc0]
root 19 0.0 0.0 0 12 ?? WL 2:22PM 0:00.00 [irq8: rtc]
root 20 0.0 0.0 0 12 ?? WL 2:22PM 0:00.00 [irq9: acpi0]
root 21 0.0 0.0 0 12 ?? WL 2:22PM 0:00.00 [irq10:]
root 22 0.0 0.0 0 12 ?? WL 2:22PM 0:01.57 [irq11: cbb0 cbb1
root 23 0.0 0.0 0 12 ?? WL 2:22PM 0:00.00 [irq12: psm0]
root 24 0.0 0.0 0 12 ?? WL 2:22PM 0:00.00 [irq13:]
root 25 0.0 0.0 0 12 ?? WL 2:22PM 0:01.17 [irq14: ata0]
root 26 0.0 0.0 0 12 ?? WL 2:22PM 0:00.09 [irq15: ata1]
root 28 0.0 0.0 0 12 ?? WL 2:22PM 0:00.00 [swi3: vm]
root 29 0.0 0.0 0 12 ?? WL 2:22PM 0:00.87 [swi1: net]
root 30 0.0 0.0 0 12 ?? DL 2:22PM 0:00.54 [yarrow]
root 31 0.0 0.0 0 12 ?? WL 2:22PM 0:00.00 [swi5:+]
root 32 0.0 0.0 0 12 ?? WL 2:22PM 0:00.21 [swi6:+]
root 33 0.0 0.0 0 12 ?? WL 2:22PM 0:00.00 [swi6: task queue
root 34 0.0 0.0 0 12 ?? WL 2:22PM 0:00.00 [swi6: acpitaskq]
root 35 0.0 0.0 0 12 ?? WL 2:22PM 0:00.00 [swi2: cambio]
root 36 0.0 0.0 0 12 ?? DL 2:22PM 0:07.74 [acpi_thermal]
root 37 0.0 0.0 0 12 ?? DL 2:22PM 0:00.00 [usb0]
root 38 0.0 0.0 0 12 ?? DL 2:22PM 0:00.00 [usbtask]
root 39 0.0 0.0 0 12 ?? DL 2:22PM 0:00.00 [usb1]
root 40 0.0 0.0 0 12 ?? DL 2:22PM 0:00.00 [cbb0]
root 41 0.0 0.0 0 12 ?? DL 2:22PM 0:00.00 [cbb1]
root 42 0.0 0.0 0 12 ?? DL 2:22PM 0:00.03 [fdc0]
root 43 0.0 0.0 0 12 ?? WL 2:22PM 0:00.00 [swi0: sio]
root 44 0.0 0.0 0 12 ?? DL 2:22PM 0:00.01 [pagedaemon]
root 45 0.0 0.0 0 12 ?? DL 2:22PM 0:00.00 [vmdaemon]
root 46 0.0 0.0 0 12 ?? RL 2:22PM 0:06.67 [pagezero]
root 47 0.0 0.0 0 12 ?? DL 2:22PM 0:17.49 [bufdaemon]
root 48 0.0 0.0 0 12 ?? DL 2:22PM 0:04.29 [syncer]
root 49 0.0 0.0 0 12 ?? DL 2:22PM 0:00.04 [vnlru]
root 50 0.0 0.0 0 12 ?? DL 2:22PM 0:00.02 [hpt_wt]
root 51 0.0 0.0 0 12 ?? IL 2:22PM 0:00.00 [nfsiod 0]
root 52 0.0 0.0 0 12 ?? IL 2:22PM 0:00.00 [nfsiod 1]
root 53 0.0 0.0 0 12 ?? IL 2:22PM 0:00.00 [nfsiod 2]
root 54 0.0 0.0 0 12 ?? IL 2:22PM 0:00.00 [nfsiod 3]
root 55 0.0 0.0 0 12 ?? DL 2:22PM 0:00.60 [schedcpu]
root 145 0.0 0.0 1204 628 ?? Is 2:23PM 0:00.00 adjkerntz -i
root 215 0.0 0.0 1440 900 ?? Ss 2:23PM 0:00.23 /sbin/natd -n xl0
root 295 0.0 0.0 488 364 ?? Is 2:23PM 0:00.00 /sbin/devd
root 372 0.0 0.0 1328 876 ?? Ss 2:23PM 0:00.32 /usr/sbin/syslogd
bind 391 0.0 0.2 4360 3544 ?? Ss 2:23PM 0:00.43 /usr/sbin/named -
root 405 0.0 0.1 1408 1044 ?? Ss 2:23PM 0:00.01 /usr/sbin/rpcbind
root 528 0.0 0.0 1304 880 ?? Ss 2:23PM 0:00.45 diskcheckd: ad2 8
root 539 0.0 0.0 1284 928 ?? Is 2:23PM 0:00.00 /usr/sbin/lpd
root 572 0.0 0.0 1244 788 ?? Ss 2:23PM 0:00.01 /usr/sbin/usbd
root 592 0.0 0.1 3348 2568 ?? Is 2:23PM 0:00.16 /usr/sbin/sshd
root 597 0.0 0.1 3568 2824 ?? Ss 2:23PM 0:00.39 sendmail: accepti
smmsp 603 0.0 0.1 3444 2700 ?? Ss 2:23PM 0:00.12 sendmail: Queue r
root 624 0.0 0.1 1384 1088 ?? Ss 2:23PM 0:00.03 /usr/sbin/cron -s
root 829 0.0 0.1 6080 3120 ?? Is 2:24PM 0:00.05 sshd: vince [priv
vince 842 0.0 0.2 6080 3492 ?? S 2:24PM 0:00.63 sshd: vince at ttyp0
root 870 0.0 0.0 1304 880 ?? Ss 2:24PM 0:00.44 diskcheckd: ad2 8
uucp 884 0.0 0.1 2224 1732 ?? Is 2:24PM 0:00.00 /usr/local/sbin/f
uucp 887 0.0 0.1 2232 1896 ?? Is 2:24PM 0:00.01 /usr/local/sbin/h
root 955 0.0 0.3 6880 6188 ?? Ss 2:24PM 0:00.12 /usr/local/bin/pe
root 967 0.0 0.3 7100 6444 ?? Ss 2:24PM 0:00.11 /usr/local/bin/pe
root 986 0.0 0.0 1240 732 ?? Is 2:24PM 0:00.00 /usr/sbin/moused
root 1015 0.0 0.1 1472 1156 ?? Is 2:24PM 0:00.02 /usr/sbin/inetd -
root 1055 0.0 0.2 6080 3128 ?? Is 2:26PM 0:00.05 sshd: vince [priv
vince 1057 0.0 0.2 6080 3480 ?? S 2:26PM 0:01.06 sshd: vince at ttyp1
nobody 7012 0.0 0.1 3104 1832 ?? Is 3:16PM 0:00.02 identd -w -t120
vince 843 0.0 0.1 2628 2212 p0 Ss 2:24PM 0:00.11 -tcsh (tcsh)
vince 961 0.0 0.5 11484 9380 p0 T 2:24PM 0:01.83 pine
root 1029 0.0 0.0 1296 944 v0 Is+ 2:24PM 0:00.00 /usr/libexec/gett
root 1030 0.0 0.0 1296 944 v1 Is+ 2:24PM 0:00.00 /usr/libexec/gett
root 1031 0.0 0.0 1296 944 v2 Is+ 2:24PM 0:00.00 /usr/libexec/gett
root 1032 0.0 0.0 1296 944 v3 Is+ 2:24PM 0:00.00 /usr/libexec/gett
root 1033 0.0 0.0 1296 944 v4 Is+ 2:24PM 0:00.00 /usr/libexec/gett
root 1034 0.0 0.0 1296 944 v5 Is+ 2:24PM 0:00.00 /usr/libexec/gett
root 1035 0.0 0.0 1296 944 v6 Is+ 2:24PM 0:00.00 /usr/libexec/gett
root 1036 0.0 0.0 1296 944 v7 Is+ 2:24PM 0:00.00 /usr/libexec/gett
root 1037 0.0 0.0 1296 944 v8 Is+ 2:24PM 0:00.00 /usr/libexec/gett
root 1038 0.0 0.0 1296 944 v9 Is+ 2:24PM 0:00.00 /usr/libexec/gett
root 1039 0.0 0.0 1296 944 va Is+ 2:24PM 0:00.00 /usr/libexec/gett
root 1040 0.0 0.0 1296 944 vb Is+ 2:24PM 0:00.00 /usr/libexec/gett
> - Are you running with any tweaked network settings, such as
> net.isr.enable?
My kernel config does have maxusers=512 and 65536 nmbclusters.
IPFIREWALL, IPDIVERT, DUMMYNET, BRIDGE, HZ=1000 are also in the kernel
config. There is also this as a temporary workaround for the kmem_map
too small panic issue ever since it started about 9 or so months ago.
Don't know if this issue was fixed.
options VM_KMEM_SIZE_MAX=(384*1048576)
options VM_KMEM_SIZE_SCALE=2
For sysctl, these are the changes:
net.inet.icmp.bmcastecho 1 -> 0
net.inet.tcp.log_in_vain=0 -> 1
net.inet.udp.log_in_vain=0 -> 1
kern.maxfiles=16424 -> 65536
kern.maxfilesperproc=14781 -> 32768
net.inet.tcp.sendspace=32768 -> 65536
kern.ipc.somaxconn=128 -> 8192
net.inet.icmp.drop_redirect=0 -> 1
net.inet.icmp.log_redirect=0 -> 1
/sbin/sysctl -w kern.ipc.maxsockbuf=262144 -> 8388608
Thanks!
Cheers,
Vince
> > Slab at 0xc3d75fa8, freei 14 = 0.
> > panic: Duplicate free of item 0xc3d75e00 from zone 0xc143e9a0(Mbuf)
> > cpuid = 0
> > KDB: enter: panic
> > [thread 100019]
> > Stopped at kdb_enter+0x2c: leave
> > db> trace
> > kdb_enter(c0852d05,100,c38b0980,e,c3d75000) at kdb_enter+0x2c
> > panic(c086ce77,c3d75e00,c143e9a0,c08515e0,c086ce5b) at panic+0x17f
> > uma_dbg_free(c143e9a0,0,c3d75e00) at uma_dbg_free+0x10e
> > uma_zfree_arg(c143e9a0,c3d75e00,0) at uma_zfree_arg+0x1f6
> > m_freem(c3d75e00,c3a0b21c,c3a0b2f4,c3a0cd80,c38b7400) at m_freem+0x3e
> > xl_intr(c3a09000,0,c38b3c00,0,c0612120) at xl_intr+0x262
> > ithread_loop(c38b7400,e9f95d48,c38b7400,c0612120,0) at ithread_loop+0x186
> > fork_exit(c0612120,c38b7400,e9f95d48) at fork_exit+0x7e
> > fork_trampoline() at fork_trampoline+0x8
> > --- trap 0x1, eip = 0, esp = 0xe9f95d7c, ebp = 0 ---
> > db>
> >
> > Any ideas what could be causing this? Is there anyway to write
> > everything above to a file instead of manually typing everything?
> > Thanks.
> >
> > Cheers,
> > Vince
> > _______________________________________________
> > freebsd-current at freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-current
> > To unsubscribe, send any mail to "freebsd-current-unsubscribe at freebsd.org"
> >
>
>
More information about the freebsd-current
mailing list