7.1-STABLE crash

Mikolaj Golub to.my.trociny at gmail.com
Tue Jun 2 20:25:02 UTC 2009


On Tue, 02 Jun 2009 13:41:40 +0400 Asmodean Dark wrote:

 AD> # kgdb kernel.debug vmcore.0
 AD> GNU gdb 6.1.1 [FreeBSD]
 AD> Copyright 2004 Free Software Foundation, Inc.
 AD> GDB is free software, covered by the GNU General Public License, and you are
 AD> welcome to change it and/or distribute copies of it under certain conditions.
 AD> Type "show copying" to see the conditions.
 AD> There is absolutely no warranty for GDB.  Type "show warranty" for details.
 AD> This GDB was configured as "i386-marcel-freebsd"...No struct type named linker_file.
 AD> No struct type named linker_file.
 AD> No struct type named linker_file.
 AD> Attempt to extract a component of a value that is not a structure.
 AD> No struct type named linker_file.
 AD> No struct type named linker_file.
 AD> No struct type named linker_file.
 AD> Attempt to extract a component of a value that is not a structure.

 AD> Attempt to extract a component of a value that is not a structure pointer.
 AD> Attempt to extract a component of a value that is not a structure pointer.
 AD> Attempt to extract a component of a value that is not a structure pointer.
 AD> Attempt to extract a component of a value that is not a structure pointer.
 AD> #0  0x8063d6b0 in doadump ()
 AD> (kgdb) bt
 AD> #0  0x8063d6b0 in doadump ()
 AD> #1  0x8063dc44 in boot ()
 AD> #2  0x8063e0ca in panic ()
 AD> #3  0x807dab3d in trap_fatal ()
 AD> #4  0x807daeba in trap_pfault ()
 AD> #5  0x807db7bd in trap ()
 AD> #6  0x807c2a3b in calltrap ()
 AD> #7  0x806dcb88 in rn_match ()
 AD> #8  0x806ddc8a in rn_lookup ()
 AD> #9  0x8070e460 in ipfw_chk (args=0xe70175fc) at ../../../netinet/ip_fw2.c:1894
 AD> #10 0x80710c3d in ipfw_check_in (arg=0x0, m0=0xe7017700, ifp=0x91c5a800, dir=1, inp=0x0) at ../../../netinet/ip_fw_pfil.c:125
 AD> #11 0x806dc20f in pfil_run_hooks ()
 AD> #12 0x80713984 in ip_input (m=0x91954c00) at ../../../netinet/ip_input.c:416
 AD> #13 0x806ec0d9 in ng_iface_rcvdata ()
 AD> #14 0x806e9570 in ng_apply_item ()
 AD> #15 0x806e8569 in ng_snd_item ()
 AD> #16 0x806e9570 in ng_apply_item ()
 AD> #17 0x806e8569 in ng_snd_item ()
 AD> #18 0x806e9570 in ng_apply_item ()
 AD> #19 0x806e8569 in ng_snd_item ()
 AD> #20 0x806f16a7 in ng_ppp_proto_recv ()
 AD> #21 0x806f3ed2 in ng_ppp_rcvdata ()
 AD> #22 0x806e9570 in ng_apply_item ()
 AD> #23 0x806e8569 in ng_snd_item ()
 AD> #24 0x806e9570 in ng_apply_item ()
 AD> #25 0x806e8569 in ng_snd_item ()
 AD> #26 0x806ee3c3 in ng_ksocket_incoming2 ()
 AD> #27 0x806e969d in ng_apply_item ()
 AD> #28 0x806ea8aa in ngintr ()
 AD> #29 0x806dab72 in swi_net ()
 AD> #30 0x8061e265 in ithread_loop ()
 AD> #31 0x8061adf5 in fork_exit ()
 AD> #32 0x807c2ab0 in fork_trampoline ()
 AD> (kgdb) fr 9
 AD> #9  0x8070e460 in ipfw_chk (args=0xe70175fc) at ../../../netinet/ip_fw2.c:1894
 AD> 1894            sa.sin_len = 8;
 AD> (kgdb) list
 AD> 1889            struct sockaddr_in sa;
 AD> 1890
 AD> 1891            if (tbl >= IPFW_TABLES_MAX)
 AD> 1892                    return (0);
 AD> 1893            rnh = ch->tables[tbl];
 AD> 1894            sa.sin_len = 8;
^^^^^^^^^ This looks strange. At line 1894 I expected to see the rnh_lookup()
call, which is two lines below. Are you sure your source matches the built kernel?

 AD> 1895            sa.sin_addr.s_addr = addr;
 AD> 1896            ent = (struct table_entry *)(rnh->rnh_lookup(&sa, NULL, rnh));
 AD> 1897            if (ent != NULL) {
 AD> 1898                    *val = ent->value;
 AD> (kgdb) p *cmd
 AD> $1 = {opcode = O_IP_SRC_LOOKUP, len = 1 '\001', arg1 = 2}
 AD> (kgdb) p cmd->arg1
 AD> $2 = 2

It crashed while looking up the source IP in table 2. But in the ps output I
don't see any process that could have been modifying the table at that time,
so the table might have been corrupted earlier.

Unfortunately, having reviewed the provided info, I don't have any good ideas
about what might have caused this. Maybe other people on the list can help...

Recently I saw another backtrace of a crash in rn_match(), but in that case it
was pf that was looking up an IP in a table. It turned out that the person was
running an ssh brute-force blocker together with expiretable, which ran
periodically and removed old entries from the table. He just disabled
expiretable and this stopped the crashes.

Actually, some of the crashinfo output looks suspicious: zero values for fork()
calls, negative values in the vmstat -m output... Did the userland where you
ran crashinfo match the crashed kernel? Also, does your kernel match the
userland on the crashed box?

And certainly it would be good to provide a backtrace with full debugging info
available :-). Do you remember that debugging symbols for the modules are
needed too?
                                       
-- 
Mikolaj Golub

