FreeBSD 7.3, reboot after panic: double fault

John Baldwin jhb at freebsd.org
Tue Apr 20 13:41:49 UTC 2010


On Tuesday 20 April 2010 2:53:16 am c0re wrote:
> Hello All!
> I upgraded FreeBSD from 7.0 to 7.3 and all was good until I tried to
> configure a gre interface and use ipfw fwd.
> I don't actually know which part of my configuration was the point of
> failure.
>
> [ some details snipped ]
> 
> It worked about one week and then I made some configuration changes:
> added gre interface and 2 aliases:
> 
> # cat /etc/rc.conf |grep
> ifconfig_xl0="inet 192.168.0.10  netmask 255.255.255.0"
> ifconfig_xl0_alias0="192.168.0.11 netmask 255.255.255.255"
> ifconfig_xl0_alias1="192.168.0.12 netmask 255.255.255.255"
> cloned_interfaces="gre0"
> ifconfig_gre0="inet 192.168.250.6 192.168.250.5 tunnel 192.168.0.12
> 192.168.200.15 netmask 255.255.255.252 link1 up"
> 
> and
> 
> # cat /etc/rc.local
> #!/bin/sh
> ipfw add fwd 192.168.250.5 icmp from 192.168.0.11 to any out via xl0
> ipfw add fwd 192.168.250.5 tcp from 192.168.0.11 443 to any out via xl0
> ipfw add allow ip from any to any
> 
> # ifconfig gre0
> gre0: flags=b050<POINTOPOINT,RUNNING,LINK0,LINK1,MULTICAST> metric 0 mtu
> 1476
>         tunnel inet 192.168.0.12 --> 192.168.200.15
>         inet 192.168.250.6 --> 192.168.250.5 netmask 0xfffffffc
> 
> I shut down the gre interface to prevent requests via gre to the buggy IP.
> 
> The main idea of this configuration was to fwd all connections to https to
> 192.168.0.1 via the gre interface.
> I also configured apache to listen on 192.168.0.11 too.
> 
> Then I ran some tests: ping 192.168.0.11 was fine and went via gre. Telnet
> to 192.168.0.11 443 was fine too. Then I tried to make a browser https
> connection to 192.168.0.11. Apache showed me a certificate warning and I
> accepted it; after that nothing happened in the browser, it just kept
> trying to open the page. But the server got a kernel panic at that moment.
> 
> The first time I thought it was some power failure, but I tried 2 more
> times and got the same behaviour.
> 
> So https works without a kernel panic via the 192.168.0.10 address, but the
> kernel panics when I try https via the 192.168.0.11 address, which is
> source-forwarded via gre.

Looks like the TCP output path got stuck in infinite recursion (tcp_output() 
and tcp_mtudisc() calling each other, frames #8 through #54) until it 
exhausted the kernel stack:

> # cd /usr/obj/usr/src/sys/MYKERNEL
> # kgdb kernel.debug /var/crash/vmcore.2
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain
> conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "i386-marcel-freebsd"...
> 
> Unread portion of the kernel message buffer:
> 
> Fatal double fault:
> eip = 0xc08e3ba3
> esp = 0xccf6dfc4
> ebp = 0xccf6e274
> cpuid = 0; apic id = 00
> panic: double fault
> cpuid = 0
> Uptime: 7m14s
> Physical memory: 235 MB
> Dumping 35 MB: 20 4
> 
> Reading symbols from /boot/kernel/acpi.ko...Reading symbols from
> /boot/kernel/acpi.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/acpi.ko
> Reading symbols from /boot/kernel/if_gre.ko...Reading symbols from
> /boot/kernel/if_gre.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/if_gre.ko
> Reading symbols from /boot/kernel/linux.ko...Reading symbols from
> /boot/kernel/linux.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/linux.ko
> #0  doadump () at pcpu.h:196
> 196             __asm __volatile("movl %%fs:0,%0" : "=r" (td));
> (kgdb) bt
> #0  doadump () at pcpu.h:196
> #1  0xc07f2857 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
> #2  0xc07f2b29 in panic (fmt=Variable "fmt" is not available.
> ) at /usr/src/sys/kern/kern_shutdown.c:574
> #3  0xc0a7ea2b in dblfault_handler () at /usr/src/sys/i386/i386/trap.c:983
> #4  0xc08e3ba3 in ipfw_chk (args=0xccf6e28c) at
> /usr/src/sys/netinet/ip_fw2.c:2465
> #5  0xc08e6ce1 in ipfw_check_out (arg=0x0, m0=0xccf6e390, ifp=0xc25c5c00,
> dir=2, inp=0xc28ba708) at /usr/src/sys/netinet/ip_fw_pfil.c:248
> #6  0xc08a1968 in pfil_run_hooks (ph=0xc0c55240, mp=0xccf6e420,
> ifp=0xc25c5c00, dir=2, inp=0xc28ba708) at /usr/src/sys/net/pfil.c:78
> #7  0xc08eb6f2 in ip_output (m=0xc2710b00, opt=0x0, ro=0xccf6e3f4, flags=0,
> imo=0x0, inp=0xc28ba708) at /usr/src/sys/netinet/ip_output.c:443
> #8  0xc08f4016 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1134
> #9  0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #10 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #11 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #12 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #13 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #14 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #15 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #16 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #17 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #18 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #19 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #20 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #21 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #22 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #23 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #24 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #25 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #26 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #27 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #28 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #29 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #30 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #31 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #32 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #33 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #34 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #35 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #36 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #37 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #38 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #39 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #40 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #41 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #42 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #43 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #44 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #45 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #46 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #47 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #48 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #49 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> ---Type <return> to continue, or q <return> to quit---
> #50 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #51 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #52 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #53 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #54 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #55 0xc08fdcf8 in tcp_usr_send (so=0xc2ac1820, flags=0, m=0xc270ed00,
> nam=0x0, control=0x0, td=0xc28e2d80) at tcp_offload.h:269
> #56 0xc0850405 in sosend_generic (so=0xc2ac1820, addr=0x0, uio=0xc28766c0,
> top=0xc270ed00, control=0x0, flags=0, td=0xc28e2d80) at
> /usr/src/sys/kern/uipc_socket.c:1243
> #57 0xc084bf7f in sosend (so=0xc2ac1820, addr=0x0, uio=0xc28766c0, top=0x0,
> control=0x0, flags=0, td=0xc28e2d80) at /usr/src/sys/kern/uipc_socket.c:1285
> #58 0xc0833c5b in soo_write (fp=0xc28e84c0, uio=0xc28766c0,
> active_cred=0xc28e5900, flags=0, td=0xc28e2d80) at
> /usr/src/sys/kern/sys_socket.c:103
> #59 0xc082d2e7 in dofilewrite (td=0xc28e2d80, fd=24, fp=0xc28e84c0,
> auio=0xc28766c0, offset=-1, flags=0) at file.h:257
> #60 0xc082d5c8 in kern_writev (td=0xc28e2d80, fd=24, auio=0xc28766c0) at
> /usr/src/sys/kern/sys_generic.c:402
> #61 0xc082d816 in writev (td=0xc28e2d80, uap=0xccf6fcfc) at
> /usr/src/sys/kern/sys_generic.c:388
> #62 0xc0a7f2d5 in syscall (frame=0xccf6fd38) at
> /usr/src/sys/i386/i386/trap.c:1101
> #63 0xc0a636a0 in Xint0x80_syscall () at
> /usr/src/sys/i386/i386/exception.s:262
> #64 0x00000033 in ?? ()
> Previous frame inner to this frame (corrupt stack?)
> (kgdb)
> (kgdb) quit

tcp_output() calls tcp_mtudisc() if ip_output() returns EMSGSIZE:

		case EMSGSIZE:
			/*
			 * For some reason the interface we used initially
			 * to send segments changed to another or lowered
			 * its MTU.
			 *
			 * tcp_mtudisc() will find out the new MTU and as
			 * its last action, initiate retransmission, so it
			 * is important to not do so here.
			 *
			 * If TSO was active we either got an interface
			 * without TSO capabilits or TSO was turned off.
			 * Disable it for this connection as too and
			 * immediatly retry with MSS sized segments generated
			 * by this function.
			 */
			if (tso)
				tp->t_flags &= ~TF_TSO;
			tcp_mtudisc(tp->t_inpcb, 0);
			return (0);

But tcp_mtudisc() in turn calls tcp_output() (through tcp_output_send(), the 
inline from tcp_offload.h that shows up in the backtrace):

	tcpstat.tcps_mturesent++;
	tp->t_rtttime = 0;
	tp->snd_nxt = tp->snd_una;
	tcp_free_sackholes(tp);
	tp->snd_recover = tp->snd_max;
	if (tp->t_flags & TF_SACK_PERMIT)
		EXIT_FASTRECOVERY(tp);
	tcp_output_send(tp);
	return (inp);
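
To make the cycle concrete, here is a rough userland sketch of it. This is a toy model and not the kernel code: every name in it (toy_conn, toy_output(), toy_mtudisc(), toy_ip_output(), TOY_DEPTH_LIMIT) is made up for illustration, and the depth limit only stands in for the finite kernel stack. If the "discovery" step never produces a smaller segment size, each EMSGSIZE nests one level deeper until the limit is hit:

```c
#include <errno.h>

/* Toy model, not kernel code: every name here is hypothetical. */
struct toy_conn {
	int mss;        /* segment size the connection wants to send */
	int path_mtu;   /* largest size the output path accepts */
	int learned;    /* MTU value that "MTU discovery" reports back */
	int depth;      /* current nesting depth of toy_output() */
	int max_depth;  /* deepest nesting seen so far */
};

#define TOY_DEPTH_LIMIT 1000	/* stand-in for the finite kernel stack */

static int toy_output(struct toy_conn *c);

/* Stand-in for ip_output(): reject segments larger than the path MTU. */
static int
toy_ip_output(const struct toy_conn *c)
{
	return (c->mss > c->path_mtu ? EMSGSIZE : 0);
}

/* Stand-in for tcp_mtudisc(): adopt the learned MTU, then retransmit. */
static int
toy_mtudisc(struct toy_conn *c)
{
	if (c->learned < c->mss)
		c->mss = c->learned;
	return (toy_output(c));		/* this call closes the cycle */
}

/* Stand-in for tcp_output(). */
static int
toy_output(struct toy_conn *c)
{
	int error;

	if (++c->depth > c->max_depth)
		c->max_depth = c->depth;
	if (c->depth > TOY_DEPTH_LIMIT) {
		/* Model of running out of stack: give up with -1. */
		c->depth--;
		return (-1);
	}
	error = toy_ip_output(c);
	if (error == EMSGSIZE)
		error = toy_mtudisc(c);
	c->depth--;
	return (error);
}
```

With mss 1500, path_mtu 1400 and learned 1400, toy_output() nests twice and succeeds. With learned left at 1500 the segment size never shrinks, so the nesting only stops at the artificial limit, which is the pattern the backtrace shows.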

I'm not sure why it's not able to figure out the MTU; perhaps folks on net@ 
can help.  However, it would seem that for the tcp_output() case, 
tcp_mtudisc() should probably not call tcp_output_send().  Instead, 
tcp_output() should just loop back up to the top after calling tcp_mtudisc() 
and retry.
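
A rough sketch of that loop-and-retry shape, again as a toy userland model and not a patch: all names (loop_conn, loop_output(), loop_mtudisc(), loop_ip_output()) are made up, and the retry bound LOOP_MAX_RETRIES is my own assumption to keep the example finite. The point is only that the "discovery" step updates state without sending, and the caller retries from the top at constant stack depth:

```c
#include <errno.h>

/* Toy model, not kernel code: every name here is hypothetical. */
struct loop_conn {
	int mss;       /* segment size the connection wants to send */
	int path_mtu;  /* largest size the output path accepts */
	int learned;   /* MTU value that "MTU discovery" reports back */
	int attempts;  /* number of send attempts made */
};

#define LOOP_MAX_RETRIES 4	/* assumed bound, chosen for the example */

/* Stand-in for ip_output(): reject segments larger than the path MTU. */
static int
loop_ip_output(const struct loop_conn *c)
{
	return (c->mss > c->path_mtu ? EMSGSIZE : 0);
}

/* Stand-in for tcp_mtudisc() that only updates state and never sends. */
static void
loop_mtudisc(struct loop_conn *c)
{
	if (c->learned < c->mss)
		c->mss = c->learned;
}

/* Stand-in for tcp_output(): retry from the top instead of recursing. */
static int
loop_output(struct loop_conn *c)
{
	int error, tries;

	for (tries = 0; tries <= LOOP_MAX_RETRIES; tries++) {
		c->attempts++;
		error = loop_ip_output(c);
		if (error != EMSGSIZE)
			return (error);
		loop_mtudisc(c);	/* adjust, then go around again */
	}
	/* The MTU never converged: report the error, do not recurse. */
	return (EMSGSIZE);
}
```

Here a connection whose learned MTU is usable succeeds on the second attempt, and one whose MTU never shrinks fails with EMSGSIZE after a fixed number of attempts rather than eating the stack.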

-- 
John Baldwin


More information about the freebsd-stable mailing list