kern/95307: Panic (race condition?) in ipsec_process_done

Nate Nielsen nielsen at memberwebs.com
Tue Apr 4 14:40:11 UTC 2006


>Number:         95307
>Category:       kern
>Synopsis:       Panic (race condition?) in ipsec_process_done
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Tue Apr 04 14:40:10 GMT 2006
>Closed-Date:
>Last-Modified:
>Originator:     Nate Nielsen
>Release:        FreeBSD 6.0-p5
>Organization:
None
>Environment:
FreeBSD eaglecrest-link.ws.local 6.0-RELEASE-p5 FreeBSD 6.0-RELEASE-p5 #5: Mon Apr  3 00:26:04 UTC 2006
>Description:
I've been experiencing a panic in ipsec_process_done. Below are a
backtrace and a patch which suppresses the issue. I don't profess to
understand the IPsec code completely...

The panic occurs when performing IKE negotiations (racoon) with multiple
systems at the same time. The panicking boxes are routers with slow
CPUs, so negotiations take several seconds.

The system panics immediately after boot, while IKE negotiation is in
progress. After the reboot that follows the panic, IKE negotiation
starts again, so the box ends up rebooting over and over.

I'm guessing this is due to IPsec keys (SAs) that are only halfway set up.

The patch (will attach) is probably incomplete, but prevents the problem from
happening for me.


>How-To-Repeat:
For me this issue only happens on production systems, so debugging is
very difficult, but I've managed to get a kernel dump and backtrace.

USING:
  - FreeBSD 6.0
  - FAST_IPSEC
  - Hardware encryption (hifn driver, aes algorithm)
  - ipsec-tools 0.6.2
  - Soekris net4826
  - IPv4
  - ESP transport wrapping GIF (IPIP) tunnels.
  
IPSEC CONFIG:
  spdadd 0.0.0.0/0 0.0.0.0/0 ip4 -P in ipsec esp/transport//require;
  spdadd 0.0.0.0/0 0.0.0.0/0 ip4 -P out ipsec esp/transport//require;

BACKTRACE:

Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x70
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc05ee61e
stack pointer           = 0x28:0xc6e43ca4
frame pointer           = 0x28:0xc6e43cb4
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 6 (crypto returns)
trap number             = 12
panic: page fault
Uptime: 1m6s
Dumping 109 MB (2 chunks)
  chunk 0: 1MB (159 pages) ... ok
  chunk 1: 109MB (27904 pages) 94 78 62 46 30 14

(kgdb) backtrace
#0  doadump () at pcpu.h:165
#1  0xc050fcb2 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:399
#2  0xc050ff48 in panic (fmt=0xc06c6078 "%s")
    at /usr/src/sys/kern/kern_shutdown.c:555
#3  0xc06a0c00 in trap_fatal (frame=0xc6e43c64, eva=112)
    at /usr/src/sys/i386/i386/trap.c:831
#4  0xc06a096b in trap_pfault (frame=0xc6e43c64, usermode=0, eva=112)
    at /usr/src/sys/i386/i386/trap.c:742
#5  0xc06a05a9 in trap (frame=
      {tf_fs = -1006895096, tf_es = 167968808, tf_ds = 168099880,
       tf_edi = -1059907712, tf_esi = -1060580736, tf_ebp = -958120780,
       tf_isp = -958120816, tf_ebx = -1061533440, tf_edx = -1061533440,
       tf_ecx = -1059907712, tf_eax = 0, tf_trapno = 12,
       tf_err = -1065091072, tf_eip = -1067522530, tf_cs = -1060634592,
       tf_eflags = 66178, tf_esp = 0, tf_ss = -1061533440})
    at /usr/src/sys/i386/i386/trap.c:432
#6  0xc06903ba in calltrap () at /usr/src/sys/i386/i386/exception.s:139
#7  0xc05ee61e in ipsec_process_done (m=0xc0b6e100, isr=0xc0ba4900)
    at /usr/src/sys/netipsec/ipsec_output.c:96
#8  0xc05fbe29 in esp_output_cb (crp=0xc0d31780)
    at /usr/src/sys/netipsec/xform_esp.c:919
#9  0xc061c5d8 in crypto_ret_proc ()
    at /usr/src/sys/opencrypto/crypto.c:1227
#10 0xc04f9c48 in fork_exit (callout=0xc061c4c4 <crypto_ret_proc>, arg=0x0,
    frame=0xc6e43d38) at /usr/src/sys/kern/kern_fork.c:789
#11 0xc069041c in fork_trampoline ()
    at /usr/src/sys/i386/i386/exception.s:208
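
A note on reading the trap: the fault virtual address 0x70 (eva=112 in
frames #3 and #4) is a very small address, which is what one would expect
when a structure member is read through a NULL pointer. The sketch below
illustrates that arithmetic in userland. The structure fake_sa and its
padding are hypothetical and chosen only so that a member lands at offset
0x70; this report does not establish which kernel field actually sits at
that offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in; NOT the layout of any real kernel structure. */
struct fake_sa {
	char	 pad[0x70];	/* padding chosen so 'member' sits at 0x70 */
	void	*member;
};

/*
 * Address that a read of p->member would touch, computed without
 * dereferencing p.
 */
static uintptr_t
member_fault_addr(const struct fake_sa *p)
{
	return ((uintptr_t)p + offsetof(struct fake_sa, member));
}

/* Sanity check of the illustration itself. */
static void
check_fake_layout(void)
{
	assert(offsetof(struct fake_sa, member) == 0x70);
}
```

With p == NULL the computed address is 0x70, matching the shape of the
fault reported above.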


>Fix:
--- sys/netipsec/ipsec_output.c.orig    Mon Apr  3 17:58:32 2006
+++ sys/netipsec/ipsec_output.c Mon Apr  3 17:57:52 2006
@@ -93,6 +93,13 @@

        IPSEC_ASSERT(m != NULL, ("null mbuf"));
        IPSEC_ASSERT(isr != NULL, ("null ISR"));
+
+       /* XXX This happens. Figure out why. */
+       if (!isr->sav) {
+               m_freem (m);
+               return ENOBUFS;
+       }
+
        sav = isr->sav;
        IPSEC_ASSERT(sav != NULL, ("null SA"));
        IPSEC_ASSERT(sav->sah != NULL, ("null SAH"));
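
For readers who want to see the shape of the workaround outside the
kernel, here is a minimal userland sketch of the same guard: if no SA is
attached to the request, free the packet and return ENOBUFS instead of
dereferencing a NULL pointer. The types pkt, sa and req and the functions
pkt_free and process_done are hypothetical stand-ins, not the kernel's
mbuf, secasvar, ipsecrequest or ipsec_process_done. Like the patch, the
sketch only avoids the NULL dereference and does not address whatever
leaves the SA pointer unset.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userland stand-ins for the kernel structures. */
struct pkt { int len; };
struct sa  { int spi; };
struct req { struct sa *sav; };

static int pkts_freed;	/* counts frees so the drop path can be observed */

static void
pkt_free(struct pkt *m)
{
	free(m);
	pkts_freed++;
}

/*
 * Same shape as the patched check: when no SA is attached, drop the
 * packet and report an error rather than dereferencing NULL.
 */
static int
process_done(struct pkt *m, struct req *isr)
{
	struct sa *sav;

	assert(m != NULL);
	assert(isr != NULL);

	sav = isr->sav;
	if (sav == NULL) {
		pkt_free(m);
		return (ENOBUFS);
	}

	/* Real code would go on to use sav here; the sketch just consumes m. */
	pkt_free(m);
	return (0);
}
```

Calling process_done() with a req whose sav is NULL returns ENOBUFS and
frees the packet; with a valid sav it returns 0.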
>Release-Note:
>Audit-Trail:
>Unformatted:

