[Bug 254478] Panic when using ipfw and divert sockets

bugzilla-noreply at freebsd.org bugzilla-noreply at freebsd.org
Mon Mar 22 13:21:30 UTC 2021


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=254478

            Bug ID: 254478
           Summary: Panic when using ipfw and divert sockets
           Product: Base System
           Version: 12.2-RELEASE
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: bugs at FreeBSD.org
          Reporter: daniel+freebsd at kempkens.io

# freebsd-version
12.2-RELEASE-p4

On some of our production systems, we're seeing rather frequent (up to multiple
times per day) kernel panics that appear to be related to our use of divert
sockets.

We use ipfw and divert sockets to implement DPI for HTTP traffic. We divert
the initial packets of each HTTP connection and, once we've seen the Host
header, pass the packet to a rule with "keep-state". From then on, a
"check-state" rule placed before the divert rule(s) matches the connection, so
its packets are no longer diverted. The (relevant) rules look something like
this:

$ipfw add 2000 check-state
$ipfw add 2002 divert 9002 tcp from any to $dpi_dest 80
$ipfw add 2004 skipto 3000 ip from any to any
$ipfw add 2005 skipto 4000 tcp from any to any diverted keep-state
$ipfw add 2006 skipto 4000 tcp from any to any diverted
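
For illustration, the userland half of this scheme has to find the Host header
in each packet read from the divert socket. The following is only a minimal
sketch of that step, not our actual DPI code: find_host_header is a
hypothetical name, the buffer is assumed to start at the IPv4 header (as a
divert socket delivers it), and the match is exact-case on "Host:" even though
HTTP header names are case-insensitive.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: locate the value of the HTTP Host header inside a raw
 * IPv4 packet. Returns a pointer into pkt and stores the value length in
 * *out_len, or returns NULL if the packet is not TCP over IPv4, is truncated,
 * or does not carry a complete Host header line.
 */
static const uint8_t *
find_host_header(const uint8_t *pkt, size_t len, size_t *out_len)
{
    static const char key[] = "\r\nHost:";
    const size_t klen = sizeof(key) - 1;

    assert(pkt != NULL && out_len != NULL);

    /* IPv4: version in the high nibble, header length (32-bit words) in the low. */
    if (len < 20 || (pkt[0] >> 4) != 4)
        return NULL;
    size_t ihl = (size_t)(pkt[0] & 0x0f) * 4;
    if (ihl < 20 || len < ihl + 20 || pkt[9] != 6 /* IPPROTO_TCP */)
        return NULL;

    /* TCP data offset: high nibble of byte 12 of the TCP header, in 32-bit words. */
    size_t thl = (size_t)(pkt[ihl + 12] >> 4) * 4;
    if (thl < 20 || len < ihl + thl)
        return NULL;

    const uint8_t *p = pkt + ihl + thl;   /* start of the TCP payload */
    size_t n = len - ihl - thl;           /* payload bytes available */

    /* Search for "\r\nHost:" so that the request line is never matched. */
    for (size_t i = 0; i + klen <= n; i++) {
        if (memcmp(p + i, key, klen) != 0)
            continue;
        size_t start = i + klen;
        while (start < n && (p[start] == ' ' || p[start] == '\t'))
            start++;                      /* skip optional whitespace */
        size_t end = start;
        while (end < n && p[end] != '\r')
            end++;
        if (end == n)
            return NULL;                  /* line continues in a later segment */
        *out_len = end - start;
        return p + start;
    }
    return NULL;
}
```

A NULL result means "keep diverting": the header may simply not have arrived
yet, which is why only the initial packets of a connection need to be diverted.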

Our current thinking is that the panic is somehow related to the number of
packets we divert, since we currently only see the panic in production (and not
in staging or dev).

We were able to save a crash dump and did an initial analysis, but were unable
to figure out what exactly is going wrong (and where).
kgdb presented us with the following backtrace:

(kgdb) bt
#0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:371
#2  0xffffffff80bbec45 in kern_reboot (howto=260) at
/usr/src/sys/kern/kern_shutdown.c:451
#3  0xffffffff80bbf083 in vpanic (fmt=<optimized out>, ap=<optimized out>) at
/usr/src/sys/kern/kern_shutdown.c:880
#4  0xffffffff80bbeea3 in panic (fmt=<unavailable>) at
/usr/src/sys/kern/kern_shutdown.c:807
#5  0xffffffff8108e911 in trap_fatal (frame=0xfffffe00005f42d0, eva=368) at
/usr/src/sys/amd64/amd64/trap.c:921
#6  0xffffffff8108e96f in trap_pfault (frame=0xfffffe00005f42d0,
usermode=<optimized out>, signo=<optimized out>, ucode=<optimized out>) at
/usr/src/sys/amd64/amd64/trap.c:739
#7  0xffffffff8108dfb6 in trap (frame=0xfffffe00005f42d0) at
/usr/src/sys/amd64/amd64/trap.c:405
#8  <signal handler called>
#9  0xffffffff82515392 in atomic_fcmpset_long (dst=<optimized out>,
src=18446735296944592704, expect=<optimized out>) at
/usr/src/sys/amd64/include/atomic.h:221
#10 divert_packet (m=0xfffff80130033e00, incoming=<optimized out>) at
/usr/src/sys/netinet/ip_divert.c:282
#11 0xffffffff8247fc12 in ipfw_divert (m0=0xfffffe00005f4550, incoming=1,
rule=<optimized out>, tee=<optimized out>) at
/usr/src/sys/netpfil/ipfw/ip_fw_pfil.c:531
#12 ipfw_check_packet (arg=<optimized out>, m0=0xfffffe00005f4550,
ifp=<optimized out>, dir=1, inp=0x0) at
/usr/src/sys/netpfil/ipfw/ip_fw_pfil.c:285
#13 0xffffffff80ce07b0 in pfil_run_hooks (ph=<optimized out>,
mp=0xfffffe00005f45b8, ifp=0xfffff8000466b800, dir=1, flags=0, inp=0x0) at
/usr/src/sys/net/pfil.c:117
#14 0xffffffff80d463eb in ip_tryforward (m=0xfffff80130033e00) at
/usr/src/sys/netinet/ip_fastfwd.c:234
#15 0xffffffff80d48c74 in ip_input (m=0xfffff80130033e00) at
/usr/src/sys/netinet/ip_input.c:575
#16 0xffffffff80cdf98a in netisr_dispatch_src (proto=1, source=<optimized out>,
m=0x1) at /usr/src/sys/net/netisr.c:1124
#17 0xffffffff80cc2b68 in ether_demux (ifp=0xfffff8000466b800,
m=0xfffff804800ad740) at /usr/src/sys/net/if_ethersubr.c:879
#18 0xffffffff824ee06e in ng_ether_rcv_upper (hook=<optimized out>,
item=<optimized out>) at /usr/src/sys/netgraph/ng_ether.c:741
#19 0xffffffff824f457c in ng_apply_item (node=0xfffff80127857e00,
item=0xfffff8058854be80, rw=0) at /usr/src/sys/netgraph/ng_base.c:2403
#20 0xffffffff824f42f8 in ng_snd_item (item=0xfffff8058854be80, flags=0) at
/usr/src/sys/netgraph/ng_base.c:2320
#21 0xffffffff824f457c in ng_apply_item (node=0xfffff80484ee3700,
item=0xfffff8058854be80, rw=0) at /usr/src/sys/netgraph/ng_base.c:2403
#22 0xffffffff824f42f8 in ng_snd_item (item=0xfffff8058854be80, flags=0) at
/usr/src/sys/netgraph/ng_base.c:2320
#23 0xffffffff824edcac in ng_ether_input (ifp=<optimized out>,
mp=0xfffffe00005f4950) at /usr/src/sys/netgraph/ng_ether.c:255
#24 0xffffffff80cc3cbb in ether_input_internal (ifp=0xfffff8000466b800,
m=0xfffff80130033e00) at /usr/src/sys/net/if_ethersubr.c:616
#25 ether_nh_input (m=<optimized out>) at /usr/src/sys/net/if_ethersubr.c:697
#26 0xffffffff80cdf98a in netisr_dispatch_src (proto=5, source=<optimized out>,
m=0x1) at /usr/src/sys/net/netisr.c:1124
#27 0xffffffff80cc2f8b in ether_input (ifp=0xfffff8000466b800,
m=0xfffff804800ad740) at /usr/src/sys/net/if_ethersubr.c:787
#28 0xffffffff80cdc146 in iflib_rxeof (rxq=<optimized out>,
budget=<optimized out>) at /usr/src/sys/net/iflib.c:2945
#29 0xffffffff80cd6652 in _task_fn_rx (context=0xfffffe00a0924900) at
/usr/src/sys/net/iflib.c:3868
#30 0xffffffff80c09941 in gtaskqueue_run_locked (queue=0xfffff80480082d00) at
/usr/src/sys/kern/subr_gtaskqueue.c:362
#31 0xffffffff80c09606 in gtaskqueue_thread_loop (arg=<optimized out>) at
/usr/src/sys/kern/subr_gtaskqueue.c:537
#32 0xffffffff80b8088e in fork_exit (callout=0xffffffff80c09550
<gtaskqueue_thread_loop>, arg=0xfffffe00007890e0, frame=0xfffffe00005f4c00) at
/usr/src/sys/kern/kern_fork.c:1080
#33 <signal handler called>

Frame 10 points to the following source code:

(kgdb) list /usr/src/sys/netinet/ip_divert.c:282
277        CK_LIST_FOREACH(inp, &V_divcb, inp_list) {
278            /* XXX why does only one socket match? */
279            if (inp->inp_lport == nport) {
280                INP_RLOCK(inp);
281                sa = inp->inp_socket;
282                SOCKBUF_LOCK(&sa->so_rcv);
283                if (sbappendaddr_locked(&sa->so_rcv,
284                    (struct sockaddr *)&divsrc, m,
285                    (struct mbuf *)0) == 0) {
286                    SOCKBUF_UNLOCK(&sa->so_rcv);

It appears that sa is NULL when it probably shouldn't be:

(kgdb) print *sa
Cannot access memory at address 0x0

(kgdb) print *inp
$6 = {[...], inp_socket = 0x0, [...]}
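
As a rough illustration of why a NULL inp_socket faults at a small non-zero
address: SOCKBUF_LOCK() operates on a lock embedded in the socket structure,
so the faulting address is that lock's offset from the (NULL) base pointer.
The sketch below is a userland model with hypothetical names (mock_socket,
mock_sockbuf, lock_word_address). The 368-byte offset is an assumption chosen
to match eva=368 in frame 5; it has not been checked against the real struct
socket layout in 12.2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the receive buffer; only the lock word matters here. */
struct mock_sockbuf {
    uintptr_t sb_lock_word;
};

/* Stand-in for struct socket. The padding size is ASSUMED, not measured. */
struct mock_socket {
    unsigned char so_other[368];
    struct mock_sockbuf so_rcv;
};

/* Compile-time check that the model really places so_rcv at offset 368. */
static_assert(offsetof(struct mock_socket, so_rcv) == 368,
    "mock layout must place so_rcv at the assumed offset");

/*
 * Address the lock operation would touch for a socket located at so_base.
 * Computed from offsets only, so no invalid pointer is ever dereferenced.
 */
static uintptr_t
lock_word_address(uintptr_t so_base)
{
    return so_base
        + offsetof(struct mock_socket, so_rcv)
        + offsetof(struct mock_sockbuf, sb_lock_word);
}
```

Under that assumption, lock_word_address(0) evaluates to 368, which would be
consistent with the atomic operation in frame 9 faulting while trying to take
the so_rcv lock through a NULL sa.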

We can provide more information if needed. If you need the entire crash dump,
we might be able to share it privately.
