kern/87014: BPF_MTAP/bpf_mtap are not threadsafe and cause panics
on SMP systems
Mark Gooderum
mark at verniernetworks.com
Thu Oct 6 14:10:14 PDT 2005
>Number: 87014
>Category: kern
>Synopsis: BPF_MTAP/bpf_mtap are not threadsafe and cause panics on SMP systems
>Confidential: no
>Severity: serious
>Priority: low
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Thu Oct 06 21:10:12 GMT 2005
>Closed-Date:
>Last-Modified:
>Originator: Mark Gooderum
>Release: 5.3-RELEASE
>Organization:
Vernier Networks, Inc.
>Environment:
FreeBSD 139.94.1.20 5.3-RELEASE FreeBSD 5.3-RELEASE #0: Thu Aug 4 09:03:53 PDT 2005 build at build-amd3.verniernetworks.com:/usr/build/ambit2/freebsd5/sys/i386/compile/VNISMP i386
>Description:
BPF_MTAP/BPF_MTAP2 do a non-atomic test and invoke based on the value of the if_bpf field. The problem is that if the last bpf of on the interface is deleted, the if_bpf field is set to NULL by bpf_detachd(). If another thread (such as a userland process) deletes the last bpf in the (admittedly small) window between the test and invocation then bpf_mtap() is invoked with a NULL bp parameter which cases a fault on the LIST_EMPTY check
#define BPF_MTAP(_ifp,_m) do { \
if ((_ifp)->if_bpf) { \
M_ASSERTVALID(_m); \
bpf_mtap((_ifp)->if_bpf, (_m)); \
} \
} while (0)
This happens becase on i386 at least the initial NULL check does _NOT_ fetch the variable:
Line 3359 of "../../../dev/bge/if_bge.c"
starts at address 0xc045e7a5 <bge_start_locked+53>
and ends at 0xc045e7c0 <bge_start_locked+80>.
0xc045e7a5 <bge_start_locked+53>: cmpl $0x0,0x2074(%edi,%eax,4)
0xc045e7ad <bge_start_locked+61>: jne 0xc045ea71 <bge_start_locked+769>
0xc045e7b3 <bge_start_locked+67>: lea 0xfc(%esi),%eax
0xc045e7b9 <bge_start_locked+73>: mov %eax,0xffffffec(%ebp)
0xc045e7bc <bge_start_locked+76>: lea 0x0(%esi),%es
It just does an optimized test for NULLness, the value isn't fetched until later when setting up the call:
0xc045ea46 <bge_start_locked+726>: mov %ebx,0x4(%esp)
0xc045ea4a <bge_start_locked+730>: mov 0x3c(%esi),%eax
0xc045ea4d <bge_start_locked+733>: mov %eax,(%esp)
0xc045ea50 <bge_start_locked+736>: call 0xc05897d0 <bpf_mtap>
0xc045ea55 <bge_start_locked+741>: lea 0x0(%esi),%esi
0xc045ea59 <bge_start_locked+745>: lea 0x0(%edi),%edi
0xc045ea60 <bge_start_locked+752>: mov 0xfffffff0(%ebp),%eax
0xc045ea63 <bge_start_locked+755>: cmpl $0x0,0x2074(%edi,%eax,4)
0xc045ea6b <bge_start_locked+763>: je 0xc045e7c0 <bge_start_locked+80>
So the window is small but real. Our field experience is with a box doing a large amount of packet processing while running frequent nessus scans (nessus adds and removes BPF filters on the fly as needed for certain tests).
>How-To-Repeat:
Have lots of bpf filter add/deletes happening on a system under a heavy packet load. I will attach a simple test program that adds/deletes filters on an interface at a high rate if desired.
This window also affects BPF_MTAP2/bpf_mtap2()
>Fix:
Either modify bpf_mtap()/bpf_mtap2() to check for a NULL parameter (the test/modify race doesn't apply once we are into the function because the bpf_if lasts as long as the interface, only the pointer to it in the struct ifnet comes and goes and by the time we're in bpf_mtap() we're looking a copy of the variable on the stack, not the actual ifnet field.
Most interfaces have per-interface locks that could also be used but those mutexes are currently private to the drivers.
--- /tmp/tmp.97907.0 Thu Oct 6 15:57:52 2005
+++ sys/net/bpf.c Thu Oct 6 15:57:45 2005
@@ -1201,20 +1201,27 @@
*/
void
bpf_mtap(bp, m)
struct bpf_if *bp;
struct mbuf *m;
{
struct bpf_d *d;
u_int pktlen, slen;
/*
+ * We can sometimes be invoked w/NULL bp due to a small race in
+ * BPF_MTAP(), see PR#xxxxx.
+ */
+ if (!bp)
+ return;
+
+ /*
* Lockless read to avoid cost of locking the interface if there are
* no descriptors attached.
*/
if (LIST_EMPTY(&bp->bif_dlist))
return;
pktlen = m_length(m, NULL);
if (pktlen == m->m_len) {
bpf_tap(bp, mtod(m, u_char *), pktlen);
return;
@@ -1245,20 +1252,27 @@
void
bpf_mtap2(bp, data, dlen, m)
struct bpf_if *bp;
void *data;
u_int dlen;
struct mbuf *m;
{
struct mbuf mb;
struct bpf_d *d;
u_int pktlen, slen;
+
+ /*
+ * We can sometimes be invoked w/NULL bp due to a small race in
+ * BPF_MTAP2(), see PR#xxxxx.
+ */
+ if (!bp)
+ return;
/*
* Lockless read to avoid cost of locking the interface if there are
* no descriptors attached.
*/
if (LIST_EMPTY(&bp->bif_dlist))
return;
pktlen = m_length(m, NULL);
/*
>Release-Note:
>Audit-Trail:
>Unformatted:
More information about the freebsd-bugs
mailing list