bpf/pcap are weird
guy at alum.mit.edu
Wed Nov 5 23:58:19 PST 2003
> Okay, this is goofy stuff and breaks a lot of code that otherwise makes
> certain assumptions about pcap/bpf that don't work on FreeBSD. Our
> bpf(4) doesn't actually care about the non-blocking fd flag, and our pcap(3)
> doesn't care at all about BIOCIMMEDIATE.
This is a libpcap deficiency that I will probably fix at some point, as
    1) some libpcap applications might want that mode, and
    2) the way you get that mode differs on different platforms
       (some platforms always implement it, e.g. Linux; other
       platforms have different ways of requesting it).
It's in my queue along with a number of other libpcap deficiencies.
> Why do we have BIOCIMMEDIATE?
> It seems like it's what SHOULD be implemented with the non-blocking I/O
No. BIOCIMMEDIATE and non-blocking mode are different.
BIOCIMMEDIATE mode means "make incoming packets readable immediately;
don't buffer them up until either the store buffer is full or the
timeout expires". This is for use in, for example, applications that
are using BPF to implement network protocols, and want to be able to
respond immediately to incoming packets, as opposed to, for example,
packet capture applications (tcpdump, Ethereal, etc.) which don't
necessarily need to immediately show or save incoming packets and which
might want to try to get as many packets as possible per read on the BPF
It does *NOT* mean "an attempt to read on this device won't block even
if *no* packets are available", nor should it - applications running in
BIOCIMMEDIATE mode would probably still want to block, rather than spin,
if no packets are available.
Non-blocking mode should mean "an attempt to read on this device won't
block, even if there are no packets remaining", so it's not identical to
If used in conjunction with a properly-working "select()" or "poll()" -
i.e., one that causes the timeout timer to start when the "select()" or
"poll()" is done, so that the "select()" or "poll()" will wake up if the
store buffer fills *OR* the timeout expires - then it does need to be
the case that, if the "select()" or "poll()" says a read on the BPF
device will succeed, it will, in fact, succeed. This could be
implemented by having reads in non-blocking mode always do a buffer
rotation if there are packets in the store buffer but not the hold
buffer, just as is the case in BIOCIMMEDIATE mode.
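As a userland illustration of that contract, here is a minimal sketch in C. It uses an ordinary pipe as a stand-in for the BPF device (an assumption made so the example runs anywhere), and the helper names "set_nonblocking()" and "wait_then_read()" are hypothetical, not part of libpcap or BPF. The point is only the shape of the contract: a non-blocking read with nothing available fails with EAGAIN rather than sleeping, and a read issued after "poll()" reports readiness is expected to succeed.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Put fd into non-blocking mode; returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Wait up to timeout_ms for fd to become readable, then read once.
 * Returns the number of bytes read, 0 if poll() timed out (or the
 * descriptor hit end-of-file), and -1 on error.
 */
ssize_t wait_then_read(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int n;

    assert(buf != NULL);
    n = poll(&pfd, 1, timeout_ms);
    if (n <= 0)
        return n;   /* 0: timed out, -1: poll() failed */

    /* poll() reported readiness, so this read should not hit EAGAIN. */
    return read(fd, buf, len);
}
```

With a real BPF descriptor the same pattern only works if the kernel keeps its side of the bargain, i.e. rotates the buffers on a read that follows a successful "select()" or "poll()".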
That's currently done in "bpf_read()" - note the "|| timed_out" in the
"if" inside the "while (d->bd_hbuf == 0)" loop. That appears to have
been introduced in 4.5, in revision 126.96.36.199, which was an MFC of
    Make bpf's read timeout feature work more correctly with
    select/poll, and therefore with pthreads. I doubt there is any way
    to make this 100% semantically identical to the way it behaves in
    unthreaded programs with blocking reads, but the solution here
    should do the right thing for all reasonable usage patterns.
    The basic idea is to schedule a callout for the read timeout when a
    select/poll is done. When the callout fires, it ends the select if
    it is still in progress, or marks the state as "timed out" if the
    select has already ended for some other reason. Additional logic in
    bpfread then does the right thing in the case where the timeout has
    Note, I co-opted the bd_state member of the bpf_d structure. It has
    been present in the structure since the initial import of 4.4-lite,
    but as far as I can tell it has never been used.
    PR: kern/22063 and bin/31649
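The callout scheme described in that commit message can be modelled roughly as a three-state machine. The sketch below is a toy userland model, not the kernel code: the type and function names ("struct bpf_model", "model_poll_begin()" and so on) are hypothetical, and it assumes the callout records the "timed out" mark whether or not the select is still in progress, so that the read which follows can see it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the values kept in the co-opted bd_state. */
enum rd_state { RD_IDLE, RD_WAITING, RD_TIMED_OUT };

struct bpf_model {
    enum rd_state state;    /* read-timeout state */
    bool select_sleeping;   /* a select/poll is currently in progress */
    int wakeups;            /* times the callout ended a select/poll */
};

/* A select/poll is done on the descriptor: arm the read-timeout callout. */
void model_poll_begin(struct bpf_model *d)
{
    assert(d != NULL);
    d->select_sleeping = true;
    if (d->state == RD_IDLE)
        d->state = RD_WAITING;
}

/* The select/poll ended for some reason other than the read timeout. */
void model_poll_end(struct bpf_model *d)
{
    d->select_sleeping = false;
}

/* The callout fires: record the timeout and end the select if it is live. */
void model_callout_fire(struct bpf_model *d)
{
    if (d->state != RD_WAITING)
        return;             /* not armed, nothing to do */
    d->state = RD_TIMED_OUT;
    if (d->select_sleeping) {
        d->select_sleeping = false;
        d->wakeups++;
    }
}

/* A read begins: consume the mark and report whether the timer expired. */
bool model_read_begin(struct bpf_model *d)
{
    bool timed_out = (d->state == RD_TIMED_OUT);

    d->state = RD_IDLE;
    return timed_out;
}
```

In this model a read that follows an expired callout sees "timed_out" as true, which is the condition that would let it hand back a partly filled store buffer.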
PR 22063 is "bpf when used with the select system call with timeout
doesn't forward packets on timeout":
    When bpf is accessed via libpcap with the select system call
    with a timeout set if a less than full buffer of packets
    received on the interface (and passed to bpf.c) they will never
    be returned to libpcap even on a timeout. OpenBSD has a partial
    fix for this (it gets the first packet of 9 up and leaves the
    other 8) which I have corrected, reported to OpenBSD and ported
    As a side note one of the OpenBSD people is working on a better
    bpf implementation and would be interested in help by someone
    knowledgable in the FreeBSD VM system to assist porting his code
    when finished to FreeBSD.
(I think the "better bpf implementation" might be Michael Stolarchuk's
memory-mapped BPF, but I don't know whether it ever saw the light of
PR 31649 is "libpcap doesn't work with -pthread"; the problem is that
the userland pthreads library requires that "select()"/"poll()" and
non-blocking reads work on anything from which you're trying to read if
you can get long-term waits on it - and that wasn't the case for BPF
The question then is whether, if *not* used with "select()" or "poll()",
non-blocking reads should return whatever packets are there, even if the
timer hasn't
expired. One could argue that it should, in which case the "if" in
question should also check for "ioflag & IO_NDELAY". I don't know
whether that would cause problems for any applications, though.
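To make the two alternatives concrete, here is a simplified model in C of the decision a read makes when the hold buffer is empty. It is a sketch, not the "bpf_read()" source: the structure, the enum and the "rotate_on_ndelay" switch are hypothetical names, with the switch standing in for the extra "ioflag & IO_NDELAY" check, and a zero hold-buffer length stands in for "d->bd_hbuf == 0".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct buf_model {
    size_t hold_len;    /* bytes in the hold buffer (what a read returns) */
    size_t store_len;   /* bytes accumulated in the store buffer */
    bool immediate;     /* BIOCIMMEDIATE has been set */
    bool timed_out;     /* the read timeout expired during select/poll */
};

enum read_action {
    READ_RETURN_HOLD,   /* hold buffer already has data: return it */
    READ_ROTATE,        /* rotate store into hold, then return it */
    READ_WOULD_BLOCK,   /* non-blocking and nothing to return */
    READ_SLEEP          /* blocking: wait for a full buffer or timeout */
};

/*
 * Decide what a read does. With rotate_on_ndelay false this mirrors
 * the existing "immediate || timed_out" test; with it true, a
 * non-blocking read also takes whatever is in the store buffer.
 */
enum read_action model_read_action(const struct buf_model *d,
                                   bool nonblocking, bool rotate_on_ndelay)
{
    bool rotate_now;

    assert(d != NULL);
    if (d->hold_len != 0)
        return READ_RETURN_HOLD;

    rotate_now = d->immediate || d->timed_out ||
                 (rotate_on_ndelay && nonblocking);
    if (rotate_now && d->store_len != 0)
        return READ_ROTATE;

    return nonblocking ? READ_WOULD_BLOCK : READ_SLEEP;
}
```

The only case whose outcome changes with the switch is a non-blocking read with an empty hold buffer, a partly filled store buffer, and neither immediate mode nor an expired timer: it goes from READ_WOULD_BLOCK to READ_ROTATE. That is the behaviour change any affected application would see.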