PERFORCE change 131540 for review

Robert Watson rwatson at FreeBSD.org
Mon Dec 24 12:20:51 PST 2007


http://perforce.freebsd.org/chv.cgi?CH=131540

Change 131540 by rwatson at rwatson_cinnamon on 2007/12/24 20:20:17

	Rewrite buffering portions of the man page.

Affected files ...

.. //depot/projects/zcopybpf/src/share/man/man4/bpf.4#5 edit

Differences ...

==== //depot/projects/zcopybpf/src/share/man/man4/bpf.4#5 (text+ko) ====

@@ -89,53 +89,18 @@
 Each descriptor that accepts the packet receives its own copy.
 .Pp
 .Nm
-devices operate in one of two buffering modes: buffered
+devices operate in one of two buffering modes:
 .Xr read 2 ,
-in which packet data is copied from the kernel explicitly using the
+and zero-copy.
+In buffered read mode, packet data is copied explicitly from the kernel to
+user memory buffers using the
 .Xr read 2
-system call, and zero-copy buffer mode, in which the user process provides
-two memory regions that
-.Nm
-will write to directly as the packets are accepted.
-The buffering mode may be set with the
-.Dv BIOCSETBUFMODE
-ioctl (see below), and will default to buffered
-.Xr read 2
-mode
-.Dv ( BPF_BUFMODE_BUFFER )
-by default.
-Buffers return the next group of packets that have matched the filter.
+system call.
+In zero-copy buffering mode, the kernel writes packet data directly into
+shared memory buffers provided by the user application.
 Note that an individual packet larger than the buffer size is necessarily
 truncated.
 .Pp
-In the case of buffered
-.Xr read 2 ,
-the user process will declare a fixed buffer size that will be used both for
-sizing internal buffers and for all
-.Xr read 2
-operations on the file.
-This size is returned by the
-.Dv BIOCGBLEN
-ioctl (see below), and
-can be set with
-.Dv BIOCSBLEN .
-.Pp
-In zero-copy buffering, the user process registers two memory buffers with
-.Nm
-via the
-.Dv BIOCSETZBUF
-ioctl (see below).
-The user process may monitor for completion (filling) of a buffer, at which
-point the memory contents of the buffer will be stable until the buffer is
-returned for further kernel use using the
-.Dv BIOCACKZBUF
-ioctl.
-Buffers will be of a fixed (and equal) size, be
-page-aligned, and the size must be an integer multiple of the page size.
-The maximum zero-copy buffer size is returned by the
-.Dv BIOCGETZMAX
-ioctl (see below).
-.Pp
 The packet filter will support any link level protocol that has fixed length
 headers.
 Currently, only Ethernet,
@@ -156,6 +121,144 @@
 Currently, only writes to Ethernets and
 .Tn SLIP
 links are supported.
+.Sh BUFFER MODES
+.Nm
+devices deliver packet data to the application via memory buffers provided by
+the application.
+The buffer mode is set using the
+.Dv BIOCSETBUFMODE
+ioctl, and read using the
+.Dv BIOCGETBUFMODE
+ioctl.
+.Ss Buffered read mode
+By default,
+.Nm
+devices operate in the
+.Dv BPF_BUFMODE_BUFFER
+mode, in which packet data is copied explicitly from the kernel to user
+memory using the
+.Xr read 2
+system call.
+The user process will declare a fixed buffer size that will be used both for
+sizing internal buffers and for all
+.Xr read 2
+operations on the file.
+This size is queried using the
+.Dv BIOCGBLEN
+ioctl, and is set using the
+.Dv BIOCSBLEN
+ioctl.
+.Ss Zero-copy buffer mode
+.Nm
+devices may also operate in the
+.Dv BPF_BUFMODE_ZBUF
+mode, in which packet data is written directly by the kernel to memory
+buffers provided by the process, avoiding both system call and memory
+copying overhead.
+Buffers are of fixed (and equal) size, page-aligned, and an integer multiple of
+the page size.
+The maximum zero-copy buffer size is returned by the
+.Dv BIOCGETZMAX
+ioctl.
+.Pp
+The user process registers two memory buffers using the
+.Dv BIOCSETZBUF
+ioctl, which accepts a
+.Vt struct bpf_zbuf
+pointer as an argument:
+.Bd -literal
+struct bpf_zbuf {
+	void *bz_bufa;
+	void *bz_bufb;
+	size_t bz_buflen;
+};
+.Ed
+.Pp
+.Vt bz_bufa
+is a pointer to the userspace address of the first buffer that will be
+filled, and
+.Vt bz_bufb
+is a pointer to the second buffer.
+.Nm
+will then cycle between the two buffers.
+.Pp
+Buffer memory begins with a short, fixed-length header holding
+synchronization and data length information for the buffer:
+.Bd -literal
+struct bpf_zbuf_header {
+	volatile u_int  bzh_kernel_gen;	/* Kernel generation number. */
+	volatile u_int  bzh_kernel_len;	/* Length of buffer. */
+	volatile u_int  bzh_user_gen;	/* User generation number. */
+	/* ...padding for future use... */
+};
+.Ed
+.Pp
+This is followed immediately by packet data, laid out as described below.
+.Pp
+The kernel and the user process follow a simple acknowledgement protocol
+using shared memory and ioctls to synchronize access to the two buffers.
+Ownership of the buffer is signaled using the kernel and user generation
+numbers in shared memory: the kernel modifies
+.Vt bzh_kernel_gen
+to assign ownership to userspace, and the user process sets
+.Vt bzh_user_gen
+to the value in
+.Vt bzh_kernel_gen
+to acknowledge the buffer and return it to kernel ownership.
+While the kernel owns the buffer, the contents are unstable and will change
+asynchronously; while the user process owns the buffer, its contents are
+considered stable and will not be changed until the buffer is acknowledged.
+The user process will initialize the
+.Vt struct bpf_zbuf_header
+to all zeros before registering the buffer, assigning initial ownership to the
+kernel.
+.Pp
+In order to avoid caching and memory re-ordering effects, the user process
+must use appropriate atomic operations and memory barriers when checking for
+and acknowledging buffers:
+.Bd -literal
+#include <machine/atomic.h>
+
+/*
+ * Return ownership of a buffer to the kernel for reuse.
+ */
+static void
+buffer_acknowledge(struct bpf_zbuf_header *bzh)
+{
+
+	atomic_store_rel_int(&bzh->bzh_user_gen, bzh->bzh_kernel_gen);
+}
+
+/*
+ * Check whether a buffer has been assigned to userspace by the kernel.
+ * Return true if userspace owns the buffer, and false otherwise.
+ */
+static int
+buffer_check(struct bpf_zbuf_header *bzh)
+{
+
+	return (bzh->bzh_user_gen !=
+	    atomic_load_acq_int(&bzh->bzh_kernel_gen));
+}
+.Ed
+.Pp
+The user process may force the assignment of the next buffer, if any data
+is pending, to userspace using the
+.Dv BIOCROTZBUF
+ioctl.
+This allows the user process to retrieve data in a partially filled buffer
+before the buffer is completed, such as following a timeout; the process must
+still check to see if ownership has been assigned using the header generation
+numbers, as the buffer will not be assigned if there is no data available.
+.Pp
+As in buffered read mode,
+.Xr kqueue 2 ,
+.Xr poll 2 ,
+and
+.Xr select 2
+may be used to sleep awaiting the availability of a completed buffer.
+They will return a readable file descriptor once at least one buffer is
+assigned to userspace.
 .Sh IOCTLS
 The
 .Xr ioctl 2
@@ -418,27 +521,9 @@
 .Nm
 buffering mode; possible values are
 .Dv BPF_BUFMODE_BUFFER ,
-buffered
-.Xr read 2
-mode, and
+buffered read mode, and
 .Dv BPF_BUFMODE_ZBUF ,
 zero-copy buffer mode.
-.It Dv BIOCACKZBUF
-.Pq Li struct bpf_zbuf
-Return a completed zero-copy buffer to the kernel for reuse.
-The following structure is used as an argument to these and other zero-copy
-buffer ioctls:
-.Bd -literal
-struct bpf_zbuf {
-	void *bz_bufa;
-	void *bz_bufb;
-	size_t bz_buflen;
-};
-.Ed
-.Pp
-Only the
-.Vt bz_bufa
-field will be used with this ioctl.
 .It Dv BIOCGETZBUF
 .It Dv BIOCSETZBUF
 .Pq Li struct bpf_zbuf
@@ -455,6 +540,7 @@
 and
 .Vt bz_buflen
 must be filled out.
+If buffers have already been set for this device, the ioctl will fail.
 .It Dv BIOCGETZMAX
 .Pq Li size_t
 Get the largest individual zero-copy buffer size allowed.
@@ -464,38 +550,23 @@
 buffer size, especially when there are multiple
 .Nm
 descriptors in use on 32-bit systems.
-.It Dv BIOCGETZNEXT
 .It Dv BIOCROTZBUF
-.Pq Li struct bpf_zbuf
-Get the buffer pointer and length of the next zero-copy buffer buffer ready
-for userspace use, or
-.Dv NULL
-if there is no pending buffer.
-.Pp
-.Dv BIOCGETZNEXT
-queries for the next completely filled buffer ready for immediate use,
-returning NULL if there are only empty or partially filled buffers available.
-.Pp
-.Dv BIOCROTZBUF
-queries for a filled buffer, but in the event there is only a partially
-filled buffer, will make that buffer available for userspace to use
-immediately.
+Force ownership of the next buffer to be assigned to userspace, if any data
+is present in the buffer.
+If no data is present, the buffer will remain owned by the kernel.
+If userspace already owns the buffer, this operation will be a no-op.
 This allows consumers of zero-copy buffering to implement timeouts and
 retrieve partially filled buffers.
-.Dv BIOCROTZBUF
-will return
-.Dv NULL
-only if no data is present in either of the zero-copy buffers.
-.Pp
-Only the
-.Vt bz_bufa
-and
-.Vt bz_buflen
-fields will be used with this ioctl.
+In order to handle the case where no data is present in the buffer and
+therefore ownership is not assigned, the user process must check
+.Vt bzh_kernel_gen
+against
+.Vt bzh_user_gen .
 .El
 .Sh BPF HEADER
 The following structure is prepended to each packet returned by
-.Xr read 2 :
+.Xr read 2
+or via a zero-copy buffer:
 .Bd -literal
 struct bpf_hdr {
         struct timeval bh_tstamp;     /* time stamp */
@@ -861,6 +932,9 @@
 .Sh SEE ALSO
 .Xr tcpdump 1 ,
 .Xr ioctl 2 ,
+.Xr kqueue 2 ,
+.Xr poll 2 ,
+.Xr select 2 ,
 .Xr byteorder 3 ,
 .Xr ng_bpf 4 ,
 .Xr bpf 9
@@ -893,6 +967,10 @@
 Summer 1990.
 Much of the design is due to
 .An Van Jacobson .
+.Pp
+Support for zero-copy buffers was added by
+.An Robert N. M. Watson
+under contract to Seccuris Inc.
 .Sh BUGS
 The read buffer must be of a fixed size (returned by the
 .Dv BIOCGBLEN
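
Illustrative sketch, not part of change 131540: the new text requires
zero-copy buffers to be page-aligned, a whole multiple of the page size, and
no larger than the limit reported by BIOCGETZMAX.  A consumer could check a
candidate buffer before attempting BIOCSETZBUF along the following lines.
The helper name zbuf_layout_ok() and its maxlen parameter are hypothetical;
maxlen is assumed to hold a value previously obtained from BIOCGETZMAX.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Hypothetical helper: return true if (buf, len) satisfies the layout
 * rules described for zero-copy buffers.  maxlen is assumed to be the
 * limit reported by BIOCGETZMAX; no ioctl is issued here.
 */
static bool
zbuf_layout_ok(const void *buf, size_t len, size_t maxlen)
{
	long pagesize = sysconf(_SC_PAGESIZE);
	size_t ps;

	if (pagesize <= 0)
		return (false);
	ps = (size_t)pagesize;

	/* Length must be non-zero and within the advertised maximum. */
	if (len == 0 || len > maxlen)
		return (false);
	/* Length must be a whole multiple of the page size. */
	if (len % ps != 0)
		return (false);
	/* The buffer address must be page-aligned. */
	return ((uintptr_t)buf % ps == 0);
}
```

Both buffers passed in struct bpf_zbuf would need to pass this check with
the same len, since the man page states that the two buffers are of equal
size.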

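
Illustrative sketch, not part of change 131540: the generation-number
handshake described in the new BUFFER MODES section can be exercised
entirely in user space.  The version below swaps in C11 <stdatomic.h> for
the <machine/atomic.h> calls used in the man page example so that it builds
outside FreeBSD.  The struct and all sim_* names are hypothetical stand-ins,
and sim_kernel_assign() only imitates what the text says the kernel does; it
is an assumption, not the kernel's actual code.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct bpf_zbuf_header. */
struct zbuf_header_sim {
	atomic_uint kernel_gen;	/* Changed by the "kernel" on hand-off. */
	atomic_uint kernel_len;	/* Bytes of packet data in the buffer. */
	atomic_uint user_gen;	/* Set by the user to acknowledge. */
};

/* Zero the header; equal generations mean the kernel owns the buffer. */
static void
sim_init(struct zbuf_header_sim *h)
{
	atomic_init(&h->kernel_gen, 0);
	atomic_init(&h->kernel_len, 0);
	atomic_init(&h->user_gen, 0);
}

/* Imitated kernel side: publish the length, then assign to userspace. */
static void
sim_kernel_assign(struct zbuf_header_sim *h, unsigned int len)
{
	atomic_store_explicit(&h->kernel_len, len, memory_order_relaxed);
	/* Release ordering makes the length visible before the new gen. */
	atomic_fetch_add_explicit(&h->kernel_gen, 1, memory_order_release);
}

/* User side: true if the buffer has been assigned to userspace. */
static bool
sim_user_owns(struct zbuf_header_sim *h)
{
	unsigned int kgen;

	kgen = atomic_load_explicit(&h->kernel_gen, memory_order_acquire);
	return (atomic_load_explicit(&h->user_gen,
	    memory_order_relaxed) != kgen);
}

/* User side: acknowledge the buffer, returning it to the kernel. */
static void
sim_user_ack(struct zbuf_header_sim *h)
{
	unsigned int kgen;

	kgen = atomic_load_explicit(&h->kernel_gen, memory_order_relaxed);
	atomic_store_explicit(&h->user_gen, kgen, memory_order_release);
}
```

A consumer loop would call sim_user_owns() after kqueue(2), poll(2), or
select(2) reports the descriptor readable (or after BIOCROTZBUF on a
timeout), process kernel_len bytes of packet data, and then call
sim_user_ack().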