svn commit: r302247 - in head: share/man/man4 share/man/man9 tools/build/options

Jonathan T. Looney jtl at FreeBSD.org
Tue Jun 28 13:37:03 UTC 2016


Author: jtl
Date: Tue Jun 28 13:37:01 2016
New Revision: 302247
URL: https://svnweb.freebsd.org/changeset/base/302247

Log:
  Document support for alternate TCP stacks.
  
  Differential Revision:	https://reviews.freebsd.org/D6940
  Reviewed by:	hiren
  Approved by:	re (gjb)
  Sponsored by:	Juniper Networks

Added:
  head/share/man/man9/tcp_functions.9   (contents, props changed)
  head/tools/build/options/WITH_EXTRA_TCP_STACKS   (contents, props changed)
Modified:
  head/share/man/man4/tcp.4
  head/share/man/man9/Makefile

Modified: head/share/man/man4/tcp.4
==============================================================================
--- head/share/man/man4/tcp.4	Tue Jun 28 07:47:42 2016	(r302246)
+++ head/share/man/man4/tcp.4	Tue Jun 28 13:37:01 2016	(r302247)
@@ -34,7 +34,7 @@
 .\"     From: @(#)tcp.4	8.1 (Berkeley) 6/5/93
 .\" $FreeBSD$
 .\"
-.Dd May 19, 2016
+.Dd June 28, 2016
 .Dt TCP 4
 .Os
 .Sh NAME
@@ -119,7 +119,7 @@ supports a number of socket options whic
 .Xr setsockopt 2
 and tested with
 .Xr getsockopt 2 :
-.Bl -tag -width ".Dv TCP_CONGESTION"
+.Bl -tag -width ".Dv TCP_FUNCTION_BLK"
 .It Dv TCP_INFO
 Information about a socket's underlying TCP session may be retrieved
 by passing the read-only option
@@ -148,6 +148,20 @@ connection.
 See
 .Xr mod_cc 4
 for details.
+.It Dv TCP_FUNCTION_BLK
+Select or query the set of functions that TCP will use for this connection.
+This allows a user to select an alternate TCP stack.
+The alternate TCP stack must already be loaded in the kernel.
+To list the available TCP stacks, see
+.Va functions_available
+in the
+.Sx MIB Variables
+section further down.
+To list the default TCP stack, see
+.Va functions_default
+in the
+.Sx MIB Variables
+section.
 .It Dv TCP_KEEPINIT
 This
 .Xr setsockopt 2
@@ -568,6 +582,10 @@ Number of times default MSS was used in 
 .It Va pmtud_blackhole_failed
 Number of connections for which retransmits continued even after MSS
 downshift.
+.It Va functions_available
+List of available TCP function blocks (TCP stacks).
+.It Va functions_default
+The default TCP function block (TCP stack).
 .El
 .Sh ERRORS
 A socket operation may fail with one of the following errors returned:
@@ -599,6 +617,10 @@ exists;
 .It Bq Er EAFNOSUPPORT
 when an attempt is made to bind or connect a socket to a multicast
 address.
+.It Bq Er EINVAL
+when trying to change TCP function blocks at an invalid point in the session;
+.It Bq Er ENOENT
+when trying to use a TCP function block that is not available.
 .El
 .Sh SEE ALSO
 .Xr getsockopt 2 ,
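
Note on the TCP_FUNCTION_BLK text above: the following is a minimal userland
sketch of selecting a stack for one socket, not code from this commit.
It assumes the option takes a struct tcp_function_set with a
function_set_name member, as declared in <netinet/tcp.h> on FreeBSD; the
man page text does not name that structure, so check the header on the
target system.
The helper name select_tcp_stack() is hypothetical.
On systems that do not define TCP_FUNCTION_BLK the helper only reports
ENOPROTOOPT.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Hypothetical helper: ask the kernel to use the TCP stack called
 * "name" for the socket "fd".  Returns 0 on success, or an errno
 * value on failure.
 */
int
select_tcp_stack(int fd, const char *name)
{
#ifdef TCP_FUNCTION_BLK
	/* Assumed layout: struct tcp_function_set from <netinet/tcp.h>. */
	struct tcp_function_set fs;

	assert(name != NULL);
	/* The name, including its terminating NUL, must fit the buffer. */
	if (strlen(name) >= sizeof(fs.function_set_name))
		return (ENAMETOOLONG);
	memset(&fs, 0, sizeof(fs));
	strcpy(fs.function_set_name, name);
	if (setsockopt(fd, IPPROTO_TCP, TCP_FUNCTION_BLK, &fs,
	    sizeof(fs)) == -1)
		return (errno);
	return (0);
#else
	/* No alternate TCP stack support visible at compile time. */
	(void)fd;
	(void)name;
	return (ENOPROTOOPT);
#endif
}
```

Going by the ERRORS entries added above, a caller should expect ENOENT when
the named stack is not available and EINVAL when the switch is attempted at
an invalid point in the session.
Valid names are the ones listed by the functions_available MIB variable.
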

Modified: head/share/man/man9/Makefile
==============================================================================
--- head/share/man/man9/Makefile	Tue Jun 28 07:47:42 2016	(r302246)
+++ head/share/man/man9/Makefile	Tue Jun 28 13:37:01 2016	(r302247)
@@ -284,6 +284,7 @@ MAN=	accept_filter.9 \
 	sysctl_ctx_init.9 \
 	SYSINIT.9 \
 	taskqueue.9 \
+	tcp_functions.9 \
 	thread_exit.9 \
 	time.9 \
 	timeout.9 \
@@ -1734,6 +1735,8 @@ MLINKS+=taskqueue.9 TASK_INIT.9 \
 	taskqueue.9 taskqueue_start_threads_pinned.9 \
 	taskqueue.9 taskqueue_unblock.9 \
 	taskqueue.9 TIMEOUT_TASK_INIT.9
+MLINKS+=tcp_functions.9 register_tcp_functions.9 \
+	tcp_functions.9 deregister_tcp_functions.9
 MLINKS+=time.9 boottime.9 \
 	time.9 time_second.9 \
 	time.9 time_uptime.9

Added: head/share/man/man9/tcp_functions.9
==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ head/share/man/man9/tcp_functions.9	Tue Jun 28 13:37:01 2016	(r302247)
@@ -0,0 +1,285 @@
+.\"
+.\" Copyright (c) 2016 Jonathan Looney <jtl at FreeBSD.org>
+.\" All rights reserved.
+.\"
+.\" Redistribution and use in source and binary forms, with or without
+.\" modification, are permitted provided that the following conditions
+.\" are met:
+.\" 1. Redistributions of source code must retain the above copyright
+.\"    notice, this list of conditions and the following disclaimer.
+.\" 2. Redistributions in binary form must reproduce the above copyright
+.\"    notice, this list of conditions and the following disclaimer in the
+.\"    documentation and/or other materials provided with the distribution.
+.\"
+.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR
+.\" ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+.\" SUCH DAMAGE.
+.\"
+.\" $FreeBSD$
+.\"
+.Dd June 28, 2016
+.Dt TCP_FUNCTIONS 9
+.Os
+.Sh NAME
+.Nm tcp_functions
+.Nd Alternate TCP Stack Framework
+.Sh SYNOPSIS
+.In netinet/tcp.h
+.In netinet/tcp_var.h
+.Ft int
+.Fn register_tcp_functions "struct tcp_function_block *blk" "int wait"
+.Ft int
+.Fn deregister_tcp_functions "struct tcp_function_block *blk"
+.Sh DESCRIPTION
+The
+.Nm
+framework allows a kernel developer to implement alternate TCP stacks.
+The alternate stacks can be compiled in the kernel or can be implemented in
+loadable kernel modules.
+This functionality is intended to encourage experimentation with the TCP stack
+and to allow alternate behaviors to be deployed for different TCP connections
+on a single system.
+.Pp
+A system administrator can set a system default stack.
+By default, all TCP connections will use the system default stack.
+Additionally, users can specify a particular stack to use on a per-connection
+basis.
+(See
+.Xr tcp 4
+for details on setting the system default stack, or selecting a specific stack
+for a given connection.)
+.Pp
+This man page treats "TCP stacks" as synonymous with "function blocks".
+This is intentional.
+A "TCP stack" is a collection of functions that implement a set of behaviors.
+Therefore, an alternate "function block" defines an alternate "TCP stack".
+.Pp
+.Nm
+modules must call the
+.Fn register_tcp_functions
+function during initialization and successfully call the
+.Fn deregister_tcp_functions
+function prior to allowing the module to be unloaded.
+.Pp
+The
+.Fn register_tcp_functions
+function requests that the system add a specified function block to the system.
+.Pp
+The
+.Fn deregister_tcp_functions
+function requests that the system remove a specified function block from the
+system.
+If the call fails because sockets are still using the specified function block,
+the system will mark the function block as being in the process of being
+removed.
+This will prevent additional sockets from using the specified function block.
+However, it will not impact sockets that are already using the function block.
+.Pp
+The
+.Fa blk
+argument is a pointer to a
+.Vt "struct tcp_function_block" ,
+which is explained below (see
+.Sx Function Block Structure ) .
+The
+.Fa wait
+argument is used as the
+.Fa flags
+argument to
+.Xr malloc 9 ,
+and must be set to one of the valid values defined in that man page.
+.Ss Function Block Structure
+The
+.Fa blk No argument is a pointer to a
+.Vt "struct tcp_function_block" ,
+which has the following members:
+.Bd -literal -offset indent
+struct tcp_function_block {
+	char	tfb_tcp_block_name[TCP_FUNCTION_NAME_LEN_MAX];
+	int	(*tfb_tcp_output)(struct tcpcb *);
+	void	(*tfb_tcp_do_segment)(struct mbuf *, struct tcphdr *,
+			    struct socket *, struct tcpcb *,
+			    int, int, uint8_t,
+			    int);
+	int     (*tfb_tcp_ctloutput)(struct socket *so,
+			    struct sockopt *sopt,
+			    struct inpcb *inp, struct tcpcb *tp);
+	/* Optional memory allocation/free routine */
+	void	(*tfb_tcp_fb_init)(struct tcpcb *);
+	void	(*tfb_tcp_fb_fini)(struct tcpcb *);
+	/* Optional timers, must define all if you define one */
+	int	(*tfb_tcp_timer_stop_all)(struct tcpcb *);
+	void	(*tfb_tcp_timer_activate)(struct tcpcb *,
+			    uint32_t, u_int);
+	int	(*tfb_tcp_timer_active)(struct tcpcb *, uint32_t);
+	void	(*tfb_tcp_timer_stop)(struct tcpcb *, uint32_t);
+	void	(*tfb_tcp_rexmit_tmr)(struct tcpcb *);
+	volatile uint32_t tfb_refcnt;
+	uint32_t  tfb_flags;
+};
+.Ed
+.Pp
+The
+.Va tfb_tcp_block_name
+field identifies the unique name of the TCP stack, and should be no longer than
+TCP_FUNCTION_NAME_LEN_MAX-1 characters.
+.Pp
+The
+.Va tfb_tcp_output ,
+.Va tfb_tcp_do_segment ,
+and
+.Va tfb_tcp_ctloutput
+fields are pointers to functions that perform the equivalent actions
+as the default
+.Fn tcp_output ,
+.Fn tcp_do_segment ,
+and
+.Fn tcp_default_ctloutput
+functions, respectively.
+Each of these function pointers must be non-NULL.
+.Pp
+If a TCP stack needs to initialize data when a socket first selects the TCP
+stack (or, when the socket is first opened), it should set a non-NULL
+pointer in the
+.Va tfb_tcp_fb_init
+field.
+Likewise, if a TCP stack needs to clean up data when a socket stops using the
+TCP stack (or, when the socket is closed), it should set a non-NULL pointer
+in the
+.Va tfb_tcp_fb_fini
+field.
+.Pp
+If the TCP stack implements additional timers, the TCP stack should set a
+non-NULL pointer in the
+.Va tfb_tcp_timer_stop_all ,
+.Va tfb_tcp_timer_activate ,
+.Va tfb_tcp_timer_active ,
+and
+.Va tfb_tcp_timer_stop
+fields.
+These fields should all be
+.Dv NULL
+or should all contain pointers to functions.
+The
+.Va tfb_tcp_timer_activate ,
+.Va tfb_tcp_timer_active ,
+and
+.Va tfb_tcp_timer_stop
+functions will be called when the
+.Fn tcp_timer_activate ,
+.Fn tcp_timer_active ,
+and
+.Fn tcp_timer_stop
+functions, respectively, are called with a timer type other than the standard
+types.
+The functions defined by the TCP stack have the same semantics (both for
+arguments and return values) as the normal timer functions they supplement.
+.Pp
+Additionally, a stack may define its own actions to take when the retransmit
+timer fires by setting a non-NULL function pointer in the
+.Va tfb_tcp_rexmit_tmr
+field.
+This function is called very early in the process of handling a retransmit
+timer.
+However, care must be taken to ensure the retransmit timer leaves the
+TCP control block in a valid state for the remainder of the retransmit
+timer logic.
+.Pp
+The
+.Va tfb_refcnt
+and
+.Va tfb_flags
+fields are used by the kernel's TCP code and will be initialized when the
+TCP stack is registered.
+.Ss Requirements for Alternate TCP Stacks
+If the TCP stack needs to store data beyond what is stored in the default
+TCP control block, the TCP stack can initialize its own per-connection storage.
+The
+.Va t_fb_ptr
+field in the
+.Vt "struct tcpcb"
+control block structure has been reserved to hold a pointer to this
+per-connection storage.
+If the TCP stack uses this alternate storage, it should understand that the
+value of the
+.Va t_fb_ptr
+pointer may not be initialized to
+.Dv NULL .
+Therefore, it should use a
+.Va tfb_tcp_fb_init
+function to initialize this field.
+Additionally, it should use a
+.Va tfb_tcp_fb_fini
+function to deallocate storage when the socket is closed.
+.Pp
+It is understood that alternate TCP stacks may keep different sets of data.
+However, in order to ensure that data is available to both the user and the
+rest of the system in a standardized format, alternate TCP stacks must
+update all fields in the TCP control block to the greatest extent practical.
+.Sh RETURN VALUES
+The
+.Fn register_tcp_functions
+and
+.Fn deregister_tcp_functions
+functions return zero on success and non-zero on failure.
+In particular, the
+.Fn deregister_tcp_functions
+function will return
+.Er EBUSY
+until no more connections are using the specified TCP stack.
+A module calling
+.Fn deregister_tcp_functions
+must be prepared to wait until all connections have stopped using the
+specified TCP stack.
+.Sh ERRORS
+The
+.Fn register_tcp_functions
+function will fail if:
+.Bl -tag -width Er
+.It Bq Er EINVAL
+Any of the members of the
+.Fa blk
+argument are set incorrectly.
+.It Bq Er ENOMEM
+The function could not allocate memory for its internal data.
+.It Bq Er EALREADY
+A function block is already registered with the same name.
+.El
+The
+.Fn deregister_tcp_functions
+function will fail if:
+.Bl -tag -width Er
+.It Bq Er EPERM
+The
+.Fa blk
+argument references the kernel's compiled-in default function block.
+.It Bq Er EBUSY
+The function block is still in use by one or more sockets, or is defined as
+the current default function block.
+.It Bq Er ENOENT
+The
+.Fa blk
+argument references a function block that is not currently registered.
+.Sh SEE ALSO
+.Xr tcp 4 ,
+.Xr malloc 9
+.Sh HISTORY
+This framework first appeared in
+.Fx 11.0 .
+.Sh AUTHORS
+.An -nosplit
+The
+.Nm
+framework was written by
+.An Randall Stewart Aq Mt rrs at FreeBSD.org .
+.Pp
+This manual page was written by
+.An Jonathan Looney Aq Mt jtl at FreeBSD.org .
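
Note on the RETURN VALUES and ERRORS sections above: the following is a
small userland model of the documented register/deregister behavior, not
the kernel implementation.
Every name in it (model_block, model_register(), model_deregister(),
model_attach(), model_detach(), MODEL_NAME_MAX, MODEL_SLOTS) is
hypothetical.
The block is cut down to a name and one required function pointer, and the
EPERM and default-stack cases are left out.
Returning ENOENT from model_attach() for a block that is being removed is
an assumption based on the tcp.4 ERRORS text.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MODEL_NAME_MAX	32	/* stand-in for TCP_FUNCTION_NAME_LEN_MAX */
#define MODEL_SLOTS	8	/* a full table is reported as ENOMEM */

struct model_block {
	char	 name[MODEL_NAME_MAX];
	int	(*output)(void *);	/* stand-in for tfb_tcp_output */
	unsigned refcnt;		/* sockets currently using the block */
	int	 removing;		/* set once a deregister saw EBUSY */
};

static struct model_block *model_registry[MODEL_SLOTS];

/* Trivial output routine so a block can satisfy the non-NULL rule. */
int
model_noop_output(void *tp)
{
	(void)tp;
	return (0);
}

static struct model_block *
model_find_name(const char *name)
{
	for (size_t i = 0; i < MODEL_SLOTS; i++)
		if (model_registry[i] != NULL &&
		    strcmp(model_registry[i]->name, name) == 0)
			return (model_registry[i]);
	return (NULL);
}

int
model_register(struct model_block *blk)
{
	/* Required members must be set. */
	if (blk == NULL || blk->name[0] == '\0' || blk->output == NULL)
		return (EINVAL);
	/* Names must be unique. */
	if (model_find_name(blk->name) != NULL)
		return (EALREADY);
	for (size_t i = 0; i < MODEL_SLOTS; i++) {
		if (model_registry[i] == NULL) {
			/* Bookkeeping fields are initialized on registration. */
			blk->refcnt = 0;
			blk->removing = 0;
			model_registry[i] = blk;
			return (0);
		}
	}
	return (ENOMEM);
}

int
model_deregister(struct model_block *blk)
{
	if (blk == NULL)
		return (ENOENT);
	for (size_t i = 0; i < MODEL_SLOTS; i++) {
		if (model_registry[i] != blk)
			continue;
		if (blk->refcnt > 0) {
			/* Still in use: refuse, but block new users. */
			blk->removing = 1;
			return (EBUSY);
		}
		model_registry[i] = NULL;
		return (0);
	}
	return (ENOENT);
}

/* A socket starts using the block (registration is not re-checked). */
int
model_attach(struct model_block *blk)
{
	if (blk->removing)
		return (ENOENT);
	blk->refcnt++;
	return (0);
}

/* A socket stops using the block. */
void
model_detach(struct model_block *blk)
{
	assert(blk->refcnt > 0);
	blk->refcnt--;
}
```

In this model, as in the text above, a module that gets EBUSY from the
deregister call has to try again after the remaining users have gone away;
only a zero return means the block has been removed.
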

Added: head/tools/build/options/WITH_EXTRA_TCP_STACKS
==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ head/tools/build/options/WITH_EXTRA_TCP_STACKS	Tue Jun 28 13:37:01 2016	(r302247)
@@ -0,0 +1,2 @@
+.\" $FreeBSD$
+Set to build extra TCP stack modules.

