git: 02b958b19535 - stable/13 - netlink: add netlink user documentation.

From: Alexander V. Chernikov <melifaro_at_FreeBSD.org>
Date: Mon, 23 Jan 2023 22:12:00 UTC
The branch stable/13 has been updated by melifaro:

URL: https://cgit.FreeBSD.org/src/commit/?id=02b958b19535828d8f19bf3601ae88ecf4503d33

commit 02b958b19535828d8f19bf3601ae88ecf4503d33
Author:     Alexander V. Chernikov <melifaro@FreeBSD.org>
AuthorDate: 2022-11-01 12:20:13 +0000
Commit:     Alexander V. Chernikov <melifaro@FreeBSD.org>
CommitDate: 2023-01-23 22:04:03 +0000

    netlink: add netlink user documentation.
    
    Add netlink(4) as a "frontend" manpage describing netlink in general.
    Add rtnetlink(4) describing supported commands and attributes in
    NETLINK_ROUTE family.
    Add genetlink(4) describing generic netlink API.
    
    Reviewed by:    pauamma
    Differential Revision: https://reviews.freebsd.org/D37011
    
    (cherry picked from commit 7366c0a49c9a60d3eea7520d7ae4bc2b3ab172f3)
---
 share/man/man4/genetlink.4 | 147 +++++++++++++
 share/man/man4/netlink.4   | 344 ++++++++++++++++++++++++++++++
 share/man/man4/rtnetlink.4 | 519 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 1010 insertions(+)

diff --git a/share/man/man4/genetlink.4 b/share/man/man4/genetlink.4
new file mode 100644
index 000000000000..2c5b9b99f994
--- /dev/null
+++ b/share/man/man4/genetlink.4
@@ -0,0 +1,147 @@
+.\"
+.\" Copyright (C) 2022 Alexander Chernikov <melifaro@FreeBSD.org>.
+.\"
+.\" Redistribution and use in source and binary forms, with or without
+.\" modification, are permitted provided that the following conditions
+.\" are met:
+.\" 1. Redistributions of source code must retain the above copyright
+.\"    notice, this list of conditions and the following disclaimer.
+.\" 2. Redistributions in binary form must reproduce the above copyright
+.\"    notice, this list of conditions and the following disclaimer in the
+.\"    documentation and/or other materials provided with the distribution.
+.\"
+.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+.\" SUCH DAMAGE.
+.\"
+.\" $FreeBSD$
+.\"
+.Dd November 1, 2022
+.Dt GENETLINK 4
+.Os
+.Sh NAME
+.Nm genetlink
+.Nd Generic Netlink
+.Sh SYNOPSIS
+.In netlink/netlink.h
+.In netlink/netlink_generic.h
+.Ft int
+.Fn socket AF_NETLINK SOCK_DGRAM NETLINK_GENERIC
+.Sh DESCRIPTION
+The
+.Dv NETLINK_GENERIC
+is a "container" family, used for dynamic registration of other families
+belonging to the various subsystems.
+These subsystems provide a string family name during registration and
+receive a dynamically-allocated family id.
+Allocated family identifiers are then used by applications to get access to
+functions provided by that subsystem via netlink.
+There are standard methods for resolving string family names to family
+identifiers.
+A similar mechanism works for the notification groups provided by those
+families.
+.Pp
+All generic netlink families share a common header:
+.Bd -literal
+struct genlmsghdr {
+	uint8_t		cmd;		/* command within the family */
+	uint8_t		version;	/* ABI version for the cmd */
+	uint16_t	reserved;	/* reserved: set to 0 */
+};
+.Ed
+The family id is encoded in the
+.Dv nlmsg_type
+of the base netlink header.
+The
+.Va cmd
+field is the command identifier within the family.
+The
+.Va version
+field is the command version.
+.Sh METHODS
+The generic Netlink framework provides the base family,
+.Dv GENL_ID_CTRL
+("nlctrl") with a fixed family id.
+This family is used to list the details of all registered families.
+.Pp
+The following messages are supported by the framework:
+.Ss CTRL_CMD_GETFAMILY
+Fetches a single family or all registered families, depending on the
+.Dv NLM_F_DUMP
+flag.
+Each family is reported as a
+.Dv CTRL_CMD_NEWFAMILY
+message.
+The following filters are recognised by the kernel:
+.Pp
+.Bd -literal -offset indent -compact
+CTRL_ATTR_FAMILY_ID	(uint16_t) current family id assigned by kernel
+CTRL_ATTR_FAMILY_NAME	(string) family name
+.Ed
+.Ss TLVs
+.Bl -tag -width indent
+.It Dv CTRL_ATTR_FAMILY_ID
+(uint16_t) Dynamically-assigned family identifier.
+.It Dv CTRL_ATTR_FAMILY_NAME
+(string) Family name.
+.It Dv CTRL_ATTR_HDRSIZE
+(uint32_t) Family mandatory header size (typically 0).
+.It Dv CTRL_ATTR_MAXATTR
+(uint32_t) Maximum attribute number valid for the family.
+.It Dv CTRL_ATTR_OPS
+(nested) List of the operations supported by the family.
+The attribute consists of a list of nested TLVs, with attribute values
+monotonically incremented, starting from 0.
+The following attributes are present in each TLV:
+.Bl -tag -width indent
+.It Dv CTRL_ATTR_OP_ID
+Operation (message) number.
+.It Dv CTRL_ATTR_OP_FLAGS
+Operation flags.
+The following flags are supported:
+.Bd -literal -offset indent -compact
+GENL_ADMIN_PERM		requires elevated permissions
+GENL_CMD_CAP_DO		operation is a modification request
+GENL_CMD_CAP_DUMP	operation is a get/dump request
+.Ed
+.El
+.It Dv CTRL_ATTR_MCAST_GROUPS
+(nested) List of the notification groups supported by the family.
+The attribute consists of a list of nested TLVs, with attribute values
+monotonically incremented, starting from 0.
+The following attributes are present in each TLV:
+.Bl -tag -width indent
+.It Dv CTRL_ATTR_MCAST_GRP_ID
+Group id that can be used in
+.Dv NETLINK_ADD_MEMBERSHIP
+.Xr setsockopt 2 .
+.It Dv CTRL_ATTR_MCAST_GRP_NAME
+(string) Human-readable name of the group.
+.El
+.El
+.Ss Groups
+The following groups are defined:
+.Bd -literal -offset indent -compact
+"notify"	Notifies on family registrations/removal.
+.Ed
+.Sh SEE ALSO
+.Xr netlink 4
+.Sh HISTORY
+The
+.Dv NETLINK_GENERIC
+protocol family appeared in
+.Fx 14.0 .
+.Sh AUTHORS
+Netlink was implemented by
+.An -nosplit
+.An Alexander Chernikov Aq Mt melifaro@FreeBSD.org .
+It was derived from the Google Summer of Code 2021 project by
+.An Ng Peng Nam Sean .
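
As an illustration of the genetlink(4) text above (not part of the committed
man page), the following C sketch lays out a CTRL_CMD_GETFAMILY request that
asks the "nlctrl" base family to resolve a family name. The ex_ structures
are local stand-ins for the ones shown in the man pages, and the EX_ numeric
constants are assumptions that mirror the Linux-compatible ABI values; real
code should include netlink/netlink.h and netlink/netlink_generic.h and use
the constants defined there. The sketch only builds the bytes in memory and
does not open a socket.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-ins for the structures shown in netlink(4) and genetlink(4). */
struct ex_nlmsghdr {
	uint32_t nlmsg_len;
	uint16_t nlmsg_type;
	uint16_t nlmsg_flags;
	uint32_t nlmsg_seq;
	uint32_t nlmsg_pid;
};

struct ex_genlmsghdr {
	uint8_t  cmd;
	uint8_t  version;
	uint16_t reserved;
};

struct ex_nlattr {
	uint16_t nla_len;
	uint16_t nla_type;
};

/* Assumed values (Linux-compatible ABI); real code should use the headers. */
#define EX_NLM_F_REQUEST         0x01
#define EX_GENL_ID_CTRL          0x10
#define EX_CTRL_CMD_GETFAMILY    3
#define EX_CTRL_ATTR_FAMILY_NAME 2

/* Round a length up to the next 4-byte boundary. */
static size_t
ex_align4(size_t n)
{
	return ((n + 3) & ~(size_t)3);
}

/*
 * Write a CTRL_CMD_GETFAMILY request for the family called "name" into buf.
 * Returns the message length in bytes, or 0 if the request does not fit.
 */
size_t
ex_build_getfamily(void *buf, size_t buflen, const char *name, uint32_t seq)
{
	unsigned char *p = buf;
	size_t namelen, attrlen, total;
	struct ex_nlmsghdr hdr;
	struct ex_genlmsghdr ghdr;
	struct ex_nlattr attr;

	assert(buf != NULL && name != NULL);
	namelen = strlen(name) + 1;		/* include the terminating NUL */
	attrlen = sizeof(attr) + namelen;	/* nla_len excludes padding */
	total = sizeof(hdr) + sizeof(ghdr) + ex_align4(attrlen);
	if (total > buflen || attrlen > UINT16_MAX)
		return (0);

	memset(p, 0, total);			/* zero padding and reserved fields */

	memset(&hdr, 0, sizeof(hdr));
	hdr.nlmsg_len = (uint32_t)total;
	hdr.nlmsg_type = EX_GENL_ID_CTRL;	/* family id goes in nlmsg_type */
	hdr.nlmsg_flags = EX_NLM_F_REQUEST;
	hdr.nlmsg_seq = seq;
	memcpy(p, &hdr, sizeof(hdr));

	memset(&ghdr, 0, sizeof(ghdr));
	ghdr.cmd = EX_CTRL_CMD_GETFAMILY;	/* version left at 0: an assumption */
	memcpy(p + sizeof(hdr), &ghdr, sizeof(ghdr));

	attr.nla_len = (uint16_t)attrlen;
	attr.nla_type = EX_CTRL_ATTR_FAMILY_NAME;
	memcpy(p + sizeof(hdr) + sizeof(ghdr), &attr, sizeof(attr));
	memcpy(p + sizeof(hdr) + sizeof(ghdr) + sizeof(attr), name, namelen);

	return (total);
}
```

A caller would pass the returned length to send(2) on a NETLINK_GENERIC
socket and read the dynamically assigned id from the CTRL_ATTR_FAMILY_ID
attribute of the reply.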
diff --git a/share/man/man4/netlink.4 b/share/man/man4/netlink.4
new file mode 100644
index 000000000000..c75366f560f0
--- /dev/null
+++ b/share/man/man4/netlink.4
@@ -0,0 +1,344 @@
+.\"
+.\" Copyright (C) 2022 Alexander Chernikov <melifaro@FreeBSD.org>.
+.\"
+.\" Redistribution and use in source and binary forms, with or without
+.\" modification, are permitted provided that the following conditions
+.\" are met:
+.\" 1. Redistributions of source code must retain the above copyright
+.\"    notice, this list of conditions and the following disclaimer.
+.\" 2. Redistributions in binary form must reproduce the above copyright
+.\"    notice, this list of conditions and the following disclaimer in the
+.\"    documentation and/or other materials provided with the distribution.
+.\"
+.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+.\" SUCH DAMAGE.
+.\"
+.\" $FreeBSD$
+.\"
+.Dd November 1, 2022
+.Dt NETLINK 4
+.Os
+.Sh NAME
+.Nm Netlink
+.Nd Kernel network configuration protocol
+.Sh SYNOPSIS
+.In netlink/netlink.h
+.In netlink/netlink_route.h
+.Ft int
+.Fn socket AF_NETLINK SOCK_DGRAM int family
+.Sh DESCRIPTION
+Netlink is a user-kernel message-based communication protocol primarily used
+for network stack configuration.
+Netlink is easily extendable and supports large dumps and event
+notifications, all via a single socket.
+The protocol is fully asynchronous, allowing one to issue and track multiple
+requests at once.
+Netlink consists of multiple families, which commonly group the commands
+belonging to the particular kernel subsystem.
+Currently, the supported families are:
+.Pp
+.Bd -literal -offset indent -compact
+NETLINK_ROUTE	network configuration,
+NETLINK_GENERIC	"container" family
+.Ed
+.Pp
+The
+.Dv NETLINK_ROUTE
+family handles the configuration of all interfaces, addresses, neighbors,
+routes, and VNETs.
+More details can be found in
+.Xr rtnetlink 4 .
+The
+.Dv NETLINK_GENERIC
+family serves as a
+.Do container Dc ,
+allowing registering other families under the
+.Dv NETLINK_GENERIC
+umbrella.
+This approach allows using a single netlink socket to interact with
+multiple netlink families at once.
+More details can be found in
+.Xr genetlink 4 .
+.Pp
+Netlink has its own sockaddr structure:
+.Bd -literal
+struct sockaddr_nl {
+	uint8_t		nl_len;		/* sizeof(sockaddr_nl) */
+	sa_family_t	nl_family;	/* netlink family */
+	uint16_t	nl_pad;		/* reserved, set to 0 */
+	uint32_t	nl_pid;		/* automatically selected, set to 0 */
+	uint32_t	nl_groups;	/* multicast groups mask to bind to */
+};
+.Ed
+.Pp
+Typically, filling this structure is not required for socket operations.
+It is presented here for completeness.
+.Sh PROTOCOL DESCRIPTION
+The protocol is message-based.
+Each message starts with the mandatory
+.Va nlmsghdr
+header, followed by the family-specific header and the list of
+type-length-value pairs (TLVs).
+TLVs can be nested.
+All headers and TLVs are padded to 4-byte boundaries.
+Each
+.Xr send 2 or
+.Xr recv 2
+system call may contain multiple messages.
+.Ss BASE HEADER
+.Bd -literal
+struct nlmsghdr {
+	uint32_t nlmsg_len;   /* Length of message including header */
+	uint16_t nlmsg_type;  /* Message type identifier */
+	uint16_t nlmsg_flags; /* Flags (NLM_F_) */
+	uint32_t nlmsg_seq;   /* Sequence number */
+	uint32_t nlmsg_pid;   /* Sending process port ID */
+};
+.Ed
+.Pp
+The
+.Va nlmsg_len
+field stores the whole message length, in bytes, including the header.
+This length has to be rounded up to the nearest 4-byte boundary when
+iterating over messages.
+The
+.Va nlmsg_type
+field represents the command/request type.
+This value is family-specific.
+The list of supported commands can be found in the relevant family
+header file.
+.Va nlmsg_seq
+is a user-provided request identifier.
+An application can track the operation result using the
+.Dv NLMSG_ERROR
+messages and matching the
+.Va nlmsg_seq
+field.
+The
+.Va nlmsg_pid
+field is the message sender id.
+This field is optional for userland.
+The kernel sender id is zero.
+The
+.Va nlmsg_flags
+field contains the message-specific flags.
+The following generic flags are defined:
+.Pp
+.Bd -literal -offset indent -compact
+NLM_F_REQUEST	Indicates that the message is an actual request to the kernel
+NLM_F_ACK	Request an explicit ACK message with an operation result
+.Ed
+.Pp
+The following generic flags are defined for the "GET" request types:
+.Pp
+.Bd -literal -offset indent -compact
+NLM_F_ROOT	Return the whole dataset
+NLM_F_MATCH	Return all entries matching the criteria
+.Ed
+These two flags are typically used together, aliased to
+.Dv NLM_F_DUMP .
+.Pp
+The following generic flags are defined for the "NEW" request types:
+.Pp
+.Bd -literal -offset indent -compact
+NLM_F_CREATE	Create an object if none exists
+NLM_F_EXCL	Don't replace an object if it exists
+NLM_F_REPLACE	Replace an existing matching object
+NLM_F_APPEND	Append to an existing object
+.Ed
+.Pp
+The following generic flags are defined for the replies:
+.Pp
+.Bd -literal -offset indent -compact
+NLM_F_MULTI	Indicates that the message is part of the message group
+NLM_F_DUMP_INTR	Indicates that the state dump was not completed
+NLM_F_DUMP_FILTERED	Indicates that the dump was filtered per request
+NLM_F_CAPPED	Indicates the original message was capped to its header
+NLM_F_ACK_TLVS	Indicates that extended ACK TLVs were included
+.Ed
+.Ss TLVs
+Most messages encode their attributes as type-length-value pairs (TLVs).
+The base TLV header:
+.Bd -literal
+struct nlattr {
+	uint16_t nla_len;	/* Total attribute length */
+	uint16_t nla_type;	/* Attribute type */
+};
+.Ed
+The TLV type
+.Pq Va nla_type
+scope is typically the message type or group within a family.
+For example, the
+.Dv RTN_MULTICAST
+type value is only
+valid for the
+.Dv RTM_NEWROUTE ,
+.Dv RTM_DELROUTE ,
+and
+.Dv RTM_GETROUTE
+messages.
+TLVs can be nested; in that case internal TLVs may have their own sub-types.
+All TLVs are packed with 4-byte padding.
+.Ss CONTROL MESSAGES
+A number of generic control messages are reserved in each family.
+.Pp
+.Dv NLMSG_ERROR
+reports the operation result if requested, optionally followed by
+the metadata TLVs.
+The value of
+.Va nlmsg_seq
+is set to its value in the original message, while
+.Va nlmsg_pid
+is set to the socket pid of the original socket.
+The operation result is reported via
+.Vt "struct nlmsgerr" :
+.Bd -literal
+struct nlmsgerr {
+	int	error;		/* Standard errno */
+	struct	nlmsghdr msg;	/* Original message header */
+};
+.Ed
+If the
+.Dv NETLINK_CAP_ACK
+socket option is not set, the remainder of the original message will follow.
+If the
+.Dv NETLINK_EXT_ACK
+socket option is set, the kernel may add a
+.Dv NLMSGERR_ATTR_MSG
+string TLV with the textual error description, optionally followed by the
+.Dv NLMSGERR_ATTR_OFFS
+TLV, indicating the offset from the message start that triggered an error.
+.Pp
+.Dv NLMSG_DONE
+indicates the end of the message group: typically, the end of the dump.
+It contains a single
+.Vt int
+field, describing the dump result as a standard errno value.
+.Sh SOCKET OPTIONS
+Netlink supports a number of custom socket options, which can be set with
+.Xr setsockopt 2
+with the
+.Dv SOL_NETLINK
+.Fa level :
+.Bl -tag -width indent
+.It Dv NETLINK_ADD_MEMBERSHIP
+Subscribes to the notifications for the specific group (int).
+.It Dv NETLINK_DROP_MEMBERSHIP
+Unsubscribes from the notifications for the specific group (int).
+.It Dv NETLINK_LIST_MEMBERSHIPS
+Lists the memberships as a bitmask.
+.It Dv NETLINK_CAP_ACK
+Instructs the kernel to send the original message header in the reply
+without the message body.
+.It Dv NETLINK_EXT_ACK
+Acknowledges the ability to receive additional TLVs in the ACK message.
+.El
+.Pp
+Additionally, netlink overrides the following socket options from the
+.Dv SOL_SOCKET
+.Fa level :
+.Bl -tag -width indent
+.It Dv SO_RCVBUF
+Sets the maximum size of the socket receive buffer.
+If the caller has
+.Dv PRIV_NET_ROUTE
+permission, the value can exceed the currently-set
+.Va kern.ipc.maxsockbuf
+value.
+.El
+.Sh SYSCTL VARIABLES
+A set of
+.Xr sysctl 8
+variables is available to tweak run-time parameters:
+.Bl -tag -width indent
+.It Va net.netlink.sendspace
+Default send buffer for the netlink socket.
+Note that the socket sendspace has to be at least as long as the longest
+message that can be transmitted via this socket.
+.El
+.Bl -tag -width indent
+.It Va net.netlink.recvspace
+Default receive buffer for the netlink socket.
+Note that the socket recvspace has to be at least as long as the longest
+message that can be received from this socket.
+.El
+.Sh DEBUGGING
+Netlink implements per-functional-unit debugging, with different severities
+controllable via the
+.Va net.netlink.debug
+branch.
+These messages are logged in the kernel message buffer
+and can be seen in
+.Xr dmesg 8 .
+The following severity levels are defined:
+.Bl -tag -width indent
+.It Dv LOG_DEBUG(7)
+Rare events or per-socket errors are reported here.
+This is the default level, not impacting production performance.
+.It Dv LOG_DEBUG2(8)
+Socket events such as group memberships, privilege checks, commands and dumps
+are logged.
+This level does not incur significant performance overhead.
+.It Dv LOG_DEBUG3(9)
+All socket events and each dumped or modified entity are logged.
+Turning it on may result in significant performance overhead.
+.El
+.Sh ERRORS
+Netlink reports operation results, including errors and error metadata, by
+sending a
+.Dv NLMSG_ERROR
+message for each request message.
+The following errors can be returned:
+.Bl -tag -width Er
+.It Bq Er EPERM
+when the current privileges are insufficient to perform the required operation;
+.It Bo Er ENOBUFS Bc or Bo Er ENOMEM Bc
+when the system runs out of memory for
+an internal data structure;
+.It Bq Er ENOTSUP
+when the requested command is not supported by the family or
+the family is not supported;
+.It Bq Er EINVAL
+when some necessary TLVs are missing or invalid; detailed info
+may be provided in NLMSGERR_ATTR_MSG and NLMSGERR_ATTR_OFFS TLVs;
+.It Bq Er ENOENT
+when trying to delete a non-existent object.
+.El
+.Pp
+Additionally,
+a socket operation itself may fail
+with one of the errors
+specified in
+.Xr socket 2 ,
+.Xr recv 2 ,
+or
+.Xr send 2 .
+.Sh SEE ALSO
+.Xr genetlink 4 ,
+.Xr rtnetlink 4
+.Rs
+.%A "J. Salim"
+.%A "H. Khosravi"
+.%A "A. Kleen"
+.%A "A. Kuznetsov"
+.%T "Linux Netlink as an IP Services Protocol"
+.%O "RFC 3549"
+.Re
+.Sh HISTORY
+The netlink protocol appeared in
+.Fx 14.0 .
+.Sh AUTHORS
+Netlink was implemented by
+.An -nosplit
+.An Alexander Chernikov Aq Mt melifaro@FreeBSD.org .
+It was derived from the Google Summer of Code 2021 project by
+.An Ng Peng Nam Sean .
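
To make the netlink(4) framing rules above concrete (this is an illustration,
not part of the committed man page), the C sketch below walks a receive
buffer that may hold several messages, rounding each nlmsg_len up to a
4-byte boundary, and looks for the NLMSG_ERROR reply whose nlmsg_seq matches
a request. The ex_ names are local stand-ins, and EX_NLMSG_ERROR is an
assumed value mirroring the Linux-compatible ABI; real code should use the
definitions from netlink/netlink.h. The error value is returned as found in
the message, without interpreting its sign.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-in for the base header shown in netlink(4). */
struct ex_nlmsghdr {
	uint32_t nlmsg_len;
	uint16_t nlmsg_type;
	uint16_t nlmsg_flags;
	uint32_t nlmsg_seq;
	uint32_t nlmsg_pid;
};

/* Assumed value (Linux-compatible ABI); real code should use the header. */
#define EX_NLMSG_ERROR 0x2

/* Round a length up to the next 4-byte boundary. */
static size_t
ex_align4(size_t n)
{
	return ((n + 3) & ~(size_t)3);
}

/*
 * Scan "len" bytes of received data for the NLMSG_ERROR message that
 * answers request "seq".
 * Returns 1 and stores the reported code in *error when found,
 * 0 when no matching message is present, -1 on a malformed buffer.
 */
int
ex_find_ack(const void *buf, size_t len, uint32_t seq, int *error)
{
	const unsigned char *p = buf;
	size_t off = 0, step;
	struct ex_nlmsghdr hdr;
	int32_t code;

	assert(buf != NULL && error != NULL);
	while (len - off >= sizeof(hdr)) {
		/* Copy the header out to avoid unaligned access. */
		memcpy(&hdr, p + off, sizeof(hdr));
		if (hdr.nlmsg_len < sizeof(hdr) || hdr.nlmsg_len > len - off)
			return (-1);
		if (hdr.nlmsg_type == EX_NLMSG_ERROR && hdr.nlmsg_seq == seq) {
			if (hdr.nlmsg_len < sizeof(hdr) + sizeof(code))
				return (-1);
			/* struct nlmsgerr starts with the int error field. */
			memcpy(&code, p + off + sizeof(hdr), sizeof(code));
			*error = code;
			return (1);
		}
		/* The next message starts at the padded length. */
		step = ex_align4(hdr.nlmsg_len);
		if (step > len - off)
			break;
		off += step;
	}
	return (0);
}
```

The same loop shape applies to dumps: an application would keep reading
until it sees NLMSG_DONE instead of stopping at the first matching error
message.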
diff --git a/share/man/man4/rtnetlink.4 b/share/man/man4/rtnetlink.4
new file mode 100644
index 000000000000..9f20671719f0
--- /dev/null
+++ b/share/man/man4/rtnetlink.4
@@ -0,0 +1,519 @@
+.\"
+.\" Copyright (C) 2022 Alexander Chernikov <melifaro@FreeBSD.org>.
+.\"
+.\" Redistribution and use in source and binary forms, with or without
+.\" modification, are permitted provided that the following conditions
+.\" are met:
+.\" 1. Redistributions of source code must retain the above copyright
+.\"    notice, this list of conditions and the following disclaimer.
+.\" 2. Redistributions in binary form must reproduce the above copyright
+.\"    notice, this list of conditions and the following disclaimer in the
+.\"    documentation and/or other materials provided with the distribution.
+.\"
+.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+.\" SUCH DAMAGE.
+.\"
+.\" $FreeBSD$
+.\"
+.Dd November 1, 2022
+.Dt RTNETLINK 4
+.Os
+.Sh NAME
+.Nm RTNetlink
+.Nd Network configuration-specific Netlink family
+.Sh SYNOPSIS
+.In netlink/netlink.h
+.In netlink/netlink_route.h
+.Ft int
+.Fn socket AF_NETLINK SOCK_DGRAM NETLINK_ROUTE
+.Sh DESCRIPTION
+The
+.Dv NETLINK_ROUTE
+family aims to be the primary configuration mechanism for all
+network-related tasks.
+Currently it supports configuring interfaces, interface addresses, routes,
+nexthops and ARP/NDP neighbors.
+.Sh ROUTES
+All route configuration messages share the common header:
+.Bd -literal
+struct rtmsg {
+	unsigned char	rtm_family;	/* address family */
+	unsigned char	rtm_dst_len;	/* Prefix length */
+	unsigned char	rtm_src_len;	/* Deprecated, set to 0 */
+	unsigned char	rtm_tos;	/* Type of service (not used) */
+	unsigned char	rtm_table;	/* deprecated, set to 0 */
+	unsigned char	rtm_protocol;	/* Routing protocol id (RTPROT_) */
+	unsigned char	rtm_scope;	/* Route distance (RT_SCOPE_) */
+	unsigned char	rtm_type;	/* Route type (RTN_) */
+	unsigned 	rtm_flags;	/* Route flags (not supported) */
+};
+.Ed
+.Pp
+The
+.Va rtm_family
+specifies the route family to be operated on.
+Currently,
+.Dv AF_INET6
+and
+.Dv AF_INET
+are the only supported families.
+The route prefix length is stored in the
+.Va rtm_dst_len
+field.
+The caller should set the originator identity (one of the
+.Dv RTPROT_
+values) in the
+.Va rtm_protocol
+field.
+It is useful for users and for the application itself, allowing for easy
+identification of self-originated routes.
+The route scope has to be set via the
+.Va rtm_scope
+field.
+The supported values are:
+.Bd -literal -offset indent -compact
+RT_SCOPE_UNIVERSE	Global scope
+RT_SCOPE_LINK		Link scope
+.Ed
+.Pp
+Route type needs to be set.
+The defined values are:
+.Bd -literal -offset indent -compact
+RTN_UNICAST	Unicast route
+RTN_MULTICAST	Multicast route
+RTN_BLACKHOLE	Drops traffic towards destination
+RTN_PROHIBIT	Drops traffic and sends reject
+.Ed
+.Pp
+The following messages are supported:
+.Ss RTM_NEWROUTE
+Adds a new route.
+All NL flags are supported.
+Extending a multipath route requires the NLM_F_APPEND flag.
+.Ss RTM_DELROUTE
+Tries to delete a route.
+The route is specified using a combination of
+.Dv RTA_DST
+TLV and
+.Va rtm_dst_len .
+.Ss RTM_GETROUTE
+Fetches a single route or all routes in the current VNET, depending on the
+.Dv NLM_F_DUMP
+flag.
+Each route is reported as a
+.Dv RTM_NEWROUTE
+message.
+The following filters are recognised by the kernel:
+.Pp
+.Bd -literal -offset indent -compact
+rtm_family	required family or AF_UNSPEC
+RTA_TABLE	fib number or RT_TABLE_UNSPEC to return all fibs
+.Ed
+.Ss TLVs
+.Bl -tag -width indent
+.It Dv RTA_DST
+(binary) IPv4/IPv6 address, depending on the
+.Va rtm_family .
+.It Dv RTA_OIF
+(uint32_t) transmit interface index.
+.It Dv RTA_GATEWAY
+(binary) IPv4/IPv6 gateway address, depending on the
+.Va rtm_family .
+.It Dv RTA_METRICS
+(nested) Container attribute, listing route properties.
+The only supported sub-attribute, storing the path MTU as uint32_t, is
+.Dv RTAX_MTU .
+.It Dv RTA_MULTIPATH
+This attribute contains multipath route nexthops with their weights.
+These nexthops are represented as a sequence of
+.Va rtnexthop
+structures, each followed by
+.Dv RTA_GATEWAY
+or
+.Dv RTA_VIA
+attributes.
+.Bd -literal
+struct rtnexthop {
+	unsigned short		rtnh_len;
+	unsigned char		rtnh_flags;
+	unsigned char		rtnh_hops;	/* nexthop weight */
+	int			rtnh_ifindex;
+};
+.Ed
+.Pp
+The
+.Va rtnh_len
+field specifies the total nexthop info length, including both
+.Va struct rtnexthop
+and the following TLVs.
+The
+.Va rtnh_hops
+field stores relative nexthop weight, used for load balancing between group
+members.
+The
+.Va rtnh_ifindex
+field contains the index of the transmit interface.
+.Pp
+The following TLVs can follow the structure:
+.Bd -literal -offset indent -compact
+RTA_GATEWAY	IPv4/IPv6 nexthop address of the gateway
+RTA_VIA		IPv6 nexthop address for IPv4 route
+RTA_KNH_ID	Kernel-specific index of the nexthop
+.Ed
+.It Dv RTA_KNH_ID
+(uint32_t) (FreeBSD-specific) Auto-allocated kernel index of the nexthop.
+.It Dv RTA_RTFLAGS
+(uint32_t) (FreeBSD-specific) rtsock route flags.
+.It Dv RTA_TABLE
+(uint32_t) Fib number of the route.
+Default route table is
+.Dv RT_TABLE_MAIN .
+To explicitly specify "all tables" one needs to set the value to
+.Dv RT_TABLE_UNSPEC .
+.It Dv RTA_EXPIRES
+(uint32_t) seconds till path expiration.
+.It Dv RTA_NH_ID
+(uint32_t) userland nexthop or nexthop group index.
+.El
+.Ss Groups
+The following groups are defined:
+.Bd -literal -offset indent -compact
+RTNLGRP_IPV4_ROUTE	Notifies on IPv4 route arrival/removal/change
+RTNLGRP_IPV6_ROUTE	Notifies on IPv6 route arrival/removal/change
+.Ed
+.Sh NEXTHOPS
+All nexthop/nexthop group configuration messages share the common header:
+.Bd -literal
+struct nhmsg {
+	unsigned char	nh_family;	/* transport family */
+	unsigned char	nh_scope;	/* ignored on RX, filled by kernel */
+	unsigned char	nh_protocol;	/* Routing protocol that installed nh */
+	unsigned char	resvd;
+	unsigned int	nh_flags;	/* RTNH_F_* flags from route.h */
+};
+.Ed
+The
+.Va nh_family
+specifies the gateway address family.
+It can be different from the route address family for IPv4 routes with IPv6
+nexthops.
+The
+.Va nh_protocol
+is similar to the
+.Va rtm_protocol
+field, which designates the originator application identity.
+.Pp
+The following messages are supported:
+.Ss RTM_NEWNEXTHOP
+Creates a new nexthop or nexthop group.
+.Ss RTM_DELNEXTHOP
+Deletes a nexthop or nexthop group.
+The required object is specified by the
+.Dv RTA_NH_ID
+attribute.
+.Ss RTM_GETNEXTHOP
+Fetches a single nexthop or all nexthops/nexthop groups, depending on the
+.Dv NLM_F_DUMP
+flag.
+The following filters are recognised by the kernel:
+.Pp
+.Bd -literal -offset indent -compact
+RTA_NH_ID	nexthop or nexthop group id
+NHA_GROUPS	match only nexthop groups
+.Ed
+.Ss TLVs
+.Bl -tag -width indent
+.It Dv RTA_NH_ID
+(uint32_t) Nexthop index used to identify a particular nexthop or nexthop group.
+Should be provided by userland at the nexthop creation time.
+.It Dv NHA_GROUP
+This attribute designates the nexthop group and contains all of its nexthops
+and their relative weights.
+The attribute consists of a list of
+.Va nexthop_grp
+structures:
+.Bd -literal
+struct nexthop_grp {
+	uint32_t	id;		/* nexthop userland index */
+	uint8_t		weight;		/* weight of this nexthop */
+	uint8_t		resvd1;
+	uint16_t	resvd2;
+};
+.Ed
+.It Dv NHA_GROUP_TYPE
+(uint16_t) Nexthop group type, set to one of the following types:
+.Bd -literal -offset indent -compact
+NEXTHOP_GRP_TYPE_MPATH	default multipath group
+.Ed
+.It Dv NHA_BLACKHOLE
+(flag) Marks the nexthop as blackhole.
+.It Dv NHA_OIF
+(uint32_t) Transmit interface index of the nexthop.
+.It Dv NHA_GATEWAY
+(binary) IPv4/IPv6 gateway address.
+.It Dv NHA_GROUPS
+(flag) Matches nexthop groups during dump.
+.El
+.Ss Groups
+The following groups are defined:
+.Bd -literal -offset indent -compact
+RTNLGRP_NEXTHOP		Notifies on nexthop/groups arrival/removal/change
+.Ed
+.Sh INTERFACES
+All interface configuration messages share the common header:
+.Bd -literal
+struct ifinfomsg {
+	unsigned char	ifi_family;	/* not used, set to 0 */
+	unsigned char	__ifi_pad;
+	unsigned short	ifi_type;	/* ARPHRD_* */
+	int		ifi_index;	/* Interface index */
+	unsigned	ifi_flags;	/* IFF_* flags */
+	unsigned	ifi_change;	/* IFF_* change mask */
+};
+.Ed
+.Ss RTM_NEWLINK
+Creates a new interface.
+The only mandatory TLV is
+.Dv IFLA_IFNAME .
+.Ss RTM_DELLINK
+Deletes the interface specified by
+.Dv IFLA_IFNAME .
+.Ss RTM_GETLINK
+Fetches a single interface or all interfaces in the current VNET, depending on the
+.Dv NLM_F_DUMP
+flag.
+Each interface is reported as a
+.Dv RTM_NEWLINK
+message.
+The following filters are recognised by the kernel:
+.Pp
+.Bd -literal -offset indent -compact
+ifi_index	interface index
+IFLA_IFNAME	interface name
+IFLA_ALT_IFNAME	interface name
+.Ed
+.Ss TLVs
+.Bl -tag -width indent
+.It Dv IFLA_ADDRESS
+(binary) Link-level interface address (MAC).
+.It Dv IFLA_BROADCAST
+(binary) (readonly) Link-level broadcast address.
+.It Dv IFLA_IFNAME
+(string) New interface name.
+.It Dv IFLA_LINK
+(uint32_t) (readonly) Interface index.
+.It Dv IFLA_MASTER
+(uint32_t) Parent interface index.
+.It Dv IFLA_LINKINFO
+(nested) Interface type-specific attributes:
+.Bd -literal -offset indent -compact
+IFLA_INFO_KIND		(string) interface type ("vlan")
+IFLA_INFO_DATA		(nested) custom attributes
+.Ed
+The following types and attributes are supported:
+.Bl -tag -width indent
+.It Dv vlan
+.Bd -literal -offset indent -compact
+IFLA_VLAN_ID		(uint16_t) 802.1Q vlan id
+IFLA_VLAN_PROTOCOL	(uint16_t) Protocol: ETHERTYPE_VLAN or ETHERTYPE_QINQ
+.Ed
+.El
+.It Dv IFLA_OPERSTATE
+(uint8_t) Interface operational state per RFC 2863.
+Can be one of the following:
+.Bd -literal -offset indent -compact
+IF_OPER_UNKNOWN		status can not be determined
+IF_OPER_NOTPRESENT	some (hardware) component not present
+IF_OPER_DOWN		down
+IF_OPER_LOWERLAYERDOWN	some lower-level interface is down
+IF_OPER_TESTING		in some test mode
+IF_OPER_DORMANT		"up" but waiting for some condition (802.1X)
+IF_OPER_UP		ready to pass packets
+.Ed
+.It Dv IFLA_STATS64
+(readonly) Consists of the following 64-bit counters structure:
+.Bd -literal
+struct rtnl_link_stats64 {
+	uint64_t rx_packets;	/* total RX packets (IFCOUNTER_IPACKETS) */
+	uint64_t tx_packets;	/* total TX packets (IFCOUNTER_OPACKETS) */
+	uint64_t rx_bytes;	/* total RX bytes (IFCOUNTER_IBYTES) */
+	uint64_t tx_bytes;	/* total TX bytes (IFCOUNTER_OBYTES) */
+	uint64_t rx_errors;	/* RX errors (IFCOUNTER_IERRORS) */
+	uint64_t tx_errors;	/* TX errors (IFCOUNTER_OERRORS) */
+	uint64_t rx_dropped;	/* RX drop (no space in ring/no bufs) (IFCOUNTER_IQDROPS) */
+	uint64_t tx_dropped;	/* TX drop (IFCOUNTER_OQDROPS) */
+	uint64_t multicast;	/* RX multicast packets (IFCOUNTER_IMCASTS) */
+	uint64_t collisions;	/* not supported */
+	uint64_t rx_length_errors;	/* not supported */
+	uint64_t rx_over_errors;	/* not supported */
+	uint64_t rx_crc_errors;		/* not supported */
+	uint64_t rx_frame_errors;	/* not supported */
+	uint64_t rx_fifo_errors;	/* not supported */
+	uint64_t rx_missed_errors;	/* not supported */
+	uint64_t tx_aborted_errors;	/* not supported */
+	uint64_t tx_carrier_errors;	/* not supported */
+	uint64_t tx_fifo_errors;	/* not supported */
+	uint64_t tx_heartbeat_errors;	/* not supported */
+	uint64_t tx_window_errors;	/* not supported */
+	uint64_t rx_compressed;		/* not supported */
+	uint64_t tx_compressed;		/* not supported */
+	uint64_t rx_nohandler;	/* dropped due to no proto handler (IFCOUNTER_NOPROTO) */
+};
+.Ed
+.El
+.Ss Groups
+The following groups are defined:
+.Bd -literal -offset indent -compact
+RTNLGRP_LINK		Notifies on interface arrival/removal/change
+.Ed
+.Sh INTERFACE ADDRESSES
+All interface address configuration messages share the common header:
+.Bd -literal
+struct ifaddrmsg {
+	uint8_t		ifa_family;	/* Address family */
+	uint8_t		ifa_prefixlen;	/* Prefix length */
+	uint8_t		ifa_flags;	/* Address-specific flags */
+	uint8_t		ifa_scope;	/* Address scope */
+	uint32_t	ifa_index;	/* Link ifindex */
+};
+.Ed
+.Pp
+The
+.Va ifa_family
+specifies the address family of the interface address.
+The
+.Va ifa_prefixlen
+specifies the prefix length if applicable for the address family.
+The
+.Va ifa_index
+specifies the interface index of the target interface.
+.Ss RTM_NEWADDR
+Not supported.
+.Ss RTM_DELADDR
+Not supported.
+.Ss RTM_GETADDR
+.Ss TLVs
+.Bl -tag -width indent
+.It Dv IFA_ADDRESS
+(binary) masked interface address or destination address for p2p interfaces.
+.It Dv IFA_LOCAL
+(binary) local interface address.
+.It Dv IFA_BROADCAST
+(binary) broadcast interface address.
+.El
+.Ss Groups
+The following groups are defined:
+.Bd -literal -offset indent -compact
+RTNLGRP_IPV4_IFADDR	Notifies on IPv4 ifaddr arrival/removal/change
+RTNLGRP_IPV6_IFADDR	Notifies on IPv6 ifaddr arrival/removal/change
+.Ed
+.Sh NEIGHBORS
+All neighbor configuration messages share the common header:
+.Bd -literal
+struct ndmsg {
+	uint8_t		ndm_family;
+	uint8_t		ndm_pad1;
+	uint16_t	ndm_pad2;
+	int32_t		ndm_ifindex;
+	uint16_t	ndm_state;
+	uint8_t		ndm_flags;
+	uint8_t		ndm_type;
+};
+.Ed
+.Pp
+The
+.Va ndm_family
+field specifies the address family (IPv4 or IPv6) of the neighbor.
+The
+.Va ndm_ifindex
+specifies the interface to operate on.
+The
+.Va ndm_state
+represents the entry state according to the neighbor model.
+The state can be one of the following:
+.Bd -literal -offset indent -compact
+NUD_INCOMPLETE		No lladdr, address resolution in progress
+NUD_REACHABLE		reachable & recently resolved
+NUD_STALE		has lladdr but it's stale
+NUD_DELAY		has lladdr, is stale, probes delayed
+NUD_PROBE		has lladdr, is stale, probes sent
+NUD_FAILED		unused
+.Ed
+.Pp
*** 68 LINES SKIPPED ***
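
The rtnetlink(4) attributes listed in the diff above (RTA_DST, RTA_OIF,
IFLA_IFNAME and so on) are plain nlattr TLVs that follow the family header.
As an illustration only (not part of the committed man page), the C sketch
below searches a TLV region for the first attribute of a given type. The
ex_ names are local stand-ins for struct nlattr from netlink(4); the
attribute type is supplied by the caller, who would take it from
netlink/netlink_route.h, so no numeric attribute values are assumed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-in for the TLV header shown in netlink(4). */
struct ex_nlattr {
	uint16_t nla_len;	/* header plus payload, without padding */
	uint16_t nla_type;
};

/* Round a length up to the next 4-byte boundary. */
static size_t
ex_align4(size_t n)
{
	return ((n + 3) & ~(size_t)3);
}

/*
 * Find the first attribute of type "type" in the "len" bytes at "buf".
 * Returns 1 and fills *payload and *payload_len when found,
 * 0 when the type is absent, -1 when the TLV region is malformed.
 */
int
ex_find_attr(const void *buf, size_t len, uint16_t type,
    const void **payload, size_t *payload_len)
{
	const unsigned char *p = buf;
	size_t off = 0, step;
	struct ex_nlattr attr;

	assert(buf != NULL && payload != NULL && payload_len != NULL);
	while (len - off >= sizeof(attr)) {
		/* Copy the header out to avoid unaligned access. */
		memcpy(&attr, p + off, sizeof(attr));
		if (attr.nla_len < sizeof(attr) || attr.nla_len > len - off)
			return (-1);
		if (attr.nla_type == type) {
			*payload = p + off + sizeof(attr);
			*payload_len = attr.nla_len - sizeof(attr);
			return (1);
		}
		/* The next TLV starts at the padded length. */
		step = ex_align4(attr.nla_len);
		if (step > len - off)
			break;
		off += step;
	}
	return (0);
}
```

For a nested attribute such as RTA_METRICS, the same function could be
called again on the returned payload to look up the inner TLVs.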