svn commit: r273331 - in head: sbin/ifconfig share/man/man4 sys/conf sys/modules sys/modules/if_vxlan sys/net sys/sys
Adrian Chadd
adrian at freebsd.org
Mon Oct 20 16:57:37 UTC 2014
Hi,
Can you please create a PR that says something like "review vxlan code
for RSS after de-capsulation" and assign it to me?
I'm going to have to insert a hash recalculation after decapsulation
but I'm too busy at the moment to do it.
Thanks,
-a
On 20 October 2014 07:42, Bryan Venteicher <bryanv at freebsd.org> wrote:
> Author: bryanv
> Date: Mon Oct 20 14:42:42 2014
> New Revision: 273331
> URL: https://svnweb.freebsd.org/changeset/base/273331
>
> Log:
> Add vxlan interface
>
> vxlan creates a virtual LAN by encapsulating the inner Ethernet frame in
> a UDP packet. This implementation is based on RFC7348.
>
> Currently, the IPv6 support is not fully compliant with the specification:
> we should be able to receive UDPv6 packets with a zero checksum, but we
> need to support RFC6935 first. Patches for this should come soon.
>
> Encapsulation protocols such as vxlan emphasize the need for the FreeBSD
> network stack to support batching, GRO, and GSO. Each frame has to make
> two trips through the network stack, and each frame will be at most MTU
> sized. Performance suffers accordingly.
>
> Some latest generation NICs have begun to support vxlan HW offloads that
> we should also take advantage of. VIMAGE support should also be added soon.
>
> Differential Revision: https://reviews.freebsd.org/D384
> Reviewed by: gnn
> Relnotes: yes
>
> Added:
> head/sbin/ifconfig/ifvxlan.c (contents, props changed)
> head/share/man/man4/vxlan.4 (contents, props changed)
> head/sys/modules/if_vxlan/
> head/sys/modules/if_vxlan/Makefile (contents, props changed)
> head/sys/net/if_vxlan.c (contents, props changed)
> head/sys/net/if_vxlan.h (contents, props changed)
> Modified:
> head/sbin/ifconfig/Makefile
> head/sbin/ifconfig/ifconfig.8
> head/share/man/man4/Makefile
> head/sys/conf/NOTES
> head/sys/conf/files
> head/sys/modules/Makefile
> head/sys/sys/priv.h
>
> Modified: head/sbin/ifconfig/Makefile
> ==============================================================================
> --- head/sbin/ifconfig/Makefile Mon Oct 20 14:25:23 2014 (r273330)
> +++ head/sbin/ifconfig/Makefile Mon Oct 20 14:42:42 2014 (r273331)
> @@ -30,6 +30,7 @@ SRCS+= ifmac.c # MAC support
> SRCS+= ifmedia.c # SIOC[GS]IFMEDIA support
> SRCS+= iffib.c # non-default FIB support
> SRCS+= ifvlan.c # SIOC[GS]ETVLAN support
> +SRCS+= ifvxlan.c # VXLAN support
> SRCS+= ifgre.c # GRE keys etc
> SRCS+= ifgif.c # GIF reversed header workaround
>
>
> Modified: head/sbin/ifconfig/ifconfig.8
> ==============================================================================
> --- head/sbin/ifconfig/ifconfig.8 Mon Oct 20 14:25:23 2014 (r273330)
> +++ head/sbin/ifconfig/ifconfig.8 Mon Oct 20 14:42:42 2014 (r273331)
> @@ -28,7 +28,7 @@
> .\" From: @(#)ifconfig.8 8.3 (Berkeley) 1/5/94
> .\" $FreeBSD$
> .\"
> -.Dd October 1, 2014
> +.Dd October 20, 2014
> .Dt IFCONFIG 8
> .Os
> .Sh NAME
> @@ -2541,6 +2541,76 @@ argument is useless and hence deprecated
> .El
> .Pp
> The following parameters are used to configure
> +.Xr vxlan 4
> +interfaces.
> +.Bl -tag -width indent
> +.It Cm vni Ar identifier
> +This value is a 24-bit VXLAN Network Identifier (VNI) that identifies the
> +virtual network segment membership of the interface.
> +.It Cm local Ar address
> +The source address used in the encapsulating IPv4/IPv6 header.
> +The address should already be assigned to an existing interface.
> +When the interface is configured in unicast mode, the listening socket
> +is bound to this address.
> +.It Cm remote Ar address
> +The interface can be configured in a unicast, or point-to-point, mode
> +to create a tunnel between two hosts.
> +This is the IP address of the remote end of the tunnel.
> +.It Cm group Ar address
> +The interface can be configured in a multicast mode
> +to create a virtual network of hosts.
> +This is the IP multicast group address the interface will join.
> +.It Cm localport Ar port
> +The port number the interface will listen on.
> +The default port number is 4789.
> +.It Cm remoteport Ar port
> +The destination port number used in the encapsulating IPv4/IPv6 header.
> +The remote host should be listening on this port.
> +The default port number is 4789.
> +Note some other implementations, such as Linux,
> +do not default to the IANA assigned port,
> +but instead listen on port 8472.
> +.It Cm portrange Ar low high
> +The range of source ports used in the encapsulating IPv4/IPv6 header.
> +The port selected within the range is based on a hash of the inner frame.
> +A range is useful to provide entropy within the outer IP header
> +for more effective load balancing.
> +The default range is between the
> +.Xr sysctl 8
> +variables
> +.Va net.inet.ip.portrange.first
> +and
> +.Va net.inet.ip.portrange.last .
> +.It Cm timeout Ar timeout
> +The maximum time, in seconds, before an entry in the forwarding table
> +is pruned.
> +The default is 1200 seconds (20 minutes).
> +.It Cm maxaddr Ar max
> +The maximum number of entries in the forwarding table.
> +The default is 2000.
> +.It Cm vxlandev Ar dev
> +When the interface is configured in multicast mode, the
> +.Cm dev
> +interface is used to transmit IP multicast packets.
> +.It Cm ttl Ar ttl
> +The TTL used in the encapsulating IPv4/IPv6 header.
> +The default is 64.
> +.It Cm learn
> +The source IP address and inner source Ethernet MAC address of
> +received packets are used to dynamically populate the forwarding table.
> +When in multicast mode, an entry in the forwarding table allows the
> +interface to send the frame directly to the remote host instead of
> +broadcasting the frame to the multicast group.
> +This is the default.
> +.It Fl learn
> +The forwarding table is not populated by received packets.
> +.It Cm flush
> +Delete all dynamically-learned addresses from the forwarding table.
> +.It Cm flushall
> +Delete all addresses, including static addresses, from the forwarding table.
> +.El
> +.Pp
> +The following parameters are used to configure
> .Xr carp 4
> protocol on an interface:
> .Bl -tag -width indent
> @@ -2745,6 +2815,7 @@ tried to alter an interface's configurat
> .Xr pfsync 4 ,
> .Xr polling 4 ,
> .Xr vlan 4 ,
> +.Xr vxlan 4 ,
> .Xr devd.conf 5 ,
> .\" .Xr eon 5 ,
> .Xr devd 8 ,
>
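A minimal sketch of the portrange behaviour described in the ifconfig.8
text above: a hash of the inner frame is mapped onto the configured source
port range, so that different inner flows tend to get different outer UDP
source ports. This is not taken from if_vxlan.c (that file is not shown in
this mail); the names frame_hash() and pick_src_port() and the choice of
FNV-1a as the hash are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* FNV-1a over the inner frame bytes (illustrative hash, not the driver's). */
static uint32_t
frame_hash(const uint8_t *buf, size_t len)
{
	uint32_t h = 2166136261u;
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= buf[i];
		h *= 16777619u;
	}
	return (h);
}

/* Map a hash onto the inclusive port range [lo, hi]. */
static uint16_t
pick_src_port(uint32_t hash, uint16_t lo, uint16_t hi)
{
	uint32_t span;

	assert(lo <= hi);
	span = (uint32_t)hi - lo + 1;
	return ((uint16_t)(lo + hash % span));
}
```

With a range of 10 to 12, a hash value of 5 selects port 12, since
5 % 3 = 2. Frames of the same inner flow hash to the same port, which keeps
a flow on one path while still spreading flows across the range.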
> Added: head/sbin/ifconfig/ifvxlan.c
> ==============================================================================
> --- /dev/null 00:00:00 1970 (empty, because file is newly added)
> +++ head/sbin/ifconfig/ifvxlan.c Mon Oct 20 14:42:42 2014 (r273331)
> @@ -0,0 +1,648 @@
> +/*-
> + * Copyright (c) 2014, Bryan Venteicher <bryanv at FreeBSD.org>
> + * All rights reserved.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + * 1. Redistributions of source code must retain the above copyright
> + * notice unmodified, this list of conditions, and the following
> + * disclaimer.
> + * 2. Redistributions in binary form must reproduce the above copyright
> + * notice, this list of conditions and the following disclaimer in the
> + * documentation and/or other materials provided with the distribution.
> + *
> + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
> + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
> + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
> + * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
> + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
> + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
> + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#include <sys/cdefs.h>
> +__FBSDID("$FreeBSD$");
> +
> +#include <sys/param.h>
> +#include <sys/ioctl.h>
> +#include <sys/socket.h>
> +#include <sys/sockio.h>
> +
> +#include <stdlib.h>
> +#include <stdint.h>
> +#include <unistd.h>
> +#include <netdb.h>
> +
> +#include <net/ethernet.h>
> +#include <net/if.h>
> +#include <net/if_var.h>
> +#include <net/if_vxlan.h>
> +#include <net/route.h>
> +#include <netinet/in.h>
> +
> +#include <ctype.h>
> +#include <stdio.h>
> +#include <string.h>
> +#include <stdlib.h>
> +#include <unistd.h>
> +#include <err.h>
> +#include <errno.h>
> +
> +#include "ifconfig.h"
> +
> +static struct ifvxlanparam params = {
> + .vxlp_vni = VXLAN_VNI_MAX,
> +};
> +
> +static int
> +get_val(const char *cp, u_long *valp)
> +{
> + char *endptr;
> + u_long val;
> +
> + errno = 0;
> + val = strtoul(cp, &endptr, 0);
> + if (cp[0] == '\0' || endptr[0] != '\0' || errno == ERANGE)
> + return (-1);
> +
> + *valp = val;
> + return (0);
> +}
> +
> +static int
> +do_cmd(int sock, u_long op, void *arg, size_t argsize, int set)
> +{
> + struct ifdrv ifd;
> +
> + bzero(&ifd, sizeof(ifd));
> +
> + strlcpy(ifd.ifd_name, ifr.ifr_name, sizeof(ifd.ifd_name));
> + ifd.ifd_cmd = op;
> + ifd.ifd_len = argsize;
> + ifd.ifd_data = arg;
> +
> + return (ioctl(sock, set ? SIOCSDRVSPEC : SIOCGDRVSPEC, &ifd));
> +}
> +
> +static int
> +vxlan_exists(int sock)
> +{
> + struct ifvxlancfg cfg;
> +
> + bzero(&cfg, sizeof(cfg));
> +
> + return (do_cmd(sock, VXLAN_CMD_GET_CONFIG, &cfg, sizeof(cfg), 0) != -1);
> +}
> +
> +static void
> +vxlan_status(int s)
> +{
> + struct ifvxlancfg cfg;
> + char src[NI_MAXHOST], dst[NI_MAXHOST];
> + char srcport[NI_MAXSERV], dstport[NI_MAXSERV];
> + struct sockaddr *lsa, *rsa;
> + int vni, mc, ipv6;
> +
> + bzero(&cfg, sizeof(cfg));
> +
> + if (do_cmd(s, VXLAN_CMD_GET_CONFIG, &cfg, sizeof(cfg), 0) < 0)
> + return;
> +
> + vni = cfg.vxlc_vni;
> + lsa = &cfg.vxlc_local_sa.sa;
> + rsa = &cfg.vxlc_remote_sa.sa;
> + ipv6 = rsa->sa_family == AF_INET6;
> +
> + /* Just report nothing if the network identity isn't set yet. */
> + if (vni >= VXLAN_VNI_MAX)
> + return;
> +
> + if (getnameinfo(lsa, lsa->sa_len, src, sizeof(src),
> + srcport, sizeof(srcport), NI_NUMERICHOST | NI_NUMERICSERV) != 0)
> + src[0] = srcport[0] = '\0';
> + if (getnameinfo(rsa, rsa->sa_len, dst, sizeof(dst),
> + dstport, sizeof(dstport), NI_NUMERICHOST | NI_NUMERICSERV) != 0)
> + dst[0] = dstport[0] = '\0';
> +
> + if (!ipv6) {
> + struct sockaddr_in *sin = (struct sockaddr_in *)rsa;
> + mc = IN_MULTICAST(ntohl(sin->sin_addr.s_addr));
> + } else {
> + struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)rsa;
> + mc = IN6_IS_ADDR_MULTICAST(&sin6->sin6_addr);
> + }
> +
> + printf("\tvxlan vni %d", vni);
> + printf(" local %s%s%s:%s", ipv6 ? "[" : "", src, ipv6 ? "]" : "",
> + srcport);
> + printf(" %s %s%s%s:%s", mc ? "group" : "remote", ipv6 ? "[" : "",
> + dst, ipv6 ? "]" : "", dstport);
> +
> + if (verbose) {
> + printf("\n\t\tconfig: ");
> + printf("%slearning portrange %d-%d ttl %d",
> + cfg.vxlc_learn ? "" : "no", cfg.vxlc_port_min,
> + cfg.vxlc_port_max, cfg.vxlc_ttl);
> + printf("\n\t\tftable: ");
> + printf("cnt %d max %d timeout %d",
> + cfg.vxlc_ftable_cnt, cfg.vxlc_ftable_max,
> + cfg.vxlc_ftable_timeout);
> + }
> +
> + putchar('\n');
> +}
> +
> +#define _LOCAL_ADDR46 \
> + (VXLAN_PARAM_WITH_LOCAL_ADDR4 | VXLAN_PARAM_WITH_LOCAL_ADDR6)
> +#define _REMOTE_ADDR46 \
> + (VXLAN_PARAM_WITH_REMOTE_ADDR4 | VXLAN_PARAM_WITH_REMOTE_ADDR6)
> +
> +static void
> +vxlan_check_params(void)
> +{
> +
> + if ((params.vxlp_with & _LOCAL_ADDR46) == _LOCAL_ADDR46)
> + errx(1, "cannot specify both local IPv4 and IPv6 addresses");
> + if ((params.vxlp_with & _REMOTE_ADDR46) == _REMOTE_ADDR46)
> + errx(1, "cannot specify both remote IPv4 and IPv6 addresses");
> + if ((params.vxlp_with & VXLAN_PARAM_WITH_LOCAL_ADDR4 &&
> + params.vxlp_with & VXLAN_PARAM_WITH_REMOTE_ADDR6) ||
> + (params.vxlp_with & VXLAN_PARAM_WITH_LOCAL_ADDR6 &&
> + params.vxlp_with & VXLAN_PARAM_WITH_REMOTE_ADDR4))
> + errx(1, "cannot mix IPv4 and IPv6 addresses");
> +}
> +
> +#undef _LOCAL_ADDR46
> +#undef _REMOTE_ADDR46
> +
> +static void
> +vxlan_cb(int s, void *arg)
> +{
> +
> +}
> +
> +static void
> +vxlan_create(int s, struct ifreq *ifr)
> +{
> +
> + vxlan_check_params();
> +
> + ifr->ifr_data = (caddr_t) &params;
> + if (ioctl(s, SIOCIFCREATE2, ifr) < 0)
> + err(1, "SIOCIFCREATE2");
> +}
> +
> +static
> +DECL_CMD_FUNC(setvxlan_vni, arg, d)
> +{
> + struct ifvxlancmd cmd;
> + u_long val;
> +
> + if (get_val(arg, &val) < 0 || val >= VXLAN_VNI_MAX)
> + errx(1, "invalid network identifier: %s", arg);
> +
> + if (!vxlan_exists(s)) {
> + params.vxlp_with |= VXLAN_PARAM_WITH_VNI;
> + params.vxlp_vni = val;
> + return;
> + }
> +
> + bzero(&cmd, sizeof(cmd));
> + cmd.vxlcmd_vni = val;
> +
> + if (do_cmd(s, VXLAN_CMD_SET_VNI, &cmd, sizeof(cmd), 1) < 0)
> + err(1, "VXLAN_CMD_SET_VNI");
> +}
> +
> +static
> +DECL_CMD_FUNC(setvxlan_local, addr, d)
> +{
> + struct ifvxlancmd cmd;
> + struct addrinfo *ai;
> + struct sockaddr *sa;
> + int error;
> +
> + bzero(&cmd, sizeof(cmd));
> +
> + if ((error = getaddrinfo(addr, NULL, NULL, &ai)) != 0)
> + errx(1, "error in parsing local address string: %s",
> + gai_strerror(error));
> +
> + sa = ai->ai_addr;
> +
> + switch (ai->ai_family) {
> +#ifdef INET
> + case AF_INET: {
> + struct in_addr addr = ((struct sockaddr_in *) sa)->sin_addr;
> +
> + if (IN_MULTICAST(ntohl(addr.s_addr)))
> + errx(1, "local address cannot be multicast");
> +
> + cmd.vxlcmd_sa.in4.sin_family = AF_INET;
> + cmd.vxlcmd_sa.in4.sin_addr = addr;
> + break;
> + }
> +#endif
> +#ifdef INET6
> + case AF_INET6: {
> + struct in6_addr *addr = &((struct sockaddr_in6 *)sa)->sin6_addr;
> +
> + if (IN6_IS_ADDR_MULTICAST(addr))
> + errx(1, "local address cannot be multicast");
> +
> + cmd.vxlcmd_sa.in6.sin6_family = AF_INET6;
> + cmd.vxlcmd_sa.in6.sin6_addr = *addr;
> + break;
> + }
> +#endif
> + default:
> + errx(1, "local address %s not supported", addr);
> + }
> +
> + freeaddrinfo(ai);
> +
> + if (!vxlan_exists(s)) {
> + if (cmd.vxlcmd_sa.sa.sa_family == AF_INET) {
> + params.vxlp_with |= VXLAN_PARAM_WITH_LOCAL_ADDR4;
> + params.vxlp_local_in4 = cmd.vxlcmd_sa.in4.sin_addr;
> + } else {
> + params.vxlp_with |= VXLAN_PARAM_WITH_LOCAL_ADDR6;
> + params.vxlp_local_in6 = cmd.vxlcmd_sa.in6.sin6_addr;
> + }
> + return;
> + }
> +
> + if (do_cmd(s, VXLAN_CMD_SET_LOCAL_ADDR, &cmd, sizeof(cmd), 1) < 0)
> + err(1, "VXLAN_CMD_SET_LOCAL_ADDR");
> +}
> +
> +static
> +DECL_CMD_FUNC(setvxlan_remote, addr, d)
> +{
> + struct ifvxlancmd cmd;
> + struct addrinfo *ai;
> + struct sockaddr *sa;
> + int error;
> +
> + bzero(&cmd, sizeof(cmd));
> +
> + if ((error = getaddrinfo(addr, NULL, NULL, &ai)) != 0)
> + errx(1, "error in parsing remote address string: %s",
> + gai_strerror(error));
> +
> + sa = ai->ai_addr;
> +
> + switch (ai->ai_family) {
> +#ifdef INET
> + case AF_INET: {
> + struct in_addr addr = ((struct sockaddr_in *)sa)->sin_addr;
> +
> + if (IN_MULTICAST(ntohl(addr.s_addr)))
> + errx(1, "remote address cannot be multicast");
> +
> + cmd.vxlcmd_sa.in4.sin_family = AF_INET;
> + cmd.vxlcmd_sa.in4.sin_addr = addr;
> + break;
> + }
> +#endif
> +#ifdef INET6
> + case AF_INET6: {
> + struct in6_addr *addr = &((struct sockaddr_in6 *)sa)->sin6_addr;
> +
> + if (IN6_IS_ADDR_MULTICAST(addr))
> + errx(1, "remote address cannot be multicast");
> +
> + cmd.vxlcmd_sa.in6.sin6_family = AF_INET6;
> + cmd.vxlcmd_sa.in6.sin6_addr = *addr;
> + break;
> + }
> +#endif
> + default:
> + errx(1, "remote address %s not supported", addr);
> + }
> +
> + freeaddrinfo(ai);
> +
> + if (!vxlan_exists(s)) {
> + if (cmd.vxlcmd_sa.sa.sa_family == AF_INET) {
> + params.vxlp_with |= VXLAN_PARAM_WITH_REMOTE_ADDR4;
> + params.vxlp_remote_in4 = cmd.vxlcmd_sa.in4.sin_addr;
> + } else {
> + params.vxlp_with |= VXLAN_PARAM_WITH_REMOTE_ADDR6;
> + params.vxlp_remote_in6 = cmd.vxlcmd_sa.in6.sin6_addr;
> + }
> + return;
> + }
> +
> + if (do_cmd(s, VXLAN_CMD_SET_REMOTE_ADDR, &cmd, sizeof(cmd), 1) < 0)
> + err(1, "VXLAN_CMD_SET_REMOTE_ADDR");
> +}
> +
> +static
> +DECL_CMD_FUNC(setvxlan_group, addr, d)
> +{
> + struct ifvxlancmd cmd;
> + struct addrinfo *ai;
> + struct sockaddr *sa;
> + int error;
> +
> + bzero(&cmd, sizeof(cmd));
> +
> + if ((error = getaddrinfo(addr, NULL, NULL, &ai)) != 0)
> + errx(1, "error in parsing group address string: %s",
> + gai_strerror(error));
> +
> + sa = ai->ai_addr;
> +
> + switch (ai->ai_family) {
> +#ifdef INET
> + case AF_INET: {
> + struct in_addr addr = ((struct sockaddr_in *)sa)->sin_addr;
> +
> + if (!IN_MULTICAST(ntohl(addr.s_addr)))
> + errx(1, "group address must be multicast");
> +
> + cmd.vxlcmd_sa.in4.sin_family = AF_INET;
> + cmd.vxlcmd_sa.in4.sin_addr = addr;
> + break;
> + }
> +#endif
> +#ifdef INET6
> + case AF_INET6: {
> + struct in6_addr *addr = &((struct sockaddr_in6 *)sa)->sin6_addr;
> +
> + if (!IN6_IS_ADDR_MULTICAST(addr))
> + errx(1, "group address must be multicast");
> +
> + cmd.vxlcmd_sa.in6.sin6_family = AF_INET6;
> + cmd.vxlcmd_sa.in6.sin6_addr = *addr;
> + break;
> + }
> +#endif
> + default:
> + errx(1, "group address %s not supported", addr);
> + }
> +
> + freeaddrinfo(ai);
> +
> + if (!vxlan_exists(s)) {
> + if (cmd.vxlcmd_sa.sa.sa_family == AF_INET) {
> + params.vxlp_with |= VXLAN_PARAM_WITH_REMOTE_ADDR4;
> + params.vxlp_remote_in4 = cmd.vxlcmd_sa.in4.sin_addr;
> + } else {
> + params.vxlp_with |= VXLAN_PARAM_WITH_REMOTE_ADDR6;
> + params.vxlp_remote_in6 = cmd.vxlcmd_sa.in6.sin6_addr;
> + }
> + return;
> + }
> +
> + if (do_cmd(s, VXLAN_CMD_SET_REMOTE_ADDR, &cmd, sizeof(cmd), 1) < 0)
> + err(1, "VXLAN_CMD_SET_REMOTE_ADDR");
> +}
> +
> +static
> +DECL_CMD_FUNC(setvxlan_local_port, arg, d)
> +{
> + struct ifvxlancmd cmd;
> + u_long val;
> +
> + if (get_val(arg, &val) < 0 || val >= UINT16_MAX)
> + errx(1, "invalid local port: %s", arg);
> +
> + if (!vxlan_exists(s)) {
> + params.vxlp_with |= VXLAN_PARAM_WITH_LOCAL_PORT;
> + params.vxlp_local_port = val;
> + return;
> + }
> +
> + bzero(&cmd, sizeof(cmd));
> + cmd.vxlcmd_port = val;
> +
> + if (do_cmd(s, VXLAN_CMD_SET_LOCAL_PORT, &cmd, sizeof(cmd), 1) < 0)
> + err(1, "VXLAN_CMD_SET_LOCAL_PORT");
> +}
> +
> +static
> +DECL_CMD_FUNC(setvxlan_remote_port, arg, d)
> +{
> + struct ifvxlancmd cmd;
> + u_long val;
> +
> + if (get_val(arg, &val) < 0 || val >= UINT16_MAX)
> + errx(1, "invalid remote port: %s", arg);
> +
> + if (!vxlan_exists(s)) {
> + params.vxlp_with |= VXLAN_PARAM_WITH_REMOTE_PORT;
> + params.vxlp_remote_port = val;
> + return;
> + }
> +
> + bzero(&cmd, sizeof(cmd));
> + cmd.vxlcmd_port = val;
> +
> + if (do_cmd(s, VXLAN_CMD_SET_REMOTE_PORT, &cmd, sizeof(cmd), 1) < 0)
> + err(1, "VXLAN_CMD_SET_REMOTE_PORT");
> +}
> +
> +static
> +DECL_CMD_FUNC2(setvxlan_port_range, arg1, arg2)
> +{
> + struct ifvxlancmd cmd;
> + u_long min, max;
> +
> + if (get_val(arg1, &min) < 0 || min >= UINT16_MAX)
> + errx(1, "invalid port range minimum: %s", arg1);
> + if (get_val(arg2, &max) < 0 || max >= UINT16_MAX)
> + errx(1, "invalid port range maximum: %s", arg2);
> + if (max < min)
> + errx(1, "invalid port range");
> +
> + if (!vxlan_exists(s)) {
> + params.vxlp_with |= VXLAN_PARAM_WITH_PORT_RANGE;
> + params.vxlp_min_port = min;
> + params.vxlp_max_port = max;
> + return;
> + }
> +
> + bzero(&cmd, sizeof(cmd));
> + cmd.vxlcmd_port_min = min;
> + cmd.vxlcmd_port_max = max;
> +
> + if (do_cmd(s, VXLAN_CMD_SET_PORT_RANGE, &cmd, sizeof(cmd), 1) < 0)
> + err(1, "VXLAN_CMD_SET_PORT_RANGE");
> +}
> +
> +static
> +DECL_CMD_FUNC(setvxlan_timeout, arg, d)
> +{
> + struct ifvxlancmd cmd;
> + u_long val;
> +
> + if (get_val(arg, &val) < 0 || (val & ~0xFFFFFFFF) != 0)
> + errx(1, "invalid timeout value: %s", arg);
> +
> + if (!vxlan_exists(s)) {
> + params.vxlp_with |= VXLAN_PARAM_WITH_FTABLE_TIMEOUT;
> + params.vxlp_ftable_timeout = val & 0xFFFFFFFF;
> + return;
> + }
> +
> + bzero(&cmd, sizeof(cmd));
> + cmd.vxlcmd_ftable_timeout = val & 0xFFFFFFFF;
> +
> + if (do_cmd(s, VXLAN_CMD_SET_FTABLE_TIMEOUT, &cmd, sizeof(cmd), 1) < 0)
> + err(1, "VXLAN_CMD_SET_FTABLE_TIMEOUT");
> +}
> +
> +static
> +DECL_CMD_FUNC(setvxlan_maxaddr, arg, d)
> +{
> + struct ifvxlancmd cmd;
> + u_long val;
> +
> + if (get_val(arg, &val) < 0 || (val & ~0xFFFFFFFF) != 0)
> + errx(1, "invalid maxaddr value: %s", arg);
> +
> + if (!vxlan_exists(s)) {
> + params.vxlp_with |= VXLAN_PARAM_WITH_FTABLE_MAX;
> + params.vxlp_ftable_max = val & 0xFFFFFFFF;
> + return;
> + }
> +
> + bzero(&cmd, sizeof(cmd));
> + cmd.vxlcmd_ftable_max = val & 0xFFFFFFFF;
> +
> + if (do_cmd(s, VXLAN_CMD_SET_FTABLE_MAX, &cmd, sizeof(cmd), 1) < 0)
> + err(1, "VXLAN_CMD_SET_FTABLE_MAX");
> +}
> +
> +static
> +DECL_CMD_FUNC(setvxlan_dev, arg, d)
> +{
> + struct ifvxlancmd cmd;
> +
> + if (!vxlan_exists(s)) {
> + params.vxlp_with |= VXLAN_PARAM_WITH_MULTICAST_IF;
> + strlcpy(params.vxlp_mc_ifname, arg,
> + sizeof(params.vxlp_mc_ifname));
> + return;
> + }
> +
> + bzero(&cmd, sizeof(cmd));
> + strlcpy(cmd.vxlcmd_ifname, arg, sizeof(cmd.vxlcmd_ifname));
> +
> + if (do_cmd(s, VXLAN_CMD_SET_MULTICAST_IF, &cmd, sizeof(cmd), 1) < 0)
> + err(1, "VXLAN_CMD_SET_MULTICAST_IF");
> +}
> +
> +static
> +DECL_CMD_FUNC(setvxlan_ttl, arg, d)
> +{
> + struct ifvxlancmd cmd;
> + u_long val;
> +
> + if (get_val(arg, &val) < 0 || val > 255)
> + errx(1, "invalid TTL value: %s", arg);
> +
> + if (!vxlan_exists(s)) {
> + params.vxlp_with |= VXLAN_PARAM_WITH_TTL;
> + params.vxlp_ttl = val;
> + return;
> + }
> +
> + bzero(&cmd, sizeof(cmd));
> + cmd.vxlcmd_ttl = val;
> +
> + if (do_cmd(s, VXLAN_CMD_SET_TTL, &cmd, sizeof(cmd), 1) < 0)
> + err(1, "VXLAN_CMD_SET_TTL");
> +}
> +
> +static
> +DECL_CMD_FUNC(setvxlan_learn, arg, d)
> +{
> + struct ifvxlancmd cmd;
> +
> + if (!vxlan_exists(s)) {
> + params.vxlp_with |= VXLAN_PARAM_WITH_LEARN;
> + params.vxlp_learn = d;
> + return;
> + }
> +
> + bzero(&cmd, sizeof(cmd));
> + if (d != 0)
> + cmd.vxlcmd_flags |= VXLAN_CMD_FLAG_LEARN;
> +
> + if (do_cmd(s, VXLAN_CMD_SET_LEARN, &cmd, sizeof(cmd), 1) < 0)
> + err(1, "VXLAN_CMD_SET_LEARN");
> +}
> +
> +static void
> +setvxlan_flush(const char *val, int d, int s, const struct afswtch *afp)
> +{
> + struct ifvxlancmd cmd;
> +
> + bzero(&cmd, sizeof(cmd));
> + if (d != 0)
> + cmd.vxlcmd_flags |= VXLAN_CMD_FLAG_FLUSH_ALL;
> +
> + if (do_cmd(s, VXLAN_CMD_FLUSH, &cmd, sizeof(cmd), 1) < 0)
> + err(1, "VXLAN_CMD_FLUSH");
> +}
> +
> +static struct cmd vxlan_cmds[] = {
> +
> + DEF_CLONE_CMD_ARG("vni", setvxlan_vni),
> + DEF_CLONE_CMD_ARG("local", setvxlan_local),
> + DEF_CLONE_CMD_ARG("remote", setvxlan_remote),
> + DEF_CLONE_CMD_ARG("group", setvxlan_group),
> + DEF_CLONE_CMD_ARG("localport", setvxlan_local_port),
> + DEF_CLONE_CMD_ARG("remoteport", setvxlan_remote_port),
> + DEF_CLONE_CMD_ARG2("portrange", setvxlan_port_range),
> + DEF_CLONE_CMD_ARG("timeout", setvxlan_timeout),
> + DEF_CLONE_CMD_ARG("maxaddr", setvxlan_maxaddr),
> + DEF_CLONE_CMD_ARG("vxlandev", setvxlan_dev),
> + DEF_CLONE_CMD_ARG("ttl", setvxlan_ttl),
> + DEF_CLONE_CMD("learn", 1, setvxlan_learn),
> + DEF_CLONE_CMD("-learn", 0, setvxlan_learn),
> +
> + DEF_CMD_ARG("vni", setvxlan_vni),
> + DEF_CMD_ARG("local", setvxlan_local),
> + DEF_CMD_ARG("remote", setvxlan_remote),
> + DEF_CMD_ARG("group", setvxlan_group),
> + DEF_CMD_ARG("localport", setvxlan_local_port),
> + DEF_CMD_ARG("remoteport", setvxlan_remote_port),
> + DEF_CMD_ARG2("portrange", setvxlan_port_range),
> + DEF_CMD_ARG("timeout", setvxlan_timeout),
> + DEF_CMD_ARG("maxaddr", setvxlan_maxaddr),
> + DEF_CMD_ARG("vxlandev", setvxlan_dev),
> + DEF_CMD_ARG("ttl", setvxlan_ttl),
> + DEF_CMD("learn", 1, setvxlan_learn),
> + DEF_CMD("-learn", 0, setvxlan_learn),
> +
> + DEF_CMD("flush", 0, setvxlan_flush),
> + DEF_CMD("flushall", 1, setvxlan_flush),
> +};
> +
> +static struct afswtch af_vxlan = {
> + .af_name = "af_vxlan",
> + .af_af = AF_UNSPEC,
> + .af_other_status = vxlan_status,
> +};
> +
> +static __constructor void
> +vxlan_ctor(void)
> +{
> +#define N(a) (sizeof(a) / sizeof(a[0]))
> + size_t i;
> +
> + for (i = 0; i < N(vxlan_cmds); i++)
> + cmd_register(&vxlan_cmds[i]);
> + af_register(&af_vxlan);
> + callback_register(vxlan_cb, NULL);
> + clone_setdefcallback("vxlan", vxlan_create);
> +#undef N
> +}
>
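The vni handled by setvxlan_vni() above is a 24-bit value. As a rough
sketch of where it ends up on the wire, the following packs and unpacks the
8-byte VXLAN header as laid out in RFC 7348: a flags byte with the I bit
(0x08), 24 reserved bits, the 24-bit VNI, and 8 reserved bits. The function
and macro names here are hypothetical and are not claimed to match the ones
in if_vxlan.h.

```c
#include <assert.h>
#include <stdint.h>

#define EX_VXLAN_HDR_LEN	8
#define EX_VXLAN_FLAG_I		0x08		/* VNI field is valid */
#define EX_VNI_LIMIT		(1u << 24)	/* VNI is 24 bits wide */

/* Write the header in network byte order, one byte at a time. */
static void
ex_vxlan_hdr_encode(uint8_t hdr[EX_VXLAN_HDR_LEN], uint32_t vni)
{
	assert(vni < EX_VNI_LIMIT);
	hdr[0] = EX_VXLAN_FLAG_I;
	hdr[1] = hdr[2] = hdr[3] = 0;		/* reserved */
	hdr[4] = (vni >> 16) & 0xff;
	hdr[5] = (vni >> 8) & 0xff;
	hdr[6] = vni & 0xff;
	hdr[7] = 0;				/* reserved */
}

/* Returns 0 and stores the VNI, or -1 if the I flag is not set. */
static int
ex_vxlan_hdr_decode(const uint8_t hdr[EX_VXLAN_HDR_LEN], uint32_t *vnip)
{
	if ((hdr[0] & EX_VXLAN_FLAG_I) == 0)
		return (-1);
	*vnip = ((uint32_t)hdr[4] << 16) | ((uint32_t)hdr[5] << 8) | hdr[6];
	return (0);
}
```

Building the header bytewise avoids any dependence on host endianness or
structure padding.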
> Modified: head/share/man/man4/Makefile
> ==============================================================================
> --- head/share/man/man4/Makefile Mon Oct 20 14:25:23 2014 (r273330)
> +++ head/share/man/man4/Makefile Mon Oct 20 14:42:42 2014 (r273331)
> @@ -567,6 +567,7 @@ MAN= aac.4 \
> ${_virtio_scsi.4} \
> vkbd.4 \
> vlan.4 \
> + vxlan.4 \
> ${_vmx.4} \
> vpo.4 \
> vr.4 \
> @@ -743,6 +744,7 @@ MLINKS+=urndis.4 if_urndis.4
> MLINKS+=${_urtw.4} ${_if_urtw.4}
> MLINKS+=vge.4 if_vge.4
> MLINKS+=vlan.4 if_vlan.4
> +MLINKS+=vxlan.4 if_vxlan.4
> MLINKS+=${_vmx.4} ${_if_vmx.4}
> MLINKS+=vpo.4 imm.4
> MLINKS+=vr.4 if_vr.4
>
> Added: head/share/man/man4/vxlan.4
> ==============================================================================
> --- /dev/null 00:00:00 1970 (empty, because file is newly added)
> +++ head/share/man/man4/vxlan.4 Mon Oct 20 14:42:42 2014 (r273331)
> @@ -0,0 +1,235 @@
> +.\" Copyright (c) 2014 Bryan Venteicher
> +.\" All rights reserved.
> +.\"
> +.\" Redistribution and use in source and binary forms, with or without
> +.\" modification, are permitted provided that the following conditions
> +.\" are met:
> +.\" 1. Redistributions of source code must retain the above copyright
> +.\" notice, this list of conditions and the following disclaimer.
> +.\" 2. Redistributions in binary form must reproduce the above copyright
> +.\" notice, this list of conditions and the following disclaimer in the
> +.\" documentation and/or other materials provided with the distribution.
> +.\"
> +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
> +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
> +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
> +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
> +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
> +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
> +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
> +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
> +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
> +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> +.\" SUCH DAMAGE.
> +.\"
> +.\" $FreeBSD$
> +.\"
> +.Dd October 20, 2014
> +.Dt VXLAN 4
> +.Os
> +.Sh NAME
> +.Nm vxlan
> +.Nd "Virtual eXtensible LAN interface"
> +.Sh SYNOPSIS
> +To compile this driver into the kernel,
> +place the following line in your
> +kernel configuration file:
> +.Bd -ragged -offset indent
> +.Cd "device vxlan"
> +.Ed
> +.Pp
> +Alternatively, to load the driver as a
> +module at boot time, place the following line in
> +.Xr loader.conf 5 :
> +.Bd -literal -offset indent
> +if_vxlan_load="YES"
> +.Ed
> +.Sh DESCRIPTION
> +The
> +.Nm
> +driver creates a virtual tunnel endpoint in a
> +.Nm
> +segment.
> +A
> +.Nm
> +segment is a virtual Layer 2 (Ethernet) network that is overlaid
> +in a Layer 3 (IP/UDP) network.
> +.Nm
> +is analogous to
> +.Xr vlan 4
> +but is designed to be better suited for large, multiple tenant
> +data center environments.
> +.Pp
> +Each
> +.Nm
> +interface is created at runtime using interface cloning.
> +This is most easily done with the
> +.Xr ifconfig 8
> +.Cm create
> +command or using the
> +.Va cloned_interfaces
> +variable in
> +.Xr rc.conf 5 .
> +The interface may be removed with the
> +.Xr ifconfig 8
> +.Cm destroy
> +command.
> +.Pp
> +The
> +.Nm
> +driver creates a pseudo Ethernet network interface
> +that supports the usual network
> +.Xr ioctl 2 Ns s
> +and thus can be used with
> +.Xr ifconfig 8
> +like any other Ethernet interface.
> +The
> +.Nm
> +interface encapsulates the Ethernet frame
> +by prepending IP/UDP and
> +.Nm
> +headers.
> +Thus, the encapsulated (inner) frame is able to be transmitted
> +over a routed, Layer 3 network to the remote host.
> +.Pp
> +The
> +.Nm
> +interface may be configured in either unicast or multicast mode.
> +When in unicast mode,
> +the interface creates a tunnel to a single remote host,
> +and all traffic is transmitted to that host.
> +When in multicast mode,
> +the interface joins an IP multicast group,
> +and receives packets sent to the group address,
> +and transmits packets to either the multicast group address,
> +or directly to the remote host if there is an appropriate
> +forwarding table entry.
> +.Pp
> +When the
> +.Nm
> +interface is brought up, a
> +.Xr udp 4
> +.Xr socket 9
> +is created based on the configuration,
> +such as the local address for unicast mode or
> +the group address for multicast mode,
> +and the listening (local) port number.
> +Since multiple
> +.Nm
> +interfaces may be created that either
> +use the same local address
> +or join the same group address,
> +and use the same port,
> +the driver may share a socket among multiple interfaces.
> +However, each interface within a socket must belong to
> +a unique
> +.Nm
> +segment.
> +The analogous
> +.Xr vlan 4
> +configuration would be a physical interface configured as
> +the parent device for multiple VLAN interfaces, each with
> +a unique VLAN tag.
> +Each
> +.Nm
> +segment is identified by a 24-bit value in the
> +.Nm
> +header called the
> +.Dq VXLAN Network Identifier ,
> +or VNI.
> +.Pp
> +When configured with the
> +.Xr ifconfig 8
> +.Cm learn
> +parameter, the interface dynamically creates forwarding table entries
> +from received packets.
> +An entry in the forwarding table maps the inner source MAC address
> +to the outer remote IP address.
> +During transmit, the interface attempts to look up an entry for
> +the encapsulated destination MAC address.
> +If an entry is found, the IP address in the entry is used to directly
> +transmit the encapsulated frame to the destination.
> +Otherwise, when configured in multicast mode,
> +the interface must flood the frame to all hosts in the group.
> +The maximum number of entries in the table is configurable with the
> +.Xr ifconfig 8
> +.Cm maxaddr
> +command.
> +Stale entries in the table are periodically pruned.
> +The timeout is configurable with the
> +.Xr ifconfig 8
> +.Cm timeout
> +command.
> +The table may be viewed with the
> +.Xr sysctl 8
> +.Cm net.link.vxlan.N.ftable.dump
> +command.
> +.Sh MTU
> +Since the
> +.Nm
> +interface encapsulates the Ethernet frame with an IP, UDP, and
> +.Nm
> +header, the resulting frame may be larger than the MTU of the
> +physical network.
> +The
> +.Nm
> +specification recommends the physical network MTU be configured
> +to use jumbo frames to accommodate the encapsulated frame size.
> +Alternatively, the
> +.Xr ifconfig 8
> +.Cm mtu
> +command may be used to reduce the MTU size on the
> +.Nm
> +interface to allow the encapsulated frame to fit in the
> +current MTU of the physical network.
> +.Sh EXAMPLES
> +Create a
> +.Nm
> +interface in unicast mode
> +with the
> +.Cm local
> +tunnel address of 192.168.100.1,
> +and the
> +.Cm remote
> +tunnel address of 192.168.100.2.
> +.Bd -literal -offset indent
> +ifconfig vxlan create vni 108 local 192.168.100.1 remote 192.168.100.2
> +.Ed
> +.Pp
> +Create a
> +.Nm
> +interface in multicast mode,
> +with the
> +.Cm local
> +address of 192.168.10.95,
> +and the
> +.Cm group
> +address of 224.0.2.6.
> +The em0 interface will be used to transmit multicast packets.
> +.Bd -literal -offset indent
> +ifconfig vxlan create vni 42 local 192.168.10.95 group 224.0.2.6 vxlandev em0
> +.Ed
> +.Pp
> +Once created, the
> +.Nm
> +interface can be configured with
> +.Xr ifconfig 8 .
> +.Sh SEE ALSO
> +.Xr ifconfig 8 ,
> +.Xr inet 4 ,
> +.Xr inet6 4 ,
> +.Xr sysctl 8 ,
> +.Xr vlan 4
>
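The learning behaviour described in vxlan.4 above can be sketched roughly
as follows: on receive, the inner source MAC is mapped to the outer source
IP; on transmit, the inner destination MAC is looked up, and the group
address is used when there is no entry. This is a toy fixed-size table
under my own assumptions (IPv4 only, linear search, no timeout pruning);
the ft_* names are hypothetical and this is not the data structure used by
the driver.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FT_MAX	4			/* tiny table, cf. maxaddr */

struct ft_entry {
	uint8_t		mac[6];		/* inner Ethernet address */
	uint32_t	remote;		/* outer IPv4 address of the peer */
	int		used;
};

static struct ft_entry ftable[FT_MAX];

/* Learn: map the inner source MAC to the outer source IP address. */
static int
ft_learn(const uint8_t mac[6], uint32_t remote)
{
	struct ft_entry *fe, *freep = NULL;
	int i;

	for (i = 0; i < FT_MAX; i++) {
		fe = &ftable[i];
		if (fe->used && memcmp(fe->mac, mac, 6) == 0) {
			fe->remote = remote;	/* refresh existing entry */
			return (0);
		}
		if (!fe->used && freep == NULL)
			freep = fe;
	}
	if (freep == NULL)
		return (-1);			/* table is full */
	memcpy(freep->mac, mac, 6);
	freep->remote = remote;
	freep->used = 1;
	return (0);
}

/* Transmit: use the learned peer, else fall back to the group address. */
static uint32_t
ft_dest(const uint8_t dmac[6], uint32_t group)
{
	int i;

	for (i = 0; i < FT_MAX; i++)
		if (ftable[i].used && memcmp(ftable[i].mac, dmac, 6) == 0)
			return (ftable[i].remote);
	return (group);
}
```

A real table would also record a last-seen time per entry so that stale
entries can be pruned after the configured timeout.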
> *** DIFF OUTPUT TRUNCATED AT 1000 LINES ***
>
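For the MTU section of vxlan.4 quoted above, a small worked calculation of
the encapsulation overhead may help. The sketch assumes an outer IPv4
header without options (20 bytes) or an outer IPv6 header without extension
headers (40 bytes), an 8-byte UDP header, the 8-byte VXLAN header from
RFC 7348, and an untagged 14-byte inner Ethernet header. The EX_* macros
and function names are hypothetical.

```c
#include <assert.h>

#define EX_IP4_HDR	20	/* outer IPv4 header, no options */
#define EX_IP6_HDR	40	/* outer IPv6 header, no extension headers */
#define EX_UDP_HDR	8
#define EX_VXLAN_HDR	8
#define EX_ETHER_HDR	14	/* inner Ethernet header, no VLAN tag */

/* Bytes added in front of the inner payload by the encapsulation. */
static int
ex_vxlan_overhead(int ipv6)
{
	return ((ipv6 ? EX_IP6_HDR : EX_IP4_HDR) + EX_UDP_HDR +
	    EX_VXLAN_HDR + EX_ETHER_HDR);
}

/* Largest inner MTU that still fits in the physical network's MTU. */
static int
ex_vxlan_inner_mtu(int phys_mtu, int ipv6)
{
	assert(phys_mtu > ex_vxlan_overhead(ipv6));
	return (phys_mtu - ex_vxlan_overhead(ipv6));
}
```

Under these assumptions a 1500-byte physical MTU leaves 1450 bytes for the
vxlan interface over IPv4 and 1430 bytes over IPv6, which is the value one
would pass to the ifconfig mtu command if jumbo frames are not an option.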