[Bug 281990] offset of sa_family in sockaddr_ib inconsistent with sockaddr
Date: Thu, 10 Oct 2024 13:30:25 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281990
Bug ID: 281990
Summary: offset of sa_family in sockaddr_ib inconsistent with
sockaddr
Product: Base System
Version: 14.1-RELEASE
Hardware: amd64
OS: Any
Status: New
Severity: Affects Some People
Priority: ---
Component: bin
Assignee: bugs@FreeBSD.org
Reporter: bmueller@panasas.com
My system has a RoCE-enabled Broadcom NIC that uses the bnxt_re driver. To
test whether libfabric can see the device, I ran 'fi_info -p verbs' which trips
an assert in the libfabric library.
(gdb) where
#0 0x00000008007d810a in thr_kill () from /lib/libc.so.7
#1 0x0000000800751404 in raise () from /lib/libc.so.7
#2 0x00000008008049d9 in abort () from /lib/libc.so.7
#3 0x00000008007345f1 in __assert () from /lib/libc.so.7
#4 0x0000000800502253 in ofi_addr_set_port (addr=0x800e27510, port=0) at
./include/ofi_net.h:832
#5 0x000000080050557e in vrb_alloc_ib_addrinfo (port_num=1 '\001',
gid=0x7fffffffe3f0, pkey=65535) at prov/verbs/src/verbs_info.c:1045
#6 0x00000008005057b4 in vrb_get_sib (verbs_devs=0x8005f4380 <verbs_devs>) at
prov/verbs/src/verbs_info.c:1096
#7 0x0000000800506380 in vrb_init_info (all_infos=0x8005f3e08
<vrb_util_prov+8>) at prov/verbs/src/verbs_info.c:1400
#8 0x0000000800507c99 in vrb_getinfo (version=65555, node=0x0, service=0x0,
flags=0, hints=0x800e1b000, info=0x7fffffffe640) at
prov/verbs/src/verbs_info.c:1892
#9 0x000000080045fa2f in fi_getinfo_ (version=65555, node=0x0, service=0x0,
flags=0, hints=0x800e1b000, info=0x7fffffffe6b0) at src/fabric.c:1279
#10 0x0000000000401e22 in run (hints=0x800e1b000, node=0x0, port=0x0, flags=0)
at util/info.c:323
#11 0x000000000040227d in main (argc=3, argv=0x7fffffffe790) at util/info.c:447
(gdb) frame 5
#5 0x000000080050557e in vrb_alloc_ib_addrinfo (port_num=1 '\001',
gid=0x7fffffffe3f0, pkey=65535) at prov/verbs/src/verbs_info.c:1045
1045 ofi_addr_set_port((struct sockaddr *)sib, 0);
(gdb) x/16xb sib
0x800e27510: 0x1b 0x00 0xff 0xff 0x00 0x00 0x00 0x00
0x800e27518: 0xfe 0x80 0x00 0x00 0x00 0x00 0x00 0x00
The code below expects to be able to cast a sockaddr_ib to a sockaddr so it can
call appropriate branch of switch statement based on the value of sa_family.
This code only works properly if sa_family is at the same offset in both
structures.
vrb_alloc_ib_addrinfo(...)
{
struct sockaddr_ib *sib;
...
ofi_addr_set_port((struct sockaddr *)sib, 0);
}
static inline void ofi_addr_set_port(struct sockaddr *addr, uint16_t port)
{
struct ofi_sockaddr_ib *sib;
switch (ofi_sa_family(addr)) {
case AF_INET:
ofi_sin_port(addr) = htons(port);
break;
case AF_INET6:
ofi_sin6_port(addr) = htons(port);
break;
case AF_IB:
sib = (struct ofi_sockaddr_ib *)addr;
sib->sib_sid = htonll(((uint64_t)OFI_RDMA_PS_IB << 16) + ntohs(port));
sib->sib_sid_mask = htonll(OFI_IB_IP_PS_MASK | OFI_IB_IP_PORT_MASK);
break;
default:
FI_WARN(&core_prov, FI_LOG_FABRIC, "Unknown address format\n");
assert(0);
}
}
#define ofi_sa_family(addr) ((struct sockaddr *)(addr))->sa_family
To correct the issue, sockaddr_ib would be changed to match sockaddr:
Old:
struct sockaddr_ib {
unsigned short int sib_family; /* AF_IB */
...
}
New:
struct sockaddr_ib {
unsigned char sib_len;
sa_family sib_family; /* AF_IB */
...
}
The change will need to be made in two locations:
sys/ofed/include/rdma/ib.h
contrib/ofed/librdmacm/ib.h
--
You are receiving this mail because:
You are the assignee for the bug.