Error with mpi3mr driver

From: Sarder Kamal <bsd.m0osk_at_gmail.com>
Date: Wed, 03 Apr 2024 11:17:50 UTC
Good afternoon everyone

We recently procured a Dell PowerEdge R760xd2 server with a PERC H965i
adapter, which has 24 SAS disks attached to it. I have set the disks to
non-RAID mode (even though the controller is marked as RAID), and I can see
the disks when I load the mpi3mr driver.

I am also able to create a raidz2 pool with the disks (3 raidz2 vdevs of 7
disks each, plus 3 hot spares). I have set up bhyve on the server, and this
zpool is used for the virtual machines.

Everything seems to work until I try to write data from within the VMs. For
example, if I try to fetch the ports tree using git clone, it hangs after a
while. There are no errors; it just hangs. I usually run the commands from
a screen session, so I can open another screen session, but most commands
that need to access the disks then hang as well, and even Ctrl+C does not
work.

Meanwhile, on the host OS, I see these errors:
Apr  3 10:43:58 newDellServer kernel: mpi3mr0: bus_dmamap_load(): retcode = 36
Apr  3 10:43:58 newDellServer kernel: mpi3mr0: request load in progress
Apr  3 10:43:58 newDellServer kernel: func: mpi3mr_action_scsiio line: 1143 Build SGLs failed
Apr  3 10:43:58 newDellServer kernel: (da6:mpi3mr0:0:281:0): WRITE(10). CDB: 2a 00 15 6d f6 b8 00 07 e0 00
Apr  3 10:43:58 newDellServer kernel: (da6:mpi3mr0:0:281:0): CAM status: CCB request was invalid
Apr  3 10:43:58 newDellServer kernel: (da6:mpi3mr0:0:281:0): Error 22, Unretryable error
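
To make the failing request easier to read, here is a minimal Python sketch
(not part of the driver or any FreeBSD tool) that decodes the WRITE(10) CDB
from the log above. The function name decode_write10 and the 512-byte block
size are assumptions for illustration; the errno names are the FreeBSD ones
from sys/errno.h, which differ from the Linux numbering.

```python
# FreeBSD errno values seen in the log (from FreeBSD's sys/errno.h).
# Python's errno module is not used because the numbers are OS-specific.
FREEBSD_ERRNO = {22: "EINVAL", 36: "EINPROGRESS"}


def decode_write10(cdb_hex, block_size=512):
    """Decode a 10-byte SCSI WRITE(10) CDB given as space-separated hex.

    block_size is an assumption; the post does not state the sector size.
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a WRITE(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2-5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: transfer length in blocks
    return {"lba": lba, "blocks": blocks, "bytes": blocks * block_size}


info = decode_write10("2a 00 15 6d f6 b8 00 07 e0 00")
print(info)
print(FREEBSD_ERRNO[36], FREEBSD_ERRNO[22])
```

For the CDB in the log this gives LBA 359528120 and a transfer length of
2016 blocks, which would be 1032192 bytes (1008 KiB) if the disks use
512-byte sectors. The retcode 36 corresponds to EINPROGRESS, which matches
the "request load in progress" message, and Error 22 is EINVAL.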

The error hits different disks at different times, but the essence of the
error is always the same.
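
Since the error moves between disks, a per-device tally may help show
whether any single disk stands out. The following is a rough Python sketch
under that assumption; the function name count_failures and the regular
expression are illustrative, and the sample lines are the ones quoted above
with the mail wrapping undone.

```python
import re
from collections import Counter

# Matches the CAM peripheral prefix, e.g. "(da6:mpi3mr0:0:281:0): CAM status:"
CAM_STATUS = re.compile(r"\((da\d+):mpi3mr\d+:\d+:\d+:\d+\): CAM status:")


def count_failures(lines):
    """Count 'CAM status' error reports per da device."""
    counts = Counter()
    for line in lines:
        match = CAM_STATUS.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


sample = [
    "Apr  3 10:43:58 newDellServer kernel: (da6:mpi3mr0:0:281:0): "
    "WRITE(10). CDB: 2a 00 15 6d f6 b8 00 07 e0 00",
    "Apr  3 10:43:58 newDellServer kernel: (da6:mpi3mr0:0:281:0): "
    "CAM status: CCB request was invalid",
    "Apr  3 10:43:58 newDellServer kernel: (da6:mpi3mr0:0:281:0): "
    "Error 22, Unretryable error",
]
print(count_failures(sample))
```

To run it against the real log, the same function can be given an open file
object, for example open("/var/log/messages"), assuming kernel messages are
logged there on this host.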

My Google search brought me to this earlier bug report
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=272469
but I'm not sure if it is relevant -- I do not have that many cores (the
server has only 96 logical CPUs: 2 physical CPUs x 24 cores x 2 hardware
threads).

Here are some details from the system:
# cat /etc/os-release
NAME=FreeBSD
VERSION="14.0-RELEASE-p6"
VERSION_ID="14.0"
ID=freebsd
ANSI_COLOR="0;31"
PRETTY_NAME="FreeBSD 14.0-RELEASE-p6"
CPE_NAME="cpe:/o:freebsd:freebsd:14.0"
HOME_URL="https://FreeBSD.org/"
BUG_REPORT_URL="https://bugs.FreeBSD.org/"

# dmesg | grep SMP
FreeBSD/SMP: Multiprocessor System Detected: 96 CPUs
FreeBSD/SMP: 2 package(s) x 24 core(s) x 2 hardware threads

# sysctl kern.sched.topology_spec
kern.sched.topology_spec: <groups>
 <group level="1" cache-level="0">
  <cpu count="96"
mask="ffffffffffffffff,ffffffff,0,0,0,0,0,0,0,0,0,0,0,0,0,0">0, 1, 2, 3, 4,
5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24,
25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43,
44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62,
63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81,
82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95</cpu>
  <children>
   <group level="2" cache-level="3">
    <cpu count="48" mask="ffffffffffff,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0">0, 1,
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41,
42, 43, 44, 45, 46, 47</cpu>
    <flags><flag name="NODE">NUMA node</flag></flags>
    <children>
     <group level="3" cache-level="2">
      <cpu count="2" mask="3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0">0, 1</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2" mask="c,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0">2, 3</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2" mask="30,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0">4, 5</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2" mask="c0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0">6, 7</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2" mask="300,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0">8, 9</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2" mask="c00,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0">10, 11</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2" mask="3000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0">12, 13</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2" mask="c000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0">14, 15</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2" mask="30000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0">16, 17</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2" mask="c0000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0">18, 19</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2" mask="300000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0">20,
21</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2" mask="c00000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0">22,
23</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2" mask="3000000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0">24,
25</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2" mask="c000000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0">26,
27</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2" mask="30000000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0">28,
29</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2" mask="c0000000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0">30,
31</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2" mask="300000000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0">32,
33</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2" mask="c00000000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0">34,
35</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2" mask="3000000000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0">36,
37</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2" mask="c000000000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0">38,
39</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2" mask="30000000000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0">40,
41</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2" mask="c0000000000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0">42,
43</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2" mask="300000000000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0">44,
45</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2" mask="c00000000000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0">46,
47</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
    </children>
   </group>
   <group level="2" cache-level="3">
    <cpu count="48"
mask="ffff000000000000,ffffffff,0,0,0,0,0,0,0,0,0,0,0,0,0,0">48, 49, 50,
51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69,
70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88,
89, 90, 91, 92, 93, 94, 95</cpu>
    <flags><flag name="NODE">NUMA node</flag></flags>
    <children>
     <group level="3" cache-level="2">
      <cpu count="2" mask="3000000000000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0">48,
49</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2" mask="c000000000000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0">50,
51</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2"
mask="30000000000000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0">52, 53</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2"
mask="c0000000000000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0">54, 55</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2"
mask="300000000000000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0">56, 57</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2"
mask="c00000000000000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0">58, 59</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2"
mask="3000000000000000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0">60, 61</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2"
mask="c000000000000000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0">62, 63</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2" mask="0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0">64, 65</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2" mask="0,c,0,0,0,0,0,0,0,0,0,0,0,0,0,0">66, 67</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2" mask="0,30,0,0,0,0,0,0,0,0,0,0,0,0,0,0">68, 69</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2" mask="0,c0,0,0,0,0,0,0,0,0,0,0,0,0,0,0">70, 71</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2" mask="0,300,0,0,0,0,0,0,0,0,0,0,0,0,0,0">72, 73</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2" mask="0,c00,0,0,0,0,0,0,0,0,0,0,0,0,0,0">74, 75</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2" mask="0,3000,0,0,0,0,0,0,0,0,0,0,0,0,0,0">76, 77</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2" mask="0,c000,0,0,0,0,0,0,0,0,0,0,0,0,0,0">78, 79</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2" mask="0,30000,0,0,0,0,0,0,0,0,0,0,0,0,0,0">80, 81</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2" mask="0,c0000,0,0,0,0,0,0,0,0,0,0,0,0,0,0">82, 83</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2" mask="0,300000,0,0,0,0,0,0,0,0,0,0,0,0,0,0">84,
85</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2" mask="0,c00000,0,0,0,0,0,0,0,0,0,0,0,0,0,0">86,
87</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2" mask="0,3000000,0,0,0,0,0,0,0,0,0,0,0,0,0,0">88,
89</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2" mask="0,c000000,0,0,0,0,0,0,0,0,0,0,0,0,0,0">90,
91</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2" mask="0,30000000,0,0,0,0,0,0,0,0,0,0,0,0,0,0">92,
93</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
     <group level="3" cache-level="2">
      <cpu count="2" mask="0,c0000000,0,0,0,0,0,0,0,0,0,0,0,0,0,0">94,
95</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT
group</flag></flags>
     </group>
    </children>
   </group>
  </children>
 </group>
</groups>
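
The mask attributes in the topology output above can be checked against the
listed CPU IDs. Below is a minimal Python sketch that does this; the
function name cpus_from_mask is hypothetical, and the word layout
(comma-separated 64-bit hex words, least-significant word first) is
inferred from the output itself, e.g. mask "0,3,..." pairing with CPUs
64, 65.

```python
def cpus_from_mask(mask, word_bits=64):
    """Expand a comma-separated hex mask into a sorted list of CPU IDs.

    Assumes the first word covers CPUs 0..word_bits-1, the second word the
    next word_bits CPUs, and so on (inferred from the sysctl output).
    """
    cpus = []
    for word_index, word in enumerate(mask.split(",")):
        value = int(word, 16)
        for bit in range(word_bits):
            if (value >> bit) & 1:
                cpus.append(word_index * word_bits + bit)
    return cpus


all_cpus = cpus_from_mask("ffffffffffffffff,ffffffff,0,0,0,0,0,0,0,0,0,0,0,0,0,0")
node1 = cpus_from_mask("ffff000000000000,ffffffff,0,0,0,0,0,0,0,0,0,0,0,0,0,0")
print(len(all_cpus), node1[0], node1[-1])
```

For the two masks shown, this yields 96 CPUs in total and CPUs 48 through 95
for the second NUMA node, in agreement with the IDs printed by sysctl.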

# zpool status -v
  pool: Data02
 state: ONLINE
config:

        NAME        STATE     READ WRITE CKSUM
        Data02      ONLINE       0     0     0
          raidz2-0  ONLINE       0     0     0
            da0     ONLINE       0     0     0
            da1     ONLINE       0     0     0
            da2     ONLINE       0     0     0
            da3     ONLINE       0     0     0
            da4     ONLINE       0     0     0
            da5     ONLINE       0     0     0
            da6     ONLINE       0     0     0
          raidz2-1  ONLINE       0     0     0
            da7     ONLINE       0     0     0
            da8     ONLINE       0     0     0
            da9     ONLINE       0     0     0
            da10    ONLINE       0     0     0
            da11    ONLINE       0     0     0
            da12    ONLINE       0     0     0
            da13    ONLINE       0     0     0
          raidz2-2  ONLINE       0     0     0
            da14    ONLINE       0     0     0
            da15    ONLINE       0     0     0
            da16    ONLINE       0     0     0
            da17    ONLINE       0     0     0
            da18    ONLINE       0     0     0
            da19    ONLINE       0     0     0
            da20    ONLINE       0     0     0
        spares
          da21      AVAIL
          da22      AVAIL
          da23      AVAIL

errors: No known data errors


# pciconf -lv (truncated)
mpi3mr0@pci0:80:0:0:    class=0x010400 rev=0x01 hdr=0x00 vendor=0x1000 device=0x00a5 subvendor=0x1028 subdevice=0x2114
    vendor     = 'Broadcom / LSI'
    device     = 'Fusion-MPT 24GSAS/PCIe SAS40xx'
    class      = mass storage
    subclass   = RAID
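
As a cross-check of the pciconf output above, the class value can be split
into its fields by hand. This is a small Python sketch, with the helper name
split_pci_class chosen for illustration; it only applies the standard
layout of the 24-bit PCI class code (base class, subclass, programming
interface).

```python
def split_pci_class(class_code):
    """Split a 24-bit PCI class code into (base class, subclass, prog-if)."""
    base = (class_code >> 16) & 0xFF
    sub = (class_code >> 8) & 0xFF
    prog_if = class_code & 0xFF
    return base, sub, prog_if


base, sub, prog_if = split_pci_class(0x010400)
print(hex(base), hex(sub), hex(prog_if))
```

For class=0x010400 this gives base class 0x1 and subclass 0x4, which is what
pciconf reports as "mass storage" and "RAID", so the controller still
presents itself as a RAID device even with the disks set to non-RAID.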


I am happy to provide more data or details if that helps; please also let me
know how to obtain them.

Any suggestions to remedy the situation would be greatly appreciated.

Kind regards
SK