[Bug 262658] Mellanox ConnectX-3 Pro Initialization error

From: <bugzilla-noreply_at_freebsd.org>
Date: Fri, 25 Mar 2022 17:21:26 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=262658

--- Comment #3 from jSML4ThWwBID69YC@protonmail.com ---
(In reply to Hans Petter Selasky from comment #1)

Firmware was updated today to the latest releases. The issue is still present
and still occurs at random.

Dell PowerEdge R340
BIOS Version: 2.8.3
See attached screenshot for other firmware version information. 

Mellanox data
# mst status
MST devices:
------------
  pci0:3:0:0    -       MT27520 Family [ConnectX-3 Pro]

# flint -d pci0:3:0:0 q
Image type:            FS2
FW Version:            2.42.5000
FW Release Date:       5.9.2017
Product Version:       02.42.50.00
Rom Info:              type=PXE version=3.4.752
Device ID:             4103
Description:           Node             Port1            Port2            Sys image
GUIDs:                 ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
MACs:                                       0002c9b5e4e0     0002c9b5e4e1
VSD:                   
PSID:                  MT_1090111023
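
To cross-check that the MACs reported by flint are the same ones the driver
prints as "Ethernet address" in dmesg when the attach succeeds, a small Python
sketch like the following can convert the flint notation to colon notation.
The helper name flint_mac_to_colon is my own and is not part of the Mellanox
tools.

```python
def flint_mac_to_colon(mac_hex):
    """Convert a 12-hex-digit MAC, as printed by flint, into colon notation."""
    mac_hex = mac_hex.strip().lower()
    # Reject anything that is not exactly 12 hexadecimal digits.
    if len(mac_hex) != 12 or any(c not in "0123456789abcdef" for c in mac_hex):
        raise ValueError(f"not a 12-digit hex MAC: {mac_hex!r}")
    # Group the digits into six two-character octets.
    return ":".join(mac_hex[i:i + 2] for i in range(0, 12, 2))


if __name__ == "__main__":
    # Values taken from the "MACs:" line of the flint output above.
    print(flint_mac_to_colon("0002c9b5e4e0"))  # 00:02:c9:b5:e4:e0
    print(flint_mac_to_colon("0002c9b5e4e1"))  # 00:02:c9:b5:e4:e1
```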

Dmesg output when it fails. 
mlx4_core0: <mlx4_core> mem 0x92200000-0x922fffff,0x91000000-0x917fffff at device 0.0 on pci3
mlx4_core: Mellanox ConnectX core driver v3.6.0 (December 2020)
mlx4_core: Initializing mlx4_core
mlx4_core0: command 0xfff failed: fw status = 0x1
mlx4_core0: MAP_FA command failed, aborting
mlx4_core0: Failed to start FW, aborting
mlx4_core0: Failed to init fw, aborting.
device_attach: mlx4_core0 attach returned 5
ichsmb0: <Intel Cannon Lake SMBus controller> port 0xefa0-0xefbf mem 0x92512000-0x925120ff at device 31.4 on pci0
smbus0: <System Management Bus> on ichsmb0
mlx4_core0: <mlx4_core> mem 0x92200000-0x922fffff,0x91000000-0x917fffff at device 0.0 on pci3
mlx4_core: Initializing mlx4_core
mlx4_core0: command 0xfff failed: fw status = 0x1
mlx4_core0: MAP_FA command failed, aborting
mlx4_core0: Failed to start FW, aborting
mlx4_core0: Failed to init fw, aborting.
device_attach: mlx4_core0 attach returned 5

Dmesg output when it succeeds. 
mlx4_core0: <mlx4_core> mem 0x92200000-0x922fffff,0x91000000-0x917fffff at device 0.0 on pci3
mlx4_core: Mellanox ConnectX core driver v3.6.0 (December 2020)
mlx4_core: Initializing mlx4_core
mlx4_core0: Unable to determine PCI device chain minimum BW
mlx4_en mlx4_core0: Activating port:1
mlxen0: Ethernet address: 00:02:c9:b5:e4:e0
mlx4_en: mlx4_core0: Port 1: Using 4 TX rings
mlxen0: link state changed to DOWN
mlx4_en: mlx4_core0: Port 1: Using 8 RX rings
mlx4_en: mlxen0: Using 4 TX rings
mlx4_en: mlxen0: Using 8 RX rings
mlx4_en: mlxen0: Initializing port
mlx4_en mlx4_core0: Activating port:2
mlxen1: Ethernet address: 00:02:c9:b5:e4:e1
mlx4_en: mlx4_core0: Port 2: Using 4 TX rings
mlxen1: link state changed to DOWN
mlx4_en: mlx4_core0: Port 2: Using 8 RX rings
mlx4_en: mlxen1: Using 4 TX rings
mlx4_en: mlxen1: Using 8 RX rings
mlx4_en: mlxen1: Initializing port
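
Since the failure is random, it may help to count how often each outcome
occurs across reboots. Below is a minimal Python sketch that scans saved dmesg
text for the two patterns shown above. The function name and the shape of the
returned dictionary are my own illustration, not part of any FreeBSD or
Mellanox tool, and reading the "attach returned 5" value as an errno (EIO) is
an assumption about how device_attach reports the error.

```python
import errno
import re

# Patterns copied from the dmesg excerpts in this report.
ATTACH_FAIL = re.compile(r"device_attach: (mlx4_core\d+) attach returned (\d+)")
PORT_UP = re.compile(r"mlx4_en (mlx4_core\d+): Activating port:(\d+)")


def classify_mlx4_boot(dmesg_text):
    """Summarize mlx4_core attach failures and activated ports in a dmesg dump."""
    failures = []
    ports = []
    for line in dmesg_text.splitlines():
        m = ATTACH_FAIL.search(line)
        if m:
            code = int(m.group(2))
            # Assumption: the returned value is an errno, so 5 maps to EIO.
            failures.append((m.group(1), code, errno.errorcode.get(code, "unknown")))
            continue
        m = PORT_UP.search(line)
        if m:
            ports.append((m.group(1), int(m.group(2))))
    return {"failures": failures, "ports": ports}


if __name__ == "__main__":
    sample = (
        "mlx4_core0: MAP_FA command failed, aborting\n"
        "device_attach: mlx4_core0 attach returned 5\n"
    )
    print(classify_mlx4_boot(sample))
```

Running this against the output of dmesg saved after each boot gives one
record per boot; an empty "ports" list together with a non-empty "failures"
list corresponds to the failing case above.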

I've set the boot protocol to none to disable PXE with this: 
mlxconfig -d pci0:3:0:0 set LEGACY_BOOT_PROTOCOL_P1=none
mlxconfig -d pci0:3:0:0 set LEGACY_BOOT_PROTOCOL_P2=none

This is on FreeBSD 13.0-RELEASE-p8.
