Seemingly random nvme (nda) write error on new drive (retries exhausted)

From: Rebecca Cran <rebecca_at_bsdio.com>
Date: Thu, 08 Jun 2023 05:11:48 UTC
I got a seemingly random NVMe data transfer error (a failed write) on my
new arm64 Ampere Altra machine, which has a Samsung PM1735 PCIe AIC NVMe
drive.

Since it's a new drive and smartctl doesn't show any errors, I thought it
might be worth mentioning here.

I'm running FreeBSD 14.0-CURRENT #0 main-n263139-baef3a5b585f.


dmesg contains:

nvme0: WRITE sqid:16 cid:126 nsid:1 lba:2550684560 len:8
nvme0: DATA TRANSFER ERROR (00/04) crd:0 m:0 dnr:0 sqid:16 cid:126 cdw0:0
(nda0:nvme0:0:0:1): WRITE. NCB: opc=1 fuse=0 nsid=1 prp1=0 prp2=0 cdw=98085b90 0 7 0 0 0
(nda0:nvme0:0:0:1): CAM status: CCB request completed with an error
(nda0:nvme0:0:0:1): Error 5, Retries exhausted
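
As a sanity check, the nvme0 and nda0 lines appear to describe the same command. The sketch below is only a rough cross-check, not anything taken from the driver: it assumes the six values after cdw= are CDW10 through CDW15 printed in hex, that CDW10/CDW11 hold the starting LBA, and that the low 16 bits of CDW12 hold a zero-based block count, as for an NVMe Write command. The helper names are made up for illustration.

```python
import re

# The two relevant lines from the dmesg output (the NCB line is unwrapped).
NVME_LINE = "nvme0: DATA TRANSFER ERROR (00/04) crd:0 m:0 dnr:0 sqid:16 cid:126 cdw0:0"
NCB_LINE = ("(nda0:nvme0:0:0:1): WRITE. NCB: opc=1 fuse=0 nsid=1 prp1=0 prp2=0 "
            "cdw=98085b90 0 7 0 0 0")


def decode_status(line):
    """Return (status code type, status code) from the '(sct/sc)' hex pair."""
    m = re.search(r"\(([0-9a-fA-F]{2})/([0-9a-fA-F]{2})\)", line)
    if m is None:
        raise ValueError("no (sct/sc) pair found")
    return int(m.group(1), 16), int(m.group(2), 16)


def decode_rw_cdws(line):
    """Return (starting LBA, block count).

    Assumption: the values after 'cdw=' are CDW10..CDW15 in hex.
    """
    words = line.split("cdw=", 1)[1].split()
    cdw10, cdw11, cdw12 = (int(w, 16) for w in words[:3])
    slba = (cdw11 << 32) | cdw10   # 64-bit starting LBA
    nlb = (cdw12 & 0xFFFF) + 1     # block count is stored zero-based
    return slba, nlb


sct, sc = decode_status(NVME_LINE)
slba, nlb = decode_rw_cdws(NCB_LINE)
print(f"sct={sct:#x} sc={sc:#x} lba={slba} len={nlb}")
```

Under those assumptions this gives lba=2550684560 and len=8, matching the nvme0 WRITE line, and status type 0 (generic) with code 0x04, which is the Data Transfer Error status.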


nvmecontrol identify nvme0 shows:

Vendor ID:                   144d
Subsystem Vendor ID:         144d
Model Number:                SAMSUNG MZPLJ6T4HALA-00007
Firmware Version:            EPK9CB5Q
Recommended Arb Burst:       8
IEEE OUI Identifier:         00 25 38
Multi-Path I/O Capabilities: Multiple controllers, Multiple ports
Max Data Transfer Size:      131072 bytes
Sanitize Crypto Erase:       Supported
Sanitize Block Erase:        Supported
Sanitize Overwrite:          Not Supported
Sanitize NDI:                Not Supported
Sanitize NODMMAS:            Undefined
Controller ID:               0x0041
Version:                     1.3.0


-- 

Rebecca Cran