nvme device errors & zfs
Date: Mon, 04 Nov 2024 17:28:42 UTC
What's the best way to see error counters or states on an nvme
device?
I have a typical mirrored nvme zpool that reported enough errors
in a burst last week that 1 drive dropped off the bus [1].
After a reboot, it resilvered, I cleared the errors, and it seems
fine according to repeated scrubs and a few days of use.
I was unable to see any errors from the nvme drive itself, but
as it's (just) in warranty for 2 more weeks, I'd like to know
if I should return it.
I installed the `sysutils/nvme-cli` port and didn't see anything
of note there either:
$ doas nvme smart-log /dev/nvme1
0xc0484e41: opc: 0x2 fuse: 0 cid 0 nsid:0xffffffff cmd2: 0 cmd3: 0
: cdw10: 0x7f0002 cdw11: 0 cdw12: 0 cdw13: 0
: cdw14: 0 cdw15: 0 len: 0x200 is_read: 0
<--- 0 cid: 0 status 0
Smart Log for NVME device:nvme1 namespace-id:ffffffff
critical_warning : 0
temperature : 39 C
available_spare : 100%
available_spare_threshold : 10%
percentage_used : 3%
data_units_read : 121681067
data_units_written : 86619659
host_read_commands : 695211450
host_write_commands : 2187823697
controller_busy_time : 2554
power_cycles : 48
power_on_hours : 6342
unsafe_shutdowns : 38
media_errors : 0
num_err_log_entries : 0
Warning Temperature Time : 0
Critical Composite Temperature Time : 0
Temperature Sensor 1 : 39 C
Temperature Sensor 2 : 43 C
Thermal Management T1 Trans Count : 0
Thermal Management T2 Trans Count : 0
Thermal Management T1 Total Time : 0
Thermal Management T2 Total Time : 0
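For repeated checks, the counters in output like the above can be pulled out with a short script. This is only a minimal sketch in Python, assuming the `key : value` layout shown above. The names `parse_smart_log`, `nonzero_counters` and `WATCHED` are hypothetical, and the choice of watched fields is an assumption for illustration, not a vendor-defined failure threshold.

```python
import re

# Fields that usually deserve a closer look when nonzero.
# This selection is an assumption, not an official list.
WATCHED = ("critical_warning", "media_errors", "num_err_log_entries")

# A key without colons, then a colon, then a leading integer value.
_FIELD = re.compile(r"^\s*([^:]+?)\s*:\s*(\d+)")


def parse_smart_log(text):
    """Parse 'key : value' lines of smart-log text into {key: int}."""
    fields = {}
    for line in text.splitlines():
        match = _FIELD.match(line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def nonzero_counters(fields, watched=WATCHED):
    """Return the watched counters whose value is not zero."""
    return {name: fields[name] for name in watched if fields.get(name, 0) != 0}
```

Fed the output above, `nonzero_counters(parse_smart_log(text))` would come back empty, which matches the impression that the drive itself recorded nothing.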
[1]: zpool status
status: One or more devices are faulted in response to persistent errors.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
        repaired.
  scan: scrub repaired 0B in 00:17:59 with 0 errors on Thu Oct 31 16:24:36 2024
config:

        NAME          STATE     READ WRITE CKSUM
        zroot         DEGRADED     0     0     0
          mirror-0    DEGRADED     0     0     0
            gpt/zfs0  ONLINE       0     0     0
            gpt/zfs1  FAULTED      0     0     0  too many errors
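
To catch a state like this without reading the table by eye, the config section can be scanned for rows that are not clean. Again a minimal sketch in Python, assuming the `NAME STATE READ WRITE CKSUM` column layout shown above. The function name `unhealthy_vdevs` is hypothetical. Counters are compared as text, so an abbreviated count is still treated as nonzero.

```python
HEADER = ["NAME", "STATE", "READ", "WRITE", "CKSUM"]


def unhealthy_vdevs(status_text):
    """Return (name, state, read, write, cksum) for rows that are not clean.

    A row counts as unhealthy when its state is not ONLINE or when any of
    its three error counters is something other than "0".
    """
    rows = []
    in_config = False
    for line in status_text.splitlines():
        parts = line.split()
        if parts[:5] == HEADER:
            in_config = True
            continue
        if not in_config:
            continue
        # A blank line or the trailing "errors:" summary ends the table.
        if not parts or parts[0] == "errors:":
            break
        if len(parts) < 5:
            continue  # section labels such as "spares" or "cache"
        name, state = parts[0], parts[1]
        counts = tuple(parts[2:5])
        if state != "ONLINE" or any(c != "0" for c in counts):
            rows.append((name, state) + counts)
    return rows
```

For the status above this would report `zroot`, `mirror-0` and `gpt/zfs1`, even though all three error counters read 0 at the time.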
A+
Dave
———
O for a muse of fire, that would ascend the brightest heaven of invention!