[Bug 211713] NVME controller failure: resetting (Samsung SM961 SSD Drives)

Sat Mar 31 07:04:20 UTC 2018

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713

--- Comment #58 from stan <freebsd-ssa at mailden.net> ---
Following up on my comment #57, here is more debug info from another context
with the same hardware:
I am able to boot TrueOS-Desktop-201803131015 with
`hw.nvme.per_cpu_io_queues="0"` set in /boot/loader.conf. Everything works
well, but a fatal error occurs when trying to resume from S3 suspend:

I see kernel messages ending with:

```
(…) kernel: WARN_ON(…stripped…) CSR SSP Base Not fine
(…) kernel: CSR HTP Not fine
(…) kernel: WARN_ON(…stripped…) Clearing unexpected auxiliary request for power well 2
```

then:

```
nvme0: resetting controller
nvme0: controller ready did not become 0 within 30000 ms
nvme0: failing queued i/o
nvme0: READ sqid:1 cid:0 nsid: 1 lba:324015968 len:20
nvme0: ABORTED - BY REQUEST (00/07) sqid:1 cid:0 cdw0:0
```

and similar errors repeated a dozen times,

then the fatal trap:

```
nvd0: lost device - 0 outstanding
nvd0: removing device entry
nvme0: WRITE sqid:1 cid:0 nsid:1 lba:4416948 len:48
nvme0: ABORTED - BY REQUEST (00/07) sqid:1 cid:0 cdw0:0

Fatal trap 12: page fault while in kernel mode
cpuid = 4; apic id = 04
fault virtual address   = 0x8
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80a3b141
stack pointer           = 0x28:0xfffffe0000545820
frame pointer           = 0x28:0xfffffe0000545860
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (nvme taskq)
[ thread pid 0 tid 100077 ]
stopped at      g_disk_done+0xc1:       movq   0x8(%rax),%rdi
db>
```
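
A note on reading the trap above: the faulting instruction reads from `0x8(%rax)` and the reported fault virtual address is 0x8, which is consistent with `%rax` holding NULL when `g_disk_done` dereferenced it. The trap output alone does not say which pointer that was. The small Python sketch below only does the address arithmetic; the helper name `base_register_value` is made up for illustration, and it assumes a simple AT&T-syntax `disp(%reg)` operand with no index register.

```python
import re

def base_register_value(fault_addr, instruction):
    """Return (register, value) implied by a faulting disp(%reg) memory operand.

    Assumes AT&T syntax with a plain base register and an optional
    displacement, e.g. "movq 0x8(%rax),%rdi". Hypothetical helper.
    """
    m = re.search(r'(-?0x[0-9a-fA-F]+|-?\d+)?\(%(\w+)\)', instruction)
    if m is None:
        raise ValueError("no disp(%reg) memory operand found")
    disp = int(m.group(1), 0) if m.group(1) else 0
    # effective address = base + disp, so base = fault address - disp
    return m.group(2), (fault_addr - disp) % 2**64

# Values copied from the trap output above.
reg, value = base_register_value(0x8, "movq   0x8(%rax),%rdi")
print(f"%{reg} = {value:#x}")  # prints "%rax = 0x0"
```

A base of zero matches the usual signature of a NULL pointer dereference at a small structure offset.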
