[Bug 211713] NVME controller failure: resetting (Samsung SM961 SSD Drives)
bugzilla-noreply at freebsd.org
Fri Mar 16 05:24:31 UTC 2018
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211713
--- Comment #50 from commit-hook at freebsd.org ---
A commit references this bug:
Author: imp
Date: Fri Mar 16 05:23:49 UTC 2018
New revision: 331046
URL: https://svnweb.freebsd.org/changeset/base/331046
Log:
Try polling the qpairs on timeout.

On some systems, we're getting timeouts when we use multiple queues on
drives that work perfectly well on other systems. On a hunch, Jim
Harris suggested I poll the completion queue when we get a timeout.
This patch polls the completion queue if no fatal status was
indicated. If it had pending I/O, we complete that request and
return. Otherwise, if aborts are enabled and no fatal status, we abort
the command and return. Otherwise we reset the card.

This may clear up the problem, or we may see it result in lots of
timeouts and a performance problem. Either way, we'll know the next
step. We may also need to pay attention to the fatal status bit
of the controller.
PR: 211713
Suggested by: Jim Harris
Sponsored by: Netflix
Changes:
head/sys/dev/nvme/nvme_private.h
head/sys/dev/nvme/nvme_qpair.c
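
The decision order described in the log can be sketched as a small user-space model. This is only an illustration of the three-way choice (complete via poll, abort, or reset), not the code in nvme_qpair.c. The type and function names (timeout_state, timeout_action, decide_timeout_action) are hypothetical, and the pending_completions field stands in for whatever polling the completion queue would actually find.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcomes of a command timeout; not the driver's own types. */
enum timeout_action {
	TIMEOUT_COMPLETED_BY_POLL,	/* polling found the I/O; complete it */
	TIMEOUT_ABORT_COMMAND,		/* send an abort for the timed-out command */
	TIMEOUT_RESET_CONTROLLER	/* last resort: reset the card */
};

/* Hypothetical snapshot of the state examined when a timeout fires. */
struct timeout_state {
	bool	fatal_status;		/* controller reports fatal status */
	bool	aborts_enabled;		/* aborts are enabled for the controller */
	int	pending_completions;	/* entries a poll of the queue would find */
};

/*
 * Apply the order from the commit log:
 *   1. no fatal status and the poll finds pending I/O -> complete and return
 *   2. aborts enabled and no fatal status             -> abort the command
 *   3. anything else                                  -> reset the controller
 */
static enum timeout_action
decide_timeout_action(const struct timeout_state *st)
{
	assert(st != NULL);

	if (!st->fatal_status && st->pending_completions > 0)
		return (TIMEOUT_COMPLETED_BY_POLL);
	if (st->aborts_enabled && !st->fatal_status)
		return (TIMEOUT_ABORT_COMMAND);
	return (TIMEOUT_RESET_CONTROLLER);
}
```

In this model, a fatal status always leads to a reset, whether or not aborts are enabled or completions are pending, which matches the "if no fatal status" qualifier on both earlier steps in the log.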
--
You are receiving this mail because:
You are the assignee for the bug.