kern/157397: [ada] ahci/ada/cam NCQ timeouts on Samsung and non-disable-ability
Alexander Motin
mav at FreeBSD.org
Thu Apr 4 07:50:01 UTC 2013
The following reply was made to PR kern/157397; it has been noted by GNATS.
From: Alexander Motin <mav at FreeBSD.org>
To: Matthias Andree <mandree at FreeBSD.org>
Cc: bug-followup at FreeBSD.org
Subject: Re: kern/157397: [ada] ahci/ada/cam NCQ timeouts on Samsung and non-disable-ability
Date: Thu, 04 Apr 2013 10:49:43 +0300
On 04.04.2013 01:08, Matthias Andree wrote:
> - I am running with kern.cam.ada.default_timeout=5 which makes the
> computer recover faster
The ATA specification does not define a specific timeout value. 30 seconds is
probably something of a tradition. Drives without TLER (desktop models) may
perform an unexpectedly high number of error recovery retries. But 5 seconds
may not be enough to spin up in some cases, even for a perfectly healthy drive.
> - write/read status for stalls is unclear to me, but the kernel only
> ever logs WRITE_FPDMA_QUEUED, so I guess the answer is "write".
>
> "rm -rf /usr/obj" or "log in to GNOME and try starting gnome-terminal"
> are sufficient to trigger it.
>
>
> - reducing the number of tags to 31 does not appear to help. Linux's
> libata does that only to distinguish the bit mask 0xffffffff it might
> get with 32 tags from "fatal errors".
From the ATA/AHCI specs alone, I have no explanation for why 31 tags would be
better than 32. For siis(4) and mvs(4) that limitation is part of the
hardware design. My guess is that it can be useful for AHCI during
controller hot-plug, when a missing controller will return 0xffffffff on
any read. But so far that is irrelevant for us, since PCI hot-plug support
is mostly missing. It is not the case in the logs provided.
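
To illustrate the 0xffffffff ambiguity, here is a minimal sketch in Python. It
is not driver code: the function names are made up for this example, and it
only assumes that a per-slot status register has one bit per tag.

```python
# Value a read typically returns when the controller is gone from the bus.
ALL_ONES = 0xFFFFFFFF


def full_queue_mask(num_tags):
    """Mask with one bit set per tag, i.e. every tag currently in use."""
    return (1 << num_tags) - 1


def looks_like_missing_controller(reg_value):
    """True if the register value cannot be told apart from a dead read."""
    return reg_value == ALL_ONES


if __name__ == "__main__":
    for tags in (31, 32):
        mask = full_queue_mask(tags)
        print(tags, hex(mask), looks_like_missing_controller(mask))
```

With 32 tags a completely full queue gives the same value as a read from a
missing controller; with 31 tags the top bit stays clear, so the two cases can
be told apart.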
> Logs through "egrep ahcich1\|ada1\|pass1\|ahci0" available from
> <http://people.freebsd.org/~mandree/PR157397-logs.txt>, with Serial
> numbers removed.
>
> OBSERVE that this only ever affects odd-numbered slots, never
> even-numbered slots.
Interesting observation, but I don't have an explanation for it. All slots
are equal from the specs' point of view.
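
If someone wants to check the odd-slot pattern mechanically, something like
the following Python sketch could be used. It assumes the affected slots are
available as a 32-bit mask with bit N standing for slot N; the helper names
and the sample value in the usage note are made up and not taken from the logs.

```python
# Bits 0, 2, 4, ... set: selects the even-numbered slots.
EVEN_SLOTS = 0x55555555


def slots_in_mask(mask, num_slots=32):
    """List the slot numbers whose bits are set in the mask."""
    return [slot for slot in range(num_slots) if mask & (1 << slot)]


def only_odd_slots(mask):
    """True if at least one slot is set and none of them is even-numbered."""
    return mask != 0 and (mask & EVEN_SLOTS) == 0
```

For example, a mask of 0x0000000a decodes to slots 1 and 3, so
only_odd_slots() is true for it, while 0x0000000b also includes slot 0 and
fails the check.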
--
Alexander Motin