ahcich timeouts, only with ahci, not with ataahci

Harald Schmalzbauer h.schmalzbauer at omnilan.de
Wed Mar 3 07:49:33 UTC 2010


Alexander Motin wrote on 23.02.2010 16:10 (localtime):
> Harald Schmalzbauer wrote:
>> I'm frequently getting my machine locked with ahcichX timeouts:
>> ahcich2: Timeout on slot 0
>> ahcich2: is 00000000 cs 00000001 ss 00000000 rs 00000001 tfd c0 serr 00000000
>> ahcich2: Timeout on slot 8
>> ahcich2: is 00000000 cs 00000100 ss 00000000 rs 00000100 tfd c0 serr 00000000
>> ahcich2: Timeout on slot 8
>> ahcich2: is 00000000 cs fffff07f ss ffffff7f rs ffffff7f tfd c0 serr 00000000
>> ...
> 
> Given that is (interrupt status) is zero and `rs == cs | ss` (the running
> command bitmasks in the driver and the hardware), the controller is not
> reporting command completion. With TFD status 0xc0, which has the BUSY
> bit set, I would suppose that either the disk is stuck processing a
> command for some reason, or the controller missed the command completion
> status.
> 
> Have you noticed a 30-second pause (the default ATA timeout) before the
> timeout message is printed? I just want to be sure that the driver waited
> long enough before giving up.
> 
>> This happens when backup over GbE overloads ZFS/HDD capabilities.
>> I reduced vfs.zfs.txg.timeout to 1 to prevent the machine from locking
>> up almost immediately, but it still happens.
>> When I don't use ahci but ataahci (the old driver, if I understand things
>> correctly) I also see the ZFS burst write congestion, but it doesn't
>> lead to controller timeouts that block the machine.
>>
>> Sometimes the machine recovers from the disk lock, but most often I have
>> to reboot.
> 
> What does it look like when it doesn't? Can you send me the full log messages?

Hello, this morning I had a stall, but the machine recovered after about
one minute. Here's what I got from the kernel:
ahcich2: Timeout on slot 29
ahcich2: is 00000000 cs 00000003 ss e0000003 rs e0000003 tfd c0 serr 00000000
em1: watchdog timeout -- resetting
em1: watchdog timeout -- resetting
ahcich2: Timeout on slot 10
ahcich2: is 00000000 cs 00006000 ss 00007c00 rs 00007c00 tfd c0 serr 00000000
ahcich2: Timeout on slot 18
ahcich2: is 00000000 cs 00040000 ss 00000000 rs 00040000 tfd c0 serr 00000000
ahcich2: Timeout on slot 2
ahcich2: is 00000000 cs 00000004 ss 00000000 rs 00000004 tfd c0 serr 00000000
ahcich2: Timeout on slot 2
ahcich2: is 00000000 cs 00000000 ss 0000000c rs 0000000c tfd 40 serr 00000000
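
To make it easier to read these status lines against the check quoted above (is zero, rs == cs | ss, BUSY set in TFD), here is a minimal Python sketch. It is not part of the driver; the names parse_status, describe and ATA_S_BUSY are only illustrative, and it assumes BUSY is bit 0x80 of the status byte, which matches the quoted reading of 0xc0.

```python
import re

# Assumption: BSY is bit 7 (0x80) of the ATA status byte reported in "tfd".
ATA_S_BUSY = 0x80

# Matches "name hexvalue" pairs; \s+ also tolerates a mail-wrapped line break.
FIELDS = re.compile(r"\b(is|cs|ss|rs|tfd|serr)\s+([0-9a-fA-F]+)")


def parse_status(line):
    """Return the hex fields of an 'ahcichN: is ... serr ...' line as ints."""
    return {name: int(value, 16) for name, value in FIELDS.findall(line)}


def describe(line):
    """Apply the three checks from the quoted analysis to one status line."""
    f = parse_status(line)
    return {
        # No interrupt status bits pending.
        "no_interrupt": f["is"] == 0,
        # Driver's running mask agrees with the hardware command masks.
        "masks_consistent": f["rs"] == (f["cs"] | f["ss"]),
        # Device still reports BUSY in the task file status.
        "busy": bool(f["tfd"] & ATA_S_BUSY),
        # Slot numbers the driver still considers running.
        "running_slots": [n for n in range(32) if (f["rs"] >> n) & 1],
    }


if __name__ == "__main__":
    line = ("ahcich2: is 00000000 cs 00000003 ss e0000003 "
            "rs e0000003 tfd c0 serr 00000000")
    print(describe(line))
```

Applied to the log above, every status line has is equal to zero and consistent masks. BUSY is set in all of them except the last one (tfd 40), where it is clear.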

Does this tell you anything useful?

Thanks,

-Harry
