kern/161881: [ahci] [panic] [regression] Panics after AHCI timeouts

Armin Pirkovitsch armin at frozen-zone.org
Fri Oct 21 15:10:11 UTC 2011


>Number:         161881
>Category:       kern
>Synopsis:       [ahci] [panic] [regression] Panics after AHCI timeouts
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Oct 21 15:10:11 UTC 2011
>Closed-Date:
>Last-Modified:
>Originator:     Armin Pirkovitsch
>Release:        9.0B3 & 10.0 CURRENT
>Organization:
>Environment:
FreeBSD fz-sub1.local 9.0-BETA3 FreeBSD 9.0-BETA3 #7: Fri Oct  7 14:32:22 CEST 2011     root at fz-sub1.local:/usr/obj/usr/src/sys/FZ-SUB1  amd64

>Description:
Whenever I compile software or transfer large amounts of data, there is a chance that I get the following error:

ahcich0: AHCI reset: device not ready after 31000ms (tfd = 00000080)
ahcich0: Timeout on slot 28 port 0
ahcich0: is 00000000 cs 10000000 ss 00000000 rs 10000000 tfd 80 serr 00000000 cmd 0000fc17
(this error is from machine 2)
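
For anyone reading the register dump above, here is a rough sketch of how the values can be decoded. It is only an illustration and not taken from the driver. It assumes that "cs" and "rs" are 32-bit masks with one bit per command slot, that the low byte of "tfd" is the ATA status register (bit 7 BSY, bit 3 DRQ, bit 0 ERR), and that bits 12:8 of "cmd" hold the current command slot as described in the AHCI specification. The function names are made up for this example.

```python
def slots(mask):
    """Return the command slot numbers whose bits are set in a 32-bit mask."""
    return [n for n in range(32) if mask & (1 << n)]


def tfd_flags(tfd):
    """Name the ATA status bits set in the low byte of the task file data."""
    status = tfd & 0xFF
    names = {7: "BSY", 3: "DRQ", 0: "ERR"}
    return [name for bit, name in sorted(names.items(), reverse=True)
            if status & (1 << bit)]


def current_slot(cmd):
    """Extract the current command slot field (bits 12:8) from the cmd value."""
    return (cmd >> 8) & 0x1F


# Values copied from the error message above.
cs, rs, tfd, cmd = 0x10000000, 0x10000000, 0x80, 0x0000FC17

print("cs slots:", slots(cs))
print("rs slots:", slots(rs))
print("tfd flags:", tfd_flags(tfd))
print("current slot:", current_slot(cmd))
```

Under these assumptions the dump is consistent with the "Timeout on slot 28" line: the only outstanding command is in slot 28, and the device is still reporting busy.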

http://oh.homeunix.org/FreeBSD/ata/DSC06461.JPG
http://oh.homeunix.org/FreeBSD/ata/DSC06462.JPG
http://oh.homeunix.org/FreeBSD/ata/DSC06463.JPG

It never recovers after that problem, and the only solution is to turn off the power (a reset is not sufficient; it looks like the SATA controller is completely dead after such an occurrence).

I had the same problem in earlier versions of head (9-CURRENT) on machine 1 without the SSD, but was able to work around it by not using ahci. That workaround no longer works; the only difference is that the error above says "ataX:" instead of "ahcichX:".

Machine 2 started having the problem when I put the SSD into it.

I even tried the "NO_NCQ" 'switch' which I found somewhere on the net for a similar problem.

Both BIOSes are up to date and Windows runs stable on machine 2 (which means I do not expect this to be a hardware problem).

I've verified that the disk is fine (smartctl) and that the partitions fall within the disk.

I am aware of kern/161768, but the trace is different; therefore I am opening a new PR.

The problem also occurs with WITNESS, INVARIANTS etc. enabled and in debug mode, if that makes tracing any easier; the backtrace stays the same.


machine 1:
ahci0: <JMicron JMB363 AHCI SATA controller> mem 0xfbcfe000-0xfbcfffff irq 16 at device 0.0 on pci4
ahci0: AHCI v1.00 with 2 3Gbps ports, Port Multiplier supported
ahcich0: <AHCI channel> at channel 0 on ahci0
ahcich1: <AHCI channel> at channel 1 on ahci0
ahci1: <Intel ICH10 AHCI SATA controller> port 0x9c00-0x9c07,0x9880-0x9883,0x9800-0x9807,0x9480-0x9483,0x9400-0x941f mem 0xf7ffc000-0xf7ffc7ff irq 20 at device 31.2 on pci0
ahci1: AHCI v1.20 with 6 3Gbps ports, Port Multiplier supported
ahcich2: <AHCI channel> at channel 0 on ahci1
ahcich3: <AHCI channel> at channel 1 on ahci1
ahcich4: <AHCI channel> at channel 2 on ahci1
ahcich5: <AHCI channel> at channel 3 on ahci1
ahcich6: <AHCI channel> at channel 4 on ahci1
ahcich7: <AHCI channel> at channel 5 on ahci1
ada0 at ahcich2 bus 0 scbus4 target 0 lun 0
ada0: <Corsair Force 3 SSD 1.3> ATA-8 SATA 3.x device
ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada0: Command Queueing enabled
ada0: 114473MB (234441648 512 byte sectors: 16H 63S/T 16383C)
ada0: Previously was known as ad10
ada1 at ahcich3 bus 0 scbus5 target 0 lun 0
ada1: <SAMSUNG HD154UI 1AG01118> ATA-7 SATA 2.x device
ada1: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada1: Command Queueing enabled
ada1: 1430799MB (2930277168 512 byte sectors: 16H 63S/T 16383C)
ada1: Previously was known as ad12
ada2 at ahcich4 bus 0 scbus6 target 0 lun 0
ada2: <SAMSUNG HD154UI 1AG01118> ATA-7 SATA 2.x device
ada2: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada2: Command Queueing enabled
ada2: 1430799MB (2930277168 512 byte sectors: 16H 63S/T 16383C)
ada2: Previously was known as ad14

machine 2:
ahci0: <Intel 5 Series/3400 Series AHCI SATA controller> port 0x5058-0x505f,0x5084-0x5087,0x5050-0x5057,0x5080-0x5083,0x5020-0x503f mem 0xb7806000-0xb78067ff irq 19 at device 31.2 on pci0
ahci0: AHCI v1.30 with 6 3Gbps ports, Port Multiplier not supported
ahcich0: <AHCI channel> at channel 0 on ahci0
ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
ada0: <Corsair Force 3 SSD 1.3> ATA-8 SATA 3.x device
ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada0: Command Queueing enabled
ada0: 114473MB (234441648 512 byte sectors: 16H 63S/T 16383C)
ada0: Previously was known as ad4

>How-To-Repeat:

>Fix:


>Release-Note:
>Audit-Trail:
>Unformatted:

