10.1 RC4 r273903 - zpool scrub on ssd mirror - ahci command timeout

Kai Gallasch k at free.de
Tue Dec 9 10:09:04 UTC 2014


On Tue,  9 Dec 2014 09:04:26 +0000 (UTC),
"Ganael LAPLANCHE" <ganael.laplanche at martymac.org> wrote:

> On Tue, 9 Dec 2014 09:34:05 +0100, Kai Gallasch wrote
> 
> Hi Kai,
> 
> > Any ideas (left) ?
> 
> There is a PR for AHCI timeouts :
> 
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=195349
> 
> I don't know if it is related to your problem but maybe you can try
> the suggested workaround ?

Thank you for this information.
But no, my problem seems to be unrelated. With the workaround applied
(see below) a drive still drops out during a scrub.

K.


echo 'hint.ahci.0.msi="0"' >> /boot/loader.conf
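A side note on the command above: running the echo twice leaves a
duplicate line in loader.conf. A minimal sketch of an idempotent
variant follows; the helper name add_loader_line is hypothetical and
not part of FreeBSD.

```shell
# add_loader_line FILE LINE
# Append LINE to FILE unless an identical line is already present.
# (Hypothetical helper for illustration only.)
add_loader_line() {
    file=$1
    line=$2
    # -q: quiet, -x: match the whole line, -F: fixed string, no regex
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}
```

Usage would be: add_loader_line /boot/loader.conf 'hint.ahci.0.msi="0"'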

After reboot:

# zpool scrub ssdpool
# zpool status ssdpool
  pool: ssdpool
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
	attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
	using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-9P
  scan: scrub in progress since Tue Dec  9 10:36:24 2014
        5.36G scanned out of 115G at 166M/s, 0h11m to go
        24.5K repaired, 4.65% done
config:

	NAME              STATE     READ WRITE CKSUM
	ssdpool           ONLINE       0     0     0
	  mirror-0        ONLINE       0     0     0
	    gpt/ssdpool0  ONLINE       0     0    13  (repairing)
	    gpt/ssdpool1  ONLINE       0     0     0

errors: No known data errors


After the zpool scrub finished:

# zpool status ssdpool
  pool: ssdpool
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
	attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
	using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-9P
  scan: scrub repaired 38.5K in 0h9m with 0 errors on Tue Dec  9 10:45:58 2014
config:

	NAME              STATE     READ WRITE CKSUM
	ssdpool           ONLINE       0     0     0
	  mirror-0        ONLINE       0     0     0
	    gpt/ssdpool0  ONLINE       0     0    15
	    gpt/ssdpool1  ONLINE       0     0     4
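
To compare the error counters between scrub runs without reading the
whole status output, something like the following could be used. It
is only a sketch: the function name zpool_error_devices is
hypothetical, and it assumes the "NAME STATE READ WRITE CKSUM" column
layout shown above.

```shell
# zpool_error_devices: read `zpool status` output on stdin and print
# every config row whose READ, WRITE or CKSUM counter is non-zero.
# (Hypothetical helper; assumes the five-column layout shown above.)
zpool_error_devices() {
    awk '
        # The header row marks the start of the device table.
        $1 == "NAME" && $2 == "STATE" { in_config = 1; next }
        # The trailing "errors:" line marks its end.
        /^errors:/ { in_config = 0 }
        in_config && NF >= 5 && ($3 + $4 + $5) > 0 {
            printf "%s read=%s write=%s cksum=%s\n", $1, $3, $4, $5
        }
    '
}
```

Usage would be: zpool status ssdpool | zpool_error_devices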



# zpool clear ssdpool
# zpool scrub ssdpool

During this "zpool scrub" run, one SSD drive was lost :-/
A "camcontrol rescan all" does not bring the missing SSD drive back.


# zpool status ssdpool
  pool: ssdpool
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
	the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://illumos.org/msg/ZFS-8000-2Q
  scan: scrub canceled on Tue Dec  9 10:58:42 2014
config:

	NAME                     STATE     READ WRITE CKSUM
	ssdpool                  DEGRADED     0     0     0
	  mirror-0               DEGRADED     0     0     0
	    gpt/ssdpool0         ONLINE       0     0     0
	    2481016284460057031  UNAVAIL    297   215    47  was /dev/gpt/ssdpool1



ahcich3: Timeout on slot 24 port 0
ahcich3: is 00000000 cs fc00001f ss ff00001f rs ff00001f tfd 40 serr 00000000 cmd 0024d917
(ada3:ahcich3:0:0:0): READ_FPDMA_QUEUED. ACB: 60 24 1b 4a c6 40 1e 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Command timeout
(ada3:ahcich3:0:0:0): Retrying command
ahcich3: AHCI reset: device not ready after 31000ms (tfd = 00000080)
ahcich3: Timeout on slot 4 port 0
ahcich3: is 00000000 cs 00000010 ss 00000000 rs 00000010 tfd 80 serr 00000000 cmd 0024c417
(aprobe0:ahcich3:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
(aprobe0:ahcich3:0:0:0): CAM status: Command timeout
(aprobe0:ahcich3:0:0:0): Retrying command
ahcich3: AHCI reset: device not ready after 31000ms (tfd = 00000080)
ahcich3: Timeout on slot 4 port 0
ahcich3: is 00000000 cs 00000010 ss 00000000 rs 00000010 tfd 80 serr 00000000 cmd 0024c417
(aprobe0:ahcich3:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
(aprobe0:ahcich3:0:0:0): CAM status: Command timeout
(aprobe0:ahcich3:0:0:0): Error 5, Retries exhausted
ahcich3: AHCI reset: device not ready after 31000ms (tfd = 00000080)
ahcich3: Timeout on slot 4 port 0
ahcich3: is 00000000 cs 00000010 ss 00000000 rs 00000010 tfd 80 serr 00000000 cmd 0024c417
(aprobe0:ahcich3:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
(aprobe0:ahcich3:0:0:0): CAM status: Command timeout
(aprobe0:ahcich3:0:0:0): Error 5, Retry was blocked
ada3 at ahcich3 bus 0 scbus3 target 0 lun 0
ada3: <Samsung SSD 850 PRO 512GB EXM01B6Q> s/n S1SXNSAFA06835A detached
ahcich3: AHCI reset: device not ready after 31000ms (tfd = 00000080)
ahcich3: Timeout on slot 4 port 0
ahcich3: is 00000000 cs 00000010 ss 00000000 rs 00000010 tfd 80 serr 00000000 cmd 0024c417
(aprobe0:ahcich3:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
(aprobe0:ahcich3:0:0:0): CAM status: Command timeout
(aprobe0:ahcich3:0:0:0): Retrying command
ahcich3: AHCI reset: device not ready after 31000ms (tfd = 00000080)
ahcich3: Timeout on slot 4 port 0
ahcich3: is 00000000 cs 00000010 ss 00000000 rs 00000010 tfd 80 serr 00000000 cmd 0024c417
(aprobe0:ahcich3:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
(aprobe0:ahcich3:0:0:0): CAM status: Command timeout
(aprobe0:ahcich3:0:0:0): Error 5, Retries exhausted
ahcich3: AHCI reset: device not ready after 31000ms (tfd = 00000080)
ahcich3: Poll timeout on slot 4 port 0
ahcich3: is 00000000 cs 00000010 ss 00000000 rs 00000010 tfd 80 serr 00000000 cmd 0024c417
(aprobe0:ahcich3:0:0:0): NOP. ACB: 00 00 00 00 00 00 00 00 00 00 00 00
(aprobe0:ahcich3:0:0:0): CAM status: Command timeout
(aprobe0:ahcich3:0:0:0): Error 5, Retries exhausted
ahcich3: Timeout on slot 4 port 0
ahcich3: is 00000000 cs 00000010 ss 00000000 rs 00000010 tfd 80 serr 00000000 cmd 0024c417
(ada3:ahcich3:0:0:0): SETFEATURES ENABLE RCACHE. ACB: ef aa 00 00 00 40 00 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Unconditionally Re-queue Request
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
ahcich3: AHCI reset: device not ready after 31000ms (tfd = 00000080)
ahcich3: Poll timeout on slot 4 port 0
ahcich3: is 00000000 cs 00000010 ss 00000000 rs 00000010 tfd 80 serr 00000000 cmd 0024c417
(aprobe0:ahcich3:0:0:0): NOP. ACB: 00 00 00 00 00 00 00 00 00 00 00 00
(aprobe0:ahcich3:0:0:0): CAM status: Command timeout
(aprobe0:ahcich3:0:0:0): Error 5, Retries exhausted
ahcich3: Timeout on slot 4 port 0
ahcich3: is 00000000 cs 0001fff0 ss 0001fff0 rs 0001fff0 tfd 80 serr 00000000 cmd 0024c417
(ada3:ahcich3:0:0:0): READ_FPDMA_QUEUED. ACB: 60 24 1b 4a c6 40 1e 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Command timeout
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
(ada3:ahcich3:0:0:0): READ_FPDMA_QUEUED. ACB: 60 15 3f 4a c6 40 1e 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Unconditionally Re-queue Request
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
(ada3:ahcich3:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 08 ed c4 21 40 20 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Unconditionally Re-queue Request
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
(ada3:ahcich3:0:0:0): READ_FPDMA_QUEUED. ACB: 60 00 71 19 d3 40 04 00 00 01 00 00
(ada3:ahcich3:0:0:0): CAM status: Unconditionally Re-queue Request
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
(ada3:ahcich3:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 f0 bb 43 23 40 20 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Unconditionally Re-queue Request
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
(ada3:ahcich3:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 e8 bb 45 23 40 20 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Unconditionally Re-queue Request
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
(ada3:ahcich3:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 e8 bb 46 23 40 20 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Unconditionally Re-queue Request
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
(ada3:ahcich3:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 e8 bb 47 23 40 20 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Unconditionally Re-queue Request
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
(ada3:ahcich3:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 e8 bb 48 23 40 20 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Unconditionally Re-queue Request
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
(ada3:ahcich3:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 e8 bb 49 23 40 20 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Unconditionally Re-queue Request
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
(ada3:ahcich3:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 e8 bb 4a 23 40 20 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Unconditionally Re-queue Request
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
(ada3:ahcich3:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 e8 bb 4b 23 40 20 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Unconditionally Re-queue Request
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
(ada3:ahcich3:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 e8 bb 4c 23 40 20 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Unconditionally Re-queue Request
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
(ada3:ahcich3:0:0:0): Periph destroyed
ahcich3: AHCI reset: device not ready after 31000ms (tfd = 00000080)
ahcich3: Poll timeout on slot 16 port 0
ahcich3: is 00000000 cs 00010000 ss 00000000 rs 00010000 tfd 80 serr 00000000 cmd 0024d017
(aprobe0:ahcich3:0:0:0): NOP. ACB: 00 00 00 00 00 00 00 00 00 00 00 00
(aprobe0:ahcich3:0:0:0): CAM status: Command timeout
(aprobe0:ahcich3:0:0:0): Error 5, Retries exhausted
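
A log like the one above can be condensed into a per-channel timeout
count. This is a rough sketch: the function name count_ahci_timeouts
is hypothetical, and it only matches the "Timeout on slot" and
"Poll timeout on slot" lines in the format printed above.

```shell
# count_ahci_timeouts: read kernel messages on stdin and print the
# number of AHCI slot timeouts per channel, sorted by channel name.
# (Hypothetical helper; matches only the line format shown above.)
count_ahci_timeouts() {
    awk '
        /^ahcich[0-9]+: (Poll )?[Tt]imeout on slot/ {
            chan = $1
            sub(/:$/, "", chan)   # strip the trailing colon
            count[chan]++
        }
        END { for (c in count) print c, count[c] }
    ' | sort
}
```

Usage would be: dmesg | count_ahci_timeouts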


-- 
PGP-KeyID = 0xE401B671927D4A5C
I am a robot.



More information about the freebsd-stable mailing list