ZFS (zpool) doesn't detect failed drive

Harald Schmalzbauer h.schmalzbauer at omnilan.de
Wed May 5 14:56:44 UTC 2010


Harald Schmalzbauer wrote on 05.05.2010 14:41 (localtime):
> Hello,
> 
> one drive of my mirror failed today, but 'zpool status' shows it as "online".
> Every process using a ZFS mount hangs. Also 'zpool offline /dev/ad1' 
> hangs indefinitely.
...
Sorry, I made an error with zpool create. Somehow the little word 
"mirror" got lost, so the pool wasn't a mirror but a stripe. 
Then of course I can't take one vdev offline. Sorry for the noise.
But I took the opportunity to do some tests with that failing drive and 
created a _real_ mirror. Creating it works without errors, but using the 
mirror again leads to:
ad1: TIMEOUT - FLUSHCACHE48 retrying (1 retry left)
ad1: TIMEOUT - FLUSHCACHE48 retrying (1 retry left)
ad1: TIMEOUT - FLUSHCACHE48 retrying (1 retry left)
ad1: TIMEOUT - FLUSHCACHE48 retrying (1 retry left)
ad1: TIMEOUT - FLUSHCACHE48 retrying (1 retry left)
ad1: TIMEOUT - FLUSHCACHE48 retrying (1 retry left)
ata3: port is not ready (timeout 10000ms) tfd = 00000080
ata3: hardware reset timeout
ad1: FAILURE - device detached

Now zpool reports the vdev ad1 as still online although it has been 
detached and 'atacontrol list' doesn't show it anymore:

zpool status
   pool: URUBAmirrorP1
  state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
         attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
         using 'zpool clear' or replace the device with 'zpool replace'.
    see: http://www.sun.com/msg/ZFS-8000-9P
  scrub: none requested
config:

         NAME        STATE     READ WRITE CKSUM
         URUBAmirrorP1  ONLINE       0     0     0
           mirror    ONLINE       0     0     0
             ad1     ONLINE       3  302K     0
             ad2     ONLINE       0     0     0

errors: No known data errors
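
The output above can also be checked mechanically. The following is only a rough sketch in Python, not an official ZFS tool: it scans the config table for devices that are still marked ONLINE although their error counters are nonzero. The names parse_count and suspicious_devices are hypothetical, the sample text is copied from the output above, and treating the "K" suffix as a factor of 1000 is an assumption made for illustration.

```python
import re

# Config table copied from the 'zpool status' output above.
STATUS = """\
        NAME        STATE     READ WRITE CKSUM
        URUBAmirrorP1  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            ad1     ONLINE       3  302K     0
            ad2     ONLINE       0     0     0
"""

# Assumption: suffixes are treated as decimal multipliers (approximation).
SUFFIX = {"K": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}


def parse_count(text):
    """Convert a counter such as '302K' to an approximate integer."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([KMGT]?)", text)
    if match is None:
        raise ValueError("not a counter: %r" % text)
    return int(float(match.group(1)) * SUFFIX.get(match.group(2), 1))


def suspicious_devices(status_text):
    """Return (name, read, write, cksum) for ONLINE rows with nonzero counters."""
    found = []
    for line in status_text.splitlines():
        fields = line.split()
        # Data rows have exactly five columns; skip the header row.
        if len(fields) != 5 or fields[0] == "NAME":
            continue
        name, state = fields[0], fields[1]
        try:
            read, write, cksum = (parse_count(c) for c in fields[2:])
        except ValueError:
            continue
        if state == "ONLINE" and (read or write or cksum):
            found.append((name, read, write, cksum))
    return found


if __name__ == "__main__":
    for name, read, write, cksum in suspicious_devices(STATUS):
        print(name, "is ONLINE with errors:", read, write, cksum)
```

With the table above, only ad1 is reported.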

atacontrol list
ATA channel 2:
     Master:  ad0 <TRANSCEND/20090520> SATA revision 1.x
     Slave:       no device present
ATA channel 3:
     Master:      no device present
     Slave:       no device present
ATA channel 4:
     Master:  ad2 <SAMSUNG HD154UI/1AG01118> SATA revision 2.x
     Slave:       no device present
ATA channel 5:
     Master:  ad3 <ST3750640NS/3.AEG> SATA revision 1.x
     Slave:       no device present
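
The mismatch between the two outputs can be shown with a small cross-check. This is a sketch under assumptions, not a recommended procedure: it takes both outputs as text (copied from above), assumes disk vdevs are named like ad<N>, and lists the devices that zpool still calls ONLINE but that 'atacontrol list' no longer shows. All function names are hypothetical.

```python
import re

# Config table copied from the 'zpool status' output above.
ZPOOL_CONFIG = """\
        NAME        STATE     READ WRITE CKSUM
        URUBAmirrorP1  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            ad1     ONLINE       3  302K     0
            ad2     ONLINE       0     0     0
"""

# Copied from the 'atacontrol list' output above.
ATACONTROL_LIST = """\
ATA channel 2:
     Master:  ad0 <TRANSCEND/20090520> SATA revision 1.x
     Slave:       no device present
ATA channel 3:
     Master:      no device present
     Slave:       no device present
ATA channel 4:
     Master:  ad2 <SAMSUNG HD154UI/1AG01118> SATA revision 2.x
     Slave:       no device present
ATA channel 5:
     Master:  ad3 <ST3750640NS/3.AEG> SATA revision 1.x
     Slave:       no device present
"""


def online_vdevs(config_text):
    """Names of ad<N> devices that the pool config lists as ONLINE."""
    names = set()
    for line in config_text.splitlines():
        fields = line.split()
        if (len(fields) >= 2 and fields[1] == "ONLINE"
                and re.fullmatch(r"ad\d+", fields[0])):
            names.add(fields[0])
    return names


def attached_disks(atacontrol_text):
    """Names of ad<N> devices present on any ATA channel."""
    pattern = r"^\s*(?:Master|Slave):\s+(ad\d+)\s+<"
    return set(re.findall(pattern, atacontrol_text, flags=re.M))


def missing_but_online(config_text, atacontrol_text):
    """Devices the pool calls ONLINE that the ATA layer no longer lists."""
    return sorted(online_vdevs(config_text) - attached_disks(atacontrol_text))


if __name__ == "__main__":
    print(missing_but_online(ZPOOL_CONFIG, ATACONTROL_LIST))
```

For the two outputs above, the only such device is ad1.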

How should such a failure be handled?
Do I have to manually mark the drive offline for zpool?

Thanks,

-Harry
