Another ZFS kernel panic on same block on every drive in raidz
Mark Powell
M.S.Powell at salford.ac.uk
Thu Aug 30 11:13:48 PDT 2007
On Thu, 30 Aug 2007, Mark Powell wrote:
> I am being told that a DMA error is occurring on the same block on all 3
> drives at the same time:
>
> Just performing a scrub now to see what happens.
The scrub performed fine.
The panic is occurring under heavyish use, with 3 simultaneous rsyncs from
an XP box over Samba.
Just recalled that it panicked earlier, but I was in X and couldn't see
the message. Surprisingly, it did log something:
Aug 30 17:27:48 echo kernel: ad14: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=268435298
Aug 30 17:28:29 echo kernel: ad18: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=268435297
Aug 30 17:28:29 echo kernel: ad16: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=268435297
Aug 30 17:28:29 echo kernel: ad14: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=268435298
Aug 30 17:28:29 echo kernel: ad18: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=268435297
Aug 30 17:28:29 echo kernel: ad16: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=268435297
Aug 30 17:28:29 echo kernel: ad14: FAILURE - WRITE_DMA timed out LBA=268435298
Aug 30 17:28:29 echo kernel: ad18: FAILURE - WRITE_DMA timed out LBA=268435297
Aug 30 17:28:29 echo kernel: ad16: FAILURE - WRITE_DMA timed out LBA=268435297
Aug 30 17:28:29 echo kernel: ad18: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=268435297
Aug 30 17:28:29 echo kernel: ad14: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=268435298
Aug 30 17:28:29 echo kernel: ad16: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=268435297
Aug 30 17:28:29 echo kernel: ad18: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=268435297
Aug 30 17:28:29 echo kernel: ad14: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=268435298
Aug 30 17:28:29 echo kernel: ad16: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=268435297
Aug 30 17:28:29 echo kernel: ad18: FAILURE - WRITE_DMA timed out LBA=268435297
Aug 30 17:28:29 echo kernel: ad14: FAILURE - WRITE_DMA timed out LBA=268435298
Aug 30 17:28:29 echo kernel: ad16: FAILURE - WRITE_DMA timed out LBA=268435297
Aug 30 17:28:29 echo kernel: ad18: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=268435425
Aug 30 17:28:29 echo kernel: ad14: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=268435426
Aug 30 17:28:29 echo kernel: ad16: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=268435425
Aug 30 17:28:29 echo kernel: ad18: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=268435425
Aug 30 17:28:29 echo kernel: ad14: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=268435426
Aug 30 17:28:29 echo kernel: ad16: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=268435425
Aug 30 17:28:29 echo kernel: ad18: FAILURE - WRITE_DMA timed out LBA=268435425
Aug 30 17:28:29 echo kernel: ad14: FAILURE - WRITE_DMA timed out LBA=268435426
Here the blocks are different, and 4 blocks overall are reported as having
problems. In hex they all start with FFFFF. They are (including the one
from the previous report):
268435297 fffff61
268435298 fffff62
268435340 fffff8c
268435425 fffffe1
268435426 fffffe2
Coincidence?
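
For anyone wanting to check the conversion, here is a minimal Python sketch.
It prints each LBA in hex and how far it sits below 2^28 (268435456).
Relating these blocks to the 28-bit LBA addressing boundary is an assumption
made for illustration, not something the logs state; describe() and
LBA28_LIMIT are names invented for this example.

```python
# LBAs reported in the WRITE_DMA timeouts (this log plus the previous report)
LBAS = [268435297, 268435298, 268435340, 268435425, 268435426]

# Assumption: 2**28 is the number of sectors addressable with 28-bit LBA
LBA28_LIMIT = 1 << 28


def describe(lba):
    """Return (lowercase hex string, number of sectors below 2**28)."""
    return format(lba, "x"), LBA28_LIMIT - lba


for lba in LBAS:
    hex_str, gap = describe(lba)
    print(f"{lba} {hex_str} ({gap} sectors below 2^28)")
```

All five blocks fall within the last 160 sectors below 2^28, which may be
worth keeping in mind when looking for a cause.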
This is on amd64 with all drives connected to the ICH9 ports on a
Gigabyte Intel P35 based MB.
My -CURRENT is from 25 Aug 2007.
Cheers.
--
Mark Powell - UNIX System Administrator - The University of Salford
Information Services Division, Clifford Whitworth Building,
Salford University, Manchester, M5 4WT, UK.
Tel: +44 161 295 4837 Fax: +44 161 295 5888 www.pgp.com for PGP key