ZFS: 'checksum mismatch' all over the place

Bakul Shah bakul at bitblocks.com
Mon Aug 20 11:14:11 PDT 2007


> Aug 20 01:00:24 leibniz root: ZFS: checksum mismatch, zpool=pil path=/dev/da0
>  offset=58350080 size=512
> Aug 20 01:00:24 leibniz root: ZFS: checksum mismatch, zpool=pil path=/dev/da1
>  offset=58350080 size=512
> Aug 20 01:00:24 leibniz root: ZFS: checksum mismatch, zpool=pil path=/dev/da2
>  offset=58350080 size=512
> Aug 20 01:00:24 leibniz root: ZFS: checksum mismatch, zpool=pil path=/dev/da3
>  offset=58350080 size=512
> 
> Aug 20 01:00:24 leibniz root: ZFS: checksum mismatch, zpool=pil path=/dev/da2
>  offset=38010880 size=512
> Aug 20 01:00:24 leibniz root: ZFS: checksum mismatch, zpool=pil path=/dev/da3
>  offset=38010880 size=512
> Aug 20 01:00:24 leibniz root: ZFS: checksum mismatch, zpool=pil path=/dev/da4
>  offset=38010880 size=512
> Aug 20 01:00:24 leibniz root: ZFS: checksum mismatch, zpool=pil path=/dev/da5
>  offset=38010880 size=512
> Aug 20 01:00:24 leibniz root: ZFS: checksum mismatch, zpool=pil path=/dev/da6
>  offset=38010880 size=512
> Aug 20 01:00:24 leibniz root: ZFS: checksum mismatch, zpool=pil path=/dev/da7
>  offset=38010880 size=512
> Aug 20 01:00:24 leibniz root: ZFS: checksum mismatch, zpool=pil path=/dev/da8
>  offset=38010880 size=512
> Aug 20 01:00:24 leibniz root: ZFS: checksum mismatch, zpool=pil path=/dev/da9
>  offset=38010880 size=512
> Aug 20 01:00:24 leibniz root: ZFS: checksum mismatch, zpool=pil path=/dev/da10
>  offset=38010880 size=512
> 
> Can anybody offer anything to help me with this? I'm pretty much at a
> loss as to how I can find the cause of this.

This probably means that more than two blocks in the raidz2 were
bad, so zpool can't correct the error.

Just speculating here, but maybe the controller or the disk
writes there "behind your back" (assuming the reported offset
is correct -- you can check the ZFS logic for that)?  Can you map
the offset to a disk block number?
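
As a rough sketch of that mapping: if the reported offset is a plain byte
offset from the start of the device (an assumption, not something I have
verified in the ZFS code) and the sectors are 512 bytes (matching size=512
in the log), the block number is just a division.  The name
offset_to_sector is made up for this example.

```python
SECTOR_SIZE = 512  # assumption: 512-byte sectors, matching size=512 in the log


def offset_to_sector(offset, sector_size=SECTOR_SIZE):
    """Return (sector number, byte offset inside that sector)."""
    return divmod(offset, sector_size)


# The two offsets that appear in the quoted log.
for off in (58350080, 38010880):
    sector, rem = offset_to_sector(off)
    print(f"offset {off} -> sector {sector} (+{rem} bytes)")
```

Both offsets divide evenly by 512, so each one starts exactly on a sector
boundary under these assumptions.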

You can try writing/reading that block (after disabling zfs)
and see if it changes in an unexpected way.  This may not
show any error if the problem is some complex interaction.
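
A minimal, read-only sketch of the reading half of that test: fingerprint
the block now, fingerprint it again later, and compare.  The helper name
block_digest is hypothetical, and this assumes the device node can be
opened and read like an ordinary file at a sector-aligned offset.  It
deliberately does not write anything.

```python
import hashlib


def block_digest(path, offset, size=512):
    """Read `size` bytes at byte `offset` of `path`; return their SHA-256 hex digest."""
    with open(path, "rb") as dev:
        dev.seek(offset)
        data = dev.read(size)
    if len(data) != size:
        # A short read means we ran off the end of the device or file.
        raise OSError(f"short read: got {len(data)} of {size} bytes")
    return hashlib.sha256(data).hexdigest()
```

For example, block_digest("/dev/da2", 38010880) run twice, some time
apart and with the pool not in use, should give the same digest if nothing
is writing to that block.  A second read may be served from a cache, so
matching digests do not prove much on their own.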

If the disks are all the same and new, check the vendor website to
see if there is a firmware upgrade.

See if replacing one disk with another type of disk changes
the error.

38010880 is 0x2440000 -- don't know if that is magic in any
way, but sometimes a hex value can reveal a pattern.  Always
look at the binary or hex representation of any reported
number in an error message!
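
A small sketch of that habit, applied to the two offsets from the log.  The
function name describe is made up here; the "multiple of" figure is the
largest power of two that divides the number, which is one quick way to
spot alignment.

```python
def describe(n):
    """Show n in decimal, hex and binary, plus its largest power-of-two divisor."""
    alignment = n & -n  # lowest set bit = largest power of two dividing n
    return f"{n} = {n:#x} = {n:#b} (multiple of {alignment})"


for off in (58350080, 38010880):
    print(describe(off))
```

This shows 38010880 as a multiple of 262144 (256 KB), while 58350080 is
only a multiple of 512.  Whether that means anything here I can't say.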

