Can you list internal checksums of a ZFS filesystem?

CH freebsd-fs at ch.pkts.ca
Wed Jul 18 14:59:00 UTC 2012


On Wed, 18 Jul 2012 08:43:57 +0200
Kai Gallasch <gallasch at free.de> wrote:

> On 18.07.2012 at 00:26, CH wrote:
> > 
> > Hello list,
> > 
> > I'm moving data to a ZFS filesystem, and it's a ton of big files
> > (more than 3 terabytes).  I don't trust the network copy command
> > completely, and so I'd like to compare checksums.  I'm not looking
> > forward to it, since it's going to be a slow process, especially if
> > I can't run the command on the server. 
> 
> You could use rsync for transferring the data.
> 
> According to its man page rsync calculates checksums for transferred
> files and on its initial run compares checksums on the sending and
> receiving side for each file:
> 
> http://www.freebsd.org/cgi/man.cgi?query=rsync&apropos=0&sektion=0&manpath=FreeBSD+Ports&arch=default&format=html
> 
> <rsync's -c option detailed>
> 
> So running rsync first without the -c switch and then a second time
> with -c should be quite sufficient to make sure the data has not
> changed after being transferred. (Unless, of course, the underlying
> filesystem layers lie about this to the application, or MD5 is
> wrongly implemented in rsync :-)
> 
> Also rsync makes it possible to transfer the data in several runs,
> at times most convenient to you (or your network). It also supports a
> switch for limiting bandwidth usage.
> 
> Have a nice day,
>  Kai.

Actually, I did use rsync for the initial transfers, and it had to be
restarted a couple of times for reasons that were not its fault (source
computer rebooted, ssh connection lost, etc.).  However, after it
finished copying everything (i.e., exited normally), I ran it again, and
it found more stuff to copy.  This shouldn't have happened, since
nothing was added to the source computer, so now I distrust its
results and want to check them independently.  In particular, I don't
trust its directory-walking algorithm: some files may have been
missed and may continue to be missed in future runs of rsync, with or
without -c.
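
A check for missed files that does not depend on rsync's directory walk
could be as simple as comparing sorted file listings from both sides.
The following is only a minimal sketch under assumptions: list_files is
a hypothetical helper name, the two trees are throwaway demo
directories under /tmp standing in for source and destination, and
file names are assumed not to contain newlines.

```shell
#!/bin/sh
# list_files DIR: print every regular file under DIR, relative to DIR,
# sorted in the C locale so both sides sort identically.
# (list_files is a hypothetical helper, not an existing tool.)
list_files() {
    ( cd "$1" && find . -type f -print | LC_ALL=C sort )
}

# Demo: two throwaway trees standing in for source and destination.
src=$(mktemp -d /tmp/src.XXXXXX)
dst=$(mktemp -d /tmp/dst.XXXXXX)
mkdir -p "$src/a" "$dst/a"
echo one > "$src/a/f1"; echo one > "$dst/a/f1"
echo two > "$src/a/f2"          # deliberately missing on the destination

list_files "$src" > "$src.list"
list_files "$dst" > "$dst.list"

# comm -23 prints lines found only in the first (source) listing,
# i.e. files that never arrived on the destination.
missing=$(LC_ALL=C comm -23 "$src.list" "$dst.list")
echo "missing on destination: $missing"

rm -rf "$src" "$dst" "$src.list" "$dst.list"
```

On the real machines each listing would be generated locally and only
the (small) list files copied across for the comparison.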

The method I was going to use was 'find . -type f -print0 | xargs -0
md5sum > my.big.md5sum.file' on both source and destination, but if I
can harvest the ZFS checksums (file or block) it would cut the CPU
workload in half, and save a tree's worth of energy.
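
One practical detail with the find/md5sum approach is that find does
not promise the same output order on two machines, so the two checksum
files are easier to compare if each is sorted by path first.  A rough
sketch follows, assuming md5sum is installed (FreeBSD's base system
ships md5 instead, and 'md5 -r' prints a similar checksum-then-path
layout); make_manifest is a hypothetical helper name and the demo
directories are throwaway stand-ins for the real trees.

```shell
#!/bin/sh
# make_manifest DIR OUT: write one "checksum  ./relative/path" line per
# regular file under DIR into OUT, sorted by path so that manifests from
# two machines can be compared line by line.
# (make_manifest is a hypothetical helper, not an existing tool.)
make_manifest() {
    ( cd "$1" && find . -type f -print0 | xargs -0 md5sum \
        | LC_ALL=C sort -k 2 ) > "$2"
}

# Demo: two throwaway trees; f2 differs to simulate a bad copy.
src=$(mktemp -d /tmp/src.XXXXXX)
dst=$(mktemp -d /tmp/dst.XXXXXX)
printf 'alpha\n' > "$src/f1"; printf 'alpha\n' > "$dst/f1"
printf 'beta\n'  > "$src/f2"; printf 'BETA\n'  > "$dst/f2"

make_manifest "$src" "$src.md5"
make_manifest "$dst" "$dst.md5"

# diff exits non-zero when the manifests differ.
if diff "$src.md5" "$dst.md5" > /dev/null; then
    result=identical
else
    result=differ
fi
echo "manifests $result"

rm -rf "$src" "$dst" "$src.md5" "$dst.md5"
```

Because the paths are relative to each tree's root, the manifests stay
comparable even when source and destination live under different
mount points.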

