Can you list internal checksums of a ZFS filesystem?
CH
freebsd-fs at ch.pkts.ca
Wed Jul 18 14:59:00 UTC 2012
On Wed, 18 Jul 2012 08:43:57 +0200
Kai Gallasch <gallasch at free.de> wrote:
> On 18.07.2012 at 00:26, CH wrote:
> >
> > Hello list,
> >
> > I'm moving data to a ZFS filesystem, and it's a ton of big files
> > (more than 3 terabytes). I don't trust the network copy command
> > completely, and so I'd like to compare checksums. I'm not looking
> > forward to it, since it's going to be a slow process, especially if
> > I can't run the command on the server.
>
> You could use rsync for transferring the data.
>
> According to its man page rsync calculates checksums for transferred
> files and on its initial run compares checksums on the sending and
> receiving side for each file:
>
> http://www.freebsd.org/cgi/man.cgi?query=rsync&apropos=0&sektion=0&manpath=FreeBSD+Ports&arch=default&format=html
> <rsync's -c option detailed>
>
> So running rsync first without the -c switch and then a second time
> with -c should be quite sufficient to make sure the data has not
> changed after being transferred. (Unless, of course, the underlying
> filesystem layers lie about this to the application, or MD5 is
> wrongly implemented in rsync :-)
>
> Also rsync makes it possible to transfer the data in several runs,
> at times most convenient to you (or your network). It also supports a
> switch for limiting bandwidth usage.
>
> Have a nice day,
> Kai.
Actually, I did do rsync for the initial transfers, and it had to be
restarted a couple of times for reasons that were not its fault (source
computer rebooted, ssh connection lost, etc.). However, after it
finished copying everything (i.e., exited normally), I ran it again, and
it found more stuff to copy. This shouldn't have happened since
nothing was added to the source computer, and so now I distrust its
results and want to check it independently. In particular, I don't
trust its directory-walking algorithm, so some files may have been
missed and may continue to be missed in future runs of rsync, with or
without -c.
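One way to check the directory walk independently of rsync is to compare
plain file lists from both trees. The following is only a rough sketch: the
function name list_diff is made up for illustration, it assumes both trees
are reachable as local paths (for example over a network mount), and it
assumes no filename contains a newline.

```shell
#!/bin/sh
# list_diff SRC DST (hypothetical helper): print relative paths of regular
# files that exist in only one of the two trees.
# Lines with no leading tab exist only under SRC;
# lines with a leading tab exist only under DST.
list_diff() {
    src_list=$(mktemp "${TMPDIR:-/tmp}/list_diff.XXXXXX") || return 1
    dst_list=$(mktemp "${TMPDIR:-/tmp}/list_diff.XXXXXX") || return 1
    # Paths are listed relative to each root so they are comparable.
    (cd "$1" && find . -type f | LC_ALL=C sort) > "$src_list"
    (cd "$2" && find . -type f | LC_ALL=C sort) > "$dst_list"
    # comm -3 suppresses the lines common to both sorted lists.
    LC_ALL=C comm -3 "$src_list" "$dst_list"
    rm -f "$src_list" "$dst_list"
}
```

If the two machines cannot see each other's filesystems, the same find and
sort could be run on each side and the two list files compared with comm
afterwards. Empty output would mean both walks saw the same set of paths.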
The method I was going to use was to run

    find . -type f -print0 | xargs -0 md5sum > my.big.md5sum.file

on both source and destination, but if I can harvest the ZFS checksums
(file or block) it would cut the CPU workload in half and save a
tree's worth of energy.
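For reference, a rough sketch of how the md5sum approach could be made
directly comparable: find does not guarantee the same traversal order on two
machines, so the manifest is sorted by path before diffing. The function name
make_manifest is made up for illustration, and md5sum is assumed to be
installed on both sides (a stock FreeBSD system may only ship md5, whose
output format differs).

```shell
#!/bin/sh
# make_manifest DIR OUT (hypothetical helper): write one
# "checksum  ./relative/path" line per regular file under DIR into OUT,
# sorted by path so that manifests from two machines can be diffed.
# OUT should live outside DIR, otherwise it would be checksummed too.
make_manifest() {
    (cd "$1" && find . -type f -print0 | xargs -0 md5sum \
        | LC_ALL=C sort -k 2) > "$2"
}

# Possible usage, with placeholder paths:
#   make_manifest /path/to/data /tmp/source.md5        # on the source
#   make_manifest /path/to/data /tmp/destination.md5   # on the destination
#   diff /tmp/source.md5 /tmp/destination.md5          # after copying one over
```

A silent diff would mean every file is present on both sides with the same
MD5; any line it prints names a file that is missing or differs.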