Can you list internal checksums of a ZFS filesystem?
Kai Gallasch
gallasch at free.de
Wed Jul 18 06:50:40 UTC 2012
On 18.07.2012 at 00:26, CH wrote:
>
> Hello list,
>
> I'm moving data to a ZFS filesystem, and it's a ton of big files (more
> than 3 terabytes). I don't trust the network copy command completely,
> and so I'd like to compare checksums. I'm not looking forward to it,
> since it's going to be a slow process, especially if I can't run the
> command on the server.
You could use rsync for transferring the data.
According to its man page, rsync verifies every transferred file against a whole-file checksum, and with the -c switch it also compares checksums on the sending and receiving side for each file:
http://www.freebsd.org/cgi/man.cgi?query=rsync&apropos=0&sektion=0&manpath=FreeBSD+Ports&arch=default&format=html
-c, --checksum
This changes the way rsync checks if the files have been changed
and are in need of a transfer. Without this option, rsync uses
a "quick check" that (by default) checks if each file's size and
time of last modification match between the sender and receiver.
This option changes this to compare a 128-bit checksum for each
file that has a matching size. Generating the checksums means
that both sides will expend a lot of disk I/O reading all the
data in the files in the transfer (and this is prior to any
reading that will be done to transfer changed files), so this
can slow things down significantly.
The sending side generates its checksums while it is doing the
file-system scan that builds the list of the available files.
The receiver generates its checksums when it is scanning for
changed files, and will checksum any file that has the same size
as the corresponding sender's file: files with either a changed
size or a changed checksum are selected for transfer.
Note that rsync always verifies that each transferred file was
correctly reconstructed on the receiving side by checking a
whole-file checksum that is generated as the file is transferred,
but that automatic after-the-transfer verification has
nothing to do with this option's before-the-transfer "Does this
file need to be updated?" check.
For protocol 30 and beyond (first supported in 3.0.0), the
checksum used is MD5. For older protocols, the checksum used is
MD4.
So running rsync without the -c switch first, and then a second time with -c, should be quite sufficient to make sure the data has not changed after being transferred. (Unless, of course, the underlying filesystem layers lie to the application about this, or the MD5 implementation in rsync is wrong :-)
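If you would rather check the result independently of rsync, the comparison that -c performs (same size first, then same checksum) can be approximated with a small script. This is only a rough sketch, not rsync's actual implementation: the function names `file_md5` and `compare_trees` are made up for illustration, and it assumes both trees are reachable from the machine running it (for example with the ZFS filesystem mounted over NFS), so the destination would be read across the network.

```python
import hashlib
import os


def file_md5(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(src_root, dst_root):
    """Return sorted relative paths that are missing or differ under dst_root.

    Only files found under src_root are checked; extra files that exist
    only under dst_root are not reported.
    """
    problems = []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        for name in filenames:
            src_path = os.path.join(dirpath, name)
            rel_path = os.path.relpath(src_path, src_root)
            dst_path = os.path.join(dst_root, rel_path)
            if not os.path.isfile(dst_path):
                problems.append(rel_path)  # never arrived
            elif os.path.getsize(src_path) != os.path.getsize(dst_path):
                problems.append(rel_path)  # size differs, no need to hash
            elif file_md5(src_path) != file_md5(dst_path):
                problems.append(rel_path)  # same size, different content
    return sorted(problems)
```

Calling `compare_trees("/data/src", "/mnt/zfs/dst")` (both paths are placeholders) would return an empty list if every source file has an identical copy. For 3 terabytes, letting rsync -c hash each side locally is probably much faster than reading one side over the network like this.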
Rsync also makes it possible to transfer the data in several runs, at the times most convenient to you (or your network).
It also supports a switch for limiting bandwidth usage (--bwlimit).
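To make the two passes and the bandwidth limit a bit more concrete, here is a small sketch that only builds the rsync command lines. The helper name `rsync_command`, the example paths and the 5000 KB/s limit are assumptions for illustration; the switches themselves (-a, --partial, -c, -n, -i, --bwlimit) are standard rsync options, but please check them against the man page of your installed version.

```python
import subprocess


def rsync_command(src, dst, checksum=False, dry_run=False, bwlimit_kbps=None):
    """Build an rsync argument list; nothing is executed here."""
    # -a preserves attributes; --partial keeps partially transferred
    # files, which helps when the copy is spread over several runs.
    cmd = ["rsync", "-a", "--partial"]
    if checksum:
        cmd.append("-c")  # compare file checksums, not just size and mtime
    if dry_run:
        cmd += ["-n", "-i"]  # change nothing, only itemize what differs
    if bwlimit_kbps is not None:
        cmd.append(f"--bwlimit={bwlimit_kbps}")
    cmd += [src, dst]
    return cmd


def run(cmd):
    """Run a command built above and raise if rsync exits with an error."""
    return subprocess.run(cmd, check=True)


# Hypothetical paths: first pass copies with a bandwidth cap,
# second pass only reports files whose checksums differ.
first_pass = rsync_command("/data/src/", "server:/tank/dst/", bwlimit_kbps=5000)
verify_pass = rsync_command("/data/src/", "server:/tank/dst/",
                            checksum=True, dry_run=True)
```

Passing `first_pass` to `run()` repeatedly continues the copy across several sessions. If the verification pass itemizes no files, the checksums matched on both sides.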
Have a nice day,
Kai.
More information about the freebsd-fs mailing list