Can you list internal checksums of a ZFS filesystem?

Kai Gallasch gallasch at free.de
Wed Jul 18 06:50:40 UTC 2012


On 18.07.2012 at 00:26, CH wrote:
> 
> Hello list,
> 
> I'm moving data to a ZFS filesystem, and it's a ton of big files (more
> than 3 terabytes).  I don't trust the network copy command completely,
> and so I'd like to compare checksums.  I'm not looking forward to it,
> since it's going to be a slow process, especially if I can't run the
> command on the server. 

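You could use rsync to transfer the data.

According to its man page, rsync generates a whole-file checksum for each transferred file and verifies it on the receiving side. With the -c switch it additionally compares checksums on the sending and receiving sides before deciding whether a file needs to be transferred: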

http://www.freebsd.org/cgi/man.cgi?query=rsync&apropos=0&sektion=0&manpath=FreeBSD+Ports&arch=default&format=html


       -c, --checksum
              This changes the way rsync checks if the files have been changed
              and are in need of a transfer.  Without this option, rsync  uses
              a "quick check" that (by default) checks if each file's size and
              time of last modification match between the sender and receiver.
              This  option changes this to compare a 128-bit checksum for each
              file that has a matching size.  Generating the  checksums  means
              that  both  sides  will expend a lot of disk I/O reading all the
              data in the files in the transfer (and  this  is  prior  to  any
              reading  that  will  be done to transfer changed files), so this
              can slow things down significantly.

              The sending side generates its checksums while it is  doing  the
              file-system  scan  that  builds the list of the available files.
              The receiver generates its checksums when  it  is  scanning  for
              changed files, and will checksum any file that has the same size
              as the corresponding sender's file:  files with either a changed
              size or a changed checksum are selected for transfer.

              Note  that  rsync always verifies that each transferred file was
              correctly reconstructed on the  receiving  side  by  checking  a
              whole-file  checksum  that  is  generated  as the file is trans-
              ferred, but that automatic after-the-transfer  verification  has
              nothing  to do with this option's before-the-transfer "Does this
              file need to be updated?" check.

              For protocol 30 and  beyond  (first  supported  in  3.0.0),  the
              checksum used is MD5.  For older protocols, the checksum used is
              MD4.


So running rsync without the -c switch on the first pass, and with -c on a second pass, should be quite sufficient to make sure the data has not changed after being transferred. (Unless, of course, the underlying filesystem layers lie to the application about this, or the MD5 implementation in rsync is wrong :-)
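
If you would rather see the comparison spelled out than rely on rsync, the idea behind the -c check can be sketched in a few lines of Python. This is only an illustration, not how rsync is implemented: it assumes both trees are reachable as local paths on one machine (for example through a network mount), and the names file_md5 and compare_trees are made up for this example. Like -c, it reads every byte on both sides, so expect it to be slow on 3 TB.

```python
import hashlib
import os


def file_md5(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(source_root, dest_root):
    """Return sorted relative paths of files missing or different under dest_root."""
    mismatches = []
    for dirpath, _dirnames, filenames in os.walk(source_root):
        for name in filenames:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, source_root)
            dst = os.path.join(dest_root, rel)
            if not os.path.isfile(dst):
                mismatches.append(rel)  # never arrived
            elif os.path.getsize(src) != os.path.getsize(dst):
                mismatches.append(rel)  # cheap size check first
            elif file_md5(src) != file_md5(dst):
                mismatches.append(rel)  # same size, different content
    return sorted(mismatches)
```

An empty list from compare_trees("/old/data", "/tank/data") (both paths are placeholders) would mean every source file exists on the destination with the same size and MD5 digest. Files that exist only on the destination are not reported.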

Rsync also makes it possible to transfer the data in several runs, at the times most convenient for you (or your network).
It also supports a switch for limiting bandwidth usage (--bwlimit).
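
As a rough sketch of how the passes could be scripted, here is a small Python wrapper that builds the rsync command lines. The function names, the paths and the 5000 limit are placeholders I made up; the rsync switches themselves (-a, -c, -n, -i, --partial, --bwlimit) are documented in the man page. Please check them against your installed rsync version before relying on this.

```python
import subprocess


def rsync_command(source, dest, bwlimit_kbps=None, checksum=False, dry_run=False):
    """Build an rsync argument list; nothing is executed here."""
    cmd = ["rsync", "-a", "--partial"]  # archive mode, keep partially sent files
    if bwlimit_kbps is not None:
        cmd.append(f"--bwlimit={bwlimit_kbps}")  # cap bandwidth (kilobytes/s)
    if checksum:
        cmd.append("-c")  # compare whole-file checksums instead of size/mtime
    if dry_run:
        cmd.extend(["-n", "-i"])  # only list what would change, itemized
    cmd.extend([source, dest])
    return cmd


def run_rsync(*args, **kwargs):
    """Run the command built by rsync_command and raise if rsync fails."""
    return subprocess.run(rsync_command(*args, **kwargs), check=True)


# Hypothetical usage (paths and host are placeholders):
#   run_rsync("/data/", "server:/tank/data/", bwlimit_kbps=5000)            # copy
#   run_rsync("/data/", "server:/tank/data/", checksum=True, dry_run=True)  # verify
```

The first call can be repeated as often as needed until the copy is complete. In the second call, any file that rsync lists is one whose checksum differs between the two sides.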

Have a nice day,
 Kai.


More information about the freebsd-fs mailing list