cksum entire dir??

Mike Jeays mike.jeays at rogers.com
Fri Oct 5 02:30:18 UTC 2012


On Fri, 05 Oct 2012 05:36:19 +0400
Австин Ким <avstin at mail.ru> wrote:

> Hi, all,
> 
> > Paul Kraus <paul at kraus-haus.org> writes:
> >
> > > On Tue, Sep 11, 2012 at 9:18 PM,  <kpneal at pobox.com> wrote:
> > >
> > >> It's a real shame Unix doesn't have a really good tool for comparing
> > >> two directory trees. You can use 'diff -r' (even on binaries), but that
> > >> fails if you have devices, named pipes, or named sockets in the
> > >> filesystem. And diff or cksum don't tell you if symlinks are different.
> > >> Plus you may care about file ownership, and that's where the stat
> > >> command comes in handy.
> > >
> > > Solaris and at least a few versions of Linux have a "dircmp" command
> > > that is in reality a wrapper for diff that handles special files. The
> > > problem with it is that it tends to be slow (I had to validate
> > > millions of files).
> >
> > It's not clear what the danger profile is supposed to be here; dircmp
> > (and recursing 'diff' applications) can handle many cases, but mtree(8)
> > (with appropriate options) covers more pathological problems. Even so,
> > analysis of changes in file nodes like named sockets will usually
> > require some understanding of the application.
> >
> > I suspect that either a recursive diff or an mtree specification is a
> > good solution for the original poster's problem, but we don't have
> > enough information to be more sure than that.
> >
> > Be well.
> >        Lowell
> 
> I happened to be restoring my home directory on my local machine and needed a way to verify that its contents were in sync with the corresponding directories on a remote server.  I first looked for an _rsync_ option that would check synchronization without actually forcing one side to match the other, but couldn't find precisely what I was looking for.  By coincidence, I then came upon this thread, where the same issue had recently come up.
> 
> Obviously there must be more rigorous, secure, and industrial-strength ways to check synchronization between corresponding directories on remote systems (apart from doing a one-way sync with _rsync_), but here's my two bits, a quick crack at a shell function to check recursively that the contents of two directories (and the filenames contained therein) have a high probability of being in sync:
> 
> ####BEGIN CUT
> 
> # s:  Function to compute recursive MD5 sum.
> s ( ) {
>   if [ -d "$1" ]
>      then DIR=$1
>      else DIR=.
>   fi
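>   # Sort with LC_ALL=C on both systems: `find -s' and `find |sort' can order
>   # files differently.  Name -md5 explicitly; openssl's default digest varies.
>   if [ "`uname`" = Linux ]
>      then find "$DIR" -type f -or -type l |LC_ALL=C sort |tr \\n \\0 |xargs -0 \
>             openssl dgst -md5 |sed s/.*\(\\\(.*\\\)\).*\ \\\(.*\\\)/\\2\ \\1/ \
>             |tee /tmp/dgst
>           openssl dgst -md5 </tmp/dgst
>      else find "$DIR" -type f -or -type l |LC_ALL=C sort |tr \\n \\0 |xargs -0 \
>             md5 |sed s/.*\(\\\(.*\\\)\).*\ \\\(.*\\\)/\\2\ \\1/ |tee /tmp/dgst
>           md5 </tmp/dgst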
>   fi
>   unset DIR
>   rm /tmp/dgst
>   return
>   }
> 
> # sq:  Function to compute recursive MD5 sum quietly.
> sq ( ) {
>   if [ -d "$1" ]
>      then DIR=$1
>      else DIR=.
>   fi
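>   # Sort with LC_ALL=C on both systems: `find -s' and `find |sort' can order
>   # files differently.  Name -md5 explicitly; openssl's default digest varies.
>   if [ "`uname`" = Linux ]
>      then find "$DIR" -type f -or -type l |LC_ALL=C sort |tr \\n \\0 |xargs -0 \
>             openssl dgst -md5 |sed s/.*\(\\\(.*\\\)\).*\ \\\(.*\\\)/\\2\ \\1/ \
>             >/tmp/dgst
>           openssl dgst -md5 </tmp/dgst
>      else find "$DIR" -type f -or -type l |LC_ALL=C sort |tr \\n \\0 |xargs -0 \
>             md5 |sed s/.*\(\\\(.*\\\)\).*\ \\\(.*\\\)/\\2\ \\1/ >/tmp/dgst
>           md5 </tmp/dgst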
>   fi
>   unset DIR
>   rm /tmp/dgst
>   return
>   }
> 
> ####END CUT
> 
> These functions simply apply the `find ... |xargs' method suggested by previous posts to output a list of MD5 digests with filenames, and then just _md5_ the resulting file.  I tried out the above in both sh(1) in FreeBSD (my local machine) as well as in ksh(1) in Linux (the remote server), though I haven't tested them extensively.  Obviously the above are not `secure,' and obviously an infinite number of variations are possible (such as, for example, also outputting file permissions and dates of last modification with ls(1) to the digest file before running _md5_ on it, to check that permissions and dates are also in sync).  Thanks to the previous posters for solving my problem!  :)
> 
> All the best,
> Austin
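
For comparison, the "digest of a list of digests" idea above can be sketched with nothing but POSIX tools, so the same lines should run unchanged on FreeBSD and Linux. This is only a sketch under a few assumptions: the function name dirsum is made up for illustration, it uses cksum (a CRC, so it catches accidental differences but not deliberate tampering) instead of md5, and it covers regular files only, not symlinks, devices, ownership or permissions.

```shell
# dirsum: print a single checksum summarising every regular file under a
# directory (default: the current one).  Hypothetical helper, not from the
# original post.
dirsum() {
    (
        # Change into the directory so the listed paths are relative and do
        # not depend on where the tree lives on each machine.
        cd "${1:-.}" || exit 1
        # cksum prints "CRC SIZE PATH" for each file.  Sorting with LC_ALL=C
        # gives the same byte-wise order on every system, and the final cksum
        # reduces the whole list to one line.
        find . -type f -exec cksum {} + | LC_ALL=C sort | cksum
    )
}
```

Running dirsum on each machine and comparing the two output lines by eye is then enough for a quick check; where stronger digests are wanted, the two cksum calls could be swapped for whatever MD5 or SHA tool both systems share.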

"rsync --dry-run" may be a simple solution that would meet your needs? You might need to add the "--delete" option.

Take another look at man rsync.
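
As a rough illustration of the dry-run approach, something like the following should list what differs without touching either side. The function name tree_diff is made up here, and the choice of -c (compare file contents by checksum rather than by size and modification time) is an assumption about what "in sync" should mean for this use.

```shell
# tree_diff: list differences between two trees without changing either one.
# Hypothetical helper; the second argument may also be a remote host:path.
tree_diff() {
    # -r  recurse into directories
    # -l  treat symlinks as symlinks
    # -c  compare file contents by checksum
    # -n  dry run: report only, change nothing
    # -i  itemize each difference on its own line
    # --delete  also report files that exist only on the receiving side
    # The trailing slashes make rsync compare the directories' contents.
    rsync -rlcni --delete "$1"/ "$2"/
}
```

Called with the local home directory and the matching remote directory as its two arguments, it prints one line per file that is missing or different, plus a "deleting" line for anything found only on the receiving side; little or no output suggests the trees already match.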


More information about the freebsd-questions mailing list