how can i be certain that a file has copied exactly?

Gary Kline kline at thought.org
Sat Dec 27 09:40:22 UTC 2008


On Sat, Dec 27, 2008 at 04:51:18AM +0200, Giorgos Keramidas wrote:
> On Fri, 26 Dec 2008 17:56:34 -0800, Gary Kline <kline at thought.org> wrote:
> > On Sat, Dec 27, 2008 at 03:29:05AM +0200, Giorgos Keramidas wrote:
> >> On Fri, 26 Dec 2008 17:13:39 -0800, Gary Kline <kline at thought.org> wrote:
> >> > is there a way i can be sure that my little C program has copied a
> >> > dos/win file named, say, foo.htm\;7 to simply foo.htm?
> >> >
> >> > my program uses fopen/fgets/fputs to copy the markup files.  of the
> >> > several i have copied, no problem.  unless i hack cmp or diff, i have
> >> > to avoid the shell.
> >> >
> >> > any ideas? in other words, does anybody have a prefab cmp(oldfile,
> >> > newfile) fn?
> >>
> >> You don't need a prefab `cmp' function, because the base system already
> >> includes tools that can help:
> >>
> >>             cmp file1 file2 ; echo $?
> >>             md5 file1 file2
> >>             sha1 file1 file2
> >>             sha256 file1 file2
> >
> > the problem is that there are several thousands of these files with
> > dos names and an embedded '\;'7 in the file names.  the shell gets in
> > the way.  i have tried
> >
> > sprintf(cmdbuf, "/usr/bin/cmp %s %s", orig, new);
> > system(cmdbuf);
> >
> > chokes on the embedded bytes.
> >
> > i'm thinking of using
> >
> > find . -name "*" -print -exec {} \;
> >
> > and let my program select out the file suffix.  i unlink the screwy
> > dos-ish filename.  that's why i want to be sure the copied/renamed
> > files are right.
> 
> Use quoting (and snprintf() because it supports range-checks for the
> buffer you are passing to it):
> 
>     snprintf(cmdbuf, sizeof(cmdbuf), "cmp \"%s\" \"%s\"", orig, new);
> 


	howdy,

	in a word, YES, /usr/bin/cmp saved the day before i unlinked the
	old file.  here is the strangeness.  maybe you know, giorgos, or
	somebody else on-list.  at first, before i got smart and used your
	snprintf() to simply run /bin/cp and then unlink (or /bin/mv, or
	simply rename()), i was creating the new file via fgets/fputs,
	and everything went fine until i ran out of buffer space.  i
	increased buf[4096] to buf[65535].  more files were successfully
	copied, from dos\;5 to .dos/*.htm, actually.  then suddenly cmp
	caught a mismatch and the program exited.  a careful diff showed
	the error at something like line 3751.  my copy was missing a byte
	near the EOF:

	</body></html

	minus the closing ">"

	so i upped the buffer space to 256000; same thing.  is there a
	limit on the size of arrays, or is it [more likely] sloppy
	hacking?  the size of the last file that wouldn't copy is 202K.

	just wondering.
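
	for reference, a minimal sketch of how the copy and the check
	could both be done in-process, without the shell and without
	sizing the buffer to the file.  it reads and writes fixed-size
	chunks with fread/fwrite in binary mode, so the buffer never has
	to hold a whole line or a whole file.  fgets/fputs treat the data
	as NUL-terminated text, so an embedded NUL byte would get lost;
	whether that is what bit the 202K file is only a guess.  the names
	copy_file and files_equal are made up for this example.

```c
#include <stdio.h>
#include <string.h>

/* Copy src to dst in fixed-size chunks.
 * Returns 0 on success, -1 on any open/read/write error. */
int copy_file(const char *src, const char *dst)
{
    char buf[4096];
    size_t n;
    int rc = 0;
    FILE *in = fopen(src, "rb");
    if (in == NULL)
        return -1;
    FILE *out = fopen(dst, "wb");
    if (out == NULL) {
        fclose(in);
        return -1;
    }
    while ((n = fread(buf, 1, sizeof(buf), in)) > 0) {
        if (fwrite(buf, 1, n, out) != n) {
            rc = -1;
            break;
        }
    }
    if (ferror(in))
        rc = -1;
    fclose(in);
    if (fclose(out) != 0)   /* buffered write errors can surface here */
        rc = -1;
    return rc;
}

/* Byte-for-byte comparison of two regular files.
 * Returns 1 if identical, 0 if they differ, -1 on I/O error. */
int files_equal(const char *a, const char *b)
{
    unsigned char ba[4096], bb[4096];
    size_t na, nb;
    int result = 1;
    FILE *fa = fopen(a, "rb");
    if (fa == NULL)
        return -1;
    FILE *fb = fopen(b, "rb");
    if (fb == NULL) {
        fclose(fa);
        return -1;
    }
    do {
        na = fread(ba, 1, sizeof(ba), fa);
        nb = fread(bb, 1, sizeof(bb), fb);
        if (na != nb || memcmp(ba, bb, na) != 0) {
            result = 0;     /* different length or different bytes */
            break;
        }
    } while (na > 0);
    if (ferror(fa) || ferror(fb))
        result = -1;
    fclose(fa);
    fclose(fb);
    return result;
}
```

	the idea would be to call copy_file(), then unlink the original
	only when files_equal() returns 1.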

	as i said, using snprintf() with quotes works, so i can do the
	same with the jpeg and gif files.  just cp or mv them to a
	cleaner, more rational unix-esque [[ :-) ]] name.
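
	a rough sketch of the renaming step done with rename() directly,
	so no shell quoting is involved at all.  it assumes the dos-ish
	names end in ";N" (N being digits), possibly with a backslash in
	front of the semicolon, and that the semicolon sits in the last
	path component; both are assumptions about the file names, and
	clean_name and move_to_clean_name are hypothetical helper names.

```c
#include <stdio.h>
#include <string.h>

/* Write name without its trailing ";N" version suffix into out.
 * A backslash directly before the ';' is dropped as well.
 * Returns 0 on success, -1 if there is no such suffix, nothing would
 * be left of the name, or out is too small. */
int clean_name(const char *name, char *out, size_t outsz)
{
    const char *semi = strrchr(name, ';');
    size_t len, digits;

    if (semi == NULL)
        return -1;
    /* everything after the ';' must be one or more decimal digits */
    digits = strspn(semi + 1, "0123456789");
    if (digits == 0 || semi[1 + digits] != '\0')
        return -1;
    len = (size_t)(semi - name);
    if (len > 0 && name[len - 1] == '\\')
        len--;
    if (len == 0 || len >= outsz)
        return -1;
    memcpy(out, name, len);
    out[len] = '\0';
    return 0;
}

/* Rename path to its cleaned-up name.
 * Returns 0 on success, -1 on failure. */
int move_to_clean_name(const char *path)
{
    char newpath[1024];

    if (clean_name(path, newpath, sizeof(newpath)) != 0)
        return -1;
    return rename(path, newpath) == 0 ? 0 : -1;
}
```

	rename() does not copy any data when both names are on the same
	filesystem, so there is nothing to verify afterwards.  it will
	replace an existing file of the same name, so two versions such
	as foo.htm;5 and foo.htm;7 would need handling before calling it.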

	gary



-- 
 Gary Kline  kline at thought.org  http://www.thought.org  Public Service Unix
        http://jottings.thought.org   http://transfinite.thought.org
    The 2.17a release of Jottings: http://jottings.thought.org/index.php


