Order of files with 'cp'

Garance A Drosihn drosih at rpi.edu
Sun Nov 20 18:57:03 PST 2005


At 7:29 PM +0000 11/20/05, Brian Candler wrote:
>On Sat, Nov 19, 2005 at 11:33:54AM -0800, Tim Kientzle wrote:
>>  Brian Candler wrote:
>>  > I've noticed on FreeBSD-5.4 and -6.0 that the order in which
>>  > 'cp' copies multiple files does not match the order they're
>>  > given on the command line.
>>  ...
>>  > I've had a look through the code, and it seems that cp calls
>>  > fts_open() with the list of files in argv; fts_open then does
>>  > a qsort() on the arguments, using the comparison function
>>  > mastercmp() provided by cp:
>>
>>  My suggestion:  Have 'cp' call fts_open once for each
>>  command-line argument, instead of giving fts_open the entire
>>  argv list to muck with.
>
>Erm, but that just undoes the reason for calling fts_open with
>mastercmp in the first place, which is to get it to pick files
>before directories (or vice versa, as its behaviour seems to
>be) as an 'optimisation'.

If I understand the situation right, the suggestion would not
completely undo the optimization that 'cp' is trying to do.
Consider the command:
     cp -rp file1 dir1 file2 dir2 destdir

The suggestion would mean the files going into destdir itself
would not be sorted, but (if I understand this thread) files
copied into destdir/dir1 and destdir/dir2 would still be sorted.
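
As a rough sketch of that suggestion (this is not the actual cp
source), the loop below calls fts_open() once per command-line path,
so the top-level order is simply the order of the arguments.  The
names walk_in_argv_order, visit_fn, visit_log and record_visit are
made up for illustration.  The comparison argument is left as NULL to
keep the sketch short; cp would pass its own function there, and with
a single path per call that function could only reorder entries
underneath each argument.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fts.h>

/* Hypothetical callback: called once per entry returned by fts_read(). */
typedef void (*visit_fn)(const char *path, int fts_info, void *ctx);

/* Hypothetical helper type: remembers the order of visited paths. */
struct visit_log {
	char names[8][64];
	size_t count;
};

/* Example callback that records each visited path in a visit_log. */
void
record_visit(const char *path, int fts_info, void *ctx)
{
	struct visit_log *seen = ctx;

	(void)fts_info;
	if (seen->count < 8) {
		snprintf(seen->names[seen->count], sizeof(seen->names[0]),
		    "%s", path);
		seen->count++;
	}
}

/*
 * Walk each path with its own fts_open() call, so the top-level
 * order is exactly the order of paths[].  Returns 0 on success and
 * -1 if fts_open() fails.  Errors reported by fts_read() are not
 * distinguished from end-of-traversal in this sketch.
 */
int
walk_in_argv_order(char *const paths[], size_t npaths, visit_fn visit,
    void *ctx)
{
	size_t i;

	assert(visit != NULL);
	for (i = 0; i < npaths; i++) {
		char *one[2] = { paths[i], NULL };
		FTS *ftsp = fts_open(one, FTS_PHYSICAL | FTS_NOCHDIR, NULL);
		FTSENT *ent;

		if (ftsp == NULL)
			return -1;
		while ((ent = fts_read(ftsp)) != NULL) {
			/* Directories are returned twice; skip the post-order visit. */
			if (ent->fts_info == FTS_DP)
				continue;
			visit(ent->fts_path, ent->fts_info, ctx);
		}
		fts_close(ftsp);
	}
	return 0;
}
```

Called with the four source arguments from the command above, this
would visit file1, then everything under dir1, then file2, then
everything under dir2.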

Apparently this "sorting optimization" in `cp' goes all the way
back to the original version of `cp' from 1994.  While I expect
we should change it to something better, I don't think we have
any urgent reason to fix it immediately.  Which is to say, let's
figure out what the issues are, and come up with the best fix
instead of the "easiest change" which we can rush to implement.

*Assuming* the comment is correct, and that there *is* some
performance benefit by copying files before directories, then
it still seems to me that sorting all the files is a clumsy,
heavy-handed way to accomplish that.  These days some
people have directories with tens of thousands of entries in
them.  Do we really want the overhead of "sorting" all of those
entries just so files are copied before directories?
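
To make the "fake sort" concrete, here is a minimal model of it,
assuming a simplified entry type (struct entry is a stand-in, not the
real FTSENT, and files_first_cmp is not cp's mastercmp).  The
comparison only knows "file" versus "directory", yet qsort() still
does a full sort's worth of comparisons, and C leaves the relative
order of entries that compare equal unspecified.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a directory entry; not the real FTSENT. */
struct entry {
	const char *name;
	int is_dir;
};

/*
 * Comparison that only knows "non-directory before directory".
 * Any two entries of the same kind compare equal.
 */
static int
files_first_cmp(const void *a, const void *b)
{
	const struct entry *ea = a;
	const struct entry *eb = b;

	return (ea->is_dir != 0) - (eb->is_dir != 0);
}

/*
 * Sort so that every non-directory precedes every directory.
 * Entries of the same kind may end up in any order.
 */
void
sort_files_first(struct entry *entries, size_t n)
{
	assert(entries != NULL || n == 0);
	if (n > 1)
		qsort(entries, n, sizeof(*entries), files_first_cmp);
}
```

Given file1, dir1, file2, dir2, the two files end up ahead of the two
directories, but nothing guarantees that file1 stays ahead of file2.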

I think a better fix might be to add an option to fts_open() which
tells it to "process files before directories" (or vice versa) in
any given directory.  Then `cp' could turn on that bit, and avoid
the fake sort.

It seems to me that if fts_open knows that this ordering is wanted,
then it could implement that behavior in a way that is faster
than sorting all entries.
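
One way this could be faster, sketched under the assumption of a
simplified entry type (struct entry and files_before_dirs are
hypothetical, and no such fts option exists in the code discussed
here): a single linear pass that puts non-directories first and
directories after them, keeping each group in its original order.
That is O(n) work with O(n) scratch space, and no comparison
function is needed at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a directory entry; not the real FTSENT. */
struct entry {
	const char *name;
	int is_dir;
};

/*
 * Stable partition: non-directories first, then directories, with
 * each group keeping its original relative order.
 * Returns 0 on success, -1 if scratch memory cannot be allocated.
 */
int
files_before_dirs(struct entry *entries, size_t n)
{
	struct entry *tmp;
	size_t i, nfiles = 0, f = 0, d;

	if (n == 0)
		return 0;
	assert(entries != NULL);

	tmp = malloc(n * sizeof(*tmp));
	if (tmp == NULL)
		return -1;

	/* First pass: count files so we know where directories start. */
	for (i = 0; i < n; i++)
		if (!entries[i].is_dir)
			nfiles++;

	/* Second pass: drop each entry into its group, in order. */
	d = nfiles;
	for (i = 0; i < n; i++) {
		if (entries[i].is_dir)
			tmp[d++] = entries[i];
		else
			tmp[f++] = entries[i];
	}

	memcpy(entries, tmp, n * sizeof(*tmp));
	free(tmp);
	return 0;
}
```

Given file1, dir1, file2, dir2, this yields file1, file2, dir1, dir2,
so the order from the command line is kept within each group.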

-- 
Garance Alistair Drosehn            =   gad at gilead.netel.rpi.edu
Senior Systems Programmer           or  gad at freebsd.org
Rensselaer Polytechnic Institute    or  drosih at rpi.edu

