bin/53475: cp(1) copies files in reverse order to destination

Dorr H. Clark dclark at applmath.scu.edu
Tue Apr 27 19:10:23 PDT 2004


The following reply was made to PR bin/53475; it has been noted by GNATS.

From: "Dorr H. Clark" <dclark at applmath.scu.edu>
To: freebsd-gnats-submit at FreeBSD.org, louie at TransSys.COM
Cc:  
Subject: Re: bin/53475: cp(1) copies files in reverse order to destination
Date: Tue, 27 Apr 2004 19:03:14 -0700

 --- cp.c_orig   Sun Apr 25 06:32:27 2004
 +++ cp.c        Sun Apr 25 06:33:50 2004
 @@ -94,7 +94,6 @@
  enum op { FILE_TO_FILE, FILE_TO_DIR, DIR_TO_DNE };
  
  static int copy(char *[], enum op, int);
 -static int mastercmp(const FTSENT * const *, const FTSENT * const *);
  static void siginfo(int __unused);
  
  int
 @@ -274,7 +273,7 @@
         mask = ~umask(0777);
         umask(~mask);
  
 -       if ((ftsp = fts_open(argv, fts_options, mastercmp)) == NULL)
 +       if ((ftsp = fts_open(argv, fts_options, NULL)) == NULL)
                 err(1, "fts_open");
         for (badcp = rval = 0; (curr = fts_read(ftsp)) != NULL; badcp =
 0) {
                 switch (curr->fts_info) {
 @@ -478,32 +477,6 @@
         if (errno)
                 err(1, "fts_read");
         return (rval);
 -}
 -
 -/*
 - * mastercmp --
 - *     The comparison function for the copy order.  The order is to
 copy
 - *     non-directory files before directory files.  The reason for this
 - *     is because files tend to be in the same cylinder group as their
 - *     parent directory, whereas directories tend not to be.  Copying
 the
 - *     files first reduces seeking.
 - */
 -static int
 -mastercmp(const FTSENT * const *a, const FTSENT * const *b)
 -{
 -       int a_info, b_info;
 -
 -       a_info = (*a)->fts_info;
 -       if (a_info == FTS_ERR || a_info == FTS_NS || a_info == FTS_DNR)
 -               return (0);
 -       b_info = (*b)->fts_info;
 -       if (b_info == FTS_ERR || b_info == FTS_NS || b_info == FTS_DNR)
 -               return (0);
 -       if (a_info == FTS_D)
 -               return (-1);
 -       if (b_info == FTS_D)
 -               return (1);
 -       return (0);
  }
  
  static void
 
 
 As quoted above, the comments in cp.c tell us the function 
 mastercmp() is an attempt to improve performance based on 
 knowing something about physical disks.
  
 This is an old optimization strategy (it's in the original 
 version of cp.c).  AFAIK, in the updated BSD filesystem, 
 when we copy a file, we don't actually move the
 physical data block of the file but change the information in its
 inode such as the address of its data block and owner.  
 
 Deleting mastercmp() and setting the comparison paramter to NULL 
 for the function fts_open() suppresses the behavior 
 in the bug.
 
 The next question is whether deleting mastercmp eliminates
 an optimization.  Our testing shows the exact opposite,
 mastercmp is degrading performance.  We did several experiments
 with cp -R to measure elapsed time on transfers between devices
 of differing file system types (to avoid UFS2 optimizations).
 Our results show removing mastercmp yields a small performance
 gain (note: we had no SCSI devices available, and second note:
 variability in file system performance seems dominated 
 by other factors).
 
 M. K. McKusick has indicated in seminars that modern disk drives
 lie to the driver about their physical layouts.  The use of
 mastercmp in cp.c is a legacy optimization from a different
 era of disk technology.  We recommend removing this call
 from cp.c to address 53475.
 
 Ting Hui, engineer
 Dorr H. Clark, advisor
 Graduate School of Engineering
 Santa Clara University


More information about the freebsd-bugs mailing list