[PATCH] adding two new options to 'cp'

Oliver Fromme olli at lurza.secnetix.de
Thu Aug 3 12:42:42 UTC 2006


Julian Elischer wrote:
 > Oliver Fromme wrote:
 > > Bakul Shah wrote:
 > > > Peter Jeremy wrote:
 > > > > As a general comment (not addressed to Tim):  There _is_ a downside
 > > > > to sparsifying files.  If you take a sparse file and start filling
 > > > > in the holes, the net result will be very badly fragmented and hence
 > > > > have very poor sequential I/O performance.  If you're never going to
 > > > > update a file then making it sparse makes sense, if you will be
 > > > > updating it, you will get better performance by making it non-sparse.
 > > > 
 > > > Except for database tables how common is this?
 > > 
 > > For example image files of media, e.g. ISO9660 images
 > > or images of hard disk partitions.  I often have to handle
 > > such images, and I certainly do _not_ want them to be
 > > sparse.
 > 
 > well then you'd be silly to go to the extra work of specifying --sparse
 > (or whatever) wouldn't you?

Sure, in that case I wouldn't specify it (and I hope
nobody intends to make it the default, as has been
mentioned in this thread).
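
As an aside, it is easy to check whether a given image actually
ended up sparse, by comparing its apparent size with the space
allocated for it.  This is only a rough sketch: the function name
is_sparse is made up, it assumes wc, du and awk behave the usual
way, and file system compression can produce false positives.

```shell
# is_sparse FILE: succeeds if FILE occupies fewer bytes on disk than
# its apparent size, which suggests that it contains holes.
# Hypothetical helper for illustration, not a standard tool.
is_sparse() {
        local size kb
        [ -f "$1" ] || return 2                   # not a regular file
        size=$(($(wc -c < "$1")))                 # apparent size in bytes
        kb=$(du -k -- "$1" | awk '{ print $1 }')  # allocated size in KB
        [ $((kb * 1024)) -lt "$size" ]
}
```

For example: is_sparse disk.img && echo "disk.img has holes"
(disk.img is just a placeholder name).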

 > > Before someone adds a bogus "sparse file support" option
 > > to cp(1), I would prefer that someone fix the existing
 > > -R option, which currently doesn't handle hardlinks
 > > correctly.
 > 
 > It never worked as you suppose.

I know.  Probably because nobody cared to implement it,
and there are other tools (cpio) that can do that job.

 > Changing it would be a surprise
 > (though to me a pleasant one) to many.

I guess most users of cp(1) aren't aware of the flaw.

 > > That flaw is documented in the manual page, so it might
 > > not count as a "bug", but it's a flaw nevertheless.  A lot
 > > of people -- even so-called professional admins -- use
 > > "cp -Rp" to copy directory hierarchies, and afterwards
 > > they wonder why the copy takes up much more space than
 > > the original, because all hardlinks have been copied as
 > > separate files (if they notice at all).
 > 
 > I ALWAYS use find . -depth -print0|cpio -pdmuv0 {dest}
 > or -pdlmuv (poodle-move-0?) if I want links from old to new, because it
 > is guaranteed to do that but cp is not.

Yes, exactly.  I use:
find -d . | cpio -dump /dest/dir   (copy file hierarchy)
or:
find -d . | cpio -dumpl /dest/dir  (link file hierarchy)

"-dump" is pretty easy to remember.  Of course, if there
might be names with whitespace, I add the -0 option, too
(together with -print0 on the find side).
I've created aliases "hcp" (hierarchy copy) and "hln"
(hierarchy link) to save typing.  That's even shorter
than "cp -al".  ;-)

For Bourne-type shells (sh, zsh, bash), these functions work
fine:

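# hcp SRC DST: copy the file hierarchy below SRC to DST,
# keeping hardlinks intact.
hcp() {
        local dst
        # Resolve DST first, because the subshell changes directory.
        dst=$(realpath "$2") && (
                cd -- "$1" &&
                find -d . -print0 | cpio -dump0 "$dst"
        )
}

# hln SRC DST: like hcp, but hardlink the files into DST
# instead of copying them (cpio -l).
hln() {
        local dst
        dst=$(realpath "$2") && (
                cd -- "$1" &&
                find -d . -print0 | cpio -dumpl0 "$dst"
        )
}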

The hcp function _does_ copy hardlinks correctly (unlike
cp), and cpio supports the --sparse option to create
sparse files, so you can add that if you want.
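
If you want to verify that a copy really kept its hardlinks, a
quick check along the following lines should do.  It is only a
sketch: the names same_file and count_links are made up for
illustration, and it assumes that your shell's test builtin
supports -ef (FreeBSD sh, bash and zsh do) and that the link
count is the second column of "ls -l" output.

```shell
# same_file A B: succeeds if A and B refer to the same file
# (same device and inode), i.e. they are hardlinks of each other.
# Hypothetical helper for illustration.
same_file() {
        [ "$1" -ef "$2" ]
}

# count_links FILE: print the hardlink count of FILE, taken from
# the second column of "ls -ld".  Hypothetical helper as well.
count_links() {
        ls -ld -- "$1" | awk '{ print $2 }'
}
```

If a and b are hardlinks of each other in the source directory,
then after the copy "same_file /dest/dir/a /dest/dir/b" should
succeed and count_links should print 2 for both.  After a plain
"cp -Rp" I would expect the check to fail, for the reason given
above.  (The paths are placeholders, of course.)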

Best regards
   Oliver

-- 
Oliver Fromme,  secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing
Dienstleistungen mit Schwerpunkt FreeBSD: http://www.secnetix.de/bsd
Any opinions expressed in this message may be personal to the author
and may not necessarily reflect the opinions of secnetix in any way.

"C++ is over-complicated nonsense. And Bjorn Shoestrap's book
a danger to public health. I tried reading it once, I was in
recovery for months."
        -- Cliff Sarginson


More information about the freebsd-hackers mailing list