copying millions of small files and millions of dirs

krad kraduk at gmail.com
Tue Aug 20 07:32:50 UTC 2013


When I migrated a large mail spool in maildir format from the old NFS server
to the new one at a previous job, I first generated a list of the top-level
maildirs. I then generated the rsync command, plus a few other bits and
pieces, for each maildir, so that each one was handled as a single
transaction-like unit. I then fed all of these auto-generated scripts into
xjobs and ran them in parallel. This sped up the process enormously, as
walking the tree sequentially was far too slow. This was for about 15 million
maildirs in a hashed structure, by the way, so a fair number of files.


e.g.

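mkdir -p /tmp/scripts
n=0
# -mindepth 4 keeps the parent directories out of the list, so each job
# copies exactly one maildir (assumes the maildirs sit at depth 4)
find /maildir -mindepth 4 -maxdepth 4 -type d | while IFS= read -r d
do
    n=$((n + 1))    # a counter; $RANDOM*$RANDOM can produce duplicate names
    echo "mkdir -p \"/newpath/$d\" && rsync -a \"$d/\" \"/newpath/$d/\"" > "/tmp/scripts/$n"
    echo "some other stuff" >> "/tmp/scripts/$n"    # placeholder for the extra steps
done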

# xjobs runs each input line as a command, up to 20 at a time
ls /tmp/scripts/ | while read f
do
    echo sh /tmp/scripts/$f
done | xjobs -j 20
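
For anyone who wants to try the idea before pointing it at a real spool, here
is a rough, self-contained sketch of the same pattern on a throwaway tree. It
is not what I ran at the time. The demo directory names, the depth of 3, and
the done.log file are all made up for illustration; xargs -P stands in for
xjobs, and cp -Rp stands in for rsync when rsync is not installed.

```shell
#!/bin/sh
# Sketch only: everything lives under a temporary directory.
set -eu

work=$(mktemp -d)
src="$work/maildir"; dst="$work/newpath"; jobs="$work/scripts"
mkdir -p "$dst" "$jobs"

# Build a tiny hashed tree with three hypothetical maildirs.
for u in a/ab/user1 a/ac/user2 b/bc/user3; do
    mkdir -p "$src/$u/cur" "$src/$u/new"
    echo "hello" > "$src/$u/cur/msg1"
done

# Use rsync when present, otherwise fall back to cp (an assumption for the demo).
if command -v rsync >/dev/null 2>&1; then copy='rsync -a'; else copy='cp -Rp'; fi

# Write one job script per maildir (depth 3 in this demo tree).
# Each script creates the target, copies, then records completion, so a
# maildir only shows up in done.log if its copy succeeded.
n=0
( cd "$src" && find . -mindepth 3 -maxdepth 3 -type d ) | while IFS= read -r d; do
    n=$((n + 1))
    rel=${d#./}
    cat > "$jobs/job.$n" <<EOF
set -e
mkdir -p "$dst/$rel"
$copy "$src/$rel/." "$dst/$rel/"
echo "$rel" >> "$work/done.log"
EOF
done

# Run the job scripts four at a time.
find "$jobs" -type f | xargs -P 4 -n 1 sh

copied=$(find "$dst" -name msg1 | wc -l | tr -d ' ')
echo "copied $copied messages, log has $(wc -l < "$work/done.log" | tr -d ' ') entries"
```

Comparing done.log against the original list afterwards shows which maildirs
still need another pass, which is handy when a few jobs fail partway through.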

On 19 August 2013 18:52, aurfalien <aurfalien at gmail.com> wrote:

>
> On Aug 19, 2013, at 10:41 AM, Mark Felder wrote:
>
> > On Fri, Aug 16, 2013, at 1:46, Nicolas KOWALSKI wrote:
> >> On Thu, Aug 15, 2013 at 11:13:25AM -0700, aurfalien wrote:
> >>> Is there a faster way to copy files over NFS?
> >>
> >> I would use find+cpio. This handles hard links, permissions, and in case
> >> of later runs, will not copy files if they already exist on the
> >> destination.
> >>
> >> # cd /source/dir
> >> # find . | cpio -pvdm /destination/dir
> >>
> >
> > I always found sysutils/cpdup to be faster than rsync.
>
> Ah, bookmarking this one.
>
> Many thanks.
>
> - aurf
> _______________________________________________
> freebsd-questions at freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-questions
> To unsubscribe, send any mail to "
> freebsd-questions-unsubscribe at freebsd.org"
>

