Splitting up sets of files for archiving (e.g. to tape, optical media)

Ronald F. Guilmette rfg at tristatelogic.com
Sun Jan 21 21:16:04 UTC 2018


In message <2456E55A-D14F-41ED-B8DD-9633BD73ACF1 at gromit.dlib.vt.edu>, 
Paul Mather <freebsd-lists at gromit.dlib.vt.edu> wrote:

>Have you looked at fpart (https://github.com/martymac/fpart)?  It looks
>to me like it is applicable to your problem (which sounds to me like a
>variant of the bin packing problem).  The fpart README even lists
>packing music files onto fixed-size DVD media as one of its examples,
>which sounds close to the archiving scenario you give above.  Plus, it
>claims to be developed on FreeBSD.

I had never heard of that tool (fpart) till now, but I have just now
skimmed its README file and yes, this tool appears to be quite close
to what I had in mind.

I will have to see if I can make contact with the developer of this tool,
and discuss with him what I have been doing, which is similar but different.
(Maybe he will be motivated to include some suggestions from me.)

The problem, as I personally have approached it, is indeed a bin packing
problem, but it is not clear (from the README) that this fpart tool even
attempts to do bin packing, i.e. in order to minimize the number of output
"partitions".  (That's an important feature if you are trying to minimize
the number of archive volumes.)
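
For illustration, here is a minimal Python sketch of one simple bin packing
heuristic, first-fit decreasing. It is an assumption made for the sake of
example, not the algorithm used by fpart or by the script described in this
message; the function name and the mapping of file names to sizes are
hypothetical.

```python
def pack_first_fit_decreasing(sizes, capacity):
    """Group files into as few bins as this heuristic manages.

    sizes    -- dict mapping a file name to its size in bytes
    capacity -- usable size of one output volume in bytes
    Returns a list of bins, each a list of file names.
    First-fit decreasing is a heuristic; it does not guarantee the
    minimum possible number of bins.
    """
    bins = []  # each entry is [remaining_bytes, [file names]]
    # Place the largest files first.
    for name, size in sorted(sizes.items(), key=lambda kv: kv[1], reverse=True):
        if size > capacity:
            raise ValueError(f"{name} ({size} bytes) cannot fit on any volume")
        for entry in bins:
            if entry[0] >= size:
                # First existing bin with enough room takes the file.
                entry[0] -= size
                entry[1].append(name)
                break
        else:
            # No existing bin had room, so open a new one.
            bins.append([capacity - size, [name]])
    return [names for _, names in bins]
```

Calling it with a dict of file sizes and the volume capacity returns one list
of names per output volume, in the order the volumes were opened.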

Also, whereas this (fpart) tool is clearly oriented towards creating
output "partitions" which will then be the basis for subsequent rsync
operations, in my case each group of files comprising one single output
blob gets hard-linked into a newly created directory representing that
specific output blob (where these new directories have rather unimaginative
names like "00", "01", "02"...).  Since these directories are on the same
partition as the input files to be archived, and since that whole partition
is exported to my local network (courtesy of Samba), I can then just pop
over to my Windoze system and use ImgBurn to write each output blob in
turn to a blank BD-R disk.  (I am actually doing this as we speak.)
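
The hard-linking step might be sketched in Python as follows. This is only an
illustration under stated assumptions, not the actual script: the function
name, its arguments, and the choice to preserve each file's path relative to
the source directory are all hypothetical. Hard links require the source and
destination to be on the same filesystem, as they are in the setup described
above.

```python
import os


def link_groups(source_root, groups, dest_root):
    """Hard-link each group of files into its own numbered directory.

    source_root -- directory containing the files to be archived
    groups      -- list of lists of paths relative to source_root
    dest_root   -- directory (same filesystem) to hold "00", "01", ...
    Returns the list of directories created, one per group.
    """
    created = []
    for index, group in enumerate(groups):
        blob_dir = os.path.join(dest_root, f"{index:02d}")
        os.makedirs(blob_dir, exist_ok=True)
        for rel_path in group:
            target = os.path.join(blob_dir, rel_path)
            # Recreate any intermediate directories (an assumption; the
            # original message does not say whether layout is preserved).
            os.makedirs(os.path.dirname(target), exist_ok=True)
            # os.link creates a hard link: no file data is copied.
            os.link(os.path.join(source_root, rel_path), target)
        created.append(blob_dir)
    return created
```

Because only directory entries are created, the numbered directories take
almost no extra disk space, and removing them afterwards leaves the original
files untouched.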

Where things get really tricky is when one tries to really and truly
maximize the usage, down to the very last sector, of the output (archive)
volumes.  To do this perfectly, one would have to have deep knowledge of
-exactly- how ImgBurn formats a set of input files and turns them into
either a UDF or an ISO 9660 image.  I don't have such knowledge, at present,
so I am forced to add small fudge factors in my calculations so that each
of my output blobs will in fact fit onto a single blank BD-R, at least as
far as ImgBurn is concerned.  Even so, and even using a simple-minded
bin packing algorithm, I am generally able to fill up output volumes to
99.9% or better.
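
One way such fudge factors could be expressed is sketched below in Python.
Everything here is an assumption for illustration: the 2048-byte sector
rounding, the fixed reserve for filesystem metadata, and all function and
parameter names are hypothetical, and none of it reflects how ImgBurn
actually lays out a UDF or ISO 9660 image.

```python
SECTOR = 2048  # assumed sector size in bytes


def padded_size(size, sector=SECTOR):
    """Round a file size up to a whole number of sectors."""
    return -(-size // sector) * sector


def usable_capacity(raw_capacity, reserve_bytes, sector=SECTOR):
    """Capacity left for file data after holding back a fixed reserve.

    reserve_bytes is the fudge factor: space set aside for filesystem
    structures whose exact size is not known in advance.
    """
    return (raw_capacity - reserve_bytes) // sector * sector


def fill_ratio(file_sizes, raw_capacity):
    """Fraction of the raw volume occupied by sector-padded file data."""
    return sum(padded_size(s) for s in file_sizes) / raw_capacity
```

The packing step would then be run with padded sizes and the reduced
capacity, and the fill ratio gives a rough check on how much of each volume
ends up used.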

Anyway, thank you Paul, for pointing me at fpart.  I quite definitely knew
nothing about it till now... and that's why I posted my question here in the
first place, because there's such a wealth and breadth of knowledge here.


Regards,
rfg



More information about the freebsd-questions mailing list