Splitting up sets of files for archiving (e.g. to tape, optical media)

Ganael Laplanche ganael.laplanche at martymac.org
Mon Jan 22 11:18:58 UTC 2018

On Sunday 21 January 2018 13:15:57 Ronald F. Guilmette wrote:

Hi Ronald,

> The problem, as I personally have approached it, is indeed a bin packing
> problem, but it is not clear (from the README) that this fpart tool even
> attempts to do bin packing, i.e. in order to minimize the number of output
> "partitions".  (That's an important feature if you are trying to minimize
> the number of archive volumes.)

Yes, fpart uses a bin packing algorithm to try to minimize space loss and
the number of partitions produced. The README is not clear about that, so
I'll add a few words there. Thanks for your suggestion!
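To illustrate what bin packing means here, below is a minimal sketch of the
classic first-fit decreasing heuristic in Python. It is not fpart's actual
code, and the function name, file names and sizes are all made up for the
example.

```python
def pack_first_fit_decreasing(sizes, capacity):
    """Group files into bins holding at most `capacity` bytes each.

    `sizes` maps a file name to its size in bytes. This is the textbook
    first-fit decreasing heuristic, used here for illustration only.
    """
    bins = []  # each bin is {"free": remaining bytes, "files": [names]}
    # Place the largest files first; each goes into the first bin with room.
    for name, size in sorted(sizes.items(), key=lambda kv: kv[1], reverse=True):
        if size > capacity:
            raise ValueError(f"{name} ({size} bytes) exceeds the bin capacity")
        for b in bins:
            if b["free"] >= size:
                b["files"].append(name)
                b["free"] -= size
                break
        else:
            # No existing bin has enough room: open a new one.
            bins.append({"free": capacity - size, "files": [name]})
    return bins


# Hypothetical sizes, in arbitrary units.
files = {"a.iso": 700, "b.tar": 500, "c.img": 300, "d.zip": 300, "e.bin": 200}
for index, b in enumerate(pack_first_fit_decreasing(files, 1000)):
    print(f"{index:02d}", b["files"], "free:", b["free"])
```

The heuristic does not guarantee the minimum number of bins, but it is simple
and usually gets close.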

> Also, whereas this (fpart) tool is clearly oriented towards creating
> output "partitions" which will then be the basis for subsequent rsync
> operations, in my case each group of files comprising one single output
> blob gets hard-linked into a newly created directory representing that
> specific output blob (where these new directories have rather unimaginative
> names like "00", "01", "02"...)

Fpart writes its partitions (file lists) either to stdout or to plain text
files. You will have to write a small script to re-use those file lists, in
your case to create your hard links.
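Such a script could look like the Python sketch below. It assumes each file
list is a plain text file with one path per line (so file names containing
newlines are not handled) and that the output directories live on the same
filesystem as the source files, which hard links require. The function names
and the `source_root` parameter are hypothetical.

```python
import os


def link_partition(list_path, dest_dir, source_root):
    """Hard-link every file named in `list_path` into `dest_dir`.

    Paths are recreated relative to `source_root` so that files with the
    same base name in different directories do not collide.
    """
    linked = 0
    with open(list_path, encoding="utf-8", errors="surrogateescape") as fh:
        for line in fh:
            src = line.rstrip("\n")
            if not src:
                continue  # skip empty lines
            rel = os.path.relpath(src, source_root)
            dst = os.path.join(dest_dir, rel)
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            os.link(src, dst)  # fails if src and dst are on different filesystems
            linked += 1
    return linked


def link_all(list_paths, out_root, source_root):
    """Create out_root/00, out_root/01, ... with one directory per file list."""
    for index, list_path in enumerate(list_paths):
        dest_dir = os.path.join(out_root, f"{index:02d}")
        link_partition(list_path, dest_dir, source_root)
```

The lists are processed in the order given, so the caller decides which list
ends up in "00", "01" and so on.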

As I told you in private, that's what fpsync, provided with fpart as an
example, does: it passes fpart's file lists to rsync to parallelize and
speed up file transfers over a cluster of machines.

> Where things really get tricky is where one tries to really and truly
> maximize the usage, down to the very last sector, of the output (archive)
> volumes.  To do this perfectly, one would have to have deep knowledge of
> -exactly- how ImgBurn formats a set of input files and turns them into
> either a UDF or an ISO 9660 image.  I don't have such knowledge, at present,
> so I am forced to add small fudge factors in my calculations so that each
> of my output blobs will in fact fit onto a single blank BD-R, at least as
> far as ImgBurn is concerned.  Even so, and even using a simple-minded bin
> packing algorithm, I am generally able to fill up output volumes to 99.9%
> or better.

Fpart is not aware of the target filesystem (the one used to archive your
data), so it will not take into account the overhead used to store metadata
on that specific filesystem. However, fpart provides two options that may be
useful to you: overloading and rounding partitions' size.
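As a rough illustration of how overloading and rounding can approximate
filesystem overhead, here is a small Python sketch of the arithmetic. It does
not reproduce fpart's implementation or its option syntax. The 2048-byte
block and the per-file overhead are assumed values that you would have to
tune for your target image format, and the order of the two steps (overload
first, then round) is my own choice for the example.

```python
def padded_size(size, block=2048, per_file_overhead=0):
    """Estimate the space a file takes on the archive volume.

    A fixed `per_file_overhead` is added first ("overloading"), then the
    total is rounded up to the next multiple of `block` ("rounding").
    Both parameters are illustrative assumptions.
    """
    total = size + per_file_overhead
    return -(-total // block) * block  # ceiling division, then scale back up


# Hypothetical file sizes in bytes.
for size in (1, 2048, 2049):
    print(size, "->", padded_size(size))

print(padded_size(2000, per_file_overhead=100))
```

Feeding such padded sizes to the packing step, instead of the raw sizes,
leaves room for per-file overhead and may reduce the fudge factor needed for
each volume.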

Best regards,

Ganael LAPLANCHE <ganael.laplanche at martymac.org>
http://www.martymac.org | http://contribs.martymac.org
FreeBSD: martymac <martymac at FreeBSD.org>, http://www.FreeBSD.org
