FreeBSD ports USE_XZ critical issue on low-RAM computers

Sun Jun 20 18:36:58 UTC 2010

(Just to clarify, this is my personal opinion, with a particular focus on
getting ports, i.e. automated source code builds for FreeBSD, going. I
am not speaking on behalf of FreeBSD.)

On 20.06.2010 17:23, Lasse Collin wrote:

> I have only one computer with over 512 MiB RAM (this has 8 GiB). Thus 
> "xz -9" is usable only on one of my computers. I cannot go and fix all 
> scripts so that they first check how much RAM I have and then pick a 
> reasonable compression level. It doesn't look so good to make "xz -9" so 
> low either that it would be usable on all systems with e.g. 256 MiB RAM 
> or more (you can have higher settings than the current "xz -9", they 
> just aren't so useful usually, even -9 is not always so useful compared 
> to a bit lower settings).

The xz manpage I've been looking at suggests that -7 would be a safe
choice here, using ~200 MB of RAM. --best might be an alias for it,
or for -e7, and "--best" might be a safer choice in scripts anyway.

> What do you think is the best solution to the above problem without 
> putting a default memory usage limit in xz? Setting something in XZ_OPT 
> might work in many cases, but sometimes scripts set it themselves e.g. 
> to pass compression settings to some other script calling xz. Maybe xz 
> should support a config file? Or maybe another environment variable, 
> which one could assume that scripts won't touch? These are honest 
> questions and answering them would help much more than long descriptions 
> of how the current method is bad.

I know that GNOME and Opera (browser) have split configuration files,
and a four-layer approach would be conceivable:

- system-level hard configurations (can't be changed by users)
- system-level default configurations (can be changed by users)
- user-specific default configurations
- temporary configurations for just one run.

Simplifying and adapting this a bit (xz isn't a graphical tool, or
something that the sysadmin needs to look after), you might use an
XZ_OPT_OVERRIDES variable that is parsed after the command line. I could
set it to -M40% or something similar to override -9 options in scripts.
Example use:

$ export XZ_OPT=-9
$ export XZ_OPT_OVERRIDES=-M40%
$ xz -Mmax blah.tar

would result in the same behaviour as:

$ xz -9 -M40% blah.tar
# here, the XZ_OPT_OVERRIDES cancels -Mmax from command line

and could mean: xz trying -9, but lowering that as necessary to meet the
-M40% limit.

It seems logical, and fits in with the usual rule that later command
line options override earlier ones.
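
To make the proposed parse order concrete, here is a rough sketch in
plain sh. It is only an illustration: XZ_OPT_OVERRIDES is the variable
suggested in this mail, not something xz implements, and
effective_memlimit is a hypothetical helper that merely reports which
memory limit would win under "last option wins". It only understands
the attached forms -MLIMIT and --memory=LIMIT.

```shell
#!/bin/sh
# Sketch only: XZ_OPT_OVERRIDES is the variable proposed in this mail and
# effective_memlimit is a hypothetical helper, not part of xz.
effective_memlimit() {
    limit=default
    # Proposed parse order: XZ_OPT first, then the command line, then
    # XZ_OPT_OVERRIDES. The two variables are deliberately left unquoted
    # so that they are split into separate words.
    for opt in $XZ_OPT "$@" $XZ_OPT_OVERRIDES; do
        case $opt in
            -M?*)       limit=${opt#-M} ;;        # -M40%, -Mmax, ...
            --memory=*) limit=${opt#--memory=} ;; # long form
        esac
    done
    echo "$limit"
}

XZ_OPT=-9
XZ_OPT_OVERRIDES=-M40%
# The user-reserved override is parsed last, so it beats -Mmax.
effective_memlimit -Mmax blah.tar
```

With the settings above this reports 40%; with XZ_OPT_OVERRIDES unset,
the -Mmax from the command line would win instead.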

I'd probably be loath to read a configuration file - which would
effectively just push the same question a bit further out, and
complicate matters.

An environment variable with a big banner "don't use XZ_OPT_OVERRIDES in
scripts, it is reserved for the user" might work. Then everybody can
complain to the script author if it touches XZ_OPT_OVERRIDES.

>> Multithreading in xz is worth discussion if the tasks can be
>> parallelized, which is apparently not the case.  You would be
>> duplicating effort, because we have tools to run several xz on
>> distinct files at the same time, for instance BSD portable make or
>> GNU make with a "-j" option.
> 
> That's a nice way to avoid answering the question. xargs works too when 
> you have multiple small files (there's even an example on recent man 
> page of xz). Please explain how any of these help with a multigigabyte 
> file. That's where people want xz to use threads. There is more than one 
> way to parallelize the compression, and some of them increase encoder 
> memory usage quite a lot.

Actually it was a covert way to state I am clueless about the LZMA
algorithm and how parallelizable it is. Surely you need to buffer an
input stream if the input isn't seekable, but beyond that I never
mustered sufficient interest to read up on it.

> Sure, it cannot "fully" parallelize, whatever that means. But the amount 
> of parallelization that is possible is welcomed by many others (you are 
> the very first person to think it's useless). For example, 7-Zip can use 
> any number of threads with .xz files and there are some liblzma-based 
> experimental tools too.

Fully parallelizable means negligible overhead on the algorithmic side,
i.e. near 100% speedup with each new processor added (considering
Amdahl's law and later refinements).

If compressing positions 20-40 MB in a file depends on the outcome of
compressing positions 0-20 MB, the task is not parallelizable at all.

If two threads manage 140% of the throughput of one, it's not "fully"
parallelizable.
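
For what the 140% figure would imply under plain Amdahl's law (speedup
= 1 / ((1 - p) + p/n), with p the parallel fraction and n the number of
threads), here is a small sketch. The function names are made up for
illustration, and the 140% is just the example number from above, not a
measurement of xz.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical and use awk for the
# floating-point arithmetic.

# parallel_fraction S N: parallel fraction p implied by a measured
# speedup S on N threads, from solving Amdahl's law for p:
#   p = (1 - 1/S) / (1 - 1/N)
parallel_fraction() {
    awk -v s="$1" -v n="$2" \
        'BEGIN { printf "%.3f\n", (1 - 1 / s) / (1 - 1 / n) }'
}

# amdahl_speedup P N: speedup predicted for parallel fraction P
# on N threads.
amdahl_speedup() {
    awk -v p="$1" -v n="$2" \
        'BEGIN { printf "%.2f\n", 1 / ((1 - p) + p / n) }'
}

parallel_fraction 1.4 2    # fraction implied by 140% on two threads
amdahl_speedup 0.571 2     # sanity check: back to roughly 1.4
amdahl_speedup 0.571 8     # what eight threads would buy
```

Under these assumptions only about 57% of the work runs in parallel,
and eight threads would yield roughly a 2x speedup, with a ceiling of
1 / (1 - p), about 2.3x, no matter how many threads are added.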

> Next question could be how to determine how many threads could be OK for 
> multithreaded decompression. It doesn't "fully" parallelize either, and 
> would be possible only in certain situations. There too the memory usage 
> grows quickly when threads are added. To me, a memory usage limit 
> together with a limit on number of threads looks good; with no limits, 
> the decompressor could end up reading the whole file into RAM (and 
> swap). Threaded decompression isn't so important though, so I'm not even 
> sure if I will ever implement it.

The easy answer for you is a "-j N" option like make's, with a default
of 1. Since threads share their address space, the --memory option can
easily be interpreted either way: overall or per-thread.

I'd like to avoid this discussion though with the large audiences of
ports@ and portmgr@ involved. I think for adoption in infrastructure, we
need consistency across all computers before all else.

> The dictionary size is only one thing to get high compression. It 
> depends on the file. Some files benefit a lot when dictionary size 
> increases while others benefit mostly from spending more CPU cycles. 
> That's why there is the --extreme option. It allows improving the 
> compression ratio by spending more time without requiring so much RAM.

The manpage states "factor of two", which barely qualifies as "extreme"
in my eyes. "extreme" would be an order of magnitude (10x).

> The existence of --extreme (-e) naturally makes things slightly more 
> complicated for a user than using only a linear single-digit scale for 
> compression levels, but makes it easier to specify what is wanted 
> without requiring the user to read about the advanced options. Note that 
> I plan to revise what settings exactly are bound to different 
> compression levels before the 5.0.0 release.

My main concern remains with consistency, predictability (don't
downgrade sort-of behind my back) and reliability (don't refuse critical
operations out of misunderstood politeness).  I think we can all expect
that -c and -d won't exchange their meaning :)

Best regards
Matthias