FreeBSD ports USE_XZ critical issue on low-RAM computers
Lasse Collin
lasse.collin at tukaani.org
Sun Jun 20 21:04:19 UTC 2010
On 2010-06-20 Ion-Mihai Tetcu wrote:
> Personally I'd suggest keeping the option to limit the memory, but as
> an option, not as default.
OK.
> One thing I would really love to see going away is the default to
> delete the archive on decompression.
Being somewhat compatible with gzip and bzip2 command line syntax is
useful, so even though I don't disagree with you, the default is and
will be to delete the input file.
> Generally, I think programs should support both, the latter overriding
> the first: .conf -> env -> command line
It means that I will need to create a config file on all my computers
that have 512 MiB RAM or less to get the behavior I want. Probably other
users with older computers have to do that too to avoid insanely slow
compression and unresponsive system when some script runs "xz -9". While
I would prefer no need for a config file, people like me seem to be in a
minority, and creating a config file isn't that big a deal.
Using a second environment variable would be quite similar. Only the
place where the setting is put would differ. A config file could allow
more flexibility though, e.g. it could be possible to even override the
preset levels with user-defined custom values (at his or her own risk,
of course).
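
The ".conf -> env -> command line" layering discussed above could be sketched roughly as follows. This is only an illustration: xz has no such config file, and the file format, the setting name "memlimit" and the variable name XZ_EXAMPLE_MEMLIMIT are all made up for the example.

```python
# Hypothetical default; "memlimit" is an illustrative setting name only.
DEFAULTS = {"memlimit": "none"}


def parse_conf(text):
    """Parse 'key = value' lines, ignoring blank lines and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip()
    return settings


def resolve(conf_text, environ, cli_args):
    """Merge settings so that later sources override earlier ones:
    defaults -> config file -> environment -> command line."""
    merged = dict(DEFAULTS)
    merged.update(parse_conf(conf_text))
    # XZ_EXAMPLE_MEMLIMIT is a made-up variable name for this sketch.
    if "XZ_EXAMPLE_MEMLIMIT" in environ:
        merged["memlimit"] = environ["XZ_EXAMPLE_MEMLIMIT"]
    merged.update(cli_args)
    return merged
```

In real use, `environ` would be `os.environ` and `cli_args` would hold only the options actually given on the command line, so that an unset option does not mask the config file.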
> At the moment, what are the plans and the advantages of multithreading
> (both on compression and decompression)?
The "only" advantage is that threading makes things faster when there
are multiple CPU cores to use. Disadvantages of threading:
  - Compression ratio might be worse. It depends on how the
    threading is done. Different ways have their own pros and cons.
  - Memory usage may be a lot higher for both compression and
    decompression.
The plan is to get some type of threaded compression support into
liblzma after the 5.0.0 release. Considering my free time etc. I don't
promise any kind of development schedule.
The API will be done so that applications won't need to think about the
details of threading too much, and can use the zlib-style loop like they
do in single-threaded mode.
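
For readers unfamiliar with the zlib-style loop: the caller feeds input in chunks, writes out whatever output is ready, and flushes at the end. The sketch below shows the single-threaded pattern using Python's standard lzma module (a binding to liblzma) rather than the C API itself; the function name and chunk size are arbitrary choices for the example.

```python
import io
import lzma


def compress_stream(src, dst, chunk_size=64 * 1024, preset=6):
    """Copy src to dst as .xz, feeding the encoder one chunk at a time."""
    compressor = lzma.LZMACompressor(format=lzma.FORMAT_XZ, preset=preset)
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        # compress() may return an empty bytes object while the
        # encoder is still buffering input; writing it is harmless.
        dst.write(compressor.compress(chunk))
    # flush() finishes the stream and returns the remaining output.
    dst.write(compressor.flush())


if __name__ == "__main__":
    source = io.BytesIO(b"example data " * 1000)
    target = io.BytesIO()
    compress_stream(source, target)
    print(len(target.getvalue()), "compressed bytes")
```

The point made above is that a threaded encoder could keep this calling pattern unchanged, with the threading hidden behind the same feed-and-flush calls.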
> > Next question could be how to determine how many threads could be
> > OK for multithreaded decompression. It doesn't "fully" parallelize
> > either, and would be possible only in certain situations. There
> > too the memory usage grows quickly when threads are added. To me,
> > a memory usage limit together with a limit on number of threads
> > looks good; with no limits, the decompressor could end up reading
> > the whole file into RAM (and swap). Threaded decompression isn't
> > so important though, so I'm not even sure if I will ever implement
> > it.
>
> I'd say offer an option if you want.
Sorry, I explained this poorly. A simple "number of threads = something"
setting is not good for threaded decompression. In a generic situation
you don't know beforehand how much RAM each decompressor thread would use.
If threaded decompression is implemented, maybe the default should be
one thread just to keep things simple. But there should be an option to
use optimal number of threads so that the user doesn't need to worry
about details too much. My idea for that would be to have a user-
specified maximum number of threads and a memory usage limit. Then xz
would use up to the allowed number of threads as long as the memory
usage limit is not exceeded. Without a memory usage limit, memory usage
could grow to insane amounts if there are very many cores.
It's somewhat similar for threaded compression, except that the amount
of memory needed per thread at the given compression level is known
before the compression is started. An option to easily tell xz to use
optimal number of threads would be useful e.g. in scripts, which may be
used on different computers, and thus don't want to be bothered to
figure out how many CPU cores there are. I think a thread limit combined
with memory usage limit is reasonable here too.
For the above use, there should be default values for the thread and
memory limits, so that a config file or many command line options
wouldn't be strictly required to get some threading with the "use
optimal number of threads" setting. Number of CPU cores and some
percentage of RAM could work. Users could set better values themselves,
but defaults are still a nice starting point and may be enough for many.
Note that if I remove the current default memory usage limit from xz,
the default memory usage limit used to calculate optimal number of
threads wouldn't be used for anything else; if the limit is too low, xz
would just drop to single-threaded mode to use minimal amount of RAM.
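
The thread limit combined with a memory usage limit can be written down in a few lines. This is a sketch of the idea only, not code from xz; the function name and parameters are invented for the example, and the default thread limit is simply the number of CPU cores reported by the OS.

```python
import os


def pick_threads(mem_per_thread, mem_limit, max_threads=None):
    """Return the largest thread count that respects both limits.

    mem_per_thread and mem_limit must use the same unit (e.g. MiB).
    Falls back to a single thread when the limit is too low for more.
    """
    if mem_per_thread <= 0:
        raise ValueError("mem_per_thread must be positive")
    if max_threads is None:
        # Default assumption: one thread per CPU core.
        max_threads = os.cpu_count() or 1
    by_memory = mem_limit // mem_per_thread
    return max(1, min(max_threads, by_memory))
```

For example, with 100 MiB needed per thread, a 350 MiB limit and 8 allowed threads, this picks 3 threads; with a 50 MiB limit it drops to single-threaded mode. A default memory limit based on "some percentage of RAM" would need a platform-specific way to query the amount of RAM, which is left out here.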
> We've pondered a bit about switching our packages from .tbz to .xz or
> tar.xz. Given that a package is made once, and downloaded and
> decompressed by a lot of users a lot of times, it would probably make
> sense to go for the smallest possible size;
I had the same reasoning when I got interested in LZMA in 2004. LZMA was
also much faster to decompress than bzip2.
Slackware uses .txz suffix for .tar.xz packages, so if you prefer a
single three-letter suffix instead of .tar.xz, .txz is the way to go.
> however, if this would mean that some users won't be able to
> decompress the packages, then probably xz isn't the tool for us.
Decoder memory usage is all about the dictionary size. With 2 MiB
dictionary you can make most packages smaller with xz than with "bzip2
-9" while keeping the decoder memory usage (3 MiB) _lower_ than that of
bzip2 (man page says 3700k without using the slower --small mode).
I would recommend using 8 MiB dictionary for packages. That way 9 MiB of
memory is needed to decompress. That's what I used for packages years
ago, and it's also the default in xz ("xz -6"). A dictionary bigger than
8 MiB is not useful unless the uncompressed file is over 8 MiB. Using
"xz -6e" might reduce the size a little more with some files, but it's
not necessarily worth the extra CPU time.
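
To see how the dictionary size drives the decoder memory requirement, here is a small sketch using Python's standard lzma module, which wraps liblzma. The function names are made up for the example; the "dict_size" filter key and the memlimit argument are part of that module.

```python
import lzma

MIB = 1024 * 1024


def compress_package(data, dict_size=8 * MIB):
    """Compress with preset 6 settings but an explicit dictionary size."""
    filters = [
        {"id": lzma.FILTER_LZMA2, "preset": 6, "dict_size": dict_size},
    ]
    return lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters)


def decompress_limited(blob, memlimit):
    """Decompress, refusing to use more than memlimit bytes of memory."""
    return lzma.decompress(blob, format=lzma.FORMAT_XZ, memlimit=memlimit)


if __name__ == "__main__":
    blob = compress_package(b"package contents " * 1000)
    # A limit comfortably above the 8 MiB dictionary is enough.
    print(len(decompress_limited(blob, 16 * MIB)), "bytes restored")
    try:
        # A limit below the dictionary size makes the decoder refuse.
        decompress_limited(blob, 4 * MIB)
    except lzma.LZMAError as exc:
        print("refused:", exc)
```

The decoder refuses based on the dictionary size stored in the file, even when the input is tiny, which is why the dictionary size chosen at packaging time matters for every user who later unpacks the package.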
Compressing with "xz -6" needs about 100 MiB memory. It is much more
than with "bzip2 -9" (man page says 7600k), but should be fine on the
systems that create the packages.
Using "xz -9" for binary packages would be a bad choice. It doesn't save
that much space over "xz -6" and can seriously annoy users of older
computers. In contrast, decompressing files created with "xz -6" works
nicely on a 100 MHz Pentium with 32 MiB RAM (16 MiB should be quite OK
too). I will need to emphasize much more in the xz docs and possibly
also in "xz --help" that using -9 really isn't usually what people want.
There are also additional filters that might help. Enabling them
requires using advanced options. You can try e.g. "xz --x86 --lzma2"
when compressing data that includes a significant amount of x86-32 or
x86-64 code. That filter has a known problem that makes it perform
poorly on static libraries (and Linux kernel modules), so applying it to
all packages isn't necessarily a good idea. In the future (I don't know
when), there will be a better and easier-to-use filter, that will use
heuristics to detect when and what extra filtering should be useful.
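
The same x86 filter followed by LZMA2 can be tried from Python's standard lzma module, which exposes liblzma's filter chains. This is a minimal sketch with an invented function name; whether the filter actually helps depends on the data, so compare sizes with and without it.

```python
import lzma


def compress_x86(data, preset=6):
    """Compress with the x86 BCJ filter in front of LZMA2."""
    filters = [
        {"id": lzma.FILTER_X86},
        {"id": lzma.FILTER_LZMA2, "preset": preset},
    ]
    return lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters)


def compress_plain(data, preset=6):
    """Compress with LZMA2 only, for comparison."""
    return lzma.compress(data, format=lzma.FORMAT_XZ, preset=preset)
```

The filter chain is recorded in the .xz file, so `lzma.decompress()` (or plain `xz -d`) needs no extra options to reverse it.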
> Speaking of sizes, do you have any statistical data regarding: source
> size, compression options, compression speed and decompression speed
> (and memory usage, since we're talking about it)?
No. It's good to note here that I haven't so far worked much on the
actual compression algorithms. The critical parts are directly derived
from Igor Pavlov's LZMA SDK (the code may look very different at first
sight, but don't let that mislead you).
As I mentioned in an earlier email, I will tweak the compression
settings mapped to the compression levels before the 5.0.0 release. To
do that I will need to collect some data from many different compression
settings. It probably won't be high quality data, since I have limited
time for experiments and I just need some rough guidelines to tweak the
options.
Here are a few known things:
  - Decompression speed is roughly constant x bytes per second of
    _compressed_ data on the same machine. The better the
    compression has been, the faster the decompression tends to be.
    However, if the data doesn't fit in RAM and the system needs
    to swap out parts of the xz process, old floppy disks start to
    become competitive, because the memory is accessed quite
    randomly.
  - Dictionary keeps the most recently processed uncompressed
    data in a ring buffer. Using a dictionary bigger than the
    uncompressed file is useless.
  - Compressor memory usage is roughly 5-12 times the dictionary
    size. It depends on the match finder (see mf under --lzma2 on
    the man page). "xz -vv" shows the encoder memory usage. I might
    make single -v show that info in the future along with the
    decoder memory usage.
  - Decompressor memory usage is a little more than the dictionary
    size. The currently supported extra filters don't use
    a significant amount of memory.
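
The rules of thumb above translate into simple arithmetic. The sketch below is only a back-of-the-envelope estimate with invented function names; the 1 MiB decoder overhead is an assumption taken from the 2 MiB -> 3 MiB and 8 MiB -> 9 MiB figures quoted earlier, not an exact formula.

```python
MIB = 1024 * 1024


def encoder_memory_range(dict_size):
    """Rough encoder memory bounds: 5 to 12 times the dictionary size.

    Where in the range a setting falls depends on the match finder.
    """
    return 5 * dict_size, 12 * dict_size


def decoder_memory_estimate(dict_size, overhead=1 * MIB):
    """Rough decoder memory: dictionary size plus a small overhead.

    The 1 MiB default overhead is an assumption for this sketch.
    """
    return dict_size + overhead


if __name__ == "__main__":
    low, high = encoder_memory_range(8 * MIB)
    print("encoder: %d-%d MiB" % (low // MIB, high // MIB))
    print("decoder: %d MiB" % (decoder_memory_estimate(8 * MIB) // MIB))
```

For an 8 MiB dictionary this gives an encoder range of 40-96 MiB, consistent with the "about 100 MiB" mentioned for "xz -6", and a decoder estimate of 9 MiB.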
--
Lasse Collin | IRC: Larhzu @ IRCnet & Freenode