[Bug 224160] wc -c is slow
bugzilla-noreply at freebsd.org
Thu Dec 7 13:07:28 UTC 2017
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=224160
Bug ID: 224160
Summary: wc -c is slow
Product: Base System
Version: CURRENT
Hardware: Any
OS: Any
Status: New
Severity: Affects Only Me
Priority: ---
Component: bin
Assignee: freebsd-bugs at FreeBSD.org
Reporter: wosch at FreeBSD.org
The wc(1) command has several optimizations to run as fast as possible.
However, it is still slow in some use cases, much slower than the GNU wc
command.
I tested with the OpenStreetMap database dump planet-latest.osm.bz2
(from https://wiki.openstreetmap.org/wiki/Planet.osm),
which is a 61GB bzip'd XML file.
I checked how large the uncompressed XML is, on a 32 CPU machine:
# FreeBSD wc
$ pbzip2 -dc planet-latest.osm.bz2 | time wc -c
908171295050
4729.53 real 4400.69 user 199.34 sys
The wc(1) command was running at 100% CPU time, and pbzip2 was using only 500%
CPU time.
I ran the test again with GNU wc. The wc command was using only 20% CPU time,
and pbzip2 around 3000%.
# GNU wc
$ pbzip2 -dc planet-latest.osm.bz2 | time gwc -c
908171295050
2003.15 real 8.86 user 355.53 sys
The FreeBSD wc(1) command is using about 500 times more user time (4400 <-> 9)
than GNU wc, and somewhat less system time (199 <-> 355). The bottleneck was
not pbzip2, it was wc.
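As a rough cross-check of the figures above, here is a small Python sketch
that derives the effective pipe throughput and the user-time ratio from the
two time(1) outputs. The byte count and timings are copied from the runs
above; the dictionary layout and the helper name throughput_mb_s are only
illustrative.

```python
# Figures copied from the two runs above (real, user, sys in seconds).
TOTAL_BYTES = 908171295050
RUNS = {
    "FreeBSD wc": (4729.53, 4400.69, 199.34),
    "GNU wc": (2003.15, 8.86, 355.53),
}


def throughput_mb_s(total_bytes, real_seconds):
    """Effective throughput of the whole pipeline in MB/s (10^6 bytes)."""
    return total_bytes / real_seconds / 1e6


# Ratio of user CPU time spent by the two wc implementations.
user_ratio = RUNS["FreeBSD wc"][1] / RUNS["GNU wc"][1]

if __name__ == "__main__":
    for name, (real, user, system) in RUNS.items():
        rate = throughput_mb_s(TOTAL_BYTES, real)
        print(f"{name}: {rate:.0f} MB/s, user {user:.2f}s, sys {system:.2f}s")
    print(f"user time ratio (FreeBSD / GNU): {user_ratio:.0f}x")
```

With FreeBSD wc the whole pipeline moves well under half as many bytes per
second as with GNU wc, which matches pbzip2 being starved at 500% CPU.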
We should check why the optimization for wc -c for reading from stdin is not
working.
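For reference, counting bytes on a pipe does not need to look at the data at
all. The following Python sketch shows what I would expect the fast path to
do in principle: read large chunks and add up their lengths. It is not the
wc(1) source and not GNU wc's code; the function name count_bytes and the
1 MiB buffer size are my own assumptions for illustration.

```python
import os


def count_bytes(fd, bufsize=1 << 20):
    """Count the bytes readable from fd until EOF.

    Illustrative sketch only: the data is never inspected, so user time
    should stay small and most of the cost is the read(2) calls.
    """
    total = 0
    while True:
        chunk = os.read(fd, bufsize)
        if not chunk:  # an empty read means EOF
            return total
        total += len(chunk)


if __name__ == "__main__":
    # Count bytes arriving on stdin, e.g. at the end of a pipeline.
    print(count_bytes(0))
```

If the user time of such a loop stays near zero, as it does for GNU wc above,
then the 4400 seconds of user time suggest that FreeBSD wc is doing per-byte
work on stdin even for -c.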
--
You are receiving this mail because:
You are the assignee for the bug.