cmp(1) has a bottleneck, but where?

Dieter BSD dieterbsd at
Thu Jan 12 19:31:46 UTC 2012

> The hard \xc2\xa0 certainly deserves a :-(.

Agreed. Brain-damaged guilty-until-proven-innocent anti-spam measures
force the use of webmail for outgoing email, which among other problems
inserts garbage. Sorry.

>> A) Should the default vfs.read_max be increased?
> Maybe, but I don't buy most claims that larger block sizes are better.

I didn't say anything about block sizes. There needs to be enough
data in memory so that the CPU doesn't run out while the disk is

>> B) Can the mmap case be fixed? What is the alleged benefit of
>> using mmap anyway? All I've ever seen are problems.
> It is much faster for cases where the file is already in memory. It
> is unclear whether this case is common enough to matter. I guess it
> isn't.

Is there a reasonably efficient way to tell if a file is already
in memory or not? If not, then we have to guess.
If the file is larger than memory it cannot already be in memory.
For real-world uses, there are two files, and not all memory can be
used for buffering files. So cmp could check the file sizes and,
if they are larger than x% of main memory, assume they are not in memory.
There could be a command line argument specifying which method to
use, or providing a guess whether the files are in memory or not.
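
A minimal sketch of that size check, under stated assumptions: the
names physmem_bytes() and assume_not_cached() are hypothetical and do
not come from cmp(1), the two file sizes are summed before comparing,
and physical memory is queried with sysconf(_SC_PHYS_PAGES), which is
a common extension rather than something POSIX guarantees.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Hypothetical helper: total physical memory in bytes, or 0 if it
 * cannot be determined on this system.
 */
uint64_t
physmem_bytes(void)
{
#ifdef _SC_PHYS_PAGES
	long pages = sysconf(_SC_PHYS_PAGES);
	long pagesz = sysconf(_SC_PAGESIZE);

	if (pages > 0 && pagesz > 0)
		return (uint64_t)pages * (uint64_t)pagesz;
#endif
	return 0;
}

/*
 * Hypothetical policy: return 1 if the two files together are larger
 * than pct percent of physmem, i.e. guess that they are not already
 * in memory. With physmem == 0 (unknown) any non-empty input yields 1.
 */
int
assume_not_cached(uint64_t size_a, uint64_t size_b, uint64_t physmem,
    unsigned pct)
{
	assert(pct <= 100);
	if (size_a > UINT64_MAX - size_b)	/* sum would overflow */
		return 1;
	return size_a + size_b > physmem / 100 * pct;
}
```

The sizes would typically come from st_size after fstat(2) on each
file, and pct is the "x%" above; a command line argument could
override the guess either way.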

I wrote a prototype no-features cmp using read(2) and memcmp(3).
For large files it is faster than the base cmp and uses less CPU.
It is I/O bound rather than CPU bound.
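
For illustration only, a read(2)/memcmp(3) comparison loop might look
roughly like the sketch below. It is not the prototype described
above: the function names and the 1 MB buffer size (CMP_BUFSIZE) are
assumptions, and no performance claim is made for it.

```c
#include <sys/types.h>
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define CMP_BUFSIZE (1024 * 1024)	/* assumed value, tune as needed */

/*
 * Read until len bytes are in buf or EOF is reached, retrying on
 * short reads and EINTR. Returns the byte count, or -1 on error.
 */
ssize_t
read_full(int fd, unsigned char *buf, size_t len)
{
	size_t got = 0;

	assert(buf != NULL);
	while (got < len) {
		ssize_t n = read(fd, buf + got, len - got);

		if (n == 0)
			break;			/* EOF */
		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		got += (size_t)n;
	}
	return (ssize_t)got;
}

/*
 * Compare two open descriptors block by block.
 * Returns 0 if identical, 1 if they differ, 2 on error.
 */
int
cmp_fds(int fd1, int fd2)
{
	unsigned char *b1 = malloc(CMP_BUFSIZE);
	unsigned char *b2 = malloc(CMP_BUFSIZE);
	int result = 2;

	if (b1 == NULL || b2 == NULL)
		goto out;
	for (;;) {
		ssize_t n1 = read_full(fd1, b1, CMP_BUFSIZE);
		ssize_t n2 = read_full(fd2, b2, CMP_BUFSIZE);

		if (n1 < 0 || n2 < 0)
			break;			/* I/O error, result stays 2 */
		if (n1 != n2 || memcmp(b1, b2, (size_t)n1) != 0) {
			result = 1;		/* length or content differs */
			break;
		}
		if (n1 == 0) {
			result = 0;		/* both at EOF, no difference */
			break;
		}
	}
out:
	free(b1);
	free(b2);
	return result;
}
```

A caller would open(2) both files read-only and pass the descriptors
to cmp_fds(); the return values follow the usual cmp(1) exit status
convention. Unlike the base cmp, this does not report the offset of
the first difference.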

So perhaps use memcmp when possible and decide between read and mmap
based on (something)?

Assuming the added performance justifies the added complexity?
