cmp(1) has a bottleneck, but where?

Sat Jan 14 11:34:59 UTC 2012

On Thu, 12 Jan 2012, Dieter BSD wrote:

>>> A) Should the default vfs.read_max be increased?
>>
>> Maybe, but I don't buy most claims that larger block sizes are better.
>
> I didn't say anything about block sizes. There needs to be enough
> data in memory so that the CPU doesn't run out while the disk is
> seeking.

Oops.  I was thinking of read-ahead essentially extending the block
size.  It (or rather clustering) does exactly that for file systems
with small block sizes, provided the blocks are contiguous.  But
too much of it gives latency and resource wastage problems.  Reads
by other processes may be queued behind read-ahead that is never
used.

>>> B) Can the mmap case be fixed? What is the alleged benefit of
>>> using mmap anyway? All I've ever seen are problems.
>>
>> It is much faster for cases where the file is already in memory. It
>> is unclear whether this case is common enough to matter. I guess it
>> isn't.
>
> Is there a reasonably efficient way to tell if a file is already
> in memory or not? If not, then we have to guess.

Not that I know of.  You would want to know how much of it is in
memory.

> If the file is larger than memory it cannot already be in memory.
> For real world uses, there are 2 files, and not all memory can be
> used for buffering files. So cmp could check the file sizes and
> if larger than x% of main memory then assume not in memory.
> There could be a command line argument specifying which method to
> use, or providing a guess whether the files are in memory or not.

I think the 8MB value does that well enough, especially now that
everyone has a GB or 16 of memory.
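
The size check proposed in the quoted text might look roughly like the
sketch below.  probably_uncached(), phys_mem and pct are hypothetical
names, not anything in cmp.  How to obtain the amount of physical
memory is system-specific and is left to the caller here.

```c
#include <assert.h>

/*
 * Hypothetical heuristic: guess that two files cannot both be cached
 * if their combined size exceeds pct percent of physical memory.
 * phys_mem is supplied by the caller (how to get it is system-specific).
 * Returns 1 for "assume not in memory", 0 otherwise.
 */
int
probably_uncached(unsigned long long size_a, unsigned long long size_b,
    unsigned long long phys_mem, unsigned int pct)
{
    unsigned long long limit;

    assert(pct <= 100);
    /* Divide first so the multiplication cannot overflow. */
    limit = phys_mem / 100 * pct;
    return (size_a + size_b > limit);
}
```

With 16GB of memory and pct = 50, two 5GB files would be treated as
uncached and two 1GB files would not.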

posix_fadvise() should probably be used for large files to tell the
system not to cache the data.  Its man page reminded me of the O_DIRECT
flag.  Certainly if the combined size exceeds the size of main memory,
O_DIRECT would be good (even for benchmarks that cmp the same files :-).
But cmp and cp are too old to use it.
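
A minimal sketch of the posix_fadvise() idea, assuming a system that
provides it.  read_once() is a hypothetical helper, and which advice
values actually help is system-dependent, so the choice below is only
a guess.  The advice is a hint, so failures are ignored.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: read a file once, sequentially, hinting that
 * its data need not stay cached.  Returns the byte count, or -1.
 */
long long
read_once(const char *path)
{
    static char buf[64 * 1024];     /* arbitrary buffer size */
    long long total = 0;
    ssize_t n;
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd == -1)
        return (-1);
    /* len == 0 means "to end of file".  Errors are ignored: hints only. */
    (void)posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
    (void)posix_fadvise(fd, 0, 0, POSIX_FADV_NOREUSE);
    while ((n = read(fd, buf, sizeof(buf))) > 0)
        total += n;
    /* Finished with the data: the system may drop it from the cache. */
    (void)posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
    close(fd);
    return (n == -1 ? -1 : total);
}
```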

> I wrote a prototype no-features cmp using read(2) and memcmp(3).
> For large files it is faster than the base cmp and uses less cpu.
> It is I/O bound rather than CPU bound.
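
For reference, a read(2) plus memcmp(3) comparison could be sketched as
below.  This is not the prototype mentioned above, only an illustration.
read_equal(), read_full() and the 64KB buffer size are assumptions, and
it reports only same/different, with no byte or line numbers.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define BUFSZ (64 * 1024)       /* arbitrary buffer size */

/* read() may return short; loop until the buffer is full or EOF. */
static ssize_t
read_full(int fd, char *buf, size_t want)
{
    size_t got = 0;
    ssize_t n;

    while (got < want) {
        n = read(fd, buf + got, want - got);
        if (n == -1)
            return (-1);
        if (n == 0)
            break;
        got += (size_t)n;
    }
    return ((ssize_t)got);
}

/* Returns 1 if the contents are identical, 0 if not, -1 on error. */
int
read_equal(const char *path_a, const char *path_b)
{
    static char buf_a[BUFSZ], buf_b[BUFSZ];
    ssize_t na, nb;
    int fa, fb, result = -1;

    assert(path_a != NULL && path_b != NULL);
    fa = open(path_a, O_RDONLY);
    fb = open(path_b, O_RDONLY);
    if (fa == -1 || fb == -1)
        goto out;
    for (;;) {
        na = read_full(fa, buf_a, BUFSZ);
        nb = read_full(fb, buf_b, BUFSZ);
        if (na == -1 || nb == -1)
            break;              /* result stays -1 */
        if (na != nb || memcmp(buf_a, buf_b, (size_t)na) != 0) {
            result = 0;
            break;
        }
        if (na == 0) {          /* both at EOF, no difference seen */
            result = 1;
            break;
        }
    }
out:
    if (fa != -1)
        close(fa);
    if (fb != -1)
        close(fb);
    return (result);
}
```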

What about using mmap() and memcmp()?  mmap() shouldn't be inherently
much worse than read().  I think it shouldn't, and doesn't, read
ahead the whole mmap()ed size (8MB here), since that would be bad for
latency.  So it must page the data in when it is accessed, and read
ahead for that.
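
A sketch of what mmap() plus memcmp() over fixed-size windows could
look like.  mmap_equal() and its window parameter are hypothetical;
the caller would pass 8MB to match the size mentioned above, and the
window must be a multiple of the page size.  Files of different
length are simply reported as different here, which is simpler than
what cmp does.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/mman.h>
#include <sys/stat.h>
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Returns 1 if the contents are identical, 0 if not, -1 on error. */
int
mmap_equal(const char *path_a, const char *path_b, off_t window)
{
    struct stat sa, sb;
    off_t off, len;
    void *pa, *pb;
    int fa, fb, result = -1;

    assert(window > 0);
    fa = open(path_a, O_RDONLY);
    fb = open(path_b, O_RDONLY);
    if (fa == -1 || fb == -1 ||
        fstat(fa, &sa) == -1 || fstat(fb, &sb) == -1)
        goto out;
    if (sa.st_size != sb.st_size) {
        result = 0;
        goto out;
    }
    result = 1;
    /* Map one window of each file at a time and compare it. */
    for (off = 0; off < sa.st_size && result == 1; off += window) {
        len = sa.st_size - off < window ? sa.st_size - off : window;
        pa = mmap(NULL, (size_t)len, PROT_READ, MAP_SHARED, fa, off);
        pb = mmap(NULL, (size_t)len, PROT_READ, MAP_SHARED, fb, off);
        if (pa == MAP_FAILED || pb == MAP_FAILED)
            result = -1;
        else if (memcmp(pa, pb, (size_t)len) != 0)
            result = 0;
        if (pa != MAP_FAILED)
            munmap(pa, (size_t)len);
        if (pb != MAP_FAILED)
            munmap(pb, (size_t)len);
    }
out:
    if (fa != -1)
        close(fa);
    if (fb != -1)
        close(fb);
    return (result);
}
```

The pages are only faulted in as memcmp() touches them, so whatever
read-ahead happens is up to the VM system, not this code.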

There is another thread about how bad mmap() and sendfile() are with
zfs, because zfs is not merged with the buffer cache so using mmap()
with it wastes about a factor of 2 of memory; sendfile() uses mmap()
so using it with zfs is bad too.  Apparently no one uses cp or cmp
with zfs :-), or they would notice its slowness there too.

> So perhaps use memcmp when possible and decide between read and mmap
> based on (something)?
>
> Assuming the added performance justifies the added complexity?

I think memcmp() instead of byte comparison for cmp -lx is not very
complex.  More interesting is memcmp() for the general case.  For
small files (<= mmap()ed size), mmap() followed by memcmp(), then
going back to a byte comparison to count the line number when
memcmp() fails, seems good.  Going back is messier and slower for
large files.  In the worst case of files larger than memory with a
difference at the end, it involves reading everything twice, so it
is twice as slow if it is i/o bound.
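
The "memcmp() first, go back on failure" idea for data already in
memory could be sketched as below.  first_diff() is a hypothetical
name.  Byte and line numbers are 1-based, with the line number being
one more than the number of newlines before the differing byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Returns 0 if the first len bytes of a and b are identical.
 * Otherwise returns 1 and stores the 1-based byte offset and line
 * number of the first difference in *byte and *line.
 */
int
first_diff(const unsigned char *a, const unsigned char *b, size_t len,
    size_t *byte, size_t *line)
{
    size_t i, nl = 1;

    /* Fast path: no per-byte work at all when the buffers match. */
    if (memcmp(a, b, len) == 0)
        return (0);
    /* Slow path: go back and rescan, counting newlines on the way. */
    for (i = 0; i < len && a[i] == b[i]; i++)
        if (a[i] == '\n')
            nl++;
    assert(i < len);            /* memcmp() said there is a difference */
    *byte = i + 1;
    *line = nl;
    return (1);
}
```

The rescan touches the data a second time, which is cheap while it is
still mapped or cached and is the expensive part for large files.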

Bruce