cmp(1) has a bottleneck, but where?

Sun Jan 15 23:32:57 UTC 2012

> posix_fadvise() should probably be used for large files to tell the
> system not to cache the data. Its man page reminded me of the O_DIRECT
> flag. Certainly if the combined size exceeds the size of main memory,
> O_DIRECT would be good (even for benchmarks that cmp the same files :-).
> But cmp and cp are too old to use it.

8.2 says:
man -k posix_fadvise
posix_fadvise: nothing appropriate

The FreeBSD man pages web page says it is not in 9.0 either.

Google found:
http://lists.freebsd.org/pipermail/freebsd-hackers/2011-May/035333.html

So what is this posix_fadvise() man page you mention?

O_DIRECT looked interesting, but I haven't found an explanation of
exactly what it does, and
find /usr/src/sys | xargs grep O_DIRECT | wc -l
188
was a bit much to wade through, so I didn't try O_DIRECT.

>> I wrote a prototype no-features cmp using read(2) and memcmp(3).
>> For large files it is faster than the base cmp and uses less cpu.
>> It is I/O bound rather than CPU bound.
>
> What about using mmap() and memcmp()? mmap() shouldn't be inherently
> much worse than read(). I think it shouldn't and doesn't read
> ahead the whole mmap()ed size (8MB here), since that would be bad for
> latency. So it must page it in when it is accessed, and read ahead
> for that.

cmp 4GB 4GB
52.06 real 14.68 user 5.26 sys

cmp 4GB - < 4GB
44.37 real 33.87 user 5.53 sys

my_cmp 4GB 4GB
41.22 real 5.26 user 5.09 sys
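
To make the read(2) + memcmp(3) approach concrete, here is a minimal
sketch of that kind of no-features cmp. It is not the prototype timed
above: the names simple_cmp() and read_full(), the 64 KB block size and
the cmp -s style return values are all assumptions for illustration,
and it makes no attempt to retry on EINTR.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define BUFSZ (64 * 1024)	/* assumed block size, not tuned */

/* Read up to len bytes, looping over short reads; bytes read or -1. */
static ssize_t
read_full(int fd, char *buf, size_t len)
{
	size_t got = 0;
	ssize_t n;

	while (got < len) {
		n = read(fd, buf + got, len - got);
		if (n < 0)
			return (-1);
		if (n == 0)
			break;		/* end of file */
		got += (size_t)n;
	}
	return ((ssize_t)got);
}

/* Returns 0 if the files are identical, 1 if they differ, 2 on error. */
int
simple_cmp(const char *path1, const char *path2)
{
	static char b1[BUFSZ], b2[BUFSZ];
	ssize_t n1, n2;
	int fd1, fd2, rv = 0;

	if ((fd1 = open(path1, O_RDONLY)) < 0) {
		perror(path1);
		return (2);
	}
	if ((fd2 = open(path2, O_RDONLY)) < 0) {
		perror(path2);
		close(fd1);
		return (2);
	}
	for (;;) {
		n1 = read_full(fd1, b1, sizeof(b1));
		n2 = read_full(fd2, b2, sizeof(b2));
		if (n1 < 0 || n2 < 0) {
			rv = 2;		/* read error */
			break;
		}
		/* Different lengths or different contents: files differ. */
		if (n1 != n2 || memcmp(b1, b2, (size_t)n1) != 0) {
			rv = 1;
			break;
		}
		if (n1 == 0)
			break;		/* both files hit EOF together */
	}
	close(fd1);
	close(fd2);
	return (rv);
}
```

Since it reports neither the byte offset nor the line number, it only
corresponds to cmp -s.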

> There is another thread about how bad mmap() and sendfile() are with
> zfs, because zfs is not merged with the buffer cache so using mmap()
> with it wastes about a factor of 2 of memory; sendfile() uses mmap()
> so using it with zfs is bad too. Apparently no one uses cp or cmp
> with zfs :-), or they would notice its slowness there too.

I recently read somewhere that zfs needs 5 GB of memory for each 1 TB of disk.
People who run zfs obviously don't care about using lots of memory.

I only noticed the problem because cmp wasn't reading as fast as expected,
but wasn't cpu bound either.

> I think memcmp() instead of byte comparison for cmp -lx is not very
> complex. More interesting is memcmp() for the general case. For
> small files (<= mmap()ed size), mmap() followed by memcmp(), then
> go back to a byte comp to count the line number when memcmp() fails
> seems good. Going back is messier and slower for large files. In
> the worst case of files larger than memory with a difference at the
> end, it involves reading everything twice, so it is twice as slow
> if it is i/o bound.

Studying the cmp man page, the situation is... unfortunate. The default
prints the byte and line number if the files differ, so it needs
that info. The -l and -x options just keep going after the first
difference. Whether the first byte is indexed 0 or 1 is tied to the
radix; you can't choose the two independently.

If we only needed the byte count it wouldn't be so bad, but needing
the line count really throws a wrench in the works if we want to use
memcmp(). The only way to avoid needing the line count is -s.
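
For a single block already in memory, the "memcmp() first, then go back
to a byte compare" idea quoted above might look like the sketch below.
The struct diff_pos and first_diff() names are made up for
illustration. It only covers one block: for a whole file the line
count would also have to include the newlines in every earlier block,
which is exactly the part that costs a second pass.

```c
#include <stddef.h>
#include <string.h>

/* Position of the first difference, 1-based like cmp's default output. */
struct diff_pos {
	size_t byte;
	size_t line;
};

/*
 * Compare the first len bytes of a and b.  Returns 0 if they match.
 * Otherwise returns 1 and fills *pos, rescanning byte by byte from the
 * start of the block to count newlines before the first difference.
 */
int
first_diff(const unsigned char *a, const unsigned char *b, size_t len,
    struct diff_pos *pos)
{
	size_t i, line = 1;

	/* Fast path: no difference, so no offset or line count needed. */
	if (memcmp(a, b, len) == 0)
		return (0);

	/* Slow path: go back and walk the bytes up to the mismatch. */
	for (i = 0; i < len && a[i] == b[i]; i++)
		if (a[i] == '\n')
			line++;
	pos->byte = i + 1;
	pos->line = line;
	return (1);
}
```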