cmp(1) has a bottleneck, but where?

Bruce Evans brde at
Mon Jan 9 14:43:35 UTC 2012

On Wed, 4 Jan 2012, Bruce Evans wrote:

> On Tue, 3 Jan 2012, Marc Olzheim wrote:
>> On Tue, Jan 03, 2012 at 12:21:10AM -0800, Garrett Cooper wrote:
>>>     The file is 3.0GB in size. Look at all those page faults though!
>>> Thanks!
>>> -Garrett
>> From usr.bin/cmp/c_regular.c:
>> #define MMAP_CHUNK (8*1024*1024)
>> ...
>> for (..) {
>> 	mmap() chunk of size MMAP_CHUNK.
>> 	compare
>> 	munmap()
>> }
>> That 8 MB chunk size sounds like a bad plan to me. I can imagine
>> something needed to be done to compare files larger than X GB on a 32bit
>> system, but 8MB is pretty small...
> 8MB is more than large enough.  It works at disk speed in my tests.  cp
> still uses this value.  Old versions of cmp used the bogus value of
> ...
> In my tests, using "-" for one of the files mainly takes lots more user
> time.  It only reduces the real time by 25%.  This is on a core2.  On
> a system with a slow CPU, it is easy for getc() to be much slower than
> the disk.

More careful tests showed serious slowness when the combined file sizes
exceeded the cache size.  cmp takes an enormous amount of CPU (see another
reply), and this seems to be done mostly in series with i/o, so the total
time increases too much.  A smaller mmap() size or not using mmap() at
all might improve parallelism.
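
For reference, the "not using mmap() at all" alternative could look
roughly like the sketch below.  This is not code from usr.bin/cmp; the
names cmp_read() and read_full() and the chunk parameter are made up
for illustration.  The idea is only that sequential read(2) calls give
the kernel a chance to read ahead while the previous chunk is being
compared.  Whether that actually helps would have to be measured.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Read up to len bytes, looping over short reads.  Returns the number
 * of bytes read (less than len only at EOF), or -1 on error. */
static ssize_t
read_full(int fd, unsigned char *buf, size_t len)
{
	size_t got = 0;

	while (got < len) {
		ssize_t n = read(fd, buf + got, len - got);
		if (n < 0)
			return (-1);
		if (n == 0)
			break;		/* EOF */
		got += (size_t)n;
	}
	return ((ssize_t)got);
}

/*
 * Hypothetical helper: compare two files with plain read(2) in
 * chunk-sized pieces instead of mmap().
 * Returns 0 if identical, 1 if they differ, -1 on error.
 */
int
cmp_read(const char *path1, const char *path2, size_t chunk)
{
	int fd1, fd2, rv = -1;
	unsigned char *b1, *b2;

	assert(chunk > 0);
	fd1 = open(path1, O_RDONLY);
	fd2 = open(path2, O_RDONLY);
	b1 = malloc(chunk);
	b2 = malloc(chunk);
	if (fd1 < 0 || fd2 < 0 || b1 == NULL || b2 == NULL)
		goto out;

	for (;;) {
		ssize_t n1 = read_full(fd1, b1, chunk);
		ssize_t n2 = read_full(fd2, b2, chunk);

		if (n1 < 0 || n2 < 0)
			goto out;	/* read error, rv is still -1 */
		if (n1 != n2 || memcmp(b1, b2, (size_t)n1) != 0) {
			rv = 1;		/* different length or contents */
			goto out;
		}
		if (n1 == 0) {
			rv = 0;		/* both hit EOF together */
			goto out;
		}
	}
out:
	free(b1);
	free(b2);
	if (fd1 >= 0)
		close(fd1);
	if (fd2 >= 0)
		close(fd2);
	return (rv);
}
```

The chunk size is a parameter so that small and large values can be
timed against each other, in the same way as a smaller mmap() size
could be tried.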

