svn commit: r227812 - head/lib/libc/string

David Chisnall theraven at FreeBSD.org
Wed Nov 23 10:37:20 UTC 2011


On 22 Nov 2011, at 20:27, David Schultz wrote:

> Benchmark or not, I think you'll have a very hard time finding a
> single real program that routinely calls strcasecmp() with
> identical pointers!

I've seen this pattern very often.  The linker is often able to combine identical constant strings defined in different compilation units, and with link-time optimisation there are more opportunities for the compiler to do the same.

A fairly common pattern is to define constant strings as macros in a header and then use them as keys in a dictionary, first hashed and then compared with strcmp().  In this case, the == check is a significant win.  I've had to work around the fact that FreeBSD's libc is significantly slower than GNU libc in this instance by adding an extra == outside of strcmp().  This increases the size of the code everywhere this pattern is used, increasing cache usage and lowering overall performance.  Good luck coming up with a microbenchmark that demonstrates that, although I'd be happy to provide you with a Google-authored paper from a couple of years ago explaining why it's so hard to benchmark accurately on modern machines.
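
A minimal sketch of that dictionary pattern, for illustration only.  The key macros, the table size, and the helpers hash_str(), keys_equal(), dict_put() and dict_get() are all hypothetical names, not taken from any real code base.  The workaround described above is the a == b test in keys_equal():

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical keys, as they might be defined in a shared header. */
#define KEY_NAME "name"
#define KEY_SIZE "size"

#define NBUCKETS 8	/* assumed table size, small for illustration */

struct entry {
	const char *key;	/* NULL marks an empty slot */
	int value;
};

static struct entry table[NBUCKETS];

/* Simple multiplicative string hash; any reasonable hash would do. */
static size_t
hash_str(const char *s)
{
	size_t h = 5381;

	while (*s != '\0')
		h = h * 33 + (unsigned char)*s++;
	return (h);
}

/*
 * The caller-side workaround: test the pointers first and only fall
 * back to strcmp() when they differ.
 */
static int
keys_equal(const char *a, const char *b)
{
	return (a == b || strcmp(a, b) == 0);
}

/* Insert or update with linear probing; returns 0 if the table is full. */
static int
dict_put(const char *key, int value)
{
	size_t i, probes;

	assert(key != NULL);
	i = hash_str(key) % NBUCKETS;
	for (probes = 0; probes < NBUCKETS; probes++) {
		if (table[i].key == NULL || keys_equal(table[i].key, key)) {
			table[i].key = key;
			table[i].value = value;
			return (1);
		}
		i = (i + 1) % NBUCKETS;
	}
	return (0);
}

/* Look up a key; returns 1 and stores the value in *out when found. */
static int
dict_get(const char *key, int *out)
{
	size_t i, probes;

	assert(key != NULL && out != NULL);
	i = hash_str(key) % NBUCKETS;
	for (probes = 0; probes < NBUCKETS; probes++) {
		if (table[i].key == NULL)
			return (0);
		if (keys_equal(table[i].key, key)) {
			*out = table[i].value;
			return (1);
		}
		i = (i + 1) % NBUCKETS;
	}
	return (0);
}
```

When the toolchain merges identical string literals, a lookup such as dict_get(KEY_NAME, &v) can hit the pointer test and skip the character loop entirely.  A key built at run time still matches through strcmp().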

It's also worth noting that the cost of the extra branch is more or less trivial, as every single character in the input strings will also need to be compared.  This change turns a linear complexity case into a constant complexity case, so it's a clear algorithmic improvement for a case that, while rare, is not as improbable as you seem to suppose.
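
To make the complexity argument concrete, here is a sketch of a case-insensitive comparison with the pointer check in front.  It is a hypothetical stand-in named sketch_strcasecmp(), not the libc source or the committed change:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical illustration only.  With identical pointers the function
 * returns after one comparison; otherwise it walks both strings as usual.
 */
static int
sketch_strcasecmp(const char *s1, const char *s2)
{
	const unsigned char *u1 = (const unsigned char *)s1;
	const unsigned char *u2 = (const unsigned char *)s2;

	assert(s1 != NULL && s2 != NULL);

	if (u1 == u2)
		return (0);	/* constant time: same object, so equal */

	/* Linear time: compare until a mismatch or the terminator. */
	while (tolower(*u1) == tolower(*u2)) {
		if (*u1 == '\0')
			return (0);
		u1++;
		u2++;
	}
	return (tolower(*u1) - tolower(*u2));
}
```

For distinct pointers the only added work is the single u1 == u2 branch before the loop.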

As to the | vs || issue - by all means change it to || if it fits better with the FreeBSD style.  In the general case I prefer to use | to hint to the compiler and readers of the code that short-circuit evaluation is not required and to remove a sequence point and make life easier for the optimiser.  In this case, the two are equivalent so it's just a hint to the reader, and apparently (judging by the responses so far) one that is not well understood.
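
A small sketch of the difference between the two operators.  The helpers is_zero() and same_ptr() and the calls counter are hypothetical, added only to make operand evaluation visible.  With operands that are already 0 or 1 and have no side effects, both forms give the same result:

```c
#include <assert.h>
#include <stddef.h>

static int calls;	/* counts operand evaluations, for illustration */

static int
is_zero(size_t n)
{
	calls++;
	return (n == 0);
}

static int
same_ptr(const char *a, const char *b)
{
	calls++;
	return (a == b);
}

/* Bitwise OR: both operands are always evaluated, no short circuit. */
static int
check_bitwise(const char *a, const char *b, size_t n)
{
	return (is_zero(n) | same_ptr(a, b));
}

/* Logical OR: the right operand is skipped when the left one is true. */
static int
check_logical(const char *a, const char *b, size_t n)
{
	return (is_zero(n) || same_ptr(a, b));
}

/* Both operands yield 0 or 1 here, so the two forms must agree. */
static void
verify_equivalent(const char *a, const char *b, size_t n)
{
	assert(check_bitwise(a, b, n) == check_logical(a, b, n));
}
```

The results match in every case; only the number of operands evaluated differs, and that is observable here only because the helpers deliberately increment a counter.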

David

