strcspn(3) complexity improvement

David Schultz das at FreeBSD.ORG
Fri Apr 1 06:02:31 PST 2005


On Wed, Mar 30, 2005, Peter Jeremy wrote:
> On Wed, 2005-Mar-30 10:34:35 +0200, Jeremie Le Hen wrote:
> >Andreas Hauser made a patch to strcspn(3) for the DragonFly project
> >which makes it faster when dealing with long strings [1] (rev 1.4).
> >It basically changes the complexity of the function from
> >    O(strlen(str) * strlen(chars))
> >to
> >    O(strlen(str) + strlen(chars))
> >by using a charset.
> 
> It has a significantly higher overhead due to the need to zero the charset.

The overhead might be smaller if gcc optimized the bool array to
use one bit per element.  Failing that, it has to initialize 256
bytes of storage instead of 32.  Perhaps someone could write a
modified version with bit fiddling and compare.

It's worth noting that the DragonFly version will probably avoid a
bunch of branch mispredicts, too---probably about strlen(str) of
them.  This is because in the old version, you expect one
misprediction each time you exit the inner loop, unless the inner
loop executes for very few iterations.  Of course, the DF code is
still a lose for very short strings.


More information about the freebsd-current mailing list