cvs commit: src/include _ctype.h

Christoph Mallon christoph.mallon at gmx.de
Wed Oct 31 18:52:31 PDT 2007


Andrey Chernov wrote:
> On Mon, Oct 29, 2007 at 09:48:16PM +0100, Christoph Mallon wrote:
>> Andrey A. Chernov wrote:
>>> ache        2007-10-27 22:32:28 UTC
>>>   FreeBSD src repository
>>>   Modified files:
>>>     include              _ctype.h
>>>   Log:
>>>   Micro-optimization of prev. commit, change
>>>   (_c < 0 || _c >= 128) to (_c & ~0x7F)
>>>   Revision  Changes    Path
>>>   1.33      +1 -1      src/include/_ctype.h
>> Actually this is rather a micro-pessimisation. Every compiler worth its 
>> money transforms the range check into a single unsigned comparison. The 
>> latter test, on the other hand, probably gets transformed into a test 
>> instruction on x86. This instruction has no form with a sign-extended 
>> 8-bit immediate, only one with a 32-bit immediate. This results in a 
>> significantly longer opcode (three bytes more) than the single 
>> (unsigned)_c > 127 which a sane compiler produces. I suspect some RISC 
>> machines need one more instruction for the "micro-optimised" code, too.
>> In theory GCC could transform the _c & ~0x7F back into a (unsigned)_c > 
>> 127, but it does not do this (the only compiler I found which does this 
>> transformation is LLVM).
>> Further, IMO it is hard to decipher what _c & ~0x7F is supposed to do.
> 
> 1. My variant is independent of the compiler optimization level. E.g., 
> with optimization turned off completely, the range check transform you 
> talk about does not happen at all and very long asm code is generated. I 
> also mean cases where optimization is removed to avoid a gcc optimization 
> bug (as when compiling a large part of the Xorg server recently), cases 
> with non-gcc compilers, etc.

Compiling without any optimisations makes the code slow for a zillion 
other reasons (no load/store optimisations, constant folding, common 
subexpression elimination, if-conversion, partial redundancy 
elimination, strength reduction, reassociation, code placement, and many 
more), so an untransformed range check is really not of any concern.

> 2. _c & ~0x7F comes right from is{w}ascii(), so there is no enormously
> big problem deciphering it. I just want to keep all of ctype in one style.

Repeating cryptic code does not make it better, IMO.
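
To make the three spellings under discussion concrete, here is a minimal
sketch. The function names are made up for illustration and do not appear
in _ctype.h, and the mask form assumes a two's complement int. It only
shows that the three checks give the same answer for every value tested;
it says nothing about which instructions a given compiler emits.

```c
#include <assert.h>

/* Range check as written before revision 1.33. */
int out_of_ascii_range(int c)
{
	return c < 0 || c >= 128;
}

/* Mask form introduced by revision 1.33 (assumes two's complement). */
int out_of_ascii_mask(int c)
{
	return (c & ~0x7F) != 0;
}

/* Single unsigned comparison an optimising compiler can derive
 * from the range check: negative values wrap to large unsigned ones. */
int out_of_ascii_unsigned(int c)
{
	return (unsigned)c > 127;
}

/* Count the values in [lo, hi] for which the three forms disagree.
 * A wider loop counter avoids overflow when hi is INT_MAX. */
long count_disagreements(int lo, int hi)
{
	long bad = 0;

	for (long long i = lo; i <= hi; ++i) {
		int c = (int)i;
		int r = out_of_ascii_range(c);

		if (r != out_of_ascii_mask(c) || r != out_of_ascii_unsigned(c))
			++bad;
	}
	return bad;
}

/* Hypothetical self-check over a range straddling both boundaries. */
void self_check(void)
{
	assert(count_disagreements(-1024, 1024) == 0);
}
```

Calling self_check() from a test program exercises the boundaries at -1, 0,
127 and 128; the named range check is the only one of the three that states
the intent directly.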

> 3. I see no "longer opcode (three bytes more)" you talk about in my tests 
> (andl vs cmpl was there, no testl).

See the reply to the mail with your code example.

	Christoph

