svn commit: r280279 - head/sys/sys
Bruce Evans
brde at optusnet.com.au
Mon Apr 20 14:52:46 UTC 2015
On Tue, 21 Apr 2015, Bruce Evans wrote:
> On Mon, 20 Apr 2015, Konstantin Belousov wrote:
>
>> On Mon, Apr 13, 2015 at 04:04:45PM -0400, Jung-uk Kim wrote:
>>> Please try the attached patch.
>>> ...
>>> - __asm __volatile("xorl %k0,%k0;popcntq %1,%0"
>>> - : "=&r" (result) : "rm" (elem));
>>> ...
>>> + __asm __volatile("xorl %k0, %k0; popcntq %1, %0"
>>> + : "=r" (count) : "m" (pc_map[field]));
>> ...
>> Yes, this worked for me the same way as for you, the argument is taken
>> directly from memory, without temporary spill. Is this due to silly
>> inliner ? Whatever the reason is, I think a comment should be added
>> noting the subtlety.
>>
>> Otherwise, looks fine.
>
> Erm, this looks silly. It apparently works by making things too complicated
> for the compiler to "optimize" (where one of the optimizations actually
> gives pessimal spills). Its main changes are:
> ...
> It works better to change the constraint to "r":
It's even sillier than that. The problem is not limited to this function.
clang seems to prefer memory whenever you use the "rm" constraint. The
silliest case is when you have a chain of simple asm functions. Say the
original popcntq (without the xorl):

	return (popcntq(popcntq(popcntq(popcntq(popcntq(x))))));

gcc compiles this to 5 sequential popcntq instructions, but clang
spills the results of the first 4.
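
A minimal sketch for reproducing the comparison, assuming an amd64 machine with POPCNT. The names popcntq_rm(), popcntq_r(), chain_rm() and chain_r() are made up for illustration and are not the cpufunc.h or pmap definitions. Compile with -O2 -S and compare the code generated for the two chain functions; what you see depends on the compiler and its version.

```c
#include <stdint.h>

#if defined(__x86_64__)
/* Hypothetical wrapper: "rm" lets the compiler pick register or memory. */
static inline uint64_t
popcntq_rm(uint64_t mask)
{
	uint64_t result;

	__asm__ __volatile__("popcntq %1,%0" : "=r" (result) : "rm" (mask));
	return (result);
}

/* Hypothetical wrapper: "r" forces the operand into a register. */
static inline uint64_t
popcntq_r(uint64_t mask)
{
	uint64_t result;

	__asm__ __volatile__("popcntq %1,%0" : "=r" (result) : "r" (mask));
	return (result);
}
#else
/* Fallback so that the sketch still builds on other architectures. */
static inline uint64_t
popcntq_rm(uint64_t mask)
{
	return ((uint64_t)__builtin_popcountll(mask));
}

static inline uint64_t
popcntq_r(uint64_t mask)
{
	return ((uint64_t)__builtin_popcountll(mask));
}
#endif

/* Not static, so both chains show up in the assembly output. */
uint64_t
chain_rm(uint64_t x)
{
	return (popcntq_rm(popcntq_rm(popcntq_rm(popcntq_rm(popcntq_rm(x))))));
}

uint64_t
chain_r(uint64_t x)
{
	return (popcntq_r(popcntq_r(popcntq_r(popcntq_r(popcntq_r(x))))));
}
```

Both chains compute the same value; only the operand placement chosen by the compiler can differ, which is what the spills mentioned above are about.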
This is an old bug. clang does this on FreeBSD[9-11]. cc does this
on FreeBSD[10-11] (not on FreeBSD-9 since cc = gcc there).
Asms should always use "rm" if "m" works. Ones in cpufunc.h always
do except for lidt(), lldt() and ltr(). These 3 are fixed in my version.
So cpufunc.h almost always asks for the pessimization.
Bruce