likely and unlikely

Fri Mar 19 14:42:23 UTC 2010

On Thu, 18 Mar 2010, Scott Long wrote:

> On Mar 18, 2010, at 4:11 PM, M. Warner Losh wrote:
>> In message: <alpine.BSF.2.00.1003131346270.51476 at fledge.watson.org>
>>            Robert Watson <rwatson at FreeBSD.org> writes:
>> :
>> : On Sat, 13 Mar 2010, Bruce Evans wrote:
>> :
>> : >> My point is: Handle with care!!!  Trust your compiler/CPU
>> : >> predictors/... - most of the time, they are smarter than you are ;)
>> : >
>> : > These macros may have been useful 15-25 years ago for i386, i486 and
>> : > Pentium1, since CPU branch predictors were either nonexistent or not
>> : > so good. After that, CPU branch predictors became quite good.  The
>> : > macros should have been mostly unused 15-25 years ago too, since they
>> : > optimize for unreadability and unwritability.  Fortunately they are
>> : > rarely used in FreeBSD.  They were imported from NetBSD in 2003 where
>> : > they are used more (306 instances in 2005 NetBSD /sys vs 28 instances
>> : > in 2004 FreeBSD /sys; there are 2208 instances of likely() in 2004
>> : > linux-2.6.10).
>> :
>> : I think it would be reasonable to expect that people deploy branch
>> : prediction macros (as with prefetch, etc) only where there's specific
>> : measurements that indicate they are important to have there -- at the
>> : very least, pmc data, but ideally also benchmarking data.
>>
>> They are more useful on architectures where you have branches that
>> tell the CPU if they are likely or unlikely to be taken...
>
> And that's a very good point, one that Bruce really failed to address.  Not only
> is branch prediction useful for MIPS and ARM, I suspect that it's also useful
> for Atom.

I addressed it indirectly :-).  The useful use of these macros is limited to:
- the 1% or 0.1% of CPUs that are not amd64 or i386
- on these CPUs, the small percentage of code that runs in the kernel (since
   userland is out of scope for this discussion and much harder to optimize
   globally since it is much larger)
- the small percentage of kernel code that is MD (since these optimizations
   are very MD and I hope no one plans to tangle MI code with ifdefs for
   them)

This gives a very small target for optimization and makes the effect
difficult to measure.  Most systems shouldn't spend most of their time in
the kernel.
It is easy to think of specialized systems which are exceptions but hard
to think of any that spend much time in their MD part, except in copyin/out
which should already be optimized.
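
For reference, a minimal sketch of what the macros under discussion look
like.  The EX_LIKELY/EX_UNLIKELY names and ex_sum() are hypothetical and
exist only for illustration; FreeBSD spells its versions __predict_true()
and __predict_false() in <sys/cdefs.h>, and Linux uses likely() and
unlikely().  The sketch assumes gcc or clang, where __builtin_expect() is
available, and falls back to a plain expression elsewhere.  Whether the hint
changes the generated code at all depends on the compiler and the target.

```c
#include <stddef.h>

/*
 * Hypothetical macro names, for illustration only.  On compilers
 * without __builtin_expect() the hint degrades to the bare expression.
 */
#if defined(__GNUC__) || defined(__clang__)
#define EX_LIKELY(x)	__builtin_expect(!!(x), 1)
#define EX_UNLIKELY(x)	__builtin_expect(!!(x), 0)
#else
#define EX_LIKELY(x)	(x)
#define EX_UNLIKELY(x)	(x)
#endif

/*
 * Sum n ints.  The NULL check is hinted as the rare path, so the
 * compiler may move the error return out of the straight-line code.
 * Returns -1 if buf is NULL.
 */
static long
ex_sum(const int *buf, size_t n)
{
	long total = 0;
	size_t i;

	if (EX_UNLIKELY(buf == NULL))
		return (-1);
	for (i = 0; i < n; i++)
		total += buf[i];
	return (total);
}
```

The hint only annotates the condition; the result of ex_sum() is the same
with or without it, which is part of why the effect has to be measured
rather than assumed.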

Bruce