kern/98460 : [kernel] [patch] fpu_clean_state() cannot be disabled
for not AMD processors, those are not vulnerable to FreeBSD-SA-06:14.fpu
Bruce Evans
bde at zeta.org.au
Tue Jun 6 19:26:44 PDT 2006
The following reply was made to PR kern/98460; it has been noted by GNATS.
From: Bruce Evans <bde at zeta.org.au>
To: Rostislav Krasny <rosti.bsd at gmail.com>
Cc: freebsd-gnats-submit at freebsd.org
Subject: Re: kern/98460 : [kernel] [patch] fpu_clean_state() cannot be disabled
for not AMD processors, those are not vulnerable to FreeBSD-SA-06:14.fpu
Date: Wed, 7 Jun 2006 12:09:10 +1000 (EST)
On Tue, 6 Jun 2006, Rostislav Krasny wrote:
> On Mon, 5 Jun 2006 08:25:06 +1000 (EST)
> Bruce Evans <bde at zeta.org.au> wrote:
>
>> On Sun, 4 Jun 2006, Rostislav Krasny wrote:
>>
>>> On Sun, 4 Jun 2006, Bruce Evans wrote:
>>>> The configuration should be dynamic and automatic, so that it doesn't
>>>> take changes to zillions of configuration files to implement and
>>>> document an option that almost no one will know to set. I think there
>>>> is a simple feature test for the AMD misfeature.
>>>
>>> David Xu had proposed something like that. But from Colin Percival's
>>> reply I understood that it is hard to be done effectively. See their
>>> discussion by the first URL in this PR.
>>
>> I don't see how it can be hard. Perhaps it is too CPU-dependent for
>> tests based on cpuid to be easy or future-proof, but a runtime test
>> in the probe would be easy. Here is a userland version. It gives the
>> ...
>
> And then you want to call the fpu_clean_state() function conditionally,
> like in following example?
>
>     if (cpu_fxsr & CPU_FXSR_NEEDCLEAN)
>         fpu_clean_state();
Not quite like that. In my version there is no function call -- the code
is executed in the one place where it is needed, so there is no function
call overhead or possible branch prediction overhead for the function call.
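The difference from the quoted example can be sketched in portable C. This is only a model of one reading of the paragraph above, not the actual patch: the names cpu_fxsr and CPU_FXSR_NEEDCLEAN are taken from the quoted example, the flag value is invented, and counters stand in for the real cleaning instructions and for fxrstor. The point it illustrates is that the flag check and the cleaning step are written directly in the restore path, so there is no separate call and return.

```c
#include <assert.h>

/* Hypothetical flag bit, named after the quoted example; the value is assumed. */
#define CPU_FXSR_NEEDCLEAN	0x1u

static unsigned cpu_fxsr;		/* would be set once by the CPU probe */
static unsigned long clean_count;	/* stand-in for the cleaning instructions */
static unsigned long restore_count;	/* stand-in for fxrstor */

/*
 * Restore path.  The test and the cleaning step are written in place,
 * so the only extra control flow is the single flag test.
 */
static inline void
restore_fpu_state(void)
{
	if (cpu_fxsr & CPU_FXSR_NEEDCLEAN)
		clean_count++;		/* cleaning instructions would go here */
	restore_count++;		/* the restore itself */
}

/* Small self-check: cleaning runs only when the probe flagged the CPU. */
static void
demo(void)
{
	cpu_fxsr = 0;
	restore_fpu_state();
	assert(clean_count == 0);

	cpu_fxsr = CPU_FXSR_NEEDCLEAN;
	restore_fpu_state();
	assert(clean_count == 1);
}
```

Calling demo() exercises both settings of the flag. The model says nothing about cycle counts; it only shows where the check sits relative to the restore.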
> But this looks the same as what David Xu had proposed. Read what Colin
> Percival had replied about that proposition:
>
> http://lists.freebsd.org/pipermail/freebsd-current/2006-May/062683.html
>> The problem with doing something like this is that the branch will
>> almost never be in the processor's branch prediction tables, so you
>> will get a branch mis-prediction on the unaffected processors --
>> which is likely to be more expensive than simply running the state
>> cleaning code.
It can't possibly be _more_ expensive, since the state-cleaning code
has 2 or 3 branches in it instead of only 1. It has 1 or 2 branches
for the function call and return. Whether function calls and returns
use normal branch prediction is machine-dependent. Whatever they use,
it takes some CPU resources. The state-cleaning code has a branch in
it. This branch is slightly harder to predict than a cpu_fxsr one.
My second version of a fix avoided this branch by doing the fnclex()
unconditionally (the first version did the load unconditionally and
panicked in corner cases). The code with the branch runs much faster
than an unconditional fnclex() in a simple benchmark with the code in
a loop, but I wonder if it is still faster after branch misprediction.
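The two variants discussed above can be compared on a simulated x87 status word. This is a sketch of the semantics only, under stated assumptions: the function names are invented for illustration, fnclex is modelled as clearing the exception flags, SF, ES and B while leaving the condition codes and TOP alone, and ES is bit 7 of the status word. It cannot say anything about which variant is faster.

```c
#include <assert.h>
#include <stdint.h>

#define X87_SW_ES	0x0080u	/* exception summary, bit 7 of the status word */
#define X87_SW_KEEP	0x7f00u	/* bits fnclex leaves alone: C0-C3 and TOP */

/* Model of fnclex: clears the exception flags, SF, ES and B. */
static uint16_t
fnclex_model(uint16_t sw)
{
	return (sw & X87_SW_KEEP);
}

/* Variant with the branch: clear only when ES is set. */
static uint16_t
clean_branchy(uint16_t sw)
{
	if (sw & X87_SW_ES)
		sw = fnclex_model(sw);
	return (sw);
}

/* Variant without the branch: always clear. */
static uint16_t
clean_unconditional(uint16_t sw)
{
	return (fnclex_model(sw));
}

/* Check every 16-bit status word: both variants must leave ES clear. */
static void
check_variants(void)
{
	for (uint32_t sw = 0; sw <= 0xffffu; sw++) {
		assert((clean_branchy((uint16_t)sw) & X87_SW_ES) == 0);
		assert((clean_unconditional((uint16_t)sw) & X87_SW_ES) == 0);
	}
}
```

Both variants guarantee that ES is clear before the following load. They differ only in that the unconditional one also discards exception flags that were set while ES was clear.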
> Eliminating the fpu_clean_state() by "options CPU_FXSAVE_NO_LEAK" could
> be used as a custom optimization. No one is obliged to use it, as well
> as many other CPU_* optimization options.
There are too many options and not enough automatic tuning. This
particular optimization is especially not worth doing, since the gain is
in the 10-100 cycle range (similar to what could be gained from avoiding
a single branch misprediction or cache miss), but I care about it since
it compensates for a pessimization.
Bruce