mfi(4) IO performance regression, post 8.1

Alexander Motin mav at FreeBSD.org
Fri Jul 20 13:38:13 UTC 2012


On 19.07.2012 18:28, Adrian Chadd wrote:
> Hm! A timer related bug?
>
> I'll CC mav@ on this, as it was his commit (and work in his general area.)
>
> I wonder what's going on - is it something to do with the two ACPI
> calls inserted there, or is it something to do with the change in
> event timer values?
>
> mav? Any ideas?

I can just agree with earlier made guess that for some reason ACPI timer 
on that system is very slow. Unless user explicitly enabled deeper 
C-states, values returned by the timer are not really used for anything, 
so there is just no place for other bug.

When doing this change I was expecting that it may have cost, but on 
most systems that cost makes effect only during high interrupt rates, 
where it is covered by automatic fallback to using faster MWAIT as idle 
method. Unluckily, that code still was not merged to 8-STABLE (only 9). 
I will recheck is there problem to merge it now.

Manual switching to MWAIT via sysctl is correct workaround for this 
situation. It may give slightly higher power consumption, but for this 
workload with many interrupts probably the best possible performance.

> On 17 July 2012 13:39, Steve McCoy <smccoy at greatbaysoftware.com> wrote:
>
>> Alright, I've finally narrowed it down to r209897, which only affects
>> acpi_cpu_idle():
>>
>> --- stable/8/sys/dev/acpica/acpi_cpu.c  2010/06/23 17:04:42     209471
>> +++ stable/8/sys/dev/acpica/acpi_cpu.c  2010/07/11 11:58:46     209897
>> @@ -930,12 +930,16 @@
>>
>>       /*
>>        * Execute HLT (or equivalent) and wait for an interrupt.  We can't
>> -     * calculate the time spent in C1 since the place we wake up is an
>> -     * ISR.  Assume we slept half of quantum and return.
>> +     * precisely calculate the time spent in C1 since the place we wake up
>> +     * is an ISR.  Assume we slept no more then half of quantum.
>>        */
>>       if (cx_next->type == ACPI_STATE_C1) {
>> -       sc->cpu_prev_sleep = (sc->cpu_prev_sleep * 3 + 500000 / hz) / 4;
>> +       AcpiHwRead(&start_time, &AcpiGbl_FADT.XPmTimerBlock);
>>          acpi_cpu_c1();
>> +       AcpiHwRead(&end_time, &AcpiGbl_FADT.XPmTimerBlock);
>> +        end_time = acpi_TimerDelta(end_time, start_time);
>> +       sc->cpu_prev_sleep = (sc->cpu_prev_sleep * 3 +
>> +           min(PM_USEC(end_time), 500000 / hz)) / 4;
>>          return;
>>       }
>>
>> My current guess is that AcpiHwRead() is a problem on our hardware. It's an
>> isolated change and, to my desperate eyes, the commit message implies that
>> it isn't critical — Do you think we could buy ourselves some time by pulling
>> it out of our version of the kernel? Or is this essential for correctness?
>> Any thoughts are appreciated, thanks!


-- 
Alexander Motin


More information about the freebsd-stable mailing list