time issues and ZFS

Mon Jan 21 20:09:28 UTC 2013

I still firmly believe the ACPI event timer code is racy, and what we
may be seeing here is the fallout from that.

It's very possible that we're missing interrupts here - the new
eventtimer code that made it into 9.x puts the halt behind a critical
section, with interrupts disabled. The only idle loops that correctly
implement enable-interrupts-and-halt atomically are the HLT (well, and
the don't-sleep-at-all) idle loops on i386/amd64. The default is the
ACPI sleep method, which doesn't do an atomic interrupt enable / halt.

I'm still seeing odd stuff on some of my ACPI-using netbooks when
doing net80211/ath development, and it all goes away whenever I fiddle
with the idle and eventtimer settings.

So, play with kern.eventtimer.periodic, kern.eventtimer.idletick and
machdep.idle (try setting machdep.idle to hlt, or something else
listed in machdep.idle_available) - please report back what the
results are.
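
To make that concrete, here is a rough sketch of a checklist generator.
print_plan is a made-up helper name, not anything shipped with FreeBSD,
and the idle method list passed at the bottom is only an assumed
example; use whatever machdep.idle_available reports on your machine.
It only prints the sysctl commands, it does not change anything.

```sh
#!/bin/sh
# Sketch: print the sysctl commands worth trying, one per line.
# print_plan is a hypothetical helper, not part of FreeBSD.

print_plan() {
    # $1: idle methods as reported by machdep.idle_available.
    # Commas are turned into spaces so a comma-separated list works too.
    for method in $(echo "$1" | tr ',' ' '); do
        echo "sysctl machdep.idle=$method"
    done
    # Both eventtimer knobs, each in its off and on state.
    for value in 0 1; do
        echo "sysctl kern.eventtimer.periodic=$value"
        echo "sysctl kern.eventtimer.idletick=$value"
    done
}

# Assumed example list; substitute the real value from your system.
print_plan "hlt, acpi"
```

Run the printed commands one at a time as root and note whether the
behaviour changes after each one.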

Adrian

On 21 January 2013 07:54, Ian Lepore <ian at freebsd.org> wrote:
> On Mon, 2013-01-21 at 17:35 +0200, Daniel Braniss wrote:
>> ...
>> >
>> > What's the output of sysctl kern.eventtimer?
>>
>> kern.eventtimer.periodic is 0
>>
>> >                                              Does the bad behavior
>> > change if you set kern.eventtimer.periodic=1?
>> >
>>
>> setting kern.eventtimer.timer=LAPIC
>> instead of the default HPET made the missing cpu timers appear:
>> # vmstat -i
>> interrupt                          total       rate
>> irq3: uart1                         1695          0
>> irq4: uart0                            5          0
>> irq19: ehci0                        3875          0
>> irq20: hpet0 uhci3               5495755       1135
>> irq21: uhci2 ehci1                    29          0
>> irq23: atapci0                        48          0
>> cpu0:timer                          7063          1
>> irq256: bce0                      117073         24
>> irq260: mfi0                       51083         10
>> irq261: mfi1                        3088          0
>> cpu1:timer                           484          0
>> cpu14:timer                           36          0
>> cpu6:timer                           486          0
>> cpu8:timer                            38          0
>> cpu5:timer                            38          0
>> cpu15:timer                           38          0
>> cpu7:timer                            32          0
>> cpu12:timer                           38          0
>> cpu3:timer                            40          0
>> cpu9:timer                            36          0
>> cpu10:timer                           34          0
>> cpu11:timer                           37          0
>> cpu2:timer                            33          0
>> cpu13:timer                           40          0
>> cpu4:timer                            36          0
>> Total                            5681160       1173
>>
>> is this relevant?
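
The per-CPU timer rows in a vmstat -i listing like the one above can be
pulled out and sorted by CPU number with a small filter. This is only a
sketch: cpu_timers is a hypothetical helper name, and it assumes the
column layout shown above (name, total, rate).

```sh
#!/bin/sh
# Sketch: list per-CPU timer interrupt totals from "vmstat -i" output,
# sorted by CPU number. cpu_timers is a hypothetical helper name.

cpu_timers() {
    # Strip leading quote markers ("> ") so text pasted from mail works,
    # keep only the cpuN:timer rows, and print "N total" for each.
    sed 's/^[> ]*//' |
    awk '/^cpu[0-9]+:timer/ {
        sub(/^cpu/, "", $1)
        sub(/:timer$/, "", $1)
        print $1, $2
    }' |
    sort -n
}

# On a live FreeBSD system:
#   vmstat -i | cpu_timers
```

CPUs whose totals sit far below the others, or that do not show up at
all, are the ones to look at first.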
>
> I'll have to let someone who knows modern x86 hardware better comment on
> the relative merits of hpet vs. lapic timers.  If it was using hpet in
> one-shot mode, and changing it to hpet in periodic mode makes the
> problem go away, that might be a clue that there's something wrong in
> the hpet eventtimer start or interrupt routines.
>
> I wonder if a single missed interrupt in one-shot mode would bring an
> eventtimer to a halt like that?  And if so, then what is it about
> manually asking for the date that kicks it into running again?
>
> -- Ian