[RFC] Event timers on sparc64/sun4v

Marius Strobl marius at alchemy.franken.de
Sun Jul 18 14:05:10 UTC 2010


On Sat, Jul 17, 2010 at 07:26:03PM +0300, Alexander Motin wrote:
> Marius Strobl wrote:
> > On Sat, Jul 17, 2010 at 01:02:29AM +0300, Alexander Motin wrote:
> >> Marius Strobl wrote:
> 
> If it is granularity, then it is not caused by the base frequency. Here
> are typical x86 timer frequencies:
> kern.timecounter.tc.i8254.frequency: 1193182
> kern.timecounter.tc.ACPI-fast.frequency: 3579545
> kern.timecounter.tc.HPET.frequency: 14318180
> kern.timecounter.tc.TSC.frequency: 2000085680
> , but TSC is not used on SMP. The others have frequencies no higher than
> stick and still work fine. IMHO as long as the counter is monotonic and
> its frequency is higher than the frequency of context switches, it should
> not be important. I would look for some different reason.

If you have an idea what else could be causing it I'm all ears :)
Part of the problem is that there's so much stuff ticking inside
the kernel - hard, prof and soft clocks, time counters, cycle
counters and CPU tickers - that it's hard to follow which source
is used for what, how each is supposed to work and how they
might interact.

> 
> > Using the stick counter on machines consisting of CPUs running at
> > different speeds (well, actually all the combinations of using
> > stick/tick for hardclock, timecounter, CPU ticker and cycle
> > counter I tried as they didn't appear totally wrong) additionally
> > has the problem of processes getting killed as they are diagnosed
> > to have exceeded their maximum CPU limit, although with the in-tree
> > code only the timecounter provided by the host-PCI-bridge should
> > be used for this calculation as far as the MD initialization is
> > concerned when the stick counter is used to drive hardclock.
> 
> On my SB100 I've seen only the tick timecounter registered. If there is
> some other timecounter hardware in the system, why is it not registered?
> It would be much easier to experiment with more trusted spare parts.

The timecounter I was talking about is implemented by (ab)using
one of the performance counters of the Schizo host-PCI-bridges
in bus cycle counting mode. The host-SBus-bridges and the Psycho
host-PCI-bridges also have a timecounter, but the Hummingbird
host-PCI-bridges found in the Blade 100 don't have such counters.

> 
> >>>   Thus the more desirable variant for these machines
> >>>   probably is to provide the tick counter of the BSP as the only
> >>>   non-per-CPU timer and forward it to the APs via IPIs. 
> >> It would be possible if the timer was programmable from any CPU. But as
> >> I understand it, that requires the thread to be bound, which is handled
> >> by the infrastructure only for per-CPU timers.
> > 
> > Wouldn't it be sufficient to bind curthread to the BSP in
> > tick_et_start() in that case? For one-shot mode this probably
> > is too much overhead (assuming a tickless kernel) but for
> > periodic mode IMO this approach should be sound.
> 
> tick_et_start() is called under a spin lock and sometimes in a critical
> section. You can't call CPU binding there. For reconfiguring per-CPU
> timers there is special logic implemented in MI code using IPIs.

Too bad.

> 
> By the way, I have some doubts about tick_get_timecount_mp() correctness.
> It tries to bind the thread to the BSP, but what if it is called inside
> an interrupt handler, or under a lock, or something else? I doubt binding
> will work in that case.

I've no idea whether sched_bind() works under locks etc. as
there's no man page describing it. However, as it requires
curthread to be locked, thread_lock() itself uses a
spin lock, and locking(9) basically says that acquiring a
spin lock with any other lock held is okay, I assume that
the whole thing is fine with any lock held. Also, if there
were such restrictions I'd expect some KASSERTs etc. to be
in place in the functions involved, preventing incorrect
use, but tick_get_timecount_mp() doesn't trigger any.
Apart from that, I'm not really happy about that construct
myself, but I don't see an alternative to always binding to
the same CPU when reading the tick counter in order to
get reliable results, and in US-IIIi-based machines there
just isn't another piece of hardware besides the per-CPU
stick and tick counters that could be used as a
timecounter.

> 
> >>>   This also
> >>>   leaves the stick counter of all >= US-III machines generally
> >>>   available for driving statclock, which likely is also desirable.
> >> It would be nice, but I don't know how to separate their interrupts.
> > 
> > I think this should be possible in the soft interrupt dispatch.
> > However, meanwhile it came to my mind that there was a problem
> > with using the stick counter on US-IIIi machines (which also
> > only can consist of CPUs running at the same frequency though).
> 
> Is this hardware working at all? Maybe there is something wrong by
> definition or it is misused?

I haven't tried to verify whether the frequency actually
is correct, but in US-III-based machines it's at least
incrementing and generating interrupts, and OpenSolaris is
also using it (and also just gets its frequency via the
OFW device tree), so it should be basically fine.

Marius


