[PATCH] randomized delay in locking primitives, take 2

Mon Aug 1 20:08:49 UTC 2016

On Mon, Aug 01, 2016 at 11:37:50AM -0700, John Baldwin wrote:
> On Sunday, July 31, 2016 02:41:13 PM Mateusz Guzik wrote:
> > On Sun, Jul 31, 2016 at 01:49:28PM +0300, Konstantin Belousov wrote:
> > [snip]
> > 
> > After an irc discussion, the following was produced (also available at:
> > https://people.freebsd.org/~mjg/lock_backoff_complete4.diff):
> > 
> > Differences:
> > - uint64_t usage was converted to u_int (also see r303584)
> > - currently unused features (cap limit and return value) were removed
> > - lock_delay args got packed into a dedicated structure
> 
> lock_delay_enabled declaration seems to be stale?
> 

Oops, thanks.

> I would maybe just provide a "standard" lock_delay_init function that the
> sysinit's use rather than duplicating the same exact code 3 times.  I'm
> not sure we really want to use different tunables for different lock types
> anyway.  (Alternatively we could even just have a single 'config' variable
> that is a global.  We can always revisit this in the future if we find that
> we need that granularity, but it would remove an extra pointer indirection
> if you just had a single 'lock_delay_config' that was exported as a global
> for now and initialized in a single SYSINIT.)
> 

The per-lock-type config is partially an artifact of the real version of
the patch, which has different configs per state of the lock; see the
loops with rowner_loops in the current implementation of rw and sx locks.
That is where it mattered. It was cut from this patch for simplicity
(90% of the benefit for 10% of the work).

That said, when fine-tuned it does matter for "mere" spinning as well,
but here I put in very low values on purpose.

Putting them all in one config makes for a small compatibility issue,
where debug.lock.delay_* sysctls would disappear later.

So I would prefer to just keep this, as I don't think it matters much.
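
For illustration, the two positions can be combined: one shared init
helper, as suggested in the quoted text, filling in separate per-type
configs so that each type can still be tuned on its own later. This is
only a userspace sketch. The struct layout, the *_sketch names and the
default values are assumptions made up for the example and are not taken
from the patch.

```c
/* Hypothetical layout; field names are assumptions for this sketch. */
struct lock_delay_config {
	unsigned int initial;	/* first delay window */
	unsigned int step;	/* window growth per failed attempt */
	unsigned int min;	/* lower bound on spins per call */
	unsigned int max;	/* upper bound on the window */
};

/* One config per lock type, as in the per-type variant discussed above. */
static struct lock_delay_config mtx_delay_sketch;
static struct lock_delay_config rw_delay_sketch;
static struct lock_delay_config sx_delay_sketch;

/*
 * Shared helper so the same code is not duplicated per lock type.
 * The multipliers below are placeholders, not the patch's defaults.
 */
static void
lock_delay_default_init_sketch(struct lock_delay_config *lc,
    unsigned int ncpus)
{
	lc->initial = ncpus * 4;
	lc->step = ncpus * 4;
	lc->min = ncpus;
	lc->max = ncpus * 40;
}

/* Stand-in for what three SYSINITs (or a single one) would do at boot. */
static void
lock_delay_init_all_sketch(unsigned int ncpus)
{
	lock_delay_default_init_sketch(&mtx_delay_sketch, ncpus);
	lock_delay_default_init_sketch(&rw_delay_sketch, ncpus);
	lock_delay_default_init_sketch(&sx_delay_sketch, ncpus);
}
```

After lock_delay_init_all_sketch() all three configs hold the same
values, yet a per-type knob can still change, say, rw_delay_sketch.max
without touching the other two.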

I have further optimisations to the primitives which are not related to
spinning. They boil down to the fact that KDTRACE_HOOKS-enabled kernels
contain an unconditional function call to lockstat_nsecs even with the
lock held.
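
To make the cost concrete, here is a userspace sketch of the general
shape of the problem and of one possible way around it: only take the
timestamp when a probe is actually enabled. This is not necessarily what
the pending changes do. Every name below (probes_enabled_sketch,
nsecs_stub and so on) is a hypothetical stand-in and not a kernel symbol.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these are the kernel's definitions. */
static bool probes_enabled_sketch;	/* would be set when a probe is armed */
static unsigned int nsecs_calls;	/* counts timestamp calls for the demo */

/* Stand-in for a timestamp function such as lockstat_nsecs. */
static uint64_t
nsecs_stub(void)
{
	nsecs_calls++;
	return ((uint64_t)nsecs_calls * 1000);	/* fake monotonic time */
}

/* Shape described above: the timestamp is taken on every call. */
static uint64_t
acquire_unconditional_sketch(void)
{
	uint64_t start = nsecs_stub();

	/* ... lock acquisition would go here ... */
	return (start);
}

/* Possible alternative: skip the call unless a probe is enabled. */
static uint64_t
acquire_guarded_sketch(void)
{
	uint64_t start = 0;

	if (probes_enabled_sketch)
		start = nsecs_stub();
	/* ... lock acquisition would go here ... */
	return (start);
}
```

With probes_enabled_sketch left false, the guarded variant never reaches
nsecs_stub(), so the common path pays for a branch instead of a function
call.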

> I think the idea is fine.  I'm less worried about the overhead of the
> divide as you are only doing it when you are contesting (so you are already
> sort of hosed anyway).  Long delays in checking the lock cookie can be
> bad (see my local APIC snafu which only polled once per microsecond).  I
> don't really think a divide is going to be that long?
> 

This should be perfectly fine. One could argue the time wasted should be
wasted efficiently, i.e. the more cpu_spinwait, the better, at least on
amd64.
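
For readers without the diff at hand, a userspace sketch of the kind of
randomized delay loop under discussion follows. It is loosely modelled
on the description in this thread and is not a copy of the patch. The
struct fields, the lock_delay_sketch name, the ticks parameter (standing
in for a cheap time source) and spinwait_stub (standing in for
cpu_spinwait) are all assumptions.

```c
#include <assert.h>

/* Hypothetical tunables; names and layout are assumptions. */
struct lock_delay_config {
	unsigned int initial;	/* first delay window */
	unsigned int step;	/* window growth per failed attempt */
	unsigned int min;	/* lower bound on spins per call */
	unsigned int max;	/* upper bound on the window */
};

struct lock_delay_arg {
	const struct lock_delay_config *config;
	unsigned int delay;	/* current window, 0 before the first call */
	unsigned int spin_cnt;	/* total spins performed so far */
};

/* Stand-in for cpu_spinwait(); a real one would issue e.g. PAUSE on amd64. */
static void
spinwait_stub(void)
{
	static volatile unsigned int sink;

	sink++;
}

/*
 * Grow the window, pick a pseudo-random spin count inside it and spin.
 * Returns the number of spins performed in this call.
 */
static unsigned int
lock_delay_sketch(struct lock_delay_arg *la, unsigned int ticks)
{
	const struct lock_delay_config *lc = la->config;
	unsigned int i, delay, backoff;

	assert(lc->initial > 0 && lc->max >= lc->initial);

	delay = la->delay;
	if (delay == 0)
		delay = lc->initial;
	else {
		delay += lc->step;
		if (delay > lc->max)
			delay = lc->max;
	}

	/* The divide discussed above: only paid on the contended path. */
	backoff = ticks % delay;
	if (backoff < lc->min)
		backoff = lc->min;
	for (i = 0; i < backoff; i++)
		spinwait_stub();

	la->delay = delay;
	la->spin_cnt += backoff;
	return (backoff);
}
```

A caller would zero delay and spin_cnt, then call lock_delay_sketch()
after each failed acquisition attempt. The window grows by step up to
max, and the modulo spreads contending CPUs across it.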

-- 
Mateusz Guzik <mjguzik gmail.com>