svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys

Konstantin Belousov kostikbel at gmail.com
Wed Jan 21 08:16:13 UTC 2015


On Tue, Jan 20, 2015 at 04:37:44PM -0800, K. Macy wrote:
> I would pick stability over performance any day. However, it _seems_
> to me, and maybe I simply don't understand some key details, that the
> fix consisted of largely single-threading the callout system. And as I
> say I may simply not understand the specifics, but this sort of large
> scale disabling does not constitute a fix but is a workaround.  It's
> more like disabling preemption because it fixes a panic. Yes, it might
> "fix" a whole array of bugs that crop up but it could not be seen as a
> fix to an otherwise working system.

Well, this is not a complete truth. Let me try to explain my
understanding, I spent some time actually looking at the new code. Would
be nice if corrections to my understanding is posted, if needed.

Now, when callout_reset() is directed to change cpu, the change only
happens when the callout is associated with a lock, and that lock is
owned by the caller of callout_reset(). There are two consequences. One,
which is good, is significant simplification of the migration code,
together with the drain. This is exactly the place where there bugs
which make my head hurt, see e.g. r234952 and r243901.  Note that some
callouts follow this pattern already, e.g. process timers after r243869.

Another consequence, which is very visible and which causes the roar, is
that all lockless callers of callout_reset_on(9) now silently changed
the behaviour to callout_reset(9). There is no audit performed to
identify such callers, and there is no even a plan to fix them.  The
answer to the complains seems to be 'if you think that the fix is needed,
go and fix'.

My impression is that some set of vocal active developers consider
the second consequence unacceptable, myself included. IMO, committing
the change, however good the consequence one is, without fixing the
consequence two, is inappropriate.  And the onus of doing this is on
the person doing the change.

Yes, it is very interesting is the change actually good WRT fixing
migration, since indeed there is serious reduction in the migration
amount due to locked callout_reset() being not universally used.
It is possible that the bugs are only covered.


More information about the svn-src-head mailing list