svn commit: r286880 - head/sys/kern
Julien Charbon
jch at freebsd.org
Wed Aug 26 18:14:26 UTC 2015
Hi Hans,
On 26/08/15 10:12, Hans Petter Selasky wrote:
> On 08/26/15 09:25, Hans Petter Selasky wrote:
>> On 08/18/15 12:15, Julien Charbon wrote:
>>> Author: jch
>>> Date: Tue Aug 18 10:15:09 2015
>>> New Revision: 286880
>>> URL: https://svnweb.freebsd.org/changeset/base/286880
>>>
>>> Log:
>>> callout_stop() should return 0 (fail) when the callout is currently
>>> being serviced and indeed unstoppable.
>>>
>>> A scenario to reproduce this case is:
>>>
>>> - the callout is being serviced and at same time,
>>> - callout_reset() is called on this callout that sets
>>> the CALLOUT_PENDING flag and at same time,
>>> - callout_stop() is called on this callout and returns 1 (success)
>>> even if the callout is indeed currently running and unstoppable.
>>>
>>> This issue was caught up while making r284245 (D2763) workaround, and
>>> was discussed at BSDCan 2015. Once applied the r284245 workaround
>>> is not needed anymore and will be reverted.
>>>
>>> Differential Revision: https://reviews.freebsd.org/D3078
>>> Reviewed by: jhb
>>> Sponsored by: Verisign, Inc.
>>>
>>> Modified:
>>> head/sys/kern/kern_timeout.c
>>>
>>> Modified: head/sys/kern/kern_timeout.c
>>> ==============================================================================
>>>
>>>
>>> --- head/sys/kern/kern_timeout.c Tue Aug 18 10:07:03 2015
>>> (r286879)
>>> +++ head/sys/kern/kern_timeout.c Tue Aug 18 10:15:09 2015
>>> (r286880)
>>> @@ -1150,7 +1150,7 @@ _callout_stop_safe(struct callout *c, in
>>> struct callout_cpu *cc, *old_cc;
>>> struct lock_class *class;
>>> int direct, sq_locked, use_lock;
>>> - int not_on_a_list;
>>> + int not_on_a_list, not_running;
>>>
>>> if (safe)
>>> WITNESS_WARN(WARN_GIANTOK | WARN_SLEEPOK, c->c_lock,
>>> @@ -1378,8 +1378,15 @@ again:
>>> }
>>> }
>>> callout_cc_del(c, cc);
>>> +
>>> + /*
>>> + * If we are asked to stop a callout which is currently in progress
>>> + * and indeed impossible to stop then return 0.
>>> + */
>>> + not_running = !(cc_exec_curr(cc, direct) == c);
>>> +
>>> CC_UNLOCK(cc);
>>> - return (1);
>>> + return (not_running);
>>> }
>>>
>>> void
>
> I'm sorry to say, but I think this change should be backed out as it
> changes the behaviour of the callout API. The old code was correct. The
> issues should be fixed in the TCP stack instead, like suggested by D1563
> and also r284245.
>
> This patch introduces new behaviour to the callout_stop() function,
> which is contradicting "man 9 callout" and _not_ documented anywhere!
>
> Reading "man 9 callout" in 11-current:
>
>> The callout_reset() and callout_schedule() function families
>> schedule a
>> future function invocation for callout c. If c already has a
>> pending
>> callout, it is cancelled before the new invocation is scheduled.
>
> callout_reset() causes a _new_ callout invocation and it is logical that
> the return value of callout_stop() reflect operations on the _new_
> callout invocation and not the previous one.
>
>> The function callout_stop() cancels a callout c if it is
>> currently pend-
>> ing. If the callout is pending, then callout_stop() returns a
>> non-zero
>> value. If the callout is not set, has already been serviced, or
>> is cur-
>> rently being serviced, then zero will be returned. If the
>> callout has an
>> associated lock, then that lock must be held when this function is
>> called.
>
> If there are two callout invocations that callout_stop() needs to
> handle, it should be called twice.
>
> The correct way is:
>
> callout_reset();
> callout_stop();
> callout_drain();
>
> For the TCP stack's matter, it should be:
>
> callout_reset();
> callout_stop(); /* cancel _new_ callout invocation */
> callout_stop(); /* check for any pending callout invokation */
First thank for your time. I indeed agree with your analysis and I am
not opposed to back out this change. The border between bug or feature
can indeed be thin; below why I was more on "bug" side:
o Pro "It is a feature":
- This behavior is here since the beginning and nobody ever complains
(Big one)
o Pro "It is a bug":
- Having callout_stop() returning 1 (success) while having the callout
currently being serviced is counter intuitive.
- You cannot call callout_stop() twice to address this case:
callout_reset();
callout_stop(); /* cancel _new_ callout invocation */
callout_stop(); /* check for any pending callout invokation */
Because the second callout_stop() will always return 0 (Fail). In
details: If the callout is currently being serviced (without r286880
i.e. the classical behavior):
callout_reset() returns 0 (Fail) (PENDING flag set)
callout_stop() returns 1 (Success) (PENDING flag removed)
callout_stop() returns 0 (Always fail because not PENDING flag)
In mpsafe mode, the only way a callout_stop() can return 1 (Success) is
with the PENDING flags set. The way I found to address this case is with:
fail_reset = !callout_reset();
fail_stop = !callout_stop();
if (fail_reset || fail_stop) {
/* Callout not stopped */
}
This is what I did in r284245:
https://svnweb.freebsd.org/base/head/sys/netinet/tcp_timer.c?r1=284245&r2=284244&pathrev=284245
And it felt wrong to me because callout_reset() and callout_stop()
calls can be far away.
o Pro "it is 'neither'" (i.e. API unclear by design):
In this case the same callout is indeed _both_ pending and currently
being serviced. According to man page callout(9): (As you already noticed)
If the callout is pending, then callout_stop() returns a non-zero
^^^^^^^^^^^^^^^^^^
value.
Conclusion: It is a feature!
And just after:
If the callout is not set, has already been serviced, or is
currently being serviced, then zero will be returned.
^^^^^^^^^^^^^^^^^^^^^^^^
Conclusion: It is a bug!
Obviously, no indication about the case where the callout is indeed
_both_ pending and currently being serviced.
> Remember there are other OS'es out there using our TCP/IP stack which
> do not have an identical callout subsystem.
I was not aware of that.
As I said, I am not opposed to back out this change, callout(9) API in
mpsafe mode is a already complex/subtle API, it won't change too much
the current complexity.
Let say that if nobody screams until Friday 8/28, I will put back
r284245 and revert this change _and_ I will make this case clear in the
man page.
--
Julien
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 496 bytes
Desc: OpenPGP digital signature
URL: <http://lists.freebsd.org/pipermail/svn-src-all/attachments/20150826/2107252d/attachment.bin>
More information about the svn-src-all
mailing list