Issue with epoch_drain_callbacks and unloading iavf(4) [using iflib]

Eric Joyner erj at freebsd.org
Thu Apr 9 21:30:25 UTC 2020


On Thu, Apr 9, 2020 at 2:02 PM Eric Joyner <erj at freebsd.org> wrote:

> On Tue, Apr 7, 2020 at 4:24 PM Mark Johnston <markj at freebsd.org> wrote:
>
>> On Mon, Apr 06, 2020 at 02:34:50PM -0700, Eric Joyner wrote:
>> > On Mon, Apr 6, 2020 at 2:29 PM Mark Johnston <markj at freebsd.org> wrote:
>> >
>> > > On Mon, Apr 06, 2020 at 02:19:25PM -0700, Eric Joyner wrote:
>> > > > Mark,
>> > > >
>> > > > I think I was mistaken about the backtrace looking the same. I was
>> > > > looking at it from within ddb, and I think I focused on the
>> > > > epoch_block_handler_preempt line and didn't notice that it only
>> > > > stopped there this time. Here's the new one I've got from kgdb:
>> > >
>> > > Thanks.  Could you try to print "td->td_name" from frame 4?  It should
>> > > also be available as er->er_blockedtd.  Basically, I'm trying to
>> > > verify that the interrupt thread itself isn't the one that we're
>> > > waiting for, else there is another bug to be fixed.
>> > >
>> > > If you can provide kernel symbols and vmcore, I'd be happy to look at
>> > > it myself.
>> > > _______________________________________________
>> > > freebsd-net at freebsd.org mailing list
>> > > https://lists.freebsd.org/mailman/listinfo/freebsd-net
>> > > To unsubscribe, send any mail to "freebsd-net-unsubscribe at freebsd.org
>> "
>> > >
>> >
>> > Here's what I get:
>> >
>> > (kgdb) frame 4
>> > #4  epoch_block_handler_preempt (global=0xfffff80003de0100,
>> > cr=0xfffffe00dee85900, arg=0x0) at /usr/src/sys/kern/subr_epoch.c:507
>> > 507     }
>> > (kgdb) print td->td_name
>> > $1 = "if_io_tqg_31\000\000\000\000\000\000\000"
>> > (kgdb) print er->er_blockedtd
>> > $2 = (struct thread *) 0x0
>>
>> I spent some time looking at the core.  It looks like we have yet
>> another problem: the gtaskqueue code won't exit the net epoch if it is
>> constantly running a net task.  Could you please retry with the patches
>> from before, and this one included?
>>
>> diff --git a/sys/kern/subr_gtaskqueue.c b/sys/kern/subr_gtaskqueue.c
>> index f52f32204644..2b1386a612ee 100644
>> --- a/sys/kern/subr_gtaskqueue.c
>> +++ b/sys/kern/subr_gtaskqueue.c
>> @@ -345,7 +345,7 @@ gtaskqueue_run_locked(struct gtaskqueue *queue)
>>         struct epoch_tracker et;
>>         struct gtaskqueue_busy tb;
>>         struct gtask *gtask;
>> -       bool in_net_epoch;
>> +       bool in_net_epoch;
>>
>>         KASSERT(queue != NULL, ("tq is NULL"));
>>         TQ_ASSERT_LOCKED(queue);
>> @@ -361,20 +361,19 @@ gtaskqueue_run_locked(struct gtaskqueue *queue)
>>                 TQ_UNLOCK(queue);
>>
>>                 KASSERT(gtask->ta_func != NULL, ("task->ta_func is NULL"));
>> -               if (!in_net_epoch && TASK_IS_NET(gtask)) {
>> -                       in_net_epoch = true;
>> +               if (TASK_IS_NET(gtask)) {
>>                         NET_EPOCH_ENTER(et);
>> -               } else if (in_net_epoch && !TASK_IS_NET(gtask)) {
>> +                       in_net_epoch = true;
>> +               }
>> +               gtask->ta_func(gtask->ta_context);
>> +               if (in_net_epoch) {
>>                         NET_EPOCH_EXIT(et);
>>                         in_net_epoch = false;
>>                 }
>> -               gtask->ta_func(gtask->ta_context);
>>
>>                 TQ_LOCK(queue);
>>                 wakeup(gtask);
>>         }
>> -       if (in_net_epoch)
>> -               NET_EPOCH_EXIT(et);
>>         LIST_REMOVE(&tb, tb_link);
>>  }
>>
>
> Yeah, I'll give it a spin and try to get back to you before the end of the
> week.
>
> - Eric
>

I was able to try it out just now, and it looks like this (together with all
of the other patches) finally makes the problem go away! I can unload the
driver while iavf1 is receiving heavy traffic!

- Eric

