Re: epair and vnet jail loose connection.

From: Kristof Provost <kp_at_FreeBSD.org>
Date: Thu, 10 Mar 2022 11:44:00 UTC

On 10 Mar 2022, at 10:13, Johan Hendriks wrote:
> On 10/03/2022 08:54, Patrick M. Hausen wrote:
>> Hi Johan,
>>
>> we are seeing the same on 13.1-PRERELEASE. We are currently trying to collect some evidence
>> (dtrace) to send to Kristof Provost, who was kind enough to assist. The problem hits us
>> in production at 12-24 hour intervals. We have not done any artificial load tests yet.
>>
>> May I ask you to run this dtrace script while at least one jail is disconnected and while
>> traffic is present that is trying to reach the jail? If you can afford to do that in production (?)
>> that would be great. Forward to Kristof (kp@), please.
>>
>> Thanks and kind regards
>> Patrick
>> ----------
>> #!/usr/sbin/dtrace -s
>>
>> BEGIN
>> {
>>     self->in_menq = 0;
>> }
>>
>> fbt:if_epair:epair_menq:entry
>> {
>>     self->in_menq = 1;
>>     printf("In epair_menq");
>> }
>>
>> fbt:if_epair:epair_menq:return
>> / self->in_menq == 1 /
>> {
>>     self->in_menq = 0;
>>     printf("Leave epair_menq");
>> }
>>
>> fbt:kernel:taskqueue_enqueue:entry
>> / self->in_menq == 1 /
>> {
>>     printf("Enqueue task");
>> }
>>
>> fbt:if_epair:epair_tx_start_deferred:entry
>> {
>>     printf("epair_tx_start_deferred");
>> }
>> ----------
>>
> I was asked to do the above, so here is the output of that command.
> I ran hey -h2 -n 10 -c 10 -z 60s https://wp.test.nl against that machine, and within the 60 seconds the jail became unresponsive. I then ran the dtrace.sh script above like so: /root/bin/dtrace.sh > /root/dtrace_output
>
> I hope this helps. If you need anything else, please let me know. Root access is also possible if you want, so that you do not have to create a test environment.

Were there other epair interfaces running at this time, with active traffic?

The dtrace output appears to show that the appropriate callouts (to epair_tx_start_deferred()) are getting through, so I’d expect traffic to be flowing.
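
For anyone who wants to summarise such a capture rather than read it line by line, a minimal sketch follows. It is not part of the original exchange: the function name count_probe_messages is made up for illustration, and it assumes dtrace wrote one line per probe firing, each containing the printf() text from the script quoted above.

```shell
#!/bin/sh
# Sketch only: tally how often each message printed by the quoted dtrace
# script appears in a captured output file.
# Assumption: one probe firing per line of output.
count_probe_messages() {
    out="$1"
    for msg in "In epair_menq" "Leave epair_menq" "Enqueue task" "epair_tx_start_deferred"; do
        # grep -c counts matching lines; -F treats the message as a fixed string.
        printf '%s: %s\n' "$msg" "$(grep -cF -- "$msg" "$out")"
    done
}

# Only run when a capture file is given, e.g. /root/dtrace_output.
if [ "$#" -gt 0 ]; then
    count_probe_messages "$1"
fi
```

Saved under a hypothetical name such as count_probes.sh, it could be run as sh count_probes.sh /root/dtrace_output. Comparing the epair_menq and epair_tx_start_deferred counts gives a rough idea of whether the deferred transmit task is still being run while the jail is unreachable.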

Kristof