svn commit: r359436 - in head/sys: kern net sys

Kristof Provost kp at FreeBSD.org
Tue Mar 31 15:57:52 UTC 2020


On 31 Mar 2020, at 17:28, Kristof Provost wrote:
> On 31 Mar 2020, at 17:17, Mark Johnston wrote:
>> On Tue, Mar 31, 2020 at 03:51:27PM +0800, Li-Wen Hsu wrote:
>>> On Tue, Mar 31, 2020 at 3:00 PM Kristof Provost <kp at freebsd.org> 
>>> wrote:
>>>>
>>>> On 31 Mar 2020, at 7:56, Li-Wen Hsu wrote:
>>>>> On Tue, Mar 31, 2020 at 10:55 AM Mark Johnston <markj at freebsd.org> 
>>>>> wrote:
>>>>>>>> It seems this could be triggered by the sys.netinet6.frag6.*,
>>>>>>>> sys.netpfil.common.* and sbin.pfctl.pfctl_test.* tests, and
>>>>>>>> lots of test cases timed out.
>>>>>>>>
>>>>>>>> Can you help check these?
>>>>>>>
>>>>>>> I see, it is actually caused by r359438.  I'm looking at it now.
>>>>>>
>>>>>> I verified that the netpfil and netinet6 tests pass with r359477.
>>>>>
>>>>> Thanks for the fix. The latest test run panics at epair_qflush:
>>>>>
>>>>> https://ci.freebsd.org/job/FreeBSD-head-amd64-test/14747/consoleFull
>>>>>
>>>>> while executing sys.netpfil.pf.* tests. I'm not sure if this is
>>>>> related or because of previous commits (I suspect the latter). I'll
>>>>> look into this.
>>>>>
>>>> That’s a known issue with epair (since EPOCH, I believe).
>>>> A number of the pf tests are disabled due to this. See 238870.
>>>
>>> I think so too. By the way, every test run currently panics, so I am
>>> afraid that the recent commits might have made things worse (or, say,
>>> made the issue easier to reproduce?)
>>
>> I haven't been able to reproduce any panics or test failures so far.
>
> Once you disable the ‘atf_skip’ lines in the pf tests a simple 
> `sudo kldload pfsync && cd /usr/tests/sys/netpfil/pf && sudo kyua 
> test` is likely sufficient.
>
The names:names test is a great candidate for this. Remove the `atf_skip 
…` line in /usr/tests/sys/netpfil/pf/names and run that a few times.
It’s not 100% reliable, but the test is very fast and will likely 
panic at least every other run.
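
A minimal sketch of "run that a few times", assuming the `atf_skip …` line has already been removed by hand. The `repeat_cmd` helper and the run count of 10 are hypothetical, chosen only for illustration; they do not come from the test suite. If the kernel panics, the machine drops into the debugger and the loop never gets to report anything, so the loop only stops by itself on an ordinary failure.

```shell
#!/bin/sh
# Hypothetical helper (not part of the FreeBSD test suite): run a command
# up to N times, stopping at the first non-zero exit status.
repeat_cmd() {
	count=$1
	shift
	i=1
	while [ "$i" -le "$count" ]; do
		echo "run $i of $count"
		"$@" || return 1
		i=$((i + 1))
	done
}

# Intended use on a disposable test machine (left commented out on purpose):
#   sudo kldload pfsync
#   cd /usr/tests/sys/netpfil/pf
#   repeat_cmd 10 sudo kyua test names:names
```

Running this on a VM with a serial console makes it easier to capture the panic message.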

Example backtrace:

	panic: epair_qflush: ifp=0xfffff800079c9000, epair_softc gone? sc=0

	cpuid = 1
	time = 1585666518
	KDB: stack backtrace:
	db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe001bd7e790
	vpanic() at vpanic+0x182/frame 0xfffffe001bd7e7e0
	panic() at panic+0x43/frame 0xfffffe001bd7e840
	epair_qflush() at epair_qflush+0x1a8/frame 0xfffffe001bd7e890
	if_down() at if_down+0x12d/frame 0xfffffe001bd7e8c0
	if_detach_internal() at if_detach_internal+0x2ee/frame 0xfffffe001bd7e920
	if_vmove() at if_vmove+0x3c/frame 0xfffffe001bd7e970
	vnet_if_return() at vnet_if_return+0x50/frame 0xfffffe001bd7e990
	vnet_destroy() at vnet_destroy+0x130/frame 0xfffffe001bd7e9c0
	prison_deref() at prison_deref+0x29d/frame 0xfffffe001bd7ea00
	taskqueue_run_locked() at taskqueue_run_locked+0xaa/frame 0xfffffe001bd7ea80
	taskqueue_thread_loop() at taskqueue_thread_loop+0x94/frame 0xfffffe001bd7eab0
	fork_exit() at fork_exit+0x80/frame 0xfffffe001bd7eaf0
	fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe001bd7eaf0
	--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
	KDB: enter: panic
	[ thread pid 0 tid 100014 ]
	Stopped at      kdb_enter+0x37: movq    $0,0x10927a6(%rip)
	db>

You might see different panics too. The epair teardown flow is complex, 
and broken.

Best regards,
Kristof

