[CFT] SIFTR - Statistical Information For TCP Research: Uncle Lawrence needs YOU!
lstewart at freebsd.org
Sun Jun 20 10:27:37 UTC 2010
On 06/20/10 03:58, Fabian Keil wrote:
> Lawrence Stewart<lstewart at freebsd.org> wrote:
>> On 06/13/10 18:12, Lawrence Stewart wrote:
>>> The time has come to solicit some external testing for my SIFTR tool.
>>> I'm hoping to commit it within a week or so unless problems are discovered.
>>> I'm interested in all feedback and reports of success/failure, along
>>> with details of the architecture tested and number of CPUs if you would
>>> be so kind.
> I got the following hand-transcribed panic maybe a second after
> sysctl net.inet.siftr.enabled=1
> Fatal trap 12: page fault while in kernel mode
> cpuid = 1; apic id = 01
> current process = 12 (swi4: clock)
> [ thread pid 12 tid 100006 ]
> Stopped at siftr_chkpkt+0xd0: addq $0x1,0x8(%r14)
> db> where
> Tracing pid 12 tid 100006 td 0xffffff00034037e0
> siftr_chkpt() at siftr_chkpkt+0xd0
> pfil_run_hooks() at pfil_run_hooks+0xb4
> ip_output() at ip_output+0x382
> tcp_output() tcp_output+0xa41
> tcp_timer_rexmt() at tcp_timer_rexmt+0x251
> softclock() at softclock+0x291
> intr_event_execute_handlers() at intr_event_execute_handlers+0x66
> ithread_loop at ithread_loop+0x8e
> fork_exit() at fork_exit+0x112
> fork_trampoline() at fork_trampoline+0xe
> --- trap 0, rip = 0, rsp = 0xffffff800003ad30, rbp = 0 ---
So I've tracked down the line of code where the page fault is occurring:
    if (dir == PFIL_IN)
ss is a DPCPU (dynamic per-cpu) variable used to keep a set of stats
per-cpu and is initialised at the start of the function like so:
    ss = DPCPU_PTR(ss);
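To make the per-CPU stats idea concrete, here is a rough userland sketch of the pattern. It is not the SIFTR code and does not use the kernel DPCPU_* macros: percpu_ptr(), count_pkt(), total_out(), NCPU, the DIR_* values and the struct layout are all made-up stand-ins, with a plain array indexed by CPU number playing the role of the dynamic per-CPU region.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NCPU 4				/* hypothetical CPU count */

enum pkt_dir { DIR_IN, DIR_OUT };	/* stand-ins for PFIL_IN / PFIL_OUT */

struct pkt_stats {			/* hypothetical layout, not SIFTR's */
	uint64_t n_in;			/* packets seen inbound */
	uint64_t n_out;			/* packets seen outbound */
};

/* One private copy of the stats per CPU, so updates need no locking. */
static struct pkt_stats percpu_stats[NCPU];

/* Userland stand-in for DPCPU_PTR(): return the given CPU's private copy. */
static struct pkt_stats *
percpu_ptr(unsigned cpu)
{
	assert(cpu < NCPU);
	return (&percpu_stats[cpu]);
}

/* Bump the counter for one packet on the CPU that handled it. */
static void
count_pkt(unsigned cpu, enum pkt_dir dir)
{
	struct pkt_stats *ss = percpu_ptr(cpu);

	if (dir == DIR_IN)
		ss->n_in++;
	else
		ss->n_out++;
}

/* Readers sum the per-CPU copies to get a system-wide total. */
static uint64_t
total_out(void)
{
	uint64_t sum = 0;

	for (unsigned cpu = 0; cpu < NCPU; cpu++)
		sum += percpu_stats[cpu].n_out;
	return (sum);
}
```

The point of the pattern is that each CPU only ever writes its own copy, so the hot path stays lock-free and totals are computed only when someone asks for them. If the pointer lookup itself goes wrong, every increment through it is a bad memory access.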
For ss to be NULL, DPCPU_PTR() would have to be returning NULL on your
machine. I know very little about the inner workings of the DPCPU_*
macros, but I'm pretty sure the way I use them in SIFTR is correct or at
least as intended.
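As a hedged illustration of how a bad struct pointer shows up in a trace like the one above: the faulting instruction, addq $0x1,0x8(%r14), increments a 64-bit value 8 bytes past whatever %r14 holds. With a struct of two 64-bit counters, the second one sits at offset 8, so a NULL base pointer would make the CPU touch address 0x8. The struct and field names below are assumptions for illustration, not copied from SIFTR, and the sketch assumes a typical ABI with 8-byte uint64_t and no padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_stats {		/* hypothetical layout: two 64-bit counters */
	uint64_t n_in;		/* offset 0 */
	uint64_t n_out;		/* offset 8 */
};

/* The second counter lives 8 bytes into the struct. */
static_assert(offsetof(struct demo_stats, n_out) == 8,
    "second 64-bit field is expected at offset 8");

/*
 * Address that base->n_out++ would access, computed arithmetically
 * so that nothing is actually dereferenced.
 */
static uintptr_t
n_out_addr(uintptr_t base)
{
	return (base + offsetof(struct demo_stats, n_out));
}
```

So n_out_addr(0) comes out as 8, which is the kind of near-zero fault address you would expect to see reported if the base pointer really were NULL.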
Could you please retest using a GENERIC kernel and see if you can
reproduce the panic? There could be something in your custom kernel
causing the offsets or linker set magic used by the DPCPU bits to break,
which in turn is triggering this panic in SIFTR.
Whether it's your custom changes breaking DPCPU or DPCPU being fragile
remains to be seen, but the good news for me is that it looks like SIFTR
is off the hook :)