debugging frequent kernel panics on 8.2-RELEASE
killing at multiplay.co.uk
Sat Aug 20 13:24:51 UTC 2011
----- Original Message -----
From: "Andriy Gapon" <avg at FreeBSD.org>
> BTW, I suspect the following scenario, but I am not able to
> verify it either via testing or in the code:
> - last process in a dying jail exits
> - pr_uref of the jail reaches zero
> - pr_uref of prison0 gets decremented
> - you attach to the jail and resurrect it
> - but pr_uref of prison0 stays decremented
> Repeat this enough times and prison0.pr_uref reaches zero.
> To reach zero even sooner just kill enough of non-jailed processes.
Ahh now that explains all of our experienced panic scenarios:-
1. jail stop / start causing the panic, but only after at least a
few days' worth of uptime.
Here what we're seeing is enough "leak" of pr_uref from the restarted
jails to decrement prison0.pr_uref to 0 even with all the standard
unjailed processes still running.
2. A machine reboot, after all jails have been stopped but after
less uptime than #1.
In this case we haven't seen enough leakage to decrement
prison0.pr_uref to 0 given the number of prison0 processes, but
it has been incorrectly decremented, so as soon as the reboot kicks
in and prison0 processes start exiting, prison0.pr_uref gets
further decremented and again hits 0 when it shouldn't.
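To make the suspected accounting concrete, here is a toy model of it in
Python. This is only a sketch of the hypothesis quoted above, not the
kernel code: the Prison class and the create_jail, attach, proc_exit and
simulate helpers are all made-up names, and it assumes that a live jail
holds exactly one reference on its parent and that each unjailed process
holds one reference on prison0.

```python
class Prison:
    """Toy stand-in for a prison; uref models pr_uref (assumed semantics)."""
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.uref = 0


def create_jail(parent, name):
    """Create a jail with one initial process; it takes one ref on its parent."""
    jail = Prison(name, parent)
    jail.uref = 1
    parent.uref += 1
    return jail


def proc_exit(pr):
    """A process in pr exits; when pr's uref hits zero it drops its parent ref."""
    pr.uref -= 1
    if pr.uref == 0 and pr.parent is not None:
        pr.parent.uref -= 1


def attach(jail, buggy):
    """Attach a process to a jail, resurrecting it if it was dying.

    With buggy=True the parent's reference is NOT re-taken on resurrection,
    which is the leak suspected in the quoted scenario.
    """
    if jail.uref == 0 and not buggy:
        jail.parent.uref += 1
    jail.uref += 1


def simulate(unjailed_procs, restarts, buggy):
    """Return prison0's uref after `restarts` die/resurrect cycles of one jail."""
    prison0 = Prison("prison0")
    prison0.uref = unjailed_procs  # one ref per unjailed process (assumption)
    jail = create_jail(prison0, "j1")
    for _ in range(restarts):
        proc_exit(jail)      # last process in the jail exits
        attach(jail, buggy)  # jail is attached to and resurrected
    return prison0.uref
```

In this model simulate(50, 10, True) gives 41 where a correct count would
be 51, so only 41 of the 50 unjailed processes need to exit at reboot to
hit zero (scenario 2), and simulate(50, 51, True) gives 0 with every
unjailed process still running (scenario 1).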
Now if this is the case, we should be able to confirm it by answering
two questions:
1. What exactly does pr_uref represent?
2. Can its expected value be calculated by examining other
details of the system i.e. number of running processes, number of
If we can calculate the value that prison0.pr_uref should be, then
by examining the machines we have which have been up for a while,
we should be able to confirm if an incorrect value is present on
them and hence prove this is the case.
Ideally a little script to run in kgdb to test this would be the
best way to go.