debugging frequent kernel panics on 8.2-RELEASE
avg at FreeBSD.org
Sat Aug 20 10:02:32 UTC 2011
on 18/08/2011 02:15 Steven Hartland said the following:
> In a nutshell the jail manager we're using will attempt to resurrect the jail
> from a dying state in a few specific scenarios.
> Here's an example:
> 1. jail restart requested
> 2. jail is stopped, so the java process is killed off, but active tcp sessions
> may prevent the timely full shutdown of the jail.
> 3. if an existing jail is detected, i.e. a dying jail from #2, instead of
> starting a new jail we attach to the old one and exec the new java process.
> 4. if an existing jail isn't detected, i.e. where there were no hanging tcp
> sessions and #2 cleanly shut down the jail, a new jail is created, attached to,
> and the java process exec'ed.
> The system uses static jailids, so it's possible to determine if an existing
> jail for this "service" exists or not. This prevents duplicate services as
> well as making services easy to identify by their jailid.
> So what we could be seeing is a race between the jail shutdown and the attach
> of the new process?
Not a jail expert at all, but a few suggestions...
First, wouldn't the 'persist' jail option simplify your life a little bit?
Second, you may want to monitor the value of the prison0.pr_uref variable (e.g.
via kgdb) while running through the various scenarios you use now. If, after
finishing a certain scenario, you end up with a value lower than at the start of
that scenario, then that scenario is the troublesome one.
Please note that prison0.pr_uref is the sum of the number of non-jailed
processes and the number of top-level jails. Take this into account when
comparing prison0.pr_uref values: it's better to record the initial value when
no jails are started, and it's important to keep the number of non-jailed
processes the same (or to account for its changes).
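
To make the bookkeeping concrete, here is a rough sketch of the comparison I
have in mind. It is only an illustration: the Sample fields, the function names
and the numbers in the usage note are all made up, and it assumes you have
already read prison0.pr_uref by hand (e.g. in kgdb) and counted the non-jailed
processes and top-level jails at the same moment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One hand-taken observation (all field names are hypothetical)."""
    pr_uref: int     # prison0.pr_uref as read in kgdb
    host_procs: int  # non-jailed processes counted at the same moment
    top_jails: int   # top-level jails existing at the same moment


def unexplained_change(before: Sample, after: Sample) -> int:
    """Change in pr_uref that is not explained by process/jail count changes."""
    expected = (before.pr_uref
                + (after.host_procs - before.host_procs)
                + (after.top_jails - before.top_jails))
    return after.pr_uref - expected


def suspect_scenarios(runs: dict[str, tuple[Sample, Sample]]) -> list[str]:
    """Names of scenarios that end with fewer references than expected."""
    return [name for name, (before, after) in runs.items()
            if unexplained_change(before, after) < 0]
```

For example, with invented numbers, a run that starts at
Sample(pr_uref=40, host_procs=39, top_jails=1) and ends at
Sample(pr_uref=39, host_procs=39, top_jails=1) gives an unexplained change of
-1, so that scenario would be the one to look at more closely. A run that ends
at Sample(pr_uref=41, host_procs=40, top_jails=1) gives 0, because the extra
reference is explained by the extra non-jailed process.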