Process in T state does not want to die.....
Willem Jan Withagen
wjw at digiware.nl
Wed Nov 27 15:11:47 UTC 2019
Hi,
Probably a "dumb" question, but I am still wondering what is going on...
I have this ceph server running several OSDs (ceph-osd). When they
do not get certain responses within a time limit, they commit suicide.
That is a rather convoluted process where they
- call abort()
- which is then trapped by the SIGABRT signal handler, which will
    try to dump the logging state
    try to dump a stacktrace
- then either call _exit()
  or call reraise_fatal
- reraise_fatal does some logging
  and calls exit(1)
And then the process ends up as:
root 3433 0.0 4.2 699944 353716 - TsJ 11Nov19 38:10.17 ceph-osd -i 2
Where the T state marks it as stopped, and no more CPU time is consumed.
But one way or another the process is not going away, and it keeps
resources locked, which prevents starting a new daemon.
It stays in that state either
1) for a few minutes, after which it is gone from the process table, or
2) forever (>24h)
But why doesn't the process die (right away)?
Killing it with kill -9 does not help.
Trying to attach gdb brings nothing.
If it disappears from the process table, sometimes there is a core.
So how do I debug this?
--WjW
More information about the freebsd-hackers mailing list