suspending threads before devices

Andriy Gapon avg at FreeBSD.org
Sat Nov 15 15:06:11 UTC 2014


On 15/11/2014 12:58, Konstantin Belousov wrote:
> On Fri, Nov 14, 2014 at 11:10:45PM +0200, Andriy Gapon wrote:
>> On 22/03/2012 16:14, Konstantin Belousov wrote:
>>> I already noted this to Jung-uk: I think that the current suspend handling
>>> is (somewhat) wrong. We should not stop other CPUs for suspension while
>>> they are executing some random kernel code. Rather, CPUs should be safely
>>> stopped at the kernel->user boundary, at a sleep point, or at a designated
>>> suspend point like the idle loop.
>>>
>>> We already engage in somewhat doubtful actions like restoring %cr2,
>>> since we might, for instance, preempt the page fault handler with the
>>> suspend IPI.
>>
>> I recently revisited this issue in the context of some suspend+resume problems
>> that I am having with the radeonkms driver.  What surprised me is that the
>> driver's suspend code has no synchronization whatsoever with its other code
>> paths.  So, I looked first at the Linux code and then at the illumos code to
>> see how suspend is implemented there.
>> As far as I can see, those kernels do exactly what you suggest that we do.
>> Before suspending devices they first suspend all threads except for the one
>> that initiates the suspend.  For userland threads a signal-like mechanism is
>> used to put them in a state similar to the SIGSTOP-ed one.  For kernel threads
>> the mechanisms differ between the kernels.  Also, illumos freezes kernel
>> threads after suspending the devices, not before.
>>
>> I think that we could start with only the userland threads initially.  Do you
>> think the SIGSTOP-like approach would be hard to implement for us?
> We have most, if not all, parts of the stopping code
> already implemented. I mean the single-threading code, see
> thread_single(SINGLE_BOUNDARY). The code ensures that other threads in
> the current process are stopped either at the kernel->user boundary or
> at a safe kernel sleep point.
> 
> This is not immediately applicable, since the caller is supposed to be
> a thread in the suspended process, but the modifications needed to let an
> external process do the same are really small compared with the complexity
> of the code.  I suspect that all that is needed is a change of
> 	while/if (remaining != 1)
> to
> 	while/if ((p == curproc && remaining != 1) ||
> 	    (p != curproc && remaining != 0))
> together with explicit passing of struct proc *p to thread_single.

Thank you for the pointer!
I think that even more changes may be required for that code to be usable for
suspending.  E.g. a different p_flag bit should perhaps be used, because I think
that we would like to avoid interaction between the process-level suspend and
the global suspend.  I.e. the global suspend might encounter a multi-threaded
process that is already in single-thread mode and would still need to suspend
its one remaining thread.

-- 
Andriy Gapon

