suspending threads before devices
Andriy Gapon
avg at FreeBSD.org
Sat Nov 15 15:06:11 UTC 2014
On 15/11/2014 12:58, Konstantin Belousov wrote:
> On Fri, Nov 14, 2014 at 11:10:45PM +0200, Andriy Gapon wrote:
>> On 22/03/2012 16:14, Konstantin Belousov wrote:
>>> I already noted this to Jung-uk, I think that current suspend handling
>>> is (somewhat) wrong. We shall not stop other CPUs for suspension when
>>> they are executing some random kernel code. Rather, CPUs should be safely
>>> stopped at the kernel->user boundary, or at sleep point, or at designated
>>> suspend point like idle loop.
>>>
>>> We already are engaged in somewhat doubtful actions like restoring %cr2,
>>> since we might, for instance, preempt the page fault handler with a suspend IPI.
>>
>> I recently revisited this issue in the context of some suspend+resume problems
>> that I am having with radeonkms driver. What surprised me is that the driver's
>> suspend code has no synchronization whatsoever with its other code paths. So, I
>> looked first at the Linux code and then at the illumos code to see how suspend
>> is implemented there.
>> As far as I can see, those kernels do exactly what you suggest that we do.
>> Before suspending devices they first suspend all threads except for one that
>> initiates the suspend. For userland threads a signal-like mechanism is used to
>> put them in a state similar to a SIGSTOP-ed one. For kernel threads, the
>> mechanisms differ between the kernels. Also, illumos freezes kernel
>> threads after suspending the devices, not before.
>>
>> I think that we could start with only the userland threads initially. Do you
>> think the SIGSTOP-like approach would be hard to implement for us?
> We have most, if not all, parts of the stopping code
> already implemented. I mean the single-threading code, see
> thread_single(SINGLE_BOUNDARY). The code ensures that other threads in
> the current process are stopped either at the kernel->user boundary, or
> at the safe kernel sleep point.
>
> This is not immediately applicable, since the caller is supposed to be
> a thread in the suspended process, but the modifications needed to allow an
> external process to do the same are really small compared with the complexity
> of the code. I suspect that all that is needed is a change of
> while/if (remaining != 1)
> to
> while/if ((p == curproc && remaining != 1) ||
> (p != curproc && remaining != 0))
> together with explicit passing of struct proc *p to thread_single.
Thank you for the pointer!
I think that maybe even more changes are required for that code to be usable for
suspending. E.g. maybe a different p_flag bit should be used, because I think
that we would like to avoid interaction between the process-level suspend and
the global suspend. I.e. the global suspend might encounter a multi-threaded
process that is already in single-thread mode and would still need to suspend
its remaining thread.
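
To make the two ideas concrete, here is a rough userland sketch, not kernel
code. It models the wait condition suggested above plus a separate flag bit
for the global suspend. Everything in it is hypothetical and for illustration
only: struct proc_sketch, the SK_P_* bits and the helper names are made up,
and the real struct proc, p_flag values and thread_single() look different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, much simplified stand-in for struct proc. */
struct proc_sketch {
	int	p_flag;		/* SK_P_* bits below */
	int	p_numthreads;	/* threads in the process */
	int	p_stopped;	/* threads already parked at a safe point */
};

/* Hypothetical flag bits, not the kernel's p_flag values. */
#define	SK_P_SINGLE		0x1	/* process-level single-threading */
#define	SK_P_GLOBAL_STOP	0x2	/* global suspend in progress */

/*
 * The proposed condition: a requester inside the process waits until it
 * is the only running thread; an external requester waits until no
 * thread of the process is running at all.
 */
static bool
must_keep_waiting(const struct proc_sketch *p, const struct proc_sketch *cur)
{
	int remaining;

	assert(p != NULL && cur != NULL);
	remaining = p->p_numthreads - p->p_stopped;
	return ((p == cur && remaining != 1) ||
	    (p != cur && remaining != 0));
}

/* The global suspend marks the process with its own bit. */
static void
global_suspend_mark(struct proc_sketch *p)
{
	p->p_flag |= SK_P_GLOBAL_STOP;
}

/* Ending process-level single-threading clears only its own bit. */
static void
process_single_end(struct proc_sketch *p)
{
	p->p_flag &= ~SK_P_SINGLE;
}
```

With distinct bits, a process that leaves its own single-thread mode while
the global suspend is in progress still carries the global mark, so the two
mechanisms do not undo each other.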
--
Andriy Gapon