slight problem with r254457 (kthread_add fix)
Konstantin Belousov
kostikbel at gmail.com
Thu Mar 27 12:07:31 UTC 2014
On Thu, Mar 27, 2014 at 03:02:49AM -0600, Chris Torek wrote:
> This fix:
>
> http://lists.freebsd.org/pipermail/svn-src-head/2013-August/050550.html
>
> has introduced a bit of a problem with kernel threads that enable
> the FPU, on amd64.
>
> Before the fix, a new thread added to proc0 (when kthread_add() is
> called with p == NULL) was essentially "forked from" thread0:
>
> if (p == NULL) {
> p = &proc0;
> oldtd = &thread0;
> }
>
> Now, each such new thread "forks from" the *newest* thread in proc0:
>
> if (p == NULL)
> p = &proc0;
> ...
> PROC_LOCK(p);
> oldtd = FIRST_THREAD_IN_PROC(p);
>
> The problem is that "first thread in proc" is really the *last*
> (i.e., most recent) thread in the proc, as thread_link() does a
> TAILQ_INSERT_HEAD() on p->p_threads, and FIRST_THREAD_IN_PROC
> takes the TAILQ_FIRST element.
>
> What this means is that if something enables the FPU (e.g., a
> device creates a taskq thread and enables the FPU in that thread,
> so that the device can use XMM registers), every subsequent taskq
> also has the FPU enabled. (We found this by unconditionally
> enabling the extensions in various threads, assuming that new ones
> did not have FPU enabled for kernel use yet. That worked until
> we picked up this particular change.)
>
> The fix itself is fine but "forking from" the most-recent kernel
> thread, rather than thread0, for this case seems wrong. I see
> three obvious ways to fix *that*:
>
> - Restore the old behavior:
>
> if (p == NULL) {
> p = &proc0;
> oldtd = &thread0;
> } else
> oldtd = NULL;
> ...
> PROC_LOCK(p);
> if (oldtd == NULL)
> oldtd = FIRST_THREAD_IN_PROC(p);
>
> Conservative, will definitely work, but still seems a bit odd
> behavior-wise: a new "not in a proc" kthread clones from thread0,
> but a new "is in a proc" kthread clones from most-recent-thread
> in that kernel proc. (But that's what this did before the fix,
> modulo the bug where it cloned from a dead thread...)
>
> - Pick the *last* (i.e., earliest still-alive) thread in the proc:
>
> PROC_LOCK(p);
> oldtd = LAST_THREAD_IN_PROC(p);
>
> where LAST_THREAD_IN_PROC uses the tail entry on the tailq
> (pretty straightforward).
>
> - Get rid of FIRST_THREAD_IN_PROC entirely, as it's not clear
> what "first" means (at least it was not to me: I assumed at
> first that it meant the *oldest*, i.e., first-created, one):
> add new macro accessors NEWEST_THREAD_IN_PROC,
> OLDEST_THREAD_IN_PROC, and (for when you don't care about
> order) SOME_THREAD_IN_PROC [%], and start using those where
> appropriate. The thread_link() routine then puts "newest" and
> "oldest" at whichever end of the tailq is best for kernel code
> space/speed, and only it and the macros need to agree.
>
> [% Needs better name. Perhaps just THREAD_IN_PROC?]
>
> I actually like the last option best, but it is the largest
> overall change and it messes with all kinds of other kernel code
> (there are 62 occurrences of FIRST_THREAD_IN_PROC in a count I
> just ran). However, it becomes clearer which thread is really
> wanted.
>
> (I'm going to do the first fix, for now at least, in our kernel.)
I agree with the problem statement, but disagree with the proposed
'fix'. I.e., I agree that on the creation of a new kernel thread, the
current thread FPU grab state must not leak into the created thread.
In fact, cpu_set_upcall() already almost handles this correctly, by
resetting the pcb_save pointer back to the user save area. What is
not done is the clearing of PCB_KERNFPU flag, which seems to be the
issue you met. Am I right ?
If yes, then the following patch should handle the problem, hopefully.
diff --git a/sys/amd64/amd64/vm_machdep.c b/sys/amd64/amd64/vm_machdep.c
index 335a9c9..bc7b311 100644
--- a/sys/amd64/amd64/vm_machdep.c
+++ b/sys/amd64/amd64/vm_machdep.c
@@ -446,7 +446,8 @@ cpu_set_upcall(struct thread *td, struct thread *td0)
* values here.
*/
bcopy(td0->td_pcb, pcb2, sizeof(*pcb2));
- clear_pcb_flags(pcb2, PCB_FPUINITDONE | PCB_USERFPUINITDONE);
+ clear_pcb_flags(pcb2, PCB_FPUINITDONE | PCB_USERFPUINITDONE |
+ PCB_KERNFPU);
pcb2->pcb_save = get_pcb_user_save_pcb(pcb2);
bcopy(get_pcb_user_save_td(td0), pcb2->pcb_save,
cpu_max_ext_state_size);
diff --git a/sys/i386/i386/vm_machdep.c b/sys/i386/i386/vm_machdep.c
index ba61bdf..3562954 100644
--- a/sys/i386/i386/vm_machdep.c
+++ b/sys/i386/i386/vm_machdep.c
@@ -457,7 +457,8 @@ cpu_set_upcall(struct thread *td, struct thread *td0)
* values here.
*/
bcopy(td0->td_pcb, pcb2, sizeof(*pcb2));
- pcb2->pcb_flags &= ~(PCB_NPXINITDONE | PCB_NPXUSERINITDONE);
+ pcb2->pcb_flags &= ~(PCB_NPXINITDONE | PCB_NPXUSERINITDONE |
+ PCB_KERNNPX);
pcb2->pcb_save = &pcb2->pcb_user_save;
/*
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 834 bytes
Desc: not available
URL: <http://lists.freebsd.org/pipermail/freebsd-hackers/attachments/20140327/86b9881a/attachment.sig>
More information about the freebsd-hackers
mailing list