QEMU 0.8.1 and -kernel-kqemu: stalls with "npxdna: fpcurthread == curthread"

Juergen Lock nox at jelal.kn-bremen.de
Sat May 13 16:34:47 UTC 2006


On Sat, May 13, 2006 at 11:34:31AM +1000, Bruce Evans wrote:
> On Fri, 12 May 2006, Juergen Lock wrote:
> 
> >In article <20060512101754.K65309 at delplex.bde.org> you write:
> >>On Fri, 12 May 2006, Georg-W. Koltermann wrote:
> 
> >>>       May 11 13:04:44 hunter kernel: npxdna: fpcurthread == curthread
> >>43 times
> >>>       ...
> >>>
> >>>messages.  I then had to kill qemu.
> >>>
> >>>It does run ok without the "-kernel-kqemu" option.  Any idea?
> >>
> >>1. This error should cause a panic instead of a printf.  An invariant has
> >>   been violated.  The panic was broken in rev.1.131 of npx.c.
> >>2. This error has been implemented before.  It was in the amd64
> >>   linux32_sysvec.c until rev.1.9 of that.  There it was caused by dubious
> >>   setting of CR0_TS.  The fix is dubious too:
> >>...
> >>Maybe other emulators get this wrong similarly.
> >
> >So you think kqemu is doing something wrong?  The problem is _k_qemu is
> 
> Most likely.  It could be triggering a bug in the kernel proper,
> but I can't see how it could do this without doing something wrong.
> 
> >closed source and afaik the author doesn't use freebsd, the inner
> >workings of it are in a binary blob that gets linked into a kld and it
> >runs guest code (including kernel code with -kernel-kqemu) in kernel
> >space on the host cpu.  You can see the freebsd-specific parts in
> >/usr/ports/emulators/kqemu-kmod/work/kqemu-1.3.0pre7/kqemu-freebsd.c
> >(after doing make in the port's dir) - could this be patched there?
> 
> Probably not.  I couldn't see any floating point there or in a disassembly
> of the module.  A stack trace would show what used floating point, but
> might not locate the problem exactly, depending on what used it.
> 
It's probably guest code (code running on the emulated cpu) that kqemu
runs (kqemu_exec()) that is using the fpu.
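
To make the failure mode concrete, here is a toy user-space model of
lazy FPU switching.  It is only a sketch: every name in it (toy_cpu,
toy_dna, toy_fpu_use, ...) is made up for illustration and none of it
is taken from npx.c or from kqemu.  The `ts` flag stands in for CR0_TS,
`fp_owner` for fpcurthread and `cur` for curthread.  The point it shows
is that if something sets TS while the current thread still owns the
FPU, the next FPU instruction traps with owner == current, which is the
condition the message complains about.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these are NOT FreeBSD kernel types or functions. */
struct toy_thread {
    int id;
    double saved_fpu;             /* stand-in for the thread's FPU save area */
};

struct toy_cpu {
    bool ts;                      /* models CR0_TS: next FPU use traps */
    struct toy_thread *fp_owner;  /* models fpcurthread */
    struct toy_thread *cur;       /* models curthread */
    double fpu_reg;               /* stand-in for the whole FPU state */
};

enum dna_result { DNA_OK, DNA_INVARIANT_VIOLATED };

/* Models the "device not available" trap handler. */
static enum dna_result toy_dna(struct toy_cpu *cpu)
{
    /* The owner should never trap: TS is expected to be clear for it. */
    if (cpu->fp_owner == cpu->cur)
        return DNA_INVARIANT_VIOLATED;

    /* Save the previous owner's state, then load the current thread's. */
    if (cpu->fp_owner != NULL)
        cpu->fp_owner->saved_fpu = cpu->fpu_reg;
    cpu->fpu_reg = cpu->cur->saved_fpu;
    cpu->fp_owner = cpu->cur;
    cpu->ts = false;
    return DNA_OK;
}

/* Models executing one FPU instruction that writes v to the FPU. */
static enum dna_result toy_fpu_use(struct toy_cpu *cpu, double v)
{
    if (cpu->ts) {
        enum dna_result r = toy_dna(cpu);
        if (r != DNA_OK)
            return r;
    }
    cpu->fpu_reg = v;
    return DNA_OK;
}
```

In this model, a first toy_fpu_use() on a CPU with ts set and no owner
succeeds and makes the current thread the owner.  Setting ts again by
hand without touching fp_owner (which is what I suspect some foreign
code is effectively doing) makes the next toy_fpu_use() return
DNA_INVARIANT_VIOLATED.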

> >Btw, kqemu on amd64 also causes lots of
> >	fpudna in kernel mode!
> >messages even when not using -kernel-kqemu (so that kqemu only runs
> >guest userland code in kernel space.)
> 
> This should cause a panic too.  It indicates that the kernel is using
> the FPU without even setting up for using it.  It just gets used (*).
> This may clobber its current user since there is no setup.  (The kernel
> is currently only permitted to use the FPU for saving and restoring
> it for userland.  In disabled optimizations for old Pentiums, the FPU
> is really used by the kernel, but this requires saving the state if there
> is a current user.)  A stack trace for this would locate a problem
> exactly.
> 
> (*) Note fpudna() is called unconditionally, and we only panic if it
> returns 0.  The test is especially bogus on amd64 since fpudna() always
> returns 1 there.  On i386's, the corresponding npxdna() still returns
> 0 in the !npx_exists case, but that case should never occur.

 Could we simply assume that kqemu_exec() will always use the fpu
and do the necessary things before calling it in kqemu-freebsd.c?
(And what would those be, exactly?)
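
 To illustrate the kind of bracketing I have in mind, here is a toy
user-space sketch.  It is an assumption about the general shape only:
the names (toy_host, toy_run_guest, toy_owner_reload, ...) are
hypothetical, it calls no real kernel or kqemu function, and whether
the real kqemu-freebsd.c needs exactly this is the open question.  The
idea is to flush any live host FPU state to its save area before the
guest runs, and to let the host thread reload from that save area
afterwards instead of trusting whatever the guest left in the FPU.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy types; not FreeBSD or kqemu definitions. */
struct toy_fpu {
    double reg;                   /* stand-in for the whole FPU state */
};

struct toy_host {
    struct toy_fpu hw;            /* contents of the "hardware" FPU */
    bool owner_live;              /* a host thread has live state in hw */
    struct toy_fpu owner_save;    /* that thread's save area */
};

typedef void (*toy_guest_fn)(struct toy_fpu *hw);

/* Run guest code on the assumption that it will always use the FPU. */
static void toy_run_guest(struct toy_host *h, toy_guest_fn guest)
{
    /* 1. Flush the host owner's live state so the guest cannot clobber it. */
    if (h->owner_live) {
        h->owner_save = h->hw;
        h->owner_live = false;
    }

    /* 2. The guest may now use the FPU freely. */
    guest(&h->hw);

    /* 3. Nothing is restored eagerly: the host thread reloads lazily. */
}

/* Models the host thread's next FPU use after the guest has run. */
static double toy_owner_reload(struct toy_host *h)
{
    if (!h->owner_live) {
        h->hw = h->owner_save;
        h->owner_live = true;
    }
    return h->hw.reg;
}

/* Example guest that overwrites the FPU contents. */
static void toy_guest_clobber(struct toy_fpu *hw)
{
    hw->reg = -1.0;
}
```

 With a host whose live FPU value is 3.25, calling
toy_run_guest(&h, toy_guest_clobber) leaves -1.0 in the "hardware", but
toy_owner_reload(&h) still gets 3.25 back from the save area.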

 Btw it seems the ndisulator has a similar problem sometimes:
	http://docs.freebsd.org/cgi/mid.cgi?20051110023940.1785116A420

