experimental qemu-devel port update, please test!

Mon Jul 9 21:19:25 UTC 2007

In article <4692650E.6030304 at FreeBSD.org> you write:
>Eric Anderson wrote:
>> On 07/09/07 09:28, Attilio Rao wrote:
>>> Eric Anderson wrote:
>>>> On 07/09/07 08:00, Eric Anderson wrote:
>>>>> Juergen Lock wrote:
>>>>>> On Sun, Jul 08, 2007 at 01:15:35PM -0500, Eric Anderson wrote:
>>>>>>> On 07/07/07 09:02, Juergen Lock wrote:
>>>>>>>> In article <468EFF46.4060001 at freebsd.org> you write:
>>>>>>>>> On 07/05/07 22:31, Eric Anderson wrote:
>>>>>>>>> [...]
>>>>>>>>> Although now I have the issue where using kqemu-kmod causes my 
>>>>>>>>> system to reboot or power off.  :(
>>>>>>>>>
>>>>>>>>> Any ideas?
>>>>>>>> This seems to be a -current issue, it doesn't happen for me at least
>>>>>>>> (6.2 and previously also 6.1.)  You could check if it is dependent
>>>>>>>> on the version of the used qemu (the 0.9.0 port, the version of
>>>>>>>> qemu-devel in ports, or the not-yet-committed update I posted),
>>>>>>>> but I doubt it.  What may help is finding out which commit to -current
>>>>>>>> started breaking kqemu (find an older version that worked, then
>>>>>>>> binary-search), or at least a backtrace from a kernel compiled
>>>>>>>> without -fomit-frame-pointer (putting DDB in the config seems to do
>>>>>>>> that for amd64 at least, but rebuild the entire kernel.)  There also
>>>>>>>> is an open issue for kqemu on amd64 smp,
>>>>>>>>     http://www.freebsd.org/cgi/query-pr.cgi?pr=113430
>>>>>>>> dunno if it's related...
>>>>>>>>     Juergen
>>>>>>> My host is i386, SMP, and it also happens with the current 
>>>>>>> qemu-devel port.  It must have been something in -CURRENT that 
>>>>>>> changed, probably since May15th-ish.  I can't do a binary search 
>>>>>>> anytime soon to find it.  In the past, I've recompiled kqemu and 
>>>>>>> that has done the trick.  I have all the debugging built in, but 
>>>>>>> that doesn't stop the system from rebooting or powering off.
>>>>>> Hmm, a UP kernel might be worth a try too...
>>>>>>
>>>>>>     Juergen
>>>>>
>>>>> Hmm - with and without UP, I get a panic, but I managed to catch a 
>>>>> panic in _vm_map_lock, something like:
>>>>>
>>>>> _vm_map_lock()
>>>>> vm_map_wire()
>>>>> kqemu_lock_user_page()
>>>>> mon_user_map()
>>>>>
>>>>>
>>>>> I'll try to get a real bt..
>>>>>
>>>>> Eric
>>>> Hmm - I suspect this commit or something near it is the issue:
>>>>
>>>>
>>>> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/vm/vm_map.c.diff?r1=1.384;r2=1.385;sortby=date;f=h;f=u
>>>
>>>
>>> don't think so, as it just does a 1:1 translation with the old code 
>>> (passing 0 as argument and casting the return value).
>>>
>>> What kind of panic is it (what message does it print out)?
>>>
>>> Attilio
>>>
>> 
>> Fatal trap 12: page fault while in kernel mode
>> cpuid = 0; apic id = 00
>> fault virtual address   = 0x82
>> fault code              = supervisor read, page not present
>> instruction pointer     = 0x20:0xc0928f00
>> stack pointer           = 0x28:0xe57b7a3c
>> frame pointer           = 0x28:0xe57b7a50
>> code segment            = base 0x0, limit 0xfffff, type 0x1b
>>                         = DPL 0, pres 1, def32 1, gran 1
>> processor eflags        = interrupt enabled, resume, IOPL = 0
>> current process         = 69 (qemu)
>> 
>> 
>> #9  0xc0928f00 in _vm_map_lock (map=0x1, file=0x0, line=0)
>>     at /usr/src/sys/vm/vm_map.c:421
>> #10 0xc092986d in vm_map_wire (map=0x1, start=677306368, end=677310464,
>>     flags=1) at /usr/src/sys/vm/vm_map.c:1964
>> 
>> Maybe not that exact file, but I think that series of commits is 
>> related.  I believe before that everything worked fine (around May 15th 
>> or so).
>> 
>> What else would you like me to try?
>
>Could you check whether accesses to the map structure are MPSAFE and
>free of races?

(Disclaimer: my kernel foo still leaves much to be desired :)

 Hmm, is this something that has changed recently?  kqemu has
D_NEEDGIANT, and it only explicitly drops Giant for kqemu_exec,
but this seems to be happening inside kqemu_init already (at least
that's the only place that calls mon_user_map).

 kqemu_init, kqemu_exec, and mon_user_map are in
work/kqemu-1.3.0pre11/common/kernel.c in the kqemu-kmod port dir
if you `make patch' there; kqemu_lock_user_page is in
work/kqemu-1.3.0pre11/kqemu-freebsd.c.

 I also wonder how it is getting map=0x1 there; as you can see in
kqemu_lock_user_page, it is effectively passing &curproc->p_vmspace->vm_map...
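
 A rough way to see what map=0x1 would imply, as a userspace sketch with
made-up stand-in structs (mock_vm_map and mock_vmspace are hypothetical,
not the real kernel definitions, and the field sizes are arbitrary).  It
assumes vm_map is the first member of struct vmspace, which I believe is
the case but have not rechecked against this -current: then
&p_vmspace->vm_map has the same value as p_vmspace itself, so map=0x1
would mean p_vmspace was already 0x1.  The second helper is just the
subtraction for the trap message quoted above (fault virtual address 0x82
with map=0x1).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for illustration only, NOT the kernel structs. */
struct mock_vm_map {
	char	lock_storage[32];	/* placeholder for the lock etc. */
	int	nentries;
};

struct mock_vmspace {
	struct mock_vm_map vm_map;	/* ASSUMPTION: first member */
	int	vm_refcnt;
};

/*
 * Value that &vmspace->vm_map evaluates to for a given vmspace address.
 * Done with integer arithmetic so no invalid pointer is ever formed.
 */
static uintptr_t
embedded_map_addr(uintptr_t vmspace_addr)
{
	return (vmspace_addr + offsetof(struct mock_vmspace, vm_map));
}

/* Distance between a faulting address and the map pointer in use. */
static uintptr_t
fault_offset(uintptr_t fault_addr, uintptr_t map_addr)
{
	assert(fault_addr >= map_addr);
	return (fault_addr - map_addr);
}
```

 With those inputs embedded_map_addr(0x1) gives 0x1 and
fault_offset(0x82, 0x1) gives 0x81, i.e. the faulting read was 0x81 bytes
past the bogus map pointer.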

 Hmm, I just saw that kqemu_exec can call kqemu_lock_user_page as well as
kqemu_alloc_zeroed_page and kqemu_unlock_user_page (which are all in
work/kqemu-1.3.0pre11/kqemu-freebsd.c).  Would it need to pick up Giant
for those first?  (But since mon_user_map is in the backtrace, that
at least can't be the reason for _this_ crash...)

	Juergen