PCI range checking under qemu-system-sparc64

Wed Sep 23 08:20:16 UTC 2015

On 22/09/15 23:14, Marius Strobl wrote:

> On Sun, Sep 20, 2015 at 04:36:31AM +0000, Alexey Dokuchaev wrote:
>> On Sun, Sep 20, 2015 at 12:05:32AM +0100, Mark Cave-Ayland wrote:
>>> [...]
>>> While I don't have any insight on the CPU tick interrupt yet, my initial
>>> feeling is that the ATA hang could be related to the PCI interrupt
>>> clearing issue that I started looking into a while back. Although it
>>> isn't a complete fix, does the attached patch against QEMU help at all?
>>
>> Did not help; putting back "device scbus+da+cd" hangs with patched QEMU.
>>
>
> Actually, it does make a bit of a difference; with a QEMU 2.4.0
> built with apb-no-clear.patch applied the number of interrupts
> seen for the emulated CMD646U2 at the point of the hang decreases
> from reliably 5 to reliably 4. Now whether that is a relevant
> functional change or what's the appropriate number of interrupts
> at that point in time I don't know, though.

It's good that you've managed to see some change in behaviour with it.
The patch fixes one particular bug that was reported to me off-list but
really the whole APB interrupt handler needs a rework to better handle
the specification.

> I tried to look into the tick interrupt problem, as that seems
> more relevant. Apparently, writes to %tick and %tick_cmpr are
> hooked up to cpu_tick_set_count() and cpu_tick_set_limit() (in
> hw/sparc64/sun4u.c) respectively in target-sparc/translate.c
> by a detour via target-sparc/helper.c. Likewise for %{h,}stick
> and %{h,}stick_cpmr. So at first I just enabled DEBUG_TIMER,
> which yields:
> marius at enigma:/home/marius > qemu-system-sparc64 -nographic -cdrom
FreeBSD-11.0-CURRENT-sparc64-bootonly.iso -boot d
> TIMER: tick set_limit limit=0x0000000000000000 (disabled)
p=0x803a29810 called with limit=0x8000000000000000 at 0x0000000000004c97
(delta=0x0000000000000000)
> TIMER: tick set_limit limit=ZERO - not starting timer
> TIMER: tick get_count count=0x000000000005df82 (disabled) p=0x803a29810
> TIMER: tick set_limit limit=0x00000000001521c2 (enabled) p=0x803a29810
called with limit=0x00000000001521c2 at 0x000000000005e0e9
(delta=0x00000000000f40d8)
> OpenBIOS for Sparc64
> <...>
>
> The first invocation of cpu_tick_set_count() is here:
> TIMER: tick set_count count=0x0000000000000000 (enabled) p=0x803a29810
> TIMER: tick set_limit limit=0x0000000000000000 (disabled)
p=0x803a29810 called with limit=0x8000000000000000 at 0x0000000000000e73
(delta=0x0000000000000000)
> TIMER: tick set_limit limit=ZERO - not starting timer
> GDB: no debug ports present
> KDB: debugger backends: ddb
> KDB: current backend: ddb
> Copyright (c) 1992-2015 The FreeBSD Project.
>
> I don't know where that write to %tick is coming from but it's not
> performed by either the FreeBSD loader or kernel. However, the write
> to %tick_cmpr seems right as we do a `wrpr (1L << 63), 0, %tick_cmpr`
> very early in the kernel in order to prevent foot-shooting from
> firmware variants that leave the tick counter enabled.
>
> There's no other call of cpu_tick_set_count() but the next one to
> cpu_tick_set_limit() is when the kernel enables the tick counter:
> Timecounter "tick" frequency 100000000 Hz quality 1000
> TIMER: tick get_count count=0x0000000006ead320 (disabled) p=0x803a29810
> TIMER: tick get_count count=0x0000000006ead7da (disabled) p=0x803a29810
> Event timer "tick" frequency 100000000 Hz quality 1000
> TIMER: tick get_count count=0x0000000006ec9b12 (disabled) p=0x803a29810
> TIMER: tick set_limit limit=0x0000000006ed5e61 (disabled)
p=0x803a29810 called with limit=0x8000000006ed5e61 at 0x0000000006eca6e5
(delta=0x000000000000b77b)
> TIMER: (null) disabled
> Timecounters tick every 1.000 msec^M^M
>
> The "called with limit=0x8000000006ed5e61" is a plain lie! We
> simply do not set (1L << 63) when writing the future tick value
> to %tick_cmpr for the tick interrupt to go off. So I tried with
> a hack to cpu_tick_set_limit():
> void cpu_tick_set_limit(CPUTimer *timer, uint64_t limit)
> {
>     int64_t now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
>
>     uint64_t real_limit = limit & ~timer->disabled_mask;
>     timer->disabled = (limit & timer->disabled_mask) ? 1 : 0;
>     timer->disabled = limit == 0;
>
> Not that this shouldn't make any difference as it just special-
> cases the fact that apart from (1L << 63), FreeBSD writes a 0
> value to %tick_cmpr for disabling the tick interrupt. Moreover,
> limit == 0 implies that (1L << 63) isn't set either, so if the
> kernel really would set (1L << 63) in the case in question,
> this change shouldn't cause a difference in behavior, but it
> in fact it does:
> Timecounter "tick" frequency 100000000 Hz quality 1000
> Event timer "tick" frequency 100000000 Hz quality 1000
> Timecounters tick every 1.000 msec
> IPsec: Initialized Security Association Processing.
> cd0 at ata3 bus 0 scbus1 target 0 lun 0
> cd0: <QEMU QEMU DVD-ROM 2.4.> Removable CD-ROM SCSI device
> cd0: Serial Number QM00003
> cd0: 13.300MB/s transfers (WDMA1, ATAPI 12bytes, PIO 65534bytes)
> cd0: cd present [103200 x 2048 byte records]
> WARNING: WITNESS option enabled, expect reduced performance.
> Trying to mount root from cd9660:/dev/iso9660/11_0_CURRENT_SPARC64_BO
[ro]...
> [ thread pid 1 tid 100002 ]
> Stopped at      tl1_trap+0x24:  stx             %o0, [%sp + 0x997]
> db>
>
> So apparently (1L << 63) in %tick_cmpr somehow is sticky
> within QEMU but I didn't manage to spot a bug in that
> regard. There actually are several obvious bugs in QEMU
> when it comes to %{hs,s,}tick and %{hs,s,}tick_cmpr
> handling, though. First off, both cpu_tick_set_count()
> and cpu_tick_set_limit() test for the disabled_mask being
> set in the value written to the register in question.
> This is wrong as only the %{s,}tick_cmpr registers have
> (1L << 63) as disable bit, in %{s,}tick (1L << 63) serves
> as NPT (Non-Privilege Trap) bit. The NPT bit causes an
> unprivileged read of %{s,}tick to trap when set (with
> real hardware that is). QEMU seems to re-invent and
> side-step that logic via CONFIG_USER_ONLY. I don't have
> documentation for %hstick and %hstick_cmpr at hand but
> they likely work the same way. Moreover, cpu_devinit()
> in hw/sparc64/sun4u.c seems to totally confuse NPT
> and disable bit in the first place:
> env->tick = cpu_timer_create("tick", cpu, tick_irq,
>                               tick_frequency, TICK_NPT_MASK);
>
> env->stick = cpu_timer_create("stick", cpu, stick_irq,
>                                    stick_frequency, TICK_INT_DIS);
>
> env->hstick = cpu_timer_create("hstick", cpu, hstick_irq,
>                                 hstick_frequency, TICK_INT_DIS);
>
> Depending on what "disabled" is supposed to mean, that
> should be either all TICK_INT_DIS or all TICK_NPT_MASK,
> but certainly not mixed.

I've had a quick look through the relevant PDFs and the definitions I
have for tick/stick are this:

tick:
  bit  63: NPT (Non-Privileged Trap enable - defaults to 1)
  bits 62 - 0: CPU cycle counter

tick_cmpr:
  bit  63: Interrupt disable (1 = no interrupt)
  bits 62 - 0: counter compare value

stick:
  bit  63: Reserved (reads 0, no write)
  bits 62 - 0: stick register count value

stick_cmpr:
  bit  63: stick alarm enable (1 = no interrupt)
  bits 62 - 0: stick register compare value

So yes, based on this I agree that cpu_tick_set_count() and
cpu_tick_get_count() shouldn't be doing anything that changes
timer->disabled at all (and presumably doesn't implement the NPT bit).

> As for the panic I get with my hack applied, the back-
> trace is:
> Trying to mount root from cd9660:/dev/iso9660/11_0_CURRENT_SPARC64_BO
[ro]...
> [ thread pid 1 tid 100002 ]
> Stopped at      tl1_trap+0x24:  stx             %o0, [%sp + 0x997]
> db> bt
> Tracing pid 1 tid 100002 td 0xfffff800015e84d0
> KDB: reentering
> KDB: stack backtrace:
> kdb_reenter() at kdb_reenter+0x5c
> trap() at trap+0x2fc
> -- kernel stack fault %o7=0xc0570d40 --
> sched_clock() at sched_clock+0x94
> KDB: reentering
> KDB: stack backtrace:
> kdb_reenter() at kdb_reenter+0x5c
> trap() at trap+0x2fc
> -- kernel stack fault %o7=0xc011a050 --
> db_read_bytes() at db_read_bytes+0x44
> KDB: reentering
> KDB: stack backtrace:
> kdb_reenter() at kdb_reenter+0x5c
> trap() at trap+0x2fc
> -- kernel stack fault %o7=0xc011a050 --
> db_read_bytes() at db_read_bytes+0x44
> KDB: reentering
> KDB: stack backtrace:
> kdb_reenter() at kdb_reenter+0x5c
> trap() at trap+0x2fc
> -- kernel stack fault %o7=0xc011a050 --
> db_read_bytes() at db_read_bytes+0x44
> <continues seemingly endlessly with db_read_bytes()>
>
> So this is a trap while inside the kernel and a
> seemingly serious stack corruption. I have no idea
> how to even debug QEMU in that regard or where to
> start looking for that bug in QEMU. However, as
> long as the bug of (1L << 63) being corrupted in
> %tick_cmpr isn't fixed, i. e. its cause unknown,
> it likely also doesn't make much sense to dig
> further into that stack corruption.

Normally a good tactic is run QEMU with -d in_asm so that each
translated block of code is output to the console. You can then search
for the particular instruction in the output to get the PC value, and
then use QEMU's gdbstub to connect an external sparc64-gdb into the
guest and a set a breakpoint at the relevant place.

> Err, what operating systems is qemu-system-sparc64
> already able to successfully host again? :)

Apparently Linux, NetBSD, OpenBSD, HelenOS... ;)  Thanks once again for
all your efforts, it really makes such a difference when people who
understand the kernel innards can explain the differences in behaviour
almost immediately without me having poke everything with a debugger.

BTW over the past few days I've been setting up a FreeBSD VM to build my
own kernels for testing. I got as far as downloading the latest current
ISO, installed it, grabbed the kernel source from github and kicking off
a buildworld. Unfortuntely this morning I came back to my computer to
find the following on the console:

awk -f MKterm.h.awk
/usr/home/mca/src/freebsd/lib/ncurses/ncurses/../../../contrib/ncurses/include/Caps
> term.h.new
sh
/usr/home/mca/src/freebsd/lib/ncurses/ncurses/../../../contrib/ncurses/include/edit_cfg.sh
/usr/home/mca/src/freebsd/lib/ncurses/ncurses/ncurses_cfg.h term.h.new
** edit: HAVE_TCGETATTR 1
** edit: HAVE_TERMIOS_H 1
** edit: HAVE_TERMIO_H 0
** edit: BROKEN_LINKER 0
mv -f term.h.new term.h
sed
</usr/home/mca/src/freebsd/lib/ncurses/ncurses/../../../contrib/ncurses/include/termcap.h.in
>termcap.h  -e "/@NCURSES_MAJOR@/s%%5%"  -e "/@NCURSES_MINOR@/s%%9%"  -e
"/@NCURSES_CONST@/s%%const%"  -e "/@NCURSES_OSPEED@/s%%short%"
sed
</usr/home/mca/src/freebsd/lib/ncurses/ncurses/../../../contrib/ncurses/include/unctrl.h.in
>unctrl.h  -e "s%@NCURSES_SP_FUNCS@%0%g"  -e "/@NCURSES_MAJOR@/s%%5%"
-e "/@NCURSES_MINOR@/s%%9%"
cc  -o make_keys -O2 -pipe   -I.
-I/usr/home/mca/obj/usr/home/mca/src/freebsd/lib/ncurses/ncurses/../ncurses
-I/usr/home/mca/src/freebsd/lib/ncurses/ncurses/../ncurses
-I/usr/home/mca/src/freebsd/lib/ncurses/ncurses/../ncurses
-I/usr/home/mca/src/freebsd/lib/ncurses/ncurses/../../../contrib/ncurses/include
-I/usr/home/mca/src/freebsd/lib/ncurses/ncurses/../../../contrib/ncurses/ncurses
-Wall -DNDEBUG -DHAVE_CONFIG_H -DFREEBSD_NATIVE -DTERMIOS -std=gnu99
-fstack-protector -Wsystem-headers -Werror -Wall -Wno-format-y2k -W
-Wno-unused-parameter -Wstrict-prototypes -Wmissing-prototypes
-Wpointer-arith -Wno-uninitialized -Wno-pointer-sign
/usr/home/mca/src/freebsd/lib/ncurses/ncurses/../../../contrib/ncurses/ncurses/tinfo/make_keys.c
./make_keys keys.list > init_keytry.h
/bin/sh: ./make_keys: Exec format error
*** Error code 126

Would it be better to try building the new kernel using a 10.2 based
system instead?

ATB,

Mark.