diskless system freeze in bios16_call() on some Intel motherboards
Luigi Rizzo
rizzo at icir.org
Fri Sep 7 08:07:52 PDT 2007
[Note: all these tests were done on -stable, though a -current
kernel exhibits the same problems in some of the tests, so I
suspect there is a common problem.]
On Fri, Sep 07, 2007 at 06:48:51AM -0700, Luigi Rizzo wrote:
> Hi,
> we are having some annoying problems with a number of Intel
> motherboards (Pentium4, ich6 and ich7 based, the latter are on
> D945PAW boards with SN94510J.86A bios if that matters).
>
> The symptoms are that booting a 6.x or 7.x kernel with
> etherboot causes a system freeze. This also happens if we
> try to etherboot the kernel from a 6.2 install CD.
>
> On the other hand, on the same hardware:
> - a 4.11 kernel booted with etherboot boots ok.
> - a 6.2 install CD boots ok;
> - a 6.2 install CD with the kernel replaced with ours boots ok.
>
> So it seems that at least part of the problem is how
> the execution environment is set up by etherboot as opposed to
> /boot/loader . However, it is still unclear to me why the 4.11 kernel
> works.
>
> After some instrumenting, it turns out that the freeze is in the
> call to bios16_call, and specifically in this line in
> sys/i386/i386/bioscall.s
>
> lcallw *bioscall_vector /* 16-bit call */
>
> Looking at the arguments there is nothing strange - the selector is
> 0x70 as on other machines, the address seems reasonable.
> If I comment out the lcallw, then things proceed, but apparently the
> interrupt for the network card is not set up correctly, because the
> subsequent bootp replies are not received (I see them on the
> server with tcpdump) and we get 'watchdog timeout' messages on the
> console of the diskless client.
>
> Any ideas on what the problem could be?
For the benefit of the archives, and upon further investigation:
the problem is definitely related to pnpbios/bios16 calls.
One of the differences between the CD and etherboot is that the CD
loads acpi.ko as a module. This apparently prevents the calls to the
offending pnpbios stuff, and also lets the apic code correctly
configure things.
The following causes a PANIC:
- booting from etherboot with acpi compiled in:
the kernel itself panics in pmap_mapbios(), right after calling
AcpiOsMapMemory().
The following causes the system to FREEZE:
- booting from etherboot without acpi, with SMP+apic, and with bios16_call()
uncommented (essentially it is this call that causes the freeze).
- booting from CD without loading acpi.ko (the 'safe mode'!). This too
causes the call to bios16_call() which in turn freezes.
The following causes a 'WATCHDOG TIMEOUT' on the network card:
- booting from etherboot with SMP+apic, bios16_call() commented out,
and no acpi. Presumably, the apic does not route interrupts
properly on this hardware without acpi.
Finally, the following WORKS WELL:
- boot from etherboot without SMP, without acpi, without apic, and
with the call to bios16_call() in bios.c commented out.
This is probably using the hardware in a similar way to what
4.11 does. Note however that we need to patch the kernel source.
- boot from the CD, loading acpi.ko as a module, and irrespective of SMP+apic.
I don't know why loading acpi.ko as a module works better than compiling
it in, but perhaps it is related to the order in which the functions are
called?
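
The results listed above can be collected into a small lookup table, which
makes it easier to see which setting each outcome follows. This is only an
illustrative Python sketch and was not part of the tests; the field names
and the outcomes_when() helper are invented here, and None marks a setting
that the report does not state for that case.

```python
from typing import NamedTuple, Optional


class Report(NamedTuple):
    boot: str                    # "etherboot" or "cd"
    acpi: str                    # "none", "compiled-in" or "module"
    smp_apic: Optional[bool]     # None = not stated in the report
    bios16_call: Optional[bool]  # None = not stated in the report
    outcome: str


# One entry per case described in the message, in the same order.
REPORTS = [
    Report("etherboot", "compiled-in", None, None, "panic"),
    Report("etherboot", "none", True, True, "freeze"),
    Report("cd", "none", None, True, "freeze"),           # CD 'safe mode'
    Report("etherboot", "none", True, False, "watchdog timeout"),
    Report("etherboot", "none", False, False, "works"),
    Report("cd", "module", None, None, "works"),          # SMP+apic irrelevant
]


def outcomes_when(**fixed):
    """Return the sorted set of outcomes for reports matching all given fields."""
    return sorted({
        r.outcome
        for r in REPORTS
        if all(getattr(r, name) == value for name, value in fixed.items())
    })
```

Calling outcomes_when(bios16_call=True) returns only "freeze", and
outcomes_when(acpi="module") returns only "works", which matches the
reading that the bios16 call is what hangs and that loading acpi.ko
avoids it.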
I cannot do more tests now, but surely it would be interesting to
see what changes in acpi between compiled-in and kldloaded.
cheers
luigi