STI, HLT in acpi_cpu_idle_c1
dillon at apollo.backplane.com
Tue Jun 22 22:14:41 GMT 2004
:I am working with Don Bowman to try and debug this problem (the lockup).
:I have an emulator attached, and managed to get it into the locked up
:state. Three of the cpus are in idle (acpi_cpu_c1) (and have interrupts
:enabled, EFLAGS=0x246), and the other one (cpu 3) is in smp_tlb_shootdown
:waiting for one more processor to respond. The APIC register for CPU 3
:(icr_lo) indicates that the IPI (0xf3) has been sent (ie it's idle).
:The isr registers for CPU 1 indicate that vector 0xf3 is pending, but it
:is not being handled. I am still trying to figure out why this is, but
:does anyone have any suggestions on what else I can look at?
If the interrupt is pending on the idle cpus' APICs but no interrupt is
being delivered, and the idle cpus are in HLT with interrupts enabled,
then something is masking the pending interrupt. Check the following:
In the local APIC for each idle cpu:
* Check the TPR (task priority register) versus the priority set for
the IPI interrupt. The top 4 bits are the main priority field, and an
interrupt's priority is the top 4 bits of its vector (0xf for 0xf3).
Interrupts with priorities <= the main priority bits will be masked
(so 11111111 masks all interrupts). The TPR priority should be lower
than the priority set for the IPI in question.
* Check the PPR (processor priority register). This register tells you
the priority of the highest pending interrupt that can be
dispensed to the processor. It will be set to the same contents as
the TPR if no servable interrupt is pending. The PPR is a quick way
to tell what priority of interrupt the APIC is trying to deliver to the
cpu. My guess is that it will be 0 (meaning that the APIC is not trying
to deliver anything to the cpu).
* Check the ISR bits, the TMR bits, and the IRR bits. These are
described as follows (from /usr/src/sys/i386/include/apicreg.h in the
DFly source tree):
* TMR - Trigger mode register. Upon acceptance of an int
* the corresponding bit is cleared for edge-trig and
* set for level-trig. If the TMR bit is set (level),
* the local APIC sends an EOI to all I/O APICs as
* a result of software issuing an EOI command.
* IRR - Interrupt Request Register. Contains active
* interrupt requests that have been accepted but not
* yet dispensed by the current local APIC. The bit is
* cleared and the corresponding ISR bit is set when
* the INTA cycle is issued.
* ISR - Interrupt In-Service register. Interrupt has been
* delivered but not yet fully serviced. Cleared when
* an EOI is issued from the processor. An EOI will
* also send an EOI to all I/O APICs if TMR was set.
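
The checks in the list above can be sketched in C roughly as follows.
This is only an illustration, not code from the DragonFly tree: the
function names are made up, and it assumes you have already copied the
TPR value and the eight 32-bit words of the IRR/ISR/TMR out of the
emulator into ordinary variables (lowest vectors in word 0).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Main priority field of a vector or of the TPR: the top 4 bits. */
static inline unsigned prio_class(uint8_t v)
{
    return (unsigned)v >> 4;
}

/*
 * True if this TPR value would hold back the given vector, i.e. the
 * vector's priority is <= the TPR's main priority bits.
 */
static inline bool tpr_masks_vector(uint8_t tpr, uint8_t vector)
{
    return prio_class(vector) <= prio_class(tpr);
}

/*
 * Test the bit for 'vector' in a snapshot of the IRR, ISR or TMR.
 * regs[] is hypothetical: eight 32-bit words captured by hand,
 * word 0 holding vectors 0-31 and word 7 holding vectors 224-255.
 */
static inline bool apic_vector_bit(const uint32_t regs[8], uint8_t vector)
{
    assert(regs != NULL);
    return ((regs[vector / 32] >> (vector % 32)) & 1u) != 0;
}
```

For the IPI in question (0xf3) the bit to look at is bit 19 of word 7,
and any TPR with a top nibble of 0xf would mask it.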
If the interrupt is masked it should be set in the IRR but not set in
the ISR. If it is set in the ISR and interrupts are enabled on the cpu,
then I have no idea what the hell is going on (because the cpu should then
service the interrupt)... unless the cpu is totally bokered.
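
As a rough sketch of that reasoning, the IRR and ISR bits for the IPI
vector can be turned into one of three cases. The enum and function
names are invented for illustration, and the "neither bit set" case is
my own reading of the IRR description (the APIC never accepted the
IPI), not something verified on the failing machine.

```c
#include <stdbool.h>

/* Hypothetical labels for the three cases discussed above. */
enum ipi_state {
    IPI_NOT_ACCEPTED,      /* neither IRR nor ISR bit set */
    IPI_PENDING_MASKED,    /* IRR set, ISR clear: accepted, not dispensed */
    IPI_IN_SERVICE         /* ISR set: the cpu should be servicing it */
};

/* Classify from the IRR/ISR bits of the IPI vector on one cpu. */
static inline enum ipi_state classify_ipi(bool irr_set, bool isr_set)
{
    if (isr_set)
        return IPI_IN_SERVICE;
    if (irr_set)
        return IPI_PENDING_MASKED;
    return IPI_NOT_ACCEPTED;
}

/* Short text for each case, suitable for a debug printout. */
static inline const char *ipi_state_hint(enum ipi_state s)
{
    switch (s) {
    case IPI_PENDING_MASKED:
        return "pending but held back: check TPR/PPR";
    case IPI_IN_SERVICE:
        return "in service: cpu should be running the handler";
    default:
        return "not accepted by this local APIC";
    }
}
```

If an idle cpu with interrupts enabled shows the in-service case and is
still sitting in HLT, that is the situation I can't explain.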
Other possibilities: The IPI interrupt vector on the receiving CPUs is
not set up properly. (I'm not sure how you can access that data; it's
programmed via the ICR, so presumably it can be read back out via the
ICR somehow.)
In any case, pull out /usr/src/sys/i386/include/apicreg.h from the
DragonFly source base... I had to go through all this crap a year ago
and decided to document the APIC registers to the hilt (based on the Intel
documentation). There is a lot of very useful information in that file.
<dillon at backplane.com>