STI, HLT in acpi_cpu_idle_c1

Wed Jun 23 14:39:23 GMT 2004

On Tuesday 22 June 2004 09:01 pm, Gerrit Nagelhout wrote:
> Thanks for the detailed info on this.  It looks like CPU1 is trying
> to service the interrupt because PPR = 0xf0, and TPR = 0x00.  It is
> also the only CPU that has a bit set in ISR.  In this case, CPU 3
> was initiating the IPI, although I don't know why its icr_lo is
> 0xc00f6, because I was expecting it to be 0xc00f3 (and it was in
> previous lockups).  I still have no idea why CPU1 is not handling
> this interrupt though.  I am still getting used to this emulator, but
> I think the values I am reading are believable:
>
> P3>dumpAllLocalApic
> CPU 0
> ID:    0x6000000
> TPR:   0x0
> PPR:   0x0
> icr_lo:0xf3
> ISR0:  0x0
> ISR1:  0x0
> ISR2:  0x0
> ISR3:  0x0
> ISR4:  0x0
> ISR5:  0x0
> ISR6:  0x0
> ISR7:  0x0
> CPU 1
> ID:    0x7000000
> TPR:   0x0
> PPR:   0xf0
> icr_lo:0xf3
> ISR0:  0x0
> ISR1:  0x0
> ISR2:  0x0
> ISR3:  0x0
> ISR4:  0x0
> ISR5:  0x0
> ISR6:  0x0
> ISR7:  0x80000

Bit 19 of ISR7 is set, and ISR7 covers vectors 224-255, so the in-service
vector is 224 + 19 = 243.

#define APIC_LOCAL_INTS 240
#define APIC_IPI_INTS   (APIC_LOCAL_INTS + 3)
#define IPI_AST         APIC_IPI_INTS           /* Generate software trap. */
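
The arithmetic can be double-checked with a small standalone sketch.  The
helper names below are made up for illustration and are not kernel
functions; the only facts assumed are that each ISR register covers 32
vectors and that a vector's priority class is its upper four bits.

```c
#include <assert.h>
#include <stdint.h>

#define APIC_LOCAL_INTS 240
#define APIC_IPI_INTS   (APIC_LOCAL_INTS + 3)
#define IPI_AST         APIC_IPI_INTS

/*
 * Hypothetical helper: highest vector marked in-service in ISR register
 * `reg` (0-7), or -1 if no bit is set.  ISRn bit b is vector n * 32 + b.
 */
int
isr_highest_vector(int reg, uint32_t value)
{
        assert(reg >= 0 && reg < 8);
        for (int bit = 31; bit >= 0; bit--)
                if (value & (UINT32_C(1) << bit))
                        return (reg * 32 + bit);
        return (-1);
}

/* Hypothetical helper: priority class of a vector (its upper four bits). */
int
vector_priority_class(int vector)
{
        return (vector & 0xf0);
}
```

For CPU 1's dump, isr_highest_vector(7, 0x80000) gives 243, which equals
IPI_AST, and vector_priority_class(243) gives 0xf0, which matches the PPR
that CPU 1 reports.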

So it's an IPI_AST, whose handler sends the EOI before doing anything else:

IDTVEC(cpuast)
        PUSH_FRAME
        movl    $KDSEL, %eax
        movl    %eax, %ds               /* use KERNEL data segment */
        movl    %eax, %es
        movl    $KPSEL, %eax
        movl    %eax, %fs

        movl    lapic, %edx
        movl    $0, LA_EOI(%edx)        /* End Of Interrupt to APIC */

        FAKE_MCOUNT(TF_EIP(%esp))

        MEXITCOUNT
        jmp     doreti

Hmm, nothing in the kernel sends IPI_AST to all but self.  In the MI code,
only IPI_RENDEZVOUS is sent that way.

> CPU 2
> ID:    0x0
> TPR:   0x0
> PPR:   0x0
> icr_lo:0xfb
> ISR0:  0x0
> ISR1:  0x0
> ISR2:  0x0
> ISR3:  0x0
> ISR4:  0x0
> ISR5:  0x0
> ISR6:  0x0
> ISR7:  0x0
> CPU 3
> ID:    0x1000000
> TPR:   0x0
> PPR:   0x0
> icr_lo:0xc00f6

The low byte of icr_lo, 0xf6, is vector 246.
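
A rough sketch of how the icr_lo values in the dump break down.  The macro
and function names are invented for this example; the field positions
(vector in bits 0-7, destination shorthand in bits 18-19) are taken from
Intel's local APIC documentation.

```c
#include <stdint.h>

/* Illustrative names; only the bit positions come from the APIC docs. */
#define ICR_VECTOR_MASK         0x000000ffu     /* bits 0-7 */
#define ICR_SHORTHAND_MASK      0x000c0000u     /* bits 18-19 */
#define ICR_SHORTHAND_SHIFT     18

enum icr_shorthand {
        ICR_NO_SHORTHAND  = 0,  /* use the destination field */
        ICR_SELF          = 1,
        ICR_ALL_INCL_SELF = 2,
        ICR_ALL_EXCL_SELF = 3   /* "all but self" */
};

/* Extract the interrupt vector from the low ICR word. */
unsigned int
icr_vector(uint32_t icr_lo)
{
        return (icr_lo & ICR_VECTOR_MASK);
}

/* Extract the destination shorthand from the low ICR word. */
enum icr_shorthand
icr_dest_shorthand(uint32_t icr_lo)
{
        return ((enum icr_shorthand)
            ((icr_lo & ICR_SHORTHAND_MASK) >> ICR_SHORTHAND_SHIFT));
}
```

For CPU 3's 0xc00f6 this yields vector 246 with the all-but-self shorthand,
while the 0xf3 seen on CPUs 0 and 1 is vector 243 with no shorthand.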

#define IPI_INVLRNG     (APIC_IPI_INTS + 3)

That is an IPI that is sent via all_but_self.  *sigh*  And the TLB shootdown
code does sit and spin in a loop with interrupts disabled after sending the
IPI.  Hmm, I do see one possible bug.  It's only safe to spin like that if
the same lock protects all such spin cases; otherwise two CPUs can each hold
a different lock with interrupts disabled and wait forever for the other to
handle its IPI.  The lazypmap code uses a different lock (lazypmap_lock
rather than smp_tlb_mtx).  You can try this patch to see if it helps any.
Kris Kennaway, you might want to try this too, on the box with the lazyfix
timeouts.
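
To make the suspected failure mode concrete, here is a toy two-CPU model.
It is not kernel code and every name in it is made up.  It assumes that a
CPU spinning to acquire a lock can still take interrupts, that a CPU
holding its lock and waiting for an acknowledgement cannot, and that both
CPUs start their shootdown at the same moment (the worst case).

```c
#include <stdbool.h>

enum cpu_state { WANT_LOCK, WAIT_ACK, DONE };

/*
 * Toy model.  lock0 and lock1 (each 0 or 1) name the lock that CPU 0 and
 * CPU 1 take; equal values mean a shared lock.  Returns true if both CPUs
 * finish within max_steps rounds, false if they are still stuck.
 */
bool
shootdown_completes(int lock0, int lock1, int max_steps)
{
        enum cpu_state st[2] = { WANT_LOCK, WANT_LOCK };
        int lock[2] = { lock0, lock1 };
        int owner[2] = { -1, -1 };              /* CPU holding each lock */
        bool pending[2] = { false, false };     /* IPI waiting at CPU i */
        bool acked[2] = { false, false };       /* CPU i's IPI was handled */

        for (int step = 0; step < max_steps; step++) {
                /* Take the lock, disable interrupts, send the IPI. */
                for (int i = 0; i < 2; i++) {
                        if (st[i] == WANT_LOCK && owner[lock[i]] == -1) {
                                owner[lock[i]] = i;
                                pending[1 - i] = true;
                                st[i] = WAIT_ACK;
                        }
                }
                /* Only CPUs with interrupts enabled handle an IPI. */
                for (int i = 0; i < 2; i++) {
                        if (st[i] != WAIT_ACK && pending[i]) {
                                pending[i] = false;
                                acked[1 - i] = true;
                        }
                }
                /* An acknowledged sender drops its lock and is done. */
                for (int i = 0; i < 2; i++) {
                        if (st[i] == WAIT_ACK && acked[i]) {
                                owner[lock[i]] = -1;
                                st[i] = DONE;
                        }
                }
                if (st[0] == DONE && st[1] == DONE)
                        return (true);
        }
        return (false);
}
```

In this model shootdown_completes(0, 0, 1000) returns true, because the
loser of the lock race keeps servicing IPIs, while
shootdown_completes(0, 1, 1000) returns false, because both CPUs end up
spinning with interrupts off.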

Index: pmap.c
===================================================================
RCS file: /usr/cvs/src/sys/i386/i386/pmap.c,v
retrieving revision 1.473
diff -u -r1.473 pmap.c

--- pmap.c      17 Jun 2004 06:16:57 -0000      1.473
+++ pmap.c      23 Jun 2004 14:39:32 -0000
@@ -1292,7 +1296,7 @@
        while ((mask = pmap->pm_active) != 0) {
                spins = 50000000;
                mask = mask & -mask;    /* Find least significant set bit */
-               mtx_lock_spin(&lazypmap_lock);
+               mtx_lock_spin(&smp_tlb_mtx);
 #ifdef PAE
                lazyptd = vtophys(pmap->pm_pdpt);
 #else
@@ -1312,7 +1316,7 @@
                                        break;
                        }
                }
-               mtx_unlock_spin(&lazypmap_lock);
+               mtx_unlock_spin(&smp_tlb_mtx);
                if (spins == 0)
                        printf("pmap_lazyfix: spun for 50000000\n");
        }
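
As an aside on the patch context, the `mask & -mask` line isolates the
lowest set bit of the CPU mask so that the loop deals with one CPU at a
time.  A minimal standalone sketch, with a made-up function name and
assuming an unsigned mask:

```c
/*
 * Hypothetical helper showing the idiom: negating an unsigned value flips
 * every bit above its lowest set bit, so the AND keeps only that bit.
 * Returns 0 when mask is 0.
 */
unsigned int
lowest_set_bit(unsigned int mask)
{
        return (mask & -mask);
}
```

For example, a mask of 0x0c (CPUs 2 and 3) reduces to 0x04 (CPU 2 alone).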


-- 
John Baldwin <jhb at FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org