variable hang when starting APs on Westmere processors

Mike Karels mike at karels.net
Mon May 2 21:29:07 UTC 2011


Looks like freebsd-smp is gone... not sure of the right target for this.

I just picked up a problem from another developer at work who had the good
fortune to have scheduled a vacation this week.  The short description is
that the start_ap() routine sometimes hangs for anywhere from 10 minutes
to 3 hours while starting up CPUs.  This is with a much-modified system based on
FreeBSD 7.2.  A stock 8.2 CD hangs at the same spot almost all the time,
although the code in the two versions appears identical.

More details:  This is amd64, using an Intel S5520HCR 2-socket motherboard
with two Xeon X5660 2.8GHz Westmere hex-core CPUs.  The problem happens
somewhat less often with two Xeon E5620 quad-core 2.4GHz CPUs.  The hang
seems to happen with higher-numbered CPUs, so the hex-core with SMT has more
chances to hit the problem.

We added KTRs to the code, and found that the hang happens in the
lapic_ipi_wait() call after de-asserting RESET.

Of course, Linux doesn't exhibit the problem.

Has anyone else seen a problem like this?  Any ideas how to fix it, or
debug further?

Please copy me on responses; I'm not subscribed to this list currently.

		Mike

