Multi processor locking problem under 7.0
jhb at freebsd.org
Tue Jan 29 16:06:30 PST 2008
On Tuesday 29 January 2008 03:26:44 pm Paul wrote:
> >I have several systems of two different types running 7.0. One is an IBM
> >3550 and the other a Dell 2950. The IBMs more than the Dells
> >consistently seem to have a kernel locking problem during dump.
> >Specifically, if I execute this command:
> > dump 0uaLCf 64 /dev/null /usr
> >Dump consistently stops in Phase IV. However, if I set
> >machdep.hlt_logical_cpus=1, dump does not stop. At the end of this
> >message is my boot information.
> >When machdep.hlt_logical_cpus=0, the following is typical of what top
> >displays when dump stops:
> >  PID USERNAME THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
> >  926 root       1   4    0 75476K 71744K sbwait 0   0:04  0.00% dump
> >  928 root       1  20    0 75348K 67740K pause  1   0:02  0.00% dump
> >  929 root       1  20    0 75348K 67740K pause  1   0:02  0.00% dump
> >  927 root       1  20    0 75348K 67740K pause  1   0:02  0.00% dump
> >  919 root       1   8    0 75348K 67144K wait   0   0:00  0.00% dump
> >Fooling around a bit I have found that if I truss dump, the dump
> >continues. On the Dells, if I force disk activity during the dump, such
> >as executing a ls -lR /usr > /dev/null, the dump finishes.
> >I am unsure how to proceed in debugging this problem. It has been around
> >for a while but I am now installing the IBMs and the dump problem is a
> >non-starter. Please contact me directly on how to proceed.
> I have noticed something similar on my Intel test box.
> When compiling many ports from an updated tree on 7.0RC1, on an
> S5000pal with 2 Quadcore Xeons, the process just STOPS. I am using
> the install disk and have not updated to the latest cvsup release yet
> (I am trying to make the world now with fingers crossed :) ). I tried
> it with just one quadcore and the same problem happens.
> There are no errors on the screen but it no longer proceeds with the
> port build. When I suspend the process and restart the make in the
> same session it has no problem getting past this impasse and with a
> few suspends the make finishes without error. It does not happen
> every time which is very odd.
> Based on your description above it seems like it may be the same problem.
> What do you think?
If you have threads blocked on "vmo_de" then upgrade to the latest RELENG_7 or
RELENG_7_0 (specifically the sys/kern/subr_sleepqueue.c file) and try again.
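
One way to check for such threads is to look at the wait channel of each
thread, which is what top shows in its STATE column. The following is only a
rough sketch: it assumes FreeBSD ps(1) with -H (show threads) and the "wchan"
output keyword, and blocked_on is a hypothetical helper name used for
illustration, not an existing tool.

```sh
# Sketch: print the lines of ps-style output whose second column (the
# wait channel) equals the given name. Expects input shaped like
# "ps -axH -o pid,wchan,command", i.e. a header line followed by
# "PID WCHAN COMMAND" rows.
# blocked_on is a hypothetical helper, named here for illustration only.
blocked_on() {
    # $1 = wait channel to look for, e.g. vmo_de
    awk -v chan="$1" 'NR > 1 && $2 == chan { print }'
}
```

It would be used as "ps -axH -o pid,wchan,command | blocked_on vmo_de" while
the dump or port build is stalled. No output means no thread was sleeping on
that channel at that moment, so it is worth repeating the check a few times.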