Cannot build kernel with options WITNESS

Robert Watson rwatson at freebsd.org
Sat Jan 22 14:29:01 PST 2005


On Sun, 23 Jan 2005, Artem Kuchin wrote:

> > On Sat, 22 Jan 2005, Artem Kuchin wrote:
> > 
> >> I cvssed just an hour ago to 5.3-STABLE and cannot build a kernel with
> >> WITNESS. It complains:
> > 
> > This occurs when building WITNESS without DDB in the kernel, which was not
> > a tested build case when I added "show alllocks", and apparently is a
> > relatively uncommon configuration as you're the first person to bump into
> > it.  I've just committed the fix as subr_witness.c:1.187 in HEAD, and
> > subr_witness.c:1.178.2.4 in RELENG_5.  Please let me know if this doesn't
> > fix the problem for you.
> 
> It fixed the problem. I am actually struggling with unpredictable weird
> lock-ups, when the host can be pinged but I cannot connect via
> ssh/telnet or httpd or anything else. It happens without any visible
> reason. I am running several jails with mysql and apache in each and
> cannot make the whole system stable yet.

This is typically a sign of one of two problems:

- The system is livelocked due to very high load, so the ithreads,
  netisrs, etc., in the kernel run fine, but user processes don't get a
  chance to run.

- The system is deadlocked due to user space processes getting wedged on
  common locks, but the kernel ithreads and netisrs can keep on
  responding.

I generally assume that it's a deadlock as opposed to a livelock.  I'd
compile a kernel with DDB, KDB, WITNESS, and BREAK_TO_DEBUGGER.  When the
system appears to wedge, break into the debugger using a console or serial
break (FYI: serial break is more reliable, and you get the benefit of
being able to easily copy and paste debugging output using the serial
console for DDB).  Use "show alllocks" and "show lockedvnods" to examine
most of the system's locking state.  Chances are, all the
interesting processes are stacked up on VFS or VM locks, since those kinds
of deadlocks produce the exact symptoms you describe: ping works fine
because it only hits the netisr, but when you open TCP connections, the
sshd (etc) block on VM or VFS locks attempting to fork new children or
access a file in the file system name space.  At first, the TCP
connections will establish but there will be no application data; after a
bit, they will not even return a SYN/ACK because the listen queue for the
listen socket has filled.
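
As a rough sketch of the above, the kernel config additions would look
something like the following.  This assumes a custom config file copied
from GENERIC (the name MYKERNEL is only a placeholder), and only lists the
four options mentioned here; adjust to taste:

```
# Additions to a hypothetical /usr/src/sys/i386/conf/MYKERNEL
options 	KDB			# kernel debugger framework
options 	DDB			# interactive in-kernel debugger backend
options 	WITNESS			# lock order verification and lock tracking
options 	BREAK_TO_DEBUGGER	# console/serial break drops into the debugger
```

Once the system wedges and you have broken into the debugger, the two
commands to start with are:

```
db> show alllocks
db> show lockedvnods
```

Capture the full output of both over the serial console, since the list of
held locks and locked vnodes is what identifies where the processes are
stacked up.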

Robert N M Watson




