Server overloaded? Or is it a bug?

Robert Watson rwatson at freebsd.org
Wed Jun 4 06:16:51 PDT 2003


On Tue, 3 Jun 2003, Daniela wrote:

> > > My server doesn't respond any more. Everything crashed.
> > > How can I find out if this is a bug or it is simply overloaded?
> >
> > Did you try looking at its console?
> > It probably has lots of pretty messages about what's going on.
> 
> Can't look at the console. It hangs completely. 

When X wedges, even if the kernel is still alive, it can be hard to regain
control of the console.  Here are some things you might try, though:

(1) Try pinging the machine from another machine on the LAN/WAN.  If it
    responds to pings, then the kernel is at least a bit alive.  You might
    then try logging in remotely to see whether the X server process is
    doing anything particularly unusual.

    If you can ping but not build a TCP connection, it could be that part
    of the kernel is starved of resources -- too many sockets open, or the
    like.

    If you can build a TCP connection but not get an SSH banner back from
    the server, then maybe you're out of processes and sshd can't fork. 

    If you can get partway through the banner but hang later, that might
    be the result of a file system deadlock of some sort.

    If you can log in, cool, it's probably an X problem and not strictly a
    FreeBSD problem.

(2) Try setting up a serial console -- there should be documentation for
    this in the handbook, but the easy steps are as follows.  Hook up the
    first serial port to a serial port on another system.  Set:

	console="comconsole"

    in /boot/loader.conf, and enable the ttyd0 line in /etc/ttys to permit
    login on the console.  Point your terminal program on the second
    machine at the serial port using 9600bps (the default for cu/tip, so
    I usually just type in "tip com1").  Make sure you see the kernel
    probe messages, get a login prompt, etc.

    When the machine appears to wedge, see if you have any interesting
    output on the serial console -- unusual errors, panic messages, etc.
    See if you can log in.
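
The network checks in step (1) can be run from the second machine as a
small script.  What follows is only a rough sketch, not a definitive
tool: the function names (ping_ok, ssh_probe, diagnose, triage), the
5-second timeouts and the use of port 22 are all assumptions made for
illustration, and the hints it prints are the same guesses given above,
not firm diagnoses.

```python
import socket
import subprocess


def ping_ok(host, timeout=5.0):
    """Return True if a single ICMP echo is answered (uses the system ping)."""
    try:
        result = subprocess.run(
            ["ping", "-c", "1", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def ssh_probe(host, port=22, timeout=5.0):
    """Return (tcp_ok, banner_ok) for the assumed SSH port."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return False, False
    with sock:
        try:
            banner = sock.recv(256)
        except OSError:  # covers a read timeout as well
            return True, False
    return True, b"SSH-" in banner


def diagnose(ping, tcp, banner):
    """Map the probe results onto the guesses from step (1)."""
    if banner:
        return ("got an SSH banner: try logging in; a hang later on might "
                "be a file system deadlock, a clean login points at X")
    if tcp:
        return "TCP connects but no banner: maybe sshd can't fork"
    if ping:
        return "pings but no TCP: kernel may be starved of resources"
    return "no reply at all: kernel may be dead, or the network path is down"


def triage(host):
    """Run the probes against host and return a one-line hint."""
    tcp, banner = ssh_probe(host)
    return diagnose(ping_ok(host), tcp, banner)
```

Calling print(triage("192.0.2.10")) with the address of the wedged
machine substituted in prints one of the four hints.  The TCP result is
checked before the ping result, so a host that filters ICMP but still
answers on the SSH port is not reported as dead.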

These checks will help clarify the problem a bit.  If it's a panic or
hang, the next steps are generally to rebuild your kernel with debugging
symbols and DDB, and to see if you can get stack traces etc. (documented
in the handbook).  Having a serial console set up makes this a *lot*
easier, since you can use the other machine to copy/paste debugging
information rather than trying to hand-transcribe it, and to get a
system core (difficult if the system is wedged in X), etc.

Hope this helps.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert at fledge.watson.org      Network Associates Laboratories




More information about the freebsd-stable mailing list