Random crash and/or reboots

Jack L. Stone jackstone at sage-one.net
Sun Sep 7 11:52:46 PDT 2003


At 02:00 PM 9.7.2003 -0400, Chuck Swiger wrote:
>Jack L. Stone wrote:
>> A while back, on a couple of occasions, I posted a query about some bad
>> behavior on my mail server. For the past several months, it has been either
>> crashing and rebooting or just rebooting. It's ALWAYS triggered by an SSH
>> login, but at random and ONLY at the "su" to root. The longest stretch
>> before a reboot is usually about 2+ weeks, contrasted with 2 in a row right
>> after a reboot, so there is really no pattern. It has never happened
>> directly at the console.
>[ ... ]
>> There are no indications of anything in the logs, and no core dumps. It
>> just stops and reboots at whatever random time it picks. Only a couple of
>> times has it crashed without the remote login.
>
>These two paragraphs contradict each other, at least in part.  :-)
>
>You're seeing frequent crashes, which seem to be strongly correlated with
>logging in as root, but you've also noticed crashes "without the remote
>login", too?  You should build a debug kernel, and enable dumping the system
>to swap upon a panic ("man crash"), so that you have more information about
>the crash.
>
>> One tip was that I might have stale NFS mounttab entries. I cleared them
>> out, but the problem persisted.
>> 
>> The above tip was suggested when I mentioned that on at least a couple of
>> the occurrences, I managed to get to the console quickly enough to see (in
>> bright bold) "lockmgr locking against myself" -- or close to that. Googling
>> that error does turn up mentions of stale mounts, but mostly esoteric
>> code discussion. No fix found anywhere.
>
>Hmm.  Are you performing local mail delivery to NFS volumes?
>
>Normally (or historically, anyway), NFS locking problems cause rpc.lockd to
>crash or wedge, thus resulting in NFS locking not working and possibly grim
>results to file consistency for anything being changed by two or more
>processes at the same time.
>
>However, NFS locking problems generally do not result in a system panic.
>
>[ ... ]
>> http://sageweb/tmp/1-lsof.txt
>> http://sageweb/tmp/2-lsof.txt
>
>These URLs don't use fully-qualified hostnames.  Please try again.  :-)
>
>-- 
>-Chuck
>
Sorry about the lack of the full web addresses. Here they are:

http://www.sageweb.net/tmp/1-lsof.txt
http://www.sageweb.net/tmp/2-lsof.txt
http://www.sageweb.net/tmp/3-lsof.txt
http://www.sageweb.net/tmp/4-lsof.txt
http://www.sageweb.net/tmp/5-lsof.txt
http://www.sageweb.net/tmp/6-lsof.txt


Best regards,
Jack L. Stone,
Administrator

SageOne Net
http://www.sage-one.net
jackstone at sage-one.net


More information about the freebsd-questions mailing list