Reproducible, possibly NFS related, fatal double fault in
chris# at 1command.com
Fri Oct 19 01:13:30 PDT 2007
Quoting Kris Kennaway <kris at freebsd.org>:
> Clifton Royston wrote:
>> On Tue, Oct 16, 2007 at 01:01:46PM -0700, Chris H. wrote:
>>> excerpt from this list titled: NFS == lock && reboot, that I posted
>>> # uname -a
>>> FreeBSD host.domain.tld 6.2-RELEASE FreeBSD 6.2-RELEASE #0: Fri Jan
>>> 26 16:27:14 PST 2007
>>> Does anyone know when NFS and friends will be working again? I
>>> haven't been able to /safely/ use it from 4.8 on. I remember some
>>> talk on the list some time ago, and then it seemed to be resolved,
>>> as the discussion ended. So I thought it was fixed. Seems not. :(
>>> My scenario;
>>> mount host off root:
>>> mount script exec'd follows...
>>> #!/bin/sh -
>>> mount -t nfs host.domain.tld:/ /host
>>> mount -t nfs host.domain.tld:/var /host/var
>>> confirm mount...
>>> # ls /host
>>> .snap COPYRIGHT bin
>>> usr var tmp
>>> OK looks good...
>>> # cp /path/to/approx/10Mb/file /host/path/to/dest/dir/
>>> Fatal double fault
>>> eis 0x0blah
>>> eiblah blah0x
>>> panic double fault
>>> no dump device defined
>>> rebooting in 15sec...
>>> Hmmm... that's not good. :(
>>> My final solution was to change the lines in /etc/rc.conf
>>> Making those changes ended the "Fatal double fault && reboot in 15
>> Thanks for this very timely mention! The cluster of servers I am
>> about to upgrade from 4.8 <embarrassed cough> to 6.2 relies heavily on
>> NFS to an old Netapp. If I have got to disable rpc_lockd and
>> rpc_statd, it's good to know that now!
>> Can I ask, can anybody confirm that they're running 6.2 on NFS
>> successfully *with* lockd and statd?
> Er, yes, of course it does. The old message he is quoting is bogus
> on its own,
While I'll grant you that I haven't *yet* found/taken the time to create a
dump device and re-enable rpc_lockd && rpc_statd && cp 10Mb file to mount
point to produce an *instantaneous* "Fatal double fault", I don't think it's
fair to label my original post entirely /bogus/ - especially in light of
the recent post I replied to, which seems to share some very common ground.
I should probably mention that since my last posting (my original thread),
I have some 20+ RELENG_6_2 boxen that *do* have rpc_lockd + rpc_statd
enabled. Yet none of them produce a "Fatal double fault". They are all
Tyan SMP boards with dual onboard fxp's - as opposed to the Nvidia UP
board, which has a single onboard nve. They are all inter-connected via NFS.
I have a 750GB drive hanging off the /problematic/ Nvidia board, that I
had intended to use for NFS backups. But given the NFS issue I had with
it, it didn't seem to be the best solution. If anyone felt like throwing
me a "cheat sheet" for creating a dump device out of that drive and a
"quickie" for producing a backtrace, I'm sure I'd be better able to find
the time to produce the required information. I'm sorry. It's just that
I'm a hundred million miles away from that right now, as I've been
building several large web applications, and their deadline is fast
approaching. FWIW I bounced all the servers today, and therefore have
recent /verbose/ dmesg's, should any of the information they provide be
of any help/use to anyone.
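
A rough sketch of the kind of "cheat sheet" asked for above, assuming a
stock FreeBSD 6.x box. The device name /dev/ad4s1b is hypothetical (use a
swap partition at least as large as RAM), and the kernel.debug path is an
assumption that depends on the kernel config name and on the kernel having
been built with debug symbols. The commands are left as comments, since
they need root and a real crash dump to do anything useful:

```shell
# /etc/rc.conf fragment: name a dump device so a panic can save a core.
# /dev/ad4s1b is a hypothetical swap partition, not taken from the post.
dumpdev="/dev/ad4s1b"
dumpdir="/var/crash"

# To enable the dump device without rebooting (as root):
#   dumpon -v "$dumpdev"
# After the next panic and reboot, the rc scripts normally run savecore(8)
# when dumpdev is set; to extract the dump by hand instead:
#   savecore "$dumpdir" "$dumpdev"
# Then open the dump against a kernel built with debug symbols
# (path below is an assumption) and ask for a backtrace:
#   kgdb /usr/obj/usr/src/sys/GENERIC/kernel.debug "$dumpdir/vmcore.0"
#   (kgdb) bt
```

Without a dump device configured, the panic only prints "no dump device
defined", as in the quoted report, and nothing is saved for a backtrace.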
Take care. :)
> I don't know if he ever was able to provide meaningful traces but it
> may well be nve as in the upthread discussion.
panic: kernel trap (ignored)