Redundant/failover NFS servers - stale NFS file handle

Brian Candler B.Candler at pobox.com
Tue Aug 15 12:25:43 UTC 2006


On Mon, Aug 14, 2006 at 08:43:28PM +0200, Attila Nagy wrote:
> >We use NetApp Filer clusters (NAS) for that purpose.
> >They aren't cheap, but they work very well.
> I don't like blackboxes with nice GUIs. :)

They have a command-line interface too :) Seriously, these are really
excellent devices.

> >There's another possibility, but I haven't tried it for
> >myself, so it's just theory.  :-)   You can try to put
> >geom_mirror (see gmirror(8)) on top of geom_gate (see
> >ggated(8), ggatec(8)).  Then you will have a RAID1 with
> >one component local and the other component remote.
> >However, I think it only works reliably in read-only
> >mode.
> Yes, both of them must be read-only. Several years ago I used a
> similar setup, but with a shared SCSI disk.
> Read-only on the client side is OK for me, but it is hard to maintain on
> the server side.
> I guess it would be possible to do this RW, mounted only on the master
> and, if it fails, remounted (fscked, etc.) on the slave, but I consider
> that a little bit hackish.

The filesystems would have to be mounted RO on both NFS servers; in other
words, they could hold only static content.

This is because both boxes have local caches. If an update were to occur on
box 1, and be propagated to box 2 via ggated, then even if box 2 has the
filesystem mounted RO it will then have stale data in its local caches of
disk blocks and inodes, because the blocks on disk have changed under its
feet. At best, the wrong data will be served. At worst, the whole filesystem
will crash due to inconsistencies.
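
To illustrate the caching problem, here is a toy model in Python. It is only
a sketch: SharedDisk and Box are made-up names, and a real kernel buffer
cache is far more involved. The assumption is simply that each box caches
blocks privately and nothing tells box 2 when box 1 writes.

```python
class SharedDisk:
    """Toy block store seen by both boxes (stands in for the mirrored disk)."""

    def __init__(self):
        self.blocks = {}


class Box:
    """Toy NFS server with a private block cache and no invalidation."""

    def __init__(self, name, disk):
        self.name = name
        self.disk = disk
        self.cache = {}

    def read(self, blockno):
        # Serve from the local cache if the block was read before.
        if blockno not in self.cache:
            self.cache[blockno] = self.disk.blocks.get(blockno)
        return self.cache[blockno]

    def write(self, blockno, data):
        # The writer updates its own cache and the disk, nothing else.
        self.cache[blockno] = data
        self.disk.blocks[blockno] = data

    def remount(self):
        # Unmounting and mounting again discards the cached blocks.
        self.cache.clear()


disk = SharedDisk()
disk.blocks[7] = "old contents"
box1 = Box("box1", disk)
box2 = Box("box2", disk)

box2.read(7)                      # box 2 caches block 7
box1.write(7, "new contents")     # update made on box 1
stale = box2.read(7)              # box 2 still answers from its cache
box2.remount()
fresh = box2.read(7)              # after a remount it sees the new block

print("before remount:", stale)
print("after remount: ", fresh)
```

In this model box 2 keeps serving the old block until its cache is thrown
away, which is the "wrong data" case. The crash case (inconsistent metadata)
is not modelled.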

So to make an update, you would have to unmount from box 2, remount RW on
box 1, make the change, remount RO on box 1, and mount RO again on box 2.
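
The ordering of those steps matters, so here is a small Python sketch that
checks it. Everything in it (Server, SharedFilesystem, update) is
hypothetical and it runs no real mount commands; it only enforces the rule
that a box may go RW while the other box has the filesystem unmounted.

```python
class Server:
    """One NFS server; mode is "ro", "rw", or None when unmounted."""

    def __init__(self, name, mode="ro"):
        self.name = name
        self.mode = mode


class SharedFilesystem:
    """Toy model of the filesystem as seen by the two boxes."""

    def __init__(self):
        self.box1 = Server("box1")
        self.box2 = Server("box2")
        self.content = "v1"
        self.log = []

    def _other(self, server):
        return self.box2 if server is self.box1 else self.box1

    def set_mode(self, server, mode):
        other = self._other(server)
        # Refuse RW while the other box could still be caching blocks.
        if mode == "rw" and other.mode is not None:
            raise RuntimeError(other.name + " still has the filesystem mounted")
        server.mode = mode
        action = "unmount" if mode is None else "mount " + mode
        self.log.append(server.name + ": " + action)

    def write(self, server, content):
        if server.mode != "rw":
            raise RuntimeError(server.name + " is not mounted RW")
        self.content = content
        self.log.append(server.name + ": write")


def update(fs, new_content):
    """Apply the five steps in the order described above."""
    fs.set_mode(fs.box2, None)      # unmount from box 2
    fs.set_mode(fs.box1, "rw")      # remount RW on box 1
    fs.write(fs.box1, new_content)  # make the change
    fs.set_mode(fs.box1, "ro")      # remount RO on box 1
    fs.set_mode(fs.box2, "ro")      # mount RO again on box 2


fs = SharedFilesystem()
update(fs, "v2")
print("\n".join(fs.log))
```

Skipping the first step makes set_mode() raise, which is the point: while the
update is in progress, box 2 is not serving anything at all.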

> I can solve this problem with Linux

How?

> Of course what is really needed here is a cluster filesystem, or an NFS 
> server/file system which can solve this problem at its level.

Indeed. This was discussed at some length before, and the same answers were
given.

Regards,

Brian.

