Redundant/failover NFS servers - stale NFS file handle

Mon Aug 14 15:55:53 UTC 2006

Attila Nagy <bra at fsn.hu> wrote:
 > I would like to run diskless clients (they are actually servers) from 
 > FreeBSD, but I don't like having a SPoF at the NFS server level and 
 > don't want to use expensive out of the box solutions, like a NAS with a 
 > SAN behind it.

We use NetApp Filer clusters (NAS) for that purpose.
They aren't cheap, but they work very well.

 > So in theory, having two FreeBSD boxes, both with the NFS service on a 
 > CARP-based virtual IP would perfectly fit my needs.
 > 
 > The only problem is that NFS encodes some information in the 
 > filehandles, so when I'm doing a failover with the NFS clients (bringing 
 > the carp interface down on the master server), I get "Stale NFS file 
 > handle".

That's to be expected.

NFS file handles are based on the inode number.  That means
if you want to have a fail-over that's transparent for the
client, your NFS servers would need to have the same inode
numbers for their files.  Normally, the only way to achieve
that is to duplicate the file system from the master to the
slaves using dd(1).
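
As a small illustration of why a file-level copy (cp, rsync, tar)
cannot stand in for dd(1) here: such a copy gets a newly allocated
inode.  The Python sketch below uses only the standard library, and
its file names are hypothetical.  It shows the effect on a single
file system and makes no claim about how an NFS server builds its
handles beyond what is stated above.

```python
import os
import shutil
import tempfile

def inode_of(path):
    """Return the inode number the file system assigned to path."""
    return os.stat(path).st_ino

def copy_and_compare(directory):
    """Create a file, copy it at file level, return both inode numbers."""
    original = os.path.join(directory, "original.txt")    # hypothetical name
    duplicate = os.path.join(directory, "duplicate.txt")  # hypothetical name
    with open(original, "w") as f:
        f.write("exported data\n")
    # copy2 duplicates contents and metadata, but the copy is a new file
    # and therefore receives its own inode.
    shutil.copy2(original, duplicate)
    return inode_of(original), inode_of(duplicate)

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        ino_a, ino_b = copy_and_compare(tmp)
        print("original inode: ", ino_a)
        print("duplicate inode:", ino_b)
        print("same inode?", ino_a == ino_b)
```

Both files exist on the same file system at the same time, so their
inode numbers must differ.  A file-level copy to another server has
the same limitation: the target file system allocates inodes on its
own, so matching numbers would be coincidence at best.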

However, dd(1) has several drawbacks:  First, it takes a
long time, because it copies everything, including blocks
that aren't allocated at all.  Second (and most important),
if you copy a live file system (i.e. one that is mounted
read+write) with dd(1), the copy won't be in a consistent
state and will at the very least need to be fsck(8)ed (and
if you're unlucky, even fsck(8) won't be able to fix it).
One solution to the latter problem might be to take a
snapshot of the file system (see mksnap_ffs(8)) and then
copy that snapshot (which is read-only) with dd(1).
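
What dd(1) does in the procedure above is, in essence, a raw
block-by-block copy.  A minimal Python sketch of that idea follows.
The paths and the block size are assumptions chosen for illustration.
It is not a replacement for dd(1) and does no error recovery.

```python
import os
import tempfile

def raw_copy(src_path, dst_path, block_size=64 * 1024):
    """Copy src to dst block by block; return the number of bytes copied.

    Every block is copied, whether or not the file system inside the
    image actually uses it, which is why this approach is slow on
    large, mostly empty file systems.
    """
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:
                break
            dst.write(block)
            copied += len(block)
    return copied

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        image = os.path.join(tmp, "snapshot.img")    # hypothetical stand-in for a snapshot
        replica = os.path.join(tmp, "replica.img")   # hypothetical stand-in for the slave's disk
        with open(image, "wb") as f:
            f.write(b"\0" * 200_000 + b"data")       # mostly "unallocated" zeros
        print(raw_copy(image, replica), "bytes copied")
```

Because the copy is byte-for-byte, the on-disk layout of the source,
including its inode numbers, is reproduced exactly.  The price is
that the zero-filled region is transferred just like real data.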

If the contents of the NFS exported file system don't
change very often, that might be a workable solution.

There's another possibility, but I haven't tried it for
myself, so it's just theory.  :-)   You can try to put
geom_mirror (see gmirror(8)) on top of geom_gate (see
ggated(8), ggatec(8)).  Then you will have a RAID1 with
one component local and the other component remote.
However, I think it only works reliably in read-only
mode.

 > As for the client side, Solaris has the capability of doing NFS client 
 > failover (reported to have some problems, but for now I would have only 
 > FreeBSD clients), and AMD has multiple server support, but I don't know 
 > how does that work with FreeBSD diskless boots yet. (root FS on NFS)

I don't know if this is an option for you, but you can
also put a minimal root file system into the kernel
(md file system), just sufficient to get networking +
AMD running, and mount everything else via NFS.  Another
possibility is to install a CompactFlash card in each
machine to boot from (it could be read-only).  CF cards and
CF-IDE/ATA adapters are fairly cheap nowadays.  512 MB
cards are about 20 Euros over here, and that's more
than enough to contain a root FS and even a bit more
for convenience.

Best regards
   Oliver

-- 
Oliver Fromme,  secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing
Dienstleistungen mit Schwerpunkt FreeBSD: http://www.secnetix.de/bsd
Any opinions expressed in this message may be personal to the author
and may not necessarily reflect the opinions of secnetix in any way.

"That's what I love about GUIs: They make simple tasks easier,
and complex tasks impossible."
        -- John William Chambless