PoC: ZFS fail-over with HAST + carp(4) + devd
Freddie Cash
fjwcash at gmail.com
Wed Mar 3 17:13:39 UTC 2010
[Not sure if this should go to just fs@ or possibly current@ as well. I'll
start with just fs@.]
Thought I'd pass this along. It's a proof-of-concept setup I've been using
to test HAST fail-over of a ZFS pool, using devd and carp(4). The original
impetus for doing this was that ucarp doesn't work (for me) within a
VirtualBox VM; it just hangs the VM. I also prefer to use FreeBSD base tools
whenever possible, so I thought I'd try to get it to work with carp(4).
I know this isn't perfect, as it (currently) relies on a "magic constant" and
doesn't cover all the possible failure modes, but I thought I'd pass it along
to get your input, comments, criticisms, suggestions, etc. With a bit more
work, it could be generalised to, for example, pull the resources
list from /etc/hast.conf, and to work with non-ZFS setups. Perhaps someday
it could be useful as an example in the HAST samples/ directory?
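As a rough sketch of the hast.conf idea (not part of the setup I tested): the function below prints the name of every "resource <name> {" block in a config file. The function name hast_resources is made up for illustration, and it assumes each resource keyword starts its own line, which is how hast.conf is normally written.

```sh
#!/bin/sh
# hast_resources FILE
# Hypothetical helper: print one HAST resource name per line, taken from
# lines of the form "resource <name> {" in FILE.
hast_resources() {
    awk '$1 == "resource" {
        name = $2
        sub(/\{$/, "", name)    # tolerate "resource disk01{"
        if (name != "") print name
    }' "$1"
}
```

In the script this could replace the hard-coded list with something like resources="$(hast_resources /etc/hast.conf)". The for loops split on newlines as well as spaces, although the logger messages would then contain the names one per line.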
With this setup, I can pull the plug on carp0 on the master node, and the
hast devices and ZFS pool fail over to the slave. And if I pull the plug on
carp0 on the slave, everything fails over to the master again. And it works
nicely with carp preempt enabled on the master node.
Add the following stanzas to /etc/devd.conf:
notify 10 {
    match "system" "IFNET";
    match "subsystem" "carp0";
    match "type" "LINK_UP";
    action "/usr/local/bin/carp-hast-switch master";
};
notify 10 {
    match "system" "IFNET";
    match "subsystem" "carp0";
    match "type" "LINK_DOWN";
    action "/usr/local/bin/carp-hast-switch slave";
};
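The two stanzas differ only in the event type and the role argument, so the mapping could live in the script instead. The sketch below is untested and rests on two assumptions: that devd treats match values as regular expressions, and that it expands $subsystem and $type in the action string, as the stock devd.conf does for $subsystem. The function name carp_role_for_event is hypothetical.

```sh
#!/bin/sh
# Assumed single devd.conf stanza that would feed this (untested):
#   notify 10 {
#       match "system" "IFNET";
#       match "subsystem" "carp0";
#       match "type" "(LINK_UP|LINK_DOWN)";
#       action "/usr/local/bin/carp-hast-switch $type";
#   };

# carp_role_for_event TYPE
# Hypothetical helper: translate a devd IFNET event type into the role
# argument used by carp-hast-switch. Returns non-zero for anything else.
carp_role_for_event() {
    case "$1" in
    LINK_UP)   echo "master" ;;
    LINK_DOWN) echo "slave" ;;
    *)         return 1 ;;
    esac
}
```

The script would then start with something like role="$(carp_role_for_event "$1")" || exit 1 and switch on ${role}.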
Contents of /usr/local/bin/carp-hast-switch:
#!/bin/sh
# The names of the HAST resources, as listed in hast.conf
resources="disk01 disk02 disk03 disk04"
# The name of the ZFS pool built on top of HAST resources
pool="hapool"
case "$1" in
master)
    logger -p local0.debug -t hast "Switching to primary provider for ${resources}."
    sleep 30
    # Wait for any "hastd secondary" processes to stop
    for disk in ${resources}; do
        while pgrep -lf "hastd: ${disk} \(secondary\)" > /dev/null 2>&1; do
            sleep 1
        done
        # Switch role for each disk
        hastctl role primary ${disk}
        if [ $? -ne 0 ]; then
            logger -p local0.debug -t hast "Unable to change role to primary for resource ${disk}."
            exit 1
        fi
    done
    # Wait for the /dev/hast/* devices to appear
    for disk in ${resources}; do
        for I in $( jot 60 ); do
            [ -c "/dev/hast/${disk}" ] && break
            sleep 0.5
        done
        if [ ! -c "/dev/hast/${disk}" ]; then
            logger -p local0.debug -t hast "GEOM provider /dev/hast/${disk} did not appear."
            exit 1
        fi
    done
    logger -p local0.debug -t hast "Role for HAST resources ${resources} switched to primary."
    # Import the ZFS pool; has to be done forcibly due to hostid issues
    zpool import -f -d /dev/hast ${pool} 2>&1
    if [ $? -ne 0 ]; then
        logger -p local0.debug -t hast "ZFS pool import for ${pool} failed."
        exit 1
    fi
    logger -p local0.debug -t hast "ZFS pool ${pool} imported."
    ;;
slave)
    logger -p local0.debug -t hast "Switching to secondary provider for ${resources}."
    # Export the ZFS pool; has to be done forcibly in case the
    # hast resources have already switched
    zpool export -f ${pool} 2>&1
    if [ $? -ne 0 ]; then
        logger -p local0.debug -t hast "Unable to export the pool ${pool}."
        exit 1
    fi
    # Switch roles for the HAST resources
    for disk in ${resources}; do
        hastctl role secondary ${disk} 2>&1
        if [ $? -ne 0 ]; then
            logger -p local0.debug -t hast "Unable to switch role to secondary for resource ${disk}."
            exit 1
        fi
        logger -p local0.debug -t hast "Role switched to secondary for resource ${disk}."
    done
    ;;
esac
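The waits in the master branch (the fixed "sleep 30", which I take to be the magic constant, and the jot loop) could be folded into one bounded polling helper. This is only a sketch; the name wait_until is made up, and it polls once per second rather than every half second.

```sh
#!/bin/sh
# wait_until TIMEOUT_SECONDS COMMAND [ARGS...]
# Hypothetical helper: run COMMAND once per second until it succeeds.
# Returns 0 as soon as COMMAND succeeds, or 1 once TIMEOUT_SECONDS
# seconds of waiting have passed without success.
wait_until() {
    _timeout="$1"
    shift
    _elapsed=0
    while ! "$@" > /dev/null 2>&1; do
        if [ "${_elapsed}" -ge "${_timeout}" ]; then
            return 1
        fi
        sleep 1
        _elapsed=$(( _elapsed + 1 ))
    done
    return 0
}
```

For example, the device wait would become: wait_until 30 test -c "/dev/hast/${disk}" followed by the same logger/exit handling on failure. Whether the initial 30 second delay can be replaced by a check like this depends on what it is actually papering over, which I haven't pinned down.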
--
Freddie Cash
fjwcash at gmail.com