ZFS hot spares

Steve Polyack korvus at comcast.net
Tue Mar 9 18:33:29 UTC 2010


On 03/09/10 05:11, Ivan Voras wrote:
> On 03/08/10 19:06, Steve Polyack wrote:
>> ZFS in FreeBSD lacks at least one major feature from the Solaris
>> version: hot spares. There is a PR open at
>> http://www.freebsd.org/cgi/query-pr.cgi?pr=134491, but there hasn't been
>> any motion/thoughts posted on it since its creation almost one year ago.
>>
>> I'm aware that on Solaris, hot spare replacement is handled by a few
>> Solaris-specific daemons, zfs-retire and zfs-diagnose, which both plug
>> into the Solaris FMA (Fault Management Architecture). Have there been
>> any thoughts on porting these over or getting something similar running
>> within FreeBSD? With all of the recent SATA/SAS CAM hotplug work now
>> committed, it would be nice to have automatic replacement of hot spares
>> with a future hot-replacement of the failed drive.
>>
>> On the other side, I'd be interested in hearing if anyone has had
>> success in rolling their own scripted solution: i.e. something which
>> polls 'zpool status' looking for failed drives and performing hot-spare
>> replacements automatically.
>
> You don't have to exactly poll it. See /etc/devd.conf:
>
> # Sample ZFS problem reports handling.
> notify 10 {
>         match "system"          "ZFS";
>         match "type"            "zpool";
>         action "logger -p kern.err 'ZFS: failed to load zpool $pool'";
> };
>
> notify 10 {
>         match "system"          "ZFS";
>         match "type"            "vdev";
>         action "logger -p kern.err 'ZFS: vdev failure, zpool=$pool type=$type'";
> };
>
> notify 10 {
>         match "system"          "ZFS";
>         match "type"            "data";
>         action "logger -p kern.warn 'ZFS: zpool I/O failure, zpool=$pool error=$zio_err'";
> };
>
> notify 10 {
>         match "system"          "ZFS";
>         match "type"            "io";
>         action "logger -p kern.warn 'ZFS: vdev I/O failure, zpool=$pool path=$vdev_path offset=$zio_offset size=$zio_size error=$zio_err'";
> };
>
> notify 10 {
>         match "system"          "ZFS";
>         match "type"            "checksum";
>         action "logger -p kern.warn 'ZFS: checksum mismatch, zpool=$pool path=$vdev_path offset=$zio_offset size=$zio_size'";
> };
>
> I don't really know if these notifications actually work since I don't 
> have hot-plug test machines, but if they do, this looks like a decent 
> starting point.
>

Thanks for the suggestions.  I received a similar one from someone 
else.  If I get time to build a ZFS lab machine, I will certainly 
try these out and report back on how well they work.
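
For reference, the scripted approach asked about earlier in the thread 
(scan 'zpool status' for failed drives and swap in a spare) could look 
roughly like the Python sketch below.  It is untested against a real pool 
and rests on several assumptions: the set of states counted as "failed", 
the prefix heuristic for skipping group vdevs such as mirror/raidz, the 
exact layout of the 'zpool status' output, and the sample pool and device 
names (tank, ad4, ad6, ad8), which are all made up for illustration.  It 
only builds 'zpool replace <pool> <failed> <spare>' commands; it does not 
run them.

```python
import subprocess

# Assumption: leaf vdev states treated as "failed" in this sketch.
FAILED_STATES = {"FAULTED", "UNAVAIL", "REMOVED"}
# Heuristic: names with these prefixes are group vdevs, not disks.
GROUP_PREFIXES = ("mirror", "raidz", "spare", "replacing")

# Hypothetical sample of `zpool status` output, used only for illustration.
SAMPLE_STATUS = """\
  pool: tank
 state: DEGRADED
config:

	NAME        STATE     READ WRITE CKSUM
	tank        DEGRADED     0     0     0
	  mirror    DEGRADED     0     0     0
	    ad4     ONLINE       0     0     0
	    ad6     UNAVAIL      0     0     0  cannot open
	spares
	  ad8       AVAIL

errors: No known data errors
"""


def parse_zpool_status(text):
    """Return {pool: {"failed": [...], "spares": [...]}} from status text."""
    pools = {}
    pool = None
    in_config = False
    in_spares = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("pool:"):
            pool = stripped.split(":", 1)[1].strip()
            pools[pool] = {"failed": [], "spares": []}
            in_config = in_spares = False
            continue
        if pool is None:
            continue
        fields = stripped.split()
        if fields[:2] == ["NAME", "STATE"]:
            in_config = True          # device table starts after this header
            continue
        if not in_config:
            continue
        if not fields or stripped.startswith("errors:"):
            in_config = False         # blank line or errors: ends the table
            continue
        if len(fields) == 1:
            # Single-word lines are section headers (spares, logs, cache).
            in_spares = fields[0] == "spares"
            continue
        name, state = fields[0], fields[1]
        if in_spares:
            if state == "AVAIL":
                pools[pool]["spares"].append(name)
        elif (state in FAILED_STATES and name != pool
              and not name.startswith(GROUP_PREFIXES)):
            pools[pool]["failed"].append(name)
    return pools


def replacement_commands(pools):
    """Pair each failed device with an available spare, one spare per device."""
    cmds = []
    for pool, info in pools.items():
        for failed, spare in zip(info["failed"], info["spares"]):
            cmds.append(["zpool", "replace", pool, failed, spare])
    return cmds


def plan_from_system():
    """Dry run: read the live `zpool status` and return the planned commands."""
    out = subprocess.run(["zpool", "status"], capture_output=True,
                         text=True, check=True).stdout
    return replacement_commands(parse_zpool_status(out))
```

Calling replacement_commands(parse_zpool_status(SAMPLE_STATUS)) on the 
sample above pairs ad6 with ad8.  On a real machine, plan_from_system() 
could be run from cron, or from a devd action in place of the logger 
calls in the quoted devd.conf, if those notifications turn out to fire 
reliably.  Actually executing the returned commands is left out on 
purpose.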


