HAST with broken HDD
InterNetX - Juergen Gotteswinter
jg at internetx.com
Sun Oct 5 15:50:55 UTC 2014
On 05.10.2014 at 16:50, Dmitry Morozovsky wrote:
> On Fri, 3 Oct 2014, Mikolaj Golub wrote:
>
>> Disk errors are recorded to syslog. Also error counters are displayed
>> in `hastctl list' output. There is snmp_hast(3) in base -- a module
>> for bsnmp to retrieve these statistics via the SNMP protocol (traps are not
>> supported though).
>>
>> For notifications, hastd can be configured to execute an arbitrary
>> command on various HAST events (see the description of `exec' in
>> hast.conf(5)). Unfortunately, it does not currently have hooks for
>> I/O error events. It might be worth adding them, though. The problem
>> with this is that it may generate too many events, so some throttling
>> is needed.
>
> And, I think it should be noted, some kind of error coalescing or similar is
> needed before going from the "warning" state (there are some read errors, but
> otherwise the disk is usable, and it would be too much hassle to switch to the
> remote component completely) to the "error" state (the component is unusable
> and needs to be replaced ASAP; drop it from the HAST pair, and switch over if
> needed).
>
> An error such as "device lost" is, of course, fatal from the very beginning;
> but -- how should we interpret, well, sporadic controller resets with the disk
> coming back and catching up on syncing again?
>
>
Hi Dmitry,
Since HAST is not so different from DRBD, why not take their way of
error handling as a template? DRBD has worked well and been rock solid
for years; it is a well-established solution. HAST has the potential
to become one too, with some improvements.
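
For reference, DRBD exposes this as a policy knob: the disk section of its
configuration has an on-io-error option (pass_on, call-local-io-error,
detach). HAST has nothing like that today, so what follows is only a rough
sketch of the "warning" vs. "error" coalescing Dmitry describes, not existing
hastd code. The class name, the thresholds and the time window are all
assumptions made up for illustration. It reports only state changes, which
also gives the throttling Mikolaj asked for.

```python
from collections import deque

OK, WARNING, ERROR = "ok", "warning", "error"


class ErrorCoalescer:
    """Hypothetical policy: classify a component from its recent I/O errors."""

    def __init__(self, window=60.0, warn_after=1, fail_after=10):
        self.window = window          # seconds of error history to consider
        self.warn_after = warn_after  # errors within window -> "warning"
        self.fail_after = fail_after  # errors within window -> "error"
        self.events = deque()         # timestamps of recent non-fatal errors
        self.state = OK

    def _classify(self, now):
        # Forget errors that have aged out of the window.
        while self.events and now - self.events[0] > self.window:
            self.events.popleft()
        count = len(self.events)
        if count >= self.fail_after:
            return ERROR
        if count >= self.warn_after:
            return WARNING
        return OK

    def _transition(self, new):
        # "error" is sticky: the component stays out until an operator
        # replaces it (assumption; a real policy might allow re-adding).
        if self.state == ERROR:
            new = ERROR
        if new == self.state:
            return None               # no change, so no notification
        self.state = new
        return new

    def record(self, now, fatal=False):
        """Record one I/O error; return the new state if it changed."""
        if fatal:                     # e.g. "device lost"
            return self._transition(ERROR)
        self.events.append(now)
        return self._transition(self._classify(now))

    def tick(self, now):
        """Call periodically so a quiet disk can drop back to "ok"."""
        return self._transition(self._classify(now))
```

With a policy like this, a sporadic controller reset would count as one
non-fatal error: the component goes to "warning", and falls back to "ok" once
the window has passed without further errors. Only a burst of errors, or a
fatal one, would drop the component from the pair. Whether a reset should
count as one error or several is exactly the open question.
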
Just my 2 cents :)