Constant rebooting after power loss
gpalmer at freebsd.org
Sat Apr 2 18:43:33 UTC 2011
On Sat, Apr 02, 2011 at 12:55:15PM -0400, David Magda wrote:
> On Apr 1, 2011, at 23:35, Matthew Dillon wrote:
> > The solution to this first item is for the OS/filesystem to issue a
> > disk flush command to the drive at appropriate times. If I recall, the
> > ZFS implementation in FreeBSD *DOES* do this for transaction groups,
> > which guarantees that a prior transaction group is fully synced before
> > a new one starts running (HAMMER in DragonFly also does this).
> > (Just getting an 'ack' from the write transaction over the SATA bus only
> > means the data made it to the drive's cache, not that it made it to
> > the platter).
> It should also be noted that some drives ignore or lie about these flush
> commands: i.e., they say they flushed the buffers but did not in fact do
> so. This is sometimes done on cheap SATA drives, but also on expensive
> SANs. In the former case it's often to help with benchmark numbers. In
> the latter case it's usually okay because the buffers are actually NVRAM,
> and so are safe across power cycles. There are also some USB-to-SATA
> chipsets that don't handle flush commands and simply ACK them without
> passing them to the drive, so yanking a drive can cause problems.
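To make the quoted point about flushing concrete, here is a minimal sketch of how an application can ask the OS to push a write to stable storage. The function name `durable_write` is made up for illustration. Whether `fsync` actually results in a cache flush command reaching the drive depends on the OS, the filesystem, and (as noted above) whether the drive or bridge chipset honours it, so treat this as a request rather than a guarantee.

```python
import os


def durable_write(path: str, data: bytes) -> None:
    """Write data to path and request that it reach stable storage.

    Hypothetical helper for illustration only. A successful return means
    the OS accepted the sync request, not that the platters were updated.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested; loop until done.
            written = os.write(fd, view)
            view = view[written:]
        # Ask the OS to flush the file's data and metadata.
        os.fsync(fd)
    finally:
        os.close(fd)

    if os.name == "posix":
        # Also sync the containing directory so the new directory entry
        # is covered by the request.
        dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
```

An acknowledged `fsync` here is in the same position as the 'ack' over the SATA bus: it tells you what the layer below claims, nothing more.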
SANs are *theoretically* safer because of their battery-backed caches, but
that is not guaranteed: I've seen an array controller crash and royally
screw the data sets as a result, even when the cache was allegedly mirrored
to the redundant controller in the array.

NVRAM/battery-backed cache protects against certain failures but introduces
other failure modes in their place. You have to do your own risk/benefit
analysis before deciding which is the best solution for your usage scenario.

As long as data is "in transit" to permanent storage, it's at risk. All the
disk redundancy and battery-backed caches in the world are no replacement
for a comprehensive *and regularly tested* backup strategy.
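
One small piece of "regularly tested" can be sketched as follows: restore a file from backup to a scratch location and check that it matches the original byte for byte. This is only an illustration under assumed names (`sha256_of` and `restore_matches` are hypothetical helpers), and a real restore test would cover whole data sets and the restore procedure itself.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_matches(original: str, restored: str) -> bool:
    """True if the restored copy has the same contents as the original."""
    return sha256_of(original) == sha256_of(restored)
```

A mismatch means the backup or the restore path is broken, which is far better discovered during a test than after a power loss.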
More information about the freebsd-stable mailing list