Constant rebooting after power loss

Matthew Dillon dillon at apollo.backplane.com
Sat Apr 2 18:57:21 UTC 2011


:It should also be noted that some drives ignore or lie about these flush commands: i.e., they say they flushed the buffers but did not in fact do so. This is sometimes done on cheap SATA drives, but also on expensive SANs. In the former case it's often to help with benchmark numbers. In the latter case, it's usually okay because the buffers are actually NVRAM, and so are safe across power cycles. There are also some USB-to-SATA chipsets that don't handle flush commands and simply ACK them without passing them to the drive, so yanking a drive can cause problems.
:
:There has been quite a bit of discussion on the zfs-discuss list on this topic over the years, especially when it comes to (consumer) SSDs.

    Remember also that numerous ZFS studies have been debunked in recent
    years, though I agree with the idea that going that extra mile requires
    not trusting anything.  In many respects ZFS's biggest enemy now is
    bugs in ZFS itself (or the OS it runs under), and not so much glitches
    in the underlying storage framework.

    I am unaware of *ANY* mainstream hard drive or SSD made in the last
    10 years which ignores the disk flush command.  In previous decades HD
    vendors played games with caching all the time, but there are fewer
    HD vendors now and they all compete heavily with each other... they
    don't play those games any more for fear of losing their reputation.
    There is very little vendor loyalty in the hard drive business.
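    To be concrete about what "the disk flush command" means from the
    software side: the drive-level command is ATA FLUSH CACHE (or SCSI
    SYNCHRONIZE CACHE), and the usual way userland reaches it is
    fsync(), which a filesystem that cares about durability turns into
    a flush of the drive's cache (BIO_FLUSH in the BSD storage layer).
    A minimal sketch, using a placeholder filename... note that a
    lying drive still returns success here and then loses the data on
    power cut, which is the whole problem:

        #include <fcntl.h>
        #include <unistd.h>

        int
        main(void)
        {
            const char buf[] = "committed\n";
            int fd = open("testfile", O_WRONLY | O_CREAT | O_TRUNC, 0644);

            if (fd < 0)
                return 1;
            if (write(fd, buf, sizeof(buf) - 1) !=
                (ssize_t)(sizeof(buf) - 1))
                return 1;

            /*
             * fsync() is what becomes a disk cache flush.  If the
             * drive honors the command, the data is on the media when
             * this returns 0.  If the drive lies, the return value
             * proves nothing across a power cycle.
             */
            if (fsync(fd) != 0)
                return 1;
            return close(fd) != 0;
        }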

    When it comes to SSDs there are all sorts of fringe vendors, and I
    certainly would not trust any of those, but if you stick to
    well-known vendors like Intel or OCZ it will work.  Look for whose
    chipsets are under the hood more than for whose name is slapped onto
    the SSD, and get as close to the source as you can.

    Most current-day disk flush command issues are at a higher level.  For
    example, numerous VMs ignore the command (don't even bother to fsync()
    the underlying block devices or files!).  There isn't anything you can
    do about a VM other than complain about it to the vendor.  I've been hit
    by precisely this issue running HAMMER inside a VM on a Windows box.
    If the VM blue-screens the Windows box (which happens quite often)
    the data on-disk can wind up corrupted beyond all measure.
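    A minimal sketch of what correct host-side behavior looks like
    (hypothetical backend, illustrative names... this is not any real
    VM's code): guest writes may be absorbed by the host page cache,
    but a guest cache-flush must not be ACKed until the backing store
    has been fsync()'d:

        #include <sys/types.h>
        #include <unistd.h>

        struct vdisk {
            int backing_fd;    /* file or device backing the guest disk */
        };

        /* Guest WRITE: fine for this to land in the host page cache. */
        ssize_t
        vdisk_write(struct vdisk *vd, const void *buf, size_t len,
            off_t off)
        {
            return pwrite(vd->backing_fd, buf, len, off);
        }

        /*
         * Guest FLUSH CACHE: the broken VMs described above just ACK
         * this.  The correct move is to sync the backing store first.
         */
        int
        vdisk_flush(struct vdisk *vd)
        {
            return fsync(vd->backing_fd);
        }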

    People who use VMs with direct-attached filesystems basically rely on
    the host computer never crashing and should really have no expectation
    of storage reliability short of running the VM inside an IBM mainframe.
    That is the unfortunate truth.

    With USB the primary culprits are virtually *all* USB/Firewire/SATA
    bridges, as you noted, because I think there are only two or three
    actual bridge-chip manufacturers and they are all broken.  The USB
    standard itself shares the blame for this.  It is a really horrible
    standard.

    USB sticks are the ones that typically either lock up or return
    success but don't actually flush their (fortunately limited) caches.
    Nobody in their right mind uses USB to attach a disk when reliability
    is important.  It's fine to have it... I have lots of USB sticks and
    a few USB-attached HDs lying around, but I have *ZERO* expectation of
    reliability from them and neither should anyone else.

    SD cards are in the same category as USB.  Useful but untrustworthy.

    Other fringe consumer crap, like fake-raid (BIOS-based RAID), is equally
    unreliable when it comes to dealing with outright crashes.  Always fun
    to have drives which can't be moved to other machines if a mobo dies!
    Not!

    With network-attached drives the standard itself is broken.  It tries to
    define command completion as the data being on-media, which is stupid
    when no other direct-attached standard requires that.  Stupidity in
    standards is a primary factor in vendors ignoring portions of standards.

    In the case of network-attached drives implemented with direct-attached
    drives on machines with software drivers to bridge to the network,
    it comes down to whether the software deals with the flush command
    properly, because it sure as hell isn't going to sync each write
    command all the way to the media!
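    As a sketch of that distinction (hypothetical server loop, not any
    real product's code): syncing per-write would be hopelessly slow,
    so writes go through the page cache, and only an explicit flush
    request blocks on fsync() before it is acknowledged:

        #include <sys/types.h>
        #include <unistd.h>

        enum req_type { REQ_WRITE, REQ_FLUSH };

        struct request {
            enum req_type  type;
            off_t          offset;
            size_t         length;
            void          *data;
        };

        /* Returns 0 on success, -1 on error. */
        int
        handle_request(int disk_fd, const struct request *req)
        {
            switch (req->type) {
            case REQ_WRITE:
                /* Buffered on purpose: no fsync per write. */
                if (pwrite(disk_fd, req->data, req->length,
                    req->offset) != (ssize_t)req->length)
                    return -1;
                return 0;
            case REQ_FLUSH:
                /* The only durability promise in the protocol. */
                return fsync(disk_fd);
            }
            return -1;
        }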

    But frankly, none of these issues is a reason to stop using the
    command or to rationalize it away.  Not that I am blaming anyone for
    trying to rationalize it away; I am simply pointing out that in a
    market as large as the generic 'storage' market, there are always
    going to be tons of broken stuff out there to avoid.  It's buyer beware.

    What we care about here, in this discussion, is direct-attached
    SATA/eSATA/SAS, port multipliers and other external enclosure bridges,
    high-end SCSI PHYs, and (NVRAM aside, which is arguable) real RAID
    hardware.  And well-known vendors (fringe SSDs do not count).  That
    covers 90% of the market and 99% of the cases where protocol reliability
    is required.

					-Matt
					Matthew Dillon 
					<dillon at backplane.com>

