ZFS passdevgonecb

Michael B. Eichorn ike at michaeleichorn.com
Fri Jun 19 15:08:34 UTC 2015


On Fri, 2015-06-19 at 18:41 +1000, Da Rock wrote:
> Ok, top posting as a summary really - numerous threads of thought going 
> on now.
> 
> First, that ggatel and mountver workaround - how does zfs take that? Is 
> it possible zfs will have a dummy spit if this is between it writing to 
> the drive? I'd assume not given it can use a md device as a vnode, but 
> doesn't hurt to ask.
ZFS really expects to talk to real disks. It is possible to mess with that, but
you are probably risking your data. I wouldn't go for it.
> 
> Second, I've tried another drive and still the same issue; also swapped 
> cable, and still errors. So a controller test would be ideal - or a new 
> system is in order :)
From what I hear, most of these problems come down to the controller, unless you
have a disk trying to power-save without being told (see next).
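
If it helps to see which controller channel keeps losing devices, here is a
rough Python sketch that counts "passdevgonecb" and "lost device" lines per bus
in saved dmesg output. The regular expression and the sample lines are my
assumptions about how those messages look, so adjust them to whatever your
dmesg actually prints.

```python
import re
from collections import Counter

# Assumed shape of a CAM message: "(ada2:ahcich2:0:0:0): lost device".
# This pattern is a guess at the format, not taken from FreeBSD sources.
CAM_LINE = re.compile(
    r"^\((?P<dev>[a-z]+\d+):(?P<bus>[^:]+):[^)]*\):\s*(?P<msg>.*)$"
)

# Substrings that mark a device disappearing (both are named in this thread).
REMOVAL_MARKERS = ("passdevgonecb", "lost device")


def count_removals(dmesg_text):
    """Return a Counter of removal messages keyed by bus name."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = CAM_LINE.match(line.strip())
        if match and any(m in match.group("msg") for m in REMOVAL_MARKERS):
            counts[match.group("bus")] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample, made up for illustration only.
    sample = """\
(ada2:ahcich2:0:0:0): lost device
(pass2:ahcich2:0:0:0): passdevgonecb: devfs entry is gone
(da0:umass-sim0:0:0:0): lost device
ada0: 600.000MB/s transfers
"""
    for bus, n in sorted(count_removals(sample).items()):
        print(f"{bus}: {n} removal message(s)")
```

If one bus collects nearly all the removals no matter which drive or cable is
attached, that points at the controller rather than the disks.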
> 
> Third, that consumer/raid drive difference seems a bit dodgy doesn't it? 
> :) Appears they're basically forcing you to pay up for the privilege... 
> wouldn't surprise me! Regardless, though, how does that stack up in an 
> ordinary situation? I doubt you'd have a raid certified drive in a 
> desktop to play games or edit home movies, and in that scenario it would 
> spak out as the only drive in the system and probably crash, wouldn't 
> it? Or am I not thinking it through properly? Possible given the amount 
> of sleep I've had lately...
There are a few levels of drives these days. There are the enterprise grade
drives that can take a bit more heat and have a longer mean time to failure.
These drives have tighter manufacturing tolerances and cost a lot more. Then there
are NAS/RAID consumer drives that are made to about the same tolerances as a
desktop drive but have a few modifications for a 24/7 workload. Typical desktop
drives are not actually designed for 24/7 operation.

Then there is the firmware. Some drives designed for 'green-ness' try to spin
down and do other things to save power; with a typical Windows desktop this is
probably good. However, ZFS (and most RAID setups) expects the drive to do
nothing without being told, since ZFS knows more than any firmware could.
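
On the error-recovery point Mike raised (quoted further down): a controller
with a fixed timeout drops any drive that stays busy retrying a bad sector for
longer than that. A toy Python model of the idea, where the timeout, the retry
time and the recovery cap are illustrative numbers I picked, not measurements
from any drive or controller:

```python
def is_dropped(recovery_seconds, controller_timeout_seconds):
    """True if the drive stays busy recovering longer than the controller waits."""
    return recovery_seconds > controller_timeout_seconds


def effective_recovery(retry_seconds, erc_limit_seconds=None):
    """Time the drive spends on internal recovery before it answers.

    erc_limit_seconds models a drive that caps its own error recovery;
    None models a drive with no cap, which retries for as long as it likes.
    """
    if erc_limit_seconds is None:
        return retry_seconds
    return min(retry_seconds, erc_limit_seconds)


if __name__ == "__main__":
    controller_timeout = 8.0  # assumed value, in seconds
    bad_sector_retry = 30.0   # assumed worst-case internal retry, in seconds

    desktop = effective_recovery(bad_sector_retry)  # no cap
    raid = effective_recovery(bad_sector_retry, erc_limit_seconds=7.0)

    print("desktop drive dropped:", is_dropped(desktop, controller_timeout))
    print("RAID drive dropped:", is_dropped(raid, controller_timeout))
```

As the only drive in a desktop there is nothing else to take over, so the long
retry just shows up as a stall instead of a dropped array member.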

Frankly, for most small-business and home loads, WD Red drives + ZFS are enough
for servers. I don't even put spinning rust in desktops anymore; an SSD + a
network drive is enough for non-workstation tasks.


But anyway, if you are still having problems, get in touch with allanjude@. He
just co-authored "FreeBSD Mastery: ZFS" with Michael Lucas and is right NOW
looking for case studies in fixing ZFS problems for the second volume.

> Thanks for the brainstorming help guys.
> 
> 
> On 06/18/15 05:24, Michael Powell wrote:
> > Da Rock wrote:
> > 
> > > I hate jumping in like this out of the blue, but time is not on my side
> > > atm with a lot going on.
> > > 
> > > I have a problem with some devices disappearing on various versions of
> > > FreeBSD and machines (laptops, workstations/servers). Umass are the
> > > norm, with the message occurring most on usb sticks and sd cards.
> > > 
> > > Big problem atm is that my file server has a failed disk in the raid,
> > > and I've tried replacing it with a new drive (twice now), and both times
> > > it begins to resilver and then it is "REMOVED". If I online it again, it
> > > goes for about 10mins then REMOVED again.
> > > 
> > > Dmesg shows that the device is removed from the devfs with a
> > > passdevgonecb/lost device message. This apparently occurs right at boot
> > > too, as it shows amongst the usual scrolling during boot.
> > > 
> > > I had a chat with someone and they mentioned the cable and/or controller
> > > could be the issue. Could anyone add any insight or tests I could do?
> > > I don't exactly claim to be an expert at zfs, so maybe something might
> > > need addressing there too.
> > > 
> > > I'd particularly appreciate a means of testing the controller.
> > > 
> > > ATM the main theory is to replace the board - not a happy thought! :/
> > > 
> > Another item to consider, even if only to exclude it, is: what kind of
> > drives are these? Are they server RAID certified or are they 'consumer'
> > desktop types. RAID certified are designed to time-limit attempts at error
> > recovery. There is a window in time that if a consumer drive takes too long
> > at internal ECC the controller will drop it. Doubt this is your case, I bet
> > you have server drives. I just mention it because using consumer desktop
> > drives on RAID controllers can sometimes be problematic.
> > 
> > -Mike