A failed drive causes system to hang

Jeremy Chadwick jdc at koitsu.org
Sat Apr 13 21:36:33 UTC 2013


On Sat, Apr 13, 2013 at 04:59:51PM -0400, Quartz wrote:
> 
> >This is what happens when end-users start to try and "correlate" issues
> >to one another without actually taking the time to fully read the
> >thread and follow along actively.
> 
> He was experiencing a system hang, which appeared to be related to
> zfs and/or cam. I'm experiencing a system hang, which appears to be
> related to zfs and/or cam. I am in fact following along with this
> thread.

The correlation was incorrect, however, which is my point.  Treat every
incident uniquely.

> >Your issue: "on my raidz2 pool, when I lose more than 2 disks, I/O to
> >the pool stalls indefinitely,
> 
> Close, but not quite -- yes, I/O to the pool stalls, but I/O in
> general also stalls. It appears the problem possibly doesn't start
> until there's I/O traffic to the pool, though.

All I was able to reproduce was that I/O ***to the pool*** (once it's
broken) stalls.  I'm still waiting on you to go through the same
method/model I did here, providing all the data:

http://lists.freebsd.org/pipermail/freebsd-fs/2013-March/016814.html

> >but I can still use the system barring
> >ZFS-related things;
> 
> No. I've responded to this misconception on your part more than
> once -- I *CANNOT* use the system in any reliable way; random commands
> fail. I've had it hang trying to cd from one dir on the boot volume to
> another dir on the boot volume. The only thing I can *reliably* do
> is log in. Past that point all bets are off.

Quoting you:

http://lists.freebsd.org/pipermail/freebsd-fs/2013-March/016822.html

"I can nose around the boot drive just fine, but anything involving i/o
that so much as sneezes in the general direction of the pool hangs the
machine.  Once this happens I can log in via ssh, but that's pretty much
it."

This conflicts directly with your above statement.

Then you proceed to provide "the evidence that nothing works",
specifically:

"zpool destroy hangs" -- this touches the pool
"zpool replace hangs" -- this touches the pool
"zpool history hangs" -- this touches the pool
"shutdown -r now gets half way through then hangs" -- this touches the pool
"reboot -q same as shutdown" -- this touches the pool (flushing of FS cache)

If you're able to log in to the machine via SSH, it means that things
like /etc/master.passwd can be read, and also that /var/log/utx* and
similar files get updated (written to) successfully.  So, to me, it
indicates that only I/O involving the ZFS pool causes indefinite
stalling (of the affected application/command only).

To me, this makes perfect sense.
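
A quick way to demonstrate that split, for example (paths here assume
/tmp lives on the UFS boot drive -- adjust to your layout):

  ls -la /etc                              # boot drive read: should work
  dd if=/dev/zero of=/tmp/t bs=1m count=8  # boot drive write: should work
  zpool history                            # touches the pool: expect a hang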

If you have other proof that indicates otherwise (such as non-ZFS
filesystems also starting to stall/cause problems), please provide those
details.  But as it stands, we don't even know what the "boot drive"
consists of (filesystems, etc.) because you haven't provided any of that
necessary information.  Starting to see the problem?
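
Concretely, the output of something like the following (run on your
machine; a suggested starting set, not an exhaustive one) would answer
that:

  mount -p           # every mounted filesystem, its type, and options
  df -h              # usage per mount point
  gpart show         # how the boot drive is partitioned
  zpool status -v    # current pool layout and state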

If I sound like a broken record, it's because all the information
needed to diagnose this is stuff only you have access to.

> >I don't know how to get the system back into a
> >usable state from this situation"
> 
> "...short of having to hard reset", yes.
> 
> >Else, all you've provided so far is a general explanation. You have
> >still not provided concise step-by-step information like I've asked.
> 
> *WHAT* info? You have YET TO TELL ME WHAT THE CRAP YOU ACTUALLY NEED
> from me. I've said many times I'm perfectly willing to give you logs
> or run tests, but I'm not about to post a tarball of my entire drive
> and output of every possible command I could ever run.

I've given you 2 examples of what's generally needed.  First example
(yes, this URL again):

http://lists.freebsd.org/pipermail/freebsd-fs/2013-March/016814.html

Second example:

http://lists.freebsd.org/pipermail/freebsd-fs/2013-January/016324.html

Please read both of those posts in full.  Don't skim -- you'll see that
I go "step by step" looking at certain things, noting what the kernel
is showing on the console, and taking notes of what transpires at each
step of the way (including physical actions taken).

Start with that, and if there's stuff omitted/missing then we can get
that later.
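
As a sketch of the kind of capture I mean (pool name and disk numbers
are placeholders -- substitute your own, and use script(1) so nothing
is retyped from memory):

  script /tmp/repro.log    # record the whole session to a file
  zpool status -v          # baseline pool state
  camcontrol devlist       # disk inventory before touching anything
  # -- physically pull disk #1; note the time --
  dmesg | tail -n 40       # CAM/ada messages after the pull
  zpool status -v          # pool state after the pull
  # ...repeat for each disk, then for each command you run against
  # the pool, noting which ones hang...
  exit                     # ends the script(1) capture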

> For all the harping you do about "not enough info" you're just as
> bad yourself.

I see.

> >I've gone so far as to give you an example of what to provide:
> >
> >http://lists.freebsd.org/pipermail/freebsd-fs/2013-March/016814.html
> 
> The only thing there you ask for is a dmesg, which I subsequently
> provided. Nowhere in that thread do you ask me to give you
> *anything* else, besides your generic mantra of "more info". And
> yes, I did read it again just now three times over to make sure. The
> closest you come is:
> 
> "This is why hard data/logs/etc. are necessary, and why
> every single step of the way needs to be provided, including physical
> tasks performed."
> 
> ... but you still never told me WHICH logs or WHAT data you need.
> I've already given you the steps I took re: removing drives, steps
> which *you yourself* confirmed to express the problem.

All I was able to confirm was the following, with regard to a
permanently damaged pool in a non-recoverable state (e.g. 3 disks lost
in a raidz2 pool), taken from here:

http://lists.freebsd.org/pipermail/freebsd-fs/2013-March/016814.html

The results:

* Loss of the 3rd disk does not show up in "zpool status" -- the pool
continues to show an "ONLINE" state, but with incremented WRITE
counters.  As such, "zpool status" itself does work (i.e. does not hang).

* failmode=wait causes all I/O to the pool to block/wait indefinitely.
This includes both running processes and new processes doing I/O.  This
is by design (see the failmode example after this list).

* failmode=continue causes pending write I/O from existing processes to
return EIO (I/O error), which might (depending on the program and its
behaviour) make them exit/kick you back to a shell.  New processes
issuing read I/O from the pool will block/wait.  I did not test what
happens with new processes that issue new write I/O to the pool.

* Doing something like "ls -l /filesystem_on_that_pool" works, which
seems a little strange to me -- possibly this is due to the VFS or
underlying caching layers involved.

* Re-insertion of one of the yanked (or "failed") disks does not result
in CAM reporting the disk's insertion, even *after* the CAM-related
fixes committed in r247115.  Thus "zpool replace" returns "cannot open
'xxx': no such GEOM provider", since the disk appears missing from
/dev, while "camcontrol devlist" still shows it attached (probably why
ZFS still shows it as "ONLINE"; see the rescan example after this list).
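
For reference, failmode is a per-pool property; checking and changing
it looks like this (the pool name "tank" is hypothetical, and "wait"
is the default):

  zpool get failmode tank
  zpool set failmode=continue tank

Likewise, when a re-inserted disk never shows up in /dev, forcing a
bus rescan is worth trying before "zpool replace":

  camcontrol rescan all    # rescan all buses CAM knows about
  camcontrol devlist       # then verify what CAM sees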

What you've stated happens differs from the above, which is why I keep
asking you to please go step by step in reproducing your issue,
providing all output (including the commands you issue), all physical
tasks performed, and what the console shows at each step of the way.
I'm sorry, but that's the only way.

> >I will again point to the 2nd-to-last paragraph of my above referenced
> >mail.
> 
> The "2nd-to-last paragraph" is:
> 
> "So in summary: there seem to be multiple issues shown above, but I can
> confirm that failmode=continue **does** pass EIO to *running* processes
> that are doing I/O.  Subsequent I/O, however, is questionable at this
> time."
> 
> Unless you're typing in a language other than English, that isn't
> asking me jack shit.

The paragraph I was referring to:

"I'll end this Email with (hopefully) an educational statement:  I hope
my analysis shows you why very thorough, detailed output/etc. needs to
be provided when reporting a problem, and not just some "general"
description.  This is why hard data/logs/etc. are necessary, and why
every single step of the way needs to be provided, including physical
tasks performed."

> >Once concise details are given and (highly preferable!) a step-by-step
> >way to reproduce the issue 100% of the time
> 
> *YOU'VE ALREADY REPRODUCED THIS ON YOUR OWN MACHINE.*
> 
> Seriously, wtf?

No, I haven't.  My attempt to reproduce the issue/analysis is above,
and some of the things you report happening I cannot reproduce.  So can
you please go through the same procedure/methodology and do the same
write-up I did, but with your system?

-- 
| Jeremy Chadwick                                   jdc at koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/    |
| Mountain View, CA, US                                               |
| Making life hard for others since 1977.             PGP 4BD6C0CB    |

