Problem with zpool remove of log device

Stephen McKay mckay at FreeBSD.org
Thu Jun 15 11:59:56 UTC 2017


Sorry for the slow response.  I've been away (without email) for a few days.

On Friday, 9th June 2017, lukasz at wasikowski.net wrote:

>On 2017-06-09 at 05:18, Stephen McKay wrote:
>
>>     while :
>>     do
>>         date > foo
>>         fsync foo
>>     done
>> 
>> With this running, my system does 600 writes per second to the log
>> according to zpool iostat.  That drops to zero once I kill the script.
>
>Zero, so no writes to log are performed during execution of this script.

OK.  I believe this means your log is in a "pending removal" state and
has not been finally removed because ZFS thinks there's still data
stored on it.  I'm happy for true ZFS experts to confirm or deny this
theory.  Anybody?

>I applied this patch to 11.1-PRERELEASE, nothing changed. Still zpool
>remove exits with errcode 0, but log device is still attached to pool.

Thanks for trying this out, but I managed to leave out essential
information.

In my rush to do things before going away (see above), I didn't read my
notes on this event.  The patch has the safety feature of requiring the
log to be offline.  This means you will have to break the mirror (by
detaching one disk from it), then offline the remaining disk, and finally,
trigger the hack by attempting to remove the remaining disk.

When I was stuck in this situation, I had already reduced my log to
a single disk before discovering the accounting error problem, so I
don't know if you can activate the patch code without first breaking
the mirror.  I don't think you can offline a mirror.  I've not tried.

I've now had time to review my notes (sorry I didn't do that first up).
My pool had a data mirror-0 (gpt/data0 and gpt/data1) and a log mirror-1
(gpt/log0 and gpt/log1).  The sequence I did, minus most of the failed
attempts at random stuff, status checks, and so forth, was:

# zpool remove pool mirror-1   #Did nothing but should have worked.
# zpool detach pool gpt/log1   #Broke the log mirror.
# zpool remove pool gpt/log0   #Did nothing but should have worked.
# zpool offline pool gpt/log0  #Just fiddling to change the state.
# zpool remove pool gpt/log0   #Still nothing.
... Discovered plausible hack. Built and booted hacked kernel. ...
# zpool remove pool gpt/log0   #Glorious success!

So, my log was already down to one offline disk before I got hacky.
That's why I forgot to mention it.

You could break your mirror or you could modify the hack to remove the
VDEV_STATE_OFFLINE check.  If you have already saved all your important
data then either would be fine as an experiment.
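
If you go the mirror-breaking route, the steps would look roughly like
the sketch below.  It uses my pool and device names (pool, gpt/log0,
gpt/log1), so substitute your own.  The run wrapper and the DRY_RUN
variable are made up for this sketch so that it only prints the commands
by default.  The final remove is only expected to succeed on a kernel
with the patch applied.

```shell
#!/bin/sh
# Sketch only: names below are from my pool, not yours.
POOL=${POOL:-pool}
LOG_KEEP=${LOG_KEEP:-gpt/log0}   # log disk left after breaking the mirror
LOG_DROP=${LOG_DROP:-gpt/log1}   # log disk detached from the mirror
DRY_RUN=${DRY_RUN:-1}            # hypothetical switch: 1 = print, 0 = execute

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

remove_stuck_log() {
    run zpool detach "$POOL" "$LOG_DROP"    # break the log mirror
    run zpool offline "$POOL" "$LOG_KEEP"   # the patch requires an offline log
    run zpool remove "$POOL" "$LOG_KEEP"    # triggers the hack (patched kernel)
}

remove_stuck_log
```

Run it once as-is to check the printed commands, then set DRY_RUN=0 if
they match what you intend to do.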

Cheers,

Stephen.


More information about the freebsd-fs mailing list