Problem with zpool remove of log device

kc atgb kisscoolandthegangbang at hotmail.fr
Mon Jun 12 22:16:26 UTC 2017



On Wed, 07 Jun 2017 08:21:09 CEST,
Stephen McKay <mckay at FreeBSD.org> wrote:

> On Friday, 26th May 2017, lukasz at wasikowski.net wrote:
> 
> >I can't remove the log device from my pool - the operation completes OK,
> >but the log device is still in the pool (bug?).
> >
> ># uname -a
> >FreeBSD xxx.yyy.com 11.0-STABLE FreeBSD 11.0-STABLE #0 r316543: Thu Apr
> >6 08:22:43 CEST 2017     root at xxx.yyy.com:/usr/obj/usr/src/sys/YYY  amd64
> >
> ># zpool status tank
> >[..snip..]
> >
> >        NAME                   STATE     READ WRITE CKSUM
> >        tank                 ONLINE       0     0     0
> >          mirror-0             ONLINE       0     0     0
> >            ada2p3             ONLINE       0     0     0
> >            ada3p3             ONLINE       0     0     0
> >        logs
> >          mirror-1             ONLINE       0     0     0
> >            gpt/tankssdzil0  ONLINE       0     0     0  block size: 512B configured, 4096B native
> >            gpt/tankssdzil1  ONLINE       0     0     0  block size: 512B configured, 4096B native
> 
> >When I try to remove the log device, the operation ends without errors:
> >
> ># zpool remove tank mirror-1; echo $?
> >0
> >
> >But the log device is still there:
> >[..snip..]
> >I'd like to remove it - how should I proceed?
> 
> Does your system still write to the log?  Use "zpool iostat -v 1" to
> check.  I think it is probably no longer in use and only the final
> disconnection failed.
> 
> What does "zpool list -v" tell you?  If you have a non-zero ALLOC
> column for your log mirror and the log is no longer being used then
> you may have hit an accounting bug in zfs that the zfsonlinux people
> ran into a while ago.
> 
> I had this problem when I tried to remove a log mirror from a pool
> I had been using for years.  I solved it by tweaking the zfsonlinux
> hack a bit and slotting it into 9.3.
> 
> If you apply this hack be sure to have a full backup first!  When I
> used it, I did my backup and a scrub then booted the hacked kernel,
> issued the zpool remove command (which succeeded), reverted the kernel,
> then scrubbed again.  All went well.
> 
> Good luck!
> 
> Here's the patch against 9.3 (should be close even for 11.0):
> 
> Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c
> ===================================================================
> --- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c	(revision 309860)
> +++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c	(working copy)
> @@ -5446,6 +5446,18 @@
>  	ASSERT(vd == vd->vdev_top);
>  
>  	/*
> +	 * slog stuck hack - barnes333 at gmail.com
> +	 * https://github.com/zfsonlinux/zfs/issues/1422
> +	 */
> +	if (vd->vdev_islog && vd->vdev_removing
> +	    && vd->vdev_state == VDEV_STATE_OFFLINE
> +	    && vd->vdev_stat.vs_alloc > 0) {
> +		printf("ZFS: slog stuck hack - clearing vs_alloc: %llu\n",
> +		    (unsigned long long)vd->vdev_stat.vs_alloc);
> +		vd->vdev_stat.vs_alloc = 0;
> +	}
> +
> +	/*
>  	 * Only remove any devices which are empty.
>  	 */
>  	if (vd->vdev_stat.vs_alloc != 0)
> 
> Cheers,
> 
> Stephen.
> _______________________________________________
> freebsd-fs at freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe at freebsd.org"
> 
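For reference, the checks Stephen suggests can be run as below (a sketch; "tank" and "mirror-1" are the names from this thread - substitute your own):

```shell
# Per-vdev I/O statistics, refreshed every second; the write columns for
# the log mirror should stay at zero if the slog is idle.
zpool iostat -v tank 1

# Per-vdev space accounting; a non-zero ALLOC on an idle log mirror is
# the symptom of the accounting bug discussed above.
zpool list -v tank
```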

I have hit this case once again; the first time was one month ago.
Back then I had to back up my data and destroy and recreate the pool to remove the "faulted" log device.

I'll try your patch and hope I'll be luckier than the original poster. I have to make a full backup first, again.
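Stephen's procedure above, sketched as a shell sequence (hedged: the pool and vdev names are the ones from this thread, and the backup step is only indicated, not spelled out):

```shell
# Sketch of the removal procedure described earlier in the thread,
# assuming a kernel built with the slog-stuck hack is available.

zpool scrub tank                  # 1. scrub and wait for it to finish
# 2. take a full backup (e.g. zfs send -R to another pool or host)
# 3. reboot into the patched kernel
zpool remove tank mirror-1        # 4. should now succeed and detach the slog
# 5. reinstall the stock kernel and reboot
zpool scrub tank                  # 6. verify the pool is still healthy
zpool status tank
```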

In my opinion this problem may be related to a certain type of data or activity. I have had my pool for a few years now and added a log device only a few months ago.
It is a little strange that it has happened to me twice in such a short span of time while others are not affected.

K.
