Problem with zpool remove of log device

kc atgb kisscoolandthegangbang at hotmail.fr
Sat Jun 17 22:32:29 UTC 2017



On Mon, 12 Jun 2017 22:16:02 CEST
kc atgb <kisscoolandthegangbang at hotmail.fr> wrote:

> 
> 
> On Wed, 07 Jun 2017 08:21:09 CEST
> Stephen McKay <mckay at FreeBSD.org> wrote:
> 
> > On Friday, 26th May 2017, lukasz at wasikowski.net wrote:
> > 
> > >I can't remove log device from pool - operation ends ok, but log device
> > >is still in the pool (bug?).
> > >
> > ># uname -a
> > >FreeBSD xxx.yyy.com 11.0-STABLE FreeBSD 11.0-STABLE #0 r316543: Thu Apr
> > >6 08:22:43 CEST 2017     root at xxx.yyy.com:/usr/obj/usr/src/sys/YYY  amd64
> > >
> > ># zpool status tank
> > >[..snip..]
> > >
> > >        NAME                   STATE     READ WRITE CKSUM
> > >        tank                 ONLINE       0     0     0
> > >          mirror-0             ONLINE       0     0     0
> > >            ada2p3             ONLINE       0     0     0
> > >            ada3p3             ONLINE       0     0     0
> > >        logs
> > >          mirror-1             ONLINE       0     0     0
> > >            gpt/tankssdzil0  ONLINE       0     0     0  block size: 512B configured, 4096B native
> > >            gpt/tankssdzil1  ONLINE       0     0     0  block size: 512B configured, 4096B native
> > 
> > >When I try to remove log device operation ends without errors:
> > >
> > ># zpool remove tank mirror-1; echo $?
> > >0
> > >
> > >But the log device is still there:
> > >[..snip..]
> > >I'd like to remove it - how should I proceed?
> > 
> > Does your system still write to the log?  Use "zpool iostat -v 1" to
> > check.  I think it is probably no longer in use and only the final
> > disconnection failed.
> > 
> > What does "zpool list -v" tell you?  If you have a non-zero ALLOC
> > column for your log mirror and the log is no longer being used then
> > you may have hit an accounting bug in zfs that the zfsonlinux people
> > ran into a while ago.
> > 
> > I had this problem when I tried to remove a log mirror from a pool
> > I have been using for years.  I solved it by tweaking the zfsonlinux
> > hack a bit and slotting it into 9.3.
> > 
> > If you apply this hack be sure to have a full backup first!  When I
> > used it, I did my backup and a scrub then booted the hacked kernel,
> > issued the zpool remove command (which succeeded), reverted the kernel,
> > then scrubbed again.  All went well.
> > 
> > Good luck!
> > 
> > Here's the patch against 9.3 (should be close even for 11.0):
> > 
> > Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c
> > ===================================================================
> > --- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c	(revision 309860)
> > +++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c	(working copy)
> > @@ -5446,6 +5446,18 @@
> >  	ASSERT(vd == vd->vdev_top);
> >  
> >  	/*
> > +	 * slog stuck hack - barnes333 at gmail.com
> > +	 * https://github.com/zfsonlinux/zfs/issues/1422
> > +	 */
> > +	if (vd->vdev_islog && vd->vdev_removing
> > +	    && vd->vdev_state == VDEV_STATE_OFFLINE
> > +	    && vd->vdev_stat.vs_alloc > 0) {
> > +		printf("ZFS: slog stuck hack - clearing vs_alloc: %llu\n",
> > +		    (unsigned long long)vd->vdev_stat.vs_alloc);
> > +		vd->vdev_stat.vs_alloc = 0;
> > +	}
> > +
> > +	/*
> >  	 * Only remove any devices which are empty.
> >  	 */
> >  	if (vd->vdev_stat.vs_alloc != 0)
> > 
> > Cheers,
> > 
> > Stephen.
> > _______________________________________________
> > freebsd-fs at freebsd.org mailing list
> > https://lists.freebsd.org/mailman/listinfo/freebsd-fs
> > To unsubscribe, send any mail to "freebsd-fs-unsubscribe at freebsd.org"
> > 
> 
> I have hit this case once again. The first time was one month ago.
> I had to back up my data, then destroy and recreate the pool to remove the "faulted" log device.
> 
> I'll try your patch. I hope I'll be luckier than the OP. I have to back up first again.
> 
> In my opinion, this problem may be related to a certain type of data or activity. I have had my pool for a few years now and added a log only some months ago.
> It is a little strange that it happened to me twice in such a short span of time while others are not affected.
> 
> K.
> 

I have successfully applied the patch and built the kernel. The removal of the
log device worked too. It was in the offline state, then I had to remove the
drive, so it was marked as unavailable before the removal.
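
For anyone wanting to repeat this, the sequence Stephen described (backup,
scrub, boot the patched kernel, zpool remove, revert the kernel, scrub again)
can be sketched roughly as below. This is only a sketch under assumptions: the
pool and vdev names are taken from the original report, not from my system,
the run() helper and the function names are made up for illustration, and
DRY_RUN defaults to 1 so nothing is executed unless you change it. The backup
and the kernel reboots remain manual steps.

```shell
#!/bin/sh
# Sketch of the log-device removal sequence discussed in this thread.
# POOL and LOG_VDEV are placeholders (names from the original report).
# DRY_RUN=1 (the default) only prints the commands instead of running them.

POOL="${POOL:-tank}"
LOG_VDEV="${LOG_VDEV:-mirror-1}"
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Phase 1, on the stock kernel, after a full backup (manual).
# zpool scrub returns immediately; wait until zpool status reports
# that the scrub has finished before rebooting.
pre_checks() {
    run zpool scrub "$POOL"
    run zpool status "$POOL"
}

# Phase 2, after booting the patched kernel (manual reboot).
remove_log() {
    run zpool list -v "$POOL"              # non-zero ALLOC on the log vdev?
    run zpool remove "$POOL" "$LOG_VDEV"
    run zpool status "$POOL"               # the "logs" section should be gone
}

# Phase 3, after reverting to the stock kernel (manual reboot).
post_checks() {
    run zpool scrub "$POOL"
    run zpool status "$POOL"
}

# Pick a phase with the first argument; with no argument nothing runs.
case "${1:-}" in
    pre)    pre_checks ;;
    remove) remove_log ;;
    post)   post_checks ;;
esac
```

Saved under some name of your choosing, it would be called with "pre",
"remove" or "post" as its first argument, and with DRY_RUN=0 in the
environment once the printed commands look right for your pool.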

My FreeBSD version:
FreeBSD my.host.name 9.3-STABLE FreeBSD 9.3-STABLE #0 r315141: Sun Mar 12 
16:00:24 CET 2017     root at my.host.name:/usr/obj/usr/src/sys/GENERIC amd64

I'm still curious about why this is happening. Any ideas?

K.

