ZFS + xattr -> panic loop

Stefan Esser se at freebsd.org
Fri Nov 14 08:58:24 UTC 2014


Am 13.11.2014 um 20:43 schrieb javocado:
> After running an rsync (upload) into my zfs filesystem using the
> --fake-super option (which stores permissions in extended attributes) the
> zfs filesystem has somehow become corrupt. When booting the system it
> panic's upon zfs startup:
> 
> panic: solaris assert: VFS_ROOT(vfsp, LK_EXCLUSIVE, &vp) == 0, file:
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c
> , line: 1170
> cpuid = 12
> KDB: stack backtrace:
> #0 0xffffffff8034b3ae at kdb_backtrace+0x5e
> #1 0xffffffff803185c7 at panic+0x187
> #2 0xffffffff80ab8213 at zfs_mount+0x563
> #3 0xffffffff803a2635 at vfs_donmount+0xdc5
> #4 0xffffffff803a3133 at nmount+0x63
> #5 0xffffffff80553284 at amd64_syscall+0x1f4
> #6 0xffffffff8053bc2c at Xfast_syscall+0xfc
> 
> I was able to boot the system by setting the filesystem:
> 
> canmount=off
> 
> However, I am still unable to mount it manually without causing a similar
> panic.

> So, setting the cause of the panic aside for the moment, I just want to get
> to the data. I'd like to think I can do that by turning off xattrs and
> mounting it read-only however:
> 
> # zfs set xattr=off pool/data
> property 'xattr' not supported on FreeBSD: permission denied
> 
> How can I turn that off?

I do not think that turning on xattr is the root cause of your
problem with the pool.

BTW: Does your system use ECC for its RAM?

You can set

	vfs.zfs.recover="1"

in /boot/loader.conf (or at the loader prompt) to skip many of the
consistency checks that would otherwise cause a panic. But I'd make
sure that no writes occur, or the corruption may become much worse
than it already is.
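As a config sketch, the loader setting would look like this (the
second tunable is only needed if you also want extra diagnostic
output; treat it as an assumption about your ZFS version):

```
# /boot/loader.conf -- relax ZFS consistency panics for recovery
vfs.zfs.recover="1"

# optional: more verbose ZFS debugging output (assumption; check
# that your kernel supports this tunable before relying on it)
vfs.zfs.debug="1"
```

Remember to remove these again once the pool has been recovered.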


I have been able to recover my own ZFS pool and help somebody
else with recovery, and the method might be applicable to your
case, too:

You may want to try accessing the pool with zdb, which contains a
userland implementation of much of the kernel code. That way you
avoid crashing your system, and you can possibly go back to a state
where the pool was sane.

I'd rewind to a sane TXG, back up, and rebuild the pool from that
state. But in theory you could even continue using the pool, starting
from a state where its internal structure is (was) consistent.
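The rewind itself can be attempted with zpool import's recovery
options. This is only a sketch: "pool" and the TXG number are
placeholders, and the -T flag (rewind to an explicit TXG) is an
assumption about your zpool binary; -F is the documented recovery
option and the safer fallback:

```shell
# Read-only, no-mount import rewound to a chosen TXG (sketch).
# "pool" and 123456 are placeholders taken from the pool history.
# -T may be unavailable on older zpool binaries.
zpool import -N -o readonly=on -f -T 123456 pool

# More conservative: let ZFS discard only the last few TXGs itself.
zpool import -N -o readonly=on -fF pool
```

The -N keeps datasets unmounted, so a corrupt filesystem cannot
trigger the mount-time panic while you inspect the pool.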

> Second or alternately, if I want to just dump the filesystem out excluding
> the extended attributes, so that I have a clean, mountable data set, how
> would I do that?

I'd first check the pool state with zdb, adding the options "-AAA -L".

You could list the history to select a TXG that is recent enough to
be acceptable as a recovery point ("zdb -AAA -L -hv <POOLNAME>").

Then look at the pool with various zdb commands, e.g. "-d", "-b", "-m"
inserted into "zdb -AAA -L -t <TXG> <CMD> <POOL>". Select values for
TXG from the history; zdb will execute the commands as if the pool
were still at the corresponding point in time.
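Put together, the inspection loop described above might look like the
following sketch. The pool name "tank" and the TXG value are
placeholders; take real TXGs from the "zdb -AAA -L -hv" output:

```shell
#!/bin/sh
# Sketch of the zdb inspection sequence; adjust POOL and TXG.
POOL=tank
TXG=123456          # candidate recovery point from the pool history

# First list the history to pick a TXG:
#   zdb -AAA -L -hv "${POOL}"

# -AAA ignores assertion failures, -L skips leak checking,
# -t rewinds the view of the pool to the given TXG.
for cmd in -d -b -m; do
    echo "=== zdb ${cmd} at TXG ${TXG} ==="
    zdb -AAA -L -t "${TXG}" "${cmd}" "${POOL}" || \
        echo "zdb ${cmd} failed at TXG ${TXG}"
done
```

If one of the commands fails, try the next older TXG from the
history until all of them succeed.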

If zdb can report pool statistics and pool details (e.g. metaslabs)
without crashing (and without the need for -AAA and -L), then there
is a good chance that you have rewound to a sane TXG and that you
will be able to recover your data as it was when that TXG was written.

I'd still only mount R/O and perform a dump as soon as possible.
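A minimal sketch of that final step, assuming the rewound import
succeeded ("pool/data" and the backup path are placeholders, not
names from your system):

```shell
# Import read-only so nothing can make the corruption worse,
# then copy the data off immediately (sketch; adjust names/paths).
zpool import -o readonly=on -f pool
zfs mount pool/data

# Dump the data to separate storage, e.g. with tar:
tar -C /pool/data -cf - . | tar -C /backup/data -xpf -
```

Only after the dump has been verified would I consider destroying
and rebuilding the pool.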
I'm short of time and probably unable to help with any details, but
you'll find debug and recovery hints on the web (some apply only to
the Solaris version of ZFS, but most can be used for the FreeBSD
implementation as well).

Good luck ...

Regards, Stefan
