ZFS panics on pool moved from OpenSolaris

Morgan Reed morgan.s.reed at gmail.com
Sun Feb 5 05:24:07 UTC 2012


Hi all,

     I'm having trouble migrating my NAS from OpenSolaris over to
FreeBSD. I've tried both releng_8_2 and releng_9 and I see similar
issues in both cases.

The pool is a RAID-Z pool comprising four 1TB drives. It was
originally created on OpenSolaris (not sure what version, 2010.09
maybe; it was one of the last ones prior to the Oracle acquisition)
as a V14 pool. Initially I built a FreeBSD-8.2 system to migrate the
pool to, migrated it over OK and upgraded it from V14 to V15, but
later testing revealed something wasn't happy: listing certain
directories (and even doing an ls -la at the root of the pool)
resulted in a kernel panic (mostly GENERIC kernel, rebuilt with
KVA_PAGES 512 but other than that stock):

panic: avl_find()  succeeded inside avl_add()
cpuid = 0
KDB: stack backtrace:
#0 0x808e0d07 at kdb_backtrace+0x47
#1 0x808b1dc7 at panic+0x117
#2 0x862e6602 at avl_add+0x52
#3 0x8635c136 at zfs_fuid_table_load+0x1f6
#4 0x8635c3ee at zfs_fuid_init+0x14e
#5 0x8635c4d7 at zfs_fuid_find_by_idx+0xb7
#6 0x8635c52d at zfs_fuid_map_id+0x2d
#7 0x8635d56f at zfs_groupmember+0x2f
#8 0x8636df0b at zfs_zaccess_aces_check+0x1db
#9 0x8636377 at zfs_zaccess+0x57
#10 0x8636d6fb at zfs_zaccess_rwx+0x3b
#11 0x86385f61 at zfs_freebsd_access+0xf1
#12 0x80c02ea2 at VOP_ACCESS_APV+0x42
#13 0x809457cf at change_dir+0x5f
#14 0x809467b1 at kern_chdir+0x81
#15 0x80946a22 at chdir+0x22
#16 0x808eca39 at syscallenter+0x329
#17 0x80be4e14 at syscall+0x34

It looks like something in the permissions structure is causing grief.
I tried running a scrub across the pool, but that didn't resolve the
issue.
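
For anyone unfamiliar with the panic string: it says that avl_add()
looked up the node it was asked to insert and found a matching one
already in the tree. The following is a toy Python model of that
check, not the kernel code. ToyTree, load_table and the domain strings
are hypothetical names made up for illustration, and whether a
repeated key is really what is on this pool has not been established.

```python
class DuplicateKeyPanic(Exception):
    """Stands in for the kernel panic in this toy model."""


class ToyTree:
    """Dict-backed stand-in for an AVL tree keyed on a single field."""

    def __init__(self):
        self._nodes = {}

    def find(self, key):
        # Returns the stored value, or None if the key is absent.
        return self._nodes.get(key)

    def add(self, key, value):
        # Same shape as the check named in the panic message: look the
        # key up first and refuse to insert if it is already present.
        if self.find(key) is not None:
            raise DuplicateKeyPanic("avl_find() succeeded inside avl_add()")
        self._nodes[key] = value


def load_table(entries):
    """Load (index, domain) pairs into two trees, one per key."""
    by_idx, by_domain = ToyTree(), ToyTree()
    for idx, domain in entries:
        by_idx.add(idx, domain)
        by_domain.add(domain, idx)
    return by_idx, by_domain
```

In this model load_table([(1, "domain-a"), (2, "domain-a")]) raises
on the second entry, because "domain-a" is already in the by-domain
tree.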

After spending some time fighting with it I decided that it wasn't
worth the effort, and I upgraded to FreeBSD-9.0 to see if that would
help (I normally avoid x.0 releases). Once again the pool imported
fine, but I was still seeing similar panics. I ran a scrub across the
pool, which didn't help, then upgraded the pool to v28 and tried
again. When that failed I scrubbed again, but still no joy.

As a matter of interest I booted an OpenIndiana live CD and tried
copying the directories' contents to another location. I am now able
to list the directories, but there are still issues.

The issue seems to have shifted slightly. The stack trace from a
recent panic is below (GENERIC kernel on 9.0-RELEASE):

panic: avl_find()  succeeded inside avl_add()
cpuid = 0
KDB: stack backtrace:
#0 0xc0a4b157 at kdb_backtrace+0x47
#1 0xc0a186b7 at panic+0x117
#2 0xc5a2d7b2 at avl_add+0x52
#3 0xc5ac44e6 at zfs_fuid_table_load+0x1f6
#4 0xc5ac479e at zfs_fuid_init+0x14e
#5 0xc5ac4893 at zfs_fuid_find_by_idx+0xc3
#6 0xc5ac48ed at zfs_fuid_map_id+0x2d
#7 0xc5ac492f at zfs_groupmember+0x2f
#8 0xc5adbdcb at zfs_zaccess_aces_check+0x1db
#9 0xc5adc257 at zfs_zaccess+0xb7
#10 0xc5afa7d4 at zfs_freebsd_getattr+0x1f4
#11 0xc0d69322 at VOP_GETATTR_APV+0x42
#12 0xc0ab81c9 at vn_stat+0x79
#13 0xc0aaefdd at kern_statat_vnhook+0xfd
#14 0xc0aaf1cc at kern_statat+0x3c
#15 0xc0aaf156 at kern_lstat+0x36
#16 0xc0aaf1ff at sys_lstat+0x2f
#17 0xc0d49315 at syscall+0x355

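A rough way to see how far the two backtraces above agree is to walk
them from the innermost frame outwards and stop at the first
difference. The function names below are copied from the traces.
TRACE_FIRST, TRACE_SECOND and shared_inner_frames are names made up
for this sketch.

```python
# Function names from the two backtraces, innermost frame (#0) first.
TRACE_FIRST = """
kdb_backtrace panic avl_add zfs_fuid_table_load zfs_fuid_init
zfs_fuid_find_by_idx zfs_fuid_map_id zfs_groupmember
zfs_zaccess_aces_check zfs_zaccess zfs_zaccess_rwx zfs_freebsd_access
VOP_ACCESS_APV change_dir kern_chdir chdir syscallenter syscall
""".split()

TRACE_SECOND = """
kdb_backtrace panic avl_add zfs_fuid_table_load zfs_fuid_init
zfs_fuid_find_by_idx zfs_fuid_map_id zfs_groupmember
zfs_zaccess_aces_check zfs_zaccess zfs_freebsd_getattr
VOP_GETATTR_APV vn_stat kern_statat_vnhook kern_statat kern_lstat
sys_lstat syscall
""".split()


def shared_inner_frames(a, b):
    """Return the frames both traces share, starting from frame #0."""
    shared = []
    for x, y in zip(a, b):
        if x != y:
            break
        shared.append(x)
    return shared


shared = shared_inner_frames(TRACE_FIRST, TRACE_SECOND)
print(len(shared), shared[-1])      # 10 zfs_zaccess
# First frame at which the traces differ:
print(TRACE_FIRST[len(shared)], TRACE_SECOND[len(shared)])
# zfs_zaccess_rwx zfs_freebsd_getattr
```

On this reading frames #0 to #9 are the same in both panics, and only
the caller of zfs_zaccess differs: the chdir path in the first trace
and the lstat path in the second.
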
This time it appears to be related to some extended attribute(s). I
can do an ls on one of the directories in question, but an ls -la
causes a panic, so it would seem that the problem lies in some
attribute which is only shown in the long form of the ls output.
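
One way to narrow this down to individual entries is to read the
names only, as the plain ls does, and then lstat() them one at a time,
logging each name before the call. This is a minimal sketch:
probe_directory and its arguments are hypothetical, and it assumes the
log file lives on a filesystem outside the affected pool. The fsync
makes it likely, not certain, that the last name written survives a
panic.

```python
import os


def probe_directory(path, log_path):
    """lstat() every entry in path, logging each name before the call.

    If the kernel panics part-way through, the last name in the log is
    the entry whose lstat() was in flight. Returns the names that were
    statted successfully.
    """
    statted = []
    with open(log_path, "a") as log:
        # os.listdir() only reads the directory; it does not stat entries.
        for name in sorted(os.listdir(path)):
            full = os.path.join(path, name)
            log.write(full + "\n")
            log.flush()
            os.fsync(log.fileno())  # push the name to disk before the stat
            try:
                os.lstat(full)
            except OSError as exc:
                log.write("  lstat failed: %s\n" % exc)
                continue
            statted.append(name)
    return statted
```

It would be called as probe_directory("/tank/some/dir",
"/var/tmp/probe.log"), where both paths are placeholders. Expect the
machine to panic again while this runs against an affected directory.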

I've done some digging around via the magic of Google and this seems
to be a fairly common issue, but I've not found a solution for it,
barring copying the data off, recreating the pool and restoring the
data. I'd like to avoid that if at all possible.

If I could determine what the problematic attribute is, and find a
means to strip it (be that from FreeBSD or from an OpenIndiana live
CD), I think that would get me back up and running.

If anybody can provide some suggestions as to what I may be able to do
to resolve this issue in situ I would be very grateful.

Thanks,

Morgan


More information about the freebsd-stable mailing list