zfs diff deadlock
Peter Jeremy
peter at rulingia.com
Tue Nov 13 20:27:47 UTC 2012
On 2012-Nov-11 10:12:58 +0200, Andriy Gapon <avg at FreeBSD.org> wrote:
>on 11/11/2012 09:27 Peter Jeremy said the following:
>> On 2012-Nov-11 09:32:49 +1100, Peter Jeremy <peter at server.rulingia.com>
>> wrote:
>>> I recently decided to do a "zfs diff" between two snapshots to try and
>>> identify why there was so much "USED" space in the snapshot. The diff ran
>>> for a while (though with very little IO) but has now wedged unkillably.
>>> There's nothing on the console or in any logs, the pool reports no
>>> problems and there are no other visible FS issues. Any ideas on tracking
>>> this down?
>> ...
>>> The system is running a 4-month old 8-stable (r237444)
>>
>> I've tried a second system running the same world with the same result, so
>> this looks like a real bug in ZFS rather than a system glitch.
>>
>
>Are you able to catch the state of all threads in the system?
>E.g. via procstat -k -a.
>Or a crash dump.
Unfortunately, neither of those systems is really suitable for
debugging. I have set up a VirtualBox VM and sent most of the offending
filesystem to it. That gives somewhat different results: on a recent
8-stable (r242865M) I get a panic, whilst on a recent head I get an
"Unable to determine path or stats" error.
On 8-stable, I have a crashdump and the panic is:
suspending ithread with the following locks held:
shared spin mutex ({6") r = 0 (0xffffff005c395a80) locked @ /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zap.c:522
panic: witness_warn
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
kdb_backtrace() at kdb_backtrace+0x37
panic() at panic+0x1ce
witness_warn() at witness_warn+0x2b2
ithread_loop() at ithread_loop+0x112
fork_exit() at fork_exit+0x11d
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffff800008ccf0, rbp = 0 ---
Note that zap.c:522 is the rw_enter() in zap_get_leaf_byblk() - which
is the offending function in the backtrace on r237444.
On head, I get some normal differences terminated by:
Unable to determine path or stats for object 2128453 in tank/beckett/home@20120518: Invalid argument
A scrub reports no issues but the problem remains:
root@FB10-64:~ # zpool status
  pool: tank
 state: ONLINE
status: The pool is formatted using a legacy on-disk format. The pool can
        still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'. Once this is done, the
        pool will no longer be accessible on software that does not support feature
        flags.
  scan: scrub repaired 0 in 3h24m with 0 errors on Wed Nov 14 01:58:36 2012
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          ada2      ONLINE       0     0     0

errors: No known data errors
I've done some searching and found 2 hits on the message - one in an
OI IRC log and the other in a ZFS-on-Linux list. Neither offered any
insights.
I've tried ktracing the zfs diff and that ends:
1856 zfs CALL read(0x7,0x7fffffbfc160,0x18)
1856 zfs GIO fd 7 read 24 bytes
0x0000 0400 0000 0000 0000 e079 2000 0000 0000 397a 2000 0000 0000 |.........y .....9z .....|
1856 zfs RET read 24/0x18
1856 zfs CALL ioctl(0x3,0xd5985a36,0x7fffffbfc178)
1856 zfs RET ioctl 0
1856 zfs CALL read(0x7,0x7fffffbfc160,0x18)
1856 zfs GIO fd 7 read 24 bytes
0x0000 0200 0000 0000 0000 3a7a 2000 0000 0000 4d7a 2000 0000 0000 |........:z .....Mz .....|
1856 zfs RET read 24/0x18
1856 zfs CALL ioctl(0x3,0xd5985a39,0x7fffffbfab18)
1856 zfs RET ioctl -1 errno 2 No such file or directory
1856 zfs CALL ioctl(0x3,0xd5985a39,0x7fffffbfab18)
1856 zfs RET ioctl -1 errno 2 No such file or directory
1856 zfs CALL ioctl(0x3,0xd5985a39,0x7fffffbfab18)
1856 zfs RET ioctl 0
1856 zfs CALL ioctl(0x3,0xd5985a39,0x7fffffbfab18)
1856 zfs RET ioctl 0
1856 zfs CALL ioctl(0x3,0xd5985a39,0x7fffffbfab18)
1856 zfs RET ioctl -1 errno 2 No such file or directory
1856 zfs CALL ioctl(0x3,0xd5985a39,0x7fffffbfab18)
1856 zfs RET ioctl -1 errno 2 No such file or directory
1856 zfs CALL ioctl(0x3,0xd5985a39,0x7fffffbfab18)
1856 zfs RET ioctl 0
1856 zfs CALL ioctl(0x3,0xd5985a39,0x7fffffbfab18)
1856 zfs RET ioctl 0
1856 zfs CALL ioctl(0x3,0xd5985a39,0x7fffffbfab18)
1856 zfs RET ioctl -1 errno 2 No such file or directory
1856 zfs CALL ioctl(0x3,0xd5985a39,0x7fffffbfab18)
1856 zfs RET ioctl -1 errno 2 No such file or directory
1856 zfs CALL ioctl(0x3,0xd5985a39,0x7fffffbfab18)
1856 zfs RET ioctl -1 errno 2 No such file or directory
1856 zfs CALL ioctl(0x3,0xd5985a39,0x7fffffbfab18)
1856 zfs RET ioctl -1 errno 2 No such file or directory
1856 zfs CALL ioctl(0x3,0xd5985a39,0x7fffffbfab18)
1856 zfs RET ioctl 0
1856 zfs CALL ioctl(0x3,0xd5985a39,0x7fffffbfab18)
1856 zfs RET ioctl 0
1856 zfs CALL ioctl(0x3,0xd5985a39,0x7fffffbfab18)
1856 zfs RET ioctl -1 errno 2 No such file or directory
1856 zfs CALL ioctl(0x3,0xd5985a39,0x7fffffbfab18)
1856 zfs RET ioctl -1 errno 2 No such file or directory
1856 zfs CALL ioctl(0x3,0xd5985a39,0x7fffffbfab18)
1856 zfs RET ioctl 0
1856 zfs CALL ioctl(0x3,0xd5985a39,0x7fffffbfab18)
1856 zfs RET ioctl 0
1856 zfs CALL ioctl(0x3,0xd5985a39,0x7fffffbfab18)
1856 zfs RET ioctl -1 errno 2 No such file or directory
1856 zfs CALL ioctl(0x3,0xd5985a39,0x7fffffbfab18)
1856 zfs RET ioctl -1 errno 2 No such file or directory
1856 zfs CALL ioctl(0x3,0xd5985a39,0x7fffffbfab18)
1856 zfs RET ioctl 0
1856 zfs CALL ioctl(0x3,0xd5985a39,0x7fffffbfab18)
1856 zfs RET ioctl 0
1856 zfs CALL ioctl(0x3,0xd5985a39,0x7fffffbfab18)
1856 zfs RET ioctl -1 errno 22 Invalid argument
1856 zfs CALL close(0x1)
1856 zfs RET close 0
1856 zfs CALL close(0x7)
1856 zfs RET ioctl -1 errno 32 Broken pipe
1856 zfs CALL close(0x8)
1856 zfs RET close 0
1856 zfs CALL thr_kill(0x18adf,SIG 32)
1856 zfs RET thr_kill 0
1856 zfs CALL _umtx_op(0x802c06c00,0x2,0x18adf,0,0)
1856 zfs RET close 0
1856 zfs PSIG SIG 32 caught handler=0x8020537f0 mask=0x0 code=SI_LWP
1856 zfs CALL sigreturn(0x7fffffbfbca0)
1856 zfs RET sigreturn JUSTRETURN
1856 zfs CALL thr_exit(0x802c06c00)
1856 zfs RET _umtx_op 0
1856 zfs CALL close(0x6)
1856 zfs RET close 0
1856 zfs CALL stat(0x7fffffffa900,0x7fffffffa888)
1856 zfs NAMI "/usr/share/nls/C/libc.cat"
1856 zfs RET stat -1 errno 2 No such file or directory
1856 zfs CALL stat(0x7fffffffa900,0x7fffffffa888)
1856 zfs NAMI "/usr/share/nls/libc/C"
1856 zfs RET stat -1 errno 2 No such file or directory
1856 zfs CALL stat(0x7fffffffa900,0x7fffffffa888)
1856 zfs NAMI "/usr/local/share/nls/C/libc.cat"
1856 zfs RET stat -1 errno 2 No such file or directory
1856 zfs CALL stat(0x7fffffffa900,0x7fffffffa888)
1856 zfs NAMI "/usr/local/share/nls/libc/C"
1856 zfs RET stat -1 errno 2 No such file or directory
1856 zfs CALL write(0x2,0x7fffffffa740,0x65)
1856 zfs GIO fd 2 wrote 101 bytes
"Unable to determine path or stats for object 2128453 in tank/beckett/home@20120518: Invalid argument
"
1856 zfs RET write 101/0x65
1856 zfs CALL close(0x5)
1856 zfs RET close 0
1856 zfs CALL close(0x3)
1856 zfs RET close 0
1856 zfs CALL close(0x4)
1856 zfs RET close 0
1856 zfs CALL exit(0x1)
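
The raw values in the trace above can be unpacked with a short script. This is only a sketch under two assumptions that the trace itself does not confirm: that each 24-byte read on fd 7 holds three little-endian 64-bit integers (a record type followed by a first and a last object number), and that the ioctl command words follow the usual FreeBSD <sys/ioccom.h> layout (direction in the top three bits, a 13-bit parameter length, an 8-bit group and an 8-bit command number). The function names are made up for illustration.

```python
import struct

# The two 24-byte payloads read from fd 7, copied from the kdump output.
RECORDS = [
    "0400 0000 0000 0000 e079 2000 0000 0000 397a 2000 0000 0000",
    "0200 0000 0000 0000 3a7a 2000 0000 0000 4d7a 2000 0000 0000",
]


def decode_diff_record(hex_words):
    """Unpack one payload as (type, first, last).

    Assumption: three little-endian unsigned 64-bit fields.
    """
    raw = bytes.fromhex(hex_words.replace(" ", ""))
    return struct.unpack("<QQQ", raw)


def decode_ioctl(cmd):
    """Split an ioctl command word using the <sys/ioccom.h> bit layout."""
    return {
        "in": bool(cmd & 0x80000000),     # IOC_IN: data copied into the kernel
        "out": bool(cmd & 0x40000000),    # IOC_OUT: data copied back out
        "length": (cmd >> 16) & 0x1FFF,   # parameter size in bytes
        "group": chr((cmd >> 8) & 0xFF),  # ioctl group letter
        "number": cmd & 0xFF,             # command number within the group
    }


for rec in RECORDS:
    print(decode_diff_record(rec))
print(decode_ioctl(0xD5985A36))
print(decode_ioctl(0xD5985A39))
```

Under those assumptions the two records decode to (4, 2128352, 2128441) and (2, 2128442, 2128461), so the failing object 2128453 lies inside the range given by the second record. Both ioctls decode as group 'Z' with a 5528-byte in/out argument; the first is command 54 and the repeated one is command 57. Which ZFS operations those numbers correspond to depends on the source revision and is not determined here.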
--
Peter Jeremy