[REPORT] Some panics on ZFS

Sat Dec 23 06:42:02 PST 2006

On Sat, Dec 23, 2006 at 11:17:38AM +0900, Norikatsu Shigemura wrote:
> Hi pjd!
> 
> 	I noticed some problems on ZFS, so I report to you:-).
> 
> 	1. After using ZFS, when kldunload zfs, memory leaked about 8KB.
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> ZFS filesystem version 3
> ZFS storage pool version 3
> Warning: memory type solaris leaked memory on destroy (116 allocations, 7328 bytes leaked).
> ZFS filesystem version 3
> ZFS storage pool version 3
>  Warning: memory type solaris leaked memory on destroy (116 allocations, 7328 bytes leaked).
> ZFS filesystem version 3
> ZFS storage pool version 3
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

This one is already fixed.

> 	2. When not-kldload zfs, 'mount -t zfs tank /mnt' causes panic.
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> panic: mutex Giant owned at /usr/src/sys/modules/zfs/../../contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c:232
> cpuid = 0
> KDB: enter: panic
> [thread pid 1126 tid 100147 ]
> Stopped at      kdb_enter+0x30: leave
> db> bt
> Tracing pid 1126 tid 100147 td 0x851bed80
> kdb_enter(806b5d1c,0,806b4fe7,fa2e96ac,851bed80,...) at kdb_enter+0x30
> panic(806b4fe7,806b50bc,84c5e8e6,e8,852bb800,...) at panic+0x14e
> _mtx_assert(8072af88,2,84c5e8e6,e8,fa2e96f8,...) at _mtx_assert+0xfb
> vdev_geom_open(852bb800,fa2e9720,fa2e9728,175,806b653d,...) at vdev_geom_open+0x77
> vdev_open(852bb800,0,0,852bb000,852bb000,...) at vdev_open+0x180
> vdev_root_open(852bb000,fa2e9784,fa2e978c,806b653d,122,...) at vdev_root_open+0x42
> vdev_open(852bb000,84c57fa7,0,0,804a89f0,...) at vdev_open+0x180
> spa_load(1,0,2c,84868340,6,...) at spa_load+0x1d6
> spa_open_common(84c57c13,0,84c57c13,fa2e9870,804bdbe1,...) at spa_open_common+0x15f
> dsl_dir_open_spa(0,84868340,84c57e3b,fa2e99d0,fa2e99d4,...) at dsl_dir_open_spa+0x62
> dsl_dir_open(84868340,84c57e3b,fa2e99d0,fa2e99d4,e34,...) at dsl_dir_open+0x2e
> dsl_prop_get(84868340,84c5d09d,8,1,fa2e9a50,...) at dsl_prop_get+0x29
> dsl_prop_get_integer(84868340,84c5d09d,fa2e9a50,0,1c7,...) at dsl_prop_get_integer+0x38
> zfs_mount(853c57d4,851bed80,806bebc5,3ce,0,...) at zfs_mount+0x13a
> vfs_domount(851bed80,84bab450,84edc170,0,84bab9d0,...) at vfs_domount+0x77a
> vfs_donmount(851bed80,0,850e0b80,850e0b80,3,...) at vfs_donmount+0x4c9
> nmount(851bed80,fa2e9d00,c,c,fa2e9d38,...) at nmount+0xaa
> syscall(fa2e9d38) at syscall+0x2e3
> Xint0x80_syscall() at Xint0x80_syscall+0x20
> --- syscall (7, FreeBSD ELF32, wait4), eip = 0x2, esp = 0x207, ebp = 0x7fbfd948 ---
> db> 
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> 
> 	Other case too, mount -a with /etc/fstab like following line:
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> tank	/mnt	zfs	rw	0	0
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> 
> 	But 'kldload zfs; mount -t zfs tank /mnt' didn't panic.  f(?_?)?

It can be easly fixed, but bascially in zfs world you should mount file
systems after loading zfs.ko via:

	# zfs mount -a

and activate zvols:

	# zfs volinit

> 	3. When I'm testing with iozone, ZFS panic-ed.
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 00
> fault virtual address   = 0x4
> fault code              = supervisor read, page not present
> instruction pointer     = 0x20:0x84bef72b
> stack pointer           = 0x28:0xfa0378c0
> frame pointer           = 0x28:0xfa037900
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 17326 (iozone)
> [thread pid 17326 tid 100174 ]
> Stopped at      arc_release+0xab:       cmpl    %ecx,0x4(%eax)
> db> bt
> Tracing pid 17326 tid 100174 td 0x84ef3000
> arc_release(a782935c,b36901e0,36,8072a3a0,0,...) at arc_release+0xab
> dbuf_dirty(b36901e0,d6922a00,e28c000,0,4000,...) at dbuf_dirty+0x3ec
> dmu_write(84697b80,1909b,0,e28c000,0,...) at dmu_write+0x8f
> zfs_write(fa037b90,0,806b3987,0,0,...) at zfs_write+0x6c2
> VOP_WRITE_APV(84c61f00,fa037b90,84ef3000,806bfde7,23e,...) at VOP_WRITE_APV+0x14c
> vn_write(88e8b990,fa037c58,87c39e00,0,84ef3000,...) at vn_write+0x240
> dofilewrite(84ef3000,3,88e8b990,fa037c58,ffffffff,...) at dofilewrite+0x85
> kern_writev(84ef3000,3,fa037c58,8710000,30000,...) at kern_writev+0x60
> write(84ef3000,fa037d00,c,34a,84ef3000,...) at write+0x4f
> syscall(fa037d38) at syscall+0x2e3
> Xint0x80_syscall() at Xint0x80_syscall+0x20
> --- syscall (138674176), eip = 0x2, esp = 0x212, ebp = 0x401c8520 ---
> db> 
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Interesting, I'll try to reproduce it.

> 	Other case, but I think that this is current problem:-).
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> panic: mi_switch: switch in a critical section
> cpuid = 0
> KDB: enter: panic
> [thread pid 2262 tid 100131 ]
> Stopped at      kdb_enter+0x30: leave
> db> 
> db> 
> db> bt
> Tracing pid 2262 tid 100131 td 0x84f95510
> kdb_enter(806b5d1c,0,806b66d6,f9fe3b7c,84f95510,...) at kdb_enter+0x30
> panic(806b66d6,9,806b6699,15a,84f8d510,...) at panic+0x14e
> mi_switch(1,0,806b98ac,285,80730ed4,...) at mi_switch+0xdb
> turnstile_wait(8072ab08,84f8d510,0,166,84f8d512,...) at turnstile_wait+0x4c5
> _mtx_lock_sleep(8072ab08,84f95510,0,806b653d,171,...) at _mtx_lock_sleep+0x18a
> _mtx_lock_flags(8072ab08,0,806b653d,171,93,...) at _mtx_lock_flags+0xc8
> _sx_assert(85198cac,4,84f41a97,36,0,...) at _sx_assert+0x10b
> _sx_xunlock(85198cac,84f41a97,36,93,85198cac,...) at _sx_xunlock+0x28
> cv_timedwait(85198d64,85198cac,1388,5f,0,...) at cv_timedwait+0xd2
> txg_thread_wait(85198d64,5,85198d5c,85198d54,85198d64,...) at txg_thread_wait+0x3f
> txg_timelimit_thread(85198c00,f9fe3d38,806b2c0a,328,84f99900,...) at txg_timelimit_thread+0x75
> fork_exit(84f083c0,85198c00,f9fe3d38) at fork_exit+0xd1
> fork_trampoline() at fork_trampoline+0x8
> --- trap 0x1, eip = 0, esp = 0xf9fe3d6c, ebp = 0 ---
> db> 
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Not necessarily, it could be a bug in my Solaris compatible convar(9)
based on sx(9) locks.

> 	I saw some memory modified after free.
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> Memory modified after free 0x8f695c00(1020) val=deadc09e @ 0x8f695db0
> panic: Most recently used by solaris
> 
> cpuid = 1
> KDB: enter: panic
> [thread pid 1412 tid 100240 ]
> Stopped at      kdb_enter+0x30: leave
> db> bt     
> Tracing pid 1412 tid 100240 td 0x85209360
> kdb_enter(806b5d1c,1,806cfcfe,fa0df808,85209360,...) at kdb_enter+0x30
> panic(806cfcfe,84c73b15,3fc,deadc09e,8f695db0,...) at panic+0x14e
> mtrash_ctor(8f695c00,400,0,2,0,...) at mtrash_ctor+0x65
> uma_zalloc_arg(810675a0,0,2,8106ea00,88035000,...) at uma_zalloc_arg+0x12b
> malloc(240,84c7ab20,2,fa0df89c,84c0149e,...) at malloc+0xda
> kmem_alloc(240,2,88035000,8b261a00,fa0df8e4,...) at kmem_alloc+0x21
> kmem_zalloc(240,2,84944840,fa0df8cc,804bdbe1,...) at kmem_zalloc+0x1e
> zio_create(77,0,8b261a00,90b82000,4000,...) at zio_create+0x2d
> zio_write(8b56ac00,88035000,6,2,1,...) at zio_write+0x77
> arc_write(8b56ac00,88035000,6,2,1,...) at arc_write+0x105
> dbuf_sync(90aaa000,8b56ac00,87dfd000,0,84c72bf8,...) at dbuf_sync+0x3ca
> dnode_sync(888ca478,0,8b56ac00,87dfd000,8b56ac00,...) at dnode_sync+0x51b
> dmu_objset_sync_dnodes(87dfd000,400,84c72b11,806b4d34,ae,...) at dmu_objset_sync_dnodes+0x97
> dmu_objset_sync(8481c000,87dfd000,77,0,fa0dfbac,...) at dmu_objset_sync+0x60
> dsl_dataset_sync(87be7600,87dfd000,0,850a25cc,850a2584,...) at dsl_dataset_sync+0x24
> dsl_pool_sync(850a2400,77,0,1,0,...) at dsl_pool_sync+0x8f
> spa_sync(88035000,77,0,850a255c,850a2554,...) at spa_sync+0x453
> txg_sync_thread(850a2400,fa0dfd38,806b2c0a,328,85234240,...) at txg_sync_thread+0x1d7
> fork_exit(84c39d20,850a2400,fa0dfd38) at fork_exit+0xd1
> fork_trampoline() at fork_trampoline+0x8
> --- trap 0x1, eip = 0, esp = 0xfa0dfd6c, ebp = 0 ---
> db>
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Are you able to setup memguard(9)? Or maybe you can give me procedure to
reproduce it?

> 	4. Workaround for "kmem_malloc(X): kmem_map too small: Y total allocated"
> 	   a. write "options KVA_PAGES=512" to kernel config
> 	   b. write vm.kmem_size=1610612736 to /boot/loader.conf
> 	   c. use over 1.5GB physical memory:-)
> 	   I don't catch this type panic:-).  But top(1) reported ...

I implement quite good work-around few days ago, which should fix make
it much harder to trigger this panic (I'm not able to do it anymore, but
I don't think what I got is the real fix).

Thanks for our feedback!

I'll try to reproduce iozone panic, I'd also like to reproduce "Memory
modified after free" and "mi_switch: switch in a critical section"
panics, can you tell me how to do it?

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd at FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 187 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-fs/attachments/20061223/1998cbb8/attachment.pgp