ZFS txg implementation flaw
Slawa Olhovchenkov
slw at zxy.spb.ru
Mon Oct 28 21:43:56 UTC 2013
On Mon, Oct 28, 2013 at 02:38:30PM -0700, Xin Li wrote:
>
> On 10/28/13 14:32, Slawa Olhovchenkov wrote:
> > On Mon, Oct 28, 2013 at 02:22:16PM -0700, Jordan Hubbard wrote:
> >
> >>
> >> On Oct 28, 2013, at 2:28 AM, Slawa Olhovchenkov <slw at zxy.spb.ru>
> >> wrote:
> >>
> >>> As far as I can see, ZFS creates a separate thread for each txg
> >>> write, and also for writes to L2ARC. As a result, up to several
> >>> thousand threads are created and destroyed per second, plus
> >>> hundreds of thousands of page allocations, zeroings, mappings,
> >>> unmappings and frees per second. Very high overhead.
> >>
> >> How are you measuring the number of threads being created /
> >> destroyed? This claim seems erroneous given how the ZFS thread
> >> pool mechanism actually works (and yes, there are thread pools
> >> already).
> >>
> >> It would be helpful to both see your measurement methodology and
> >> the workload you are using in your tests.
> >
> > Semi-indirect. dtrace -n 'fbt:kernel:vm_object_terminate:entry {
> > @traces[stack()] = count(); }'
> >
> > After some (2-3) seconds
> >
> >       kernel`vnode_destroy_vobject+0xb9
> >       zfs.ko`zfs_freebsd_reclaim+0x2e
> >       kernel`VOP_RECLAIM_APV+0x78
> >       kernel`vgonel+0x134
> >       kernel`vnlru_free+0x362
> >       kernel`vnlru_proc+0x61e
> >       kernel`fork_exit+0x11f
> >       kernel`0xffffffff80cdbfde
> >      2490
(The last frame above, kernel`0xffffffff80cdbfde, is fork_trampoline+14,
i.e. the return address of the call to fork_exit:)

0xffffffff80cdbfd0 <fork_trampoline>: mov %r12,%rdi
0xffffffff80cdbfd3 <fork_trampoline+3>: mov %rbx,%rsi
0xffffffff80cdbfd6 <fork_trampoline+6>: mov %rsp,%rdx
0xffffffff80cdbfd9 <fork_trampoline+9>: callq 0xffffffff808db560 <fork_exit>
0xffffffff80cdbfde <fork_trampoline+14>: jmpq 0xffffffff80cdca80 <doreti>
0xffffffff80cdbfe3 <fork_trampoline+19>: nopw 0x0(%rax,%rax,1)
0xffffffff80cdbfe9 <fork_trampoline+25>: nopl 0x0(%rax)
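
To count thread creation/teardown more directly, something like this
should work (untested sketch; assumes fbt can attach to fork_exit and
thread_exit on this kernel):

dtrace -n '
    /* new kernel threads begin execution in fork_exit() (see stack above) */
    fbt:kernel:fork_exit:entry   { @starts["threads started/s"] = count(); }
    /* exiting threads should pass through thread_exit() */
    fbt:kernel:thread_exit:entry { @exits["threads exited/s"] = count(); }
    tick-1s { printa(@starts); printa(@exits); clear(@starts); clear(@exits); }'
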
> > I don't have user processes creating threads, nor doing fork/exit.
>
> This has nothing to do with fork/exit but does suggest that you are
> running out of vnodes.  What does sysctl -a | grep vnode say?
kern.maxvnodes: 1095872
kern.minvnodes: 273968
vm.stats.vm.v_vnodepgsout: 0
vm.stats.vm.v_vnodepgsin: 62399
vm.stats.vm.v_vnodeout: 0
vm.stats.vm.v_vnodein: 10680
vfs.freevnodes: 275107
vfs.wantfreevnodes: 273968
vfs.numvnodes: 316321
debug.sizeof.vnode: 504
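
For reference, vnode recycling by vnlru can also be watched over time
(rough sketch, same caveats as above):

dtrace -n '
    /* vnlru_free() is the recycler seen in the stack trace above */
    fbt:kernel:vnlru_free:entry { @calls["vnlru_free() calls/s"] = count(); }
    tick-1s { printa(@calls); clear(@calls); }'

together with something like:

while :; do sysctl vfs.numvnodes vfs.freevnodes vfs.wantfreevnodes; sleep 1; done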