FreeBSD 10.0-BETA1 #8 r256765M spends too much time in locks
Vitalij Satanivskij
satan at ukr.net
Thu Oct 24 07:50:26 UTC 2013
Forgot to mention - the old system had no compression on the main zfs.
Vitalij Satanivskij wrote:
VS> Hello.
VS>
VS> After upgrading the system from an old current (r245701) to a fresh current r255173 (then switching to stable/10 r256765M),
VS> we found some strange system behavior:
VS>
VS> The diff between r256765M and r256765 is:
VS> Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c
VS> ===================================================================
VS> --- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c (revision 256765)
VS> +++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c (working copy)
VS> @@ -5147,7 +5147,7 @@
VS> len = l2hdr->b_asize;
VS> cdata = zio_data_buf_alloc(len);
VS> csize = zio_compress_data(ZIO_COMPRESS_LZ4, l2hdr->b_tmp_cdata,
VS> - cdata, l2hdr->b_asize, (size_t)SPA_MINBLOCKSIZE);
VS> + cdata, l2hdr->b_asize, (size_t)(1ULL << l2hdr->b_dev->l2ad_vdev->vdev_ashift));
VS>
VS> if (csize == 0) {
VS> /* zero block, indicate that there's nothing to write */
VS>
VS>
VS> But the same situation existed before the patch.
VS>
VS>
VS> The system load is too high:
VS> CPU: 6.8% user, 0.0% nice, 57.3% system, 0.8% interrupt, 35.1% idle
VS>
VS> hotkernel (a DTrace script) says:
VS>
VS> kernel`__mtx_unlock_flags 292 0.3%
VS> kernel`__mtx_lock_flags 315 0.3%
VS> zfs.ko`lzjb_compress 349 0.3%
VS> kernel`__rw_wlock_hard 686 0.7%
VS> kernel`spinlock_exit 1050 1.0%
VS> kernel`vmem_xalloc 7516 7.3%
VS> kernel`_sx_xlock_hard 8664 8.5%
VS> kernel`acpi_cpu_c1 9737 9.5%
VS> kernel`cpu_idle 20189 19.7%
VS> kernel`__mtx_lock_sleep 45952 44.9%
VS>
VS>
VS>
VS> Trying to understand where the problem is, I built a kernel with lock profiling
VS>
VS> and collected some data (for one minute)
VS>
VS> (file attached)
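
[For reference, a sketch of how such data is typically collected - this assumes the standard FreeBSD LOCK_PROFILING facility, which the output format above suggests was used:]

```
# kernel config:
options     LOCK_PROFILING

# then, at runtime, profile for one minute and dump the stats:
sysctl debug.lock.prof.reset=1
sysctl debug.lock.prof.enable=1
sleep 60
sysctl debug.lock.prof.enable=0
sysctl debug.lock.prof.stats > lockprof.txt
```
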
VS>
VS> Using aggregation, I found that most of the lock contention is in:
VS>
VS> 14,159818 /usr/src/sys/kern/subr_vmem.c:1128(sleep mutex:kmem arena)
VS> 9,553523 /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1597(sx:buf_hash_table.ht_locks[i].ht_lock)
VS> 8,386943 /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:3541(sx:l2arc_buflist_mtx)
VS> 8,110206 /usr/src/sys/kern/subr_vmem.c:1230(sleep mutex:kmem arena)
VS> 5,909429 /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1969(sx:arc_mru->arcs_locks[i].arcs_lock)
VS> 5,452206 /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1969(sx:arc_mfu->arcs_locks[i].arcs_lock)
VS> 5,050224 /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c:303(sx:tx->tx_cpu[c].tc_open_lock)
VS> 4,232819 /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:891(sx:buf_hash_table.ht_locks[i].ht_lock)
VS> 4,211348 /usr/src/sys/kern/vfs_subr.c:2101(lockmgr:zfs)
VS> 4,011656 /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:862(sx:buf_hash_table.ht_locks[i].ht_lock)
VS> 3,823698 /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:2009(sx:arc_eviction_mtx)
VS> 2,697344 /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:126(sx:h->hash_mutexes[i])
VS> 2,343242 /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1256(sx:arc_mfu->arcs_locks[i].arcs_lock)
VS> 1,752713 /usr/src/sys/kern/vfs_lookup.c:707(lockmgr:zfs)
VS> 1,580856 /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c:1136(sx:zfsvfs->z_hold_mtx[i])
VS> 1,534242 /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1291(sx:arc_mfu->arcs_locks[i].arcs_lock)
VS> 1,331583 /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:129(sx:db->db_mtx)
VS> 1,105058 /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c:427(sx:vq->vq_lock)
VS> 1,080855 /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c:396(sx:vq->vq_lock)
VS> 0,858568 /usr/src/sys/kern/vfs_cache.c:488(rw:Name Cache)
VS> 0,831652 /usr/src/sys/vm/vm_kern.c:344(rw:kmem vm object)
VS> 0,815439 /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1845(sx:buf_hash_table.ht_locks[i].ht_lock)
VS> 0,812613 /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1256(sx:arc_mru->arcs_locks[i].arcs_lock)
VS> 0,794274 /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:1529(lockmgr:zfs)
VS> 0,669845 /usr/src/sys/vm/uma_core.c:2097(sleep mutex:zio_cache)
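
[As an illustration of the aggregation step above - the column layout of the lock-profiling dump is assumed here (field 4 as the accumulated wait time, the trailing fields as the "file:line (lock)" acquisition point; check the header line on your kernel), and the sample values are made up:]

```shell
# Sample rows in a LOCK_PROFILING-style layout (values are made up):
# max wait_max total wait_total count ... name
cat > /tmp/lock.sample <<'EOF'
10 5 100 400 7 3 2 1 0 subr_vmem.c:1128 (sleep mutex:kmem arena)
10 5 100 300 7 3 2 1 0 arc.c:1597 (sx:buf_hash_table.ht_locks[i].ht_lock)
10 5 100 300 7 3 2 1 0 subr_vmem.c:1128 (sleep mutex:kmem arena)
EOF

# Sum the wait time (field 4) per acquisition point and print each
# point's share of the total, largest first:
awk '{
    key = $10; for (i = 11; i <= NF; i++) key = key " " $i
    wait[key] += $4; total += $4
} END {
    for (k in wait) printf "%.6f %s\n", 100 * wait[k] / total, k
}' /tmp/lock.sample | sort -rn
```
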
VS>
VS>
VS>
VS>
VS> Short system description:
VS> CPU E5-1650
VS> MEM 128Gb ddr3-1600
VS>
VS> Storage subsystem
VS>
VS> 36x1TB WD RE4 drives on an LSI SAS2308 controller
VS> 3x180GB Intel SSD 530 series as L2ARC cache
VS>
VS>
VS> The pool is 18 mirrors, two drives in each mirror, plus 3 cache devices,
VS>
VS> eg.
VS> ....
VS> mirror-14 ONLINE 0 0 0
VS> gpt/disk28 ONLINE 0 0 0
VS> gpt/disk29 ONLINE 0 0 0
VS> mirror-15 ONLINE 0 0 0
VS> gpt/disk30 ONLINE 0 0 0
VS> gpt/disk31 ONLINE 0 0 0
VS> mirror-16 ONLINE 0 0 0
VS> gpt/disk32 ONLINE 0 0 0
VS> gpt/disk33 ONLINE 0 0 0
VS> mirror-17 ONLINE 0 0 0
VS> gpt/disk34 ONLINE 0 0 0
VS> gpt/disk35 ONLINE 0 0 0
VS> cache
VS> ada1 ONLINE 0 0 0
VS> ada2 ONLINE 0 0 0
VS> ada3 ONLINE 0 0 0
VS>
VS>
VS> The pool has two ZFS filesystems.
VS>
VS> The main one, with options (diffs from default):
VS> compression lz4
VS> secondarycache all
VS> sync disabled
VS>
VS> Data size for it is around 6TB (compressed), e.g.:
VS> disk1 refcompressratio 1.56x -
VS> disk1 written 5,99T -
VS> disk1 logicalused 10,8T -
VS> disk1 logicalreferenced 9,32T -
VS>
VS>
VS> and another, with options:
VS> recordsize 4K (before it was 32K, but the internal software mostly uses 4K data blocks, so we tried changing it, without reallocating the existing data)
VS> compression off
VS> sync disabled
VS> secondarycache all
VS>
VS> Data size on it is around 1.4TB.
VS>
VS> The ARC is configured to use at most 80GB.
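
[Such limits are typically set via loader tunables - a sketch, assuming the usual /boot/loader.conf mechanism, with values mirroring those stated here:]

```
# /boot/loader.conf
vfs.zfs.arc_max="80G"
vfs.zfs.arc_min="40G"
```
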
VS>
VS> top output most of the time looks like this:
VS>
VS> Mem: 14G Active, 13G Inact, 94G Wired, 497M Cache, 3297M Buf, 2214M Free
VS> ARC: 80G Total, 2010M MFU, 70G MRU, 49M Anon, 3822M Header, 4288M Other
VS>
VS>
VS> Load averages are around 10-20, depending on the time of day.
VS>
VS>
VS> zpool iostat disk1 1
VS> capacity operations bandwidth
VS> pool alloc free read write read write
VS> ---------- ----- ----- ----- ----- ----- -----
VS> disk1 7,45T 8,86T 546 1,49K 16,4M 13,4M
VS> disk1 7,45T 8,86T 272 3,91K 11,7M 27,4M
VS> disk1 7,45T 8,86T 344 2,94K 7,26M 25,2M
VS> disk1 7,45T 8,86T 224 2,02K 9,91M 21,8M
VS> disk1 7,45T 8,86T 244 2,35K 8,23M 18,4M
VS> disk1 7,45T 8,86T 245 2,54K 6,44M 23,4M
VS> disk1 7,45T 8,86T 114 2,94K 3,28M 13,3M
VS> disk1 7,45T 8,86T 288 4,43K 6,09M 33,5M
VS> disk1 7,45T 8,86T 157 1,26K 2,98M 15,7M
VS> disk1 7,45T 8,86T 94 842 3,07M 13,6M
VS> disk1 7,45T 8,86T 327 1,71K 9,63M 8,21M
VS> disk1 7,45T 8,86T 237 1,81K 5,73M 29,3M
VS> disk1 7,45T 8,86T 247 3,47K 5,17M 29,6M
VS> disk1 7,45T 8,86T 165 2,37K 3,22M 16,7M
VS> disk1 7,45T 8,86T 155 3,23K 3,27M 23,9M
VS>
VS> Strange, as the timeout is set to 10 seconds.
VS>
VS> What's interesting - after a reboot the system works fine for some time, at least until the ARC reaches its 80G size.
VS> The low limit for the ARC is 40GB. Strangely, the old system could take memory back from the ARC, e.g. like this:
VS>
VS>
VS> Mem: 32G Active, 8797M Inact, 78G Wired, 2370M Cache, 209M Buf, 3977M Free
VS> ARC: 43G Total, 2204M MFU, 28G MRU, 135M Anon, 7898M Header, 5301M Other
VS>
VS> On the new one, the ARC grows to its maximum allowed size.
VS>
VS> So for now the question is: what can the cause be, and what can we try in order to improve system performance?
VS>
VS>
VS>