zfs & waiting on zio->io_cv

Dan Nelson dnelson at allantgroup.com
Fri Oct 24 15:09:19 UTC 2008


In the last episode (Oct 24), Danny Braniss said:
> there is a big delay (probably more than 1 sec.) when doing simple tasks
> on this zfs, like ls(1), or 'zfs list', long enough to hit ^T
> and get the same [zio->io_cv)], any hints?
> 
> store-01# zfs list
> (hitting ^T)load: 0.00  cmd: zfs 88376 [zio->io_cv)] 0.00u 0.00s 0% 1672k
> (hitting ^T)load: 0.00  cmd: zfs 88376 [zio->io_cv)] 0.00u 0.00s 0% 1684k
> NAME              USED  AVAIL  REFER  MOUNTPOINT
> h                 472G  11.2T    23K  /h
> h/home            466G  11.2T   466G  /h/home
> h/home@23-10-08    54K      -   466G  -
> h/root             18K  11.2T    18K  /h/root
> h/src              18K  11.2T    18K  /h/src
> h/system         5.64G  11.2T  5.64G  /h/system

That's roughly the equivalent of waiting in "biord" on a UFS
filesystem, I think.  ZFS is just waiting for the disk to return a
block.  If you happen to do something during the window where ZFS is
committing its transaction group, it has to wait until the sync
finishes.  If some other process is doing a lot of writes, or you only
have one disk in your zpool, or your pool is close to full, it may take
a couple of seconds to sync.

There are a couple of things you can try to improve interactive
performance.  Raising ZFS's arc_max is the easiest to do, and will let
ZFS cache more data, increasing the likelihood that an "ls" will be
able to read from cache instead of having to go to disk.  Setting it to
1/4 of your physical RAM is probably as high as you can go without
causing panics.
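
As a rough illustration of the 1/4-of-RAM rule of thumb above, here is
a small sketch that turns a physical memory size into a loader.conf
line.  The tunable name vfs.zfs.arc_max and the 8 GB machine are my
assumptions for the example, not something taken from the original
report; check the name against your own system before using it.

```python
def suggested_arc_max(physmem_bytes: int) -> int:
    """Return one quarter of physical RAM, in bytes (the rule of thumb above)."""
    return physmem_bytes // 4


def loader_conf_line(physmem_bytes: int) -> str:
    """Format the suggestion as a loader.conf-style line, in whole MiB.

    The tunable name vfs.zfs.arc_max is an assumption for illustration.
    """
    mib = suggested_arc_max(physmem_bytes) // (1024 * 1024)
    return f'vfs.zfs.arc_max="{mib}M"'


if __name__ == "__main__":
    # Hypothetical machine with 8 GiB of RAM.
    print(loader_conf_line(8 * 1024**3))
```

Treat the result as an upper bound to start from and lower it if the
box becomes unstable.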

Raising txg_time (in /sys/cddl/.../zfs/txg.c) from 5 to, say, 30
seconds will tell ZFS to sync less often, which can be a win if you
don't actually do that much writing.  With a single spindle, it may
take a substantial fraction of a second just to sync a tiny txg due to
the number of copies of metadata ZFS writes for redundancy.
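
To put rough numbers on why a longer txg_time helps, here is a toy
model.  It assumes syncs start every txg_time seconds and each one
blocks reads for a fixed duration; the 0.5 second sync time is a
made-up figure standing in for "a substantial fraction of a second",
not a measurement.

```python
def syncs_per_hour(txg_time_s: float) -> float:
    """Number of transaction group syncs started per hour."""
    return 3600 / txg_time_s


def chance_of_hitting_sync(sync_duration_s: float, txg_time_s: float) -> float:
    """Fraction of wall-clock time spent syncing in this toy model.

    This is also the chance that a command issued at a random moment
    lands inside a sync window.
    """
    return min(1.0, sync_duration_s / txg_time_s)


if __name__ == "__main__":
    for txg_time in (5, 30):
        print(
            f"txg_time={txg_time}s: "
            f"{syncs_per_hour(txg_time):.0f} syncs/hour, "
            f"{chance_of_hitting_sync(0.5, txg_time):.1%} of time syncing"
        )
```

Under those assumptions, going from 5 to 30 cuts the syncs from 720 to
120 per hour, and the odds of an "ls" landing in a sync window from
about 10% to under 2%.  A real pool with more dirty data per txg will
take longer per sync, so this only shows the direction of the effect.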

If you do a lot of writing, lowering zfs_vdev_max_pending (in
/sys/cddl/.../zfs/vdev_queue.c) from 35 down to 16 or less will reduce
the number of simultaneous I/Os ZFS will try to send to each disk,
which will let your reads compete a little better with other I/O.  On
ATA or SATA disks, you might want to set it to 2.
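
A back-of-the-envelope sketch of why the queue depth matters for
reads: if a read arrives behind a full queue and the disk works
through the queue one request at a time, the read waits roughly
max_pending times the per-request service time.  The 8 ms service
time is an assumed figure for a single spindle, and the model ignores
any reordering the drive does.

```python
def worst_case_read_wait_ms(max_pending: int, service_ms: float) -> float:
    """Time a newly queued read waits behind a full queue, in milliseconds.

    Assumes strictly one-at-a-time service with a fixed cost per request.
    """
    return max_pending * service_ms


if __name__ == "__main__":
    SERVICE_MS = 8.0  # assumed per-I/O service time, not measured
    for depth in (35, 16, 2):
        wait = worst_case_read_wait_ms(depth, SERVICE_MS)
        print(f"zfs_vdev_max_pending={depth}: up to ~{wait:.0f} ms")
```

With those numbers the worst case drops from about 280 ms at 35 to
128 ms at 16 and 16 ms at 2, at the cost of giving the disk less work
to schedule at once.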

-- 
	Dan Nelson
	dnelson at allantgroup.com


More information about the freebsd-hackers mailing list