zfs & waiting on zio->io_cv

Jeremy Chadwick koitsu at FreeBSD.org
Sat Oct 25 08:02:01 UTC 2008


On Sat, Oct 25, 2008 at 09:48:15AM +0200, Danny Braniss wrote:
> > In the last episode (Oct 24), Danny Braniss said:
> > > there is a big delay (probably more than 1 sec.) when doing simple tasks
> > > on this zfs, like ls(1), or 'zfs list', long enough to hit ^T
> > > and get the same [zio->io_cv)], any hints?
> > > 
> > > store-01# zfs list
> > > (hitting ^T)load: 0.00  cmd: zfs 88376 [zio->io_cv)] 0.00u 0.00s 0% 1672k
> > > (hitting ^T)load: 0.00  cmd: zfs 88376 [zio->io_cv)] 0.00u 0.00s 0% 1684k
> > > NAME              USED  AVAIL  REFER  MOUNTPOINT
> > > h                 472G  11.2T    23K  /h
> > > h/home            466G  11.2T   466G  /h/home
> > > h/home@23-10-08    54K      -   466G  -
> > > h/root             18K  11.2T    18K  /h/root
> > > h/src              18K  11.2T    18K  /h/src
> > > h/system         5.64G  11.2T  5.64G  /h/system
> > 
> > That's sort of the equivalent to waiting in "biord" on a UFS
> > filesystem, I think.  ZFS is just waiting for the disk to return a
> > block.  If you happen to do something during the window where ZFS is
> > committing its transaction group, it has to wait until the sync
> > finishes.  If some other process is doing a lot of writes, or you only
> > have one disk in your zpool, or your pool is close to full, it may take
> > a couple seconds to sync.
> > 
> > There's a couple of things you can try to improve interactive
> > performance.  Raising zfs's arc_max is the easiest to do, and will let
> > ZFS cache more stuff, increasing the likelihood that an "ls" will be
> > able to read from cache instead of having to go to disk.  Setting it at
> > 1/4 your physical RAM is probably as high as you can go without causing
> > panics.
> > 
> > Raising txg_time ( in /sys/cddl/.../zfs/txg.c ) from 5 to
> > say 30 will tell zfs to sync less often, which can be a win if you
> > don't actually do that much writing.  With a single spindle, it may
> > take a substantial fraction of a second just to sync a tiny txg due to
> > the number of copies of metadata ZFS writes for redundancy.
> > 
> > If you do a lot of writing, lowering zfs_vdev_max_pending ( in
> > /sys/cddl/.../zfs/vdev_queue.c ) from 35 down to 16 or less will reduce
> > the number of simultaneous I/Os ZFS will try to send to each disk,
> > which will let your reads compete a little better with other I/O.  On
> > ATA or SATA disks, you might want to set it to 2.
> > 
> ok, forgot to mention a small detail: the machine is a quad core with 8GB
> of main memory, and the disks are 14x1TB connected via a PERC (RAID5).
> tests show that disk access is quite fast, over 200MB/s.
> 
> the 'delays' are seen when the machine is totally idle (it's not in
> production yet) and has been up for some time. btw, I can't reproduce
> the 'delay', so I think it has to do with caching.
> 
> I guess this beast needs some tuning, are there any tools out there
> to monitor/tune ZFS?

Monitor ZFS: sysctl
Tune ZFS: vi /boot/loader.conf or sysctl
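
For the arc_max suggestion quoted above, here is a rough sketch of how one
might derive a starting value: take a quarter of physical RAM and print it
as a /boot/loader.conf line.  The helper name arc_max_line is made up for
illustration, the 8 GB fallback simply matches the machine described in
this thread, and the 1/4 figure is the rule of thumb from the quoted
advice, not a tested limit.

```shell
#!/bin/sh
# Sketch only: arc_max_line is a hypothetical helper, not part of FreeBSD.
# It takes physical memory in bytes and prints a loader.conf line that
# sets vfs.zfs.arc_max to one quarter of it, rounded down to whole MB.
arc_max_line() {
    physmem_bytes=$1
    arc_mb=$(( physmem_bytes / 4 / 1024 / 1024 ))
    printf 'vfs.zfs.arc_max="%sM"\n' "$arc_mb"
}

# On FreeBSD, hw.physmem reports physical memory in bytes.  Elsewhere,
# fall back to 8 GB (an assumption matching the box discussed here).
physmem=$(sysctl -n hw.physmem 2>/dev/null) || physmem=8589934592
[ -n "$physmem" ] || physmem=8589934592

arc_max_line "$physmem"
```

Fed exactly 8 GB (8589934592 bytes) this prints vfs.zfs.arc_max="2048M";
hw.physmem usually reports a bit less than the installed amount, so expect
a slightly smaller number on the real machine.  To see what there is to
monitor, "sysctl vfs.zfs" and "sysctl kstat.zfs.misc.arcstats" list the
ZFS tunables and the ARC counters.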

I'm not sure what you're looking for.  :-)

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |



More information about the freebsd-hackers mailing list