ZFS vs UFS2 overhead and may be a bug?

Bakul Shah bakul at bitblocks.com
Thu May 3 21:15:03 UTC 2007


> Interesting. There are two problems. First is that cat(1) uses
> st_blksize to find out best size of I/O request and we force it to
> PAGE_SIZE, which is very, very wrong for ZFS - it should be equal to
> recordsize. I need to find discussion about this:
> 
> 	/*
> 	 * According to www.opengroup.org, the meaning of st_blksize is
> 	 *   "a filesystem-specific preferred I/O block size for this
> 	 *    object.  In some filesystem types, this may vary from file
> 	 *    to file"
> 	 * Default to PAGE_SIZE after much discussion.
> 	 * XXX: min(PAGE_SIZE, vp->v_bufobj.bo_bsize) may be more
> 	 * correct.
> 	 */
> 
> 	sb->st_blksize = PAGE_SIZE;

This does seem suboptimal.  Almost always a program reads an
entire file, and the overhead of going to the disk is high
enough that one may as well read small files in a single
syscall.  Apps that want to keep lots and lots of files open
can always adjust the buffer size down.
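To make the point concrete, here is a minimal sketch of how a
cat(1)-style tool sizes its I/O from st_blksize; copy_file is a
hypothetical helper, not actual cat(1) source.  With the kernel code
quoted above, st_blksize comes back as PAGE_SIZE (4k), even when ZFS
would prefer its 128k recordsize:

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Sketch of the cat(1) pattern: fstat(2) the file and use
 * st_blksize as the read buffer size.  Returns 0 on success,
 * -1 on error. */
int
copy_file(const char *path)
{
	int fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	struct stat sb;
	if (fstat(fd, &sb) < 0) {
		close(fd);
		return -1;
	}

	/* st_blksize is the "preferred I/O block size"; with the
	 * code quoted above this is always PAGE_SIZE, so on ZFS a
	 * 128k record is consumed in 32 small reads. */
	size_t bufsize = (size_t)sb.st_blksize;
	char *buf = malloc(bufsize);
	if (buf == NULL) {
		close(fd);
		return -1;
	}

	ssize_t n;
	while ((n = read(fd, buf, bufsize)) > 0) {
		if (write(STDOUT_FILENO, buf, (size_t)n) != n) {
			n = -1;
			break;
		}
	}

	free(buf);
	close(fd);
	return (n < 0) ? -1 : 0;
}
```

If st_blksize instead reported the ZFS recordsize, this same loop
would issue one read per record with no change to the application.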

Since disk seek time is the largest cost component,
contiguously allocated data should ideally be read in one
access in order to avoid any extra seeks.  At the very least
st_blksize should be as large as the minimum unit of
contiguous allocation (== the filesystem block size).  Even V7
unix had this!

> I tested it on Solaris and this is not a FreeBSD-specific problem; the
> same happens on Solaris. Is there a chance you could send your observations
> to zfs-discuss at opensolaris.org, but with just the comparison between
> dd(1) with bs=128k and bs=4k (the other tests might be confusing)?

I just did so.
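The syscall-count side of that dd(1) comparison is easy to see in
isolation.  The sketch below (count_reads is a hypothetical helper)
just counts how many read(2) calls it takes to consume a file at a
given buffer size, which is the part of the overhead that st_blksize
controls:

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Count the read(2) calls needed to consume a file at a given
 * buffer size, mirroring dd bs=4k vs bs=128k.  Returns the call
 * count, or -1 on error. */
long
count_reads(const char *path, size_t bufsize)
{
	int fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	char *buf = malloc(bufsize);
	if (buf == NULL) {
		close(fd);
		return -1;
	}
	long calls = 0;
	ssize_t n;
	while ((n = read(fd, buf, bufsize)) > 0)
		calls++;
	free(buf);
	close(fd);
	return (n < 0) ? -1 : calls;
}
```

For the same file, bs=4k issues 32x as many syscalls as bs=128k, and
on ZFS each of those small reads still pays for locating and
checksumming a full 128k record.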


More information about the freebsd-fs mailing list