MAXBSIZE increase

Rick Macklem rmacklem at uoguelph.ca
Sun Mar 29 02:56:20 UTC 2015


Kostik wrote:
> On Fri, Mar 27, 2015 at 10:57:05PM +0200, Alexander Motin wrote:
> > Hi.
> > 
> > Experimenting with NFS and ZFS I found an inter-operation issue:
> > ZFS by default uses 128KB blocks, while FreeBSD NFS (both client
> > and server) is limited to 64KB requests by the value of MAXBSIZE.
> > On file rewrite, that limitation makes ZFS do a slow
> > read-modify-write cycle for every write operation, instead of just
> > writing the new data.  A trivial iozone test shows a major
> > difference between initial write and rewrite speeds because of
> > this issue.
> > 
> > Looking through the sources I found, and in r280347 fixed, a
> > number of improper MAXBSIZE use cases in device drivers.  After
> > that I see no reason why MAXBSIZE cannot be increased to at least
> > 128KB to match the ZFS default (ZFS now supports blocks up to 1MB,
> > but that is not the default and so far rare).  I've made a test
> > build and also successfully created a UFS file system with a 128KB
> > block size -- not sure that is needed, but it seems to survive
> > this change well too.
> > 
> > Is there anything I am missing, or is it safe to raise this limit
> > now?
> 
> This post is useless after Bruce's explanation, but I still want to
> highlight the most important point from that long story:
> 
> increasing MAXBSIZE without tuning the other buffer cache parameters
> would unbalance the buffer cache.  Allowing bigger buffers increases
> fragmentation while limiting the total number of buffers.  It also
> changes the tuning of the runtime limits on the amount of I/O in
> flight; see the hi/lo runningspace initialization.
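
To make Alexander's rewrite penalty concrete, here is a minimal
sketch (not ZFS source, just the shape of the problem): a write that
covers only part of a record forces a read of the old record before
the merge, while a full-record write skips the read entirely.

/*
 * Sketch of the partial-record rewrite path described above; the
 * record array stands in for the on-disk record.  Assumes
 * off + len <= RECORDSIZE.
 */
#include <stdio.h>
#include <string.h>

#define RECORDSIZE	(128 * 1024)	/* ZFS default recordsize */
#define NFS_IOSIZE	(64 * 1024)	/* MAXBSIZE-limited NFS write */

static char record[RECORDSIZE];

static void
write_record(const char *buf, size_t off, size_t len)
{
	if (len < RECORDSIZE) {
		/* Partial overwrite: read-modify-write. */
		char old[RECORDSIZE];
		memcpy(old, record, RECORDSIZE);	/* "read" old record */
		memcpy(old + off, buf, len);		/* merge new data */
		memcpy(record, old, RECORDSIZE);	/* "write" it back */
		printf("%zu-byte write: read-modify-write\n", len);
	} else {
		/* Full overwrite: no read needed. */
		memcpy(record, buf, RECORDSIZE);
		printf("%zu-byte write: plain write\n", len);
	}
}

int
main(void)
{
	static char data[RECORDSIZE];

	write_record(data, 0, NFS_IOSIZE);	/* 64K NFS write: slow path */
	write_record(data, 0, RECORDSIZE);	/* 128K write: fast path */
	return (0);
}
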
From an NFS perspective, all the NFS code cares about is the maximum
buffer cache block size it can use.

Maybe it would help to create a separate constant that specifically
means "maximum buffer cache block size" but does not define the
maximum block size of any file system?

If the constant only defines the maximum buffer cache block size, then
it could be tuned per architecture, so that amd64 could use much larger
values for the buffer cache tunables.  (As Bruce explained, i386 puts a
very low limit on the buffer cache, due to KVM limitations.)
Put another way, separate the maximum buffer cache block size from the
maximum block size used by any on-disk file system.
Other than the KVM limits, I think the problems with increasing
MAXBSIZE come from its use as the maximum block size for file systems
like UFS.
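
Something like this, where the name MAXBCACHEBUF and the values are
made up purely to illustrate the split:

/*
 * Hypothetical sketch only: MAXBCACHEBUF is not an existing constant,
 * and the values are illustrative.  MAXBSIZE stays the on-disk file
 * system limit; the new constant bounds only buffer cache blocks and
 * can differ per architecture.
 */
#ifndef MAXBSIZE
#define MAXBSIZE	(64 * 1024)	/* unchanged: max fs block size */
#endif

#if defined(__amd64__)
#define MAXBCACHEBUF	(128 * 1024)	/* lots of KVM: allow 128K buffers */
#else
#define MAXBCACHEBUF	MAXBSIZE	/* i386 etc.: keep the 64K limit */
#endif
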

Btw, since NFS already uses 64K buffers by default, the buffer cache is
already unbalanced.  Unfortunately, increasing BKVASIZE would allow
even fewer buffers on i386.  Has buffer cache fragmentation been
causing anyone problems?
Does anyone know whether buffer cache fragmentation can cause an actual
failure, or whether it just hurts performance?
(All I can see is that allocating buffers larger than BKVASIZE can
 fragment the buffer cache's address space, such that there might not
 be a contiguous area large enough for a buffer's allocation.  I don't
 know what happens then.)
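
The back-of-envelope arithmetic for the i386 side of this, with an
assumed (not real) KVA figure, just to show that the buffer count
scales inversely with BKVASIZE:

/*
 * Illustration only: the buffer map reserves BKVASIZE bytes of KVA
 * per buffer, so doubling BKVASIZE halves the buffer count for the
 * same KVA.  The 128MB figure is an assumption, not an i386 value.
 */
#include <stdio.h>

int
main(void)
{
	long bufkva = 128L * 1024 * 1024;	/* assumed KVA for buffers */
	long bkva[] = { 16 * 1024, 32 * 1024, 64 * 1024 };

	for (int i = 0; i < 3; i++)
		printf("BKVASIZE %2ldK -> about %ld buffers\n",
		    bkva[i] / 1024, bufkva / bkva[i]);
	return (0);
}
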

I would like to see the NFS client be able to use a 128K rsize/wsize.
I would also like to see a larger buffer cache on machines like amd64
with lots of RAM, so that wcommitsize (the size of write that the
client can do asynchronously) can be much larger, too.
(For i386, we probably have to live with a small buffer cache and
 maybe a 64K maximum buffer cache block size.)

rick

