(in)appropriate uses for MAXBSIZE

Rick Macklem rmacklem at uoguelph.ca
Sun Apr 11 14:09:31 UTC 2010



On Sun, 11 Apr 2010, Bruce Evans wrote:

>
> Er, the maximum size of buffers in the buffer cache is especially
> irrelevant for nfs.  It is almost irrelevant for physical disks because
> clustering normally increases the bulk transfer size to MAXPHYS.
> Clustering takes a lot of CPU but doesn't affect the transfer rate much
> unless there is not enough CPU.  It is even less relevant for network
> i/o since there is a sort of reverse-clustering -- the buffers get split
> up into tiny packets (normally 1500 bytes less some header bytes) at
> the hardware level.  Again a lot of CPU is involved doing the (reverse)
> clustering, and again this doesn't affect the transfer rate much.
> However, 1500 is so tiny that the reverse-clustering ratio of the i/o
> size relative to MAXBSIZE (65536/1500) is much larger than the normal
> clustering ratio relative to MAXBSIZE (131072/65536) and the extra CPU
> is more significant for network i/o.  (These aren't the actual normal
> ratios, but the limits of the attainable ones when varying only the
> block sizes under the file system's control.)  However2, increasing the
> network i/o size can make little difference to this problem -- it can
> only increase the already-too-large reverse-clustering ratio, while
> possibly reducing other reverse-clustering ratios (the others are for
> assembling the nfs buffers from local file system buffers; the local
> file system buffers are normally disassembled from pbuf size (MAXPHYS)
> to file system size (normally 16K); then conversion to nfs buffers
> involves either a sort of clustering or reverse clustering depending
> on the relative sizes of the buffers).  There are more gains to be
> had from increasing the network packet size.  tcp allows larger buffers
> at intermediate levels but they still get split up at the hardware
> level.  Only some networks allow jumbo frames.
>
Oh, and if the 1Mbyte write RPC can somehow hand the data portion
(the 1Mbyte of data) to sosend() as a single 1Mbyte mbuf cluster
referencing (not copied from) the 1Mbyte buffer cache block, so
the data never needs to be copied until it gets to the network
device driver, that would be great. However, this goes way beyond
the increase of MAXBSIZE that I think I need so that the client
can actually do a 1Mbyte write RPC.
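
As a rough check of the ratios quoted above, here is a minimal sketch
of the arithmetic. It assumes MAXPHYS is 128 Kbytes (131072) and
treats the full 1500 bytes as payload, so the packet counts are lower
bounds. The function names are made up for illustration and are not
from any FreeBSD source.

```python
# Sizes from the discussion above. MAXPHYS = 131072 is an assumption
# (128 Kbytes), and MTU ignores the header bytes each packet carries.
MTU = 1500               # typical Ethernet packet size
MAXBSIZE = 65536         # largest buffer cache block
MAXPHYS = 131072         # largest physical disk transfer (assumed)
RPC_SIZE = 1024 * 1024   # the proposed 1 Mbyte write RPC


def ratio(larger, smaller):
    """How many smaller units fit in one larger unit."""
    return larger / smaller


def packets_needed(nbytes, payload=MTU):
    """Packets needed to carry nbytes, rounding up (ceiling division)."""
    return -(-nbytes // payload)


if __name__ == "__main__":
    print(f"reverse-clustering ratio MAXBSIZE/MTU: {ratio(MAXBSIZE, MTU):.1f}")
    print(f"clustering ratio MAXPHYS/MAXBSIZE:     {ratio(MAXPHYS, MAXBSIZE):.1f}")
    print(f"packets per MAXBSIZE block:            {packets_needed(MAXBSIZE)}")
    print(f"packets per 1 Mbyte write RPC:         {packets_needed(RPC_SIZE)}")
```

Under these assumptions a 64 Kbyte block is split into at least 44
packets and a 1 Mbyte write RPC into at least 700, while disk
clustering only combines 2 blocks per transfer.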

Have a good weekend (what's left of it), rick



More information about the freebsd-arch mailing list