vfs.ffs.rawreadahead

Scott Long scottl at samsco.org
Wed Sep 3 17:56:19 UTC 2008


On Wed, 3 Sep 2008, Igor Sysoev wrote:

> On Wed, Sep 03, 2008 at 10:44:46AM -0600, Scott Long wrote:
>
>> On Wed, 3 Sep 2008, Igor Sysoev wrote:
>>> On Wed, Sep 03, 2008 at 03:39:55PM +0300, Kostik Belousov wrote:
>>>
>>>> On Wed, Sep 03, 2008 at 01:53:52PM +0400, Igor Sysoev wrote:
>>>>> Hi,
>>>>>
>>>>> could anyone tell me what vfs.ffs.rawreadahead enables?
>>>>> As I understand it, it is used in the DIRECTIO code, which allows reading
>>>>> data directly into a userland buffer, bypassing the buffer cache.
>>>>> What I cannot understand is where the read-ahead data can be placed.
>>>>
>>>> The operation of the ffs_rawread is more accurately described as
>>>> bypassing the page cache. It creates the physical buffer that maps
>>>> the user pages.
>>>>
>>>> The readahead is performed only when the supplied user memory region
>>>> is bigger than blocksize. In this case, two reads are performed
>>>> simultaneously, with both buffers mapping consecutive blocks from
>>>> user-supplied buffers. The read operation looks like footsteps.
>>>
>>> Nice!
>>>
>>> As I understand it, the size limit of one read operation is MAXPHYS, which
>>> is equal to 128K due to the LBA28 ATA limit. On SCSI, SATA, and LBA48 ATA
>>> this limit can be increased. Is it safe?
>>
>> The value of MAXPHYS is unrelated to capabilities or limitations of ATA.
>> It was chosen based on the need to prevent an excessive amount of
>> parallel I/O from exhausting the kernel address space and system memory.
>> In fact, the concern was with SCSI, not with ATA.
>>
>> MAXPHYS can be raised, especially on 64bit platforms, but doing so also
>> bloats the sizes of a few key data structures.  I've been looking at a
>> solution for this, and I'd rather that people keep their MAXPHYS changes
>> confined to their local trees rather than changing FreeBSD unless they
>> also solve the associated side effects.
>
> As I understand it, MAXPHYS affects at least the pager_map size: on modern
> machines it's usually 256 * MAXPHYS = 32M, therefore increasing MAXPHYS
> will increase the map too.

This is intended and desirable.

>
> 128K is probably a good value and I do not suggest increasing it by
> default; I just want to increase MAXPHYS to improve disk throughput
> on some hosts where nginx serves large files (1G+) using DIRECTIO.

I've tested increases up to 1M, and they are all very beneficial, not
only for silly sequential-style benchmarks but also for clustered I/O.
256-512k is the sweet spot, but Windows has set the standard at 1M and
I'd like to have FreeBSD follow suit eventually.

>
> BTW, is it possible to change MAXPHYS into a loader tunable?
>
>

No.  Struct buf is sized based on MAXPHYS, and there's no convenient way
yet to dynamically size that at runtime.

Scott



More information about the freebsd-stable mailing list