MAXPHYS bump for FreeBSD 13
Scott Long
scottl at samsco.org
Sat Nov 14 18:48:36 UTC 2020
> On Nov 14, 2020, at 11:37 AM, Konstantin Belousov <kostikbel at gmail.com> wrote:
>
> On Sat, Nov 14, 2020 at 10:01:05AM -0500, Alexander Motin wrote:
>> On Fri, 13 Nov 2020 21:09:37 +0200 Konstantin Belousov wrote:
>>> To put the specific numbers, for struct buf it means increase by 1792
>>> bytes. For bio it does not, because it does not embed vm_page_t[] into
>>> the structure.
>>>
>>> Worse, typical struct buf addend for excess vm_page pointers is going
>>> to be unused, because normal size of the UFS block is 32K. It is
>>> going to be only used by clusters and physbufs.
>>>
>>> So I object to bumping this value without reworking the buffers'
>>> handling of b_pages[]. The most straightforward approach is to stop
>>> using MAXPHYS to size this array, and use an external array for
>>> clusters. Pbufs can embed a large array.
>>
>> I am not very familiar with struct buf usage, so I'd appreciate some
>> help there.
>>
>> Looking quickly at pbuf, it seems trivial to allocate an external
>> b_pages array of any size in pbuf_init, which should easily satisfy
>> all of the pbuf descendants. The cluster and vnode/swap pager code are
>> pbuf descendants also. The vnode pager, I guess, may only need a
>> replacement for nitems(bp->b_pages) in a few places.
> I planned to look at making MAXPHYS a tunable.
>
> You are right, we would need:
> 1. Move b_pages to the end of struct buf and declare it as a flexible
> array. This would make KBI worse because struct buf depends on some
> debugging options, and then the b_pages offset depends on the config.
>
> Another option could be to change b_pages to pointer, if we are fine with
> one more indirection. But in my plan, real array is always allocated past
> struct buf, so flexible array is more correct even.
>
I like this, and I was in the middle of writing up an email that described it.
There could be multiple malloc types or UMA zones of different sizes,
depending on the intended i/o size, or just a runtime change to the size
of a single allocation.
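
A rough userland sketch of the flexible-array idea, for illustration only.
The struct and function names (xbuf, xbuf_npages, xbuf_size, xbuf_alloc) are
hypothetical, vm_page_t is a stand-in typedef, and calloc() stands in for
whatever malloc type or UMA zone would really be used. It only shows how the
allocation size would follow the intended i/o size instead of a compile-time
MAXPHYS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for the kernel type; assumption, not the real definition. */
typedef struct vm_page *vm_page_t;

/* Hypothetical, heavily simplified buffer with the page array last. */
struct xbuf {
	long		b_bcount;	/* placeholder for the fixed fields */
	int		b_npages;	/* pages currently in use */
	int		b_maxpages;	/* capacity of b_pages[] */
	vm_page_t	b_pages[];	/* flexible array member */
};

/* Number of pages needed to cover 'iosize' bytes of i/o. */
size_t
xbuf_npages(size_t iosize, size_t pagesize)
{
	assert(pagesize > 0);
	return ((iosize + pagesize - 1) / pagesize);
}

/* Total allocation size for a buffer able to map 'iosize' bytes. */
size_t
xbuf_size(size_t iosize, size_t pagesize)
{
	return (sizeof(struct xbuf) +
	    xbuf_npages(iosize, pagesize) * sizeof(vm_page_t));
}

/* Allocate a zeroed buffer whose page array is sized at run time. */
struct xbuf *
xbuf_alloc(size_t iosize, size_t pagesize)
{
	struct xbuf *bp = calloc(1, xbuf_size(iosize, pagesize));

	if (bp != NULL)
		bp->b_maxpages = (int)xbuf_npages(iosize, pagesize);
	return (bp);
}
```

Assuming 4K pages and 8-byte pointers, the difference between a 1M and a
128K page array in this sketch is (256 - 32) * 8 = 1792 bytes, which appears
to be where the struct buf figure quoted earlier comes from.
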
> 2. Preallocating both normal bufs and pbufs together with the arrays.
>
> 3. I considered adding B_SMALLPAGES flag to b_flags and use it to indicate
> that buffer has 'small' b_pages. All buffers rotated through getnewbuf()/
> buf_alloc() should have it set.
>
This would work nicely with a variable sized allocator, yes.
> 4. There could be some places which either malloc() or allocate struct buf
> on the stack (I tend to believe that I converted all the latter places to
> the former). They would need to get special handling.
>
I couldn’t find any places that allocated a buf on the stack or embedded it
into another structure.
> md(4) uses pbufs.
>
> 5. My larger concern is, in fact, cam and drivers.
>
Can you describe your concern?
>>
>> Could you or somebody help with vfs/ffs code, where I suppose the
>> smaller page lists are used?
> Do you plan to work on this ? I can help, sure.
>
> Still, I wanted to make MAXPHYS (and 'small' MAXPHYS, this is not same as
> DFLPHYS), a tunable, in the scope of this work.
Sounds great, thank you for looking at it.
Scott