MAXPHYS bump for FreeBSD 13

Scott Long scottl at samsco.org
Sat Nov 14 18:48:36 UTC 2020


> On Nov 14, 2020, at 11:37 AM, Konstantin Belousov <kostikbel at gmail.com> wrote:
> 
> On Sat, Nov 14, 2020 at 10:01:05AM -0500, Alexander Motin wrote:
>> On Fri, 13 Nov 2020 21:09:37 +0200 Konstantin Belousov wrote:
>>> To put the specific numbers, for struct buf it means increase by 1792
>>> bytes. For bio it does not, because it does not embed vm_page_t[] into
>>> the structure.
>>> 
>>> Worse, typical struct buf addend for excess vm_page pointers is going
>>> to be unused, because normal size of the UFS block is 32K.  It is
>>> going to be only used by clusters and physbufs.
>>> 
>>> So I object against bumping this value without reworking buffers
>>> handling of b_pages[].  Most straightforward approach is stop using
>>> MAXPHYS to size this array, and use external array for clusters.
>>> Pbufs can embed large array.
>> 
>> I am not very familiar with struct buf usage, so I'd appreciate some
>> help there.
>> 
>> Quickly looking on pbuf, it seems trivial to allocate external b_pages
>> array of any size in pbuf_init, that should easily satisfy all of pbuf
>> descendants.  Cluster and vnode/swap pagers code are pbuf descendants
>> also.  Vnode pager I guess may only need replacement for
>> nitems(bp->b_pages) in few places.
> I planned to look at making MAXPHYS a tunable.
> 
> You are right, we would need:
> 1. move b_pages to the end of struct buf and declaring it as flexible.
> This would make KBI worse because struct buf depends on some debugging
> options, and than b_pages offset depends on config.
> 
> Another option could be to change b_pages to pointer, if we are fine with
> one more indirection.  But in my plan, real array is always allocated past
> struct buf, so flexible array is more correct even.
> 

I like this, and I was in the middle of writing up an email that described it.
There could be multiple malloc types or UMA zones of different sizes,
depending on the intended i/o size, or just a runtime change to the size of
a single allocation size.

> 2. Preallocating both normal bufs and pbufs together with the arrays.
> 
> 3. I considered adding B_SMALLPAGES flag to b_flags and use it to indicate
> that buffer has 'small' b_pages.  All buffers rotated through getnewbuf()/
> buf_alloc() should have it set.
> 

This would work nicely with a variable sized allocator, yes.

> 4. There could be some places which either malloc() or allocate struct buf
> on stack (I tend to believe that I converted all later places to formed).
> They would need to get special handling.
> 

I couldn’t find any places that allocated a buf on the stack or embedded it
into another structure.

> md(4) uses pbufs.
> 
> 4. My larger concern is, in fact, cam and drivers.
> 

Can you describe your concern?

>> 
>> Could you or somebody help with vfs/ffs code, where I suppose the
>> smaller page lists are used?
> Do you plan to work on this ?  I can help, sure.
> 
> Still, I wanted to make MAXPHYS (and 'small' MAXPHYS, this is not same as
> DFLPHYS), a tunable, in the scope of this work.

Sounds great, thank you for looking at it.

Scott



More information about the freebsd-arch mailing list