busdma dflt_lock on amd64 > 4 GB
jc at oxado.com
Wed Oct 26 09:01:53 PDT 2005
Thanks for the input. I'm utterly lost in unknown terrain, but I'm
trying to understand...
At 16:09 26/10/2005, Scott Long wrote:
>So, the panic is doing exactly what it is supposed to do. It's guarding
>against bugs in the driver. The workaround for this is to use the
>NOWAIT flag in all instances of bus_dmamap_load() where deferrals can
As pointed out by Soren, this is not documented in the bus_dma man
page :-/ It says the bus_dmamap_load flags are supposed to be 0, and
that BUS_DMA_ALLOCNOW should be set at tag creation to avoid
EINPROGRESS. I'm not sure the
two would actually be equivalent, either. And from what I understand,
even a call to bus_dma_tag_create with BUS_DMA_ALLOCNOW can be
successful but not actually allocate what will be needed later (see below).
>This, however, means that using bounce pages still remains
>fragile and that the driver is still likely to return ENOMEM to the
>upper layers. C'est la vie, I guess. At one time I had patches that
>made ATA use the busdma API correctly (it is one of the few remaining
>that does not), but they rotted over time.
So what would be the "correct" way? Move the part that comes after the
DMA setup into the callback? I suppose there are limitations as to what
can happen in the callback, though, so it would complicate things quite a bit.
Obviously, a lockfunc would be needed in this situation, right?
Also, I believe many other drivers just have lots of BUS_DMA_ALLOCNOW
or BUS_DMA_NOWAIT all over the place. I'm not sure that's the
"correct" way, is it?
>No. Some tags specifically should not permit deferrals.
How do they do that? Setting BUS_DMA_ALLOCNOW in the tag, or
BUS_DMA_NOWAIT in the map_load, or both, or something else? What
should make one decide when deferrals should not be permitted? It is
my impression that quite a few drivers happily decide they don't like
deferrals at all whatever happens...
>Just about every other modern driver honors the API correctly.
Depends what you mean by "correctly". I'm not sure using
BUS_DMA_NOWAIT is the right way to go as it fails if there is
contention for bounce buffers.
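To make the two behaviours I'm comparing concrete, here is a toy
userland model. It is only a sketch of how I understand the discussion,
not the busdma API: every name in it (toy_pool, toy_load, toy_unload,
TOY_NOWAIT, toy_count_done) is made up for illustration, and it holds
at most one deferred request. With the NOWAIT-style flag a load fails
with ENOMEM when the pool is exhausted; without it the load returns
EINPROGRESS and the callback runs later, once pages are released.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy userland model, NOT the FreeBSD busdma API; all names are hypothetical. */
#define TOY_NOWAIT 0x1   /* stands in for BUS_DMA_NOWAIT */

typedef void toy_callback_t(void *arg, int error);

struct toy_pool {
    int free_pages;              /* bounce pages currently available */
    int pending_pages;           /* pages wanted by the deferred request */
    toy_callback_t *pending_cb;  /* callback of the deferred request, or NULL */
    void *pending_arg;
};

/* Try to reserve 'pages'; run the callback now, fail, or defer. */
static int
toy_load(struct toy_pool *p, int pages, int flags, toy_callback_t *cb, void *arg)
{
    if (pages <= p->free_pages) {
        p->free_pages -= pages;
        cb(arg, 0);              /* enough pages: callback runs immediately */
        return (0);
    }
    if (flags & TOY_NOWAIT)
        return (ENOMEM);         /* caller refused to wait: fail under contention */
    if (p->pending_cb != NULL)
        return (EBUSY);          /* this toy model queues only one request */
    p->pending_pages = pages;
    p->pending_cb = cb;
    p->pending_arg = arg;
    return (EINPROGRESS);        /* deferred: callback runs from toy_unload() */
}

/* Release 'pages' and complete the deferred request if it now fits. */
static void
toy_unload(struct toy_pool *p, int pages)
{
    p->free_pages += pages;
    if (p->pending_cb != NULL && p->pending_pages <= p->free_pages) {
        toy_callback_t *cb = p->pending_cb;

        p->free_pages -= p->pending_pages;
        p->pending_cb = NULL;
        cb(p->pending_arg, 0);
    }
}

/* Example callback: counts how many loads completed successfully. */
static void
toy_count_done(void *arg, int error)
{
    if (error == 0)
        ++*(int *)arg;
}
```

In this model the code that follows the DMA setup has to live in the
callback, because after EINPROGRESS the caller cannot assume the
mapping exists yet. That is the complication I was asking about.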
>Bounce pages cannot be reclaimed to the system, so overallocating just
I'm not talking about over-allocating, but rather about allocating
what is needed. I don't understand why bus_dma_tag_create limits the
total number of bounce pages in a bounce zone to maxsize if
BUS_DMA_ALLOCNOW is set, while bus_dmamap_create will allocate maxsize
additional pages if BUS_DMA_ALLOCNOW was not set. With the flag set,
bus_dmamap_create does not allocate any further bounce pages as long
as there's only one map per tag, which seems pretty common.
The end result is that the ata driver is limited to 32 bounce pages
whatever the number of instances (I guess that's channels, or
disks?), while other drivers get hundreds of bounce pages which they
hardly use. Maybe this is intended and it's just the way the ata
driver uses tags and maps that is wrong, maybe it's the busdma logic
that is wrong, I don't know...
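The accounting I'm describing can be sketched with a small userland
model. This is my reading of the behaviour, not the kernel code: all
names (toy_zone, toy_tag, toy_tag_create, toy_map_create,
toy_pages_for, TOY_ALLOCNOW) are hypothetical, the 4 KB page size and
the 128 KB maxsize are assumptions chosen to give 32 pages, one tag
with one map per disk is also an assumption, and any upper cap on the
zone size is left out.

```c
#include <assert.h>

/* Toy model of the accounting described above; NOT the kernel code. */
#define TOY_PAGE_SIZE 4096   /* assumed page size */
#define TOY_ALLOCNOW  0x1    /* stands in for BUS_DMA_ALLOCNOW */

struct toy_zone {
    int total_pages;         /* bounce pages in the shared zone */
};

struct toy_tag {
    struct toy_zone *zone;
    int maxsize_pages;       /* maxsize expressed in pages */
    int min_alloc_done;      /* initial allocation already performed */
    int map_count;
};

static void
toy_tag_create(struct toy_tag *t, struct toy_zone *z, int maxsize, int flags)
{
    t->zone = z;
    t->maxsize_pages = maxsize / TOY_PAGE_SIZE;
    t->min_alloc_done = 0;
    t->map_count = 0;
    if (flags & TOY_ALLOCNOW) {
        /* Only top the zone up to maxsize; never go beyond it. */
        if (z->total_pages < t->maxsize_pages)
            z->total_pages = t->maxsize_pages;
        t->min_alloc_done = 1;
    }
}

static void
toy_map_create(struct toy_tag *t)
{
    /*
     * The first map of an ALLOCNOW tag adds nothing. Otherwise each
     * map adds maxsize worth of pages to the zone.
     */
    if (!t->min_alloc_done || t->map_count > 0) {
        t->zone->total_pages += t->maxsize_pages;
        t->min_alloc_done = 1;
    }
    t->map_count++;
}

/* Total zone size after creating 'ntags' tags with one map each. */
static int
toy_pages_for(int ntags, int maxsize, int flags)
{
    struct toy_zone z = { 0 };

    for (int i = 0; i < ntags; i++) {
        struct toy_tag t;

        toy_tag_create(&t, &z, maxsize, flags);
        toy_map_create(&t);
    }
    return (z.total_pages);
}
```

Under these assumptions, toy_pages_for(6, 128 * 1024, TOY_ALLOCNOW)
stays at 32 pages however many tags are created, while
toy_pages_for(6, 128 * 1024, 0) grows to 192.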
>The whole point of the deferral mechanism is to allow
>you to allocate enough pages for a normal load while also being able to
>handle sporadic spikes in load (like when the syncer runs) without
In this case 32 bounce pages (out of 8 GB RAM) for 6 disks seems like
a very tight bottleneck to me.