8.0-RELEASE-p3: 4k jumbo mbuf cluster exhaustion

Andre Oppermann andre at freebsd.org
Mon Aug 23 19:45:19 UTC 2010


On 23.08.2010 21:16, Pyun YongHyeon wrote:
> On Mon, Aug 23, 2010 at 09:04:02PM +0200, Andre Oppermann wrote:
>> On 23.08.2010 19:52, Pyun YongHyeon wrote:
>>> On Mon, Aug 23, 2010 at 12:18:01PM +0200, Andre Oppermann wrote:
>>>> The function that is called on a socket write is sosend_generic() which
>>>> makes use of m_getm2().  This function allocates mbuf chains with the
>>>> tightest packing it can achieve.  It will make use of 4k (page size)
>>>> mbufs as much as it can.  This is where they come from.
>>>>
>>>> It seems the 4k clusters do not get freed back to the pool after they've
>>>> been sent by the NIC and dropped from the socket buffer after the ACK has
>>>> arrived.  The leak must occur in one of these two places.  The socket
>>>> buffer is unlikely as it would affect not just you but everyone else too.
>>>> Thus the mbuf freeing after DMA/tx in the bce(4) driver is the prime
>>>> suspect.
>>>>
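
As a rough illustration of the packing described above, here is a
simplified userland model in C of how a chain allocator might pick
buffer sizes for a given write length.  It is not the kernel's
m_getm2(): the name plan_chain(), the struct, and the three size
constants (in particular the 200-byte plain-mbuf capacity) are
assumptions made for this sketch.

```c
#include <stddef.h>

/* Assumed sizes for illustration only; real values depend on the platform. */
#define SKETCH_PAGE_CLUSTER 4096 /* page-size ("4k jumbo") cluster */
#define SKETCH_STD_CLUSTER  2048 /* standard cluster */
#define SKETCH_PLAIN_MBUF    200 /* hypothetical data room in a plain mbuf */

/* How many buffers of each kind a chain for one write would use. */
struct chain_plan {
	unsigned page_clusters;
	unsigned std_clusters;
	unsigned plain_mbufs;
};

/*
 * Hypothetical helper: pick the largest buffer that the remaining
 * length can justify, subtract its capacity, and repeat until the
 * whole length is covered.
 */
static struct chain_plan
plan_chain(size_t len)
{
	struct chain_plan p = { 0, 0, 0 };

	while (len > 0) {
		size_t got;

		if (len > SKETCH_STD_CLUSTER) {
			/* More than a standard cluster left: use a 4k one. */
			p.page_clusters++;
			got = SKETCH_PAGE_CLUSTER;
		} else if (len > SKETCH_PLAIN_MBUF) {
			p.std_clusters++;
			got = SKETCH_STD_CLUSTER;
		} else {
			p.plain_mbufs++;
			got = SKETCH_PLAIN_MBUF;
		}
		len = (len > got) ? len - got : 0;
	}
	return p;
}
```

Under these assumed sizes, plan_chain(65536) uses only 4k clusters,
which fits the observation that large socket writes draw mostly from
that pool.
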
>>>
>>> I know bce(4) has a couple of bugs in the TX path (wrong DMA tag,
>>> missing bus_dmamap_sync(9), etc.), but this is the same code path
>>> with or without TX checksum offloading. This is one of the reasons
>>> why I still do not understand what's really happening here. TX
>>> checksum offloading may add frame processing time, since the
>>> internal FIFO has to be filled to compute the checksum before the
>>> frame is transmitted on the wire, and that can change the timing of
>>> the TX path. This timing change might trigger the TX path bug. It's
>>> just a vague guess, though.
>>
>> Had a chat with Claudio at OpenBSD and he said that the bce(4) DMA engine
>> can only access the first 1GB of physical RAM and has to use bounce
>> buffers all the time.  Maybe this is related.
>>
>
> Really? I don't remember seeing such a DMA address space limitation
> in the data sheet. And I don't think Broadcom would have done such a
> horrible thing for controllers targeted at servers. The only
> limitation I know of is that the BCM5708 cannot handle DMA addresses
> wider than 40 bits, so bce(4) limits the DMA address space in DMA tag
> creation.

Oops... OpenBSD bce(4) != FreeBSD bce(4).  The former is for BCM440x
chips, the latter for BCM57xx.
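
To make the difference between the two limits concrete, here is a
small userland sketch in C.  It only checks whether a buffer lies
entirely below an address limit of 2^bits, which is the condition
under which a DMA engine can reach it without bounce buffers.  The
function dma_reachable() is hypothetical and not part of busdma, and
treating the 1GB limit as 30 address bits is my own arithmetic.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true when the buffer [paddr, paddr + len) lies
 * entirely below 2^addr_bits.  Assumes addr_bits < 64.
 */
static bool
dma_reachable(uint64_t paddr, uint64_t len, unsigned addr_bits)
{
	uint64_t limit = (uint64_t)1 << addr_bits;

	/* Written this way so that paddr + len cannot overflow. */
	return (len <= limit && paddr <= limit - len);
}
```

A 4k buffer at physical address 0x40000000 (exactly 1GB) fails the
check with 30 bits and would need bouncing, but passes easily with 40
bits.
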

-- 
Andre

