loop inside uma_zfree critical section

Monthadar Al Jaberi monthadar at gmail.com
Wed Dec 14 19:47:21 UTC 2011


On Tue, Dec 13, 2011 at 4:50 PM, Monthadar Al Jaberi
<monthadar at gmail.com> wrote:
> On Tue, Dec 13, 2011 at 3:35 PM, John Baldwin <jhb at freebsd.org> wrote:
>> On Tuesday, December 13, 2011 7:46:34 am Monthadar Al Jaberi wrote:
>>> Hi,
>>>
>>> I am not sure why I am having this problem, but looking
>>> at the code I dont understand uma_core.c really good.
>>> So I hope someone can shed a light on this:
>>>
>>> I am running on an arm board and and running a kernel module
>>> that behaves like a wlan interface. so I tx and rx packets.
>>>
>>> For now tx is only only sending beacon like frames.
>>> This is done through using ieee80211_beacon_alloc().
>>>
>>> Then in a callout task to generate periodic beacons:
>>>
>>>     m_dup(avp->beacon, M_DONTWAIT);
>>>     mtx_lock(...);
>>>     STAILQ_INSERT_TAIL(...);
>>>     taskqueue_enqueue(...);
>>>     mtx_unlock(...);
>>>     callout_schedule(...);
>>>
>>> On the RX side, the interrupt handler will read out buffer
>>> then place it on a queue to be handled by wlan-glue code.
>>> For now wlan-glue code just frees the mbuf it instead of
>>> calling net80211 ieee80211_input() functions:
>>>
>>>     m_copyback(...);
>>>     /* Allocate new mbuf for next RX. */
>>>     MGETHDR(..., M_DONTWAIT, MT_DATA);
>>>     bzero((mtod(sc->Rx_m, void *)), MHLEN);
>>>     sc->Rx_m->m_len = 0; /* NB: m_gethdr does not set */
>>>     sc->Rx_m->m_data += 20; /* make headroom */
>>>
>>> Then I use a lockmgr inside my kernel module that should
>>> make sure that we either are on TX or RX path.
>>
>> Uh, you can't use a lockmgr lock in interrupt handlers or in
>> if_transmit/if_start routines.  You should most likely just be using a plain
>> mutex instead.  Also, new code shouldn't use lockmgr in general.  If you
>> need a sleepable lock, use sx instead.  It has a more straightforward API.
>
> Ok, I will change the interrupt handler to do something like this:
>
>    disaple_interrupt();
>    taskqueue_enqueue(...); /* on new rx task queue */
>
> Then on the new rx proc:
>
>    sx_slock(...);
>    m_copyback(...);
>    enable_interrupt();
>    /* Allocate new mbuf for next RX. */
>    MGETHDR(..., M_DONTWAIT, MT_DATA);
>    bzero((mtod(sc->Rx_m, void *)), MHLEN);
>    sc->Rx_m->m_len = 0; /* NB: m_gethdr does not set */
>    sc->Rx_m->m_data += 20; /* make headroom */
>    sx_sunlock(...);
>
> I lock TX/RX paths to make sure my code is threading safe.
> Also because while programming my deivce (SPI communicatioin)
> there will be a tsleep() waiting for the DMA interrupt and
> thus we could be prempted by e.g. a beacon_callout etc...
>

I did implement your suggestions, using sx and modified interrupt handler
as specified above. But still same problem as before.

>>
>>> The problem seems to be at [2], somehow after swapping
>>> buckets in uma_zfree m_dup returns a pointer to
>>> an mbuf that is still being used by us, [1] and [3]
>>> have same address.
>>> Then we call m_freem twice on same mbuf, [4] and [5].
>>> And a loop occurs inside uma_free.
>>> I am using mbufs in a wrong way? Shouldnt mbufs be thread safe?
>>> Problem seems to occur while swapping buckets.
>>
>> Hmm, the uma uses its own locking, so it should be safe, yes.  However, you
>> are correct about [1] and [3].  The thing is, after [1] the mbuf shouldn't
>> be in any buckets at all (it only gets put back into the bucket during a
>> free).  Are you sure the mbuf wasn't double free'd previously?

I rechecked and it is almost certain that I dont double free the mbuf
before [1].
And its not like it crashed in the beginning, it does run for a while
and then it crashes. So our code works for like a hundred beacons sent/received
between two arm boards. Its feels like something is preempted, which explains
why the mbuf is still in the bucket (wrongly)?

>
> From my log I can only see the mbuf being used once before
> by a beacon_callout and then it was freed by m_freem().
> So I cant see that it was freed twice before that.
>
> How can I go by debbuging this?
>
>>
>> --
>> John Baldwin
>
>
>
> --
> Monthadar Al Jaberi



-- 
Monthadar Al Jaberi


More information about the freebsd-hackers mailing list