dmar, dma_pool, etc

Niclas Zeising zeising at freebsd.org
Tue Apr 30 16:54:50 UTC 2019


On 2019-04-30 14:54, Niclas Zeising wrote:
> On 2019-04-30 14:25, Andrey Fesenko wrote:
>> .On Tue, Apr 30, 2019 at 2:16 AM Tycho Nightingale 
>> <tychon at freebsd.org> wrote:
>>>
>>>
>>> Hi,
>>>
>>>> On Apr 29, 2019, at 4:24 PM, Tycho Nightingale <tychon at freebsd.org> 
>>>> wrote:
>>>>
>>>>
>>>>> On Apr 29, 2019, at 2:34 PM, Niclas Zeising <zeising at freebsd.org> 
>>>>> wrote:
>>>>>
>>>>> On 2019-04-29 20:27, Niclas Zeising wrote:
>>>>>> On 2019-04-29 18:00, Tycho Nightingale wrote:
>>>>>>>> On Apr 29, 2019, at 11:06 AM, Niclas Zeising 
>>>>>>>> <zeising at FreeBSD.org> wrote:
>>>>>>>>
>>>>>>>> As a side note, I can readily reproduce the hang on a spare 
>>>>>>>> laptop, please let me know if I can help in testing or 
>>>>>>>> diagnosing in any way.
>>>>>>>
>>>>>>>
>>>>>>> If you can readily reproduce the hang, since there are 2 halves 
>>>>>>> that comprised the fix (the DMA pool and non-pool mappings) it 
>>>>>>> would be instructive to try reverting either dmapool.h or 
>>>>>>> dma-mapping.h independently to see if that helps.
>>>>>>>
>>>>>> Hi!
>>>>>> I will test this and report back.  Thank you!
>>>>>> Regards
>>>>>
>>>>> Hi!
>>>>> Just reverting dmapool.h or dma-mapping.h doesn't work, it won't 
>>>>> build. I need to revert more than that.  Can you help me figure out 
>>>>> what to revert in either case, or provide a patch?
>>>>
>>>> Thanks for trying.  I managed to setup my HW and reproduce this and 
>>>> I see your point that it doesn’t cleave in half perfectly.
>>>>
>>>> Since I’ve got a (non)-working test case, let me investigate a bit 
>>>> further.
>>>
>>> I believe I’ve figured this out.  At least in my case I’ve been able 
>>> to eliminate the hangs with the patch included at the bottom of this 
>>> email.
>>>
>>> The issue stems from the implementation of dma_map_sg().  According 
>>> to the linux DMA documentation[1], entries of the scatter/gather list 
>>> may be coalesced:
>>>
>>>          int
>>>          dma_map_sg(struct device *dev, struct scatterlist *sg, int 
>>> nents, enum dma_data_direction direction)
>>>
>>>          Returns: the number of DMA address segments mapped (this may 
>>> be shorter
>>>          than <nents> passed in if some elements of the 
>>> scatter/gather list are
>>>          physically or virtually adjacent and an IOMMU maps them with 
>>> a single
>>>          entry).
>>>
>>> My implementation of dma_map_sg() does just that.  As it turns out 
>>> there are several consumers of dma_map_sg(), e.g. 
>>> i915_gem_map_dma_buf() and i915_gem_gtt_prepare_pages(), and 
>>> mock_map_dma_buf() among others that aren’t compliant with this 
>>> documented API.  Going back to the non-coalesced version is likely 
>>> the only path forward as bugs in the callee redefine this API de facto.
>>>
>>> If this addresses the hangs you are seeing, I will post this on 
>>> Phabricator.
>>>
>>> Tycho
>>>
>>> [1] https://www.kernel.org/doc/Documentation/DMA-API.txt
>>>
>>> Index: sys/compat/linuxkpi/common/src/linux_pci.c
>>> ===================================================================
>>> --- sys/compat/linuxkpi/common/src/linux_pci.c  (revision 346687)
>>> +++ sys/compat/linuxkpi/common/src/linux_pci.c  (working copy)
>>> @@ -565,10 +565,8 @@
>>>   {
>>>          struct linux_dma_priv *priv;
>>>          struct linux_dma_obj *obj;
>>> -       struct scatterlist *dma_sg, *sg;
>>> -       int dma_nents, error, nseg;
>>> -       size_t seg_len;
>>> -       vm_paddr_t seg_phys, prev_phys_end;
>>> +       struct scatterlist *sg;
>>> +       int error, i, nseg;
>>>          bus_dma_segment_t seg;
>>>
>>>          priv = dev->dma_priv;
>>> @@ -580,25 +578,11 @@
>>>                  return (0);
>>>          }
>>>
>>> -       sg = sgl;
>>> -       dma_sg = sg;
>>> -       dma_nents = 0;
>>> -       while (nents > 0) {
>>> -               seg_phys = sg_phys(sg);
>>> -               seg_len = sg->length;
>>> -               while (--nents > 0) {
>>> -                       prev_phys_end = sg_phys(sg) + sg->length;
>>> -                       sg = sg_next(sg);
>>> -                       if (prev_phys_end != sg_phys(sg))
>>> -                               break;
>>> -                       seg_len += sg->length;
>>> -               }
>>> -
>>> +       for_each_sg(sgl, sg, nents, i) {
>>>                  nseg = -1;
>>>                  mtx_lock(&priv->dma_lock);
>>>                  if (_bus_dmamap_load_phys(priv->dmat, obj->dmamap,
>>> -                   seg_phys, seg_len, BUS_DMA_NOWAIT,
>>> -                   &seg, &nseg) != 0) {
>>> +                   sg_phys(sg), sg->length, 0, &seg, &nseg) != 0) {
>>>                          bus_dmamap_unload(priv->dmat, obj->dmamap);
>>>                          bus_dmamap_destroy(priv->dmat, obj->dmamap);
>>>                          mtx_unlock(&priv->dma_lock);
>>> @@ -607,14 +591,9 @@
>>>                  }
>>>                  mtx_unlock(&priv->dma_lock);
>>>                  KASSERT(++nseg == 1, ("More than one segment 
>>> (nseg=%d)", nseg));
>>> +               sg_dma_address(sg) = seg.ds_addr;
>>> +       }
>>>
>>> -               sg_dma_address(dma_sg) = seg.ds_addr;
>>> -               sg_dma_len(dma_sg) = seg.ds_len;
>>> -
>>> -               dma_sg = sg_next(dma_sg);
>>> -               dma_nents++;
>>> -        }
>>> -
>>>          obj->dma_addr = sg_dma_address(sgl);
>>>
>>>          mtx_lock(&priv->ptree_lock);
>>> @@ -629,7 +608,7 @@
>>>                  return (0);
>>>          }
>>>
>>> -       return (dma_nents);
>>> +       return (nents);
>>>   }
>>>
>>>   void
>>
>> Thanks, this patch make system more stable Farefox apparently work 
>> infinitely.
>>
>> However, a heavy application (3D game in wine) still hangs after a
>> while, once hung after about a minute, about another (after reboot)
>> ten minutes later, video artifacts appear before the crash
>>
> 
> Hi!
> This used to work before the regression?
> This means that this patch doesn't solve all the issues with the dmar 
> changes.  Which GPU are you using?
> Regards

Hi!
The patch has been updated and now lives as part of this review: 
https://reviews.freebsd.org/D20097
Can you please test that version?  It requires an updated source tree to 
current from today.
Thanks!
Regards
-- 
Niclas Zeising


More information about the freebsd-x11 mailing list