svn commit: r337273 - head/sys/dev/nvme

Konstantin Belousov kostikbel at gmail.com
Sat Aug 4 14:47:44 UTC 2018


On Sat, Aug 04, 2018 at 08:29:47AM -0500, Justin Hibbits wrote:
> On Sat, Aug 4, 2018, 08:03 Konstantin Belousov <kostikbel at gmail.com> wrote:
> 
> > On Sat, Aug 04, 2018 at 05:14:31AM -0700, John Baldwin wrote:
> > > On 8/4/18 1:08 AM, Konstantin Belousov wrote:
> > > > On Fri, Aug 03, 2018 at 08:04:06PM +0000, Justin Hibbits wrote:
> > > >> Author: jhibbits
> > > >> Date: Fri Aug  3 20:04:06 2018
> > > >> New Revision: 337273
> > > >> URL: https://svnweb.freebsd.org/changeset/base/337273
> > > >>
> > > >> Log:
> > > >>   nvme(4): Add bus_dmamap_sync() at the end of the request path
> > > >>
> > > >>   Summary:
> > > >>   Some architectures, in this case powerpc64, need explicit
> > synchronization
> > > >>   barriers vs device accesses.
> > > >>
> > > >>   Prior to this change, when running 'make buildworld -j72' on a
> > 18-core
> > > >>   (72-thread) POWER9, I would see controller resets often.  With this
> > change, I
> > > >>   don't see these resets messages, though another tester still does,
> > for yet to be
> > > >>   determined reasons, so this may not be a complete fix.
> > Additionally, I see a
> > > >>   ~5-10% speed up in buildworld times, likely due to not needing to
> > reset the
> > > >>   controller.
> > > >>
> > > >>   Reviewed By: jimharris
> > > >>   Differential Revision: https://reviews.freebsd.org/D16570
> > > >>
> > > >> Modified:
> > > >>   head/sys/dev/nvme/nvme_qpair.c
> > > >>
> > > >> Modified: head/sys/dev/nvme/nvme_qpair.c
> > > >>
> > ==============================================================================
> > > >> --- head/sys/dev/nvme/nvme_qpair.c Fri Aug  3 19:24:04 2018
> > (r337272)
> > > >> +++ head/sys/dev/nvme/nvme_qpair.c Fri Aug  3 20:04:06 2018
> > (r337273)
> > > >> @@ -401,9 +401,13 @@ nvme_qpair_complete_tracker(struct nvme_qpair
> > *qpair,
> > > >>            req->retries++;
> > > >>            nvme_qpair_submit_tracker(qpair, tr);
> > > >>    } else {
> > > >> -          if (req->type != NVME_REQUEST_NULL)
> > > >> +          if (req->type != NVME_REQUEST_NULL) {
> > > >> +                  bus_dmamap_sync(qpair->dma_tag_payload,
> > > >> +                      tr->payload_dma_map,
> > > >> +                      BUS_DMASYNC_POSTREAD | BUS_DMASYNC_POSTWRITE);
> > > >>                    bus_dmamap_unload(qpair->dma_tag_payload,
> > > >>                        tr->payload_dma_map);
> > > >> +          }
> > > >>
> > > >>            nvme_free_request(req);
> > > >>            tr->req = NULL;
> > > >> @@ -487,6 +491,8 @@ nvme_qpair_process_completions(struct nvme_qpair
> > *qpai
> > > >>             */
> > > >>            return (false);
> > > >>
> > > >> +  bus_dmamap_sync(qpair->dma_tag, qpair->queuemem_map,
> > > >> +      BUS_DMASYNC_POSTREAD | BUS_DMASYNC_POSTWRITE);
> > > >>    while (1) {
> > > >>            cpl = qpair->cpl[qpair->cq_head];
> > > >>
> > > >> @@ -828,7 +834,16 @@ nvme_qpair_submit_tracker(struct nvme_qpair
> > *qpair, st
> > > >>    if (++qpair->sq_tail == qpair->num_entries)
> > > >>            qpair->sq_tail = 0;
> > > >>
> > > >> +  bus_dmamap_sync(qpair->dma_tag, qpair->queuemem_map,
> > > >> +      BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE);
> > > >> +#ifndef __powerpc__
> > > >> +  /*
> > > >> +   * powerpc's bus_dmamap_sync() already includes a heavyweight
> > sync, but
> > > >> +   * no other archs do.
> > > >> +   */
> > > >>    wmb();
> > > >> +#endif
> > > > What is the purpose of this call ?  It is useless without paired read
> > > > barrier.  So where is the reciprocal rmb() ?
> > >
> > > For DMA, the rmb is in the device controller.  However, architectures
> > > that need this kind of ordering should do it in their bus_dmmap_sync op,
> > > and this explicit one needs to be removed.  (Alpha had a wmb in its
> > > bus_dmamap_sync op for this reason.)
> > Yes, if something special is needed, it should happen in platform-specific
> > busdma code.
> >
> > Also, if wmb() is needed, then it is not a supposed semantic or
> > wmb(), but a specific side-effects of one of the instruction in the
> > implementation of wmb().
> >
> > As I noted, on x86 it is not needed and detrimental to the performance.
> >
> 
> According to Jim Harris it is needed for x86. I tried to remove it thinking
> it was unnecessary with the sync in place now.  The wmb() was added way
> back in r240616 by him, when only x86 was supported by the driver.
> bus_dmamap_sync() for all archs except powerpc lack a barrier, from what I
> can see.  See the review for more discussion that took place. If it isn't
> needed I will gladly remove it.

I am very curious why x86 would need MFENCE there.  I do not see an
explanation in the review.


More information about the svn-src-all mailing list