svn commit: r281280 - head/sys/dev/nvme

Jim Harris jim.harris at gmail.com
Wed Apr 8 22:51:38 UTC 2015


On Wed, Apr 8, 2015 at 3:21 PM, Alexander Kabaev <kabaev at gmail.com> wrote:

> On Wed, 8 Apr 2015 21:46:19 +0000 (UTC)
> Jim Harris <jimharris at FreeBSD.org> wrote:
>
> > Author: jimharris
> > Date: Wed Apr  8 21:46:18 2015
> > New Revision: 281280
> > URL: https://svnweb.freebsd.org/changeset/base/281280
> >
> > Log:
> >   nvme: fall back to a smaller MSI-X vector allocation if necessary
> >
> >   Previously, if per-CPU MSI-X vectors could not be allocated,
> >   nvme(4) would fall back to INTx with a single I/O queue pair.
> >   This change will still fall back to a single I/O queue pair, but
> >   allocate MSI-X vectors instead of reverting to INTx.
> >
> >   MFC after:  1 week
> >   Sponsored by:       Intel
> >
> > Modified:
> >   head/sys/dev/nvme/nvme_ctrlr.c
> >
> > Modified: head/sys/dev/nvme/nvme_ctrlr.c
> >
> ==============================================================================
> > --- head/sys/dev/nvme/nvme_ctrlr.c    Wed Apr  8 21:10:13 2015  (r281279)
> > +++ head/sys/dev/nvme/nvme_ctrlr.c    Wed Apr  8 21:46:18 2015  (r281280)
> > @@ -1144,9 +1144,17 @@ nvme_ctrlr_construct(struct nvme_control
> >       /* One vector per IO queue, plus one vector for admin queue. */
> >       num_vectors = ctrlr->num_io_queues + 1;
> > -     if (pci_msix_count(dev) < num_vectors) {
> > +     /*
> > +      * If we cannot even allocate 2 vectors (one for admin, one for
> > +      *  I/O), then revert to INTx.
> > +      */
> > +     if (pci_msix_count(dev) < 2) {
> >               ctrlr->msix_enabled = 0;
> >               goto intx;
> > +     } else if (pci_msix_count(dev) < num_vectors) {
> > +             ctrlr->per_cpu_io_queues = FALSE;
> > +             ctrlr->num_io_queues = 1;
> > +             num_vectors = 2; /* one for admin, one for I/O */
> >       }
> >
> >       if (pci_alloc_msix(dev, &num_vectors) != 0) {
>
> Huh, Linux just falls back to as many vectors as it can and just
> distributes them among per-cpu queues in a round-robin manner.  I think
> it is a better approach than what we have here; would you consider it?
>

This has been on my todo list for a while, but I have not had time to tackle
it.  I'm hoping to spend some time on it in the next couple of weeks, though.

I would prefer it to be smarter than just round-robin.  For example, if
multiple cores are sharing a queue pair, we probably want those cores to
be on the same CPU socket.  Or if hyper-threading is enabled, we likely
want to assign those logical cores to the same queue pair.

But short-term, yes - simple round-robin would be better than the current
fallback scheme.

-Jim




> --
> Alexander Kabaev
>

