40 cores, 48 NVMe disks, feel free to take over

Eric Joyner erj at freebsd.org
Tue Sep 20 17:12:50 UTC 2016


Christoph,

Did you end up filing a bug report? I thought the ixl(4) MSI-X interrupt
allocation failure was handled properly in 11, but if it's as you suspect,
I might still have to look at it again.

On Mon, Sep 19, 2016 at 1:10 PM John Baldwin <jhb at freebsd.org> wrote:

> On Monday, September 19, 2016 11:56:58 AM Adrian Chadd wrote:
> > Hi,
> >
> > I think the nvme allocation issue is known. John?
>
> A kernel with 'options EARLY_AP_STARTUP' (which I plan to enable by default
> in HEAD "soon") should boot fine without needing the force_intx hack.  The
> option is available in 11 but not enabled by default.
>
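
(For anyone who wants to try that, a rough sketch of a custom kernel config follows. The config name MYKERNEL and the write_kernconf helper are made up for illustration; only the option name comes from John's message above.)

```shell
# Hypothetical helper: write a minimal kernel config that pulls in GENERIC
# and adds the option John mentions. $1 is the destination file.
write_kernconf() {
    cat > "$1" <<'EOF'
include GENERIC
ident   MYKERNEL
options EARLY_AP_STARTUP
EOF
}
```

Assuming an amd64 box with sources in /usr/src, this would be called as "write_kernconf /usr/src/sys/amd64/conf/MYKERNEL", followed by the usual "make buildkernel KERNCONF=MYKERNEL" and "make installkernel KERNCONF=MYKERNEL" from /usr/src.
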
> > -a
> >
> >
> > On 11 September 2016 at 13:35, Kevin P. Neal <kpn at neutralgood.org> wrote:
> > > On Sat, Sep 10, 2016 at 10:57:07AM +0200, Christoph Pilka wrote:
> > >> Hi,
> > >>
> > >> the server we got to experiment with is the SuperMicro 2028R-NR48N
> > >> (https://www.supermicro.nl/products/system/2U/2028/SSG-2028R-NR48N.cfm),
> > >> the board itself is an X10DSC+
> > >
> > > The best thing to do is file a bug report. If you don't then your report
> > > will probably fall through the cracks. Include all the info you've posted
> > > so far.
> > >
> > >> //Chris
> > >>
> > >> > On 09 Sep 2016, at 23:14, Dennis Glatting <freebsd at pki2.com> wrote:
> > >> >
> > >> > On Fri, 2016-09-09 at 22:51 +0200, Christoph Pilka wrote:
> > >> >> Hi,
> > >> >>
> > >> >> we've just been granted a short-term loan of a server from Supermicro
> > >> >> with 40 physical cores (plus HTT) and 48 NVMe drives. After a bit of
> > >> >> mucking about, we managed to get 11-RC running. A couple of things
> > >> >> are preventing the system from being terribly useful:
> > >> >>
> > >> >> - We have to use hw.nvme.force_intx=1 for the server to boot
> > >> >> If we don't, it panics around the 9th NVMe drive with "panic:
> > >> >> couldn't find an APIC vector for IRQ...". Increasing
> > >> >> hw.nvme.min_cpus_per_ioq brings it further, but it still panics later
> > >> >> in the NVMe enumeration/init. hw.nvme.per_cpu_io_queues=0 causes it
> > >> >> to panic later (I suspect during ixl init - the box has 4x10gb
> > >> >> ethernet ports).
> > >> >>
> > >> >> - zfskern seems to be the limiting factor when doing ~40 parallel "dd
> > >> >> if=/dev/zero of=<file> bs=1m" on a zpool stripe of all 48 drives. Each
> > >> >> drive shows ~30% utilization (gstat), I can do ~14GB/sec write and
> > >> >> 16GB/sec read.
> > >> >>
> > >> >> - direct writing to the NVMe devices (dd from /dev/zero) gives about
> > >> >> 550MB/sec and ~91% utilization per device
> > >> >>
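
(A minimal sketch of how a parallel dd run like the one described above might be scripted, in case someone wants to reproduce it. The parallel_dd function, the ddtest.N file names and the block count are assumptions, not Christoph's actual test; bs is given in bytes so the same line works with both BSD and GNU dd.)

```shell
# Hypothetical helper: start N background dd writers into one directory
# and wait for all of them.
# usage: parallel_dd <dir> <writers> <blocks-per-writer>
parallel_dd() {
    dir=$1
    writers=$2
    blocks=$3
    i=1
    while [ "$i" -le "$writers" ]; do
        # 1048576 bytes = 1 MiB per block, "$blocks" blocks per file
        dd if=/dev/zero of="$dir/ddtest.$i" bs=1048576 count="$blocks" \
            2>/dev/null &
        i=$((i + 1))
    done
    wait    # returns once every background dd has exited
}
```

Something like "parallel_dd /tank/test 40 10240" (with /tank/test standing in for a dataset on the pool) would approximate the ~40 writers mentioned above, with gstat running in another terminal to watch per-drive utilization.
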
> > >> >> Obviously, the first item is the most troublesome. The rest is based
> > >> >> on entirely synthetic testing and may have little or no actual impact
> > >> >> on the server's usability or fitness for our purposes.
> > >> >>
> > >> >> There is nothing but sshd running on the server, and if anyone wants
> > >> >> to play around you'll have IPMI access (remote kvm, virtual media,
> > >> >> power) and root.
> > >> >>
> > >> >> Any takers?
> > >> >>
> > >> >
> > >> >
> > >> > I'm curious to know what board you have. I have had FreeBSD, including
> > >> > release 11 candidates, running on SM boards without any trouble
> > >> > although some of them are older boards. I haven't looked at ZFS
> > >> > performance because mine are typically low disk use. That said, my
> > >> > virtual server (also a SM) IOPs suck but so do its disks.
> > >> >
> > >> > I recently found the Intel RAID chip on one SM isn't real RAID, rather
> > >> > it's pseudo RAID but for a few dollars more it could be real RAID. :(
> > >> > It was killing IOPs so I popped in an old LSI board, routed the cables
> > >> > from the Intel chip, and the server is now a happy camper. I then
> > >> > replaced 11-RC with Ubuntu 16.10 due to a specific application but I am
> > >> > also running RAIDz2 under Ubuntu on three trash 2.5T disks (I didn't do
> > >> > this for any reason other than fun).
> > >> >
> > >> > root at Tuck3r:/opt/bin# zpool status
> > >> >   pool: opt
> > >> >  state: ONLINE
> > >> >   scan: none requested
> > >> > config:
> > >> >
> > >> >     NAME        STATE     READ WRITE CKSUM
> > >> >     opt         ONLINE       0     0     0
> > >> >       raidz2-0  ONLINE       0     0     0
> > >> >         sda     ONLINE       0     0     0
> > >> >         sdb     ONLINE       0     0     0
> > >> >         sdc     ONLINE       0     0     0
> > >> >
> > >> >
> > >> >
> > >> >> Wbr
> > >> >> Christoph Pilka
> > >> >> Modirum MDpay
> > >> >>
> > >> >> Sent from my iPhone
> > >> >> _______________________________________________
> > >> >> freebsd-questions at freebsd.org mailing list
> > >> >> https://lists.freebsd.org/mailman/listinfo/freebsd-questions
> > >> >> To unsubscribe, send any mail to "freebsd-questions-unsubscribe at freebsd.org"
> > > --
> > > Kevin P. Neal                                http://www.pobox.com/~kpn/
> > >
> > >  "Good grief, I've just noticed I've typed in a rant. Sorry chaps!"
> > >                             Keir Finlow Bates, circa 1998
>
>
> --
> John Baldwin

