40 cores, 48 NVMe disks, feel free to take over

Christoph Pilka c.pilka at asconix.com
Fri Sep 9 20:51:49 UTC 2016


We've just been granted a short-term loan of a server from Supermicro with 40 physical cores (plus HTT) and 48 NVMe drives. After a bit of mucking about, we managed to get 11-RC running. A couple of things are preventing the system from being terribly useful:

- We have to set hw.nvme.force_intx=1 for the server to boot.
If we don't, it panics around the 9th NVMe drive with "panic: couldn't find an APIC vector for IRQ...". Increasing hw.nvme.min_cpus_per_ioq gets it further, but it still panics later during NVMe enumeration/init. Setting hw.nvme.per_cpu_io_queues=0 causes it to panic later (I suspect during ixl init; the box has 4x 10Gb Ethernet ports).

- zfskern seems to be the limiting factor when running ~40 parallel "dd if=/dev/zero of=<file> bs=1m" processes on a zpool stripe of all 48 drives. Each drive shows ~30% utilization (gstat), and I can do ~14GB/sec write and ~16GB/sec read.

- Writing directly to the NVMe devices (dd from /dev/zero) gives about 550MB/sec and ~91% utilization per device.
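
For reference, the tunables named in the first item are boot-time loader tunables. A minimal sketch of how they would be set, assuming the usual /boot/loader.conf location; the commented-out values are placeholders for illustration, not the values actually tried on this box:

```sh
# /boot/loader.conf (sketch)

# Workaround from the first item: use legacy INTx interrupts for nvme(4)
# instead of MSI-X, so the box gets past NVMe enumeration.
hw.nvme.force_intx="1"

# The other knobs mentioned above. The values shown are hypothetical;
# neither setting avoided the panic on its own.
#hw.nvme.min_cpus_per_ioq="4"
#hw.nvme.per_cpu_io_queues="0"
```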

Obviously, the first item is the most troublesome. The rest is based on entirely synthetic testing and may have little or no actual impact on the server's usability or fitness for our purposes. 
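
A minimal sketch of the parallel dd write test described above, for anyone who wants to reproduce it. The function name parallel_dd, the dd.N file names and the count limit are assumptions added for illustration (the original command has no count and would run until the pool fills); the block size is given in bytes and equals bs=1m.

```sh
#!/bin/sh
# parallel_dd DIR JOBS COUNT
# Start JOBS concurrent dd writers, each writing COUNT blocks of 1 MiB
# from /dev/zero into its own file under DIR, then wait for all of them.
# Hypothetical helper, not the script used for the numbers above.
parallel_dd() {
    dir=$1
    jobs=$2
    count=$3

    mkdir -p "$dir" || return 1

    i=1
    while [ "$i" -le "$jobs" ]; do
        # bs=1048576 is 1 MiB, the same as bs=1m on FreeBSD
        dd if=/dev/zero of="$dir/dd.$i" bs=1048576 count="$count" 2>/dev/null &
        i=$((i + 1))
    done

    # Block until every background writer has finished
    wait
}
```

Something like "parallel_dd /tank/ddtest 40 10240" would then start 40 writers of 10 GiB each, where /tank/ddtest is a made-up mountpoint on the striped pool. Watching gstat in another terminal while it runs shows the per-drive utilization.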

There is nothing but sshd running on the server, and anyone who wants to play around will get IPMI access (remote KVM, virtual media, power) and root.

Any takers?

Christoph Pilka
Modirum MDpay


More information about the freebsd-questions mailing list