Re: really slow problem with nvme

From: Warner Losh <imp_at_bsdimp.com>
Date: Fri, 23 Feb 2024 19:16:20 UTC
On Fri, Feb 23, 2024 at 12:03 PM Bjoern A. Zeeb <
bzeeb-lists@lists.zabbadoz.net> wrote:

> On Fri, 23 Feb 2024, Warner Losh wrote:
>
> > On Fri, Feb 23, 2024, 10:46 AM Bjoern A. Zeeb <
> > bzeeb-lists@lists.zabbadoz.net> wrote:
> >
> >> Hi,
> >>
> >> this is a Samsung SSD 970 EVO Plus 1TB NVMe drive, and gpart and newfs
> >> were already slow (newfs took about two hours).
> >>
> >> Here's another example now:
> >>
> >> # /usr/bin/time mkdir foo
> >>          1.82 real         0.00 user         0.00 sys
> >>
> >> How does one debug this?
> >>
> >
> > What filesystem? Sounds like UFS, but just making sure.
>
> yes, ufs
>
> > So what's the link speed and number of lanes?  If it's bad, I might reseat
> > it (though that might not help); otherwise that looks good...
>
> pciconf I had checked:
>
> nvme0@pci4:1:0:0:       class=0x010802 rev=0x00 hdr=0x00 vendor=0x144d
> device=0xa808 subvendor=0x144d subdevice=0xa801
>      class      = mass storage
>      subclass   = NVM
>      bar   [10] = type Memory, range 64, base 0x40000000, size 16384,
> enabled
>      cap 01[40] = powerspec 3  supports D0 D3  current D0
>      cap 05[50] = MSI supports 1 message, 64 bit
>      cap 10[70] = PCI-Express 2 endpoint max data 128(256) FLR RO NS
>                   max read 512
>                   link x2(x4) speed 8.0(8.0) ASPM disabled(L1) ClockPM
> disabled
>      cap 11[b0] = MSI-X supports 33 messages, enabled
>                   Table in map 0x10[0x3000], PBA in map 0x10[0x2000]
>      ecap 0001[100] = AER 2 0 fatal 0 non-fatal 0 corrected
>      ecap 0003[148] = Serial 1 0000000000000000
>      ecap 0004[158] = Power Budgeting 1
>      ecap 0019[168] = PCIe Sec 1 lane errors 0
>      ecap 0018[188] = LTR 1
>      ecap 001e[190] = L1 PM Substates 1
>

x4 card in an x2 slot. If that's intentional, then this looks good.
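For reference, the negotiated vs. maximum link width can be pulled out of `pciconf -lc` output. A hypothetical sketch (the field layout is taken from the output above; on a live system you'd feed it `pciconf -lc nvme0 | grep 'link x'`):

```shell
# Sketch: parse "link x2(x4) speed 8.0(8.0)" from pciconf -lc output to
# compare the negotiated lane count against the device's maximum.
line='link x2(x4) speed 8.0(8.0) ASPM disabled(L1)'
cur=$(printf '%s\n' "$line" | sed -n 's/.*link x\([0-9][0-9]*\)(x[0-9][0-9]*).*/\1/p')
max=$(printf '%s\n' "$line" | sed -n 's/.*link x[0-9][0-9]*(x\([0-9][0-9]*\)).*/\1/p')
echo "negotiated x$cur of x$max lanes"   # → negotiated x2 of x4 lanes
```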


>
> > Though I'd bet money that this is an interrupt issue. I'd do a vmstat -i
> > to watch how quickly they accumulate...
>
> For that I am waiting for a full world build to get onto it.  I wish I could
> have netbooted, but that's not possible there currently.
>
> Only took 15 minutes to extract the tar now.  Should have used ddb...
> hadn't thought of that before...
>
> # vmstat -ai | grep nvme
> its0,0: nvme0:admin                                       0          0
> its0,1: nvme0:io0                                         0          0
> its0,2: nvme0:io1                                         0          0
> its0,3: nvme0:io2                                         0          0
> its0,4: nvme0:io3                                         0          0
> its0,5: nvme0:io4                                         0          0
> its0,6: nvme0:io5                                         0          0
> its0,7: nvme0:io6                                         0          0
> its0,8: nvme0:io7                                         0          0
>
> How does this even work?  Do we poll?
>

Yes. We poll, and poll slowly.  You have an interrupt problem.
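One quick way to quantify that (a sketch; the sample lines are the `vmstat -ai` output from the slow machine above) is to sum the per-queue interrupt counts, since a total of zero across every queue means the driver fell back to polling:

```shell
# Sketch: sum interrupt counts from "vmstat -ai | grep nvme"-style output.
# On a live system: vmstat -ai | grep nvme | awk '{ sum += $(NF-1) } END { print sum }'
total=$(printf '%s\n' \
    'its0,0: nvme0:admin 0 0' \
    'its0,1: nvme0:io0 0 0' \
    'its0,2: nvme0:io1 0 0' \
  | awk '{ sum += $(NF-1) } END { print sum }')
echo "total nvme interrupts: $total"   # → total nvme interrupts: 0
```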

On an ARM platform. Fun. ITS and I are old.... foes? Friends? frenemies?

As for why, I don't know. I've been fortunate never to have had to chase
interrupts-not-working-on-arm problems....


> And before you ask:
>
> [1.000407] nvme0: <Generic NVMe Device> mem 0x40000000-0x40003fff at
> device 0.0 on pci5
> [1.000409] nvme0: attempting to allocate 9 MSI-X vectors (33 supported)
> [1.000410] nvme0: using IRQs 106-114 for MSI-X
> [1.000411] nvme0: CapLo: 0x3c033fff: MQES 16383, CQR, AMS WRRwUPC, TO 60
> [1.000412] nvme0: CapHi: 0x00000030: DSTRD 0, NSSRS, CSS 1, CPS 0, MPSMIN
> 0, MPSMAX 0
> [1.000413] nvme0: Version: 0x00010300: 1.3
>

Yea, that's what I'd expect.


> > How old is the drive? Fresh install? Do other drives have this same issue
> > in the same slot? Does this drive have issues in other machines or slots?
>
> The drive is a few months old, but it sat in its box until it went into this
> board.
>
> I checked nvmecontrol for anything obvious but didn't see anything.
>

OK. So it's not "super old NAND in its death throes being slow".


> > Oh, and what's its temperature? Any message in dmesg?
>
> Nothing in dmesg, and the temp seems not too bad.  It took a while to get
> smartmontools; we have no way to see this in nvmecontrol in human-readable
> form, do we?
>
> Temperature Sensor 1:               51 Celsius
> Temperature Sensor 2:               48 Celsius
>

A little warm, but not terrible. 50 is where I start to worry a bit, but the
card won't thermal throttle until more like 60. We don't currently have an
nvmecontrol identify field to tell you this (I should add it; this is the
second time in as many weeks I've wanted it).

> Ok, I got a 2nd identical machine netbooted remotely (pressure with
> problems often helps) -- slightly different FreeBSD version and kernel,
> same board, same type of nvme bought together:
>
> # /usr/bin/time dd if=/dev/zero of=/dev/nda0 bs=1M count=1024
> 1024+0 records in
> 1024+0 records out
> 1073741824 bytes transferred in 1.657316 secs (647879880 bytes/sec)
>          1.66 real         0.00 user         0.94 sys
>
> and ddb> show intrcnt
> ..
> its0,0: nvme0:admin                     24
> its0,1: nvme0:io0                       126
> its0,2: nvme0:io1                       143
> its0,3: nvme0:io2                       131
> its0,4: nvme0:io3                       128
> its0,5: nvme0:io4                       135
> its0,6: nvme0:io5                       147
> its0,7: nvme0:io6                       143
> its0,8: nvme0:io7                       144


Yea, that's what I'd expect: dozens to hundreds of interrupts.

> I'll try to make sure I can safely access both over the weekend remotely
> from a more comforting place and I know where to start looking now...
>
> Thanks!
>

No problem!

Warner