From: Pavel Timofeev
Date: Sun, 17 Oct 2021 17:52:39 -0600
Subject: Re: Dell Latitude 7400 - nvme0: Missing interrupt
To: Alexander Motin
Cc: Warner Losh, Chuck Tuffli, freebsd-current
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current
In-Reply-To: <4fa413f4-f167-ce1d-ce2f-a2a05a34dc32@gmail.com>

Sun, 17 Oct 2021 at 11:19, Alexander Motin:

> It may be noise, but comparing logs I see in reboot case also
> "acpi_ec0: not getting interrupts, switched to polled mode". I am
> thinking whether the problem may be caused not by SSD, but by some
> resource conflict/misconfiguration in the system. Pavel, can you
> compare `devinfo -vr` and `lspci -vvvvv` in both cases, looking for any
> differences? Are you running the latest BIOS?
>
> On 12.10.2021 15:56, Warner Losh wrote:
> > One piece of data that would be good to have:
> >
> > nvmecontrol identify nvme0
> >
> > There's an optional feature that none of my drives have, but that the Linux
> > driver does oddly weird things when enabled. The output of that command
> > will help me understand if that may be in play. Maybe we need to do oddly
> > weird things too :)
> >
> > Warner
> >
> > On Sun, Oct 10, 2021 at 11:00 PM Warner Losh wrote:
> >
> >> On Sun, Oct 10, 2021 at 10:48 PM Pavel Timofeev wrote:
> >>
> >>> Sat, 9 Oct 2021 at 14:59, Warner Losh:
> >>>
> >>>> On Sat, Oct 9, 2021, 8:44 AM Pavel Timofeev wrote:
> >>>>
> >>>>> Fri, 8 Oct 2021 at 14:49, Warner Losh:
> >>>>>
> >>>>>> On Fri, Oct 8, 2021 at 2:42 PM Pavel Timofeev wrote:
> >>>>>>
> >>>>>>> Sat, 21 Aug 2021 at 15:22, Warner Losh:
> >>>>>>>
> >>>>>>>> On Sat, Aug 21, 2021 at 3:06 PM Pavel Timofeev wrote:
> >>>>>>>>
> >>>>>>>>> Warner Losh:
> >>>>>>>>>
> >>>>>>>>>> On Fri, Aug 20, 2021 at 10:42 PM Pavel Timofeev <timp87@gmail.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Pavel Timofeev:
> >>>>>>>>>>>
> >>>>>>>>>>>> Chuck Tuffli:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> On Mon, Aug 16, 2021 at 7:43 PM Pavel Timofeev <timp87@gmail.com> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Hello
> >>>>>>>>>>>>>> I've got a Dell Latitude 7400 and tried installing the latest 14.0-CURRENT
> >>>>>>>>>>>>>> (main-n248636-d20e9e02db3) on it.
> >>>>>>>>>>>>>> Despite other things the weird one which concerns me is
> >>>>>>>>>>>>>> nvme0: Missing interrupt
> >>>>>>>>>>>>>> message I get sometimes on the console.
> >>>>>>>>>>>>>> It seems like I get it only after the reboot of the laptop, i.e. not
> >>>>>>>>>>>>>> getting that message if I power cycle the laptop, at least I haven't seen
> >>>>>>>>>>>>>> them for now in such cases.
> >>>>>>>>>>>>>> So when the laptop is rebooted I can't even take advantage of
> >>>>>>>>>>>>>> nvmecontrol(8) quickly.
> >>>>>>>>>>>>>> Well, it still works, but it takes tens of seconds to return the output.
> >>>>>>>>>>>>> ...
> >>>>>>>>>>>>>> dmesg when power cycled -
> >>>>>>>>>>>>>> https://drive.google.com/file/d/1dB27oB1O2CcnZy6DvOOhmFO8SN8V8SwJ
> >>>>>>>>>>>>>> dmesg when rebooted -
> >>>>>>>>>>>>>> https://drive.google.com/file/d/1DsKTMkihp_OmUcirByLaVO4o2mU38Bxh
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I'm sort of curious about the time stamps for the log messages in the
> >>>>>>>>>>>>> failing case. Something like:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> $ grep "nv\(me\|d\)" /var/log/messages
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> --chuck
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Well, I can't see timestamps in the verbose boot log. Am I missing some
> >>>>>>>>>>>> configuration for that?
> >>>>>>>>>>>>
> >>>>>>>>>>>> $ grep "nv\(me\|d\)" /var/log/messages
> >>>>>>>>>>>> nvme0: mem 0xcc100000-0xcc103fff,0xcc105000-0xcc105fff,0xcc104000-0xcc104fff at device 0.0 on pci6
> >>>>>>>>>>>> nvme0: attempting to allocate 5 MSI-X vectors (17 supported)
> >>>>>>>>>>>> nvme0: using IRQs 133-137 for MSI-X
> >>>>>>>>>>>> nvme0: CapLo: 0x140103ff: MQES 1023, CQR, TO 20
> >>>>>>>>>>>> nvme0: CapHi: 0x00000030: DSTRD 0, NSSRS, CSS 1, MPSMIN 0, MPSMAX 0
> >>>>>>>>>>>> nvme0: Version: 0x00010300: 1.3
> >>>>>>>>>>>> nvme0: Missing interrupt
> >>>>>>>>>>>> nvme0: Missing interrupt
> >>>>>>>>>>>> nvme0: Missing interrupt
> >>>>>>>>>>>> nvme0: Missing interrupt
> >>>>>>>>>>>> nvme0: Missing interrupt
> >>>>>>>>>>>> nvme0: Missing interrupt
> >>>>>>>>>>>> nvme0: Missing interrupt
> >>>>>>>>>>>> nvme0: Missing interrupt
> >>>>>>>>>>>> nvme0: Missing interrupt
> >>>>>>>>>>>> nvme0: Missing interrupt
> >>>>>>>>>>>> nvme0: Missing interrupt
> >>>>>>>>>>>> nvme0: Missing interrupt
> >>>>>>>>>>>> nvd0: NVMe namespace
> >>>>>>>>>>>> GEOM: new disk nvd0
> >>>>>>>>>>>> nvd0: 488386MB (1000215216 512 byte sectors)
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Ah, sorry, provided wrong output.
> >>>>>>>>>>> Here is what you requested:
> >>>>>>>>>>> $ grep "nv\(me\|d\)" /var/log/messages
> >>>>>>>>>>> Aug 21 04:34:36 nostromo kernel: nvme0: mem 0xcc100000-0xcc103fff,0xcc105000-0xcc105fff,0xcc104000-0xcc104fff at device 0.0 on pci6
> >>>>>>>>>>> Aug 21 04:34:36 nostromo kernel: nvme0: attempting to allocate 5 MSI-X vectors (17 supported)
> >>>>>>>>>>> Aug 21 04:34:36 nostromo kernel: nvme0: using IRQs 133-137 for MSI-X
> >>>>>>>>>>> Aug 21 04:34:36 nostromo kernel: nvme0: CapLo: 0x140103ff: MQES 1023, CQR, TO 20
> >>>>>>>>>>> Aug 21 04:34:36 nostromo kernel: nvme0: CapHi: 0x00000030: DSTRD 0, NSSRS, CSS 1, MPSMIN 0, MPSMAX 0
> >>>>>>>>>>> Aug 21 04:34:36 nostromo kernel: nvme0: Version: 0x00010300: 1.3
> >>>>>>>>>>> Aug 21 04:34:36 nostromo kernel: nvme0: Missing interrupt
> >>>>>>>>>>> Aug 21 04:34:36 nostromo kernel: nvme0: Missing interrupt
> >>>>>>>>>>> Aug 21 04:34:36 nostromo kernel: nvme0: Missing interrupt
> >>>>>>>>>>> Aug 21 04:34:36 nostromo kernel: nvd0: 512GB> NVMe namespace
> >>>>>>>>>>> Aug 21 04:34:36 nostromo kernel: GEOM: new disk nvd0
> >>>>>>>>>>> Aug 21 04:34:36 nostromo kernel: nvd0: 488386MB (1000215216 512 byte sectors)
> >>>>>>>>>>> Aug 21 04:34:42 nostromo kernel: nvme0: Missing interrupt
> >>>>>>>>>>> Aug 21 04:35:36 nostromo kernel: nvme0: Missing interrupt
> >>>>>>>>>>> Aug 21 04:35:50 nostromo kernel: nvme0: Missing interrupt
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> What happens if you set hw.nvme.use_nvd=0 and hw.cam.nda.nvd_compat=1
> >>>>>>>>>> in the boot loader and reboot? Same thing except nda where nvd was? Or does
> >>>>>>>>>> it work?
> >>>>>>>>>>
> >>>>>>>>>> Something weird is going on in the interrupt assignment, I think, but I
> >>>>>>>>>> wanted to get any nvd vs nda issues out of the way first.
> >>>>>>>>>>
> >>>>>>>>>> Warner
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Do you mean kern.cam.nda.nvd_compat instead of hw.cam.nda.nvd_compat?
> >>>>>>>>> kern.cam.nda.nvd_compat is 1 by default now.
> >>>>>>>>>
> >>>>>>>>> So I tried to set hw.nvme.use_nvd to 1 as suggested, but I still see
> >>>>>>>>> nvme0: Missing interrupt
> >>>>>>>>> and now also
> >>>>>>>>> Root mount waiting for: CAM
> >>>>>>>>> messages besides those
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> OK. That all makes sense. I'd forgotten that nvd_compat=1 by default these
> >>>>>>>> days.
> >>>>>>>>
> >>>>>>>> I'll take a look on Monday starting at the differences in interrupt
> >>>>>>>> assignment that are apparent when you cold boot vs reboot.
> >>>>>>>>
> >>>>>>>> Thanks for checking... I'd hoped this was a cheap fix, but also didn't really
> >>>>>>>> expect it to be.
> >>>>>>>>
> >>>>>>>> Warner
> >>>>>>>>
> >>>>>>> I've recently upgraded to main-n249974-17f790f49f5 and it got even worse now.
> >>>>>>> So clean poweron works as before.
> >>>>>>> But if rebooted, the nvme drive refuses to work, while before the code
> >>>>>>> upgrade it was just complaining about missing interrupts.
> >>>>>>>
> >>>>>>> Currently dmesg shows this:
> >>>>>>> nvme0: mem 0xcc100000-0xcc103fff,0xcc105000-0xcc105fff,0xcc104000-0xcc104fff at device 0.0 on pci6
> >>>>>>> nvd0: NVMe namespace
> >>>>>>> nvd0: 488386MB (1000215216 512 byte sectors)
> >>>>>>> nvme0: mem 0xcc100000-0xcc103fff,0xcc105000-0xcc105fff,0xcc104000-0xcc104fff at device 0.0 on pci6
> >>>>>>>
> >>>>>>
> >>>>>> Why is this showing up twice? Or is everything above this line left
> >>>>>> over from the first, working boot?
> >>>>>>
> >>>>>>> nvme0: RECOVERY_START 9585870784 vs 9367036288
> >>>>>>> nvme0: timeout with nothing complete, resetting
> >>>>>>> nvme0: Resetting controller due to a timeout.
> >>>>>>> nvme0: RECOVERY_WAITING
> >>>>>>> nvme0: resetting controller
> >>>>>>> nvme0: aborting outstanding admin command
> >>>>>>> nvme0: IDENTIFY (06) sqid:0 cid:15 nsid:0 cdw10:00000001 cdw11:00000000
> >>>>>>> nvme0: ABORTED - BY REQUEST (00/07) sqid:0 cid:15 cdw0:0
> >>>>>>> nvme0: nvme_identify_controller failed!
> >>>>>>> nvme0: waiting
> >>>>>>>
> >>>>>>
> >>>>>> Clearly something bad is going on with the drive here... We looked
> >>>>>> into the completion queues when we didn't get an interrupt and there was
> >>>>>> nothing complete there....
> >>>>>>
> >>>>>> The only thing I can think of is that this means there's a phase error
> >>>>>> between the drive and the system. I recently removed a second reset and
> >>>>>> made it an option NVME_2X_RESET. Can you see if adding
> >>>>>> 'options NVME_2X_RESET' to your kernel config fixes this?
> >>>>>>
> >>>>>> Warner
> >>>>>>
> >>>>>>> nvme0: mem 0xcc100000-0xcc103fff,0xcc105000-0xcc105fff,0xcc104000-0xcc104fff at device 0.0 on pci6
> >>>>>>> nvme0: RECOVERY_START 9362778467 vs 9361830561
> >>>>>>> nvme0: timeout with nothing complete, resetting
> >>>>>>> nvme0: Resetting controller due to a timeout.
> >>>>>>> nvme0: RECOVERY_WAITING
> >>>>>>> nvme0: resetting controller
> >>>>>>> nvme0: aborting outstanding admin command
> >>>>>>> nvme0: IDENTIFY (06) sqid:0 cid:15 nsid:0 cdw10:00000001 cdw11:00000000
> >>>>>>> nvme0: ABORTED - BY REQUEST (00/07) sqid:0 cid:15 cdw0:0
> >>>>>>> nvme0: nvme_identify_controller failed!
> >>>>>>> nvme0: waiting
> >>>>>>>
> >>>>>
> >>>>> Sorry, it's showing twice due to multiple reboots.
> >>>>> For one boot it's like:
> >>>>> nvme0: mem 0xcc100000-0xcc103fff,0xcc105000-0xcc105fff,0xcc104000-0xcc104fff at device 0.0 on pci6
> >>>>> nvme0: RECOVERY_START 9633303481 vs 9365971423
> >>>>> nvme0: timeout with nothing complete, resetting
> >>>>> nvme0: Resetting controller due to a timeout.
> >>>>> nvme0: RECOVERY_WAITING
> >>>>> nvme0: resetting controller
> >>>>> nvme0: aborting outstanding admin command
> >>>>> nvme0: IDENTIFY (06) sqid:0 cid:15 nsid:0 cdw10:00000001 cdw11:00000000
> >>>>> nvme0: ABORTED - BY REQUEST (00/07) sqid:0 cid:15 cdw0:0
> >>>>> nvme0: nvme_identify_controller failed!
> >>>>> nvme0: waiting
> >>>>>
> >>>>> Well, neither Windows nor Linux has any problems with the device. I
> >>>>> understand they may be hiding it or working around it somehow.
> >>>>>
> >>>>
> >>>> Yea, I'm trying to figure out why your machine is different than mine,
> >>>> and what Windows or Linux do that is different. It may be dodgy hardware,
> >>>> but others have no trouble...
> >>>>
> >>>>> I'll try setting NVME_2X_RESET in the kernel config and report back in a
> >>>>> while.
> >>>>>
> >>>>
> >>>> Thanks. If it helps, that tells me something. If it doesn't, that tells
> >>>> me something else.
> >>>>
> >>>> I suspect that it is somewhere else in the system, tbh, but I need to
> >>>> find it systematically.
> >>>>
> >>>> Warner
> >>>>
> >>>
> >>> Surprisingly, setting NVME_2X_RESET in the kernel config hasn't changed
> >>> anything, i.e. it didn't help.
> >>>
> >>
> >> While it would have been nice to have this be the fix, I'm not that
> >> surprised either.
> >> It was the biggest change of late, apart from the big re-arrangement that
> >> I'd done.
> >>
> >> So the other changes have moved from the occasional missing interrupt message
> >> (which the old code would get when a command wasn't completed in the timeout
> >> period, but that we found to be done when we scanned the completion queue).
> >> Now the device is detected fine (as before), but then doesn't do I/O at all
> >> (including not completing the identify command!) and is worse. This is
> >> unexpected and I'm trying to understand what happens on reboot that 'changes'
> >> the working state and why the new code behaves so differently.
> >>
> >> Warner
> >>
>
> --
> Alexander Motin

Thanks for the reply.
It's using the latest firmware. Checking that is always the first thing I do in such cases.
I'm attaching the devinfo and lspci output.
The diffs below show the differences between a clean boot (.ok) and a reboot (.nok):

$ diff -u devinfo.ok devinfo.nok
--- devinfo.ok 2021-10-17 17:48:07.964346000 -0600
+++ devinfo.nok 2021-10-17 17:48:07.886881000 -0600
@@ -214,10 +214,6 @@
 nvme0 pnpinfo vendor=0x1c5c device=0x1639 subvendor=0x1c5c subdevice=0x1639 class=0x010802 at slot=0 function=0 dbsf=pci0:59:0:0 handle=\_SB_.PCI0.RP13.PXSX
 Interrupt request lines:
 0x85
- 0x86
- 0x87
- 0x88
- 0x89
 pcib7 memory window:
 0xcc100000-0xcc103fff
 0xcc104000-0xcc104fff

$ diff -u lspci.ok lspci.nok
--- lspci.ok 2021-10-17 17:48:15.894470000 -0600
+++ lspci.nok 2021-10-17 17:48:15.341379000 -0600
@@ -132,7 +132,7 @@
 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
 Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
 Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
- Address: 00000000fee06000 Data: 0033
+ Address: 00000000fee06000 Data: 0034
 Capabilities: [40] Express (v2) Root Complex Integrated Endpoint, MSI 00
 DevCap: MaxPayload 128 bytes, PhantFunc 0
 ExtTag- RBE- FLReset+
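
For anyone who wants to repeat the clean-boot vs. reboot comparison, here is a minimal sketch of how it could be scripted. The function names capture_state and compare_state are hypothetical helpers, not existing tools; the .ok/.nok file names follow the diffs in this message. It assumes devinfo(8) from the base system and that lspci, if present, comes from the sysutils/pciutils port.

```shell
#!/bin/sh
# Hypothetical helpers for comparing hardware state between a clean
# boot and a warm reboot. Nothing runs until a function is called.

# capture_state LABEL
# Save `devinfo -vr` (and `lspci -vvvvv`, if installed) into files
# named devinfo.LABEL and lspci.LABEL in the current directory.
capture_state() {
    label=$1
    devinfo -vr > "devinfo.$label"
    if command -v lspci >/dev/null 2>&1; then
        lspci -vvvvv > "lspci.$label"
    else
        # Assumption: lspci is not in base; it is skipped when absent.
        echo "lspci not found, skipping" >&2
    fi
}

# compare_state TOOL
# Print a unified diff of TOOL.ok against TOOL.nok.
# diff exits 1 when the files differ, which is the interesting case
# here, so only an exit status other than 0 or 1 counts as a failure.
compare_state() {
    diff -u "$1.ok" "$1.nok" || [ "$?" -eq 1 ]
}
```

Typical use would be `capture_state ok` after a power cycle, `capture_state nok` after a reboot, and then `compare_state devinfo` and `compare_state lspci` from the same directory.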