nvidia drivers mutex lock
Tomoaki AOKI
junchoon at dec.sakura.ne.jp
Fri Jun 9 15:11:25 UTC 2017
Hmm, I now strongly suspect a hardware or noise issue, as the nvidia GPU
seems to fall off and re-appear on the bus several times.
If it WERE a desktop with the GPU attached via a PCIe connector,
I'd immediately power off and re-seat the card, with some
physical dust cleaning, but this time the GPU is onboard...
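
One way to see how often the GPU drops off the bus is to scan the kernel
messages for NVRM Xid lines. The following is only a minimal Python sketch,
not anything from the driver or the ports tree: the name find_xid_events and
the regular expression are my own assumptions, modelled solely on the log
lines quoted below.

```python
import re
from collections import Counter

# Assumed line shape, taken from the quoted log, e.g.:
#   Jun 6 21:41:05 blubee kernel: NVRM: Xid (PCI:0000:01:00): 79, GPU has
XID_RE = re.compile(
    r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ kernel: "
    r"NVRM: Xid \((PCI:[0-9a-fA-F:.]+)\): (\d+)"
)


def find_xid_events(lines):
    """Return (timestamp, pci_address, xid_code) for every NVRM Xid line."""
    events = []
    for line in lines:
        match = XID_RE.match(line)
        if match:
            events.append((match.group(1), match.group(2), int(match.group(3))))
    return events


# Two sample lines copied from the quoted log; only the first is an Xid line.
sample = [
    "Jun 6 21:41:05 blubee kernel: NVRM: Xid (PCI:0000:01:00): 79, GPU has",
    "Jun 6 21:41:05 blubee kernel: NVRM: GPU is on Board .",
]
events = find_xid_events(sample)
for timestamp, pci_address, code in events:
    print(timestamp, pci_address, "Xid", code)

# Count how many times each Xid code was reported.
print(Counter(code for _, _, code in events))
```

To run it against a real log, pass in the lines of whatever file holds your
kernel messages (often /var/log/messages on FreeBSD, but that path is an
assumption about the local syslog setup). Many Xid 79 entries spread over
time would fit the hardware / noise suspicion better than a single event.
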
*Not sure, but possibly a too-short timeout in the driver initialization
code can show problems like this (too short for the GPU to initialize).
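
To make the timeout guess concrete, here is a purely hypothetical Python
sketch. It has nothing to do with the actual (closed-source) NVIDIA
initialization code; make_device and wait_until_ready are invented names,
and the "device" only simulates hardware that needs a certain number of
polls before it answers.

```python
def make_device(polls_needed):
    """Hypothetical device that reports ready only from the Nth poll on."""
    state = {"polls": 0}

    def is_ready():
        state["polls"] += 1
        return state["polls"] >= polls_needed

    return is_ready


def wait_until_ready(is_ready, max_polls):
    """Poll is_ready() at most max_polls times; True if it became ready."""
    for _ in range(max_polls):
        if is_ready():
            return True
    return False


# A healthy but slow device looks "dead" when the poll budget is too small.
print(wait_until_ready(make_device(polls_needed=5), max_polls=3))
# The same kind of device is detected once the budget is large enough.
print(wait_until_ready(make_device(polls_needed=5), max_polls=10))
```

The point is only that a device which is slow to come up and a device which
is really gone look identical to code that gives up too early.
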
On Thu, 8 Jun 2017 02:27:51 +0800
blubee blubeeme <gurenchan at gmail.com> wrote:
> I was just looking through dmesg and noticed these:
>
> Jun 6 21:40:52 blubee kernel: nvidia-modeset: Allocated GPU:0 (GPU-54a7b304-c99d-efee-0117-0ce119063cd6) @ PCI:0000:01:00.0
> Jun 6 21:41:05 blubee kernel: NVRM: GPU at PCI:0000:01:00: GPU-54a7b304-c99d-efee-0117-0ce119063cd6
> Jun 6 21:41:05 blubee kernel: NVRM: GPU Board Serial Number:
> Jun 6 21:41:05 blubee kernel: NVRM: Xid (PCI:0000:01:00): 79, GPU has fallen off the bus.
> Jun 6 21:41:05 blubee kernel:
> Jun 6 21:41:05 blubee kernel: NVRM: GPU at 0000:01:00.0 has fallen off the bus.
> Jun 6 21:41:05 blubee kernel: NVRM: GPU is on Board .
> Jun 6 21:41:05 blubee kernel: NVRM: A GPU crash dump has been created. If possible, please run
> Jun 6 21:41:05 blubee kernel: NVRM: nvidia-bug-report.sh as root to collect this data before
> Jun 6 21:41:05 blubee kernel: NVRM: the NVIDIA kernel module is unloaded.
> Jun 6 21:41:05 blubee kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000927c:0:0:0x0000000f
> Jun 6 21:41:05 blubee kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000927c:0:0:0x0000000f
> Jun 6 21:41:05 blubee kernel: vgapci0: child nvidia0 requested pci_enable_io
> Jun 6 21:41:05 blubee kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000927c:0:0:0x0000000f
> Jun 6 21:41:06 blubee kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000927c:0:0:0x0000000f
> Jun 6 21:41:22 blubee kernel: .
>
> then that led me to this nvidia forum thread:
> https://devtalk.nvidia.com/default/topic/985037/gtx-1070-quot-gpu-has-fallen-off-the-bus-quot-running-3d-games-in-arch-linux-/
>
> maybe it could help somehow?
>
> Best,
> Owen
>
> On Tue, Jun 6, 2017 at 10:08 PM, blubee blubeeme <gurenchan at gmail.com>
> wrote:
>
> > This is getting out of hand. I can't even keep x going for ten minutes
> > sometimes.
> > I've tested all the suggestions in this thread and they just don't work.
> >
> > I have put out a print of sysctl hw. here : https://paste2.org/
> >
> > With this CPU: hw.model: Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz
> > The bios on this laptop I can either set graphics to discrete or mshybrid.
> >
> > I've tried in the past to disable discrete and run mshybrid but that
> > always comes up with 0 screens found. Even just doing Xorg -configure.
> >
> > Anyone have some tips on disabling nvidia drivers, running this cpu with
> > igpu for a while?
> >
> > Best,
> > Owen
> >
> > On Sun, Jun 4, 2017, 18:11 blubee blubeeme <gurenchan at gmail.com> wrote:
> >
> >> Thanks a lot! I'll give it a shot in a bit.
> >>
> >> Best,
> >> Owen
> >>
> >> On Sun, Jun 4, 2017, 16:59 Tomoaki AOKI <junchoon at dec.sakura.ne.jp>
> >> wrote:
> >>
> >>> Yes. FreeBSD patches in x11/nvidia-driver/files are applied as usual.
> >>>
> >>> But beware! Sometimes upstream changes make some of the FreeBSD patches
> >>> inapplicable (upstream incorporating them, incompatible modifications, ...).
> >>>
> >>> For 381.22, current patchset applies and builds fine for me.
> >>>
> >>>
> >>> On Sun, 04 Jun 2017 08:04:50 +0000
> >>> blubee blubeeme <gurenchan at gmail.com> wrote:
> >>>
> >>> > I'm running with svn and I build by make.
> >>> > If I use these steps, will the BSD related patches be applied, etc?
> >>> >
> >>> > Best,
> >>> > Owen
> >>> >
> >>> > On Sun, Jun 4, 2017, 15:53 Tomoaki AOKI <junchoon at dec.sakura.ne.jp>
> >>> wrote:
> >>> >
> >>> > > Hi.
> >>> > >
> >>> > > Not in ports tree, but easily overridden by adding
> >>> > >
> >>> > > DISTVERSION=381.22 -DNO_CHECKSUM
> >>> > >
> >>> > > on make command line. The Makefile of x11/nvidia-driver has a mechanism
> >>> > > to do so for anyone who requires a newer version (newer GPU support, etc.).
> >>> > >
> >>> > > If you're using portupgrade,
> >>> > >
> >>> > > portupgrade -m 'DISTVERSION=381.22 -DNO_CHECKSUM' -f x11/nvidia-driver
> >>> > >
> >>> > > would do the same.
> >>> > >
> >>> > > If you installed it via pkg, there's no way to try. :-(
> >>> > > (As it's pre-built.)
> >>> > >
> >>> > >
> >>> > > On Sun, 04 Jun 2017 07:04:01 +0000
> >>> > > blubee blubeeme <gurenchan at gmail.com> wrote:
> >>> > >
> >>> > > > Hi @tomoaki
> >>> > > > Is that version of nvidia drivers currently in the ports tree? I just
> >>> > > > checked but it seems not to be.
> >>> > > >
> >>> > > > @jeffrey
> >>> > > > I just generated a new xorg based on the force composition setting. I
> >>> > > > merged it with my previous xorg. I'll reboot, see if it gives better
> >>> > > > performance.
> >>> > > >
> >>> > > > It seems like my system is locking up more frequently now. Sometimes
> >>> > > > right after a reboot, the screen locks and it's reboot and pray.
> >>> > > >
> >>> > > > Best,
> >>> > > > Owen
> >>> > > >
> >>> > > > On Sat, Jun 3, 2017, 21:59 Jeffrey Bouquet <
> >>> jeffreybouquet at yahoo.com>
> >>> > > wrote:
> >>> > > >
> >>> > > > > SOME LINES BOTTOM POSTED, SEE...
> >>> > > > > --------------------------------------------
> >>> > > > > On Fri, 6/2/17, Tomoaki AOKI <junchoon at dec.sakura.ne.jp> wrote:
> >>> > > > >
> >>> > > > > Subject: Re: nvidia drivers mutex lock
> >>> > > > > To: freebsd-current at freebsd.org
> >>> > > > > Cc: "Jeffrey Bouquet" <jeffreybouquet at yahoo.com>, "blubee
> >>> blubeeme" <
> >>> > > > > gurenchan at gmail.com>
> >>> > > > > Date: Friday, June 2, 2017, 11:25 PM
> >>> > > > >
> >>> > > > > Hi.
> >>> > > > > Version 381.22 (5 days newer than 375.66) of the driver states... [1]
> >>> > > > >
> >>> > > > > Fixed hangs and crashes that could occur when an OpenGL context is
> >>> > > > > created while the system is out of available memory.
> >>> > > > >
> >>> > > > > Can this be related with your hang?
> >>> > > > >
> >>> > > > > IMHO, possibly allocating new resource (using os.lock_mtx guard)
> >>> > > > > without checking the lock first while previous request is waiting for
> >>> > > > > another can cause the duplicated lock situation. And high memory
> >>> > > > > pressure would easily cause the situation.
> >>> > > > >
> >>> > > > > [1] http://www.nvidia.com/Download/driverResults.aspx/118527/en-us
> >>> > > > >
> >>> > > > > Hope it helps.
> >>> > > > >
> >>> > > > >
> >>> > > > > On Thu, 1 Jun 2017 22:35:46 +0000 (UTC)
> >>> > > > > Jeffrey Bouquet <jeffreybouquet at yahoo.com> wrote:
> >>> > > > >
> >>> > > > > > I see the same message, upon load, ...
> >>> > > > > >
> >>> > > > > > --------------------------------------------
> >>> > > > > > On Thu, 6/1/17, blubee blubeeme <gurenchan at gmail.com> wrote:
> >>> > > > > >
> >>> > > > > > Subject: nvidia drivers mutex lock
> >>> > > > > > To: freebsd-ports at freebsd.org, freebsd-current at freebsd.org
> >>> > > > > > Date: Thursday, June 1, 2017, 11:35 AM
> >>> > > > > >
> >>> > > > > > I'm running nvidia-drivers 375.66 with a GTX 1070 on FreeBSD-Current
> >>> > > > > >
> >>> > > > > > This problem just started happening recently but, every so often my
> >>> > > > > > laptop screen will just blank out and then I have to power cycle to
> >>> > > > > > get the machine up and running again.
> >>> > > > > >
> >>> > > > > > It seems to be a problem with nvidia drivers acquiring duplicate
> >>> > > > > > lock. Any info on this?
> >>> > > > > >
> >>> > > > > > Jun  2 02:29:41 blubee kernel: acquiring duplicate lock of same type: "os.lock_mtx"
> >>> > > > > > Jun  2 02:29:41 blubee kernel: 1st os.lock_mtx @ nvidia_os.c:841
> >>> > > > > > Jun  2 02:29:41 blubee kernel: 2nd os.lock_mtx @ nvidia_os.c:841
> >>> > > > > > Jun  2 02:29:41 blubee kernel: stack backtrace:
> >>> > > > > > Jun  2 02:29:41 blubee kernel: #0 0xffffffff80ab7770 at witness_debugger+0x70
> >>> > > > > > Jun  2 02:29:41 blubee kernel: #1 0xffffffff80ab7663 at witness_checkorder+0xe23
> >>> > > > > > Jun  2 02:29:41 blubee kernel: #2 0xffffffff80a35b93 at __mtx_lock_flags+0x93
> >>> > > > > > Jun  2 02:29:41 blubee kernel: #3 0xffffffff82f4397b at os_acquire_spinlock+0x1b
> >>> > > > > > Jun  2 02:29:41 blubee kernel: #4 0xffffffff82c48b15 at _nv012002rm+0x185
> >>> > > > > > Jun  2 02:29:41 blubee kernel: ACPI Warning: \_SB.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20170303/nsarguments-205)
> >>> > > > > > Jun  2 02:29:42 blubee kernel: nvidia-modeset: Allocated GPU:0 (GPU-54a7b304-c99d-efee-0117-0ce119063cd6) @ PCI:0000:01:00.0
> >>> > > > >
> >>> > > > > > Best,
> >>> > > > > > Owen
> >>> > > > > >
> >>> > > > > _______________________________________________
> >>> > > > > > freebsd-ports at freebsd.org
> >>> > > > > > mailing list
> >>> > > > > > https://lists.freebsd.org/mailman/listinfo/freebsd-ports
> >>> > > > > > To unsubscribe, send any mail to
> >>> > > > > "freebsd-ports-unsubscribe at freebsd.org"
> >>> > > > > >
> >>> > > > > >
> >>> > > > > >
> >>> > > > > > ... then Xorg will run happily twelve hours or so. The lockups here
> >>> > > > > > usually happen when too many tabs, or too-large web pages with complex
> >>> > > > > > CSS etc., are opened at a time.
> >>> > > > > > So no help, just a 'me too'.
> >>> > > > > >
> >>> > > > > _______________________________________________
> >>> > > > > > freebsd-current at freebsd.org
> >>> > > > > mailing list
> >>> > > > > > https://lists.freebsd.org/mailman/listinfo/freebsd-current
> >>> > > > > >
> >>> > > > > To unsubscribe, send any mail to "
> >>> > > freebsd-current-unsubscribe at freebsd.org
> >>> > > > > "
> >>> > > > > >
> >>> > > > > >
> >>> > > > >
> >>> > > > >
> >>> > > > > --
> >>> > > > > Tomoaki
> >>> > > > > AOKI <junchoon at dec.sakura.ne.jp>
> >>> > > > >
> >>> > > > >
> >>> > > > >
> >>> > > > > ........................
> >>> > > > > might be a workaround
> >>> > > > > Xorg/nvidia ran all night with this:
> >>> > > > > nvidia-settings >> X server display configuration >> Advanced >> Force Full Composition Pipeline
> >>> > > > > ... for the laptop freezing. Could not hurt to try. "merge with Xorg.conf" from nvidia-settings...
> >>> > > > > ......................
> >>> > > > > 18 hours uptime so far, even past the 3 am periodic scripts. I have not
> >>> > > > > restarted Xorg though, so the setting may need to be edited out of
> >>> > > > > xorg.conf if applying it at Xorg startup behaves differently from
> >>> > > > > applying it in real time.
> >>> > > > > ........
> >>> > > > >
> >>> > > > >
> >>> > > >
> >>> > > >
> >>> > >
> >>> > >
> >>> > > --
> >>> > > Tomoaki AOKI <junchoon at dec.sakura.ne.jp>
> >>> > >
> >>>
> >>>
> >>> --
> >>> Tomoaki AOKI <junchoon at dec.sakura.ne.jp>
> >>>
> >>
>
>
--
Tomoaki AOKI <junchoon at dec.sakura.ne.jp>