[Bug 268276] Regression: Black screen on resume caused by commit 9e007a88d65b
Date: Fri, 09 Dec 2022 15:17:30 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=268276
Bug ID: 268276
Summary: Regression: Black screen on resume caused by commit 9e007a88d65b
Product: Base System
Version: CURRENT
Hardware: amd64
OS: Any
Status: New
Severity: Affects Some People
Priority: ---
Component: kern
Assignee: bugs@FreeBSD.org
Reporter: ashafer@badland.io
I've finally narrowed down the cause for suspend/resume breaking on my Ryzen
system for the past year. Commit 9e007a88d65b changed the polling rate of
atkbd, which for some reason causes the GPU to disappear off the PCI bus,
leaving the screen black.

Author: Alexander Motin <mav@FreeBSD.org>
Date:   Wed Jan 5 11:32:44 2022 -0500

    atkbd: Reduce polling rate from 10Hz to ~1Hz.

    In my understanding this is only needed to workaround lost interrupts.
    I was thinking to remove it completely, but the comment about
    edge-triggered interrupt may be true and needs deeper investigation.
    ~1Hz should be often enough to handle the supposedly rare loss cases,
    but rare enough to not appear in top. Add sysctl hw.atkbd.hz to tune it.

    MFC after: 1 month

The workaround is to set hw.atkbd.hz=10 in /boot/loader.conf
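
For reference, a minimal sketch of that loader.conf entry. It assumes hw.atkbd.hz is honored as a loader tunable at boot, which the workaround above implies but which I have not checked against the driver source:

```
# /boot/loader.conf
# Restore the atkbd polling rate used before commit 9e007a88d65b (10 Hz).
# Assumption: hw.atkbd.hz is read as a loader tunable at boot.
hw.atkbd.hz="10"
```

After a reboot, running `sysctl hw.atkbd.hz` should report 10 if the setting took effect.
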
System is AMD Ryzen 9 5900X, TUF Gaming B550-PLUS motherboard, NVIDIA GTX 960.
I did update the motherboard firmware but that didn't help.
After resuming you can usually still ssh into the machine, but if you try to do
anything graphical the following is logged:
Dec 9 02:12:32 mick kernel: NVRM: GPU at PCI:0000:07:00: GPU-8293a5fd-a5ed-570d-283f-675298ebf38c
Dec 9 02:12:32 mick kernel: NVRM: Xid (PCI:0000:07:00): 79, pid='<unknown>', name=<unknown>, GPU has fallen off the bus.
Dec 9 02:12:32 mick kernel: NVRM: GPU 0000:07:00.0: GPU has fallen off the bus.
Dec 9 02:12:32 mick devd[384]: notify_clients: send() failed; dropping unresponsive client
Dec 9 02:12:32 mick kernel: nvidia-modeset: ERROR: GPU:0: Failed detecting connected display devices
Dec 9 02:12:32 mick syslogd: last message repeated 2 times
Dec 9 02:12:32 mick kernel: nvidia-modeset: ERROR: GPU:0: Failure reading maximum pixel clock value for display device HDMI-0.
Dec 9 02:12:32 mick kernel: nvidia-modeset: ERROR: GPU:0: Failed detecting connected display devices
I first noticed this on GhostBSD, and for some reason couldn't reproduce the
bisect range on FreeBSD kernels. I had to bisect between GhostBSD's 21.12.24
and 22.3.16 kernel releases to find this commit. Then I could apply the sysctl
workaround to a FreeBSD CURRENT kernel and have suspend/resume working again.
Why was this change made? Is there some performance reason why we don't want to
be polling atkbd so much? I'm not sure why this would affect the entire PCI
bus, but since it breaks suspend/resume on certain machines it would be nice to
get a fix into base so things work out of the box again without having to add
the sysctl workaround.