[Bug 255745] resume on T490 Thinkpad often broken after upgrade 12 -> 13

From: <bugzilla-noreply_at_freebsd.org>
Date: Mon, 26 Jul 2021 18:07:55 UTC

--- Comment #19 from John Grafton <john.grafton@runbox.com> ---
After more testing, I was able to get 6abe97c0140d54d3520c30517b2bdebc3de92a62
to fail (making my previous comment incorrect).  Thus I went back to the
beginning of 13-CURRENT to find where the issue made its way into the code
base.  Instead of November 2020, it appears to have been added in July 2020.

I've spent the past week `git bisecting` my way through CURRENT looking for the
beginning of the suspend/resume issue.

Here's a brief description of my testing methodology:
1) Build world / kernel on 12-core system in a 12.2 jail (takes ~1/2 hour)
2) Install world / kernel on laptop from NFS to new boot environment on laptop
3) Boot laptop into new boot environment
4) Compile DRM driver for current kernel
5) kldload i915kms driver
6) Close lid and leave closed for 5 seconds
7) Open lid and wait to see if the system fails to resume
8) Repeat lid close and open in console 15 times and in X 5 times
9) If any resume fails, mark bisect bad; if all resumes succeed, mark bisect good
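
Steps 8 and 9 amount to a simple decision rule, which can be sketched in
Python.  This is only an illustration: the function name and the "incomplete"
result are hypothetical, and the 15/5 defaults are taken from the counts in the
list above.  The "good" and "bad" strings correspond to the `git bisect good`
and `git bisect bad` subcommands.

```python
def bisect_verdict(console_results, x_results, console_runs=15, x_runs=5):
    """Return 'bad' on any failed resume, 'good' only after a full clean run.

    Each list holds one boolean per lid close/open cycle (True = resumed).
    Hypothetical helper; not part of any FreeBSD or git tooling.
    """
    results = list(console_results) + list(x_results)
    if not all(results):
        return "bad"         # a single failed resume condemns the commit
    if len(console_results) < console_runs or len(x_results) < x_runs:
        return "incomplete"  # too few cycles so far to call the commit good
    return "good"
```

Note the asymmetry: one failure is enough to mark a commit bad, while a good
verdict requires every one of the 20 cycles to succeed.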

Usually, resumes would fail after 5 or so tries.  I never had a resume fail in
X that didn't first fail in the console.  It did not seem to make a difference
if I was on the console or in X.org.

I found that before 12b2f3daaa597f346a4b0065bf7f75378524ef88 the X1 Carbon Gen
6 resumes all 20 times without issue.  The main flaw in my methodology is
assuming 20 suspend and resumes are enough to accurately test for the issue.  
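
A rough way to put a number on that flaw, as a sketch only: assume each resume
on an affected commit fails independently with a fixed probability p_fail.
That independence is an assumption, not something I have measured.  "Fails
after 5 or so tries" would suggest p_fail of roughly 0.2; the smaller values
are purely hypothetical.

```python
def prob_all_pass(p_fail, runs=20):
    """Chance that `runs` independent resumes all succeed on a broken commit."""
    return (1.0 - p_fail) ** runs

# 0.2 is a rough reading of "fails after 5 or so tries"; 0.1 and 0.05 are
# hypothetical lower failure rates included for comparison.
for p in (0.2, 0.1, 0.05):
    print(f"p_fail={p:.2f}: P(20 clean resumes) = {prob_all_pass(p):.3f}")
```

Under these assumptions a broken commit would pass all 20 cycles about 1% of
the time at p_fail = 0.2, but about 12% at 0.1 and about 36% at 0.05, so a
lower failure rate on some commits could easily produce a wrong "good" mark.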

I'm currently running src commit e7677232d6eed5f5cae80c1d5968eea5b9266b59 with
graphics/drm-current-kmod from ports commit
d9b8d3b2b3b5ffcf4b83572098d64803fe237b90 on my daily laptop to test the
assumption that 20 suspend/resume cycles are enough to conclude that a commit
is free of the issue.

It seems very strange to me that commit
12b2f3daaa597f346a4b0065bf7f75378524ef88 is flagged as the culprit, because it
appears to be essentially a no-op: it just cleans up #ifdef statements for
older systems.  12.2, which does not fail at all, seems to have similar patches
in place.

My next steps are to continue running this older commit on my daily laptop and
see if I can get it to fail.  After that, I will attempt to build a version of
13-RELEASE with what appears to be the offending patch removed and see if it
fails.
