acpi timer reads all ones [Was: efirtc + atrtc at the same time]
John Baldwin
jhb at FreeBSD.org
Wed May 27 21:38:26 UTC 2020
On 5/27/20 2:05 PM, Hans Petter Selasky wrote:
> On 2020-05-27 15:41, Justin Hibbits wrote:
>> On Wed, 27 May 2020 06:27:16 -0700
>> John Baldwin <jhb at FreeBSD.org> wrote:
>>
>>> On 5/27/20 2:39 AM, Andriy Gapon wrote:
>>>> On 27/05/2020 11:13, Andriy Gapon wrote:
>>>>> I added more diagnostics and it seems to support the idea that the
>>>>> problem is related to I/O cycles and bridges.
>>>>>
>>>>> ACPI timer suddenly starts returning 0xffffffff and that lasts for
>>>>> tens of microseconds before the timer goes back to returning
>>>>> normal values with an expected increase.
>>>>> AMD provides a proprietary way to access ACPI registers via MMIO
>>>>> (0xfed808xx). That mechanism is unaffected: read that way, the ACPI
>>>>> timer register always returns good values.
>>>>>
>>>>> The problem seems to happen when restoring configuration of a
>>>>> particular PCI bridge. What's interesting is that the bridge
>>>>> decodes one memory range and one I/O range.
>>>>>
>>>>> Looking at pci_cfg_restore() I wonder if it is wise to restore
>>>>> PCIR_COMMAND so early. Could it be that after the resume the
>>>>> bridge is configured with a wrong I/O range (e.g., too wide) and
>>>>> by writing PCIR_COMMAND we enable that decoding? If so, the bridge
>>>>> steals I/O cycles destined for ACPI support hardware. If there is
>>>>> nothing behind the bridge to handle those ports, then we get those
>>>>> bad readings. Once the bridge configuration is fully restored, the
>>>>> I/O handling goes back to normal.
>>>>
>>>> From what I see, this looks like a BIOS bug.
>>>> Upon resume, it swaps window configurations of pcib1 and pcib2
>>>> (until FreeBSD restores them). pcib1 originally does not have an
>>>> I/O window. So, BIOS programs both base and limit of pcib2 I/O
>>>> window to zero. When FreeBSD writes pcib2's command register to
>>>> enable I/O decoding, the bridge starts claiming the 0x0 - 0xFFF I/O
>>>> port range. That covers the ACPI ports at 0x8xx.
>>>>
>>>> Some printf-s.
>>>> From (verbose) boot time:
>>>> pcib1: domain 0
>>>> pcib1: secondary bus 1
>>>> pcib1: subordinate bus 1
>>>> pcib1: memory decode 0xfea00000-0xfeafffff
>>>> pcib2: domain 0
>>>> pcib2: secondary bus 2
>>>> pcib2: subordinate bus 2
>>>> pcib2: I/O decode 0xf000-0xffff
>>>> pcib2: memory decode 0xfe900000-0xfe9fffff
>>>>
>>>> My printf-s from resume time:
>>>> pcib1: old I/O base (low): 0xf1
>>>> pcib1: old I/O base (high): 0x0
>>>> pcib1: old I/O limit (low): 0x1
>>>> pcib1: old I/O limit (high): 0x0
>>>> pcib2: old I/O base (low): 0x1
>>>> pcib2: old I/O base (high): 0x0
>>>> pcib2: old I/O limit (low): 0x1
>>>> pcib2: old I/O limit (high): 0x0
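
For readers decoding the raw register values above: a minimal sketch,
assuming the standard PCI-PCI bridge layout, in which bits 7:4 of the
low I/O base/limit bytes hold address bits 15:12, a low nibble of 0x1
marks 32-bit I/O addressing, and the low 12 bits of the limit are
implied to be 0xFFF. The struct and function names are illustrative
only and are not taken from the FreeBSD sources.

```c
#include <stdint.h>

/* Decoded I/O window of a PCI-PCI bridge (illustrative type, not FreeBSD's). */
struct io_window {
	uint32_t base;
	uint32_t limit;
	int      enabled;	/* 0 when base > limit: nothing is forwarded */
};

/*
 * Turn the raw I/O base/limit register values into a port range.
 * base_lo/limit_lo are the 8-bit registers, base_hi/limit_hi the
 * 16-bit upper halves used only with 32-bit I/O addressing.
 */
static struct io_window
decode_io_window(uint8_t base_lo, uint16_t base_hi,
    uint8_t limit_lo, uint16_t limit_hi)
{
	struct io_window w;

	/* Bits 7:4 of the low byte are address bits 15:12. */
	w.base = (uint32_t)(base_lo & 0xf0) << 8;
	/* The limit always extends to the end of its 4 KB block. */
	w.limit = ((uint32_t)(limit_lo & 0xf0) << 8) | 0xfff;

	/* Low nibble 0x1 means the upper 16 address bits are valid. */
	if ((base_lo & 0x0f) == 0x01) {
		w.base |= (uint32_t)base_hi << 16;
		w.limit |= (uint32_t)limit_hi << 16;
	}

	w.enabled = (w.base <= w.limit);
	return (w);
}

/* Would a bridge with this window claim an access to the given port? */
static int
window_claims_port(struct io_window w, uint32_t port)
{
	return (w.enabled && port >= w.base && port <= w.limit);
}
```

Under those assumptions, the resume-time pcib2 values (base 0x1, limit
0x1) decode to 0x0-0xFFF, which covers any port in the 0x8xx range,
while the pcib1 values (base 0xf1, limit 0x1) decode to a base of
0xF000 above a limit of 0xFFF, i.e. a closed window.
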
>>>
>>> The "solution" I think is to have resume be multi-pass and to resume
>>> all the bridges first before trying to resume leaf devices (including
>>> timers), but that's a fair bit of work. It might be that we just
>>> need to resume timer interrupts later after the new-bus resume (I
>>> think we currently do it before?), though the reason for that was to
>>> allow resume methods in devices to sleep (I'm not sure if any do).
>>>
>>
>> That sounds like a good fit for https://reviews.freebsd.org/D203 .
>> Someone (TM) just needs to take it over the finish line... 6 years
>> later.
>
> Is this perhaps related to:
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=237666
No. I get that constantly on a desktop that never suspends/resumes.
It only started after upgrading to 12.0.
--
John Baldwin