Re: Help me grok the ath(4) device attach code

From: Adrian Chadd <adrian_at_freebsd.org>
Date: Wed, 31 May 2023 05:17:19 UTC
On Tue, 30 May 2023 at 22:12, John Nielsen <lists@jnielsen.net> wrote:

> On May 30, 2023, at 10:56 PM, Adrian Chadd <adrian@freebsd.org> wrote:
>
> On Tue, 30 May 2023 at 20:56, John Nielsen <lists@jnielsen.net> wrote:
>
>> > On May 30, 2023, at 8:02 PM, Adrian Chadd <adrian@freebsd.org> wrote:
>> >
>> > Err, if it's coming up w/ that MAC then it's not finding and attaching
>> right to the OTP/EEPROM calibration information. That's the big red flag
>> that it in general won't work correctly.
>> >
>> > Can you provide the rest of the ath_hal messages? I'd like to see what
>> it's saying during boot around it checking the EEPROM/OTP contents. It's
>> possible there's some work around required for this NIC.
>>
>> He speaks! Thanks for taking the time. I just realized that
>> ath_hal_printf doesn’t prepend “ath%d” so I’ve been missing those messages
>> when grep-ing. Here’s the whole snippet:
>>
>> ath0: <Atheros AR946x/AR948x> mem 0xf7a00000-0xf7a7ffff at device 0.0 on
>> pci4
>> ar9300_flash_map: unimplemented for now
>> Restoring Cal data from DRAM
>> Restoring Cal data from EEPROM
>> Restoring Cal data from Flash
>> Restoring Cal data from Flash
>> Restoring Cal data from OTP
>> ar9300_eeprom_restore_internal[4338] No vaid CAL, calling default template
>> ar9300_hw_attach: ar9300_eeprom_attach returned 0
>>
>
> Yeah, this bit right here is the problem. It's not finding a valid
> calibration.
>
>
> oh err, is there a wifi enable/disable switch or something? maybe it's
> asserted and somehow it's mucking up the NIC?
>
>
> There is a physical switch and it’s in the “enable” position.
>
>  I wonder what ath9k is doing here? Is there some weird pci based
> workaround/flag for the given NIC PCI id?
>
>
> That was the first breadcrumb BZ threw me but I can’t find anything. There
> are some .driver_data hints for adjacent subdevice IDs but none for this
> one (Dell 0x020d) in either FreeBSD or Linux that I could find.
>
> The kernel on the Arch Linux USB I have handy doesn’t appear to have been
> compiled with CONFIG_ATH_DEBUG but here’s what it has in
> /sys/kernel/ieee80211/phy0/ath9k/base_eeprom:
>       EEPROM Version :          2
>           RegDomain1 :        108
>           RegDomain2 :         31
>              TX Mask :          3
>              RX Mask :          3
>           Allow 5GHz :          1
>           Allow 2GHz :          1
>    Disable 2GHz HT20 :          0
>    Disable 2GHz HT40 :          0
>    Disable 5Ghz HT20 :          0
>    Disable 5Ghz HT40 :          0
>           Big Endian :          0
>            RF Silent :         45
>            BT option :          0
>           Device Cap :          0
>          Device Type :          5
>   Power Table Offset :          0
>         Tuning Caps1 :          0
>         Tuning Caps2 :          0
>  Enable Tx Temp Comp :          1
>  Enable Tx Volt Comp :          0
>    Enable fast clock :          1
>      Enable doubling :          1
>   Internal regulator :          0
>         Enable Paprd :          0
>      Driver Strength :          0
>           Quick Drop :          1
>    Chain mask Reduce :          0
>    Write enable Gpio :          6
>    WLAN Disable Gpio :          0
>        WLAN LED Gpio :          8
>  Rx Band Select Gpio :        255
>              Tx Gain :          1
>              Rx Gain :          3
>               SW Reg :  303972983
>           MacAddress : 44:39:c4:5b:44:4a
>
> It also has some calibration and other data in modal_eeprom.
>
> There is this commit in ath9k which mentions an alternative EEPROM
> address, but I’m not sure if that’s relevant. From what I can tell the
> probe should succeed at the normal base_address 0x3ff instead of needing to
> try the “4k” one 0xfff.
>
> https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/commit/drivers/net/wireless/ath/ath9k?id=528782ecf59f7bab2f1368628a479f49be59b512
>

Yeah, I'd try that. It'd be nice to know whether the NIC uses OTP or EEPROM,
though.
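
For anyone following along, here is a minimal sketch of the kind of fallback
being discussed: probe for a calibration header at the normal base address
(0x3ff) and only then try the "4k" one (0xfff). This is not the ath9k or
ath_hal code. The function names, the read callback, the simulated EEPROM, and
especially the "blank if all 0x00 or all 0xff" check are assumptions made up
for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BASE_ADDR_DEFAULT 0x3ff /* normal base address from the thread */
#define BASE_ADDR_4K      0xfff /* alternative "4k" base address */

/* Hypothetical read callback: returns 0 on success, -1 on failure. */
typedef int (*eeprom_read_fn)(void *ctx, unsigned addr, uint8_t *out);

/*
 * Assumed validity check: read four bytes downward from 'base' and treat
 * the header as blank if they are all 0x00 or all 0xff. The real driver's
 * check may differ.
 */
static int header_present(eeprom_read_fn rd, void *ctx, unsigned base)
{
    int all00 = 1, allff = 1;

    for (unsigned i = 0; i < 4; i++) {
        uint8_t b;

        if (rd(ctx, base - i, &b) != 0)
            return 0; /* read failed: treat as "nothing here" */
        if (b != 0x00)
            all00 = 0;
        if (b != 0xff)
            allff = 0;
    }
    return !(all00 || allff);
}

/* Try each candidate base in order; return the first hit, or -1. */
static int probe_cal_base(eeprom_read_fn rd, void *ctx)
{
    static const unsigned bases[] = { BASE_ADDR_DEFAULT, BASE_ADDR_4K };

    for (size_t i = 0; i < sizeof(bases) / sizeof(bases[0]); i++) {
        if (header_present(rd, ctx, bases[i]))
            return (int)bases[i];
    }
    return -1;
}

/* Simulated 4 KiB part so the sketch can be exercised without hardware. */
struct fake_eeprom {
    uint8_t data[0x1000];
};

static void fake_eeprom_init(struct fake_eeprom *e)
{
    memset(e->data, 0xff, sizeof(e->data)); /* erased state */
}

static int fake_read(void *ctx, unsigned addr, uint8_t *out)
{
    struct fake_eeprom *e = ctx;

    assert(e != NULL);
    if (addr >= sizeof(e->data))
        return -1;
    *out = e->data[addr];
    return 0;
}
```

Calling probe_cal_base(fake_read, &e) on a freshly initialised fake_eeprom
returns -1. Once a non-blank byte is written at 0xfff it returns 0xfff, and
data at 0x3ff takes priority over that. If the real probe never sees data at
either address, that points back at the read path rather than the offset.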

There are known issues with all the Atheros chips (sigh) in how the EEPROM
and the PCIe bus reset interact.
(If the bus reset is too short, the EEPROM state machine gets stuck and
nothing gets read.) That makes this hard to debug: the NIC itself will work
fine in another device, because the problem is in the BIOS/ACPI code. :(


-adrian