Re: git: ded037e65e52 - main - qat: driver updates to improve code and fix bugs

From: Kristof Provost <kp_at_FreeBSD.org>
Date: Wed, 30 Jul 2025 12:29:37 UTC
On 6 Jun 2025, at 15:57, Mark Johnston wrote:
> The branch main has been updated by markj:
>
> URL: https://cgit.FreeBSD.org/src/commit/?id=ded037e65e5239671b1292ec987a2e0894b217b5
>
> commit ded037e65e5239671b1292ec987a2e0894b217b5
> Author:     Hareshx Sankar Raj <hareshx.sankar.raj@intel.com>
> AuthorDate: 2025-05-07 09:38:21 +0000
> Commit:     Mark Johnston <markj@FreeBSD.org>
> CommitDate: 2025-06-06 13:43:54 +0000
>
>     qat: driver updates to improve code and fix bugs
>
>     Bug fixes and improvements are done for the qat code base
>     to improve code quality.
>
>     Reviewed by:    markj, ziaee
>     MFC after:      2 weeks
>     Sponsored by:   Intel Corporation
>     Differential Revision:  https://reviews.freebsd.org/D50379
> ---
…
> diff --git a/sys/dev/qat/qat_hw/qat_c3xxx/adf_c3xxx_hw_data.h b/sys/dev/qat/qat_hw/qat_c3xxx/adf_c3xxx_hw_data.h
> index bfc5db1f5e5c..cddfc3f84853 100644
> --- a/sys/dev/qat/qat_hw/qat_c3xxx/adf_c3xxx_hw_data.h
> +++ b/sys/dev/qat/qat_hw/qat_c3xxx/adf_c3xxx_hw_data.h
> @@ -1,11 +1,11 @@
>  /* SPDX-License-Identifier: BSD-3-Clause */
> -/* Copyright(c) 2007-2022 Intel Corporation */
> +/* Copyright(c) 2007-2025 Intel Corporation */
>  #ifndef ADF_C3XXX_HW_DATA_H_
>  #define ADF_C3XXX_HW_DATA_H_
>
>  /* PCIe configuration space */
> -#define ADF_C3XXX_PMISC_BAR 0
> -#define ADF_C3XXX_ETR_BAR 1
> +#define ADF_C3XXX_PMISC_BAR 1
> +#define ADF_C3XXX_ETR_BAR 2
>  #define ADF_C3XXX_RX_RINGS_OFFSET 8
>  #define ADF_C3XXX_TX_RINGS_MASK 0xFF
>  #define ADF_C3XXX_MAX_ACCELERATORS 3

We’re seeing panics when loading the QAT driver on an Atom Processor 
C3000 system (running pfSense).

I believe this change is the trigger. It makes us look for the ETR 
registers in the third BAR, but that hardware has only two:

	# pciconf -l -b -vV pci0:1:0:0
	none0@pci0:1:0:0:	class=0x0b4000 rev=0x11 hdr=0x00 vendor=0x8086 device=0x19e2 subvendor=0x8086 subdevice=0x19e2
	    vendor     = 'Intel Corporation'
	    device     = 'Atom Processor C3000 Series QuickAssist Technology'
	    class      = processor
	    bar   [18] = type Memory, range 64, base 0x80700000, size 262144, enabled
	    bar   [20] = type Memory, range 64, base 0x80740000, size 262144, enabled
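
For reading the output above: the bracketed values are configuration-space register offsets, not BAR numbers. A minimal sketch of the conversion, assuming the standard layout where BAR 0 sits at offset 0x10 and each BAR register is 4 bytes wide (the names here are hypothetical, not from the driver):

```c
#include <stdint.h>

#define SKETCH_PCIR_BARS 0x10	/* config-space offset of BAR 0 */
#define SKETCH_PCIR_LAST 0x24	/* config-space offset of BAR 5 */

/*
 * Hypothetical helper: convert a config-space register offset, as
 * pciconf prints in brackets, to a BAR number. Returns -1 if the
 * offset is not a BAR register.
 */
static int
sketch_bar_number(uint32_t offset)
{
	if (offset < SKETCH_PCIR_BARS || offset > SKETCH_PCIR_LAST ||
	    (offset & 3) != 0)
		return (-1);
	return ((int)((offset - SKETCH_PCIR_BARS) / 4));
}
```

So [18] and [20] are BAR 2 and BAR 4, each 64-bit and therefore taking two register slots. Either way the device exposes only two memory BARs.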

Loading the driver then produces this panic and backtrace:

	Fatal trap 12: page fault while in kernel mode
	cpuid = 1; apic id = 18
	fault virtual address	= 0x10
	fault code		= supervisor read data, page not present
	instruction pointer	= 0x20:0xffffffff83c4c25a
	stack pointer	        = 0x28:0xfffffe00681c26b0
	frame pointer	        = 0x28:0xfffffe00681c26b0
	code segment		= base 0x0, limit 0xfffff, type 0x1b
				= DPL 0, pres 1, long 1, def32 0, gran 1
	processor eflags	= interrupt enabled, resume, IOPL = 0
	current process		= 3075 (kldload)
	rdi: 0000000000000000 rsi: 0000000000000000 rdx: 0000000000000000
	rcx: 0000000000000000  r8: fffffe00681c2715  r9: 000000000000000a
	rax: 00000000000001ff rbx: 0000000000000000 rbp: fffffe00681c26b0
	r10: 0000000000000000 r11: 0000000000000001 r12: 0000000000000000
	r13: fffff80003a1a000 r14: 0000000000000000 r15: 0000000000000008
	trap number		= 12
	panic: page fault
	cpuid = 1
	time = 1753859856
	KDB: enter: panic
	[ thread pid 3075 tid 100160 ]
	Stopped at      kdb_enter+0x33: movq    $0,0x1b55532(%rip)
	db> bt
	Tracing pid 3075 tid 100160 td 0xfffff80100d4a780
	kdb_enter() at kdb_enter+0x33/frame 0xfffffe00681c2530
	panic() at panic+0x43/frame 0xfffffe00681c2590
	trap_pfault() at trap_pfault+0x3c9/frame 0xfffffe00681c25e0
	calltrap() at calltrap+0x8/frame 0xfffffe00681c25e0
	--- trap 0xc, rip = 0xffffffff83c4c25a, rsp = 0xfffffe00681c26b0, rbp = 0xfffffe00681c26b0 ---
	write_csr_ring_config() at write_csr_ring_config+0xa/frame 0xfffffe00681c26b0
	adf_init_etr_data() at adf_init_etr_data+0x36a/frame 0xfffffe00681c2840
	adf_dev_init() at adf_dev_init+0xc1/frame 0xfffffe00681c28f0
	adf_attach() at adf_attach+0x4e1/frame 0xfffffe00681c2950
	device_attach() at device_attach+0x43d/frame 0xfffffe00681c29a0
	pci_driver_added() at pci_driver_added+0xf2/frame 0xfffffe00681c29e0
	devclass_driver_added() at devclass_driver_added+0x29/frame 0xfffffe00681c2a10
	devclass_add_driver() at devclass_add_driver+0x11e/frame 0xfffffe00681c2a50
	module_register_init() at module_register_init+0x85/frame 0xfffffe00681c2a80
	linker_load_module() at linker_load_module+0xc0f/frame 0xfffffe00681c2d80
	kern_kldload() at kern_kldload+0x165/frame 0xfffffe00681c2dd0
	sys_kldload() at sys_kldload+0x59/frame 0xfffffe00681c2e00
	amd64_syscall() at amd64_syscall+0x126/frame 0xfffffe00681c2f30
	fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00681c2f30
	--More--

The fault happens because we try to use a pci_bars[2] entry whose 
virt_addr == 0: index 2 is outside the BARs we set up.
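
To illustrate the failure mode, here is a minimal userland sketch. It is not the driver code: the struct, the helper, and the placeholder addresses are all hypothetical, and it assumes only what is described above, namely a BAR table indexed by the hw_data constants where an entry that was never set up has virt_addr == 0.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_BARS 3	/* hypothetical table size */

/* Hypothetical stand-in for one BAR slot; virt_addr == 0 means unmapped. */
struct sketch_bar {
	uintptr_t virt_addr;
	size_t size;
};

/*
 * Return the slot only if the index is in range and the slot was
 * actually mapped; otherwise return NULL so the caller can fail the
 * attach instead of dereferencing an unmapped BAR.
 */
static const struct sketch_bar *
sketch_get_bar(const struct sketch_bar *bars, size_t nbars, size_t idx)
{
	if (idx >= nbars || bars[idx].virt_addr == 0)
		return (NULL);
	return (&bars[idx]);
}

/*
 * Two mapped BARs, matching the count in the pciconf output. The
 * addresses are placeholders, not real mappings. The third slot is
 * left zeroed, as an entry that was never set up would be.
 */
static const struct sketch_bar sketch_c3000_bars[SKETCH_MAX_BARS] = {
	{ .virt_addr = 0x1000, .size = 262144 },
	{ .virt_addr = 0x2000, .size = 262144 },
};
```

With the old constants (PMISC 0, ETR 1) both lookups return a mapped slot. With the new ones (PMISC 1, ETR 2) the ETR lookup returns NULL, which is the entry we end up dereferencing in the trace above.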

Reverting just the above hunk of the patch fixes the panic for us.

Best regards,
Kristof