Re: register x18

From: Andrew Turner <andrew_at_fubar.geek.nz>
Date: Fri, 16 Jul 2021 17:53:14 +0200
> On 16 Jul 2021, at 17:07, Michael Tuexen <tuexen_at_freebsd.org> wrote:
> 
>> On 16. Jul 2021, at 14:51, Andrew Turner <andrew_at_fubar.geek.nz> wrote:
>> 
>> 
>>> On 16 Jul 2021, at 13:08, tuexen_at_freebsd.org wrote:
>>> 
>>>> On 16. Jul 2021, at 04:06, Mark Millard <marklmi_at_yahoo.com> wrote:
>>>> 
>>>> 
>>>> 
>>>> On 2021-Jul-15, at 17:40, Michael Tuexen <tuexen at freebsd.org> wrote:
>>>> 
>>>>> Dear all,
>>>>> 
>>>>> register x18 seems to be special. What is it used for in FreeBSD?
>>>>> 
>>>>> Best regards
>>>>> Michael
>>>> 
>>>> https://developer.arm.com/documentation/den0024/a/The-ABI-for-ARM-64-bit-Architecture/Register-use-in-the-AArch64-Procedure-Call-Standard/Parameters-in-general-purpose-registers
>>>> 
>>>> reports:
>>>> 
>>>> QUOTE
>>>> 	• X18 is the platform register and is reserved for the use of platform ABIs. This is an additional temporary register on platforms that don't assign a special meaning to it.
>>>> END QUOTE
>>>> 
>>>> So, special, yes. But I do not know what the "platform ABI" usage
>>>> for it might be on FreeBSD. So, for the most part, this does not
>>>> really answer your question. Sorry.
>>> Yepp, I found the above text. However, x18 seems to be used when accessing
>>> global variables. I am looking at a panic, where the system panics on accessing
>>> a global variable, which can be controlled by sysctl.
>>> It seems that x18 does not have the expected value, but it is also not set in
>>> the function...
>> 
>> X18 is used to store the pointer to the pcpu data. It should only ever be set when we enter the kernel from userland by the exception handler.
> Hi Andrew,
> 
> thanks for the response. Hmm. I was hoping that the answer would help me to understand
> a panic that I'm observing when stress testing the TCP RACK stack. I'm transferring
> 10GB via scp and at some point (not right at the beginning), the machine panics.
> The machine is an eMAG system.
> 
> Here is what I know:
> 
> Initially it panics multiple times (always at the same place) in
> https://cgit.freebsd.org/src/tree/sys/netinet/tcp_stacks/rack.c#n16540
> when it is trying to read V_tcp_map_entries_limit.
> 
> I discussed this with rrs_at_ and since we had no clue, I tried to just compile
> out the if condition.
> 
> Then it panicked in
> https://cgit.freebsd.org/src/tree/sys/netinet/tcp_stacks/rack.c#n16928
> at
> https://cgit.freebsd.org/src/tree/sys/netinet/tcp_stacks/rack.c#n15664
> which is basically the next place where a V_ variable is accessed.
> 
> Please note that for debugging I'm using a kernel without VIMAGE support,
> since we initially thought that it might be related to a VNET bug.
> 
> So I decided to look at the disassembly of rack_sndbuf_autoscale (I added some comments):
> 
>   0xffff000001388a6c <+0>:	stp	x29, x30, [sp, #-32]!
>   0xffff000001388a70 <+4>:	str	x19, [sp, #16]
>   0xffff000001388a74 <+8>:	mov	x29, sp
>   0xffff000001388a78 <+12>:	ldr	x9, [x0, #24]				// x9 = rack->tp;
>   0xffff000001388a7c <+16>:	ldr	w8, [x0, #188]				// w8 = rack->r_ctl.cwnd_to_use
>   0xffff000001388a80 <+20>:	adrp	x12, 0xffff0000013ac000
>   0xffff000001388a84 <+24>:	ldr	w10, [x9, #52]				// w10 = tp->snd_wnd;
>   0xffff000001388a88 <+28>:	ldr	x11, [x18]
>   0xffff000001388a8c <+32>:	ldr	x11, [x11, #1256]
>   0xffff000001388a90 <+36>:	cmp	w8, w10
>   0xffff000001388a94 <+40>:	csel	w10, w8, w10, cc  // cc = lo, ul, last	// min(rack->r_ctl.cwnd_to_use, tp->snd_wnd);
> => 0xffff000001388a98 <+44>:	ldr	x11, [x11, #40]
>   0xffff000001388a9c <+48>:	ldr	x12, [x12, #2752]
>   0xffff000001388aa0 <+52>:	ldr	w11, [x11, x12]				// w11 = V_tcp_do_autosndbuf ???
>   0xffff000001388aa4 <+56>:	cbz	w11, 0xffff000001388be0 <rack_sndbuf_autoscale+372>
>   0xffff000001388aa8 <+60>:	ldr	x8, [x0, #32]				// x8 = rack->rc_inp
>   0xffff000001388aac <+64>:	ldr	x19, [x8, #120]				// x19 = so = x8->inp_socket
>   0xffff000001388ab0 <+68>:	ldrb	w8, [x19, #817]				// w8 = (x19->so_snd.sb_flags >> 8) & 0xff
>   0xffff000001388ab4 <+72>:	tbz	w8, #3, 0xffff000001388be0 <rack_sndbuf_autoscale+372>	// (so->so_snd.sb_flags & SB_AUTOSIZE) == 0
>   0xffff000001388ab8 <+76>:	ldr	w11, [x9, #52]				// w11 = tp->snd_wnd
>   0xffff000001388abc <+80>:	ldr	w8, [x19, #740]				// w8 = so->so_snd.sb_hiwat
>   0xffff000001388ac0 <+84>:	lsr	w11, w11, #2
>   0xffff000001388ac4 <+88>:	add	w11, w11, w11, lsl #2
>   0xffff000001388ac8 <+92>:	cmp	w11, w8
>   0xffff000001388acc <+96>:	b.cc	0xffff000001388be0 <rack_sndbuf_autoscale+372>  // b.lo, b.ul, b.last
>   0xffff000001388ad0 <+100>:	ldr	w11, [x19, #736]
>   0xffff000001388ad4 <+104>:	lsr	w8, w8, #3
>   0xffff000001388ad8 <+108>:	lsl	w12, w8, #3
>   0xffff000001388adc <+112>:	sub	w8, w12, w8
>   0xffff000001388ae0 <+116>:	cmp	w11, w8
>   0xffff000001388ae4 <+120>:	b.cc	0xffff000001388be0 <rack_sndbuf_autoscale+372>  // b.lo, b.ul, b.last
>   0xffff000001388ae8 <+124>:	ldr	x8, [x18]
>   0xffff000001388aec <+128>:	ldr	x8, [x8, #1256]
>   0xffff000001388af0 <+132>:	ldr	x12, [x8, #40]
>   0xffff000001388af4 <+136>:	adrp	x8, 0xffff0000013ac000
>   0xffff000001388af8 <+140>:	ldr	x8, [x8, #2760]
>   0xffff000001388afc <+144>:	ldr	w12, [x12, x8]
>   0xffff000001388b00 <+148>:	cmp	w11, w12
> 
> So it seems that the code accessing V_tcp_do_autosndbuf is:
> 
>   0xffff000001388a80 <+20>:	adrp	x12, 0xffff0000013ac000
> ...
>   0xffff000001388a88 <+28>:	ldr	x11, [x18]
>   0xffff000001388a8c <+32>:	ldr	x11, [x11, #1256]
> ...
> => 0xffff000001388a98 <+44>:	ldr	x11, [x11, #40]
>   0xffff000001388a9c <+48>:	ldr	x12, [x12, #2752]
>   0xffff000001388aa0 <+52>:	ldr	w11, [x11, x12]				// w11 = V_tcp_do_autosndbuf ???
> 
> and for V_tcp_autosndbuf_max it is:
>   0xffff000001388ae8 <+124>:	ldr	x8, [x18]
>   0xffff000001388aec <+128>:	ldr	x8, [x8, #1256]
>   0xffff000001388af0 <+132>:	ldr	x12, [x8, #40]
>   0xffff000001388af4 <+136>:	adrp	x8, 0xffff0000013ac000
>   0xffff000001388af8 <+140>:	ldr	x8, [x8, #2760]
>   0xffff000001388afc <+144>:	ldr	w12, [x12, x8]
> 
> The #2752 versus #2760 could be the offset of the variable.
> 
> Does the above code make sense to you? The code relevant for the crash seems to be:
> 
> 0xffff000001388a88 <+28>:	ldr	x11, [x18]
> 0xffff000001388a8c <+32>:	ldr	x11, [x11, #1256]
> 0xffff000001388a98 <+44>:	ldr	x11, [x11, #40]
> 
> Since it is crashing at 0xffff000001388a98 <+44>, my assumption was that x18 is wrong...
> But does this use fit your description?

This code is loading curthread from the pcpu data, then loading whatever is 1256 bytes within struct thread. I checked the offset of td_vnet and found it was at the correct location so it would appear to be using VIMAGE and has a bad vnet pointer.

The other assembly above also looks like it’s using VIMAGE as they have similar code with the same offsets.
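
As a rough sketch of that load chain, here is a minimal userland model in C. All struct and function names (pcpu_model, thread_model, vnet_model, vnet_read_int) are made up for illustration and only mirror the fields involved; they are not the kernel definitions. It also assumes that the field at offset 40 in struct vnet is the per-VNET data base, which has not been checked against this particular kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct vnet_model {
	uintptr_t vnet_data_base;	/* assumed: the field read by "ldr x11, [x11, #40]" */
};

struct thread_model {
	struct vnet_model *td_vnet;	/* sits at offset 1256 in the real struct thread here */
};

struct pcpu_model {
	struct thread_model *pc_curthread;	/* first member, hence "ldr x11, [x18]" */
};

/*
 * Follow the same chain as the disassembly:
 * pcpu (x18) -> curthread -> td_vnet -> data base + symbol address.
 */
static int
vnet_read_int(const struct pcpu_model *pcpu, uintptr_t sym_addr)
{
	const struct thread_model *td = pcpu->pc_curthread;	/* ldr x11, [x18] */
	const struct vnet_model *vnet = td->td_vnet;		/* ldr x11, [x11, #1256] */
	uintptr_t base = vnet->vnet_data_base;			/* ldr x11, [x11, #40] */

	return (*(const int *)(base + sym_addr));		/* ldr w11, [x11, x12] */
}
```

In this model the first two loads succeed as long as x18 and curthread are valid. The third load is the first one that dereferences td_vnet, which matches a fault at <+44> caused by a bad vnet pointer rather than a bad x18.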

> 
> I'm trying to debug this on arm64, since I can reproduce it on arm64. But there is
> also a bug report that this happens on amd64: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=257195
> 
> Any idea what can be wrong? Any hint how to progress?

If you can reproduce it on amd64 it might pay to test with KASAN.

How stable is the bad pointer value? It might pay to add KASSERTs to the code to check that curvnet (the macro to get td_vnet) is not the bad value, or at least that it is greater than VM_MIN_KERNEL_ADDRESS.
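
A minimal sketch of such a check, written as a userland predicate so it can be tried outside the kernel. The helper name vnet_ptr_plausible and the constant MODEL_VM_MIN_KERNEL_ADDRESS are hypothetical; the value used for the constant is an assumption for arm64 and should be taken from the kernel headers instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed value for illustration only; the kernel defines VM_MIN_KERNEL_ADDRESS itself. */
#define	MODEL_VM_MIN_KERNEL_ADDRESS	UINT64_C(0xffff000000000000)

/*
 * Return true if the pointer value could be a valid kernel pointer:
 * it is not the previously observed bad value and it lies at or above
 * the start of the kernel address range.
 */
static bool
vnet_ptr_plausible(uint64_t vnet_ptr, uint64_t known_bad)
{
	if (vnet_ptr == known_bad)
		return (false);
	return (vnet_ptr >= MODEL_VM_MIN_KERNEL_ADDRESS);
}

/*
 * In the kernel the equivalent check might look roughly like:
 *
 *	KASSERT((uintptr_t)curvnet >= VM_MIN_KERNEL_ADDRESS,
 *	    ("%s: bad curvnet %p", __func__, curvnet));
 */
```

Placing such checks at several points along the RACK output path could help narrow down where td_vnet first goes bad, assuming the corruption happens before the faulting access.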

Andrew
Received on Fri Jul 16 2021 - 15:53:14 UTC
