Re: dtrace bitfields failure (was: 12.3-RC1 fails ...)

From: Mark Johnston <markj_at_freebsd.org>
Date: Tue, 04 Jan 2022 18:01:55 UTC
On Tue, Jan 04, 2022 at 04:05:53PM +0100, Peter wrote:
> 
> Hija,
> 
>   Sadly, I was too quick to agree that the two patches
>      22082f15f9
>      68396709e7
> together solve the issue. They do so only under a certain assumption,
> which does not hold in all cases.
> 
> 
> Let's look at https://reviews.freebsd.org/D27213
> 
> This is the code in question that will trigger the action:
> 
>      if (dst_type == CTF_ERR && name[0] != '\0' &&
>              (hep = ctf_hash_lookup(&src_fp->ctf_names, src_fp, name,
>              strlen(name))) != NULL &&
>              src_type != (ctf_id_t)hep->h_type) {
> 
> What happens here: for a bitfield type, we also need to copy the
> corresponding intrinsic type. This condition checks for that case and
> is also supposed to deliver the respective intrinsic type via the
> "hep" variable.
> 
> But this depends on the assumption that the intrinsic type appears
> first in the "src_fp" container, so that the hash will point to it.
> And that is not necessarily true; it depends on what options you have
> in your kernel config.
> 
> 
> For instance, with my custom kernel, things look like this:
> 
> $ ctfdump -t kernel.full
> 
> - Types ----------------------------------------------------------------------
> 
>   [1] STRUCT (anon) (8 bytes)
>         sle_next type=262 off=0
> 
>   [2] STRUCT (anon) (8 bytes)
>         stqe_next type=262 off=0
> 
>   [3] UNION (anon) (8 bytes)
>         m_next type=262 off=0
>         m_slist type=1 off=0
>         m_stailq type=2 off=0
> 
>   [4] UNION (anon) (8 bytes)
>         m_nextpkt type=262 off=0
>         m_slistpkt type=1 off=0
>         m_stailqpkt type=2 off=0
> 
>   <5> INTEGER char encoding=SIGNED CHAR offset=0 bits=8
>   <6> POINTER (anon) refers to 5
>   <7> TYPEDEF caddr_t refers to 6
>   <8> INTEGER int encoding=SIGNED offset=0 bits=32
>   <9> TYPEDEF __int32_t refers to 8
>   <10> TYPEDEF int32_t refers to 9
>   [11] INTEGER unsigned int encoding=0x0 offset=0 bits=8
>   [12] INTEGER unsigned int encoding=0x0 offset=0 bits=24
>   [13] STRUCT (anon) (8 bytes)
>         cstqe_next type=229 off=0
> 
>   <14> POINTER (anon) refers to 229
>   [15] STRUCT (anon) (16 bytes)
>         le_next type=229 off=0
>         le_prev type=14 off=64
> 
>   <16> INTEGER long encoding=SIGNED offset=0 bits=64
>   <17> ARRAY (anon) content: 5 index: 16 nelems: 16
> 
>   <18> INTEGER unsigned int encoding=0x0 offset=0 bits=32
>   <19> TYPEDEF u_int refers to 18
> [etc.etc.]
> 
> 
> As we can see, this kernel has the bitfield types as #11 and #12, and
> the intrinsic type as #18. Consequently, things fail.
> 
> 
> I currently do not know what the culprit is. Does the linking stage of
> the kernel have a flaw? Or is the patch D27213 based on a wrong assumption?
> 
> I hope you guys can answer that. For now I have changed the patch D27213
> to cover this case, so that things work.
> Further details on request.
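
The ordering dependence described in the quoted text can be illustrated
with a toy model. This is only a sketch under stated assumptions: the
struct and the toy_* functions are hypothetical and are not libctf code,
the "first registered name wins" rule is taken from the description above
rather than from the library source, and the "widest" variant is one
possible order-independent lookup, not the change made to D27213.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a CTF integer type; NOT the real libctf layout. */
struct toy_type {
	int id;			/* type ID as printed by ctfdump */
	const char *name;	/* name shared by bitfields and the intrinsic */
	unsigned bits;		/* width in bits */
};

/* Same order as the ctfdump listing: bitfields 11 and 12 precede 18. */
static const struct toy_type toy_types[] = {
	{ 8,  "int",          32 },
	{ 11, "unsigned int", 8  },
	{ 12, "unsigned int", 24 },
	{ 18, "unsigned int", 32 },
};
#define	TOY_NTYPES	(sizeof(toy_types) / sizeof(toy_types[0]))

/* Assumed behaviour: the first type registered under a name wins. */
static const struct toy_type *
toy_lookup_first(const char *name)
{
	assert(name != NULL);
	for (size_t i = 0; i < TOY_NTYPES; i++) {
		if (strcmp(toy_types[i].name, name) == 0)
			return (&toy_types[i]);
	}
	return (NULL);
}

/* Hypothetical order-independent variant: prefer the widest same-named type. */
static const struct toy_type *
toy_lookup_widest(const char *name)
{
	const struct toy_type *best = NULL;

	assert(name != NULL);
	for (size_t i = 0; i < TOY_NTYPES; i++) {
		if (strcmp(toy_types[i].name, name) != 0)
			continue;
		if (best == NULL || toy_types[i].bits > best->bits)
			best = &toy_types[i];
	}
	return (best);
}
```

With the table in this order, toy_lookup_first("unsigned int") yields
type 11, the 8-bit bitfield, while toy_lookup_widest("unsigned int")
yields type 18 regardless of where it sits in the table.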

I'm not immediately sure where the problem is.  Could you please post
the kernel configuration and src revision that you're using, so that I
can try and reproduce this?  How exactly does the bug manifest?