crash of 32-bit powerpc -r347549 kernel built via system-clang-8 (crash is while trying to mount the root file system) [debug kernel case: code generation error]

Mark Millard marklmi at yahoo.com
Wed Jun 5 21:27:59 UTC 2019


[This is from my experiments with more modern toolchains than
normally/officially used, specifically for 32-bit powerpc this
time.]

On 2019-Jun-5, at 01:35, Mark Millard <marklmi at yahoo.com> wrote:

> On 2019-Jun-3, at 19:40, Mark Millard <marklmi at yahoo.com> wrote:
> 
>> On 2019-Jun-3, at 17:24, Mark Millard <marklmi at yahoo.com> wrote:
>> 
>>> I tried (cross) building a 32-bit powerpc kernel and world (non-debug) 
>>> with system-clang (on amd64) and use of devel/powerpc64-binutils . The
>>> installed kernel panics trying to mount the root file system.
>>> 
>>> FYI: Typed from picture of screen . . .
>>> 
>>> Trying to mount root from ufs:/dev/ufs/FBSDG4Srootfs [rw,noatime]...
>>> panic: getnewbuf_empty: Locked buf 0xd2800000 on free queue.
>>> . . .
>>> 0xd6919080: at kdb_backtrace+0x64
>>> 0xd69190e0: at vpanic+0x200
>>> 0xd6919150: at panic+0x50
>>> 0xd6919190: at getnewbuf+0x594
>>> 0xd69191f0: at getblkx+0x540
>>> 0xd69192a0: at breadn_flags+0x90
>>> 0xd69192f0: at ffs_use_bread+0x9c
>>> 0xd6919330: at readsuper+0x68
>>> 0xd6919370: at ffs_sbget+0xcc
>>> 0xd69193c0: at ffs_mount+0x18b8
>>> 0xd69194f0: at vfs_domount+0xa74
>>> 0xd69196a0: at vfs_donmount+0x944
>>> 0xd6919700: at kernel_mount+0x64
>>> 0xd6919740: at parse_mount+0x52c
>>> 0xd6919840: at vfs_mountroot+0x71c
>>> 0xd69199b0: at start_init+0x44
>>> 0xd6919a10: at fork_exit+0xcc
>>> 0xd6919a40: at fork_trampoline+0xc
>>> KDB: enter panic
>>> [ thread pid 1 tid 100002 ]
>>> Stopped at kdb_enter+0x74: addi r3,r0,0x0
>>> 
>>> This reproduces with each boot attempt.
>>> 
>>> Replacing the kernel with one built via gcc 4.2.1 and booting
>>> the result does not panic.
>>> 
>>> 
>>> FYI for the context of the panic call:
>>> 
>>> /usr/src/sys/kern/vfs_bio.c :
>>> 
>>> static struct buf *
>>> buf_alloc(struct bufdomain *bd)
>>> {
>>>      struct buf *bp;
>>>      int freebufs;
>>> 
>>>      /*
>>>       * We can only run out of bufs in the buf zone if the average buf
>>>       * is less than BKVASIZE.  In this case the actual wait/block will
>>>       * come from buf_reycle() failing to flush one of these small bufs.
>>>       */
>>>      bp = NULL;
>>>      freebufs = atomic_fetchadd_int(&bd->bd_freebuffers, -1);
>>>      if (freebufs > 0)
>>>              bp = uma_zalloc(buf_zone, M_NOWAIT);
>>>      if (bp == NULL) {
>>>              atomic_add_int(&bd->bd_freebuffers, 1);
>>>              bufspace_daemon_wakeup(bd);
>>>              counter_u64_add(numbufallocfails, 1);
>>>              return (NULL);
>>>      }
>>>      /*
>>>       * Wake-up the bufspace daemon on transition below threshold.
>>>       */
>>>      if (freebufs == bd->bd_lofreebuffers)
>>>              bufspace_daemon_wakeup(bd);
>>> 
>>>      if (BUF_LOCK(bp, LK_EXCLUSIVE | LK_NOWAIT, NULL) != 0)
>>>              panic("getnewbuf_empty: Locked buf %p on free queue.", bp);
>> 
>> 
>> I tried making a debug kernel build via system-clang-8. It
>> reports differently but still during getnewbuf being active
>> on the stack (again typed from a picture):
>> 
>> Trying to mount root from ufs:/dev/ufs/FBSDG4Srootfs [rw,noatime]...
>> . . . (ignore witness/diagnostic warnings) . . .
>> panic: bq_remove: Locked buf 0xd2a00000 not on a queue.
>> . . .
>> 0xd6b7bfd0: at kdb_backtrace+0x64
>> 0xd6b7c030: at vpanic+0x200
>> 0xd6b7c0a0: at panic+0x50
>> 0xd6b7c0e0: at bq_remove+0x1e0
>> 0xd6b7c100: at buf_import+0x8c
>> 0xd6b7c130: at uma_zalloc_arg+0x544
>> 0xd6b7c190: at getnewbuf+0x380
>> 0xd6b7c1f0: at getblkx+0x620
>> 0xd6b7c290: at breadn_flags+0x90
>> 0xd6b7c2e0: at ffs_use_bread+0xa8
>> 0xd6b7c320: at readsuper+0x68
>> 0xd6b7c360: at ffs_sbget+0xcc
>> 0xd6b7c3b0: at ffs_mount+0xefc
>> 0xd6b7c4e0: at vfs_domount+0xa754
>> 0xd6b7c690: at vfs_donmount+0x78c
>> 0xd6b7c6f0: at kernel_mount+0x7c
>> 0xd6b7c730: at parse_mount+0x52c
>> 0xd6b7c830: at vfs_mountroot+0x660
>> 0xd6b7c9a0: at start_init+0x4c
>> 0xd6b7ca10: at fork_exit+0xb0
>> 0xd6b7ca40: at fork_trampoline+0xc
>> 
>> /usr/src/sys/kern/vfs_bio.c :
>> 
>> static void
>> bq_remove(struct bufqueue *bq, struct buf *bp)
>> {
>> 
>>       CTR3(KTR_BUF, "bq_remove(%p) vp %p flags %X",
>>           bp, bp->b_vp, bp->b_flags);
>>       KASSERT(bp->b_qindex != QUEUE_NONE,
>>           ("bq_remove: buffer %p not on a queue.", bp));
>> . . .
>> 
>> For reference:
>> 
>> static int
>> buf_import(void *arg, void **store, int cnt, int domain, int flags)
>> {
>>       struct buf *bp;
>>       int i;
>> 
>>       BQ_LOCK(&bqempty);
>>       for (i = 0; i < cnt; i++) {
>>               bp = TAILQ_FIRST(&bqempty.bq_queue);
>>               if (bp == NULL)
>>                       break;
>>               bq_remove(&bqempty, bp);
>>               store[i] = bp;
>>       }
>>       BQ_UNLOCK(&bqempty);
>> 
>>       return (i);
>> }
>> 
>> 
> 
> I tried building the debug kernel with KTR for KTR_BUF.
> Installing and booting the result did not panic. Manually
> forcing getting to ddb> soon enough and doing "show ktr"
> did show a bq_remove for 0xd2a00000 (and later activity).
> 
> From the looks of the KTR_BUF CTRn's, this suggests to me
> that the access to bp->b_qindex in bq_remove is racy in
> some way vs. updates to the value.

The code produced by clang for the debug kernel, KTR
off in this case, for:

      KASSERT(bp->b_qindex != QUEUE_NONE,
          ("bq_remove: buffer %p not on a queue.", bp));

is wrong [the 84(r29) accesses bp->b_qindex]:

. . .
00618aa8 <bq_remove+0x34> lbz     r5,84(r29)
00618aac <bq_remove+0x38> cmplwi  r5,4
00618ab0 <bq_remove+0x3c> bgt-    00618c10 <bq_remove+0x19c>
. . .
00618c10 <bq_remove+0x19c> lwz     r3,-32364(r30)
00618c14 <bq_remove+0x1a0> crclr   4*cr1+eq
00618c18 <bq_remove+0x1a4> mr      r4,r29
00618c1c <bq_remove+0x1a8> bl      00541ca0 <panic>
. . .

Comparing against 4 does not match any part of the source
for bq_remove. A gt comparison would make sense for the
unsigned field:

/usr/src/sys/sys/buf.h: uint8_t         b_qindex;       /* (Q) buffer queue index */

if the comparison were against zero. The value compared
against should have been:

/usr/src/sys/kern/vfs_bio.c:#define QUEUE_NONE  0       /* on no queue */
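
As a rough illustration of the mismatch (a sketch only, not kernel
code): the functions below model the source-level KASSERT condition
and the reading of the "cmplwi r5,4 / bgt-" sequence given above.
The function names are hypothetical; only QUEUE_NONE and its value
come from vfs_bio.c.

```c
#include <assert.h>
#include <stdint.h>

#define QUEUE_NONE 0	/* on no queue (value quoted from vfs_bio.c) */

/* Source-level KASSERT: panic when the buffer is on no queue. */
int
source_would_panic(uint8_t b_qindex)
{
	return (b_qindex == QUEUE_NONE);
}

/*
 * Assumed model of "cmplwi r5,4 ; bgt- <panic path>":
 * an unsigned compare that branches to panic when b_qindex > 4.
 */
int
disasm_would_panic(uint8_t b_qindex)
{
	return (b_qindex > 4);
}

/* Count the b_qindex values for which the two conditions disagree. */
int
count_disagreements(void)
{
	int n = 0;

	for (unsigned q = 0; q <= UINT8_MAX; q++) {
		if (source_would_panic((uint8_t)q) !=
		    disasm_would_panic((uint8_t)q))
			n++;
	}
	return (n);
}

/* Sanity checks on the two models at the boundary values. */
void
check_boundaries(void)
{
	assert(source_would_panic(QUEUE_NONE));
	assert(!disasm_would_panic(QUEUE_NONE));
	assert(!disasm_would_panic(4) && disasm_would_panic(5));
}
```

Under this model the two conditions disagree for b_qindex == 0 and
for every value above 4, that is, for 252 of the 256 possible
uint8_t values. They agree only for 1 through 4.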


This is for a head -r347549 32-bit powerpc FreeBSD context,
built with system clang (an amd64->powerpc cross build using
devel/powerpc64-binutils ).



===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)


